YOLO is one of the more fundamental and popular techniques in object detection today. Based on the original authors' papers and models, this article covers the technical principles and implementations of YOLO from v1 through v8. It can serve as a guided reading of the YOLO paper series or as a brief overview of its key ideas; for v5 and the later versions, which iterate mainly through model-level optimizations, we will analyze and learn from the models themselves.
You Only Look Once is what we usually abbreviate as YOLO. For the first version of YOLO, I think it is worth quoting its abstract here in full:
We present YOLO, a new approach to object detection. Prior work on object detection repurposes classifiers to perform detection. Instead, we frame object detection as a regression problem to spatially separated bounding boxes and associated class probabilities. A single neural network predicts bounding boxes and class probabilities directly from full images in one evaluation. Since the whole detection pipeline is a single network, it can be optimized end-to-end directly on detection performance. Our unified architecture is extremely fast. Our base YOLO model processes images in real-time at 45 frames per second. A smaller version of the network, Fast YOLO, processes an astounding 155 frames per second while still achieving double the mAP of other real-time detectors. Compared to state-of-the-art detection systems, YOLO makes more localization errors but is less likely to predict false positives on background. Finally, YOLO learns very general representations of objects. It outperforms other detection methods, including DPM and R-CNN, when generalizing from natural images to other domains like artwork.
The abstract opens by contrasting YOLO with traditional object detection algorithms, which repurpose classifiers for detection, for example by running a convolutional network over sliding windows across the image. YOLO's innovation is to replace those fixed, hand-designed detection windows with adaptive ones: bounding boxes that the network is designed, or rather trained, to predict itself.
Instead, we frame object detection as a regression problem to spatially separated bounding boxes and associated class probabilities.
Accordingly, YOLO uses a single neural network to predict the bounding boxes and the object class probabilities.
The Introduction section largely contrasts the complexity of earlier detection pipelines (sliding windows) with the simplicity of YOLO. As for the name, YOLO (You Only Look Once) means that, like a human, the system recognizes which objects are in an image, and where, with just one glance; the name is presumably meant as a contrast with sliding-window methods, which must repeatedly move and re-evaluate the candidate detection box. This part of the paper then presents three advantages of YOLO over existing object detection algorithms, which the abstract already hints at: it is extremely fast, it reasons globally over the whole image (so it is less likely to predict false positives on background), and it learns very general representations of objects.
First, the input image is divided into an S×S grid. If the center of an object falls into a grid cell, that grid cell is responsible for detecting the object, as the sketch below illustrates.
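As a concrete illustration of this "responsible cell" rule, here is a minimal Python sketch. It is my own illustration, not code from the paper: the function name is hypothetical, and it assumes box centers are normalized to [0, 1] relative to the image.

```python
# A minimal sketch (not the authors' code) of how an object's center
# determines the responsible grid cell, assuming S = 7 and box centers
# normalized to [0, 1] relative to the image.
def responsible_cell(cx: float, cy: float, S: int = 7) -> tuple[int, int]:
    """Return the (row, col) of the grid cell whose region contains (cx, cy)."""
    col = min(int(cx * S), S - 1)  # clamp so cx == 1.0 stays in the last column
    row = min(int(cy * S), S - 1)
    return row, col

# Example: an object centered at (0.52, 0.31) falls in cell (2, 3) of a 7x7 grid.
print(responsible_cell(0.52, 0.31))  # -> (2, 3)
```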
Each grid cell predicts B bounding boxes and a confidence score for each of those boxes; in addition, each grid cell predicts one set of conditional class probabilities. Note that each grid cell predicts only one set of class probabilities, not one set per bounding box.
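To make this layout concrete, here is a small sketch of the prediction tensor under the settings the paper uses for PASCAL VOC (S = 7, B = 2, C = 20). The tensor here is random and only illustrates shapes, not a real model output.

```python
import torch

# Sketch of YOLOv1's prediction layout with S = 7, B = 2, C = 20.
# Each cell carries B boxes of (x, y, w, h, confidence) plus ONE shared
# set of C class probabilities.
S, B, C = 7, 2, 20
pred = torch.randn(S, S, B * 5 + C)   # shape (7, 7, 30)

cell = pred[2, 3]                     # predictions of one grid cell
boxes = cell[: B * 5].view(B, 5)      # B boxes: (x, y, w, h, confidence)
class_probs = cell[B * 5:]            # one set of Pr(Class_i | Object), not one per box
print(pred.shape, boxes.shape, class_probs.shape)
```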
To be precise, each grid cell's confidence in one of its B bounding boxes is the probability that the box contains an object multiplied by the IOU (intersection over union) between the predicted box and the ground-truth box. This is denoted confidence and written as:
$$Confidence = \Pr(Object) \times IOU_{pred}^{truth}$$
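Below is a minimal Python sketch of this confidence target. It is my own illustration, not code from the paper: the `iou` helper and the sample coordinates are hypothetical, and boxes are assumed to be in (x1, y1, x2, y2) corner form.

```python
# A minimal sketch (hypothetical helper, not from the paper) of the
# confidence target: Pr(Object) * IOU between the predicted and
# ground-truth boxes, with boxes as (x1, y1, x2, y2) corners.
def iou(box_a, box_b):
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Intersection rectangle (zero area if the boxes do not overlap).
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0

# For a training target, Pr(Object) is 1 if an object's center falls in the
# cell and 0 otherwise, so the confidence target reduces to the IOU (or 0).
pr_object = 1.0
confidence = pr_object * iou((0.2, 0.2, 0.6, 0.6), (0.3, 0.25, 0.65, 0.6))
print(round(confidence, 3))  # -> 0.592
```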
This is the first equation to appear in the paper. One note: this article will analyze and derive every equation in the paper as it comes up, so don't worry about them.