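2014 – ECCV – Ten Years of Pedestrian Detection, What Have We Learned?
The paper reviews the performance of more than 40 pedestrian detectors on the Caltech pedestrian detection benchmark and groups them into three main families of approaches. The authors also use a decision-forest detector to combine several of the techniques and study how complementary they are.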
Viola&Jones variants
HOG+SVM rigid templates
Deformable part detectors (DPM)
Convolutional neural networks (ConvNets)
• Existing datasets
INRIA [1]: an image dataset; few images, but high-quality annotations of pedestrians in diverse settings (city, beach, mountains).
1. Dalal, N., Triggs, B.: Histograms of oriented gradients for human detection. In: CVPR. (2005)
INRIA stands for the French National Institute for Research in Computer Science and Control. The detector returned by OpenCV's built-in getDefaultPeopleDetector() is this method. [http://blog.csdn.net/zhazhiqiang/article/details/20220787]
ETH [2], TUD-Brussels [3]: medium-sized video datasets (ETH includes stereo).
2. Ess, A., Leibe, B., Schindler, K., Van Gool, L.: A mobile vision system for robust multi-person tracking. In: CVPR, IEEE Press (June 2008)
3. Wojek, C., Walk, S., Schiele, B.: Multi-cue onboard pedestrian detection. In: CVPR. (2009)
Daimler [4] (Daimler stereo [5])
4. Enzweiler, M., Gavrila, D.M.: Monocular pedestrian detection: Survey and experiments. PAMI (2009)
5. Keller, C., Fernandez, D., Gavrila, D.: Dense stereo-based ROI generation for pedestrian detection. In: DAGM. (2009)
Caltech-USA [6], KITTI [7]: large, challenging datasets (KITTI includes stereo).
(Caltech-USA)
6. Dollar, P., Wojek, C., Schiele, B., Perona, P.: Pedestrian detection: A benchmark. In: CVPR. (2009)
(KITTI)
7. Geiger, A., Lenz, P., Urtasun, R.: Are we ready for autonomous driving? The KITTI vision benchmark suite. In: Conference on Computer Vision and Pattern Recognition (CVPR). (2012)
KITTI is evaluated with the area under the precision-recall curve (AUC, higher is better); the other datasets are evaluated with the log-average miss-rate (MR, lower is better).
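To make the MR metric concrete, here is a minimal sketch of a log-average miss-rate computation. It assumes the usual Caltech convention, which these notes do not spell out: the miss rate is sampled at nine false-positives-per-image (FPPI) values evenly spaced in log space between 10^-2 and 10^0, and the samples are combined with a geometric mean. The function name and the input format (a curve sorted by increasing FPPI) are illustrative choices, not taken from the benchmark toolbox.

```python
import math


def log_average_miss_rate(fppi, miss_rate, num_points=9):
    """Geometric mean of the miss rate sampled at log-spaced FPPI values.

    fppi and miss_rate describe one detector's curve, sorted by
    increasing FPPI. Assumed convention: reference points span
    [1e-2, 1e0]; if the curve has no point at or below a reference
    FPPI, the miss rate there is taken to be 1.0.
    """
    refs = [10 ** (-2 + 2 * i / (num_points - 1)) for i in range(num_points)]
    samples = []
    for ref in refs:
        mr = 1.0  # nothing detected yet at this FPPI budget
        for x, y in zip(fppi, miss_rate):
            if x <= ref:
                mr = y  # keep the last curve point not exceeding ref
            else:
                break
        samples.append(max(mr, 1e-10))  # floor avoids log(0)
    return math.exp(sum(math.log(s) for s in samples) / len(samples))
```

Because the mean is taken in log space, a detector is rewarded for low miss rates across the whole FPPI range rather than at a single operating point.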
• Discussion of the methods
VJ detector (2003): sliding window + Haar features
HOG (2005) → deformable part model (DPM) (2008)
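The sliding-window + Haar combination can be sketched as follows. This is a minimal illustration, not the VJ implementation: it builds a summed-area table (integral image), evaluates one two-rectangle Haar-like feature with a handful of table lookups, and enumerates window positions. All function names are hypothetical, and images are plain nested lists to keep the sketch dependency-free.

```python
def integral_image(img):
    """Return an (h+1) x (w+1) summed-area table with a zero border."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def box_sum(ii, x, y, w, h):
    """Sum of the w x h rectangle with top-left corner (x, y): 4 lookups."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]


def haar_two_rect_horizontal(ii, x, y, w, h):
    """Left half minus right half of a w x h window (w assumed even)."""
    half = w // 2
    return box_sum(ii, x, y, half, h) - box_sum(ii, x + half, y, half, h)


def sliding_windows(img_w, img_h, win_w, win_h, stride):
    """Yield the top-left corner of every window that fits in the image."""
    for y in range(0, img_h - win_h + 1, stride):
        for x in range(0, img_w - win_w + 1, stride):
            yield x, y
```

The table is computed once per image, after which every rectangle sum costs the same regardless of its size. That constant cost is what makes exhaustive window scanning affordable.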
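Factors that improve performance:
1. Training data
The Caltech pedestrian detection benchmark uses per-image evaluation metrics (which is why some early methods score very poorly); results also depend on the training set each method used.
2. Solution families
The three main families today are DPM variants, deep networks (DN), and decision forests (DF).
3. Classifier
Linear or non-linear; the type of classifier does not seem to matter much.
4. Additional data
Depth maps, optical flow, tracking, LIDAR, etc.
5. Context
Information from the surroundings; its impact is limited compared with additional data or deep architectures.
6. Deformable parts
Fairly effective for handling occlusion.
7. Multi-scale models
During training, samples are not rescaled to one common feature size; instead, models are trained for different image sizes. Training takes longer, but test time is largely unaffected.
8. Deep architectures
On par with the other two families.
9. Better features
One of the most critical factors.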
• Experiments
The experiments use the Integral Channels Features framework, a decision-forest approach:
In particular, we use the (open source) SquaresChnFtrs baseline described in [31]: 2048 level-2 decision trees (3 threshold comparisons per tree) over HOG+LUV channels (10 channels), composing one 64 × 128 pixels template learned via vanilla AdaBoost and few bootstrapping rounds of hard negative mining.
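To illustrate what "level-2 decision trees (3 threshold comparisons per tree)" means, here is a minimal sketch of one such tree and of a boosted forest score. It is not the SquaresChnFtrs code: the class names, the leaf ordering, and the use of a plain feature vector (standing in for sums over square regions of the HOG+LUV channels) are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Node:
    feature: int      # index into the feature vector
    threshold: float  # test is: feature value > threshold


@dataclass
class Level2Tree:
    root: Node
    left: Node               # child used when the root test fails
    right: Node              # child used when the root test passes
    leaves: Sequence[float]  # 4 outputs, indexed by 2 * root_bit + child_bit

    def predict(self, f: Sequence[float]) -> float:
        root_bit = int(f[self.root.feature] > self.root.threshold)
        child = self.right if root_bit else self.left
        child_bit = int(f[child.feature] > child.threshold)
        return self.leaves[2 * root_bit + child_bit]


def forest_score(trees: Sequence[Level2Tree], f: Sequence[float]) -> float:
    """Boosted score: sum of per-tree leaf outputs (assumed pre-weighted)."""
    return sum(tree.predict(f) for tree in trees)
```

Each tree stores three thresholds (root plus two children) but evaluates only two of them per window. A window would be accepted when the summed score over all trees exceeds a detection threshold.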
1 Reviewing the effect of features
Much of the progress since VJ can be explained by the use of better features, based on oriented gradients and colour information. Simple tweaks to these well known features (e.g. projection onto the DCT basis) can still yield noticeable improvements.
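As a rough illustration of "projection onto the DCT basis", the sketch below computes the orthonormal 2-D DCT-II coefficients of a small square patch. How the paper applies the projection to the feature channels is not described in these notes, so the patch-level formulation and the function names here are assumptions.

```python
import math


def dct_basis(n):
    """Orthonormal 1-D DCT-II basis; row k is the k-th basis vector."""
    basis = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        basis.append(
            [scale * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n)]
        )
    return basis


def dct2(patch):
    """Project a square patch onto the separable 2-D DCT-II basis."""
    n = len(patch)
    b = dct_basis(n)
    # Transform along each row first, then along each column.
    tmp = [
        [sum(b[k][i] * patch[r][i] for i in range(n)) for k in range(n)]
        for r in range(n)
    ]
    return [
        [sum(b[k][r] * tmp[r][c] for r in range(n)) for c in range(n)]
        for k in range(n)
    ]
```

A constant patch projects entirely onto the first (DC) coefficient, while the higher-order coefficients respond to edges and texture at increasing spatial frequency.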
2 Complementarity of approaches
Our experiments show that adding extra features, flow, and context information are largely complementary (12% gain, instead of 3+7+5%), even when starting from a strong detector.
It remains to be seen if future progress in detection quality will be obtained by further insights of the “core” algorithm (thus further diminishing the relative improvement of add-ons), or by extending the diversity of techniques employed inside a system.
3 How much model capacity is needed?
Our results indicate that research on increasing the discriminative power of detectors is likely to further improve detection quality. More discriminative power can originate from more and better features or more complex classifiers.
4 Generalisation across datasets
While detectors learned on one dataset may not necessarily transfer well to others, their ranking is stable across datasets, suggesting that insights can be learned from well-performing methods regardless of the benchmark.