Keywords:
- computer vision /
- object detection /
- deep learning /
- Region Proposal Network (RPN) /
- Non-Maximum Suppression (NMS)
Abstract: Object detection is an active research topic in computer vision. Current object detection methods based on deep learning fall into two categories: two-stage detection and one-stage detection. The former is more accurate and the latter is faster, and both rely on an anchor mechanism to improve detection performance. In this paper, an Attention Guidance (AG) module is introduced into a two-stage detection method based on a deep convolutional neural network. The module guides the anchor mechanism of the Region Proposal Network (RPN) so that the shapes of the preselected anchor boxes are more diverse. In addition, to address the false detections and missed detections of the traditional Non-Maximum Suppression (NMS) post-processing algorithm, a Confidence-factor NMS (Cf-NMS) algorithm is proposed, which contributes substantially to the overall performance of the model. Experimental results show that, at the cost of a slight decrease in speed, the proposed method is more accurate than both the RPN variants and existing state-of-the-art methods.
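For readers unfamiliar with the post-processing step the abstract refers to, the following is a minimal sketch of the traditional greedy NMS baseline, not of Cf-NMS, whose confidence factor is not specified in the abstract. The function names `iou` and `greedy_nms`, the `(x1, y1, x2, y2)` box layout, and the default threshold of 0.5 are illustrative assumptions.

```python
from typing import List, Sequence, Tuple

# Assumed box layout for this sketch: (x1, y1, x2, y2) with x2 > x1 and y2 > y1.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(
    boxes: Sequence[Box],
    scores: Sequence[float],
    iou_threshold: float = 0.5,  # hypothetical default, not taken from the paper
) -> List[int]:
    """Return indices of the boxes kept by traditional greedy NMS."""
    # Visit candidates from the highest score to the lowest.
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    while order:
        best = order.pop(0)
        keep.append(best)
        # Hard suppression: discard every remaining box that overlaps the
        # kept box by more than the threshold, whatever its own score is.
        order = [i for i in order if iou(boxes[best], boxes[i]) <= iou_threshold]
    return keep


if __name__ == "__main__":
    boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
    scores = [0.9, 0.8, 0.7]
    print(greedy_nms(boxes, scores))  # indices of the surviving boxes
```

The hard cut in the suppression step is one plausible source of the missed detections mentioned above: a second, genuinely distinct object that overlaps a higher-scoring box by more than the threshold is discarded outright.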
Table 1. Average recall rate of region proposal

| Method | AR100 | AR300 | AR500 | ARS | ARM | ARL | Runtime/s |
|---|---|---|---|---|---|---|---|
| RPN+9 anchors | 45.7 | 53.4 | 57.8 | 28.7 | 52.8 | 63.9 | 0.09 |
| RPN+12 anchors | 50.2 | 56.6 | 58.3 | 33.9 | 58.2 | 67.5 | 0.09 |
| RPN+AG | 53.2 | 60.2 | 61.3 | 39.6 | 62.3 | 70.2 | 0.11 |

Table 2. Experimental results of mAP by different NMS algorithms
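The accuracy and speed trade-off reported in Table 1 can be made explicit with a little arithmetic. The sketch below copies the Table 1 values and computes the per-column difference between RPN+AG and each RPN baseline; the dictionary layout and the helper name `gains` are illustrative assumptions.

```python
# Values copied from Table 1 (average recall columns and runtime in seconds).
TABLE1 = {
    "RPN+9 anchors": {"AR100": 45.7, "AR300": 53.4, "AR500": 57.8,
                      "ARS": 28.7, "ARM": 52.8, "ARL": 63.9, "time_s": 0.09},
    "RPN+12 anchors": {"AR100": 50.2, "AR300": 56.6, "AR500": 58.3,
                       "ARS": 33.9, "ARM": 58.2, "ARL": 67.5, "time_s": 0.09},
    "RPN+AG": {"AR100": 53.2, "AR300": 60.2, "AR500": 61.3,
               "ARS": 39.6, "ARM": 62.3, "ARL": 70.2, "time_s": 0.11},
}


def gains(baseline: str, candidate: str) -> dict:
    """Per-column difference (candidate minus baseline), rounded to 2 decimals."""
    base, cand = TABLE1[baseline], TABLE1[candidate]
    return {key: round(cand[key] - base[key], 2) for key in base}


if __name__ == "__main__":
    for baseline in ("RPN+9 anchors", "RPN+12 anchors"):
        print(baseline, "->", gains(baseline, "RPN+AG"))
```

Against the 9-anchor RPN, for example, RPN+AG gains 7.5 points of AR100 and 10.9 points of ARS while the runtime rises by 0.02 s, which matches the abstract's statement of higher accuracy at a slight cost in speed.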