Abstract: To make content analysis in public security surveillance systems more accurate and intelligent, and to improve its usefulness in field operations, a lightweight multi-target real-time detection algorithm is proposed. First, the multi-fusion stepped cascade structure of CBNet is added to the CenterNet detection network, which addresses the insufficient feature extraction capability of the backbone network on everyday surveillance video. Second, the network is compressed by model pruning to reduce the number of parameters, which speeds up surveillance video analysis. The model is trained and tested on a subset of the COCO dataset together with field data collected by the authors, and ablation experiments are conducted against other mainstream detection algorithms (YOLO, Faster-RCNN, SSD, etc.). The experimental results show that the proposed model effectively balances speed and precision in public security surveillance and generalizes well.
Key words: target detection; deep learning; model compression; model distillation; cascade fusion
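The abstract states only that the network is compressed by model pruning; it does not name the pruning criterion. As an illustration of the general idea, the following is a minimal sketch that assumes an L1-norm ranking of output channels, a common criterion that may differ from the one the authors used. The function names, the toy weights, and the `keep_ratio` parameter are all hypothetical.

```python
# Minimal sketch of magnitude-based channel pruning.
# ASSUMPTION: channels are ranked by the L1 norm of their weights; the
# source does not say which pruning rule was actually applied.

def l1_norm(filter_weights):
    """Sum of absolute weights of one output channel's filter."""
    return sum(abs(w) for w in filter_weights)


def prune_channels(filters, keep_ratio):
    """Keep the output channels with the largest L1 norms.

    filters: list of flat weight lists, one per output channel.
    keep_ratio: fraction of channels to keep, in (0, 1].
    Returns (kept_indices, pruned_filters) in the original channel order.
    """
    if not 0.0 < keep_ratio <= 1.0:
        raise ValueError("keep_ratio must be in (0, 1]")
    n_keep = max(1, round(len(filters) * keep_ratio))
    # Rank channel indices from largest to smallest L1 norm.
    ranked = sorted(
        range(len(filters)),
        key=lambda i: l1_norm(filters[i]),
        reverse=True,
    )
    kept = sorted(ranked[:n_keep])
    return kept, [filters[i] for i in kept]


# Hypothetical toy layer: 4 output channels with 3 weights each.
toy_filters = [
    [0.9, -0.8, 0.7],
    [0.01, 0.02, -0.01],
    [0.5, 0.5, -0.5],
    [0.0, 0.1, 0.0],
]
kept, pruned = prune_channels(toy_filters, keep_ratio=0.5)
print("kept channel indices:", kept)
```

In a real network the following layer's input channels would also have to be sliced to match the kept indices, and the pruned model is usually fine-tuned afterwards to recover accuracy.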
Table 1. Precision comparison of mainstream models

| Model | Recall | Accuracy | Memory/KB |
|---|---|---|---|
| ResNet-18 | 0.83 | 0.80 | 25 014 |
| ResNet-18+hourglass+CBNet | 0.92 | 0.90 | 24 254 |
| ResNet-18×2+CBNet | 0.86 | 0.83 | 22 254 |
| Hourglass×2+CBNet | 0.85 | 0.83 | 23 280 |
| SSD | 0.87 | 0.84 | 23 020 |
| slim YOLOV3 | 0.90 | 0.88 | 33 894 |

Table 2. Model accuracy before and after compression

| Model | Recall | Accuracy | Memory/KB |
|---|---|---|---|
| ResNet-18+hourglass+CBNet | 0.92 | 0.90 | 24 254 |
| ResNet-18+hourglass+CBNet (after pruning) | 0.90 | 0.88 | 3 676 |

Table 3. Comparison of model inference speed

| Model | Frames/s |
|---|---|
| CenterNet (ResNet-18) | 6.6 |
| ResNet-18+hourglass+CBNet | 9.9 |
| ResNet-18 (depthwise separable convolution)+hourglass+CBNet | 11.2 |
| ResNet-18+hourglass+CBNet (after pruning) | 19.2 |
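The trade-off reported in Tables 2 and 3 can be summarized with a few ratios. The short script below is only a worked calculation on the published figures; the variable names are illustrative and no new measurements are introduced.

```python
# Worked arithmetic on the figures reported in Tables 2 and 3.
# All numbers are copied from the tables; variable names are illustrative.

memory_kb_before = 24254   # ResNet-18+hourglass+CBNet
memory_kb_after = 3676     # same model after pruning
recall_before, recall_after = 0.92, 0.90
accuracy_before, accuracy_after = 0.90, 0.88

fps_centernet = 6.6        # CenterNet (ResNet-18) baseline
fps_unpruned = 9.9         # ResNet-18+hourglass+CBNet
fps_pruned = 19.2          # same model after pruning

# How many times smaller the pruned model is, and the fraction removed.
compression_ratio = memory_kb_before / memory_kb_after
size_reduction = 1 - memory_kb_after / memory_kb_before

# Speed-up relative to the baseline and to the unpruned composite model.
speedup_vs_baseline = fps_pruned / fps_centernet
speedup_vs_unpruned = fps_pruned / fps_unpruned

# Cost of pruning in detection quality.
recall_drop = recall_before - recall_after
accuracy_drop = accuracy_before - accuracy_after

print(f"compression ratio: {compression_ratio:.1f}x")
print(f"size reduction: {size_reduction:.1%}")
print(f"speed-up vs CenterNet (ResNet-18): {speedup_vs_baseline:.2f}x")
print(f"speed-up vs unpruned model: {speedup_vs_unpruned:.2f}x")
print(f"recall drop: {recall_drop:.2f}, accuracy drop: {accuracy_drop:.2f}")
```

On these figures, pruning shrinks the model to roughly one sixth of its original size and nearly doubles its frame rate, at a cost of 0.02 in both recall and accuracy.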