Small target vehicle detection based on perceptual enhancement and multi-scale feature fusion
Abstract: Small vehicle targets carry little information and have weak feature representations, which leads to low detection accuracy and missed detections in existing algorithms. To address this, this paper proposes a small target vehicle detection algorithm based on perception enhancement and multi-scale feature fusion. First, a spatial local feature perception-enhanced backbone network, SLFPB-ST, is designed to reduce the severe loss of small-target feature information during feature extraction. Second, a multi-scale information fusion network (MSIFN) is proposed, which assigns weights when fusing features at multiple scales so that more detailed information is retained; a large objects restriction block (LRB) is added to MSIFN to constrain large-object features while preserving the feature representation of small targets. Finally, an anchor-free mechanism is adopted to reduce missed detections of small targets and improve detection accuracy. Experimental results on the UA-DETRAC and Vehicle datasets show that, compared with the Swin Transformer algorithm, the proposed algorithm improves mAP, AP50, and AP75 by 5.15, 9.35, and 4.35 percentage points, respectively, while the parameter size increases by 14 MB and the detection speed decreases by 0.4 frames/s. These results indicate that the algorithm has good robustness and applicability.

Keywords:
- small target vehicle detection /
- Swin Transformer /
- multi-scale feature fusion /
- dilated convolution /
- GIoU
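The keywords list GIoU (generalized intersection over union, reference [22]), although this page does not state how the paper applies it. As a minimal sketch of the standard definition only, assuming axis-aligned boxes given as `(x1, y1, x2, y2)` with positive area, the quantity can be computed as follows. The function names `giou` and `giou_loss` are illustrative and are not taken from the paper.

```python
def giou(box_a, box_b):
    """Generalized IoU of two axis-aligned boxes (x1, y1, x2, y2).

    Assumes x1 < x2 and y1 < y2 for both boxes (positive area).
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)

    # Intersection rectangle; width/height are clamped to 0 when boxes are disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    union = area_a + area_b - inter
    iou = inter / union

    # Smallest axis-aligned box enclosing both inputs.
    enclose_w = max(ax2, bx2) - min(ax1, bx1)
    enclose_h = max(ay2, by2) - min(ay1, by1)
    enclose = enclose_w * enclose_h

    # GIoU subtracts the fraction of the enclosing box not covered by the union.
    return iou - (enclose - union) / enclose


def giou_loss(box_a, box_b):
    """Loss form commonly used for box regression: 1 - GIoU."""
    return 1.0 - giou(box_a, box_b)
```

Unlike plain IoU, GIoU stays informative for non-overlapping boxes: it becomes negative and decreases as the boxes move apart, which is one reason it is often used as a regression loss when predicted boxes may not overlap the ground truth at all.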
Table 1. Comparison of detection results between the proposed algorithm and mainstream algorithms

| Algorithm | Backbone | mAP/% | AP50/% | AP75/% | AP(S)/% | AP(M)/% | AP(L)/% | Parameters/MB | Detection speed/(frames/s) |
|---|---|---|---|---|---|---|---|---|---|
| Faster R-CNN[5] | ResNet101 | 51.85 | 73.75 | 56.27 | 33.28 | 54.84 | 64.18 | 243.6 | 26.8 |
| SSD[7] | VGG16 | 49.35 | 66.42 | 48.35 | 28.24 | 51.52 | 60.80 | 92.1 | 52.4 |
| RetinaNet[8] | ResNet50 | 54.85 | 77.15 | 59.16 | 36.65 | 59.29 | 62.21 | 96.6 | 46.8 |
| CenterNet[9] | DLA-34 | 55.65 | 79.04 | 59.24 | 36.12 | 58.14 | 66.10 | 88.8 | 45.5 |
| YOLOv4[11] | Darknet53 | 56.85 | 81.24 | 62.12 | 36.55 | 60.10 | 66.90 | 198.5 | 30.6 |
| DETR[14] | ResNet-50 | 48.74 | 68.22 | 47.20 | 27.56 | 49.50 | 58.85 | 142 | 35.7 |
| Original algorithm[15] | Swin-T | 53.45 | 74.85 | 56.45 | 34.64 | 55.40 | 64.65 | 114 | 42.8 |
| Proposed | SLFPB-ST | 58.60 | 84.20 | 60.80 | 37.15 | 57.60 | 67.56 | 128 | 42.4 |
Table 2. Comparison of results of different algorithms
Table 3. Performance comparison between the proposed algorithm and original algorithm

| Algorithm | AP50/% | F1/% | Precision/% | Recall/% | Detection speed/(frames/s) |
|---|---|---|---|---|---|
| Original algorithm | 74.85 | 73.60 | 83.42 | 72.65 | 42.8 |
| Proposed | 84.20 | 85.16 | 91.30 | 79.48 | 42.4 |
Table 4. Results of ablation experiments
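| Enabled components (√ marks, among Swin-T, SLFPB, MSIFN, LRB) | Detection speed/(frames/s) | mAP/% |
|---|---|---|
| √ | 35.2 | 53.45 |
| √ √ | 38.5 | 53.95 |
| √ √ | 36.9 | 55.12 |
| √ √ √ | 33.4 | 56.74 |
| √ √ √ √ | 42.4 | 58.60 |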
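The improvements quoted in the abstract can be re-derived from the Swin-T and SLFPB-ST rows of Table 1. The short check below is only an arithmetic sketch: the dictionary keys and the helper name `deltas` are illustrative, and the values are copied from Table 1.

```python
# Values copied from Table 1 (Swin-T baseline vs. SLFPB-ST, the proposed algorithm).
baseline = {"mAP": 53.45, "AP50": 74.85, "AP75": 56.45, "params_MB": 114, "fps": 42.8}
proposed = {"mAP": 58.60, "AP50": 84.20, "AP75": 60.80, "params_MB": 128, "fps": 42.4}


def deltas(new, old):
    """Absolute difference per metric, rounded to two decimals."""
    return {name: round(new[name] - old[name], 2) for name in old}


diff = deltas(proposed, baseline)
for name, value in diff.items():
    print(f"{name}: {value:+.2f}")
```

Because mAP, AP50, and AP75 are already expressed in percent, these differences are absolute gains in percentage points rather than relative improvements.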
[1] XIAO Z Y, ZHANG G B. An attention-based odometry framework for multisensory unmanned ground vehicles (UGVs)[J]. Drones, 2023, 7(12): 699.
[2] WANG K, XIANG Q X. Improved Yolov4 algorithm for vehicle weak object detection[J]. Journal of Chinese Inertial Technology, 2023, 31(8): 797-805 (in Chinese).
[3] LIANG H, SEO S. UAV low-altitude remote sensing inspection system using a small target detection network for Helmet wear detection[J]. Remote Sensing, 2023, 15(1): 196.
[4] GIRSHICK R. Fast R-CNN[C]//Proceedings of the IEEE International Conference on Computer Vision. Piscataway: IEEE Press, 2016: 1440-1448.
[5] REN S Q, HE K M, GIRSHICK R, et al. Faster R-CNN: towards real-time object detection with region proposal networks[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 39(6): 1137-1149.
[6] CAI Z W, VASCONCELOS N. Cascade R-CNN: high quality object detection and instance segmentation[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2021, 43(5): 1483-1498.
[7] CHEN W P, QIAO Y T, LI Y J. Inception-SSD: an improved single shot detector for vehicle detection[J]. Journal of Ambient Intelligence and Humanized Computing, 2022, 13(11): 5047-5053.
[8] LIN T Y, GOYAL P, GIRSHICK R, et al. Focal loss for dense object detection[C]//Proceedings of the IEEE International Conference on Computer Vision. Piscataway: IEEE Press, 2017: 2999-3007.
[9] DUAN K W, BAI S, XIE L X, et al. CenterNet: keypoint triplets for object detection[C]//Proceedings of the IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE Press, 2020: 6568-6577.
[10] DING X W, YANG R D. Vehicle and parking space detection based on improved YOLO network model[J]. Journal of Physics: Conference Series, 2019, 1325(1): 012084.
[11] SUMIT S S, WATADA J, ROY A, et al. In object detection deep learning methods, YOLO shows supremum to Mask R-CNN[J]. Journal of Physics: Conference Series, 2020, 1529(4): 042086.
[12] SU X Y, LIU H M, TAO L F, et al. An end-to-end framework for remaining useful life prediction of rolling bearing based on feature pre-extraction mechanism and deep adaptive Transformer model[J]. Computers & Industrial Engineering, 2021, 161: 107531.
[13] KOJIMA T, IWASAWA Y, MATSUO Y. Robustifying vision Transformer without retraining from scratch using attention-based test-time adaptation[J]. New Generation Computing, 2023, 41(1): 5-24.
[14] QI F, CHEN G M, LIU J Y, et al. End-to-end pest detection on an improved deformable DETR with multihead criss cross attention[J]. Ecological Informatics, 2022, 72: 101902.
[15] LIU Z, LIN Y T, CAO Y, et al. Swin Transformer: hierarchical vision Transformer using shifted windows[C]//Proceedings of the IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE Press, 2022: 9992-10002.
[16] XU X. Research on a small target object detection algorithm for electric transmission lines based on convolutional neural network[J]. IAENG International Journal of Computer Science, 2023, 50(2): 375-380.
[17] FU C Y, LIU W, RANGA A, et al. DSSD: deconvolutional single shot detector[EB/OL]. (2017-01-23)[2024-03-01]. https://arxiv.org/abs/1701.06659.
[18] ZHANG Q, ZHANG H Y, LU X W. Adaptive feature fusion for small object detection[J]. Applied Sciences, 2022, 12(22): 11854.
[19] CHEN Y K, ZHANG P Z, KONG T, et al. Scale-aware automatic augmentations for object detection with dynamic training[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2023, 45(2): 2367-2383.
[20] LIU S T, HUANG D, WANG Y H. Learning spatial fusion for single-shot object detection[EB/OL]. (2019-11-25)[2024-03-01]. https://arxiv.org/abs/1911.09516.
[21] ZOPH B, CUBUK E D, GHIASI G, et al. Learning data augmentation strategies for object detection[C]//Proceedings of the European Conference on Computer Vision. Berlin: Springer, 2020: 566-583.
[22] REZATOFIGHI H, TSOI N, GWAK J, et al. Generalized intersection over union: a metric and a loss for bounding box regression[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2020: 658-666.

