Small target vehicle detection based on perceptual enhancement and multi scale feature fusion

SHEN Yu, LI Yangyang, LI Bohao, GAO Baoqu, WEI Ziyi, BAI Shan

Citation: SHEN Y,LI Y Y,LI B H,et al. Small target vehicle detection based on perceptual enhancement and multi scale feature fusion[J]. Journal of Beijing University of Aeronautics and Astronautics,2026,52(5):1422-1432 (in Chinese)

doi: 10.13700/j.bh.1001-5965.2024.0124
Funds: National Natural Science Foundation of China (61861025, 62241106)

Details
    Corresponding author:

    E-mail: 18609311366@163.com

  • CLC number: V221+.3; TB553

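  • Abstract:

    Small vehicle targets carry little information and have weak feature representations, so existing algorithms suffer from low detection accuracy and missed detections. To address this, a small target vehicle detection algorithm based on perceptual enhancement and multi-scale feature fusion is proposed. First, a spatial local feature perception enhanced backbone network, SLFPB-ST, is designed to reduce the severe loss of small-target feature information during feature extraction. Second, a multi-scale information fusion network (MSIFN) is proposed that assigns weights so that more detail information is attended to; a large objects restraint block (LRB) is added to MSIFN to constrain large-target features and preserve the feature representation of small targets. Third, an anchor-free mechanism is adopted to reduce missed detections of small targets and improve detection accuracy. Experimental results on the UA-DETRAC dataset and the Vehicle dataset show that, compared with the Swin Transformer algorithm, the proposed algorithm improves mAP, AP50, and AP75 by 5.15, 9.35, and 4.35 percentage points, respectively, while the parameter size increases by 14 MB and the detection speed decreases by 0.4 frames/s, which verifies that the algorithm has good robustness and applicability.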
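    The reported gains can be checked against the figures listed in Table 1 on this page. The snippet below is only a minimal arithmetic sketch: the dictionary keys and the helper name `metric_deltas` are illustrative assumptions, and the numbers are copied from the Table 1 rows for the Swin-T baseline and the proposed SLFPB-ST model. It suggests that the quoted improvements are absolute differences between the two rows.

```python
from decimal import Decimal

# Values copied from Table 1 (Swin-T baseline row and proposed SLFPB-ST row).
# The key names are illustrative, not taken from the paper.
baseline = {"mAP": "53.45", "AP50": "74.85", "AP75": "56.45",
            "params_MB": "114", "fps": "42.8"}
proposed = {"mAP": "58.60", "AP50": "84.20", "AP75": "60.80",
            "params_MB": "128", "fps": "42.4"}


def metric_deltas(new, old):
    """Return new - old for each metric, using Decimal to avoid float noise."""
    return {key: Decimal(new[key]) - Decimal(old[key]) for key in old}


deltas = metric_deltas(proposed, baseline)
for name, value in deltas.items():
    # The "+" format flag prints an explicit sign for gains and losses.
    print(f"{name}: {value:+}")
```

    Decimal arithmetic is used here so that the differences compare exactly with the two-decimal values quoted in the abstract.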

     

  • Figure 1. Framework of the proposed algorithm

    Figure 2. Network framework of Swin Transformer

    Figure 3. Network framework of Swin Transformer Block

    Figure 4. SLFPB-ST backbone network

    Figure 5. Spatial local feature perception block

    Figure 6. Multi-scale information fusion network

    Figure 7. Process of RFA and ASF

    Figure 8. Large objects restraint block

    CAP: channel average pooling; σ: Sigmoid; CMP: channel max pooling; $ \otimes $: matrix product.

    Figure 9. Total loss curve

    Figure 10. Examples of partial samples of dataset

    Figure 11. Comparison of detection accuracy of each category between the proposed algorithm and original algorithm

    Figure 12. Comparison of small target detection results in Vehicle dataset

    Figure 13. Comparison of occlusion target detection results for UA-DETRAC dataset

    Figure 14. Comparison of detection results under different light conditions

    Figure 15. Comparison of visualized heat maps

    Table 1. Comparison of detection results between the proposed algorithm and mainstream algorithms

    | Algorithm | Backbone | mAP/% | AP50/% | AP75/% | AP(S)/% | AP(M)/% | AP(L)/% | Parameters/MB | Detection speed/(frames·s⁻¹) |
    |---|---|---|---|---|---|---|---|---|---|
    | Faster R-CNN[5] | ResNet101 | 51.85 | 73.75 | 56.27 | 33.28 | 54.84 | 64.18 | 243.6 | 26.8 |
    | SSD[7] | VGG16 | 49.35 | 66.42 | 48.35 | 28.24 | 51.52 | 60.80 | 92.1 | 52.4 |
    | RetinaNet[8] | ResNet50 | 54.85 | 77.15 | 59.16 | 36.65 | 59.29 | 62.21 | 96.6 | 46.8 |
    | CenterNet[9] | DLA-34 | 55.65 | 79.04 | 59.24 | 36.12 | 58.14 | 66.10 | 88.8 | 45.5 |
    | YOLOv4[11] | Darknet53 | 56.85 | 81.24 | 62.12 | 36.55 | 60.10 | 66.90 | 198.5 | 30.6 |
    | DETR[14] | ResNet-50 | 48.74 | 68.22 | 47.20 | 27.56 | 49.50 | 58.85 | 142 | 35.7 |
    | Original algorithm[15] | Swin-T | 53.45 | 74.85 | 56.45 | 34.64 | 55.40 | 64.65 | 114 | 42.8 |
    | Proposed | SLFPB-ST | 58.60 | 84.20 | 60.80 | 37.15 | 57.60 | 67.56 | 128 | 42.4 |

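    Table 2. Comparison of results of different algorithms

    | Algorithm | mAP/% | mAP gain/% | AP(S)/% | AP(S) gain/% | Parameters/MB | Parameter reduction/MB | Detection speed/(frames·s⁻¹) | Detection speed gain/(frames·s⁻¹) |
    |---|---|---|---|---|---|---|---|---|
    | Ref. [16] | 52.23 | 1.85 | 33.50 | 1.64 | 278 | −36 | 24.6 | 1.2 |
    | Ref. [17] | 56.40 | 0.70 | 26.64 | 0.65 | 124 | 14 | 42.0 | 4.5 |
    | Ref. [18] | 51.65 | 3.97 | 29.28 | 1.75 | 198 | −46 | 38.8 | −9.1 |
    | Proposed | 58.60 | 5.15 | 37.15 | 2.51 | 128 | −14 | 42.4 | −0.4 |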

    Table 3. Performance comparison between the proposed algorithm and original algorithm

    | Algorithm | AP50/% | F1/% | Precision/% | Recall/% | Detection speed/(frames·s⁻¹) |
    |---|---|---|---|---|---|
    | Original algorithm | 74.85 | 73.60 | 83.42 | 72.65 | 42.8 |
    | Proposed | 84.20 | 85.16 | 91.30 | 79.48 | 42.4 |

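    Table 4. Results of ablation experiments

    | Swin-T | SLFPB | MSIFN | LRB | Detection speed/(frames·s⁻¹) | mAP/% |
    |---|---|---|---|---|---|
    |  |  |  |  | 35.2 | 53.45 |
    |  |  |  |  | 38.5 | 53.95 |
    |  |  |  |  | 36.9 | 55.12 |
    |  |  |  |  | 33.4 | 56.74 |
    |  |  |  |  | 42.4 | 58.60 |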
  • References:
    [1] XIAO Z Y, ZHANG G B. An attention-based odometry framework for multisensory unmanned ground vehicles (UGVs)[J]. Drones, 2023, 7(12): 699.
    [2] WANG K, XIANG Q X. Improved Yolov4 algorithm for vehicle weak object detection[J]. Journal of Chinese Inertial Technology, 2023, 31(8): 797-805 (in Chinese).
    [3] LIANG H, SEO S. UAV low-altitude remote sensing inspection system using a small target detection network for Helmet wear detection[J]. Remote Sensing, 2023, 15(1): 196.
    [4] GIRSHICK R. Fast R-CNN[C]//Proceedings of the IEEE International Conference on Computer Vision. Piscataway: IEEE Press, 2016: 1440-1448.
    [5] REN S Q, HE K M, GIRSHICK R, et al. Faster R-CNN: towards real-time object detection with region proposal networks[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 39(6): 1137-1149.
    [6] CAI Z W, VASCONCELOS N. Cascade R-CNN: high quality object detection and instance segmentation[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2021, 43(5): 1483-1498.
    [7] CHEN W P, QIAO Y T, LI Y J. Inception-SSD: an improved single shot detector for vehicle detection[J]. Journal of Ambient Intelligence and Humanized Computing, 2022, 13(11): 5047-5053.
    [8] LIN T Y, GOYAL P, GIRSHICK R, et al. Focal loss for dense object detection[C]//Proceedings of the IEEE International Conference on Computer Vision. Piscataway: IEEE Press, 2017: 2999-3007.
    [9] DUAN K W, BAI S, XIE L X, et al. CenterNet: keypoint triplets for object detection[C]//Proceedings of the IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE Press, 2020: 6568-6577.
    [10] DING X W, YANG R D. Vehicle and parking space detection based on improved YOLO network model[J]. Journal of Physics: Conference Series, 2019, 1325(1): 012084.
    [11] SUMIT S S, WATADA J, ROY A, et al. In object detection deep learning methods, YOLO shows supremum to Mask R-CNN[J]. Journal of Physics: Conference Series, 2020, 1529(4): 042086.
    [12] SU X Y, LIU H M, TAO L F, et al. An end-to-end framework for remaining useful life prediction of rolling bearing based on feature pre-extraction mechanism and deep adaptive Transformer model[J]. Computers & Industrial Engineering, 2021, 161: 107531.
    [13] KOJIMA T, IWASAWA Y, MATSUO Y. Robustifying vision Transformer without retraining from scratch using attention-based test-time adaptation[J]. New Generation Computing, 2023, 41(1): 5-24.
    [14] QI F, CHEN G M, LIU J Y, et al. End-to-end pest detection on an improved deformable DETR with multihead criss cross attention[J]. Ecological Informatics, 2022, 72: 101902.
    [15] LIU Z, LIN Y T, CAO Y, et al. Swin Transformer: hierarchical vision Transformer using shifted windows[C]//Proceedings of the IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE Press, 2022: 9992-10002.
    [16] XU X. Research on a small target object detection algorithm for electric transmission lines based on convolutional neural network[J]. IAENG International Journal of Computer Science, 2023, 50(2): 375-380.
    [17] FU C Y, LIU W, RANGA A, et al. DSSD: deconvolutional single shot detector[EB/OL]. (2017-01-23)[2024-03-01]. https://arxiv.org/abs/1701.06659.
    [18] ZHANG Q, ZHANG H Y, LU X W. Adaptive feature fusion for small object detection[J]. Applied Sciences, 2022, 12(22): 11854.
    [19] CHEN Y K, ZHANG P Z, KONG T, et al. Scale-aware automatic augmentations for object detection with dynamic training[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2023, 45(2): 2367-2383.
    [20] LIU S T, HUANG D, WANG Y H. Learning spatial fusion for single-shot object detection[EB/OL]. (2019-11-25)[2024-03-01]. https://arxiv.org/abs/1911.09516.
    [21] ZOPH B, CUBUK E D, GHIASI G, et al. Learning data augmentation strategies for object detection[C]//Proceedings of the European Conference on Computer Vision. Berlin: Springer, 2020: 566-583.
    [22] REZATOFIGHI H, TSOI N, GWAK J, et al. Generalized intersection over union: a metric and a loss for bounding box regression[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2020: 658-666.
Publication history
  • Received: 2024-03-05
  • Accepted: 2024-06-21
  • Published online: 2024-07-19
  • Issue published: 2026-05-26
