Object detection algorithm for UAV viewpoint images based on feature information complementation and enhancement

WU Kaijun, PU Zhuo

Citation: WU K J, PU Z. Object detection algorithm for UAV viewpoint images based on feature information complementation and enhancement[J]. Journal of Beijing University of Aeronautics and Astronautics, 2026, 52(5): 1445-1455 (in Chinese)

doi: 10.13700/j.bh.1001-5965.2024.0190
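Funds: Natural Science Foundation of Gansu Province (23JRRA913); Inner Mongolia Autonomous Region Key Research and Development and Achievement Transformation Program (2023YFSH0043); Key Research and Development Project of Lanzhou Jiaotong University (ZDYF2304)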

Corresponding author:

    E-mail: shiyuepz@163.com

  • CLC number: V279; TP391.41

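  • Abstract:

    To address the large variation in object scale, the high proportion of small objects, and the severe background noise in unmanned aerial vehicle (UAV) viewpoint images, an object detection algorithm for UAV viewpoint images based on feature information enhancement and complementation is proposed. First, a multivariate fusion spatial pyramid pooling-fast (MFSPPF) method is proposed to capture richer multi-scale information from high-level semantic information. Second, a multi-branch semantic enhancement (MBSE) module is designed, which extracts rich multi-scale features through multiple branches and builds connections among them, preventing the loss of important feature information when information is passed on during feature fusion. Third, a detailed feature complement (DFC) module is proposed, which extracts and refines low-level feature information into rich fine-grained features and, through feature fusion, supplements the detail information in high-level features. Experiments on the VisDrone2021 dataset show that, compared with the baseline algorithm YOLOv8m, the proposed algorithm improves average precision (AP), AP50, AP75, AP(s), AP(m), and AP(l) by 3.7, 5.4, 3.9, 3.1, 4.0, and 7.9 percentage points, respectively. The proposed method is equally applicable to other YOLOv8 models. Compared with other algorithms, the proposed algorithm performs well under different intersection over union (IoU) thresholds and across object sizes while maintaining a fast detection speed, making it suitable for object detection in UAV viewpoint images.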
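The metrics named in the abstract can be made concrete with a small sketch. It shows how a single prediction can count as correct at the AP50 threshold yet fail at AP75, and how a box is assigned to a size group. It assumes COCO-style conventions (IoU thresholds of 0.5 and 0.75; small, medium, and large split at areas of 32² and 96² pixels), which the abstract does not state; the helper names `iou`, `is_match`, and `size_bucket` are illustrative only.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) corners in pixels


def area(box: Box) -> float:
    """Area of an axis-aligned box; zero if the box is degenerate."""
    return max(0.0, box[2] - box[0]) * max(0.0, box[3] - box[1])


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0


def is_match(pred: Box, gt: Box, threshold: float) -> bool:
    """A prediction counts as a true positive when IoU reaches the threshold."""
    return iou(pred, gt) >= threshold


def size_bucket(box: Box) -> str:
    """Assumed COCO-style size groups behind AP(s), AP(m), AP(l)."""
    a = area(box)
    if a < 32 ** 2:
        return "small"
    if a < 96 ** 2:
        return "medium"
    return "large"


# A 20x20 ground-truth box and a prediction shifted 4 pixels to the right.
gt = (100.0, 100.0, 120.0, 120.0)
pred = (104.0, 100.0, 124.0, 120.0)

print(f"IoU = {iou(pred, gt):.3f}")
print("match at 0.50:", is_match(pred, gt, 0.50))
print("match at 0.75:", is_match(pred, gt, 0.75))
print("size group:", size_bucket(gt))
```

A 4-pixel shift on a 20-pixel object already drops the IoU to about 0.67. This suggests why small objects, which dominate UAV imagery, are especially sensitive to the stricter AP75 criterion.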

     

  • Figure 1.  Architecture of the proposed algorithm

    Figure 2.  Multivariate fusion spatial pyramid pooling-fast module structure

    Figure 3.  Multi-branch semantic enhancement module structure

    Figure 4.  Detailed feature complement module structure

    Figure 5.  Confusion matrices of the baseline algorithm and the proposed algorithm on the VisDrone validation set

    Figure 6.  Visualization results of the proposed algorithm
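Figure 2 names a spatial pyramid pooling-fast module. As background only, the sketch below illustrates the pooling idea of the standard SPPF block used in YOLOv5 and YOLOv8: three chained 5×5 stride-1 max pools reproduce 5×5, 9×9, and 13×13 receptive fields. It is not the MFSPPF module, whose internals this page does not describe. The convolutions around the pools are omitted, and the function names and the toy feature map are assumptions for illustration.

```python
from typing import List

Grid = List[List[float]]


def max_pool_same(grid: Grid, k: int) -> Grid:
    """Stride-1 max pooling that keeps the input size.

    Clipping the window at the border is equivalent to padding with -inf.
    """
    pad = k // 2
    h, w = len(grid), len(grid[0])
    out: Grid = []
    for i in range(h):
        row = []
        for j in range(w):
            window = [
                grid[r][c]
                for r in range(max(0, i - pad), min(h, i + pad + 1))
                for c in range(max(0, j - pad), min(w, j + pad + 1))
            ]
            row.append(max(window))
        out.append(row)
    return out


def sppf_branches(grid: Grid, k: int = 5) -> List[Grid]:
    """Return the four maps a standard SPPF block concatenates channel-wise."""
    y1 = max_pool_same(grid, k)
    y2 = max_pool_same(y1, k)
    y3 = max_pool_same(y2, k)
    return [grid, y1, y2, y3]


# Deterministic toy feature map (single channel, 12x12).
feature_map = [[float((7 * i + 13 * j) % 17) for j in range(12)] for i in range(12)]
x, y1, y2, y3 = sppf_branches(feature_map)

# Chained 5x5 pools match single larger pools, at lower cost.
print(y2 == max_pool_same(feature_map, 9))
print(y3 == max_pool_same(feature_map, 13))
```

Reusing the previous pooled map is what makes the block "fast": each stage touches a 5×5 window instead of a 9×9 or 13×13 one.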

    Table 1.  Results of ablation experiments on the VisDrone validation set

    Algorithm (component columns: Baseline, MFSPPF, DFC, MBSE)
    AP/%   AP50/%   AP75/%   AP(s)/%   AP(m)/%   AP(l)/%   Parameters
    33.7   53.5     35.7     25.5      46.0      42.5      25.85×10^6
    37.1   58.5     39.0     28.3      49.2      47.3      26.34×10^6
    36.7   58.1     38.7     28.0      49.0      48.0      26.81×10^6
    37.0   58.4     39.1     27.9      49.6      52.3      32.71×10^6
    37.4   58.9     39.6     28.6      50.0      50.4      35.47×10^6
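As a consistency check, the gains quoted in the abstract can be recomputed from Table 1. The sketch assumes that the first data row is the YOLOv8m baseline and the last row is the full algorithm with all three modules; the table as shown here does not mark which components each row enables. The variable and function names are illustrative.

```python
METRICS = ("AP", "AP50", "AP75", "AP(s)", "AP(m)", "AP(l)")

# Assumption: first row = baseline, last row = baseline + MFSPPF + DFC + MBSE.
first_row = (33.7, 53.5, 35.7, 25.5, 46.0, 42.5)
last_row = (37.4, 58.9, 39.6, 28.6, 50.0, 50.4)
first_params_m = 25.85  # parameters, in millions
last_params_m = 35.47


def point_gains(base, improved):
    """Absolute gains in percentage points, rounded to one decimal."""
    return tuple(round(b - a, 1) for a, b in zip(base, improved))


gains = point_gains(first_row, last_row)
param_growth_pct = round((last_params_m / first_params_m - 1) * 100, 1)

for name, gain in zip(METRICS, gains):
    print(f"{name}: +{gain} points")
print(f"parameters: +{param_growth_pct}%")
```

The differences come out as 3.7, 5.4, 3.9, 3.1, 4.0, and 7.9 points, matching the abstract, at the cost of roughly 37% more parameters.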

    Table 2.  Detection performance of other YOLOv8 models with the proposed improvements

    Algorithm                  AP/%   AP50/%   AP75/%   AP(s)/%   AP(m)/%   AP(l)/%   Parameters
    YOLOv8n[5]                 28.6   46.8     29.5     19.7      40.1      45.1      3.01×10^6
    YOLOv8n+MFSPPF+DFC+MBSE    31.1   50.1     32.4     22.7      42.6      46.9      4.40×10^6
    YOLOv8s[5]                 34.0   54.5     35.3     24.9      46.0      48.1      11.14×10^6
    YOLOv8s+MFSPPF+DFC+MBSE    35.2   56.2     36.4     26.3      47.4      50.4      16.63×10^6

    Table 3.  Comparison results on the VisDrone test set

    Algorithm            Backbone     AP/%   AP50/%   AP75/%   AP(s)/%   AP(m)/%   AP(l)/%   Inference time/ms
    RetinaNet[6]         ResNet-50    8.0    15.5     7.6      2.1       13.2      23.7      22.7
    Faster R-CNN[12]     ResNet-50    12.8   23.9     12.6     5.2       21.1      29.7      33.4
    YOLOv5m[30]          CSPDarkNet   21.3   37.7     21.7     13.2      31.2      33.4      21.1
    YOLOXm[10]           CSPDarkNet   19.7   36.5     19.1     13.0      28.0      23.9      28.1
    TOOD[31]             ResNet-50    22.9   38.2     23.8     13.9      34.3      43.1      43.8
    VFNet[32]            ResNet-50    17.6   30.3     17.9     10.0      26.7      36.4      43.4
    YOLOv8m[5]           CSPDarkNet   25.8   42.4     27.1     16.8      37.4      39.3      18.4
    Gold-YOLOm[33]       CSPDarkNet   25.9   43.9     26.5     15.6      37.1      48.7      28.2
    Proposed algorithm   CSPDarkNet   28.7   47.4     29.8     18.4      41.0      49.0      28.6
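The speed and accuracy trade-off in Table 3 can be read off with a short sketch that converts inference time to frames per second and lists the methods that no other method beats on both AP and latency. It assumes the reported time is per image, which the table does not state; the dictionary keys and function names are illustrative.

```python
# (AP in %, inference time in ms) copied from Table 3.
RESULTS = {
    "RetinaNet": (8.0, 22.7),
    "Faster R-CNN": (12.8, 33.4),
    "YOLOv5m": (21.3, 21.1),
    "YOLOXm": (19.7, 28.1),
    "TOOD": (22.9, 43.8),
    "VFNet": (17.6, 43.4),
    "YOLOv8m": (25.8, 18.4),
    "Gold-YOLOm": (25.9, 28.2),
    "Proposed algorithm": (28.7, 28.6),
}


def fps(ms_per_image: float) -> float:
    """Frames per second, assuming the time is measured per image."""
    return 1000.0 / ms_per_image


def pareto_front(results):
    """Methods not dominated by another with AP >= and time <= (one strict)."""
    front = []
    for name, (ap, ms) in results.items():
        dominated = any(
            o_ap >= ap and o_ms <= ms and (o_ap > ap or o_ms < ms)
            for other, (o_ap, o_ms) in results.items()
            if other != name
        )
        if not dominated:
            front.append(name)
    # Order from fastest to slowest.
    return sorted(front, key=lambda n: results[n][1])


for name in pareto_front(RESULTS):
    ap, ms = RESULTS[name]
    print(f"{name}: AP {ap}%, {ms} ms, about {fps(ms):.1f} FPS")
```

Under these assumptions the proposed algorithm runs at about 35 FPS against about 54 FPS for YOLOv8m. The extra 10.2 ms per image buys 2.9 AP points.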
  • [1] JIANG B, QU R K, LI Y D, et al. Object detection in UAV imagery based on deep learning: review[J]. Acta Aeronautica et Astronautica Sinica, 2021, 42(4): 524519 (in Chinese).
    [2] LIN T Y, MAIRE M, BELONGIE S, et al. Microsoft COCO: common objects in context[C]//Proceedings of the Computer Vision-ECCV. Berlin: Springer, 2014: 740-755.
    [3] EVERINGHAM M, VAN GOOL L, WILLIAMS C K I, et al. The pascal visual object classes (VOC) challenge[J]. International Journal of Computer Vision, 2010, 88(2): 303-338.
    [4] ZHU P, WEN L, DU D, et al. Detection and tracking meet drones challenge[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2022, 44(11): 7380-7399.
    [5] JOCHER G. YOLOv8 by ultralytics[EB/OL]. (2023-09-27)[2024-02-10]. https://github.com/ultralytics/ultralytics.
    [6] LIN T Y, GOYAL P, GIRSHICK R, et al. Focal loss for dense object detection[C]//Proceedings of the IEEE International Conference on Computer Vision. Piscataway: IEEE Press, 2017: 2999-3007.
    [7] LIU W, ANGUELOV D, ERHAN D, et al. SSD: single shot multiBox detector[C]//Proceedings of the Computer Vision-ECCV. Berlin: Springer, 2016: 21-37.
    [8] TIAN Z, SHEN C H, CHEN H, et al. FCOS: a simple and strong anchor-free object detector[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2022, 44(4): 1922-1933.
    [9] REDMON J, DIVVALA S, GIRSHICK R, et al. You only look once: unified, real-time object detection[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2016: 779-788.
    [10] GE Z, LIU S T, WANG F, et al. Yolox: exceeding yolo series in 2021[EB/OL]. (2021-08-06)[2024-03-29]. https://arxiv.org/abs/2107.08430.
    [11] WANG C Y, BOCHKOVSKIY A, LIAO H M. YOLOv7: trainable bag-of-freebies sets new state-of-the-art for real-time object detectors[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2023: 7464-7475.
    [12] REN S Q, HE K M, GIRSHICK R, et al. Faster R-CNN: towards real-time object detection with region proposal networks[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 39(6): 1137-1149.
    [13] HE K M, GKIOXARI G, DOLLÁR P, et al. Mask R-CNN[C]//Proceedings of the IEEE International Conference on Computer Vision. Piscataway: IEEE Press, 2017: 2980-2988.
    [14] CAI Z W, VASCONCELOS N. Cascade R-CNN: delving into high quality object detection[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2018: 6154-6162.
    [15] WU Y, CHEN Y P, YUAN L, et al. Rethinking classification and localization for object detection[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2020: 10183-10192.
    [16] CARION N, MASSA F, SYNNAEVE G, et al. End-to-end object detection with transformers[C]//Proceedings of the Computer Vision-ECCV. Berlin: Springer, 2020: 213-229.
    [17] ZHU X Z, SU W J, LU L W, et al. Deformable detr: deformable transformers for end-to-end object detection[EB/OL]. (2021-03-18)[2024-03-29]. https://arxiv.org/abs/2010.04159.
    [18] LI F, ZHANG H, LIU S L, et al. DN-DETR: accelerate DETR training by introducing query DeNoising[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2022: 13609-13617.
    [19] ZHAO Y, LV W Y, XU S L, et al. Detrs beat yolos on real-time object detection[EB/OL]. (2023-04-17)[2024-03-29]. https://arxiv.org/abs/2304.08069.
    [20] KIEFER B, OTT D, ZELL A. Leveraging synthetic data in object detection on unmanned aerial vehicles[C]//Proceedings of the 26th International Conference on Pattern Recognition. Piscataway: IEEE Press, 2022: 3564-3571.
    [21] CHEN Y T, LI J, NIU Y F, et al. Small object detection networks based on classification-oriented super-resolution GAN for UAV aerial imagery[C]//Proceedings of the Chinese Control and Decision Conference. Piscataway: IEEE Press, 2019: 4610-4615.
    [22] WANG D W, HU L C, FANG J, et al. Small target detection algorithm based on improved Double-Head RCNN for UAV aerial images[J]. Journal of Beijing University of Aeronautics and Astronautics, 2024, 50(7): 2141-2149 (in Chinese).
    [23] MAO G T, DENG T M, YU N J. Object detection in UAV images based on multi-scale split attention[J]. Acta Aeronautica et Astronautica Sinica, 2023, 44(5): 268-278 (in Chinese).
    [24] LI C L, YANG T, ZHU S J, et al. Density map guided object detection in aerial images[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops. Piscataway: IEEE Press, 2020: 737-746.
    [25] LENG J X, MO M, ZHOU Y H, et al. Pareto refocusing for drone-view object detection[J]. IEEE Transactions on Circuits and Systems for Video Technology, 2023, 33(3): 1320-1334.
    [26] DENG S T, LI S, XIE K, et al. A global-local self-adaptive network for drone-view object detection[J]. IEEE Transactions on Image Processing, 2020, 30: 1556-1569.
    [27] CHEN N Y, LI Y, YANG Z M, et al. LODNU: lightweight object detection network in UAV vision[J]. The Journal of Supercomputing, 2023, 79(9): 10117-10138.
    [28] YU G H, CHANG Q Y, LV W Y, et al. PP-PicoDet: a better real-time object detector on mobile devices[EB/OL]. (2021-11-01)[2024-03-30]. https://arxiv.org/abs/2111.00902.
    [29] HU J, SHEN L, SUN G. Squeeze-and-excitation networks[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2018: 7132-7141.
    [30] JOCHER G. YOLOv5 by ultralytics (version 7.0) [EB/OL]. (2022-11-22)[2024-02-10]. https://doi.org/10.5281/zenodo.3908559.
    [31] FENG C J, ZHONG Y J, GAO Y, et al. TOOD: task-aligned one-stage object detection[C]//Proceedings of the IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE Press, 2021: 3490-3499.
    [32] ZHANG H Y, WANG Y, DAYOUB F, et al. VarifocalNet: an IoU-aware dense object detector[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2021: 8510-8519.
    [33] WANG C C, HE W, NIE Y, et al. Gold-YOLO: efficient object detector via gather-and-distribute mechanism[EB/OL]. (2023-09-20)[2024-03-30]. https://arxiv.org/abs/2309.11331.
Publication history
  • Received:  2024-04-01
  • Accepted:  2024-05-20
  • Published online:  2024-09-05
  • Issue published:  2026-05-31
