Small target detection algorithm based on improved Double-Head RCNN for UAV aerial images

WANG Dianwei (王殿伟), HU Lichen (胡里晨), FANG Jie (房杰), XU Zhijie (许志杰)

Citation: WANG D W, HU L C, FANG J, et al. Small target detection algorithm based on improved Double-Head RCNN for UAV aerial images[J]. Journal of Beijing University of Aeronautics and Astronautics, 2024, 50(7): 2141-2149 (in Chinese). doi: 10.13700/j.bh.1001-5965.2022.0591

doi: 10.13700/j.bh.1001-5965.2022.0591
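Funds: National Natural Science Foundation of China (62201454); Postgraduate Innovation Foundation of Xi'an University of Posts and Telecommunications (CXJJLY2021058)
    Corresponding author:

    E-mail: wangdianwei@xupt.edu.cn

  • CLC number: V279; TP391.41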

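  • Abstract:

    Small targets in UAV aerial images carry little feature information and are easily disturbed by noise, so existing algorithms suffer from high missed-detection and false-detection rates. To address this, a small target detection algorithm for UAV aerial images based on an improved Double-Head region-based convolutional neural network (RCNN) is proposed. First, Transformer and deformable convolution (DCN) modules are introduced into the ResNet-50 backbone to extract the feature and semantic information of small targets more effectively. Second, a feature pyramid network (FPN) module based on content-aware reassembly of features (CARAFE) is proposed, which addresses the loss of small-target feature information caused by background noise during feature fusion. Third, the anchor generation scales in the region proposal network are reset according to the scale distribution of small targets, which further improves small-target detection performance. Experimental results on the VisDrone-DET2021 dataset show that the proposed algorithm extracts more representative feature and semantic information for small targets. Compared with Double-Head RCNN, the proposed algorithm adds 9.73×10^6 parameters and loses 0.6 FPS, but improves AP, AP50, and AP75 by 2.6, 6.2, and 2.1 percentage points, respectively, and APs by 3.1 percentage points.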
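The anchor-scale adjustment described in the abstract can be illustrated with a minimal Python sketch. It assumes the common FPN convention in which a pyramid level with stride s produces anchors whose side length is s × scale. The strides, the default scale of 8, and the smaller scales [2, 4] are illustrative assumptions, not values reported by the authors, and `generate_base_anchors` and `anchor_sides` are hypothetical helper names.

```python
import math

def generate_base_anchors(stride, scales, ratios):
    """Return (x1, y1, x2, y2) anchors centred on the origin for one pyramid level.

    Each anchor has area (stride * scale) ** 2 and height/width equal to `ratio`.
    """
    anchors = []
    for ratio in ratios:
        for scale in scales:
            side = stride * scale
            w = side / math.sqrt(ratio)
            h = side * math.sqrt(ratio)
            anchors.append((-w / 2, -h / 2, w / 2, h / 2))
    return anchors

def anchor_sides(strides, scales):
    """Sorted set of square-anchor side lengths covered by all pyramid levels."""
    return sorted({stride * scale for stride in strides for scale in scales})

# Assumed settings for illustration only (not taken from the paper).
STRIDES = [4, 8, 16, 32, 64]      # typical FPN level strides
DEFAULT_SCALES = [8]              # a common default for FPN-based detectors
SMALL_TARGET_SCALES = [2, 4]      # hypothetical smaller scales for small targets
RATIOS = [0.5, 1.0, 2.0]

default_sides = anchor_sides(STRIDES, DEFAULT_SCALES)
small_sides = anchor_sides(STRIDES, SMALL_TARGET_SCALES)

print("default anchor sides:", default_sides)
print("small-target anchor sides:", small_sides)
print("anchors at stride 4:", generate_base_anchors(4, SMALL_TARGET_SCALES, RATIOS))
```

Under these assumed settings the smallest anchor shrinks from 32 to 8 pixels per side, which is the kind of shift that lets the region proposal network match objects only a few pixels wide.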

     

  • Figure 1.  Framework of the proposed algorithm

    Figure 2.  Structure of residual network

    Figure 3.  Structure of CARAFE[21]

    Figure 4.  Target size distribution in VisDrone-DET2021[23] dataset

    Figure 5.  Loss function curve

    Figure 6.  Detection results of different algorithms

    Figure 7.  Comparison of feature maps

    Table 1.   Comparison between the proposed algorithm and advanced algorithms

    | Algorithm | Backbone | Input resolution/pixels | Epochs | AP/% | AP50/% | AP75/% | APs/% | APm/% | APl/% | FPS |
    | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
    | RetinaNet+PVT v2[4] | PVTv2-B1 | 1333×800 | 20 | 20.6 | 34.1 | 21.4 | 10.4 | 34.5 | 48.9 | 10.9 |
    | Deformable DETR[6] | ResNet-50 | 1333×800 | 50 | 18.0 | 32.1 | 17.5 | 9.7 | 27.8 | 44.9 | 9.2 |
    | RetinaNet[7] | ResNet-50 | 1333×800 | 20 | 18.5 | 30.1 | 19.3 | 8.2 | 31.7 | 48.0 | 16.6 |
    | YOLOX-S[8] | CSPDarkNet | 640×640 | 300 | 19.9 | 34.6 | 19.6 | 10.8 | 30.9 | 42.6 | 53.1 |
    | Cascade R-CNN[12] | ResNet-50 | 1333×800 | 20 | 24.5 | 39.3 | 25.9 | 15.4 | 36.9 | 45.2 | 9.0 |
    | Grid R-CNN[13] | ResNet-50 | 1333×800 | 20 | 25.1 | 39.3 | 26.9 | 15.8 | 37.8 | 47.7 | 10.4 |
    | FPN[15] | ResNet-50 | 1333×800 | 20 | 22.9 | 36.8 | 23.9 | 13.6 | 34.7 | 53.8 | 14.5 |
    | VFNet[24] | ResNet-50 | 1333×800 | 20 | 22.5 | 37.4 | 23.5 | 13.1 | 34.5 | 45.4 | 13.7 |
    | Double-Head RCNN[18] | ResNet-50 | 1333×800 | 20 | 23.8 | 38.3 | 24.8 | 15.0 | 35.1 | 44.8 | 6.5 |
    | Proposed algorithm | R50-Attention | 1333×800 | 20 | 26.4 | 44.5 | 26.9 | 18.1 | 36.1 | 48.7 | 5.9 |

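    Table 2.   Ablation experimental results

    Baseline: Double-Head RCNN[18]

    | R50-Attention | CARAFE-FPN | Anchor generation strategy | AP/% | AP50/% | AP75/% | APs/% | APm/% | APl/% | Parameters | FPS |
    | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
    |  |  |  | 23.8 | 38.3 | 24.8 | 15.0 | 35.1 | 44.8 | 46.76×10^6 | 6.5 |
    |  |  |  | 24.9 | 41.6 | 25.5 | 17.2 | 34.3 | 44.4 | 46.76×10^6 | 6.9 |
    |  |  |  | 25.0 | 40.9 | 26.2 | 15.5 | 37.2 | 48.4 | 50.88×10^6 | 5.9 |
    |  |  |  | 24.6 | 39.6 | 26.0 | 15.6 | 36.5 | 47.6 | 52.36×10^6 | 6.4 |
    |  |  |  | 25.6 | 42.7 | 26.2 | 17.6 | 35.4 | 46.0 | 52.36×10^6 | 6.6 |
    |  |  |  | 26.4 | 44.5 | 26.9 | 18.1 | 36.1 | 48.7 | 56.49×10^6 | 5.9 |

    Note: √ indicates that a method is added.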
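The gains quoted in the abstract can be reproduced from the first and last rows of Table 2 (the Double-Head RCNN baseline and the full proposed model). The short Python sketch below performs only that subtraction; the dictionary keys and the `deltas` helper are names chosen here for illustration, and the sixth metric is assumed to be AP for large targets.

```python
# Metric values copied from the first and last rows of Table 2.
COLUMNS = ("AP", "AP50", "AP75", "APs", "APm", "APl", "params_1e6", "FPS")
baseline = dict(zip(COLUMNS, (23.8, 38.3, 24.8, 15.0, 35.1, 44.8, 46.76, 6.5)))
proposed = dict(zip(COLUMNS, (26.4, 44.5, 26.9, 18.1, 36.1, 48.7, 56.49, 5.9)))

def deltas(base, new):
    """Per-metric change from `base` to `new`, rounded to two decimals."""
    return {key: round(new[key] - base[key], 2) for key in base}

change = deltas(baseline, proposed)

for key in COLUMNS:
    print(f"{key:>10}: {baseline[key]:>6} -> {proposed[key]:>6}  ({change[key]:+})")
```

The differences are absolute: the AP columns change in percentage points, the parameter count in units of 10^6, and the speed in frames per second.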
  • [1] LIN T Y, MAIRE M, BELONGIE S, et al. Microsoft COCO: Common objects in context[C]//Proceedings of the European Conference on Computer Vision. Berlin: Springer, 2014: 740-755.
    [2] WANG J W, YANG W, GUO H W, et al. Tiny object detection in aerial images[C]//Proceedings of the International Conference on Pattern Recognition. Piscataway: IEEE Press, 2021: 3791-3798.
    [3] CHEN Y X, DING W R, LI H G, et al. Vehicle detection in UAV image based on video interframe motion estimation[J]. Journal of Beijing University of Aeronautics and Astronautics, 2020, 46(3): 634-642 (in Chinese).
    [4] WANG W H, XIE E Z, LI X, et al. PVT v2: Improved baselines with pyramid vision transformer[J]. Computational Visual Media, 2022, 8(3): 415-424. doi: 10.1007/s41095-022-0274-8
    [5] CARION N, MASSA F, SYNNAEVE G, et al. End-to-end object detection with transformers[C]//Proceedings of the European Conference on Computer Vision. Berlin: Springer, 2020: 213-229.
    [6] ZHU X Z, SU W J, LU L W, et al. Deformable DETR: Deformable transformers for end-to-end object detection[EB/OL]. (2021-03-18)[2022-05-18]. http://arxiv.org/abs/2010.04159.
    [7] LIN T Y, GOYAL P, GIRSHICK R, et al. Focal loss for dense object detection[C]//Proceedings of the IEEE International Conference on Computer Vision. Piscataway: IEEE Press, 2017: 2999-3007.
    [8] GE Z, LIU S T, WANG F, et al. YOLOX: Exceeding YOLO series in 2021[EB/OL]. (2021-08-06)[2022-05-20]. http://arxiv.org/abs/2107.08430.
    [9] WANG C Y, BOCHKOVSKIY A, LIAO H Y M. YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors[EB/OL]. (2022-07-01)[2022-07-03]. http://arxiv.org/abs/2207.02696.
    [10] REN S Q, HE K M, GIRSHICK R, et al. Faster R-CNN: Towards real-time object detection with region proposal networks[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 39(6): 1137-1149. doi: 10.1109/TPAMI.2016.2577031
    [11] PANG J M, CHEN K, SHI J P, et al. Libra R-CNN: Towards balanced learning for object detection[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2019: 821-830.
    [12] CAI Z W, VASCONCELOS N. Cascade R-CNN: Delving into high quality object detection[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2018: 6154-6162.
    [13] LU X, LI B Y, YUE Y X, et al. Grid R-CNN[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2019: 7355-7364.
    [14] GIRSHICK R. Fast R-CNN[C]//Proceedings of the IEEE International Conference on Computer Vision. Piscataway: IEEE Press, 2015: 1440-1448.
    [15] LIN T Y, DOLLÁR P, GIRSHICK R, et al. Feature pyramid networks for object detection[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2017: 936-944.
    [16] LIU S, QI L, QIN H F, et al. Path aggregation network for instance segmentation[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2018: 8759-8768.
    [17] TAN M X, PANG R M, LE Q V. EfficientDet: Scalable and efficient object detection[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2020: 10778-10787.
    [18] WU Y, CHEN Y P, YUAN L, et al. Rethinking classification and localization for object detection[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2020: 10183-10192.
    [19] DAI J F, QI H Z, XIONG Y W, et al. Deformable convolutional networks[C]//Proceedings of the IEEE International Conference on Computer Vision. Piscataway: IEEE Press, 2017: 764-773.
    [20] VASWANI A, SHAZEER N, PARMAR N, et al. Attention is all you need[C]//Proceedings of the International Conference on Neural Information Processing Systems. New York: ACM, 2017: 6000-6010.
    [21] WANG J Q, CHEN K, XU R, et al. CARAFE: Content-aware ReAssembly of FEatures[C]//Proceedings of the IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE Press, 2019: 3007-3016.
    [22] ZHU X Z, CHENG D Z, ZHANG Z, et al. An empirical study of spatial attention mechanisms in deep networks[C]//Proceedings of the IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE Press, 2019: 6687-6696.
    [23] CAO Y R, HE Z J, WANG L J, et al. VisDrone-DET2021: The vision meets drone object detection challenge results[C]//Proceedings of the IEEE/CVF International Conference on Computer Vision Workshops. Piscataway: IEEE Press, 2021: 2847-2854.
    [24] ZHANG H Y, WANG Y, DAYOUB F, et al. VarifocalNet: An IoU-aware dense object detector[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2021: 8510-8519.
Publication history
  • Received:  2022-07-05
  • Accepted:  2022-11-01
  • Published online:  2023-01-10
  • Issue published:  2024-07-18
