Deep learning based UAV vision object detection and tracking

PU Liang, ZHANG Xuejun

Citation: PU Liang, ZHANG Xuejun. Deep learning based UAV vision object detection and tracking[J]. Journal of Beijing University of Aeronautics and Astronautics, 2022, 48(5): 872-880. doi: 10.13700/j.bh.1001-5965.2020.0664 (in Chinese)

doi: 10.13700/j.bh.1001-5965.2020.0664
    Corresponding author: ZHANG Xuejun, E-mail: zhxj@buaa.edu.cn

  • CLC number: V279+.2; TP391

  • Abstract:

    To address the high miss and false detection rates for small objects in object detection, an improved model based on the Yolov3-Tiny algorithm is proposed. The k-means clustering method is improved, 3×3 and 1×1 convolution and pooling layers are added, and the output of the 9th convolutional layer is upsampled and concatenated with the feature map from the 8th convolutional layer to obtain a new 52×52 output layer, forming a new feature pyramid. Object tracking is implemented with a Kalman filter, and a detection network fused with the tracking algorithm is proposed: the Hungarian matching algorithm optimally matches detection bounding boxes with tracking bounding boxes, and the tracking results are used to correct the detection results, which improves detection speed while also enhancing detection capability. Comparative experiments on the proposed algorithms were carried out in an integrated simulation environment composed of ROS, Gazebo, and the PX4 autopilot software. The results show that the improved algorithm's average detection speed decreases by 15.6% while its mAP increases by 6.5%; after fusing the tracking algorithm, the network's average detection speed increases by 34.2% and its mAP increases by 8.6%. The network fused with the tracking algorithm meets the system's real-time and accuracy requirements.
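
    As an illustration of the fusion step described above, the sketch below matches detector bounding boxes to Kalman-filter track boxes with the Hungarian algorithm on an IoU cost matrix. It is a minimal sketch, not the authors' implementation: the [x1, y1, x2, y2] box format, the 0.3 IoU acceptance threshold, and the function names are assumptions, and SciPy's linear_sum_assignment is used for the Hungarian step.

    # Illustrative sketch: Hungarian matching of detection boxes to Kalman-filter
    # track boxes on an IoU cost matrix (box format [x1, y1, x2, y2] assumed).
    import numpy as np
    from scipy.optimize import linear_sum_assignment

    def iou(a, b):
        """Intersection-over-union of two axis-aligned boxes."""
        x1, y1 = max(a[0], b[0]), max(a[1], b[1])
        x2, y2 = min(a[2], b[2]), min(a[3], b[3])
        inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
        area_a = (a[2] - a[0]) * (a[3] - a[1])
        area_b = (b[2] - b[0]) * (b[3] - b[1])
        return inter / (area_a + area_b - inter + 1e-9)

    def match_detections_to_tracks(detections, tracks, iou_threshold=0.3):
        """Return (matched (det, trk) index pairs, unmatched detections, unmatched tracks)."""
        if len(detections) == 0 or len(tracks) == 0:
            return [], list(range(len(detections))), list(range(len(tracks)))
        # Cost of assigning detection i to track j is 1 - IoU; the Hungarian
        # algorithm then finds the globally optimal one-to-one assignment.
        cost = np.array([[1.0 - iou(d, t) for t in tracks] for d in detections])
        det_idx, trk_idx = linear_sum_assignment(cost)
        matches, unmatched_det, unmatched_trk = [], [], []
        for d, t in zip(det_idx, trk_idx):
            if 1.0 - cost[d, t] >= iou_threshold:
                matches.append((int(d), int(t)))
            else:
                unmatched_det.append(int(d))
                unmatched_trk.append(int(t))
        unmatched_det += [d for d in range(len(detections)) if d not in det_idx]
        unmatched_trk += [t for t in range(len(tracks)) if t not in trk_idx]
        return matches, unmatched_det, unmatched_trk

    Matched pairs can then be used to correct detections with the corresponding track state, while unmatched tracks can stand in for missed detections between detector runs, which is one way such a fusion can raise both speed and accuracy.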

     

  • Figure 1.  UAV simulation processing platform flowchart

    Figure 2.  Simulation interface

    Figure 3.  Yolov3-Tiny network structure

    Figure 4.  Yolov3-Tiny detection results

    Figure 5.  Receptive field size versus number of convolutional layers

    Figure 6.  Improved Yolov3-Tiny network structure

    Figure 7.  Variation of average overlap with the number of clusters

    Figure 8.  Detection results of the improved k-means clustering

    Figure 9.  Kalman filter algorithm workflow

    Figure 10.  Schematic diagram of pedestrian detection and tracking bounding boxes

    Figure 11.  Detection and tracking fusion process

    Figure 12.  Comparison of AP for each category

    Figure 13.  Comparison of mAP across different models

    Figure 14.  Comparison of average detection speed across different models

    Figure 15.  Comparison of results

    Table 1.  Yolov3-Tiny backbone network and receptive fields

    Conv layer    Kernel size    Stride    Input size    Output size    Receptive field
    1             3              1         416           416            3
    2             3              1         208           208            8
    3             3              1         104           104            18
    4             3              1         52            52             38
    5             3              1         26            26             78
    6             3              1         13            13             158
    7             3              1         13            13             254
    8             3              1         13            13             318
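
    The receptive-field column of Table 1 follows the standard recursion RF_l = RF_{l-1} + (k_l - 1)·j_{l-1}, j_l = j_{l-1}·s_l, where k_l is the kernel size, s_l the stride, and j_l the accumulated stride. The sketch below is a check under an assumption, not the paper's code: it assumes the listed 3×3/stride-1 convolutions are interleaved with the 2×2 max-pooling layers of the standard Yolov3-Tiny backbone (all stride 2 except the last, which uses stride 1), which do not appear as rows in the table.

    # Sketch: reproduce Table 1's receptive-field column, assuming the standard
    # Yolov3-Tiny backbone layout (2x2 max-pools between the listed convolutions,
    # last pool with stride 1; the pools are not listed as table rows).
    layers = [
        ("conv1", 3, 1), ("pool", 2, 2),
        ("conv2", 3, 1), ("pool", 2, 2),
        ("conv3", 3, 1), ("pool", 2, 2),
        ("conv4", 3, 1), ("pool", 2, 2),
        ("conv5", 3, 1), ("pool", 2, 2),
        ("conv6", 3, 1), ("pool", 2, 1),
        ("conv7", 3, 1),
        ("conv8", 3, 1),
    ]

    rf, jump = 1, 1  # receptive field and accumulated stride at the input
    for name, k, s in layers:
        rf += (k - 1) * jump   # RF_l = RF_{l-1} + (k_l - 1) * j_{l-1}
        jump *= s              # j_l  = j_{l-1} * s_l
        if name.startswith("conv"):
            print(name, rf)    # prints 3, 8, 18, 38, 78, 158, 254, 318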

    Table 2.  Experimental environment

    Parameter                   Configuration
    CPU                         Intel i7-8700
    GPU                         NVIDIA GTX 1070
    Operating system            Ubuntu 16.04
    Acceleration environment    CUDA 9.0, cuDNN 7.0
    Training framework          Darknet

    Table 3.  Experimental comparison results

    Algorithm                          AP50/%                              mAP50/%    Average detection speed/(frame·s⁻¹)
                                       Car     Fire hydrant    Road sign
    Yolov3-Tiny                        72.23   75.48   69.49   66.67       71.23      45
    Improved Yolov3-Tiny               77.12   80.45   72.18   70.23       75.83      38
    Yolov3-Tiny fused with tracking    82.33   87.14   77.47   75.28       82.34      51
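
    The relative changes quoted in the abstract appear to follow directly from Table 3, comparing the improved model with the baseline and the fused model with the improved one:

    (45 − 38) / 45          ≈ 15.6%   drop in average detection speed, improved vs. Yolov3-Tiny
    (75.83 − 71.23) / 71.23 ≈ 6.5%    mAP gain, improved vs. Yolov3-Tiny
    (51 − 38) / 38          ≈ 34.2%   speed gain, fused vs. improved
    (82.34 − 75.83) / 75.83 ≈ 8.6%    mAP gain, fused vs. improved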
Publication history
  • Received:  2020-11-27
  • Accepted:  2021-01-31
  • Published online:  2022-05-20
