留言板

尊敬的读者、作者、审稿人, 关于本刊的投稿、审稿、编辑和出版的任何问题, 您可以本页添加留言。我们将尽快给您答复。谢谢您的支持!

姓名
邮箱
手机号码
标题
留言内容
验证码

基于双注意力混洗的无人机航拍目标跟踪算法

金国栋 薛远亮 谭力宁 许剑锟

金国栋,薛远亮,谭力宁,等. 基于双注意力混洗的无人机航拍目标跟踪算法[J]. 北京航空航天大学学报,2023,49(1):53-65 doi: 10.13700/j.bh.1001-5965.2021.0177
引用本文: 金国栋,薛远亮,谭力宁,等. 基于双注意力混洗的无人机航拍目标跟踪算法[J]. 北京航空航天大学学报,2023,49(1):53-65 doi: 10.13700/j.bh.1001-5965.2021.0177
JIN G D,XUE Y L,TAN L N,et al. Aerial object tracking algorithm for UAVs based on dual-attention shuffling[J]. Journal of Beijing University of Aeronautics and Astronautics,2023,49(1):53-65 (in Chinese) doi: 10.13700/j.bh.1001-5965.2021.0177
Citation: JIN G D,XUE Y L,TAN L N,et al. Aerial object tracking algorithm for UAVs based on dual-attention shuffling[J]. Journal of Beijing University of Aeronautics and Astronautics,2023,49(1):53-65 (in Chinese) doi: 10.13700/j.bh.1001-5965.2021.0177

基于双注意力混洗的无人机航拍目标跟踪算法

doi: 10.13700/j.bh.1001-5965.2021.0177
基金项目: 国家自然科学基金 (61673017,61403398)
详细信息
    作者简介:

    金国栋等:基于双注意力混洗的多尺度无人机航拍目标跟踪算法 13

    通讯作者:

    E-mail:641797825@qq.com

  • 中图分类号: V279;TP391

Aerial object tracking algorithm for UAVs based on dual-attention shuffling

Funds: National Natural Science Foundation of China (61673017,61403398)
More Information
  • 摘要:

    针对无人机(UAV)跟踪过程中目标经常出现尺寸小、尺度变化大和相似物干扰等问题,提出了一种基于双注意力混洗的多尺度无人机实时跟踪算法。考虑到无人机视角下目标像素点少,构建了双采样融合的深层网络,既提供了语义信息丰富的深度特征,又保留了目标的细节信息;设计了双注意力混洗模块,通道注意力和空间注意力同时分组筛选提取到的特征信息,混洗不同通道间的信息,加强信息交流,提高了算法辨别能力;为利用不同层的特征信息,加入多个区域建议网络完成目标的分类和回归,并针对无人机的目标特点,将结果进行加权融合。实验结果表明:所提算法在数据集上的成功率和准确率分别为60.3%和79.3%,速度为37.5 帧/s。所提算法的辨别能力和多尺度适应能力明显增强,能有效应对无人机跟踪中常见的挑战。

     

  • 图 1  残差模块

    Figure 1.  ResBlock

    图 2  本文算法网络结构

    Figure 2.  Structure of the proposed algorithm network

    图 3  投影映射的下采样

    Figure 3.  Down-sampling of projection mapping

    图 4  步长为2、大小为1的卷积过程

    Figure 4.  Convolution process with a step of 2 and size of 1

    图 5  双注意力混洗

    Figure 5.  Dual-attention shuffling

    图 6  通道混洗

    Figure 6.  Channel shuffle

    图 7  Grad-CAM热力图

    Figure 7.  Heatmap of Grad-CAM

    图 8  RPN模块

    Figure 8.  Region proposal network module

    图 9  算法实现步骤

    Figure 9.  Algorithm flowchart

    图 10  部分跟踪结果

    Figure 10.  Some tracking results

    图 11  不同算法在UAV123数据集上的整体评估结果

    Figure 11.  Overall evaluation results of different algorithms on UAV123 dataset

    不同算法在UAV123数据集上各属性的评估结果

    Evaluation results of different algorithms in terms of different attributes of UAV123 dataset

    图 13  无人机拍摄视频跟踪结果

    Figure 13.  Tracking results of UAV shooting video

  • [1] 孟琭, 杨旭. 目标跟踪算法综述[J]. 自动化学报, 2019, 45(7): 1244-1260. doi: 10.16383/j.aas.c180277

    MENG L, YANG X. A survey of object tracking algorithms[J]. Acta Automatica Sinica, 2019, 45(7): 1244-1260(in Chinese). doi: 10.16383/j.aas.c180277
    [2] BERTINETTO L, VALMADRE J, HENRIQUES J F, et al. Fully-convolutional siamese networks for object tracking[C]//Proceedings of the 2016 European Conference on Computer Vision(ECCV). Berlin: Springer, 2016, 9914: 850-865.
    [3] HUANG C, LUCEY S, RAMANAN D. Learning policies for adaptive tracking with deep feature cascades[C]//Proceedings of the 2017 IEEE International Conference on Computer Vision (ICCV). Piscataway: IEEE Press, 2017: 105-114.
    [4] LI B, YAN J, WU W, et al. High performance visual tracking with Siamese region proposal network[C]//Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2018: 8971-8980.
    [5] REN S, HE K, GIRSHICK R, et al. Faster R-CNN: Towards real-time object detection with region proposal networks[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 39(6): 1137-1149. doi: 10.1109/TPAMI.2016.2577031
    [6] LI B, WU W, WANG Q, et al. SiamRPN++: Evolution of Siamese visual tracking with very deep networks[C]//Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2019: 4277-4286.
    [7] FU C, CAO Z, LI Y, et al. Siamese anchor proposal network for high-speed aerial tracking[EB/OL]. (2021-03-26)[2021-04-05]. https://arxiv.org/abs/2012.10706.
    [8] 刘芳, 孙亚楠, 王洪娟, 等. 基于残差学习的自适应无人机目标跟踪算法[J]. 北京航空航天大学学报, 2020, 46(10): 1874-1882. doi: 10.13700/j.bh.1001-5965.2019.0551

    LIU F, SUN Y N, WANG H J, et al. Adaptive UAV target tracking algorithm based on residual learning[J]. Journal of Beijing University of Aeronautics and Astronautics, 2020, 46(10): 1874-1882(in Chinese). doi: 10.13700/j.bh.1001-5965.2019.0551
    [9] 刘芳, 杨安喆, 吴志威. 基于自适应Siamese网络的无人机目标跟踪算法[J]. 航空学报, 2020, 41(1): 243-255.

    LIU F, YANG A Z, WU Z W. Adaptive Siamese network based UAV target tracking algorithm[J]. Acta Aeronautica et Astronautica Sinica, 2020, 41(1): 243-255(in Chinese).
    [10] HE A, LUO C, TIAN X, et al. A twofold Siamese network for real-time object tracking[C]//Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2018: 4834-4843.
    [11] HU J, SHEN L, ALBANIE S, et al. Squeeze-and-excitation networks[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2020, 42(8): 2011-2023. doi: 10.1109/TPAMI.2019.2913372
    [12] 钟莎, 黄玉清. 基于孪生区域候选网络的无人机指定目标跟踪[J]. 计算机应用, 2021, 41(2): 523-529. doi: 10.11772/j.issn.1001-9081.2020060762

    ZHONG S, HUANG Y Q. Tracking of specified target of unmanned aerial vehicle based on Siamese region proposal network[J]. Journal of Computer Applications, 2021, 41(2): 523-529(in Chinese). doi: 10.11772/j.issn.1001-9081.2020060762
    [13] HE K M, ZHANG X, REN S, et al. Deep residual learning for image recognition[C]//Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2016: 770-778.
    [14] MA N, ZHANG X, ZHENG H T, et al. ShuffleNet V2: Practical guidelines for efficient CNN architecture design[C]//Proceedings of the 2018 European Conference on Computer Vision(ECCV). Berlin: Springer, 2018, 11218: 122-138.
    [15] ITTI L, KOCH C, NIEBUR E. A model of saliency-based visual attention for rapid scene analysis[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 1998, 20(11): 1254-1259. doi: 10.1109/34.730558
    [16] LAROCHELLE H, HINTON G E. Learning to combine foveal glimpses with a third-order Boltzmann machine[C]//Proceedings of the 23rd International Conference on Neural Information Processing Systems. New York: Curran Associates Inc, 2010, 1: 1243-1251.
    [17] ZHANG Q L, YANG Y B. SA-Net: Shuffle attention for deep convolutional neural networks[C]//Proceedings of the 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Piscataway: IEEE Press, 2021: 2235-2239.
    [18] MAX J, KAREN S, ANDREW Z, et al. Spatial transformer networks[C]//Proceedings of the 2015 Advances in Neural Information Processing Systems(NIPS). New York: Curran Associates Inc, 2015, 28: 2017-2025.
    [19] WANG Q, WU B, ZHU P, et al. ECA-Net: Efficient channel attention for deep convolutional neural networks[C]//Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2020: 11531-11539.
    [20] WOO S, PARK J, LEE J Y, et al. CBAM: Convolutional block attention module[C]//Proceedings of the 2018 European Conference on Computer Vision(ECCV). Berlin: Springer, 2018, 11211: 3-19.
    [21] FU J, LIU J, TIAN H, et al. Dual attention network for scene segmentation[C]//Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2019: 3141-3149.
    [22] NAM H, HAN B. Learning multi-domain convolutional neural networks for visual tracking[C]//Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2016: 4293-4302.
    [23] KAREN S, ANDREW Z. Very deep convolutional networks for large-scale image recognition[EB/OL]. (2015-04-10)[2021-04-05].https://arxiv.org/abs/1409.1556.
    [24] KRIZHEVSKY A, SUTSKEVER I, HINTON G E. ImageNet classification with deep convolutional neural networks[J]. Communications of the ACM, 2017, 60(6): 84-90. doi: 10.1145/3065386
    [25] WANG Q, TENG Z, XING J, et al. Learning attentions: Residual attentional Siamese network for high performance online visual tracking[C]//Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2018: 4854-4863.
    [26] WU Y, HE K. Group normalization[C]//Proceedings of the 2018 European Conference on Computer Vision(ECCV). Berlin: Springer, 2018: 3-19.
    [27] RAMPRASAATH R S, MICHAEL C, ABHISHEK D, et al. Grad-CAM: Visual explanations from deep networks via gradient-based localization[C]//Proceedings of the 2017 IEEE International Conference on Computer Vision(ICCV). Piscataway: IEEE Press, 2017: 618-626.
    [28] ROMAN P. An in-depth analysis of visual tracking with Siamese neural networks[EB/OL]. (2018-08-02)[2021-04-05]. https://arxiv. org/abs/1707.00569.
    [29] FAN H, LING H. Siamese cascaded region proposal networks for real-time visual tracking[C]//Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2019: 7944-7953.
    [30] ESTEBAN R, JONATHON S, STEFANO M, et al. YouTube-BoundingBoxes: A large high-precision human-annotated data set for object detection in video[C]//Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2017: 7464-7473.
    [31] OLGA R, JIA D, HAO S, et al. ImageNet large scale visual recognition challenge[J]. International Journal of Computer Vision, 2015, 115(3): 211-252. doi: 10.1007/s11263-015-0816-y
    [32] LIN T Y, MAIRE M, BELONGIE S, et al. Microsoft COCO: Common objects in context[C]//Proceedings of the 2014 European Conference on Computer Vision(ECCV). Berlin: Springer, 2014: 740-755.
    [33] MATTHIAS M, NEIL S, BERNARD G. A benchmark and simulator for UAV tracking[C]//Proceedings of the 2016 European Conference on Computer Vision(ECCV). Berlin: Springer, 2016: 445-461.
    [34] MARTIN D, GUSTAV H, FAHAD S K. Accurate scale estimation for robust visual tracking[C]//Proceedings of the 2014 British Machine Vision Conference(BMVC). Berlin: Springer, 2014: 1-11.
    [35] HARE S, GOLODETZ S, SAFFARI A, et al. Struck: Structured output tracking with kernels[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2016, 38(10): 2096-2109. doi: 10.1109/TPAMI.2015.2509974
    [36] ZHANG J, MA S, SCLAROFF S. MEEM: Robust tracking via multiple experts using entropy minimization[C]//Proceedings of the 2014 European Conference on Computer Vision(ECCV). Berlin: Springer, 2014: 188-203.
    [37] LI Y, ZHU J. A scale adaptive kernel correlation filter tracker with feature integration[C]//Proceedings of the 2014 European Conference on Computer Vision(ECCV). Berlin: Springer, 2015: 254-265.
    [38] MARTIN D, GUSTAV H, FAHAD S K, et al. Learning spatially regularized correlation filters for visual tracking[C]//Proceedings of the 2015 IEEE International Conference on Computer Vision (ICCV). Piscataway: IEEE Press, 2015: 4310-4318.
  • 加载中
图(14)
计量
  • 文章访问数:  286
  • HTML全文浏览量:  91
  • PDF下载量:  58
  • 被引次数: 0
出版历程
  • 收稿日期:  2021-04-07
  • 录用日期:  2021-05-28
  • 网络出版日期:  2021-06-16
  • 整期出版日期:  2023-01-30

目录

    /

    返回文章
    返回
    常见问答