
Real-time robust visual tracking based on spatial attention mechanism

MA Sugang, ZHANG Zixian, PU Lei, HOU Zhiqiang

Citation: MA S G, ZHANG Z X, PU L, et al. Real-time robust visual tracking based on spatial attention mechanism[J]. Journal of Beijing University of Aeronautics and Astronautics, 2024, 50(2): 419-432 (in Chinese). doi: 10.13700/j.bh.1001-5965.2022.0329

doi: 10.13700/j.bh.1001-5965.2022.0329

Funds: National Natural Science Foundation of China (62072370); Key Research and Development Program of Shaanxi (2018ZDCXL-GY-04-02); Graduate Innovation Fund of Xi’an University of Posts and Telecommunications (CXJJZL2021011)

Corresponding author E-mail: msg@xupt.edu.cn

CLC number: TB391.4

Abstract:

To improve the tracking ability of the fully convolutional Siamese network (SiamFC) tracker in complex scenes and to alleviate the target drift that occurs during tracking, a real-time object tracking algorithm combining a spatial attention mechanism is proposed. On the basis of the SiamFC framework, an improved Visual Geometry Group (VGG) network is adopted as the backbone to strengthen the tracker's ability to model deep target features. The self-attention mechanism is optimized into a plug-and-play, lightweight single convolution attention module (SCAM), which decomposes spatial attention into two parallel one-dimensional feature encoding processes and thus reduces its computational complexity. The initial target template is retained as the first template, a second template is selected dynamically by analyzing how the connected components of the tracking response map change, and the target is located after the two templates are fused. Experimental results show that, compared with SiamFC, the proposed algorithm improves the tracking success rate by 0.082, 0.045 and 0.045 and the tracking precision by 0.118, 0.051 and 0.062 on the OTB100, LaSOT and UAV123 datasets, respectively; on the VOT2018 dataset, it improves the tracking accuracy, robustness and expected average overlap (EAO) by 0.029, 0.276 and 0.134, respectively. The tracking speed reaches 70 frame/s, which satisfies the requirement of real-time tracking.
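The abstract describes SCAM only at a high level. As a rough illustration of what "decomposing spatial attention into two parallel one-dimensional feature encoding processes" with a single shared convolution could look like, here is a minimal PyTorch sketch; the class name SCAMSketch, the pooling choices and the gating rule are assumptions, not the paper's implementation.

```python
import torch
import torch.nn as nn


class SCAMSketch(nn.Module):
    """Hypothetical sketch of a single-convolution spatial attention module.

    Spatial attention is split into two parallel 1-D encodings: the feature
    map is pooled along the width to describe each row and along the height
    to describe each column, one shared 1x1 convolution encodes both
    descriptors, and the two gated maps re-weight the input.
    """

    def __init__(self, channels: int):
        super().__init__()
        # single shared convolution applied to both 1-D descriptors
        self.conv = nn.Conv2d(channels, channels, kernel_size=1, bias=False)
        self.sigmoid = nn.Sigmoid()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (B, C, H, W)
        row_desc = x.mean(dim=3, keepdim=True)      # (B, C, H, 1), encodes the height axis
        col_desc = x.mean(dim=2, keepdim=True)      # (B, C, 1, W), encodes the width axis
        row_gate = self.sigmoid(self.conv(row_desc))
        col_gate = self.sigmoid(self.conv(col_desc))
        return x * row_gate * col_gate              # broadcast back to (B, C, H, W)


if __name__ == "__main__":
    feat = torch.randn(1, 256, 22, 22)              # e.g. a search-region feature map
    print(SCAMSketch(256)(feat).shape)              # torch.Size([1, 256, 22, 22])
```

Because the two branches work on 1-D descriptors of length H and W rather than on the full H x W grid, the attention cost grows linearly with the spatial size, which is consistent with the complexity reduction the abstract claims for SCAM.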

     

Figure 1. Overall framework of the proposed algorithm
Figure 2. Simplified structure of the non-local module
Figure 3. Structure of SCAM
Figure 4. Visualization of heatmaps
Figure 5. Variation of response peak value and APCE value on SiamFC
Figure 6. Tracking response maps of different templates
Figure 7. Segmentation results of the tracking response map
Figure 8. Comparison of some tracking results on the OTB100 dataset
Figure 9. Evaluation results of different algorithms on the OTB100 dataset
Figure 10. Speed comparison of different algorithms on the OTB100 dataset
Figure 11. Tracking success rates for different challenge attributes on the OTB100 dataset
Figure 12. Evaluation results of different algorithms on the LaSOT dataset
Figure 13. Comparison of tracking success rates for different challenge attributes on the LaSOT dataset
Figure 14. Evaluation results of different algorithms on the UAV123 dataset
Figure 15. Comparative experimental results of SCAM and non-local modules
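Figure 5 plots how the peak response value and the APCE value vary on SiamFC. For reference, APCE (average peak-to-correlation energy), introduced by the LMCF tracker [23], measures how sharp and unimodal a response map is; a small NumPy helper (illustrative, not the paper's code) is shown below.

```python
import numpy as np


def apce(response: np.ndarray) -> float:
    """Average peak-to-correlation energy of a tracking response map (see [23])."""
    f_max, f_min = float(response.max()), float(response.min())
    return (f_max - f_min) ** 2 / float(np.mean((response - f_min) ** 2))
```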

Table 1. Experimental results for different $\alpha$ values on the OTB100 dataset

$\alpha$    Success rate    Precision
0.50        0.655           0.870
0.55        0.669           0.890
0.60        0.651           0.869
0.65        0.658           0.878
0.70        0.659           0.876
0.75        0.660           0.870
0.80        0.655           0.875
0.85        0.659           0.875
0.90        0.659           0.880
0.95        0.653           0.866
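Table 1 sweeps the parameter $\alpha$, with $\alpha = 0.55$ giving the best success rate and precision. This excerpt does not state exactly how $\alpha$ and the connected-component analysis enter the tracker, so the sketch below is only one hedged reading of the dual-template scheme summarized in the abstract: the binarization threshold, the single-component update rule, and the linear fusion are all assumptions made for illustration.

```python
import numpy as np
from scipy import ndimage


def select_second_template(response: np.ndarray,
                           candidate: np.ndarray,
                           current_second: np.ndarray,
                           thresh_ratio: float = 0.5) -> np.ndarray:
    """Binarize the response map at a fraction of its peak and count connected
    components; a single component is read as an unambiguous result, so the
    candidate feature replaces the stored second template (rule assumed)."""
    binary = response >= thresh_ratio * response.max()
    _, num_components = ndimage.label(binary)
    return candidate if num_components == 1 else current_second


def fuse_templates(first: np.ndarray, second: np.ndarray, alpha: float = 0.55) -> np.ndarray:
    """Linear fusion of the fixed initial template and the dynamic second one.
    alpha = 0.55 follows the best row of Table 1; the linear rule is an assumption."""
    return alpha * first + (1.0 - alpha) * second
```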

Table 2. Experimental results of different algorithms on the VOT2018 dataset

Algorithm            Accuracy    Robustness    EAO
SiamFC[13]           0.503       0.585         0.188
DSiam[36]            0.512       0.646         0.196
UpdateNet[18]        0.518       0.454         0.244
CCOT[37]             0.494       0.318         0.267
ECO[27]              0.484       0.276         0.280
SiamVGG[38]          0.531       0.318         0.286
DeepCSRDCF[39]       0.489       0.276         0.293
Proposed algorithm   0.532       0.309         0.322

Table 3. Results of the ablation study

Experiment      Success rate    Precision
SiamFC          0.587           0.772
SiamFC-V        0.624           0.833
SiamFC-V-A      0.657           0.875
SiamFC-V-A-U    0.669           0.890
  • [1] GAO M, JIN L S, JIANG Y Y, et al. Manifold Siamese network: A novel visual tracking ConvNet for autonomous vehicles[J]. IEEE Transactions on Intelligent Transportation Systems, 2020, 21(4): 1612-1623. doi: 10.1109/TITS.2019.2930337
    [2] KOU Z, WU J F, WANG H L, et al. Obstacle visual sensing based on deep learning for low-altitude small unmanned aerial vehicles[J]. Scientia Sinica (Informationis), 2020, 50(5): 692-703 (in Chinese). doi: 10.1360/N112019-00034
    [3] MARVASTI-ZADEH S M, CHENG L, GHANEI-YAKHDAN H, et al. Deep learning for visual tracking: A comprehensive survey[J]. IEEE Transactions on Intelligent Transportation Systems, 2022, 23(5): 3943-3968. doi: 10.1109/TITS.2020.3046478
    [4] LUO J H, HAN Y, FAN L Y. Underwater acoustic target tracking: A review[J]. Sensors, 2018, 18(1): 112. doi: 10.3390/s18010112
    [5] LIU F, SUN Y N, WANG H J, et al. Adaptive UAV target tracking algorithm based on residual learning[J]. Journal of Beijing University of Aeronautics and Astronautics, 2020, 46(10): 1874-1882 (in Chinese). doi: 10.13700/j.bh.1001-5965.2019.0551
    [6] HAN M, WANG J Q, WANG J T, et al. Comprehensive survey on target tracking based on Siamese network[J]. Journal of Hebei University of Science and Technology, 2022, 43(1): 27-41 (in Chinese).
    [7] HENRIQUES J F, CASEIRO R, MARTINS P, et al. High-speed tracking with kernelized correlation filters[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2015, 37(3): 583-596. doi: 10.1109/TPAMI.2014.2345390
    [8] ZHANG K H, ZHANG L, LIU Q S, et al. Fast visual tracking via dense spatio-temporal context learning[C]//Proceedings of the European Conference on Computer Vision. Berlin: Springer, 2014: 127-141.
    [9] DANELLJAN M, KHAN F S, FELSBERG M, et al. Adaptive color attributes for real-time visual tracking[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2014: 1090-1097.
    [10] MA C, HUANG J B, YANG X K, et al. Hierarchical convolutional features for visual tracking[C]//Proceedings of the IEEE International Conference on Computer Vision. Piscataway: IEEE Press, 2016: 3074-3082.
    [11] WANG L J, OUYANG W L, WANG X G, et al. Visual tracking with fully convolutional networks[C]//Proceedings of the IEEE International Conference on Computer Vision. Piscataway: IEEE Press, 2016: 3119-3127.
    [12] BHAT G, JOHNANDER J, DANELLJAN M, et al. Unveiling the power of deep tracking[C]//Proceedings of the European Conference on Computer Vision. Berlin: Springer, 2018: 493-509.
    [13] BERTINETTO L, VALMADRE J, HENRIQUES J F, et al. Fully-convolutional Siamese networks for object tracking[C]//Proceedings of the European Conference on Computer Vision. Berlin: Springer, 2016: 850-865.
    [14] HE A F, LUO C, TIAN X M, et al. A twofold Siamese network for real-time object tracking[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2018: 4834-4843.
    [15] PU L, FENG X X, HOU Z Q, et al. SiamDA: Dual attention Siamese network for real-time visual tracking[J]. Signal Processing: Image Communication, 2021, 95: 116293. doi: 10.1016/j.image.2021.116293
    [16] GUPTA D K, ARYA D, GAVVES E. Rotation equivariant Siamese networks for tracking[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2021: 12357-12366.
    [17] VALMADRE J, BERTINETTO L, HENRIQUES J, et al. End-to-end representation learning for correlation filter based tracking[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2017: 5000-5008.
    [18] ZHANG L C, GONZALEZ-GARCIA A, VAN DE WEIJER J, et al. Learning the model update for Siamese trackers[C]//Proceedings of the IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE Press, 2020: 4009-4018.
    [19] ZHU Z, WU W, ZOU W, et al. End-to-end flow correlation tracking with spatial-temporal attention[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2018: 548-557.
    [20] ZHANG C Y, WANG H, WEN J W, et al. Deeper Siamese network with stronger feature representation for visual tracking[J]. IEEE Access, 2020, 8: 119094-119104. doi: 10.1109/ACCESS.2020.3005511
    [21] WANG X L, GIRSHICK R, GUPTA A, et al. Non-local neural networks[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2018: 7794-7803.
    [22] CAO Y, XU J R, LIN S, et al. GCNet: Non-local networks meet squeeze-excitation networks and beyond[C]//Proceedings of the IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE Press, 2020: 1971-1980.
    [23] WANG M M, LIU Y, HUANG Z Y. Large margin object tracking with circulant feature maps[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2017: 4800-4808.
    [24] DANELLJAN M, HÄGER G, KHAN F S, et al. Discriminative scale space tracking[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 39(8): 1561-1575. doi: 10.1109/TPAMI.2016.2609928
    [25] BERTINETTO L, VALMADRE J, GOLODETZ S, et al. Staple: Complementary learners for real-time tracking[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2016: 1401-1409.
    [26] DANELLJAN M, HÄGER G, KHAN F S, et al. Learning spatially regularized correlation filters for visual tracking[C]//Proceedings of the IEEE International Conference on Computer Vision. Piscataway: IEEE Press, 2016: 4310-4318.
    [27] DANELLJAN M, BHAT G, KHAN F S, et al. ECO: Efficient convolution operators for tracking[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2017: 6931-6939.
    [28] DANELLJAN M, HÄGER G, KHAN F S, et al. Convolutional features for correlation filter based visual tracking[C]//Proceedings of the IEEE International Conference on Computer Vision. Piscataway: IEEE Press, 2015: 58-66.
    [29] ZHANG Z P, PENG H W. Deeper and wider Siamese networks for real-time visual tracking[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2020: 4586-4595.
    [30] LI B, YAN J J, WU W, et al. High performance visual tracking with Siamese region proposal network[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2018: 8971-8980.
    [31] BHAT G, DANELLJAN M, VAN GOOL L, et al. Learning discriminative model prediction for tracking[C]//Proceedings of the IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE Press, 2020: 6181-6190.
    [32] ZHU Z, WANG Q, LI B, et al. Distractor-aware Siamese networks for visual object tracking[C]//Proceedings of the European Conference on Computer Vision. Berlin: Springer, 2018: 103-119.
    [33] DANELLJAN M, VAN GOOL L, TIMOFTE R. Probabilistic regression for visual tracking[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2020: 7181-7190.
    [34] DANELLJAN M, BHAT G, KHAN F S, et al. ATOM: Accurate tracking by overlap maximization[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2020: 4655-4664.
    [35] LI P X, CHEN B Y, OUYANG W L, et al. GradNet: Gradient-guided network for visual object tracking[C]//Proceedings of the IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE Press, 2020: 6161-6170.
    [36] GUO Q, FENG W, ZHOU C, et al. Learning dynamic Siamese network for visual object tracking[C]//Proceedings of the IEEE International Conference on Computer Vision. Piscataway: IEEE Press, 2017: 1781-1789.
    [37] DANELLJAN M, ROBINSON A, KHAN F S, et al. Beyond correlation filters: Learning continuous convolution operators for visual tracking[C]//Proceedings of the European Conference on Computer Vision. Berlin: Springer, 2016: 472-488.
    [38] YIN Z J, WEN C H, HUANG Z Y, et al. SiamVGG-LLC: Visual tracking using LLC and deeper Siamese networks[C]//Proceedings of the IEEE International Conference on Communication Technology. Piscataway: IEEE Press, 2020: 1683-1687.
    [39] LUKEŽIC A, VOJÍR T, ZAJC L C, et al. Discriminative correlation filter with channel and spatial reliability[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2017: 4847-4856.
    [40] XU T Y, FENG Z H, WU X J, et al. Joint group feature selection and discriminative filter learning for robust visual object tracking[C]//Proceedings of the IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE Press, 2020: 7949-7959.
    [41] DAI K N, WANG D, LU H C, et al. Visual tracking via adaptive spatially-regularized correlation filters[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2020: 4665-4674.
    [42] ZHANG Y H, WANG L J, QI J Q, et al. Structured Siamese network for real-time visual tracking[C]//Proceedings of the European Conference on Computer Vision. Berlin: Springer, 2018: 355-370.
    [43] LI F, TIAN C, ZUO W M, et al. Learning spatial-temporal regularized correlation filters for visual tracking[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2018: 4904-4913.
    [44] CHOI J, CHANG H J, FISCHER T, et al. Context-aware deep feature compression for high-speed visual tracking[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2018: 479-488.
Publication history
  • Received: 2022-05-07
  • Accepted: 2022-08-21
  • Published online: 2022-09-14
  • Issue published: 2024-02-27
