
Visual tracking algorithm based on multi-attention and dual-template update

MA Sugang, SUN Siwei, HOU Zhiqiang, YU Wangsheng, PU Lei

Citation: MA S G, SUN S W, HOU Z Q, et al. Visual tracking algorithm based on multi-attention and dual-template update[J]. Journal of Beijing University of Aeronautics and Astronautics, 2025, 51(6): 1955-1964 (in Chinese). doi: 10.13700/j.bh.1001-5965.2023.0334


doi: 10.13700/j.bh.1001-5965.2023.0334
Funds: 

National Natural Science Foundation of China (62072370); Natural Science Foundation of Shaanxi Province (2023-JC-YB-598); Science and Technology Project of Xi’an (22GXFW0125)

    Corresponding author. E-mail: msg@xupt.edu.cn

  • CLC number: TP391.4

  • Abstract:

    To address the limited representation capability of the fully convolutional Siamese network (SiamFC) tracker in complex scenes and its lack of online updating, a visual tracking algorithm based on multi-attention and dual-template update is proposed. The feature extraction network is constructed by replacing AlexNet with VGG16 and replacing the max pooling layers with SoftPool. A multi-attention module (MAM) is added after the backbone network to strengthen the extraction of target features. A dual-template scheme is designed for feature fusion and response-map fusion, and the average peak-to-correlation energy (APCE) is used to decide whether the dynamic template should be updated, which effectively improves tracking robustness. The proposed algorithm is trained on the GOT-10k dataset and tested on the OTB2015, VOT2018 and UAV123 datasets. Experimental results show that, compared with the baseline SiamFC algorithm, the proposed algorithm improves the success rate by 0.085 and 0.037 and the precision by 0.118 and 0.058 on OTB2015 and UAV123, respectively; on VOT2018, the accuracy, robustness and expected average overlap (EAO) are improved by 0.030, 0.295 and 0.139, respectively. The proposed algorithm achieves high tracking accuracy in complex scenes and runs at 33.9 frame/s, meeting the requirement of real-time tracking.
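
    As context for the pooling change described in the abstract, the sketch below shows the SoftPool operator of Stergiou et al. [16] in PyTorch. It is a minimal illustration of the published operator, not the authors' released code; the kernel size, stride and feature-map shape in the usage example are placeholders.

```python
import torch
import torch.nn.functional as F

def soft_pool2d(x: torch.Tensor, kernel_size: int = 2, stride: int = 2) -> torch.Tensor:
    """SoftPool: each pooling window returns the exponentially weighted
    average of its activations instead of only the maximum (max pooling)."""
    # Subtracting the global max keeps exp() from overflowing; the constant
    # cancels in the ratio below, so the result is unchanged.
    w = torch.exp(x - x.max())
    num = F.avg_pool2d(w * x, kernel_size, stride)  # (1/N) * sum(w * x) per window
    den = F.avg_pool2d(w, kernel_size, stride)      # (1/N) * sum(w) per window
    return num / den.clamp_min(1e-12)

# Example: downsample a hypothetical 22x22 feature map to 11x11.
feat = torch.randn(1, 256, 22, 22)
pooled = soft_pool2d(feat)  # shape (1, 256, 11, 11)
```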

     

  • Figure 1.  Overall framework of the proposed algorithm

    Figure 2.  Backbone network

    Figure 3.  MAM (multi-attention module)

    Figure 4.  Visualization of attention

    Figure 5.  Comparison of tracking results on five groups of video sequences

    Figure 6.  Precision and success rate curves of different algorithms on the OTB2015 dataset

    Table 1.  Experimental results for different μ values on the OTB2015 dataset

    μ       Success rate    Precision
    0.6     0.657           0.884
    0.65    0.664           0.892
    0.7     0.651           0.875
    0.75    0.658           0.882
    0.8     0.659           0.884
    0.85    0.667           0.899
    0.9     0.661           0.885
    0.95    0.657           0.880
    1.0     0.655           0.881
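
    Table 1 above sweeps a threshold parameter μ on OTB2015, with μ = 0.85 giving the best success rate and precision, and the abstract states that APCE decides whether the dynamic template is refreshed. The sketch below shows the standard APCE measure together with one plausible μ-gated update rule; treating μ as a factor on the running mean of past APCE values is an assumption made for illustration and is not specified on this page.

```python
import numpy as np

def apce(response: np.ndarray) -> float:
    """Average peak-to-correlation energy of a 2-D response map:
    (F_max - F_min)^2 / mean((F - F_min)^2).
    A sharp single peak yields a large value; noisy or multi-peak maps yield small ones."""
    r_max, r_min = float(response.max()), float(response.min())
    return (r_max - r_min) ** 2 / (float(np.mean((response - r_min) ** 2)) + 1e-12)

def should_update_template(response: np.ndarray, history: list, mu: float = 0.85) -> bool:
    """Hypothetical gating rule: refresh the dynamic template only when the current
    APCE reaches mu times the running mean of previous APCE values (mu = 0.85, as in Table 1)."""
    current = apce(response)
    update = (not history) or current >= mu * float(np.mean(history))
    history.append(current)
    return update

# Example: simulate a sequence of response maps and count template refreshes.
history = []
updates = sum(should_update_template(np.random.rand(17, 17), history) for _ in range(100))
```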

    Table 2.  Success rate for different attributes on the OTB2015 dataset

    Algorithm  Background clutter  Deformation  Fast motion  Illumination variation  In-plane rotation  Low resolution  Motion blur  Occlusion  Out-of-plane rotation  Out of view  Scale variation
    Proposed 0.663 0.624 0.664 0.690 0.655 0.678 0.691 0.636 0.647 0.621 0.644
    ATOM[10] 0.619 0.631 0.642 0.655 0.635 0.705 0.659 0.637 0.629 0.609 0.671
    DiMP18[11] 0.616 0.617 0.674 0.654 0.650 0.611 0.679 0.636 0.635 0.628 0.675
    DaSiamRPN[25] 0.625 0.599 0.655 0.670 0.669 0.594 0.651 0.577 0.636 0.605 0.652
    ECO-HC[13] 0.636 0.595 0.634 0.640 0.582 0.533 0.627 0.629 0.608 0.594 0.611
    GradNet[26] 0.611 0.572 0.625 0.643 0.627 0.673 0.647 0.617 0.629 0.585 0.614
    DeepSRDCF[6] 0.627 0.566 0.628 0.621 0.589 0.564 0.643 0.602 0.607 0.555 0.606
    SiamFC-VGG[27] 0.591 0.603 0.600 0.631 0.623 0.689 0.630 0.576 0.614 0.533 0.619
    SiamRPN[9] 0.591 0.617 0.600 0.649 0.628 0.642 0.623 0.586 0.625 0.544 0.615
    SiamDW[8] 0.574 0.560 0.630 0.622 0.606 0.598 0.654 0.602 0.612 0.592 0.614
    SRDCF[6] 0.583 0.544 0.597 0.613 0.544 0.514 0.541 0.559 0.550 0.474 0.561
    SiamFC[7] 0.527 0.512 0.571 0.572 0.559 0.621 0.555 0.550 0.561 0.511 0.557
    Staple[28] 0.560 0.551 0.540 0.592 0.548 0.394 0.594 0.543 0.533 0.460 0.521

    Table 3.  Precision for different attributes on the OTB2015 dataset

    Algorithm  Background clutter  Deformation  Fast motion  Illumination variation  In-plane rotation  Low resolution  Motion blur  Occlusion  Out-of-plane rotation  Out of view  Scale variation
    Proposed 0.895 0.870 0.890 0.914 0.912 0.983 0.912 0.859 0.903 0.845 0.884
    ATOM[10] 0.806 0.856 0.828 0.866 0.869 0.993 0.855 0.835 0.851 0.808 0.877
    DiMP18[11] 0.801 0.823 0.857 0.849 0.866 0.856 0.850 0.835 0.843 0.820 0.872
    DaSiamRPN[25] 0.843 0.814 0.858 0.895 0.913 0.922 0.856 0.764 0.867 0.787 0.868
    ECO-HC[13] 0.850 0.806 0.829 0.820 0.800 0.888 0.802 0.848 0.834 0.818 0.822
    GradNet[26] 0.822 0.795 0.838 0.844 0.860 0.999 0.855 0.838 0.872 0.789 0.841
    DeepSRDCF[6] 0.841 0.783 0.814 0.791 0.818 0.847 0.823 0.825 0.835 0.781 0.819
    SiamFC-VGG[27] 0.761 0.828 0.777 0.817 0.845 0.997 0.817 0.764 0.833 0.706 0.834
    SiamRPN[9] 0.799 0.825 0.789 0.859 0.854 0.978 0.816 0.780 0.851 0.726 0.838
    SiamDW[8] 0.762 0.763 0.808 0.794 0.824 0.901 0.841 0.798 0.829 0.781 0.819
    SRDCF[6] 0.775 0.734 0.768 0.792 0.745 0.760 0.765 0.734 0.741 0.594 0.745
    SiamFC[7] 0.692 0.691 0.744 0.736 0.743 0.900 0.707 0.723 0.758 0.673 0.736
    Staple[28] 0.749 0.752 0.708 0.783 0.768 0.690 0.698 0.726 0.737 0.664 0.726

    Table 4.  Ablation experiments of the proposed algorithm on the OTB2015 dataset

    SiamFC[7]  VGG16[15]  SoftPool[16]  CA[24]  GCT[23]  MAM  Update-A  Update-B  Success rate  Precision
    0.582 0.781
    0.632 0.848
    0.643 0.855
    0.648 0.866
    0.655 0.875
    0.660 0.888
    0.663 0.892
    0.667 0.899

    Table 5.  Performance comparison of different algorithms on the VOT2018 dataset

    Algorithm        Accuracy   Robustness   EAO
    ECO-HC[13]       0.484      0.276        0.280
    SiamRPN[9]       0.490      0.460        0.244
    CCOT[29]         0.494      0.318        0.267
    SiamFC[7]        0.503      0.585        0.188
    DSiam[30]        0.512      0.646        0.196
    UpdateNet[14]    0.518      0.454        0.244
    SiamFC-VGG[27]   0.531      0.318        0.286
    Proposed         0.533      0.290        0.327

    Table 6.  Performance comparison of different algorithms on the UAV123 dataset

    Algorithm       Success rate   Precision
    MCCT[31]        0.453          0.656
    SRDCF[6]        0.464          0.676
    ARCF[32]        0.465          0.669
    AutoTrack[33]   0.467          0.686
    STRCF[34]       0.478          0.678
    ECO-HC[13]      0.493          0.707
    CCOT[29]        0.502          0.729
    SiamFC[7]       0.505          0.706
    SiamRPN[9]      0.535          0.751
    Proposed        0.542          0.764
  • [1] DONG E Z, ZHANG Y, DU S Z. An automatic object detection and tracking method based on video surveillance[C]//Proceedings of the IEEE International Conference on Mechatronics and Automation. Piscataway: IEEE Press, 2020: 1140-1144.
    [2] ZHAI L, WANG C P, HOU Y H, et al. MPC-based integrated control of trajectory tracking and handling stability for intelligent driving vehicle driven by four hub motor[J]. IEEE Transactions on Vehicular Technology, 2022, 71(3): 2668-2680. doi: 10.1109/TVT.2022.3140240
    [3] MANGAL N K, TIWARI A K. Kinect v2 tracked body joint smoothing for kinematic analysis in musculoskeletal disorders[C]//Proceedings of the 42nd Annual International Conference of the IEEE Engineering in Medicine & Biology Society. Piscataway: IEEE Press, 2020: 5769-5772.
    [4] DEWANGAN D K, SAHU S P. Real time object tracking for intelligent vehicle[C]//Proceedings of the 1st International Conference on Power, Control and Computing Technologies. Piscataway: IEEE Press, 2020: 134-138.
    [5] DANELLJAN M, HÄGER G, KHAN F S, et al. Convolutional features for correlation filter based visual tracking[C]//Proceedings of the IEEE International Conference on Computer Vision Workshop. Piscataway: IEEE Press, 2015: 621-629.
    [6] DANELLJAN M, HÄGER G, KHAN F S, et al. Learning spatially regularized correlation filters for visual tracking[C]//Proceedings of the IEEE International Conference on Computer Vision. Piscataway: IEEE Press, 2015: 4310-4318.
    [7] BERTINETTO L, VALMADRE J, HENRIQUES J F, et al. Fully-convolutional Siamese networks for object tracking[C]//Proceedings of the European Conference on Computer Vision. Berlin: Springer, 2016: 850-865.
    [8] ZHANG Z P, PENG H W. Deeper and wider Siamese networks for real-time visual tracking[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2019: 4586-4595.
    [9] LI B, YAN J J, WU W, et al. High performance visual tracking with Siamese Region proposal network[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2018: 8971-8980.
    [10] DANELLJAN M, BHAT G, KHAN F S, et al. ATOM: accurate tracking by overlap maximization[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2019: 4655-4664.
    [11] BHAT G, DANELLJAN M, VAN GOOL L, et al. Learning discriminative model prediction for tracking[C]//Proceedings of the IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE Press, 2020: 6181-6190.
    [12] WANG Q, TENG Z, XING J L, et al. Learning attentions: residual attentional Siamese network for high performance online visual tracking[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2018: 4854-4863.
    [13] DANELLJAN M, BHAT G, KHAN F S, et al. ECO: efficient convolution operators for tracking[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2017: 6931-6939.
    [14] ZHANG L C, GONZALEZ-GARCIA A, VAN DE WEIJER J, et al. Learning the model update for Siamese trackers[C]//Proceedings of the IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE Press, 2019: 4009-4018.
    [15] ZHANG C Y, WANG H, WEN J W, et al. Deeper Siamese network with stronger feature representation for visual tracking[J]. IEEE Access, 2020, 8: 119094-119104. doi: 10.1109/ACCESS.2020.3005511
    [16] STERGIOU A, POPPE R, KALLIATAKIS G. Refining activation downsampling with SoftPool[C]//Proceedings of the IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE Press, 2021: 10337-10346.
    [17] HUANG L H, ZHAO X, HUANG K Q. GOT-10k: a large high-diversity benchmark for generic object tracking in the wild[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2021, 43(5): 1562-1577. doi: 10.1109/TPAMI.2019.2957464
    [18] WU Y, LIM J, YANG M H. Object tracking benchmark[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2015, 37(9): 1834-1848. doi: 10.1109/TPAMI.2014.2388226
    [19] KRISTAN M, LEONARDIS A, MATAS J, et al. The sixth visual object tracking VOT2018 challenge results[C]//Proceedings of the European Conference on Computer Vision. Berlin: Springer, 2019: 3-53.
    [20] MUELLER M, SMITH N, GHANEM B. A benchmark and simulator for UAV tracking[C]//Proceedings of the European Conference on Computer Vision. Berlin: Springer, 2016: 445-461.
    [21] HU J, SHEN L, SUN G. Squeeze-and-excitation networks[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2018: 7132-7141.
    [22] WOO S, PARK J, LEE J Y, et al. CBAM: convolutional block attention module[C]//Proceedings of the European Conference on Computer Vision. Berlin: Springer, 2018: 3-19.
    [23] YANG Z X, ZHU L C, WU Y, et al. Gated channel transformation for visual recognition[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2020: 11794-11803.
    [24] HOU Q B, ZHOU D Q, FENG J S. Coordinate attention for efficient mobile network design[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2021: 13708-13717.
    [25] ZHU Z, WANG Q, LI B, et al. Distractor-aware Siamese networks for visual object tracking[C]//Proceedings of the European Conference on Computer Vision. Berlin: Springer, 2018: 103-119.
    [26] LI P X, CHEN B Y, OUYANG W L, et al. GradNet: gradient-guided network for visual object tracking[C]//Proceedings of the IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE Press, 2019: 6161-6170.
    [27] LI Y H, ZHANG X F, CHEN D M. SiamVGG: visual tracking using deeper Siamese networks[EB/OL]. (2022-07-04)[2023-02-01]. https://arxiv.org/abs/1902.02804v4.
    [28] BERTINETTO L, VALMADRE J, GOLODETZ S, et al. Staple: complementary learners for real-time tracking[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2016: 1401-1409.
    [29] DANELLJAN M, ROBINSON A, SHAHBAZ KHAN F, et al. Beyond correlation filters: learning continuous convolution operators for visual tracking[C]//Proceedings of the European Conference on Computer Vision. Berlin: Springer, 2016: 472-488.
    [30] GUO Q, FENG W, ZHOU C, et al. Learning dynamic Siamese network for visual object tracking[C]//Proceedings of the IEEE International Conference on Computer Vision. Piscataway: IEEE Press, 2017: 1781-1789.
    [31] WANG N, ZHOU W G, TIAN Q, et al. Multi-cue correlation filters for robust visual tracking[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2018: 4844-4853.
    [32] HUANG Z Y, FU C H, LI Y M, et al. Learning aberrance repressed correlation filters for real-time UAV tracking[C]//Proceedings of the IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE Press, 2019: 2891-2900.
    [33] LI Y M, FU C H, DING F Q, et al. AutoTrack: towards high-performance visual tracking for UAV with automatic spatio-temporal regularization[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2020: 11920-11929.
    [34] LI F, TIAN C, ZUO W M, et al. Learning spatial-temporal regularized correlation filters for visual tracking[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2018: 4904-4913.
Publication history
  • Received:  2023-06-09
  • Accepted:  2023-07-21
  • Published online:  2023-08-25
  • Issue date:  2025-06-30
