Visual tracking algorithm based on template updating and dual feature enhancement

DING Qishuai, LEI Bangjun, MOU Qianxi, WU Zhengping

Citation: DING Q S,LEI B J,MOU Q X,et al. Visual tracking algorithm based on template updating and dual feature enhancement[J]. Journal of Beijing University of Aeronautics and Astronautics,2026,52(4):1096-1106 (in Chinese)


doi: 10.13700/j.bh.1001-5965.2024.0020

Corresponding author e-mail: bangjun.lei@ieee.org

CLC number: TP391.4

Visual tracking algorithm based on template updating and dual feature enhancement

Funds: 

National Natural Science Foundation of China (61871258); Hubei Key Laboratory of Intelligent Visual Monitoring for Hydropower Engineering Project (2019ZYYD007); Yichang Science and Technology Research and Development Project (A201130225)

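Abstract:

To address tracking failures caused by target deformation, flipping, and occlusion in visual tracking, a template update algorithm based on image structural similarity is proposed; it updates the template dynamically so that it follows changes in the target during tracking. In addition, a tracking feature enhancement module and a segmentation feature enhancement module are designed on top of the SiamMask network. The tracking feature enhancement module combines a non-local operation with convolutional downsampling. It builds contextual associations, strengthens target features, suppresses background interference, improves tracking robustness, and counters the feature weakening that occurs when the target is occluded. The segmentation feature enhancement module introduces the convolutional block attention module (CBAM) and deformable convolution to improve the network's ability to capture channel and spatial features and to learn the shape and contour of the target adaptively. This raises the segmentation precision for the tracked target and, in turn, the tracking accuracy. Experiments show that the proposed algorithm performs well and stably. Compared with SiamMask, the expected average overlap (EAO) on the VOT2016, VOT2018, and VOT2019 datasets increases by 0.052, 0.053, and 0.025, respectively; the robustness score (lower is better) improves by 0.06, 0.079, and 0.156, respectively; and the tracker runs in real time at an average of 91 frames per second.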
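The idea of gating template updates on structural similarity can be sketched as follows. This is a minimal illustration, not the authors' algorithm: it uses a single-window (global) SSIM in the form given by Wang et al. [13] on flat grayscale pixel sequences, and the function names `global_ssim` and `maybe_update` as well as the `low` and `high` thresholds are hypothetical. The assumed rule replaces the template only when the candidate has changed noticeably but is still similar enough to be the same object.

```python
def global_ssim(x, y, data_range=255.0):
    """Single-window SSIM between two equal-length grayscale pixel sequences."""
    if len(x) != len(y) or len(x) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(x)
    c1 = (0.01 * data_range) ** 2  # stabilising constants from the SSIM definition
    c2 = (0.03 * data_range) ** 2
    mx = sum(x) / n
    my = sum(y) / n
    vx = sum((a - mx) ** 2 for a in x) / n
    vy = sum((b - my) ** 2 for b in y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    numerator = (2 * mx * my + c1) * (2 * cov + c2)
    denominator = (mx ** 2 + my ** 2 + c1) * (vx + vy + c2)
    return numerator / denominator


def maybe_update(template, candidate, low=0.3, high=0.9):
    """Return (template, updated) under an assumed two-threshold rule.

    Assumption for illustration only: scores below `low` are treated as
    likely occlusion or drift, scores at or above `high` as "no real change".
    """
    score = global_ssim(template, candidate)
    if low <= score < high:
        return list(candidate), True
    return list(template), False


if __name__ == "__main__":
    template = [10, 20, 30, 40]
    changed = [10, 20, 30, 60]    # moderate appearance change
    inverted = [40, 30, 20, 10]   # structure reversed, e.g. a wrong crop
    print(maybe_update(template, changed))
    print(maybe_update(template, inverted))
```

In practice a windowed SSIM over image crops would be more faithful to [13]; the global form is used here only to keep the sketch dependency-free.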

     

Figure 1. Framework of the visual tracking algorithm based on template updating and dual feature enhancement
Figure 2. Feature fusion network
Figure 3. Changes in the object before tracking failure
Figure 4. Object correlation between different video frames
Figure 5. Tracking feature enhancement module
Figure 6. Non-local operation
Figure 7. Segmentation feature enhancement module
Figure 8. Convolutional block attention module (CBAM)
Figure 9. Deformable convolution network
Figure 10. Comparison of visual attributes on the VOT2016 dataset
Figure 11. EAO ranking on the VOT2016 dataset
Figure 12. Comparison of tracking results
Figure 13. EAO ranking on the VOT2018 dataset
Figure 14. EAO ranking on the VOT2019 dataset
Figure 15. Comparison of segmentation results
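The non-local operation named in Figure 6 can be illustrated with a short NumPy sketch. It follows the embedded-Gaussian form described by Wang et al. [14], where every spatial position aggregates features from all other positions through a softmax over pairwise similarities. The function name, the weight arguments, and the shapes are assumptions for illustration; the paper's module also includes convolutional downsampling, which is not modelled here.

```python
import numpy as np


def non_local_block(x, w_theta, w_phi, w_g, w_out):
    """Embedded-Gaussian non-local operation on a (C, H, W) feature map.

    w_theta, w_phi, w_g: (C_inner, C) matrices standing in for 1x1 convolutions.
    w_out: (C, C_inner) matrix projecting back to C channels.
    """
    c, h, w = x.shape
    flat = x.reshape(c, h * w)              # (C, N), N = H * W positions
    theta = w_theta @ flat                  # query embedding, (C_inner, N)
    phi = w_phi @ flat                      # key embedding, (C_inner, N)
    g = w_g @ flat                          # value embedding, (C_inner, N)

    logits = theta.T @ phi                  # (N, N) pairwise similarities
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    attn = np.exp(logits)
    attn = attn / attn.sum(axis=1, keepdims=True)        # softmax over positions j

    y = g @ attn.T                          # y[:, i] = sum_j attn[i, j] * g[:, j]
    return x + (w_out @ y).reshape(c, h, w)  # residual connection


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feat = rng.standard_normal((4, 3, 3))
    w_t, w_p, w_g = (rng.standard_normal((2, 4)) for _ in range(3))
    w_o = rng.standard_normal((4, 2))
    print(non_local_block(feat, w_t, w_p, w_g, w_o).shape)
```

Because of the residual connection, setting `w_out` to zeros leaves the input unchanged, which is why such blocks can be inserted into a pretrained network without disturbing its initial behaviour.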

Table 1. Experimental results for different visual attributes on the VOT2016 dataset (overall EAO, followed by EAO per attribute)

| Tracker | Overall EAO | Occlusion | Camera motion | Scale change | Illumination change | Motion change | Unassigned |
|---|---|---|---|---|---|---|---|
| SiamFC[2] | 0.234 | 0.161 | 0.191 | 0.242 | 0.180 | 0.231 | 0.059 |
| MDNet[17] | 0.257 | 0.218 | 0.238 | 0.312 | 0.313 | 0.252 | 0.030 |
| C-COT[18] | 0.331 | 0.246 | 0.249 | 0.327 | 0.402 | 0.354 | 0.154 |
| SiamRPN[3] | 0.344 | 0.117 | 0.205 | 0.280 | 0.270 | 0.176 | 0.065 |
| DaSiamRPN[4] | 0.411 | 0.241 | 0.280 | 0.422 | 0.233 | 0.294 | 0.106 |
| SiamMask[6] | 0.433 | 0.325 | 0.394 | 0.444 | 0.463 | 0.409 | 0.109 |
| Ours | 0.485 | 0.470 | 0.472 | 0.527 | 0.617 | 0.470 | 0.104 |

Table 2. Results of different tracking algorithms on the VOT2016 dataset

| Tracker | Accuracy↑ | Robustness↓ | EAO↑ |
|---|---|---|---|
| SiamMask[6] | 0.622 | 0.214 | 0.433 |
| SiamRPN++[5] | 0.640 | 0.200 | 0.464 |
| UpdateNet[9] | 0.610 | 0.210 | 0.481 |
| Siam R-CNN[19] | 0.645 | 0.173 | 0.461 |
| ULAST-on[20] | 0.603 | 0.214 | 0.417 |
| Ours | 0.630 | 0.154 | 0.485 |

Table 3. Results of different tracking algorithms on the VOT2018 dataset

| Tracker | Accuracy↑ | Robustness↓ | EAO↑ |
|---|---|---|---|
| SiamMask[6] | 0.609 | 0.276 | 0.380 |
| SiamRPN++[5] | 0.600 | 0.230 | 0.415 |
| Siam R-CNN[19] | 0.609 | 0.220 | 0.408 |
| SiamFC++[10] | 0.587 | 0.183 | 0.426 |
| ULAST-on[20] | 0.571 | 0.286 | 0.355 |
| Ours | 0.603 | 0.197 | 0.433 |

Table 4. Results of different tracking algorithms on the VOT2019 dataset

| Tracker | Accuracy↑ | Robustness↓ | EAO↑ |
|---|---|---|---|
| SiamFC[2] | 0.511 | 0.923 | 0.183 |
| SiamRPN[3] | 0.582 | 0.527 | 0.272 |
| SiamMask[6] | 0.594 | 0.572 | 0.274 |
| SPM[21] | 0.577 | 0.507 | 0.275 |
| SiamRPN++[5] | 0.599 | 0.482 | 0.285 |
| Ours | 0.601 | 0.416 | 0.299 |
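As a quick consistency check, the gains over SiamMask quoted in the abstract can be recomputed from the SiamMask row and the proposed method's row in Tables 2 to 4. The dictionary layout and the `gains` helper below are illustrative choices; the numbers are copied from the tables.

```python
# (robustness, EAO) pairs taken from Tables 2, 3 and 4.
RESULTS = {
    "VOT2016": {"SiamMask": (0.214, 0.433), "Ours": (0.154, 0.485)},
    "VOT2018": {"SiamMask": (0.276, 0.380), "Ours": (0.197, 0.433)},
    "VOT2019": {"SiamMask": (0.572, 0.274), "Ours": (0.416, 0.299)},
}


def gains(dataset):
    """Return (EAO increase, robustness-score decrease) relative to SiamMask."""
    rob_base, eao_base = RESULTS[dataset]["SiamMask"]
    rob_ours, eao_ours = RESULTS[dataset]["Ours"]
    # Robustness is a lower-is-better score, so the gain is baseline minus ours.
    return round(eao_ours - eao_base, 3), round(rob_base - rob_ours, 3)


if __name__ == "__main__":
    for name in RESULTS:
        eao_gain, rob_gain = gains(name)
        print(f"{name}: EAO +{eao_gain:.3f}, robustness score -{rob_gain:.3f}")
```

The recomputed differences agree with the figures stated in the abstract for all three datasets.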

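Table 5. Results of different template update parameters on the VOT2018 dataset

| Queue length N | $\varDelta_1$ | $\varDelta_2$ | EAO | Segmentation speed (frames/s) |
|---|---|---|---|---|
| 5 | 0.20 | 0.15 | 0.362 | 92 |
| 5 | 0.20 | 0.20 | 0.375 | 96 |
| 5 | 0.20 | 0.25 | 0.366 | 98 |
| 5 | 0.25 | 0.15 | 0.381 | 102 |
| 5 | 0.25 | 0.20 | 0.394 | 105 |
| 5 | 0.25 | 0.25 | 0.389 | 105 |
| 5 | 0.30 | 0.15 | 0.381 | 104 |
| 5 | 0.30 | 0.20 | 0.379 | 105 |
| 5 | 0.30 | 0.25 | 0.380 | 106 |
| 10 | 0.20 | 0.15 | 0.373 | 74 |
| 10 | 0.20 | 0.20 | 0.379 | 77 |
| 10 | 0.20 | 0.25 | 0.385 | 80 |
| 10 | 0.25 | 0.15 | 0.384 | 79 |
| 10 | 0.25 | 0.20 | 0.396 | 82 |
| 10 | 0.25 | 0.25 | 0.388 | 84 |
| 10 | 0.30 | 0.15 | 0.376 | 88 |
| 10 | 0.30 | 0.20 | 0.378 | 89 |
| 10 | 0.30 | 0.25 | 0.380 | 91 |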

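Table 6. Results of ablation experiments on the VOT2016 dataset

The four component columns (SiamMask, template update algorithm, tracking feature enhancement module, segmentation feature enhancement module) held check marks that are not preserved here, so rows are numbered in their original order. Row 1 matches the SiamMask baseline and row 6 the proposed method in Table 2.

| Row | Accuracy↑ | Robustness↓ | EAO↑ | Segmentation speed (frames/s)↑ | ΔEAO↑ |
|---|---|---|---|---|---|
| 1 | 0.622 | 0.214 | 0.433 | 108 | |
| 2 | 0.623 | 0.210 | 0.448 | 106 | 0.015↑ |
| 3 | 0.631 | 0.210 | 0.447 | 107 | 0.014↑ |
| 4 | 0.616 | 0.228 | 0.440 | 98 | 0.007↑ |
| 5 | 0.637 | 0.182 | 0.470 | 93 | 0.037↑ |
| 6 | 0.630 | 0.154 | 0.485 | 91 | 0.052↑ |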

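Table 7. Results of ablation experiments on the VOT2018 dataset

The four component columns (SiamMask, template update algorithm, tracking feature enhancement module, segmentation feature enhancement module) held check marks that are not preserved here, so rows are numbered in their original order. Row 1 matches the SiamMask baseline and row 6 the proposed method in Table 3.

| Row | Accuracy↑ | Robustness↓ | EAO↑ | Segmentation speed (frames/s)↑ | ΔEAO↑ |
|---|---|---|---|---|---|
| 1 | 0.609 | 0.276 | 0.380 | 107 | |
| 2 | 0.601 | 0.239 | 0.394 | 105 | 0.014↑ |
| 3 | 0.607 | 0.267 | 0.395 | 107 | 0.015↑ |
| 4 | 0.606 | 0.276 | 0.403 | 98 | 0.023↑ |
| 5 | 0.612 | 0.234 | 0.420 | 93 | 0.040↑ |
| 6 | 0.603 | 0.197 | 0.433 | 91 | 0.053↑ |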

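Table 8. Results of ablation experiments on the VOT2019 dataset

The four component columns (SiamMask, template update algorithm, tracking feature enhancement module, segmentation feature enhancement module) held check marks that are not preserved here, so rows are numbered in their original order. Row 1 matches the SiamMask baseline and row 6 the proposed method in Table 4.

| Row | Accuracy↑ | Robustness↓ | EAO↑ | Segmentation speed (frames/s)↑ | ΔEAO↑ |
|---|---|---|---|---|---|
| 1 | 0.594 | 0.572 | 0.274 | 109 | |
| 2 | 0.596 | 0.511 | 0.285 | 106 | 0.011↑ |
| 3 | 0.606 | 0.507 | 0.280 | 108 | 0.006↑ |
| 4 | 0.606 | 0.492 | 0.286 | 98 | 0.012↑ |
| 5 | 0.611 | 0.477 | 0.287 | 95 | 0.013↑ |
| 6 | 0.601 | 0.416 | 0.299 | 92 | 0.025↑ |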

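Table 9. Results of ablation experiments on the DAVIS2016 dataset

The four component columns (SiamMask, template update algorithm, tracking feature enhancement module, segmentation feature enhancement module) held check marks that are not preserved here, so rows are numbered in their original order.

| Row | mIoU (0.30) | mIoU (0.35) | mIoU (0.40) | mIoU (0.45) | Segmentation speed (frames/s)↑ |
|---|---|---|---|---|---|
| 1 | 0.637 | 0.637 | 0.633 | 0.626 | 79 |
| 2 | 0.670 | 0.675 | 0.674 | 0.673 | 71 |
| 3 | 0.674 | 0.670 | 0.662 | 0.649 | 78 |
| 4 | 0.681 | 0.674 | 0.664 | 0.650 | 73 |
| 5 | 0.675 | 0.677 | 0.677 | 0.673 | 71 |
| 6 | 0.686 | 0.690 | 0.692 | 0.690 | 67 |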

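Table 10. Results of ablation experiments on the DAVIS2017 dataset

The four component columns (SiamMask, template update algorithm, tracking feature enhancement module, segmentation feature enhancement module) held check marks that are not preserved here, so rows are numbered in their original order.

| Row | mIoU (0.30) | mIoU (0.35) | mIoU (0.40) | mIoU (0.45) | Segmentation speed (frames/s)↑ |
|---|---|---|---|---|---|
| 1 | 0.499 | 0.498 | 0.495 | 0.490 | 84 |
| 2 | 0.505 | 0.508 | 0.509 | 0.507 | 75 |
| 3 | 0.525 | 0.525 | 0.522 | 0.517 | 83 |
| 4 | 0.514 | 0.511 | 0.505 | 0.496 | 77 |
| 5 | 0.526 | 0.527 | 0.526 | 0.522 | 75 |
| 6 | 0.525 | 0.529 | 0.530 | 0.528 | 70 |
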
    [1] XIAO H, LIU X. Robust target tracking based on spatio-temporal context learning[J]. Journal of Information Hiding and Multimedia Signal Processing, 2019, 10(1): 212-220.
    [2] BERTINETTO L, VALMADRE J, HENRIQUES J F, et al. Fully-convolutional Siamese networks for object tracking[C]//Proceedings of the European Conference on Computer Vision. Berlin: Springer, 2016: 850-865.
    [3] LI B, YAN J J, WU W, et al. High performance visual tracking with Siamese region proposal network[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2018: 8971-8980.
    [4] ZHU Z, WANG Q, LI B, et al. Distractor-aware Siamese networks for visual object tracking[C]//Proceedings of the European Conference on Computer Vision. Berlin: Springer, 2018: 103-119.
    [5] LI B, WU W, WANG Q, et al. SiamRPN++: evolution of Siamese visual tracking with very deep networks[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2019: 4277-4286.
    [6] HU W M, WANG Q, ZHANG L, et al. SiamMask: a framework for fast online object tracking and segmentation[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2023, 45(3): 3072-3089.
    [7] PARK E, BERG A C. Meta-Tracker: fast and robust online adaptation for visual object trackers[C]//Proceedings of the European Conference on Computer Vision. Berlin: Springer, 2018: 587-604.
    [8] GUO Q, FENG W, ZHOU C, et al. Learning dynamic Siamese network for visual object tracking[C]//Proceedings of the IEEE International Conference on Computer Vision. Piscataway: IEEE Press, 2017: 1781-1789.
    [9] ZHANG L C, GONZALEZ-GARCIA A, VAN DE WEIJER J, et al. Learning the model update for Siamese trackers[C]//Proceedings of the IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE Press, 2019: 4009-4018.
    [10] XU Y D, WANG Z Y, LI Z X, et al. SiamFC++: towards robust and accurate visual tracking with target estimation guidelines[J]. Proceedings of the AAAI Conference on Artificial Intelligence, 2020, 34(7): 12549-12556.
    [11] CHEN Z D, ZHONG B N, LI G R, et al. Siamese box adaptive network for visual tracking[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2020: 6667-6676.
    [12] GUO D Y, WANG J, CUI Y, et al. SiamCAR: Siamese fully convolutional classification and regression for visual tracking[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2020: 6268-6276.
    [13] WANG Z, BOVIK A C, SHEIKH H R, et al. Image quality assessment: from error visibility to structural similarity[J]. IEEE Transactions on Image Processing, 2004, 13(4): 600-612.
    [14] WANG X L, GIRSHICK R, GUPTA A, et al. Non-local neural networks[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2018: 7794-7803.
    [15] WOO S, PARK J, LEE J Y, et al. CBAM: convolutional block attention module[C]//Proceedings of the European Conference on Computer Vision. Berlin: Springer, 2018: 3-19.
    [16] ZHU X Z, HU H, LIN S, et al. Deformable ConvNets V2: more deformable, better results[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2019: 9300-9308.
    [17] NAM H, HAN B. Learning multi-domain convolutional neural networks for visual tracking[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2016: 4293-4302.
    [18] DANELLJAN M, ROBINSON A, KHAN F S, et al. Beyond correlation filters: learning continuous convolution operators for visual tracking[C]//Proceedings of the European Conference on Computer Vision. Berlin: Springer, 2016: 472-488.
    [19] VOIGTLAENDER P, LUITEN J, TORR P H S, et al. Siam R-CNN: visual tracking by re-detection[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2020: 6577-6587.
    [20] SHEN Q H, QIAO L, GUO J Y, et al. Unsupervised learning of accurate Siamese tracking[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2022: 8091-8100.
    [21] WANG G T, LUO C, XIONG Z W, et al. SPM-Tracker: series-parallel matching for real-time visual object tracking[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2019: 3638-3647.
Publication history
  • Received: 2024-01-11
  • Accepted: 2024-01-28
  • Published online: 2024-02-27
  • Issue published: 2026-04-30
