留言板

尊敬的读者、作者、审稿人, 关于本刊的投稿、审稿、编辑和出版的任何问题, 您可以本页添加留言。我们将尽快给您答复。谢谢您的支持!

姓名
邮箱
手机号码
标题
留言内容
验证码

基于二阶注意力的Siamese网络视觉跟踪算法

侯志强 陈茂林 马靖媛 郭凡 余旺盛 马素刚

侯志强,陈茂林,马靖媛,等. 基于二阶注意力的Siamese网络视觉跟踪算法[J]. 北京航空航天大学学报,2024,50(3):739-747 doi: 10.13700/j.bh.1001-5965.2022.0373
引用本文: 侯志强,陈茂林,马靖媛,等. 基于二阶注意力的Siamese网络视觉跟踪算法[J]. 北京航空航天大学学报,2024,50(3):739-747 doi: 10.13700/j.bh.1001-5965.2022.0373
HOU Z Q,CHEN M L,MA J Y,et al. Siamese network visual tracking algorithm based on second-order attention[J]. Journal of Beijing University of Aeronautics and Astronautics,2024,50(3):739-747 (in Chinese) doi: 10.13700/j.bh.1001-5965.2022.0373
Citation: HOU Z Q,CHEN M L,MA J Y,et al. Siamese network visual tracking algorithm based on second-order attention[J]. Journal of Beijing University of Aeronautics and Astronautics,2024,50(3):739-747 (in Chinese) doi: 10.13700/j.bh.1001-5965.2022.0373

基于二阶注意力的Siamese网络视觉跟踪算法

doi: 10.13700/j.bh.1001-5965.2022.0373
基金项目: 国家自然科学基金(62072370)
详细信息
    通讯作者:

    E-mail:hzq@xupt.edu.cn

  • 中图分类号: TP391.4

Siamese network visual tracking algorithm based on second-order attention

Funds: National Natural Science Foundation of China (62072370)
More Information
  • 摘要:

    为提升基于Siamese网络视觉跟踪算法的特征表达能力和判别能力,以获得更好的跟踪性能,提出了一种轻量级的基于二阶注意力的Siamese网络视觉跟踪算法。使用轻量级VGG-Net作为Siamese网络的主干,获取目标的深度特征;在Siamese网络的末端并行使用所提残差二阶池化网络和二阶空间注意力网络,获取具有通道相关性的二阶注意力特征和具有空间相关性的二阶注意力特征;使用残差二阶通道注意力特征和二阶空间注意力特征,通过双分支响应策略实现视觉跟踪。利用GOT-10k数据集对所提算法进行端到端的训练,并在OTB100和VOT2018数据集上进行验证。实验结果表明:所提算法的跟踪性能取得了显著提升,与基准算法SiamFC相比,在OTB100数据集上,精度和成功率分别提高了0.100和0.096,在VOT2018数据集上,预期平均重叠率(EAO)提高了0.077,跟踪速度达到了48帧/s。

     

  • 图 1  本文算法总体框架

    Figure 1.  Overall framework of proposed algorithm

    图 2  残差二阶池化网络

    Figure 2.  Residual second-order pooling network

    图 3  二阶池化流程

    Figure 3.  Second-order pooling process

    图 4  二阶空间注意力网络

    Figure 4.  Second-order spatial attention network

    图 5  部分视频序列跟踪结果

    Figure 5.  Tracking results of partial video sequence

    图 6  OTB100数据集的定量对比结果

    Figure 6.  Quantitative comparison results of OTB100 dataset

    表  1  归一化参数对跟踪性能的影响

    Table  1.   Influence of normalization parameters on tracking performance

    C 精度 成功率
    0.5 0.806 0.597
    0.6 0.815 0.606
    0.7 0.818 0.609
    0.8 0.825 0.614
    0.9 0.821 0.612
    1.0 0.831 0.621
    2.0 0.843 0.629
    3.0 0.844 0.626
    4.0 0.830 0.623
    5.0 0.849 0.633
    6.0 0.845 0.638
    7.0 0.836 0.626
    8.0 0.846 0.633
    9.0 0.824 0.617
    10.0 0.830 0.621
    下载: 导出CSV

    表  2  细化归一化参数平衡跟踪性能

    Table  2.   Refinement of normalization parameters to balance tracking performance

    C 精度 成功率
    5.0 0.849 0.633
    5.1 0.843 0.629
    5.2 0.832 0.629
    5.3 0.839 0.630
    5.4 0.842 0.629
    5.5 0.848 0.638
    5.6 0.836 0.625
    5.7 0.836 0.622
    5.8 0.829 0.628
    5.9 0.844 0.626
    6.0 0.845 0.638
    下载: 导出CSV

    表  3  消融实验结果

    Table  3.   Ablation experiment results

    SiamFC VGG-Net SoA ResSoP DBR 精度 成功率 跟踪速度/(帧·s−1)
    0.777 0.580 58
    0.828 0.622 55
    0.845 0.638 53
    0.864 0.649 50
    0.877 0.676 48
    下载: 导出CSV

    表  4  不同属性下算法的跟踪成功率对比结果

    Table  4.   Comparison results of tracking success rate of algorithms under different attributes

    算法 光照变化 平面外旋转 尺度变化 离开视野 目标形变 低分辨率 快速运动 目标遮挡 相似背景 平面内旋转 运动模糊
    本文算法 0.681 0.658 0.668 0.607 0.640 0.698 0.646 0.634 0.640 0.649 0.674
    SiamSE[26] 0.670 0.635 0.678 0.613 0.617 0.697 0.663 0.637 0.620 0.651 0.651
    ATOM[25] 0.679 0.643 0.681 0.612 0.630 0.693 0.662 0.648 0.631 0.650 0.658
    SiamDW-FC[18] 0.627 0.617 0.618 0.595 0.562 0.616 0.632 0.606 0.582 0.613 0.655
    TADT[24] 0.674 0.643 0.650 0.623 0.602 0.644 0.655 0.638 0.619 0.618 0.668
    GradNet[23] 0.643 0.628 0.614 0.583 0.572 0.669 0.624 0.616 0.611 0.627 0.646
    DaSiamRPN[8] 0.655 0.644 0.637 0.537 0.645 0.636 0.621 0.611 0.642 0.652 0.625
    SASiam[22] 0.644 0.642 0.642 0.611 0.590 0.692 0.636 0.629 0.634 0.626 0.651
    SiamRPN[7] 0.652 0.628 0.621 0.548 0.619 0.663 0.603 0.589 0.598 0.632 0.624
    SiamFC[6] 0.536 0.550 0.560 0.473 0.546 0.574 0.575 0.532 0.525 0.567 0.587
    下载: 导出CSV

    表  5  VOT2018数据集实验对比结果

    Table  5.   VOT2018 dataset experimental comparison results

    算法 EAO 准确率 鲁棒性
    本文算法 0.265 0.534 0.389
    SiamSE[26] 0.270 0.538 0.432
    GradNet[23] 0.247 0.510 0.390
    SiamDW-FC[18] 0.230 0.500 0.490
    SiamRPN[7] 0.243 0.560 0.340
    SASiam[22] 0.236 0.500 0.459
    SiamFC-tri[29] 0.212 0.483 0.526
    DSiam[28] 0.196 0.512 0.646
    DCFNet[27] 0.182 0.470 0.543
    SiamFC[6] 0.188 0.500 0.590
    下载: 导出CSV
  • [1] MARVASTI-ZADEH S M, CHENG L, GHANEI-YAKHDAN H. Deep learning for visual tracking: A comprehensive survey[J]. IEEE Transactions on Intelligent Transportation Systems, 2022, 23(5): 3943-3968. doi: 10.1109/TITS.2020.3046478
    [2] 柏罗, 张宏立, 王聪. 基于高效注意力和上下文感知的目标跟踪算法[J]. 北京航空航天大学学报, 2022, 48(7): 1222-1232.

    BAI L, ZHANG H L, WANG C. Target tracking algorithm based on efficient attention and context awareness[J]. Journal of Beijing University of Aeronautics and Astronautics, 2022, 48(7): 1222-1232(in Chinese).
    [3] 李玺, 查宇飞, 张天柱, 等. 深度学习的视觉跟踪算法综述[J]. 中国图象图形学报, 2019, 24(12): 2057-2080. doi: 10.11834/jig.190372

    LI X, ZHA Y F, ZHANG T Z, et al. A survey of visual object tracking algorithms based on deep learning[J]. Journal of Image and Graphics, 2019, 24(12): 2057-2080(in Chinese). doi: 10.11834/jig.190372
    [4] 蒲磊, 李海龙, 侯志强, 等. 基于高层语义嵌入的孪生网络跟踪算法[J]. 北京航空航天大学学报, 2023, 49(4): 792-803.

    PU L, LI H L, HOU Z Q, et al. Siamese network tracking based on high level semantic embedding[J]. Journal of Beijing University of Aeronautics and Astronautics, 2023, 49(4): 792-803(in Chinese).
    [5] 张成煜, 侯志强, 蒲磊, 等. 基于在线学习的Siamese网络视觉跟踪算法[J]. 光电工程, 2021, 48(4): 4-14.

    ZHANG C Y, HOU Z Q, PU L, et al. Siamese network visual tracking algorithm based on online learning[J]. Opto-Electronic Engineering, 2021, 48(4): 4-14(in Chinese).
    [6] BERTINETTO L, VALMADRE J, HENRIQUES J F. Fully-convolutional Siamese networks for object tracking[C]//Proceedings of the European Conference on Computer Vision. Berlin: Springer, 2016: 850-865.
    [7] LI B, YAN J, WU W. High performance visual tracking with Siamese region proposal network[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2018: 8971-8980.
    [8] ZHU Z, WANG Q, LI B. Distractor-aware Siamese networks for visual object tracking[C]//Proceedings of the European Conference on Computer Vision. Berlin: Springer, 2018: 101-117.
    [9] KRIZHEVSKY A, SUTSKEVER I, HINTON G E. ImageNet classification with deep convolutional neural networks[J]. Communications of the ACM, 2017, 60(6): 84-90. doi: 10.1145/3065386
    [10] WANG Q, ZHANG L, WU B. What deep CNNs benefit from global covariance pooling: An optimization perspective[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2020: 10771-10780.
    [11] WANG Q, XIE J, ZUO W. Deep CNNs meet global covariance pooling: Better representation and generalization[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2021, 43(8): 2582-2597.
    [12] GAO Z, WANG Q, ZHANG B, et al. Temporal-attentive covariance pooling networks for video recognition[C]//Proceedings of the Advances in Neural Information Processing Systems. [S. l. ]: NeurIPS, 2021.
    [13] 蒲磊, 冯新喜, 侯志强, 等. 基于二阶池化网络的鲁棒视觉跟踪算法[J]. 电子学报, 2020, 48(8): 1472-1478.

    PU L, FENG X X, HOU Z Q, et al. Robust visual tracking based on second order pooling network[J]. Acta Electronica Sinica, 2020, 48(8): 1472-1478(in Chinese).
    [14] LI Y, ZHANG X, CHEN D M. SiamVGG: Visual tracking using deeper Siamese networks[EB/OL]. (2019-02-07)[2022-05-01]. https://arxiv.org/abs/1902.02804.
    [15] GAO Z, XIE J, WANG Q. Global second-order pooling convolutional networks[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2019: 3024-3033.
    [16] NG T, BALNTAS V, TIAN Y. SOLAR: Second-order loss and attention for image retrieval[C]//Proceedings of the European Conference on Computer Vision. Berlin: Springer, 2020: 253-270.
    [17] SIMONYAN K, ZISSERMAN A. Very deep convolutional networks for large-scale image recognition[EB/OL]. (2015-04-10)[2022-05-01]. https://arxiv.org/abs/1409.1556.
    [18] ZHANG Z, PENG H. Deeper and wider Siamese networks for real-time visual tracking[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2019: 4591-4600.
    [19] HUANG L, ZHAO X, HUANG K. GOT-10k: A large high-diversity benchmark for generic object tracking in the wild[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2021, 43(5): 1562-1577.
    [20] WU Y, LIM J, YANG M H. Object tracking benchmark[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2015, 37(9): 1834-1848.
    [21] KRISTAN M, LEONARDIS A, MATAS J. The sixth visual object tracking VOT2018 challenge results[C]//Proceedings of the European Conference on Computer Vision. Berlin: Springer, 2018: 3-53.
    [22] HE A, LUO C, TIAN X. A twofold Siamese network for real-time object tracking[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2018: 4834-4843.
    [23] LI P, CHEN B, OUYANG W, et al. GradNet: Gradient-guided network for visual object tracking[C]//Proceedings of the IEEE International Conference on Computer Vision. Piscataway: IEEE Press, 2019: 6162-6171.
    [24] LI X, MA C, WU B. Target-aware deep tracking[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2019: 1369-1378.
    [25] DANELLJAN M, BHAT G, KHAN F S. ATOM: Accurate tracking by overlap maximization[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2019: 4660-4669.
    [26] SOSNOVIK I, MOSKALEV A, SMEULDERSA W M. Scale equivariance improves Siamese tracking[C]//Proceedings of the IEEE Winter Conference on Applications of Computer Vision. Piscataway: IEEE Press, 2021: 2765-2774.
    [27] WANG Q, GAO J, XING J. DCFNet: Discriminant correlation filters network for visual tracking[EB/OL]. (2017-04-13)[2022-05-01]. https://arxiv.org/abs/1704.04057.
    [28] GUO Q, FENG W, ZHOU C. Learning dynamic Siamese network for visual object tracking[C]//Proceedings of the IEEE International Conference on Computer Vision. Piscataway: IEEE Press, 2017: 1763-1771.
    [29] DONG X, SHEN J. Triplet loss in Siamese network for object tracking[C]//Proceedings of the European Conference on Computer Vision. Berlin: Springer, 2018: 459-474.
  • 加载中
图(6) / 表(5)
计量
  • 文章访问数:  173
  • HTML全文浏览量:  82
  • PDF下载量:  27
  • 被引次数: 0
出版历程
  • 收稿日期:  2022-05-17
  • 录用日期:  2022-06-27
  • 网络出版日期:  2022-09-16
  • 整期出版日期:  2024-03-27

目录

    /

    返回文章
    返回
    常见问答