Target tracking algorithm based on efficient attention and context awareness

BAI Luo, ZHANG Hongli, WANG Cong

Citation: BAI Luo, ZHANG Hongli, WANG Cong, et al. Target tracking algorithm based on efficient attention and context awareness[J]. Journal of Beijing University of Aeronautics and Astronautics, 2022, 48(7): 1222-1232. doi: 10.13700/j.bh.1001-5965.2021.0013 (in Chinese)

doi: 10.13700/j.bh.1001-5965.2021.0013

    Corresponding author: ZHANG Hongli, E-mail: 1606829274@qq.com

  • CLC number: TP391.4

Target tracking algorithm based on efficient attention and context awareness

Funds: 

National Natural Science Foundation of China 51767022

National Natural Science Foundation of China 51967019

  • Abstract:

    Siamese-network trackers based on template matching lack holistic perception of the target, so they tend to estimate the target state imprecisely and to lose the target in complex environments. To address this, two lightweight modules are designed on top of the Siamese network to achieve more accurate and more robust tracking. An efficient channel attention module is embedded after the feature-extraction backbone to extract target features efficiently and enhance their discriminative representation, making the network focus on target information. The features produced by template matching then pass through a local context awareness module, which strengthens the network's holistic perception of the target to cope with the complex and changing environments encountered during tracking. Finally, an anchor-free state-estimation strategy is adopted to estimate the target state precisely. Experimental results show that the proposed algorithm, SiamCC, outperforms DaSiamRPN, ATOM, and other algorithms on the OTB100, VOT2016, and VOT2018 datasets, while running at 85 frame/s.
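    The efficient channel attention step summarized above can be illustrated with a minimal NumPy sketch. This is an assumption-laden illustration, not the paper's implementation: each channel is globally average-pooled, a small 1D convolution across neighboring channels produces per-channel weights (the real module learns this kernel; a fixed averaging kernel is used here purely for illustration), and a sigmoid rescales each channel of the feature map.

    ```python
    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def eca_attention(features, k=3):
        """ECA-style channel attention over a (C, H, W) feature map.

        k is the 1D kernel size controlling how many neighboring
        channels interact; the kernel here is a fixed average purely
        for illustration (it is learned in a real ECA module).
        """
        C, H, W = features.shape
        # Global average pooling: one scalar descriptor per channel -> (C,)
        pooled = features.mean(axis=(1, 2))
        # 1D convolution across neighboring channels with "same" padding
        pad = k // 2
        padded = np.pad(pooled, pad, mode="edge")
        kernel = np.full(k, 1.0 / k)  # illustrative fixed kernel
        conv = np.array([np.dot(padded[i:i + k], kernel) for i in range(C)])
        # Sigmoid gate in (0, 1), broadcast back over the spatial dims
        weights = sigmoid(conv)
        return features * weights[:, None, None]
    ```

    Because the gate stays in (0, 1), the module reweights channels without changing the feature map's shape, which is what keeps this kind of attention cheap enough for real-time tracking.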

     

  • Figure 1.  Framework of the target tracking algorithm using a Siamese network based on efficient attention and context awareness

    Figure 2.  Efficient channel attention module

    Figure 3.  Local context awareness module

    Figure 4.  Test results of different algorithms on the Gym video

    Figure 5.  Example of the allocation of positive and negative sample regions

    Figure 6.  Comparison of precision and success rates of different algorithms on the OTB100 dataset

    Figure 7.  Comparison of precision of different algorithms under 11 types of challenges

    Figure 8.  Comparison of expected average overlap (EAO) of different algorithms on the VOT2016 dataset

    Figure 9.  Comparison of EAO of different algorithms on the VOT2018 dataset

    Figure 10.  Comparison of performance and speed of different algorithms on the VOT2018 dataset

    Figure 11.  Tracking results of various algorithms on different videos

    Table 1.  Comparison of test results of different algorithms on the VOT2016 dataset

    Algorithm        Accuracy   Robustness   EAO
    SiamCC (ours)    0.61       0.16         0.448
    SPM              0.62       0.21         0.434
    DaSiamRPN        0.61       0.22         0.411
    ECO              0.55       0.20         0.375
    C-RPN            0.59       0.27         0.363
    SiamRPN          0.56       0.26         0.344
    STRCF            0.55                    0.313
    SASiam           0.54       0.34         0.291
    MDNet            0.51       0.25         0.283
    SiamFC           0.53       0.46         0.235

    Table 2.  Comparison of test results of different algorithms on the VOT2018 dataset

    Algorithm        Accuracy   Robustness   EAO
    SiamCC (ours)    0.58       0.19         0.405
    SiamRPN++        0.60       0.23         0.414
    ATOM             0.59       0.20         0.401
    SiamMask         0.61       0.28         0.380
    SPM              0.58       0.30         0.338
    DaSiamRPN        0.56       0.34         0.326
    ECO              0.48       0.27         0.280
    GradNet          0.51       0.38         0.247
    SiamRPN          0.49       0.46         0.244
    SASiam           0.50       0.46         0.236

    Table 3.  Ablation study of the proposed algorithm on the OTB2013 and VOT2016 datasets

    Modules           OTB2013 success rate   VOT2016 EAO   Speed/(frame/s)
    (none)            0.661                  0.417         93
    ECANet            0.665                  0.420         91
    CCNet             0.667                  0.422         88
    ECAM              0.669                  0.426         91
    LCAM              0.676                  0.431         88
    ECANet + CCNet    0.672                  0.429         85
    ECAM + LCAM       0.684                  0.448         85
Publication history
  • Received: 2021-01-11
  • Accepted: 2021-01-22
  • Published: 2021-03-10
