Volume 50 Issue 8
Aug.  2024
Turn off MathJax
Article Contents
HOU Z Q,MA J Y,HAN R X,et al. A fast long-term visual tracking algorithm based on deep learning[J]. Journal of Beijing University of Aeronautics and Astronautics,2024,50(8):2391-2403 (in Chinese) doi: 10.13700/j.bh.1001-5965.2022.0645
Citation: HOU Z Q,MA J Y,HAN R X,et al. A fast long-term visual tracking algorithm based on deep learning[J]. Journal of Beijing University of Aeronautics and Astronautics,2024,50(8):2391-2403 (in Chinese) doi: 10.13700/j.bh.1001-5965.2022.0645

A fast long-term visual tracking algorithm based on deep learning

doi: 10.13700/j.bh.1001-5965.2022.0645
Funds:  National Natural Science Foundation of China (62072370)
More Information
  • Corresponding author: E-mail:hzq@xupt.edu.cn
  • Received Date: 27 Jul 2022
  • Accepted Date: 26 Nov 2022
  • Publish Date: 10 Jan 2023
  • Current deep learning-based visual tracking algorithms have difficulty tracking the target accurately in real-time in complex long-term monitoring environments including target size change, occlusion, and out-of-view. To solve this problem, a fast long-term visual tracking algorithm is proposed, which consists of a fast short-term tracking algorithm and a fast global re-detection module. First, as a short-term tracking algorithm, the attention module of second-order channel and region spatial fusion is added to the base algorithm SiamRPN. Then, in order to make the improved short-term tracking algorithm have a fast long-term tracking ability, the global re-detection module based on template matching proposed in this paper is added to the algorithm, which uses a lightweight network and fast similarity judgment method to speed up the re-detection rate. The proposed algorithm is tested on five datasets (OTB100, LaSOT, UAV20L, VOT2018-LT, and VOT2020-LT). With an average tracking speed of 104 frames per second, the experimental findings demonstrate the algorithm's outstanding long-term tracking performance.

     

  • loading
  • [1]
    李玺, 查宇飞, 张天柱, 等. 深度学习的目标跟踪算法综述[J]. 中国图象图形学报, 2019, 24(12): 2057-2080. doi: 10.11834/jig.190372

    LI X, ZHA Y F, ZHANG T Z, et al. Survey of visual object tracking algorithms based on deep learning[J]. Journal of Image and Graphics, 2019, 24(12): 2057-2080(in Chinese). doi: 10.11834/jig.190372
    [2]
    刘芳, 孙亚楠, 王洪娟, 等. 基于残差学习的自适应无人机目标跟踪算法[J]. 北京航空航天大学学报, 2020, 46(10): 1874-1882.

    LIU F, SUN Y N, WANG H J, et al. Adaptive UAV target tracking algorithm based on residual learning[J]. Journal of Beijing University of Aeronautics and Astronautics, 2020, 46(10): 1874-1882(in Chinese).
    [3]
    张诚, 马华东, 傅慧源. 基于时空关联图模型的视频监控目标跟踪[J]. 北京航空航天大学学报, 2015, 41(4): 713-720.

    ZHANG C, MA H D, FU H Y. Object tracking in surveillance videos using spatial-temporal correlation graph model[J]. Journal of Beijing University of Aeronautics and Astronautics, 2015, 41(4): 713-720(in Chinese).
    [4]
    LUKEŽIČ A, ZAJC L Č, VOJÍŘ T, et al. FuCoLoT–A fully-correlational long-term tracker[C]//Proceedings of the Asian Conference on Computer Vision. Berlin: Springer, 2018: 595-611.
    [5]
    王鑫, 侯志强, 余旺盛, 等. 基于深度稀疏学习的鲁棒视觉跟踪[J]. 北京航空航天大学学报, 2017, 43(12): 2554-2563.

    WANG X, HOU Z Q, YU W S, et al. Robust visual tracking based on deep sparse learning[J]. Journal of Beijing University of Aeronautics and Astronautics, 2017, 43(12): 2554-2563(in Chinese).
    [6]
    蒲磊, 冯新喜, 侯志强, 等. 基于级联注意力机制的孪生网络视觉跟踪算法[J]. 北京航空航天大学学报, 2020, 46(12): 2302-2310.

    PU L, FENG X X, HOU Z Q, et al. Siamese network visual tracking algorithm based on cascaded attention mechanism[J]. Journal of Beijing University of Aeronautics and Astronautics, 2020, 46(12): 2302-2310(in Chinese).
    [7]
    LI B, YAN J, WU W, et al. High performance visual tracking with Siamese region proposal network[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2018: 8971-8980.
    [8]
    LI B, WU W, WANG Q, et al. SiamRPN++: Evolution of Siamese visual tracking with very deep networks[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2019: 4277-4286.
    [9]
    DANELLJAN M, BHAT G, KHAN F S, et al. ATOM: Accurate tracking by overlap maximization[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2019: 4655-4664.
    [10]
    CHENG S Y, ZHONG B N, LI G R, et al. Learning to filter: Siamese relation network for robust tracking[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2021: 4419-4429.
    [11]
    ZHU Z, WANG Q, LI B, et al. Distractor-aware Siamese networks for visual object tracking[C]//Proceedings of the European Conference on Computer Vision. Berlin: Springer, 2018: 103-119.
    [12]
    ZHANG Y H, WANG D, WANG L J, et al. Learning regression and verification networks for long-term visual tracking[EB/OL]. (2018-11-19)[2022-07-01]. http://arxiv.org/abs/1809.04320.
    [13]
    YAN B, ZHAO H J, WANG D, et al. ‘Skimming-Perusal’ tracking: A framework for real-time and robust long-term tracking[C]//Proceedings of the IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE Press, 2019: 2385-2393.
    [14]
    VOIGTLAENDER P, LUITEN J, TORR P H S, et al. SiamR-CNN: Visual tracking by re-detection[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2020: 6577-6587.
    [15]
    FAN H, LIN L T, YANG F, et al. LaSOT: A high-quality benchmark for large-scale single object tracking[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2019: 5369-5378.
    [16]
    MUELLER M, SMITH N, GHANEM B. A benchmark and simulator for UAV tracking[C]//Proceedings of the European Conference on Computer Vision. Berlin: Springer, 2016: 445-461.
    [17]
    LUKEŽIČ A, ZAJC L Č, VOJÍŘ T, et al. Now you see me: Evaluating performance in long-term visual tracking[EB/OL]. (2018-04-19)[2022-07-01]. http://arxiv.org/abs/1804.07056.
    [18]
    KRISTAN M, LEONARDIS A, MATAS J, et al. The eighth visual object tracking VOT2020 challenge results[C]//Proceedings of the European Conference on Computer Vision. Berlin: Springer, 2020: 547-601.
    [19]
    WU Y, LIM J, YANG M H. Online object tracking: A benchmark[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2013: 2411-2418.
    [20]
    KRISTAN M, LEONARDIS A, MATAS J, et al. The sixth visual object tracking VOT2018 challenge results[C]//Proceedings of the European Conference on Computer Vision Workshops. Berlin: Springer, 2018.
    [21]
    KALAL Z, MIKOLAJCZYK K, MATAS J. Tracking-learning-detection[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2012, 34(7): 1409-1422. doi: 10.1109/TPAMI.2011.239
    [22]
    ZHU G, PORIKLI F, LI H D. Beyond local search: Tracking objects everywhere with instance-specific proposals[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2016: 943-951.
    [23]
    KRIZHEVSKY A, SUTSKEVER I, HINTON G E. ImageNet classification with deep convolutional neural networks[J]. Communications of the ACM, 2017, 60(6): 84-90.
    [24]
    WOO S, PARK J, LEE J Y, et al. CBAM: Convolutional block attention module[C]//Proceedings of the European Conference on Computer Vision. Berlin: Springer, 2018: 3-19.
    [25]
    HE K M, ZHANG X Y, REN S Q, et al. Deep residual learning for image recognition[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2016: 770-778.
    [26]
    HU J, SHEN L, SUN G. Squeeze-and-excitation networks[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2018: 7132-7141.
    [27]
    LI P H, XIE J T, WANG Q L, et al. Is second-order information helpful for large-scale visual recognition?[C]//Proceedings of the IEEE International Conference on Computer Vision. Piscataway: IEEE Press, 2017: 2089-2097.
    [28]
    LIN T Y, ROYCHOWDHURY A, MAJI S. Bilinear CNN models for fine-grained visual recognition[C]//Proceedings of the IEEE International Conference on Computer Vision. Piscataway: IEEE Press, 2015: 1449-1457.
    [29]
    JADERBERG M, SIMONYAN K, ZISSERMAN A, et al. Spatial transformer networks[EB/OL]. (2016-02-04) [2022-07-01]. http://arxiv.org/abs/1506.02025.
    [30]
    WANG X L, GIRSHICK R, GUPTA A, et al. Non-local neural networks[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2018: 7794-7803.
    [31]
    CHENG J, WU Y, ABDALMAGEED W, et al. QATM: Quality-aware template matching for deep learning[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2019: 11553-11562.
    [32]
    HAN K, WANG Y H, TIAN Q, et al. GhostNet: More features from cheap operations[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2020: 1577-1586.
    [33]
    TAN M X, LE Q V. EfficientNet: Rethinking model scaling for convolutional neural networks[EB/OL]. (2020-09-11)[2022-07-01]. http://arxiv.org/abs/1905.11946.
    [34]
    DING X H, ZHANG X Y, MA N N, et al. RepVGG: Making VGG-style ConvNets great again[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2021: 13728-13737.
    [35]
    SONG J G. UFO-ViT: High performance linear vision transformer without softmax[EB/OL]. (2020-09-11) [2022-07-01]. http://arxiv.org/abs/2109.14382.
    [36]
    IANDOLA F N, HAN S, MOSKEWICZ M W, et al. SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5 MB model size[EB/OL]. (2016-11-04) [2022-07-01]. http://arxiv.org/abs/1602.07360.
    [37]
    RUSSAKOVSKY O, DENG J, SU H, et al. ImageNet large scale visual recognition challenge[J]. International Journal of Computer Vision, 2015, 115(3): 211-252. doi: 10.1007/s11263-015-0816-y
    [38]
    LIN T Y, MAIRE M, BELONGIE S, et al. Microsoft COCO: Common objects in context[C]//Proceedings of the European Conference on Computer Vision. Berlin: Springer, 2014: 740-755.
    [39]
    REAL E, SHLENS J, MAZZOCCHI S, et al. YouTube-BoundingBoxes: A large high-precision human-annotated data set for object detection in video[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2017: 7464-7473.
    [40]
    BERTINETTO L, VALMADRE J, HENRIQUES J F, et al. Fully-convolutional Siamese networks for object tracking[C]//Proceedings of the European Conference on Computer Vision. Berlin: Springer, 2016: 850-865.
    [41]
    LI X, MA C, WU B Y, et al. Target-aware deep tracking[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2019: 1369-1378.
    [42]
    DANELLJAN M, HÄGER G, KHAN F S, et al. Learning spatially regularized correlation filters for visual tracking[C]//Proceedings of the IEEE International Conference on Computer Vision. Piscataway: IEEE Press, 2015: 4310-4318.
    [43]
    GALOOGAHI H K, FAGG A, LUCEY S. Learning background-aware correlation filters for visual tracking[C]//Proceedings of the IEEE International Conference on Computer Vision. Piscataway: IEEE Press, 2017: 1144-1152.
    [44]
    BERTINETTO L, VALMADRE J, GOLODETZ S, et al. Staple: Complementary learners for real-time tracking[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2016: 1401-1409.
    [45]
    LUKEŽIČ A, MATAS J, KRISTAN M. D3S–A discriminative single shot segmentation tracker[C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2020: 7131-7140.
    [46]
    DONG X P, SHEN J B, SHAO L, et al. CLNet: A compact latent network for fast adjusting Siamese trackers[C]//Proceedings of the European Conference on Computer Vision. Berlin: Springer, 2020: 378-395.
    [47]
    YANG T Y, XU P F, HU R B, et al. ROAM: Recurrently optimizing tracking model[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2020: 6717-6726.
    [48]
    DAI K N, WANG D, LU H C, et al. Visual tracking via adaptive spatially-regularized correlation filters[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2019: 4665-4674.
    [49]
    CAO Z A, FU C H, YE J J, et al. HiFT: Hierarchical feature transformer for aerial tracking[C]//Proceedings of the IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE Press, 2021: 15457-15466.
    [50]
    FU Z H, LIU Q J, FU Z H, et al. STMTrack: Template-free visual tracking with space-time memory networks[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2021: 13769-13778.
    [51]
    BHAT G, DANELLJAN M, VAN GOOL L, et al. Learning discriminative model prediction for tracking[C]//Proceedings of the IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE Press, 2019: 6181-6190.
    [52]
    WANG Q, ZHANG L, BERTINETTO L, et al. Fast online object tracking and segmentation: A unifying approach[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2019: 1328-1338.
    [53]
    LI Y H, ZHANG X F, CHEN D M. SiamVGG: Visual tracking using deeper Siamese networks[EB/OL]. (2022-06-04)[2022-07-01]. http://arxiv.org/abs/1902.02804.
    [54]
    CHOI S, LEE J, LEE Y, et al. Robust long-term object tracking via improved discriminative model prediction[C]//Proceedings of the European Conference on Computer Vision. Berlin: Springer, 2020: 602-617.
    [55]
    DAI K N, ZHANG Y H, WANG D, et al. High-performance long-term tracking with meta-updater[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2020: 6297-6306.
  • 加载中

Catalog

    通讯作者: 陈斌, bchen63@163.com
    • 1. 

      沈阳化工大学材料科学与工程学院 沈阳 110142

    1. 本站搜索
    2. 百度学术搜索
    3. 万方数据库搜索
    4. CNKI搜索

    Figures(10)  / Tables(10)

    Article Metrics

    Article views(124) PDF downloads(43) Cited by()
    Proportional views
    Related

    /

    DownLoad:  Full-Size Img  PowerPoint
    Return
    Return