Siamese network visual tracking algorithm based on second-order attention

HOU Zhiqiang; CHEN Maolin; MA Jingyuan; GUO Fan; YU Wangsheng; MA Sugang

doi:10.13700/j.bh.1001-5965.2022.0373

Volume 50 Issue 3

Mar. 2024


Journal of Beijing University of Aeronautics and Astronautics > 2024 > 50(3): 739-747.

Citation: HOU Z Q, CHEN M L, MA J Y, et al. Siamese network visual tracking algorithm based on second-order attention[J]. Journal of Beijing University of Aeronautics and Astronautics, 2024, 50(3): 739-747 (in Chinese). doi: 10.13700/j.bh.1001-5965.2022.0373

HOU Zhiqiang^{1, 2},
CHEN Maolin^{1, 2},
MA Jingyuan^{1, 2},
GUO Fan^{1, 2},
YU Wangsheng^{3},
MA Sugang^{1, 2}

1. School of Computer Science and Technology, Xi’an University of Posts and Telecommunications, Xi’an 710121, China
2. Shaanxi Key Laboratory of Network Data Analysis and Intelligent Processing, Xi’an University of Posts and Telecommunications, Xi’an 710121, China
3. School of Information and Navigation, Air Force Engineering University, Xi’an 710077, China

Funds: National Natural Science Foundation of China (62072370)

Corresponding author: E-mail: hzq@xupt.edu.cn
Received Date: 17 May 2022
Accepted Date: 27 Jun 2022

Available Online: 23 Sep 2022

Publish Date: 16 Sep 2022

Abstract

To improve the feature representation and discriminative ability of Siamese-network-based visual tracking algorithms and obtain better tracking performance, a lightweight Siamese network visual tracking algorithm based on second-order attention is proposed. First, a lightweight VGG-Net is used as the backbone of the Siamese network to extract deep features of the object. Second, a residual second-order pooling network and a second-order spatial attention network are applied in parallel at the end of the Siamese network to obtain second-order attention features with channel correlation and second-order attention features with spatial correlation, respectively. Finally, visual tracking is achieved through a double branch response strategy that uses the residual second-order channel attention features and the second-order spatial attention features. The proposed algorithm is trained end-to-end on the GOT-10k dataset and validated on the OTB100 and VOT2018 datasets. The experimental results show that the tracking performance of the proposed algorithm is significantly improved. Compared with the baseline algorithm SiamFC, precision and success on OTB100 increase by 0.100 and 0.096, respectively; on VOT2018, the expected average overlap (EAO) increases by 0.077, and the tracking speed reaches 48 frames/s.
Keywords:
- Siamese network
- visual tracking
- residual second-order pooling network
- second-order spatial attention network
- double branch response strategy
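
The abstract does not give the layer details, so the following is only a minimal NumPy sketch of the general idea, not the authors' network. It assumes that channel attention weights can be derived from the channel covariance matrix (second-order pooling) and applied with a residual connection, that spatial attention can be derived from pairwise similarities between spatial positions, and that the two branch responses are fused by a weighted sum. The learned layers are replaced by fixed functions (row mean plus sigmoid, scaled softmax), and all function names and the fusion weight `alpha` are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention_second_order(feat):
    """Residual channel re-weighting from the channel covariance. feat: (C, H, W)."""
    c, h, w = feat.shape
    x = feat.reshape(c, h * w)
    xc = x - x.mean(axis=1, keepdims=True)
    cov = xc @ xc.T / (h * w)                 # (C, C) second-order channel statistics
    # Assumption: row mean + sigmoid stands in for the learned weighting layers.
    weights = sigmoid(cov.mean(axis=1))       # (C,)
    return feat + feat * weights[:, None, None]  # residual connection

def spatial_attention_second_order(feat):
    """Re-weight positions using pairwise position similarities. feat: (C, H, W)."""
    c, h, w = feat.shape
    x = feat.reshape(c, h * w)
    sim = x.T @ x / np.sqrt(c)                # (HW, HW) second-order spatial statistics
    sim -= sim.max(axis=1, keepdims=True)     # stabilise the softmax
    attn = np.exp(sim)
    attn /= attn.sum(axis=1, keepdims=True)   # each row sums to 1
    out = x @ attn.T                          # position i = weighted sum over all positions
    return out.reshape(c, h, w)

def cross_correlation(template, search):
    """SiamFC-style response: slide the template over the search features."""
    _, th, tw = template.shape
    _, sh, sw = search.shape
    out = np.empty((sh - th + 1, sw - tw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(template * search[:, i:i + th, j:j + tw])
    return out

def dual_branch_response(template, search, alpha=0.5):
    """Fuse the channel-branch and spatial-branch responses (alpha is hypothetical)."""
    r_channel = cross_correlation(channel_attention_second_order(template),
                                  channel_attention_second_order(search))
    r_spatial = cross_correlation(spatial_attention_second_order(template),
                                  spatial_attention_second_order(search))
    return alpha * r_channel + (1.0 - alpha) * r_spatial

# Random stand-ins for backbone features of the template and search regions.
rng = np.random.default_rng(0)
z = rng.standard_normal((8, 6, 6))
x = rng.standard_normal((8, 22, 22))
response = dual_branch_response(z, x)
print(response.shape)                                   # (17, 17)
print(np.unravel_index(response.argmax(), response.shape))  # peak = predicted offset
```

In a tracker of this kind, the peak of the fused response map indicates the most likely object location in the search region; how the paper actually weights the two branches is not stated in the abstract.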

