Abstract:
Tracking methods based on Siamese networks train the tracking model offline and require no online model update, balancing tracking accuracy and speed. Existing Siamese network trackers use a fixed threshold to select positive and negative training samples, which easily leaves qualified training samples unselected; in addition, the low correlation between the classification branch and the regression branch during training hinders learning a high-precision tracking model. To address this, an object tracking method based on an intersection-over-union (IoU)-constrained Siamese network is proposed. A dynamic threshold strategy adjusts the thresholds that define positive and negative training samples according to statistical characteristics of the IoU between the predefined anchor boxes and the ground-truth boxes, improving tracking accuracy. The proposed method further replaces the classification branch with an IoU quality assessment branch, which reflects the target position through the IoU between the anchor box and the ground-truth box, improving tracking accuracy while reducing the number of model parameters. Comparative experiments on the VOT2016, OTB-100, VOT2019, and UAV123 datasets show that the proposed method performs well. On VOT2016, its accuracy is 0.017 higher than that of SiamRPN; its expected average overlap is 0.463, only 0.001 lower than that of SiamRPN++, at a real-time speed of up to 220 frame/s.
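The dynamic threshold strategy described in the abstract can be sketched as follows. This is a minimal illustration assuming an ATSS-style rule (the positive/negative threshold is the mean plus one standard deviation of the anchor-to-ground-truth IoUs, in the spirit of reference [3]); the function names and the exact selection rule are assumptions for illustration, not the paper's implementation.

```python
import numpy as np

def iou_with_gt(anchors, gt):
    """IoU of each anchor (N, 4) in [x1, y1, x2, y2] format with one ground-truth box (4,)."""
    x1 = np.maximum(anchors[:, 0], gt[0])
    y1 = np.maximum(anchors[:, 1], gt[1])
    x2 = np.minimum(anchors[:, 2], gt[2])
    y2 = np.minimum(anchors[:, 3], gt[3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area_a = (anchors[:, 2] - anchors[:, 0]) * (anchors[:, 3] - anchors[:, 1])
    area_g = (gt[2] - gt[0]) * (gt[3] - gt[1])
    return inter / (area_a + area_g - inter)

def select_training_samples(anchors, gt):
    """Label anchors with a dynamic threshold derived from IoU statistics.

    Instead of a fixed cutoff (e.g. 0.6), the threshold adapts to the
    mean and standard deviation of the IoU distribution for this target.
    """
    ious = iou_with_gt(anchors, gt)
    thr = ious.mean() + ious.std()      # dynamic, per-target threshold
    pos = ious >= thr                   # positive training samples
    neg = ~pos                          # negative training samples
    # The IoU values themselves can serve as regression targets for an
    # IoU quality assessment branch in place of classification scores.
    return pos, neg, ious
```

A fixed threshold discards borderline anchors when the target's shape makes all IoUs low; tying the threshold to the IoU statistics keeps the best-matching anchors as positives regardless of the absolute IoU scale.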
Keywords:
- Object tracking /
- Deep learning /
- Siamese network /
- Intersection over union (IoU) constraint /
- Dynamic threshold
Table 1. Experimental results of different methods on the VOT2016 dataset

| Method | Accuracy | Robustness | Expected average overlap | Parameters/MB | Speed/(frame·s⁻¹) |
| --- | --- | --- | --- | --- | --- |
| Proposed method | 0.635 | 0.200 | 0.463 | 41.8 | 220 |
| SiamBAN[28] | 0.666 | 0.144 | 0.505 | 410 | 54.53 |
| SiamMask[29] | 0.643 | 0.219 | 0.455 | 82.1 | 55 |
| SiamFC++[4] | 0.612 | 0.266 | 0.357 | 71.24 | 90 |
| SiamRPN++[16] | 0.640 | 0.200 | 0.464 | 206 | 35 |
| SiamRPN[5] | 0.618 | 0.238 | 0.393 | 23.8 | 180 |
| DaSiamRPN[30] | 0.610 | 0.220 | 0.411 | 86.3 | 160 |
| ATOM[31] | 0.610 | 0.187 | 0.430 | 108 | 30 |
| SiamFC[7] | 0.530 | 0.460 | 0.235 | 8.92 | 86 |
Table 2. Experimental results of different methods on VOT2019 dataset
[1] ZHOU Q L, ZHANG W J, ZHAO L P, et al. Cross-modal object tracking algorithm based on pedestrian attribute[J]. Journal of Beijing University of Aeronautics and Astronautics, 2020, 46(9): 1635-1642 (in Chinese).
[2] LUO Y, XIAO H, OU J X. Research on target tracking technology based on deep learning[J]. Semiconductor Optoelectronics, 2020, 41(6): 757 (in Chinese). https://www.cnki.com.cn/Article/CJFDTOTAL-BDTG202006001.htm
[3] ZHANG S F, CHI C, YAO Y Q, et al. Bridging the gap between anchor-based and anchor-free detection via adaptive training sample selection[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2020: 9759-9768.
[4] XU Y D, WANG Z Y, LI Z X, et al. SiamFC++: Towards robust and accurate visual tracking with target estimation guidelines[C]//Proceedings of the AAAI Conference on Artificial Intelligence. Palo Alto: AAAI, 2020: 12549-12556.
[5] LI B, YAN J J, WU W, et al. High performance visual tracking with Siamese region proposal network[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2018: 8971-8980.
[6] TAO R, GAVVES E, SMEULDERS A W M. Siamese instance search for tracking[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2016: 1420-1429.
[7] BERTINETTO L, VALMADRE J, HENRIQUES J F, et al. Fully-convolutional Siamese networks for object tracking[C]//European Conference on Computer Vision. Berlin: Springer, 2016: 850-865.
[8] RUSSAKOVSKY O, DENG J, SU H, et al. ImageNet large scale visual recognition challenge[J]. International Journal of Computer Vision, 2015, 115(3): 211-252. doi: 10.1007/s11263-015-0816-y
[9] VALMADRE J, BERTINETTO L, HENRIQUES J, et al. End-to-end representation learning for correlation filter based tracking[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2017: 2805-2813.
[10] WANG Q, TENG Z, XING J L, et al. Learning attentions: Residual attentional Siamese network for high performance online visual tracking[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2018: 4854-4863.
[11] HE A F, LUO C, TIAN X M, et al. A twofold Siamese network for real-time object tracking[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2018: 4834-4843.
[12] REN S, HE K, GIRSHICK R, et al. Faster R-CNN: Towards real-time object detection with region proposal networks[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 39(6): 1137-1149. doi: 10.1109/TPAMI.2016.2577031
[13] FAN H, LING H B. Siamese cascaded region proposal networks for real-time visual tracking[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2019: 7952-7961.
[14] KRIZHEVSKY A, SUTSKEVER I, HINTON G E. ImageNet classification with deep convolutional neural networks[C]//Advances in Neural Information Processing Systems, 2012: 1097-1105.
[15] ZHANG Z P, PENG H W. Deeper and wider Siamese networks for real-time visual tracking[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2019: 4591-4600.
[16] LI B, WU W, WANG Q, et al. SiamRPN++: Evolution of Siamese visual tracking with very deep networks[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2019: 4282-4291.
[17] HE K M, ZHANG X Y, REN S Q, et al. Deep residual learning for image recognition[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2016: 770-778.
[18] GUO D Y, WANG J, CUI Y, et al. SiamCAR: Siamese fully convolutional classification and regression for visual tracking[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2020: 6269-6277.
[19] LI X, WANG W H, WU L J, et al. Generalized focal loss: Learning qualified and distributed bounding boxes for dense object detection[EB/OL]. (2020-06-08)[2021-09-01]. https://arxiv.org/abs/2006.04388.
[20] WU Y, LIM J, YANG M H. Object tracking benchmark[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2015, 37(9): 1834-1848. doi: 10.1109/TPAMI.2014.2388226
[21] HADFIELD S J, BOWDEN R, LEBDA K. The visual object tracking VOT2016 challenge results[C]//European Conference on Computer Vision. Berlin: Springer, 2016: 777-823.
[22] KRISTAN M, MATAS J, LEONARDIS A, et al. The seventh visual object tracking VOT2019 challenge results[C]//Proceedings of the IEEE International Conference on Computer Vision Workshops. Piscataway: IEEE Press, 2019: 2206-2241.
[23] MUELLER M, SMITH N, GHANEM B. A benchmark and simulator for UAV tracking[C]//European Conference on Computer Vision. Berlin: Springer, 2016: 445-461.
[24] REAL E, SHLENS J, MAZZOCCHI S, et al. YouTube-BoundingBoxes: A large high-precision human-annotated data set for object detection in video[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2017: 5296-5305.
[25] LIN T Y, MAIRE M, BELONGIE S, et al. Microsoft COCO: Common objects in context[C]//European Conference on Computer Vision. Berlin: Springer, 2014: 740-755.
[26] HUANG L H, ZHAO X, HUANG K Q. GOT-10k: A large high-diversity benchmark for generic object tracking in the wild[EB/OL]. (2019-11-20)[2021-09-01]. https://arxiv.org/abs/1810.11981v2.
[27] FAN H, LIN L T, YANG F, et al. LaSOT: A high-quality benchmark for large-scale single object tracking[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2019: 5374-5383.
[28] CHEN Z D, ZHONG B N, LI G R, et al. Siamese box adaptive network for visual tracking[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2020: 6668-6677.
[29] WANG Q, ZHANG L, BERTINETTO L, et al. Fast online object tracking and segmentation: A unifying approach[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2019: 1328-1338.
[30] ZHU Z, WANG Q, LI B, et al. Distractor-aware Siamese networks for visual object tracking[C]//European Conference on Computer Vision. Berlin: Springer, 2018: 101-117.
[31] DANELLJAN M, BHAT G, KHAN F S, et al. ATOM: Accurate tracking by overlap maximization[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2019: 4660-4669.
[32] NAM H, HAN B. Learning multi-domain convolutional neural networks for visual tracking[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2016: 4293-4302.
[33] DANELLJAN M, BHAT G, SHAHBAZ K F, et al. ECO: Efficient convolution operators for tracking[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2017: 6638-6646.
[34] BERTINETTO L, VALMADRE J, GOLODETZ S, et al. Staple: Complementary learners for real-time tracking[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2016: 1401-1409.
[35] DANELLJAN M, HAGER G, SHAHBAZ K F, et al. Learning spatially regularized correlation filters for visual tracking[C]//Proceedings of the IEEE International Conference on Computer Vision. Piscataway: IEEE Press, 2015: 4310-4318.
[36] WANG G T, LUO C, XIONG Z W, et al. SPM-tracker: Series-parallel matching for real-time visual object tracking[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2019: 3643-3652.
[37] DANELLJAN M, HAGER G, KHAN F, et al. Accurate scale estimation for robust visual tracking[C]//British Machine Vision Conference. Berlin: Springer, 2014: 1-11.