-
Abstract: To reduce the impact of the appearance deformation caused by target motion on target tracking, an appearance- and action-adaptive tracking method is proposed on the basis of DaSiamese-RPN. First, an appearance-action adaptive update module is introduced into the subnetworks of the Siamese network to fuse the target's spatio-temporal information and action features. Second, two Euclidean distances are used to measure the global and local differences between the ground-truth and predicted feature maps, and the loss function is constructed as a weighted fusion of the two, which strengthens the correlation between the global and local information of the predicted and ground-truth target feature maps. Finally, experiments were conducted on the VOT2016, VOT2018, VOT2019, and OTB100 datasets. The results show that the expected average overlap (EAO) improves by 4.5% and 6.1% on VOT2016 and VOT2018, respectively; on VOT2019, accuracy improves by 0.4% while EAO decreases by 1%; on OTB100, the tracking success rate improves by 0.3% and precision improves by 0.2%.
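As a rough, hypothetical sketch of how such a weighted global/local Euclidean-distance loss could be composed, the PyTorch snippet below computes one global L2 distance over the whole feature map and one local, patch-wise L2 distance, then fuses them with a weight ω. The function name, tensor shapes, patch size, and the exact placement of ω are assumptions made for illustration; they are not the implementation reported in the paper.

```python
import torch
import torch.nn.functional as F

def global_local_feature_loss(pred_feat, gt_feat, omega=100.0, patch=4):
    """Illustrative sketch: weighted fusion of a global and a local Euclidean
    distance between predicted and ground-truth feature maps.

    pred_feat, gt_feat: (B, C, H, W) tensors -- shapes assumed for this example.
    omega: fusion weight between the two terms (the role of omega here is an
           assumption; only the swept values 0/100/500/1000 appear in Tables 1-4).
    patch: local window size for the patch-wise (local) term.
    """
    # Global term: L2 distance over each whole feature map, averaged over the batch.
    global_dist = torch.norm((pred_feat - gt_feat).flatten(1), dim=1).mean()

    # Local term: split both maps into non-overlapping patches and average the
    # per-patch L2 distances, so local deformations are not averaged away.
    pred_patches = F.unfold(pred_feat, kernel_size=patch, stride=patch)  # (B, C*patch*patch, L)
    gt_patches = F.unfold(gt_feat, kernel_size=patch, stride=patch)
    local_dist = torch.norm(pred_patches - gt_patches, dim=1).mean()

    # Weighted fusion of the global and local distance terms.
    return global_dist + omega * local_dist


# Example usage with random feature maps of an assumed size:
pred = torch.randn(2, 256, 16, 16)
gt = torch.randn(2, 256, 16, 16)
loss = global_local_feature_loss(pred, gt, omega=100.0)
```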
-
Table 1. Results of testing on the VOT2016 dataset

Model           Accuracy/%   Robustness/%   EAO/%
DaSiamese-RPN   61           22             41.1
Ours (ω=0)      62.7         21.4           42.5
Ours (ω=1000)   61.3         19.6           45.5
Ours (ω=500)    60.9         19.6           44.2
Ours (ω=100)    61.4         18.6           45.6

Table 2. Results of testing on the VOT2018 dataset

Model           Accuracy/%   Robustness/%   EAO/%
DaSiamese-RPN   56.9         33.7           32.6
Ours (ω=0)      58.4         29.5           35.2
Ours (ω=1000)   58.5         28.6           37.2
Ours (ω=500)    58.5         25.8           38.7
Ours (ω=100)    58.5         28.6           36.5

Table 3. Results of testing on the VOT2019 dataset

Model           Accuracy/%   Robustness/%   EAO/%
DaSiamese-RPN   58.2         52.7           27.2
Ours (ω=0)      58.3         54.7           26.7
Ours (ω=1000)   58.5         55.2           26.8
Ours (ω=500)    58.6         55.2           26.2
Ours (ω=100)    58.5         55.7           26

Table 4. Results of testing on the OTB100 dataset

Model           Success rate/%   Precision/%
DaSiamese-RPN   64.6             85.9
Ours (ω=0)      64.9             86.1
Ours (ω=1000)   64.5             85.5
Ours (ω=500)    64.6             85.8
Ours (ω=100)    64.8             86

Table 5. Comparison with different methods on the VOT2016 dataset

Table 6. Comparison with different methods on the VOT2018 dataset
-