Abstract: UAVs have been widely used in military and civilian applications, and target tracking is one of the key technologies enabling those applications. To address the scale changes and occlusions that targets frequently undergo during UAV video tracking, an adaptive UAV target tracking algorithm based on residual learning is proposed. First, a deep network combining the advantages of residual learning and dilated convolution is constructed to extract target features while overcoming the network degradation problem. Second, the extracted features are fed into a kernel correlation filtering algorithm, and a positioning filter is constructed to determine the center position of the target. Finally, the target is adaptively partitioned into blocks according to its appearance characteristics, and a scaling coefficient for the target scale is computed. Simulation results show that the proposed algorithm effectively handles the impact of scale changes and occlusion on tracking performance, and outperforms the compared algorithms in both tracking success rate and precision.
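As a rough illustration of the first step, the building block of such a network pairs a dilated convolution (which enlarges the receptive field without extra parameters) with an identity shortcut (which counters degradation). The sketch below assumes single-channel feature maps and 3×3 kernels; the actual DC-ResNet layer widths are not given here.

```python
import numpy as np

def dilated_conv2d(x, w, dilation=2):
    """'Same'-padded 2-D dilated convolution for a single channel.

    A k x k kernel with dilation d covers an effective window of
    d*(k-1)+1 pixels, enlarging the receptive field at no extra cost.
    """
    k = w.shape[0]
    eff = dilation * (k - 1) + 1          # effective receptive field
    pad = eff // 2
    xp = np.pad(x, pad, mode="constant")
    out = np.zeros_like(x, dtype=float)
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            # sample the padded input at dilated strides
            patch = xp[i:i + eff:dilation, j:j + eff:dilation]
            out[i, j] = np.sum(patch * w)
    return out

def residual_block(x, w1, w2, dilation=2):
    """y = x + F(x): two dilated convs with a ReLU, plus an identity shortcut."""
    h = np.maximum(dilated_conv2d(x, w1, dilation), 0.0)  # ReLU
    return x + dilated_conv2d(h, w2, dilation)
```

Because of the shortcut, a block whose weights are all zero reduces to the identity map, which is precisely why stacking such blocks does not degrade a deeper network.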
Key words:
- UAV /
- target tracking /
- dilated convolution /
- residual learning /
- correlation filter /
- scale adaptation
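For reference, the kernel correlation filtering step listed above follows the standard KCF formulation: a Gaussian kernel is evaluated for all cyclic shifts at once in the Fourier domain, ridge regression is solved per frequency bin, and the peak of the response map locates the target center. A minimal single-channel sketch (the sigma and lam values are illustrative, not the paper's):

```python
import numpy as np

def gauss_kernel(x, z, sigma=0.5):
    """Gaussian kernel correlation of two patches, evaluated for all
    cyclic shifts at once via the FFT (the core KCF trick)."""
    c = np.fft.ifft2(np.fft.fft2(x) * np.conj(np.fft.fft2(z))).real
    d = (x ** 2).sum() + (z ** 2).sum() - 2.0 * c
    return np.exp(-np.maximum(d, 0.0) / (sigma ** 2 * x.size))

def train(x, y, sigma=0.5, lam=1e-4):
    """Ridge regression in the Fourier domain: alpha_hat = y_hat / (k_hat + lam)."""
    return np.fft.fft2(y) / (np.fft.fft2(gauss_kernel(x, x, sigma)) + lam)

def respond(alpha_hat, x_model, z, sigma=0.5):
    """Response map over all cyclic shifts; its peak gives the new target center."""
    return np.fft.ifft2(alpha_hat * np.fft.fft2(gauss_kernel(x_model, z, sigma))).real
```

In the proposed algorithm, the patches `x` and `z` would be the deep features extracted by the DC-ResNet rather than raw pixels.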
Table 1. Performance comparison of network models
Classification accuracy by iteration count:

| Iterations | DC-ResNet | CNN | ResNet | DilatedNet |
| --- | --- | --- | --- | --- |
| 30000 | 0.835 | 0.775 | 0.813 | 0.790 |
| 35000 | 0.837 | 0.778 | 0.817 | 0.796 |
| 40000 | 0.837 | 0.778 | 0.820 | 0.795 |
| 45000 | 0.836 | 0.780 | 0.818 | 0.802 |
| 50000 | 0.839 | 0.782 | 0.821 | 0.800 |
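The abstract's final step (adaptive blocking and computation of a scale coefficient) is not fully specified in this excerpt. One common way to turn per-block positions into a scale factor, shown here purely as a hypothetical illustration, is the median ratio of pairwise distances between block centers across frames:

```python
import numpy as np

def scale_coefficient(prev_pts, curr_pts):
    """Estimate a scale factor from block centers tracked across two frames.

    For every pair of blocks, compare the distance between their centers in
    the current frame with that in the previous frame; the median ratio is a
    robust estimate of the target's scale change.
    """
    prev = np.asarray(prev_pts, dtype=float)
    curr = np.asarray(curr_pts, dtype=float)
    ratios = []
    for i in range(len(prev)):
        for j in range(i + 1, len(prev)):
            d_prev = np.linalg.norm(prev[i] - prev[j])
            d_curr = np.linalg.norm(curr[i] - curr[j])
            if d_prev > 0:
                ratios.append(d_curr / d_prev)
    return float(np.median(ratios))
```

A coefficient above 1 would enlarge the tracking window, below 1 shrink it; the median makes the estimate tolerant of a few blocks being occluded or mistracked.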