Deep learning based UAV vision object detection and tracking

PU Liang; ZHANG Xuejun

doi:10.13700/j.bh.1001-5965.2020.0664

Volume 48 Issue 5

May 2022

Journal of Beijing University of Aeronautics and Astronautics, 2022, 48(5): 872-880.

Citation:

PU Liang, ZHANG Xuejun. Deep learning based UAV vision object detection and tracking[J]. Journal of Beijing University of Aeronautics and Astronautics, 2022, 48(5): 872-880. doi: 10.13700/j.bh.1001-5965.2020.0664(in Chinese)

PU Liang¹, ZHANG Xuejun¹,²

1. School of Aerospace, Xihua University, Chengdu 610039, China
2. School of Electronic Information Engineering, Beihang University, Beijing 100083, China

Corresponding author: ZHANG Xuejun, E-mail: zhxj@buaa.edu.cn
Received Date: 27 Nov 2020
Accepted Date: 31 Jan 2021
Publish Date: 20 May 2022

Abstract

To address the high miss and false-detection rates for small targets in object detection, an improved model based on the Yolov3-Tiny algorithm is proposed. The k-means clustering method is improved by adding 3×3 and 1×1 convolutional pooling layers, upsampling the output of the 9th convolutional layer, and concatenating it with the feature map from the 8th convolutional layer to obtain a new 52×52 output layer, forming a new feature pyramid. Object tracking is implemented with the Kalman filtering algorithm, and a detection network fused with the tracking algorithm is proposed. The Hungarian algorithm is used to optimally match the detection bounding boxes with the tracking bounding boxes, and the tracking result is used to correct the detection result, which improves detection speed and detection capability at the same time. The proposed algorithm is tested for comparison in a combined simulation environment built on ROS, Gazebo, and the PX4 autopilot software. The test results show that the improved algorithm reduces the average detection speed by 15.6% and increases the mAP by 6.5%. The fusion tracking algorithm improves the average detection speed of the network by 34.2% and the mAP by 8.6%. With the fusion tracking algorithm, the network can meet the system's real-time and accuracy requirements.
- object detection,
- Yolov3-Tiny,
- object tracking,
- Kalman filter,
- Hungarian matching
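
The matching step described in the abstract, pairing detection boxes with tracking boxes, can be read as an assignment problem. The sketch below is an illustration only, not the authors' implementation: it assumes boxes are given as (x1, y1, x2, y2) tuples, assumes intersection over union (IoU) as the matching score, and uses an exhaustive search over permutations in place of the Hungarian algorithm. For a handful of boxes the exhaustive search returns the same optimal assignment, but it does not scale to large sets. The names `iou` and `match_boxes` and the `min_iou` threshold are hypothetical.

```python
from itertools import permutations


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def match_boxes(detections, tracks, min_iou=0.3):
    """Return sorted (detection_index, track_index) pairs maximizing total IoU.

    Exhaustive search stands in for the Hungarian algorithm here.
    min_iou is a hypothetical gating threshold, not a value from the paper.
    """
    if not detections or not tracks:
        return []
    # Permute the longer list so each box in the shorter list gets a partner.
    swap = len(detections) > len(tracks)
    short, long_ = (tracks, detections) if swap else (detections, tracks)
    best_score, best_perm = -1.0, ()
    for perm in permutations(range(len(long_)), len(short)):
        score = sum(iou(short[i], long_[j]) for i, j in enumerate(perm))
        if score > best_score:
            best_score, best_perm = score, perm
    pairs = []
    for i, j in enumerate(best_perm):
        # Discard pairs that overlap too little to be the same object.
        if iou(short[i], long_[j]) >= min_iou:
            pairs.append((j, i) if swap else (i, j))
    return sorted(pairs)


# Made-up boxes for illustration: two detections, three predicted tracks.
detections = [(10, 10, 50, 50), (100, 100, 140, 140)]
tracks = [(102, 98, 142, 138), (12, 11, 52, 51), (300, 300, 320, 320)]
print(match_boxes(detections, tracks))  # [(0, 1), (1, 0)]
```

In a fused pipeline, each matched pair would let the tracker's predicted box correct the detector's box, while unmatched tracks and detections would be handled separately; how the paper does this is not specified in the abstract.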
