Dynamic visual SLAM algorithm based on improved YOLOv5s

JIANG Changjiang; LIU Peng; SHU Peng

doi:10.13700/j.bh.1001-5965.2023.0154

Volume 51 Issue 3

Mar. 2025

Journal of Beijing University of Aeronautics and Astronautics, 2025, 51(3): 763-771.

Citation:

JIANG C J, LIU P, SHU P. Dynamic visual SLAM algorithm based on improved YOLOv5s[J]. Journal of Beijing University of Aeronautics and Astronautics, 2025, 51(3): 763-771 (in Chinese). doi: 10.13700/j.bh.1001-5965.2023.0154

School of Automation, Chongqing University of Posts and Telecommunications, Chongqing 400065, China

Funds: National Natural Science Foundation of China (62277008)

Corresponding author: E-mail: jiangcj@cqupt.edu.cn
Received Date: 31 Mar 2023
Accepted Date: 09 Jun 2023

Available Online: 28 Jul 2023

Publish Date: 26 Jul 2023

Abstract

To address the loss of robustness and camera localization accuracy caused by moving objects in indoor dynamic scenes, a dynamic visual simultaneous localization and mapping (SLAM) algorithm based on an object detection network is proposed. YOLOv5s, the YOLOv5 variant with the smallest network depth and feature-map width, is chosen as the object detection network, and its backbone is replaced with the lightweight network PP-LCNet. After training on the VOC2007+VOC2012 dataset, experimental results show that the PP-LCNet-YOLOv5s model reduces the number of network parameters by 41.89% and improves running speed by 39.13% compared with the YOLOv5s model. A parallel thread that combines the improved object detection network with the sparse optical flow method is added to the visual SLAM system to eliminate dynamic feature points in the tracking thread, so that only static feature points are used for feature matching and camera pose estimation. Experimental results show that the proposed algorithm improves camera localization accuracy in dynamic scenes by 92.38% compared with ORB-SLAM3.
- simultaneous localization and mapping (SLAM),
- target detection,
- dynamic feature point elimination,
- positioning accuracy,
- optical flow approach
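
The abstract describes rejecting dynamic feature points by combining object detection boxes with sparse optical flow, but does not state the exact decision rule. The following is a minimal Python sketch of one plausible rule, not the authors' implementation. The function name `filter_static_points`, the threshold `flow_thresh`, and the use of the median flow of out-of-box points as a stand-in for camera-induced motion are all assumptions made for illustration.

```python
import numpy as np

def filter_static_points(prev_pts, curr_pts, boxes, flow_thresh=2.0):
    """Return a boolean mask that is True for points treated as static.

    prev_pts, curr_pts: (N, 2) pixel coordinates of the same feature points
        in two consecutive frames (e.g. from a sparse optical flow tracker).
    boxes: (M, 4) detections [x1, y1, x2, y2] of classes assumed movable.
    flow_thresh: hypothetical threshold (pixels) on residual motion.
    """
    prev_pts = np.asarray(prev_pts, dtype=float).reshape(-1, 2)
    curr_pts = np.asarray(curr_pts, dtype=float).reshape(-1, 2)
    boxes = np.asarray(boxes, dtype=float).reshape(-1, 4)

    # Per-point displacement between the two frames.
    flow = curr_pts - prev_pts

    # Mark points whose current position lies inside any detection box.
    x, y = curr_pts[:, 0:1], curr_pts[:, 1:2]
    in_box = (
        (x >= boxes[:, 0]) & (x <= boxes[:, 2])
        & (y >= boxes[:, 1]) & (y <= boxes[:, 3])
    ).any(axis=1)

    # Assumption: the median flow of points outside all boxes approximates
    # the image motion caused by the camera itself.
    outside = ~in_box
    background = np.median(flow[outside], axis=0) if outside.any() else np.zeros(2)

    # A point is rejected only if it is inside a box AND moves differently
    # from the background; static points inside a box are kept.
    residual = np.linalg.norm(flow - background, axis=1)
    dynamic = in_box & (residual > flow_thresh)
    return ~dynamic

# Toy example: two background points, one moving and one still point in a box.
prev = np.array([[10, 10], [50, 50], [100, 100], [120, 120]], dtype=float)
curr = prev + np.array([[1, 0], [1, 0], [9, 0], [1, 0]], dtype=float)
person_box = np.array([[90, 90, 130, 130]], dtype=float)
mask = filter_static_points(prev, curr, person_box)
```

In a real pipeline the point pairs could come from a pyramidal Lucas-Kanade tracker such as OpenCV's `cv2.calcOpticalFlowPyrLK`, and the boxes from the detector. Only the points where `mask` is True would then be passed on to feature matching and pose estimation.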

References

[1]	TIAN Y, CHEN H W, WANG F S, et al. Overview of SLAM algorithms for mobile robots[J]. Computer Science, 2021, 48(9): 223-234 (in Chinese).
[2]	CAMPOS C, ELVIRA R, RODRÍGUEZ J J G, et al. ORB-SLAM3: an accurate open-source library for visual, visual-inertial, and multimap SLAM[J]. IEEE Transactions on Robotics, 2021, 37(6): 1874-1890. doi: 10.1109/TRO.2021.3075644
[3]	MUR-ARTAL R, TARDÓS J D. ORB-SLAM2: an open-source SLAM system for monocular, stereo, and RGB-D cameras[J]. IEEE Transactions on Robotics, 2017, 33(5): 1255-1262. doi: 10.1109/TRO.2017.2705103
[4]	GOMEZ-OJEDA R, MORENO F A, ZUÑIGA-NOËL D, et al. PL-SLAM: a stereo SLAM system through the combination of points and line segments[J]. IEEE Transactions on Robotics, 2019, 35(3): 734-746. doi: 10.1109/TRO.2019.2899783
[5]	ZHANG Y Q, QI Y M, DENG S P, et al. Research on visual-inertial SLAM system based on direct method and common view optimization[J]. Automation and Instrumentation, 2022(5): 197-203 (in Chinese).
[6]	ENGEL J, STÜCKLER J, CREMERS D. Large-scale direct SLAM with stereo cameras[C]//Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems. Piscataway: IEEE Press, 2015: 1935-1942.
[7]	ENGEL J, KOLTUN V, CREMERS D. Direct sparse odometry[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2018, 40(3): 611-625.
[8]	LI H, JIANG Y H, FAN F Y, et al. RGB-D SLAM based on semantic and geometry for indoor dynamic environments[C]//Proceedings of the China Automation Congress. Piscataway: IEEE Press, 2022: 5581-5586.
[9]	LONG X D, ZHANG W W, ZHAO B. PSPNet-SLAM: a semantic SLAM detect dynamic object by pyramid scene parsing network[J]. IEEE Access, 2020, 8: 214685-214695. doi: 10.1109/ACCESS.2020.3041038
[10]	BESCOS B, FÁCIL J M, CIVERA J, et al. DynaSLAM: tracking, mapping, and inpainting in dynamic scenes[J]. IEEE Robotics and Automation Letters, 2018, 3(4): 4076-4083. doi: 10.1109/LRA.2018.2860039
[11]	HE K, GKIOXARI G, DOLLAR P, et al. Mask R-CNN[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2020, 42(2): 386-397. doi: 10.1109/TPAMI.2018.2844175
[12]	HAN S Q, XI Z H. Dynamic scene semantics SLAM based on semantic segmentation[J]. IEEE Access, 2020, 8: 43563-43570. doi: 10.1109/ACCESS.2020.2977684
[13]	YUAN X, CHEN S. SaD-SLAM: a visual SLAM based on semantic and depth information[C]//Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems. Piscataway: IEEE Press, 2020: 4930-4935.
[14]	YU C, LIU Z X, LIU X J, et al. DS-SLAM: a semantic visual SLAM towards dynamic environments[C]//Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems. Piscataway: IEEE Press, 2018: 1168-1174.
[15]	BADRINARAYANAN V, KENDALL A, CIPOLLA R. SegNet: a deep convolutional encoder-decoder architecture for image segmentation[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 39(12): 2481-2495. doi: 10.1109/TPAMI.2016.2644615
[16]	LIU Y B, MIURA J. RDS-SLAM: real-time dynamic SLAM using semantic segmentation methods[J]. IEEE Access, 2021, 9: 23772-23785. doi: 10.1109/ACCESS.2021.3050617
[17]	SU P, LUO S Y, HUANG X C. Real-time dynamic SLAM algorithm based on deep learning[J]. IEEE Access, 2022, 10: 87754-87766. doi: 10.1109/ACCESS.2022.3199350
[18]	ZHANG H, YE M, SUN X. Robust indoor visual-inertial SLAM with pedestrian detection[J]. IEEE Transactions on Instrumentation and Measurement, 2019, 68(10): 3897-3908.
[19]	MUR-ARTAL R, TARDÓS J D. Visual-inertial monocular SLAM with map reuse[J]. IEEE Robotics and Automation Letters, 2017, 2(2): 796-803. doi: 10.1109/LRA.2017.2653359