
Visual-inertial navigation method based on semantic segmentation and geometric constraints in dynamic environment

ZHANG Wenke, HAN Peng, FENG Yu, GAO Dong

Citation: ZHANG W K, HAN P, FENG Y, et al. Visual-inertial navigation method based on semantic segmentation and geometric constraints in dynamic environment[J]. Journal of Beijing University of Aeronautics and Astronautics, 2026, 52(4): 1189-1198 (in Chinese)


doi: 10.13700/j.bh.1001-5965.2024.0016
Corresponding author: E-mail: hanpeng@nssc.ac.cn

  • CLC number: TP242.6

Visual-inertial navigation method based on semantic segmentation and geometric constraints in dynamic environment

Funds: 

Science Foundation of the Chinese Academy of Sciences (8091A100113)

  • Abstract:

    In practical simultaneous localization and mapping (SLAM) scenarios, image feature points on numerous moving objects participate in feature tracking and degrade the accuracy and robustness of the algorithm, while conventional dynamic SLAM schemes that simply cull dynamic features can leave too few static features for reliable tracking. To address both problems, a dynamic visual/inertial integrated navigation method based on semantic segmentation and geometric constraints is proposed. A semantic segmentation network, combined with a per-class prior confidence of object dynamics, generates a prior dynamic mask, and an improved feature extraction method suppresses features in prior dynamic regions. Inertial measurement unit (IMU) preintegration combined with geometric constraints then judges the true dynamics of each feature point, a culling strategy removes the truly dynamic ones, and the remaining static feature points are used for tracking and localization. Compared with ORB-SLAM3, the proposed algorithm improves localization accuracy by an average of 73.05% on the indoor dynamic-scene dataset TUM and by an average of 19.85% on the outdoor dynamic-scene dataset KITTI; it also outperforms conventional dynamic SLAM algorithms in accuracy.
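The two-stage culling pipeline summarized above (a semantic prior mask, then an IMU-aided geometric check on each feature's 3D position) can be sketched as follows. This is a minimal illustration, not the paper's implementation: the class names, prior confidence values, and the 0.1 m distance threshold are assumed parameters.

```python
import numpy as np

# Illustrative per-class prior dynamic confidence (assumed values,
# not the paper's): people move often, buildings and roads do not.
DYNAMIC_PRIOR = {"person": 0.9, "car": 0.7, "building": 0.0, "road": 0.0}

def prior_dynamic_mask(label_map, class_names, threshold=0.5):
    """Mark pixels whose semantic class has a high prior dynamic confidence.

    label_map: (H, W) int array of class indices from the segmentation net.
    class_names: list mapping class index -> class name.
    """
    prior = np.array([DYNAMIC_PRIOR.get(c, 0.0) for c in class_names])
    return prior[label_map] >= threshold

def is_truly_dynamic(p_prev, p_curr, T_pred, dist_thresh=0.1):
    """Geometric check of a feature's true dynamics.

    Transform the previous 3D point by the relative pose predicted from
    IMU preintegration (T_pred, a 4x4 homogeneous transform). If the
    distance to the currently observed 3D point exceeds dist_thresh,
    the point moved on its own and is judged dynamic.
    """
    p_pred = T_pred[:3, :3] @ p_prev + T_pred[:3, 3]
    return bool(np.linalg.norm(p_pred - p_curr) > dist_thresh)
```

A feature whose IMU-predicted position agrees with its observed position is consistent with a static world; a large residual indicates independent motion, so the point is culled before tracking.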

     

  • Figure 1.  Overall framework of the dynamic SLAM algorithm based on semantic segmentation and geometric constraints

    Figure 2.  Overlay effect of simultaneous object detection and semantic segmentation with Multi YOLOv5s on the KITTI dataset

    Figure 3.  Feature point extraction process on each pyramid level in the improved feature extraction method

    Figure 4.  Comparison of the RMSE of ATE and the average number of feature points culled per frame under different dilution factors

    Figure 5.  Overall framework of the method for judging the true dynamics of feature points

    Figure 6.  Schematic diagram of judging the true dynamics of feature points by 3D distance error

    Figure 7.  Comparison of the running effect of the improved algorithm on the TUM dataset

    Figure 8.  Comparison of trajectories estimated by each algorithm with ground-truth trajectories on the TUM dataset

    Figure 9.  Comparison between the running effect of the improved algorithm and the original images under sequences 01 and 04

    Figure 10.  Running effect of the improved algorithm under sequences 07 and 09

    Table 1.  Comparison of absolute trajectory errors (ATE) of each algorithm on the TUM dataset. A: ORB-SLAM3 (RGB-D mode); B: improved algorithm using only the semantic segmentation network; C: improved algorithm with the improved feature extraction method. RMSE improvement rate 1 is C relative to B; rate 2 is C relative to A.

    | Sequence | STD(A)/m | STD(B)/m | STD(C)/m | RMSE(A)/m | RMSE(B)/m | RMSE(C)/m | RMSE improvement 1/% | RMSE improvement 2/% |
    | --- | --- | --- | --- | --- | --- | --- | --- | --- |
    | walking_xyz | 0.1146 | 0.0099 | 0.0083 | 0.2704 | 0.0223 | 0.0204 | 8.52 | 92.46 |
    | walking_rpy | 0.0785 | 0.0350 | 0.0287 | 0.1608 | 0.0541 | 0.0487 | 9.98 | 69.71 |
    | walking_halfsphere | 0.1658 | 0.0406 | 0.0331 | 0.2913 | 0.0548 | 0.0467 | 14.78 | 83.97 |
    | walking_static | 0.0086 | 0.0033 | 0.0032 | 0.0152 | 0.0088 | 0.0082 | 6.82 | 46.05 |
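The ATE statistics in these tables follow the standard definition: the root mean square of per-frame translational errors between the estimated and ground-truth trajectories, with the improvement rate being the relative RMSE reduction. A minimal sketch (trajectory alignment and timestamp association, which evaluation tools such as evo perform first, are omitted here):

```python
import numpy as np

def ate_rmse(est_xyz, gt_xyz):
    """RMSE of the absolute trajectory error for already-aligned,
    time-associated position sequences of shape (N, 3)."""
    err = np.linalg.norm(np.asarray(est_xyz) - np.asarray(gt_xyz), axis=1)
    return float(np.sqrt(np.mean(err ** 2)))

def improvement_rate(baseline_rmse, improved_rmse):
    """Relative RMSE reduction in percent, as reported in the tables."""
    return 100.0 * (baseline_rmse - improved_rmse) / baseline_rmse
```

For example, the walking_xyz row of Table 1 gives improvement_rate(0.2704, 0.0204) = 92.46%, matching the reported RMSE improvement rate 2.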

    Table 2.  Comparison of relative pose errors (RPE) of each algorithm on the TUM dataset. A: ORB-SLAM3 (RGB-D mode); B: improved algorithm using only the semantic segmentation network; C: improved algorithm with the improved feature extraction method. RMSE improvement rate 1 is C relative to B; rate 2 is C relative to A.

    | Sequence | STD(A)/m | STD(B)/m | STD(C)/m | RMSE(A)/m | RMSE(B)/m | RMSE(C)/m | RMSE improvement 1/% | RMSE improvement 2/% |
    | --- | --- | --- | --- | --- | --- | --- | --- | --- |
    | walking_xyz | 0.0089 | 0.0091 | 0.0088 | 0.0170 | 0.0156 | 0.0153 | 1.92 | 10.00 |
    | walking_rpy | 0.0303 | 0.0271 | 0.0230 | 0.0357 | 0.0341 | 0.0301 | 11.73 | 15.69 |
    | walking_halfsphere | 0.0177 | 0.0151 | 0.0102 | 0.0240 | 0.0219 | 0.0188 | 14.16 | 21.67 |
    | walking_static | 0.0125 | 0.0039 | 0.0037 | 0.0152 | 0.0074 | 0.0072 | 2.70 | 52.63 |

    Table 3.  RMSE of ATE of the improved algorithm versus other dynamic SLAM algorithms on the TUM dataset (unit: m)

    | Sequence | ORB-SLAM3[2] | DS-SLAM[4] | Detect-SLAM[6] | ReFusion[17] | D3FlowSLAM[18] | Proposed |
    | --- | --- | --- | --- | --- | --- | --- |
    | walking_xyz | 0.2704 | 0.0247 | 0.0241 | 0.099 | 0.018 | 0.0204 |
    | walking_rpy | 0.1608 | 0.4442 | 0.2959 | 0.057 | — | 0.0487 |
    | walking_halfsphere | 0.2913 | 0.0303 | 0.0514 | 0.104 | 0.42 | 0.0467 |
    | walking_static | 0.0152 | 0.0081 | 0.017 | 0.007 | — | 0.0082 |
    Note: bold numbers indicate the best performance.

    Table 4.  Comparison of absolute trajectory errors (ATE) of each algorithm on the KITTI dataset. A: ORB-SLAM3 (stereo visual/inertial mode); B: dynamic SLAM scheme without the geometric method; C: dynamic SLAM algorithm based on semantic segmentation and geometric constraints. RMSE improvement rate 1 is C relative to B; rate 2 is C relative to A.

    | Sequence | STD(A)/m | STD(B)/m | STD(C)/m | RMSE(A)/m | RMSE(B)/m | RMSE(C)/m | RMSE improvement 1/% | RMSE improvement 2/% |
    | --- | --- | --- | --- | --- | --- | --- | --- | --- |
    | 01 | 1.12 | 1.1455 | 0.9417 | 2.6673 | 2.5815 | 1.9759 | 23.46 | 25.92 |
    | 04 | 0.1073 | 0.0558 | 0.0534 | 0.2283 | 0.1503 | 0.1469 | 2.26 | 35.65 |
    | 07 | 0.5029 | 0.4991 | 0.4583 | 0.8926 | 0.9244 | 0.8583 | 7.15 | 3.84 |
    | 09 | 0.5943 | 0.9086 | 0.5362 | 1.3086 | 1.8222 | 1.1254 | 38.24 | 14.00 |

    Table 5.  Comparison of relative pose errors (RPE) of each algorithm on the KITTI dataset. A: ORB-SLAM3 (stereo visual/inertial mode); B: dynamic SLAM scheme without the geometric method; C: dynamic SLAM algorithm based on semantic segmentation and geometric constraints. RMSE improvement rate 1 is C relative to B; rate 2 is C relative to A.

    | Sequence | STD(A)/m | STD(B)/m | STD(C)/m | RMSE(A)/m | RMSE(B)/m | RMSE(C)/m | RMSE improvement 1/% | RMSE improvement 2/% |
    | --- | --- | --- | --- | --- | --- | --- | --- | --- |
    | 01 | 0.0429 | 0.03 | 0.0242 | 0.0662 | 0.051 | 0.0457 | 10.39 | 30.97 |
    | 04 | 0.0119 | 0.0115 | 0.0076 | 0.0224 | 0.0191 | 0.0156 | 18.32 | 30.36 |
    | 07 | 0.01 | 0.0095 | 0.0085 | 0.0194 | 0.0176 | 0.0158 | 10.23 | 18.56 |
    | 09 | 0.0114 | 0.0092 | 0.0091 | 0.0223 | 0.0198 | 0.0191 | 3.54 | 14.35 |

    Table 6.  RMSE of ATE of the improved algorithm versus other dynamic SLAM algorithms on the KITTI dataset. Improvement rates are relative to each method's base algorithm.

    | Sequence | DynaSLAM[20] RMSE/m | Dynamic-SLAM[21] RMSE/m | Proposed RMSE/m | DynaSLAM[20] improvement/% | Dynamic-SLAM[21] improvement/% | Proposed improvement/% |
    | --- | --- | --- | --- | --- | --- | --- |
    | 01 | 9.4 | — | 1.9759 | 9.62 | — | 25.92 |
    | 04 | 0.2 | 1.109 | 0.1469 | 0 | 10.06 | 35.65 |
    | 07 | 0.5 | 1.823 | 0.8583 | 0 | 7.08 | 3.84 |
    | 09 | 1.6 | 9.285 | 1.1254 | 50 | 10.50 | 14.00 |
    Note: bold numbers indicate the best performance.
  • [1] LIU Z, SHI D X, YANG S W, et al. Review of visual-inertial navigation system initialization method[J]. Journal of National University of Defense Technology, 2023, 45(2): 15-26 (in Chinese).
    [2] CAMPOS C, ELVIRA R, RODRÍGUEZ J J G, et al. ORB-SLAM3: an accurate open-source library for visual, visual-inertial, and multimap SLAM[J]. IEEE Transactions on Robotics, 2021, 37(6): 1874-1890.
    [3] QIN T, CAO S Z, PAN J, et al. A general optimization-based framework for global pose estimation with multiple sensors[EB/OL]. (2019-01-11)[2024-01-05]. https://arxiv.org/abs/1901.03642.
    [4] YU C, LIU Z X, LIU X J, et al. DS-SLAM: a semantic visual SLAM towards dynamic environments[C]//Proceedings of the 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems. Piscataway: IEEE Press, 2019: 1168-1174.
    [5] BADRINARAYANAN V, KENDALL A, CIPOLLA R. SegNet: a deep convolutional encoder-decoder architecture for image segmentation[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 39(12): 2481-2495.
    [6] ZHONG F W, WANG S, ZHANG Z Q, et al. Detect-SLAM: making object detection and SLAM mutually beneficial[C]//Proceedings of the 2018 IEEE Winter Conference on Applications of Computer Vision. Piscataway: IEEE Press, 2018: 1001-1010.
    [7] ZHANG J, HENEIN M, MAHONY R, et al. VDO-SLAM: a visual dynamic object-aware SLAM system[EB/OL]. (2021-12-14)[2024-01-05]. https://arxiv.org/abs/2005.11052.
    [8] DAI W C, ZHANG Y, LI P, et al. RGB-D SLAM in dynamic environments using point correlations[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2022, 44(1): 373-389.
    [9] CHENG J H, WANG Z, ZHOU H Y, et al. DM-SLAM: a feature-based SLAM system for rigid dynamic scenes[J]. ISPRS International Journal of Geo-Information, 2020, 9(4): 202.
    [10] JIANG C J, LIU P, SHU P. Dynamic visual SLAM algorithm based on improved YOLOv5s[J]. Journal of Beijing University of Aeronautics and Astronautics, 2025, 51(3): 763-771 (in Chinese).
    [11] YUAN X, CHEN S. SaD-SLAM: a visual SLAM based on semantic and depth information[C]//Proceedings of the 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems. Piscataway: IEEE Press, 2021: 4930-4935.
    [12] LI A, WANG J K, XU M, et al. DP-SLAM: a visual SLAM with moving probability towards dynamic environments[J]. Information Sciences, 2021, 556: 128-142.
    [13] STURM J, ENGELHARD N, ENDRES F, et al. A benchmark for the evaluation of RGB-D SLAM systems[C]//Proceedings of the 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems. Piscataway: IEEE Press, 2012: 573-580.
    [14] YU C Q, WANG J B, PENG C, et al. BiSeNet: bilateral segmentation network for real-time semantic segmentation[C]//European Conference on Computer Vision. Berlin: Springer, 2018: 334-349.
    [15] GEIGER A, LENZ P, STILLER C, et al. Vision meets robotics: the KITTI dataset[J]. The International Journal of Robotics Research, 2013, 32(11): 1231-1237.
    [16] GALVEZ-LÓPEZ D, TARDOS J D. Bags of binary words for fast place recognition in image sequences[J]. IEEE Transactions on Robotics, 2012, 28(5): 1188-1197.
    [17] PALAZZOLO E, BEHLEY J, LOTTES P, et al. ReFusion: 3D reconstruction in dynamic environments for RGB-D cameras exploiting residuals[C]//Proceedings of the 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems. Piscataway: IEEE Press, 2020: 7855-7862.
    [18] YU X Y, YE W C, GUO X Y, et al. D3FlowSLAM: self-supervised dynamic SLAM with flow motion decomposition and DINO guidance[EB/OL]. (2024-08-21)[2026-01-07]. https://arxiv.org/abs/2207.08794.
    [19] MUR-ARTAL R, TARDÓS J D. ORB-SLAM2: an open-source SLAM system for monocular, stereo, and RGB-D cameras[J]. IEEE Transactions on Robotics, 2017, 33(5): 1255-1262.
    [20] BESCOS B, FÁCIL J M, CIVERA J, et al. DynaSLAM: tracking, mapping, and inpainting in dynamic scenes[J]. IEEE Robotics and Automation Letters, 2018, 3(4): 4076-4083.
    [21] XIAO L H, WANG J G, QIU X S, et al. Dynamic-SLAM: semantic monocular visual localization and mapping based on deep learning in dynamic environment[J]. Robotics and Autonomous Systems, 2019, 117: 1-16.
Publication history
  • Received:  2024-01-10
  • Accepted:  2024-02-23
  • Published online:  2024-03-18
  • Issue date:  2026-04-30
