Volume 48 Issue 11
Nov.  2022
Turn off MathJax
Article Contents
XU Guoyan, XIONG Yiwei, ZHOU Bin, et al. Autonomous navigation based on PPO for mobile platform[J]. Journal of Beijing University of Aeronautics and Astronautics, 2022, 48(11): 2138-2145. doi: 10.13700/j.bh.1001-5965.2021.0100(in Chinese)
Citation: XU Guoyan, XIONG Yiwei, ZHOU Bin, et al. Autonomous navigation based on PPO for mobile platform[J]. Journal of Beijing University of Aeronautics and Astronautics, 2022, 48(11): 2138-2145. doi: 10.13700/j.bh.1001-5965.2021.0100(in Chinese)

Autonomous navigation based on PPO for mobile platform

doi: 10.13700/j.bh.1001-5965.2021.0100
Funds:

National Natural Science Foundation of China 51775016

More Information
  • Corresponding author: XU Guoyan, E-mail: xuguoyan@buaa.edu.cn
  • Received Date: 02 Mar 2021
  • Accepted Date: 11 Apr 2021
  • Publish Date: 07 May 2021
  • This paper presents an autonomous navigation method based on proximal policy optimization (PPO) algorithm for mobile platform. In this method, GNSS and LADAR are used for sensing environment information. To define the state of reinforcement learning model, an ego position evaluation method is introduced based on improved artificial potential field algorithm. After that, on the basis of PPO algorithm, a kind of action policy function is designed based on Gaussian distribution, which solves the continuity problem of the vehicle linear velocity and yaw velocity. Furthermore, the network framework and reward function of the model are also designed for navigation scenarios. In order to train the navigation model, a virtual environment based on Gazebo is built. The training results show that the ego position evaluation method obviously helps to improve the speed of model convergence. Finally, the navigation model is transplanted to a real environment, which verifies the effectiveness of the proposed method.

     

  • loading
  • [1]
    王义林. 地面无人平台自主导航避障系统的研究与实现[D]. 哈尔滨: 哈尔滨工业大学, 2020.

    WANG Y L. Research and implementation of autonomous navigation and obstacle avoidance system for ground unmanned platform[D]. Harbin: Harbin Institute of Technology, 2020(in Chinese).
    [2]
    秦圣然. 基于激光传感器的移动机器人导航系统研究[D]. 沈阳: 沈阳工业大学, 2020.

    QIN S R. Research on laser sensor-based navigation system for mobile robots[D]. Shenyang: Shenyang University of Technology, 2020(in Chinese).
    [3]
    HART P E, NILSSON N J, RAPHAEL B. A formal basis for the heuristic determination of minimum cost paths in graphs[J]. IEEE Transactions on Systems Science and Cybernetics, 1968, 4(2): 100-107. doi: 10.1109/TSSC.1968.300136
    [4]
    STENT A. Optimal and efficient path planning for partially-known environments[C]//Proceedings of IEEE International Conference on Robotics and Automation. Piscataway: IEEE Press, 1994, 4: 3310-3317.
    [5]
    STENT A. The focussed D* algorithm for real-time replanning[C]//Proceedings of the 14th International Joint Conference on Artificial Intelligence. New York: ACM, 1995: 1652-1659.
    [6]
    LAVALLE S M, KUFFNER J J. Randomized kinodynamic planning[C]//Proceedings of IEEE International Conference on Robotics and Automation. Piscataway: IEEE Press, 1999, 1: 473-479.
    [7]
    付雪建. 基于强化学习的移动机器人自主导航研究[D]. 重庆: 重庆大学, 2017.

    FU X J. Research on autonomous navigation of mobile robots based on reinforcement learning[D]. Chongqing: Chongqing University, 2017(in Chinese).
    [8]
    杨宁博. 面向环境探测的全向模式移动机器人自主导航研究[D]. 哈尔滨: 哈尔滨工业大学, 2019.

    YANG N B. Research on autonomous navigation of omnidirectional mode mobile robots for environmental detection[D]. Harbin: Harbin Institute of Technology, 2019(in Chinese).
    [9]
    陶睿. 基于深度强化学习的移动机器人导航[D]. 济南: 山东大学, 2020.

    TAO R. Deep reinforcement learning-based navigation for mobile robots[D]. Jinan: Shandong University, 2020(in Chinese).
    [10]
    何聪. 基于深度强化学习的机器人视觉导航算法[D]. 南京: 东南大学, 2021.

    HE C. Robot visual navigation algorithm based on deep reinforcement learning[D]. Nanjing: Southeast University, 2021(in Chinese).
    [11]
    TAI L, PAOLO G, LIU M. Virtual-to-real deep reinforcement learning: Continuous control of mobile robots for mapless navigation[C]//IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). Piscataway: IEEE Press, 2017: 31-36.
    [12]
    SCHULMAN J, WOLSKI F, DHARIWAL P, et al. Proximal policy optimization algorithms[EB/OL]. (2017-08-28)[2021-03-01]. https://arxiv.org/abs/1707.06347.
    [13]
    刘志平, 余前勇, 査剑锋. 空间直角坐标至两类常用坐标的快速变换[J]. 测绘科学, 2015, 40(3): 8-11. https://www.cnki.com.cn/Article/CJFDTOTAL-CHKD201503002.htm

    LIU Z P, YU Q Y, ZHA J F. Fast coordinate transformations for both XYZ-BLH and XYZ-RhA[J]. Science of Surveying and Mapping, 2015, 40(3): 8-11(in Chinese). https://www.cnki.com.cn/Article/CJFDTOTAL-CHKD201503002.htm
    [14]
    刘山洪, 邓彩群. 坐标转换与坐标变换研究[J]. 吉林建筑大学学报, 2016, 33(1): 43-47. https://www.cnki.com.cn/Article/CJFDTOTAL-JLJZ201601011.htm

    LIU S H, DENG C Q. Transformation of coordinate system[J]. Journal of Jilin Jianzhu University, 2016, 33(1): 43-47(in Chinese). https://www.cnki.com.cn/Article/CJFDTOTAL-JLJZ201601011.htm
    [15]
    KHATIB O. Real-time obstacle avoidance system for manipulators and mobile robots[J]. International Journal of Robotics Research, 1986, 5(1): 90-98. doi: 10.1177/027836498600500106
  • 加载中

Catalog

    通讯作者: 陈斌, bchen63@163.com
    • 1. 

      沈阳化工大学材料科学与工程学院 沈阳 110142

    1. 本站搜索
    2. 百度学术搜索
    3. 万方数据库搜索
    4. CNKI搜索

    Figures(10)

    Article Metrics

    Article views(290) PDF downloads(33) Cited by()
    Proportional views
    Related

    /

    DownLoad:  Full-Size Img  PowerPoint
    Return
    Return