Altitude control of stratospheric aerostat based on deep reinforcement learning

ZHANG Jinglun, YANG Xixiang, DENG Xiaolong, GUO Zheng, ZHAI Jiaqi

Citation: ZHANG J L,YANG X X,DENG X L,et al. Altitude control of stratospheric aerostat based on deep reinforcement learning[J]. Journal of Beijing University of Aeronautics and Astronautics,2023,49(8):2062-2070 (in Chinese) doi: 10.13700/j.bh.1001-5965.2021.0622

doi: 10.13700/j.bh.1001-5965.2021.0622
Funds: National Natural Science Foundation of China (52272445); Hunan Provincial Natural Science Foundation (2023JJ100056)
Corresponding author:
    E-mail: nkyangxixiang@163.com

  • CLC number: V472; TB553

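  • Abstract:

    This paper studies altitude control of stratospheric aerostats using deep reinforcement learning. A dynamic model of the stratospheric aerostat is established, and an altitude control method based on the deep Q-network (DQN) algorithm is proposed. The aerostat's current velocity, position, and altitude error form the agent's observed state; the on/off duration of the ballonet blower is the agent's output action; and the nonlinear dynamic model of the aerostat together with a disturbed wind field is the agent's learning environment. The proposed method casts altitude control as a reinforcement learning process with continuous states and continuous actions under unknown transition probabilities. It accounts for both random wind-field disturbances and constraints on velocity change, and achieves stable variable-altitude control. Simulation results show that, with the effect of the wind field on the aerostat taken into account, the DQN controller tracks a varying altitude command well, with a maximum steady-state error of about 10 m. Compared with a conventional proportional-integral-derivative (PID) controller, its control performance and robustness are better.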
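    The two building blocks that any DQN-style controller relies on, ε-greedy action selection and the one-step Q-learning target, can be sketched as follows. This is a minimal illustration, not the authors' implementation. DQN selects among a finite set of actions, so the sketch assumes the blower on-time is discretized; the list `BLOWER_ON_TIMES_S`, its values, and the function names are hypothetical and do not come from the paper.

```python
import random

# Hypothetical discrete action set: ballonet blower on-time (s) per control step.
# These values are assumptions for illustration only.
BLOWER_ON_TIMES_S = [0.0, 2.0, 5.0, 10.0]


def epsilon_greedy(q_values, epsilon, rng=random):
    """Return a random action index with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda i: q_values[i])


def td_target(reward, next_q_values, gamma, done):
    """One-step Q-learning target: r + gamma * max_a' Q(s', a'), or r at episode end."""
    if done:
        return reward
    return reward + gamma * max(next_q_values)


if __name__ == "__main__":
    # Q-value estimates for one observed state (velocity, position, altitude error).
    q = [0.1, 0.9, 0.3, 0.2]
    action = epsilon_greedy(q, epsilon=0.1)
    print("blower on-time (s):", BLOWER_ON_TIMES_S[action])
    print("TD target:", td_target(reward=1.0, next_q_values=q, gamma=0.9, done=False))
```

    In a full DQN the Q-values would come from a neural network and the target would be computed from a separate target network; both are omitted here to keep the sketch short.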

     

  • Figure 1.  Stratospheric aerostat system

    Figure 2.  Schematic diagram of stratospheric aerostat exposed to solar radiation

    Figure 3.  Principle of stratospheric aerostat altitude control

    Figure 4.  Basic principle of reinforcement learning

    Figure 5.  Flow chart of the $\varepsilon$-greedy method

    Figure 6.  Principle of DQN algorithm training process

    Figure 7.  Effect of the PID altitude controller under fixed and changing wind field environments

    Figure 8.  Altitude control of stratospheric aerostat based on DQN

    Figure 9.  Altitude control error under different controllers

    Figure 10.  Change of the reward value function

    Figure 11.  Control results at different initial altitudes

    Table  1.   Control results of agent DQN algorithm

    Flight segment/s    Settling time/s    Overshoot/m    Steady-state error/m
    0~250109.6
    250~45010011.2
    450~650684.5
    650~850766
    850~1 0006110.5

    Table  2.   Control results of agent Q-learning algorithm

    Flight segment/s    Settling time/s    Overshoot/m    Steady-state error/m
    0~25049
    250~450
    450~65153
    650~850531310.8
    850~1 0009714.3
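    The three quantities reported in Tables 1 and 2 can be computed from a simulated altitude trace roughly as sketched below. The definitions used here (a fixed tolerance band for "settled", overshoot measured in the direction of travel) are common control-engineering conventions and are assumptions; the page does not state which definitions the authors used. The function name `step_metrics` and the `band` parameter are hypothetical.

```python
def step_metrics(times, altitudes, target, band):
    """Settling time, overshoot, and steady-state error for one tracking segment.

    band is an assumed tolerance (m) around target that defines 'settled'.
    Returns (settling_time, overshoot, steady_state_error); the first and last
    are None if the trace never settles inside the band.
    """
    errors = [h - target for h in altitudes]

    # Walk backwards to find the first sample after which the error stays in the band.
    settle_idx = None
    for i in range(len(errors) - 1, -1, -1):
        if abs(errors[i]) > band:
            break
        settle_idx = i
    settling_time = None if settle_idx is None else times[settle_idx] - times[0]

    # Overshoot: largest excursion past the target in the direction of travel.
    direction = 1.0 if target >= altitudes[0] else -1.0
    overshoot = max(0.0, max(direction * e for e in errors))

    # Steady-state error: largest absolute error once settled.
    steady = None if settle_idx is None else max(abs(e) for e in errors[settle_idx:])
    return settling_time, overshoot, steady


if __name__ == "__main__":
    # Made-up trace for illustration; not data from the paper.
    t = [0, 10, 20, 30, 40]
    h = [18000, 18060, 18110, 18095, 18102]
    print(step_metrics(t, h, target=18100, band=15))
```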
Publication history
  • Received:  2021-10-22
  • Accepted:  2021-12-06
  • Published online:  2021-12-30
  • Issue published:  2023-08-31
