Station keeping control for aerostat in wind fields based on deep reinforcement learning

BAI Fangchao (柏方超), YANG Xixiang (杨希祥), DENG Xiaolong (邓小龙), HOU Zhongxi (侯中喜)

Citation: BAI F C, YANG X X, DENG X L, et al. Station keeping control for aerostat in wind fields based on deep reinforcement learning[J]. Journal of Beijing University of Aeronautics and Astronautics, 2024, 50(7): 2354-2366 (in Chinese).

doi: 10.13700/j.bh.1001-5965.2022.0629
Funds: National Natural Science Foundation of China (61903369, 52272445); Natural Science Foundation of Hunan Province (2023JJ10056)
Corresponding author e-mail: nkyangxixiang@163.com
CLC number: V274
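  • Abstract:

    A station-keeping model of a stratospheric aerostat is established. For both the propelled and the unpropelled case, double deep Q-learning (DDQN) with prioritized experience replay is applied to station-keeping control of the aerostat within a Markov decision process framework. The control method is evaluated with metrics such as the average station-keeping radius and the effective station-keeping time ratio. Simulations in a typical wind field show that, for a mission with a station-keeping radius of 50 km and a duration of 3 days, the unpropelled aerostat achieves an average station-keeping radius of 28.16 km and an effective time ratio of 83%. The propelled aerostat achieves an average station-keeping radius of 8.84 km, can be controlled within a station-keeping radius of 20 km, and reaches an effective time ratio of 100%.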
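    The two evaluation metrics named in the abstract can be illustrated with a short sketch. It assumes that the average station-keeping radius is the time-averaged horizontal distance from the station-keeping centre, and that the effective time ratio is the fraction of uniformly spaced samples that lie inside the mission radius. The paper's exact definitions are not given on this page, and the function name and sample track below are hypothetical.

```python
import math


def station_keeping_metrics(xs, ys, radius_km):
    """Return (average radius in km, effective time ratio) for a sampled track.

    xs, ys: east/north offsets from the station-keeping centre, in km,
            sampled at a uniform time step (assumption).
    radius_km: mission station-keeping radius, in km.
    """
    if len(xs) != len(ys) or not xs:
        raise ValueError("xs and ys must be non-empty and equally long")
    # Horizontal distance from the centre at every sample.
    distances = [math.hypot(x, y) for x, y in zip(xs, ys)]
    avg_radius = sum(distances) / len(distances)
    # Fraction of samples inside the mission radius.
    inside = sum(1 for d in distances if d <= radius_km)
    return avg_radius, inside / len(distances)


# Hypothetical four-sample track (not data from the paper).
xs = [0.0, 30.0, 0.0, 60.0]
ys = [0.0, 40.0, 20.0, 0.0]
avg_r, ratio = station_keeping_metrics(xs, ys, radius_km=50.0)
print(avg_r, ratio)
```

    With uniform sampling, the fraction of in-range samples equals the fraction of in-range time, which is why no explicit time vector is needed in this sketch.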

     

  • Figure 1.  Stratospheric aerostat system

    Figure 2.  Schematic diagram of the horizontal transfer strategy of the stratospheric aerostat

    Figure 3.  Flow chart of the control strategy based on fixed-point hovering

    Figure 4.  Agent state transition process

    Figure 5.  Flow diagram of DDQN-based station-keeping control

    Figure 6.  Schematic of wind fields

    Figure 7.  Flight simulation results under fixed-point hovering

    Figure 8.  Flight simulation results under reinforcement learning control

    Figure 9.  Flight simulation results under wind disturbance

    Figure 10.  Flight simulation results with single-channel control in the east-west direction

    Figure 11.  Flight simulation results with single-channel control in the north-south direction

    Figure 12.  Flight simulation results with dual-channel control

    Figure 13.  Average reward during training

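    Table 1.  Environmental state-space parameter settings

    Parameter                                         Range
    Altitude h/km                                     18~22
    East position x/km                                −50~50
    North position y/km                               −50~50
    Ballonet air mass m_air/kg                        0~158
    Wind speed S_w/(m·s^−1)                           (see note)
    Angle between wind direction and position δ       0~π
    Note: The east and north positions are constrained by $\sqrt{x^2 + y^2} \leqslant 50$ (km). The wind speed S_w is determined from the real wind field, and the angle δ is determined from the real wind field and the current position.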
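    As a minimal sketch of how the ranges in Table 1 combine with the circular constraint in its note, the check below tests a state against both. The dictionary keys and the function name are hypothetical, and wind speed is left out because the table gives no fixed range for it.

```python
import math

# Ranges taken from Table 1; the key names are illustrative assumptions.
STATE_RANGES = {
    "h_km": (18.0, 22.0),         # altitude
    "x_km": (-50.0, 50.0),        # east position
    "y_km": (-50.0, 50.0),        # north position
    "m_air_kg": (0.0, 158.0),     # ballonet air mass
    "delta_rad": (0.0, math.pi),  # angle between wind direction and position
}
MAX_RADIUS_KM = 50.0  # sqrt(x^2 + y^2) <= 50 from the table note


def is_valid_state(state):
    """Return True if the state satisfies every range and the radius constraint."""
    for key, (low, high) in STATE_RANGES.items():
        if not low <= state[key] <= high:
            return False
    return math.hypot(state["x_km"], state["y_km"]) <= MAX_RADIUS_KM


# A corner of the x-y box passes each range check but violates the radius limit.
corner = {"h_km": 20.0, "x_km": 40.0, "y_km": 40.0, "m_air_kg": 67.58, "delta_rad": 1.0}
centre = {"h_km": 20.0, "x_km": 0.0, "y_km": 0.0, "m_air_kg": 67.58, "delta_rad": 1.0}
print(is_valid_state(corner), is_valid_state(centre))
```

    The example shows why the note matters: the per-axis ranges describe a square, while the admissible region is the disc inscribed in it.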

    Table 2.  Action space of the stratospheric aerostat without a propulsion system

    Action    Description
    a1        Valve venting
    a2        Valve closed
    a3        Blower intake

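    Table 3.  Action space of the stratospheric aerostat with a single-channel propulsion system in the east-west direction

    Action    Description
    a1        Valve venting, propeller thrusting east
    a2        Valve venting, propeller thrusting west
    a3        Valve venting, propeller off
    a4        Valve closed, propeller thrusting east
    a5        Valve closed, propeller thrusting west
    a6        Valve closed, propeller off
    a7        Blower intake, propeller thrusting east
    a8        Blower intake, propeller thrusting west
    a9        Blower intake, propeller off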

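    Table 4.  Action space of the stratospheric aerostat with a single-channel propulsion system in the north-south direction

    Action    Description
    a1        Valve venting, propeller thrusting north
    a2        Valve venting, propeller thrusting south
    a3        Valve venting, propeller off
    a4        Valve closed, propeller thrusting north
    a5        Valve closed, propeller thrusting south
    a6        Valve closed, propeller off
    a7        Blower intake, propeller thrusting north
    a8        Blower intake, propeller thrusting south
    a9        Blower intake, propeller off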

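    Table 5.  Action space of the stratospheric aerostat with a dual-channel propulsion system

    Action    Description
    a1        Valve venting, propeller toward north
    a2        Valve venting, propeller toward east
    a3        Valve venting, propeller toward south
    a4        Valve venting, propeller toward west
    a5        Valve venting, propeller toward northeast
    a6        Valve venting, propeller toward southeast
    a7        Valve venting, propeller toward southwest
    a8        Valve venting, propeller toward northwest
    a9        Valve venting, propeller off
    a10       Valve closed, propeller toward north
    a11       Valve closed, propeller toward east
    a12       Valve closed, propeller toward south
    a13       Valve closed, propeller toward west
    a14       Valve closed, propeller toward northeast
    a15       Valve closed, propeller toward southeast
    a16       Valve closed, propeller toward southwest
    a17       Valve closed, propeller toward northwest
    a18       Valve closed, propeller off
    a19       Blower intake, propeller toward north
    a20       Blower intake, propeller toward east
    a21       Blower intake, propeller toward south
    a22       Blower intake, propeller toward west
    a23       Blower intake, propeller toward northeast
    a24       Blower intake, propeller toward southeast
    a25       Blower intake, propeller toward southwest
    a26       Blower intake, propeller toward northwest
    a27       Blower intake, propeller off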
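    The 27 actions of Table 5 are the Cartesian product of the three altitude-control actions and nine propeller commands. A minimal sketch of that enumeration follows. The string labels and the function name are hypothetical, and a4 is assumed to be the westward command, consistent with a13 and a22.

```python
from itertools import product

# Three altitude-control actions (the outer loop in Table 5).
ALTITUDE_ACTIONS = ["vent", "closed", "intake"]
# Nine propeller commands in the row order of Table 5 (a4 assumed to be west).
PROPELLER_ACTIONS = ["N", "E", "S", "W", "NE", "SE", "SW", "NW", "off"]

# product() varies the last factor fastest, which matches the table ordering.
ACTIONS = list(product(ALTITUDE_ACTIONS, PROPELLER_ACTIONS))


def decode_action(index):
    """Map a 1-based action index (a1..a27) to an (altitude, propeller) pair."""
    if not 1 <= index <= len(ACTIONS):
        raise ValueError("action index must be between 1 and %d" % len(ACTIONS))
    return ACTIONS[index - 1]


print(len(ACTIONS), decode_action(13), decode_action(27))
```

    Restricting PROPELLER_ACTIONS to ["E", "W", "off"] or ["N", "S", "off"] reproduces the 9-action single-channel spaces of Tables 3 and 4 in the same way.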

    Table 6.  DDQN algorithm parameter settings

    Training parameter                                   Value
    Batch size N_b                                       512
    Maximum number of training episodes N_max            2×10^4
    Replay memory size M                                 2×10^6
    Learning rate                                        0.001
    Reward bias                                          −0.1
    ε-greedy decay parameter ${\varepsilon _{{\text{dec}}}}$    0.01
    Initial ε                                            0.98
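    For readers unfamiliar with the algorithm that these parameters tune, the sketch below shows the standard Double DQN target and the proportional sampling rule of prioritized experience replay. It is a generic illustration rather than the authors' implementation. The discount factor gamma and the priority exponent alpha are not listed in Table 6 and are assumed values, and all function names are hypothetical.

```python
def double_dqn_target(reward, next_q_online, next_q_target, done, gamma=0.99):
    """Double DQN target for one transition.

    The online network selects the next action and the target network
    evaluates it, which reduces the overestimation of plain DQN.
    gamma is an assumed discount factor (not given in Table 6).
    """
    if done:
        return reward
    best_action = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    return reward + gamma * next_q_target[best_action]


def sampling_probabilities(priorities, alpha=0.6):
    """Proportional prioritized replay: P(i) = p_i^alpha / sum_k p_k^alpha.

    alpha is an assumed exponent; alpha = 0 gives uniform sampling.
    """
    scaled = [p ** alpha for p in priorities]
    total = sum(scaled)
    return [s / total for s in scaled]


# Hypothetical Q-values for a 3-action space (as in Table 2).
y = double_dqn_target(
    reward=1.0,
    next_q_online=[1.0, 3.0, 2.0],  # online network prefers action index 1
    next_q_target=[0.5, 1.0, 4.0],  # target network scores that action as 1.0
    done=False,
    gamma=0.9,
)
probs = sampling_probabilities([1.0, 1.0, 2.0], alpha=1.0)
print(y, probs)
```

    In the example, plain DQN would bootstrap from the largest target-network value (4.0), whereas Double DQN uses the value of the action chosen by the online network (1.0).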

    Table 7.  Stratospheric aerostat parameters

    Parameter                    Value
    Envelope radius/m            8.7
    Envelope volume/m^3          2780
    Envelope total mass/kg       48
    System total mass/kg         177.2
    Number of valves             1
    Valve radius/m               0.04
    Operating altitude/km        18~22

    Table 8.  Initial state of the stratospheric aerostat

    State variable               Value
    Altitude ${h_0}$/km          20
    x position x_0/km            0
    y position y_0/km            0
    Initial air mass/kg          67.58
    Start time                   2021-08-03T0
    End time                     2021-08-06T0
Publication history
  • Received: 2022-07-19
  • Accepted: 2022-12-09
  • Published online: 2022-12-26
  • Issue published: 2024-07-18
