Station keeping control for aerostat in wind fields based on deep reinforcement learning
Abstract: A station-keeping model for a stratospheric aerostat is established. Based on a Markov decision process, double deep Q-learning (DDQN) with prioritized experience replay is applied to station-keeping control of the aerostat, both with and without powered propulsion. The control method is evaluated using the average station-keeping radius and the station-keeping effective time ratio. Simulations in a typical wind field show that, for a mission with a station-keeping radius of 50 km and a station-keeping time of three days, the unpowered aerostat achieves an average station-keeping radius of 28.16 km and a station-keeping effective time ratio of 83%. With powered propulsion, the average station-keeping radius is markedly smaller: the powered aerostat attains an average station-keeping radius of 8.84 km, can achieve flight control within a station-keeping radius of 20 km, and has a station-keeping effective time ratio of 100%.
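The two evaluation metrics named in the abstract can be illustrated with a short sketch. The abstract does not give their formulas, so the definitions below are assumptions: the average station-keeping radius is taken as the mean horizontal distance from the station centre over uniformly sampled track points, and the effective time ratio as the fraction of samples lying within the mission radius. The function name `station_keeping_metrics` and the sample track are hypothetical.

```python
import math


def station_keeping_metrics(track, mission_radius_km):
    """Return (average radius in km, effective time ratio) for a track.

    track: list of (x_km, y_km) positions relative to the station centre,
           assumed to be sampled at uniform time steps.
    Both metric definitions are assumptions, not taken from the paper.
    """
    if not track:
        raise ValueError("track must contain at least one sample")
    # Horizontal distance of each sample from the station centre.
    radii = [math.hypot(x, y) for x, y in track]
    avg_radius = sum(radii) / len(radii)
    # Fraction of samples that stay inside the mission radius.
    inside = sum(1 for r in radii if r <= mission_radius_km)
    return avg_radius, inside / len(radii)


# Hypothetical four-sample track, evaluated against a 50 km mission radius.
track = [(0.0, 0.0), (30.0, 40.0), (0.0, 60.0), (6.0, 8.0)]
avg_radius, ratio = station_keeping_metrics(track, mission_radius_km=50.0)
print(avg_radius, ratio)
```

With non-uniform sampling, each sample would need to be weighted by its time interval instead of counted equally.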
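Table 1. Environmental state space parameter settings

| Parameter | Range |
| --- | --- |
| Altitude $h$/km | 18~22 |
| Eastward position $x$/km | −50~50 |
| Northward position $y$/km | −50~50 |
| Ballonet air mass $m_{\text{air}}$/kg | 0~158 |
| Wind speed $S_{\text{w}}$/(m·s$^{-1}$) | − |
| Angle between wind direction and position $\delta$ | 0~π |

Note: The eastward and northward positions are constrained by $\sqrt{x^2 + y^2} \leqslant 50$. The wind speed $S_{\text{w}}$ is determined by the real wind field, and the angle $\delta$ between the wind direction and the position is determined by the real wind field and the current position.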
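The state-space limits of Table 1 can be expressed as a simple validity check. This is a minimal sketch: the function names are hypothetical, and the definition of $\delta$ as the unsigned angle between the horizontal wind vector and the position vector from the station centre is an assumption, since the table only states that $\delta$ depends on the wind field and the current position.

```python
import math

# Ranges taken from Table 1 (wind speed has no fixed range).
BOUNDS = {
    "h_km": (18.0, 22.0),
    "x_km": (-50.0, 50.0),
    "y_km": (-50.0, 50.0),
    "m_air_kg": (0.0, 158.0),
}
MAX_RADIUS_KM = 50.0  # sqrt(x^2 + y^2) <= 50


def in_state_space(h_km, x_km, y_km, m_air_kg):
    """True if the state respects every range and the radial constraint."""
    values = {"h_km": h_km, "x_km": x_km, "y_km": y_km, "m_air_kg": m_air_kg}
    within = all(lo <= values[k] <= hi for k, (lo, hi) in BOUNDS.items())
    return within and math.hypot(x_km, y_km) <= MAX_RADIUS_KM


def wind_position_angle(wind_east, wind_north, x_km, y_km):
    """Unsigned angle in [0, pi] between wind and position vectors.

    Assumed definition; returns 0.0 when either vector has zero length.
    """
    norm = math.hypot(wind_east, wind_north) * math.hypot(x_km, y_km)
    if norm == 0.0:
        return 0.0
    cos_delta = (wind_east * x_km + wind_north * y_km) / norm
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.acos(max(-1.0, min(1.0, cos_delta)))


print(in_state_space(20.0, 40.0, 40.0, 67.58))  # corner of the x-y box
```

The corner point (40, 40) satisfies both per-axis ranges but lies about 56.6 km from the centre, so the radial constraint is the binding one there.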
Table 2. Action space of the stratospheric aerostat without a propulsion system

| Action | Description |
| --- | --- |
| a1 | Valve venting |
| a2 | Valve closed |
| a3 | Blower intake |
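Table 3. Action space of the stratospheric aerostat with a single-channel propulsion system in the east-west direction

| Action | Description |
| --- | --- |
| a1 | Valve venting, propeller thrust eastward |
| a2 | Valve venting, propeller thrust westward |
| a3 | Valve venting, propeller off |
| a4 | Valve closed, propeller thrust eastward |
| a5 | Valve closed, propeller thrust westward |
| a6 | Valve closed, propeller off |
| a7 | Blower intake, propeller thrust eastward |
| a8 | Blower intake, propeller thrust westward |
| a9 | Blower intake, propeller off |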
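Table 4. Action space of the stratospheric aerostat with a single-channel propulsion system in the north-south direction

| Action | Description |
| --- | --- |
| a1 | Valve venting, propeller thrust northward |
| a2 | Valve venting, propeller thrust southward |
| a3 | Valve venting, propeller off |
| a4 | Valve closed, propeller thrust northward |
| a5 | Valve closed, propeller thrust southward |
| a6 | Valve closed, propeller off |
| a7 | Blower intake, propeller thrust northward |
| a8 | Blower intake, propeller thrust southward |
| a9 | Blower intake, propeller off |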
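Table 5. Action space of the stratospheric aerostat with a dual-channel propulsion system

| Action | Description |
| --- | --- |
| a1 | Valve venting, propeller north |
| a2 | Valve venting, propeller east |
| a3 | Valve venting, propeller south |
| a4 | Valve venting, propeller west |
| a5 | Valve venting, propeller northeast |
| a6 | Valve venting, propeller southeast |
| a7 | Valve venting, propeller southwest |
| a8 | Valve venting, propeller northwest |
| a9 | Valve venting, propeller off |
| a10 | Valve closed, propeller north |
| a11 | Valve closed, propeller east |
| a12 | Valve closed, propeller south |
| a13 | Valve closed, propeller west |
| a14 | Valve closed, propeller northeast |
| a15 | Valve closed, propeller southeast |
| a16 | Valve closed, propeller southwest |
| a17 | Valve closed, propeller northwest |
| a18 | Valve closed, propeller off |
| a19 | Blower intake, propeller north |
| a20 | Blower intake, propeller east |
| a21 | Blower intake, propeller south |
| a22 | Blower intake, propeller west |
| a23 | Blower intake, propeller northeast |
| a24 | Blower intake, propeller southeast |
| a25 | Blower intake, propeller southwest |
| a26 | Blower intake, propeller northwest |
| a27 | Blower intake, propeller off |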
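The 27 dual-channel actions form the Cartesian product of three altitude-control actions and nine propeller settings. The sketch below enumerates them in the order a1 to a27, assuming the cardinal directions run north, east, south, west within each group, as they do for a10 to a13 and a19 to a22. The short labels and the helper `decode_action` are hypothetical names introduced for illustration.

```python
from itertools import product

# Altitude-control actions: valve venting, valve closed, blower intake.
ALTITUDE_ACTIONS = ("vent", "closed", "intake")
# Propeller settings in the order used by the table.
PROPELLER_ACTIONS = ("N", "E", "S", "W", "NE", "SE", "SW", "NW", "off")

# product() varies the last factor fastest, which matches the table:
# nine propeller settings for "vent", then nine for "closed", then "intake".
DUAL_CHANNEL_ACTIONS = list(product(ALTITUDE_ACTIONS, PROPELLER_ACTIONS))


def decode_action(index):
    """Map a 1-based action index (a1..a27) to (altitude, propeller)."""
    if not 1 <= index <= len(DUAL_CHANNEL_ACTIONS):
        raise ValueError(f"action index out of range: {index}")
    return DUAL_CHANNEL_ACTIONS[index - 1]


print(len(DUAL_CHANNEL_ACTIONS))
print(decode_action(13))
```

The single-channel tables follow the same pattern with three propeller settings each, giving 3 × 3 = 9 actions.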
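Table 6. DDQN algorithm parameter settings

| Training parameter | Value |
| --- | --- |
| Batch size $N_{\text{b}}$ | 512 |
| Maximum number of training episodes $N_{\max}$ | $2 \times 10^4$ |
| Replay memory size $M$ | $2 \times 10^6$ |
| Learning rate | 0.001 |
| Reward bias | −0.1 |
| ε-greedy decay parameter $\varepsilon_{\text{dec}}$ | 0.01 |
| Initial value of ε | 0.98 |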
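The following sketch shows how the ε-greedy parameters of Table 6 might enter a double deep Q-learning update with prioritized replay. Several choices are assumptions not stated in the table: the multiplicative form of the ε decay, the floor `EPS_MIN`, the discount factor `GAMMA`, and the priority exponent `alpha`. The Q-value arrays stand in for the outputs of the online and target networks, and all function names are hypothetical.

```python
import numpy as np

EPS_INIT = 0.98   # initial epsilon (Table 6)
EPS_DEC = 0.01    # epsilon decay parameter (Table 6)
EPS_MIN = 0.01    # assumption: floor for epsilon
GAMMA = 0.99      # assumption: discount factor


def decayed_epsilon(episode):
    """Assumed multiplicative decay: eps = eps0 * (1 - dec) ** episode."""
    return max(EPS_MIN, EPS_INIT * (1.0 - EPS_DEC) ** episode)


def select_action(q_values, epsilon, rng):
    """Epsilon-greedy choice over a 1-D array of Q-values."""
    if rng.random() < epsilon:
        return int(rng.integers(len(q_values)))  # explore
    return int(np.argmax(q_values))              # exploit


def double_dqn_targets(rewards, dones, q_online_next, q_target_next,
                       gamma=GAMMA):
    """Double DQN target: the online net selects, the target net evaluates."""
    best = np.argmax(q_online_next, axis=1)
    next_q = q_target_next[np.arange(len(best)), best]
    # Terminal transitions (dones == 1) get no bootstrap term.
    return rewards + gamma * (1.0 - dones) * next_q


def per_probabilities(td_errors, alpha=0.6, eps=1e-6):
    """Proportional prioritized replay: P(i) = p_i^alpha / sum_k p_k^alpha."""
    priorities = (np.abs(td_errors) + eps) ** alpha
    return priorities / priorities.sum()


rng = np.random.default_rng(0)
print(select_action(np.array([0.1, 0.9, 0.3]), epsilon=0.0, rng=rng))
```

Decoupling action selection from action evaluation is what distinguishes double Q-learning from the standard target, which takes the maximum of the target network directly and tends to overestimate values.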
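Table 7. Stratospheric aerostat parameters

| Parameter | Value |
| --- | --- |
| Envelope radius/m | 8.7 |
| Envelope volume/m$^3$ | 2780 |
| Envelope total mass/kg | 48 |
| Total system mass/kg | 177.2 |
| Number of valves | 1 |
| Valve radius/m | 0.04 |
| Operating altitude/km | 18~22 |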
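Table 8. Initial state of the stratospheric aerostat

| State variable | Value |
| --- | --- |
| Altitude $h_0$/km | 20 |
| x position $x_0$/km | 0 |
| y position $y_0$/km | 0 |
| Initial air mass/kg | 67.58 |
| Start time | 2021-08-03T0 |
| End time | 2021-08-06T0 |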