Intelligent scheduling method for space-based optical sensors based on heterogeneous graph

PENG Houwu (彭厚吾); TIAN Jingfeng (田景峰); WU Xiaojie (吴晓杰); MA Heyang (马鹤洋)

doi: 10.13700/j.bh.1001-5965.2023.0706

1. Institute of Remote Sensing Satellite, China Academy of Space Technology, Beijing 100094, China
2. School of Mechanical Engineering, Beijing Institute of Technology, Beijing 100081, China

Corresponding author: E-mail: tutortianjingfeng@gmail.com

CLC number: V443⁺.5; TP212.6

Publication history
- Received: 2023-10-31
- Accepted: 2024-02-29
- Published online: 2024-03-12
- Issue published: 2025-12-31

Abstract

As the orbital environment grows more complex, scheduling limited space-based optical sensor resources online so as to observe more space targets has become an urgent problem. To address it, an intelligent sensor scheduling method based on heterogeneous graphs is proposed. The method models the process of solving for a sensor scheduling scheme as a Markov decision process (MDP) and represents the scheduling scheme as a heterogeneous graph composed of three types of nodes: targets, sensors, and visible time windows. A heterogeneous graph neural network encodes this graph, and the sensor scheduling agent is trained by reinforcement learning. An ablation study shows that the proposed heterogeneous graph neural network encoder achieves a 7.5% higher policy return than a multi-layer perceptron encoder. The proposed method is compared with a simulated annealing method and a potential game method in simulation scenarios with different numbers of targets and sensors. The results show that, compared with the two baseline methods, the proposed method better balances online solving speed and target capacity.

Keywords:
- deep reinforcement learning
- remote sensing satellite
- sensor network
- graph neural network
- multi-target tracking
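
The abstract describes a heterogeneous graph with three node types (targets, sensors, and visible time windows). The following is a minimal sketch of how such a structure might be held in memory, not the authors' implementation. The class names, the edge lists, and in particular the conflict rule (two windows on the same sensor that overlap in time) are assumptions made for illustration; the paper's actual edge definitions and node features are not given on this page.

```python
from dataclasses import dataclass, field
from itertools import combinations


@dataclass(frozen=True)
class Window:
    """A visible time window: `sensor` can observe `target` during [start, end)."""
    target: str
    sensor: str
    start: float
    end: float


@dataclass
class HeteroGraph:
    """Three node sets plus typed edge lists (all names are hypothetical)."""
    targets: set = field(default_factory=set)
    sensors: set = field(default_factory=set)
    windows: list = field(default_factory=list)
    target_window: list = field(default_factory=list)  # (target name, window index)
    sensor_window: list = field(default_factory=list)  # (sensor name, window index)
    conflicts: list = field(default_factory=list)      # (window index, window index)


def build_graph(windows):
    """Build the graph from a list of visible windows."""
    g = HeteroGraph()
    for i, w in enumerate(windows):
        g.targets.add(w.target)
        g.sensors.add(w.sensor)
        g.windows.append(w)
        g.target_window.append((w.target, i))
        g.sensor_window.append((w.sensor, i))
    # Assumed conflict rule: same sensor and overlapping time intervals.
    for i, j in combinations(range(len(windows)), 2):
        a, b = windows[i], windows[j]
        if a.sensor == b.sensor and a.start < b.end and b.start < a.end:
            g.conflicts.append((i, j))
    return g


# Made-up example data: sensor S1 has two overlapping windows.
windows = [
    Window("T1", "S1", 0.0, 10.0),
    Window("T2", "S1", 5.0, 15.0),
    Window("T2", "S2", 20.0, 30.0),
]
graph = build_graph(windows)
```

In this toy example the only conflict is between the two windows on sensor S1, so a scheduler would have to drop or shorten one of them; target T2 could still be observed through its window on S2.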

Figure 1. Decision process of space-based optical sensor scheduling

Figure 2. Example of the heterogeneous graph before conflict resolution

Figure 3. Example of the heterogeneous graph after conflict resolution

Figure 4. Example of the heterogeneous graph representing the final scheduling result

Figure 5. Schematic diagram of the agent model

Figure 6. Architecture of the training environment for the space-based optical sensor intelligent scheduling method

Figure 7. Training curves of the experimental group and the control group

Figure 8. Influence of the number of space targets on the objective function values of the three methods

Figure 9. Influence of the number of space targets on the average online solving time of the three methods

Figure 10. Influence of the number of space-based optical sensors on the objective function values of the three methods

Figure 11. Influence of the number of space-based optical sensors on the average online solving time of the three methods

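Table 1. Parameter settings

Parameter	Value
Learning rate	0.0001
Discount factor	1
Reference accuracy (km)	10
Node embedding dimension	64
Number of encoder network layers	2
Number of critic network layers	2
Number of actor network layers	2
Size of a single multi-layer perceptron layer	192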
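
As a rough illustration of how the Table 1 settings might be carried in code, the sketch below collects them in a configuration object and shows what a discount factor of 1 implies: the return of an episode is simply the undiscounted sum of its rewards. The field names, the `discounted_return` helper, and the reward values are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingConfig:
    """Values copied from Table 1; the field names are hypothetical."""
    learning_rate: float = 1e-4
    discount_factor: float = 1.0
    reference_accuracy_km: float = 10.0
    node_embedding_dim: int = 64
    encoder_layers: int = 2
    critic_layers: int = 2
    actor_layers: int = 2
    mlp_layer_size: int = 192


def discounted_return(rewards, gamma):
    """Return sum_t gamma**t * rewards[t], accumulated from the last step back."""
    total = 0.0
    for r in reversed(rewards):
        total = r + gamma * total
    return total


config = TrainingConfig()
episode_rewards = [1.0, 0.0, 2.0]  # made-up rewards, for illustration only
undiscounted = discounted_return(episode_rewards, config.discount_factor)
```

With the discount factor set to 1, `undiscounted` equals the plain sum of the episode rewards, so early and late scheduling decisions are weighted equally.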
