Software defined satellite attitude control algorithm based on deep reinforcement learning

XU Ke; WU Fengge; ZHAO Junsuo

doi:10.13700/j.bh.1001-5965.2018.0357

Volume 44 Issue 12

Dec. 2018

Journal of Beijing University of Aeronautics and Astronautics > 2018 > 44(12): 2651-2659.

Citation: XU Ke, WU Fengge, ZHAO Junsuo, et al. Software defined satellite attitude control algorithm based on deep reinforcement learning[J]. Journal of Beijing University of Aeronautics and Astronautics, 2018, 44(12): 2651-2659. doi: 10.13700/j.bh.1001-5965.2018.0357 (in Chinese)

Institute of Software, Chinese Academy of Sciences, Beijing 100190, China

Corresponding author: WU Fengge, E-mail: fengge@iscas.ac.cn
Received Date: 13 Jun 2018
Accepted Date: 14 Aug 2018
Publish Date: 20 Dec 2018

Abstract

Deep reinforcement learning (DRL) is a recent class of machine-learning-based control methods that has shown outstanding performance in robotics and unmanned aerial vehicles. In satellite attitude control, by contrast, the traditional PID control algorithm is still widely used. As satellites become smaller and more intelligent and software defined satellites emerge, traditional control methods find it increasingly hard to meet the requirements for adaptability, autonomy, and robustness. To address these problems, an attitude control algorithm based on deep reinforcement learning is proposed. It is a model-based algorithm, which converges much faster than model-free algorithms. Compared with traditional methods, the algorithm requires no prior knowledge of the satellite's physical or orbital parameters and offers better adaptability and autonomy, which allows a software defined satellite to adapt to different hardware environments and to be developed and deployed much faster. Furthermore, by introducing a target network and a parallelized heuristic search algorithm, the proposed algorithm achieves higher network accuracy and faster computation. Simulation experiments verify these improvements.
Keywords:
- reinforcement learning
- deep learning
- intelligent control
- satellite attitude control
- software defined satellite
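
The abstract credits part of the accuracy gain to a target network but does not say how that network is updated. As a rough illustration only, the sketch below shows one common scheme, the soft (Polyak) update, in which the target weights move a small fraction `tau` toward the online weights after each training step. The function name `soft_update`, the flat weight lists, and the value of `tau` are assumptions made for this example and are not taken from the paper.

```python
from typing import List


def soft_update(target: List[float], online: List[float], tau: float = 0.01) -> List[float]:
    """Return new target weights: (1 - tau) * target + tau * online.

    Hypothetical helper for illustration; the paper's actual update rule
    is not given in the abstract.
    """
    if len(target) != len(online):
        raise ValueError("weight vectors must have the same length")
    return [(1.0 - tau) * t + tau * w for t, w in zip(target, online)]


# Toy weights standing in for network parameters (assumed values).
online_weights = [1.0, -2.0, 0.5]
target_weights = [0.0, 0.0, 0.0]

# Three updates with a deliberately large tau so the drift is easy to see.
for _ in range(3):
    target_weights = soft_update(target_weights, online_weights, tau=0.5)

print(target_weights)
```

Because the target weights lag behind the online weights, the values used to build training targets change slowly, which is the usual motivation for keeping a separate target network. A smaller `tau` gives a more stable but slower-moving target.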
