Time-triggered communication scheduling method based on reinforcement learning

LI Haoruo; HE Feng; ZHENG Zhong; LI Ershuai; XIONG Huagang

doi:10.13700/j.bh.1001-5965.2018.0789

Volume 45 Issue 9

Sep. 2019

Journal of Beijing University of Aeronautics and Astronautics > 2019 > 45(9): 1894-1901.

LI Haoruo, HE Feng, ZHENG Zhong, et al. Time-triggered communication scheduling method based on reinforcement learning[J]. Journal of Beijing University of Aeronautics and Astronautics, 2019, 45(9): 1894-1901. doi: 10.13700/j.bh.1001-5965.2018.0789 (in Chinese)

School of Electronic and Information Engineering, Beihang University, Beijing 100083, China

Funds:

National Natural Science Foundation of China 61301086

National Natural Science Foundation of China 71701020

Open Fund of Tianjin Civil Aircraft Airworthiness and Maintenance Key Laboratory of Civil Aviation University of China 2017SW02

Corresponding author: HE Feng, E-mail: robinleo@buaa.edu.cn
Received Date: 02 Jan 2019
Accepted Date: 15 Mar 2019
Publish Date: 20 Sep 2019

Abstract

Time-triggered communication is expected to be adopted more widely in future avionics systems to guarantee deterministic information exchange. A sound scheduling design is the key to applying time-triggered communication in avionics interconnection systems. For the periodic tasks of time-triggered scheduling, we propose a method for generating a periodic schedule table based on reinforcement learning. First, the traffic scheduling task is transformed into a tree search problem, which has the Markov property required by reinforcement learning. Then, a neural-network-based reinforcement learning algorithm explores the schedule, and the schedule is optimized by shortening the waiting time. Once training is complete, the model can be applied directly to tasks with a similar message distribution. Compared with methods that solve the time-triggered schedule using satisfiability modulo theories (SMT), e.g., with Yices, the proposed method does not suffer from the undetermined-result problem and guarantees the correctness and optimization of the time-triggered scheduling design. For a large network with 1,000 messages, the proposed method computes dozens of times faster than the SMT method, and the end-to-end delay of messages under the generated schedule is less than 1% of that obtained with the SMT method, which greatly improves the timeliness of message transmission.
Keywords:
- time-triggered
- scheduling method
- reinforcement learning
- tree search
- offset time
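
The tree-search formulation described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the neural-network-guided reinforcement learning policy is replaced here by a plain depth-first search that tries small offsets first, and the single shared link, the `(period, length)` message format in time slots, and all function names are assumptions made for illustration. Each tree level fixes the offset of one periodic message, and the state (the current depth plus the set of occupied slots) is sufficient to decide the remaining offsets.

```python
from functools import reduce
from math import gcd
from typing import FrozenSet, List, Optional, Tuple

Message = Tuple[int, int]  # (period, length) in time slots; hypothetical format


def hyperperiod(messages: List[Message]) -> int:
    """Least common multiple of all message periods."""
    return reduce(lambda a, b: a * b // gcd(a, b), (p for p, _ in messages), 1)


def occupied_slots(period: int, length: int, offset: int, horizon: int) -> FrozenSet[int]:
    """Slots used by every instance of one message within the hyperperiod."""
    return frozenset(
        start + k
        for start in range(offset, horizon, period)
        for k in range(length)
    )


def search(
    messages: List[Message],
    horizon: int,
    depth: int = 0,
    busy: FrozenSet[int] = frozenset(),
    offsets: Tuple[int, ...] = (),
) -> Optional[List[int]]:
    """Depth-first search over the offset tree; returns None if infeasible."""
    if depth == len(messages):
        return list(offsets)
    period, length = messages[depth]
    # Small offsets first, so each message waits as little as possible.
    # The upper bound keeps every frame inside its own period.
    for offset in range(period - length + 1):
        slots = occupied_slots(period, length, offset, horizon)
        if slots & busy:
            continue  # collides with an already scheduled message
        result = search(messages, horizon, depth + 1, busy | slots, offsets + (offset,))
        if result is not None:
            return result
    return None  # dead end: backtrack to the previous level


def schedule(messages: List[Message]) -> Optional[List[int]]:
    """Find one collision-free offset per message on a single shared link."""
    return search(messages, hyperperiod(messages))


if __name__ == "__main__":
    print(schedule([(4, 1), (4, 1), (8, 2)]))
```

In a learning-based variant, the fixed "smallest offset first" ordering in the loop would be replaced by an ordering proposed by a trained policy, which is where the abstract's neural network would enter.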

