An offline training method using CGAN for anti-jamming communication decision network
Abstract: Anti-jamming communication based on reinforcement learning must interact continuously with the environment to learn the optimal decision, so the training time of its decision network is bounded by the environment's feedback rate and is usually long. To address this problem, an offline training method is proposed. A spectrum virtual environment generator is constructed that can quickly generate large numbers of realistic synthetic spectrum waterfall images for training the anti-jamming communication decision network. Because the method no longer depends on feedback from the real environment, training proceeds offline and model training efficiency improves significantly. Experimental results show that, compared with the real-time online training method, the proposed offline method reduces training time by more than 50%.

Keywords:
- reinforcement learning
- anti-jamming communication
- spectrum waterfall
- conditional generative adversarial network (CGAN)
- offline training
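The offline idea described in the abstract can be illustrated with a toy loop in which every environment step is a call to a generator rather than a real spectrum measurement. This is only a minimal sketch under stated assumptions: `fake_generator` is a hypothetical hand-written stand-in for a trained CGAN generator (a sweep jammer that moves one channel per slot), a tabular Q-learner is swapped in for the decision network, and the channel count, window length, reward values, and conditioning inputs are all invented for illustration. None of them come from the paper.

```python
import random

# All constants below are hypothetical, chosen only for illustration.
N_CHANNELS = 5            # number of frequency channels
WINDOW = 4                # time slots kept in the waterfall
FREE, JAM, USER = 0, 1, 2  # cell values in a waterfall row


def fake_generator(waterfall, action):
    """Stand-in for a trained CGAN generator (assumption, not the paper's model).

    Takes the current waterfall (list of rows, newest last) and the chosen
    channel, and returns the next waterfall. The jammer sweeps one channel
    per slot; a real generator would be a neural network.
    """
    jammed = (waterfall[-1].index(JAM) + 1) % N_CHANNELS
    row = [FREE] * N_CHANNELS
    row[action] = USER
    row[jammed] = JAM  # on a collision the jammer overwrites the user signal
    return waterfall[1:] + [row]


def reward(waterfall, action):
    """-1 if the chosen channel was jammed in the newest slot, else +1."""
    return -1.0 if waterfall[-1][action] == JAM else 1.0


def train_offline(steps=2000, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning driven only by generated waterfalls."""
    rng = random.Random(seed)
    # q[state][action]; the state is the jammed channel in the newest row.
    q = [[0.0] * N_CHANNELS for _ in range(N_CHANNELS)]
    waterfall = [
        [JAM if ch == t % N_CHANNELS else FREE for ch in range(N_CHANNELS)]
        for t in range(WINDOW)
    ]
    state = waterfall[-1].index(JAM)
    for _ in range(steps):
        # Epsilon-greedy channel selection.
        if rng.random() < epsilon:
            action = rng.randrange(N_CHANNELS)
        else:
            action = max(range(N_CHANNELS), key=lambda a: q[state][a])
        # The "environment step" is a generator call, not a real measurement.
        waterfall = fake_generator(waterfall, action)
        r = reward(waterfall, action)
        next_state = waterfall[-1].index(JAM)
        target = r + gamma * max(q[next_state])
        q[state][action] += alpha * (target - q[state][action])
        state = next_state
    return q
```

Calling `train_offline()` returns the Q-table; taking the arg-max of each row gives the learned channel choice for each jammer position. The point of the sketch is structural: no step in the loop waits on a real environment, so the per-step cost is set by the generator and the learner alone.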
Table 1. Comparison of time consumption between the proposed method and ADRLA

| Computer performance | Parameters | T_OFFLINE_ADRLA (min) | T_ADRLA (min) |
|---|---|---|---|
| Low (CPU) | T_offline_init = 0.5 s, T_CGAN = 0.4 s, T_online_init = 40 s, T_sample = 0.8 s, T_DQN = 0.1 s, k = 1000 | 8.3 | 15.7 |
| Medium (1 GPU) | T_offline_init = 0.3 s, T_CGAN = 0.2 s, T_online_init = 40 s, T_sample = 0.8 s, T_DQN = 0.05 s, k = 1000 | 4.2 | 14.8 |
| High (4 GPUs) | T_offline_init = 0.1 s, T_CGAN = 0.1 s, T_online_init = 40 s, T_sample = 0.8 s, T_DQN = 0.01 s, k = 1000 | 1.8 | 14.2 |
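The totals in Table 1 appear consistent with a simple additive timing model: an initialization cost plus k iterations, each costing one sample (real sampling for ADRLA, one CGAN generation for the offline method) and one DQN update. That formula is inferred from the numbers and is not stated in this excerpt, so the sketch below should be read as an assumption that happens to reproduce the table. The function and dictionary key names are hypothetical.

```python
def total_minutes(t_init, t_sample, t_dqn, k):
    """Assumed model: init + k * (per-step sample + per-step DQN update), in minutes."""
    return (t_init + k * (t_sample + t_dqn)) / 60.0


# Parameter values copied from Table 1 (seconds, except k).
ROWS = {
    "low (CPU)": dict(offline_init=0.5, cgan=0.4, online_init=40,
                      sample=0.8, dqn=0.1, k=1000),
    "medium (1 GPU)": dict(offline_init=0.3, cgan=0.2, online_init=40,
                           sample=0.8, dqn=0.05, k=1000),
    "high (4 GPUs)": dict(offline_init=0.1, cgan=0.1, online_init=40,
                          sample=0.8, dqn=0.01, k=1000),
}


def compare(p):
    """Return (offline minutes, online minutes, fractional reduction)."""
    offline = total_minutes(p["offline_init"], p["cgan"], p["dqn"], p["k"])
    online = total_minutes(p["online_init"], p["sample"], p["dqn"], p["k"])
    return round(offline, 1), round(online, 1), 1.0 - offline / online


if __name__ == "__main__":
    for name, params in ROWS.items():
        offline, online, reduction = compare(params)
        print(f"{name}: offline {offline} min, online {online} min, "
              f"reduction {reduction:.0%}")
```

Under this assumed model the online total is dominated by the fixed 0.8 s sampling time, which no amount of compute shortens, while the offline total shrinks as T_CGAN and T_DQN shrink. The computed reduction is roughly 47% on the CPU row and above 70% on both GPU rows.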