Citation: | Ma Yaofei, Gong Guanghong, Peng Xiaoyuan, et al. Cognition behavior model for air combat based on reinforcement learning[J]. Journal of Beijing University of Aeronautics and Astronautics, 2010, 36(4): 379-383. (in Chinese) |
[1] Yin Quanjun. Modeling and simulation of computer generated forces based on multi-agent. Changsha: College of Mechanical Engineering and Automation, National University of Defense Technology, 2005 (in Chinese)
[2] Zhang Rubo. Reinforcement learning theory and application[M]. Harbin: Harbin Engineering University Press, 2001 (in Chinese)
[3] Howard R A. Dynamic programming and Markov processes[M]. Cambridge: MIT Press, 1960
[4] Sutton R S, Barto A G. Time-derivative models of Pavlovian reinforcement // Learning and computational neuroscience: foundations of adaptive networks[M]. Cambridge: MIT Press, 1990: 497-537
[5] Baron Sheldon, Kleinman D L, Serben Saul. A study of the Markov game approach to tactical maneuvering problems. NASA CR-1979, 1972
[6] Moore A W, Atkeson C G. The parti-game algorithm for variable resolution reinforcement learning in multidimensional state-spaces[J]. Machine Learning, 1995, 21(3): 199-233
[7] Park J, Sandberg I W. Universal approximation using radial-basis-function networks[J]. Neural Computation, 1991, 3(2): 246-257
[8] Schaal S, Atkeson C G. From isolation to cooperation: an alternative view of a system of experts // Touretzky D S, Hasselmo M E. Advances in Neural Information Processing Systems 8. Cambridge, MA: MIT Press, 1996: 605-611
[9] Gao Hao, Zhu Peishen, Gao Zhenghong. The advanced flight dynamics[M]. Beijing: National Defense Industry Press, 2004: 26-91 (in Chinese)
[10] Virtanen K, Raivio T, Hämäläinen R P. A decision analytic simulation approach to flight simulation. Helsinki: System Analysis Laboratory, 2007. http://www.sal.tkk.fi/Opinnot/Mat-2.108/pdf-files/eham02.pdf
[11] Kaelbling L P, Littman M L, Moore A W. Reinforcement learning: a survey[J]. Journal of Artificial Intelligence Research, 1996, 4: 237-285
|