Volume 36 Issue 4
Apr.  2010
Ma Yaofei, Gong Guanghong, Peng Xiaoyuan, et al. Cognition behavior model for air combat based on reinforcement learning[J]. Journal of Beijing University of Aeronautics and Astronautics, 2010, 36(4): 379-383. (in Chinese)

Cognition behavior model for air combat based on reinforcement learning

  • Received Date: 26 Mar 2009
  • Publish Date: 30 Apr 2010
  • A cognition model was proposed to support tactical decision-making by simulated fighters engaging each other in virtual combat, with reinforcement learning (RL) used to acquire knowledge. The combat situation was described by multiple attributes, which resulted in a high-dimensional problem space in which the fighters learned action policies. The traditional approach of partitioning the problem space would demand huge computation and storage resources. Instead, an approximation network based on Gaussian radial basis functions was constructed to approximate the state value, which greatly reduced the resource demand and learning cycle time and produced reasonable maneuver strategies. The model was verified in a one-to-one air combat simulation, and the resulting trajectories were similar to those flown by human pilots in real combat.
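
The idea of approximating the state value with Gaussian radial basis functions, instead of storing one value per cell of a partitioned state space, can be illustrated with a minimal sketch. This is not the authors' network: the abstract does not give the state attributes, the center placement, or the learning rule, so the one-dimensional toy task, the 11 evenly spaced centers, the width of 0.1, the TD(0) update, and all function names below are assumptions chosen only for illustration.

```python
import math
import random


def rbf_features(state, centers, width):
    """Return one Gaussian activation per center for a state tuple."""
    feats = []
    for center in centers:
        sq_dist = sum((s - c) ** 2 for s, c in zip(state, center))
        feats.append(math.exp(-sq_dist / (2.0 * width ** 2)))
    return feats


def value(state, weights, centers, width):
    """Approximate V(state) as a weighted sum of RBF activations."""
    feats = rbf_features(state, centers, width)
    return sum(w * f for w, f in zip(weights, feats))


def td0_update(weights, state, reward, next_state, done,
               centers, width, alpha=0.1, gamma=0.9):
    """Apply one TD(0) step to the linear weights and return the new weights."""
    feats = rbf_features(state, centers, width)
    v = sum(w * f for w, f in zip(weights, feats))
    v_next = 0.0 if done else value(next_state, weights, centers, width)
    delta = reward + gamma * v_next - v  # temporal-difference error
    return [w + alpha * delta * f for w, f in zip(weights, feats)]


def train(episodes=2000, seed=0):
    """Learn values on a hypothetical toy task, not an air combat model.

    The state is a single number x in [0, 1) that advances by 0.1 per step;
    the episode ends with reward 1 once x reaches 1.0.
    """
    rng = random.Random(seed)
    centers = [(i / 10.0,) for i in range(11)]  # assumed: evenly spaced centers
    width = 0.1                                 # assumed: shared Gaussian width
    weights = [0.0] * len(centers)
    for _ in range(episodes):
        x = rng.random()
        done = False
        while not done:
            x_next = x + 0.1
            done = x_next >= 1.0
            reward = 1.0 if done else 0.0
            weights = td0_update(weights, (x,), reward, (x_next,), done,
                                 centers, width)
            x = x_next
    return weights, centers, width


if __name__ == "__main__":
    weights, centers, width = train()
    for x in (0.05, 0.5, 0.95):
        print(x, round(value((x,), weights, centers, width), 3))
```

In this sketch the learner stores only one weight per center, and states are tuples so the same functions accept several attributes per state. A uniform partition, by contrast, needs a number of cells that grows exponentially with the number of attributes, which is the resource problem the abstract describes.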

     
