[1] 段海滨,李沛.基于生物群集行为的无人机集群控制[J].科技导报,2017,35(7):17-25.DUAN H B,LI P.Autonomous control for unmanned aerial vehicle swarms based on biological collective behaviors[J].Science & Technology Review,2017,35(7):17-25(in Chinese). [2] OLFATISABER R.Flocking for multi-agent dynamic systems:Algorithms and theory[J].IEEE Transactions on Automatic Control,2006,51(3):401-420. [3] 黄天云,陈雪波,徐望宝.基于松散偏好规则的群体机器人系统自组织协作围捕[J].自动化学报,2013,39(1):57-68.HUANG T Y,CHEN X B,XU W B.A self-organizing cooperative hunting by swarm robotic systems based on loose-preference rule[J].Acta Automatica Sinica,2013,39(1):57-68(in Chinese). [4] 李瑞珍,杨惠珍,萧丛杉.基于动态围捕点的多机器人协同策略[J].控制工程,2019,26(3):510-514.LI R Z,YANG H Z,XIAO C S.Cooperative hunting strategy for multi-mobile robot systems based on dynamic hunting points[J].Control Engineering of China,2019,26(3):510-514(in Chinese). [5] 张子迎,吕骏,徐东,等.能量均衡的围捕任务分配方法[J].国防科技大学学报,2019,41(2):107-114.ZHANG Z Y,LV J,XU D,et al.Method of capturing task allocation based on energy balabce[J].Journal of National University of Defense Technology,2019,41(2):107-114(in Chinese). [6] UEHARA S,TAKIMOTO M,KAMBAYASHI Y.Mobile agent based obstacle avoidance in multi-robot hunting[M]//GEN M,GREEN D,KATAI O,et al.Intelligent and evolutionary systems.Berlin:Springer,2017:443-452. [7] VLAHOV B,SQUIRES E,STRICKLAND L,et al.On developing a UAV pursuit-evasion policy using reinforcement learning[C]//201817th IEEE International Conference on Machine Learning and Applications (ICMLA).Piscataway:IEEE Press,2018:859-864. [8] 谭浪,巩庆海,王会霞.基于深度强化学习的追逃博弈算法[J].航天控制,2018,36(6):3-8.TAN L,GONG Q H,WANG H X.Pursuit-evasion game algorithm based on deep reinforcement learning[J].Aerospace Control,2018,36(6):3-8(in Chinese). [9] TENG T H,TAN A H,ZURADA J M.Selforganizing neural networks integrating domain knowledge and reinforcement learning[J].IEEE Transactions on Neural Networks and Learning Systems,2014,26(5):889-902. [10] BEVERIDGE A,CAI Y.Pursuit-evasion in a two-dimensional domain[J].ARS Mathematica Contemporanea,2017,13(1):187-206. [11] LIU J,LIU S,WU H,et al.A pursuit-evasion algorithm based on hierarchical reinforcement learning[C]//2009 International Conference on Measuring Technology and Mechatronics Automation.Piscataway:IEEE Press,2009,2:482-486. [12] BILGIN A T,KADIOGLU-URTIS E.An approach to multiagent pursuit evasion games using reinforcement learning[C]//2015 International Conference on Advanced Robotics (ICAR).Piscataway:IEEE Press,2015:164-169. [13] AWHEDA M D,SCHWARTZ H M.A fuzzy reinforcement learning algorithm using a predictor for pursuit-evasion games[C]//2016 Annual IEEE Systems Conference(SysCon).Piscataway:IEEE Press,2016:1-8. [14] LOWE R,WU Y I,TAMAR A,et al.Multi-agent actor-critic for mixed cooperative-competitive environments[C]//Advances in Neural Information Processing Systems,2017:6379-6390. [15] VAN HASSELT H,GUEZ A,SILVER D,et al.Deep reinforcement learning with double Q-learning[C]//National Conference on Artificial Intelligence,2016:2094-2100. [16] HAUSKNECHT M,STONE P.Deep recurrent Q-learning for partially observable MDPS[EB/OL].(2015-07-23)[2020-06-01].https://arxiv.org/abs/1507.06527. [17] HESTER T,VECERIK M,PIETQUIN O,et al.Deep Q-learning from demonstrations[C]//National Conference on Artificial Intelligence,2018:3223-3230. [18] 魏瑞轩,张启瑞,许卓凡.类脑发育无人机防碰撞控制[J].控制理论与应用,2019,36(2):13-20.WEI R X,ZHANG Q R,XU Z F.A brain-like mechanism for developmental UAVs' collision avoidance[J].Control Theory & Applications,2019,36(2):13-20(in Chinese). [19] MNIH V,KAVUKCUOGLU K,SILVER D,et al.Human-level control through deep reinforcement learning[J].Nature,2015,518:529-533. [20] MORDATCH I,ABBEEL P.Emergence of grounded compositional language in multi-agent populations[EB/OL].(2017-03-15)[2020-06-01].https://arxiv.org/abs/1703.04908. |