
Swarm rounding up method of UAV based on situation cognition

WU Zichen, HU Bin

Citation: WU Zichen, HU Bin. Swarm rounding up method of UAV based on situation cognition[J]. Journal of Beijing University of Aeronautics and Astronautics, 2021, 47(2): 424-430. doi: 10.13700/j.bh.1001-5965.2020.0274 (in Chinese)

doi: 10.13700/j.bh.1001-5965.2020.0274
Funds:

Science and Technology Innovation 2030-Key Project of "New Generation Artificial Intelligence" 2018AAA0102403

National Natural Science Foundation of China 61573373

Author information:

    WU Zichen  Male, M.S. candidate. Research interests: aircraft navigation, guidance, and intelligent control

    HU Bin  Male, M.S., associate professor. Research interests: missile guidance and control

    Corresponding author:

    HU Bin. E-mail: singer533@163.com

  • CLC number: V249.3; TP24
  • Abstract:

    Swarm rounding up is an important mission mode for intelligent UAV "swarm" operations. Most existing swarm rounding-up methods are built on the assumption that the environment is known, so their strategies often fail in unknown mission environments. To address this problem, a developmental model based on situation cognition is proposed to explore a rounding-up method with better environmental adaptability. First, the swarm rounding-up behavior is decomposed and the rounding-up process is discretized. Then, a method for generating rounding-up policies is designed based on the deep Q-network (DQN). Finally, a state-policy knowledge base is established and developed through training on a large amount of valid data, so that different policies are obtained for different environments. Simulation results show that the proposed developmental model based on situation cognition can effectively adapt to different environments and complete the rounding up in each of them.
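The abstract's second and third steps (a DQN-based policy generator feeding a state-policy knowledge base) can be illustrated with a minimal sketch. Everything below is an assumption for illustration, not the paper's actual design: the 6-dimensional state encoding, the 8 discretized heading actions, the network sizes, the hyperparameters, and the environment-signature key are all invented here; only the general DQN machinery (replay buffer, target network, epsilon-greedy action selection) follows the standard algorithm named in the abstract.

```python
# Minimal DQN sketch for one pursuer's rounding-up policy.
import random
from collections import deque

import torch
import torch.nn as nn

STATE_DIM = 6   # assumed: [dx_target, dy_target, vx, vy, dx_obstacle, dy_obstacle]
N_ACTIONS = 8   # assumed: 8 discretized heading commands
GAMMA = 0.95

class QNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(STATE_DIM, 64), nn.ReLU(),
            nn.Linear(64, 64), nn.ReLU(),
            nn.Linear(64, N_ACTIONS),
        )

    def forward(self, x):
        return self.net(x)

q, q_target = QNet(), QNet()
q_target.load_state_dict(q.state_dict())
opt = torch.optim.Adam(q.parameters(), lr=1e-3)
replay = deque(maxlen=50_000)  # experience replay buffer of (s, a, r, s', done)

def act(state, eps):
    """Epsilon-greedy action over the discretized headings."""
    if random.random() < eps:
        return random.randrange(N_ACTIONS)
    with torch.no_grad():
        return q(torch.tensor(state).float()).argmax().item()

def train_step(batch_size=64):
    """One gradient step on a sampled minibatch (standard DQN update)."""
    if len(replay) < batch_size:
        return
    s, a, r, s2, done = zip(*random.sample(replay, batch_size))
    s = torch.tensor(s).float()
    a = torch.tensor(a).long().unsqueeze(1)
    r = torch.tensor(r).float()
    s2 = torch.tensor(s2).float()
    done = torch.tensor(done).float()
    # Bellman target: r + gamma * max_a' Q_target(s', a') for non-terminal s'
    with torch.no_grad():
        target = r + GAMMA * (1 - done) * q_target(s2).max(1).values
    loss = nn.functional.mse_loss(q(s).gather(1, a).squeeze(1), target)
    opt.zero_grad()
    loss.backward()
    opt.step()

# The abstract's third step reduced to its simplest form: a knowledge base
# mapping a coarse environment signature to trained weights. The signature
# format is a placeholder assumption.
knowledge_base = {}

def store_policy(env_signature):
    knowledge_base[env_signature] = {k: v.clone() for k, v in q.state_dict().items()}
```

The target network would be periodically synchronized from `q`, and `store_policy` called once training in a given environment converges, which is one plausible reading of "developing" the knowledge base.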

  • Figure 1  Schematic of rounding up problem

    Figure 2  Schematic of successful rounding up

    Figure 3  Schematic of rounding up formation

    Figure 4  Schematic diagram of UAV swarm development structure

    Figure 5  Rounding up method based on cognitive development

    Figure 6  An example of rounding up

    Figure 7  Change in mean time of rounding up

    Figure 8  Change in number of state-policy knowledge base entries

    Figure 9  Rounding up results in test environment

    Table 1  Parameter settings of UAV and target

    Parameter                               Value
    Coordinate axis range/m                 [-25, 25]
    Pursuer maximum speed/(m·s⁻¹)           3
    Pursuer maximum acceleration/(m·s⁻²)    3
    Pursuer detection range/m               10
    Target maximum speed/(m·s⁻¹)            5
    Target maximum acceleration/(m·s⁻²)     5
    Target detection range/m                5
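For anyone reproducing the setting, Table 1 maps naturally onto a configuration object. A minimal sketch follows; the class and field names are invented here, and only the values come from the table.

```python
from dataclasses import dataclass

# Illustrative container for the Table 1 simulation settings.
@dataclass(frozen=True)
class SimConfig:
    axis_range_m: tuple = (-25.0, 25.0)  # coordinate axis range/m
    pursuer_v_max: float = 3.0           # maximum speed/(m·s⁻¹)
    pursuer_a_max: float = 3.0           # maximum acceleration/(m·s⁻²)
    pursuer_detect_m: float = 10.0       # detection range/m
    target_v_max: float = 5.0
    target_a_max: float = 5.0
    target_detect_m: float = 5.0
```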

    Table 2  Obstacle generation parameters

    Parameter                   Value
    Mean number of obstacles    3
    Number range                0~5
    Position range/m            [-24, 24]
    Mean diameter/m             4
    Diameter range/m            2~8
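Table 2 describes a randomized obstacle field. The sketch below draws one field under a plain reading of the table (count with mean 3 clipped to 0~5, diameter with mean 4 m clipped to 2~8 m); the Gaussian sampling choice is an assumption, since the generation distribution is not stated on this page.

```python
import random

def sample_obstacles():
    """Sample one circular-obstacle field per a plain reading of Table 2."""
    n = max(0, min(5, round(random.gauss(3, 1))))   # count: mean 3, range 0~5
    obstacles = []
    for _ in range(n):
        x = random.uniform(-24, 24)                 # position range/m
        y = random.uniform(-24, 24)
        d = max(2.0, min(8.0, random.gauss(4, 1)))  # diameter: mean 4, range 2~8 m
        obstacles.append((x, y, d))
    return obstacles
```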
Publication history
  • Received: 2020-06-17
  • Accepted: 2020-08-21
  • Available online: 2021-02-20
