Target tracking control algorithm based on approximate dynamic programming

LI Huifeng, YI Wenfeng, CHENG Xiaoming

Citation: LI Huifeng, YI Wenfeng, CHENG Xiaoming, et al. Target tracking control algorithm based on approximate dynamic programming[J]. Journal of Beijing University of Aeronautics and Astronautics, 2019, 45(3): 597-605. doi: 10.13700/j.bh.1001-5965.2018.0353 (in Chinese)


doi: 10.13700/j.bh.1001-5965.2018.0353
Details
    Author biographies:

    LI Huifeng  Female, Ph.D., professor. Research interests: flight vehicle guidance and control

    YI Wenfeng: CHENG Xiaoming  Male, Ph.D. candidate. Research interests: flight vehicle guidance and control

    Corresponding author:

    LI Huifeng, E-mail: lihuifeng@buaa.edu.cn

  • CLC number: V249.1;TP181

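  • Abstract:

    In target tracking problems, aircraft control algorithms struggle to cope with demanding tasks such as a target flying highly dynamic maneuvers or even playing a game against our side. To address this, a target tracking control algorithm based on approximate dynamic programming (ADP) is proposed. The algorithm trains our UAV against a game strategy so that it accumulates experience. It treats states such as both sides' positions as known quantities and the roll direction as the control variable, derives features from the relative position of the two vehicles, and forms an approximate value function. Finally, the rollout algorithm is used to solve for the optimal tracking decision, achieving flexible, effective, and accurate tracking of the target, even a target that plays a game against the pursuer. Simulation results verify the effectiveness of approximate dynamic programming as a control algorithm.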

     

  • Figure 1.  ADP structure[11]

    Figure 2.  Algorithm framework

    Figure 3.  Reward area

    Figure 4.  Evaluation function

    Figure 5.  Position of the advantage target reward

    Figure 6.  Angle definition

    Figure 7.  Minimax algorithm flowchart

    Figure 8.  60-step simulation (x_init=1)

    Figure 9.  Real-time distance to target (x_init=1)

    Figure 10.  60-step simulation (x_init=2)

    Figure 11.  Real-time distance to target (x_init=2)

    Figure 12.  60-step simulation (x_init=3)

    Figure 13.  Real-time distance to target (x_init=3)

    Figure 14.  60-step simulation (x_init=4)

    Figure 15.  Real-time distance to target (x_init=4)

    Figure 16.  Simulation comparison for a given PID target trajectory

    Figure 17.  PID tracking error and ADP distance to target

    Figure 18.  Tracking comparison between the nonlinear model predictive method and ADP

    Figure 19.  Tracking error comparison between the nonlinear model predictive method and ADP

    Figure 20.  UAV tracking and gaming

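    Table 1.  ADP symbology

    Variable        Description
    x               State vector
    x_i             State at step i
    x^n             n-th state vector in X
    x_term          Special terminal state
    x_pos           x coordinate of the UAV
    y_pos           y coordinate of the UAV
    X               Vector of states [x^1, x^2, …, x^n]^T
    f(x, u)         State transition function
    π(x)            Maneuvering policy
    π*(x)           Optimal maneuvering policy
    π̄(x)            Policy generated by the rollout algorithm
    J(x)            Future reward value of state x
    J^k(x)          k-th iteration of J(x)
    J_approx(x)     Function approximation form of J(x)
    S(x)            Evaluation function of the UAV
    γ               Reward discount factor
    u               Control or movement action
    ζ(x)            Feature vector of state x
    β               Function parameter vector
    g(x)            Goal reward function
    g_pa(x)         Position-of-advantage function
    p_t             Probability of termination function
    T               Bellman backup operator
    J*(x)           Optimal value of J(x)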
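    Read together, the symbols in Table 1 suggest the usual value-iteration recursion. The following is a hedged sketch, not an equation quoted from the paper. It assumes a deterministic transition x' = f(x, u), the same reward convention as the rollout step of Algorithm 3, and an approximation that is linear in the features (the table does not state the approximation architecture):

```latex
\begin{aligned}
J^{k+1}(x) &= T J^{k}(x)
            = \max_{u}\,\bigl[\, g(x') + \gamma\, J^{k}(x') \,\bigr],
            \qquad x' = f(x, u), \\
J_{\mathrm{approx}}(x) &= \zeta(x)^{\mathsf{T}} \beta .
\end{aligned}
```

    Under these assumptions, fitting β amounts to regressing the backed-up values T J^k onto the features ζ(x) over a set of sampled states X.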
      Algorithm 1  Position-of-advantage function g_pa(x)
      Input: {x}.
    R = Euclidean distance between the aircraft and the target
    if (0.1 m < R < 3.0 m) and (|AA| < 60°) and (|ATA| < 30°) then
          g_pa(x) = 1.0
    else
          g_pa(x) = 0
    end if
      Output reward: (g_pa).
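    The test in Algorithm 1 can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code. The function name and argument layout are hypothetical, and the angles AA and ATA (read here as the aspect angle and antenna train angle common in the air-combat literature) are assumed to be computed elsewhere and passed in as degrees.

```python
import math


def position_of_advantage(own_xy, target_xy, aa_deg, ata_deg):
    """Return 1.0 if the aircraft holds the position of advantage, else 0.0.

    own_xy, target_xy: (x, y) positions in metres.
    aa_deg, ata_deg:   AA and ATA in degrees (assumed to be supplied by the caller).
    """
    r = math.dist(own_xy, target_xy)  # Euclidean distance R
    if 0.1 < r < 3.0 and abs(aa_deg) < 60.0 and abs(ata_deg) < 30.0:
        return 1.0
    return 0.0
```

    For example, `position_of_advantage((0, 0), (0, 1), 10, -5)` satisfies all three conditions, while moving the target 5 m away breaks the range condition.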
        Algorithm 2  State transition function f(x_i, u_b, u_r)
        Input: {x_i, u_b, u_r}.
    for i = 1:5 (once per Δt = 0.05 s) do
        for {red, blue} do
              (ϕ̇ = 40(°)/s, ϕ_max,r = 18°, ϕ_max,b = 23°)
              if u = L then
                    ϕ = max(ϕ − ϕ̇Δt, −ϕ_max)
              else if u = R then
                    ϕ = min(ϕ + ϕ̇Δt, ϕ_max)
              end if
              ψ̇ = (g/v) tan ϕ (assume v = 2.5 m/s)
              ψ = ψ + ψ̇Δt; x_pos = x_pos + Δt v sin ψ
              y_pos = y_pos + Δt v cos ψ
        end for
    end for
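    A runnable sketch of the state transition in Algorithm 2 follows. It is an illustration under stated assumptions rather than the authors' implementation. The `Aircraft` class and function names are hypothetical, the turn rate uses the coordinated-turn relation ψ̇ = (g/v) tan ϕ with g = 9.81 m/s² assumed, the bank angle is stored in degrees, and the heading is stored in radians.

```python
import math
from dataclasses import dataclass

DT = 0.05          # integration step, s
ROLL_RATE = 40.0   # roll rate, deg/s
SPEED = 2.5        # assumed constant speed v, m/s
G = 9.81           # gravitational acceleration, m/s^2 (assumption)
MAX_BANK = {"red": 18.0, "blue": 23.0}  # bank-angle limits, deg


@dataclass(frozen=True)
class Aircraft:
    x: float    # x position, m
    y: float    # y position, m
    psi: float  # heading, rad
    phi: float  # bank angle, deg


def step_aircraft(ac, u, phi_max):
    """Advance one aircraft by DT under action u in {"L", "S", "R"}."""
    phi = ac.phi
    if u == "L":
        phi = max(phi - ROLL_RATE * DT, -phi_max)
    elif u == "R":
        phi = min(phi + ROLL_RATE * DT, phi_max)
    psi_dot = (G / SPEED) * math.tan(math.radians(phi))  # turn rate, rad/s
    psi = ac.psi + psi_dot * DT
    x = ac.x + DT * SPEED * math.sin(psi)
    y = ac.y + DT * SPEED * math.cos(psi)
    return Aircraft(x, y, psi, phi)


def transition(blue, red, u_b, u_r, substeps=5):
    """Apply the chosen actions for `substeps` integration steps."""
    for _ in range(substeps):
        blue = step_aircraft(blue, u_b, MAX_BANK["blue"])
        red = step_aircraft(red, u_r, MAX_BANK["red"])
    return blue, red
```

    With five substeps of 0.05 s, one call to `transition` covers 0.25 s of flight, so an aircraft flying straight at 2.5 m/s advances 0.625 m per decision step.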
        Algorithm 3  Rollout algorithm
        Input: x_i
        Initialize: J_best = −∞.
    for u_b = {L, S, R} do
        x_temp = f(x_i, u_b, π_nom,r(x_i))
        for j = {1:N_rolls} do
          x_temp = f(x_temp, π_approx^N(x_temp), π_nom,r(x_temp))
        end for
        J_current = [γ J_approx^N(x_temp) + g(x_temp)]
        if J_current > J_best then
          u_best = u_b, J_best = J_current
        end if
    end for
        Output: u_best
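    The rollout step in Algorithm 3 can be sketched generically in Python. This is a minimal illustration and not the authors' code: the transition function, both policies, the approximate value function, and the reward are passed in as callables, and all names and default values (including `gamma=0.95` and `n_rolls=3`) are placeholders chosen for the example.

```python
import math


def rollout_action(x_i, f, policy_blue, policy_red, j_approx, g,
                   gamma=0.95, n_rolls=3, actions=("L", "S", "R")):
    """Pick the first action whose rolled-out trajectory scores best.

    f(x, u_b, u_r) -> next state
    policy_blue(x) -> blue action from the approximate policy
    policy_red(x)  -> red action from the nominal opponent policy
    j_approx(x)    -> approximate value of state x
    g(x)           -> reward of state x
    """
    j_best = -math.inf
    u_best = None
    for u_b in actions:
        # Try the candidate action once, then follow the approximate policy.
        x_temp = f(x_i, u_b, policy_red(x_i))
        for _ in range(n_rolls):
            x_temp = f(x_temp, policy_blue(x_temp), policy_red(x_temp))
        j_current = gamma * j_approx(x_temp) + g(x_temp)
        if j_current > j_best:
            u_best, j_best = u_b, j_current
    return u_best
```

    Because the comparison is strict, ties are resolved in favour of the action listed first in `actions`. Only the first action of each candidate trajectory is ever executed; the rest of the rollout serves to score it.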

    Table 2.  Initial state and target strategy for each simulation

    x_init  x_pos,b/m  y_pos,b/m  ψ_b/rad  ϕ_b/rad  x_pos,r/m  y_pos,r/m  ψ_r/rad  ϕ_r/rad  π_r
    1       0          1          0        0        1          0          0        0        Minimax
    2       0          1          0        0        1          0          0        0        Maintain
    3       0          1          -π/6     0        1          0          0        0        Maintain
    4       0          1          π/6      0        1          0          0        0        Minimax
    Note: π_r is the enemy aircraft's maneuvering policy.
  • [1] LU H C, LI P X, WANG D.Visual object tracking:A survey[J].Pattern Recognition and Artificial Intelligence, 2018, 31(1):61-76(in Chinese). http://d.old.wanfangdata.com.cn/Periodical/mssbyrgzn201801008
    [2] CHENG Y Z.Mean shift, mode seeking, and clustering[J].IEEE Transactions on Pattern Analysis and Machine Intelligence, 1995, 17(8):790-799. doi: 10.1109/34.400568
    [3] ADAM A, RIVLIN E, SHIMSHONI I.Robust fragments-based tracking using the integral histogram[C]//Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition.Piscataway, NJ: IEEE Press, 2006: 798-805. http://www.cs.technion.ac.il/~amita/fragtrack/fragtrack_cvpr06.pdf
    [4] TURK M, PENTLAND A.Eigenfaces for recognition[J].Journal of Cognitive Neuroscience, 1991, 3(1):71-86. doi: 10.1162/jocn.1991.3.1.71
    [5] TANG M, FENG J.Multi-kernel correlation filter for visual tracking[C]//Proceedings of the IEEE International Conference on Computer Vision.Piscataway, NJ: IEEE Press, 2016: 3038-3046. https://www.cv-foundation.org/openaccess/content_iccv_2015/papers/Tang_Multi-Kernel_Correlation_Filter_ICCV_2015_paper.pdf
    [6] LI Y, ZHU J K.A scale adaptive kernel correlation filter tracker with feature integration[C]//Proceedings of the European Conference on Computer Vision.Berlin: Springer, 2014: 254-265. http://vigir.missouri.edu/~gdesouza/Research/Conference_CDs/ECCV_2014/workshops/w09/W9-07.pdf
    [7] DANELLJAN M, HÄGER G, KHAN F S.Accurate scale estimation for robust visual tracking[EB/OL].[2017-11-27].http://www.cvl.isy.liu.se/en/research/objrec/visualtracking/scalvistrack/index.html.
    [8] HENRIQUES J F, CASEIRO R, MARTINS P, et al.High-speed tracking with kernelized correlation filters[J].IEEE Transactions on Pattern Analysis and Machine Intelligence, 2014, 37(3):583-596. http://www.wanfangdata.com.cn/details/detail.do?_type=perio&id=c8f7e9e032e4e419c5c79d7a5f1f6494
    [9] LI Y, ZHU J K, HOI S C H.Reliable patch trackers: Robust visual tracking by exploiting reliable patches[C]//Proceedings of the IEEE International Conference on Computer Vision and Pattern Recognition.Piscataway, NJ: IEEE Press, 2015: 353-361. https://www.cv-foundation.org/openaccess/content_cvpr_2015/papers/Li_Reliable_Patch_Trackers_2015_CVPR_paper.pdf
    [10] MA C, HUANG J B, YANG X K, et al.Hierarchical convolutional features for visual tracking[C]//Proceedings of the IEEE International Conference on Computer Vision.Piscataway, NJ: IEEE Press, 2015: 3074-3082. https://www.cv-foundation.org/openaccess/content_iccv_2015/papers/Ma_Hierarchical_Convolutional_Features_ICCV_2015_paper.pdf
    [11] WEI Q L.Researches on optimal control of nonlinear systems based on approximate dynamic programming[D].Shenyang: Northeastern University, 2008: 6-8(in Chinese). http://cdmd.cnki.com.cn/Article/CDMD-10145-1012300362.htm
    [12] BELLMAN R.On the theory of dynamic programming[J].Proceedings of the National Academy of Sciences of the United States of America, 1952, 38(8):716-719. doi: 10.1073/pnas.38.8.716
    [13] ISAACS R.Games of pursuit[M].Santa Monica, CA:The Rand Corporation, 1951:256-257.
    [14] AUSTIN F, CARBONE G, FALCO M, et al.Game theory for automated maneuvering during air-to-air combat[J].Journal of Guidance, Control, and Dynamics, 1990, 13(6):1143-1149. doi: 10.2514/3.20590
    [15] MCGREW J S, HOW J P, WILLIAMS B, et al.Air combat strategy using approximate dynamic programming[C]//AIAA Guidance, Navigation, and Control Conference and Exhibit.Reston: AIAA, 2008: 6-13. doi: 10.2514/1.46815
    [16] ANWAR H, ZHU Q Y.Minimax game-theoretic approach to multiscale H-infinity optimal filtering[C]//2017 IEEE Global Conference on Signal and Information Processing(GlobalSIP).Piscataway, NJ: IEEE Press, 2017: 853-857. https://nyuscholars.nyu.edu/en/publications/minimax-game-theoretic-approach-to-multiscale-h-infinity-optimal-
    [17] SPRINKLE J, EKLUND J, KIM H, et al.Encoding aerial pursuit/evasion games with fixed wing aircraft into a nonlinear model predictive tracking controller[C]//Proceedings of 2004 43rd IEEE Conference on Decision and Control(CDC).Piscataway, NJ: IEEE Press, 2004: 2609-2614. https://www.researchgate.net/publication/4142744_Encoding_aerial_pursuitevasion_games_with_fixed_wing_aircraft_into_a_nonlinear_model_predictive_nacking_controller
    [18] EKLUND J, SPRINKLE J, KIM H, et al.Implementing and testing a nonlinear model predictive tracking controller for aerial pursuit/evasion games on a fixed wing aircraft[C]//Proceedings of 2005 American Control Conference.Piscataway, NJ: IEEE Press, 2005: 1509-1514. https://people.eecs.berkeley.edu/~sastry/pubs/PDFs%20of%20Pubs2000-2005/Publications%20of%20Postdocs/Eklund/Eklund.ImplementingTesting%202005.pdf
    [19] SHAW R.Fighter combat tactics and maneuvering[M].Annapolis:Naval Institute Press, 1985:12-15.
    [20] POWELL W B.Approximate dynamic programming:Solving the curses of dimensionality[M].2nd ed.Hoboken:John Wiley & Sons, Inc., 2011:305-307.
Figures (20) / Tables (5)
Publication history
  • Received:  2018-06-12
  • Accepted:  2018-09-14
  • Published online:  2019-03-20
