Target tracking control algorithm based on approximate dynamic programming

LI Huifeng; YI Wenfeng; CHENG Xiaoming

doi:10.13700/j.bh.1001-5965.2018.0353

LI Huifeng^1,
YI Wenfeng^1,
CHENG Xiaoming^{1, 2}

1. School of Astronautics, Beihang University, Beijing 100083, China
2. Beijing Aerospace Automatic Control Institute, Beijing 100854, China

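Author biographies:
LI Huifeng, female, Ph.D., professor. Main research interest: flight vehicle guidance and control.

YI Wenfeng: CHENG Xiaoming, male, Ph.D. candidate. Main research interest: flight vehicle guidance and control.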

Corresponding author:
LI Huifeng, E-mail: lihuifeng@buaa.edu.cn

CLC number: V249.1; TP181
Publication history
- Received: 2018-06-12
- Accepted: 2018-09-14
- Published online: 2019-03-20

Abstract

In target tracking, conventional flight control algorithms have difficulty coping with demanding tasks in which the target performs large maneuvers or even plays an adversarial game against the pursuer. To address this, a target tracking control algorithm based on approximate dynamic programming (ADP) is proposed. The algorithm trains the pursuing UAV against a game strategy so that it accumulates experience. The states of both aircraft, such as their positions, are treated as known quantities, and the roll direction is the control variable. Features are derived from the relative position of the two aircraft and used to build an approximate value function. The rollout algorithm is then used to solve for the optimal tracking decision, which yields flexible, effective, and accurate tracking of a maneuvering or even adversarial target. Simulation results verify the effectiveness of approximate dynamic programming as a control algorithm.

Keywords:
- approximate dynamic programming
- target tracking
- flight control
- optimal decision
- game

Figure 1. ADP structure^[11]

Figure 2. Algorithm framework

Figure 3. Reward area

Figure 4. Evaluation function

Figure 5. Position of advantage for the goal reward

Figure 6. Angle definition

Figure 7. Minimax algorithm flowchart

Figure 8. 60-step simulation (x_init=1)

Figure 9. Real-time distance to target (x_init=1)

Figure 10. 60-step simulation (x_init=2)

Figure 11. Real-time distance to target (x_init=2)

Figure 12. 60-step simulation (x_init=3)

Figure 13. Real-time distance to target (x_init=3)

Figure 14. 60-step simulation (x_init=4)

Figure 15. Real-time distance to target (x_init=4)

Figure 16. Simulation comparison with PID for a given target trajectory

Figure 17. PID tracking error and ADP distance to target

Figure 18. Tracking comparison between the nonlinear model predictive method and ADP

Figure 19. Tracking error comparison between the nonlinear model predictive method and ADP

Figure 20. UAV tracking and gaming

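Table 1. ADP symbology

Symbol	Description
x	State vector
x_i	State at step i
xⁿ	nth state vector in X
x_term	Special terminal state
x^pos	x coordinate of the UAV
y^pos	y coordinate of the UAV
X	Vector of states [x¹, x², …, xⁿ]^T
f(x, u)	State transition function
π(x)	Maneuvering policy
π^*(x)	Optimal maneuvering policy
π̄(x)	Policy generated by the rollout algorithm
J(x)	Future reward value of state x
J^k(x)	kth iteration of J(x)
J_approx(x)	Function approximation form of J(x)
S(x)	Evaluation function of the UAV
γ	Reward discount factor
u	Control or movement action
ζ(x)	Feature vector of state x
β	Function parameter vector
g(x)	Goal reward function
g_pa(x)	Position-of-advantage function
p_t	Termination probability function
T	Bellman backup operator
J^*(x)	Optimal value of J(x)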
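The symbols in Table 1 fit the usual discounted value-iteration setting. The body equations of the paper are not reproduced in this section, so the following is only a hedged sketch: the backup is written to agree with the rollout step γJ_approx(x_temp) + g(x_temp) of Algorithm 3, the linear form of J_approx is an assumption, and how p_t and S(x) enter g(x) is left unspecified.

```latex
\begin{align}
J^{k+1}(x) &= T J^{k}(x)
            = \max_{u}\,\bigl[\gamma J^{k}(x') + g(x')\bigr],
            \qquad x' = f(x, u) \\
J_{\mathrm{approx}}(x) &= \zeta(x)^{\mathsf{T}} \beta
            \qquad \text{(assumed linear in the features)} \\
\pi^{*}(x) &= \arg\max_{u}\,\bigl[\gamma J^{*}(x') + g(x')\bigr]
\end{align}
```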

Algorithm 1. Position-of-advantage function g_pa(x)

Input: {x}.

R = Euclidean distance between the vehicle and the target
if (0.1 m < R < 3.0 m) & (|AA| < 60°) & (|ATA| < 30°) then
    g_pa(x) = 1.0
else
    g_pa(x) = 0
end if

Output reward: (g_pa).
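Algorithm 1 can be sketched in Python as follows. The thresholds come from the algorithm itself, but the angle conventions are assumptions, since the definitions in Figure 6 are not reproduced here: heading ψ is taken from the +y axis (matching the sin/cos usage in Algorithm 2), ATA is the angle from the pursuer's heading to the line of sight, and AA is the angle from the target's heading to that same line of sight. The helper names `wrap_deg` and `relative_geometry` are hypothetical.

```python
import math


def wrap_deg(angle):
    """Wrap an angle in degrees to the interval [-180, 180)."""
    return (angle + 180.0) % 360.0 - 180.0


def relative_geometry(xb, yb, psi_b, xr, yr, psi_r):
    """Return (R, AA, ATA); R in metres, angles in degrees.

    Assumed conventions (not given in the source):
    - heading psi is in radians, measured from the +y axis;
    - ATA: angle from the pursuer's heading to the line of sight (LOS);
    - AA: angle from the target's heading to the same LOS, so AA = 0
      when the pursuer sits directly behind the target.
    """
    dx, dy = xr - xb, yr - yb
    R = math.hypot(dx, dy)
    los = math.degrees(math.atan2(dx, dy))  # LOS bearing from the +y axis
    ata = wrap_deg(los - math.degrees(psi_b))
    aa = wrap_deg(los - math.degrees(psi_r))
    return R, aa, ata


def g_pa(R, aa_deg, ata_deg):
    """Position-of-advantage reward: 1.0 inside the region, else 0.0."""
    inside = (0.1 < R < 3.0) and abs(aa_deg) < 60.0 and abs(ata_deg) < 30.0
    return 1.0 if inside else 0.0
```

Keeping the geometry separate from the threshold test means a different angle convention only requires replacing `relative_geometry`.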

Algorithm 2. State transition function f(x_i, u_b, u_r)

Input: {x_i, u_b, u_r}.

for i = 1:5 (once per Δt = 0.05 s) do
    for {red, blue} do
        (\dot{ϕ} = 40(°)/s, ϕ_r^max = 18°, ϕ_b^max = 23°)
        if u = L then
            ϕ = max(ϕ - \dot{ϕ}Δt, -ϕ_max)
        else if u = R then
            ϕ = min(ϕ + \dot{ϕ}Δt, ϕ_max)
        end if
        \dot{ψ} = (g/v) tan ϕ (assume v = 2.5 m/s; g is the gravitational acceleration)
        ψ = ψ + \dot{ψ}Δt; x^pos = x^pos + Δt v sin ψ
        y^pos = y^pos + Δt v cos ψ
    end for
end for
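A minimal Python sketch of the state transition in Algorithm 2 is given below. The roll rate, bank limits, speed, and time step are taken from the algorithm. The turn-rate relation ψ̇ = (g/v) tan ϕ with g = 9.81 m/s² is the standard coordinated-turn assumption, and the `Aircraft` container and the function names are hypothetical.

```python
import math
from dataclasses import dataclass, replace

G = 9.81                        # m/s^2, assumed gravitational acceleration
V = 2.5                         # m/s, speed assumed in Algorithm 2
ROLL_RATE = math.radians(40.0)  # rad/s, from 40 (deg)/s
DT = 0.05                       # s, integration sub-step
SUBSTEPS = 5                    # sub-steps per decision step


@dataclass(frozen=True)
class Aircraft:
    x: float        # x position, m
    y: float        # y position, m
    psi: float      # heading, rad
    phi: float      # bank angle, rad
    phi_max: float  # bank limit, rad (18 deg for red, 23 deg for blue)


def step_aircraft(a, u):
    """Advance one aircraft by DT under action u in {'L', 'S', 'R'}."""
    phi = a.phi
    if u == "L":
        phi = max(phi - ROLL_RATE * DT, -a.phi_max)
    elif u == "R":
        phi = min(phi + ROLL_RATE * DT, a.phi_max)
    psi = a.psi + (G / V) * math.tan(phi) * DT
    return replace(
        a,
        x=a.x + DT * V * math.sin(psi),
        y=a.y + DT * V * math.cos(psi),
        psi=psi,
        phi=phi,
    )


def f(blue, red, u_b, u_r):
    """Propagate both aircraft through SUBSTEPS sub-steps of length DT."""
    for _ in range(SUBSTEPS):
        blue = step_aircraft(blue, u_b)
        red = step_aircraft(red, u_r)
    return blue, red
```

For example, `f(Aircraft(0, 1, 0, 0, math.radians(23)), Aircraft(1, 0, 0, 0, math.radians(18)), "S", "R")` advances the x_init=1 configuration of Table 2 by one 0.25 s decision step.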

Algorithm 3. Rollout algorithm

Input: x_i.

Initialize: J_best = -∞.

for u_b = {L, S, R} do
    x_temp = f(x_i, u_b, π_r^nom(x_i))
    for j = {1:N_rolls} do
        x_temp = f(x_temp, π_approx^N(x_temp), π_r^nom(x_temp))
    end for
    J_current = [γJ_approx^N(x_temp) + g(x_temp)]
    if J_current > J_best then
        u_best = u_b, J_best = J_current
    end if
end for

Output: u_best.
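The rollout step of Algorithm 3 can be sketched generically in Python. The transition function, policies, value approximation, and reward are passed in as callables, so nothing about their internals is assumed. The default values of `gamma` and `n_rolls` are placeholders rather than values from the paper, and the one-dimensional toy problem at the end is purely illustrative.

```python
ACTIONS = ("L", "S", "R")


def rollout_action(x_i, f, pi_approx, pi_red_nom, j_approx, g,
                   gamma=0.95, n_rolls=3):
    """Pick the blue action whose rolled-out trajectory scores best.

    f(x, u_b, u_r) -> next state; pi_approx(x) and pi_red_nom(x) -> action;
    j_approx(x) and g(x) -> float. gamma and n_rolls are placeholder values.
    """
    j_best = float("-inf")
    u_best = None
    for u_b in ACTIONS:
        # First step: try the candidate action against red's nominal policy.
        x_temp = f(x_i, u_b, pi_red_nom(x_i))
        # Then follow the approximate policy for n_rolls further steps.
        for _ in range(n_rolls):
            x_temp = f(x_temp, pi_approx(x_temp), pi_red_nom(x_temp))
        j_current = gamma * j_approx(x_temp) + g(x_temp)
        if j_current > j_best:
            u_best, j_best = u_b, j_current
    return u_best


# Toy 1-D problem (hypothetical): the state is an integer position and the
# approximate value is highest at position 1.
MOVE = {"L": -1, "S": 0, "R": 1}


def toy_f(x, u_b, u_r):
    return x + MOVE[u_b]


def toy_pi(x):
    return "S"


def toy_j(x):
    return -abs(x - 1)


def toy_g(x):
    return 0.0
```

Because ties are broken by the strict `>` comparison, the first action in `ACTIONS` wins when several candidates score equally.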

Table 2. Initial state and target policy for each simulation

x_init	x_b^pos/m	y_b^pos/m	ψ_b/rad	ϕ_b/rad	x_r^pos/m	y_r^pos/m	ψ_r/rad	ϕ_r/rad	π_r
1	0	1	0	0	1	0	0	0	Minimax
2	0	1	0	0	1	0	0	0	Maintain
3	0	1	-π/6	0	1	0	0	0	Maintain
4	0	1	π/6	0	1	0	0	0	Minimax
Note: π_r is the maneuvering policy of the adversary aircraft.

References (20)

[1]	LU H C, LI P X, WANG D.Visual object tracking:A survey[J].Pattern Recognition and Artificial Intelligence, 2018, 31(1):61-76(in Chinese). http://d.old.wanfangdata.com.cn/Periodical/mssbyrgzn201801008
[2]	CHENG Y Z.Mean shift, mode seeking, and clustering[J].IEEE Transactions on Pattern Analysis and Machine Intelligence, 1995, 17(8):790-799. doi: 10.1109/34.400568
[3]	ADAM A, RIVLIN E, SHIMSHONI I.Robust fragments-based tracking using the integral histogram[C]//Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition.Piscataway, NJ: IEEE Press, 2006: 798-805. http://www.cs.technion.ac.il/~amita/fragtrack/fragtrack_cvpr06.pdf
[4]	TURK M, PENTLAND A.Eigenfaces for recognition[J].Journal of Cognitive Neuroscience, 1991, 3(1):71-86. doi: 10.1162/jocn.1991.3.1.71
[5]	TANG M, FENG J.Multi-kernel correlation filter for visual tracking[C]//Proceedings of the IEEE International Conference on Computer Vision.Piscataway, NJ: IEEE Press, 2016: 3038-3046. https://www.cv-foundation.org/openaccess/content_iccv_2015/papers/Tang_Multi-Kernel_Correlation_Filter_ICCV_2015_paper.pdf
[6]	LI Y, ZHU J K.A scale adaptive kernel correlation filter tracker with feature integration[C]//Proceedings of the European Conference on Computer Vision.Berlin: Springer, 2014: 254-265. http://vigir.missouri.edu/~gdesouza/Research/Conference_CDs/ECCV_2014/workshops/w09/W9-07.pdf
[7]	DANELLJAN M, HÄGER G, KHAN F S.Accurate scale estimation for robust visual tracking[EB/OL].[2017-11-27].http://www.cvl.isy.liu.se/en/research/objrec/visualtracking/scalvistrack/index.html.
[8]	HENRIQUES J F, CASEIRO R, MARTINS P, et al.High-speed tracking with kernelized correlation filters[J].IEEE Transactions on Pattern Analysis and Machine Intelligence, 2014, 37(3):583-596. http://www.wanfangdata.com.cn/details/detail.do?_type=perio&id=c8f7e9e032e4e419c5c79d7a5f1f6494
[9]	LI Y, ZHU J K, HOI S C H.Reliable patch trackers: Robust visual tracking by exploiting reliable patches[C]//Proceedings of the IEEE International Conference on Computer Vision and Pattern Recognition.Piscataway, NJ: IEEE Press, 2015: 353-361. https://www.cv-foundation.org/openaccess/content_cvpr_2015/papers/Li_Reliable_Patch_Trackers_2015_CVPR_paper.pdf
[10]	MA C, HUANG J B, YANG X K, et al.Hierarchical convolutional features for visual tracking[C]//Proceedings of the IEEE International Conference on Computer Vision.Piscataway, NJ: IEEE Press, 2015: 3074-3082. https://www.cv-foundation.org/openaccess/content_iccv_2015/papers/Ma_Hierarchical_Convolutional_Features_ICCV_2015_paper.pdf
[11]	WEI Q L.Researches on optimal control of nonlinear systems based on approximate dynamic programming[D].Shenyang: Northeastern University, 2008: 6-8(in Chinese). http://cdmd.cnki.com.cn/Article/CDMD-10145-1012300362.htm
[12]	BELLMAN R.On the theory of dynamic programming[J].Proceedings of the National Academy of Sciences of the United States of America, 1952, 38(8):716-719. doi: 10.1073/pnas.38.8.716
[13]	ISAACS R.Games of pursuit[M].Santa Monica, CA:The Rand Corporation, 1951:256-257.
[14]	AUSTIN F, CARBONE G, FALCO M, et al.Game theory for automated maneuvering during air-to-air combat[J].Journal of Guidance, Control, and Dynamics, 1990, 13(6):1143-1149. doi: 10.2514/3.20590
[15]	MCGREW J S, HOW J P, WILLIAMS B, et al.Air combat strategy using approximate dynamic programming[C]//AIAA Guidance, Navigation, and Control Conference and Exhibit.Reston: AIAA, 2008: 6-13. doi: 10.2514/1.46815
[16]	ANWAR H, ZHU Q Y.Minimax game-theoretic approach to multiscale H-infinity optimal filtering[C]//2017 IEEE Global Conference on Signal and Information Processing(GlobalSIP).Piscataway, NJ: IEEE Press, 2017: 853-857. https://nyuscholars.nyu.edu/en/publications/minimax-game-theoretic-approach-to-multiscale-h-infinity-optimal-
[17]	SPRINKLE J, EKLUND J, KIM H, et al.Encoding aerial pursuit/evasion games with fixed wing aircraft into a nonlinear model predictive tracking controller[C]//Proceedings of 2004 43rd IEEE Conference on Decision and Control(CDC).Piscataway, NJ: IEEE Press, 2004: 2609-2614. https://www.researchgate.net/publication/4142744_Encoding_aerial_pursuitevasion_games_with_fixed_wing_aircraft_into_a_nonlinear_model_predictive_nacking_controller
[18]	EKLUND J, SPRINKLE J, KIM H, et al.Implementing and testing a nonlinear model predictive tracking controller for aerial pursuit/evasion games on a fixed wing aircraft[C]//Proceedings of 2005 American Control Conference.Piscataway, NJ: IEEE Press, 2005: 1509-1514. https://people.eecs.berkeley.edu/~sastry/pubs/PDFs%20of%20Pubs2000-2005/Publications%20of%20Postdocs/Eklund/Eklund.ImplementingTesting%202005.pdf
[19]	SHAW R.Fighter combat tactics and maneuvering[M].Annapolis:Naval Institute Press, 1985:12-15.
[20]	POWELL W B.Approximate dynamic programming:Solving the curses of dimensionality[M].2nd ed.Hoboken:John Wiley & Sons, Inc., 2011:305-307.