
Cross-modal object tracking algorithm based on pedestrian attribute

ZHOU Qianli, ZHANG Wenjing, ZHAO Luping, TIAN Naiqian, WANG Rong

Citation: ZHOU Qianli, ZHANG Wenjing, ZHAO Luping, et al. Cross-modal object tracking algorithm based on pedestrian attribute[J]. Journal of Beijing University of Aeronautics and Astronautics, 2020, 46(9): 1635-1642. doi: 10.13700/j.bh.1001-5965.2020.0042 (in Chinese)


doi: 10.13700/j.bh.1001-5965.2020.0042
Funds:

National Key R&D Program of China (A19808)

Operating Expenses of Basic Scientific Research Project of the People's Public Security University of China (2019JKF111)

Details
    Author biographies:

    ZHOU Qianli, male, Ph.D. candidate. Research interests: computer vision

    ZHANG Wenjing, male, Ph.D. candidate. Research interests: computer vision

    ZHAO Luping, female, M.S. candidate. Research interests: single object tracking

    TIAN Naiqian, female, M.S. candidate. Research interests: single object tracking

    WANG Rong, female, Ph.D., professor and doctoral supervisor. Research interests: computer vision

    Corresponding author:

    WANG Rong. E-mail: dbdxwangrong@163.com

  • CLC number: TP183; TP389.1


  • Abstract:

    To address the problem that intra-class distractors degrade the accuracy and robustness of object tracking algorithms aimed at individual persons, this paper analyzes the shortcomings of current tracking algorithms for individual person tracking and proposes a method that uses linguistic prior knowledge to guide an auxiliary tracker. A language-guided branch is added on top of a visual tracker to generate attention on the tracked target, reducing the influence of intra-class distractors. A bounding-box regression method based on localization confidence overcomes the limitation of Siamese-network trackers, which locate candidate boxes by classification confidence alone, and fuses cross-modal information to improve the tracking accuracy for a specific target. To make the model better suited to tracking specific persons, a cross-modal person tracking dataset was constructed for training and validation. Experiments show that the proposed model performs better on individual person tracking, demonstrating its effectiveness.
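The abstract's key localization step, re-ranking candidates selected by classification score according to a predicted localization (IoU-style) confidence in the spirit of IoU-Net [24], can be sketched as follows. This is a minimal illustration only; the function and variable names are hypothetical, not taken from the paper's implementation:

```python
import numpy as np

def select_box(boxes, cls_scores, loc_confidences, top_k=5):
    """Pick a final box among candidate proposals.

    Instead of returning the candidate with the highest classification
    score, keep the top-k best-classified candidates and choose the one
    with the highest predicted localization confidence (an estimate of
    box quality), as described in the abstract.

    boxes:           (N, 4) array of [x1, y1, x2, y2]
    cls_scores:      (N,) classification confidence per candidate
    loc_confidences: (N,) predicted localization confidence per candidate
    """
    top = np.argsort(cls_scores)[::-1][:top_k]   # best-classified candidates
    best = top[np.argmax(loc_confidences[top])]  # re-rank by localization
    return boxes[best]
```

With this selection rule, a well-classified but poorly aligned box can be overruled by a candidate whose predicted overlap with the target is higher.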

     

  • Figure 1. Image segmentation of visual referring expression

    Figure 2. Cross-modal object tracking framework

    Figure 3. Results of multiple modules predicted regression

    Figure 4. Illustration of minimum enclosing rectangle

    Figure 5. Comparative results among mainstream trackers

    Figure 6. OPE evaluation results between proposed model and mainstream tracking algorithms

    Figure 7. Results visualization of different trackers

    Figure 8. Tracking assessment of language detection

    Table 1. Comparison results of language guided module

    Model parameter type   Mean IoU
    Parameter 0            0.2413
    Parameter 1            0.3598
    Parameter 2            0.3498
    Optimized parameters   0.4650

    Table 2. Comparative results between proposed model and mainstream tracking algorithms

    Algorithm        Average precision   Success rate
    SiamRPN[2]       0.493               0.566
    SiamRPN++[5]     0.508               0.612
    SiamMask[21]     0.708               0.808
    ECO[26]          0.647               0.797
    ATOM[27]         0.732               0.848
    DIMP[28]         0.787               0.841
    Proposed model   0.930               0.978
  • [1] BERTINETTO L, VALMADRE J, HENRIQUES J F, et al. Fully-convolutional siamese networks for object tracking[C]//European Conference on Computer Vision. Berlin: Springer, 2016: 850-865.
    [2] LI B, YAN J, WU W, et al. High performance visual tracking with siamese region proposal network[C]//Proceedings of the IEEE Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2018: 8971-8980.
    [3] KOSIOREK A R, BEWLEY A, POSNER I, et al. Hierarchical attentive recurrent tracking[C]//Neural Information Processing Systems, 2017, 36: 3053-3061.
    [4] ZHANG Z, PENG H. Deeper and wider siamese networks for real-time visual tracking[C]//Proceedings of the IEEE Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2019: 4591-4600.
    [5] LI B, WU W, WANG Q, et al. Evolution of siamese visual tracking with very deep networks[J]. IEEE Computer Vision and Pattern Recognition, 2019, 35(9): 4282-4291.
    [6] ZHU Z, WANG Q, LI B, et al. Distractor-aware siamese networks for visual object tracking[C]//European Conference on Computer Vision. Berlin: Springer, 2018: 103-119.
    [7] REN L, YUAN X, LU J, et al. Deep reinforcement learning with iterative shift for visual tracking[C]//European Conference on Computer Vision. Berlin: Springer, 2018: 684-700.
    [8] ZHANG L, GONZALEZ-GARCIA A, VAN DE WEIJER J, et al. Learning the model update for siamese trackers[C]//Proceedings of the IEEE International Conference on Computer Vision. Piscataway: IEEE Press, 2019: 4010-4019.
    [9] MOGADALA A, KALIMUTHU M, KLAKOW D, et al. Trends in integration of vision and language research: A survey of tasks, datasets, and methods[J]. IEEE Computer Vision and Pattern Recognition, 2019, 30(19): 1183-1986.
    [10] HU R, ROHRBACH M, DARRELL T, et al. Segmentation from natural language expressions[C]//European Conference on Computer Vision. Berlin: Springer, 2016: 108-124.
    [11] LI Z, TAO R, GAVVES E, et al. Tracking by natural language specification[C]//Proceedings of the IEEE Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2017: 7350-7358.
    [12] YU L, LIN Z, SHEN X, et al. Modular attention network for referring expression comprehension[C]//Proceedings of the IEEE Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2018: 1307-1315.
    [13] SUN C, MYERS A, VONDRICK C, et al. A joint model for video and language representation learning[C]//Proceedings of the IEEE International Conference on Computer Vision. Piscataway: IEEE Press, 2019: 7464-7473.
    [14] SU W, ZHU X, CAO Y, et al. Pre-training of generic visual-linguistic representations[C]//Proceedings of the IEEE International Conference on Computer Vision. Piscataway: IEEE Press, 2019: 13-23.
    [15] WU Y, LIM J, YANG M, et al. Online object tracking: A benchmark[C]//Proceedings of the IEEE Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2013: 2411-2418.
    [16] WU Y, LIM J, YANG M H. Object tracking benchmark[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2015, 37(9): 1834-1848. doi: 10.1109/TPAMI.2014.2388226
    [17] GALOOGAHI H K, FAGG A, HUANG C, et al. A benchmark for higher frame rate object tracking[C]//Proceedings of the IEEE International Conference on Computer Vision. Piscataway: IEEE Press, 2017: 1134-1143.
    [18] MULLER M, BIBI A, GIANCOLA S, et al. A large-scale dataset and benchmark for object tracking in the wild[C]//European Conference on Computer Vision. Berlin: Springer, 2018: 310-327.
    [19] HUANG L, ZHAO X, HUANG K, et al. A large high-diversity benchmark for generic object tracking in the wild[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2019, 45(21): 1374-1391.
    [20] FAN H, LIN L, YANG F, et al. A high-quality benchmark for large-scale single object tracking[C]//Proceedings of the IEEE International Conference on Computer Vision. Piscataway: IEEE Press, 2018: 5374-5383.
    [21] WANG Q, ZHANG L, BERTINETTO L, et al. Fast online object tracking and segmentation: A unifying approach[C]//Proceedings of the IEEE Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2019: 1328-1338.
    [22] HE K, ZHANG X, REN S, et al. Deep residual learning for image recognition[C]//Proceedings of the IEEE Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2016: 770-778.
    [23] MARGFFOY-TUAY E A, PEREZ J C, BOTERO E, et al. Dynamic multimodal instance segmentation guided by natural language queries[C]//European Conference on Computer Vision. Berlin: Springer, 2018: 656-672.
    [24] JIANG B, LUO R, MAO J, et al. Acquisition of localization confidence for accurate object detection[C]//European Conference on Computer Vision. Berlin: Springer, 2018: 816-832.
    [25] KAZEMZADEH S, ORDONEZ V, MATTEN M, et al. Referring to objects in photographs of natural scenes[C]//Empirical Methods in Natural Language Processing, 2014, 28: 787-789.
    [26] DANELLJAN M, BHAT G, KHAN F S, et al. Efficient convolution operators for tracking[C]//Proceedings of the IEEE Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2017: 6931-6939.
    [27] DANELLJAN M, BHAT G, KHAN F S, et al. Accurate tracking by overlap maximization[C]//Proceedings of the IEEE Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2019: 4660-4669.
    [28] BHAT G, DANELLJAN M, VAN GOOL L, et al. Learning discriminative model prediction for tracking[C]//Proceedings of the IEEE International Conference on Computer Vision. Piscataway: IEEE Press, 2019: 6182-6191.
Publication history
  • Received: 2020-02-21
  • Accepted: 2020-03-15
  • Published online: 2020-09-20
