
A cross-modal pedestrian Re-ID algorithm based on dual attribute information

CHEN Lin, GAO Zan, SONG Xuemeng, WANG Yinglong, NIE Liqiang

Citation: CHEN Lin, GAO Zan, SONG Xuemeng, et al. A cross-modal pedestrian Re-ID algorithm based on dual attribute information[J]. Journal of Beijing University of Aeronautics and Astronautics, 2022, 48(4): 647-656. doi: 10.13700/j.bh.1001-5965.2020.0614 (in Chinese)


doi: 10.13700/j.bh.1001-5965.2020.0614

    Corresponding author:

    NIE Liqiang, E-mail: nieliqiang@gmail.com

  • CLC number: TP183

A cross-modal pedestrian Re-ID algorithm based on dual attribute information

Funds: 

National Natural Science Foundation of China 61772310

National Natural Science Foundation of China 61702300

National Natural Science Foundation of China U1936203

National Natural Science Foundation of China 61872270

National Key R & D Program of China 2018AAA0102502

Shandong Provincial Natural Science Foundation ZR2019JQ23

Shandong Provincial Key Research and Development Program 2019JZZY010118

Young Creative Team in Universities of Shandong Province 2020KJN012

Innovation Teams in Colleges and Universities in Jinan 2018GXRC014

  • Abstract:

    Research on cross-modal retrieval has shown that attribute information can enhance the semantic expressiveness of extracted features, yet existing natural-language-based cross-modal pedestrian re-identification (Re-ID) algorithms do not fully exploit the attribute information of pedestrian images and texts. The proposed cross-modal pedestrian Re-ID algorithm based on dual attribute information fully considers the attribute information of pedestrian images and textual descriptions, constructs a dual attribute space from text attributes and image attributes, and builds an end-to-end cross-modal pedestrian Re-ID network over both the latent space and the attribute space, thereby improving the discriminability and semantic expressiveness of the extracted image and text features. Experimental evaluation on the cross-modal pedestrian Re-ID dataset CUHK-PEDES shows that the proposed algorithm reaches a Top-1 retrieval accuracy of 56.42%, comparable to the 56.68% Top-1 of the CMAAM algorithm, while its Top-5 and Top-10 accuracies exceed CMAAM by 0.45% and 0.29%, respectively. For application scenarios in which the gallery images may carry identity labels, extracting attribute features with the pedestrians' category information raises the retrieval accuracy substantially, with Top-1 reaching 64.88%. Ablation experiments confirm the importance of the text and image attributes used by the algorithm and the effectiveness of the dual attribute space.
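The dual-space matching described in the abstract can be sketched as follows. This is a minimal illustration rather than the paper's implementation: the feature dimensions (2048-d image, 768-d text), the 512-d shared spaces, the random projection matrices, and the additive fusion of the two similarity scores are all assumptions made for the sake of the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical projections into a shared latent space and a shared
# attribute space (in the paper these would be learned end-to-end).
W_img_lat, W_txt_lat = rng.normal(size=(2048, 512)), rng.normal(size=(768, 512))
W_img_att, W_txt_att = rng.normal(size=(2048, 512)), rng.normal(size=(768, 512))

def l2norm(x):
    """Row-wise L2 normalization so dot products become cosine similarities."""
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

def match_score(img_feat, txt_feat):
    """Cosine similarity in the latent space plus cosine similarity in the
    attribute space, fused by simple addition (an illustrative choice)."""
    s_lat = l2norm(img_feat @ W_img_lat) @ l2norm(txt_feat @ W_txt_lat).T
    s_att = l2norm(img_feat @ W_img_att) @ l2norm(txt_feat @ W_txt_att).T
    return s_lat + s_att
```

Each of the two cosine terms lies in [-1, 1], so the fused score lies in [-2, 2]; at retrieval time the gallery images would be ranked by this score for each text query.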

     

  • Figure 1.  Text attribute extraction

    Figure 2.  Result of image attribute recognition

    Figure 3.  Framework of cross-modal matching algorithm based on dual attribute information

    Figure 4.  Change of loss function during training

    Figure 5.  Examples of Top-5 results for cross-modal pedestrian re-identification by proposed algorithm

    Figure 6.  Retrieval results using attribute space and latent space and after fusion

    Table 1.  Performance evaluation on CUHK-PEDES dataset and comparison with existing algorithms (%)

    Algorithm                          Top-1    Top-5    Top-10
    GNA-RNN[4]                         19.05      -      53.64
    IATV[32]                           25.94      -      60.48
    PWM+ATH[6]                         27.14    49.45    61.02
    DPCE[33]                           44.40    66.26    75.07
    GLA[7]                             43.58    66.93    76.26
    CMPC+CMPM[9]                       49.37      -      79.27
    MCCL[8]                            50.58      -      79.06
    GALM[23]                           54.12    75.45    82.97
    CMAAM[10]                          56.68    77.18    84.86
    Proposed (without preprocessing)   56.42    77.63    85.15
    Proposed (with preprocessing)      64.88    85.02    90.55
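The Top-k accuracies reported in Table 1 measure how often a gallery image with the query's identity appears among the k highest-scoring retrievals. A minimal sketch of the metric (the function name and array layout are illustrative, not from the paper):

```python
import numpy as np

def top_k_accuracy(sim, query_ids, gallery_ids, k):
    """Top-k retrieval accuracy: the fraction of queries for which at least
    one gallery item sharing the query's identity appears among the k most
    similar gallery items.

    sim: (num_queries, num_gallery) similarity matrix.
    """
    order = np.argsort(-sim, axis=1)[:, :k]           # k best gallery indices per query
    hits = gallery_ids[order] == query_ids[:, None]   # identity match within top-k?
    return hits.any(axis=1).mean()
```

For example, with two text queries of identity 2 and a three-image gallery of identities [1, 2, 1], a query whose best match is a wrong-identity image still counts as a hit at k=2 if the correct identity ranks second.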

    Table 2.  Analysis of importance of dual attribute space and latent space (%)

    Strategy                         Accuracy   Top-1    Top-5    Top-10
    Single attribute extraction      Rlatent    54.34    75.42    82.94
                                     Rattr      39.26    65.50    75.29
                                     Rtotal     56.42    77.63    85.15
    Multiple attribute extraction    Rlatent    54.35    75.88    83.14
                                     Rattr      52.65    77.86    85.98
                                     Rtotal     64.88    85.02    90.55

    Table 3.  Analysis of importance of text attribute and image attribute (%)

    Strategy                   Accuracy   Top-1    Top-5    Top-10
    Without image attribute    Rlatent    53.20    74.77    82.47
                               Rattr      40.07    66.81    76.69
                               Rtotal     54.81    77.49    84.67
    Without text attribute     Rlatent    53.41    75.00    82.88
                               Rattr      37.35    62.13    72.38
                               Rtotal     51.45    73.15    81.56

    Table 4.  Role of modal difference elimination (%)

    Strategy               Accuracy   Top-1    Top-5    Top-10
    Without CORAL loss     Rtotal     44.25    60.62    67.43
    With CORAL loss        Rtotal     56.42    77.63    85.15
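Table 4 ablates the CORAL loss of reference [28], which narrows the gap between the two modalities by aligning the second-order statistics of their feature distributions. A minimal NumPy sketch of the loss as defined by Sun and Saenko (the batch layout and variable names are illustrative):

```python
import numpy as np

def coral_loss(source, target):
    """CORAL loss: squared Frobenius distance between the feature covariance
    matrices of two batches, scaled by 1 / (4 d^2), where d is the feature
    dimension (Sun & Saenko, ECCV 2016)."""
    d = source.shape[1]
    c_s = np.cov(source, rowvar=False)   # d x d covariance of, e.g., image features
    c_t = np.cov(target, rowvar=False)   # d x d covariance of, e.g., text features
    return np.sum((c_s - c_t) ** 2) / (4 * d ** 2)
```

The loss is zero when the two batches have identical covariance structure, so minimizing it pushes the image and text feature distributions toward a shared second-order shape regardless of their per-feature means.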
  • [1] ZHENG L, YANG Y, HAUPTMANN A G. Person re-identification: Past, present and future[EB/OL]. (2016-10-11)[2020-10-30]. http://arxiv.org/abs/1610.02984.
    [2] LUO H, JIANG W, FAN X, et al. A survey on deep learning based person re-identification[J]. Acta Automatica Sinica, 2019, 45(11): 2032-2049 (in Chinese). https://www.cnki.com.cn/Article/CJFDTOTAL-MOTO201911002.htm
    [3] YE M, SHEN J, LIN G, et al. Deep learning for person re-identification: A survey and outlook[EB/OL]. (2020-01-13)[2020-10-30]. https://arxiv.org/abs/2001.04193v1.
    [4] LI S, XIAO T, LI H, et al. Person search with natural language description[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2017: 1970-1979.
    [5] JI G, LI S J, PANG Y. Fusion-attention network for person search with free-form natural language[J]. Pattern Recognition Letters, 2018, 116: 205-211. doi: 10.1016/j.patrec.2018.10.020
    [6] CHEN T, XU C, LUO J. Improving text-based person search by spatial matching and adaptive threshold[C]//Proceedings of the IEEE Winter Conference on Applications of Computer Vision. Piscataway: IEEE Press, 2018: 1879-1887.
    [7] CHEN D, LI H, LIU X, et al. Improving deep visual representation for person re-identification by global and local image-language association[C]//Proceedings of the European Conference on Computer Vision. Berlin: Springer, 2018: 56-73.
    [8] WANG Y, BO C, WANG D, et al. Language person search with mutually connected classification loss[C]//Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing. Piscataway: IEEE Press, 2019: 2057-2061.
    [9] ZHANG Y, LU H. Deep cross-modal projection learning for image-text matching[C]//Proceedings of the European Conference on Computer Vision. Berlin: Springer, 2018: 707-723.
    [10] AGGARWAL S, BABU R V, CHAKRABORTY A. Text-based person search via attribute-aided matching[C]//Proceedings of the IEEE Winter Conference on Applications of Computer Vision. Piscataway: IEEE Press, 2020: 2617-2625.
    [11] LI D, CHEN X, HUANG K Q. Multi-attribute learning for pedestrian attribute recognition in surveillance scenarios[C]//Proceedings of the Asian Conference on Pattern Recognition. Piscataway: IEEE Press, 2015: 111-115.
    [12] DENG Y, LUO P, LOY C C, et al. Pedestrian attribute recognition at far distance[C]// Proceedings of the ACM International Conference on Multimedia. New York: ACM, 2014: 789-792.
    [13] WANG J, ZHU X, GONG S, et al. Attribute recognition by joint recurrent learning of context and correlation[C]//Proceedings of the IEEE International Conference on Computer Vision. Piscataway: IEEE Press, 2017: 17467750.
    [14] LIN Y, ZHENG L, ZHENG Z, et al. Improving person re-identification by attribute and identity learning[J]. Pattern Recognition, 2019, 95: 151-161. doi: 10.1016/j.patcog.2019.06.006
    [15] MATSUKAWA T, SUZUKI E. Person re-identification using CNN features learned from combination of attributes[C]//Proceedings of the IEEE International Conference on Pattern Recognition. Piscataway: IEEE Press, 2016: 2428-2433.
    [16] CHEN W, CHEN X, ZHANG J, et al. Beyond triplet loss: A deep quadruplet network for person re-identification[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2017: 17355323.
    [17] HERMANS A, BEYER L, LEIBE B. In defense of the triplet loss for person re-identification[EB/OL]. (2017-03-22)[2020-10-30]. https://arxiv.org/abs/1703.07737.
    [18] YIN J, WU A, ZHENG W. Fine-grained person re-identification[J]. International Journal of Computer Vision, 2020, 128: 1654-1672. doi: 10.1007/s11263-019-01259-0
    [19] GAO Z, GAO L S, ZHANG H, et al. Deep spatial pyramid feature collaborative reconstruction for partial person reid[C]//Proceedings of the ACM International Conference on Multimedia. New York: ACM, 2019: 1879-1887.
    [20] ZHENG Z, YANG X, YU Z, et al. Joint discriminative and generative learning for person re-identification[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2019: 1335-1344.
    [21] YANG Q, WU A, ZHENG W, et al. Person re-identification by contour sketch under moderate clothing change[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2021, 43(6): 2029-2046. doi: 10.1109/TPAMI.2019.2960509
    [22] WANG B, YANG Y, XU X, et al. Adversarial cross-modal retrieval[C]//Proceedings of the ACM International Conference on Multimedia. New York: ACM, 2017: 154-162.
    [23] JING Y, SI C, WANG J, et al. Pose-guided joint global and attentive local matching network for text-based person search[EB/OL]. (2018-09-22)[2020-10-30]. https://arxiv.org/abs/1809.08440v2.
    [24] LOPER E, KLEIN E, BIRD S. Natural language processing with python-natural language toolkit[CP/OL]. (2019-09-04)[2020-10-30]. http://www.nltk.org/book/.
    [25] LIU X, ZHAO H, TIAN M, et al. HydraPlus-Net: Attentive deep features for pedestrian analysis[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2017: 350-359.
    [26] HE K, ZHANG X, REN S, et al. Deep residual learning for image recognition[C]// Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2016: 770-778.
    [27] LUO Y, ZHENG Z, ZHENG L, et al. Macro-micro adversarial network for human parsing[C]//Proceedings of the European Conference on Computer Vision. Berlin: Springer, 2018: 424-440.
    [28] SUN B, SAENKO K. Deep CORAL: Correlation alignment for deep domain adaptation[C]//Proceedings of the European Conference on Computer Vision. Berlin: Springer, 2016: 443-450.
    [29] SHI B, JI L, LU P, et al. Knowledge aware semantic concept expansion for image-text matching[C]//Proceedings of the International Joint Conference on Artificial Intelligence. San Francisco: Morgan Kaufmann, 2019: 5182-5189.
    [30] HUANG Y, WU Q, SONG C, et al. Learning semantic concepts and order for image and sentence matching[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2018: 6163-6171.
    [31] KINGMA D P, BA J. Adam: A method for stochastic optimization[EB/OL]. (2017-01-30)[2020-10-30]. https://arxiv.org/abs/1412.6980.
    [32] LI S, XIAO T, LI H, et al. Identity-aware textual-visual matching with latent co-attention[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2017: 1908-1917.
    [33] ZHENG Z, ZHENG L, GARRETT M, et al. Dual-path convolutional image-text embeddings with instance loss[J]. ACM Transactions on Multimedia Computing, Communications, and Applications, 2020, 16(2): 1-23.
Figures(6) / Tables(4)
Publication history
  • Received: 2020-11-04
  • Accepted: 2020-11-12
  • Published online: 2022-04-20