Volume 48 Issue 4
Apr.  2022
CHEN Lin, GAO Zan, SONG Xuemeng, et al. A cross-modal pedestrian Re-ID algorithm based on dual attribute information[J]. Journal of Beijing University of Aeronautics and Astronautics, 2022, 48(4): 647-656. doi: 10.13700/j.bh.1001-5965.2020.0614 (in Chinese)

A cross-modal pedestrian Re-ID algorithm based on dual attribute information

doi: 10.13700/j.bh.1001-5965.2020.0614
Funds:

National Natural Science Foundation of China 61772310

National Natural Science Foundation of China 61702300

National Natural Science Foundation of China U1936203

National Natural Science Foundation of China 61872270

National Key R & D Program of China 2018AAA0102502

Shandong Provincial Natural Science Foundation ZR2019JQ23

Shandong Provincial Key Research and Development Program 2019JZZY010118

Young Creative Team in Universities of Shandong Province 2020KJN012

Innovation Teams in Colleges and Universities in Jinan 2018GXRC014

More Information
  • Corresponding author: NIE Liqiang, E-mail: nieliqiang@gmail.com
  • Received Date: 04 Nov 2020
  • Accepted Date: 12 Nov 2020
  • Publish Date: 20 Apr 2022
  • Studies of cross-modal retrieval show that attribute information can enhance the semantic representation of extracted features. However, existing natural-language-based cross-modal pedestrian Re-ID algorithms do not make adequate use of the attributes of pedestrian images and text. To address this, a novel cross-modal pedestrian Re-ID algorithm based on dual attribute information is proposed. Specifically, the attribute information of pedestrian images and of pedestrian text descriptions is explored jointly, and a dual attribute space is built to improve the discriminability and semantic expressiveness of the extracted image and text features. Extensive experiments on the public cross-modal pedestrian Re-ID dataset CUHK-PEDES show that the proposed algorithm is comparable with the state-of-the-art algorithm CMAAM (Top-1 56.68%): its Top-1 retrieval accuracy reaches 56.42%, and Top-5 and Top-10 are improved by 0.45% and 0.29%, respectively. In addition, retrieval accuracy improves significantly if class information is available for the gallery image pool and is used to extract attribute features, with Top-1 reaching 64.88%. The ablation study further confirms the importance of the text and image attributes used by the proposed algorithm and the effectiveness of the dual attribute space.
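
The Top-1, Top-5 and Top-10 figures quoted above are retrieval accuracies: the fraction of text queries for which at least one of the k highest-ranked gallery images shares the query's identity. The sketch below shows one common way to compute such a metric. It is not the authors' evaluation code; the function name `top_k_accuracy`, the use of cosine similarity, and the array layout are assumptions made for illustration only.

```python
import numpy as np


def top_k_accuracy(text_feats, image_feats, text_ids, image_ids, ks=(1, 5, 10)):
    """Top-k retrieval accuracy for text-to-image search (illustrative sketch).

    text_feats:  (num_queries, dim) array of text query features
    image_feats: (num_gallery, dim) array of gallery image features
    text_ids:    (num_queries,) identity label of each query
    image_ids:   (num_gallery,) identity label of each gallery image
    """
    # L2-normalise so the dot product equals cosine similarity (an assumption;
    # the abstract does not state which similarity measure is used).
    t = text_feats / np.linalg.norm(text_feats, axis=1, keepdims=True)
    g = image_feats / np.linalg.norm(image_feats, axis=1, keepdims=True)
    sim = t @ g.T  # shape: (num_queries, num_gallery)

    # For each query, gallery indices ordered from most to least similar.
    ranking = np.argsort(-sim, axis=1)
    ranked_ids = image_ids[ranking]

    # hits[i, j] is True when the j-th ranked image has the query's identity.
    hits = ranked_ids == text_ids[:, None]
    return {k: float(hits[:, :k].any(axis=1).mean()) for k in ks}
```

Called with features produced by any image and text encoders, the function returns a dictionary mapping each k to an accuracy in [0, 1]; multiplying by 100 gives percentages of the kind reported in the abstract.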

     

  • [1]
    ZHENG L, YANG Y, HAUPTMANN A G. Person re-identification: Past, present and future[EB/OL]. (2016-10-11)[2020-10-30]. http://arxiv.org/abs/1610.02984.
    [2]
    LUO H, JIANG W, FAN X, et al. A survey on deep learning based person re-identification[J]. Acta Automatica Sinica, 2019, 45(11): 2032-2049 (in Chinese). https://www.cnki.com.cn/Article/CJFDTOTAL-MOTO201911002.htm
    [3]
    YE M, SHEN J, LIN G, et al. Deep learning for person re-identification: A survey and outlook[EB/OL]. (2020-01-13)[2020-10-30]. https://arxiv.org/abs/2001.04193v1.
    [4]
    LI S, XIAO T, LI H, et al. Person search with natural language description[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2017: 1970-1979.
    [5]
    JI G, LI S J, PANG Y. Fusion-attention network for person search with free-form natural language[J]. Pattern Recognition Letters, 2018, 116: 205-211. doi: 10.1016/j.patrec.2018.10.020
    [6]
    CHEN T, XU C, LUO J. Improving text-based person search by spatial matching and adaptive threshold[C]//Proceedings of the IEEE Winter Conference on Applications of Computer Vision. Piscataway: IEEE Press, 2018: 1879-1887.
    [7]
    CHEN D, LI H, LIU X, et al. Improving deep visual representation for person re-identification by global and local image-language association[C]//Proceedings of the European Conference on Computer Vision. Berlin: Springer, 2018: 56-73.
    [8]
    WANG Y, BO C, WANG D, et al. Language person search with mutually connected classification loss[C]//Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing. Piscataway: IEEE Press, 2019: 2057-2061.
    [9]
    ZHANG Y, LU H. Deep cross-modal projection learning for image-text matching[C]//Proceedings of the European Conference on Computer Vision. Berlin: Springer, 2018: 707-723.
    [10]
    AGGARWAL S, BABU R V, CHAKRABORTY A. Text-based person search via attribute-aided matching[C]//Proceedings of the IEEE Winter Conference on Applications of Computer Vision. Piscataway: IEEE Press, 2020: 2617-2625.
    [11]
    LI D, CHEN X, HUANG K Q. Multi-attribute learning for pedestrian attribute recognition in surveillance scenarios[C]//Proceedings of the Asian Conference on Pattern Recognition. Piscataway: IEEE Press, 2015: 111-115.
    [12]
    DENG Y, LUO P, LOY C C, et al. Pedestrian attribute recognition at far distance[C]//Proceedings of the ACM International Conference on Multimedia. New York: ACM, 2014: 789-792.
    [13]
    WANG J, ZHU X, GONG S, et al. Attribute recognition by joint recurrent learning of context and correlation[C]//Proceedings of the IEEE International Conference on Computer Vision. Piscataway: IEEE Press, 2017: 17467750.
    [14]
    LIN Y, ZHENG L, ZHENG Z, et al. Improving person re-identification by attribute and identity learning[J]. Pattern Recognition, 2019, 95: 151-161. doi: 10.1016/j.patcog.2019.06.006
    [15]
    MATSUKAWA T, SUZUKI E. Person re-identification using CNN features learned from combination of attributes[C]//Proceedings of the IEEE International Conference on Pattern Recognition. Piscataway: IEEE Press, 2016: 2428-2433.
    [16]
    CHEN W, CHEN X, ZHANG J, et al. Beyond triplet loss: A deep quadruplet network for person re-identification[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2017: 17355323.
    [17]
    HERMANS A, BEYER L, LEIBE B. In defense of the triplet loss for person re-identification[EB/OL]. (2017-03-22)[2020-10-30]. https://arxiv.org/abs/1703.07737.
    [18]
    YIN J, WU A, ZHENG W. Fine-grained person re-identification[J]. International Journal of Computer Vision, 2020, 128: 1654-1672. doi: 10.1007/s11263-019-01259-0
    [19]
    GAO Z, GAO L S, ZHANG H, et al. Deep spatial pyramid feature collaborative reconstruction for partial person reid[C]//Proceedings of the ACM International Conference on Multimedia. New York: ACM, 2019: 1879-1887.
    [20]
    ZHENG Z, YANG X, YU Z, et al. Joint discriminative and generative learning for person re-identification[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2019: 1335-1344.
    [21]
    YANG Q, WU A, ZHENG W, et al. Person re-identification by contour sketch under moderate clothing change[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2021, 43(6): 2029-2046. doi: 10.1109/TPAMI.2019.2960509
    [22]
    WANG B, YANG Y, XU X, et al. Adversarial cross-modal retrieval[C]//Proceedings of the ACM International Conference on Multimedia. New York: ACM, 2017: 154-162.
    [23]
    JING Y, SI C, WANG J, et al. Pose-guided joint global and attentive local matching network for text-based person search[EB/OL]. (2018-09-22)[2020-10-30]. https://arxiv.org/abs/1809.08440v2.
    [24]
    LOPER E, KLEIN E, BIRD S. Natural language processing with python-natural language toolkit[CP/OL]. (2019-09-04)[2020-10-30]. http://www.nltk.org/book/.
    [25]
    LIU X, ZHAO H, TIAN M, et al. HydraPlus-Net: Attentive deep features for pedestrian analysis[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2017: 350-359.
    [26]
    HE K, ZHANG X, REN S, et al. Deep residual learning for image recognition[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2016: 770-778.
    [27]
    LUO Y, ZHENG Z, ZHENG L, et al. Macro-micro adversarial network for human parsing[C]//Proceedings of the European Conference on Computer Vision. Berlin: Springer, 2018: 424-440.
    [28]
    SUN B, SAENKO K. Deep CORAL: Correlation alignment for deep domain adaptation[C]//Proceedings of the European Conference on Computer Vision. Berlin: Springer, 2016: 443-450.
    [29]
    SHI B, JI L, LU P, et al. Knowledge aware semantic concept expansion for image-text matching[C]//Proceedings of the International Joint Conference on Artificial Intelligence. San Francisco: Morgan Kaufmann, 2019: 5182-5189.
    [30]
    HUANG Y, WU Q, SONG C, et al. Learning semantic concepts and order for image and sentence matching[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2018: 6163-6171.
    [31]
    KINGMA D P, BA J. Adam: A method for stochastic optimization[EB/OL]. (2017-01-30)[2020-10-30]. https://arxiv.org/abs/1412.6980.
    [32]
    LI S, XIAO T, LI H, et al. Identity-aware textual-visual matching with latent co-attention[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2017: 1908-1917.
    [33]
    ZHENG Z, ZHENG L, GARRETT M, et al. Dual-path convolutional image-text embeddings with instance loss[J]. ACM Transactions on Multimedia Computing, Communications, and Applications, 2020, 16(2): 1-23.

    Figures(6)  / Tables(4)
