
Multi-scale joint learning for person re-identification

XIE Pengyu, XU Xin

Citation: XIE Pengyu, XU Xin. Multi-scale joint learning for person re-identification[J]. Journal of Beijing University of Aeronautics and Astronautics, 2021, 47(3): 613-622. doi: 10.13700/j.bh.1001-5965.2020.0445 (in Chinese)

doi: 10.13700/j.bh.1001-5965.2020.0445
Funds:

National Natural Science Foundation of China U1803262

National Natural Science Foundation of China 61602349

National Natural Science Foundation of China 61440016
Details
    About the authors:

    XIE Pengyu  Male, M.S. candidate. Research interests: computer vision, person re-identification.

    XU Xin  Male, Ph.D., professor, Ph.D. supervisor. Research interests: computer vision, machine learning, person re-identification.

    Corresponding author:

    XU Xin, E-mail: xuxin0336@163.com

  • CLC number: TP391

  • Abstract:

    Existing person re-identification methods mainly focus on learning local pedestrian features to recognize pedestrians across cameras. However, under non-ideal conditions such as body-part movement, occlusion, or background clutter, the probability of losing local discriminative information increases. To address this problem, a multi-scale joint learning method is proposed to refine the representation of discriminative pedestrian features. The method consists of three branch networks that extract coarse-grained global features, fine-grained global features, and fine-grained local features, respectively. The coarse-grained global branch enriches the global feature by fusing semantic information from different levels; the fine-grained global branch joins all local features, describing the global feature at fine granularity while learning correlations among body parts; the fine-grained local branch traverses the local features to mine non-salient pedestrian information and thereby enhance the robustness of local features. To verify the effectiveness of the proposed method, comparative experiments were conducted on three public datasets: Market1501, DukeMTMC-ReID, and CUHK03. The experimental results show that the proposed method achieves the best performance.
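The three feature granularities described in the abstract can be illustrated with a toy numpy sketch. The function name `multi_scale_features`, the strip count, and the use of plain average pooling are illustrative assumptions: the paper's actual branches are convolutional sub-networks, not simple pooling.

```python
import numpy as np

def multi_scale_features(feat_map: np.ndarray, n_strips: int = 6):
    """Illustrate the three feature granularities on one backbone output.

    feat_map: (C, H, W) activation map from a shared backbone.
    Returns (coarse_global, fine_global, fine_local) features.
    """
    c, h, w = feat_map.shape
    assert h % n_strips == 0, "height must divide evenly into strips"

    # Coarse-grained global branch: pool over the whole spatial extent.
    coarse_global = feat_map.mean(axis=(1, 2))                     # (C,)

    # Fine-grained local branch: pool each horizontal strip separately,
    # giving one part-level descriptor per body region.
    strips = feat_map.reshape(c, n_strips, h // n_strips, w)
    fine_local = [strips[:, i].mean(axis=(1, 2)) for i in range(n_strips)]

    # Fine-grained global branch: join all local strip descriptors into
    # one vector, keeping part-level detail in a global representation.
    fine_global = np.concatenate(fine_local)                       # (C*n_strips,)

    return coarse_global, fine_global, fine_local
```

The sketch only conveys the structural idea: one shared feature map feeding three descriptors at different spatial granularities.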

     

  • Figure 1.  Obscured images of similar pedestrians in real scenes

    Figure 2.  Multi-scale joint learning network framework

    Figure 3.  Fine-grained local branch

    Figure 4.  Partial image query results on Market1501 dataset

    Figure 5.  Partial image heatmaps on Market1501 dataset

    Table 1.  Performance comparison of the multi-scale joint learning method and other methods (%)

    Category         Method         CUHK03 (Labeled)   CUHK03 (Detected)   Market1501      DukeMTMC-ReID
                                    Rank-1    mAP      Rank-1    mAP       Rank-1   mAP    Rank-1   mAP
    Part-based       IDE[15]        22.0      21.0     21.3      19.7      72.5     46.0   67.7     47.1
                     MGN[8]         68.0      67.4     66.8      66.0      95.7     86.9   88.7     78.4
                     PCB[3]         61.9      56.8     60.6      54.4      92.3     77.4   81.7     66.1
                     Pyramid[6]     78.9      76.9     78.9      74.8      95.7     88.2   89.0     79.0
                     GFLF-S[34]     76.6      73.5     74.4      69.6      94.8     88.0   89.3     77.1
    Attention-based  CASN[35]       73.7      68.0     71.5      64.4      94.4     82.8   87.7     73.7
                     M1tB[36]       70.1      66.5     66.6      64.2      94.7     84.5   85.8     72.9
                     Mancs[37]      69.0      63.9     65.5      60.5      93.1     82.3   84.9     71.8
                     HACNN[24]      44.4      41.0     41.7      38.6      91.2     75.7   80.5     63.9
    Other            DPFL[38]       43.0      40.5     40.7      37.0      88.9     73.1   79.2     60.0
                     BDB[39]        73.6      71.7     72.8      69.3      94.2     84.3   86.8     72.1
                     SVDNet[40]     40.9      37.8     41.5      37.3      82.3     62.1   76.7     56.8
    Ours             Multi-scale    80.7      77.0     78.0      73.4      95.9     89.1   90.0     80.4
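Both tables report Rank-1 accuracy and mean average precision (mAP). A minimal sketch of how these metrics are computed from a query/gallery distance matrix, assuming every query has at least one gallery match and ignoring the same-camera filtering used in the standard evaluation protocols:

```python
import numpy as np

def rank1_and_map(dist: np.ndarray, q_ids: np.ndarray, g_ids: np.ndarray):
    """Rank-1 accuracy and mAP for a query/gallery retrieval split.

    dist: (num_query, num_gallery) distances (smaller = more similar).
    q_ids, g_ids: integer identity labels of query and gallery images.
    """
    rank1_hits, aps = 0, []
    for i in range(len(q_ids)):
        order = np.argsort(dist[i])            # gallery sorted by similarity
        matches = g_ids[order] == q_ids[i]     # relevance at each rank
        if matches[0]:                         # correct identity at rank 1
            rank1_hits += 1
        # Average precision: mean of precision at each relevant position.
        hit_pos = np.flatnonzero(matches)
        precisions = (np.arange(len(hit_pos)) + 1) / (hit_pos + 1)
        aps.append(precisions.mean())
    return rank1_hits / len(q_ids), float(np.mean(aps))
```

For example, a query whose only correct match appears at rank 3 contributes 0 to Rank-1 and an AP of 1/3 to mAP.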

    Table 2.  Ablation experiment of the multi-scale joint learning method (%)

    Method               CUHK03 (Labeled)   CUHK03 (Detected)   Market1501      DukeMTMC-ReID
                         Rank-1    mAP      Rank-1    mAP       Rank-1   mAP    Rank-1   mAP
    Baseline             59.1      54.2     55.1      50.2      93.5     82.4   85.3     72.0
    Baseline+CG          69.8      66.1     66.9      62.6      94.8     86.9   87.9     76.7
    Baseline+FG          70.9      67.1     68.2      63.3      95.1     87.3   88.2     77.9
    Baseline+CG+FG       76.4      72.1     73.0      68.4      95.3     88.7   88.7     79.1
    Baseline+CG+FP1      78.4      75.1     76.0      72.2      95.6     88.7   89.1     78.5
    Baseline+CG+FP2      78.7      75.2     75.5      71.5      95.6     89.0   89.2     79.6
    Baseline+FG+FP1      77.3      73.1     76.4      71.8      95.6     88.5   89.5     78.9
    Baseline+FG+FP2      77.6      74.2     75.0      71.4      95.7     88.8   89.5     79.8
    Baseline+CG+FG+FP1   80.7      77.0     78.0      73.4      95.9     88.8   89.6     79.2
    Baseline+CG+FG+FP2   80.8      76.7     76.0      71.8      95.9     89.1   90.0     80.4
  • [1] LECUN Y, BENGIO Y, HINTON G. Deep learning[J]. Nature, 2015, 521(7553): 436-444. doi: 10.1038/nature14539
    [2] ZHAO H, TIAN M, SUN S, et al. Spindle Net: Person re-identification with human body region guided feature decomposition and fusion[C]//Proceedings of the IEEE International Conference on Computer Vision. Piscataway: IEEE Press, 2017: 907-915.
    [3] SUN Y, ZHENG L, YANG Y, et al. Beyond part models: Person retrieval with refined part pooling (and a strong convolutional baseline)[C]//Proceedings of the European Conference on Computer Vision. Berlin: Springer, 2018: 480-496.
    [4] ZHENG L, HUANG Y, LU H, et al. Pose-invariant embedding for deep person re-identification[J]. IEEE Transactions on Image Processing, 2019, 28(9): 4500-4509. doi: 10.1109/TIP.2019.2910414
    [5] WEI L, ZHANG S, YAO H, et al. GLAD: Global-local-alignment descriptor for pedestrian retrieval[C]//Proceedings of the 25th ACM International Conference on Multimedia. New York: ACM Press, 2017: 420-428.
    [6] ZHENG F, DENG C, SUN X, et al. Pyramidal person re-identification via multi-loss dynamic training[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2019: 8514-8522.
    [7] FU Y, WEI Y, ZHOU Y, et al. Horizontal pyramid matching for person re-identification[C]//Proceedings of the AAAI Conference on Artificial Intelligence, 2019: 8295-8302.
    [8] WANG G, YUAN Y, CHEN X, et al. Learning discriminative features with multiple granularities for person re-identification[C]//Proceedings of the 26th ACM International Conference on Multimedia. New York: ACM Press, 2018: 274-282.
    [9] WANG Z, JIANG J, WU Y, et al. Learning sparse and identity-preserved hidden attributes for person re-identification[J]. IEEE Transactions on Image Processing, 2019, 29(1): 2013-2025.
    [10] ZENG Z, WANG Z, WANG Z, et al. Illumination-adaptive person re-identification[J]. IEEE Transactions on Multimedia, 2020, 22(12): 3064-3074. doi: 10.1109/TMM.2020.2969782
    [11] WANG Z, WANG Z, ZHENG Y, et al. Beyond intra-modality: A survey of heterogeneous person re-identification[EB/OL]. (2020-04-27)[2020-07-23]. https://arxiv.org/abs/1905.10048v4.
    [12] DALAL N, TRIGGS B. Histograms of oriented gradients for human detection[C]//2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2005: 886-893.
    [13] LIAO S, HU Y, ZHU X, et al. Person re-identification by local maximal occurrence representation and metric learning[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2015: 2197-2206.
    [14] KOESTINGER M, HIRZER M, WOHLHART P, et al. Large scale metric learning from equivalence constraints[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2012: 2288-2295.
    [15] ZHENG L, YANG Y, HAUPTMANN A G. Person re-identification: Past, present and future[EB/OL]. [2020-07-23]. https://arxiv.org/abs/1610.02984.
    [16] HE K, ZHANG X, REN S, et al. Deep residual learning for image recognition[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2016: 770-778.
    [17] SU C, LI J, ZHANG S, et al. Pose-driven deep convolutional model for person re-identification[C]//Proceedings of the IEEE International Conference on Computer Vision. Piscataway: IEEE Press, 2017: 3980-3989.
    [18] SUH Y, WANG J, TANG S, et al. Part-aligned bilinear representations for person re-identification[C]//Proceedings of the European Conference on Computer Vision. Berlin: Springer, 2018: 418-437.
    [19] XU J, ZHAO R, ZHU F, et al. Attention-aware compositional network for person re-identification[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2018: 2119-2128.
    [20] SARFRAZ M S, SCHUMANN A, EBERLE A, et al. A pose-sensitive embedding for person re-identification with expanded cross neighborhood re-ranking[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2018: 420-429.
    [21] ZHENG W S, LI X, XIANG T, et al. Partial person re-identification[C]//Proceedings of the IEEE International Conference on Computer Vision. Piscataway: IEEE Press, 2015: 4678-4686.
    [22] LUO H, JIANG W, ZHANG X, et al. AlignedReID++: Dynamically matching local information for person re-identification[J]. Pattern Recognition, 2019, 94: 53-61. doi: 10.1016/j.patcog.2019.05.028
    [23] SUN Y, XU Q, LI Y, et al. Perceive where to focus: Learning visibility-aware part-level features for partial person re-identification[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2019: 393-402.
    [24] LI W, ZHU X, GONG S. Harmonious attention network for person re-identification[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2018: 2285-2294.
    [25] LIU X, ZHAO H, TIAN M, et al. HydraPlus-Net: Attentive deep features for pedestrian analysis[C]//Proceedings of the IEEE International Conference on Computer Vision. Piscataway: IEEE Press, 2017: 350-359.
    [26] SCHROFF F, KALENICHENKO D, PHILBIN J. FaceNet: A unified embedding for face recognition and clustering[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2015: 815-823.
    [27] WANG F, XIANG X, CHENG J, et al. NormFace: L2 hypersphere embedding for face verification[C]//Proceedings of the 25th ACM International Conference on Multimedia. New York: ACM Press, 2017: 1041-1049.
    [28] HERMANS A, BEYER L, LEIBE B. In defense of the triplet loss for person re-identification[EB/OL]. (2017-11-17)[2020-07-23]. https://arxiv.org/abs/1703.07737.
    [29] ZHENG L, SHEN L, TIAN L, et al. Scalable person re-identification: A benchmark[C]//Proceedings of the IEEE International Conference on Computer Vision. Piscataway: IEEE Press, 2015: 1116-1124.
    [30] LI W, ZHAO R, XIAO T, et al. DeepReID: Deep filter pairing neural network for person re-identification[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2014: 152-159.
    [31] ZHONG Z, ZHENG L, CAO D, et al. Re-ranking person re-identification with k-reciprocal encoding[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2017: 1318-1327.
    [32] ZHENG Z, ZHENG L, YANG Y. Unlabeled samples generated by GAN improve the person re-identification baseline in vitro[C]//Proceedings of the IEEE International Conference on Computer Vision. Piscataway: IEEE Press, 2017: 3754-3762.
    [33] DENG J, DONG W, SOCHER R, et al. ImageNet: A large-scale hierarchical image database[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2009: 248-255.
    [34] PARK H, HAM B. Relation network for person re-identification[EB/OL]. (2017-08-22)[2020-07-23]. https://arxiv.org/abs/1701.07717.
    [35] ZHENG M, KARANAM S, WU Z, et al. Re-identification with consistent attentive Siamese networks[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2019: 5735-5744.
    [36] YANG W, HUANG H, ZHANG Z, et al. Towards rich feature discovery with class activation maps augmentation for person re-identification[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2019: 1389-1398.
    [37] WANG C, ZHANG Q, HUANG C, et al. Mancs: A multi-task attentional network with curriculum sampling for person re-identification[C]//Proceedings of the European Conference on Computer Vision. Berlin: Springer, 2018: 365-381.
    [38] CHEN Y, ZHU X, GONG S. Person re-identification by deep learning multi-scale representations[C]//Proceedings of the IEEE International Conference on Computer Vision. Piscataway: IEEE Press, 2017: 2590-2600.
    [39] DAI Z, CHEN M, GU X, et al. Batch DropBlock network for person re-identification and beyond[C]//Proceedings of the IEEE International Conference on Computer Vision. Piscataway: IEEE Press, 2019: 3690-3700.
    [40] SUN Y, ZHENG L, DENG W, et al. SVDNet for pedestrian retrieval[C]//Proceedings of the IEEE International Conference on Computer Vision. Piscataway: IEEE Press, 2017: 3820-3828.
Publication history
  • Received:  2020-08-24
  • Accepted:  2020-09-04
  • Published:  2021-03-20
