Abstract: Cross-domain deployment is an important application scenario for person re-identification, but differences in appearance between source-domain and target-domain person images, such as illumination, camera viewpoint, background, and style, are a key cause of degraded generalization in person re-identification models. To address this problem, a cross-domain person re-identification method based on multi-label collaborative learning (MCL) is proposed. First, a semantic parsing model is used to construct a multi-label data representation based on semantic alignment. This representation guides the construction of local features that focus on the person foreground region, which achieves semantic alignment and reduces the influence of the background on cross-domain re-identification. Second, based on the global features of person images and the semantically aligned local features, a collaborative learning average model generates a multi-label representation for the re-identification model, which reduces the interference of noisy hard labels in the cross-domain setting. Finally, the multi-label semantic alignment model is trained jointly within a collaborative learning network framework to improve the discriminative ability of the re-identification model. Experimental results on the Market-1501→DukeMTMC-reID, DukeMTMC-reID→Market-1501, Market-1501→MSMT17, and DukeMTMC-reID→MSMT17 cross-domain person re-identification tasks show that, compared with the NRMT method, the mean average precision (mAP) improves by 8.3, 8.9, 7.6, and 7.9 percentage points, respectively. These results indicate a clear advantage for the multi-label collaborative learning method.
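The averaging and soft-label ideas mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the average model is an exponential moving average of network parameters (as in mean-teacher style training) and that its softmax output serves as a soft label. The function names and the smoothing factor `alpha` are hypothetical.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - lse for x in logits]


def ema_update(mean_params, params, alpha=0.999):
    """Assumed averaging rule: mean <- alpha * mean + (1 - alpha) * current."""
    return [alpha * m + (1.0 - alpha) * p for m, p in zip(mean_params, params)]


def soft_cross_entropy(student_logits, teacher_logits):
    """Cross-entropy of the student against the teacher's soft label."""
    teacher_probs = [math.exp(v) for v in log_softmax(teacher_logits)]
    student_log_probs = log_softmax(student_logits)
    return -sum(t * s for t, s in zip(teacher_probs, student_log_probs))


# Hypothetical usage: the averaged model drifts slowly toward the current
# parameters, and its prediction supervises the other network as a soft label.
mean_params = ema_update([1.0, 2.0], [3.0, 4.0], alpha=0.5)
loss = soft_cross_entropy([2.0, 0.0], [1.5, 0.5])
```

Unlike a hard pseudo-label, the soft target keeps some probability mass on other identities, so a single wrong cluster assignment contributes less to the loss.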
Table 1. Ablation study of the effectiveness of the semantic alignment module on the DukeMTMC-reID and Market-1501 datasets

| Method | Market-1501→DukeMTMC-reID | | | | DukeMTMC-reID→Market-1501 | | | |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| | R-1/% | R-5/% | R-10/% | mAP/% | R-1/% | R-5/% | R-10/% | mAP/% |
| GFM | 68.4 | 80.1 | 83.5 | 49.0 | 75.8 | 89.5 | 93.2 | 53.7 |
| HPM | 76.0 | 85.8 | **89.3** | 60.3 | 86.2 | 94.6 | 96.5 | 68.7 |
| SPM | **76.7** | **85.9** | 89.0 | **60.9** | **87.9** | **95.2** | **96.8** | **70.9** |

Note: Values in bold are the best in each column.

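Table 2. Ablation study of the effectiveness of multi-labels on the DukeMTMC-reID and Market-1501 datasets

| Method | Market-1501→DukeMTMC-reID | | | | DukeMTMC-reID→Market-1501 | | | |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| | R-1/% | R-5/% | R-10/% | mAP/% | R-1/% | R-5/% | R-10/% | mAP/% |
| GSL | 78.0 | 88.8 | 92.5 | 65.1 | 87.7 | 94.9 | 96.9 | 71.2 |
| FSL | 82.4 | **91.1** | **93.4** | 69.0 | 91.7 | 96.5 | 97.7 | 77.8 |
| MCL (GSL+FSL) | **82.5** | **91.1** | 93.2 | **70.5** | **93.2** | **97.1** | **98.1** | **80.6** |

Note: Values in bold are the best in each column.
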
Table 3. Comparison of different methods on the DukeMTMC-reID dataset (Market-1501→DukeMTMC-reID)

| Method | R-1/% | R-5/% | R-10/% | mAP/% |
| --- | --- | --- | --- | --- |
| ECN[28] | 63.3 | 75.8 | 80.4 | 40.4 |
| D-MMD[29] | 63.5 | 78.8 | 83.9 | 46.0 |
| AD-Cluster[30] | 72.6 | 82.5 | 85.5 | 54.1 |
| SSG[21] | 76.0 | 85.8 | 89.3 | 60.3 |
| DG-Net++[31] | 78.9 | 87.8 | 90.4 | 63.8 |
| JVTC+[32] | 80.4 | 89.9 | 92.2 | 66.5 |
| MPLP+MMCL[11] | 72.4 | 82.9 | 85.0 | 51.4 |
| NRMT[33] | 77.8 | 86.9 | 89.5 | 62.2 |
| MEB[34] | 79.6 | 88.3 | 92.2 | 66.1 |
| MCL (ours) | **82.5** | **91.1** | **93.2** | **70.5** |

Note: Values in bold are the best in each column.

Table 4. Comparison of different methods on the Market-1501 dataset (DukeMTMC-reID→Market-1501)

| Method | R-1/% | R-5/% | R-10/% | mAP/% |
| --- | --- | --- | --- | --- |
| ECN[28] | 75.1 | 87.6 | 91.6 | 43.0 |
| D-MMD[29] | 70.6 | 87.0 | 91.5 | 48.8 |
| AD-Cluster[30] | 86.7 | 94.4 | 96.5 | 68.3 |
| SSG[21] | 86.2 | 94.6 | 96.5 | 68.7 |
| DG-Net++[31] | 82.1 | 90.2 | 92.7 | 61.7 |
| JVTC+[32] | 86.8 | 95.2 | 97.1 | 67.2 |
| MPLP+MMCL[11] | 84.4 | 92.8 | 95.0 | 60.4 |
| NRMT[33] | 87.8 | 94.6 | 96.5 | 71.7 |
| MEB[34] | 89.9 | 96.0 | 97.5 | 76.0 |
| MCL (ours) | **93.2** | **97.1** | **98.1** | **80.6** |

Note: Values in bold are the best in each column.

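Table 5. Comparison of different methods on the MSMT17 dataset

| Method | Market-1501→MSMT17 | | | | DukeMTMC-reID→MSMT17 | | | |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| | R-1/% | R-5/% | R-10/% | mAP/% | R-1/% | R-5/% | R-10/% | mAP/% |
| ECN[28] | 25.3 | 36.3 | 42.1 | 8.5 | 30.2 | 41.5 | 46.8 | 10.2 |
| D-MMD[29] | 29.1 | 46.3 | 54.1 | 13.5 | 34.4 | 51.1 | 58.5 | 15.3 |
| MPLP+MMCL[11] | 40.8 | 51.8 | 56.7 | 15.1 | 43.6 | 54.3 | 58.9 | 16.2 |
| NRMT[33] | 43.7 | 56.5 | 62.2 | 19.8 | 45.2 | 57.8 | 63.3 | 20.6 |
| DG-Net++[31] | 48.4 | 60.9 | 66.1 | 22.1 | 48.8 | 60.9 | 65.9 | 22.1 |
| MCL (ours) | **57.3** | **68.5** | **73.3** | **27.4** | **58.5** | **70.0** | **74.5** | **28.5** |

Note: Values in bold are the best in each column.
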
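The mAP gains over NRMT quoted in the abstract can be reproduced from the NRMT and MCL rows of Tables 3 to 5. The short check below is only an arithmetic illustration; the dictionary layout and the function name `map_gains` are hypothetical, and the numbers are copied from the tables.

```python
# mAP (%) copied from Tables 3-5 as (NRMT, MCL) pairs per transfer task.
map_scores = {
    "Market-1501 -> DukeMTMC-reID": (62.2, 70.5),
    "DukeMTMC-reID -> Market-1501": (71.7, 80.6),
    "Market-1501 -> MSMT17": (19.8, 27.4),
    "DukeMTMC-reID -> MSMT17": (20.6, 28.5),
}


def map_gains(scores):
    """Absolute mAP gain of MCL over NRMT, in percentage points."""
    return {task: round(mcl - nrmt, 1) for task, (nrmt, mcl) in scores.items()}


gains = map_gains(map_scores)
for task, gain in gains.items():
    print(f"{task}: +{gain} percentage points")
```

The differences are absolute (percentage points of mAP), not relative improvements.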
[1] DENG W J, ZHENG L, YE Q X, et al. Image-image domain adaptation with preserved self-similarity and domain-dissimilarity for person re-identification[C]//Proceedings of IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2018: 994-1003.
[2] FAN H, ZHENG L, YANG Y. Unsupervised person re-identification: Clustering and fine-tuning[J]. ACM Transactions on Multimedia Computing Communications, 2018, 14(4): 1-18.
[3] SUN Y F, ZHENG L, YANG Y, et al. Beyond part models: Person retrieval with refined part pooling (and a strong convolutional baseline)[C]//Proceedings of European Conference on Computer Vision. Berlin: Springer, 2018: 480-496.
[4] ZHAO H Y, TIAN M Q, SUN S Y, et al. Spindle Net: Person re-identification with human body region guided feature decomposition and fusion[C]//Proceedings of IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2017: 1077-1108.
[5] KALAYEH M M, BASARAN E, GOKMEN M, et al. Human semantic parsing for person re-identification[C]//Proceedings of IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2018: 1062-1071.
[6] WANG J Y, ZHU X T, GONG S G, et al. Transferable joint attribute-identity deep learning for unsupervised person re-identification[C]//Proceedings of IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2018: 2275-2284.
[7] LV J M, CHEN W H, LI Q, et al. Unsupervised cross-dataset person re-identification by transfer learning of spatial-temporal patterns[C]//Proceedings of IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2018: 7948-7956.
[8] LI M X, ZHU X T, GONG S. Unsupervised person re-identification by deep learning tracklet association[C]//Proceedings of European Conference on Computer Vision. Berlin: Springer, 2018: 737-753.
[9] ZHONG Z, LIANG Z, ZHENG Z D, et al. Camera style adaptation for person re-identification[C]//Proceedings of IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2018: 5157-5516.
[10] LIN Y T, ZHENG L, ZHENG Z D, et al. Improving person re-identification by attribute and identity learning[J]. Pattern Recognition, 2019, 95: 151-161.
[11] WANG D K, ZHANG S L. Unsupervised person re-identification via multi-label classification[C]//Proceedings of IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2020: 19874684.
[12] ZHANG X, LUO H, FAN X, et al. AlignedReID: Surpassing human-level performance in person re-identification[EB/OL]. (2018-01-31)[2021-10-01]. https://arxiv.org/abs/1711.08184.
[13] ZHENG L, HUANG Y, LU H, et al. Pose-invariant embedding for deep person re-identification[J]. IEEE Transactions on Image Processing, 2019, 28(9): 4500-4509. doi: 10.1109/TIP.2019.2910414
[14] ZHU K, GUO H, LIU Z, et al. Identity-guided human semantic parsing for person re-identification[C]//Proceedings of European Conference on Computer Vision. Berlin: Springer, 2020: 346-363.
[15] GUO J Y, YUAN Y H, HUANG L, et al. Beyond human parts: Dual part-aligned representations for person re-identification[C]//Proceedings of IEEE International Conference on Computer Vision. Piscataway: IEEE Press, 2019: 19398436.
[16] ZENG K W, NIAN M N, WANG Y H, et al. Hierarchical clustering with hard-batch triplet loss for person re-identification[C]//Proceedings of IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2020: 13657-13665.
[17] GE Y, CHEN D, LI H. Mutual mean-teaching: Pseudo label refinery for unsupervised domain adaptation on person re-identification[C]//Proceedings of the International Conference on Learning Representations. Piscataway: IEEE Press, 2020.
[18] YU H X, ZHENG W S, WU A, et al. Unsupervised person re-identification by soft multilabel learning[C]//Proceedings of IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2019: 2148-2157.
[19] ZHONG Z, ZHENG L, KANG G L, et al. Random erasing data augmentation[EB/OL]. (2017-11-16)[2021-10-01]. https://arxiv.org/abs/1708.04896v1.
[20] HE K, ZHANG X Y, REN S Q, et al. Deep residual learning for image recognition[C]//Proceedings of IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2016: 770-778.
[21] FU Y, WEI Y C, WANG G S, et al. Self-similarity grouping: A simple unsupervised cross domain adaptation approach for person re-identification[C]//Proceedings of IEEE International Conference on Computer Vision. Piscataway: IEEE Press, 2019: 6112-6121.
[22] ZHAO H, SHI J, QI X, et al. Pyramid scene parsing network[C]//Proceedings of IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2016: 17355193.
[23] ZHENG L, SHEN L Y, LU T, et al. Scalable person re-identification: A benchmark[C]//Proceedings of IEEE International Conference on Computer Vision. Piscataway: IEEE Press, 2015: 1116-1124.
[24] ZHENG Z D, ZHENG L, YANG Y. Unlabeled samples generated by GAN improve the person re-identification baseline in vitro[C]//Proceedings of IEEE International Conference on Computer Vision. Piscataway: IEEE Press, 2017: 17453019.
[25] WEI L H, ZHANG S L, GAO W, et al. Person transfer GAN to bridge domain gap for person reidentification[C]//Proceedings of IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2018: 18347650.
[26] DENG J, DONG W, SOCHER R, et al. ImageNet: A large-scale hierarchical image database[C]//Proceedings of IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2009: 248-255.
[27] ZHONG Z, LIANG Z. Re-ranking person re-identification with k-reciprocal encoding[C]//Proceedings of IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2017: 1318-1327.
[28] ZHONG Z, ZHENG L, LUO Z, et al. Invariance matters: Exemplar memory for domain adaptive person re-identification[C]//Proceedings of IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2019: 598-607.
[29] MEKHAZNI D, BHUIYAN A, EKLADIOUS G, et al. Unsupervised domain adaptation in the dissimilarity space for person re-identification[C]//Proceedings of European Conference on Computer Vision. Berlin: Springer, 2020: 159-174.
[30] ZHAI Y P, LU S J, YE Q X, et al. AD-Cluster: Augmented discriminative clustering for domain adaptive person re-identification[C]//Proceedings of IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2020: 9021-9030.
[31] ZOU Y, YANG X D, YU Z D, et al. Joint disentangling and adaptation for cross-domain person re-identification[C]//Proceedings of European Conference on Computer Vision. Berlin: Springer, 2020: 87-104.
[32] LI J, ZHANG S L. Joint visual and temporal consistency for unsupervised domain adaptive person re-identification[C]//Proceedings of European Conference on Computer Vision. Berlin: Springer, 2020: 483-499.
[33] ZHAO F, LIAO S C, XIE G S, et al. Unsupervised domain adaptation with noise resistible mutual-training for person re-identification[C]//Proceedings of European Conference on Computer Vision. Berlin: Springer, 2020: 526-544.
[34] ZHAI Y P, YE Q X, LU S J, et al. Multiple expert brainstorming for domain adaptive person re-identification[C]//Proceedings of IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2020: 594-611.