-
Abstract:
The goal of the visible-infrared cross-modality person re-identification task is, given an image of a specific person captured in one modality, to retrieve the images of the same person from an image set captured by cameras of the other modality. Because the imaging mechanisms differ, there are obvious modality discrepancies between images of different modalities. To address this, the loss function is improved from the perspective of metric learning so as to obtain more discriminative information. The cohesiveness of image features is analyzed theoretically, and on this basis a re-identification method based on cohesiveness analysis and a cross-modality nearest-neighbor loss function is proposed to strengthen the cohesiveness of samples from different modalities. The similarity measurement of cross-modality hard samples is transformed into the similarity measurement of cross-modality nearest-neighbor sample pairs and intra-modality sample pairs, which makes the network's optimization of modality cohesiveness more efficient and stable. The proposed method is experimentally verified on a baseline network with global feature representation and a baseline network with part feature representation. The results show that, compared with the baseline methods, the proposed method improves the mean average precision of visible-infrared person re-identification by up to 8.44%, demonstrating its universality across different network architectures; moreover, reliable cross-modality person re-identification results are achieved with small model complexity and low computational cost.
-
Keywords:
- visible-infrared person re-identification
- metric learning
- deep learning
- cross-modality learning
- computer vision
Abstract: The goal of the visible-infrared person re-identification task is, given an image of a specific person captured in one modality, to retrieve the images of the same person from an image set captured by cameras of the other modality. Because of the different imaging mechanisms, there are obvious modality discrepancies between images of different modalities. Therefore, from the perspective of metric learning, the loss function is improved to obtain more discriminative information. The cohesiveness of image features is analyzed theoretically, and on this basis a re-identification method based on cohesiveness analysis and a cross-modality nearest-neighbor loss function is proposed to strengthen the cohesiveness of samples from different modalities. The similarity measurement of cross-modality hard samples is transformed into the similarity measurement of cross-modality nearest-neighbor sample pairs and intra-modality sample pairs, which makes the network's optimization of modality cohesiveness more efficient and stable. The proposed method is experimentally verified on baseline networks with global feature representation and with part feature representation. Compared with the baseline methods, it improves the mean average precision of visible-infrared person re-identification by up to 8.44%, which demonstrates its universality across different network architectures. Moreover, reliable visible-infrared person re-identification results are achieved with small model complexity and low computational cost.
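The core idea in the abstract — replacing the similarity measurement of the hardest cross-modality sample with that of the cross-modality nearest-neighbor pair — can be sketched as follows. This is a minimal numpy illustration of the general idea only, not the paper's exact formulation; the function name, the hinge-with-margin form, and the mining rule are assumptions for illustration.

```python
import numpy as np

def cross_modal_nn_loss(vis_feats, ir_feats, vis_labels, ir_labels, margin=0.3):
    """Illustrative cross-modality nearest-neighbor loss (hypothetical form).

    For each visible anchor, instead of mining the hardest same-identity
    infrared sample, take the *nearest* same-identity infrared neighbor and
    require it to be closer than the nearest different-identity infrared
    sample by at least `margin`.
    """
    # Pairwise Euclidean distances: rows = visible anchors, cols = infrared samples.
    d = np.linalg.norm(vis_feats[:, None, :] - ir_feats[None, :, :], axis=2)
    losses = []
    for i, y in enumerate(vis_labels):
        pos = d[i, ir_labels == y]   # same-identity cross-modality pairs
        neg = d[i, ir_labels != y]   # different-identity cross-modality pairs
        if pos.size == 0 or neg.size == 0:
            continue
        # Nearest-neighbor positive: a gentler target than the hardest positive,
        # which tends to make optimization more stable.
        losses.append(max(0.0, pos.min() - neg.min() + margin))
    return float(np.mean(losses)) if losses else 0.0
```

When the two modalities' features of the same identity already lie close together, the hinge is inactive and the loss is zero; when they are far apart, the nearest-neighbor positive is pulled toward the anchor.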
-
Table 1. Ablation experimental results of global feature representation under the visible-infrared mode on the RegDB dataset (unit: %)

| Method | Rank-1 | Rank-10 | Rank-20 | mAP | mINP |
| --- | --- | --- | --- | --- | --- |
| Baseline | 82.76 | 91.91 | 94.57 | 80.64 | 73.44 |
| CenL 1# | 83.40 | 92.47 | 95.53 | 81.35 | 73.95 |
| CenL 2# | 83.28 | 92.58 | 95.16 | 81.32 | 74.17 |
| Proposed method | 85.11 | 93.50 | 96.06 | 84.18 | 76.70 |

Table 2. Ablation experimental results of global feature representation under the infrared-visible mode on the RegDB dataset (unit: %)

| Method | Rank-1 | Rank-10 | Rank-20 | mAP | mINP |
| --- | --- | --- | --- | --- | --- |
| Baseline | 83.67 | 93.00 | 95.59 | 80.65 | 71.41 |
| CenL 1# | 82.51 | 92.75 | 95.67 | 80.78 | 72.62 |
| CenL 2# | 82.30 | 92.02 | 94.73 | 80.65 | 72.57 |
| Proposed method | 85.08 | 93.00 | 95.87 | 82.80 | 75.40 |

Table 3. Ablation experimental results of part feature representation under the visible-infrared mode on the RegDB dataset (unit: %)

| Method | Rank-1 | Rank-10 | Rank-20 | mAP | mINP |
| --- | --- | --- | --- | --- | --- |
| Baseline | 91.05 | 97.16 | 98.57 | 83.28 | 68.84 |
| Proposed method | 93.94 | 97.87 | 98.96 | 91.72 | 85.99 |

Table 4. Ablation experimental results of part feature representation under the infrared-visible mode on the RegDB dataset (unit: %)

| Method | Rank-1 | Rank-10 | Rank-20 | mAP | mINP |
| --- | --- | --- | --- | --- | --- |
| Baseline | 89.30 | 96.41 | 98.16 | 81.46 | 64.81 |
| Proposed method | 94.43 | 97.80 | 98.55 | 91.85 | 85.76 |

Table 5. Ablation experimental results of part feature representation on the SYSU-MM01 dataset (unit: %)

| Method | Rank-1 | Rank-10 | Rank-20 | mAP | mINP |
| --- | --- | --- | --- | --- | --- |
| Baseline | 58.18 | 90.49 | 95.34 | 55.25 | 39.54 |
| Proposed method | 59.97 | 92.96 | 96.96 | 57.71 | 43.60 |

Table 6. Comparison of the proposed method with state-of-the-art methods on the RegDB dataset under the visible-infrared and infrared-visible modes (unit: %)

Table 7. Comparison of the proposed method with state-of-the-art methods on the SYSU-MM01 dataset under the single-shot all-search setting (unit: %)

-