Abstract: Person re-identification is an important part of the field of computer vision, but it is easily affected by the actual environment in which person images are captured, resulting in insufficient expression of person features and, in turn, low model accuracy. An improved person re-identification method based on an attention mechanism and conditional convolution (CondConv) is proposed so that person features are expressed more fully. The attention mechanism is introduced into the feature extraction network ResNet50 to weight and reinforce the key information in the spatial and channel dimensions of the input image while suppressing possible noise. The CondConv module is introduced into the backbone network to dynamically adjust the convolution kernel parameters, allowing the model to increase its capacity and performance while maintaining efficient inference. The improved method is evaluated on the mainstream Market1501, MSMT17 and DukeMTMC-ReID datasets: Rank-1 improves by 1.1%, 2.4% and 1.3%, and mAP improves by 0.5%, 2.3% and 1.3%, respectively. The results show that the improved method expresses person features better and improves recognition accuracy.
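As a rough illustration of the spatial and channel weighting described above, the following NumPy sketch mimics CBAM-style attention. It is a simplified stand-in, not the paper's implementation: the MLP weights `w1`/`w2` are random placeholders for learned parameters, and the 7×7 convolution of CBAM's spatial branch is replaced by a plain average of the two pooled maps to keep the sketch dependency-free.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def channel_attention(x, w1, w2):
    """Channel attention: a shared bottleneck MLP over the global
    average- and max-pooled descriptors, summed and squashed to
    per-channel weights in (0, 1). x has shape (C, H, W)."""
    avg = x.mean(axis=(1, 2))                     # (C,)
    mx = x.max(axis=(1, 2))                       # (C,)
    mlp = lambda v: w2 @ np.maximum(w1 @ v, 0.0)  # reduce -> ReLU -> expand
    return sigmoid(mlp(avg) + mlp(mx))            # (C,)

def spatial_attention(x):
    """Spatial attention: channel-wise average and max maps. CBAM fuses
    them with a learned 7x7 conv; here we simply average the two maps
    (an assumption made to keep the sketch self-contained)."""
    avg = x.mean(axis=0)                          # (H, W)
    mx = x.max(axis=0)                            # (H, W)
    return sigmoid(0.5 * (avg + mx))              # (H, W)

rng = np.random.default_rng(0)
C, H, W, r = 8, 4, 4, 2                           # r = channel reduction ratio
x = rng.standard_normal((C, H, W))
w1 = rng.standard_normal((C // r, C)) * 0.1       # placeholder reduction weights
w2 = rng.standard_normal((C, C // r)) * 0.1       # placeholder expansion weights

x = x * channel_attention(x, w1, w2)[:, None, None]  # reweight channels
x = x * spatial_attention(x)[None, :, :]             # reweight locations
print(x.shape)  # the feature-map shape is preserved: (8, 4, 4)
```

Because both attention maps lie in (0, 1), the module can only attenuate features, which is how noisy channels and background locations get suppressed while the shape of the feature map stays unchanged.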
-
Key words:
- attention mechanism
- CondConv
- ResNet50
- person re-identification
- deep learning
-
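Conceptually, CondConv replaces one static kernel with a per-example mixture of K "expert" kernels selected by a lightweight routing function, so capacity grows with K while inference still executes a single convolution. A minimal dependency-free sketch with 1×1 kernels (random weights stand in for learned parameters; this illustrates the mechanism, not the paper's exact module):

```python
import numpy as np

def condconv_1x1(x, experts, w_route):
    """CondConv with 1x1 kernels. A routing function computes sigmoid
    weights over the K expert kernels from a pooled descriptor of the
    input, the experts are blended into one kernel, and a single
    convolution is applied.
    x: (C_in, H, W); experts: (K, C_out, C_in); w_route: (K, C_in)."""
    gap = x.mean(axis=(1, 2))                        # global average pool, (C_in,)
    route = 1.0 / (1.0 + np.exp(-(w_route @ gap)))   # routing weights, (K,)
    kernel = np.tensordot(route, experts, axes=1)    # blended kernel, (C_out, C_in)
    return np.tensordot(kernel, x, axes=([1], [0]))  # 1x1 conv, (C_out, H, W)

rng = np.random.default_rng(0)
K, C_in, C_out, H, W = 4, 8, 16, 4, 4
x = rng.standard_normal((C_in, H, W))
experts = rng.standard_normal((K, C_out, C_in)) * 0.1  # placeholder expert kernels
w_route = rng.standard_normal((K, C_in)) * 0.1         # placeholder routing weights
y = condconv_1x1(x, experts, w_route)
print(y.shape)  # (16, 4, 4)
```

The key point is that the blending happens on the kernels, not on K separate convolution outputs, so the per-example cost stays close to that of an ordinary convolution.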
Table 1. Comparison of parameter count and FLOPs before and after introduction of CondConv
Module             Parameters   FLOPs
Basic Layer3       7098368      959119360
CondConv Layer3    3564050      506139136
Basic Layer4       14964736     529301504
CondConv Layer4    7891465      302813696
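The "Basic" rows above can be cross-checked against the standard ResNet50 architecture: counting the three bias-free convolutions of each bottleneck block plus the scale and shift parameters of every BatchNorm reproduces the reported figures exactly (a sanity check on the table, not part of the proposed method):

```python
def bottleneck_params(c_in, width, c_out, downsample):
    """Parameters of one ResNet bottleneck block: 1x1 -> 3x3 -> 1x1
    convolutions (bias-free in ResNet) each followed by a BatchNorm
    (2 parameters per channel), plus an optional 1x1 projection branch."""
    p = c_in * width + 2 * width            # 1x1 reduce conv + BN
    p += width * width * 9 + 2 * width      # 3x3 conv + BN
    p += width * c_out + 2 * c_out          # 1x1 expand conv + BN
    if downsample:
        p += c_in * c_out + 2 * c_out       # 1x1 projection + BN
    return p

# layer3: six blocks, 512 -> (256-wide bottleneck) -> 1024 channels
layer3 = bottleneck_params(512, 256, 1024, True) + 5 * bottleneck_params(1024, 256, 1024, False)
# layer4: three blocks, 1024 -> (512-wide bottleneck) -> 2048 channels
layer4 = bottleneck_params(1024, 512, 2048, True) + 2 * bottleneck_params(2048, 512, 2048, False)
print(layer3, layer4)  # -> 7098368 14964736, matching the Basic rows of Table 1
```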
Table 2. Introduction of datasets
Table 3. Performance comparison of the CTL model improved with the CBAM (spatial-channel) attention mechanism on different datasets (unit: %)
Table 4. Performance comparison of the CTL model improved with the CondConv module on different datasets (unit: %)
Table 5. Performance comparison of the CTL model improved with both the attention mechanism and CondConv on different datasets (unit: %)
Table 6. Performance comparison between the improved method and recent methods
Method              Rank-1 /%                                      mAP /%
                    Market1501[13]  MSMT17[14]  DukeMTMC-ReID[15]  Market1501[13]  MSMT17[14]  DukeMTMC-ReID[15]
CDNet[4]            95.1            78.9        88.6               86.0            54.7        76.8
OSNet[6]            94.8            79.1        88.7               86.7            55.1        76.6
PAT[16]             95.4            -           88.8               88.0            -           78.2
HLGAT[17]           97.5            -           92.7               93.4            -           87.3
CTL                 97.5            89.5        95.4               98.3            91.3        96.1
CTL+CBAM            98.3            91.0        96.5               98.6            92.9        97.0
CTL+CondConv        98.5            90.6        96.2               98.7            92.6        96.8
CTL+CBAM+CondConv   98.6            91.9        96.7               98.8            93.6        97.4
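Rank-1 and mAP in Tables 3-6 are the standard retrieval metrics: Rank-1 is the fraction of queries whose nearest gallery image shares the query identity, and mAP averages the precision measured at the rank of every correct match. A simplified sketch on toy data (the full Market1501 protocol additionally discards same-camera, same-identity gallery entries per query, which this sketch omits):

```python
import numpy as np

def rank1_and_map(dist, q_ids, g_ids):
    """Rank-1 accuracy and mean average precision for retrieval.
    dist: (num_queries, num_gallery) distance matrix;
    q_ids / g_ids: identity labels of queries and gallery images."""
    rank1_hits, aps = [], []
    g_ids = np.asarray(g_ids)
    for i, qid in enumerate(q_ids):
        order = np.argsort(dist[i])        # gallery sorted by distance
        matches = g_ids[order] == qid      # True where identity matches
        rank1_hits.append(matches[0])      # is the top-1 result correct?
        hit_positions = np.flatnonzero(matches)
        # precision at each true-match rank, averaged over the matches
        precisions = (np.arange(len(hit_positions)) + 1) / (hit_positions + 1)
        aps.append(precisions.mean())
    return float(np.mean(rank1_hits)), float(np.mean(aps))

# toy example: 2 queries against a 4-image gallery
dist = np.array([[0.1, 0.9, 0.4, 0.8],   # query of identity 0
                 [0.7, 0.2, 0.3, 0.6]])  # query of identity 1
rank1, mAP = rank1_and_map(dist, q_ids=[0, 1], g_ids=[0, 1, 0, 1])
print(rank1, mAP)  # Rank-1 = 1.0, mAP ~ 0.917
```

Note that mAP rewards ranking *all* correct matches highly, which is why a method can improve mAP and Rank-1 by different margins, as in the tables above.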
[1] MING Z Q, ZHU M K, WANG X K, et al. Deep learning-based person re-identification methods: A survey and outlook of recent works[J]. Image and Vision Computing, 2022, 119: 104394.
[2] SUN Y B, ZHANG W J, WANG R, et al. Pedestrian re-identification method based on channel attention mechanism[J]. Journal of Beijing University of Aeronautics and Astronautics, 2022, 48(5): 881-889 (in Chinese). doi: 10.13700/j.bh.1001-5965.2020.0684
[3] WIECZOREK M, RYCHALSKA B, DĄBROWSKI J. On the unreasonable effectiveness of centroids in image retrieval[C]//Neural Information Processing: 28th International Conference. Berlin: Springer, 2021: 212-223.
[4] LI H J, WU G J, ZHENG W S. Combined depth space based architecture search for person re-identification[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2021: 6729-6738.
[5] YAN C, PANG G S, BAI X, et al. Beyond triplet loss: Person re-identification with fine-grained difference-aware pairwise loss[J]. IEEE Transactions on Multimedia, 2021, 24: 1665-1677.
[6] ZHOU K Y, YANG Y X, CAVALLARO A, et al. Learning generalisable omni-scale representations for person re-identification[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2021, 44(9): 5056-5069.
[7] ZHANG X W, LYU M Q, LI H. Cross-domain person re-identification based on partial semantic feature invariance[J]. Journal of Beijing University of Aeronautics and Astronautics, 2020, 46(9): 1682-1690 (in Chinese).
[8] LIU J W, ZHA Z J, WU W, et al. Spatial-temporal correlation and topology learning for person re-identification in videos[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2021: 4370-4379.
[9] WOO S, PARK J, LEE J Y, et al. CBAM: Convolutional block attention module[C]//Proceedings of the European Conference on Computer Vision. Munich: ECCV, 2018: 3-19.
[10] YANG B, BENDER G, LE Q V, et al. CondConv: Conditionally parameterized convolutions for efficient inference[EB/OL]. (2019-01-10) [2022-02-19]. https://arxiv.org/abs/1904.04071.html.
[11] WIECZOREK M, MICHALOWSKI A, WROBLEWSKA A, et al. A strong baseline for fashion retrieval with person re-identification models[C]//International Conference on Neural Information Processing. Berlin: Springer, 2020: 294-301.
[12] SCHROFF F, KALENICHENKO D, PHILBIN J. FaceNet: A unified embedding for face recognition and clustering[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2015: 815-823.
[13] ZHENG L, SHEN L, TIAN L Y, et al. Scalable person re-identification: A benchmark[C]//Proceedings of the IEEE International Conference on Computer Vision. Piscataway: IEEE Press, 2015: 1116-1124.
[14] WEI L H, ZHANG S L, GAO W, et al. Person transfer GAN to bridge domain gap for person re-identification[C]//Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2018: 79-88.
[15] ZHENG Z D, ZHENG L, YANG Y. Unlabeled samples generated by GAN improve the person re-identification baseline in vitro[C]//Proceedings of the IEEE International Conference on Computer Vision. Piscataway: IEEE Press, 2017: 3754-3762.
[16] LI Y L, HE J F, ZHANG T Z, et al. Diverse part discovery: Occluded person re-identification with part-aware transformer[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2021: 2898-2907.
[17] ZHANG Z, ZHANG H J, LIU S. Person re-identification using heterogeneous local graph attention networks[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2021: 12136-12145.