Abstract: Pedestrian re-identification is an important part of image retrieval. However, varied pedestrian poses and complex backgrounds make the extracted pedestrian features insufficiently robust and representative, which in turn lowers re-identification accuracy. Building on the AlignedReID++ algorithm, this paper proposes a pedestrian feature extraction method based on a spatial attention mechanism and applies it to pedestrian re-identification. First, in the feature extraction stage, a spatial attention mechanism is introduced to enhance feature expression while suppressing possible noise. Second, an instance normalization (IN) layer is introduced into the convolutional layers to assist the batch normalization (BN) layer in normalizing the features; this addresses the weakness of a BN-only design under changes in color tone and illumination and makes feature extraction more robust to such changes. Finally, the improved model is evaluated on three widely used pedestrian re-identification datasets: Market1501, DukeMTMC, and CUHK03. The experimental results show that, compared with the model before modification, the recognition accuracy of the improved model on the three datasets increases by 2%, 2.9%, and 5.1%, respectively, indicating clear gains in both accuracy and robustness.
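As a rough illustration of the IN-assisting-BN idea described in the abstract, the sketch below normalizes the first few channels per image (IN) and the remaining channels over the whole batch (BN). It is a plain-Python sketch under assumptions that are not from the paper: the function names `normalize` and `ibn_normalize`, the channel split, the toy input, and the omission of learnable scale/shift parameters and running statistics are all illustrative.

```python
import math

def normalize(values, eps=1e-5):
    """Zero-mean, unit-variance normalization of a flat list of activations."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return [(v - mean) / math.sqrt(var + eps) for v in values]

def ibn_normalize(batch, num_in_channels):
    """batch[n][c] is the flat list of spatial activations of channel c in image n.

    Channels below num_in_channels use instance statistics (per image);
    the remaining channels use batch statistics (all images pooled).
    """
    num_images = len(batch)
    num_channels = len(batch[0])
    out = [[None] * num_channels for _ in range(num_images)]
    for c in range(num_channels):
        if c < num_in_channels:
            # IN: statistics computed separately for every image
            for n in range(num_images):
                out[n][c] = normalize(batch[n][c])
        else:
            # BN (training mode): statistics shared across the batch
            size = len(batch[0][c])
            pooled = [v for n in range(num_images) for v in batch[n][c]]
            normed = normalize(pooled)
            for n in range(num_images):
                out[n][c] = normed[n * size:(n + 1) * size]
    return out

# Toy input (hypothetical): the second image is a brighter, higher-contrast copy.
image = [[1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 4.0]]  # 2 channels, 4 positions
brighter = [[2.0 * v + 10.0 for v in channel] for channel in image]
out = ibn_normalize([image, brighter], num_in_channels=1)
```

In this toy case the IN channel maps both images to nearly the same values, so the brightness and contrast change is removed, while the BN channel keeps the two images apart. This is the intuition behind using IN alongside BN for illumination robustness.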
Table 1. Evaluation of the model improved with the SGE module on the Market1501, DukeMTMC, and CUHK03 datasets (all values in %)

| Network structure | Market1501 Rank1/Rank1(RK) | Market1501 mAP/mAP(RK) | DukeMTMC Rank1/Rank1(RK) | DukeMTMC mAP/mAP(RK) | CUHK03 Rank1/Rank1(RK) | CUHK03 mAP/mAP(RK) |
|---|---|---|---|---|---|---|
| ResNet50 | 91.0/92.0 | 77.6/88.5 | 80.7/85.2 | 68.0/81.2 | 60.9/67.6 | 59.7/70.7 |
| SGE-ResNet50 | 90.5/92.4 | 76.8/88.8 | 81.5/85.7 | 68.2/82.7 | 61.9/71.1 | 59.9/73.5 |
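The SGE module evaluated in Table 1 is, judging by reference [11], the spatial group-wise enhance block. A minimal plain-Python sketch of its gating for a single feature group follows, assuming the formulation in that reference: each position is scored by its dot product with the group's global average, the scores are standardized over positions, and a sigmoid turns them into gates. The function name `sge_gate`, the default `gamma` and `beta` values, and the toy input are illustrative and not taken from the paper.

```python
import math

def sge_gate(group, gamma=1.0, beta=0.0, eps=1e-5):
    """Spatial gating for one feature group.

    group[i] is the channel vector at spatial position i.
    Returns (gates, enhanced), where enhanced[i] = gates[i] * group[i].
    """
    num_positions = len(group)
    num_channels = len(group[0])
    # Global average pooled descriptor of the group
    g = [sum(pos[c] for pos in group) / num_positions for c in range(num_channels)]
    # Similarity between each position and the global descriptor
    scores = [sum(gc * xc for gc, xc in zip(g, pos)) for pos in group]
    # Standardize the scores over spatial positions
    mean = sum(scores) / num_positions
    std = math.sqrt(sum((s - mean) ** 2 for s in scores) / num_positions)
    gates = []
    for s in scores:
        a = gamma * (s - mean) / (std + eps) + beta
        gates.append(1.0 / (1.0 + math.exp(-a)))  # sigmoid gate in (0, 1)
    enhanced = [[gate * xc for xc in pos] for gate, pos in zip(gates, group)]
    return gates, enhanced

# Toy group (hypothetical): position 2 disagrees with the dominant pattern.
group = [[1.0, 1.0], [1.0, 0.9], [-0.2, 0.1], [0.9, 1.1]]
gates, enhanced = sge_gate(group)
```

Positions that agree with the group's overall response receive gates above 0.5, and the outlier position is suppressed, which is one way to read the "enhance feature expression while suppressing possible noise" claim in the abstract.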
Table 2. Evaluation of the model improved with the IBN layer on the Market1501, DukeMTMC, and CUHK03 datasets (all values in %)

| Network structure | Market1501 Rank1/Rank1(RK) | Market1501 mAP/mAP(RK) | DukeMTMC Rank1/Rank1(RK) | DukeMTMC mAP/mAP(RK) | CUHK03 Rank1/Rank1(RK) | CUHK03 mAP/mAP(RK) |
|---|---|---|---|---|---|---|
| ResNet50 | 91.0/92.0 | 77.6/88.5 | 80.7/85.2 | 68.0/81.2 | 60.9/67.6 | 59.7/70.7 |
| IBN-ResNet50 | 91.2/92.5 | 79.5/89.7 | 83.6/87.1 | 70.4/83.9 | 65.1/73.1 | 62.7/75.2 |
Table 3. Evaluation of the model improved with both the IBN layer and the SGE module on the Market1501, DukeMTMC, and CUHK03 datasets (all values in %)

| Network structure | Market1501 Rank1/Rank1(RK) | Market1501 mAP/mAP(RK) | DukeMTMC Rank1/Rank1(RK) | DukeMTMC mAP/mAP(RK) | CUHK03 Rank1/Rank1(RK) | CUHK03 mAP/mAP(RK) |
|---|---|---|---|---|---|---|
| ResNet50 | 91.0/92.0 | 77.6/88.5 | 80.7/85.2 | 68.0/81.2 | 60.9/67.6 | 59.7/70.7 |
| SGE-IBN-ResNet50 | 91.8/92.9 | 80.3/90.5 | 83.6/87.4 | 71.0/84.1 | 65.6/73.0 | 62.9/75.8 |
Table 4. Comparison between the improved method and recent pedestrian re-identification methods (all values in %)

| Method | Market1501 Rank1 | Market1501 mAP | DukeMTMC Rank1 | DukeMTMC mAP | CUHK03 Rank1 | CUHK03 mAP |
|---|---|---|---|---|---|---|
| SVD[20] | 82.3 | 62.1 | 76.7 | 56.8 | 41.5 | 37.3 |
| PCE & ECN[21] | 87.0 | 69.0 | 79.8 | 62.0 | 30.2 | 27.3 |
| MLFN[22] | 90.0 | 74.3 | 81.0 | 62.8 | 52.8 | 47.8 |
| HA-CNN[23] | 91.2 | 75.7 | 80.5 | 63.8 | 41.7 | 38.3 |
| AlignedReID++ | 91.0 | 77.6 | 80.7 | 68.0 | 60.9 | 59.7 |
| AlignedReID++(RK) | 92.0 | 88.5 | 85.2 | 81.2 | 67.6 | 70.7 |
| AlignedReID++(SGE+IBN) | 91.8 | 80.3 | 83.6 | 71.0 | 65.6 | 62.9 |
| AlignedReID++(SGE+IBN)(RK) | 92.9 | 90.5 | 87.4 | 84.1 | 73.0 | 75.8 |
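The gains quoted in the abstract (2%, 2.9%, and 5.1%) can be checked against the tabulated results. The short calculation below subtracts the AlignedReID++(RK) row from the AlignedReID++(SGE+IBN)(RK) row of Table 4; the mAP differences come out to 2.0, 2.9, and 5.1 percentage points, which suggests (as an inference from the numbers, not a statement in the source) that the abstract refers to mAP with RK. The variable and function names are illustrative.

```python
# (Rank1, mAP) per dataset, copied from the two RK rows of Table 4
baseline_rk = {"Market1501": (92.0, 88.5), "DukeMTMC": (85.2, 81.2), "CUHK03": (67.6, 70.7)}
improved_rk = {"Market1501": (92.9, 90.5), "DukeMTMC": (87.4, 84.1), "CUHK03": (73.0, 75.8)}

def gains(before, after):
    """Per-dataset (Rank1, mAP) differences in percentage points, rounded to 0.1."""
    return {
        name: tuple(round(a - b, 1) for a, b in zip(after[name], before[name]))
        for name in before
    }

rk_gains = gains(baseline_rk, improved_rk)
for name, (rank1_gain, map_gain) in rk_gains.items():
    print(f"{name}: Rank1 +{rank1_gain}, mAP +{map_gain}")
```

Because the differences are between percentages, they are best read as percentage points rather than relative improvements.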
[1] LIAO S C, HU Y, ZHU X Y, et al. Person re-identification by local maximal occurrence representation and metric learning[C]//2015 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2015: 2197-2206.
[2] DE MAESSCHALCK R, JOUAN-RIMBAUD D, MASSART D L. The Mahalanobis distance[J]. Chemometrics and Intelligent Laboratory Systems, 2000, 50(1): 1-18. http://www.sciencedirect.com/science/article/pii/S0169743999000477
[3] YI D, LEI Z, LIAO S C, et al. Deep metric learning for person re-identification[C]//2014 22nd International Conference on Pattern Recognition. Piscataway: IEEE Press, 2014: 34-39.
[4] LI W, ZHAO R, XIAO T, et al. DeepReID: Deep filter pairing neural network for person re-identification[C]//2014 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2014: 152-159.
[5] VARIOR R R, SHUAI B, LU J, et al. A Siamese long short-term memory architecture for human re-identification[C]//European Conference on Computer Vision. Berlin: Springer, 2016: 135-153.
[6] ZHANG X, LUO H, FAN X, et al. AlignedReID: Surpassing human-level performance in person re-identification[EB/OL]. (2018-01-31)[2020-03-02]. https://arxiv.org/abs/1711.08184.
[7] DENG W, ZHENG L, YE Q, et al. Image-image domain adaptation with preserved self-similarity and domain-dissimilarity for person re-identification[C]//2018 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2018: 994-1003.
[8] LUO H, JIANG W, ZHANG X, et al. AlignedReID++: Dynamically matching local information for person re-identification[J]. Pattern Recognition, 2019, 94: 53-61. doi: 10.1016/j.patcog.2019.05.028
[9] HERMANS A, BEYER L, LEIBE B. In defense of the triplet loss for person re-identification[EB/OL]. (2017-11-21)[2020-03-02]. https://arxiv.org/abs/1703.07737.
[10] IOFFE S, SZEGEDY C. Batch normalization: Accelerating deep network training by reducing internal covariate shift[EB/OL]. (2015-03-02)[2020-03-02]. https://arxiv.org/abs/1502.03167.
[11] LI X, HU X, YANG J. Spatial group-wise enhance: Improving semantic feature learning in convolutional networks[EB/OL]. (2019-05-25)[2020-03-02]. https://arxiv.org/abs/1905.09646.
[12] PAN X, LUO P, SHI J, et al. Two at once: Enhancing learning and generalization capacities via IBN-Net[EB/OL]. (2018-07-27)[2020-03-02]. https://arxiv.org/abs/1807.09441.
[13] XIAO Q, LUO H, ZHANG C. Margin sample mining loss: A deep learning based method for person re-identification[EB/OL]. (2017-10-07)[2020-03-02]. https://arxiv.org/abs/1710.00478.
[14] ZHENG L, YANG Y, HAUPTMANN A G. Person re-identification: Past, present and future[EB/OL]. (2016-10-10)[2020-03-02]. https://arxiv.org/abs/1610.02984.
[15] LIU H, FENG J S, QI M B, et al. End-to-end comparative attention networks for person re-identification[J]. IEEE Transactions on Image Processing, 2017, 26(7): 3492-3506. doi: 10.1109/TIP.2017.2700762
[16] CHENG D, GONG Y H, ZHOU S P, et al. Person re-identification by multi-channel parts-based CNN with improved triplet loss function[C]//2016 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2016: 1335-1344.
[17] DU P, SONG Y H, ZHANG X Y. Self-attention cross-modality fusion network for cross-modality person re-identification[J/OL]. Acta Automatica Sinica: 1-12 (2019-10-16)[2020-01-06]. https://kns.cnki.net/kcms/detail/detail.aspx?doi=10.16383/j.aas.c190340 (in Chinese).
[18] ZHANG L H, SUN Z L. Person re-identification based on multi-layer deep feature fusion[J]. Journal of Test and Measurement Technology, 2018, 32(4): 48-52 (in Chinese). http://www.wanfangdata.com.cn/details/detail.do?_type=perio&id=hbgxycsjsxb201804008
[19] LI P, WANG D Y, SHI W X, et al. Research on person re-identification based on deep learning under big data environment[J]. Journal of Beijing University of Posts and Telecommunications, 2019, 42(6): 29-34 (in Chinese). http://www.wanfangdata.com.cn/details/detail.do?_type=perio&id=bjyddx201906004
[20] SUN Y, ZHENG L, DENG W, et al. SVDNet for pedestrian retrieval[C]//2017 IEEE International Conference on Computer Vision (ICCV). Piscataway: IEEE Press, 2017: 3820-3828.
[21] SARFRAZ M S, SCHUMANN A, EBERLE A, et al. A pose-sensitive embedding for person re-identification with expanded cross neighborhood re-ranking[EB/OL]. (2018-04-05)[2020-03-02]. https://arxiv.org/abs/1711.10378.
[22] AN L, QIN Z, CHEN X J, et al. Multi-level common space learning for person re-identification[J]. IEEE Transactions on Circuits & Systems for Video Technology, 2018, 28(8): 1777-1787. http://www.wanfangdata.com.cn/details/detail.do?_type=perio&id=3ccc619a6310b609e8f9972cc56a1f75
[23] LI W, ZHU X, GONG S. Harmonious attention network for person re-identification[C]//2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2018: 2285-2294.