Pedestrian re-identification method based on spatial attention mechanism

ZHANG Zihao^{1, 2},
ZHOU Qianli^{1, 2, 3},
WANG Rong^{1, 2}

doi: 10.13700/j.bh.1001-5965.2020.0075

1. School of Police Information Engineering and Network Security, People's Public Security University of China, Beijing 100038, China
2. Key Laboratory of Security Technology & Risk Assessment, Ministry of Public Security, Beijing 100038, China
3. Beijing Public Security Bureau, Beijing 100740, China

Funds:

National Key R&D Program of China (A19808)

Fundamental Research Funds for the Central Universities (2019JKF111)

Author biographies:
ZHANG Zihao, male, master's student. Main research interests: pattern recognition, artificial intelligence.

ZHOU Qianli, male, doctoral student. Main research interests: pattern recognition, artificial intelligence.

WANG Rong, female, Ph.D., professor, doctoral supervisor. Main research interests: pattern recognition, artificial intelligence.

Corresponding author: WANG Rong, E-mail: dbdxwangrong@163.com

CLC number: O235; TP183

Publication history:
- Received: 2020-03-03
- Accepted: 2020-03-27
- Published online: 2020-09-20

Abstract:
Pedestrian re-identification is an important part of image retrieval. However, varied pedestrian poses and complex backgrounds make the extracted pedestrian features insufficiently robust and representative, which in turn lowers re-identification accuracy. Building on the AlignedReID++ algorithm, this paper proposes a pedestrian feature extraction method based on a spatial attention mechanism and applies it to pedestrian re-identification. First, in the feature extraction stage, a spatial attention mechanism is introduced to enhance feature expression while suppressing possible noise. Second, an Instance Normalization (IN) layer is introduced in the convolution layers to assist the Batch Normalization (BN) layer in normalizing the features, which addresses the limited ability of a BN-only layer to cope with hue and illumination changes and makes feature extraction more robust to changes in brightness and hue. Finally, the improved model is evaluated on three widely used pedestrian re-identification datasets: Market1501, DukeMTMC, and CUHK03. The experimental results show that the recognition accuracy of the improved model on the three datasets increases by 2%, 2.9%, and 5.1%, respectively, compared with the model before modification, which indicates that the proposed method achieves higher accuracy and robustness.

Keywords:
- deep learning
- spatial attention mechanism
- pedestrian features
- feature enhancement
- convolutional neural network

Figure 1. AlignedReID++ algorithm flowchart

Figure 2. Improved AlignedReID++ algorithm flowchart

Figure 3. SGE-ResNet module structure
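
The figure itself is not reproduced here. As a rough illustration of the kind of spatial attention the SGE module name refers to (spatial group-wise enhance, reference [11]), the following NumPy sketch gates each channel group with a spatial mask derived from the similarity between local features and the group's global average. The function name `spatial_group_enhance`, the group count, and the explicit `gamma`/`beta` arguments are assumptions made for this example; how the module is wired into ResNet50 in the paper is not shown.

```python
import numpy as np


def spatial_group_enhance(x, groups, gamma, beta, eps=1e-5):
    """Sketch of group-wise spatial attention on a feature map x of shape (N, C, H, W).

    gamma and beta are per-group scale/shift values (learned in a real network;
    passed in explicitly here for illustration).
    """
    n, c, h, w = x.shape
    if c % groups != 0:
        raise ValueError("channel count must be divisible by groups")
    gamma = np.asarray(gamma, dtype=float)
    beta = np.asarray(beta, dtype=float)

    # Split channels into groups and treat every (sample, group) pair separately.
    xg = x.reshape(n * groups, c // groups, h, w)
    # Global descriptor of each group: spatial average of its channels.
    g = xg.mean(axis=(2, 3), keepdims=True)
    # Similarity between the descriptor and the local feature at each position.
    sim = (xg * g).sum(axis=1).reshape(n * groups, h * w)
    # Normalise the similarity map over spatial positions.
    sim = (sim - sim.mean(axis=1, keepdims=True)) / (sim.std(axis=1, keepdims=True) + eps)
    # Per-group scale and shift, then a sigmoid gate in (0, 1).
    sim = sim.reshape(n, groups, h * w) * gamma[None, :, None] + beta[None, :, None]
    gate = 1.0 / (1.0 + np.exp(-sim))
    gate = gate.reshape(n * groups, 1, h, w)
    # Re-weight every channel of the group with its spatial gate.
    return (xg * gate).reshape(n, c, h, w)


rng = np.random.default_rng(0)
features = rng.standard_normal((2, 8, 4, 3))  # toy feature map, not real data
enhanced = spatial_group_enhance(features, groups=4, gamma=np.ones(4), beta=np.zeros(4))
print(enhanced.shape)
```

Because the gate lies strictly between 0 and 1, positions that agree with the group's global descriptor are attenuated less than positions that disagree, which is one way to read "enhancing feature expression while suppressing possible noise".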

Figure 4. IBN module structure
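
The abstract describes IN assisting BN in normalizing features. A minimal NumPy sketch of that idea, assuming the split used in IBN-Net-style layers (reference [12]) where part of the channels go through IN and the rest through BN, is given below. The 50/50 split (`in_ratio`), the function names, and the omission of learned affine parameters and running statistics are simplifications chosen for this example, not details taken from the paper.

```python
import numpy as np


def instance_norm(x, eps=1e-5):
    # Statistics per sample and per channel, computed over H and W only.
    mean = x.mean(axis=(2, 3), keepdims=True)
    var = x.var(axis=(2, 3), keepdims=True)
    return (x - mean) / np.sqrt(var + eps)


def batch_norm(x, eps=1e-5):
    # Statistics per channel, shared by the whole batch (training-mode behaviour).
    mean = x.mean(axis=(0, 2, 3), keepdims=True)
    var = x.var(axis=(0, 2, 3), keepdims=True)
    return (x - mean) / np.sqrt(var + eps)


def ibn(x, in_ratio=0.5):
    """Apply IN to the first in_ratio share of channels and BN to the rest."""
    k = int(x.shape[1] * in_ratio)
    return np.concatenate([instance_norm(x[:, :k]), batch_norm(x[:, k:])], axis=1)


rng = np.random.default_rng(0)
batch = rng.standard_normal((4, 8, 6, 4))   # toy (N, C, H, W) features
brighter = batch.copy()
brighter[0] = 1.5 * batch[0] + 0.3          # simulate a contrast/brightness change on image 0

k = 4  # number of IN channels for in_ratio=0.5 and C=8
out, out_brighter = ibn(batch), ibn(brighter)
in_half_unchanged = np.allclose(out[0, :k], out_brighter[0, :k], atol=1e-3)
bn_half_unchanged = np.allclose(out[0, k:], out_brighter[0, k:], atol=1e-3)
print(in_half_unchanged, bn_half_unchanged)
```

In this toy setup the IN channels of the altered image are essentially unaffected by the per-image scale and shift, while its BN channels change, which illustrates why adding IN can help with illumination and hue variation.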

Figure 5. CMC curves of four networks on the Market1501, DukeMTMC and CUHK03 datasets
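
For readers unfamiliar with the metrics behind the CMC curves and the Rank1/mAP columns in the tables, the sketch below computes them from ranked match lists in plain Python. The function names and the toy queries are hypothetical; benchmark protocols usually also filter the gallery before ranking (for example, removing images from the same camera as the query), which is omitted here.

```python
def cmc_and_ap(ranked_matches):
    """ranked_matches: gallery sorted by decreasing similarity to one query,
    True where the gallery image has the same identity as the query."""
    n_pos = sum(ranked_matches)
    if n_pos == 0:
        raise ValueError("query has no matching gallery image")
    first_hit = ranked_matches.index(True)
    # CMC for a single query: 1 from the rank of the first correct match onwards.
    cmc = [1.0 if k >= first_hit else 0.0 for k in range(len(ranked_matches))]
    # Average precision: mean of the precision values at each correct match.
    hits, precisions = 0, []
    for rank, is_match in enumerate(ranked_matches, start=1):
        if is_match:
            hits += 1
            precisions.append(hits / rank)
    return cmc, sum(precisions) / n_pos


def evaluate(all_ranked_matches):
    """Average the per-query CMC curves and APs over all queries."""
    results = [cmc_and_ap(r) for r in all_ranked_matches]
    n = len(results)
    depth = min(len(cmc) for cmc, _ in results)
    cmc_curve = [sum(cmc[k] for cmc, _ in results) / n for k in range(depth)]
    mean_ap = sum(ap for _, ap in results) / n
    return cmc_curve, mean_ap


# Two toy queries against a gallery of three images each.
toy_queries = [[True, False, True], [False, True, False]]
cmc_curve, mean_ap = evaluate(toy_queries)
print("Rank-1:", cmc_curve[0], "mAP:", round(mean_ap, 4))
```

Rank1 is the first point of the CMC curve, and mAP averages the per-query average precision, so a method can improve one without improving the other.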

Figure 6. Ten most similar pedestrian images returned by different networks

Table 1. Evaluation of the model improved with the SGE module on the Market1501, DukeMTMC and CUHK03 datasets (%)

Network	Market1501 Rank1/Rank1(RK)	Market1501 mAP/mAP(RK)	DukeMTMC Rank1/Rank1(RK)	DukeMTMC mAP/mAP(RK)	CUHK03 Rank1/Rank1(RK)	CUHK03 mAP/mAP(RK)
ResNet50	91.0/92.0	77.6/88.5	80.7/85.2	68.0/81.2	60.9/67.6	59.7/70.7
SGE-ResNet50	90.5/92.4	76.8/88.8	81.5/85.7	68.2/82.7	61.9/71.1	59.9/73.5

Table 2. Evaluation of the model improved with the IBN layer on the Market1501, DukeMTMC and CUHK03 datasets (%)

Network	Market1501 Rank1/Rank1(RK)	Market1501 mAP/mAP(RK)	DukeMTMC Rank1/Rank1(RK)	DukeMTMC mAP/mAP(RK)	CUHK03 Rank1/Rank1(RK)	CUHK03 mAP/mAP(RK)
ResNet50	91.0/92.0	77.6/88.5	80.7/85.2	68.0/81.2	60.9/67.6	59.7/70.7
IBN-ResNet50	91.2/92.5	79.5/89.7	83.6/87.1	70.4/83.9	65.1/73.1	62.7/75.2

Table 3. Evaluation of the model improved with both the IBN layer and the SGE module on the Market1501, DukeMTMC and CUHK03 datasets (%)

Network	Market1501 Rank1/Rank1(RK)	Market1501 mAP/mAP(RK)	DukeMTMC Rank1/Rank1(RK)	DukeMTMC mAP/mAP(RK)	CUHK03 Rank1/Rank1(RK)	CUHK03 mAP/mAP(RK)
ResNet50	91.0/92.0	77.6/88.5	80.7/85.2	68.0/81.2	60.9/67.6	59.7/70.7
SGE-IBN-ResNet50	91.8/92.9	80.3/90.5	83.6/87.4	71.0/84.1	65.6/73.0	62.9/75.8
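
To make the size of the improvement in Table 3 explicit, the short script below subtracts the ResNet50 baseline from the SGE-IBN-ResNet50 row for every dataset and metric. The numbers are copied from the table; the dictionary layout and the function name `gains` are only for illustration. The differences are percentage points. The 2%, 2.9% and 5.1% quoted in the abstract appear to coincide with Market1501 mAP(RK), DukeMTMC Rank1 and CUHK03 mAP(RK), respectively, although the page does not state which metric each figure refers to. "RK" is assumed here to mark the results obtained with re-ranking.

```python
METRICS = ("Rank1", "Rank1(RK)", "mAP", "mAP(RK)")

# Values copied from Table 3 (%), in the order given by METRICS.
baseline = {
    "Market1501": (91.0, 92.0, 77.6, 88.5),
    "DukeMTMC": (80.7, 85.2, 68.0, 81.2),
    "CUHK03": (60.9, 67.6, 59.7, 70.7),
}
improved = {
    "Market1501": (91.8, 92.9, 80.3, 90.5),
    "DukeMTMC": (83.6, 87.4, 71.0, 84.1),
    "CUHK03": (65.6, 73.0, 62.9, 75.8),
}


def gains(before, after):
    """Per-dataset, per-metric difference in percentage points, rounded to 0.1."""
    return {
        dataset: {
            metric: round(a - b, 1)
            for metric, b, a in zip(METRICS, before[dataset], after[dataset])
        }
        for dataset in before
    }


delta = gains(baseline, improved)
for dataset, row in delta.items():
    print(dataset, row)
```

Every metric improves on every dataset, with the largest gains on CUHK03.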

Table 4. Comparison between the improved method and recent pedestrian re-identification methods (%)

Method	Market1501 Rank1	Market1501 mAP	DukeMTMC Rank1	DukeMTMC mAP	CUHK03 Rank1	CUHK03 mAP
SVD [20]	82.3	62.1	76.7	56.8	41.5	37.3
PSE & ECN [21]	87.0	69.0	79.8	62.0	30.2	27.3
MLFN [22]	90.0	74.3	81.0	62.8	52.8	47.8
HA-CNN [23]	91.2	75.7	80.5	63.8	41.7	38.3
AlignedReID++	91.0	77.6	80.7	68.0	60.9	59.7
AlignedReID++(RK)	92.0	88.5	85.2	81.2	67.6	70.7
AlignedReID++(SGE+IBN)	91.8	80.3	83.6	71.0	65.6	62.9
AlignedReID++(SGE+IBN)(RK)	92.9	90.5	87.4	84.1	73.0	75.8

References (23)

[1]	LIAO S C, HU Y, ZHU X Y, et al. Person re-identification by local maximal occurrence representation and metric learning[C]//2015 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2015: 2197-2206.
[2]	DE MAESSCHALCK R, JOUAN-RIMBAUD D, MASSART D L. The Mahalanobis distance[J]. Chemometrics and Intelligent Laboratory Systems, 2000, 50(1): 1-18. http://www.sciencedirect.com/science/article/pii/S0169743999000477
[3]	YI D, LEI Z, LIAO S C, et al. Deep metric learning for person re-identification[C]//2014 22nd International Conference on Pattern Recognition. Piscataway: IEEE Press, 2014: 34-39.
[4]	LI W, ZHAO R, XIAO T, et al. DeepReID: Deep filter pairing neural network for person re-identification[C]//2014 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2014: 152-159.
[5]	VARIOR R R, SHUAI B, LU J, et al. A Siamese long short-term memory architecture for human re-identification[C]//European Conference on Computer Vision. Berlin: Springer, 2016: 135-153.
[6]	ZHANG X, LUO H, FAN X, et al. AlignedReID: Surpassing human-level performance in person re-identification[EB/OL]. (2018-01-31)[2020-03-02]. https://arxiv.org/abs/1711.08184.
[7]	DENG W, ZHENG L, YE Q, et al. Image-image domain adaptation with preserved self-similarity and domain-dissimilarity for person re-identification[C]//2018 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2018: 994-1003.
[8]	LUO H, JIANG W, ZHANG X, et al. AlignedReID++: Dynamically matching local information for person re-identification[J]. Pattern Recognition, 2019, 94: 53-61. doi: 10.1016/j.patcog.2019.05.028
[9]	HERMANS A, BEYER L, LEIBE B. In defense of the triplet loss for person re-identification[EB/OL]. (2017-11-21)[2020-03-02]. https://arxiv.org/abs/1703.07737.
[10]	IOFFE S, SZEGEDY C. Batch normalization: Accelerating deep network training by reducing internal covariate shift[EB/OL]. (2015-03-02)[2020-03-02]. https://arxiv.org/abs/1502.03167.
[11]	LI X, HU X, YANG J. Spatial group-wise enhance: Improving semantic feature learning in convolutional networks[EB/OL]. (2019-05-25)[2020-03-02]. https://arxiv.org/abs/1905.09646.
[12]	PAN X, LUO P, SHI J, et al. Two at once: Enhancing learning and generalization capacities via IBN-Net[EB/OL]. (2018-07-27)[2020-03-02]. https://arxiv.org/abs/1807.09441.
[13]	XIAO Q, LUO H, ZHANG C. Margin sample mining loss: A deep learning based method for person re-identification[EB/OL]. (2017-10-07)[2020-03-02]. https://arxiv.org/abs/1710.00478.
[14]	ZHENG L, YANG Y, HAUPTMANN A G. Person re-identification: Past, present and future[EB/OL]. (2016-10-10)[2020-03-02]. https://arxiv.org/abs/1610.02984.
[15]	LIU H, FENG J S, QI M B, et al. End-to-end comparative attention networks for person re-identification[J]. IEEE Transactions on Image Processing, 2017, 26(7): 3492-3506. doi: 10.1109/TIP.2017.2700762
[16]	CHENG D, GONG Y H, ZHOU S P, et al. Person re-identification by multi-channel parts-based CNN with improved triplet loss function[C]//2016 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2016: 1335-1344.
[17]	DU P, SONG Y H, ZHANG X Y. Self-attention cross-modality fusion network for cross-modality person re-identification[J/OL]. Acta Automatica Sinica: 1-12(2019-10-16)[2020-01-06]. https://kns.cnki.net/kcms/detail/detail.aspx?doi=10.16383/j.aas.c190340 (in Chinese).
[18]	ZHANG L H, SUN Z L. Person re-identification based on multi-layer deep feature fusion[J]. Journal of Test and Measurement Technology, 2018, 32(4): 48-52 (in Chinese). http://www.wanfangdata.com.cn/details/detail.do?_type=perio&id=hbgxycsjsxb201804008
[19]	LI P, WANG D Y, SHI W X, et al. Research on person re-identification based on deep learning under big data environment[J]. Journal of Beijing University of Posts and Telecommunications, 2019, 42(6): 29-34 (in Chinese). http://www.wanfangdata.com.cn/details/detail.do?_type=perio&id=bjyddx201906004
[20]	SUN Y, ZHENG L, DENG W, et al. SVDNet for pedestrian retrieval[C]//2017 IEEE International Conference on Computer Vision (ICCV). Piscataway: IEEE Press, 2017: 3820-3828.
[21]	SARFRAZ M S, SCHUMANN A, EBERLE A, et al. A pose-sensitive embedding for person re-identification with expanded cross neighborhood re-ranking[EB/OL]. (2018-04-05)[2020-03-02]. https://arxiv.org/abs/1711.10378.
[22]	AN L, QIN Z, CHEN X J, et al. Multi-level common space learning for person re-identification[J]. IEEE Transactions on Circuits & Systems for Video Technology, 2018, 28(8): 1777-1787. http://www.wanfangdata.com.cn/details/detail.do?_type=perio&id=3ccc619a6310b609e8f9972cc56a1f75
[23]	LI W, ZHU X, GONG S. Harmonious attention network for person re-identification[C]//2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2018: 2285-2294.