Journal of Beijing University of Aeronautics and Astronautics, 2022, Vol. 48, Issue (4): 647-656. doi: 10.13700/j.bh.1001-5965.2020.0614


A cross-modal pedestrian Re-ID algorithm based on dual attribute information

CHEN Lin1, GAO Zan2, SONG Xuemeng1, WANG Yinglong2, NIE Liqiang1

  1. School of Computer Science and Technology, Shandong University, Qingdao 266200, China;
  2. Shandong Artificial Intelligence Institute, Qilu University of Technology (Shandong Academy of Sciences), Jinan 250014, China
  • Received: 2020-11-04 Published: 2022-04-27
  • Corresponding author: NIE Liqiang E-mail: nieliqiang@gmail.com
  • Funding: National Natural Science Foundation of China (61772310, 61702300, U1936203, 61872270); National Key R&D Program of China (2018AAA0102502); Natural Science Foundation of Shandong Province (ZR2019JQ23); Major Science and Technology Innovation Project of Shandong Province (2019JZZY010118); Youth Innovation Team Development Plan of Shandong Higher Education Institutions (2020KJN012); Jinan Higher Education Institutions Innovation Team Program (2018GXRC014)


Abstract: Studies of cross-modal retrieval show that attribute information can enhance the semantic expressiveness of extracted features, yet existing natural-language-based cross-modal pedestrian Re-ID algorithms do not exploit the attributes of pedestrian images and texts adequately. To tackle this issue, a novel cross-modal pedestrian Re-ID algorithm based on dual attribute information is proposed. Specifically, the attribute information of pedestrian images and of pedestrian text descriptions is explored fully and simultaneously, a dual attribute space built on text attributes and image attributes is constructed, and an end-to-end network based on the latent space and the attribute space improves the distinguishability and semantic expressiveness of the extracted image and text features. Extensive experiments on the public cross-modal pedestrian Re-ID dataset CUHK-PEDES demonstrate that the proposed algorithm reaches a Top-1 retrieval accuracy of 56.42%, comparable with the state-of-the-art CMAAM algorithm (Top-1 56.68%), while its Top-5 and Top-10 accuracies improve on CMAAM by 0.45% and 0.29%, respectively. Moreover, for application scenarios in which identity labels are available in the gallery image pool, using this class information to extract attribute features substantially improves cross-modal retrieval accuracy, raising Top-1 to 64.88%. An ablation study confirms the importance of the text and image attributes used by the proposed algorithm and the effectiveness of the dual attribute space.
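The dual-space idea described in the abstract, scoring an image-text pair by fusing a latent-space similarity with an attribute-space similarity, can be sketched minimally as below. The projection matrices `W_img`/`W_txt`, the fusion weight `alpha`, and the weighted-sum fusion are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def cosine(a, b):
    """Pairwise cosine similarity between rows of a and rows of b."""
    a = a / np.linalg.norm(a, axis=-1, keepdims=True)
    b = b / np.linalg.norm(b, axis=-1, keepdims=True)
    return a @ b.T

def dual_space_score(img_feat, txt_feat, img_attr, txt_attr,
                     W_img, W_txt, alpha=0.5):
    """Fuse similarities from a shared latent space and an attribute space.

    img_feat/txt_feat are modality-specific features projected into a common
    latent space by W_img/W_txt; img_attr/txt_attr are attribute vectors
    already living in a shared attribute space.
    """
    latent_sim = cosine(img_feat @ W_img, txt_feat @ W_txt)  # latent space
    attr_sim = cosine(img_attr, txt_attr)                    # attribute space
    return alpha * latent_sim + (1 - alpha) * attr_sim
```

In an end-to-end network the projections would be learned layers trained with matching losses in both spaces; here they are fixed matrices only to show how the two similarity terms combine.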

Key words: cross-modal retrieval, matching algorithm, pedestrian attribute information, feature representation, feature fusion
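The Top-1/Top-5/Top-10 accuracies reported in the abstract follow the standard retrieval protocol: a text query scores a hit at rank k if any of its k most similar gallery images shares the query's identity. A generic sketch of that protocol (function and variable names are ours, not the authors' evaluation code):

```python
import numpy as np

def topk_accuracy(query_feats, gallery_feats, query_ids, gallery_ids, ks=(1, 5, 10)):
    """Fraction of queries whose identity appears among the k nearest gallery items."""
    # L2-normalise so the dot product equals cosine similarity.
    q = query_feats / np.linalg.norm(query_feats, axis=1, keepdims=True)
    g = gallery_feats / np.linalg.norm(gallery_feats, axis=1, keepdims=True)
    sim = q @ g.T                                  # (num_queries, num_gallery)
    order = np.argsort(-sim, axis=1)               # gallery indices, best first
    hits = gallery_ids[order] == query_ids[:, None]
    return {k: float(hits[:, :k].any(axis=1).mean()) for k in ks}

# Toy check: one text query whose identity matches the nearest gallery image.
q = np.array([[1.0, 0.0]])
g = np.array([[0.9, 0.1], [0.0, 1.0]])
print(topk_accuracy(q, g, np.array([3]), np.array([3, 4]), ks=(1,)))  # {1: 1.0}
```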



Copyright © Editorial Office of Journal of Beijing University of Aeronautics and Astronautics
Address: Editorial Office, Journal of Beijing University of Aeronautics and Astronautics, 37 Xueyuan Road, Haidian District, Beijing 100191, China E-mail: jbuaa@buaa.edu.cn