Journal of Beijing University of Aeronautics and Astronautics ›› 2019, Vol. 45 ›› Issue (10): 2089-2098. doi: 10.13700/j.bh.1001-5965.2018.0564

• Article

A single-stage detector based on asynchronous convolution factorization and a shunt structure

ZHAO Penghui1, MENG Chunning2, CHANG Shengjiang1

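  1. Institute of Modern Optics, College of Electronic Information and Optical Engineering, Nankai University, Tianjin 300350, China;
  2. Department of Electronic Technology, China Coast Guard Academy, Ningbo 315801, China
  • Received: 2018-09-27 Online: 2019-10-20 Published: 2019-10-31
  • Corresponding author: MENG Chunning E-mail: mengchunning123@163.com
  • About the authors: ZHAO Penghui, male, master's student. Main research interests: video object detection, deep learning; MENG Chunning, male, Ph.D., associate professor. Main research interests: image processing, artificial intelligence, information security; CHANG Shengjiang, male, Ph.D., professor, doctoral supervisor. Main research interests: digital image processing, terahertz functional devices, deep learning.
  • Supported by:
    Technology Research Project of Public Security Ministry (2017JSYJC10)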

Single shot multibox detector based on asynchronous convolution factorization and shunt structure

ZHAO Penghui1, MENG Chunning2, CHANG Shengjiang1   

  1. Institute of Modern Optics, College of Electronic Information and Optical Engineering, Nankai University, Tianjin 300350, China;
  2. Department of Electronic Technology, China Coast Guard Academy, Ningbo 315801, China
  • Received: 2018-09-27 Online: 2019-10-20 Published: 2019-10-31
  • Supported by:
    Technology Research Project of Public Security Ministry (2017JSYJC10)

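Abstract: In the object detection network SSD, the regression computations on the multi-layer regression feature maps are relatively independent of one another, and the family of algorithms that improve on SSD has difficulty raising detection accuracy while keeping real-time speed. To address these problems, a single-stage object detector based on asynchronous convolution factorization and a shunt structure is proposed. A shunt structure is designed on the basis of the asynchronous convolution factorization algorithm; it connects the multi-layer feature maps in a staggered manner, which strengthens the unity and coordination among the regression computations. The original high-level mainstream structure is also optimized: the mainstream structure and the shunt structure shrink the feature maps in two different ways, max pooling and asynchronous convolution factorization respectively, which preserves spatially correlated information while increasing feature diversity. Experimental results show that, when trained on images from VOC2007trainval and VOC2012trainval uniformly resized to 300 pixels × 300 pixels, the proposed detector reaches a mean average precision of 80.5% on VOC2007test at a detection speed of more than 30 frames/s.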

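Keywords: object detection, convolutional neural network, asynchronous convolution factorization, shunt structure, structure optimization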
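The mean average precision quoted in the abstract can be illustrated with a small sketch. It assumes the 11-point interpolated average precision defined for PASCAL VOC 2007; the abstract does not say which AP variant the authors used, and the function names and the toy precision/recall values are hypothetical.

```python
def average_precision_11pt(recalls, precisions):
    """11-point interpolated AP (assumed VOC2007-style metric).

    For each recall threshold t in {0.0, 0.1, ..., 1.0}, take the highest
    precision observed at any recall >= t (0 if none), then average.
    """
    total = 0.0
    for i in range(11):
        t = i / 10
        candidates = [p for r, p in zip(recalls, precisions) if r >= t]
        total += max(candidates) if candidates else 0.0
    return total / 11


def mean_average_precision(per_class_curves):
    """Average the per-class AP values; input maps class -> (recalls, precisions)."""
    aps = [average_precision_11pt(r, p) for r, p in per_class_curves.values()]
    return sum(aps) / len(aps)


if __name__ == "__main__":
    # Toy curves for two hypothetical classes, not data from the paper.
    curves = {
        "class_a": ([1.0], [1.0]),
        "class_b": ([0.5, 1.0], [1.0, 0.5]),
    }
    print(round(mean_average_precision(curves), 4))
```

In a real evaluation the recall and precision lists would come from ranking all detections of a class by confidence and matching them to ground-truth boxes; that step is omitted here.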

Abstract: In the single shot multibox detector (SSD), the regression computations on the multi-layer regression feature maps are relatively independent of one another, and detectors derived from SSD struggle to balance detection accuracy against real-time speed. To address these problems, a single shot multibox detector based on asynchronous convolution factorization and a shunt structure (FA-SSD) is proposed. The shunt structure, built on the proposed asynchronous convolution factorization algorithm, connects the regression feature layers in a staggered manner, enhancing the unity and coordination among the regression computations. The high-level mainstream structure is also optimized: max pooling and asynchronous convolution factorization are used to reduce the feature-map size in the mainstream and the shunt, respectively, which preserves spatial information while improving the diversity of features. Experimental results on VOC2007test show that FA-SSD, trained on VOC2007trainval and VOC2012trainval at a nominal input resolution of 300×300, achieves a mean average precision of 80.5% at a detection speed of more than 30 frames per second.

Key words: object detection, convolutional neural networks, asynchronous convolution factorization, shunt structure, structure optimization
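The abstract does not define asynchronous convolution factorization, so the following is only a rough shape-arithmetic sketch under one assumed reading: a strided 3×3 convolution is split into a 1×3 convolution that strides along the width and a 3×1 convolution that strides along the height. The kernel sizes, strides, padding, the 2×2 max pooling, and the 38×38 input are all assumptions chosen for illustration, not details from the paper. The sketch checks that both downsampling paths yield feature maps of the same size, which is what would let a mainstream path and a shunt path feed the same regression layer.

```python
def out_size(n, kernel, stride, padding=0):
    """Output length along one axis: floor((n + 2*padding - kernel) / stride) + 1."""
    return (n + 2 * padding - kernel) // stride + 1


def max_pool_path(h, w):
    # Assumed mainstream downsampling: 2x2 max pooling with stride 2.
    return out_size(h, 2, 2), out_size(w, 2, 2)


def factorized_conv_path(h, w):
    # Assumed step 1: 1x3 convolution, stride 2 along the width only.
    h, w = out_size(h, 1, 1), out_size(w, 3, 2, padding=1)
    # Assumed step 2: 3x1 convolution, stride 2 along the height only.
    h, w = out_size(h, 3, 2, padding=1), out_size(w, 1, 1)
    return h, w


if __name__ == "__main__":
    print(max_pool_path(38, 38))         # (19, 19)
    print(factorized_conv_path(38, 38))  # (19, 19)
```

With odd input sizes the two paths can differ by one pixel unless the pooling rounds up, so a real implementation would need to pick padding and rounding so that the shapes agree.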

CLC number:


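Copyright © Editorial Office of Journal of Beijing University of Aeronautics and Astronautics
Address: Editorial Office of Journal of Beijing University of Aeronautics and Astronautics, No. 37 Xueyuan Road, Haidian District, Beijing Postal code: 100191 E-mail: jbuaa@buaa.edu.cn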