Journal of Beijing University of Aeronautics and Astronautics ›› 2021, Vol. 47 ›› Issue (3): 509-519. doi: 10.13700/j.bh.1001-5965.2020.0444

• Article •

An underwater coral reef fish detection approach based on aggregation of spatio-temporal features

CHEN Zhineng1, SHI Cuncun2, LI Xuanya3, JIA Caiyan2, HUANG Lei4

  1. Research Centre for Digital Content Technology and Services, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China;
  2. School of Computer and Information Technology, Beijing Jiaotong University, Beijing 100044, China;
  3. Baidu Inc., Beijing 100085, China;
  4. College of Information Science and Engineering, Ocean University of China, Qingdao 266100, China
  • Received: 2020-08-24 Published: 2021-04-08
  • Corresponding author: LI Xuanya E-mail: lixuanya@baidu.com
  • About the authors: CHEN Zhineng (陈智能), male, Ph.D., associate researcher. Main research interests: multimedia content analysis and retrieval, medical image analysis; SHI Cuncun (史存存), female, M.S., engineer. Main research interests: computer vision, deep learning; LI Xuanya (李轩涯), male, Ph.D. Main research interests: Internet of Things computing, artificial intelligence; JIA Caiyan (贾彩燕), female, Ph.D., professor, doctoral supervisor. Main research interests: social computing, text clustering, network community detection; HUANG Lei (黄磊), male, Ph.D., associate professor. Main research interests: multimedia content analysis and retrieval, computer vision.
  • Supported by:
    National Natural Science Foundation of China (61772526, 61876016, 61872326); Baidu Open Research Program

Abstract: Detecting coral reef fish in underwater surveillance videos is a highly challenging visual object detection problem because of poor imaging quality, complex underwater environments, and the high visual diversity of coral reef fish. Extracting highly discriminative features has become the key factor limiting detection accuracy. This paper proposes an underwater coral reef fish detection method based on the aggregation of spatio-temporal features. It designs two modules, one for visual feature aggregation and one for temporal feature aggregation, and fuses features across multiple dimensions. The former combines a top-down partition scheme with a bottom-up merging scheme to aggregate multi-layer convolutional feature maps of different resolutions. The latter uses a frame-difference-guided scheme to fuse the feature maps of adjacent frames, which strengthens the feature representation of moving objects and their surrounding regions. Experiments on a public dataset show that the spatio-temporal feature aggregation network built on these two modules detects coral reef fish effectively and achieves higher detection accuracy than several mainstream methods and models.

Key words: coral reef fish, convolutional neural network, spatio-temporal feature, object detection, feature fusion
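
The abstract describes the temporal module only at a high level: the pixel difference between adjacent frames guides how their feature maps are fused. The sketch below illustrates that general idea and is not the authors' implementation. It assumes grayscale frames stored as nested lists, block averaging to bring the frame difference down to feature-map resolution, and an additive gate; the function names `motion_weights` and `fuse_features` are hypothetical.

```python
def motion_weights(prev_frame, cur_frame, cell):
    """Average absolute frame difference per cell x cell block, scaled to [0, 1].

    Assumption: frame height and width are multiples of `cell`.
    """
    rows, cols = len(cur_frame), len(cur_frame[0])
    pooled = []
    for r0 in range(0, rows, cell):
        pooled_row = []
        for c0 in range(0, cols, cell):
            total = sum(
                abs(cur_frame[r][c] - prev_frame[r][c])
                for r in range(r0, r0 + cell)
                for c in range(c0, c0 + cell)
            )
            pooled_row.append(total / (cell * cell))
        pooled.append(pooled_row)
    peak = max(max(row) for row in pooled)
    if peak == 0:
        # No motion anywhere: every weight is zero.
        return [[0.0] * len(row) for row in pooled]
    return [[value / peak for value in row] for row in pooled]


def fuse_features(prev_feat, cur_feat, weights):
    """Add the previous frame's features to the current ones, gated by motion.

    Feature maps are indexed [channel][row][col]; the same weight map is
    applied to every channel (an assumption of this sketch).
    """
    return [
        [
            [cur + w * prev for cur, prev, w in zip(cur_row, prev_row, w_row)]
            for cur_row, prev_row, w_row in zip(cur_ch, prev_ch, weights)
        ]
        for cur_ch, prev_ch in zip(cur_feat, prev_feat)
    ]


# Toy example: something moves into the top-left quadrant of a 4x4 frame.
prev_frame = [[0] * 4 for _ in range(4)]
cur_frame = [
    [200, 200, 0, 0],
    [200, 200, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
weights = motion_weights(prev_frame, cur_frame, cell=2)

prev_feat = [[[1.0, 1.0], [1.0, 1.0]]]  # one channel, 2x2
cur_feat = [[[0.5, 0.5], [0.5, 0.5]]]
fused = fuse_features(prev_feat, cur_feat, weights)
```

In this toy case only the top-left cell, where the frames differ, receives a contribution from the previous frame's features; static cells keep the current frame's features unchanged. A real detector would apply a learned fusion to convolutional feature maps rather than this fixed additive gate.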

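Copyright © Editorial Office of Journal of Beijing University of Aeronautics and Astronautics
Address: Editorial Office of Journal of Beijing University of Aeronautics and Astronautics, No. 37 Xueyuan Road, Haidian District, Beijing 100191, China E-mail: jbuaa@buaa.edu.cn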