Journal of Beijing University of Aeronautics and Astronautics, 2018, Vol. 44, Issue (5): 1074-1080. doi: 10.13700/j.bh.1001-5965.2017.0357

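A deep learning based interactive recognition method for telephone numbers

HAN Jingye1, XU Fu1, CHEN Zhibo1, LIU Hui2

  1. School of Information Science and Technology, Beijing Forestry University, Beijing 100083, China;
  2. School of Computer Science and Engineering, Beijing University of Aeronautics and Astronautics, Beijing 100083, China
  • Received: 2017-05-27  Online: 2018-05-20  Published: 2018-05-29
  • Corresponding author: XU Fu  E-mail: xufu@bjfu.edu.cn
  • About the authors: HAN Jingye, male, master's student; main research interests: image processing, deep learning. XU Fu, male, Ph.D., associate professor; main research interests: image processing, compiler technology, software engineering. CHEN Zhibo, male, Ph.D., professor, doctoral supervisor; main research interests: database technology, forestry information engineering. LIU Hui, male, Ph.D.; main research interest: software engineering.
  • Funding:
    National Natural Science Foundation of China (61772078); Beijing Municipal Key Research and Development Program (D171100001817003); Fundamental Research Funds for the Central Universities (YX2014-17)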

Abstract: Sectors such as logistics, insurance and intermediary services need to make telephone calls frequently. Dialing manually is inefficient, so efficient telephone number recognition has important practical value. Traditional methods for printed digit recognition rely on a complicated hand-crafted feature design process and handle only a single font, so they cannot meet the requirements of practical applications. An interactive telephone number recognition method based on deep learning is proposed. When the user double-clicks a telephone number in an image, the method automatically crops the target area that contains the number and performs preprocessing operations: grayscale conversion, binarization, target area localization, character segmentation and image padding. An improved LeNet-5 convolutional neural network (CNN) then learns image features automatically and recognizes printed digits in a variety of fonts, glyph styles and font sizes. Recognition speed is improved through interactive recognition and a memory pool. Experimental results show that the recognition accuracy is 99.86% for a single character and 99.50% for a whole telephone number, and that the average recognition time for a whole number is 91 ms. Compared with the traditional methods, the proposed method has relatively higher accuracy and faster recognition speed, and can be widely used in many sectors.

Key words: deep learning, convolutional neural network (CNN), telephone number recognition, interactive recognition, target area localization
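
The preprocessing steps named in the abstract (grayscale conversion, binarization, target area localization, character segmentation and image padding) can be illustrated with a minimal Python sketch using NumPy. This is not the authors' implementation. The fixed threshold, the projection-based localization and segmentation, the nearest-neighbour resize, the function names, and the 32 x 32 output size (the input size of the original LeNet-5) are all assumptions, because the abstract does not specify them.

```python
import numpy as np

def to_gray(rgb):
    """Convert an H x W x 3 RGB array to grayscale (ITU-R BT.601 weights)."""
    return rgb[..., :3].astype(np.float64) @ np.array([0.299, 0.587, 0.114])

def binarize(gray, threshold=128.0):
    """Return a boolean mask that is True for ink pixels.
    Assumes dark text on a light background; the threshold is a placeholder."""
    return gray < threshold

def locate(mask):
    """Crop the mask to the bounding box of all ink pixels."""
    rows = np.flatnonzero(mask.any(axis=1))
    cols = np.flatnonzero(mask.any(axis=0))
    if rows.size == 0:
        return mask[:0, :0]
    return mask[rows[0]:rows[-1] + 1, cols[0]:cols[-1] + 1]

def segment(mask):
    """Split the mask into characters at blank columns (vertical projection)."""
    chars, start = [], None
    for x, ink in enumerate(mask.any(axis=0)):
        if ink and start is None:
            start = x
        elif not ink and start is not None:
            chars.append(mask[:, start:x])
            start = None
    if start is not None:
        chars.append(mask[:, start:])
    return chars

def pad_and_resize(char, size=32, margin=2):
    """Centre a character on a blank square canvas, then resize it to
    size x size with nearest-neighbour sampling."""
    char = locate(char)  # trim blank rows above and below this character
    h, w = char.shape
    side = max(h, w) + 2 * margin
    canvas = np.zeros((side, side), dtype=bool)
    top, left = (side - h) // 2, (side - w) // 2
    canvas[top:top + h, left:left + w] = char
    idx = np.arange(size) * side // size
    return canvas[np.ix_(idx, idx)]

def preprocess(rgb, size=32):
    """Run the whole chain and return one size x size mask per character."""
    region = locate(binarize(to_gray(rgb)))
    return [pad_and_resize(c, size) for c in segment(region)]
```

Each returned mask would then be passed to the digit classifier. Splitting on blank columns only works when adjacent characters do not touch, so a practical system would likely need a more robust segmentation step.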
