Recognition of Chinese characters based on multi-scale gradient and deep neural network

Pan Weishen (潘炜深); Jin Lianwen (金连文); Feng Ziyong (冯子勇)

doi:10.13700/j.bh.1001-5965.2014.0499

School of Electronic and Information Engineering, South China University of Technology, Guangzhou 510641, China

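Funding: National Natural Science Foundation of China (61075021); National Key Technology R&D Program of China (2013BAH65F01, 2013BAH65F04); Science and Technology Planning Project of Guangdong Province (2012A010701001)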

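About the author:
Pan Weishen (born 1990), male, from Huizhou, Guangdong; master's student; panweishen2009@gmail.com

Corresponding author:
Jin Lianwen (born 1968), male, from Duyun, Guizhou; professor; lianwen.jin@gmail.com. Main research interests: handwritten Chinese character recognition, machine learning, pattern recognition, and cloud computing.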

CLC number: TP391.4
Publication history
- Received: 2014-04-28
- Revised: 2014-11-27
- Published online: 2015-04-20

Abstract

Abstract: This paper presents a method that extracts gradient histogram features from characters with a multi-scale sliding window and recognizes printed Chinese characters with a deep neural network. To capture the spatial relationships of the gradient histogram, a scalable sliding window segments the image so that feature information is obtained at different scales, effectively combining the global features and the local block features of Chinese characters. The experiment uses a 5-layer deep neural network to classify the 3755 printed Chinese characters of the national standard (GB) level-1 set, and applies the Dropout technique to prevent overfitting during training and to improve the generalization ability of the network. The experimental accuracy reaches 98.292%, indicating good recognition performance and demonstrating that the proposed multi-scale gradient feature and deep neural network model are effective for character recognition.

Keywords: multi-scale sliding window; gradient histogram; deep neural network; generalization ability; recognition of Chinese characters
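
The feature extraction summarized in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration, not the authors' implementation: the function names, the window sizes (32, 16, 8 pixels), the stride of half a window, the 8 orientation bins, and the final L2 normalization are all hypothetical choices. The sketch computes a magnitude-weighted gradient orientation histogram inside each window position at each scale and concatenates the results, so that the largest window contributes a global descriptor and the smaller windows contribute local block descriptors.

```python
import numpy as np


def orientation_histogram(mag, ang, n_bins=8):
    """Magnitude-weighted histogram of gradient orientations in one window."""
    # Map angles in [0, 2*pi) to integer bin indices; the modulo guards
    # against an angle that rounds to exactly 2*pi.
    bins = np.floor(ang / (2 * np.pi) * n_bins).astype(int) % n_bins
    return np.bincount(bins.ravel(), weights=mag.ravel(), minlength=n_bins)


def multiscale_gradient_features(image, window_sizes=(32, 16, 8), n_bins=8):
    """Concatenate orientation histograms from sliding windows of several sizes.

    Assumptions (hypothetical, for illustration only): square windows,
    stride equal to half the window size, L2-normalized output.
    """
    image = np.asarray(image, dtype=np.float64)
    gy, gx = np.gradient(image)              # derivatives along rows, columns
    mag = np.hypot(gx, gy)                   # gradient magnitude
    ang = np.arctan2(gy, gx) % (2 * np.pi)   # orientation in [0, 2*pi)

    h, w = image.shape
    feats = []
    for win in window_sizes:
        stride = max(win // 2, 1)
        for top in range(0, h - win + 1, stride):
            for left in range(0, w - win + 1, stride):
                region = (slice(top, top + win), slice(left, left + win))
                feats.append(orientation_histogram(mag[region], ang[region], n_bins))

    feats = np.concatenate(feats)
    norm = np.linalg.norm(feats)
    return feats / norm if norm > 0 else feats


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    glyph = rng.random((32, 32))             # stand-in for a character image
    features = multiscale_gradient_features(glyph)
    print(features.shape)                    # (472,)
```

For a 32×32 input, these settings give 1 + 9 + 49 = 59 window positions and a 472-dimensional vector. A vector of this kind would then be fed to a classifier such as the 5-layer network with Dropout that the abstract describes.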
