Smoothed-unit HMM Algorithm in Mandarin Speech Recognition

He Qiang (何强); Mao Shiyi (毛士艺); Zhang Youwei (张有为)

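1. Department of Electronic Engineering, Beijing University of Aeronautics and Astronautics
2. Research Institute of Information Science, Wuyi University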

Funding:

Supported by the Natural Science Foundation of Guangdong Province (960631)

Details

Chinese Library Classification (CLC) number: TN 9123
Publication history
- Received: 1999-11-12
- Published online: 2001-02-28

Abstract

Abstract: The basic unit in Mandarin speech recognition is generally the phoneme, the syllable, or the semi-syllable (initial and final). A semi-syllable system needs relatively few HMMs and little computation, so it suits real-time implementation. However, because its models are relatively isolated, they describe the acoustic properties of the speech signal less precisely, and the recognition rate is generally lower than that of a syllable-based system. Systems based on syllables or phonemes (tri-phone or di-phone), in turn, have many more HMMs and require heavy computation in both training and recognition, which works against real-time operation. This paper proposes a compromise. The system units remain initials and finals, but in the HMM training phase the parameters are computed over the entire syllable sequence, so that the states in the transition segment between initial and final are smoothed. The transition probability between the initial and the final of each syllable is also computed and stored, and during recognition the two semi-syllable HMMs are dynamically assembled into a complete syllable HMM. The method keeps the number of HMMs small while reducing the recognition error rate, and it is suitable for real-time connected-word speech recognition systems built around a DSP.

Keywords:
- speech recognition
- real time
- Markov processes
- HMM
- semi-syllable unit
- smoothing
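
The recognition-time assembly step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes left-to-right unit HMMs without skip transitions, represents each unit by a transition matrix whose last column is the exit probability, and treats the stored initial-to-final transition probability as a single number per syllable (`link_prob`). The function names, the state counts, and the numeric values are all hypothetical, and emission densities are omitted.

```python
def left_to_right(n_states, stay=0.6):
    """Build an n x (n+1) left-to-right transition matrix.

    Column n is the exit probability of the unit. The value of
    `stay` is illustrative only, not taken from the paper.
    """
    rows = []
    for i in range(n_states):
        row = [0.0] * (n_states + 1)
        row[i] = stay            # self-loop
        row[i + 1] = 1.0 - stay  # advance (or exit, for the last state)
        rows.append(row)
    return rows


def assemble_syllable(a_initial, a_final, link_prob):
    """Join an initial HMM and a final HMM into one syllable HMM.

    The exit column of the initial lines up with the first state of
    the final. The boundary state of the initial is then overwritten
    with the syllable-specific link probability (assumed to have been
    estimated during whole-syllable training).
    """
    n_i, n_f = len(a_initial), len(a_final)
    width = n_i + n_f + 1  # all emitting states plus one exit column
    syllable = []
    for row in a_initial:
        syllable.append(list(row) + [0.0] * n_f)
    for row in a_final:
        syllable.append([0.0] * n_i + list(row))
    # Valid for units without skip transitions: the last state of the
    # initial can only stay or move on to the first state of the final.
    last = n_i - 1
    syllable[last] = [0.0] * width
    syllable[last][last] = 1.0 - link_prob
    syllable[last][n_i] = link_prob
    return syllable


# Hypothetical example: a 2-state initial, a 3-state final, and a
# stored link probability of 0.3 for this particular syllable.
syllable_hmm = assemble_syllable(left_to_right(2), left_to_right(3), 0.3)
```

Under these assumptions only the unit models and one link probability per syllable need to be stored, and the full syllable matrix is built on demand, which is the storage and computation trade-off the abstract points to.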
