Abstract
This paper studies improvements to acoustic models built on main-vowel phoneme units. Main-vowel modeling is widely used because it suits the characteristics of Mandarin speech. Analysis of the main-vowel phoneme model shows that the character boundaries within the syllable sequence of a word can be ambiguous: the decomposition of a word's pronunciation into a sequence of syllables is not unique. An improved main-vowel scheme is therefore proposed that makes the character boundaries in a syllable sequence explicit, reduces the size of the unit inventory, and raises the recognition rate. To model co-articulation in continuous speech, a matching tonal question set is designed for the improved units, and context-dependent triphone models are built using decision-tree-based state tying. Experiments show that the improved tonal phoneme set uses fewer units than the original and lowers the character error rate (CER) by an absolute 0.4% to 0.6%.
Source
Computer Simulation (《计算机仿真》)
CSCD
Peking University Core Journals (北大核心)
2010, Issue 5, pp. 355-358 (4 pages)
Keywords
Large-vocabulary continuous Mandarin speech recognition
Phoneme
Main vowel
Decision tree