Abstract
A high-performance Mandarin digit speech recognition (MDSR) system is presented. The system uses Mel-frequency cepstral coefficients (MFCC) as its main speech feature parameters, and additionally extracts formant trajectories and nasal features to distinguish easily confused word pairs. A feature-based, real-time endpoint detection algorithm is proposed to reduce system resource requirements and improve robustness to interference. A two-stage recognition framework improves discrimination: the first stage determines the candidate results, and the second stage distinguishes confusable word pairs. With these improvements, the MDSR system achieves a recognition rate of 98.8%.
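The two-stage decision flow described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `two_stage_recognize`, the score dictionary, the pair-specific discriminator callables, and the example pair ("1", "7") are all hypothetical, since the abstract does not specify the first-stage classifier, the confusable pairs, or how the formant and nasal features are used in the second stage.

```python
from typing import Callable, Dict, FrozenSet, Sequence

# A second-stage discriminator maps secondary features (assumed here to be
# something like formant-trajectory or nasal measurements) to one label of its pair.
Discriminator = Callable[[Sequence[float]], str]


def two_stage_recognize(
    first_stage_scores: Dict[str, float],
    pair_discriminators: Dict[FrozenSet[str], Discriminator],
    secondary_features: Sequence[float],
) -> str:
    """Return a digit label using a two-stage decision (illustrative sketch)."""
    # Stage 1: rank candidates by score (assumption: higher score is better).
    ranked = sorted(first_stage_scores, key=first_stage_scores.get, reverse=True)
    if len(ranked) < 2:
        return ranked[0]
    best, runner_up = ranked[0], ranked[1]

    # Stage 2: only invoked when the top two candidates form a registered
    # confusable pair; otherwise the first-stage winner is accepted.
    discriminator = pair_discriminators.get(frozenset((best, runner_up)))
    if discriminator is None:
        return best
    return discriminator(secondary_features)


# Usage with a hypothetical confusable pair and a toy threshold rule.
toy_discriminators = {
    frozenset(("1", "7")): lambda feats: "7" if feats[0] > 0.5 else "1",
}
result = two_stage_recognize({"1": 0.90, "7": 0.85, "3": 0.10}, toy_discriminators, [0.8])
```

Restricting the second stage to registered pairs keeps the extra feature comparison off the common path, which fits the abstract's stated aim of limiting resource use.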
Source
《清华大学学报(自然科学版)》
EI
CAS
CSCD
PKU Core
2000, No. 1, pp. 32-34, 56 (4 pages in total)
Journal of Tsinghua University (Science and Technology)
Funding
National Natural Science Foundation of China (69772020)
National "863" High-Tech Program (863-512-9805-10)