Abstract
This paper describes the theory of Latent Semantic Analysis (LSA) and the techniques needed to apply LSA-based language modeling to continuous speech recognition. An LSA model is built on the WSJ0 text corpus and combined by interpolation with a conventional 3-gram model to form a hybrid statistical language model (LSA+3-gram) that incorporates semantic information. To further improve the hybrid model, the vectors in the LSA space are clustered with a k-means algorithm whose centroids are initialized using a density function. Continuous speech recognition experiments on the WSJ0 test corpus show that the hybrid model outperforms the standard 3-gram baseline, with a relative reduction in word error rate of about 13.3%.
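The two techniques named in the abstract can be illustrated with a minimal sketch. The abstract does not give the interpolation formula, the interpolation weight, or the density function, so everything below is an assumption made for illustration: `interpolate` uses plain linear interpolation with a hypothetical weight `lam`, `density_init` counts neighbours within a hypothetical `radius` as its density measure, and distances are Euclidean rather than whatever similarity the paper applies in the LSA space.

```python
import math


def interpolate(p_ngram, p_lsa, lam=0.5):
    """Linearly interpolate two probabilities for the same word.

    lam is a hypothetical weight on the n-gram model; in practice it
    would be tuned on held-out data.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must be in [0, 1]")
    return lam * p_ngram + (1.0 - lam) * p_lsa


def density_init(points, k, radius):
    """Pick up to k initial centroids from dense regions.

    Density of a point = number of remaining points within `radius`
    (an assumed density measure). After a point is chosen, its
    neighbours are excluded so the next centroid comes from a
    different region.
    """
    remaining = list(range(len(points)))
    centroids = []
    while remaining and len(centroids) < k:
        def density(i):
            return sum(
                1 for j in remaining
                if math.dist(points[i], points[j]) <= radius
            )
        best = max(remaining, key=density)
        centroids.append(points[best])
        remaining = [
            j for j in remaining
            if math.dist(points[best], points[j]) > radius
        ]
    return centroids


def kmeans(points, centroids, iters=10):
    """Standard k-means updates starting from the given centroids."""
    for _ in range(iters):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(
                range(len(centroids)),
                key=lambda c: math.dist(p, centroids[c]),
            )
            clusters[nearest].append(p)
        # Keep the old centroid if a cluster ends up empty.
        centroids = [
            tuple(sum(x) / len(c) for x in zip(*c)) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids


vectors = [(0, 0), (0.1, 0), (0, 0.1), (5, 5), (5.1, 5), (9, 9)]
seeds = density_init(vectors, k=2, radius=0.5)
final_centroids = kmeans(vectors, seeds)
mixed = interpolate(p_ngram=0.2, p_lsa=0.4, lam=0.75)
```

Seeding from dense regions is meant to make the clustering less sensitive to the random starting points that ordinary k-means uses; whether the paper's density function behaves like the neighbour count assumed here is not stated in the abstract.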
Source
Computer Engineering and Applications (《计算机工程与应用》)
CSCD
Peking University Core Journals (北大核心)
2009, Issue 32, pp. 111-113 (3 pages)
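Funding
National Natural Science Foundation of China, No. 60573189
National High Technology Research and Development Program of China (863 Program), No. 2006AA01Z139, No. 2006AA010107, No. 2006AA010108
Natural Science Foundation of Fujian Province, No. 2006J0043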
Keywords
latent semantic analysis
N-gram
k-means clustering
continuous speech recognition