A Term Specific Thresholding Method Based on Improved Score Distribution

Abstract: To improve the precision of a spoken term detection system, a term specific thresholding method based on an improved score distribution is presented. At the decision stage of the system, a separate threshold is set for each query term according to the distribution of its posterior scores. The distribution of all posterior scores retrieved for a query term is modeled by an exponential mixture model, whose parameters are estimated in an unsupervised manner by the expectation maximization (EM) algorithm. The threshold is then calculated by the Bayes minimum risk rule. Since the EM algorithm is sensitive to initial values, K-means clustering is used for initialization instead of random initialization: the scores of the candidate results are first divided into two classes, then the prior distribution of each class is calculated and the initial values of the model parameters are estimated by the maximum likelihood method. The experimental results show that the performance of the proposed thresholding method is better than that of others.
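The pipeline summarized in the abstract (K-means initialization, EM for a two-component exponential mixture, Bayes minimum risk threshold) can be illustrated with a minimal sketch. The abstract does not give the exact model, so several choices here are assumptions rather than the authors' formulation: scores are taken to lie in [0, 1], false-alarm scores are modeled as an exponential decaying from 0 (rate `lam0`), correct-hit scores as an exponential decaying from 1 (rate `lam1`), truncation to [0, 1] is ignored, and the costs `cost_fa` and `cost_miss` are hypothetical parameters. All function names are illustrative.

```python
import math
import random


def kmeans_two_clusters(scores, iters=50):
    """1-D K-means with k=2; returns a 0/1 label per score (1 = high-score cluster)."""
    c_lo, c_hi = min(scores), max(scores)  # deterministic start: extreme values
    labels = None
    for _ in range(iters):
        new_labels = [1 if abs(s - c_hi) < abs(s - c_lo) else 0 for s in scores]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        low = [s for s, k in zip(scores, labels) if k == 0]
        high = [s for s, k in zip(scores, labels) if k == 1]
        if low:
            c_lo = sum(low) / len(low)
        if high:
            c_hi = sum(high) / len(high)
    return labels


def fit_exponential_mixture(scores, iters=100, eps=1e-9):
    """EM for the assumed mixture; returns (w1, lam0, lam1).

    The first M-step runs on the hard K-means labels, which gives the
    maximum likelihood initial values; later M-steps use soft responsibilities.
    """
    n = len(scores)
    resp = [float(k) for k in kmeans_two_clusters(scores)]  # P(hit | score)
    for _ in range(iters):
        # M-step: weighted maximum likelihood estimates
        n1 = sum(resp)
        n0 = n - n1
        w1 = min(max(n1 / n, eps), 1.0 - eps)
        lam0 = (n0 + eps) / (sum((1.0 - r) * s for r, s in zip(resp, scores)) + eps)
        lam1 = (n1 + eps) / (sum(r * (1.0 - s) for r, s in zip(resp, scores)) + eps)
        # E-step: posterior probability that each score is a correct hit
        resp = []
        for s in scores:
            p0 = (1.0 - w1) * lam0 * math.exp(-lam0 * s)
            p1 = w1 * lam1 * math.exp(-lam1 * (1.0 - s))
            resp.append(p1 / (p0 + p1 + eps))
    return w1, lam0, lam1


def bayes_threshold(w1, lam0, lam1, cost_fa=1.0, cost_miss=1.0):
    """Score above which accepting a candidate has the lower expected risk.

    Accept when cost_miss * P(hit | s) >= cost_fa * P(false alarm | s);
    for this model the log-ratio is linear in s, so the boundary is closed-form.
    """
    t = (math.log(cost_fa / cost_miss)
         + math.log((1.0 - w1) * lam0)
         - math.log(w1 * lam1)
         + lam1) / (lam0 + lam1)
    return min(max(t, 0.0), 1.0)


if __name__ == "__main__":
    # Synthetic scores for one query term (illustrative data only).
    random.seed(0)
    false_alarms = [min(random.expovariate(10.0), 1.0) for _ in range(300)]
    hits = [max(1.0 - random.expovariate(10.0), 0.0) for _ in range(100)]
    w1, lam0, lam1 = fit_exponential_mixture(false_alarms + hits)
    print("threshold:", bayes_threshold(w1, lam0, lam1))
```

Because the model is fitted separately on the candidate scores of each query term, each term gets its own threshold, which is the sense in which the threshold is term specific.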

Authors: LU Lihua (陆梨花), ZHANG Lianhai (张连海)

Affiliation: Institute of Information System Engineering, PLA Information Engineering University (中国人民解放军信息工程大学信息系统工程学院)

Source: Pattern Recognition and Artificial Intelligence (《模式识别与人工智能》), 2015, No. 5, pp. 437-442 (6 pages). Indexed in EI, CSCD, and the Peking University Core Journals list.

Funding: Supported by the National Natural Science Foundation of China (No. 61175017)

Keywords: Score Distribution, Term Specific Thresholding, K-means Clustering, Spoken Term Detection

Classification code: TN912.3 [Electronics and Telecommunications: Communication and Information Systems]


