The Research Progress of Speech Information Processing (cited by: 3)
Abstract: This paper reviews progress in speech information processing, with particular attention to the current state of Chinese speech processing. Speech information processing covers speech recognition, speaker recognition, speech synthesis, and computational speech perception. Recognition of accented and casually pronounced speech provides strong support for applications such as language learning and spoken proficiency assessment. In speaker recognition, the main research focus is improving accuracy under cross-channel conditions, environmental noise, multiple speakers, short utterances, and time-varying speech. Speech synthesis research concentrates on multilingual, emotional, and audio-visual synthesis. Work in computational speech perception includes speech audiometry, noise suppression algorithms, frequency-response compensation methods for hearing aids, and speech signal enhancement algorithms. Combining speech processing technology effectively with linguistics and network technology promotes more harmonious human-computer speech interaction.
Source: Journal of Chinese Information Processing (《中文信息学报》), CSCD, Peking University Core Journal, 2011, No. 6, pp. 137-141 (5 pages)
Funding: National Natural Science Foundation of China (61003094, 60928005, 60805008)
Keywords: speech recognition; speaker recognition; speech synthesis; computational speech perception

References (14)

  • 1. Rabiner L, Juang B-H. Fundamentals of Speech Recognition[M]. Prentice Hall, 1993.
  • 2. Huang X D, Acero A, Hon H W. Spoken language processing: A guide to theory, algorithm and system development[M]. Prentice Hall, 2001.
  • 3. Liu L, Zheng F, Wu W. State-dependent phoneme-based model merging for dialectal Chinese speech recognition[J]. Speech Communication, 2008, 50(7): 605-615.
  • 4. Harrison A, Meng H, Lee P. Automated Feedback in Commercial Computer-Training Systems[R]. Dept. of SEEM, CUHK, 2009.
  • 5. Meng H, Lo W-K, Harrison A M, et al. Development of Automatic Speech Recognition and Synthesis Technologies to Support Chinese Learners of English: The CUHK Experience[C]//APSIPA 2010, Biopolis, Singapore: 2010.
  • 6. Wu W, Zheng F, Xu M, et al. A Channel Robust Speaker Verification Algorithm Using Cohort-based Speaker Model Synthesis[J]. IEEE Transactions on Audio, Speech, and Language Processing, 2007, 15(6): 1893-1903.
  • 7. Zen H, Nose T, Yamagishi J, et al. The HMM-based Speech Synthesis System (HTS) Version 2.0[C]//Sixth ISCA Workshop on Speech Synthesis. Bonn, Germany: 2007: 294-299.
  • 8. Qian Y, Xu J, Soong F K. A frame mapping based HMM approach to cross-lingual voice transformation[C]//2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2011: 5120-5123.
  • 9. Chung-Hsien Wu, Chi-Chun Hsia, Chung-Han Lee, et al. Hierarchical Prosody Conversion Using Regression-Based Clustering for Emotional Speech Synthesis[J]. IEEE Transactions on Audio, Speech, and Language Processing, 2010, 18(6): 1394-1405.
  • 10. Jia Jia, Shen Zhang, Fanbo Meng, et al. Emotional Audio-Visual Speech Synthesis Based on PAD[J]. IEEE Transactions on Audio, Speech, and Language Processing, 2011, 19(3): 570-582.

Secondary references (14)

  • 1. [12] Zeng FG, Nie KB, Stickney G. Speech recognition with slowly varying amplitude and frequency modulation cues. Proceedings of the National Academy of Sciences of the USA, 2005, 102(7): 2293-2298.
  • 2. [13] Freyman RL, Balakrishnan U, Helfer KS. Effect of number of masking talkers and auditory priming on informational masking in speech recognition. Journal of the Acoustical Society of America, 2004, 115: 2246-2256.
  • 3. [14] Li L, Qi JG, He Y. Attribute capture in the precedence effect for long-duration noise sounds. Hearing Research, 2005, 202: 235-247.
  • 4. [1] Zeng FG. Cochlear implants in China. Audiology, 1995, 34: 61-75.
  • 5. [2] Zeng FG, Cao KL, Wang ZZ. Progress in cochlear implants. Chinese Journal of Otolaryngology, 1998, 33(2): 123-125.
  • 6. [3] Kang J. Comparison of speech intelligibility between English and Chinese. Journal of the Acoustical Society of America, 1998, 103: 1213-12.
  • 7. [4] Freyman RL, Helfer KS, McCall DD, et al. The role of perceived spatial separation in the unmasking of speech. Journal of the Acoustical Society of America, 1999, 106: 3578-3588.
  • 8. [5] Freyman RL, Balakrishnan U, Helfer KS. Spatial release from informational masking in speech recognition. Journal of the Acoustical Society of America, 2001, 109: 2112-2122.
  • 9. [6] Li L, Daneman M, Qi JG, et al. Does the information content of an irrelevant source differentially affect speech recognition in younger and older adults? Journal of Experimental Psychology: Human Perception and Performance, 2004, 30: 1077-1091.
  • 10. [7] Wu XH, Wang C, Chen J, et al. The effect of perceived spatial separation on informational masking of Chinese speech. Hearing Research, 2004, 199: 1-10.

Co-citing literature (7)

Co-cited literature (23)

  • 1. Xu Jun, Cai Lianhong. Hierarchical prosody analysis and modeling for emotion conversion[J]. Journal of Tsinghua University (Science and Technology), 2009(S1): 1274-1277. Cited by: 7
  • 2. Cai Lianhong, Cui Dandan, Cai Rui. Construction and analysis of TH-CoSS, a Mandarin Chinese speech synthesis corpus[J]. Journal of Chinese Information Processing, 2007, 21(2): 94-99. Cited by: 12
  • 3. Zen H, Tokuda K, Black A W. Statistical parametric speech synthesis[J]. Speech Communication, 2009, 51(11): 1039-1064.
  • 4. Yamagishi J, Kobayashi T, Nakano Y, et al. Analysis of speaker adaptation algorithms for HMM-based speech synthesis and a constrained SMAPLR adaptation algorithm[J]. IEEE Transactions on Audio, Speech, and Language Processing, 2009, 17(1): 66-83.
  • 5. Nose T, Tachibana M, Kobayashi T. HMM-based style control for expressive speech synthesis with arbitrary speaker's voice using model adaptation[J]. IEICE Trans on Inf & Syst, 2009, E92-D(3): 489-497.
  • 6. Yang Hongwu, Meng H M, Cai Lianhong. Modeling the acoustic correlates of expressive elements in text genres for expressive text-to-speech synthesis[C]//Proceedings of International Conference on Spoken Language Processing. Pittsburgh, USA: [s.n.], 2006: 1806-1809.
  • 7. Wu Zhiyong, Meng H M, Yang Hongwu, et al. Modeling the expressivity of input text semantics for Chinese text-to-speech synthesis in a spoken dialog system[J]. IEEE Transactions on Audio, Speech, and Language Processing, 2009, 17(8): 1567-1577.
  • 8. Cui Dandan. Research on emotional speech analysis and conversion[D]. Beijing: Tsinghua University, 2007.
  • 9. Guo Weitong, Yang Hongwu, Pei Dong, et al. Prosody conversion of Chinese northwest mandarin dialect based on five degree tone model[J]. International Journal of Digital Content Technology and its Applications, 2012, 6(17): 323-332.
  • 10. Kawahara H, Masuda-Katsuse I, de Cheveigne A. Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based F0 extraction: possible role of a repetitive structure in sounds[J]. Speech Communication, 1999, 27(3/4): 187-207.

Citing literature (3)

Secondary citing literature (6)
