IBM Voice Conversion Systems for 2007 TC-STAR Evaluation
Abstract: This paper proposes a novel voice conversion method based on frequency warping. The frequency warping function is generated by mapping the formants of the source speaker to those of the target speaker. In addition to frequency warping, fundamental frequency adjustment, spectral envelope equalization, breathiness addition, and duration modification are used to improve similarity to the target speaker. The proposed method needs only a very small amount of training data to generate the warping function, which greatly facilitates its application. Systems based on the proposed method were used in the 2007 TC-STAR intra-lingual voice conversion evaluation for English and Spanish and in a cross-lingual voice conversion evaluation for Spanish. The evaluation results show that the proposed method achieves a much better quality of converted speech than other methods, as well as a good balance between quality and similarity. The IBM1 system was ranked No. 1 in the English evaluation and No. 2 in the Spanish evaluation. The results also show that the proposed method is a convenient and competitive method for cross-lingual voice conversion tasks.
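The formant-based warping described in the abstract can be illustrated with a minimal sketch. The abstract does not give the functional form of the warping function, so the code below assumes a piecewise-linear warp that passes through (source formant, target formant) anchor points and is pinned at 0 Hz and the Nyquist frequency. The function names (`make_anchors`, `interp`, `warp_envelope`) and the formant values in the usage lines are hypothetical and are not taken from the paper.

```python
from bisect import bisect_right


def make_anchors(src_formants, tgt_formants, nyquist):
    """Pair source/target formants and pin the end points 0 Hz and Nyquist.

    Assumption: one target formant per source formant, all in Hz.
    """
    if len(src_formants) != len(tgt_formants):
        raise ValueError("need one target formant per source formant")
    xs = [0.0] + [float(f) for f in src_formants] + [float(nyquist)]
    ys = [0.0] + [float(f) for f in tgt_formants] + [float(nyquist)]
    # Both axes must increase strictly so that the warp is invertible.
    for axis in (xs, ys):
        if any(b <= a for a, b in zip(axis, axis[1:])):
            raise ValueError("formants must be strictly increasing and below Nyquist")
    return xs, ys


def interp(x, xs, ys):
    """Piecewise-linear interpolation of y at x, clamped to the range of xs."""
    x = min(max(x, xs[0]), xs[-1])
    i = min(bisect_right(xs, x), len(xs) - 1)
    x0, x1 = xs[i - 1], xs[i]
    y0, y1 = ys[i - 1], ys[i]
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


def warp_envelope(envelope, xs, ys, nyquist):
    """Resample a source spectral envelope onto the warped frequency axis.

    `envelope` holds magnitudes on uniform bins from 0 Hz to `nyquist`.
    Each output bin reads the source envelope at the inverse-warped frequency.
    """
    n = len(envelope)
    step = nyquist / (n - 1)
    bin_freqs = [k * step for k in range(n)]
    out = []
    for f in bin_freqs:
        f_src = interp(f, ys, xs)  # inverse warp: target axis -> source axis
        out.append(interp(f_src, bin_freqs, envelope))
    return out


# Hypothetical formant values (Hz), for illustration only.
xs, ys = make_anchors([500, 1500, 2500], [600, 1700, 2700], nyquist=8000)
envelope = [0.0] * 81        # 100 Hz bins from 0 to 8000 Hz
envelope[5] = 1.0            # a single peak at the first source formant (500 Hz)
warped = warp_envelope(envelope, xs, ys, nyquist=8000)
peak_bin = max(range(len(warped)), key=warped.__getitem__)
```

In this toy example the envelope peak moves from the 500 Hz bin to the 600 Hz bin, i.e. from the first source formant to the first target formant. The other processing steps named in the abstract (fundamental frequency adjustment, spectral envelope equalization, breathiness addition, duration modification) are not covered by this sketch.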
Source: Tsinghua Science and Technology (English edition of the Journal of Tsinghua University, Science and Technology); indexed in SCIE, EI, CAS; 2008, Issue 4, pp. 510-514 (5 pages)
Keywords: voice conversion; frequency warping; mapping formants
