Abstract
To overcome the frequency-domain discontinuities introduced by the TD-PSOLA algorithm, a new method for adjusting Chinese prosody is proposed. The method is based on the sinusoidal model of speech: each short-time frame of the speech signal is decomposed into a series of sinusoidal components with different amplitudes, phases, and frequencies, after which speaking rate (time scale) and pitch are modified. Experiments in time-scale and pitch-scale modification show that the synthesized speech retains the clarity and naturalness of the original speech. The method has been applied in a Chinese text-to-speech system.
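The decomposition and modification steps summarized in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `analyze_frame` and `synthesize_frame`, the plain DFT peak-picking, and the relative threshold `floor` are assumptions chosen for illustration. It handles a single stationary frame only and ignores the frame-to-frame parameter interpolation and spectral-envelope preservation that a practical system would need.

```python
import cmath
import math


def analyze_frame(frame, sample_rate, max_peaks=10, floor=0.01):
    """Estimate (amplitude, frequency_hz, phase) triples from DFT peaks.

    Hypothetical helper: peaks are local maxima of the DFT magnitude that
    reach at least `floor` times the largest magnitude.
    """
    n = len(frame)
    # Plain DFT over non-negative frequency bins (O(n^2), fine for one frame).
    spectrum = [
        sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n // 2 + 1)
    ]
    mags = [abs(c) for c in spectrum]
    threshold = floor * max(mags)
    peaks = [
        k for k in range(1, n // 2)
        if mags[k] > mags[k - 1] and mags[k] >= mags[k + 1] and mags[k] >= threshold
    ]
    # Keep the strongest peaks, then return them in ascending frequency order.
    peaks = sorted(sorted(peaks, key=lambda k: mags[k], reverse=True)[:max_peaks])
    return [
        (2 * mags[k] / n, k * sample_rate / n, cmath.phase(spectrum[k]))
        for k in peaks
    ]


def synthesize_frame(components, sample_rate, length, pitch_scale=1.0):
    """Sum the sinusoids, optionally scaling every frequency by `pitch_scale`.

    Choosing `length` different from the analysis frame length changes the
    duration without changing the frequencies (a crude time-scale change).
    """
    return [
        sum(
            a * math.cos(2 * math.pi * f * pitch_scale * t / sample_rate + p)
            for a, f, p in components
        )
        for t in range(length)
    ]
```

For example, `synthesize_frame(analyze_frame(frame, 8000), 8000, 2 * len(frame))` would produce a segment twice as long with the same frequency content, while passing `pitch_scale=1.5` raises every component's frequency by a factor of 1.5.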
Source
《上海应用技术学院学报(自然科学版)》
2001, No. 2, pp. 118-121 (4 pages)
Journal of Shanghai Institute of Technology: Natural Science
Keywords
sinusoidal model
time-scale modification
pitch-scale modification
text-to-speech