Abstract
This paper proposes a decision-tree-based algorithm for training the weights of the contextual features of speech units in speech synthesis. For each tonal syllable in the speech database, a Classification and Regression Tree (CART) is built from a context-dependent question set and the spectral distances between candidate units. For each syllable to be synthesized, its contextual features are used to obtain both F*, the contextual features of the unit selected by the speech synthesis system, and F′, the contextual features of the units in the CART leaf node reached under that context. The change of each contextual feature in F′ relative to F* is counted, and the weights are adjusted according to the probability that each feature changes. Experiments show that the method trains a reasonable set of contextual feature weights and that the naturalness of the synthesized speech improves to some extent. The method can also be used to optimize a speech synthesis system online, in real time.
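The counting and adjustment step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the actual update rule, so everything below is an assumption made for illustration: the feature names, the learning rate `rate`, the choice to raise the weight of features that change more often, and the renormalization are all hypothetical and are not taken from the paper.

```python
from collections import Counter


def change_probabilities(pairs, feature_names):
    """Estimate, per contextual feature, how often F' differs from F*.

    pairs: list of (f_star, f_prime) tuples, each a dict mapping a
    feature name to its value (hypothetical data layout).
    """
    changes = Counter()
    for f_star, f_prime in pairs:
        for name in feature_names:
            if f_star[name] != f_prime[name]:
                changes[name] += 1
    total = len(pairs)
    return {
        name: (changes[name] / total if total else 0.0)
        for name in feature_names
    }


def adjust_weights(weights, probs, rate=0.1):
    """Assumed update rule: scale each weight by (1 + rate * p_change),
    then renormalize so the weights sum to 1."""
    raised = {name: w * (1.0 + rate * probs[name]) for name, w in weights.items()}
    norm = sum(raised.values())
    return {name: w / norm for name, w in raised.items()}


# Hypothetical example with two contextual features.
features = ["left_tone", "right_initial"]
pairs = [
    ({"left_tone": 1, "right_initial": "b"}, {"left_tone": 1, "right_initial": "d"}),
    ({"left_tone": 1, "right_initial": "b"}, {"left_tone": 2, "right_initial": "d"}),
]
probs = change_probabilities(pairs, features)
new_weights = adjust_weights({"left_tone": 0.5, "right_initial": 0.5}, probs)
```

In this sketch a feature that differs more often between the selected unit and the CART leaf units ends up with a relatively larger weight; the paper may use a different direction or magnitude of adjustment.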
Source
《西北师范大学学报(自然科学版)》
CAS
2007, No. 4, pp. 50-54 (5 pages)
Journal of Northwest Normal University(Natural Science)
Funding
Northwest Normal University Key Research Personnel Cultivation Project (NWNU-KJCXGC-03-42)
Keywords
speech synthesis
text-to-speech
unit selection
weight training