Abstract
This paper studies the automatic generation of classical Chinese regulated verse (格律诗). Given a few keywords or key phrases supplied by the user, a topic model first expands them into a larger set of topically related words, and a language model then generates the first line automatically. The subsequent lines are generated one at a time by a statistical machine translation (SMT) model. During generation, the topic model also expands the artistic conception of the poem, which yields a richer set of candidate lines. The main features and contributions of this study are as follows: (1) Taking SMT as the theoretical basis, we treat two consecutive lines of a poem as the source-side and target-side sentences of a translation model and, under the rhythm and meter constraints of classical Chinese poetry, propose an SMT model that learns poetry-creation knowledge from a corpus of ancient poems. (2) The topic model extends the keywords to a collection of semantically related words during generation, which strengthens the theme and artistic conception of the poem. (3) We discuss a BLEU-based method for the automatic evaluation of generated lines and combine it with our human evaluation criteria to form a fairly comprehensive evaluation framework for poetry generation. The experimental results confirm the effectiveness of the method.
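The abstract names BLEU as the automatic metric but does not state how it is configured. The following is a minimal sketch of a BLEU-style score for a single generated line, assuming character-level tokens, n-grams up to length 2, and uniform weights; these settings, and the names `char_ngrams` and `line_bleu`, are illustrative assumptions rather than the paper's implementation.

```python
import math
from collections import Counter


def char_ngrams(chars, n):
    """Count the character n-grams of a line (hypothetical helper)."""
    return Counter(tuple(chars[i:i + n]) for i in range(len(chars) - n + 1))


def line_bleu(candidate, references, max_n=2):
    """BLEU-style score of one generated line against reference lines.

    Assumptions (not taken from the paper): tokens are single characters,
    n-gram orders 1..max_n are weighted uniformly, and no smoothing is used,
    so any order with zero matches gives a score of 0.
    """
    cand = list(candidate)
    refs = [list(r) for r in references]
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts = char_ngrams(cand, n)
        total = sum(cand_counts.values())
        if total == 0:
            return 0.0
        # Clip each candidate n-gram count by its maximum count in any reference.
        max_ref = Counter()
        for ref in refs:
            for gram, count in char_ngrams(ref, n).items():
                max_ref[gram] = max(max_ref[gram], count)
        clipped = sum(min(count, max_ref[gram]) for gram, count in cand_counts.items())
        if clipped == 0:
            return 0.0
        log_precisions.append(math.log(clipped / total))
    # Brevity penalty against the reference length closest to the candidate length.
    ref_len = min((len(r) for r in refs), key=lambda length: (abs(length - len(cand)), length))
    bp = 1.0 if len(cand) >= ref_len else math.exp(1 - ref_len / len(cand))
    return bp * math.exp(sum(log_precisions) / max_n)
```

Calling `line_bleu(generated_line, [reference_line_1, reference_line_2])` returns a value between 0 and 1. Because the lines of a regulated verse have a fixed length, the brevity penalty is usually 1 in this setting, and the score is driven by the clipped n-gram precisions.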
Source
《计算机学报》
EI
CSCD
PKU Core Journal (北大核心)
2015, Issue 12, pp. 2426-2436 (11 pages)
Chinese Journal of Computers
Keywords
poetry generation (律诗生成)
topic model
statistical machine translation
automatic evaluation