Abstract
Since its emergence, neural machine translation (NMT) has repeatedly brought encouraging results to the field of machine translation. However, NMT does not explicitly use linguistic knowledge to analyze sentence structure, so it performs poorly on long sentences with complex structure. Following the divide-and-conquer idea, this paper identifies and extracts the maximal noun phrases in a sentence, keeps a special mark or the head word in place of each phrase, and combines these with the remaining components to form a sentence framework. The maximal noun phrases and the sentence framework are translated separately by an NMT system, and the translations are then recombined, which alleviates NMT's sensitivity to sentence length. Experimental results show that the translations produced by this method improve the BLEU score by 0.89 over the baseline system.
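The split, translate, and recombine procedure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the `<MNPi>` placeholder format, and the toy translator are all hypothetical. It assumes that maximal noun phrase spans have already been identified and do not overlap, and that the translator copies placeholders through unchanged. Only the special-mark variant is shown, not the head-word variant.

```python
from typing import Callable, List, Tuple

Span = Tuple[int, int]  # [start, end) token indices of one maximal noun phrase
Translator = Callable[[List[str]], List[str]]


def split_sentence(tokens: List[str], mnp_spans: List[Span]) -> Tuple[List[str], List[List[str]]]:
    """Replace each (non-overlapping) MNP span with a placeholder token.

    Returns the sentence framework and the extracted phrases.
    The "<MNPi>" placeholder format is an assumption of this sketch.
    """
    frame: List[str] = []
    phrases: List[List[str]] = []
    cursor = 0
    for start, end in sorted(mnp_spans):
        frame.extend(tokens[cursor:start])        # text before the phrase
        frame.append(f"<MNP{len(phrases)}>")      # special mark for the phrase
        phrases.append(tokens[start:end])
        cursor = end
    frame.extend(tokens[cursor:])                 # text after the last phrase
    return frame, phrases


def translate_divide_and_conquer(tokens: List[str], mnp_spans: List[Span],
                                 translate: Translator) -> List[str]:
    """Translate the framework and the phrases separately, then recombine."""
    frame, phrases = split_sentence(tokens, mnp_spans)
    frame_out = translate(frame)
    phrase_out = [translate(p) for p in phrases]
    result: List[str] = []
    for tok in frame_out:
        if tok.startswith("<MNP") and tok.endswith(">"):
            # Put the translated phrase back where its placeholder ended up.
            result.extend(phrase_out[int(tok[4:-1])])
        else:
            result.append(tok)
    return result


def toy_translate(tokens: List[str]) -> List[str]:
    """Stand-in for an NMT system: upper-cases words, keeps placeholders."""
    return [t if t.startswith("<MNP") else t.upper() for t in tokens]
```

With `toy_translate` standing in for a real NMT system, the sentence "the old man sees a red car" with spans `(0, 3)` and `(4, 7)` yields the framework `<MNP0> sees <MNP1>`, and each part is then translated as a shorter unit than the full sentence.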
Authors
张学强
蔡东风
叶娜
吴闯
ZHANG Xueqiang; CAI Dongfeng; YE Na; WU Chuang (Human-Computer Intelligence Research Center, Shenyang Aerospace University, Shenyang, Liaoning 110136, China)
Source
《中文信息学报》
CSCD
Peking University Core Journal
2018, No. 3, pp. 42-48, 63 (8 pages in total)
Journal of Chinese Information Processing
Funding
National Natural Science Foundation of China (61402299, 61403262)
Keywords
neural machine translation
maximal-length noun phrase
divide-and-conquer strategy