
Research on English-Chinese Machine Translation Based on Sentence Grouping (基于句子分组的中英机器翻译研究)

Cited by: 2
Abstract: Training neural machine translation models on larger datasets improves translation quality, but the information those datasets carry about the content category and structure of each sentence is not fully exploited, which leaves room for improvement. This paper proposes a neural machine translation architecture based on sentence grouping, which adds an attention-based discriminator after the encoder. It also proposes a method for computing a structural information vector for each sentence; these vectors are used to obtain group labels in an unsupervised way. Before training, the sentences in the dataset are grouped by content category and sentence structure. The model is then trained jointly on the group labels and the parallel corpus, which helps it identify the group each sentence belongs to and so makes fuller use of the information in the dataset. Extensive comparative experiments support the grouping idea: a Transformer trained with the grouping architecture produces better translations, improving the BLEU score by up to 1.2 over the vanilla Transformer.
Authors: ZHAO Yuran (赵彧然); MENG Kui (孟魁) (School of Electronic Information and Electrical Engineering, Shanghai Jiao Tong University, Shanghai 200240, China)
Source: Netinfo Security (《信息网络安全》), indexed in CSCD and the Peking University Core Journals list, 2021, No. 7, pp. 63-71 (9 pages)
Funding: National Natural Science Foundation of China [61772337].
Keywords: machine translation; sentence grouping; structural information
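
The abstract states that group labels are obtained from sentence vectors by an unsupervised method and are then used together with the parallel corpus, but it does not name the clustering algorithm or define the vectors. The sketch below illustrates only the general idea, under stated assumptions: k-means stands in for the unspecified unsupervised method, the two-dimensional feature vectors (sentence length, nesting depth) are hypothetical, and the names `assign_groups`, `features`, and `pairs` are invented for illustration. It is not the authors' implementation.

```python
import random
from typing import List, Sequence


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_vector(vectors: List[Sequence[float]]) -> List[float]:
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def assign_groups(vectors: List[Sequence[float]], k: int,
                  iterations: int = 20, seed: int = 0) -> List[int]:
    """Assign each vector a group label in range(k) with plain k-means.

    Assumption: k-means is a stand-in for the paper's unnamed
    unsupervised grouping method.
    """
    rng = random.Random(seed)
    # Initialise centroids from k distinct input vectors.
    centroids = [list(v) for v in rng.sample(list(vectors), k)]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: nearest centroid for every vector.
        labels = [
            min(range(k), key=lambda c: squared_distance(v, centroids[c]))
            for v in vectors
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = mean_vector(members)
    return labels


# Hypothetical structural features: (sentence length, nesting depth).
features = [[5, 1], [6, 1], [5, 2], [30, 4], [28, 5], [31, 4]]
# Hypothetical parallel corpus, one (source, target) pair per feature row.
pairs = [(f"src sentence {i}", f"tgt sentence {i}") for i in range(len(features))]

labels = assign_groups(features, k=2)
# Each training example carries its group label alongside the sentence pair.
training_examples = [(label, src, tgt) for label, (src, tgt) in zip(labels, pairs)]
```

In this toy setup the three short sentences end up in one group and the three long ones in the other; which group receives label 0 depends on the random initialisation, so downstream code should not rely on specific label values.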