Abstract
To address the low translation accuracy and semantic ambiguity of conventional English translation robots in multimodal translation, which degrade human-computer interaction, this paper designs a vision-guided human-computer interaction system for an intelligent English translation robot. Building on the conventional Transformer machine translation model and a convolutional neural network, a multimodal machine translation model based on visual information, Universal MMT, is constructed. Selective attention is then added to the model to obtain a text-aware visual representation, and the encoder performs gated multimodal fusion before the translation result is output. Experiments show that on the Multi30K test set the model reaches a BLEU score of 44.9 and a METEOR score of 62.8, both higher than those of the other machine translation models compared. On the VATEX dataset, its BLEU score is 35.66. These results indicate that, with selective attention added, the model interprets contextual semantic information accurately and that translation accuracy improves significantly.
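The abstract names two mechanisms, selective attention and gated multimodal fusion, without giving their formulas. The following is a minimal sketch of how the two steps are commonly combined, not the paper's implementation. It assumes that text token states act as queries over image patch features, and that a sigmoid gate mixes each text state with its attended visual state. The function names `selective_attention` and `gated_fusion`, the gate weights `u` and `v`, the dimensions, and the random features are all hypothetical stand-ins chosen for illustration.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def selective_attention(h_text, h_img):
    """Assumed form: each text token (query) attends over image patches (keys/values)."""
    d = len(h_text[0])
    scores = matmul(h_text, transpose(h_img))
    weights = [softmax([s / math.sqrt(d) for s in row]) for row in scores]
    return matmul(weights, h_img), weights


def gated_fusion(h_text, h_attn, u, v):
    """Assumed form: gate = sigmoid(h_text @ u + h_attn @ v), applied element-wise."""
    text_part = matmul(h_text, u)
    attn_part = matmul(h_attn, v)
    gate = [[sigmoid(a + b) for a, b in zip(r1, r2)]
            for r1, r2 in zip(text_part, attn_part)]
    fused = [[(1.0 - g) * t + g * a for g, t, a in zip(g_row, t_row, a_row)]
             for g_row, t_row, a_row in zip(gate, h_text, h_attn)]
    return fused, gate


def random_matrix(rows, cols, rng):
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


# Hypothetical sizes: 4 text tokens, 6 image patches, 8-dimensional features.
rng = random.Random(0)
n_tokens, n_patches, d_model = 4, 6, 8
h_text = random_matrix(n_tokens, d_model, rng)   # stand-in for Transformer encoder states
h_img = random_matrix(n_patches, d_model, rng)   # stand-in for projected CNN patch features
u = random_matrix(d_model, d_model, rng)         # hypothetical gate weights (text side)
v = random_matrix(d_model, d_model, rng)         # hypothetical gate weights (visual side)

h_attn, weights = selective_attention(h_text, h_img)
fused, gate = gated_fusion(h_text, h_attn, u, v)
```

In this sketch `fused` has one row per text token, so it can replace the text-only encoder output without changing the decoder's input shape. A gate value near 0 keeps the text state almost unchanged, while a value near 1 favors the attended visual state.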
Author
ZHAO Lirong (赵丽容)
University for Science & Technology Sichuan, Meishan, Sichuan 620564, China
Source
Automation & Instrumentation (《自动化与仪器仪表》)
2022, No. 11, pp. 220-225 (6 pages)
Funding
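Humanities and Social Sciences Youth Fund of the Sichuan Provincial Department of Education, project "Exploration of the Linked Development of Outward Bound Training and Sports Tourism" (08SB017).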