Abstract
To improve the quality of Chinese-to-English translation for cross-border tourism, rule templates are developed to identify non-informative sentences, and a classifier-based non-informative sentence recognition model is built. On this basis, a semi-supervised non-informative sentence recognition model incorporating ensemble learning is constructed. Experimental results show that all three classifiers perform best when the training-set ratio is 1:1 and the feature dimension is 50. Among these, the semi-supervised method with ensemble learning achieves the best classification performance; with the Maximum Entropy (ME) classifier it reaches the highest G-mean, 0.866. The translation system based on Statistical Machine Translation (SMT) outperforms Google's online system in well-formed English output rate, translation accuracy, and performance, indicating that the system can help improve the quality of cross-border tourism translation.
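The abstract reports a G-mean of 0.866 without defining the metric. A minimal sketch follows, assuming the common definition for binary, class-imbalanced classification: the geometric mean of recall on the positive class (sensitivity) and recall on the negative class (specificity). The function name `g_mean` and the label encoding (1 for a non-informative sentence, 0 otherwise) are hypothetical and are not taken from the paper.

```python
import math


def g_mean(y_true, y_pred, positive=1):
    """Geometric mean of sensitivity and specificity (assumed definition)."""
    tp = fn = tn = fp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1  # positive sentence correctly identified
            else:
                fn += 1  # positive sentence missed
        else:
            if pred == positive:
                fp += 1  # negative sentence wrongly flagged
            else:
                tn += 1  # negative sentence correctly passed over
    # Guard against a class with no samples to avoid division by zero.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return math.sqrt(sensitivity * specificity)


# Hypothetical labels: 1 = non-informative sentence, 0 = informative sentence.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1]
score = g_mean(y_true, y_pred)
```

Unlike plain accuracy, this score drops to zero when a classifier ignores one class entirely, which is why it is often preferred when the two classes are unevenly represented.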
Author
何媛媛
HE Yuanyuan (Yulin University, Yulin, Shaanxi 719000, China)
Source
Automation & Instrumentation (《自动化与仪器仪表》)
2023, No. 9, pp. 201-204 (4 pages)
Keywords
cross-border tourism
translation quality
statistical machine translation
Chinese-to-English translation
non-informative sentence