Abstract
This paper presents a novel approach that combines transliteration with web mining to improve named entity translation. First, a transliteration model generates a translation candidate; the transliteration information is then used together with web mining to obtain additional candidates. Finally, a Maximum Entropy (ME) model ranks the candidates using features computed between the source term and each candidate, such as pronunciation similarity, contextual features, and web-page co-occurrence, and the top-ranked candidate is taken as the final translation. Experimental results show that the approach significantly improves the precision of named entity translation.
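The ranking step can be illustrated with a minimal sketch of a maximum entropy (log-linear) scorer, in which the probability of a candidate is proportional to the exponential of a weighted sum of its features. This is not the authors' implementation: the feature names, the weights, the feature values, and the example candidates below are all hypothetical, chosen only to show the shape of the computation. In a real system the weights would be learned from training data and the feature values would come from the transliteration model and web search results.

```python
import math

# Hypothetical feature weights (assumption); a trained ME model would learn these.
WEIGHTS = {
    "pronunciation_similarity": 2.0,
    "context_match": 1.0,
    "cooccurrence": 1.5,
}


def rank_candidates(candidates, weights=WEIGHTS):
    """Rank translation candidates with a log-linear (maximum entropy) score.

    `candidates` maps each candidate string to a dict of feature values.
    Returns (candidate, probability) pairs sorted by descending probability.
    """
    if not candidates:
        return []
    # Weighted feature sum for each candidate; unknown features get weight 0.
    scores = {
        cand: sum(weights.get(name, 0.0) * value for name, value in feats.items())
        for cand, feats in candidates.items()
    }
    # Normalise over the candidate set (subtract the max for numerical stability).
    top = max(scores.values())
    exp_scores = {cand: math.exp(score - top) for cand, score in scores.items()}
    total = sum(exp_scores.values())
    ranked = [(cand, exp_scores[cand] / total) for cand in scores]
    ranked.sort(key=lambda pair: pair[1], reverse=True)
    return ranked


# Made-up feature values for three candidate translations of "Clinton".
example = {
    "克林顿": {"pronunciation_similarity": 0.9, "context_match": 0.8, "cooccurrence": 0.9},
    "柯林顿": {"pronunciation_similarity": 0.9, "context_match": 0.3, "cooccurrence": 0.2},
    "克林登": {"pronunciation_similarity": 0.6, "context_match": 0.1, "cooccurrence": 0.0},
}

ranked = rank_candidates(example)
for candidate, probability in ranked:
    print(f"{candidate}\t{probability:.3f}")
```

With these illustrative numbers, the candidate that scores well on all three features is ranked first, which reflects the point of combining pronunciation evidence with web evidence: two candidates with the same pronunciation similarity can still be separated by context and co-occurrence.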
Source
Journal of Chinese Information Processing (《中文信息学报》)
CSCD
Peking University Core Journal (北大核心)
2007, No. 1, pp. 23-29 (7 pages)
Keywords
artificial intelligence
machine translation
transliteration
named entity translation
web mining