Abstract
Bilingual sentence alignment based on a dictionary together with sentence length and position information is fairly general when applied to real bilingual texts. Building on an analysis of this method, an improvement to length-and-position-based sentence alignment is proposed for Uyghur-Chinese translated texts within a specific domain. The improvement introduces the expected value of the Uyghur-to-Chinese sentence length ratio, which smooths the data and aligns sentences in Uyghur-Chinese translated texts more effectively.
Source
《现代电子技术》
2011, No. 14, pp. 25-27 (3 pages)
Modern Electronics Technique
Keywords
sentence alignment
expectation
bilingual corpus
anchors
length and location
dictionary