Abstract
Based on the linguistic characteristics of addresses, wording rules distinct from those of general natural language are defined and summarized for the automatic recognition and translation of English addresses on postal mail. For mail addresses obtained by optical character recognition (OCR), an English-to-Chinese address translation method based on inexact string matching is proposed, which automatically recognizes English addresses and translates them into Chinese. Experimental results verify the effectiveness of the method: it reduces the impact of recognition errors and improves the translation performance of the system.
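The core idea of matching OCR output against known address terms despite character errors can be illustrated with a minimal sketch. It assumes Levenshtein edit distance as the inexact-matching measure and uses a hypothetical toy lexicon and threshold (`LEXICON`, `max_ratio`); the rules, dictionary, and matching algorithm actually used in the paper are not given in this record and may differ.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed with a rolling dynamic-programming row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


# Hypothetical toy lexicon (assumption): English address term -> Chinese.
LEXICON = {
    "shanghai": "上海",
    "beijing": "北京",
    "road": "路",
    "street": "街",
    "district": "区",
}


def translate_token(token: str, max_ratio: float = 0.34):
    """Return the Chinese term for the closest lexicon entry, or None.

    A match is accepted only if the edit distance is at most
    max_ratio * len(entry); this threshold is an illustrative assumption.
    """
    word = token.lower()
    best, best_d = None, None
    for entry in LEXICON:
        d = edit_distance(word, entry)
        if best_d is None or d < best_d:
            best, best_d = entry, d
    if best is not None and best_d <= max_ratio * len(best):
        return LEXICON[best]
    return None
```

For example, `translate_token("Shangha1")` tolerates the OCR confusion of `i` with `1` and still resolves to the entry for `shanghai`, while a token far from every entry is rejected rather than forced onto the nearest term.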
Source
Journal of East China Normal University (Natural Science)
CAS
CSCD
PKU Core (北大核心)
2008, No. 3, pp. 83-91 (9 pages)
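Funding
National Natural Science Foundation of China (60475006)
Program for New Century Excellent Talents in University, Ministry of Education (NCET-05-0430)
Shanghai Shuguang Program (05SG29)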
Keywords
inexact matching
rule-based
address recognition
address translation
OCR