Abstract
Building on a definition and classification of abbreviation chunks in Chinese-English business correspondence, this paper designs an extraction method with four modules: text pre-processing, abbreviation chunk identification, identification of the full names of abbreviation chunks, and post-processing. The method extracts abbreviation chunks and their full-name pairs from raw corpus text that has not been part-of-speech tagged, and achieves good extraction results.
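The four-module flow described in the abstract can be pictured with a minimal Python sketch. The abstract does not give the paper's actual rules, so everything here is an assumption made for illustration: the function names are hypothetical, the sketch handles only English parenthesized abbreviations, and the full-name step uses a simple initial-letter match rather than the authors' method.

```python
import re


def preprocess(text):
    # Module 1 (assumed): map full-width parentheses to ASCII and collapse whitespace.
    text = text.replace("(", "(").replace(")", ")")
    return re.sub(r"\s+", " ", text).strip()


def find_abbreviations(text):
    # Module 2 (assumed heuristic): an all-caps token of 2-6 letters inside parentheses.
    return [(m.group(1), m.start()) for m in re.finditer(r"\(([A-Z]{2,6})\)", text)]


def find_full_name(text, abbr, pos):
    # Module 3 (assumed heuristic): take as many words before the parenthesis
    # as the abbreviation has letters, and accept them if their initials match.
    words = re.findall(r"[A-Za-z]+", text[:pos])
    candidate = words[-len(abbr):]
    if len(candidate) != len(abbr):
        return None
    initials = "".join(w[0].upper() for w in candidate)
    return " ".join(candidate) if initials == abbr else None


def postprocess(pairs):
    # Module 4 (assumed): drop unresolved abbreviations and duplicate pairs, keeping order.
    seen, result = set(), []
    for abbr, full in pairs:
        if full is None:
            continue
        key = (abbr, full.lower())
        if key not in seen:
            seen.add(key)
            result.append((abbr, full))
    return result


def extract_pairs(raw):
    # Run the four modules in sequence on raw, untagged text.
    text = preprocess(raw)
    pairs = [(a, find_full_name(text, a, p)) for a, p in find_abbreviations(text)]
    return postprocess(pairs)


if __name__ == "__main__":
    sample = "We quote cost insurance freight (CIF) Shanghai. Goods ship free on board(FOB)."
    print(extract_pairs(sample))
```

The initial-letter match is deliberately naive: it misses full names containing function words that the abbreviation skips (for example "letter of credit" for "LC") and does not address Chinese abbreviation chunks at all, which a real system for this task would need to handle.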
Source
Journal of Hunan University of Science and Technology (Social Science Edition)
CSSCI
Peking University Core Journal (北大核心)
2011, No. 6, pp. 125-128 (4 pages)
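Funding
National Social Science Fund of China project (10BYY009)
Henan Province Soft Science Research Program project (112400450206)
Luoyang Institute of Science and Technology institute-level project (2009QR04)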
Keywords
business correspondence
abbreviation chunks
extraction methods
corpus
machine translation