Abstract
Existing named entity recognition methods handle the extraction of domain-specific names poorly. To address this, the paper proposes a heuristic rule-based method for recognizing domain-specific names. Taking chemical substance names in Chinese text as the object of study, it analyzes their domain features and statistical language features and builds heuristic rules suited to recognizing names in chemical literature, offering a new approach to named entity recognition in specialized domains. Comparative experiments show that the method effectively improves the efficiency of domain-specific name recognition.
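The abstract does not list the heuristic rules themselves. As a rough illustration only of what a rule-based recognizer for Chinese chemical names might look like, the sketch below assumes two hypothetical features: a set of characters that commonly occur in chemical names (`CHEM_CHARS`) and a set of characters that commonly end them (`SUFFIX_CHARS`). Both sets, and the function name `find_chemical_names`, are assumptions made for this example and are not taken from the paper.

```python
import re

# Hypothetical feature sets chosen for illustration; the paper's actual
# domain and statistical features are not given in the abstract.
CHEM_CHARS = "甲乙丙丁戊己二三四氢氧氯氮硫碳磷钠钾钙化酸基烷烯炔醇醛酮酯苯胺"
SUFFIX_CHARS = "酸烷烯炔醇醛酮酯苯胺钠钾钙"

# Rule 1 (assumed): a candidate is a maximal run of at least two
# characters drawn from CHEM_CHARS.
RUN = re.compile("[" + CHEM_CHARS + "]{2,}")


def find_chemical_names(text):
    """Return candidate chemical names found in text, in order of appearance."""
    names = []
    for match in RUN.finditer(text):
        candidate = match.group()
        # Rule 2 (assumed): keep the candidate only if its last character
        # is a typical chemical-name suffix.
        if candidate[-1] in SUFFIX_CHARS:
            names.append(candidate)
    return names


print(find_chemical_names("实验中加入氯化钠和乙醇,然后搅拌。"))
```

With these assumed sets, the call returns 氯化钠 and 乙醇, while a name such as 二氧化碳 is rejected because 碳 is not in the assumed suffix set. This shows why rules of this kind would need to be derived from real domain and statistical features rather than fixed by hand.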
Source
《现代图书情报技术》
CSSCI
PKU Core Journal (北大核心)
2010, No. 5, pp. 13-17 (5 pages)
New Technology of Library and Information Service