Handling Person Names and Place Names in Concept-Based Text Categorization

Application of Name of People and Institutions in Text Categorization
Abstract: Concept-based text categorization is a method proposed in recent years. It compensates for some shortcomings of earlier keyword-based text categorization and handles synonyms and polysemous words comparatively well. However, concept-based methods often fail to handle words that carry categorization value, such as person names and institution names. This paper proposes a new concept-based categorization method that combines a semantic dictionary with a proper-noun dictionary made up of person names and institution names. For the experiments, a new dictionary was built containing many person names that frequently appear in text. The experiments verify the effectiveness of the method.
Source: Microcomputer Development (《微机发展》), 2005, No. 3, pp. 11-13, 56 (4 pages)
Funding: Natural Science Foundation of Hebei Province (F2004000132)
Keywords: text categorization; concept categorization; k-nearest neighbor (KNN)
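
The abstract describes the approach only at a high level: words are mapped to concepts through a semantic dictionary, a proper-noun dictionary covers person and institution names, and the keywords indicate a KNN classifier. The following is a minimal sketch of that general idea, not the paper's implementation. The dictionaries, the training documents, the cosine similarity measure, and the names `SEMANTIC_DICT`, `PROPER_NOUN_DICT`, `TRAINING`, `to_concepts`, and `knn_classify` are all hypothetical and chosen only for illustration.

```python
import math
from collections import Counter

# Hypothetical toy dictionaries; the paper's actual resources are not reproduced here.
SEMANTIC_DICT = {
    "football": "sport", "soccer": "sport", "match": "sport",
    "stock": "finance", "share": "finance", "market": "finance",
}
# Assumed proper-noun dictionary mapping names to the same concept space.
PROPER_NOUN_DICT = {"ronaldo": "sport", "fifa": "sport", "nasdaq": "finance"}

# Hypothetical labeled training documents (already tokenized).
TRAINING = [
    (["football", "match"], "sports"),
    (["soccer", "fifa"], "sports"),
    (["stock", "market"], "economy"),
    (["share", "nasdaq"], "economy"),
]


def to_concepts(tokens):
    """Map tokens to concept counts; proper nouns are looked up first,
    and tokens found in neither dictionary are kept unchanged."""
    concepts = []
    for token in tokens:
        key = token.lower()
        if key in PROPER_NOUN_DICT:
            concepts.append(PROPER_NOUN_DICT[key])
        elif key in SEMANTIC_DICT:
            concepts.append(SEMANTIC_DICT[key])
        else:
            concepts.append(key)
    return Counter(concepts)


def cosine(a, b):
    """Cosine similarity between two sparse concept-count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def knn_classify(tokens, training=TRAINING, k=3):
    """Label a document by majority vote among its k most similar training documents."""
    query = to_concepts(tokens)
    scored = sorted(
        ((cosine(query, to_concepts(doc)), label) for doc, label in training),
        reverse=True,
    )
    votes = Counter(label for _, label in scored[:k])
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    print(knn_classify(["Ronaldo"]))
    print(knn_classify(["Nasdaq", "share"]))
```

In this sketch a document that contains only a person name still receives a usable concept vector, because the name is resolved through the proper-noun dictionary before the semantic dictionary is consulted. Without that lookup the name would stay an unmatched token and have zero similarity to every training document.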