
Web Document Classification Based on SVM (基于支持向量机的Web文本分类方法)

Cited by: 19
Abstract: Web document classification is an active research area in data mining. The support vector machine (SVM) is an efficient classification method with distinctive advantages in high-dimensional pattern recognition, and it is effective for learning classification knowledge from massive data, especially when labeled examples are costly to obtain. Based on an analysis of the features of Web documents, this paper studies classification in the vector space model (VSM) and the selection of the kernel function. On this basis it combines SVM with the decision tree method and presents a Web document classification model based on decision tree SVM, together with a concrete algorithm. Experiments show that the method greatly reduces the size of the training set and trains efficiently, while achieving good precision (90.11%) and recall (89.38%).
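This record does not reproduce the paper's formulas, so the following is only a minimal sketch of the two ingredients the abstract names: a vector space model and a kernel. It assumes TF-IDF weighting (term frequency divided by document length, times ln(N/df)) and a plain linear kernel. These choices and the function names `tfidf_vectors` and `linear_kernel` are illustrative assumptions, not the authors' method.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Map each tokenized document to a sparse {term: weight} vector.

    Assumed weighting (not taken from the paper):
    weight = (count / document length) * ln(N / document frequency).
    """
    n = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append(
            {t: (c / len(doc)) * math.log(n / df[t]) for t, c in tf.items()}
        )
    return vectors


def linear_kernel(u, v):
    """Dot product of two sparse vectors, the simplest SVM kernel."""
    return sum(w * v.get(t, 0.0) for t, w in u.items())


if __name__ == "__main__":
    # Toy tokenized documents, invented for illustration only.
    docs = [
        ["svm", "web", "text"],
        ["web", "page", "sports"],
        ["svm", "kernel"],
    ]
    vecs = tfidf_vectors(docs)
    print(linear_kernel(vecs[0], vecs[1]))
```

An SVM trained on such vectors only needs kernel values between document pairs, which is why the choice of kernel can be studied separately from the document representation.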
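Source: Microelectronics & Computer (《微电子学与计算机》), CSCD, Peking University Core Journal list, 2006, No. 9, pp. 102-104 (3 pages)
Funding: Youth Research Fund of China University of Mining and Technology (OD4490)
Keywords: support vector machine; feature selection (特征提取); Web documents; text classification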