Research on Application of E-mail Classification Based on Support Vector Machine (cited by: 6)
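Abstract Spam is an increasingly serious problem that disrupts normal work, and its identification has attracted wide attention in e-mail classification research. Because the feature dimensionality of e-mail is very high, traditional classification methods are slow and inaccurate. To speed up e-mail classification, improve its accuracy, and filter out spam more effectively, an automatic e-mail classification method based on the support vector machine is proposed. Mutual information is used to extract e-mail keywords as classification features and to select the best of them, which speeds up classification. A support vector machine model is then trained on these features to build the e-mail classifier, which is finally applied to the e-mail test set. In simulations on the UCI spam database, the support vector machine achieves a much higher recognition accuracy than a neural network, classifies noticeably faster, and separates spam well. Support vector machine classification is an effective e-mail classification method and helps eliminate spam.
The volume of junk e-mail on the Internet has grown tremendously in the past few years, and the problem has attracted many researchers' attention. Because e-mail is diverse and high-dimensional, traditional classification methods are slow and less accurate in practical large-scale e-mail classification. To improve classification accuracy, an e-mail classification method based on the support vector machine is proposed. The task consists of feature extraction and classification: the mutual information method extracts the key features of each e-mail, and a support vector machine performs the classification. Simulation experiments on nine classes of e-mail show that the support vector machine's average classification accuracy is 89.9%, an improvement of 4% over the BPNN method. The experimental results indicate that the support vector machine is a useful method for e-mail classification.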
Author Shi Tiefeng (石铁峰)
Source Computer Simulation (《计算机仿真》), CSCD, PKU Core (北大核心), 2011, No. 8, pp. 156-158, 195 (4 pages in total)
Keywords Email; Support vector machine (SVM); Classification; Feature extraction
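
The pipeline outlined in the abstract (mutual-information keyword selection followed by support vector machine training) can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the author's implementation: the abstract does not give the exact mutual information formula, the kernel, or the solver, so the sketch assumes binary keyword-presence features and a linear SVM trained by plain subgradient descent on the hinge loss. The toy messages, the helper names (`mutual_information`, `select_keywords`, `featurize`, `train_linear_svm`, `predict`), and the hyperparameters are all hypothetical and unrelated to the UCI data used in the paper.

```python
import math
from collections import Counter

# Toy corpus (invented for illustration): each message is a set of tokens.
DOCS = [
    {"free", "money", "now"}, {"free", "offer", "click"}, {"money", "offer", "free"},
    {"meeting", "report", "monday"}, {"project", "report", "team"}, {"meeting", "team", "lunch"},
]
LABELS = [1, 1, 1, -1, -1, -1]  # +1 = spam, -1 = legitimate mail


def mutual_information(docs, labels, term):
    """Mutual information (bits) between presence of `term` and the class label."""
    n = len(docs)
    joint = Counter((term in doc, label) for doc, label in zip(docs, labels))
    mi = 0.0
    for (present, label), count in joint.items():
        p_xy = count / n
        p_x = sum(c for (p, _), c in joint.items() if p == present) / n
        p_y = sum(c for (_, l), c in joint.items() if l == label) / n
        mi += p_xy * math.log2(p_xy / (p_x * p_y))
    return mi


def select_keywords(docs, labels, k):
    """Return the k terms with the highest mutual information (ties broken alphabetically)."""
    vocab = set().union(*docs)
    key = lambda t: (-round(mutual_information(docs, labels, t), 12), t)
    return sorted(vocab, key=key)[:k]


def featurize(doc, keywords):
    """Binary presence vector over the selected keywords."""
    return [1.0 if kw in doc else 0.0 for kw in keywords]


def train_linear_svm(X, y, lam=0.01, eta=0.1, epochs=200):
    """Subgradient descent on the L2-regularised hinge loss (assumed linear kernel)."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            w = [(1.0 - eta * lam) * wj for wj in w]  # shrink step from the L2 penalty
            if margin < 1.0:  # example violates the margin: move towards it
                w = [wj + eta * yi * xj for wj, xj in zip(w, xi)]
                b += eta * yi
    return w, b


def predict(w, b, x):
    """Classify a feature vector: +1 (spam) or -1 (legitimate)."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1


keywords = select_keywords(DOCS, LABELS, 3)
weights, bias = train_linear_svm([featurize(d, keywords) for d in DOCS], LABELS)
print(keywords, predict(weights, bias, featurize({"free", "money"}, keywords)))
```

Selecting only the highest-scoring keywords keeps the feature vector short, which is the mechanism the abstract credits for faster classification. A real experiment would replace the toy corpus with a labelled data set and would likely use an established SVM library rather than this hand-written trainer.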