
Study on an improved naive Bayes algorithm in spam filtering
Cited by: 20
Abstract: An improved naive Bayes algorithm that incorporates a support vector machine, the TSVM-NB algorithm, is proposed. The NB algorithm first performs an initial training pass over the sample set. An SVM then constructs an optimal separating hyperplane, and each sample is kept or discarded according to whether its nearest neighbouring sample belongs to the same class; this both reduces the size of the sample space and improves the class independence of each sample. Finally, the naive Bayes algorithm is trained again on the pruned sample set to produce the classification model. Simulation results show that the pruning step eliminates redundant attributes from the sample space, so a feature subset for classification can be obtained quickly, improving the classification speed, recall, and precision of spam filtering.
Authors: YANG Lei, CAO Cui-ling, SUN Jian-guo, ZHANG Li-guo (College of Computer Science and Technology, Harbin Engineering University, Harbin 150001, China; Institute of Information Engineering, Chinese Academy of Sciences, Beijing 100093, China)
Source: Journal on Communications (《通信学报》; EI, CSCD, Peking University core journal), 2017, No. 4, pp. 140-148 (9 pages)
Funding: National Natural Science Foundation of China (No. 61202455, No. 61472096)
Keywords: spam filtering; naive Bayes; SVM; trim strategy
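The three-stage pipeline the abstract describes (initial NB pass, sample-space pruning, NB retraining) can be sketched in plain numpy. This is a minimal illustration, not the paper's method: the abstract only loosely specifies how the SVM hyperplane interacts with the nearest-neighbour rule, so the sketch reduces the trim strategy to a nearest-neighbour label check, and the multinomial NB implementation and all function names are illustrative assumptions.

```python
import numpy as np

def nb_train(X, y):
    """Multinomial naive Bayes with Laplace smoothing (count features)."""
    classes = np.unique(y)
    priors = np.array([np.mean(y == c) for c in classes])
    counts = np.array([X[y == c].sum(axis=0) + 1.0 for c in classes])
    cond = counts / counts.sum(axis=1, keepdims=True)
    return classes, np.log(priors), np.log(cond)

def nb_predict(model, X):
    classes, log_prior, log_cond = model
    scores = X @ log_cond.T + log_prior      # log-likelihood per class
    return classes[np.argmax(scores, axis=1)]

def trim(X, y):
    """Trim strategy sketch: drop every sample whose nearest neighbour
    carries a different label (a loose reading of the abstract's
    SVM-hyperplane pruning step, which is not fully specified there)."""
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)              # ignore self-distance
    nearest = d.argmin(axis=1)
    return y == y[nearest]                   # boolean keep-mask

def tsvm_nb_train(X, y):
    _ = nb_train(X, y)                       # step 1: initial NB pass
                                             # (its output is not reused
                                             # in this simplified sketch)
    keep = trim(X, y)                        # step 2: prune sample space
    return nb_train(X[keep], y[keep])        # step 3: retrain NB
```

On cleanly separable data the trim keeps nearly everything; its effect, per the abstract, is to shrink the training set and remove samples that sit among the opposite class before the final NB pass.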
