Sample-attribute weighted improved naive Bayesian algorithm
Abstract: Naive Bayes is a simple, efficient, and widely used classification method, but in practice its conditional independence assumption degrades its classification performance. To overcome this problem, the paper presents an improved algorithm: sample-attribute weighted naive Bayes. First, correlation coefficients are computed for the attributes to obtain attribute weights. Second, the attribute weights are combined with information entropy to obtain sample entropy weights, and the samples are weighted accordingly to improve generalization ability. Then the sample-attribute weighted naive Bayes algorithm is given. Finally, experimental results on UCI data sets verify that the improved algorithm achieves better classification performance than the original algorithm.
Author: 曾文赋 (Zeng Wenfu)
Source: Microcomputer & Its Applications (《微型机与应用》), 2014, No. 6, pp. 62-63, 67 (3 pages)
Keywords: naive Bayesian; sample-attribute weighted; conditional independence assumption
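
The abstract names the ingredients of the method (attribute weights from correlation coefficients, per-sample weights, and a weighted naive Bayes classifier) but does not give the formulas. The following is a minimal Python sketch of how such a classifier could be put together, not the paper's actual algorithm. It makes several assumptions: attributes are categorical and integer-coded, the attribute weight is the absolute Pearson correlation between an attribute and the class label, attribute weights act as exponents on the conditional probabilities, and sample weights are simply passed in by the caller (the paper's entropy-based sample weight is not reproduced here). The names `pearson_abs`, `attribute_weights`, and `WeightedNaiveBayes` are hypothetical.

```python
import math
from collections import defaultdict


def pearson_abs(xs, ys):
    """Absolute Pearson correlation of two numeric sequences (0.0 if either is constant)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return abs(cov) / math.sqrt(vx * vy)


def attribute_weights(X, y):
    """Assumed weighting: |corr(attribute, class)| on integer-coded data."""
    return [pearson_abs([row[j] for row in X], y) for j in range(len(X[0]))]


class WeightedNaiveBayes:
    """Categorical naive Bayes with attribute weights (exponents) and sample weights (counts)."""

    def fit(self, X, y, attr_w=None, sample_w=None):
        n, d = len(X), len(X[0])
        self.attr_w = attr_w or [1.0] * d       # default: plain naive Bayes
        sample_w = sample_w or [1.0] * n        # default: every sample counts once
        self.class_mass = defaultdict(float)    # weighted count per class
        self.value_mass = defaultdict(float)    # weighted count per (class, attribute, value)
        self.values = [set() for _ in range(d)] # distinct values seen per attribute
        for row, c, w in zip(X, y, sample_w):
            self.class_mass[c] += w
            for j, v in enumerate(row):
                self.value_mass[(c, j, v)] += w
                self.values[j].add(v)
        self.total = sum(self.class_mass.values())
        return self

    def log_posterior(self, row):
        """Unnormalized log P(c) + sum_j w_j * log P(x_j | c), with Laplace smoothing."""
        scores = {}
        k = len(self.class_mass)
        for c, mass in self.class_mass.items():
            s = math.log((mass + 1.0) / (self.total + k))
            for j, v in enumerate(row):
                p = (self.value_mass.get((c, j, v), 0.0) + 1.0) / (mass + len(self.values[j]))
                s += self.attr_w[j] * math.log(p)
            scores[c] = s
        return scores

    def predict(self, row):
        scores = self.log_posterior(row)
        return max(scores, key=scores.get)


# Toy data: attribute 0 determines the class, attribute 1 is unrelated to it.
X = [[0, 0], [0, 1], [1, 0], [1, 1]]
y = [0, 0, 1, 1]
model = WeightedNaiveBayes().fit(X, y, attr_w=attribute_weights(X, y))
print(model.predict([1, 0]))
```

With an attribute weight of zero, an attribute drops out of the score entirely, and with all weights equal to one the model reduces to ordinary Laplace-smoothed naive Bayes, which makes the effect of each weighting step easy to check in isolation.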
