Abstract
The naive Bayesian algorithm is a simple, efficient, and widely used classification method, but in practice its conditional independence assumption degrades classification performance. To overcome this problem, the paper presents an improved algorithm, the sample-attribute weighted naive Bayesian algorithm. First, correlation coefficients are computed for the attributes to obtain attribute weights. Second, the attribute weights are combined with information entropy to obtain sample entropy weights, and the samples are weighted accordingly to improve generalization ability. Then, the sample-attribute weighted naive Bayesian algorithm is given. Finally, experimental results on UCI data sets show that the improved algorithm achieves better classification performance than the original algorithm.
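The weighting idea can be illustrated with a minimal sketch. It is not the paper's implementation, since the abstract does not give the formulas. The sketch assumes categorical attributes encoded as non-negative integers, takes the attribute weight to be the absolute Pearson correlation between an attribute and the class label, raises each conditional probability to the power of its attribute weight, and accepts the sample weights as an input rather than deriving them from entropy. The function names attribute_weights, fit, and predict are hypothetical.

```python
import numpy as np

def attribute_weights(X, y):
    """Assumed weighting: |Pearson correlation| between each attribute and the label."""
    w = np.array([abs(np.corrcoef(X[:, j], y)[0, 1]) for j in range(X.shape[1])])
    return np.nan_to_num(w)  # a constant attribute yields NaN; treat it as weight 0

def fit(X, y, sample_w, alpha=1.0):
    """Estimate log-probabilities from sample-weighted counts with Laplace smoothing."""
    classes = np.unique(y)
    n_values = X.max(axis=0) + 1  # number of distinct codes per attribute
    class_mass = np.array([sample_w[y == c].sum() for c in classes])
    log_prior = np.log(class_mass / sample_w.sum())
    log_cond = []  # log_cond[j][k, v] = log P(x_j = v | class k)
    for j in range(X.shape[1]):
        table = np.full((len(classes), n_values[j]), alpha)
        for k, c in enumerate(classes):
            mask = y == c
            table[k] += np.bincount(X[mask, j], weights=sample_w[mask],
                                    minlength=n_values[j])
        log_cond.append(np.log(table / table.sum(axis=1, keepdims=True)))
    return classes, log_prior, log_cond

def predict(model, attr_w, X):
    """Score each class as log P(c) + sum_j w_j * log P(x_j | c)."""
    classes, log_prior, log_cond = model
    scores = np.tile(log_prior, (X.shape[0], 1))
    for j, table in enumerate(log_cond):
        scores += attr_w[j] * table[:, X[:, j]].T
    return classes[np.argmax(scores, axis=1)]

# Toy data (made up for illustration): attribute 0 determines the class.
X_demo = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y_demo = np.array([0, 0, 1, 1])
w_demo = attribute_weights(X_demo, y_demo)
model_demo = fit(X_demo, y_demo, sample_w=np.ones(len(y_demo)))
pred_demo = predict(model_demo, w_demo, X_demo)
```

Setting every attribute weight and every sample weight to 1 reduces this sketch to ordinary naive Bayes with Laplace smoothing, which makes the effect of each weighting easy to compare in isolation.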
Source
Microcomputer & Its Applications (《微型机与应用》)
2014, No. 6, pp. 62-63, 67 (3 pages in total)
Keywords
naive Bayesian
sample-attribute weighted
conditional independence assumption