Research on ACO-WNB Classification Algorithm Based on Improved Information Gain

Cited by: 6
Abstract: To address the limited text classification performance of the naive Bayes classification algorithm, this paper proposes an ACO-WNB classification algorithm based on improved information gain (IG). First, an adjustment factor is introduced according to the word frequency distribution of each feature word in the data set. It strengthens the contribution of helpful feature words and suppresses the interference of unhelpful ones, so that strongly discriminative features are selected to form the feature subset and IG becomes more accurate on unbalanced data sets. Then, the ant colony optimization (ACO) algorithm is combined with the weighted naive Bayes (WNB) model: ACO iteratively searches for globally optimal weights, producing the ACO-WNB classifier and improving classification efficiency on text data. The algorithms before and after improvement are compared on a typical news data set. The experiments show that the improved IG effectively removes redundant high-frequency features and selects features better on unbalanced data sets, while the ACO-WNB classifier achieves higher accuracy, giving better classification efficiency on real text data.
Authors: QIU Ning-jia (邱宁佳); GAO Peng (高鹏); WANG Peng (王鹏); TAO Yue (陶跃) (College of Computer Science and Technology, Changchun University of Science and Technology, Changchun, Jilin 130022, China)
Source: Computer Simulation (《计算机仿真》), Peking University Core Journal, 2019, No. 1, pp. 295-299 (5 pages)
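Funding: Key Science and Technology Research Project of the Jilin Province Science and Technology Development Plan (20150204036GX); Jilin Provincial Industrial Innovation Special Fund Project (2017C051)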
Keywords: Naive Bayesian (NB); Information gain (IG); Feature subset; Ant colony optimization (ACO)
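
The two building blocks named in the abstract, information gain for feature selection and a weighted naive Bayes score, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' method: it uses the standard (unimproved) IG formula because the abstract does not give the adjustment factor, and it reuses the IG scores as stand-in feature weights instead of running an ACO search. The function names, the toy corpus, and the Laplace smoothing are all hypothetical choices made for the example.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(docs, labels, term):
    """Standard IG of a term's presence/absence: H(C) - H(C | term)."""
    present = [c for d, c in zip(docs, labels) if term in d]
    absent = [c for d, c in zip(docs, labels) if term not in d]
    conditional = sum(
        (len(part) / len(labels)) * entropy(part)
        for part in (present, absent)
        if part  # skip an empty split to avoid dividing by zero
    )
    return entropy(labels) - conditional


def weighted_nb_predict(doc, docs, labels, weights):
    """Weighted naive Bayes: argmax_c of log P(c) + sum_t w_t * log P(t | c)."""
    vocab = {t for d in docs for t in d}
    best_class, best_score = None, -math.inf
    for c in sorted(set(labels)):
        class_docs = [d for d, y in zip(docs, labels) if y == c]
        counts = Counter(t for d in class_docs for t in d)
        total = sum(counts.values())
        score = math.log(len(class_docs) / len(labels))
        for t in doc:
            if t in vocab:
                # Laplace-smoothed term likelihood (an assumption of this sketch).
                p = (counts[t] + 1) / (total + len(vocab))
                score += weights.get(t, 1.0) * math.log(p)
        if score > best_score:
            best_class, best_score = c, score
    return best_class


# Hypothetical toy corpus: each document is a list of tokens.
docs = [["goal", "match"], ["match", "team"], ["stock", "market"], ["market", "bank"]]
labels = ["sport", "sport", "finance", "finance"]

# Stand-in weights: plain IG scores. The paper instead searches weights with ACO.
vocab = sorted({t for d in docs for t in d})
weights = {t: information_gain(docs, labels, t) for t in vocab}

print(weights)
print(weighted_nb_predict(["match", "goal"], docs, labels, weights))
```

In this sketch a term that perfectly separates the classes receives the largest weight, so its likelihood dominates the score. An ACO-based search, as described in the abstract, would instead tune the weights iteratively against classification accuracy.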