Research on Classification Effect of Imbalanced Data Set Based on Adaboost Algorithm
Abstract: In imbalanced data sets, the disparity between the number of minority-class and majority-class samples makes some samples hard to classify and prone to misclassification. To address the classification characteristics of imbalanced data sets, a combined classifier suited to such data was designed. The imbalanced data set was first preprocessed by SMOTE sampling, decision stumps (single-level decision trees) were used as the base classifiers, and an AdaBoost classifier was built in Matlab and evaluated on training and test sets from the demo, heart and usps data sets. The results show that the AdaBoost algorithm effectively improves classification performance. By changing the weights of the positive-class samples, the algorithm places more emphasis on classifying minority-class samples, which improves overall classification performance to a certain extent and yields a classification design for imbalanced data sets.
Author: DONG Qing-wei (董庆伟), School of Information Management, Minnan University of Science and Technology, Shishi 362700, China
Source: Journal of Changchun Normal University (《长春师范大学学报》), 2022, No. 6, pp. 49-52 (4 pages)
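Funding: Fujian Province Major Education and Teaching Reform Research Project for Undergraduate Universities, "Construction and Research of the Curriculum System for the Information and Computing Science Major Based on University-Enterprise Cooperation" (FBJG20170333); Fujian Province Education Research Project for Young and Middle-aged Teachers, "Research on the Impact of E-commerce on Enterprise Human Resources" (JB12372S).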
Keywords: unbalanced data set; Adaboost algorithm; decision stump; basic classifier
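
The pipeline summarized in the abstract (SMOTE oversampling, then AdaBoost over decision stumps) can be illustrated with a minimal sketch. This is not the author's Matlab code: it is written in plain Python, the function names (`smote`, `fit_stump`, `adaboost`, `predict`) and the toy data are hypothetical, and it uses the standard AdaBoost reweighting of misclassified samples rather than whatever class-specific weighting the paper applies.

```python
import math
import random


def smote(minority, n_new, k=2, seed=0):
    """Return n_new synthetic points, each interpolated between a random
    minority sample and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        # Nearest minority neighbours of x (x itself excluded by identity).
        neighbours = sorted((p for p in minority if p is not x),
                            key=lambda p: math.dist(p, x))[:k]
        nb = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from x to nb
        synthetic.append([a + gap * (b - a) for a, b in zip(x, nb)])
    return synthetic


def stump_predict(x, j, thr, pol):
    """Decision stump: predict pol if feature j exceeds thr, else -pol."""
    return pol if x[j] > thr else -pol


def fit_stump(X, y, w):
    """Exhaustively search for the stump with the lowest weighted error."""
    best = None
    for j in range(len(X[0])):
        for thr in sorted({x[j] for x in X}):
            for pol in (1, -1):
                err = sum(wi for xi, yi, wi in zip(X, y, w)
                          if stump_predict(xi, j, thr, pol) != yi)
                if best is None or err < best[0]:
                    best = (err, j, thr, pol)
    return best


def adaboost(X, y, rounds=10):
    """Train an ensemble of weighted stumps; labels must be +1 or -1."""
    n = len(X)
    w = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        err, j, thr, pol = fit_stump(X, y, w)
        eps = min(max(err, 1e-10), 1 - 1e-10)  # keep the log finite
        alpha = 0.5 * math.log((1 - eps) / eps)
        ensemble.append((alpha, j, thr, pol))
        if err < 1e-10:
            break  # a perfect stump leaves nothing to reweight
        # Increase the weight of misclassified samples, then renormalize.
        w = [wi * math.exp(-alpha * yi * stump_predict(xi, j, thr, pol))
             for xi, yi, wi in zip(X, y, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble


def predict(ensemble, x):
    """Sign of the alpha-weighted vote of all stumps."""
    score = sum(a * stump_predict(x, j, thr, pol) for a, j, thr, pol in ensemble)
    return 1 if score >= 0 else -1


# Hypothetical toy data: 12 majority samples (label -1), 3 minority samples (+1).
majority = [[0.1 * i, 0.5 * (i * 7 % 5)] for i in range(12)]
minority = [[3.0, 0.5], [3.5, 1.5], [4.0, 1.0]]
synthetic = smote(minority, n_new=len(majority) - len(minority))

X = majority + minority + synthetic
y = [-1] * len(majority) + [1] * (len(minority) + len(synthetic))
model = adaboost(X, y, rounds=10)
accuracy = sum(predict(model, xi) == yi for xi, yi in zip(X, y)) / len(X)
print(f"{len(synthetic)} synthetic samples, training accuracy {accuracy:.2f}")
```

The toy data is separable on the first feature, so a single stump suffices here; on real data such as the heart or usps sets, several boosting rounds would normally be needed and accuracy should be measured on a held-out test set.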
