Abstract
【Purpose/significance】Data classification is a core topic in data mining research. Because the performance of any single classification algorithm varies from problem to problem, no single algorithm handles most classification problems well. A data classification method based on bagging with multiple types of classifiers therefore has both theoretical significance and practical value.【Method/process】Using classification accuracy as the performance measure, five different types of classification algorithms serve as base classifiers. Training sets are drawn at random, each is used to train a weak classifier, and the resulting weak classifiers are combined into a strong classifier through automatically optimized weighting. Experiments compare the mean and standard deviation of classification accuracy for the five individual algorithms and the bagging algorithm, which shows how well each performs and how stable it is on four datasets.【Result/conclusion】Experiments on four UCI datasets show that, compared with the five individual classification algorithms, the bagging algorithm is not only highly stable on most of the datasets but also generalizes better.
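The procedure in the Method/process part can be illustrated with a minimal sketch. The abstract names neither the five algorithms nor the weighting scheme, so everything concrete here is an assumption: three toy learners (`fit_majority`, `fit_centroid`, `fit_1nn`) stand in for the five classifier types, out-of-bag accuracy stands in for the automatically optimized weights, and synthetic two-class data (`make_blobs`) stands in for the UCI datasets. All function names are hypothetical.

```python
import random
import statistics
from collections import Counter, defaultdict

# Toy base learners (assumed stand-ins for the paper's five algorithm types).
# Each takes training data and returns a predict(x) function.
def fit_majority(X, y):
    label = Counter(y).most_common(1)[0][0]
    return lambda x: label

def fit_centroid(X, y):
    sums, counts = {}, Counter(y)
    for xi, yi in zip(X, y):
        acc = sums.setdefault(yi, [0.0] * len(xi))
        for j, v in enumerate(xi):
            acc[j] += v
    cents = {c: [s / counts[c] for s in sums[c]] for c in sums}
    def predict(x):
        return min(cents, key=lambda c: sum((a - b) ** 2 for a, b in zip(x, cents[c])))
    return predict

def fit_1nn(X, y):
    data = list(zip(X, y))
    def predict(x):
        return min(data, key=lambda p: sum((a - b) ** 2 for a, b in zip(x, p[0])))[1]
    return predict

def accuracy(model, X, y):
    return sum(model(xi) == yi for xi, yi in zip(X, y)) / len(y)

def fit_bagging(X, y, learners, n_rounds, rng):
    """Train every learner type on bootstrap samples and combine by weighted vote.

    Assumption: each member's weight is its out-of-bag accuracy.
    """
    n = len(X)
    members = []
    for _ in range(n_rounds):
        for fit in learners:
            idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
            chosen = set(idx)
            # Out-of-bag rows; fall back to all rows if the sample covered everything.
            oob = [i for i in range(n) if i not in chosen] or list(range(n))
            model = fit([X[i] for i in idx], [y[i] for i in idx])
            weight = accuracy(model, [X[i] for i in oob], [y[i] for i in oob])
            members.append((model, weight))
    def predict(x):
        votes = defaultdict(float)
        for model, weight in members:
            votes[model(x)] += weight
        return max(votes, key=votes.get)
    return predict

def make_blobs(n, rng):
    """Synthetic two-class data: Gaussian blobs centered at (0, 0) and (2, 2)."""
    X, y = [], []
    for i in range(n):
        label = i % 2
        center = 2.0 * label
        X.append([rng.gauss(center, 1.0), rng.gauss(center, 1.0)])
        y.append(label)
    return X, y

def evaluate(n_repeats=5, seed=0):
    """Return (mean, standard deviation) of test accuracy over repeated runs."""
    rng = random.Random(seed)
    scores = []
    for _ in range(n_repeats):
        X, y = make_blobs(200, rng)
        model = fit_bagging(X[:140], y[:140],
                            [fit_majority, fit_centroid, fit_1nn], 5, rng)
        scores.append(accuracy(model, X[140:], y[140:]))
    return statistics.mean(scores), statistics.stdev(scores)

if __name__ == "__main__":
    mean_acc, std_acc = evaluate()
    print(f"accuracy mean={mean_acc:.3f} std={std_acc:.3f}")
```

The mean from `evaluate` corresponds to the accuracy comparison described in the abstract, and the standard deviation to the stability comparison. Running the same loop with a single learner in the `learners` list gives a baseline for the individual algorithms.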
Authors
段尧清
林平
李施展
DUAN Yao-qing; LIN Ping; LI Shi-zhan (School of Information Management, Central China Normal University, Wuhan 430079, China; School of Mathematics and Information Science, Guangxi University, Nanning 530004, China)
Source
《情报科学》
CSSCI
Peking University Core Journals (北大核心)
2019, Issue 4, pp. 59-65 (7 pages)
Information Science
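Funding
Key Project of the National Social Science Fund of China, "Research on the Mechanism and Model for the Integration and Utilization of Open Government Data Based on the Full Life Cycle" (17ATQ006)
Major Cultivation Project of the Fundamental Research Funds for the Central Universities, "Research on Government Information Services in the Big Data Environment" (CCNU16Z02002)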
Keywords
multiple types
bagging technology
data classification