Abstract
The main objective of this paper is to find a rapid pruning method for Bagging that reduces the storage space required by the algorithm, speeds up computation, and creates the potential to improve classification accuracy. A new selective-ensemble idea is proposed that computes the diversity of base learners directly: the base learner whose removal most strongly improves the diversity of the remaining base learners is selected and deleted, and hierarchical pruning is used to accelerate the algorithm. The new algorithm can greatly reduce the size of the Bagging ensemble without degrading performance. It also supports parallel computing, and it performs selective ensembling much faster than GASEN (genetic algorithm based selective ensemble). An upper bound on the classification error of ensemble learning is also given.
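The pruning rule described above can be sketched in a few lines. The following is a minimal illustration, not the paper's actual algorithm: it assumes 0/1 predictions on a validation set, uses average pairwise disagreement as the diversity measure, and greedily deletes the learner whose removal most increases the diversity of the remaining ensemble. All function names are illustrative.

```python
import numpy as np

def pairwise_disagreement(preds):
    # preds: list/array of shape (n_learners, n_samples) with 0/1 predictions.
    # Returns the average fraction of samples on which a pair of learners disagrees.
    n = len(preds)
    if n < 2:
        return 0.0
    total, pairs = 0.0, 0
    for i in range(n):
        for j in range(i + 1, n):
            total += np.mean(preds[i] != preds[j])
            pairs += 1
    return total / pairs

def prune_by_diversity(preds, target_size):
    # Greedy backward pruning: repeatedly delete the base learner whose
    # removal most increases the diversity of the remaining learners.
    keep = list(range(len(preds)))
    while len(keep) > target_size:
        best_idx, best_div = None, -1.0
        for k in keep:
            rest = [preds[i] for i in keep if i != k]
            d = pairwise_disagreement(rest)
            if d > best_div:
                best_div, best_idx = d, k
        keep.remove(best_idx)
    return keep
```

For example, if two base learners predict identically, pruning removes one of them first, since dropping a redundant learner raises the remaining ensemble's average disagreement. This greedy loop is the non-hierarchical baseline; the hierarchical pruning mentioned in the abstract would partition the ensemble to avoid re-evaluating every learner at every step.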
Source
Information and Control (《信息与控制》), CSCD, Peking University Core Journal (北大核心), 2009, No. 4, pp. 449-454 (6 pages)
Keywords
selective ensemble
diversity
hierarchical pruning
parallel computation
base learner
component learner