Abstract
Processing imbalanced data with various algorithms has become a hot topic in data mining research. Targeting the characteristics of imbalanced data, and building on the theory of support vector machines and the K-SVM algorithm, we propose PFKSVM (K-SVM based on penalty factor), a penalty-mechanism-based algorithm that overcomes K-SVM's tendency to misclassify samples near the optimal classification surface. We also propose an ensemble learning model composed of a reconstructed sampling layer, a basic training layer, and a comprehensive decision layer. Experiments on UCI public data sets verify the advantages of the PFKSVM algorithm and the ensemble model in classifying imbalanced data.
Source
Computer Applications and Software (《计算机应用与软件》)
CSCD
PKU Core Journal (北大核心)
2014, Issue 1, pp. 186-190 (5 pages)