Abstract
The variational Gaussian process classifier is a recently proposed, effective fast kernel classification algorithm for large-scale data. On class-imbalanced problems, however, its prediction accuracy on minority-class samples is usually low. To address this, exponential weight coefficients are introduced into the likelihood function, and an inducing subset containing equal numbers of positive and negative samples is constructed. Together these correct the original algorithm's tendency for the decision boundary to shift toward the minority class, yielding an improved variational Gaussian process classification algorithm that handles large-scale class-imbalanced problems effectively. Experiments on ten large-scale UCI datasets show that the improved algorithm achieves substantially higher accuracy than the original on class-imbalanced problems.
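The two modifications named in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formulas, so everything below is an assumption made for illustration: a logistic likelihood with labels in {+1, -1}, exponents set inversely proportional to class frequency, and an inducing subset drawn uniformly at random from each class. The function names are hypothetical and do not come from the paper.

```python
import numpy as np


def class_weights(y):
    """Assumed weighting: exponent n / (2 * n_c) for a sample of class c.

    Both classes get weight 1 when the data are balanced; the minority
    class gets a larger exponent otherwise.
    """
    y = np.asarray(y)
    n = y.size
    n_pos = np.sum(y == 1)
    n_neg = n - n_pos
    return np.where(y == 1, n / (2.0 * n_pos), n / (2.0 * n_neg))


def weighted_log_likelihood(f, y, w):
    """Log of prod_i p(y_i | f_i)^{w_i} for an assumed logistic likelihood.

    log sigmoid(z) = -log(1 + exp(-z)), evaluated stably with logaddexp.
    """
    f, y, w = np.asarray(f), np.asarray(y), np.asarray(w)
    return np.sum(w * -np.logaddexp(0.0, -y * f))


def balanced_inducing_indices(y, m, rng):
    """Pick up to m // 2 indices from each class, uniformly at random."""
    y = np.asarray(y)
    pos = np.flatnonzero(y == 1)
    neg = np.flatnonzero(y == -1)
    k = min(m // 2, pos.size, neg.size)
    return np.concatenate([
        rng.choice(pos, size=k, replace=False),
        rng.choice(neg, size=k, replace=False),
    ])


# Toy imbalanced labels: 2 positive samples, 8 negative samples.
y = np.array([1] * 2 + [-1] * 8)
w = class_weights(y)
idx = balanced_inducing_indices(y, m=4, rng=np.random.default_rng(0))
```

Raising each likelihood term to a class-dependent power makes errors on the minority class cost more in the objective, while the balanced inducing subset keeps the sparse approximation from being dominated by majority-class points. How the paper actually chooses the exponents and the inducing points may differ from this sketch.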
Source
Journal of Dalian University of Technology (《大连理工大学学报》)
Indexed in: EI, CAS, CSCD, PKU Core
2016, No. 3, pp. 279-284 (6 pages)
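Funding
National Natural Science Foundation of China (61503058, 61374170)
Natural Science Foundation of Liaoning Province (2015020084, 2015020099)
Science and Technology Research Project of the Education Department of Liaoning Province (L2014540, L2015127)
Fundamental Research Funds for the Central Universities (DC201501055, DC201501060201)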
Keywords
class-imbalanced problem
Gaussian process
variational inference
large-scale data classification