Abstract
Semi-supervised learning trains an effective classifier from a large number of unlabeled samples together with a small number of labeled samples, and self-training is a widely used semi-supervised approach because it is simple and effective. This paper proposes a self-training semi-supervised support vector machine (SVM) classification method based on hidden-space feature augmentation. The method first maps the labeled samples and the large pool of unlabeled samples from the original space into a common hidden space to construct a feature-augmented space. Self-labeling semi-supervised SVM learning is then performed in this feature-augmented space, combined with probability density, to improve the accuracy and robustness of the classifier. Experiments on UCI datasets show that the proposed algorithm outperforms traditional self-training learning algorithms.
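The pipeline described in the abstract can be illustrated with a minimal Python sketch. This is not the paper's algorithm: the abstract does not specify the hidden-space mapping or how probability density enters the selection, so the sketch assumes a random tanh projection appended to the original features and a confidence score equal to the SVM margin times a kernel density estimate. The names `augment`, `self_train_svm`, `n_hidden`, and `top_k` are hypothetical, and the sketch handles binary classification only.

```python
import numpy as np
from sklearn.neighbors import KernelDensity
from sklearn.svm import SVC


def augment(X, W, b):
    """Append a hidden-space projection to the original features.

    Assumption: a random tanh hidden layer stands in for the paper's mapping.
    """
    return np.hstack([X, np.tanh(X @ W + b)])


def self_train_svm(X_lab, y_lab, X_unl, n_hidden=16, top_k=10, max_rounds=20, seed=0):
    """Self-training SVM in an augmented feature space (binary labels only)."""
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(X_lab.shape[1], n_hidden))
    b = rng.normal(size=n_hidden)

    # Labeled and unlabeled samples share the same mapping (same W and b).
    Z_lab, Z_unl = augment(X_lab, W, b), augment(X_unl, W, b)
    y_lab = np.asarray(y_lab).copy()

    for _ in range(max_rounds):
        if len(Z_unl) == 0:
            break
        clf = SVC(kernel="rbf", gamma="scale").fit(Z_lab, y_lab)

        # Density of each unlabeled point, estimated over all available samples.
        kde = KernelDensity(bandwidth=1.0).fit(np.vstack([Z_lab, Z_unl]))
        log_density = kde.score_samples(Z_unl)
        density = np.exp(log_density - log_density.max())  # rescaled to avoid underflow

        # Assumed selection rule: large margin and high density mean high confidence.
        score = np.abs(clf.decision_function(Z_unl)) * density
        pick = np.argsort(score)[-top_k:]

        # Move the selected samples, with their predicted labels, into the labeled set.
        Z_lab = np.vstack([Z_lab, Z_unl[pick]])
        y_lab = np.concatenate([y_lab, clf.predict(Z_unl[pick])])
        Z_unl = np.delete(Z_unl, pick, axis=0)

    # Final classifier trained on the true labels plus all accepted pseudo-labels.
    clf = SVC(kernel="rbf", gamma="scale").fit(Z_lab, y_lab)
    return clf, W, b
```

To classify new samples, apply the same mapping first, for example `clf.predict(augment(X_new, W, b))`. The fixed bandwidth and `top_k` are placeholder values that would need tuning on real data.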
Author
Xu Min (许敏), School of Internet of Things Technology, Wuxi Institute of Technology, Wuxi, Jiangsu 214121, China
Source
Statistics & Decision (《统计与决策》)
CSSCI
Peking University Core Journal (北大核心)
2022, Issue 7, pp. 11-15 (5 pages)
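Funding
National Natural Science Foundation of China (61702225)
Natural Science Research Project of Higher Education Institutions of Jiangsu Province (18KJB520048)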
Keywords
feature augmentation
semi-supervised learning
self-training
SVM
density estimation