Tri-training Algorithm Based on Density Peaks Clustering
Abstract: Tri-training classifies with the help of unlabeled data and can effectively improve the generalization ability of a classifier, but it is prone to mislabeling unlabeled data, which introduces training noise. This paper proposes a Tri-training algorithm based on density peaks clustering (Tri-training with density peaks clustering, DPC-TT). Density peaks clustering uses cluster centers and local density to pick out the samples that best reflect the spatial structure of the data. DPC-TT applies density peaks clustering to obtain the cluster centers of the training data and the local density of each sample, treats samples within the cutoff (truncation) distance of a cluster center as having good spatial structure, and labels them as core data. Updating the classifier with the core data reduces the training noise introduced during iteration and thereby improves classifier performance. Experimental results show that DPC-TT achieves better classification performance than the standard Tri-training algorithm and its improved variants.
Authors: Luo Yuhang (罗宇航); Wu Runxiu (吴润秀); Cui Zhihua (崔志华); Zhang Yiying (张翼英); He Yeshen (何业慎); Zhao Jia (赵嘉). Affiliations: School of Information Engineering, Nanchang Institute of Technology, Nanchang 330099, China; College of Computer Science and Technology, Taiyuan University of Science and Technology, Taiyuan 030024, China; College of Artificial Intelligence, Tianjin University of Science & Technology, Tianjin 300457, China; China Gridcom Co., Ltd., Shenzhen 518000, China
Source: Journal of System Simulation (《系统仿真学报》), 2024, Issue 5, pp. 1189-1198 (10 pages). Indexed in: CAS, CSCD, Peking University Core (北大核心).
Funding: National Natural Science Foundation of China (52069014).
Keywords: Tri-training; semi-supervised learning; density peaks clustering; spatial structure; classifier
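
The core-data selection step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's formulas, so this sketch assumes the commonly used cutoff-kernel form of density peaks clustering: local density is the number of neighbors within the cutoff distance, and cluster centers are the points with the largest product of density and distance to the nearest denser point. The function name `density_peaks_core` and the parameters `d_c` and `n_centers` are hypothetical and chosen only for illustration; how the paper sets the cutoff distance or the number of centers is not stated here.

```python
import math


def density_peaks_core(points, d_c, n_centers):
    """Return (centers, core) as lists of indices into `points`.

    Assumptions of this sketch (not taken from the paper): cutoff-kernel
    density, a fixed number of centers, and a caller-supplied cutoff d_c.
    """
    n = len(points)
    dist = [[math.dist(points[i], points[j]) for j in range(n)] for i in range(n)]

    # Local density: number of other points closer than the cutoff distance.
    rho = [sum(1 for j in range(n) if j != i and dist[i][j] < d_c) for i in range(n)]

    # Relative distance: distance to the nearest point of strictly higher density.
    # A point with no denser point gets its largest distance to any point instead.
    delta = []
    for i in range(n):
        higher = [dist[i][j] for j in range(n) if rho[j] > rho[i]]
        delta.append(min(higher) if higher else max(dist[i]))

    # Cluster centers: the points with the largest rho * delta.
    gamma = [rho[i] * delta[i] for i in range(n)]
    centers = sorted(range(n), key=lambda i: gamma[i], reverse=True)[:n_centers]

    # Core data: samples lying within the cutoff distance of some cluster center.
    core = [i for i in range(n) if any(dist[i][c] <= d_c for c in centers)]
    return centers, core


# Toy data: two tight groups of five points plus two isolated points.
points = [
    (0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (0.05, 0.05),
    (5.0, 5.0), (5.1, 5.0), (5.0, 5.1), (5.1, 5.1), (5.05, 5.05),
    (1.0, 1.0), (2.5, 2.5),
]
centers, core = density_peaks_core(points, d_c=0.12, n_centers=2)

# In a Tri-training round, only pseudo-labeled samples that are also core
# data would be passed on to retrain a classifier; the rest are held back.
core_set = set(core)
pseudo_labeled = [3, 7, 10, 11]  # hypothetical indices proposed by two agreeing classifiers
kept = [i for i in pseudo_labeled if i in core_set]
```

On this toy data the two isolated points fall outside the cutoff distance of both centers, so they are filtered out before retraining, which is the noise-reduction effect the abstract describes.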