Improved Partial Co-Training Algorithm Based on Kernel Mean Shift Clustering

Cited by: 3
Abstract: [Purposes] The co-training algorithm cannot be applied directly to single-view data, and the unlabeled samples added during iteration often carry too little useful implicit information. To address these problems, an improved partial co-training algorithm based on kernel mean shift clustering is proposed. [Methods] First, a full-view classifier h1 is trained on the labeled set by the improved partial co-training algorithm, and a high-value feature subset is selected from the labeled data to train a partial-view classifier h2. Then, kernel mean shift is applied to the unlabeled set to select samples that fall within a given bandwidth during clustering; these samples are labeled by classifier h2 and added to the training of classifier h1, optimizing the classification model. [Findings] Three groups of comparative experiments on UCI datasets validate the algorithm, and the results show that it achieves higher model evaluation ability. [Conclusions] The improved partial co-training algorithm divides the dataset into a partial view and a complete view, which solves the view-partitioning problem for single-view data. Kernel mean shift selects unlabeled samples that better represent the spatial structure of the data, reducing the error introduced by unlabeled samples.
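The pipeline described in [Methods] can be sketched roughly as below. This is an illustrative reconstruction, not the authors' code: the base classifiers (logistic regression), the feature-selection method (SelectKBest) standing in for the paper's high-value feature subset, scikit-learn's flat-kernel MeanShift standing in for kernel mean shift, and the "within bandwidth of the cluster center" selection rule are all assumptions.

```python
# Hedged sketch of the improved partial co-training loop (one iteration).
# Assumptions: logistic regression for h1/h2, SelectKBest as the feature
# selector, sklearn MeanShift (flat kernel) as the clustering step.
import numpy as np
from sklearn.cluster import MeanShift, estimate_bandwidth
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression

rng = np.random.RandomState(0)
X, y = make_classification(n_samples=300, n_features=10, random_state=0)
labeled = rng.rand(len(X)) < 0.3          # ~30% of samples are labeled
X_l, y_l, X_u = X[labeled], y[labeled], X[~labeled]

# h1: full-view classifier trained on all features of the labeled set.
h1 = LogisticRegression(max_iter=1000).fit(X_l, y_l)

# h2: partial-view classifier trained on a selected feature subset.
selector = SelectKBest(f_classif, k=4).fit(X_l, y_l)
h2 = LogisticRegression(max_iter=1000).fit(selector.transform(X_l), y_l)

# Mean shift over the unlabeled pool; keep samples lying within the
# bandwidth of their cluster center, i.e. well inside the cluster structure.
bandwidth = estimate_bandwidth(X_u, quantile=0.3)
ms = MeanShift(bandwidth=bandwidth).fit(X_u)
centers = ms.cluster_centers_[ms.labels_]
dist = np.linalg.norm(X_u - centers, axis=1)
chosen = dist < bandwidth

# Pseudo-label the chosen samples with h2, then retrain h1 on the union
# of the labeled data and the pseudo-labeled data.
if chosen.any():
    pseudo = h2.predict(selector.transform(X_u[chosen]))
    X_aug = np.vstack([X_l, X_u[chosen]])
    y_aug = np.concatenate([y_l, pseudo])
else:
    X_aug, y_aug = X_l, y_l
h1 = LogisticRegression(max_iter=1000).fit(X_aug, y_aug)
```

In the full algorithm this selection and retraining step would repeat until the unlabeled pool is exhausted or h1 stops improving.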
Authors: XIAN Yan (鲜焱); Lü Jia (吕佳) (College of Computer and Information Sciences, Chongqing Normal University; Chongqing Center of Engineering Technology Research on Digital Agriculture Service, Chongqing Normal University, Chongqing 401331, China)
Source: Journal of Chongqing Normal University (Natural Science), CAS, PKU Core, 2020, No. 4, pp. 106-113 (8 pages)
Funding: National Natural Science Foundation of China (No.1971084); Chongqing Normal University Research Project (No.YKC19018)
Keywords: co-training; mean shift; manifold regularization; feature selection; view partition

