
Analysis of Semi-Supervised Text Clustering Algorithm on Marine Data

Abstract Semi-supervised clustering can improve learning performance by using a small number of labeled samples to guide the learning of unlabeled samples. This paper implements and compares unsupervised and semi-supervised clustering analyses of BOA-Argo ocean text data. K-Means and Affinity Propagation (AP) are two classical unsupervised clustering algorithms. The Election-AP algorithm is proposed to control the final number of clusters in AP clustering, which has proved difficult to keep within a suitable range. The labeled samples are thermocline data, selected from the BOA-Argo dataset according to the standard definition of the thermocline, and these data are used for semi-supervised cluster analysis. Several semi-supervised clustering algorithms were chosen for a comparison of learning performance: Constrained-K-Means, Seeded-K-Means, SAP (Semi-supervised Affinity Propagation), LSAP (Loose Seed AP) and CSAP (Compact Seed AP). To accommodate data with a single label, this paper adapts the first three of these algorithms into SCKM (improved Constrained-K-Means), SSKM (improved Seeded-K-Means), and SSAP (improved Semi-supervised Affinity Propagation) to perform semi-supervised clustering analysis on the data. A DSAP (Double Seed AP) semi-supervised clustering algorithm based on compact seeds is also proposed, and the experimental results show that DSAP achieves a better clustering effect. The unsupervised and semi-supervised clustering results are used to analyze potential patterns in the marine data.
Source Computers, Materials & Continua (SCIE, EI), 2020, Issue 7, pp. 207-216, 10 pages in total.
Funding This work was supported in part by the National Natural Science Foundation of China (51679105, 61872160, 51809112) and the "Thirteenth Five Plan" Science and Technology Project of the Education Department, Jilin Province (JJKH20200990KJ).
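
The abstract names Seeded-K-Means and Constrained-K-Means without describing them. As these two baselines are commonly described, both initialise the centroids from the labeled seeds; Constrained-K-Means additionally keeps the seed labels fixed in every assignment step. The following is a minimal sketch of that difference, not the paper's SSKM or SCKM: the function name `seeded_kmeans`, its parameters, and the toy data are illustrative assumptions, and it assumes every cluster has at least one seed, which is not the single-label setting the paper adapts to.

```python
import numpy as np

def seeded_kmeans(X, seed_idx, seed_labels, k, constrained=False, n_iter=100):
    """Illustrative Seeded / Constrained K-Means (hypothetical helper)."""
    X = np.asarray(X, dtype=float)
    seed_idx = np.asarray(seed_idx)
    seed_labels = np.asarray(seed_labels)

    # Initialise each centroid as the mean of the seeds carrying that label.
    # Assumption: every cluster 0..k-1 has at least one seed.
    centers = np.stack(
        [X[seed_idx[seed_labels == c]].mean(axis=0) for c in range(k)]
    )

    labels = np.full(len(X), -1)
    for _ in range(n_iter):
        # Assign every point to its nearest centroid (Euclidean distance).
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        new_labels = dists.argmin(axis=1)
        if constrained:
            # Constrained variant: seed points always keep their given label.
            new_labels[seed_idx] = seed_labels
        if np.array_equal(new_labels, labels):
            break  # assignments are stable, so the centroids are too
        labels = new_labels
        # Recompute centroids; an empty cluster keeps its previous centroid.
        for c in range(k):
            members = X[labels == c]
            if len(members):
                centers[c] = members.mean(axis=0)
    return labels, centers

# Toy data: two well-separated groups of three points each.
X = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
labels, centers = seeded_kmeans(X, seed_idx=[0, 3], seed_labels=[0, 1], k=2)
```

With `constrained=False` the seeds only influence the starting centroids and may be reassigned later; with `constrained=True` a seed keeps its label even when another centroid is closer.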