Abstract
Clustering is one of the basic and most widely used data mining methods. It partitions data into clusters according to the similarity and dissimilarity between data items, so that items within the same cluster are as similar as possible and items in different clusters are as dissimilar as possible. Many clustering algorithms exist; this paper examines only two commonly used partitioning algorithms, the EM algorithm and the K-Means algorithm, with an emphasis on the EM algorithm, and analyzes the experimental results. It closes with a summary and discussion of the algorithms.
Source
《计算机与现代化》
2007, No. 9, pp. 12-14 (3 pages)
Computer and Modernization