Abstract
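Cluster analysis is an important research branch of data mining. Many clustering algorithms have been proposed, and the partitioning method is one of them. Its drawbacks are that the number of clusters must be given in advance and that it is sensitive to the initial partition and to the input order. To overcome these shortcomings, a partition-based dynamic clustering algorithm is proposed, built on the partitioning method. The algorithm considers objects in descending order of density and uses distance to select well-dispersed initial values. It can also filter out noise data, and it changes the number of clusters dynamically during clustering, which improves clustering quality and yields more natural results.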
Clustering has promising applications in many fields, including data mining, statistical data analysis, pattern recognition, and image processing. The partitioning method is a clustering approach that is sensitive to the preset number of partitions (the value of k), the initial values, and the input order. To overcome these disadvantages, a partition-based dynamic clustering algorithm is developed. First, the data objects are sorted by density. Then dispersed data objects are selected as initial cluster centers in order of priority, and outliers can be filtered out at the same time. The number of partitions changes during clustering. Experiments demonstrate that the algorithm improves on the partitioning method and yields better results.
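The seeding step described in the abstract (rank objects by density, pick mutually distant ones as initial centers, set aside sparse objects as noise) can be sketched as follows. This is only an illustration under stated assumptions, not the paper's procedure: the abstract does not define density or give any thresholds, so the neighbor-count density and the parameters `eps`, `min_density`, and `min_separation`, as well as the function name `density_ordered_seeds`, are hypothetical. The later step that changes the number of clusters during clustering is not shown.

```python
import math


def density_ordered_seeds(points, eps, min_density, min_separation):
    """Return (seed indices, noise indices) for a list of coordinate tuples.

    Assumed definitions (not taken from the paper):
    - density of a point = number of other points within radius eps
    - a point is noise if its density is below min_density
    - a seed must lie at least min_separation from every earlier seed
    """
    n = len(points)
    density = [
        sum(
            1
            for j in range(n)
            if j != i and math.dist(points[i], points[j]) <= eps
        )
        for i in range(n)
    ]
    # Visit points from densest to sparsest (ties keep input order).
    order = sorted(range(n), key=lambda i: density[i], reverse=True)

    seeds, noise = [], []
    for i in order:
        if density[i] < min_density:
            noise.append(i)  # too sparse: set aside as noise
        elif all(
            math.dist(points[i], points[s]) >= min_separation for s in seeds
        ):
            seeds.append(i)  # dense and far from every chosen seed
    return seeds, noise


points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (50, 50)]
seeds, noise = density_ordered_seeds(
    points, eps=1.5, min_density=1, min_separation=5.0
)
print(seeds, noise)
```

In this sketch the number of seeds is not fixed in advance; it falls out of the separation threshold, which is one plausible reading of how such a method avoids requiring k up front.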
Source
《计算机工程与设计》
CSCD
Peking University Core Journals (北大核心)
2005, Issue 1, pp. 177-179, 229 (4 pages in total)
Computer Engineering and Design