Abstract
To address the low clustering efficiency of the CLARANS algorithm and the dependence of its results on the initial nodes, this paper proposes a grid-based two-pass CLARANS algorithm (Twi-CLARANS). First, a grid clustering algorithm partitions the data space, all data objects in the dense grid cells are extracted, and CLARANS performs an initial clustering on them. Then the local optimum obtained from the first pass serves as the initial reference points for a second pass over the original data sample, which avoids losing outlier information as far as possible and keeps the clustering result from being trapped in a local optimum. Experimental results show that, compared with CLARANS, Twi-CLARANS achieves better accuracy and execution efficiency while preserving the completeness of the information.
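The two-pass procedure described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names, the fixed `cell_size`, the `min_pts` density threshold, and the choice to run the second pass as a single CLARANS restart seeded with the first-pass medoids are all assumptions made for the example. `num_local` and `max_neighbor` stand in for the usual CLARANS restart and neighbour-sampling limits.

```python
import random
from collections import defaultdict

def dist(a, b):
    # Euclidean distance between two points given as tuples
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def cost(points, medoids):
    # total distance from every point to its nearest medoid
    return sum(min(dist(p, m) for m in medoids) for p in points)

def dense_points(points, cell_size, min_pts):
    # bucket points into axis-aligned grid cells and keep only the
    # points that fall in cells holding at least min_pts points
    cells = defaultdict(list)
    for p in points:
        cells[tuple(int(c // cell_size) for c in p)].append(p)
    return [p for cell in cells.values() if len(cell) >= min_pts for p in cell]

def clarans(points, k, num_local, max_neighbor, rng, init=None):
    # randomized medoid search; `init` (hypothetical parameter) lets the
    # caller supply starting medoids instead of a random sample
    best, best_cost = None, float("inf")
    for _ in range(num_local):
        current = list(init) if init is not None else rng.sample(points, k)
        current_cost = cost(points, current)
        tries = 0
        while tries < max_neighbor:
            # a neighbour differs from the current solution in one medoid
            i = rng.randrange(k)
            candidate = rng.choice(points)
            if candidate in current:
                tries += 1
                continue
            neighbour = current[:i] + [candidate] + current[i + 1:]
            neighbour_cost = cost(points, neighbour)
            if neighbour_cost < current_cost:
                current, current_cost, tries = neighbour, neighbour_cost, 0
            else:
                tries += 1
        if current_cost < best_cost:
            best, best_cost = current, current_cost
    return best, best_cost

def twi_clarans(points, k, cell_size, min_pts,
                num_local=2, max_neighbor=50, seed=0):
    rng = random.Random(seed)
    # pass 1: cluster only the objects that lie in dense grid cells
    dense = dense_points(points, cell_size, min_pts)
    if len(dense) < k:
        raise ValueError("fewer dense points than clusters requested")
    first, _ = clarans(dense, k, num_local, max_neighbor, rng)
    # pass 2: refine on the full data set, starting from the pass-1 medoids,
    # so that points outside dense cells are assigned rather than discarded
    return clarans(points, k, 1, max_neighbor, rng, init=first)
```

In this sketch the first pass runs on a reduced data set, so each cost evaluation is cheaper, and the second pass starts from medoids that already sit in dense regions rather than from a random draw. Whether this matches the paper's exact grid construction or stopping rules cannot be determined from the abstract alone.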
Source
《计算机应用与软件》
CSCD
PKU Core Journal
2013, Issue 3, pp. 287-290 (4 pages)
Computer Applications and Software
Keywords
CLARANS algorithm (Clustering Large Applications based on RANdomised Search)
Clustering
Grid
Data space