Abstract
To address the low mining efficiency and low accuracy of traditional clustering algorithms, we propose a multi-level k-means clustering algorithm based on the minimum spanning tree and apply it to data mining. First, we analyze the data types of the clustering samples and design the clustering criterion function according to the analysis results. Second, we partition the sample data with the minimum spanning tree and select the initial cluster centers: the sample data space is divided into rectangular cells, and within each cell the sample object data are computed, sorted in descending order, and selected, which yields effective initial cluster centers and reduces data mining time. Experimental results show that, compared with the traditional algorithm, the proposed algorithm mines data quickly and accurately, improving mining efficiency by about 50%.
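The abstract does not spell out how the minimum spanning tree partitions the samples, how the rectangular cells are used, or which criterion function is chosen. As a rough illustration only, the sketch below assumes one common MST seeding scheme: build the MST over the samples, cut the k-1 longest edges, and use the mean of each resulting component as an initial center for ordinary k-means, with the sum of squared errors as the criterion. All function names here (`mst_edges`, `mst_initial_centers`, `kmeans`, `sse`) are hypothetical, and the rectangular-cell step is not modeled.

```python
import math

def mst_edges(points):
    """Prim's algorithm on the complete Euclidean graph; returns (weight, i, j) edges."""
    n = len(points)
    in_tree = [False] * n
    best = [math.inf] * n      # cheapest known connection of each point to the tree
    parent = [-1] * n
    best[0] = 0.0
    edges = []
    for _ in range(n):
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] >= 0:
            edges.append((best[u], parent[u], u))
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v], parent[v] = d, u
    return edges

def mean(group):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(c) / len(group) for c in zip(*group))

def mst_initial_centers(points, k):
    """Assumed seeding rule: cut the k-1 longest MST edges, average each component."""
    kept = sorted(mst_edges(points))[: len(points) - k]   # the n-k shortest edges
    root = list(range(len(points)))                       # union-find forest

    def find(x):
        while root[x] != x:
            root[x] = root[root[x]]
            x = root[x]
        return x

    for _, i, j in kept:
        root[find(i)] = find(j)
    groups = {}
    for idx, p in enumerate(points):
        groups.setdefault(find(idx), []).append(p)
    return [mean(g) for g in groups.values()]

def assign(points, centers):
    """Assign every point to its nearest center."""
    clusters = [[] for _ in centers]
    for p in points:
        nearest = min(range(len(centers)), key=lambda c: math.dist(p, centers[c]))
        clusters[nearest].append(p)
    return clusters

def kmeans(points, centers, max_iters=100):
    """Standard Lloyd iterations starting from the given centers."""
    for _ in range(max_iters):
        clusters = assign(points, centers)
        updated = [mean(c) if c else centers[i] for i, c in enumerate(clusters)]
        if updated == centers:
            break
        centers = updated
    return centers, assign(points, centers)

def sse(centers, clusters):
    """Sum of squared errors, used here as an assumed clustering criterion."""
    return sum(math.dist(p, c) ** 2 for c, cl in zip(centers, clusters) for p in cl)

# Toy data: two well-separated groups of four points each.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)]
centers, clusters = kmeans(points, mst_initial_centers(points, 2))
print(centers, sse(centers, clusters))
```

Seeding from MST components rather than random samples tends to place the initial centers inside dense regions, so fewer k-means iterations are typically needed; whether this matches the speedup reported in the paper cannot be determined from the abstract.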
Authors
金晓民
张丽萍
JIN Xiaomin; ZHANG Liping (Institute of Transportation, Inner Mongolia University, Hohhot 010021, China; Inner Mongolia Engineering Research Center of Testing and Strengthening for Bridges, Hohhot 010070, China; College of Computer Science and Technology, Inner Mongolia Normal University, Hohhot 010022, China)
Source
《吉林大学学报(理学版)》
CAS
CSCD
PKU Core (Peking University Core Journals)
2018, No. 5, pp. 1187-1192 (6 pages)
Journal of Jilin University:Science Edition
Funding
National Natural Science Foundation of China (Grant No. 61462071)