Abstract
The classic K-means clustering algorithm is based on Euclidean distance, so it is suited only to clusters with a spherical structure, and it ignores both the correlations between variables and the differences in their importance. To address these problems, this paper proposes an improved K-means clustering algorithm that combines the Mahalanobis distance with K-means and adds a variable weighting factor and a covariance matrix regulating factor for each class to the objective function. By exploiting the advantages of the Mahalanobis distance, the method effectively overcomes these shortcomings of K-means clustering. Experimental results on data clustering confirm its feasibility and effectiveness.
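The core idea in the abstract, replacing the Euclidean distance in K-means with a per-cluster Mahalanobis distance, can be illustrated with a minimal sketch. This is not the paper's algorithm: the variable weighting factor and the covariance regulating factor are not specified in this record, so they are omitted. The function name `mahalanobis_kmeans`, the `reg` ridge term, the optional `init` argument, and the unit-determinant normalization of each covariance matrix are all assumptions made for illustration.

```python
import numpy as np

def mahalanobis_kmeans(X, k, n_iter=20, reg=1e-6, init=None, seed=0):
    """Illustrative K-means variant using per-cluster Mahalanobis distance.

    Assumptions (not from the paper): `reg` adds a small ridge so each
    covariance is invertible, and each covariance is rescaled to unit
    determinant so that no cluster wins simply by having a larger spread.
    """
    X = np.asarray(X, dtype=float)
    n, d = X.shape
    rng = np.random.default_rng(seed)
    if init is None:
        centers = X[rng.choice(n, size=k, replace=False)].copy()
    else:
        centers = np.asarray(init, dtype=float).copy()
    # Identity matrices make the first assignment pass plain Euclidean.
    inv_covs = np.stack([np.eye(d) for _ in range(k)])
    labels = np.full(n, -1)

    for _ in range(n_iter):
        # Assignment step: squared Mahalanobis distance to every cluster.
        dists = np.empty((n, k))
        for j in range(k):
            diff = X - centers[j]
            dists[:, j] = np.einsum("ni,ij,nj->n", diff, inv_covs[j], diff)
        new_labels = dists.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break  # assignments stopped changing
        labels = new_labels

        # Update step: recompute each cluster's mean and covariance.
        for j in range(k):
            members = X[labels == j]
            if len(members) <= d:
                continue  # too few points for a stable covariance estimate
            centers[j] = members.mean(axis=0)
            cov = np.atleast_2d(np.cov(members, rowvar=False)) + reg * np.eye(d)
            cov /= np.linalg.det(cov) ** (1.0 / d)  # unit determinant
            inv_covs[j] = np.linalg.inv(cov)

    return labels, centers
```

Because the Mahalanobis distance accounts for the covariance of each cluster, elongated or correlated clusters can be separated where a Euclidean criterion would favor spherical groupings. Calling `mahalanobis_kmeans(X, k=2)` on an `(n, d)` array returns one label per row together with the cluster centers.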
Source
Journal of Jiangxi Normal University (Natural Science Edition)
Indexed in: CAS, PKU Core Journals
2012, No. 3, pp. 284-287 (4 pages)
Funding
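Supported by the Natural Science Foundation of Guangdong Province (06021484, 915100900100007) and the Guangdong Provincial Science and Technology Plan (2008A0602011).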
Keywords
K-means
Mahalanobis distance
clustering
intrusion detection