Abstract
In cluster analysis, the fuzzy k-means, k-modes, and k-prototypes algorithms were designed for numerical, categorical, and mixed data, respectively. However, all of these algorithms assume that every feature of the samples contributes equally to the classification. To account for the different influence of each feature on pattern classification, this paper proposes a new feature-weighted fuzzy clustering algorithm in which the ReliefF algorithm is used to assign a weight to every feature. Weighting the features unifies the fuzzy k-means, k-modes, and k-prototypes algorithms, yields better classification results, and also makes it possible to analyze how much each feature contributes to the classification. Experimental results on a variety of real data sets demonstrate the good performance of the new algorithm.
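The idea of feature weighting can be illustrated with a minimal sketch. The code below is not the paper's algorithm: it covers numeric features only, and it assumes the feature weights `w` are already available (for example from a ReliefF implementation, which is not shown). The function names `weighted_sq_dist` and `weighted_fuzzy_kmeans` and all parameters are hypothetical, chosen for illustration.

```python
import random


def weighted_sq_dist(x, c, w):
    """Squared Euclidean distance with each feature scaled by its weight."""
    return sum(wj * (xj - cj) ** 2 for xj, cj, wj in zip(x, c, w))


def weighted_fuzzy_kmeans(X, k, w, m=2.0, iters=100, tol=1e-6, seed=0):
    """Fuzzy k-means on numeric data with fixed, externally supplied weights w.

    Assumption: w comes from elsewhere (e.g. ReliefF); it is not computed here.
    Returns (centers, U), where U[i][j] is the membership of sample i in cluster j.
    """
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    # Initialise the centers with k randomly chosen samples.
    centers = [list(x) for x in rng.sample(X, k)]
    U = [[0.0] * k for _ in range(n)]
    for _ in range(iters):
        # Membership update: u_ij is proportional to d_ij^(-1/(m-1)).
        for i, x in enumerate(X):
            dists = [weighted_sq_dist(x, c, w) for c in centers]
            zeros = [j for j, dj in enumerate(dists) if dj == 0.0]
            if zeros:
                # The sample coincides with one or more centers.
                U[i] = [1.0 / len(zeros) if j in zeros else 0.0 for j in range(k)]
            else:
                inv = [dj ** (-1.0 / (m - 1.0)) for dj in dists]
                s = sum(inv)
                U[i] = [v / s for v in inv]
        # Center update: membership-weighted mean of the samples.
        new_centers = []
        for j in range(k):
            um = [U[i][j] ** m for i in range(n)]
            total = sum(um)
            new_centers.append(
                [sum(um[i] * X[i][f] for i in range(n)) / total for f in range(d)]
            )
        shift = max(
            abs(a - b)
            for old, new in zip(centers, new_centers)
            for a, b in zip(old, new)
        )
        centers = new_centers
        if shift < tol:
            break
    return centers, U
```

With a weight of zero on an uninformative feature, that feature no longer affects the memberships, which is the effect the abstract attributes to weighting. Handling categorical or mixed features would additionally require a mismatch-based dissimilarity and a mode-style center update, which this sketch omits.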
Source
Acta Electronica Sinica (《电子学报》)
EI
CAS
CSCD
Peking University Core Journals (北大核心)
2006, No. 1, pp. 89-92 (4 pages)
Funding
National Natural Science Foundation of China (No. 60202004)
China Postdoctoral Science Foundation
Keywords
cluster analysis
fuzzy clustering
numeric feature
categorical feature
feature weighting