A New Feature Weighted Fuzzy Clustering Algorithm (基于特征加权的模糊聚类新算法)  Cited by: 114
Abstract  In cluster analysis, the fuzzy k-means, k-modes, and k-prototypes algorithms were designed for numerical, categorical, and mixed data, respectively. All three, however, assume that every feature of a sample contributes equally to the classification. To account for the differing influence of each feature on pattern classification, this paper proposes a new feature-weighted fuzzy clustering algorithm in which the ReliefF algorithm assigns a weight to each feature. Weighting the features unifies the fuzzy k-means, k-modes, and k-prototypes algorithms in a single framework, yields better classification results, and also makes it possible to analyze how much each feature contributes to the classification. Experiments on a variety of real data sets show the good performance of the new algorithm.
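Source  Acta Electronica Sinica (《电子学报》), 2006, No. 1, pp. 89-92 (4 pages). Indexed in EI, CAS, CSCD, and the Peking University Core Journals list (北大核心).
Funding  National Natural Science Foundation of China (No. 60202004); China Postdoctoral Science Foundation
Keywords  cluster analysis; fuzzy clustering; numeric feature; categorical feature; feature weighting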
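
The abstract says that weighting the features lets one algorithm handle numerical, categorical, and mixed data. The record does not give the paper's formulas, so the following is only a minimal sketch of one way this could work, under stated assumptions: the per-feature weights are taken as given (the paper obtains them with ReliefF, which is not implemented here), numeric features use a squared difference, categorical features use a 0/1 mismatch, and memberships follow the standard fuzzy c-means update. The names `weighted_dissimilarity` and `memberships` are hypothetical.

```python
from typing import List, Sequence, Union

Value = Union[float, str]  # numeric features are floats, categorical features are strings


def weighted_dissimilarity(x: Sequence[Value], v: Sequence[Value],
                           weights: Sequence[float]) -> float:
    """Sum of per-feature dissimilarities, each scaled by its feature weight.

    Assumption: squared difference for numeric features, 0/1 mismatch for
    categorical ones. The paper's exact form may differ.
    """
    total = 0.0
    for xj, vj, wj in zip(x, v, weights):
        if isinstance(xj, str):
            total += wj * (0.0 if xj == vj else 1.0)
        else:
            total += wj * (xj - vj) ** 2
    return total


def memberships(x: Sequence[Value], prototypes: Sequence[Sequence[Value]],
                weights: Sequence[float], m: float = 2.0) -> List[float]:
    """Standard fuzzy c-means membership update with the weighted dissimilarity."""
    d = [weighted_dissimilarity(x, v, weights) for v in prototypes]
    # A sample that coincides with one or more prototypes belongs fully to them.
    zeros = [i for i, di in enumerate(d) if di == 0.0]
    if zeros:
        return [1.0 / len(zeros) if i in zeros else 0.0 for i in range(len(d))]
    power = 1.0 / (m - 1.0)  # d is already a squared-type dissimilarity
    inverse = [(1.0 / di) ** power for di in d]
    norm = sum(inverse)
    return [u / norm for u in inverse]


if __name__ == "__main__":
    prototypes = [(1.0, "red"), (3.0, "blue")]
    sample = (2.0, "red")
    # Illustrative weights only; in the paper they come from ReliefF.
    for w in ([1.0, 0.0], [0.0, 1.0], [0.5, 0.5]):
        print(w, memberships(sample, prototypes, w))
```

With all weight on the numeric feature the sample is equidistant from both prototypes, while shifting weight to the categorical feature pulls it toward the first prototype. This is the sense in which the weights express each feature's contribution to the classification.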
