
Weighted feature selection for clustering (cited by: 1)
Abstract: Feature selection algorithms for clustering struggle to balance efficiency and effectiveness, and they adapt poorly to high-dimensional data. To address this, the paper proposes ENFSA, a weighted feature selection algorithm based on neighborhood analysis. ENFSA first builds a candidate feature set based on information entropy (termed information gain in the paper's English abstract), which reduces the number of candidate dimensions for weighted feature selection. It then uses neighborhood analysis to assess the redundancy and relevance of features, and updates the feature subset and the weight vector according to that assessment. The assessment and update steps repeat until the feature weight vector stabilizes. Experimental results on 10 typical datasets show that, compared with traditional feature selection algorithms, ENFSA reduces features efficiently, clearly improves clustering quality, and still performs well on datasets with high feature dimensionality.
Source: Application Research of Computers (《计算机应用研究》), CSCD, Peking University Core Journal, 2015, No. 12, pp. 3596-3599 (4 pages)
Funding: National "863" Program (2012AA012704); Zhengzhou Science and Technology Leading Talent Project (131PLJRC644)
Keywords: weighted feature selection; clustering; information entropy (information gain); neighborhood analysis; feature weight vector
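
The abstract describes a two-stage flow: an entropy-based prefilter that builds a candidate feature set, followed by a loop that scores relevance and redundancy from neighborhoods and updates a weight vector until it stabilizes. The sketch below only illustrates that control flow. This record does not give the paper's formulas, so every function name (`feature_entropy`, `candidate_features`, `neighborhood_scores`, `enfsa_like_weights`) is hypothetical, and the lowest-entropy criterion, the neighbor-gap relevance score, and the correlation-based redundancy penalty are all stand-in assumptions, not ENFSA's actual definitions.

```python
import numpy as np

def feature_entropy(x, bins=10):
    """Shannon entropy (bits) of one feature's histogram."""
    counts, _ = np.histogram(x, bins=bins)
    p = counts[counts > 0] / counts.sum()
    return float(-(p * np.log2(p)).sum())

def candidate_features(X, k, bins=10):
    """Stage 1 (assumed criterion): keep the k lowest-entropy non-constant features."""
    valid = np.flatnonzero(X.std(axis=0) > 0)
    ent = np.array([feature_entropy(X[:, j], bins) for j in valid])
    return np.sort(valid[np.argsort(ent)[:k]])

def neighborhood_scores(X, w, n_neighbors=5):
    """Placeholder relevance: high when a sample's nearest neighbors
    (under the current weighted distance) are also close on that feature."""
    Xw = X * np.sqrt(w)
    d = ((Xw[:, None, :] - Xw[None, :, :]) ** 2).sum(axis=-1)
    np.fill_diagonal(d, np.inf)                  # a sample is not its own neighbor
    nn = np.argsort(d, axis=1)[:, :n_neighbors]
    local = ((X[:, None, :] - X[nn]) ** 2).mean(axis=(0, 1))
    spread = 2.0 * X.var(axis=0) + 1e-12         # expected gap between random pairs
    return np.clip(1.0 - local / spread, 0.0, None)

def enfsa_like_weights(X, k, n_neighbors=5, tol=1e-4, max_iter=50):
    """Stage 2 (assumed update rule): iterate until the weight vector is stable."""
    cand = candidate_features(X, k)
    Z = X[:, cand]
    Z = (Z - Z.mean(axis=0)) / Z.std(axis=0)
    corr = np.abs(np.atleast_2d(np.nan_to_num(np.corrcoef(Z, rowvar=False))))
    w = np.full(len(cand), 1.0 / len(cand))
    for _ in range(max_iter):
        rel = neighborhood_scores(Z, w, n_neighbors)
        red = np.zeros_like(rel)
        for j in range(len(cand)):
            stronger = rel > rel[j]              # features currently judged more relevant
            if stronger.any():
                red[j] = corr[j, stronger].max() # penalize overlap with a stronger feature
        new_w = rel * (1.0 - red)
        total = new_w.sum()
        if total == 0:                           # nothing scored; keep previous weights
            break
        new_w = new_w / total
        converged = np.abs(new_w - w).max() < tol
        w = new_w
        if converged:
            break
    return cand, w

# Toy data: two clustered features plus four uniform noise features.
rng = np.random.default_rng(0)
labels = np.repeat([0, 1], 100)
X_demo = np.column_stack([
    labels * 10 + rng.normal(0, 0.1, 200),
    labels * 5 + rng.normal(0, 0.1, 200),
    rng.uniform(0, 1, (200, 4)),
])
cand_demo, w_demo = enfsa_like_weights(X_demo, k=3)
```

Because the loop is bounded by `max_iter`, it returns the latest weights even when this stand-in update rule oscillates instead of settling; a faithful implementation would substitute the paper's own neighborhood measures and stopping test.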
