Abstract
Most traditional attribute clustering algorithms cannot handle continuous attributes directly. To avoid the information loss caused by discretizing continuous data, reduce the complexity of computing sample attribute neighborhoods, and improve the efficiency of feature gene extraction, this paper proposes a feature gene selection method that applies neighborhood mutual information to attribute clustering. The method is intended to mine, from massive gene expression profile data, a small number of feature genes that have discriminative power for classification and low redundancy.
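The abstract does not state the formula it uses. As a minimal sketch, the code below assumes the definition of neighborhood mutual information commonly given in the granular computing literature, NMI(R;S) = -(1/n) Σ log(|δ_R(x_i)|·|δ_S(x_i)| / (n·|δ_{R∪S}(x_i)|)), where δ_B(x_i) is the set of samples within distance δ of x_i on the attributes in B. The function names, the Chebyshev distance, and the threshold `delta` are assumptions made for illustration, not details taken from the paper. It shows how relevance between continuous attributes can be measured without discretizing them.

```python
import math

def neighborhoods(samples, features, delta):
    """For each sample, collect the indices of samples within `delta`
    on the given feature columns (Chebyshev distance, an assumption)."""
    n = len(samples)
    result = []
    for i in range(n):
        nbrs = set()
        for j in range(n):
            d = max(abs(samples[i][f] - samples[j][f]) for f in features)
            if d <= delta:
                nbrs.add(j)
        result.append(nbrs)
    return result

def neighborhood_mutual_information(samples, feats_r, feats_s, delta):
    """NMI between two attribute subsets, computed on continuous values."""
    n = len(samples)
    nb_r = neighborhoods(samples, feats_r, delta)
    nb_s = neighborhoods(samples, feats_s, delta)
    total = 0.0
    for i in range(n):
        # Under Chebyshev distance, the neighborhood on R ∪ S is exactly
        # the intersection of the neighborhoods on R and on S.
        joint = nb_r[i] & nb_s[i]
        total += math.log2(len(nb_r[i]) * len(nb_s[i]) / (n * len(joint)))
    return -total / n

# Toy data: two hypothetical "genes" (columns) that vary together.
data = [[0.0, 0.0], [0.1, 0.1], [1.0, 1.0], [1.1, 1.1]]
print(neighborhood_mutual_information(data, [0], [1], 0.2))  # 1.0
```

In an attribute clustering setting, a pairwise measure of this kind could serve as the similarity between genes, so that highly redundant genes fall into the same cluster and one representative is kept per cluster. That use is a reading of the abstract, not a description of the authors' exact procedure.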
Authors
殷樱
张玉冰
刘家诚
高昆
YIN Ying, ZHANG Yu-bing, LIU Jia-cheng, GAO Kun (College of Computer and Information Technology, Henan Normal University, Xinxiang 453007, China)
Source
《电脑知识与技术》 (Computer Knowledge and Technology)
2014, Issue 2, pp. 821-823 (3 pages)
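Funding
Undergraduate Innovation Experiment Program, Henan Normal University university-level key project (2012), "Application of Neighborhood Mutual Information in Gene Data Mining", No. 1s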
Keywords
granular computing
neighborhood mutual information
attribute clustering
gene selection