Abstract
Feature selection algorithms fall broadly into two categories, filter models and wrapper models. Many algorithms built on different theoretical foundations have been proposed, but problems persist, including limited processing capability and low classification accuracy of the selected feature subsets. To address these problems, this paper proposes a max mutual information and max correlation entropy criterion based on the information entropy model of fuzzy rough sets, and designs a new feature selection algorithm from that criterion. The algorithm can handle hybrid information that mixes discrete, continuous, and fuzzy data. Experiments on UCI datasets show that the algorithm is effective, with higher accuracy and better stability than three other common feature selection algorithms.
Source
《计算机应用研究》
CSCD
Peking University Core Journal (北大核心)
2009, No. 1, pp. 233-235, 240 (4 pages)
Application Research of Computers
Keywords
fuzzy rough set
information entropy
feature selection
mutual information
correlation entropy