Abstract
Feature subset selection is an important problem in machine learning and pattern recognition: from a large set of candidate features, select a subset that describes the given examples (samples) consistently. In particular, the computational complexity of the optimal feature subset selection problem, that is, finding a smallest such subset, has remained open. This paper proves that the optimal feature subset problem is NP-hard and presents a heuristic algorithm for it.
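The abstract does not spell out the heuristic, so the following is only an illustrative sketch and not the authors' algorithm. It assumes, based on the keywords "set covering" and "greedy algorithm", that a feature subset is consistent when every pair of samples with different labels differs on at least one selected feature, and it applies the standard greedy set-cover rule to those pairs. The function name `greedy_feature_subset` and the data layout (tuples of discrete feature values plus a label list) are hypothetical.

```python
from itertools import combinations


def greedy_feature_subset(samples, labels):
    """Greedy set-cover heuristic for a small consistent feature subset.

    Assumption (not from the paper): samples are equal-length tuples of
    discrete feature values, and a subset is "consistent" when it separates
    every pair of samples that carry different labels.
    """
    n_features = len(samples[0])
    # Universe to cover: every pair of samples with different labels.
    uncovered = {
        (i, j)
        for i, j in combinations(range(len(samples)), 2)
        if labels[i] != labels[j]
    }
    # covers[f]: the pairs that feature f alone can tell apart.
    covers = [
        {(i, j) for (i, j) in uncovered if samples[i][f] != samples[j][f]}
        for f in range(n_features)
    ]
    selected = []
    while uncovered:
        # Greedy step: take the feature separating the most uncovered pairs.
        best = max(range(n_features), key=lambda f: len(covers[f] & uncovered))
        gained = covers[best] & uncovered
        if not gained:
            # Some differently labeled samples agree on every feature.
            raise ValueError("no consistent feature subset exists")
        selected.append(best)
        uncovered -= gained
    return sorted(selected)
```

For the four samples `(0,0,1)`, `(0,1,1)`, `(1,0,0)`, `(1,1,0)` with labels `0, 1, 1, 0`, this sketch returns `[0, 1]`. Because the problem is NP-hard, a greedy rule of this kind gives a consistent subset but not necessarily a smallest one.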
Source
《计算机学报》
EI
CSCD
PKU Core (北大核心)
1997, No. 2, pp. 133-138 (6 pages)
Chinese Journal of Computers
Keywords
Machine learning, pattern recognition, feature subset selection, set covering, NP-hardness, greedy algorithm.