Abstract
High-dimensional data samples often make data mining and pattern recognition tasks intractable. By analyzing both the within-class and between-class distances of the samples, this paper presents a graph-based feature ranking framework. Within this framework, two fast feature selection algorithms are proposed, one defined by within-class and between-class similarity and the other by K nearest neighbor similarity; both avoid the computationally expensive generalized eigendecomposition. Experimental results show that the algorithms achieve high classification accuracy.
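The abstract does not give the scoring formulas, so the following is only a minimal sketch of how a graph-based feature ranking can score each feature independently, with no eigendecomposition. It assumes a label-induced graph whose within-class weights are 1/n_c and whose between-class weights are 1/n minus the within-class weights, which reduces to a Fisher-score-style ratio. The function names `within_between_scores` and `rank_features` are hypothetical and are not taken from the paper.

```python
import numpy as np


def within_between_scores(X, y, eps=1e-12):
    """Score each feature as between-class scatter over within-class scatter.

    Assumption (not from the paper): the graph weights are 1/n_c for
    same-class pairs and 1/n - (within weight) for the between-class graph.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    n = X.shape[0]

    # same[i, j] is 1.0 when samples i and j share a class label.
    same = (y[:, None] == y[None, :]).astype(float)
    class_sizes = same.sum(axis=1)
    W_within = same / class_sizes[:, None]

    # Graph Laplacians L = D - W. Rows of W_within sum to 1, and rows of
    # the between-class weights (1/n - W_within) sum to 0.
    L_within = np.eye(n) - W_within
    L_between = W_within - np.full((n, n), 1.0 / n)

    # f^T L f for every feature column f at once.
    within = (X * (L_within @ X)).sum(axis=0)
    between = (X * (L_between @ X)).sum(axis=0)
    return between / (within + eps)


def rank_features(X, y):
    """Return feature indices ordered from most to least discriminative."""
    scores = within_between_scores(X, y)
    return np.argsort(-scores, kind="stable")
```

Because each feature is scored on its own, the cost is dominated by two matrix products rather than by solving a generalized eigenproblem. A K nearest neighbor similarity could in principle be substituted for the label-induced weights, but that variant is not shown here.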
Source
《计算机工程》 (Computer Engineering)
CAS
CSCD
2012, Issue 9, pp. 197-198, 201 (3 pages in total)
Funding
National Natural Science Foundation of China (Grant No. 71001072)
Natural Science Foundation of Guangdong Province (Grant No. 9451806001002694)
Keywords
data mining
pattern recognition
feature selection
graph model
eigendecomposition
K nearest neighbor