Abstract
Graphical representation of multivariate data is one of the simplest approaches to visualizing high-dimensional data. This paper proposes a graphical feature, the visual barycentre feature, derived from the radar plot of multivariate data. Because a radar plot depends on the order in which the features are arranged, the resulting visual features are strongly affected by that order. A quadratic map is therefore used to compute the visual features under all feature orders, and an improved genetic algorithm (GA) then selects the discriminative visual features from among them. Experiments on real UCI datasets, including wine, breast cancer, and diabetes, support the approach: the best classification error rates are 0%, 1.61%, and 20.7%, respectively, which compare favourably with commonly reported classification results and with traditional discriminative feature extraction methods.
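To make the dependence on feature order concrete, here is a minimal Python sketch. It rests on assumptions that the abstract does not confirm: the barycentre is taken to be the area centroid of the radar polygon, feature values are assumed to be scaled to [0, 1], and orderings are enumerated by brute force rather than by the paper's quadratic map. The function names `radar_vertices`, `polygon_centroid`, and `barycentre_features` are hypothetical, and the GA selection step is not shown.

```python
import math
from itertools import permutations


def radar_vertices(values, order):
    """Place values on equally spaced radar axes, visiting features in `order`.

    Assumption: values are already scaled to [0, 1].
    """
    n = len(order)
    return [
        (values[j] * math.cos(2 * math.pi * k / n),
         values[j] * math.sin(2 * math.pi * k / n))
        for k, j in enumerate(order)
    ]


def polygon_centroid(pts):
    """Area centroid of a simple polygon (shoelace formula).

    Falls back to the vertex mean when the polygon area is numerically zero.
    """
    area2 = cx = cy = 0.0
    for (x0, y0), (x1, y1) in zip(pts, pts[1:] + pts[:1]):
        cross = x0 * y1 - x1 * y0
        area2 += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    if abs(area2) < 1e-12:
        n = len(pts)
        return (sum(p[0] for p in pts) / n, sum(p[1] for p in pts) / n)
    return (cx / (3 * area2), cy / (3 * area2))


def barycentre_features(values):
    """Map each feature ordering to the centroid of its radar polygon.

    Feature 0 is pinned to the first axis, because orderings that differ
    only by a cyclic shift give rotated copies of the same polygon.
    """
    rest = range(1, len(values))
    return {
        (0,) + p: polygon_centroid(radar_vertices(values, (0,) + p))
        for p in permutations(rest)
    }


sample = [1.0, 0.2, 0.6, 0.9]  # illustrative values, not from the paper
features = barycentre_features(sample)
for order, (gx, gy) in sorted(features.items()):
    print(order, round(gx, 4), round(gy, 4))
```

The same sample yields a different centroid for each ordering, which is why a selection step over orderings is needed. The number of orderings grows factorially with the number of features, so exhaustive enumeration like this is only practical for small feature sets.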
Source
《系统仿真学报》
CAS
CSCD
PKU Core (Peking University Core Journals)
2009, No. 16, pp. 5080-5083, 5087 (5 pages)
Journal of System Simulation
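Funding
National Natural Science Foundation of China, General Program (60504035)
Yanshan University Science Foundation for Outstanding Doctoral Students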
Keywords
data visualization
graphical representation
feature extraction
quadratic map
feature selection
genetic algorithm