Abstract
Support vector machines (SVMs) have been applied successfully to the analysis of gene expression data. However, two open issues remain: (1) an SVM offers no built-in mechanism for automatically selecting relevant features from gene expression profiles; (2) there is no simple and effective way to choose appropriate SVM parameter settings. This study proposes a new SVM with favorable properties, the total margin-based adaptive fuzzy support vector machine (TAFSVM). It also proposes a new genetic algorithm, the intelligent genetic algorithm (IGA), which is used to design a TAFSVM-based classifier named ITAFSVM. The IGA optimizes the TAFSVM parameter set and the feature subset simultaneously, with 10-fold cross-validation serving as the estimator of generalization ability. ITAFSVM is then applied to four gene expression datasets and compared with an evolutionary support vector machine (ESVM) and with a combination of rough-set-based feature selection and an RBF neural network (RBF-RBFNN). The experimental results indicate that ITAFSVM not only performs feature selection automatically, but also achieves higher classification accuracy, better stability, and shorter running time.
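The idea of optimizing feature selection and classifier parameters at the same time, with cross-validated accuracy as the fitness, can be illustrated with a small sketch. Everything in it is an assumption made for illustration and not the method of the paper: a plain genetic algorithm (tournament selection, uniform crossover, bit-flip mutation) stands in for the IGA, a k-nearest-neighbour classifier with tunable `k` stands in for the TAFSVM and its parameter set, and the data are synthetic. The names `make_toy_data`, `cv_accuracy`, `fitness`, `evolve`, and `K_CHOICES` are hypothetical.

```python
import random

K_CHOICES = [1, 3, 5]  # assumed parameter grid for the stand-in classifier


def make_toy_data(n_samples=60, n_noise=6, seed=1):
    """Synthetic two-class data: 2 informative features followed by noise."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n_samples):
        label = (i // 10) % 2  # alternating blocks keep every fold balanced
        informative = [rng.gauss(3.0 * label, 1.0) for _ in range(2)]
        noise = [rng.gauss(0.0, 1.0) for _ in range(n_noise)]
        X.append(informative + noise)
        y.append(label)
    return X, y


def knn_predict(train_X, train_y, x, k, mask):
    """Majority vote of the k nearest training points on the selected features."""
    dists = sorted(
        (sum((a[j] - x[j]) ** 2 for j in mask), label)
        for a, label in zip(train_X, train_y)
    )
    votes = [label for _, label in dists[:k]]
    return max(set(votes), key=votes.count)


def cv_accuracy(X, y, mask, k, folds=10):
    """k-fold cross-validated accuracy; sample i belongs to fold i % folds."""
    if not mask:
        return 0.0
    correct = 0
    for f in range(folds):
        train = [i for i in range(len(X)) if i % folds != f]
        tr_X = [X[i] for i in train]
        tr_y = [y[i] for i in train]
        for i in range(f, len(X), folds):
            correct += knn_predict(tr_X, tr_y, X[i], k, mask) == y[i]
    return correct / len(X)


def fitness(chromosome, X, y):
    """Cross-validated accuracy minus a small penalty per selected feature."""
    bits, k = chromosome
    mask = [j for j, b in enumerate(bits) if b]
    return cv_accuracy(X, y, mask, k) - 0.001 * len(mask)


def evolve(X, y, pop_size=16, generations=10, seed=0):
    """Plain GA over chromosomes of the form (feature bit mask, k)."""
    rng = random.Random(seed)
    n = len(X[0])
    pop = [([rng.randint(0, 1) for _ in range(n)], rng.choice(K_CHOICES))
           for _ in range(pop_size)]
    for _ in range(generations):
        scored = [(fitness(c, X, y), c) for c in pop]
        elite = max(scored, key=lambda t: t[0])[1]

        def pick():
            # Binary tournament: the fitter of two random individuals wins.
            a, b = rng.sample(scored, 2)
            return (a if a[0] >= b[0] else b)[1]

        nxt = [elite]  # elitism keeps the best chromosome found so far
        while len(nxt) < pop_size:
            (b1, k1), (b2, k2) = pick(), pick()
            bits = [rng.choice(pair) for pair in zip(b1, b2)]  # uniform crossover
            bits = [1 - b if rng.random() < 0.05 else b for b in bits]  # mutation
            k = rng.choice(K_CHOICES) if rng.random() < 0.1 else rng.choice((k1, k2))
            nxt.append((bits, k))
        pop = nxt
    return max(pop, key=lambda c: fitness(c, X, y))


if __name__ == "__main__":
    X, y = make_toy_data()
    bits, k = evolve(X, y)
    selected = [j for j, b in enumerate(bits) if b]
    print("selected features:", selected, "k:", k)
    print("10-fold CV accuracy:", cv_accuracy(X, y, selected, k))
```

Encoding the feature mask and the classifier parameter in one chromosome lets a single fitness evaluation score both choices together, which is the property the abstract attributes to ITAFSVM. The per-feature penalty in `fitness` is an added assumption that nudges the search toward smaller gene subsets.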
Source
《中山大学学报(自然科学版)》 (Journal of Sun Yat-sen University, Natural Science Edition)
CAS
CSCD
PKU Core (Peking University Core Journals)
2010, No. 2, pp. 37-42, 47 (7 pages in total)
Acta Scientiarum Naturalium Universitatis Sunyatseni
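Funding
National Natural Science Foundation of China (Grant No. 10771220)
Specialized Research Fund for the Doctoral Program of Higher Education, Ministry of Education (SRFDP-20070558043)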
Keywords
total margin-based adaptive fuzzy support vector machine
intelligent genetic algorithms
gene expression
classification
microarray