Microarray Gene Feature Classification Based on Least Squares Support Vector Machine (LS-SVM)
Cited by: 6
Abstract: Microarray data in gene expression analysis are high-dimensional and highly redundant, which makes gene expression data difficult to classify. The least squares support vector machine (LS-SVM) algorithm from machine learning is computationally efficient and therefore offers an effective approach to mining such data. For two typical cancer microarray data sets (a colon cancer set and a leukemia set), the data are normalized and their correlation coefficient matrix is computed. Principal component analysis is then used for dimensionality reduction, yielding informative gene sets (10 genes each) for feature selection and classification, and an LS-SVM classifier is applied to these informative gene sets. Experimental results show leave-one-out cross-validation accuracies of 97.5% and 100% on the two cancer data sets, respectively, higher than the test accuracy of the other classifiers compared, providing a reliable diagnostic basis for further clinical application.
Author: Gao Zhenbin (高振斌), Institute of Mathematics and Applied Mathematics, School of Statistics, Xi'an University of Finance and Economics, Xi'an 710100, Shaanxi, China
Source: Computer Applications and Software (《计算机应用与软件》), Peking University Core Journal, 2019, No. 8, pp. 288-292 (5 pages)
Keywords: Microarray; Feature classification; Dimensionality reduction; Least squares support vector machine (LS-SVM)
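
The pipeline summarized in the abstract (normalization, PCA-based dimensionality reduction, LS-SVM classification, leave-one-out cross-validation) can be sketched roughly as follows. This is a minimal illustration and not the author's implementation. The data are synthetic stand-ins for the colon cancer and leukemia sets. The RBF kernel, the regularization value gamma, the kernel-width heuristic, and the choice to project onto 10 principal components (rather than select 10 individual genes) are all assumptions, since the record does not specify them. The function names are hypothetical.

```python
import numpy as np

def standardize(train, test):
    # Z-score both sets using statistics of the training fold only.
    mu, sd = train.mean(axis=0), train.std(axis=0)
    sd[sd == 0] = 1.0
    return (train - mu) / sd, (test - mu) / sd

def pca_fit(X, k):
    # X is assumed centered; columns of the result are the top-k directions.
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    return Vt[:k].T

def rbf_kernel(A, B, sigma):
    d2 = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2 * A @ B.T
    return np.exp(-np.maximum(d2, 0.0) / (2 * sigma**2))

def lssvm_train(X, y, gamma, sigma):
    # LS-SVM classifier: solve the linear system
    # [[0, y^T], [y, Omega + I/gamma]] [b; alpha] = [0; 1],
    # with Omega_ij = y_i * y_j * K(x_i, x_j).
    n = len(y)
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = y
    A[1:, 0] = y
    A[1:, 1:] = np.outer(y, y) * rbf_kernel(X, X, sigma) + np.eye(n) / gamma
    rhs = np.concatenate(([0.0], np.ones(n)))
    sol = np.linalg.solve(A, rhs)
    return sol[0], sol[1:]

def lssvm_predict(Xtr, ytr, b, alpha, Xte, sigma):
    scores = rbf_kernel(Xte, Xtr, sigma) @ (alpha * ytr) + b
    return np.where(scores >= 0, 1, -1)

def loo_accuracy(X, y, k=10, gamma=10.0, sigma=None):
    # Leave-one-out: preprocessing and PCA are refit on each training fold.
    n, hits = len(y), 0
    for i in range(n):
        mask = np.arange(n) != i
        Xtr, Xte = standardize(X[mask], X[~mask])
        W = pca_fit(Xtr, k)
        Ztr, Zte = Xtr @ W, Xte @ W
        # Assumed heuristic: kernel width from the total variance of the scores.
        sig = sigma if sigma is not None else np.sqrt(Ztr.var(axis=0).sum())
        b, alpha = lssvm_train(Ztr, y[mask], gamma, sig)
        hits += int(lssvm_predict(Ztr, y[mask], b, alpha, Zte, sig)[0] == y[i])
    return hits / n

# Synthetic stand-in data: 40 samples, 200 "genes", 50 of them informative.
rng = np.random.default_rng(0)
y = np.array([-1.0] * 20 + [1.0] * 20)
X = rng.normal(size=(40, 200))
X[:, :50] += 2.0 * y[:, None]
acc = loo_accuracy(X, y)
print(f"leave-one-out accuracy on synthetic data: {acc:.3f}")
```

Unlike a standard SVM, which requires solving a quadratic program, LS-SVM training reduces to a single linear solve, which is the computational advantage the abstract refers to. The accuracy printed here reflects only the synthetic data and says nothing about the figures reported in the article.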
