Abstract
To select feature genes effectively from high-dimensional, small-sample gene expression data and to eliminate data irrelevant to tumour classification, we propose a tumour informative gene selection algorithm (RD-SVM) based on random matrix replacement and the support vector machine. First, we construct multiple subsets of informative genes, each represented by a 0/1 random vector, and build a support vector machine classifier to evaluate the quality of each subset. Then, taking the interactions between features into account, we assess the gene subsets with a 0/1 replacement strategy and find the optimal gene subset. Finally, we test the algorithm's performance on five tumour gene expression profile datasets. The results show that, compared with the reference algorithms, RD-SVM not only improves the recognition accuracy for tumour informative genes but also selects the fewest informative genes.
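The subset encoding and replacement step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the replacement rule, so the single-bit flip search below is an assumption, and the names random_masks, replace_search, select_genes and toy_score are hypothetical. The evaluator is a pluggable function; in RD-SVM it would be the accuracy of a support vector machine trained on the selected genes, whereas the toy scorer here is only a stand-in so that the sketch runs with the standard library alone.

```python
import random
from typing import Callable, List, Tuple

Mask = List[int]  # one 0/1 entry per gene: 1 = gene selected


def random_masks(n_genes: int, n_masks: int, rng: random.Random) -> List[Mask]:
    """Build several random 0/1 vectors, each encoding one candidate gene subset."""
    return [[rng.randint(0, 1) for _ in range(n_genes)] for _ in range(n_masks)]


def replace_search(mask: Mask, score: Callable[[Mask], float],
                   max_passes: int = 10) -> Tuple[Mask, float]:
    """Assumed replacement rule: flip one bit at a time (0 -> 1 or 1 -> 0) and
    keep the flip if the score rises, or stays equal with fewer genes."""
    best, best_score = list(mask), score(mask)
    for _ in range(max_passes):
        improved = False
        for i in range(len(best)):
            cand = list(best)
            cand[i] = 1 - cand[i]
            s = score(cand)
            if s > best_score or (s == best_score and sum(cand) < sum(best)):
                best, best_score, improved = cand, s, True
        if not improved:
            break
    return best, best_score


def select_genes(n_genes: int, score: Callable[[Mask], float],
                 n_masks: int = 8, seed: int = 0) -> Tuple[Mask, float]:
    """Refine every random subset and return the best one
    (highest score first, then fewest selected genes)."""
    rng = random.Random(seed)
    results = [replace_search(m, score) for m in random_masks(n_genes, n_masks, rng)]
    return max(results, key=lambda r: (r[1], -sum(r[0])))


def toy_score(mask: Mask) -> float:
    # Hypothetical stand-in for SVM accuracy: pretend genes 1 and 3 are the
    # only informative ones, so the score is the fraction of them selected.
    return (mask[1] + mask[3]) / 2


best_mask, best_score = select_genes(6, toy_score)
print(best_mask, best_score)  # [0, 1, 0, 1, 0, 0] 1.0
```

To move closer to the setting in the abstract, toy_score could be swapped for a function that trains a support vector machine on the columns where the mask is 1 and returns its cross-validated accuracy; the search code itself would not need to change.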
Source
Computer Applications and Software (《计算机应用与软件》)
CSCD
2015, No. 5, pp. 310-313 (4 pages)
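Funding
Guangdong Province High-Tech Industrialization Project (2012B010100050)
Dongguan Key Science and Technology Plan Project for Higher Education Institutions and Research Institutes (2011108101010)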
Keywords
Gene selection
Tumour expression profile
Informative gene
Support vector machine