Abstract
To address the curse of dimensionality caused by the high-dimension, low-sample-size nature of gene expression data, a new dimensionality reduction method for gene expression data, called sparse class preserving projection (SCPP), is proposed by combining regression with class preserving projection (CPP). Compared with CPP, SCPP effectively avoids the matrix singularity and over-fitting problems that CPP encounters when reducing the dimensionality of gene expression data. Data visualization and sample classification experiments on real gene expression data confirm the effectiveness of the method.
Source
《电子学报》
EI
CAS
CSCD
Peking University Core Journals (北大核心)
2016, No. 4, pp. 873-877 (5 pages)
Acta Electronica Sinica
Funding
Fundamental Research Funds for the Central Universities (No. JB140310)
Keywords
gene expression data
high dimension and low sample size
class preserving projection
regression