Abstract
This paper predicts O-glycosylation sites in protein sequences using kernel Fisher discriminant analysis (KFDA) applied to sequence samples of various window lengths. The samples are encoded with sparse coding and mapped into a feature space by a nonlinear mapping that is implicitly defined by a kernel function; Fisher discriminant analysis then classifies them into two classes in that feature space. To combine the strengths of the different windows, the classifiers trained at each window length are combined by majority vote. Experimental results indicate that the KFDA ensemble outperforms FDA, PCA, and any single KFDA classifier, with a prediction accuracy of about 86.5%.
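The pipeline described in the abstract (sparse coding of sequence windows, a two-class KFDA per window length, and a majority vote across windows) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the 20-letter alphabet, the padding symbol `X`, the RBF kernel with its `gamma`, the regularisation term `reg`, the midpoint threshold, and the rule that ties go to the negative class. The function names are hypothetical.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # assumed 20-letter alphabet


def extract_window(sequence, site, half_width, pad="X"):
    """Residues within half_width of the 0-based site, padded past the ends."""
    padded = pad * half_width + sequence + pad * half_width
    return padded[site:site + 2 * half_width + 1]


def sparse_encode(window):
    """One 20-dimensional 0/1 vector per residue; unknown symbols stay all zero."""
    vec = []
    for residue in window:
        vec.extend(1.0 if residue == aa else 0.0 for aa in AMINO_ACIDS)
    return vec


def rbf(u, v, gamma):
    """Assumed kernel: exp(-gamma * squared Euclidean distance)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(u, v)))


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x


def fit_kfda(X, y, gamma=0.1, reg=1e-3):
    """Two-class kernel Fisher discriminant; labels in y must be 0 or 1."""
    n = len(X)
    K = [[rbf(X[i], X[j], gamma) for j in range(n)] for i in range(n)]
    # Within-class scatter in kernel space, started at reg * I for stability.
    N = [[reg if j == k else 0.0 for k in range(n)] for j in range(n)]
    means = {}
    for c in (0, 1):
        members = [i for i in range(n) if y[i] == c]
        m = [sum(K[j][i] for i in members) / len(members) for j in range(n)]
        means[c] = m
        for i in members:
            d = [K[j][i] - m[j] for j in range(n)]
            for j in range(n):
                for k in range(n):
                    N[j][k] += d[j] * d[k]
    # Expansion coefficients: (N + reg * I) alpha = M1 - M0.
    alpha = solve(N, [m1 - m0 for m0, m1 in zip(means[0], means[1])])
    # Assumed decision rule: midpoint between the projected class means.
    threshold = 0.5 * sum(a * (m0 + m1)
                          for a, m0, m1 in zip(alpha, means[0], means[1]))

    def predict(x):
        score = sum(a * rbf(xi, x, gamma) for a, xi in zip(alpha, X))
        return 1 if score > threshold else 0

    return predict


def majority_vote(votes):
    """Combine 0/1 votes from the per-window classifiers; ties go to 0."""
    return 1 if 2 * sum(votes) > len(votes) else 0
```

In use, one classifier would be fitted per window length on `sparse_encode(extract_window(...))` vectors, and `majority_vote` would be applied to their predictions for each candidate site. Using an odd number of window lengths avoids ties.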
Source
《计算机应用》
CSCD
PKU Core Journal (北大核心)
2010, Issue 11, pp. 2959-2961 (3 pages)
Journal of Computer Applications
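Funding
Natural Science Foundation of Shaanxi Province (2010JQ1013)
Scientific Research Program of the Shaanxi Provincial Education Department (2010JK896, 09JK809)
Special Research Fund of Xianyang Normal University (07XSYK107)
Undergraduate Research Training Program of Xianyang Normal University (09101)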
Keywords
glycosylation
protein
Kernel Fisher Discriminant Analysis (KFDA)
feature