
Application of the K-means Clustering Algorithm to Tumor Gene Variant Identification (cited by: 6)

USING K-MEANS CLUSTERING ALGORITHM FOR CANCER GENE VARIANT DETECTING
Abstract The rapid growth of next-generation sequencing (NGS) data has accelerated the exploration of genes, and it has also made sequencing data analysis more challenging. Identifying cancer-specific variants is an important basic task in sequencing data analysis. Most current variant-calling tools use Bayesian models, and their specificity, sensitivity, and speed fall far short of what is needed. K-means is a simple and efficient unsupervised clustering algorithm. Building on it, the proposed method maps the information at each site into a multidimensional feature vector and then clusters the sites into two classes. The method markedly improves accuracy and recall, and the experimental results verify its effectiveness.
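The two-class clustering step described in the abstract can be illustrated with a minimal sketch. The abstract does not list the features the paper uses, so the two features here (variant allele fraction in the tumor sample and in the matched normal sample), the sample values, and the names `kmeans`, `sites`, and `init` are all hypothetical. The code is plain Lloyd's K-means with k = 2, not the author's implementation.

```python
from typing import List, Sequence, Tuple

Point = Sequence[float]


def squared_distance(a: Point, b: Point) -> float:
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_point(points: List[Point]) -> List[float]:
    """Component-wise mean of a non-empty list of feature vectors."""
    n = len(points)
    return [sum(column) / n for column in zip(*points)]


def kmeans(
    points: List[Point],
    init_centroids: List[Point],
    max_iter: int = 100,
) -> Tuple[List[int], List[List[float]]]:
    """Lloyd's algorithm; the number of clusters is len(init_centroids)."""
    centroids = [list(c) for c in init_centroids]
    labels: List[int] = []
    for _ in range(max_iter):
        # Assignment step: each site goes to its nearest centroid.
        new_labels = [
            min(range(len(centroids)),
                key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = mean_point(members)
    return labels, centroids


# Hypothetical per-site features: (tumor allele fraction, normal allele fraction).
sites = [
    (0.02, 0.01), (0.01, 0.02), (0.03, 0.02),  # low support, noise-like
    (0.35, 0.01), (0.42, 0.00), (0.28, 0.02),  # tumor-only support
]

# Deterministic initialisation (an assumption): the sites with the lowest
# and highest tumor allele fraction seed the two clusters.
init = [min(sites, key=lambda s: s[0]), max(sites, key=lambda s: s[0])]

labels, centroids = kmeans(sites, init)
print(labels)  # [0, 0, 0, 1, 1, 1]
```

In this toy data the cluster with the higher tumor allele fraction (label 1) would be read as the candidate variant class. Real data would need feature scaling and a rule for deciding which of the two clusters corresponds to true variants.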
Author Ye Xiao (叶骁), Shanghai Key Lab of Intelligent Information Processing, School of Computer Science, Fudan University, Shanghai 200433, China
Source Computer Applications and Software (《计算机应用与软件》), Peking University Core Journal, 2019, No. 3, pp. 287-290, 333 (5 pages)
Keywords K-means; variant calling; next-generation sequencing
