Abstract
Next-generation sequencing (NGS) data were compared with single nucleotide polymorphism (SNP) genotypes obtained by the ligase detection reaction (LDR) in order to determine empirical thresholds for calling SNP genotypes from NGS data. Nineteen SNP loci were amplified from 91 human genomic DNA samples by multiplex polymerase chain reaction (multiplex PCR). The amplicons were pooled, purified, and sequenced in a single run on the Ion Torrent PGM platform. The same loci were genotyped by LDR, and the LDR genotypes served as the reference standard for the NGS calls. The resulting thresholds were: sequencing depth ≥6X, with loci whose allele ratio falls within 15%-85% called heterozygous and loci outside that range called homozygous. These thresholds gave an accuracy of 99.6%; the remaining miscalls involved data whose allele ratio lay close to the threshold boundaries. For such data, combining the thresholds with clustering analysis raised the accuracy to 100%. The results provide an accurate, fast, and empirical threshold and workflow for calling SNP genotypes from NGS data.
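The thresholding rule described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function name `call_genotype`, the use of reference and alternate read counts as input, the definition of depth as their sum (a biallelic site), and the inclusive treatment of the 15% and 85% boundaries are all assumptions made for illustration; the abstract states only the depth cutoff and the ratio range. The clustering step applied to borderline data is not specified in the abstract and is not sketched here.

```python
from typing import Optional

# Thresholds reported in the abstract.
MIN_DEPTH = 6     # minimum sequencing depth (6X)
HET_LOW = 0.15    # lower bound of the heterozygous allele-ratio range
HET_HIGH = 0.85   # upper bound of the heterozygous allele-ratio range


def call_genotype(ref_reads: int, alt_reads: int) -> Optional[str]:
    """Call a biallelic SNP genotype from read counts (hypothetical helper).

    Returns None when depth is below the cutoff, so no call is made.
    Assumption: depth is ref_reads + alt_reads, and the range
    boundaries are treated as inclusive.
    """
    depth = ref_reads + alt_reads
    if depth < MIN_DEPTH:
        return None  # too few reads to call
    alt_ratio = alt_reads / depth
    if HET_LOW <= alt_ratio <= HET_HIGH:
        return "het"
    # Outside the heterozygous range: homozygous for the dominant allele.
    return "hom_alt" if alt_ratio > HET_HIGH else "hom_ref"
```

For example, `call_genotype(5, 5)` gives a heterozygous call, while `call_genotype(3, 2)` gives no call because the depth is only 5. Sites whose ratio sits near 0.15 or 0.85 are the ones the abstract flags as needing the additional clustering analysis.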
Authors
CHEN Ke (陈科) (a)
LI Kai (李凯) (b)
ZHOU Yuxun (周宇荀) (b)
XIAO Junhua (肖君华) (b)
(a. School of Environmental Science and Engineering; b. Institute of Biological Sciences and Biotechnology, Donghua University, Shanghai 201620, China)
Source
Journal of Donghua University (Natural Science) (《东华大学学报(自然科学版)》)
CSCD
PKU Core Journal (北大核心)
2017, No. 3, pp. 370-376 (7 pages)
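Funding
National Natural Science Foundation of China, General Program (31371257)
Science and Technology Commission of Shanghai Municipality, Key Project (14140900502)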