The statistical power of k-mer based aggregative statistics for alignment-free detection of horizontal gene transfer

Abstract: Alignment-based database search and sequence comparison are commonly used to detect horizontal gene transfer (HGT). However, with the rapid increase in sequencing depth, hundreds of thousands of contigs are routinely assembled in metagenomics studies, which challenges alignment-based HGT analysis by overwhelming the known reference sequences. Detecting HGT with k-mer statistics thus becomes an attractive alternative. These alignment-free statistics have demonstrated high performance and efficiency in whole-genome and transcriptome comparisons. To adapt k-mer statistics for HGT detection, we developed two aggregative statistics, T^(S)_(sum) and T^(*)_(sum), which subsample metagenome contigs by their representative regions and summarize the regional D^(S)_(2) and D^(*)_(2) metrics by their upper bounds. We systematically studied the power of the aggregative statistics at different k-mer sizes using simulations. Our analysis showed that, in general, the power of T^(S)_(sum) and T^(*)_(sum) increases with sequencing coverage and reaches a maximum power above 80% at k = 6, with a 5% Type I error rate and a coverage ratio above 0.2x. The statistical power of T^(S)_(sum) and T^(*)_(sum) was evaluated with realistic simulations of HGT mechanism, sequencing depth, read length, and base error. We expect these statistics to be useful distance metrics for identifying HGT in metagenomic studies.
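
For readers unfamiliar with the regional metrics named in the abstract, the following is a minimal Python sketch of the pairwise D^(S)_(2) and D^(*)_(2) statistics as they are commonly defined in the alignment-free literature. It is not the authors' implementation. The function names, the i.i.d. background model, and the pooling of letter frequencies across both sequences are assumptions made for illustration. The aggregation into T^(S)_(sum) and T^(*)_(sum) is not shown, because the abstract does not define it.

```python
from collections import Counter
from itertools import product
from math import sqrt

ALPHABET = "ACGT"


def kmer_counts(seq, k):
    """Count overlapping k-mers that consist only of A, C, G, T."""
    seq = seq.upper()
    counts = Counter()
    for i in range(len(seq) - k + 1):
        word = seq[i:i + k]
        if all(c in ALPHABET for c in word):
            counts[word] += 1
    return counts


def d2_statistics(seq_a, seq_b, k):
    """Return (D2S, D2star) for two sequences.

    Assumption: an i.i.d. background model whose letter frequencies are
    estimated from both sequences pooled together.
    """
    x, y = kmer_counts(seq_a, k), kmer_counts(seq_b, k)
    n, m = sum(x.values()), sum(y.values())
    if n == 0 or m == 0:
        raise ValueError("both sequences must contain at least one valid k-mer")

    # Pooled letter frequencies (illustrative choice of background model).
    letters = Counter(c for c in (seq_a + seq_b).upper() if c in ALPHABET)
    total = sum(letters.values())
    freq = {c: letters[c] / total for c in ALPHABET}

    d2s = 0.0
    d2star = 0.0
    for word in map("".join, product(ALPHABET, repeat=k)):
        # Expected probability of the word under the i.i.d. model.
        p_w = 1.0
        for c in word:
            p_w *= freq[c]
        if p_w == 0.0:
            continue  # word cannot occur under this background model
        # Centered counts: observed minus expected.
        xc = x[word] - n * p_w
        yc = y[word] - m * p_w
        norm = sqrt(xc * xc + yc * yc)
        if norm > 0.0:
            d2s += xc * yc / norm
        d2star += xc * yc / (sqrt(n * m) * p_w)
    return d2s, d2star
```

Calling `d2_statistics(contig_region, reference_region, 6)` on two hypothetical DNA strings would return both values at the k-mer size the abstract reports as most powerful. Larger values indicate more similar k-mer composition.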

Authors: Guan-Da Huang, Xue-Mei Liu, Tian-Lai Huang, Li C. Xia

Affiliations: School of Physics and Optoelectronics; Department of Medicine

Source: Synthetic and Systems Biotechnology (SCIE), 2019, Issue 3, pp. 150-156 (7 pages). Chinese journal title: 合成和系统生物技术（英文）

Funding: L.C.X. was supported by the Innovation in Cancer Informatics Fund.

Keywords: Alignment-free sequence comparison; k-mer; Horizontal gene transfer; Statistical power

Classification code: F42 [Economics and Management: Industrial Economics]
