RNA-Seq Data Expression Analysis Based on Smoothed LDA (cited by: 1)
Abstract: RNA-Seq is an important technique for transcriptome research. RNA-Seq data analysis must deal with reads that map to multiple isoforms (multi-mapping), the non-uniform distribution of reads along the reference sequence, the sparse distribution of exons in some transcripts, and reads that span exon junctions. To address these problems, this paper proposes sLDASeq, a new model for calculating gene and transcript expression. The model uses the known gene-isoform annotation to constrain its parameters and allocates junction-spanning reads according to exon length, which handles the non-uniform read distribution and the junction reads. An additional hyper-parameter is added to the model to address exon sparsity. sLDASeq is applied to three real datasets and compared with LDASeq and other popular methods; the results show that it obtains more accurate gene and transcript expression estimates than those methods.
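The two ideas named in the abstract, length-proportional allocation of junction-spanning reads and smoothing of sparse exon counts with an extra hyper-parameter, can be illustrated with a minimal Python sketch. This is not the sLDASeq implementation: the function names, the input layout, and the symbol `eta` are hypothetical, and the smoothing shown is the generic additive (Dirichlet-style) form used in smoothed LDA, which may differ from the exact form in the paper.

```python
from typing import Dict, List


def allocate_junction_reads(count: float, exon_lengths: Dict[str, int]) -> Dict[str, float]:
    """Split a junction-spanning read count across the exons it touches,
    in proportion to exon length (hypothetical helper, not from the paper)."""
    total = sum(exon_lengths.values())
    if total <= 0:
        raise ValueError("exon lengths must sum to a positive value")
    return {exon: count * length / total for exon, length in exon_lengths.items()}


def smoothed_proportions(counts: List[float], eta: float) -> List[float]:
    """Additive smoothing: (n_i + eta) / (N + K * eta).

    With eta > 0, exons with zero observed reads still receive a small
    non-zero proportion. `eta` stands in for the extra hyper-parameter
    (an assumption about its role)."""
    if eta < 0:
        raise ValueError("eta must be non-negative")
    k = len(counts)
    denom = sum(counts) + k * eta
    if denom <= 0:
        raise ValueError("counts and eta cannot all be zero")
    return [(c + eta) / denom for c in counts]


# Example: 30 reads span the junction between a 200 bp and a 100 bp exon.
shares = allocate_junction_reads(30, {"exon1": 200, "exon2": 100})

# Example: three exons, one with no reads, smoothed with eta = 1.
props = smoothed_proportions([0, 3, 7], eta=1.0)
```

In the first example the longer exon receives two thirds of the junction reads; in the second, the exon with zero reads ends up with proportion 1/13 instead of 0.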
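Source: Journal of Frontiers of Computer Science and Technology (《计算机科学与探索》), indexed in CSCD and the Peking University Core Journals list, 2016, Issue 3, pp. 381-388 (8 pages)
Funding: National Natural Science Foundation of China (No. 61170152); Qing Lan Project of Jiangsu Province; Fundamental Research Funds for the Central Universities (No. CXZZ11_0217)
Keywords: RNA-Seq; gene and transcript expression; smoothed LDA; exon-junction; multi-mapping; non-uniformity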