Improved Transcriptome Expression Analysis for RNA-Seq Data (cited by: 3)
Abstract: RNA-Seq (RNA sequencing), based on high-throughput sequencing, is a new technique for transcriptome research. To address two difficulties in transcript expression analysis with this technique, namely multi-mapping of reads and the non-uniform distribution of reads, an improved method, LDASeqII (improvement of latent Dirichlet allocation for sequencing data), is proposed. The model uses the known gene-isoform annotation (the structure of the splice isoforms) to constrain its hyperparameters and normalizes the read count of each exon by exon length, which resolves multi-mapping between reads and isoforms under a non-uniform read distribution along the reference. "Pseudo-exons" and "pseudo-transcripts" are introduced to handle junction reads and noise reads, respectively. LDASeqII is validated on two real datasets for gene and transcript expression calculation and compared with the original LDASeq (latent Dirichlet allocation for sequencing data) model and two popular methods, Cufflinks and RSEM (RNA-Seq by expectation maximization). The results show that LDASeqII obtains more accurate transcript and gene expression measurements than the other approaches.
Source: Journal of Data Acquisition and Processing (《数据采集与处理》), CSCD, Peking University Core Journal, 2015, No. 5, pp. 1028-1035 (8 pages)
Funding: National Natural Science Foundation of China (61170152); Fundamental Research Funds for the Central Universities (CXZZ11_0217)
Keywords: gene expression; RNA-Seq; transcript expression; multi-mapping; non-uniformity
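
The abstract says that read counts are normalized by exon length for each individual exon, but this record does not give the formula. The following is only a minimal sketch of that idea, assuming a plain reads-per-kilobase scaling; the function name `normalize_exon_counts` and the scaling factor of 1000 are illustrative assumptions, not the paper's actual procedure.

```python
def normalize_exon_counts(read_counts, exon_lengths):
    """Scale each exon's read count by its length (reads per kilobase).

    Assumption: simple per-kilobase scaling. The normalization used in
    LDASeqII itself may differ; this only illustrates the general idea.
    """
    if len(read_counts) != len(exon_lengths):
        raise ValueError("read_counts and exon_lengths must have the same length")
    normalized = []
    for count, length in zip(read_counts, exon_lengths):
        if length <= 0:
            raise ValueError("exon lengths must be positive")
        # Reads per 1000 bases, so long and short exons become comparable.
        normalized.append(count * 1000.0 / length)
    return normalized


# Hypothetical example: two exons of 1000 bp and 250 bp.
print(normalize_exon_counts([100, 50], [1000, 250]))  # [100.0, 200.0]
```

After scaling, the shorter exon shows the higher read density even though its raw count is lower, which is the effect a length normalization is meant to expose.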

References (21)

  • 1. Wang Z, Gerstein M, Snyder M. RNA-Seq: A revolutionary tool for transcriptomics[J]. Nature Reviews Genetics, 2009, 10(1): 57-63.
  • 2. Denoeud F, Aury J M, Da Silva C, et al. Annotating genomes with massive-scale RNA sequencing[J]. Genome Biol, 2008, 9(12): R175.
  • 3. Garber M, Grabherr M G, Guttman M, et al. Computational methods for transcriptome annotation and quantification using RNA-Seq[J]. Nature Methods, 2011, 8(6): 469-477.
  • 4. Marguerat S, Bähler J. RNA-seq: From technology to biology[J]. Cell Mol Life Sci, 2010, 67: 569-579.
  • 5. Mortazavi A, Williams B A, McCue K, et al. Mapping and quantifying mammalian transcriptomes by RNA-Seq[J]. Nature Methods, 2008, 5(7): 621-628.
  • 6. Pan Q, Shai O, Lee L J, et al. Deep surveying of alternative splicing complexity in the human transcriptome by high-throughput sequencing[J]. Nature Genetics, 2008, 40(12): 1413-1415.
  • 7. Turro E, Su S Y, Goncalves A, et al. Haplotype and isoform specific expression estimation using multi-mapping RNA-Seq reads[J]. Genome Biology, 2011, 12: R13.
  • 8. Jiang Hui, Wong Winghung. Statistical inferences for isoform expression in RNA-Seq[J]. Bioinformatics, 2009, 25(8): 1026-1032.
  • 9. Trapnell C, Williams B A, Pertea G, et al. Transcript assembly and quantification by RNA-Seq reveals unannotated transcripts and isoform switching during cell differentiation[J]. Nat Biotechnol, 2010, 28(5): 511-515.
  • 10. Li B, Ruotti V, Stewart R M, et al. RNA-Seq gene expression estimation with read mapping uncertainty[J]. Bioinformatics, 2010, 26(4): 493-500.
