Introduction to Next-Generation Sequencing (NGS) Analysis of Transcription using RNA-Seq
  Dr. Robert Boissy
  SWH2048
  rboissy@unmc.edu

Outline
  NGS instruments and data analysis software
  RNA-Seq overview
  mRNA gene expression
  mRNA transcript isoforms
  mRNA special cases
  miRNA gene expression

NGS Instruments and software
  http://seqanswers.com/

  Glenn TC. (2011) Field guide to next-generation DNA sequencers. Mol Ecol Resour. 11(5):759-69.
  http://www.ncbi.nlm.nih.gov/pubmed/21592312
  http://onlinelibrary.wiley.com/doi/10.1111/j.1755-0998.2011.03024.x/abstract
  http://www.molecularecologist.com/next-gen-fieldguide/

RNA-Seq overview
  Li J, Witten DM, Johnstone IM, Tibshirani R. (2011) Normalization, testing, and false discovery rate estimation for RNA-sequencing data. Biostatistics. 2011 Oct 14.
  http://www.ncbi.nlm.nih.gov/pubmed/22003245

  McIntyre LM, Lopiano KK, Morse AM, Amin V, Oberg AL, Young LJ, Nuzhdin SV. (2011) RNA-seq: technical variability and sampling. BMC Genomics. 12:293.
  http://www.ncbi.nlm.nih.gov/pubmed/21645359

  Auer PL, Doerge RW. (2010) Statistical design and analysis of RNA sequencing data. Genetics 185:405-16.
  http://www.ncbi.nlm.nih.gov/pubmed?term=20439781

  Pevsner, J (2009) Bioinformatics and Functional Genomics Wiley-Blackwell pp. 333, 338, 350, 353, 354

mRNA gene expression
  Biostatistical expertise is essential
  Study design and power estimates need to be worked out before sequencing

  Li J, Witten DM, Johnstone IM, Tibshirani R. (2011) Normalization, testing, and false discovery rate estimation for RNA-sequencing data. Biostatistics. 2011 Oct 14.
  http://www.ncbi.nlm.nih.gov/pubmed/22003245

  McCarthy DJ, Chen Y, Smyth GK. (2012) Differential expression analysis of multifactor RNA-Seq experiments with respect to biological variation. Nucleic Acids Res. (Feb. 6)
  http://www.ncbi.nlm.nih.gov/pubmed/22287627

  Fang Z, Cui X. (2011) Design and validation issues in RNA-seq experiments. Brief Bioinform. 12(3):280-7.
  http://www.ncbi.nlm.nih.gov/pubmed/21498551

  Influence of sequencing depth
  Influence of mappability

  Mercer TR, Gerhardt DJ, Dinger ME, Crawford J, Trapnell C, Jeddeloh JA, Mattick JS, Rinn JL. (2011) Targeted RNA sequencing reveals the deep complexity of the human transcriptome. Nat Biotechnol. 30(1):99-104.
  http://www.ncbi.nlm.nih.gov/pubmed/22081020

  Roberts A, Pachter L. (2011) RNA-Seq and find: entering the RNA deep field. Genome Med. 3(11):74.
  http://www.ncbi.nlm.nih.gov/pubmed/22113004

  Derrien T, Estellé J, Marco Sola S, Knowles DG, Raineri E, Guigó R, Ribeca P. (2012) Fast computation and applications of genome mappability. PLoS One. 7(1):e30377.
  http://www.ncbi.nlm.nih.gov/pubmed/22276185

  Guidelines and reviews

  Standards, Guidelines and Best Practices for RNA-Seq V1.0 (June 2011) The ENCODE Consortium.
  http://encodeproject.org/ENCODE/protocols/dataStandards/ENCODE_RNAseq_Standards_V1.0.pdf

  Jiang L, Schlesinger F, Davis CA, Zhang Y, Li R, Salit M, Gingeras TR, Oliver B. (2011) Synthetic spike-in standards for RNA-seq experiments. Genome Res. 21(9):1543-51.
  http://www.ncbi.nlm.nih.gov/pubmed/21816910

  Garber M, Grabherr MG, Guttman M, Trapnell C. (2011) Computational methods for transcriptome annotation and quantification using RNA-seq. Nat Methods. 8(6):469-77.
  http://www.ncbi.nlm.nih.gov/pubmed/21623353

  Ramsköld D, Kavak E, Sandberg R. (2012) How to analyze gene expression using RNA-sequencing data. Methods Mol Biol. 802:259-74.
  http://www.ncbi.nlm.nih.gov/pubmed/22130886

mRNA transcript isoforms
  TopHat and related programs

  Trapnell C, Pachter L, Salzberg SL. (2009) TopHat: discovering splice junctions with RNA-Seq. Bioinformatics 25(9):1105-11.
  http://www.ncbi.nlm.nih.gov/pubmed/19289445
  http://tophat.cbcb.umd.edu/

  Langmead B, Hansen KD, Leek JT. (2010) Cloud-scale RNA-sequencing differential expression analysis with Myrna. Genome Biol. 11(8):R83.
  http://www.ncbi.nlm.nih.gov/pubmed/20701754
  http://bowtie-bio.sourceforge.net/myrna/index.shtml

  Kim D, Salzberg SL. (2011) TopHat-Fusion: an algorithm for discovery of novel fusion transcripts. Genome Biol. 12(8):R72.
  http://www.ncbi.nlm.nih.gov/pubmed/21835007
  http://tophat-fusion.sourceforge.net/

  Trapnell C, Williams BA, Pertea G, Mortazavi A, Kwan G, van Baren MJ, Salzberg SL, Wold BJ, Pachter L. (2010) Transcript assembly and quantification by RNA-Seq reveals unannotated transcripts and isoform switching during cell differentiation. Nat Biotechnol. 28(5):511-5.
  http://www.ncbi.nlm.nih.gov/pubmed/20436464
  http://cufflinks.cbcb.umd.edu/

  Roberts A, Pimentel H, Trapnell C, Pachter L. (2011) Identification of novel transcripts in annotated genomes using RNA-Seq. Bioinformatics 27(17):2325-9.
  http://www.ncbi.nlm.nih.gov/pubmed/21697122

  Roberts A, Trapnell C, Donaghey J, Rinn JL, Pachter L. (2011) Improving RNA-Seq expression estimates by correcting for fragment bias. Genome Biol. 12(3):R22.
  http://www.ncbi.nlm.nih.gov/pubmed/21410973

mRNA special cases
  Non-polyadenylated transcripts
  Nascent transcripts + co-transcriptional splicing
  Circular transcripts
  A to I editing

  Yang L, Duff MO, Graveley BR, Carmichael GG, Chen LL. (2011) Genomewide characterization of non-polyadenylated RNAs. Genome Biol. 12(2):R16.
  http://www.ncbi.nlm.nih.gov/pubmed/21324177

  Ameur A, Zaghlool A, Halvardson J, Wetterbom A, Gyllensten U, Cavelier L, Feuk L. (2011) Total RNA sequencing reveals nascent transcription and widespread co-transcriptional splicing in the human brain. Nat Struct Mol Biol. 18(12):1435-40.
  http://www.ncbi.nlm.nih.gov/pubmed/22056773

  Salzman J, Gawad C, Wang PL, Lacayo N, Brown PO. (2012) Circular RNAs Are the Predominant Transcript Isoform from Hundreds of Human Genes in Diverse Cell Types. PLoS One. 7(2):e30733.
  http://www.ncbi.nlm.nih.gov/pubmed/22319583

  Bahn JH, Lee JH, Li G, Greer C, Peng G, Xiao X. (2011) Accurate identification of A-to-I RNA editing in human by transcriptome sequencing. Genome Res. 22(1):142-50.
  http://www.ncbi.nlm.nih.gov/pubmed/21960545

miRNA gene expression
  “Small RNA-Seq” is also very important
  For a recent review see:

  Preethi H. Gunaratne, Cristian Coarfa, Benjamin Soibam and Arpit Tandon (2012) miRNA Data Analysis: Next-Gen Sequencing. In: Next-generation MicroRNA expression profiling technology, Fan, J.B. (Ed.) Methods in Molecular Biology, Vol. 822, 273-288, DOI: 10.1007/978-1-61779-427-8_19
  http://www.springerlink.com/content/p21116x3572581r7/#section=999898&page=1

Review
  NGS instruments and data analysis software
  RNA-Seq overview
  mRNA gene expression
  mRNA transcript isoforms
  mRNA special cases
  miRNA gene expression
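
Supplementary sketch: influence of sequencing depth
  As an aside to the "Influence of sequencing depth" point under mRNA gene expression, the short Python sketch below estimates how likely a low-abundance transcript is to collect a minimum number of reads at a given depth. It is not taken from the slides or from any of the cited papers. It assumes reads are sampled independently, so that the read count for one transcript is approximately Poisson with mean depth x fraction; real libraries typically show extra biological and technical variability, and mappability is ignored. The function name, parameter names, and the example values (a transcript contributing 1 read in 10 million, a threshold of 5 reads) are hypothetical and chosen only for illustration.

```python
import math


def detection_probability(depth, fraction, min_reads=1):
    """P(at least min_reads reads hit one transcript), assuming Poisson sampling.

    depth:     total number of mapped reads (hypothetical parameter name)
    fraction:  share of all reads expected to come from the transcript
    min_reads: smallest read count treated as "detected" (an assumption)
    """
    lam = depth * fraction  # expected read count for this transcript
    # Sum P(X = 0) .. P(X = min_reads - 1), then take the complement.
    below = sum(math.exp(-lam) * lam ** k / math.factorial(k)
                for k in range(min_reads))
    return 1.0 - below


if __name__ == "__main__":
    # Illustrative depths only; not values recommended by the source.
    for depth in (1_000_000, 10_000_000, 100_000_000):
        p = detection_probability(depth, fraction=1e-7, min_reads=5)
        print(f"{depth:>11,d} reads: P(>=5 reads) = {p:.3f}")
```

  Under these assumptions the probability rises steeply with depth for rare transcripts, which is one reason depth and power estimates are best settled before sequencing. A proper power analysis would follow the statistical design references listed above, not this simplified model.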