Analysis of Transcription using RNA-Seq

Introduction to
Next-Generation Sequencing (NGS)
Analysis of Transcription using RNA-Seq
Dr. Robert Boissy
SWH2048
rboissy@unmc.edu
Outline

NGS instruments and data analysis software

RNA-Seq overview

mRNA gene expression

mRNA transcript isoforms

mRNA special cases

miRNA gene expression
NGS Instruments and software
http://seqanswers.com/
Glenn TC. (2011) Field guide to
next-generation DNA sequencers.
Mol Ecol Resour. 11(5):759-69.
http://www.ncbi.nlm.nih.gov/pubmed/21592312
http://onlinelibrary.wiley.com/doi/10.1111/j.1755-0998.2011.03024.x/abstract
http://www.molecularecologist.com/next-gen-fieldguide/
RNA-Seq overview
Li J, Witten DM, Johnstone IM, Tibshirani R. (2011) Normalization, testing, and false
discovery rate estimation for RNA-sequencing data. Biostatistics. 2011 Oct 14.
http://www.ncbi.nlm.nih.gov/pubmed/22003245
McIntyre LM, Lopiano KK, Morse AM, Amin V, Oberg AL, Young LJ, Nuzhdin SV.
(2011) RNA-seq: technical variability and sampling. BMC Genomics. 12:293.
http://www.ncbi.nlm.nih.gov/pubmed/21645359
RNA-Seq overview
Auer PL, Doerge RW. (2010) Statistical design and analysis of RNA sequencing data.
Genetics 185:405-16. http://www.ncbi.nlm.nih.gov/pubmed?term=20439781
RNA-Seq overview
Pevsner, J (2009) Bioinformatics and Functional Genomics Wiley-Blackwell p. 333
RNA-Seq overview
Pevsner, J (2009) Bioinformatics and Functional Genomics Wiley-Blackwell p. 338
Pevsner, J (2009) Bioinformatics and Functional Genomics Wiley-Blackwell p. 350
RNA-Seq overview
Pevsner, J (2009) Bioinformatics and Functional Genomics Wiley-Blackwell p. 353
RNA-Seq overview
Pevsner, J (2009) Bioinformatics and Functional Genomics Wiley-Blackwell p. 354
mRNA gene expression

Biostatistical expertise is essential

Study design and power estimates need to be
worked out before sequencing
Li J, Witten DM, Johnstone IM, Tibshirani R. (2011) Normalization, testing, and false
discovery rate estimation for RNA-sequencing data. Biostatistics. 2011 Oct 14.
http://www.ncbi.nlm.nih.gov/pubmed/22003245
McCarthy DJ, Chen Y, Smyth GK. (2012) Differential expression analysis of
multifactor RNA-Seq experiments with respect to biological variation. Nucleic Acids
Res. (Feb. 6) http://www.ncbi.nlm.nih.gov/pubmed/22287627
Fang Z, Cui X. (2011) Design and validation issues in RNA-seq experiments. Brief
Bioinform. 12(3):280-7. http://www.ncbi.nlm.nih.gov/pubmed/21498551
mRNA gene expression

Influence of sequencing depth

Influence of mappability
Mercer TR, Gerhardt DJ, Dinger ME, Crawford J, Trapnell C, Jeddeloh JA, Mattick JS,
Rinn JL. (2011) Targeted RNA sequencing reveals the deep complexity of the
human transcriptome. Nat Biotechnol. 30(1):99-104.
http://www.ncbi.nlm.nih.gov/pubmed/22081020
Roberts A, Pachter L. (2011) RNA-Seq and find: entering the RNA deep field.
Genome Med. 3(11):74. http://www.ncbi.nlm.nih.gov/pubmed/22113004
Derrien T, Estellé J, Marco Sola S, Knowles DG, Raineri E, Guigó R, Ribeca P. (2012)
Fast computation and applications of genome mappability. PLoS One. 7(1):e30377.
http://www.ncbi.nlm.nih.gov/pubmed/22276185
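Sequencing depth affects raw read counts directly, so counts are usually rescaled before samples or genes are compared. The sketch below shows two common rescalings: counts per million (CPM), and reads per kilobase per million mapped reads (RPKM). The gene names, counts, and lengths are made-up illustrative values, not data from the studies cited above. Where mappability matters, one option (an assumption here, not a method taken from the cited papers) is to pass the uniquely mappable length of each gene instead of its full length.

```python
def cpm(counts):
    """Scale raw read counts to counts per million mapped reads."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no mapped reads")
    return {gene: n * 1e6 / total for gene, n in counts.items()}


def rpkm(counts, lengths_bp):
    """Reads per kilobase of transcript per million mapped reads.

    lengths_bp maps each gene to a length in base pairs (hypothetical
    input; an 'effective' or mappable length could be used instead).
    """
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no mapped reads")
    return {
        gene: n * 1e9 / (total * lengths_bp[gene])
        for gene, n in counts.items()
    }


# Illustrative values only (hypothetical genes).
example_counts = {"geneA": 500, "geneB": 1500, "geneC": 8000}
example_lengths = {"geneA": 1000, "geneB": 3000, "geneC": 2000}

example_cpm = cpm(example_counts)
example_rpkm = rpkm(example_counts, example_lengths)
```

In this toy example geneB has three times the reads of geneA but is also three times as long, so the two genes end up with the same RPKM even though their CPM values differ.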
mRNA gene expression

Guidelines and reviews
Standards, Guidelines and Best Practices for RNA-Seq V1.0 (June 2011) The
ENCODE Consortium.
http://encodeproject.org/ENCODE/protocols/dataStandards/ENCODE_RNAseq_Standards_V1.0.pdf
Jiang L, Schlesinger F, Davis CA, Zhang Y, Li R, Salit M, Gingeras TR, Oliver B. (2011)
Synthetic spike-in standards for RNA-seq experiments. Genome Res. 21(9):1543-51.
http://www.ncbi.nlm.nih.gov/pubmed/21816910
Garber M, Grabherr MG, Guttman M, Trapnell C. (2011) Computational methods for
transcriptome annotation and quantification using RNA-seq. Nat Methods.
8(6):469-77. http://www.ncbi.nlm.nih.gov/pubmed/21623353
Ramsköld D, Kavak E, Sandberg R. (2012) How to analyze gene expression using
RNA-sequencing data. Methods Mol Biol. 802:259-74.
http://www.ncbi.nlm.nih.gov/pubmed/22130886
mRNA transcript isoforms

TopHat and related programs
Trapnell C, Pachter L, Salzberg SL. (2009) TopHat: discovering splice junctions with RNA-Seq. Bioinformatics 25(9):1105-11.
http://www.ncbi.nlm.nih.gov/pubmed/19289445 http://tophat.cbcb.umd.edu/
Langmead B, Hansen KD, Leek JT. (2010) Cloud-scale RNA-sequencing differential expression analysis with Myrna. Genome
Biol. 11(8):R83. http://www.ncbi.nlm.nih.gov/pubmed/20701754 http://bowtie-bio.sourceforge.net/myrna/index.shtml
Kim D, Salzberg SL. (2011) TopHat-Fusion: an algorithm for discovery of novel fusion transcripts. Genome Biol. 12(8):R72.
http://www.ncbi.nlm.nih.gov/pubmed/21835007 http://tophat-fusion.sourceforge.net/
Trapnell C, Williams BA, Pertea G, Mortazavi A, Kwan G, van Baren MJ, Salzberg SL, Wold BJ, Pachter L. (2010) Transcript
assembly and quantification by RNA-Seq reveals unannotated transcripts and isoform switching during cell differentiation.
Nat Biotechnol. 28(5):511-5. http://www.ncbi.nlm.nih.gov/pubmed/20436464 http://cufflinks.cbcb.umd.edu/
Roberts A, Pimentel H, Trapnell C, Pachter L. (2011) Identification of novel transcripts in annotated genomes using RNA-Seq.
Bioinformatics 27(17):2325-9. http://www.ncbi.nlm.nih.gov/pubmed/21697122
Roberts A, Trapnell C, Donaghey J, Rinn JL, Pachter L. (2011) Improving RNA-Seq expression estimates by correcting for
fragment bias. Genome Biol. 12(3):R22. http://www.ncbi.nlm.nih.gov/pubmed/21410973
mRNA special cases

Non-polyadenylated transcripts

Nascent transcripts + co-transcriptional splicing

Circular transcripts

A to I editing
Yang L, Duff MO, Graveley BR, Carmichael GG, Chen LL. (2011) Genomewide
characterization of non-polyadenylated RNAs. Genome Biol. 12(2):R16.
http://www.ncbi.nlm.nih.gov/pubmed/21324177
Ameur A, Zaghlool A, Halvardson J, Wetterbom A, Gyllensten U, Cavelier L,
Feuk L. (2011) Total RNA sequencing reveals nascent transcription and
widespread co-transcriptional splicing in the human brain. Nat Struct Mol
Biol. 18(12):1435-40. http://www.ncbi.nlm.nih.gov/pubmed/22056773
mRNA special cases
Salzman J, Gawad C, Wang PL, Lacayo N, Brown PO. (2012) Circular RNAs Are
the Predominant Transcript Isoform from Hundreds of Human Genes in
Diverse Cell Types. PLoS One. 7(2):e30733.
http://www.ncbi.nlm.nih.gov/pubmed/22319583
mRNA special cases
Bahn JH, Lee JH, Li G, Greer C, Peng G, Xiao X. (2011) Accurate identification
of A-to-I RNA editing in human by transcriptome sequencing. Genome Res.
22(1):142-50. http://www.ncbi.nlm.nih.gov/pubmed/21960545
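Inosine is read as guanosine during sequencing, so A-to-I editing shows up in RNA-Seq data as A-to-G mismatches against the reference. The sketch below tallies such mismatches at reference A positions. It is a simplified illustration, not the method of Bahn et al.: it assumes ungapped forward-strand alignments given as (start, sequence) pairs, and the function name, the min_support threshold, and the toy sequences are all hypothetical. A real analysis would also need to exclude genomic SNPs, mapping errors, and sequencing errors.

```python
from collections import Counter


def candidate_editing_sites(reference, aligned_reads, min_support=2):
    """Return {position: (g_reads, total_reads)} for reference 'A' sites
    where at least min_support reads show a 'G'.

    aligned_reads: iterable of (start, sequence) pairs, 0-based,
    ungapped, forward strand (a simplifying assumption).
    """
    coverage = Counter()   # reads covering each reference A
    g_support = Counter()  # reads showing G at each reference A
    for start, seq in aligned_reads:
        for offset, base in enumerate(seq):
            pos = start + offset
            if pos >= len(reference):
                break
            if reference[pos] != "A":
                continue
            coverage[pos] += 1
            if base == "G":
                g_support[pos] += 1
    return {
        pos: (g_support[pos], coverage[pos])
        for pos in sorted(g_support)
        if g_support[pos] >= min_support
    }


# Toy data (hypothetical): three of four reads show G at reference A, position 4.
toy_reference = "ACGTACGTAC"
toy_reads = [(0, "ACGTG"), (2, "GTGCGT"), (4, "GCGTAC"), (4, "ACGTAC")]
toy_sites = candidate_editing_sites(toy_reference, toy_reads)
```

Reporting the G count alongside total coverage keeps the editing level at each site (here 3 of 4 reads) available for downstream filtering.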
miRNA gene expression

“Small RNA-Seq” is also very important
For a recent review see:
Preethi H. Gunaratne, Cristian Coarfa, Benjamin Soibam and Arpit Tandon (2012)
miRNA Data Analysis: Next-Gen Sequencing. In: Next-generation MicroRNA
expression profiling technology, Fan, J.B. (Ed.)
Methods in Molecular Biology, Vol. 822, 273-288, DOI: 10.1007/978-1-61779-427-8_19
http://www.springerlink.com/content/p21116x3572581r7/#section=999898&page=1
Review

NGS instruments and data analysis software

RNA-Seq overview

mRNA gene expression

mRNA transcript isoforms

mRNA special cases

miRNA gene expression