BIOL 695 003, Next Generation Sequencing

Next-Generation Sequencing: Applications for Human Health and Agriculture
Dr. Harsha K Rajasimha
Fall 2013
1 Why this course?
Next-Generation Sequencing (NGS) is revolutionizing the way molecular biology research is conducted.
The cost of sequencing a megabase of DNA has come down from almost $8K in 2001 to about $0.08 in
2013. That is a reduction of five orders of magnitude in 12 years. This has resulted in the generation of
an unprecedented amount of sequencing data from various platforms (dominated by Illumina) for basic
research, understanding disease, human health applications, pharmacogenomics, and agricultural
applications for engineering better seeds and traits. Personal genomics offers huge potential and promise
for early diagnosis and for pharmacogenomics-based drug repurposing or companion diagnostics.
Algorithms for analyzing genome sequence data are maturing fast and offering better performance or
accuracy. Open-source as well as commercial tools and databases are being developed, and the solutions
are moving fast towards commoditization on the cloud. However, computational advances are not
keeping pace with the speed of data generation (driven in part by cost reduction and potential), and
significant computational challenges remain a hurdle to more widespread adoption of NGS-based
applications.
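As a quick check of the cost figures quoted above, the following minimal Python sketch recomputes the fold reduction and the number of orders of magnitude. The halving-time estimate is an illustrative addition that assumes the cost fell at a constant exponential rate, which the text does not claim; the variable names are hypothetical.

```python
import math

# Per-megabase sequencing costs as quoted in the text (USD).
cost_2001 = 8000.0  # "almost $8K" in 2001
cost_2013 = 0.08    # "about $0.08" in 2013

# Fold reduction and its base-10 logarithm (orders of magnitude).
fold_reduction = cost_2001 / cost_2013
orders_of_magnitude = math.log10(fold_reduction)

# ASSUMPTION (not from the text): a constant exponential decline,
# used only to express the drop as an average halving time.
years = 2013 - 2001
halving_time_years = years * math.log(2) / math.log(fold_reduction)

print(f"Fold reduction: {fold_reduction:,.0f}x")
print(f"Orders of magnitude: {orders_of_magnitude:.1f}")
print(f"Average halving time: {halving_time_years:.2f} years")
```

Under that assumption the cost halved roughly every nine months on average, which is one way to see why data generation is outrunning computational capacity.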
This course aims to cover recent advances in NGS technologies, tools, algorithms, and databases and
to discuss various applications of NGS. We will discuss high-quality as well as speculative research
articles on NGS data management, analysis, and interpretation. Students will gain an understanding of the
various challenges and research opportunities to pursue in genomics and its applications.
This is a graduate-level course in an interdisciplinary area. Students are expected to have varying levels of
theoretical understanding of biochemistry, molecular biology, plant biology, human disease, genetics,
classical statistics, probability theory, machine learning, and computer science; however, proficiency is
not required. The students are also expected to understand the questions raised by the papers listed below.
However, the students should not be discouraged if they do not completely understand all the papers.
The selection of papers covers a large area in the hope that each student will find something particularly
interesting for her. The list of papers is flexible: with the instructor’s permission the students may
substitute other papers for the ones listed here. The students are encouraged to propose papers related to
their research interests. The course encourages active participation and discussion.
2 Format
Course format:
1. Introductory lecture (3 hours).
2. Presentations by students based on the recommended literature list (four 3-hour meetings).
3. Final exam (take home).
I will be available for consultations during the semester. If you are stuck, I am here to help.
3 Grading
The grade is calculated as the sum of:
Presentation: 60%
Participation in the discussion: 30%
Final Exam (take home): 50%
The sum here is larger than 100%: this is by design. There are many ways to get a good grade; you can
choose the one that suits you best. If a student publishes or submits a paper on NGS or one of its
applications, bonus points are awarded.
4 Presentation Topics
Each seminar day includes three or four presentations on closely related topics. Aim for about 30–40 minutes
(including questions). The students are expected to participate in the discussion and fill out the presentation
evaluation forms. The list of topics below is large. We are not going to cover all of them. Instead, you
may select the papers and topics that are of interest to you. Moreover, I may approve
presenting a paper not on the list if you are interested in it and request this beforehand. All papers here are
available free of charge to GMU students through the GMU databases.
4.1 Next-Generation Sequencing
Fundamentals of sequencing
4.2 Applications of NGS
- Exome-seq
- Whole Genome-seq
- RNA-seq
- Small RNA-seq
- ChIP-Seq
- Methylome-seq
4.3 NGS applications in Biomedical Research
- Disease causing mutation detection
- Biomarker discovery
- Gene Expression Studies
- Integrative Genomics Studies
4.4 NGS applications in Agriculture
- Challenges in plant genomics
- Genotyping by Sequencing
- Seeds & Traits
References (approximate list of papers to be reviewed)
GENERAL READING TO UNDERSTAND THE FIELD:
1) Gonzaludo et al. HGV2012: Leveraging Next-Generation Technology and Large Datasets to
Advance Disease Research. Human Mutation (HGV meeting report)
2) Hennekam et al. Next-Generation Sequencing Demands Next-Generation Phenotyping. Human
Mutation (HGV meeting report)
3) Wilson Sayres et al. HGV2011: Personalized Genomic Medicine Meets the Incidentalome.
Human Mutation (HGV meeting report)
4) Xia et al. NGS Catalog: A Database of Next Generation Sequencing Studies in Humans.
Human Mutation, Database in Brief 33: E2341-E2355 (2012), online
5) Oetting WS. Exome and Genome Analysis as a Tool for Disease Identification and Treatment:
The 2011 Human Genome Variation Society Scientific Meeting. Human Mutation (HGV
meeting report)
6) Lim et al. Survey of the Applications of NGS to Whole-Genome Sequencing and Expression
Profiling. Genomics & Informatics. Vol. 10(1) 1-8, March 2012
7) Kim D, Pertea G, Trapnell C, Pimentel H, Kelley R, Salzberg SL. TopHat2: accurate alignment
of transcriptomes in the presence of insertions, deletions and gene fusions. Genome Biology
2013, 14:R36
8) Trapnell C, Hendrickson D, Sauvageau M, Goff L, Rinn JL, Pachter L. Differential analysis of
gene regulation at transcript resolution with RNA-seq. Nature Biotechnology
9) Dobin A et al. STAR: ultrafast universal RNA-seq aligner. Bioinformatics 2012
10) McKenna A, et al (2010). The Genome Analysis Toolkit: a MapReduce framework for
analyzing next-generation DNA sequencing data. Genome Res. 20:1297-303
11) Wang K, Li M, Hakonarson H. ANNOVAR: Functional annotation of genetic variants from
next-generation sequencing data Nucleic Acids Research, 38:e164, 2010
12) Zhang et al. Model-based Analysis of ChIP-Seq (MACS). Genome Biol (2008) vol. 9 (9) pp.
R137
13) Li, B. and Dewey, C. N. RSEM: accurate transcript quantification from RNA-Seq data with or
without a reference genome. BMC Bioinformatics 2011, 12:323
14) Roy Ronen; Ido Gan; Shira Modai; Alona Sukacheov; Gideon Dror; Eran Halperin; Noam
Shomron. miRNAkey: a software for microRNA Deep Sequencing analysis. Bioinformatics
2010
15) Bormann Chung CA, Boyd VL, McKernan KJ, Fu Y, Monighetti C, et al. (2010) Whole
Methylome Analysis by Ultra-Deep Sequencing Using Two-Base Encoding. PLoS ONE 5(2):
e9320
NGS for DISEASE DIAGNOSTICS
1) Kingsmore and Saunders. Deep Sequencing of Patient Genomes for Disease Diagnosis: When
Will It Become Routine? Sci Transl Med 15 June 2011 Vol 3 Issue 87 87ps23
2) Tracy J. Dixon-Salazar et al. Exome Sequencing Can Improve Diagnosis and Alter Patient
Management. Sci Transl Med 4, 138ra78 (2012)
3) Calvo S et al. Molecular Diagnosis of Infantile Mitochondrial Disease with Targeted
Next-Generation Sequencing. Sci Transl Med 4, 118ra10 (2012)
4) Carol Jean Saunders et al. Rapid Whole-Genome Sequencing for Genetic Disease Diagnosis in
Neonatal Intensive Care Units. Sci Transl Med 4, 154ra135 (2012)
5) Jacob O. Kitzman et al. Noninvasive Whole-Genome Sequencing of a Human Fetus. Sci Transl
Med 4, 137ra76 (2012)
NGS for UNDERSTANDING OF CANCER
6) Campbell P et al. Subclonal phylogenetic structures in cancer revealed by ultra-deep sequencing.
PNAS, September 2, 2008, vol. 105, no. 35, 13081–13086
7) David Wu et al. High-Throughput Sequencing Detects Minimal Residual Disease in Acute T
Lymphoblastic Leukemia. Sci Transl Med 4, 134ra63 (2012)
8) Tim Forshew et al. Noninvasive Identification and Monitoring of Cancer Mutations by Targeted
Deep Sequencing of Plasma DNA. Sci Transl Med 4, 136ra68 (2012)
9) Kannan K et al. Recurrent chimeric RNAs enriched in human prostate cancer identified by deep
sequencing, www.pnas.org/cgi/doi/10.1073/pnas.1100489108
10) Rosewick N et al. Deep sequencing reveals abundant noncanonical retroviral microRNAs in
B-cell leukemia/lymphoma, www.pnas.org/cgi/doi/10.1073/pnas.1213842110
NGS for AGRICULTURE
11) Elshire RJ et al. (2011) A Robust, Simple Genotyping-by-Sequencing (GBS) Approach for High
Diversity Species. PLoS ONE 6(5): e19379.
12) Bart R et al. High-throughput genomic sequencing of cassava bacterial blight strains identifies
conserved effectors to target for durable resistance.
www.pnas.org/cgi/doi/10.1073/pnas.1208003109
13) Yang H et al. Application of next-generation sequencing for rapid marker development in
molecular plant breeding: a case study on anthracnose disease resistance in Lupinus angustifolius
L. BMC Genomics 2012, 13:318