‘Omics’ - Analysis of High-Dimensional Data
Achim Tresch, Computational Biology

Epigenetics
Slides: Doug Brutlag, Stanford University School of Medicine
http://biochem158.stanford.edu/Epigenetics.html

Epigenetics
• C.H. Waddington coined the term epigenetics, meaning "above" or "in addition to" genetics, to explain differentiation.
• How do different adult stem cells know their fate?
– Myoblasts can only form muscle cells
– Keratinocytes only form skin cells
– Hematopoietic cells only become blood cells
– But all have identical DNA sequences.

Epigenetics
• The modern definition is non-sequence-dependent inheritance.
• How can identical twins have different natural hair colors?
• How can a single individual have two different eye colors? (Mosaicism: one eye, two colors)
• How can identical twin littermates show different coat colors?
• How can just paternal or maternal traits be expressed in offspring? This is called genetic imprinting.
• How can females express only one X chromosome per cell?
• How can acquired traits be passed on to offspring?

The ‘epigenetic’ code
DNA Methylation & Histone Modifications
Paula Vertino, Henry Stewart Talks

Methylation of Cytosine in DNA
[Figure: cytosine and 5-methylcytosine]
Paula Vertino, Henry Stewart Talks

DNA Methylation (Biochemistry)
• CpG dinucleotides are partially methylated in higher vertebrates
• Human genome: only ~4% of all cytosines are methylated, but ~70%-80% of CpGs (5mCpG)
• Spontaneous deamination transforms CpG to TpG or CpA
• Estimated rate (after DNA repair) [1]: 5.8*10^-13 (5.8*10^-17) 1/(s*site)
[Figure: deamination of 5-methylcytosine to thymine and of cytosine to uracil]
[1] Shen et al. (1993) Nucl. Acids Res.
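To get a feeling for the deamination rates quoted above, the following minimal sketch converts them to per-year values and to the fraction of sites expected to be lost over a given time span. It assumes that the first value is the raw rate and the value in parentheses is the rate after DNA repair (a reading of the slide's notation, not something the slide states explicitly), and that deamination behaves as a constant-rate process. The function names are illustrative.

```python
import math

SECONDS_PER_YEAR = 365.25 * 24 * 3600  # Julian year, an illustrative choice

# Rates as quoted on the slide, per second and site. Treating the first value
# as the raw rate and the parenthesised one as the rate after DNA repair is
# an assumption about the slide's notation.
RATE_RAW = 5.8e-13
RATE_AFTER_REPAIR = 5.8e-17


def rate_per_year(rate_per_second: float) -> float:
    """Convert a per-second, per-site rate into a per-year, per-site rate."""
    return rate_per_second * SECONDS_PER_YEAR


def fraction_deaminated(rate_per_second: float, years: float) -> float:
    """Probability that a single site has deaminated after `years`,
    assuming a constant-rate (exponential) process: 1 - exp(-k * t)."""
    # expm1 keeps precision when k * t is very small
    return -math.expm1(-rate_per_year(rate_per_second) * years)


if __name__ == "__main__":
    for label, rate in [("raw", RATE_RAW), ("after repair", RATE_AFTER_REPAIR)]:
        print(f"{label}: {rate_per_year(rate):.3e} per site and year, "
              f"fraction after 1e6 years: {fraction_deaminated(rate, 1e6):.3e}")
```

Even a rate that is tiny per second adds up over evolutionary time scales, which is consistent with CpG dinucleotides being depleted in the genome.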
Wikipedia: Deamination, Thymine, 5-methylcytosine

Methylation of Cytosine in DNA

DNA methylation and Histones
[Figure label: Me]

Maintenance of Cytosine Methylation
Alex Meissner, Henry Stewart Talks

Functions of cytosine methylation

DNA Methylation and Cell Differentiation
Alex Meissner, Henry Stewart Talks

Nuclear transplantation: Differentiated Cells can become Totipotent

Methylation Changes During Development
[Figure: methylation level during development]
Paula Vertino, Henry Stewart Talks

DNA Methylation and Histone Marks
• DNA methylation – bisulfite sequencing (TTCGCCGACTAA → TTCGCCGAuTAA)
• Histone modification – chromatin immunoprecipitation (ChIP)
© 2013 American Society of Plant Biologists

DNA Methylation and Histone Marks
Using next-generation sequencing, epigenetic modifications can be identified genome-wide: EPIGENOMICS and METHYLOMICS
[Figure legend: green = H3K27me3, purple = methylcytosine]

DNA methylation and Gene Expression
http://www.39kf.com/uploadfiles/image/15902/TXT-20081228163836878.gif

Epigenomics
• Methylation in mammals is mainly targeted at CpG dinucleotides
• CpGs are either unmethylated or methylated on both strands
• Hemi-methylated CpGs are rare
Adapted from: http://www.diagenode.com/en/applications/bisulfite-conversion.php
Lars Feuerbach

Bisulfite Sequencing
[Figure: chemical structures of cytosine, 5-methylcytosine and uracil]
No treatment: TTCGCCGACTAA
Bisulfite treatment: TTCGCCGAuTAA
When DNA is bisulfite treated, unmethylated cytosine is converted to uracil. Methylcytosine is not affected.
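The conversion rule can be illustrated with a minimal sketch that reproduces the slide's example sequence. The set of methylated positions ({2, 4, 5}, 0-based) is inferred from the slide's output and is an assumption, as are the function names. Uracil is written as lowercase 'u', following the slide, and is read as thymine by the sequencer.

```python
from typing import Set


def bisulfite_treat(seq: str, methylated: Set[int]) -> str:
    """Return the sequence after bisulfite treatment.

    Unmethylated cytosines become uracil ('u', as written on the slide);
    cytosines whose 0-based index is in `methylated` are left unchanged.
    """
    return "".join(
        "u" if base == "C" and i not in methylated else base
        for i, base in enumerate(seq)
    )


def read_out(treated: str) -> str:
    """Sequencing reads uracil as thymine."""
    return treated.replace("u", "T")


if __name__ == "__main__":
    # Methylated positions are an assumption chosen to match the slide.
    treated = bisulfite_treat("TTCGCCGACTAA", {2, 4, 5})
    print(treated)            # TTCGCCGAuTAA
    print(read_out(treated))  # TTCGCCGATTAA
```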
Bisulfite Sequencing
After bisulfite treatment, unmethylated Cs are read as T and so differ between the treated and untreated samples. By contrast, methyl-C is read as C and matches the reference sequence.
No treatment: TTCGCCGACTAA, read as TTCGCCGACTAA
Bisulfite treatment: TTCGCCGAuTAA, read as TTCGCCGATTAA
© 2013 American Society of Plant Biologists

Reduced Representation Bisulfite Sequencing (RRBS-Seq)
• DNA is digested by the MspI restriction enzyme, which cuts at CCGG sites
• All DNA fragments start with CpG
• Alignment is simplified, as reads have to map to MspI restriction sites
• Reads are enriched for CpG-rich areas
[Figure: MspI cut, TATGC | CGGATGTTTTGTACTAGGATAAC]
http://www.neb.com/nebecomm/products/productR0106asp

Alignment of BS converted reads
Standard alignment to the reference is not possible. Adapted alignment procedures have lower accuracy.
[Figure: reference vs. read-out]

Alignment of BS converted reads
Key concept:
- Convert the reference genome in silico, as bisulfite treatment does
- Perform the conversion for the + strand and the – strand
- Then align reads against both converted genomes
Tools supporting the alignment of BS reads:
- Bismark
- BSMAP
- BS Seeker
Simon Andrews, Bioinformatics 2011

Alignment of BS converted reads
H = IUPAC character for the letters {A,C,T}
Simon Andrews, Bioinformatics 2011

Description of DNA methylation
Pearl-necklace diagrams (lollipop plots)
Measure unmethylated Cs (#C)
Measure methylated Cs (#5mC)
Report the methylation ratio #5mC / (#5mC + #C)

The Tomato Methylome
Density of methylated DNA and other features in chromosomes of the tomato fruit
Reprinted by permission from Macmillan Publishers Ltd: Zhong, S., Fei, Z., Chen, Y.R., Zheng, Y., Huang, M., Vrebalov, J., McQuinn, R., Gapper, N., Liu, B., Xiang, J., Shao, Y., and Giovannoni, J.J. (2013). Single-base resolution methylomes of tomato fruit development reveal epigenome modifications associated with ripening. Nat Biotechnol. [in press].
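The alignment concept and the methylation ratio described above can be sketched in a few lines. This is a toy illustration under stated assumptions, not how Bismark, BSMAP or BS Seeker are implemented: the "aligner" is an exact string search on the + strand only, and all function names and the example genome are hypothetical.

```python
from typing import Dict


def convert_c_to_t(seq: str) -> str:
    """In-silico bisulfite conversion as seen on the + strand: every C becomes T."""
    return seq.upper().replace("C", "T")


def convert_g_to_a(seq: str) -> str:
    """Conversion of the - strand, expressed in + strand coordinates:
    every G (a C on the opposite strand) becomes A."""
    return seq.upper().replace("G", "A")


def find_read(read: str, genome: str) -> int:
    """Toy 'aligner': exact search of the fully converted read in the
    C-to-T converted genome. Returns the 0-based position, or -1."""
    return convert_c_to_t(genome).find(convert_c_to_t(read))


def call_methylation(read: str, genome: str, pos: int) -> Dict[int, bool]:
    """For each reference C covered by the read: True if the read still
    shows C (methylated), False if it shows T (unmethylated)."""
    calls = {}
    for offset, base in enumerate(read.upper()):
        if genome[pos + offset].upper() == "C" and base in ("C", "T"):
            calls[pos + offset] = base == "C"
    return calls


def methylation_ratio(n_methylated: int, n_unmethylated: int) -> float:
    """#5mC / (#5mC + #C) for one cytosine position."""
    total = n_methylated + n_unmethylated
    if total == 0:
        raise ValueError("no reads cover this position")
    return n_methylated / total


if __name__ == "__main__":
    genome = "GGATTCGCCGACTAAGT"   # hypothetical reference around the slide's example
    read = "TTCGCCGATTAA"          # bisulfite read from the slide
    pos = find_read(read, genome)
    print(pos, call_methylation(read, genome, pos))
    print(methylation_ratio(3, 1))
```

Searching in converted space is what makes the read findable at all: the raw read differs from the reference at every unmethylated C, but after converting both sides those mismatches disappear, and the original read is consulted again only for the methylation call.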
Characterize deamination by repetitive sequences
Evolution of CpG content in repetitive sequences
Peifer et al. (2008) Bioinformatics

Evolution of CpG-rich promoters
• AT-rich promoters in bacteria
• Mixed promoters in worm and fly
• Increasing GC and CpG content in mosquito
• Small CpG islands in fish
• Broad CpG islands in humans
Khuu et al., PNAS, Sep. 2007

Promoter Types in Humans
Weber et al., 2007, Nat. Genet.

Model of CpG island evolution
[Figure: three panels (Ancestral Genome; After 0.1 transversions; Observable genome), each showing CpG frequency (0% to 8%) against position on chromosome (1 to 35)]

CpG island definitions
[Figure: Observable genome, CpG frequency (0% to 8%) against position on chromosome (1 to 35); annotation: CpG content]
CpG island definition:
1. GC content
2. Ratio of observed over expected CpG frequency
3. Minimal length
Definitions shown: Takai-Jones, Gardiner-Garden

CpG islands
• CpG dinucleotides are rare in the human genome
• CpG islands are exceptions
• Elevated GC content and CpG frequency
• 50-60% of promoters are CpG islands
• Methylation level is anti-correlated with expression in HCP (high-CpG) promoters
• Cause or consequence?
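The three criteria of the CpG island definition can be sketched as follows. The slides do not give numeric thresholds; the values below are the commonly quoted ones for the Gardiner-Garden and Takai-Jones definitions and are treated here as inclusive lower bounds, which is a simplification. Class and function names are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IslandCriteria:
    min_length: int      # minimal length in bp
    min_gc: float        # minimal GC content (fraction)
    min_obs_exp: float   # minimal observed/expected CpG ratio


# Commonly quoted thresholds (an assumption, not taken from the slides).
GARDINER_GARDEN = IslandCriteria(min_length=200, min_gc=0.50, min_obs_exp=0.60)
TAKAI_JONES = IslandCriteria(min_length=500, min_gc=0.55, min_obs_exp=0.65)


def gc_content(seq: str) -> float:
    """Fraction of G and C bases in the sequence."""
    if not seq:
        raise ValueError("empty sequence")
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def cpg_obs_exp(seq: str) -> float:
    """Observed CpG count over the count expected from the C and G
    frequencies: (#CpG * N) / (#C * #G)."""
    seq = seq.upper()
    n_c, n_g = seq.count("C"), seq.count("G")
    if n_c == 0 or n_g == 0:
        return 0.0
    return seq.count("CG") * len(seq) / (n_c * n_g)


def is_cpg_island(seq: str, criteria: IslandCriteria) -> bool:
    """Check one candidate region against all three criteria."""
    return (
        len(seq) >= criteria.min_length
        and gc_content(seq) >= criteria.min_gc
        and cpg_obs_exp(seq) >= criteria.min_obs_exp
    )


if __name__ == "__main__":
    region = "CG" * 150  # artificial 300 bp region, for illustration only
    print(gc_content(region), cpg_obs_exp(region))
    print(is_cpg_island(region, GARDINER_GARDEN), is_cpg_island(region, TAKAI_JONES))
```

In practice the check is applied in a sliding window along the chromosome; the sketch only evaluates a single candidate region.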
CpG islands and chromatin
Caiafa and Zampieri (2005) JCB

Histone modifications
• How to read the nomenclature:
– Histone protein (H3)
– Position in tail (K9)
– Modification type (me3)
Füllgrabe et al., 2011, Oncogene

Histone code
H3K4me2-me3: Active transcription, near TSSs
H3K9me3: Heterochromatin
H3K9ac: Euchromatin, near TSSs
H3K27me3: Polycomb marker, closes chromatin
H4K16ac: Higher-order chromatin, repeat methylation
H4K20me3: Heterochromatin
Füllgrabe et al., 2011, Oncogene

Interplay
Cedar & Bergman, 2009, Nature Rev Genet

Allele-unspecific DNA methylation
Allele-specific DNA methylation

Imprinting
• Origin-of-allele-specific gene expression
• Exception to Mendel’s rules of inheritance
• Mediated by methylation of imprinting control regions
University of Florida: http://www.peds.ufl.edu/divisions/genetics/teaching

Allele-specific – Histone modifications
Adapted from: http://genomebiology.com/content/figures/gb-2005-6-6-113-1-l.jpg

Reference methylomes – Laurent et al.
• Laurent data on human embryonic stem cells and fibroblasts
• 70% of all CpGs covered by at least 3 reads
Laurent et al. Genome Research 2010

Reference methylomes – Molaro et al.
• Male germline methylome for human and chimpanzee
• Direct comparison to Laurent et al. data
Molaro et al.
Cell 2011

ENCODE, IHEC and Epigenome Roadmap
• One genome, many epigenomes
• Cataloguing epigenetic modifications in different tissues

Translation into NGS signals

Translation of epigenetic signals
• Capture-seq
– Chromatin immunoprecipitation (ChIP)
– Methylated DNA immunoprecipitation (MeDIP)
– MBD chromatography
• Conversion-seq
– Bisulfite sequencing (methyl-seq)
– Reduced representation bisulfite sequencing (RRBS)
– Ultra-deep amplicon sequencing

Two signal types
• Coverage: Enrichment, Sequencing, Mapping, Peak calling
• Sequence: Preparation, Sequencing, Special mapping, Decoding

Enrichment-seq – Workflow I
[Figure: epigenetically modified regions in the genome; DNA library preparation]

Enrichment-seq – Workflow II
[Figure: enrichment]

Enrichment-seq – Workflow III
[Figure: mapping to the genome]

Enrichment-seq – Workflow IV
[Figure: genome]

Methylated DNA immunoprecipitation
http://en.wikipedia.org/wiki/Methylated_DNA_immunoprecipitation

BiQ Analyzer HT
Lutsik P et al. Nucl. Acids Res. 2011;nar.gkr312

Allele-specific methylation analysis pipeline
Matthias Bieg et al., in preparation

Summary
• Epigenetics plays a key role in cell function
• Each cell type has its own epigenome
• Epigenetic modifications can be converted to NGS signals
• In-depth bioinformatic analysis of epigenomes is still in its infancy

References
• Laurent, L.; Wong, E.; Li, G.; Huynh, T. et al. “Dynamic changes in the human methylome during differentiation”. Genome Research (2010) 20, 320-331
• Molaro, A.; Hodges, E.; Fang, F.; Song, Q. et al. “Sperm Methylation Profiles Reveal Features of Epigenetic Inheritance and Evolution in Primates”. Cell (2011) 146, 1029-1041
• Lutsik, P.; Feuerbach, L.; Arand, J.; Lengauer, T. et al. “BiQ Analyzer HT: locus-specific analysis of DNA methylation by high-throughput bisulfite sequencing”. NAR (2011)

Environment can Influence Epigenetic Changes
Emma Whitelaw, Henry Stewart Talks

Hongerwinter 1944
• Germans blocked food to the Dutch in the winter of 1944.
• Calorie consumption dropped from 2,000 to 500 per day for 4.5 million people.
• Children born or raised during this time were small, short in stature and had many diseases, including edema, anemia, diabetes and depression.
• The Dutch Famine Birth Cohort study showed that women living during this time had children 20-30 years later with the same problems, even though those children were conceived and born during a normal dietary state.