EPIGENOMICS
André Goffeau
Institut Pasteur/EMBO/CNPq course, Florianopolis, July 11, 2008

Epigenomics is any regulation (on/off) of gene expression that is heritable and is not due to DNA mutations.

Epigenetic jargon
• Paramutation
• Bookmarking
• Imprinting
• Gene silencing
• X chromosome inactivation
• Position effect
• Reprogramming
• Transvection
• Maternal effects
• Carcinogenesis
• Teratogen effects
• Histone and chromatin modifications
• Parthenogenesis
• Cloning
• Prions
• Embryogenesis

Jean-Baptiste Lamarck 1744-1829
Charles Darwin 1809-1882
Two views about the type of mechanism that promotes evolution. According to Lamarck's theory, acquired characteristics can be passed to subsequent generations. According to Darwin's (and Wallace's) theory of natural selection, a population of giraffes will have individuals with variations in neck length. If having a longer neck is advantageous in feeding, longer-necked giraffes will be more successful and reproduce more.

RNA interference, Histone acetylation and DNA methylation

DNA METHYLATION
Cytosine methylation occurs at CpG and is mutagenic
It prevents activation of promoters
Methylation of CpG islands
And the informaticians? They try to predict which cytosines are methylated in DNA

EMBRYO EPIGENETICS
Reprogramming in Germ Cells and Embryos

CHROMATIN CODE
Chromatin chemistry
Acetylation or methylation
Histone modifications
Methylation Genomics
Aberrant methylation in human and mouse leukemia

DISEASES and DRUGS
Epigenetic diseases
Epigenetic drugs
Gene silencing and pharmacology
siRNA

RNA silencing and YEAST??
• S. cerevisiae has no DNA methylation
• S. cerevisiae has no siRNA
• S. cerevisiae has chromatin modification
• S. pombe: siRNA controls heterochromatin
• N. crassa: DNA methylation depends on a histone methyltransferase
• S. cerevisiae has other epigenetic systems, such as mating type silencing, FLO11 (a pseudohyphal telomeric gene) and prions

RNA interference, Histone acetylation and DNA methylation
For elucidation of mechanism, use S. pombe, N. crassa or Y. lipolytica (??), but not S. cerevisiae

Epigenetics References
Pennisi E. Behind the scenes of gene expression. Science. 2001;293:1064-1067.
Egger G, Liang G, Aparicio A, Jones PA. Epigenetics in human disease and prospects for epigenetic therapy. Nature. 2004;429:457-463.
Jenuwein T, Allis CD. Translating the histone code. Science. 2001;293:1074-1080.
Matzke M, Matzke AJM, Kooter JM. RNA: guiding gene silencing. Science. 2001;293:1080-1083.
Reik W, Dean W, Walter J. Epigenetic reprogramming in mammalian development. Science. 2001;293:1089-1093.
Hatada I et al. A genomic scanning method for higher organisms using restriction sites as landmarks. P.N.A.S. 1991;88:9523-9527.
Kimura et al. Methylation profiles of genes utilizing newly developed CpG island methylation microarray on colorectal cancer patients. Nucleic Acids Research. 2005, 20, E pub.
Agrawal et al. RNA interference: biology, mechanism and applications. Microb. and Molec. Biology Reviews. 2003;67:657-685.
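The DNA METHYLATION slides mention CpG islands and the informaticians who try to predict which cytosines are methylated. As a minimal sketch of the first step such an analysis usually takes, the code below computes the GC fraction and the observed/expected CpG ratio of a sequence window. The function names and the thresholds (length of at least 200 bp, GC of at least 50%, ratio of at least 0.6, the commonly quoted Gardiner-Garden and Frommer criteria) are assumptions added here and are not taken from the slides. The sketch flags CpG-dense regions only; it does not predict methylation status.

```python
def cpg_stats(seq):
    """Return (gc_fraction, cpg_observed_over_expected) for a DNA string."""
    seq = seq.upper()
    n = len(seq)
    if n == 0:
        return 0.0, 0.0
    c = seq.count("C")
    g = seq.count("G")
    cpg = seq.count("CG")          # observed CpG dinucleotides
    gc_fraction = (c + g) / n
    expected = c * g / n           # expected CpG count if C and G were independent
    ratio = cpg / expected if expected > 0 else 0.0
    return gc_fraction, ratio


def looks_like_cpg_island(seq, min_len=200, min_gc=0.5, min_ratio=0.6):
    """Apply the assumed length / GC / observed-over-expected thresholds."""
    gc_fraction, ratio = cpg_stats(seq)
    return len(seq) >= min_len and gc_fraction >= min_gc and ratio >= min_ratio
```

In practice the window would be slid along a genomic sequence and adjacent passing windows merged; that step is left out to keep the sketch short.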
http://www.nature.com/focus/rnai/animations /animation/animation.htm

PARENTAL DIFFERENTIAL METHYL TAGGING
Hinny and Dolly

EARLY EXAMPLES
• agouti mice (folic acid)
• cancer human (p16)
• diseases human (BWS)
• eye appendage fly (Hsp90)

Methyl detector
Yellow: hyper-methylated; Blue: under-methylated
Restriction Landmark Genomic Scanning
Reprogramming and Imprinting
Deoxynucleoside analogue inhibition

GENOME EVOLUTION

Genomology
Genome mapping
Genome sequencing
GENOMICS (GLOBAL)
Genome comparisons
Databases
NEW TOOLS (GLOBAL)
REDUCTIONICS (SPECIFIC)
Deletomics (systematic)
Overexpressionics (systematic)
Transcriptomics (DNA chips)
Proteomics (2D gels/2 hybrids)
Physiologists, Pathologists, Structuralists, Biologists, Biochemists

Schematic alternating signature for Whole Genome Duplication
[Figure: numbered genes (1 to 22) shown on three tracks: Duplicated copy 1 in S. cerevisiae, Reference block in K. waltii, Duplicated copy 2 in S. cerevisiae]
The dark grey genes are contiguous in the non-duplicated reference species (K. waltii, K. lactis or A. gossypii). Yellow genes are conserved in both S. cerevisiae copies. Red genes are conserved only in S. cerevisiae copy 1. Blue genes are conserved only in S. cerevisiae copy 2. The lost genes are in light grey.

EMERGENCE OF SPECIES-SPECIFIC TRANSPORTERS DURING EVOLUTION OF THE HEMIASCOMYCETE PHYLUM
Benoît De Hertogh*[1], Frédéric Hancy†[2], André Goffeau‡ and Philippe V. Baret*
Université catholique de Louvain
www.gena.ucl.ac.be

Evolution of the yeast genome
Wolfe KH, Shields DC. Molecular evidence for an ancient duplication of the entire yeast genome. Nature. 1997;387:708-13.
Kellis M, Birren BW, Lander ES. Proof and evolutionary analysis of ancient genome duplication in the yeast Saccharomyces cerevisiae. Nature. 2004;428:617-24.
Dujon B, Sherman D, Fischer G, Durrens P, Casaregola S, Lafontaine I, De Montigny J, Marck C, Neuveglise C, Talla E, Goffard N, Frangeul L, Aigle M, Anthouard V, Babour A, Barbe V, Barnay S, Blanchin S, Beckerich JM, Beyne E, Bleykasten C, Boisrame A, Boyer J, Cattolico L, Confanioleri F, De Daruvar A, Despons L, Fabre E, Fairhead C, Ferry-Dumazet H, Groppi A, Hantraye F, Hennequin C, Jauniaux N, Joyet P, Kachouri R, Kerrest A, Koszul R, Lemaire M, Lesur I, Ma L, Muller H, Nicaud JM, Nikolski M, Oztas S, Ozier-Kalogeropoulos O, Pellenz S, Potier S, Richard GF, Straub ML, Suleau A, Swennen D, Tekaia F, Wesolowski-Louvel M, Westhof E, Wirth B, Zeniou-Meyer M, Zivanovic I, Bolotin-Fukuhara M, Thierry A, Bouchier C, Caudron B, Scarpelli C, Gaillardin C, Weissenbach J, Wincker P, Souciet JL. Genome evolution in yeasts. Nature. 2004;430:35-44.

Epigenomics
Egger G, Liang G, Aparicio A, Jones PA. Epigenetics in human disease and prospects for epigenetic therapy. Nature. 2004;429:457-63.
Costello JF. Comparative epigenomics of leukemia. Nat Genet. 2005;37:211-2.

Membrane Classification (MC)
10 Membrane Proteins
10.A Lipid Metabolism
10.B Anchoring
10.C Polysaccharide Metabolism
10.D Trafficking
10.E Signaling
10.F Oxidoreductases
10.G Subtelomeric Conserved
10.H Chaperones

Conclusions
• Analysis of the 28,000 protein sequences obtained from 14 hemiascomycetes illustrates the usefulness of the functional/phylogenetic TC system proposed by MILTON SAIER
• A similar system for non-transport membrane proteins is proposed (179 members)
• S. cerevisiae contains 11 channels, 211 permeases, 16 P-ATPases and 22 ABC-ATPases
• It also contains 28 putative transporter families and 112 singletons of unknown function
• Speciation of hemiascomycetes is accompanied by the emergence of membrane proteins not represented in S. cerevisiae
• Similar analysis of TMS 1 and 2 proteins is required
• Our database has been used for identification of novel putative yeast transporters
• Our database will serve as reference for the automatic annotation of membrane proteins from recently sequenced yeast genomes

TC - Class 9
9 Incompletely Characterized Transport Systems
9.A Recognized Transporters of Unknown Biochemical Mechanism
9.B Putative Uncharacterized Transport Proteins
9.C Functionally Characterized Transporters Lacking Identified Sequences
9.D The Membrane Proteins of Unknown Function
9.E Questionable ORFs with TMS>2

S. cerevisiae Membrane Classification

                                  Subfamilies  ORFs
10.A Lipid Metabolism                      13    28
10.B Anchoring                              9    10
10.C Polysaccharide Metabolism              8    32
10.D Trafficking                           11    39
10.E Signaling                              4     7
10.F Oxidoreductases                        5    11
10.G Subtelomeric Conserved                 1    12
10.H Chaperones                             4     7
Total                                      55   146

Table 1 Global statistics of membrane proteins in Hemiascomycete species

Species                           Y. lipolytica  D. hansenii  K. lactis    C. glabrata  S. cerevisiae  Total
Code                              YALI           DEHA         KLLA         CAGL         SACE
Strain                            CLIB122        CBS767       CLIB210      CBS138       S288c
Database                          Génolevures    Génolevures  Génolevures  Génolevures  SGD
Release                           22 May 2004    22 May 2004  22 May 2004  22 May 2004  22 May 2004
Natural substrate                 fats           salted fish  milk         blood        grapes
ORFs                              6666           6896         5331         5272         5800           29965
Classified transporters           597            538          439          398          508            2480
%                                 9.0            7.8          8.2          7.5          8.8            8.3
Possible transporters (9.B.X.Y.Z)
still unannotated                 296            295          226          236          243            1296

Table 2 Functional distribution of the “established and putative” transporters in the Hemiascomycete phylum

                                                      YALI  DEHA  KLLA  CAGL  SACE  Total
1.A Alpha-Type channels                                 46    32    26    26    38    168
1.B Beta Barrel porins                                   1     1     1     2     2      7
2.A Porters (uniporters, symporters, antiporters)      316   281   206   162   218   1183
3.A P-P-bond-hydrolysis-driven transporters             94    89    91    87   107    468
3.B Decarboxylation-driven transporters                  2     1     1     2     0      6
3.D Oxidoreduction-driven transporters                  21    27    18    15    25    106
3.E Light absorption-driven transporters                 0     0     0     3     3      6
8.A Auxiliary transport proteins                         5     7     5     3    10     30
9.A Recognized transporters of unknown mechanism        82    75    76    81    85    399
9.B Putative uncharacterized transport proteins         30    25    15    17    20    107
Total                                                  597   538   439   398   508   2480

Table 4 Mean and standard deviation of the subfamily size according to the different modes of evolution within the Hemiascomycete phylum

Mode of evolution         Number of     Mean number of ORFs  Minimum number  Maximum number
                          subfamilies   per subfamily        of ORFs         of ORFs
UBIQUITOUS                        107   19.9 ± 25.8                       5             146
SPECIES-SPECIFIC UNIQUE            15    1.5 ± 1.6                        1               7
SPECIES-SPECIFIC ABSENT            20    7.9 ± 7.0                        4              28
PHYLUM-GAINED                      13    3.2 ± 1.0                        2               6
PHYLUM-LOST                        36    2.6 ± 2.6                        1              16
HOMOPLASIC                         13    3.1 ± 1.0                        2               5

Table 7 Hemiascomycete Mitochondrial Carrier subfamilies that are absent in S. cerevisiae

Subfamilies: 2.A.29.6, 2.A.29.Y14, 2.A.29.Y15, 2.A.29.Y16, 2.A.29.Y17, 2.A.29.Y18, 2.A.29.Y19, 2.A.29.Y20, 2.A.29.Y21, 2.A.29.Y22, 2.A.29.Y23
Y. lipolytica (ORF number: 10): YALI0A20944g, YALI0A16863g, YALI0A20988g, YALI0B05852g, YALI0E33341g, YALI0F00418g, YALI0F20262g, YALI0F15609g, YALI0E06897g, YALI0D06798g
D. hansenii (ORF number: 6): DEHA0G14454g, DEHA0E08349g, DEHA0E11022g, DEHA0B16401g, DEHA0E09691g, DEHA0G19437g
K. lactis (ORF number: 3): KLLA0E02750g, KLLA0A09383g, KLLA0E09680g
C. glabrata (ORF number: 2): CAGL0B03883g, CAGL0F08305g

Figure 2B Identification principles of the different evolution patterns distinguished in Figure 2A

Mode of evolution  Species A  Species B  Species C  Species D
Ubiquitous         ≥ 1 ORF    ≥ 1 ORF    ≥ 1 ORF    ≥ 1 ORF
Unique             no ORF     ≥ 1 ORF    no ORF     no ORF
Absent             ≥ 1 ORF    no ORF     ≥ 1 ORF    ≥ 1 ORF
Gained             ≥ 1 ORF    ≥ 1 ORF    no ORF     no ORF
Lost               no ORF     no ORF     ≥ 1 ORF    ≥ 1 ORF
Homoplasic         ≥ 1 ORF    no ORF     ≥ 1 ORF    no ORF

Figure 2 A Main characteristics: species, code, natural substrate, ORFs, classified transporters (%), possible transporters (9.B.X.Y.Z) still unannotated

Our objective: a consistent annotation
• Key elements
– Consistent databases
– The TCDB system of classification
– A well-known evolutionary context
• Output
– A subfamily by subfamily discussion
– Dynamic species vs. Quiet species
• Extension
– Other species
– Different levels of annotation

[Diagram labels: Databases, Knowledge, Models, Description, Processes, Annotation, Evolution]

The TCDB Classification
• Based on five digits
• Consistent across species
• Extensible
• An example
– 2 Electrochemical Potential-driven transporters
• 2.A Porters (uniporters, symporters, antiporters)
– 2.A.1 The Major Facilitator (MFS) Superfamily
» 2.A.1.Y2 Undefined Subfamily

In practice – the most variable families

Subfamily                                                               YALI  DEHA  KLLA  CAGL  SACE  Mean  Variance
2.A.1.1 Sugar Porter (SP)                                                 27    48    20    17    34  29.2     153.7
2.A.1.14 Anion Cation Symporter (ACS)                                     39    27    13     6    10  19.0     187.5
2.A.1.2 Drug Proton Antiporter 1 (DHA-1)                                  33    24     8    10    12  17.4     114.8
9.A.5.1 Peroxisomal Protein Importer (PPI)                                27     7    10    11    10  13.0      63.5
2.A.67.1 Oligopeptide Transporter (OPT)                                   17     4     3     0     2   5.2      45.7
2.A.1.16 Ferrioxamine H+ symporter (SIT)                                  14     5     4     1     6   6.0      23.5
2.A.1.13 Fructose uniporter (FRU)                                          5     8    12     3     0   5.6      21.3
9.B.17.1 The Putative Fatty Acid Transporter (FAT-1)                      14     3     3     5     5   6.0      21.0
3.D.1.2 NADH Dehydrogenase I (NDH 1)                                       8     8     0     0     0   3.2      19.2
2.A.3.10 Amino Acid-Polyamine-Organocation Yeast Transporter (APC-YAT)    14    24    16    14    18  17.2      17.2
1.A.20.5 Yeast Metal Channel (Cyt B-FRE)                                  11     7     5     1     7   6.2      13.2

Our objective: a consistent annotation
• Three elements
– The Génolevures database
– The TCDB system of classification
– A well-known evolutionary context
• Our material
– Five species of Hemiascomycetes
– 2480 identified transporter proteins
• Our objective
– To understand how subfamilies of transporters emerge along the evolutionary process

The chosen phylum

Figure 2. The Yeast MIT Family (Metal Ion Channels). TC # 1.A.35. Scale bar: 0.1.
Subfamilies (location; substrates): 1.A.35.2 (plasma membrane; Mg, Zn, Mn, Cu); 1.A.35.5 (mitochondria; Mg, (Zn, Mn, Cu?))
Tree leaves: SACE-ALR2, KLLA-0E07249g, YALI-0E00462g, YALI-0D00319g, SACE-ALR1, CAGL-0E01617g, DEHA-0E11616g, YALI-0B05148g, DEHA-0F17776g, KLLA-0F26895g, CAGL-0M13233g, SACE-MNR2, SACE-LPE10, KLLA-0F28017g, CAGL-0M07249g, YALI-0D19514g, DEHA-0E05731g, SACE-MRS2, YALI-0F06248g, DEHA-0B05445g, KLLA-0F02519g, CAGL-0E05368g
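The TCDB Classification slide describes hierarchical identifiers such as 2 > 2.A > 2.A.1 > 2.A.1.Y2. A minimal sketch of how such identifiers could be split into their levels and grouped by family is given below. The function names `tc_levels` and `group_by_level` are hypothetical helpers introduced for illustration; provisional components such as `Y2` are simply kept as strings, which is an assumption about how the deck's own database handles them.

```python
def tc_levels(tc_id):
    """Split a TC-style identifier into its cumulative hierarchy levels.

    Example: '2.A.1.Y2' gives ['2', '2.A', '2.A.1', '2.A.1.Y2'].
    """
    parts = tc_id.strip().split(".")
    # The slides describe up to five components; reject empty components.
    if not 1 <= len(parts) <= 5 or any(p == "" for p in parts):
        raise ValueError(f"not a TC-style identifier: {tc_id!r}")
    return [".".join(parts[:i]) for i in range(1, len(parts) + 1)]


def group_by_level(tc_ids, depth):
    """Group identifiers by their prefix at the given depth (1 = class)."""
    groups = {}
    for tc_id in tc_ids:
        levels = tc_levels(tc_id)
        if len(levels) >= depth:
            groups.setdefault(levels[depth - 1], []).append(tc_id)
    return groups


subfamilies = ["2.A.1.1", "2.A.1.14", "2.A.1.2", "2.A.67.1", "9.A.5.1"]
by_family = group_by_level(subfamilies, depth=3)
```

Grouping at depth 3 collects the three MFS subfamilies under `2.A.1`, which is the level at which the slides compare subfamily sizes across species.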
Figure 4. The Yeast CDF Family (Cation Diffusion Facilitator). TC # 2.A.4. Scale bar: 0.1.
Subfamilies (location; substrates): 2.A.4.2 (vacuoles, mitochondria; Zn, Co); 2.A.4.4 (endoplasmic reticulum, nucleus; Zn); 2.A.4.Y1 (mitochondria; Fe)
Tree leaves: YALI-0F00176g, DEHA-0G03828g, KLLA-0F08723g, CAGL-0K07392g, SACE-ZRC1, YALI-0C18359g, SACE-COT1, DEHA-0G14113g, KLLA-0F20746g, CAGL-0F05401g, SACE-MSC2, YALI-0C12254g, CAGL-0E06006g, SACE-MMT2, DEHA-0A03553g, SACE-MMT1, KLLA-0C16181g, CAGL-0H08822g

Figure 5. The Yeast ZIP Family (Zinc Iron Porters). TC # 2.A.5. Scale bar: 0.1.
Subfamilies (location; substrates): 2.A.5.Y1 (Golgi; Mn); 2.A.5.Y2 (no data; no data); 2.A.5.2 (endoplasmic reticulum; Zn); 2.A.5.1 (plasma membrane; Zn)
Tree leaves: YALI-0D19008g, KLLA-0F17886g, DEHA-0E06105g, KLLA-0A07601g, SACE-YKE4, SACE-ATX2, CAGL-0K05577g, YALI-0F15411g, DEHA-0B16335g, YALI-0E00748g, YALI-0D00759g, DEHA-0E25388g, SACE-ZRT2, CAGL-0M04301g, KLLA-0D16434g, SACE-ZRT1, YALI-0F21659g, DEHA-0B07337g, CAGL-0E01353g

Figure 6. The Yeast Nramp Family (Metal Ion Transporters). TC # 2.A.55.1. Scale bar: 0.1.
Subfamilies (location; substrates): 2.A.55.1.1 (plasma membrane, vacuoles; Mn); 2.A.55.1.3 (vacuoles; Fe); 2.A.55.1.2 (vesicles, mitochondria; Mn)
Tree leaves: DEHA-0F25234g, KLLA-0A03740g, CAGL-0E01969g, YALI-0C04411g, SACE-SMF1, SACE-SMF3, DEHA-0D06996g, CAGL-0A03476g, KLLA-0D09581g, KLLA-0F17391g, CAGL-0J00407g, DEHA-0G09251g, SACE-SMF2, YALI-0D26818g

Figure 8. The Yeast OFeT Family (Oxidase-dependent Iron Transporters). TC # 9.A.10. Scale bar: 0.1.
Subfamilies (location; substrates): 9.A.10.Y1 (plasma membrane, vacuoles; Fe); 9.A.10.1 (plasma membrane, vacuoles; Fe)
Tree leaves: KLLA-0C01694g, CAGL-0J08481g, YALI-0A20273g, SACE-YDR506, YALI-0D07282g, YALI-0D06754g, KLLA-0F26400g, DEHA-0E13332g, KLLA-0D05489g, CAGL-0K12738g, SACE-FET5, DEHA-0G05720g, CAGL-0F06413g, SACE-FET3, YALI-0D07304g, YALI-0D06688g, SACE-FTH1, CAGL-0M05511g, KLLA-0F28039g, DEHA-0C07117g, DEHA-0C06226g, YALI-0A04917g, DEHA-0D05269g, DEHA-0E13211g, KLLA-0A03025g, CAGL-0I06743g, SACE-FTR1

Figure 9. The Yeast CTR (Copper Transporters). TC # 9.A.12. Scale bar: 0.1.
Subfamilies (location; substrates): 9.A.12.Y1,2,3 (no data; no data); 9.A.12.2 (plasma membrane; Cu); 9.A.12.Y4 (plasma membrane; Cu); 9.A.12.1 (vacuoles; Cu)
Tree leaves: YALI-0C20295g, KLLA-0B11407g, CAGL-0D04708g, DEHA-0F16390g, SACE-CTR1, DEHA-0B00407g, SACE-CTR3, CAGL-0I02508g, SACE-CTR2, KLLA-0A09207g, DEHA-0G15268g

The variation coefficients of transporters in subfamilies

Number of ORFs in each subfamily
CV    YALI  DEHA  KLLA  CAGL  SACE  TC/YETI identifier  Family or subfamily name
9.87    39    27    13     6    10  2.A.1.14            The Anion:Cation Symporter (ACS) Family
8.79    17     4     3     0     2  2.A.67.1            Subfamily of the Oligopeptide Transporter (OPT) Family
7.00     0     0     0     0     7  8.A.9.Y1            Subfamily of the rBAT Transport Accessory Protein (rBAT) Family
6.60    33    24     8    10    12  2.A.1.2             The Drug:H+ Antiporter-1 (12 Spanner) (DHA1) Family
6.00     8     8     0     0     0  3.D.1.2             Subfamily of the Proton-translocating NADH Dehydrogenase (NDH) Family
5.26    27    48    20    17    34  2.A.1.1             The Sugar Porter (SP) Family
4.88    27     7    10    11    10  9.A.5.1             Subfamily of the Peroxisomal Protein Importer (PPI) Family
3.92    14     5     4     1     6  2.A.1.16            The Siderophore-Iron Transporter (SIT) Family
3.80     5     8    12     3     0  2.A.1.13            The Monocarboxylate Porter (MCP) Family
3.50    14     3     3     5     5  9.B.17.1            The Putative Fatty Acid Transporter (FAT-1) Type 1 Subfamily
3.00     3     0     0     0     0  3.A.1.201           The Multidrug Resistance Exporter (MDR) Family (ABCB)
2.62     7     3     1     1     1  2.A.17.2            Subfamily of the Proton-dependent Oligopeptide Transporter (POT) Family
2.25     0     0     0     3     3  3.E.1.4             The Fungal Subfamily of the Ion-translocating Microbial Rhodopsin Subfamily
2.21     6     3     2     0     1  2.A.1.12            The Sialate:H+ Symporter (SHS) Family
2.13    11     7     5     1     7  1.A.20.5            Subfamily of the Human Phagocyte NADPH Oxidase Cyt b558 H+ Channel
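The Mean, Variance and CV figures quoted for the most variable subfamilies can be reproduced from the five per-species ORF counts. Checking the quoted numbers suggests that the variance is the sample variance (divisor n − 1) and that the "CV" is the variance divided by the mean (for ACS, 187.5 / 19.0 ≈ 9.87), not the more usual standard deviation divided by the mean. This reading is an inference from the numbers, not something the slides state. A minimal sketch under that assumption, with `dispersion` as a hypothetical helper name:

```python
from statistics import mean, variance


def dispersion(counts):
    """Return (mean, sample variance, variance/mean) for per-species ORF counts."""
    m = mean(counts)
    v = variance(counts)  # sample variance, divisor n - 1 (assumed from the slide values)
    ratio = v / m if m else float("nan")
    return m, v, ratio


# ORF counts in the order YALI, DEHA, KLLA, CAGL, SACE, copied from the slides.
orf_counts = {
    "2.A.1.14 ACS": [39, 27, 13, 6, 10],
    "2.A.67.1 OPT": [17, 4, 3, 0, 2],
    "2.A.1.1 SP": [27, 48, 20, 17, 34],
}

for name, counts in orf_counts.items():
    m, v, ratio = dispersion(counts)
    print(f"{name}: mean={m:.1f} variance={v:.1f} CV={ratio:.2f}")
```

A variance-to-mean ratio well above 1 indicates that a subfamily's size is far more uneven across the five species than a Poisson-like spread would give, which is how the slides rank the most dynamic subfamilies.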