Structural and Evolutionary Genomics
NATURAL SELECTION IN GENOME EVOLUTION
Giorgio Bernardi
SZN
ELSEVIER

N. Hartmann’s “strata of existence” (after Bernardi, 2005)

[Figure: timeline in billions of years, with axis ticks at -14, -10, -5 and 0, marking the Big Bang, the formation of the Earth, the origin of life and multicellular organisms]

Origin of life
1. Absolutely exceptional chance event (Jacques Monod, 1970)
2. Necessary event under the prevailing physico-chemical conditions (Christian de Duve, 1995)

Jacques Monod, “Le Hasard et la Nécessité” (“Chance and Necessity”), 1970

Christian de Duve, “Vital Dust: Life as a Cosmic Imperative”, 1995

Georges Cuvier (1769 – 1832)
1. Fixity of species
2. Extinction of species

Jean-Baptiste Lamarck, “Philosophie Zoologique”, 1809
• “Internal force”
• Inheritance of acquired characters

Alfred R. Wallace, “On the Tendency of Varieties to Depart Indefinitely from the Original Type” (1858)

Charles Darwin, “The Origin of Species”, 1859

Evolution: descent with modification
Charles Darwin

1. Classical approaches to the study of evolution; classical theories
2. Our approach: structural and evolutionary genomics
3. An ultra-darwinian view of evolution: the neo-selectionist theory

At the level of the “classical phenotype” (form and function of organisms)
1. at the trait level (natural selection; Darwin, 1859; Wallace, 1858)

“This preservation of favourable individual differences and variations [positive selection], and the destruction of those which are injurious variations [negative selection], I have called Natural Selection, or the Survival of the Fittest [adaptation]. Variations neither useful nor injurious [neutral variations] would not be affected by natural selection and would be left either a fluctuating element, … or would ultimately become fixed, ...”
Charles Darwin

At the level of the “classical phenotype” (characters)
1. at the trait level (natural selection)
2. at the genetic level (selectionist theory; Fisher, 1930; Wright, 1931; Haldane, 1932)

Ronald A. Fisher, John B.S. Haldane, Sewall Wright

The selectionist (neo-darwinian, synthetic) theory of evolution reconciled Mendel’s laws of inheritance with evolution but neglected neutral changes.

At the level of the “classical phenotype” (proteins and expression)
1. at the trait level (natural selection)
2. at the genetic level (selectionist theory)
3. at the protein level (Zuckerkandl and Pauling, 1962; Sueoka, 1962; Freese, 1962; Kimura, 1968; 1983)

[Figure: the molecular clock; amino acid differences plotted against time (Myr)]

Biases in the replication machinery (Sueoka, 1962; Freese, 1962)
[Figure: AT and GC in prokaryotes; GC axis with ticks at 25, 50 and 75]

Motoo Kimura, “The Neutral Theory of Molecular Evolution”, 1983

The mutation-random drift theory (the neutral theory)
“the main cause of evolutionary change at the molecular level - changes in the genetic material itself - is random fixation of selectively neutral or nearly neutral mutants”. (Kimura, 1983)

At the level of the “genome phenotype” (Bernardi et al., 1973, 1976)
Instead of looking at a few genes, this approach looked at the whole genome, specifically at its compositional patterns and their evolution, and thereby moved from the genetic level to the genomic level.

The genome: an operational definition
The haploid chromosome set (Hans Winkler, 1920)

• constant amount of DNA per cell in any given organism (Boivin et al., 1948; Mirsky and Ris, 1949)
• c-value, or constant value (Swift, 1950)
• genome size (Hinegardner, 1976)

The prokaryotic paradigm
The genome as the sum total of genes

Genome size, coding sequences and gene numbers in some representative organisms

Organism      Genome size a (Mb b)   Coding sequences (%)   Gene numbers a   kb/gene a, b
Haemophilus   2                      85                     2,000            1
Yeast         12                     70                     6,000            2
Human         3,200                  2                      32,000           100

a in approximate figures
b kb, kilobases, or thousands of base pairs (bp); Mb, megabases, or millions of bp (Gb, gigabases, are billions of bp)

The genome as the sum total of coding and non-coding sequences

The genome
• The bean bag view
• Additive vs. cooperative properties
• The integrated ensemble view

Vertebrates
1. are a very small phylum
2. have a common genetic background (vertebrates share most genes)
3. have a large genome (~ 3000 Mb, with coding sequences representing < 3%)

Structural genomics of vertebrates: our main conclusions
(i) Genome compartmentalization (1973, 1976) (discontinuous compositional heterogeneity, isochores)
(ii) Genome phenotype (1976, 1986) (compositional patterns of isochores and coding sequences)
(iii) Genomic code: compositional correlations
● between coding sequences and
- non-coding sequences (1984)
- thermal stability of proteins (1986)
● among codon positions (universal correlation; 1992)
● First evidence that the eukaryotic genome is an integrated ensemble (no junk DNA)
● Incompatibility with neutral theory

[Figure: isochore patterns, panels numbered 1 to 10 (Costantini, Saccone, Auletta and Bernardi; Pavlicek, Paces, Clay and Bernardi; years shown: 2001, 2004)]

[Figure: genome phenotypes; panels: DNA, coding sequences]

[Figure: compositional correlations; universal correlations]

[Figure: hydrophobicity]

Gene distribution
• Bernardi et al., 1984
• Mouchiroud et al., 1991
• Zoubak et al., 1996
• Lander et al., 2001

Correlations with structure and function

Intron, UTR size       Large    Small
Chromatin structure    Closed   Open
GC heterogeneity       Low      High
Gene expression        Low      High
Replication timing     Late     Early
Recombination          Low      High

Genome evolution in vertebrates
1. Conservative mode
2. Transitional mode

Genome evolution in vertebrates
The conservative mode
Mammalian orders are characterized by
• a star-like phylogeny (over 100 Myrs)
• a strong mutational AT bias (GC → AT; mC → T)
• a conservation of base composition, methylation and CpG levels

[Figure: star-like phylogeny from the most recent common ancestor to the extant mammalian orders over 100 Myrs; AT bias; similar isochore patterns]

Genome evolution in vertebrates
The transitional mode
GC increase

THE COMPOSITIONAL TRANSITIONS (cold- to warm-blooded vertebrates)
Compositional changes
1. concerned the (gene-dense) ancestral genome core
2. affected both coding and non-coding sequences (at comparable and correlated levels)
3. occurred (and were similar) in the independent ancestral lines of mammals and birds (convergent evolution)
4. did not affect cold-blooded vertebrates (with exceptions)
5. stopped with the appearance of present-day mammals and birds (an equilibrium was reached)

The formation and maintenance of GC-rich isochores are due to NATURAL SELECTION
Selective advantages: increased thermodynamic stability of DNA, RNA and proteins (Bernardi and Bernardi, 1986)

[Figure: 5mC (%) plotted against GC (%) for polar fish, tropical/temperate fish, mammals, snakes, lizards, turtles and crocodiles; correlation coefficients shown: R = 0.50, R = 0.45, R = 0.80 (Varriale et al., 2005)]

The compositional transitions affected
1. only a small part of the genome (the ancestral genome core)
2. both coding and non-coding sequences (at comparable and correlated levels)

Chromosomal regions in interphase nuclei

                                           Gene-rich   Gene-poor
Chromatin                                  open        closed
Location                                   central     peripheral
GC increase at higher body temperature     needed      not needed
(for chromatin stability)

Saccone et al., 2002; Di Filippo et al., 2005

The genome compartmentalization, the genome phenotype and the genomic code, and the conservative and transitional modes of genome evolution cannot be accounted for by “a random fixation of neutral mutants” (i.e., by the neutral theory)

YET the majority of mutations per se can only be neutral or nearly neutral (if for no other reason than that the vast majority of the genome is non-coding)

The neo-selectionist theory (Bernardi, 2004)
1. explains how natural selection can take place at the isochore level
2. reconciles the neutral theory with natural selection
3. makes predictions: genome phenotype differences in populations; genomic fitness

[Figure: GC level around a compositional optimum (ticks at 54%, 55% and 56% GC); labels: changes to AT, changes to GC, critical changes, structural transition, negative selection]

The structural transition can be visualized as a change in DNA and chromatin structure that affects gene expression. Hence negative selection.

Isopycnic expression of integrated viral sequences
• BLV (Kettmann et al., 1979)
• HBV (Zerial et al., 1986)
• MMTV (Salinas et al., 1987)
• RSV
• HTLV-1 (Zoubak et al., 1994)
• HIV-1 (Rynditch et al., 1991; 1998) (Tsyba et al., 1992; 2004)

Natural selection (mainly negative selection)
1. controls neutral changes at the isochore level
2. causes the shifts in the compositional transitions of the genome

[Figure: ratchet mechanism; GC ticks at 49%, 49.5%, 50% and 51%; axis label T°; negative selection below the lower (blue) level; shift of the compositional optimum (black line)]

[Figure: classes of changes (deleterious, neutral, advantageous, critical) under the darwinian, neo-darwinian, neutral and ultra-darwinian views]

Predictions of the neo-selectionist theory
1. Genome phenotype differences in populations
[Figure: Population A and Population B; marked regions denote lower GC levels]
2. Genomic fitness

Although the neo-selectionist theory can integrate the neutral theory, it represents a very different view of genome evolution.

The dilemma of the neutral theory (Kimura, 1983)
• “Why natural selection is so prevalent at the phenotypic level and yet random fixation of selectively neutral or nearly neutral alleles prevails at the molecular level”? “laws governing molecular evolution are clearly different from those governing phenotypic evolution.”
• “increases and decreases in the mutant frequencies are due mainly to chance.”
“Survival of the luckiest”

According to the neo-selectionist theory, natural selection operates not only on
1. the classical phenotype (form and function; proteins and expression)
but also on
2. the genome phenotype (compositional patterns and functional implications)
“Survival of the fittest”

1. The eukaryotic genome is an integrated ensemble of compositionally correlated coding and non-coding sequences: there is no junk DNA.
2. Isochore patterns (genome phenotypes) are stable or changing depending upon environmental conditions.
3. The GC increases accompanying the transition from cold- to warm-blooded vertebrates are advantageous because they thermodynamically stabilize DNA, RNA and proteins.
4. Changes only affect the (gene-dense) genome core because of its open chromatin structure.
5. The neo-selectionist theory (an ultra-darwinian theory) explains how natural selection controls neutral changes at the isochore level and causes shifts in compositional genome transitions.
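
The GC levels and compositional patterns referred to throughout these slides can be made concrete with a short calculation. The sketch below is only an illustration, not the procedure used in the cited studies: it assumes a plain DNA string, a caller-chosen window size, and hypothetical function names (`gc_percent`, `windowed_gc`), and it reports the GC percentage of consecutive non-overlapping windows.

```python
from __future__ import annotations


def gc_percent(seq: str) -> float:
    """GC content of seq, as a percentage of its unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return 100.0 * gc / total


def windowed_gc(seq: str, window: int) -> list[float]:
    """GC percentage of consecutive, non-overlapping windows.

    Hypothetical helper: a trailing fragment shorter than `window` is ignored.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    return [
        gc_percent(seq[start:start + window])
        for start in range(0, len(seq) - window + 1, window)
    ]


if __name__ == "__main__":
    # Toy sequence (an assumption for illustration): three 10-bp stretches
    # with low, high and intermediate GC levels.
    toy = "ATATATATAT" + "GCGCGCATGC" + "ATGCATGCAT"
    print(windowed_gc(toy, 10))  # [0.0, 80.0, 40.0]
```

The window size is a free parameter in this sketch; comparing the resulting profiles between sequences is one simple way to look at compositional heterogeneity.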
Acknowledgements
• Fernando Alvarez, Montevideo
• Stilianos Arhondakis, Naples
• Fabio Auletta, Naples
• Oliver Clay, Naples
• Stéphane Cruveiller, Naples/Paris
• Maria Costantini, Naples
• Giuseppe D’Onofrio, Naples
• Kamel Jabbari, Paris
• Héctor Musto, Montevideo
• Adam Pavlicek, Prague/Paris
• Edda Rayko, Paris
• Alla Rynditch, Kiev
• Salvo Saccone, Catania
• Giuseppe Torelli, Naples
• Annalisa Varriale, Naples