Concerted Evolution
Dan Graur

Three evolutionary models for duplicated genes

Divergent (classical) evolution vs. concerted evolution
[Figure: expected vs. observed pattern. Source: Ganley AR, Kobayashi T. 2007. Genome Res. 17:184-191.]

Ribosomal RNA gene unit (in a cluster)
ITS = internal transcribed spacers
ETS = external transcribed spacers
NTS = nontranscribed spacers

[Figure: rRNA gene units of Xenopus borealis and Xenopus laevis]

The 18S and 28S sequences of X. laevis and X. borealis are identical.
NTS regions differ between the two species.
NTS regions are identical within each species.
Conclusion: NTS regions in each species have evolved in concert, but have diverged rapidly between species.

Possible explanations for the homogeneity within species:
(a) Stringent selection. Refuted by the fact that the NTS regions are as conserved as the functional rRNA sequences.
(b) Recent multiplication. Refuted by the fact that the intraspecific homogeneity does not decrease with evolutionary time.
(c) Concerted evolution.

CONCERTED EVOLUTION
A member of a gene family does not evolve independently of the other members of the family. It exchanges sequence information with other members, reciprocally or nonreciprocally. Through genetic interactions among its members, a multigene family evolves in concert as a unit.
Concerted evolution results in a homogenized set of nonallelic homologous sequences.

CONCERTED EVOLUTION REQUIRES:
(1) the horizontal transfer of mutations among the family members (homogenization).
(2) the spread of mutations in the population (fixation).

Mechanisms of concerted evolution
1. Gene conversion
2. Unequal crossing-over
3. Duplicative transposition
4. Gene birth and death

[Figure: gene conversion and concerted evolution]

Gene Conversion
Unbiased Gene Conversion: Sequence A has as much chance of converting sequence B as sequence B has of converting sequence A.
Biased Gene Conversion: The probabilities of gene conversion between two sequences in the two possible directions are unequal. If the conversional advantage of one sequence over the other is absolute, then one sequence is said to be the master and the other the slave.

Gene conversion has been found in all species and at all loci that were examined in detail.
Biased gene conversion is more common than unbiased gene conversion.
The rate of gene conversion varies with genomic location.

[Figure: unequal crossing-over and concerted evolution]

Unequal crossing over
[Photo: Tomoko Ohta]

Concerted evolution: Advantages of Gene Conversion over Unequal Crossing-Over
1. Unequal crossing-over changes the number of repeats, and may cause a dosage imbalance. Gene conversion does not change repeat number.
   [Figure: normal configuration; following unequal crossing-over: mild α-thalassemia]
2. Gene conversion can act on dispersed repeats. Unequal crossing-over is severely restricted when repeats are dispersed.
   [Figure: deletion and duplication]
3. Gene conversion can be biased. Even a small bias can have a large effect on the probability of fixation of repeated mutants.

Concerted evolution: Advantages of Unequal Crossing-Over over Gene Conversion
1. Unequal crossing-over is faster and more efficient in bringing about concerted evolution. At the mutation level, unequal crossing-over occurs more frequently than gene conversion.
2. In a gene-conversion event, only a small region is involved. In yeast, an unequal crossing-over event involves on average ~20,000 bp. A gene-conversion tract cannot exceed 1,500 bp.

Factors affecting the rate of concerted evolution
1. the number of repeats.
2. the arrangement of the repeats.
3. relative sizes of slowly and rapidly evolving regions within the repeat unit.
4. constraints on homogeneity.
5. mechanisms of concerted evolution.
6. population size.
7. dose requirements.

1. the number of repeats.
The number of unequal crossing-overs required for the fixation of a variant repeat increases roughly with n², where n is the number of repeats.

2. the arrangement of the repeats.
Types of arrangement of repeated units: dispersed and clustered.
The dispersed arrangement causes unequal crossing-over to lead to disastrous genetic consequences.
The dispersed arrangement reduces the frequency of gene conversion.

3. relative sizes of slowly and rapidly evolving regions within the repeat unit.
Noncoding regions evolve rapidly. Coding regions evolve slowly.
Both unequal crossing-over and gene conversion depend on sequence similarity for the misalignment of repeats. The larger the share of coding regions in the repeat unit, the higher the rate of concerted evolution will be.

4. constraints on homogeneity.
Two extreme possibilities:
1. The function requires large amounts of an invariable gene product (rRNA and histone genes).
2. The function requires a large amount of diversity (immunoglobulin and histocompatibility genes).

5. mechanisms of concerted evolution.
Concerted evolution under unequal crossing-over is quicker than that under gene conversion.

6. population size.
The time required for a variant to become fixed in a population depends on population size.

7. dose requirements.
Centripetal selection against too many or too few repeats.
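Returning to factor 1 (the number of repeats), the n² relationship can be made concrete with a small sketch. It only illustrates the scaling stated above: the proportionality constant is not given in these slides, so the values are expressed relative to a reference family size. The function name `relative_uco_events` and the reference size of 10 repeats are assumptions made for illustration.

```python
def relative_uco_events(n_repeats: int, reference: int = 10) -> float:
    """Relative number of unequal crossing-overs needed to fix a variant repeat.

    Assumes the number of events grows roughly as n^2 (as stated above) and
    reports it relative to a hypothetical family of `reference` repeats.
    """
    if n_repeats <= 0 or reference <= 0:
        raise ValueError("repeat counts must be positive")
    return (n_repeats / reference) ** 2


if __name__ == "__main__":
    # Hypothetical family sizes chosen only to show the trend.
    for n in (10, 100, 1000):
        print(f"n = {n:>4}: {relative_uco_events(n):>8.0f}x the reference")
```

Under this assumption, a tenfold increase in the number of repeats implies roughly a hundredfold increase in the number of unequal crossing-overs needed for fixation.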
Decreases variation
[Figure: gene conversion; panel labels: "2 loci, 3 alleles" and "2 loci, 4 alleles"]

Detecting Concerted Evolution
When dealing with similar paralogous sequences, it is usually impossible to distinguish between two alternatives:
(1) the sequences have only recently diverged from one another by duplication.
(2) the sequences have evolved in concert.

The phylogenetic approach.
The two α-globin genes in humans (α1 and α2) are almost identical. They were initially thought to have duplicated quite recently, so that there had not been sufficient time for them to diverge in sequence.
However, duplicated α-globin genes were also discovered in distantly related species, so the most parsimonious solution is to assume that the duplication is quite ancient, but that its antiquity is obscured by concerted evolution.

[Figure: γ-globin gene tree; γ duplication 55 million years ago; speciation 5 million years ago; tip labels: Aγ, Gγ, Aγ, Gγ]
The orthologs should be closer to one another than the paralogs.

In humans, the 5' parts of Gγ and Aγ differ from one another at only 7 out of 1,550 nucleotide positions (0.5%). In contrast, the 3' parts of Gγ and Aγ differ from one another at 145 out of 1,550 nucleotide positions (9.4%).

[Figure: trees for exon 3 and for exons 1 and 2. Expected phylogenetic tree for exons 1 and 2, if gene conversion had only occurred in the human lineage.]

Death is not final: The resurrection of pancreatic ribonuclease as seminal ribonuclease in Bovinae by gene conversion
The resurrection took place through gene conversion of a small region at the 5' end of the gene.
Pseudogenes may represent reservoirs of genetic information that participate in the evolution of new genes, not only relics of inactivated genes whose fate is genomic extinction.

21-hydroxylase (cytochrome P21) gene
In humans, the 10-exon gene is located on chromosome 6. The gene has a paralogous pseudogene in the vicinity.
Hundreds of mutations in the 21-hydroxylase gene have been described. 75% of them are due to gene conversion.

Were it not for the fact that the pseudogene is truncated, we would be hard pressed to say which is the gene and which is the pseudogene.

gene        ATGTCTCTGACCAAGGCTGAGAGGACCATGGTCGTGTCCATATGGGGCAA
pseudogene  ATGTCTCTGACCAAGGCTGAGAGGACCATGGTCGTGTCCATATGGGGCAA
            **************************************************

gene        GATCTCCATGCAGGCGGATGCCGTGGGCACCGAGGCCCTGCAGAGGTGAG
pseudogene  GATCTCCATGCAGGCGGATGCCGTGGGCACCGAGGCCCTGCAGAG-----
            *********************************************

gene        TGCCAGACAGCCTGGGACAGGTGACAGTGTCCCAGGTGACACTGGTGTAG
pseudogene  --------------------------------------------------

gene        GTGACAGCGTGAGTTTAGTGAGGACAGGGGCCAGTGAAGAGGGGGCAATG
pseudogene  --------------------------------------------------

gene        AGGAAGCGACAGTGTGGAGGGGTAATTGTGGTGGGGAACTGTGAGGACCC...
pseudogene  --------------------------------------------------

The birth-and-death model for the evolution of gene families was proposed by Hughes and Nei (1989). In this model, new copies are produced by gene duplication.
Some of the duplicate genes diverge functionally; others become pseudogenes owing to deleterious mutations or are deleted from the genome. The end result of this mode of evolution is a multigene family with a mixture of functional genes exhibiting varying degrees of similarity to one another, plus a substantial number of pseudogenes interspersed in the mixture.

The birth-and-death model for the evolution of gene families
An important prediction of the birth-and-death process is that gene-family size will vary among taxa as a result of differential birth and death of genes among different evolutionary lineages. Thus, an understanding of the evolutionary forces governing the birth-and-death process is predicated upon an accurate accounting of the number of births (duplications) and deaths (nonfunctionalization events + deletions) in each lineage. This "bookkeeping" turns out to be anything but a trivial undertaking.

[Figure: Expansions/no change/contractions in the evolution of gene families in five Saccharomyces species. Estimates of divergence times (in millions of years) are shown in circles.]
There were 3517 gene families shared by the five species. Of these, 1254 (~36%) have changed in size across the tree. On each branch in the tree, the vast majority of gene family sizes remain static. Expansions outnumbered contractions on four of the eight branches, and contractions outnumbered expansions on the other four.

Lineage specificity
Let us compare the number of expansions and contractions on the branches leading to S. mikatae and S. cerevisiae from their common ancestor, approximately 22 million years ago. On the lineage leading to S. mikatae there were 509 families that expanded and 86 families that contracted, a ratio of about 6:1. On the lineage leading to S. cerevisiae a smaller number of families changed their size, and the ratio of expanded families (54) to contracted ones (241) was inverted, roughly 1:5.
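The bookkeeping behind the lineage comparison above can be sketched in a few lines. The counts are the ones quoted on the slides; the function and variable names are illustrative assumptions, and the code only recomputes the share of families that changed size and the expansion-to-contraction ratios, which the slides quote in rounded form.

```python
def expansion_to_contraction(expanded: int, contracted: int) -> float:
    """Number of expanded families per contracted family on one lineage."""
    if contracted <= 0:
        raise ValueError("contracted must be positive")
    return expanded / contracted


# Counts quoted above for the two lineages: (expanded, contracted).
lineages = {
    "S. mikatae": (509, 86),
    "S. cerevisiae": (54, 241),
}

# Gene families shared by the five species, and those that changed in size.
shared_families = 3517
changed_families = 1254
fraction_changed = changed_families / shared_families

if __name__ == "__main__":
    print(f"families that changed size: {fraction_changed:.1%}")
    for species, (expanded, contracted) in lineages.items():
        ratio = expansion_to_contraction(expanded, contracted)
        if ratio >= 1:
            print(f"{species}: about {ratio:.1f}:1 expansions to contractions")
        else:
            print(f"{species}: about 1:{1 / ratio:.1f} expansions to contractions")
```

Keeping the raw counts next to the derived ratios makes it easy to see how much rounding a quoted ratio involves.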
Turnover Rates
Turnover = Gains + Losses
The gene turnover rate in primates is nearly twice that in nonprimate mammals (0.0024 versus 0.0014 gains and losses per gene per million years). A further acceleration must have occurred in the great-ape lineage, such that humans and chimps gain and lose genes almost three times faster (0.0039 gains and losses per gene per million years) than the other mammals.

BIRTH-AND-DEATH EVOLUTION: EXAMPLES
The evolution of olfactory receptor gene repertoires
Olfactory receptors are G protein-coupled receptors that have seven α-helical transmembrane regions. Olfactory receptor genes are predominantly expressed in sensory neurons of the main olfactory epithelium in the nasal cavity. Animals use different olfactory receptors and different combinations of olfactory receptors to detect volatile or water-soluble chemicals.

Tetrapods have 400-2,100 olfactory receptor sequences, but 20-60% are pseudogenes. These numbers are small in comparison to the number of odorants, but olfactory receptors function in a combinatorial manner, whereby a single receptor may detect multiple odorants, and a single odorant may be detected by multiple receptors.
[Figure legend: functional olfactory receptor genes (red); pseudogenes (blue)]

Vertebrate olfactory receptor genes are classified into at least nine subfamilies (α, β, γ, δ, ε, ζ, η, θ, and κ), each of which originated from one or a few ancestral genes in the most recent common ancestor of vertebrates. There was an enormous expansion in the number of α and γ genes in non-amphibian tetrapods. The remaining subfamilies are present primarily in fish and amphibian genomes. This observation suggests that α and γ receptors mostly detect airborne odorants, while the function of the other subfamilies is to detect water-soluble odorants.
Primates generally have a smaller number of functional olfactory receptor genes than rodents and a higher proportion of pseudogenes.
[Figure labels: 1063 genes, 328 pseudogenes (24%); 388 genes, 414 pseudogenes (52%)]

BIRTH-AND-DEATH EVOLUTION: EXAMPLES
Color Vision
Color vision in primates is mediated in the eye by up to three types of photoreceptor cells (cones), which transduce photic energy into electrical potentials.
Each type of color-sensitive cone expresses one type of color-sensitive pigment (photopigment). Each photopigment consists of two components: a transmembrane protein called opsin, and either of two lipid derivatives of vitamin A, 11-cis-retinal or 11-cis-3,4-dehydroretinal. Variation in spectral sensitivity, i.e., color specificity, is determined by the sensitivity maximum of the opsins.

John Dalton. 1794. "Extraordinary Facts Relating to the Vision of Colours." Memoirs of the Manchester Literary & Philosophical Society.

[Figure: Ishihara plates]

Opsins
Long wavelength (red)
Medium wavelength (green)
Short wavelength (blue)
[Figure: suggested flag for Mars]

• Routine trichromacy = all individuals regardless of sex can achieve trichromacy.
• Dichromacy (in humans, referred to as color blindness): protanopia (L-deficiency), deuteranopia (M-deficiency), tritanopia (S-deficiency).
• Because of X-linkage, protanopia and deuteranopia are considerably more common in males than in females.
• Monochromacy can occur if both L and M photopigments are faulty.

Most prosimians (Strepsirrhini) and New World monkeys (Platyrrhini) carry only one X-linked pigment gene and are, therefore, dichromatic. The ancestral X-linked opsin is thought to resemble the M-opsin, and indeed most prosimians and New World monkeys are protanopic.
However, because shifts in the maximal sensitivity of opsins can be achieved quite easily by missense mutations in as few as 3-5 codons, L-alleles have arisen in a few diurnal taxa of prosimians. In some lineages, the L-allele became fixed in the population at the expense of the M-alleles. In consequence, these taxa are deuteranopic.

In other cases, a polymorphic state consisting of two or more alleles is maintained in the population. As an example, in white-faced capuchin monkeys (Cebus capucinus), there exist two alleles at the X-linked opsin locus, whose maximal-sensitivity peaks are similar to those of human L and M opsins, respectively. For this reason, while males and homozygous females are dichromatic, heterozygous females are trichromatic (Figure 6a.8). This type of trichromacy is called allelic trichromacy.

Squirrel monkey (Saimiri sciureus)
New World monkeys possess only two opsin loci, one autosomal and one X-linked. However, the X-linked opsin locus is highly polymorphic. Two of its three alleles have maximal-sensitivity peaks similar to those of human red and green opsin, while the third allele has an intermediate peak. A heterozygous female will be trichromatic, while males and homozygous females are dichromatic.
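The logic of allelic trichromacy described above can be sketched as a small genotype-to-phenotype rule. This is a minimal illustration under stated assumptions, not a model taken from the slides: it assumes a functional autosomal S opsin in every individual, one X-linked opsin locus, and, for the population estimate, random mating (Hardy-Weinberg proportions). The function names and the example allele frequencies are hypothetical.

```python
def color_vision(sex: str, x_alleles: tuple) -> str:
    """Classify color vision from the alleles at the single X-linked opsin locus.

    Assumes a working autosomal S opsin, so one X-linked opsin type gives
    dichromacy and two distinct X-linked opsin types give trichromacy.
    """
    expected = {"male": 1, "female": 2}
    if sex not in expected or len(x_alleles) != expected[sex]:
        raise ValueError("males carry one X-linked allele, females two")
    return "trichromatic" if len(set(x_alleles)) == 2 else "dichromatic"


def trichromatic_female_fraction(allele_freqs) -> float:
    """Expected fraction of heterozygous (trichromatic) females.

    Assumes random mating, so heterozygosity is 1 - sum(p_i^2).
    """
    if abs(sum(allele_freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    return 1.0 - sum(p * p for p in allele_freqs)


if __name__ == "__main__":
    print(color_vision("male", ("L",)))        # males are always dichromatic
    print(color_vision("female", ("M", "M")))  # homozygous female
    print(color_vision("female", ("L", "M")))  # heterozygous female
    # Hypothetical equal frequencies for three alleles at the locus.
    print(trichromatic_female_fraction([1 / 3, 1 / 3, 1 / 3]))
```

With three equally frequent alleles (an invented example), two thirds of females would be expected to be heterozygous and therefore trichromatic, whereas all males remain dichromatic.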