Genetic variation and fitness

Hardy-Weinberg law

According to the Hardy-Weinberg law, gene frequencies are constant. Assume a gene with two alleles A and B that occur with frequencies p and q = 1 - p.

Crossing scheme:

            A (p)    B (q)
  A (p)     pp       pq
  B (q)     qp       qq

\[ (p + q)^2 = p^2 + 2pq + q^2 = 1 \]

After crossing:

  Genotype    Frequency    Frequency of B
  AA          p^2          0
  AB          2pq          2pq / 2
  BB          q^2          q^2
  Sum         1            pq + q^2

What is the frequency after crossing?

\[ \frac{pq + q^2}{p^2 + 2pq + q^2} = \frac{q(p + q)}{(p + q)^2} = q \]

The frequency of B is therefore unchanged.

[Figure: genotype frequencies pp, 2pq, and qq plotted against the frequency p of allele A.]
The frequency of heterozygotes is highest at p = q = 1/2.

How can evolution occur?

Assumptions of the Hardy-Weinberg law
1. No mutations to generate new alleles (no genetic variability)
2. Mating is random
3. The population is closed
4. The population is infinitely large
5. Individuals are equivalent

None of these assumptions is fully met in nature. Thus, gene frequencies change permanently. Therefore, evolution must occur!

Inbreeding

What is the probability for a child to get a certain allele from its grandparents?

[Figure: three pedigrees with grandparents (GM1, GF1, GM2, GF2, carrying alleles A,B / C,D / E,F / G,H), parents (M, F), and child (Ch). In one pedigree GF1 is already inbred (genotype C,C). Depending on the pedigree, the probability that Ch gets allele C is 0.25, 0.125, or 0.5.]

The mean probability to get an allele X from one of the members of a lineage is called the coefficient of inbreeding F_X. Sewall Wright defined this coefficient as

\[ F_X = \sum_{i=1}^{n} \left(\frac{1}{2}\right)^{k+1} (F_A + 1) \]

The sum runs over the n paths that connect the two parents of X through a common ancestor, k is the number of connecting links on a path, and F_A is the coefficient of inbreeding of the common ancestor A on that path.
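The two calculations above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the inbreeding function assumes that each path is given as a pair (k, F_A), with k the number of links between the two parents through the common ancestor.

```python
def hardy_weinberg(p):
    """Genotype frequencies after random mating for allele frequency p of A."""
    q = 1.0 - p
    return {"AA": p * p, "AB": 2.0 * p * q, "BB": q * q}


def frequency_b_after_crossing(p):
    """Frequency of allele B after crossing: (pq + q^2) / (p^2 + 2pq + q^2)."""
    g = hardy_weinberg(p)
    return (g["AB"] / 2.0 + g["BB"]) / sum(g.values())


def wright_inbreeding(paths):
    """Wright's coefficient F_X, summed over paths given as (k, F_A) pairs.

    Assumption: k is the number of links on the path, F_A the inbreeding
    coefficient of the common ancestor on that path.
    """
    return sum(0.5 ** (k + 1) * (f_a + 1.0) for k, f_a in paths)


if __name__ == "__main__":
    print(frequency_b_after_crossing(0.3))   # stays at q = 0.7 (up to rounding)
    print(wright_inbreeding([(2, 0.0)]))     # one path with two links
    print(wright_inbreeding([(2, 1.0)]))     # same path, inbred common ancestor
```

With a single path of two links, the function gives 0.125 for a non-inbred common ancestor and 0.25 for a fully inbred one (F_A = 1).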
Mutation rates

Assume the number of mutation events M in a genome is proportional to the total amount of the mutation-inducing agent, the dose D:

\[ M \propto D, \qquad M = kD, \qquad \mu = \frac{M}{N} = \frac{kD}{N} \]

Here \mu is the mutation rate.

The change in gene frequency is assumed to be proportional to the actual gene frequency multiplied by the mutation rate. With mutation rates \mu (A to B) and \nu (B to A):

\[ \frac{dp}{dt} = -\mu p \;\Rightarrow\; p = p_0 e^{-\mu t}, \qquad \frac{dq}{dt} = -\nu q \;\Rightarrow\; q = q_0 e^{-\nu t} \]

The change of gene frequency follows an exponential function.

Equilibrium conditions

The change in p is the sum of forward and backward mutations:

\[ \frac{dp}{dt} = -\mu p + \nu q = -\mu p + \nu (1 - p) \]

At equilibrium dp/dt = 0:

\[ \mu p = \nu q = \nu (1 - p) \;\Rightarrow\; p = \frac{\nu}{\mu + \nu} \]

Under constant forward and backward mutation rates p and q will achieve equilibrium frequencies. Otherwise they will change permanently.

Constant immigration of individuals causes a permanent linear change in allele frequency.

Nonrandom mating

If mating is totally random, a population is said to be panmictic. A special type of nonrandom mating is inbreeding. Inbreeding results in the accumulation of homozygotes.

Assortative mating describes a situation where breeding occurs among individuals with similar genetic structure. The opposite is called disassortative mating. Positive assortative mating increases the degree of inbreeding.

[Figure: percent offspring mortality (< 21 years) by degree of relatedness of the parents (first cousins, 1 1/2 cousins, second cousins, not related).] Inbreeding depression due to homozygosity in Italian marriages 1903-1907.

[Figure: quantile of genetic similarity of pairs against quantile of cross-sex genetic similarity; Domingue et al. 2014, PNAS.] American pairs have a slight (about 4.5% effect) affinity to partners of similar genetic predisposition.

Individuals are not equivalent

If individuals are not equivalent, they have different numbers of progeny. Selection sets in. Selection changes the frequencies of genes.

What is the unit of selection? The gene is therefore a natural unit of selection. However, selection operates on different stages of individual development.

Five levels of natural selection
• Gametic selection
• Compatibility selection
• Ontogenetic selection
• Viability selection
• Mating success

[Figure: the five levels of selection placed along the life cycle: gametes, zygotes, children, adults, parents.]

Intragenomic conflict occurs when genes are selected for at earlier stages of development that may later be disadvantageous. This can occur if they are transmitted by different rules.

Examples of such genes
• Transposons
• Cytoplasmic genes

The ultimate outcome of selection is a change in gene frequencies due to differential mating success.

Selection changes the frequency distribution of character states

[Figure: three panels (stabilizing selection, directional selection, diversifying selection), each showing phenotypic frequency against phenotypic character value for parents and offspring.]

Selection changes the frequency of alleles

The absolute fitness W of a genotype is defined as the per capita growth rate of this genotype. Using the Pearl-Verhulst model of population growth, absolute fitness is given by the growth parameter r of the logistic growth function for each genotype i:

\[ \frac{dN_i}{dt} = r_i N_i \, \frac{K - N_i}{K} \]

Absolute fitness is therefore equivalent to the reproduction rate of a focal population.

The relative fitness w of a genotype is defined as the value of r with respect to the highest value of r of any genotype: w = W / W_max. The highest value of w is arbitrarily set to 1. Hence 0 ≤ w ≤ 1.

The value s = 1 - w is defined as the selection coefficient. It measures the strength of selection against a genotype relative to the fittest genotype. s = 1 means highest selection pressure. s = 0 means lowest selection pressure.
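The definitions of relative fitness and selection coefficient above can be illustrated with a short Python sketch. The function names and the example fitness values are hypothetical and chosen only for illustration; absolute fitness is assumed to be given as one positive number per genotype.

```python
def relative_fitness(absolute):
    """Relative fitness w = W / W_max for a dict {genotype: W}."""
    w_max = max(absolute.values())
    return {genotype: w_abs / w_max for genotype, w_abs in absolute.items()}


def selection_coefficients(absolute):
    """Selection coefficient s = 1 - w for each genotype."""
    return {g: 1.0 - w for g, w in relative_fitness(absolute).items()}


if __name__ == "__main__":
    # Hypothetical growth rates r for three genotypes.
    growth_rates = {"AA": 2.0, "AB": 2.0, "BB": 1.0}
    print(relative_fitness(growth_rates))        # fittest genotypes get w = 1
    print(selection_coefficients(growth_rates))  # fittest genotypes get s = 0
```

In this made-up example the genotype BB grows half as fast as the fittest genotype, so it has w = 0.5 and s = 0.5.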
A general scheme for two alleles

  Initial allele frequencies:   A: p    B: q    Sum: 1

  Crossing             AA          AB, BA       BB          Sum
  Before selection     p^2         2pq          q^2         1
  Relative fitness     w11         w12          w22
  After selection      w11 p^2     2 w12 pq     w22 q^2     w11 p^2 + 2 w12 pq + w22 q^2

How do allele frequencies change after selection?

\[ p' = \frac{p\,(w_{11} p + w_{12} q)}{w_{11} p^2 + 2 w_{12} pq + w_{22} q^2}, \qquad q' = \frac{q\,(w_{12} p + w_{22} q)}{w_{11} p^2 + 2 w_{12} pq + w_{22} q^2} \]

The change of frequency of p is then

\[ \frac{dp}{dt} \approx \Delta p = p' - p = \frac{p\,(w_{11} p + w_{12} q)}{w_{11} p^2 + 2 w_{12} pq + w_{22} q^2} - p \]

\[ \frac{dp}{dt} = \frac{pq\,[\,p\,(w_{11} - w_{12}) + q\,(w_{12} - w_{22})\,]}{w_{11} p^2 + 2 w_{12} pq + w_{22} q^2} \]

This is the general framework for studying allele frequencies after selection, the basic equation of classical population genetics.

The dominant allele has the highest fitness

w11 = w12 > w22, with w11 = w12 = 1 and w22 = 1 - s. Inserting these values into the basic equation gives

\[ \frac{dp}{dt} = \frac{s\,p\,(1 - p)^2}{1 - s\,(1 - p)^2} \]

[Figure: f(p) against generation (0 to 25) for w22 = 0, 0.3, 0.5, 0.7, and 0.9.]

Poison tolerance in rats

[Figure: frequency of resistant individuals (percent) from 1975 to 1978, with the start and the end of Warfarin poisoning marked.] Rat poisoning with Warfarin in Wales shows how fast advantageous alleles become dominant.

Lactose tolerance in the Neolithic

If lactose-tolerant children had a 20% better survival probability, lactose tolerance would have been common after about 100 generations (1500 years).

Heterozygotes have the highest fitness (heterosis effect)

w11 < w12 > w22, with w12 = 1, w11 = 1 - s, and w22 = 1 - t:

\[ \frac{dp}{dt} = \frac{p\,(1 - p)\,[\,-s p + t\,(1 - p)\,]}{1 - s p^2 - t\,(1 - p)^2} \]

In heterozygote advantage, an individual who is heterozygous at a particular gene locus has a greater fitness than a homozygous individual.
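The recursion for p' above can be iterated generation by generation. The following Python sketch does this for arbitrary relative fitness values; the function names and the parameter values in the usage part are hypothetical and serve only as an illustration.

```python
def next_p(p, w11, w12, w22):
    """One generation of selection: p' = p (w11 p + w12 q) / mean fitness."""
    q = 1.0 - p
    mean_fitness = w11 * p * p + 2.0 * w12 * p * q + w22 * q * q
    return p * (w11 * p + w12 * q) / mean_fitness


def trajectory(p0, w11, w12, w22, generations):
    """Allele frequency of A over time, starting from p0."""
    frequencies = [p0]
    for _ in range(generations):
        frequencies.append(next_p(frequencies[-1], w11, w12, w22))
    return frequencies


if __name__ == "__main__":
    # Dominant allele favoured: w11 = w12 = 1, w22 = 1 - s with s = 0.5.
    print(trajectory(0.1, 1.0, 1.0, 0.5, 25)[-1])
    # Heterozygote advantage: w11 = 1 - s, w22 = 1 - t with s = 0.2, t = 0.6.
    print(trajectory(0.1, 0.8, 1.0, 0.4, 200)[-1])
```

For heterozygote advantage, setting the bracket -sp + t(1 - p) to zero gives the interior equilibrium p = t / (s + t), which is 0.75 for the values used here; the iteration approaches this value from either side.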
[Figure: f(p) against generation (0 to 25) for w11 = w22 = 0, 0.3, 0.5, 0.7, and 0.9.]

The heterosis effect stabilizes even highly disadvantageous alleles in a population. Example: sickle cell anaemia.

Reported values of selection coefficients

Endler (1986) compiled selection coefficients (s = 1 - w) for discrete polymorphic traits.

[Figure: two histograms of percentage against selection coefficient (0.05 to 0.95), one for survival differences (N = 394) and one for reproductive differences (N = 172), each shown for all values and for only statistically significant values.]

• Survival differences are mostly small.
• Reproductive differences are larger.
• The proportion of significant differences in reproductive success is higher than for the survival difference.
• In many species only a small proportion of the population reproduces successfully.

Classical population genetics predicts a fast elimination of disadvantageous alleles. Polymorphism should be low. Yet natural populations have a high degree of polymorphism.

Balancing selection within a population is able to maintain stable frequencies of two or more phenotypic forms (balanced polymorphism). This is achieved by frequency-dependent selection, where the fitness of one allele depends on the frequency of other alleles.

Example: Cepaea nemoralis

Shell colour and habitat preference of European Helicidae

  Habitat             Dark    Medium    Light    White    Polymorphic
  Nocturnal           9       8         0        0        0
  Partly nocturnal    5       15        1        0        0
  General habitat     0       7         2        0        8
  Exposed             0       14        10       1        10
  Very exposed        0       0         17       3        14

The fundamental theorem of natural selection

Sir Ronald Aylmer Fisher (1890-1962)

[Table: fitness values of eleven alleles (a to k) over generations 1 to 8, with the variance, the mean, and the difference in mean for each generation.]
[Figure: selection effect (change in fitness) and variance s² on logarithmic scales.]

\[ \sigma_w^2 \propto \bar{w}\,\Delta\bar{w} \]

By definition, variance and mean fitness have positive values. The Fisher equation is a tautology. It is a simple restatement of the definitions of mean and variance. Nevertheless, it is the basic description of evolutionary change.

Because mean fitness and its variance cannot be negative, the fundamental theorem states that fitness always increases through time. Evolution has a direction.

Adaptive landscapes

Theodosius Dobzhansky (1900-1975), Sewall G. Wright (1889-1988)

[Figure: adaptive landscapes, fitness against genetic composition / morphological structure, with local peaks and a global peak occupied by species A to F.]

Species occupy peaks in adaptive landscapes where altitude denotes fitness. Species increase in fitness through time. To evolve into new species they first have to cross adaptive valleys.

High adaptive peaks are hard to climb, but when reached they might allow for fast further evolution, and also for long-term survival and stasis.

Evolution without change in fitness

Neutral evolution and genetic drift

Motoo Kimura (1924-1994)

Assume a parasitic wasp that infects a leaf miner. Take 100 wasps, of which 80 have a yellow abdomen and 20 have a red abdomen.
A leaf-eating elephant kills 5 mines containing red and 3 mines containing yellow wasps. By chance the frequencies of red and yellow changed to 15 red and 77 yellow ones. The new frequencies are

red: 15 / (15 + 77) = 0.16
yellow: 1 - 0.16 = 0.84

[Figure: frequencies of alleles A1 to A5 through time.]

Over many generations, changes in gene frequencies can be viewed as a random walk.

A random walk of allele occurrences

[Figure: number N of copies of an allele against time (0 to 80), starting at i0 = 20 and ending at i80 = 12.]

Survival times of alleles

\[ T_E \approx \frac{2 \ln(1/p)}{\mathrm{Var}(1/p)} \left( \ln(N) - \frac{\ln(1/p)}{2} \right) \]

The Foley equation of species extinction probabilities applied to allele frequencies.

[Figure: survival time against the initial number of allele A (1 to 100000, logarithmic axis).]

At low allele frequencies, survival times are approximately logarithmic functions of frequency.

Effective population size

If we have N individuals in a population, not all contribute genes to the next generation (reproduce). The effective population size is the mean number of individuals of a population that reproduce.

Consider a diploid population of effective population size N_e. Neutral mutations are those that do not significantly affect fitness. Let u_e be the neutral mutation rate at a given locus. The number of new neutral mutations is 2 N_e u_e.

The frequency of heterozygotes in a neutral population is

\[ H = \frac{4 N_e u_e}{4 N_e u_e + 1} \]

[Figure: H (logarithmic axis, 0.001 to 1) against N_e (0 to 80000) for a mutation rate of u_e = 10^{-6}.]

At fairly high population sizes, neutral theory predicts high levels of polymorphism. Neutral genetic drift explains the high degree of polymorphism in natural populations.

Genome complexity and genetic drift

Lynch and Conery 2003

Assume a newly arisen neutral allele within a haploid population of effective size N_e. Given a mutation rate u of this allele, u N_e mutations will occur within the population.
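The neutral heterozygosity formula and the random-walk picture of drift above can be tried out numerically. The Python sketch below is an illustration under stated assumptions: the function names are hypothetical, and the drift simulation uses Wright-Fisher-style resampling of a fixed number of gene copies, a model the slides do not name.

```python
import random


def neutral_heterozygosity(n_e, u_e):
    """H = 4 N_e u_e / (4 N_e u_e + 1) for a neutral diploid population."""
    theta = 4.0 * n_e * u_e
    return theta / (theta + 1.0)


def drift_walk(count, pop_size, generations, seed=0):
    """Random walk of the number of copies of one allele.

    Assumption: each generation, every one of the pop_size gene copies is
    drawn independently with the current allele frequency (Wright-Fisher
    style resampling).
    """
    rng = random.Random(seed)
    counts = [count]
    for _ in range(generations):
        freq = counts[-1] / pop_size
        counts.append(sum(1 for _ in range(pop_size) if rng.random() < freq))
    return counts


if __name__ == "__main__":
    # Wasp example: 15 red and 77 yellow wasps survive.
    print(round(15 / (15 + 77), 2))
    # Heterozygosity for u_e = 1e-6 at several effective population sizes.
    for n_e in (1_000, 20_000, 80_000):
        print(n_e, neutral_heterozygosity(n_e, 1e-6))
    # One random walk starting with 20 copies among 100, over 80 generations.
    print(drift_walk(20, 100, 80))
```

Changing the `seed` argument gives a different walk; once the count reaches 0 or the full population size it stays there, which corresponds to loss or fixation of the allele.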
[Figure: N_e u against genome size (MB) on logarithmic axes for prokaryotes, unicellular eukaryotes, invertebrates, land plants, and vertebrates, with the fitted power law y = 0.0522 x^{-0.548}.]

[Figure: effective population size N_e (10^4 to 10^8) against the selective effect of a mutation (from negative, -10^{-3}, to neutral, -10^{-8}) for prokaryotes, unicellular eukaryotes, invertebrates, land plants, and vertebrates, separating the region where mutations are removed from the region where mutations can be fixed by genetic drift.]

In accordance with the Eigen equation, only small effective population sizes allow for larger genome sizes.

The low effective population sizes of higher organisms increase the speed of evolution as a power function, because a much higher proportion of mutations can be fixed through genetic drift.

Population size

Populations must not become too small.

[Figure: population size through time, showing a bottleneck of very low population size followed by either recovery or extinction.]

Bottlenecks
• increase the degree of inbreeding,
• decrease the genetic variability,
• increase the effect of genetic drift.

The population after recovery might have a significantly altered genetic composition compared to the original population (founder effect).

Humans have extraordinarily low genetic variation, suggesting a bottleneck in sub-Saharan populations before 60,000 years ago.

Samaritans: strictly inbreeding ethnic group of about 700 people.

Neanderthals: lived in very small groups of highly inbred people. Total population size was at most several thousand in the whole of Europe.

Today's reading

All about selection: http://en.wikipedia.org/wiki/Natural_selection
Polymorphism: http://en.wikipedia.org/wiki/Polymorphism_(biology)
Fundamental theorem of natural selection: http://stevefrank.org/reprints-pdf/92TREE-FTNS.pdf and http://users.ox.ac.uk/~grafen/cv/fisher.pdf