USC3002 Picturing the World Through Mathematics
Wayne Lawton
Department of Mathematics
S14-04-04, 65162749
matwml@nus.edu.sg

Theme for Semester I, 2008/09: The Logic of Evolution, Mathematical Models of Adaptation from Darwin to Dawkins

Natural Selection
Reference: Evolution by Mark Ridley, Chapter 5, p. 104, simplest model

Genotype   Phenotype      Chance of Survival
Y          yellow seeds   $1$
H          yellow seeds   $1$
G          green seeds    $1 - s$

$s \in [0,1]$ is the selection coefficient.
The chance of survival is relative to the maximal chance of survival among all genotypes. Notice that here it depends on the phenotype.

Natural Selection
Problem: what will the genotype frequencies be after natural selection followed by random mating?

Genotype   1st Ad. Freq.   Baby Freq.   2nd Ad. Freq.
Y          $P_Y$           $y^2$        $P_Y' = y^2/(1 - s g^2)$
H          $P_H$           $2yg$        $P_H' = 2yg/(1 - s g^2)$
G          $P_G$           $g^2$        $P_G' = g^2(1 - s)/(1 - s g^2)$

Define $y = P_Y + \tfrac{1}{2} P_H$, $g = P_G + \tfrac{1}{2} P_H$.
Define $y' = P_Y' + \tfrac{1}{2} P_H'$, $g' = P_G' + \tfrac{1}{2} P_H' = 1 - y'$.

Natural Selection
Remark: since
$$P_Y' = y^2/(1 - s g^2) \neq y'^2 = y^2/(1 - s g^2)^2$$
the genotype frequencies of the 2nd Adult population are NOT in Hardy-Weinberg equilibrium.
Let
$$\Delta y = y' - y = s y g^2/(1 - s g^2)$$
denote the change in gene frequency to the next generation. Haldane (1924) produced this model for selection (p. 107).
Since
$$s = \Delta y/(y' g^2)$$
the selection coefficient can be computed from the 2nd generation gene frequencies.

MATLAB Program for Table 5.4, p. 107

    function g = tablepage107(s,ngens,g0)
    % function g = tablepage107(s,ngens,g0)
    %
    % Wayne Lawton, 21 August 2007
    % computes gene frequencies in Table 5.4 Evolution by Ridley
    %
    % Outputs
    %   g = array of length ngens
    %   g(k) = gene frequency of recessive gene after k generations
    % Inputs
    %   s = selection coefficient
    %   ngens = number of generations
    %   g0 = initial gene frequency
    %
    g = zeros(1,ngens);   % preallocate the output array
    gt = g0;
    for n = 1:ngens
        g(n) = gt*(1-s*gt)/(1-s*gt^2);
        gt = g(n);
    end

Tabular Output

generation   s=0.05   s=0.01
0            0.9900   0.9900
100          0.5608   0.9736
200          0.1931   0.9338
300          0.1053   0.8513
400          0.0710   0.7214
500          0.0532   0.5747
600          0.0424   0.4478
700          0.0352   0.3524
800          0.0301   0.2838
900          0.0262   0.2343
1000         0.0233   0.1979

g = gene frequencies

[Figures: Plot (four slides)]

Differential Equation Approximation
for
$$\Delta g = g' - g = s(1 - g) g^2/(s g^2 - 1)$$
consists of solving the initial value problem:
$$\tilde g(0) = g(0) \quad \text{(frequency of gene g in zero-generation)}$$
$$\frac{d\tilde g}{dt}(t) = s(1 - \tilde g(t))\,\tilde g(t)^2/(s\,\tilde g(t)^2 - 1), \quad t \geq 0$$
followed by the approximation
$$g(n) \approx \tilde g(n), \quad n = 1, 2, 3, \ldots$$
The error is small if $\frac{d\tilde g}{dt}(t)$ is small.

Qualitative Observations
If $s > 0$, then
$$\frac{d\tilde g}{dt}(t) = s(1 - \tilde g(t))\,\tilde g(t)^2/(s\,\tilde g(t)^2 - 1) \leq 0$$
1. If $\tilde g(t) \approx \tilde g(0) \approx 1$ then $\frac{d\tilde g}{dt}(t) \approx -s(1 - \tilde g(t))/(1 - s)$, therefore $\tilde g(t) \approx 1 - (1 - \tilde g(0)) \exp(st/(1 - s))$.
2. For small $s$, $\frac{d\tilde g}{dt}(t) \approx -s(1 - \tilde g(t))\,\tilde g(t)^2$, therefore $\tilde g(t)$ decays fastest at $\tilde g(t) \approx \tfrac{2}{3}$, where $\frac{d\tilde g}{dt}(t) \approx -\tfrac{4s}{27}$.
3. If $\tilde g(t) \approx \tilde g(0) \approx 0$ then $\frac{d\tilde g}{dt}(t) \approx -s\,\tilde g(t)^2$, therefore $\tilde g(t) \approx \tilde g(0)/(1 + s\,\tilde g(0)\,t)$.

Numerical Solution Algorithm
Choose $T > 0$, $\Delta t > 0$
Set $\tilde g(0) = g(0)$
Set $t = 0$
While $t < T$
    $a = s(1 - \tilde g(t))\,\tilde g(t)^2/(s\,\tilde g(t)^2 - 1)$, the value of $\frac{d\tilde g}{dt}(t)$
    $\tilde g(t + \Delta t) = \tilde g(t) + \Delta t\, a$
    $t = t + \Delta t$

MATLAB Code for Differential Equation

    function [t, g] = tablepage107_approx(s,g0,T,deltat)
    % function [t, g] = tablepage107_approx(s,g0,T,deltat)
    % Wayne Lawton, 22 August 2007
    % numerical solution of differential equation
    % for gene frequencies
    % Outputs
    %   t = array of times
    %   g = solution array (as a function of t)
    % Inputs
    %   s = selection coefficient
    %   g0 = initial gene frequency
    %   T = approx last time
    %   deltat = time increment
    N = round(T/deltat);
    t = zeros(1,N);   % preallocate the output arrays
    g = zeros(1,N);
    gg = g0;
    for n = 1:N
        t(n) = n*deltat;
        a = s*(1-gg)*gg^2/(s*gg^2-1);   % right-hand side of the differential equation
        gg = gg + deltat*a;             % Euler step
        g(n) = gg;
    end

[Figure: Numerical Solution Comparison, $s = 0.05$, $\Delta t = 0.1$]
[Figure: Comparison Error, $s = 0.05$, $\Delta t = 0.1$]

Exact Solution for t
First rewrite the differential equation in the form
$$\frac{s\,\tilde g^2 - 1}{s(1 - \tilde g)\,\tilde g^2}\, d\tilde g = dt$$
Then use the method of partial fractions
http://www4.ncsu.edu/unity/lockers/users/f/felder/public/kenny/papers/partial.html
$$\int_{\tilde g(0)}^{\tilde g(t)} \left[ \frac{s - 1}{1 - \tilde g} - \frac{1}{\tilde g} - \frac{1}{\tilde g^2} \right] d\tilde g = s \int_0^t dt$$
$$t = \frac{1}{s} \left[ (1 - s) \ln \frac{\tilde g(t) - 1}{\tilde g(0) - 1} + \ln \frac{\tilde g(0)}{\tilde g(t)} + \frac{1}{\tilde g(t)} - \frac{1}{\tilde g(0)} \right]$$

Exact Solution for s
$$s\,t = (1 - s) \ln \frac{\tilde g(t) - 1}{\tilde g(0) - 1} + \ln \frac{\tilde g(0)}{\tilde g(t)} + \frac{1}{\tilde g(t)} - \frac{1}{\tilde g(0)}$$
implies that $s$ can be solved for by
$$s = \frac{\ln \dfrac{\tilde g(t) - 1}{\tilde g(0) - 1} + \ln \dfrac{\tilde g(0)}{\tilde g(t)} + \dfrac{1}{\tilde g(t)} - \dfrac{1}{\tilde g(0)}}{t + \ln \dfrac{\tilde g(t) - 1}{\tilde g(0) - 1}}$$

[Figure: Comparison With Sol. of Diff. Eqn.]
[Figure: 3 Components; Sol. of Diff. Eqn.]
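The exact formulas for t and s can be checked numerically against the recursion used for Table 5.4. The script below is only a minimal sketch: the names texact and srec are hypothetical helpers introduced here for illustration (they are not part of the course code), and the values s = 0.05, g0 = 0.99 and n = 100 are taken from the table as an assumed test case.

```matlab
% Sketch: compare the exact time formula with the generation count of the recursion.
% texact and srec are hypothetical names, not functions from the course code.
s  = 0.05;   % selection coefficient (one of the two values in the table)
g0 = 0.99;   % initial gene frequency (generation 0 in the table)
n  = 100;    % number of generations to iterate

% Recursion g' = g(1 - s g)/(1 - s g^2), the same update as in tablepage107
g = g0;
for k = 1:n
    g = g*(1 - s*g)/(1 - s*g^2);
end

% Time at which the solution of the differential equation reaches frequency gt
texact = @(s, gt, g0) ((1 - s)*log((gt - 1)/(g0 - 1)) + log(g0/gt) + 1/gt - 1/g0)/s;

% Selection coefficient recovered from (t, gt, g0), same expression as sexact
srec = @(t, gt, g0) (log((gt - 1)/(g0 - 1)) + log(g0/gt) + 1/gt - 1/g0) / (t + log((gt - 1)/(g0 - 1)));

t = texact(s, g, g0);
fprintf('g after %d generations = %.4f\n', n, g);
fprintf('exact time for this frequency = %.2f\n', t);
fprintf('s recovered from (t, g, g0) = %.6f\n', srec(t, g, g0));
```

The recovered value of s agrees with the input s up to rounding, because the two formulas are algebraic rearrangements of each other. The gap between the printed exact time and n gives one measure of how far the differential equation approximation drifts from the recursion.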
[Figure: Inverses of the 3 Components]

MATLAB Code for Selection Coefficient

    function s = sexact(t,gt,g0)
    % function s = sexact(t,gt,g0)
    % Wayne Lawton, 24 August 2007
    % exact solution for s
    % Outputs
    %   s = selection coefficient
    % Inputs
    %   t = time of evolution
    %   gt = gene frequency at time t
    %   g0 = gene frequency at time 0
    num = log((gt-1)/(g0-1)) + log(g0/gt) + 1/gt - 1/g0;
    den = t + log((gt-1)/(g0-1));
    s = num/den;

Peppered Moth Estimation, page 110

    >> g0 = 1-1/100000
    g0 =
       0.99999000000000
    >> gt = 1 - 0.8
    gt =
       0.20000000000000
    >> t = 50
    t =
        50
    >> s = sexact(t,gt,g0)
    s =
       0.27572621892750

Question: Why does this differ from the book's estimate s = 0.33?

Peppered Moth Estimation, page 110

    >> gbook = tablepage107(0.33,50,1-1/100000);
    >> gmine = tablepage107(0.2757,50,1-1/100000);
    >> plot(1:50,gbook,1:50,gmine)
    >> grid
    >> plot(1:50,gbook)
    >> plot(1:50,gbook,1:50,gmine)
    >> grid
    >> ylabel('blue=book, green = mine')
    >> xlabel('number of generations')

[Figure: Peppered Moth Simulation]

Assigned Reading
Chapter 25. Evolution: The Process, in Schaum's Outlines in Biology
Chapter 5. The Theory of Natural Selection, in Mark Ridley's Evolution. In particular study:
(i) the peppered moth (Biston betularia) studies of the decrease in the recessive peppered moth allele,
(ii) pesticide resistance,
(iii) equilibrium for recurrent disadvantageous dominant mutation,
(iv) heterozygous advantage and sickle cell (1st study of natural selection in humans),
(v) freq. dependent fitness,
(vi) Wahlund effect,
(vii) effects of migration and gene flow

Homework 2. Due Monday 1.09.2008
Do problems 1-6 on page 136 in Ridley (the mean fitness in questions 2, 3 is defined on p. 105)

Homework 3. Due Monday 8.09.2008
Do problems 7-10 on page 136 in Ridley (the mean fitness in questions 2, 3 is defined on p. 105)

Question 11. Assume that for a two allele locus genotype AA has fitness 1-s, genotype Aa has fitness 1, and genotype aa has fitness 1-t, and that random mating occurs. Let p = baby freq. of gene A and q = baby freq. of gene a. Derive formulas for the next baby freq. p' and q'.

Question 12. Assume that in a two allele locus all genotypes have fitness = 1 but that each genotype mates only with the same genotype. Derive equations for the evolution of gene frequencies.