Comparing demographic models using ABC
Roger Butlin, University of Sheffield

Maroja et al. 2009 Evolution
• 10 loci: do they all behave in the same way?
• Accessory gland proteins with other evidence of selection

We need a flexible method to fit complex demographic (and adaptive?) models with a variety of marker types…
Ideally, we would model drift and selection together, rather than fitting one first.
Approximate Bayesian Computation (ABC) may be the answer!

ABC outline
Model: N1, N2, Na, m12, m21, Ts (6 parameters; Hey & Nielsen, 2004)
• Prior (e.g. on Ts)
• Coalescent simulations
• Molecular data (sequences, microsatellites)
• Statistics of population genetics (differentiation and polymorphism)
• Rejection step (keep only good simulations)
• Regression between statistics and parameter values from retained simulations
• Posterior (e.g. on Ts)
• Inferences
Beaumont 2010 Annu. Rev. Ecol. Evol. Syst. 41: 379-406

ABC model comparison
• Molecular data (sequences, microsatellites)
• Model 1 simulations, Model 2 simulations
• Statistics of population genetics (differentiation and polymorphism)
• Rejection - Regression
• Posterior probability of model 1, posterior probability of model 2

ABC tools
• DIYABC: http://www1.montpellier.inra.fr/CBGP/diyabc/
• PopABC: http://code.google.com/p/popabc/
• ABC toolbox: http://www.cmpg.iee.unibe.ch/content/softwares__services/computer_programs/abctoolbox/index_eng.html
• Tools in R: http://cran.r-project.org/web/packages/abc/index.html

Practical steps
1. Choose models
2. Choose summary statistics (and whether to transform)
3. Define priors
4. Choose simulator
5. Choose standard, MCMC or Population MC
6. Choose rejection and regression parameters
7. Choose model comparison method
8. Validate
9. Interpret results!

Duvaux et al. 2011 Molecular Ecology
A successful example!
[Figure: model timeline from past to present, showing divergence time and migration period]

• 10 individuals sampled from each subspecies across their full respective distribution areas + 2 outgroups
• 61 autosomal loci (Sanger sequencing of around 48 kb of aligned sequences)
• 15 summary statistics: mean and sd of π, Fst, Da, Dxy, counts of fixed, shared polymorphic and f-s substitutions

[Map labels: Mus m. domesticus; Mus m. musculus; Mus spretus; MOL (Japan); Mus famulus (India); European hybrid zone center]

Model comparison
Posterior probabilities (6M simulations for each model)

Model                        Posterior probability
Isolation model              0.000
Isolation-with-migration     0.295
Sympatric differentiation    0.008
Secondary contact            0.697

The secondary contact scenario is the most probable.

Parameters of interest (domesticus, musculus)
1. Duration of isolation period: 74% of the divergence time in isolation (allopatry)
2. Secondary contact: Tm ≈ 0.22 Mya (0.048-1.452 Mya). Secondary contact is older than the establishment of the European hybrid zone (2,000 ya).
3. Migration rate asymmetry: 2Nm_mus = 0.105 & 2Nm_dom = 0.050. Migration is twice as strong toward M. m. musculus.

Allowing two classes of loci (low and high migration rate) improves the fit…

Littorina saxatilis ecotypes
[Figure: ecotypes from UK, Spain and Sweden; small thin-shelled vs bigger thick-shelled]

3 Nations project – Sampling design
Sampling sites: Tjärnö, Lysekil (Sweden); Dunbar, Thornwick (UK); Burela, Silleiro (Spain)
• 2 sampling sites per country
• 2 ecotypes per sampling site
• 16 individuals per ecotype
• 4 genes sequenced per individual:
  • 3 nDNA genes
  • 1 mtDNA region
• and 462 AFLP loci

What was the demographic setting for ecotype formation?
Did it occur in parallel in each locality?

Models of divergence for L. saxatilis ecotypes
Old divergence vs parallel divergence
• ‘Old divergence model’ (tip order W1 W2 C1 C2): scenario for ancestral divergence of ecotypes within one country. The deepest split is between ecotypes.
• ‘Old divergence model’ with allopatric phase (tip order W1 W2 C1 C2): scenario for ancestral divergence of ecotypes within one country, with a period of allopatry.
• ‘Parallel model’ (tip order W1 C1 W2 C2): scenario for parallel divergence of ecotypes within a country. The deepest split is between sampling sites.

Parameterize the Models
• Model + parameters
• Prior distribution of parameters
[Figure: ‘Parallel model’ tree (W1 C1 W2 C2) with parameters NL, NE, TLG, TEC, MLG, MEC]

AFLP data / Sequence data
1. Old divergence (W1 W2 C1 C2)
2. Parallel divergence (W1 C1 W2 C2)
3. Old divergence with allopatry (W1 W2 C1 C2)

Model choice
[Figure: AFLP; marginal density of Model1, Model2 and Model3 for Gbr, Spa and Swe]
[Figure: Sequence, all 4 loci; marginal density of Model1, Model2 and Model3 for Gbr, Spa and Swe]

Spain model 2 AFLP
[Figure: ‘Parallel model’ parameter distributions (NL, NE, TLG, TEC, MLG, MEC)]
Spain model 2 sequence (minus ThioPer)
Black = prior; Blue = post-rejection; Red = posterior

Sympatric speciation!! It’s TRUE!
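
Worked sketch: rejection ABC with model comparison
The rejection and model comparison steps from the ABC outline can be illustrated with a toy example. This is only a sketch under stated assumptions, not the pipeline used in the studies above: the two "models" are simple Gaussian generators standing in for coalescent simulators, the summary statistics are a sample mean and standard deviation, the regression adjustment is omitted, and each model's posterior probability is approximated by its share of the retained simulations. All function names and settings (simulate_model1, simulate_model2, abc_model_choice, the priors, the 1% tolerance) are hypothetical.

```python
import math
import random
import statistics


def summarise(sample):
    """Summary statistics of one data set: (mean, standard deviation)."""
    return (statistics.fmean(sample), statistics.stdev(sample))


def simulate_model1(rng, n):
    """Toy model 1: unknown mean mu ~ Uniform(-5, 5), fixed sd of 1."""
    mu = rng.uniform(-5.0, 5.0)
    return mu, [rng.gauss(mu, 1.0) for _ in range(n)]


def simulate_model2(rng, n):
    """Toy model 2: fixed mean of 0, unknown sd sigma ~ Uniform(0.5, 5)."""
    sigma = rng.uniform(0.5, 5.0)
    return sigma, [rng.gauss(0.0, sigma) for _ in range(n)]


def abc_model_choice(observed_stats, n_sims=2000, n_obs=50, keep=0.01, seed=1):
    """Rejection ABC over two models with equal prior weight.

    Returns the share of retained simulations per model (a simple
    estimate of posterior model probability) and the retained
    simulations as (distance, model label, parameter value) tuples.
    """
    rng = random.Random(seed)
    sims = []
    for label, simulator in ((1, simulate_model1), (2, simulate_model2)):
        for _ in range(n_sims):
            theta, sample = simulator(rng, n_obs)
            # Unscaled Euclidean distance between simulated and observed
            # statistics; real analyses usually standardise them first.
            dist = math.dist(summarise(sample), observed_stats)
            sims.append((dist, label, theta))
    # Rejection step: keep only the closest fraction of all simulations.
    sims.sort(key=lambda s: s[0])
    retained = sims[: max(1, int(keep * len(sims)))]
    probs = {
        m: sum(1 for _, lab, _ in retained if lab == m) / len(retained)
        for m in (1, 2)
    }
    return probs, retained


if __name__ == "__main__":
    # Hypothetical observed statistics: mean 2.0, sd 1.0.
    probs, retained = abc_model_choice(observed_stats=(2.0, 1.0))
    print("Posterior model probabilities:", probs)
    mu_kept = [theta for _, label, theta in retained if label == 1]
    if mu_kept:
        print("Posterior mean of mu under model 1:", statistics.fmean(mu_kept))
```

The parameter values of the retained simulations form the post-rejection sample; in a full analysis a regression between statistics and parameter values would then adjust them to give the posterior.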