Appendix 1. ABC validation procedure

To validate the simulation procedure and to decide which of the mean, median, or mode of the posterior distribution is best suited for parameter estimation, we simulated two additional sets of 1,000 datasets with known parameter values, with the time of colonization fixed at either 100 generations (first pseudo-observed dataset) or 800 generations (second pseudo-observed dataset). The Floreana initial and current sizes were fixed at 15 and 1,000, respectively, and the Peru size at 10,000. The migration rate was set at 0.0045 and the mutation rate at 0.0015, with the geometric parameter of the generalized stepwise mutation model (GSM p-parameter) set at 0.15 (Tables S1 and S2).

For the ABC simulations, the six parameters were drawn randomly from uniform prior distributions, and an additional set of one million simulations was generated specifically for the validation procedure. This set of simulations was used with both pseudo-observed datasets described above. The population of Floreana was assumed to have grown exponentially after the colonization of the island, with an initial effective size (N0) in the range [1-20] and a current size (N1) in the range [500-5,000]. The effective size in Peru was set to 10,000 and kept constant over time. The colonization time (t) was assumed to lie between 1 and 60,000 generations, assuming a maximum generation time of 25 years. The migration rate was assumed to lie in the range [0.000001-0.01]. For each of the five loci, a mutation rate µ was randomly drawn from a gamma distribution with a mean randomly drawn between 0.0001 and 0.01 and a shape parameter k set to 10.

The simulated datasets were compared to the pseudo-observed ones, and the 2,000 closest simulations were kept for parameter estimation. The bias, the relative root mean square error (RMSE) and Factor 2 were computed as explained in the Materials and Methods section.
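As an illustration of how such accuracy measures can be computed, the following minimal sketch summarises a set of point estimates against a known parameter value. The exact formulas are given in the main text and are not reproduced in this appendix, so the definitions used here are assumptions: relative bias as the mean of (estimate - true)/true, relative RMSE as the root mean square error divided by the true value, and Factor 2 as the proportion of estimates lying between half and twice the true value. The function names `accuracy_metrics` and `coverage` are hypothetical.

```python
import math


def accuracy_metrics(estimates, true_value):
    """Compare point estimates from many test datasets with a known value.

    Definitions assumed here (not quoted from the paper):
      bias    = mean of (estimate - true) / true
      rmse    = sqrt(mean of (estimate - true)^2) / true
      factor2 = share of estimates within [true / 2, 2 * true]
    """
    if true_value <= 0 or not estimates:
        raise ValueError("need a positive true value and at least one estimate")
    rel_err = [(e - true_value) / true_value for e in estimates]
    n = len(rel_err)
    within = sum(1 for e in estimates if true_value / 2 <= e <= 2 * true_value)
    return {
        "bias": sum(rel_err) / n,
        "rmse": math.sqrt(sum(r * r for r in rel_err) / n),
        "factor2": within / n,
    }


def coverage(intervals, true_value):
    """Share of (low, high) credible intervals that contain the true value."""
    hits = sum(1 for low, high in intervals if low <= true_value <= high)
    return hits / len(intervals)


if __name__ == "__main__":
    # Toy point estimates of a colonization time whose true value is 100.
    print(accuracy_metrics([50, 100, 150, 300], 100))
    # Toy credible intervals for the same parameter.
    print(coverage([(80, 120), (110, 200), (10, 100)], 100))
```

In a validation of this kind, `estimates` would hold the posterior mean, median, or mode obtained for each of the 1,000 test datasets, and `intervals` the corresponding 50% or 90% credible intervals.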
Supplementary Table 1 Accuracy of the estimated parameters assessed using the mean, the median and the mode of the posterior distribution, by simulating 1,000 test datasets with known parameter values, with the time of colonization fixed at 100 generations.

Mean
Parameter               True     Estimated   Bias     RMSE    Factor 2   Coverage 50%   Coverage 90%
Colonization time       100      618.09      5.181    8.756   0.216      0.909          0.999
Floreana initial size   15       9.1749      -0.388   0.403   0.840      0.195          0.829
Floreana current size   1000     2588.6      1.587    1.602   0          0.812          0.919
Migration rate          0.0045   0.0044      -0.024   0.214   0.988      0.996          1
Mutation rate           0.0015   0.0018      0.169    0.298   0.996      0.732          0.97
GSM p-parameter         0.15     0.1489      -0.008   0.038   1          0.978          1

Median
Parameter               True     Estimated   Bias     RMSE    Factor 2   Coverage 50%   Coverage 90%
Colonization time       100      307.14      2.071    3.359   0.383      0.909          0.999
Floreana initial size   15       8.6436      -0.424   0.446   0.723      0.195          0.829
Floreana current size   1000     2499.4      1.499    1.532   0.039      0.812          0.919
Migration rate          0.0045   0.0041      -0.099   0.281   0.940      0.996          1
Mutation rate           0.0015   0.0017      0.129    0.269   0.998      0.732          0.97
GSM p-parameter         0.15     0.1479      -0.014   0.059   1          0.978          1

Mode
Parameter               True     Estimated   Bias     RMSE    Factor 2   Coverage 50%   Coverage 90%
Colonization time       100      122.3       0.223    0.951   0.831      0.909          0.999
Floreana initial size   15       6.3763      -0.575   0.632   0.287      0.195          0.829
Floreana current size   1000     1691.1      0.691    1.430   0.805      0.812          0.919
Migration rate          0.0045   0.0031      -0.311   0.483   0.630      0.996          1
Mutation rate           0.0015   0.0016      0.066    0.237   0.999      0.732          0.970
GSM p-parameter         0.15     0.1403      -0.065   0.506   0.754      0.978          1

The coverage is computed as the proportion of simulations in which the “true value” lies within the respective 50% and 90% credible intervals.

Supplementary Table 2 Accuracy of the estimated parameters assessed using the mean, the median and the mode of the posterior distribution, by simulating 1,000 test datasets with known parameter values, with the time of colonization fixed at 800 generations.
Mean
Parameter               True     Estimated   Bias     RMSE    Factor 2   Coverage 50%   Coverage 90%
Colonization time       800      6180.6      6.726    7.950   0.033      0.927          0.997
Floreana initial size   15       11.239      -0.251   0.256   1          0.906          1
Floreana current size   1000     2413.1      1.413    1.433   0.009      0.919          0.959
Migration rate          0.0045   0.0048      0.077    0.255   0.984      0.965          0.999
Mutation rate           0.0015   0.0018      0.184    0.313   0.998      0.734          0.976
GSM p-parameter         0.15     0.1505      0.004    0.040   1          0.958          1

Median
Parameter               True     Estimated   Bias     RMSE    Factor 2   Coverage 50%   Coverage 90%
Colonization time       800      2375        1.969    2.767   0.358      0.927          0.997
Floreana initial size   15       11.529      -0.231   0.244   0.997      0.906          1
Floreana current size   1000     2225.5      1.226    1.275   0.283      0.919          0.959
Migration rate          0.0045   0.0047      0.038    0.315   0.945      0.965          0.999
Mutation rate           0.0015   0.0017      0.141    0.282   0.998      0.734          0.976
GSM p-parameter         0.15     0.1504      0.003    0.062   1          0.958          1

Mode
Parameter               True     Estimated   Bias     RMSE    Factor 2   Coverage 50%   Coverage 90%
Colonization time       800      718.5       -0.102   0.746   0.557      0.927          0.997
Floreana initial size   15       14.6        -0.025   0.317   0.846      0.906          1
Floreana current size   1000     1283.9      0.284    0.915   0.909      0.919          0.959
Migration rate          0.0045   0.0042      -0.071   0.586   0.628      0.965          0.999
Mutation rate           0.0015   0.0016      0.065    0.250   0.999      0.734          0.976
GSM p-parameter         0.15     0.1535      0.023    0.531   0.786      0.958          1

The coverage is computed as the proportion of simulations in which the “true value” lies within the respective 50% and 90% credible intervals.

Appendix 2. IMa2 simulations

IMa2 (Hey, 2005; 2010) was run using combinations of parameters that differed in the assumed average mutation rate of microsatellites (µ = 0.004, 0.0005 and 0.0001). These mutation rates were drawn from the literature (Udupa & Baum 2001; Thuillet et al. 2002; Vigouroux et al. 2002; O'Connell & Ritland 2004) and from the results obtained from the ABC simulations (mode = 0.0004 to 0.0006).
We assumed that an ancestral population in Peru split at a time t and gave rise to the current populations in Peru and Floreana, with subsequent migration between the two populations. For the three IMa2 runs we performed, the maximum sizes for Peru and the Galapagos were assumed to be, as for the ABC simulations, NP = 100,000 individuals and NF = 5,000 individuals, respectively, which were translated into 4Nµ using the three assumed values of µ (4NPµ = 1,600, 200 and 40, respectively; 4NFµ = 80, 10 and 2, respectively). The geometric mean of the 4NPµ and 4NFµ values, x, was calculated and used to define the priors following the IMa2 guidelines. The maximum population size was set to 5x, which translated into the following priors depending on the assumed mutation rate: q = 1800, 224 and 45, respectively; the maximum time of colonization was set to 2x (t = 720, 89 and 18, respectively), and the maximum migration rate between the two populations, Peru and Floreana, was set to 2/x (m = 0.006, 0.045 and 0.224, respectively).

The Metropolis coupling was implemented using ten independent chains with high heating, using a geometric increment model with a degree of non-linearity of 0.99 and the lower value of the heating term fixed at 0.25. Each chain was initiated with a burn-in period of 100,000 updates, and the total run length was ten million updates, with a thinning interval of 100 updates. The best results were obtained for µ = 0.0005, with all parameters found to be uncorrelated and the trend-line plots showing no obvious trends. IMa2 was then run a second time with 20 chains and a different seed. The results of the two runs were congruent. Mixing was, however, better in the second run, and the parameter estimates of this latter run are presented in Table S3. Parameter conversion followed Hey (2005), using µ = 0.0005 as the mutation rate per locus and per generation.
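The arithmetic behind these prior bounds can be retraced with a short sketch. It assumes only what is stated above: the scaled sizes are 4Nµ, x is their geometric mean, and the upper bounds are 5x for population size, 2x for splitting time and 2/x for migration. The function name `ima2_prior_bounds` and the dictionary keys are hypothetical. The values reported in the text appear to be rounded; for µ = 0.004, for example, the unrounded bounds are about 1789 and 716 rather than 1800 and 720.

```python
import math


def ima2_prior_bounds(n_peru_max, n_floreana_max, mu):
    """Upper prior bounds derived from maximum sizes and a mutation rate.

    Follows the rules stated in the text: x is the geometric mean of the
    two scaled sizes (4 * N * mu), and the bounds are 5x, 2x and 2/x.
    """
    theta_peru = 4 * n_peru_max * mu          # 4NP * mu
    theta_floreana = 4 * n_floreana_max * mu  # 4NF * mu
    x = math.sqrt(theta_peru * theta_floreana)
    return {
        "theta_peru": theta_peru,
        "theta_floreana": theta_floreana,
        "x": x,
        "q_max": 5 * x,  # maximum scaled population size
        "t_max": 2 * x,  # maximum scaled time of colonization
        "m_max": 2 / x,  # maximum scaled migration rate
    }


if __name__ == "__main__":
    # The three mutation rates considered in the text.
    for mu in (0.004, 0.0005, 0.0001):
        bounds = ima2_prior_bounds(100_000, 5_000, mu)
        print(mu, {k: round(v, 4) for k, v in bounds.items()})
```

Keeping the rule in one function makes it easy to check how sensitive the prior bounds are to the assumed mutation rate before launching a run.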
Supplementary Table 3 Estimation of population sizes, colonization time and migration parameters using IMa2, with the mutation rate fixed at µ = 0.0005.

µ = 0.0005            Mean      Mode      HPD95low   HPD95Hi
t (in generations)    250       267       0          1,869
N Peru                9,955     5,095     1,960      20,440
N Floreana            795       728       168        1,960
N Ancestral           15,245    13,945    9,015      21,670
Flo -> Peru (m)       0.00001   0.00000   0.00000    0.00002
Peru -> Flo (m)       0.00001   0.00002   0.00000    0.00002

HPD95low and HPD95Hi correspond to the lower and upper bounds of the estimated 95% highest posterior density intervals, respectively.

Supplementary Figure 1 Posterior density curves for the four parameters used to simulate the colonization of the Galapagos by Geoffroea spinosa with IMa2. The modal value of each estimated parameter is shown within parentheses.

[Figure panels: Colonization time (267); Floreana current size (728); Peru ancestral size (13,945); Peru current size (5,095).]

Literature

Hey J (2005) On the number of New World founders: A population genetic portrait of the peopling of the Americas. PLoS Biology 3(6), e193.

Hey J (2010) Documentation for IMa2. Department of Genetics, Rutgers University, USA. Available at http://genfaculty.rutgers.edu/hey/software.

O'Connell LM, Ritland K (2004) Somatic Mutations at Microsatellite Loci in Western Redcedar (Thuja plicata: Cupressaceae). Journal of Heredity 95, 172-176.

Thuillet A-C, Bru D, David J, et al. (2002) Direct Estimation of Mutation Rate for 10 Microsatellite Loci in Durum Wheat, Triticum turgidum (L.) Thell. ssp durum Desf. Molecular Biology and Evolution 19, 122-125.

Udupa S, Baum M (2001) High mutation rate and mutational bias at (TAA)n microsatellite loci in chickpea (Cicer arietinum L.). Molecular Genetics and Genomics 265, 1097-1103.

Vigouroux Y, Jaqueth JS, Matsuoka Y, et al. (2002) Rate and Pattern of Mutation at Microsatellite Loci in Maize. Molecular Biology and Evolution 19, 1251-1260.