Supporting Information

Unraveling the Peculiarities of Island Life: Vicariance, Dispersal and the Diversification of the Extinct and Extant Giant Galápagos Tortoises.

Nikos Poulakakis, Michael Russello, Dennis Geist, and Adalgisa Caccone

Material and Methods

Higher-level analyses were based on consensus mtDNA sequences for extinct and extant lineages of Galápagos tortoises from three mitochondrial DNA regions: control region (CR), 16S rDNA (16S) and cytochrome b (cyt b). A dataset of 108 OTUs [105 from all extant and extinct populations throughout Galápagos and three outgroup taxa (G. chilensis, G. carbonaria and G. denticulata)] was used for subsequent phylogenetic analyses (Table S1), based on the most divergent mtDNA region of the giant tortoises, which is the CR (see Caccone et al. 2002). This comprehensive dataset consists of 21 haplotypes recovered from all extinct populations (Floreana, Rabida, Santa Fe, Fernandina, and San Cristóbal) (Poulakakis et al. 2008; Russello et al. 2005; present study) and 84 haplotypes identified from almost 900 individuals sampled from all extant populations throughout Galápagos (Beheregaray et al. 2004; Caccone et al. 2002; Caccone et al. 1999; Russello et al. 2005)¹, excluding the 12 haplotypes (Burns et al. 2003; Russello et al. 2007) that originated from animals in captivity [Charles Darwin Research Station in the Galápagos Islands (CDRS); Caloosahatchee Aviary and Botanical Garden (CABG); mainland Ecuador hotels, universities, zoological and private collections (ECU); Prague Zoo (PRZ); Zurich Zoo (ZUZ)].

¹ In Beheregaray et al. (2004), the authors mistakenly reported 83 different haplotypes, as haplotypes 80 (AF548283) and 83 (AF548286) are identical.

DNA extraction

A) Modern specimens: The 16S and cyt b sequences were available for only 48 of the 84 specimens that produced the 84 CR haplotypes.
For this reason, total genomic DNA was extracted from 36 fresh specimens (Table S1) following a previously described protocol (Caccone et al. 1999) in order to augment the available mtDNA database of giant Galápagos tortoises. DNA was extracted from blood stored in 100 mM Tris/100 mM EDTA/2% SDS buffer using the DNeasy Blood & Tissue Kit (QIAGEN).

B) Museum samples: Bone samples of giant Galápagos tortoises were obtained from the following museums: Museum of Comparative Zoology (MCZ), California Academy of Sciences (CAS), American Museum of Natural History (AMNH), and National Museum of Natural History, Smithsonian Institution (USNM) (Table S1), representing the extinct taxa of Floreana, Rabida, Santa Fe, and Fernandina, and extirpated populations from western San Cristóbal, Pinta and Pinzón. In total we processed 69 specimens: 16 from MCZ, 24 from AMNH, 9 from USNM, and 20 from CAS. Homogenized bone powder was obtained by grinding the samples in a Spex 6750 freezer mill (Spex SamplePrep, Metuchen, NJ, USA). All DNA extractions were carried out in a physically separate laboratory dedicated to the study of ancient DNA at Yale University (USA), with all necessary precautions taken to prevent contamination by extant specimens. Extractions followed the bone extraction method of the DNA IQ System (Promega, USA) (Rohland & Hofreiter 2007), as in Poulakakis et al. (2008).

Mitochondrial DNA amplification and sequencing

A) Modern specimens: A 568 bp fragment of the 16S rRNA gene and a 386 bp fragment of the cyt b gene were amplified using two primer pairs (16Sar-16Sbr for 16S rRNA and GLU-H15149 for cyt b). All CR sequences were already available.

B) Museum samples: Partial sequences of the mitochondrial CR, cyt b, and 16S rRNA were amplified in small overlapping fragments of ~200 bp (including the primers) in length (Table S2). For each DNA extract, PCR amplification was initially performed with the corresponding pair of external primers.
The product of the primary reaction served as the template for a second PCR amplification (nested reaction) using the corresponding pair of internal primers. Mitochondrial fragments were amplified in 20 μL PCR reactions containing 1X High Fidelity PCR Buffer, 3.00 mM MgSO4, 100 μM of each dNTP, 0.30 μM of each PCR primer, 1 mg/ml BSA and 0.4 units of Taq polymerase (Platinum Taq High Fidelity; Invitrogen). Reactions were carried out using an initial step at 94 °C for 5 min, followed by 40 cycles of 94 °C for 45 s, 50 °C (CR), 48 °C (cyt b), or 50 °C (16S) for 60 s, and 68 °C for 90 s, and a final extension at 68 °C for 25 min. At least two sterile negative controls were used for each reaction to detect contamination throughout amplification. Sequences were obtained from at least two amplifications of individual samples. Double-stranded PCR products were sequenced using Big Dye 3.1 terminators on an ABI3730 DNA sequencer (Applied Biosystems, USA).

Mito-phylogenetic analyses (combined dataset)

Of the 69 museum specimens sampled, only 21 yielded DNA of sufficient quality and quantity to produce scorable and repeatable sequence data (Table S1). Sequences from the extinct lineages (Floreana, Santa Fe, Rabida, and Fernandina) were combined with sequences from the present study and previously published sequences of the extant and recently extirpated giant Galápagos tortoise lineages and three outgroup taxa (C. chilensis, C. carbonaria and C. denticulata) for subsequent phylogenetic analyses, producing a dataset of 1,682 bp of the CR, 16S, and cyt b mtDNA regions for 108 extinct and extant tortoises. DNA sequences were aligned using default parameters in ClustalX (Thompson et al. 1997). The best-fit model of DNA substitution and the parameter estimates used for tree construction were chosen according to the Akaike Information Criterion (Akaike 1974) (see Posada & Buckley 2004) as implemented in Modeltest v. 3.7 (Posada & Crandall 1998).
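As a rough illustration of the AIC-based selection just described, the sketch below scores a few candidate substitution models with AIC = 2K - 2lnL and ranks them. The likelihood values are invented for illustration and are not results from this study; the parameter counts assume that branch lengths are not counted; and the names Candidate, aic and rank_by_aic are hypothetical helpers, not part of Modeltest.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """One candidate substitution model (all values here are illustrative)."""
    name: str
    neg_log_likelihood: float  # -lnL, as reported by model-selection software
    n_free_params: int         # K, assumed here to exclude branch lengths


def aic(candidate):
    """AIC = 2K - 2lnL, written in terms of -lnL."""
    return 2.0 * candidate.n_free_params + 2.0 * candidate.neg_log_likelihood


def rank_by_aic(candidates):
    """Return (name, AIC, delta AIC) tuples, lowest (best) AIC first."""
    scored = sorted((aic(c), c.name) for c in candidates)
    best_score = scored[0][0]
    return [(name, score, score - best_score) for score, name in scored]


# Hypothetical -lnL values chosen only to show the calculation.
CANDIDATES = [
    Candidate("JC", 6500.0, 0),
    Candidate("HKY+G", 6250.0, 5),
    Candidate("GTR+I+G", 6220.0, 10),
]

RANKING = rank_by_aic(CANDIDATES)
BEST_MODEL = RANKING[0][0]

if __name__ == "__main__":
    for name, score, delta in RANKING:
        print(f"{name:10s} AIC={score:10.2f} dAIC={delta:8.2f}")
```

The model with the lowest AIC is preferred; the extra parameters of a richer model are only worth it when they buy a sufficiently large gain in likelihood.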
The best-fit model for the concatenated dataset was GTR+I+G (-lnL=6219.8735, I=0.62, α=0.53). The most suitable models for the individual target regions (D-loop, cyt b, and 16S) were TrN+I+G (-lnL=3398.97, I=0.42, α=0.50), GTR+I+G (-lnL=1205.26, I=0.62, α=0.88), and GTR+I (-lnL=1170.02, I=0.77), respectively.

Phylogenetic analysis was conducted using Bayesian Inference (BI) as implemented in MrBayes v3.1.2 (Ronquist & Huelsenbeck 2003), taking advantage of the ability of this software package to handle a wide variety of data types and models, as well as any mix of these models, following the procedure described in the manual. The analysis was run with four chains for 10^7 generations, sampling from the chain every 100 generations. This generated an output of 10^5 trees. To confirm that the chains had achieved stationarity, we evaluated "burn-in" by plotting log-likelihood scores and tree lengths against generation number using the software Tracer (v. 1.5.0) (Rambaut & Drummond 2008). The -lnL stabilized after approximately 10^6 generations, and the first 10^4 trees were discarded as a conservative measure to avoid the possibility of including random, sub-optimal trees. The percentage of samples recovering any particular clade in a BI analysis represents that clade's posterior probability (Huelsenbeck & Ronquist 2001). A majority-rule consensus tree ('Bayesian' tree) was then calculated from the posterior distribution of trees, with posterior probabilities ≥ 95% taken to indicate significant support (Fig. S1).

Of the 108 haplotypes analyzed, nine tortoise haplotypes (non-native individuals; Caccone et al. 2002; Poulakakis et al.
2008) belonging to five populations (Puerto Bravo, PBR; Piedras Blancas, PBL; Santiago, AGO; Roca Union, RU; and Cerro Montura in Santa Cruz, CM) stand out as aberrant: they are clearly different from the majority of haplotypes in the population from which they were sampled, yet they differ by only a few substitutions from haplotypes found in geographically distinct populations, often located on different islands. Additionally, seven museum specimens (R-4477, R4478, R-11064, R-11070, R-213706, R-222475, and R-222478) cluster with lineages other than that of their geographical origin (e.g., the last three specimens, described by museum records as being sampled on Pinta island, actually cluster with the Pinzón lineage). These individuals and the nine modern specimens previously recognized as "non-native" were eliminated from the final dataset, as they are likely not the outcome of natural colonization events but the result of either human-mediated translocations (Caccone et al. 2002; Poulakakis et al. 2008) or possibly errors in the museum records. The exclusion of the non-native and outlier specimens produced a dataset of 92 "pure" haplotypes that was used to repeat the Bayesian analysis (detailed in the main text) as before and for the remaining analyses: 1) the delineation of genetic clusters using a generalized mixed Yule coalescent (GMYC) model, 2) the estimation of sequence divergences [MEGA v.5.04 (Tamura et al. 2011), using the Tamura & Nei (1993) model of evolution] for both the current taxonomic units of the giant Galápagos tortoises (Table S3) and the independently evolving mtDNA entities indicated by SPLITS (Table S4), and 3) the estimation of divergence times.

References

Akaike H (1974) New Look at Statistical-Model Identification. IEEE Transactions on Automatic Control AC-19, 716-723.
Beheregaray LB, Gibbs JP, Havill N, et al.
(2004) Giant tortoises are not so slow: Rapid diversification and biogeographic consensus in the Galápagos. Proceedings of the National Academy of Sciences of the United States of America 101, 6514-6519.
Burns CE, Ciofi C, Beheregaray LB, et al. (2003) The origin of captive Galápagos tortoises based on DNA analysis: implications for the management of natural populations. Animal Conservation 6, 329-337.
Caccone A, Gentile G, Gibbs JP, et al. (2002) Phylogeography and history of giant Galápagos tortoises. Evolution 56, 2052-2066.
Caccone A, Gibbs JP, Ketmaier V, Suatoni E, Powell JR (1999) Origin and evolutionary relationships of giant Galápagos tortoises. Proceedings of the National Academy of Sciences of the United States of America 96, 13223-13228.
Huelsenbeck JP, Ronquist F (2001) MRBAYES: Bayesian inference of phylogenetic trees. Bioinformatics 17, 754-755.
Posada D, Buckley TR (2004) Model selection and model averaging in phylogenetics: Advantages of Akaike information criterion and Bayesian approaches over likelihood ratio tests. Systematic Biology 53, 793-808.
Posada D, Crandall KA (1998) MODELTEST: testing the model of DNA substitution. Bioinformatics 14, 817-818.
Poulakakis N, Glaberman S, Russello M, et al. (2008) Historical DNA analysis reveals living descendants of an extinct species of Galápagos tortoise. Proceedings of the National Academy of Sciences of the United States of America 105, 15464-15469.
Rambaut A, Drummond AJ (2008) MCMC Trace Analysis Tool. Version v1.5.0, 2003-2009. http://beast.bio.ed.ac.uk/Tracer.
Rohland N, Hofreiter M (2007) Comparison and optimization of ancient DNA extraction. Biotechniques 42, 343-352.
Ronquist F, Huelsenbeck JP (2003) MrBayes 3: Bayesian phylogenetic inference under mixed models. Bioinformatics 19, 1572-1574.
Russello MA, Glaberman S, Gibbs JP, et al. (2005) A cryptic taxon of Galápagos tortoise in conservation peril. Biology Letters 1, 287-290.
Russello MA, Hyseni C, Gibbs JP, et al.
(2007) Lineage identification of Galápagos tortoises in captivity worldwide. Animal Conservation 10, 304-311.
Tamura K, Nei M (1993) Estimation of the Number of Nucleotide Substitutions in the Control Region of Mitochondrial-DNA in Humans and Chimpanzees. Molecular Biology and Evolution 10, 512-526.
Tamura K, Peterson D, Peterson N, et al. (2011) MEGA5: Molecular Evolutionary Genetics Analysis using Maximum Likelihood, Evolutionary Distance, and Maximum Parsimony Methods. Molecular Biology and Evolution.
Thompson JD, Gibson TJ, Plewniak F, Jeanmougin F, Higgins DG (1997) The CLUSTALX windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools. Nucleic Acids Research 25, 4876-4882.