Supplementary Text

Pretreatment of the samples and DNA extraction method

All reagents were prepared in a laboratory dedicated exclusively to ancient DNA work, under strict sterility conditions. Before extraction, the samples were briefly cleaned with a 5% hypochlorite solution and were handled as little as possible. The whole process was conducted inside a laminar flow hood. Each sample was obtained using a separate set of dental instruments, comprising forceps and a high-speed dental diamond bur mounted in a 220 V micromotor. The powder was taken directly from the lesion itself and placed in a 15 ml polypropylene tube, where it was incubated overnight at 37 °C in 5 ml of extraction buffer (250 µl of 1 M Tris-HCl (pH 8.0-8.5), 250 µl of 10% SDS, 250 µl of sterile deionised water and 4.25 ml of 0.5 M EDTA) with 50 µl of 0.01 g/ml proteinase K. After incubation, the DNA was extracted with a standard phenol-chloroform protocol and the aqueous phase was concentrated using a Centricon-30 filter column (Millipore) up to a volume of 30 ml [13]. An extraction blank was included in each extraction to check for contamination.

Sequence used as reference and other modern sequences

The Streptococcus mutans (S. mutans) D49430.1 strain [1] was used as reference. Numbering starts from the first nucleotide of the S. mutans dexA gene for dextranase, complete cds, accession number D49430.1. The other modern sequences used are UA159 [2] and GS5 (USA) [3], NN2025 and LJ23 (Japan) [4,5], ATCC 25175 (England) [6], 5DC8 (England), KK21, KK23 and AC4446 (Germany) and NCTC11060 (Denmark) [7].

Ancient sequences obtained

The full-length ancient caries sequences were all originally obtained in this work.
Some aspects of the ancient human populations from those sites have been described elsewhere: M1 (Catalonia) [8], CR1 (Majorca) [9], V1 (Universitat Autònoma de Barcelona (UAB) archaeological collection) (Catalonia, present study), SP1 and SP2 (Catalonia) [10], T1, T2, LO1 and LO2 (México) [11] and U1 (UAB archaeological collection) (Catalonia, present study).

Determination of human mitochondrial haplogroups

The mitochondrial haplogroups of two individuals, one of European and one of American origin, were determined by combining the information obtained by sequencing the second half of Hypervariable Region I (nucleotide positions 16210 to 16400) and by generating restriction fragment length polymorphisms.

Detailed data

Branch-site test of positive selection

The branch-site test of positive selection [14,15] was applied. This test compares the modified model A with the corresponding null model in which ω is fixed at 1 (fix_omega = 1 and omega = 1). The χ² distribution with 1 degree of freedom, with critical values 3.84 and 5.99, was used to guard against violations of model assumptions, as recommended in [16]. To calculate the p value based on the mixture distribution (a 50:50 mixture of a point mass at 0 and the χ² distribution with 1 degree of freedom), p was first calculated from 2∆l and the value obtained was then divided by 2.

As no orthologous sequences from a closely related species could be found with a BLAST search [17] (see the electronic supplementary material, table S8), the most ancient sample studied, M1, was taken as the background branch when applying the branch-site test of positive selection to the ancient sequence data set only, or to the whole sequence data set. When applying the test to the modern sequence data only, strain NN2025 (Japan) was used as the background branch because it was located in the principal node, and Japan was the only country with representatives in the two principal nodes of the network.

Likelihood-ratio tests using the site models

Three of the tests supported by the PAML package were carried out.
The first test compared the one-ratio model (M0) with the discrete model (M3), which tests whether ω varies among sites. The two likelihood-ratio tests (LRTs) used to check for positive selection were M1a vs M2a and M7 vs M8 [18]. In these tests, a null model whose ω distribution does not allow ω > 1 (M1a and M7) is compared with an alternative model that does (M2a and M8, respectively). These are considered the two best LRTs available so far to test for positive selection [19]. Twice the log-likelihood difference (2∆l) between the values obtained under these models can be compared to a χ² distribution with 4 degrees of freedom (df) for M0 vs M3, and with 2 df for M1a vs M2a and M7 vs M8 [20].

The F3x4 model of codon frequencies (in which the equilibrium codon frequencies are calculated from the average nucleotide frequencies at the three codon positions) was used to accommodate biased codon usage. Under the positive selection models M2a and M8, the most likely site category (with its associated dN/dS ratio) at each codon (amino acid) site was inferred. In all cases, branch lengths were fixed at their maximum-likelihood estimates (MLEs) under M0 (one-ratio) to save computation, as several previous studies have shown that tests of positive selection and the detection of specific sites under its action are insensitive to minor errors in the tree topology or to different estimates of branch lengths [21-23].

Once the LRTs had detected the presence of sites under positive selection, we applied the Bayes Empirical Bayes (BEB) procedure [24] to calculate the posterior probability that each site belonged to the class with ω > 1. BEB appears to avoid the high false-positive rates of the naïve empirical Bayes (NEB) approach in small, non-informative data sets, as it better accommodates uncertainties in the MLEs of the parameters of the ω distribution [15].
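The LRT comparisons described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the procedure run inside PAML: the function names (chi2_sf, lrt, branch_site_p) are hypothetical, closed-form χ² tail probabilities are used so that only 1, 2 and 4 df are handled, and the halving step for the branch-site test follows the description given in this supplement. The log-likelihoods in the usage lines are the M1a vs M2a values listed for the ancient sequences in this supplement.

```python
import math


def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability P(X >= x) of a chi-square variable.

    Closed forms are used, so this sketch only handles df = 1, 2 and 4.
    """
    if x <= 0:
        return 1.0
    if df == 1:
        return math.erfc(math.sqrt(x / 2.0))
    if df == 2:
        return math.exp(-x / 2.0)
    if df == 4:
        return math.exp(-x / 2.0) * (1.0 + x / 2.0)
    raise ValueError("only df = 1, 2 or 4 are handled in this sketch")


def lrt(lnl_null: float, lnl_alt: float, df: int) -> tuple[float, float]:
    """Return (2*delta_l, p) for a likelihood-ratio test of nested models."""
    stat = 2.0 * (lnl_alt - lnl_null)
    return stat, chi2_sf(stat, df)


def branch_site_p(lnl_null: float, lnl_alt: float) -> tuple[float, float]:
    """Branch-site test: chi-square (1 df) p value divided by 2.

    Assumes the null distribution is a 50:50 mixture of a point mass
    at 0 and a chi-square with 1 df, so a statistic <= 0 gives p = 1.
    """
    stat = 2.0 * (lnl_alt - lnl_null)
    if stat <= 0:
        return stat, 1.0
    return stat, chi2_sf(stat, 1) / 2.0


# Site-model comparison with 2 df (M1a as null, M2a as alternative).
stat, p = lrt(lnl_null=-118.530, lnl_alt=-109.230, df=2)
print(f"2*delta_l = {stat:.3f}, p = {p:.2e}")
```

With 4 df the same lrt function covers the M0 vs M3 comparison, while branch_site_p applies to the modified model A against its null with ω fixed at 1.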
Different tests of codon selection (twice the log-likelihood difference, 2∆l, was calculated and compared with the corresponding χ² distribution).

For ancient sequences
M0 vs M3: 2 x (-109.230 - (-117.484)) = 16.632, p < 0.05 for χ² with 4 df
M1a vs M2a: 2 x (-109.230 - (-118.530)) = 18.6, p < 0.01 for χ² with 2 df
M7 vs M8: 2 x (-109.230 - (-118.757)) = 19.054, p < 0.01 for χ² with 2 df

For modern sequences
M0 vs M3: 2 x (-129.751 - (-134.917)) = 10.332, p < 0.05 for χ² with 4 df
M1a vs M2a: 2 x (-129.752 - (-135.555)) = 11.506, p < 0.01 for χ² with 2 df
M7 vs M8: 2 x (-130.253 - (-135.816)) = 10.586, p < 0.01 for χ² with 2 df

For all the sequences
M0 vs M3: 2 x (-153.352 - (-174.916)) = 43.128, p < 0.05 for χ² with 4 df
M1a vs M2a: 2 x (-153.352 - (-172.471)) = 38.238, p < 0.01 for χ² with 2 df
M7 vs M8: 2 x (-154.681 - (-172.478)) = 35.594, p < 0.01 for χ² with 2 df

Codon-by-codon analysis of natural selection

For each codon, the numbers of sites estimated to be synonymous (S) and nonsynonymous (N) were calculated. These estimates were produced using the joint maximum-likelihood reconstructions of ancestral states under a Muse-Gaut model [25] of codon substitution and all the models of nucleotide substitution provided by the HyPhy software package [26]. To estimate MLE values, a tree topology was computed automatically. The test statistic dN - dS was calculated, where dS is the number of synonymous substitutions per site (s/S) and dN is the number of nonsynonymous substitutions per site (n/N). A positive value of the test statistic indicates an over-abundance of nonsynonymous substitutions. The normalized dN - dS statistic was obtained using the total number of substitutions in the tree (measured in expected substitutions per site), so that the two data sets could be compared. Maximum-likelihood computations of dN and dS were performed using the HyPhy software package, as done in [27].
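The per-codon statistic defined above can be written out as a short Python sketch. It is not the HyPhy implementation: the class and function names are hypothetical, the counts in the usage lines are invented for illustration, and dividing dN - dS by the total tree length is an assumption about how the normalization is carried out.

```python
from dataclasses import dataclass


@dataclass
class CodonCounts:
    """Per-codon counts (hypothetical container for this sketch)."""
    s: float  # synonymous substitutions inferred at this codon
    S: float  # estimated number of synonymous sites
    n: float  # nonsynonymous substitutions inferred at this codon
    N: float  # estimated number of nonsynonymous sites


def dn_minus_ds(c: CodonCounts) -> float:
    """Return dN - dS, with dN = n/N and dS = s/S.

    A rate is treated as 0 when its site count is 0 (an assumption
    made here only to avoid division by zero).
    """
    d_s = c.s / c.S if c.S > 0 else 0.0
    d_n = c.n / c.N if c.N > 0 else 0.0
    return d_n - d_s


def normalized_dn_minus_ds(c: CodonCounts, tree_length: float) -> float:
    """Scale dN - dS by the tree length (expected substitutions per site).

    Assumption: normalization is a simple division by tree length.
    """
    if tree_length <= 0:
        raise ValueError("tree_length must be positive")
    return dn_minus_ds(c) / tree_length


# Invented counts: two nonsynonymous changes and no synonymous change.
codon = CodonCounts(s=0.0, S=1.0, n=2.0, N=2.0)
print(dn_minus_ds(codon), normalized_dn_minus_ds(codon, tree_length=0.5))
```

Scaling by tree length puts codons from trees with different amounts of total divergence on a common scale, which is what allows the ancient and modern data sets to be compared.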
BLAST search

Searches were carried out with the BLAST program [17] against the Bacillus/Lactobacillus/Streptococcus group (taxid: 91061): a search for short, nearly exact matches (7 to 20 bp in length) and a search for somewhat similar sequences (blastn) of 20 bp or longer. Only S. mutans dextranase gave significant results (E value lower than 0.1) (http://blast.ncbi.nlm.nih.gov/Blast.cgi, last accessed 15/4/13).

References

1 Igarashi, T., Yamamoto, A., Goto, N. 1995 Sequence analysis of the Streptococcus mutans Ingbritt dexA gene encoding extracellular dextranase. Microbiology and Immunology 39(11), 853-60.
2 Ajdic, D., McShan, W. M., McLaughlin, R. E., Savic, G., Chang, J., Carson, M. B., Primeaux, C., Tian, R., Kenton, S., Jia, H., et al. 2002 Genome sequence of Streptococcus mutans UA159, a cariogenic dental pathogen. Proceedings of the National Academy of Sciences USA 99(22), 14434-14439.
3 Biswas, S., Biswas, I. 2012 Determination of Complete Genome Sequence of the Streptococcus mutans GS-5, a serotype c Strain. Journal of Bacteriology 194(17), 4787-8.
4 Maruyama, F., Kobata, M., Kurokawa, K., Nishida, K., Sakurai, A., Nakano, K., Nomura, R., Kawabata, S., Ooshima, T., Nakai, K., et al. 2009 Comparative genomic analyses of Streptococcus mutans provide insights into chromosomal shuffling and species-specific content. BMC Genomics 10, 358.
5 Aikawa, C., Furukawa, N., Watanabe, T., Minegishi, K., Furukawa, A., Eishi, Y., Oshima, K., Kurokawa, K., Hattori, M., Nakano, K., et al. 2012 Complete Genome Sequence of the Serotype k Streptococcus mutans Strain LJ23. Journal of Bacteriology 194(10), 2754-2755.
6 Kim, Y. M., Shimizu, R., Nakai, H., Mori, H., Okuyama, M., Kang, M. S., Fujimoto, Z., Funane, K., Kim, D., Kimura, A. 2011 Truncation of N- and C-terminal regions of Streptococcus mutans dextranase enhances catalytic activity. Applied Microbiology and Biotechnology 91(2), 329-339.
7 Song, L., Sudhakar, P., Wang, W., Conrads, G., Brock, A., Sun, J., Wagner-Döbler, I., Zeng, A. P. 2012 A genome-wide study of two-component signal transduction systems in eight newly sequenced mutans streptococci strains. BMC Genomics 13, 128.
8 Simón, M., Jordana, X., Armentano, N., Santos, C., Díaz, N., Solórzano, E., López, J. B., González-Ruiz, M., Malgosa, A. 2011 The Presence of Nuclear Families in Prehistoric Collective Burials Revisited. The Bronze Age Burial of Montanissell Cave (Spain) in the Light of aDNA. American Journal of Physical Anthropology 146(3), 406-413.
9 Díaz, N. 2009 Bahía de Alcudia, Mallorca: un crisol genético en el Mediterráneo. PhD Thesis. Barcelona: Universitat Autònoma de Barcelona.
10 Jordana, X. 2007 Caracterització i evolució d'una comunitat medieval catalana. Estudi bioantropològic de les inhumacions de les esglésies de Sant Pere. PhD Thesis. Barcelona: Universitat Autònoma de Barcelona.
11 Solórzano, E. 2006 De la Mesoamérica Prehispánica a la Colonial: La huella del DNA antiguo. PhD Thesis. Barcelona: Universitat Autònoma de Barcelona.
12 van Oven, M., Kayser, M. 2009 Updated comprehensive phylogenetic tree of global human mitochondrial DNA variation. Human Mutation 30(2), E386-E394.
13 Malgosa, A., Montiel, R., Díaz, N., Solórzano, E., Smerling, A., Isidro, A., García, C., Simon, M. 2005 Ancient DNA: a modern look at the infections of the past. In Recent research developments in microbiology (ed. S.G. Pandalai), pp. 213-236. Trivandrum, India: Research Signpost.
14 Yang, Z., Wong, W. S. W., Nielsen, R. 2005 Bayes Empirical Bayes Inference of Amino Acid Sites Under Positive Selection. Molecular Biology and Evolution 22(4), 1107-1118.
15 Zhang, J., Nielsen, R., Yang, Z. 2005 Evaluation of an improved branch-site likelihood method for detecting positive selection at the molecular level. Molecular Biology and Evolution 22, 2472-2479.
16 Yang, Z. 2007 PAML 4: a program package for phylogenetic analysis by maximum likelihood. Molecular Biology and Evolution 24, 1586-1591.
17 Altschul, S., Madden, T., Schäffer, A., Zhang, J., Zhang, Z., Miller, W., Lipman, D. J. 1997 Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Research 25, 3389-3402.
18 Yang, Z. 1998 Likelihood Ratio Tests for Detecting Positive Selection and Application to Primate Lysozyme Evolution. Molecular Biology and Evolution 15(5), 568-573.
19 Anisimova, M., Bielawski, J., Dunn, K., Yang, Z. 2007 Phylogenomic analysis of natural selection pressure in Streptococcus genomes. BMC Evolutionary Biology 7, 154.
20 Yang, Z., Nielsen, R. 2002 Codon-substitution models for detecting molecular adaptation at individual sites along specific lineages. Molecular Biology and Evolution 19(6), 908-17.
21 Suzuki, Y., Gojobori, T. 1999 A Method for Detecting Positive Selection at Single Amino Acid Sites. Molecular Biology and Evolution 16(10), 1315-1328.
22 Yang, Z., Nielsen, R., Goldman, N., Pedersen, A. M. 2000 Codon-substitution models for heterogeneous selection pressure at amino acid sites. Genetics 155(1), 431-49.
23 Swanson, W. J., Yang, Z., Wolfner, M. F., Aquadro, C. F. 2001 Positive Darwinian selection drives the evolution of several female reproductive proteins in mammals. Proceedings of the National Academy of Sciences USA 98, 2509-2514.
24 Deely, J. J., Lindley, D. V. 1981 Bayes Empirical Bayes. Journal of the American Statistical Association 76, 833-841.
25 Muse, S. V., Gaut, B. S. 1994 A likelihood approach for comparing synonymous and nonsynonymous nucleotide substitution rates, with application to the chloroplast genome. Molecular Biology and Evolution 11, 715-724.
26 Pond, S. L. K., Frost, S. D. W., Muse, S. V. 2005 HyPhy: hypothesis testing using phylogenies. Bioinformatics 21, 676-679.
27 Pond, S. L. K., Frost, S. D. W. 2005 Not So Different After All: A Comparison of Methods for Detecting Amino Acid Sites Under Selection. Molecular Biology and Evolution 22, 1208-1222.