Supplementary Material

Sections:
I     Primer Design
II    PCR Protocol
III   Sequence Assembly
IV    Identification of Heterozygous Sites and Sequence Alignment
V     Primate Samples
VI    EvoNC hypothesized positively selected sites
VII   Supplementary Tables
VIII  References

I ─ Primer Design

To design primers, we searched for conserved regions using the human, chimpanzee and rhesus macaque DARC sequences downloaded from GenBank (see Figure 1 below).

II ─ PCR Protocol

DNA was divided into 1 µL aliquots and mixed with 2.5 µL Biotools buffer, 2 µL 10 mM dNTPs, 1.5 µL 20 µM forward and reverse primers, 1 µL of Taq (Biotools) and H2O to 20 µL. PCR reactions were run as follows: 1x (94°C, 4 min); 1x (annealing temperature, listed for each primer pair below); 35x (72°C, 50 s; 94°C, 40 s; annealing temperature, 40 s); 1x (72°C, 10 min); 1x (4°C, hold).

Annealing temperature for each pair of primers

Primer    Annealing Temperature (°C)
DARCp1    59
DARCp2    58
DARCp3    59
DARCp4    57
DARCp5    58
DARCp6    58
DARCp7    61
DARCp8    58

III ─ Sequence Assembly

Sequences were assembled with the software programs phred (Ewing and Green 1998; Ewing et al. 1998) and phrap (Gordon et al. 2001). Reference sequences from NCBI in FASTA format were used to guide the assembly. These sequences were converted into PHD files (the phred file format), and each base was assigned a phred quality score of 18. All Old World monkey samples were assembled using a genomic reference sequence from Macaca mulatta, and New World monkey samples were assembled using a Saimiri boliviensis sequence from GenBank (accession number: AF3119180). Cebus apella and Gorilla gorilla samples were assembled without reference sequences.

IV ─ Identification of Heterozygous Sites and Sequence Alignment

Heterozygous sites were identified from the chromatogram files with PolyPhred 6.11 (Nickerson et al. 1997) and PolyScan 3.0 (Chen et al. 2007). Results were then manually inspected with the Consed program (Gordon et al.
1998), and sites with heterozygous chromatogram profiles were replaced by the corresponding IUPAC ambiguity symbol. Multiple global alignments were performed with ClustalW (Larkin et al. 2007) using default parameters.

V ─ Primate Samples

Species                         Source of DNA
Saimiri sciureus                Department of Genetics and Molecular Biology, Federal University of Pará, Pará State, Brazil
Saimiri ustus                   Department of Genetics and Molecular Biology, Federal University of Pará, Pará State, Brazil
Cebus apella                    Department of Genetics and Molecular Biology, Federal University of Pará, Pará State, Brazil
Gorilla gorilla                 Department of Genetics and Molecular Biology, Federal University of Pará, Pará State, Brazil
Macaca fascicularis             Nonhuman primate tissue samples from Covance, Inc., Princeton, NJ, USA
Macaca mulatta                  Nonhuman primate tissue samples from Covance, Inc., Princeton, NJ, USA
Macaca nemestrina               Coriell Institute, IPBIR, Camden, NJ, USA (individual ID number NG07921)
Macaca thibetana                Coriell Institute, IPBIR, Camden, NJ, USA (individual ID number PR00711)
Macaca nigra                    Coriell Institute, IPBIR, Camden, NJ, USA (individual ID number NG07101)
Mandrillus sphinx               San Diego Zoo/CRES, CA, USA (Drs. Oliver Ryder and Leona Chemnick)
M. leucophaeus                  San Diego Zoo/CRES, CA, USA (Drs. Oliver Ryder and Leona Chemnick)
Lophocebus aterrimus            Tulane Regional Primate Research Center, LA, USA (Dr. Bobby Gormus)
Cercocebus torquatus lunatus    Tulane Regional Primate Research Center, LA, USA (Dr. Bobby Gormus)
C. galeritus chrysogaster       Charles Paddock Zoo, Atascadero, CA (Dr. Cathi Lehn, Texas A&M, TX, USA)
Cercopithecus mitis             Animal Research Center, The University of Texas at Austin, USA (Jim Letchworth)

* See Material and Methods for information about geographic origin of samples.
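The replacement of heterozygous sites by IUPAC ambiguity symbols described in Section IV can be illustrated with a minimal sketch. The function names and the input format (a dictionary of 0-based positions mapped to allele pairs) are hypothetical and are not part of the PolyPhred, PolyScan or Consed workflow used here; only the two-base IUPAC codes themselves are standard.

```python
# Standard IUPAC nucleotide codes for single bases and two-base ambiguities.
IUPAC_CODES = {
    frozenset("A"): "A", frozenset("C"): "C",
    frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R", frozenset("CT"): "Y",
    frozenset("CG"): "S", frozenset("AT"): "W",
    frozenset("GT"): "K", frozenset("AC"): "M",
}


def iupac_symbol(allele_a, allele_b):
    """Return the IUPAC symbol for a diploid genotype (hypothetical helper)."""
    key = frozenset((allele_a.upper(), allele_b.upper()))
    return IUPAC_CODES[key]


def apply_heterozygous_calls(sequence, calls):
    """Replace bases at heterozygous positions with IUPAC ambiguity symbols.

    `calls` maps a 0-based position to an (allele_a, allele_b) pair; this
    input format is an assumption made for illustration only.
    """
    bases = list(sequence)
    for position, (allele_a, allele_b) in calls.items():
        bases[position] = iupac_symbol(allele_a, allele_b)
    return "".join(bases)


if __name__ == "__main__":
    # A C/T heterozygote at position 1 and a G/T heterozygote at position 3.
    print(apply_heterozygous_calls("ACGT", {1: ("C", "T"), 3: ("T", "G")}))
```

A homozygous call such as ("A", "A") maps back to the plain base, so the same helper can be applied to every inspected site.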
VI ─ EvoNC hypothesized positively selected sites and their posterior probability values

Site 5:   0.989037    Site 6:   0.930001    Site 77:  0.920158    Site 132: 0.923882
Site 154: 0.932526    Site 247: 0.989254    Site 263: 0.920457    Site 278: 0.934672
Site 296: 0.926272    Site 302: 0.942589    Site 329: 0.937435    Site 466: 0.920173
Site 486: 0.926413    Site 513: 0.933935    Site 517: 0.914334    Site 577: 0.914334
Site 581: 0.919703    Site 588: 0.929422    Site 605: 0.915384    Site 644: 0.987727
Site 663: 0.925781    Site 674: 0.927571    Site 704: 0.996492    Site 729: 0.929422
Site 762: 0.933935    Site 765: 0.925887    Site 792: 0.925783    Site 843: 0.922005

Figure 1. Scheme showing the strategy used for primer design.

VII ─ Supplementary Tables

Table 1. Likelihood values, parameter estimates and likelihood ratio statistics (2ΔlnL) under branch-sites model A for the hominoid clade of the phylogenetic tree.

Model        ℓ              Estimates of parameters                  2ΔlnL      Positively Selected Sites
A ()         -2393.314268   ω0=0.0000 ω1=1.0000 ω2=1.0000            −
                            p0=0.41717 p1=0.32803 p2a+p2b=0.2548
A ( free)    -2392.498033   ω0=0.0000 ω1=1.0000 ω2=4.90174           0.254058   None with P > 70%
                            p0=0.5709 p1=0.39258 p2a+p2b=0.10025

Note: The parameters ωx (x = 0, 1 and 2) are the dN/dS ratios included in each model.

Table 2. Likelihood values, parameter estimates and likelihood ratio statistics (2ΔlnL) under models of variable ω ratios among sites (M1a, M2a, M0, M3, M7 and M8).

Model            p   ℓ            Distribution                        2ΔlnL      Positively Selected Sites
M1a: neutral     2   -2393.9340   ω0=0.0000 ω1=1.0000                            Not Allowed
                                  p0=0.52041 p1=0.4796
M2a: selection   3   -2392.0471   ω0=0.2315 ω1=1.0000 ω2=2.4734       3.7739     7, 22, 25, 28, 31, 34, 39, 211, 332
                                  p0=0.8484 p1=0.0000 p2=0.1516                  (0.5 < P < 0.87)
M0: one ratio    1   -2403.2928   ω0=0.5261
M3: discrete     5   -2392.0471   ω0=0.2313 ω1=0.2315 ω2=2.4734       22.4915*   7, 22, 25, 28, 39, 332
                                  p0=0.2023 p1=0.6461 p2=0.1516                  (P > 0.95)
M7: beta         2   -2393.9880   p=0.0050 q=0.0050                              Not Allowed
M8: beta&ω       4   -2392.0471   p0=0.8505 p=30.3189 q=99.0000       3.8817     7, 22, 25, 28, 31, 34, 39, 122, 124, 211, 329, 332
                                  p1=0.1495 ω=2.4908                             (0.53 < P < 0.93)
M8a: beta&ω      4   -2394.3872   p0=0.5204 p=0.0050 q=1.3173         4.6801     Not Allowed
                                  p1=0.4796 ω=1

Note: p is the number of parameters of each model. The parameters ωx (x = 0, 1 and 2) are the dN/dS ratios included in each model.
* Significant values (P < 1%; χ2 = 13.48).

Table 3. Likelihood values, parameter estimates and likelihood ratio statistics (2ΔlnL) under models for partitioned data.

Model                   ℓ            r1   r2       r3       r4       κ, ω
A: (≠ rs)               -2396.7621   1    0.5013   0.5264   0.6745   κ=4.6886 ω=0.5166
B: (≠ rs, S)            -2373.4453   1    0.6084   0.5693   0.7165   κ=4.6405 ω=0.5412
C: (≠ rs, , )           -2388.4290   1    0.5084   0.5405   0.7004   κ1=3.1194 κ2=4.3222 κ3=5.0850 κ4=16.2266
                                                                     ω1=1.0329 ω2=0.3152 ω3=0.6967 ω4=0.6633
Cnull: (≠ rs, )         -2389.9189   1    0.5084   0.5405   0.7264   - - ω=1
D: (≠ rs, , , S)        -2364.4738   1    0.6107   0.5717   0.7167   κ1=3.216 κ2=4.3324 κ3=5.2231 κ4=16.0279
                                                                     ω1=1.1190 ω2=0.3432 ω3=0.5138 ω4=0.4950
Dnull: (≠ rs, , , S)    -2366.3254   1    0.6107   0.5717   0.7536   - - ω=1

2ΔlnL: C vs. Cnull = 2.9799; D vs. Dnull = 3.7032

Note: Parameters ωx and κx (x = 1, 2, 3 and 4) are, respectively, the dN/dS ratio and the transition/transversion ratio for each of the four defined regions. Parameters r1, r2, r3 and r4 are the codon frequencies for each region.

VIII ─ References

Chen K, McLellan MD, Ding L, Wendl MC, Kasai Y, Wilson RK, Mardis ER (2007) PolyScan: an automatic indel and SNP detection approach to the analysis of human resequencing data. Genome Res 17(5):659–666.
Ewing B, Hillier L, Wendl MC, Green P (1998) Base-calling of automated sequencer traces using phred. I. Accuracy assessment. Genome Res 8(3):175–185.
Ewing B, Green P (1998) Base-calling of automated sequencer traces using phred. II. Error probabilities. Genome Res 8(3):186–194.
Gordon D, Abajian C, Green P (1998) Consed: a graphical tool for sequence finishing. Genome Res 8(3):195–202.
Gordon D, Desmarais C, Green P (2001) Automated finishing with autofinish. Genome Res 11(4):614–625.
Guindon S, Gascuel O (2003) A simple, fast, and accurate algorithm to estimate large phylogenies by maximum likelihood.
Syst Biol 52(5):696–704.
Larkin MA, Blackshields G, Brown NP, Chenna R, McGettigan PA, McWilliam H, Valentin F, Wallace IM, Wilm A, Lopez R, Thompson JD, Gibson TJ, Higgins DG (2007) Clustal W and Clustal X version 2.0. Bioinformatics 23(21):2947–2948.
Nickerson DA, Tobe VO, Taylor SL (1997) PolyPhred: automating the detection and genotyping of single nucleotide substitutions using fluorescence-based resequencing. Nucleic Acids Res 25(14):2745–2751.
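As a cross-check on the likelihood ratio statistics in Supplementary Tables 2 and 3, the sketch below recomputes 2ΔlnL from the tabulated log-likelihoods (ℓ) and converts it to a chi-square tail probability. The function names are hypothetical, and taking the degrees of freedom as the difference in the parameter counts p listed in Table 2 is an assumption of this sketch, not a statement of how the tests were performed in the study.

```python
import math


def lrt_statistic(lnl_null, lnl_alt):
    """Likelihood ratio statistic 2ΔlnL = 2 * (lnL_alternative - lnL_null)."""
    return 2.0 * (lnl_alt - lnl_null)


def chi2_sf(x, df):
    """Upper-tail chi-square probability for a positive integer df.

    Uses the closed-form finite series that exist for integer degrees of
    freedom, so only the standard library is needed.
    """
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{k=0}^{df/2-1} (x/2)^k / k!
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return math.exp(-half) * total
    # Odd df: erfc(sqrt(x/2)) + sqrt(2/pi) * exp(-x/2) * sum of x^(k-1/2)/(1*3*...*(2k-1))
    term, total = math.sqrt(x), 0.0
    for k in range(1, (df - 1) // 2 + 1):
        if k > 1:
            term *= x / (2 * k - 1)
        total += term
    return math.erfc(math.sqrt(half)) + math.sqrt(2.0 / math.pi) * math.exp(-half) * total


if __name__ == "__main__":
    # (label, lnL of null model, lnL of alternative model, assumed df)
    comparisons = [
        ("M0 vs M3", -2403.2928, -2392.0471, 4),   # df = 5 - 1, from column p
        ("M7 vs M8", -2393.9880, -2392.0471, 2),   # df = 4 - 2, from column p
    ]
    for label, lnl_null, lnl_alt, df in comparisons:
        stat = lrt_statistic(lnl_null, lnl_alt)
        print(f"{label}: 2ΔlnL = {stat:.4f}, P = {chi2_sf(stat, df):.3g} (df = {df})")
```

The recomputed statistics agree with the tabulated values to within rounding of the log-likelihoods, and under the assumed degrees of freedom only the M0 versus M3 comparison falls below the 1% level, consistent with the asterisk in Table 2.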