Supplementary online material

TEXT

Experimental procedures

Sampling procedures. In October 2006 and in June 2008, we conducted two sampling campaigns in Trois-Sauts, an isolated village in French Guiana. The characteristics of the village, its inhabitants and the human sampling are described elsewhere (Ruimy et al., 2010; Woerther et al., 2010). Briefly, the village is located in pristine Amazonian forest in the south of French Guiana, near the source of the Oyapock river, in a protected area where access is administratively controlled (Fig. S1). The village is divided into 4 hamlets, Zidock, Roger, Pina and Yawapa, from which the human samples came (Fig. S1). The villagers are native Wayampi Amerindians. Their way of life is comparable to that of traditional societies: they share large huts with no access to hygienic facilities and eat local food (traditional agriculture, fishing and hunting). However, they receive education and medical care from metropolitan French teachers and nurses living in the village. Human-associated animals were sampled in Zidock, Roger and Pina (Fig. S1). They were mostly dogs and chickens living free in the hamlets, and hence not as close to the villagers as pets are to their owners in industrialised countries. Wild animals were sampled by trapping along two transects (one connecting Zidock and Roger, the other starting from Zidock and ending 20 km away to the north-west in the Amazonian forest) and were then either released alive in the forest or killed and kept as representative specimens of rare species. Wild animals hunted by the villagers for food were also sampled. In this case, they were brought immediately to the village and sampled as quickly as possible. The main characteristics of the individuals, including Linnaean name, order, class, body mass, diet and sampling location [obtained by the Global Positioning System (GPS)], were collected (Table S1).
The study was approved by the ad hoc Guadeloupe Ethics Committee (Comité de protection des personnes de Guadeloupe, France; no. 06-05).

We sampled 393 individuals comprising 162 adult Wayampi Amerindians, 33 human-associated animals living in the village and 198 wild animals. Among the 162 human samples, 94 (58%), 37 (23%), 15 (9%) and 16 (10%) were collected in the hamlets Zidock, Roger, Pina and Yawapa, respectively (Fig. S1). Among the 33 human-associated animal samples, 22 (67%) came from Zidock, 2 (6%) from Roger and 9 (27%) from Pina (Fig. S1). Among the 198 wild animals sampled during the two collection campaigns (122 in 2006 and 76 in 2008), 117 (59%) were collected in or close to the hamlets Zidock, Roger, Pina and Yawapa [69 (57%) in 2006 and 48 (63%) in 2008] and 81 (41%) were collected outside the hamlets along the two transects or captured by the villagers around the village [53 (43%) in 2006 and 28 (37%) in 2008] (Fig. S1).

Strain isolation. Fresh faecal samples and rectal swabs (from humans and animals, respectively) were inoculated extemporaneously onto Drigalski agar slants in screw-cap tubes, stored at room temperature and sent to metropolitan France two weeks later. There, the whole culture from each tube was suspended in 1.5 mL of brain-heart infusion broth with 10% glycerol and stored at −80°C. Aliquots (100 µL) of each stored broth were cultured on chromogenic plates (Uriselect®; BioRad). Pink colonies were tested for indole production and identified as E. coli if positive, according to the manufacturer’s recommendations. The first strain identified as E. coli was chosen to be representative of each sample and considered as randomly selected.

Antibiotic resistance pattern. The antimicrobial susceptibility of the strains to 32 antibiotics (see list in Table S3) was determined using the disk-diffusion method, as described elsewhere (http://www.sfm.asso.fr).
For each strain, the resistance phenotype was classified into 2 categories: S (susceptible to all antibiotics tested) or R (resistant to at least one antibiotic). An antibiotic resistance score was also determined as the number of resistances identified divided by the total number of resistances tested, multiplied by 100.

The presence of TEM-type penicillinases and CMY-type β-lactamases was detected by PCR as described previously (Branger et al., 2005; Courpon-Claudinon et al., 2011).

Strain genotyping. Strains were first assigned to one of the seven phylogenetic groups A₀, A₁, B1, B2₂, B2₃, D₁ and D₂ using the triplex PCR method (Clermont et al., 2000; Escobar-Paramo et al., 2004). MLST was then performed using the Pasteur Institute scheme (http://www.pasteur.fr/recherche/genopole/PF8/mlst/EColi.html) (Jaureguy et al., 2008) on 96 B2₂, B2₃, D₁ and D₂ strains, allowing the assignment of these strains to the B2, D, E and F phylogroups (Tenaillon et al., 2010). C group strains (Moissenet et al., 2010) were identified among the A₁ strains using an allele-specific C group PCR targeting the trpA gene, developed in this study. The PCR steps were as follows: denaturation for 4 min at 94°C, 30 cycles of 5 s at 94°C and 10 s at 59°C, and a final extension step of 5 min at 72°C, using the primers trpAgpC1 (5’-AGTTTTATGCCCAGTGCGAG-3’) and trpAgpC2 (5’-TCTGCGCCGGTCACGCCC-3’) (product size: 219 bp). An allele-specific E group PCR was also developed to verify that no strain typed as A₀, A₁ or B1 by the triplex PCR method belonged to this group. The chosen target was the arpA gene, amplified with the primers ArpAgpE.f (5’-GATTCCATCTTGTCAAAATATGCC-3’) and ArpAgpE.r (5’-GAAAAGAAAAAGAATTCCCAAGAG-3’), which yield a 301 bp fragment. The PCR conditions were as above except that the annealing temperature was 57°C. Among the 176 A₀, A₁ and B1 strains, none belonged to the E group.
Conversely, all strains classified as E by MLST were positive with our E group PCR assay. A maximum-likelihood phylogenetic tree was reconstructed with the PHYML program (Guindon et al., 2005) using the concatenated MLST sequences from the 96 E. coli strains included in this study, 35 strains of the ECOR collection (including 15, 6, 3, 6, 2, 2 and 1 strains belonging to the B2, D, E/UG, F, A, B1 and C phylogenetic groups, respectively) (Ochman and Selander, 1984; Lescat et al., 2009), 13 representative B2 phylogenetic subgroup strains previously analysed by Le Gall et al. (2007), 5 representative D and F group strains (Touchon et al., 2011) and the reference strains ED1a, E2348/69, 536, TN03, 042 and EDL933. The tree was rooted on E. fergusonii ATCC 35469T. Detection of clade strains was performed by PCR as described by Clermont et al. (2011a).

Extraintestinal virulence genes (hly, cnf, aer, papC, iroN, traT, fyuA, sfa and kpsE) (Bingen-Bidois et al., 2002; Johnson et al., 2006) and intraintestinal virulence genes (afaD, stx1, stx2, ipaH, eae, bfpA, ST and LT coding genes, aaiC) (Clermont et al., 2011b) were detected by PCR. Bacteriocins, including colicins and microcins, were detected as follows. Colicins were first detected by a slightly modified phenotypic method (Schamberger and Diez-Gonzalez, 2005). First, a suspension of E. coli K-12 in phosphate-buffered saline (at 0.5 McFarland), used as the sensitive strain, was plated on Luria Bertani (LB) agar medium containing mitomycin (0.25 mg/L). Then, 10 μl of an overnight (O/N) culture of each strain in LB medium was spotted on the mitomycin-LB agar plate. After O/N incubation at 37°C, the presence of colicin (or phage) was inferred for strains surrounded by a halo indicating growth inhibition of the E. coli K-12 strain.
Strains positive in the phenotypic test were then tested by PCR for the presence of the most frequent colicin genes (colIa/Ib, colE1, colB) (Gordon and O'Brien, 2006). The main microcin genes (micH47 and micV), whose production depends on the iron concentration in the medium and is known to be associated with the iroN gene (Waters and Crosa, 1991), were searched for among the iroN-positive strains (Gordon and O'Brien, 2006). For each strain, the genotypes for extraintestinal virulence, intraintestinal virulence and bacteriocins were separately classified into 2 categories: absence, or presence of at least one character. A score for each type of character was then determined as the number of characters identified (extraintestinal virulence factors, intraintestinal virulence factors or bacteriocins) divided by the total number of characters of that type tested, multiplied by 100.

Septicemia mouse model. A mouse model of systemic infection (Picard et al., 1999) was used to assess the intrinsic virulence of 8 B2 strains chosen to be representative of the B2 phylogenetic subgroups, according to the tree reconstructed from the MLST data obtained in this study. For each strain, ten outbred female OF1 mice (6 weeks old, 14-16 g) were challenged subcutaneously in the neck with a standardized bacterial inoculum (0.2 ml of Ringer solution containing 10^9 cfu/ml of log-phase bacteria). Mortality was assessed over 7 days post-challenge. Each experimental series included a positive control (urosepsis strain CFT073) and a negative control (commensal-derived strain K-12 MG1655). In this model, lethality is a rather clear-cut parameter and strains are usually classified either as non-killers (strains killing none or one mouse out of 10) or killers (strains killing 9 or 10 mice out of 10) (Johnson et al., 2006). Strains that did not fall into either of these two categories were considered intermediate killers.
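The percentage scores and the lethality classes described in this section can be sketched in a few lines of Python. This is only an illustrative sketch and not the analysis code used in the study; the names `percent_score`, `classify_lethality` and `N_MICE` are hypothetical, and the thresholds are simply those stated in the text (0 or 1 death out of 10: non-killer; 9 or 10 deaths: killer; anything else: intermediate killer).

```python
N_MICE = 10  # mice challenged per strain, as stated in the text


def percent_score(n_present: int, n_tested: int) -> float:
    """Return the characters identified as a percentage of those tested."""
    if n_tested <= 0:
        raise ValueError("n_tested must be positive")
    if not 0 <= n_present <= n_tested:
        raise ValueError("n_present must lie between 0 and n_tested")
    return 100.0 * n_present / n_tested


def classify_lethality(n_dead: int) -> str:
    """Classify a strain from the number of mice killed out of N_MICE."""
    if not 0 <= n_dead <= N_MICE:
        raise ValueError("n_dead must lie between 0 and N_MICE")
    if n_dead <= 1:
        return "non-killer"
    if n_dead >= 9:
        return "killer"
    return "intermediate killer"
```

For instance, a strain killing 5 mice out of 10 would fall into the intermediate class, and a strain carrying 3 of the 9 extraintestinal virulence genes tested would receive a score of `percent_score(3, 9)`, that is about 33.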
Animal experiments were performed under authorization no. 6665 from the Ministère de l’Agriculture, France.

Factorial analysis of correspondence (FAC). A FAC was used to describe associations among the data (Greenacre, 1992). The FAC was conducted with SPAD.N software (Cisia, Saint Mandé, France) from a two-way table. This table had 272 rows, one for each E. coli strain studied, and 6 columns corresponding to the 6 variables: the origin of the strain [human strains (HS), human-associated animal strains (HAAS) and wild animal strains (WAS)], the phylogenetic group (A, B1, B2, C, D, E and F) according to the phylogrouping method and the MLST data, the presence of extraintestinal virulence determinants, the presence of intraintestinal virulence determinants, the presence of bacteriocins, and the presence of resistance to antibiotics. For each column, the character of each strain was coded as a binary code: present = 1, absent = 0.

Comparison of nucleotide diversity of the MLST data. The nucleotide diversity per site (Pi) of the strains obtained from Trois-Sauts village and belonging to the B2, D, E and F phylogenetic groups (96 strains) was compared with the nucleotide diversity per site of strains belonging to the same phylogroups but obtained from (i) the ECOR collection, representative of the genetic diversity of the E. coli species (Ochman and Selander, 1984) (31 strains), (ii) the Broad collection (http://www.broadinstitute.org/annotation/genome/escherichia_antibiotic_resistance/MultiHome.html), whose strains were chosen for their origin from various species and have been completely sequenced (40 strains) and (iii) a personal collection of 137 strains (ROAR collection) composed of 38, 36 and 63 strains from humans, domestic animals and wild animals, respectively, sampled in metropolitan France in the 2000s (Skurnik, Clermont, Brisse and Denamur, personal data). Nucleotide diversities per site of the 4 collections were estimated using DnaSP (Rozas, 2009) and then compared with a t-test.

References

Bingen-Bidois, M., Clermont, O., Bonacorsi, S., Terki, M., Brahimi, N., Loukil, C. et al. (2002) Phylogenetic analysis and prevalence of urosepsis strains of Escherichia coli bearing pathogenicity island-like domains. Infect Immun 70: 3216-3226.
Branger, C., Zamfir, O., Geoffroy, S., Laurans, G., Arlet, G., Thien, H.V. et al. (2005) Genetic background of Escherichia coli and extended-spectrum beta-lactamase type. Emerg Infect Dis 11: 54-61.
Clermont, O., Bonacorsi, S., and Bingen, E. (2000) Rapid and simple determination of the Escherichia coli phylogenetic group. Appl Environ Microbiol 66: 4555-4558.
Clermont, O., Gordon, D.M., Brisse, S., Walk, S.T., and Denamur, E. (2011a) Characterization of the cryptic Escherichia lineages: rapid identification and prevalence. Environ Microbiol 13: 2468-2477.
Clermont, O., Olier, M., Hoede, C., Diancourt, L., Brisse, S., Keroudean, M. et al. (2011b) Animal and human pathogenic Escherichia coli strains share common genetic backgrounds. Infect Genet Evol 11: 654-662.
Courpon-Claudinon, A., Lefort, A., Panhard, X., Clermont, O., Dornic, Q., Fantin, B. et al. (2011) Bacteraemia caused by third-generation cephalosporin-resistant Escherichia coli in France: prevalence, molecular epidemiology and clinical features. Clin Microbiol Infect.
Escobar-Paramo, P., Grenet, K., Le Menac'h, A., Rode, L., Salgado, E., Amorin, C. et al. (2004) Large-scale population structure of human commensal Escherichia coli isolates. Appl Environ Microbiol 70: 5698-5700.
Gordon, D.M., and O'Brien, C.L. (2006) Bacteriocin diversity and the frequency of multiple bacteriocin production in Escherichia coli. Microbiology 152: 3239-3244.
Greenacre, M. (1992) Correspondence analysis in medical research. Stat Methods Med Res 1: 97-117.
Guindon, S., Lethiec, F., Duroux, P., and Gascuel, O. (2005) PHYML Online--a web server for fast maximum likelihood-based phylogenetic inference. Nucleic Acids Res 33: W557-559.
Jaureguy, F., Landreau, L., Passet, V., Diancourt, L., Frapy, E., Guigon, G. et al. (2008) Phylogenetic and genomic diversity of human bacteremic Escherichia coli strains. BMC Genomics 9: 560.
Johnson, J.R., Clermont, O., Menard, M., Kuskowski, M.A., Picard, B., and Denamur, E. (2006) Experimental mouse lethality of Escherichia coli isolates, in relation to accessory traits, phylogenetic group, and ecological source. J Infect Dis 194: 1141-1150.
Le Gall, T., Clermont, O., Gouriou, S., Picard, B., Nassif, X., Denamur, E., and Tenaillon, O. (2007) Extraintestinal virulence is a coincidental by-product of commensalism in B2 phylogenetic group Escherichia coli strains. Mol Biol Evol 24: 2373-2384.
Lescat, M., Hoede, C., Clermont, O., Garry, L., Darlu, P., Tuffery, P. et al. (2009) aes, the gene encoding the esterase B in Escherichia coli, is a powerful phylogenetic marker of the species. BMC Microbiology 9: 723.
Moissenet, D., Salauze, B., Clermont, O., Bingen, E., Arlet, G., Denamur, E. et al. (2010) Meningitis caused by Escherichia coli producing TEM-52 extended-spectrum beta-lactamase within an extensive outbreak in a neonatal ward: epidemiological investigation and characterization of the strain. J Clin Microbiol 48: 2459-2463.
Ochman, H., and Selander, R.K. (1984) Standard reference strains of Escherichia coli from natural populations. J Bacteriol 157: 690-693.
Picard, B., Garcia, J.S., Gouriou, S., Duriez, P., Brahimi, N., Bingen, E. et al. (1999) The link between phylogeny and virulence in Escherichia coli extraintestinal infection. Infect Immun 67: 546-553.
Rozas, J. (2009) DNA sequence polymorphism analysis using DnaSP. Methods Mol Biol 537: 337-350.
Ruimy, R., Angebault, C., Djossou, F., Dupont, C., Epelboin, L., Jarraud, S. et al. (2010) Are host genetics the predominant determinant of persistent nasal Staphylococcus aureus carriage in humans? J Infect Dis 202: 924-934.
Schamberger, G.P., and Diez-Gonzalez, F. (2005) Assessment of resistance to colicinogenic Escherichia coli by E. coli O157:H7 strains. J Appl Microbiol 98: 245-252.
Tenaillon, O., Skurnik, D., Picard, B., and Denamur, E. (2010) The population genetics of commensal Escherichia coli. Nat Rev Microbiol 8: 207-217.
Touchon, M., Charpentier, S., Clermont, O., Rocha, E.P., Denamur, E., and Branger, C. (2011) CRISPR distribution within the Escherichia coli species is not suggestive of immunity-associated diversifying selection. J Bacteriol 193: 2460-2467.
Waters, V.L., and Crosa, J.H. (1991) Colicin V virulence plasmids. Microbiol Rev 55: 437-450.
Woerther, P.L., Angebault, C., Lescat, M., Ruppe, E., Skurnik, D., Mniai, A.E. et al. (2010) Emergence and dissemination of extended-spectrum beta-lactamase-producing Escherichia coli in the community: lessons from the study of a remote and controlled population. J Infect Dis 202: 515-523.

FIGURE LEGENDS

Figure S1: Location of the study site and sample collection points in Trois-Sauts, French Guiana. The study site was located in a protected area where human access is restricted. The locations of human, human-associated animal and wild animal samples are indicated by white, grey and black circles, respectively, according to the Global Positioning System (GPS) coordinates collected for each sample. Circle sizes are proportional to the numbers of individuals sampled.

TABLE TITLES

Table S1. Main characteristics of the individuals sampled during the 1st collection campaign in October 2006 [humans (H), human-associated animals (HAA) and wild animals (S1-WA)] and the 2nd collection campaign in June 2008 [wild animals (S2-WA)], as well as the presence of E. coli in the faeces.
Table S2. Presence of extraintestinal virulence factors, intraintestinal virulence factors and bacteriocins, and corresponding scores, in E. coli strains from humans, human-associated animals and wild animals from both collection campaigns (2006 and 2008).
Table S3. Presence of resistance to antibiotics and resistance score of E. coli strains from humans, human-associated animals and wild animals from both collection campaigns (2006 and 2008).
Table S4. Character mapping of the 96 E. coli strains belonging to the B2, D, E and F phylogroups, ordered as in Fig. 1.