Supporting file S1 for: Origin and ecological selection of core and food-specific bacterial communities associated with meat and seafood spoilage (Chaillou et al.)

1 – Objectives
Design 454-sequencing runs and analyze 454 libraries. Determine the degree of chloroplast contamination. Establish quality controls to evaluate the potential for methodological bias in our study design.

2 – Design of multiplex 454 GS FLX Titanium runs
The 27F and 534R primers were each fused to one of ten unique, sample-specific barcodes and appended to the sequences of 454 forward primer A and reverse primer B, respectively. These 10 barcoded primer sets were used for V1-V3 16S amplification of a batch of 10 samples from a given food product at a given time of analysis (e.g., 10 samples of fresh ground beef analyzed at T0); each sequence read could therefore be traced back to its sample of origin. For each batch of 10 pooled samples, two emulsion PCRs (emPCR) were carried out to perform sequencing in both forward (primer A) and reverse (primer B) directions. The resulting 20 emPCRs were pooled together in a ¼-plate run. In total, four full-plate GS-FLX Titanium runs were performed, corresponding to 16 batches of a ¼ run each and thus 160 samples sequenced in both directions.

3 – Figure S1a: Boxplot showing the number of raw reads obtained from the different barcoded amplicon libraries.
Sample codes are as described in Table 1 in the manuscript. Boxes define the interquartile range (Q1-Q3) of the total number of reads (forward A and reverse B sides sequenced together) obtained from each food type (10 pooled samples from each). The blue line depicts the median number of reads per sample for the whole dataset. This value was used to normalize abundance at a fixed library size of 15,000 reads. Sequencing results revealed unusually high variability among T0 fresh smoked salmon samples (indicated with an arrow).
However, this variability did not significantly affect the overall estimation of diversity in these samples. Indeed, the Chao1 estimator revealed that mean diversity was similar in samples above (179 ± 66 predicted OTUs) and below (175 ± 44 predicted OTUs) the median number of reads.

3 – Figure S1b: Amount of chloroplast contamination in T0 samples, expressed as a percentage of relative abundance.
Rapid detection of chloroplasts was performed following quality filtering of reads; reads were directly matched with taxonomic assignments using the SILVA analytical pipeline. The inferred origin of reads from each food product, together with the percentage of the total reads attributable to that origin, is represented in Figure S1b by color: bacteria, no color; chloroplast, green; no match, yellow. This last group presumably includes chimeric reads or reads of non-SSU rRNA genes. Chloroplasts were not detected in TS spoiled samples but were detected in some T0 samples, particularly in poultry sausage samples (arrow). Here, chloroplast contamination correlates with the use of spices in this food type. An average of 58% (min: 38%, max: 75%) of all reads obtained from poultry sausage samples were derived from chloroplasts. It is therefore likely that this very high level of contamination hindered our efforts to capture the complete microbial diversity of poultry sausage T0 samples in comparison with the other food products analyzed. Indeed, T0 poultry sausage samples showed the greatest divergence from mean observed OTU numbers and from Chao1 estimates of predicted OTUs, as described in Figure 1A in the main body of the manuscript.

4 – Analyzing biases in DNA extraction and 16S amplicon PCR
We combined the quality control for these two steps into one set of experiments.
We aimed to demonstrate that, for each of the eight food types, DNA extraction yield and subsequent 16S rRNA gene amplification were not affected by bias introduced by our gradient PCR protocol. We therefore chose two bacteria, each from a different phylum, and artificially co-inoculated them into fresh products with low natural contamination levels (ca. 10² cfu.g⁻¹). Lactobacillus sakei strain 23K (Firmicutes) (Chaillou et al., 2005) and Serratia proteamaculans strain CD249 (Proteobacteria) (Jaffres et al., 2011) were inoculated together in six samples (A to F) of each food type in an inverted concentration gradient ranging from 10³ cfu.g⁻¹ to 10⁸ cfu.g⁻¹ (see Figure S1c). DNA was extracted, and 10 ng were used in temperature-gradient PCR amplifications with 27F and 534R primers under the conditions described in the main body of the manuscript (Materials and Methods section). PCR products were diluted 1000-fold, and 5 µl of this solution was analyzed using quantitative real-time PCR (qPCR). Primer sequences used for the specific quantification of the two species were designed for this study and are described in Table S1e. The protocol for qPCR is described in Chaillou et al., 2013. Results of the quantification, shown in Figure S1d, demonstrated that for all products the concentration gradient of the two species was conserved following DNA extraction and gradient PCR, and that little bias was observed between meat and seafood products. These results confirmed that our protocol did not create significant bias, either in detection or in the relative quantification of bacterial species affiliated with different phyla.

Figure S1c: Design of co-inoculation experiment for six samples (A-F) of each food type. Grey circles represent inoculations of Lactobacillus sakei 23K, and white circles represent inoculations of Serratia proteamaculans CD249.
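
The inverted gradient of Figure S1c can be written out as a short sketch. The text gives only the range (10³ to 10⁸ cfu.g⁻¹) and the six samples A to F, so the ten-fold spacing between levels and the direction assigned to each species are assumptions made here for illustration, and the function and key names are hypothetical.

```python
# Sketch of an inverted co-inoculation gradient over six samples.
# Assumptions (not stated in the text): ten-fold steps between levels, and
# L. sakei ascending from A to F while S. proteamaculans descends.
SAMPLES = ["A", "B", "C", "D", "E", "F"]
LOG10_LEVELS = [3, 4, 5, 6, 7, 8]  # log10 cfu per gram, 10^3 to 10^8


def inverted_gradient(samples=SAMPLES, levels=LOG10_LEVELS):
    """Pair an ascending dose series for one species with a descending one."""
    if len(samples) != len(levels):
        raise ValueError("one dose level is needed per sample")
    design = {}
    for sample, up, down in zip(samples, levels, reversed(levels)):
        design[sample] = {
            "L_sakei": up,            # assumed ascending species
            "S_proteamaculans": down,  # assumed descending species
        }
    return design


if __name__ == "__main__":
    for sample, doses in inverted_gradient().items():
        # Difference of log10 doses = log10 of the inoculated ratio.
        log_ratio = doses["L_sakei"] - doses["S_proteamaculans"]
        print(sample, doses, "log10 ratio:", log_ratio)
```

Under these assumptions the inoculated log10 ratio of the two species changes monotonically across the six samples, which is the pattern the qPCR quantification is expected to preserve if extraction and amplification introduce no phylum-specific bias.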
Figure S1d: qPCR quantification, expressed in threshold cycles (Ct), of Lactobacillus sakei 23K (grey) and Serratia proteamaculans CD249 (white) following DNA extraction and gradient PCR of the 16S region. Results are shown for samples obtained from four meat products (circles) and four seafood products (squares).

Table S1e: Description of primers designed in this study for the 16S V1-V3 region and used for qPCR quantification.

Name                            Sequence 5' -> 3'            Amplicon size (bp)  Efficiency
Lactobacillus sakei 23K                                      208                 0.92
  QEBP-LSA-01F                  AAACCTAACACCGCATGGTGTAG
  QEBP-LSA-01R                  TCAGGTCGGCTATGCATCACGGT
Serratia proteamaculans CD249                                132                 0.88
  QEBP-SER-01F                  CTAGCTGGTCTGAGAGGATGAC
  QEBP-SER-01R                  CCGTCAATGCAATGTGCTATTAACAC

5 – Biases in qualitative taxonomic assignment and relative quantification between 454-Titanium technical replicates
Sequencing artifacts and emulsion PCR (emPCR) may strongly distort bacterial community structures in pyrosequencing datasets (Schloss et al., 2011; Pinto and Raskin, 2012; Bakker et al., 2012). We therefore used several strategies to ensure the quality and reproducibility of our pyrosequencing analysis. First, to avoid data distortion among the ten samples of a given food type at a given time of analysis, these ten samples were sequenced in the same ¼-run lane. Furthermore, we conducted two emPCRs for the PCR pool obtained for each food type at a given time of sampling, one from the forward primer A side and the second from the reverse primer B side; these were considered pyrosequencing technical replicates. To analyze the internal variability between replicates, the data obtained from A- and B-side sequencing were treated separately in our taxonomic assignment pipeline, as described in the main text of the manuscript and in Supporting File S2. Figures S1f and S1g compare the taxonomic assignments obtained from each sequencing replicate (assignments at the genus level using a 98% identity threshold; data not filtered for chimeras).
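
The comparison of A- and B-side replicates described above can be sketched as follows. This is a minimal illustration and not the pipeline used in the study: the read counts are invented, the function names are hypothetical, and the log10(count + 1) transform and the one-log discordance cutoff are assumptions chosen for the example.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


def replicate_agreement(a_counts, b_counts, max_log_gap=1.0):
    """Compare per-taxon read counts from two technical replicates.

    Returns the Pearson r of log10(count + 1) values and the taxa whose
    log10 counts differ by more than max_log_gap (an assumed cutoff).
    """
    taxa = sorted(set(a_counts) | set(b_counts))
    # A taxon missing from one replicate is counted as 0 reads there.
    a_log = [math.log10(a_counts.get(t, 0) + 1) for t in taxa]
    b_log = [math.log10(b_counts.get(t, 0) + 1) for t in taxa]
    r = pearson_r(a_log, b_log)
    discordant = [t for t, x, y in zip(taxa, a_log, b_log)
                  if abs(x - y) > max_log_gap]
    return r, discordant


if __name__ == "__main__":
    # Hypothetical genus-level counts for one sample (not data from the study).
    a_side = {"Carnobacterium": 3100, "Brochothrix": 2200, "Photobacterium": 40}
    b_side = {"Carnobacterium": 2900, "Brochothrix": 2500, "Photobacterium": 25,
              "Vibrio": 600}
    r, flagged = replicate_agreement(a_side, b_side)
    print("Pearson r:", round(r, 3), "discordant taxa:", flagged)
```

Taxa flagged as discordant in such a comparison are candidates for closer inspection, for example as possible chimeras.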
Figure S1f: Quantitative and qualitative comparison of technical replicates from forward A- and reverse B-side sequencing of the entire 80-sample T0 dataset. Each point represents a taxon (genus level; assigned by SILVA) identified in one sample with the corresponding read counts in its sequencing replicate.

Figure S1g: Equivalent to Figure S1f but using the TS dataset.

These results show that, for each taxon identified, the strength of the correlation between the quantitative measurements obtained from A- and B-side sequencing deteriorated when fewer than 50 reads were available (1.5 log10). Nevertheless, the correlation between technical replicates was very strong (Pearson correlation R > 0.90, P < 0.0001). Furthermore, to increase the reliability of the relative quantification analysis, the sum of A- and B-side sequencing was always used for all samples; this strategy helped to reduce quantification bias in less commonly recovered taxa. However, as can be observed in Figures S1f and S1g, in a few cases the quantification obtained from A-side sequencing did not correlate with that obtained from B-side sequencing, and vice versa. Without exception, these reads proved to be chimeras and were removed from the analysis.

An additional technical replication test was carried out between two independent runs. Six samples were randomly chosen, three from T0 (PST0-04, GVT0-02, SFT0-10) and three from TS (PSTA-04, GVTA-02, SFTA-10); they were sequenced from both sides and in two different runs. Figure S1h shows the resulting comparative SILVA analysis, which demonstrates the strong reproducibility of our sequencing analysis between runs.

Figure S1h: Quantitative and qualitative comparison of technical replicates obtained from six samples sequenced in two independent runs. For each run, counts represent the sum of A- and B-side sequences.
Each point represents a taxon (genus level; assigned by SILVA) identified in one sample with the corresponding read counts in its sequencing replicate.

6 – Comparison of 454-pyrosequencing data with temporal temperature gradient gel electrophoresis (TTGE) analysis
To verify our qualitative taxonomic assignments, we carried out TTGE analysis on several sets of samples and compared these results with those obtained using the 454-pyrosequencing protocol. The TS samples of smoked salmon (SS), salmon fillet (SF), and cooked shrimp (CS) were chosen for this analysis (10 samples of each food product) because TTGE studies have already been published for these products (Mace et al., 2013; Broekaert et al., 2011) and a straightforward TTGE protocol was available. Primers V3P2 and V3P3-GC-Clamp were used to amplify the V3 region of the 16S rRNA gene (194 bp) as described in Jaffres et al., 2009. The size of the PCR products was determined in 1% (w/v) agarose gel (Invitrogen) using an exACTGene 100 bp PCR DNA ladder (Fisher Scientific, Illkirch, France). The PCR products obtained from the V3 16S rDNA fragment amplification were subjected to TTGE gel analysis. Migration was performed at 50 V for 12.5 h using a temperature gradient of 65-70°C (rate of 0.4°C.h⁻¹) for bacteria of low GC content and at 120 V for 6 h at a constant temperature of 70°C for bacteria of high GC content. Two TTGE ladders for each migration condition were prepared by pooling the PCR products amplified from DNA extracted from pure strain cultures. Standardization, analysis, and comparison of TTGE fingerprints were carried out using BioNumerics software, version 6.0 (Applied Maths NV, Sint-Martens-Latem, Belgium) as described in Mace et al., 2012. Fingerprint bands were assigned to a given species by comparing the band migration position with that of the reference strain profiles included in the database. Results are summarized in Figure S1i below.
Figure S1i: Sequence-read counts obtained using 454-pyrosequencing from the top 13 most abundant OTUs in seafood TS samples compared with TTGE bands identified from the same samples (in green). OTUs in red are not present in the TTGE reference database.

Smoked salmon (SS) TS samples
OTU                                           01    02    03    04    05    06    07    08    09    10
Ebp0189 Carnobacterium divergens            5975     2    57  6856     2   124     5   119  4404 12316
Ebp0162 Brochothrix thermosphacta group        6     0     4  4408    84   345   572  7462  6121   655
Ebp1101 Photobacterium phosphoreum group      22     0    57  2185  4499  3368   771  2586     0     7
Ebp1679 Staphylococcus equorum              3489  7643    45     0     0    81    24    30   104     9
Ebp1098 Photobacterium kishitanii              5     0     8   315  4451  3460   206   347     0     0
Ebp0191 Carnobacterium maltaromaticum         52     8  2364   578    99   444  1246   510   477   404
Ebp0795 Lactococcus lactis                     0     6  5584     1     0     0     0     0     0     0
Ebp0786 Lactobacillus curvatus              2713  2120     0     1     0    35     0     0     0     1
Ebp0738 Lactobacillus fuchuensis              11     5     0     1     4  4404    35     0     0     0
Ebp0569 Serratia proteamaculans group          7     3  1217   252     1    30  1458     0     4    52
Ebp1824 Uncultured Photobacterium              4     0     6   232   741   369    62   205     0     0
Ebp0794 Lactococcus piscium                    3     0     0     2     0     9     1  1355     0    35
Ebp0574 Enterococcus faecalis                  0     1  1077     0     0     0     0     0     1     0

Salmon fillet (SF) TS samples
OTU                                           01    02    03    04    05    06    07    08    09    10
Ebp0191 Carnobacterium maltaromaticum       2954  2984  4910  4592  1709  1719    99   663  3348  6010
Ebp0769 Lactobacillus sakei                  219   318   653  1175  5145   154  4791   407   210   246
Ebp1101 Photobacterium phosphoreum group     562   671   133   908    73  3623   131  4054  1436   932
Ebp0189 Carnobacterium divergens            2095   127    79  2323    89    30     2   241   880   611
Ebp0794 Lactococcus piscium                 2425    12   588   568  2213   453     8    35    15  1174
Ebp0786 Lactobacillus curvatus               104   953   644    37  1419   674   195   104    36    92
Ebp0162 Brochothrix thermosphacta group      183   127   685   374   104   502    15   135     2   251
Ebp1613 Vagococcus fluvialis                 246   411   446   110   130   424     6    30    50   364
Ebp1824 Uncultured Photobacterium             68    92    14   159    14   693    19   541   199   156
Ebp1098 Photobacterium kishitanii            104   154    23   131     8   610    22   438   189   169
Ebp0738 Lactobacillus fuchuensis             451   549   109    87   143   113   139    61    26     8
Ebp1807 Uncultured Vibrionaceae               42    37     9    94     7   349    11   334   119    88
Ebp0679 Flavobacterium succinicans            34   150    17    10     6   334    19   105     5     4

Cooked shrimp (CS) TS samples
OTU                                           01    02    03    04    05    06    07    08    09    10
Ebp0824 Leuconostoc gasicomitatum           6679   586  1916     4    10  4649     0     0     0     0
Ebp1603 Streptococcus parauberis             734  3196    29    39   444   337  7903     0     0     2
Ebp1630 Vibrio ordalii                        21  5523  5028     0     0   117  1015     0     0     0
Ebp0189 Carnobacterium divergens               0     0     3  6321   680     0     0    50     9    13
Ebp1635 Weissella viridescens                  0     0     1     0     0     0     0   515  5900     1
Ebp0191 Carnobacterium maltaromaticum          0     0  1658   914   985   118     0    50  1018     0
Ebp0195 Carnobacterium inhibens                0     1     1    39   146     3     0  1483     2  2650
Ebp0024 Aerococcus viridans group             29    51    32     0     0   157     0    14   492  3323
Ebp0823 Leuconostoc mesenteroides group        1     0     0    10     2     5     0  4009     0     1
Ebp0186 Carnobacterium funditum                0     0     1   911  2505     0     0    20     0    30
Ebp0194 Carnobacterium mobile                  0     0     0    21    50     0     0   109     4  3159
Ebp1656 Trichococcus pasteurii group           0     0     0   150   995     0     0     0     0     0
Ebp0192 Carnobacterium jeotgali                0     0     0   185   563     0     0     0     0     2

These results indicated that successful identification with TTGE correlated with the abundance of OTUs revealed by 16S pyrosequencing. However, some of the most abundant OTUs were not detected by TTGE, either because the species had not previously been reported from these types of samples and were therefore not included among the TTGE reference strains, or because TTGE bands of closely related species (e.g., P. phosphoreum cluster Ebp1101, P. kishitanii Ebp1098, uncultured Photobacterium Ebp1824) might not have been distinguished using TTGE. Furthermore, the TTGE protocol used proved inadequate for the detection of certain OTUs, such as the Ebp0162 Brochothrix thermosphacta cluster and Ebp0769 Lactobacillus sakei. Nevertheless, and despite the lower number of species identified with TTGE, the two techniques were correlated in their species-level taxonomic assignments.

7 – References for Supporting File S1
Bakker MG, Tu ZJ, Bradeen JM, Kinkel LL (2012).
Implications of pyrosequencing error correction for biological data interpretation. PLoS One 7: e44357.
Broekaert K, Heyndrickx M, Herman L, Devlieghere F, Vlaemynck G (2011). Seafood quality analysis: Molecular identification of dominant microbiota after ice storage on several general growth media. Food Microbiol 28: 1162-9.
Chaillou S, Champomier-Verges MC, Cornet M, Crutz-Le Coq AM, Dudez AM, Martin V et al. (2005). The complete genome sequence of the meat-borne lactic acid bacterium Lactobacillus sakei 23K. Nat Biotechnol 23: 1527-33.
Chaillou S, Christieans S, Rivollier M, Lucquin I, Champomier-Verges MC, Zagorec M (2014). Quantification and efficiency of Lactobacillus sakei strain mixtures used as protective cultures in ground beef. Meat Sci 94: 332-338.
Jaffres E, Sohier D, Leroi F, Pilet MF, Prevost H, Joffraud JJ et al. (2009). Study of the bacterial ecosystem in tropical cooked and peeled shrimps using a polyphasic approach. Int J Food Microbiol 131: 20-9.
Jaffres E, Lalanne V, Mace S, Cornet J, Cardinal M, Serot T et al. (2011). Sensory characteristics of spoilage and volatile compounds associated with bacteria isolated from cooked and peeled tropical shrimps using SPME-GC-MS analysis. Int J Food Microbiol 147: 195-202.
Mace S, Cornet J, Chevalier F, Cardinal M, Pilet MF, Dousset X et al. (2012). Characterisation of the spoilage microbiota in raw salmon (Salmo salar) steaks stored under vacuum or modified atmosphere packaging combining conventional methods and PCR-TTGE. Food Microbiol 30: 164-72.
Mace S, Joffraud JJ, Cardinal M, Malcheva M, Cornet J, Lalanne V et al. (2013). Evaluation of the spoilage potential of bacteria isolated from spoiled raw salmon (Salmo salar) fillets stored under modified atmosphere packaging. Int J Food Microbiol 160: 227-38.
Pinto AJ, Raskin L (2012). PCR biases distort bacterial and archaeal community structure in pyrosequencing datasets. PLoS One 7: e43093.
Quast C, Pruesse E, Yilmaz P, Gerken J, Schweer T, Yarza P et al. (2012). The SILVA ribosomal RNA gene database project: improved data processing and web-based tools. Nucleic Acids Res 41: D590-6.
Schloss PD, Gevers D, Westcott SL (2011). Reducing the effects of PCR amplification and sequencing artifacts on 16S rRNA-based studies. PLoS One 6: e27310.