emi13006-sup-0001-si

Supplementary Notes
1 Microbiota composition analysis
1.1 Oligonucleotide primers and probes used for qPCR
The primers and probes were previously described (Furet et al., 2009). They were designed based on
16S rRNA sequences (RDP II database) aligned with the program ClustalW (Thompson et al., 1994).
TaqMan® qPCR was used to quantify the total bacterial population and the main components of the
microbiota (i.e. >1% of the fecal bacterial population): Clostridium coccoides group, C. leptum group,
Bacteroides/Prevotella group and Bifidobacterium genus. SYBR-Green® qPCR was used for Escherichia
coli. The TaqMan® probes were synthesized by Applied-Biosystems Applera-France. Primers were
purchased from MWG (MWG-Biotech AG, Ebersberg, Germany).
1.2 Real-time qPCR.
Real-time qPCR was performed using an ABI 7000 Sequence Detection System with software version
1.2.3 (Applied-Biosystems, Foster City, CA, USA). Amplification and detection were carried out in
96-well plates with TaqMan® Universal PCR 2× Master Mix (Applied-Biosystems) or with SYBR-Green®
PCR 2× Master Mix (Applied-Biosystems). Each reaction was run in duplicate in a final volume of 25
µL with 0.2 mM final concentration of each primer, 0.25 mM final concentration of each probe and
10 µL of appropriate dilutions of DNA samples. Amplifications were carried out using the following
ramping profile: 1 cycle at 95°C for 10 min, followed by 40 cycles of 95°C for 30 s, 60°C for 1 min. For
SYBR-Green® amplifications, a melting-curve step was added to verify amplification specificity. Total
numbers of bacteria were inferred from averaged standard curves as previously described (Lyons et
al., 2000).
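As a minimal sketch of how a standard curve turns Ct values into bacterial counts, the Python snippet below fits Ct against log10 copy number by least squares and inverts the fit for a sample run in duplicate. The dilution series, the Ct values and the function names are hypothetical illustrations, not data or code from this study.

```python
from statistics import mean


def fit_standard_curve(log10_copies, ct_values):
    """Least-squares fit of Ct = slope * log10(copies) + intercept."""
    x_bar, y_bar = mean(log10_copies), mean(ct_values)
    sxx = sum((x - x_bar) ** 2 for x in log10_copies)
    sxy = sum((x - x_bar) * (y - y_bar)
              for x, y in zip(log10_copies, ct_values))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def log10_copies_from_ct(ct, slope, intercept):
    """Invert the standard curve to estimate log10 copy number."""
    return (ct - intercept) / slope


# Hypothetical ten-fold dilution series (illustrative values only).
standard_log10 = [3.0, 4.0, 5.0, 6.0, 7.0]
standard_ct = [31.0, 27.7, 24.4, 21.1, 17.8]

slope, intercept = fit_standard_curve(standard_log10, standard_ct)

# Each reaction is run in duplicate, so the two Ct values are averaged first.
sample_ct = mean([22.7, 22.8])
estimate = log10_copies_from_ct(sample_ct, slope, intercept)

# Amplification efficiency implied by the slope (1.0 means 100%).
efficiency = 10 ** (-1 / slope) - 1
print(f"slope={slope:.2f}, log10 copies={estimate:.2f}, efficiency={efficiency:.2f}")
```

A slope close to -3.3 corresponds to an efficiency near 100%, which is a common sanity check before a standard curve is used for quantification.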
1.3 Pyrosequencing analysis
Seventy-six bacterial DNA samples from 19 individuals at 4 time points (no. 2, 3, 4 and 5; see Figure 1 in the main
manuscript) were used to construct DNA libraries. The following universal 16S rRNA gene primers
were used for the PCR reaction: V3F (TACGGRAGGCAGCAG, E. coli positions 343-357) (Wilson et al.,
1990) and V4R (GGACTACCAGGGTATCTAAT, E. coli positions 787-806) (Lane, 1991), which target the V3-V4
region; this region gives the best confidence for taxonomic assignment (Wang et al., 2007). Barcode sequences
(the GsFLX key TCAG and a 12-nucleotide GsFLX MID) were attached between the 454 GsFLX adaptor
sequence and the forward primer V3F. The GsFLX key and the 454 GsFLX adaptor were attached to
the reverse primer. The concentration and quality of the PCR products were assessed with PicoGreen
in order to pool equal amounts of each sample, and the 16S rRNA gene amplicons were then
sequenced on a Roche GS FLX Ti 454 sequencer (Genoscreen, Lille, France) and processed with
the manufacturer's standard protocol.
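The read layout described above (key, 12-nucleotide MID, then the degenerate forward primer) can be illustrated with a short Python sketch. It assumes that a read starts directly with the TCAG key and that no mismatches are tolerated; the MID value and the function names are hypothetical, and this is not the demultiplexing code used by the sequencing provider.

```python
import re

# IUPAC nucleotide codes expanded to regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}

KEY = "TCAG"             # GsFLX key, from the text
MID_LENGTH = 12          # MID length, from the text
V3F = "TACGGRAGGCAGCAG"  # forward primer, from the text (R = A or G)


def primer_regex(primer):
    """Translate a degenerate primer into a regular expression."""
    return "".join(IUPAC[base] for base in primer)


READ_PATTERN = re.compile(
    f"^{KEY}([ACGT]{{{MID_LENGTH}}}){primer_regex(V3F)}(.*)$"
)


def split_read(read):
    """Return (MID, insert) if the read has the expected prefix, else None."""
    match = READ_PATTERN.match(read)
    if match is None:
        return None
    return match.group(1), match.group(2)


# Hypothetical MID and insert, for illustration only.
mid = "ACGAGTGCGTAC"
read = KEY + mid + "TACGGGAGGCAGCAG" + "TGGGGAATATTG"
print(split_read(read))
```

Real pipelines usually tolerate a small number of barcode or primer mismatches; exact matching is used here only to keep the sketch short.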
1.4 Quality checking and OTU processing
Raw reads were quality trimmed according to published recommendations (Huse et al., 2007).
The subsequent steps were performed using the LotuS pipeline (Hildebrand et al., 2014). Quality filtering
was done using the SDM software, whose options take into account the average read quality, the accumulated
error over the sequence and the quality within a sliding window. Reads were further filtered for
minimal and maximal length, ambiguous nucleotides, barcode and primer errors, and
homopolymeric nucleotide runs. The LotuS pipeline lets users choose high- and mid-quality criteria.
In our study, we used the default parameters provided by LotuS for the 454 sequencing platform:
only high-quality sequences were used to build OTUs, and both high- and mid-quality sequences were then
mapped to count the occurrence of the established OTUs in each sample (Hildebrand et al., 2014). OTU
clustering, also included in LotuS, was done with UPARSE, which embeds chimera filtering using
UCHIME (Edgar, 2013).
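The filtering criteria listed above can be sketched as a single predicate in Python. This is an illustration of the kinds of checks involved, not the SDM implementation: all thresholds are hypothetical placeholders rather than the LotuS defaults, and a read that fails the sliding-window check is simply rejected here, whereas a real tool may trim it instead.

```python
import re


def passes_filter(seq, quals, *, min_length=200, max_length=1000,
                  min_mean_quality=25.0, max_accumulated_error=2.0,
                  window=50, min_window_quality=20.0, max_homopolymer=8):
    """Return True if a read passes every (hypothetical) quality criterion.

    seq   -- nucleotide string
    quals -- list of Phred quality scores, one per base
    """
    # Length limits.
    if not (min_length <= len(seq) <= max_length):
        return False
    # Ambiguous nucleotides.
    if any(base not in "ACGT" for base in seq):
        return False
    # Average read quality.
    if sum(quals) / len(quals) < min_mean_quality:
        return False
    # Accumulated expected error: sum of per-base error probabilities.
    if sum(10 ** (-q / 10) for q in quals) > max_accumulated_error:
        return False
    # Mean quality within every sliding window.
    for start in range(len(quals) - window + 1):
        if sum(quals[start:start + window]) / window < min_window_quality:
            return False
    # Homopolymer runs longer than max_homopolymer.
    if re.search(r"(.)\1{%d,}" % max_homopolymer, seq):
        return False
    return True


# Toy 40-nt read with relaxed length and window settings.
toy_options = dict(min_length=20, window=10)
toy_read = "ACGT" * 10
print(passes_filter(toy_read, [30] * 40, **toy_options))
```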
1.5 Additional notes on statistical analysis
We used several multi-table approaches in this study in order to decipher how different datasets
(microbiota composition, SCFA profile, metatranscriptomics) were linked and how this link was
impacted by the nutritional intervention (10 g vs 40 g fiber per day). Below, we introduce
between/within class analysis, co-inertia analysis and partial triadic analysis.
Between and within class analysis
The microbiota composition dataset (log10 normalized) was subjected to three different tests at
each taxonomic level, from phylum to species: between-diet analysis, between-subject analysis and
between-diet analysis within subject. Between class analysis (BCA) is a particular case of principal
component analysis (PCA) with an instrumental variable; in our study, the instrumental variables were
qualitative factors (Chessel et al., 2004).
Figure 1: Between and within class analysis framework used in this study
On the left side of Figure 1, the dataset was split into eight groups (or classes) of diet change, 4 for the 10-40 run and 4 for the 40-10 run (i.e. the result of the interaction of diet run and time point). In the middle, the
dataset was split into 19 subjects.
Between class analysis enables us i) to find the principal components based on the center of
gravity of each group (i.e. diet change or subject), which highlights differences between groups, and ii)
to link each sample with its group. In a PCA, the sum of the eigenvalues corresponds to
the total inertia of the dataset. The eigenvalues obtained from the BCA therefore
tell us what proportion of the total inertia is explained by the qualitative factor (diet change or
subject). Thus, BCA helps to decipher how diet and subject specificity explain the variations in
the microbiota dataset.
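To make the "proportion of inertia explained by a factor" concrete, the following Python sketch computes the ratio of between-class inertia to total inertia for a small table. It assumes uniform sample weights and column centering only (no scaling), which may differ from the options used in the actual analysis; the data and function names are hypothetical.

```python
from collections import defaultdict


def column_means(rows):
    """Mean of each column of a list-of-lists table."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def total_inertia(rows):
    """Mean squared distance of the samples to the global centre of gravity."""
    centre = column_means(rows)
    return sum(
        sum((v - c) ** 2 for v, c in zip(row, centre)) for row in rows
    ) / len(rows)


def between_class_inertia(rows, labels):
    """Weighted squared distance of each class centre to the global centre."""
    centre = column_means(rows)
    groups = defaultdict(list)
    for row, label in zip(rows, labels):
        groups[label].append(row)
    return sum(
        len(members) / len(rows)
        * sum((m - c) ** 2 for m, c in zip(column_means(members), centre))
        for members in groups.values()
    )


def explained_ratio(rows, labels):
    """Share of the total inertia explained by the qualitative factor."""
    return between_class_inertia(rows, labels) / total_inertia(rows)


# Hypothetical log10 abundances of two taxa in four samples from two classes.
rows = [[1.0, 2.0], [1.2, 2.2], [3.0, 0.5], [3.2, 0.7]]
labels = ["A", "A", "B", "B"]
ratio = explained_ratio(rows, labels)
print(f"{ratio:.3f}")
```

The ratio lies between 0 (class centres coincide with the global centre) and 1 (all samples of a class are identical).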
Within class analysis (WCA), performed downstream of the PCA, removes the inter-group variability
by mean-centering the data within each group (here, by subject). Using WCA, the group effect (e.g.
subject specificity) is thus removed. On the right side of Figure 1, WCA is followed by a BCA on diet, which gives
the proportion of inertia explained by diet once subject specificity has been removed.
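The mean-centering step at the heart of WCA can be sketched in a few lines of Python. The example assumes one table row per sample and one label per row; the values and names are hypothetical and serve only to show that a large subject offset disappears after centering.

```python
from collections import defaultdict


def centre_within_classes(rows, labels):
    """Subtract from each row the column means of its own class."""
    groups = defaultdict(list)
    for row, label in zip(rows, labels):
        groups[label].append(row)
    class_means = {
        label: [sum(col) / len(members) for col in zip(*members)]
        for label, members in groups.items()
    }
    return [
        [v - m for v, m in zip(row, class_means[label])]
        for row, label in zip(rows, labels)
    ]


# Two hypothetical subjects, each sampled under two diets; subject S2 has a
# much higher baseline than S1, but both respond to diet in the same way.
rows = [[1.0], [2.0], [11.0], [12.0]]
subjects = ["S1", "S1", "S2", "S2"]
centred = centre_within_classes(rows, subjects)
print(centred)
```

After centering, the two subjects are superimposed, so a subsequent between-class analysis on diet only sees the within-subject response.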
A Monte Carlo (MC) test is used to check whether the observed inertia explained by the factor is actually
higher than expected at random. Simulated inertia values are computed from permuted datasets: here,
subjects and diet changes were randomly reassigned upstream of the PCA. The observed inertia
was considered statistically significant when it was higher than the 95th percentile of the simulated
inertia.
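The permutation logic of this test can be sketched in Python as follows. The sketch compares between-class inertia directly, since the total inertia does not change when labels are permuted. The number of permutations, the seed, the data and the function names are assumptions made for illustration.

```python
import random
from collections import defaultdict


def between_inertia(rows, labels):
    """Between-class inertia with uniform sample weights."""
    n = len(rows)
    centre = [sum(col) / n for col in zip(*rows)]
    groups = defaultdict(list)
    for row, label in zip(rows, labels):
        groups[label].append(row)
    inertia = 0.0
    for members in groups.values():
        mean = [sum(col) / len(members) for col in zip(*members)]
        inertia += len(members) / n * sum(
            (m - c) ** 2 for m, c in zip(mean, centre)
        )
    return inertia


def monte_carlo_test(rows, labels, n_permutations=999, seed=0):
    """Compare the observed inertia with inertia under shuffled labels."""
    rng = random.Random(seed)
    observed = between_inertia(rows, labels)
    shuffled = list(labels)
    simulated = []
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        simulated.append(between_inertia(rows, shuffled))
    simulated.sort()
    # Approximate 95th percentile of the simulated values.
    threshold = simulated[int(0.95 * n_permutations)]
    return observed, threshold, observed > threshold


# Hypothetical data: three well-separated groups of four samples each.
rows = [[5.0 * g + 0.1 * i, -5.0 * g + 0.2 * i]
        for g in range(3) for i in range(4)]
labels = [g for g in range(3) for _ in range(4)]
observed, threshold, significant = monte_carlo_test(rows, labels)
print(significant)
```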
Co-inertia analysis
Co-inertia analysis (COIA) is an ordination method for coupling two (or more) sets of parameters (e.g.
SCFA profile and microbiota composition) by looking at their linear combinations. Thus, co-inertia
analysis enables the simultaneous ordination of several tables. COIA is related to other multivariate
analyses such as canonical correlation analysis. In COIA, the co-inertia (the sum of squared
covariances) between the two sets is maximized and decomposed. Hence, the co-inertia value is a
global measure of the co-structure between the two datasets. Co-inertia is high when the two sets
vary together and low when they vary independently (Dray et al., 2003).
Depending on the dataset, COIA is coupled with PCA or correspondence analysis. In this study, two
independent PCAs were computed on microbiota composition and SCFA profile and then subjected to
a COIA. The overall relatedness of the two datasets was measured by the RV coefficient (Dray et al.,
2003), a correlation coefficient between two tables (in this case microbiota
composition and SCFA profile) that ranges from 0 to 1. A Monte Carlo test was used to test the significance of the RV coefficient.
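As a hedged illustration of the RV coefficient, the Python sketch below computes it for two small tables that share the same samples (rows). It assumes column-centred, unscaled tables with uniform row weights; the example values and function names are hypothetical. The numerator is the total co-inertia (sum of squared cross-products between the columns of the two tables), and for two single-column tables the RV coefficient reduces to the squared Pearson correlation.

```python
from math import sqrt


def centre_columns(table):
    """Subtract the column mean from every value."""
    means = [sum(col) / len(table) for col in zip(*table)]
    return [[v - m for v, m in zip(row, means)] for row in table]


def cross_product(a, b):
    """Matrix of cross-products between the columns of a and the columns of b."""
    return [
        [sum(x * y for x, y in zip(col_a, col_b)) for col_b in zip(*b)]
        for col_a in zip(*a)
    ]


def squared_norm(matrix):
    """Sum of squared entries (squared Frobenius norm)."""
    return sum(v * v for row in matrix for v in row)


def rv_coefficient(x, y):
    """RV coefficient between two tables with the same rows."""
    x, y = centre_columns(x), centre_columns(y)
    co_inertia = squared_norm(cross_product(x, y))
    return co_inertia / sqrt(
        squared_norm(cross_product(x, x)) * squared_norm(cross_product(y, y))
    )


# Hypothetical tables: 4 samples x 2 genera and 4 samples x 2 SCFAs.
microbiota = [[1.0, 2.0], [2.0, 1.0], [3.0, 5.0], [4.0, 3.0]]
scfa = [[0.5, 1.0], [0.7, 0.9], [1.5, 2.0], [1.4, 1.8]]
rv = rv_coefficient(microbiota, scfa)
print(f"{rv:.3f}")
```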
Partial triadic analysis
Partial triadic analysis (PTA) is a multivariate method which aims to decipher how inertia is explained
in a series of matrices connected through a gradient. In this study, the matrices are microbiota
compositions, with individuals as rows and genus relative abundances as columns, connected
through the sampling points of the nutritional intervention. PTA allows the simultaneous ordination of
the matrices (n=4 in our study) and finds a structure common to all of them. This common
structure is a compromise of the inertia coming from the PCA of each matrix. Hence, the
contribution of each matrix to the compromise can be evaluated. Overall, this analysis
lets us assess the stability of the relationship between variables (e.g. genus abundances) and
individuals through time. In this study, each time point contributed equally to the compromise, which
means that the link between individuals and genera was maintained throughout the study.
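The idea of a compromise with per-table contributions can be sketched in Python under simplifying assumptions: tables are column-centred, the similarity between two tables is their normalised entrywise inner product, the weights come from the first eigenvector of that similarity matrix (found by power iteration), and the weights are rescaled to sum to 1 for readability. The data and function names are hypothetical, and this is not the implementation used in the study.

```python
from math import sqrt


def centre_columns(table):
    means = [sum(col) / len(table) for col in zip(*table)]
    return [[v - m for v, m in zip(row, means)] for row in table]


def inner_product(a, b):
    """Entrywise inner product of two tables of identical shape."""
    return sum(x * y for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b))


def vector_correlations(tables):
    """Matrix of normalised inner products between all pairs of tables."""
    norms = [sqrt(inner_product(t, t)) for t in tables]
    return [
        [inner_product(a, b) / (na * nb) for b, nb in zip(tables, norms)]
        for a, na in zip(tables, norms)
    ]


def first_eigenvector(matrix, iterations=200):
    """Dominant eigenvector by power iteration, starting from a vector of ones."""
    vector = [1.0] * len(matrix)
    for _ in range(iterations):
        vector = [sum(m * v for m, v in zip(row, vector)) for row in matrix]
        norm = sqrt(sum(v * v for v in vector))
        vector = [v / norm for v in vector]
    return vector


def compromise(tables):
    """Return (weights, compromise table) for a series of tables."""
    centred = [centre_columns(t) for t in tables]
    weights = first_eigenvector(vector_correlations(centred))
    total = sum(weights)
    weights = [w / total for w in weights]
    n_rows, n_cols = len(centred[0]), len(centred[0][0])
    table = [
        [sum(w * t[i][j] for w, t in zip(weights, centred)) for j in range(n_cols)]
        for i in range(n_rows)
    ]
    return weights, table


# Hypothetical case: the same individuals-by-genera structure at 4 time points.
base = [[1.0, 2.0], [2.0, 4.0], [3.0, 3.0]]
weights, compromise_table = compromise([base, base, base, base])
print([round(w, 2) for w in weights])
```

When one table departs from the shared structure, its weight drops below that of the others, which is how unequal contributions to the compromise would show up.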
Summary
Below is a summary of the multivariate approaches used in this study:
PCA = Principal component analysis
COIA = Co-inertia analysis
PTA = Partial triadic analysis
Source code availability and data processing
All source code and data from this study are bundled in an R package available on GitHub. The raw
code can be downloaded from the following URL: https://github.com/tapj/AlimIntest
R users can run the following commands to install and explore the AlimIntest
package and to reproduce the results reported in this study:
library(devtools)
install_github("tapj/AlimIntest")
browseVignettes("AlimIntest")
2 RNA extraction procedure from fecal microbiota sample
We adapted an RNA isolation kit to recover RNA from strains originating from a complex environment.
Before using the High Pure Isolation kit (Roche), it was important to first perform the bacterial lysate
preparation described below.
2.1 Reagents
- High Pure Isolation kit (Roche) containing:
- RQ1 RNase-free DNase enzyme (Promega) with its buffer and DEPC water
- Acetate solution, 3 M, pH = 4.8
- SDS solution, 20%
- RNase-free water
- Ethanol solution (100% and 70%)
- Mixed phenol (pH = 4) chloroform-isoamyl alcohol solution (5:1)
- Zirconium beads (diameter = 0.1 mm)
- Tris-EDTA 1X
2.2 Bacterial lysate preparation
1) Add 400 µL of Tris-EDTA 1X to resuspend 200 mg of fecal sample.
2) Add 500 µL of phenol-chloroform-isoamyl alcohol solution.
3) Add 25 µL of SDS 20% solution, 50 µL of acetate 3 M and 600 mg of zirconium beads.
4) Run the FastPrep (FP120, MP Biomedical) for 1 minute (power 5) to break the bacterial cell
walls, then centrifuge for 15 min at 15,000g, room temperature.
5) Add 500 µL of chloroform-isoamyl alcohol solution to the supernatant in order to wash out
phenol residues. Vortex vigorously and centrifuge for 10 minutes at 15,000g, 4°C.
2.3 RNA isolation and purification
This protocol is adapted from van Hijum et al. (2005).
1) To 50 µL of bacterial lysate, add 400 µL of lysis-binding buffer from Roche kit and load onto a
purification column from the Roche kit.
2) Centrifuge 15 sec at 8000g at 4°C and discard the supernatant.
3) Incubate the column 30 min at 37°C with 100 µL of DNase without agitating.
4) Wash the column with the Wash buffer I, centrifuge 15 sec at 10,000g and discard the
supernatant.
5) Wash the column twice with Wash buffer II, first with 500 µL and then with 200 µL. Centrifuge 2 min at
15,000g, room temperature, and discard the supernatant.
6) Add 60 µL of elution buffer, incubate 5 min at 20°C and centrifuge 1 min at 15,000g, room
temperature.
7) Add 40 µL of DNase and incubate 20 min at 37°C without agitating.
8) Add 10 µL of acetate and 330 µL of absolute ethanol, and incubate 30 min at -80°C.
9) Centrifuge 30 min at 15,000g; discard the supernatant and dry ethanol residue.
10) Wash the pellet with 1 mL of 70% ethanol solution, centrifuge 1 minute at 15,000g, room
temperature, and discard ethanol.
11) Dry the pellet with a speed vacuum and resuspend the pellet in Tris-EDTA solution.
12) Check the RNA quality with a bioanalyzer (Agilent).
The amounts of RNA obtained from 200 mg of fecal sample are shown in supplementary Table 3.
3 cDNA libraries preparation and in silico analysis
Sequencing of the cDNA libraries was performed by external companies. Because of the large amount of material
recommended for pyrosequencing analysis (more than 5 µg per sample), we used the entire
amplification kit, following the recommended protocol.
3.1 Ribosomal RNA removal and amplification
After the total RNA was extracted as described above, its quality was checked with a bioanalyzer
(Agilent). 5S rRNA and small RNA molecules were removed using the RNeasy kit (Qiagen) following the
default procedure. The MICROBExpress Bacterial mRNA Enrichment kit (Ambion) was used to remove 23S
and 16S rRNA molecules by subtractive hybridization. We had previously determined
that a maximum amount of 5 µg per reaction provided optimal efficiency. cDNA libraries were
then prepared using the Whole Transcriptome Amplification kit (WTA2, Sigma-Aldrich) following the
default procedure. cDNA samples were checked with a bioanalyzer (Agilent) for the presence of
mRNA and the removal of rRNA peaks. Before being sent for pyrosequencing analysis, mRNA
concentrations were quantified with PicoGreen staining.
3.2 Bioinformatics analysis
Overview of the bioinformatics analysis of cDNA libraries:
- 603,463 raw reads subjected to quality checking and RNA removal (details in Supplementary Table S2)
- 118,301 reads assembled with CAP3 (p = 66%)
- 59,443 singletons + 5,006 contigs (58,858 reads)
- 15,082 blast hits on Qin et al database (1,872 contigs (11,730 reads) had no hits)
- 59,290 reads distributed among 11,441 genes (MetaHIT DB)
- 23,977 reads distributed among 2,148 COGs and NOGs (eggNOG database v1)
- 18,573 reads distributed among 1,704 KOs (KEGG database)
- 2,220 reads distributed among 73 CAZy families (CAZy database)
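The read counts in this overview can be cross-checked with simple arithmetic, sketched here in Python. The variable names are ours, and the interpretation that singletons plus contig reads should add up to the reads entering the CAP3 assembly is an assumption that the figures happen to satisfy.

```python
# Counts copied from the overview above.
raw_reads = 603_463
assembled_reads = 118_301
singletons = 59_443
contigs = 5_006
reads_in_contigs = 58_858

# Singletons and reads placed in contigs should account for every read
# that entered the assembly.
reads_balance = singletons + reads_in_contigs == assembled_reads

# Share of raw reads that remained after quality checking and RNA removal.
retained_fraction = assembled_reads / raw_reads

# Average number of reads per contig.
mean_reads_per_contig = reads_in_contigs / contigs

print(reads_balance)
print(f"{retained_fraction:.1%} of raw reads retained")
print(f"{mean_reads_per_contig:.1f} reads per contig on average")
```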
4 SCGE assay detailed protocol
The human colorectal carcinoma cell line HT-29 was obtained from American Type Culture Collection
(Rockville, MD). Cells were grown as monolayers in RPMI 1640 (Sigma) supplemented with 2 mM L-glutamine, 100 IU/mL penicillin, 100 µg/mL streptomycin (Sigma) and 10% heat-inactivated fetal calf
serum (FCS - Lonza) in a humidified 5% CO2 atmosphere at 37°C. Cells were used between passages 15
and 30. DNA damage in cells was examined using the single-cell gel electrophoresis (SCGE) assay (or Comet assay). After 4 days of
confluence, cells were trypsinized and resuspended at 1 × 10^4 cells/mL of medium, and cell viability was
assessed by Trypan blue exclusion.
The cell suspension (900 µL) was incubated at 37°C for 30 min (Glinghammar et al., 1997; Rieger et
al., 1999) with 100 µL of fecal water, or 1x PBS buffer (negative control), or 5 µM hydrogen peroxide
in 1x PBS buffer (positive control). Cells were then pelleted by centrifugation (500 g, 4°C, 5 min),
resuspended in 190 µL of warm low-melting point agarose (1 % w/v), and spread onto the 2 wells of
a CometSlide™ (Trevigen, Gaithersburg, MD). Slides were then treated according to the
manufacturer's instructions for alkaline Comet assay, except that SYBR Gold was used instead of
SYBR Green I for DNA staining. Slides were viewed with an epifluorescence microscope (40x
magnification) and images were acquired through a camera using Image-Pro Express v. 6.3 image
analysis software (Media Cybernetics, Bethesda, MD). The fractional amount of DNA in the Comet
tail (percentage of DNA in the tail) was chosen as a descriptor of DNA damage, as recommended by
the Comet assay interest group (www.cometassay.com), and quantified using the public domain
image processing program ImageJ (NIH). Assays were carried out in duplicate so that two slides were
prepared from each fecal water sample. One hundred randomly selected cells were counted and the
mean was calculated to provide a single value.
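The damage descriptor used here, the percentage of DNA in the tail averaged over the scored cells, can be written as a short Python sketch. The intensity values and function names are hypothetical; in practice the head and tail intensities would come from the ImageJ measurements, and 100 cells would be scored per sample.

```python
from statistics import mean


def percent_tail_dna(head_intensity, tail_intensity):
    """Percentage of the total comet fluorescence found in the tail."""
    total = head_intensity + tail_intensity
    if total <= 0:
        raise ValueError("total comet intensity must be positive")
    return 100.0 * tail_intensity / total


def sample_damage(cells):
    """Mean percentage of tail DNA over (head, tail) pairs of scored cells."""
    return mean(percent_tail_dna(head, tail) for head, tail in cells)


# Three hypothetical cells, for illustration only.
cells = [(900.0, 100.0), (800.0, 200.0), (700.0, 300.0)]
damage = sample_damage(cells)
print(damage)
```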
5 Detailed menus for diet plans
Only the following items were allowed for breakfast: tea, coffee, white sandwich bread, butter, non-probiotic plain yogurt and apricot jelly, plus dried apricots and figs for the 40 g fiber plan.
Examples of day meals providing 10 g fiber and 40 g fiber:
Lunch
10g/Day: Chicken cooked with tomatoes; Polenta; Camembert; Kiwis
40g/Day: Raw Belgian endives; Chicken; Zucchini; Mashed potatoes; Goat cheese; Pear
Dinner
10g/Day: Broth with noodles; Spaghetti; Prawns; Plain yogurt; Apple
40g/Day: Vegetable soup; Prawns; Cantonese rice; Plain yogurt; Dried fruits
4 to 10 slices of white sandwich bread allowed per day.
See Table S4 for weekly meals composition.
6 Additional references
Chessel, D., Dufour, A.-B., and Thioulouse, J. (2004) The ade4 package - I: One-table methods. R News
4: 5-10.
Dray, S., Chessel, D., and Thioulouse, J. (2003) Co-inertia analysis and the linking of ecological data
tables. Ecology 84: 3078-3089.
Edgar, R.C. (2013) UPARSE: highly accurate OTU sequences from microbial amplicon reads. Nat
Methods 10: 996-998.
Furet, J.P., Firmesse, O., Gourmelon, M., Bridonneau, C., Tap, J., Mondot, S. et al. (2009) Comparative
assessment of human and farm animal faecal microbiota using real-time quantitative PCR. FEMS
Microbiology Ecology 19: 19.
Glinghammar, B., Venturi, M., Rowland, I.R., and Rafter, J.J. (1997) Shift from a dairy product-rich to a
dairy product-free diet: influence on cytotoxicity and genotoxicity of fecal water--potential risk
factors for colon cancer. Am J Clin Nutr 66: 1277-1282.
Hildebrand, F., Tito, T., Voigt, A., Bork, P., and Raes, J. (2014) LotuS: an efficient and user-friendly
OTU processing pipeline. Microbiome 2.
Huse, S.M., Huber, J.A., Morrison, H.G., Sogin, M.L., and Welch, D.M. (2007) Accuracy and quality of
massively parallel DNA pyrosequencing. Genome Biol 8: R143.
Lane, D.J. (1991) 16S/23S rRNA sequencing. In. Stackebrandt, E., and Goodfellow, J. (eds): Wiley, pp.
115-175.
Lyons, S.R., Griffen, A.L., and Leys, E.J. (2000) Quantitative real-time PCR for Porphyromonas
gingivalis and total bacteria. J Clin Microbiol 38: 2362-2365.
Rieger, M.A., Parlesak, A., Pool-Zobel, B.L., Rechkemmer, G., and Bode, C. (1999) A diet high in fat
and meat but low in dietary fibre increases the genotoxic potential of 'faecal water'. Carcinogenesis
20: 2311-2316.
Thompson, J.D., Higgins, D.G., and Gibson, T.J. (1994) CLUSTAL W: improving the sensitivity of
progressive multiple sequence alignment through sequence weighting, position-specific gap
penalties and weight matrix choice. Nucleic Acids Research 22: 4673-4680.
van Hijum, S.A., de Jong, A., Baerends, R.J., Karsens, H.A., Kramer, N.E., Larsen, R. et al. (2005) A
generally applicable validation scheme for the assessment of factors involved in reproducibility and
quality of DNA-microarray data. BMC Genomics 6: 77.
Wang, Q., Garrity, G.M., Tiedje, J.M., and Cole, J.R. (2007) Naive Bayesian classifier for rapid
assignment of rRNA sequences into the new bacterial taxonomy. Applied and Environmental
Microbiology 73: 5261-5267.
Wilson, K.H., Blitchington, R.B., and Greene, R.C. (1990) Amplification of bacterial 16S ribosomal DNA
with polymerase chain reaction. J Clin Microbiol 28: 1942-1946.