Supplementary Notes

1 Microbiota composition analysis

1.1 Oligonucleotide primers and probes used for qPCR

The primers and probes were previously described (Furet et al., 2009). They were designed based on 16S rRNA sequences (RDP II database) aligned with the program ClustalW (Thompson et al., 1994). TaqMan® qPCR was used to quantify the total bacterial population and the main components of the microbiota (i.e. >1% of the fecal bacterial population): Clostridium coccoides group, C. leptum group, Bacteroides/Prevotella group and Bifidobacterium genus. SYBR-Green® qPCR was used to quantify Escherichia coli. The TaqMan® probes were synthesized by Applied-Biosystems Applera-France. Primers were purchased from MWG (MWG-Biotech AG, Ebersberg, Germany).

1.2 Real-time qPCR

Real-time qPCR was performed using an ABI 7000 Sequence Detection System with software version 1.2.3 (Applied-Biosystems, Foster City, CA, USA). Amplification and detection were carried out in 96-well plates with TaqMan® Universal PCR 2× MasterMix (Applied-Biosystems) or with SYBR-Green® PCR 2× Master Mix (Applied-Biosystems). Each reaction was run in duplicate in a final volume of 25 µL with 0.2 mM final concentration of each primer, 0.25 mM final concentration of each probe and 10 µL of appropriate dilutions of DNA samples. Amplifications were carried out using the following ramping profile: 1 cycle at 95°C for 10 min, followed by 40 cycles of 95°C for 30 s and 60°C for 1 min. For SYBR-Green® amplifications, a melting step was added to improve amplification specificity. Total numbers of bacteria were inferred from averaged standard curves as previously described (Lyons et al., 2000).

1.3 Pyrosequencing analysis

A total of 76 bacterial DNA samples from 19 individuals at 4 time points (time points 2, 3, 4 and 5; see Figure 1 in the main manuscript) were used to construct DNA libraries. The following universal 16S rRNA gene primers were used for the PCR reaction: V3F (TACGGRAGGCAGCAG, E. coli positions 343-357) (Wilson et al., 1990) and V4R (GGACTACCAGGGTATCTAAT, E. coli positions 787-806) (Lane, 1991). They target the V3-V4 region, which gives the best confidence for taxonomic assignment (Wang et al., 2007). Barcode sequences, i.e. the GsFLX key (TCAG) and a MID GsFLX tag (12 nucleotides), were attached between the 454 GsFLX adaptor sequence and the forward primer V3F. The GsFLX key and the 454 GsFLX adaptor were attached to the reverse primer. The concentration and quality of the PCR products were assessed with PicoGreen in order to obtain equal amounts of each sample. The 16S rRNA gene amplicons were then sequenced on a Roche GS FLX Ti 454 sequencer (Genoscreen, Lille, France) and processed with the manufacturer's standard protocol.

1.4 Quality checking and OTU processing

Raw reads were quality trimmed according to published recommendations (Huse et al., 2007). The next steps were performed using the LotuS pipeline (Hildebrand et al., 2014). Quality filtering was done using the SDM software, whose options take into account the average read quality, the accumulated error over the sequence and the quality within a sliding window. Reads were further filtered for minimal and maximal length, ambiguous nucleotides, barcode and primer errors, and homopolymeric nucleotide runs. The LotuS pipeline allows users to choose high-quality and mid-quality criteria. In our study, we used the default LotuS parameters for the 454 sequencing platform: high-quality sequences were used to build OTUs, and both high- and mid-quality sequences were then mapped to the established OTUs to count their occurrence in each sample (Hildebrand et al., 2014). OTU clustering was done with UPARSE, which is also included in LotuS and embeds chimera filtering based on UCHIME (Edgar, 2013).
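The read-level filters listed in section 1.4 (length bounds, ambiguous nucleotides, homopolymer runs, average quality and sliding-window quality) can be illustrated with a minimal Python sketch. This is not the SDM or LotuS implementation: the `Read` class, the function names and every threshold below are hypothetical values chosen for illustration, not the pipeline defaults used in the study, and barcode/primer checks are left out.

```python
from dataclasses import dataclass
import re

@dataclass
class Read:
    sequence: str
    qualities: list  # one Phred score per base

# Hypothetical thresholds for illustration only; NOT the SDM/LotuS defaults.
MIN_LENGTH = 200
MAX_LENGTH = 600
MIN_MEAN_QUALITY = 25
MAX_HOMOPOLYMER = 8
WINDOW = 50
MIN_WINDOW_QUALITY = 20

def longest_homopolymer(sequence):
    """Length of the longest run of a single repeated nucleotide."""
    runs = re.finditer(r"(.)\1*", sequence)
    return max((len(m.group(0)) for m in runs), default=0)

def min_window_quality(qualities, window):
    """Lowest mean quality over any sliding window of the given width."""
    if len(qualities) <= window:
        return sum(qualities) / len(qualities)
    total = sum(qualities[:window])
    lowest = total
    for i in range(window, len(qualities)):
        total += qualities[i] - qualities[i - window]
        lowest = min(lowest, total)
    return lowest / window

def passes_filter(read):
    """Apply the length, ambiguity, homopolymer and quality checks in turn."""
    seq, qual = read.sequence.upper(), read.qualities
    if not MIN_LENGTH <= len(seq) <= MAX_LENGTH:
        return False
    if set(seq) - set("ACGT"):  # any ambiguous nucleotide, e.g. N
        return False
    if longest_homopolymer(seq) > MAX_HOMOPOLYMER:
        return False
    if sum(qual) / len(qual) < MIN_MEAN_QUALITY:
        return False
    return min_window_quality(qual, WINDOW) >= MIN_WINDOW_QUALITY
```

A two-tier scheme like the one described above (high-quality reads to build OTUs, mid-quality reads only for counting) could be obtained by calling such a filter twice with a strict and a relaxed set of thresholds.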
1.5 Additional notes on statistical analysis

We used several multi-table approaches in this study in order to decipher how different datasets (microbiota composition, SCFA profile, metatranscriptomics) were linked and how this link was affected by the nutritional intervention (10 g vs 40 g fiber per day). Below, we introduce between/within class analysis, co-inertia analysis and partial triadic analysis.

Between and within class analysis

The microbiota composition dataset (log10 normalized) was challenged in three different tests at each taxonomic level, from phylum to species:
- between-diet analysis;
- between-subject analysis;
- between-diet analysis within subject.

Between class analysis (BCA) is a particular case of principal component analysis (PCA) with an instrumental variable: in our study, the instrumental variables were qualitative factors (Chessel et al., 2004).

Figure 1: Between and within class analysis framework used in this study

On the left side of Figure 1, the dataset was split into eight groups (or classes) of diet change, four for the 10-40 run and four for the 40-10 run (i.e. the result of the interaction of diet run and time point). In the middle, the dataset was split into 19 subjects. BCA enables us i) to find the principal components based on the center of gravity of each group (i.e. diet change or subject) in order to highlight differences between groups, and then ii) to link each sample with its group. In a PCA, the sum of the eigenvalues corresponds to the total inertia of the dataset. The eigenvalues obtained from the BCA therefore indicate which proportion of the total inertia is explained by the qualitative factor (diet change or subject). Thus, BCA helps to decipher how diet and subject specificity explain the variations in the microbiota dataset.

Within class analysis (WCA), performed downstream of the PCA, removes the inter-group variability using a mean-centering transformation by subject. Using WCA, the group effect (e.g. subject specificity) is thus removed. On the right side of Figure 1, WCA is followed by a BCA on diet, which makes it possible to estimate the proportion of inertia explained by the diet once subject specificity has been removed.

A Monte Carlo (MC) test was used to check whether the observed inertia explained by the factor was higher than expected at random. Simulated inertia values were computed from permuted datasets in which subject and diet-change labels were randomly reassigned upstream of the PCA. The observed inertia was considered statistically significant when it was higher than the 95th percentile of the simulated inertia.

Co-inertia analysis

Co-inertia analysis (COIA) is an ordination method for coupling two (or more) sets of parameters (e.g. SCFA profile and microbiota composition) by looking at their linear combinations. Thus, co-inertia analysis enables the simultaneous ordination of several tables. COIA is related to other multivariate analyses such as canonical correlation analysis. In COIA, the co-inertia (the sum of squared covariances) between the two sets is maximized and decomposed. Hence, the co-inertia value is a global measure of the co-structure between the two datasets. Co-inertia is high when the two sets vary together and low when they vary independently (Dray et al., 2003). Depending on the dataset, COIA is coupled with PCA or correspondence analysis. In this study, two independent PCAs were computed on microbiota composition and SCFA profile and then subjected to a COIA. The overall relatedness of the two datasets was measured by the RV coefficient (Dray et al., 2003). The RV coefficient is a coefficient of correlation between two tables (in this case microbiota composition and SCFA profile). A Monte Carlo test was used to test the robustness of the RV coefficient.
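To make the RV coefficient and its Monte Carlo test concrete, here is a minimal pure-Python sketch. It is an illustration only, not the code used in the study (which relies on R); the function names are hypothetical. It assumes two tables with the same samples as rows, column-centers them, and computes RV = ||X'Y||² / (||X'X|| · ||Y'Y||) with Frobenius norms. The permutation test shuffles the rows of one table and reports the fraction of permuted RV values at least as large as the observed one.

```python
import random

def center_columns(table):
    """Subtract each column mean (rows = samples, columns = variables)."""
    n = len(table)
    means = [sum(col) / n for col in zip(*table)]
    return [[v - m for v, m in zip(row, means)] for row in table]

def cross_product(a, b):
    """Return the matrix a^T b for two tables sharing the same rows."""
    return [[sum(x * y for x, y in zip(col_a, col_b)) for col_b in zip(*b)]
            for col_a in zip(*a)]

def sum_of_squares(matrix):
    """Squared Frobenius norm of a matrix given as a list of rows."""
    return sum(v * v for row in matrix for v in row)

def rv_coefficient(x, y):
    """RV coefficient between two tables measured on the same samples."""
    xc, yc = center_columns(x), center_columns(y)
    numerator = sum_of_squares(cross_product(xc, yc))
    denominator = (sum_of_squares(cross_product(xc, xc))
                   * sum_of_squares(cross_product(yc, yc))) ** 0.5
    return numerator / denominator

def rv_permutation_test(x, y, n_perm=999, seed=0):
    """Observed RV and a permutation p-value (rows of y are shuffled)."""
    rng = random.Random(seed)
    observed = rv_coefficient(x, y)
    rows = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(rows)
        if rv_coefficient(x, rows) >= observed:
            hits += 1
    # The observed arrangement is counted as one of the permutations.
    return observed, (hits + 1) / (n_perm + 1)
```

With this convention, an observed value above the 95th percentile of the permuted values corresponds to a p-value below 0.05, which mirrors the decision rule stated for the inertia test.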
Partial triadic analysis

Partial triadic analysis (PTA) is a multivariate method which aims to decipher how inertia is explained across a series of matrices connected through a gradient. In this study, the matrices are microbiota compositions, with individuals as rows and genus relative abundances as columns, connected through the sampling points of the nutritional intervention. PTA allows the simultaneous ordination of the matrices (n = 4 in our study) and finds a structure common to all of them. This common structure is a compromise of the inertia coming from the PCA of each matrix. Hence, the contribution of each matrix to the compromise can be evaluated. Globally, this analysis lets us assess the stability of the relationship between variables (e.g. genus abundance) and individuals through time. In this study, each time point contributes equally to the compromise, which means that the link between individuals and genera is maintained throughout the study.

Summary

Below is a summary of the multivariate approaches used in this study:
PCA = Principal component analysis
COIA = Co-inertia analysis
PTA = Partial triadic analysis

Source code availability and data processing

All source code and data from this study are embedded in an R package available on GitHub. The source code can be downloaded from the following URL: https://github.com/tapj/AlimIntest

R users can import, install and explore the AlimIntest package, and reproduce the results reported in this study, with the following commands:

    library(devtools)
    install_github("tapj/AlimIntest")
    browseVignettes("AlimIntest")

2 RNA extraction procedure from fecal microbiota sample

We adapted an RNA isolation kit to recover RNA from strains originating from a complex environment. Before using the High Pure Isolation kit (Roche), it was important to first prepare the bacterial lysate as described in section 2.2.
2.1 Reagents

- High Pure Isolation kit (Roche) containing:
- RQ1 RNase-free DNase enzyme (Promega) with its buffer and DEPC water
- Acetate solution 3M, pH = 4.8
- SDS solution 20%
- RNase-free water
- Ethanol solution (100% and 70%)
- Mixed phenol solution (pH = 4), chloroform-isoamyl alcohol solution (5:1)
- Zirconium beads (diameter = 0.1 mm)
- Tris-EDTA 1X

2.2 Bacterial lysate preparation

1) Add 400 µL Tris-EDTA 1X to resuspend 200 mg of fecal sample.
2) Add 500 µL of phenol-chloroform isoamyl alcohol solution.
3) Add 25 µL of SDS 20% solution, 50 µL of Acetate 3M and 600 mg of zirconium beads.
4) Use a FastPrep (FP120, MP Biomedical) for 1 min (power 5) to break the bacterial cell walls, then centrifuge 15 min at 15,000g, room temperature.
5) Add 500 µL of chloroform isoamyl alcohol solution to the supernatant in order to wash out phenol residues. Vortex strongly and centrifuge 10 min at 15,000g, 4°C.

2.3 RNA isolation and purification

This protocol is adapted from van Hijum et al. (2005).

1) To 50 µL of bacterial lysate, add 400 µL of lysis-binding buffer from the Roche kit and load onto a purification column from the Roche kit.
2) Centrifuge 15 sec at 8000g at 4°C and discard the supernatant.
3) Incubate the column 30 min at 37°C with 100 µL of DNase without agitating.
4) Wash the column with Wash buffer I, centrifuge 15 sec at 10,000g and discard the supernatant.
5) Wash the column twice, with 500 µL and then 200 µL of Wash buffer II. Centrifuge 2 min at 15,000g, room temperature, and discard the supernatant.
6) Add 60 µL of elution buffer, incubate 5 min at 20°C and centrifuge 1 min at 15,000g, room temperature.
7) Add 40 µL of DNase and incubate 20 min at 37°C without agitating.
8) Add 10 µL of Acetate and 330 µL of absolute ethanol, and incubate 30 min at -80°C.
9) Centrifuge 30 min at 15,000g; discard the supernatant and dry the ethanol residue.
10) Wash the pellet with 1 mL of 70% ethanol solution, centrifuge 1 min at 15,000g, room temperature, and discard the ethanol.
11) Dry the pellet with a speed vacuum and resuspend it in Tris-EDTA solution.
12) Check RNA quality with a bioanalyzer (Agilent).

The amounts of RNA obtained from 200 mg of fecal sample are shown in Supplementary Table S3.

3 cDNA library preparation and in silico analysis

Sequencing of the cDNA libraries was performed by external companies. Because a high amount of material is recommended for pyrosequencing analysis (more than 5 µg per sample), we used the entire amplification kit, following the recommended protocol.

3.1 Ribosomal RNA removal and amplification

After the total RNA was extracted as described above, its quality was checked with a bioanalyzer (Agilent). 5S rRNA and small RNA molecules were removed using the RNeasy kit (Qiagen) following the default procedure. The MICROBExpress Bacterial mRNA Enrichment kit (Ambion) was used to remove 23S and 16S rRNA molecules with a subtractive hybridization procedure. We had previously determined that a maximum amount of 5 µg per reaction provided optimal efficiency. cDNA libraries were then prepared using the Whole Transcriptome Amplification kit (WTA2, Sigma-Aldrich) following the default procedure. cDNA samples were checked with a bioanalyzer (Agilent) for the presence of mRNA and the removal of rRNA peaks. Before being sent for pyrosequencing analysis, mRNA concentrations were quantified with PicoGreen staining.

3.2 Bioinformatics analysis

Overview of the bioinformatics analysis of cDNA libraries:
- 603,463 raw reads submitted to quality checking and RNA removal (details in Supplementary Table S2)
- 118,301 reads assembled with CAP3 (p = 66%)
- 59,443 singletons + 5,006 contigs (58,858 reads)
- 15,082 blast hits on the Qin et al. database (1,872 contigs (11,730 reads) had no hits)
- 59,290 reads distributed among 11,441 genes (MetaHIT DB)
- 23,977 reads distributed among 2,148 COGs and NOGs (eggNOG database v1)
- 18,573 reads distributed among 1,704 KOs (KEGG database)
- 2,220 reads distributed among 73 CAZy families (CAZy database)
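The read counts in the overview above can be cross-checked with a few lines of arithmetic. The Python sketch below copies the counts from the list and verifies that singletons plus reads assembled into contigs add up to the 118,301 reads passed to CAP3. Using those 118,301 reads as the denominator for the annotation rates is an assumption made here for illustration; the source does not state which denominator the authors would use.

```python
# Counts copied from the overview list in section 3.2.
ASSEMBLED_READS = 118_301
SINGLETONS = 59_443
READS_IN_CONTIGS = 58_858
ANNOTATED_READS = {
    "MetaHIT genes": 59_290,
    "eggNOG COGs/NOGs": 23_977,
    "KEGG KOs": 18_573,
    "CAZy families": 2_220,
}

def annotation_rates(total, annotated):
    """Fraction of `total` reads assigned in each annotation database."""
    return {name: count / total for name, count in annotated.items()}

# Singletons and reads placed in contigs should account for every read.
assert SINGLETONS + READS_IN_CONTIGS == ASSEMBLED_READS

# Assumption: rates are expressed relative to the reads passed to CAP3.
for name, rate in annotation_rates(ASSEMBLED_READS, ANNOTATED_READS).items():
    print(f"{name}: {rate:.1%}")
```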
4 SCGE assay detailed protocol

The human colorectal carcinoma cell line HT-29 was obtained from the American Type Culture Collection (Rockville, MD). Cells were grown as monolayers in RPMI 1640 (Sigma) supplemented with 2 mM L-glutamine, 100 IU/mL penicillin, 100 µg/mL streptomycin (Sigma) and 10% heat-inactivated fetal calf serum (FCS, Lonza) in a humidified 5% CO2 atmosphere at 37°C. Cells were used between passages 15 and 30.

DNA damage in cells was examined using the SCGE assay (or Comet assay). After 4 days of confluence, cells were trypsinized and resuspended at 1 × 10^4 cells/mL of medium, and cell viability was assessed by Trypan blue exclusion. The cell suspension (900 µL) was incubated at 37°C for 30 min (Glinghammar et al., 1997; Rieger et al., 1999) with 100 µL of fecal water, or 1x PBS buffer (negative control), or 5 µM hydrogen peroxide in 1x PBS buffer (positive control). Cells were then pelleted by centrifugation (500 g, 4°C, 5 min), resuspended in 190 µL of warm low-melting point agarose (1% w/v), and spread onto the 2 wells of a CometSlide™ (Trevigen, Gaithersburg, MD). Slides were then treated according to the manufacturer's instructions for the alkaline Comet assay, except that SYBR Gold was used instead of SYBR Green I for DNA staining.

Slides were viewed with an epifluorescence microscope (40x magnification) and images were acquired through a camera using Image-Pro Express v. 6.3 image analysis software (Media Cybernetics, Bethesda, MD). The fractional amount of DNA in the Comet tail (percentage of DNA in the tail) was chosen as the descriptor of DNA damage, as recommended by the Comet assay interest group (www.cometassay.com), and was quantified using the public domain image processing program ImageJ (NIH). Assays were carried out in duplicate, so that two slides were prepared from each fecal water sample. One hundred randomly selected cells were counted and the mean was calculated to provide a single value.
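The scoring step can be sketched as follows, under stated assumptions. The sketch supposes that image analysis yields, for each scored cell, an integrated fluorescence intensity for the comet head and for the tail; the percentage of DNA in the tail is then tail / (head + tail) × 100, and the sample value is the mean over the scored cells. The function names and the input format are hypothetical, and the way cells from the two duplicate slides are pooled is not specified in the text, so here they are simply passed as one list.

```python
from statistics import mean

def percent_tail_dna(head_intensity, tail_intensity):
    """Percentage of the comet's total DNA signal found in the tail."""
    total = head_intensity + tail_intensity
    if total <= 0:
        raise ValueError("comet has no measurable DNA signal")
    return 100.0 * tail_intensity / total

def sample_score(comets):
    """Mean percent tail DNA over all scored comets of one sample.

    `comets` is an iterable of (head_intensity, tail_intensity) pairs,
    e.g. the one hundred randomly selected cells of a fecal water sample.
    """
    return mean(percent_tail_dna(head, tail) for head, tail in comets)
```

For example, `sample_score([(90, 10), (70, 30)])` averages 10% and 30% tail DNA into a single value for the sample.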
5 Detailed menus for diet plans

Only the following items were allowed for breakfast: tea, coffee, white sandwich bread, butter, non-probiotic plain yogurt and apricot jelly, plus dried apricots and figs for the 40 g fiber plan.

Examples of day meals providing 10 g fiber and 40 g fiber:

Lunch
  10 g/day: chicken cooked with tomatoes; polenta; Camembert; kiwis
  40 g/day: raw Belgian endives; chicken; zucchini; mashed potatoes; goat cheese; pear
Dinner
  10 g/day: broth with noodles; spaghetti; prawns; plain yogurt; apple
  40 g/day: vegetable soup; prawns; Cantonese rice; plain yogurt; dried fruits

Four to ten slices of white sandwich bread were allowed per day. See Table S4 for the composition of weekly meals.

6 Additional references

Chessel, D., Dufour, A.-B., and Thioulouse, J. (2004) The ade4 package-I- One-table methods. R News 4: 5-10.
Dray, S., Chessel, D., and Thioulouse, J. (2003) Co-inertia analysis and the linking of ecological data tables. Ecology 84: 3078-3089.
Edgar, R.C. (2013) UPARSE: highly accurate OTU sequences from microbial amplicon reads. Nat Methods 10: 996-998.
Furet, J.P., Firmesse, O., Gourmelon, M., Bridonneau, C., Tap, J., Mondot, S. et al. (2009) Comparative assessment of human and farm animal faecal microbiota using real-time quantitative PCR. FEMS Microbiology Ecology 19: 19.
Glinghammar, B., Venturi, M., Rowland, I.R., and Rafter, J.J. (1997) Shift from a dairy product-rich to a dairy product-free diet: influence on cytotoxicity and genotoxicity of fecal water--potential risk factors for colon cancer. Am J Clin Nutr 66: 1277-1282.
Hildebrand, F., Tito, T., Voigt, A., Bork, P., and Raes, J. (2014) LotuS: an efficient and user-friendly OTU processing pipeline. Microbiome 2.
Huse, S.M., Huber, J.A., Morrison, H.G., Sogin, M.L., and Welch, D.M. (2007) Accuracy and quality of massively parallel DNA pyrosequencing. Genome Biol 8: R143.
Lane, D.J. (1991) 16S/23S rRNA sequencing. In Stackebrandt, E., and Goodfellow, J. (eds): Wiley, pp. 115-175.
Lyons, S.R., Griffen, A.L., and Leys, E.J. (2000) Quantitative real-time PCR for Porphyromonas gingivalis and total bacteria. J Clin Microbiol 38: 2362-2365.
Rieger, M.A., Parlesak, A., Pool-Zobel, B.L., Rechkemmer, G., and Bode, C. (1999) A diet high in fat and meat but low in dietary fibre increases the genotoxic potential of 'faecal water'. Carcinogenesis 20: 2311-2316.
Thompson, J.D., Higgins, D.G., and Gibson, T.J. (1994) CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Research 22: 4673-4680.
van Hijum, S.A., de Jong, A., Baerends, R.J., Karsens, H.A., Kramer, N.E., Larsen, R. et al. (2005) A generally applicable validation scheme for the assessment of factors involved in reproducibility and quality of DNA-microarray data. BMC Genomics 6: 77.
Wang, Q., Garrity, G.M., Tiedje, J.M., and Cole, J.R. (2007) Naive Bayesian classifier for rapid assignment of rRNA sequences into the new bacterial taxonomy. Applied and Environmental Microbiology 73: 5261-5267.
Wilson, K.H., Blitchington, R.B., and Greene, R.C. (1990) Amplification of bacterial 16S ribosomal DNA with polymerase chain reaction. J Clin Microbiol 28: 1942-1946.