Supplemental Material

Materials and Methods

Quantification of Cry-proteins from root tissue

Each frozen root sample of 0.2 g was placed in a 2.0 mL polypropylene screw-cap tube together with 0.5 mL sterile PBST buffer (137 mM NaCl, 27 mM KCl, 100 mM Na2HPO4, 10 mM KH2PO4, 0.5 % Tween-20, pH 7.4), 0.7 g of 0.1 mm zirconia-silica beads (Carl Roth, Karlsruhe, Germany), six 2.8 mm ceramic beads (Peqlab, Erlangen, Germany) and one 6.3 mm ceramic sphere (MP Biomedicals, Illkirch, France). The root tissue was macerated by four bead-beating steps of 45 s at 6.5 m s⁻¹ on a FastPrep-24 system (MP Biomedicals) and centrifuged at 16,000 × g and 4 °C for 10 min. The Cry-protein contents in the supernatants were analyzed separately using specific ELISA systems. The Cry1A.105 and Cry2Ab2 proteins were quantified using ELISA systems kindly provided by Monsanto (St. Louis, MO), and Cry3Bb1 was quantified with the Cry3Bb1 ELISA system from Agdia (Elkhart, IN), in each case following the supplier's protocol. The amount of total protein was determined with a ready-to-use Bradford solution (AppliChem, Darmstadt, Germany).

Processing of pyrosequencing data

Sequence data from pyrosequencing were supplied as a FASTA file and a corresponding sequence quality file. The RDP's pyrosequencing pipeline (Ribosomal Database Project, pyro.cme.msu.edu; Cole et al., 2009) was used to process the sequence data. In the initial processing step of the pipeline, the FASTA file was uploaded while the quality file was not used. Initial processing parameters were set to a maximum forward and reverse primer distance of 2, a maximum of 2 undetermined nucleotides (Ns) and a minimum sequence length of 150 nucleotides. In addition, a tag file was uploaded to assign all sequences to their respective MIDs. The sequences remaining after this trimming procedure were considered to be of high quality, and the sequences from each sample (MID) were analyzed separately.
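The length and ambiguity thresholds of the trimming step can be illustrated with a minimal Python sketch. It is not the RDP pipeline itself: the function names and the toy reads are hypothetical, and primer matching (maximum distance of 2) and MID sorting are deliberately left out.

```python
def passes_length_and_n_filter(seq, min_length=150, max_ns=2):
    """Return True if a read is long enough and has few enough Ns.

    Defaults mirror the thresholds stated in the text; primer matching
    is NOT covered by this illustrative helper.
    """
    seq = seq.upper()
    return len(seq) >= min_length and seq.count("N") <= max_ns


def filter_reads(reads, min_length=150, max_ns=2):
    """Keep only (read_id, sequence) pairs that pass both thresholds."""
    return [
        (read_id, seq)
        for read_id, seq in reads
        if passes_length_and_n_filter(seq, min_length, max_ns)
    ]


# Hypothetical toy reads, for illustration only.
reads = [
    ("read_ok", "ACGT" * 40),                 # 160 nt, no Ns
    ("read_short", "ACGT" * 30),              # 120 nt, below minimum length
    ("read_ambiguous", "ACGT" * 40 + "NNN"),  # 163 nt, but 3 Ns
]
kept = filter_reads(reads)
```

In this toy input only `read_ok` is retained; the other two reads fail the length and the N threshold, respectively.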
For taxonomic sequence assignment at all taxonomic ranks from phylum to genus, the RDP's classifier was used with a confidence threshold of 80 % (Claesson et al., 2009). The resulting taxonomical hierarchy file was the basis for all further taxonomic assessments, which were done in MS Excel 2002 (Microsoft, Unterschleißheim, Germany).

Bioinformatic analyses for higher taxonomical ranks (phylum to genus)

To compare the bacterial communities, relative abundances were calculated for all taxonomic units of each hierarchical taxonomic rank. For each sample, the number of sequences that were assigned to the domain Bacteria was defined as 100 %. To identify variations greater than the average abundances, each variety (one group) was compared to a second group combining the other three varieties. Differences (P ≤ 0.05) in relative abundances of taxonomic units on all taxonomic ranks were tested by a two-tailed, two-sample Student's t-test. Library coverage C (Equation 1) was calculated for all taxonomic ranks separately, i.e., starting from phylum down to genus, in accordance with Good (1953):

$$ C = \left(1 - \frac{n}{N}\right) \times 100 \tag{1} $$

with n = number of taxonomic units consisting of only one sequence and N = number of all sequences that were assigned to the respective taxonomic rank.

Patterns of relative abundance of each taxonomic rank were compared between samples by BioNumerics 5.10 (Applied Maths, Sint-Martens-Latem, Belgium). Unassigned sequences were omitted from these analyses because they could not be differentiated at the respective taxonomic ranks. Thus, in the BioNumerics analyses 100 % referred to all sequences assigned to distinct taxonomic units of each particular taxonomic rank. The Pearson correlation coefficient was used to generate matrices of similarities, and the corresponding dendrograms were based on UPGMA (Unweighted Pair Group Method with Arithmetic Mean).
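As an illustration of the relative abundance calculation and of Good's coverage (Equation 1), the following Python sketch works on a list of taxon assignments for one sample at one rank. The function names and the example genera counts are hypothetical and only serve to show the arithmetic; the original calculations were done in MS Excel.

```python
from collections import Counter


def relative_abundances(assignments):
    """Percentage of sequences per taxonomic unit (all assigned = 100 %)."""
    counts = Counter(assignments)
    total = len(assignments)
    if total == 0:
        raise ValueError("no assigned sequences")
    return {taxon: 100.0 * count / total for taxon, count in counts.items()}


def goods_coverage(assignments):
    """Good's coverage C = (1 - n / N) * 100.

    n = taxonomic units represented by exactly one sequence,
    N = all sequences assigned at this rank.
    """
    counts = Counter(assignments)
    total = len(assignments)
    if total == 0:
        raise ValueError("no assigned sequences")
    singletons = sum(1 for count in counts.values() if count == 1)
    return (1.0 - singletons / total) * 100.0


# Hypothetical genus assignments for one sample (illustration only).
sample = ["Bacillus"] * 5 + ["Pseudomonas"] * 3 + ["Erwinia", "Xylella"]
abundances = relative_abundances(sample)
coverage = goods_coverage(sample)  # n = 2 singletons, N = 10 sequences
```

Here two of the ten sequences are singletons, so the coverage is 80 %, and Bacillus accounts for 50 % of the assigned sequences.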
Average similarities between treatments were calculated from the respective similarity matrices, and a one-way analysis of variance (ANOVA) was applied to identify significant differences (P ≤ 0.05; SigmaPlot for Windows Version 11.0, Systat Software Inc., San José, CA). To identify the dominant genera, the dataset was reduced stepwise by removing the least represented genera, normalized, and subjected to a new cluster analysis. The reduction was continued until the new clustering output differed from the original one. The last reduction step that left the clustering unaffected was taken as the threshold: the genera remaining at that step were considered to be mainly responsible for the cluster structure and were defined as the "dominant" genera. The less represented genera excluded at that step were designated the "rare" genera.

Bioinformatic analyses of operational taxonomical units (OTU)

For taxonomic analyses on a level similar to the species rank, sequences that shared more than 97 % identity were considered to belong to the same rRNA species complex and were therefore grouped into the same OTU (Fox et al., 1992; Stackebrandt & Goebel, 1994). For all OTU-based analyses, the sequence data of each sample were aligned separately by the RDP's pyrosequencing pipeline aligner (pyro.cme.msu.edu), applying the Bacteria alignment model. The aligned sequences were clustered by the RDP's complete linkage clustering tool with a maximum distance of 10 and a step size of 1. The result was used as input to calculate the rarefaction curve by the RDP's pyrosequencing pipeline. The dereplicate request of the RDP was used to group sequences by a maximum distance of 3, which corresponds to the formation of OTUs based on a minimum of 97 % sequence similarity.
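To make the idea of a rarefaction curve concrete, the sketch below computes the expected number of OTUs in a random subsample of a given size using the standard analytical (hypergeometric) rarefaction formula. This is an assumption made for illustration: the text does not state how the RDP pipeline computes its curves, and the function names and OTU sizes are hypothetical.

```python
from math import comb


def expected_otus(otu_sizes, subsample_size):
    """Expected number of OTUs observed in a random subsample.

    Analytical rarefaction: E[S_n] = sum_i (1 - C(N - N_i, n) / C(N, n)),
    where N_i is the number of sequences in OTU i and N their total.
    """
    total = sum(otu_sizes)
    if not 0 <= subsample_size <= total:
        raise ValueError("subsample size must be between 0 and the total")
    denominator = comb(total, subsample_size)
    return sum(
        1.0 - comb(total - size, subsample_size) / denominator
        for size in otu_sizes
    )


def rarefaction_curve(otu_sizes, step=1):
    """List of (subsample size, expected OTUs) pairs up to the full sample."""
    total = sum(otu_sizes)
    return [(n, expected_otus(otu_sizes, n)) for n in range(0, total + 1, step)]


# Hypothetical OTU sizes for one sample (illustration only).
curve = rarefaction_curve([2, 1, 1])
```

For this toy sample the curve starts at 0 OTUs for an empty subsample, gives exactly 1 OTU for a single sequence, and reaches all 3 OTUs when the full set of 4 sequences is drawn.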
Results from the dereplicate request were used to calculate the Shannon diversity index H' (Shannon & Weaver, 1963) (Equation 2) and the species evenness J' (Pielou, 1966) (Equations 3 and 4),

$$ H' = -\sum_{i=1}^{S} p_i \ln p_i \tag{2} $$

$$ H_{\max} = \ln S \tag{3} $$

$$ J' = H' / H_{\max} \tag{4} $$

with p_i being the proportion of sequences in the ith OTU and S the total number of OTUs. In addition, the dereplicate request selected one representative sequence for each OTU. Representative sequences of dominant OTUs were compared to public databases by BLAST search (blast.ncbi.nlm.nih.gov/Blast.cgi) to identify closely related sequences ("hooks").

Identifying shared OTUs that group sequences of more than 97 % sequence similarity across the different samples is regarded as crucial for determining whether each sample selected for different community members within a genus or whether a genus was always represented by the same highly similar sequences. However, due to the high number of sequences obtained in this study, it was not possible to group the sequences from all samples at once into OTUs using the tools provided by the RDP (Cole et al., 2009) or the commonly applied mothur (Schloss et al., 2009). To circumvent this limitation of the large dataset, all representative sequences and their close relatives were analyzed by a second cluster analysis to generate superior OTUs that comprise sequences from all samples. Superior OTUs containing the hook sequences were evaluated in detail for the origin of their members. This means that the representative sequences were assigned to their original samples and that the individual sequences covered by the representatives of each sample were summed up separately. For the purpose of comparing the presence of the dominant members in the different bacterial communities, the formation of OTUs provided a useful tool, which could further be exploited for their identification below the rank of genus.
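Equations 2 to 4 can be sketched in a few lines of Python, taking the number of sequences per OTU as input. The function names and the example OTU sizes are hypothetical; the sketch only shows how H', H_max and J' relate to each other.

```python
from math import log


def shannon_index(otu_sizes):
    """Shannon diversity H' = -sum(p_i * ln p_i) over all non-empty OTUs."""
    total = sum(otu_sizes)
    if total == 0:
        raise ValueError("no sequences")
    return -sum(
        (size / total) * log(size / total) for size in otu_sizes if size > 0
    )


def pielou_evenness(otu_sizes):
    """Evenness J' = H' / H_max with H_max = ln S (S = number of OTUs)."""
    number_of_otus = sum(1 for size in otu_sizes if size > 0)
    if number_of_otus < 2:
        raise ValueError("evenness is undefined for fewer than two OTUs")
    return shannon_index(otu_sizes) / log(number_of_otus)


# Hypothetical OTU sizes (illustration only).
even_community = [25, 25, 25, 25]   # all OTUs equally abundant
skewed_community = [97, 1, 1, 1]    # one OTU dominates

h_even = shannon_index(even_community)      # equals ln 4, the maximum for S = 4
j_even = pielou_evenness(even_community)    # equals 1 for a perfectly even sample
j_skewed = pielou_evenness(skewed_community)
```

A perfectly even community reaches H' = H_max and therefore J' = 1, whereas the skewed community yields a much lower evenness although both contain the same number of OTUs.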
However, the characterization of the rare representatives of a community is potentially more biased than that of the dominant ones, because their detection is more strongly affected by sequence errors, which unpredictably increase community richness (Reeder & Knight, 2009). Therefore, community comparisons based on the formation of OTUs were omitted. To account for this bias, the evaluation of the rare biosphere in this study was based on the assignment of sequences to the 1619 predetermined bacterial genera of the RDP, as this was less error-prone. All sequences that remained unassigned to such genera were ignored. This compromise reduced the bias, but was bound to miss some of the rare biosphere, especially novel genera. For this study it meant that, for the purpose of diversity estimates, a total of 72,000 family-assigned sequences were lost.

Detection of specific DNA-sequences

Search for Bacillus thuringiensis and close relatives: In order to identify DNA-sequences that shared high similarity to B. thuringiensis 16S rRNA gene sequences, a total of 105 delegate B. thuringiensis sequences were selected from the SILVA 104 database (Pruesse et al., 2007) and combined in a joint dataset with all representative sequences from the aforementioned dereplicate requests of the RDP. OTUs were defined by the RDP's complete linkage clustering tool on a minimum of 97 % sequence identity. The OTUs containing delegate B. thuringiensis sequences were examined with respect to the sample origin of the representative sequences to sum up the total number of B. thuringiensis-like sequences in each sample.

To identify DNA-sequences closely related to 16S rRNA genes of potential phytopathogenic bacteria, all sequences from pyrosequencing that were assigned by the RDP with a confidence threshold of 80 % (Claesson et al., 2009) to genera with phytopathogenic species, as listed by the American Biological Safety Association (www.absa.org), were selected.
More specifically, this list included the genera Erwinia, Pantoea, Pseudomonas, Ralstonia, Xanthomonas, Xylella and Xylophilus. Sequences of each genus were analyzed separately. The sequences were clustered into OTUs of more than 97 % sequence identity by the RDP's complete linkage clustering, and representative sequences were selected by the RDP's dereplicate function. The representative sequences were further analyzed within the ARB software environment (Ludwig et al., 2004), where sequences were aligned and added to the main tree of the SSURef_102_SILVA database. A similarity matrix was calculated by ARB to show the similarities of the representative sequences to sequences of potential phytopathogenic species already in the tree. A similarity of ≥ 97 % was taken as the threshold indicating sequences of potential phytopathogenic species.

Cited references

Claesson MJ, O'Sullivan O, Wang Q, Nikkilae J, Marchesi JR, Smidt H et al. (2009). Comparative analysis of pyrosequencing and a phylogenetic microarray for exploring microbial community structures in the human distal intestine. PLoS ONE 4: e6669.

Cole JR, Wang Q, Cardenas E, Fish J, Chai B, Farris RJ et al. (2009). The Ribosomal Database Project: improved alignments and new tools for rRNA analysis. Nucleic Acids Res 37: D141-D145.

Fox GE, Wisotzkey JD, Jurtshuk P. (1992). How close is close: 16S rRNA sequence identity may not be sufficient to guarantee species identity. Int J Syst Bacteriol 42: 166-170.

Good IJ. (1953). The population frequencies of species and the estimation of population parameters. Biometrika 40: 237-264.

Ludwig W, Strunk O, Westram R, Richter L, Meier H, Yadhukumar et al. (2004). ARB: A software environment for sequence data. Nucleic Acids Res 32: 1363-1371.

Pielou EC. (1966). Measurement of diversity in different types of biological collections. J Theor Biol 13: 131-144.

Pruesse E, Quast C, Knittel K, Fuchs BM, Ludwig WG, Peplies J et al. (2007). SILVA: a comprehensive online resource for quality checked and aligned ribosomal RNA sequence data compatible with ARB. Nucleic Acids Res 35: 7188-7196.

Reeder J, Knight R. (2009). The 'rare biosphere': a reality check. Nature Meth 6: 636-637.

Schloss PD, Westcott SL, Ryabin T, Hall JR, Hartmann M, Hollister EB et al. (2009). Introducing mothur: Open-source, platform-independent, community-supported software for describing and comparing microbial communities. Appl Environ Microbiol 75: 7537-7541.

Shannon CE, Weaver W. (1963). The mathematical theory of communication. University of Illinois Press, Urbana, Illinois.

Stackebrandt E, Goebel BM. (1994). A place for DNA-DNA reassociation and 16S ribosomal-RNA sequence-analysis in the present species definition in bacteriology. Int J Syst Bacteriol 44: 846-849.