Supplemental Material
Materials and Methods
Quantification of Cry-proteins from root tissue
A total of 0.5 mL sterile PBST buffer (137 mM NaCl, 27 mM KCl, 100 mM Na2HPO4, 10 mM KH2PO4,
0.5 % Tween-20, pH 7.4), 0.7 g of 0.1 mm zirconia-silica beads (Carl Roth, Karlsruhe, Germany), six
2.8 mm ceramic beads (Peqlab, Erlangen, Germany) and one 6.3 mm ceramic sphere (MP
Biomedicals, Illkirch, France) were added to each frozen 0.2 g root sample in a 2.0 mL
polypropylene screw-cap tube. The roots were macerated by four bead-beating steps of 45 s at
6.5 m s⁻¹ on a FastPrep-24 system (MP Biomedicals), and the homogenate was centrifuged at 16,000 × g and 4°C for 10 min.
The Cry-protein contents in the supernatants were analyzed separately using specific ELISA systems.
The Cry1A.105 and Cry2Ab2 proteins were quantified using ELISA systems kindly provided by
Monsanto (St. Louis, MO) and Cry3Bb1 was quantified with the Cry3Bb1 ELISA system from Agdia
(Elkhart, IN), all following the protocols of the suppliers. The amount of total protein was determined
with a ready-to-use Bradford solution (AppliChem, Darmstadt, Germany).
Processing of pyrosequencing data
Sequence data from pyrosequencing were supplied as a FASTA file and a corresponding sequence
quality file. The RDP’s pyrosequencing pipeline (Ribosomal Database Project, pyro.cme.msu.edu
(Cole et al., 2009)) was used to process the sequence data. In the initial process of the pipeline, the
FASTA file was uploaded while the quality file was not used. Initial process parameters were set to a
maximum forward and reverse primer distance of 2, a maximum of 2 undetermined nucleotides (Ns)
and a minimum sequence length of 150 nucleotides. In addition, a tag file was uploaded to assign all
sequences to their respective MIDs. The sequences remaining after this trimming procedure were
considered to be of high quality and the sequences from each sample (MID) were analyzed
separately. For taxonomic sequence assignment at all taxonomic ranks from phylum to genus, the
RDP’s classifier was used with a confidence threshold of 80 % (Claesson et al., 2009). The resulting
taxonomic hierarchy file was the basis for all further taxonomic assessments, which were done in
MS Excel 2002 (Microsoft, Unterschleißheim, Germany).
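The length and ambiguity criteria of the trimming step, and the effect of the 80 % confidence threshold on a classified lineage, can be illustrated with the following minimal sketch. It is not the RDP pipeline's own code: the function names and the lineage format (a list of rank, taxon and confidence triples) are assumptions made for illustration, primer matching is not modeled, and the cut-off at the first rank below the threshold assumes that confidence does not increase from phylum towards genus.

```python
def passes_trimming(seq, min_length=150, max_n=2):
    """Return True if a read is at least min_length long and has at most max_n Ns."""
    seq = seq.upper()
    return len(seq) >= min_length and seq.count("N") <= max_n


def assigned_ranks(lineage, threshold=0.80):
    """Keep (rank, taxon) pairs from the highest rank down to the last rank
    whose confidence still reaches the threshold.

    lineage: list of (rank, taxon, confidence) ordered from phylum to genus
    (hypothetical format, chosen for this example).
    """
    kept = []
    for rank, taxon, confidence in lineage:
        if confidence < threshold:
            break  # this rank and all lower ranks are treated as unassigned
        kept.append((rank, taxon))
    return kept


# Illustrative (made-up) classification result for a single read.
example_lineage = [
    ("phylum", "Proteobacteria", 1.00),
    ("class", "Gammaproteobacteria", 0.98),
    ("genus", "Pseudomonas", 0.62),
]
print(assigned_ranks(example_lineage))
```

In this example the read would count towards the phylum and class totals but would remain unassigned at the genus rank.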
Bioinformatic analyses for higher taxonomical ranks (phylum to genus)
To compare the bacterial communities, relative abundances were calculated for all taxonomic units
of each hierarchical taxonomic rank. For each sample the number of sequences that were assigned to
the domain Bacteria was defined as 100 %. To identify variations greater than the average
abundances, each variety (one group) was compared to a second group combining the other three
varieties. Differences (P ≤ 0.05) in relative abundances of taxonomic units on all taxonomic ranks
were tested by a two-tailed, two-sample Student’s t-test. Library coverage C (Equation 1) was
calculated for all taxonomic ranks separately, i.e., starting from phylum down to genus, in
accordance with Good (Good, 1953).
$$C = \left(1 - \frac{n}{N}\right) \times 100 \qquad (1)$$
where n is the number of taxonomic units consisting of only one sequence and N is the number of all sequences
that were assigned to the respective taxonomic rank. Patterns of relative abundance of each
taxonomic rank were compared between samples by BioNumerics 5.10 (Applied Maths, Sint-Martens-Latem, Belgium). Unassigned sequences were omitted in these analyses because they failed
to be differentiated at the respective taxonomic ranks. Thus, in the BioNumerics analyses 100 %
referred to all sequences assigned to distinct taxonomic units of each particular taxonomic rank. The
Pearson correlation coefficient was used to generate matrices of similarities and the corresponding
dendrograms were based on UPGMA (Unweighted Pair Group Method with Arithmetic Mean).
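As an illustration of the similarity measure, the sketch below computes pairwise Pearson correlation coefficients between relative-abundance profiles. It is not the BioNumerics implementation; the function names, the sample labels and the abundance values are made up for this example, and the UPGMA dendrogram step is not included.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equally long profiles."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return cov / math.sqrt(var_x * var_y)


def similarity_matrix(profiles):
    """Return {(sample_a, sample_b): r} for all ordered pairs of samples."""
    names = sorted(profiles)
    return {(a, b): pearson(profiles[a], profiles[b]) for a in names for b in names}


# Made-up relative abundances (%) of three taxonomic units in three samples.
profiles = {
    "sample_1": [50.0, 30.0, 20.0],
    "sample_2": [48.0, 33.0, 19.0],
    "sample_3": [20.0, 30.0, 50.0],
}
matrix = similarity_matrix(profiles)
```

A clustering method such as UPGMA would then operate on this matrix (or on a distance derived from it) to build the dendrogram.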
Average similarities between treatments were calculated from the respective similarity matrices and
a one-way analysis of variance (ANOVA) was applied to identify significant differences (P ≤ 0.05,
SigmaPlot for Windows Version 11.0, Systat Software Inc., San José, CA). To identify the dominant
genera, the dataset was reduced stepwise by removing the least represented genera, then
normalized, and a new cluster analysis was performed. The reduction was continued until the new
clustering output differed from the original one. The last reduced dataset that left the clustering
unaffected was taken as the threshold: the genera remaining in it were considered to be mainly
responsible for the structure of the cluster and were defined as the “dominant” genera. The less
represented genera, which were excluded from this dataset, were designated the “rare” genera.
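The relative abundances and the library coverage of Equation 1 can be computed from per-taxon sequence counts as in the following minimal sketch. The function names and the counts are hypothetical; the calculations in the study itself were done in a spreadsheet.

```python
def relative_abundances(counts, total_bacterial):
    """Relative abundance (%) of each taxonomic unit, with the number of
    sequences assigned to the domain Bacteria defined as 100 %."""
    if total_bacterial <= 0:
        raise ValueError("total_bacterial must be positive")
    return {taxon: 100.0 * c / total_bacterial for taxon, c in counts.items()}


def library_coverage(counts):
    """Coverage C = (1 - n / N) * 100 according to Good (1953).

    n: number of taxonomic units represented by exactly one sequence
    N: number of sequences assigned at the respective taxonomic rank
    """
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no assigned sequences")
    singletons = sum(1 for c in counts.values() if c == 1)
    return (1 - singletons / total) * 100


# Made-up genus counts for one sample; 20 reads were assigned to Bacteria,
# of which 10 could be assigned to a genus.
genus_counts = {"Pseudomonas": 6, "Bacillus": 3, "Massilia": 1}
abundances = relative_abundances(genus_counts, total_bacterial=20)
coverage = library_coverage(genus_counts)
```

Here one of the three genera is a singleton, so n = 1 and N = 10, which gives a coverage of 90 %.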
Bioinformatic analyses of operational taxonomic units (OTUs)
For taxonomic analyses on a level similar to the species rank, the sequences that shared more than
97 % identity were considered to belong to the same rRNA species complex and therefore grouped
into the same OTU (Fox et al., 1992; Stackebrandt & Goebel, 1994). For all OTU-based analyses,
the sequence data of each sample were aligned separately by the RDP’s pyrosequencing pipeline
aligner (pyro.cme.msu.edu) applying the Bacteria alignment model. The aligned sequences were
clustered by the RDP’s complete linkage clustering tool with a maximum distance of 10 and a step
size of 1. The result was used as an input to calculate the rarefaction curve by the RDP’s
pyrosequencing pipeline. The dereplicate request of the RDP was used to group sequences by a
maximum distance of 3, which corresponds to the formation of OTUs based on a minimum of 97 %
sequence similarity. Results from the dereplicate request were used to calculate the Shannon
diversity index H’ (Shannon & Weaver, 1963) (Equation 2) and the species evenness J’ (Pielou, 1966)
(Equations 3 and 4),
$$H' = -\sum_{i=1}^{S} p_i \ln p_i \qquad (2)$$
$$H_{\max} = \ln S \qquad (3)$$
$$J' = H' / H_{\max} \qquad (4)$$
with $p_i$ being the proportion of sequences in the $i$th OTU and $S$ the total number of OTUs. In addition,
the dereplicate request selected one representative sequence for each OTU. Representative
sequences of dominant OTUs were compared to public databases by BLAST search
(blast.ncbi.nlm.nih.gov/Blast.cgi) to identify closely related sequences (“hooks”). The identification of
shared OTUs that group sequences of more than 97 % sequence similarity in the different samples is
regarded as crucial to determine whether each sample selected for different community members
within a genus or whether a genus was always represented by the same highly similar sequences.
However, due to the high number of sequences obtained in this study, it was not possible to group
the sequences from all samples at once into OTUs using the tools provided by the RDP (Cole et al.,
2009) or the commonly applied mothur (Schloss et al., 2009). To circumvent this drawback of the large
dataset, all representative sequences and their close relatives were analyzed by a second cluster
analysis to generate superior OTUs that comprise sequences from all samples. Superior OTUs
containing the hook sequences were evaluated in detail for the origin of their members. This means
that each representative sequence was assigned to its original sample and that the individual
sequences covered by the representatives of each sample were summed up separately.
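The bookkeeping described above can be sketched as follows. The data layout is an assumption made for illustration: each superior OTU is a list of member identifiers, and each representative sequence is mapped to its sample of origin and to the number of individual sequences it covers. All identifiers and numbers are made up.

```python
from collections import defaultdict


def tally_by_sample(superior_otus, representatives):
    """Sum, per superior OTU and per sample, the individual sequences
    covered by the representative sequences of that sample.

    superior_otus:   {otu_name: [member_id, ...]}
    representatives: {member_id: (sample, n_sequences_covered)}
    Members without an entry in `representatives` (for example database
    sequences such as hooks) have no sample of origin and are skipped.
    """
    totals = {}
    for otu, members in superior_otus.items():
        per_sample = defaultdict(int)
        for member_id in members:
            if member_id not in representatives:
                continue
            sample, n_covered = representatives[member_id]
            per_sample[sample] += n_covered
        totals[otu] = dict(per_sample)
    return totals


# Hypothetical input: two superior OTUs built from representatives of two samples.
representatives = {
    "rep_a1": ("sample_A", 120),
    "rep_a2": ("sample_A", 15),
    "rep_b1": ("sample_B", 80),
}
superior_otus = {
    "superior_1": ["hook_1", "rep_a1", "rep_b1"],
    "superior_2": ["rep_a2"],
}
totals = tally_by_sample(superior_otus, representatives)
```

A superior OTU whose total comes from a single sample would indicate a sequence type specific to that sample, whereas contributions from all samples would indicate a shared community member.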
For the purpose of comparing the presence of the dominant members in the different bacterial
communities, the formation of OTUs provided a useful tool, which could further be exploited for
their identification below the genus rank. However, the characterization of the rare representatives
of a community is potentially more biased than that of the dominant ones, because their detection is
more strongly affected by sequence errors which unpredictably increase community richness (Reeder
& Knight, 2009). Therefore, community comparisons based on the formation of OTUs were omitted.
To account for this bias, the evaluation of the biosphere in this study relied on the assignment of sequences to
1619 predetermined bacterial genera of the RDP, as this was less error-prone. All sequences that
remained unassigned to such genera were ignored. This compromise reduced the bias, but was
bound to miss some of the rare biosphere, especially novel genera. In this study, it meant that a
total of 72,000 family-assigned sequences were lost for the purpose of diversity estimates.
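For reference, the diversity measures of Equations 2 to 4 can be computed from the number of sequences per OTU as in this minimal sketch. The function names and the example counts are illustrative only and are not data from the study.

```python
import math


def shannon_index(counts):
    """Shannon diversity index H' = -sum(p_i * ln p_i) over all non-empty OTUs."""
    total = sum(counts)
    if total == 0:
        raise ValueError("no sequences")
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def evenness(counts):
    """Pielou's evenness J' = H' / H_max, with H_max = ln S."""
    n_otus = sum(1 for c in counts if c > 0)
    if n_otus < 2:
        raise ValueError("evenness is undefined for fewer than two OTUs")
    return shannon_index(counts) / math.log(n_otus)


# Made-up OTU sizes: an even community and one dominated by a single OTU.
even_community = [10, 10, 10, 10]
skewed_community = [97, 1, 1, 1]
print(shannon_index(even_community), evenness(even_community))
print(shannon_index(skewed_community), evenness(skewed_community))
```

With identical OTU sizes H' equals ln S and J' equals 1, whereas a community dominated by one OTU yields a much lower evenness.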
Detection of specific DNA-sequences
Search for Bacillus thuringiensis and close relatives: In order to identify DNA-sequences which shared
high similarity to B. thuringiensis 16S rRNA gene sequences, a total of 105 delegate B. thuringiensis
sequences were selected from the SILVA 104 database (Pruesse et al., 2007) and combined in a
joint dataset with all representative sequences from the aforementioned dereplicate requests of
the RDP. OTUs were defined by the RDP’s complete linkage clustering tool on a minimum of 97 %
sequence identity. The OTUs containing delegate B. thuringiensis sequences were examined with
respect to the sample origin of the representative sequences to sum up the total number of
B. thuringiensis-like sequences in each sample.
To identify DNA-sequences closely related to 16S rRNA genes of potential phytopathogenic bacteria,
all sequences from pyrosequencing which were assigned by the RDP with a confidence threshold of
80 % (Claesson et al., 2009) to genera with phytopathogenic species, as listed by the American
Biological Safety Association (www.absa.org), were selected. More specifically, this list included the
genera Erwinia, Pantoea, Pseudomonas, Ralstonia, Xanthomonas, Xylella and Xylophilus. Sequences
of each genus were analyzed separately. The sequences were clustered into OTUs of more than 97 %
sequence identity by the RDP’s complete linkage clustering, and representative sequences were
selected by the RDP’s dereplicate function. The representative sequences were further analyzed
within the ARB software environment (Ludwig et al., 2004) where sequences were aligned and added
to the main tree of the SSURef_102_SILVA database. A similarity matrix was calculated by ARB to
show the similarities of the representative sequences to sequences of potential phytopathogenic
species already in the tree. A similarity of ≥ 97 % was used as the threshold to indicate sequences
of potential phytopathogenic species.
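The 97 % threshold can be illustrated with a simple percent-identity calculation on aligned sequences. This is only a sketch under stated assumptions: ARB computes its similarity matrix with its own settings, which may treat gaps and ambiguous positions differently, and the function names and sequences below are made up.

```python
def percent_identity(seq_a, seq_b, gap="-"):
    """Percent identity of two aligned sequences of equal length.

    Columns in which both sequences have a gap are ignored; a gap opposite
    a base counts as a mismatch (an assumption of this sketch).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap and b == gap:
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable alignment columns")
    return 100.0 * matches / compared


def flag_candidates(query, references, threshold=97.0):
    """Names of reference sequences that are at least `threshold` % identical to the query."""
    return [name for name, ref in references.items()
            if percent_identity(query, ref) >= threshold]


# Made-up aligned sequences of 100 positions each.
query = "ACGT" * 25
references = {
    "reference_3_mismatches": "TTT" + query[3:],
    "reference_4_mismatches": "TTTA" + query[4:],
}
candidates = flag_candidates(query, references)
```

In this example only the reference with three mismatches in 100 positions (97 % identity) is flagged.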
Cited references
Claesson MJ, O'Sullivan O, Wang Q, Nikkilae J, Marchesi JR, Smidt H et al. (2009). Comparative
analysis of pyrosequencing and a phylogenetic microarray for exploring microbial community
structures in the human distal intestine. PLoS ONE 4: e6669.
Cole JR, Wang Q, Cardenas E, Fish J, Chai B, Farris RJ et al. (2009). The Ribosomal Database Project:
improved alignments and new tools for rRNA analysis. Nucleic Acids Res 37: D141-D145.
Fox GE, Wisotzkey JD, Jurtshuk P. (1992). How close is close: 16S rRNA sequence identity may not be
sufficient to guarantee species identity. Int J Syst Bacteriol 42: 166-170.
Good IJ. (1953). The population frequencies of species and the estimation of population parameters.
Biometrika 40: 237-264.
Ludwig W, Strunk O, Westram R, Richter L, Meier H, Yadhukumar et al. (2004). ARB: A software
environment for sequence data. Nucleic Acids Res 32: 1363-1371.
Pielou EC. (1966). Measurement of diversity in different types of biological collections. J Theor Biol
13: 131-144.
Pruesse E, Quast C, Knittel K, Fuchs BM, Ludwig WG, Peplies J et al. (2007). SILVA: a comprehensive
online resource for quality checked and aligned ribosomal RNA sequence data compatible
with ARB. Nucleic Acids Res 35: 7188-7196.
Reeder J, Knight R. (2009). The 'rare biosphere': a reality check. Nature Meth 6: 636-637.
Schloss PD, Westcott SL, Ryabin T, Hall JR, Hartmann M, Hollister EB et al. (2009). Introducing
mothur: Open-source, platform-independent, community-supported software for describing
and comparing microbial communities. Appl Environ Microbiol 75: 7537-7541.
Shannon CE, Weaver W. (1963). The mathematical theory of communication. University of Illinois
Press, Urbana, Illinois.
Stackebrandt E, Goebel BM. (1994). A place for DNA-DNA reassociation and 16S ribosomal-RNA
sequence-analysis in the present species definition in bacteriology. Int J Syst Bacteriol 44:
846-849.