emi12748-sup-0001-si

An improved method to set significance
thresholds for β diversity testing in microbial
community comparisons.
Arda Gülay¹ and Barth F. Smets¹*
¹Department of Environmental Engineering, Technical University of Denmark, Building
113, Miljøvej, 2800 Kgs Lyngby, Denmark. Phone: +45 45251600. Fax: +45 45932850. Email: bfsm@env.dtu.dk, argl@env.dtu.dk
Supplementary Information
Fig. S1 PCoA analysis of a sample (7900 individuals) and its subsamples (5000 individuals)
using the Bray-Curtis dissimilarity measure. The jackknife technique was also applied to compare
the interquartile range and the observed distance between subsamples.
Fig. S2 All β1AA' and β2AA' (intra) beta diversities between an in-silico OTU library and its copies, calculated using seven dissimilarity measures. OTU
libraries were randomly subsampled at equal sample sizes from 13000 to 3000 individuals. Ten subsets were created at each subsampling depth
from all samples. All ordination plots were created using PCoA (Principal Coordinates Analysis), and the percentage of total variance explained by
each principal coordinate is shown under the associated axis. Each data point in the ordination plots represents a subsample from the in-silico
samples. Blue, green and red colours distinguish the samples, together with their subsamples, that were compared.
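The subsampling step described for Fig. S2 can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names `subsample` and `bray_curtis`, the toy library, the seed and the depth of 3000 are all hypothetical, and only the Bray-Curtis measure (one of the seven measures) is shown. It does not reproduce the β1AA' and β2AA' definitions, which are given in the article.

```python
import random
from collections import Counter
from itertools import combinations

def subsample(counts, depth, rng):
    """Draw `depth` individuals without replacement from an OTU count vector."""
    individuals = [otu for otu, n in enumerate(counts) for _ in range(n)]
    picked = Counter(rng.sample(individuals, depth))
    return [picked.get(otu, 0) for otu in range(len(counts))]

def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two count vectors of equal length."""
    shared = sum(min(x, y) for x, y in zip(a, b))
    return 1.0 - 2.0 * shared / (sum(a) + sum(b))

rng = random.Random(0)  # fixed seed, chosen arbitrarily for reproducibility

# Hypothetical in-silico library: 1000 OTUs with log-normal abundances (at least 1 each).
library = [int(rng.lognormvariate(1.79, 1.0)) + 1 for _ in range(1000)]

# Ten subsets at one subsampling depth, as in the caption (depth is an assumption).
depth = 3000
subsets = [subsample(library, depth, rng) for _ in range(10)]

# Pairwise dissimilarities among subsamples of the same library.
intra = [bray_curtis(a, b) for a, b in combinations(subsets, 2)]
print(min(intra), max(intra))
```

Because every subset comes from the same library, any non-zero dissimilarity printed here is due to subsampling alone.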
Fig. S3 Rank-abundance curves of in-silico microbial communities that were created using
different distribution models (Table 1 in the article): Log-normal, Log-normal trimmed
(singletons and doubletons removed), Uniform and Chi-squared (degrees of freedom: 9). We
used the commands “rlnorm(1000, meanlog = 1.79, sdlog = 1)”, “runif(1000, min = 1, max = 19.1)” and
“rchisq(1000, 9.2, ncp = 1)” to create the Log-normal, Uniform and Chi-squared
distributed communities, respectively.
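For readers without R, the three draws quoted above can be approximated with Python's standard library. This is a sketch under assumptions, not the code used for the figure: the seed is arbitrary, the helper `noncentral_chisq` is hypothetical (it builds the non-central chi-squared draw as a central chi-squared draw with df - 1 degrees of freedom plus a squared shifted normal, which is valid for df > 1), and the trimming rule (round to integers, then drop counts of 1 and 2) is one guess at how singletons and doubletons were removed.

```python
import math
import random

rng = random.Random(42)  # arbitrary seed
N_OTUS = 1000

def noncentral_chisq(df, ncp, rng):
    """One non-central chi-squared draw (requires df > 1)."""
    central = rng.gammavariate((df - 1) / 2.0, 2.0)           # chi-squared with df - 1 dof
    shifted = (rng.gauss(0.0, 1.0) + math.sqrt(ncp)) ** 2     # carries the non-centrality
    return central + shifted

# Parameters copied from the R commands in the caption.
lognormal = [rng.lognormvariate(1.79, 1.0) for _ in range(N_OTUS)]
uniform = [rng.uniform(1.0, 19.1) for _ in range(N_OTUS)]
chisq = [noncentral_chisq(9.2, 1.0, rng) for _ in range(N_OTUS)]

# Assumed trimming rule: integer counts with singletons and doubletons dropped.
trimmed = [n for n in (round(x) for x in lognormal) if n > 2]

def rank_abundance(abundances):
    """Abundances sorted from the most to the least abundant OTU."""
    return sorted(abundances, reverse=True)

for name, community in [("log-normal", lognormal), ("trimmed", trimmed),
                        ("uniform", uniform), ("chi-squared", chisq)]:
    curve = rank_abundance(community)
    print(name, len(curve), round(curve[0], 2), round(curve[-1], 2))
```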
Fig. S4 Lorenz curves, used as a measure of evenness, of in-silico microbial communities that
were created using different distribution models: Log-normal, Uniform and Chi-squared
(degrees of freedom: 9).
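A Lorenz curve and the Gini coefficient derived from it can be computed from an abundance vector as sketched below. The function names `lorenz_curve` and `gini` are hypothetical, and the Gini formula is the standard one for sorted data; the source does not state which implementation was used.

```python
def lorenz_curve(abundances):
    """Points (cumulative share of OTUs, cumulative share of individuals)."""
    xs = sorted(abundances)
    total = sum(xs)
    points = [(0.0, 0.0)]
    running = 0.0
    for i, x in enumerate(xs, start=1):
        running += x
        points.append((i / len(xs), running / total))
    return points

def gini(abundances):
    """Gini coefficient: 0 for a perfectly even community, near 1 for a very uneven one."""
    xs = sorted(abundances)
    n = len(xs)
    weighted = sum(i * x for i, x in enumerate(xs, start=1))
    return 2.0 * weighted / (n * sum(xs)) - (n + 1) / n

even = [5, 5, 5, 5]
uneven = [0, 0, 0, 20]
print(gini(even), gini(uneven))
print(lorenz_curve(even))
```

A perfectly even community gives a Lorenz curve on the diagonal; the further the curve sags below the diagonal, the larger the Gini coefficient.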
Fig. S5 In-silico OTU libraries following a log-normal distribution with the same richness but
different evenness. Nine communities were chosen from many simulations in R. β diversity
values between these communities were used to assess the effect of evenness on corrected β
diversity values.
Fig. S6 β diversities between in silico OTU libraries (see Fig. S2) with the same richness but
different evenness, calculated with Bray-Curtis indices at different sampling scales. Delta
values represent the difference in evenness in terms of the Gini coefficient. Black points
indicate observed β diversities as a function of subsampling depth and green points indicate
corrected β diversities as a function of subsampling depth.
Fig. S7 β diversity analysis of eight in-silico OTU libraries following a log-normal distribution with the same richness but different evenness (0.36,
0.42, 0.44, 0.64, 0.73, 0.75, 0.80, and 0.90), calculated with Bray-Curtis indices. OTU0.03 libraries were chosen from a large number of
simulations such that their comparisons resulted in observed β diversities of 0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, and 0.7. Red dots represent the βMC of
the species pool obtained from the compared communities.
Fig. S8 Rank-abundance curves of triplicate 16S rRNA tag sequence libraries from three
different rapid sand filters. Separate panels are shown for the dominant (>1%) and rare (<1%)
OTU0.03s. Sequence abundance per OTU is shown in different colours for the replicates. Shared
and unique OTU0.03s are also shown in Venn diagrams.
Fig. S9 Schematic explanation of the β diversity significance assessment technique using the meta-community concept. Ten subsamples are
recommended for β1AA' and β2AA' assessment. In addition, a p-value can be calculated by comparing the observed βA'B' to the null distribution (i.e.
n = 999) of β1AA' values obtained from the meta-community.
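The p-value step mentioned in the caption can be illustrated with a short Python sketch. It assumes that the 999 null β1AA' values have already been obtained from the meta-community, and it uses the common one-sided empirical p-value with a +1 correction; that convention, the function name `empirical_p_value` and the toy numbers are assumptions, not details confirmed by the source.

```python
import random

def empirical_p_value(observed, null_values):
    """Share of null values at least as large as the observed one (with +1 correction)."""
    at_least = sum(1 for v in null_values if v >= observed)
    return (at_least + 1) / (len(null_values) + 1)

rng = random.Random(1)  # arbitrary seed

# Hypothetical null distribution: 999 intra-sample beta diversities.
null_beta = [rng.uniform(0.05, 0.15) for _ in range(999)]

# Hypothetical observed beta diversity between two subsampled communities.
observed_beta = 0.30

p = empirical_p_value(observed_beta, null_beta)
print(p)
```

With 999 null values the smallest attainable p-value under this convention is 1/1000, which is reached when the observed value exceeds every null value.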
Fig. S10 PCoA plots of the standard techniques (subsampling and the jackknife technique) that were
applied to replicates from three different rapid sand filters in order to measure the significance of β
diversity. In the PCoA plots for the subsampling technique, each of the three clusters represents a
replicate from the same filter, and the two colours in each cluster distinguish the
equalized original sample from its subsamples with fewer individuals. For the jackknife technique,
the ellipses represent the interquartile range (IQR; Lozupone et al., 2007) along each axis for
the 100 jackknife replicates.
Table S1 β diversity analysis of three log-normally distributed in silico OTU libraries with
different numbers of individuals (4500, 7500 and 13500) using the Raup-Crick method implemented in
Vegan after 999 simulations. Raup-Crick outputs show the number of simulated values
(shared species) that are smaller than or equal to the observed value.

Comparisons        Raup-Crick values
4700 vs. 7500      0.114
7500 vs. 13500     0.001
4700 vs. 13500     0.001

Calculations were implemented using model r1 as the method in the “oecosimu” function, which uses the column
marginal frequencies as probabilities.
Table S2 Mean weighted UniFrac distances from replicates that were randomly subsampled 10
times at the minimum sampling depth. The effect of rare OTUs was calculated by subtracting the β
diversity of dominant OTUs from the β diversity of total OTUs.

Filter      Comparison       Total distance (1)   Effect of dominant (2)   Effect of rare
Filter 2    Rep1 vs. Rep2    0.066                0.030                    0.036
Filter 2    Rep2 vs. Rep3    0.041                0.020                    0.021
Filter 2    Rep1 vs. Rep3    0.042                0.014                    0.028
Filter 7    Rep1 vs. Rep2    0.057                0.028                    0.028
Filter 7    Rep2 vs. Rep3    0.156                0.124                    0.032
Filter 7    Rep1 vs. Rep3    0.161                0.126                    0.035
Filter 12   Rep1 vs. Rep2    0.104                0.073                    0.031
Filter 12   Rep2 vs. Rep3    0.065                0.024                    0.040
Filter 12   Rep1 vs. Rep3    0.096                0.075                    0.021

(1) All dissimilarities were calculated using the weighted UniFrac algorithm.
(2) Dissimilarities were calculated from OTU tables without rare OTUs.
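The subtraction behind the "Effect of rare" row of Table S2 can be checked with a few lines of Python. The values are copied from the table, ordered as Filter 2, Filter 7 and Filter 12, each with Rep1 vs. Rep2, Rep2 vs. Rep3 and Rep1 vs. Rep3. Reading the small mismatches as rounding of the underlying means is an assumption.

```python
# Values copied from Table S2 (nine pairwise replicate comparisons).
total = [0.066, 0.041, 0.042, 0.057, 0.156, 0.161, 0.104, 0.065, 0.096]
dominant = [0.030, 0.020, 0.014, 0.028, 0.124, 0.126, 0.073, 0.024, 0.075]
reported_rare = [0.036, 0.021, 0.028, 0.028, 0.032, 0.035, 0.031, 0.040, 0.021]

# Effect of rare OTUs = total distance minus distance from dominant OTUs only.
computed_rare = [round(t - d, 3) for t, d in zip(total, dominant)]

for reported, computed in zip(reported_rare, computed_rare):
    # Flag entries that differ from the tabulated value in the third decimal place.
    flag = "" if abs(reported - computed) < 1e-9 else "  (differs in the third decimal)"
    print(reported, computed, flag)
```

Seven of the nine recomputed values equal the tabulated ones, and the remaining two differ by 0.001, which would be consistent with the tabulated entries having been rounded from unrounded means.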