Divergence across diet, time, and populations rules out parallel evolution in the gut
microbiomes of Trinidadian guppies

Karen E. Sullam, Benjamin E. R. Rubin, Christopher M. Dalton, Susan S. Kilham,
Alexander S. Flecker, Jacob A. Russell

Supplementary Text and Figures

Supplemental Methods:

Additional information on sample processing and sequencing
Whole guppies were preserved in 95% ethanol for the bacterial analysis and 70% ethanol
for the gut length analysis and were transported to Drexel University, where samples
were frozen at -20 °C prior to dissections 4-14 months later. After dissection, whole
guts were immediately added to Mo Bio PowerBead Tubes, and a digital picture was taken
of each gut for length measurement using ImageJ. The filters and sediment samples were
collected in sterile Whirl-Pak® Sampling Bags. Following collection, these samples were
frozen at -20 °C for 3 months, followed by storage at -80 °C for 12 months prior to
extraction.

Eighty samples were included in three different sequencing runs at Research and Testing
Labs. A subset of four samples from the Guanapo 2010 survey (two HP samples and two LP
samples) was run independently and multiplexed with samples not included in the present
study. Thirty-six of the remaining 2010 survey samples were run in the second run, while
the final run included the 2011 survey samples, the samples from the dietary experiment,
and one sample from the 2010 survey that had minimal coverage from the first sequencing
run.

Additional information on dietary manipulation study
Males were included in the tanks because females decrease energy assimilation in the
absence of male guppies (Reznick 1983), but only females were used for subsequent
measurements. All tanks had the same male-to-female ratio (1:3) and all female guppies
were reproductive over the course of the entire experiment.

Additional information on classification
The BLAST algorithm, using the Greengenes database as a reference, was first used for
taxonomic classification of representative sequences. Any reads that were either
classified as chloroplasts or failed to classify to bacteria were excluded from further
analysis. It became apparent that a number of these BLAST classifications at the phylum
level deviated greatly from results involving BLAST against the NCBI database and from
RDP classification. For this reason we used UCLUST (in QIIME v. 1.8) to classify our
OTUs; all but 82 OTUs were assigned using this method. BLAST classification against the
Greengenes database appeared to adequately classify the 82 OTUs that UCLUST left
unassigned, and thus our overall classification consisted of a hybrid approach involving
both methods.
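
The hybrid approach can be illustrated with a minimal Python sketch. This is not the authors' code: the function names, the "Unassigned" label, and the Greengenes-style taxonomy strings are assumptions made for illustration, and the order of operations is simplified (the merge is done first, then non-bacterial and chloroplast assignments are dropped).

```python
def hybrid_classify(primary, fallback, unassigned_label="Unassigned"):
    """Merge two OTU -> taxonomy maps (hypothetical helper).

    `primary` stands in for the UCLUST assignments and `fallback` for the
    BLAST-vs-Greengenes assignments. The fallback is used only where the
    primary method left an OTU unassigned.
    """
    merged = {}
    for otu, taxonomy in primary.items():
        if taxonomy == unassigned_label and otu in fallback:
            merged[otu] = (fallback[otu], "fallback")
        else:
            merged[otu] = (taxonomy, "primary")
    return merged


def keep_bacterial(merged):
    """Drop OTUs that are not bacterial or that are classified as chloroplasts."""
    kept = {}
    for otu, (taxonomy, source) in merged.items():
        lowered = taxonomy.lower()
        top_rank = lowered.split(";")[0]
        if "bacteria" not in top_rank:
            continue  # unassigned or non-bacterial
        if "chloroplast" in lowered:
            continue  # chloroplast-derived sequence
        kept[otu] = (taxonomy, source)
    return kept


# Toy data, invented for illustration only.
primary = {
    "otu1": "k__Bacteria; p__Firmicutes",
    "otu2": "Unassigned",
    "otu3": "k__Bacteria; p__Cyanobacteria; c__Chloroplast",
    "otu4": "Unassigned",
}
fallback = {
    "otu2": "k__Bacteria; p__Spirochaetes",
    "otu4": "k__Archaea; p__Euryarchaeota",
}
merged = hybrid_classify(primary, fallback)
kept = keep_bacterial(merged)
```

Recording which method produced each assignment (the "primary"/"fallback" tag) keeps the hybrid nature of the final classification traceable.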
Additional information on OTU genotyping
We built a pipeline to find genotype variation within the 6 most dominant OTUs of the
dataset, from which we focused on 2 OTUs that were composed of samples run in the same
batch to eliminate possibly confounding batch effects (OTUs 4447 and 5760). First, to
reduce the computational load of performing alignments on all sequences in each OTU, we
removed non-unique sequences using the dereplicate function of USEARCH v. 6.0.307 (Edgar
2010). De-replicated sequences were aligned with MUSCLE v. 3.8.31 (Edgar 2004). We used
the "-maxiters 2" option on those OTUs (2023 and 4447) with very large numbers of unique
sequences (>17,000), as attempting full alignments caused MUSCLE to crash. Default
parameters were used for all other OTUs. Alignments were then repopulated with the
non-unique sequences removed by de-replication. To reduce problems introduced by
sequencing errors, homopolymers and surrounding gaps were masked from alignments before
further analysis. A commonly encountered suspicious alignment pattern with the following
three characteristics was also masked: (1) adjacent sites had one identical non-gap
allele, (2) one allele at one site was a gap, (3) the frequency of the gap was within
10% of the frequency of the identical allele in the site that did not include a gap.
This pattern suggested that the two apparently variable adjacent sites were actually
made up of two misaligned sites (e.g. site one: -/G, site two: G/A produced by MUSCLE is
likely a misalignment of a correct alignment of site one: G/G, site two: -/A). Although
there is potentially useful information in these sites, they were excluded to maintain
genotype quality. We extended the filter to include similarly suspicious situations
following the same pattern except that both bases were shared in the two adjacent sites
(e.g. site one: A/G, site two: G/A). This latter pattern less clearly represents
misalignment, but when these sites were examined more closely, they were invariably
surrounded by combinations of gaps and nucleotides that suggested alignment error. In
addition, all alignments dropped precipitously in quality after several hundred bases,
so each OTU alignment was examined and trimmed at the length where clear misalignments
became common. While these exclusions potentially reduced the amount of true variation
that we could identify, the indel errors inherent in 454 sequencing and the difficulty
of accurately aligning such large numbers of sequences made this procedure necessary. We
ran our pipeline separately on each OTU to identify the appropriate quality control
parameters and poorly-aligned sites for exclusion. All parameters for each OTU are given
in Supplementary Table 4.
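
The three-part masking rule for suspicious adjacent sites can be sketched in Python as follows. This is a minimal illustration, not the pipeline used in the study. It assumes that each alignment column is supplied as a string with one character per sequence, and that "within 10%" means an absolute difference of at most 0.10 in allele frequency. All function names are hypothetical. Homopolymer masking and the extended pattern in which both bases are shared are not covered.

```python
from collections import Counter

GAP = "-"


def allele_freqs(column):
    """Return the frequency of each character in one alignment column."""
    counts = Counter(column)
    total = len(column)
    return {allele: n / total for allele, n in counts.items()}


def is_suspicious_pair(col_a, col_b, tol=0.10):
    """Check two adjacent columns against the three masking criteria."""
    freqs_a, freqs_b = allele_freqs(col_a), allele_freqs(col_b)
    for gapped, other in ((freqs_a, freqs_b), (freqs_b, freqs_a)):
        # (2) exactly one of the two sites carries a gap allele
        if GAP not in gapped or GAP in other:
            continue
        # (1) the sites share exactly one identical non-gap allele
        shared = (set(gapped) & set(other)) - {GAP}
        if len(shared) != 1:
            continue
        allele = next(iter(shared))
        # (3) gap frequency is close to the shared allele's frequency
        #     in the site without a gap (assumed absolute tolerance)
        if abs(gapped[GAP] - other[allele]) <= tol:
            return True
    return False


def suspicious_sites(columns, tol=0.10):
    """Return sorted indices of all columns that belong to a flagged pair."""
    flagged = set()
    for i in range(len(columns) - 1):
        if is_suspicious_pair(columns[i], columns[i + 1], tol):
            flagged.update((i, i + 1))
    return sorted(flagged)


# Toy alignment columns mirroring the "-/G next to G/A" example in the text.
columns = ["AAAAAAAAAA", "----GGGGGG", "GGGGAAAAAA", "CCCCCCCCCC"]
masked = suspicious_sites(columns)
```

In this toy alignment only the second and third columns are flagged, since the gap frequency at one site matches the frequency of the shared G allele at its neighbor.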

Many sequences could not be assigned an allele at every variable site, leading to
incomplete genotypes. These arose due to the presence of low frequency bases that likely
represented sequencing errors, alignment errors, or masked sequence data. Therefore,
when these incomplete genotypes were missing data at just a single site but were
otherwise identical to one and only one complete genotype, they were assigned the
corresponding complete genotype. Although we could potentially be collapsing unique
genotypes in this way, most of these incomplete genotypes likely represented the
genotype to which they were assigned. At the very least, they were more closely related
to the assigned genotype than to any other. Additionally, sequences with genotypes
including missing data that did not meet the quality requirements were discarded.
Finally, genotypes present at a frequency of less than 0.1% across the entire dataset
were also excluded to further minimize the inclusion of sequence and alignment
artifacts.
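
The genotype-resolution rules above can be sketched in Python. This is a hedged illustration rather than the authors' implementation. It assumes that each read's genotype is a string over the variable sites with "N" marking missing data, and that the 0.1% threshold is computed relative to the reads that remain after resolution. The function name and the "N" symbol are hypothetical.

```python
from collections import Counter

MISSING = "N"  # assumed symbol for a site with no allele call


def resolve_genotypes(genotypes, min_freq=0.001):
    """Resolve incomplete genotypes and drop rare ones.

    Returns a dict mapping each retained genotype to its read count.
    """
    complete = {g for g in genotypes if MISSING not in g}
    resolved = []
    for g in genotypes:
        if MISSING not in g:
            resolved.append(g)
            continue
        if g.count(MISSING) != 1:
            continue  # missing data at more than one site: discard
        pos = g.index(MISSING)
        matches = [
            c for c in complete
            if len(c) == len(g) and c[:pos] == g[:pos] and c[pos + 1:] == g[pos + 1:]
        ]
        if len(matches) == 1:
            resolved.append(matches[0])  # unambiguous: adopt the complete genotype
        # zero or several matches: discard the read

    counts = Counter(resolved)
    total = len(resolved)
    if total == 0:
        return {}
    return {g: n for g, n in counts.items() if n / total >= min_freq}


# Toy reads, invented for illustration only.
reads = ["ACG"] * 6 + ["ATG"] * 3 + ["ANG", "ACN", "NNG"]
result = resolve_genotypes(reads)
```

Here "ACN" is assigned to "ACG" because it matches only that complete genotype, while "ANG" is discarded because it is compatible with both "ACG" and "ATG".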

For each OTU analyzed, samples with more than 25 reads per OTU were included in the
genotyping analysis. For OTU 4447, 2 samples from 2011 were excluded from the analysis
to focus on site variation within the 2010 samples. For OTU 5760, 1 HP Aripo and 1 LP
Marianne sample were excluded from the analysis to focus on populations with n > 1.

Results:

Enterotyping Analysis
All fish gut samples, including those from the wild and the dietary study, were
optimally partitioned into six enterotypes (Supplementary Table 5). These groupings
appeared to correlate with the dominance of certain OTUs: the wild fish either had one
of three OTUs as a dominant bacterium (OTU 2023, 4447, or 5998) or were in the sixth
partition, in which no bacterium was strongly dominant. The two partitions that
encompassed lab-reared fish were dominated by two OTUs (OTU 1229 and 1106).

The two OTUs that dominated the lab fish, and seemingly shaped their enterotype
grouping, were not abundant in wild fish but were still found in them. For example, 13
out of 55 wild fish (8 LP fish and 5 HP fish) harbored the Spirochaeta-derived OTU 1106
that dominated guts of lab-reared LP fish. The Entomoplasmatales-derived OTU 1229, which
was common in lab-reared HP fish, was found in 6 out of 55 wild fish (5 HP fish and 1 LP
fish).

OTU member genotyping
The genotypic, or strain, composition of two dominant OTUs showed variation across one
or more scales. For OTU 4447, the Marianne LP and Quare LP samples differed from the
other sampling locations that had sufficient representation of this OTU, which included
the Aripo HP, Aripo LP, and Guanapo LP (Supplementary Figure 3A). OTU 5760 showed
variation in rare strains between two ecotypes from separate streams (Supplementary
Figure 3B).

Reznick DN (1983). The structure of guppy life histories: the tradeoff between growth
and reproduction. Ecology 64: 862-873.

Supplementary Figures:

Supplementary Figure 1: Rarefaction curves of observed species number. Analyses
were performed using QIIME to characterize species richness of bacterial
communities from A) guppy guts colored by stream and separated by ecotype
background from the 2010 survey, and B) environmental samples from the Guanapo
River. The difference in y-axis of the two graphs shows that environmental samples,
particularly those from sediment, tend to harbor greater bacterial diversity than guppy
guts.
Supplementary Figure 2: Principal Coordinates Analysis of guppy gut bacterial
communities from the 2010 field survey based on A) unweighted UniFrac distances
and B) Hellinger-transformed Bray-Curtis distances.
Supplementary Figure 3: Strain analysis of dominant OTUs with their
corresponding phyla in parentheses. Analyses include A) OTU 4447, which shows
differences between certain sampling locations and ecotypes, and B) OTU 5760,
which differs among rare strains between the two streams and ecotypes. HP and LP
distinguish ecotypes collected from high predation (HP) vs. low predation (LP)
habitats. The number of reads from each sequence library is listed on top of the bars
for each OTU. Each vetted genotype is assigned a different color and is listed in each
panel by a different letter (see Supplementary Table 4 for information on which
genotype corresponds to which letter). Stacked bar graphs show the proportion of all
reads from the given OTU made up by each genotype.
Supplementary Figure 4: Size-standardized gut length comparison of guppies across
four streams. To visualize the results and account for allometry (Torres & Vanni 2007),
each gut characteristic (i.e. length or weight) was size-corrected as:
size-corrected gut characteristic = gut characteristic / (standard length of fish)^b,
where b is the slope of log gut characteristic against log standard length across all
individuals. Color is associated with stream of origin and asterisks indicate
significant differences between HP and LP ecotypes.
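
The size correction in the Supplementary Figure 4 caption can be illustrated with a short Python sketch. It is not the authors' analysis code. It assumes that the exponent is the ordinary least-squares slope of log10 gut characteristic on log10 standard length pooled across all individuals. The function names and the toy values are hypothetical.

```python
import math


def loglog_slope(standard_lengths, gut_values):
    """Ordinary least-squares slope of log10(gut) against log10(standard length)."""
    xs = [math.log10(v) for v in standard_lengths]
    ys = [math.log10(v) for v in gut_values]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


def size_corrected(standard_lengths, gut_values):
    """Divide each gut value by standard length raised to the pooled slope."""
    b = loglog_slope(standard_lengths, gut_values)
    return [g / sl ** b for sl, g in zip(standard_lengths, gut_values)]


# Toy data following an exact power law, gut = 2 * SL^1.5 (invented values).
lengths = [10.0, 20.0, 40.0]
guts = [2.0 * sl ** 1.5 for sl in lengths]
slope = loglog_slope(lengths, guts)
corrected = size_corrected(lengths, guts)
```

For data that follow a perfect power law, the corrected values collapse to a single constant, which is the sense in which the correction removes the effect of body size.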
Supplementary Figure 5: Network analysis of samples collected in the 2011
Guanapo field survey, during which gut bacteria were compared to environmental
samples. N = 3 for gut samples from the three different environments and N = 2 for all
environmental samples, except for the HP water sample, for which N = 1. The gut
samples clearly separate from the environmental samples.