Complete Material and Methods

Detailed Material and Methods
Data retrieval.
We analyzed 16S rDNA sequences from the 0.1-0.8 µm size fraction of each of the forty-five seawater samples collected aboard the Sorcerer II as part of the Global Ocean Sampling expedition[1]. As described in that study, the samples were collected between May 2003 and March 2004. Detailed procedures for DNA sequencing are available elsewhere[1, 2]. We retrieved a total of 4,125 16S rRNA gene sequences with corresponding environmental data (25 samples) from the CAMERA website[3]. A subset of 3,228 sequences remained after discarding those from poorly sampled localities or from a different study[2], e.g., "GS00a" (Table S1). We acknowledge that environmental shotgun sequencing only discloses the most abundant phylotypes of local surface communities, which are most likely involved in the main biogeochemical processes at sampling time[4]. Hence, unless they belong to known taxonomic lineages, the rare members of the marine microbial biosphere may not follow a similar pattern of phylogenetic turnover (PT).
DNA alignment and phylogenetic assessment.
To reduce alignment errors due to implausible insertion-deletion event histories, all 16S rDNA sequences were aligned using the PRANK software[5]. A maximum likelihood (ML) tree was then inferred from 1,285 nucleotide sites using RAxML[6] under a GTR + Gamma + Invariable model of sequence evolution. A three-step quality control was then conducted: (i) sequences generating excessively long branches (>0.5 substitutions per site) were removed from subsequent analyses; (ii) sites exhibiting more than 75% gaps or missing data were discarded; and (iii) sequences that were too fragmentary (covering less than 10% of the final alignment length) were also eliminated. The final alignment (3,228 sequences and 1,285 sites) was then subjected to a new RAxML analysis followed by a PAUP[7] refinement based on SPR branch swapping. The resulting ML phylogram was rendered ultrametric by the non-parametric rate smoothing procedure of R8S[8]. Patristic distances were then computed and trees drawn using the APE package for R[9].
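The three quality-control filters above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, the dictionary-based alignment format, and the choice of "-", "?" and "N" as gap or missing-data symbols are assumptions made for illustration, and the branch lengths for step (i) are assumed to have been read from the ML tree beforehand.

```python
# Assumption: these symbols count as gaps or missing data.
GAP_CHARS = set("-?N")

def drop_long_branches(alignment, branch_lengths, max_length=0.5):
    """Step (i): drop sequences whose branch exceeds max_length substitutions per site.

    branch_lengths is a hypothetical {sequence name: branch length} mapping.
    """
    return {name: seq for name, seq in alignment.items()
            if branch_lengths.get(name, 0.0) <= max_length}

def drop_gappy_sites(alignment, max_gap_fraction=0.75):
    """Step (ii): drop columns with more than max_gap_fraction gaps or missing data."""
    names = list(alignment)
    n_sites = len(alignment[names[0]])
    keep = [
        i for i in range(n_sites)
        if sum(alignment[n][i].upper() in GAP_CHARS for n in names) / len(names)
        <= max_gap_fraction
    ]
    return {n: "".join(alignment[n][i] for i in keep) for n in names}

def drop_fragmentary(alignment, min_coverage=0.10):
    """Step (iii): drop sequences covering less than min_coverage of the alignment."""
    def coverage(seq):
        return sum(c.upper() not in GAP_CHARS for c in seq) / len(seq)
    return {n: s for n, s in alignment.items() if coverage(s) >= min_coverage}
```

Applying the site filter before the fragment filter means that coverage is measured against the final alignment length, as in step (iii).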
Taxonomic assignment.
The taxonomy of each 16S rDNA sequence was inferred using a local BLAST[10] search against the SILVA database version 100 from August 2009[11], which contained nearly 1,200,000 SSU/LSU sequences. The 100 best BLAST hits were then processed using a local Perl script to parse out relevant taxonomic information, and a 2/3 majority consensus was used to infer taxonomy. Relevant subgroups (i.e., Alphaproteobacteria and Gammaproteobacteria) were then selected from the overall dataset. Because OTUs belonging to other taxonomic groups were often scarce, we could not assess their patterns of PT.
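A 2/3 majority consensus over the best hits might look like the following Python sketch. The original Perl script is not available here, so the rank-by-rank rule, the function name consensus_taxonomy, and the representation of each hit as a list of taxon names are assumptions made for illustration.

```python
from collections import Counter
from fractions import Fraction

def consensus_taxonomy(hit_lineages, threshold=Fraction(2, 3)):
    """Return the deepest lineage supported by at least `threshold` of the hits.

    hit_lineages is a list of lineages, one per BLAST hit, each given as a list
    of taxon names ordered from the broadest to the narrowest rank (assumed format).
    """
    consensus = []
    n_hits = len(hit_lineages)
    max_depth = max((len(lineage) for lineage in hit_lineages), default=0)
    for depth in range(max_depth):
        # Count whole lineage prefixes, so that identical names under
        # different parent taxa are never pooled together.
        prefixes = Counter(
            tuple(lineage[:depth + 1])
            for lineage in hit_lineages
            if len(lineage) > depth
        )
        prefix, count = prefixes.most_common(1)[0]
        if Fraction(count, n_hits) < threshold:
            break  # no taxon reaches the majority at this rank
        consensus = list(prefix)
    return consensus
```

With this rule, a query whose hits disagree at the class level is still assigned to its phylum as long as two thirds of the hits agree on it.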
Distance matrices.
Geographic distances between pairs of samples were computed in R[12] from their latitudinal and longitudinal coordinates. Phylogenetic distances were obtained with the picante package for R from the tree rendered ultrametric by the non-parametric rate smoothing procedure of R8S[8]. Then, the amount of phylogenetic turnover between communities was calculated from the PhyloSor index[13] as the fraction of branch length that was unique to (not shared by) the two microbial communities. The environmental distance matrix was computed using the Gower distance implemented in the cluster package for R[12].
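The three kinds of distance described above can be sketched in Python as follows. This is an illustration only, not the R code used in the study. The text does not state how geographic distance was derived, so the great-circle (haversine) formula and the Earth radius are assumptions. The tree is represented in a simplified, hypothetical way as a list of (branch length, set of descendant tips) pairs. PhyloSor is computed as the shared fraction of branch length, so turnover in the sense used above would be one minus this value. The Gower distance covers numeric variables only.

```python
import math
import re

EARTH_RADIUS_KM = 6371.0  # assumption: mean Earth radius

def dms_to_decimal(text):
    """Convert a coordinate such as 42°30'11"N to signed decimal degrees."""
    match = re.fullmatch(r"""(\d+)°(\d+)'(\d+(?:\.\d+)?)"([NSEW])""", text.strip())
    if match is None:
        raise ValueError(f"unrecognized coordinate: {text!r}")
    degrees, minutes, seconds, hemisphere = match.groups()
    value = int(degrees) + int(minutes) / 60 + float(seconds) / 3600
    return -value if hemisphere in "SW" else value

def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance in kilometres between two points in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def phylosor(branches, community_a, community_b):
    """Shared fraction of branch length between two communities (sets of tips)."""
    def total(community):
        return sum(length for length, tips in branches if tips & community)
    shared = sum(length for length, tips in branches
                 if tips & community_a and tips & community_b)
    return 2 * shared / (total(community_a) + total(community_b))

def gower_distance(rows):
    """Gower distance matrix for samples described by numeric variables."""
    n_vars = len(rows[0])
    ranges = [max(r[k] for r in rows) - min(r[k] for r in rows) for k in range(n_vars)]
    def dist(x, y):
        # Each variable contributes its range-scaled absolute difference.
        parts = [abs(x[k] - y[k]) / ranges[k] for k in range(n_vars) if ranges[k] > 0]
        return sum(parts) / len(parts) if parts else 0.0
    return [[dist(x, y) for y in rows] for x in rows]
```

The coordinate parser accepts the degree-minute-second notation used in Table S1, so each sample's latitude and longitude can be passed through dms_to_decimal before calling great_circle_km.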
Disentangling geographic versus environmental influences on phylogenetic turnover.
We analyzed the respective effects of geography and environment on phylogenetic turnover between all pairs of microbial communities using multiple regressions on distance matrices (MRM; see[14, 15] for details). In brief, MRM is an extension of partial Mantel analysis that is used to investigate relationships between a multivariate response distance matrix and any number of explanatory distance matrices[15]. We implemented additional partial multiple regressions on distance matrices to estimate the “pure” effect of each explanatory matrix[16]. The significance of regression coefficients was tested using 9,999 permutations. Analyses were performed using the “ecodist” library[17] for R[12].
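The logic of MRM with a permutation test can be sketched in plain Python as follows. This is a simplified illustration under stated assumptions, not the ecodist implementation. All function names are hypothetical. Significance is judged here on the absolute value of each raw coefficient, whereas other implementations may use a different test statistic. The partial regressions used to isolate "pure" effects are not covered.

```python
import random

def lower_triangle(matrix):
    """Unfold the strict lower triangle of a square distance matrix into a vector."""
    return [matrix[i][j] for i in range(len(matrix)) for j in range(i)]

def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]

def ols(y, predictors):
    """Least-squares coefficients (intercept first) of y on the predictor vectors."""
    columns = [[1.0] * len(y)] + predictors
    xtx = [[sum(p * q for p, q in zip(c1, c2)) for c2 in columns] for c1 in columns]
    xty = [sum(p * q for p, q in zip(c, y)) for c in columns]
    return solve(xtx, xty)

def permute(matrix, order):
    """Reorder rows and columns together, keeping the matrix a valid distance matrix."""
    return [[matrix[i][j] for j in order] for i in order]

def mrm(response, explanatory, n_perm=9999, seed=0):
    """Return (coefficients, permutation p-values) for a regression on distance matrices."""
    rng = random.Random(seed)
    xs = [lower_triangle(m) for m in explanatory]
    observed = ols(lower_triangle(response), xs)
    exceed = [1] * len(observed)  # the observed value counts as one permutation
    order = list(range(len(response)))
    for _ in range(n_perm):
        rng.shuffle(order)
        permuted = ols(lower_triangle(permute(response, order)), xs)
        for k, (obs, perm) in enumerate(zip(observed, permuted)):
            if abs(perm) >= abs(obs):
                exceed[k] += 1
    return observed, [e / (n_perm + 1) for e in exceed]
```

Rows and columns of the response matrix are permuted together because the pairwise distances are not independent observations: shuffling sample labels preserves the matrix structure, which shuffling individual distances would not.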
References
1 Rusch, D. B., Halpern, A. L., Sutton, G., Heidelberg, K. B., Williamson, S., et al.
2007 The Sorcerer II Global Ocean Sampling Expedition: Northwest Atlantic through
Eastern Tropical Pacific. PLOS Biol. 5, 0398.
2 Venter, J. C., Remington, K., Heidelberg, J. F., Halpern, A. L., Rusch, D., et al. 2004
Environmental genome shotgun sequencing of the Sargasso Sea. Science. 304, 66-74.
3 Seshadri, R., Kravitz, S., Smarr, L., Gilna, P., Frazier, M. 2007 CAMERA: A
Community Resource for Metagenomics. PLOS Biol. 5, e75
doi:10.1371/journal.pbio.0050075.
4 Pedrós-Alió, C. 2006 Marine microbial diversity: can it be determined? Trends
Microbiol. 14, 257-263.
5 Löytynoja, A., Goldman, N. 2008 Phylogeny-Aware Gap Placement Prevents Errors
in Sequence Alignment and Evolutionary Analysis. Science. 320, 1632-1635.
6 Stamatakis, A. 2006 RAxML-VI-HPC: Maximum likelihood-based phylogenetic
analyses with thousands of taxa and mixed models. Bioinformatics. 22, 2688-2690.
7 Swofford, D. L. PAUP*: Phylogenetic Analysis Using Parsimony (*And Other
Methods). 4 ed. Sunderland, Massachusetts: Sinauer Associates 2002.
8 Sanderson, M. J. 2003 r8s: inferring absolute rates of molecular evolution and
divergence times in the absence of a molecular clock. Bioinformatics. 19, 301-302.
9 Paradis, E., Claude, J., Strimmer, K. 2004 APE: Analyses of Phylogenetics and
Evolution in R language. Bioinformatics. 20, 289-290.
10 Altschul, S. F., Madden, T. L., Schaffer, A. A., Zhang, J., Zhang, Z., et al. 1997
Gapped BLAST and PSI-BLAST: a new generation of protein database search
programs. Nucleic Acids Res. 25, 3389-3402.
11 Pruesse, E., Quast, C., Knittel, K., Fuchs, B., Ludwig, W., et al. 2007 SILVA: a
comprehensive online resource for quality checked and aligned ribosomal RNA
sequence data compatible with ARB. Nucleic Acids Res. 35, 7188-7196.
12 R Development Core Team. 2008 A language and environment for statistical
computing.
13 Bryant, J. A., Lamanna, C., Morlon, H., Kerkhoff, A. J., Enquist, B. J., et al. 2008
Microbes on mountainsides: contrasting elevational patterns of bacterial and plant
diversity. Proc. Natl. Acad. Sci. U. S. A. 105 Suppl 1, 11505-11511.
14 Manly, B. F. J. 1986 Randomization and regression methods for testing for
associations with geographical, environmental and biological distances between
populations. Researches on population ecology. 28, 201-218.
15 Lichstein, J. W. 2007 Multiple regression on distance matrices: a multivariate spatial
analysis tool. Plant ecology. 188, 117-131.
16 Legendre, P., Legendre, L. 1998 Numerical Ecology. Elsevier Science Publ. Co.
17 Goslee, S. C., Urban, D. L. 2007 The ecodist package for dissimilarity-based
analysis of ecological data. Journal of statistical software. 22, 1-19.
Table S1. Samples considered for the analyses

Sample  Latitude     Longitude     Sample Date  Sample Depth (m)  Chlorophyll Density  Salinity  Temperature (°C)
GS01a   32°10'00"N   64°30'00"W    15 May 03    5                 0.1                  36.7      22.9
GS01c   32°10'00"N   64°30'00"W    15 May 03    5                 0.1                  36.7      22.9
GS02    42°30'11"N   67°14'24"W    21 Aug. 03   1                 1.4                  29.2      18.2
GS03    42°51'10"N   66°13'2"W     21 Aug. 03   1                 1.4                  29.9      11.7
GS04    44°8'14"N    63°38'40"W    22 Aug. 03   2                 0.4                  28.3      17.3
GS05    44°41'25"N   63°38'14"W    23 Aug. 03   1                 6                    30.2      15
GS07    43°37'56"N   66°50'50"W    25 Aug. 03   1                 1.4                  31.7      17.9
GS08    41°29'9"N    71°21'4"W     17 Nov. 03   1                 2.2                  26.5      9.4
GS09    41°5'28"N    71°36'8"W     17 Nov. 03   1                 4                    31        11
GS10    38°56'24"N   74°41'6"W     18 Nov. 03   1                 2                    31        12
GS12    38°56'49"N   76°25'2"W     18 Dec. 03   13.2              21                   3.5       1
GS15    24°29'18"N   83°4'12"W     8 Jan. 04    1.7               0.2                  36        25
GS16    24°10'29"N   84°20'40"W    8 Jan. 04    2                 0.16                 35.8      26.4
GS17    20°31'21"N   85°24'49"W    9 Jan. 04    2                 0.13                 35.8      27
GS18    18°2'12"N    83°47'5"W     10 Jan. 04   1.7               0.14                 35.4      27.4
GS19    10°42'59"N   80°15'16"W    12 Jan. 04   1.7               0.23                 35.4      27.7
GS21    8°7'45"N     79°41'28"W    20 Jan. 04   1.6               0.5                  30.7      27.6
GS22    6°29'34"N    82°54'14"W    21 Jan. 04   2                 0.33                 32.3      29.3
GS23    5°38'24"N    86°33'55"W    22 Jan. 04   2                 0.07                 32.6      28.7
GS26    1°15'51"N    90°17'42"W    2 Feb. 04    2                 0.22                 32.6      27.8
GS27    1°12'58"S    90°25'22"W    4 Feb. 04    2.2               0.4                  34.9      25.5
GS29    0°12'0"S     90°50'7"W     9 Feb. 04    2.1               0.4                  34.5      26.2
GS35    1°23'21"N    91°49'1"W     3 Feb. 04    1.7               0.28                 34.5      21.8
GS36    0°1'15"S     91°11'52"W    2 Mar. 04    2.1               0.65                 34.6      25.8
GS47    10°7'53"S    135°26'58"W   29 Mar. 04   30                0.12                 37.3      28.6