Supplementary material
Supplementary table 1: List of metagenomes and MG-RAST (Meyer et al. 2008) accession numbers
# | MG-RAST metagenome name | MG-RAST accession | Size (bp) | Category in figure 1
1 | 174-1 | 4443725.3 | 5.90E+07 | algal bloom
2 | 174-2 | 4443729.3 | 6.74E+07 | algal bloom
3 | 179-1 | 4443731.3 | 4.39E+07 | algal bloom
4 | 179-2 | 4443732.3 | 5.28E+07 | algal bloom
5 | Bag1_13thMay_DNA | 4440212.3 | 4.73E+07 | algal bloom
6 | Bag6_May13th_DNA | 4440213.3 | 3.10E+07 | algal bloom
7 | SRS000298 | 4443707.3 | 3.14E+07 | algal bloom
8 | GS006 Shotgun - Estuary - North American East Coast - Bay of Fundy, Nova Scotia - Canada | 4441582.3 | 6.46E+07 | Estuary
9 | GS011 Shotgun - Estuary - North American East Coast - Delaware Bay, NJ USA | 4441658.3 | 1.33E+08 | Estuary
10 | GS012 Shotgun - Estuary - North American East Coast - Chesapeake Bay, MD - USA | 4441584.3 | 1.36E+08 | Estuary
11 | mb2000jd298_1 | 4443713.3 | 5.30E+07 | Bay
12 | mb2000jd298_2 | 4443712.3 | 4.79E+07 | Bay
13 | mb2001jd115_1 | 4443714.3 | 4.50E+07 | Bay
14 | mb2001jd115_2 | 4443715.3 | 4.14E+07 | Bay
15 | mb2001jd135_1 | 4443716.3 | 5.30E+07 | Bay
16 | mb2001jd135_2 | 4443717.3 | 4.47E+07 | Bay
17 | BBAY01 | 4443688.3 | 9.72E+07 | Bay
18 | BBAY15 | 4443693.3 | 1.77E+08 | Bay
19 | GS005 Shotgun - Embayment - North American East Coast - Bedford Basin, Nova Scotia - Canada | 4441581.3 | 6.60E+07 | Bay
20 | GS002 Shotgun - Coastal - North American East Coast - Gulf of Maine Canada | 4441579.3 | 1.29E+08 | Coastal
21 | GS003 Shotgun - Coastal - North American East Coast - Browns Bank, Gulf of Maine - Canada | 4441580.3 | 6.69E+07 | Coastal
22 | GS004 Shotgun - Coastal - North American East Coast - Outside Halifax, Nova Scotia - Canada | 4441152.3 | 5.69E+07 | Coastal
23 | GS007 Shotgun - Coastal - North American East Coast - Northern Gulf of Maine - Canada | 4441153.3 | 5.54E+07 | Coastal
24 | GS008 Shotgun - Coastal - North American East Coast - Newport Harbor, RI - USA | 4441583.3 | 1.38E+08 | Coastal
25 | GS009 Shotgun - Coastal - North American East Coast - Block Island, NY USA | 4441143.3 | 8.43E+07 | Coastal
26 | GS010 Shotgun - Coastal - North American East Coast - Cape May, NJ USA | 4441144.3 | 8.24E+07 | Coastal
27 | GS013 Shotgun - Coastal - North American East Coast - Off Nags Head, NC - USA | 4441585.3 | 1.49E+08 | Coastal
28 | GS014 Shotgun - Coastal - North American East Coast - South of Charleston, SC - USA | 4441659.3 | 1.40E+08 | Coastal
29 | GS015 Shotgun - Coastal - Caribbean Sea - Off Key West, FL - USA | 4441586.3 | 1.38E+08 | Coastal
30 | GS016 Shotgun - Coastal Sea - Caribbean Sea - Gulf of Mexico - USA | 4441660.3 | 1.37E+08 | Coastal
31 | GS019 Shotgun - Coastal - Caribbean Sea - Northeast of Colon - Panama | 4441589.3 | 1.46E+08 | Coastal
32 | GS021 Shotgun - Coastal - Eastern Tropical Pacific - Gulf of Panama Panama | 4441591.3 | 1.43E+08 | Coastal
33 | GS027 Shotgun - Coastal - Galapagos Islands - Devil's Crown, Floreana Island Ecuador | 4441595.3 | 2.37E+08 | Coastal
34 | GS028 Shotgun - Coastal - Galapagos Islands - Coastal Floreana - Ecuador | 4441596.4 | 2.05E+08 | Coastal
35 | GS029 Shotgun - Coastal - Galapagos Islands - North James Bay, Santigo Island - Ecuador | 4441596.3 | 1.44E+08 | Coastal
36 | GS034 Shotgun - Coastal - Galapagos Islands - North Seamore Island - Ecuador | 4441600.3 | 1.42E+08 | Coastal
37 | GS035 Shotgun - Coastal - Galapagos Islands - Wolf Island - Ecuador | 4441601.3 | 1.52E+08 | Coastal
38 | GS036 Shotgun - Coastal - Galapagos Islands - Cabo Marshall, Isabella Island Ecuador | 4441602.3 | 8.58E+07 | Coastal
39 | GS049 Shotgun - Coastal - Polynesia Archipelagos - Moorea, Outside Cooks Bay - Fr. Polynesia | 4441605.3 | 9.44E+07 | Coastal
40 | GS117a Shotgun - Coastal sample Indian Ocean - St. Anne Island, Seychelles - Seychelles | 4441613.3 | 3.40E+08 | Coastal
41 | GS117b Shotgun - Coastal sample Indian Ocean - St. Anne Island, Seychelles - Seychelles | 4441148.3 | 5.48E+07 | Coastal
42 | GS108a Shotgun - Lagoon Reef - Indian Ocean - Coccos Keeling, Inside Lagoon Australia | 4441139.3 | 5.09E+07 | Reef
43 | GS108b Shotgun - Lagoon Reef - Indian Ocean - Coccos Keeling, Inside Lagoon Australia | 4441133.3 | 5.35E+07 | Reef
44 | GS025 Shotgun - Fringing Reef - Eastern Tropical Pacific - Dirty Rock, Cocos Island - Costa Rica | 4441593.3 | 1.30E+08 | Reef
45 | GS048a Shotgun - Coral Reef - Polynesia Archipelagos - Moorea, Cooks Bay - Fr. Polynesia | 4441603.3 | 9.28E+07 | Reef
46 | GS048b Shotgun - Coral Reef - Polynesia Archipelagos - Moorea, Cooks Bay - Fr. Polynesia | 4441167.3 | 5.10E+07 | Reef
47 | GS051 Shotgun - Coral Reef Atoll Polynesia Archipelagos - Rangirora Atoll Fr. Polynesia | 4441604.3 | 1.40E+08 | Reef
48 | GS148 Shotgun - Fringing Reef - Indian Ocean - East coast Zanzibar (Tanzania), offshore Paje lagoon - Tanzania | 4441617.3 | 1.08E+08 | Reef
49 | GS032 Shotgun - Mangrove - Galapagos Islands - Mangrove on Isabella Island Ecuador | 4441598.3 | 1.53E+08 | Reef
50 | GS031 Upwelling, Fernandina Island Ecuador | 4441597.3 | 4.62E+08 | Upwelling
51 | GS149 Shotgun - Harbor - Indian Ocean West coast Zanzibar (Tanzania), harbour region - Tanzania | 4441618.3 | 1.11E+08 | Harbor
52 | GS000a Shotgun - Open Ocean Sargasso Sea - Sargasso Station 13 Bermuda | 4441571.3 | 6.59E+08 | Open Ocean
53 | GS000b Shotgun - Open Ocean Sargasso Sea - Sargasso Station 13 Bermuda | 4441573.3 | 3.21E+08 | Open Ocean
54 | GS000c Shotgun - Open Ocean Sargasso Sea - Sargasso Stations 3 Bermuda | 4441574.3 | 3.72E+08 | Open Ocean
55 | GS000d Shotgun - Open Ocean Sargasso Sea - Sargasso Station 13 Bermuda | 4441575.3 | 3.36E+08 | Open Ocean
56 | GS001a Shotgun - Open Ocean Sargasso Sea - Hydrostation S - Bermuda | 4441576.3 | 1.43E+08 | Open Ocean
57 | GS001b Shotgun - Open Ocean Sargasso Sea - Hydrostation S - Bermuda | 4441577.3 | 9.10E+07 | Open Ocean
58 | GS001c Shotgun - Open Ocean Sargasso Sea - Hydrostation S - Bermuda | 4441578.3 | 9.27E+07 | Open Ocean
59 | GS017 Shotgun - Open Ocean Caribbean Sea - Yucatan Channel Mexico | 4441587.3 | 2.81E+08 | Open Ocean
60 | GS018 Shotgun - Open Ocean Caribbean Sea - Rosario Bank - Honduras | 4441588.3 | 1.56E+08 | Open Ocean
61 | GS022 Shotgun - Open Ocean - Eastern Tropical Pacific - 250 miles from Panama City - Panama | 4441592.3 | 1.31E+08 | Open Ocean
62 | GS023 Shotgun - Open Ocean - Eastern Tropical Pacific - 30 miles from Cocos Island - Costa Rica | 4441661.3 | 1.44E+08 | Open Ocean
63 | GS026 Shotgun - Open Ocean Galapagos Islands - 134 miles NE of Galapagos - Ecuador | 4441594.3 | 1.09E+08 | Open Ocean
64 | GS037 Shotgun - Open Ocean - Eastern Tropical Pacific - Equatorial Pacific TAO Buoy - International | 4441145.3 | 6.87E+07 | Open Ocean
65 | GS047 Shotgun - Open Ocean - Tropical South Pacific - 201 miles from F. Polynesia - French Polynesia | 4441146.3 | 6.83E+07 | Open Ocean
66 | GS109 Shotgun - Open Ocean - Indian Ocean - Indian Ocean - International | 4441155.3 | 6.28E+07 | Open Ocean
67 | GS110a Shotgun - Open Ocean - Indian Ocean - Indian Ocean - International | 4441607.3 | 1.00E+08 | Open Ocean
68 | GS110b Shotgun - Open Ocean - Indian Ocean - Indian Ocean - International | 4441134.3 | 5.36E+07 | Open Ocean
69 | GS111 Shotgun - Open Ocean - Indian Ocean - Indian Ocean - International | 4441156.3 | 6.21E+07 | Open Ocean
70 | GS112a Shotgun - Open Ocean - Indian Ocean - Indian Ocean - International | 4441609.3 | 1.02E+08 | Open Ocean
71 | GS112b Shotgun - Open Ocean - Indian Ocean - Indian Ocean - International | 4441147.3 | 5.56E+07 | Open Ocean
72 | GS113 Shotgun - Open Ocean - Indian Ocean - Indian Ocean - International | 4441610.3 | 1.18E+08 | Open Ocean
73 | GS114 Shotgun - Open Ocean - Indian Ocean - 500 Miles west of the Seychelles in the Indian Ocean - International | 4441611.3 | 3.45E+08 | Open Ocean
74 | GS115 Shotgun - Open Ocean - Indian Ocean - Indian Ocean - International | 4441150.3 | 6.42E+07 | Open Ocean
75 | GS116 Shotgun - Open Ocean - Indian Ocean - Outside Seychelles, Indian Ocean - Seychelles | 4441149.3 | 6.42E+07 | Open Ocean
76 | GS119 Shotgun - Open Ocean - Indian Ocean - International Water Outside of Reunion Island - International | 4441151.3 | 6.51E+07 | Open Ocean
77 | GS120 Shotgun - Open Ocean - Indian Ocean - Madagascar Waters Madagascar | 4441135.3 | 4.57E+07 | Open Ocean
78 | GS121 Shotgun - Open Ocean - Indian Ocean - International water between Madagascar and South Africa International | 4441614.3 | 1.19E+08 | Open Ocean
79 | GS122a Shotgun - Open Ocean - Indian Ocean - International waters between Madagascar and South Africa International | 4441615.3 | 1.05E+08 | Open Ocean
80 | GS122b Shotgun - Open Ocean - Indian Ocean - International waters between Madagascar and South Africa International | 4441139.4 | 5.27E+07 | Open Ocean
81 | GS123 Shotgun - Open Ocean - Indian Ocean - International water between Madagascar and South Africa International | 4441616.3 | 1.16E+08 | Open Ocean
82 | S_35131 | 4443766.3 | 4.36E+07 | Open Ocean
83 | S_35155 | 4443697.3 | 6.81E+07 | Open Ocean
84 | S_35163 | 4443699.3 | 6.48E+07 | Open Ocean
85 | S_35179 | 4443701.3 | 5.19E+07 | Open Ocean
86 | SRS000297 | 4443705.3 | 6.82E+07 | Open Ocean
87 | 1-19-DNA-flx | 4440275.3 | 5.93E+07 | Open Ocean
88 | 6-19-DNA-flx | 4440276.3 | 6.82E+07 | Open Ocean
89 | Arctic Canada | 4441621.3 | 6.83E+07 | Arctic/Antarctic
90 | AntarcticaAquatic_5 - MARINE DERIVED LAKE | 4443682.3 | 2.84E+08 | Arctic/Antarctic
91 | AntarcticaAquatic_6 - ACE LAKE, ANTARCTICA | 4443684.3 | 2.81E+08 | Arctic/Antarctic
92 | AntarcticaAquatic_8 Newcomb Bay | 4443686.3 | 1.02E+08 | Arctic/Antarctic
93 | AntarcticaAquatic_9 | 4443687.3 | 9.57E+07 | Arctic/Antarctic
94 | GS020 Shotgun - Fresh Water - Panama Canal - Lake Gatun - Panama | 4441590.3 | 3.15E+08 | Fresh
95 | Tilapia pond microbes Duplicate | 4440423.3 | 3.88E+07 | Fresh
96 | GS030 Shotgun - Warm Seep - Galapagos Islands - Warm seep, Roca Redonda Ecuador | 4441662.3 | 3.92E+08 | Warm
97 | OctopusSpringsMatCoreF | 4443749.3 | 1.86E+07 | Hot
98 | GS033 Shotgun - Hypersaline - Galapagos Islands - Punta Cormorant, Hypersaline Lagoon, Floreana Island - Ecuador | 4441599.3 | 7.30E+08 | Saline
99 | HighSalternSDbayMicD200407 | 4440438.3 | 3.48E+07 | Saline
100 | LowSalternSDbayMic200407 | 4440437.3 | 2.53E+07 | Saline
101 | SaltonSeaMic20060823 | 4440329.3 | 1.89E+07 | Saline
102 | Tamarix | 4448834.3 | 3.98E+08 | Tamarix
103 | Arabidopsis | 4447810.3 | 2.56E+08 | Phyllosphere
104 | Clover | 4447811.3 | 2.41E+08 | Phyllosphere
105 | Soybean | 4447793.3 | 1.30E+08 | Phyllosphere
106 | Rice phyllosphere | 4450328.3 | 8.32E+08 | Phyllosphere
107 | Rice Rhizosphere | 4449956.3 | 3.96E+08 | Rhizosphere
108 | NTS_creosote.amb.1 | 4445996.3 | 1.17E+08 | Creosote
109 | NTS_crust.amb.1 | 4445993.3 | 1.34E+08 | Crust
110 | Luquillo Experimental Forest Soil, Puerto Rico | 4446153.3 | 3.22E+08 | Soil
111 | Waseca Farm Soil | 4441091.3 | 1.54E+08 | Soil
112 | Whale Fall Bone | 4441619.3 | 4.13E+07 | Dark
113 | Whale Fall Mat | 4441656.4 | 3.79E+07 | Dark
114 | Whale Fall Rib | 4441620.3 | 3.91E+07 | Dark
115 | Gut Microflora | 4440461.5 | 9.15E+07 | Dark

Relative abundance calculation
Relative abundance calculation
Various methods have been proposed for normalization of metagenomic data for comparative
purposes, attempting to control for variance caused by differences in average read length and average
genome size among other factors (Raes et al., 2007; Beszteri et al., 2010). Here, we use the simple
approach of calculating the relative abundance of a gene by dividing the number of BLAST hits to this
gene within a metagenome by the number of BLAST hits to a universal single copy marker gene. This
approach is based on the assumption that all noise generating factors act in a similar manner on the
abundance of the gene in question and on the abundance of the marker gene, and are thus eliminated
by division. The factor that remains to be controlled for is gene length, which linearly increases the
number of BLAST hits a gene gets in a dataset. Genes used in this calculation were selected according to
their occurrence profile in sequenced genomes (using Gene Search and Function Search across all
sequenced bacterial and archaeal genomes on the JGI IMG server: http://img.jgi.doe.gov/cgi-bin/w/main.cgi), eliminating genes that had multiple copies within genomes or genes that only
appeared in some genomes within their category (sup. Table 2).
To further control for stochastic variability in marker gene abundances among metagenomes, we
normalized each gene to the average length-corrected hit number of the 35 universal single-copy COGs
used by Beszteri et al. (2010; sup. Table 3):
1. The hit number (Hm) of each universal marker gene (i) is divided by the gene length (lm):
   $M(i) = H_m(i) / l_m(i)$
2. The normalization denominator is calculated as the average of M(i) over the 35 marker genes:
   $\bar{M} = \frac{1}{35} \sum_{i=1}^{35} M(i)$
3. The hit number (Hp) of each photic gene is divided by the gene length (lp):
   $P = H_p / l_p$
4. The normalized abundance (A) is calculated by the division:
   $A = P / \bar{M}$
5. For each functional category (PS I, PS II type 1 RC, type 2 RC), the average abundance was
calculated according to genes within it. Genes were selected according to their occurrence
profile in sequenced genomes (sup. Table 2).
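The five steps above can be sketched in Python as follows. This is only an illustrative sketch, not the authors' code: the function names, the dictionary layout (gene name mapped to hit count or gene length) and all numbers in the example are assumptions made for illustration.

```python
from statistics import mean


def per_length(hits, gene_length):
    """Steps 1 and 3: BLAST hit count divided by gene length."""
    return hits / gene_length


def marker_denominator(marker_hits, marker_lengths):
    """Step 2: average length-corrected hit number over the marker genes."""
    return mean(per_length(marker_hits[g], marker_lengths[g]) for g in marker_hits)


def normalized_abundance(hits, gene_length, marker_hits, marker_lengths):
    """Step 4: length-corrected hits of one gene over the marker denominator."""
    return per_length(hits, gene_length) / marker_denominator(marker_hits, marker_lengths)


def category_abundance(gene_hits, gene_lengths, marker_hits, marker_lengths):
    """Step 5: mean normalized abundance over the genes of one category."""
    return mean(
        normalized_abundance(gene_hits[g], gene_lengths[g], marker_hits, marker_lengths)
        for g in gene_hits
    )


# Hypothetical hit counts and gene lengths for a single metagenome
# (invented values, not taken from the study).
marker_hits = {"COG0048": 40, "COG0049": 60}
marker_lengths = {"COG0048": 400, "COG0049": 600}
ps1_hits = {"PsaA": 200, "PsaC": 30}
ps1_lengths = {"PsaA": 2000, "PsaC": 300}

ps1_abundance = category_abundance(ps1_hits, ps1_lengths, marker_hits, marker_lengths)
print(ps1_abundance)
```

In practice the marker dictionaries would hold all 35 COGs of sup. Table 3, and the category dictionaries only the genes flagged as markers in sup. Table 2.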
The normalization method was validated by measuring the cross-metagenome stoichiometry of PS I and
PS II abundance, which is expected to be equal.
Supplementary figure 1:
Validation of normalization method:
As photosystem I and photosystem II genes appear in bacterial genomes in tandem, a normalized
abundance measure is expected to yield a 1:1 ratio between the two.
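A minimal sketch of how such a 1:1 check could be carried out, assuming the normalized PS I and PS II abundances are available per metagenome. The function name, the sample names and the abundance values are hypothetical, and the median is used here only as one convenient summary; the source does not state which statistic was used.

```python
from statistics import median


def ps_ratio_check(ps1_abundance, ps2_abundance):
    """Return per-metagenome PS I : PS II ratios and their median.

    Metagenomes without any PS II signal are skipped to avoid division by zero.
    """
    ratios = {
        name: ps1_abundance[name] / ps2_abundance[name]
        for name in ps1_abundance
        if ps2_abundance.get(name, 0) > 0
    }
    return ratios, median(ratios.values())


# Hypothetical normalized abundances (invented values for illustration).
ps1 = {"sample_a": 0.50, "sample_b": 0.21, "sample_c": 0.0}
ps2 = {"sample_a": 0.48, "sample_b": 0.20, "sample_c": 0.0}

ratios, median_ratio = ps_ratio_check(ps1, ps2)
print(ratios, median_ratio)
```

A median ratio close to 1 across metagenomes would be consistent with the expectation stated above.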
Supplementary table 2:
PS I and PS II genes and their occurrence profile in sequenced genomes.
Gene | Number of occurrences in sequenced genomes | Number of sequenced genomes featuring gene | Used as marker gene?
photosystem I P700 chlorophyll a apoprotein subunit Ia (PsaA) | 84 | 72 | Yes
photosystem I P700 chlorophyll a apoprotein subunit Ib (PsaB) | 86 | 72 | No
photosystem I iron-sulfur center subunit VII (PsaC) | 70 | 67 | Yes
photosystem I subunit II (PsaD) | 81 | 73 | Yes
photosystem I subunit IV (PsaE) | 79 | 71 | Yes
photosystem I subunit III precursor, plastocyanin (cyt c553) docking protein (PsaF) | 78 | 74 | Yes
photosystem I subunit VI (PsaH) | 14 | 10 | No
photosystem I subunit VIII (PsaI) | 58 | 58 | No
photosystem I subunit IX (PsaJ) | 71 | 65 | Yes
photosystem I subunit X (PsaK, PsaK1) | 103 | 72 | No
photosystem I subunit XI (PsaL) | 84 | 74 | Yes
photosystem I subunit XII (PsaM) | 52 | 52 | No
photosystem II protein D1 (PsbA) | 232 | 75 | No
Photosystem II CP47 protein (PsbB) | 87 | 70 | No
Photosystem II CP43 protein (PsbC) | 99 | 71 | No
photosystem II protein D2 (PsbD) | 126 | 72 | No
Cytochrome b559 alpha chain (PsbE) | 73 | 70 | Yes
Cytochrome b559 beta chain (PsbF) | 71 | 64 | Yes
Photosystem II 10 kDa phosphoprotein (PsbH) | 80 | 70 | Yes
Photosystem II protein PsbI | 60 | 60 | Yes
Photosystem II protein PsbJ | 66 | 65 | Yes
Photosystem II protein PsbK | 72 | 68 | Yes
Photosystem II protein PsbL | 60 | 60 | Yes
Photosystem II protein PsbM | 61 | 60 | Yes
Photosystem II manganese-stabilizing protein (PsbO) | 83 | 73 | Yes
Photosystem II oxygen evolving complex protein PsbP | 92 | 74 | No
Photosystem II protein PsbT | 57 | 57 | No
Photosystem II 12 kDa extrinsic protein (PsbU) | 54 | 50 | No
Photosystem II protein PsbV, cytochrome c550 | 67 | 52 | No
Photosystem II 13 kDa protein Psb28 (PsbW) | 87 | 81 | Yes
Photosystem II protein PsbX | 53 | 52 | No
Photosystem II protein PsbY | 75 | 68 | Yes
Photosystem II protein PsbZ | 72 | 69 | Yes
Photosystem II protein Psb27 | 73 | 71 | Yes
Supplementary figure 2:
Correlation between gene abundances in respective DNA and mRNA samples from the Plymouth Marine
Lab Coastal Waters project. Points represent average abundance across samples. Standard Error of the
mean is shown.
Figure panel label: Proteorhodopsin
Supplementary table 3:
List of clusters of orthologous groups used for normalization
COG id | Function
COG0012 | Predicted GTPase, probable translation factor
COG0016 | Phenylalanyl-tRNA synthetase alpha subunit
COG0048 | Ribosomal protein S12
COG0049 | Ribosomal protein S7
COG0052 | Ribosomal protein S2
COG0080 | Ribosomal protein L11
COG0081 | Ribosomal protein L1
COG0085 | DNA-directed RNA polymerase, beta subunit/140 kD subunit
COG0087 | Ribosomal protein L3
COG0088 | Ribosomal protein L4
COG0090 | Ribosomal protein L2
COG0091 | Ribosomal protein L22
COG0092 | Ribosomal protein S3
COG0093 | Ribosomal protein L14
COG0094 | Ribosomal protein L5
COG0096 | Ribosomal protein S8
COG0097 | Ribosomal protein L6P/L9E
COG0098 | Ribosomal protein S5
COG0099 | Ribosomal protein S13
COG0100 | Ribosomal protein S11
COG0102 | Ribosomal protein L13
COG0103 | Ribosomal protein S9
COG0124 | Histidyl-tRNA synthetase
COG0184 | Ribosomal protein S15P/S13E
COG0185 | Ribosomal protein S19
COG0186 | Ribosomal protein S17
COG0197 | Ribosomal protein L16/L10E
COG0200 | Ribosomal protein L15
COG0201 | Preprotein translocase subunit SecY
COG0256 | Ribosomal protein L18
COG0495 | Leucyl-tRNA synthetase
COG0522 | Ribosomal protein S4 and related proteins
COG0525 | Valyl-tRNA synthetase
COG0533 | Metal-dependent proteases with possible chaperone activity
COG0541 | Signal recognition particle GTPase
References
1. Meyer F, Paarmann D, D'Souza M, Olson R, Glass EM et al. (2008) The metagenomics RAST server – a public resource for the automatic phylogenetic and functional analysis of metagenomes. BMC Bioinformatics 9: 386.
2. Beszteri B, Temperton B, Frickenhaus S, Giovannoni SJ (2010) Average genome size: a potential source of bias in comparative metagenomics. ISME J 4: 1075.
3. Raes J, Korbel JO, Lercher M, von Mering C, Bork P (2007) Prediction of effective genome size in metagenomic samples. Genome Biol 8: R10.