Dupont et al., supplemental materials

Materials and Methods

Primary production: Photosynthesis (primary production) was estimated from radiolabelled carbon tracer incorporation during 24 h on-deck incubations. Briefly, water samples were collected into 60 ml polycarbonate bottles and inoculated with 10 μCi of 14C bicarbonate solution. For each station, one replicate surface sample was filtered immediately to provide a time-zero control. The remaining samples were incubated for 24 h on deck in acrylic incubators screened with blue plus neutral density stage screening to achieve the percentage of surface irradiance corresponding to the depth from which each sample was taken, nominally 100%, 47%, 30%, 16%, 10% and 1%. The depth of each light level was calculated using an inverse analytical radiative transfer model as previously described (Johnson et al 2002). Incubator temperatures were maintained by a continuous flow of surface seawater. After incubation, duplicate samples were filtered onto 0.2 μm polycarbonate or Whatman GF/F filters, then acidified with 0.5 ml of 0.5 N HCl for 24 h to liberate unincorporated inorganic 14C. Ecolume scintillation fluid (7 ml) was added and activity was quantified using a Tri-Carb 2900TR liquid scintillation counter (Packard Bioscience, Waltham, MA). Total added 14C was quantified in select inoculated subsamples prior to filtering. Primary production was estimated from the activity of the total and filtered samples using standard calculations (Barber et al 1996). When an exact depth match was not present, primary production values were linearly interpolated between nearest-neighbor depths to the depth of interest.

Nitrogen uptake rates: Eight 500 ml polycarbonate bottles were filled for each of six depths, with water collected 2 hours before dawn.
Spikes of 100 nM 15NH4+, 15N-urea, or 15NO3- (all 99% 15N, Sigma) were added to duplicate bottles for each depth, and the bottles were transferred to on-deck incubators cooled with surface seawater and shaded to 30% of ambient light. After two hours, the entire contents of each bottle were filtered onto a precombusted GF/F (Whatman) filter at low pressure and stored at -20°C in a desiccator. Filters were dried (3 hours at 50°C) and packaged in tin for mass spectrometry analysis at the UC Davis Stable Isotope Facility. Separately, 1 L from each of the six depths was filtered gently onto precombusted GF/F filters, which were subsequently dried and packaged in tin. The total carbon and total nitrogen captured on these filters were determined at the Scripps Institution of Oceanography analytical facility with a Costech 4010 elemental combustion analyzer. Uptake rates were determined using the equations described by Dugdale and colleagues (Dugdale and Goering 1967, Dugdale and Wilkerson 1986). Where nutrient concentrations were below the analytical detection limit (100 nM), an ambient concentration of 100 nM was assumed.

Electron microscopy: Cells in log phase, grown at room temperature at 30 μE (daylight fluorescent bulbs) on a rotary shaker (125 rev/min), were pelleted; a small aliquot was transferred by toothpick to a cushioning material and dropped onto a liquid He-cooled copper block (4 K); the frozen material was transferred to liquid nitrogen and fractured, etched at -80°C for 2 min, and Pt/C rotary replicated as described previously (*). The replicas were examined with a JEOL electron microscope, model JEM 1400, equipped with an AMTV601 digital camera. The images are photographic negatives; hence, protuberant elements of the fractured/etched surface are most heavily coated with platinum and appear white.
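As an illustration of the nitrogen uptake calculation described in the methods above, the following is a minimal sketch of one widely used constant-uptake formulation associated with Dugdale and Wilkerson (1986); it is not necessarily the exact form of the equations the authors applied. The function names, the natural 15N abundance of 0.3663 atom %, and all example inputs are assumptions made for illustration. The only details taken from the text are the 100 nM spike at 99% 15N, the two-hour incubation, and the rule that ambient concentrations below the 100 nM detection limit are set to 100 nM.

```python
# Assumed natural 15N abundance (atom %) for unlabeled nitrogen.
NATURAL_ATOM_PCT = 0.3663


def substrate_enrichment(spike_nM, ambient_nM, spike_atom_pct=99.0,
                         detection_limit_nM=100.0):
    """Atom % 15N of the dissolved pool immediately after the spike is added."""
    # From the methods: ambient values below detection are assumed to be 100 nM.
    ambient = max(ambient_nM, detection_limit_nM)
    labeled = spike_nM * spike_atom_pct + ambient * NATURAL_ATOM_PCT
    return labeled / (spike_nM + ambient)


def specific_uptake(particulate_atom_pct, enrichment_atom_pct, hours):
    """Specific uptake rate V (per hour), assuming constant uptake."""
    excess = particulate_atom_pct - NATURAL_ATOM_PCT
    return excess / ((enrichment_atom_pct - NATURAL_ATOM_PCT) * hours)


def absolute_uptake(v_per_hour, particulate_n):
    """Absolute uptake rate rho = V x PN, in units of particulate_n per hour."""
    return v_per_hour * particulate_n


if __name__ == "__main__":
    # Hypothetical inputs: 20 nM ambient is below detection, so 100 nM is used.
    r = substrate_enrichment(spike_nM=100.0, ambient_nM=20.0)
    v = specific_uptake(particulate_atom_pct=0.4663,
                        enrichment_atom_pct=r, hours=2.0)
    print(r, v, absolute_uptake(v, particulate_n=1.5))
```

Flooring the ambient concentration at the detection limit lowers the computed enrichment of the dissolved pool relative to using a smaller measured value, so the resulting uptake rates are, if anything, pushed upward for a given particulate enrichment.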
Pigment measurements: Two liters of seawater were gently filtered onto a Whatman GF/F filter, which was placed in a cryogenic vial and stored in liquid nitrogen until analysis as described in Goericke and Montoya (1998) and Dupont et al. (2010).

Table 1: Number of sequences generated for each library and annotation statistics. Schmidt filtering removes duplicate reads. Taxonomic annotation refers to having a match in PhyloDB. Cluster indicates whether a sequence is recruited to the GOS protein clusters. KEGG, PFAM, and TIGRFAM give the number and percentage of reads annotated by the indicated families of HMMs. Percentages are based on read counts after Schmidt filtering and rRNA removal.

Library            Genome/Transcriptome  Read length (median)  Raw reads  Schmidt filtered / post rRNA removal
GS265_454_0.1      Genome                422                   260734     250390
GS265_454_0.8      Genome                406                   220031     158324
GS265_454_3.0      Genome                376                   334337     235993
GS265_polyA mRNA   Transcriptome         349                   415501     412298
GS265_total mRNA   Transcriptome         157                   112564     96293
GS265_sanger_0.1   Genome                1021                  46654      46651
GS266_454_0.1      Genome                422                   244716     235087
GS266_454_0.8      Genome                397                   175015     130529
GS266_454_3.0      Genome                351                   174055     127098
GS266_polyA mRNA   Transcriptome         374                   889091     874427
GS266_total mRNA   Transcriptome         375                   249623     209920
GS266_sanger_0.1   Genome                1027                  45757      45745
GS267_454_0.1      Genome                457                   145239     135257
GS267_454_0.8      Genome                414                   272874     233648
GS267_454_3.0      Genome                377                   250090     211894
GS267_polyA mRNA   Transcriptome         424                   453751     450932
GS267_total mRNA   Transcriptome         116                   202368     182244
GS267_sanger_0.1   Genome                1028                  46531      46502
GS268_454_0.1      Genome                460                   223789     208138
GS268_454_0.8      Genome                365                   189181     129164
GS268_454_3.0      Genome                336                   206976     143681
GS268_polyA mRNA   Transcriptome         376                   946195     914645
GS268_total mRNA   Transcriptome         287                   161315     128030
GS268_sanger_0.1   Genome                1025                  46588      46585
GS269_454_0.1      Genome                409                   286048     276953
GS269_454_0.8      Genome                449                   230045     220521
GS269_454_3.0      Genome                432                   335849     314465
GS269_polyA mRNA   Transcriptome         454                   206753     206280
GS269_total mRNA   Transcriptome         319                   160020     133069
GS269_sanger_0.1   Genome                1030                  45458      45457
GS270_454_0.1      Genome                426                   247242     239167
GS270_454_0.8      Genome                432                   221331     203542
GS270_454_3.0      Genome                385                   324771     294927
GS270_polyA mRNA   Transcriptome         336                   772491     758859
GS270_total mRNA   Transcriptome         337                   95082      77791
GS270_sanger_0.1   Genome                1027                  46311      46310
GS271_454_0.1      Genome                407                   310047     291740
GS271_454_0.8      Genome                411                   236896     201973
GS271_454_3.0      Genome                338                   304775     251684
GS271_polyA mRNA   Transcriptome         398                   770314     752922
GS271_total mRNA   Transcriptome         397                   563599     163326
GS271_sanger_0.1   Genome                1036                  44347      44266
GS272_454_0.1      Genome                447                   209455     198247
GS272_454_0.8      Genome                427                   183007     146134
GS272_454_3.0      Genome                402                   391736     308436
GS272_polyA mRNA   Transcriptome         325                   500776     499770
GS272_total mRNA   Transcriptome         90                    112047     95364
GS272_sanger_0.1   Genome                1030                  45475      45456

Library            % taxonomic  Cluster  % cluster  KEGG    % KEGG  PFAM    % PFAM  TIGRFAM  % TIGRFAM
GS265_454_0.1      73           181550   73         118236  47      158305  63      74459    30
GS265_454_0.8      56           84095    53         43221   27      70679   45      27773    18
GS265_454_3.0      24           53285    23         23919   10      42218   18      15397    7
GS265_polyA mRNA   27           92214    22         32722   8       83797   20      14560    4
GS265_total mRNA   25           21676    23         11377   12      19555   20      7993     8
GS265_sanger_0.1   92           42712    92         28086   60      37195   80      17088    37
GS266_454_0.1      69           161365   69         103852  44      139017  59      65017    28
GS266_454_0.8      56           69699    53         32683   25      53412   41      21170    16
GS266_454_3.0      17           20305    16         7226    6       13943   11      4527     4
GS266_polyA mRNA   29           212449   24         76698   9       191578  22      37100    4
GS266_total mRNA   35           62700    30         26233   12      57044   27      15658    7
GS266_sanger_0.1   88           40240    88         25181   55      33752   74      15236    33
GS267_454_0.1      76           103105   76         68685   51      89292   66      42154    31
GS267_454_0.8      52           119862   51         68456   29      101789  44      44193    19
GS267_454_3.0      18           37100    18         16510   8       28216   13      10417    5
GS267_polyA mRNA   29           112637   25         41830   9       101338  22      18125    4
GS267_total mRNA   21           35994    20         19310   11      32212   18      11897    7
GS267_sanger_0.1   92           42813    92         28988   62      37242   80      17582    38
GS268_454_0.1      64           132924   64         76815   37      108111  52      47447    23
GS268_454_0.8      51           62077    48         28222   22      46987   36      18576    14
GS268_454_3.0      21           28999    20         8136    6       17659   12      5094     4
GS268_polyA mRNA   31           240658   26         89046   10      216205  24      42931    5
GS268_total mRNA   39           45851    36         26302   21      39639   31      18236    14
GS268_sanger_0.1   86           40038    86         23699   51      32354   69      14287    31
GS269_454_0.1      71           196327   71         133271  48      172318  62      84382    30
GS269_454_0.8      35           72098    33         31458   14      58489   27      19062    9
GS269_454_3.0      20           60242    19         23733   8       44676   14      14801    5
GS269_polyA mRNA   32           56862    28         20849   10      50924   25      8773     4
GS269_total mRNA   32           38848    29         19159   14      34628   26      11659    9
GS269_sanger_0.1   92           41844    92         28319   62      36518   80      17448    38
GS270_454_0.1      56           132104   55         78944   33      111778  47      48366    20
GS270_454_0.8      47           81786    40         29430   14      63402   31      17468    9
GS270_454_3.0      13           33308    11         9462    3       23021   8       5458     2
GS270_polyA mRNA   35           213496   28         73090   10      193408  25      36729    5
GS270_total mRNA   60           44375    57         12213   16      34016   44      9204     12
GS270_sanger_0.1   88           40436    87         25102   54      33858   73      15275    33
GS271_454_0.1      67           194088   67         120461  41      163347  56      74866    26
GS271_454_0.8      50           94403    47         47466   24      79679   39      30622    15
GS271_454_3.0      17           40473    16         16678   7       31912   13      10498    4
GS271_polyA mRNA   36           226759   30         79645   11      206113  27      36899    5
GS271_total mRNA   49           73323    45         32036   20      54364   33      22575    14
GS271_sanger_0.1   88           38897    88         25525   58      33996   77      15659    35
GS272_454_0.1      67           132216   67         79324   40      109095  55      49795    25
GS272_454_0.8      59           84280    58         46043   32      70710   48      29695    20
GS272_454_3.0      26           76638    25         32926   11      57885   19      20826    7
GS272_polyA mRNA   29           130192   26         43057   9       113916  23      20289    4
GS272_total mRNA   15           12940    14         6350    7       11627   12      3813     4
GS272_sanger_0.1   90           40599    89         25142   55      33680   74      15355    34

Table 2: Genomes enriched in the largest size fraction, the SCM, and the surface ocean. No genomes are enriched in the smallest size fraction. All of these genomes are statistically enriched according to multiple methods with p values less than 0.001 (chi-squared test and FDR). Many more genomes are enriched in each environment. The total metagenomic abundance is shown along with total transcriptome abundance.
Finally, the percent amino acid identity of the translated transcripts and metagenome reads to each genome is shown.

3.0 enriched bacteria                         0.1/3.0      Meta %  Tran %  % ID to ref
Synechococcus sp. CC9311                      0.00         0.58    1.52    91
Akkermansia muciniphila ATCC BAA-835          0.01         0.16    0.07    61
Alteromonas macleodii ATCC 27126              0.01         0.20    1.81    83
Alteromonas macleodii str. 'Deep ecotype'     0.01         0.20    1.80    93
Coraliomargarita akajimensis DSM 45221        0.01         2.47    0.34    68
Synechococcus sp. CC9902                      0.01         0.57    2.41    91
Synechococcus sp. BL107                       0.02         0.34    1.19    89
Lacinutrix sp. 5H-3-7-4                       0.02         0.95    0.18    74
Verrucomicrobium spinosum DSM 4136            0.02         0.42    0.16    62
Flavobacteriales bacterium ALC-1              0.02         1.08    0.19    64
Lentisphaera araneosa HTCC2155                0.02         0.45    0.09    47
Chthoniobacter flavus Ellin428                0.02         0.28    0.10    56
Verrucomicrobiae bacterium DG1235             0.02         0.41    0.17    50
Synechococcus sp. WH 7803                     0.02         0.07    0.18    82
Ulvibacter SCB49                              0.02         0.35    0.07    74

SCM enriched bacteria                         surface/SCM  Meta %  Tran %  % ID to ref
Prochlorococcus marinus str. NATL1A           0.01         3.25    5.56    93
Prochlorococcus marinus str. NATL2A           0.02         1.38    0.68    91
Prochlorococcus marinus str. CCMP1375         0.09         0.13    0.31    92
Synechococcus sp. CC9311                      0.13         0.58    1.52    91
Prochlorococcus marinus str. MIT 9313         0.14         0.03    0.07    90
Acidimicrobium ferrooxidans DSM 10331         0.16         0.16    0.07    68
Prochlorococcus marinus str. MIT 9303         0.18         0.08    0.25    92
Prochlorococcus marinus str. MIT 9211         0.19         0.08    0.31    86
Frankia sp. EUN1f                             0.20         0.03    0.02    42
delta proteobacterium MLMS-1                  0.20         0.04    0.02    66
Candidatus Ruthia magnifica str. Cm           0.20         0.14    0.10    76
Frankia sp. EuI1c                             0.22         0.06    0.05    50
Pedosphaera parvula Ellin514                  0.24         0.32    0.26    52
Candidatus Vesicomyosocius okutanii HA        0.24         0.10    0.08    73
Frankia alni ACN14a                           0.25         0.03    0.02    52

Surface enriched bacteria                     surface/SCM  Meta %  Tran %  % ID to ref
Prochlorococcus marinus CCMP1986              13.27        4.23    1.41    91
Coraliomargarita akajimensis DSM 45221        6.58         2.47    0.34    68
Prochlorococcus marinus str. MIT 9515         5.74         0.77    0.64    93
Candidatus Puniceispirillum marinum IMCC1322  3.82         1.78    3.00    81
gamma proteobacterium HTCC2207                3.79         1.52    0.57    64
Hyphomonas neptunium ATCC 15444               2.69         0.17    0.19    70
alpha proteobacterium HIMB114                 2.57         0.65    0.25    74
Roseobacter litoralis Och 149                 2.22         0.13    0.08    78
Octadecabacter antarcticus 307                2.21         0.12    0.11    70
Candidatus Pelagibacter sp. IMCC9063          2.18         0.62    0.26    76
Prochlorococcus marinus str. MIT 9301         2.16         0.09    0.06    85
Ruegeria pomeroyi DSS-3                       2.16         0.24    0.20    71
Roseovarius sp. 217                           2.14         0.11    0.09    89
Roseobacter denitrificans OCh 114             2.12         0.18    0.14    75
Candidatus Pelagibacter HTCC 7211             2.05         8.22    1.67    81

Table 3: Most expressed genes for several of the most abundant genomes.

SAR86 subtype A, clade I (78% AA identity)
  Ton B Receptor
  Cytochrome C and Quinol oxidase polypeptide I
  Hypothetical protein
  Peroxidase
  EF-Tu
  RNA polymerase Rpb2
  Inorganic H+ pyrophosphatase
  Ribosomal protein L10
  D-isomer specific 2-hydroxyacid dehydrogenase, NAD binding domain
  Ton B Receptor

Puniceispirillum marinum IMCC1322 (75% AA identity)
  sugar ABC transporter periplasmic sugar-binding protein
  hypothetical protein
  cytochrome-c oxidase subunit I protein
  membrane protease subunit HflK
  ATP synthase F1 subunit alpha
  cell division protein FtsH
  DNA-directed RNA polymerase subunit beta
  DNA-directed RNA polymerase
  molecular chaperone DnaK
  chaperonin GroEL

Pelagibacter HTCC7211 (81% AA identity)
  bacteriorhodopsin
  Na+/solute symporter, Ssf family
  Bacterial extracellular solute-binding protein, family 7
  glycine betaine-binding lipoprotein, putative
  V-type H(+)-translocating pyrophosphatase
  receptor family ligand-binding protein
  putative porin
  trap dicarboxylate transporter - dctp subunit
  translation elongation factor Tu
  glutamate synthase large subunit

Erythrobacter sp. SD-21 (83% AA identity)
  ribosomal protein L4/L1 family
  ATP synthase F0, A subunit
  50S ribosomal protein L3
  translation elongation factor G
  ribosomal protein S14p/S29e
  30S ribosomal protein S13
  ribosomal protein S2
  hypothetical protein
  ATP synthase F1, alpha subunit
  protein-export membrane protein SecF

SAR324 clade bacteria, surface branch (69% AA identity)
  LacI periplasm sugar binding protein
  Bacterial extracellular solute-binding protein
  TCP-1/cpn60 chaperonin family protein
  Hypothetical protein
  Bacterial extracellular solute-binding protein
  Bacterial extracellular solute-binding proteins, family 3
  Bacterial extracellular solute-binding proteins, family 3
  Sodium/hydrogen exchanger family
  Hypothetical protein
  Bacterial extracellular solute-binding proteins

Alteromonadales bacterium TW-7 (89% AA identity)
  ribosomal protein L6
  Bacterial flagellin N-terminus
  TolR
  ribosomal protein S1
  ribosomal protein L7/L12
  DNA-directed RNA polymerase, beta subunit
  ribosomal protein L4/L1 family
  ribosomal protein L10
  translation elongation factor G
  50S ribosomal protein L3

Prochlorococcus NATL1A (93% AA identity)
  light-harvesting complex protein
  rplC
  psbD
  light-harvesting protein PcbH
  rpl
  porin-like protein
  rbcL
  psbC
  psbB
  F0F1 ATP synthase subunit B'

Synechococcus CC9311 (91% AA identity)
  porin
  psbA-3
  porin
  porin
  cpaB-2
  psaB
  psaA
  psbA-1
  psbB
  NADH dehydrogenase

Table 4: Genomes whose abundance correlated (r2>0.7) with measurements of primary production, nitrate uptake, and chl a. Also shown are the metagenomic abundance of each genome, its enrichment factor in the largest size fraction, and its cDNA to DNA ratio.
Primary production
Species                                  Trans %  Genome %  cDNA/DNA  0.1/3.0  Surf/chl max
Synechococcus sp. CC9902                 2.41     0.57      4.25      0.19     0.35
Synechococcus sp. CC9311                 1.52     0.58      2.63      0.03     0.13
Synechococcus sp. BL107                  1.19     0.34      3.47      0.24     0.38
Synechococcus sp. WH 8109                0.39     0.10      4.06      0.47     1.12
Candidatus Ruthia magnifica str. Cm      0.10     0.14      0.70      8.69     0.20
Methylophilales bacterium HTCC2181       0.08     0.23      0.36      10.17    1.74
Runella slithyformis DSM 19594           0.04     0.15      0.26      0.73     0.72
Candidatus Vesicomyosocius okutanii HA   0.08     0.10      0.82      6.49     0.24
unidentified eubacterium SCB49           0.07     0.35      0.21      0.34     0.71
Parabacteroides distasonis ATCC 8503     0.01     0.03      0.23      0.46     0.69
Synechococcus sp. CC9605                 0.49     0.17      2.91      0.39     0.98

Nitrate uptake
Species                                  Trans %  Genome %  cDNA/DNA  0.1/3.0  Surf/chl max
Synechococcus sp. WH 8109                0.39     0.10      4.06      0.47     1.12
Synechococcus sp. WH 8102                0.32     0.21      1.54      0.39     1.09
Synechococcus sp. CC9605                 0.49     0.17      2.91      0.39     0.98
Methylophilales bacterium HTCC2181       0.08     0.23      0.36      10.17    1.74
Sar86 B                                  0.44     2.35      0.19      3.22     1.69

Chlorophyll a
Species                                  Trans %  Genome %  cDNA/DNA  0.1/3.0  Surf/chl max
Runella slithyformis DSM 19594           0.04     0.15      0.26      0.73     0.72
Parabacteroides distasonis ATCC 8503     0.01     0.03      0.23      0.46     0.69
Synechococcus sp. CC9311                 1.52     0.58      2.63      0.03     0.13
Synechococcus sp. RS9917                 0.06     0.05      1.27      0.71     0.40
Candidatus Vesicomyosocius okutanii HA   0.08     0.10      0.82      6.49     0.24
Synechococcus sp. WH 7805                0.11     0.07      1.57      0.43     0.41
Candidatus Ruthia magnifica str. Cm      0.10     0.14      0.70      8.69     0.20
Synechococcus sp. RS9916                 0.13     0.08      1.63      0.49     0.40
Methylovorus sp. MP688                   0.01     0.02      0.69      3.38     0.76
Spirosoma linguale DSM 74                0.05     0.13      0.39      0.73     0.68
Synechococcus sp. CB0101                 0.03     0.03      0.85      0.45     0.29
Bacteroidetes oral taxon 274 str. F0058  0.01     0.03      0.46      0.61     0.58
Haliscomenobacter hydrossis DSM 1100     0.13     0.28      0.46      0.52     0.68
Synechococcus sp. WH 7803                0.18     0.07      2.65      0.34     0.35
Chthoniobacter flavus Ellin428           0.10     0.28      0.36      0.33     0.54
Akkermansia muciniphila ATCC BAA-835     0.07     0.16      0.44      0.13     0.67
Pedosphaera parvula Ellin514             0.26     0.32      0.81      0.50     0.24
Odoribacter splanchnicus DSM 20712       0.03     0.09      0.37      0.44     0.50
Synechococcus sp. RCC307                 0.04     0.04      1.18      0.42     0.37
Chitinophaga pinensis DSM 2588           0.05     0.15      0.35      0.54     0.60
Verrucomicrobium spinosum DSM 4136       0.16     0.42      0.38      0.28     0.56
Microscilla marina ATCC 23134            0.08     0.28      0.28      0.62     0.65
Blastopirellula marina DSM 3645          0.10     0.20      0.48      0.61     0.35

Figure S1: Depth profiles of particulate carbon, nitrogen, chlorophyll a, and C:Chl a at each site where metagenomic and metatranscriptomic samples were taken.
[Figure S1 panels: particulate carbon (ug/L), particulate nitrogen (ug/L), particulate chl a (ug/L), and C:Chl a (ug/ug), each plotted against depth (m); legend entries 93.40, 93.80, 93.110, and 93.120.]

Figure S2: Taxonomic breakdown at the domain level for reads that received a taxonomic annotation via BLAST to PhyloDB for each library.
[Figure S2 panels: Surface and Chl a maximum, each on a 0% to 100% axis, with bars for 93.40, 93.80, 93.110, and 93.120 in the 0.1-0.8 um metagenome, 0.8-3.0 um metagenome, 3.0-20 um metagenome, polyA mRNA metatranscriptome, and total mRNA metatranscriptome; legend: Archaea, Bacteria, Eukarya, Virus.]

Figure S3: Top panel: Taxonomic breakdown of all annotated reads of putative viral origin for each size class.
Bottom panel: Taxonomic breakdown of all annotated reads of putative viral origin for the surface and SCM samples. Only the most abundant taxonomic categories are shown for clarity of legend; thus, the histograms do not add up to 100%.
[Figure S3 top panel: bars for sum (0.1), sum (0.8), and sum (3.0); y axis: % of total annotated reads of viral origin. Bottom panel: bars for Surface and SCM; y axis: % of annotated viral reads. Legend entries are viral reference genomes and viral metagenome samples, for example Synechococcus phage Syn5, Cyanophage PSS2, and Prochlorococcus phage P-SSM2.]

Figure S4: Representation of various eukaryotic (left column) and bacterial (right column) in pooled metagenomes (top four panels) and the two types of transcriptomes. A similar comparison for the two types of transcriptomes is shown in the bottom two panels.

Figure S5: Freeze-etch micrographs of the basal body (upper panel) and axoneme (lower panel) structure in Pelagomonas calceolata CCMP1756.

Supplemental References
Barber RT, Sanderson MP, Lindley ST, Chai F, Newton J, Trees CC et al (1996). Primary productivity and its regulation in the equatorial Pacific during and following the 1991-1992 El Niño. Deep Sea Research Part II: Topical Studies in Oceanography 43: 933-969.
Dugdale RC, Goering JJ (1967). Uptake of new and regenerated forms of nitrogen in primary production. Limnology and Oceanography 12: 196-206.
Dugdale RC, Wilkerson FP (1986). The use of N15 to measure nitrogen uptake in eutrophic oceans: Experimental considerations. Limnology and Oceanography 31: 673-689.
Dupont CL, Buck KN, Palenik B, Barbeau K (2010). Nickel utilization in phytoplankton assemblages from contrasting oceanic regimes. Deep Sea Research I 57: 553-566.
Goericke R, Montoya JP (1998). Estimating the contribution of microalgal taxa to chlorophyll a in the field-variations of pigment ratios under nutrient- and light-limited growth. Marine Ecology Progress Series 169: 97-112.
Johnson Z, Bidigare RR, Goericke R, Marra J, Trees C, Barber RT (2002).
Photosynthetic physiology and physicochemical forcing in the Arabian Sea, 1995. Deep Sea Research Part I: Oceanographic Research Papers 49: 415-436.
Matsen FA, Kodner RB, Armbrust EV (2010). pplacer: linear time maximum-likelihood and Bayesian phylogenetic placement of sequences onto a fixed reference tree. BMC Bioinformatics 11: 538.
Price MN, Dehal PS, Arkin AP (2009). FastTree: computing large minimum evolution trees with profiles instead of a distance matrix. Mol Biol Evol 26: 1641-1650.
Shimodaira H, Hasegawa M (1999). Multiple Comparisons of Log-Likelihoods with Applications to Phylogenetic Inference. Mol Biol Evol 16: 1114.
Sievers F, Wilm A, Dineen D, Gibson TJ, Karplus K, Li W et al (2011). Fast, scalable generation of high-quality protein multiple sequence alignments using Clustal Omega. Molecular Systems Biology 7: 539.
Whelan S, Goldman N (2001). A general empirical model of protein evolution derived from multiple protein families using a maximum-likelihood approach. Mol Biol Evol 18: 691-699.