EMI_2239_sm_suppl_info

Supplementary Table 1. Media formulations used to cultivate Carpediemonas-like organisms.

802SW: Boil 5 g of cerophyll in 1 l of seawater for 5 min. Filter the medium and autoclave. Add 10-12 ml per 15 ml tube.

NM: In a 15 ml tube combine 1 sterile rice grain, 0.5 ml of modified ATCC medium 1171 (prepared with heat-inactivated horse serum) and 10 ml of sterile seawater.

SW1773: In a 15 ml tube mix 9 ml of 802SW (see above) and 3 ml of sterile ATCC medium 1171.

T/S: Mix 485 ml of sterile seawater, 485 ml of sterile modified TYSGM-9 medium (prepared without serum; see below) and 30 ml of heat-inactivated horse serum. Add 10-12 ml per 15 ml tube.

3%LB: In a 15 ml tube mix 300 µl of LB medium and 10 ml of sterile seawater.

802SW/horse serum: Prepare a horse serum slant by adding 3 ml of horse serum to a 15 ml tube. Incubate the tubes on their side at 80°C for 2 hours; the horse serum solidifies and forms a slanted surface at the bottom of the tube. Repeat twice: incubate the horse serum slants overnight at room temperature, followed by incubation at 80°C for 2 hours. Add 3-4 ml of 802SW medium over the horse serum slant.

Modified TYSGM-9 medium: In 485 ml of distilled water dissolve 1 g of tryptone, 0.5 g of yeast extract, 1.4 g of K2HPO4, 0.2 g of KH2PO4 and 3.75 g of NaCl. Autoclave. Add 15 ml of heat-inactivated bovine or horse serum.

Pre-inoculation: Media for isolates PCE, PCS, NC and GSML were pre-inoculated with Klebsiella sp.

Analyses of 454 data

We searched for sequences from Carpediemonas-like organisms (CLOs) in two environmental PCR datasets from anoxic marine material (Stoeck et al., 2009, with ~250,000 454 reads, and Stoeck et al., 2010, with ~660,000 454 reads). The first includes sequences derived from material from the Framvaren Fjord and the Cariaco Basin, the second from the Framvaren Fjord only. The sequences in these datasets are quite short (~150 bp) and mostly encompass a variable region of the SSU rRNA gene.
We were concerned, therefore, that simple BLAST analyses would not be a very sensitive method for identifying CLO sequences, as they are quite divergent from each other. We therefore analyzed all 454 reads one by one, using a combination of phylogenetic methods and similarity searches. The workflow was as follows:

1. Each 454 read was added to a reference alignment with taxon sampling similar to the one presented in the main paper, and aligned using the program MAFFT (Katoh et al., 2005) with the fastest possible set-up (‘mafft –intree 1 infile outfile’).

2. Each alignment was then analyzed with the phylogenetic analysis program RAxML 7.0.4 (Stamatakis, 2006) using the ‘–f p’ option. The program used maximum parsimony to place the new sequence within a fixed reference tree (the new sequence is not present in the reference tree).

3. The program PHAT (part of the PhyloGenie package; Frickey and Lupas, 2004) was then used to select the trees in which the new sequence branched within, or sister to, Fornicata, i.e. sequences from possible CLOs or diplomonads. After this we were left with ~33,000 potential Fornicata sequences. However, as Fornicata sequences are long-branching, this set was presumed to include many divergent sequences from organisms unrelated to Fornicata, in addition to genuine Fornicata sequences.

4. The potential Fornicata sequences were then extracted into a FASTA file, and the program BLASTCLUST (from the NCBI BLAST suite) was used to cluster sequences that were nearly identical (similarity set to 0.95). This step grouped the ~33,000 sequences into 1490 clusters.

5. Sequences representing each cluster were analyzed by BLAST, and all sequences with obvious high similarity (>90%) to organisms other than CLOs were discarded. Around 90% of the sequences were excluded by this step.

6. Sequences representing the remaining clusters (150) were re-aligned to the dataset from the main paper with the program MAFFT (einsi setting).
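As a rough illustration of the logic behind steps 4 and 5, the following Python sketch clusters near-identical reads and then drops cluster representatives whose best BLAST hit is highly similar to an organism other than a CLO. It is not the pipeline that was run: the greedy, position-wise identity clustering is a simplified stand-in for BLASTCLUST, and the names `identity`, `greedy_cluster`, `keep_representative` and `BlastHit`, as well as the taxon labels, are hypothetical and introduced only for this example.

```python
from __future__ import annotations

from dataclasses import dataclass


def identity(a: str, b: str) -> float:
    """Fraction of matching positions, relative to the longer sequence.

    Assumption: reads are compared position by position without alignment,
    which is a toy substitute for the BLAST-based similarity of BLASTCLUST.
    """
    longest = max(len(a), len(b))
    if longest == 0:
        return 0.0
    matches = sum(x == y for x, y in zip(a, b))
    return matches / longest


def greedy_cluster(seqs: dict[str, str], threshold: float = 0.95) -> list[list[str]]:
    """Assign each read to the first cluster whose representative it matches.

    The first member of each cluster acts as its representative.
    """
    clusters: list[list[str]] = []
    for seq_id, seq in seqs.items():
        for cluster in clusters:
            representative = seqs[cluster[0]]
            if identity(seq, representative) >= threshold:
                cluster.append(seq_id)
                break
        else:
            # No existing representative is similar enough: start a new cluster.
            clusters.append([seq_id])
    return clusters


@dataclass
class BlastHit:
    query: str           # ID of the cluster representative
    subject_taxon: str   # taxon label of the best hit (hypothetical annotation)
    pct_identity: float  # percent identity of the best hit


def keep_representative(hit: BlastHit, clo_taxa: set[str], threshold: float = 90.0) -> bool:
    """Discard a representative only if it is >threshold% identical to a non-CLO."""
    is_obvious_non_clo = hit.pct_identity > threshold and hit.subject_taxon not in clo_taxa
    return not is_obvious_non_clo


if __name__ == "__main__":
    # Toy reads; real 454 reads are ~150 bp.
    reads = {"r1": "ACGT" * 10, "r2": "ACGT" * 9 + "ACGA", "r3": "T" * 40}
    print(greedy_cluster(reads))

    hits = [
        BlastHit("r1", "Carpediemonas", 97.0),
        BlastHit("r3", "Ciliophora", 98.0),
    ]
    print([h.query for h in hits if keep_representative(h, {"Carpediemonas"})])
```

Under this rule a representative with only weak similarity to a non-CLO organism is retained, since only hits above the 90% threshold count as obvious matches; such sequences go on to the phylogenetic analysis.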
For each test sequence, a phylogenetic tree was constructed by maximum likelihood using the program RAxML 7.0.4 (with the GTRGAMMAI model).

Results: We identified 22 reads closely related to clade CL6 (only from Stoeck et al., 2009, Framvaren Fjord), and 8 reads closely related to clade CL1 (only from Stoeck et al., 2009, Framvaren Fjord). A further 11 sequences appear to be from diplomonads (8 from the Framvaren Fjord and 3 from the Cariaco Basin, from Stoeck et al., 2009). We also identified one sequence that branches amongst CLOs in the ML phylogeny, but is not closely related to any of the known sequences from CL1-6. It is possible that this sequence represents an additional CLO lineage, but more likely that it represents an unrelated sequence that is misplaced in this phylogeny. The 454 sequences are simply too short to make a definitive statement about the position of this sequence.

Discussion: Analysis of the 454 sequencing data did allow us to recover sequences from two CLO clades from environments that did not yield any CLO sequences in previous studies that employed clone libraries and Sanger sequencing. It is possible that this was due to the much deeper sampling available with 454 sequencing. Interestingly, we did not recover any CLO sequences in the larger of the two 454 datasets examined (Stoeck et al., 2010). Overall this suggests that shallow sequence coverage is not the only reason for the limited recovery of CLOs by previous environmental studies. It supports the idea that CLOs are often extremely rare to nonexistent in suboxic marine systems, or that there is some strong bias against their sequences in PCR studies. A downside of 454 sequencing at present is the limited length of the reads, which makes it difficult to place some sequences on the tree, especially if the 454 sequence is not very similar to any available near-full-length sequence.
It is quite possible that sequences from novel CLO lineages were missed by our analyses for this reason. Until longer sequences become available, we do not expect analysis of such datasets to be a particularly effective way of identifying additional major lineages within groups with divergent rRNA genes, such as Fornicata.

References

Frickey, T., and Lupas, A.N. (2004) PhyloGenie: automated phylome generation and analysis. Nucleic Acids Res 32: 5231-5238.

Katoh, K., Kuma, K., Toh, H., and Miyata, T. (2005) MAFFT version 5: improvement in accuracy of multiple sequence alignment. Nucleic Acids Res 33: 511-518.

Stamatakis, A. (2006) RAxML-VI-HPC: maximum likelihood-based phylogenetic analyses with thousands of taxa and mixed models. Bioinformatics 22: 2688-2690.

Stoeck, T., Bass, D., Nebel, M., Christen, R., Jones, M.D.M., Breiner, H.W., and Richards, T.A. (2010) Multiple marker parallel tag environmental DNA sequencing reveals a highly complex eukaryotic community in marine anoxic water. Mol Ecol 19: in press.

Stoeck, T., Behnke, A., Christen, R., Amaral-Zettler, L., Rodriguez-Mora, M.J., Chistoserdov, A. et al. (2009) Massively parallel tag sequencing reveals the complexity of anaerobic marine protistan communities. BMC Biol 7: 1-20.
