Supplementary Information

Environmental distribution of coral-associated relatives of apicomplexan parasites

Jan Janouškovec1, Aleš Horák1,2, Katie L. Barott3, Forest L. Rohwer3, and Patrick J. Keeling1

Supplementary Materials and Methods:

Sequence data availability: All sequences used in this study and their accession numbers were described previously (Janouškovec et al., 2012, Curr Biol; see Table S1 therein), with the exception of three studies additionally identified here: V6 tags of ARL-V from a survey of seven coral species in the Caribbean (Sunagawa et al., 2010; data available at http://vamps.mbl.edu), V6 tags of ARL-V from an Isopora palifera survey in Taiwan (Chen et al., 2011; data available at http://140.109.29.21/~tanglab/), and a single sequence of ARL-V from a Stylophora pistillata survey (data available at http://140.109.29.21/~tanglab/).

Coral-macroalgal transects (Figure 1A, Supplementary Table 1): Microbial surveys by Barott et al., 2011 and Barott et al., 2012 obtained 16S ribosomal RNA sequence data using general bacterial primers, and analysed the majority of bacterial and non-ARL plastid sequences. Here we focused on the previously overlooked presence of ARLs in these data and compared their distribution to that of the other, already identified plastid lineages. The two studies sampled coral reefs on the southern coast of Curacao, Caribbean. Four transects were designed to span multiple specimens of Montastraea annularis and the following four algae: Halimeda opuntia, Dictyota bartayresiana, crustose coralline algae, and turf algae. Within each transect, five zones were sampled: coral tissue/surface from the centre of the colony, 5-10 cm away from the coral-algal interface (CAI) (Zone 1); coral tissue/surface immediately adjacent to the CAI (Zone 2); the CAI itself, i.e. the contact zone between coral and algae (Zone 3); algal tissue/surface immediately adjacent to the CAI (Zone 4); and algal tissue/surface 5-10 cm away from the CAI (Zone 5).
Zones 2 and 4 were in principle similar to Zones 1 and 5, respectively; however, their proximity to the point of coral-algal contact (Zone 3) suggested that they might differ in microbial composition. All 20 samples (4 transects x 5 zones) were obtained in 5 replicates. All replicates were independently PCR-amplified and pooled before sequencing. Plastid sequences in the resulting sequence pools (containing mostly bacteria) were identified using BLASTn searches and phylogenies, as described in Barott et al., 2011. In this study, we concentrated on five groups of unicellular microalgae (diatoms, pelagophytes, haptophytes, and unicellular red and green algae), in addition to the newly identified ARLs (ARL-I, ARL-III, ARL-V). We excluded plastid sequences of the following macroalgae, whose representatives were an inherent part of the transects: phaeophytes (Dictyota), florideophytes (crustose coralline algae), and ulvophytes (Halimeda, turf algae). Absolute and relative occurrences of sequences from the selected plastid lineages in all zones are summarized in Supplementary Table 1. To display general trends in plastid abundance between corals and dominant reef macroalgae as a whole, median relative occurrences across all four transects were plotted for all five zones and eight microalgal groups (Figure 1A). Medians rather than means were used to plot these trends in order to mitigate the effect of several outliers in the data (none of the outliers belonged to ARL-V). However, the alternative use of means gave a picture consistent with that obtained using medians (Supplementary Table 2). A reference sample of M. annularis tissue/surface equivalent to Zone 1 (marked ‘WP (ref.)’ in Supplementary Table 1) was taken from the north-western tip of Curacao (approx. 37 km aerial distance) using the same sampling method (Barott et al., 2011).

Seasonal distribution of ARL-V: Chen et al. (2011)
sampled Isopora palifera colonies (three replicates) at the southern tip of Taiwan during February, April, May, June, July, August and November 2008, and obtained microbial 16S ribosomal RNA sequences for all samples. We identified 117 plastid sequences of ARL-V in these data, all of which were derived from the February and April-August samples (none were derived from the November samples).

ARL-V abundance within 7 coral species and reef water (Figure 1B): These data are derived from a study conducted near Bocas del Toro, Panama, Caribbean (Sunagawa et al., 2010). Seven coral species were sampled in five replicates using a hammer and chisel, and coral fragments 1–4 cm² in size were crushed using the same method. Five liters of reef water were collected in the vicinity of the corals and filtered using a 0.22 µm filter. DNA was extracted from equivalent amounts of homogenized coral (~50 mg). For all samples, equivalent amounts of DNA were used for 16S rDNA amplification, and all amplification reactions were run in triplicate before being pooled and sequenced. This parallel approach to sequence generation makes the samples suitable for mutual comparison. We measured the relative sequence occurrence of ARL-V in each sample and plotted the result in Figure 1B.

Note: All supplementary references are covered in the main text.

Supplementary Table 1 Legend: Occurrence of ARLs and other plastids in coral-macroalgal transects. Values were extracted from sequence surveys by Barott et al., 2011 and Barott et al., 2012. Absolute sequence counts and relative sequence occurrence (note: x10^-5 units) for the plastid groups in question (left column) are shown for four coral-algal transects (Hal, Dict, CCA, Turf), five zones within each transect (Z1-Z5), and a reference Z1 site (WP (ref.)). The transects cover natural associations between the reef coral Montastraea annularis and four types of macroalgae: Hal = Halimeda opuntia, Dict = Dictyota bartayresiana, CCA = crustose coralline algae, Turf = turf algae.
Five zones were sampled within each transect as follows: Z1 = coral tissue and surface (T/S) 5-10 cm away from the coral-algal interface (CAI); Z2 = coral T/S immediately adjacent to the CAI; Z3 = coral and algal T/S at the CAI; Z4 = algal T/S immediately adjacent to the CAI; Z5 = algal T/S 5-10 cm away from the CAI. The sample marked WP (ref.) is Montastraea annularis T/S collected at a different site on Curacao, providing a reference observation for Z1 (see the ‘Coral-macroalgal transects’ section above for details). The total number of sequences generated for each transect/zone is given at the bottom of the table (the majority of these are bacterial sequences, which were the focus of the two surveys above).

Supplementary Table 2 Legend: Statistical values for relative occurrence of ARLs and other plastids in coral-algal associations. Four observations for each zone were obtained as described in the ‘Coral-macroalgal transects’ section above. Medians, means, 25th and 75th percentiles, standard deviations, standard errors of the mean, and 95% confidence intervals for medians were calculated.

Supplementary Table 3 Legend: Distribution of all ARL-V sequence tags in coral species. Sequence counts are presented for 20 coral species from three distinct clades. Long reads were retrieved from the NCBI database. 454 pyrosequencing tags were generated in three studies (next three columns). Three dashes (---) indicate absence of sampling for a particular species. The last column indicates the ability of each coral species to form symbiotic associations with zooxanthellae, the dinoflagellates of the genus Symbiodinium.

Supplementary Table 1: Occurrence of ARLs and other plastids in coral-macroalgal transects

                                 Absolute abundance (sequence counts)               Relative abundance (x10^-5)
Group                     Assoc.       Z1      Z2      Z3      Z4      Z5    Total       Z1       Z2       Z3       Z4       Z5    Total
ARL-V                     WP           30    n.a.    n.a.    n.a.    n.a.     n.a.  67.8656     n.a.     n.a.     n.a.     n.a.     n.a.
                          Hal           7      23       4       5       0       39  12.2463   51.321  7.84314  7.52627        0   14.184
                          Dict         10      19       0       0       0       29  30.0039  22.1818        0        0        0   8.0908
                          CCA          45       0       1       0       0       46  47.0997        0  1.41225        0        0  13.2832
                          Turf          5       7       0       0       0       12  7.89378  6.70395        0        0        0  3.69574
                          Total        97      49       5       5       0      156  33.0407  15.3447  2.16273  2.23142        0  11.5676
ARL-I (Vitrella clade)    WP            0    n.a.    n.a.    n.a.    n.a.     n.a.        0     n.a.     n.a.     n.a.     n.a.     n.a.
                          Hal           0       0       0      31      43       74        0        0        0  46.6629  77.4105  26.9132
                          Dict          0       0       7       0       0        7        0        0  11.3638        0        0  1.95295
                          CCA           0       1       1       1       0        3        0  1.18426  1.41225  2.08238        0   0.8663
                          Turf          0       0      16       3      16       35        0        0  33.4861  7.55934  23.0302  10.7792
                          Total         0       1      24      35      59      119        0  0.31316  10.3811  15.6199  21.0394    8.824
ARL-III (Chromera clade)  WP            0    n.a.    n.a.    n.a.    n.a.     n.a.        0     n.a.     n.a.     n.a.     n.a.     n.a.
                          Hal           0       0       0       1       3        4        0        0        0  1.50525  5.40073  1.45477
                          Dict          0       0       1       0       0        1        0        0   1.6234        0        0  0.27899
                          CCA           0       0       0       8       4       12        0        0        0   16.659  8.42336  3.46519
                          Turf          0       0       0       0      18       18        0        0        0        0   25.909  5.54361
                          Total         0       0       1       9      25       35        0        0  0.43255  4.01655  8.91501   2.5953
Diatoms                   WP            2    n.a.    n.a.    n.a.    n.a.     n.a.  4.52438     n.a.     n.a.     n.a.     n.a.     n.a.
                          Hal           0       2       6    8467    3383    11858        0  4.46269  11.7647    12745  6090.23  4312.66
                          Dict         10       3     145     867    1986     3011  30.0039  3.50238  235.393  1239.79   1840.3  840.048
                          CCA           5       4       5      47     245      306   5.2333  4.73704  7.06125  97.8718  515.931  88.3624
                          Turf          1       1      39     140     237      418  1.57876  0.95771  81.6224  352.769  341.135  128.735
                          Total        18      10     195    9521    5851    15595  6.13127  3.13157  84.3466  4249.06  2086.47  1156.39
Haptophytes               WP            0    n.a.    n.a.    n.a.    n.a.     n.a.        0     n.a.     n.a.     n.a.     n.a.     n.a.
                          Hal           0       1       0      12      27       40        0  2.23135        0   18.063  48.6066  14.5477
                          Dict          0       0       1       2      13       16        0        0   1.6234  2.85996  12.0463  4.46389
                          CCA           0       0       0       3       0        3        0        0        0  6.24714        0   0.8663
                          Turf          0       0       1       2      13       16        0        0  2.09288  5.03956   18.712  4.92766
                          Total         0       1       2      19      53       75        0  0.31316  0.86509  8.47938  18.8998  5.56135
Pelagophytes              WP            1    n.a.    n.a.    n.a.    n.a.     n.a.  2.26219     n.a.     n.a.     n.a.     n.a.     n.a.
                          Hal           0       0       0     102     184      286        0        0        0  153.536  331.245  104.016
                          Dict          0       0      13      80      21      114        0        0  21.1042  114.398  19.4594  31.8052
                          CCA           0       7     133     190     359      689        0  8.28981  187.829  395.652  755.996   198.96
                          Turf          0       0      30     275    1149     1454        0        0  62.7865   692.94  1653.86  447.801
                          Total         1       7     176     647    1713     2544  0.34063   2.1921  76.1282  288.745  610.856  188.641
Unicellular red algae     WP            0    n.a.    n.a.    n.a.    n.a.     n.a.        0     n.a.     n.a.     n.a.     n.a.     n.a.
                          Hal           0       0       0      10       0       10        0        0        0  15.0525        0  3.63692
                          Dict          0       0       7       4       0       11        0        0  11.3638  5.71992        0  3.06892
                          CCA           0       0       0       2      12       14        0        0        0  4.16476  25.2701  4.04273
                          Turf          1       0       5     369     119      494  1.57876        0  10.4644  929.799  171.287  152.141
                          Total         1       0      12     385     131      529  0.34063        0  5.19056  171.819  46.7146   39.226
Unicellular green algae   WP            0    n.a.    n.a.    n.a.    n.a.     n.a.        0     n.a.     n.a.     n.a.     n.a.     n.a.
                          Hal          13       0       3      40      45      101  22.7432        0  5.88235  60.2101   81.011  36.7329
                          Dict          0       0      12       0       2       14        0        0  19.4808        0  1.85328   3.9059
                          CCA           0       0       3       0      10       13        0        0  4.23675        0  21.0584  3.75396
                          Turf          0       0       0      34      19       53        0        0        0  85.6725  27.3484  16.3229
                          Total        13       0      18      74      76      181  4.42814        0  7.78584   33.025  27.1016  13.4214
All sequences             WP        44205    n.a.    n.a.    n.a.    n.a.     n.a.     n.a.     n.a.     n.a.     n.a.     n.a.     n.a.
                          Hal       57160   44816   51000   66434   55548   274958     n.a.     n.a.     n.a.     n.a.     n.a.     n.a.
                          Dict      33329   85656   61599   69931  107917   358432     n.a.     n.a.     n.a.     n.a.     n.a.     n.a.
                          CCA       95542   84441   70809   48022   47487   346301     n.a.     n.a.     n.a.     n.a.     n.a.     n.a.
                          Turf      63341  104416   47781   39686   69474   324698     n.a.     n.a.     n.a.     n.a.     n.a.     n.a.
                          Total    293577  319329  231189  224073  280426  1348594     n.a.     n.a.     n.a.     n.a.     n.a.     n.a.

Supplementary Table 2: Statistical values for relative occurrence of ARLs and other plastids in coral-algal associations

                                                                                                               Unicellular  Unicellular
Zone  Statistical values               ARL-V     Vitrella     Chromera      Diatoms  Haptophytes Pelagophytes    red algae  green algae
1     Observations                         4            4            4            4            4            4            4            4
      Minimum                          7.894            0            0            0            0            0            0            0
      25% Percentile                   8.982            0            0       0.4697            0            0            0            0
      Median                           21.13            0            0        3.406            0            0            0            0
      75% Percentile                   42.83            0            0        23.81            0            0        1.209        17.08
      Maximum                           47.1            0            0           30            0            0        1.579        22.74
      Mean                             24.31            0            0        9.229            0            0       0.4697        5.761
      Standard Deviation               17.95            0            0        14.02            0            0       0.7394        11.32
      Standard Error of Mean           8.976            0            0        7.008            0            0       0.3697        5.661
      Lower 95% CI of Median           7.894            0            0            0            0            0            0            0
      Upper 95% CI of Median            47.1            0            0           30            0            0        1.579        22.74
2     Observations                         4            4            4            4            4            4            4            4
      Minimum                              0            0            0       0.9577            0            0            0            0
      25% Percentile                   1.751            0            0        1.594            0            0            0            0
      Median                           14.44            0            0        3.983            0            0            0            0
      75% Percentile                   44.04       0.9132            0        4.668        1.699        6.242            0            0
      Maximum                          51.32        1.184            0        4.737        2.231         8.29            0            0
      Mean                             20.08       0.3711            0        3.415       0.6328        2.147            0            0
      Standard Deviation               22.79       0.5421            0        1.722        1.066        4.095            0            0
      Standard Error of Mean            11.4       0.2711            0       0.8608       0.5328        2.047            0            0
      Lower 95% CI of Median               0            0            0       0.9577            0            0            0            0
      Upper 95% CI of Median           51.32        1.184            0        4.737        2.231         8.29            0            0
3     Observations                         4            4            4            4            4            4            4            4
      Minimum                              0            0            0        7.061            0            0            0            0
      25% Percentile                       0       0.4281            0        8.237            0        5.351            0        1.134
      Median                          0.7561        6.388            0        46.69       0.8617        41.95        5.282         5.06
      75% Percentile                   6.235        27.96        1.243          197        1.976        156.6        11.14        16.08
      Maximum                          7.843        33.49        1.623        235.4        2.093        187.8        11.36        19.48
      Mean                             2.364        11.59       0.4809        83.96       0.9791        67.95        5.507        7.425
      Standard Deviation               3.705        15.44       0.7617        106.6        1.033        84.06        6.254        8.397
      Standard Error of Mean           1.852         7.72       0.3809        53.28       0.5165        42.03        3.127        4.199
      Lower 95% CI of Median               0            0            0        7.061            0            0            0            0
      Upper 95% CI of Median           7.843        33.49        1.623        235.4        2.093        187.8        11.36        19.48
4     Observations                         4            4            4            4            4            4            4            4
      Minimum                              0            0            0        97.87         2.86        114.4        4.165            0
      25% Percentile                       0       0.5956            0        161.6        3.405        124.2        4.554            0
      Median                               0        4.821       0.8026        796.3        5.643        274.6        10.39          306
      75% Percentile                    5.67        36.89        12.87         9869        15.11        618.6        701.1        79.31
      Maximum                          7.526        46.66        16.66        12745        18.06        692.9        929.8        85.67
      Mean                             1.957         14.1        4.591         3609        8.052        339.1        238.7        36.52
      Standard Deviation               3.713        21.94        8.073         6110        6.819        266.7        460.8        43.32
      Standard Error of Mean           1.857        10.97        4.036         3055         3.41        133.3        230.4        21.66
      Lower 95% CI of Median               0            0            0        97.87         2.86        114.4        4.165            0
      Upper 95% CI of Median           7.526        46.66        16.66        12745        18.06        692.9        929.8        85.67
5     Observations                         4            4            4            4            4            4            4            4
      Minimum                              0            0            0        341.1            0        19.46            0        1.853
      25% Percentile                       0            0        1.425        384.8        3.087        97.41            0        6.655
      Median                               0        11.57        6.912         1178        15.38        543.6        12.69         24.2
      75% Percentile                       0        63.82        21.54         5028        41.13         1429        134.8         67.6
      Maximum                              0        77.41        25.91         6090        48.61         1654        171.3        81.01
      Mean                                 0        25.16        9.958         2197        19.87          690        49.19        32.82
      Standard Deviation                   0        36.47        11.18         2680        20.65        709.9        82.26        33.91
      Standard Error of Mean               0        18.24        5.588         1340        10.32        354.9        41.13        16.95
      Lower 95% CI of Median               0            0            0        341.1            0        19.46            0        1.853
      Upper 95% CI of Median               0        77.41        25.91         6090        48.61         1654        171.3        81.01

Supplementary Table 3: Distribution of all ARL-V sequence tags in coral species

Coral group                                Species                     Long reads  Sunagawa et  Barott et al.,  Chen et al.,      Total  Zooxanthellate coral
                                                                           (NCBI)    al., 2010      2011, 2012          2011  sequences  (hosts Symbiodinium)?
Hexacorals, Scleractinians, Robust clade   Diploria strigosa                  ---            2             ---           ---          2  YES
                                           Favites sp.                         21          ---             ---           ---         21  YES
                                           Fungia scutaria                      2          ---             ---           ---          2  YES
                                           Montastraea annularis               67          ---             156           ---        223  YES
                                           Montastraea faveolata                1           17             ---           ---         18  YES
                                           Montastraea franksi                  5          118             ---           ---        123  YES
                                           Mussismilia braziliensis             2          ---             ---           ---          2  YES
                                           Pocillopora meandrina                1          ---             ---           ---          1  YES
                                           Stylophora pistillata                1          ---             ---           ---          1  YES
Hexacorals, Scleractinians, Complex clade  Acropora cervicornis               ---           10             ---           ---         10  YES
                                           Acropora palmata                   ---           21             ---           ---         21  YES
                                           Galaxea fascicularis                 3          ---             ---           ---          3  YES
                                           Isopora palifera                   ---          ---             ---           117        117  YES
                                           Pavona cactus                        1          ---             ---           ---          1  YES
                                           Porites astreoides                   5           11             ---           ---         16  YES
                                           Porites compressa                    1          ---             ---           ---          1  YES
                                           Porites cylindrica                   1          ---             ---           ---          1  YES
                                           Porites lobata                       1          ---             ---           ---          1  YES
                                           Porites lutea                        3          ---             ---           ---          3  YES
Octocorals, Gorgonians                     Gorgonia ventalina                  21          536             ---           ---        557  YES
3 distinct coral lineages                  20 coral species                   136          715             156           117       1124  Only found in zooxanthellate corals
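
As a worked illustration of the arithmetic behind Supplementary Tables 1 and 2, the sketch below recomputes the relative occurrence of ARL-V (sequence count divided by the total number of sequences from the same transect and zone, expressed in units of 10^-5) and summarizes each zone by the median and mean across the four transects. The counts are copied from Supplementary Table 1. The function and variable names are illustrative assumptions; this is not the original analysis pipeline, which is not described at this level of detail.

```python
from statistics import mean, median

# ARL-V sequence counts per zone (Z1..Z5), copied from Supplementary Table 1.
ARLV_COUNTS = {
    "Hal":  [7, 23, 4, 5, 0],
    "Dict": [10, 19, 0, 0, 0],
    "CCA":  [45, 0, 1, 0, 0],
    "Turf": [5, 7, 0, 0, 0],
}

# Total number of sequences (all taxa) per transect and zone, same table.
TOTAL_COUNTS = {
    "Hal":  [57160, 44816, 51000, 66434, 55548],
    "Dict": [33329, 85656, 61599, 69931, 107917],
    "CCA":  [95542, 84441, 70809, 48022, 47487],
    "Turf": [63341, 104416, 47781, 39686, 69474],
}


def relative_occurrence(count, total):
    """Return count/total expressed in units of 10^-5."""
    return count / total * 1e5


def zone_summary(zone_index):
    """Return (median, mean) of ARL-V relative occurrence across the four transects."""
    values = [
        relative_occurrence(ARLV_COUNTS[t][zone_index], TOTAL_COUNTS[t][zone_index])
        for t in ARLV_COUNTS
    ]
    return median(values), mean(values)


if __name__ == "__main__":
    for z in range(5):
        med, avg = zone_summary(z)
        print(f"Zone {z + 1}: median = {med:.2f}, mean = {avg:.2f}")
```

For Zone 1 this gives a median of about 21.1 and a mean of about 24.3 (x10^-5), in line with the ARL-V column of Supplementary Table 2. The reference WP sample is left out of the summary because it is not part of the four transects.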