SUPPORTING INFORMATION FIGURE LEGENDS

Table S1. Prediction of cyclic gene expression. The number of cyclic genes predicted by COSPOT, the DFT, and the combination of both methods, as well as the period mean and phase peaks for each set of predictions. A gene was called “cyclic” if its expression vector had a COSPOT p-value < 0.02 or a cyclic score > 0.800, which is equivalent to a p-value of 0.02 in a population of randomized expression vectors derived from the N. oceanica CCMP1779 FPKM data set.

Table S2. Primers used for RT-qPCR.

Figure S1. Confirmation of RNA-seq measurements by reverse transcription quantitative PCR. Cells were grown in light/dark cycles and collected at the indicated times. RNA levels determined by RT-qPCR were normalized against the elongation factor (10181) reference gene. RNA-seq and RT-qPCR expression values were then normalized between 0 and 1 and represent the mean of 2 (RNA-seq) or 2-3 (RT-qPCR) biological replicates, plotted with SEM or range. LHC1, LIGHT HARVESTING COMPLEX I (8367); α-TUBULIN (4716); CS, CELLULOSE SYNTHASE (5780); KAS3, β-KETOACYL-ACP SYNTHASE (2094); DGAT5, DIACYLGLYCEROL ACYLTRANSFERASE 5 (3915); ω3, ω3 DESATURASE (6416). Shaded areas represent dark periods.

Figure S2. Phylogenetic analysis of N. oceanica CCMP1779 CDK-related proteins. The phylogeny was inferred using the Maximum Likelihood method based on the General Reverse Transcriptase + Freq. model (Dimmic et al., 2002). Branch bootstrap values (500 replicates) are indicated. Initial tree(s) for the heuristic search were obtained by applying the Neighbor-Joining method to a matrix of pairwise distances estimated using a JTT model (Jones et al., 1992). The branch lengths are proportional to the number of substitutions per site. All positions with less than 95% site coverage were eliminated. Evolutionary analyses were conducted in MEGA5 (Tamura et al., 2011). The ID numbers for the diatom P.
tricornutum (Pt) proteins are according to the annotated genomes at JGI-release 2 as described in Huysman et al. (2010). The names and ID numbers for the E. siliculosus (Es) proteins are from Bothwell et al. (2010). Arabidopsis thaliana (At), Oryza sativa (Os) and Homo sapiens (Hs) sequences were retrieved from UniProt.

Figure S3. Phylogenetic analysis of N. oceanica CCMP1779 cyclin-related proteins. The phylogeny was inferred using the Maximum Likelihood method based on the Whelan and Goldman model (Whelan and Goldman, 2001). Initial tree(s) were generated and alignment positions filtered as in Figure S1. Branches with bootstrap support values (500 replicates) higher than 50% are indicated by (*). The branch lengths are proportional to the number of substitutions per site. All positions with less than 95% site coverage were eliminated. Evolutionary analyses were conducted in MEGA5 (Tamura et al., 2011). The names and ID numbers were obtained from the same sources as detailed in Figure S1.

Figure S4. Time-lapse images of a N. oceanica CCMP1779 cell undergoing division into four daughter cells. Bright-field microscopy was performed using a Leica DMRA2 microscope and a 100x objective. Scale bar = 3 µm. * Non-dividing cell.

Figure S5. Phylogenetic analysis of N. oceanica CCMP1779 SLC4-related proteins. The phylogeny was inferred as in Figure S1. Branch bootstrap values (500 replicates) are indicated. Initial tree(s) for the heuristic search were obtained automatically by applying Neighbor-Join and BioNJ algorithms to a matrix of pairwise distances estimated using a JTT model, and then selecting the topology with the superior log likelihood value. A discrete Gamma distribution was used to model evolutionary rate differences among sites. The rate variation model allowed for some sites to be evolutionarily invariable. The branch lengths are proportional to the number of substitutions per site. All positions with less than 95% site coverage were eliminated.
Evolutionary analyses were conducted in MEGA5 (Tamura et al., 2011). Identifiers are from GenBank with the exception of N. oceanica CCMP1779 sequences.

Figure S6. Phylogenetic analysis of N. oceanica CCMP1779 malic enzyme-related proteins. The phylogeny was inferred as in Figure S2. Branch bootstrap values (500 replicates) are indicated. The search for the initial trees and the modeling of evolutionary rate differences were performed as described in Figure S4. The branch lengths are proportional to the number of substitutions per site. All positions with less than 95% site coverage were eliminated. Evolutionary analyses were conducted in MEGA5 (Tamura et al., 2011). Protein identifiers are UniProt accession numbers or NCBI gi sequence identifiers, with the exception of N. oceanica CCMP1779 sequences.

Figure S7. Heatmap displaying relative expression levels of genes encoding putative glucosyl transferases and glucosyl hydrolases. Expression values were analyzed as described in Figure 4. Row descriptions indicate gene IDs. CS, cellulose synthase; GT48, glucosyl transferase 48 family gene.

Figure S8. Phylogenetic analysis of the N. oceanica CCMP1779 1,3-β-glucan synthase-related protein. The phylogeny was inferred as described for Figure S1. Branch bootstrap values (500 replicates) are indicated. The search for the initial trees and the modeling of evolutionary rate differences were performed as described in Figure S4. The branch lengths are proportional to the number of substitutions per site. All positions with less than 95% site coverage were eliminated. Evolutionary analyses were conducted in MEGA5 (Tamura et al., 2011). Protein identifiers are NCBI gi sequence identifiers, or UniProt accession numbers for the Ectocarpus sequences, with the exception of N. oceanica CCMP1779 sequences.

Figure S9. Expression of genes potentially involved in the mannitol cycle. FPKM values were normalized between 0 and 1.
MPDH, mannitol 1-phosphate dehydrogenase; MPP, mannitol 1-phosphate phosphatase; M2DH, mannitol 2-dehydrogenase; FK, fructokinase. Numbers indicate the digits of the gene IDs.

Figure S10. Phylogenetic analysis of the N. oceanica CCMP1779 type I FAS-like genes. The phylogeny was inferred as in Figure S2. Branch bootstrap values (500 replicates) are indicated. The search for the initial trees and the modeling of evolutionary rate differences were performed as described in Figure S4. The tree is drawn to scale, with branch lengths measured in the number of substitutions per site. All positions with less than 95% site coverage were eliminated. Evolutionary analyses were conducted in MEGA5 (Tamura et al., 2011). Identifiers are from GenBank with the exception of N. oceanica CCMP1779 sequences and the Coccomyxa sequence (JGI Phytozome Coccomyxa subellipsoidea C-169 v2.0). PKS, polyketide synthases; FAS, type I fatty acid synthases; Euk, eukaryotes; Prok, prokaryotes.

Figure S11. Heatmap displaying relative expression levels of genes involved in lipid degradation. Expression values were analyzed as described in Figure 4. Labels indicate gene IDs. * indicates genes potentially involved in β-oxidation. Expression values can be found in Data S2.

Figure S12. Heatmap displaying relative expression levels of genes involved in chromatin modification. Expression values were analyzed as described in Figure 4. Labels indicate gene IDs; descriptions indicate conserved domains identified using HMMER (Eddy, 2011).

Figure S13. The relationship between the cyclic score derived from the DFT and the negative log-transformed p-value from COSPOT. A power-law trendline is plotted against the data (equation in the upper left-hand corner). The results of the parallel analysis in C. reinhardtii show a similar power-law relation (y = 2.4871x^2.1549) between COSPOT and the DFT, with R^2 > 0.70.

Data S1. Expression of N. oceanica CCMP1779 under light/dark cycles and prediction of cyclic gene expression.
FPKM values (Fragments Per Kilobase of transcript per Million mapped reads) for each sample collected at the indicated times, as determined by RNA-seq. ZT, Zeitgeber time (time after lights on). Each sample corresponds to an independent culture. COSPOT and an application of the discrete Fourier transform (DFT) were used for the identification of cycling genes (see Experimental Procedures).

Data S2. Expression patterns and estimation of subcellular localization of manually annotated genes. Manual annotation and subcellular localization analyses were performed as detailed in the Experimental Procedures. COSPOT and an application of the discrete Fourier transform (DFT) were used for the computational identification of cycling genes.

REFERENCES

Dimmic, M.W., Rest, J.S., Mindell, D.P. and Goldstein, R.A. (2002) rtREV: an amino acid substitution matrix for inference of retrovirus and reverse transcriptase phylogeny. J Mol Evol, 55, 65-73.
Eddy, S.R. (2011) Accelerated profile HMM searches. PLoS Comput Biol, 7.
Jones, D.T., Taylor, W.R. and Thornton, J.M. (1992) The rapid generation of mutation data matrices from protein sequences. Comput Appl Biosci, 8, 275-282.
Tamura, K., Peterson, D., Peterson, N., Stecher, G., Nei, M. and Kumar, S. (2011) MEGA5: molecular evolutionary genetics analysis using maximum likelihood, evolutionary distance, and maximum parsimony methods. Mol Biol Evol, 28, 2731-2739.
Whelan, S. and Goldman, N. (2001) A general empirical model of protein evolution derived from multiple protein families using a maximum-likelihood approach. Mol Biol Evol, 18, 691-699.