tpj12944-sup-0018-Legends

SUPPORTING INFORMATION FIGURE LEGENDS
Table S1. Prediction of cyclic gene expression. The number of cyclic genes predicted by
COSPOT, the DFT, and the combination of both methods, as well as the mean period and
phase peaks for each set of predictions. A gene was called “cyclic” if its expression
vector had a COSPOT p-value < 0.02 or a cyclic score > 0.800, which is equivalent to a
p-value of 0.02 in a population of randomized expression vectors derived from the N.
oceanica CCMP1779 FPKM data set.
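The calling rule in the Table S1 legend can be written out as a short sketch. The function and argument names are hypothetical, and treating the "combination of both methods" as the union of the two calls is an assumption based on the "or" in the legend. Only the two cutoffs come from the legend itself.

```python
# Cutoffs stated in the Table S1 legend.
COSPOT_P_CUTOFF = 0.02
DFT_SCORE_CUTOFF = 0.800


def is_cyclic(cospot_p, dft_score):
    """Return True if either method calls the gene cyclic (hypothetical helper)."""
    return cospot_p < COSPOT_P_CUTOFF or dft_score > DFT_SCORE_CUTOFF


def count_calls(genes):
    """Count calls per method for an iterable of (cospot_p, dft_score) pairs."""
    genes = list(genes)
    return {
        "cospot": sum(p < COSPOT_P_CUTOFF for p, _ in genes),
        "dft": sum(s > DFT_SCORE_CUTOFF for _, s in genes),
        # Assumption: "combination" means a gene called by at least one method.
        "combined": sum(is_cyclic(p, s) for p, s in genes),
    }
```

Both comparisons are strict, so a gene sitting exactly on a cutoff is not called cyclic.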
Table S2. Primers used for RT-qPCR.
Figure S1. Confirmation of RNA-seq measurements by reverse transcription quantitative
PCR. Cells were grown in light/dark cycles and collected at the indicated times. RNA levels
determined by RT-qPCR were normalized against the elongation factor (10181) reference
gene. RNA-seq and RT-qPCR expression values were then normalized between 0 and 1 and
represent the mean of 2 (RNA-seq) or 2-3 (RT-qPCR) biological replicates plotted with SEM
or range. LHC1, LIGHT HARVESTING COMPLEX I (8367); α-TUBULIN (4716); CS,
CELLULOSE SYNTHASE (5780); KAS3, β-KETOACYL-ACP SYNTHASE (2094); DGAT5,
DIACYLGLYCEROL ACYLTRANSFERASE 5 (3915); ω3, ω3 DESATURASE (6416).
Shaded areas represent dark periods.
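The legend does not say how values were "normalized between 0 and 1". A minimal sketch, assuming ordinary min-max scaling of each gene's time series, is shown below. The function name and the handling of a flat profile are illustrative choices, not taken from the source.

```python
def min_max_normalize(values):
    """Rescale a series so its minimum maps to 0 and its maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: a flat profile has no dynamic range, so return zeros.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]
```

Scaling each series separately puts RNA-seq and RT-qPCR profiles on a common axis, so their shapes can be compared even though their raw units differ.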
Figure S2. Phylogenetic analysis of N. oceanica CCMP1779 CDK-related proteins. The
phylogeny was inferred using the Maximum Likelihood method based on the General
Reverse Transcriptase + Freq. model (Dimmic et al., 2002). Branch bootstrap values (500
replicates) are indicated. Initial tree(s) for the heuristic search were obtained by applying
the Neighbor-Joining method to a matrix of pairwise distances estimated using a JTT
model (Jones et al., 1992). The branch lengths are proportional to the number of
substitutions per site. All positions with less than 95% site coverage were eliminated.
Evolutionary analyses were conducted in MEGA5 (Tamura et al., 2011). The ID numbers
for the diatom P. tricornutum (Pt) proteins are according to the annotated genomes at
JGI-release 2 as described in Huysman et al. (2010). The names and ID numbers for the
E. siliculosus (Es) proteins are from Bothwell et al. (2010). Arabidopsis thaliana (At),
Oryza sativa (Os) and Homo sapiens (Hs) sequences were retrieved from UniProt.
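The 95% site-coverage filter used for these phylogenies can be illustrated with a small sketch. This is not MEGA5's implementation. The function name is hypothetical, and treating "-" and "?" as missing data is an assumption.

```python
def filter_by_site_coverage(alignment, min_coverage=0.95, gap_chars="-?"):
    """Keep alignment columns where at least min_coverage of sequences have a residue.

    alignment is a list of equal-length aligned sequences (strings).
    """
    n_seqs = len(alignment)
    n_cols = len(alignment[0])
    keep = [
        col
        for col in range(n_cols)
        if sum(seq[col] not in gap_chars for seq in alignment) / n_seqs >= min_coverage
    ]
    return ["".join(seq[col] for col in keep) for seq in alignment]
```

With only a few sequences, a single gap pushes a column below 95% coverage, so the filter then behaves like complete deletion.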
Figure S3. Phylogenetic analysis of N. oceanica CCMP1779 cyclin-related proteins. The
phylogeny was inferred using the Maximum Likelihood method based on the Whelan
and Goldman model (Whelan and Goldman, 2001). Initial tree(s) were generated and
alignment positions filtered as in Figure S2. Branches with bootstrap support values (500
replicates) higher than 50% are indicated by (*). The branch lengths are
proportional to the number of substitutions per site. All positions with less than 95% site
coverage were eliminated. Evolutionary analyses were conducted in MEGA5 (Tamura et
al., 2011). The names and ID numbers were obtained from the same sources as detailed
in Figure S2.
Figure S4. Time-lapse images of an N. oceanica CCMP1779 cell undergoing division into
four daughter cells. Images were acquired by bright-field microscopy using a Leica DMRA2
microscope and a 100x objective. Scale bar = 3 µm. * Non-dividing cell.
Figure S5. Phylogenetic analysis of N. oceanica CCMP1779 SLC4-related proteins. The
phylogeny was inferred as in Figure S2. Branch bootstrap values (500 replicates) are
indicated. Initial tree(s) for the heuristic search were obtained automatically by applying
Neighbor-Join and BioNJ algorithms to a matrix of pairwise distances estimated using a
JTT model, and then selecting the topology with superior log likelihood value. A discrete
Gamma distribution was used to model evolutionary rate differences among sites. The
rate variation model allowed for some sites to be evolutionarily invariable. The branch
lengths are proportional to the number of substitutions per site. All positions with less
than 95% site coverage were eliminated. Evolutionary analyses were conducted in
MEGA5 (Tamura et al., 2011). Identifiers are from GenBank with the exception of N.
oceanica CCMP1779 sequences.
Figure S6. Phylogenetic analysis of N. oceanica CCMP1779 malic enzyme-related
proteins. The phylogeny was inferred as in Figure S2. Branch bootstrap values (500
replicates) are indicated. The search for the initial trees and the modeling of evolutionary
rate differences were performed as described in Figure S5. The branch lengths are
proportional to the number of substitutions per site. All positions with less than 95% site
coverage were eliminated. Evolutionary analyses were conducted in MEGA5 (Tamura et
al., 2011). Protein identifiers are from UniProt or NCBI gi sequence identifiers with the
exception of N. oceanica CCMP1779 sequences.
Figure S7. Heatmap displaying relative expression levels of genes encoding putative
glucosyl transferases and glucosyl hydrolases. Expression values were analyzed as
described in Figure 4. Row descriptions indicate gene IDs. CS, cellulose synthase; GT48,
glucosyl transferase 48 family gene.
Figure S8. Phylogenetic analysis of the N. oceanica CCMP1779 1,3-β-glucan
synthase-related protein. The phylogeny was inferred as described for Figure S2. Branch bootstrap
values (500 replicates) are indicated. The search for the initial trees and the modeling of
evolutionary rate differences were performed as described in Figure S5. The branch
lengths are proportional to the number of substitutions per site. All positions with less
than 95% site coverage were eliminated. Evolutionary analyses were conducted in
MEGA5 (Tamura et al., 2011). Protein identifiers are NCBI gi sequence identifiers or
UniProt accession numbers for the Ectocarpus sequences with the exception of N.
oceanica CCMP1779 sequences.
Figure S9. Expression of genes potentially involved in the mannitol cycle. FPKM values
were normalized between 0 and 1. MPDH, mannitol 1-phosphate dehydrogenase; MPP,
mannitol 1-phosphate phosphatase; M2DH, mannitol 2-dehydrogenase; FK, fructokinase.
Numbers indicate the digits of the gene IDs.
Figure S10. Phylogenetic analysis of the N. oceanica CCMP1779 type I FAS-like genes.
The phylogeny was inferred as in Figure S2. Branch bootstrap values (500 replicates) are
indicated. The search for the initial trees and the modeling of evolutionary rate
differences were performed as described in Figure S5. The tree is drawn to scale, with
branch lengths measured in the number of substitutions per site. All positions with less
than 95% site coverage were eliminated. Evolutionary analyses were conducted in
MEGA5 (Tamura et al., 2011). Identifiers are from GenBank with the exception of N.
oceanica CCMP1779 sequences, and the Coccomyxa sequence (JGI Phytozome
Coccomyxa subellipsoidea C-169 v2.0). PKS, polyketide synthases; FAS, type I fatty
acid synthases; Euk, eukaryotes; Prok, prokaryotes.
Figure S11. Heatmap displaying relative expression levels of genes involved in lipid
degradation. Expression values were analyzed as described in Figure 4. Labels indicate
gene IDs. * indicates genes potentially involved in β-oxidation. Expression values can be
found in Data S2.
Figure S12. Heatmap displaying relative expression levels of genes involved in
chromatin modification. Expression values were analyzed as described in Figure 4.
Labels indicate gene IDs; descriptions indicate conserved domains identified using
HMMER (Eddy, 2011).
Figure S13. The relationship between the cyclic score derived from the DFT and the
negative log-transformed p-value from COSPOT. A power-law trendline is plotted
against the data (equation in upper left-hand corner). The results of the parallel analysis in
C. reinhardtii show a similar power-law relation ($y = 2.4871\,x^{2.1549}$) between COSPOT
and the DFT with $R^2 > 0.70$.
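A power-law trendline of the form y = a·x^b is commonly fitted by ordinary least squares on log-transformed data. The sketch below assumes that approach; the legend does not state which fitting procedure was used or which quantity is on which axis. The function name is hypothetical, and all inputs must be positive.

```python
import math


def fit_power_law(xs, ys):
    """Fit y = a * x**b by least squares on (log x, log y); return (a, b, r_squared)."""
    log_x = [math.log(x) for x in xs]
    log_y = [math.log(y) for y in ys]
    n = len(xs)
    mean_x = sum(log_x) / n
    mean_y = sum(log_y) / n
    sxx = sum((lx - mean_x) ** 2 for lx in log_x)
    syy = sum((ly - mean_y) ** 2 for ly in log_y)
    sxy = sum((lx - mean_x) * (ly - mean_y) for lx, ly in zip(log_x, log_y))
    b = sxy / sxx                      # slope in log-log space = exponent
    a = math.exp(mean_y - b * mean_x)  # intercept in log-log space = log(a)
    r_squared = sxy ** 2 / (sxx * syy)  # R^2 of the log-log regression
    return a, b, r_squared
```

The R² returned here describes the fit in log-log space, which need not equal an R² computed on the untransformed values.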
Data S1. Expression of N. oceanica CCMP1779 under light/dark cycles and prediction of
cyclic gene expression. FPKM values (Fragments Per Kilobase of transcript per Million
mapped reads) for each sample collected at the indicated times determined by RNA-seq.
ZT, Zeitgeber time (time after lights on). Each sample corresponds to an independent culture.
COSPOT and an application of the discrete Fourier transform (DFT) were used for the
identification of cycling genes (see Experimental Procedures).
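For reference, the FPKM unit defined above follows from its name: fragment count divided by transcript length in kilobases and by library size in millions of mapped fragments. The sketch below shows that arithmetic only. The function and parameter names are hypothetical, and it is not the quantification pipeline used to produce Data S1.

```python
def fpkm(fragments, transcript_length_bp, total_mapped_fragments):
    """Fragments Per Kilobase of transcript per Million mapped fragments."""
    kilobases = transcript_length_bp / 1_000
    millions = total_mapped_fragments / 1_000_000
    return fragments / (kilobases * millions)
```

For example, 500 fragments on a 2,000 bp transcript in a library of 10 million mapped fragments gives 500 / (2 × 10) = 25 FPKM.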
Data S2. Expression patterns and estimation of subcellular localization of manually
annotated genes. Manual annotation and subcellular localization analyses were performed
as detailed in the Experimental Procedures. COSPOT and an application of the discrete
Fourier transform (DFT) were used for the computational identification of cycling genes.
REFERENCES
Dimmic, M.W., Rest, J.S., Mindell, D.P. and Goldstein, R.A. (2002) rtREV: an amino
acid substitution matrix for inference of retrovirus and reverse transcriptase
phylogeny. J Mol Evol, 55, 65-73.
Eddy, S.R. (2011) Accelerated Profile HMM Searches. PLoS Comput Biol, 7.
Jones, D.T., Taylor, W.R. and Thornton, J.M. (1992) The rapid generation of
mutation data matrices from protein sequences. Comput Appl Biosci, 8,
275-282.
Tamura, K., Peterson, D., Peterson, N., Stecher, G., Nei, M. and Kumar, S. (2011)
MEGA5: Molecular Evolutionary Genetics Analysis Using Maximum
Likelihood, Evolutionary Distance, and Maximum Parsimony Methods. Mol
Biol Evol, 28, 2731-2739.
Whelan, S. and Goldman, N. (2001) A general empirical model of protein evolution
derived from multiple protein families using a maximum-likelihood
approach. Mol Biol Evol, 18, 691-699.