tpj12726-sup-0021-Legends

SUPPORTING INFORMATION LEGENDS
SUPPORTING FIGURES
Figure S1. Validation of selected methylcytosine sites by PCR experiments.
Each graph shows the results for a particular region of one gene. The upper panel illustrates
the distribution of reads supporting methyl-cytosines in meristems (M) and late flowers (L), and
the bar plot in the lower panel shows the real-time PCR results for endonuclease-digested or control
materials from meristems or late flowers. Error bars represent the standard deviation of three replicates.
Endonucleases LpnPI and FspEI were used for comparison with MspJI. The scale atop each graph
indicates the genomic coordinates (in Mb) of each region on the respective chromosome. Two
different regions, AT5G24670-A and AT5G24670-B, are shown for the gene AT5G24670.
Figure S2. Methylation profiles determined by MspJI-seq and BS-seq were consistent for most
of the randomly selected genes and genomic regions.
In each graph, the top figures display the log2-transformed read numbers supporting mC at each
nucleotide position of the respective gene or genomic region, while the bottom figures show the
methylation levels (in %) for each nucleotide position of the gene or genomic region. Subfigures
1 to 44 are for genes, and subfigures 45 to 73 are for genomic regions. Gene IDs or chromosome
coordinates are given at the top. M, meristem; E, early flower (stage 1-9); L, late flower (stage 10-12).
Figure S3. Density of genes and TEs across the Arabidopsis genome.
The density of genes/TEs was calculated as the percentage of nucleotides in genes or in TEs within
each 200-kb window of each chromosome. (a-e) Chromosomes 1-5.
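The windowed density described for Figure S3 can be sketched as follows. This is a minimal illustration, not the authors' script: the function name `window_density`, the 0-based half-open interval convention, and the choice to merge overlapping features so that shared bases are counted once are all assumptions made here.

```python
def window_density(features, chrom_length, window=200_000):
    """Percentage of nucleotides covered by features in each window.

    features: iterable of (start, end) intervals, 0-based and half-open
    (an assumed convention). Overlapping features are merged first so
    that no base is counted twice (also an assumption).
    """
    n_windows = (chrom_length + window - 1) // window
    covered = [0] * n_windows

    # Merge overlapping or touching intervals.
    merged = []
    for start, end in sorted(features):
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])

    # Distribute each merged interval over the windows it spans.
    for start, end in merged:
        pos = max(start, 0)
        end = min(end, chrom_length)
        while pos < end:
            w = pos // window
            stop = min((w + 1) * window, end)
            covered[w] += stop - pos
            pos = stop

    # The last window may be shorter than the others.
    densities = []
    for i, n_covered in enumerate(covered):
        w_len = min(window, chrom_length - i * window)
        densities.append(100.0 * n_covered / w_len)
    return densities


# Toy example with a 400-bp "chromosome" and 200-bp windows.
toy = window_density([(0, 100), (50, 150), (190, 210)], 400, window=200)
```

Running the function once on gene intervals and once on TE intervals for each chromosome would give the two density tracks of the kind plotted in panels (a-e).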
Figure S4. Percentages of genes of each class among all methylated genes.
Figure S5. Methylation of TEs during Arabidopsis floral development.
(a) Percentage of TEs/genes differentially methylated between meristem and early flower. (b)
Percentage of differentially methylated TEs and genes targeted by siRNAs. (c) Methylated (in
meristem) and differentially methylated TEs (between meristem and early flower) of each TE
family.
Figure S6. The enrichment of TEs of different families in different mC sequence contexts.
The color gradient shows the statistical significance of the enrichment of TEs of each family in each
mC context class, as indicated by the color bar atop the heatmap.
Figure S7. The distribution of normalized methylation levels of each mC context for genes of
different classes.
RKCM values were calculated as reads per kilo-base of cytosines in the context of CNNR (each
site counts as 1 bp) per million of mapped reads, for each mC context and type of genes
separately.
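The RKCM normalization stated in the Figure S7 legend can be written as a short calculation. The sketch below only follows the wording of the legend. The function names `count_cnnr_sites` and `rkcm` are hypothetical, and scanning a single strand is a simplification assumed here (the reverse strand would need the reverse complement).

```python
import re


def count_cnnr_sites(seq):
    """Count cytosines in a CNNR context on the given strand only.

    R is A or G and N is any base. A lookahead is used so that
    overlapping sites are all counted; each site counts as 1 bp.
    """
    return len(re.findall(r"(?=C[ACGT]{2}[AG])", seq.upper()))


def rkcm(mc_reads, cnnr_sites, mapped_reads):
    """Reads per kilo-base of CNNR cytosines per million mapped reads."""
    if cnnr_sites <= 0 or mapped_reads <= 0:
        raise ValueError("cnnr_sites and mapped_reads must be positive")
    # reads / (sites / 1e3) / (mapped / 1e6) == reads * 1e9 / (sites * mapped)
    return mc_reads * 1e9 / (cnnr_sites * mapped_reads)


# Toy numbers: 50 supporting reads, 2000 CNNR sites, 10 million mapped reads.
example_value = rkcm(50, 2000, 10_000_000)
```

Per the legend, the value would be computed separately for each mC context and each gene type, so the read and site counts passed in would be restricted to that context and class.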
Figure S8. Examples of genes with correlated variations in methylation and expression levels.
In each graph, the MspJI-seq and RNA-seq tracks are shown for each gene, encompassing the
transcribed region as well as the 1-kb upstream and downstream regions, at each developmental
stage separately. Gene structures are shown at the bottom of each graph, with blue boxes
representing exons and arrows indicating introns and the transcription direction of the respective
gene. The methylation and expression patterns of VIM1 and VIM2 were similar to those of VIM3
and are not shown here. M, meristem; E, early flower stage; L, late flower stage.
Figure S9. Phenotypes of the three Arabidopsis floral stages used for experiments.
(a) Clusters of meristems of the ap1cal mutant plant. The orange dashed lines encircle the parts
collected for this study. (b-c) Shown are individual inflorescences of the Landsberg
erecta ecotype. Late flowers (stage 10-12, usually 7-8) are labeled with numbers and surround the
early flowers (stage 1-9, smaller and without labels). Bars = 200 μm.
Figure S10. MspJI digestion and DNA library recovery.
(a) Optimization of MspJI digestion conditions. 5 μg (lanes 2-4), 2.5 μg (lanes 5-7), or 0 μg (lane
8) of MspJI were added to 100 μl of digestion mix with 5 μg of genomic DNA in the presence (lanes 3,
4, 6, and 7) or absence (lanes 2 and 5) of a DNA activator. The red arrows indicate the ~32 bp band of
MspJI-digested DNA fragments. The blue arrow indicates the DNA activator bands. (b) Purified
DNA library visualized by 4% agarose gel electrophoresis. The red arrow indicates the 100 bp
band that we recovered and used for sequencing.
Figure S11. Identification of mCs based on MspJI-seq.
(a-f) The six scenarios where MspJI recognized a pair of methyl-cytosines and cleaved the
double-stranded DNA segments into fragments of proper length that were collected. In each
graph, the methylated cytosines are colored in red or in purple, with surrounding letters
specifying the sequence pattern recognized by MspJI. ‘R’ denotes A or G; ‘Y’, C or T; ‘N’,
any base. The symbol ‘x’ marks the cut positions of MspJI, in the same color as the
corresponding mC recognized by MspJI. The orange lines represent the double-stranded DNA
fragments released by a pair of MspJI digestions from the DNA molecules (blue lines), and the
green dashed lines represent the synthesized strands complementary to the 5’ overhangs generated
during MspJI digestion. The lengths of specific DNA stretches after MspJI digestion are
given in nucleotides (nt). (g-h) Examples illustrating the alignment of the 3’ end of the read to the
corresponding reference genome, in letter space and color space, respectively. The sequence of the
SOLiD sequencing adaptor is also shown. Matched nucleotides between the read and the adaptor
are colored in red. Overhang adaptor bases are shown in green and may match the
read in other cases. Orange numbers denote nucleotide positions on the read and blue numbers
denote positions on the adaptor. (i) Table of the evidence codes (ECs) used to evaluate the
confidence that reads arose from MspJI cleavage. Sequence similarities were obtained by comparing
the 3’ end of the reads with the reference genome and the adaptor sequences, as shown in g and h.
na and ng represent the number of pairs of adjacent sites that were mismatched between the SOLiD
color sequences of the read and the adaptor, and of the read and the genome, respectively; na’ and
ng’, the number of sites that were mismatched between the SOLiD color sequences of the read and
the adaptor, and of the read and the genome, respectively.
A1, the first base of the adaptor; R1, the first base of the read 3’ end in comparison to the adaptor;
G1, the genomic base corresponding to R1. Semicolons separate equally applicable rules.
SUPPORTING TABLES
Table S1. Summary of SOLiD reads sequenced and mapped against the Arabidopsis reference
genome and reads that were identified as arising from MspJI digestion for each possible
recognition site pattern.
Table S2. Primers used in PCR experiments for selected gene regions digested by MspJI.
Table S3. The DNA methylation and expression levels for genes in Arabidopsis flowers. This
table shows the raw read counts supporting mC sites, the normalized methylation levels and
expression levels for genes in floral tissues.
Table S4. Arabidopsis genes differentially methylated and differentially expressed during floral
development. The statistical significance (false discovery rate) of differential methylation and the
fold change in gene expression are provided for each gene, together with comparisons between
paired tissues.
Table S5. Statistics of differentially expressed and methylated genes between flower meristems
and early flowers.
Table S6. Significantly enriched biological processes for each gene cluster in Figure 6.
Table S7. Number of genes for enriched GO terms for each gene cluster in Figure 6. This table
shows the original data used to construct the heatmap in Figure 6c. GO terms were sorted
according to annotation.
Table S8. The relative frequencies of the wobbling cut positions of MspJI.
Methods S1. Mapping of SOLiD short sequencing reads.
Methods S2. Identification of methylcytosines (mCs) based on MspJI-seq.