PCE_2541_sm_Method_Tables

SUPPORTING INFORMATION
METHODS
ChIP-Seq method
Three ChIP-Seq runs were obtained for samples from cold-treated and control seedlings,
respectively, on an Illumina-Solexa sequencing machine (Illumina, San Diego, CA, USA).
Short reads of 49 bp were first analyzed by an Agilent 2100 Bioanalyzer (Agilent Technologies, Palo
Alto, CA, USA) for quality filtering, including removal of adaptor sequences and low-quality reads
from the raw reads, and then aligned to the ZeaB73_release-5b genome (Supplemental Table 1) with
SOAP (Short Oligonucleotide Alignment Program, version 2.21), which is efficient for gapped and
ungapped alignment of short oligonucleotides onto reference sequences (Supplemental Table 3).
Only alignments with no more than 2 mismatches were considered in peak calling.
To avoid potential sequencing bias, at most two copies of identical short reads mapped to the same
genomic site were retained. After filtering, about 25 million clean reads were obtained for each of
the cold-treated and control samples (Supplemental Table 2).
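The two read filters described above (a mismatch limit and a cap on duplicate reads per genomic site) can be sketched as follows. This is a minimal illustration rather than the pipeline actually used: the `Alignment` record and its fields are hypothetical and do not correspond to SOAP's output format, and "within 2 mismatches" is assumed to mean 2 or fewer.

```python
from collections import Counter
from typing import Iterable, List, NamedTuple


class Alignment(NamedTuple):
    """Hypothetical alignment record; not SOAP's actual output format."""
    read_seq: str
    chrom: str
    pos: int
    strand: str
    mismatches: int


def filter_alignments(alignments: Iterable[Alignment],
                      max_mismatches: int = 2,
                      max_copies: int = 2) -> List[Alignment]:
    """Drop alignments with too many mismatches and cap duplicates per site."""
    seen = Counter()
    kept = []
    for aln in alignments:
        # Assumption: "within 2 mismatches" means mismatches <= 2.
        if aln.mismatches > max_mismatches:
            continue
        # Identical reads at the same site and strand count as copies.
        key = (aln.read_seq, aln.chrom, aln.pos, aln.strand)
        if seen[key] >= max_copies:
            continue
        seen[key] += 1
        kept.append(aln)
    return kept
```

Capping duplicates rather than removing all of them keeps some signal at genuinely enriched sites while limiting the influence of amplification artifacts.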
Based on these clean reads, whole-genome peak scanning was performed with MACS (Model-based
Analysis of ChIP-Seq), software specifically designed for short-read sequencing and peak
finding (Supplemental Table 4). Each candidate peak region was extended to be long enough for
modeling. A dynamic Poisson distribution was used to calculate the p-value of each region based
on the uniquely mapped reads. A region was defined as a peak when its p-value was < 10e-5.
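The Poisson test described above can be illustrated with a short sketch. It is written in the spirit of a dynamic (local) Poisson rate, where the expected read count for a candidate region is the largest of a genome-wide background estimate and several local estimates from the control sample. It is not MACS's code: the function names, the window sizes, and the input layout are assumptions made for illustration, and the cutoff is passed in so that the threshold stated in the text can be supplied.

```python
import math
from typing import Dict


def poisson_upper_tail(k: int, lam: float) -> float:
    """Return P(X >= k) for X ~ Poisson(lam) by summing the upper tail."""
    if k <= 0:
        return 1.0
    if lam <= 0.0:
        return 0.0
    # P(X = k), computed in log space to avoid overflow for large k.
    term = math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))
    total = 0.0
    i = k
    while term > 0.0:
        total += term
        i += 1
        term *= lam / i  # P(X = i) from P(X = i - 1)
        if term < total * 1e-17:
            break
    return min(1.0, total)


def local_lambda(bg_rate_per_bp: float, region_len: int,
                 control_counts: Dict[int, int]) -> float:
    """Expected reads in the region: max of background and local estimates.

    control_counts maps a window size in bp (hypothetical choices) to the
    number of control reads in that window around the candidate region.
    """
    estimates = [bg_rate_per_bp * region_len]
    for window, count in control_counts.items():
        estimates.append(count * region_len / window)
    return max(estimates)


def is_peak(read_count: int, lam: float, cutoff: float) -> bool:
    """Call a peak when the Poisson p-value falls below the cutoff."""
    return poisson_upper_tail(read_count, lam) < cutoff
```

Taking the maximum over several window sizes makes the test conservative in regions where the control sample shows locally elevated coverage.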
The TSS (transcription start site) and TES (transcription end site) of each reference gene were
determined with the CisGenome Browser. Peaks near the TSS and TES of all mapped genes,
including intergenic regions, introns, exons, and the 20 kb upstream and 20 kb downstream
regions, were determined by MACS. To determine peaks in repeat elements, we constructed a pseudo
“repeat genome” from 435 repeat elements from MaizeGDB, in which each repeat element is treated
as a “stand-alone” chromosome. The reads were then mapped to the “repeat genome” and the number
of reads was counted. To determine the statistical significance of the
differences between cold-treated and control seedlings, we analyzed biological variability using
DESeq. We finally identified repeat elements displaying different read abundances between the
two conditions based on a P value < 0.05.
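The construction of the pseudo “repeat genome” can be sketched as follows: each repeat element becomes its own FASTA record, so an aligner treats it as a separate chromosome, and reads are then tallied by the record they align to. The function names and the example element names and sequences are hypothetical placeholders, not MaizeGDB data, and the DESeq testing step is not reproduced here.

```python
from collections import Counter
from typing import Dict, Iterable


def build_repeat_genome(repeats: Dict[str, str], line_width: int = 60) -> str:
    """Return FASTA text with one record ("chromosome") per repeat element."""
    lines = []
    for name, seq in repeats.items():
        lines.append(f">{name}")
        # Wrap the sequence to a fixed line width, as is usual for FASTA.
        for i in range(0, len(seq), line_width):
            lines.append(seq[i:i + line_width])
    return "\n".join(lines) + "\n"


def count_reads_per_repeat(mapped_refs: Iterable[str]) -> Counter:
    """Count reads by the name of the repeat element each one aligned to."""
    return Counter(mapped_refs)


# Placeholder element names and sequences, for illustration only.
fasta = build_repeat_genome({"repeat_A": "ACGT" * 3, "repeat_B": "GG"}, line_width=8)
counts = count_reads_per_repeat(["repeat_A", "repeat_B", "repeat_A"])
```

The per-element counts from each sample would then form the count table passed to the differential abundance test.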
FIGURE LEGEND
Supplemental Figure 1. Genome-wide H3K9ac enrichment profile. (A) The distribution of H3K9ac
across a 99 Mb region of chromosome 10 showed that the majority of H3K9ac peaks were
located within gene-rich regions. Compared with control groups, cold stress caused a reduction of
H3K9ac enrichment throughout this chromatin region. (B) Although H3K9ac was heavily
enriched in genic regions (control groups: 8.53; cold treatment groups: 6.24), no significant
differences were observed between cold treatment groups and control groups within intergenic
regions (control groups: 3.10; cold treatment groups: 2.73).
Supplemental Figure 2. Genome-wide distribution of H3K9ac peaks within
knob 180-bp repeats (A) and knob TR-1 repeats (B). The H3K9ac peaks
within knob-associated sequences were located exclusively on chromosomes 1, 4, 5, 6, 7, 8, and
9. Although cold stress did not significantly alter the H3K9ac distribution profile within knob
180-bp and TR-1 repeats, an obvious increase in the number of H3K9ac peaks was observed.
Supplemental Figure 3. Overview of DNA methylation distribution within knob-associated
repetitive sequences. The master (unconverted) sequences in the first position are aligned with
the bisulfite sequences of the respective samples. Potential sites for the three classes of
methylation (CG, CHG, and CHH), as well as the sites actually methylated in all samples, were
identified with the software CyMATE. Filled symbols represent actual methylation, whereas unfilled
ones represent potential sites. (A) A 180-bp tandem repeat unit. Compared to the control group, the
major methylation positions, located at 12, 23, 24, 40, 58, 59, 71, 75, and 170, display DNA
demethylation during the cold treatment. (B) A 350-bp tandem repeat unit. Compared to the
control group, the major methylation positions, located at 40, 88, 138, 139, 140, 190, 191,
212, 268, 286, 306, and 307, display DNA demethylation during the cold treatment.
Supplemental Figure 4. Genome-wide distribution of peaks and reads. (A) Peak distribution
analysis indicated that the majority of H3K9ac enrichment was located about 300 bp away from
the TSS and TES. Binding intensity plots show that H3K9ac peaks are slightly depleted at the TSS,
and no differences are observed between control groups and cold treatment groups at the TES. (B)
Analysis of the depth distribution of ChIP-Seq reads revealed that cold stress caused a significant
reduction of read numbers at the TSS.
TABLE
Supplemental Table 1. Raw data of production
Sample     Raw reads     Low quality    Clean rate
Cold       26,000,000    949,424        96.35%
Control    26,000,000    544,311        97.91%
Supplemental Table 2. Clean data of production
Sample     Reads length    Clean reads    Production
Cold       49              25,050,576     1,227,478,224
Control    49              25,455,689     1,247,328,761
Supplemental Table 3. Reads alignment
Sample     Clean reads    Mapped reads    Unique mapped reads    Mapped rate    Unique mapped rate
Cold       25,050,576     22,733,888      8,222,514              90.75%         32.82%
Control    25,455,689     23,043,921      9,240,777              90.53%         36.30%
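The figures in Supplemental Tables 1 to 3 are mutually consistent, which the short check below illustrates. The relations used (clean reads = raw reads minus low-quality reads, production = clean reads times read length, and both mapping rates taken relative to clean reads) are inferred from the numbers rather than stated in the text, and the variable names are illustrative.

```python
READ_LENGTH = 49  # bp, from Supplemental Table 2

# Values copied from Supplemental Tables 1 and 3.
SAMPLES = {
    "Cold": {"raw": 26_000_000, "low_quality": 949_424,
             "mapped": 22_733_888, "unique": 8_222_514},
    "Control": {"raw": 26_000_000, "low_quality": 544_311,
                "mapped": 23_043_921, "unique": 9_240_777},
}


def summarize(s: dict) -> dict:
    """Recompute the derived columns from the raw counts (inferred relations)."""
    clean = s["raw"] - s["low_quality"]
    return {
        "clean_reads": clean,
        "clean_rate_pct": 100.0 * clean / s["raw"],
        "production_bp": clean * READ_LENGTH,
        "mapped_rate_pct": 100.0 * s["mapped"] / clean,
        "unique_rate_pct": 100.0 * s["unique"] / clean,
    }


summary = {name: summarize(s) for name, s in SAMPLES.items()}
for name, row in summary.items():
    print(name, {k: round(v, 2) for k, v in row.items()})
```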
Supplemental Table 4. Genome-wide distribution of ChIP sequencing reads
Sample                                    Whole genome     Genic region    Conserved non-coding sequence
           Reference (bp)                 2,066,432,718    97,768,671      171,411,617
           Proportion of mapped genome    100%             4.73%           8.30%
Cold       Total read bases (bp)          402,903,186      118,960,154     84,503,829
           Proportion of read bases       100%             29.53%          20.97%
           Enrichment                     1.00             6.24            2.73
Control    Total read bases (bp)          452,798,073      161,417,241     127,885,351
           Proportion of read bases       100%             35.65%          28.24%
           Enrichment                     1.00             8.53            3.10
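The percentage rows of Supplemental Table 4 follow from the base counts, as the sketch below shows: each proportion is the base count in a region divided by the whole-genome total for the same row. The names are illustrative. The Enrichment row is not recomputed, since the text does not state how it was derived.

```python
# Base counts copied from Supplemental Table 4, in the order of REGIONS.
REGIONS = ("whole_genome", "genic", "conserved_noncoding")
BASES = {
    "Reference": (2_066_432_718, 97_768_671, 171_411_617),
    "Cold": (402_903_186, 118_960_154, 84_503_829),
    "Control": (452_798_073, 161_417_241, 127_885_351),
}


def proportions(counts: tuple) -> dict:
    """Express each region's bases as a percentage of the whole-genome total."""
    total = counts[0]
    return {region: 100.0 * n / total for region, n in zip(REGIONS, counts)}


percent = {name: proportions(counts) for name, counts in BASES.items()}
for name, row in percent.items():
    print(name, {region: round(value, 2) for region, value in row.items()})
```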