Dynamic DNA methylation across diverse human cell lines and tissues

Katherine E. Varley, Jason Gertz, Kevin M. Bowling, Stephanie L. Parker, Timothy E.
Reddy, Florencia Pauli, Marie K. Cross, Brian A. Williams, John A. Stamatoyannopoulos,
Gregory E. Crawford, Devin M. Absher, Barbara J. Wold, Richard M. Myers*
* Corresponding author: rmyers@hudsonalpha.org
Supplemental Figures and Legends:
Supplemental Figure 1. Reduced Representation Bisulfite Sequencing experimental
method. Genomic DNA is digested with MspI. The restriction fragment overhangs are
filled in and tailed with 3' A overhangs. Methylated adapters containing sequences
compatible with Illumina GAIIx sequencing are ligated to the restriction fragments.
Genomic restriction fragments of 40-120 bp are extracted from agarose gels,
treated with sodium bisulfite, and PCR amplified.
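The digestion and size-selection steps can be approximated in silico. The sketch below is illustrative only and is not the authors' pipeline: it assumes a single top-strand sequence, uses the known MspI recognition site CCGG (cut between the first C and CGG), and applies the 40-120 bp window to the genomic fragment alone, ignoring adapter length. The function name `mspi_fragments` is hypothetical.

```python
def mspi_fragments(seq, min_len=40, max_len=120):
    """Return MspI-MspI fragments of seq whose length lies in [min_len, max_len].

    Assumption: MspI cuts C^CGG, so the cut position is one base into each site.
    Only fragments flanked by a site at both ends are returned.
    """
    seq = seq.upper()
    cuts = []
    site = seq.find("CCGG")
    while site != -1:
        cuts.append(site + 1)              # cut falls between C and CGG
        site = seq.find("CCGG", site + 1)  # allow overlapping/adjacent sites
    # Consecutive cut positions delimit the restriction fragments.
    fragments = [seq[a:b] for a, b in zip(cuts, cuts[1:])]
    return [f for f in fragments if min_len <= len(f) <= max_len]


# Toy sequence: one 50 bp fragment and one 204 bp fragment between three sites.
toy = "AA" + "CCGG" + "A" * 46 + "CCGG" + "A" * 200 + "CCGG" + "TT"
selected = mspi_fragments(toy)
```

Because every retained fragment begins with CGG, each one contributes at least one CpG, which is why this digestion enriches for CpG-containing sequence.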
Supplemental Figure 2. Computational methods for aligning sequencing reads and
determining percent methylated (PM) values for each cytosine.
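The figure itself is not reproduced in this text. As a rough illustration of how a percent methylated value is conventionally derived from bisulfite data, the sketch below assumes that, after bisulfite conversion, a read showing C at a reference cytosine came from a methylated molecule and a read showing T came from an unmethylated one. The function names are hypothetical and the authors' actual filtering steps may differ.

```python
def percent_methylated(c_reads, t_reads):
    """Percent methylated (PM) at one cytosine from C and T read counts."""
    total = c_reads + t_reads
    if total == 0:
        raise ValueError("no informative reads at this cytosine")
    return 100.0 * c_reads / total


def pm_from_calls(base_calls):
    """PM from a string of base calls aligned to one reference cytosine.

    Calls other than C or T (for example N) are ignored as uninformative.
    """
    calls = base_calls.upper()
    return percent_methylated(calls.count("C"), calls.count("T"))
```

For example, a cytosine covered by eight C reads and two T reads would be scored as 80 percent methylated under these assumptions.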
Supplemental Figure 3. Method Performance. A) Optimum coverage for reproducible
measurements of DNA methylation was determined by comparing two growth replicates
of the K562 cell line. For each minimum read depth (coverage), we calculated the
difference in percent methylated values between replicates, and determined the average
across all assayed CpGs. We found that requiring a minimum of 10x coverage reduces
the average difference between replicates to 3.96 percent methylated. B) Percent
methylated values measured in the cell line K562 by RRBS and Illumina Methyl450K
array are highly correlated (r=0.954). C) The distribution of percent methylated values in
fully methylated and completely unmethylated (whole genome amplified) control
samples indicates that the assay can produce accurate measurements of DNA methylation
for the two extremes of possible genomic methylation. The highly similar genomic
distributions of DNA methylation across 82 human cell lines and tissues, and their
replicates, demonstrate that RRBS is reproducible and that, while individual CpGs may
have variable methylation across samples, the fraction of CpGs in the genome that are
methylated remains fairly constant (mean R² = 0.96).
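The coverage analysis in panel A can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes the difference is taken as an absolute value and that a CpG is counted only when both replicates meet the minimum read depth. The tuple layout and the function name are hypothetical, and the numbers are toy values.

```python
def mean_replicate_difference(cpgs, min_coverage):
    """Average absolute PM difference between two replicates.

    cpgs: iterable of (coverage_rep1, pm_rep1, coverage_rep2, pm_rep2).
    Only CpGs with at least min_coverage reads in BOTH replicates are used.
    Returns None when no CpG passes the threshold.
    """
    diffs = [
        abs(pm1 - pm2)
        for cov1, pm1, cov2, pm2 in cpgs
        if cov1 >= min_coverage and cov2 >= min_coverage
    ]
    if not diffs:
        return None
    return sum(diffs) / len(diffs)


# Toy data (not from the paper): the low-coverage CpG is the noisiest.
toy_cpgs = [
    (12, 80.0, 15, 84.0),
    (5, 10.0, 30, 40.0),
    (20, 50.0, 10, 50.0),
]
# Scan thresholds to see how the average difference changes with coverage.
curve = {depth: mean_replicate_difference(toy_cpgs, depth) for depth in (1, 10, 25)}
```

Scanning a range of thresholds in this way gives a curve from which a cutoff such as the 10x used above can be chosen.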
Supplemental Figure 4. Detailed tree containing sample identifiers from unsupervised
hierarchical clustering of the top 5% of CpGs with the most varying methylation across all
samples. The sample identifier key is in Supplemental Table 1.
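Selecting the most varying CpGs before clustering could look like the sketch below. The legend does not state which variability measure was used, so population variance across samples is an assumption here, and `top_variable_cpgs` is a hypothetical name. The clustering step itself is omitted.

```python
from statistics import pvariance


def top_variable_cpgs(pm_by_cpg, percent=5):
    """Return identifiers of the top `percent` of CpGs ranked by variance.

    pm_by_cpg: dict mapping a CpG identifier to its list of percent
    methylated values, one per sample. Variance is an assumed measure.
    """
    # Integer arithmetic avoids floating-point rounding in the cutoff.
    n_keep = max(1, len(pm_by_cpg) * percent // 100)
    ranked = sorted(
        pm_by_cpg,
        key=lambda cpg: pvariance(pm_by_cpg[cpg]),
        reverse=True,
    )
    return ranked[:n_keep]


# Toy input: 40 CpGs, of which two vary across four samples.
toy = {f"cpg_{i}": [50, 50, 50, 50] for i in range(38)}
toy["cpg_a"] = [0, 100, 0, 100]
toy["cpg_b"] = [40, 60, 40, 60]
most_variable = top_variable_cpgs(toy)
```

The retained rows would then be passed to a hierarchical clustering routine to produce a tree like the one shown.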
Supplemental Figure 5. The clustergram from unsupervised hierarchical clustering of
the top 5% of CpGs with the most varying methylation across 82 samples was depicted in
Figure 1. Here the same CpGs are divided based on their relationship to gene annotations.
We find that, regardless of whether the CpGs are located in regulatory regions near a
gene’s transcription start site (TSS) or in the body of a gene, they recapitulate patterns of
methylation associated with four major clades: cancer cell lines, tissues, primary cell
lines, and blood leukocytes. A) The subset of the top 5% of CpGs with the most varying
methylation that are found within 2,000 bp of the transcription start site (TSS). B) The
subset of the top 5% of CpGs with the most varying methylation that are found in the
body of genes more than 2,000 bp from the transcription start site (TSS).
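The split into panels A and B amounts to a distance test against the TSS. The sketch below is a simplified illustration under stated assumptions: one gene per CpG, coordinates on a single chromosome, and an inclusive 2,000 bp boundary. The function name and category labels are hypothetical.

```python
def classify_cpg(cpg_pos, tss, gene_start, gene_end, window=2000):
    """Assign a CpG to 'tss_proximal', 'gene_body', or 'other'.

    tss_proximal: within `window` bp of the TSS (boundary inclusive, assumed).
    gene_body:    inside the gene but more than `window` bp from the TSS.
    """
    if abs(cpg_pos - tss) <= window:
        return "tss_proximal"
    if gene_start <= cpg_pos <= gene_end:
        return "gene_body"
    return "other"


# Toy gene on the forward strand: TSS at 1,000, gene end at 50,000.
labels = [
    classify_cpg(pos, tss=1000, gene_start=1000, gene_end=50000)
    for pos in (1500, 3000, 10000, 100000)
]
```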
Supplemental Figure 6. Replicate growths of three cell lines obtained from different
labs across the United States exhibit highly correlated percent methylated values within a
cell line regardless of whether the replicates were grown in the same lab or different labs.
Correlation coefficients are reported for each pairwise comparison.
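The pairwise comparison can be sketched as below. The legend does not say which correlation coefficient was reported, so Pearson's r is an assumption, and the replicate labels are invented for illustration.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def pairwise_correlations(replicates):
    """Correlate every pair of replicates.

    replicates: dict mapping a replicate label to its PM values, with all
    lists ordered over the same CpGs.
    """
    return {
        (a, b): pearson(replicates[a], replicates[b])
        for a, b in combinations(sorted(replicates), 2)
    }


# Hypothetical labels and toy values, not data from the paper.
toy_replicates = {
    "lab1_rep1": [0.0, 50.0, 100.0],
    "lab1_rep2": [10.0, 60.0, 110.0],
    "lab2_rep1": [5.0, 45.0, 95.0],
}
r_values = pairwise_correlations(toy_replicates)
```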
Supplemental Figure 7. CpGs with allele-specific methylation in primary cell lines and
tissues exhibit aberrant methylation in cancer cell lines. A) The CpGs that exhibit
allele-specific methylation in primary cell lines and tissues have a skewed distribution of
percent methylated values in cancer cell lines. B) A regulatory region of the imprinted
gene GNAS exhibits allelic and partial methylation in primary cell lines and normal
tissues (yellow bars are 50% methylated CpGs), and is hyper-methylated across diverse
cancer cell lines (red bars are 100% methylated CpGs). C) The intron of GPR123
exhibits allelic methylation and partial methylation across primary cell lines and normal
tissues (yellow bars are 50% methylated CpGs), and is hypo-methylated across diverse
cancer cell lines (green bars are 0% methylated CpGs).
Supplemental Figure 8. Detailed tree containing sample identifiers from unsupervised
hierarchical clustering of the top 5% of CpGs with the most varying methylation across
non-cancer samples. The sample identifier key is in Supplemental Table 1.
Supplemental Figure 9. CpGs that reside in CGIs in gene bodies far from the
transcription start site (TSS) can be either positively or negatively correlated with
expression. Those that are bound by p300 have methylation values that are negatively
correlated with gene expression, exhibiting a methylation pattern similar to gene
promoters. This indicates that p300-bound CGI intragenic enhancers are unmethylated
in expressed genes, unlike most other gene body CpGs.
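Whether a CpG is positively or negatively correlated with expression can be decided from the sign of a correlation computed across samples. The sketch below uses Spearman's rank correlation as an assumption, since the legend does not name the statistic, and all function names and values are hypothetical.

```python
from math import sqrt


def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of positions i..j, shifted to 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(vx * vy)


def direction(methylation, expression):
    """Label a CpG by the sign of its methylation/expression correlation."""
    rho = spearman(methylation, expression)
    return "negative" if rho < 0 else "positive" if rho > 0 else "none"


# Toy values for one CpG across four samples: methylation rises as
# expression falls, the promoter-like pattern described above.
toy_methylation = [5.0, 20.0, 60.0, 90.0]
toy_expression = [9.1, 7.0, 3.2, 0.5]
label = direction(toy_methylation, toy_expression)
```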