Statistical_Analysis..

Statistical Analysis of MeDIP-chip Data

Array design: MeDIP-chip data were collected on NimbleGen RN34 CpG Island Plus RefSeq Promoter Microarrays. The array contained 15790 CpG islands annotated by UCSC and all 15287 well-characterized RefSeq promoter regions. The promoter regions spanned approximately -3880 bp to +970 bp relative to the transcription start sites (TSSs).

Data pre-processing: Raw data were extracted as pair files by the NimbleScan software. We performed median-centering, quantile normalization, and linear smoothing with the Bioconductor packages Ringo [1], limma [2], and MEDME [3]. After normalization, a normalized log2-ratio data set was created for each sample. Although we considered alternative normalization schemes, this scheme correlated best with the pyrosequencing data we had (Spearman rank correlation of 0.45).

Peak calling at the replicate level: We smoothed the normalized log2 ratios within each array using a simple moving average with a window size of 3 probes. Then, for each replicate, we generated candidate peaks with the ACME algorithm [4] using window = 700, thresh = 0.95 (for do.aGFF.calc()), and thresh = 5e-3 (for findRegions()). We filtered these candidate peaks by requiring at least 2 consecutive probes within the peak to exceed the threshold used by ACME. Overall, we kept the initial step of generating candidate peaks liberal in order to obtain many peaks, and chose to screen them in a downstream statistical analysis for differential enrichment.
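The smoothing and the consecutive-probe filter described above can be illustrated with a minimal Python sketch. It does not reproduce the Ringo, limma, MEDME, or ACME calls. The function names, the handling of edge probes (a shortened window), and the strict "greater than" comparison are assumptions made for illustration only.

```python
def moving_average(values, window=3):
    """Centered simple moving average over probes ordered by position.

    Assumption: probes at the array edges are averaged over the part of
    the window that exists, rather than being dropped or padded.
    """
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        smoothed.append(sum(values[lo:hi]) / (hi - lo))
    return smoothed


def longest_run_above(values, threshold):
    """Length of the longest run of consecutive values strictly above threshold."""
    best = run = 0
    for v in values:
        run = run + 1 if v > threshold else 0
        best = max(best, run)
    return best


def keep_peak(probe_values, threshold, min_consecutive=2):
    """Keep a candidate peak only if enough consecutive probes exceed the threshold.

    `threshold` is a placeholder for whatever cutoff the peak caller used;
    its scale is not specified here.
    """
    return longest_run_above(probe_values, threshold) >= min_consecutive


if __name__ == "__main__":
    print(moving_average([0.0, 3.0, 6.0, 3.0]))
    print(keep_peak([0.1, 0.9, 0.95, 0.2], threshold=0.5))
```

With min_consecutive = 2, a candidate peak supported by a single outlying probe is discarded, which is the purpose of the filter.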
Identifying differentially methylated regions (DMRs) by pattern generation and filtering: Each identified peak was classified into one of the following 5 patterns, where the control, folate, and folate+TSA replicates are represented by components 1-3, 4-6, and 7-9 of the pattern, respectively:

0.0.0.1.1.1.0.0.0: 327 DMRs
0.0.0.1.1.1.1.1.1: 450 DMRs
1.1.1.0.0.0.0.0.0: 490 DMRs
1.1.1.0.0.0.1.1.1: 187 DMRs
1.1.1.1.1.1.0.0.0: 2052 DMRs

We further screened these DMRs based on a limma differential enrichment analysis [2] at the probe level. We required at least 2 consecutive probes with an adjusted p-value [5] smaller than 0.1 within each DMR. In addition, we required the fold changes within the DMRs to confirm the ordering implied by the patterns. For example, for the 0.0.0.1.1.1.0.0.0 pattern, we required that the mean across replicates of the average log2 ratio of the probes within the DMR be larger for the folate group than for both the control and the folate + TSA groups. Many genes had multiple DMRs. Table 1 below lists a further classification of the DMRs with respect to CpG island and gene annotations.

Table 1: Summary of the differential methylation patterns. (Note: the rows reported in this table can be modified.)

Pattern                                       0.0.0.1.1.1.0.0.0  0.0.0.1.1.1.1.1.1  1.1.1.0.0.0.0.0.0  1.1.1.0.0.0.1.1.1  1.1.1.1.1.1.0.0.0
# of DMRs                                                   327                450                490                187               2052
# of DMRs mapping to gene promoters                         255                350                440                147               1689
# of genes with a DMR                                        89                 72                149                 30                306
# of genes with a highly significant DMR                     52                 49                 81                  8                216
# of unique genes with a DMR in a CpG island                 34                 40                  9                 17                 88
# of DMRs in non-gene CpG islands                            37                 25                 24                 17                210

References:
1) Toedling J, Sklyar O, Huber W (2007). Ringo – an R/Bioconductor package for analyzing ChIP-chip readouts. BMC Bioinformatics 8:221.
2) Smyth GK (2004). Linear models and empirical Bayes methods for assessing differential expression in microarray experiments. Statistical Applications in Genetics and Molecular Biology 3(1), Article 3.
3) Pelizzola M, Koga Y, Urban AE, Krauthammer M, Weissman S, Halaban R, Molinaro AM (2008). MEDME: an experimental and analytical methodology for the estimation of DNA methylation levels based on microarray derived MeDIP-enrichment. Genome Research 18(10):1652–1659.
4) Scacheri PC, Crawford GE, Davis S (2006). Statistics for ChIP-chip and DNase hypersensitivity experiments on NimbleGen arrays. Methods in Enzymology 411:270–282.
5) Benjamini Y, Hochberg Y (1995). Controlling the false discovery rate: a practical and powerful approach to multiple testing. Journal of the Royal Statistical Society, Series B (Methodological) 57(1):289–300.
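The DMR screening rules described in the section on identifying DMRs (adjusted p-values [5], at least 2 consecutive significant probes, and the ordering implied by the pattern) can be illustrated with a minimal Python sketch. This is not the limma analysis itself. All function names are hypothetical. Two further points are assumptions: that a 1 in a pattern marks a replicate in which the peak was called, and that the ordering check generalizes from the single example given in the text to "every group marked 1 has a larger mean log2 ratio than every group marked 0". In practice the p-value adjustment would be applied across all tested probes, not within one DMR.

```python
GROUPS = ("control", "folate", "folate+TSA")


def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def has_consecutive_significant(adj_p, alpha=0.1, min_consecutive=2):
    """True if at least `min_consecutive` adjacent probes have adjusted p < alpha."""
    run = 0
    for p in adj_p:
        run = run + 1 if p < alpha else 0
        if run >= min_consecutive:
            return True
    return False


def pattern_to_groups(pattern):
    """Map a pattern such as '0.0.0.1.1.1.0.0.0' to one 0/1 state per group.

    Assumption: the three replicates of a group share one state, as in the
    five patterns listed in the text, so the first replicate is read.
    """
    bits = [int(b) for b in pattern.split(".")]
    return {group: bits[3 * k] for k, group in enumerate(GROUPS)}


def ordering_confirmed(pattern, group_means):
    """Check that groups marked 1 have larger mean log2 ratios than groups marked 0."""
    state = pattern_to_groups(pattern)
    high = [group_means[g] for g in GROUPS if state[g] == 1]
    low = [group_means[g] for g in GROUPS if state[g] == 0]
    return all(h > l for h in high for l in low)


def passes_screen(pattern, adj_p, group_means):
    """Apply both screening rules to one DMR."""
    return has_consecutive_significant(adj_p) and ordering_confirmed(pattern, group_means)


if __name__ == "__main__":
    means = {"control": 0.2, "folate": 1.1, "folate+TSA": 0.4}
    print(passes_screen("0.0.0.1.1.1.0.0.0", [0.5, 0.04, 0.08, 0.3], means))
```

Here `group_means` holds, for each group, the mean across replicates of the average log2 ratio of the probes within the DMR, matching the quantity compared in the text.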
