Can we correlate RNA expression and DNase hypersensitivity sites in DNA domains?
David John Haan
Stuart’s Lab
Nov 20, 2014
UCSC

Topologically Associating Domains (TADs)
• Megabase-sized local chromatin interaction domains
• Determined through the Hi-C assay

GTEx Portal RNAseq Data
2,921 samples

“Wrangle” the Data!
• Hi-C assay domain data:
    chr14_24960000_26280000  CMA1      CTSG
    chr14_26560000_28200000  MIR4307   NOVA1
    chr14_29080000_30080000  FOXG1     C14orf23
    chr14_30160000_30920000  MIR548AI  PRKD1
    GZMH GZMB MIR548AI PRKD1 STXBP6
• GTEx RNAseq data in RPKMs:
    Gene     Pancreas1  Pancreas2  Heart1
    CMA1     61.305     86.43      17.085
    CTSG     44.22      64.32      53.265
    GZMH     100.5      85.425     89.445
    GZMB     21.105     14.07      39.195
    STXBP6   4.02       97.485     8.04
• Take the mean of each domain and each tissue type

GTEx RNAseq expression in 150 most variable domains
[Figure: heatmap of DNA Domains by Tissue Type; color key from Low Expression (2) to High Expression (12). Tissue types: Breast, AdiposeTissue, Nerve, Thyroid, Prostate, Bladder, Lung, Uterus, FallopianTube, Ovary, BloodVessel, SmallIntestine, Colon, Stomach, Vagina, CervixUteri, Esophagus, Skin, SalivaryGland, Muscle, Heart, Kidney, AdrenalGland, Testis, Pituitary, Brain, Spleen, Blood, Pancreas, Liver]

DNase Hypersensitivity Sites
• The DNase HS assay is adequate for determining whether a domain is open for transcription
• These 5 tissue samples are from the Human Epigenome Atlas
[Figure: Histogram of DNase hypersensitivity coverage among Hi-C domains; x-axis: Fraction of hypersensitive sites within domain; y-axis: # of Domains; samples: Pancreas1, Pancreas2, Heart, Ovary, Small Intestine]

Pancreas DNase coverage vs. RNA level
[Figure: scatter plot; x-axis: DNase coverage; y-axis: log2(mean RNA level+1)]
Points in red are significantly (p < 0.05) more expressed in pancreas than in other tissues, as determined by a right-sided t-test.

How does pancreas HS coverage in DNA domains differ from other tissue types?
[Figure: y-axis: Hypersensitivity Difference (Pancreas − Mean)]
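The two DNase quantities used on these slides, the fraction of a domain covered by hypersensitive sites and the "Pancreas − Mean" difference, can be sketched roughly as below. This is a minimal illustration, not the code behind the talk. The function names, the peak format, and all coverage values are hypothetical, and it assumes that "Mean" refers to the mean coverage of the non-pancreas samples, which the slides do not state.

```python
from statistics import mean


def coverage_fraction(start, end, peaks):
    """Fraction of the domain [start, end) covered by DNase HS peaks.

    Peaks are (start, end) pairs (assumed format). Peaks are clipped to the
    domain and overlapping peaks are merged so no base is counted twice.
    """
    clipped = sorted(
        (max(s, start), min(e, end)) for s, e in peaks if e > start and s < end
    )
    covered = 0
    cur_start = cur_end = None
    for s, e in clipped:
        if cur_end is None or s > cur_end:
            # A new, non-overlapping interval: bank the previous one.
            if cur_end is not None:
                covered += cur_end - cur_start
            cur_start, cur_end = s, e
        else:
            # Overlaps the current interval: extend it.
            cur_end = max(cur_end, e)
    if cur_end is not None:
        covered += cur_end - cur_start
    return covered / (end - start)


def hs_difference(coverage_by_sample, tissue="Pancreas"):
    """Mean coverage of one tissue minus mean coverage of all other samples.

    Assumption: the baseline excludes the tissue of interest.
    """
    target = [v for name, v in coverage_by_sample.items() if name.startswith(tissue)]
    others = [v for name, v in coverage_by_sample.items() if not name.startswith(tissue)]
    return mean(target) - mean(others)


# Toy coverage fractions per domain and sample (hypothetical values).
coverage = {
    "chr14_24960000_26280000": {
        "Pancreas1": 0.30, "Pancreas2": 0.20,
        "Heart": 0.10, "Ovary": 0.10, "Small Intestine": 0.10,
    },
    "chr14_26560000_28200000": {
        "Pancreas1": 0.05, "Pancreas2": 0.05,
        "Heart": 0.10, "Ovary": 0.20, "Small Intestine": 0.30,
    },
}

differences = {domain: hs_difference(s) for domain, s in coverage.items()}
# Domains ordered from lowest to highest difference, e.g. for plotting.
ranked = sorted(differences, key=differences.get)
```

Merging overlapping peaks before summing keeps the fraction at or below 1 even when peak calls overlap. Whether the original analysis merged peaks or clipped them at domain boundaries is not stated on the slides.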
Pancreas HS differential in significantly expressed Domains
[Figure: x-axis: Domains; y-axis: Hypersensitivity Difference (Pancreas − Mean). Legend: RNAseq p<10^−2 N=150; RNAseq p<10^−12 N=64. Annotations: SOX9 – progenitor gene; REG Family]

More tests: Heart and Ovary
[Figure, two panels: Hypersensitivity Difference (Ovary − Mean) and Hypersensitivity Difference (Heart − Mean), each plotted against Domains. Legend entries: RNAseq p<10^−2 N=673, RNAseq p<10^−12 N=89; RNAseq p<10^−2 N=318, RNAseq p<10^−12 N=220]

Future direction
• Use more DNase hypersensitivity data
• Gene-specific analysis
• Analyze tissue-specific “closed” domains
• Compare domains among tissue types, including cancer
• Integrate domains into pathway models

Thank you
• Josh’s lab, specifically Josh Stuart, Dan Carlin, Kiley Graim, and Evan Paull
• Charles Markello for helping create the t-test
• Evan for helping determine what I need (or don’t need) on my bike to get up the hill faster

Pancreas HS differential vs. significantly expressed Breast domains
• In black is the DNase HS coverage differential plot
• In red/blue is a density plot of significant expression levels
*No Correlation!!!
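For reference, the expression side of the analysis (mean per domain and tissue type, the log2(mean + 1) transform, and a right-sided t-test) might look something like the sketch below. It is an illustration under stated assumptions, not the implementation used for the talk. The RPKM values and the Heart2 and Ovary1 samples are hypothetical, the helper names are made up, and Welch's unequal-variance t statistic is an assumed choice, since the slides only say "right-sided t-test".

```python
from math import log2, sqrt
from statistics import mean, variance

# Gene-to-domain assignment for one domain, as listed on the "Wrangle" slide.
domain_genes = {"chr14_24960000_26280000": ["CMA1", "CTSG"]}

# Toy RPKM values keyed by gene, then sample (hypothetical numbers).
rpkm = {
    "CMA1": {"Pancreas1": 60.0, "Pancreas2": 80.0, "Heart1": 20.0, "Heart2": 10.0, "Ovary1": 15.0},
    "CTSG": {"Pancreas1": 40.0, "Pancreas2": 60.0, "Heart1": 40.0, "Heart2": 30.0, "Ovary1": 25.0},
}


def tissue_of(sample):
    """Strip the trailing replicate number: 'Pancreas2' -> 'Pancreas'."""
    return sample.rstrip("0123456789")


def domain_sample_means(genes):
    """Mean RPKM over the genes of one domain, separately for each sample."""
    samples = rpkm[genes[0]].keys()
    return {s: mean(rpkm[g][s] for g in genes) for s in samples}


def tissue_means(sample_means):
    """Average the per-sample domain means within each tissue type."""
    grouped = {}
    for sample, value in sample_means.items():
        grouped.setdefault(tissue_of(sample), []).append(value)
    return {tissue: mean(values) for tissue, values in grouped.items()}


def welch_t(a, b):
    """Welch t statistic and degrees of freedom for mean(a) > mean(b).

    Needs at least two values per group and non-zero variance overall.
    """
    va, vb = variance(a) / len(a), variance(b) / len(b)
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


per_sample = domain_sample_means(domain_genes["chr14_24960000_26280000"])
per_tissue = tissue_means(per_sample)
log_levels = {tissue: log2(m + 1) for tissue, m in per_tissue.items()}

pancreas = [v for s, v in per_sample.items() if tissue_of(s) == "Pancreas"]
others = [v for s, v in per_sample.items() if tissue_of(s) != "Pancreas"]
t_stat, dof = welch_t(pancreas, others)
```

A right-sided p-value is then the upper tail of the t distribution at `t_stat` with `dof` degrees of freedom, for example `scipy.stats.t.sf(t_stat, dof)` if SciPy is available. Domains would be kept when that p-value falls below a chosen threshold such as the 10^−2 or 10^−12 cutoffs shown in the plots.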