Paper review: The long-range interaction landscape of gene promoters
Li Yanjian, 2012/9/19

Outline
• Why we study DNA-DNA interaction
• 3C and 5C technology
• Experiment results: interaction landscape
  1. Experiment design
  2. Data validation
  3. Analysis by cell lines and states
  4. Important features
• Conclusion
• Q&A

Why we study DNA-DNA interaction
• How target genes interact with distal regulatory elements is still unclear.
• Promoters and distal elements can form looping interactions, which have been implicated in gene regulation.
• A chromosome is not simply linear; it has a specific spatial structure. Studying DNA-DNA interactions is the first step toward understanding the 3D structure of chromosomes in vivo.

3C and 5C technology
• 3C (Chromosome Conformation Capture), invented by Job Dekker, was the first technology for detecting DNA-DNA interactions.
• 3C detects only one pair of interacting fragments at a time by PCR, so it was extended into 5C (Chromosome Conformation Capture Carbon Copy).
• The experimental details are complicated; the point to keep in mind is the aim of 5C: to detect many interactions at once.

Interaction landscape: Experiment design
• 5C was used to detect DNA-DNA interactions in 44 ENCODE regions (0.5~1.9 Mb each, 30 Mb in total) in 3 cell lines (GM12878, K562, HeLa-S3)
• Interactions between 628 TSS regions and 4535 distal regions were analysed

Interaction landscape: Data validation
• Interaction strength:
  1. Within a region > between regions
  2. Within an ENCODE region > merely neighbouring in the genome
  3. Different regions on the same chromosome > different regions on different chromosomes
• Consistent with previous 4C and Hi-C data

Interaction landscape: Analysis by cell lines and states
• The authors defined 7 distinct chromatin states based on histone modifications, the presence of DHSs and the localization of proteins such as RNA polymerase II and CTCF:
  1. enhancer (E)
  2. weak enhancer (WE)
  3. TSS
  4. predicted promoter flanking region (PF)
  5. insulator element (CTCF)
  6. predicted repressed region (R)
  7. predicted transcribed region (T)
• ACSL6 region in K562 cells
• γ-δ globin region in K562 cells
• α-globin region in K562 cells: important regulatory interactions can be found
• α-globin region in GM12878 and HeLa-S3 cells: the same interactions were not detected, because these 2 cell lines express little or no globin
• Conclusion: The 5C data in this paper are consistent with previous studies, so they are convincing
• Interactions found by 5C are very likely to be functional
• Good Pearson correlation coefficient between replicates (>90%)
• ~60% of the interactions occurred in only one cell line
• The authors then categorized interactions into 4 broader functional groups:
  1. Putative enhancer (‘E’: E or WE)
  2. Putative promoter (‘P’: TSS or PF)
  3. CTCF-bound element (CTCF)
  4. Containing no element from the above 3 groups (‘U’, unclassified)
• This classification is non-exclusive
• Regions that take part in interactions are usually enriched for active functional markers
• Many U-group regions carry active markers, which reflects the conservative segmentation approach
• Conclusion: The unclassified group is relatively large and is still enriched for active markers such as H3K4me1
• The thresholds used by the authors are very strict, so only highly significant interactions are taken into consideration (high false negative rate)
• The authors found that TSS-E and TSS-P interactions are more cell-type specific than TSS-CTCF interactions

  Interaction type    Found in only one cell line : found in more than one
  TSS-E / TSS-P       ~4:1
  TSS-CTCF            ~1:1

• Conclusion: TSS-CTCF interactions are more conserved across cell types
• Looping interactions with E elements were significantly enriched for those that involved expressed TSSs
• Conclusion: TSSs that interact with E elements are more likely to be expressed

Interaction landscape: Upstream or downstream
• Long-range interactions are asymmetric
• There is a peak at about 120 kb upstream of TSSs
• Conclusion: Interactions between TSSs and distal fragments are asymmetric

Interaction landscape: Effect of element order
• Only ~7% of the looping interactions are between an element and the nearest TSS (for active TSSs, this rises to 22%)
• 27% of the distal elements have an interaction with the nearest TSS, and 47% of elements have interactions with the nearest expressed TSS
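The two percentages on the element-order slide count different things: one is a fraction of interactions, the other a fraction of distal elements. The sketch below shows that difference on made-up coordinates. It is only an illustration, not the authors' pipeline: the function names, the toy positions and the use of single-point coordinates for fragments and TSSs are all assumptions.

```python
from bisect import bisect_left


def nearest_tss(element_pos, tss_sorted):
    """Return the TSS coordinate closest to element_pos.

    tss_sorted must be a non-empty, sorted list of coordinates
    (hypothetical single-point positions, in bp).
    """
    i = bisect_left(tss_sorted, element_pos)
    # Only the neighbours on either side of the insertion point can be nearest.
    candidates = tss_sorted[max(i - 1, 0): i + 1]
    return min(candidates, key=lambda t: abs(t - element_pos))


def fraction_of_interactions_with_nearest(interactions, tss_positions):
    """Share of looping interactions that link an element to its nearest TSS."""
    tss_sorted = sorted(tss_positions)
    hits = sum(1 for elem, tss in interactions
               if nearest_tss(elem, tss_sorted) == tss)
    return hits / len(interactions)


def fraction_of_elements_with_nearest(interactions, tss_positions):
    """Share of interacting elements with at least one loop to their nearest TSS."""
    tss_sorted = sorted(tss_positions)
    elements = {elem for elem, _ in interactions}
    hits = {elem for elem, tss in interactions
            if nearest_tss(elem, tss_sorted) == tss}
    return len(hits) / len(elements)


# Toy data (made up): three TSSs and four (element, TSS) looping interactions.
toy_tss = [100_000, 250_000, 400_000]
toy_interactions = [
    (120_000, 100_000),  # element loops to its nearest TSS
    (120_000, 400_000),  # same element also skips to a distant TSS
    (260_000, 100_000),  # nearest TSS (250 kb) is skipped
    (390_000, 250_000),  # nearest TSS (400 kb) is skipped
]

print(fraction_of_interactions_with_nearest(toy_interactions, toy_tss))
print(fraction_of_elements_with_nearest(toy_interactions, toy_tss))
```

Because one element can loop to several TSSs, the per-element fraction can be higher than the per-interaction fraction, which matches the direction of the 7% versus 27% figures quoted on the slide.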
Interaction landscape: Effect of element order
• Conclusion: Interactions do not always occur between a distal fragment and its nearest TSS

Interaction landscape: CTCF’s insulation function
• The authors found that 79% of long-range interactions are unaffected by the presence of one or more CTCF-bound sites
• 58% of looping interactions skip sites co-bound by CTCF and cohesin
• Conclusion: Binding of CTCF, or of CTCF together with cohesin, seems to have little effect on the formation of interactions
• Other factors are needed to complete the insulation function

Interaction landscape: Multiple interactions
• 50% of TSSs display one or more long-range interactions, with some interacting with as many as 20 distal fragments
• 10% of distal fragments interact with one or more TSSs
• An example of the complex long-range interaction networks: the ENr132 region in K562 cells

Conclusion
1. The study generated a rich data set reflecting specific gene-element interactions
2. Interactions between TSSs and distal elements are correlated with expression
3. Interactions between TSSs and distal elements occur preferentially upstream (~120 kb)
4. Interactions are often not blocked by CTCF and cohesin
5. Very few interactions occur between genes and their nearest elements
6. Promoters and distal elements are engaged in multiple interaction networks

Q&A