Basic Molecular Biology, Lecture 11: Transcriptional Regulation of Eukaryotic Genes
Guo Hongwei (郭红卫), hongweig@pku.edu.cn, New Biology Building, Room 428
2011-11-10

Transcriptional Regulation of Eukaryotic Genes
1. Transcriptional initiation
2. Histone modification
3. DNA methylation

Post-Transcriptional Regulation of Eukaryotic Genes
1. siRNA and RNA silencing
2. miRNA and other ncRNA

Reference: Genes IX (Benjamin Lewin); Modern Molecular Biology (现代分子生物学, Zhu Yuxian 朱玉贤)

Regulation of Gene Expression
[Figure labels: chromatin (epigenetic control); RNA silencing; protein degradation; gene expression regulation in the usual sense]

The nucleosome is the basic structural unit of chromatin. It consists of ~200 bp of DNA and a histone octamer.

Nucleosomal DNA is hard to access
If nucleosomes form at a promoter, transcription factors (and RNA polymerase) cannot bind. If transcription factors (and RNA polymerase) bind to the promoter to establish a stable complex for initiation, histones are excluded.

Chromatin remodeling
The dynamic model for transcription of chromatin relies upon factors that use energy provided by ATP hydrolysis to displace nucleosomes from specific DNA sequences.
There are two major types of chromatin remodeling complex: SWI/SNF and ISW (imitation SWI).
Chromatin remodeling is undertaken by large complexes that use ATP hydrolysis to provide the energy for remodeling. The heart of the remodeling complex is its ATPase subunit, and remodeling complexes are usually classified according to the type of ATPase subunit.
A remodeling complex binds to chromatin via an activator (or repressor).
A genomic survey suggested that most sites that bind transcription factors are free of nucleosomes.

In addition to chromatin remodeling, chemical modifications of histones also play a central role in gene regulation.
Promoter activation involves binding of a sequence-specific activator, recruitment and action of a remodeling complex, and recruitment and action of an acetylating complex.
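Part I: Histone modifications

Chemical modification of histones
Histone modifications occur on the N-terminal tails of histones. Modifications of histones H3 and H4 in particular initiate changes in chromatin structure. The N-terminal tails protrude from the nucleosome between the turns of the DNA.

Types of histone modification
Histone acetylation: Lysine (K)
Histone methylation: Lysine (K), Arginine (R)
Histone phosphorylation: Serine (S), Threonine (T)
Histone ubiquitination: Lysine (K)
Histone sumoylation: Lysine (K)
Histone ADP-ribosylation: Glutamate (E), Arginine (R)

Some of the attached chemical groups reduce the positive charge of the histones. This loosens their binding to DNA and changes the chromatin structure.
Histone acetylation and histone phosphorylation change the overall charge of the chromatin structure. These modifications can lead to a general decondensation of the chromatin fibre.

Histone Code
Methylation is the most stable of the histone modifications, so it is the best suited to carry stable epigenetic information. Acetylation is more dynamic, and there are other unstable modifications such as phosphorylation, adenylylation, ubiquitination, sumoylation, and ADP-ribosylation. These modifications affect chromatin structure and function more flexibly and exert their regulatory effects through combinations of several modification types. For this reason the modification patterns that can be specifically recognized have been called the histone code. The number of possible combinations is very large, so covalent histone modification may provide a finer level of control over gene expression.
In addition, ubiquitination of H2B has been found to affect methylation of H3K4 and H3K79, which indicates that the different modifications are interconnected.

1. The chemical reactions of the various histone modifications
2. The enzymes that catalyze histone modifications (writers/erasers)
3. The proteins that recognize histone modifications (readers)
4. The biological functions of histone modifications

Histone Acetylation: Lysine (K)

Histone Methylation
Lysines may be mono-, di-, or tri-methylated.
Arginines may be mono-methylated, or symmetrically or asymmetrically di-methylated.
All methyl groups come from SAM (S-adenosylmethionine).

Histone S/T Phosphorylation

Histone K Ubiquitination

Mode of action of histone modifications
1. The modification directly influences the overall structure of chromatin, either over short or long distances.
K acetylation: neutralizes the positive charge of the lysine
S/T phosphorylation: adds negative charge, reducing the net positive charge
K ubiquitination: changes the conformation of the nucleosome
2. The modification regulates (either positively or negatively) the binding of effector molecules.
K/R methylation
K acetylation

Not long ago, it was generally thought that acetylation of H3 and H4 is associated with active chromatin, whereas methylation is associated with inactive chromatin.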
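The charge argument above (acetylation neutralizes lysine, phosphorylation adds negative charge) can be made concrete with a small calculation. The sketch below is only an illustration under stated assumptions: it uses the first 20 residues of the histone H3 tail, counts K and R as +1 and D and E as -1, treats acetyl-lysine as neutral, and assigns a rough charge of -2 to each phosphate. The function name `net_charge` and these charge values are choices made for this example, not part of the lecture.

```python
# Simplified side-chain charges near neutral pH (an assumption of this sketch).
RESIDUE_CHARGE = {"K": 1, "R": 1, "D": -1, "E": -1}
PHOSPHATE_CHARGE = -2  # rough value for a phosphoserine/phosphothreonine

# First 20 residues of the histone H3 N-terminal tail, used for illustration.
H3_TAIL = "ARTKQTARKSTGGKAPRKQL"


def net_charge(sequence, acetylated=(), phosphorylated=()):
    """Approximate net charge of a peptide; positions are 1-based."""
    for pos in acetylated:
        if sequence[pos - 1] != "K":
            raise ValueError(f"position {pos} is not a lysine")
    for pos in phosphorylated:
        if sequence[pos - 1] not in "ST":
            raise ValueError(f"position {pos} is not serine or threonine")

    charge = 0
    for pos, residue in enumerate(sequence, start=1):
        if residue == "K" and pos in acetylated:
            continue  # acetyl-lysine carries no positive charge
        charge += RESIDUE_CHARGE.get(residue, 0)
    charge += PHOSPHATE_CHARGE * len(set(phosphorylated))
    return charge


if __name__ == "__main__":
    print(net_charge(H3_TAIL))                        # unmodified tail
    print(net_charge(H3_TAIL, acetylated=(9, 14)))    # H3K9ac + H3K14ac
    print(net_charge(H3_TAIL, phosphorylated=(10,)))  # H3S10ph
```

In this simplified model the unmodified tail carries a net charge of +7, and either acetylating K9 and K14 or phosphorylating S10 lowers it to +5, which is the direction of change the text associates with looser histone-DNA binding.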
Histone Acetylation and De-acetylation
Histone acetyltransferase (HAT) enzymes modify histones by addition of acetyl groups; some transcriptional coactivators have HAT activity.
A deacetylase is an enzyme that removes acetyl groups from proteins.
Histone deacetylase (HDAC) enzymes remove acetyl groups from histones; they may be associated with repressors of transcription.

Histone acetylation activates transcription
Many HATs are found in transcriptional co-activator complexes (e.g. GCN5 in the SAGA complex), and several HATs can be present in a single co-activator complex (e.g. p300, ACTR, and PCAF acting together).
[Figure labels: PCAF, EF, p300, TF, ACTR]

A repressor complex (e.g. the Sin3 complex) contains at least three components: a DNA-binding subunit, a corepressor, and a histone deacetylase (HDAC).

The histone deacetylase (HDAC) superfamily
[Figure caption: Green rectangles indicate the conserved HDAC domain; numbers following the HDAC domain indicate the number of amino acids. Myocyte enhancer factor 2 (MEF2)-binding sites are marked by a blue square, and binding sites for the 14-3-3 chaperone protein are also shown.]
HDAC enzymes oppose the effects of HATs and reverse lysine acetylation, an action that restores the positive charge of the lysine.
There are four classes of HDAC. HDACs have relatively low substrate specificity and are present in multiple distinct complexes, often together with other HDACs.
HDACs are the active components of many transcriptional co-repressor complexes. Both HDAC1 and HDAC2 associate with mSin3A, which mediates transcriptional repression. Through interactions with many sequence-specific transcription factors, the HDAC-mSin3 complex can be brought to specific promoters, where it represses the associated genes. These transcription factors include unliganded nuclear hormone receptors, the Mad/Max heterodimer, MeCP2, and p53.

Reversible Lysine (K) Acetylation in Diverse Cellular Processes

Histone acetylation in cancer
Two possible mechanisms:
1. By altering gene expression programmes, including the aberrant regulation of oncogenes and/or tumour suppressors.
2. On a more global level, histone modifications may affect genome stability and/or chromosome integrity.

Model for transformation mediated by the MOZ-TIF2 fusion protein
[Figure caption: The conserved C of the C2HC zinc finger or the conserved G of the Ac-CoA binding motif in the HAT domain was replaced with G (MOZ-TIF2 C543G) or E (MOZ-TIF2 G657E), respectively.
The MOZ-TIF2 LXXLL mutant was generated by replacing amino acid residues in the first two LXXLL motifs (PDDLL→PAAAA and LLDQL→AADGA) in the CID.]

One example of the mechanisms by which aberrant histone modification profiles give rise to cancer
• The MOZ protein is a HAT, and TIF2 is a nuclear receptor co-activator that binds another HAT, CBP.
• When the MOZ-TIF2 fusion was transduced into normal committed murine haematopoietic progenitor cells, which lack self-renewal capacity, the fusion conferred the ability to self-renew in vitro and resulted in AML in vivo (the fusion protein induces properties typical of leukaemic stem cells).
• The intrinsic HAT activity of MOZ is required for neither self-renewal nor leukaemic transformation, but its nucleosome-binding motif is essential for both.
• The CBP interaction domain (CID) within TIF2 is also essential for both processes.
Consequently, the transforming ability of MOZ-TIF2 most likely involves an erroneous histone acetylation profile at MOZ-binding sites.
Histone Methylation and Demethylation

Histone methylation
Histone methylation is carried out by histone methyltransferases (HMTs).
Methylation sites: methylation can occur on lysine (K) and arginine (R) residues of histones. Lysine residues can be mono-, di-, or tri-methylated, whereas arginine residues can be mono- or di-methylated. These different degrees of methylation greatly increase the complexity of histone modification and of gene regulation.
Common methylation sites are Lys 4, 9, 27, and 36 of histone H3, Lys 20 of H4, Arg 2, 17, and 26 of H3, and Arg 3 of H4.
Methylation does not change the charge of the histone. Instead, methylation at different sites and to different degrees recruits different effector proteins.
Arginine methylation is a relatively dynamic mark. Arginine methylation is associated with gene activation, and loss of arginine methylation on H3 and H4 is associated with gene silencing.
Lysine methylation is a relatively stable mark in the regulation of gene expression. For example, methylation of H3 Lys 4 is associated with gene activation, whereas methylation of Lys 9 and Lys 27 is associated with gene silencing. In addition, methylation of H4K20 is associated with gene silencing, and methylation of H3K36 and H3K79 with gene activation. Note, however, that the number of methyl groups correlates with the degree of silencing or activation.

Histone code
– H3K9me3: heterochromatin
– H3K27me3: repressed promoters
– H3K4me3, H3Ac, H4Ac: active promoters
Within a gene region:
– H3K4me1: enhancers
– H3K4me3: transcription start sites
– H3K36me3: transcribed regions

Histone lysine methyltransferases (HKMTs)
Almost all HKMTs contain a so-called SET domain that provides the methyltransferase activity (the H3K79 methyltransferase Dot1 is a known exception).
Some HKMTs can catalyze mono-, di-, and tri-methylation of a histone lysine, whereas others can catalyze only one specific degree of methylation.
So how do these enzymes modify the appropriate lysine to a specific degree?
• DIM5 can tri-methylate H3K9, but SET7/9 can only mono-methylate H3K4.
• In DIM5, there is a phenylalanine (F281) within the enzyme's lysine-binding pocket that can accommodate all the methylated forms of the lysine, thereby allowing the enzyme to generate the tri-methylated product.
• SET7/9 has a tyrosine (Y305) in the corresponding position such that it can only accommodate the mono-methyl product.
• Elegant mutagenesis studies have shown that changing DIM5 F281 to Y converts the enzyme to a mono-methyltransferase, whereas the reciprocal mutation in SET7/9 (Y305 to F) creates an enzyme capable of tri-methylating its substrate.

Differential K methylations recruit two opposing enzyme activities (HAT and HDAC) in yeast transcription
The Set1 H3K4 methyltransferase binds to the serine 5 phosphorylated CTD of RNAPII, the initiating form of polymerase situated at the transcription start site (TSS).
In contrast, the Set2 H3K36 methyltransferase binds to the serine 2 phosphorylated CTD of RNAPII, the transcriptional elongating form of polymerase. Thus, the two enzymes are recruited to genes via interactions with distinct forms of RNAPII, and it is therefore the location of the different forms of RNAPII that defines where the modifications are laid down.
H3K4me3 recruits Yng1, which binds via its PHD finger. This in turn stabilizes the interaction of the NuA3 HAT, leading to hyperacetylation of its substrate, H3K14. Thus, methylation at H3K4 is intricately linked to acetylation at H3K14.
In a similar manner, and again in yeast, H3K36me3 has been shown to recruit the Rpd3S HDAC complex, which deacetylates histones behind the elongating RNAPII. This is important because it prevents cryptic initiation of transcription within coding regions.
Together, these examples show how the recruitment of two opposing enzyme activities (HATs and HDACs) is important at active genes in yeast.

Histone lysine demethylases
The first histone lysine demethylase, LSD1, was identified in 2004. It demethylates H3K4 and H3K9.
JmjC-domain-containing proteins are histone lysine demethylases that can demethylate tri-methylated lysines, using Fe(II) and α-ketoglutarate as co-factors. For example, JMJD2 demethylates H3K9me3 and H3K36me3.
JMJD6 demethylates methylated arginine on histones (H3R2me and H4R3me).

Genome-wide approaches for studying histone modifications
Most of the existing methods for studying histone modifications on a genomic scale combine chromatin immunoprecipitation (ChIP) with high-throughput techniques, including DNA microarrays and high-throughput sequencing.
ChIP-PCR (chromatin immunoprecipitation followed by PCR): chromatin fragments are isolated using antibodies that are specific to a feature of interest.
ChIP-chip: the most prevalent technique; the DNA must be amplified and is then hybridized to a microarray.
ChIP-Seq (the most powerful): a recently developed technique that needs very limited amplification and is more quantitative. Sequenced reads can be mapped directly to the genome, so modification levels at different genomic regions can be compared directly. (Solexa sequencing technology will be covered in a later lecture.)
E.g. ChIP-Seq to profile histone modifications in mouse ES cells.

Chromatin Immunoprecipitation Protocol to Analyze Histone Modifications at a Specific Site
Procedure
1. Chromatin crosslinking
2. Chromatin preparation
3. Pre-clearing and immunoprecipitation (IP)
4. Collection, washes, and elution of immune complexes
5. Reverse crosslinking
6. DNA cleanup
Proceed to PCR reactions.
At4g03770 and At4g03800 represent retrotransposons that are transcriptionally silent (Lippman et al., 2004); both loci are associated with di-methylated H3K9. At4g04040 is an active gene and is associated with tri-methylated H3K4.

Histone modification cross-talk
Histone modifications can positively or negatively affect other modifications. [Figure legend: a positive effect is indicated by an arrowhead and a negative effect by a flat head.]
Acetylation of histones activates chromatin, and methylation of DNA and histones inactivates chromatin.
Methylation of DNA and of histones is associated with heterochromatin. The two types of methylation event may be connected.
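Part II: DNA methylation

Sites of DNA methylation
DNA methylation mainly produces 5-methylcytosine (5-mC), together with small amounts of N6-methyladenine (N6-mA) and 7-methylguanine (7-mG).
[Figure: C → 5-mC]
Because deamination of 5-methylcytosine produces thymine (T), which is not readily recognized and corrected, DNA methylation raises the mutation frequency at the methylated site.
1. In prokaryotes, DNA methylation is a chemical modification of C and A bases that protects against bacteriophage attack. For example, in the restriction-modification systems of E. coli, methylation of specific sites in the cell's own DNA prevents cleavage by restriction endonucleases.
2. In eukaryotes, methylation is divided into symmetric (canonical) methylation, which includes CpG and CpNpG, and asymmetric methylation, which includes CpHpH. In most cells 5-methylcytosine occurs mainly at CpG.
DNA methylation can alter chromatin structure, DNA conformation, histone modifications, and the way DNA interacts with proteins, and thereby controls gene expression.

CpG islands as units of methylation-based transcriptional regulation
Clusters of CpG sequences form CpG islands, which are rich in methylation sites and highly conserved in sequence. In eukaryotes, about half of the CpG islands are found in the constitutively expressed housekeeping genes (all of which have them), and these islands are constitutively unmethylated. The other half occur in the promoters of a fraction (under 40%) of tissue-specifically regulated genes.
A CpG island is a stretch of 1-2 kb of genomic sequence that surrounds the promoter of a constitutively expressed gene, where it is unmethylated (why?).

Terms
A fully methylated site is a palindromic sequence that is methylated on both strands of DNA. Most DNA methylation is found on the cytosines of both strands of the CpG doublet.
A hemi-methylated site is a palindromic sequence that is methylated on only one strand of DNA. Replication converts a fully methylated site to a hemi-methylated site.
A demethylase is a casual name for an enzyme that removes a methyl group, typically from DNA, RNA, or protein.
A methyltransferase (methylase) is an enzyme that adds a methyl group to a substrate, which can be a small molecule, a protein, or a nucleic acid.
A de novo methylase adds a methyl group to an unmethylated target sequence on DNA.
A maintenance methylase adds a methyl group to a target site that is already hemi-methylated.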
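The terms above describe a cycle: replication turns fully methylated sites into hemi-methylated ones, and a maintenance methylase restores them. A minimal sketch of that cycle follows. It assumes a toy representation in which each CpG site is a pair of booleans (top strand, bottom strand); the function names `replicate`, `classify`, and `maintenance_methylase` are made up for this illustration.

```python
def replicate(sites):
    """Semi-conservative replication of a list of CpG sites.

    Each site is (top_methylated, bottom_methylated). Each daughter duplex
    keeps one parental strand and gains a new, unmethylated strand.
    """
    daughter_top = [(top, False) for top, _ in sites]           # keeps parental top strand
    daughter_bottom = [(False, bottom) for _, bottom in sites]  # keeps parental bottom strand
    return daughter_top, daughter_bottom


def classify(site):
    """Name the methylation state of one CpG site."""
    top, bottom = site
    if top and bottom:
        return "full"
    if top or bottom:
        return "hemi"
    return "none"


def maintenance_methylase(sites):
    """Methylate the new strand only where the site is hemi-methylated."""
    return [(True, True) if classify(s) == "hemi" else s for s in sites]


if __name__ == "__main__":
    parent = [(True, True), (False, False), (True, True)]
    d1, d2 = replicate(parent)
    print([classify(s) for s in d1])          # hemi / none / hemi
    print(maintenance_methylase(d1) == parent)

    # Without maintenance, a second round of replication yields a duplex
    # that has lost the methylation pattern altogether.
    _, granddaughter = replicate(d1)
    print([classify(s) for s in granddaughter])
```

The unmethylated site stays unmethylated throughout, which is why a maintenance activity alone perpetuates an existing pattern but cannot create a new one; that requires a de novo methylase.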
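DNA methylation is perpetuated by a maintenance methylase

DNA methylases
Methyltransferases include maintenance methyltransferases and de novo methyltransferases.
Maintenance methyltransferases are the most important enzymes for inheriting the DNA methylation state. Guided by the methylated parental strand, they methylate the corresponding sites on the newly synthesized strand, rapidly converting hemi-methylated DNA to fully methylated DNA. In other words, they maintain methylation.
De novo methyltransferases convert CpG to mCpG without guidance from a parental strand, but they act slowly. Nevertheless, this class of enzyme is the main factor that brings specific genes under methylation control, and it plays a very important role in the epigenetic regulation of gene expression.

Two mechanisms by which methylation regulates transcription:
① A site used to bind certain factors can no longer bind the protein once it is methylated. These cases occur in regulatory sites rather than in promoters.
② DNA methylation can cause specific repressors to bind to the DNA.
Two kinds of protein that bind methylated CpG sequences can repress transcription:
MeCP1: needs several methyl groups to be present together in order to bind DNA, and mostly binds at CpG islands.
MeCP2 and related family members: can bind to a single methylated CpG, which means that transcription initiation requires a methylation-free region.
MeCP2 represses transcription directly by interacting with the complex bound at the promoter.
MeCP2 also binds the Sin3 repressor complex, which has histone deacetylase activity, and so couples DNA methylation to histone deacetylation in regulating transcriptional activity.
DNA methylation and changes in histone acetylation (deacetylation) can trigger each other.

Histone methylation and DNA methylation are connected
SUV39H: histone methyltransferase (methylation of H3K9)
HP1: heterochromatin-associated protein 1
DNA methylation is closely linked to many histone modifications.

Methods for studying DNA methylation:
1. Genomic DNA fractionated by methylation-sensitive restriction enzyme digestion (individual sites)
2. PCR amplification products from bisulfite-treated DNA (hundreds of sequences by direct sequencing, or genome-wide sequences by microarray analysis/pyrosequencing)
3. Direct sequencing of methylated DNA fragments isolated by affinity purification with MeCP1/2 protein (genome-wide)
[Figure: work flow]

Methylation-sensitive restriction enzyme digestion
The restriction enzyme MspI cleaves all CCGG sequences whether or not they are methylated at the second C, but HpaII cleaves only non-methylated CCGG tetramers.

Bisulfite sequencing
Bisulfite (HSO3−) converts unmethylated C into U, whereas 5-mC is not converted. The converted DNA is then analyzed by sequencing (or by microarray analysis).

The first comprehensive DNA methylation map of an entire genome: DNA methylation in Arabidopsis
Homozygotes for met1-1, a DNA methyltransferase mutant, exhibit a delay in flowering time that is accompanied by the production of additional rosette and cauline leaves before flowering stem elongation.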
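The two detection methods described above, methylation-sensitive digestion with MspI/HpaII and bisulfite conversion, can be sketched in a few lines. This is a simplified illustration only: it looks at a single strand, takes methylation as a set of 0-based positions, and writes converted cytosines as T (the base read after PCR of bisulfite-treated DNA). The function names and the toy sequence are hypothetical, not taken from any library.

```python
def bisulfite_convert(seq, methylated):
    """Unmethylated C is read as T after conversion and PCR; 5-mC stays C."""
    return "".join(
        "T" if base == "C" and i not in methylated else base
        for i, base in enumerate(seq)
    )


def call_methylation(reference, converted):
    """Positions where a reference C survived conversion, i.e. was methylated."""
    return {
        i
        for i, (ref, conv) in enumerate(zip(reference, converted))
        if ref == "C" and conv == "C"
    }


def cut_sites(seq, methylated, enzyme):
    """Start indices of CCGG sites that the given enzyme would cleave.

    MspI cuts CCGG regardless of methylation at the second C;
    HpaII cuts only when the second C is unmethylated.
    """
    if enzyme not in ("MspI", "HpaII"):
        raise ValueError(f"unsupported enzyme: {enzyme}")
    sites = []
    for i in range(len(seq) - 3):
        if seq[i:i + 4] != "CCGG":
            continue
        second_c_methylated = (i + 1) in methylated
        if enzyme == "MspI" or not second_c_methylated:
            sites.append(i)
    return sites


if __name__ == "__main__":
    seq = "ATCCGGTACCGGA"   # toy sequence with CCGG at indices 2 and 8
    methylated = {3}        # second C of the first CCGG site
    converted = bisulfite_convert(seq, methylated)
    print(converted)
    print(call_methylation(seq, converted))
    print(cut_sites(seq, methylated, "MspI"))
    print(cut_sites(seq, methylated, "HpaII"))
```

Comparing the MspI and HpaII results for the same DNA shows which CCGG sites are methylated (cut by MspI only), while comparing the converted read with the reference recovers methylation at single-base resolution.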
DNA methylation: bisulphite treatment and the methylcytosine immunoprecipitation (mCIP) method, combined with whole-genome tiling microarrays, were used to map the methylated component of the genome.
Gene expression: the same tiling-microarray platform was used to determine RNA expression profiles for both strands of the genome.

Distribution of DNA methylation
At the chromosome level, DNA methylation is highest near the centromeres.
At the gene level, highly methylated regions cover most transposons, pseudogenes, and small-RNA-encoding regions.
Methylation appears to have a relatively strong effect on the transcription of short genes, and only a very weak effect on long genes.
Genes methylated in their promoters tended to be expressed at low levels; genes methylated in their coding regions were constitutively expressed at higher levels.

Human DNA methylation map
In stem cells, regions of DNA with CpG methylation (blue) are mostly uniformly methylated, whereas this modification is more heterogeneous in fibroblasts. Non-CpG methylation (red), which occurs primarily at CA nucleotides, is detected only in stem cells, yet is asymmetric and more scarce and patchy than CpG methylation. If fibroblasts are converted to induced pluripotent stem cells, they regain non-CpG methylation.
Lister, R. et al. Nature 462, 315–322 (2009).
[Figure legend: filled circles, methylated cytosines; unfilled circles, unmethylated cytosines. H stands for A, C, or T; N stands for any nucleotide.]
DNA methylation patterns differ between stem cells and differentiated cells.

Most recent hot topic: 5-hydroxymethylcytosine (5hmC)
Basic properties:
• present at low levels in many mammalian cell types;
• produced by TET-family enzymes through oxidation of 5-methylcytosine;
• found abundantly in human cell types ranging from stem cells to brain cells;
• concentrated mainly in exons and near transcription start sites, especially at the start sites of genes whose promoters carry both the H3K27me3 and H3K4me3 marks;
• associated only with poised chromatin configurations and with genes that are up-regulated during differentiation; it may help prime loci for rapid activation.

Function of 5hmC
Knockdown of Tet1 and Tet2 causes:
• down-regulation of a group of genes that includes pluripotency-related genes (including Esrrb, Prdm14, Dppa3, Klf2, Tcl1, and Zfp42) and a concomitant increase in methylation of their promoters;
• an increased propensity of ES cells for extra-embryonic lineage differentiation.
The balance between hydroxymethylation and methylation in the genome is inextricably linked with the balance between pluripotency and lineage commitment.
Questions for thought
• What determines the DNA sequence specificity of histone modifications?
• Why are CpG islands enriched in the enhancer region?