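Basic Molecular Biology, Lecture 9: Regulation of Eukaryotic Gene Expression
Guo Hongwei (郭红卫), hongweig@pku.edu.cn, New Biology Building, Room 428

Course schedule
11/1 Guo Hongwei, Chapter 8
11/8 Midterm exam (Chapters 1-6), Zheng Xiaofeng, 50% of the total grade
11/10, 11/15 Guo Hongwei
11/22, 11/24, 11/29 Wei Wensheng
12/6, 12/8, 12/13, 12/20, 12/22 Guo Hongwei
Chapters listed for these sessions: Chapter 8, Chapter 9, Chapters 10-11
12/29, 8:30-10:30 a.m. Final exam (Chapters 8-11), Guo Hongwei, 50% of the total grade
Teaching assistant: Feng Ying, New Biology Building, Room 426, dorafeng86308016@sina.com

Chapter 8 Eukaryotic Gene Expression and Regulation

Main topics of this chapter
1. Basic concepts and principles of gene expression and its regulation
2. Transcriptional regulation:
   DNA level (genetic)
   Chromatin level (epigenetic)
3. Post-transcriptional regulation:
   RNA interference (RNAi)
   Protein degradation (ubiquitin/proteasome)

Section 1 Basic Concepts and Principles

[Figure: the central dogma, from a single gene to a single cell]
Genome (cell’s repertoire of DNA)
Transcriptome (cell’s repertoire of RNA transcripts)
Proteome (cell’s repertoire of proteins)

I. The concept of gene expression
• Genome: all of the genetic information, or the complete set of genes, carried by a cell or a virus.
• Gene expression: the process by which a gene is transcribed and translated to produce a protein or RNA molecule with a specific biological function.
• Gene regulation (regulation of gene expression): gene expression is controlled by both endogenous and exogenous signals.

Regulation of Gene Expression
[Figure labels: chromatin epigenetic control; protein degradation; RNA silencing; the scope of gene regulation in the usual sense]

II. Temporal and spatial specificity of gene expression
(I) Temporal specificity
As function requires, the expression of a particular gene occurs in a strict temporal order. This is called the temporal specificity of gene expression.
In multicellular organisms, temporal specificity is also called stage specificity.
[Figure: changes in the levels of the different types of β-globin during human development]

(II) Spatial specificity
Throughout the growth of an individual, a given gene product appears in different tissues in a defined spatial order. This is called the spatial specificity of gene expression.
This difference in distribution, which accompanies the temporal order of expression, is in fact determined by how cells are distributed within organs, so spatial specificity is also called cell or tissue specificity.

[Figure: BARD1 is expressed specifically in the apical domains of Arabidopsis inflorescence (A), ovules (B), anthers (C), and embryos (D). In situ hybridization; A, B, C, D: antisense BARD1 probe; E: sense BARD1 probe as a negative control.]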
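(Source: Zhu Yuxian, Chapter 5 slides)

[Figure: concentration profiles of the mRNAs and proteins of four maternal-effect genes along the anterior-posterior axis of the Drosophila embryo. Labels: BICOID, NANOS; mRNA, protein. See Chapter 10 (Genes and Development).]

Facts
• Identical genome: virtually every cell in an organism contains a complete set of genes
• Spatial specificity: but they are not all turned on in every cell or tissue
• Temporal specificity: each cell of an organism expresses a distinctive subset of genes at different times or developmental stages
• Tight regulation: during development, different cells express different sets of genes in a precisely regulated fashion

III. Modes of gene expression
According to how they respond to stimuli, modes of gene expression are divided into:
(I) Constitutive expression
Some genes are expressed continuously in almost all cells of an individual; they are usually called housekeeping genes.
Housekeeping genes
– genes for essential cellular structures and metabolic pathways (e.g. rRNA, actin, tubulin)
– usually expressed at a high level
– their expression level may still vary
This type of expression is also called constitutive gene expression.
rRNA, actin, and tubulin are commonly used as loading controls in RT-PCR or Northern blots.

(II) Inducible and repressible expression
When a specific environmental signal activates a gene and the amount of its product increases, the gene is called an inducible gene.
When a gene responds to an environmental signal by being repressed, it is called a repressible gene.
Most gene regulation adjusts the rates of transcription and translation of these genes, which changes the level of the encoded product and thereby affects its function.

IV. Biological significance of gene regulation
(I) Maintaining cell proliferation and differentiation
(II) Maintaining the growth and development of the individual
(III) Adapting to environmental change
These will be covered in Chapters 9 and 10 (Genes and Disease; Genes and Development).

In general, gene regulation takes place mainly at the level of transcription, that is, in how much mRNA is synthesized.

Transcription
1. Transcripts begin and end beyond the coding region (5' UTR and 3' UTR)
2. The primary transcript is processed by:
   5' capping
   3' end formation / polyadenylation
   splicing
3.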
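Mature transcripts are transported to the cytoplasm for translation

V. Basic elements of transcriptional regulation
(I) RNA polymerase
(II) Specific DNA sequences (cis-acting elements)
(III) Regulatory proteins (trans-acting factors)

Gene expression regulation at the level of DNA (transcriptional regulation)
-- highly sequence-dependent
-- varied regulation for different genes
cis-acting elements: promoters/regulatory sequences of genes
trans-acting factors: proteins and RNAs that bind cis-elements and promote or repress gene expression

(I) RNA polymerase
Promoters, regulatory sequences, and regulatory proteins affect RNA polymerase activity through DNA-protein and protein-protein interactions.
RNA Pol I: rRNA, relative activity 50-70%
RNA Pol II: mRNA, relative activity 20-40%
RNA Pol III: tRNA, relative activity 10%
RNA Pol IV: small ncRNA, relative activity ??

(II) Specific DNA sequences
Eukaryotic genomes contain specific DNA sequences that can regulate the expression of their own genes; these are called cis-acting elements.
Cis-acting elements are specifically recognized and bound by transcriptional regulatory proteins, which in turn affects gene expression.
Cis-acting elements are divided into:
   promoters
   enhancers
   silencers
[Figure labels: DNA; En/Si; Pro; transcription start site; coding sequence]

(III) Regulatory proteins of eukaryotic genes
Trans-acting factors
Regulatory proteins that interact directly or indirectly with cis-acting elements and thereby regulate gene transcription are collectively called trans-acting factors.
By function, they usually fall into three classes:
   General transcription factors: recognize promoter elements
   Transcription regulatory factors: recognize enhancers or silencers
   Co-regulators: do not themselves make DNA-protein interactions

1. General transcription factors (GTF)
A class of regulatory proteins that bind specifically, directly or indirectly, to the TATA box in the core promoter and initiate transcription.
[Figure: the transcription initiation complex (holoenzyme) formed by RNA polymerase II with the help of transcription factors. Labels: DNA, TATA, TBP, TAF, TFIIA, TFIIB, TFIIF, TFIIH, pol II]
TBP: TATA-box binding protein
TAF: TBP-associated factors
TFII: pol II-associated transcription factors

2. Transcription regulatory factors (transcription factor, TF)
These regulatory proteins recognize and bind sequences upstream of the transcription start site and distal enhancer elements, and regulate transcriptional activity through DNA-protein interactions.
They determine the temporal and spatial specificity of expression of different genes.
   Transcriptional activators
   Transcriptional repressors

3.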
Co-regulators (transcriptional regulator / co-factor)
Co-regulators first interact with transcription factors through protein-protein contacts and then alter their conformation to regulate transcriptional activity; they have no DNA-binding activity of their own.
Those that cooperate with transcriptional activators are co-activators;
those that cooperate with transcriptional repressors are co-repressors.

Common structural domains of transcription factors (TF)
   DNA-binding domain: rich in basic amino acids (K/R), positively charged
   Trans-activation domain:
      acidic activation domain (D/E-rich)
      glutamine (Q)-rich domain
      proline (P)-rich domain
   Protein-protein interaction domain (dimerization, co-factors)

1) The most common DNA-binding domains of TFs: zinc finger, bZIP, homeodomain, bHLH
(1) Zinc finger
   Cys-X2-4-Cys-X3-Phe-X5-Leu-X2-His-X3-His
   The C-terminal α-helix binds DNA
   Often binds the GC box
(2) Basic leucine zipper (bZIP)
(3) Basic helix-loop-helix (bHLH) proteins

2) Common trans-activation domains of TFs
(Activation domains are interchangeable)

Interaction Assays
The design of two-hybrid, three-hybrid, and related assays relies on separable functional domains.
   Two-hybrid assay (protein-protein)
   Tri-hybrid assay (protein-RNA)

Regulation of eukaryotic transcription initiation
Eukaryotic gene expression is usually controlled at the level of initiation of transcription.
1. RNA polymerase II
2. promoter and enhancers
3. transcription factors

Ordered Assembly and Pol II Holoenzyme
[Figure labels: one-step; multiple-step; TFIID]
• Holoenzyme: a supramolecular complex comprising Pol II, most GTFs, and the Mediator/Srb complex
• In yeast, a 2 MDa holoenzyme + TBP suffices for transcription

Sequential Assembly
Binding of TFIID (TBP + 11 TAFs, 800 kDa) to the TATA box is the first step in initiation. [Figure label: +25bp]
TBP: TATA binding protein
TAFs: TBP-associated factors
TFIIB binds to DNA, contacts RNA polymerase near the RNA exit site and at the active center, and orients it on DNA.
Q: prokaryotic -10 bp vs eukaryotic -25 bp?

CTD: RNA Pol II C-terminal domain
In eukaryotic cells, the transcription of genes is accurately orchestrated both spatially and temporally by the C-terminal domain of RNA polymerase II (CTD).
• CTD is an unusual extension appended to the C terminus of the largest subunit of RNA polymerase II.
• It comprises 25 to 52 tandem copies of the consensus heptad repeat Y1S2P3T4S5P6S7.
• S2 and S5 are major phosphorylation sites.
• CTD phosphorylation causes proline residues to switch isomerization state.
• Phosphorylation patterns on the CTD repeats determine which sets of factors associate with it, so the CTD provides a dynamic platform for recruiting different regulators of the transcription apparatus.

S2 & S5, the trigger for transcriptional process modulation
PIC: phospho-S5 is required for assembly of the PIC and facilitates mRNA capping by recruiting capping enzymes.
Elongation: S5 gradually becomes dephosphorylated, whereas S2 is phosphorylated.
Termination: phospho-S2 ensures efficient 3′-RNA processing by triggering recruitment of the 3′-RNA processing machinery.
End of the cycle: CTDs are free of phosphate groups; non-phosphorylated CTDs are required for RNA polymerase II to recycle and bind a promoter for the next cycle of transcription.

Many Transcriptional Activators
e.g. CAAT box, GC box
Factors involved in gene expression include RNA polymerase and the basal apparatus, activators that bind directly to DNA, co-activators that bind to both activators and the basal apparatus, and regulators that act on chromatin structure (chromatin remodeling complexes).
[Figure labels: near the initiation site; a little farther away]

SP1 stimulates transcription in the presence of TAFII110 (near)
SV40 early promoter
• GC boxes are bound by the DNA-binding protein SP1
• SP1 recruits TFIID by binding TAFII110
• A partially reconstituted complex (TBP and 3 TAFs), together with the other GTFs and Pol II, gives high levels of transcription

Mediator complex is targeted by an activator (far)
• Mediator is a stable complex containing 20-50 proteins
• Mediator binds to RNA pol II and to transcription factors (activators or repressors) and ‘mediates’ the regulatory signals to pol II

What is the mechanism of activation? ??
(interaction activation)
Two models:
1. Tethering the holoenzyme (recruitment)
2. Activating the holoenzyme (allosteric)

In favor of the recruitment model (playfully, the "seduction" model)
The tat protein of HIV can stimulate transcription initiation without binding DNA at all.
The activating domain of the tat protein can stimulate transcription if it is tethered in the vicinity of the promoter by binding to the RNA product (tar sequence) of a previous round of transcription.
[Figure labels: tat; tar]
The role of the DNA-binding domain is to bring the activation domain into the vicinity of the startpoint, and activation is independent of the means of tethering.
We can think of the DNA-binding (or, in the case of tat, RNA-binding) domain as providing a "tethering" function, whose main purpose is to ensure that the activation domain is in the vicinity of the initiation complex.
Tethering reflects a more general idea: initiation requires a high concentration of transcription factors in the vicinity of the promoter. This may be achieved when activators bind to enhancers or upstream promoter elements or, in an extreme case, by tethering to a newly made RNA product.

Summary
What all activators have in common: the specificity with which they recognize their target sites (promoters, enhancers) is determined by the DNA-binding domain.
The DNA-binding domain brings the trans-activation domain close to the basal transcription apparatus.
Activators that act directly have both a DNA-binding domain and a trans-activation domain.
Activators that lack a trans-activation domain may function together with co-activators that have one.
Many components of the basal transcription apparatus are targets of (co-)activators.
RNA polymerase can interact with many different transcription factors to form a holoenzyme complex that carries out its function.

‘Synergy’
High levels of transcription induced by multiple factors
• Transcription factors can enhance transcription in a non-linear manner
• Synergistic activation occurs due to multiple contacts with the machinery
• Multiple copies of the same activator also induce synergistic activation

Interferon β enhancer
• Enhancers often have binding sites for several transcription factors
• Transcription factors can bind cooperatively at adjacent sites
• Architectural factors (with no regulatory domains, e.g. HMG1) can assist assembly
• Binding affinity for both DNA and the machinery increases markedly
Mnemonic (HMG1): shoulder to shoulder and hand in hand; firm footing and plenty of appeal.

How do enhancers act independent of distance and orientation?
One possible strategy: looping
Cohesins help to stabilize enhancer-promoter interactions.

Two experiments support the looping model
-- The essential role of the enhancer is to increase the concentration of activator in the vicinity of the promoter.
-- An enhancer may function by bringing proteins into the vicinity of the promoter.
An enhancer does not act on a promoter at the opposite end of a long linear DNA, but becomes effective when the DNA is joined into a circle by a protein bridge.
An enhancer and promoter on separate circular DNAs do not interact, but can interact when the two molecules are catenated.

Steroid receptors are transcription factors (zinc finger TFs)
Receptors for many steroid and thyroid hormones have a similar organization, with an individual N-terminal region, a conserved DNA-binding region, and a C-terminal hormone-binding region.

Activation of Glucocorticoid Receptor (GR): nuclear shuttling
Glucocorticoids regulate gene transcription by causing their receptor to move into the nucleus and bind to an enhancer whose action is needed for promoter function.

Using the properties of GR to build an inducible fusion-protein expression system
[Figure labels: Dexamethasone; GR; X]
• Dexamethasone (DEX): a synthetic glucocorticoid used as an anti-inflammatory drug.
• By molecular cloning, GR is fused to the nuclear protein X under study, and the fusion is introduced as a transgene into yeast, animal cells, or plants.
• Without exogenous DEX, the fusion protein forms a complex with HSP90; for conformational and steric reasons it remains in the cytoplasm and cannot localize to the nucleus.
• When DEX is added, it diffuses into the cell and binds GR; the fusion protein changes conformation, its nuclear localization signal is exposed, and it enters the nucleus to carry out its function.
• The system is used to study the function of nuclear proteins (including transcription factors).

Activation Tagging approach in plants
Plant transformation
• Build a T-DNA that carries four tandem copies of the CaMV 35S enhancer sequence; the 4×35S element can strongly increase transcription of neighboring genes.
• Integrate the T-DNA fragment into the plant genome by transformation.
• Genes close to the T-DNA insertion site are expressed at higher levels, which yields gain-of-function transgenic plants for those genes.
1. mutant screening
2. locate the T-DNA insertion site in the Arabidopsis genome (how?)
3. identify the right gene conferring the mutant phenotype (how?)
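Step 3 above can be illustrated with a minimal sketch. It assumes the insertion coordinate from step 2 is already known and simply ranks annotated genes by distance from the insertion, on the premise stated above that the 4×35S enhancers raise expression of nearby genes. The gene names, coordinates, and the 10 kb window are hypothetical values chosen for illustration, not figures from the lecture. The output is only a candidate list; which gene is actually up-regulated and responsible for the phenotype still has to be tested experimentally.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    """A gene annotation on one chromosome (coordinates in bp)."""
    name: str
    start: int
    end: int


def distance_to_insertion(gene: Gene, insertion_pos: int) -> int:
    """Distance in bp from the insertion to the nearest gene boundary (0 if inside)."""
    if gene.start <= insertion_pos <= gene.end:
        return 0
    return min(abs(gene.start - insertion_pos), abs(gene.end - insertion_pos))


def candidate_genes(genes, insertion_pos, window_bp=10_000):
    """Return (distance, name) pairs for genes within window_bp, nearest first.

    window_bp is an assumed search radius, not a value from the lecture.
    """
    hits = [(distance_to_insertion(g, insertion_pos), g.name) for g in genes]
    return sorted(hit for hit in hits if hit[0] <= window_bp)


if __name__ == "__main__":
    # Hypothetical annotations around a hypothetical insertion at position 4000.
    annotations = [
        Gene("GENE_A", 1_000, 3_000),
        Gene("GENE_B", 6_000, 8_000),
        Gene("GENE_C", 30_000, 32_000),
    ]
    for dist, name in candidate_genes(annotations, insertion_pos=4_000):
        print(f"{name}: {dist} bp from the T-DNA insertion")
```

The ranking deliberately ignores strand and orientation, since an enhancer is expected to act regardless of orientation; a stricter filter could be added if the data call for it.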
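A chemical-inducible activation tagging vector pER16 in plants
RB->LB: T-DNA fragment (can be inserted into the genome)
XVE = LexA DNA-binding domain (X) + VP16 trans-activation domain (V) + human estrogen receptor regulatory domain (E)
NPTII: selectable marker for transformants
OLexA-46: LexA operator + CaMV 35S minimal promoter (not the enhancer)
How it works: the XVE fusion transcription factor is expressed constitutively under the control of the G10-90 promoter. When estrogen is added, it binds the receptor regulatory domain (E), which changes the conformation of the XVE fusion protein and moves it from the cytoplasm into the nucleus. In the nucleus, the LexA DNA-binding domain (X) of XVE specifically recognizes the LexA operator region (OLexA), and the VP16 trans-activation domain (V) drives high-level expression of the gene next to LB.

Enhancer Trap
An enhancer-trap plasmid contains a reporter gene (lacZ) and a minimal promoter. This promoter is too weak to drive reporter expression by itself, but it is very sensitive to enhancers. When such a plasmid is integrated into the genome, the reporter is expressed if an enhancer lies near the insertion site. Observing reporter expression (for example, its temporal and spatial specificity) therefore reveals what the enhancer does, which in turn allows study of the expression pattern of the endogenous gene that the enhancer regulates.

How ZFN technology works
Zinc-finger nucleases (ZFN) are engineered restriction endonucleases that use different zinc-finger motifs to recognize specific DNA sequences and a nuclease domain to cut the target DNA.
• Each α-helix in a zinc-finger array specifically recognizes 3-4 bases.
• Engineered α-helices that recognize specific DNA sequences use the general sequence shown in the figure; changing its 7 X positions lets a helix recognize a different base triplet. TGEK is the linker sequence between helices.
• A pair of fusion proteins, each an engineered zinc-finger domain joined to FokI (ZFN), can cut both DNA strands at a chosen site.

ZFN technology
Researchers can use ZFNs for many kinds of gene editing, such as gene knockout. ZFN libraries that recognize many DNA sequences have been built, but they cannot yet recognize an arbitrary DNA target, so their use is somewhat limited.
[Figure: ZFN-mediated targeted chromosomal deletion]

TALEN technology
TALEN = Transcription Activator-Like Effector + FokI nuclease fusion protein
TALE: transcription activator-like effector from Xanthomonas; TALEs can specifically bind and regulate plant genes during pathogenesis.
The code: in each 34-aa repeat, amino acids 12 and 13 specifically recognize a single DNA base, forming a special 2 aa -> 1 bp code. This property makes it possible to design TALE proteins that recognize any base sequence.
Example: place a designed TALE binding site upstream of a reporter gene (mCherry) and build a TALE protein that specifically recognizes this binding sequence. When both plasmids are introduced into cells together, the engineered TALE protein can switch on expression of the mCherry reporter.
A fusion of a specific TALE protein with the endonuclease FokI can cut 9-13 bp downstream of its recognition sequence. On this principle, transgenic fusion proteins can be designed to knock out a chosen endogenous gene. For example, with a pair of TALEN proteins (left and right) designed to recognize the Drosophila tnikb gene, endogenous tnikb is cut by FokI, and the progeny of the transgenic flies include individuals carrying tnikb mutations.

Eukaryotic transcriptional regulation is complex and diverse
* Different combinations of DNA elements can produce many types of transcriptional regulation.
* Multiple transcription factors can bind the same or different DNA elements.
* After transcription factors bind DNA elements, their effects on transcriptional activation differ: regulation can be positive or negative.

Study questions:
1. How could the functions of the different S2 and S5 phosphorylation patterns on the CTD be analyzed experimentally?
2. In an activation-tagged transgenic plant, how would you identify which gene's up-regulation causes the observed phenotype?

Good luck on the midterm. See you after next week.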
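Supplement to the TALEN section: the 2 aa -> 1 bp code described there can be sketched as a simple lookup. The lecture text does not list the residue pairs, so the table below is an assumption taken from the commonly cited repeat-variable diresidue (RVD) assignments (NI for A, HD for C, NG for T, NN for G). It is simplified: NN, for instance, is also reported to bind A. The function names are hypothetical, and the sketch ignores practical design constraints such as target-site context or repeat number.

```python
# Commonly cited RVD-to-base assignments (an assumption, not from the lecture).
# Each RVD is the amino-acid pair at positions 12 and 13 of one 34-aa repeat.
RVD_TO_BASE = {"NI": "A", "HD": "C", "NG": "T", "NN": "G"}
BASE_TO_RVD = {base: rvd for rvd, base in RVD_TO_BASE.items()}


def design_rvds(target: str) -> list[str]:
    """Return one RVD per base of the target DNA sequence (5' to 3')."""
    target = target.upper()
    unknown = sorted(set(target) - set(BASE_TO_RVD))
    if unknown:
        raise ValueError(f"unsupported bases in target: {unknown}")
    return [BASE_TO_RVD[base] for base in target]


def decode_rvds(rvds: list[str]) -> str:
    """Return the DNA sequence that a given RVD array is expected to recognize."""
    return "".join(RVD_TO_BASE[rvd] for rvd in rvds)


if __name__ == "__main__":
    repeats = design_rvds("GATC")
    print(repeats)               # one RVD per base, in target order
    print(decode_rvds(repeats))  # round-trips back to the target sequence
```

Because each repeat reads exactly one base, designing a TALE for a new target reduces to choosing one repeat per position, which is why the lecture contrasts TALEs with zinc fingers that read base triplets.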