Promoters
Epigenetics 2014
by Nigel Atkinson, The University of Texas at Austin

A) Lecture - Promoters

1) Expectations
You should be able to answer the following:
• What are promoters?
• What is a core promoter?
• How are they recognized (who does it, and a bit about how it is done)?

2) Important elements for transcription by RNA polymerase II
• RNA polymerase II transcribes protein-encoding genes
• DNA elements
  - promoter, core promoter, TSS (transcription start site)
  - proximal promoter? ~250 bp upstream
  - enhancer
  - silencer
  - insulator

3) Promoter recognition
a) The meaning of core promoter, proximal promoter elements, enhancers?
b) How is a promoter recognized?

4) RNA polymerase is a complex enzyme that interacts with a large number of proteins.

What are transcription factors?
• In normal cells, anything other than RNA polymerase or histones that binds a promoter or DNA regulatory element and increases or reduces the rate of transcription initiation.

Promoters & regulatory sequences
• Core promoter - a DNA sequence that specifies where transcription should start and which way it will go.
  - By itself it does not have to specify that transcription actually occurs (exceptions exist).
• Regulatory sequences are sequences that determine how much, and when, to transcribe a gene.
  - By themselves, they do not specify where transcription should start or which way it should go (exceptions exist).
• Enhancers and silencers are DNA elements

Promoters & regulatory sequences
• What do people mean when they say promoter?
• What do people mean when they say proximal promoter?
• Be sure that you understand the difference between how these terms are currently used and the concept of the core promoter.

Eukaryotic RNA polymerase II is complex
• For protein-encoding genes, it is RNA polymerase II that performs transcription
• 12 proteins
• 3 are evolutionarily related to prokaryotic RNA polymerase subunits
• Five subunits are common to all nuclear polymerases
[Figure: RNA polymerase II subunits. Pink - essential for function; yellow - common to all three eukaryotic polymerases.]

Transcriptional Regulation
• Mammals regulate ∼25,000 genes
• Many with multiple promoters
• DNA being regulated is wrapped in chromatin
• Combinatorial control

Core promoters used by RNAP II
• -40 to +50
• In vivo they are inactive or expressed very weakly. They need exogenous stimulation (exceptions exist). They may have much more activity in vitro.
• Preinitiation complex assembles here
• Determines start site and direction of transcription

• TFIID = TBP + 8-10 TAFIIs
• TFIID binds the minor groove of the TATA box.
• TFIIF - ATP-dependent helicase activity; 2 proteins; also reduces the affinity of polymerase for nonpromoter DNA
[Figure: assembly of the preinitiation complex on a core promoter with TATA and Inr. Labels: TFIID, TFIIA (A), TFIIB (B), DAB complex, Pol II, TFIIF (F), TFIIE (E), TFIIH (H), mediator, Holoenzyme, Complete.]

[Figure: map of the core promoter and proximal promoter elements around the TSS (+1), with upstream elements (SP1) toward the 5′ side and downstream elements toward the 3′ side. Not shown - CpG islands.]
• TATA: TATA box, -31 to -26, TATA A/T A A/T
• INR: Initiator element, -2 to +4, Py Py A N T/A Py Py
• BRE: TFIIB binding element, -37 to -32, G/C G/C G/C CGCCC
• DCE: Downstream core element; DCE I +6 to +11, DCE II +16 to +21, DCE III +32 to +34; sequences shown: CTTC, CTTC, AGC
• DPE: Downstream promoter element, +28 to +30, RG A/T CGTG
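
To make the "-40 to +50" core promoter window concrete, here is a minimal Python sketch of how promoter-relative positions map onto a sequence string. It assumes the usual numbering convention in which the TSS is +1 and there is no position 0. The function names and the toy sequence are hypothetical and introduced only for illustration.

```python
def promoter_pos_to_index(pos: int, tss_index: int) -> int:
    """Convert a promoter-relative position (TSS = +1, no position 0)
    into a 0-based index into the sequence string."""
    if pos == 0:
        raise ValueError("there is no position 0: the TSS is +1, the base before it is -1")
    # Downstream positions start at the TSS itself; upstream ones count back from it.
    return tss_index + pos - 1 if pos > 0 else tss_index + pos


def core_promoter_window(seq: str, tss_index: int, start: int = -40, end: int = 50) -> str:
    """Return the stretch of seq from `start` to `end` (inclusive) relative to the TSS."""
    lo = promoter_pos_to_index(start, tss_index)
    hi = promoter_pos_to_index(end, tss_index)
    if lo < 0 or hi >= len(seq):
        raise ValueError("sequence does not cover the requested window")
    return seq[lo:hi + 1]


# Toy sequence (assumption): 100 bases upstream, a G at the TSS, 100 bases downstream.
toy = "A" * 100 + "G" + "C" * 100
window = core_promoter_window(toy, tss_index=100)
print(len(window))  # 90 (40 upstream bases + 50 bases from +1 to +50)
```

Because position 0 is skipped, the -40 to +50 window is 90 bp long, not 91.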

A few common combinations
• TATA or a DPE, usually not both
• TATA with DCE
• TATA with INR
• INR with DPE

Not shown: MTE
• MTE stands for "motif ten element"
• Found at +18 to +29
• Found in Drosophila
• Has not yet been shown to be important in mammals
• MTE requires INR
• TATA with MTE is common
• MTE with DPE is common
• MTE can substitute for TATA and INR

Figure 3. Signatures of active promoters. A nucleosome free region (NFR) surrounds the transcriptional start site (TSS) in the core promoter, which may contain core promoter elements, including BRE, TATA, Inr, MTE, DPE and others (positions are relative to the +1 TSS within the Inr; please see detailed explanation of these elements in the main text and in Table 1). The nucleosomes flanking the NFR contain the histone variant H2A.Z, while other nucleosomes contain normal H2A and other histone proteins that are subject to various modifications. Histone acetylation peaks just downstream of the promoter, while methylation of histone 3 lysine 4 is present in a gradient, from trimethylation (H3K4me3) at the promoter, to di- and then monomethylation (H3K4me2, H3K4me1) with increasing distance from the promoter into the transcribed region. This diagram is a composite of features determined in yeast, fly and mammalian systems; it is representative of some important characteristics of promoters identified in large-scale studies.

... the sequence motifs responsible for this critical step in gene regulation, revealing a collection of short regulatory DNA sequence elements conserved across species. While the first core promoter element has been known for almost 30 years, additional novel sequence elements have been discovered recently, emphasizing the importance of continued research of these regulatory sequences. Most of the canonical core promoter elements have been thoroughly reviewed elsewhere [2], but it is useful to describe their general features here (see Table 1) in light of recent genome-wide analyses of these elements. Note that there are no 'universal' core promoter elements; the sequences described below are found in only a subset of promoters, and the origins and functional consequences of the resulting core promoter diversity are a topic of current study.

The first core promoter element identified was the TATA-box, whose consensus sequence (TATAWAAR; degenerate nucleotides according to IUPAC code, http://www.chem.qmul.ac.uk/iubmb/misc/naseq.html) was determined by comparison of 5′ flanking regions in several organisms [31]. The TATA-box is located approximately 25–30 bp upstream of the transcription start site in most eukaryotes, though in yeast it is found slightly further upstream [32]. It is typically recognized by the TATA binding protein (TBP) subunit of the general transcription factor TFIID [33], though additional related but distinct proteins can also recognize this element [34].

The initiator element (Inr; YYANWYY) immediately surrounds the transcription start site [35] and is found in promoters containing or lacking a TATA-box. While the Inr can stimulate transcription independently of a TATA-box, these two elements act synergistically when found together [36]. This element is recognized by the TAF1 and TAF2 subunits of TFIID [37].

The downstream promoter element (DPE; RGWYV) [38] is typically found in TATA-less promoters and functions with the Inr as a downstream counterpart to the TATA-box [39]. The DPE is located at +28 to +32 relative to the TSS, with this exact spacing critical to optimal transcription [40]. Like the TATA-box and Inr, this element is recognized by TFIID, likely the TAF6 and TAF9 subunits, but not TBP [41]. There is evidence that the presence of a TATA-box or DPE in a promoter can influence its interactions with enhancers [42] and transcriptional activation or repression [43], suggesting multiple regulatory mechanisms acting at the core promoter.

The TFIIB recognition element (BRE; SSRCGCC) consists of the 7 bp immediately upstream of the TATA-box, and as its name suggests, it is bound by transcription factor IIB [44]. The BRE has been shown to both stimulate and repress transcriptional activity [45].

The motif ten element (MTE; CSARCSSAACGS) was identified in a computational survey of Drosophila promoters [46], located +18 to +29 downstream of the TSS and overlapping slightly with the 5′-end of the DPE. The MTE requires Inr and functions synergistically with the ...

Table 1. Summary of sequence and frequency of core promoter elements.

Core element   Position relative to TSS*   Consensus sequence**   Frequency in promoters
                                                                  Flies      Vertebrates
TATA           approx. –31 to –26          TATAWAAR               33–43%     10–16%
Inr            –2 to +4                    YYANWYY                69%        55%
DPE            +28 to +32                  RGWYV                  40%        48%
BRE            approx. –37 to –32          SSRCGCC                –          12–62%
MTE            +18 to +29                  CSARCSSAACGS           8.5%       –

* The TSS is assigned to position +1.
** Degenerate nucleotides represented using IUPAC codes.

Why so many elements?
... core promoters. The use of DPE- and TATA-specific activators would enable the construction of more sophisticated and effective connections between enhancers and promoters (Fig. 5).

There exist many variants on the core promoter sequence. Why?

TBP-related factors (TRFs) and transcriptional regulation
There is diversity not only in core promoter elements but also in the basal transcription machinery. This concept is nicely exemplified in studies of the TBP-related factors (TRFs) (for reviews, see: Jones, 2007; Müller et al., 2007; Reina and Hernandez, 2007; Torres-Padilla and Tora, 2007). There are three TRFs, which are generally termed TRF1, TRF2, and TRF3.

• 1. Different core promoters may assemble different pre-initiation complexes with specific features required for cell-specific regulation. One example is tuning between enhancers and promoters.

Fig. 5. A simplified, hypothetical diagram of activation by DPE- and TATA-specific factors. Transcription factors bind to enhancers but only activate transcription from promoters with the appropriate core promoter elements. The core promoter containing both TATA and DPE motifs can be activated by either DPE- or TATA-specific activators.

Juven-Gershon, T. & Kadonaga, J.T. (2010) Regulation of gene expression via the core promoter and the basal transcriptional machinery. Dev Biol, 339, 225-229.

There exist many variants on the core promoter sequence. Why?
• 2. Different core promoters might bind the same basal factors more or less tightly. Why is this important?

What are transcription factors?
• In normal cells, anything other than RNA polymerase or histones that binds a promoter or DNA regulatory element and increases or reduces the rate of transcription initiation.

General transcription factors
• Interact directly with the core promoter. Determine the site of initiation and the direction of transcription.
• TFIIA, TFIIB, TFIID, TFIIE, TFIIF, TFIIH and Mediator.
• In eukaryotes, the RNA polymerase holoenzyme cannot recognize promoters by itself.
• The general (aka basal) transcription factors recognize the core promoter.
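
Recognition of the core promoter depends on short sequence elements, so the consensus sequences in Table 1 can be used for a rough computational scan. The following is a minimal Python sketch, not a method from the lecture or the cited papers. It expands the IUPAC codes into regular expressions and reports every exact match with its position relative to the TSS (TSS = +1, no position 0). The names `IUPAC`, `CORE_ELEMENTS` and `find_elements` and the toy sequence are hypothetical. Exact consensus matching is a simplification; real promoter analyses usually score degenerate matches and check the expected position of each element.

```python
import re

# IUPAC degenerate nucleotide codes expanded to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}

# Consensus sequences as listed in Table 1.
CORE_ELEMENTS = {
    "BRE": "SSRCGCC",
    "TATA": "TATAWAAR",
    "Inr": "YYANWYY",
    "DPE": "RGWYV",
    "MTE": "CSARCSSAACGS",
}


def consensus_to_regex(consensus: str) -> re.Pattern:
    """Build a regex for an IUPAC consensus; the lookahead also finds overlapping hits."""
    body = "".join(IUPAC[base] for base in consensus)
    return re.compile("(?=(" + body + "))")


def find_elements(seq: str, tss_index: int):
    """Return (element, start position relative to TSS, matched sequence) tuples."""
    seq = seq.upper()
    hits = []
    for name, consensus in CORE_ELEMENTS.items():
        for match in consensus_to_regex(consensus).finditer(seq):
            offset = match.start() - tss_index
            # TSS is +1 and there is no position 0.
            pos = offset + 1 if offset >= 0 else offset
            hits.append((name, pos, match.group(1)))
    return sorted(hits, key=lambda hit: hit[1])


# Toy promoter (assumption): a TATA box starting at -31 and an Inr whose A is the TSS.
toy = "C" * 9 + "TATAAAAG" + "C" * 21 + "TCAGTCT" + "C" * 20
print(find_elements(toy, tss_index=40))
# [('TATA', -31, 'TATAAAAG'), ('Inr', -2, 'TCAGTCT')]
```

Short degenerate motifs such as RGWYV match by chance very often in real genomic sequence, which is one reason a hit only counts as a candidate element when it also sits at the expected distance from the TSS.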

• TFIIH - kinase that phosphorylates the YSPTSPS repeats of the CTD (C-terminal domain of Pol II)
• Unphosphorylated RNAP = RNAPIIA = initiation specific
• Phosphorylated RNAP = RNAPIIO = for chain elongation
• TFIIH also has helicase activity

TFIID: What is TFIID?
• TFIID = TBP + 8-10 TAFIIs
[Figure: TFIID drawn as TBP surrounded by TAFs; labels include TAF250, TAF150, TAF110, TAF60 and smaller subunits marked 80, 40 and 30 (two).]

TFIID: TBP can bend DNA
• The crystal structure of TBP suggested a saddle.
• The TBP-DNA co-crystal indicated that it does not sit like a saddle on a horse.

TFIID: Who recognizes what?
• TATA-less promoters are not recognized by TBP
• TAFII250 & TAFII150 impart the ability to recognize Inr & DPE
• Other TFs (e.g. Sp1) are necessary for promoters that lack TATA, Inr and DPE

TAFs can have enzymatic activity
• TAFII250 - Histone acetyltransferase (HAT) activity that acetylates lysine residues of histones.
  This can lead to remodeling of the chromatin.
• TAFII250 - Protein kinase activity that phosphorylates itself and TFIIF, TFIIA and TFIIE. Thought to modulate the activity of the initiation complex.

Enhancers
• Enhancers are DNA elements that regulate core promoter activity
• TFIID has many targets for interactions

Promoter Clearance
• Kinase activity of TFIIH activates polymerase for elongation.
• ATP-dependent DNA helicase activity of TFIIH is required for promoter clearance.

Enhancers - how do they work?
Proteins bind these DNA elements and then do at least one of the following:
• Stabilize the pre-initiation complex
  - Help hold it down - "many hands" model
  - TFIID alone has many places where interactions can take place
• Activate enzymatic activities within the pre-initiation complex
  - e.g. TAFII250 HAT or TFIIH kinase
• Bend the DNA - e.g. LEF1
• Prepare the area <--- REALLY IMPORTANT: THIS CONTAINS MOST OF THE EPIGENETIC MECHANISMS THAT WE WILL DISCUSS
  - This can make the binding sites visible
  - Changes to histones can stabilize the preinitiation complex
  - Changes to histones can help recruit the preinitiation complex

What do I mean by the word recruit?
• e.g. Some preparations can include modifications of histones. These can be attractive to important transcription factors. That is, they can increase the local concentration of the transcription factor.
• e.g. Alter the histones so that needed proteins such as TFIID concentrate in the area.

Silencers - how do they work?
Proteins bind these DNA elements and then do the opposite.

Enhancers and silencers
These are names. A name is chosen because it describes the first thing that an element is observed to do. It is common to find out later that the same enhancer is used to suppress transcription in a different cell. My point is that biology is not restrained by the names that we give things. Many, many people forget this.