Biochemistry 2/e - Garrett & Grisham

Chapter 31
Transcription and Regulation of Gene Expression
to accompany Biochemistry, 2/e by Reginald Garrett and Charles Grisham

All rights reserved. Requests for permission to make copies of any part of the work should be mailed to: Permissions Department, Harcourt Brace & Company, 6277 Sea Harbor Drive, Orlando, Florida 32887-6777

Copyright © 1999 by Harcourt Brace & Company

Outline
• 31.1 Transcription in Prokaryotes
• 31.2 Transcription in Eukaryotes
• 31.3 Regulation of Transcription in Prokaryotes
• 31.4 Transcription Regulation in Eukaryotes
• 31.5 Structural Motifs in DNA-Binding Proteins
• 31.6 Post-Transcriptional Processing of mRNA

The Postulate of Jacob and Monod
• Before it had been characterized in a molecular sense, messenger RNA was postulated to exist by F. Jacob and J. Monod.
• Their four properties:
– base composition that reflects DNA
– heterogeneous with respect to mass
– able to associate with ribosomes
– high rate of turnover

Other Forms of RNA
• rRNA and tRNA only appreciated later
• All three forms participate in protein synthesis
• All made by DNA-dependent RNA polymerases. This process is called transcription
• Not all genes encode proteins! Some encode rRNAs or tRNAs
• Transcription is tightly regulated. Only 0.01% of genes in a typical eukaryotic cell are undergoing transcription at any given moment
• How many proteins is that?

Transcription in Prokaryotes
Only a single RNA polymerase
• In E. coli, RNA polymerase is a 465 kD complex, with 2 α, 1 β, 1 β′, 1 σ
• β′ binds DNA
• β binds NTPs and interacts with σ
• σ recognizes promoter sequences on DNA
• α subunits appear to be essential for assembly and for activation of enzyme by regulatory proteins

Stages of Transcription
See Figure 31.2
• binding of RNA polymerase holoenzyme at promoter sites
• initiation of polymerization
• chain elongation
• chain termination

Binding of polymerase to Template DNA
• Polymerase binds nonspecifically to DNA with low affinity and migrates, looking for promoter
• Sigma subunit recognizes promoter sequence
• RNA polymerase holoenzyme and promoter form "closed promoter complex" (DNA not unwound) - Kd = 10^-6 to 10^-9 M
• Polymerase unwinds about 12 base pairs to form "open promoter complex" - Kd = 10^-14 M

Properties of Promoters
See Figure 31.3
• Promoters typically consist of a 40-bp region on the 5'-side of the transcription start site
• Two consensus sequence elements:
– The "-35 region", with consensus TTGACA - sigma subunit appears to bind here
– The Pribnow box near -10, with consensus TATAAT - this region is ideal for unwinding - why?
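
The two consensus elements on the promoter slide can be located with a simple scan. The following is a minimal sketch, not a real promoter-prediction method: the function names, the 15-19 bp spacer window, the mismatch threshold and the test sequence are all assumptions made for illustration; only the TTGACA and TATAAT hexamers come from the slide.

```python
MINUS_35 = "TTGACA"  # consensus of the -35 region (from the slide)
MINUS_10 = "TATAAT"  # Pribnow box consensus near -10 (from the slide)


def matches(hexamer, consensus):
    """Count positions where a 6-mer agrees with the consensus."""
    return sum(a == b for a, b in zip(hexamer, consensus))


def at_fraction(seq):
    """Fraction of bases that are A or T."""
    return sum(base in "AT" for base in seq) / len(seq)


def find_promoters(dna, min_matches=5, spacers=range(15, 20)):
    """Return (pos_35, pos_10, score) for each -35/-10 pair found.

    Positions are 0-based starts of each hexamer. The spacer window
    and min_matches threshold are illustrative assumptions.
    """
    dna = dna.upper()
    hits = []
    for i in range(len(dna) - 5):
        s35 = matches(dna[i:i + 6], MINUS_35)
        if s35 < min_matches:
            continue
        for gap in spacers:
            j = i + 6 + gap
            if j + 6 > len(dna):
                break
            s10 = matches(dna[j:j + 6], MINUS_10)
            if s10 >= min_matches:
                hits.append((i, j, s35 + s10))
    return hits


# Hypothetical test sequence: both consensus hexamers with a 17-bp spacer.
example = "GGC" + MINUS_35 + "GCGCGCGCGCGCGCGCG" + MINUS_10 + "GCGCATGC"
print(find_promoters(example))
print(at_fraction(MINUS_10))
```

The `at_fraction` helper hints at the slide's "why?": the -10 consensus is made entirely of A:T pairs, which are held by two hydrogen bonds rather than the three of a G:C pair.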

Initiation of Polymerization
• RNA polymerase has two binding sites for NTPs
• Initiation site prefers to bind ATP and GTP (most RNAs begin with a purine at the 5'-end)
• Elongation site binds the second incoming NTP
• 3'-OH of the first NTP attacks the alpha-P of the second to form a new phosphoester bond (eliminating PPi)
• When a 6-10 unit oligonucleotide has been made, the sigma subunit dissociates, completing "initiation"
• Note rifamycin and rifampicin and their different modes of action (Fig. 31.4 and related text)

Chain Elongation
• Core polymerase - no sigma
• Polymerase is accurate - only about 1 error in 10,000 bases
• Even this error rate is OK, since many transcripts are made from each gene
• Elongation rate is 20-50 bases per second: slower in G/C-rich regions (why?) and faster elsewhere
• Topoisomerases precede and follow polymerase to relieve supercoiling

Chain Termination
Two mechanisms
• Rho - the termination factor protein
– rho is an ATP-dependent helicase
– it moves along the RNA transcript, finds the "bubble", unwinds it and releases the RNA chain
• Specific sequences - termination sites in DNA
– inverted repeat, rich in G:C, which forms a stem-loop in the RNA transcript
– a run of 6-8 A residues in the DNA, coding for U residues in the transcript

Transcription in Eukaryotes
• RNA polymerases I, II and III transcribe rRNA, mRNA and tRNA genes, respectively
• Pol III transcribes a few other RNAs as well
• All 3 are big, multimeric proteins (500-700 kD)
• All have 2 large subunits with sequences similar to β and β′ in E. coli RNA polymerase, so the catalytic site may be conserved
• Pol II is most sensitive to α-amanitin, an octapeptide from Amanita phalloides (the "death cap" mushroom)

Transcription Factors
More on this later, but a short note now
• The three polymerases (I, II and III) interact with their promoters via so-called transcription factors
• Transcription factors recognize and initiate transcription at specific promoter sequences
• Some transcription factors (TFIIIA and TFIIIC for RNA polymerase III) bind to specific recognition sequences within the coding region

RNA Polymerase II
Most interesting because it regulates synthesis of mRNA
• Yeast Pol II consists of 10 different peptides (RPB1 - RPB10)
• RPB1 and RPB2 are homologous to E. coli RNA polymerase β′ and β, respectively
• RPB1 has the DNA-binding site; RPB2 binds NTP
• RPB1 has a C-terminal domain (CTD) of PTSPSYS repeats
• 5 of these 7 residues have an -OH, so this is a hydrophilic and phosphorylatable site

More RNA Polymerase II
• CTD is essential, and this domain may project away from the globular portion of the enzyme (up to 50 nm!)
• Only RNA Pol II whose CTD is NOT phosphorylated can initiate transcription
• TATA box (TATAAA) is a consensus promoter
• 7 general transcription factors are required
• See TFIID bound to TATA (Fig. 31.11)

Transcription Regulation in Prokaryotes
• Genes encoding the enzymes of a metabolic pathway are often grouped in clusters on the chromosome - called operons
• This allows coordinated expression
• A regulatory sequence adjacent to such a unit determines whether it is transcribed - this is the 'operator'
• Regulatory proteins work with operators to control transcription of the genes

Induction and Repression
• Increased synthesis of enzymes in response to a metabolite is 'induction'
• Decreased synthesis in response to a metabolite is 'repression'
• Some compounds induce enzyme synthesis even though the enzymes can't metabolize them - these are 'gratuitous inducers', such as IPTG
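
The statement on the RNA Polymerase II slide that 5 of the 7 residues in the CTD repeat PTSPSYS carry an -OH group can be checked by counting Ser, Thr and Tyr. This is a small sketch only: the function name is hypothetical, and the three-repeat stretch is an arbitrary illustration, not the actual repeat count of any organism's CTD.

```python
# Ser (S), Thr (T) and Tyr (Y) are the standard amino acids whose
# side chains carry a hydroxyl group and can be phosphorylated.
HYDROXYL_RESIDUES = set("STY")
HEPTAD = "PTSPSYS"  # CTD repeat as written on the slide


def hydroxyl_sites(peptide):
    """Return (1-based position, residue) for each -OH-bearing residue."""
    return [(i, aa) for i, aa in enumerate(peptide, start=1)
            if aa in HYDROXYL_RESIDUES]


sites = hydroxyl_sites(HEPTAD)
print(f"{len(sites)} of {len(HEPTAD)} residues carry -OH: {sites}")

# Arbitrary illustrative stretch of three tandem repeats.
print(len(hydroxyl_sites(HEPTAD * 3)), "candidate sites in 3 repeats")
```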

The lac Operon
• lacI mutants express the genes needed for lactose metabolism constitutively, whether or not inducer is present
• The structural genes of the lac operon are controlled by negative regulation
• lacI gene product is the lac repressor
• The lac operator is a palindromic DNA sequence
• lac repressor - DNA binding on N-term; C-term binds inducer and forms the tetramer

Catabolite Activator Protein
Positive Control of the lac Operon
• Some promoters require an accessory protein to speed transcription
• Catabolite Activator Protein or CAP is one such protein
• CAP is a dimer of 22.5 kD peptides
• N-term binds cAMP; C-term binds DNA
• Binding of CAP-(cAMP)2 to DNA assists formation of closed promoter complex

The trp Operon
• Encodes a leader sequence and 5 proteins that synthesize tryptophan
• Trp repressor controls the operon
• Trp repressor binding excludes RNA polymerase from the promoter
• Trp repressor also regulates the trpR and aroH operons and is itself encoded by the trpR operon. This is autogenous regulation (autoregulation).
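
The negative control by lac repressor and the positive control by CAP combine into a simple decision rule. The sketch below is a qualitative truth table, not a kinetic model: the function name, the three expression labels, and the simplification that cAMP is high exactly when glucose is absent are assumptions made for illustration.

```python
from itertools import product


def lac_expression(inducer_present, glucose_present, laci_functional=True):
    """Qualitative lac operon output: 'off', 'low' or 'high'.

    'off' here means repressed (basal) transcription, not strictly zero.
    """
    # Negative control: a functional repressor sits on the operator
    # unless inducer (e.g. IPTG) is bound to it.
    repressor_bound = laci_functional and not inducer_present
    # Positive control: CAP-(cAMP)2 binds near the promoter when cAMP
    # is high; assumed here to be the case only when glucose is absent.
    cap_bound = not glucose_present

    if repressor_bound:
        return "off"
    return "high" if cap_bound else "low"


for inducer, glucose in product([False, True], repeat=2):
    level = lac_expression(inducer, glucose)
    print(f"inducer={inducer!s:5} glucose={glucose!s:5} -> {level}")

# A lacI mutant expresses the operon even without inducer.
print(lac_expression(False, False, laci_functional=False))
```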

Transcription Regulation in Eukaryotes
• More complicated than prokaryotes
• Chromatin limits access of regulatory proteins to promoters
• Factors must reorganize the chromatin
• In addition to promoters, eukaryotic genes have 'enhancers', also known as upstream activation sequences
• DNA looping permits multiple proteins to bind to multiple DNA sequences

Structural Motifs in DNA-Binding Regulatory Proteins
• Crucial feature must be atomic contacts between protein residues and the bases and sugar-phosphate backbone of DNA
• Most contacts are in the major groove of DNA
• 80% of regulatory proteins can be assigned to one of three classes: helix-turn-helix (HTH), zinc finger (Zn-finger) and leucine zipper (bZIP)
• In addition to DNA-binding domains, these proteins usually possess other domains that interact with other proteins

Alpha Helices and DNA
A perfect fit!
• A recurring feature of DNA-binding proteins is the presence of α-helical segments that fit directly into the major groove of B-form DNA
• Diameter of helix is 1.2 nm
• Major groove of DNA is about 1.2 nm wide and 0.6 to 0.8 nm deep
• Proteins can recognize specific sites in DNA

The Helix-Turn-Helix Motif
• First identified in 3 prokaryotic proteins: two phage λ repressor proteins (Cro and cI) and the E. coli catabolite activator protein (CAP)
• All these bind as dimers to dyad-symmetric sites on DNA (see Figure 31.33)
• All contain two alpha helices separated by a loop with a beta turn
• The C-terminal helix fits in the major groove of DNA; the N-terminal helix stabilizes it by hydrophobic interactions with the C-terminal helix

Helix-Turn-Helix II
See Figures 31.34 and 31.35
• Residues 1-7 of the motif are the first helix (but called "helix 2")
• Residue 9 is the turn maker - a Gly, of course
• Residues 12-20 are the second helix (called "helix 3")
• Recognition of DNA sequence involves the sides of base pairs that face the major groove (see discussion on pages 1050-1052)

The Zn-Finger Motif
First discovered in TFIIIA from Xenopus laevis, the African clawed toad
• Now known to exist in nearly all organisms
• Two main classes: C2H2 and Cx
• C2H2 domains consist of Cys-x2-Cys and His-x3-His domains separated by at least 7-8 amino acids
• Cx domains consist of 4, 5 or 6 Cys residues separated by various numbers of other residues
• See Figure 31.37 and Table 31.7
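
The C2H2 spacing rule on the Zn-finger slide maps naturally onto a regular expression. This is a rough sketch, assuming one-letter amino acid codes: the upper bound of 20 residues on the linker, the function name and the test peptides are assumptions for illustration, and a pattern this loose is no substitute for curated motif profiles.

```python
import re

# Cys-x2-Cys, a linker of at least 7 residues, then His-x3-His.
# The upper limit of 20 linker residues is an assumption that keeps
# the pattern from spanning unrelated Cys and His pairs.
C2H2_PATTERN = re.compile(r"C.{2}C.{7,20}?H.{3}H")


def find_c2h2(protein):
    """Return (start, end, motif) for each non-overlapping candidate.

    Coordinates are 0-based with an exclusive end.
    """
    protein = protein.upper()
    return [(m.start(), m.end(), m.group())
            for m in C2H2_PATTERN.finditer(protein)]


# Hypothetical peptides: one with a 12-residue linker, one with a
# linker that is too short to satisfy the rule.
print(find_c2h2("AAACPECGKAFSRSDELTRHLRTHAA"))
print(find_c2h2("CPECGKAHLRTH"))
```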

More Zn-Fingers
Their secondary and tertiary structures
• C2H2-type Zn fingers form a folded beta strand and an alpha helix that fits into the DNA major groove
• Cx-type Zn fingers consist of two minidomains of four Cys ligands to Zn followed by an alpha helix: the first helix is the DNA-recognition helix, the second helix packs against the first

The Leucine Zipper Motif
• First found in C/EBP, a DNA-binding protein in rat liver nuclei
• Now found in nearly all organisms
• Characteristic features: a 28-residue sequence with Leu at every 7th position and a "basic region" (What do you know by now about 7-residue repeats?)
• This suggests an amphipathic alpha helix and a coiled-coil dimer

The Structure of the Zipper and its DNA complex
• Leucine zipper proteins (aka bZIP proteins) dimerize, either as homo- or hetero-dimers
• The basic region is the DNA-recognition site
• Basic region is often modelled as a pair of helices that can wrap around the major groove
• Homodimers recognize dyad-symmetric DNA
• Heterodimers recognize non-symmetric DNA
• Fos and Jun are classic bZIPs

Post-transcriptional Processing of mRNA in Eukaryotes
• Translation closely follows transcription in prokaryotes
• In eukaryotes, these processes are separated - transcription in the nucleus, translation in the cytoplasm
• On the way from nucleus to cytoplasm, the mRNA is converted from "primary transcript" to "mature mRNA"

Eukaryotic Genes are Split
• Introns intervene between exons
• Example: the actin gene has a 309-bp intron that separates the codons for the first three amino acids from those for the other 350 or so
• But the chicken pro-alpha-2 collagen gene is 40 kbp long, with 51 exons of only 5 kbp total.
• The exons range in size from 45 to 249 bases
• Mechanism by which introns are excised and exons are spliced together is complex and must be precise

Capping and Methylation
• Primary transcripts (aka pre-mRNAs or heterogeneous nuclear RNA) are usually first "capped" by a guanylyl group
• The reaction is catalyzed by guanylyl transferase
• Capping G residue is methylated at the 7-position
• Additional methylations occur at the 2'-O positions of the next two residues and at the 6-amino group of the first adenine

3'-Polyadenylylation
• Termination of transcription occurs only after RNA polymerase has transcribed past a consensus AAUAAA sequence - the poly(A)+ addition site
• 10-30 nucleotides past this site, a string of 100 to 200 adenine residues is added to the mRNA transcript - the poly(A)+ tail
• poly(A) polymerase adds these A residues
• Function not known for sure, but the poly(A) tail may govern stability of the mRNA

Splicing of Pre-mRNA
Capped, polyadenylated RNA, in the form of an RNP complex, is the substrate for splicing
• In "splicing", the introns are excised and the exons are sewn together to form mature mRNA
• Splicing occurs only in the nucleus
• The 5'-end of an intron in higher eukaryotes is almost always GU and the 3'-end is almost always AG
• All introns have a "branch site" 18 to 40 nucleotides upstream from the 3'-splice site
• Branch site is essential to splicing
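
The polyadenylylation rule (an AAUAAA signal, with the tail added 10-30 nucleotides downstream) can be turned into a small coordinate calculation. This is only a sketch: the function name and test transcript are hypothetical, and measuring the 10-30 nucleotide offset from the end of the AAUAAA hexamer is an assumption, since the slide does not say where counting starts.

```python
POLY_A_SIGNAL = "AAUAAA"  # consensus signal from the slide


def poly_a_windows(rna, min_offset=10, max_offset=30):
    """Locate each AAUAAA and the window where the tail may be added.

    Returns (signal_start, window_start, window_end) tuples, 0-based
    with an exclusive end, clipped to the transcript length. Offsets
    are counted from the end of the signal (an assumption).
    """
    rna = rna.upper().replace("T", "U")  # accept DNA-style input too
    results = []
    start = rna.find(POLY_A_SIGNAL)
    while start != -1:
        signal_end = start + len(POLY_A_SIGNAL)
        window_start = min(signal_end + min_offset, len(rna))
        window_end = min(signal_end + max_offset, len(rna))
        results.append((start, window_start, window_end))
        start = rna.find(POLY_A_SIGNAL, start + 1)
    return results


# Hypothetical transcript: signal at index 4, then 40 nt of filler.
transcript = "GGCC" + POLY_A_SIGNAL + "GC" * 20
print(poly_a_windows(transcript))
```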

The Branch site and Lariat
• Branch site is usually YNYRAY, where Y = pyrimidine, R = purine and N is anything
• The "lariat", a covalently closed loop of RNA, is formed by attachment of the 5'-P of the intron's invariant 5'-G to the 2'-OH at the branch A site
• The exons then join, excising the lariat.
• The lariat is unstable; the 2'-5' phosphodiester is quickly cleaved and the intron is degraded in the nucleus.

The Importance of snRNP
• Small nuclear ribonucleoprotein particles - snRNPs, pronounced "snurps" - are involved in splicing
• A snRNP consists of a small RNA (100-200 bases long) and about 10 different proteins
• Some of the 10 proteins are general, some are specific. Properties described on page 1063
• snRNPs and pre-mRNA form the spliceosome
• Spliceosome is about the size of a ribosome, and its assembly requires ATP

Assembly of the Spliceosome
See Figure 31.53
• snRNPs U1 and U5 bind at the 5'- and 3'-splice sites, and U2 snRNP binds at the branch site
• Interaction between the snRNPs brings the 5'- and 3'-splice sites together so the lariat can form and exon ligation can occur
• The transesterification reactions that join the exons may in fact be catalyzed by "ribozymes"
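
The sequence rules from the splicing slides (GU at the 5'-end of the intron, AG at the 3'-end, and a YNYRAY branch site 18 to 40 nucleotides upstream of the 3'-splice site) can be combined into one check. This is a minimal sketch under stated assumptions: the function name, dictionary keys and test introns are hypothetical, and the distance is measured from the branch A to the last nucleotide of the intron, which is one of several reasonable conventions.

```python
import re

# YNYRAY: Y = pyrimidine (C/U), R = purine (A/G), N = any base.
# The lookahead lets overlapping candidate sites all be reported.
BRANCH_SITE = re.compile(r"(?=([CU][ACGU][CU][AG]A[CU]))")


def check_intron(intron, min_dist=18, max_dist=40):
    """Check an intron sequence against the slide's splicing rules."""
    intron = intron.upper().replace("T", "U")  # accept DNA-style input
    branch_sites = []
    for m in BRANCH_SITE.finditer(intron):
        a_pos = m.start() + 4  # 0-based index of the branch A
        # Nucleotides between the branch A and the intron's last base
        # (assumed convention for "upstream from the 3'-splice site").
        distance = len(intron) - 1 - a_pos
        if min_dist <= distance <= max_dist:
            branch_sites.append((a_pos, m.group(1)))
    return {
        "gu_5prime": intron.startswith("GU"),
        "ag_3prime": intron.endswith("AG"),
        "branch_sites": branch_sites,
    }


# Hypothetical introns: the first places the branch A 23 nt from the
# 3'-end, the second places it only 5 nt away.
good = "GU" + "G" * 10 + "CUCGAC" + "G" * 20 + "AG"
too_close = "GU" + "G" * 10 + "CUCGAC" + "GG" + "AG"
print(check_intron(good))
print(check_intron(too_close))
```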