Biochemistry 2/e - Garrett & Grisham
Chapter 31
Transcription and Regulation of
Gene Expression
to accompany
Biochemistry, 2/e
by
Reginald Garrett and Charles Grisham
All rights reserved. Requests for permission to make copies of any part of the work
should be mailed to: Permissions Department, Harcourt Brace & Company,
6277 Sea Harbor Drive, Orlando, Florida 32887-6777
Copyright © 1999 by Harcourt Brace & Company
Outline
• 31.1 Transcription in Prokaryotes
• 31.2 Transcription in Eukaryotes
• 31.3 Regulation of Transcription in Prokaryotes
• 31.4 Transcription Regulation in Eukaryotes
• 31.5 Structural Motifs in DNA-Binding Proteins
• 31.6 Post-Transcriptional Processing of mRNA
The Postulate of Jacob and
Monod
• Before it had been characterized in a molecular
sense, messenger RNA was postulated to exist
by F. Jacob and J. Monod.
• Their four properties:
– base composition that reflects DNA
– heterogeneous with respect to mass
– able to associate with ribosomes
– high rate of turnover
Other Forms of RNA
• rRNA and tRNA only appreciated later
• All three forms participate in protein synthesis
• All made by DNA-dependent RNA polymerases.
This process is called transcription
• Not all genes encode proteins! Some encode
rRNAs or tRNAs
• Transcription is tightly regulated. Only 0.01% of
genes in a typical eukaryotic cell are undergoing
transcription at any given moment
• How many proteins is that?
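The question on this slide becomes simple arithmetic once a gene count is chosen. A minimal sketch, assuming a purely hypothetical genome of 50,000 genes (the slide gives no gene count, so `total_genes` is an illustrative parameter, not a value from the text):

```python
from fractions import Fraction

# 0.01% written as an exact fraction to avoid floating-point rounding
FRACTION_TRANSCRIBED = Fraction(1, 10_000)

def genes_being_transcribed(total_genes):
    """Number of genes undergoing transcription at a given moment."""
    return FRACTION_TRANSCRIBED * total_genes

# Hypothetical gene count, chosen only for illustration
print(genes_being_transcribed(50_000))  # prints 5
```

Substituting a different gene count scales the answer linearly.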
Transcription in Prokaryotes
Only a single RNA polymerase
• In E. coli, RNA polymerase is a 465 kD
complex, with 2 α, 1 β, 1 β', 1 σ
• β' binds DNA
• β binds NTPs and interacts with σ
• σ recognizes promoter sequences on DNA
• α subunits appear to be essential for
assembly and for activation of enzyme by
regulatory proteins
Stages of Transcription
See Figure 31.2
• Binding of RNA polymerase holoenzyme
at promoter sites
• Initiation of polymerization
• Chain elongation
• Chain termination
Binding of polymerase to
Template DNA
• Polymerase binds nonspecifically to DNA with
low affinity and migrates, searching for a promoter
• Sigma subunit recognizes promoter sequence
• RNA polymerase holoenzyme and promoter
form "closed promoter complex" (DNA not
unwound) - Kd = 10^-6 to 10^-9 M
• Polymerase unwinds about 12 base pairs to form
"open promoter complex" - Kd = 10^-14 M
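The dissociation constants quoted for the closed and open promoter complexes can be compared on an energy scale with the standard relation ΔG° = RT ln Kd. A minimal sketch, assuming 25 °C and a 1 M standard state (neither is stated on the slide; `temperature_k` is an illustrative parameter):

```python
import math

R = 8.314  # gas constant, J mol^-1 K^-1

def binding_free_energy_kj(kd_molar, temperature_k=298.15):
    """Standard free energy of binding in kJ/mol, from dG = RT ln(Kd)."""
    return R * temperature_k * math.log(kd_molar) / 1000.0

# Kd values taken from the slide; the labels are informal
complexes = [
    ("closed complex (weak end)", 1e-6),
    ("closed complex (tight end)", 1e-9),
    ("open complex", 1e-14),
]
for label, kd in complexes:
    print(f"{label}: {binding_free_energy_kj(kd):.1f} kJ/mol")
```

More negative values mean tighter binding, so the open complex comes out as by far the most stable of the three under these assumptions.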
Properties of Promoters
See Figure 31.3
• Promoters typically consist of a 40 bp region
on the 5'-side of the transcription start site
• Two consensus sequence elements:
• The "-35 region", with consensus TTGACA
- sigma subunit appears to bind here
• The Pribnow box near -10, with consensus
TATAAT - this region is ideal for unwinding
- why?
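One way to make the two consensus elements concrete is to slide each hexamer along a sequence and count matching positions. The sketch below does that and also reports the A+T content of each element, which bears on the "why?" above, since A:T pairs are held by two hydrogen bonds and G:C pairs by three. The example sequence is invented for illustration, and the function names are hypothetical:

```python
MINUS_35 = "TTGACA"  # -35 region consensus (from the slide)
MINUS_10 = "TATAAT"  # Pribnow box consensus (from the slide)

def match_score(window, consensus):
    """Count positions where the window agrees with the consensus."""
    return sum(1 for a, b in zip(window, consensus) if a == b)

def best_match(sequence, consensus):
    """Return (start_index, score) of the best window; first one wins ties."""
    sequence = sequence.upper()
    k = len(consensus)
    best = (-1, -1)
    for i in range(len(sequence) - k + 1):
        score = match_score(sequence[i:i + k], consensus)
        if score > best[1]:
            best = (i, score)
    return best

def at_fraction(sequence):
    """Fraction of bases that are A or T."""
    sequence = sequence.upper()
    return (sequence.count("A") + sequence.count("T")) / len(sequence)

# Invented sequence: both consensus hexamers separated by a 17-bp spacer
example = "GGC" + MINUS_35 + "GCTAGCTCAGTCCTAGG" + MINUS_10 + "GCTAGC"
print(best_match(example, MINUS_35))
print(best_match(example, MINUS_10))
print(at_fraction(MINUS_10), at_fraction(MINUS_35))
```

Real promoters rarely match either consensus perfectly, which is why the sketch scores partial matches instead of requiring exact hits.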
Initiation of Polymerization
• RNA polymerase has two binding sites for NTPs
• Initiation site prefers to bind ATP and GTP (most
RNAs begin with a purine at 5'-end)
• Elongation site binds the second incoming NTP
• 3'-OH of first attacks alpha-P of second to form a
new phosphoester bond (eliminating PPi)
• When 6-10 unit oligonucleotide has been made,
sigma subunit dissociates, completing "initiation"
• Note rifamycin and rifampicin and their different
modes of action (Fig. 31.4 and related text)
Chain Elongation
Core polymerase - no sigma
• Polymerase is accurate - only about 1 error
in 10,000 bases
• Even this error rate is OK, since many
transcripts are made from each gene
• Elongation rate is 20-50 bases per second -
slower in G/C-rich regions (why?) and
faster elsewhere
• Topoisomerases precede and follow
polymerase to relieve supercoiling
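The error rate and elongation rate above can be turned into rough numbers for a single transcript. A minimal sketch, assuming errors occur independently at each base and using a hypothetical 1,000-nucleotide transcript (the slide does not give a transcript length):

```python
ERROR_RATE = 1 / 10_000  # about 1 error in 10,000 bases (from the slide)

def prob_error_free(length_nt, error_rate=ERROR_RATE):
    """Probability that a transcript of this length has no errors."""
    return (1 - error_rate) ** length_nt

def expected_errors(length_nt, error_rate=ERROR_RATE):
    """Mean number of misincorporated bases per transcript."""
    return length_nt * error_rate

def transcription_time_s(length_nt, rate_nt_per_s):
    """Seconds needed to elongate a transcript at a constant rate."""
    return length_nt / rate_nt_per_s

length = 1_000  # hypothetical transcript length in nucleotides
print(f"P(error-free) = {prob_error_free(length):.3f}")
print(f"expected errors = {expected_errors(length):.2f}")
fast = transcription_time_s(length, 50)
slow = transcription_time_s(length, 20)
print(f"elongation time = {fast:.0f} to {slow:.0f} s")
```

Under these assumptions most copies of a typical transcript are error-free, which is consistent with the slide's point that the error rate is tolerable when many transcripts are made from each gene.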
Chain Termination
Two mechanisms
• Rho - the termination factor protein
– rho is an ATP-dependent helicase
– it moves along RNA transcript, finds the
"bubble", unwinds it and releases RNA chain
• Specific sequences - termination sites in DNA
– inverted repeat, rich in G:C, which forms a
stem-loop in RNA transcript
– 6-8 As in DNA coding for Us in transcript
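The two features of a sequence-specific termination site (a G:C-rich inverted repeat that can form a stem-loop, followed by a run of U in the transcript) can be searched for directly in an RNA string. This is only a sketch: the stem length, loop sizes, G+C threshold and the requirement that the U run follow the stem immediately are all assumptions made for illustration, and the test sequence is invented.

```python
RNA_COMPLEMENT = str.maketrans("ACGU", "UGCA")

def reverse_complement(rna):
    """Reverse complement of an RNA string written with A, C, G, U."""
    return rna.translate(RNA_COMPLEMENT)[::-1]

def gc_fraction(seq):
    """Fraction of bases that are G or C."""
    return (seq.count("G") + seq.count("C")) / len(seq)

def find_terminator_like(rna, stem=6, loop_range=(3, 8), min_u=6, min_gc=0.6):
    """Return (start, loop_length) pairs for stem-loop-plus-U-run candidates."""
    rna = rna.upper()
    hits = []
    last_start = len(rna) - 2 * stem - loop_range[0] - min_u
    for i in range(last_start + 1):
        left = rna[i:i + stem]
        if gc_fraction(left) < min_gc:
            continue  # the stem arm must be G:C-rich
        for loop in range(loop_range[0], loop_range[1] + 1):
            j = i + stem + loop
            right = rna[j:j + stem]
            tail = rna[j + stem:j + stem + min_u]
            if right == reverse_complement(left) and tail == "U" * min_u:
                hits.append((i, loop))
    return hits

# Invented transcript: GC-rich arm, 4-nt loop, complementary arm, 8 U's
example = "CC" + "GCCGCC" + "UUUU" + "GGCGGC" + "UUUUUUUU" + "AG"
print(find_terminator_like(example))
```

A realistic search would tolerate mismatches and bulges in the stem; exact complementarity is used here only to keep the example short.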
Transcription in Eukaryotes
• RNA polymerases I, II and III transcribe rRNA,
mRNA and tRNA genes, respectively
• Pol III transcribes a few other RNAs as well
• All 3 are big, multimeric proteins (500-700 kD)
• All have 2 large subunits with sequences similar
to β and β' in E. coli RNA polymerase, so
catalytic site may be conserved
• Pol II is most sensitive to α-amanitin, an
octapeptide from Amanita phalloides
("death cap" mushroom)
Transcription Factors
More on this later, but a short note now
• The three polymerases (I, II and III) interact
with their promoters via so-called
transcription factors
• Transcription factors recognize and initiate
transcription at specific promoter sequences
• Some transcription factors (TFIIIA and TFIIIC
for RNA polymerase III) bind to specific
recognition sequences within the coding
region
RNA Polymerase II
Most interesting because it regulates
synthesis of mRNA
• Yeast Pol II consists of 10 different peptides
(RPB1 - RPB10)
• RPB1 and RPB2 are homologous to E. coli RNA
polymerase β' and β, respectively
• RPB1 has DNA-binding site; RPB2 binds NTP
• RPB1 has a C-terminal domain (CTD) of PTSPSYS repeats
• 5 of these 7 residues have -OH, so this is a hydrophilic and
phosphorylatable site
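The "5 of 7" count can be checked by listing which residues of the heptad carry a side-chain hydroxyl (Ser, Thr and Tyr). A small sketch, using the repeat exactly as written on the slide; the function name is illustrative:

```python
HEPTAD = "PTSPSYS"  # CTD repeat as written on the slide
HYDROXYL_RESIDUES = {"S", "T", "Y"}  # Ser, Thr, Tyr side chains carry -OH

def hydroxyl_positions(peptide):
    """1-based positions of residues with a side-chain hydroxyl group."""
    return [i for i, aa in enumerate(peptide, start=1)
            if aa in HYDROXYL_RESIDUES]

positions = hydroxyl_positions(HEPTAD)
print(positions, len(positions))  # prints [2, 3, 5, 6, 7] 5
```

Only the two prolines lack a hydroxyl, so each repeat offers five potential phosphorylation sites.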
More RNA Polymerase II
• CTD is essential and this domain may
project away from the globular portion of the
enzyme (up to 50 nm!)
• Only RNA Pol II whose CTD is NOT
phosphorylated can initiate transcription
• TATA box (TATAAA) is a consensus promoter element
• 7 general transcription factors are required
• See TFIID bound to TATA (Fig. 31.11)
Transcription Regulation in
Prokaryotes
• Genes for enzymes for pathways are
grouped in clusters on the chromosome
- called operons
• This allows coordinated expression
• A regulatory sequence adjacent to such
a unit determines whether it is
transcribed - this is the ‘operator’
• Regulatory proteins work with operators
to control transcription of the genes
Induction and Repression
• Increased synthesis of enzymes in
response to a metabolite is ‘induction’
• Decreased synthesis in response to a
metabolite is ‘repression’
• Some substrates induce enzyme
synthesis even though the enzymes
can’t metabolize the substrate - these
are ‘gratuitous inducers’ - such as IPTG
The lac Operon
• lacI mutants express the genes needed
for lactose metabolism constitutively
• The structural genes of the lac operon
are controlled by negative regulation
• lacI gene product is the lac repressor
• The lac operator is a palindromic DNA sequence
• lac repressor - N-terminal domain binds DNA;
C-terminal domain binds inducer and forms the tetramer
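"Palindromic" here means that the sequence reads the same as its own reverse complement, so both strands present the same sequence to a symmetric protein. A minimal sketch of that test; the example uses GAATTC, a familiar perfect palindrome, and not the lac operator sequence, which the slide does not give:

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(dna):
    """Sequence of the opposite strand, read 5' to 3'."""
    return dna.upper().translate(COMPLEMENT)[::-1]

def is_palindromic(dna):
    """True if the sequence equals its own reverse complement."""
    return dna.upper() == reverse_complement(dna)

def symmetry_fraction(dna):
    """Fraction of positions matching the reverse complement (1.0 = perfect)."""
    dna = dna.upper()
    rc = reverse_complement(dna)
    return sum(a == b for a, b in zip(dna, rc)) / len(dna)

print(is_palindromic("GAATTC"))     # prints True
print(is_palindromic("GAATTA"))     # prints False
print(symmetry_fraction("GAATTC"))  # prints 1.0
```

`symmetry_fraction` is included because natural binding sites are often only approximately symmetric, so a graded score can be more useful than a yes/no answer.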
Catabolite Activator Protein
Positive Control of the lac Operon
• Some promoters require an accessory
protein to speed transcription
• Catabolite Activator Protein or CAP is
one such protein
• CAP is a dimer of 22.5 kD peptides
• N-term binds cAMP; C-term binds DNA
• Binding of CAP-(cAMP)2 to DNA assists
formation of closed promoter complex
The trp Operon
• Encodes a leader sequence and 5
proteins that synthesize tryptophan
• Trp repressor controls the operon
• Trp repressor binding excludes RNA
polymerase from the promoter
• Trp repressor also regulates trpR and
aroH operons and is itself encoded by
the trpR operon. This is autogenous
regulation (autoregulation).
Transcription Regulation
in Eukaryotes
• More complicated than prokaryotes
• Chromatin limits access of regulatory
proteins to promoters
• Factors must reorganize the chromatin
• In addition to promoters, eukaryotic
genes have ‘enhancers’, also known as
upstream activation sequences
• DNA looping permits multiple proteins to
bind to multiple DNA sequences
Structural Motifs
in DNA-Binding Regulatory Proteins
• Crucial feature must be atomic contacts between
protein residues and bases and sugar-phosphate
backbone of DNA
• Most contacts are in the major groove of DNA
• 80% of regulatory proteins can be assigned to
one of three classes: helix-turn-helix (HTH),
zinc finger (Zn-finger) and leucine zipper
(bZIP)
• In addition to DNA-binding domains, these
proteins usually possess other domains that
interact with other proteins
Alpha Helices and DNA
A perfect fit!
• A recurring feature of DNA-binding proteins
is the presence of α-helical segments that fit
directly into the major groove of B-form DNA
• Diameter of helix is 1.2 nm
• Major groove of DNA is about 1.2 nm wide
and 0.6 to 0.8 nm deep
• Proteins can recognize specific sites in DNA
The Helix-Turn-Helix Motif
First identified in 3 prokaryotic proteins
• Two repressor proteins (Cro and cI) and the E.
coli catabolite activator protein (CAP)
• All these bind as dimers to dyad-symmetric
sites on DNA (see Figure 31.33)
• All contain two alpha helices separated by a
loop with a beta turn
• The C-terminal helix fits in major groove of
DNA; N-terminal helix stabilizes by
hydrophobic interactions with C-terminal helix
Helix-Turn-Helix II
See Figures 31.34 and 31.35
• Residues 1-7 of the motif are the first helix
(but called "helix 2")
• Residue 9 is the turn maker - a Gly, of course
• Residues 12-20 are the second helix (called
"helix 3")
• Recognition of DNA sequence involves the
sides of base pairs that face the major groove
(see discussion on pages 1050-1052)
The Zn-Finger Motif
First discovered in TFIIIA from Xenopus laevis, the
African clawed toad
• Now known to exist in nearly all organisms
• Two main classes: C2H2 and Cx
• C2H2 domains consist of Cys-x2-Cys and His-x3-His domains separated by at least 7-8 amino acids
• Cx domains consist of 4, 5 or 6 Cys residues
separated by various numbers of other residues
• See Figure 31.37 and Table 31.7
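The C2H2 description above maps naturally onto a regular expression over one-letter amino acid codes. The sketch below encodes only what the slide states (Cys-x2-Cys, a gap of at least 7 residues, then His-x3-His); the upper limit on the gap is an assumption added so that the pattern cannot stretch across an entire protein, and the test peptides are invented:

```python
import re

def c2h2_like_pattern(min_gap=7, max_gap=20):
    """Regex for Cys-x2-Cys ... His-x3-His; max_gap is an assumed bound."""
    return re.compile(rf"C.{{2}}C.{{{min_gap},{max_gap}}}H.{{3}}H")

def find_c2h2_like(protein, min_gap=7, max_gap=20):
    """Start indices (0-based) of C2H2-like matches in a protein string."""
    pattern = c2h2_like_pattern(min_gap, max_gap)
    return [m.start() for m in pattern.finditer(protein.upper())]

# Invented peptides: a 12-residue gap matches, a 3-residue gap does not
with_finger = "MK" + "CPEC" + "G" * 12 + "HQRTH" + "KL"
too_short = "MK" + "CPEC" + "G" * 3 + "HQRTH" + "KL"
print(find_c2h2_like(with_finger))  # prints [2]
print(find_c2h2_like(too_short))    # prints []
```

A sequence match like this only flags candidates; whether a region actually folds around Zn depends on structure that the pattern cannot see.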
More Zn-Fingers
Their secondary and tertiary structures
• C2H2 -type Zn fingers form a folded beta
strand and an alpha helix that fits into the
DNA major groove
• Cx-type Zn fingers consist of two mini-domains of four Cys ligands to Zn followed
by an alpha helix: the first helix is the DNA
recognition helix, second helix packs
against the first
The Leucine Zipper Motif
• First found in C/EBP, a DNA-binding protein in
rat liver nuclei
• Now found in nearly all organisms
• Characteristic features: a 28-residue sequence
with Leu every 7th position and a "basic
region"
(What do you know by now about 7-residue
repeats?)
• This suggests amphipathic alpha helix and a
coiled-coil dimer
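The defining sequence feature, Leu at every 7th position across a 28-residue stretch, amounts to four leucines spaced exactly seven residues apart. A minimal sketch that scans a protein for that spacing; the peptide is invented and the function name is illustrative:

```python
def leucine_heptad_starts(protein, repeats=4, spacing=7):
    """0-based indices where Leu recurs every `spacing` residues `repeats` times."""
    protein = protein.upper()
    span = (repeats - 1) * spacing + 1  # residues from first to last Leu
    return [
        i for i in range(len(protein) - span + 1)
        if all(protein[i + k * spacing] == "L" for k in range(repeats))
    ]

# Invented 28-residue zipper: Leu followed by six other residues, four times
zipper = ("L" + "AEKRQS") * 4
print(leucine_heptad_starts("GG" + zipper))  # prints [2]
print(leucine_heptad_starts("A" * 30))       # prints []
```

The spacing matters because an alpha helix has about 3.6 residues per turn, so residues seven apart fall on nearly the same face of the helix.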
The Structure of the Zipper
and its DNA complex
• Leucine zipper proteins (aka bZIP proteins)
dimerize, either as homo- or hetero-dimers
• The basic region is the DNA-recognition site
• Basic region is often modelled as a pair of
helices that can wrap around the major groove
• Homodimers recognize dyad-symmetric DNA
• Heterodimers recognize non-symmetric DNA
• Fos and Jun are classic bZIPs
Post-transcriptional Processing
of mRNA in Eukaryotes
• Translation closely follows transcription
in prokaryotes
• In eukaryotes, these processes are
separated - transcription in nucleus,
translation in cytoplasm
• On the way from nucleus to cytoplasm,
the mRNA is converted from "primary
transcript" to "mature mRNA"
Eukaryotic Genes are Split
• Introns intervene between exons
• Examples: the actin gene has a 309-bp intron that
separates the first three amino acids from the other
350 or so
• But the chicken pro-alpha-2 collagen gene is 40 kbp long, with 51 exons of only 5 kbp total.
• The exons range in size from 45 to 249 bases
• Mechanism by which introns are excised and
exons are spliced together is complex and
must be precise
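The collagen figures above imply how little of the gene is actually exon. A short worked calculation using only the numbers on the slide (the variable and function names are illustrative):

```python
GENE_BP = 40_000       # chicken pro-alpha-2 collagen gene length
EXON_TOTAL_BP = 5_000  # combined length of all exons
EXON_COUNT = 51

def exon_fraction(exon_bp, gene_bp):
    """Fraction of the gene that ends up in mature mRNA."""
    return exon_bp / gene_bp

def mean_exon_bp(exon_bp, count):
    """Average exon length in base pairs."""
    return exon_bp / count

print(f"exon fraction: {exon_fraction(EXON_TOTAL_BP, GENE_BP):.1%}")
print(f"intron total: {GENE_BP - EXON_TOTAL_BP} bp")
print(f"mean exon: {mean_exon_bp(EXON_TOTAL_BP, EXON_COUNT):.0f} bp")
```

The mean exon length of roughly 98 bp sits inside the stated 45 to 249 base range, and seven-eighths of the gene is removed as intron.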
Capping and Methylation
• Primary transcripts (aka pre-mRNAs or
heterogeneous nuclear RNA) are usually first
"capped" by a guanylyl group
• The reaction is catalyzed by guanylyl
transferase
• Capping G residue is methylated at 7-position
• Additional methylations occur at 2'-O positions
of next two residues and at 6-amino of the first
adenine
3'-Polyadenylylation
• Termination of transcription occurs only after
RNA polymerase has transcribed past a
consensus AAUAAA sequence - the poly(A)+
addition site
• 10-30 nucleotides past this site, a string of
100 to 200 adenine residues is added to
the mRNA transcript - the poly(A)+ tail
• poly(A) polymerase adds these A residues
• Function not known for sure, but poly(A) tail
may govern stability of the mRNA
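The positional rule above (tail added 10-30 nucleotides past AAUAAA) can be expressed as a search for the signal followed by a window calculation. In this sketch the offsets are counted from the last base of the signal and the window is reported as a 0-based, end-exclusive index range; both conventions are assumptions, and the transcript is invented:

```python
import re

SIGNAL = "AAUAAA"  # consensus from the slide

def poly_a_windows(rna, min_offset=10, max_offset=30):
    """Return (signal_start, window_start, window_end) for each AAUAAA."""
    rna = rna.upper()
    windows = []
    # The lookahead lets overlapping copies of the signal be found too
    for m in re.finditer(f"(?={SIGNAL})", rna):
        signal_end = m.start() + len(SIGNAL)
        window_start = signal_end + min_offset
        window_end = min(signal_end + max_offset, len(rna))
        if window_start < len(rna):
            windows.append((m.start(), window_start, window_end))
    return windows

# Invented transcript: 5 G's, the signal, then 40 C's
example = "G" * 5 + SIGNAL + "C" * 40
print(poly_a_windows(example))  # prints [(5, 21, 41)]
```

Signals too close to the 3' end to leave room for the minimum offset are skipped.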
Splicing of Pre-mRNA
Capped, polyadenylated RNA, in the form of an RNP
complex, is the substrate for splicing
• In "splicing", the introns are excised and the
exons are sewn together to form mature mRNA
• Splicing occurs only in the nucleus
• The 5'-end of an intron in higher eukaryotes is
always GU and the 3'-end is always AG
• All introns have a "branch site" 18 to 40
nucleotides upstream from 3'-splice site
• Branch site is essential to splicing
The Branch site and Lariat
• Branch site is usually YNYRAY, where Y =
pyrimidine, R = purine and N is anything
• The "lariat", a covalently closed loop of RNA, is
formed by attachment of the 5'-P of the intron's
invariant 5'-G to the 2'-OH at the branch A site
• The exons then join, excising the lariat.
• The lariat is unstable; the 2'-5' phosphodiester is
quickly cleaved and intron is degraded in the
nucleus.
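The sequence rules for an intron (GU at the 5'-end, AG at the 3'-end, and a YNYRAY branch site 18 to 40 nucleotides upstream of the 3'-splice site) can be checked together. In this sketch the distance is measured from the branch A to the last nucleotide of the intron, which is an assumed convention, and the intron is invented:

```python
import re

# Y = pyrimidine (C or U), R = purine (A or G), N = any base.
# The lookahead allows overlapping candidate sites to be reported.
BRANCH_SITE = re.compile(r"(?=([CU][ACGU][CU][AG]A[CU]))")

def analyze_intron(intron, min_dist=18, max_dist=40):
    """Check the terminal dinucleotides and locate candidate branch A's."""
    intron = intron.upper()
    last = len(intron) - 1
    branch_a_indices = []
    for m in BRANCH_SITE.finditer(intron):
        a_index = m.start() + 4  # the A is the 5th base of YNYRAY
        distance = last - a_index  # assumed convention, see lead-in
        if min_dist <= distance <= max_dist:
            branch_a_indices.append(a_index)
    return {
        "starts_with_GU": intron.startswith("GU"),
        "ends_with_AG": intron.endswith("AG"),
        "branch_A_indices": branch_a_indices,
    }

# Invented 50-nt intron with one branch site (CUCAAC)
example = "GU" + "G" * 20 + "CUCAAC" + "U" * 20 + "AG"
print(analyze_intron(example))
```

For the invented intron the branch A sits at index 26, which is 23 nucleotides from the 3'-end and therefore inside the 18 to 40 window.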
The Importance of snRNP
• Small nuclear ribonucleoprotein particles - snRNPs, pronounced "snurps" - are involved in
splicing
• A snRNP consists of a small RNA (100-200
bases long) and about 10 different proteins
• Some of the 10 proteins are general, some are
specific. Properties described on page 1063
• snRNPs and pre-mRNA form the spliceosome
• Spliceosome is the size of ribosomes, and its
assembly requires ATP
Assembly of the Spliceosome
See Figure 31.53
• snRNPs U1 and U5 bind at the 5'- and 3'-splice
sites, and U2 snRNP binds at the
branch site
• Interaction between the snRNPs brings 5'- and
3'-splice sites together so lariat can
form and exon ligation can occur
• The transesterification reactions that join the
exons may in fact be catalyzed by
"ribozymes"