Chen-7-Gene Expression

Biochemistry
Chen Yonggang
Zhejiang University
Schools of Medicine
Regulation of Gene Expression:
Putting information to work
Information encoded in DNA is
of no use unless it is expressed
• Each cell has many genes but few are active at
any time
• Although all cells have the total complement of
genes only a portion are expressed
• Cells must respond to the situation at hand and
do what is needed for survival
• Cells only have so much energy and must use it
efficiently
Bacteria are simpler than eucaryotic
cells but have similar processes
• In trying to understand the design of living
things we study bacteria as models for life
• Once processes are understood in bacteria
we find that the same processes function in
eucaryotes, but with much more complexity
• Bacteria don’t complain, need no sleep and
have no protectors. In fact no one worries
whether they live or die
Gene expression systems can be understood
by studying two models
• Biochemical processes can be subdivided into
two categories
– Anabolism
• The process of building complexity
– Catabolism
• The process of breaking down complexity
• Living things oxidize fuels and use the energy to
synthesize molecules and do work
Life processes are controlled by
enzymes: gene products
• For each process activated in the cell, genes
must be expressed
• The gene is transcribed to form mRNAs and
proteins(enzymes) are made
• The enzymes carry out catabolic
(degradative) steps or in other cases,
anabolic (synthetic) steps
The easiest bacterium to study is
E. coli
• Since it lives in the human colon, there is an easy
source
• It is easy to isolate, it lives on almost any fuel
source
• It grows at a comfortable temperature (37 °C)
• A lot is known about it including its entire
chromosomal DNA sequence
The energy for growth and
function comes from food
• E. coli prefers glucose as a fuel. Glucose, a
sugar, can be oxidized to provide the energy
for ATP synthesis and to generate the Proton
Motive Force, the energy sources used by the
bacterium to drive its life processes
• However, in the colon it gets a variety of
foods depending on the whims of its host
• When we drink milk, E. coli needs to use
lactose (milk sugar) in place of glucose
E. coli does not normally express the
enzymes needed to use lactose
• The genes required to utilize lactose are
located together in a segment of DNA and
are transcribed together to make a
polycistronic mRNA, containing
information for making all the enzymes
needed
• Such a coordinately expressed set of genes
is called an Operon
The lac operon, an inducible
operon
McKee 18.28
The lac operon contains
regulatory and structural genes
The Operon is organized in the chromosome
LacI - CAP - LacP - LacO - LacZ - LacY - LacA
Lac P is the Promoter for the operon
Lac I is a regulatory gene; the CAP site and LacO are regulatory sequences
LacZ, LacY and Lac A are structural genes for the
proteins needed to metabolize lactose
Lactose is a disaccharide
To use lactose, 3 structural genes
are required at the same time
• Lac Z encodes β-galactosidase which
cleaves the β-galactosidic bond joining the
glucose and the galactose
• Lac Y encodes a lactose permease, a
membrane protein which transports lactose
into the bacterium
• Lac A encodes a transacetylase, an enzyme
whose role is not well understood
Structure of Lac operon
1 Regulatory genes and a set of structural genes
2 Promoter: LacP
3 Regulatory genes: LacI encodes a repressor;
LacO can be bound by the repressor; CAP site
4 Structural genes: LacZ encodes β-galactosidase,
LacY encodes a lactose permease, LacA encodes a transacetylase
The LacI gene encodes a control
protein, a repressor
• Lac I is located upstream of the lac
promoter Lac P and thus is not under its
regulation; it encodes a constitutively
expressed protein, the repressor, which has high
affinity for the Lac O site, called the
operator
• Normally the repressor binds to the operator
forming a repressor/operator complex
The repressor/operator interferes
with RNA polymerase binding
• The operator overlaps the promoter, Lac P, so when
the repressor binds it blocks access of
RNA polymerase to the promoter
• The repressor blocks the transcription of the
Lac operon
• Under normal circumstances the lac operon is
not expressed, no proteins are synthesized, it
is a silent region of the chromosome
When lactose is available, the
Lac operon is activated
• An enzyme (β-galactosidase) converts a
small portion of the available lactose to a
modified molecule, allolactose (a β-1,6 isomer of lactose)
• Allolactose is an inducer of the Lac operon
• As an inducer, it turns on the operon by
virtue of its affinity for the repressor
• When allolactose binds, the repressor
undergoes a conformational change
Induction of the lac operon
McKee 18.29
The repressor/inducer complex
does not bind to the operator
• Removal of the repressor from the operator
opens the promoter for RNA polymerase
binding allowing transcription of the lac
operon and the utilization of lactose
• In the laboratory,
IPTG (isopropylthiogalactoside), a
gratuitous inducer (it is not cleaved by β-galactosidase), is used to study the process
Control of a catabolic operon
Devlin 8-3
Mutations in genes affect operon
expression
• Some mutations in the Lac I gene yield a
repressor which always binds the operator;
such super-repressor (non-inducible) mutations
never allow Lac activation
• Mutations in the Lac O gene yield an
operator which cannot bind repressor, such
Operator Constitutive Mutations never
allow the Lac operon to be repressed
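The effect of the two mutant classes above can be sketched as simple on/off logic. This is only an illustrative model of negative control: the class name, field names and function names are hypothetical, and real expression is graded rather than strictly on or off.

```python
from dataclasses import dataclass


@dataclass
class LacGenotype:
    """Hypothetical flags for the two mutant classes described above."""
    repressor_always_binds: bool = False    # Lac I mutant: inducer cannot release it
    operator_binds_repressor: bool = True   # False models the Lac O mutant


def repressor_on_operator(genotype: LacGenotype, inducer_present: bool) -> bool:
    """True when the repressor/operator complex forms."""
    if not genotype.operator_binds_repressor:
        return False   # operator mutant: the repressor has nothing to hold
    if genotype.repressor_always_binds:
        return True    # repressor mutant: stays bound even with inducer
    return not inducer_present  # wild type: inducer releases the repressor


def operon_transcribed(genotype: LacGenotype, inducer_present: bool) -> bool:
    """Negative control only: transcription needs a free operator."""
    return not repressor_on_operator(genotype, inducer_present)


wild_type = LacGenotype()
laci_mutant = LacGenotype(repressor_always_binds=True)
laco_mutant = LacGenotype(operator_binds_repressor=False)

for name, genotype in [("wild type", wild_type),
                       ("Lac I mutant", laci_mutant),
                       ("Lac O mutant", laco_mutant)]:
    states = [operon_transcribed(genotype, inducer) for inducer in (False, True)]
    print(f"{name}: without inducer={states[0]}, with inducer={states[1]}")
```

In this sketch only the wild type responds to the inducer; the Lac I mutant is off in both conditions and the Lac O mutant is on in both.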
Glucose is a breakdown product
of lactose and the preferred fuel
• E. coli prefers to use glucose
• The presence of glucose inactivates another protein
Adenyl Cyclase
• Normally Adenyl Cyclase mediates the reaction
ATP → 3’,5’-cAMP + PPi
• The availability of glucose reduces cAMP
• When glucose is gone, Adenyl Cyclase is activated and
cAMP is formed
CAP encodes a protein,
Catabolite gene Activator Protein
• When lactose is the only fuel, cAMP is
synthesized
• cAMP binds Catabolite gene Activator Protein
• The cAMP-Catabolite Activator Protein Complex
is a DNA binding protein which creates an abrupt
kink in the DNA in the Lac P or promoter
• The kink increases RNA polymerase binding to
the promoter, turning on the operon
cAMP-CAP exerts positive control,
enhancing transcription of the operon
Devlin 8-6
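Negative control by the repressor and positive control by cAMP-CAP can be combined into one small truth table. The sketch below is a qualitative model only: the function names and the three labels "off", "low" and "high" are assumptions chosen for illustration, not measured expression levels.

```python
def camp_is_high(glucose_present: bool) -> bool:
    """Glucose keeps Adenyl Cyclase inactive, so cAMP is high only without glucose."""
    return not glucose_present


def lac_operon_state(lactose_present: bool, glucose_present: bool) -> str:
    """Return a qualitative expression label: 'off', 'low' or 'high'."""
    # Without lactose there is no allolactose, so the repressor stays on Lac O.
    repressor_bound = not lactose_present
    # The cAMP-CAP complex forms only when cAMP is available.
    cap_bound = camp_is_high(glucose_present)
    if repressor_bound:
        return "off"
    return "high" if cap_bound else "low"


for lactose in (False, True):
    for glucose in (False, True):
        print(f"lactose={lactose!s:5} glucose={glucose!s:5} -> "
              f"{lac_operon_state(lactose, glucose)}")
```

The model gives strong expression only when lactose is present and glucose is absent, which matches the idea that E. coli uses glucose first.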
The Lac operon is an inducible
system
• Many catabolic pathways employ induction
as a regulatory motif
• An inducer, either the molecule to be
catabolized or a related molecule induces
the transcription of the genes needed for its
catabolism
The lac operon, an inducible
operon
McKee 18.28
Anabolic, biosynthetic operons
are regulated differently
• Tryptophan is an important amino acid; furthermore,
the synthesis of tryptophan uses energy and
critical starting materials
• When tryptophan is available, E. coli does not
need to engage in its synthesis
• Under these conditions the Trp operon is
repressed
• Anabolic processes are repressed when not
needed
The Trp operon is similar to the
Lac operon in structure
• It has regulatory genes and structural genes
• There are 5 structural genes encoding 3 enzymes
needed for Tryptophan synthesis (multiple
subunits for 2 enzymes)
• The genes are arranged as seen before
TrpP TrpO Trpa Trp E TrpD TrpC TrpB TrpA Trpf
• The promoter Trp P and operator Trp O are
upstream from the structural genes
Trp R, not associated with the operon,
constitutively encodes a repressor
• The trp repressor binds to the operator only
when complexed with a corepressor, a small
molecule(in this case, tryptophan)
• Thus, normally the operon is turned on;
however, when the corepressor is available, the
operon will be repressed, turned off
• The Trp operon is a repressible operon as are
many anabolic, biosynthetic operons
• In the absence of corepressor it is derepressed
The trp operon is repressible
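The difference between an inducible and a repressible operon comes down to whether the repressor binds the operator with or without its small-molecule ligand. A minimal sketch of that contrast, with hypothetical function and parameter names and strictly on/off behavior assumed:

```python
def operon_transcribed(ligand_present: bool, repressor_binds_with_ligand: bool) -> bool:
    """Generic negative control.

    repressor_binds_with_ligand=False models the Lac repressor
    (the inducer releases it from the operator).
    repressor_binds_with_ligand=True models the Trp repressor
    (it binds the operator only with its corepressor).
    """
    repressor_on_operator = (ligand_present == repressor_binds_with_ligand)
    return not repressor_on_operator


def lac_on(allolactose_present: bool) -> bool:
    return operon_transcribed(allolactose_present, repressor_binds_with_ligand=False)


def trp_on(tryptophan_present: bool) -> bool:
    return operon_transcribed(tryptophan_present, repressor_binds_with_ligand=True)


for present in (False, True):
    print(f"ligand present={present!s:5}  lac on={lac_on(present)!s:5}  trp on={trp_on(present)}")
```

The same repressor/operator arrangement yields opposite responses: the Lac operon turns on when its ligand appears, the Trp operon turns off.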
In addition to gene regulation, the
pathway is regulated by enzymes
• The first enzyme in the pathway is Anthranilate
Synthetase, encoded by genes TrpD and TrpE
• As is common in anabolic pathways, this first
committed enzyme in the pathway is regulated
by Feedback Inhibition
• Thus, if Tryptophan is present in the medium the
enzyme is inhibited and this stops the synthesis
of new tryptophan
Such enzymes, called Flux-Determining Enzymes, are regulated
• Flux-determining enzymes have an active
site which carries out the enzymatic
function
• They have another (allo) site (steric), the allosteric site, which
can bind the end product of the pathway
• Binding the feedback inhibitor to the
allosteric site alters the conformation of the
enzyme and diminishes its activity
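Feedback inhibition can be illustrated with a simple dose-response sketch. The hyperbolic form used here, activity = 1 / (1 + [I]/Ki), is an assumed textbook-style simplification chosen for illustration; it is not taken from the slides, and the Ki value is hypothetical. Real allosteric enzymes often show steeper, cooperative responses.

```python
def relative_activity(inhibitor_conc: float, k_i: float) -> float:
    """Fraction of maximal enzyme activity remaining at a given inhibitor level.

    Assumed model: activity = 1 / (1 + [I] / Ki).
    inhibitor_conc and k_i must use the same concentration units.
    """
    if inhibitor_conc < 0:
        raise ValueError("inhibitor concentration cannot be negative")
    if k_i <= 0:
        raise ValueError("Ki must be positive")
    return 1.0 / (1.0 + inhibitor_conc / k_i)


HYPOTHETICAL_KI = 10.0  # arbitrary units, for illustration only

for tryptophan in (0.0, 5.0, 10.0, 50.0):
    activity = relative_activity(tryptophan, HYPOTHETICAL_KI)
    print(f"[Trp]={tryptophan:5.1f}  relative activity={activity:.2f}")
```

As the end product accumulates, the activity of the first committed enzyme falls, so flux through the pathway drops without any change in gene expression.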
Tryptophan synthesis is controlled by
gene expression and enzyme activity
• Ability to synthesize tryptophan is
controlled by the ability to transcribe an
mRNA: Repressible
• The flux through the pathway is controlled
by allosteric feedback inhibition: Inhibitable
• The amount of tryptophan synthesized
meets the needs of the bacterium
Eucaryotic Regulation
• Most eucaryotic gene expression is
regulated by the same kinds of processes
seen in simpler organisms
• Induction and Repression are common
models in eucaryotes
• Because of the differences in gene
organization expression does not involve
operons in eucaryotes
Scientists used to believe that based
on evolution, simpler is better
• As we study gene expression we find more
and more complexity
• The complexity does not appear to be
careless or repetitive
• Novel designs for precise control seem to be
necessary and are almost unimaginable until
discovered
• Makes one wonder?
Eucaryotic genes are organized in
logical arrays
• As in procaryotes, many eucaryotic genes
are clustered by function and need
• Ribosomal RNA genes are clustered into
multiple tandemly repeated arrays
• Histone genes are also clustered and
tandemly repeated in some organisms
• Most genes are present in single copies
Gene clustering may be related to
expression
• The different globin genes, expressing the
various polypeptides of hemoglobin are
clustered
• Thus during development moving from
early to fetal to adult hemoglobin synthesis
utilizes co-located genes
Gene expression in eucaryotes
occurs from chromatin, not DNA
• During expression of genes in Drosophila
segments of DNA in the polytene chromosomes
become puffed during development
• Puffing is related to the transition from
condensed (heterochromatin) to dispersed
chromatin(euchromatin)
• Euchromatin is transcriptionally active
• Transcription is hormonally induced
The structure of transcriptionally
active chromatin is unique
• DNAse I digestion patterns of euchromatin and
heterochromatin are different
• When condensed, the fragments produced are larger
than when dispersed
• Covalent modification of histones, such as
phosphorylation and acetylation alters digestion
patterns
• Digestion hypersensitivity is related to bound
regulatory proteins
The DNA of active genes may be
altered either in structure or access
• Transcriptionally active globin genes are more
sensitive than quiescent genes to DNAse I
digestion
• Methylation of Cytosines is lower in actively
transcribing genes. Since methylation blocks
access to the major groove in DNA,
demethylation could open the DNA to the
binding of regulatory proteins
Gene expression is primarily
controlled by transcription
• As in procaryotes, the expression of genes
depends upon the transcription of information
• Regulation of eucaryotic transcription is more
complex than that for procaryotes: it has more
components
• Transcription requires access to DNA
• Nucleosomes must be disrupted prior to
transcription
Eucaryotic RNA polymerases
cannot act alone
• Eucaryotic RNA polymerase II transcribes DNA to
form hnRNA, which is processed to mRNA and
translated to form active proteins
• For RNA polymerase II to function, a preinitiation
complex must be formed at the TATA box
immediately upstream of the RNA Pol binding site
• Formation of the preinitiation complex is
dependent upon binding domains for general
transcription factors (TFs) which require access to
the major groove of the DNA
DNA access can be modified
• Acetylation of histones reduces the
lysine-derived positive charge on histones. Since
the negative charges on the DNA backbone
bind to positively charged histones, DNA binding is
reduced
• Protein (SWI/SNF) complexes interact with
RNA Polymerase II and ATP to open access
to DNA sequences for TF binding
Methylation of cytosines blocks
access to TFs
Activation of transcription depends
on forming an initiation complex
• Activators and inhibitors of transcription
alter the probability of forming the complex
• In contrast to procaryotes where one or two
proteins promote transcription at a
promoter, eucaryotic regulation depends on
many TFs, both general and specific factors
acting at many different sites in the DNA
• Each transcription factor has a specific role
In general, all transcription
factors have two domains
• Transcription factors have a protein binding
domain and a DNA binding domain and may also
bind co-activators
• DNA binding domains are sequence specific
• DNA binding domains are conserved across
species
• Common motifs are seen throughout all eucaryotes
There are many general TFs
required for transcription
• One of these factors (TFIID) is a large
complex which contains, among other proteins,
a TATA binding protein (TBP)
• This complex serves as the foundation for the
assembly of the initiation complex at the TATA
box (-27 bp)
• Binding of this complex causes a large
distortion in the DNA double helix
The TATA box anchors the initiation complex
Devlin 8-22
While binding to the TATA box is
essential, it is not sufficient
• Eucaryotic promoters are defined as all
sequences which affect gene transcription
• Thus eucaryotic promoters require multiple
transcription factor binding sites
• CAAT and GC boxes are often components
• Other sequences which bind such effectors
as hormone receptor binding sites are called
response elements
• Enhancers are distant factor binding sites
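Locating promoter elements such as these in a sequence can be sketched as a plain motif scan. The consensus strings below are commonly quoted core sequences (real elements vary around them), the example sequence is made up for illustration, and the function name is hypothetical. Only one strand is scanned.

```python
import re

# Commonly quoted core consensus sequences; real elements vary around these.
PROMOTER_MOTIFS = {
    "TATA box": "TATAAA",
    "CAAT box": "CCAAT",
    "GC box": "GGGCGG",
}


def find_motifs(sequence: str) -> dict[str, list[int]]:
    """Return the 0-based start positions of each motif on the given strand."""
    seq = sequence.upper()
    hits = {}
    for name, motif in PROMOTER_MOTIFS.items():
        # A lookahead pattern also reports overlapping occurrences.
        hits[name] = [m.start() for m in re.finditer(f"(?={motif})", seq)]
    return hits


# Made-up sequence for illustration only, not a real promoter.
example = "ttGGGCGGacgtCCAATcgatcgatcgTATAAAagctagctagctagctagcATG"

for name, positions in find_motifs(example).items():
    print(f"{name}: {positions}")
```

A practical search would allow mismatches, scan both strands and weight each position, but exact matching is enough to show how a promoter can be read as a collection of short binding sites.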
Eucaryotic genes can have
multiple promoters
• Since “promoter” denotes all sequences involved
in transcription, one gene, activated by multiple
events will have multiple response elements and
even enhancers
• Thus a gene may be induced or repressed in
concert with other genes in response to varying
stimuli
• Stimuli are transduced by specific transcription
factors which bind to response elements and
generate the induction or repression
Four common DNA binding
motifs are used in TFs
• Helix-turn-helix (H-T-H) motif proteins bind in
the grooves of DNA straddling the strand
• Zinc finger motif proteins bind in the major
groove of DNA and recognize specific sequences
• Leucine zipper motif proteins form a dimer with
another protein, which can scissor across and
recognize DNA sequences
• Helix-Loop-Helix (H-L-H) motif proteins are
similar to HTH
Transcription factor DNA binding
domains are sequence specific
• All appear to bind in the major groove
• Most have dyad symmetry
• All have strong secondary structure
character
• Each has variation in its expression
• Most are very small in comparison with the
transcription factor as a whole
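A binding domain with dyad symmetry typically recognizes a DNA site that reads the same 5' to 3' on both strands. The sketch below checks a site for that property; the function names are hypothetical and the test sequences are generic examples, not sites taken from the slides.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the opposite strand, read 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def has_dyad_symmetry(site: str) -> bool:
    """True if the site is identical to its own reverse complement."""
    return site.upper() == reverse_complement(site)


for site in ("GAATTC", "TGTGACGTCACA", "GATTACA"):
    print(f"{site}: dyad symmetric = {has_dyad_symmetry(site)}")
```

Natural binding sites are often only approximately symmetric, so a real analysis would tolerate a few mismatches or a short spacer between the two half-sites.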
Helix-turn-helix proteins are found in
both procaryotes and eucaryotes
• The cro protein, which regulates viral
lysogenic/lytic phases, employs an H-T-H motif
• All H-T-H proteins employ a 20 amino acid
sequence organized into a 7 aa α helix, a 4 aa
non-helical turn, and a 9 aa α helix
• Generally, these are repeated across a dyad axis
to form a symmetrical DNA binding domain
HTH domains bind in the major
groove of DNA
Devlin 8-24
Each domain has a specific role
• The 9 amino acid helix is the DNA binding
domain which recognizes the DNA sequence
through interaction with hydrophobic amino
acids such as valine or leucine
• The 4 amino acid turn drapes over the
polynucleotide strand
• The 7 amino acid helix stabilizes the binding to
the DNA
Specific binding occurs in the major groove
Devlin 9-19,20
The Zinc finger motif binds
specific sequences in the DNA
• Zinc fingers are so named because of the
looping nature of the structure which can
wrap in the groove of the DNA
• Zinc is coordinated between 4 cysteines or
2 histidines and 2 cysteines
• Many zinc fingers can exist within a single
transcription factor, increasing the
specificity of binding
Zinc fingers can wrap in the
major groove of DNA
Devlin 9-22
Zinc fingers mediate hydrogen
bonding
• Sequential multiple zinc fingers can probe
multiple turns of the DNA double helix,
binding specific patterns of H bonds
• Some zinc finger motifs use one finger to
recognize and an adjacent finger to stabilize
binding
Leucine zippers use dimeric
helical molecules to bind DNA
• Leucine zippers use antiparallel proteins to
scissor across the DNA double helix
• They recognize DNA sequences and allow
specific binding in the major groove
• The leucines give rise to strong α-helical
structures
• They form hydrophobic heterodimeric or
homodimeric binding structures
Leucine zippers are common motifs
Devlin 8-26, 9-25
The Helix-Loop-Helix is a
variation on the HTH motif
• HLH proteins such as TFIID (TATA
binding) use a dyad symmetric HLH to
bind specific sequences
• Conformational changes induced by
interaction with other transcription factors
shift the attachment to the specific DNA
sequences
HLH proteins bind to DNA
sequences
Devlin 8-27
Transcription is controlled
through common mechanisms
• Primary binding of TFIID to the TATA box
provides the foundation of the preinitiation
complex
• Binding of specific initiation factors to
recognition sequences more remote from the
gene interact through DNA folding to form the
initiation complex
• Binding of cofactors activates the transcription
factors
Initiation of transcription
• Each transcription factor has one or multiple specific
DNA binding domains
• Each specific transcription factor is activated in
response to a cellular event
• The initiation complex recruits chromatin modifying
enzymes such as acetylases to loosen the DNA
structure
• The complex recruits RNA polymerase II to initiate
and elongate the mRNA
Transcription is precisely regulated
to meet the needs of the cell
Devlin 8-30
Questions for you to think about
• How could proto-oncogenes such as ras, fos and myc
modify cell function to give rise to cancer?
• Why are we concerned about the effects of carcinogens,
UV light and retroviral infection with regard to tumor
suppression?
• What are heat shock effector elements?
• What do Immunoglobulin heavy chain M and G genes
and hemoglobin ε, γ, and β chain genes have in
common? (hint: different genes are expressed under
different developmental and functional conditions)