U4Word

Chapter 5 Recombinant DNA Technology
I. Restriction Endonucleases (REs): catalyse a ds “cut” in dsDNA at a specific palindromic
sequence Table 5-4
A. Biological Function: degrade foreign DNA, protect bacterium from phage infection
1. Discovered after the observation that phage that grow in one strain of E coli cannot grow in
others (restricted growth). The cause of the restriction was identified: REs that cut up phage DNA.
2. Recognition of phage DNA vs own DNA: methylation pattern; a RE will not cut at its
recognition sequence if that sequence is methylated at specific locations. Each strain has a methylase
that methylates its DNA so that it will not be cut by its own RE (a given strain’s RE and methylase both
recognize the same sequence).
3. The phage that can grow in a given strain of E coli have their RE sites methylated.
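A recognition sequence is “palindromic” in the sense that it reads the same 5'->3' on both strands, i.e. it equals its own reverse complement. A minimal Python sketch of that check, assuming the Eco R1 site GAATTC as the example (the helper names are illustrative, not from these notes):

```python
# Illustrative helpers; GAATTC (the Eco R1 site) is used only as an example.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def reverse_complement(seq):
    """Return the opposite strand, written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))

def is_palindromic_site(seq):
    """True if the site reads the same 5'->3' on both strands."""
    return seq == reverse_complement(seq)

print(is_palindromic_site("GAATTC"))  # True
print(is_palindromic_site("GAATTT"))  # False
```

Because the site is identical on both strands, the RE and the matching methylase can each act on either strand in the same way.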
B. Applications for REs
1. Production of Recombinant DNA: many of the REs make staggered cuts, producing “sticky
ends”. These ends on one molecule base pair with those on another molecule. Fig 5-46, 48
2. Restriction Fragment Length Polymorphism (RFLP): to detect genetic diseases, identify
criminals (DNA fingerprinting) Fig 5-41, 42
a. DNA of chromosomes is cleaved with an RE.
b. resulting fragments are “separated” by electrophoresis: smaller fragments move faster
c. DNA fragments are transferred to nitrocellulose paper (Southern blotting) where they
are held in place (same pattern as electrophoresis gel) Fig 5-50
d. fragments are melted (NaOH) and then a probe (a ssDNA complementary to the
sequence of interest) is added. The probe only binds at the location of the one (or two, etc)
band(s) of interest. The probe must have a detectable “tag”, such as radioactivity (or a
chromophore or colored group)
e. autoradiograph: lay a piece of x-ray film over the paper, radioactivity exposes the film.
f. find location of band: is it in the usual place?
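The idea behind RFLP (a sequence change that creates or destroys an RE site changes the fragment lengths) can be sketched in Python. This is a toy model under stated assumptions: the two "alleles" are made-up sequences, only the top strand is tracked, and the enzyme is taken to be Eco R1 (GAATTC, cut after the G).

```python
SITE = "GAATTC"   # Eco R1 recognition sequence
CUT_OFFSET = 1    # Eco R1 cuts after the first G

def digest(seq, site=SITE, offset=CUT_OFFSET):
    """Return fragment lengths after cutting the top strand at every site."""
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + offset)
        start = seq.find(site, start + 1)
    bounds = [0] + cuts + [len(seq)]
    return [b - a for a, b in zip(bounds, bounds[1:])]

# Two hypothetical alleles: one base change destroys the first site in allele_2.
allele_1 = "ACGT" * 3 + "GAATTC" + "ACGT" * 5 + "GAATTC" + "ACGT" * 2
allele_2 = allele_1.replace("GAATTC", "GAGTTC", 1)

print(digest(allele_1))  # [13, 26, 13]
print(digest(allele_2))  # [39, 13]
```

A probe that hybridizes within the first 39 bases would light up a different band in the two alleles, which is the "is it in the usual place?" question of step f.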
C. Constructing a restriction map of a DNA molecule, showing all sites that will be cut by REs.
1. Treat different samples of the DNA with one RE, another RE, and a mixture of the 2 REs.
Electrophorese each sample in a different lane and estimate fragment sizes in each lane (say we got
the following fragments, for RE1: 3.1kb, 3.8kb, and 4.6kb; for RE2: 1.8kb, 4.4kb, and, 5.3kb; for RE1 +
RE2: 1.6kb, 1.8kb, 2.2kb, 2.8kb, 3.1kb) Fig 5-39, 40
2. Identify “end” fragments: these will be fragments that are the same size in the double digest
(RE1+ RE2) as they are in a single digest (RE1 or RE2). (In this case, the 1.8kb and 3.1 kb fragments
are ends.)
3. Work in from each end: is there a fragment remaining in the double digest (after crossing out
the 1.8 and 3.1 that are already accounted for) that, when added to the
a. 1.8kb, will give a sum equal to one of the fragments for RE1? Yes, 1.8 + 2.8 = 4.6
b. 3.1 kb, will give a sum equal to one of the fragments for RE2? Yes, 3.1 + 2.2 = 5.3
4. The remaining fragment in the double digest, 1.6kb, can be combined with each of the
previous two (2.8kb and 2.2kb) to account for the other fragments from the single digests (4.4kb and
3.8kb). So 1.6kb is the middle fragment.
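The reasoning in steps 2 to 4 can be checked by brute force. The Python sketch below tries every order of the double-digest fragments and every assignment of RE1 or RE2 to the junctions between them, keeping the arrangements that reproduce both single digests. It assumes a linear molecule and that no junction is cut by both enzymes; the function names are illustrative.

```python
from itertools import permutations, product

RE1 = [3.1, 3.8, 4.6]
RE2 = [1.8, 4.4, 5.3]
DOUBLE = [1.6, 1.8, 2.2, 2.8, 3.1]

def single_digest(order, cutters, enzyme):
    """Merge double-digest fragments across junctions not cut by `enzyme`."""
    fragments, current = [], order[0]
    for size, cutter in zip(order[1:], cutters):
        if cutter == enzyme:
            fragments.append(round(current, 1))
            current = size
        else:
            current += size
    fragments.append(round(current, 1))
    return sorted(fragments)

def find_maps():
    """Return every (fragment order, junction enzymes) consistent with the data."""
    maps = []
    for order in permutations(DOUBLE):
        for cutters in product((1, 2), repeat=len(order) - 1):
            if (single_digest(order, cutters, 1) == sorted(RE1)
                    and single_digest(order, cutters, 2) == sorted(RE2)):
                maps.append((order, cutters))
    return maps

for order, cutters in find_maps():
    print(order, cutters)
```

Only one map and its mirror image survive: 1.8 | 2.8 | 1.6 | 2.2 | 3.1, with the junctions cut by RE2, RE1, RE2, RE1 in that order, so 1.6kb is indeed the middle fragment.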
II. “Cloning” General Outline
A. The overall purpose common to all techniques is to amplify (produce many copies of) a specific
DNA. There are three processes common to most techniques:
1. Construct recombinant DNA molecule: “target” is a DNA fragment containing gene or DNA
of interest (to sequence or produce protein), “vector” is a carrier which can be replicated in a host:
has origin of replication (host usually bacterium, vector is plasmid or phage) (or virus for animal,
plant cells)
2. “Transform” (with plasmid) or “infect” (with phage). Cells take up DNA, (may want to use
restriction deficient strain of bacteria) replicate it (and express it ---> protein)
3. Select the cells which contain recombinant DNA of interest (“pick” a colony, which contains
progeny from a single cell.) Grow these up in broth ----> “clones” (all cells identical)
B. Linking target and vector = Construct recombinant DNA
1. With “sticky (or cohesive) ends”:
a. Construct/obtain a restriction map of target and of vector. One should have one restriction
site on the vector for the RE to be used. The RE sites on the target should not be in the midst of
the gene if expression is desired.
b. Cut target and vector with same “staggered cutting” restriction enzyme (RE) Fig 5-46
c. allow products to form by base pairing (one will get the desired product - circular plasmid
containing target - among many products: 1,2 or more plasmids in circle, 1,2 or more targets in
circle, plasmids with uninteresting fragments)
d. Heat “kill” RE. Covalently link using DNA ligase
2. From “blunt ends” - it may be desirable to cut the plasmid at a site recognized by a “blunt
cutting” RE, or may just have to use blunt end DNA Fig 5-48
a. Synthesize (buy) “linker” with blunt ends that contains the sequence for staggered cut RE.
b. Link “linker” to vector and/or target (using T4 ligase which ligates blunt ends), then proceed
as in B1.
III. Various Specific Techniques Employing variations on targets, vectors, selection methods
A. cDNAs are first produced as ssDNA molecules that are complementary in sequence to an
mRNA. Then they are often converted to dsDNA form if desired.
1. How to make them:
a. Isolate mRNA produced from the gene of interest. It contains “poly A tail” on its 3' end.
b. Add “oligo dT”, a short, ssDNA in which all bases are T. This serves as a primer.
c. Add reverse transcriptase and dNTPs --> ssDNA-ssRNA hybrid.
d. Add strong base such as NaOH: RNA polymer will be hydrolysed, ssDNA will not. Now one
has the ssDNA, OR continue;
e. Add DNA pol I, dNTPs --> dsDNA with short ss “hairpin”
f. Add S1 nuclease; it digests the ss hairpin --> dsDNA (blunt).
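Steps a to d amount to writing the reverse complement of the mRNA in DNA bases. A small Python sketch with a made-up message (the sequence and function name are illustrative only):

```python
RNA_TO_DNA = {"A": "T", "U": "A", "G": "C", "C": "G"}

def first_strand_cdna(mrna):
    """ssDNA complementary to the mRNA, written 5'->3'."""
    return "".join(RNA_TO_DNA[base] for base in reversed(mrna))

mrna = "AUGGCCUAA" + "A" * 8   # toy message with a short poly A tail
cdna = first_strand_cdna(mrna)
print(cdna)  # TTTTTTTTTTAGGCCAT
```

The cDNA begins with a run of T's: that is the oligo dT primer paired to the poly A tail, from which reverse transcriptase extends.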
2. Uses for cDNA:
a. As “probes”, in restriction fragment length polymorphism (RFLP) or for screening gene
libraries (below).
b. For expression of eukaryotic proteins, the genes for which contain “non expressed
sequence” (introns) interspersed with coding sequence. One must use a cDNA; one cannot use a
chromosomal DNA fragment because there is no machinery in the bacterial cell to remove the
introns.
B. Selection based on “insertional inactivation”: if a plasmid vector that contains a gene with an
observable activity is used, the observation can be used for selection
1. Using a plasmid that contains 2 antibiotic resistance genes (say, ampR (ampicillin resistance)
and tetR):
a. Use an RE that cuts in the midst of one resistance gene, say ampR, insert target there. This
resistance gene is no longer functional (it has been inactivated by insertion of the target into it).
b. Transform E coli
c. Grow transformants on an agar plate containing the other antibiotic, tetracycline.
Nontransformants (no plasmid) won’t grow.
d. Pick colonies (containing the daughters from a single cell), inoculate two broth culture
tubes, one with each antibiotic.
e. Cells which grow in the presence of tetracycline and not in the presence of ampicillin
contain the recombinant plasmid.
f. If the target was a cDNA any such cell (as in e) is used since the target is homogeneous,
not a mixture of various DNAs.
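The logic of steps c to e is a small decision table. A Python sketch, assuming (as in the example above) that the target was inserted into ampR and tetR was left intact; the function name is illustrative:

```python
def classify_colony(grows_on_tet, grows_on_amp):
    """Interpret growth results when the target was inserted into ampR."""
    if not grows_on_tet:
        return "no plasmid"
    if grows_on_amp:
        return "plasmid without insert"
    return "recombinant plasmid"

print(classify_colony(grows_on_tet=True, grows_on_amp=False))  # recombinant plasmid
```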
2. A variation on insertional inactivation uses “blue-white screening”:
a. The plasmid contains a lac Z (β-galactosidase) gene. Fig 5-43
b. The lac Z gene has a number of different RE sites within it. If the target is inserted in one of
them then plasmid becomes lac Z-.
c. The plasmid also contains ampR gene. Transform a lac Z- strain of E Coli with the products
of the recombination mixture, plate onto medium containing amp and “X-gal” (a lactose analog)
i. nontransformed cells don’t grow; (amp sensitive)
ii. cells transformed with original, nonrecombinant plasmid--> blue colonies (active lac Z
gene’s β-galactosidase cleaves X-gal, as it would lactose, producing blue substance)
iii. cells transformed with recombinant plasmid --> white colonies (lac Z- because insert
inactivates it)
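The three outcomes i to iii can be written as a lookup. A Python sketch, assuming a lac Z- host plated on amp + X-gal as described (the function name is illustrative):

```python
def colony_phenotype(has_plasmid, has_insert):
    """Expected result on amp + X-gal plates for a lac Z- host strain."""
    if not has_plasmid:
        return "no growth"   # amp sensitive
    return "white" if has_insert else "blue"

for plasmid, insert in [(False, False), (True, False), (True, True)]:
    print(plasmid, insert, colony_phenotype(plasmid, insert))
```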
C. Gene libraries contain stored samples that include all of the DNA fragments from an organism’s
genome.
1. Producing the library by shotgun cloning
a. Uses a special mutant of λ phage as the vector. It is cut with Eco R1 in 2 places --> 3
fragments. Fig 5-49
b. Separate fragments, discard middle, mix 2 end fragments with target, ligate.
c. Target is chromosomal DNA cut with Eco R1 for a short time so as to only cut, say 25% of
the sites.
d. Mix recombination products with “packaging proteins”, infect E coli.
e. Joining of 2 fragments (λ ends) without target will produce DNA too short to be effectively
packaged --> no plaques (a plaque is like a colony in that all λ in it came from infection by a single
phage).
f. So all plaques are recombinants, but target was a mixture of millions of genomic DNA
fragments.
2. Screening the gene library by colony (plaque) hybridization:
a. Press “filter paper” onto each plate; this picks up some of each plaque, in place (replica of
plate) Fig 5-52
b. Lyse phage with NaOH, (remove phage protein), which also melts the DNA. ssDNA stays
in place on filter.
c. Add the probe, and autoradiograph as in RFLP
d. Go back to the original plate and pick the plaque that corresponds to the spot on the X-ray
film. Grow these up (infect E coli).
D. Chromosome Walking is used to sequence, map or study chromosome segments much longer
than can be cloned (plasmids <10 kbp, lambda phage up to ~15 kbp, cosmid up to ~50 kbp, artificial
chromosomes of yeast, bacteria ~200 kbp) Fig 5-53
1. Gene libraries are made with partial digests, cutting at, say 1/4 of all the sites for the RE. So
the first probe (C2, above) may detect several different plaques with overlapping cloned fragments.
2. Each plaque is picked, grown up and its target DNA mapped or sequenced.
3. Another fragment is selected from the mapped targets that flanks the portion detected by the
first probe, (on the extreme end of a cloned fragment). This fragment is then “subcloned”, and used as
a probe to screen the library again and find new plaques. These plaques contain cloned fragments
adjacent to those detected by the first probe.
4. Repeat steps 2 and 3 many times.
E. DNA Sequencing
1. “Sequencing Phage” - M13: is a circular ssDNA phage with a dsDNA (circular) replicative
form (RF)
a. treat RF and target with a RE, ligate
b. infect E Coli --> ss DNA (“+strand”)
c. add synthetic ssDNA oligomer consisting of RE recognition sequence + adjacent (~15 bp).
This will bind to cloning site at end of target, serve as primer for DNA Pol I
d. use 4 dNTPs and a small proportion of one dideoxy TriPhosphate (which is radioactive)
e. run 4 reactions in 4 tubes, each with a different dideoxy (2',3') (a dideoxy is a chain terminator
because it has no 3'OH) --> various lengths of ssDNA complementary to target, which for each tube all
end at same nucleotide
f. electrophoresis, autoradiography. Autoradiograph can be “read” for sequence. Fig 7-14,
15, 16
*Note: since orientation of target is random, some M13 + strands will contain one strand of target,
some will contain other: sequence both as a check
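How the four tubes turn into a readable sequence can be sketched in Python. This is a simplified model: the primer length is ignored, every possible termination is assumed to give a visible band, and the "synthesized" strand is a made-up example. Function names are illustrative.

```python
BASES = "ACGT"

def dideoxy_fragments(new_strand):
    """For each ddNTP tube, the lengths at which synthesis can terminate."""
    return {b: [i + 1 for i, x in enumerate(new_strand) if x == b]
            for b in BASES}

def read_gel(tubes):
    """Read bands from shortest to longest across the four lanes."""
    bands = sorted((length, base)
                   for base, lengths in tubes.items()
                   for length in lengths)
    return "".join(base for _, base in bands)

synthesized = "GATTACA"   # toy strand complementary to the target
tubes = dideoxy_fragments(synthesized)
print(tubes["A"])         # [2, 5, 7]
print(read_gel(tubes))    # GATTACA
```

Reading the autoradiograph from the bottom (shortest fragments) upward gives the newly synthesized strand 5'->3'; the target strand is its complement.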
F. Site - Directed Mutagenesis Fig 5-57
1. This is also based on obtaining synthesis of a specific DNA sequence by using synthetic
oligonucleotide primers (as in sequencing above, and PCR, below). The purpose here is to produce a
modified protein which has specific amino acid substitution(s) chosen by the investigator.
2. Use a cloned fragment, say from an expression vector (this is a plasmid with a cDNA inserted
in the appropriate position in relation to sequences on the plasmid for control of transcription and
translation).
3. Cut with same RE used to make the recombinant plasmid, isolate the target.
4. Do steps a, b and c with phage M13 as in E1 above, but in step c, the synthetic oligomer contains
the altered codon bases, along with the residues that will bp with the bases adjacent to the codon to
be altered. This is the primer for DNA synthesis by DNA Pol I, which produces a recombinant ds M13
containing the altered codon on one strand of the target. Cut this with same RE used above, isolate
the modified target, insert it into the expression vector, transform and grow up cells, and purify the
protein.
G. Polymerase Chain Reaction (PCR): To amplify a specific segment using total chromosomal
DNA Fig 5-54
1. Add synthetic primers that ONLY bind on the ends of the segment to be amplified
2. Also add Taq DNA Polymerase (heat stable), 4 dNTPs, Mg ion, etc. for DNA synthesis
3. Melt DNA, separate product and template strands.
4. Thermal cycler will repeat n temperature cycles: at T1-primers bind, at T2- DNA synthesis
occurs, at T3- melt DNA. This produces 2^n copies after n cycles.
5. Variable stops (as to location of DNA Pol I on chromosomal fragments at the time of shift from
T2 to T3) produce constant length dsDNA segments. (One can purify the products from one band on
an EP gel.) HOW? All templates that are the products of a previous round end at the synthetic
oligonucleotide primer from which their synthesis was initiated.
6. What determines how many copies are made? The number of primer molecules one adds.
Each strand of dsDNA product begins with one. (One can always add enough dNTPs, set the cycler to
run enough cycles.)
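Points 4 and 6 together (doubling each cycle until the primers run out) can be sketched in Python. This is a rough counting model under stated assumptions: perfect efficiency, one shared primer pool instead of two separate primers, and all strands counted alike whether long or constant-length products. The function name is illustrative.

```python
def pcr_strands(templates, primers, cycles):
    """Count single strands after `cycles`; each new strand uses one primer."""
    strands = 2 * templates          # each dsDNA template has two strands
    for _ in range(cycles):
        new = min(strands, primers)  # cannot start a strand without a primer
        strands += new
        primers -= new
    return strands

print(pcr_strands(1, 10**6, 10) // 2)  # 1024 ds copies = 2^10
print(pcr_strands(1, 100, 10) // 2)    # 51: limited by the 100 primers
```

With excess primer the count doubles every cycle; once the primer is used up, further cycles add nothing.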
H. Expression Vectors: In order to produce a eukaryote protein in a transformed bacterium an
appropriate coding sequence must be inserted into an appropriate location in relation to the regulatory
sequences required for transcription and translation.
1. As explained previously, the coding sequence must be a cDNA rather than a chromosomal
fragment.
2. The following sequences would be positioned in 5’ to 3’ order:
i. transcription regulatory sequences, such as those of the arabinose operon
ii. bacterial promoter
iii. Shine-Dalgarno sequence for ribosome binding
iv. RE site(s) to insert target
v. hairpin-UUU for transcription stop
Section I was skipped in 2006
I. Directional cloning: sometimes the target has directionality in relation to the technique, as in
expression. It is desirable to have the coding sequence in the correct order rather than reversed.
1. The target and vector are each cut with 2 different staggered cutting REs. Fig 5-56
2. Then, when mixed and ligated, there is only one orientation in which the target and vector can
pair.
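Why two different REs force one orientation can be shown with a toy model of sticky ends. The Python sketch below assumes palindromic 5' overhangs, so an overhang pairs with an identical overhang, and uses AATT (Eco R1) and GATC (Bam H1) as example overhangs; the function name is illustrative.

```python
def orientations(vector_ends, insert_ends):
    """List insert orientations whose overhangs match the cut vector.

    Ends are (left, right) overhangs; a palindromic overhang pairs with
    an identical overhang.
    """
    left, right = insert_ends
    options = {"forward": (left, right), "reversed": (right, left)}
    return [name for name, ends in options.items() if ends == vector_ends]

# One RE: both ends are alike, so either orientation ligates.
print(orientations(("AATT", "AATT"), ("AATT", "AATT")))  # ['forward', 'reversed']
# Two REs: only one orientation fits.
print(orientations(("AATT", "GATC"), ("AATT", "GATC")))  # ['forward']
```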
Chapter 34: Eukaryotic Chromosomes and Genes
I. Chromosomes: 4 mile long uncooked spaghetti in a bag a few feet in diameter (cell nucleus)?
1) Chromosomes consist of DNA with associated proteins and RNA= chromatin Fig 34-1
2) Less densely packed euchromatin is expressed producing RNA and proteins, densely packed
heterochromatin is not expressed. Fig 34-2
3) 2 x 23 chromosomes in humans, 44 to 246 x 10^6 bp and 1.6 to 8.2 cm long. How to fit in
nucleus? Packaged with histones.
A. Histones: DNA packaging proteins
1) Most of the chromatin protein is histones, all of which have >20% [lys (R = -(CH2)4-NH3+) + arg
(R = -(CH2)3-NH-C(-NH2)=NH2+)]. These bind to negatively charged phosphates of DNA backbone.
Table 34-1
2) Have nearly identical AA sequences in all organisms. One of the histones, H4, differs by only 2
AA’s between cow and pea (Fig 34-3). This evolutionary conservation of sequence implies that
pea H4 is “perfect” and any change would be for the worse.
3) Modifications such as addition of methyl, acetyl and phosphate groups are common. These
usually decrease the strength of DNA binding and “loosen” the packing of DNA to make it more
accessible for transcription, replication, recombination and repair.
4) Often there are variants that differ by a few AA, with different variants expressed at different
stages of cell development.
B. Nucleosomes (“nuclear bodies”)
1) Nucleosome core particle: 146 bp of ds DNA is wrapped around a histone octamer, which
consists of 2 each of H2A, H2B, H3, H4; the octamer is an eight subunit protein, about 100 Å in
diameter. A tetramer, (H3)2(H4)2 forms a central disk with a H2A-H2B dimer above and below.
2) A segment of DNA, 8 to 114 bp long (usually ~55 bp long) extends from core to core. This is
“linker DNA”.
3) One molecule of histone H1 binds to both ends of the two loops of DNA that wrap around the
histone octamer “sealing off” the nucleosome. The octamer + 166bp of dsDNA + H1 makes up the
“chromatosome”. Fig 34-9 Chromatin has a “beads on a string” appearance, in which the
chromatosomes are the beads and linker DNA is the string. Fig 34-4 The DNA “enters” and
“leaves” on the same side of the nucleosome as a result of H1 binding; in H1-depleted chromatin
this is more random and often on opposite sides. Fig 34-10
4) The segments and side chains of the histones do not wrap around or protrude into or between
the DNA helices. Long, flexible N-terminal segments of each of the histones in the core extend out
from the nucleosome and interact with linker DNA. The residues that undergo modification are in
these N-terminal segments. Fig 34-7
5) Histone H5 is a variant of H1 that binds more tightly to DNA than H1 does. H5 is associated
with DNA that is inactive in transcription and replication.
6) During replication the histone octamers are distributed at random among both daughter
chromosomes.
7) Nucleoplasmin is a protein that has a role in the assembly of the nucleosomes, in bringing
histones and DNA together. It is acidic and binds to histones, but not to DNA nor to nucleosomes.
C. Nucleosome Associations: 300 Å (aka 30-nm) filaments.
1) Histone H1 has a role in organizing the nucleosomes (Fig 34-10).
2) The nucleosomes pack into a helix with 6 nucleosomes/turn, the 30-nm filament (Figs 34-12,
13), with H1's in the center. The H1's ends are thought to make contacts with adjacent
nucleosomes by interacting with each other end-to-end.
D. Chromosome Organization: Loops of 300 Å filaments
1. A metaphase chromosome consists of loops of 30-nm filaments that project from a protein
“scaffold”. The loops are ~45-90 kb (45,000-90,000 bp) long and appear to be attached at both
ends to about the same place on the scaffold. (Figs 34-14, 15). There are about 2,000
loops/chromosome.
E. Polytene Chromosomes of Drosophila
1) Multiple replications without cell division may yield ~1000 copies of a chromosome which
remain aligned and joined in parallel. Fig 34-16, 17, 18
2) Staining these polytene chromosomes produces a characteristic pattern of alternating light and
dark bands. Fig 33-17 shows that the dark bands are highly compacted DNA/chromatin and the
light bands are parallel segments of linear chromatin which is not highly condensed. Each light
band is an expressed gene and each dark one is a group of unexpressed genes.
3) Evidence for this: in situ hybridization of an mRNA or cDNA preparation will hybridize to one (or
a few) light band(s).
II. Repeated sequences in Eukaryotic DNA: it is not all genes. The sizes of the genomes do not exactly
parallel the morphological complexities of different organisms. In many cases, a simpler animal has a
smaller genome. Fig 34-19
A) Methodology: Kinetics of Reassociation
1) Mechanically fragment chromosomal DNA to produce fragments of ~0.3 to 10 kbp.
2) Melt, cool to ~25 °C below Tm; measure reassociation of complementary strands.
3) Observe A260 as function of time (A260 decreases: dsDNA has lower A260 than ssDNA;
“hyperchromic effect”) Fig 34-20
4) Theory, interpretation: compare two samples - one has 1000 identical fragments and other has
1000 fragments that are all different. Fragment - fragment encounters will occur at the same rate
in the two samples. Only one strand will be the “partner” of a given strand in the second sample,
but each strand has 1000 “partners” in the sample with identical fragments. So the sample with
identical fragments reassociates fast, the other sample slow.
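The comparison in 4) can be made quantitative with the usual second-order rate law for strand reassociation, f = 1/(1 + k*C*t), where f is the fraction still single-stranded and C is the concentration of available partner strands. These notes do not state that equation, so treat the Python sketch below as an illustration under that assumption, with arbitrary units and k = 1.

```python
def fraction_single_stranded(t, partner_conc, k=1.0):
    """Second-order reassociation: f = 1 / (1 + k * C * t)."""
    return 1.0 / (1.0 + k * partner_conc * t)

total = 1000.0           # arbitrary units: 1000 fragments in each sample
identical = total        # any strand can pair with any of 1000 partners
unique = total / 1000    # each strand has exactly one partner

for t in (0.001, 0.01, 0.1, 1.0):
    print(t,
          round(fraction_single_stranded(t, identical), 3),
          round(fraction_single_stranded(t, unique), 3))
```

In this model the half-time is 1/(k*C), so the sample of identical fragments is half reassociated 1000 times sooner than the sample of unique fragments.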
B) Results
1) About 10% of the DNA of some species consists of long inverted repeats (Figure 33-22b). The
single strands of these DNA’s don’t have to find partners, they can fold back (hairpins) and base
pair with themselves much faster; this is the fastest “reassociating” fraction. These are ~100 to
1000 bp long and are distributed at many sites on the chromosomes. About 2,000,000 occur in
humans. The function is unknown, but these may be in hairpin form on chromosomes and be
recognized as such. Fig 34-23
2. Highly repeated short sequences may account for a significant fraction of the total DNA. The
DNA of one crab species is 30% AT repeats. These are sometimes referred to as satellite DNA
because they form separate bands in the ultracentrifuge. Three AT rich heptanucleotides account
for 41% of the DNA in a Drosophila. Such simple sequence repeats (SSRs; aka short tandem
repeats, STRs) are often located at the telomeres on the ends of chromosomes or at the
centromeres at which the “sister chromatids” (daughter chromosomes) are joined and at which the
chromosomes attach to the mitotic spindle for cell division. Fig 34-25 About 3% of human DNA is
SSRs.
3. Moderately Repetitive DNA consists of blocks of 0.1 to several kbp that are randomly
interspersed in many locations on the chromosomes. In humans, more than 40% of the DNA
consists of such segments that are transposons, mainly retrotransposons: Table 34-2
a. LINEs (long interspersed nuclear elements, ~20% of DNA) are derived from 6 - 8kb segments
that code for the enzymes, such as reverse transcriptase, that transpose them. They are derived
from transcripts produced by RNAP II (mRNA). All mammals have them. Most of the LINEs in
humans are shortened and transpositionally inactive.
b. SINEs (short interspersed nuclear elements, ~13% of DNA) are 0.1 - 0.4 kbp segments that
are derived from RNAP III (tRNA, small RNA) transcripts. They contain a RNAP III promoter but
no genes. The Alu family (so called because they happen to have a site for Alu I RE) has 90%
homology with the 7S RNA, a snRNA involved in targeting proteins for secretion or to specific
cellular locations. They occur only in primates, but similar elements are widely distributed. All
SINEs other than Alus are derived from tRNAs.
c. LTR retrotransposons (~8% of DNA) contain long terminal repeats (LTRs) that flank gag and
pol (but not env) genes. Among these, only the ERVs (endogenous retroviruses) appear to have
been active in mammalian genomes.
d. About 3% of human DNA is DNA transposons like those of bacteria.
e. A variety of functions or benefits have been proposed for this DNA, such as sites for
recombination or modular gene assembly or replication origins. Or maybe it is just “junk”
(hitchhikers we never got rid of).
4. Repeated genes are also in the moderately repetitive category.
5. About half of the segments identified as genes in the human genome project have no known
function. Notable is the large numbers of genes that code for proteins with regulatory functions
such as transcription factors and receptors. Fig 34-27
III. Repeated Genes
A) Ribosomal RNA Genes
1) The rRNA genes are grouped into transcription units ordered: 18S - “intron” - 5.8S - “intron” - 28S
and following untranscribed “spacers” the order is repeated. (Xenopus - toad) Fig 34-28
2) This “tandem array” is repeated ~500x in Xenopus somatic (nongamete) cells. In oogenesis
(formation of the egg cell) selective gene amplification yields ~2 x 10^6 repeats (75% of total DNA)
(egg cell needs 10^12 ribosomes at fertilization). (5S rRNA genes are repeated ~120 x or ~20,000 x
in the two cell types) Fig 34-29, 33
3) The tandem array is a transcription unit. The primary transcript is specifically cleaved to
produce the separate 18S, 5.8S and 28S rRNAs, and some specific 2'OHs are methylated (these
methylations are evolutionarily conserved: done the same for many species).
4) The “spacers” between the transcription units are nearly all identical in oocytes, but are
heterogeneous in somatic cells. The amplification of one gene (with its spacer) gives rise to
identical circular rRNA “plasmids” which replicate as “rolling circles” (see F plasmid)
B) Histone genes Fig 34-31
1) These are also arranged in tandem and repeated many (10 to 100s) times (depending on
species, rate of zygote divisions).
2) The repetitions may contain variations and certain variations may be selected depending on
cell type and development.
3) These genes contain no introns and the mRNAs do not get “poly A tails”
C) Other structural genes (protein coding genes) are usually present at one or a few copies. The
gene for silk fiber is at one per cell, but its messages are extremely stable (for days), so they
accumulate to attain a high level.
D) Selective pressure may result in accumulation of hundreds of copies of a gene usually present
once/cell. (“pressure = inhibit enzyme with a drug; response = gene duplications. *Note: this gene
duplication only occurs in cancer cells. The “pressure” gives cells with duplications a selective
advantage; evolution).
IV. Globin (Hb) genes:
A) Hemoglobin Gene Family (Immune Gene Superfamily later)
1. The human Hb genes are all clustered on 2 chromosomes. The order in which the genes are
expressed during development (embryo ( Hb Gower 1: ζ2ε2) --> fetus (HbF: α2γ2) --> baby/adult (HbA:
α2β2, with a little HbA2 α2δ2)) is the same as the order of the genes on each chromosome. Fig 34-35
2. The alpha-like genes are in a ~28 kbp segment on chromosome 16: zeta (ζ), alpha1, alpha2 (the 2
alpha genes code for the same AA sequence); also 4 pseudogenes (ψζ, ψα2, ψα1, ψθ) – these are
not transcribed, but very similar sequence (~ 75% same bps) to the corresponding genes; and 3
copies of the Alu repeat. Fig 34-36
3. The beta-like genes are in a ~100 kbp segment of chromosome 11: epsilon (ε), G-gamma (Gγ),
A-gamma (Aγ) (the two gamma genes are identical except one has gly at residue 136, the other has ala),
delta (δ), and beta (β); also one pseudogene, ψβ; 8 copies of Alu, and 2 copies of Kpn, another
moderately repeated DNA
4. The alpha- and beta- like Hb genes (and Mb genes) all have the same exon-intron-exon array. The
introns vary in length, but for all: the first exon codes the first 30 or 31AAS, the second codes the next
68-74 AAs, and the third exon codes the last 31 or 32 AAs. Each exon encodes a domain, a distinct
folding unit that has a distinct function. Fig 34-37
B) Thalassemias, Defects of Hb genes
1. Variety of defects that result in beta-Thalassemia
a. Point mutation converts a codon to a stop codon; or point mutations in promoter boxes diminish
transcription or those at splicing recognition sites (branch, junctions) or at poly A site produce
nonfunctional transcripts.
b. Frameshifts in exons resulting from insertions or deletions.
c. Deletions of all or part of the beta or beta-like genes
2. Heterozygotes for beta-thalassemia are usually asymptomatic. In homozygotes, little or no
functional beta subunit results in severe anemia requiring lifelong blood transfusions (vampires?).
This causes early death from Fe toxicity (alpha4 tetramers form, damage RBCs which are destroyed).
3. delta-beta-Thalassemia: delta and beta genes are both deleted, gamma subunits produced (HbF).
Hereditary persistence of fetal Hb
4. Unequal crossover produces HbLepore (beta-thalassemia) and Hb anti-Lepore (normal). This is an
observed example of a gene duplication/gene deletion type of event resulting from recombination of
slightly misaligned chromosomes. Fig 34-39
5. How did the arrangement of the Hb genes on chromosomes 11 and 16 come about? How does a
chromosome come to have multiple copies of the same sequence (which can then diverge to have
subtle differences in sequence that result in subtle differences in the function of the corresponding
proteins)?
V. Regulation of Transcription Initiation
A. Control of gene expression by controlling initiation of transcription was first shown by comparing
the levels of primary transcripts from specific genes with the levels of the corresponding mRNAs.
Experiment: does the production of a given protein only in one type of cell result from its transcripts not
being produced or not being processed in other cell types?
1. A cDNA library made from the mRNA of mouse liver cells was prepared. This library was
screened with labeled mRNA from various types of mouse cells. This enabled identification of a set of
cDNAs that were only expressed (as mRNA) in liver cells, and 3 others that were expressed in all of
the cell types tested.
2. The cDNAs were purified, denatured, and immobilized on nitrocellulose paper. These were
then probed with the newly synthesized, unprocessed hnRNA purified from the nuclei of the various
cell types.
3. It was found that the hnRNA from the other cell types did not hybridize to the cDNAs that had
been identified as liver-specific based on mRNA. That is, the reason these other types of cells don’t
contain these mRNAs is that these genes are not transcribed in those cell types.
B. Initiation of Transcription by RNA Pol II
Transcription rates in prokaryotes (rate for an expressed gene)/(rate for a repressed gene) vary by
about 10^3. Basal expression levels are significant. In eukaryotes they vary by about 10^9. Unexpressed
genes are off, with transcription rates of zero. But the mechanism of control in either case involves the
binding of specific proteins to specific control sequences.
RNA Pol II has no subunit corresponding to E Coli RNA Pol sigma subunit; it can’t recognize promoter
without transcription factors (TFs). There are three types of TFs:
1. General transcription factors (GTFs) needed for all mRNA synthesis. GTFs usually produce
low level mRNA without other TFs present
2. Upstream TFs (UTFs) bind “in front” of mRNA genes and may activate or repress
transcription. They bind to their sites under all conditions (rather than only when inducer-type
substance is bound/not bound to it or when modified). A given upstream TF may be present only in
certain cells types or only at certain stages of development: their synthesis is regulated.
3. Inducible TFs are activated or inhibited by effector/inducer binding or by phosphorylation
/dephosphorylation. (Similar to CAP, lac & trp repressor, AraC)
C. Preinitiation complex (PIC): ordered association of 7 GTFs and RNA Pol II with promoter. Mass:
1.6 million + 0.6 million RNA Pol II Fig 34-47
1. The TATA-box binding protein (TBP), binds and about 10 TAFs (TBP associated factors) bind
to it, comprising TFIID. (Recall that the TATA box appears to be involved in setting up initiation to
occur at a specific bp; its elimination causes the initiation site to be variable.)
2. Then TFIIA and TFIIB bind, followed by the TFIIF-RNAP II complex. One of TFIIF’s subunits
has sequence homology with σ70, the major bacterial sigma subunit, and binds to bacterial RNAP.
3. TFIIE and TFIIH bind, completing the PIC. Then the ATP-dependent helicase function of
TFIIH is required for “melting” of the promoter.
4. The PIC forms on promoters that often contain several elements, though these vary:
a. TATA box: seven residues (TATA(T or A)A(T or A)) at ~ -31 to -25.
b. TFIIB recognition element (BRE): seven residues, all G and C, just upstream from TATA.
c. Initiator: pyrimidine (T, C) rich seven residue segment at ~ -2 to +5.
d. Downstream core promoter element: six residues at ~ +28 to +32.
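The TATA box consensus in 4a, TATA(T or A)A(T or A), maps directly onto a regular expression. A Python sketch, with a made-up promoter sequence and an illustrative function name:

```python
import re

# TATA(T or A)A(T or A), as given in 4a above
TATA_BOX = re.compile(r"TATA[TA]A[TA]")

def find_tata_boxes(seq):
    """Return 0-based start positions of matches to the consensus."""
    return [m.start() for m in TATA_BOX.finditer(seq)]

promoter = "GGGCGCCTATAAAAGGCTCGA"   # hypothetical sequence
print(find_tata_boxes(promoter))     # [7]
```

A real search would also check that the match lies at about -31 to -25 relative to the start site, since the consensus alone occurs by chance elsewhere in a genome.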
5. TBP has two homologous, identically-folded domains that bind the TATA box. It binds along
the length of the DNA, sharply bends it, almost into a small loop, and partially unwinds it. TFIIA and
TFIIB both have contacts with TBP and with DNA.
6. Among the TAFs that bind TBP are those with similarities to the histones in sequence and/or
folding pattern. A histonelike octamer appears to be involved in the structure of TFIID, but many of the
residues in this octamer that contact the DNA are negatively charged.
7. About 65% of genes transcribed by RNAP II lack the TATA box. These are mostly constitutive
genes. These genes usually contain the initiator sequence, which appears to be sufficient to set the
start site. Transcription of these genes involves a PIC as above, including a requirement for TBP. TBP
also acts at genes transcribed by RNA Pol I (which produces rRNA) and Pol III (tRNA, small RNA),
where it interacts with TAFs and TFs different from those above.
D. Regulation of Genes that Produce Cell-Specific Proteins
1. Upstream TFs (UTFs) bind to promoter and/or enhancer sequences to activate transcription in
specific cells.
a. The insulin gene has an enhancer in the -103 to -333 region that is effective only in
insulin-producing cells: only these cells produce the TFs that bind these sequences. When this enhancer is
inserted anywhere into a plasmid, it turns on transcription of an enzyme gene only when the plasmid is used to
transform cells that normally produce insulin. Only these cells have the UTFs that bind the enhancer and
activate the transcription. The insulin-coding sequence is not required; it is the
UTF-enhancer complex that is specific to these cells.
b. The β-globin (Hb) gene has a CCAAT box at ~ -80 that is recognized by CP1 and a CACCC box at ~
-110 recognized by Sp1. There are also 4 upstream TFs specific to these cells: GATA-1, NF-E2, NF-E3, NF-E4.
c. Upstream TFs affect initiation of transcription by interacting with their regulatory sequences in
the DNA, with each other (increasing DNA sequence recognition) and with the PIC (increasing
transcription activation by RNA Pol II). This can result in various degrees of activation (or inhibition)
depending on how many TFs are bound. The TFs have two functional domains. The DNA-binding
domain determines which gene(s) will be activated, because it sets the location of binding to DNA. The
activator domain's function is not specific to a particular gene; it interacts with the PIC.
Production of a hybrid protein (by recombinant DNA techniques) that has the DNA-binding domain of
the TF that activates gene x and the activator domain of the TF that activates gene y results in a TF
that activates gene x. The two halves of the hybrid TF can be put together in either order (amino to
carboxyl) with the same result.
d. (Skipped in 2006) Architectural proteins bind to the DNA, causing it to bend or change
conformation so as to enable TFs to bind. Some assemblies of a specific group of architectural
proteins and TFs on an enhancer of ~100 bp form an “enhanceosome” as the functional unit for
activating or repressing.
2. Inducible TFs include those with activity that is regulated by hormones. Steroid hormones bind
to their receptors, which bind to enhancers that turn on specific genes (such as ovalbumin, the major
egg protein, in chick oviduct). The effect of the hormone-receptor complex may be different in different
cell types, apparently because of the differences in the TFs in them.
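The domain-swap result in item 1c (the DNA-binding domain decides which gene is activated, whichever order the two halves are joined in) can be restated as a small model. This is an illustrative sketch only: the `Domain` type, its field names, and `activated_gene` are hypothetical, and the model encodes nothing beyond the rule stated in the notes.

```python
from typing import NamedTuple, Optional

class Domain(NamedTuple):
    kind: str                          # "dna_binding" or "activator"
    from_tf: str                       # TF the domain was taken from
    binds_gene: Optional[str] = None   # set only for DNA-binding domains

def activated_gene(domains):
    """Gene activated by a TF built from `domains`, listed amino to carboxyl.

    Rule from the notes: both domain types are needed, and the target gene
    is set by the DNA-binding domain alone.
    """
    kinds = {d.kind for d in domains}
    if kinds != {"dna_binding", "activator"}:
        return None  # a lone domain does not activate anything in this model
    return next(d.binds_gene for d in domains if d.kind == "dna_binding")

binding_x = Domain("dna_binding", "TF for gene x", "gene x")
activator_y = Domain("activator", "TF for gene y")

print(activated_gene([binding_x, activator_y]))  # gene x
print(activated_gene([activator_y, binding_x]))  # gene x (order does not matter)
```

The lookup never consults the activator domain's origin, which mirrors the point that the activator domain is not gene-specific.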
E. Mediators (Skipped in 2006) (MW ~ 1 million) are involved in transcription from all yeast RNAP II
promoters. They bind to DNA-bound RNAP II and TFs. Similar proteins have been identified in other
eukaryotes.
V. Transcriptionally Active DNA is More Sensitive to Nuclease Digestion.
A. The DNase I sensitivity of a gene is tissue-specific and developmental stage-specific, in parallel
with transcriptional activity.
1. Hb genes
a. In the 20 hr chick embryo, there is no Hb synthesis; the genes are not sensitive. In the 35 hr
chick embryo, Hb synthesis begins and the Hb genes become sensitive.
b. Hb genes in brain cells are not sensitive.
c. Fetal Hb genes are sensitive in fetal erythroid cells, adult Hb genes sensitive in adult
erythroid cells (and not vice versa)
2. The ovalbumin gene is sensitive in the chick oviduct (ovalbumin is the major protein in eggs,
the oviduct is where eggs are produced) but not in erythroid cells that produce Hb.
B. Molecular Basis of Sensitivity
1. Inactive chromatin is highly condensed, resistant to DNase I cleavage. Potentially active or
active genes are more exposed or accessible to DNase I: sensitive. Recall the appearance of polytene
chromosomes.
2. Sensitivity results from having HMG14 and HMG17 (proteins) bound to nucleosomes. These
are not tissue specific nor are they sequence specific (transcription factors are sequence specific).
HMGs have a high % of both basic (lys, arg) and acidic (glu, asp) residues.
3. Evidence for point 2: Salt extraction of HMGs eliminates sensitivity. Adding back HMGs,
even from another tissue, restores sensitivity.
4. There are three Superfamilies of HMGs
a. (Skipped in 2006) The HMGB proteins have an HMG box consisting of three alpha helical
segments in an “L” shape. This binds to the DNA in the minor groove independent of sequence and
sharply bends the DNA, acting as an architectural protein.
b. (Skipped in 2006) The HMGA proteins contain arg-gly-arg-pro segments called AT hooks
that bind to AT rich sequences. These proteins can bend, unwind, or cause loops to form in dsDNA,
again acting architecturally. They have been observed to be involved in enhanceosome formation.
c. The HMGN proteins include HMG14 and HMG17 (now known as HMGN1 and HMGN2).
They interact with the N-terminal tail of H3 and compete with H1 for binding to nucleosome cores. In
so binding, they cause chromatin to be less condensed, more accessible. HMGN-bound nucleosomes
occur in clusters of about 6.
C. (Skipped in 2006) Nucleosomes “Step Around” RNAP During Polymerization
1. Experiment: a nucleosome core was bound to a short segment of dsDNA on which it did not
move. When this was inserted behind a promoter on a 227 bp DNA, transcription caused it to move
upstream by 40 to 95 bp, into the untranscribed region.
2. Model for the mechanism:
a. As the RNAP approaches the nucleosome, the segment just ahead separates from the
nucleosome.
b. Once RNAP advances into the DNA that had been bound to the nucleosome, the DNA
behind the RNAP binds to the nucleosome.
c. As RNAP moves on, the DNA ahead “peels” off of the nucleosome while the DNA behind
spools onto it.
D. DNA is hypersensitive to DNase I in the regulatory sequences on the 5' ends of active genes.
1. These hypersensitive sequences are free of nucleosomes (Fig 34-67). Two proteins have
been identified that bind to DNA on the 5' side of β-globin (Hb) genes that confer hypersensitivity and
prevent histone binding.
2. There are 5 hypersensitive (HS) sites in a region 6 to 22 kb upstream of the ε (embryo beta-like)
gene and another about 20 kb downstream of beta. Together, these HS sites act to keep the Hb gene
cluster open to transcription. Individuals with deletion of the 5 upstream sites have severely reduced
synthesis of γ, δ, β (γδβ thalassemia).
3. When RBC precursor cells that synthesize Hb are transformed with DNA containing these HS
sites followed by non-Hb protein genes, these proteins are produced. These HS sites are involved in
activation of transcription in response to specific proteins (TFs) produced only in these cells (other
types of cells do not transcribe this DNA when transformed with it).
E. (Skipped in 2006) Chromatin Remodeling Complexes Send Waves of Unbound DNA Around
Nucleosomes To Make DNA Accessible
1. These are large multisubunit complexes that allow TFs and other proteins transient access to
the DNA.
2. Mechanism model:
a. The remodeling complex binds to the nucleosome and adjacent DNA.
b. Using an ATP-dependent “movement” on DNA, it pushes a segment of DNA in toward the
nucleosome, creating a segment of DNA that is not bound to the nucleosome.
c. This unbound region then moves like a wave around the nucleosome.