Biotechnology Homework 1 Fall 2009 Answers

1. (i) (a) On one chromosome there is only one copy of a specific piece of DNA, so you must detect
just a single DNA molecule.
(b) You need to recall how to convert mass, moles and molecules, and to approximate some values. A
whole human genome has a molecular weight of roughly 3 x 10^9 x 700 = 2 x 10^12.
Hence, 5 μg has (5 x 10^-6) / (2 x 10^12) moles,
or (5 x 10^-6 x 6 x 10^23) / (2 x 10^12) = 1.5 x 10^6 copies, or molecules if we consider the whole human
genome as one molecule.
No matter what the size of DNA in question there would be roughly a million molecules in a single
band that might be illuminated by hybridization to a probe.
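The mass-to-molecules conversion in (b) can be sketched as a short calculation. This is only an illustration using the same rounded constants as the text (about 700 per base pair and Avogadro's number taken as 6 x 10^23); the function name is an assumption made for the example.

```python
AVOGADRO = 6e23      # molecules per mole, rounded as in the text
MW_PER_BP = 700      # approximate molecular weight per base pair, as in the text


def copies_in_mass(mass_g, length_bp):
    """Return the number of DNA molecules of length_bp contained in mass_g grams."""
    molecular_weight = length_bp * MW_PER_BP
    moles = mass_g / molecular_weight
    return moles * AVOGADRO


# 5 micrograms of human genomic DNA, haploid genome of about 3 x 10^9 bp
genome_copies = copies_in_mass(5e-6, 3e9)
print(f"{genome_copies:.1e} genome copies")
```

Without rounding the molecular weight to 2 x 10^12 the result comes out slightly below the 1.5 x 10^6 quoted above; the order of magnitude (about a million) is the point.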
(c) Similar to the calculation above, the number of template molecules is roughly
(1 x 10^-7 x 6 x 10^23) / (700 x 10^3) = 1 x 10^11.
In each round of synthesis one template molecule can only produce one product molecule (and in fact
the efficiency of template use will be somewhat lower than 100%, but still pretty high).
Although the procedure is called cycle sequencing there is no template amplification, just repeated
rounds of copying the same 100ng of original template.
Hence, the maximum number of product molecules from 20 cycles would be 2 x 10^12.
These products must be distributed amongst about 1,000 bands (whenever you are attempting to read
700 nt or so), and in fact there is bound to be an excess of full-length copies. So, again, the maximum
number of molecules in a single band will be about 2 x 10^9 (and in reality that figure might be as much
as ten-fold lower), something like a billion.
Very few answers acknowledged that important products were spread over about 1,000 bands. Hence,
any one crucial piece of information (a band or peak) will only represent about 0.1% of the total
product.
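The cycle-sequencing estimate in (c) can be laid out the same way. This is a rough sketch under the assumptions stated in the answer: 100 ng of template, a template of about 1 kb (implied by the 700 x 10^3 molecular weight), 20 cycles with no amplification, 100% template usage, and products spread evenly over about 1,000 bands.

```python
AVOGADRO = 6e23            # molecules per mole, rounded as in the text
MW_PER_BP = 700            # approximate molecular weight per base pair

template_mass_g = 100e-9   # 100 ng of template
template_bp = 1000         # about 1 kb, implied by 700 x 10^3 in the text
cycles = 20                # linear copying, one product per template per cycle
bands = 1000               # products spread over roughly 1,000 sizes

templates = template_mass_g / (template_bp * MW_PER_BP) * AVOGADRO
max_products = templates * cycles
per_band = max_products / bands

print(f"templates:         {templates:.1e}")
print(f"maximum products:  {max_products:.1e}")
print(f"molecules per band: {per_band:.1e}")
```

The per-band figure is an upper bound; lower template efficiency and the excess of full-length copies both push the real number down.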
(ii) The most striking disparity is between FISH and sequencing but FISH and Southerns are in some
ways more similar procedures and still a million-fold apart in required sensitivity. The most important
difference is that the FISH signal is confined to a very small area (perhaps something like 0.1 μm x 0.1
μm in real space; a bright signal makes it appear larger) compared to the others (perhaps 0.1 mm x 0.5
mm), a difference (very roughly estimated) in the neighborhood of a million. Thus, a signal can be
much lower in FISH to achieve the same intensity. When taking this kind of argument to the extremes
one has to consider also the absolute amount of fluorescence and photon detection efficiency, but here
one can use a probe that has a very large number of fluorophores to hybridize to a single target
molecule.
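The area argument can be checked with a quick calculation. The sketch below assumes the FISH spot is about 0.1 micrometre on a side and a gel band is about 0.1 mm x 0.5 mm; both are the rough guesses used in the answer, not measured values.

```python
# All lengths in micrometres
fish_area_um2 = 0.1 * 0.1        # FISH signal: about 0.1 um x 0.1 um
band_area_um2 = 100.0 * 500.0    # gel or blot band: about 0.1 mm x 0.5 mm

area_ratio = band_area_um2 / fish_area_um2
print(f"band area / FISH spot area = {area_ratio:.0e}")
```

The ratio is a few million, which is why the same signal density can be reached in FISH with roughly a million-fold fewer target molecules.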
(iii) The longer the region of target DNA for which you have labeled probe the stronger the signal. Of
course, the probe must be labeled to high specific activity also to increase the signal. In practice,
whether one or several DNA templates are used to generate a probe the probe will be made as, or
broken down into, relatively short sections. Thus, each fragment has some chance of hybridizing to a
given chromosome during FISH and the signals you see come from the hybridization of many probes
to nearby regions of the chromosome. If you had one long probe you would often see nothing, and
occasionally see a jackpot hybridization event with a strong signal.
I am personally a little surprised at how many molecules are in a sequencing band (at least according to
these calculations assuming very efficient template usage). Probes for Southern blots are often around
1kb or more and are made to high specific activity by using a labeled dNTP. Hence, there may be
as many as 200-300 labels per probe molecule, compared to only one labeled ddNTP in a sequencing
product molecule. That accounts for a difference of a hundred-fold or more in the sensitivity of
detection between these two procedures (but not a million-fold). When Southern blots were popular
they generally used radioactivity, for which the sensitivity of detection (for bands) may be greater than
fluorescence, but I do not know the relevant numbers. I do know that when old-fashioned sequencing
transitioned from incorporating radioactive label (for example, from 32P-dATP at several positions in
one sequencing product) to using one fluorescent ddNTP it was necessary to introduce multiple cycles
of sequencing to produce enough product to be readily detectable.
Many answers did not emphasize, explain or mention the importance of a long probe simply to
generate more signal. Instead, there was a lot of discussion about specificity of hybridization. There are
regions in a human genome that are exact or near-exact repeat sequences. If those repeats come from a
variety of chromosomal positions and you include the repeat sequence in your probe you will indeed
see hybridization at more than one site. The longer your probe the more chance of including a repeat.
However, the solution to this is not to use short probes but to find and remove repeat sequence (i.e., use
multiple pieces of DNA from the region in question, but none with repeats). Ignoring repeats there is
no problem with specificity. 300bp is long enough to provide absolute specificity in hybridization (so
is even 50bp) even in the complex human genome at high stringency, and the specificity remains the
same (exceedingly high) even if the probes collectively include 200kb.
Also, in case anyone was worried, it is equally good to talk about annealing or hybridization of two
nucleic acid strands to each other.
2. (i) (a) After PCR the tube will contain the starting genomic DNA, the PCR product, excess primers
that were inevitably used, Taq or similar polymerase, dNTPs and buffer. The main components to
consider are nucleic acids.
Genomic DNA should not be a problem. A sequencing primer is typically at least 20nt in order that it
would have several mis-matches with random (or in this case genomic) DNA. Thus, it is easy to find
an annealing temperature at which only a very small proportion of the hybridizing primer binds to
inappropriate sites. Even if there is, by chance, a really good match at one genomic location there are
so few copies of that locus in the reaction tube that the specific products from there will be negligible.
Hence, genomic DNA would be expected to contribute only a very small background of bands, roughly
equally distributed among ddA, G, C and T colors at each size.
The primers will be a problem. Sequencing must be from only one site. Even if you chose to use one
of the PCR primers for sequencing you must get rid of the other (a reasonably strong second sequence
imposed on a primary sequence makes a mess; although theoretically possible to resolve, the demand
for top quality sequence makes that objective unattractive). Hence, PCR products are generally
cleaned up prior to sequencing by passing them through some form of sizing column, where large
DNAs pass through and are collected, while primers and low molecular weight compounds are still
traveling through.
The polymerase, dNTPs and buffer components are not the main issue here. All fresh solutions and
enzymes are nice and undoubtedly have some impact on product quality but a portion of the PCR
products could easily be diluted straight into sequencing reactions with more polymerase and suitable
quantities of dNTPs and ddNTPs if this were the only issue.
Many answers did not acknowledge the presence of genomic DNA and address whether this was a
problem or not. Also, some answers talked generically about contaminants. While it is not necessary
to enumerate every “contaminant” it is important to think about what you include under that heading
and which contaminants pose what kinds of problems. In other words, be precise in your thoughts
and writings.
(b) The problem (as made very clear in Q1) is that you won’t have anything like enough of the desired
template. Basically, for the human genome, making 1 μg of a PCR product of 3kb from the same mass
of a genome of 3 x 10^6 kb is a million-fold enrichment. It would not be feasible to use 100 mg of
genomic DNA in a volume like 10 μl, as used for sequencing: far too sticky. If it were feasible, a
second problem would now emerge of considerable background priming from primer mis-matches to
the, now, huge concentration of genomic DNA segments.
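The enrichment argument reduces to a ratio of lengths. The sketch below assumes a 3 kb target in a genome of 3 x 10^6 kb and the 100 ng of template used for sequencing in Q1; the variable names are illustrative only.

```python
product_kb = 3             # length of the PCR product
genome_kb = 3e6            # approximate length of the human genome
template_needed_g = 100e-9 # 100 ng of specific template, as in Q1

# Each genome copy carries one copy of the target, so by mass the target
# is only product_kb / genome_kb of the genomic DNA.
enrichment = genome_kb / product_kb
genomic_mass_needed_g = template_needed_g * enrichment

print(f"enrichment by PCR:   {enrichment:.0e}-fold")
print(f"genomic DNA needed:  {genomic_mass_needed_g * 1000:.0f} mg")
```

So matching 100 ng of a 3 kb template would take on the order of 100 mg of genomic DNA, which is not workable in a sequencing-scale volume.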
(ii) If you had 8 fluorophores you could use two tubes (4 fluorophores each) for two different templates
and then combine into one lane. The idea that might occur for 5 fluorophores is to use the 5th
fluorophore by attaching it to the primer used for one template (but not the other primer). Although
initially you might think that you have thereby created two sets of uniquely labeled sequencing
products, the problem is that you do not see the results molecule by molecule but rather band by band.
Thus, you cannot identify a molecule as starting with labeled primer and ending with ddGTP, for
example. Instead, of course, every single band will have some product from the labeled primer and
will include that fluorophore, providing no extra information at all. Two super-imposed sequences
cannot be separated into the two components. That would also be true if one were theoretically at
much lower concentration because by-products (random pauses etc.) from the stronger sequence would
introduce significant background signals into the weaker sequence.
3. (i) (a) The hybridization (and washing steps) was likely conducted at low stringency. This can be
done by reducing the temperature and/or increasing salt concentration from standard conditions.
Some answers mentioned zoo blots. That is OK but the name itself does not contribute to answering
the question. Also, a typical zoo blot has DNA from many species (not just yak) to see if a probe
sequence is generally conserved (& such experiments are very rare now given full sequences of many
representatives of different phyla and species).
Some answers talked about controlling the timing of altered hybridization conditions. I don’t believe it
is common or useful to change conditions during a hybridization experiment.
(b) No, for many reasons. DNA is mostly irreversibly bound to membranes used for Southern blots.
Even in a narrow slice of (fresh, not transferred) gel with cleaved genomic DNA there will be a huge
number of contaminating bands and way too low a concentration of the desired product to be of any
use.
One could envisage PCR-amplifying a yak DNA fragment by guessing which sequences might be best
conserved and designing suitable PCR primers, but if the Southern blot looked good the simplest idea
is just to use the mouse DNA probe for the yak Northern blot under similar conditions of reduced
stringency. It is true that this approach may be problematic. If the gene in question is poorly expressed
the signal may be very low (whereas in a genomic DNA Southern equal representation of different
sequences [other than repeats] is guaranteed). Also, if a somewhat related sequence is abundant (may
even apply to rRNA) that could give a spurious signal under reduced stringency. A realistic problem is
that it would be hard to design a control to prove that whatever you see is the correct transcript.
Hardly any answers stated the obvious deficit (to me) of insufficient and impure DNA in the 3.2kb
region (so the question was worthwhile). Instead, a variety of lesser concerns not directly relevant to
the idea of probe quality were raised. One concern was that part of the 3.2kb might correspond to
intron. Even if only 500bp corresponds to exon you should be able to get a good signal. If the whole
region were intron it probably would not be well conserved and the 3.2kb band would not be identified
in the first place. It is common to use genomic DNA fragments to probe Northern blots, although I
concede that cDNA probes can be better. Another concern was that the relevant mRNA might be
absent or at low concentration. True, but that means a Northern is doomed no matter what the probe.
Third, some had concerns about DNA hybridizing to RNA and wanted to make RNA probes instead.
DNA-RNA hybridization is perfectly fine.
Having failed to identify the main problem, most solutions were necessarily inappropriate. Some
included cloning steps, but incidentally rather than as a way to improve DNA yield and purity for a
probe.
(ii) (a) Isolate DNA (probably from blood cells), cut DNA and run a gel. Transfer and probe
duplicates with (a) the normal oligo sequence and (b) the specifically altered oligo. As controls for
each hybridization include DNA (PCR product or from a plasmid in very low amounts so the signal is
similar to the genomic DNA blots) that is normal and DNA that has the specific mutation (although
normal genomic DNA samples can suffice). Hybridize at moderate stringency and wash with
increasing stringency (several trials may be necessary unless the conditions have already been
established), so that under at least one condition correct matches give much stronger signals than mismatches (revealed by the controls). It is important to use both normal and altered oligos to distinguish
heterozygous, homozygous and no mutations, to be sure that precisely the desired mutation is being
tested and to provide some measure of duplication, affording a reliable result. The controls are
similarly crucial to be sure how to interpret what you see.
The sensitivity of a direct Southern with oligo probes is just sufficient. It might, however, be better and
more convenient to PCR amplify the region around the suspected mutation, spot products onto a
membrane and hybridize to those dot-blots.
Most answers talked about high stringency, which is good enough. However, I think the answers
assumed it is trivial to distinguish between perfect & imperfect matches as a black and white issue. It
is actually quite a fine distinction and really requires controls to show you that the chosen conditions
do (or do not) achieve that distinction.
A few answers did not talk about mis-matches at all, but instead different stabilities due to G-C vs A-T
content and looking for detailed melting profiles. That is missing the main point, but also those
distinctions are extremely small. The purpose of diagnostic tests is to get clean results that are
definitive. Hence, small quantitative differences are never attractive options.
(b) The final sequence would have two overlapping peaks of similar magnitude at a single position (for
a mutant heterozygote). If sequence quality is high this should stand out easily from the random
background peaks (but seeing normal samples in parallel is extremely helpful in being convinced of an
abnormality). Both alleles are of course expected to be amplified similarly by PCR.
Many responses simply refused to describe the requested data. It is important to get used to the idea
that data and interpretation are two distinct things that must be distinguished and kept separate. The
purpose of these questions is to make you think about the real experimental experience. You see data
and then try to make deductions strictly from the data.
(c) When two super-imposed sequences diverge at one point by a nucleotide deletion you will see a
clean sequence up to that point and then a variety of single or dual peaks that could actually be
ascribed as consistent with a precise change (when you know the normal sequence- note that is
different from deducing unknown sequences from two super-imposed sets of sequencing peaks).
(d) The sequence output above would be reasonably clear but is not in itself crystal clear, and it
certainly would not be good evidence that there is only a single base-pair deletion (other aberrations,
including larger deletions and insertions, would give results that look broadly similar). An
improvement could be made by sequencing the DNA from the other end. That would show that on either
side of the lost bp the sequences of all amplified templates are the same. An alternative approach
would be to clone the PCR products and sequence enough copies of each type of DNA molecule
present (in pure form) to be sure that both forms are represented.
Many answers opted for a solution by oligo hybridization (presumably because you had already
described such experiments in (i)). That is OK but it is not the best approach in terms of speed or
clarity of results. Some suggested doing this as a FISH experiment. I think that is impractical because
FISH requires long probes to generate a sufficient signal and discriminating hybridization by two
oligos is difficult to achieve under the other constraints of FISH (including amplification of a binding
signal due to a single molecule).
(e) There is an established technique called ARMS-PCR (or ASA, allele-specific amplification). The
key point is that DNA polymerase with no exonuclease activity requires perfect base-pairing of a
primer at its 3’ end. This is a very stringent requirement leading to black and white results (much
better than comparing hybridization of oligos with a perfect match versus a single mis-match). So, two
PCR reactions are attempted. Each has a perfectly good PCR primer downstream of the site in
question. The second PCR primer anneals precisely to the sequence that is being interrogated so that
its 3’ end hybridizes to the nucleotide that may be different. This second primer is made to include
either normal sequence or the hybridization partner of the precise change being tested. Hence, one
primer amplifies only normal DNA and the other amplifies only mutant DNA.
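The primer logic of ARMS-PCR can be sketched in a few lines. This is a toy illustration, not a primer-design tool: the sequence, the variant position and the 10 nt primer length are all made up for the example, and the amplifies function encodes only the idealised rule stated above (extension requires a matched 3' base).

```python
def allele_specific_primers(reference, variant_index, mutant_base, length=20):
    """Return (normal_primer, mutant_primer), each ending at the variant position.

    Primers are written 5'->3' and are identical to the given strand, so they
    anneal to the complementary strand with their 3' base on the variant site.
    """
    start = variant_index - length + 1
    if start < 0:
        raise ValueError("not enough upstream sequence for a primer of this length")
    upstream = reference[start:variant_index]
    return upstream + reference[variant_index], upstream + mutant_base


def amplifies(primer, template, variant_index):
    """Idealised ARMS rule: extension only if the primer's 3' base matches."""
    return primer[-1] == template[variant_index]


# Hypothetical sequence with a T->C change at index 21 (0-based)
normal_template = "GATTACAGGCTTACCGATAGCTTGACCA"
mutant_template = normal_template[:21] + "C" + normal_template[22:]

normal_primer, mutant_primer = allele_specific_primers(normal_template, 21, "C", length=10)

for name, primer in (("normal", normal_primer), ("mutant", mutant_primer)):
    print(name, primer,
          "normal DNA:", amplifies(primer, normal_template, 21),
          "mutant DNA:", amplifies(primer, mutant_template, 21))
```

A heterozygote would give product in both reactions, a normal homozygote only with the normal primer, and a mutant homozygote only with the mutant primer.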
While differential hybridization of primers could theoretically be used it provides far less
discrimination of mis-matches and, unfortunately once you generate some PCR products you will now
have perfectly matching templates for future rounds, so PCR would be much worse than simply testing
oligo hybridization if mis-matches were positioned anywhere other than the 3’ end.
I expect several students knew of the correct method or came across it in texts or discussions, but if
anyone actually figured the correct answer out without such prompts I would be very impressed (& you
missed out on a patent opportunity by a few years).
4. (i) One concern would be whether any bands are hidden or hard to see. The easiest place to hide in
plain sight is for a band to co-migrate with another. Gels have limited resolving power, so, for
example, two bands of approximately 3.0kb might well co-migrate even if they differ in length by up
to about 100bp. Such co-incidences are not so likely for this small molecule, but they are inevitable
when examining much larger cloned DNAs. You can tell if there is co-migration (or at least strongly
suspect it) by paying attention to band intensities. Ethidium bromide (and other dyes that may be used)
bind evenly along DNA, so the band intensity reflects the number of bp present, or mass of DNA. If
the three bands all come from the same molecule they should be equimolar and hence band intensities
should be proportional to size. Co-migration is revealed by bands of twice or more the expected
intensity (easy to judge by eye).
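The intensity check for co-migration can be sketched as follows. The band sizes and intensity values are invented for illustration, and the function assumes that at least one band is a single fragment and that staining is proportional to DNA mass.

```python
def estimate_stoichiometry(bands):
    """bands: list of (size_kb, intensity). Returns a list of (size_kb, copies).

    For equimolar fragments intensity is proportional to size, so
    intensity / size should be constant. A band with about twice that
    value probably contains two co-migrating fragments.
    """
    per_kb = [intensity / size for size, intensity in bands]
    unit = min(per_kb)  # assumption: the weakest band per kb is a single fragment
    return [(size, round(ratio / unit)) for (size, _), ratio in zip(bands, per_kb)]


# Hypothetical digest: the 3.0 kb band is about twice as bright as expected
example_bands = [(3.0, 610), (1.5, 150), (0.5, 52)]
print(estimate_stoichiometry(example_bands))
```

In this made-up example the 3.0 kb band is flagged as a likely doublet, so the true insert size would be the sum of the bands plus one extra 3.0 kb fragment.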
The second place where bands disappear is if they are very small. Such bands stain poorly and diffuse
more readily, so they are weaker and broader. That problem is worst for low percentage agarose gels
and for slowly run gels but could easily affect fragments of 300bp or smaller in typical gels. The best
way to address this possibility is to look at several other digests (and sometimes a variety of double
digests). You can sum the bands for each type of digest and look for consistency (no missed
fragments) or, failing that, take the largest sum (only that digest produced all bands large enough to notice).
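Comparing the summed band sizes across digests is easy to automate. The sketch below uses made-up band sizes and an arbitrary 0.1 kb tolerance; a low total only says that something was missed (a small fragment or a co-migrating pair), not which.

```python
def digest_totals(digests):
    """Sum the visible band sizes (kb) for each digest."""
    return {name: round(sum(sizes), 2) for name, sizes in digests.items()}


def suspicious_digests(digests, tolerance_kb=0.1):
    """Return digests whose total falls short of the largest total observed."""
    totals = digest_totals(digests)
    best = max(totals.values())
    return sorted(name for name, total in totals.items() if best - total > tolerance_kb)


# Hypothetical gel readings in kb
example_digests = {
    "EcoRI": [3.0, 1.5, 0.5],
    "HindIII": [2.6, 2.4],
    "BamHI": [3.2, 1.5],   # total is 0.3 kb short: a small band may have been missed
}

print(digest_totals(example_digests))
print(suspicious_digests(example_digests))
```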
Some answers suggested that different gels might better resolve co-migrating bands. That is true to
some degree but for differences less than 50bp or so agarose gels will have trouble (worse if the
fragments are large) and polyacrylamide is more work with only a small chance of benefit.
It was suggested that looking at uncut DNA or DNA cut just once (by chance) would be especially
helpful. Full-length linear DNA can be sized quite well but big fragments are inevitably measured less
accurately than smaller ones (& how do you know there is only one cut). Circular molecules migrate
anomalously (nicked & supercoiled migrate very differently from each other and linear DNA of the
same size), are not well separated from each other & hence are very poor choices for size
determination.
Some students thought very large fragments might not enter the gel. Protein associations might trap
DNA but generally pure DNA of any size enters a gel even if it is not fractionated well. Also, a
plasmid is necessarily relatively small, so no huge DNA fragments will be present.
(ii)(a) The fact that the second band is 2.8kb should alert you to the idea that this could represent the
sum of 1.0 and 1.8kb bands. That band could be present if it is not cut in the middle by EcoRI. That
could happen if the DNA were not fully digested (perhaps the DNA prep. was not as pure as it should be or
the batch of enzyme was getting old; this should not happen often, but does occur frequently enough to be
realistic). In that scenario one would expect other EcoRI sites to be incompletely cut and to see the
1.0kb region still associated with its neighbor on the other side (and perhaps even with two
neighboring regions where two sites were not cut). That is consistent with the reported results.
It is possible that multiple regions of the genome have considerable sequence identity to the 1.0kb
fragment and are hybridizing to the probe. If the sequence similarity is very high this could happen
even at high stringency but it is likely that 1.0kb would be the strongest band (so intensities might
favor or dis-allow this possibility).
A surprisingly large number of students took this and the rest of the question to relate to digestion of
plasmid DNA (the question is clear that genomic DNA is digested). Accordingly, most results were
rather hard to explain or the answers were, at least, different. The most salient advice is simply to read
the questions very carefully (and to attend recitation so there is less chance of such misunderstandings).
(b) The arguments here are similar except there are only two bands, as if the EcoRI site separating 1.0
and 1.8kb regions were the only site not being cut (some of the time). That could be because it is
absent in the DNA (perhaps a single nucleotide change, a SNP, within the EcoRI site- a restriction
fragment length polymorphism [RFLP] in the Southern). In fact, for the observed pattern the donor
mouse would have to be heterozygous for that specific SNP. Among individual humans and fruitflies
(so likely quite similar for mice) there is roughly one SNP per kb, so finding one in a specific 6bp
target (the EcoRI site in question) would certainly not be expected but is a reasonably frequent
occurrence.
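The chance of a SNP falling in one particular EcoRI site can be estimated roughly. The sketch assumes about one SNP per kb, treated as an independent 1-in-1,000 chance at each base; that is a simplification made for illustration, not a measured rate for mice.

```python
snp_rate_per_bp = 1 / 1000   # roughly one SNP per kb, as quoted above
site_length_bp = 6           # the EcoRI recognition site

# Probability that at least one of the 6 positions carries a SNP
p_site_altered = 1 - (1 - snp_rate_per_bp) ** site_length_bp
print(f"chance that a given EcoRI site is polymorphic: {p_site_altered:.2%}")
```

At well under 1% per site it would not be expected for any one site, but across the many sites examined in routine work it turns up reasonably often.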
As above, a second region of sequence similarity to the probe is possible (although the 2.8kb size coincidence and bands of equal intensity would certainly make the first explanation more likely).
(iii) (a) Since you see two HindIII bands you assume that the 1.0kb probe spans these two fragments
and hence that they are adjacent (here we are ignoring the possibilities in (i), as instructed). BamHI
does not appear to affect the 1.2kb fragment, implying that no BamHI sites lie within that fragment.
The 4.7kb fragment is reduced to 0.8kb, implying a BamHI site 0.8kb from the end of the 4.7kb
fragment that is adjacent to the 1.2kb fragment (if it were at the other end of the 4.7kb fragment the 1.0
kb probe would not hybridize to the 0.8kb fragment).
The above argument is the simplest but is not definitive without some further tests. Specifically, it
could be that the HindIII 1.2kb fragment is being cut by BamHI to give an 0.8kb fragment and the
4.7kb fragment is cut by BamHI to produce a 1.2kb fragment (the same size as the original HindIII
fragment just by co-incidence). It is fine for you to ignore this second possibility (the two could be
resolved by further digests, including BamHI only).
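The simplest map in (a) can be checked by simulating the digests. In the sketch below the coordinates are in kb along the two adjacent HindIII fragments (4.7kb then 1.2kb); the probe position (4.0 to 5.0) is a hypothetical placement chosen only so that it spans the central HindIII site.

```python
def detected_bands(cut_sites, probe, ndigits=1):
    """Return sorted sizes (kb) of fragments that overlap the probe interval."""
    sites = sorted(cut_sites)
    bands = []
    for left, right in zip(sites, sites[1:]):
        if left < probe[1] and right > probe[0]:   # fragment overlaps the probe
            bands.append(round(right - left, ndigits))
    return sorted(bands)


hindiii_sites = [0.0, 4.7, 5.9]     # 4.7kb and 1.2kb HindIII fragments
probe = (4.0, 5.0)                  # hypothetical position of the 1.0kb probe

bam_near_junction = 4.7 - 0.8       # BamHI site 0.8kb from the shared HindIII site
bam_far_end = 0.8                   # alternative: 0.8kb from the other end

hindiii_only = detected_bands(hindiii_sites, probe)
double_near = detected_bands(hindiii_sites + [bam_near_junction], probe)
double_far = detected_bands(hindiii_sites + [bam_far_end], probe)

print("HindIII:", hindiii_only)
print("HindIII + BamHI (site near junction):", double_near)
print("HindIII + BamHI (site at far end):", double_far)
```

Only the placement next to the 1.2kb fragment reproduces the observed 0.8kb and 1.2kb bands; the far-end placement would leave a 3.9kb band hybridizing instead.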
(b) Going with the first explanation, you know that the 1.0kb EcoRI fragment overlaps to some degree
with the HindIII 1.2kb fragment and the Bam-HindIII 0.8kb fragment. If the 1.0kb extended
significantly beyond the BamHI site (away from the 0.8kb fragment) you would see a third band
illuminated by the probe (the possibility of small unseen fragments and overlaps insufficient for a
strong signal means you cannot be certain that an EcoRI site is within the 0.8kb fragment, but that is
most likely). The most important undisclosed information is the intensity of the 1.2kb band relative to
the 0.8kb or 4.7kb bands (which should be similar to each other). The relative intensity will reveal
what proportion of the 1.0kb probe lies either side of the central HindIII site. That should allow
moderately good positioning, which, of course, could be refined by BamHI-EcoRI and HindIII-EcoRI
digests if that is important to ascertain.
In fact, the extent to which signal intensity gives more information depends on the way the probe is
prepared. If made, for example, by random priming you will in fact have many short labeled
single-stranded fragments (typically around 300nt) from the 1.0kb region. If two-thirds of the 1.0kb fragment
overlaps the 0.8kb BH fragment roughly two-thirds (actually more) of probe molecules can hybridize
to that fragment (compared to one third for the 1.2kb fragment). Hence, the relative intensities would
be roughly 2:1. If, however, the probe is made by PCR and contains only 1.0kb fragments, each
fragment can hybridize to 0.8kb or 1.2kb fragments and in doing so will bring with it the same
amount of radioactivity. Provided each overlap is reasonably long (about 300bp or more) the
efficiency of hybridization (rate of collisions leading to stable association) would also be very similar
so the final band intensities would be much the same. In other words, this type of probe will only give
you information if one of the overlaps between probe and target is very small (less than 300bp).
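The two probe types can be contrasted with a small model. This is a deliberately crude sketch: it assumes signal from a random-primed probe is simply proportional to overlap length, and that a full-length probe gives equal signal whenever both overlaps reach about 300bp. What happens below that threshold is not modelled, so the function returns None in that case.

```python
def expected_ratio(overlap_a_bp, overlap_b_bp, probe_type, min_overlap_bp=300):
    """Expected band-intensity ratio (a : b) for two targets sharing one probe."""
    if probe_type == "random_primed":
        # Many short labeled pieces: signal scales with the length covered
        return overlap_a_bp / overlap_b_bp
    if probe_type == "full_length":
        # Every probe molecule carries the same label whichever target it binds
        if min(overlap_a_bp, overlap_b_bp) >= min_overlap_bp:
            return 1.0
        return None  # a short overlap weakens one band; the amount is not modelled
    raise ValueError(f"unknown probe type: {probe_type}")


# Two-thirds of the 1.0kb probe on the 0.8kb fragment, one-third on the 1.2kb fragment
print(expected_ratio(667, 333, "random_primed"))
print(expected_ratio(667, 333, "full_length"))
print(expected_ratio(900, 100, "full_length"))
```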