Biotechnology Homework 1 Fall 2010 Answers

1. DNA Hybridization in solution.

(i) The formulae are designed for easy application to common procedures. DNA hybridization of oligonucleotides is most frequently used to provide a primer for DNA polymerase, which operates at salt concentrations around 50 mM.

(ii) If the salt concentration is increased 10-fold, the Tm will increase by roughly 16.6C, according to the quoted formula. When using an oligonucleotide to hybridize to DNA in a Southern blot or colony blot, it is common to use salt concentrations around 0.5 M so that hybridization and washing can be conducted at temperatures around 40-60C.

(iii) If you have a larger amount of each hybridizing component you will generate more product, but that is especially true if you keep the volume small. In other words, DNA concentration (of both potential partners) is a key parameter. Melting of DNA is rapid, but hybridization requires DNA molecules to meet in roughly the right register and orientation. It can be extremely slow, so in most cases a hybridization reaction does not reach equilibrium. Simply by waiting longer you can get more product. The more concentrated the two hybridizing DNAs, the faster the reaction will be. Thus, if you can have extremely high concentrations there is no need to wait a long time. This is generally true for DNA sequencing, where adequate amounts of template and a large molar excess of primer are used. For Southern blots, colony blots and several other applications (FISH), one partner is often present at very low (fixed) levels. Increasing probe concentration speeds the reaction, but there is a limit because the probe must also be made at high specific activity. Hence, it is often necessary to hybridize over several hours to increase the amount of hybridized product sufficiently. The temperature of hybridization also has an important bearing; somewhere around 15C below Tm is often optimal.
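The two rules of thumb quoted so far, a shift of roughly 16.6C in Tm per 10-fold change in salt (from (ii)) and hybridizing about 15C below Tm, can be sketched numerically. This is only an illustration: the function names and the example Tm of 60C are assumptions, and only the salt term of the quoted formula is modelled.

```python
import math

# Assumption: only the salt-dependent term of the quoted formula is modelled,
# i.e. Tm changes by 16.6 C for each 10-fold change in salt concentration.
SALT_COEFFICIENT_C = 16.6


def tm_shift_for_salt_change(salt_from_molar, salt_to_molar):
    """Change in Tm (C) when salt goes from salt_from_molar to salt_to_molar."""
    return SALT_COEFFICIENT_C * math.log10(salt_to_molar / salt_from_molar)


def suggested_hybridization_temp(tm_c, offset_c=15.0):
    """Rule-of-thumb hybridization temperature: about 15 C below Tm."""
    return tm_c - offset_c


if __name__ == "__main__":
    # Moving from primer-extension salt (50 mM) to blot salt (0.5 M).
    shift = tm_shift_for_salt_change(0.05, 0.5)
    print(f"Tm shift for 50 mM -> 0.5 M salt: {shift:.1f} C")

    # Hypothetical oligo with a Tm of 60 C at 0.5 M salt.
    print(f"Suggested hybridization temperature: {suggested_hybridization_temp(60.0):.1f} C")
```

A negative result from `tm_shift_for_salt_change` simply means the Tm falls, as it would on dilution into a low-salt medium.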
The general idea is that DNA molecules should not be unavailable for too long because they are trapped in mismatched or partial hybrids, but temperatures very close to Tm will allow a substantial number of perfect matches to be reversed over time. Again, it is the speed of association that is being affected.

(iv) Are those TWO factors easy to change and effective if the application is (a) DNA sequencing or (b) performing a Southern blot? [2]

(v) An initial thought might be that the Tm could be reduced by as much as 20C. If the oligo were extended by about 7-8 nucleotides (of mixed A/T and G/C composition), that might raise the Tm of the desired match back to its original level. However, it is also important that the oligo does not hybridize to incorrect sites. This becomes more likely the longer the oligo (under the same conditions). In fact, there are likely to be quite a few off-target sequences that differ from your oligo at 6, 7 or 8 positions, which is not so different from the target (4 mismatches). Hence, you would be well advised to increase the length of your oligo by considerably more than the initial proposal, which considered only the target and not the background. By doing that you can raise the stringency of hybridization and reduce the proportion of oligo that hybridizes off-target, despite the increased length necessitated by the mismatches.

(vi) There are at least two significant issues here. First, as discussed earlier, kinetics, rather than the Tm equilibrium alone, is important for the practical outcome of annealing. In most ligation reactions you conveniently or necessarily use fairly low concentrations of DNA. Thus, whenever two DNAs do transiently anneal, it is very important that the linkage is maintained at high frequency. That would be achieved if the hybrid is very stable or if ligase makes a covalent linkage, but not in the absence of both. Second, consider the step after a typical ligation.
Most commonly a small amount of the ligation reaction is diluted substantially (usually 20-100 fold) when it is added to competent bacteria. The dilution will favor dissociation, so you may rapidly lose your annealed product. Furthermore, the transformation medium is necessarily low in salt, again greatly favoring dissociation.

(vii) Given the answer above, the increase in hybrid stability must be reasonably large to preserve a good proportion of hybrids up to the point of entering bacteria. The length of single-strand overhang required is not easy to anticipate (and was not requested here), but experimentally an overhang of about 12 nt has been found to be sufficient (though obviously the longer the better). Even harder to anticipate is what might happen inside bacteria. In theory the salt concentration is no lower than in the transformation medium and there are ligase molecules ready to seal single-stranded gaps. On the other hand, you are now dealing with one molecule in each cell, and if there is dissociation there would surely not be any re-annealing. It turns out experimentally that such hybrids are preserved in good proportion and can yield stable plasmid populations with good frequency.

(viii) A good method is to use a 3’ to 5’ exonuclease activity. In fact, T4 DNA polymerase is particularly good because it has high exonuclease activity, but this will only be revealed if polymerase activity is reduced by a lack of dNTPs. To generate a specific single-stranded overhang you can design the region to be chewed away to contain no nucleotides of a certain type and then add only that specific dNTP. For example, if one strand ended in the sequence 5’ TGGACCTGCTTTCCGGTC 3’ (its complement would be present but would not be digested at this end because the exonuclease proceeds from 3’ to 5’), you could add T4 DNA pol and just dATP to reduce the terminus on that strand to 5’ TGGA 3’.
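The trimming in this example can be sketched as a small simulation. It is a deliberately simplified model, assuming the exonuclease removes nucleotides from the 3’ end until it reaches a base matching the single supplied dNTP, at which point polymerase activity balances removal. The function names are illustrative only.

```python
# Simplified model (an assumption): trimming proceeds from the 3' end and
# stalls at the first base that matches the one dNTP supplied.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def chew_back_3prime(strand_5to3, supplied_base):
    """Return the strand (5'->3') left after trimming back to supplied_base."""
    strand = strand_5to3.upper()
    end = len(strand)
    while end > 0 and strand[end - 1] != supplied_base.upper():
        end -= 1
    return strand[:end]


def exposed_overhang(strand_5to3, supplied_base):
    """Single-stranded region (5'->3') exposed on the complementary strand."""
    trimmed = chew_back_3prime(strand_5to3, supplied_base)
    removed = strand_5to3.upper()[len(trimmed):]
    # Reverse complement of the removed bases.
    return removed.translate(COMPLEMENT)[::-1]


if __name__ == "__main__":
    end_sequence = "TGGACCTGCTTTCCGGTC"
    print(chew_back_3prime(end_sequence, "A"))   # strand left after trimming with dATP only
    print(exposed_overhang(end_sequence, "A"))   # overhang on the complementary strand
```

In this model the trimmed strand is 5’ TGGA 3’, and the complementary strand is left with a 14 nt single-stranded overhang, assuming the complement is intact at this end.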
A second DNA molecule could be made in similar fashion (adding specified sequence by PCR and trimming back with T4 DNA pol to a specific site) such that the two single-stranded ends generated are complementary. This procedure, followed by the annealing discussed in (vii), is in fact the basis of a method dubbed ligase-independent cloning (LIC).

2. Hybridization to immobilized DNA

It is often crucial to consider the sensitivity of a detection method, and instructive to think about how many molecules you are trying to detect. Imagine performing a genomic Southern blot using radioactively labeled probes (which used to be common).

(i) You would probably use PCR or random priming with one or more alpha-32P-labeled dNTPs.

(ii) RNA hybridizes to DNA with very similar kinetics, specificity and equilibria to those observed for DNA-DNA. You might worry about RNA probe degradation. That could have a small impact, but if a starting 1 kb RNA were split into five or six fragments they would still hybridize with similar efficiency. Riboprobes are (were) indeed used commonly for this type of application.

(iii) You would use a purified phage RNA polymerase (T3, T7 or SP6) and you would have to join the DNA of interest to a binding site for the relevant polymerase in the appropriate orientation (that could be accomplished by PCR or by cloning in a plasmid). You would need rNTPs, including one that is labeled. You would not need a primer.

(iv) The strength of signal depends on the number of radioactive molecules hybridized (if of uniform 32P content), which in turn will depend roughly linearly on the number of target molecules on the filter. If one genome equivalent is roughly 250 times the mass of the other (yeast: about 12,000 kb; human: about 3,000,000 kb), an equal mass of DNA will contain about 250 times fewer molecules containing any given single-copy sequence of human DNA compared to yeast DNA. Hence the signal for the human blot will be about 250-fold weaker.
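The ratio in (iv) follows directly from the genome sizes: for an equal mass of DNA, the number of copies of a single-copy sequence is inversely proportional to genome size. A minimal sketch, using the approximate genome sizes quoted above (the function name is illustrative):

```python
# Approximate genome sizes quoted in the answer to (iv), in kb.
YEAST_GENOME_KB = 12_000
HUMAN_GENOME_KB = 3_000_000


def fold_fewer_targets(small_genome_kb, large_genome_kb):
    """Fold reduction in single-copy target molecules per unit mass of DNA
    when moving from the smaller genome to the larger one."""
    return large_genome_kb / small_genome_kb


if __name__ == "__main__":
    fold = fold_fewer_targets(YEAST_GENOME_KB, HUMAN_GENOME_KB)
    # Signal is assumed to scale roughly linearly with the number of targets.
    print(f"Human blot signal is about {fold:.0f}-fold weaker than yeast")
```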
(v) If we approximate the Mr of one bp as 700 and the human genome as containing 3 x 10^9 bp, one mole of the human genome has a mass of 2.1 x 10^12 g and, like all moles, contains about 6 x 10^23 molecules. Hence, 1 x 10^-7 g contains 6 x 10^23 x (1 x 10^-7 / 2.1 x 10^12) molecules, or about 2.9 x 10^4 (roughly 30,000) molecules.

(vi) In FISH you detect the presence of a single molecule of DNA. The most critical difference between FISH and a Southern for sensitivity is that the FISH signal is spread over a tiny area, viewed with a high-powered microscope, whereas a band on a Southern blot may have an area of 1 mm x 5 mm. Hence, the same total signal may be several hundred fold more intense at any one spot in FISH.

3. DNA sequencing

(i) The question contains several irrelevant details about the process of the experiment in order to be realistic and to present the problem empirically. Sometimes you do not know if the assay is deceptive or if something unanticipated was happening in your primary procedure. My best guess as to the cause is that genomic DNA will be self-priming. Even in a good preparation of genomic DNA there will be some shearing of DNA molecules. You may have DNA molecules of a variety of sizes, perhaps mostly 300 kb or more, with randomly positioned ends, but you will not have intact whole chromosomes. When you denature and renature the DNA (ostensibly to allow the oligonucleotide to hybridize), the original duplexes will not necessarily re-form. In most cases two DNA strands which are largely complementary and hybridize together will not originate from the same molecule. They will therefore have different end-points and the duplex formed will have (reasonably long) single-stranded overhangs; sometimes the 3’ end of one strand will be recessed and will serve as a primer for synthesis of DNA, copying the single-stranded region of the complementary strand. This will produce a lot of self-priming.
Even without the denaturation and annealing steps, shearing will produce some (but far fewer) molecules with single-stranded overhangs, likely to produce a little background self-priming.

(ii) If you tried to perform standard Sanger dideoxy cycle sequencing with a good primer (as in (i)) but using total genomic DNA as template, you might very well have a problem with both signal and noise.

(a) The problem in (i) was self-priming. If, instead of using labeled ddNTPs, you used labeled primer, you would only see DNA synthesis primed by the specific oligonucleotide. Remember, labeled primer could be used by having four separate reactions with ddATP, ddTTP, ddCTP and ddGTP and loading four parallel gel lanes, or by using four separate reactions, each with a different fluorophore labeling the primer, and running all four samples together in one (capillary) gel lane.

(b) Several reasonable answers were anticipated. You could make each product strand include more reporter by using labeled dNTPs (and going back to the four-lane solution mentioned above). You could increase the number of cycles used for sequencing. Technically you could also use more genomic DNA.

(iii) Yes, this should work in principle. The RNA products would not be as stable as DNA, and any hydrolysis would produce background (labeled fragments of random size) as well as reducing correct signals. Also, RNA polymerases are less accurate than DNA polymerase. I suspect they are also less processive but I do not have numerical evidence. By now, of course, cycle sequencing is routine (although not essential) and there are no good heat-stable RNA polymerases (plus RNAs would be degraded too rapidly at high temperature), so that convenient feature of DNA sequencing would have to be discarded.

(iv) I believe it would be possible in principle to use an RNA template, hybridize a primer (which can most conveniently be DNA), extend with reverse transcriptase and use ddNTPs as terminators.
I am not certain whether reverse transcriptase would recognize dideoxynucleotides well enough (the same reservation applies to RNA polymerase and rNTPs lacking a 3’-OH in the question above), but reverse transcriptase definitely scores very poorly in terms of processivity and accuracy, so you would, at best, retrieve very poor sequence. It is also, of course, hard to obtain pure defined RNAs other than by transcribing pure DNA, so you may as well sequence the DNA directly.

(v) If you used a biotin-labeled primer, after the sequencing reactions you might plan on cutting the duplex of template plus newly synthesized DNA and then removing all molecules containing primer with streptavidin-linked beads, leaving only labeled DNAs starting at the exact same cleavage site in each case. That array of labeled molecules should produce perfectly good DNA sequence. However, there are some limitations. First, the streptavidin cleanup must be very efficient; fortunately it can be very good. Second, one cycle of sequencing will yield product still associated with template, but it would not be effective to include many cycles; the single-stranded products could theoretically be reannealed at the end with excess template but I doubt that would be very efficient. Hence, the signal obtained may be too low to be useful (of course, the effort involved would make the whole process of dubious value given the small gains, even if the sequence obtained was clear). Finally, once the gel separation limitation is overcome one would eventually run into problems of polymerase processivity; there would be more random stops relative to signal the longer the synthesized DNA, so there is very little chance of extending this type of process over much longer distances.

4. Synthetic oligonucleotides & applications

(i) If oligonucleotides are made with overlapping complementary sequences they can be annealed together.
Since each overlap has a different sequence, this kind of assembly can occur simultaneously for many oligonucleotides without interference (sufficient to generate 5 kb in one step, I believe). The design could be such that overlapping hybridization reconstitutes the entire double-stranded DNA, leaving only nicks, which can be sealed with DNA ligase. Alternatively, and more economically on oligos, one can reconstitute double-stranded DNA with sizeable single-stranded gaps, which can be filled in with DNA polymerase plus ligase. It is key that the overlaps are long enough for specific hybridization. I would guess that overlaps as short as 12-20 nt might work but that it is wise to make the overlaps much longer (30-50 nt) to ensure robust (efficient and specific) assembly of several oligos.

(ii) Certain physical methods (gels, sizing columns) could be used, but the key point here is that you will also almost certainly need to amplify the correct DNA molecules. That means you will have to either use PCR or clone the products.

(iii) How might 5 kb units of pure double-stranded DNA fragments be put together in the next step, into, say, 20-25 kb units? Remember that the desired DNA sequence must be made precisely: no changes in DNA sequence to make convenient restriction sites etc. [1]

A DNA polymerase generally has the choice of extending a primer annealed to a template or hydrolyzing the last primer nucleotide using 3’ to 5’ exonuclease activity. Exonuclease activity is much higher for a mismatched 3’ nucleotide and polymerase activity is much higher if base-pairing is correct. The competition between polymerase and exonuclease is also influenced by dNTP concentration and is very different for different polymerases. In general, a polymerase that is fast and processive in DNA synthesis has low 3’ to 5’ exonuclease activity. Such enzymes are good for DNA sequencing; they include Taq, which is widely used for cycle DNA sequencing.
(iv) For DNA sequencing it is crucial that the vast majority of labeled products derive from a single priming site with a fixed (identical) 5’ end. Impurities that are of correct sequence but 1, 2, 3, 4, 5, 6, 7 or even 8 nucleotides shorter at the 5’ end will likely prime. Each will lead to a band of inappropriate size with terminators and produce background that disguises the correct sequence (note that the problem would be even worse without capping, where specific incorrect bands would be produced by all incorrect oligos just 1 nt shorter than the correct oligo). For PCR the exact size and sequence at the ends may not matter for some applications. Whether the background sequencing bands (amounting to perhaps 10% of correct bands) actually spoil the sequence is not obvious, but there will be some loss of quality.

(v) If the 3’ end is mismatched there will be very little priming because the polymerase has low 3’ to 5’ exonuclease activity, so there will be no signal. A mismatch in the middle will not prevent specific hybridization and the length of products will not be affected, so the sequence will be normal. If a nucleotide was omitted (anywhere), hybridization will be OK and all products will be one nucleotide shorter, but that is of no consequence; the sequence will still read fine. If a nucleotide (other than the last 3’ nucleotide) is missing in 50% of primers, two sequences offset by one will be produced and superimposed. That superposition cannot be converted to correct sequence in practice because high-quality peaks are required to prove that a DNA sequence is correct.