Biotechnology Homework 2 Fall 2010 Answers Please read questions carefully. You always need to provide explanations in your answers. 1. (i) Since the vector is commonly used, oligonucleotides of about 20nt that are complementary to sequences either side of the Multiple Cloning Site section already exist. These can be used as sequencing primers, allowing you to read sequences from the vector into the insert from each end. You can just use the plasmid DNA as template, or you can cut it outside the area of interest if you wish (e.g. KpnI or EcoRI for one primer and XbaI or SacI for the other), but why take extra, unnecessary steps? Each sequence should extend 800 to 1,000 nt to good quality, so you are likely to cover the whole cDNA in this way. Even if the sequence is not perfectly clear towards the end of the runs you are not discovering DNA sequence de novo, but checking against a prototype, so minor ambiguities can be tolerated. If the sequence quality is not high enough you can synthesize one or more oligos that will prime from internal sites in the cDNA (based on the first sequence you determine: “primer walking”), allowing those regions of sequence to appear on shorter DNA fragments, and hence to be read more clearly. When sequencing de novo (when you have no idea of the DNA sequence beforehand) it is important to sequence any stretch of DNA several times, generally from each strand, in order that difficulties with poor gel resolution, secondary structure of synthesized DNA or polymerase pausing are minimized in at least one or more reads. (ii) The 5’ to 3’ mRNA direction is EcoRI to XbaI. Hence, this mRNA will hybridize to a probe with the opposite orientation of XbaI to EcoRI. The RNA probe should therefore be made with T3 RNA polymerase. It is standard practice to linearize the template so that the polymerase can terminate (here using EcoRI or KpnI (but certainly not SacI or XbaI) but this is not necessary as continued synthesis around the plasmid is not especially harmful (probe that hybridizes to vector DNA will also be generated, “wasting” some label but vector DNA or RNA is unlikely to be present in your RNA blot). You must, however, provide an appropriate buffer and rNTPs, at least one of which must be labeled (for example with 32P at the alpha position, or biotin-UTP, or digoxigenin-UTP). (iii) Most likely these represent different mRNA forms of gene A (containing some common exons and some different exons). The cDNA of 1.7kb contains the whole coding sequence but mRNAs generally have some sequence preceding the initiator codon (“5’ UTR”; tens to hundreds of nts), transcribed sequences beyond the terminator codon (“3’ UTR”; often a couple of hundred nts & sometimes more, especially if those sequences regulate the rate of translation initiation or localization of the mRNA within the cell), and a polyA tail (of, say, 30-200nt and heterogeneous). Hence, the 1.9kb band most likely corresponds to the cloned sequence in its coding capacity (but you cannot be certain from just these data & are not being asked for that distinction in the question). Given the qualifiers of strong bands and high stringency in the question, it is less likely, but not impossible, that one (or both) of the bands corresponds to mRNAs of a related but non-identical gene. (iv) The key difference between the two probes is that the DNA probe will contain both labeled strands. The 2.4kb RNA may therefore be transcribed from the opposite strand to that used for gene A. Note that it would be sufficient for a signal if only part of the 1.7kb region was in the 1 2.4kb RNA (especially if that RNA were very abundant, increasing the signal). You may be concerned that such a scenario is rare and therefore unlikely. True enough, and a good reason to pause, but overlapping genes in general are fairly common; sharing exons, in opposite orientations, as described here, is quite rare. In this case, a good test would be to make an RNA probe from the opposite strand to hybridize to the RNA blot, and you might do well to consider the above explanation only a hypothesis pending that result. To my mind, no better hypothesis is apparent. (v) (a) Simply cut out the EcoRI-XbaI cDNA fragment and ligate to vector cut with the same two enzymes. (b) After the digests mentioned above purify each desired fragment from an agarose gel. Ligate in roughly equimolar quantities and at reasonably low concentration (perhaps 20-50ng of vector in 10l). Take some of the ligation mixture (perhaps 2 l) and transform competent E.coli. Plate on ampicillin plates. Pick three or four colonies, grow, isolate plasmid DNA and digest with, for example, EcoRI and XbaI to reveal (hopefully) 3.0kb and 1.7kb of appropriate intensities in at least one case. (c) Purifying the cDNA fragment is worthwhile. Otherwise many products of transformation will yield the original plasmid (from re-ligation to the cDNA insert). The argument for the recipient vector is similar but here the tiny fragment cut out may not be a problem if it is too short to remain double-stranded. In fact, however, an equally big (or bigger) danger is that there is a tiny amount of uncut vector or a small amount of vector cut with only one enzyme (these need not be visible). Even a contamination of less than 1% of uncut vector is serious because it is all ready to transform whereas your desired products are made with efficiencies (often) no better than 1%. Likewise, self-ligation of vector cut with only one enzyme is fairly efficient. Gel purification will generally separate linear vector (one or two cuts) from circular (uncut) and the tiny DNA fragment. Both gel purifications will modestly increase the number of correct colonies but will greatly decrease the background of “incorrect” colonies, making them worthwhile. However, by simply screening through more colonies you can skip one or both purifications and still succeed (provided your enzyme cuts were very efficient). Generally, purifying is much better and standard practice (slower but steady progress). (vi) (a) Now you would not worry so much about the kanamycin-resistant vector producing colonies in the transformation (on amp plates). Hence, you could more reasonably omit the purification of the cDNA fragment and only reduce your yield of correct products a little. Even in this case, purifying the cDNA fragment is still likely worthwhile. (b) Here you would have great trouble purifying the cDNA insert on an agarose gel if cutting with just EcoRI and XbaI. Best would be too look for an enzyme that cuts the vector 3kb fragment but not the cDNA insert and adding that to the EcoRI + XbaI digest. Now the cDNA 3kb fragment can be purified well from a gel. (c) Now a complete EcoRI + XbaI digest will not produce the intact cDNA. The simplest solution here is a partial EcoRI digest (+ complete XbaI digest). Even though less than a quarter 2 of the molecules may be cut as you desire you will easily be able to purify the required 3kb fragment on a gel. Alternatively, you could cut out a KpnI-XbaI cDNA fragment and then prepare a recipient vector fragment by doing a partial KpnI + full XbaI digest, purifying the required fragment by size. 2 (i) The tag sets an open reading frame where GAA and TTC are codons but the cDNA must have ATT from the EcoRI site as a codon to make the correct translation product. In other words a simple EcoRI ligation will not allow translation of the cDNA in the correct reading frame. Most students had difficulty putting this answer into words. Many had a basic mis-conception. Translation in eukaryotes generally begins with the first AUG in a mRNA, although nearby sequences affecting recognition by the ribosome can favor use of a downstream AUG in rare cases. So, in this question translation will initiate with the MET codon of the Flag tag. It will then continue in that reading frame. Subsequent, downstream AUG codons are of no special significance to the ribosome (they just encode Met). Here the AUG sequence of the cDNA is given because it is a clears landmark showing what is the reading frame used for the normal cDNA product. If this is not read downstream of Flag-initiated translation as a Met it means the correct translation product will not be made but it does not affect where translation is initiated (it is still initiated at the AUG of the Flag tag). (ii) (a) In the cDNA we have to use the EcoRI site (could use KpnI but not enough sequence information is available from the question). From the Flag-tag vector we could use either BamHI or EcoRI sites. When using an adapter it is best to allow as few possible types of ligation as possible, so BamHI would be the better choice. Essentially we need to ligate an adapter to the EcoRI site of the cDNA to (i) convert it to a BamHI-compatible overhang and (ii) restore the correct open reading frame. One choice would be the sequences of oligos that hybridize to form the structure in the middle below (grouped to emphasize the reading frame). Note that writing out both strands is critical. ……AAA G …… TTT CCT AG GA TCX XXX XX Y YYY YYT TAA A ATT CGC ATG……… GCG TAC………. X and Y can be any base-paired nucleotides and the adapter could include 3, 6 or 9 bp more. An alternative choice would be to fill in the EcoRI cut end of the cDNA and ligate linkers containing BamHI sites (again taking note of the reading frame). (b) Using the adapter strategy: Cut cDNA plasmid with EcoRI, ligate to adapter, then cut with XbaI (if you cut with XbaI first you will ligate some XbaI ends, which you do not want). Purify the 3kb fragment, thereby removing oligos as well as vector DNA. If you used a non-phosphorylated adapter you will only have one adapter ligate to the EcoRI site & no more complicated arrangements. Ligate purified cDNA to gel-purified vector BamHI-XbaI fragment. 3 Transform E. coli, plate on amp, screen colonies for plasmids that are cut by BamHI plus XbaI into two fragments of expected size. For correct-looking clones sequence, especially across the critical junction at the N-terminus of the coding region for the fusion protein. No student answer addressed each of the issues discussed above. In fact, most answers did not explain what was to be done and most had major mis-understandings, raising many issues. Answering the question: “Describe how you would…” means describe actions in a lab ( mixing reagents, purifying fragments, transforming bacteria etc.), not invoke magic (“add an adapter into the vector” without explaining how, or “add the adapter to this end”- you mix DNAs and ligase in a tube & realize that the reaction you envisage is not the only one taking place). It also means that you have to think about both ends of a fragment or a ligation and describe that, and that you have to say if you are transforming the product of ligation in some way (“clone” is OK in many contexts where detailed step by step description is not requested; however, it is always important to think about how you test whether you have recovered the correct product). “make clear the exact sequences of X and Y” means you should do just that; otherwise you will not have explained the procedure or been able to check you create the right product. Language: Fragments with short complementary overhangs (like those generated by restriction enzymes) need to be joined by ligation (as discussed extensively in the last homework). These sticky ends do not anneal stably, they do not hybridize stably and there is no hybridization step before adding ligase. You should therefore talk about ligating such fragments together, and not about annealing or hybridizing & certainly not saying that you then add ligase just to seal nicks. When you have longer complementary regions, as in GC tailing or LIC or SLIC, you should call that hybridization or annealing because a stable junction forms without ligase. Alternative but poorer strategies: A. Using an EcoRI to EcoRI adapter. This can work but you have less control. The adapter can ligate in either orientation. That may not matter. Fragment EcoRI ends that did not ligate to adapter can ligate to vector and give you colonies you don’t want (that you probably only find out are wrong by DNA sequencing). If you convert the ends to BamHI with the adapter unligated ends do not participate in further ligations to vector. B. Repairing the vector (or insert) first (or later) with respect to reading frame by either using an adapter insertion or by Quik-site (or similar) mutagenesis. The adapter insertion chosen here was generally using the same sites at each end & so has the drawbacks mentioned above. More important, there is no need to have two separate cloning operations to achieve the desired goal. Purifications: The idea is to focus on what is important (not on more trivial issues like heat inactivating enzymes, phenol extraction etc.) & in the context of these questions to think about something new with each question (not repetition of what you were asked in an earlier question). Here the new element is clearly use of an adapter. You were also given an explicit hint about 5’ end phosphorylation. Aside from standard gel purification issues (tested in Q1) the important principle here are (i) preventing multiple adapter ligations and (ii) purifying fragment only after ligation to adapter to do two jobs in one, plus (iii) protecting sites from unwanted ligation by cutting them only after ligations involving a different end. 4 It is also important that one fragment is first treated with adapter & purified before adding the vector fragment for ligation- otherwise adapter will ligate to the vector end as well. (iii) (a) It is again easiest to change the EcoRI region of the cDNA to produce a BamHI site by designing an appropriate PCR primer. The second PCR primer should be downstream of the XbaI site, allowing cleavage with XbaI at the same site as used previously. The upstream primer could have many designs. It needs about 20-25nt matching the cDNA (complementary to the other strand) and then should have at its 5’ end a BamHI site in the correct reading frame plus a few extra nucleotides to allow BamHI to cut. One possibility is 5’ XXX XXX GGA TCC CGC ATG nnn nnn nnn nnn nnn nnn 3’ where X can be anything but n must match cDNA sequence right after the ATG (not given in the Q) Very few student answers were close to good, yet this question addresses the most common uses of PCR and hence should be answered extremely well by a molecular biologist. The main issues were (A) Forgetting what I emphasized: always describe template & two primers for any PCR reaction (so you have to deal with the 3’ end of the cDNA here too; a polyA signal is not itself a run of A residues but a binding site for a polyA polymerase complex). You do not have to linearize a template prior to PCR. (B) A primer must match the template extensively in its 3’ region (aim for about 20 matches at least) in order to prime but it can have any sequence (even a long one) that you wish at its 5’ end. This affords tremendous flexibility. (C) If you use the above flexibility to add a restriction enzyme site you still have to cut the PCR product with that enzyme after PCR to generate a sticky end, and you have to state this as an explicit step. (D) PCR is efficient and offers a better solution to almost every molecular biology problem. If you simply use the strategy you derived for (ii) and add a PCR step that achieves little you have not taken advantage of PCR. Your strategy with PCR should be different and will almost certainly be better. (b) After PCR, cut with BamHI and XbaI Good to purify the resulting fragment on a gel but also possible to use a column that separates large DNA from oligos and low molecular weight materials. Ligate to the same BamHI-XbaI vector fragment as described above & transform, screen & sequence in the same way, except here you must be sure to sequence the entire cDNA to check for any PCR errors. 3. (i) The slow but steady way is to first clone one fragment (say KpnI to EcoRI), just as in Q1. Then, with pure cloned DNA repeat for the next fragment. This will work provided none of the fragments introduce a site for an enzyme to be used in a later step. You can probably organize the sequence of steps so that this does not happen (because the fragments are short and only one or two are likely to have a complicating restriction site). If not, and in any case for greater speed, it is possible also to ligate two successive inserts into a vector in one reaction and emerge with a 5 reasonable yield, so that cloning is effective. For example, KpnI-EcoRI and EcoRI-BamHI fragments can be ligated together with KpnI-BamHI cut vector. This can even work with three insert fragments provided all have different ends. It is critical to clone at the end of each step. Purifying products from gels will not yield enough material to proceed to the next step. Many student answers involved successive ligations, sometimes with intervening gel purifications but no cloning step in between. It is possible that this strategy can be helpful and could work for a simpler problem. It is important to realize, though, that many unwanted products are produced at each ligation, continually reducing the amount of your correct product available for the next step- I think those other ligation events were either ignored or underestimated. Several answers discussed using adapters or linkers. I don’t understand why because the question already presents you with fragments containing ideal end-points. Several answers suggested using SLIC. While SLIC is undoubtedly useful and it is good to read and think widely, it was not a good direction for this question. That is largely because you would have to describe a lot (and do a lot to convert this to a SLIC process). Basically you would have to add sequences by PCR to each fragment that overlapped the sequence at the end of the neighboring fragment. If you don’t describe that (at least in outline) you are not explaining how to use SLIC, which absolutely requires reasonably long stretches of identical sequence at positions where junctions are to be made at the ends of fragments. Here that would be more work than in the non-SLIC answer. (ii) (a) RNA is harder to purify and keep intact. Also, the particular cell type may not express much or any of the RNA in question. Furthermore, mutations that introduce premature stop codons often lead to instability of the corresponding mRNA (“nonsense-mediated decay”). Hence, DNA is by far the better choice. Although mRNA will allow production of a cDNA containing the entire relevant sequence in one molecule, it is simple to PCR amplify each exon separately and sequence. Most students picking RT-PCR preferred it for economy. It is very much a false economy. You don’t need to amplify introns from genomic DNA- you choose what to amplify. Setting up multiple exon amplifications is not a problem (parallel reactions are well suited not just to robotics but also to human capabilities and efficiency. Amplifying sections of DNA rather than one long cDNA can actually work better as cloning large cDNAs can be hard. The reasons cited in the first paragraph make this a clear choice, not a marginal one. (b) The quality of DNA sequencing is good enough that a mixed sequence (roughly 50% A and, say, 50% G) at any one position is readily seen and distinguished from background, especially if a normal sample is run as a control. If the DNAs are cloned any one clone that is sequenced may have either A or G. It is a nuisance to have to sequence several clones for it to be very likely that each possible sequence has been represented. So, direct sequencing of the PCR product is better. Many student answers revealed a major mis-understanding of a “heterozygous mutation”. It means that one copy of a gene has a normal sequence and one has an aberrant sequence. In each case there are normal base-paired duplexes. There are no mismatches. One allele might have a GC bp and the other an AT bp (not A-C or anything else). Mismatches would not be tolerated for long or replicated as mismatches in a human cell, so it is bizarre to think that cells from an 6 individual would have mismatches. Furthermore, some answers were concerned that such mismatches (which do not exist) would be repaired by bacteria (to the “correct” sequence). Obviously there is nothing telling the bacteria whether GC or AT bps are correct. Also, that line of answering assumes that the mismatches would be preserved through the PCR process. The last round of DNA synthesis will copy one template strand and produce perfect base-pairing. More legitimate concerns were that two peaks in a sequence at one position might not be clear if the general sequence was noisy. However, this concern was generally not expressed clearly (usually just a vague reference to noise) and it is generally unfounded because standard DNA sequence quality is very high (fifteen years ago it may have been a concern). (iii) An obvious first step would be to make the ends blunt. Adding T4 DNA polymerase plus all four dNTPs would be good for this because 5’ overhangs could be filled in (on the opposite strand) and 3’ ends degraded. Blunt ends are not very good for ligation. They could be converted to convenient sticky ends in a number of waysLigate an adapter that is blunt at one end and has a convenient sticky end for ligating to the vector. Ligate linkers- but here you will need to cut with a restriction enzyme. So, either the chosen enzyme should be non-existent or extremely rare in genomic DNA (no perfect solution) or you can use an enzyme, like EcoRI for which a methylase is available & methylate the genomic DNA before ligating linkers. Then only the linker EcoRI sites will subsequently be cut. Tailing with terminal transferase is possible (& annealing with complementary tails added to the vector) but having long CG or AT runs either side of an insert is in fact a nuisance for many purposes. Grading on this question was very generous (partly because there are in fact many possibilities). However, many answers presumably derive from a text or similar source and were likely recited without much thought. For example, why use Klenow and T4 DNA polymerase? No justification was given & I think there is little. Why omit dNTPs for T4 DNA pol? If there is a recessed 3’ end it will be extended. If there is a 3’ overhang it will be degraded. Both will take place in the presence of dNTPs. No answers acknowledged the problem of internal sites when using linkers (above). Several answers used terminal transferase without first making ends blunt. That is OK for 3’ overhangs but will not work for half of the ends where 3’ ends are recessed. There was in some cases confusion over “TA cloning”. That refers to cloning of Taq polymerase products by ligation to a vector with a TA overhang. It is not used to mean annealing of long A and T tails to each other. Some answers suggested tailing with only a single A residue (using ddATP). It seems to me that the A residue could not ligate (perhaps OK because the other strand could join) and that you can only engineer this if you first create blunt ends. 7