1 The Techniques of Molecular Biology

1.1 Basic Techniques Used to Identify, Amplify, and Clone Genes

The haploid genome of a mammal contains about 3 × 10^9 nucleotide pairs. If the combined exons of the average gene are 3,000 nucleotide pairs long (many are larger), the coding region of the gene will represent one of a million such sequences in the genome. Although most of the DNA in mammalian genomes does not consist of genes, still, isolating any one gene is like searching for the proverbial needle in a haystack. Most techniques used in the analysis of genes and other DNA sequences require that the sequence be available in significant quantities in pure or essentially pure form. How can one identify the segment of a DNA molecule that carries a single gene and isolate enough of this sequence in pure form to permit molecular analyses of its structure and function?

The development of recombinant DNA and gene-cloning technologies has provided molecular geneticists with methods by which genes or other segments of large chromosomes can be isolated, replicated, and studied by nucleic acid sequencing techniques, electron microscopy, and other analytical techniques. Indeed, genes or other DNA sequences can be amplified by two distinct approaches: one with amplification of the sequence occurring in vivo and the other in vitro. The second approach can only be used when short nucleotide sequences on either side of the DNA sequence of interest are known.

In the first approach, a minichromosome carrying the gene of interest is produced in the test tube and is then introduced into an appropriate host cell. This gene-cloning procedure involves two essential steps: (1) the incorporation of the gene of interest into a small self-replicating chromosome (in vitro), and (2) the amplification of the recombinant minichromosome by its replication in an appropriate host cell (in vivo).
Step 1 involves the joining of two or more different DNA molecules in vitro to produce recombinant DNA molecules, for example, a human gene inserted into an E. coli plasmid or other self-replicating minichromosome. Step 2 is really the gene-cloning event in which the recombinant DNA molecule is replicated or “cloned” to produce many identical copies for subsequent biochemical analysis. In step 2, the recombinant minichromosome is introduced into E. coli cells where it replicates to produce many copies of the recombinant DNA molecule. Although the entire procedure is often referred to as the recombinant DNA or gene-cloning technique, these terms actually refer to two separate steps in the process.

In the second approach, short DNA strands that are complementary to DNA sequences on either side of the gene or DNA sequence of interest are synthesized and used to initiate its amplification in vitro by a special (heat-stable) DNA polymerase. This procedure, called the polymerase chain reaction (PCR), is an extremely powerful gene-amplification tool. The amplified products can then be analyzed and sequenced, and, if desired, they can be inserted into cloning vectors and replicated in vivo for additional studies. Amplification of a DNA sequence by PCR frequently eliminates the need to clone the sequence by replication in vivo. Thus, procedures involving the amplification of DNA sequences by PCR have commonly replaced earlier in vivo amplification protocols. However, PCR can only be used when nucleotide sequences flanking the gene or DNA sequence of interest are known.

1.1.1 The Production of Recombinant DNA Molecules In Vitro

A restriction endonuclease catalyzes the cleavage of a specific sequence of nucleotide pairs regardless of the source of the DNA. It will cleave phage DNA, E. coli DNA, corn DNA, human DNA, or any other DNA, as long as the DNA contains the nucleotide sequence that it recognizes.
Thus, restriction endonuclease EcoRI will produce fragments with the same complementary single-stranded ends, 5’-AATT-3’, regardless of the source of DNA, and two EcoRI fragments can be covalently fused regardless of their origin; that is, an EcoRI fragment from human DNA can be joined to an EcoRI fragment from E. coli DNA just as easily as two EcoRI fragments from E. coli DNA or two EcoRI fragments from human DNA can be joined. A DNA molecule of the type shown in Figure 1.1, containing DNA fragments from two different sources, is referred to as a recombinant DNA molecule. The ability of geneticists to construct such recombinant DNA molecules at will is the basis of the recombinant DNA technology that has revolutionized molecular biology in the last three decades.

The first recombinant DNA molecules were produced in Paul Berg’s laboratory at Stanford University in 1972. Berg’s research team constructed recombinant DNA molecules that contained phage lambda genes inserted into the small circular DNA molecule of simian virus 40 (SV40). In 1980, Berg was a co-recipient of the Nobel Prize in Chemistry as a result of this accomplishment. Shortly thereafter, Stanley Cohen and colleagues, also at Stanford, inserted an EcoRI restriction fragment from one DNA molecule into the cleaved, unique EcoRI restriction site of a self-replicating plasmid. When this recombinant plasmid was introduced into E. coli cells by transformation, it exhibited autonomous replication, just like the original plasmid.

Figure 1.1: The construction of recombinant DNA molecules in vitro. DNA molecules isolated from two different species are cleaved with a restriction enzyme, mixed under annealing conditions, and covalently joined by treatment with DNA ligase. The DNA molecules can be obtained from any species: animal, plant, or microbe. The digestion of DNA with the restriction enzyme EcoRI produces the same complementary single-stranded 5’-AATT-3’ ends regardless of the source of the DNA.
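The cut-and-join logic of Figure 1.1 can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: DNA is modeled as a linear top-strand string, EcoRI is taken to recognize GAATTC and cut after the G (which leaves the 5’-AATT-3’ ends described above), and the function names ecori_digest and join_sticky_ends, along with the two toy sequences, are hypothetical.

```python
ECORI_SITE = "GAATTC"   # EcoRI recognition sequence (top strand, 5'->3')
CUT_OFFSET = 1          # EcoRI cuts after the G, leaving a 5'-AATT overhang


def ecori_digest(seq):
    """Cut a linear top-strand sequence at every EcoRI site; return fragments."""
    seq = seq.upper()
    fragments, start = [], 0
    pos = seq.find(ECORI_SITE)
    while pos != -1:
        cut = pos + CUT_OFFSET
        fragments.append(seq[start:cut])
        start = cut
        pos = seq.find(ECORI_SITE, pos + 1)
    fragments.append(seq[start:])
    return fragments


def join_sticky_ends(left, right):
    """Join two EcoRI-cut fragments; ligation regenerates a GAATTC site.

    Simplification: an end is treated as an EcoRI end if the top strand
    ends in G (left fragment) or starts with AATTC (right fragment).
    """
    if not (left.endswith("G") and right.startswith("AATTC")):
        raise ValueError("ends are not complementary EcoRI ends")
    return left + right


# Toy sequences (hypothetical), standing in for DNA from two species.
human = "ATGCGAATTCGGTACC"
ecoli = "TTGAATTCAA"

human_left, human_right = ecori_digest(human)
ecoli_left, ecoli_right = ecori_digest(ecoli)

# A human fragment joins an E. coli fragment as readily as its own partner.
recombinant = join_sticky_ends(human_left, ecoli_right)
print(recombinant)  # ATGCGAATTCAA
```

Because the check in join_sticky_ends looks only at the fragment ends and never at their source, the sketch mirrors the point of the text: compatibility is a property of the ends, not of the organism.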
1.1.2 Amplification of Recombinant DNA Molecules in Cloning Vectors

The various applications of recombinant DNA techniques require not only the construction of recombinant DNA molecules, as shown in Figure 1.1, but also the amplification of these recombinant molecules; that is, the production of many copies or clones of these molecules. This is accomplished by making sure that one of the parental DNAs incorporated into the recombinant DNA molecule is capable of self-replication. In practice, the gene or DNA sequence of interest is inserted into a specially chosen cloning vector. Most of the commonly used cloning vectors have been derived from plasmids or bacteriophage chromosomes.

A cloning vector has three essential components: (1) an origin of replication, (2) a dominant selectable marker gene, usually a gene that confers drug resistance to the host cell, and (3) at least one unique restriction endonuclease cleavage site, that is, a cleavage site that is present only once and lies in a region of the vector where it disrupts neither the origin of replication nor the selectable marker gene (Figure 1.2).

Figure 1.2: The phagemid cloning vector Bluescript II (Stratagene) contains (1) a plasmid origin of replication controlling double-stranded DNA synthesis, (2) a phage f1 origin of replication controlling single-stranded DNA synthesis, (3) an ampicillin-resistance gene (ampr) that serves as a dominant selectable marker, (4) the promoter for the lac genes and the promoter-proximal segment (Z’) of the lacZ gene, and (5) a polylinker or multiple cloning site (MCS) containing a cluster of unique restriction enzyme cleavage sites (18 are shown). The MCS is located within the lacZ’ gene segment; therefore, when foreign DNA is inserted into the MCS, it disrupts LacZ’ function. The designators and brackets showing the locations of recognition sequences for the restriction enzymes are above the MCS DNA sequence.
The cleavage sites are marked with red arrows except for AccI and HincII, where they are marked with blue and green arrows, respectively.

Modern cloning vectors contain a cluster of unique restriction sites called a polylinker or a multiple cloning site (Figure 1.2). Many cloning vectors are modified versions of plasmids, the extra-chromosomal, double-stranded circular molecules of DNA present in bacteria. Plasmids range from about 1 kb (1 kilobase = 1,000 base pairs) to over 200 kb in size, and many replicate autonomously. Many plasmids also carry antibiotic-resistance genes, which are ideal selectable markers.

A limiting factor in using plasmid vectors is that they will only accept relatively small foreign DNA inserts, with maximum sizes of 10–15 kb. Thus, scientists searched for vectors that could replicate even when very large inserts were present. Some of these vectors are listed in Table 1.1, along with the maximum sizes of inserts that they would accept. Phage lambda vectors were widely used for several years; then more sophisticated vectors were constructed by combining components from viruses and plasmids. Phagemids combine components of phage such as M13 with parts of plasmids. Cosmids contain the cohesive ends (cos sites) of lambda in plasmids. Yeast artificial chromosomes (YACs) are linear minichromosomes containing just the essential parts of yeast chromosomes (the origin of replication, centromere, and telomeres) along with a selectable marker and a multiple cloning site. Bacterial artificial chromosomes (BACs) and P1 artificial chromosomes (PACs) combine multiple cloning sites and selectable marker genes with the essential components of bacterial fertility (F) factors and phage P1 chromosomes, respectively. YACs, BACs, and PACs accept much larger foreign DNA inserts than plasmids and phage lambda cloning vectors (Table 1.1).
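As a rough illustration of how insert capacity constrains the choice of vector, the sketch below encodes the maximum insert sizes given in Table 1.1 and lists the vector classes that could carry an insert of a given length. The dictionary keys and the function name vectors_for_insert are hypothetical, and real vector selection depends on many factors besides insert size.

```python
# Maximum insert sizes (kb) as listed in Table 1.1.
MAX_INSERT_KB = {
    "plasmid": 15,
    "phagemid": 15,
    "phage lambda": 23,
    "cosmid": 44,
    "BAC": 300,
    "PAC": 300,
    "YAC": 600,
}


def vectors_for_insert(insert_kb):
    """Return the vector classes whose maximum insert size fits insert_kb."""
    if insert_kb <= 0:
        raise ValueError("insert size must be positive")
    return [name for name, max_kb in MAX_INSERT_KB.items() if insert_kb <= max_kb]


print(vectors_for_insert(10))    # every class in the table can carry 10 kb
print(vectors_for_insert(40))    # ['cosmid', 'BAC', 'PAC', 'YAC']
print(vectors_for_insert(450))   # ['YAC']
```

An empty list means that the sequence would have to be cloned as several overlapping fragments.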
Table 1.1: Selected Cloning Vectors and Maximum Insert Sizes
___________________________________________________________________________
Vector                                        Maximum Insert Size
___________________________________________________________________________
Plasmids                                      15 kb
Phagemids                                     15 kb
Phage lambda                                  23 kb
Cosmids                                       44 kb
Bacterial artificial chromosomes (BACs)       300 kb
Phage P1 artificial chromosomes (PACs)        300 kb
Yeast artificial chromosomes (YACs)           600 kb
___________________________________________________________________________

Bluescript (Figure 1.2) is a phagemid vector with a multiple cloning site (MCS) that contains many unique restriction enzyme cleavage sites, two distinct origins of replication, and a good selectable marker: a gene that makes the host bacterium resistant to ampicillin. The MCS is located within the 5’ portion of the coding region of the lacZ gene, which encodes β-galactosidase, the enzyme that catalyzes the first step in the catabolism of lactose. When foreign DNA is inserted into one of the restriction sites in the MCS, it disrupts the function of the plasmid-encoded lacZ product. This inactivation of the amino-terminal segment of β-galactosidase provides a good visual test for determining whether or not the Bluescript plasmid in a cell contains a foreign DNA insert.

The basis for this visual test is as follows. The presence of β-galactosidase in cells can be monitored based on its ability to cleave the substrate 5-bromo-4-chloro-3-indolyl-β-D-galactoside (usually called X-gal) to galactose and 5-bromo-4-chloroindigo. X-gal is colorless; 5-bromo-4-chloroindigo is blue. Thus, cells containing active β-galactosidase produce blue colonies on agar medium containing X-gal, whereas cells lacking β-galactosidase activity produce white colonies on X-gal plates (Figure 1.3).

Figure 1.3: Photograph illustrating the use of X-gal to identify E. coli colonies containing (blue) or lacking (white) β-galactosidase activity.
In this case, the cells in the white colonies harbor Bluescript plasmids with foreign DNA fragments inserted into the multiple cloning site, and the cells in the blue colonies contain Bluescript plasmids with no insert.

The molecular basis of the β-galactosidase activity that provides the color indicator test for Bluescript vectors is somewhat more complex. The lacZ gene of E. coli is over 3 kb long, and placing the entire gene in the plasmid would make the vector larger than desired. The Bluescript vector contains only a small part of the lacZ gene. This lacZ’ gene segment encodes only the amino-terminal portion of β-galactosidase. However, the presence of a functional copy of the lacZ’ gene segment can be detected because of a unique type of complementation. When a functional copy of the lacZ’ gene segment on the Bluescript plasmid is present in a cell that contains a particular lacZ mutant allele on the chromosome or on an F’ plasmid, the two defective lacZ sequences yield polypeptides that together have β-galactosidase activity. The mutant allele, designated lacZ ΔM15, synthesizes a Lac protein that lacks amino acids 11 through 41 from the amino terminus. The absence of these amino acids prevents the mutant polypeptides from interacting to produce the active tetrameric form of the enzyme. The presence of the amino-terminal fragment (the first 147 amino acids) of the lacZ polypeptide encoded by the lacZ’ gene fragment on Bluescript plasmids facilitates tetramer formation by the ΔM15 deletion polypeptides. This yields active β-galactosidase, which permits the X-gal color test to be utilized without placing the entire lacZ gene in the Bluescript vector.

1.1.3 Cloning Large Genes and Segments of Genomes in BACs, PACs, and YACs

Some eukaryotic genes are very large. For example, the gene for human dystrophin (a protein that links filaments to membranes in muscle cells) is over 2,000 kb in length.
Research on large genes and chromosomes is much easier using vectors that accept large foreign DNA inserts, namely, BACs, PACs, and YACs (see Table 1.1). These vectors accept inserts of 300 to 600 kb. BACs and PACs are less complex and easier to construct and work with than YACs. In addition, BACs and PACs replicate in E. coli like plasmid vectors. Thus, BAC and PAC vectors have largely replaced YAC vectors in the studies of large genes and genomes such as those of mammals and flowering plants.

PAC vectors have been constructed that permit negative selection against vectors lacking foreign DNA inserts. These PAC vectors contain the sacB gene of Bacillus subtilis. This gene encodes the enzyme levan sucrase, which catalyzes the transfer of fructose groups to various carbohydrates. The presence of this enzyme is lethal to E. coli cells when grown in medium containing 5 percent sucrose. The inactivation of the sacB gene by the insertion of foreign DNA in a BamHI restriction site in the gene can be used to select vectors containing inserts. Cells containing vectors with inserts can grow on medium containing 5 percent sucrose; cells with vectors lacking inserts cannot grow on this medium. Cells containing vectors lacking inserts lyse during the first hour of growth in the presence of 5 percent sucrose. As a result, all surviving cells contain vectors with inserts located within the sacB gene, inserts that eliminate levan sucrase activity.

PAC and BAC vectors have been modified to produce shuttle vectors that can replicate both in E. coli and in mammalian cells. The structure of one of these vectors is shown in Figure 1.4. This shuttle vector, pJCPAC-Mam1, contains the sacB gene, which allows for positive selection of cells carrying vectors with inserts, plus the origin of replication (oriP) and the gene encoding nuclear antigen 1 of the Epstein-Barr virus, which facilitate replication of the vector in mammalian cells.
In addition, the purr (puromycin-resistance) gene has been added so that mammalian cells carrying the vector can be selected on medium containing the antibiotic puromycin. Similar BAC shuttle vectors have also been constructed.

Figure 1.4: Structure of the PAC mammalian shuttle vector pJCPAC-Mam1. The vector can replicate in either E. coli or mammalian cells. It can replicate in E. coli at low copy number under the control of the bacteriophage P1 plasmid replication unit or be amplified by inducing the phage P1 lytic replication unit (under the control of the lac inducible promoter). It can replicate in mammalian cells by using the origin of replication (oriP) and nuclear antigen 1 of the Epstein-Barr virus. Genes kanr and purr provide dominant selectable markers for use in E. coli and mammalian cells, respectively. The sacB gene (derived from Bacillus subtilis) is used for negative selection against vectors lacking DNA inserts (see text for details). BamHI and NotI are cleavage sites for these two restriction endonucleases.

1.1.4 Amplification of DNA Sequences by the Polymerase Chain Reaction (PCR)

Today, we have complete or nearly complete nucleotide sequences of many genomes, including the human genome. The availability of these sequences in GenBank and other databases allows researchers to isolate genes or other DNA sequences of interest without using cloning vectors or host cells. The amplification of the DNA sequence is performed entirely in vitro, and the sequence can be amplified a millionfold or more in just a few hours. All that is required to use this procedure is knowledge of short nucleotide sequences flanking the sequence of interest. This in vitro amplification of genes and other DNA sequences is accomplished by the polymerase chain reaction (usually referred to as PCR).
PCR involves using synthetic oligonucleotides complementary to known sequences flanking the sequence of interest to prime enzymatic amplification of the intervening segment of DNA in the test tube. The PCR procedure for amplifying DNA sequences was developed by Kary Mullis, who received the 1993 Nobel Prize in Chemistry for this work.

1.2 Construction and Screening of DNA Libraries

The first step in cloning a gene from an organism usually involves the construction of a genomic DNA library, a set of DNA clones collectively containing the entire genome. Sometimes, individual chromosomes of an organism are isolated by a procedure that sorts chromosomes based on size and DNA content. The DNAs from the isolated chromosomes are then used to construct chromosome-specific DNA libraries. The availability of chromosome-specific DNA libraries facilitates the search for a gene that is known to reside on a particular chromosome, especially for organisms like humans with large genomes. After their construction, libraries are amplified by replication and used to identify individual genes or DNA sequences of interest to the researcher.

An alternative approach to gene cloning restricts the search for a gene to DNA sequences that are transcribed into mRNA copies. The RNA retroviruses encode an enzyme called reverse transcriptase, which catalyzes the synthesis of DNA molecules complementary to single-stranded RNA templates. These DNA molecules are called complementary DNAs (cDNAs). They can be converted to double-stranded cDNA molecules with DNA polymerases, and the double-stranded cDNAs can be cloned in plasmid vectors. By starting with mRNA, geneticists are able to construct cDNA libraries that contain only the coding regions of the expressed genes of an organism.
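The base-pairing rule behind cDNA synthesis can be written out as a short sketch. It treats reverse transcription purely as sequence complementarity and ignores the enzymology. The function names first_strand_cdna and second_strand and the toy mRNA are hypothetical, and all sequences are written 5’ to 3’.

```python
# Base pairing between an RNA template and the DNA strand copied from it.
RNA_TO_DNA = {"A": "T", "U": "A", "G": "C", "C": "G"}
DNA_TO_DNA = {"A": "T", "T": "A", "G": "C", "C": "G"}


def first_strand_cdna(mrna):
    """Return the DNA strand complementary to an mRNA, written 5'->3'."""
    # Reading the template backwards gives the new strand in 5'->3' order.
    return "".join(RNA_TO_DNA[base] for base in reversed(mrna.upper()))


def second_strand(cdna):
    """Return the DNA strand complementary to the first cDNA strand, 5'->3'."""
    return "".join(DNA_TO_DNA[base] for base in reversed(cdna.upper()))


mrna = "AUGGCCUAAAAAA"              # toy transcript (hypothetical)
strand_1 = first_strand_cdna(mrna)
strand_2 = second_strand(strand_1)

print(strand_1)  # TTTTTTAGGCCAT
print(strand_2)  # ATGGCCTAAAAAA
```

The second strand has the same sequence as the mRNA with T in place of U, which is why a double-stranded cDNA carries the information of the transcript and nothing else.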
1.2.1 Construction of Genomic Libraries

Genomic DNA libraries are usually prepared by isolating total DNA from an organism, digesting the DNA with a restriction endonuclease, and inserting the restriction fragments into an appropriate cloning vector. If the restriction enzyme that is used makes staggered cuts in DNA, producing complementary single-stranded ends, the restriction fragments can be ligated directly into vector DNA molecules cut with the same enzyme (Figure 1.5). When this procedure is used, the foreign DNA inserts can be excised from the vector DNA by cleavage with the restriction endonuclease used to prepare the genomic DNA fragments for cloning.

Figure 1.5: Procedure used to clone DNA restriction fragments with complementary single-stranded ends.

Once the genomic DNA fragments are ligated into vector DNA, the recombinant DNA molecules must be introduced into host cells for amplification by replication in vivo. This step usually involves transforming antibiotic-sensitive recipient cells under conditions where a single recombinant DNA molecule is introduced per cell (for most cells). When E. coli is used, the bacteria must first be made permeable to DNA by treatment with chemicals or a short pulse of electricity. Transformed cells are then selected by growing the cells under conditions where the selectable marker gene of the vector is essential for growth.

A good genomic DNA library contains essentially all of the DNA sequences in the genome of interest. For large genomes, complete libraries contain hundreds of thousands of different recombinant clones.

1.2.2 Construction of cDNA Libraries

Most of the DNA sequences present in the large genomes of higher animals and plants do not encode proteins. Thus, expressed DNA sequences can be identified more easily by working with complementary DNA (cDNA) libraries.
Because most mRNA molecules contain 3’ poly(A) tails, poly(T) oligomers can be used to prime the synthesis of complementary DNA strands by reverse transcriptase (Figure 1.6). Then, the RNA–DNA duplexes are converted to double-stranded DNA molecules by the combined activities of ribonuclease H, DNA polymerase I, and DNA ligase. Ribonuclease H degrades the RNA template strand, and short RNA fragments produced during degradation serve as primers for DNA synthesis. DNA polymerase I catalyzes the synthesis of the second DNA strand and replaces RNA primers with DNA strands, and DNA ligase seals the remaining single-strand breaks in the double-stranded DNA molecules. These double-stranded cDNAs can be inserted into plasmid or phage cloning vectors by adding complementary single-stranded tails to the cDNAs and vectors.

Figure 1.6: The synthesis of double-stranded cDNAs from mRNA molecules.

1.2.3 Screening DNA Libraries for Genes of Interest

The genomes of higher plants and animals are very large. For example, the human genome contains 3 × 10^9 nucleotide pairs. Thus, searching genomic DNA or cDNA libraries of multicellular eukaryotes for a specific gene or other DNA sequence of interest requires the identification of a single DNA sequence in a library that contains a million or more different sequences. The most powerful screening procedure is genetic selection: searching for a DNA sequence in the library that can restore the wild-type phenotype to a mutant organism. When genetic selection cannot be employed, more laborious molecular screens must be carried out. Molecular screens usually involve the use of DNA or RNA sequences as hybridization probes or the use of antibodies to identify gene products encoded by cDNA clones.

1.2.3.1 Genetic Selection

The simplest procedure for identifying a clone of interest is genetic selection. For example, the Salmonella typhimurium gene that confers resistance to penicillin can be easily cloned.
A genomic library is constructed from the DNA of a penr strain of S. typhimurium. Penicillin-sensitive E. coli cells are transformed with the recombinant DNA clones in the library and are plated on medium containing penicillin. Only the transformed cells harboring the penr gene will be able to grow in the presence of penicillin. When mutations are available in the gene of interest, genetic selection can be based on the ability of the wild-type allele of a gene to restore the normal phenotype to a mutant organism. Although this type of selection is called complementation screening, it really depends on the dominance of wild-type alleles over mutant alleles that encode inactive products. For example, the genes of S. cerevisiae that encode histidine biosynthetic enzymes were cloned by transforming E. coli histidine auxotrophs with yeast cDNA clones and selecting transformed cells that could grow on histidine-free medium. Indeed, many plant and animal genes have been identified based on their ability to complement mutations in E. coli or yeast. Complementation screening has limitations. Eukaryotic genes contain introns, which must be spliced out of gene transcripts prior to their translation. Because E. coli cells do not possess the machinery required to excise introns from eukaryotic genes, complementation screening of eukaryotic clones in E. coli is restricted to cDNAs, from which the intron sequences have already been excised. In addition, the complementation screening procedure depends on the correct transcription of the cloned gene in the new host. Eukaryotes have signals that regulate gene expression that are different from those in prokaryotes; therefore, the complementation approach is more likely to work with prokaryotic genes in prokaryotic organisms, and eukaryotic genes in eukaryotic organisms. For this reason, researchers often use S. cerevisiae to screen eukaryotic DNA libraries by the complementation procedure. 
1.2.3.2 Molecular Hybridization: Colony Hybridization

The first eukaryotic DNA sequences to be cloned were genes that are highly expressed in specialized cells. These genes included the mammalian α- and β-globin genes and the chicken ovalbumin gene. Red blood cells are highly specialized for the synthesis and storage of hemoglobin. Over 90 percent of the protein molecules synthesized in red blood cells during their period of maximal biosynthetic activity are globin chains. Similarly, ovalbumin is a major product of chicken oviduct cells. As a result, RNA transcripts of the globin and ovalbumin genes can be easily isolated from reticulocytes and oviduct cells, respectively. These RNA transcripts can be employed to synthesize radioactive cDNAs, which, in turn, can be used to screen genomic DNA libraries by in situ colony or plaque hybridization (Figure 1.7). Colony hybridization is used with libraries constructed in plasmid and cosmid vectors; plaque hybridization is used with libraries in phage lambda vectors. We will focus on in situ colony hybridization here, but the two procedures are virtually identical.

The colony hybridization screening procedure involves transfer of the colonies formed by transformed cells onto nylon membranes, hybridization with a radioactively labeled DNA or RNA probe, and autoradiography (Figure 1.7). The labeled DNA or RNA is employed as a probe for hybridization to denatured DNA from colonies grown on the nylon membranes. The DNA from the lysed cells is bound to the membranes before hybridization so that it won’t come off during subsequent steps in the procedure. After time is allowed for hybridization between complementary strands of DNA, the membranes are washed with buffered salt solutions to remove nonhybridized cDNA and are then exposed to X-ray film to detect the presence of radioactivity on the membrane.
Only colonies that contain DNA sequences complementary to the radioactive cDNA will yield radioactive spots on the autoradiographs (Figure 1.7). The locations of the radioactive spots are used to identify colonies that contain the desired sequence on the original replicated plates. These colonies are used to purify DNA clones harboring the gene or DNA sequence of interest.

Figure 1.7: Screening DNA libraries by colony hybridization. A radioactive cDNA is employed as a hybridization probe. See text for details.

1.3 The Molecular Analysis of DNA, RNA, and Protein

The development of recombinant DNA techniques has spawned many new approaches to the analysis of genes and gene products. Questions that were totally unapproachable just 25 years ago can now be investigated with relative ease. Geneticists can isolate and characterize essentially any gene from any organism; however, the isolation of genes from large eukaryotic genomes is sometimes a long and laborious process. Once a gene has been cloned, its expression can be investigated in even the most complex organisms such as humans. Is a particular gene expressed in the kidney, the liver, bone cells, hair follicles, erythrocytes, or lymphocytes? Is this gene expressed throughout the development of the organism or only during certain stages of development? Is a mutant allele of this gene similarly expressed, spatially and temporally, during development? Or does the mutant allele have an altered pattern of expression? If the latter, is this altered pattern of expression responsible for an inherited syndrome or disease? These questions and many others can now be routinely investigated using well-established methodologies.
A comprehensive discussion of the techniques used to investigate gene structure and function is far beyond the scope of this text. However, let’s consider some of the most important methods used to investigate the structure of genes (DNA), their transcripts (RNA), and their final products (usually proteins).

1.3.1 Analysis of DNAs by Southern Blot Hybridization

Gel electrophoresis is a powerful tool for the separation of macromolecules with different sizes and charges. DNA molecules have an essentially constant charge per unit mass; thus, they separate in agarose and acrylamide gels almost entirely on the basis of size or conformation. Agarose or acrylamide gels act as molecular sieves, retarding the passage of large molecules more than small molecules. Agarose gels are better sieves for large molecules (larger than a few hundred nucleotides); acrylamide gels are better for separating small DNA molecules. Figure 1.8 illustrates the separation of DNA restriction fragments by agarose gel electrophoresis. The procedures used to separate RNA and protein molecules are largely the same in principle but involve slightly different techniques because of the unique properties of each class of macromolecule.

Figure 1.8: The separation of DNA molecules by agarose gel electrophoresis. The DNAs are dissolved in loading buffer with density greater than that of the electrophoresis buffer so that DNA samples settle to the bottoms of the wells, rather than diffusing into the electrophoresis buffer. The loading buffer also contains a dye to monitor the rate of migration of molecules through the gel. Ethidium bromide binds to DNA and fluoresces when illuminated with ultraviolet light. In the photograph shown, lane 3 contained EcoRI-cut plasmid DNA; the other lanes contained EcoRI-cut plasmid DNAs carrying maize glutamine synthetase cDNA inserts.
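The band patterns in a gel such as the one in Figure 1.8 follow from simple arithmetic on the positions of the restriction sites. The sketch below computes the fragment sizes produced by cutting a circular plasmid and lists them largest first, that is, from the wells toward the bottom of the gel. The plasmid lengths and site positions are hypothetical illustration values, not those of the plasmids in the figure, and the function name circular_digest_sizes is likewise an assumption.

```python
def circular_digest_sizes(plasmid_len, cut_positions):
    """Return fragment sizes (bp), largest first, for a cut circular plasmid."""
    cuts = sorted(set(cut_positions))
    if not cuts:
        raise ValueError("at least one cut site is required")
    if cuts[0] < 0 or cuts[-1] >= plasmid_len:
        raise ValueError("cut positions must lie within the plasmid")
    sizes = []
    for i, cut in enumerate(cuts):
        next_cut = cuts[(i + 1) % len(cuts)]       # wrap around the circle
        size = (next_cut - cut) % plasmid_len
        sizes.append(size if size else plasmid_len)  # one cut: full-length linear
    return sorted(sizes, reverse=True)


# Hypothetical 3,000-bp vector with a single EcoRI site at position 700.
print(circular_digest_sizes(3000, [700]))         # [3000]

# Same vector carrying a 1,400-bp insert cloned into that EcoRI site:
# the insert is now flanked by two EcoRI sites (700 and 2100).
print(circular_digest_sizes(4400, [700, 2100]))   # [3000, 1400]
```

In this toy case the vector-only lane shows a single band, whereas a recombinant lane shows the vector band plus a smaller, faster-migrating insert band.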
In 1975, Edward M. Southern published an important procedure that allowed investigators to identify the locations of genes and other DNA sequences on restriction fragments separated by gel electrophoresis. The essential feature of this technique is the transfer of the DNA molecules that have been separated by gel electrophoresis onto nitrocellulose or nylon membranes (Figure 1.9). Such transfers of DNA to membranes are called Southern blots after the scientist who developed the technique. The DNA is denatured either prior to or during transfer by placing the gel in an alkaline solution. After transfer, the DNA is immobilized on the membrane by drying or UV irradiation. A radioactive DNA probe containing the sequence of interest is then hybridized with the immobilized DNA on the membrane. The probe will hybridize only with DNA molecules that contain a nucleotide sequence complementary to the sequence of the probe. Nonhybridized probe is then washed off the membrane, and the washed membrane is exposed to X-ray film to detect the presence of the radioactivity. After the film is developed, the dark bands show the positions of DNA sequences that have hybridized with the probe (Figure 1.10).

Figure 1.9: Procedure used to transfer DNAs separated by gel electrophoresis to nylon membranes. The transfer solution carries the DNA from the gel to the membrane as the dry paper towels on top draw the salt solution from the reservoir through the gel to the towels. The DNA binds to the membrane on contact. The membrane with the DNA bound to it is dried and baked under vacuum to affix the DNA firmly prior to hybridization. SSC is a solution containing sodium chloride and sodium citrate.

Figure 1.10: Identification of genomic restriction fragments harboring specific DNA sequences by the Southern blot procedure. (a) Photograph of an ethidium bromide-stained agarose gel containing phage λ DNA digested with HindIII (left lane), and Arabidopsis thaliana DNA digested with EcoRI (right lane).
The phage λ DNA digest provides size markers. The A. thaliana DNA digest was transferred to a nylon membrane by the Southern procedure (Figure 1.9) and hybridized to a radioactive DNA fragment of a cloned β-tubulin gene. The resulting Southern blot is shown in (b); nine different EcoRI fragments hybridized with the β-tubulin probe.

The ability to transfer DNA molecules that have been separated by gel electrophoresis to nylon membranes for hybridization studies and other types of analyses has proven to be extremely useful. Practical applications of Southern blotting center on identifying DNA fragments that contain sequences similar to the probe DNA or RNA, where the proportion of mismatched nucleotides allowed is determined by the conditions of hybridization. The advantages of the Southern blot are convenience and sensitivity. The sensitivity comes from the fact that both hybridization with a labeled probe and the use of photographic film amplify the signal; under typical conditions, a band can be observed on the film with only 5 × 10⁻¹² grams of DNA, a thousand times less DNA than the amount required to produce a visible band in the gel itself.

1.3.2 Analysis of RNAs by Northern Blot Hybridization

If DNA molecules can be transferred from agarose gels to nylon membranes for hybridization studies, we might expect that RNA molecules separated by agarose gel electrophoresis could be similarly transferred and analyzed. Indeed, such RNA transfers are used routinely in genetics laboratories. RNA blots are called northern blots in recognition of the fact that the procedure is analogous to the Southern blotting technique, but with RNA molecules being separated and transferred to a membrane. As we will discuss in Section 1.3.4, this terminology has been extended to the transfer of proteins from gels to membranes, a procedure called western blotting. The northern blot procedure is essentially identical to that used for Southern blot transfers (Figure 1.9).
However, RNA molecules are very sensitive to degradation by RNases. Thus, care must be taken to prevent contamination of materials with these extremely stable enzymes. Furthermore, most RNA molecules contain considerable secondary structure and must therefore be kept denatured during electrophoresis in order to separate them on the basis of size. Denaturation is accomplished by adding formaldehyde or some other chemical denaturant to the buffer used for electrophoresis. After transfer to an appropriate membrane, the RNA blot is hybridized to either RNA or DNA probes just as with a Southern blot.

Northern blot hybridizations (Figure 1.11) are extremely helpful in studies of gene expression. They can be used to determine when and where a particular gene is expressed. However, we must remember that northern blot hybridizations only measure the accumulation of RNA transcripts. They provide no information about why the observed accumulation has occurred. Changes in transcript levels may be due to changes in the rate of transcription or to changes in the rate of transcript degradation. More sophisticated procedures must be used to distinguish between these possibilities.

Figure 1.11: Typical northern blot hybridization data. Total RNAs were isolated from roots (R), leaves (L), and flowers (F) of Arabidopsis thaliana plants, separated by agarose gel electrophoresis, and then transferred to nylon membranes. The autoradiogram shown in (a) is of a blot that was hybridized to a radioactive probe containing an α-tubulin coding sequence. This probe hybridizes to the transcripts of all six α-tubulin genes in A. thaliana. The autoradiograms shown in (b) and (c) are of RNA blots that were hybridized to DNA probes specific for the α1- and α3-tubulin genes (TUA1 and TUA3, respectively). The results show that the α3-tubulin transcript is present in all organs analyzed, whereas the α1-tubulin transcript is present only in flowers. The 18S and 26S ribosomal RNAs provide size markers.
Their positions were determined from a photograph of the ethidium bromide-stained gel prior to transfer of the RNAs to the nylon membrane.

1.3.3 Analysis of RNAs by Reverse Transcriptase-PCR (RT-PCR)

The enzyme reverse transcriptase catalyzes the synthesis of DNA strands that are complementary to RNA templates, and it can be used in vitro for this purpose. The resulting DNA strands can then be converted to double-stranded DNA by several different procedures (for example, see Figure 1.6), including the use of a second primer and the heat-stable Taq DNA polymerase. The resulting DNA molecules can then be amplified by standard PCR.

The first strand of DNA, often called a cDNA because it is complementary to the mRNA under study, can be synthesized by using an oligo(dT) primer that will anneal to the 3’-poly(A) tails of all mRNAs, or by using gene-specific primers (sequences complementary to the RNA molecule of interest). Gene-specific oligonucleotide primers are usually chosen to anneal to sequences in the 3’-noncoding regions of the mRNAs. Figure 1.12 illustrates how such primers can be used in RT-PCR to amplify a specific gene transcript. The products of these amplifications are analyzed by gel electrophoresis. Wherever a product appears in the gel, the investigator knows that the sample from which it was generated contained the mRNA under study. This procedure is therefore a quick and easy way of ascertaining whether or not a particular gene is being transcribed.

Many modifications of the RT-PCR procedure have been developed, with a major emphasis on making it more quantitative. For example, known amounts of the RNA under study can be analyzed to determine the relationship between RNA input and DNA output. By knowing this relationship, an investigator can use the quantity of DNA generated by an experimental sample to extrapolate back to the amount of RNA that was initially present in that sample.
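The calibration idea described above can be sketched in a few lines of code. The example below fits a standard curve to known RNA inputs and the DNA outputs they produced, and then inverts the curve to estimate the RNA present in an experimental sample. It assumes, purely for illustration, that the relationship is linear on a log-log scale within the calibrated range; the function names, units, and numbers are hypothetical and do not come from any experiment described in this text.

```python
import math


def fit_standard_curve(rna_inputs, dna_outputs):
    """Fit log10(DNA output) = intercept + slope * log10(RNA input).

    Returns (slope, intercept) from an ordinary least-squares fit.
    Assumption: the relationship is linear on a log-log scale.
    """
    xs = [math.log10(r) for r in rna_inputs]
    ys = [math.log10(d) for d in dna_outputs]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def estimate_rna_input(dna_output, slope, intercept):
    """Invert the standard curve to estimate the starting amount of RNA."""
    return 10 ** ((math.log10(dna_output) - intercept) / slope)


# Hypothetical calibration series: known RNA inputs (pg) and measured DNA outputs (ng)
known_rna_pg = [1, 10, 100, 1000]
measured_dna_ng = [2, 20, 200, 2000]

slope, intercept = fit_standard_curve(known_rna_pg, measured_dna_ng)

# An experimental sample that yielded 60 ng of product
print(estimate_rna_input(60, slope, intercept))
```

An estimate is only meaningful when the experimental sample falls within the range of the calibration series; values outside that range rest on the untested assumption that the same relationship continues to hold.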
Figure 1.12: Detection and amplification of RNAs by reverse transcriptase PCR (RT-PCR). Specific gene transcripts are amplified by first using reverse transcriptase to synthesize a single-stranded DNA that is complementary to the mRNA of interest. The synthesis is initiated with a gene-specific oligonucleotide primer (a primer that will only anneal to the mRNA of interest). The complementary DNA strand is then synthesized by using a reverse primer and Taq polymerase. Large quantities of double-stranded cDNA are subsequently synthesized by standard PCR in the presence of both the gene-specific and reverse PCR primers.

1.3.4 Analysis of Proteins by Western Blot Techniques

Polyacrylamide gel electrophoresis is an important tool for the separation and characterization of proteins. Because many functional proteins are composed of two or more subunits, individual polypeptides are separated by electrophoresis in the presence of the detergent sodium dodecyl sulfate (SDS), which denatures the proteins. After electrophoresis, the proteins are detected by staining with Coomassie blue or silver stain. However, the separated polypeptides also can be transferred from the gel to a nitrocellulose membrane, and individual proteins can be detected with antibodies. This transfer of proteins from acrylamide gels to nitrocellulose membranes, called western blotting, is performed by using an electric current to move the proteins from the gel to the surface of the membrane (Figure 1.13).

Figure 1.13: Western blotting transfer apparatus. Schematic showing the assembly of a typical western blot apparatus, with the positions of the gel and transfer membrane and the direction of protein transfer in relation to the electrodes. (A) The image depicted is representative of a horizontal semi-dry transfer apparatus. (B) The orientation is applicable to a vertically positioned "wet" transfer apparatus.
After transfer, a specific protein of interest is identified by placing the membrane with the immobilized proteins in a solution containing an antibody to the protein. Unbound antibodies are then washed off the membrane, and the presence of the initial (primary) antibody is detected by placing the membrane in a solution containing a secondary antibody. This secondary antibody reacts with immunoglobulins (the group of proteins comprising all antibodies) in general. The secondary antibody is conjugated to either a radioactive isotope (permitting autoradiography) or an enzyme that produces a visible product when the proper substrate is added.

Reference

1. Principles of Genetics, Sixth Edition. 2012. D. Peter Snustad and Michael J. Simmons. John Wiley & Sons, Inc., New Jersey.

Review Questions

1. What is a recombinant DNA molecule?
2. What are restriction endonucleases?
3. How are restriction endonucleases used to construct recombinant DNA molecules in vitro?
4. Why is the polymerase chain reaction (PCR) such a powerful tool for use in analyses of DNA?
5. (a) In what ways is the introduction of recombinant DNA molecules into host cells similar to mutation? (b) In what ways is it different?
8. In what ways do restriction endonucleases differ from other endonucleases?
9. Of what value are recombinant DNA and gene-cloning technologies to geneticists?
10. What determines the sites at which DNA molecules will be cleaved by a restriction endonuclease?
11. Restriction endonucleases are invaluable tools for biologists. However, genes encoding restriction enzymes obviously did not evolve to provide tools for scientists. Of what possible value are restriction endonucleases to the microorganisms that produce them?
12. What is the rationale for the inclusion of multiple cloning sites in modern cloning vectors? Include in your answer a description of what multiple cloning sites are.
13. With a labelled diagram, describe the phagemid cloning vector Bluescript II.
Mention the role of each component part of the vector Bluescript II. How would you identify cells that contain a recombinant Bluescript II plasmid carrying a foreign DNA insert?
14. The cloning vectors in use today contain an origin of replication, a selectable marker gene (usually an antibiotic resistance gene), and one additional component. What is this component, and what is its function?
15. What are phagemids, cosmids, yeast artificial chromosomes (YACs), bacterial artificial chromosomes (BACs), and P1 artificial chromosomes (PACs)? What are the advantages of using each of these vectors as alternatives to plasmids?
16. What are shuttle vectors? Describe the structure of the PAC mammalian shuttle vector pJCPAC-Mam1. What are the advantages of using such a vector?
17. What is a genomic library and what is its value? How could you construct a genomic library?
18. cDNA clones are widely used in recombinant DNA studies. (a) What does the term "cDNA" stand for? (b) How is cDNA prepared? (c) What features of cDNA make it a useful research tool?
19. Compare the nucleotide-pair sequences of genomic DNA clones and cDNA clones of specific genes of higher plants and animals. What is the most frequent difference that you would observe?
20. What is a cDNA library and what is its value? How could you construct a cDNA library?
21. What is the basis of genetic selection? What is complementation screening and how is it carried out?
22. What is in situ colony or plaque hybridization? How would you screen a DNA library by in situ colony hybridization to find a particular clone carrying a gene of interest?
23. (a) What experimental procedure is carried out in Southern, northern, and western blot analyses? (b) What is the major difference between Southern, northern, and western blot analyses?
24. Distinguish between Southern and northern blots in a manner that makes it clear you know what each is and how they differ.
25. How does a western blot differ from both of the above?
When is a western blot used in preference to a northern or Southern blot?
26. Why is it generally desirable to use a relatively small probe when doing a Southern blot? What situations may make the use of a larger probe desirable?
27. Summarize the major steps that are involved in Southern blotting.
28. Describe a procedure that could be used to determine which tissues of the weed Arabidopsis thaliana express the highest levels of a tubulin gene. What is the procedure called?
29. What major advantage does the polymerase chain reaction (PCR) have over other methods for analyzing nucleic acid structure and function?
30. What is reverse transcriptase PCR (RT-PCR)? Describe the procedures for the detection and amplification of RNAs by RT-PCR.