More Molecular Genetic Technologies (Chapters 8, 9, 10)

1. Polymerase Chain Reaction (PCR)
   1. Standard PCR
   2. Real-time quantitative PCR
   3. Site-directed mutagenesis PCR
2. DNA sequencing
   1. Manual dideoxy/automated fluorescent dye
   2. Pyrosequencing
   3. Other methods
3. Single nucleotide polymorphisms (SNPs)

Lots of practical applications, virtually unlimited:

• Amplify DNA for cloning (PCR)
• Amplify DNA for sequencing without cloning (PCR)
• DNA sequencing reaction (PCR)
• Mapping genes and regulatory sequences
• Linkage analysis (identify genes for traits/diseases)
• Diagnose disease
• Pathogen screening
• Sex determination
• Forensic analysis
• Paternity/maternity (relatedness)
• Behavioral ecology studies (relatedness)
• Molecular systematics and evolution (comparing homologous sequences in different organisms)
• Population genetics (theoretical and applied)
• Physiological genetics (studying basis of adaptation)
• Livestock pedigrees (optimize breeding)
• Wildlife management (stock identification/assessment)
• Detection of genetically modified food (GMOs)

Polymerase Chain Reaction (PCR)

• The ability to generate identical, high-copy-number DNAs was made possible in the 1970s by recombinant DNA technology (i.e., cloning).
• Cloning DNA is time consuming and expensive (>>$15/sample).
• Probing libraries can be like hunting for a needle in a haystack.
• PCR, “discovered” in 1983 by Kary Mullis, enables the amplification (or duplication) of millions of copies of any DNA sequence with known flanking sequences.
• Requires only simple, inexpensive ingredients and a couple of hours:
   DNA template
   Primers (anneal to flanking sequences)
   DNA polymerase
   dNTPs
   Mg2+
   Buffer
• Can be performed by hand or in a machine called a thermal cycler.
• 1993: Nobel Prize in Chemistry

How PCR works:

1. Begins with DNA containing a sequence to be amplified and a pair of synthetic oligonucleotide primers that flank the sequence.
2. Next, denature the DNA to single strands at 94°C.
3.
Rapidly cool the DNA (37-65°C) and anneal primers to complementary single-stranded sequences flanking the target DNA.
4. Extend primers at 70-75°C using a heat-resistant DNA polymerase such as Taq polymerase, derived from the hot-water bacterium Thermus aquaticus.
5. Repeat the cycle of denaturing, annealing, and extension 20-45 times to produce about 1 million (2^20) to 35 trillion (2^45) copies of the target DNA.
6. Extend the primers at 70-75°C once more to allow incomplete extension products in the reaction mixture to extend completely.
7. Cool to 4°C and store or use amplified PCR product for analysis.

Hot water bacteria: Thermus aquaticus, source of Taq DNA polymerase
Life at High Temperatures by Thomas D. Brock
Biotechnology in Yellowstone © 1994 Yellowstone Association for Natural Science
http://www.bact.wisc.edu/Bact303/b27

Fig. 9.3: Denature; anneal PCR primers; extend PCR primers w/Taq; repeat…

Example thermal cycler protocol used in lab:

Step 1   7 min at 94°C          Initial denature
Step 2   45 cycles of:
           20 sec at 94°C       Denature
           20 sec at 52°C       Anneal
           1 min at 72°C        Extension
Step 3   7 min at 72°C          Final extension
Step 4   Infinite hold at 4°C   Storage

Real-time Quantitative PCR:

• Same as PCR, but measures the abundance of DNA as it is amplified.
• Useful for quantitatively measuring the levels of mRNA in a sample. Uses reverse transcriptase to generate cDNA for the template.
• Can also be used to estimate the fraction of DNA from various organisms in a heterogeneous sample (e.g., to measure the abundance of different microbes in a soil sample).
• Can be used to type SNPs if primer binding is stringent.
• A fluorescent dye, SYBR Green, is incorporated into the PCR reaction.
• SYBR Green fluoresces strongly when bound to DNA, but emits little fluorescence when not bound to DNA.
• SYBR Green fluorescence is proportional to the amount of DNA amplified and is detected with a laser or other device.
• Experimental samples are compared to a control sample with a known concentration of cDNA.

Fig.
10.9 SYBR Green binds to double-stranded DNA and fluoresces

Real-time Quantitative PCR amplification plot:

Site-specific in vitro PCR mutagenesis:

• Method by which mutant alleles can be synthesized in the lab and transformed into cell culture and animals.
• Commonly used to study mutations of human genes in mice or other model organisms.

One simple method relies on PCR:

1. Begin with 4 PCR primers; 2 primers match the target sequence except where the mutation is desired, and 2 primers flank the region.
2. Synthesize 2 PCR products in both directions from the mutation site to cover the full length of the gene.
3. Remove primers, mix PCR products, and denature.
4. The two PCR products now overlap; self-anneal and extend to full-length products in a thermal cycler.
5. Transform into a cell or expression vector for further tests.

Fig. 9.1, Site-specific mutagenesis using PCR.

DNA Sequencing

• DNA sequencing = determining the nucleotide sequence of DNA.
• Dideoxy sequencing was developed by Frederick Sanger in the 1970s.
• 1980: Walter Gilbert (Biol. Labs) & Frederick Sanger (MRC Labs)

Dideoxy DNA sequencing relies on chain termination:

1. DNA template is denatured to single strands.
2. A single DNA primer (3’ end near sequence of interest) is annealed to the template DNA and extended with DNA polymerase.
3. Four reactions are set up, each containing:
   1. DNA template
   2. Primer annealed to template DNA
   3. DNA polymerase
   4. dNTPs (dATP, dTTP, dCTP, and dGTP)
4. Next, a different labeled dideoxynucleotide (ddATP, ddTTP, ddCTP, or ddGTP) is added to each of the four reaction tubes at 1/100th the concentration of normal dNTPs.
5. ddNTPs possess a 3’-H instead of 3’-OH, compete in the reaction with normal dNTPs, and cannot form a phosphodiester bond with the next nucleotide.

Dideoxy DNA sequencing (cont.):

7. Whenever a labeled ddNTP is incorporated in the chain, DNA synthesis terminates.
8. Dideoxy DNA sequencing is also called chain-termination sequencing (or dye-terminator sequencing when the ddNTPs carry fluorescent dyes).
9.
Each of the four reaction mixtures produces a population of DNA molecules with DNA chains terminating at all possible positions.
10. Extension products in each of the four reaction mixtures also end with a different labeled ddNTP (depending on the base).
11. Next, each reaction mixture is electrophoresed in a separate lane (4 lanes) at high voltage on a polyacrylamide gel.
12. Polyacrylamide gels can be made thinner, which permits higher voltage and faster runs.
13. The pattern of bands in each of the four lanes is visualized on X-ray film or an automated sequencer.
14. The location of “bands” in each of the four lanes indicates the size of the fragment terminating with the respective labeled ddNTP.
15. The DNA sequence is deduced from the pattern of bands in the 4 lanes.

Fig. 8.17, 2nd edition
http://askabiologist.asu.edu/sequencing

Radio-labeled ddNTPs (4 rxns)
Sequence (5’ to 3’), short products to long products: G G A T A T A A C C C C T G T
Vigilant et al. 1989 PNAS 86:9350-9354

Automated Dye-Terminator dideoxy DNA Sequencing:

1. Original dideoxy DNA sequencing methods were time consuming, used radioactive 32P labels, and had low throughput, typically ~300 bp per run.
2. Automated DNA sequencing employs the same general procedure, but uses ddNTPs labeled with fluorescent dyes.
3. The 4 dyes, fluorescing at different wavelengths, are combined in one reaction tube and electrophoresed in one lane on a capillary containing polyacrylamide.
4. A capillary is thinner than a gel, permitting higher voltage and even faster runs.
5. A laser excites the dyes, and the emitted fluorescence is read as sequence.
6. Sequence data are displayed as colored peaks (chromatograms) that correspond to the position of each nucleotide in the sequence.
7. Throughput is high, up to 1,200 bp per reaction and 96 reactions every 3 hours with capillary sequencers.
8. Most automated DNA sequencers can load robotically and operate around the clock for weeks with minimal labor.
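
The chain-termination logic of dideoxy sequencing can be sketched as a toy model in Python. This is a minimal illustration under simplifying assumptions, not a description of any real instrument or software: every position is assumed to yield a detectable band, product lengths are counted from the first base added after the primer, and the function names (run_four_reactions, read_gel) are hypothetical. The example template is chosen so that the read matches the 15-base sequence shown with the gel figure.

```python
"""Toy model of dideoxy (chain-termination) sequencing."""

COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def run_four_reactions(template_3to5):
    """Return, for each ddNTP lane, the lengths of the terminated products.

    The template is written 3'->5' so the new strand, which grows 5'->3',
    is simply the base-by-base complement. In the ddA tube, for example,
    a product can terminate wherever the new strand carries an A.
    """
    new_strand = "".join(COMPLEMENT[base] for base in template_3to5)
    lanes = {"A": [], "C": [], "G": [], "T": []}
    for length, base in enumerate(new_strand, start=1):
        lanes[base].append(length)  # one band of this size in this lane
    return lanes


def read_gel(lanes):
    """Read bands from the shortest to the longest product (5'->3')."""
    bands = sorted(
        (length, base) for base, lengths in lanes.items() for length in lengths
    )
    return "".join(base for _, base in bands)


lanes = run_four_reactions("CCTATATTGGGGACA")
print(lanes["A"])       # [3, 5, 7, 8]
print(read_gel(lanes))  # GGATATAACCCCTGT
```

Automated dye-terminator sequencing follows the same logic, except that the four lanes collapse into one and the dye color, rather than the lane, identifies the terminating base.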
Applied Biosystems PRISM 377 (Gel, 34-96 lanes)
Applied Biosystems PRISM 3700 (Capillary, 96 capillaries)
Applied Biosystems PRISM 3100 (Capillary, 16 capillaries)

“virtual autorad” - real-time DNA sequence output from ABI 377

1. Trace files (dye signals) are analyzed and bases called to create chromatograms.
2. Chromatograms from opposite strands are reconciled with software to create double-stranded sequence data.

Fig. 8.11, Chromatogram of about 250 bp

Pyrosequencing:

1. Based on the “sequencing by synthesis” principle instead of chain termination with dideoxy nucleotides.
2. Developed by Pål Nyrén/Mostafa Ronaghi in 1996.
   1. Immobilize a single template DNA molecule on a bead/substrate and synthesize the complementary strand.
   2. Detect which nucleotide is added at each step. Sequencing (polymerization) doesn’t stop…
3. Complex reaction requiring template DNA, primer, DNA polymerase, ATP sulfurylase, luciferase, apyrase, adenosine 5’ phosphosulfate (APS), and luciferin.
4. As with automated dideoxy sequencing, base incorporation is recorded by detecting emitted light.

Fig. 8.12 Example showing how to read pyrosequencing data

454 Life Sciences Genome Sequencer FLX

Three common platforms:

1. 454 Life Sciences Genome Sequencer FLX
   Read length = 300-500 nt (long fragments)
   Output = 400 Mb
   $5-7K per run
   $1M for mammalian genome
   ~95% base-call accuracy
2. Illumina HiSeq
   Read length = 100 bp (short fragments)
   Output = 600 Gb (2 x 100 bp paired ends)
   $2K per run
   $10K for mammalian genome
   85% base-call accuracy
3. Applied Biosystems SOLiD
   Read length = 75 bp (short fragment)
   Output = 54 Gb (2 x 75 bp paired ends)
   99.99% base-call accuracy

All methods require preparation of a genomic library, and none entails sequencing a single molecule.

Other ways to sequence DNA:

Since 32P-labeled ddNTPs were abandoned in favor of fluorescently labeled ddNTPs, all sequencing has utilized light emission.
This continues to be true for the currently available next-generation sequencing platforms (454, Illumina, SOLiD). One factor contributing to short read lengths is the light-induced degradation of polymerases and the chemistry components (i.e., the dyes). In addition to complex chemistry, sequencing requires optics such as cameras, lasers, and scanners. Fundamentally new ways of sequencing are emerging, including the ability to sequence a single molecule (with no library preparation).

Hydrogen Ion DNA Sequencing:

• Recall that a hydrogen ion (H+) is released when the phosphodiester bond is created.
• This causes a change in pH that can be registered by a pH meter.
• A semiconductor can be used to register this pH change and record the sequence when dNTPs of known composition (A, G, C, T) are combined with polymerase.

Ion Torrent
http://www.iontorrent.com

What’s next? Nanopore sequencing

• Under development since 1995; DNA is passed through a nanopore.
• The bases perturb the electrical current through the pore, and the sequence is read without synthesis, a PCR amplification step, chemical labeling, or optical instrumentation.

Oxford Nanopore Technologies – GridION & MinION
http://www2.technologyreview.com/article/427677/nanopore-sequencing/

Single nucleotide polymorphisms (SNPs):

1. DNA sequences of most individuals are almost identical (>99%).
2. Single base pair differences occur about once every 500-1000 bp.
3. In most populations there is a common allele at each SNP site and one or more less common alleles.
4. SNPs can be used just like other genotyping markers, but there are at most 4 alleles (A, C, G, T).
5. About 3 million SNPs occur in the human genome, and these are becoming popular genetic markers.

Why sequence the entire genome or even whole genes?

How to type SNPs:

1. SNPs can be typed by hybridizing a complementary oligonucleotide (e.g., single-base extension assay).
2.
If the stringency is high (e.g., high temperature), the oligonucleotide will fail to bind to DNAs carrying a mismatched (variant) base.
3. Many hundreds of SNPs can be tested simultaneously using DNA microarrays (DNA-chips, Gene-Chips, SNP-chips):
   • First developed in the early 1990s.
   • Ordered grid of short, complementary oligonucleotides of known sequence placed at fixed positions on a silicon, glass, or nylon substrate.
   • Oligonucleotides are experimentally determined and are either (1) microspotted or (2) synthesized on the chip.
   • User-defined SNP chips are available commercially and can contain >400,000 different probes.

Fig. 8.14, Typing a SNP with an oligonucleotide.

How to type SNPs (cont.):

1. The SNP chip is designed with an array of user-defined oligonucleotides attached to the substrate (the SNP chip is the probe).
2. Oligonucleotides match each of the common and variant alleles in the population (all alleles of interest).
3. Target DNAs are labeled with a fluorescent tag and hybridized (or not) to the chip.
4. The fluorescence pattern is detected by a laser.
5. Because the oligonucleotides are known, the pattern indicates which alleles the individual possesses.
6. Many different alleles at thousands of different loci can be screened simultaneously in the same experiment.

Fig. 8.14b, SNP chip assay
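
The way a fluorescence pattern translates into a genotype can be sketched as a toy model in Python. This is only an illustration under stated assumptions: high stringency is approximated as "a probe binds only to a perfect complement", real arrays call genotypes from signal intensities rather than yes/no binding, and the locus name, sequences, and function names (hybridizes, genotype) are all hypothetical.

```python
"""Toy SNP-chip genotype call under idealized high-stringency hybridization."""

COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(seq):
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def hybridizes(probe, target):
    """Assumption: a probe binds only if its perfect complement is in the target."""
    return reverse_complement(probe) in target


def genotype(chip, targets):
    """chip maps locus -> {allele: probe}; targets are one individual's labeled DNAs."""
    calls = {}
    for locus, probes in chip.items():
        lit = sorted(
            allele
            for allele, probe in probes.items()
            if any(hybridizes(probe, target) for target in targets)
        )
        if not lit:
            calls[locus] = "no call"
        elif len(lit) == 1:
            calls[locus] = lit[0] + "/" + lit[0]  # one spot lit: homozygote
        else:
            calls[locus] = "/".join(lit)          # two spots lit: heterozygote
    return calls


# One hypothetical locus with an A/G polymorphism in the middle of the probe.
chip = {
    "snp1": {
        "A": reverse_complement("GTTGCAATCCGAT"),
        "G": reverse_complement("GTTGCAGTCCGAT"),
    }
}
heterozygote = ["ACGTTGCAATCCGATG", "ACGTTGCAGTCCGATG"]
homozygote = ["ACGTTGCAATCCGATG"]
print(genotype(chip, heterozygote))  # {'snp1': 'A/G'}
print(genotype(chip, homozygote))    # {'snp1': 'A/A'}
```

Placing the variable base in the middle of the probe mirrors the idea that a single mismatch should be enough to prevent binding at high stringency.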