Introduction

Early Sequencing
• Early sequencing was performed on tRNA using a technique developed by Robert Holley, who published the first structure of a tRNA in 1965. This involved breaking down RNA molecules, then puzzling the pieces back together. However, this was extremely time consuming, and because DNA molecules are so much larger, such methods could not readily be used for DNA sequencing (Sanger 1988).
• Frederick Sanger developed improved methods that allowed sequencing of some DNA fragments up to 50 nucleotides in length. He also realized the potential of copying DNA instead of degrading it (Sanger 1980).
• In 1975 Sanger developed “the plus and minus method.” This included the principle of chain termination during polymerization, and was used to sequence almost an entire genome, that of the φX174 bacteriophage. Despite this result, Sanger was not satisfied and kept searching for better methodology (Sanger 1980).
• Sanger’s major breakthrough, which would become the basis for subsequent techniques, came at a meeting in Germany where Klaus Geider gave him a sample of ddTTP, which terminates chain polymerization upon incorporation. This would be called “the dideoxy method.”
Figure: Addition of ddNTPs (in place of dNTPs) terminates chain elongation. The location of ddNTP insertion within a nucleotide chain can be determined using gel separation. (Image adapted from Sanger 1980)
• Sanger used this method to finish the remainder of the φX174 genome in 1978. These improvements, together with Maxam and Gilbert’s chemical method, attracted wide interest in sequencing (Sanger 1988).
• Using the same methods as Sanger did in 1977, it would have taken more than 1,000,000 years to complete the human genome. Three innovations greatly expedited the sequencing process: shotgun sequencing, PCR, and the automation of sequencing. These developments led to the publication of the first bacterial genome, H. influenzae Rd KW20, in 1995 by Robert Fleischmann (Binnewies 2006).
From early sequencing of tRNA to the publication of a bacterial genome took 30 years, but the groundwork was laid for future sequencing.

Human Genome Project
• Erwin Chargaff could not have imagined the rapid technological advancement in computers and robotics when he made the statement often quoted by critics of the Human Genome Project: “Even the smallest functional DNA varieties seen, those occurring in certain small phages, must contain something like 5,000 nucleotides in a row. We may, therefore, leave the task of reading the complete nucleotide sequence of a DNA to the 21st century, which will, however, have other worries.”
• Progress was deemed so slow that, a decade before publication of the first draft of the human genome, Bart Barrell considered the HGP premature, notwithstanding the outstanding advances made by then.
• The magnitude of progress made in rapid DNA sequencing has been phenomenal.
• The prohibitive cost of sequencing a human genome can be reduced through novel detection assays, miniaturization of instrumentation, microfluidic separation techniques, and an increase in the number of assays per run.
• Novel detection assays are mostly modifications of the Sanger sequencing assay, but non-Sanger methods such as pyrosequencing, and several modifications of it, hold promise of dramatically reducing the cost of genome sequencing.

Current and Developing Techniques

Sequencing By Hybridization (SBH)
• The array contains all possible oligonucleotide sequences of a given length.
• DNA of unknown sequence is incubated with the array.
• The target hybridizes to the array wherever a probe is complementary to a portion of the target. (Oliver)
• Hybridization of oligos is detected by fluorescence.
• The probes are organized by their overlaps with one another to reconstruct the target sequence.
Limitations
• Difficult to reconstruct long sequences.
• Very large libraries are required.
• The normal approach to SBH is also sensitive to errors.
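The overlap-based reconstruction step of SBH, and the reason long sequences are hard to reconstruct, can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: hybridization is error-free, every probe of length k that matches the target is detected exactly once, and the helper names `spectrum` and `reconstruct` are hypothetical.

```python
def spectrum(target, k):
    """Set of all length-k probes that would hybridize to `target` (ideal array)."""
    return {target[i:i + k] for i in range(len(target) - k + 1)}


def reconstruct(probes):
    """Chain probes by their (k-1)-base overlaps to rebuild the target.

    Raises ValueError when the overlaps do not define a single path,
    which is what happens once a (k-1)-mer repeats in the target.
    """
    remaining = set(probes)
    suffixes = {p[1:] for p in remaining}
    # The first probe is the one whose prefix is not the suffix of any probe.
    starts = [p for p in remaining if p[:-1] not in suffixes]
    if len(starts) != 1:
        raise ValueError("cannot identify a unique starting probe")
    current = starts[0]
    remaining.remove(current)
    sequence = current
    while remaining:
        candidates = [p for p in remaining if p[:-1] == current[1:]]
        if len(candidates) != 1:
            raise ValueError("ambiguous or missing overlap after " + current)
        current = candidates[0]
        remaining.remove(current)
        sequence += current[-1]  # each new probe contributes one new base
    return sequence


if __name__ == "__main__":
    print(reconstruct(spectrum("ATGCAGTC", 3)))  # unambiguous: every 2-mer is unique
    try:
        reconstruct(spectrum("ATGCGTGA", 3))     # "TG" repeats, so the path forks
    except ValueError as err:
        print("reconstruction failed:", err)
```

The second call fails because the overlap "TG" occurs twice in the target, so two probes can follow "ATG". Longer targets make such repeats more likely unless the probe length, and with it the library size, grows.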
Latest Improvement and Advantages
• Probes containing universal bases are used instead of normal oligonucleotides.
• By acting as spacers, the universal bases make consecutive probes less dependent on one another.
• These probes are less sensitive to errors.
• Larger libraries are not required.

Sequencing-By-Synthesis (SBS)
SBS involves detecting the identity of each nucleotide immediately after its incorporation into a growing strand of DNA in a polymerase reaction. SBS methods include "fluorescent in situ sequencing" (FISSEQ) and pyrosequencing (Seo 2005).
• A different fluorophore is linked to each of the four bases through a photocleavable linker.
• DNA polymerase incorporates a single complementary nucleotide analogue.
• The unique fluorescence emission detected depends on the nucleotide incorporated.
• The fluorophore is subsequently removed photochemically, the 3'-OH group is chemically regenerated, and the cycle proceeds.
Advantages
• Allows parallel sequencing.
• Use of photons requires no additional chemical reagents.
• Clean products with no need for subsequent purification.

SBS workflow developed by 454 Life Sciences
• Avidin-biotin purification of A/B fragments.
• Emulsify beads and PCR reagents in water-in-oil microreactors.
• Clonal amplification occurs inside the microreactors.
• Break the microreactors and enrich for DNA-positive beads.
• A single clonally amplified sstDNA bead is deposited per well.
• Well diameter: 44 µm.
• 4 bases (TACG) cycled 42 times.
• Chemiluminescent signal generation.
• Signal processing to determine base sequence and quality score.
• 200,000 reads obtained in parallel.

Pyrosequencing
• Non-Sanger, nonfluorescent technique that quantitatively measures released PPi.
• The pyrogram corresponds to the complementary base. (Source: Biotage)
Applications and advantages
• SNP analysis.
• Ideal for rapidly mutating organisms.
• Quantification provides additional data.
Limitations
• Short sequence reads.
• Homopolymer repeat problems.

Cyclic Reversible Terminator (CRT)
Sequencing by CRT consists of three steps: incorporation, imaging, and deprotection.
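The three-step CRT cycle can be sketched as a small state machine in Python. This is an illustrative model only: the dye colours, the class and function names, and the assumption that every cleavage succeeds are hypothetical, and the template is assumed to be written in the order the polymerase reads it.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}
# Hypothetical dye assignment; any four distinguishable labels would do.
DYE_FOR_BASE = {"A": "green", "C": "blue", "G": "yellow", "T": "red"}
BASE_FOR_DYE = {dye: base for base, dye in DYE_FOR_BASE.items()}


class GrowingStrand:
    """Primer strand extended one reversibly terminated nucleotide per cycle."""

    def __init__(self, template):
        self.template = template
        self.length = 0       # nucleotides incorporated so far
        self.blocked = False  # True while the terminator is still attached
        self.dye = None       # dye currently attached to the last nucleotide

    def incorporate(self):
        """Step 1: add exactly one labeled, terminated nucleotide."""
        if self.blocked:
            raise RuntimeError("terminator not cleaved; cannot extend")
        base = COMPLEMENT[self.template[self.length]]
        self.length += 1
        self.dye = DYE_FOR_BASE[base]
        self.blocked = True

    def image(self):
        """Step 2: read the colour of the attached dye."""
        return self.dye

    def deprotect(self):
        """Step 3: cleave dye and terminating group so the next cycle can run."""
        self.dye = None
        self.blocked = False


def crt_read(template, cycles):
    strand = GrowingStrand(template)
    read = []
    for _ in range(min(cycles, len(template))):
        strand.incorporate()
        read.append(BASE_FOR_DYE[strand.image()])
        strand.deprotect()
    return "".join(read)


if __name__ == "__main__":
    print(crt_read("TACGGA", cycles=4))
```

Skipping `deprotect()` makes the next `incorporate()` call fail, which mirrors why incomplete cleavage limits the method.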
The reversible terminator must be cleaved efficiently, for example by using a photocleavable group such as 2-nitrobenzyl.

Polony Technology
• A polony (polymerase colony) is the amplified product of a single DNA molecule in an acrylamide gel.
• Sequencing is done by incorporating cleavable, fluorescently labeled nucleotides.
Advantage
• Easy scalability using 1 μm magnetic beads.
Disadvantage
• Failure to cleave the dye moiety.
Limitations
• Short read length.
• Error prone.

Comparative Genome Sequencing
• Test DNA is hybridized with reference DNA to identify regions of genomic difference.
• The regions that differ are sequenced to identify SNPs.
Advantages
• Fast, accurate sequencing of the regions of interest.

Microfluidic Separation Platforms
“Lab-on-a-chip” concept: integration of all sequencing steps, including PCR amplification, sample purification, and capillary electrophoresis, using the Sanger sequencing method.
• Integrated separation using 384 channels, sequencing 560 bases with 99% accuracy.
• High throughput and a 6-fold decrease in analysis time.
• Reduced reagent and sample volumes.
• Potential for a low-cost commercial product.
Limitations
• Longer read lengths are required.
• Rate-limiting step: sample preparation. (Blazej et al. 2006)

Applications
With faster, transportable, low-cost sequencing, applications include:
• Individual sequencing leading to personalized medicine and gene therapy.
• Rapid identification and characterization of pathogens.
• Profiling tumor subtypes for diagnosis and prognosis.
• Hypothesis testing for genotype/phenotype relationships.
• Understanding B- and T-cell receptor diversity to allow antibody selection.

Ethical Issues
Advances in sequencing technology and its declining cost will make genotype-phenotype information accessible to the scientific community. Issues arising within the scientific community and the public include identifiability of individuals, intellectual property vs. individual property rights to genomic sequences, requirements to “share” research results, and targeting research towards or away from certain races and cultures based on cost-benefit considerations.

Single-Pair FRET (spFRET) (Mitra)
• Determines the order of nonconsecutive nucleotide additions.
• Cy3-labeled UTP, the donor dye, is incorporated into the primer strand. Subsequent incorporation of a complementary Cy5-labeled UTP or Cy5-labeled dCTP substrate results in an spFRET signal.
• Cycle: photobleaching of the Cy5 dye, addition of the natural nucleotides dATP and dGTP, addition of a Cy5-labeled dNTP; the cycle then continues.
Advantages
• High degree of parallelization.
• Sparing use of reagents.

[Poster panel titles and diagram labels, layout not recoverable: Conclusion; Sequencing Reaction; Miniaturization; Informatics; Multiplexing Reactions; Integration of Technologies; Automation; Data acquisition; Robotics; Software control; Microfluidics; Throughput; Sensitivity; Pyrosequencing; Sequencing-By-Synthesis (SBS); Single-nucleotide addition (SNA); Nebulization of genome; sstDNA library with adaptors; Selection (isolate AB fragment only)]

Advantages (Harvard Nanopore Group)
• Avoids gel electrophoresis; functions in a highly parallel fashion; high throughput, speed, and accuracy.

Pulsed Multi-line Excitation (PME)
• A method for multifluorescence discrimination of nucleotides separated by CE.
• A four-laser, four-dye system excites each dye near its absorption maximum, giving uniformly intense emission signals.
• Eliminates cross-talk between dye channels.
• A high fluorescent signal is collected, making it easier to resolve the correct sequence (Lewis et al. 2005).
• Less processing of fluorescent data is required.
Future Implications
• Potential for a transportable and compact DNA sequencing system.
• Higher sensitivity, quicker analysis, and lower cost: shortened preparation time, reduced sample and reagent volumes, and less data processing.
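The four-laser, four-dye detection scheme described above lends itself to a very simple base-calling rule, sketched here in Python. The sketch assumes that the lasers fire one after another, that each laser excites essentially only its matched dye, and that a single detector records one intensity per laser pulse. The pulse-to-base order, the `min_ratio` threshold, and the function names are hypothetical choices for illustration, not values from Lewis et al.

```python
# Hypothetical firing order: pulse i excites the dye attached to PULSE_BASES[i].
PULSE_BASES = ("A", "C", "G", "T")


def call_base(pulse_signals, min_ratio=2.0):
    """Call one base from four detector readings, one per laser pulse.

    The dye identity comes from *which pulse* produced the signal, not from
    the emission colour. Returns "N" when no pulse clearly dominates.
    """
    if len(pulse_signals) != len(PULSE_BASES):
        raise ValueError("expected one reading per laser pulse")
    ranked = sorted(zip(pulse_signals, PULSE_BASES), reverse=True)
    (best, base), (runner_up, _) = ranked[0], ranked[1]
    if best <= 0 or best < min_ratio * runner_up:
        return "N"
    return base


def call_sequence(peaks, min_ratio=2.0):
    """Call a read from successive electrophoresis peaks."""
    return "".join(call_base(p, min_ratio) for p in peaks)


if __name__ == "__main__":
    peaks = [
        (9.1, 0.2, 0.1, 0.3),  # strong response to the first pulse
        (0.2, 0.1, 8.7, 0.2),  # strong response to the third pulse
        (1.0, 1.2, 0.1, 0.1),  # two similar responses: ambiguous
    ]
    print(call_sequence(peaks))
```

Because each dye responds mainly to its own laser, the call reduces to picking the dominant pulse. That is one way to read the claim that less processing of fluorescent data is required.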
Nanopore Sequencing
• Utilizes a nanoscale device that translocates polymer molecules in sequential monomer order through a very small volume of space.
• Includes a detector that directly converts characteristic features of the translocating polymer into an electrical signal.
• Transduction and recognition occur in real time, on a molecule-by-molecule basis.
• It can probe thousands of different molecules in a few minutes.
• It can probe very long lengths of DNA.

[Poster panel titles and diagram labels, layout not recoverable: Ligation; Efficiency; Pulsed Multi-line Excitation (PME); Flow diagram of SBS developed by 454 Life Sciences; Anneal sstDNA to an excess of DNA capture beads; $1000; Speed; Nanoscaling; Comparative Genome Sequencing]

How do the newest-latest DNA sequencing technologies work and what applications become possible with much cheaper sequencing?
Arun Ammayappan, Ernest Nyannor, Jason Sinclair, Senthilkumar Palaniyandi and Sandi Kirsch

Conclusion
With the advancements in sequencing and its marriage with computer science, the face of biology has been altered. Biology will merge with computer science, mathematics, and physics as never before (Yao 2002), adding to the advancements already made. However, let's not forget a meeting in Germany 40 years ago when one man told another that he had some ddTTP.

References
• Bart Barrell, 1991. DNA sequencing: present limitations and prospects for the future. FASEB J. 5: 40-45
• Robert G. Blazej et al., 2006. Microfabricated bioprocessor for integrated nanoliter-scale Sanger DNA sequencing. PNAS 103(19): 7240-7245
• Biotage. http://www.pyrosequencing.com/DynPage.aspx?id=8726&mn1=1366
• Morris W. Foster and Richard R. Sharp, 2006. Ethical issues in medical-sequencing research: implications of genotype-phenotype studies for individuals and populations. Hum. Mol. Gen. 15(R1): R45-R49
• E. K. Lewis et al., 2005. Color-blind fluorescence detection for four-color DNA sequencing. PNAS 102(15): 5346-5351
• Oliver. http://www.chem.brown.edu/faculty/oliver/slide1.htm
• Mitra. http://cbcg.lbl.gov/Genome9/Talks/mitra.pdf
• M. L. Metzker, 2005. Emerging technologies in DNA sequencing. Genome Res. 15(12): 1767-1776
• NimbleGen. http://www.nimblegen.com/products/cgr/index.html
• Tae Seok Seo et al., 2005. Four-color DNA sequencing by synthesis on a chip using photocleavable fluorescent nucleotides. PNAS 102(17): 5926-5931
• 454 Life Sciences. http://www.454.com/enabling-technology/the-process.asp
• Caitlin Smith, 2005. Genomics: Getting down to details. Nature 435: 991-994
• Harvard Nanopore Group. http://www.mcb.harvard.edu/branton/projects-NanoporeSequencing.htm
• Tim T. Binnewies et al., 2006. Ten years of bacterial genome sequencing: comparative-genomics-based discoveries. Func. Integ. Genomics 6: 165-185
• Frederick Sanger, 1980. Determination of Nucleotide Sequences in DNA. Nobel Lectures, Chemistry 1971-80
• Frederick Sanger, 1988. Sequences, sequences, and sequences. Annu. Rev. Biochem. 57: 1-28
• Toru Yao, 2002. Bioinformatics for the genomic sciences and towards systems biology. Progress in Biophysics and Molecular Biology 80: 23-42