02. DNA sequencing

See:
- Primrose, Twyman & Old, Principles of Gene Manipulation, 6th edition (2002), Chapter 7 (partim: pages 120-131)
- Primrose & Twyman, Principles of Gene Manipulation and Genomics, 7th edition (2006), Chapters 7 (partim: pages 96-111, 77-79 and 213-214)

- 1960-1975
  - DNA polymerase reactions
  - ΦX174 : F. Sanger : plus-minus method
- Chain termination by dideoxynucleotides
- (chemical degradation method : Maxam & Gilbert) (not in 2007-2008)
- Basics of the dideoxy chain-termination method
  - copying a single-stranded template
  - primer oligonucleotide
  - generation of a nested set : unique 5' end
  - separation according to chain length (PAGE) : n versus n+1
  - labelling (detection of products) : 32P (α-32P-dNTP)
- Problem aspects:
  - fractionation : "gel compressions"
  - fractionation : resolution (label, sensitivity)
  - reading distance (bands in parallel lanes)
  - unequal band intensities
  - polymerase stops & pausing

G. Volckaert Genetic engineering : DNA sequencing 13/02/2016

- Initial optimisation
  - Klenow polymerase
    => T7 DNA polymerase (Sequenase) : processivity, dNTP versus ddNTP
    => Taq polymerase (higher temperature)
  - control over gel temperature (avoiding "compressions"!)
  - other labels (35S, 33P, non-isotopic)

    β-radiation   32P         35S         33P       3H           14C
    half-life     14.3 days   87.4 days   25 days   12.3 years   5,730 years

  - primers : by chemical oligonucleotide synthesis
  - template : cloning in M13 vectors (later on: in fasmids)
    "universal" primer, master primer, forward & reverse primer
- Using synthetic primers: double-stranded templates are also allowed
  compare: circular versus linear (presence of 3' ends!)
- PCR: cycle sequencing
  25 cycles : denaturation step, annealing step and elongation/termination step
- Labelling :
  - dNTP (α-phosphate position)
  - primer (5' end)
  - ddNTP (α-phosphate position)
- Compare: fixed moment vs on-line detection (real-time) vs trapping (Pohl)
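
The basics of the dideoxy chain-termination method listed above (primer on a single-stranded template, a nested set with a unique 5' end, separation by chain length, one reaction per ddNTP) can be sketched as a toy model. This is only an illustration under simplifying assumptions: the template, the primer and all function names are hypothetical, every possible termination product is assumed to be present, and real effects such as unequal band intensities, compressions or polymerase pausing are ignored.

```python
# Toy model of dideoxy chain termination (illustrative; all names are hypothetical).
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA string (5'->3')."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def termination_products(template, primer):
    """Map each ddNTP to the chain lengths it produces.

    The primer anneals to the 3' end of the single-stranded template, so
    every product shares the same 5' end (the primer) and differs only in
    the position where a dideoxynucleotide stopped the extension.
    """
    full_copy = reverse_complement(template)  # newly synthesised strand, 5'->3'
    if not full_copy.startswith(primer):
        raise ValueError("primer does not anneal to the 3' end of the template")
    lanes = {"A": [], "C": [], "G": [], "T": []}
    for length in range(len(primer) + 1, len(full_copy) + 1):
        last_base = full_copy[length - 1]  # base incorporated as ddNTP
        lanes[last_base].append(length)
    return lanes


def read_ladder(lanes):
    """Read the four lanes from the shortest to the longest fragment."""
    bands = sorted(
        (length, base) for base, lengths in lanes.items() for length in lengths
    )
    return "".join(base for _, base in bands)


template = "GCTAACGTACGCCAT"  # hypothetical template, written 5'->3'
primer = "ATGG"               # hypothetical primer, complementary to its 3' end
lanes = termination_products(template, primer)
print(lanes)               # chain lengths found in each of the four lanes
print(read_ladder(lanes))  # sequence read upwards from the primer
```

Each chain length occurs in exactly one lane, which is why resolving n from n+1 is enough to read the sequence; the read starts at the first base after the primer.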
- "Automated" sequencing
  Fluorescent labelling for detection (dyes)
  - tagged primer : 1 tag, 4 reactions, 4 lanes
  - more tags with different emission wavelengths : 4 tags, 4 reactions, 1 lane
  - tagged ddNTPs : 4 tags, 1 reaction, 1 lane (are they accepted as substrates by the polymerase?)
  - one lane :
    - only "intra-lane" resolution is important
    - fixed detectors versus scanning detector (idem for laser)
    - corrections may be needed due to differences in dye properties
  - on-line detection => PC => data processing & analysis
  - ET-primers, Bodipy primers
- Examples (see figures)
- Novel equipment : fractionation by capillary electrophoresis : 1, 4, 8, 96 capillaries
  50 µm silica capillary; matrix is often linear polyacrylamide
  Robotics
- Accuracy :
  - accepted error frequency : 1/10,000 to 1/100,000 bp
  - effect of error frequency on encoded gene products
  - analysis on both strands is required
  - preparation of single-stranded templates
    - M13 vectors
    - fasmid (phasmid, phagemid) vectors
    - exonuclease III treatment
    - denaturation
    - aPCR (asymmetric PCR) : unequal primer concentrations
    - strand separation on "magnetic beads" after biotin labelling
      - labelling at a single end
        - filling-in asymmetric restriction ends
        - PCR with one standard + one biotin-labelled primer
      - immobilisation by streptavidin onto the magnetic beads
      - strand separation by NaOH, then physical separation using a magnet
  - adaptation of ends to use fluorescently-labelled primers

Other methods :
- Multiplex sequencing (G. Church & Kiefer-Higgins) (not in 2007-2008)
  - indirect detection by hybridisation onto the ladders
- Pyrosequencing ('sequencing by synthesis')
  - pyrophosphate release during polymerase reactions
  - pyrophosphate => ATP (APS + ATP-sulphurylase)
  - ATP [luciferase] => AMP + PPi + hν
  - relatively short distances (diagnostics, comparative sequencing/confirmation)
- SBH ('sequencing by hybridisation') (R. Drmanac) (not in 2007-2008)
  - sequence complexity : 8-mers => 4^8 = 65536 = 256 x 256
  - DNA immobilised on matrix, labelled oligonucleotides as probes => multiple hybridisation rounds
    or:
  - reverse blots: immobilised oligonucleotides, labelled DNA fragment as probe => limited hybridisation rounds
  - array format : at 0.1 mm : 2.56 x 2.56 cm
                   at 0.04 mm : 1 x 1 cm
  - DNA chips : preparation by in situ synthesis (see chemical DNA synthesis) or direct application of the oligonucleotides
  - microtiter format : 12 x 8 = 96, x4 = 384, x4 = 1536
  - macro-arrays, micro-arrays : => robotics
  - Feasibility: possible with segments of 200 or a few hundred bp. Particularly efficient for comparative (control) (re-)sequencing or analyses of small changes (mutants, polymorphisms, SNPs).
- MALDI-TOF
  - analysis of nested set by mass spectrometry

Analysis of larger DNAs (> 1 kb) (not in 2007-2008)
- definition :
  - CONTIG
  - "reverse sequencing"
- 3 strategies
  - SHOTGUN : accumulation of data into contigs
    - redundancy : => remaining "gaps"
    - both strands
    - routine phase and "finishing" phase
  - systematic deletions : "progressive sequencing"
    - method of Henikoff (+ variations thereof) : using exonuclease III
    - importance of polylinker
    - 5'-protruding, 3'-protruding, blunt : protecting one end from degradation
    - exonuclease III + nuclease S1 or mung-bean nuclease or exonuclease VII
  - "primer walking" sequencing
    - depends on large capacity of oligonucleotide synthesis
    - time lags between consecutive cycles
- at genome scale : several tens of entire genomes have already been sequenced
  - "whole genome shotgun" approach : => computing power! repeated sequences!
  - "minimal tiling path" approach : "hierarchical approach"
  - metagenome sequencing

Novel and/or upcoming approaches (not in 2007-2008)
- 454 sequencing
- supported oligonucleotide ligation and detection (SOLiD)
- single-molecule sequencing
- nanopore sequencing
- optical trapping
- polony sequencing
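
Returning to the shotgun strategy (accumulation of data into contigs, with gaps remaining where reads do not overlap), the idea can be sketched with a toy greedy merge. This is an assumption-laden illustration, not how production assemblers work: the reads and function names are hypothetical, reads are taken to be error-free and from one strand only, and repeated sequences (noted above as a problem at genome scale) would mislead this simple approach.

```python
# Toy greedy contig building for shotgun reads (illustrative; names are hypothetical).


def overlap(a, b, min_len):
    """Length of the longest suffix of a that is a prefix of b (0 if below min_len)."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def assemble(reads, min_len=4):
    """Repeatedly merge the pair of sequences with the longest overlap.

    Returns the list of contigs left when no overlap of at least
    min_len bases remains; more than one contig means a gap is left.
    """
    contigs = list(dict.fromkeys(reads))  # drop duplicate reads, keep order
    # Drop reads that are fully contained in another read.
    contigs = [r for r in contigs if not any(r != o and r in o for o in contigs)]
    while True:
        best_len, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_len:
                        best_len, best_i, best_j = k, i, j
        if best_len == 0:
            return contigs
        merged = contigs[best_i] + contigs[best_j][best_len:]
        contigs = [
            c for idx, c in enumerate(contigs) if idx not in (best_i, best_j)
        ] + [merged]


# Three overlapping hypothetical reads plus one read with no overlap partner.
reads = ["ATGGCGTA", "GCGTACGTT", "ACGTTAGC", "CCCTTTGG"]
contigs = assemble(reads)
print(sorted(contigs))
```

With these reads the three overlapping fragments collapse into one contig, while the fourth read stays separate; in a real project that remaining break is what the "finishing" phase has to close.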