Third generation sequencing

March 2012
Third Generation Sequencing
Barbara Hutter
Division of Theoretical Bioinformatics (B080)
Computational Oncology group
The Next Next Generation
http://seqanswers.com/forums/showthread.php?t=6263
• When does "current generation" become "last generation", and "next generation"
become "current generation", and can "third generation" become "next
generation"?
• Is Sanger still "current generation"?
• Will the GA/454/Solid always be "next generation"?
• Should we stop saying "current" and "next" and start saying "first" and "second"?
• Maybe we should take a tip from Star Trek. After Next generation comes Deep
Space Nine, Voyager, and Enterprise.
• "Next Generation" does have a nicer ring to it than "Markedly Faster than Last
Year (but wait until next year)"
• I like the idea of always having the "next gen" name.
Why Another Generation of Sequencers?
Next (2nd) Generation Sequencing weaknesses:
• PCR steps introduce bias (duplicates, PCR errors)
• alternating phases of nucleotide incorporation and signal detection
• dephasing, quality decreasing towards the end
• short reads
• big expensive machines, expensive chemicals
• long run times
Make things bigger, better, faster, cheaper:
• less input DNA, simpler library preparation, fewer reagents (no “washing”)
• smaller, less expensive machines
• reduced run time
• longer reads
• better quality
• higher throughput
Third Generation Sequencing in the Strict Sense
3rd Gen Seq = real-time sequencing of single DNA (or RNA) molecules
• by synthesis
  • nucleotide incorporation and signal detection occur continuously
  • as fast as the polymerase incorporates nucleotides (750 nt/sec)
  • SMRT (Pacific Biosciences)
• without synthesis or ligation
  • Oxford Nanopore
Between second and third generation: still use “wash-and-scan” technology
• Ion Torrent (Life Technologies)
  • non-optical sequencing
• Helicos Genetic Analysis System
  • single molecule sequencing
Advantages of Single Molecule Sequencing
No PCR
• no PCR amplification bias
• simplified library construction
No synchronization needed => no dephasing
Consensus read
• sequence the same template molecule more than once
• multiple alignment of all the sequences from each template molecule
• construct a consensus read
• => reduce stochastic errors in the single-molecule sequence, gain greater
accuracy than that of raw reads
Direct RNA sequencing
• replacing DNA polymerase with a reverse transcriptase or other RNA-dependent
polymerase
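The consensus-read idea on this slide can be sketched in a few lines. This is a minimal illustration, assuming the repeated passes over one template have already been aligned to equal length with `-` marking gaps. The function name `consensus_read` and the example reads are hypothetical, and a real pipeline would use a quality-aware multiple alignment rather than a plain majority vote.

```python
from collections import Counter


def consensus_read(aligned_passes):
    """Majority-vote consensus over pre-aligned passes of one template.

    Assumes all passes have equal length and use '-' for alignment gaps.
    """
    if not aligned_passes:
        raise ValueError("need at least one pass")
    length = len(aligned_passes[0])
    if any(len(p) != length for p in aligned_passes):
        raise ValueError("passes must be aligned to the same length")

    consensus = []
    for column in zip(*aligned_passes):
        # Most frequent symbol in this alignment column
        base, _count = Counter(column).most_common(1)[0]
        if base != "-":  # a majority gap means the column is dropped
            consensus.append(base)
    return "".join(consensus)


# Hypothetical example: three passes over the same template,
# two of them carrying one random error each.
passes = [
    "ACGTTAGC",  # error-free pass
    "ACGATAGC",  # substitution at position 3
    "ACGTT-GC",  # deletion at position 5
]
print(consensus_read(passes))
```

Because the errors are stochastic, they rarely hit the same position in several passes, so the vote recovers the template even though two of the three raw reads are wrong.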
Ion Torrent I
Very similar to 454 sequencing but with semiconductor technology
Hydrogen ions (H+) released during DNA synthesis change the pH, which is measured by an
ion-sensitive field-effect transistor (ISFET)
Microwells sit on a semiconductor chip with an ion-sensitive layer; below each well is an
ISFET ion sensor
• each microwell contains one single-stranded template DNA molecule to be
sequenced and one DNA polymerase
http://en.wikipedia.org/wiki/Ion_semiconductor_sequencing
Ion Torrent II
Has been purchased by Life Technologies for $375 million
No need for labeled nucleotides and (expensive and bulky) laser equipment
• small and comparably cheap machines
Limited read length (max. 200 bp) and throughput, but very fast
• ~ 100 million bases in 2 hours
Homopolymer errors similar to 454
• a homopolymer run gives a proportionally greater electronic signal, so its exact length is hard to call
Already widely applied for:
• targeted and amplicon sequencing, e.g. for validation of variants detected with
other sequencing platforms
• sequencing of bacterial genomes, e.g. enterohemorrhagic Escherichia coli
(EHEC) in June 2011
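To illustrate why homopolymers are error-prone on flow-based platforms such as Ion Torrent and 454, the sketch below calls bases from one signal value per nucleotide flow. It assumes an idealized signal of about 1.0 per incorporated base. The flow order, the signal values and the function name `call_flows` are hypothetical and not taken from any vendor software.

```python
def call_flows(flow_order, signals):
    """Turn per-flow signals into a base sequence.

    Assumes an idealized signal of ~1.0 per incorporated base, so the
    homopolymer length is the signal rounded to the nearest integer.
    """
    bases = []
    for nucleotide, signal in zip(flow_order, signals):
        run_length = max(0, round(signal))
        bases.append(nucleotide * run_length)
    return "".join(bases)


flow_order = "TACG"  # hypothetical flow order

# Same template (TAAAAACG), two different signal levels on the A flow.
clean = [1.05, 4.9, 1.1, 0.95]
noisy = [1.05, 4.4, 1.1, 0.95]

print(call_flows(flow_order, clean))  # five A's called
print(call_flows(flow_order, noisy))  # four A's called: a deletion
```

A 12% drop in signal turns the 5-base run into a 4-base call, whereas the same relative drop on a single base (1.0 to 0.88) would still round to 1.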
Helicos Genetic Analysis System
Close to 3rd Gen boundary
First DNA-sequencing instrument to operate by imaging individual DNA molecules
Individual DNA molecules fixed to a surface
Proprietary Virtual Terminator nucleotides allow for step-wise sequencing
~ 1 billion molecules sequenced in ~ 8 days
High raw error rate (over 5%) improved by consensus sequencing
Reads only ~ 32 nucleotides
Higher costs than 2nd Gen sequencing
Direct RNA sequencing possible
Helicos BioSciences has re-focused on molecular diagnostics
http://en.wikipedia.org/wiki/Single_molecule_fluorescent_sequencing
SMRT I
Single-Molecule Real Time sequencing
DNA polymerase anchored to the bottom of zero-mode waveguides (ZMW; some tens of nm in diameter) with biotin-streptavidin
Laser light “bends” only 30 nm into the ZMW
Dye is attached to the phosphate and naturally cleaved off during incorporation
Each incorporated nucleotide emits a “flash”
[Figure: ZMW schematic with labels "metal film with holes = ZMWs", "DNA polymerase", "glass slide"]
Schadt EE et al. A window into third-generation sequencing. Hum. Mol. Genet. (2010) 19(R2):R227
SMRT II
Incorporation takes milliseconds, about 3 orders of magnitude slower than diffusion => strong signal of the
incorporated nucleotide overcomes the noise of diffusing ones
Minimal amount of reagents, only added once
dsDNA is circularized => can “read” the same sequence (both strands) several times
Read size limited by bleaching and redox reactions
Standard reads: 1 pass = 1 read, 1 - 2 kb
Short reads: 6 passes, 250 bp, circular consensus sequence (=> high quality)
Strobes: 1 pass, very large (6-10 kb) fragments (“multiple paired end reads”)
• alternate between light pulses (signal) and dark periods in which the polymerase keeps working (gaps)
• structural variants, transcripts, haplotyping, scaffolding in hybrid assembly
SMRT III
Strand-specific
Modified nucleotides on template (e.g. methylated C) can be detected by altered
kinetics (light impulse duration, height and width of peaks, distance between peaks)
• actual detection is complicated
Frequent random errors, including insertions / deletions
• special mapping and assembly software
• ~ 50 Mb mappable reads per run
Speed limit imposed by the imaging equipment
Can use other molecules instead of DNA polymerase
• RNA polymerase => watch transcription
• ribosome (labeled tRNAs) => watch translation
• molecule that binds drugs
Already on the market but not high throughput
Applications:
• targeted sequencing
• hybrid assembly (patch gaps in scaffolds of EHEC genome)
Nanopore Sequencing I
Oxford Nanopore
Nanopore sequencing is a method under development since 1995
Porous transmembrane cellular protein, diameter ~ 1 nm
• modified alpha hemolysin (αHL) or Mycobacterium smegmatis porin A (MspA)
Nanopore is embedded in a synthetic lipid bilayer and immersed in a conducting fluid
Potential (voltage) applied across the bilayer (by salt gradient)
Conduction of ions through the nanopore => electric current
Passage of bases disrupts the current
Schadt EE et al. 2010
Nanopore Sequencing II
Characteristic change in the magnitude of the current through the nanopore
• single nucleotides cleaved off by exonuclease
• threading the whole ssDNA through the nanopore
  • no single-nucleotide resolution because the DNA strand moves too rapidly through the nanopore (1-5 μs per base)
  • engineered nanopores, dsDNA stretches to slow down
• modified bases recognized directly
Pore records each base irrespective of what comes before or after
• homopolymer stretches are resolved correctly
Can read the same DNA molecule several times => consensus sequence
Strand-specific, average read length > 1 kb
Also applicable to other molecules (RNA, amino acids, ...)
No laser equipment, no need for high-speed CCD camera, no chemicals
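To put the 1-5 μs per base figure on this slide into perspective, the small calculation below converts it into bases per second and into the number of current samples a detector would get per base. The 100 kHz sampling rate is a hypothetical value chosen only for illustration, not a specification of any instrument.

```python
# Translocation times per base, from the slide (in seconds)
fast_s_per_base = 1e-6
slow_s_per_base = 5e-6

# Hypothetical detector sampling rate, for illustration only
sampling_rate_hz = 100_000

for s_per_base in (fast_s_per_base, slow_s_per_base):
    bases_per_second = 1 / s_per_base
    samples_per_base = sampling_rate_hz * s_per_base
    print(f"{s_per_base * 1e6:.0f} us/base -> {bases_per_second:,.0f} bases/s, "
          f"{samples_per_base:.1f} samples per base")
```

Under this assumed sampling rate the detector gets less than one sample per base, which is one way to see why the strand has to be slowed down before individual bases can be resolved.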
Nanopore Sequencing III
Oxford Nanopore ready for the market in 2012
http://www.nanoporetech.com/news/press-releases/view/39
Oxford Nanopore's GridION system consists of scalable instruments (nodes) used with
consumable cartridges that contain proprietary array chips for multi-nanopore sensing.
Each GridION node and cartridge is initially designed to deliver tens of Gb of sequence data
per 24 hour period, with the user choosing whether to run for minutes or days according to the
experiment.
Oxford Nanopore has also miniaturised these devices to develop the MinION; a disposable
DNA sequencing device the size of a USB memory stick [...]. A single MinION is expected to
retail at less than $900.
Each cartridge is initially designed for real-time sequencing by 2,000 individual nanopores at
any one time. Alternative configurations with more processing cores will become available in
early 2013 containing over 8,000 nanopores.
Nodes may be clustered in a similar way to computing devices, allowing users to increase the
number of nanopore experiments being conducted at any one time if a faster time-to-result is
required. For example, a 20-node installation using an 8,000 nanopore configuration would be
expected to deliver a complete human genome in 15 minutes.
Each GridION node contains all the computing hardware and control software required for
primary analysis of data as it is streamed from each nanopore, resulting in full length real-time
delivery of complete reads [...].
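As a rough plausibility check on the press-release figures quoted on this slide, the following sketch computes the per-nanopore speed implied by "a complete human genome in 15 minutes" on 20 nodes with 8,000 nanopores each. The genome size of about 3.2 Gb and the 1x coverage are assumptions made here for illustration. The press release does not state the coverage, and a real experiment would need a multiple of it.

```python
# Figures quoted from the press release
nodes = 20
pores_per_node = 8_000
run_seconds = 15 * 60

# Assumptions for illustration (not from the press release)
genome_bases = 3.2e9  # approximate haploid human genome size
coverage = 1          # a real experiment would need more

total_pores = nodes * pores_per_node
bases_per_pore_per_second = genome_bases * coverage / (total_pores * run_seconds)

print(f"{total_pores} pores in parallel")
print(f"{bases_per_pore_per_second:.1f} bases per pore per second at {coverage}x coverage")
```

Changing `coverage` scales the required per-pore speed linearly, so the claim is sensitive to what "a complete human genome" means in practice.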
Other Approaches
Optical multipore detection
• two different fluorescently labeled molecular beacons hybridized to the DNA
• the beacons are sequentially unzipped from the DNA molecules as they are
translocated through a nanopore
• each unzipping event unquenches a new fluorophore
Direct imaging of DNA
• transmission electron microscopy (TEM)
• scanning tunneling microscope (STM) tips
• DNA molecule has to be stretched and fixed on a surface
• no publications for proof of principle yet
Schadt EE et al. 2010
Transistor-mediated DNA sequencing (developed by IBM)
• individual bases of ssDNA molecules pass through nanometer-sized pores =>
unique electronic signature
• surface of the pores consists of axially stratified, alternating layers of metal and
dielectric material (like a transistor; see figure)
• control the motion of the DNA through the pores by modulating the current in the
electrodes of the transistor
Summary Third Generation Sequencing
No “wash-and-scan” technology
“Real time” - really fast
No synchronization required => no dephasing problem
Single molecule sequencing
• no PCR => no bias, simpler library preparation
• strand-specific
• direct detection of modified bases
• also RNA
Improved read quality by consensus sequence from multiple passes
Challenges
• not yet at high throughput
• deletions and insertions are the most frequent errors
• need special mapping and analysis programs
Literature
Schadt EE et al. A window into third-generation sequencing. Hum. Mol. Genet. (2010) 19(R2):R227
Ion Torrent
• http://www.iontorrent.com/publications/
SMRT
• http://www.pacificbiosciences.com/news_and_events/publications
• http://www.aacc.org/events/meeting_proceeding/2011/Documents/OakRidge_Turner_Slides.pdf
Nanopore Sequencing
• http://www.nanoporetech.com/technology/publications
Sequencing Wars - The Third Generation
• http://stocks.investopedia.com/stock-analysis/2010/Sequencing-Wars---TheThird-Generation-ILMN-LIFE-A-CALP-GE0610.aspx#ixzz1op4LPPE8
The Next Generation (Sequencing Experts) at DKFZ
http://www.dkfz.de
Theoretical Bioinformatics (http://ibios.dkfz.de/tbi/)
Computational Oncology Group (Benedikt Brors)
Network Modelling Group (Rainer König)
• Moritz Aschoff – RNA-Seq
• Prakash Balasubramanian – ChIP-Seq
• Volker Ast – RNA-Seq
• Lars Feuerbach – integrative analyses
• Rosario Piro – interactions, pathways
• Michael Heinold – WGS pipeline
• Barbara Hutter – WGS, SOLiD, RNA-Seq, ChIP-Seq
• Natalie Jäger – WGS, SNVs
• Dilafruz Juraeva – GWAS, pathways
• Rolf Kabbe – best cluster administrator ever!
• Nora Rieber – SOLiD, Complete Genomics
• Matthias Schlesner – structural variants, assembly
• Qi Wang – indels
Molecular Genetics
• Volker Hovestadt – miRNA-Seq, WG bisulfite-Seq, methods development
• Marc Zapatka – assembly
Core Facility Genome Sequencing
• Bärbel Lasitschka – WGS pipeline
And you?!