Whole genome sequencing - Center for Biological Sequence Analysis

Course on Introduction to microbial whole genome sequencing and analysis
Mette Voldby Larsen
DTU – Center for Biological Sequence Analysis (CBS)
Henrik Hasman
DTU – National Food Institute
Presentation
• Henrik Hasman
• Ph.D. in molecular microbiology (1999)
• Has been working at DTU – National Food Institute since 2000
• Main topics are antimicrobial resistance, genetic engineering of
microorganisms, and practical applications of NGS in clinical
microbiology.
What do we do
• Applied research in evolution and spread of pathogenic bacteria with
focus on antimicrobial resistance and bacterial typing.
• Drug development for control of infections
• Development of bioinformatic solutions, especially for clinical
microbiology.
• WHO Collaborating Center and EU Reference Laboratory for
antimicrobial resistance (EURL AR).
• Coordinator of COMPARE (Horizon 2020).
Mette Voldby Larsen
2002: Cand. scient. in Biology from University of Copenhagen
2007: PhD in Immunological Bioinformatics from Center for Biological Sequence Analysis (CBS), DTU
2007-2012: Assistant professor at CBS, DTU
2012-: Associate professor at CBS, DTU
> Primary research fields: Developing methods for whole-genome based prediction of
microorganisms' type, phenotype, phylogeny, etc. Recently also phages.
> Teaching, study leader for Human Life Science Engineering
More than 150 employees: one of the largest bioinformatics groups within academia in Europe
The web services run a total of more than 1 million jobs per month.
The flagship is “SignalP”, which predicts protein localization
The course
Learning objectives:
• Understand the most common NGS technologies and terminology.
• Learn how to prepare raw data from the sequencer for further bioinformatic analysis.
• Be able to use tools for in silico detection of plasmid, resistance and virulence genes.
• Be able to perform global and local WGS analysis to determine clonal relationship of
bacteria (SNP, ND, MLST).
• Cases and discussion of relevant literature.
• Learn about metagenomics in clinical microbiology.
Introduction to NGS
Today
Welcome
Introduction to Next Generation Sequencing
Illumina presentation
Intro to sequencing, raw data and assembly
Lunch (Sandwiches)
Journal club
Introduction to CGE single isolate, single services
Computer work w. single isolates and single services
Coffee
Computer work w. single isolates and single services
Wrap-up of computer work
Tomorrow
Welcome back
Case - VTEC diagnostics
Coffee
Introduction to the SNP/ND concept
Computer work w. VTEC
Lunch (Sandwiches)
Wrap-up of computer work
Computer work w. CSIPhylogeny and NDtree
Coffee
Batch upload and the pipeline
The map
Computer work w. batch upload and the map
Sponsored dinner in Lyngby at 18.30
Friday
Welcome back
Wrap-up of computer work
Metagenomics
Coffee
Case - Urine infections
CLCBio presentation
Computer work w. MGMapper/your own data
Lunch (Sandwiches)
Computer work w. MGMapper/your own data
Wrap-up of computer work
Implementing NGS in a clinical laboratory
Future perspectives and GMI/COMPARE
Course evaluation and goodbye
Coffee
And now to you..?
• Who are YOU?
• Where do you come from (country/institution)?
• Your daily work?
• Experience with NGS/WGS?
• Your motivation for joining the course?
Next Generation Sequencing
One method to rule them all…
1981
£35000
2006
£2600
Ray Kurzweil
100-200£ + 2-3£ for App…
Workflow today at the clinical laboratory
Family
Genus
Species
Identification
(Subspecies)
Serovar
Phagetype
Ribotype
Resistograms
Typing
PFGE type
MLVA type
MLST type
DNA Microarray analysis
Full genomic DNA
sequence
Selecting an appropriate typing method can depend on initial (less discriminatory) pretyping.
Going directly for the most discriminatory method can sometimes be misleading.
Typing methods
• Phenotypic
– Serotyping (antibodies)
– Phage typing (virus susceptibility)
– Biotyping (ability to grow in different substrates)
– Antimicrobial resistance
– Protein profiles
• Genotypic
– DNA fingerprint (RAPD, AFLP, ERIC, MLVA)
– DNA sequencing (MLST, spa, dru, full genome)
Workflow with WGS at the clinical laboratory
Didelot et al, 2012.
DNA sequencing
Applied Biosystems (ABI) Genetic analyser
“First Generation” Sequencing machine
(capillary Sanger sequencing)
Limitations
• Limitation
The size of DNA fragments that can be read in this way is about 700 bp, and it takes a long
time to run even a few genes.
• Problem
Most genomes are enormous (e.g. about 3 × 10^9 base pairs for the human genome), so they
cannot be sequenced directly in a single read. Sequencing them is called large-scale sequencing.
Solution
• Solution
– Break the DNA into small fragments at random
– Sequence each readable fragment directly
– Assemble the fragments to reconstruct the original DNA
– Scaffold across the remaining gaps
Solving a one-dimensional jigsaw puzzle with millions of pieces (without the box)!
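The break, sequence and assemble idea can be illustrated with a toy example. The sketch below is only a minimal greedy overlap merger written for this illustration; the function names (`shotgun_fragments`, `greedy_assemble`, and so on) are hypothetical, and it ignores sequencing errors, reverse complements and repeats, all of which real assemblers must handle.

```python
import random


def shotgun_fragments(genome, read_length, n_reads, rng):
    """Sample n_reads fixed-length substrings from random positions of genome."""
    last_start = len(genome) - read_length
    starts = [rng.randrange(last_start + 1) for _ in range(n_reads)]
    return [genome[s:s + read_length] for s in starts]


def overlap(a, b, min_len):
    """Length of the longest suffix of a that equals a prefix of b, or 0 if shorter than min_len."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def drop_contained(reads):
    """Remove exact duplicates and reads fully contained in another read."""
    unique = list(dict.fromkeys(reads))
    return [r for r in unique if not any(r != o and r in o for o in unique)]


def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair with the largest overlap; return the remaining contigs."""
    reads = list(reads)
    while True:
        reads = drop_contained(reads)
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            return reads  # nothing left to merge: these are the contigs
        merged = reads[best_i] + reads[best_j][best_k:]
        reads = [r for idx, r in enumerate(reads) if idx not in (best_i, best_j)]
        reads.append(merged)


if __name__ == "__main__":
    # Three hand-picked overlapping reads from a made-up 14 bp sequence.
    toy_reads = ["ATGCGTA", "GTACGTT", "CGTTAGC"]
    print(greedy_assemble(toy_reads))

    # Random fragments from the same made-up sequence; contigs depend on the sample.
    frags = shotgun_fragments("ATGCGTACGTTAGC", 7, 10, random.Random(0))
    print(greedy_assemble(frags))
```

With too few random fragments the result is several contigs rather than one sequence, which is where the scaffolding step comes in.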
NGS output
Huge numbers of small fragments (35-500 bp)
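To get a feel for what "huge numbers" means, the number of reads can be estimated from the relation coverage = (number of reads × read length) / genome size. The sketch below is a back-of-the-envelope calculation only; the 5 Mbp genome size and 30x target coverage are illustrative assumptions, not values from the slides, and `reads_needed` is a hypothetical helper name.

```python
def reads_needed(genome_size_bp, read_length_bp, coverage):
    """Smallest read count with reads * read_length >= coverage * genome_size."""
    total_bases = coverage * genome_size_bp
    return -(-total_bases // read_length_bp)  # integer ceiling division


GENOME_SIZE_BP = 5_000_000  # assumed bacterial genome size (illustrative)
TARGET_COVERAGE = 30        # assumed average sequencing depth (illustrative)

if __name__ == "__main__":
    # Read lengths taken from the 35-500 bp range mentioned above.
    for read_length in (35, 150, 500):
        n = reads_needed(GENOME_SIZE_BP, read_length, TARGET_COVERAGE)
        print(f"{read_length} bp reads: {n} reads")
```

Under these assumptions, 150 bp reads require one million reads, and shorter reads require proportionally more.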
Second generation sequencing
Platforms
Loman et al, 2012
Next generation sequencing machines
454 Life Sciences (Roche)
First Next Generation Sequencing machine
Illumina HiSeq/GAII systems
High throughput systems
Ion Torrent PGM system
Low/medium throughput system
Illumina MiSeq system
Medium throughput system
Oxford Nanopore (MinION)
Single-molecule sequencing
Client side
Raw DNA sequences
Summary of:
– What is it?
– Has it been seen before?
– How can we fight/treat it?
– What is new/unusual?

Rough assembly
and compression
Fine assembly
Server side
Identification
Gene finding
Comparison
What is already known?
Pathogenicity islands
Virulence genes
Resistance genes
MLST type
What is novel?
Vaccine targets
Virulence genes
Resistance genes
SNPs
Google maps like view
• Reports
– Outbreaks
Workflow with WGS at the clinical laboratory
4-6 hours
Modified from Didelot et al., 2012.
Wet-Lab Workflow
Analysis tools
DNA purification
Library
DNA barcoding