Course on Introduction to microbial whole genome sequencing and analysis

Mette Voldby Larsen
DTU – Center for Biological Sequence Analysis (CBS)

Henrik Hasman
DTU – National Food Institute

Presentation
• Henrik Hasman
• Ph.D. in molecular microbiology (1999)
• Has been working at DTU – National Food Institute since 2000
• Main topics are antimicrobial resistance, genetic engineering of microorganisms, and practical applications of NGS in clinical microbiology.

What do we do
• Applied research in the evolution and spread of pathogenic bacteria, with focus on antimicrobial resistance and bacterial typing.
• Drug development for control of infections.
• Development of bioinformatic solutions, especially for clinical microbiology.
• WHO Collaborating Center and EU Reference Laboratory for antimicrobial resistance (EURL AR).
• Coordinator of COMPARE (Horizon 2020).

Mette Voldby Larsen
2002: Cand. scient. in Biology from University of Copenhagen
2007: PhD in Immunological Bioinformatics from Center for Biological Sequence Analysis (CBS), DTU
2007-2012: Assistant professor at CBS, DTU
2012- : Associate professor at CBS, DTU
> Primary research fields: Developing methods for whole-genome based prediction of microorganisms' type, phenotype, phylogeny etc. Recently also phages.
> Teaching, study leader for Human Life Science Engineering

More than 150 employees: one of the largest bioinformatics groups within academia in Europe.
Web services run a total of more than 1 million jobs per month.
The flagship is “SignalP”, which predicts protein localization.

The course
Learning objectives:
• Understand the most common NGS technologies and terminology.
• Learn how to prepare raw data from the sequencer for further bioinformatic analysis.
• Be able to use tools for in silico detection of plasmid, resistance and virulence genes.
• Be able to perform global and local WGS analysis to determine clonal relationship of bacteria (SNP, ND, MLST).
• Cases and discussion of relevant literature.
• Learn about metagenomics in clinical microbiology.

Introduction to NGS: Today
Welcome
Introduction to Next Generation Sequencing
Illumina presentation
Intro to sequencing, raw data and assembly
Lunch (Sandwiches)
Journal club
Introduction to CGE single isolate, single services
Computer work with single isolates and single services
Coffee
Computer work with single isolates and single services
Wrap-up of computer work

Introduction to NGS: Tomorrow
Welcome back
Case - VTEC diagnostics
Coffee
Introduction to the SNP/ND concept
Computer work with VTEC
Lunch (Sandwiches)
Wrap-up of computer work
Computer work with CSIPhylogeny and NDtree
Coffee
Batch upload and the pipeline
The map
Computer work with batch upload and the map
Sponsored dinner in Lyngby at 18.30

Introduction to NGS: Friday
Welcome back
Wrap-up of computer work
Metagenomics
Coffee
Case - Urine infections
CLCBio presentation
Computer work with MGMapper/your own data
Lunch (Sandwiches)
Computer work with MGMapper/your own data
Wrap-up of computer work
Implementing NGS in a clinical laboratory
Future perspectives and GMI/COMPARE
Course evaluation and goodbye
Coffee

And now to you..?
• Who are YOU?
• Where do you come from (country/institution)?
• Your daily work?
• Experience with NGS/WGS?
• Your motivation for joining the course?

Introduction to NGS
Next Generation Sequencing
One method to rule them all…

1981: £35000
2006: £2600
Ray Kurzweil
£100-200 + £2-3 for App…

Workflow today at the clinical laboratory
Identification: Family, Genus, Species, (Subspecies)
Typing: Serovar, Phagetype, Ribotype, Resistograms, PFGE type, MLVA type, MLST type, DNA microarray analysis, full genomic DNA sequence
Selecting an appropriate typing method can depend on initial (less discriminatory) pretyping. Going directly for the most discriminatory method can sometimes be misleading.
Typing methods
• Phenotypic
– Serotyping (antibodies)
– Phage typing (virus susceptibility)
– Biotyping (ability to grow in different substrates)
– Antimicrobial resistance
– Protein profiles
• Genotypic
– DNA fingerprint (RAPD, AFLP, ERIC, MLVA)
– DNA sequencing (MLST, spa, dru, full genome)

Workflow with WGS at the clinical laboratory
Didelot et al., 2012.

DNA sequencing
Applied Biosystems (ABI) Genetic analyser
“First Generation” sequencing machine (capillary Sanger sequencing)

Limitations
• Limitation
The size of DNA fragments that can be read in this way is about 700 bp, and it takes a long time to run even a few genes.
• Problem
Most genomes are enormous (e.g. about 3 × 10^9 base pairs in the case of human), so they cannot be sequenced directly. Sequencing them is called large-scale sequencing.

Solution
• Solution
– Break the DNA into small fragments randomly
– Sequence the readable fragments directly
– Assemble the fragments together to reconstruct the original DNA
Scaffolder gaps

Solving a one-dimensional jigsaw puzzle with millions of pieces (without the box)!

NGS output
Huge numbers of small fragments (35-500 bp)

Second generation sequencing

Platforms
Loman et al., 2012

Next generation sequencing machines
• 454 Life Sciences (Roche): first Next Generation Sequencing machine
• Illumina HiSeq/GAII systems: high throughput systems
• Ion Torrent PGM system: low/medium throughput system
• Illumina MiSeq system: medium throughput system
• Oxford Nanopore (MinION): single-molecule sequencing

Client side
Raw DNA sequences
Summary of:
• What is it?
• Has it been seen before?
• How can we fight/treat it?
• What is new/unusual?
Rough assembly and compression
Fine assembly

Server side
Identification
Gene finding
Comparison
What is already known?
• Pathogenicity islands
• Virulence genes
• Resistance genes
• MLST type
What is novel?
• Vaccine targets
• Virulence genes
• Resistance genes
• SNPs
Google Maps-like view
• Reports
• Outbreaks

Workflow with WGS at the clinical laboratory
4-6 hours
Modified from Didelot et al., 2012.

Wet-Lab Workflow
Analysis tools
DNA purification
Library
DNA barcoding
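
The fragment-and-assemble idea from the Solution slide (break the DNA randomly, sequence the readable fragments, assemble them again) can be illustrated with a toy greedy overlap assembler in Python. This is a minimal sketch for intuition only and not how production assemblers work: it assumes error-free reads from a single strand, and the function names (fragment, overlap, greedy_assemble), the min_len parameter and the example sequences are all made up for illustration.

```python
import random


def fragment(genome, read_len, n_reads, seed=0):
    """Sample n_reads random, error-free substrings of length read_len."""
    rng = random.Random(seed)
    last_start = len(genome) - read_len
    starts = (rng.randint(0, last_start) for _ in range(n_reads))
    return [genome[s:s + read_len] for s in starts]


def overlap(a, b, min_len):
    """Longest suffix of a that equals a prefix of b (0 if below min_len)."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the two sequences with the largest overlap.

    Returns a list of contigs. More than one contig means the reads
    left a gap that could not be bridged.
    """
    # Drop exact duplicates, then reads fully contained in another read.
    contigs = list(dict.fromkeys(reads))
    contigs = [r for r in contigs
               if not any(r != o and r in o for o in contigs)]

    while len(contigs) > 1:
        best_len, best_i, best_j = 0, -1, -1
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_len:
                        best_len, best_i, best_j = k, i, j
        if best_len == 0:
            break  # no usable overlaps left: remaining contigs are separated by gaps

        merged = contigs[best_i] + contigs[best_j][best_len:]
        contigs = [c for idx, c in enumerate(contigs)
                   if idx not in (best_i, best_j)]
        # Discard anything the new contig now fully contains.
        contigs = [c for c in contigs if c not in merged]
        contigs.append(merged)
    return contigs


if __name__ == "__main__":
    # Hand-picked overlapping reads covering a made-up 15 bp "genome".
    reads = ["ATGGCGT", "GCGTACG", "TACGTTA", "GTTAGC"]
    print(greedy_assemble(reads))  # ['ATGGCGTACGTTAGC']

    # Random reads as a sequencer would produce them; whether they
    # assemble into one contig depends on coverage.
    print(greedy_assemble(fragment("ATGGCGTACGTTAGC", 7, 20, seed=1)))
```

With too few reads, or reads that do not overlap by at least min_len bases, the function returns several contigs instead of one, which is the toy equivalent of the gaps mentioned on the slide. Real genomes add sequencing errors, repeats and both DNA strands, which is why the "jigsaw puzzle without the box" is hard in practice.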