A framework for the future: Building molecular tools to understand the epidemiology of Clostridium difficile in Scotland

Talk Plan
• Highlight extent of problem in 2009
• Progress made – first SIRN grant
  – Evaluation of molecular tools
• Conclusions reached regarding application
• Progress on new grant
  – Aims to link epidemiological data generated from whole genome sequencing with patient healthcare informatics

Scale of problem

Scottish C. difficile reference laboratory set up 2007

Typing methods
• In 2007, numerous typing methods
  – Toxinotyping, pulsed-field gel electrophoresis (North American pulsotype, NAP), restriction endonuclease analysis (REA)
  – Ribotyping
    • Based on amplification of 16S-23S ribosomal RNA intergenic regions

1. PCR amplification
2. Image analysis
3. Comparison with BioNumerics® database
Ribotype 001 – represented first banding pattern observed, 002, ...
O'Neill GL et al. Anaerobe 1996; 2: 205-209

Ribotyping is informative
• Problems with ribotyping
• Lack of discriminatory power
  – Difficulties with band calling
    • Size of band, not sequence
  – Exchangeability between labs
  – Basis of variation/epidemiology

[Figure: Prevalence as a percentage of the ten most frequently recovered ribotypes, pre/post 2009. Ribotypes shown: 001, 027, 106, 002, 005, 014, 015, 020, 023, 078]

Funding enabled
• Determine strengths and weaknesses of several typing methods
  – Multi-locus sequence typing (MLST)
    • Amplification and sequencing of 7 amplicons
  – Multi-locus Variable Number Tandem Repeat Analysis (MLVA)
    • Amplification and sizing of 7 unique regions encoding tandem repeat sequences
  – Whole genome single nucleotide polymorphism (SNP) analysis
    • Sequencing and analysis of entire genome

MLST
• Involves amplification of 7 amplicons
  – C. difficile
    • adk, atpA, dxr, glyA, recA, sodA, tpi
  – Sequence and determine variants in each amplicon

adk   atpA  dxr   glyA  recA  sodA  tpi   type
0     0     1     0     0     0     0     1
0     0     1     0     1     0     0     2

Determine population based on relative similarities/differences in these alleles

96 isolates typed using MLST
• Sequence based/epidemiologically accurate
• No significant advantage over ribotyping
  – Lacked sufficient discrimination for use in outbreaks

Multilocus Variable Number Tandem Repeat Analysis (MLVA)
VNTR = Variable-Number Tandem Repeat
Locus 1
• Strain A: VNTR array 4x3
• atgggtaatccgtcgACgCACgCACgCgccaatcgatacgat
• Strain B: VNTR array 4x5
• atgggtaatccgtcgACgCACgCACgCACgCACgCgccaatcgatacgat
Locus 2
• Strain A: VNTR array 3x4
• ggtaccggtaaagcgcACCACCACCACCttgacactgccggttg
• Strain B: VNTR array 3x6
• ggtaccggtaaagcgcACCACCACCACCACCACCttgacactgccggttg

Data analysis – relative relatedness
Using this approach – determine minimum total distance between strains

MLVA analysis performed on the top 10 ribotypes in Scotland (n = 748)

MLVA analysis of 027 isolates reveals strong geographical clustering over time

Diversity of MLVA loci in the ten most common ribotypes in Scotland.

           MLVA locus (Van den Berg et al, 2007)
Ribotype   A6     B7     C6     E7     F3     G8     H9
001        0.893  0.729  0.910  0.119  0.230  0.506  0.119
002        0.898  0.827  0.916  0.587  0.436  0.711  0.240
005        0.921  0.882  0.930  0.785  0.518  0.760  0.038
014        0.888  0.816  0.878  0.816  0.582  0.551  0.255
015        0.922  0.909  0.936  0.082  0.160  0.806  0.161
020        0.934  0.920  0.927  0.656  0.299  0.865  0.156
023        -      0.790  0.907  0.821  0.457  0.889  0.296
027        0.929  0.773  0.870  0.173  0.497  0.796  0.024
078        -      -      -      0.549  0.128  0.790  0.128
106        0.889  0.836  0.923  0.513  0.096  0.367  0.083

Simpson's Index of Diversity for the seven MLVA loci in the ten most common ribotypes in Scotland. High index values indicate accurate measurement of a highly variable locus.

Whole Genome Sequence Analysis Using SNP (single nucleotide polymorphism)
Scottish isolates included in global analysis of 027 ribotypes
He et al. Nature Genetics 2013

Correlation between whole genome and MLVA analysis
[Figure: whole genome and MLVA relatedness diagrams for isolates labelled GLA 001 to GLA 022; MLVA branch labels range from 1.00 to 4.00]
MLVA sufficiently discriminating to allow tracking within an outbreak situation

Conclusions
• PCR-ribotyping valuable typing tool
  – Lacks sufficient discriminatory power to analyse outbreaks
• MLVA as a supplemental subtyping method
  – Cost effective
  – Highly discriminatory; can discern outbreaks
  – Application limited for some ribotypes
• Whole genome SNP analysis
  – Validated and supported data generated by MLVA
  – Opportunity to use in limited analysis of 078 outbreaks
    • Prediction of phenotypes (antibiotic resistance)

IMPORTANCE OF DATA LINKAGE

2009 Antibiotic policy change
• Reduction in use of
  – Fluoroquinolones
  – Third generation cephalosporins
  – Clindamycin
  – Co-amoxiclav

[Figure: prevalence (%) of ribotypes 001, 027, 106, 002, 005, 014, 015, 020, 023, 078]

Modelling impact of policy change

Resistant vs sensitive   Moxifloxacin   Levofloxacin   Erythromycin
Policy change            0.25           0.50           0.31
Epidemic                 167.89         25.38          54.48
Non-epidemic             0.38           0.5            0.49

• Unclear what is driving this enhanced resistance
• Possibly treatment of a particular population of patients
• Identifiable if we could link with patient data

Going forward

Project Objectives
• To develop a fully automated pipeline to allow rapid SNP analysis of genomic DNA from C. difficile
• To use this pipeline to evaluate
  – Relationship between community- and hospital-associated C. difficile strains
  – Further differentiate epidemiology of ribotype 078 in Scotland
• To link patient health data and C. difficile WGS
  – To identify risk factors that could be reduced through modification of clinical practice

Where are we?
• Have identified 500 strains to be sequenced
  – 100 ribotype 078 strains
  – 134 community-associated strains isolated pre 2009
  – 270 matched community/hospital-associated strains isolated post 2009
• 350/500 have been sequenced
• Automated SNP analysis pipeline generated
• Permissions have been sought and granted to allow patient linkage
• Plan in next 12 months to begin to link the data together
• Long term aim:
  – To develop a unique Scottish electronic resource that can be interrogated by researchers focussed on this infection

Acknowledgments
SSSCDRL: John Coia, Derek Brown
Health Protection Scotland: Camilla Wuiff, A-Lan Banks
University of Glasgow: Jan Lindstrom, Cosmika Goswami, Umer Ijaz, Chris Quince
University of Dundee: Charis Marwick, Peter Doonan, Peter Davey, Nichosa De Souza
University of Strathclyde: Marion Bennie
The Wellcome Trust Sanger Institute: Trevor Lawley, Miao He
Sequencing undertaken at Glasgow Polyomics facility
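
Worked example: MLVA repeat counts and distance
The Locus 1 / Locus 2 slide compares two strains by the number of tandem repeats at each locus, and the data analysis slide refers to a total distance between strains. A minimal sketch of that idea follows, using the four sequences from the slide. It assumes the distance is the summed absolute difference in repeat counts across loci, which the slides do not state explicitly; the function names and the motifs ACGC and ACC (read off the capitalised regions) are illustrative choices, not part of the original analysis.

```python
import re


def count_repeats(sequence: str, motif: str) -> int:
    """Return the longest run of back-to-back copies of motif (case-insensitive)."""
    pattern = f"(?:{re.escape(motif.upper())})+"
    runs = re.findall(pattern, sequence.upper())
    return max((len(run) // len(motif) for run in runs), default=0)


def summed_repeat_difference(profile_a, profile_b):
    """Sum of absolute repeat-count differences over loci (assumed distance measure)."""
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must cover the same loci")
    return sum(abs(a - b) for a, b in zip(profile_a, profile_b))


# (motif, strain A sequence, strain B sequence), copied from the slide
LOCI = [
    ("ACGC",
     "atgggtaatccgtcgACgCACgCACgCgccaatcgatacgat",
     "atgggtaatccgtcgACgCACgCACgCACgCACgCgccaatcgatacgat"),
    ("ACC",
     "ggtaccggtaaagcgcACCACCACCACCttgacactgccggttg",
     "ggtaccggtaaagcgcACCACCACCACCACCACCttgacactgccggttg"),
]

strain_a = [count_repeats(seq_a, motif) for motif, seq_a, _ in LOCI]
strain_b = [count_repeats(seq_b, motif) for motif, _, seq_b in LOCI]
distance = summed_repeat_difference(strain_a, strain_b)

print(strain_a, strain_b, distance)  # [3, 4] [5, 6] 4
```

In practice repeat counts come from amplicon sizing rather than from sequence, so the counting step here only illustrates what is being measured.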
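
Worked example: Simpson's Index of Diversity
The diversity table reports Simpson's Index of Diversity per MLVA locus. The slides do not give the formula, so this sketch assumes the form commonly used to compare typing methods, D = 1 - Σ n_j(n_j - 1) / (N(N - 1)), where n_j is the number of isolates carrying allele j and N is the total number of isolates. The repeat counts in the example are hypothetical, not data from the study.

```python
from collections import Counter


def simpsons_index_of_diversity(alleles):
    """D = 1 - sum(n_j * (n_j - 1)) / (N * (N - 1)) over observed alleles."""
    n_total = len(alleles)
    if n_total < 2:
        raise ValueError("need at least two isolates")
    counts = Counter(alleles)
    same_pairs = sum(n * (n - 1) for n in counts.values())
    return 1 - same_pairs / (n_total * (n_total - 1))


# Hypothetical repeat counts at one locus for four isolates
no_variation = [7, 7, 7, 7]     # every isolate identical
two_alleles = [3, 3, 5, 5]      # two alleles, evenly split
all_distinct = [1, 2, 3, 4]     # every isolate different

print(simpsons_index_of_diversity(no_variation))   # 0.0
print(simpsons_index_of_diversity(all_distinct))   # 1.0
print(round(simpsons_index_of_diversity(two_alleles), 3))  # 0.667
```

A value near 0 means the locus hardly varies within the ribotype; a value near 1 means almost every isolate has a different repeat count.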
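
Worked example: pairwise SNP distance
The whole genome slides compare isolates by single nucleotide polymorphisms. The talk does not describe how the automated pipeline computes this, so the following is only a sketch of the underlying idea: count positions that differ between two aligned sequences, skipping ambiguous or gapped positions. The isolate names and sequences are hypothetical, and a real pipeline would first map reads to a reference and call variants.

```python
from itertools import combinations


def snp_distance(seq_a: str, seq_b: str, ignore: str = "N-") -> int:
    """Count differing positions in two aligned sequences, skipping ignored symbols."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(
        1
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a != b and a not in ignore and b not in ignore
    )


def pairwise_snp_distances(alignment: dict) -> dict:
    """Return {(name_1, name_2): SNP distance} for every pair of isolates."""
    return {
        (x, y): snp_distance(alignment[x], alignment[y])
        for x, y in combinations(sorted(alignment), 2)
    }


# Hypothetical aligned fragments, for illustration only
alignment = {
    "isolate_1": "ACGTACGTAC",
    "isolate_2": "ACGTACGTAT",
    "isolate_3": "ACGAACGNAT",
}

distances = pairwise_snp_distances(alignment)
for pair, d in distances.items():
    print(pair, d)
```

A matrix like this is the usual input for clustering isolates and for checking whether SNP-based groupings agree with MLVA groupings.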