Towards an understanding of global patterns of simple sequence repeat-mediated phase variation during host persistence of Campylobacter jejuni and Neisseria meningitidis

Chris Bayliss
RCUK Research Fellow
Department of Genetics, University of Leicester
Edinburgh Workshop, 29-30th September 2010

Outline
• Overview of my research areas
• Intro to SSRs and phase variation
• Measuring mutation rates/patterns
• Phase variation of C. jejuni genes in in vitro and in vivo models
• Models of SSR-phase variation
• Issues

My Research: Phase Variation
[Diagram. Labels: Experimental models/Epidemiological samples; Mechanistic studies; Campylobacter jejuni; In vitro models; In silico models; Impact of phase variation rate on population structure; Colonisation of chickens; Carriage samples; Combined model; Neisseria meningitidis; Disease samples; Hb receptors/reversible selection model; Haemophilus influenzae; R-M systems/Phage infection; Selection of phase variants]

Consequences of Localised Hypermutation: Phase Variation
[Diagram: ON <-> OFF <-> ON, driven by SELECTION/MUTATION and MUTATION]
Frequency = 10^-2 to 10^-4

Streisinger Model
[Diagram sequence: Insertion; Deletion]

In-Frame Repeats
ATG...CAAT(30)...//...TAG   ON
ATG...CAAT(29)...TAG        OFF
ATG...CAAT(28)...TAG        OFF
ATG...CAAT(27)...//...TAG   ON

Promoter-Located Repeats
  -35               -10
ATTATA...TA(10)...ATTAAA...//...ATG   ON
ATTATA...TA(9)....ATTAAA...//...ATG   OFF

Functions of the Products of Repeat-Associated Genes
• Flagella Biosynthetic Enzymes
• Capsule Biosynthetic Enzymes
• LOS/LPS Biosynthetic Enzymes
• Iron Acquisition Proteins
• Restriction Enzyme
• Adhesins

Long Tracts of Simple Sequence Repeats in Bacterial Genomes

Repeat Type (min. no. rpts)   G/C (8)   A/T (10)   Di (6)   Tetra (5)   Penta (3)
H. influenzae (Rd)               6         2         0         12          2
N. meningitidis (MC58)          26        11         4          2          5
C. jejuni (NCTC11168)           29         2         0          0          0
E. coli (K12)                   12         0         1          0          0

Length of PolyG/PolyC Repeat Tracts in C. jejuni Contingency Loci
[Bar chart. x-axis: Repeat Tract Length (7, 8, 9, 10, 11, 12, >12); y-axis scale: 0 to 16; strains: 11168, 81-176, 1221, 81116]

Phase Variation of Simple Sequence Contingency Loci
[Diagram: ON <-> OFF <-> ON, driven by SELECTION/MUTATION]
• What are the mutation rates of SSRs?
• What are the determinants of SSR mutation rates?
• What are the fitness implications of differing switching rates?
• What are the roles of selective and non-selective bottlenecks?
• What are the implications of multiple SSCL?

Campylobacter jejuni: Phase Variation Frequencies

Campylobacter jejuni
* Gram-negative commensal of the gastrointestinal tract of birds and widespread environmental contaminant
* Major agent of foodborne gastroenteritis
* Implicated in autoimmune diseases such as Guillain-Barré syndrome

Reporter Constructs for Detecting Phase Variation in Campylobacter jejuni
[Diagram. Labels: cj1139c; lacZ; G8; cat; G8; lacZ; G11; capA (cj0628/cj0629); T6-G11; Strain NCTC11168; ON; CapA (surface-located autotransporter); α-CapA antibodies; On-to-off 'off' variant; Off-to-on 'on' variant]

Colony Blots of C. jejuni strain 11168 probed with anti-CapA
ON-to-OFF: Freq. -ve = 0.03 (filter 1, 9/8/07)
OFF-to-ON: Freq. +ve = 0.03 (filter 4, 23/7/07)

[Diagram. Labels: MHA-VT plates; -2, -3, -4, -5; MHA-VT-XGal plates]
Frequency of variants in 'start colony' = Number of variant cells / Total number of cells

                           H. influenzae       N. meningitidis    C. jejuni
%G+C of Genome             38                  51                 31
MMR Genes                  MutS/MutL/MutH      MutS/MutL          None
SSR Mutation Frequencies   1x10^-3 (AGTC30)    4x10^-5 (G12)      4x10^-3 (G11)
Cis-Acting Factors         Repeat Number       Repeat Number      Repeat Number
Trans-Acting Factors       PolI, RNaseH        MMR, PolIV         Unknown

Mutational Pattern (column alignment lost in extraction): >95% +1/-1; Deletions>Insertions; 90% +1/-1; Unknown; Short: ins>del, Long: del>ins
No environmental factors

Campylobacter jejuni: In vitro/In vivo Passage

PCR-Based Measurement of Repeat Tract Length
[Diagram: FAM label; GGGGGGGGGG repeat tract]

Multiple Passages of Growth in MHB Broth
[Flow diagram]
Day 0: Suspend inoculum; plate dilutions; colony blotting; pick 30 colonies; PCR array; inoculate 5 mL MHB
Day 1: Inoculate 5 mL MHB
Day 2: Inoculate 5 mL MHB
Day 3: Inoculate 5 mL MHB
Day 4: Pellet the cells; colony blotting; plate dilutions; pick 30 colonies; PCR array

Analysis of Phase Variable Genes and Repeat Tracts

CapA Frequency -ve                                               Inoculum   Output
Constant Inoculum (3.5x10^8 cfu; 6 tubes)                        0.29       0.24-0.36
Variable Inoculum (from 3.5x10^8 to 3.5x10^3 cfu; 6 tubes)       0.29       0.27-0.36

Drift, Bottlenecks, Selection and HitchHiking
6 Genes = 64 Genotypes
[Diagram. Labels: Selection; Bottleneck; Random Drift; 0685-on (Mutation/Bottleneck, Mutation/Selection); 1139-off (Mutation/Bottleneck, Mutation/Selection); 0031-on; 0685-on; 1139-off]

Neisseria meningitidis
PorA Phase Variation, Immune Evasion and Variant-Specific Immune Responses During Carriage

Escape Assay
• Modified serum bactericidal assay using large inoculum (1x10^4-1x10^7 cfu) and multiple passages
• LPS phase variants with switches in expression of lgtG mediate escape from mAb B5 (translational switching)
• Escape dependent on size of inoculum, amount of antibody and rate of phase variation
Bayliss et al. 2008 Infect. Immun. 76:5038

PV of porA mediates immune escape in vitro
[Chart. y-axis: Size of Inoculum (1.00E+00 to 1.00E+07); x-axis: Passage (P0 to P4), each passage in 10% human serum +/- mAb 1.2; series labels: 11C, 10C, 5.00E+05, 5.00E+03, (5.00E+3 No Antibody)]
* Variants examined had 10C residues in the porA repeat tract
* Escape is due to pre-existing variants

Correlation of porA PV Expression to Escape
• Repeat tract changes to expression
• Whole cell ELISA and lysate western blotting
[Chart. y-axis: OD 405 (0 to 1); x-axis: mAb Dilutions (0, 0.0001, 0.001, 0.01, 1); series: 11C Repeat, 10C Repeat, 9C Repeat]
* Level of PorA expression is highest when 11C repeat units are present in 8047
* ~3-fold reduction in expression of porA

[Timeline: Week -4, Week 0, Week 4, Week 12, Week 24]

Phase Variation of NadA
Number of tetranucleotide repeats

Volunteer   V43   V51   V52   V54   V58   V59   V88   V138
1st         12    12    12    14    12    13    11    12

2nd: 12 12 14 12 12 9 12
3rd: 12 12 12 12 12 9 12
4th: 12 12 12 9
[Empty cells were lost in extraction, so the 2nd to 4th rows cannot be aligned to volunteers]
OFF: 9 and 12 rpts
All volunteers colonised with Y:P1.21,16:CC174

Computer Models

Multiple simple sequence contingency loci
• Multiple loci = multiple potential genotypes
• Haemophilus influenzae strain Rd has 12 genes containing tetranucleotide repeat tracts, a potential 4096 genotypes (if two genotypes per locus, i.e. ON and OFF)
• Lic2 locus has three genotypes: ON-Strong, ON-Weak and OFF (if all 12 loci had 3 genotypes then there would be 531 441 potential genotypes)

Computer Model 1
• Population founded by a single organism which divides by binary fission
• Three phase variable loci
• Switching occurs in both directions at the same rates
• Mutations occur during division, giving one daughter of the parental genotype and one mutant

Effect of phase variation rate on the amount of genetic diversity produced in 20 generations
[Histograms, one panel per mutation rate (repeat number): 1x10^-6 (< 6); 3.6x10^-5 (10); 1.24x10^-4 (22). x-axis: Number of genotypes (1 to 8); y-axis: Number of populations (0 to 1000)]

Effect of phase variation rate on the production of genotypes with multiple switches
* Solution is when all three loci have switched from OFF to ON.
* 30 generations were used.
* All cells of the parental genotype were removed at generation 20.
* 1000 replicates were performed.

Mutation rate   Number of populations containing solution
3.6x10^-5       21
1.24x10^-4      370

Model 2
Effect of Interval Between Selective Environments
[Diagram: Environment A (Selection for ON Phenotype) <-> Environment B (Selection for OFF Phenotype); Number of Generations between switches: 2,000-100,000]
Variable Repeat Number:
17 = ON = A
18 = OFF = B
19 = OFF = B
20 = ON = A
etc
37 = OFF = B
38 = ON = A
Mike Palmer and Marc Lipsitch

Evolution of Repeat Tracts in the Absence of Selection
[Chart. Axis: Repeat Number (5 to 13)]

Evolution of Repeat Tracts with Selection and in a Fluctuating Environment
[Four panels]
Environmental switch period: 20 000 generations; Fitness advantage: 0.1
Environmental switch period: 4 000 generations; Fitness advantage: 0.1
Environmental switch period: 2 000 generations; Fitness advantage: 0.1
Environmental switch period: 100 generations; Fitness advantage: 0.1

Summary Computer Simulation Model
• Selection is required to maintain large numbers of repeats in the repeat tracts
• Repeat number is determined by the frequency of the environmental switch
• Correlation between repeat number and environmental switch is also influenced by the conferred fitness advantage and mutational pattern

Model 3
• Model phase shifts in multiple loci using known mutation rates (excludes mutational patterns)
• Assumes each locus switches independently of other loci (can set PV rate for each gene, but not scalable with tract length changes)
• Simple deterministic model, average of multiple trees from a Monte Carlo simulation, performed in Excel (maximum of 100 generations)

Sample from Chicken B9
One Isolate B9.1

Gene          cj0045   cj0685   cj1326   capA   cj1139   cj0032
Tract         9        9        10       12     9        9
Phenotype     OFF      ON       OFF      OFF    OFF      ON
Binary code   0        1        0        0      0        1

Note: genotype is not directly correlated with phenotype (e.g. cj0045 is OFF with 9 or 10 repeats)

Coded phenotypes of all 30 colonies for B9

Code     Colonies
010001   10
010100   2
010101   2
110000   3
110001   5
110100   1
110101   7

Drift, Bottlenecks, Selection and HitchHiking
[Diagram as shown earlier]

Modelling Changes in the Distribution of Phase Variants: no selection
6 Phase variable genes = ON/OFF = 64 genotypes
[Bar chart. x-axis: Genotypes (000000 to 111100; 0=off, 1=on); y-axis: Frequency (0 to 0.4); series: Inoc, Output1, Output2, B9]
Output = 100 generations
Output 1 = all genes at G9 PV rate (0.0015)
Output 2 = varied PV rates

Scientific Issues
• What factors to include in a model: mutation rate, mutational pattern, population size, fitness, frequency of environmental switching, bottlenecks, number of loci, number of generations
• How to model: simulation of multiple populations or deterministic model of average solutions

Logistical Issues
• Data collection (sample bias)
• Computational power
• Biological and clinical relevance
• Simultaneous data collection and modelling (local collaborators)
• Relevance to systems biology
• Requirement for a modelling community

Jean-Philipe Gautier, Jacques Marlet, Fadil Bidmos, Nathalie Ingouf, Rebecca Richards, Awais Anjum, Vladimir Manchev, Richard Haig, Julian Ketley (University of Leicester)
Neil Oldfield, Del Ala'Aldeen, Karl Wooldridge, Michael Jones, Paul Barrow (University of Nottingham)
Michael Tretyakov, Alexander Gorban (University of Leicester)
Michael Palmer, Marc Lipsitch, Richard Moxon
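
Worked example: genotype counts and binary coding. The genotype totals quoted under "Multiple simple sequence contingency loci" (4096 and 531 441) and the binary coding used for the Chicken B9 sample can be checked with a few lines of Python. This is only an illustrative sketch; the function names are hypothetical and are not taken from the talk.

```python
def potential_genotypes(n_loci, states_per_locus=2):
    """Number of combinations when every locus has the same number of states."""
    return states_per_locus ** n_loci


def encode_phenotype(states):
    """Code a list of 'ON'/'OFF' phenotypes as a binary string (1 = ON, 0 = OFF)."""
    return "".join("1" if state == "ON" else "0" for state in states)


def frequencies(counts):
    """Convert colony counts per coded phenotype into frequencies."""
    total = sum(counts.values())
    return {code: n / total for code, n in counts.items()}


# Isolate B9.1, gene order: cj0045, cj0685, cj1326, capA, cj1139, cj0032
B9_1 = ["OFF", "ON", "OFF", "OFF", "OFF", "ON"]

# Coded phenotypes of the 30 colonies from chicken B9 (values from the slide)
B9_COUNTS = {
    "010001": 10, "010100": 2, "010101": 2, "110000": 3,
    "110001": 5, "110100": 1, "110101": 7,
}

if __name__ == "__main__":
    print(potential_genotypes(12))      # two states per locus
    print(potential_genotypes(12, 3))   # three states per locus
    print(potential_genotypes(6))       # six C. jejuni genes
    print(encode_phenotype(B9_1))
    print(frequencies(B9_COUNTS))
```

Coding each isolate as a string keeps the 64 possible genotypes of the six-gene set directly comparable between inoculum and output samples.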
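
Illustrative sketch of Computer Model 1. The rules listed for Computer Model 1 (single founder, binary fission, three loci, equal switching rates in both directions, one parental and one possibly mutant daughter per division) can be written as a small stochastic simulation. The code below is a sketch under stated assumptions and is not the program used for the talk: it assumes the founder has all loci OFF, that at most one locus switches per division, and it tracks cell counts per genotype instead of individual cells. All names are hypothetical.

```python
import math
import random
from collections import Counter

N_LOCI = 3  # three phase variable loci, as in Computer Model 1


def sample_binomial(rng, n, p):
    """Draw Binomial(n, p) by jumping between successes (fast when n*p is small)."""
    if n <= 0 or p <= 0.0:
        return 0
    if p >= 1.0:
        return n
    log_q = math.log1p(-p)
    successes, position = 0, 0
    while True:
        u = 1.0 - rng.random()  # uniform on (0, 1], avoids log(0)
        position += int(math.log(u) / log_q) + 1  # geometric gap to next success
        if position > n:
            return successes
        successes += 1


def simulate_population(rate, generations, seed=None):
    """Return a Counter mapping genotype (3-bit integer) to number of cells."""
    rng = random.Random(seed)
    # Assumption: at most one locus switches per division, so a daughter is
    # mutant with probability N_LOCI * rate (requires N_LOCI * rate <= 1).
    p_mutant = N_LOCI * rate
    counts = Counter({0: 1})  # assumed founder: all loci OFF (genotype 000)
    for _generation in range(generations):
        next_counts = Counter()
        for genotype, n in counts.items():
            mutants = sample_binomial(rng, n, p_mutant)
            # One daughter per division is always parental; the other is
            # parental unless it was drawn as a mutant.
            next_counts[genotype] += 2 * n - mutants
            for _mutant in range(mutants):
                locus = rng.randrange(N_LOCI)
                next_counts[genotype ^ (1 << locus)] += 1  # flip ON<->OFF
        counts = next_counts
    return counts


if __name__ == "__main__":
    population = simulate_population(rate=1.24e-4, generations=20, seed=1)
    print(len(population), "genotypes among", sum(population.values()), "cells")
```

Repeating the call with different seeds (for example 1000 replicates per rate) and tabulating `len(population)` gives a histogram of the same kind as the "Number of genotypes" panels, although the numbers will depend on the assumptions above.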
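
Illustrative sketch of a no-selection, independent-locus model. Model 3 and the "no selection" slide describe six loci that switch independently at set PV rates for up to 100 generations. A minimal deterministic version is sketched below. It is not the Excel model from the talk: it assumes symmetric ON/OFF switching at a fixed rate per generation for each locus, no selection and no bottlenecks, and the single-genotype inoculum is a hypothetical starting point. For a symmetric two-state locus with rate r, the probability of being in the opposite state after t generations is (1 - (1 - 2r)^t) / 2.

```python
def switch_probability(rate, generations):
    """Probability that a symmetric two-state locus differs from its start state."""
    return 0.5 * (1.0 - (1.0 - 2.0 * rate) ** generations)


def evolve(freqs, rates, generations):
    """Propagate genotype frequencies with independent loci and no selection.

    freqs: dict mapping genotype (integer, leftmost bit = first locus) to frequency.
    rates: per-locus switching rate per generation (one entry per locus).
    """
    n_loci = len(rates)
    current = dict(freqs)
    for locus, rate in enumerate(rates):
        s = switch_probability(rate, generations)
        bit = 1 << (n_loci - 1 - locus)
        # Mix each genotype with its neighbour that differs only at this locus.
        current = {
            g: (1.0 - s) * current.get(g, 0.0) + s * current.get(g ^ bit, 0.0)
            for g in range(2 ** n_loci)
        }
    return current


if __name__ == "__main__":
    inoculum = {0b010001: 1.0}  # hypothetical inoculum: a single genotype
    rates = [0.0015] * 6        # all genes at the G9 PV rate quoted on the slide
    output = evolve(inoculum, rates, generations=100)
    for genotype, freq in sorted(output.items(), key=lambda kv: -kv[1])[:5]:
        print(format(genotype, "06b"), round(freq, 4))
```

Passing a list of different rates in `rates` corresponds to the "varied PV rates" case; because the loci are treated independently, the order in which they are processed does not change the result.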