Putative Biology of PE/PPE Multigene Families of Mycobacteria and their Relevance with Regard to Evolution
Helmi Mardassi, Institut Pasteur de Tunis

M. tuberculosis, the most successful human pathogen, displays restricted genetic polymorphism and appears to be exceptionally stable
[Figure: lineages LAM, Haarlem…; H37Rv, H37Ra…; Beijing… (Sreevatsan et al., 1997)]

PE/PPE gene families were uncovered owing to the availability of the M. tuberculosis genome sequence (S. Cole et al., 1998)

Structure of the PE family members (99 members)
● PE (N=34): conserved N-terminal sequence (~110 aa PE region) followed by any aa sequence
● PE_PGRS (N=65): conserved N-terminal PE region followed by a C-terminal extension of variable length, the PGRS portion, (GGAGGA)n

Structure and sequence of the PE_PGRS33 member (M. tuberculosis Rv1818c)
(Delogu and Brennan, 2002)

The PPE family (69 members)
All members share a 180 aa N-terminal PPE region; the subfamilies differ in their C-terminal part:
● PPE-MPTR (N=23): (NxGxGNxG)n repeats, ~200 to >3500 aa
● PPE-SVP (N=24): GxxSVPxxW motif, ~200 to ~400 aa
● PPE-PPW (N=10): PxxPxxW motif, ~200 to ~400 aa
● Other PPE (N=12): unique sequence, 0 to ~400 aa

Constraints
1. Repetitive sequences: extensive cross-reactivity (hybridization and antigenic assays)
2. High G+C content: very difficult to amplify by PCR
3. Poorly expressed in the conventional host E. coli (toxic effect? instability?)
4. Very acidic and membrane-associated proteins: notoriously refractory to analysis by two-dimensional electrophoresis and mass spectrometry

PE/PPE genes are particularly abundant within the M. tuberculosis complex

Association of PE/PPE with the ESAT-6 (esx) gene cluster
(Gey van Pittius, 2006)

A plausible scenario for the expansion of the PE/PPE gene families has been recently proposed

Several questions relating to these multigene families came to mind
1. Do all PE and PE_PGRS proteins share similar functions?
2. What is the extent of their genetic variability?
3. Do they really play a role in antigenic variation?
4. Are they associated with enhanced virulence and/or transmissibility?
5. Which members are conserved and essential?

Two functions were immediately proposed
● Antigenic variability
● Interference with antigen processing and presentation in the context of MHC I molecules (Cole et al., 1998)

The PGRS domain of PE_PGRS proteins displays significant sequence similarity with the EBNA1 antigen of EBV (Brennan & Delogu, 2002)
The PGRS domain confers increased stability to GFP protein in eukaryotic cells (Brennan & Delogu, 2002)

PE_PGRS33 (Michael Brennan)
[Diagram: TM region, PE region, PGRS region]
● PGRS33 DNA vaccine: reasonable level of protection
● The vaccine produced an antibody response to the PGRS tail region only, not to the PE region
● PE region: high IFN-γ (low IL-10, no antibody), high Th1 response
● PGRS region: low IFN-γ (high IL-10, high antibody), high Th2 response

PE/PPE are genetically variable
◤ In silico comparative sequence analysis (Cole et al., 1998; Gordon et al., 2001; Fleischmann et al., 2002; Garnier et al., 2003)
◤ Sequence analysis of clinical isolates [PE_PGRS33 (Talarico et al., 2005), PPE8 (Srivastava et al., 2006), PE_PGRS17, PE_PGRS18 (Karboul et al., 2006)]
◤ Microarray data (Tsolaki et al., 2004; Garcia-Pelayo et al., 2004)

The molecular mechanisms that operate to generate the genetic variability in PE/PPE genes
● Dislocations between a replicating strand and its template at repetitive DNA sequences (replication slippage) (Cole et al., 1998; Machowski et al., 2007)
● Intergenic and intragenic recombination/gene conversion events (Cole et al., 1998; Gutacker et al., 2006; Karboul et al., 2006; Lui et al., 2006)
● Microsatellite polymorphism (Sreenu et al., 2006)
● Insertion/deletion events of IS and phage sequences within PE/PPE genes

A genomic library-based amplification strategy (GL-PCR) for efficient mapping of insertion sequences

A typical GL-PCR profile
[Figure: gel panels A and B; lanes M and 1 to 4 for the control vector, MTB14323, and the Haarlem3 MDR-TB outbreak isolate]

Genomic location of IS6110 in the M. tuberculosis reference strain MTB14323
[Figure labels: Rv1755 plcD, Rv0403c mmpS1, Rv2819c, Rv0794c:Rv0795c, Rv2815c:Rv2816c, Rv2017:Rv2018, Rv0171 mce1c, Rv2328 PE23, Rv2352c PPE38, Rv2336]

PE/PPE genes are differentially expressed
◤ DFI
◤ Promoter trap
◤ cDNA microarray
◤ RT-PCR and QRT-PCR

Subcellular location
● PE/PPE proteins are associated with the cell wall and cell membrane fractions and appear to be partly exposed on the cell surface (Doran et al., 1992; Brennan et al., 2001; Banu et al., 2002; Sampson et al., 2001; Okkels et al., 2003; Delogu et al., 2004; Le Moine et al., 2005)
● In silico analysis identified potential beta-barrel outer-membrane structures in 40 PE/PPE proteins (Pajon et al., 2006)
● PE_PGRS33 influences the cellular architecture, colony morphology and bacillus-bacillus interaction (Brennan et al., 2001)

Structural genomics/structural biology determined the crystal structure of a PE/PPE protein complex (Strong et al., 2006)

Additional putative function(s) and relevance of the PE/PPE proteins
1. Antigenic variations
2. Interference with antigen processing and presentation
3. Necessary for replication and persistence of the bacillus within the host cell
4. Vaccine candidate
5. Pre-clinical expression (in the mouse model)
6. Architecture of the bacillus
Diagnostic potential

Phenotypic characteristics as inferred from gene inactivation-based experiments
► A transposon mutant of PPE46 was found to be attenuated for growth in macrophages (Camacho et al., 1999)
► In the M. marinum model, two PE_PGRS genes were found to be essential for the bacillus to replicate in macrophages and persist in host granulomas (Ramakrishnan et al., 2000)
► An M. bovis BCG strain whose PE_PGRS33 expression is abrogated could not infect and survive in macrophages (Brennan et al., 2001)
► PPE31, PPE68, and PE35 are required for growth in vivo during infection of mice (Sassetti et al., 2003)
► PPE25 (Li et al., 2005) and PPE10 (Stewart et al., 2005) mutants seem to be associated with the control of phagosomal acidification

Effect of expression of certain PE_PGRS genes in the non-pathogenic M. smegmatis
● M. smegmatis expressing PE_PGRS33 displayed enhanced colonization of BMM macrophages and increased cell necrosis (Dheenadhayalan et al., 2005)
● PE_PGRS33 elicits TNF-alpha release from macrophages in a TLR2-dependent manner (Basu et al., 2007)
● M. smegmatis expressing the PE_PGRS gene Rv3812c displays increased resistance in vitro to low pH (Karboul et al., in preparation)

CONCLUSION
● Expansion of PE/PPE proteins in pathogenic mycobacteria seems to have been accompanied by functional divergence
● Although several members are homologous, there does not seem to be any compensatory effect.
Thus, a high level of functional specialization could have been reached during evolution
● From these preliminary studies, certain members seem to be endowed with multiple functions
● Although deletion analyses of PE/PPE genes were accompanied by phenotypic characteristics, the detailed molecular mechanisms responsible for the observed effects remain to be demonstrated

The research program relating to PE/PPE gene families carried out at IPT
Primary objective: Evaluate the distribution of PE/PPE genes among mycobacteria and the extent of their genetic variability
Specific aims:
- Comparative sequence analyses of selected genes
- Development of efficient tools for the specific detection of PE/PPE genes (identification of specific probes)

PE/PE_PGRS members subjected to comparative sequence analysis
[Figure: tree of PE/PE_PGRS members divided into groups Gr.1 to Gr.5. Labels as extracted: PE34 Rv3872 Rv3020c Rv1089 Rv3893c Rv0335c Rv3018A Rv3022A Rv0285 Rv1386 Rv2408 Rv2328 Rv3812 Rv1646 Rv2107 Rv1169c Rv3097c Rv2431c Rv0151c Rv0152c Rv0159c Rv0160c Rv1430 Rv3650 Rv1172c Rv2099c Rv1788 Rv1791 Rv2769c Rv1040c Rv3622c Rv1195 Rv3477 Rv0754 Rv1806 Rv2340c Rv0916c Rv3652 Rv1088 Rv1983 Rv2519 Rv1214c Rv0977 Rv0109 Rv1768 Rv1087 Rv2162c Rv1091 Rv1840c Rv1651c Rv3653 Rv2098c Rv1803c Rv0532 Rv0124 Rv1396c Rv0578c Rv1067c Rv1068c Rv1468c Rv3388 Rv0278c Rv0279c Rv0747 Rv0742 Rv2741 Rv1818c Rv0746 Rv2396 Rv1325c Rv0834c Rv2591 Rv3344c Rv3512 Rv0833 Rv2126c Rv3507 Rv0297 Rv1243c Rv2490c Rv0872c Rv2853 Rv2487c Rv3367 Rv3595c Rv0832 Rv3590c Rv2634c Rv1441c Rv3345c Rv0978c Rv0980c Rv2371 Rv2615c Rv1450c Rv1452c Rv3508 Rv3514 Rv3511]

PCR amplification of PE members through the Mycobacterium tuberculosis complex
[Figure: gel panels for Rv0978, Rv0285, Rv0160, Rv1169, Rv3367, Rv1195, Rv1040, Rv0980, Rv1441]

Sequence analysis of 22 PE members in the Mycobacterium tuberculosis complex
RESULTS
[Figure: genes classed as conserved, variable, or highly variable. Counts as extracted: 0 PE, 5 PE_PGRS; 1 PE, 2 PE_PGRS; 4 PE_PGRS, 10 PE. The assignment of counts to classes is not recoverable from the extracted text.]

Among the highly variable genes, two conform with the definition of a duplicated gene pair
[Figure, panel A: genomic region showing Rv0981, PE_PGRS16, Rv0982, RpmF, FadE13, Rv0979c, PE_PGRS17, Rv0976c, PE_PGRS18]
[Figure, panel B: comparison of PE_PGRS17, PE_PGRS18 and PE_PGRS45 across the PE and PGRS domains and the MAR1 and MAR2 regions. Values as extracted: 97 % identity; 85 % nt identity (90 % aa similarity); 98 % nt identity (98 % aa similarity), shown twice; 30 % nt identity (38 % aa similarity); 98 % identity. Coordinates shown: -235, 1, 506, 507, 596, 729, 996, 1206, 1374, 1386]

[Figure: nucleotide alignments of PE_PGRS17 and PE_PGRS18. Dots mark identity with the H37Rv/H37Ra sequence; dashes mark gaps.]

PE_PGRS17, ruler positions 250 to 320
Translation: Q A G S T Y A V A E A A S A T P L Q N V L D A
H37Rv and H37Ra: CAAGCTGGCAGCACCTACGCGGTCGCCGAAGCGGCCAGCGCAACACCGCTGCAGAA------------CGTGCTCGATGC
CDC1551: ......................................................C.GATCGAGCAGGC.C..T.G.GG.T
M. bovis AF2122/97: identical to the CDC1551 line
Variant residues annotated (as extracted): Q, I, E, Q, A, L, G, V

PE_PGRS17, ruler positions 330 to 400
Translation: I N A P V Q S L T G R P L I G D G A N G I D G T G Q
H37Rv and H37Ra: GATCAACGCACCCGTTCAGTCGCTGACCGGGCGCCCATTGATCGGCGACGGCGCGAACGGGATCGACGGGACCGGGCAAG
CDC1551: .......A.G..GAC.G..G.....GTG......AAGC.......T........CC.....GCGCC...C........G.
M. bovis AF2122/97: identical to the CDC1551 line
Variant residues annotated (as extracted): T, T, E, A, V, K, H

PE_PGRS17, ruler positions 410 to 450
Translation: A G G N G G W L W G N G G N G G S
H37Rv and H37Ra: CCGGCGGTAACGGCGGGTGGCTGTGGGGCAACGGCGGCAACGGCGGGTCG
CDC1551: .......GGC......CATCT...................T.........
M. bovis AF2122/97: identical to the CDC1551 line
Variant residues annotated (as extracted): A, I, A, P

PE_PGRS18, ruler positions 250 to 320
Translation: Q A G S T Y A V A E A A S A T P L Q N V L D A
H37Rv and H37Ra: CAAGCTAGCAGCACCTACGCGGTCGCCGAAGCGGCCAGCGCAACACCGCTGCAGAA------------CGTGCTCGATGC
CDC1551: ......................................................C.GATCGAGCAGGC.C..T.G.GG.T
M. bovis AF2122/97: identical to H37Rv/H37Ra, including the 12-nt gap
Variant residues annotated (as extracted): Q, I, E, Q, A, L, G, V

PE_PGRS18, ruler positions 330 to 400
Translation: I N A P V Q S L T G R P L I G D G A N G I D G T G Q
H37Rv and H37Ra: GATCAACGCACCCGTTCAGTCGCTGACCGGGCGCCCATTGATCGGCGACGGCGCGAACGGGATCGACGGGACCGGGCAAG
CDC1551: .......A.G..GAC.G..G.....GTG......AAGC.......T...C....CC.....GCGCC...C........G.
M. bovis AF2122/97: identical to H37Rv/H37Ra
Variant residues annotated (as extracted): T, T, E, A, V, K, R

PE_PGRS18, ruler positions 410 to 450
Translation: A G G N G G W L W G N G G N G G S
H37Rv and H37Ra: CCGGCGGTAACGGCGGGTGGCTGTGGGGCAACGGCGGCAACGGCGGGTCG
CDC1551: .......GGC......CATCT...................T.........
M. bovis AF2122/97: identical to H37Rv/H37Ra
Variant residues annotated (as extracted): A, I, H, A, P

The distribution of the 12/40 polymorphism could define three new PE_PGRS-based groups
[Figure: PE_PGRS17 and PE_PGRS18, each drawn with PE and PGRS domains and MAR1 and MAR2 regions]
● PGRST1: M. tb 210 and M. bovis AF2122/97
● PGRST2: M. tb CDC1551
● PGRST3: M. tb H37Rv and H37Ra

A worldwide collection of tubercle bacilli strains was subjected to sequencing analyses

The 12/40 polymorphism was not randomly distributed
[Figure: map labels PGRST1; PGRST1>PGRST2>PGRST3; PGRST1; PGRST1; PGRST1]

Development of a reverse hybridization assay (PEGAssay) for the large-scale analysis of the 12/40 polymorphism distribution
[Figure: hybridization strips labelled 17 and 18. Samples as extracted: M. tuberculosis (H37Rv), M. tuberculosis (H37Ra), negative control (buffer), M. tuberculosis (Erdman), M. tuberculosis (CDC1551), M. smegmatis (mc²155), M. africanum (ATCC 25420), (ATCC 35782), M. microti (FCC69), M. pinnipedii, M. caprae (CIP 105776), M. bovis (AF2122/97), M. bovis BCG (ATCC 27290)]

Overall, 521 MTBC isolates were analyzed
● 415 M. tuberculosis [108 PGG1 (57 ancestral), 259 PGG2, 48 PGG3]
● 42 M. bovis (5 BCG strains)
● 30 M. africanum (14 A1, 6 A2, 8 A3)
● 17 M. microti (9 voles, 3 llama, 2 cat, 2 human, 1 pig)
● 3 dassie
● 4 M. pinnipedii
● 2 M. caprae
● 6 "M. canettii" and 2 smooth tubercle bacilli

Within the whole collection of MTBC strains, only the three newly defined PGRST types could be identified
Presence (+) or absence (-) of the polymorphism in PE_PGRS17/PE_PGRS18:
● PGRST1 (+/-)
● PGRST2 (+/+)
● PGRST3 (-/-)

PGRST1 was associated with all ancestral tubercle bacilli and was the most abundant
[Figure: panels for the whole collection, New York and New Jersey, Tunisia, South Africa]

Gene conversion involving the two paralogous PE_PGRS genes appears to play a crucial role in the diversification of the modern M. tuberculosis population

Gene conversion is a class of homologous recombination
[Diagram: parental DNAs; gene conversion gives recombinant DNA; crossing over gives recombinant DNAs]

Strand break coupled to mismatch repair as the most plausible explanation for gene conversion
[Diagram labels: +, -, +, +]

The gene conversion event occurred independently multiple times

CONCLUSION
● As far as could be ascertained, this work provides the most obvious example of a gene conversion event in the natural evolution of mycobacterial species
● The findings reinforce the role of gene conversion as a mechanism for the generation of genetic variability associated with PE/PPE families
● Strains of the M. bovis lineage appear to be refractory to gene conversion
● The study offers a new perspective to trace back the evolution of tubercle bacilli and other smooth tubercle bacilli (-/-)

Phylogenetic analysis of smooth tubercle bacilli (referred to as M. prototuberculosis) provided insights into the genetics of strains that might have predominated prior to the expansion of the MTBC (Gutierrez et al., 2005)

The sequence polymorphism within the housekeeping genes of the smooth tubercle bacilli group shows gene mosaicism

Genetic variability of the PE_PGRS duplicated genes

The duplicated PE_PGRS members were previously shown to be preferentially upregulated in vivo

Development of efficient tools for the specific detection of PE/PPE genes

Development of a Perl scripting program for the identification of PE/PE_PGRS member-specific sequences
[Diagram: PE/PE_PGRS database; 30-base window size; example sequences ATCGGGATCCAGGAAT, TCGATCCCCGGTTTTA, ACTATACGCATGTCAT, GCAAGTCCCGTGGGGG]

Script 1 extracts a subsequence of a specified length from the gene sequence, starting at the first base, then shifts to the second base, and so on until the last possible subsequence of the desired length is obtained. These subsequences constitute the candidate primers.
[Diagram: the window sliding along CCTTAAGGTTGCAACACATGTGGGCCTTAGGAGTCGTTGTTTGTTACGTAATGGGCGTTGG, shown eight times]

Script 2 converts a sequence in FASTA format to a format that enables the candidate primers to be searched against the gene sequence. Its output is the candidate primers database.

Script 3 counts the number of times each candidate primer occurs in the gene sequence and extracts those that occur exactly once.
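The window-and-count logic of Scripts 1 and 3 can be illustrated with a minimal Python sketch. It is not the Perl program used in this work: the function names `candidate_windows` and `member_specific_probes` are hypothetical, the FASTA conversion of Script 2 is skipped by assuming the sequences are already loaded into a dictionary, and only the given strand is searched. Probe labels follow the gene_start_end pattern seen in names such as PE30_175_204.

```python
from collections import Counter


def candidate_windows(sequence, k):
    """Return (start, kmer) pairs for every window of length k (1-based start)."""
    sequence = sequence.lower()
    return [(i + 1, sequence[i:i + k]) for i in range(len(sequence) - k + 1)]


def member_specific_probes(genes, k=30):
    """Map each gene name to the k-mers that occur exactly once in the whole set.

    `genes` is a dict of {name: nucleotide sequence}. Only the given strand is
    searched; reverse complements are ignored (a simplifying assumption).
    """
    # Count every window across all genes (overlapping occurrences included).
    totals = Counter()
    for sequence in genes.values():
        for _, kmer in candidate_windows(sequence, k):
            totals[kmer] += 1

    # Keep, per gene, only the windows seen once in the entire collection.
    probes = {}
    for name, sequence in genes.items():
        probes[name] = [
            (f"{name}_{start}_{start + k - 1}", kmer)
            for start, kmer in candidate_windows(sequence, k)
            if totals[kmer] == 1
        ]
    return probes


if __name__ == "__main__":
    # Toy sequences, not real PE/PE_PGRS genes; k=4 keeps the output short.
    toy = {"geneA": "ACGTACGTTT", "geneB": "ACGTGGGA"}
    for name, hits in member_specific_probes(toy, k=4).items():
        for label, kmer in hits:
            print(label, kmer)
```

In the toy run, the window `acgt` is rejected because it occurs twice in geneA and once in geneB, while windows such as `cgta` (geneA, positions 2 to 5) are kept. Counting all windows once with a `Counter` avoids rescanning every gene for every candidate, which matters when the window is shortened to 16 or 20 bases and the number of candidates grows.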
Putative PE/PE_PGRS member-specific primers: 30-mer signature sequences (100% identity)

PE subfamily
PE30_175_204    catggtcaggactatcaagctcttagcgca    Rv3097c
PE30_176_205    atggtcaggactatcaagctcttagcgcac    Rv3097c
PE30_177_206    tggtcaggactatcaagctcttagcgcaca    Rv3097c
PE30_178_207    ggtcaggactatcaagctcttagcgcacag    Rv3097c
PE30_179_208    gtcaggactatcaagctcttagcgcacagc    Rv3097c
PE30_180_209    tcaggactatcaagctcttagcgcacagct    Rv3097c
PE30_181_210    caggactatcaagctcttagcgcacagctt    Rv3097c
PE22_46_75      gcgacactggagtcccttggttcccacatg    Rv2107
PE22_47_76      cgacactggagtcccttggttcccacatgg    Rv2107
PE22_48_77      gacactggagtcccttggttcccacatggc    Rv2107
PE22_49_78      acactggagtcccttggttcccacatggcg    Rv2107
PE2_776_805     ttgcaggcatcacattcgtacacaccaagt    Rv0152c
PE2_777_806     tgcaggcatcacattcgtacacaccaagta    Rv0152c
PE2_778_807     gcaggcatcacattcgtacacaccaagtat    Rv0152c
PE2_779_808     caggcatcacattcgtacacaccaagtatt    Rv0152c
PE26_1290_1319  tatctcaatctcaatacatgacaaccagac    Rv2519
PE26_1288_1317  gttatctcaatctcaatacatgacaaccag    Rv2519
PE26_1289_1318  ttatctcaatctcaatacatgacaaccaga    Rv2519
PE4_862_891     ccggcgaatagtccctacccgacacacatt    Rv0160c
PE12_242_271    tgagagcaagtgcagacgcgtatgcaaccg    Rv1172c
PE18_229_258    gtcaacactctacagatgagctcagggtcg    Rv1788
PE1_1209_1238   cgaaccgaacttggaagtaatcgtcaatct    Rv0151c
PE24_134_163    caattgccgcaatattgctgtcacacgccc    Rv2408

PE_PGRS subfamily
PE_PGRS21_1027_1056  gtcaccttcagtagtagcttaagtggcctt    Rv1087
PE_PGRS21_1028_1057  tcaccttcagtagtagcttaagtggccttt    Rv1087
PE_PGRS21_1029_1058  caccttcagtagtagcttaagtggcctttc    Rv1087
PE_PGRS21_1030_1059  accttcagtagtagcttaagtggcctttcc    Rv1087
PE_PGRS21_1031_1060  ccttcagtagtagcttaagtggcctttccg    Rv1087
PE_PGRS21_1032_1061  cttcagtagtagcttaagtggcctttccgg    Rv1087
PE_PGRS62_585_614    ggcgtactacatccaacagattattagctc    Rv3812
PE_PGRS62_586_615    gcgtactacatccaacagattattagctcg    Rv3812
PE_PGRS62_587_616    cgtactacatccaacagattattagctcgc    Rv3812
PE_PGRS62_588_617    gtactacatccaacagattattagctcgca    Rv3812
PE_PGRS62_589_618    tactacatccaacagattattagctcgcag    Rv3812
PE_PGRS62_590_619    actacatccaacagattattagctcgcaga    Rv3812
PE_PGRS62_591_620    ctacatccaacagattattagctcgcagat    Rv3812

Using a window size of 30 mers
● 9 PE member-specific sequences
● 2 PE_PGRS member-specific sequences

Results of the probe search
Newly identified member-specific sequences, by window size:
● 25 mers: PE 0, PE_PGRS 1
● 20 mers: PE 5, PE_PGRS 10
● 16 mers: PE 13, PE_PGRS 31
● Total (including 30 mers): 27 PE + 44 PE_PGRS = 71 PE member-specific sequences
82.5% coverage

Overall, 2187 member-specific sequences targeting PE and PPE families were derived

Hybridization of a biotinylated PCR product corresponding to a PE_PGRS gene with member-specific sequences of 34 other PE/PE_PGRS gene probes
[Figure labels: a3812, Ps1]

Set-up of a 50-mer-based PE/PPE-specific microarray protocol
[Figure: initial conventional hybridization conditions compared with improved hybridization conditions]

29 polymorphic sites out of a total of 177 could be detected
[Figure: hybridization grid scored +, - or ? for 58 Tunisian isolates and controls (M. bov, M. bov, H37Rv, CDC1551)]

Towards PE/PPE-based phylogenomics

Acknowledgements
Part of this work was supported by funds from the United Nations Development Program/World Bank/World Health Organization Special Program for Research and Training in Tropical Diseases (TDR).
Special thanks to Anis Karboul and Amine Namouchi, Nico Gey van Pittius (US, Cape Town), Roland Brousseau (BRI, Montreal), and Cristina Gutierrez (IP, Paris).