Complex relatedness in the genus Neisseria

Population structure in the Neisseria, and the biological significance of fuzzy species
Jukka Corander, Thomas R. Connor, Clíona A. O’Dwyer, J. Simon Kroll and William P. Hanage
Supplementary online material (SOM)
Details of the meningococcal strains used in the transformation experiments are provided in Table
1.
Additional results of population genetic and sequence analyses
Given the very large size of the Neisseria ST data set, no standard methods of Bayesian Monte
Carlo computation (such as a Gibbs sampler) can be expected to reliably handle population genetic
model fitting and to achieve convergence without extremely large-scale computational resources
(Robert & Casella 2005). To handle highly challenging and complex molecular data sets, BAPS
utilizes analytical integration in combination with a stochastic optimization algorithm using
intelligent search operators to search for the posterior mode in the space of clustering models.
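The general idea of combining analytical integration with a stochastic search for the posterior mode can be illustrated with a toy sketch. This is not the BAPS implementation: the symmetric Dirichlet prior (ALPHA), the uniform prior over partitions, the single "move one sequence" operator and all function names are assumptions made here for illustration only.

```python
import random
from collections import Counter
from math import lgamma

ALPHABET = "ACGT"
ALPHA = 0.25  # assumed symmetric Dirichlet hyperparameter per nucleotide


def cluster_log_ml(seqs):
    """Log marginal likelihood of one cluster of aligned sequences.

    Nucleotide frequencies at each site are integrated out analytically
    (Dirichlet-multinomial), so no frequency parameters are sampled.
    """
    if not seqs:
        return 0.0
    a_sum = ALPHA * len(ALPHABET)
    total = 0.0
    for column in zip(*seqs):
        counts = Counter(column)
        total += lgamma(a_sum) - lgamma(a_sum + len(column))
        total += sum(lgamma(ALPHA + counts[b]) - lgamma(ALPHA) for b in ALPHABET)
    return total


def partition_score(seqs, labels):
    """Sum of cluster log marginal likelihoods (uniform partition prior assumed)."""
    groups = {}
    for seq, label in zip(seqs, labels):
        groups.setdefault(label, []).append(seq)
    return sum(cluster_log_ml(g) for g in groups.values())


def stochastic_search(seqs, n_iter=10000, seed=0):
    """Randomized hill-climb towards a high-scoring partition.

    Starts from singleton clusters and repeatedly proposes moving one
    randomly chosen sequence to a randomly chosen cluster label; a move is
    kept only if it strictly improves the score.
    """
    rng = random.Random(seed)
    labels = list(range(len(seqs)))
    best = partition_score(seqs, labels)
    for _ in range(n_iter):
        i = rng.randrange(len(seqs))
        new_label = rng.randrange(len(seqs))  # may be an empty cluster
        if new_label == labels[i]:
            continue
        old_label = labels[i]
        labels[i] = new_label
        score = partition_score(seqs, labels)
        if score > best + 1e-9:
            best = score
        else:
            labels[i] = old_label  # reject the move
    return labels, best
```

On a made-up input such as five copies of "AAAAAAAA" and five copies of "CCCCCCCC", this search merges identical sequences and never accepts a move that mixes the two types. A single move operator like this one can stall in local optima on real data, which is one reason a practical method would need richer operators such as splitting and merging whole clusters.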
The BAPS analysis decisively clustered the 8619 Neisseria STs into 50 groups which reflect species
boundaries nearly perfectly, apart from a number of important exceptions discussed in detail below.
A total of 42 groups out of the 50 contained only N. meningitidis STs. The number of STs in the N.
meningitidis groups ranged from 49 to 475, with mean 189.9 and SD 92.8. In Table 2 the
exact frequencies of different Neisseria species are shown for the remaining 8 groups identified by
BAPS. Among the 50 detected groups, four two-species mixed groups were present (groups 2, 31,
41, 46), sharing in total 282 N. lactamica and 92 N. meningitidis STs. A single group (50) contained
all the 193 N. gonorrhoeae STs included in the data, and additionally, one N. meningitidis ST was
also included in this group. The STs from other Neisseria species were fairly evenly divided
between three minor mixed groups (34, 48, 49), which also contained all the cases with unclear
species status.
Epidemiological and geographical information about STs within the BAPS groups not containing
solely N. meningitidis is given in Table 2. Due to their high level of similarity and the large number
of groups, we abstain from displaying the exact epidemiological and geographical data for the 42
groups containing N. meningitidis STs only. In summary, STs in these groups have in all cases been
sampled in at least 16 different countries which are spread over at least four continents. Taking into
account the sizes of the different groups, there appear not to be any notable differences in
geographical spread of the different species in the genus Neisseria. Most N. meningitidis groups
contain at least half of the known serotypes, and in particular, serotype B and non-groupables are
consistently present in all groups.
SOM Figs 1-3 display patterns of molecular variation in the aligned and concatenated data within
and between the groups estimated by BAPS. These figures contain a varying number of groups to
provide a better visualization of the shared and distinct patterns over the data. Several important
observations can be made from Figs 1-3. Firstly, mosaic patterns of molecular variation caused by
horizontal transfer of DNA are clearly visible over all the seven housekeeping loci and in most
inferred groups. None of the 42 groups consisting solely of N. meningitidis STs show evidence for a
particularly clonal ancestry, but the patterns of molecular variation indicate frequent recombination
events, which is in harmony with earlier findings reported in the literature (Feil et al. 2001; Hanage et
al. 2005; Zhou et al. 1997). SOM Fig 4 shows how the STs in BAPS groups 34, 48, 49 and 50 are
distributed over the ML tree.
Complete results of the BAPS admixture analysis are presented in Table 3 (see the separate Excel
file). The estimates of admixture between the groups reported in Table 3 are calculated as in Tang et
al. (2009).
Analysis of possible species mis-classification and distribution of non-typable strains
To assess the extent to which a pattern of mixed clusters could emerge under the null hypothesis of
equal mis-classification probability for all isolates, we calculated p-values for the observed
frequencies of N. lactamica and N. meningitidis STs. Assuming that the STs representing
minority species in each mixed cluster were a result of laboratory error, the overall mis-classification rate for any single ST is estimated to be 0.0319 for N. lactamica (in total
282 STs) and 0.0060 for N. meningitidis (in total 8074 STs). Under these two estimated
frequencies as the null models of mis-classification for each ST, the p-values for
observing frequencies of STs from a ‘wrong’ species that are at least as large as those reported in
Table 1 can be calculated using the CDF of the Binomial(n,θ) distribution, where n is the total number
of STs in a group and θ is the overall mis-classification rate for the species in question. The resulting p-values are: <10^-6, <10^-15, 0.0065, 0.0017, for the groups 2, 31, 41, 46, respectively. The p-values
remain clearly significant at the 5% level even under a Bonferroni correction for multiple testing.
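The tail probability and the Bonferroni check described above can be sketched as follows. The mis-classification rates and the four p-values are taken from the text (the first two p-values are reported only as upper bounds). The function name and the example group composition are hypothetical, since the per-group counts are not reproduced here.

```python
from math import exp, lgamma, log, log1p


def binom_upper_tail(k, n, theta):
    """P(X >= k) for X ~ Binomial(n, theta), with 0 < theta < 1.

    Terms are evaluated in log space so that large n does not overflow.
    """
    if k <= 0:
        return 1.0
    log_terms = (
        lgamma(n + 1) - lgamma(i + 1) - lgamma(n - i + 1)
        + i * log(theta) + (n - i) * log1p(-theta)
        for i in range(k, n + 1)
    )
    return min(1.0, sum(exp(t) for t in log_terms))


# Overall mis-classification rates estimated in the text.
THETA_LACTAMICA = 0.0319
THETA_MENINGITIDIS = 0.0060

# Hypothetical group, for illustration only: 60 STs, 5 of them
# from the minority species N. meningitidis.
p_example = binom_upper_tail(5, 60, THETA_MENINGITIDIS)

# Reported p-values per mixed group; groups 2 and 31 are upper bounds.
reported = {2: 1e-6, 31: 1e-15, 41: 0.0065, 46: 0.0017}

# Bonferroni correction: multiply each p-value by the number of tests.
adjusted = {g: min(1.0, p * len(reported)) for g, p in reported.items()}
still_significant = all(p < 0.05 for p in adjusted.values())
```

With four tests, the largest adjusted value is 4 × 0.0065 = 0.026, which stays below 0.05, in line with the statement that significance survives the correction.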
Similar to the above analysis, we examined whether the increased proportion of non-typable
meningococcus STs in groups 2, 31, 41, 46 can be reasonably explained by chance. Under the null
hypothesis that the serotypes of the strains in each group represent a random sample of the observed
population where non-typable STs are present with the overall frequency 21.2%, the Binomial p-values of observing at least the number of non-typable STs reported in Table 2 of the main article
are: <0.0001 for groups 2, 31, 41 and 0.075 for group 46.
Recombination between meningococcus and N. gonorrhoeae.
The results also display evidence for recombination between the two pathogens in the dataset, N.
gonorrhoeae and the meningococcus, for the STs residing in the mixed groups. The gdh and aroE
genes show particularly clear traces of recombination in mixed N. lactamica and N. meningitidis
groups, as well as to some extent also in the multi species group 34. To trace the possible origins of
recombinations for gdh, we searched the NCBI nucleotide collection using the gdh sequence of a
typical single N. meningitidis ST from each of the four mixed two-species groups. The BLAST
results for group 2 are as follows: 100% match to approximately 10 N. gonorrhoeae sequences and a single
N. lactamica sequence, whereas the closest match to an N. meningitidis sequence is only 91%.
Similarly, for group 31 we obtained 98% match for the same N. gonorrhoeae and N. lactamica
sequences as for group 2, while the degree of closest match to N. meningitidis remained unchanged.
In contrast, for the two other mixed groups (41, 46), gdh sequences of N. meningitidis matched
100% to multiple N. meningitidis sequences present in the NCBI collection and no close matches to
other species were discovered.
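The match percentages quoted above are sequence identities. As a rough illustration of how a closest match per species can be summarised, the sketch below computes percent identity between pre-aligned sequences of equal length. It is not BLAST, which performs local alignment and scoring. The function names and the toy sequences in the usage note are made up for this example.

```python
def percent_identity(a, b):
    """Percent of identical positions between two aligned sequences.

    Positions where either sequence has a gap ('-') are skipped.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(x, y) for x, y in zip(a.upper(), b.upper()) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    matches = sum(x == y for x, y in pairs)
    return 100.0 * matches / len(pairs)


def closest_match_by_species(query, database):
    """Best percent identity per species.

    `database` maps a species name to a list of aligned sequences.
    """
    return {
        species: max(percent_identity(query, s) for s in seqs)
        for species, seqs in database.items()
        if seqs
    }
```

For instance, with the toy query "ACGTACGTAC", a database entry "ACGTACGTAC" gives 100.0 and an entry "ACGTACGTAA" gives 90.0, so a species whose best hit is the second entry would be reported at 90% identity.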
Similarly, we repeated the search for origin for the latter part of the aroE gene, which shows
evidence of multiple separate recombination events over the different groups. Here, a search of the
NCBI nucleotide collection using N. meningitidis STs from groups 2 and 31 resulted in a 100%
match with multiple N. gonorrhoeae, N. lactamica and N. meningitidis sequences. When
comparable sequences of N. meningitidis STs from non-mixed groups were used in the search, they
matched 100% to N. meningitidis, while no matches to other species were discovered.
Finally, none of the 193 N. gonorrhoeae STs show any signs of horizontal import of DNA from the
remaining lineages in the data.
Tables and Figures
Isolate        ST     BAPS group   Assigned in a mixed BAPS group
OX9931639      3260   41           Yes
OX01061662     3266   41           Yes
ST99248587     3270   41           Yes
ST00248061     4000   41           Yes
89             753    40           No
295            754    40           No
584            960    40           No
M01, 257107    1186   35           No
518            860    35           No
801            861    35           No
NG H15         43     28           No
90/18311       11     17           No
M99.241026     4037   17           No
BZ 147         48     14           No
BZ 169         32     5            No
Table 1. Meningococcal strains used in the transformation experiments.
Epidemiological status rows: Carrier; Invasive; Meningitis; Disseminated gonococcal infection; Uncomplicated gonorrhoea.
Group   Status counts in the row order above (non-empty cells only)   No. countries where sampled   No. continents where sampled
2       61 (100.0%)                                                    9                             3
31      391 (99.4%); 1 (0.3%); 1 (0.3%)                                22                            4
34      17 (94.4%); 1 (5.6%)                                           6                             3
41      25 (100.0%)                                                    11                            5
46      17 (63.0%); 7 (25.9%); 3 (11.1%)                               13                            4
48      15 (93.8%); 1 (6.2%)                                           16                            4
49      6 (100.0%)                                                     4                             1
50      2 (2.2%); 10 (11.4%); 76 (86.4%)                               24                            6
Table 2. Epidemiological and geographical information about STs within the mixed BAPS groups. Percentages are given in parentheses for
the epidemiological status.
Table 3. (separate Excel file). The estimates of admixture between the 50 BAPS groups. Diagonal values are put to zero and have no meaning.
Figure 1. Inferred BAPS groups for 8619 STs of genus Neisseria. Each row corresponds to a concatenated MLST sequence for an ST and
horizontal black lines indicate boundaries of the 50 clusters. Each column is a sequence position in the aligned data and colors indicate different
nucleotides. An entirely white column corresponds to a monomorphic site.
Figure 2. Two-species mixed BAPS groups 2, 31, 41 and 46 containing a total of 374 N. lactamica and N. meningitidis STs. Molecular variation
is displayed analogously to Figure 1 but color coding may vary over the sites.
Figure 3. The BAPS groups not containing solely N. meningitidis STs. Molecular variation is displayed analogously to Fig 1 but color coding
may vary over the sites.
Figure 4. ML tree for the 8619 STs with BAPS groups 34, 48, 49, 50 marked by colored branches.
Figure 5. Boxplots of transformation frequencies for 27 replicates of the transformation experiments. Replicates 1-6 refer to experiments with
meningococcal strains assigned to mixed BAPS groups (SOM Table 1) and the remaining replicates refer to strains in non-mixed groups. The
three replicates with all observed transformation frequencies equal to zero (2 in the mixed and 1 in the non-mixed groups) are not included in the
figure. Note that the non-displayed replicates were still included in the random effects analysis.
Feil, E. J., Holmes, E. C., Bessen, D. E., Chan, M.-S., Day, N. P. J., Enright, M. C., Goldstein, R., Hood, D. W., Kalia, A., Moore, C. E., Zhou, J.
& Spratt, B. G. 2001 Recombination within natural populations of pathogenic bacteria: short term empirical estimates and long-term
phylogenetic consequences. Proceedings of the National Academy of Sciences USA 98, 182-187.
Hanage, W. P., Fraser, C. & Spratt, B. G. 2005 Fuzzy species among recombinogenic bacteria. BMC Biol 3, 6.
Robert, C. P. & Casella, G. 2005 Monte Carlo Statistical Methods. New York: Springer.
Tang, J., Hanage, W. P., Fraser, C. & Corander, J. 2009 Identifying currents in the gene pool for bacterial populations using an integrative
approach. PLoS Comput Biol 5, e1000455.
Zhou, J., Bowler, L. D. & Spratt, B. G. 1997 Interspecies recombination, and phylogenetic distortions, within the glutamine synthetase and
shikimate dehydrogenase genes of Neisseria meningitidis and commensal Neisseria species. Mol Microbiol 23, 799-812.