Supporting information
S1 MODEL DERIVATION
Here we derive the probability of measuring a proportion of strains ($x$) given an invasion
rate, growth rate and the proportion of strains in the initial inoculum. We start by defining a
model for the density of a strain over time given a growth rate. Then we derive the
probability of an infection occurring at a certain time given an invasion rate. This is then
combined into the probability that one finds a certain density of each strain at a certain time
given the invasion rate and a growth rate. Based on the probability density function of the
density we can then derive the probability for the proportion of two strains given the
invasion rate, growth rates and initial proportion in the inoculum.
Density of a strain
We model growth of a pathogen after invasion. The growth is assumed to be deterministic
and constant. The density n at time t can then be defined as follows:
$$
n(t) =
\begin{cases}
0, & t_0 > t \\
e^{\gamma (t - t_0)}, & t_0 \le t
\end{cases}
$$
with $t_0$ the time of invasion and $\gamma$ the growth rate within the host. Furthermore, we assume
that the time of successful invasion follows an exponential distribution, with the probability
density function defined as follows:
$$
f_{t_0}(t_0 \mid \lambda) =
\begin{cases}
0, & t_0 < 0 \\
\lambda e^{-\lambda t_0}, & t_0 \ge 0
\end{cases}
$$
with $\lambda$ the successful invasion rate. We define successful invasion as an invasion event that
results in a growing population. Under these assumptions the variability of the data results
from the stochasticity of the invasion process and not from the growth within the cell. The
setup of this model is mathematically the same as the model of Baranyi (1998), who used it
to describe lag in bacterial growth. Finally, we combine the previous results, by applying the
density function method, which results in the following probability density function of the
pathogen population size:
0, 0  n  1 or n  e γt

 e t
f n n  , ,t  =  (  ) r 1 , 1  n  e γt
 n

e t , n  0
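
The density above can be checked against a direct simulation of the two model assumptions (exponentially distributed invasion time, deterministic growth afterwards). The following Python sketch is only an illustration: the function names and the parameter values `lam`, `gamma` and `t` are assumptions chosen for the example and are not taken from the study.

```python
import math
import random

def density_at(t, t0, gamma):
    """Deterministic density at time t for a strain that invaded at time t0."""
    return 0.0 if t0 > t else math.exp(gamma * (t - t0))

def cdf_n(v, lam, gamma, t):
    """P(n <= v) implied by f_n: atom at n = 0 plus the density on [1, e^(gamma*t)]."""
    if v < 0:
        return 0.0
    if v < 1:
        return math.exp(-lam * t)  # only the point mass at n = 0
    v = min(v, math.exp(gamma * t))
    return math.exp(-lam * t) * v ** (lam / gamma)

# Illustrative (assumed) parameter values.
lam, gamma, t = 0.5, 1.0, 3.0
rng = random.Random(1)
samples = [density_at(t, rng.expovariate(lam), gamma) for _ in range(200_000)]

p_zero_sim = sum(n == 0.0 for n in samples) / len(samples)
p_le5_sim = sum(n <= 5.0 for n in samples) / len(samples)
print("P(n = 0): simulated", p_zero_sim, "closed form", math.exp(-lam * t))
print("P(n <= 5): simulated", p_le5_sim, "closed form", cdf_n(5.0, lam, gamma, t))
```

The simulated frequencies should agree with the closed-form values up to Monte Carlo error, and the cumulative probability reaches 1 at the maximum density $e^{\gamma t}$.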

Proportion of two strains
In our system there are two strains invading at the same time. If the initial proportion of
each strain in the inoculum is known and both strains invade with the same base rate, then
we can define the invasion rate for each strain as a function of the inoculum proportion of
strain $a$ ($p$) and the total invasion rate ($\lambda$), resulting in an invasion rate for strains $a$ and $b$
of, respectively, $p\lambda$ and $(1 - p)\lambda$. Next we derive the probability density function of the
proportion of two strains ($a$ and $b$) at time $t$. We start by introducing two new random
variables, i.e. the proportion $x = n_a / (n_a + n_b)$ and the total density $m = n_a + n_b$, with $n_a$
and $n_b$ the density at time $t$ of, respectively, strain $a$ and $b$. Using the generalized density
function method we can then derive the probability density function of $x$ as follows:
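$$
\begin{aligned}
f_x(x \mid p, \lambda, \gamma_a, \gamma_b, t)
&= g(p, \lambda, \gamma_a, \gamma_b, t) \int_0^{\infty}
\begin{vmatrix}
\dfrac{\partial x}{\partial n_a} & \dfrac{\partial m}{\partial n_a} \\
\dfrac{\partial x}{\partial n_b} & \dfrac{\partial m}{\partial n_b}
\end{vmatrix}^{-1}
f_n(n_a \mid p\lambda, \gamma_a, t) \, f_n(n_b \mid (1-p)\lambda, \gamma_b, t)
\Big|_{n_a = xm,\ n_b = (1-x)m} \, dm \\
&= g(p, \lambda, \gamma_a, \gamma_b, t)
\begin{cases}
f_n(0 \mid p\lambda, \gamma_a, t) \left(1 - f_n(0 \mid (1-p)\lambda, \gamma_b, t)\right), & x = 0 \\
\dfrac{(1-p) p \lambda}{\left(p(\gamma_b - \gamma_a) + \gamma_a\right)(1-x)x}
\, e^{\frac{p \lambda \gamma_b t}{\gamma_a} - p \lambda t}
\left(\dfrac{x}{1-x}\right)^{\frac{p\lambda}{\gamma_a}}, & 0 < x < \left(1 + e^{(\gamma_b - \gamma_a)t}\right)^{-1} \\
\dfrac{(1-p) p \lambda}{\left(p(\gamma_b - \gamma_a) + \gamma_a\right)(1-x)x}
\, e^{\frac{(1-p) \lambda \gamma_a t}{\gamma_b} - (1-p) \lambda t}
\left(\dfrac{1-x}{x}\right)^{\frac{(1-p)\lambda}{\gamma_b}}, & \left(1 + e^{(\gamma_b - \gamma_a)t}\right)^{-1} \le x < 1 \\
\left(1 - f_n(0 \mid p\lambda, \gamma_a, t)\right) f_n(0 \mid (1-p)\lambda, \gamma_b, t), & x = 1
\end{cases}
\end{aligned}
$$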
with $\gamma_a$ and $\gamma_b$ the growth rates of, respectively, strains $a$ and $b$, and
$g(p, \lambda, \gamma_a, \gamma_b, t) = \left(1 - f_n(0 \mid p\lambda, \gamma_a, t) \, f_n(0 \mid (1-p)\lambda, \gamma_b, t)\right)^{-1}$ a normalization factor, since $x$ is
undefined when both strains have a density of 0. The number of parameters is further
reduced by redefining them as follows: the relative invasion rate ($\rho = \lambda / \gamma_a$), the relative
growth rate of strain $b$ ($\sigma = \gamma_b / \gamma_a$) and the growing time ($\tau = \gamma_a t$). This finally results in:
$$
f_x(x \mid p, \rho, \sigma, \tau) = g(p, \rho, \sigma, \tau)
\begin{cases}
f_n(0 \mid p\rho, 1, \tau) \left(1 - f_n(0 \mid (1-p)\rho, \sigma, \tau)\right), & x = 0 \\
\dfrac{(1-p) p \rho}{\left(p(\sigma - 1) + 1\right)(1-x)x}
\, e^{p \rho (\sigma - 1) \tau}
\left(\dfrac{x}{1-x}\right)^{p\rho}, & 0 < x < \left(1 + e^{(\sigma - 1)\tau}\right)^{-1} \\
\dfrac{(1-p) p \rho}{\left(p(\sigma - 1) + 1\right)(1-x)x}
\, e^{(1-p) \rho (\sigma^{-1} - 1) \tau}
\left(\dfrac{1-x}{x}\right)^{(1-p)\rho\sigma^{-1}}, & \left(1 + e^{(\sigma - 1)\tau}\right)^{-1} \le x < 1 \\
\left(1 - f_n(0 \mid p\rho, 1, \tau)\right) f_n(0 \mid (1-p)\rho, \sigma, \tau), & x = 1
\end{cases}
$$
with $g(p, \rho, \sigma, \tau) = \left(1 - f_n(0 \mid p\rho, 1, \tau) \, f_n(0 \mid (1-p)\rho, \sigma, \tau)\right)^{-1}$.

We can further use this probability density function to analyse the population structure.
A common approach for this is to calculate the relatedness. Following Queller and
Goodnight (1989), relatedness is defined through a sum over all possible interactions. For each
interaction the definition uses the score of the actor and the score of the recipient, together with
the average score in the population, with the score defined as the frequency of an allele in
the actor or recipient. We are working with two strains here and assume clonal birth. Therefore
the score of strain $a$ can be set to 1 and that of strain $b$ to 0 (it is equally valid to set the values
to 0 and 1, respectively; this would lead to the same solution). Note that the average score is
equal to the average proportion of strain $a$ ($\bar{x}$). If we now take into account all possible
interactions, i.e. $a$ with $a$, $a$ with $b$, $b$ with $a$ and $b$ with $b$, and weigh the interactions by the
frequency with which they occur, we can calculate the relatedness. We also know that the mean of $x$ is
$\bar{x}$ and the variance of $x$ is $\mathrm{Var}(x) = E[x^2] - \bar{x}^2$, such that the relatedness can be further simplified to:
$$
r = \frac{\mathrm{Var}(x)}{\bar{x}(1 - \bar{x})}.
$$
Interestingly, another (mathematically equivalent) method to derive the relatedness in
this case is the normalized difference between the probability of sampling two individuals
from the same strain within a subsample and the same probability for the population
as a whole (Queller and Goodnight 1989; Bryden and Jansen 2010), i.e. defining relatedness
as the probability of picking (with replacement) two individuals of the same strain from one
subpopulation (insect) minus the probability of picking two individuals of the same type from
the population as a whole, divided by 1 minus the probability of picking two individuals of the
same type from the population as a whole:
$$
r = \frac{E\left[x^2 + (1-x)^2\right] - \left(\bar{x}^2 + (1-\bar{x})^2\right)}{1 - \left(\bar{x}^2 + (1-\bar{x})^2\right)},
$$
with $\bar{x} = E[x]$ the average proportion of strain $a$,
which simplifies to the same equation as above.
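
The probability-of-identity definition described above reduces algebraically to the variance of the proportion divided by $\bar{x}(1 - \bar{x})$. The following Python sketch checks this numerically on a small, made-up list of within-host proportions; the function names are hypothetical and every host is weighted equally, which is an assumption of the example.

```python
def relatedness_identity(xs):
    """Relatedness from identity probabilities (sampling with replacement)."""
    mean = sum(xs) / len(xs)
    # Probability that two individuals from the same host share a strain.
    within = sum(x * x + (1 - x) * (1 - x) for x in xs) / len(xs)
    # Same probability for two individuals from the population as a whole.
    overall = mean * mean + (1 - mean) * (1 - mean)
    return (within - overall) / (1 - overall)

def relatedness_variance(xs):
    """Relatedness as Var(x) / (mean * (1 - mean))."""
    mean = sum(xs) / len(xs)
    var = sum((x - mean) ** 2 for x in xs) / len(xs)
    return var / (mean * (1 - mean))

# Made-up proportions of strain a in seven hosts (illustration only).
proportions = [0.0, 0.0, 1.0, 1.0, 0.9, 0.1, 0.5]
print(relatedness_identity(proportions), relatedness_variance(proportions))

# Hosts that each contain a single strain give a relatedness of 1,
# perfectly mixed hosts give a relatedness of 0.
print(relatedness_variance([0.0, 1.0, 1.0, 0.0]))
print(relatedness_variance([0.5, 0.5, 0.5]))
```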
Parameter fitting
Model fitting is done following a maximum likelihood approach. Important to note is that
the data consists of counts of each strain found in a certain dilution factor. As such this data
does not provide us with an exact density of the pathogens, i.e. a count of 10 individuals in a
solution with a dilution factor of 3 does not mean that there were exactly 10,000 individuals
in the original solution. To account for this fact we assume that in such a case the density
might be anywhere between 9,500 and 10,500. The probability of our data for strain $a$ is
then:
$$
f_D(c, y \mid \lambda, \gamma, t) = \prod_{i=1}^{N} \int_{10^{y_i}(c_i - 0.5)}^{10^{y_i}(c_i + 0.5)} f_n(n \mid \lambda, \gamma, t) \, dn,
$$
with $N$ the total number of observations, $c$ all the count data and $y$ the corresponding
dilutions. We then calculate the likelihood of our parameters (the relative invasion rate $\rho$, the
relative growth rate $\sigma$ and the growing time $\tau$) given count data from both
strains and an initial proportion $p$ as follows:
$$
L(\rho, \sigma, \tau \mid c_a, y_a, c_b, y_b) = f_D(c_a, y_a \mid p\rho, 1, \tau) \, f_D(c_b, y_b \mid (1-p)\rho, \sigma, \tau).
$$
The parameter values corresponding to the maximum of this function are found numerically
by finding the maximum of the logarithm of this function. The numerical method we use is
an adaptive differential evolution method (Brest et al. 2006). We fit the model both to each
set of data (Table S1) separately, and also to the three sets of data combined.
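
The interval probabilities in the data likelihood have a closed form, because the cumulative distribution of the density follows directly from the density function derived in S1. The Python sketch below illustrates this for a single strain. The function names, the parameter values and the small probability `floor` are assumptions made for the example; the treatment of a zero count (the interval then includes the point mass at a density of 0) is also an assumption, not something stated in the text.

```python
import math

def cdf_n(v, lam, gamma, t):
    """P(n <= v): atom exp(-lam*t) at n = 0 plus the density on [1, exp(gamma*t)]."""
    if v < 0:
        return 0.0
    if v < 1:
        return math.exp(-lam * t)
    v = min(v, math.exp(gamma * t))
    return math.exp(-lam * t) * v ** (lam / gamma)

def interval_prob(count, dilution, lam, gamma, t):
    """Probability that the density lies within +/- 0.5 counts at this dilution."""
    lo = 10 ** dilution * (count - 0.5)
    hi = 10 ** dilution * (count + 0.5)
    return cdf_n(hi, lam, gamma, t) - cdf_n(lo, lam, gamma, t)

def log_f_D(counts, dilutions, lam, gamma, t, floor=1e-300):
    """Log of the data probability; `floor` avoids log(0) (assumed value)."""
    return sum(
        math.log(max(interval_prob(c, y, lam, gamma, t), floor))
        for c, y in zip(counts, dilutions)
    )

# A count of 10 at dilution factor 3 corresponds to densities 9,500 to 10,500.
lam, gamma, t = 0.5, 1.0, 10.0            # illustrative (assumed) values
p_example = interval_prob(10, 3, lam, gamma, t)
print(p_example)

# At dilution 0 the count intervals tile the whole range, so they sum to 1.
total = sum(interval_prob(c, 0, 0.5, 1.0, 3.0) for c in range(0, 22))
print(total)
print(log_f_D([10, 0], [3, 0], lam, gamma, t))
```

Working with the sum of log probabilities rather than the product keeps the calculation numerically stable when the number of observations is large.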
Bootstrap tests are used to estimate the uncertainty around the maximum likelihood
estimates of the parameters. For this we resample the data with replacement and then
calculate the maximum likelihood value using the resampled data. This process is repeated
1,000 times. These results can then be used to estimate the 90% percentile interval, by finding
the 50th and 950th lowest value of each parameter.
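
The percentile bootstrap described above can be sketched in a few lines of Python. In this illustration the estimator is simply the sample mean, standing in for the full maximum likelihood fit, and the data are synthetic; the function name, the seed and the data are assumptions made for the example.

```python
import random
import statistics

def bootstrap_interval(data, estimator, n_boot=1000, seed=0):
    """90% percentile interval from n_boot resamples drawn with replacement."""
    rng = random.Random(seed)
    n = len(data)
    estimates = sorted(
        estimator([data[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_boot)
    )
    # With 1,000 replicates these are the 50th and 950th lowest values
    # (1-based), i.e. list indices 49 and 949.
    lower = estimates[n_boot * 5 // 100 - 1]
    upper = estimates[n_boot * 95 // 100 - 1]
    return lower, upper

# Synthetic data (illustration only); the mean stands in for the ML estimate.
rng = random.Random(7)
data = [rng.gauss(5.0, 2.0) for _ in range(139)]
low, high = bootstrap_interval(data, statistics.fmean)
print(low, statistics.fmean(data), high)
```

Replacing `statistics.fmean` by a function that refits the model to the resampled data gives the procedure described in the text, at the cost of one numerical optimisation per replicate.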
Proportion of the RifR strain    No. of samples
0.278                            139
0.429                            52
0.806                            138
Table S1. Initial inoculum proportions and the respective number of samples.
S2 Lotka-Volterra competition
In the full model we assume exponential growth (see S1). Here we explore the effect of
assuming a more complicated growth model which includes Lotka-Volterra competition.
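Under Lotka-Volterra competition strain growth is defined by:
$$
\frac{dn_i}{dt} = \gamma_i n_i \left(1 - \frac{n_i + \alpha_{ij} n_j}{K}\right),
$$
with $\gamma_i$ the growth rate for strain $i$, $n_i$ its density at time $t$, $K$ the carrying capacity and
$\alpha_{ij}$ the effect of strain $j$ on $i$. Under the assumption that both strains are very similar we can
assume neutral competition and set $\alpha_{ij} = \alpha_{ji} = 1$, resulting in:
$$
\frac{dn_i}{dt} = \gamma_i n_i \left(1 - \frac{n_a + n_b}{K}\right).
$$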
To explore how this growth model changes the dynamics of the model we focus on two
different attributes, i.e. the initial proportion at the time of the second invasion and the
change of this proportion after both strains have invaded. For notational simplicity
in the following section we will assume that strain $a$ has invaded first, at time 0, and that
strain $b$ invades at a later time.
The proportion at the time of the second invasion follows from the density that the first strain
has reached at that time. From the growth equations it is clear that when this density is much
smaller than the carrying capacity the influence of the carrying
capacity is minimal and the results are approximately the same as for exponential growth.
Furthermore, from the data we know that the carrying capacity will be
much larger than 1. Therefore, when the density of strain $a$ is close to the carrying capacity,
the proportion of strain $a$ is close to 1, as it is under exponential growth. This shows that the
introduction of Lotka-Volterra
competition only results in a minor difference in the proportion at the invasion of the
second strain.
The second important consideration is whether the introduction of Lotka-Volterra
competition influences the change of the proportion directly after invasion.
Similarly as in the simpler model we can rescale and set $\gamma_a = 1$ and $\gamma_b = \sigma$ without loss
of generality, resulting in:
$$
\frac{dx}{d\tau} = (1 - \sigma) \, x (1 - x) \left(1 - \frac{m}{K}\right),
$$
with $m = n_a + n_b$ the total density, $K$ the carrying capacity and $\tau$ the rescaled time.
The change over time only differs from the exponential growth model in the last term. Note
that if the strains are identical ($\sigma = 1$) the change over time is zero and both models are the
same. The major difference with the simpler model is that in this model the change over
time goes to zero as the strains increase in size, since they reach a stable coexistence.
Crucially, as long as the strains are very similar ($\sigma \approx 1$) the influence of neutral competition
on the outcome of the model will be small.
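
The behaviour described in this section can be illustrated with a simple numerical integration. The sketch below assumes the standard neutral Lotka-Volterra form, in which each strain grows at its own rate multiplied by a shared crowding term $1 - (n_a + n_b)/K$, with strain $a$ growing at rate 1 and strain $b$ at rate `sigma`. The function name, the forward Euler scheme, the step size and all parameter values are assumptions made for the example.

```python
def simulate_neutral_lv(na, nb, sigma, K, dt=1e-3, steps=30_000):
    """Forward Euler integration of neutral Lotka-Volterra competition."""
    for _ in range(steps):
        crowding = 1.0 - (na + nb) / K   # shared density dependence
        na, nb = na + dt * na * crowding, nb + dt * sigma * nb * crowding
    return na, nb

K = 1e6                 # assumed carrying capacity, much larger than 1
x0 = 50.0 / 51.0        # strain a at density 50 when strain b enters at density 1

na1, nb1 = simulate_neutral_lv(50.0, 1.0, 1.0, K)    # identical strains
na2, nb2 = simulate_neutral_lv(50.0, 1.0, 1.05, K)   # strain b grows 5% faster

x_identical = na1 / (na1 + nb1)
x_faster_b = na2 / (na2 + nb2)
print("initial proportion:", x0)
print("identical strains:", x_identical, "total density:", na1 + nb1)
print("sigma = 1.05:", x_faster_b, "total density:", na2 + nb2)
```

With identical strains the proportion stays at its initial value while the total density approaches the carrying capacity; with a small difference in growth rate the proportion shifts only while the population is still well below the carrying capacity.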
S3 Model comparison
In the main manuscript we assumed that the time between successful infections was
enough to explain the finding that after co-infection of two neutral strains we normally only
found one of the strains. It is of course possible to focus on other aspects of the infection
process and here we explored an alternative hypothesis. This hypothesis assumed that only
a limited number of bacteria from the inoculum survived the initial invasion process and the
final population ratio of each strain could be explained by the ratios in these surviving
populations. To explore this possibility we defined a model very similar to our other model,
but with the crucial difference that both strains invade at the same time, but have different
(founding) population sizes at time 0. Under a given survival rate the surviving
population size will be Poisson distributed.
Fitting this alternative model was performed using the same methods as described in
Supplementary Information S1.
This model could, theoretically, result in high bimodality, because at low survival rates
there was a high probability that only one of the strains successfully invaded, resulting in a
final strain proportion ($x$) of 0 or 1 (depending on which strain invaded). Interestingly, we
found that the model did not adequately explain the large number of results that were
close to 0 or 1, but not exactly 0 or 1, because when both strains invaded the founding
populations were most likely to be relatively similar in size. Furthermore, to adequately
explain the large variation in final population sizes found in the data, the model predicted
high initial population sizes. As a result our fitted model did not exhibit the same
distribution as the experimental data. See Table S3.1 for the results from the model fitting.
Inoculum ($p$)    Parameter    Max L
all data                       841.277
all data                       0.993052
all data                       6.83063
0.278                          923.656
0.278                          1.00533
0.278                          6.84091
0.429                          1094.05
0.429                          1.46507
0.429                          4.62993
0.806                          530.957
0.806                          1.21497
0.806                          5.53622
Table S3.1: Maximum likelihood parameters for the model given the data. The model is fitted both
to all the data simultaneously and separately to each experiment.
The maximum likelihood value of this fitted model was extremely low leading to
numerical problems, because the probability of some of the data points given the model
was 0 (i.e. below numerical precision of the computer). We dealt with this by introducing a
minimal (log) probability value that was just above the numerical precision of the computer
allowing us to complete the maximum likelihood based fitting. Likelihood based model
comparisons using the Akaike Information Criterion (AIC) are based on the fact that models
that explain the data better have a higher maximum likelihood (Akaike 1973; Burnham &
Anderson 2002). The likelihood of a model is the product of the probability of each data
point given the model. As mentioned above the probability of some of the data points for
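the second model was 0; therefore the likelihood of this model is also 0 and the AIC, which is
based on the log likelihood, is infinite ($\mathrm{AIC} = 2k - 2 \ln L \to \infty$ as $L \to 0$, with $k$ the
number of parameters). As a result the difference in AIC between this second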
model and the model presented in the main manuscript is infinite and we can conclude that
there is essentially no empirical support for the second model (Burnham & Anderson 2002).
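
The argument above can be made concrete with a short Python sketch. It uses the standard definition of the AIC, $2k - 2 \ln L$, and shows how a single data point with probability 0 makes the AIC infinite unless a minimal probability is imposed. The function names, the example probabilities and the value of the floor are assumptions made for the illustration.

```python
import math

def log_likelihood(probs, floor=None):
    """Sum of log probabilities; a probability of 0 gives -inf unless a floor is set."""
    total = 0.0
    for pr in probs:
        if floor is not None:
            pr = max(pr, floor)
        if pr <= 0.0:
            return -math.inf
        total += math.log(pr)
    return total

def aic(log_lik, n_params):
    """Akaike Information Criterion: 2k - 2 ln L."""
    return 2 * n_params - 2 * log_lik

# Made-up per-observation probabilities under two models (illustration only).
probs_main = [0.20, 0.05, 0.10, 0.30]
probs_alt = [0.25, 0.00, 0.15, 0.30]     # one observation is impossible

aic_main = aic(log_likelihood(probs_main), 3)
aic_alt = aic(log_likelihood(probs_alt), 3)
aic_alt_floored = aic(log_likelihood(probs_alt, floor=1e-300), 3)
print(aic_main, aic_alt, aic_alt_floored)
print("difference in AIC:", aic_alt - aic_main)
```

The floor only serves to let a numerical optimiser proceed; the comparison itself is decided by the unfloored likelihood, for which the difference in AIC is infinite.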
S4. Differences in life-history between competing Bt strains.
Figure S4. Differences in life history traits of isolates used in
coinfection experiments. A. Pathogenicity was assessed at three
doses using 30-45 larvae per dose using the droplet method
described in the main body of the paper. B. The number of spores
produced from cadavers containing a single bacterial strain. Data are
means ±SE. ** indicates significant difference at the P < 0.01 level.
See main text for statistical analysis.
[Figure S4. Panel A: mortality (0 to 1) against log10 spore dose in droplets (cfu) for the RifR and NalR strains. Panel B: spore productivity in cadavers (log10 cfu/µl) for the RifR and NalR strains, marked **.]
REFERENCES
1. Akaike, H. (1973). Information theory and an extension of the maximum likelihood principle. In: 2nd Int. Symp. Inf. Theory (eds. Petrov, B.N. & Csaki, F.). Publishing house of the Hungarian Academy of Sciences, Budapest, pp. 268–281.
2. Baranyi, J. (1998). Comparison of Stochastic and Deterministic Concepts of Bacterial Lag. J. Theor. Biol., 192, 403–408.
3. Brest, J., Greiner, S., Boskovic, B., Mernik, M. & Zumer, V. (2006). Self-Adapting Control Parameters in Differential Evolution: A Comparative Study on Numerical Benchmark Problems. IEEE Trans. Evol. Comput., 10, 646–657.
4. Burnham, K.P. & Anderson, D.R. (2002). Model Selection and Multi-Model Inference: A Practical Information-Theoretic Approach. 2nd edn. Springer.