Model details and bootstrapping results

Electronic Supplementary Material S2: Models and Calibration.
We used continuous-time Markov chains (CTMCs) to model the rate at which
forest stands change from one stand-development stage to another and then, within the
context of these dynamics, to model the invasion of Hieracium into stands, and its
persistence and abundance within stands. As stated in the introduction of the main text,
CTMCs are useful for modelling discretely sampled data such as ours (Norris 1997;
Keeling & Ross 2008) because they: i) allow for events to occur at any point in time; ii)
explicitly model the dynamic transitions between discrete entities (e.g. a change in
stand-development stage, or the change from absence to presence of Hieracium in a stand); iii) account for
randomness in the type and timing of events that occur; and iv) provide a rigorous
framework in which to model long-term data. The parameter structures of the full and
best-fitting models are presented in electronic supplementary material S4.
We computationally evaluated the distribution of the process, given a set of
parameters, using EXPOKIT (Sidje 1998), and then used MATLAB's fminsearchbnd to
search for the parameter set that maximises the likelihood of our data (Norris 1997, Ross
et al. 2006). The likelihood of our observed data, given a set of parameter values, is
simply the product of the probabilities of moving between the states observed in our data
over the time elapsed between re-measurements (Ross et al. 2006). These probabilities
are calculated by evaluating exp(Qt), where Q is the q-matrix of transition rates and t is
the time between observations of the state of the process, and exp is the matrix
exponential (Norris 1997, Ross et al. 2006). This method is the gold standard for
calibrating CTMCs to discretely sampled data, the form of data used herein. We fit
a single model, using data from all stands, that best describes the dynamics occurring
in each stand.
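As an illustration only, the likelihood calculation described above can be sketched in Python. This is not the EXPOKIT/MATLAB implementation used for the paper: the matrix exponential is approximated here by a truncated Taylor series with scaling and squaring, and the two-state q-matrix, the observation triples and all function names are hypothetical choices made for the example.

```python
import math

def mat_mul(A, B):
    """Multiply two square matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def expm(Q, t, terms=20):
    """Approximate exp(Q t) by a truncated Taylor series with scaling and squaring."""
    n = len(Q)
    norm = max(sum(abs(q) for q in row) for row in Q) * t
    # Halve Q t repeatedly until its norm is small, then square the result back up.
    squarings = max(0, math.ceil(math.log2(norm)) + 1) if norm > 0 else 0
    A = [[q * t / 2 ** squarings for q in row] for row in Q]
    identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    result = [row[:] for row in identity]
    term = [row[:] for row in identity]
    for k in range(1, terms + 1):
        # term holds A^k / k!
        term = [[x / k for x in row] for row in mat_mul(term, A)]
        result = [[r + x for r, x in zip(rr, tr)] for rr, tr in zip(result, term)]
    for _ in range(squarings):
        result = mat_mul(result, result)
    return result

def log_likelihood(Q, observations):
    """Sum of log transition probabilities over (start, end, elapsed) triples."""
    cache = {}
    total = 0.0
    for start, end, elapsed in observations:
        if elapsed not in cache:
            cache[elapsed] = expm(Q, elapsed)
        total += math.log(cache[elapsed][start][end])
    return total

# Hypothetical two-state chain: state 0 -> 1 at rate 0.2, state 1 -> 0 at rate 0.1.
Q = [[-0.2, 0.2], [0.1, -0.1]]
# Hypothetical re-measurements: (state at first visit, state at next visit, years elapsed).
observations = [(0, 0, 5.0), (0, 1, 5.0), (1, 1, 10.0), (1, 0, 10.0)]

P5 = expm(Q, 5.0)
print("P(0 -> 1 over 5 years):", P5[0][1])
print("log-likelihood:", log_likelihood(Q, observations))
```

In a calibration, a function such as log_likelihood would be handed to a numerical optimiser that searches over the entries of Q; that step is omitted from this sketch.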
The parameter values estimated for the forest stand dynamics were held fixed
when estimating the parameters for the dynamics of Hieracium invasion at two spatial
scales, stand and within-stand. This approach produces only ‘approximate’ maximum-likelihood
estimates of the parameters, as we do not estimate all parameters
simultaneously. The ‘true’ maximum likelihood estimates could be gained by attempting
to estimate all parameters simultaneously, but in this scenario the high dimensionality of
the search space makes convergence of the parameter values difficult, and thus more
reliable estimates are derived using our approach. Furthermore, note that our approach
has some biological support, since the stand dynamics should be independent of
Hieracium dynamics. The confidence we can have in these estimated parameter values is
indicated by parametric bootstrapping, as discussed below.
In order to find the ‘optimal’ models, we create alternative stand-level and
within-stand-level models with permutations of different parameter structures for each of the
Hieracium dynamics and propagule sources. We find the best-fitting alternative model by
using the sum of ranks from each of three model selection criteria: Akaike Information
Criterion (AIC), Schwarz's Bayesian Information Criterion (BIC) and the Hannan-Quinn
criterion (HQ). There is debate in the literature
as to the best selection criterion to adopt in any given situation, and, with the exception of
certain model types and certain sample size thresholds, it is dependent upon the desired
trade-off of model complexity versus statistical fit. For this reason, we decided to adopt
the equal-weight approach described above (note that this yields the same model
choices as adopting the HQ model selection criterion in isolation). The
parameter structures of the alternative models differ as follows: for each type of
Hieracium dynamics modelled (I, i, Edis, edis, Estoch, and estoch), we include either a
constant rate for all forest stand stages (one parameter), different rates for disturbed
(major- and minor-) versus non-disturbed stands (two parameters), or different rates for
each stand stage (five parameters). For propagule supply, we test alternative structures
with and without the unmeasured source (Pu), with both stream and edge
sources (Ps and Pe), and with the stream source only (Ps). Permutations of these parameter structures
lead to 54 and 11 alternative models at the stand and within-stand level, respectively.
Excluding the parameters for the forest stand dynamics model (see table 1 of paper), the
alternative models with maximum and minimum numbers of parameters, respectively,
have 17 and 4 parameters for the stand-level invasion model, and 22 and 5 parameters for
the within-stand invasion model. We believe the minimal representations considered here
to be the models with the minimum number of parameters that still retain a sensible
and realistic representation of the system. As stated, information criteria were used to
ensure over-fitting was avoided. The best-fitting models at the stand and within-stand
levels, respectively, have 6 and 10 parameters (see electronic supplementary material
S4).
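The sum-of-ranks selection described above can be sketched as follows. This is a minimal illustration rather than the code used for the paper: the candidate names, log-likelihoods, parameter counts and the sample size n are invented for the example, the criteria use their standard textbook forms, and tied values are simply given the same rank, which is an assumption.

```python
import math

def information_criteria(log_lik, k, n):
    """Standard AIC, BIC and HQ for a model with k parameters fitted to n observations."""
    aic = 2 * k - 2 * log_lik
    bic = k * math.log(n) - 2 * log_lik
    hq = 2 * k * math.log(math.log(n)) - 2 * log_lik
    return aic, bic, hq

def best_by_rank_sum(models, n):
    """Rank the models under each criterion (1 = best) and sum the three ranks."""
    names = list(models)
    criteria = [information_criteria(*models[name], n) for name in names]
    totals = [0] * len(names)
    for c in range(3):
        column = [row[c] for row in criteria]
        for i, value in enumerate(column):
            # Rank = 1 + number of models with a strictly smaller criterion value.
            totals[i] += 1 + sum(1 for other in column if other < value)
    rank_sum = dict(zip(names, totals))
    best = min(names, key=lambda name: rank_sum[name])
    return best, rank_sum

# Hypothetical candidates: name -> (maximised log-likelihood, number of parameters).
models = {
    "constant": (-520.0, 4),
    "disturbed_vs_not": (-505.0, 6),
    "per_stage": (-500.0, 17),
}
best, rank_sum = best_by_rank_sum(models, n=300)
print(best, rank_sum)
```

With these invented numbers the intermediate structure ranks first under all three criteria; with real candidates the criteria can disagree, which is when the equal-weight sum matters.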
We note that our model is not spatially explicit; we therefore make the
assumption that the individual stands behave independently of other stands and have no
impact on the forest stand structure, or Hieracium dynamics, of other stands. We believe
this assumption is valid as it is more likely that other aspects of the forest environment,
such as the proximity to a water course or the forest edge, are determining factors of
invasion dynamics. To examine this, an unmeasured potential source of propagules was
included in the models; this parameter was not supported in the best-fitting models,
which supports the assumption that stands are independent.
We simulated 100 sets of data for the forest stand, stand-level and within-stand
invasion models, using the parameter values estimated from our real data; the simulated
data correspond to the same years of measurement as the true data. We then used our
calibration method to re-estimate the parameters; these estimates should ideally be
reasonably close to those estimated from the true data. Bootstrap estimates of parameters
from these 100 simulations are provided below for (a) forest stand dynamics model, (b)
stand-level dynamics model, and (c) within-stand level dynamics model. The mean,
median, and standard errors of these estimates are provided in table 1 of the main paper,
and the individual estimates are presented here. Overall, the estimates (in particular the
median estimates) are close to the true value for all parameters, as seen below. In the
forest stand model (a), aDR is slightly underestimated on average, and there are a few
large estimates of aNM, and to a lesser degree aND. It should be noted that these three
parameters correspond to disturbed states, which are a minority stand developmental
stage (see figure 1), and hence estimates would be expected to be more unstable in
comparison to the other parameters. The parameters of some concern at the stand level
(b) and within-stand level (c) are Edis and edis, respectively, which are the probabilities of
(major or minor) disturbances removing Hieracium at the stand level and subplot level,
respectively. Our sensitivity analysis of these parameters in figure 3 of the paper, used to
assess their influence on Hieracium stand and subplot abundance, shows that the
dynamics are not highly sensitive to these parameters, which explains the uncertainty in
their precise values. Overall, we believe the bootstrap estimates provide good support for
our calibration methodology and hence our calibrated models.
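The parametric bootstrap procedure can be illustrated with a deliberately simplified sketch. It is not the model fitted in the paper: it assumes a single hypothetical invasion rate (absent to present, with no loss), one re-measurement interval, and a closed-form estimator in place of the numerical calibration. The rate, the number of stands, the interval and the function names are all invented for the example.

```python
import math
import random
import statistics

def simulate_invasions(rate, n_stands, interval, rng):
    """Count stands invaded within the interval; waiting times are exponential."""
    return sum(1 for _ in range(n_stands) if rng.expovariate(rate) <= interval)

def estimate_rate(n_invaded, n_stands, interval):
    """Maximum-likelihood rate, using P(invaded) = 1 - exp(-rate * interval)."""
    # Ad hoc guard (an assumption): cap the count so the estimate stays finite
    # if every stand happens to be invaded.
    fraction = min(n_invaded, n_stands - 0.5) / n_stands
    return -math.log(1.0 - fraction) / interval

def parametric_bootstrap(rate_hat, n_stands, interval, n_sets=100, seed=1):
    """Simulate n_sets data sets at the fitted rate and re-estimate the rate from each."""
    rng = random.Random(seed)
    return [
        estimate_rate(simulate_invasions(rate_hat, n_stands, interval, rng),
                      n_stands, interval)
        for _ in range(n_sets)
    ]

# Hypothetical fitted value and design: rate 0.05 per year, 200 stands, 10-year interval.
estimates = parametric_bootstrap(rate_hat=0.05, n_stands=200, interval=10.0)
print("mean:", statistics.mean(estimates))
print("median:", statistics.median(estimates))
print("standard deviation:", statistics.stdev(estimates))
```

The spread of the re-estimated values around the fitted value indicates how well the calibration recovers a known parameter from data of this size and design.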