Supplementary materials for “Habitat suitability models do not always relate to
variation in species’ plant functional traits” by Thuiller, Albert et al.
1 MATERIAL AND METHODS
a) Study areas
We measured plant functional traits in two study areas that share a minimal common set of
species. The first study area is located in the central French Alps between Briançon (44.9°N
6.6°E, 1200 m) and Combe Roche Noire (3000 m). This valley (Guisane) belongs to the
peripheral zone of the Ecrins National Park. It is characterized by contrasting climates, with
mean annual temperatures ranging from 6.3°C to 0°C and annual precipitation ranging from
900 mm to 2000 mm.
The second study area (Anzeindaz) is located in the western Swiss Alps, in the Avançon
Valley between Gryon and the Col du Pas de Cheville (7°03’ to 7°12’ E; 46°16’ to 46°18’ N). It
stretches from about 1100 to 2300 m a.s.l. and has contrasting mean annual temperatures,
between -3.58 and 8.17°C, and precipitation between 1309.3 and 2390.1 mm, with a main
gradient running from west to east.
b) Datasets
Presence-absence data were extracted from two phytosociological datasets. To capture the
realized niche of the species in the specific study areas, we only used presence-absence data
from a perimeter surrounding the study areas (figure S1). A database comprising 500 complete
sampling plots was used for Guisane and 550 for Anzeindaz (figure S1).
For the French site, we used a comprehensive phytosociological dataset provided by the
National Alpine Botanic Conservatory (CBNA). A plot was considered as a community when
the survey was performed in a homogeneous area (around 10 x 10 m). All plots were located
with a spatial error of less than 100 m, and species nomenclature was standardized
according to the Kerguelen Synonymic Index of the French flora (Kerguélen 1993).
The Anzeindaz database consists of 550 vegetation plots of 8 x 8 m, sampled according to a
stratified-random strategy and restricted to non-woody vegetation.
Figure S1: 3D view of the Guisane (A) and Anzeindaz (B) study areas. Black dots represent
the presence and absence data used to calibrate the habitat suitability models. Red dots
represent the sampling plots in which trait measurements were carried out.
c) Information theory approach
One of the most commonly used methods to analyze, model and predict the relationships
between a response (species presence/absence) and a set of explanatory variables (topo-climatic,
or habitat suitability / organic matter / pH in our paper) is the well-known stepwise
regression (backward, forward or both), applied to linear, generalised linear (McCullagh &
Nelder 1989; Thuiller et al. 2003) or even generalised additive models (Hastie & Tibshirani
1990; Thuiller et al. 2003). Although appealing, it is well recognised that stepwise
procedures are not free of problems (Guisan et al. 2002; Johnson & Omland 2004).
Indeed, as the number of tested variables increases, so does the number of plausible models.
This makes the selection criteria problematic, because several combinations of variables
(sometimes with completely different variables) are likely to give similar AIC or
BIC values (Akaike 1974; Burnham & Anderson 2002), yet only one will be selected as the
‘optimal’ solution.
An alternative is inference-based modelling (Burnham & Anderson 2002; Link &
Barker 2006). Unlike stepwise model selection, multimodel inference is based on all possible
sub-models from a set of explanatory variables, which eliminates model selection bias and
provides a relative measure of each predictor’s importance (weight of evidence). The interest
of multimodel inference is that inference is drawn from more than one single ‘optimal’ model,
by extending the concept of the likelihood of the parameters given a model and the data to
that of the likelihood of the model given the data.
This can be summarised by the Akaike weights:
$$ w_i = \frac{\exp\left(-\frac{1}{2}\Delta_i\right)}{\sum_{r=1}^{R} \exp\left(-\frac{1}{2}\Delta_r\right)} $$
where
$$ \Delta_i = \mathrm{AIC}_i - \min \mathrm{AIC} $$
Here AICi is the Akaike Information Criterion (AIC, Akaike 1974) of candidate model i,
min AIC is the smallest AIC value in the set of models, and R is the number of candidate
models. The larger Δi is, the less plausible the fitted model is.
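As an illustration of the formula above, the following minimal sketch computes Akaike weights from a list of AIC values. The function name and the three AIC values are hypothetical and are not taken from the paper; the authors' own analysis was carried out in R.

```python
import math

def akaike_weights(aic_values):
    """Return the Akaike weight w_i of each candidate model."""
    best = min(aic_values)
    # Delta_i = AIC_i - min AIC
    deltas = [aic - best for aic in aic_values]
    # Relative likelihood of each model: exp(-Delta_i / 2)
    rel_likelihoods = [math.exp(-0.5 * d) for d in deltas]
    total = sum(rel_likelihoods)
    # Normalise so that the weights sum to 1 over the R candidate models
    return [r / total for r in rel_likelihoods]

# Hypothetical AIC values for three candidate models (illustrative only)
weights = akaike_weights([100.0, 102.0, 110.0])
```

The model with the smallest AIC always receives the largest weight, and a model whose AIC is 2 units higher is exp(-1) times as plausible as the best one.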
This approach, based on the full set of candidate models (2^n models for n explanatory
variables), is more robust than inferring variable importance from a single model selected
stepwise (Link & Barker 2006).
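To make the size of the candidate set concrete, the sketch below enumerates every subset of a set of explanatory variables. The predictor names are hypothetical, and counting the empty (intercept-only) subset as a candidate model is an assumption made here so that the total equals 2 to the power of the number of variables.

```python
from itertools import combinations

def all_submodels(variables):
    """List every subset of the explanatory variables, including the empty set."""
    subsets = []
    for size in range(len(variables) + 1):
        # All combinations of exactly `size` variables
        subsets.extend(combinations(variables, size))
    return subsets

# Hypothetical predictor names (not taken from the paper)
predictors = ["temperature", "precipitation", "slope"]
models = all_submodels(predictors)
```

With three predictors this yields eight candidate models; the count doubles with each additional variable, which is why the number of predictors is usually kept small.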
We thus used this approach to model species distributions using generalized additive models
and a selection of topo-climatic variables. The habitat suitability derived from this approach
was the weighted average over all possible models (the weight being given by the weight of
evidence for each model).
$$ P = \sum_{i=1}^{R} w_i P_i $$
where P is the resulting habitat suitability, P_i the suitability predicted by candidate model
i, and w_i the Akaike weight of evidence of the ith candidate model.
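The weighted average above can be sketched as follows. The weights and per-model suitabilities are hypothetical numbers chosen for illustration, and the function name is not from the paper.

```python
def model_averaged_suitability(weights, predictions):
    """Return P = sum_i w_i * P_i for each plot.

    weights: one Akaike weight per candidate model (summing to 1).
    predictions: one list per candidate model, holding that model's
    predicted suitability for each plot.
    """
    n_plots = len(predictions[0])
    return [
        sum(w * model_preds[j] for w, model_preds in zip(weights, predictions))
        for j in range(n_plots)
    ]

# Hypothetical Akaike weights for three candidate models
w = [0.5, 0.3, 0.2]
# Hypothetical suitabilities: one row per model, one column per plot
p = [
    [0.9, 0.2],
    [0.7, 0.4],
    [0.5, 0.1],
]
averaged = model_averaged_suitability(w, p)
```

Because the weights sum to 1, the averaged suitability of each plot stays within the range spanned by the individual model predictions.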
d) Generalised additive models
We fitted generalised additive models using the gam library of the open-source R software
(R Development Core Team 2008, v. R.2.8.2), with a binomial family and a logit link
function. We used the built-in nonparametric smoothing splines with a smoother of degree 3.
The information theory approach used a custom function based on the pgirmess library.
e) Predictive accuracy of habitat suitability models
We assessed the predictive accuracy of our statistical species distribution models using the
area under the curve (AUC) of a receiver operating characteristic (ROC) plot of sensitivity
against (1 - specificity) (Swets 1988). Sensitivity is the proportion of observed presences
that are correctly predicted, whereas specificity is the proportion of observed absences that
are correctly predicted (Fielding & Bell 1997).
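The two definitions above can be illustrated with a short sketch that converts predicted suitabilities into presence/absence at a single threshold. The observations, the predictions and the threshold of 0.5 are hypothetical; the sketch also assumes that the data contain at least one presence and one absence.

```python
def sensitivity_specificity(observed, predicted, threshold):
    """Return (sensitivity, specificity) for one suitability threshold."""
    tp = fn = tn = fp = 0
    for obs, suitability in zip(observed, predicted):
        predicted_present = suitability >= threshold
        if obs == 1:
            if predicted_present:
                tp += 1  # presence correctly predicted
            else:
                fn += 1  # presence missed
        else:
            if predicted_present:
                fp += 1  # absence wrongly predicted as presence
            else:
                tn += 1  # absence correctly predicted
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical observations (1 = present, 0 = absent) and suitabilities
observed = [1, 1, 1, 0, 0, 0, 0, 1]
predicted = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.2, 0.7]
sens, spec = sensitivity_specificity(observed, predicted, 0.5)
```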
The ROC curve is a graphical method that represents the relation between (1 - specificity)
and sensitivity for a range of thresholds (figure S2). If predictions were no better than
chance, the relation would be a 45° line. Good model performance is characterized by a curve
that maximizes sensitivity for low values of (1 - specificity), i.e. when the curve passes
close to the upper left corner of the plot. The area under the curve (AUC) measures
discrimination, that is, the ability of the model to correctly classify a species as present
or absent in a given plot; a model that performs no better than chance has an AUC of 0.5. A
rough guide for classifying the accuracy of a diagnostic test is the traditional academic
point system (Araújo et al. 2005, adapted from Swets 1988): 0.90-1.00 = excellent;
0.80-0.90 = good; 0.70-0.80 = fair; 0.60-0.70 = poor; 0.50-0.60 = fail.
[Plot: ROC curve. x-axis: 1 - specificity (0.0 to 1.0); y-axis: sensitivity (0.0 to 1.0); thresholds from 0 to 1 labelled along the curve.]
Figure S2: A virtual example of a ROC curve displaying sensitivity (true positive rate)
against 1 minus specificity (false positive rate) for a range of thresholds (displayed on
the curve). The area under the curve is the area between the curve and the horizontal axis.
In this example, the AUC is 0.82.
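The construction of the ROC curve and of its AUC can be sketched as below. The data are hypothetical, the trapezoidal rule is one common way of integrating the curve and is not necessarily the method used by the authors, and the handling of class boundaries in the rough guide (lower bound inclusive, anything below 0.60 labelled "fail") is an assumption.

```python
def roc_points(observed, predicted):
    """Return (1 - specificity, sensitivity) pairs from (0, 0) to (1, 1)."""
    n_pos = sum(observed)
    n_neg = len(observed) - n_pos
    points = [(0.0, 0.0)]
    # Lower the threshold through each distinct suitability, highest first
    for threshold in sorted(set(predicted), reverse=True):
        tp = sum(1 for o, p in zip(observed, predicted) if o == 1 and p >= threshold)
        fp = sum(1 for o, p in zip(observed, predicted) if o == 0 and p >= threshold)
        points.append((fp / n_neg, tp / n_pos))
    return points

def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area

def accuracy_class(auc_value):
    """Rough guide quoted in the text; boundary handling is an assumption."""
    if auc_value >= 0.90:
        return "excellent"
    if auc_value >= 0.80:
        return "good"
    if auc_value >= 0.70:
        return "fair"
    if auc_value >= 0.60:
        return "poor"
    return "fail"

# Hypothetical observations (1 = present, 0 = absent) and suitabilities
observed = [1, 1, 1, 0, 0, 0, 0, 1]
predicted = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.2, 0.7]
curve = roc_points(observed, predicted)
area = auc(curve)
label = accuracy_class(area)
```

For these hypothetical data, 15 of the 16 possible presence/absence pairs are ranked correctly, so the trapezoidal AUC equals 15/16.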
2 RESULTS
a) Predictive accuracy of habitat suitability models
Species                     AUC Guisane
Carex sempervirens          0.75
Dactylis glomerata          0.81
Dryas octopetala            0.86
Festuca paniculata          0.81
Geum montanum               0.86
Juniperus sp.               0.79
Larix decidua               0.86
Leucanthemum vulgare        0.76
Pinus sp.                   0.96
Polygonum viviparum         0.83
Rhododendron ferrugineum    0.86
Sesleria caerulea           0.77
Salix herbacea              0.96
Silene nutans               0.74
Trifolium alpinum           0.81
Vaccinium myrtillus         0.79

AUC Anzeindaz (five species, in source order; species alignment not recoverable): 0.86, 0.94, 0.88, 0.86, 0.87
Table S1: Predictive accuracy of the habitat suitability models. AUC values for the two study
areas are displayed.
b) Intraspecific variation in functional traits
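Species (first block): Carex sempervirens, Dactylis glomerata, Dryas octopetala, Festuca paniculata, Geum montanum, Juniperus sp., Larix decidua
Guisane, Hmax: 0.35, 0.37, 0.31, 0.19, 0.43
Guisane, LDMC: 0.14, 0.19, 0.16, 0.13, 0.14, 0.15, 0.15
Guisane, LNC: 0.21, 0.3, 0.22, 0.21, 0.21, 0.19, 0.16
Anzeindaz, Hmax: 0.45, 0.38, 0.34
Anzeindaz, LDMC: 0.13, 0.20, 0.07
Anzeindaz, LNC: 0.22, 0.34, 0.17

Species (second block): Leucanthemum vulgare, Pinus sp., Polygonum viviparum, Rhododendron ferrugineum, Sesleria caerulea, Salix herbacea, Silene nutans, Trifolium alpinum, Vaccinium myrtillus
Values (in source order; column headers not repeated in the source): 0.36, 0.49, 0.36, 0.31, 0.27, 0.33, 0.46, 0.31, 0.15, 0.08, 0.13, 0.12, 0.25, 0.14, 0.15, 0.1, 0.13, 0.15, 0.18, 0.16, 0.18, 0.24, 0.11, 0.17, 0.09, 0.19, 0.52, 0.13, 0.18, 0.35, 0.19, 0.19

Values are listed in source order; where a column has fewer values than species, the position of the empty cells is not recoverable.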
Table S2: Coefficients of variation of the measured functional traits in the two study sites.
They express the intraspecific variability of traits along the gradients.
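For reference, a coefficient of variation such as those in Table S2 can be computed as the standard deviation divided by the mean. The sketch below uses hypothetical trait measurements, and the use of the sample (n - 1) standard deviation is an assumption, since the text does not specify which estimator was used.

```python
import statistics

def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean."""
    return statistics.stdev(values) / statistics.mean(values)

# Hypothetical measurements of one trait for one species (illustrative only)
trait_values = [300.0, 320.0, 280.0, 310.0, 290.0]
cv = coefficient_of_variation(trait_values)
```

Because the coefficient of variation is dimensionless, it allows traits measured in different units to be compared directly.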
Akaike, H. 1974 A new look at the statistical model identification. IEEE Transactions on Automatic
Control AC-19, 716-723.
Araújo, M. B., Pearson, R. G., Thuiller, W. & Erhard, M. 2005 Validation of species-climate impact
models under climate change. Global Change Biology 11, 1504-1513.
Burnham, K. P. & Anderson, D. R. 2002 Model selection and multimodel inference: a practical
information-theoretic approach. New York: Springer-Verlag.
Fielding, A. H. & Bell, J. F. 1997 A review of methods for the assessment of prediction errors in
conservation presence/absence models. Environmental Conservation 24, 38-49.
Guisan, A., Edwards, T. C. & Hastie, T. 2002 Generalized linear and generalized additive
models in studies of species distributions: setting the scene. Ecological Modelling 157, 89-100.
Hastie, T. J. & Tibshirani, R. 1990 Generalized additive models. London: Chapman and Hall.
Johnson, J. B. & Omland, K. S. 2004 Model selection in ecology and evolution. Trends in Ecology &
Evolution 19, 101-108.
Kerguélen, M. 1993 Index synonymique de la flore de France. Paris: Muséum National d'Histoire
Naturelle, Secrétariat Faune-Flore, XXVIII, 196 pp.
Link, W. A. & Barker, R. J. 2006 Model weights and the foundations of multimodel inference. Ecology
87, 2626-2635.
McCullagh, P. & Nelder, J. A. 1989 Generalized linear models. Monographs on statistics and applied
probability 37: Chapman & Hall.
R Development Core Team (ed.) 2008 R: A Language and Environment for Statistical Computing.
Vienna, Austria: R Foundation for Statistical Computing. Available online at
http://www.R-project.org.
Swets, J. A. 1988 Measuring the accuracy of diagnostic systems. Science 240, 1285-1293.
Thuiller, W., Araújo, M. B. & Lavorel, S. 2003 Generalized models vs. classification tree analysis:
predicting spatial distributions of plant species at different scales. Journal of Vegetation
Science 14, 669-680.