International Journal of Geographical Information Science: Special Issue Paper
Which environmental variables should I use in my biodiversity model?
Supplementary Information
Kristen J Williams1, Lee Belbin1,2, Michael P Austin1, Janet L. Stein3, Simon Ferrier1
1. CSIRO Ecosystem Sciences, Canberra, Australia; 2. Atlas of Living Australia,
Hobart, Australia; 3. Fenner School of Environment and Society, Australian National
University, Canberra, Australia
Correspondence: Kristen J Williams (kristen.williams@csiro.au); CSIRO Ecosystem
Sciences, GPO Box 1700, Canberra, ACT 2601, Australia, phone +61 2 62464213
CONTENTS
Table 1. Substitutable subsets of variables that are alternatives and not sensibly
included in the same model (without justification).
Table 2. Similar variables that can be included in the same model (not substitutable).
Table 3. MaxEnt model results for Eucalyptus delegatensis in southeastern Australia.
Table 4. Summary of relative contribution and permutation importance (percents) of
variables included in the four MaxEnt models (from Table 3) summarised by two
climate and three substrate groups.
Table 5. Relative contribution and permutation importance of variables included in the
four MaxEnt models for Eucalyptus delegatensis presence (from Table 3).
Table 6. GDM model results for vascular plants across the Australian continent.
Table 7. Summary of relative contribution (sum of coefficient values) and partial
deviance explained (percent) by variables included in the two GDM models (from Table
6).
Table 8. Relative contribution (sum of coefficients) and partial deviance explained (%)
by variables included in the two GDM models for vascular plant compositional
dissimilarity (from Table 6).
Table 9. Overview of variables used in MaxEnt and GDM models compared with
variables tested.
Figure 1. Predicted (probability of presence) natural distribution of Eucalyptus
delegatensis for four alternative MaxEnt models.
Table 1. Substitutable subsets of variables that are alternatives and not sensibly included in the same
model (without justification).
Group 1. Atmospheric water
  Subset 1: RAINI, RAINX, EVAPI, EVAPX
  Subset 2: ADEFI, ADEFX (possibly also ARID_MIN, ARID_MAX)
  Comments: Atmospheric water deficit is derived from rainfall and evaporation (no recommendation; user preference; test both separately). Possibly consider aridity indices in subset 2, but these may also be tested in combination with subset 1.

Group 2. Rainfall seasonality
  Subset 1: SLRAIN1, SLRAIN2
  Subset 2: SRAIN1MP, SRAIN2MP
  Comments: Subset 1 is a factor ratio using the logarithm of rainfall; subset 2 is a simple rainfall ratio (test alternatives, or choose subset 1 for continental studies and subset 2 for regional studies).

Group 3. Geological age
  Subset 1: GEOLLMEANAGE, GEOLLRNGEAGE
  Subset 2: GEOLMEANAGE, GEOLRANGEAGE
  Comments: Subset 1 is the logarithm of age; subset 2 is in millions of years (recommend using subset 1).

Group 4. Soil hydrology
  Subset 1: SOLPAWHC
  Subset 2: SOLDEPTH, CLAY
  Comments: Subset 1 is a derivative of the variables in subset 2 (recommend using subset 2: it gives a more direct interpretation of the soil attribute, and clay % is also a factor in soil nutrient supply). CLAY contributes information about soil nutrient status and structure (in addition to hydrology) and could be included in a model with SOLPAWHC.

Group 5. Soil pedality
  Subset 1: PEDALITY
  Subset 2: HPEDALITY
  Comments: Subset 1 is a categorical variable; subset 2 is ordered. Recommend using subset 2.
Table 2. Similar variables that can be included in the same model (not substitutable).
Group 1. Substrate fertility
  Subset 1: FERT
  Subset 2: NUTRIENTS
  Comments: Subset 1 is derived from 1:1< geology mapping by expert interpretation; subset 2 is derived from the Atlas of Australian Soils using data and expert interpretation. Correlated but can be tested together. Recommend both.

Group 2. Terrain flatness
  Subset 1: MRRTF, MRVBF
  Subset 2: RIDGETOP, VALLEYBOTTOM
  Comments: Subset 1 and subset 2 are different summaries at 1km grid of value and heterogeneity within 9sec grid estimates of MRRTF and MRVBF. Correlated but can be tested together. Recommend both.

Group 3. Soil attribute reliability
  Subset 1: DATASUPT
  Subset 2: WR_UNR, KSAT_ERR
  Comments: Subset 1 is an overall estimate of soil property interpretation reliability based on available data; the subset 2 variables are specific to particular soil attributes. Correlated but can be tested together. Recommend subset 1.

Group 4. Temperature extremes
  Subset 1: MINTI, MAXTX
  Subset 2: TMINABSI, TMAXABSX
  Comments: The subset 1 variables are long-term average monthly maximum and minimum temperatures generated at 1km grid using ANUCLIM v5.1; the subset 2 variables are absolute monthly maximum and minimum values over 50 years generated from daily 5km grid SILO surfaces. Correlated but can be tested together. Recommend subset 1.

Group 5. Temperature range
  Subset 1: TRNGX
  Subset 2: TRNGA
  Comments: Subset 1 measures the maximum of monthly diurnal temperature ranges; subset 2 measures the annual difference between hottest day and coldest night. Correlated but can be tested together. Recommend subset 1.

Group 6. Humidity
  Subset 1: RH2MIN, RH2MAX
  Subset 2: VPD2MIN, VPD2MAX
  Comments: Subset 1 is the relative humidity (ratio); subset 2 is the vapour pressure deficit (difference). Correlated but can be tested together. Recommend subset 1.

Group 7. Temperature
  Subset 1: MAXTI, MAXTX
  Subset 2: MINTI, MINTX
  Comments: Subset 1 measures daytime temperatures; subset 2 measures night-time temperatures. Correlated but can be tested together. Recommend both.
Table 3. MaxEnt model results for Eucalyptus delegatensis in southeastern Australia. Summary of training and
test statistics (AUC) using 25% test data (54 presence records) and 75% training data (162 presence records) to
validate the relative performance of each model generated. Different models were generated using substitutable
subsets of environmental variables. Model 1a: evaporation and precipitation, potential soil water holding capacity.
Model 1b: evaporation and precipitation, soil depth and texture (percent clay). Model 2a: precipitation deficit and
aridity, potential soil water holding capacity. Model 2b: precipitation deficit and aridity, soil depth and texture
(percent clay).
Model     100% training  75% training  25% test AUC     Regularized    Unregularized  Unregularized  #          Most important
          AUC            AUC           +/- SD (1)       training gain  training gain  test gain      variables  variable (2)
Model 1a  0.947          0.949         0.890 +/- 0.022  1.692          2.057          1.020          23         MAXTI (38.2)
Model 1b  0.945          0.949         0.903 +/- 0.019  1.649          1.996          1.230          23         MAXTI (38.4)
Model 2a  0.945          0.949         0.892 +/- 0.022  1.649          1.994          1.145          24         MAXTI (40.0)
Model 2b  0.944          0.952         0.887 +/- 0.022  1.678          2.049          1.145          23         MAXTI (38.7)

1. Calculated as per equation 2 in DeLong et al. (1988).
2. Training data percent contribution.
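The 75%/25% partition described in the Table 3 caption (162 training and 54 test records out of 216 presences) can be illustrated with a minimal sketch. The function names, the random seed, and the use of integer identifiers in place of real presence records are assumptions made for illustration only; the rank-based AUC shown here is the generic presence-versus-background statistic, not a reproduction of the MaxEnt software's internal calculation or of the DeLong et al. standard deviation.

```python
import random

def split_presences(records, test_fraction=0.25, seed=0):
    """Randomly split presence records into (training, test) lists."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

def auc(presence_scores, background_scores):
    """Rank-based AUC: share of presence/background pairs in which the
    presence record scores higher (ties count as half)."""
    wins = 0.0
    for p in presence_scores:
        for b in background_scores:
            if p > b:
                wins += 1.0
            elif p == b:
                wins += 0.5
    return wins / (len(presence_scores) * len(background_scores))

# Hypothetical record identifiers standing in for the 216 presence records.
train, test = split_presences(range(216))
print(len(train), len(test))
```

With 216 records and a test fraction of 0.25 the split sizes match those reported in the caption.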
Table 4. Summary of relative contribution and permutation importance (percents) of variables included in the four
MaxEnt models summarised by two climate and three substrate groups. Results for individual variables are given
in Table 5.
Variable    Model 1a                  Model 1b                  Model 2a                  Model 2b
group       Contribution  Importance  Contribution  Importance  Contribution  Importance  Contribution  Importance
water       18.14         27.42       14.30         21.63       12.47         20.80       15.39         13.88
energy      65.48         33.69       69.73         40.03       70.21         31.37       66.24         34.79
soil        5.45          4.43        7.27          12.96       6.19          4.42        6.74          12.04
geoscience  5.49          3.34        4.16          2.04        6.75          5.23        7.02          7.02
terrain     5.43          31.12       4.53          23.33       4.38          38.17       4.61          32.28
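Because percent contribution and permutation importance are both expressed as percentages, each column of Table 4 should total approximately 100 across the five variable groups. The short check below, using the Table 4 values and illustrative variable names, is one way to confirm that the grouped figures are internally consistent.

```python
# Table 4 values: per group, (contribution, importance) for Models 1a, 1b, 2a, 2b.
table4 = {
    "water":      [18.14, 27.42, 14.30, 21.63, 12.47, 20.80, 15.39, 13.88],
    "energy":     [65.48, 33.69, 69.73, 40.03, 70.21, 31.37, 66.24, 34.79],
    "soil":       [5.45, 4.43, 7.27, 12.96, 6.19, 4.42, 6.74, 12.04],
    "geoscience": [5.49, 3.34, 4.16, 2.04, 6.75, 5.23, 7.02, 7.02],
    "terrain":    [5.43, 31.12, 4.53, 23.33, 4.38, 38.17, 4.61, 32.28],
}
columns = [f"Model {m} {kind}"
           for m in ("1a", "1b", "2a", "2b")
           for kind in ("contribution", "importance")]

# Sum each column over the five groups; rounding error aside, each should be 100.
totals = {name: sum(row[i] for row in table4.values())
          for i, name in enumerate(columns)}
for name, total in totals.items():
    print(f"{name}: {total:.2f}")
```

Every column total falls within 0.05 of 100, the small differences being attributable to rounding of the group values.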
Table 5. Relative contribution and permutation importance of variables included in the four MaxEnt models for Eucalyptus delegatensis presence. Shading identifies the two
climate and three substrate groups summarised in Table 4.
Model 1a
Variable
Model 1b
Model 2a
contribution
permutation
contribution
permutation
ARID_MAX
-
-
-
-
ARID_MIN
-
-
-
-
4.0792
RAINX
3.7543
4.2896
2.4593
2.5732
EVAPI
2.2372
2.5127
2.5033
2.146
EVAPX
4.2323
8.59
5.0813
10.5557
-
-
-
-
RPRECMAX
0.6217
0.9927
0.7386
1.3318
0.5955
1.2418
RPRECMIN
0.2922
1.4098
1.261
0.9936
SLRAIN1
4.3668
8.0886
4.0575
11.5284
4.9931
7.38
SLRAIN2
2.6392
1.5412
2.3375
3.7179
2.5196
1.4208
MAXTI
46.792
1.9771
47.9136
8.0951
51.131
9.3364
46.106
2.2444
MINTI
1.9949
12.127
2.3754
8.5935
1.1464
4.6968
1.8137
8.9536
MINTX
10.3755
3.0802
9.3235
5.7213
10.2811
2.5018
12.8647
7.4872
0.8582
1.1989
1.0081
1.7875
RADNX
0.5848
4.1639
4.0676
5.86
0.521
2.0353
RH2MIN
0.3801
3.5035
0.4372
2.0157
0.4141
1.1714
0.4341
1.267
0.7235
3.9214
0.7418
3.459
TRNGX
4.4175
6.0087
4.2738
6.0571
4.189
4.2713
4.2759
11.3744
RTXMIN
0.9399
2.8247
0.4833
2.4898
0.7984
1.6509
BDENSITY
0.4486
1.4896
0.3293
1.1452
CLAY
0.428
2.883
0.3121
2.3552
3.4562
5.7047
2.6453
2.8373
RADNI
4.2558
6.357
RH2MIN
KSAT
contribution
Model 2b
contribution
permutation
2.4471
1.293
3.2273
4.8328
2.5394
-
-
-
-
-
-
-
-
0.8174
1.2585
NUTRIENTS
2.8679
2.9751
3.6106
5.8895
3.1319
SOLDEPTH
-
-
1.966
1.4443
-
SOLPAWHC
2.5854
1.4576
-
3.0532
permutation
2.9543
1.4672
-
Variable      Model 1a                   Model 1b                   Model 2a                   Model 2b
              Contribution  Permutation  Contribution  Permutation  Contribution  Permutation  Contribution  Permutation
FERT          1.4663        1.2863       -             -            1.5678        1.3989       1.4921        1.2053
GEOLLMEANAGE  4.0206        2.0491       4.1638        2.0385       4.3935        2.5159       4.8102        2.8582
GRAVITY       -             -            -             -            0.7857        1.32         0.7194        2.9548
EROSIONAL     1.5473        1.1891       0.4266        1.0958       0.6216        1.3208       0.4793        1.2603
MRRTF         0.6365        13.4384      0.6205        12.2037      0.6462        20.9168      0.5587        12.8779
MRVBF         0.218         2.435        0.2383        2.1633       0.1072        2.4392       0.2043        5.3782
SLOPE         1.9549        2.7581       2.3919        1.9169       2.0103        3.8092       2.5314        5.225
VALLEYBOTTOM  1.0748        11.3015      0.8559        5.9536       0.9963        9.6859       0.8375        7.5376
Table 6. GDM model results for vascular plants across the Australian continent. Percent deviance explained, sum
of coefficient values and intercept. Different models were generated using substitutable subsets of environmental
variables. Model 1: evaporation and precipitation. Model 2: precipitation deficit and aridity. The two substitutable
soil variables – soil depth and water holding capacity – did not contribute minimum levels of partial deviance
explained to be retained in the final model.
Model    Number of       % deviance explained  Intercept  Sum of coefficient values
         predictors (1)  (> 0.02)                         for all predictors
Model 1  27              50.33                 1.4462     26.02
Model 2  28              50.17                 1.4468     24.48

1. Number of predictors includes sampling covariates and geographic distance predictor.
Table 7. Summary of relative contribution (sum of coefficient values) and partial deviance explained (percent) by
variables included in the two GDM models. Results for individual variables are given in Table 8.
Variable    Model 1                                         Model 2
group       Summed coefficients      Partial % deviance     Summed coefficients      Partial % deviance
            (relative contribution)  explained              (relative contribution)  explained
water       11.25 (43.25%)           1.10                   9.06 (37.02%)            1.16
energy      7.49 (28.78%)            0.77                   8.40 (34.33%)            1.20
soil        1.35 (5.18%)             0.35                   1.36 (5.56%)             0.36
geoscience  2.06 (7.90%)             0.43                   2.08 (8.51%)             0.43
terrain     0.91 (3.51%)             0.12                   0.98 (3.99%)             0.13
other       2.96 (11.38%)            0.56                   2.60 (10.60%)            0.54
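The relative contribution percentages in Table 7 appear to be each group's summed coefficients expressed as a share of the model total (26.02 for Model 1 and 24.48 for Model 2 in Table 6). The sketch below recomputes them from the rounded Table 7 values; the function and variable names are illustrative, and because the inputs are rounded to two decimals the results agree with the published percentages only to within about 0.03 percentage points.

```python
# Summed coefficients per variable group, copied from Table 7.
summed_coefficients = {
    "Model 1": {"water": 11.25, "energy": 7.49, "soil": 1.35,
                "geoscience": 2.06, "terrain": 0.91, "other": 2.96},
    "Model 2": {"water": 9.06, "energy": 8.40, "soil": 1.36,
                "geoscience": 2.08, "terrain": 0.98, "other": 2.60},
}

def relative_contribution(groups):
    """Return each group's share (%) of the model's total summed coefficients."""
    total = sum(groups.values())
    return {name: 100.0 * value / total for name, value in groups.items()}

for model, groups in summed_coefficients.items():
    shares = relative_contribution(groups)
    print(model, f"total = {sum(groups.values()):.2f}")
    for name, share in shares.items():
        print(f"  {name}: {share:.2f}%")
```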
Table 8. Relative contribution (sum of predictor coefficient values) and partial deviance explained (%) by
variables included in the two GDM models for vascular plant compositional dissimilarity. Shading identifies the
variable groups: two climate, three substrate and covariates summarised in Table 7. COVARPLANTS refers to two
sampling covariates which take into account sampling inadequacies through the number of species and
observation records aggregated at the 0.01° grid scale (see Williams et al. 2010).
Variable             Model 1                               Model 2
                     Relative      Partial deviance        Relative      Partial deviance
                     contribution  explained (%)           contribution  explained (%)
RAINX                5.460645      0.17308                 -             -
EVAPX                1.125213      0.090956                -             -
EVAPI                0.840935      0.069545                -             -
ADEFX                -             -                       3.113301      0.028234
ADEFI                -             -                       1.525199      0.243337
RPRECMIN             1.115339      0.066047                1.586704      0.154569
SLRAIN1              2.466874      0.645147                2.608743      0.676711
SLRAIN2              0.24507       0.053681                0.227876      0.056239
MAXTX                1.131683      0.02804                 0             0
MAXTI                2.006137      0.148737                2.736282      0.398598
TMAXABSX             0.314126      0.024183                0.413698      0.030735
RADNX                0.959297      0.10692                 1.126785      0.065975
RADNI                0.729256      0.123138                0.982256      0.253473
TRNGX                1.025938      0.095511                1.081095      0.096862
RTIMIN               0.653083      0.112414                0.87582       0.205535
RH2MAX               0             0                       0.525599      0.034251
WINDSPMIN            0.669577      0.133405                0.662541      0.113334
HPEDALITY            0.212145      0.127517                0.208824      0.125604
COARSE               0.18639       0.055869                0.182173      0.053381
CLAY                 0.446703      0.095465                0.451461      0.096171
CALCRETE             0.148287      0.041682                0.173019      0.059234
BDENSITY             0.354797      0.030082                0.344541      0.028426
WII_WGS1KB           0.294236      0.056399                0.305238      0.057207
GRAVITY              1.152121      0.159316                1.161817      0.162091
GEOLLRNGEAGE         0.609643      0.211048                0.615012      0.214859
EROSIONAL            0.122406      0.028733                0.120382      0.028761
ROUGHNESS            0.47921       0.053093                0.498765      0.056615
MRVBF                0.310472      0.037011                0.358424      0.047646
COVARPLANTS (2)      0.187043      0.021233                0.187441      0.033362
GEOGRAPHIC DISTANCE  2.77459       0.537892                2.407314      0.506649
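The group totals in Table 7 can be recovered by summing the per-variable values in Table 8. The sketch below does this for Model 1. The assignment of variables to groups is an assumption here: the shading that identifies the groups is not available in this text, so the grouping was inferred from variable order and checked against the Table 7 totals. Names such as `model1_rows` are illustrative.

```python
from collections import defaultdict

# Model 1 rows from Table 8: (assumed group, variable, coefficient sum, partial deviance %).
# Variables not used in Model 1 (ADEFX, ADEFI) are omitted.
model1_rows = [
    ("water", "RAINX", 5.460645, 0.17308), ("water", "EVAPX", 1.125213, 0.090956),
    ("water", "EVAPI", 0.840935, 0.069545), ("water", "RPRECMIN", 1.115339, 0.066047),
    ("water", "SLRAIN1", 2.466874, 0.645147), ("water", "SLRAIN2", 0.24507, 0.053681),
    ("energy", "MAXTX", 1.131683, 0.02804), ("energy", "MAXTI", 2.006137, 0.148737),
    ("energy", "TMAXABSX", 0.314126, 0.024183), ("energy", "RADNX", 0.959297, 0.10692),
    ("energy", "RADNI", 0.729256, 0.123138), ("energy", "TRNGX", 1.025938, 0.095511),
    ("energy", "RTIMIN", 0.653083, 0.112414), ("energy", "RH2MAX", 0.0, 0.0),
    ("energy", "WINDSPMIN", 0.669577, 0.133405),
    ("soil", "HPEDALITY", 0.212145, 0.127517), ("soil", "COARSE", 0.18639, 0.055869),
    ("soil", "CLAY", 0.446703, 0.095465), ("soil", "CALCRETE", 0.148287, 0.041682),
    ("soil", "BDENSITY", 0.354797, 0.030082),
    ("geoscience", "WII_WGS1KB", 0.294236, 0.056399),
    ("geoscience", "GRAVITY", 1.152121, 0.159316),
    ("geoscience", "GEOLLRNGEAGE", 0.609643, 0.211048),
    ("terrain", "EROSIONAL", 0.122406, 0.028733), ("terrain", "ROUGHNESS", 0.47921, 0.053093),
    ("terrain", "MRVBF", 0.310472, 0.037011),
    ("other", "COVARPLANTS", 0.187043, 0.021233),
    ("other", "GEOGRAPHIC DISTANCE", 2.77459, 0.537892),
]

coefficient_sum = defaultdict(float)
partial_deviance = defaultdict(float)
for group, _variable, coefficient, deviance in model1_rows:
    coefficient_sum[group] += coefficient
    partial_deviance[group] += deviance

for group in coefficient_sum:
    print(f"{group}: {coefficient_sum[group]:.2f}  {partial_deviance[group]:.2f}")
```

Under this grouping the sums round to the Model 1 values reported in Table 7, and the overall coefficient total rounds to the 26.02 given in Table 6.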
Table 9. Overview of variables used in MaxEnt and GDM models compared with variables tested. Model details
are given in Table 5 for the MaxEnt analysis and Table 8 for the GDM analysis.
Group
Variable
MaxEnt 1a
MaxEnt 1b
MaxEnt 2a
GDM 2
Included
ADEFI
1
1
ADEFX
1
1
ARID_MAX
ARID_MIN
water
1
MaxEnt 2b
GDM 1
1
1
1
2
EVAPI
1
1
1
3
EVAPX
1
1
1
3
RAINI
0
RAINX
1
RPRECMAX
1
1
1
RPRECMIN
1
1
SLRAIN1
1
SLRAIN2
1
1
1
3
1
3
1
1
4
1
1
1
1
6
1
1
1
1
5
SRAIN1MP
0
SRAIN2MP
0
MAXTI
1
1
1
1
MINTI
1
1
1
1
4
MINTX
1
1
1
1
4
MAXTX
1
1
1
RADNI
1
1
1
1
1
4
RADNX
1
1
1
1
1
5
RH2MAX
1
1
1
1
1
5
1
1
RH2MIN
2
RTIMAX
0
RTIMIN
energy
1
1
RTXMAX
RTXMIN
2
0
1
1
1
3
TMAXABSX
1
1
2
TMINABSI
0
TRNGA
0
TRNGI
0
TRNGX
1
1
1
1
1
1
6
VPD2MAX
0
VPD2MIN
0
WINDRI
0
WINDRX
0
WINDSPMAX
0
WINDSPMIN
BDENSITY
1
1
1
1
CALCRETE
soil
6
CLAY
COARSE
1
1
2
1
1
4
1
1
2
1
1
4
1
1
2
1
1
DATASUPT
HPEDALITY
KS_ERR
0
2
0
Group
Variable
MaxEnt 1a
KSAT
NUTRIENTS
MaxEnt 2a
MaxEnt 2b
GDM 1
GDM 2
1
1
4
1
2
1
1
SOLDEPTH
SOLPAWHC
MaxEnt 1b
1
1
1
1
1
2
WR_UNR
GEOLLMEANAGE
0
1
1
1
1
GEOLLRNGEAGE
geoscience
FERT
4
1
1
GRAVITY
1
1
1
1
1
2
3
1
1
MAGNETICS
terrain
Included
4
0
WIII_WGS1KB1
-
-
-
-
1
1
2
EROSIONAL
1
1
1
1
1
1
6
MRRTF
1
1
1
1
MRVBF
1
1
1
1
4
1
1
RELIEF
0
RIDGETOPFLAT
0
ROUGHNESS
SLOPE
1
1
1
1
1
1
1
1
1
TWI
VALLEYBOTTOM
6
1
2
4
0
1. Weathering intensity index not tested in the MaxEnt models.
4
Figure 1. Predicted probability of presence of Eucalyptus delegatensis across its natural distribution for four
alternative MaxEnt models. White areas indicate prediction values <0.1 or areas outside the analysis domain.