DDI_844_sm_AppS1-2

Appendix S1 – Testing data collection bias
Maxent models rely on unbiased datasets. We performed the following two analyses to check that our
datasets were unbiased and to allow for error in the methods used to compile the distribution
information. First, as we extracted locality data from the published literature, we needed to confirm that
any error in point locality data would not affect model performance. To do this we created a 25 km buffer
around each point and then randomly sampled within this buffer (jittering). The jittered points were
added to the points that had GPS coordinates (when available) to ‘anchor’ them with highly accurate
locality data. This was repeated to give 10 replicates, all of which included random error surrounding each
point extracted from the literature. Running these as separate models in Maxent and then quantifying
niche overlap using ENMTools revealed no significant difference between the models built on the
extracted points and the models built on the extracted points with random error (t-test: Aust.hist p =
0.379; Aust.curr p = 0.376).
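The jittering step described above can be sketched as follows. This is a minimal illustration in Python, not the code used for the analysis: the function names, the flat-earth offset approximation, the Earth radius constant and the fixed random seed are all assumptions made for the example.

```python
import math
import random

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (assumption of this sketch)


def jitter_point(lat, lon, radius_km=25.0, rng=random):
    """Draw one point uniformly from a disc of radius_km around (lat, lon)."""
    # The square root makes the draw uniform over the disc's area
    distance_km = radius_km * math.sqrt(rng.random())
    bearing = rng.uniform(0.0, 2.0 * math.pi)
    # Small-offset approximation, adequate for distances of tens of km
    dlat = distance_km * math.cos(bearing) / EARTH_RADIUS_KM
    dlon = distance_km * math.sin(bearing) / (
        EARTH_RADIUS_KM * math.cos(math.radians(lat)))
    return lat + math.degrees(dlat), lon + math.degrees(dlon)


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km, used here only to check the jitter."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def make_replicates(literature_points, gps_points, n_replicates=10, seed=0):
    """Build replicate datasets: jittered literature points plus GPS anchors."""
    rng = random.Random(seed)
    replicates = []
    for _ in range(n_replicates):
        jittered = [jitter_point(lat, lon, 25.0, rng)
                    for lat, lon in literature_points]
        # GPS points are appended unchanged so they 'anchor' each replicate
        replicates.append(jittered + list(gps_points))
    return replicates
```

Each replicate returned by `make_replicates` would then be modelled separately, with the niche overlap between replicates and the unjittered dataset compared afterwards.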
Secondly, to test whether the datasets used in our models were biased by roadside sampling,
we built datasets that randomly sampled both near to and away from roads in southern Australia. We
first obtained a shapefile of the roads for Australia from Geoscience Australia (www.ga.gov.au). An
area was defined with a northern border that stretched across Western Australia, South Australia,
Victoria and New South Wales. This border represents the possible extent of the distribution of
Halotydeus destructor, including available area between known localities. We created a 1.5 km buffer
around each road and randomly sampled 5000 points restricted to this buffer (‘inside’). We then
randomly sampled 5000 points outside of this buffer (‘outside’).
Figure 1. Polygon used to mask southern Australia, with roads shapefile.
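The inside/outside sampling can be illustrated with a small rejection-sampling sketch. It is a simplified stand-in for the GIS workflow, under several assumptions: coordinates are planar and in km, roads are given as straight line segments, the study area is a rectangle rather than the polygon mask, and all function names are hypothetical.

```python
import math
import random


def point_segment_distance(p, a, b):
    """Shortest planar distance from point p to the segment a-b."""
    px, py = p
    ax, ay = a
    bx, by = b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp to its end points
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def distance_to_roads(p, roads):
    """Distance from p to the nearest road segment."""
    return min(point_segment_distance(p, a, b) for a, b in roads)


def sample_inside_outside(roads, extent, n_points=5000, buffer_km=1.5, seed=0):
    """Randomly sample n_points inside and n_points outside the road buffer."""
    rng = random.Random(seed)
    xmin, ymin, xmax, ymax = extent
    inside, outside = [], []
    while len(inside) < n_points or len(outside) < n_points:
        p = (rng.uniform(xmin, xmax), rng.uniform(ymin, ymax))
        if distance_to_roads(p, roads) <= buffer_km:
            if len(inside) < n_points:
                inside.append(p)
        elif len(outside) < n_points:
            outside.append(p)
    return inside, outside
```

For example, `sample_inside_outside([((0, 50), (100, 50))], (0, 0, 100, 100))` would return two lists of 5000 points each for a single hypothetical road crossing a 100 km square.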
We considered the inside and outside datasets as separate ‘species’ and analysed them accordingly with
the dismo package in R (version 2.10.1; R Development Core Team 2009) to determine if they were
drawn from the same environments. We extracted variable information for the suite used in this study
(arid, bio4, bio8, bio18, bio19) for each of the 5000 points in each dataset. We then performed a
Principal Components Analysis (PCA) to determine how much the climate space of the two
‘species’ overlapped.
Figure 2. Principal Components Analysis of randomly sampled datasets, inside (red circles) and outside
(blue circles) a road-buffer zone across southern Australia.
The datasets almost entirely overlap one another, suggesting that they are drawn from the same climate
space; the axes hold the same explanatory power for both. We can conclude from this analysis that
roadside sampling does not introduce bias in the climate variables chosen for this study.
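The comparison of the two ‘species’ in climate space can be sketched as a pooled PCA, in which both datasets are standardized together and projected onto the same axes. The sketch below is a plain-Python illustration, not the R workflow used for the analysis: the power-iteration eigen-solver, the function names and the fixed seed are assumptions, and each input row is assumed to hold the five climate values extracted for one point.

```python
import math
import random


def standardize(rows):
    """Centre each column and scale it to unit variance."""
    n, k = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(k)]
    sds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / (n - 1))
           for j in range(k)]
    return [[(r[j] - means[j]) / sds[j] for j in range(k)] for r in rows]


def correlation_matrix(z_rows):
    """Covariance of standardized rows, i.e. the correlation matrix."""
    n, k = len(z_rows), len(z_rows[0])
    return [[sum(r[i] * r[j] for r in z_rows) / (n - 1) for j in range(k)]
            for i in range(k)]


def leading_eigenpair(matrix, rng, iterations=1000):
    """Power iteration for the largest eigenvalue and its unit eigenvector."""
    k = len(matrix)
    v = [rng.uniform(0.5, 1.0) for _ in range(k)]
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(k)) for i in range(k)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    mv = [sum(matrix[i][j] * v[j] for j in range(k)) for i in range(k)]
    return sum(a * b for a, b in zip(v, mv)), v


def pooled_pca(inside, outside, n_components=2, seed=0):
    """PCA on both groups together, so their scores share one set of axes."""
    rng = random.Random(seed)
    z_rows = standardize(list(inside) + list(outside))
    matrix = correlation_matrix(z_rows)
    k = len(matrix)
    eigenvalues, axes = [], []
    for _ in range(n_components):
        value, vec = leading_eigenpair(matrix, rng)
        eigenvalues.append(value)
        axes.append(vec)
        # Deflation: remove the component just found before the next pass
        matrix = [[matrix[i][j] - value * vec[i] * vec[j] for j in range(k)]
                  for i in range(k)]
    scores = [[sum(z * a for z, a in zip(row, axis)) for axis in axes]
              for row in z_rows]
    return eigenvalues, axes, scores[:len(inside)], scores[len(inside):]
```

Plotting the two returned score lists on the same axes gives a scatter of the kind shown in Figure 2; heavily overlapping clouds indicate that the two groups occupy the same climate space.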
Appendix S2 – Differences between climate datasets
These figures show differences between climate datasets (Current [1975-2010] – Historical [1921-1995])
for the variables used in this study, and for mean temperature and precipitation. (a) “Mean Annual
Temperature” (bio1), legend = temperature (˚C); (b) “Temperature Seasonality (Coefficient of
Variation)” (bio4), legend = coefficient of variation; (c) Aridity index (arid), legend = change in aridity
index (the aridity index quantifies the ratio of precipitation availability over atmospheric water
demand, with low values indicating arid conditions and high values humid climates); (d) “Annual
Precipitation” (bio12), legend = precipitation (mm); (e) “Precipitation of Warmest Quarter” (bio18),
legend = precipitation (mm); (f) “Precipitation of Coldest Quarter” (bio19), legend = precipitation
(mm).