Appendix S1 – Testing data collection bias

MAXENT models rely on unbiased datasets. We performed the following two analyses to ensure that our datasets were unbiased and to allow for error in the methods used to compile the distribution information.

First, because we extracted locality data from the published literature, we needed to confirm that any error in point locality data would not affect model performance. To do this we created a 25 km buffer around each point and then randomly sampled within this buffer ('jittering'). The jittered points were added to the points that had GPS coordinates (where available) to 'anchor' them with highly accurate locality data. This was repeated to give 10 replicates, each of which included random error around every point extracted from the literature. Running these as separate models in Maxent and then quantifying niche overlap using ENMtools revealed no significant difference between the models built on the extracted points and the models built on the extracted points with random error (t-test: Aust.hist p = 0.379; Aust.curr p = 0.376).

Secondly, to test whether the datasets used in our models were free of roadside-sampling bias, we built datasets that randomly sampled both near to and away from roads in southern Australia. We first obtained a shapefile of Australian roads from Geoscience Australia (www.ga.gov.au). We then defined an area with a northern border stretching across Western Australia, South Australia, Victoria and New South Wales; this border represents the possible extent of the distribution of Halotydeus destructor, including the available area between known localities. We created a 1.5 km buffer around each road and randomly sampled 5000 points restricted to this buffer ('inside'), then randomly sampled a further 5000 points outside of this buffer ('outside').

Figure 1. Polygon used to mask southern Australia, with roads shapefile.
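The jittering step described above (displacing each literature record to a uniform random location within its 25 km buffer, repeated for 10 replicates) can be sketched as follows. This is a minimal NumPy illustration, not the original workflow: it assumes projected, metre-based coordinates (lon/lat data would first need projecting), and the function name and example localities are hypothetical.

```python
import numpy as np

RADIUS_M = 25_000   # 25 km buffer radius around each literature record
N_REPLICATES = 10   # number of replicate jittered datasets, as in Appendix S1

def jitter(points_xy, radius=RADIUS_M, rng=None):
    """Displace each (x, y) point to a uniform random location within
    `radius` metres. Taking the square root of a uniform variate for the
    radial distance makes the sample uniform over the disc's area rather
    than clustered towards the centre."""
    rng = np.random.default_rng() if rng is None else rng
    pts = np.asarray(points_xy, dtype=float)
    n = len(pts)
    r = radius * np.sqrt(rng.random(n))
    theta = rng.random(n) * 2.0 * np.pi
    offsets = np.column_stack([r * np.cos(theta), r * np.sin(theta)])
    return pts + offsets

# Hypothetical literature localities in projected metres (illustrative only)
lit_points = np.array([[300_000.0, 6_100_000.0],
                       [450_000.0, 6_050_000.0]])

# Ten replicate datasets, each with independent random error per point
replicates = [jitter(lit_points, rng=np.random.default_rng(seed))
              for seed in range(N_REPLICATES)]
```

In the analysis above, each replicate would then be combined with the GPS-anchored records and run as a separate Maxent model.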
We considered the inside and outside datasets as separate 'species' and analysed them accordingly with the dismo package in R (version 2.10.1; R Development Core Team 2009) to determine whether they were drawn from the same environments. We extracted values of the variables used in this study (arid, bio4, bio8, bio18, bio19) for each of the 5000 points in each dataset, and then performed a Principal Components Analysis (PCA) to determine how much the climate space of the two 'species' overlapped.

Figure 2. Principal Components Analysis of randomly sampled datasets, inside (red circles) and outside (blue circles) a road-buffer zone across southern Australia.

The datasets almost entirely overlap one another, suggesting that they are drawn from the same climate space; the axes hold the same explanatory power for both. We conclude from this analysis that roadside sampling does not bias the climate variables chosen for this study.

Appendix S2 – Differences between climate datasets

These figures show the differences between climate datasets (Current [1975–2010] – Historical [1921–1995]) for the variables used in this study, together with mean temperature and precipitation. (a) "Mean Annual Temperature" (bio1), legend = temperature (˚C); (b) "Temperature Seasonality (Coefficient of Variation)" (bio4), legend = coefficient of variation; (c) Aridity index (arid), legend = change in aridity index (the aridity index quantifies the ratio of precipitation availability to atmospheric water demand, with low values indicating arid conditions and high values humid climates); (d) "Annual Precipitation" (bio12), legend = precipitation (mm); (e) "Precipitation of Warmest Quarter" (bio18), legend = precipitation (mm); (f) "Precipitation of Coldest Quarter" (bio19), legend = precipitation (mm).
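The climate-space comparison in Appendix S1 (pooling the inside and outside points, extracting the five climate variables, and checking overlap of their PCA scores) can be sketched as follows. The original analysis used R's dismo package; this is a minimal Python/NumPy translation of the logic, with synthetic stand-in values in place of the real climate variables.

```python
import numpy as np

def pca_scores(X, n_components=2):
    """Project the rows of X onto the first principal components,
    computed via SVD of the centred data matrix."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T

rng = np.random.default_rng(0)
# Synthetic stand-ins for the five variables (arid, bio4, bio8, bio18, bio19)
# extracted at each point; both 'species' are drawn here from the same
# distribution, mimicking fully overlapping climate space.
inside = rng.normal(size=(5000, 5))
outside = rng.normal(size=(5000, 5))

# Fit the PCA on the pooled data, then compare where each 'species' falls
scores = pca_scores(np.vstack([inside, outside]))
scores_in, scores_out = scores[:5000], scores[5000:]

# Near-identical score distributions indicate the same climate space
diff = np.abs(scores_in.mean(axis=0) - scores_out.mean(axis=0))
print("max difference in mean PC scores:", diff.max())
```

Fitting the PCA on the pooled points, rather than separately per 'species', is what makes the axes directly comparable between the two score clouds.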