Text S1. Variable summary

Table S1 lists the variables used in modelling species extinction risk and predicting Data Deficient status. This set of variables is derived from the dataset analyzed by Sodhi et al. (2008), where a more detailed description of the variables can be found. We updated the 2008 dataset to reflect changes in species taxonomy and distributions, and updated the associated spatial variables accordingly. Extinction risk status and taxonomy were taken directly from the IUCN Red List, and extent of occurrence (EOO) was derived from IUCN/GAA range maps. Both were updated from the 2008 dataset to reflect the most recent information available from the respective sources.

Species trait data were compiled by Sodhi et al. (2008) and remain unchanged. These data were compiled from a combination of field guides, herpetology textbooks, monographs, journal articles, online amphibian databases and websites, and expert opinion (see Sodhi et al. 2008 for references). The variables were originally chosen because data were available for them and because they are considered important in determining species susceptibility to extinction.

Percentage range lost was calculated by Sodhi et al. (2008) using the GAA range polygons to extract data from modified version 3 of the Global Land Cover 2000 dataset (GLC 2000), where land use within each species range classified as human modified was considered 'lost'. We updated these values, using the same GLC data, to reflect the changed species polygons. Mean bioclimatic variables within each species distribution range were similarly recalculated to account for changes in species ranges since the 2008 dataset, following the procedure used by Sodhi et al. (2008). Bioclimatic variables were determined using the 'WorldClim' database (Version 1.4; Hijmans et al., 2005).

Extent of occurrence (area) is widely found to be the most important determinant of species extinction risk (e.g.
Cooper et al., 2008). Thermal and humidity/precipitation variables are likely to be highly important for aquatic-breeding ectotherms and to interact with disease susceptibility (Pounds et al., 2006). Latitude and longitude are included in the model to allow for differing regional patterns in extinction threat. Habitat loss is considered the single largest threat to species persistence globally (e.g. Stuart et al., 2004) and is incorporated as percentage range lost. Intrinsic life-history characteristics are likely to modify species-specific extinction risk through complex interactions with other threat processes (e.g. Murray et al., 2011). A number of life-history variables are included in the model, given preconceived hypotheses as to their importance: body size has links to hunting pressure (Warkentin et al., 2009); adult niche, egg deposition site and larval development site are likely to alter species susceptibility to extinction risks; fertilization mode (internal/external), reproductive cycle (seasonal/aseasonal) and reproductive mode (egg laying, live-bearing, direct developing) are all linked to the ability of populations to recruit following declines; parental care behaviour is likely to enhance offspring survival; and a species' primary broad habitat preference is linked to its ability to persist in degraded habitats.

| Group | Variable | Source | Role in extinction risk |
| --- | --- | --- | --- |
| Dependent | IUCN extinction risk | From (IUCN, 2011) | |
| Geographic | Area (EOO) | GAAS (IUCN, 2011) | Area commonly the most important driver of extinction risk. |
| Geographic | Lat./Long. Max./Min. | From GAAS (IUCN, 2011) | Different threat processes may occur in different spatial regions. |
| Geographic | Temperature Max./Min./Mean | To GAAS from (Hijmans et al., 2005) | Ectotherms rely on environmental temperatures. |
| Geographic | Precipitation Max./Min./Mean | To GAAS from (Hijmans et al., 2005) | Most amphibians require water bodies to breed. |
| Geographic | Humidity | To GAAS from (Hijmans et al., 2005) | Most amphibians require humid conditions. |
| Extrinsic Threat | Percent range lost | To GAAS from (European Commission's Joint Research Centre (JRC) Institute for Environment and Sustainability (IES), 2002) | Habitat loss a major driver of extinction. |
| Life History | Size class | From published literature, original species descriptions and expert opinion | Robustness and hunting pressure related to body size. |
| Life History | Niche (arboreal/terrestrial/fossorial/riparian) | From published literature, original species descriptions and expert opinion | Land use change removes niches? |
| Life History | Fertilization mode (internal/external) | From published literature, original species descriptions and expert opinion | Reproductive success. |
| Life History | Reproductive cycle (seasonal/aseasonal) | From published literature, original species descriptions and expert opinion | Reproductive success. |
| Life History | Reproductive mode (direct development/larval) | From published literature, original species descriptions and expert opinion | Reproductive success. |
| Life History | Egg site (stream/phytotelm/pond/vegetation) | From published literature, original species descriptions and expert opinion | Site availability affected by habitat loss? |
| Life History | Larval site (stream/phytotelm/pond) | From published literature, original species descriptions and expert opinion | As above. |
| Life History | Parental care (present/absent) | From published literature, original species descriptions and expert opinion | Increased embryo/larval survival? |
| Life History | Broad habitat (primary forest/secondary/riparian) | From published literature, original species descriptions and expert opinion | Habitat obligates more at risk? |
| Taxonomic | Order | From (IUCN, 2011) | Differential responses based on taxonomy. |

Table S1. Variables used in predictive models. Geographic variables are taken from the IUCN Global Amphibian Assessment shapefiles (GAAS) or derived from these areas using Geographic Information Systems (GIS) (IUCN, 2011). Climatic variables were extracted to GAAS from the WorldClim database (Hijmans et al., 2005). Range loss was derived from European Commission Joint Research Centre (2002). Complete detail of the sources of life history data is given in Sodhi et al. (2008).

| Order | IUCN total | Dataset: total (%) | Dataset: assessed (%) | Dataset: DD (%) |
| --- | --- | --- | --- | --- |
| Anura | 6115 | 4982 (81.5) | 3886 (63.5) | 1096 (17.6) |
| Caudata | 615 | 502 (81.6) | 458 (74.5) | 44 (7.2) |
| Gymnophiona | 189 | 167 (88.4) | 58 (30.7) | 109 (57.7) |
| Total | 6919 | 5651 (81.8) | 4402 (63.6) | 1249 (18.1) |

Table S2. Completeness of amphibian taxonomic richness in the analyzed dataset. Overall and by taxonomic order, amphibian richness is based on the most recent values from the IUCN, totalling 6919 species.
The number of species analyzed in our dataset is given for the total dataset and for the assessed and Data Deficient (DD) subsets, with percentages of IUCN total richness in parentheses.

| | Total dataset | DD subset | Assessed subset |
| --- | --- | --- | --- |
| Data points | 152577 | 31971 | 117132 |
| Missing data points (% of total) | 3474 (2.3) | 1752 (5.2) | 1722 (1.4) |
| Number (%) of species with 1+ missing data point | 4024/5651 (71.2) | 1227/1249 (98.2) | 2797/4402 (63.5) |

Table S3. Summary of missing data points in the analyzed amphibian database. The number of data points in our database is shown for the total dataset (5651 species x 27 predictor variables) and for the Data Deficient and assessed subsets. The number of missing predictor variable values in each dataset is indicated, as well as the number of species with any missing data. Most species records contain at least one missing value (63.5% or more of species in each subset), although the proportion of data points actually missing is far smaller (5.2% or less).

| Model | NA excluded error % (N=1605) | NA imputed error % (N=4402) |
| --- | --- | --- |
| ~Area + IUCN trend + Taxonomy | 37.4 | 30.8 |
| ~Life history - Area + Taxonomy | 57.0 | 46.7 |
| ~Geographic - Area + Taxonomy | 43.6 | 34.4 |
| ~Area + Life history + Geographic + Taxonomy | 33.4 | 26.3 |

Table S4. Model inaccuracy (out-of-bag, OOB, error %) for the IUCN assessed species subset with missing data (NA) excluded and imputed. 'IUCN trend' is only relevant to IUCN assessed species (see IUCN, 2011 for further details) and could not be used in predictive models for Data Deficient species.

| Predicted \ IUCN assessed | LC | NT | VU | EN | CR | EW | EX | Total |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| LC | 2119 | 190 | 123 | 48 | 26 | 0 | 0 | 2506 |
| NT | 27 | 64 | 29 | 9 | 2 | 0 | 0 | 131 |
| VU | 45 | 59 | 289 | 96 | 26 | 0 | 1 | 516 |
| EN | 39 | 47 | 126 | 482 | 134 | 2 | 9 | 839 |
| CR | 10 | 4 | 33 | 84 | 253 | 0 | 9 | 393 |
| EW | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| EX | 0 | 0 | 0 | 2 | 1 | 0 | 14 | 17 |
| Total | 2240 | 364 | 600 | 721 | 442 | 2 | 33 | 4402 |

Table S5. Model accuracy for IUCN assessed species. Rows give the predicted class and columns the IUCN assessed class. Agreement between IUCN risk status and the model output was assessed with a chi-squared test on the column and row totals, with the extinct categories (EW and EX) merged.
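As a worked illustration of the Table S5 summary, the sketch below recomputes the row and column totals, merges the extinct categories, and derives an overall agreement rate and a chi-squared statistic. It is not the authors' code: the names `MATRIX`, `merge_extinct`, `overall_agreement` and `chi_squared_homogeneity` are hypothetical, and treating the test as a 2 x 6 comparison of predicted totals against IUCN totals is only one possible reading of "chi-squared test on column/row totals".

```python
# Confusion matrix transcribed from Table S5.
# Rows: predicted class; columns: IUCN assessed class.
CLASSES = ["LC", "NT", "VU", "EN", "CR", "EW", "EX"]
MATRIX = [
    [2119, 190, 123, 48, 26, 0, 0],
    [27, 64, 29, 9, 2, 0, 0],
    [45, 59, 289, 96, 26, 0, 1],
    [39, 47, 126, 482, 134, 2, 9],
    [10, 4, 33, 84, 253, 0, 9],
    [0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 2, 1, 0, 14],
]


def merge_extinct(totals):
    """Combine the last two entries (EW and EX) into one extinct total."""
    return totals[:-2] + [totals[-2] + totals[-1]]


def overall_agreement(matrix):
    """Proportion of species on the diagonal (predicted == assessed)."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


def chi_squared_homogeneity(a, b):
    """Pearson chi-squared statistic for a 2 x k table with rows a and b.

    Assumption: the two sets of marginal totals are compared as if they
    were two independent samples; the source does not spell this out.
    """
    grand = sum(a) + sum(b)
    stat = 0.0
    for x, y in zip(a, b):
        column_total = x + y
        for observed, row_total in ((x, sum(a)), (y, sum(b))):
            expected = row_total * column_total / grand
            if expected > 0:
                stat += (observed - expected) ** 2 / expected
    return stat


predicted_totals = merge_extinct([sum(row) for row in MATRIX])
iucn_totals = merge_extinct([sum(col) for col in zip(*MATRIX)])
agreement = overall_agreement(MATRIX)
chi2 = chi_squared_homogeneity(predicted_totals, iucn_totals)

print("Predicted totals:", predicted_totals)
print("IUCN totals:     ", iucn_totals)
print(f"Overall agreement: {agreement:.3f}")
print(f"Chi-squared statistic (5 d.f.): {chi2:.2f}")
```

The statistic is computed by hand to keep the sketch dependency-free; if SciPy is available, `scipy.stats.chi2_contingency` applied to the same two rows of totals would also return a p-value.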
Model misclassification was assessed by averaging, row-wise, the number of misclassified species according to their numeric distance (in classes) from the correct classification across the matrix.

Figure S1. Species richness distribution for Data Deficient amphibians, showing the most recent available IUCN data (correct as of February 2012) and including Data Deficient species excluded from our analysis.

Data S1. Text file listing the Data Deficient species analyzed, their predicted extinction class, the certainty of each prediction (proportion of 1000 randomForest trees matching the predicted status) and the complete extinction risk class probability distribution.

REFERENCES

European Commission's Joint Research Centre (JRC) Institute for Environment and Sustainability (IES) (2002) GLC 2000: global land cover mapping for the year 2000. Available at: http://bioval.jrc.ec.europa.eu/products/glc2000/glc2000.php (accessed 12 July 2008).

Hijmans, R.J., Cameron, S.E., Parra, J.L., Jones, P.G. & Jarvis, A. (2005) Very high resolution interpolated climate surfaces for global land areas. International Journal of Climatology, 25, 1965–1978.