Generalized linear models
Generalized linear models (GLMs) are mathematical extensions of linear models that can handle non-linear relationships and a variety of statistical distributions [McCullagh and Nelder, 1989]. In this study, GLMs were based on first- and second-order polynomials and a logit link function. GLMs were calibrated with an automatic stepwise procedure based on Akaike's information criterion (AIC).

Generalized additive models
Generalized additive models (GAMs) are semi-parametric extensions of GLMs [Hastie and Tibshirani, 1990]. To relate the occurrence of ESPs to the environmental predictors, the maximum smoothness of the smoothing functions was set to 4 and was subsequently optimized by the model-fitting function [Wood, 2011].

Generalized boosting method
Generalized boosting method (GBM) is a sequential ensemble modelling method that combines a large number of iteratively fitted classification trees into a single model with improved prediction accuracy [Elith et al., 2008]. GBMs automatically incorporate interactions between predictors and can model highly complex non-linear systems. GBMs (assuming a Bernoulli distribution) were fitted with the interaction depth set to six, the bagging fraction to 0.75 and the maximum number of trees to 3000.

Classification tree analysis
Classification tree analysis (CTA) generates a binary tree through binary recursive partitioning, i.e. the nodes are split based on true/false answers concerning the values of the predictors [Venables and Ripley, 2002]. CTA builds an overgrown tree based on a node-purity criterion, which is later pruned back via ten-fold cross-validation to avoid over-fitting [Breiman et al., 1984].

Artificial neural network
Artificial neural network (ANN) is a machine-learning method inspired by the functioning of the human brain. This highly flexible method can build accurate predictive models when the functional form of the underlying relationships is unknown [Venables and Ripley, 2002]. ANNs were fitted with the maximum number of iterations set to 200 (the Biomod2 default). To prevent the ANNs from over-fitting, four-fold cross-validation was used to stop the training of the networks.

Random forest
Random forest (RF) generates a large ensemble of classification trees forming a “forest”. Each tree is built from a training dataset drawn at random from the observations (i.e. a bootstrap sample with replacement; n = 1200 in this study). In addition, four randomly selected explanatory variables were used to determine the best split at each node [Breiman, 2001]. This procedure is repeated for every tree in the ensemble, and the RF prediction is the class assigned most frequently across the trees (i.e. a majority vote). In this study, we set the maximum number of trees to 500.

Multivariate adaptive regression splines
Multivariate adaptive regression splines (MARS) merge standard linear regression, mathematical spline construction and binary recursive partitioning to produce a local model in which the relationships between the response and the explanatory variables can be either linear or non-linear [Friedman, 1991]. In this study, we set the maximum interaction degree to two and the forward stepwise stopping threshold to its default value (0.001), with pruning of the model allowed.
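To make the settings above concrete, the sketch below configures analogues of the GLM, GBM, CTA, ANN and RF classifiers with the quoted hyper-parameters and evaluates them with cross-validation. This is a minimal illustration on synthetic presence/absence data, using scikit-learn as a stand-in for the Biomod2/R implementations actually used in the study; the parameter names (max_depth, subsample, n_estimators, max_features, max_iter, ccp_alpha) are scikit-learn's and only approximate the Biomod2 settings.

```python
# Minimal sketch (not the study's Biomod2/R workflow): analogues of the GLM, GBM,
# CTA, ANN and RF classifiers with the hyper-parameters quoted in the text,
# fitted to synthetic presence/absence data.
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(42)
X = rng.normal(size=(1200, 6))  # six hypothetical environmental predictors
y = (X[:, 0] + 0.5 * X[:, 1] ** 2 + rng.normal(scale=0.5, size=1200) > 0).astype(int)  # presence/absence

models = {
    # GLM: logit link with first- and second-order polynomial terms
    "GLM": make_pipeline(PolynomialFeatures(degree=2), LogisticRegression(max_iter=1000)),
    # GBM: Bernoulli-type loss, interaction depth 6, bagging fraction 0.75, up to 3000 trees
    "GBM": GradientBoostingClassifier(max_depth=6, subsample=0.75, n_estimators=3000),
    # CTA: single classification tree, pruned via a cost-complexity penalty
    "CTA": DecisionTreeClassifier(ccp_alpha=0.005),
    # ANN: feed-forward network, maximum of 200 training iterations
    "ANN": make_pipeline(StandardScaler(), MLPClassifier(max_iter=200)),
    # RF: 500 trees, four candidate predictors per split, bootstrap resampling
    "RF": RandomForestClassifier(n_estimators=500, max_features=4, bootstrap=True),
}

for name, model in models.items():
    auc = cross_val_score(model, X, y, cv=4, scoring="roc_auc").mean()
    print(f"{name}: mean 4-fold cross-validated AUC = {auc:.3f}")
```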
Maximum entropy
Maximum entropy (MAXENT) is a flexible statistical modelling approach in which a target probability distribution is estimated by finding the probability distribution of maximum entropy (i.e. the distribution that is most spread out, or closest to uniform) [Phillips et al., 2006]. It has several advantages over some other machine-learning techniques, e.g. the MAXENT probability distribution has an explicit mathematical definition and the method is relatively insensitive to over-fitting [Phillips et al., 2004]. In our study, we set the maximum number of model iterations to 200.

Flexible discriminant analysis
Flexible discriminant analysis (FDA) is a non-parametric counterpart of linear discriminant analysis (LDA) [Hastie et al., 1994]. FDA is a multi-group non-linear classification technique in which the linear regression underlying LDA is replaced by a non-parametric regression method [Hastie et al., 1995]. The FDAs were fitted by optimal scoring using MARS as the regression method (the default option in Biomod2).

Surface range envelope
Surface range envelope (SRE) is a range-based technique in which the algorithm identifies locations where all predictors fall within the extreme values determined from a set of observations (the central 97.5% of the observed values in this study) [Carpenter et al., 1993]. SRE defines the suitable environmental conditions for the target ESPs as a rectangular envelope in the multidimensional environmental space; a minimal sketch of this envelope rule is given after the reference list below.

Literature cited:
Breiman, L., Friedman, J. H., Olshen, R. A., and C. J. Stone (1984), Classification and Regression Trees. Wadsworth Statistics/Probability Series, Wadsworth, Belmont, CA, USA, 358 pp.
Breiman, L. (2001), Random forests. Machine Learning 45(1), 5–32, doi: 10.1023/A:1010933404324
Carpenter, G., Gillison, A. N., and J. Winter (1993), DOMAIN: a flexible modelling procedure for mapping potential distributions of plants and animals. Biodiversity and Conservation 2(6), 667–680, doi: 10.1007/BF00051966
Elith, J., Leathwick, J. R., and T. Hastie (2008), A working guide to boosted regression trees. Journal of Animal Ecology 77(4), 802–813, doi: 10.1111/j.1365-2656.2008.01390.x
Friedman, J. H. (1991), Multivariate adaptive regression splines. Annals of Statistics 19(1), 1–141.
Hastie, T., Tibshirani, R., and A. Buja (1994), Flexible discriminant analysis by optimal scoring. Journal of the American Statistical Association 89(428), 1255–1270, doi: 10.1080/01621459.1994.10476866
Hastie, T., Buja, A., and R. Tibshirani (1995), Penalized discriminant analysis. The Annals of Statistics 23(1), 73–102.
Hastie, T., and R. Tibshirani (1990), Generalized Additive Models. Monographs on Statistics and Applied Probability 43, Chapman and Hall, New York.
McCullagh, P., and J. A. Nelder (1989), Generalized Linear Models, 2nd ed. Chapman & Hall, New York, 511 pp.
Phillips, S. J., Dudík, M., and R. E. Schapire (2004), A maximum entropy approach to species distribution modeling. In: Proceedings of the 21st International Conference on Machine Learning, ACM Press, New York, pp. 655–662.
Phillips, S. J., Anderson, R. P., and R. E. Schapire (2006), Maximum entropy modeling of species geographic distributions. Ecological Modelling 190(3–4), 231–259, doi: 10.1016/j.ecolmodel.2005.03.026
Venables, W. N., and B. D. Ripley (2002), Modern Applied Statistics with S, 4th ed. Springer, Berlin, 495 pp.
Wood, S. N. (2011), Fast stable restricted maximum likelihood and marginal likelihood estimation of semiparametric generalized linear models. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 73(1), 3–36, doi: 10.1111/j.1467-9868.2010.00749.x
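As a concrete illustration of the SRE envelope rule described above, the sketch below classifies a site as suitable only when every predictor lies within the central 97.5% range of the values observed at presence locations. This is a minimal, hypothetical NumPy reimplementation of the general idea, not the Biomod2 sre() code used in the study; the exact quantile handling in Biomod2 may differ, and the data and predictor values are synthetic.

```python
# Minimal sketch of a surface range envelope (SRE): a site is classed as suitable
# only if every predictor falls inside the central 97.5% range of the values
# observed at presence locations. Illustrative only; not the Biomod2 implementation.
import numpy as np

def sre_fit(presence_env, quantile=0.975):
    """Return per-predictor lower/upper bounds from presence-site environments."""
    tail = (1.0 - quantile) / 2.0                      # e.g. 1.25% cut from each tail
    lower = np.quantile(presence_env, tail, axis=0)
    upper = np.quantile(presence_env, 1.0 - tail, axis=0)
    return lower, upper

def sre_predict(env, lower, upper):
    """1 if all predictors of a site lie inside the rectangular envelope, else 0."""
    inside = (env >= lower) & (env <= upper)
    return np.all(inside, axis=1).astype(int)

# Toy example with three hypothetical environmental predictors
rng = np.random.default_rng(0)
presence_env = rng.normal(loc=[8.0, 5.5, 30.0], scale=[1.0, 0.5, 3.0], size=(200, 3))
candidate_env = rng.normal(loc=[8.0, 5.5, 30.0], scale=[3.0, 1.5, 9.0], size=(1000, 3))

lower, upper = sre_fit(presence_env)
suitability = sre_predict(candidate_env, lower, upper)
print("Fraction of candidate sites inside the envelope:", suitability.mean())
```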