Journal of Biogeography Supporting Information Invasion ratcheting

Journal of Biogeography

SUPPORTING INFORMATION

Invasion ratcheting in the zebra mussel (Dreissena polymorpha) and the ability of native and invaded ranges to predict its global distribution

Belinda Gallardo, Philine S. E. zu Ermgassen and David C. Aldridge

Appendix S2 Optimization of ecological niche models (ENMs) for the zebra mussel.

Regularization

According to the MAXENT user manual (available at: http://www.cs.princeton.edu/~schapire/maxent/tutorial/tutorial.doc), the ‘regularization multiplier’ parameter affects how focused or closely fitted the output predicted distribution is. A value smaller than the default of 1.0 results in a more localized output distribution that fits the given presence records more closely, but can lead to overfitted predictions. A larger regularization multiplier gives a more spread-out, less localized prediction (Phillips & Dudík, 2008). For the zebra mussel, regularization multipliers of 1 to 4 were tested and the models compared using ENMTools (Warren et al., 2010). The Akaike information criterion corrected for sample size (AICc) was used to select the best regularization option, as recommended by Warren & Seifert (2011). We conclude that, while the default regularization multiplier of 1 is appropriate to model the distribution of the zebra mussel based on its European and North American ranges, an increased value of 4 is needed when using the Ponto-Caspian partial data set.

Table S1 Results of ecological niche models (ENMs) for the zebra mussel (Dreissena polymorpha), performed with a regularization multiplier of 1 to 4. Model: occurrence data used for calibration, corresponding to Europe (EU), North America (NA) and the Ponto-Caspian region (PC). Sample size: number of data points used to calibrate the model. AICc: Akaike information criterion corrected for sample size. AUC: area under the receiver operating characteristic curve, used as a measure of model accuracy. The best model (lowest AICc) is listed first for each range.
Model  Regularization  Log-likelihood  Parameters  Sample size  AICc      AUC
EU     1.0             −9817.0         66          910          19776.5   0.954
       2.0             −9895.9         47          910          19891.0   0.953
       3.0             −9938.2         41          910          19962.3   0.953
       4.0             −9973.5         32          910          20013.5   0.953
NA     1.0             −18370.0        111         1642         36978.1   0.923
       2.0             −18596.6        78          1642         37357.1   0.922
       3.0             −18747.9        63          1642         37626.9   0.921
       4.0             −18853.4        52          1642         37814.1   0.920
PC     4.0             −1001.7         16          95           2042.5    0.990
       2.0             −979.8          29          95           2044.4    0.990
       1.0             −958.6          38          95           2046.2    0.989
       3.0             −992.8          27          95           2062.2    0.989

Sampling bias

Zebra mussel occurrence points are clustered around the Netherlands, England and the Great Lakes of North America, which can potentially bias model predictions. To compensate for this potential source of bias, we created a sampling effort map (e.g. Heibl & Renner, 2012; Torres et al., 2012). This involves retrieving from GBIF the global occurrences of a ‘target group’ of species that are likely to be sampled using the same methods as the study species (Phillips & Dudík, 2008). In this case, we chose the whole Dreissenidae family, and assumed that the number of Dreissenidae occurrence records per pixel is an indirect indicator of the sampling effort invested. We calculated Dreissenidae occurrence density at a 5-arcminute resolution with ARCVIEW 10.0 (ESRI, Redlands, CA, USA) and log10-transformed the map to reduce numeric disparities. The result was a raster map in which pixel values ranged from 0 (lowest sampling effort) to 6 (highest sampling effort, corresponding to the Netherlands, England and the Great Lakes of North America). This map was used in the ENMs to weight occurrences, i.e. each occurrence was given a weight inversely related to its sampling effort, reducing the importance of oversampled areas (the ‘bias file’ option; Phillips et al., 2011). According to this test (Table S2), a bias file did not appear to improve predictions significantly; samples were therefore not weighted in the final models.
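The AICc values in Table S1, and those used to assess the bias file in Table S2, can be recomputed from the reported log-likelihood, number of parameters and sample size. The sketch below assumes the standard small-sample correction, AICc = 2k − 2 ln L + 2k(k + 1)/(n − k − 1), with k the number of parameters and n the sample size. The function name `aicc` and the dictionary layout are illustrative only and are not part of ENMTools. Applied to the European rows of Table S1, it appears to reproduce the tabulated values to within rounding.

```python
def aicc(log_likelihood: float, n_params: int, n_samples: int) -> float:
    """Akaike information criterion corrected for small sample size.

    Assumes the standard formula; undefined when n_samples <= n_params + 1.
    """
    denominator = n_samples - n_params - 1
    if denominator <= 0:
        raise ValueError("AICc requires n_samples > n_params + 1")
    aic = 2 * n_params - 2 * log_likelihood
    correction = 2 * n_params * (n_params + 1) / denominator
    return aic + correction


# European (EU) models from Table S1:
# regularization multiplier -> (log-likelihood, parameters, sample size)
eu_models = {
    1.0: (-9817.0, 66, 910),
    2.0: (-9895.9, 47, 910),
    3.0: (-9938.2, 41, 910),
    4.0: (-9973.5, 32, 910),
}

# Score every candidate and keep the multiplier with the lowest AICc.
eu_scores = {reg: aicc(*values) for reg, values in eu_models.items()}
best_regularization = min(eu_scores, key=eu_scores.get)

for reg, score in sorted(eu_scores.items()):
    print(f"regularization {reg:.1f}: AICc = {score:.1f}")
print(f"lowest AICc at regularization {best_regularization:.1f}")
```

Because the correction term grows quickly as k approaches n, the same comparison penalizes complex models much more heavily for the small Ponto-Caspian data set than for the European and North American ones.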
Species geographical attributes

ENMs are affected by the geographical attributes of species, including sample size, prevalence (i.e. the ratio between presence and absence data), clustering (i.e. spatial autocorrelation) and rarity (Marmion et al., 2009). Species with a large number of occurrences have a higher prevalence but are likely to produce overfitted predictions, whereas a low number of occurrences often results in very general and loose models.

In this study, a considerable clustering of data was noted that could not be appropriately compensated for using the ‘bias file’ option described above. In addition, different sample sizes for the Ponto-Caspian, European and North American regions may prevent predictions from being compared in a meaningful way. To compensate for unequal sample sizes in the three zebra mussel ranges, we created a subset of 100 randomly selected points in each of the invaded European and North American ranges, comparable to the sample size in the native range (n = 98). This procedure was repeated 10 times to reduce the possibility of sample subselection influencing the result of the models. Although we did not specifically correct for spatial autocorrelation, the degree of clustering of each subset was notably reduced. We ran models using the default modelling options of MAXENT and subsequently reported the average of the 10 replicated models calibrated for Europe and North America, respectively (i.e. equal-size models).

In this case it was not appropriate to use AICc values to compare the performance of equal-size and unequal-size models, since a lower number of degrees of freedom in the former would inevitably lead to lower AICc. We therefore used alternative metrics to evaluate the two options: the AUC of the model, the range of suitability scores, and the ecological plausibility of predictions (Table S3). Although model accuracy (AUC) was higher in equal-size than in unequal-size models, differences between the two options were most prominent in spatial predictions. The area predicted to be suitable for the zebra mussel was 2.5 to 6 times smaller when using equal-size subsets, notably underestimating its current and presumably potential range of distribution (Fig. S1). We hereafter show results from ENMs calibrated with all available data, an option recommended by Beaumont et al. (2009) and Broennimann & Guisan (2008).

Table S2 Results from ecological niche models (ENMs) performed with and without using a ‘bias file’. Model: occurrence data used for calibration, corresponding to Europe (EU), North America (NA) and the Ponto-Caspian region (PC). Sample size: number of data points used to calibrate the model. AICc: Akaike information criterion corrected for sample size. AUC: accuracy of the model. Best option highlighted in bold.

Model  Bias file  Log-likelihood  Parameters  Sample size  AICc      AUC
EU     no         −9803.2         67          910          19751.2   0.954
       yes        −11970.8        77          910          24110.0   0.911
NA     no         −18362.6        103         1642         36945.1   0.923
       yes        −21183.2        87          1642         42550.2   0.876
PC     no         −956.6          40          95           2054.1    0.990
       yes        −955.7          37          95           2034.8    0.990

Table S3 Results from ecological niche models (ENMs) performed using equal- and unequal-size subsets of data for the European (EU) and North American (NA) ranges of the zebra mussel. AUC: accuracy of the model. Prevalence: ratio between presence and background data. Range: maximum and minimum predicted scores. Fractional area: fraction of the area predicted suitable for the species according to the threshold maximizing the training sensitivity and specificity of the model (maxTSS).

                  EU                 NA
Sample size       Equal    Unequal   Equal    Unequal
AUC               0.99     0.95      0.99     0.92
Entropy           5.51     6.90      5.71     7.63
Prevalence        0.01     0.05      0.01     0.09
Range             0–0.74   0–0.64    0–0.81   0–0.84
Fractional area   0.040    0.100     0.026    0.149

Figure S1 Results from ecological niche models (ENMs) calibrated with zebra mussel occurrence data from Europe (A) and North America (B).
In dark grey, the areas where the zebra mussel is predicted to be present according to both the equal-size and unequal-size models. In light grey, the additional areas where the species is predicted to be present according to the unequal-size model (using all available occurrence data).

Variable inclusion

Because it has been suggested that the choice and number of predictors affect the accuracy and transferability of ENMs (Rödder & Lötters, 2010), we sequentially removed one variable at a time and selected the option with the lowest AICc (i.e. backward elimination; Guyon & Elisseeff, 2003).

Table S4 Results from backward elimination of variables in ENMs calibrated with the zebra mussel native (PC), European (EU) and North American (NA) ranges. Parameters: number of parameters in the model after linear, quadratic, product, threshold and hinge features have been automatically optimized (‘auto features’ default option). Sample size: number of occurrence records used to calibrate the model. AICc: Akaike information criterion corrected for sample size.
Model  Variable excluded            Log-likelihood  Parameters  Sample size  AICc
EU     PPdriest                     −9703.1         63          910          19541.7
       None                         −9706.4         66          910          19555.3
       PPseason                     −9714.6         65          910          19569.4
       Tmax                         −9720.7         63          910          19576.9
       Tseason                      −9724.3         62          910          19581.7
       Tann                         −9736.5         67          910          19617.8
       Tmin                         −9742.5         65          910          19625.1
       PPann                        −9748.1         64          910          19634.1
       Elevation                    −9754.6         72          910          19665.7
       Geology                      −9817.0         66          910          19776.5
NA     None                         −18229.1        101         1642         36673.5
       Tann                         −18250.8        95          1642         36703.4
       Tmin                         −18274.6        101         1642         36764.5
       Elevation                    −18295.1        95          1642         36791.9
       PPdriest                     −18284.5        106         1642         36795.7
       Tseason                      −18304.0        94          1642         36807.5
       Tmax                         −18296.0        104         1642         36814.2
       PPseason                     −18360.2        97          1642         36926.8
       Geology                      −18370.0        111         1642         36978.2
       PPann                        −18548.6        111         1642         37335.5
PC     Elevation, PPseason, Tann*   −987.5          19          95           2023.1
       Elevation                    −986.4          24          95           2037.9
       PPdriest                     −977.2          29          95           2039.2
       PPann                        −990.4          24          95           2045.9
       Geology                      −958.7          38          95           2046.3
       PPseason                     −978.0          31          95           2049.6
       None                         −976.4          32          95           2050.9
       Tann                         −976.8          32          95           2051.8
       Tmax                         −986.1          36          95           2090.2
       Tseason                      −988.0          39          95           2110.7
       Tmin                         −981.3          45          95           2137.1

* Intermediate backward elimination steps not shown.

Variable optimization suggested eliminating a number of factors from models calibrated with the zebra mussel’s native and European ranges, whereas models based on its North American range should utilize all available environmental predictors.

REFERENCES

Beaumont, L.J., Gallagher, R.V., Thuiller, W., Downey, P.O., Leishman, M.R. & Hughes, L. (2009) Different climatic envelopes among invasive populations may lead to underestimations of current and future biological invasions. Diversity and Distributions, 15, 409–420.

Broennimann, O. & Guisan, A. (2008) Predicting current and future biological invasions: both native and invaded ranges matter. Biology Letters, 4, 585–589.

Guyon, I. & Elisseeff, A. (2003) An introduction to variable and feature selection. Journal of Machine Learning Research, 3, 1157–1182.

Heibl, C. & Renner, S.S. (2012) Distribution models and a dated phylogeny for Chilean Oxalis species reveal occupation of new habitats by different lineages, not rapid adaptive radiation. Systematic Biology, 61, 823–834.

Phillips, S., Dudík, M. & Schapire, R. (2011) A brief tutorial on MaxEnt. Princeton University, Princeton, NJ.

Phillips, S.J. & Dudík, M. (2008) Modeling of species distributions with Maxent: new extensions and a comprehensive evaluation. Ecography, 31, 161–175.

Rödder, D. & Lötters, S. (2009) Niche shift versus niche conservatism? Climatic characteristics of the native and invasive ranges of the Mediterranean house gecko (Hemidactylus turcicus). Global Ecology and Biogeography, 18, 674–687.

Torres, R., Jayat, J.P. & Pacheco, S. (2012) Modelling potential impacts of climate change on the bioclimatic envelope and conservation of the Maned Wolf (Chrysocyon brachyurus). Mammalian Biology, 78, 41–49.

Warren, D.L. & Seifert, S.N. (2011) Ecological niche modeling in Maxent: the importance of model complexity and the performance of model selection criteria. Ecological Applications, 21, 335–342.

Warren, D.L., Glor, R.E. & Turelli, M. (2010) ENMTools: a toolbox for comparative studies of environmental niche models. Ecography, 33, 607–611.
