NCON-D-14-00014 - Research Letters - Supporting material
The modeled species
Tecoma stans is an invasive tree species with a pantropical distribution. The most recent taxonomic revision of the genus recognizes three varieties within T. stans (Wood 2008). In this study, occurrences of both T. stans and T. stans var. stans were treated as our widespread focus species (T. stans hereafter). Tecoma stans var. sambucifolia (Kunth) J.R.I.Wood and T. stans var. velutina DC. are endemic to the Andes (Colombia, Ecuador, Peru and Bolivia) and are not cultivated. Tecoma stans is native to dry areas of the New World: southern USA, Mexico, the Caribbean, Central America, Venezuela, Colombia, Ecuador, Peru, Bolivia and northern Argentina (Gentry 1992; Wood 2008).
The species was traded as an ornamental in the 19th century, reaching French Polynesia and Asia by 1845 (Shine et al. 2003). By 1871, it was reported as cultivated in Brazil (Kranz & Passini 1997). Currently, it is commonly cultivated in gardens and along streets. Due to its ruderal habit, individuals have formed semi-natural and naturalized populations in many places. We used T. stans as our target species because it has fast growth and high dispersal ability (Kranz & Passini 1997), quickly forming dense monocultures that decrease the area available for pasture and agriculture (Henderson 2001) and hinder the access of livestock and machinery to the land (Kranz & Passini 1997). Once established, it restricts the access of native plants to soil nutrients and light, affecting habitat integrity and functioning at high population densities (Henderson 2001).
Occurrence datasets
We compiled a total of 1,385 T. stans occurrence records from the literature (e.g. Madire et al. 2011; Wood 2008) and online databases [CRIA Species Link (www.splink.cria.org.br); Global Biodiversity Information Facility (www.gbif.org); Tropicos (www.tropicos.org); Australia’s Virtual Herbarium (avh.ala.org.au); JSTOR Plant Science database (plants.jstor.org)]. Since city hall coordinates were used as a proxy for the exact GPS coordinates in 20-30% of the occurrence records, we used a grid resolution of 0.5° in our rENMs and multivariate analyses, which decreased the number of unique occurrence records to 535 points. Of these, 348 records represented native populations. Native area definition followed Pelton (1964), Gentry (1992), and Wood (2008), including areas of the southern USA (Florida), Mexico, Central America, the Caribbean, and the Tropical Andes from Venezuela and Colombia to northern Argentina. The remaining occurrences were considered exotic (following Kranz & Passini 1997), including all Brazilian records.
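The reduction of raw records to unique 0.5° grid cells can be sketched as follows. This is only an illustration under stated assumptions, not the code used in the study: the function names and the example coordinates are hypothetical, each record is assigned to the cell that contains it, and the first record found in a cell is the one kept.

```python
from math import floor


def grid_cell(lat, lon, res=0.5):
    """Return the (row, column) index of the res-degree cell containing a point."""
    return (floor(lat / res), floor(lon / res))


def unique_cells(records, res=0.5):
    """Keep one (lat, lon) record per grid cell (the first one encountered)."""
    kept = {}
    for lat, lon in records:
        kept.setdefault(grid_cell(lat, lon, res), (lat, lon))
    return kept


# Hypothetical records: the first two fall in the same 0.5-degree cell.
records = [(-23.31, -51.16), (-23.40, -51.10), (-23.60, -51.16)]
cells = unique_cells(records)
print(len(cells))  # 2
```

Records that share a cell, for example several specimens georeferenced to the same city hall, collapse to a single point, which is why the number of unique occurrences is smaller than the number of compiled records.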
Environmental input variables
Our initial variable set was composed of all nineteen variables available from WorldClim’s climate database (http://www.worldclim.org; Hijmans et al. 2005), plus altitude and slope obtained from the US Geological Survey’s Hydro-1K database (edcdaac.usgs.gov/gtopo30/hydro). We performed a Factor Analysis on the 19 climate variables only, based on the correlation matrix between pairs of variables and followed by Varimax orthogonal rotation. We first defined the number of factors; then, for each factor, we selected the variable with the highest loading to represent it; finally, we used the selected variables together with slope and altitude as predictors of T. stans’ potential distribution (see Terribile et al. 2012). This method minimizes collinearity among environmental predictors and produces more reliable potential distributions for the modeled species (Jiménez-Valverde et al. 2011).
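The selection step that follows the rotation can be illustrated with a short sketch. It is not the original analysis script: the function name is hypothetical, the loadings are copied from Table S1 for a subset of variables and for factors I, III and V only, and we assume that "highest loading" refers to the largest absolute value.

```python
# Loadings copied from Table S1 for a subset of variables and factors
# (illustration only; the full analysis used 19 variables and 5 factors).
LOADINGS = {
    "bio1": {"I": 0.944, "III": 0.239, "V": -0.131},
    "bio2": {"I": -0.226, "III": -0.379, "V": 0.702},
    "bio3": {"I": -0.190, "III": 0.878, "V": 0.077},
    "bio4": {"I": 0.103, "III": -0.923, "V": 0.130},
    "bio10": {"I": 0.957, "III": -0.279, "V": -0.044},
}


def select_representatives(loadings):
    """Return, for each factor, the variable with the largest absolute loading."""
    factors = next(iter(loadings.values())).keys()
    return {
        factor: max(loadings, key=lambda var: abs(loadings[var][factor]))
        for factor in factors
    }


selected = select_representatives(LOADINGS)
print(selected)  # {'I': 'bio10', 'III': 'bio4', 'V': 'bio2'}
```

For these three factors the sketch returns mean temperature of the warmest quarter (bio10), temperature seasonality (bio4) and mean diurnal range (bio2).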
Considering a 0.5° resolution, the final variables selected to represent T. stans’ environmental space were mean diurnal range, temperature seasonality, mean temperature of the warmest quarter, precipitation of driest period, precipitation of wettest quarter, altitude, and slope (see Table S1 for the selected factors and the loadings of the bioclimatic variables).
Modeling algorithms, thresholds, and evaluation statistics
The selected algorithms correspond to distance-based methods. Although considered the simplest algorithms when compared to statistical and/or artificial intelligence methods, these methods assume that the occurrences of a given species are constrained by its own tolerance to environmental conditions (Rangel & Loyola 2012). Consequently, even though they may lack realism and precision, they are more closely tied to ecological theory on species’ physiological tolerances than other methods used to predict species distributions (Rangel & Loyola 2012). Of the methods we employed, Mahalanobis distance (MHL hereafter) is a bioclimatic envelope algorithm that assumes the existence of an optimum gradient based on the species’ environmental centroid in multidimensional space and accounts for the covariance between all environmental variables from sample locations (Farber & Kadmon 2003). Considering the environmental space covered by the occurrences of the focus species, BIOCLIM (BIO hereafter) estimates a rectilinear envelope bounded by the most extreme climatic conditions where the species was recorded (Munguía et al. 2012; Nix 1986). Finally, DOMAIN (DOM hereafter) uses a point-to-point similarity metric with no discrete boundary within environmental space to evaluate the environmental similarity to each known occurrence of the target species and project its distribution in geographic space (Carpenter et al. 1993).
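The logic of the three methods can be made concrete with a minimal sketch. It is a simplified illustration and not the software used for the rENMs: all function names and the toy occurrence data are hypothetical, BIOCLIM is reduced to a minimum-maximum envelope test, DOMAIN is taken as the maximum Gower similarity to any occurrence, and every variable is assumed to have a non-zero range and a non-singular covariance matrix.

```python
def bioclim_inside(site, occurrences):
    """BIOCLIM-like test: is the site inside the min-max envelope of every variable?"""
    return all(min(col) <= x <= max(col) for x, col in zip(site, zip(*occurrences)))


def domain_similarity(site, occurrences):
    """DOMAIN-like score: highest Gower similarity between the site and any occurrence."""
    ranges = [max(col) - min(col) for col in zip(*occurrences)]

    def gower(a, b):
        return 1.0 - sum(abs(x - y) / r for x, y, r in zip(a, b, ranges)) / len(ranges)

    return max(gower(site, occ) for occ in occurrences)


def solve(matrix, rhs):
    """Solve matrix * z = rhs by Gauss-Jordan elimination with partial pivoting."""
    n = len(rhs)
    m = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def mahalanobis_sq(site, occurrences):
    """Squared Mahalanobis distance from the site to the centroid of the occurrences."""
    n, p = len(occurrences), len(site)
    centroid = [sum(col) / n for col in zip(*occurrences)]
    cov = [[sum((o[i] - centroid[i]) * (o[j] - centroid[j]) for o in occurrences) / (n - 1)
            for j in range(p)] for i in range(p)]
    diff = [x - c for x, c in zip(site, centroid)]
    return sum(d * z for d, z in zip(diff, solve(cov, diff)))


# Toy occurrences described by two environmental variables.
occ = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0)]
print(bioclim_inside((1.0, 1.0), occ), bioclim_inside((3.0, 1.0), occ))  # True False
print(domain_similarity((0.0, 0.0), occ))  # 1.0
print(round(mahalanobis_sq((3.0, 1.0), occ), 6))  # 3.0
```

In this sketch a smaller Mahalanobis distance and a larger DOMAIN similarity both indicate conditions closer to those where the species was recorded, whereas the envelope test only separates sites inside and outside the recorded extremes.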
Although the Least Presence Training threshold (TH-LPT) is acknowledged as the best threshold to predict invasive species distributions (Jiménez-Valverde et al. 2011), we used the maximum training test sensitivity plus specificity threshold (TH-ROC) to project T. stans distributions, since our goal was to pinpoint areas with high invasion risk. We used simple mean ensemble forecasts from each algorithm and each occurrence dataset to create a consensus distribution map (Marmion et al. 2009).
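The difference between the two thresholds, and the averaging used for the consensus map, can be sketched as follows. This is a hedged illustration rather than the procedure implemented in the modeling software: the function names and the suitability scores are hypothetical, candidate thresholds are taken from the observed scores, and ties are resolved in favour of the lowest threshold.

```python
def lpt_threshold(presence_scores):
    """TH-LPT: the lowest suitability predicted at any training presence."""
    return min(presence_scores)


def max_sss_threshold(presence_scores, absence_scores):
    """TH-ROC: the threshold that maximizes sensitivity plus specificity."""
    best_threshold, best_sum = None, -1.0
    for t in sorted(set(presence_scores) | set(absence_scores)):
        sensitivity = sum(s >= t for s in presence_scores) / len(presence_scores)
        specificity = sum(s < t for s in absence_scores) / len(absence_scores)
        if sensitivity + specificity > best_sum:
            best_threshold, best_sum = t, sensitivity + specificity
    return best_threshold


def ensemble_mean(suitability_maps):
    """Cell-by-cell mean of several suitability maps of equal length."""
    return [sum(cell) / len(cell) for cell in zip(*suitability_maps)]


# Hypothetical suitability scores at presences and pseudo-absences.
presences = [0.9, 0.8, 0.7, 0.2]
pseudo_absences = [0.6, 0.3, 0.25, 0.1]
print(lpt_threshold(presences))  # 0.2
print(max_sss_threshold(presences, pseudo_absences))  # 0.7
```

With these scores TH-ROC is stricter than TH-LPT, so fewer cells are classified as suitable, which is in line with the aim of highlighting only the areas at highest invasion risk.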
We used the Area Under the Receiver Operating Characteristic Curve (AUC) to evaluate the models. AUC is a threshold-independent measure based on the model’s sensitivity and specificity (Fielding & Bell 1997). Values vary from 0 to 1, where scores close to 1 indicate a near-perfect fit between observed and predicted species distributions, whilst values of 0.5 indicate a fit no better than random. AUC scores greater than 0.7 are considered reliable estimators of species distributions (Elith et al. 2006). In addition, as each model has a different scale of suitability, we normalized the suitability of each model to range from 0 (minimum suitability) to 1 (maximum suitability). This transformation is needed to compare the models on the same scale.
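Both evaluation steps can be written in a few lines. The sketch below is illustrative only: the function names and scores are hypothetical, AUC is computed in its rank-based form (the proportion of presence and pseudo-absence pairs in which the presence receives the higher score, with ties counted as one half), and the rescaling assumes that the suitability values are not all identical.

```python
def auc(presence_scores, absence_scores):
    """Rank-based AUC: share of presence/absence pairs ranked correctly."""
    wins = sum((p > a) + 0.5 * (p == a)
               for p in presence_scores for a in absence_scores)
    return wins / (len(presence_scores) * len(absence_scores))


def rescale(values):
    """Min-max normalization to the range 0 (minimum) to 1 (maximum)."""
    low, high = min(values), max(values)
    return [(v - low) / (high - low) for v in values]


# Hypothetical suitability scores at presences and pseudo-absences.
print(auc([0.9, 0.8, 0.7, 0.2], [0.6, 0.3, 0.25, 0.1]))  # 0.8125
print(rescale([2.0, 4.0, 6.0]))  # [0.0, 0.5, 1.0]
```

Because AUC depends only on the ranking of the scores, it is unaffected by the rescaling, so the normalization changes the scale on which the maps are compared but not their evaluation.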
Given the dispersal ability of T. stans, we considered a buffer of 120 to 275 km surrounding the species’ known occurrences as the geographical background for the allocation of pseudo-absences during the rENM evaluation of both exotic and native ranges and also in the multivariate niche analyses. We chose this buffer to avoid both regions very close to the points of occurrence, where the species could occur but was not collected, and very distant regions, to which the species cannot disperse (Lobo et al. 2010; VanDerWal et al. 2009).
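One way to express this background is as a ring around the occurrences, as in the sketch below. It reflects our reading of the buffer and is not the original implementation: the function names and coordinates are hypothetical, distances are great-circle (haversine) distances on a sphere with a mean Earth radius of 6371 km, and a candidate cell is accepted when its nearest occurrence lies between 120 and 275 km away.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (assumption of a spherical Earth)


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in decimal degrees."""
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def in_background(candidate, occurrences, inner_km=120.0, outer_km=275.0):
    """True if the nearest occurrence is between inner_km and outer_km away."""
    nearest = min(haversine_km(*candidate, *occ) for occ in occurrences)
    return inner_km <= nearest <= outer_km


# Hypothetical occurrence on the equator; candidates 1, 2 and 3 degrees north of it.
occurrences = [(0.0, 0.0)]
for candidate in [(1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]:
    print(candidate, in_background(candidate, occurrences))
```

Only the candidate two degrees away (about 222 km) falls inside the ring; the nearer one is too close to a known occurrence and the farther one is beyond the assumed dispersal reach.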
Supporting material references
Broennimann O, Fitzpatrick MC, Pearman PB, et al., 2012. Measuring ecological niche overlap from occurrence and spatial environmental data. Global Ecology and Biogeography, 21:481–497.
Carpenter G, Gillison AN & Winter J, 1993. Domain: a flexible modelling procedure for mapping potential distributions of plants and animals. Biodiversity and Conservation, 2(6):667–680.
Elith J, Graham CH, Anderson RP, et al., 2006. Novel methods improve prediction of species’ distributions from occurrence data. Ecography, 29:129–151.
Farber O & Kadmon R, 2003. Assessment of alternative approaches for bioclimatic modeling with special emphasis on the Mahalanobis distance. Ecological Modelling, 160:115–130.
Fielding AH & Bell JF, 1997. A review of methods for the assessment of prediction errors in conservation presence/absence models. Environmental Conservation, 24:38–49.
Gentry AH, 1992. Bignoniaceae Part 2. In Flora Neotropica Monograph 25. New York: New York Botanical Garden, v. 1.
Henderson L, 2001. Alien weeds and invasive plants. In Plant Protection Research Institute Handbook No. 12.
Hijmans RJ, Cameron SE, Parra JL, Jones PG & Jarvis A, 2005. Very high resolution interpolated climate surfaces for global land areas. International Journal of Climatology, 25:1965–1978.
Jiménez-Valverde A, Peterson AT, Soberón J, et al., 2011. Use of niche models in invasive species risk assessments. Biological Invasions, 13(12):2785–2797.
Kranz WM & Passini T, 1997. Amarelinho: biologia e controle. Londrina: Instituto Agronômico do Paraná.
Lobo JM, Jiménez-Valverde A & Hortal J, 2010. The uncertain nature of absences and their importance in species distribution modelling. Ecography, 33(1):103–114.
Madire LG, Wood AR, Williams HE & Neser S, 2011. Potential agents for the biological control of Tecoma stans (L.) Juss ex Kunth var. stans (Bignoniaceae) in South Africa. African Entomology, 19(2):434–442.
Marmion M, Parviainen M, Luoto M, Heikkinen RK & Thuiller W, 2009. Evaluation of consensus methods in predictive species distribution modelling. Diversity and Distributions, 15(1):59–69.
Munguía M, Rahbek C, Rangel TF, Diniz-Filho JAF & Araújo MB, 2012. Equilibrium of global amphibian species distributions with climate. PLoS One, 7:e34420.
Nix HA, 1986. A biogeographic analysis of Australian elapid snakes. In R Longmore (ed.), Atlas of Elapid Snakes of Australia - Australian Flora and Fauna series Number 7. 1st ed., Canberra: Australian Government Publishing Service, p. 4–15.
Pelton J, 1964. A survey of the ecology of Tecoma stans. Butler University Botanical Studies, 14(11):53–88.
Rangel TF & Loyola RD, 2012. Labeling ecological niche models. Natureza & Conservação, 10:119–126.
Shine C, Reaser JK & Gutierrez AT, 2003. Invasive alien species in the Austral Pacific Region: National Reports & Directory of Resources. Cape Town, South Africa: Global Invasive Species Programme.
Terribile LC, Lima-Ribeiro MS, Araújo MB, et al., 2012. Areas of climate stability of species ranges in the Brazilian Cerrado: Disentangling uncertainties through time. Natureza & Conservação, 10:152–159.
VanDerWal J, Shoo LP, Graham C & Williams SE, 2009. Selecting pseudo-absence data for presence-only distribution modeling: How far should you stray from what you know? Ecological Modelling, 220(4):589–594.
Wood JRI, 2008. A revision of Tecoma Juss. (Bignoniaceae) in Bolivia. Botanical Journal of the Linnean Society, 156:143–172.
Table S1 – Loadings obtained in the Factor Analysis for the 19 bioclimatic variables that initially composed our variable set. Variables with the highest loading values in each factor were selected to represent T. stans’ environmental space and are highlighted in bold.
Bioclimatic variable        I       II      III       IV        V
bio1                    0.944    0.158    0.239    0.028   -0.131
bio2                   -0.226   -0.314   -0.379   -0.449    0.702
bio3                   -0.190    0.223    0.878    0.020    0.077
bio4                    0.103   -0.335   -0.923   -0.060    0.130
bio5                    0.868   -0.110   -0.358   -0.170    0.262
bio6                    0.640    0.303    0.606    0.187   -0.300
bio7                   -0.044   -0.346   -0.779   -0.277    0.438
bio8                    0.904   -0.017   -0.249   -0.040   -0.068
bio9                    0.830    0.197    0.342    0.015   -0.030
bio10                   0.957   -0.032   -0.279   -0.002   -0.044
bio11                   0.700    0.286    0.630    0.055   -0.158
bio12                   0.091    0.843    0.310    0.395   -0.070
bio13                   0.142    0.934    0.302    0.040   -0.042
bio14 bio15 bio16 bio17
-0.029
0.078
0.120
0.320
0.033
0.935
-0.022 0.335
0.112
0.313
0.121
0.935
0.042 -0.817 0.222
0.054
0.931
0.002
-0.043
-0.009
bio18                  -0.084    0.697    0.037    0.285   -0.147
bio19                   0.193    0.524    0.228    0.447   -0.060
bio1. Annual Mean Temperature; bio2. Mean Diurnal Range (Mean of monthly (max temp - min temp)); bio3. Isothermality; bio4. Temperature Seasonality (standard deviation *100); bio5. Max Temperature of Warmest Month; bio6. Min Temperature of Coldest Month; bio7. Temperature Annual Range; bio8. Mean Temperature of Wettest Quarter; bio9. Mean Temperature of Driest Quarter; bio10. Mean Temperature of Warmest Quarter; bio11. Mean Temperature of Coldest Quarter; bio12. Annual Precipitation; bio13. Precipitation of Wettest Month; bio14. Precipitation of Driest Month; bio15. Precipitation Seasonality (Coefficient of Variation); bio16. Precipitation of Wettest Quarter; bio17. Precipitation of Driest Quarter; bio18. Precipitation of Warmest Quarter; bio19. Precipitation of Coldest Quarter.
Table S2 – Niche metrics (i.e. niche stability, niche unfilling and niche expansion) for different proportions of marginal environments being compared. Note that the marginal environments range from all (i.e. 100%) to no marginal environments (i.e. 0%).
Marginal environment    Stability   Unfilling   Expansion
100%                        0.999       0.316       0.001
75%                         1.000       0.000       0.000
50%                         1.000       0.003       0.000
25%                         0.999       0.148       0.001
0%                          0.999       0.217       0.001
Fig. S1 – Flowchart representing the experimental design of the modeling analysis. To run the rENM, the exotic (n=193), native (n=362), and total (n=555) occurrence datasets were each divided into 10 training-testing subsets (50%-50%) by bootstrap sampling. Distributions of T. stans were produced with the training subsets and validated with the testing subsets. For the maps shown in Fig. 2, we produced the final distributions of T. stans using all occurrences from the native, exotic, and total occurrence datasets, respectively.
Fig. S2 – Resulting PCA for the environmental variables available to T. stans, obtained with the PCA-env approach of Broennimann et al. (2012).
Fig. S3 – Tecoma stans distributions produced by models that used native occurrences to predict exotic (Brazilian, African, and Australasian) ones, according to each modeling algorithm with TH-ROC: A) MHL; B) DOM; C) BIO. Green and yellow dots refer to exotic and native occurrences, respectively.
Fig. S4 – Tecoma stans distributions produced by models that used exotic (Brazilian, African, and Australasian) occurrences to predict native ones, according to each modeling algorithm with TH-ROC: A) MHL; B) DOM; C) BIO. Green and yellow dots refer to exotic and native occurrences, respectively.
Fig. S5 – Tecoma stans distributions produced by models that used the total occurrence dataset, according to each modeling algorithm with TH-ROC: A) MHL; B) DOM; C) BIO.