Predictor variable preparation - Springer Static Content Server

Appendix S1. Further details on analysis methods

Variable transformations and Factor Analysis

The factor analyses were justified as indicated by the Kaiser-Meyer-Olkin measure of sampling adequacy (human impact variables: 0.86; environmental variables: 0.59; a minimum of 0.5 is usually considered indicative of data being appropriate for FA, McGregor 1992) as well as Bartlett's test of sphericity (human impact variables: χ²=1189.9, df=19, P<0.001; environmental variables: χ²=1147.2, df=28, P<0.001; McGregor 1992).

The third environmental factor (EFactor3) and the second human impact factor (HFactor2) revealed by the FA were square root transformed to achieve more symmetrical distributions before using them as predictors in the GLM.

Accounting for spatial autocorrelation

The abundances investigated were likely to show some spatial autocorrelation (i.e., neighboring transects revealing more similar values than more distant ones) unexplained by the predictor variables included in the model. Such autocorrelation leads to spatially non-independent residuals, violating a crucial prerequisite of any linear model and hence reducing its reliability. We thus explicitly incorporated spatial autocorrelation into the model, using the following approach. First, we ran the full model with all environmental gradients, the squared terms, the species, the interactions, the offset term (log-transformed transect length) and transect ID included, and derived the residuals from it. Then we calculated, separately for each individual data point (i.e., combination of species and individual transect), the weighted average of the residuals of all other data points of the same species. The weight used was inversely related to the distance between two transects and followed a Gaussian function.
The mean of this function was set to zero (i.e., maximum weight at a distance of zero) and its standard deviation was determined by maximizing the likelihood of the full model with the derived autocorrelation term included.

Species-specific analyses

To assess the impact of the different covariates on the abundances of the different species and to understand the complex patterns of species-specific impacts of the investigated gradients, we ran separate models for each species and investigated their results with regard to direction, magnitude and significance of estimates. From these models we removed squared terms which were clearly not significant (p>0.5). The autocorrelation term included in these models was that derived from the full model run for all species, and transect length was again included as an offset variable.

Species model predictions used for identifying core distributional range areas

After extracting the variables for each cell in the country (cell size: 0.05 degrees) and transforming them when necessary to achieve approximately symmetrical distributions (Table S1, S2), we subjected them to two separate factor analyses with varimax rotation (conducted using the function fa of the R package psych; Revelle 2012), one for the environmental variables and one for the human impact variables. These were justified as indicated by the Kaiser-Meyer-Olkin measure of sampling adequacy (environmental variables: 0.421; human impact variables: 0.755) as well as Bartlett's test of sphericity (environmental variables: χ²=588.46, df=15, P<0.001; human impact variables: χ²=550.05, df=21, P<0.001; McGregor 1992). Both revealed two factors with Eigenvalues larger than one, together explaining roughly 50% of the total variance in the set of variables (Table S1, S2). We extracted the respective scores (hereafter 'factor scores') and used them as predictors in the models.
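For readers unfamiliar with Bartlett's test of sphericity, the statistic reported above can be sketched as follows. This is a minimal illustration in plain Python, not the authors' code (the diagnostics in this study were obtained in R); the function names and the simulated data are assumptions made for the example. The test compares the correlation matrix R of p variables measured on n units with the identity matrix, using χ² = -(n - 1 - (2p + 5)/6) ln|R| with p(p - 1)/2 degrees of freedom, so that six variables give df=15 and seven give df=21.

```python
import math
import random

def correlation_matrix(columns):
    """Pearson correlation matrix for a list of equally long columns."""
    n = len(columns[0])
    centered = []
    for col in columns:
        mean = sum(col) / n
        centered.append([v - mean for v in col])
    norms = [math.sqrt(sum(v * v for v in col)) for col in centered]
    p = len(columns)
    return [[sum(a * b for a, b in zip(centered[i], centered[j]))
             / (norms[i] * norms[j]) for j in range(p)] for i in range(p)]

def log_determinant(matrix):
    """Log-determinant of a positive definite matrix (Gaussian elimination)."""
    m = [row[:] for row in matrix]
    p = len(m)
    log_det = 0.0
    for k in range(p):
        pivot = m[k][k]          # positive for a positive definite matrix
        log_det += math.log(pivot)
        for i in range(k + 1, p):
            factor = m[i][k] / pivot
            for j in range(k, p):
                m[i][j] -= factor * m[k][j]
    return log_det

def bartlett_sphericity(columns):
    """Return (chi_square, df) of Bartlett's test of sphericity."""
    n, p = len(columns[0]), len(columns)
    log_det = log_determinant(correlation_matrix(columns))
    chi_square = -(n - 1 - (2 * p + 5) / 6) * log_det
    df = p * (p - 1) // 2
    return chi_square, df

# Hypothetical data: six variables that share one common signal
random.seed(1)
common = [random.gauss(0, 1) for _ in range(200)]
columns = [[c + random.gauss(0, 1) for c in common] for _ in range(6)]
chi_square, df = bartlett_sphericity(columns)
print(df)  # 15
```

The P value follows from comparing chi_square with a chi-square distribution with df degrees of freedom; that step is omitted here to keep the sketch free of external libraries.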
Out of the four factor scores and their squares we constructed all subsets (i.e., a total of 81 models; models that included a squared covariate also included the respective unsquared covariate). We included the squared terms to allow for non-linear impacts of the predictors on species abundances (e.g., higher abundance at intermediate values of a covariate). We fitted the models to the transect data (N=266 transects) using Generalized Linear Models (McCullagh & Nelder 2008) with negative binomial error structure and log link function (function glm.nb of the R package MASS; Venables & Ripley 2002). In some cases a given model could not be fitted. This concerned chimpanzees and cane rat (one model each), Spot nosed monkeys (five models), Giant rat (seven models), Warthog (12 models), Bay duiker (26 models), Buffalo (30 models), and Campbell's monkey (32 models). We discarded the respective models from further consideration. We then determined for each model the predicted values for all the grid cells in the country and also its Akaike weight (Burnham & Anderson 2002), and then averaged the predictions, weighting their contribution by the respective Akaike weights. We used these estimated abundances per species and grid cell as the basis of the spatial prioritization.

Table S1: Results of the factor analysis for the human impact variables. For each variable the largest absolute loading is marked with an asterisk.

variable                             transformation   factor 1   factor 2
tr.ht_30                             square root         0.177     -0.630*
distance to nearest village          square root         0.198      0.782*
distance to major roads              square root         0.609*    -0.363
distance to minor roads              square root         0.720*    -0.211
human population density             log                -0.399      0.633*
minimum distance to protected area   square root        -0.151     -0.613*
human population change              square root        -0.295      0.538*
Eigenvalue                                               2.167      1.311
prop. variance explained                                 0.310      0.187

Table S2: Results of the factor analysis for the environmental variables. For each variable the largest absolute loading is marked with an asterisk.

variable                    transformation   factor 1   factor 2
elevation                   square root         0.961*    -0.267
tr.ht_40                    square root         0.143      0.354*
CTI                         square root        -0.667*    -0.016
ht_60                                          -0.210     -0.549*
Precipitation seasonality   square root        -0.342      0.504*
mean precipitation                             -0.021      0.997*
Eigenvalue                                      1.655      1.642
prop. variance explained                        0.276      0.274

Algorithm for identifying core distributional ranges

We first determined, as the starting point, Qtot for the configuration of the selected 20% of grid cells with maximum relative abundance (Qtot actual; Appendix S2, Fig. S4). In a second step we identified those selected cells that had at least one adjacent unselected cell (adjacent cells were considered those four pixels immediately above, below, to the left and to the right of the pixel, not diagonal). These cells could potentially be dropped from the CDRA ('pixels to be dropped'). In the third step we identified those pixels that were not selected, but that were adjacent to at least one selected pixel following the same criteria as in step two. These were the pixels that could potentially be included in the CDRA ('pixels to be included'). In the fourth step we then calculated Qtot for each combination of pixels to be dropped and pixels to be included and chose the combination that maximized Qtot (Qtot new) while keeping 20% of all pixels selected. In the fifth step we evaluated whether Qtot new>Qtot actual; if this was the case we chose the configuration of the new area as the updated CDRA and repeated steps one to five; if Qtot new<Qtot actual we terminated the search and used the actual configuration as the final CDRA. The algorithm was implemented in R (R Core Team 2012) and applied separately for each species.

Implementation

All analyses were conducted in R (version 2.11; R Development Core Team 2010 or version 3.1.2; R Core Team 2014). GLMMs were fitted using the function glmmadmb of the R package glmmADMB (Fournier et al. 2012; Skaug et al. 2014), the FAs were conducted using the function factanal or the function fa of the package psych (Revelle 2012), and diagnostics of the applicability of FAs were derived using the function paf of the package rela (Chajewski 2009). The autocorrelation term and the fitting of the standard deviation of the respective weight function were derived using a self-written R script. The CCA was conducted and its results were plotted using the functions cca and plot.cca of the R package vegan (Oksanen et al. 2010).

The identification of species core ranges turned out to be computationally intense. We therefore parallelized the algorithm (i.e., ran multiple computations at the same time) using the R package 'parallel' to achieve reasonable computation times. The algorithm finished after 8 to 106 iterations (mean = 50, median = 42; average number of arrangements tested per iteration: 66070) and needed on average 14.9 hours (arithmetic mean) per species to finish on a quad-core processor with 2.83 GHz.

References

Burnham KP & Anderson DR. 2002. Model Selection and Multimodel Inference. 2nd ed. Berlin: Springer.
Chajewski M. 2009. rela: Scale item analysis. R package version 4.1.
Fournier DA, Skaug HJ, Ancheta J, Ianelli J, Magnusson A, Maunder M, Nielsen A & Sibert J. 2012. AD Model Builder: using automatic differentiation for statistical inference of highly parameterized complex nonlinear models. Optim. Methods Softw., 27, 233-249.
McGregor PK. 1992. Quantifying responses to playback: one, many, or composite multivariate measures? In: McGregor PK (Ed.): Playback and Studies of Animal Communication. Plenum Press, New York, London.
Oksanen J, Blanchet FG, Kindt R, Legendre P, O'Hara RB, Simpson GL, Solymos P, Stevens MHH & Wagner H. 2010. vegan: Community Ecology Package. R package version 1.17-4.
R Development Core Team. 2010. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria.
R Core Team. 2014. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria.
Skaug H, Fournier D, Bolker B, Magnusson A & Nielsen A. 2014. Generalized Linear Mixed Models using AD Model Builder. R package version 0.8.0.
Revelle W. 2012. psych: Procedures for Personality and Psychological Research.
Venables WN & Ripley BD. 2002. Modern Applied Statistics with S. Fourth Edition. Springer, New York.
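
Supplementary sketch: autocorrelation term

The autocorrelation term described under 'Accounting for spatial autocorrelation' was derived with a self-written R script that is not reproduced in this appendix. The following is a minimal Python sketch of the weighting step only, under stated assumptions: coordinates are planar, the function name autocorrelation_term and the example values are hypothetical, the residuals belong to a single species, and the standard deviation sigma is passed in as a fixed value rather than fitted by maximizing the likelihood of the full model as the authors did.

```python
import math

def autocorrelation_term(coords, residuals, sigma):
    """Gaussian-distance-weighted mean of the residuals of all other points.

    coords: list of (x, y) transect positions (assumed planar units)
    residuals: model residuals, one per transect, for a single species
    sigma: standard deviation of the Gaussian weight function (assumed given)
    """
    terms = []
    for i, (xi, yi) in enumerate(coords):
        weighted_sum = 0.0
        weight_total = 0.0
        for j, (xj, yj) in enumerate(coords):
            if i == j:
                continue  # a point's own residual is excluded
            dist = math.hypot(xi - xj, yi - yj)
            # Gaussian with mean zero: the weight is largest at distance zero
            weight = math.exp(-dist ** 2 / (2 * sigma ** 2))
            weighted_sum += weight * residuals[j]
            weight_total += weight
        terms.append(weighted_sum / weight_total if weight_total > 0 else 0.0)
    return terms

# Hypothetical example: two nearby transects and one distant transect
coords = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0)]
residuals = [0.5, -0.2, 1.0]
ac = autocorrelation_term(coords, residuals, sigma=2.0)
```

In this example the term for the first transect is dominated by the residual of its close neighbor, because the weight of the distant transect is negligible at sigma=2. Fitting sigma would amount to repeating this calculation over candidate values and keeping the one that maximizes the likelihood of the model with the term included.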
