Spatial relationship between climatologies and changes in global vegetation activity

Rogier de Jong, Michael E. Schaepman, Reinhard Furrer, Sytze de Bruin, Peter H. Verburg

Supporting Online Material (SOM)

Materials and Methods

S1. Deterministic model (fixed effects)

For global application, we used a regression tree (RT) model. Such a model is built by recursive partitioning of the sample (= root node) into more homogeneous nodes, or children (Breiman et al., 1984). Each split is based on one predictor and is selected according to a splitting criterion that minimizes the total sum of squared deviations from the node centers. The tree is grown until no further splits can be made for lack of data and is then pruned: the least important splits, as ranked by a cost-complexity measure (Steinberg, 2009), are removed. Cross-validation was used to derive the optimal complexity parameter. In this way, the 54,601 grid cells (root node) were classified into 867 terminal nodes. All climate variables were selected approximately equally often as split variables. The resulting model was used to predict the change in vegetation activity by following the path from the root node down to the appropriate terminal node of the tree. This provided the fixed-effects term of the additive model in Equation 1 (main document). For this model, we used the tree package in R (R Development Core Team, 2012).

S2. Spatial field model (random effects)

Spatial dependence in the non-associated effects was modeled in R using a stationary Gaussian random field (GRF). A GRF is specified by its mean value function and its covariance function. The main assumption is therefore that h follows a normal distribution with, in this case, zero mean and covariance matrix Σ(Θ). The model parameters Θ (i.e. sill and range parameter δ) fully characterize the random field, which is expressed as:

h ~ N(0, Σ(Θ))    (Eq. S1)

A spherical covariance function induces a symmetric and positive definite covariance matrix. The size of the covariance matrix (i.e. the square of the number of observations) may lead to serious computational issues for datasets of the size used here (Furrer & Sain, 2009). We used two measures to deal with this.

First, a spherical function was selected because observations beyond the maximum range δ can be considered spatially uncorrelated. The range, therefore, defines the mean 'patch size' in a realization of a GRF. We determined δ (in 1 degree steps, using great circle distance) by computing the negative 2 log-likelihood (-2ln(L)) of the observed spatial field of residuals. We found the optimum around 900 km (Figure S1a), with the most substantial decrease in -2ln(L) below ~500 km. The latter indicates a minimal range that should be respected. We used the longer range δ = 897 km (~8 degrees) for estimation of the other model parameters. It resulted in a covariance-matrix density of 3%, equivalent to 89.5 million nonzero elements for 54,601 observations.

Second, recognizing the sparse nature of the covariance matrices, only the nonzero entries were stored and used for estimation of Θ. For this part of the analysis we used the R package spam (Furrer & Sain, 2010). Given that Σ(Θ) is symmetric and positive definite, Cholesky decomposition was used to construct a lower triangular matrix L, such that the product LL^T returns the original matrix. Solving the linear systems that arise in, for example, maximum likelihood estimation (MLE) becomes computationally more efficient with this factorization (Higham, 2009), which we used to our advantage for optimizing Θ, as described in the following steps.

(step 1) A set of initial parameters Θ0 was derived from the residuals of the fixed-effects model by method-of-moments using gstat (Pebesma & Wesseling, 1998). (step 2) The distance matrix up to distance δ was calculated.
Subsequently, the spherical covariance function was applied using the current parameter estimates. (step 3) The parameters were optimized using MLE to obtain a new set Θ, which was used for predicting the spatial field ĥ (Eq. S2). In turn, this spatial field was used for backfitting of β; this procedure was repeated until convergence. The final spatial parameters and the resulting spherical covariance function are shown in Figure S1b.

The described methodology provides the empirical best linear unbiased prediction (E-BLUP) of the spatial field (Henderson, 1975) and is, under the intrinsic assumption of Eq. S1, analogous to kriging approaches in geostatistics (Lark et al., 2006). More specifically, it is analogous to kriging approaches with nugget filtering, which yielded the separation between the spatially correlated field (Figure 4c) and the uncorrelated residuals (Figure 4d).

Figure S1 (a) Maximum likelihood estimation (MLE) of spatial model parameters as a function of range δ (x-axis). The y-axis shows the negative 2 log-likelihood, or -2ln(L). The optimal range was found at 897 km, or 8 degrees. (b) The optimized spherical covariance function obtained from the MLE and used for the Gaussian random field (GRF). The spatial dependency reduces to zero at distance δ.

References (SOM only)

Breiman L, Friedman JH, Olshen RA, Stone CJ (1984) Classification and Regression Trees, Wadsworth, Belmont, CA, Chapman and Hall / CRC Press.
Furrer R, Sain SR (2009) Spatial model fitting for large datasets with applications to climate and microarray problems. Statistics and Computing, 19, 113-128.
Furrer R, Sain SR (2010) spam: a sparse matrix R package with emphasis on MCMC methods for Gaussian Markov random fields. Journal of Statistical Software, 36, 1-25.
Henderson CR (1975) Best linear unbiased estimation and prediction under a selection model. Biometrics, 31, 423-447.
Higham NJ (2009) Cholesky factorization. Wiley Interdisciplinary Reviews: Computational Statistics, 1, 251-254.
Lark RM, Cullis BR, Welham SJ (2006) On spatial prediction of soil properties in the presence of a spatial trend: the empirical best linear unbiased predictor (E-BLUP) with REML. European Journal of Soil Science, 57, 787-799.
Pebesma EJ, Wesseling CG (1998) Gstat: a program for geostatistical modelling, prediction and simulation. Computers & Geosciences, 24, 17-31.
R Development Core Team (2012) R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. ISBN 3-900051-07-0, URL http://www.R-project.org/.
Steinberg D (2009) CART: Classification and Regression Trees. In: The Top Ten Algorithms in Data Mining. (eds Wu X, Kumar V) pp Page. Boca Raton, FL, USA, CRC Press (Taylor & Francis Group).
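
Illustrative sketch (not part of the original analysis). The sparsity argument in S2 rests on the spherical covariance function being exactly zero beyond the range δ, so that only pairs of grid cells closer than δ contribute entries to Σ(Θ). The following minimal Python sketch shows that idea on three points. It is not the R code (spam, gstat) used for the analysis. The function names, the Earth radius, the haversine distance formula, the sill value of 1.0 and the example coordinates are all assumptions made for illustration; only the range of 897 km is taken from the text.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius, not stated in the SOM


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine great-circle distance in km; inputs are in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


def spherical_cov(h, sill, range_km):
    """Spherical covariance: sill * (1 - 1.5 r + 0.5 r^3) with r = h / range,
    and exactly zero once the distance h reaches the range."""
    if h >= range_km:
        return 0.0
    r = h / range_km
    return sill * (1.0 - 1.5 * r + 0.5 * r ** 3)


def sparse_covariance(points, sill, range_km):
    """Return ({(i, j): covariance} for nonzero entries only, matrix density).

    points is a list of (lat, lon) tuples in degrees. This brute-force loop
    visits every pair, which is fine for a toy example but not for tens of
    thousands of grid cells.
    """
    n = len(points)
    entries = {}
    for i in range(n):
        for j in range(i, n):
            h = great_circle_km(*points[i], *points[j])
            c = spherical_cov(h, sill, range_km)
            if c != 0.0:
                entries[(i, j)] = c  # the matrix is symmetric,
                entries[(j, i)] = c  # so mirror each entry
    density = len(entries) / (n * n)
    return entries, density


if __name__ == "__main__":
    # Hypothetical grid cells on the equator at 0, 4 and 20 degrees longitude.
    cells = [(0.0, 0.0), (0.0, 4.0), (0.0, 20.0)]
    entries, density = sparse_covariance(cells, sill=1.0, range_km=897.0)
    for key in sorted(entries):
        print(key, round(entries[key], 4))
    print("density:", round(density, 3))
```

In this toy example only the first two cells lie within 897 km of each other, so the third cell contributes nothing but its own diagonal entry. The same mechanism, applied to the full grid, is what keeps the covariance matrix sparse enough to store and factorize.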