SUPPLEMENTARY MATERIAL

Figure S1: Phytosociological composition of the study plots in the baseline year of 1971. ‘y’ = plots and sites inside the Oct ’87 storm-track, ‘n’ = plots and sites outside the storm-track but in southern England. Plots were assigned to the units of the National Vegetation Classification (Rodwell, 1991 et seq.) using the MAVIS application freely available at www.ceh.ac.uk.

[Figure S1: bar chart; axis: % of total plots (0 to 60); legend: y, n. NVC units shown:
W21 Crataegus monogyna - Hedera helix
W16 Quercus spp. - Betula spp. - Deschampsia flexuosa
W15 Fagus sylvatica - Deschampsia flexuosa
W14 Fagus sylvatica - Rubus fruticosus
W13 Taxus baccata
W12 Fagus sylvatica - Mercurialis perennis
W10 Quercus robur - Pteridium aquilinum
W8 Fraxinus excelsior - Acer campestre - Mercurialis perennis
W6 Alnus glutinosa - Urtica dioica]

Figure S2: Distributions of measured variables in storm (1) and non-storm (0) sites.

[Figure S2: one panel per variable. Panel titles: mean soil pH 1971; Site area 1971; Intensive land-use 2000; Nitrogen deposition 1996; S deposition change; mean spp richness 1971; mean Basal area 1971. Axis units shown: hectares; % cover in 5km site buffer; kg N ha-1 yr-1; change in meqv ha-1 1970 to 2000; n per 200m2; m2 ha-1.]

Table S1: Percentage of the variation in response variables explained by hypothesized predictor variables.

Response variables Explanatory variables Day difference of survey date Storm exposure Ba change: SITE Ba change: PLOT pH mean: PLOT pH change: SITE pH change: PLOT 29.0 1.7 0.0 12.7 3.2 β 1971: SITE Ba change: SITE Ba change: PLOT pH mean: SITE pH mean: PLOT pH change: SITE pH change: PLOT Spp richness change: SITE Spp richness change: SITE Spp richness change: PLOT 3.6 1.1 60.4 21.4 20.0 0.8 4.0 77.7 29.8 0.2 32.8 2.4 15.8 0.8 1.5 5.4 0.9 5.2 1.9 0.1 1.6 1.2 3.5

Figure S3: Colour version of Figure 5 in the main text. Both graphs depict the data in Table S1.
[Figure S3: stacked bar chart; axis: % variance explained (0 to 100). Legend: Spp richness change: SITE; pH change: PLOT; pH change: SITE; pH mean: PLOT; pH mean: SITE; Ba change: PLOT; Ba change: SITE; β 1971: SITE; Storm exposure; Day difference of survey date.]

Table S2. Path analysis of change in cover-weighted Specific Leaf Area (cSLA). Summary statistics for all model parameters from the Bayesian path analysis of October 1987 storm impacts on British broadleaved woodlands (n=293 plots across 26 sites). Significant effects by Bayes p value are emboldened. Based on data not centred or standardized.

Regression coefficients:
Change in cSLA given difference in survey date
Change in cSLA given storm exposure
Change in cSLA given within-site beta diversity in 1971
Woody basal area change given storm exposure
pH change given storm exposure
pH change given woody basal area change
pH change given mean soil pH
Change in cSLA given woody basal area change
Change in cSLA given pH change

mean sd MC_error val2.5pc median val97.5pc
0.0057 0.0113 0.0002 -0.0172 0.0058 0.0275 0.2921 0.2889 0.6152 0.0078 -0.9413 0.2986 1.4610 0.3135 0.1618 0.4107 0.0049 -0.9837 -0.1582 0.6431 0.3456 0.0008 -0.0749 0.0615 0.2018 0.1821 0.0048 -0.6715 -0.2369 0.2116 0.1409 0.1252 0.1025 0.2977 0.0659 0.0006 0.0027 -0.3232 0.1660 -0.1250 0.2986 0.0768 0.1107 0.4231 0 0.6131 0.5122 0.0947 0.2831 0.0028 0.0020 -1.6320 -0.4621 -0.6140 0.0944 0.3882 0.1149 0.6512 0.3676 0.0620 0.0700 0.2348 0.2251

Figure S4: Path analysis diagram for change in cover-weighted SLA between 1971 and 2002.

[Figure S4: path diagram. Node labels: Difference in survey date; Oct '87 storm y/n?; Mean (pH71+pH02; Change in woody basal; Change in soil pH; Within-site Beta diversity in 1971; Change in cover-weighted SLA. Key: Bayes P value. Values shown on the diagram: 0.30, 0.06, 0.006, 0.29, -0.23, -0.61, -0.13, 0.09, -0.17.]

Table S3: Significance tests of change in understorey community heterogeneity (ΣDi) for each woodland site between 1971 and 2002. Carried out using the R function dDEV (Baeten et al. 2014).
Site code   N plots   Plant species pool size 1971 (understorey)   Change in Sum Di   P from randomisation test   Storm exposure?   SIG   Direction
5           16        146                                          382.21             0.0005                      no                Y     up
6           16        115                                          407.50             0.0005                      no                Y     up
7           16        97                                           -422.94            0.0005                      no                Y     down
10          16        102                                          -172.34            0.0160                      no                Y     down
14          16        72                                           -4.05              0.9475                      no                N     down
16          16        100                                          155.84             0.0255                      no                Y     up
25          16        83                                           142.52             0.0180                      no                Y     up
26          15        98                                           -169.07            0.0090                      no                Y     down
29          15        79                                           497.94             0.0005                      no                Y     up
30          16        91                                           -32.75             0.6137                      no                N     down
33          16        45                                           136.39             0.0030                      no                Y     up
90          15        73                                           63.29              0.2489                      no                N     up
94          16        111                                          513.23             0.0005                      no                Y     up
100         16        75                                           -402.61            0.0005                      no                Y     down
101         16        123                                          -762.08            0.0005                      no                Y     down
102         16        86                                           -113.09            0.0945                      no                N     down
2           16        126                                          178.10             0.0425                      yes               Y     up
20          16        129                                          199.01             0.0115                      yes               Y     up
27          16        40                                           65.35              0.1039                      yes               N     up
31          16        103                                          10.52              0.8741                      yes               N     up
32          16        66                                           228.17             0.0010                      yes               Y     up
79          14        100                                          331.34             0.0005                      yes               Y     up
80          16        127                                          -149.49            0.0755                      yes               N     down
88          16        99                                           448.67             0.0005                      yes               Y     up
91          15        156                                          305.34             0.0020                      yes               Y     up
99          13        57                                           45.72              0.3603                      yes               N     up

APPENDIX S1: VARIATION PARTITIONING

Andy Scott

Calculation of the proportions of the variance of a response variable explained by the different drivers in a hypothesised path is, like model fitting, straightforward for some models and more problematical for others. Where the fitted model can be written as a sequence of variables with linear regression at each stage the calculation is relatively simple. Let the sequence of fitted regression equations for a set of n variables be written as,

\[
\begin{aligned}
x_1 &= \varepsilon_1 \\
x_2 &= \beta_{21} x_1 + \varepsilon_2 \\
x_3 &= \beta_{31} x_1 + \beta_{32} x_2 + \varepsilon_3 \\
&\;\;\vdots \\
x_n &= \beta_{n1} x_1 + \beta_{n2} x_2 + \dots + \beta_{n,n-1} x_{n-1} + \varepsilon_n
\end{aligned}
\]

where the β’s are the regression coefficients and the ε’s are the unexplained parts of the observed variables. The β’s are set to zero where the corresponding variable is not included in a particular regression.
This set of equations can be written in matrix form as,

\[ X = AX + E \tag{1} \]

where X is the vector of variables, E is the vector of unexplained components and A is the square matrix of regression coefficients which is lower triangular with zeros on the diagonal. Note that the first row of A is completely zero since the first variable in the chain is not regressed on any other variable. Similarly the last column is all zeros since the final (response) variable in the chain is not used as an explanatory variable in any of the component regression models. Rearranging gives

\[ (I - A)X = E \tag{2} \]

or

\[ X = (I - A)^{-1}E \tag{3} \]

so that the elements of $(I - A)^{-1}$ give the required relationship between each variable and the unique, unexplained parts of the preceding variables in the sequence. The variance of X is

\[ \operatorname{var}(X) = (I - A)^{-1}\,\operatorname{var}(E)\,(I - A)^{-T} \tag{4} \]

Elements of E are uncorrelated by construction so that var(E) is diagonal with diagonal values given by the residual variances of the individual regression equations used to construct the model. This equation gives the required breakdown of the variance of each variable in terms of the preceding variables in the sequence. The component values give the total variance explained, i.e. direct plus indirect. The variance of a variable $x_i$ explained by the direct effect of variable $x_j$ can be calculated as $\beta_{ij}^2 \operatorname{var}(E_j)$. The indirect effect is then obtained by subtraction.

The required variance components can be calculated directly. Let B be the matrix whose elements are the squares of the corresponding elements in $(I - A)^{-1}$; then the components of variance are directly given as

\[ C = B\,\operatorname{Var}(E) \tag{5} \]

Row j of this matrix gives the components of variance of variable j, i.e. the amount of variance of variable j explained by the other variables. The diagonal values are the unexplained variance and the sum of the values in each row add to the total variance of the corresponding variable. Dividing each row of this matrix by the row sum converts the matrix to proportions of variance explained.
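The calculation in equations 1) to 5) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code used for the analysis: the function name `variance_components`, the three-variable chain and all coefficient and residual-variance values are hypothetical and are not taken from the woodland dataset.

```python
def variance_components(A, resid_var):
    """Variance partitioning for a chain of linear regressions.

    A          strictly lower-triangular matrix of regression coefficients
               (A[i][k] is the coefficient of variable k in the model for i)
    resid_var  residual variance of each regression, i.e. the diagonal of var(E)
    Returns (C, props): components of variance and row-wise proportions.
    """
    n = len(A)
    # M = (I - A)^{-1}, built row by row. Because A is strictly lower
    # triangular, row i of M only depends on rows k < i:
    #   M[i][j] = delta_ij + sum_k A[i][k] * M[k][j]
    M = [[0.0] * n for _ in range(n)]
    for i in range(n):
        M[i][i] = 1.0
        for k in range(i):
            for j in range(k + 1):
                M[i][j] += A[i][k] * M[k][j]
    # B holds the squared elements of M; C = B Var(E) with Var(E) diagonal
    C = [[M[i][j] ** 2 * resid_var[j] for j in range(n)] for i in range(n)]
    # Dividing each row by its sum gives proportions of variance explained
    props = [[c / sum(row) for c in row] for row in C]
    return C, props


# Hypothetical chain: x1 -> x2 -> x3, plus a direct path x1 -> x3
A = [
    [0.0, 0.0, 0.0],   # x1 is not regressed on anything
    [0.5, 0.0, 0.0],   # x2 = 0.5*x1 + e2
    [0.3, -0.4, 0.0],  # x3 = 0.3*x1 - 0.4*x2 + e3
]
resid_var = [1.0, 0.75, 0.5]  # illustrative residual variances

C, props = variance_components(A, resid_var)
for name, row in zip(["x1", "x2", "x3"], props):
    print(name, [round(p, 3) for p in row])
```

Each printed row sums to 1; the last entry in a row is the unexplained share and the earlier entries are the total (direct plus indirect) shares attributable to the preceding variables.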
When hierarchical data are involved the calculations are somewhat more involved but essentially the same. The path diagram is modified so that latent/unobserved variables corresponding to each lower level variable are explicitly included. Variables recorded at the highest level in a hierarchy therefore have just one row and column in the matrix A; variables at the next level have two rows and columns and so on. In the woodland dataset there are just two levels, plots within sites, so that each plot-level variable has two rows and columns, the first for the unobserved site-level variable and the second for the plot-level measurement. All regression coefficients from the fitted hierarchical component are entered in the site-level row while only coefficients corresponding to previous plot-level variables are entered in the plot-level row. In addition, in the plot-level row a value of 1 is entered as a regression coefficient for the corresponding site-level variable.

APPENDIX S2: OpenBUGS code

## LIKELIHOODS: Analysis of centred & standardized data
model {
  for (i in 1:rs) {
    YY[i,2] ~ dnorm(mu1, tau1)
    YY[i,2] <- (Y[i,2] - mean(Y[,2])) / sd(Y[,2])   # Survey day difference

    # Y[i,1] contains the indicator variable for whether the SITE was in the 87 storm
    # track or not. This is coded as 1 (storm) or 0 (no-storm).
    # 1. Change in richness | Survey_day_diff + Oct storm
    # 2. Change in basal area | Oct storm

    # 1. Change in richness | Survey_day_diff + Oct storm ... YY[i,2] contains
    #    ctrd/stdz Survey Day difference
    muScS[i] <- alpha1_0 + (beta1 * YY[i,2]) + (beta2 * Y[i,1])

    # 2. Change in basal area | Oct storm ... Y[i,1] contains binary
    #    in/out Storm track
    muBaS[i] <- alpha2_0 + (beta3 * Y[i,1])

    ####################################################################
    # PLOT-level loops. Where plot level variables are predicted by their SITE-level
    # mean prediction plus any other plot level covariates. Here we need...
    # 1. Change in soil pH | (change in basal area) + pH starting point i.e. mean of pH
    #    in 71 + 02 as this is a correlate of measurement error
    # 2. Change in species richness | (Change in basal area + change in soil pH)
    for (j in Y[i,3]:Y[i,4]) {
      #XX[j,3] <- (X[j,3] - mean(X[,3])) / sd(X[,3])   # Change in mSLA
      XX[j,2] <- (X[j,2] - mean(X[,2])) / sd(X[,2])    # Change in richness
      XX[j,5] <- (X[j,5] - mean(X[,5])) / sd(X[,5])    # Change in basal area
      XX[j,4] <- (X[j,4] - mean(X[,4])) / sd(X[,4])    # Change in soil pH
      XX[j,6] <- (X[j,6] - mean(X[,6])) / sd(X[,6])    # Mean pH (71+02)/2

      # @@@ Variance components on Basal area and mean pH 71+02
      muBaP[j] <- alpha10[X[j,1]] + muBaS[X[j,1]]
      mupH7102[j] <- alpha11[X[j,1]]
      XX[j,5] ~ dnorm(muBaP[j], tauBa)
      # precision renamed from taupHSt to match the prior on tauPhSt below
      XX[j,6] ~ dnorm(mupH7102[j], tauPhSt)

      # Change in soil pH | change in basal area + mean pH
      muPcP[j] <- alpha6_0[X[j,1]] + (beta5 * XX[j,5]) + (beta7 * XX[j,6])
      XX[j,4] ~ dnorm(muPcP[j], tauPc)

      # Change in richness | site-level prediction + change in basal area + change in soil pH
      muScP[j] <- alpha9[X[j,1]] + muScS[X[j,1]] + (beta4 * XX[j,5]) + (beta6 * XX[j,4])
      #XX[j,3] ~ dnorm(muScP[j], tauSc)
      XX[j,2] ~ dnorm(muScP[j], tauSc)
    }
  }

  ## PATH COEFFICIENTS
  ## Storm effect on richness change via basal area change
  B3timesB4 <- beta3 * beta4
  ## Storm effect on richness change via pH change via Basal area change
  B3B5B6 <- beta3 * beta5 * beta6
  ## Effect of pH starting point on species richness change via pH change
  B7timesB6 <- beta7 * beta6

  ## Bayes p-values on beta()
  p.B7B6 <- 1 - step(B7timesB6)
  p.B3B5B6 <- 1 - step(B3B5B6)
  p.B3B4 <- 1 - step(B3timesB4)
  # NOTE: Ind_storm, Tot_BA, Tot_Storm and Starting_ph are not defined in this listing
  p.Ind_storm <- 1 - step(Ind_storm)
  p.Tot_BA <- 1 - step(Tot_BA)
  p.Tot_Storm <- 1 - step(Tot_Storm)
  p.St_ph <- 1 - step(Starting_ph)
  p.beta1 <- 1 - step(beta1)
  p.beta2 <- 1 - step(beta2)
  p.beta3 <- 1 - step(beta3)
  p.beta5 <- 1 - step(beta5)
  p.beta7 <- 1 - step(beta7)
  p.beta4 <- 1 - step(beta4)
  p.beta6 <- 1 - step(beta6)

  ## PRIORS
  mu1 ~ dnorm(0, 1.0E-6)
  alpha2_0 ~ dnorm(0, 1.0E-6)
  alpha1_0 ~ dnorm(0, 1.0E-6)
  beta3 ~ dnorm(0, 1.0E-2)
  beta5 ~ dnorm(0, 1.0E-6)
  beta7 ~ dnorm(0, 1.0E-6)
  beta1 ~ dnorm(0, 1.0E-6)
  beta2 ~ dnorm(0, 1)
  beta4 ~ dnorm(0, 1.0E-6)
  beta6 ~ dnorm(0, 1.0E-6)

  tauBa ~ dgamma(0.001, 0.001)
  sigmaBa <- 1/sqrt(tauBa)
  tau1 ~ dgamma(0.001, 0.001)
  sigma1 <- 1/sqrt(tau1)
  tauPc ~ dgamma(0.001, 0.001)
  sigmaPc <- 1/sqrt(tauPc)
  tauSc ~ dgamma(0.001, 0.001)
  sigmaSc <- 1/sqrt(tauSc)
  tauPhSt ~ dgamma(0.001, 0.001)
  sigmaPhSt <- 1/sqrt(tauPhSt)

  ## Uniform priors on the sdev of the random intercepts
  # following the recommendation in Gelman (2006)
  for (i in 1:26) {
    alpha6_0[i] ~ dnorm(mu.int6_0, tau.int6_0)
    alpha9[i] ~ dnorm(mu.int9, tau.int9)
    alpha10[i] ~ dnorm(mu.int10, tau.int10)
    alpha11[i] ~ dnorm(mu.int11, tau.int11)
  }
  mu.int6_0 ~ dnorm(0, 1.0E-6)
  sigma.int6_0 ~ dunif(0,100)
  tau.int6_0 <- 1/(sigma.int6_0*sigma.int6_0)
  mu.int9 ~ dnorm(0, 1.0E-6)
  sigma.int9 ~ dunif(0,100)
  tau.int9 <- 1/(sigma.int9*sigma.int9)
  mu.int10 ~ dnorm(0, 1.0E-6)
  sigma.int10 ~ dunif(0,100)
  tau.int10 <- 1/(sigma.int10*sigma.int10)
  mu.int11 ~ dnorm(0, 1.0E-6)
  sigma.int11 ~ dunif(0,100)
  tau.int11 <- 1/(sigma.int11*sigma.int11)
}

# Data for WinBUGS generated by BAUW,
# ... a free program by Zhang, Z. and Wang, L. (2006)
list( rs = 26,
  Y = structure(.Data = c( DATA AVAILABLE ON REQUEST ), .Dim = c(26,4)),
  X = structure(.Data = c( DATA AVAILABLE ON REQUEST ), .Dim = c(293,6)))

APPENDIX S3: Notes on construction of the path analysis in OpenBUGS

The essential building blocks of the path model are regression equations that, when coupled, quantify the partial explanatory power of one factor on another factor via correlation with an intermediate factor. In the storm example consider species richness change as the response variable (y0) to be explained by two variables, change in woody basal area (y1) and exposure to the October ‘87 storm (y2). Both are hypothesized to have direct and independent effects leading to the model,

\[ y_{0j} = \alpha_{1i(j)} + \beta_4 y_{1j} + \beta_2 y_{2j} + \varepsilon_{1j} \tag{1} \]

where the β’s are standardized regression coefficients and $\alpha_{1i(j)}$ is a random intercept indexed on woodland site i.
However, change in woody basal area (y1) is also expected to be partially explained by a significant effect of storm exposure leading to another model,

\[ y_{1j} = \alpha_{2i(j)} + \beta_3 y_{2j} + \varepsilon_{2j}. \tag{2} \]

So the dichotomous factor $y_{2j}$, indicating exposure to the storm, has appeared in two models; in 1) it is fitted alongside change in woody basal area with both being independent predictors of change in species richness and in 2) it is treated as an explanatory variable for change in woody basal area. The proportion of the effect of woody basal area change on species richness change that is indirectly attributable to the effect of the storm is given by substituting 2) into 1),

\[ y_{0j} = \alpha_{1i(j)} + \beta_4 \left( \alpha_{2i(j)} + \beta_3 y_{2j} + \varepsilon_{2j} \right) + \beta_2 y_{2j} + \varepsilon_{1j}. \tag{3} \]

Centering and standardizing the covariates to mean 0 and sd 1 means that the intercept $\alpha_2$ can be ignored, because it is centred on zero, and all β now also range between 1 and -1 and are centered on 0. Residual errors are also assumed to be randomly distributed and centered on zero. The first term after the intercept in 3) therefore reduces to $\beta_4 \beta_3 y_{2j}$. This is the estimate of the indirect effect of storm exposure on species richness change via woody basal area change. Many kinds of data cannot be centred and standardized in this way so that generating estimates of indirect and total effects in a path analysis requires alternative methods (see Clough 2012, Grace et al. 2012 and Grace 2006, pages 57-73).

Including random effects

Sample plots nested within individual woodland sites are likely to be more similar in their behavior in space and over time than plots from different sites. Random intercepts drawn from a zero mean normal distribution are specified to model this additional blocking structure in the data.
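The blocking structure described above can be illustrated with a small simulation, written here in Python rather than OpenBUGS. It is a sketch under assumed values only: the balanced design (`N_SITES`, `PLOTS_PER_SITE`) and the two standard deviations (`SD_SITE`, `SD_PLOT`) are hypothetical and do not come from the woodland data. Each site receives one random intercept drawn from a zero-mean normal distribution, and simple one-way ANOVA moment estimators then recover the between-site and within-site variances.

```python
import random
import statistics

random.seed(1)

N_SITES, PLOTS_PER_SITE = 26, 16  # hypothetical balanced design
SD_SITE, SD_PLOT = 1.0, 0.5       # assumed between-site and within-site sd

# One random intercept per site, drawn from a zero-mean normal distribution
site_effect = [random.gauss(0.0, SD_SITE) for _ in range(N_SITES)]

# Plot value = site intercept + plot-level residual
data = [
    [site_effect[i] + random.gauss(0.0, SD_PLOT) for _ in range(PLOTS_PER_SITE)]
    for i in range(N_SITES)
]

# Within-site variance: average of the per-site sample variances
within_var = statistics.mean(statistics.variance(site) for site in data)

# Between-site variance: variance of site means, corrected for the
# sampling noise each site mean inherits from its plots
site_means = [statistics.mean(site) for site in data]
between_var = max(statistics.variance(site_means) - within_var / PLOTS_PER_SITE, 0.0)

# Share of total variance lying between sites (intraclass correlation)
icc = between_var / (between_var + within_var)
print(f"within = {within_var:.3f}, between = {between_var:.3f}, ICC = {icc:.3f}")
```

A high between-site share means that plots from the same site resemble each other more than plots from different sites, which is the similarity that the random intercepts are meant to absorb.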
The hierarchical structure can be modeled as,

\[ y_j = \mu + \alpha_i + \varepsilon_j \quad \text{with} \quad \varepsilon_j \sim N(0, \sigma_j^2) \quad \text{and} \quad \alpha_i \sim N(0, \sigma_i^2) \tag{4} \]

where μ is a grand mean, $\alpha_i$ is a random intercept for each woodland and $\varepsilon_j$ the random variation among plots having accounted for the random variation between woodlands. After introducing fixed effects, the estimates of the two variances in equation 4 quantify the variation between plots and sites that is left having fitted either site or plot level explanatory variables (Singer 1998; Gelman & Hill 2007). These quantities are essential for partitioning the variation at plot or site level that is explained by the fixed effects included in the path model. The residual variances are calculated for each regression model in the path analysis. If some predictors are included then the model can be written out so that it is clear that some are at site level, and cannot explain any of the within-site variation, and others are measured in every plot in every site. So within each site we have a regression involving plot-level predictors with an intercept per site $\alpha_i$ as in 4) above,

\[ y_j \sim N(\alpha_i + \beta_j x_j, \sigma_j^2) \tag{5} \]

then site-level predictors are added in and expand the random intercept term,

\[ \alpha_i \sim N(\beta_i x_i, \sigma_i^2). \tag{6} \]

This formulation is based on Gelman & Hill (2007), pages 262-265, where they give five ways of writing out the same kind of multi-level model. This is well worth a read as it helps to clarify how a multilevel model represents the structured random variation in the sample alongside predictors specified at different hierarchical levels.

Writing the path model in OpenBUGS

The fully documented code to implement the storm impacts example is available in the Supplementary Material (Appendix S2). Here we describe the essential building blocks of the path model as written in OpenBUGS.

Regression models

In the storm example two loops are specified within which likelihoods and residual variances are modelled. The first loop, subscripted i, models site-level effects (line 2).
The second, subscripted j, models plot-within-site effects (line 5). Note here that the two column vectors Y[i,3] and Y[i,4] contain the row numbers of X that indicate the start and end of each set of plot data within each site. Line 3 specifies the regression model for the effects of difference in date of survey (YY[i,2]) and exposure to the storm (Y[i,1]) on species richness change. Since the two explanatory variables are measured at the site level and do not vary from plot to plot within any one site they cannot explain any of the variation in species richness change between plots within each site. The hierarchical structure of the dataset can be ignored when fitting site-level fixed effects, thus only one intercept is fitted: alpha1_0. Similarly, line 4 specifies the model for the effects of storm exposure on mean change in woody basal area at the site level. Again, storm exposure cannot explain any of the within-site, between-plot variation.

1 model {
2   for (i in 1:rs) {
3     muScS[i] <- alpha1_0 + (alpha1_1 * YY[i,2]) + (alpha1_2 * Y[i,1])
4     muBaS[i] <- alpha2_0 + (alpha2_1 * Y[i,1])

Where measurements were made in each plot within each site, regression relationships between these variables are modelled inside loop j. Two regression models are shown. Line 6 specifies the parameters to be estimated for the plot-level effects of change in woody basal area XX[j,5] and change in soil pH XX[j,4] on change in plot-level species richness XX[j,2]. The expectation for this regression model is muScP[j]; line 7 specifies the likelihood, with this expectation and variance 1/tauSc. This residual variance quantifies the amount of variation in change in plot-level species richness not explained by the regression model in line 6. To complete the model of variation in species richness change at the plot level but across all plots and sites, two site-level expectations need to be included. These are the site-level predictions from line 3, muScS[X[j,1]], and a random intercept alpha9[X[j,1]].
Note the use of the double bracketing here to indicate that estimation of each random intercept is across the set of plots j in site i. Line 8 specifies another plot-level regression model, this time for woody basal area change in each plot within each site. Figure 3 shows that no plot-level explanatory variables point to woody basal area. Only storm exposure is hypothesised to explain change in woody basal area. Since this is a site-level effect, the relationship has already been modelled in the site loop i (line 4). Line 8 completes the variance decomposition by fitting residual random intercepts alpha10[X[j,1]] given the modelled effect of storm exposure at site level muBaS[X[j,1]].

5     for (j in Y[i,3]:Y[i,4]) {
6       muScP[j] <- alpha9[X[j,1]] + muScS[X[j,1]] + (alpha8_0 * XX[j,5]) + (alpha8_1 * XX[j,4])
7       XX[j,2] ~ dnorm(muScP[j], tauSc)
8       muBaP[j] <- alpha10[X[j,1]] + muBaS[X[j,1]]
9       XX[j,5] ~ dnorm(muBaP[j], tauBa)

Path coefficients

Indirect and total effects are readily specified as stochastic quantities whose distributional properties are estimated by OpenBUGS. For example the indirect effect of storm exposure on change in species richness via the intermediate effect of storm exposure on change in woody basal area can be written simply in OpenBUGS code as Rich_Ba_Storm <- alpha2_1*alpha8_0. This line achieves the nesting of model 2) into 1). The posterior distributions of these variables can be summarised by specifying them as nodes to monitor, while Bayes p values can also be readily specified; see the OpenBUGS code.

References

Clough, Y. (2012) A generalized approach to modelling and estimating indirect effects in ecology. Ecology 93, 1809-1815.

Gelman, A., Hill, J. (2007) Data Analysis Using Regression and Multilevel/Hierarchical Models. CUP.

Grace, J.B. (2006) Structural Equation Modelling and Natural Systems. CUP.

Grace, J.B., Schoolmaster, D.R., Guntenspergen, G.R., Little, A.M., Mitchell, B.R., Miller, K.R., Schweiger, E.W. (2012) Guidelines for a graph-theoretic implementation of structural equation modelling. Ecosphere 3, Article 73.

Singer, J.D. (1998) Using SAS PROC MIXED to fit multi-level models, hierarchical models and individual growth models. Journal of Educational and Behavioral Statistics 24, 323-355.