jec12291-sup-0001-SuppInfo

SUPPLEMENTARY MATERIAL
Figure S1: Phytosociological composition of the study plots in the baseline year of 1971. ‘y’ = plots and sites inside the Oct ’87 storm-track, ‘n’ = plots and sites outside the storm-track but in southern England. Plots were assigned to the units of the National Vegetation
Classification (Rodwell, 1991 et seq.) using the MAVIS application freely available at www.ceh.ac.uk.
[Figure: % of total plots (axis 0 to 60) in each NVC community, shown separately for y and n plots. Communities:]
W21 Crataegus monogyna - Hedera helix
W16 Quercus spp. - Betula spp. - Deschampsia flexuosa
W15 Fagus sylvatica - Deschampsia flexuosa
W14 Fagus sylvatica - Rubus fruticosus
W13 Taxus baccata
W12 Fagus sylvatica - Mercurialis perennis
W10 Quercus robur - Pteridium aquilinum
W8 Fraxinus excelsior - Acer campestre - Mercurialis perennis
W6 Alnus glutinosa - Urtica dioica
Figure S2: Distributions of measured variables in storm (1) and non-storm (0) sites.
[Figure: one panel per variable, each comparing non-storm (0) and storm (1) sites. Panels and axis units: mean soil pH 1971; Site area 1971 (hectares); Intensive land-use 2000 (% cover in 5km site buffer); Nitrogen deposition 1996 (kg N ha-1 yr-1); S deposition change (change in meqv ha-1 1970 to 2000); mean spp richness 1971 (n per 200m2); mean Basal area 1971 (m2 ha-1).]
Table S1: Percentage of the variation in response variables explained by hypothesized predictor variables.
Response variables (columns): Ba change: SITE; Ba change: PLOT; pH mean: PLOT; pH change: SITE; pH change: PLOT; Spp richness change: SITE; Spp richness change: PLOT
Explanatory variables (rows): Day difference of survey date; Storm exposure; β 1971: SITE; Ba change: SITE; Ba change: PLOT; pH mean: SITE; pH mean: PLOT; pH change: SITE; pH change: PLOT; Spp richness change: SITE
Cell values in the order extracted (row and column positions not preserved):
29.0 1.7 0.0 12.7 3.2
3.6 1.1 60.4 21.4 20.0 0.8 4.0 77.7 29.8 0.2 32.8 2.4 15.8 0.8 1.5 5.4 0.9 5.2 1.9 0.1 1.6 1.2 3.5
Figure S3: Colour version of Figure 5 in the main text. Both graphs depict the data in Table S1.
[Figure: vertical axis % variance explained (0 to 100). Legend: Spp richness change: SITE; pH change: PLOT; pH change: SITE; pH mean: PLOT; pH mean: SITE; Ba change: PLOT; Ba change: SITE; β 1971: SITE; Storm exposure; Day difference of survey date.]
Table S2. Path analysis of change in cover-weighted Specific Leaf Area (cSLA). Summary statistics for all
model parameters from the Bayesian path analysis of October 1987 storm impacts on British broadleaved
woodlands (n=293 plots across 26 sites). Significant effects by Bayes p value are emboldened. Based on data
not centred or standardized.
Regression coefficients                                     mean     sd      MC_error  val2.5pc  median   val97.5pc  Bayes p value
Change in cSLA given difference in survey date              0.0057   0.0113  0.0002    -0.0172   0.0058   0.0275     0.2921
Change in cSLA given storm exposure                         0.2889   0.6152  0.0078    -0.9413   0.2986   1.4610     0.3135
Change in cSLA given within-site beta diversity in 1971     -0.1618  0.4107  0.0049    -0.9837   -0.1582  0.6431     0.3456
Woody basal area change given storm exposure                0.0620   0.0700  0.0008    -0.0749   0.0615   0.2018     0.1821
pH change given storm exposure                              -0.2348  0.2251  0.0048    -0.6715   -0.2369  0.2116     0.1409
pH change given woody basal area change                     -0.1252  0.1025  0.0006    -0.3232   -0.1250  0.0768     0.1107
pH change given mean soil pH                                0.2977   0.0659  0.0027    0.1660    0.2986   0.4231     0
Change in cSLA given woody basal area change                -0.6131  0.5122  0.0028    -1.6320   -0.6140  0.3882     0.1149
Change in cSLA given pH change                              0.0947   0.2831  0.0020    -0.4621   0.0944   0.6512     0.3676
Figure S4: Path analysis diagram for change in cover-weighted SLA between 1971 and 2002.
[Path diagram. Nodes: Difference in survey date; Oct '87 storm y/n?; Mean (pH71+pH02); Change in woody basal area; Change in soil pH; Within-site Beta diversity in 1971; Change in cover-weighted SLA. Path labels in the order extracted: 0.30, 0.06, 0.006, 0.29, -0.23, -0.61, -0.13, 0.09, -0.17. Annotation: Bayes P value.]
Table S3: Significance tests of change in understorey community heterogeneity (ΣDi) for each woodland site
between 1971 and 2002. Carried out using the R function dDEV (Baeten et al. 2014).
Species pool 1971 = plant species pool size in 1971 (understorey); P = P from randomisation test.
Site code  N plots  Species pool 1971  Change in Sum Di  P       Storm exposure?  SIG  Direction
5          16       146                382.21            0.0005  no               Y    up
6          16       115                407.50            0.0005  no               Y    up
7          16       97                 -422.94           0.0005  no               Y    down
10         16       102                -172.34           0.0160  no               Y    down
14         16       72                 -4.05             0.9475  no               N    down
16         16       100                155.84            0.0255  no               Y    up
25         16       83                 142.52            0.0180  no               Y    up
26         15       98                 -169.07           0.0090  no               Y    down
29         15       79                 497.94            0.0005  no               Y    up
30         16       91                 -32.75            0.6137  no               N    down
33         16       45                 136.39            0.0030  no               Y    up
90         15       73                 63.29             0.2489  no               N    up
94         16       111                513.23            0.0005  no               Y    up
100        16       75                 -402.61           0.0005  no               Y    down
101        16       123                -762.08           0.0005  no               Y    down
102        16       86                 -113.09           0.0945  no               N    down
2          16       126                178.10            0.0425  yes              Y    up
20         16       129                199.01            0.0115  yes              Y    up
27         16       40                 65.35             0.1039  yes              N    up
31         16       103                10.52             0.8741  yes              N    up
32         16       66                 228.17            0.0010  yes              Y    up
79         14       100                331.34            0.0005  yes              Y    up
80         16       127                -149.49           0.0755  yes              N    down
88         16       99                 448.67            0.0005  yes              Y    up
91         15       156                305.34            0.0020  yes              Y    up
99         13       57                 45.72             0.3603  yes              N    up
APPENDIX S1: VARIATION PARTITIONING
Andy Scott
Calculation of the proportions of the variance of a response variable explained by the different drivers in a
hypothesised path is, like model fitting, straightforward for some models and more problematical for others.
Where the fitted model can be written as a sequence of variables with linear regression at each stage the
calculation is relatively simple. Let the sequence of fitted regression equations for a set of n variables be written
as,
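\begin{align*}
x_1 &= \varepsilon_1 \\
x_2 &= \beta_{21} x_1 + \varepsilon_2 \\
x_3 &= \beta_{31} x_1 + \beta_{32} x_2 + \varepsilon_3 \\
&\;\;\vdots \\
x_n &= \beta_{n1} x_1 + \beta_{n2} x_2 + \dots + \beta_{n,n-1} x_{n-1} + \varepsilon_n
\end{align*}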
where the β’s are the regression coefficients and the ε’s are the unexplained parts of the observed variables. The
β ’s are set to zero where the corresponding variable is not included in a particular regression. This set of
equations can be written in matrix form as,
\[ X = AX + E \tag{1} \]
where X is the vector of variables, E is the vector of unexplained components and A is the square matrix of
regression coefficients which is lower triangular with zeros on the diagonal. Note that the first row of A is
completely zero since the first variable in the chain is not regressed on any other variable. Similarly the last
column is all zeros since the final (response) variable in the chain is not used as an explanatory variable in any
of the component regression models. Rearranging gives
\[ (I - A) X = E \tag{2} \]
or
\[ X = (I - A)^{-1} E \tag{3} \]
so that the elements of $(I - A)^{-1}$ give the required relationship between each variable and the unique, unexplained
parts of the preceding variables in the sequence. The variance of X is
\[ \mathrm{var}(X) = (I - A)^{-1} \, \mathrm{var}(E) \, (I - A)^{-T} \tag{4} \]
Elements of E are uncorrelated by construction so that var(E) is diagonal with diagonal values given by the
residual variances of the individual regression equations used to construct the model. This equation gives the
required breakdown of the variance of each variable in terms of the preceding variables in the sequence. The
component values give the total variance explained, i.e. direct plus indirect. The variance of a variable xi
explained by the direct effect of variable xj can be calculated as $\beta_{ij}^2 \, \mathrm{var}(E_j)$. The indirect effect is then obtained
by subtraction.
The required variance components can be calculated directly. Let B be the matrix whose elements are the
squares of the corresponding elements in $(I - A)^{-1}$; then the components of variance are given directly as
\[ C = B \, \mathrm{var}(E) \tag{5} \]
Row j of this matrix gives the components of variance of variable j, i.e. the amount of variance of variable j
explained by the other variables. The diagonal values are the unexplained variance, and the values in
each row sum to the total variance of the corresponding variable. Dividing each row of this matrix by the row
sum converts the matrix to proportions of variance explained.
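The calculation in equations 1) to 5) can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the three-variable chain, its coefficients and its residual variances are hypothetical values chosen for the example, and the function names are introduced here only for illustration.

```python
def total_effects(A):
    """Return M = (I - A)^-1 for a strictly lower-triangular matrix A.

    Row i of M expresses variable i in terms of its own unexplained part and
    those of the preceding variables, using x_i = e_i + sum_j A[i][j] * x_j.
    """
    n = len(A)
    M = []
    for i in range(n):
        row = [0.0] * n
        row[i] = 1.0  # each variable contains its own unexplained part
        for j in range(i):  # only preceding variables can be predictors
            for k in range(j + 1):
                row[k] += A[i][j] * M[j][k]
        M.append(row)
    return M


def variance_components(A, resid_var):
    """Return C with C[i][j] = M[i][j]**2 * resid_var[j], as in equation 5)."""
    M = total_effects(A)
    n = len(A)
    return [[M[i][j] ** 2 * resid_var[j] for j in range(n)] for i in range(n)]


def proportions(C):
    """Divide each row of C by its row sum."""
    return [[c / sum(row) for c in row] for row in C]


# Hypothetical chain: x1 -> x2 -> x3 plus a direct x1 -> x3 path.
A = [
    [0.0, 0.0, 0.0],
    [0.5, 0.0, 0.0],  # x2 = 0.5*x1 + e2
    [0.2, 0.4, 0.0],  # x3 = 0.2*x1 + 0.4*x2 + e3
]
resid_var = [1.0, 0.75, 0.5]  # var(e1), var(e2), var(e3), all hypothetical

C = variance_components(A, resid_var)
P = proportions(C)

total_x1_on_x3 = C[2][0]                         # direct plus indirect
direct_x1_on_x3 = A[2][0] ** 2 * resid_var[0]    # beta_31^2 * var(E_1)
indirect_x1_on_x3 = total_x1_on_x3 - direct_x1_on_x3
```

Here the last row of C splits the variance of x3 into the parts attributable to x1, to x2 and to its own residual, and the indirect effect of x1 on x3 is obtained by subtraction as described above.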
When hierarchical data are involved the calculations are somewhat more involved but essentially the same. The
path diagram is modified so that latent/unobserved variables corresponding to each lower-level variable are
explicitly included. Variables recorded at the highest level in a hierarchy therefore have just one row and
column in the matrix A; variables at the next level have two rows and columns and so on. In the woodland
dataset there are just two levels, plots within sites, so that each plot-level variable has two rows and
columns, the first for the unobserved site-level variable and the second for the plot-level measurement. All
regression coefficients from the fitted hierarchical component are entered in the site-level row, while only
coefficients corresponding to previous plot-level variables are entered in the plot-level row. In addition, in the
plot-level row a value of 1 is entered as a regression coefficient for the corresponding site-level variable.
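As a hedged illustration of this layout, the sketch below builds the matrix A for the smallest possible hierarchical case: one site-level predictor (storm exposure) and one plot-level response (basal area change), which therefore occupies two rows. All numerical values and variable names are hypothetical and are not taken from the fitted model.

```python
def total_effects(A):
    """Return (I - A)^-1 for a strictly lower-triangular matrix A."""
    n = len(A)
    M = []
    for i in range(n):
        row = [1.0 if k == i else 0.0 for k in range(n)]
        for j in range(i):
            for k in range(j + 1):
                row[k] += A[i][j] * M[j][k]
        M.append(row)
    return M


# Hypothetical values, for illustration only.
beta_storm = 0.5   # site-level coefficient of storm exposure
var_storm = 0.25   # variance of the storm indicator
var_site = 0.2     # residual between-site variance
var_plot = 0.6     # residual within-site (plot-level) variance

# Variable order: storm exposure (site level), latent site-level basal area
# change, plot-level basal area change.
A = [
    [0.0,        0.0, 0.0],
    [beta_storm, 0.0, 0.0],  # site-level row holds the fitted coefficient
    [0.0,        1.0, 0.0],  # plot-level row: 1 on its own site-level variable
]
resid_var = [var_storm, var_site, var_plot]

M = total_effects(A)
components = [[M[i][j] ** 2 * resid_var[j] for j in range(3)] for i in range(3)]

# Components of plot-level variance: storm exposure, unexplained site-level
# variation, unexplained plot-level variation.
plot_row = components[2]
shares = [c / sum(plot_row) for c in plot_row]
```

Under these assumed values the plot-level row splits into 0.0625 explained by storm exposure, 0.2 left between sites and 0.6 left between plots within sites.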
APPENDIX S2: OpenBUGS code
## LIKELIHOODS: Analysis of centred & standardized data
model {
   for (i in 1:rs) {
      # SITE-level loop.
      # Y[i,1] contains the indicator variable for whether the SITE was in the 87 storm
      # track or not. This is coded as 1 (storm) or 0 (no-storm).
      # YY[i,2] contains the ctrd/stdz Survey Day difference. Y[i,2] is data, so
      # YY[i,2] can be both transformed (<-) and given a distribution (~).
      YY[i,2] <- (Y[i,2] - mean(Y[,2])) / sd(Y[,2])
      YY[i,2] ~ dnorm(mu1, tau1)

      # 1. Change in richness | Survey_day_diff + Oct storm
      muScS[i] <- alpha1_0 + (beta1 * YY[i,2]) + (beta2 * Y[i,1])
      # 2. Change in basal area | Oct storm
      muBaS[i] <- alpha2_0 + (beta3 * Y[i,1])

      ####################################################################
      # PLOT-level loop. Plot-level variables are predicted by their SITE-level
      # mean prediction plus any other plot-level covariates. Here we need...
      # 1. Change in soil pH | (change in basal area) + pH starting point, i.e.
      #    mean of pH in 71 + 02, as this is a correlate of measurement error
      # 2. Change in species richness | (Change in basal area + change in soil pH)
      for (j in Y[i,3]:Y[i,4]) {
         XX[j,2] <- (X[j,2] - mean(X[,2])) / sd(X[,2])    # Change in richness
         #XX[j,3] <- (X[j,3] - mean(X[,3])) / sd(X[,3])   # Change in mSLA
         XX[j,5] <- (X[j,5] - mean(X[,5])) / sd(X[,5])    # Change in basal area
         XX[j,4] <- (X[j,4] - mean(X[,4])) / sd(X[,4])    # Change in soil pH
         XX[j,6] <- (X[j,6] - mean(X[,6])) / sd(X[,6])    # Mean pH (71+02)/2

         # @@@ Variance components on Basal area and mean pH 71+02
         muBaP[j] <- alpha10[X[j,1]] + muBaS[X[j,1]]
         mupH7102[j] <- alpha11[X[j,1]]
         XX[j,5] ~ dnorm(muBaP[j], tauBa)
         # Precision name matches the prior tauPhSt (BUGS is case sensitive)
         XX[j,6] ~ dnorm(mupH7102[j], tauPhSt)

         # Change in soil pH
         muPcP[j] <- alpha6_0[X[j,1]] + (beta5 * XX[j,5]) + (beta7 * XX[j,6])
         XX[j,4] ~ dnorm(muPcP[j], tauPc)

         # Change in species richness
         muScP[j] <- alpha9[X[j,1]] + muScS[X[j,1]] + (beta4 * XX[j,5])
                     + (beta6 * XX[j,4])
         #XX[j,3] ~ dnorm(muScP[j], tauSc)
         XX[j,2] ~ dnorm(muScP[j], tauSc)
      }
   }
   ## PATH COEFFICIENTS
   ## Storm effect on richness change via basal area change
   B3timesB4 <- beta3 * beta4
   ## Storm effect on richness change via pH change via Basal area change
   B3B5B6 <- beta3 * beta5 * beta6
   ## Effect of pH starting point on species richness change via pH change
   B7timesB6 <- beta7 * beta6
   ## Bayes p-values on beta(): 1-step(x) is 1 when x < 0, so each posterior
   ## mean is the proportion of draws below zero
   p.B7B6 <- 1 - step(B7timesB6)
   p.B3B5B6 <- 1 - step(B3B5B6)
   p.B3B4 <- 1 - step(B3timesB4)
   ## Ind_storm, Tot_BA, Tot_Storm and Starting_ph are not defined in this
   ## listing, so the four nodes below are commented out
   #p.Ind_storm <- 1 - step(Ind_storm)
   #p.Tot_BA <- 1 - step(Tot_BA)
   #p.Tot_Storm <- 1 - step(Tot_Storm)
   #p.St_ph <- 1 - step(Starting_ph)
   p.beta1 <- 1 - step(beta1)
   p.beta2 <- 1 - step(beta2)
   p.beta3 <- 1 - step(beta3)
   p.beta5 <- 1 - step(beta5)
   p.beta7 <- 1 - step(beta7)
   p.beta4 <- 1 - step(beta4)
   p.beta6 <- 1 - step(beta6)
## PRIORS
mu1 ~ dnorm(0, 1.0E-6)
alpha2_0 ~ dnorm(0, 1.0E-6)
alpha1_0 ~ dnorm(0, 1.0E-6)
beta3 ~ dnorm(0, 1.0E-2)
beta5 ~ dnorm(0, 1.0E-6)
beta7 ~ dnorm(0, 1.0E-6)
beta1 ~ dnorm(0, 1.0E-6)
beta2 ~ dnorm(0, 1)
beta4 ~ dnorm(0, 1.0E-6)
beta6 ~ dnorm(0, 1.0E-6)
tauBa ~ dgamma(0.001, 0.001)
sigmaBa <-1/sqrt(tauBa)
tau1 ~ dgamma(0.001, 0.001)
sigma1 <-1/sqrt(tau1)
tauPc ~ dgamma(0.001, 0.001)
sigmaPc <-1/sqrt(tauPc)
tauSc ~ dgamma(0.001, 0.001)
sigmaSc <-1/sqrt(tauSc)
tauPhSt ~ dgamma(0.001, 0.001)
sigmaPhSt <-1/sqrt(tauPhSt)
## Uniform priors on the sdev of the random intercepts
# following the recommendation in Gelman (2006)
for (i in 1:26) {
alpha6_0[i] ~ dnorm(mu.int6_0, tau.int6_0)
alpha9[i] ~ dnorm(mu.int9, tau.int9)
alpha10[i] ~ dnorm(mu.int10, tau.int10)
alpha11[i] ~ dnorm(mu.int11, tau.int11)
}
mu.int6_0 ~ dnorm(0, 1.0E-6)
sigma.int6_0 ~ dunif(0,100)
tau.int6_0 <-1/(sigma.int6_0*sigma.int6_0)
mu.int9 ~ dnorm(0, 1.0E-6)
sigma.int9 ~ dunif(0,100)
tau.int9 <-1/(sigma.int9*sigma.int9)
mu.int10 ~ dnorm(0, 1.0E-6)
sigma.int10 ~ dunif(0,100)
tau.int10 <-1/(sigma.int10*sigma.int10)
mu.int11 ~ dnorm(0, 1.0E-6)
sigma.int11 ~ dunif(0,100)
tau.int11 <-1/(sigma.int11*sigma.int11)
}
# Data for WinBUGS generated by BAUW,
#... a free program by Zhang, Z. and Wang, L. (2006)
list(
rs= 26,
Y=structure(.Data=c(
DATA AVAILABLE ON REQUEST
), .Dim = c(26,4)),
X=structure(.Data=c(
DATA AVAILABLE ON REQUEST
), .Dim = c(293,6)))
APPENDIX S3: Notes on construction of the path analysis in OpenBUGS
The essential building blocks of the path model are regression equations that, when coupled, quantify the partial
explanatory power of one factor on another factor via correlation with an intermediate factor. In the storm
example consider species richness change as the response variable (y0) to be explained by two variables, change
in woody basal area (y1) and exposure to the October ‘87 storm (y2). Both are hypothesized to have direct and
independent effects leading to the model,
\[ y_{0j} = \alpha_{1i(j)} + \beta_4 y_{1j} + \beta_2 y_{2j} + \varepsilon_{1j} \tag{1} \]
where the β’s are standardized regression coefficients and α1i(j) is a random intercept indexed on woodland site
i. However, change in woody basal area (y1) is also expected to be partially explained by a significant effect of
storm exposure leading to another model,
\[ y_{1j} = \alpha_{2i(j)} + \beta_3 y_{2j} + \varepsilon_{2j} \tag{2} \]
So the dichotomous factor $y_{2j}$, indicating exposure to the storm, appears in two models: in 1) it is fitted
alongside change in woody basal area, with both being independent predictors of change in species richness, and
in 2) it is treated as an explanatory variable for change in woody basal area. The proportion of the effect of
woody basal area change on species richness change that is indirectly attributable to the effect of the storm is
given by substituting 2) into 1),
\[ y_{0j} = \alpha_{1i(j)} + \beta_4 \left( \alpha_{2i(j)} + \beta_3 y_{2j} + \varepsilon_{2j} \right) + \beta_2 y_{2j} + \varepsilon_{1j} \tag{3} \]
Centring and standardizing the covariates to mean 0 and sd 1 means that the intercept $\alpha_2$ can be ignored,
because it is centred on zero, and all β now also range between 1 and -1 and are centred on 0. Residual errors
are also assumed to be randomly distributed and centred on zero. The first term after the intercept in 3)
therefore reduces to $\beta_4 \beta_3 y_{2j}$. This is the estimate of the indirect effect of storm exposure on species richness
change via woody basal area change. Many kinds of data cannot be centred and standardized in this way, so that
generating estimates of indirect and total effects in a path analysis requires alternative methods (see Clough
2012, Grace et al. 2012 and Grace 2006, pages 57-73).
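Expanding the bracket in equation 3) and collecting the terms in the storm indicator makes the direct and indirect contributions explicit. The worked step below uses only the symbols already defined above:

```latex
\begin{align*}
y_{0j} &= \alpha_{1i(j)} + \beta_4\left(\alpha_{2i(j)} + \beta_3 y_{2j} + \varepsilon_{2j}\right)
          + \beta_2 y_{2j} + \varepsilon_{1j} \\
       &= \alpha_{1i(j)} + \beta_4 \alpha_{2i(j)}
          + \left(\beta_2 + \beta_4 \beta_3\right) y_{2j}
          + \beta_4 \varepsilon_{2j} + \varepsilon_{1j}
\end{align*}
```

The coefficient on the storm indicator is therefore the sum of the direct effect, $\beta_2$, and the indirect effect via woody basal area change, $\beta_4 \beta_3$.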
Including random effects
Sample plots nested within individual woodland sites are likely to be more similar in their behavior in space and
over time than plots from different sites. Random intercepts drawn from a zero mean normal distribution are
specified to model this additional blocking structure in the data. The hierarchical structure can be modeled as,
\[ y_j = \mu + \alpha_i + \varepsilon_j \]
with
\[ \varepsilon_j \sim N(0, \sigma_j^2) \quad \text{and} \quad \alpha_i \sim N(0, \sigma_i^2) \tag{4} \]
where μ is a grand mean, αi is a random intercept for each woodland and εj the random variation among plots
having accounted for the random variation between woodlands. After introducing fixed effects, the estimates of
the two variances in equation 4 quantify the variation between plots and sites that is left having fitted either site
or plot level explanatory variables (Singer 1998; Gelman & Hill 2007). These quantities are essential for
partitioning the variation at plot or site level that is explained by the fixed effects included in the path model.
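To make the two variances in equation 4 concrete, the sketch below estimates them for a small balanced data set using the classical one-way ANOVA (method-of-moments) estimator. This is a simple stand-in chosen for illustration, not the Bayesian estimation used in the path analysis, and the data and function name are hypothetical.

```python
def anova_variance_components(groups):
    """Estimate (between-site, within-site) variances for balanced groups.

    groups: list of equal-length lists, one list of plot values per site.
    Uses the one-way ANOVA mean squares; a negative between-site estimate
    is truncated at zero.
    """
    k = len(groups)      # number of sites
    n = len(groups[0])   # plots per site
    if any(len(g) != n for g in groups):
        raise ValueError("this sketch assumes the same number of plots per site")
    grand_mean = sum(sum(g) for g in groups) / (k * n)
    site_means = [sum(g) / n for g in groups]
    ms_within = sum(
        (y - m) ** 2 for g, m in zip(groups, site_means) for y in g
    ) / (k * (n - 1))
    ms_between = n * sum((m - grand_mean) ** 2 for m in site_means) / (k - 1)
    sigma2_plot = ms_within
    sigma2_site = max(0.0, (ms_between - ms_within) / n)
    return sigma2_site, sigma2_plot


# Hypothetical data: three sites with two plots each.
sigma2_site, sigma2_plot = anova_variance_components([[1, 3], [5, 7], [9, 11]])
```

With these made-up values most of the variation lies between sites, which is the situation the random intercepts are designed to absorb.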
The residual variances are calculated for each regression model in the path analysis. If some predictors are
included then the model can be written out so that it is clear that some are at site level, and cannot explain any
of the within-site variation, and others are measured in every plot in every site. So within each site we have a
regression involving plot-level predictors with an intercept per site αi as in 4) above,
\[ y_j \sim N(\alpha_i + \beta_j x_j, \sigma_j^2) \tag{5} \]
then site-level predictors are added in and expand the random intercept term,
\[ \alpha_i \sim N(\beta_i x_i, \sigma_i^2) \tag{6} \]
This formulation is based on Gelman & Hill (2007), pages 262-265, where they give five ways of writing out
the same kind of multi-level model. This is well worth a read as it helps to clarify how a multilevel model
represents the structured random variation in the sample alongside predictors specified at different hierarchical
levels.
Writing the path model in OpenBUGS
The fully documented code to implement the storm impacts example is available in Supplementary Material.
Here we describe the essential building blocks of the path model as written in OpenBUGS.
Regression models
In the storm example two loops are specified within which likelihoods and residual variances are modelled. The
first loop, subscripted i, models site-level effects (line 2). The second, subscripted j, models plot-within-site
effects (line 5). Note that the two columns Y[i,3] and Y[i,4] contain the row numbers, indexed by j,
that mark the start and end of each set of plot data within each site. Line 3 specifies the regression model for
the effects of difference in date of survey (YY[i,2]) and exposure to the storm (Y[i,1]) on species richness
change. Since the two explanatory variables are measured at the site level and do not vary from plot to plot
within any one site, they cannot explain any of the variation in species richness change between plots within
each site.
The hierarchical structure of the dataset can be ignored when fitting site-level fixed effects, thus only one
intercept is fitted; alpha1_0. Similarly, line 4 specifies the model for the effects of storm exposure on mean
change in woody basal area at the site level. Again, storm exposure cannot explain any of the within-site,
between-plot variation.
1 model {
2    for (i in 1:rs) {
3       muScS[i] <- alpha1_0 + (alpha1_1 * YY[i,2]) + (alpha1_2 * Y[i,1])
4       muBaS[i] <- alpha2_0 + (alpha2_1 * Y[i,1])
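The start and end row numbers held in Y[i,3] and Y[i,4] can be derived from the site identifier of each plot, provided the plot data are sorted so that each site's plots are contiguous. The following Python sketch shows one way to build them; the function name and the example site identifiers are hypothetical, and the row numbers are 1-based to match BUGS indexing.

```python
def plot_row_ranges(site_of_plot):
    """Return {site: (start_row, end_row)} with 1-based inclusive row numbers.

    site_of_plot: the site identifier of each plot, in row order.
    Raises ValueError if the plots of a site are not stored contiguously.
    """
    ranges = {}
    for row, site in enumerate(site_of_plot, start=1):
        if site in ranges:
            start, end = ranges[site]
            if row != end + 1:
                raise ValueError("plots of site %r are not contiguous" % (site,))
            ranges[site] = (start, row)
        else:
            ranges[site] = (row, row)
    return ranges


# Hypothetical example: six plots in three sites.
ranges = plot_row_ranges([1, 1, 1, 2, 2, 3])
```

Each pair in `ranges` plays the role of one row's Y[i,3] and Y[i,4], so the inner loop visits exactly the plots belonging to site i.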
Where measurements were made in each plot within each site, regression relationships between these variables
are modelled inside loop j. Two regression models are shown. Line 6 specifies the parameters to be estimated
for the plot-level effects of change in woody basal area XX[j,5] and change in soil pH XX[j,4] on change in
plot-level species richness XX[j,2]. The expectation for this regression model is muScP[j], which enters the
likelihood in line 7 with variance 1/tauSc. This residual variance quantifies the amount of variation in change in
plot-level species richness not explained by the regression model in line 6.
To complete the model of variation in species richness change at the plot level but across all plots and sites, two
site-level expectations need to be included. These are the site-level predictions from line 3, muScS[X[j,1]]
and a random intercept alpha9[X[j,1]]. Note the use of the double bracketing here to indicate estimation
of each random intercept is across the set of plots j in site i. Line 8 specifies another plot-level regression
model this time for woody basal area change in each plot within each site. Figure 3 shows that no plot-level
explanatory variables point to woody basal area. Only storm exposure is hypothesised to explain change in
woody basal area. Since this is a site-level effect, the relationship has already been modelled in the site loop i
(line 4). Line 8 completes the variance decomposition by fitting residual random intercepts
alpha10[X[j,1]] given the modelled effect of storm exposure at site level muBaS[X[j,1]].
5    for (j in Y[i,3]:Y[i,4]) {
6       muScP[j] <- alpha9[X[j,1]] + muScS[X[j,1]] +
                    (alpha8_0 * XX[j,5]) + (alpha8_1 * XX[j,4])
7       XX[j,2] ~ dnorm(muScP[j], tauSc)
8       muBaP[j] <- alpha10[X[j,1]] + muBaS[X[j,1]]
9       XX[j,5] ~ dnorm(muBaP[j], tauBa)
Path coefficients
Indirect and total effects are readily specified as stochastic quantities whose distributional properties are
estimated by OpenBUGS. For example the indirect effect of storm exposure on change in species richness via
the intermediate effect of storm exposure on change in woody basal area can be written simply in OpenBUGS
code as
Rich_Ba_Storm <- alpha2_1 * alpha8_0
This line achieves the nesting of model 2) into 1). The posterior distributions of these variables can be
summarised by specifying them as nodes to monitor, while Bayes p values can also be readily specified; see the
OpenBUGS code.
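Outside OpenBUGS, the same quantities can be recovered from saved posterior draws. The Python sketch below multiplies paired draws of the two coefficients to get draws of the indirect effect, then takes the proportion of draws below zero, which is what the posterior mean of a node of the form 1 - step(x) estimates. The draws shown are made-up numbers for illustration, not output from the fitted model, and the function name is hypothetical.

```python
def indirect_effect_summary(draws_a, draws_b):
    """Summarise the product of two coefficients from paired posterior draws.

    Returns (posterior mean of a*b, proportion of draws with a*b < 0).
    The draws must come from the same MCMC iterations, in the same order.
    """
    if len(draws_a) != len(draws_b):
        raise ValueError("draws must be paired iteration by iteration")
    products = [a * b for a, b in zip(draws_a, draws_b)]
    mean = sum(products) / len(products)
    # 1 - step(x) equals 1 when x < 0 and 0 otherwise
    p_below_zero = sum(1 for x in products if x < 0) / len(products)
    return mean, p_below_zero


# Hypothetical draws of alpha2_1 (storm -> basal area change) and
# alpha8_0 (basal area change -> richness change).
alpha2_1_draws = [0.2, 0.1, -0.1, 0.3]
alpha8_0_draws = [-0.5, -0.4, -0.6, 0.2]
mean_indirect, p_negative = indirect_effect_summary(alpha2_1_draws, alpha8_0_draws)
```

Multiplying draw by draw, rather than multiplying the two posterior means, keeps any posterior correlation between the coefficients in the summary of the indirect effect.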
References
Clough, Y. (2012) A generalized approach to modelling and estimating indirect effects in ecology. Ecology 93,
1809-1815.
Gelman, A., Hill, J. (2007) Data Analysis Using Regression and Multilevel/Hierarchical Models. CUP.
Grace, J.B. (2006) Structural Equation Modelling and Natural Systems. CUP.
Grace, J.B., Schoolmaster, D.R., Guntenspergen, G.R., Little, A.M., Mitchell, B.R., Miller, K.R., Schweiger,
E.W. (2012) Guidelines for a graph-theoretic implementation of structural equation modelling. Ecosphere 3,
Article 73.
Singer, J.D. (1998) Using SAS PROC MIXED to fit multi-level models, hierarchical models and individual
growth models. Journal of Educational and Behavioral Statistics 24, 323-355.