S2. Statistical Appendix.

Several distinct statistical procedures are described here. All statistical analyses were conducted in R (version 2.11.1; [1]). Bayesian analyses were conducted using the BRugs package, which interfaces R to OpenBUGS [2]. All approaches follow methods described by McCarthy [3]. This appendix is presented in three parts: a description of the model selection procedure (A.), a description of the procedure used to generate Figures 4 and 5 (B.), and a description of the process used to generate the means and 95% credible intervals presented in Figure 3 (C.).

A. Model selection

In this model selection procedure, a comma-separated values (.csv) file of the data (which corresponds to a spreadsheet in the Data Appendix) is read into R and attached so that R can 'see' the individual columns as variables. In the example code below, AllAg (all forms of agricultural land cover) is natural-log transformed (as log(AllAg + 1)) prior to analysis. As a result, in this example the deviance information criterion (DIC) and parameter estimates are for a model logarithmically relating agricultural land cover to consumer tissue δ15N (FF15N).
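As a small illustration of the transformation described above, the following base-R sketch applies log(x + 1) to a column of a data frame. The data frame `sites` and its values are hypothetical and are not taken from the Data Appendix; the sketch also refers to the column with `$` rather than attaching the data, which is only one possible way to write it.

```r
## Hypothetical data: percent agricultural land cover for four sites
## (invented values, for illustration only)
sites <- data.frame(AllAg = c(0, 12.5, 48, 83.1))

## log(x + 1) keeps sites with 0% cover finite (log(0) would be -Inf);
## log1p(x) is the equivalent base-R function
sites$LogAllAg <- log(sites$AllAg + 1)

print(sites)
```

Adding 1 before taking the log means a site with no agricultural land cover maps to 0 instead of negative infinity.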
The general format for the regression equation was:

Consumer tissue δ15N = β × WatershedProperty + Intercept

where β is the slope (b in the code below) and Intercept is the y-intercept (a in the code below).

library("BRugs") ## This loads the BRugs package
Year1RiverFF <- read.csv("C:/Users/jhlarson/Desktop/USGS Science/Stable Isotopes 2011/Analysis/R Model Selection/Year1RiverFF.csv") ## Loading the dataset
attach(Year1RiverFF) ## attaching an imported dataset
LogAllAg <- log(AllAg+1)

## Regression Model - This creates a function that BRugs can use in OpenBUGS
regressionmodel <- function(){
  a~dnorm(0,1.0E-6) ## Non-informative prior, y-intercept
  b~dnorm(0,1.0E-6) ## Non-informative prior, slope
  prec~dgamma(0.001,0.001) ## Non-informative prior, model precision
  sy2 <- pow(sd(y[]),2)
  R2B <- 1 - 1/(prec*sy2) ## Bayesian R2
  for (i in 1:11) ## 1:N, where N is the number of observations
  {
    mean[i] <- a+b*x[i]
    y[i] ~dnorm(mean[i],prec)
  }
}
regressionmodelfile <- file.path(tempdir(),"regressionmodel.txt")
model <- writeModel(regressionmodel,regressionmodelfile)
inits <- "C:\\Users\\jhlarson\\Desktop\\USGS Science\\OpenBugs Code\\Regression\\Initials.txt" ## location of the initial-values file: a=0, b=0, prec=100

## Test for AllAg.
x <- LogAllAg ## for convenience variable is converted to x to match model
y <- FF15N ## for convenience variable is converted to y to match model
bdata <-bugsData(c("y","x"),,digits=5) ## this places the data into a form OpenBUGS can read
modelCheck(regressionmodelfile) ## Tells OpenBUGS to check the model
modelData(bdata) ## Tells OpenBUGS to load the data
modelCompile(numChains=1) ## Compiles the model with number of chains
modelInits(inits,) ## Loads the initial values
modelUpdate(50000) ## Updates the model 50000 times as a burn-in
samplesSet(c("b","R2B","a","prec")) ## Tells OpenBUGS to keep data on these variables
dicSet() ## Tells OpenBUGS to keep data on DIC
modelUpdate(50000) ## Updates model 50000 times to collect data
YR1R15NLogAllAg <- samplesStats("*") ## Store the variable estimates in a designated object
YR1R15NLogAllAgDIC <- dicStats() ## Store the DIC estimates in a designated object

Larson et al. 1 of 8

This basic approach was repeated for every model tested. The model above could be used for any 1-parameter model. For 2-parameter models, the procedure was modified slightly by substituting a new model:

## Two-parameter Regression Model
regressionmodel <- function(){
  a~dnorm(0,1.0E-6)
  b1~dnorm(0,1.0E-6)
  b2~dnorm(0,1.0E-6)
  prec~dgamma(0.001,0.001)
  sy2 <- pow(sd(y[]),2)
  R2B <- 1 - 1/(prec*sy2)
  for (i in 1:11)
  {
    mean[i] <- a+b1*x1[i]+b2*x2[i]
    y[i] ~dnorm(mean[i],prec)
  }
}
regressionmodelfile <- file.path(tempdir(),"regressionmodel.txt")
model <- writeModel(regressionmodel,regressionmodelfile)
inits <- "C:\\Users\\jhlarson\\Desktop\\USGS Science\\OpenBugs Code\\Regression\\Initials2variables.txt"
x1 <- Wdep
x2 <- LogAllAg
y <- FF15N
bdata <-bugsData(c("y","x1","x2"),,digits=5)

In this example, initial values were set at a=0, b1=0, b2=0 and prec=100. Other aspects of the procedure were identical, except that parameters a, b1, b2, R2B and prec were monitored.

The above models have non-informative prior distributions, and these models were used on the data from Larson et al.
[4] to generate informative prior distributions for the analysis of the new data. This re-analysis of the earlier data is summarized in Appendix Table 1. In the example below, data from Appendix Table 1 are used to create prior distributions for the model parameters and model precision.

RiverFF <- read.csv("C:/Users/jhlarson/Desktop/USGS Science/Stable Isotopes 2011/Analysis/R Model Selection/RiverFF.csv") ## Load the dataset (same file as read in part B) before attaching it
attach(RiverFF)
LogAllAg <- log(AllAg+1)

## Regression Model
regressionmodel <- function(){
  a~dnorm(5.213,0.5259) ## Informative prior, y-intercept
  b~dnorm(1.497,5.4641) ## Informative prior, slope
  prec~dgamma(13.164,4.5087) ## Informative prior, model precision
  sy2 <- pow(sd(y[]),2)
  R2B <- 1 - 1/(prec*sy2)
  for (i in 1:22)
  {
    mean[i] <- a+b*x[i]
    y[i] ~dnorm(mean[i],prec)
  }
}
regressionmodelfile <- file.path(tempdir(),"regressionmodel.txt")
model <- writeModel(regressionmodel,regressionmodelfile)
inits <- "C:\\Users\\jhlarson\\Desktop\\USGS Science\\OpenBugs Code\\Regression\\Initials.txt"

## Test for AllAg.
x <- LogAllAg
y <- FF15N
bdata <-bugsData(c("y","x"),,digits=5)
modelCheck(regressionmodelfile)
modelData(bdata)
modelCompile(numChains=1)
modelInits(inits,)
modelUpdate(50000)
samplesSet(c("a","b","R2B","prec"))
dicSet() ## Tells OpenBUGS to keep data on DIC
modelUpdate(50000)
R15NLogAllAg <- samplesStats("*")
R15NLogAllAgDIC <- dicStats() ## Store the DIC estimates in a designated object

B. Visualizing the model and 95% credible intervals

A visual representation of the model with 95% credible intervals can be created by generating predictions from the model across the range of the predictor and displaying those predictions in a graphic. Although this is not necessarily the best way to do it, we made these estimates in R using BRugs, then exported the resulting estimates to Excel to build a figure.
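The general idea described above can be sketched in base R without OpenBUGS. This is a minimal illustration under stated assumptions, not the procedure used for the figures: the posterior draws `a_draws` and `b_draws` are simulated stand-ins with invented means and standard deviations, and `x_grid` and `pred_summary` are hypothetical names. It summarizes stored draws after sampling, whereas the analysis in this appendix defines prediction nodes inside the OpenBUGS model.

```r
## Hypothetical stand-ins for posterior draws of the intercept (a) and slope (b).
## In a real analysis these would come from the MCMC output, not from rnorm().
set.seed(1)
a_draws <- rnorm(5000, mean = 5, sd = 0.5)
b_draws <- rnorm(5000, mean = 1.5, sd = 0.3)

## Grid of predictor values (log-transformed land cover) to predict over
x_grid <- seq(0, 5, by = 0.5)

## For each grid value, compute a + b*x for every draw, then summarize:
## posterior mean plus the 2.5% and 97.5% quantiles (a 95% credible interval)
pred_summary <- t(sapply(x_grid, function(x) {
  pred <- a_draws + b_draws * x
  c(x = x,
    mean = mean(pred),
    lower = unname(quantile(pred, 0.025)),
    upper = unname(quantile(pred, 0.975)))
}))
pred_summary <- as.data.frame(pred_summary)

print(pred_summary)
```

A table like `pred_summary` has one row per predictor value and could be written out with write.csv() for plotting in another program.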
In the example below, we calculated predictions for the range of possible values following a procedure suggested by McCarthy [3].

RiverFF <- read.csv("C:/Users/jhlarson/Desktop/USGS Science/Stable Isotopes 2011/Analysis/R Model Selection/RiverFF.csv")
attach(RiverFF)
LogAllAg <- log(AllAg+1)

## Regression Model
regressionmodel <- function(){
  a~dnorm(5.213,0.5259)
  b~dnorm(1.497,5.4641)
  prec~dgamma(13.164,4.5087)
  prediction0.15<-a+b*0.15 ## This generates the prediction for a particular value
  prediction0.25<-a+b*0.25
  prediction0.35<-a+b*0.35
  prediction0.5<-a+b*0.5
  prediction1<-a+b*1
  prediction1.6<-a+b*1.6
  prediction1.8<-a+b*1.8
  prediction2<-a+b*2
  prediction2.2<-a+b*2.2
  prediction2.4<-a+b*2.4
  prediction2.6<-a+b*2.6
  prediction3<-a+b*3
  prediction3.1<-a+b*3.1
  prediction3.2<-a+b*3.2
  prediction3.3<-a+b*3.3
  prediction3.4<-a+b*3.4
  prediction3.5<-a+b*3.5
  prediction3.6<-a+b*3.6
  prediction3.7<-a+b*3.7
  prediction3.8<-a+b*3.8
  prediction3.9<-a+b*3.9
  prediction4<-a+b*4
  prediction4.1<-a+b*4.1
  prediction4.2<-a+b*4.2
  prediction4.3<-a+b*4.3
  prediction4.4<-a+b*4.4
  prediction4.45<-a+b*4.45
  prediction4.5<-a+b*4.5
  prediction4.55<-a+b*4.55
  prediction4.6<-a+b*4.6
  prediction4.65<-a+b*4.65
  prediction5<-a+b*5
  sy2 <- pow(sd(y[]),2)
  for (i in 1:22)
  {
    mean[i] <- a+b*x[i]
    y[i] ~dnorm(mean[i],prec)
  }
}
regressionmodelfile <- file.path(tempdir(),"regressionmodel.txt")
model <- writeModel(regressionmodel,regressionmodelfile)
inits <- "C:\\Users\\jhlarson\\Desktop\\USGS Science\\OpenBugs Code\\Regression\\Initials.txt"

## Test for AllAg.
x <- LogAllAg
y <- FF15N
bdata <-bugsData(c("y","x"),,digits=5)
modelCheck(regressionmodelfile)
modelData(bdata)
modelCompile(numChains=1)
modelInits(inits,)
modelUpdate(50000)
samplesSet(c("a","b",
  "prediction0.15","prediction0.25","prediction0.35","prediction0.5",
  "prediction1","prediction1.6","prediction1.8",
  "prediction2","prediction2.2","prediction2.4","prediction2.6",
  "prediction3","prediction3.1","prediction3.2","prediction3.3","prediction3.4",
  "prediction3.5","prediction3.6","prediction3.7","prediction3.8","prediction3.9",
  "prediction4","prediction4.1","prediction4.2","prediction4.3","prediction4.4",
  "prediction4.45","prediction4.5","prediction4.55","prediction4.6","prediction4.65",
  "prediction5"))
modelUpdate(50000)
LogAgPredictions <- samplesStats("*")

C. Descriptive statistics

Estimating a mean using a Bayesian approach includes estimation of 95% credible intervals [3]. This allows a simple test for differences between means: if the intervals overlap, the means are not considered different. The following code was used to estimate the mean and 95% credible interval for consumer tissue δ15N.

## First the data file is loaded.
RiverFF <- read.csv("C:/Users/jhlarson/Desktop/USGS Science/Stable Isotopes 2011/Analysis/R Model Selection/RiverFF.csv")
attach(RiverFF) ## This allows R to read columns as variables

## This defines the function for OpenBUGS
regressionmodel <- function(){ ## the naming of this function is arbitrary
  for (i in 1:22){
    x[i] ~ dnorm (mu[1], tau[1])
  }
  mu[1] ~ dnorm (0, 0.0001) ## non-informative prior distributions were used
  tau[1] ~ dgamma (0.001, 0.001)
}
regressionmodelfile <- file.path(tempdir(),"regressionmodel.txt")
model <- writeModel(regressionmodel,regressionmodelfile)
inits <- "C:\\Users\\jhlarson\\Desktop\\USGS Science\\OpenBugs Code\\1meansinitials.txt" ## this is the location of the initial-values file
x <- FF15N ## Transforming the variable to the same term used in the function
bdata <-bugsData(c("x"),,digits=5) ## This prepares the data in the format OpenBUGS uses
modelCheck(regressionmodelfile)
modelData(bdata)
modelCompile(numChains=1)
modelInits(inits,)
modelUpdate(50000) ## These are the same as described above.
samplesSet(c("mu","tau"))
modelUpdate(50000)
MeanFF15N<- samplesStats("*")

## The same process is repeated for the RM sites below
RMFF <- read.csv("C:/Users/jhlarson/Desktop/USGS Science/Stable Isotopes 2011/Analysis/RM Model Selection/RMFF.csv")
attach(RMFF)

## Regression Model
regressionmodel <- function(){
  for (i in 1:21){
    x[i] ~ dnorm (mu[1], tau[1])
  }
  mu[1] ~ dnorm (0, 0.0001)
  tau[1] ~ dgamma (0.001, 0.001)
}
regressionmodelfile <- file.path(tempdir(),"regressionmodel.txt")
model <- writeModel(regressionmodel,regressionmodelfile)
inits <- "C:\\Users\\jhlarson\\Desktop\\USGS Science\\OpenBugs Code\\1meansinitials.txt"
x <- FF15N
bdata <-bugsData(c("x"),,digits=5)
modelCheck(regressionmodelfile)
modelData(bdata)
modelCompile(numChains=1)
modelInits(inits,)
modelUpdate(50000)
samplesSet(c("mu","tau"))
modelUpdate(50000)
RMMeanFF15N<- samplesStats("*")

References

1. R Development Core Team (2010) R: A language and environment for statistical computing.
2. Openbugs T, Best N, Lunn D (2007) The BRugs Package.
3. McCarthy M (2007) Bayesian methods for ecology. New York, New York, USA: Cambridge University Press. p.
4. Larson JH, Richardson WB, Vallazza JM, Nelson JC (2012) An exploratory investigation of the landscape-lake interface: Land cover controls over consumer N and C isotopic composition in Lake Michigan rivermouths. Journal of Great Lakes Research 38: 610–619.