Journal of Biogeography

SUPPORTING INFORMATION

Geographical variability in the controls of giant kelp biomass dynamics

Tom W. Bell, Kyle C. Cavanaugh, Dan C. Reed and David A. Siegel

APPENDIX S1 Supplementary methods.

Modelled significant wave height methods

Predictions of significant wave height, Hs, for the entire time-series were made at each coastline site using a generalized additive model (GAM) with the log-transformed hourly CDIP-modelled Hs as the response variable with a Gaussian distribution, and the hourly Harvest buoy Hs, period and direction as the predictors. The site-specific model was trained with two-thirds of the available wave data from 1998–2012 and validated using the remaining third. Hourly Hs was hindcast for 1987–2011 by inputting hourly Hs and period from the Harvest platform together with a wave direction. For 1998–2011, the known directional data were used. For the earlier dates (1987–1997), direction was selected at random from a probability function for daily direction generated from the Harvest buoy direction data for 1998–2012. An estimate of hourly Hs was generated from the model 100 times using a new randomly selected direction for each iteration. The mean Hs from these 100 model estimates was used as the Hs for that hour at the specific site.

Sea urchin observations

Sea urchin densities were measured by the Santa Barbara Coastal Long Term Ecological Research Project (SBC LTER), the Partnership for Interdisciplinary Studies of Coastal Oceans (PISCO), the US Geological Survey and the National Park Service. Sea urchin density was only included at sites where urchin surveys were conducted inside a 500-m coastline segment. GAMs were run for these coastline segments only for the period for which sea urchin data were available. All surveys were completed on an annual basis, with seasonal densities estimated from the annual survey closest in time. Density was obtained from either 1-m² quadrats along permanent 50-m transects or 30 m × 2 m swath surveys.
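The nearest-survey assignment of seasonal densities can be illustrated with a minimal Python sketch. It is not the code used in the study: the function name, the use of decimal years for dates and the example densities are all assumptions made for illustration.

```python
import numpy as np

def nearest_survey_density(season_times, survey_times, survey_densities):
    """Give each season the density from the annual survey closest in time.

    All times are decimal years (an assumption of this sketch).
    """
    season_times = np.asarray(season_times, dtype=float)
    survey_times = np.asarray(survey_times, dtype=float)
    survey_densities = np.asarray(survey_densities, dtype=float)
    # Absolute time gap for every season/survey pair: shape (n_seasons, n_surveys)
    gaps = np.abs(season_times[:, None] - survey_times[None, :])
    # Index of the closest survey for each season (ties go to the first-listed survey)
    nearest = gaps.argmin(axis=1)
    return survey_densities[nearest]

# Hypothetical example: three annual surveys and four seasonal time steps
seasons = [2004.875, 2005.125, 2005.375, 2006.125]
surveys = [2004.6, 2005.6, 2006.6]
densities = [3.0, 5.5, 2.0]  # made-up values, individuals per square metre
print(nearest_survey_density(seasons, surveys, densities))
```

Each season inherits a single survey value with no interpolation between surveys, which matches the description of using the annual survey closest in time.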
Annual surveys are a valid proxy for seasonal densities because sea urchins are long-lived (Ebert & Southon, 2003). The SBC LTER collected sea urchin density data for sites on the mainland coastline of southern California; PISCO for sites on the mainland coast of central and southern California, the northern Channel Islands and Santa Barbara Island; the US Geological Survey for sites on San Nicolas Island; and the National Park Service for sites in the northern Channel Islands.

Generalized additive models and prediction methods

The general concept of GAMs is that a response variable can be modelled as the sum of non-linear functions of different predictor variables (Hastie & Tibshirani, 1990). The underlying relationship between each predictor variable and kelp biomass was determined using thin-plate penalized regression splines, which add penalties to wiggly functions to avoid overfitting (Wood & Augustin, 2002). The weight of these penalties was optimized using generalized cross-validation, which minimizes the root mean square error between the fit and data points. Shrinkage smoothers were added to the GAM, allowing a predictor to be penalized over its entire range and practically removed from the model if no significant relationship was found. The basis dimension (k) for each predictor function was kept at 7, and additional wiggliness penalty constants (m) were added to all predictors at a value of 2. The function $g(E(Y))$ is the link function, which is specified along with a distribution for the response variable data; in this case, $g$ is a log link. Each $f_i(X_i)$ represents a non-parametric smoothing function of each additive predictor variable:

$$
g(E(\text{proportional biomass})) = \beta_0 + f_1(H_{s\,\mathrm{max}}) + f_2(\text{mean NO}_3) + f_3(\text{NPGO}) + f_4(\text{kelp occupancy}) + f_5(\text{harvest effort}) + f_6(\text{urchin density})
$$

To investigate potential drivers not included in the model, biomass predictions were generated for coastline segments where there was an extended absence of kelp canopy.
Proportional kelp biomass was predicted without the time periods of extended kelp absence (more than two quarters from the nearest season with measurable biomass). The environmental predictors identified by the EOF analysis were found to affect kelp biomass over short time-scales (lagged by one season), so it was assumed that periods of extended kelp absence were most likely to be due to a driver that was not identified or measured for that period.

The model biomass predictions were validated using an eight-fold cross-validation, in which the model is trained with seven-eighths of the data to predict the remaining eighth, and the procedure is repeated eight times. Once validated, these initial predictions of proportional kelp biomass were lagged by one quarter and used as the kelp occupancy term for a second run of GAM predictions. This final prediction represents a hindcast of kelp biomass at a site if it were only a function of the environmental predictors evaluated, and can be compared to actual kelp biomass measurements. Periods of mismatch from these comparisons were related to site-specific data on grazer abundance.

REFERENCES

Ebert, T.A. & Southon, J.R. (2003) Red sea urchins (Strongylocentrotus franciscanus) can live over 100 years: confirmation with A-bomb ¹⁴carbon. Fishery Bulletin, 101, 915–922.

Hastie, T.J. & Tibshirani, R.J. (1990) Generalized additive models. Chapman & Hall/CRC.

Wood, S. & Augustin, N. (2002) GAMs with integrated model selection using penalized regression splines and applications to environmental modelling. Ecological Modelling, 157, 157–177.