jane12409-sup-0001-SuppInfo

Appendix S1: Bayesian geostatistical posterior prediction of the depth
distribution of aggregations of small pelagic fish
Introduction
The objective of the analysis presented here was to simulate the depth distribution of
aggregations of Peruvian anchoveta (Engraulis ringens) for the purposes of investigating
foraging site selection by two seabird species, the Peruvian Booby (Sula variegata) and Guanay
Cormorant (Phalacrocorax bougainvillii). Bayesian predictions of the relative abundance of
anchoveta across two dimensions were previously developed by Boyd et al. (2015) from the
same survey data. The account presented here parallels that of Boyd et al.
(2015).
Analysis of seabird foraging site selection was based on a regular hexagonal grid, encompassing
tracking data for Peruvian Boobies and Guanay Cormorants breeding at Grupo Pescadores in
December 2008 (Fig. 1a in the main text).
Data on the depth distribution of anchoveta were derived from systematic acoustic survey
transects. Bayesian posterior prediction was used to predict the upper depth limits of
aggregations at a set of prediction locations in each hexagonal grid cell. Posterior predictions of
the mean and the variance of the upper depth limits of aggregations were then computed for
each grid cell, and used to estimate the probability that prey occurred in the upper water
column.
Materials and methods
DATA
The Instituto del Mar del Perú (IMARPE) conducted an acoustic survey on the RV “Olaya” off
Grupo Pescadores (~11.77°S, 77.27°W), Peru, during 2-5 December 2008 (Fig. S.1).
The survey design was systematic, based on parallel, equally-spaced, onshore-offshore transects
approximately 10 km apart (Fig. 1b). For the purposes of this analysis, the study region was
restricted to the on-shelf area covered by the survey.
Acoustic backscattering data were collected using a calibrated Simrad scientific echosounder
(EK60) operating at 120 kHz and processed by IMARPE using Echoview acoustic
postprocessing software (Myriax Software, Hobart, Tasmania, Australia). Acoustic backscatter
was identified to species based on known backscattering characteristics, ground-truthed using
biological samples taken from mid-water trawls during the survey (Castillo et al. 2009;
Simmonds et al. 2009). Only backscatter attributed to anchoveta was used in this analysis.
Relatively homogeneous regions of acoustic backscatter were identified using the school
detection algorithm in Echoview. For each identified aggregation, the height and mean depth
were estimated by the school detection algorithm and used to calculate the depth of the upper
limit of the aggregation. All depths refer to depths below the echosounder (i.e. depths are
measured from 3.4 m below the sea surface). No adjustment was made for possible vessel
avoidance by anchoveta; anchoveta schools in central southern Chile exhibited only limited
vertical diving behaviour, with a range of around 5 m (Gerlotto et al. 2004).
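The calculation of the upper depth limit described above can be sketched as follows. This is a minimal illustration, not the Echoview procedure: it assumes that the mean depth reported for an aggregation is its vertical midpoint, so that the upper limit lies half the height above it. The function names are hypothetical; the 3.4 m offset is the echosounder depth given in the text.

```python
# Depth of the echosounder below the sea surface (m), as stated in the text.
ECHOSOUNDER_DEPTH_M = 3.4


def upper_depth_limit(mean_depth_m, height_m):
    """Upper limit of an aggregation, in metres below the echosounder.

    Assumption: the mean depth is the vertical midpoint of the aggregation,
    so the upper limit is the mean depth minus half the height.
    """
    if height_m < 0:
        raise ValueError("height must be non-negative")
    return mean_depth_m - height_m / 2.0


def depth_below_surface(depth_below_echosounder_m):
    """Convert a depth below the echosounder to a depth below the sea surface."""
    return depth_below_echosounder_m + ECHOSOUNDER_DEPTH_M


# Illustrative values only (not survey data).
upper = upper_depth_limit(mean_depth_m=12.0, height_m=6.0)
upper_from_surface = depth_below_surface(upper)
```

All depths in this appendix are on the first scale (below the echosounder); the second function is included only to make the offset explicit.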
The geographic locations of anchoveta aggregations were transformed into the Universal
Transverse Mercator coordinate system using the rgdal package in R (Keitt et al. 2011; R
Development Core Team 2012). Several aggregations occurred at the same geographic location,
but at different depths (20 sets of duplicate coordinates from a total of 1,562 identified
aggregations). Multiple observations at the same geographic location are not supported in
geostatistical analysis. This issue was resolved by jittering the duplicate coordinates using the
geoR package in R (Ribeiro & Diggle 2001).
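The idea behind jittering duplicate coordinates can be illustrated with a short sketch. This is not the geoR implementation, and the jitter scheme (a uniform offset applied to every repeat of a location, leaving the first occurrence unchanged) is an assumption made for illustration; the function name and the `max_jitter` parameter are hypothetical.

```python
import random


def jitter_duplicates(coords, max_jitter, seed=0):
    """Return a copy of coords in which no two locations coincide.

    The first occurrence of each location is kept unchanged. Each later
    occurrence is displaced by a uniform random offset of at most
    max_jitter in x and in y, redrawn until the location is unique.
    """
    rng = random.Random(seed)
    used = set()
    result = []
    for x, y in coords:
        point = (x, y)
        while point in used:
            point = (
                x + rng.uniform(-max_jitter, max_jitter),
                y + rng.uniform(-max_jitter, max_jitter),
            )
        used.add(point)
        result.append(point)
    return result


# Illustrative coordinates in metres (not survey data).
coords = [(0.0, 0.0), (1000.0, 500.0), (0.0, 0.0)]
jittered = jitter_duplicates(coords, max_jitter=10.0)
```

The maximum displacement would normally be chosen to be small relative to the spacing of the observations, so that the spatial structure of the data is essentially unchanged.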
BAYESIAN GEOSTATISTICAL POSTERIOR PREDICTION
The upper depth limits of anchoveta aggregations were assumed to be a realization of a
continuous non-zero random variable, and were analyzed using a linear geostatistical model
following Box-Cox transformation (Box & Cox 1964). Preliminary analysis indicated that a
model based on log transformation of the upper depth limits, a constant mean, and an
exponential correlation function was well-supported by the data (Boyd 2012):
$$Y_i \sim \beta + S_i + \varepsilon_i \qquad \text{(eqn S1.1)}$$
where $Y_i$ represents the log-transformed upper depth limits of anchoveta aggregations observed
at a set of sampling locations $i = 1, \ldots, N$; $\beta$ is the mean parameter; $S_i$ is the spatial signal process;
and the $\varepsilon_i$ are independent and identically distributed with zero mean and non-spatial variance $\tau^2$.
The spatial signal process is characterized by the theoretical variogram. In the stationary and
isotropic case, this simplifies to:
$$V(h) = \sigma^2 \left[1 - \rho(h)\right] \qquad \text{(eqn S1.2)}$$
where $\sigma^2$ is the variance of the spatial signal process (the ‘partial sill’); and $\rho(h)$ is the
exponential correlation function, $\rho(h) = \exp(-h/\varphi)$, where $h$ is the absolute Euclidean distance
between two locations and $\varphi > 0$ is a scaling parameter. The non-spatial variance may be
reparameterized in relative terms as $v^2 = \tau^2/\sigma^2$.
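The variogram of eqn S1.2 and the relative reparameterization of the non-spatial variance can be written out directly. The sketch below is illustrative only: the function names are hypothetical and the parameter values are arbitrary, not estimates from the survey data.

```python
import math


def exp_correlation(h, phi):
    """Exponential correlation function: rho(h) = exp(-h / phi), phi > 0."""
    if phi <= 0:
        raise ValueError("phi must be positive")
    return math.exp(-h / phi)


def variogram(h, sigma2, phi):
    """Theoretical variogram of eqn S1.2: V(h) = sigma2 * (1 - rho(h))."""
    return sigma2 * (1.0 - exp_correlation(h, phi))


def absolute_nugget(sigma2, v2):
    """Recover tau2 from the relative parameterization v2 = tau2 / sigma2."""
    return v2 * sigma2


# Arbitrary illustrative values: partial sill 2.0, scaling parameter 5.0.
sigma2, phi = 2.0, 5.0
v_near = variogram(1.0, sigma2, phi)    # small at short distances
v_far = variogram(100.0, sigma2, phi)   # approaches sigma2 at long distances
corr_at_3phi = exp_correlation(3.0 * phi, phi)  # exp(-3), roughly 0.05
```

At a distance of three times the scaling parameter the correlation has fallen to exp(-3), which is why that distance is commonly called the practical range of the exponential model.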
Bayesian methods were used to sample 100 unique parameter sets from the posterior densities
of the parameters $\beta$, $\sigma^2$, $\varphi$, and $v^2$, and to generate posterior predictions of the upper depth limits
of aggregations for a set of over 45,000 prediction points (approximately 25 prediction points
per hexagonal grid cell). Bayesian inference was conducted by direct simulation, replicated
independently, using the krige.bayes function in the geoR package in R. The following vague
priors were chosen from the options available in geoR: a flat prior for $\beta$ (i.e. $p(\beta) \propto 1$), a
reciprocal prior for $\sigma^2$ (i.e. $p(\sigma^2) \propto 1/\sigma^2$), a uniform discrete prior $\{0, 0.1, 0.2, \ldots, 20\}$ for $\varphi$, and a
uniform discrete prior $\{0, 0.01, 0.02, \ldots, 2\}$ for $v^2$.
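The discrete supports of the two uniform priors, and the unnormalized reciprocal prior, can be enumerated as below. This is only a sketch of the prior specification described above, written in plain Python rather than through the geoR interface; all names are hypothetical.

```python
def discrete_support(stop, step):
    """Equally spaced values 0, step, 2*step, ..., stop (rounded for display)."""
    n_steps = round(stop / step)
    return [round(i * step, 10) for i in range(n_steps + 1)]


def uniform_discrete_prior(support):
    """Assign equal prior probability to every value in the support."""
    weight = 1.0 / len(support)
    return {value: weight for value in support}


def reciprocal_prior_unnormalised(sigma2):
    """Unnormalised reciprocal prior density, proportional to 1 / sigma2."""
    if sigma2 <= 0:
        raise ValueError("sigma2 must be positive")
    return 1.0 / sigma2


# Supports as described in the text.
phi_support = discrete_support(stop=20.0, step=0.1)   # {0, 0.1, ..., 20}
v2_support = discrete_support(stop=2.0, step=0.01)    # {0, 0.01, ..., 2}

phi_prior = uniform_discrete_prior(phi_support)
v2_prior = uniform_discrete_prior(v2_support)
```

Each support contains 201 values, so the joint prior for the two discretized parameters places equal mass on 201 × 201 combinations.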
The geoR package uses a global neighborhood for prediction, but this is computationally
demanding for a set of over 45,000 prediction points, so posterior predictions of the upper
depth limits of aggregations were generated using the predict.gstat function in the gstat
package in R (Pebesma 2004), with the prediction neighborhood set equal to $3\varphi$ (i.e. the practical
range, the distance at which correlation is approximately 0.05) and the maximum number of observations set to
475 (i.e. the approximate number of prediction locations in 19 hexagonal grid cells).
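The effect of a local prediction neighborhood can be illustrated with a small sketch. It is not the gstat implementation: it simply assumes that the neighborhood consists of the observations lying within three times the scaling parameter of the prediction location, truncated to the nearest 475 if there are more. The function name and arguments are hypothetical.

```python
import math


def local_neighbourhood(target, observations, phi, max_obs=475):
    """Select the observations used to predict at one location.

    target:       (x, y) of the prediction location
    observations: iterable of (x, y, value) tuples
    phi:          scaling parameter; the search radius is 3 * phi
    max_obs:      maximum number of (nearest) observations to keep
    Returns a list of (distance, x, y, value), nearest first.
    """
    radius = 3.0 * phi
    within = []
    for x, y, value in observations:
        distance = math.hypot(x - target[0], y - target[1])
        if distance <= radius:
            within.append((distance, x, y, value))
    within.sort(key=lambda row: row[0])
    return within[:max_obs]


# Illustrative observations (not survey data); distances in km.
obs = [(0.0, 0.0, 1.2), (1.0, 0.0, 1.5), (10.0, 0.0, 2.0)]
selected = local_neighbourhood((0.0, 0.0), obs, phi=1.0)
```

Observations beyond the practical range contribute little to the prediction under the exponential model, which is the motivation for discarding them to reduce computation.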
Results
Samples from the posterior densities of the parameters $\beta$, $\sigma^2$, $\varphi$, and $v^2$, are shown in Figure S.2,
together with the respective prior distributions.
The Bayesian posterior predictions reproduce the sample statistics fairly well (Fig. S.3). The
differences between the distribution of the observed data and the distribution of the posterior
predictions reflect deviations from normality in the observed data following log-transformation
(see Boyd 2012).
The Bayesian geostatistical approach appears to underestimate spatial autocorrelation at short
distances when compared to the empirical variogram for the observed data (Fig. S.4). The mean
theoretical variogram computed from samples from the posterior distribution of the spatial
parameters, and hence the posterior predictions, appear to overestimate the variance in the
depth of aggregations between locations that are less than 24 km distant. We tested alternative
correlation functions with similar results. The likelihood-based geostatistical methods applied
here use all the data to estimate the spatial parameters, rather than focusing on pairs of points
that are relatively close together as is common practice in classical geostatistics. While this
appears to lead to a poor fit to the empirical variogram in this case, it is important to recognize
that the empirical variogram is only a summary of the data.
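For readers unfamiliar with the empirical variogram, the sketch below shows one common way to compute it: the classical estimator, in which half the mean squared difference between pairs of observations is calculated within distance bins. The choice of estimator and binning is an assumption made for illustration and is not necessarily the one used to produce Figure S.4; the function name is hypothetical.

```python
import math


def empirical_variogram(points, bin_width, max_dist):
    """Classical empirical variogram from (x, y, value) tuples.

    Pairs separated by less than max_dist are grouped into distance bins
    of width bin_width. For each non-empty bin the result contains
    (bin midpoint, semivariance, number of pairs), where the semivariance
    is half the mean squared difference between paired values.
    """
    n_bins = int(math.ceil(max_dist / bin_width))
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for i in range(len(points)):
        xi, yi, zi = points[i]
        for j in range(i + 1, len(points)):
            xj, yj, zj = points[j]
            h = math.hypot(xi - xj, yi - yj)
            if h >= max_dist:
                continue
            b = int(h // bin_width)
            sums[b] += 0.5 * (zi - zj) ** 2
            counts[b] += 1
    return [
        ((b + 0.5) * bin_width, sums[b] / counts[b], counts[b])
        for b in range(n_bins)
        if counts[b] > 0
    ]


# Illustrative values (not survey data).
pts = [(0.0, 0.0, 1.0), (1.0, 0.0, 3.0), (2.0, 0.0, 5.0)]
vario = empirical_variogram(pts, bin_width=1.5, max_dist=3.0)
```

Because each bin averages over many pairs, the empirical variogram condenses the data into a handful of numbers, whereas the likelihood-based approach uses every observation directly.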
Sample predictions for the probability that the upper depth limit of aggregations is less than
10 m below the echosounder are shown in Figure S.5.
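One way in which such a probability could be obtained from the posterior predictive mean and variance for a grid cell is sketched below. It assumes that the log-transformed upper depth limit in a cell is approximately normally distributed, which is an assumption of this illustration rather than a statement of the exact calculation used; the function name and the numerical values are hypothetical.

```python
import math
from statistics import NormalDist


def prob_shallower_than(threshold_m, mean_log, var_log):
    """Probability that the upper depth limit is less than threshold_m.

    mean_log, var_log: mean and variance of the log-transformed upper
    depth limit for one grid cell (depths in metres below the echosounder).
    Assumes normality on the log scale.
    """
    if threshold_m <= 0:
        raise ValueError("threshold must be positive")
    if var_log <= 0:
        raise ValueError("variance must be positive")
    cell = NormalDist(mu=mean_log, sigma=math.sqrt(var_log))
    return cell.cdf(math.log(threshold_m))


# Illustrative values only: a cell whose median upper depth limit is 8 m.
p = prob_shallower_than(threshold_m=10.0, mean_log=math.log(8.0), var_log=0.25)
```

The threshold is passed as an argument, so the same function applies to any depth of interest.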
The Bayesian posterior predictions for the depth distribution of anchoveta aggregations were
consistent with the observed data in showing that the depth distribution of anchoveta was
relatively shallow throughout the study region in December 2008.
References
Box, G.E.P. & Cox, D.R. (1964) An analysis of transformations. Journal of the Royal Statistical
Society, Series B, 26, 211-252.
Boyd, C. (2012) The Predator’s Dilemma: Investigating the responses of seabirds to changes in
the abundance and distribution of small pelagic prey. Ph.D. dissertation, University of
Washington.
Castillo, R., Peraltilla, S., Aliaga, A., Flores, M., Ballón, M., Calderón, J. & Gutiérrez, M. (2009)
Protocolo técnico para la evaluación acústica de las áreas de distribución y abundancia
de recursos pelágicos en el mar peruano. Informe Instituto del Mar Perú, 36, 7-28.
Gerlotto, F., Castillo, J., Saavedra, A., Barbieri, M.A., Espejo, M. & Cotel, P. (2004) Three-dimensional structure and avoidance behaviour of anchovy and common sardine
schools in central southern Chile. ICES Journal of Marine Science, 61, 1120-1126.
Keitt, T.H., Bivand, R., Pebesma, E. & Rowlingson, B. (2011) rgdal: Bindings for the Geospatial
Data Abstraction.
Pebesma, E.J. (2004) Multivariable geostatistics in S: the gstat package. Computers & Geosciences,
30, 683-691.
R Development Core Team (2012) R: A language and environment for statistical computing. R
Foundation for Statistical Computing, Vienna, Austria.
Ribeiro, P.J. & Diggle, P.J. (2001) geoR: a package for geostatistical analysis. R-NEWS, 1, 15-18.
Simmonds, E.J., Gutiérrez, M., Chipollini, A., Gerlotto, F., Woillez, M. & Bertrand, A. (2009)
Optimizing the design of acoustic surveys of Peruvian anchoveta. ICES Journal of Marine
Science, 66, 1341-1348.
Figure legends
Figure S.1. Proportional representation of acoustic backscatter (m² per nautical mile²) in
December 2008. Elementary distance sampling units (EDSUs) are marked by crosses (zero
values for anchoveta) and circles (positive values, diameter of the circles proportional to the
logarithm of relative anchoveta abundance). The shelf break (200 m isobath) is indicated by the
dashed line.
Figure S.2. Prior distributions (dashed lines) and samples from the posterior densities
(histograms) for the spatial parameters.
Figure S.3. The distribution of the observed log-transformed upper depth limit of aggregations
(histogram); and the mean (solid line) and 2.5th and 97.5th percentiles (dashed lines) of samples
from the corresponding posterior densities.
Figure S.4. Quantitative summaries of the spatial pattern of observed data and posterior
predictions. Box plots represent empirical variograms computed from posterior predictions.
The dashed line represents the empirical variogram for the observed data. The dotted line
represents the mean of the theoretical variograms computed from 100 samples from the
posterior distributions of the spatial parameters.
Figure S.5. Four Bayesian posterior predictions of the probability that the upper depth limit of
aggregations is less than 7.5 m below the echosounder. Land is shown in pale grey.