SUPPLEMENTARY MATERIAL
Appendix: Spatial Autocorrelation Methods
Spatial autocorrelation is the dependence among observations of the same
variable caused by their locations in space.
If there is a transect of n
observations, spatial autocorrelation means that the observations in the
spatial series are not completely independent and so the n observations do
not provide the same amount of real information as n independent
observations. If the autocorrelation is positive so that neighbouring values
tend to be more similar than if they were independent, we have the
equivalent of n’, not n, independent observations, where n’ is something less
than n. Here n’ is called the “effective sample size”. The problem that spatial
autocorrelation poses for statistical testing is that, because the effective
sample size is smaller than the nominal sample size, using n in our
calculations tends to produce more apparently significant results than the
data actually justify.
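The inflation of significance rates can be illustrated with a small simulation. The following is a minimal Python sketch, not taken from the cited papers: it assumes a first-order autoregressive series with coefficient rho, a true mean of zero, n = 100 observations, and a naive one-sample t-test that uses n. The function names and parameter values are hypothetical, and the critical value 1.984 is the two-sided 5% point of the t distribution with 99 degrees of freedom.

```python
import numpy as np

def ar1_series(n, rho, rng):
    """Generate a stationary AR(1) series x[i] = rho * x[i-1] + e[i] with unit-variance noise."""
    x = np.empty(n)
    # Draw the first value from the stationary distribution so no burn-in is needed.
    x[0] = rng.normal(scale=1.0 / np.sqrt(1.0 - rho**2))
    e = rng.normal(size=n)
    for i in range(1, n):
        x[i] = rho * x[i - 1] + e[i]
    return x

def rejection_rate(n=100, rho=0.5, trials=2000, t_crit=1.984, seed=0):
    """Fraction of naive t-tests of H0: mean = 0 that reject at the nominal 5% level.

    t_crit is tied to n = 100 (99 degrees of freedom); change both together.
    """
    rng = np.random.default_rng(seed)
    rejections = 0
    for _ in range(trials):
        x = ar1_series(n, rho, rng)
        t = x.mean() / (x.std(ddof=1) / np.sqrt(n))  # standard error computed with n, not n'
        rejections += int(abs(t) > t_crit)
    return rejections / trials

print("independent data:   ", rejection_rate(rho=0.0))
print("autocorrelated data:", rejection_rate(rho=0.5))
```

With independent data the rejection rate should sit close to the nominal 5%, while with positive autocorrelation it should be clearly higher, even though the null hypothesis is true in both cases. For a first-order process like this one, a commonly used large-sample approximation is n’ ≈ n(1 − rho)/(1 + rho), so rho = 0.5 leaves roughly a third of the nominal sample size.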
It would seem to be a simple solution merely to estimate n’ from the data
somehow and then use it rather than n in the calculations to correct the rates
of significant results in our statistical tests. Sometimes this can be done, but
it is not easily accomplished, particularly for univariate tests (see Dale &
Fortin 2002, Dale & Fortin 2009). An alternative is to use a model of the data
that has a similar autocorrelation structure, and use Monte Carlo methods to
generate a large number of similarly-structured “data” sets with which the
results from the real data can be compared. This is the “Model & Monte
Carlo” approach suggested by Dale & Fortin (2002). There are a number of
different models of spatial autocorrelation available for use, but one of the
most common is the autoregressive model, which mimics the familiar
regression model but relates the variable of interest to itself rather than to
other variables. For example,
$$x_i = \sum_{j=i-20}^{i-1} \beta_j x_j + \varepsilon_i$$
The approach advocated is to start with a very general autoregressive model
such as the one above and find the one that best fits the observations from
the spatial series. The model is then used to generate a large number of
series of “data”, similar to the series observed.
The many generated
realizations of the best-fit model can be used either to compare the observed
test statistic directly with its distribution across the Monte Carlo
realizations, or to provide an estimate of the effective sample size, which
can then be used in subsequent statistical tests.
For example, in an ANOVA, the
numbers of degrees of freedom associated with each source of variation are
unchanged from standard practice except for the total, which is now based on
the effective sample size n’ rather than on the full number of observations, n.
We have described an approach suitable for data that occur in spatial series,
but we now describe a method for assessing the degree of spatial
autocorrelation when the data occur in a two-dimensional grid or lattice.
This is the “join count” approach which examines the categories to which
first-order neighbours on a lattice belong. For simplicity, consider a small
rectangular lattice, the nodes of which belong to one of two categories: black
or white. The joins between “rook’s move” neighbours are either black-black
or white-white, joining nodes of the same class, or white-black, joining nodes
of different classes. The more “like” joins and the fewer “unlike” joins there
are, the greater is the autocorrelation in the structure.
In this example of 9 nodes in a 3 × 3 lattice, there are 3 white-white joins, 3
black-black joins, and 6 white-black joins.
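Counting joins is straightforward to automate. The following is a minimal Python sketch, assuming the lattice is stored as a 2D array with 1 for black and 0 for white. The function name is hypothetical, and the 3 × 3 arrangement is an invented one chosen only because it reproduces the counts quoted above; it is not necessarily the arrangement the authors had in mind.

```python
import numpy as np

def join_counts(grid):
    """Count rook's-move joins on a binary lattice (1 = black, 0 = white)."""
    g = np.asarray(grid)
    # Pair each node with its right-hand neighbour and with the neighbour below it,
    # so that every join is visited exactly once.
    pairs = [(g[:, :-1], g[:, 1:]), (g[:-1, :], g[1:, :])]
    bb = sum(int(np.sum((a == 1) & (b == 1))) for a, b in pairs)
    ww = sum(int(np.sum((a == 0) & (b == 0))) for a, b in pairs)
    bw = sum(int(np.sum(a != b)) for a, b in pairs)
    return {"BB": bb, "WW": ww, "BW": bw}

# Hypothetical arrangement (assumption, not the original figure).
grid = [[1, 1, 1],
        [1, 0, 0],
        [0, 0, 1]]
print(join_counts(grid))  # {'BB': 3, 'WW': 3, 'BW': 6}
```

A 3 × 3 lattice has 12 rook's-move joins in total (6 horizontal and 6 vertical), which matches the sum 3 + 3 + 6.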
References
Dale MRT, Fortin M-J (2009) Spatial autocorrelation and statistical tests:
some solutions. Journal of Agricultural, Biological and Environmental
Statistics 14:188–206.
Dale MRT, Fortin M-J (2002) Spatial autocorrelation and statistical tests in
ecology. Ecoscience 9:162–167.