Chapter 14 – Spatial autoregressive models
He, F., Zhou, J. and Zhu, H.T. 2003. Autologistic regression model for the distribution of vegetation.
Journal of Agricultural, Biological and Environmental Statistics 8:205-222.
Dependent variable y: proportion of years from 1960 to 1990 with a southern pine beetle outbreak, for North Carolina, South Carolina, and Georgia.
Independent variables (x1, x2, x3): proportion of land area classified as hydric; ln(elevation), with elevation in feet; mean daily maximum summer temperature (°F).
Gumpertz, M. L., Wu, C.-T. & Pye, J. M. 2000. Logistic regression for southern pine beetle
outbreaks with spatial and temporal autocorrelation. Forest Science 46:95-107.
Dependent variable: y; covariates: x; residuals: ε

$$y_i = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + \beta_3 x_{3i} + \varepsilon_i$$

where $i = s_1, s_2, \ldots, s_n$ are spatial locations.

In matrix notation:

$$Y = X\beta + \varepsilon$$

$$Y = \big(y(s_1), y(s_2), \ldots, y(s_n)\big)', \qquad \varepsilon = \big(\varepsilon(s_1), \varepsilon(s_2), \ldots, \varepsilon(s_n)\big)'$$

$$X = \begin{pmatrix} 1 & x_1(s_1) & x_2(s_1) & x_3(s_1) \\ 1 & x_1(s_2) & x_2(s_2) & x_3(s_2) \\ \vdots & \vdots & \vdots & \vdots \\ 1 & x_1(s_n) & x_2(s_n) & x_3(s_n) \end{pmatrix}$$
Regression models

Data type     Spatial model                        Non-spatial counterpart
Continuous    Simultaneous autoregressive model    Standard linear regression
Continuous    Conditional autoregressive model     Standard linear regression
Count         Auto-Poisson model                   Poisson regression*
Binary        Auto-logistic model                  Logistic regression*

*Note: Poisson and logistic regressions are the two most important generalized linear models.
Unlike the regression models introduced in the previous chapter, where spatial
autocorrelation in the dependent variable is modeled (captured) by the variance-covariance
matrix Σ, autoregressive models do not rely directly on this variance-covariance matrix.
Instead, the autoregressive model itself defines the covariance.
Spatial autoregressive models: “autoregressive” means that the dependent variable (y)
is regressed on itself, i.e., y appears on both the left- and right-hand sides of the regression model.
Dependent variable: y; covariates: x; residuals: e

$$y_i = x_i\beta + e_i$$
Simultaneous autoregressive (SAR) models
(following spatial econometric terminology):

$$y = \rho W_1 y + X\beta + e$$
$$e = \lambda W_2 e + \varepsilon$$

where e is the vector of residuals and $\varepsilon \sim N(0, \sigma^2 I)$.

$W_1$ and $W_2$ are two n by n spatial weight matrices, associated respectively with a spatial
autoregressive process in the dependent variable y and in the error term e. They are
defined as

$$W_{ij} = \frac{w_{ij}}{w_{..}}, \qquad w_{..} = \sum_{i=1}^{n} \sum_{j=1}^{n} w_{ij}$$

where $w_{ij} = 1$ if locations i and j are considered neighbors and 0 otherwise. The
simplest case is first-order neighbors under the rook's move. Usually, $W_1$ and $W_2$ are
assumed to be the same, and they can be easily defined using dnearneigh in spdep.
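The weight-matrix definition above can be sketched in base R without spdep. This is a minimal illustration under assumed values: the 3 x 3 grid, the 20-unit spacing, and the object names (`xy`, `w`, `W_global`, `W_row`) are hypothetical and not part of the original example. It contrasts the standardization by the grand total w.. given above with the row standardization that nb2listw applies by default.

```r
# Rook (first-order) neighbours on a small regular grid, base R only.
# Grid size and spacing are illustrative assumptions.
xy <- expand.grid(x = seq(0, 40, by = 20), y = seq(0, 40, by = 20))
n  <- nrow(xy)

# Binary indicator: w_ij = 1 if cells i and j share an edge, 0 otherwise
d <- as.matrix(dist(xy))        # pairwise Euclidean distances
w <- (d > 0 & d <= 20) * 1      # within one cell width -> rook neighbours

# Standardisation from the text: divide every entry by the grand total w..
W_global <- w / sum(w)

# Row standardisation (style "W" in nb2listw): each row sums to 1
W_row <- w / rowSums(w)

# Number of neighbours per cell: 2 at corners, 3 on edges, 4 in the centre
rowSums(w)
```

With the grand-total version all entries of the matrix sum to 1, whereas with row standardization each row sums to 1, so cells with fewer neighbors give each neighbor more weight.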
Simultaneous autoregressive (SAR) models:

$$y = \rho W y + X\beta + e$$
$$e = \lambda W e + \varepsilon$$

where $\varepsilon \sim N(0, \sigma^2 I)$.

1. For ρ=0, λ=0, the above model becomes an ordinary linear regression model, with no
spatial effects:
$$y = X\beta + \varepsilon$$
2. For λ=0, it becomes a mixed regressive-spatial autoregressive model (spatial lag model):
$$y = \rho W y + X\beta + \varepsilon$$
3. For ρ=0, it becomes a regression model with a spatially
autocorrelated error term e (spatial error model):
$$y = X\beta + e$$
$$e = \lambda W e + \varepsilon$$
Spatial error models:

$$y_i = x_i\beta + e_i$$

Residuals: $e_1, e_2, \ldots, e_n$, where n is the number of cells, i.e., data points. We want to model the spatial dependence of
the residuals. One such spatial model is

$$e_i = \sum_{j=1}^{n} b_{ij} e_j + \varepsilon_i, \qquad b_{ii} = 0$$

where ε is the residual of the residuals, with mean zero and a diagonal variance-covariance matrix:

$$\Sigma_\varepsilon = \sigma^2 I$$

The $b_{ij}$'s are spatial dependence parameters, which capture how the other residuals $e_j$ ($j \neq i$)
affect the focal residual $e_i$. Thus, the full regression model is

$$y_i = x_i\beta + \sum_{j=1}^{n} b_{ij} e_j + \varepsilon_i$$
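Starting from

$$y_i = x_i\beta + \sum_{j=1}^{n} b_{ij} e_j + \varepsilon_i$$

and substituting $e_j = y_j - x_j\beta$ (from $y_j = x_j\beta + e_j$) gives

$$y_i = x_i\beta + \sum_{j=1}^{n} b_{ij}\,(y_j - x_j\beta) + \varepsilon_i$$

This model describes spatial correlation through the inclusion of the summation term. It is a weighted
sum of the deviations of the jth observations from their modeled mean values.
In matrix notation:

$$(I - B)(Y - X\beta) = \varepsilon$$

where $B_{n \times n}$ contains the spatial dependence parameters $b_{ij}$ (with zeros on the diagonal, since $b_{ii} = 0$):

$$B = \begin{pmatrix} 0 & b_{12} & \cdots & b_{1n} \\ b_{21} & 0 & \cdots & b_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ b_{n1} & b_{n2} & \cdots & 0 \end{pmatrix}$$

SAR was first introduced by Whittle (1954). “Simultaneous”
refers to the n autoregressions that occur simultaneously at
each data location in the above formulation.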
Whittle, P. 1954. On stationary processes in the plane. Biometrika 41:434-449.
Estimating SAR model parameters

$$y_i = x_i\beta + \sum_{j=1}^{n} b_{ij}\,(y_j - x_j\beta) + \varepsilon_i, \qquad (I - B)(Y - X\beta) = \varepsilon$$

With $\Sigma_\varepsilon = \sigma^2 V$, the variance-covariance matrix of the SAR model is

$$\Sigma_{SAR} = \sigma^2 (I - B)^{-1} V (I - B')^{-1} = \sigma^2 V_{SAR}(\theta)$$

If $V_{SAR}(\theta)$ is known, the estimation of β is straightforward: we can use the weighted
(generalized) least squares method, as we learned in the last chapter:

$$\hat\beta = (X' V_{SAR}^{-1} X)^{-1} X' V_{SAR}^{-1} Y$$

$$\hat\sigma^2 = \frac{(Y - X\hat\beta)'\, V_{SAR}^{-1}\, (Y - X\hat\beta)}{n - p}$$

If $V_{SAR}(\theta)$ is unknown, the estimation of β is more complicated. The ML method is usually
used, and a certain structure for the variance-covariance matrix $V_{SAR}$ is assumed.
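The generalized least squares step for a known V_SAR can be sketched in base R as follows. This is only an illustration under stated assumptions: V = I, B = λW with a row-standardized rook matrix W, and a value of λ treated as known. The helper name `gls_sar`, the 5 x 5 grid, and the simulated data are hypothetical. The sketch uses the fact that with V = I the inverse of V_SAR is (I − B)'(I − B), so no explicit matrix inversion of V_SAR is needed.

```r
# GLS estimates of beta and sigma^2 for a SAR error model with known B (V = I assumed)
gls_sar <- function(X, y, B) {
  IB    <- diag(nrow(B)) - B
  V_inv <- t(IB) %*% IB                      # inverse of V_SAR = (I-B)^{-1} (I-B')^{-1}
  beta_hat <- solve(t(X) %*% V_inv %*% X, t(X) %*% V_inv %*% y)
  res   <- y - drop(X %*% beta_hat)
  sigma2_hat <- drop(t(res) %*% V_inv %*% res) / (nrow(X) - ncol(X))
  list(beta = drop(beta_hat), sigma2 = sigma2_hat)
}

set.seed(1)
# Hypothetical 5 x 5 grid with rook neighbours
xy <- expand.grid(x = 1:5, y = 1:5)
n  <- nrow(xy)
d  <- as.matrix(dist(xy))
w  <- (d > 0 & d <= 1) * 1
W  <- w / rowSums(w)                         # row-standardised weights
lambda <- 0.5                                # assumed known in this sketch
B  <- lambda * W

# Simulate y = X beta + e, where (I - B) e = eps and eps ~ N(0, I)
X <- cbind(1, rnorm(n))
e <- solve(diag(n) - B, rnorm(n))
y <- drop(X %*% c(2, 1) + e)

fit <- gls_sar(X, y, B)
fit$beta      # GLS estimates of (beta0, beta1)
fit$sigma2    # estimate of sigma^2
```

Setting B to a zero matrix reduces the estimator to ordinary least squares, which is a convenient sanity check on the function.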
Example: modeling the richness distribution in BCI in terms of topographic variables.
Spatial error model:

$$y = X\beta + e, \qquad e = \lambda W e + \varepsilon$$

$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + e$$
$$e = \lambda\, e_{\text{neighbors}} + \varepsilon$$

where the $x_i$ are the topographic variables.
> bcisp.dat[1:10,]
  index habcat habno gx gy meanelev  convex     slope abund rich
1     1 stream     4  0  0 122.6950 -3.5975 13.341756   133   54
2     2  slope     3  0 20 124.9025 -2.2460 17.875906   145   60
3     3  slope     3  0 40 128.9125 -0.1260  9.650691   171   57
4     4  slope     3  0 60 130.3825 -1.2910 10.916467   185   55
5     5  slope     3  0 80 131.8025 -1.5990 11.921838   185   60
….
1. Define the neighborhood structure:
> library(spdep)
> # take the coordinates in the same row order as bcisp.dat so that neighbors line up with the data rows
> bci.xy = as.matrix(bcisp.dat[, c("gx", "gy")])
> # cells within 20 m of each other are neighbors (first-order rook neighbors on the 20 m grid)
> bci.nb = dnearneigh(bci.xy, 0, 20)
> # view the neighborhood
> plot(bci.nb, bci.xy)
> # define neighborhood weights using the default style "W" (row standardized)
> bci.W = nb2listw(bci.nb)
2. Use the spatial error model (errorsarlm, originally in spdep; in current releases it is provided by the spatialreg package) to model species richness in BCI in
relation to topographical variables:
> library(spatialreg)
> bcisp.sem = errorsarlm(rich ~ meanelev + convex + slope, listw = bci.W, data = bcisp.dat)
> # Note: zero.policy = TRUE should be used for irregular data locations, because some
> # irregular locations may have no neighbors. In that case the call fails unless zero.policy is set.
> # view outputs
> summary(bcisp.sem)
3. Compare the results with the simple linear model:
> bcisp.lm = lm(rich ~ meanelev + convex + slope, data = bcisp.dat)
> summary(bcisp.lm)
Important note: comparing the two outputs, you will notice that the standard errors for bcisp.sem are
larger than those for bcisp.lm.
> summary(bcisp.sem)
Residuals:
       Min         1Q     Median         3Q        Max
-21.575692  -5.363112  -0.062932   4.885973  29.647711

Type: error
Coefficients: (asymptotic standard errors)
             Estimate Std. Error z value  Pr(>|z|)
(Intercept) 80.393299   7.675263 10.4743 < 2.2e-16
meanelev    -0.215075   0.052292 -4.1129 3.906e-05
convex       2.121780   0.518145  4.0949 4.223e-05
slope        0.471001   0.085445  5.5123 3.541e-08

Lambda: 0.45542 LR test value: 175.04 p-value: < 2.22e-16
Asymptotic standard error: 0.033131 z-value: 13.746 p-value: < 2.22e-16
Wald statistic: 188.96 p-value: < 2.22e-16

[Figure: maps of BCI richness and of the sem residuals]

The simultaneous autoregressive model for BCI richness is:

$$y = 80.393 - 0.215 x_1 + 2.122 x_2 + 0.471 x_3 + e$$
$$e = 0.455\, e_{\text{neighbors}} + \varepsilon$$

where $x_1$ = meanelev, $x_2$ = convex, and $x_3$ = slope.
4. Assess model adequacy by testing for autocorrelation in the residuals of the
models:
> # original data
> bcisp.I = sp.correlogram(bci.nb, bcisp.dat$rich, order = 25, method = "I", zero.policy = TRUE)
> plot(bcisp.I, ylim = c(-0.05, 0.4), main = "")
> # residuals of the simple linear model
> bcisp.lm.resid.I = sp.correlogram(bci.nb, resid(bcisp.lm), order = 25, method = "I", zero.policy = TRUE)
> plot(bcisp.lm.resid.I, ylim = c(-0.05, 0.4), main = "")
> # residuals of the spatial error model
> bcisp.sem.resid.I = sp.correlogram(bci.nb, resid(bcisp.sem), order = 25, method = "I", zero.policy = TRUE)
> plot(bcisp.sem.resid.I, ylim = c(-0.05, 0.4), main = "")

[Figure: Moran's I correlograms for the original data, the residuals of the simple linear model (lm resid), and the residuals of the SAR spatial error model (sem resid)]
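The correlograms in this check are built from Moran's I. As a rough illustration of what that statistic measures, the following base R sketch computes Moran's I directly from a binary rook weight matrix. The function name `morans_I`, the 4 x 4 grid, and the two test patterns are hypothetical and are not the BCI data; sp.correlogram additionally handles higher-order lags and inference, which this sketch does not.

```r
# Moran's I for values z and a spatial weight matrix w (base R sketch)
morans_I <- function(z, w) {
  z  <- z - mean(z)                  # centre the values
  S0 <- sum(w)                       # sum of all weights
  (length(z) / S0) * sum(w * outer(z, z)) / sum(z^2)
}

# Hypothetical 4 x 4 grid with rook neighbours
xy <- expand.grid(x = 1:4, y = 1:4)
d  <- as.matrix(dist(xy))
w  <- (d > 0 & d <= 1) * 1

# Checkerboard: every neighbour pair has opposite values, so I = -1
checker  <- ifelse((xy$x + xy$y) %% 2 == 0, 1, -1)
# Smooth west-east gradient: neighbours are similar, so I is positive
gradient <- xy$x

morans_I(checker, w)
morans_I(gradient, w)
```

Applied to model residuals, values of I near zero at all lags suggest that the model has absorbed the spatial autocorrelation.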
Conditional autoregressive (CAR) models:

$$y_i = x_i\beta + \gamma_1 y_1 + \gamma_2 y_2 + \gamma_3 y_3 + \gamma_4 y_4 + \varepsilon_i$$

[Figure: first-order and second-order neighborhood diagrams]

The CAR model is based on the concept of a Markov random field. Besag (1974) provided
a formal mathematical foundation for the method. In a general form (considering all
orders of neighborhood), the CAR can be written as

$$E(y_i \mid y_{-i}) = x_i\beta + \sum_{j=1}^{n} c_{ij}\,(y_j - x_j\beta), \qquad \mathrm{var}(y_i \mid y_{-i}) = \sigma_i^2$$

with the requirement $c_{ii} = 0$, and

$$\Sigma_C = \mathrm{diag}(\sigma_1^2, \sigma_2^2, \ldots, \sigma_n^2)$$

This defines a joint multivariate normal distribution with
mean $X\beta$ and variance $\Sigma_Y = (I - C)^{-1} \Sigma_C$.
Besag, J. 1974, Spatial interaction and the statistical analysis of lattice systems. JRSS, B. 36:192-225.
Estimating CAR model parameters

$$y_i = x_i\beta + \gamma_1 y_1 + \gamma_2 y_2 + \gamma_3 y_3 + \gamma_4 y_4 + \varepsilon_i$$

[Figure: first-order and second-order neighborhood diagrams]

Several methods can be used to estimate the parameters of the CAR model:
1. The simplest one is the pseudo-likelihood method (= the standard maximum likelihood
method). In this linear regression case, it is just the ordinary least squares method,
treating the values at neighboring locations as additional covariates.
2. The generalized least squares method: it is based on Besag’s theory that the CAR
model is a multivariate normal distribution with
mean $X\beta$ and variance $\Sigma_Y = (I - C)^{-1} \Sigma_C$.
The CAR variance is:

$$\Sigma_{CAR} = \sigma^2 (I - C)^{-1} V_C = \sigma^2 V_{CAR}$$

$$\hat\beta = (X' V_{CAR}^{-1} X)^{-1} X' V_{CAR}^{-1} Y$$

$$\hat\sigma^2 = \frac{(Y - X\hat\beta)'\, V_{CAR}^{-1}\, (Y - X\hat\beta)}{n - p}$$

3. MCMC: the Markov chain Monte Carlo simulation algorithm.
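The pseudo-likelihood idea in method 1 can be sketched in base R: the values of y at the four first-order neighbors are added to the regression as ordinary covariates and the model is fitted with lm. Everything in this sketch is an assumption made for illustration: the 12 x 12 grid, the simulated data (which have no built-in spatial dependence), the neighbor labeling y1 to y4, and the choice to drop border cells that lack a full set of neighbors.

```r
set.seed(2)
nr <- 12; nc <- 12                         # hypothetical lattice size
x <- matrix(rnorm(nr * nc), nr, nc)        # one covariate on the lattice
y <- 1 + 0.5 * x + matrix(rnorm(nr * nc), nr, nc)   # placeholder response

# Interior cells only, so that all four rook neighbours exist
i <- 2:(nr - 1)
j <- 2:(nc - 1)
dat <- data.frame(
  y  = as.vector(y[i, j]),
  x  = as.vector(x[i, j]),
  y1 = as.vector(y[i - 1, j]),             # neighbour in the previous row
  y2 = as.vector(y[i + 1, j]),             # neighbour in the next row
  y3 = as.vector(y[i, j - 1]),             # neighbour in the previous column
  y4 = as.vector(y[i, j + 1])              # neighbour in the next column
)

# Ordinary least squares with neighbouring responses as extra covariates
fit_pl <- lm(y ~ x + y1 + y2 + y3 + y4, data = dat)
coef(fit_pl)
```

Because the neighboring responses are themselves random, the usual OLS standard errors from this fit should be read with caution.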
Logistic regression
So far we have considered only the situation where y is a continuous
numerical variable. We now model y that takes only the values
0 or 1, i.e., binary maps:
y = 1: presence of species
y = 0: absence
The probability of occurrence is a function of the covariates x, of the form:

$$\pi(y = 1 \mid x) = \frac{e^{\beta_0 + \beta_1 x}}{1 + e^{\beta_0 + \beta_1 x}}$$

It can be expressed in a more familiar form (called the logit):

$$g(x) = \log\frac{\pi}{1 - \pi} = \beta_0 + \beta_1 x$$
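The link between the probability form and the logit form can be checked numerically. The sketch below uses hypothetical coefficient values (`beta0`, `beta1`) and an arbitrary grid of x values; `plogis` is base R's logistic distribution function.

```r
beta0 <- -1      # hypothetical intercept
beta1 <- 0.8     # hypothetical slope
x   <- seq(-3, 3, by = 0.5)
eta <- beta0 + beta1 * x                 # linear predictor

# Probability of occurrence from the logistic form
p <- exp(eta) / (1 + exp(eta))

# Same quantity via base R's logistic function
p_check <- plogis(eta)

# The logit transform recovers the linear predictor
g <- log(p / (1 - p))
all.equal(g, eta)
```

The probabilities always stay strictly between 0 and 1, while the logit is linear in x, which is why the model is fitted on the logit scale.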
Odds ratio
The odds of the outcome being present among individuals with x = 1 are defined as:

$$\frac{\pi(1 \mid 1)}{1 - \pi(1 \mid 1)} = \frac{\pi(1 \mid 1)}{\pi(0 \mid 1)} = e^{\beta_0 + \beta_1}$$

The odds of the outcome being present among individuals with x = 0 are:

$$\frac{\pi(1 \mid 0)}{1 - \pi(1 \mid 0)} = \frac{\pi(1 \mid 0)}{\pi(0 \mid 0)} = e^{\beta_0}$$

Odds ratio:

$$\psi = \frac{\pi(1 \mid 1) / [1 - \pi(1 \mid 1)]}{\pi(1 \mid 0) / [1 - \pi(1 \mid 0)]} = \frac{e^{\beta_0 + \beta_1}}{e^{\beta_0}} = e^{\beta_1}$$

The odds ratio is a measure of association that has wide applications. It approximates how much
more likely (or unlikely) it is for the outcome to be present among those with x = 1 than among
those with x = 0. For example, if y denotes the presence or absence of lung cancer and if x
denotes whether or not the person is a smoker, then ψ = 2 indicates that lung cancer occurs
twice as often among smokers as among nonsmokers in the study population.
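The identity between the odds ratio and exp(β1) can be illustrated with a small base R sketch. The 2 x 2 counts below are invented for illustration only. The odds ratio is first computed directly from the table and then recovered from a logistic regression fitted with glm, using the counts as frequency weights.

```r
# Hypothetical 2 x 2 table: rows are x (1, 0), columns are y (1, 0)
tab <- matrix(c(40, 60,
                20, 80), nrow = 2, byrow = TRUE,
              dimnames = list(x = c("1", "0"), y = c("1", "0")))

odds_x1 <- tab["1", "1"] / tab["1", "0"]   # odds of y = 1 when x = 1
odds_x0 <- tab["0", "1"] / tab["0", "0"]   # odds of y = 1 when x = 0
psi     <- odds_x1 / odds_x0               # odds ratio from the table

# Same data in long form, with counts used as frequency weights
dat <- data.frame(x = c(1, 1, 0, 0),
                  y = c(1, 0, 1, 0),
                  n = c(40, 60, 20, 80))
fit <- glm(y ~ x, family = binomial, weights = n, data = dat)

psi                      # odds ratio computed directly
exp(coef(fit)["x"])      # exp(beta1) from the logistic regression
```

With a single binary covariate the fitted model reproduces the observed proportions, so the two numbers agree up to the convergence tolerance of glm.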
Autologistic regression
3 2
2
Following the principle of the CAR model, we can
incorporate neighborhood spatial correlation into the
1
1
2
1
4
4
1
2
3
logistic model. The logit now becomes:
First-order
g ( x)  log

1 
Second-order
  0  1x   1 y1   2 y2   3 y3   4 y4
The estimation methods include:
1. PML – pseudo maximum likelihood method, i.e., the standard method
used to estimate logistic regression models.
2. MCMC (see He et al. 2003).
He, F., Zhou, J. and Zhu, H.T. 2003. Autologistic regression model for the distribution of vegetation.
Journal of Agricultural, Biological and Environmental Statistics 8:205-222.
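The PML approach for the autologistic model can be sketched in base R by fitting a standard logistic regression in which the presence/absence values at the four first-order neighbors enter as covariates. This is a hedged illustration only: the 15 x 15 lattice, the simulated presence/absence map, the neighbor labeling y1 to y4, and the exclusion of border cells are assumptions, and the sketch is not the procedure of He et al. (2003).

```r
set.seed(3)
nr <- 15; nc <- 15                                   # hypothetical lattice size
x <- matrix(rnorm(nr * nc), nr, nc)                  # one covariate
# Placeholder binary map (simulated without spatial dependence)
y <- matrix(rbinom(nr * nc, 1, plogis(-0.5 + x)), nr, nc)

# Interior cells only, so that all four rook neighbours exist
i <- 2:(nr - 1)
j <- 2:(nc - 1)
dat <- data.frame(
  y  = as.vector(y[i, j]),
  x  = as.vector(x[i, j]),
  y1 = as.vector(y[i - 1, j]),
  y2 = as.vector(y[i + 1, j]),
  y3 = as.vector(y[i, j - 1]),
  y4 = as.vector(y[i, j + 1])
)

# Pseudo maximum likelihood: ordinary logistic regression on x and neighbour states
fit_pml <- glm(y ~ x + y1 + y2 + y3 + y4, family = binomial, data = dat)
coef(fit_pml)
```

The coefficients on y1 to y4 play the role of the neighborhood terms in the logit above; on real data, positive values would indicate that presence in a neighboring cell raises the odds of presence in the focal cell.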
Spatial statistical analysis in Ecology
1. Point pattern analysis
2. Geostatistics
3. Lattice data analysis (regression)
1. Methods for testing (detecting) spatial structures/scale effect:
(1) Quadrat methods, distance methods, Ripley’s K function
(2) Moran’s I, Geary’s c
(3) Geostatistical methods: variogram, covariogram
2. Spatial interpolation: naïve methods and kriging
3. Model lattice data: spatial autoregressive models