R for Macroecology Spatial models Next week Any topics that we haven’t talked about? Group projects SAR Models Augment standard OLS with an additional term to model the spatial autocorrelation We’ll focus on error SAR models, which focuses on spatial pattern in the error part of the model OLS SARlag SARerror Y = βX + ε Y = ρWY + βX + ε Y = βX + λWu+ ε Defining the spatial weights matrix, W, is crucial Neighborhoods in R spdep dnearneigh() knearneigh() Coordinates (matrix or SpatialPoints) dnearneigh(x, d1, d2, row.names = NULL, longlat = NULL) Minimum and maximum distances (in km if longlat = T) Returns a list of vectors giving the neighbors for each point Neighborhoods in R spdep dnearneigh() knearneigh() > x = c(1,3,2,5) > y = c(3,2,4,4) > n = dnearneigh(cbind(x,y),d1 = 0,d2 = 3) > n Neighbour list object: Number of regions: 4 Number of nonzero links: 10 Percentage nonzero weights: 62.5 Average number of links: 2.5 > str(n) List of 4 $ : int [1:2] 2 3 $ : int [1:3] 1 3 4 $ : int [1:3] 1 2 4 $ : int [1:2] 2 3 - attr(*, "class")= chr "nb" - attr(*, "nbtype")= chr "distance” ... Converting a neighborhood to weights neighbors list what to do with neighborless points nb2listw(neighbours, style="W", zero.policy=NULL) W = row standardized (rows sum to 1) B = binary (0/1) C = global standardized (all links sum to n) U = C/n S = variance stabilization (Tiefelsdorf et al. 1999) Converting a neighborhood to weights > nb2listw(n,style = "W")$weights [[1]] [1] 0.5 0.5 > nb2listw(n,style = "C")$weights [[1]] [1] 0.4 0.4 [[2]] [1] 0.3333333 0.3333333 0.3333333 [[2]] [1] 0.4 0.4 0.4 [[3]] [1] 0.3333333 0.3333333 0.3333333 [[3]] [1] 0.4 0.4 0.4 [[4]] [1] 0.5 0.5 > nb2listw(n,style = "B")$weights [[1]] [1] 1 1 [[4]] [1] 0.4 0.4> > nb2listw(n,style = "S")$weights [[1]] [1] 0.4494897 0.4494897 [[2]] [1] 1 1 1 [[2]] [1] 0.3670068 0.3670068 0.3670068 [[3]] [1] 1 1 1 [[3]] [1] 0.3670068 0.3670068 0.3670068 [[4]] [1] 1 1 [[4]] [1] 0.4494897 0.4494897 Converting a neighborhood to weights > nb2listw(n,style = "W")$weights [[1]] [1] 0.5 0.5 > nb2listw(n,style = "C")$weights [[1]] [1] 0.4 0.4 [[2]] [1] 0.3333333 0.3333333 0.3333333 Emphasizes weakly [[2]] [1] 0.4 0.4 0.4 Emphasizes connected points [[3]] [1] 0.3333333 0.3333333 0.3333333 [[3]] [1] 0.4 0.4 0.4 [[4]] [1] 0.5 0.5 > nb2listw(n,style = "B")$weights [[1]] [1] 1 1 [[4]] [1] 0.4 0.4 > nb2listw(n,style = "S")$weights [[1]] [1] 0.4494897 0.4494897 [[2]] [1] 1 1 Emphasizes 1 [[2]] [1] 0.3670068 0.3670068 0.3670068 [[3]] [1] 1 1 1 [[3]] [1] 0.3670068 0.3670068 0.3670068 [[4]] [1] 1 1 strongly connected points strongly connected points Tries to balance [[4]] [1] 0.4494897 0.4494897 Lots of options – how to choose? Define the neighborhood Define the spatial weights matrix Try things out! Look for stability in model estimates Look for residual autocorrelation Defining the neighborhood - d #1. Small distance n = dnearneigh(cbind(x,y),d1 = 0, d2 = 0.1) w1 = nb2listw(n,zero.policy = T) #2. Medium distance n = dnearneigh(cbind(x,y),d1 = 0, d2 = 0.3) w2 = nb2listw(n,zero.policy = T) #2. Large distance n = dnearneigh(cbind(x,y),d1 = 0, d2 = 0.5) w3 = nb2listw(n,zero.policy = T) par(mfrow = c(1,4)) plot(x,y,axes = F,xlab = "",ylab = "") plot(w1,cbind(x,y)) plot(w2,cbind(x,y)) plot(w3,cbind(x,y)) Defining the neighborhood - K #4. 2 neighbors n = knn2nb(knearneigh(cbind(x,y),k=2,RANN = F)) w4 = nb2listw(n,zero.policy = T) #5. 4 neighbors n = knn2nb(knearneigh(cbind(x,y),k=4,RANN = F)) w5 = nb2listw(n,zero.policy = T) #6. 8 neighbors n = knn2nb(knearneigh(cbind(x,y),k=8,RANN = F)) w6 = nb2listw(n,zero.policy = T) par(mfrow = c(1,4)) plot(x,y,axes = F,xlab = "",ylab = "") plot(w4,cbind(x,y)) plot(w5,cbind(x,y)) plot(w6,cbind(x,y)) Neighborhoods on grids x = rep(1:20,20) y = rep(1:20,each = 20) plot(x,y) n = dnearneigh(cbind(x,y),d1=0,d2 = 1) w = nb2listw(n) plot(w,cbind(x,y)) n = dnearneigh(cbind(x,y),d1=0,d2 = sqrt(2)) w = nb2listw(n) plot(w,cbind(x,y)) Rook’s case Queen’s case Data size SAR models can take a very long time to fit 2000 points is the maximum I have used sample() is useful again Fitting the SAR model errorsarlm() just like lm() what to do with neighborless points errorsarlm(formula, listw, zero.policy=NULL) The neighborhood weights Try it out Build several SAR models with different W Which one works best? Spatial eigenvector maps Generate new predictors that represent the spatial structure of the data Three steps Calculate a pairwise distance matrix Do a principal components analysis on this matrix Select some of these PCA axes to add to an OLS model Spatial eigenvector maps Diniz-Filho and Bini 2005 Filter 1 Filter 2 Filter 3 Filter 4 Filter 10 Filter 30 Filter 20 Filter 40