Session 3: Spatial autocorrelation tests
Course on Spatial Econometrics with Applications
Instructor: Coro Chasco Yrigoyen, Universidad Autónoma de Madrid
Venue: Universidad Politécnica de Barcelona, 12-13 and 18-20 June 2007
©2007, Coro Chasco Yrigoyen. All Rights Reserved.

Course Index
S1: Introduction to spatial econometrics
S2: Spatial effects, spatial dependence
S3: Spatial autocorrelation tests
S4: Exploratory Spatial Data Analysis (ESDA)
S5: Specification of spatial dependence models
S6: Spatial regression models: OLS estimation and testing
PS1: GeoDa: introduction and ESDA
S7: Spatial dependence models: estimation and testing
S8: Modelling strategies in spatial regression models
PS2: SpaceStat: confirmatory spatial data analysis
S9: Specification of spatial heterogeneity models
S10: Spatial heterogeneity models: estimation and testing
PS3: Practical exercise and evaluation

Chasco, C. (2003), “Econometría espacial aplicada a la predicción-extrapolación de datos microterritoriales”. Comunidad de Madrid; pp. 62-78.

Overview and Goals
Global spatial autocorrelation
1. Moran’s I
2. Geary’s c
3. Mantel’s Γ
4. Getis and Ord’s G(d)
Local spatial autocorrelation
1. Getis and Ord’s local statistics
2. LISA tests

3.1. Global spatial autocorrelation
Used to test for the presence of general spatial trends in the distribution of a geographical variable over a whole space. But how can we determine the existence of spatial autocorrelation?
3.1.1. Moran’s I
3.1.2. Geary’s c
3.1.3. Mantel’s Γ
3.1.4. Getis and Ord’s G(d)
[Map: disposable income per capita (1997), thousand pesetas; legend classes: 1,400 to 1,800; 1,125 to 1,400; 900 to 1,125]

Moran, P. (1948), “The interpretation of statistical maps”. Journal of the Royal Statistical Society B, vol. 10; pp. 243-251.

3.1.1.
Moran’s I

$$ I = \frac{N}{S_0} \, \frac{\sum_{i=1}^{N} \sum_{j=1}^{N} w_{ij} (x_i - \bar{x})(x_j - \bar{x})}{\sum_{i=1}^{N} (x_i - \bar{x})^2}, \qquad S_0 = \sum_{i=1}^{N} \sum_{j=1}^{N} w_{ij} $$

N: sample size; $w_{ij}$: elements of the spatial weight matrix W*.
Moran’s I theoretical mean: $E(I) = -\frac{1}{N-1}$
Positive autocorrelation: $I > E(I)$. Negative autocorrelation: $I < E(I)$.

Cliff, A. and J. Ord (1973), “Spatial autocorrelation”. London: Pion.
Cliff, A. and J. Ord (1981), “Spatial processes, models and applications”. London: Pion.

3.1.1. Moran’s I (II)
Inference is typically based on a standardized z-value:

$$ z(I) = \frac{I - E(I)}{SD(I)} $$

For N → ∞, z(I) follows a standard normal distribution: z(I) ∼ N(0,1).
Assumptions:
• Normalisation: the variable X is assumed to follow a normal distribution.
• Randomisation by permutation: the distribution function of X is unknown.

3.1.1. Moran’s I (III)
Normalisation: the variable X follows a normal distribution.
1) For N → ∞, $z_N(I)$ follows a standard normal distribution: $z_N(I) \sim N(0,1)$.
2) Significance of $z_N(I)$: read from a standard normal table.

$$ E_N(I) = -\frac{1}{N-1} $$

$$ \mathrm{Var}_N(I) = \frac{4AN^2 - 8(A+D)N + 12A^2}{4A^2(N^2-1)} - \left[E_N(I)\right]^2 $$

$$ A = \frac{1}{2} \sum_{i=1}^{N} L_i = \frac{S_0}{2}, \qquad D = \frac{1}{2} \sum_{i=1}^{N} L_i (L_i - 1) $$

where $L_i$ is the number of neighbours of unit i (binary, symmetric W).

3.1.1. Moran’s I (IV)
Permutation: randomisation with unknown distribution function.
1) A reference distribution for I is generated empirically.
2) The observations are randomly permuted over the locations and Moran’s I is computed for each new sample, drawn from the N! possible permutations.
3) E[I] and SD[I] are computed directly from the generated distribution of Moran’s I values.
4) Significance of z(I): read from a standard normal table.

3.1.1. Moran’s I (VI)
Interpretation:
Non-significant values of z(I) mean that H0 (no spatial autocorrelation) cannot be rejected.
• Significant z(I) > 0 ⇒ positive spatial autocorrelation: similar values of X (high or low) are more spatially clustered than would be expected by chance.
• Significant z(I) < 0 ⇒ negative spatial autocorrelation: similar values of X (high or low) are less spatially clustered than would be expected by chance. A checkerboard is the perfect example of this pattern.

3.1.1. Moran’s I (VII)
A significant negative z(I): negative spatial autocorrelation (less clustering of similar values than there would be in a random pattern).

Cliff, A. and J. Ord (1981), “Spatial processes, models and applications”. London: Pion; chapter 5.

3.1.1. Moran’s I (VIII)
Correlogram: an analytic method that is of value in assessing the spatial scale of a process. Sometimes the strength of spatial interaction will vary in a complex way with distance.
Higher-order spatial autocorrelation: spatial correlogram
[Figure: spatial correlogram; vertical axis: z(I) of Moran’s I, from -1.5 to 1.5; horizontal axis: orders 1 to 9]

3.1.2. Geary’s c

$$ c = \frac{(N-1) \sum_{i=1}^{N} \sum_{j \neq i} w_{ij} (x_i - x_j)^2}{2 S_0 \sum_{i=1}^{N} (x_i - \bar{x})^2} $$

Geary’s c theoretical mean: E(c) = 1
Perfect positive autocorrelation: c = 0, since $x_i \cong x_j$ implies $x_i - x_j = 0$.
Geary’s c: depends on the squared difference between neighbouring values of a variable. It is similar to the Durbin-Watson test. It is a variance test.
Moran’s I: depends on the cross-product of the deviations of neighbouring values of X from its mean. It is similar to the Pearson correlation coefficient. It is a covariance test.

3.1.2.
Geary’s c (II)
Inference is typically based on a standardized z-value:

$$ z(c) = \frac{c - E(c)}{SD(c)} $$

For N → ∞, z(c) follows a standard normal distribution: z(c) ∼ N(0,1).
• Normalisation: the variable X is assumed to follow a normal distribution.
• Randomisation by permutation: the distribution function of X is unknown.

3.1.2. Geary’s c (III)
Interpretation:
Non-significant values of z(c) mean that H0 (no spatial autocorrelation) cannot be rejected.
• Significant z(c) < 0 ⇒ positive spatial autocorrelation: similar values of X (high or low) are more spatially clustered than would be expected by chance.
• Significant z(c) > 0 ⇒ negative spatial autocorrelation: similar values of X (high or low) are less spatially clustered than would be expected by chance.

3.1.3. Mantel’s Γ
Mantel (1967): matrix association index, the sum of the cross-products of the corresponding elements of two matrices A and B:

$$ \Gamma = \sum_{i} \sum_{j} a_{ij} b_{ij} $$

With $a_{ij} = w_{ij}$:
• $b_{ij} = x_i x_j$ (x in deviations from its mean) gives Moran’s I.
• $b_{ij} = (x_i - x_j)^2$ gives Geary’s c.
In general, measures of spatial association can be obtained by expressing similarities as matrices: 1) spatial similarity (e.g., the spatial weight matrix) and 2) value similarity.

3.1.4. Getis and Ord’s G(d)
Spatial autocorrelation is measured with a distance-based measure of spatial clustering. For this test, two spatial units are neighbours if they lie within a given distance d of each other.

$$ G(d) = \frac{\sum_{i=1}^{N} \sum_{j=1}^{N} w_{ij}(d) \, x_i x_j}{\sum_{i=1}^{N} \sum_{j=1}^{N} x_i x_j}, \qquad j \neq i $$

Conditions: X > 0; W binary and symmetric.
It measures the degree of association between the values of X around “i” and the values of X around “j”.
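The global statistics of section 3.1 can be sketched in a few lines of Python with NumPy. This is only an illustrative sketch, not the course software (GeoDa, SpaceStat): the function names (`mantel_gamma`, `moran_i`, `geary_c`, `getis_ord_g`, `permutation_z`, `rook_weights`), the 6 x 6 toy grid, the rook-contiguity weights and the 999 permutations are all assumptions made for the example. Moran’s I and Geary’s c are written as Mantel cross-products, and the permutation approach builds the reference distribution empirically.

```python
import numpy as np

def mantel_gamma(a, b):
    """Mantel's cross-product: sum over i, j of a_ij * b_ij."""
    return float((a * b).sum())

def moran_i(x, w):
    """Moran's I: Gamma with a = W and b_ij = z_i * z_j (z = deviations from the mean)."""
    z = x - x.mean()
    return len(x) / w.sum() * mantel_gamma(w, np.outer(z, z)) / (z @ z)

def geary_c(x, w):
    """Geary's c: Gamma with a = W and b_ij = (x_i - x_j)**2."""
    diff2 = (x[:, None] - x[None, :]) ** 2
    z = x - x.mean()
    return (len(x) - 1) * mantel_gamma(w, diff2) / (2 * w.sum() * (z @ z))

def getis_ord_g(x, w):
    """G(d) for positive x and a binary symmetric W(d); both sums exclude j == i."""
    xx = np.outer(x, x)
    np.fill_diagonal(xx, 0.0)
    return mantel_gamma(w, xx) / xx.sum()

def permutation_z(stat, x, w, n_perm=999, seed=0):
    """z-value of stat(x, w) against an empirical permutation distribution."""
    rng = np.random.default_rng(seed)
    sims = np.array([stat(rng.permutation(x), w) for _ in range(n_perm)])
    return (stat(x, w) - sims.mean()) / sims.std(ddof=1)

def rook_weights(rows, cols):
    """Binary rook-contiguity matrix for a regular rows x cols grid (toy W)."""
    n = rows * cols
    w = np.zeros((n, n))
    for r in range(rows):
        for c in range(cols):
            i = r * cols + c
            if c + 1 < cols:                 # neighbour to the right
                w[i, i + 1] = w[i + 1, i] = 1.0
            if r + 1 < rows:                 # neighbour below
                w[i, i + cols] = w[i + cols, i] = 1.0
    return w

w = rook_weights(6, 6)
rr, cc = np.divmod(np.arange(36), 6)
checker = (rr + cc) % 2 + 1.0    # checkerboard of values 1 and 2
trend = rr + cc + 1.0            # smooth gradient across the grid

# Checkerboard: I = -1 and c = 35/18 (perfect negative autocorrelation).
print(moran_i(checker, w), geary_c(checker, w))
# Gradient: I = 0.8 and c = 1/12 (strong positive autocorrelation).
print(moran_i(trend, w), geary_c(trend, w))
print(permutation_z(moran_i, trend, w))   # large positive z-value
print(getis_ord_g(trend, w))
```

The checkerboard and the gradient are chosen because their values of I and c can be checked by hand, which makes the opposite directions of the two statistics easy to see: I rises above E(I) where c falls below 1.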
3.2. Local spatial autocorrelation
Concentration, in a particular zone of the global space, of particularly high or low values of a variable compared with the expected value (the mean of the variable).
This phenomenon takes place in non-stationary spatial processes: spatial dependence changes with location.
• Sometimes there is no global spatial autocorrelation in a variable, but there are small spatial clusters with a significant concentration (or lack) of high values.
• Sometimes there is global spatial autocorrelation in a variable, but each region contributes differently to it.
TESTS:
1. Getis and Ord’s local statistics
2. LISA tests

3.2.1. Getis and Ord’s local statistics
Gi(d), Gi*(d), New Gi(d), New Gi*(d)
Gi(d) measures the concentration (or lack of it) of the weighted sum of the values of variable X in a subregion of “j” locations around “i” in the global space.

Global:
$$ G(d) = \frac{\sum_{i=1}^{N} \sum_{j=1}^{N} w_{ij}(d) \, x_i x_j}{\sum_{i=1}^{N} \sum_{j=1}^{N} x_i x_j} $$

Local:
$$ G_i(d) = \frac{\sum_{j=1}^{N} w_{ij}(d) \, x_j}{\sum_{j=1}^{N} x_j} $$

for $j \neq i$; $x_j > 0$; W binary and symmetric.

3.2.1. Getis and Ord’s local statistics (II)
Gi*(d): the measure of local spatial concentration also considers the value of variable X in “i”. Since $w_{ii} = 0$, the only difference from Gi(d) is in the denominator.

$$ G_i^{*}(d) = \frac{\sum_{j=1}^{N} w_{ij}(d) \, x_j}{\sum_{j=1}^{N} x_j} $$

for all j; $x_j > 0$; W binary and symmetric.

3.2.1. Getis and Ord’s local statistics (III)
New Gi(d), New Gi*(d): the standardized versions of Gi(d) and Gi*(d). They are distributed as normal variables.
• Significant positive values of these tests = positive spatial autocorrelation = concentration of high values of the variable.
• Significant negative values of these tests = positive spatial autocorrelation = concentration of low values of the variable.

Anselin, L. (1995), “Local indicators of spatial association - LISA”. Geographical Analysis, vol. 27; pp. 93-115.

3.2.2. LISA tests
LISA: Local Indicators of Spatial Association
• They detect the contribution of each location to global spatial autocorrelation.
• Local spatial autocorrelation statistics are useful to identify hot spots (spatial concentrations of high or low values) and spatial outliers.
• Local autocorrelation is always present when there is global spatial autocorrelation, but it can also exist in its absence.
• Local Moran’s I is the most popular.

3.2.2. LISA tests (II)
A LISA gives an indication of the extent of significant spatial clustering of similar values around one observation “i”.
The sum of the LISAs for all observations is proportional to the global Moran’s I.
Local Moran’s I (LISA), for a row-standardised W:

$$ I_i = z_i \sum_{j} w_{ij} z_j $$

$z_i$, $z_j$: standardized $y_i$ values.

OBS   I_DIST01     Z_DIST01     P_DIST01
1     168.0678     -1.845431    0.000557
2     -1.155578    0.842808     0.677512
3     -0.88391     0.342822     0.673284
4     0.044727     0.291255     0.950351
5     -5.440304    1.480166     0.019687

The moments of the $I_i$ statistic, under the null hypothesis of no spatial association, can be derived under a randomisation hypothesis.
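The local statistics of section 3.2 can be illustrated with a small Python sketch using NumPy. Everything here is an assumption made for illustration: the function names (`band_weights`, `local_g`, `local_moran`, `global_moran`), the eight units placed on a line, the distance band d = 1 and the data values. It computes Gi(d) with a binary distance-band W and the local Moran’s I with a row-standardised W, and checks that the local Moran values average to the global Moran’s I.

```python
import numpy as np

def band_weights(coords, d):
    """Binary symmetric W(d): w_ij = 1 if 0 < |coords_i - coords_j| <= d."""
    dist = np.abs(coords[:, None] - coords[None, :])
    return ((dist > 0) & (dist <= d)).astype(float)

def local_g(x, w):
    """G_i(d): sum of x over the neighbours of i, divided by the sum of x_j for j != i."""
    return (w @ x) / (x.sum() - x)

def local_moran(x, w):
    """Local Moran's I_i = z_i * sum_j w_ij z_j, with W row-standardised."""
    w_std = w / w.sum(axis=1, keepdims=True)
    z = (x - x.mean()) / x.std()     # population sd, so sum(z**2) == N
    return z * (w_std @ z), w_std

def global_moran(x, w):
    """Global Moran's I, used here only to check the sum of the local values."""
    z = x - x.mean()
    return len(x) / w.sum() * (z @ w @ z) / (z @ z)

# Toy data: eight units on a line, high values concentrated at the left end.
coords = np.arange(8.0)
x = np.array([10.0, 9.0, 8.0, 1.0, 1.0, 1.0, 1.0, 1.0])

w = band_weights(coords, d=1.0)
g_i = local_g(x, w)
i_local, w_std = local_moran(x, w)

print(np.round(g_i, 3))       # largest at unit 1, inside the cluster of high values
print(np.round(i_local, 3))   # negative at unit 3, where high and low values meet
print(i_local.mean(), global_moran(x, w_std))   # equal for a row-standardised W
```

With standardized values and a row-standardised W, the local values sum to N times the global Moran’s I, which is the proportionality stated above. Assessing the significance of each I_i would require the randomisation moments or a permutation scheme, which this sketch leaves out.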