Introduction to Spatial Regression Analysis

ICPSR Summer Program 2010
Paul R. Voss (1) and Katherine J. Curtis (2)
(1) University of North Carolina at Chapel Hill
(2) University of Wisconsin-Madison

(1) Odum Institute for Research in Social Science
    Manning Hall, CB #3355
    University of North Carolina at Chapel Hill
    Chapel Hill, NC 2759
    paul_voss@unc.edu

(2) Department of Community & Environmental Sociology
    1450 Linden Drive
    University of Wisconsin-Madison
    Madison WI 53706
    kcurtis@ssc.wisc.edu
Objectives
The goal of this five-day course is to provide an overview of applied spatial regression analysis
(spatial econometrics) that will enable participants to effectively incorporate these tools into their
own empirical research. The course will introduce the broader field of spatial data analysis and
the range of issues that generally must be dealt with when analyzing georeferenced data on a
lattice. Census-type data are among the most commonly encountered data that conform to this
description, although the course acknowledges the wider range of data appropriate for spatial
regression analysis. In general, this is NOT a course where significant attention can be given to
spatial analyses involving so-called geostatistical data or point pattern data. It also is not a GIS
course.
Course Materials and Organization
The course will convene each day from 9:00 a.m. until approximately 4:30 p.m., except for the
last day (Friday), when the course likely will wind down earlier to enable participants who must
meet Friday evening flights to do so. The course is organized into a format that includes
morning lectures (theoretical and conceptual underpinnings) and afternoon computing lab
sessions (hands-on applications). We will attempt to set aside the last half hour or more of each
day for group discussion of the topics introduced that day. Course materials are organized such
that the readings supplement and provide greater detail on the topics covered in the classroom.
Many more topics are introduced in the course lectures (assisted by PowerPoint) than can
reasonably be absorbed in five intensive days, so the readings provide a point of return for
review and deeper understanding of the topics covered, as well as a source of references for
further reading. The lab exercises are guided by written, step-by-step tutorial instructions so
that they can be repeated (and more fully absorbed) at a later time. Recommended readings
and lab exercises are available on-line.
Software
The course will primarily use the spatial analysis package GeoDa™ and the open-source
programming environment R.
OUTLINE OF COURSE
Day 1 Morning:
1. Welcome and introductions
2. Review of objectives and overview of plan for the week
3. Goal and overview for Day 1
4. Motivational example
5. Understanding spatial data
   a. Overview of spatial data and spatial data analysis
   b. Spatial analysis vs. spatial data analysis
   c. Classes of problems in spatial data analysis
   d. Spatial vs. non-spatial data analysis
6. Why spatial is special
   a. Characteristics of spatial data
   b. Problems often caused by spatial data
7. Review OLS estimation
   a. Assumptions of the classical linear regression model
   b. Consequences of violation of the assumptions
8. The importance of exploratory data analysis (EDA) and exploratory spatial data
   analysis (ESDA)
9. Orientation to afternoon lab: Introduction to shapefiles (attribute data and digital map
   married up) and elementary ESDA using GeoDa™. Orientation to univariate EDA
   using R
Day 1 Afternoon:
1. Introduction to “our” shapefile; your task: Begin thinking now about hypotheses,
   models, and analyses
2. Reading a shapefile into GeoDa™
3. Simple mapping operations using GeoDa™
4. Univariate EDA using R
Day 1 Readings:
1. [Older reading but nice perspective] Anselin, Luc. 1989. “What is Special About
   Spatial Data? Alternative Perspectives on Spatial Data Analysis.” NCGIA
   Technical Paper 89-4.
2. [Recent & highly accessible reading] Ward, Michael D., & Kristian Skrede
   Gleditsch. 2008. Spatial Regression Models. Quantitative Applications in the
   Social Sciences, No. 155. Thousand Oaks, CA: Sage. Chapter 1. [May be found
   as a downloadable PDF at: http://www.duke.edu/web/methods/ ]
3. [Together with the following reading, a nice motivational example] Loftin, Colin &
   Sally K. Ward. 1983. “A Spatial Autocorrelation Model of the Effects of Population
   Density on Fertility.” American Sociological Review 48(1):121-128.
4. Galle, Omer R., Walter R. Gove, & J. Miller McPherson. 1972. “Population
   Density and Pathology: What Are the Relations for Man?” Science (new series)
   176:23-30.
Day 2 Morning:
1. Q&A from readings or 1st day lecture or lab
2. Goal for Day 2: ESDA & spatial autocorrelation
3. Data exploration:
   a. Distribution aspects of dependent variable
   b. QQ plots
   c. Linearity between dependent variable and independent variables
   d. Variable transformations; Box-Cox transformations
4. Global spatial autocorrelation & weights matrices
   a. What it is
   b. How it arises; spatial processes
      i. Spatial heterogeneity
      ii. Spatial dependence
   c. Consequences of spatial autocorrelation
   d. How to measure it
      i. Weights matrices
      ii. Global measures of spatial autocorrelation
         a. Global Moran statistic
         b. Global Geary statistic
         c. Problems with global measures
5. Local measures of spatial autocorrelation
   a. Local Moran
   b. Moran scatterplot
   c. LISA mapping
6. Orientation to afternoon lab: ESDA and spatial autocorrelation with GeoDa™ and
   similar work in R
Day 2 Afternoon:
1. Introduction to ESDA
2. ESDA with GeoDa™ and R
3. Creating and comparing weights matrices
4. Global spatial autocorrelation in GeoDa™ and R
5. Local spatial autocorrelation in GeoDa™ and R
Day 2 Readings:
1. [Introduction to a key diagnostic tool in spatial data analysis] Anselin, Luc. 1996.
   “The Moran Scatterplot as an ESDA Tool to Assess Local Instability in Spatial
   Association.” Pp. 111-125 in Fischer, Manfred, Henk J. Scholten, & David Unwin
   (eds.) Spatial Analytical Perspectives on GIS: GISDATA 4 (London: Taylor &
   Francis).
2. [The foundational reading for LISA statistics] Anselin, Luc. 1995. “Local Indicators
   of Spatial Association – LISA.” Geographical Analysis 27(2):93-115.
3. [Nice example of ESDA] Messner, Steven F., et al. 1999. “The Spatial Patterning of
   County Homicide Rates: An Application of Exploratory Spatial Data Analysis.”
   Journal of Quantitative Criminology 15(4):423-450.
Day 3 Morning:
1. Q&A from readings or 2nd day lecture or lab
2. Goal for Day 3: Understanding spatial regression
3. Spatial processes
   a. Spatial heterogeneity
      i. Define
      ii. Causes of
      iii. Problems arising from
      iv. Correcting for spatial heterogeneity
      v. GWR preview
   b. Spatial dependence
      i. Define
      ii. Causes of
         a. True contagion vs. false contagion
      iii. Expressions of
         a. Lagged dependent variable
         b. Unresolved heterogeneity; error lag
      iv. Corrections for
         a. Spatial lag model
         b. Spatial error model
         c. What these models mean/imply
         d. Relationship between the two models
         e. Higher order models
4. Common modeling strategy
   a. Specify and estimate OLS model
   b. Analyze the regression diagnostics
   c. Specify spatial model
   d. MLE fundamentals
5. Understanding the regression diagnostics provided by GeoDa™
   a. Information criteria statistics
   b. Normality of errors
   c. Heteroskedasticity
   d. Lagrange multiplier statistics
6. Orientation to afternoon lab: OLS & spatial regression modeling with GeoDa™ and R
Day 3 Afternoon:
1. OLS regression in GeoDa™ and R
2. GeoDa™ diagnostics and their implications
3. Spatial regression models in GeoDa™ and R
Day 3 Readings:
1. [A very strong foundational reading] Anselin, Luc, & Anil Bera. 1998. “Spatial
   Dependence in Linear Regression Models with an Introduction to Spatial
   Econometrics.” Chapter 7 (pp. 237-289) in Aman Ullah and David Giles (eds.)
   Handbook of Applied Economic Statistics (New York: Marcel Dekker).
2. [Overview of spatial econometric regression models] Anselin, Luc. 2002. “Under
   the Hood: Issues in the Specification and Interpretation of Spatial Regression
   Models.” Agricultural Economics 27(3):247-267.
3. [Good reading for understanding of the dataset used in this course] Voss, Paul R.,
   David D. Long, Roger B. Hammer, & Samantha Friedman. 2006. “County Child
   Poverty Rates in the U.S.: A Spatial Regression Approach.” Population Research
   and Policy Review 25:369-391.
4. [Terrific example of grounding a spatial data analysis in theory] Baller, Robert D., &
   Kelly K. Richardson. 2002. “Social Integration, Imitation, and the Geographic
   Patterning of Suicide.” American Sociological Review 67(6):873-888.
5. [Wonderful overview of spatial error and spatial lag regression models] Ward,
   Michael D., & Kristian Skrede Gleditsch. 2008. Spatial Regression Models.
   Quantitative Applications in the Social Sciences, No. 155. Thousand Oaks, CA:
   Sage. Chapters 2 & 3.
Day 4 Morning:
1. Q&A from readings or 3rd day lecture or lab
2. Goal for Day 4: Understanding spatial heterogeneity in relationships
3. Brief digression to examine spatial smoothing using Empirical Bayes approach
4. Introduction to GWR
   a. Theory and concept
   b. Local multivariate methods for spatial data analysis
      i. Spatial expansion model
      ii. Spatial adaptive filtering
      iii. Multilevel modeling
      iv. Random coefficient models
5. GWR analytical steps
6. What it means
   a. Spatial regime analysis
   b. GWR as a specification tool (interaction effects)
   c. GWR as a tool for policy analysis and decision making
7. Cautions with GWR
8. Orientation to afternoon lab: GWR
Day 4 Afternoon:
1. GWR in R
2. Spatial regime analysis in R
Day 4 Readings:
1. [Understanding GWR] Fotheringham, A. Stewart, & Chris Brunsdon. 1999. “Local
   Forms of Spatial Analysis.” Geographical Analysis 31(4):340-358.
2. [GWR has its critics] Wheeler, David, & Michael Tiefelsdorf. 2005. “Multicollinearity
   and Correlation among Local Regression Coefficients in Geographically Weighted
   Regression.” Journal of Geographical Systems 7:161-187.
3. [Excellent example of regime analysis] O’Loughlin, John, Colin Flint, & Luc Anselin.
   1994. “The Geography of the Nazi Vote: Context, Confession, and Class in the
   Reichstag Election of 1930.” Annals of the Association of American Geographers
   84(3):351-380.
4. [How to for R] Anselin, Luc. 2007. “Spatial Regimes.” Pp. 107-115 in Spatial
   Regression Analysis in R: A workbook. (CSISS)
Day 5 Morning:
1. Q&A from readings or 4th day lecture or lab
2. Goal for Day 5: Dealing with spatial autocorrelation using multilevel analysis
3. Defining “place” and “space”
4. Conceptual motivations for multilevel modeling
5. Statistical motivations
6. Basic two-level multilevel model (continuous outcome)
7. Generalized multilevel model (binary outcome)
8. Orientation to afternoon lab: Multilevel analysis in R
Day 5 Afternoon:
1. Multilevel analysis in R
2. Final questions & consultations regarding student data analyses and plans
Day 5 Readings:
1. [Classical, early approach to context] Entwisle, Barbara, John B. Casterline, &
   Hussein A.A. Sayed. 1989. “Villages as Contexts for Contraceptive Behavior in Rural
   Egypt.” American Sociological Review 54(6):1019-1034.
2. [More contemporary example] Baumer, Eric P., Steven F. Messner, & Richard
   Rosenfeld. 2003. “Explaining Spatial Variation in Support for Capital Punishment: A
   Multilevel Analysis.” American Journal of Sociology 108(4):844-875.
3. [Summary/Introduction] Teachman, Jay, & Kyle Crowder. 2002. “Multilevel Models in
   Family Research: Some Conceptual and Methodological Issues.” Journal of Marriage
   and Family 64(2):280-294.
4. [Bringing in space] Goldstein, H., J. Rasbash, W. Browne, et al. 2000. “Multilevel
   Models in the Study of Dynamic Household Structures.” European Journal of
   Population 16:373–88.
5. Chaix, Basile, Juan Merlo, S.V. Subramanian, John Lynch, & Pierre Chauvin. 2005.
   “Comparison of a Spatial Perspective with the Multilevel Analytical Approach in
   Neighborhood Studies: The Case of Mental and Behavioral Disorders due to
   Psychoactive Substance Use in Malmö, Sweden, 2001.” American Journal of
   Epidemiology 162(2):171-182.
6. [How to for R] Bliese, Paul. 2009. “Multilevel Modeling in R (2.3): A brief
   introduction to R, the multilevel package and the nlme package.”
   http://cran.r-project.org/doc/contrib/Bliese_Multilevel.pdf
7. [Primer] Luke, Douglas. 2004. Multilevel Modeling. Thousand Oaks, CA: Sage.