Spatial Analysis - The University of Texas at Dallas

Spatial Analysis
Concepts and Challenges
Briggs Henan University 2012
Topics
• Description versus Analysis
• The concepts of Process, Pattern and
Analysis
• Issues and challenges in spatial data
analysis
• Measuring space
Description and Analysis
Description
• Most GIS systems are used
by governments and
private companies to
describe the real world
• this helps the organization
“do its job”
– For example, manage sewer
and water networks
– Manage land resources
• Most GIS systems are
primarily designed for this
purpose
– They are used to develop
spatial databases to describe
the real world and help
manage it.
Description and Analysis
Analysis
• Tries to understand the processes which cause or create the patterns in the real world
• Understanding processes:
– Helps the organization do its job better
• Make better decisions, for example
– Helps us understand the phenomenon itself
• This is the role of science
Example question: Are the locations of the software industry different from those of the telecommunications industry?
Here, we are using “centrographic statistics” to help answer this question
Description: Water and Sewer
system
Analysis: Do the locations of the
software and telecommunications
industries differ?
We will talk about analysis.
Process, Pattern and Analysis
• Processes operating in space create patterns
• Spatial Analysis is aimed at:
– Identifying and describing the pattern
– Identifying and understanding the process
Diagram: Processes create (or cause) Patterns
Spatial Analysis is aimed at:
• Identifying and describing the
pattern
The pattern is clearly
clustered
(points are in “groups”)
• Identifying and
understanding the process
Access to transportation.
Agglomeration economies*
from sharing ideas, access to
skilled labor, access to
business services.
*cost savings from many firms
locating in the same area
We will focus on spatial analysis
Process, Pattern and Analysis
• Often, we cannot observe (or “see”) the
process, so we have to infer (“guess at”?) the
process by observing the pattern
Diagram: Processes create (or “cause”) Patterns; Processes are inferred from Patterns (diagram labels: No, Yes)
Spatial Analysis: successive levels of sophistication
Four levels of Spatial Analysis:
--Each is more advanced (more difficult!)
1. Spatial data description:
2. Exploratory Spatial Data Analysis (ESDA)
3. Spatial statistical analysis and hypothesis testing
4. Spatial modeling and prediction
We will look at all 4 levels in this lecture series
More difficult,
but more useful!
(more powerful)
Spatial Analysis: successive levels of sophistication
1. Spatial data description:
– Focus is on describing the world,
and representing it in a digital
format
--computer map
--computer database
– Uses classic GIS capabilities
--buffering, map layer overlay
--spatial queries & measurement
Spatial Analysis: successive levels of sophistication
2. Exploratory Spatial Data Analysis (ESDA):
– searching for patterns and possible explanations
– GeoVisualization through data graphing and
mapping
--Kernel Density Estimation
--Overlay transportation network
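Kernel density estimation can be sketched in a few lines. This is only an illustration: the Gaussian kernel, the bandwidth value, and the sample points are assumptions, not taken from the lecture.

```python
import math

def gaussian_kde_2d(points, x, y, bandwidth):
    """Estimate point density at location (x, y) using a Gaussian kernel.

    Each point contributes more the closer it is to (x, y);
    `bandwidth` controls how quickly its influence fades with distance.
    """
    n = len(points)
    total = 0.0
    for px, py in points:
        d2 = (x - px) ** 2 + (y - py) ** 2
        total += math.exp(-d2 / (2.0 * bandwidth ** 2))
    # Normalize so the estimated surface integrates to 1
    return total / (n * 2.0 * math.pi * bandwidth ** 2)

# Hypothetical firm locations: three close together, one far away
points = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (5.0, 5.0)]
near_cluster = gaussian_kde_2d(points, 1.0, 1.0, bandwidth=0.5)
far_away = gaussian_kde_2d(points, 3.0, 3.0, bandwidth=0.5)
print(near_cluster > far_away)  # density is higher inside the cluster
```

A larger bandwidth gives a smoother surface; a smaller one shows more local detail.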
Spatial Analysis: successive levels of sophistication
2. Exploratory Spatial Data Analysis (ESDA):
– searching for patterns and possible explanations
– GeoVisualization through calculation and display
of Centrographic statistics
--Calculation of Centrographic Statistics
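Two basic centrographic statistics are the mean center and the standard distance. A minimal sketch follows; the coordinates are hypothetical and the function names are chosen only for this illustration.

```python
import math

def mean_center(points):
    """Average X and average Y of a set of points."""
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)

def standard_distance(points):
    """Spread of points around their mean center (spatial standard deviation)."""
    cx, cy = mean_center(points)
    n = len(points)
    return math.sqrt(sum((x - cx) ** 2 + (y - cy) ** 2 for x, y in points) / n)

# Hypothetical locations for two industries
software = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0)]
telecom = [(4.0, 4.0), (6.0, 4.0), (4.0, 6.0), (6.0, 6.0)]

print(mean_center(software), standard_distance(software))
print(mean_center(telecom), standard_distance(telecom))
```

Comparing the two mean centers and standard distances is one simple way to explore whether two industries locate differently.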
Spatial Analysis: successive levels of sophistication
3. Spatial statistical analysis and hypothesis testing
– Are data “to be expected” or are they “unexpected” relative to some statistical model, usually of a random process (pure chance)
We will look at statistical hypothesis testing for:
--point patterns
--also for polygon data
Figure: normal curve centered on 0, with 2.5% in each tail beyond -1.96 and 1.96
We can test if the spatial pattern for software & telecommunications companies in Dallas is
clustered (a pattern) or “random” (no pattern)
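One common test for point patterns is the nearest neighbor (Clark-Evans) test. The slides do not name a specific method, so this choice is an assumption, and the points and study area below are hypothetical. The sketch applies no edge correction.

```python
import math

def nearest_neighbor_z(points, area):
    """Return (ratio, z) for the nearest neighbor test.

    ratio < 1 suggests clustering, ratio > 1 suggests dispersion.
    z compares the observed mean nearest neighbor distance with the
    value expected under a random pattern of the same density.
    """
    n = len(points)
    nn = []
    for i, (xi, yi) in enumerate(points):
        nn.append(min(math.hypot(xi - xj, yi - yj)
                      for j, (xj, yj) in enumerate(points) if j != i))
    d_obs = sum(nn) / n
    density = n / area
    d_exp = 0.5 / math.sqrt(density)          # expected distance if random
    se = 0.26136 / math.sqrt(n * density)     # standard error if random
    return d_obs / d_exp, (d_obs - d_exp) / se

# Hypothetical pattern: two tight groups in a 10 x 10 study area
points = [(1.0, 1.0), (1.1, 1.0), (1.0, 1.1), (1.1, 1.1),
          (8.0, 8.0), (8.1, 8.0), (8.0, 8.1), (8.1, 8.1)]
ratio, z = nearest_neighbor_z(points, area=100.0)
print(ratio, z)  # z is far below -1.96, so "random" is rejected
```

A z value below -1.96 falls in the lower 2.5% tail, so the pattern is judged clustered at the 5% level.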
Spatial Analysis: successive levels of sophistication
4. Spatial modeling: prediction
– Construct models (of processes) to predict spatial outcomes (patterns)
Notice how the density of points (number per square km) decreases as we move away from the highway.
We can construct regression models to predict location patterns.
However, for spatial data, we need special spatial regression models.
Figure: density of points plotted against distance from highway
Density of points = f (distance from highway)
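The idea Density of points = f (distance from highway) can be sketched with an ordinary least squares line. The data values are hypothetical, and this is plain (non-spatial) regression, so it ignores the spatial autocorrelation that spatial regression models are designed to handle.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Hypothetical observations: distance from highway (km), points per square km
distance_km = [1.0, 2.0, 3.0, 4.0, 5.0]
density = [9.8, 8.3, 5.9, 4.2, 1.8]

intercept, slope = fit_line(distance_km, density)
predicted_at_2_5_km = intercept + slope * 2.5
print(intercept, slope, predicted_at_2_5_km)  # slope is negative
```

The negative slope expresses the pattern in the figure: density falls as distance from the highway increases.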
The first example of Spatial Analysis
• John Snow’s maps of cholera in 1850s London
Was it ESDA or hypothesis testing?
• Did he discover the association between water and
cholera after drawing the map: ESDA
• Did he draw the map in order to prove the
association: using a map for hypothesis testing
Maps are good—but more is needed!
A. Is this clustered?
B. Is this clustered?
We must test rigorously using spatial analysis methods.
Not just look and guess
Source: R & Y, p. 5
Why is this important?
?
Is it clustered?
We must measure and test
--not just look and guess!
Because that is science!
Because that is how earth management
decisions must be made!
Why we need analysis
--and not just visual examination
See handout!!!
You need this course!
A
Clustered or random?
B
Clustered or random?
Source: Rogerson and Yamada, 2009
(clustered = points are in “groups”)
= pattern exists
(random = points can be anywhere)
= no pattern
Issues/Challenges/Problems
in Spatial Analysis
Summarize these now.
Talk in greater detail about them
throughout this lecture series.
Critical Issues in Spatial Analysis
• Spatial autocorrelation
– Data from locations near to each other are usually more similar than data from
locations far away from each other
• Modifiable areal unit problem (MAUP: zone effect)
– Results may depend on the specific geographic unit used in the study
– Province or county; county or city
• Scale affects representation and results
– Cities may be represented as points or polygons
– Results depend on the scale at which the analysis is conducted: province or county
– MAUP—scale effect
• Ecological fallacy
– Results obtained from aggregated data (e.g. provinces) cannot be assumed to apply
to individual people
– MAUP—individual effect
• Non-uniformity of Space
– Phenomena are not distributed evenly in space
– Be careful how you interpret results!
• Edge issues
– Edges of the map, beyond which there is no data, can significantly affect results
Critical Issues in Spatial Analysis
• Modifiable areal unit problem (MAUP)
– Results may depend on the specific geographic unit used in the study
– Province or county; county or city
– MAUP—zone effect
• Scale affects representation and results
– Results depend on the scale at which the analysis is conducted
– MAUP—scale effect
• Ecological fallacy
– Results obtained from aggregated data (e.g. provinces) cannot be assumed to apply
to individual people
– MAUP—individual effect
• Non-uniformity of Space
– Phenomena are not distributed evenly in space
– Be careful how you interpret results!
• Edge issues
– Edges of the map, beyond which there is no data, can significantly affect results
• Spatial autocorrelation –the biggest of all!
– Data from locations near to each other are usually more similar than data from
locations far away from each other
Modifiable areal unit problem (MAUP): 3 in 1!!!
– Results may depend on the specific geographic unit used in the study
– Dangerous to assume results for one set of units will also apply for another
Zonal effect: Similar size and number of units,
but different boundaries
• Zip codes versus census tracts
• Postal zones versus city neighborhoods
Scale effect: increases size and decreases number
of units
• Counties versus provinces
Individual effect: ecological fallacy
results from geographic units
may not apply to individual people
Modifiable areal unit problem (MAUP): zonal
• Census Tracts versus Zip codes
• Problem not as big—usually—as for scale differences
Census Tract
(used by US Census Bureau for data)
Zipcode Areas
(used by US Post Office)
Modifiable areal unit problem (MAUP): scale
• Counties versus Provinces
• Usually a bigger problem than for zonal
Will results be the same?
Do conclusions still apply?
Aggregation
• combining smaller units
into bigger units
• affects results!
• Note how:
• variance (s²) decreases
• correlation coefficient (r_XY) increases (because of less variability)
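The effect of aggregation can be seen in a small constructed example. The values are hypothetical and chosen so the effect is easy to see; with real data the correlation usually, but not always, rises after aggregation.

```python
import math
from statistics import mean, pvariance

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two lists."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def aggregate_pairs(values):
    """Combine each pair of adjacent small units into one bigger unit (average)."""
    return [(values[i] + values[i + 1]) / 2.0 for i in range(0, len(values), 2)]

# Eight hypothetical small units: y follows x plus local "noise" of +2 or -2
x_small = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
y_small = [x + (2.0 if i % 2 == 0 else -2.0) for i, x in enumerate(x_small)]

# Four bigger units: the local noise averages out
x_big = aggregate_pairs(x_small)
y_big = aggregate_pairs(y_small)

print(pvariance(y_small), pvariance(y_big))                 # variance decreases
print(pearson_r(x_small, y_small), pearson_r(x_big, y_big))  # correlation increases
```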
Scale:
• Results obtained at one scale do not necessarily apply at other
scales
– A pattern may be clustered at one scale but dispersed at another scale
Population clustered into cities
City populations are dispersed
Scale is always very important in spatial analysis!
Scale: Always Important
– ratio of distance on a map, to the equivalent distance on the earth's surface.
– Scale must be shown on every map
– Use scale bar because that is correct when map is enlarged or reduced
• Affects how objects are represented on a map and how data is stored in a data base
– Important for research design and data collection
– Cities may be points or polygons
Large scale > objects are large, small area covered
Small scale > objects are small, large area covered
Figure: Dallas shown as a point and as a polygon; scale bars of 0 to 2 km and 10 km
Ecological fallacy
Results from aggregated data (e.g. provinces) cannot be applied to individual people
A special case of the MAUP
Encountered in spatial and non-spatial analysis
Usually because a variable was left out (omitted variable)
Figure: plot of crime rate and income
If low income provinces have a high crime rate, we cannot assume low income people commit crimes.
Perhaps low income provinces do not have money to pay for police.
--“Yuan spent on policing” is the omitted variable
Non-uniformity of Space: things are not evenly distributed in space
– Bank robberies are clustered
– But only because banks are clustered
Figure: maps of banks and of bank robberies
Bonnie and Clyde were two very famous bank robbers in Texas in the 1930s
They were asked “Why do you rob banks?”
They replied “Because that is where the money is!
What a stupid question!”
(the expected answer was, perhaps, “because we needed money for food”)
Non-uniformity of Space
and Choropleth Maps
Illiteracy in China
Henan does not have
high illiteracy!
Henan has high
illiteracy!
Non-uniformity of Space and
Choropleth Maps
Always normalize data if drawing a choropleth map
– By total population
– By geographic area
• Do not map “counts” unless population and/or geographic
area are the same size for all observation units
• Failing to “normalize” is a very common mistake made by
non-professional GIS people
– You are professionals
– Do not make that mistake!
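Normalizing by total population can be sketched as follows. The region names and numbers are hypothetical, not real census data.

```python
def rate_per_1000(count, population):
    """Convert a raw count into a rate per 1,000 people."""
    return 1000.0 * count / population

# Hypothetical regions: A has many more people than B
regions = {
    "Region A": {"count": 5000, "population": 10_000_000},
    "Region B": {"count": 1000, "population": 500_000},
}

highest_count = max(regions, key=lambda r: regions[r]["count"])
highest_rate = max(
    regions,
    key=lambda r: rate_per_1000(regions[r]["count"], regions[r]["population"]),
)
print(highest_count)  # Region A
print(highest_rate)   # Region B
```

A choropleth map of counts would highlight Region A, but a map of rates would highlight Region B. The counts mostly show where people are.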
Edge or Boundary Effect
– Every study region has a boundary (unless you study the entire world!)
– You do not have data for outside your study region
– However, the outside data can affect the inside data if there is spatial
autocorrelation
– Consequently, edges of the map, beyond which there is no data, can
significantly affect results
Solutions:
Use Core/Periphery (a core study region surrounded by a periphery or guard area)
--analyze only the core
--use edge only for “neighborhood” calculations
--reduces amount of data available
Use the toroid concept
--bends the left edge to meet the right and the top to meet the bottom
--uses all the data
--assumes that there is no systematic spatial trend in data
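The toroid concept can be sketched as a distance function that wraps around the edges of a rectangular study region. The function name and the 10 x 10 region are assumptions for this illustration.

```python
import math

def toroidal_distance(p, q, width, height):
    """Distance between p and q when the left edge is joined to the right
    edge and the top edge is joined to the bottom edge."""
    dx = abs(p[0] - q[0])
    dy = abs(p[1] - q[1])
    dx = min(dx, width - dx)    # the shorter way may go across the edge
    dy = min(dy, height - dy)
    return math.hypot(dx, dy)

# Two points close to opposite edges of a 10 x 10 region
west = (0.5, 5.0)
east = (9.5, 5.0)
print(math.hypot(west[0] - east[0], west[1] - east[1]))  # 9.0 on the flat map
print(toroidal_distance(west, east, 10.0, 10.0))         # 1.0 on the toroid
```

With this distance, points near an edge have neighbors on every side, so all the data can be used.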
Spatial Autocorrelation
• Spatial organization is usually important
• The results from a traditional regression analysis ignore how the observation
units are organized spatially!
• Data from locations near to each other are usually more similar than
data from locations far away
– Must be considered in your analysis
– Also causes serious problems with traditional statistical hypothesis testing
• Spatial statistical models are essential
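A widely used measure of spatial autocorrelation is Moran's I. The slides do not name it here, so treating it as the measure is an assumption. The sketch uses four hypothetical units in a row, where each unit is a neighbor of the next one.

```python
def morans_i(values, neighbor_pairs):
    """Moran's I with binary, symmetric weights.

    neighbor_pairs lists each pair of neighboring units (i, j) once.
    Positive I: near units are similar. Negative I: near units are different.
    """
    n = len(values)
    m = sum(values) / n
    z = [v - m for v in values]
    # Each pair counts twice because the weights are symmetric (w_ij = w_ji = 1)
    cross = 2.0 * sum(z[i] * z[j] for i, j in neighbor_pairs)
    s0 = 2.0 * len(neighbor_pairs)
    return (n / s0) * (cross / sum(d * d for d in z))

pairs = [(0, 1), (1, 2), (2, 3)]                  # units 0-1-2-3 in a row
i_similar = morans_i([1.0, 1.0, 5.0, 5.0], pairs)  # similar values side by side
i_mixed = morans_i([1.0, 5.0, 1.0, 5.0], pairs)    # values alternate
print(i_similar, i_mixed)  # positive, then negative
```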
What is the most common
mistake in GIS analysis?
Much more basic than any
discussed above.
Single most common error in GIS Analysis
--intending a one to one join of attribute data to spatial table
--getting a one to many join of attribute data to spatial table
Attribute table (51 states; Hawaii is one row)
NAME        FIPS  CODE  POP2000
Alabama     01    AL    4,447,100
Alaska      02    AK    626,932
…..
Georgia     13    GA    8,186,453
Hawaii      15    HI    1,211,537
Idaho       16    ID    1,293,953
…..
Wisconsin   55    WI    5,363,675
Wyoming     56    WY    493,782
Total                   282,421,906

Spatial table (Hawaii drawn as four separate polygons)
FID  SHAPE    NAME           FIPS
0    Polygon  Alabama        01
1    Polygon  Alaska         02
…..
11   Polygon  Georgia        13
12   Polygon  Hawaii-Hawaii  15
13   Polygon  Hawaii-Maui    15
14   Polygon  Hawaii-Oahu    15
15   Polygon  Hawaii-Kauai   15
16   Polygon  Idaho          16
…..
53   Polygon  Wisconsin      55
54   Polygon  Wyoming        56

After joining attribute to spatial data (54 observations)
NAME           FIPS  CODE  POP2000
Alabama        01    AL    4,447,100
Alaska         02    AK    626,932
…..
Georgia        13    GA    8,186,453
Hawaii-Hawaii  15    HI    1,211,537
Hawaii-Maui    15    HI    1,211,537
Hawaii-Oahu    15    HI    1,211,537
Hawaii-Kauai   15    HI    1,211,537
Idaho          16    ID    1,293,953
…..
Wisconsin      55    WI    5,363,675
Wyoming        56    WY    493,782
Total                      286,056,517
Errors with the
attribute data will
occur if Hong Kong or
Guangdong are not
correctly drawn in the
shapefile (spatial data)
If there are islands, the province must be drawn as a multi-part
feature in the shapefile (the spatial data)
--then there is only one row in the attribute table
If each island is drawn as a separate feature, there will be
multiple rows in the attribute table
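The one to many join error can be reproduced with a few rows from the Hawaii example. The population values and FIPS codes come from the slide; the table layout as Python lists and dictionaries is only an illustration.

```python
# Attribute table: one row per state, keyed by FIPS code
attributes = {
    "13": {"CODE": "GA", "POP2000": 8_186_453},
    "15": {"CODE": "HI", "POP2000": 1_211_537},
    "16": {"CODE": "ID", "POP2000": 1_293_953},
}

# Spatial table: Hawaii drawn as four separate features instead of one multi-part feature
spatial = [
    {"NAME": "Georgia", "FIPS": "13"},
    {"NAME": "Hawaii-Hawaii", "FIPS": "15"},
    {"NAME": "Hawaii-Maui", "FIPS": "15"},
    {"NAME": "Hawaii-Oahu", "FIPS": "15"},
    {"NAME": "Hawaii-Kauai", "FIPS": "15"},
    {"NAME": "Idaho", "FIPS": "16"},
]

# Join attribute data to spatial data on FIPS
joined = [{**feature, **attributes[feature["FIPS"]]} for feature in spatial]

true_total = sum(row["POP2000"] for row in attributes.values())
joined_total = sum(row["POP2000"] for row in joined)
print(len(attributes), len(joined))  # 3 states, but 6 joined rows
print(joined_total - true_total)     # Hawaii's population counted 3 extra times

# A simple check before joining: is any join key repeated in the spatial table?
keys = [feature["FIPS"] for feature in spatial]
repeated_keys = sorted({k for k in keys if keys.count(k) > 1})
print(repeated_keys)  # ['15']
```

Checking for repeated keys before the join is a quick way to find features that should have been drawn as one multi-part feature.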
Measuring Space
Not easy!
Spatial is special: 3 primary concepts
– Distance
– Adjacency or neighborhood
– Interaction
Fundamental Spatial Concepts
• Distance
– The magnitude of spatial separation
– Euclidean (straight line) distance often only an approximation
• Adjacency or neighborhood
– Nominal or binary (0,1) equivalent of distance
– Levels of adjacency exist: 1st, 2nd, 3rd nearest neighbor, etc.
• Interaction
– The strength of the relationship between entities
– An inverse function of distance
Distance is not simple!
• Cartesian distance via Pythagoras
$d_{ij} = \sqrt{(X_i - X_j)^2 + (Y_i - Y_j)^2}$
Use for projected data, at local scale
• Spherical distance via spherical coordinates
$\cos d = (\sin a \sin b) + (\cos a \cos b \cos P)$
where:
d = arc distance
a = Latitude of A
b = Latitude of B
P = degrees of longitude from A to B
Use for unprojected data, or at world scale
• Possible distance metrics:
– Euclidean straight line/airline
– city block/Manhattan metric
– distance through network
– time/friction through network
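The Cartesian, city block, and spherical distances can be sketched as small functions. The Earth radius of 6371 km is an assumed mean value, and the function names are chosen only for this illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean radius of the Earth

def euclidean(p, q):
    """Straight line distance for projected (X, Y) coordinates."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

def manhattan(p, q):
    """City block distance: travel along the X direction, then the Y direction."""
    return abs(p[0] - q[0]) + abs(p[1] - q[1])

def spherical_km(lat_a, lon_a, lat_b, lon_b):
    """Arc distance from cos d = sin a sin b + cos a cos b cos P (degrees in)."""
    a = math.radians(lat_a)
    b = math.radians(lat_b)
    p = math.radians(lon_b - lon_a)
    cos_d = math.sin(a) * math.sin(b) + math.cos(a) * math.cos(b) * math.cos(p)
    cos_d = max(-1.0, min(1.0, cos_d))  # guard against rounding just outside [-1, 1]
    return EARTH_RADIUS_KM * math.acos(cos_d)

print(euclidean((0.0, 0.0), (3.0, 4.0)))   # 5.0
print(manhattan((0.0, 0.0), (3.0, 4.0)))   # 7.0
print(spherical_km(0.0, 0.0, 0.0, 90.0))   # one quarter of the way around the equator
```

Network distance and travel time need a road network data set, so they are not sketched here.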
Spatial neighbors based on adjacency
Figure: adjacency shown for a square raster, hexagons, and irregular polygons
Rook: sharing a boundary
Queen: sharing a boundary or a point
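For a square raster, rook and queen neighbors can be listed from row and column offsets. A minimal sketch, with function names chosen only for this illustration:

```python
ROOK_OFFSETS = [(-1, 0), (1, 0), (0, -1), (0, 1)]                  # share a boundary
QUEEN_OFFSETS = ROOK_OFFSETS + [(-1, -1), (-1, 1), (1, -1), (1, 1)]  # or a point

def neighbors(row, col, n_rows, n_cols, offsets):
    """Cells adjacent to (row, col) that lie inside the raster."""
    result = []
    for d_row, d_col in offsets:
        r, c = row + d_row, col + d_col
        if 0 <= r < n_rows and 0 <= c < n_cols:
            result.append((r, c))
    return result

# Center cell and corner cell of a 3 x 3 raster
print(len(neighbors(1, 1, 3, 3, ROOK_OFFSETS)))   # 4
print(len(neighbors(1, 1, 3, 3, QUEEN_OFFSETS)))  # 8
print(len(neighbors(0, 0, 3, 3, ROOK_OFFSETS)))   # 2
print(len(neighbors(0, 0, 3, 3, QUEEN_OFFSETS)))  # 3
```

The corner cell has fewer neighbors than the center cell. This is a small example of the edge effect.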
1st and 2nd order adjacency
Figure: 1st order and 2nd order neighbors shown for rook, hexagon, and queen adjacency
Interaction
Based on the Gravity Model
$I_{ij} = \kappa \left( \frac{P_i^{\alpha} P_j^{\beta}}{e^{\gamma d_{ij}}} \right)$
Gravity Model: Interaction
between i and j is a function of:
Pi --the population (size) at i
Pj --the population (size) at j
dij --the distance from i to j
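The gravity model can be sketched as a function, assuming the exponential distance decay form I_ij = κ P_i^α P_j^β / e^(γ d_ij). The parameter values and the populations below are hypothetical.

```python
import math

def gravity_interaction(pop_i, pop_j, distance,
                        kappa=1.0, alpha=1.0, beta=1.0, gamma=0.1):
    """Interaction between places i and j.

    Grows with the two populations and shrinks as distance increases.
    kappa, alpha, beta, and gamma are illustrative defaults, not fitted values.
    """
    return kappa * (pop_i ** alpha) * (pop_j ** beta) / math.exp(gamma * distance)

near = gravity_interaction(100.0, 200.0, distance=5.0)
far = gravity_interaction(100.0, 200.0, distance=50.0)
print(near > far)  # True: interaction is an inverse function of distance
```

In practice the parameters are estimated from observed flows between places.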
Based on a Hierarchy
How do you fly from Zhengzhou to Wuhan?
Figure: Central Place Hierarchy (Shanghai, Zhengzhou, Wuhan)
Today, we discussed spatial analysis
and some of its problems and
challenges
However, to do spatial analysis you
must have spatial data
Next time:
Spatial data and why it is special!
Texts
O’Sullivan, David and David Unwin, 2010.
Geographic Information Analysis. Hoboken, NJ:
John Wiley, 2nd ed.
Other Useful Books:
Mitchell, Andy 2005. ESRI Guide to GIS Analysis Volume 2: Spatial
Measurement & Statistics. Redlands, CA: ESRI Press.
Allen, David W 2009. GIS Tutorial II: Spatial Analysis Workbook.
Redlands, CA: ESRI Press.
Wong, David W.S. and Jay Lee 2005. Statistical Analysis of Geographic
Information. Hoboken, NJ: John Wiley, 2nd ed.