Spatial Data
What is special about Spatial Data?
Briggs Henan University 2012
What is needed for spatial analysis?
1. Location information: a map
2. An attribute dataset: e.g. population, rainfall
3. Links between the locations and the attributes
4. Spatial proximity information
– Knowledge about relative spatial location
– Topological information
Topology
--knowledge about relative spatial positioning
Topography
--the form of the land surface, in particular, its elevation
Berry’s geographic matrix
Berry, B.J.L. (1964) Approaches to regional analysis: A synthesis. Annals of the Association of American Geographers, 54, pp. 2-11
[Figure: Berry's geographic matrix. One axis is location (areal unit 1, areal unit 2, ..., areal unit n; e.g. Henan, Shanxi), one axis is attributes or variables (Variable 1, Variable 2, ..., Variable P; e.g. Population, Income), and one axis is time (1990, 2000, 2010). Labels in the figure: geographic fact, geographic distribution, geographic associations.]
Types of Spatial Data
• Continuous (surface) data
• Polygon (lattice) data
• Point data
• Network data
Spatial data type 1: Continuous
(Surface Data)
• Spatially continuous data
– attributes exist everywhere
• There are an infinite number of
locations
– But, attributes are usually only
measured at a few locations
• There is a sample of point
measurements
• e.g. precipitation, elevation
– A surface is used to represent
continuous data
Spatial data type 2: Polygon Data
• polygons completely
covering the area*
– Attributes exist and are
measured at each location
– Area can be:
• irregular (e.g. US state or
China province boundaries)
• regular (e.g. remote sensing
images in raster format)
*Polygons completely covering an area are
called a lattice
Spatial data type 3: Point data
• Point pattern
– The locations are the focus
– In many cases, there is no attribute involved
Spatial data type 4: Network data
• Attributes may measure
– the network itself (the roads)
– Objects on the network (cars)
• We often treat network
objects as point data, which
can cause serious errors
– Crimes occur at addresses on
networks, but we often treat
them as points
See: Yamada and Thill (2007) Local indicators of network-constrained clusters in
spatial point patterns. Geographical Analysis 39 (3), pp. 268-292
Which will we study?
Point data
(point pattern analysis: clustering and dispersion)
Polygon data*
(polygon analysis: spatial autocorrelation and spatial regression)
Continuous data*
(Surface analysis: interpolation, trend surface analysis and kriging)
*in the fall semester
Converting from one type of data
to another.
--very common in spatial analysis
Converting point to continuous data:
interpolation
Interpolation
• Finding attribute values at locations where
there is no data, using locations with known
data values
• Usually based on
– Value at known location
– Distance from known location
• Methods used
– Inverse distance weighting
– Kriging
[Figure: simple linear interpolation between known and unknown locations]
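As a rough illustration of inverse distance weighting, the sketch below estimates a value at an unmeasured location from a handful of known points. The function name `idw`, the `power` parameter and the sample values are assumptions made for this example, not part of the lecture material.

```python
import math

def idw(known, x, y, power=2.0):
    """Inverse distance weighted estimate at (x, y).

    known: list of (xi, yi, value) sample points (hypothetical data).
    power: distance exponent; 2 is a common but arbitrary choice.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for xi, yi, value in known:
        d = math.hypot(x - xi, y - yi)
        if d == 0.0:
            # The location coincides with a sample: use its value directly.
            return value
        w = 1.0 / d ** power          # nearer samples get larger weights
        weighted_sum += w * value
        weight_total += w
    return weighted_sum / weight_total

# Made-up rainfall samples: (x, y, value)
samples = [(0.0, 0.0, 10.0), (10.0, 0.0, 20.0), (0.0, 10.0, 30.0)]
print(idw(samples, 0.0, 0.0))   # at a known location
print(idw(samples, 5.0, 0.0))   # between known locations
```

The estimate always lies between the smallest and largest known values, which is one reason kriging or trend surfaces are sometimes preferred.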
Converting point data to polygons
using Thiessen polygons
Thiessen or Proximity Polygons
(also called Dirichlet or Voronoi Polygons)
• Polygons created from a point
layer
• Each point has a polygon (and
each polygon has one point)
• any location within the polygon
is closer to the enclosed point
than to any other point
• space is divided as ‘evenly’ as
possible between the polygons
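The defining property (every location belongs to the polygon of its closest point) can be checked without building any polygon geometry. The sketch below is a brute-force illustration; the point names, coordinates and the function `nearest_generator` are hypothetical.

```python
import math

def nearest_generator(points, x, y):
    """Return the name of the point whose Thiessen polygon contains (x, y).

    points: dict mapping a point name to its (x, y) coordinates.
    A location lies in the polygon of whichever point is closest to it.
    """
    return min(points, key=lambda name: math.dist(points[name], (x, y)))

# Hypothetical point layer
stations = {"A": (0.0, 0.0), "B": (10.0, 0.0), "C": (0.0, 10.0)}
print(nearest_generator(stations, 2.0, 1.0))
print(nearest_generator(stations, 8.0, 1.0))
```

This only answers "which polygon am I in?"; constructing the polygon boundaries themselves is normally left to a GIS or a computational geometry library.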
How to create Thiessen Polygons
1. Connect point
to its nearest
(closest) neighbor
2. Draw
perpendicular
line at midpoint
3. Repeat for
other points
4. Thiessen
polygons
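Steps 1 and 2 above amount to computing a perpendicular bisector. A minimal sketch, assuming two points given as (x, y) tuples; the function name and sample coordinates are made up for illustration.

```python
import math

def perpendicular_bisector(p, q):
    """Return (midpoint, direction) of the perpendicular bisector of p-q.

    The bisector passes through the midpoint and runs along `direction`,
    which is the segment vector rotated by 90 degrees.
    """
    mid = ((p[0] + q[0]) / 2.0, (p[1] + q[1]) / 2.0)
    direction = (-(q[1] - p[1]), q[0] - p[0])
    return mid, direction

p, q = (0.0, 0.0), (4.0, 2.0)
mid, direction = perpendicular_bisector(p, q)
# Any point on the bisector is equally far from p and q.
on_line = (mid[0] + direction[0], mid[1] + direction[1])
print(mid, direction)
print(math.dist(on_line, p), math.dist(on_line, q))
```

The edges of a Thiessen polygon are pieces of such bisectors, which is why every location inside is closest to the enclosed point.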
Converting polygon to point data using
Centroids
• Centroid—the balancing point for a polygon
• used to apply point pattern analysis to polygon data
• More about this later
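One common way to compute the balancing point is the area-weighted centroid formula based on the shoelace sum. The sketch below assumes a simple (non-self-intersecting) polygon stored as a closed ring; the function name and the sample square are illustrative only.

```python
def polygon_centroid(ring):
    """Area-weighted centroid of a simple polygon.

    ring: list of (x, y) vertices with the first vertex repeated at the end.
    """
    twice_area = 0.0
    cx = 0.0
    cy = 0.0
    for (x0, y0), (x1, y1) in zip(ring, ring[1:]):
        cross = x0 * y1 - x1 * y0
        twice_area += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    if twice_area == 0.0:
        raise ValueError("polygon has zero area")
    # Centroid = sum / (6 * area) and area = twice_area / 2.
    return cx / (3.0 * twice_area), cy / (3.0 * twice_area)

square = [(0, 0), (4, 0), (4, 4), (0, 4), (0, 0)]
print(polygon_centroid(square))
```

For strongly concave polygons the centroid can fall outside the polygon, which matters when it is used as a representative point.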
Using a polygon to represent a set of
points: Convex Hull
• the smallest convex polygon able to
contain a set of points
– no concave angles pointing inward
• A rubber band wrapped around a set of
points
• “reverse” of the centroid
• Convex hull often used to create the
boundary of a study area
– a “buffer” zone often added
– Used in point pattern analysis to solve the
boundary problem.
• Called a “guard zone”
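The "rubber band" idea can be sketched with Andrew's monotone chain algorithm, one of several standard convex hull methods (the lecture does not say which one a given GIS uses). The function name and sample points are assumptions.

```python
def convex_hull(points):
    """Convex hull by Andrew's monotone chain.

    Returns hull vertices in counter-clockwise order, starting from the
    lowest-x point. Points on hull edges are dropped.
    """
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        # > 0 when o -> a -> b turns left (counter-clockwise)
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)

    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)

    # The last point of each chain is the first point of the other.
    return lower[:-1] + upper[:-1]

events = [(0, 0), (4, 0), (4, 4), (0, 4), (2, 2), (1, 3)]
print(convex_hull(events))
```

Interior points such as (2, 2) and (1, 3) do not appear in the result, so the hull has no inward-pointing angles.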
Models for Spatial Data:
Raster and Vector
two alternative methods for
representing spatial data
Concept of Vector and Raster
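[Figure: a real-world scene (river, house, trees) shown two ways. Raster representation: a 10 x 10 grid with rows and columns numbered 0 to 9, cells coded R (river), H (house), T (trees). Vector representation: the same features drawn as a point, a line and a polygon.]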
Comparing Raster and Vector Models
Raster Model
• area is covered by grid with (usually) equal-size, square cells
• attributes are recorded by giving each cell a single value based
on the majority feature (attribute) in the cell, such as land use
type or soil type
• Image data is a special case of raster data in which the “attribute”
is a reflectance value from the electromagnetic spectrum
– cells in image data often called pixels (picture elements)
Vector Model
The fundamental concept of vector GIS is that all geographic
features in the real world can be represented either as:
• points or dots (nodes): trees, poles, fire plugs, airports, cities
• lines (arcs): streams, streets, sewers
• areas (polygons): land parcels, cities, counties, forest, rock type
Because representation depends on shape, ArcGIS refers to files containing
vector data as shapefiles
Raster model
Land use (or soil type)
[Figure labels: wheat, fruit, clover, corn, fruit]
    0 1 2 3 4 5 6 7 8 9
0   1 1 1 1 1 4 4 5 5 5
1   1 1 1 1 1 4 4 5 5 5
2   1 1 1 1 1 4 4 5 5 5
3   1 1 1 1 1 4 4 5 5 5
4   1 1 1 1 1 4 4 5 5 5
5   2 2 2 2 2 2 2 3 3 3
6   2 2 2 2 2 2 2 3 3 3
7   2 2 2 2 2 2 2 3 3 3
8   2 2 4 4 2 2 2 3 3 3
9   2 2 4 4 2 2 2 3 3 3
Image
Each cell (pixel) has a value between 0 and 255 (8 bits)
[Figure: example pixel values 21 and 186]
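The land use grid above maps naturally onto a list of rows. The sketch below stores the same 10 x 10 values and counts how many cells carry each code; which crop each code stands for is not stated on the slide, so the codes are left as plain integers.

```python
from collections import Counter

# The 10 x 10 land use raster from the slide, one inner list per row.
grid = (
    [[1, 1, 1, 1, 1, 4, 4, 5, 5, 5] for _ in range(5)]
    + [[2, 2, 2, 2, 2, 2, 2, 3, 3, 3] for _ in range(3)]
    + [[2, 2, 4, 4, 2, 2, 2, 3, 3, 3] for _ in range(2)]
)

def cell_value(raster, row, col):
    """Look up the single value stored for one cell."""
    return raster[row][col]

# Area per class is just a cell count times the (constant) cell area.
counts = Counter(value for row in grid for value in row)
print(cell_value(grid, 8, 2))
print(sorted(counts.items()))
```

Because every cell has the same size, area calculations reduce to counting cells, one of the practical attractions of the raster model.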
Vector Model
• point (node): 0-dimensions
– single x,y coordinate pair
– zero area
– tree, oil well, location for label
– example: Point: 7,2 (x=7, y=2)
• line (arc): 1-dimension
– two connected x,y coordinates
– road, stream
– A network is simply 2 or more
connected lines
– example: Line: 7,2 8,1
• polygon: 2-dimensions
– four or more ordered and
connected x,y coordinates
– first and last x,y pairs are the
same
– encloses an area
– county, lake
– example: Polygon: 7,2 8,1 7,1 7,2
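The three slide examples (Point 7,2; Line 7,2 8,1; Polygon 7,2 8,1 7,1 7,2) can be written down directly as coordinate lists. The helper names below are assumptions for this sketch, not the API of any GIS package.

```python
import math

point = (7, 2)
line = [(7, 2), (8, 1)]
polygon = [(7, 2), (8, 1), (7, 1), (7, 2)]   # first and last pair identical

def is_closed_ring(coords):
    """A polygon ring needs at least four pairs, the last repeating the first."""
    return len(coords) >= 4 and coords[0] == coords[-1]

def line_length(coords):
    """Total length of the connected segments."""
    return sum(math.dist(a, b) for a, b in zip(coords, coords[1:]))

def ring_area(ring):
    """Enclosed area from the shoelace formula."""
    twice_area = sum(x0 * y1 - x1 * y0
                     for (x0, y0), (x1, y1) in zip(ring, ring[1:]))
    return abs(twice_area) / 2.0

print(is_closed_ring(polygon), is_closed_ring(line))
print(line_length(line))
print(ring_area(polygon))
```

The dimensions show up in what can be measured: a point has neither length nor area, a line has length only, and a polygon has both a boundary length and an area.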
Using raster and vector models to
represent surfaces
Representing Surfaces
with raster and vector models –3 ways
• Contour lines
– Lines of equal surface value
– Good for maps but not computers!
• Digital elevation model (raster)
– raster cells record surface value
• TIN (vector)
– Triangulated Irregular Network (TIN)
– triangle vertices (corners) record surface
value
Contour Lines (isolines)
for surface representation
Contour lines of constant elevation
--also called isolines (iso = equal)
Advantages
• Easy to understand (for most people!)
– Circle = hill top (or basin)
– Downhill > = ridge
– Uphill < = valley
– Closer lines = steeper slope
Disadvantages
• Not good for computer representation
• Lines difficult to store in computer
Raster
for surface representation
Each cell in the raster records the height (elevation) of the surface
[Figure: a surface, its contour lines, and the raster cells that contain the elevation values]
Triangulated Irregular Network (TIN):
Vector surface representation
• a set of non-overlapping
triangles formed from
irregularly spaced points
• preferably, points are located at
“significant” locations,
– bottom of valleys, tops of ridges
• Each corner of the triangle
(vertex) has:
– x, y horizontal coordinates
– z vertical coordinate measuring
elevation.
[Figure: TIN with numbered vertices 1 to 5; labels: valley, ridge, vertex]
Point #   X    Y    Z
1         10   30   160
2         25   30   150
3         30   25   140
4         15   20   130
etc
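Elevation anywhere inside a TIN triangle is usually obtained by linear interpolation between its three vertices. The sketch below does this with barycentric weights, using vertices 1, 2 and 4 from the table; treating those three points as one triangle is an assumption, since the slide's figure is not reproduced here.

```python
def interpolate_z(triangle, x, y, tol=1e-12):
    """Linear (planar) interpolation of z inside one TIN triangle.

    triangle: three (x, y, z) vertices.
    Returns None when (x, y) falls outside the triangle.
    """
    (x1, y1, z1), (x2, y2, z2), (x3, y3, z3) = triangle
    det = (y2 - y3) * (x1 - x3) + (x3 - x2) * (y1 - y3)
    if det == 0:
        raise ValueError("degenerate triangle")
    # Barycentric weights: each is 1 at its own vertex, 0 on the opposite edge.
    w1 = ((y2 - y3) * (x - x3) + (x3 - x2) * (y - y3)) / det
    w2 = ((y3 - y1) * (x - x3) + (x1 - x3) * (y - y3)) / det
    w3 = 1.0 - w1 - w2
    if min(w1, w2, w3) < -tol:
        return None
    return w1 * z1 + w2 * z2 + w3 * z3

# Vertices 1, 2 and 4 from the table (assumed to form one triangle).
tri = [(10, 30, 160), (25, 30, 150), (15, 20, 130)]
print(interpolate_z(tri, 10, 30))   # at vertex 1
print(interpolate_z(tri, 15, 25))   # inside the triangle
print(interpolate_z(tri, 0, 0))     # outside the triangle
```

Each triangle is treated as a flat facet, so slope and aspect are constant within a triangle and change only at its edges.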
Draft: How to Create a TIN
surface:
from points to surfaces
Links together all spatial concepts: point, line, polygon, surface
Using raster and vector models to
represent polygons
(and points and lines)
Representing Polygons
(and points and lines)
with raster and vector models
• Raster model not good
– not accurate
    0 1 2 3 4 5 6 7 8 9   (X)
0   1 1 1 1 1 4 4 5 5 5
1   1 1 1 1 1 4 4 5 5 5
2   1 1 1 1 1 4 4 5 5 5
3   1 1 1 1 1 4 4 5 5 5
4   1 1 1 1 1 4 4 5 5 5
5   2 2 2 2 2 2 2 3 3 3
6   2 2 2 2 2 2 2 3 3 3
7   2 2 2 2 2 2 2 3 3 3
8   2 2 4 4 2 2 2 3 3 3
9   2 2 4 4 2 2 2 3 3 3
• Also a big challenge for the vector model
– but much more accurate
– the solution to this challenge resulted in the
modern GIS system
Using Raster model for points,
lines and polygons
--not good!
For points
– Point located at cell center
--even if it's not
– Point “lost” if two
points in one cell
For lines and polygons
– Line not accurate
– Polygon boundary
not accurate
Using vector model to represent
points, lines and polygons:
Node/Arc/Polygon Topology
The relationships between all spatial elements (points, lines, and polygons) are defined
by four concepts:
• Node-Arc relationship
– specifies which points (nodes) are connected to form arcs (lines)
• Arc-Arc relationship
– specifies which arcs are connected to form networks
• Polygon-Arc relationship
– defines polygons (areas) by specifying
which arcs form their boundary
• From-To relationship on all arcs (New!)
– Every arc has a direction from a node to a node
– This establishes left side and right side of an arc (e.g. street)
– Also polygon on the left and polygon on the right for
every side of the polygon
[Figure: an arc drawn from its "from" node to its "to" node, with its Left and Right sides labeled]
Node/Arc/Polygon and Attribute Data
Example of computer implementation
[Figure: parcel A34 (Smith Estate) bounded by arcs I, II, III, IV joining nodes 1, 2, 3, 4; neighboring parcel A35; streets Birch and Cherry]

Spatial Data
Node Table
Node ID   Easting   Northing
1         126.5     578.1
2         218.6     581.9
3         224.2     470.4
4         129.1     471.9
Arc Table
Arc ID   From N   To N   L Poly   R Poly
I        4        1               A34
II       1        2               A34
III      2        3      A35      A34
IV       3        4               A34
Polygon Table
Polygon ID   Arc List
A34          I, II, III, IV
A35          III, VI, VII, XI

Attribute Data
Node Feature Attribute Table
Node ID   Control   Crosswalk   ADA?
1         light     yes         yes
2         stop      no          no
3         yield     no          no
4         none      yes         no
Arc Feature Attribute Table
Arc ID   Length   Condition   Lanes   Name
I        106      good        4
II       92       poor        4       Birch
III      111      fair        2
IV       95       fair        2       Cherry
Polygon Feature Attribute Table
Polygon ID   Owner      Address
A34          J. Smith   500 Birch
A35          R. White   200 Main
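To show why the left/right polygon columns matter, the sketch below loads the arc table into a dictionary and answers two topological questions by lookup alone, with no coordinate geometry. The dictionary layout and function names are assumptions; left polygons that the slide leaves blank are stored as `None`.

```python
# Arc table from the slide: direction (from/to node) plus left/right polygon.
# Left polygons not given on the slide are recorded as None.
arcs = {
    "I":   {"from": 4, "to": 1, "left": None,  "right": "A34"},
    "II":  {"from": 1, "to": 2, "left": None,  "right": "A34"},
    "III": {"from": 2, "to": 3, "left": "A35", "right": "A34"},
    "IV":  {"from": 3, "to": 4, "left": None,  "right": "A34"},
}

def boundary_arcs(arc_table, polygon_id):
    """Arcs that have the polygon on their left or right side."""
    return [arc_id for arc_id, arc in arc_table.items()
            if polygon_id in (arc["left"], arc["right"])]

def neighbors(arc_table, polygon_id):
    """Polygons on the other side of any arc bounding polygon_id."""
    result = set()
    for arc in arc_table.values():
        if arc["right"] == polygon_id and arc["left"] is not None:
            result.add(arc["left"])
        elif arc["left"] == polygon_id and arc["right"] is not None:
            result.add(arc["right"])
    return result

def arcs_at_node(arc_table, node_id):
    """Arcs that start or end at a node (the arc-arc connectivity)."""
    return [arc_id for arc_id, arc in arc_table.items()
            if node_id in (arc["from"], arc["to"])]

print(boundary_arcs(arcs, "A34"))
print(neighbors(arcs, "A34"))
print(arcs_at_node(arcs, 2))
```

Adjacency falls out of the shared arc III: it is stored once, with A35 on its left and A34 on its right, so there is nothing to duplicate and nothing to search geometrically.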
This is how a vector GIS system works!
This data structure was invented by Scott Morehouse at the
Harvard Laboratory for Computer Graphics in the 1960s.
Another graduate student named Jack Dangermond hired
Scott Morehouse, moved to Redlands, CA, started a new
company called ESRI Inc., and created the first commercial
GIS system, ArcInfo, in 1971
Modern GIS was born!
Other ways to represent polygons
with vector model
2. Whole polygon structure
3. Points and Polygons structure
• Used in earlier GIS systems before
node/arc/polygon system invented
• Still used today for some simpler
spatial data (e.g. shapefiles)
• Discuss these if we have time!
Vector Data Structures:
Whole Polygon
Whole Polygon (boundary structure): list coordinates of points in
order as you ‘walk around’ the outside boundary of the polygon.
– all data stored in one file
– coordinates/borders for adjacent polygons stored twice;
• may not be same, resulting in slivers (gaps), or overlap
– all lines are ‘double’ (except for those on the outside periphery)
– no topological information about polygons
• which are adjacent and have a common boundary?
– used by the first computer mapping program, SYMAP, in late
1960s
– used by SAS/GRAPH and many later business mapping programs
– Still used by shapefiles.
Topology
--knowledge about relative spatial positioning
--knowledge about shared geometry
Topography
--the form of the land surface, in particular, its elevation
Whole Polygon:
illustration
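[Figure: polygons A, B, C, D and E drawn on a grid with both axes running from 0 to 5]
Data File (polygon ID, x, y)
A 3 4
A 4 4
A 4 2
A 3 2
A 3 4
B 4 4
B 5 4
B 5 2
B 4 2
B 4 4
C 3 2
C 4 2
C 4 0
C 3 0
C 3 2
D 4 2
D 5 2
D 5 0
D 4 0
D 4 2
E 1 5
E 5 5
E 5 4
E 3 4
E 3 0
E 1 0
E 1 5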
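The duplication in the whole polygon structure can be counted directly. The sketch below holds the slide's data file as one closed ring per polygon (reading each record as polygon ID, x, y) and compares the number of coordinate pairs stored with the number of distinct locations; the variable names are illustrative.

```python
# Whole polygon structure: every polygon carries its own full coordinate list.
whole_polygons = {
    "A": [(3, 4), (4, 4), (4, 2), (3, 2), (3, 4)],
    "B": [(4, 4), (5, 4), (5, 2), (4, 2), (4, 4)],
    "C": [(3, 2), (4, 2), (4, 0), (3, 0), (3, 2)],
    "D": [(4, 2), (5, 2), (5, 0), (4, 0), (4, 2)],
    "E": [(1, 5), (5, 5), (5, 4), (3, 4), (3, 0), (1, 0), (1, 5)],
}

# Coordinate pairs written to the file, including each ring's closing pair.
stored_pairs = sum(len(ring) for ring in whole_polygons.values())

# Distinct locations actually needed to draw the map.
unique_locations = {pt for ring in whole_polygons.values() for pt in ring}

print(stored_pairs, len(unique_locations))
```

More than half of the stored pairs are repeats, and any repeat that is digitized slightly differently produces the slivers or overlaps mentioned above.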
Vector Data Structures:
Points & Polygons
Points and Polygons: list ID numbers of points in order
as you ‘walk around’ the outside boundary
• a second file lists all points and their coordinates.
– solves the duplicate coordinate/double border problem
– still no topological information
• Do not know which polygons have a common border
– first used by CALFORM, the second generation mapping
package, from the Laboratory for Computer Graphics and
Spatial Analysis at Harvard in early ‘70s
Points and Polygons:
Illustration
[Figure: polygons A, B, C, D and E on a grid with both axes running from 0 to 5; boundary points numbered 1 to 12]
Points File (point ID, x, y)
1 3 4
2 4 4
3 4 2
4 3 2
5 5 4
6 5 2
7 5 0
8 4 0
9 3 0
10 1 0
11 1 5
12 5 5
Polygons File
A 1, 2, 3, 4, 1
B 2, 5, 6, 3, 2
C 4, 3, 8, 9, 4
D 3, 6, 7, 8, 3
E 11, 12, 5, 1, 9, 10, 11
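A small sketch of the two-file structure, using the slide's points file and polygons file. It rebuilds a polygon's coordinates by lookup and then has to search for shared borders by comparing pairs of point IDs, since nothing in the files records adjacency. Function and variable names are assumptions.

```python
from itertools import combinations

# Points file: each location is stored exactly once.
points = {
    1: (3, 4), 2: (4, 4), 3: (4, 2), 4: (3, 2),
    5: (5, 4), 6: (5, 2), 7: (5, 0), 8: (4, 0),
    9: (3, 0), 10: (1, 0), 11: (1, 5), 12: (5, 5),
}

# Polygons file: ordered point IDs, first and last the same.
polygons = {
    "A": [1, 2, 3, 4, 1],
    "B": [2, 5, 6, 3, 2],
    "C": [4, 3, 8, 9, 4],
    "D": [3, 6, 7, 8, 3],
    "E": [11, 12, 5, 1, 9, 10, 11],
}

def ring_coordinates(polygon_id):
    """Resolve a polygon's point IDs into x,y pairs."""
    return [points[pid] for pid in polygons[polygon_id]]

def edges(point_ids):
    """Undirected edges between consecutive point IDs."""
    return {frozenset(pair) for pair in zip(point_ids, point_ids[1:])}

def polygons_sharing_an_edge():
    """Adjacency has to be derived by searching; it is not stored."""
    return [(a, b) for a, b in combinations(sorted(polygons), 2)
            if edges(polygons[a]) & edges(polygons[b])]

print(ring_coordinates("E"))
print(polygons_sharing_an_edge())
```

In this data the search finds A-B, A-C, B-D and C-D but none of E's neighbors, apparently because E's ring skips points 2 and 4, which lie along its border with A, B and C. That is the kind of gap the node/arc/polygon structure closes by storing left and right polygons on every arc.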
Hopefully, you now have a better
understanding of
what is special about spatial data!
Monday, we will begin talking about
Spatial Statistics