GEOG 60 Introduction to Geographic Information Systems

Professor: Dr. Jean-Paul Rodrigue
Topic 4 – Geographical Data Analysis
A – The Nature of Spatial Analysis
B – Basic Spatial Analysis
A
The Nature of Spatial Analysis
■ 1. Spatial Analysis and its Purpose
■ 2. Spatial Location and Reference
■ 3. Spatial Patterns
■ 4. Topological Relationships
1. Spatial Analysis and its Purpose
■ Conceptual framework
• Search for order amid disorder.
• Organize information into categories.
■ Method
• Inducing or deducing conclusions from spatially related information.
• Deduction: deriving a conclusion from a model or a rule.
• Induction: learning new concepts from examples.
• Spatial analysis as a decision-making tool.
• Helps the user make better decisions.
• Often involves the allocation of resources.
■ Requirements
• 1) Information to be analyzed must be encoded in some way.
• 2) Encoding implicitly requires a spatial language.
• 3) Some media to support the encoded information.
• 4) Qualitative and/or quantitative methods to perform operations on the encoded information.
• 5) Ways to present the results in an explicit message.
Figure: Information, Encoding, Media, Methods, Message.
Figure: Spatial analysis and GIS in relation to remote sensing, geomorphology, climatology, quantitative methods, biogeography, cartography, soils, and human geography (historical, political, economic, behavioral, population).
Figure: Mapping Deaths from Cholera, London, 1854 (Snow Study).
■ Data Retrieval
• Browsing; windowing (zoom-in & zoom-out).
• Query window generation (retrieval of selected features).
• Multiple map sheets observation.
• Boolean logic functions (meeting specific rules).
Figure: DB, HD, SHP.
■ Map Generalization
• Line coordinate thinning of nodes.
• Polygon coordinate thinning of nodes.
• Edge-matching.
■ Map Abstraction
• Calculation of centroids.
• Visual editing & checking.
• Automatic contouring from randomly spaced points.
• Generation of Thiessen / proximity polygons.
• Reclassification of polygons.
• Raster to vector/vector to raster conversion.
■ Map Sheet Manipulation
• Changing scales.
• Distortion removal/rectification.
• Changing projections.
• Rotation of coordinates.
■ Buffer Generation
• Generation of zones around certain objects.
■ Geoprocessing
• Polygon overlay.
• Polygon dissolve.
• “Cookie cutting”.
■ Measurements
• Points - total number or number within an area.
• Lines - distance along a straight or curvilinear line.
• Polygons - area or perimeter.
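The three kinds of measurement listed above can be sketched in a few lines of Python. This is only an illustration on planar coordinates: the function names and the sample rectangle, road and wells are assumptions made here, not part of the course material.

```python
import math

def polyline_length(points):
    """Distance along a line given as a list of (x, y) vertices."""
    return sum(math.dist(a, b) for a, b in zip(points, points[1:]))

def polygon_perimeter(ring):
    """Perimeter of a polygon whose vertices are listed once, in order."""
    return polyline_length(ring + ring[:1])

def polygon_area(ring):
    """Area by the shoelace formula (absolute value, so vertex order does not matter)."""
    twice_area = sum(
        x1 * y2 - x2 * y1
        for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1])
    )
    return abs(twice_area) / 2

def count_points_in_box(points, xmin, ymin, xmax, ymax):
    """Number of points falling inside a rectangular area (edges included)."""
    return sum(xmin <= x <= xmax and ymin <= y <= ymax for x, y in points)

# Hypothetical sample features
rectangle = [(0, 0), (5, 0), (5, 3), (0, 3)]
road = [(0, 0), (3, 4), (3, 9)]
wells = [(1, 1), (2, 2), (6, 1), (4, 3)]
```

A curvilinear line is handled the same way once it is approximated by enough vertices.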
■ Raster / Grid Analysis
• Grid cell overlay.
• Area calculation.
• Search radius.
• Distance calculations.
■ Digital Terrain Analysis
• Visibility analysis of viewing points.
• Insolation intensity.
• Grid interpolation.
• Cross-sectional viewing.
• Slope/aspect analysis.
• Watershed calculation.
• Contour generation.
3. Spatial Patterns
■ Relativity of objects
• Definition of an object in view of another.
• Creates spatial patterns.
Figure: Size, Form, Orientation, Scale, Proximity.
■ Main patterns
• Size.
• Distribution/spacing: uniform, random and clustered.
• Proximity.
• Density: dense and dispersed.
• Shape.
• Orientation.
• Scale.
■ Spatial autocorrelation
• Set of objects that are spatially associated.
• Relationship in the process affecting the object.
• Negative autocorrelation: neighboring objects tend to have dissimilar values.
• Positive autocorrelation: neighboring objects tend to have similar values.
Figure: Uniform, Clustered (positive autocorrelation), Random.
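One common way to put a number on spatial autocorrelation is Moran's I, which the slides do not name; it is used here as an assumption. The sketch below uses binary adjacency weights and a hypothetical row of six units, each adjacent to the next.

```python
def morans_i(values, neighbors):
    """Moran's I with binary weights.

    values    -- list of observations, one per spatial unit
    neighbors -- dict mapping a unit index to the indices adjacent to it
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    # Cross-products of deviations over every ordered pair of neighbors
    cross = sum(dev[i] * dev[j] for i, adj in neighbors.items() for j in adj)
    total_weight = sum(len(adj) for adj in neighbors.values())
    return (n / total_weight) * cross / sum(d * d for d in dev)

def chain_neighbors(n):
    """Hypothetical layout: n units in a row, each adjacent to the next."""
    return {i: [j for j in (i - 1, i + 1) if 0 <= j < n] for i in range(n)}

clustered = [1, 1, 1, 5, 5, 5]      # similar values sit side by side
alternating = [1, 5, 1, 5, 1, 5]    # neighbors always differ
w = chain_neighbors(6)
```

With these inputs `morans_i(clustered, w)` is positive and `morans_i(alternating, w)` is negative, matching the two cases listed above.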
4. Topological Relations
■ Proximity
• Qualitative expression of distance.
• Link spatial objects by their mutual locations.
• Nearest neighbors.
• Buffer around a point or a line.
■ Directionality
■ Adjacency
• Link contiguous entities.
• Share at least one common boundary.
■ Intersection
■ Containment
• Link entities to a higher order set.
Figure: City A, City B.
■ Connectivity
• Adjacency applied to a network.
• Must follow a path, which is a set of linked nodes.
• Shortest path.
• All possible paths.
Figure: network of nodes 1 to 6.
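Connectivity queries on a network can be sketched with an adjacency list. The links between the six nodes below are hypothetical (the figure's actual links are not recoverable), and "shortest" is taken to mean the fewest links, found with a breadth-first search.

```python
from collections import deque

def shortest_path(graph, start, goal):
    """Breadth-first search: path with the fewest links, or None if unreachable."""
    previous = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbor in graph[node]:
            if neighbor not in previous:
                previous[neighbor] = node
                queue.append(neighbor)
    return None

def all_simple_paths(graph, start, goal, visited=()):
    """Every path from start to goal that never revisits a node."""
    visited = visited + (start,)
    if start == goal:
        return [list(visited)]
    paths = []
    for neighbor in graph[start]:
        if neighbor not in visited:
            paths.extend(all_simple_paths(graph, neighbor, goal, visited))
    return paths

# Hypothetical links between six nodes
network = {
    1: [2, 3],
    2: [1, 4],
    3: [1, 4, 5],
    4: [2, 3, 6],
    5: [3, 6],
    6: [4, 5],
}
```

If links carried distances or travel times, a weighted method such as Dijkstra's algorithm would replace the breadth-first search.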
■ Intersection
• What two geographical objects have in common.
■ Union
• Summation of two geographical objects.
■ Complementarity
• What is outside of the geographical object.
Figure: Arable land, Flat land, Suitable for agriculture, Land, Non arable land.
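Intersection, union and complementarity map directly onto set operations. In this sketch each parcel of land is a hypothetical grid cell, and treating land that is both arable and flat as "suitable for agriculture" is an assumption read from the figure labels.

```python
# Hypothetical study area: each parcel is identified by a grid cell (row, column)
land = {(r, c) for r in range(3) for c in range(3)}
arable = {(0, 0), (0, 1), (1, 0), (1, 1)}
flat = {(1, 1), (1, 2), (2, 1), (2, 2)}

suitable = arable & flat         # intersection: what both objects have in common
arable_or_flat = arable | flat   # union: summation of the two objects
non_arable = land - arable       # complementarity: what lies outside arable land
```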
B
Elementary Spatial Analysis
■ 1. Statistical Generalization
■ 2. Data Distribution
■ 3. Spatial Inference
1. Statistical Generalization
■ Maps and statistical information
• It is important to accurately display the underlying distribution of the data.
• Data is generalized to search for a spatial pattern.
• If the data is not properly generalized, the message may be obscured.
• A balance is needed between remaining true to the data and generalizing enough to identify spatial patterns.
• Thematic maps are a good example of the issue of statistical generalization.
Figure: Data (15, 25, 88, 34, 56, 7, 92, 61, 45, 77, 39, 21) grouped by a Classification (0-30, 31-65, above 65) to reveal a Spatial Pattern.
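The step from raw data to classes can be sketched as follows, using the twelve values and the three classes from the figure. Treating each upper bound as inclusive (30 falls in the first class, 65 in the second) is an assumption made here.

```python
from bisect import bisect_left
from collections import Counter

data = [15, 25, 88, 34, 56, 7, 92, 61, 45, 77, 39, 21]
upper_bounds = [30, 65]                   # inclusive upper limits of the first two classes
labels = ["0-30", "31-65", "above 65"]

def classify(value):
    """Return the label of the class a value falls into."""
    return labels[bisect_left(upper_bounds, value)]

counts = Counter(classify(v) for v in data)
```

Here `counts` holds the number of observations per class, which is what a thematic map would then shade.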
■ Number of classes
• Too few classes: the contours of the data distribution are obscured.
• Too many classes: confusion will be created.
• Most thematic maps have between 3 and 7 classes.
• Eight shades of gray are generally the maximum that can be told apart.
■ Classification methods
• Thematic maps developed from the same data and with the same number of classes will convey a different message if the ranging method is different.
• Each ranging method is particular to a data distribution.
2. Data Distribution
■ Histogram
• The first step in producing a thematic map.
• See how data is distributed.
• Use of basic statistics such as mean and standard deviation.
• A histogram plots values against their frequency.
Figure: histograms (Value against Frequency) of Uniform, Normal and Exponential distributions.
■ Equal interval
• Each class has an equal range of values.
• Class width: difference between the lowest (L) and the highest (H) value divided by the number of categories (C), that is (H-L)/C.
• Easy to interpret.
• Good for uniform distributions and continuous data.
• Inappropriate if data is clustered around a few values.
Figure: histogram (Value against Frequency) divided between L and H into classes C1 to C4.
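A minimal sketch of the (H-L)/C rule, applied to the sample values from the Statistical Generalization slide. The function name is an assumption made for illustration.

```python
def equal_interval_breaks(values, n_classes):
    """Upper bound of each class when the range is split into equal widths."""
    low, high = min(values), max(values)
    width = (high - low) / n_classes          # (H-L)/C
    return [low + width * k for k in range(1, n_classes + 1)]

data = [15, 25, 88, 34, 56, 7, 92, 61, 45, 77, 39, 21]
breaks = equal_interval_breaks(data, 4)       # L = 7, H = 92, width = 21.25
```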
■ Quantiles
• Equal number of observations in each category.
• n(C1) = n(C2) = n(C3) = n(C4).
• Relevant for evenly distributed data.
• Features with similar values may end up in different categories.
Figure: histogram (Value against Frequency) divided into classes C1 to C4 holding n(C1) to n(C4) observations.
■ Equal area
• Classes divided to have a similar area per class.
• Similar to quantiles if size of units is the same.
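The quantile rule, n(C1) = n(C2) = n(C3) = n(C4), can be sketched by sorting the observations and cutting the ranked list into equal parts. The function name is hypothetical, and when the count does not divide evenly the classes differ by at most one observation.

```python
def quantile_classes(values, n_classes):
    """Split sorted values into classes holding (nearly) equal counts."""
    ordered = sorted(values)
    n = len(ordered)
    # Class k takes the observations ranked from k*n/C up to (k+1)*n/C
    return [
        ordered[k * n // n_classes:(k + 1) * n // n_classes]
        for k in range(n_classes)
    ]

data = [15, 25, 88, 34, 56, 7, 92, 61, 45, 77, 39, 21]
classes = quantile_classes(data, 4)
```

Because the cut depends only on rank, two nearly identical values can land on either side of a class limit, which is the drawback noted above.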
■ Standard deviation
• The mean (X) and standard deviation (STD) are used to set cutpoints.
• Good when the distribution is normal.
• Displays features that are above and below average.
• Very different (abnormal) elements are shown.
• Does not show the values of the features, only their distance from the average.
Figure: histogram (Value against Frequency) with classes C1 to C4 separated at -1STD, X and +1STD.
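A sketch of the four classes cut at -1 STD, the mean and +1 STD. Two choices here are assumptions: the population standard deviation (`statistics.pstdev`) is used, and a value equal to a cutpoint goes to the upper class. The sample values are hypothetical.

```python
from statistics import mean, pstdev

def std_dev_class(value, avg, std):
    """Class index: 0 below mean - STD, 1 up to the mean,
    2 up to mean + STD, 3 at or above mean + STD."""
    cutpoints = [avg - std, avg, avg + std]
    return sum(value >= c for c in cutpoints)

values = [2, 4, 4, 4, 5, 5, 7, 9]          # hypothetical data: mean 5, STD 2
avg, std = mean(values), pstdev(values)
classes = [std_dev_class(v, avg, std) for v in values]
```

The class index says how far a feature sits from the average, not what its value is.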
■ Arithmetic and geometric progressions
• The width of the class intervals increases at a non-linear rate.
• Good for J-shaped distributions.
Figure: histogram (Value against Frequency) divided into classes C1 to C4.
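The slides do not give formulas for these progressions, so the sketch below rests on two common conventions taken as assumptions: in the geometric case each class limit is a constant multiple of the previous one, and in the arithmetic case class widths grow as 1, 2, 3, ... times a base step.

```python
def geometric_breaks(low, high, n_classes):
    """Class upper bounds with a constant ratio (requires 0 < low < high)."""
    ratio = (high / low) ** (1 / n_classes)
    return [low * ratio ** k for k in range(1, n_classes + 1)]

def arithmetic_breaks(low, high, n_classes):
    """Class upper bounds whose widths grow as 1, 2, 3, ... times a base step."""
    step = (high - low) / (n_classes * (n_classes + 1) / 2)
    bounds, upper = [], low
    for k in range(1, n_classes + 1):
        upper += k * step
        bounds.append(upper)
    return bounds
```

For example, `geometric_breaks(1, 16, 4)` gives limits close to 2, 4, 8, 16, and `arithmetic_breaks(0, 10, 4)` gives limits close to 1, 3, 6, 10. In both cases narrow classes cover the low values, where a J-shaped distribution concentrates its observations.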
■ Natural breaks
• Complex optimization method.
• Minimizes the sum of the variance within each class.
• Good for data that is not evenly distributed.
• Statistically sound.
• Difficult to compare with other classifications.
• Difficult to choose the appropriate number of classes.
Figure: histogram (Value against Frequency) divided into classes C1 to C4.
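To make the optimization idea concrete, here is a brute-force sketch that tries every way of cutting the sorted values and keeps the one with the smallest total within-class sum of squared deviations. This criterion is an assumption standing in for "sum of the variance", and the exhaustive search is not the optimized algorithm GIS software uses; it is only practical for small data sets.

```python
from itertools import combinations

def sum_squared_dev(group):
    """Sum of squared deviations from the group mean."""
    avg = sum(group) / len(group)
    return sum((v - avg) ** 2 for v in group)

def natural_breaks(values, n_classes):
    """Exhaustive search for the split of the sorted values that minimizes
    the total within-class sum of squared deviations."""
    ordered = sorted(values)
    best_cost, best_groups = None, None
    # Choose the positions where one class ends and the next begins
    for cuts in combinations(range(1, len(ordered)), n_classes - 1):
        edges = (0, *cuts, len(ordered))
        groups = [ordered[a:b] for a, b in zip(edges, edges[1:])]
        cost = sum(sum_squared_dev(g) for g in groups)
        if best_cost is None or cost < best_cost:
            best_cost, best_groups = cost, groups
    return best_groups

sample = [1, 2, 3, 10, 11, 12, 30, 31]     # hypothetical values with obvious gaps
```

On `sample` with three classes the search places the class limits in the two large gaps of the data.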
■ User defined
• The user is free to select the class intervals that best fit the data distribution.
• A last-resort method, because the choice of intervals is conceptually difficult to justify.
• Analysts with experience are able to make a good choice.
• Also used to get round numbers after using another type of classification method.
• Example: $5,000 - $10,000 instead of $4,982 - $10,123.
■ Using classification
• Classification can be used to deliberately confuse or hide a message.
Figure: maps labeled “no problems” - Equal steps; “there is a problem” - Quantiles; “everything is within standards” - Standard deviation.
3. Spatial Inference
■ Filling the gaps
• Sampling shortens the time necessary to collect data.
• Requires methods to “fill the gaps”.
■ Interpolation and extrapolation
• Data at non-sampled locations can be predicted from sampled locations.
• Interpolation: predict missing values when bounding values are known.
• Extrapolation: predict missing values outside the bounding area, where only one side is known.
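The difference between the two cases can be sketched with a straight line through two samples. Linear estimation is an assumption chosen for simplicity, and the sample heights and locations are hypothetical.

```python
def linear_estimate(x, x0, y0, x1, y1):
    """Value at x on the straight line through (x0, y0) and (x1, y1)."""
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)

def describe(x, x0, x1):
    """Interpolation when x lies between the samples, extrapolation otherwise."""
    return "interpolation" if min(x0, x1) <= x <= max(x0, x1) else "extrapolation"

# Hypothetical samples: height 100 at location 0 and height 140 at location 10
inside = linear_estimate(5, 0, 100, 10, 140)     # between the two samples
outside = linear_estimate(15, 0, 100, 10, 140)   # beyond the last sample
```

The arithmetic is identical in both cases; the extrapolated value is less reliable because nothing on the far side constrains it.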
Figure: Spatial Inference: Interpolation and Extrapolation. First panel: Height against Location, with samples and an interpolation line. Second panel: Delay at the traffic light against Number of vehicles, with samples, an interpolation line and an extrapolation line.
Figure: Spatial Inference: Best Fit. Sex Ratio against Longitude, with the fitted line y = 0.1408x + 116.69 (R² = 0.6779).
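A best-fit line of this kind is usually obtained by ordinary least squares, which is assumed here since the slides do not name the method. The points below are hypothetical and are not the data behind the figure.

```python
def least_squares(xs, ys):
    """Slope, intercept and R^2 of the ordinary least-squares line."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R^2 compares the residual scatter with the total scatter around the mean
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return slope, intercept, 1 - ss_res / ss_tot

# Hypothetical observations: longitude and sex ratio
longitude = [-120, -100, -80, -60]
sex_ratio = [100, 102, 106, 108]
slope, intercept, r_squared = least_squares(longitude, sex_ratio)
```

Once fitted, the line can estimate the value at an unsampled longitude as `slope * x + intercept`.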
■ Aggregation
• Data within a boundary can be aggregated.
• Often to form a new class.
■ Conversion
• Data from a sample set can be converted for a different sample
set.
• Changing the scale of the geographical unit.
• Switching from a set of geographical units to another.
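Aggregation amounts to summing values whose units map to the same higher-level unit. The lookup tables below are assumptions read from the figure labels (pine and poplar grouped as boreal forest, districts B1 and B2 grouped as district B), and all numbers are hypothetical.

```python
from collections import defaultdict

# Hypothetical lookups from detailed classes or units to the aggregated ones
forest_class = {"Pine Trees": "Boreal Forest", "Poplar Trees": "Boreal Forest"}
district_of = {"District B1": "District B", "District B2": "District B"}

def aggregate(values, lookup):
    """Sum values whose keys map to the same aggregated key.
    Keys missing from the lookup are kept unchanged."""
    totals = defaultdict(float)
    for key, value in values.items():
        totals[lookup.get(key, key)] += value
    return dict(totals)

# Hypothetical figures: area in hectares, population in persons
area_by_species = {"Pine Trees": 120.0, "Poplar Trees": 80.0}
population = {"District A": 5000, "District B1": 1200, "District B2": 1800}
```

Summing suits counts and areas; rates or averages would need a weighted combination instead.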
Figure: Spatial Inference: Aggregation and Conversion. Labels: Pine Trees, Poplar Trees, Boreal Forest; District A, District B, District B1, District B2.