part1 - Department of Geography

A Short Course in Geoinformatics
Part I: Science Issues in GeoInformatics
Michael F. Goodchild
Outline
• A short history of GIS
• Basic principles of GIScience
• Uncertainty
A short history of GIS
• Maps in computers
– for decision-making
• each map representing one dimension of a decision
– for managing data
• aggregating census returns to reporting zones
• managing the multiple data types of transportation
planning
– to support map-making
• editing
• projection change
A model for landscape architecture
• Ian McHarg’s school at the University of
Pennsylvania
[Figure labels: Meteorology, Geology, Hydrology, Plant ecology, Animal ecology, Limnology, Computation, Remote sensing]
[Photo caption: Ian McHarg, 1920-2001]
“For the first time, a department of landscape architecture
could recruit a faculty of distinguished natural scientists
sharing the ecological view and determined to integrate
their perceptions into a holistic discipline applied to the
solution of contemporary problems.”
I.L. McHarg, A Quest for Life (Wiley, 1996, p. 192)
• Integration of science into action
• Frequently emulated as a model for environmental science
• But with a weaker intervention component
• The social context is missing
• Computation and remote sensing do not fit the model
The Canada Geographic Information
System
• Roger Tomlinson
– IBM contracts 1964-68
• 7 layers of land characteristics
– soil capability for agriculture
– recreation capability
– current land use
– ….
• To assess the current use of Canadian land
– to measure area, plan new uses
Technical aspects of CGIS
• Manuscript maps at 1:50,000
– 7 per tile
• Hand-scribing of boundaries
• An optical scanner creating a raster of
boundaries
• Vectorization
• Merging with area attributes
• The common boundary between two areas
as the basic unit
Flat-file options (tape)

• By face/polygon
– double recording of internal boundaries
– spurious differences
• By edge/arc
– half the data volume
– compute area in O(vertices)
– simplify overlay
– attributes of adjacent polygons
– no polygon records
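The edge/arc option can be illustrated with a minimal sketch. It assumes a simplified record of the form (left polygon ID, right polygon ID, points) and the convention that the left polygon lies to the left when walking the arc from its first point to its last. The IDs, the OUTSIDE marker, and the two-square example are invented for illustration and are not the CGIS format itself. Each vertex is visited once, which is what makes the area computation linear in the number of vertices.

```python
from collections import defaultdict

OUTSIDE = 0  # hypothetical ID for the unbounded exterior


def polygon_areas(arcs):
    """Accumulate polygon areas in a single pass over all arc vertices."""
    areas = defaultdict(float)
    for left_id, right_id, points in arcs:
        # Shoelace contribution of this arc (half the sum of cross products).
        term = 0.0
        for (x1, y1), (x2, y2) in zip(points, points[1:]):
            term += (x1 * y2 - x2 * y1) / 2.0
        areas[left_id] += term   # arc runs counterclockwise around the left polygon
        areas[right_id] -= term  # and clockwise around the right polygon
    return dict(areas)


# Two unit squares, polygon 1 (x from 0 to 1) and polygon 2 (x from 1 to 2),
# sharing the boundary x = 1. The shared boundary is stored only once.
arcs = [
    (1, OUTSIDE, [(1, 1), (0, 1), (0, 0), (1, 0)]),
    (2, OUTSIDE, [(1, 0), (2, 0), (2, 1), (1, 1)]),
    (1, 2, [(1, 0), (1, 1)]),
]
areas = polygon_areas(arcs)
```

In this sketch no polygon records are needed: the areas of polygons 1 and 2 fall out of the arc list, and the exterior accumulates the negative of the total area.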
Technical aspects…
• Storage on magnetic tape
– variable-length records
– leftpolyID, rightpolyID, #points, (x1,y1),…
• Indexing in Morton order
– a quad-tree index
• Numerical output only
– tabulations of area
– no visual display
• Mainframe technology
– later leased land lines at 300 bps
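Morton order can be sketched as bit interleaving of a tile's column and row numbers. This is a generic illustration, not a reconstruction of the CGIS implementation; the function name, the bit width, and the choice of putting the column bit in the lower position are assumptions.

```python
def morton_index(col, row, bits=16):
    """Interleave the bits of col and row into a single Morton (Z-order) key."""
    code = 0
    for i in range(bits):
        code |= ((col >> i) & 1) << (2 * i)      # column bit goes to even positions
        code |= ((row >> i) & 1) << (2 * i + 1)  # row bit goes to odd positions
    return code


# Sorting tiles by their Morton key keeps tiles that are close in space
# mostly close together on a sequential medium such as tape.
tiles = [(c, r) for r in range(4) for c in range(4)]
tiles_in_morton_order = sorted(tiles, key=lambda t: morton_index(*t))
```

The first four tiles in the sorted list form one 2 by 2 block, the next four the neighboring block, and so on, which is the same nesting that a quadtree expresses.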
The quadtree

• Recursive subdivision
– variable depth depending on local detail
[Figure: a square divided into quadrants 0, 1, 2, and 3, with quadrant 3 subdivided into 30, 31, 32, and 33]
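Recursive subdivision with variable depth can be sketched on a small raster. The sketch assumes a square grid whose side is a power of two and an arbitrary quadrant numbering (0 upper left, 1 upper right, 2 lower left, 3 lower right); the slides do not specify the numbering, and the grid values are invented.

```python
def build_quadtree(grid, x=0, y=0, size=None, code=""):
    """Return (code, value) leaves; a block is split only if it is not uniform."""
    if size is None:
        size = len(grid)
    values = {grid[r][c] for r in range(y, y + size) for c in range(x, x + size)}
    if len(values) == 1:
        return [(code, values.pop())]  # uniform block: stop subdividing here
    half = size // 2
    leaves = []
    offsets = [(0, 0), (half, 0), (0, half), (half, half)]
    for digit, (dx, dy) in enumerate(offsets):
        leaves += build_quadtree(grid, x + dx, y + dy, half, code + str(digit))
    return leaves


grid = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 0],
    [0, 0, 1, 1],
]
leaves = build_quadtree(grid)
```

Only the lower-right quadrant contains local detail, so only it is subdivided: the leaves carry the codes 0, 1, 2, 30, 31, 32, and 33.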
Other types of maps

• Transportation links
– linear features
– networks
• U.S. Bureau of the Census
– blocks = 2-cells
– street segments = 1-cells
– intersections = 0-cells
Topological data structures
• 1977 conference
– sponsored by Harvard University
• A unifying structure across many application
areas
– all three of: decision-making, managing data,
editing maps
• The birth of ESRI
The relational model

• The map as a collection of arcs, nodes, and faces
– F − A + N = 2
• Stored in tables with keys
• GIS built on RDBMS
– INFO
• Vertices left out
– a hybrid solution
– ARC/INFO
– the ARC data structure still proprietary
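The relation between faces, arcs, and nodes can be checked on a small arc table. The sketch below assumes a connected planar map in which the outside region counts as a face; the table layout and the identifiers are hypothetical and only mimic the idea of keyed rows. As in the hybrid solution, intermediate vertices are not part of the table.

```python
# Hypothetical arc table: (arc_id, from_node, to_node, left_face, right_face).
# Two adjacent polygons A and B share arc 3.
arc_table = [
    (1, "n2", "n1", "A", "outside"),
    (2, "n1", "n2", "B", "outside"),
    (3, "n1", "n2", "A", "B"),
]


def euler_check(arcs):
    """Return F - A + N computed from the keys found in the arc table."""
    nodes = {n for _, start, end, _, _ in arcs for n in (start, end)}
    faces = {f for _, _, _, left, right in arcs for f in (left, right)}
    return len(faces) - len(arcs) + len(nodes)


result = euler_check(arc_table)
```

A result other than 2 for a map that should be connected and planar signals a digitizing or editing error, which is one reason the topological structure was useful for map editing.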
Square pegs in round holes

• Cul-de-sacs
– allow 1-nodes
• Properties of parts of edges
– dynamic segmentation
– linear referencing
• Non-planarity
– overpasses and underpasses
– turntables
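Linear referencing and dynamic segmentation can be sketched as follows. A location is given as a distance (measure) along an edge, and attributes of parts of the edge are kept in a separate event table keyed by measure ranges. The function names, the road geometry, and the speed-limit table are invented for illustration.

```python
import math


def locate_along(line, measure):
    """Return the (x, y) position at a given distance along a polyline."""
    if measure < 0:
        raise ValueError("measure must be non-negative")
    travelled = 0.0
    for (x1, y1), (x2, y2) in zip(line, line[1:]):
        seg = math.hypot(x2 - x1, y2 - y1)
        if seg > 0 and travelled + seg >= measure:
            t = (measure - travelled) / seg  # fraction of this segment to walk
            return (x1 + t * (x2 - x1), y1 + t * (y2 - y1))
        travelled += seg
    raise ValueError("measure exceeds the length of the line")


def attribute_at(events, measure):
    """Return the value of the event whose [start, end) range covers measure."""
    for start, end, value in events:
        if start <= measure < end:
            return value
    return None


road = [(0.0, 0.0), (3.0, 0.0), (3.0, 4.0)]       # one edge, total length 7
speed_limits = [(0.0, 2.5, 50), (2.5, 7.0, 80)]  # hypothetical event table
position = locate_along(road, 5.0)
limit = attribute_at(speed_limits, 5.0)
```

The edge itself is never split: the speed limit changes at measure 2.5 without introducing a new node there.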
A 1990s house of cards


• Still no vertices in the RDBMS
• Points
– coordinates stored in tables
– no topological relationships with other features
• Does it have to be this hard?
– simple CAD data model
– points, lines, and areas in an empty space
– potentially overlapping
– no topological relationships
– compute on the fly
Object-oriented data modeling
• All features are instances of classes
• Classes inherit properties from more general classes
• Features can be aggregates of other features
• Features can be composed of other features
• Features can be associated
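The relationships listed above can be sketched with a few classes. The class names (Feature, Building, Campus, Parcel, Owner) and their attributes are hypothetical examples, not taken from any particular GIS data model.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Feature:
    name: str  # property shared by every feature class


@dataclass
class Room(Feature):
    area_m2: float = 0.0


@dataclass
class Building(Feature):  # inheritance: a Building is a Feature
    floors: int = 1
    rooms: list = field(default_factory=list)  # composition: rooms are parts of the building


@dataclass
class Campus(Feature):
    buildings: list = field(default_factory=list)  # aggregation: buildings exist independently


@dataclass
class Owner:
    name: str


@dataclass
class Parcel(Feature):
    owner: Optional[Owner] = None  # association between a feature and another object


hall = Building("Hall", floors=2, rooms=[Room("Lab", 40.0)])
campus = Campus("Main campus", buildings=[hall])
parcel = Parcel("Lot 12", owner=Owner("City"))
```

Every object created here is an instance of a class, and Building, Campus, and Parcel all inherit the name property from the more general Feature class.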














• Address
• Agriculture
• Archiving
• Atmospheric
• Basemap
• Biodiversity
• Census-Administrative Boundaries
• Defense-Intel
• Energy Utilities
• Energy Utilities MultiSpeak™
• Environmental Regulated Facilities
• Forestry
• Geology
• Groundwater
• Health
• Historic Preservation and Archaeology
• Hydro
• International Hydrographic Organization (IHO) S-57 for ENC
• Land Parcels
• Local Government
• Marine
• Petroleum
• Pipeline
• Raster
• Telecommunications
• Transportation
• Water Utilities
A paradigm shift

• Away from the map metaphor
– georeferenced events, transactions
– objects with no georeferences
– phenomena that were never mapped
• Neogeography
– customized maps
• user-centric
• transitory
Interactions, flows
[Figure: UML diagrams with the labels "Minard Napoleon Map", "Karst Flow Routes", "Interaction", "Original Use Case Models", and "Generic Flow Model"; association multiplicities *, 0..1, 0..2]
The data modeling cycle
• The set of all phenomena in the domain
• Find workarounds, violate the data model
• Adopt a generic solution
• Identify inefficiencies and special cases
Is the process beginning again?

• All features are instances of classes
– are all phenomena naturally features?
– is there a pre-feature stage?
• Inherently continuous phenomena
– roads, rivers
– topography
– the pre-patch ecological landscape
Basic principles of GIScience
• The atomic geographic fact
– the geo-atom
– <x,z>
– a pair defining what (z) is where (x)
• Point observations are individual geo-atoms
– data about lines, areas, volumes can be
decomposed into geo-atoms
– the boundary of California defines an infinite
number of statements of the form <x,z>
• where z = 1 if x is inside the boundary
• else z=0
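The decomposition of an area into geo-atoms can be sketched with a point-in-polygon test: for any location x the boundary yields a statement <x,z> with z = 1 inside and z = 0 outside. The sketch uses ray casting on a made-up square rather than the real boundary of California, and the function names are assumptions.

```python
def inside(point, polygon):
    """Ray-casting test: count boundary crossings of a ray going in +x."""
    x, y = point
    result = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # the edge straddles the horizontal line through the point
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                result = not result
    return result


def geo_atom(point, polygon):
    """Return the pair <x, z> for one location."""
    return (point, 1 if inside(point, polygon) else 0)


boundary = [(0.0, 0.0), (4.0, 0.0), (4.0, 4.0), (0.0, 4.0)]  # hypothetical region
atoms = [geo_atom(p, boundary) for p in [(2.0, 2.0), (5.0, 2.0)]]
```

The boundary is a compact way of storing what would otherwise be an infinite number of such pairs; the function generates any one of them on demand.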
The result of applying a 150 km wide kernel to points distributed over California
A typical kernel function
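Applying a kernel to a set of points turns discrete objects into a continuous density field. The sketch below assumes a Gaussian kernel, since the slides do not say which function was used, and it treats the bandwidth as the kernel's standard deviation, which is only one possible reading of "width". The sample points are invented.

```python
import math


def kernel_density(x, y, points, bandwidth):
    """Estimate density at (x, y) as a sum of Gaussian kernels centred on the points."""
    total = 0.0
    for px, py in points:
        d2 = (x - px) ** 2 + (y - py) ** 2
        total += math.exp(-d2 / (2.0 * bandwidth ** 2))
    # Normalise so that each point contributes a total mass of 1.
    return total / (2.0 * math.pi * bandwidth ** 2)


points = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (300.0, 300.0)]  # coordinates in km
near_cluster = kernel_density(3.0, 3.0, points, bandwidth=150.0)
far_away = kernel_density(2000.0, 2000.0, points, bandwidth=150.0)
```

The estimate can be evaluated at any location, which is what makes the result a field rather than a collection of objects.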
Discrete objects
• Points, lines, areas, or volumes
– in an otherwise empty space
– may overlap
– countable
• Examples:
– buildings
– cars
– instances of a disease
– oil wells
Continuous fields
• Variables that can be measured anywhere
– at any time
– z = f(x,y), f(x,y,z), or f(x,y,z,t)
• Examples:
– elevation of the ground surface
– atmospheric temperature
– soil pH
– wind direction
• Variable can be a class
– soil type
– land use type
Fields as objects

• Fields discretized as collections of objects
– sample points
– isolines
– triangles of a mesh
– samples of a Fourier transform
• Methods implied by roles of objects
– isolines cannot cross
– polygons must not overlap
Mitchell, A., 1999. The ESRI Guide to GIS Analysis. Redlands:
ESRI Press
Principle
• There are two fundamentally distinct ways of
aggregating geo-atoms
– into discrete objects
• all points within an object have the attributes of the
object
– into continuous fields
• every point is mapped to a variable
• Marginal cases:
– weather highs, lows, fronts
– mountain peaks
– clouds in the sky
Beyond objects and fields
• Discrete objects that move
• Discrete objects that change shape
• Discrete objects that have internal structure
Helix representation
Spine: expresses spatiotemporal 3-D movement of the
center of mass.
Prongs: express expansion or collapse
of the object’s outline
May Yuan, University
of Oklahoma
Hurricane Frances
Hurricane helixes
Spatially binary data
• <x1,x2,z>
– information about the relationship between two
locations
• flow of migrants
• distance
• direction
• time of travel
– such information is key to understanding many
social processes
– conventional geographic information is spatially
unary
1) Spatial dependence principle
• Tobler’s First Law of Geography (TFL)
– “All things are similar, but nearby things are more
similar than distant things”
• Horizontal context
– geographic facts should be consistent with their
surroundings
• Spatial dependence
– the tendency for nearby observations to be
correlated
– violating an assumption of many statistical tests
that observations are independent
Validity

• “Nearby things are less similar than distant things”
– negative spatial autocorrelation
– possible at certain scales
• the checkerboard
• retailing
– but negative autocorrelation at one scale requires positive autocorrelation at other scales
– smoothing processes dominate sharpening processes
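The checkerboard case can be made concrete with Moran's I, a common index of spatial autocorrelation that the slides do not name; it is used here as one possible measure. The sketch assumes rook contiguity (cells sharing an edge are neighbors) with unit weights, and both grids are invented.

```python
def morans_i(grid):
    """Moran's I for a raster using rook contiguity and unit weights."""
    rows, cols = len(grid), len(grid[0])
    n = rows * cols
    mean = sum(sum(row) for row in grid) / n
    dev = [[v - mean for v in row] for row in grid]
    cross = 0.0
    weight_total = 0
    for r in range(rows):
        for c in range(cols):
            for dr, dc in ((0, 1), (1, 0), (0, -1), (-1, 0)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols:
                    cross += dev[r][c] * dev[rr][cc]
                    weight_total += 1
    variance_sum = sum(d * d for row in dev for d in row)
    return (n / weight_total) * (cross / variance_sum)


checkerboard = [[(r + c) % 2 for c in range(4)] for r in range(4)]
two_halves = [[0, 0, 1, 1] for _ in range(4)]
i_checkerboard = morans_i(checkerboard)
i_two_halves = morans_i(two_halves)
```

The checkerboard gives the most negative value possible under these weights, while the grid split into two uniform halves gives a positive value.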
Formalization

• Geostatistics
– variogram, covariogram
– measuring how similarity decreases (variance increases) with distance
– parameters vary by phenomenon
• does this make TFL less of a law?
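An empirical semivariogram can be sketched as half the mean squared difference between sample pairs separated by roughly a given lag. The function name, the tolerance parameter, and the transect data are assumptions made for illustration.

```python
import math
from itertools import combinations


def semivariance(samples, lag, tolerance):
    """Half the mean squared difference over pairs about `lag` apart."""
    squared = [
        (z1 - z2) ** 2
        for (x1, y1, z1), (x2, y2, z2) in combinations(samples, 2)
        if abs(math.hypot(x1 - x2, y1 - y2) - lag) <= tolerance
    ]
    if not squared:
        return None  # no pairs at this lag
    return sum(squared) / (2.0 * len(squared))


# Invented transect: ten samples one unit apart, values rising steadily.
transect = [(float(i), 0.0, float(i)) for i in range(10)]
gamma = {lag: semivariance(transect, lag, tolerance=0.1) for lag in (1, 2, 3)}
```

In this made-up data the semivariance grows with lag, so nearby samples are more similar than distant ones; how fast it grows is the part that varies by phenomenon.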
Utility

• Representation
– GI is reducible to statements of the form <x,z>
– the atomic form of GI is unmanageable, encountered only in point samples
– all other GI data models assume TFL
• Spatial interpolation
– IDW and Kriging implement TFL
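Inverse distance weighting (IDW) shows how interpolation builds TFL into a computation: each sample's influence on the estimate falls off with distance. The sketch is minimal, with an assumed power of 2 and invented sample values.

```python
import math


def idw(x, y, samples, power=2):
    """Inverse-distance-weighted estimate of z at (x, y)."""
    numerator = 0.0
    denominator = 0.0
    for sx, sy, sz in samples:
        d = math.hypot(x - sx, y - sy)
        if d == 0:
            return sz  # exactly at a sample: return its value
        w = 1.0 / d ** power  # nearer samples get larger weights
        numerator += w * sz
        denominator += w
    return numerator / denominator


samples = [(0.0, 0.0, 10.0), (2.0, 0.0, 20.0)]
midpoint = idw(1.0, 0.0, samples)
near_first = idw(0.5, 0.0, samples)
```

Halfway between the two samples the estimate is their average; closer to the first sample it is pulled toward that sample's value.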
If TFL weren’t true

• GIS would be impossible
– a point sample is useful only with interpolation
• Life would be impossible
2) Spatial heterogeneity principle
• The Earth’s surface is fundamentally
heterogeneous
– unlike humans, whose characteristics are
distributed around an average
• It is difficult to generalize from a single case
study
• The results of any case study depend
explicitly on the spatial bounds of the study
• The second law of geography
• Again, problematic for science
Jorge Sifuentes, PhD
dissertation
Practical implications of the
second law

• A state is not a sample of the nation
– a country is not a sample of the world
• Classification schemes will differ when devised by local jurisdictions
• Figures of the Earth will differ when devised by local surveying agencies
• Global standards will always compete with local standards
3) A fractal principle

• The closer you look the more you see
– and for many natural phenomena the rate is orderly
– Richardson plots
– lengths of national boundaries
• Spain and Portugal
• context of 1920s
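A Richardson plot relates the logarithm of measured length to the logarithm of the step length used to measure it. The sketch uses a Koch curve as a stand-in for a boundary, because its length at each level of detail is known exactly. As a simplification, each level of the curve is measured with a step equal to its own segment length, rather than walking dividers along one fixed line.

```python
import math


def koch(points, depth):
    """Replace every segment by the four-segment Koch motif, `depth` times."""
    if depth == 0:
        return points
    out = [points[0]]
    for (x1, y1), (x2, y2) in zip(points, points[1:]):
        dx, dy = (x2 - x1) / 3.0, (y2 - y1) / 3.0
        a = (x1 + dx, y1 + dy)
        b = (x1 + 2 * dx, y1 + 2 * dy)
        # Apex of the bump: the middle third rotated by 60 degrees about a.
        apex = (a[0] + dx * 0.5 - dy * math.sqrt(3) / 2,
                a[1] + dx * math.sqrt(3) / 2 + dy * 0.5)
        out += [a, apex, b, (x2, y2)]
    return koch(out, depth - 1)


def polyline_length(points):
    return sum(math.hypot(x2 - x1, y2 - y1)
               for (x1, y1), (x2, y2) in zip(points, points[1:]))


base = [(0.0, 0.0), (1.0, 0.0)]
steps = [3.0 ** -k for k in range(1, 5)]
lengths = [polyline_length(koch(base, k)) for k in range(1, 5)]

# Slope of log(length) against log(step); fractal dimension D = 1 - slope.
slope = (math.log(lengths[-1]) - math.log(lengths[0])) / (
    math.log(steps[-1]) - math.log(steps[0]))
dimension = 1.0 - slope
```

The measured length keeps increasing as the step shrinks, and the orderly rate shows up as a straight line on the log-log plot.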
Practical implications

• Indexing schemes, quadtrees
– partitioning of information at different scales
• Length is a function of spatial resolution
– and variously under-estimated in GIS
– as are many other properties
• slope
• soil class
• land cover class
– spatial resolution should always be explicit in GIS analysis
• easy in raster
• much more difficult in vector
4) The uncertainty principle

• No representation of the Earth’s surface can be complete
– no measurement of position can be perfect
– a GIS will always leave doubt about the true nature of the Earth’s surface
ArcMap 10.0, Plate Carrée projection
Error-sensitive GIS
• Storing characterizations of uncertainty
• Propagation through GIS operations
• Visualization
• Confidence limits on products

How to build one?

• Augmentation of existing data models
– new attributes of objects, object classes,
data sets
– metadata
– the five-fold way
– Lanter and Veregin, GeoLineus
– inheritance, object-orientation
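One way to picture propagation and confidence limits is Monte Carlo simulation: store a positional standard error as an attribute of an object, perturb its vertices many times, and summarize the spread of a derived product such as area. This is a sketch under simple assumptions (independent Gaussian errors of equal size at every vertex, a fixed random seed, an invented parcel), not a description of any particular error-sensitive GIS.

```python
import random
import statistics


def shoelace_area(polygon):
    """Area of a simple polygon from its vertex coordinates."""
    n = len(polygon)
    total = 0.0
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def simulate_area(polygon, sigma, trials=2000, seed=0):
    """Propagate vertex error (standard deviation sigma) into area."""
    rng = random.Random(seed)
    areas = []
    for _ in range(trials):
        noisy = [(x + rng.gauss(0.0, sigma), y + rng.gauss(0.0, sigma))
                 for x, y in polygon]
        areas.append(shoelace_area(noisy))
    return statistics.mean(areas), statistics.stdev(areas)


parcel = [(0.0, 0.0), (100.0, 0.0), (100.0, 100.0), (0.0, 100.0)]  # metres
mean_area, sd_area = simulate_area(parcel, sigma=1.0)  # sigma stored as an attribute
low, high = mean_area - 1.96 * sd_area, mean_area + 1.96 * sd_area
```

The interval from low to high is an approximate 95 percent confidence limit on the area, which could be reported alongside the product itself.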