Data Conversion & Integration

Workshop on Census Cartography and Management, Bangkok, Thailand, 15–19 October 2007
Data Conversion/Integration Process
• Data Inventory
  • Existing hard-copy maps / digital data
• Data Collection (additional)
  • Satellite imagery, aerial photos, etc.
  • Field collection (hand-held devices, GPS, etc.)
• Data Input/Conversion
  • Keyboard entry of coordinates
  • Digitizing/Scanning/Raster-to-Vector
  • Editing/Building Topology
• Data Integration
  • Georeferencing/Geocoding
About Geographic Data
• Conversion of hardcopy to digital maps is the most time-consuming task in GIS
• Up to 80% of project costs
• Example: estimated to be a US $10 billion annual market
• Labor intensive, tedious and error-prone
Data Inventory
• National overview maps
  • scales between 1:250,000 and 1:5,000,000 (small scale)
  • show major civil divisions, urban areas, physical features such as roads, rivers, lakes, elevation, etc.
  • used for planning purposes
Data Inventory (cont.)
• Topographic maps: scales range from 1:25,000 to 1:250,000 (mid-scale)
• Town and city maps at large cartographic scales, showing roads, city blocks, parks, etc. (1:1,000 to 1:5,000)
• Maps of administrative units at all levels of civil division
• Thematic maps showing population distribution for previous census dates, or any features that may be useful for census mapping
Existing Digital Data
• Digital maps
• Satellite imagery
• GPS coordinates
• Etc.
Data Collection
[Diagram: data capture sources (aerial photography, remote sensing, surveying, GPS, maps, census and surveys), GDB, GIS, management]
Aerial photography
• Aerial photography is obtained using specialized cameras on board low-flying planes. The camera captures the image digitally or on photographic film.
• Aerial photography is the method of choice for mapping applications that require high accuracy and fast completion.
• Photogrammetry: the science of obtaining measurements from photographic images.
Aerial photography (cont.)
• Traditional end product: printed photos
• Today: digital image (scanned from photo) in standard graphics format (TIFF, JPEG) that can be integrated in a GIS or desktop mapping package
• Trend: fully digital process
• Digital orthophotos
  • corrected for camera angle, atmospheric distortions and terrain elevation
  • georeferenced in a standard projection (e.g. UTM)
  • geometric accuracy of a map
  • large detail of a photograph
Remote sensing process
[Diagram: sources of energy, Earth surface, sensing system, receiving station]
GPS
• Collection of point data
• Stored as “waypoints”
• Accuracy depends on device and environmental variables
Surveying
• Paper based
  • Manual recording of information
• Electronic based
  • Handheld device
Geographic data input/conversion
• Keyboard entry of coordinates
• Digitizing
• Scanning and raster to vector conversion
• Field work data collection using
  • Global positioning systems
• Air photos and remote sensing
Keyboard entry
• Keyboard entry of coordinate data
  • e.g., point lat/long coordinates
    • from a gazetteer (a listing of place names and their coordinates)
    • from locations recorded on a map
Latitude/Longitude coordinate conversion
• Latitude is the y-coordinate, longitude is the x-coordinate
• Common format is degrees, minutes, seconds (DMS)
  113° 15′ 23″ W  21° 56′ 07″ N
• To represent lat/long in a GIS, we need to convert to decimal degrees (DD)
  -113.25639  21.93528
• DD = D + (M + S / 60) / 60, with a negative sign for west longitudes and south latitudes
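The conversion from degrees, minutes, seconds to decimal degrees can be sketched in a few lines of Python. This is only an illustration: the function name `dms_to_decimal` and the use of a hemisphere letter to decide the sign are assumptions, not part of the workshop material.

```python
def dms_to_decimal(degrees, minutes, seconds, hemisphere):
    """Convert degrees/minutes/seconds plus a hemisphere letter to signed decimal degrees."""
    if hemisphere not in ("N", "S", "E", "W"):
        raise ValueError("hemisphere must be one of N, S, E, W")
    # DD = D + (M + S / 60) / 60
    dd = degrees + (minutes + seconds / 60) / 60
    # West longitudes and south latitudes are stored as negative values.
    return -dd if hemisphere in ("S", "W") else dd


# The coordinate pair from the slide: 113° 15' 23" W, 21° 56' 07" N
lon = dms_to_decimal(113, 15, 23, "W")
lat = dms_to_decimal(21, 56, 7, "N")
print(round(lon, 5), round(lat, 5))  # -113.25639 21.93528
```

Longitude is passed first here because it is the x-coordinate; many GIS packages expect coordinates in x, y order.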
Data Conversion
• Conversion is often the easiest way to import digital spatial data into a GIS
• Data transfer often relies on the exchange of data in mostly proprietary file formats using the import/export functions of commercial GIS packages
• Open-source data conversion software is becoming widely available
Conversion of hardcopy maps to digital data
• Turning features that are visible on a hardcopy map into digital point, line, polygon, and attribute information
• In many GIS projects this is the step that requires by far the most time and resources
• Newer methods are emerging to minimize this arduous step
Conversion of hardcopy maps to digital data (cont.)
• Digitizing
  • Manual digitizing
  • Heads-up digitizing
• Scanning
• Raster-to-Vector
Manual Digitizing
• Most common form of coordinate data input
• Requires a digitizing table
  • Ranging in size (25x25 cm to 150x200 cm)
• Ideally the map should be flat and not torn or folded
• Cost: hundreds (300) to thousands (5000)
Digitizing steps (how points are recorded)
• Trace features to be digitized with a pointing device (cursor)
• Point mode: click at positions where the direction changes
• Stream mode: the digitizer automatically records the position at regular intervals or when the cursor has moved a fixed distance
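The distance-based variant of stream mode can be illustrated with a short Python sketch. The function name `stream_mode`, the sample cursor trace and the distance threshold are all hypothetical; a real digitizer applies this logic in its driver or in the GIS software.

```python
import math


def stream_mode(cursor_positions, min_distance):
    """Record a cursor position only once it is at least min_distance from the last recorded one."""
    recorded = []
    for x, y in cursor_positions:
        if not recorded:
            recorded.append((x, y))  # always keep the starting point
            continue
        last_x, last_y = recorded[-1]
        if math.hypot(x - last_x, y - last_y) >= min_distance:
            recorded.append((x, y))
    return recorded


# Hypothetical cursor trace in digitizing units, sampled while the operator moves the cursor.
trace = [(0.0, 0.0), (0.4, 0.0), (1.0, 0.0), (1.3, 0.0), (2.1, 0.0)]
print(stream_mode(trace, 1.0))  # [(0.0, 0.0), (1.0, 0.0), (2.1, 0.0)]
```

A smaller threshold records more vertices, which follows curves more closely but produces larger files and more points to edit later.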
Control Points
• If a large map is digitized in several stages and the map has to be removed from the digitizing table occasionally, the control points allow the exact re-registration of the map on the digitizing board.
• Control points are chosen for which the real-world coordinates in the base map’s projection system are known.
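One common way to use control points is to fit an affine transformation from digitizer units to real-world coordinates. The sketch below assumes exactly three non-collinear control points and solves for the six affine parameters with Cramer's rule; the function names and the coordinate values are made up for illustration, and GIS packages typically use more points with a least-squares fit.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as a list of three rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def solve3(m, rhs):
    """Solve the 3x3 linear system m * s = rhs with Cramer's rule."""
    d = det3(m)
    if abs(d) < 1e-12:
        raise ValueError("control points are collinear")
    solution = []
    for col in range(3):
        replaced = [row[:col] + [rhs[r]] + row[col + 1:] for r, row in enumerate(m)]
        solution.append(det3(replaced) / d)
    return solution


def fit_affine(table_points, world_points):
    """Return a function mapping digitizer (x, y) to real-world (X, Y)."""
    m = [[x, y, 1.0] for x, y in table_points]
    a, b, c = solve3(m, [wx for wx, _ in world_points])
    d, e, f = solve3(m, [wy for _, wy in world_points])

    def transform(x, y):
        return (a * x + b * y + c, d * x + e * y + f)

    return transform


# Hypothetical control points: digitizer units and projected coordinates in metres.
table_points = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
world_points = [(500000.0, 4000000.0), (501000.0, 4000000.0), (500000.0, 4001000.0)]
to_world = fit_affine(table_points, world_points)
print(to_world(5.0, 5.0))  # (500500.0, 4000500.0)
```

If the map is taken off the table and put back, digitizing the same control points again and refitting restores the registration.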
Digitizing table
• Grid of wires in the table creates a magnetic field which is detected by the cursor
• X/Y coordinates in digitizing units are fed directly into GIS
• High precision in coordinate recording
Heads-Up Digitizing I
• Features are traced from a map drawn on a transparent sheet attached to the screen
• An option if no digitizer is available, but accuracy is very low
Heads-Up Digitizing II
• Common today is heads-up digitizing, where the operator uses a scanned map, air photo or satellite image as a backdrop and traces features with a mouse
• This method yields more accurate results
• Quicker and easier to retrace and save steps
Heads-Up Digitizing II (cont.)
• Raster-scanned image on the computer screen
• Operator follows lines on-screen in vector mode
Digitizing Errors
• Undershoots
• Dangles
• Spurious polygons
Digitizing errors
• Any digitized map requires considerable post-processing
  • Check for missing features
  • Connect lines
  • Remove spurious polygons
• Some of these steps can be automated
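As one example of an automated check, the sketch below flags line endpoints that are not shared with any other line endpoint, which are candidates for undershoots or overshoots. It is a deliberate simplification: the function name `find_dangles` is hypothetical, lines are assumed to be lists of (x, y) tuples, and an endpoint that correctly touches the middle of another line would also be flagged.

```python
from collections import Counter


def find_dangles(lines):
    """Return endpoints that occur only once across all line start and end points."""
    counts = Counter()
    for line in lines:
        counts[line[0]] += 1   # start point
        counts[line[-1]] += 1  # end point
    return [point for point, n in counts.items() if n == 1]


# Three hypothetical digitized lines forming a connected chain.
lines = [
    [(0, 0), (5, 0)],
    [(5, 0), (5, 5)],
    [(5, 5), (9, 5)],
]
print(find_dangles(lines))  # [(0, 0), (9, 5)]
```

The flagged points still need operator review, since the two ends of a genuine dead-end road are legitimate dangles.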
Fixing Errors
• Some of the common digitizing errors shown in the figure can be avoided by using the digitizing software’s snap tolerances, which are defined by the user
• For example, the user might specify that all endpoints of a line that are closer than 1 mm to another line will automatically be connected (snapped) to that line
• Small sliver polygons that are created when a line is digitized twice can also be automatically removed
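A snap tolerance can be sketched as follows: an endpoint is moved onto the nearest point of a target line if that point is closer than the tolerance, and left alone otherwise. The function names and sample coordinates are hypothetical, and real digitizing software may snap to vertices, edges, or both.

```python
import math


def nearest_point_on_segment(p, a, b):
    """Return the point on segment a-b that is closest to p."""
    ax, ay = a
    bx, by = b
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0:
        return a  # degenerate segment
    t = ((p[0] - ax) * dx + (p[1] - ay) * dy) / length_sq
    t = max(0.0, min(1.0, t))  # clamp to the segment
    return (ax + t * dx, ay + t * dy)


def snap_endpoint(endpoint, target_line, tolerance):
    """Move endpoint onto target_line if it lies closer than tolerance; otherwise return it unchanged."""
    best, best_dist = endpoint, tolerance
    for a, b in zip(target_line, target_line[1:]):
        candidate = nearest_point_on_segment(endpoint, a, b)
        dist = math.dist(endpoint, candidate)
        if dist < best_dist:
            best, best_dist = candidate, dist
    return best


road = [(0.0, 0.0), (10.0, 0.0)]
print(snap_endpoint((5.0, 0.6), road, 1.0))  # (5.0, 0.0): undershoot closed
print(snap_endpoint((5.0, 3.0), road, 1.0))  # (5.0, 3.0): too far, left alone
```

Choosing the tolerance is a trade-off: too small and undershoots remain, too large and features that are genuinely separate get joined.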
Advantages and Disadvantages of Digitizing
Advantages
• It is easy to learn and thus does not require expensive skilled labor
• Attribute information can be added during the digitizing process
• High accuracy can be achieved through manual digitizing; i.e., there is usually minimal loss of accuracy compared to the source map
Advantages and Disadvantages of Digitizing
Disadvantages
• It is a tedious activity, possibly leading to operator fatigue and resulting quality problems which may require considerable post-processing
• It is slow. Large-scale data conversion projects may thus require a large number of operators and digitizing tables
• The accuracy of digitized maps is limited by the quality of the source material
Scanning
A viable alternative to digitizing
• The map is placed onto the scanning surface where light is directed at the map at an angle
• A photosensitive device records the intensity of light reflected for each cell or pixel in a very fine raster grid
• In gray scale mode, the light intensity is converted directly into a numeric value, for example into a number between 0 (black) and 255 (white)
• In binary mode, the light intensity is converted into white or black (0/1) cell values according to a threshold light intensity
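The difference between gray scale and binary mode can be shown with a small thresholding sketch. The pixel values, the threshold of 128 and the choice of 1 for white and 0 for black are assumptions made for illustration; a scanner performs this step in hardware or in its driver.

```python
def to_binary(gray_rows, threshold=128):
    """Convert 0-255 gray values to 1 (white, at or above threshold) or 0 (black, below it)."""
    return [[1 if value >= threshold else 0 for value in row] for row in gray_rows]


# Hypothetical 3x3 gray scale scan: a dark diagonal line on light paper.
scan = [
    [250, 240, 30],
    [245, 20, 235],
    [15, 238, 242],
]
for row in to_binary(scan):
    print(row)
# [1, 1, 0]
# [1, 0, 1]
# [0, 1, 1]
```

A threshold that is too high turns paper stains into black pixels, while one that is too low breaks up faint lines, which is one reason scanned maps often need editing.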
Scanning
• Electronic detector moves across map and records light intensity for regularly shaped pixels
• Flat-bed scanner
• Drum-scanner (pictured)
Scanning (cont.)
Types of scanners
• Flat
  • small format, low cost, good for small tasks
• Drum
  • high precision but expensive and slow
• Feed
  • fast, good precision, lower cost than drum
[Diagram labels: optical sensor, pixel width, R/G/B color splicing, computer]
Scanning (cont.)
• Direct use of scanned images
  • e.g., scanned air photos
  • digital topographic maps in raster format
Scanning (cont.)
• Scanner output is a raster data set that usually needs to be converted into a vector representation
  • manually (on-screen digitizing)
  • automated (raster-vector conversion), line-tracing, e.g., MapScan
• Often requires considerable editing
Advantages and Disadvantages of Scanning
Advantages
• Scanned maps can be used as image backdrops for vector information
• Scanned topographic maps can be used in combination with digitized EA boundaries for the production of enumerator maps
• Clear base maps or original color separations can be vectorized relatively easily using raster-to-vector conversion software
• Small-format scanners are relatively inexpensive and provide quick data capture
Advantages and Disadvantages of Scanning
Disadvantages
• Converting large maps with small-format scanners requires tedious re-assembly of the individual parts
• Large-format, high-throughput scanners are expensive
• Despite recent advances in vectorization software associated with scanning, considerable manual editing and attribute labeling may still be required
Raster to Vector Conversion
Gets scanned/image data into vector format
• Automatic mode: the system converts all lines on the raster image into sequences of coordinates automatically. The automated raster-to-vector process starts with a line-thinning algorithm
• Semi-automatic mode: the operator clicks on each line that needs to be converted; the system then traces that line to the nearest intersections and converts it into a vector representation
OBIA Raster to Vector Conversion
• Object-Based Image Analysis (OBIA) is a tentative name for a sub-discipline of GIScience devoted to partitioning remote sensing (RS) imagery into meaningful image-objects, and assessing their characteristics through spatial, spectral and temporal scale.
• At its most fundamental level, OBIA requires image segmentation, attribution, classification and the ability to query and link individual objects (a.k.a. segments) in space and time. In order to achieve this, OBIA incorporates knowledge from a vast array of disciplines involved in the generation and use of geographic information (GI).
Object-Based Image Analysis
OBIA Dwelling Identification
• Segmentation based
• Pixel based
• Automated digitizing
Object-Based Image Analysis
• Increasing demand for updated geo-spatial information and rapid information extraction
• Complex image content of VHSR (very high spatial resolution) data needs to be structured and understood
• Huge amounts of data can only be utilized by automated analysis and interpretation
• New target classes and high variety of instances
• Monitoring systems and update cycles
• Transferability, objectivity, transparency, flexibility
Editing
• Manual digitizing is error prone
• Objective is to produce an accurate representation of the original map data
• This means that all lines that connect on the map must also connect in the digital database
• There should be no missing features and no duplicate lines
• The most common types of errors need to be fixed
  • Reconnect disconnected line segments, etc.
Some common digitizing errors
[Figure: spike, undershoot, overshoot, missing line, line digitized twice]
Building Topology
• GIS determines relationships between features in the database
• System will determine intersections between two or more roads and will create nodes
• For polygon data, the system will determine which lines define the border of each polygon
• After the completed digital database has been verified to be error-free, the final step is adding additional attributes
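Node creation at road intersections can be sketched as a pairwise segment-intersection test. This is only an illustration of the idea: the function names and road geometries are hypothetical, collinear overlaps are ignored, and a production GIS uses spatial indexing rather than comparing every pair of segments.

```python
from itertools import combinations


def segment_intersection(p1, p2, p3, p4):
    """Return the intersection point of segments p1-p2 and p3-p4, or None if they do not cross."""
    x1, y1 = p1
    x2, y2 = p2
    x3, y3 = p3
    x4, y4 = p4
    denom = (x2 - x1) * (y4 - y3) - (y2 - y1) * (x4 - x3)
    if denom == 0:
        return None  # parallel or collinear segments are skipped in this sketch
    t = ((x3 - x1) * (y4 - y3) - (y3 - y1) * (x4 - x3)) / denom
    u = ((x3 - x1) * (y2 - y1) - (y3 - y1) * (x2 - x1)) / denom
    if 0 <= t <= 1 and 0 <= u <= 1:
        return (x1 + t * (x2 - x1), y1 + t * (y2 - y1))
    return None


def build_nodes(roads):
    """Map each intersection point to the set of road names that meet there."""
    nodes = {}
    for (name_a, line_a), (name_b, line_b) in combinations(roads.items(), 2):
        for a1, a2 in zip(line_a, line_a[1:]):
            for b1, b2 in zip(line_b, line_b[1:]):
                point = segment_intersection(a1, a2, b1, b2)
                if point is not None:
                    nodes.setdefault(point, set()).update((name_a, name_b))
    return nodes


roads = {
    "Road A": [(0.0, 5.0), (10.0, 5.0)],
    "Road B": [(5.0, 0.0), (5.0, 10.0)],
    "Road C": [(20.0, 0.0), (20.0, 10.0)],
}
print(build_nodes(roads))  # one node at (5.0, 5.0) shared by Road A and Road B
```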
Building Topology
• The building of relationships between objects
• Feature topology describes the spatial relationships between connecting or adjacent geographic features, such as roads connecting at intersections
• The user typically does not have to worry about how the GIS stores topological information
Converting Between Different Digital Formats
• All software systems provide links to other formats
• But the number and functionality of import routines vary between packages
• Problems often occur because software developers are reluctant to publish the exact file formats that their systems use, which makes information exchange unstable (e.g., file geodatabase [.gdb])
• One option is to use a third, intermediate data format
  • Example: AutoCAD’s DXF format
Georeferencing/Geocoding
• Georeferencing
  • Converting map coordinates to the real-world coordinates corresponding to the source map’s cartographic projection
• Geocoding
  • Attaching codes to the digitized features (geocoded features)
  • Each line representing a road would obtain a code that refers to the road status (dirt road, one lane road, two lane highway, etc.)
  • Or a unique code that can be linked to a list of street names
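The idea of attaching codes to features and linking them to lookup lists can be sketched with plain dictionaries. All code values, field names and street names below are invented for illustration; in practice the lookup tables would live in the GIS attribute tables or an external database.

```python
# Hypothetical lookup tables keyed by code.
ROAD_STATUS = {1: "dirt road", 2: "one lane road", 3: "two lane highway"}
STREET_NAMES = {101: "Example Street", 102: "Sample Avenue"}

# Each digitized road line carries only codes, not descriptive text.
roads = [
    {"feature_id": 1, "status_code": 3, "street_code": 101},
    {"feature_id": 2, "status_code": 1, "street_code": 102},
]


def describe(feature):
    """Resolve a feature's codes to readable labels using the lookup tables."""
    status = ROAD_STATUS.get(feature["status_code"], "unknown status")
    name = STREET_NAMES.get(feature["street_code"], "unnamed")
    return f"{name} ({status})"


for road in roads:
    print(road["feature_id"], describe(road))
# 1 Example Street (two lane highway)
# 2 Sample Avenue (dirt road)
```

Storing codes rather than text on each feature keeps the geometry file small and lets a street be renamed in one place.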
For attribute data:
• spreadsheets
• links to external database management systems (DBMS)
• tabulation programs (IMPS, Redatam)
Sample components of a digital EA map
[Figure: a digital EA map assembled from separate layers: buildings, street network, boundaries, building numbers, annotation and symbols, and neatlines and legend. The composed sample sheet is titled “Enumeration Area Map” (Province: Cartania; District: Chartes; Locality: Maptown; EA-Code: 14 032 0221 00361), with symbols for hospital, church and school, an approximate scale bar (0 to 200 m), and the credit “Census 2000, National Statistical Office, July 1998”.]
A Simpler Alternative
• In many countries, EA map design may be simpler than in this example
• Instead of a fully integrated digital base map in vector format, rasterized images of topographic maps may be used as a backdrop for EA boundaries
• Use what is already available
A Simpler Alternative
• In some instances, map features may be more generalized, for instance by using only the centerlines for the streets and polygons for entire city blocks rather than for individual houses
• This can include the use of free data as a baseline or starting point in the creation or updating of census-related maps
Agencies to contact
• National geographic institute / mapping agency
• Military mapping services
• Province, district and municipal governments
• Various government or private organizations dealing with spatial data
• Geological or hydrological survey
• Environmental protection authority
• Transport authority
• Utility and communication sector companies
• Land titling & surveying agencies
• Academic institutions
• Donor activities
[Flowchart: stages of geographic database development]
Sources of geographic information
• Identify existing data sources
  • Paper maps, existing printed air photos and satellite images
  • Existing digital maps
• Additional geographic data collection
  • Field mapping products such as sketch maps
  • Digital air photos and satellite images
  • GPS coordinate collection
Data conversion
• Digitizing
• Generate lines and polygons
• Scanning
• Raster to vector conversion (automated or semi-automated)
• Editing geographic features
• Construct topology for geographic features
Digital map data integration
• Georeferencing (coordinate transformation and projection change)
• Coding (labelling) of digital geographic features
• Combine and integrate digital map sheets
Parallel activity
• Additional delineation of EA boundaries
• Develop geographic attributes database