Data Conversion & Integration
Workshop on Census Cartography and Management, Bangkok, Thailand, 15–19 October 2007

Data Conversion/Integration Process
• Data Inventory
  • Existing hard-copy maps / digital data
• Data Collection (additional)
  • Satellite imagery, aerial photos, etc.
  • Field collection (hand-held devices: GPS, etc.)
• Data Input/Conversion
  • Keyboard entry of coordinates
  • Digitizing/Scanning/Raster-to-Vector
  • Editing/Building Topology
• Data Integration
  • Georeferencing/Geocoding

About Geographic Data
• Conversion of hardcopy to digital maps is the most time-consuming task in GIS
• Up to 80% of project costs
• Example: estimated to be a US $10 billion annual market
• Labor intensive, tedious and error-prone

Data Inventory
• National overview maps
  • 1:250,000 and 1:5,000,000 (small scale)
  • show major civil divisions, urban areas, physical features such as roads, rivers, lakes, elevation, etc.
  • used for planning purposes

Data Inventory (cont.)
• Topographic maps: scales range from 1:25,000 to 1:250,000 (mid-scale)
• Town and city maps at large cartographic scales, showing roads, city blocks, parks, etc. (1:1,000 to 1:5,000)
• Maps of administrative units at all levels of civil division
• Thematic maps showing population distribution for previous census dates, or any features that may be useful for census mapping

Existing Digital Data
• Digital maps
• Satellite imagery
• GPS coordinates
• Etc.

Data Collection
[Diagram labels: Capture (Aerial Photography, Remote Sensing, Surveying, GPS, Maps); GDB; Census & Surveys; GIS Management]

Aerial photography
• Aerial photography is obtained using specialized cameras on board low-flying planes. The camera captures the image digitally or on photographic film.
• Aerial photography is the method of choice for mapping applications that require high accuracy and fast completion of the tasks.
• Photogrammetry: the science of obtaining measurements from photographic images.

Aerial photography (cont.)
• Traditional end product: printed photos
• Today: digital image (scanned from photo) in a standard graphics format (TIFF, JPEG) that can be integrated in a GIS or desktop mapping package
• Trend: fully digital process
• Digital orthophotos
  • corrected for camera angle, atmospheric distortions and terrain elevation
  • georeferenced in a standard projection (e.g. UTM)
  • geometric accuracy of a map
  • large detail of a photograph

Remote sensing process
[Diagram labels: Sources of Energy; Sensing System; Receiving Station; Earth Surface]

GPS
• Collection of point data
• Stored as “waypoints”
• Accuracy dependent on device and environmental variables

Surveying
• Paper based
  • Manual recording of information
• Electronic based
  • Handheld device

Geographic data input/conversion
• Keyboard entry of coordinates
• Digitizing
• Scanning and raster-to-vector conversion
• Field work data collection using
  • Global positioning systems
• Air photos and remote sensing

Keyboard entry
• Keyboard entry of coordinate data
• e.g., point lat/long coordinates
  • from a gazetteer (a listing of place names and their coordinates)
  • from locations recorded on a map

Latitude/Longitude coordinate conversion
• Latitude is the y-coordinate, longitude is the x-coordinate
• Common format is degrees, minutes, seconds: 113° 15' 23" W, 21° 56' 07" N
• To represent lat/long in a GIS, we need decimal degrees: -113.25639, 21.93528
• Conversion: DD = D + (M + S / 60) / 60, with a negative sign for west longitudes and south latitudes

Data Conversion
• Conversion is often the easiest way to import digital spatial data into a GIS
• Data transfer often relies on the exchange of data in mostly proprietary file formats, using the import/export functions of commercial GIS packages
• Open-source data conversion software is becoming widely available

Conversion of hardcopy maps to digital data
• Turning features that are visible on a hardcopy map into digital point, line, polygon, and attribute information
• In many GIS projects this is the step that requires by far the most time and resources
• Newer methods are emerging to minimize this arduous step

Conversion of hardcopy maps to digital data (cont.)
• Digitizing
  • Manual digitizing
  • Heads-up digitizing
• Scanning
• Raster-to-Vector
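
The degrees-minutes-seconds conversion from the Latitude/Longitude slide can be sketched in a few lines of Python. This is a minimal illustration rather than part of the original slides: the function name dms_to_decimal and the hemisphere letter argument are assumptions chosen for the example.

```python
def dms_to_decimal(degrees, minutes, seconds, hemisphere):
    """Convert degrees/minutes/seconds to signed decimal degrees."""
    if hemisphere not in ("N", "S", "E", "W"):
        raise ValueError("hemisphere must be one of N, S, E, W")
    # DD = D + (M + S / 60) / 60, the formula given on the slide
    decimal = degrees + (minutes + seconds / 60) / 60
    # West longitudes and south latitudes are negative by convention
    if hemisphere in ("S", "W"):
        decimal = -decimal
    return decimal


# The two coordinates used as the slide's example
lon = dms_to_decimal(113, 15, 23, "W")
lat = dms_to_decimal(21, 56, 7, "N")
print(round(lon, 5), round(lat, 5))  # -113.25639 21.93528
```

Keeping the hemisphere as a separate argument avoids the common keyboard-entry mistake of typing a western longitude as a positive number.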

Manual Digitizing
Most common form of coordinate data input
• Requires a digitizing table
• Tables range in size from 25 x 25 cm to 150 x 200 cm
• Ideally the map should be flat and not torn or folded
• Cost: hundreds (300) to thousands (5000)

Digitizing steps (how points are recorded)
• Trace features to be digitized with a pointing device (cursor)
• Point mode: click at positions where the direction changes
• Stream mode: the digitizer automatically records the position at regular intervals or when the cursor has moved a fixed distance

Control Points
• If a large map is digitized in several stages and the map has to be removed from the digitizing table occasionally, the control points allow the exact re-registration of the map on the digitizing board.
• Control points are chosen at locations whose real-world coordinates in the base map’s projection system are known.
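
One way to picture how control points register a map is to fit a transformation from digitizer-table units to real-world coordinates. The sketch below assumes an affine relationship (shift, scale, rotation, shear) fitted from exactly three control points; the slides do not say which transformation the digitizing software uses, and the function names and sample coordinates are hypothetical.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def solve3(matrix, rhs):
    """Solve a 3x3 linear system with Cramer's rule."""
    d = det3(matrix)
    if abs(d) < 1e-12:
        raise ValueError("control points must not lie on one line")
    solution = []
    for col in range(3):
        replaced = [row[:] for row in matrix]
        for r in range(3):
            replaced[r][col] = rhs[r]
        solution.append(det3(replaced) / d)
    return solution


def fit_affine(table_pts, world_pts):
    """Fit X = a*x + b*y + c and Y = d*x + e*y + f from three control points."""
    matrix = [[x, y, 1.0] for x, y in table_pts]
    a, b, c = solve3(matrix, [wx for wx, _ in world_pts])
    d, e, f = solve3(matrix, [wy for _, wy in world_pts])
    return a, b, c, d, e, f


def to_world(params, x, y):
    """Apply the fitted transformation to one digitized point."""
    a, b, c, d, e, f = params
    return a * x + b * y + c, d * x + e * y + f


# Hypothetical control points: table position in cm -> projected metres
table = [(0.0, 0.0), (50.0, 0.0), (0.0, 40.0)]
world = [(500000.0, 2400000.0), (512500.0, 2400000.0), (500000.0, 2410000.0)]
params = fit_affine(table, world)
print(to_world(params, 25.0, 20.0))
```

If the map is taken off the table and put back, digitizing the same three control points again and refitting gives a new set of parameters, so features traced in the second session land in the same real-world positions as those from the first.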

Digitizing table
• Grid of wires in the table creates a magnetic field which is detected by the cursor
• X/Y coordinates in digitizing units are fed directly into the GIS
• High precision in coordinate recording

Heads-Up Digitizing I
• Features are traced from a map drawn on a transparent sheet attached to the screen
• An option if no digitizer is available, but accuracy is very low

Heads-Up Digitizing II
• Common today is heads-up digitizing, where the operator uses a scanned map, air photo or satellite image as a backdrop and traces features with a mouse
• This method yields more accurate results
• Quicker and easier to retrace and save steps

Heads-Up Digitizing II (cont.)
• Raster-scanned image on the computer screen
• Operator follows lines on-screen in vector mode

Digitizing Errors
• Undershoots
• Dangles
• Spurious polygons

Digitizing errors
• Any digitized map requires considerable post-processing
• Check for missing features
• Connect lines
• Remove spurious polygons
• Some of these steps can be automated

Fixing Errors
• Some of the common digitizing errors shown in the figure can be avoided by using the digitizing software’s user-defined snap tolerances
• For example, the user might specify that all endpoints of a line that are closer than 1 mm to another line will automatically be connected (snapped) to that line
• Small sliver polygons that are created when a line is digitized twice can also be removed automatically

Advantages and Disadvantages of Digitizing
Advantages
• It is easy to learn and thus does not require expensive skilled labor
• Attribute information can be added during the digitizing process
• High accuracy can be achieved through manual digitizing; i.e., there is usually minimal loss of accuracy compared to the source map

Advantages and Disadvantages of Digitizing (cont.)
Disadvantages
• It is a tedious activity, possibly leading to operator fatigue and resulting quality problems which may require considerable post-processing
• It is slow. Large-scale data conversion projects may thus require a large number of operators and digitizing tables
• The accuracy of digitized maps is limited by the quality of the source material

Scanning
A viable alternative to digitizing
• The map is placed onto the scanning surface where light is directed at the map at an angle
• A photosensitive device records the intensity of light reflected for each cell or pixel in a very fine raster grid
• In gray-scale mode, the light intensity is converted directly into a numeric value, for example into a number between 0 (black) and 255 (white)
• In binary mode, the light intensity is converted into white or black (0/1) cell values according to a threshold light intensity

Scanning (cont.)
• Electronic detector moves across the map and records light intensity for regularly shaped pixels
• Flat-bed scanner
• Drum scanner (pictured)

Scanning (cont.)
[Diagram labels: computer; R, G, B color splicing; optical sensor; pixel width]
Types of scanners
• Flat
  • small format, low cost, good for small tasks
• Drum
  • high precision but expensive and slow
• Feed
  • fast, good precision, lower cost than drum

Scanning (cont.)
• Direct use of scanned images
  • e.g., scanned air photos
  • digital topographic maps in raster format

Scanning (cont.)
• Scanner output is a raster data set that usually needs to be converted into a vector representation
  • manually (on-screen digitizing)
  • automated (raster-to-vector conversion, line tracing), e.g., MapScan
• Often requires considerable editing

Advantages and Disadvantages of Scanning
Advantages
• Scanned maps can be used as image backdrops for vector information
• Scanned topographic maps can be used in combination with digitized EA boundaries for the production of enumerator maps
• Clear base maps or original color separations can be vectorized relatively easily using raster-to-vector conversion software
• Small-format scanners are relatively inexpensive and provide quick data capture

Advantages and Disadvantages of Scanning (cont.)
Disadvantages
• Converting large maps with small-format scanners requires tedious re-assembly of the individual parts
• Large-format, high-throughput scanners are expensive
• Despite recent advances in vectorization software associated with scanning, considerable manual editing and attribute labeling may still be required

Raster to Vector Conversion
Gets scanned/image data into vector format
• Automatic mode: the system converts all lines on the raster image into sequences of coordinates automatically. The automated raster-to-vector process starts with a line-thinning algorithm
• Semi-automatic mode: the operator clicks on each line that needs to be converted; the system then traces that line to the nearest intersections and converts it into a vector representation

OBIA Raster to Vector Conversion
• Object-Based Image Analysis (OBIA) is a tentative name for a sub-discipline of GIScience devoted to partitioning remote sensing (RS) imagery into meaningful image-objects, and assessing their characteristics through spatial, spectral and temporal scale.
• At its most fundamental level, OBIA requires image segmentation, attribution, classification and the ability to query and link individual objects (a.k.a. segments) in space and time. In order to achieve this, OBIA incorporates knowledge from a vast array of disciplines involved in the generation and use of geographic information (GI).
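
Line-thinning algorithms generally work on a two-valued image, so a gray-scale scan is first reduced to the binary mode described on the Scanning slide. The sketch below shows only that thresholding step, not thinning or tracing. The threshold of 128, the choice of 1 for dark (ink) cells and the sample pixel values are assumptions made for the illustration.

```python
def to_binary(gray_rows, threshold=128):
    """Turn gray values (0 = black, 255 = white) into 0/1 cells.

    Assumption for this sketch: 1 marks a dark cell (map ink) and
    0 marks a light cell (paper background).
    """
    return [[1 if value < threshold else 0 for value in row]
            for row in gray_rows]


# Hypothetical 4 x 5 patch of a scanned map with one dark line on light paper
scan = [
    [250, 248, 30, 251, 249],
    [247, 40, 35, 252, 250],
    [245, 38, 246, 249, 251],
    [36, 42, 244, 250, 248],
]

binary = to_binary(scan)
for row in binary:
    # Show ink cells as '#' and background cells as '.'
    print("".join("#" if cell else "." for cell in row))
```

A threshold that is too high turns paper stains into spurious lines, while one that is too low breaks faint lines into fragments, which is one reason vectorized scans still need manual editing.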

OBIA Dwelling Identification
• Segmentation based
• Pixel based
• Automated digitizing

Object-Based Image Analysis
• Increasing demand for updated geo-spatial information, rapid information extraction
• Complex image content of VHSR data needs to be structured and understood
• Huge amount of data can only be utilized by automated analysis and interpretation
• New target classes and high variety of instances
• Monitoring systems and update cycles
• Transferability, objectivity, transparency, flexibility

Editing
• Manual digitizing is error prone
• The objective is to produce an accurate representation of the original map data
• This means that all lines that connect on the map must also connect in the digital database
• There should be no missing features and no duplicate lines
• Fixing the most common types of errors: reconnecting disconnected line segments, etc.

Some common digitizing errors
[Figure labels: spike; undershoot; missing line; overshoot; line digitized twice]

Building Topology
• GIS determines relationships between features in the database
• The system will determine intersections between two or more roads and will create nodes
• For polygon data, the system will determine which lines define the border of each polygon
• After the completed digital database has been verified to be error-free, the final step is adding additional attributes

Building Topology (cont.)
• The building of relationships between objects
• Feature topology describes the spatial relationships between connecting or adjacent geographic features such as roads connecting at intersections
• The user typically does not have to worry about how the GIS stores topological information

Converting Between Different Digital Formats
• All software systems provide links to other formats
• But the number and functionality of import routines vary between packages
• Problems often occur because software developers are reluctant to publish the exact file formats that their systems use, leading to instability of information (e.g., file geodatabase [.gdb])
• Option of using a third data format
  • Example: AutoCAD’s DXF format

Georeferencing/Geocoding
• Georeferencing
  • Converting map coordinates to the real-world coordinates corresponding to the source map’s cartographic projection
• Geocoding
  • Attaching codes to the digitized features (geocoded features)
  • Each line representing a road would obtain a code that refers to the road status (dirt road, one lane road, two lane highway, etc.)
  • Or a unique code that can be linked to a list of street names
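
The geocoding idea above, where each digitized line carries codes that link to descriptive tables, can be sketched with plain Python dictionaries. The code values, field names and the two sample roads are hypothetical and chosen only for illustration; a real project would hold these tables in the GIS or a database.

```python
# Hypothetical lookup tables; codes and names are illustrative only
ROAD_STATUS = {1: "dirt road", 2: "one lane road", 3: "two lane highway"}
STREET_NAMES = {101: "Lambert Avenue", 102: "Tobler Street"}

# Each digitized line keeps its coordinates plus the attached codes
roads = [
    {"coords": [(0, 0), (120, 0)], "status_code": 3, "street_id": 101},
    {"coords": [(120, 0), (120, 80)], "status_code": 1, "street_id": 102},
]


def describe(road):
    """Resolve the codes on one road feature into readable attributes."""
    status = ROAD_STATUS.get(road["status_code"], "unknown status")
    name = STREET_NAMES.get(road["street_id"], "unnamed")
    return f"{name}: {status}"


for road in roads:
    print(describe(road))
```

Storing short codes on the features and the descriptions in separate tables means a street can be renamed or a status list extended without touching the digitized geometry.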

For attribute data:
• spreadsheets
• links to external database management systems (DBMS)
• tabulation programs (IMPS, Redatam)

Sample components of a digital EA map
[Figure: layers of a sample enumeration area map: Buildings; Street Network; Building numbers; Boundaries; Annotation and symbols; Neatlines and legend. Title block: Enumeration Area Map, Province: Cartania (14), District: Chartes (032), Locality: Maptown (0221), EA-Code: 00361. Legend labels: District, Locality, EA, EA-Code, Building number, Hospital, Church, School; approximate scale bar 0 to 200 m; Census 2000, National Statistical Office, July 1998.]

A Simpler Alternative
• In many countries, EA map design may be simpler than in this example
• Instead of a fully integrated digital base map in vector format, rasterized images of topographic maps may be used as a backdrop for EA boundaries
• Use what is already available!

A Simpler Alternative (cont.)
• In some instances, map features may be more generalized, for instance by using only the centerlines for the streets and polygons for entire city blocks rather than for individual houses
• This can include the use of free data as a baseline or starting point in the creation or updating of census-related maps

Agencies to contact
• National geographic institute / mapping agency
• Military mapping services
• Province, district and municipal governments
• Various government or private organizations dealing with spatial data
• Geological or hydrological survey
• Environmental protection authority
• Transport authority
• Utility and communication sector companies
• Land titling & surveying agencies
• Academic institutions
• Donor activities

[Flowchart, boxes listed in reading order]
Sources of geographic information
• Additional geographic data collection
• Identify existing data sources
• Paper maps, existing printed air photos and satellite images
• Field mapping products such as sketch maps
• Digital air photos and satellite images
• GPS coordinate collection
• Existing digital maps
Data conversion
• Digitizing
• Generate lines and polygons
• Scanning
• Raster-to-vector conversion (automated or semi-automated)
• Editing geographic features
• Construct topology for geographic features
Digital map data integration
• Georeferencing (coordinate transformation and projection change)
• Coding (labelling) of digital geographic features
• Combine and integrate digital map sheets
Parallel activity
• Additional delineation of EA boundaries
• Develop geographic attributes database