Geocoding & Data Collection with GPS
Workshop on Census Cartography and Management, Port-of-Spain, Trinidad and Tobago, 22-26 October 2007

Summary
• Introduction to Geocoding
• Geocoding: Concepts and Definitions
• Relationship to other Census Processes
• Approaches to Data Collection
• NSO Benefits & Concluding Remarks
• Introduction to GPS
• How GPS Works
• Sources of Error & Accuracy
• Selecting a GPS
• Advantages & Disadvantages

Introduction
• Many NSOs have a specialized coding scheme and understand geocoding as a dynamic process
• Clarification within the statistical community
• Expansion and discussion on components and methods within the process of geocoding
• The purpose of this section is to introduce geocoding concepts relevant for census mapping and the different approaches to related data collection.

Geocoding
• Definitions:
  • Conceptual/Operational
  • Geocoding vs Georeferencing
• Census Hierarchies
• Coding Scheme
• Data Collection Methods
  • Direct Collection
  • Matching Approach
• Benefits for NSOs

• “Geocoding can be broadly defined as the assignment of a code to a geographic location. Usually however, Geocoding refers to a more specific assignment of geographic coordinates (latitude, longitude) to an individual address.”
• Reference: UN Report of the Expert Group Meeting on Contemporary Practices in Census Mapping and Use of Geographical Information Systems (2007)

Definition of Geocoding
• Conceptual - 2 situations:
  • The more general process of assigning geographic codes to features in a digital database.
  • A GIS function that determines a point location based on an address. Such point locations can generally be expected to be relatively precise (e.g. ±2 m) and to be based on the use of GPS technology.
• Operational
  • Geocoding is the computer-oriented process that converts information about a unit from which statistical information is collected into a set of coordinates describing the geographic position of that unit

Definition of Geocoding (cont.)
• Operational Elements
  • Collecting precise data at the level of point locations (or a very low geographic level such as a city block) and assigning codes for use in dissemination.
  • Coding the centroid, building corners, or building point-of-entry coordinates for a unit such as a block of land, building or dwelling
  • Coordinates must contain latitude and longitude, or standardized x and y points for gridded interpolation. A z coordinate may represent altitude or elevation
  • Codes cover each geographic unit and have a combinational relationship to distinguish different units (Enumeration Areas/Blocks)

Georeferencing vs Geocoding
• Georeferencing
  • Aligning geographic data to a known coordinate system so it can be analyzed, viewed, and queried with other geographic data
• Geocoding
  • The process of assigning geographic codes to features in a digital database (including the GIS operation for converting street addresses into spatial data that can be displayed as features on a map)

Relationship to Other Census Processes
• Movement into a fully GIS-based approach to census mapping
• Generation of high-quality maps for use in the collection phase
• Reduction of work required for updating maps for future censuses
• Aggregation of records into customized units for satisfying users’ requirements

Census Enumeration & the Geocoding System
• Delineation irrespective of the existence of addresses
• Ability to apply a geocode to any geographic areal unit
• Flexible Coding Scheme
• Ability to incorporate future administrative divisions
• Pre-enumeration geocoding is critical
  • Links between GIS boundaries and tabular census data

Census Hierarchies
• Define census geographic hierarchy
• Develop geographic coding scheme
• Develop an administrative and census units listing

Census Hierarchies: some principles
• Internal political boundaries
• Areal unit aggregation
• Resolution suitable to NSO needs and user demands
• Considers available datasets for continuous development
• The smaller the area defined by the geocode, the more flexible the results for subsequent users

Example of Administrative Hierarchy
[Figure: hierarchy diagram with levels country > region > province > district > sub-district, followed by rural locality > enumeration area and urban locality > ward > enumeration area]

Illustration of a nested Admin. Hierarchy
[Figure: nested map layers labelled Provinces, Districts, Localities, Enumeration areas]

Hierarchical Coding Scheme: operational considerations
- Geographic units are numbered at each level of the administrative hierarchy, leaving gaps between the numbers to allow for changes
- For example at the province level, units may be numbered 5, 10, 15 and so on.
A similar scheme would be used for lower-level administrative units and for enumeration areas.
- Since there are often, for example, more districts in a province than provinces in a country, more digits may be required at lower levels
- The unique identifier for the EA (the smallest-level unit) is the concatenation of the identifiers of the administrative units into which it falls

Example of a Coding Scheme
A small country could use the following coding scheme:
Province  2 digits
District  3 digits
Locality  4 digits
EA        4 digits
An EA code of 10 025 0105 0073 means that enumeration area number 73 is located in province 10, district 25 and locality 105. The unique code is stored in the database as a long integer or as a 13-character string variable.

Example of a Coding Scheme (cont.)
• The variable type needs to be the same in the census database and the geographic database.
• The integer variable has the advantage that subsets of records can be selected easily (SQL)
• Example of a query condition (SQL WHERE clause): ID > 1203501550000 AND ID < 1203501560000 will find all EAs within locality number 155 (in province 12, district 35) in the database or on the digital map.

Example of a Coding Scheme (cont.)
• Special coding conventions need to be developed in cases where administrative and reporting units are not hierarchical
• In any case, the administrative unit identifiers should be defined and used with complete consistency, since they are the link between GIS boundaries and the tabular census data.
• Maintenance: NSOs should maintain a Master List of EAs and administrative units with their respective codes, and apply any changes made to the Master List to the GIS and census databases.
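
The example coding scheme above (province 2 digits, district 3, locality 4, EA 4) can be sketched in a few lines of Python. This is only an illustrative sketch: the function names are hypothetical and not part of the workshop material, and it assumes EA number 0000 is never assigned, which is what makes the strict `>` and `<` bounds of the slide's query work. Note that a 13-digit code does not fit in a 32-bit integer, so the "long integer" must be a 64-bit type, and that the integer form drops the leading zero of a province code below 10, whereas the string form keeps it.

```python
# Field widths follow the slide's example scheme (13 digits in total).
FIELDS = (("province", 2), ("district", 3), ("locality", 4), ("ea", 4))


def build_ea_code(province, district, locality, ea):
    """Concatenate zero-padded identifiers into a 13-character EA code."""
    values = (province, district, locality, ea)
    parts = []
    for (name, width), value in zip(FIELDS, values):
        if not 0 <= value < 10 ** width:
            raise ValueError(f"{name}={value} does not fit in {width} digits")
        parts.append(f"{value:0{width}d}")
    return "".join(parts)


def parse_ea_code(code):
    """Split an EA code (spaces allowed) back into its components."""
    digits = code.replace(" ", "")
    if len(digits) != 13 or not digits.isdigit():
        raise ValueError("expected a 13-digit EA code")
    result, pos = {}, 0
    for name, width in FIELDS:
        result[name] = int(digits[pos:pos + width])
        pos += width
    return result


def locality_bounds(province, district, locality):
    """Return (low, high) so that low < ID < high selects every EA in a locality.

    Assumes EA number 0000 is unused, as in the slide's query.
    """
    low = int(build_ea_code(province, district, locality, 0))
    return low, low + 10 ** 4


print(build_ea_code(10, 25, 105, 73))        # 1002501050073
print(parse_ea_code("10 025 0105 0073"))     # province 10, district 25, locality 105, ea 73
print(locality_bounds(12, 35, 155))          # (1203501550000, 1203501560000)
```

The bounds returned for province 12, district 35, locality 155 are the two constants used in the query condition on the slide.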

Census Hierarchies
[Figure: census hierarchy for a given country, with levels Country, Province, District, Locality, Enumeration Areas, Blocks, Building, Dwelling]

Coding Scheme
250131402013
Digits 1-2 = State code
Digits 3-5 = County code
Digits 6-11 = Census Tract code
Digit 12 = Blockgroup code

Geocoding Classifications
• Disaggregation into Spatial Entities or Civil Divisions and Compatibility
  1st: Region, Province
  2nd: District, Municipality
  3rd: Town/Village
  4th: Dwelling
• Resultant geocoded units are placed within a set of latitude and longitude boundaries

Data Collection Methods
• Two main methods:
  • Direct Collection Approach
  • Matching Approach

Direct Collection Approach
• Digitizing from available topographic maps
• Direct collection using field techniques (e.g. GPS)
[Figure: digitizing from a topographic map; Global Positioning System (GPS); areas, streets, dwellings]

Matching approach
• Using an address locator database and street network database in a GIS
• Joining an address database to an existing spatial database for the area of interest
[Figure: street network diagram with labels First Avenue, Second Avenue, Main Street, street segment, nodes, left of street, right of street, and address numbers #1, #2, #32, #51, #99, #100]

Data Maintenance
• Cleaning Addresses
• Retaining only the key address elements
• Establish a Matchcode (indicator of which address elements will determine the geocode)

Record | Street Address | City | State | ZIPcode | Latitude | Longitude | Areakey | MatchCode
1 | 344 East 63rd | New York | NY | 10023 | 40.47 | 73.58 | 3502508100 | AS0

• Eliminating extraneous characters
• Standardizing spelling

Staff Expertise Recommendations
Task/condition | Direct collection | Matching
Existence of digital base map for country | Highly desirable | Highly desirable
Statistical staff with expertise in use of GPS | Essential | Not essential
Acquisition of large numbers of GPS receivers | Essential | Not essential
Geo-referenced list of addresses or equivalent | Not essential | Essential
Excellent address matching algorithms | Not essential | Essential
Existence of a rational, consistent, and locally recognized addressing system for housing units | Highly desirable | Essential

Geocoding: Benefits for National Statistical Offices
• Improved map creation for the field
• Customizable map outputs for specified regional activities
• Coding techniques are transparent and transferable
• Lays the groundwork for future statistical activities and coding schemes

Concluding Remarks
• Technologies are accessible and allow delineation irrespective of the existence of addresses
• Many methods and technologies are available to support accurate geocoding frameworks
• A geocoding system adds value to GIS-based spatial analysis of statistical data

Global Positioning Systems (GPS)
• Technology has revolutionized field mapping in recent years
• Prices of GPS receivers have dropped
• GPS methods have been integrated in many
applications
• User groups are widespread (utilities management, surveying and navigation). GPS has advanced field research in areas such as biology, forestry, geology, epidemiology and population studies

Global Positioning Systems (cont.)
• GPS has become a major tool in census cartographic applications
• Preparation and updating of enumeration area (EA) maps for census activities
• Location of point features such as service facilities or village centers
• Coordinates can be downloaded or entered manually into a digital mapping system or GIS, and can be combined with existing georeferenced information

How GPS Works
• GPS receivers collect the signals transmitted by a nominal constellation of 24 satellites: 21 active satellites and three spares. The system is called NAVSTAR and is maintained by the U.S. Department of Defense
• The satellites circle the earth in six orbital planes at an altitude of approximately 20,000 km. At any given time five to eight GPS satellites are within the “field of view” of a user on the earth’s surface
• The position on the earth’s surface is determined by measuring the distance from several satellites

[Figure: the global positioning system (GPS)]

The global positioning system (cont.)
• GPS satellites circle the Earth twice a day …
• The satellite signal:
  • Three kinds of coded information essential for determining a position
• The receiver:
  1. Calculates the distance to the first satellite from which it is able to receive a signal.
  2. Calculates the distance to a second satellite from which it is able to receive a signal.
  3. Repeats the operation in step 2 with a third satellite.

How GPS determines a location’s coordinates
[Figure: diagram with points a, b, c, u and x and a measured distance]

Sources of GPS signal errors
• Poor visibility of satellites due to obstacles
• Signal multipath
• Sources of error over which the user has no control
  • Atmospheric delays
  • Receiver clock errors
  • Orbital errors

Differential GPS
[Figure: differential GPS, showing the space segment (GPS satellites), control segment, user segment, DGPS ground station, DGPS mobile receiving station and correction signal]

GPS Accuracy
• Inexpensive GPS receivers
  • Within 15 to 100 meters for civilian applications.
• Differential GPS reduces error further
  • Accuracy of about 3-10 m can be achieved with quite affordable hardware and shorter observation times.
  • More expensive systems and longer data collection for each coordinate reading can yield sub-meter accuracy.
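
The receiver's three distance measurements, described under "How GPS Works", can be illustrated with a simplified two-dimensional position fix. This is a sketch under stated assumptions rather than how a receiver is actually implemented: it works in a flat x/y plane with exact distances, and the function name `trilaterate_2d` is hypothetical. Real receivers solve the problem in three dimensions and use an additional satellite to estimate the receiver clock error.

```python
def trilaterate_2d(p1, d1, p2, d2, p3, d3):
    """Find (x, y) from three known points and the measured distance to each.

    Subtracting the circle equation of p1 from those of p2 and p3 removes the
    squared unknowns and leaves two linear equations, solved by Cramer's rule.
    """
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    a11, a12 = 2 * (x2 - x1), 2 * (y2 - y1)
    a21, a22 = 2 * (x3 - x1), 2 * (y3 - y1)
    b1 = d1 ** 2 - d2 ** 2 + x2 ** 2 - x1 ** 2 + y2 ** 2 - y1 ** 2
    b2 = d1 ** 2 - d3 ** 2 + x3 ** 2 - x1 ** 2 + y3 ** 2 - y1 ** 2
    det = a11 * a22 - a12 * a21
    if abs(det) < 1e-12:
        raise ValueError("reference points are collinear; no unique solution")
    x = (b1 * a22 - a12 * b2) / det
    y = (a11 * b2 - a21 * b1) / det
    return x, y


# Illustrative values only: the true position is (3, 4).
position = trilaterate_2d((0, 0), 5.0, (10, 0), 65 ** 0.5, (0, 10), 45 ** 0.5)
print(position)
```

With noisy distances the three circles no longer meet in a single point, which is one way to picture why the error sources listed above translate directly into position error.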

Problems with GPS
• In dense urban settings, the accuracy of standard GPS (errors of about 15 m, and up to 100 m) may not be sufficient
• Differential GPS can be used, or GPS readings can be cross-checked against other data sources:
  • published maps
  • aerial photographs
  • sketch maps produced during fieldwork

Selecting a GPS Unit
• Commercially available GPS receivers vary in price and capabilities
• Technical specifications determine the accuracy with which positions can be determined
• The more powerful a receiver, the more expensive it will be
• In many mapping applications, the accuracy of standard systems is quite sufficient
• Receivers also vary in user-friendliness and in tracking capabilities, which are useful in navigation

Summary: Advantages and Disadvantages of GPS
Advantages
• Fairly inexpensive, easy-to-use field data collection
• Modern units require very little training for proper use
• Collected data can be read directly into GIS databases, minimizing intermediate data entry or data conversion steps
• Worldwide availability
• Sufficient accuracy for many census mapping applications; high accuracy is achievable with differential correction

Summary: Advantages and Disadvantages of GPS
Disadvantages
• Signal may be obstructed in dense urban or wooded areas
• Standard GPS accuracy may be insufficient, in which case differential techniques are required
• Differential GPS is more expensive, requires more time in field data collection and more complex post-processing to obtain more accurate information
• A very large number of GPS units may be required for only a short period of data collection.
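
Cross-checking a GPS reading against another data source, as mentioned under "Problems with GPS", amounts to measuring the distance between two coordinate pairs and comparing it with a tolerance. The sketch below uses the haversine great-circle formula on a spherical earth. The function names, the sample coordinates and the 15 m default tolerance are illustrative assumptions, not values prescribed by the workshop material, and the check is only meaningful if both coordinate pairs refer to the same datum.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean earth radius; a spherical approximation


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    # min() guards against rounding pushing the value slightly above 1
    return 2 * EARTH_RADIUS_M * math.asin(min(1.0, math.sqrt(a)))


def within_tolerance(gps, reference, tolerance_m=15.0):
    """True if a GPS reading lies within tolerance_m of a reference coordinate."""
    return haversine_m(gps[0], gps[1], reference[0], reference[1]) <= tolerance_m


# Hypothetical reading and map-derived coordinate, about 11 m apart.
print(within_tolerance((10.0000, -61.0000), (10.0001, -61.0000)))
```

Readings that fail the check are candidates for re-measurement or for differential correction.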

Where’s your Datum?

Geocoding Classifications (cont.)
• Initial creation of Civil Divisions through digitizing or segmentation/pixel-based approaches
• Low to zero levels of sampling through the accurate placing of coded units, but flexible enough to include changes
• Appropriate detail that fits with the boundaries of a geographic area for a given country
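
Returning to the matching approach described earlier: the slides do not spell out how an address is placed along a street segment, but one common technique is linear interpolation over the segment's address range, with odd and even numbers assigned to opposite sides of the street. The sketch below assumes that technique. The function names, the segment coordinates and the ranges 1-99 (left) and 2-100 (right) are illustrative assumptions.

```python
def side_of_street(number, left_range, right_range):
    """Pick the side whose range contains the number and shares its parity (odd/even)."""
    for side, (low, high) in (("left", left_range), ("right", right_range)):
        if low <= number <= high and number % 2 == low % 2:
            return side, (low, high)
    raise ValueError("house number matches neither address range")


def interpolate_address(number, start, end, low, high):
    """Estimate (x, y) for a house number by linear interpolation along a segment.

    start and end are the coordinates of the segment's two nodes; low and high
    are the address numbers at those nodes on one side of the street.
    """
    if not low <= number <= high:
        raise ValueError("house number outside the segment's address range")
    t = 0.5 if high == low else (number - low) / (high - low)
    x = start[0] + t * (end[0] - start[0])
    y = start[1] + t * (end[1] - start[1])
    return x, y


# Hypothetical segment from node (0, 0) to node (100, 0).
side, (low, high) = side_of_street(51, (1, 99), (2, 100))
print(side, interpolate_address(51, (0.0, 0.0), (100.0, 0.0), low, high))
```

Production address locators typically also standardize the input address first (the cleaning steps under "Data Maintenance") and offset the point from the street centerline toward the matched side; both are omitted here for brevity.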