Geocoding Systems - United Nations Statistics Division

Geocoding & Data Collection with GPS
Workshop on Census Cartography and Management, Port-of-Spain, Trinidad and Tobago,
22-26 October 2007
Summary
• Introduction to Geocoding
  • Geocoding: Concepts and Definitions
  • Relationship to other Census Processes
  • Approaches to Data Collection
  • NSO Benefits & Concluding Remarks
• Introduction to GPS
  • How GPS Works
  • Sources of Error & Accuracy
  • Selecting a GPS
  • Advantages & Disadvantages
Introduction
• Many NSOs have a specialized coding scheme and understand geocoding as a dynamic process
• Clarification within the statistical community
• Expansion and discussion on components and methods within the process of geocoding
• The purpose of this section is to introduce geocoding concepts relevant for census mapping and the different approaches to related data collection.
Geocoding
• Definitions:
  • Conceptual/Operational
• Geocoding vs Georeferencing
• Census Hierarchies
  • Coding Scheme
• Data Collection Methods
  • Direct Collection
  • Matching Approach
• Benefits for NSOs
• “Geocoding can be broadly defined as the assignment of a code to a geographic location. Usually however, Geocoding refers to a more specific assignment of geographic coordinates (latitude, longitude) to an individual address.”
• Reference: UN Report of the Expert Group Meeting on Contemporary Practices in Census Mapping and Use of Geographical Information Systems (2007)
Definition of Geocoding
• Conceptual - 2 situations:
  • The more general process of assigning geographic codes to features in a digital database.
  • A GIS function that determines a point location based on an address. Such point locations can generally be expected to be relatively precise (e.g. ±2 m) and to be based on the use of GPS technology.
• Operational
  • Geocoding is the computer-oriented process that converts information about a unit from which statistical information is collected into a set of coordinates describing the geographic position of that unit.
…cont.
• Operational Elements
  • Collecting precise data at the level of point locations (or a very low geographic level such as a city block) and assigning codes for use in dissemination.
  • Coding the centroid, building corners, or building point-of-entry coordinates for a unit such as a block of land, building or dwelling.
  • Coordinates must contain latitude and longitude, or standardized x and y points for gridded interpolation. A z coordinate may represent altitude or elevation.
  • Codes cover each geographic unit and have a combinational relationship to distinguish different units (Enumeration Areas/Blocks).
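As an illustration of coding the centroid of a unit such as a block of land, the following minimal sketch computes the area-weighted centroid of a simple polygon with the shoelace formula. It assumes planar (projected) x and y coordinates rather than latitude and longitude; the function name and the sample block are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def polygon_centroid(vertices: List[Point]) -> Point:
    """Area-weighted centroid of a simple (non self-intersecting) polygon.

    Assumes planar x/y coordinates, listed in order around the boundary.
    """
    area2 = 0.0  # twice the signed area
    cx = 0.0
    cy = 0.0
    n = len(vertices)
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]  # wrap around to close the ring
        cross = x0 * y1 - x1 * y0
        area2 += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    if area2 == 0:
        raise ValueError("degenerate polygon: zero area")
    return cx / (3 * area2), cy / (3 * area2)


# Hypothetical rectangular block of land, 40 m by 20 m.
block = [(0.0, 0.0), (40.0, 0.0), (40.0, 20.0), (0.0, 20.0)]
centroid = polygon_centroid(block)
```

For the rectangle above the centroid falls at the midpoint of the block. Coordinates in latitude and longitude would normally be projected before applying a planar formula like this one.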
Georeferencing vs Geocoding
• Georeferencing
  • Aligning geographic data to a known coordinate system so it can be analyzed, viewed, and queried with other geographic data
• Geocoding
  • The process of assigning geographic codes to features in a digital database (including the GIS operation for converting street addresses into spatial data that can be displayed as features on a map)
Relationship to Other Census Processes
• Movement into a fully GIS-based approach to census mapping
• Generation of high-quality maps for use in the collection phase
• Reduction of work required for updating maps for future censuses
• Aggregation of records into customized units for satisfying users’ requirements
Census Enumeration & the Geocoding System
• Delineation irrespective of the existence of addresses
• Ability to apply a geocode to any geographic areal unit
• Flexible Coding Scheme
• Ability to incorporate future administrative divisions
• Pre-enumeration geocoding critical
• Links between GIS boundaries and tabular census data
Census Hierarchies
Define census geographic hierarchy
Develop geographic coding scheme
Development of an administrative and census units listing
Census Hierarchies: some principles
• Internal political boundaries
• Areal unit aggregation
• Resolution suitable to NSO needs and user demands
• Considers available datasets for continuous development
• The smaller the area defined by the geocode, the more flexible the results for subsequent users
Example of Administrative Hierarchy
country
  region
    province
      district
        sub-district
          rural locality
            enumeration area
          urban locality
            ward
              enumeration area
Illustration of a nested Admin. Hierarchy
[Figure: nested administrative hierarchy with the labels Provinces, Districts, Localities and Enumeration areas]
Hierarchical Coding Scheme: operational considerations
- Geographic units are numbered at each level of the administrative hierarchy, leaving gaps between the numbers to allow for changes.
- For example, at the province level, units may be numbered 5, 10, 15 and so on. A similar scheme would be used for lower-level administrative units and for enumeration areas.
- Since there are often more districts in a province than provinces in a country, for example, more digits may be required at lower levels.
- The unique identifier for the EA (the smallest-level unit) is the concatenation of the identifiers of the administrative units into which it falls.
Example of a Coding Scheme
A small country could use the following coding scheme:
Province: 2 digits
District: 3 digits
Locality: 4 digits
EA: 4 digits
An EA code of 10 025 0105 0073 means that enumeration area
number 73 is located in province 10, district 25 and locality 105.
The unique code is stored in the database as a long integer or as
a 13-character string variable.
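The concatenation described above can be sketched as follows. This is a minimal illustration of the 2-3-4-4 digit layout from the example; the names `WIDTHS`, `build_ea_code` and `parse_ea_code` are hypothetical and not part of any census software.

```python
# Field layout from the example: province (2), district (3), locality (4), EA (4).
WIDTHS = (("province", 2), ("district", 3), ("locality", 4), ("ea", 4))


def build_ea_code(province: int, district: int, locality: int, ea: int) -> str:
    """Concatenate zero-padded identifiers into a 13-character EA code."""
    values = {"province": province, "district": district,
              "locality": locality, "ea": ea}
    parts = []
    for name, width in WIDTHS:
        value = values[name]
        if not 0 <= value < 10 ** width:
            raise ValueError(f"{name}={value} does not fit in {width} digits")
        parts.append(f"{value:0{width}d}")
    return "".join(parts)


def parse_ea_code(code: str) -> dict:
    """Split a 13-character EA code back into its component identifiers."""
    if len(code) != 13 or not code.isdigit():
        raise ValueError("expected a 13-digit code")
    result = {}
    pos = 0
    for name, width in WIDTHS:
        result[name] = int(code[pos:pos + width])
        pos += width
    return result


code = build_ea_code(10, 25, 105, 73)   # "1002501050073"
parts = parse_ea_code(code)
```

Storing the code as a string preserves leading zeros; if it is stored as an integer, a province numbered below 10 would yield a 12-digit value, so the two representations should not be mixed.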
Example of a Coding Scheme (cont.)
• The variable type needs to be the same in the census database
and the geographic database.
• The integer variable has the advantage that subsets of records
can be selected easily (SQL)
• Example of query (the table name EA_TABLE is illustrative):
SELECT * FROM EA_TABLE WHERE ID > 1203501550000 AND ID < 1203501560000
This will find all EAs within locality number 155 (in province 12, district 35) in the database or on the digital map.
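A runnable version of this range query is sketched below using Python's built-in sqlite3 module and an in-memory database. The table name `ea`, the column name `id` and the sample identifiers are assumptions made for illustration only.

```python
import sqlite3

# In-memory database with a hypothetical table of EA identifiers.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ea (id INTEGER PRIMARY KEY)")
conn.executemany(
    "INSERT INTO ea (id) VALUES (?)",
    [
        (1203501540012,),  # locality 154
        (1203501550001,),  # locality 155
        (1203501550073,),  # locality 155
        (1203501560004,),  # locality 156
    ],
)

# All EAs whose 13-digit code falls inside locality 155 of province 12, district 35.
rows = conn.execute(
    "SELECT id FROM ea WHERE id > ? AND id < ? ORDER BY id",
    (1203501550000, 1203501560000),
).fetchall()
eas_in_locality = [row[0] for row in rows]
conn.close()
```

The selection works because the locality digits sit to the left of the EA digits, so every EA in one locality occupies a contiguous numeric range.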
(cont.)
• Special coding conventions need to be developed in cases where admin. and reporting units are not hierarchical.
• In any case, consistency should be complete in defining and using the administrative unit identifiers, since they are the link between GIS boundaries and the tabular census data.
• Maintenance: NSOs should maintain a Master List of EA and admin. units and their respective codes, and report any changes made to the Master List to the GIS and census databases.
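One way to check that the Master List, the GIS database and the census database stay consistent is a simple set comparison of their codes. The sketch below is an assumption about how such a check might look; the function name, the report keys and the sample codes are all hypothetical.

```python
from typing import Dict, Iterable, List


def compare_code_lists(master: Iterable[str],
                       gis: Iterable[str],
                       census: Iterable[str]) -> Dict[str, List[str]]:
    """Report codes that are missing from, or unknown to, the Master List."""
    master_set, gis_set, census_set = set(master), set(gis), set(census)
    return {
        # Codes in the Master List with no matching GIS boundary.
        "missing_from_gis": sorted(master_set - gis_set),
        # Codes in the Master List with no matching census record.
        "missing_from_census": sorted(master_set - census_set),
        # Codes used somewhere but never registered in the Master List.
        "not_in_master": sorted((gis_set | census_set) - master_set),
    }


report = compare_code_lists(
    master=["1002501050073", "1002501050074", "1002501050075"],
    gis=["1002501050073", "1002501050074"],
    census=["1002501050073", "1002501050075", "1002501050099"],
)
```

Codes are compared as strings here, which only works if both databases store them with the same variable type, as noted on the previous slide.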
Census Hierarchies
Country
Given Country
Province
District
Locality
Enumeration Areas
Blocks
Building
Dwelling
Coding Scheme
250131402013
Digits 1-2 = State code
Digits 3-5 = County Code
Digits 6-11 = Census Tract Code
Digit 12 = Block group code
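Splitting the sample code by the digit positions listed above can be sketched as follows. The `LAYOUT` table simply restates the positions from the slide (1-based, inclusive); the names used are hypothetical.

```python
# Digit positions from the slide, 1-based and inclusive.
LAYOUT = {
    "state": (1, 2),
    "county": (3, 5),
    "census_tract": (6, 11),
    "block_group": (12, 12),
}


def split_code(code: str) -> dict:
    """Split a 12-digit code into its components, keeping leading zeros."""
    if len(code) != 12 or not code.isdigit():
        raise ValueError("expected a 12-digit code")
    # Convert 1-based inclusive positions to Python slices.
    return {name: code[first - 1:last] for name, (first, last) in LAYOUT.items()}


fields = split_code("250131402013")
```

The components are kept as strings so that a county code such as 013 retains its leading zero.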
Geocoding Classifications
• Disaggregation into Spatial Entities or Civil Divisions and Compatibility
  1st: Region, Province
  2nd: District, Municipality
  3rd: Town/Village
  4th: Dwelling
• Resultant geocoded units placed within a set of latitude and longitude boundaries
Data Collection Methods
• Two main methods:
  • Direct Collection Approach
  • Matching Approach
Direct Collection Approach
• Digitizing from available topographic maps
• Direct collection using field techniques (e.g. GPS)
[Figure: digitizing from a topographic map and Global Positioning System (GPS) collection, applied to areas, streets and dwellings]
Matching approach
• Using an address locator database and street network database in a GIS
• Joining an address database to an existing spatial database for the area of interest
[Figure: street network for First Avenue, Second Avenue and Main Street, showing nodes, a street segment, and address numbers such as #1, #2, #32, #51, #99 and #100 on the left and right of each street]
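A common way to apply the matching approach to a street network with address ranges on each side is linear interpolation along the street segment. The sketch below illustrates that general technique under stated assumptions: odd and even numbers lie on opposite sides, numbers are evenly spaced, and coordinates are planar. The class, function and sample segment are hypothetical, and this is not presented as the method of any particular GIS package.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Point = Tuple[float, float]


@dataclass
class StreetSegment:
    name: str
    start: Point                  # coordinates of the from-node
    end: Point                    # coordinates of the to-node
    left_range: Tuple[int, int]   # first and last address number, left side
    right_range: Tuple[int, int]  # first and last address number, right side


def locate_address(seg: StreetSegment, number: int) -> Optional[Tuple[str, Point]]:
    """Estimate a point on the segment centerline for an address number."""
    for side, (first, last) in (("left", seg.left_range),
                                ("right", seg.right_range)):
        in_range = min(first, last) <= number <= max(first, last)
        same_parity = number % 2 == first % 2
        if in_range and same_parity:
            # Fraction of the way along the segment, from 0.0 to 1.0.
            t = 0.0 if last == first else (number - first) / (last - first)
            x = seg.start[0] + t * (seg.end[0] - seg.start[0])
            y = seg.start[1] + t * (seg.end[1] - seg.start[1])
            return side, (x, y)
    return None  # address number not covered by this segment


second_avenue = StreetSegment("Second Avenue", (0.0, 0.0), (98.0, 0.0),
                              left_range=(1, 99), right_range=(2, 100))
match = locate_address(second_avenue, 51)
```

Here address #51 is matched to the left side roughly halfway along the segment. A production address locator would also offset the point from the centerline and handle irregular numbering.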
Data Maintenance
• Cleaning addresses
• Retaining only the key address elements
• Establishing a Matchcode (indicator of which address elements will determine the geocode)
• Eliminating extraneous characters
• Standardizing spelling

Record | Street Address | City | State | ZIP code | Latitude | Longitude | Areakey | MatchCode
1 | 344 East 63rd | New York | NY | 10023 | 40.47 | 73.58 | 3502508100 | AS0
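The cleaning steps listed on this slide can be sketched with a few string operations. The abbreviation table and the list of unit designators below are small hypothetical examples; a real address standardizer needs a much larger, locally appropriate dictionary.

```python
import re

# Hypothetical spelling table; note that "ST" can also mean "SAINT".
ABBREVIATIONS = {
    "E": "EAST", "W": "WEST", "N": "NORTH", "S": "SOUTH",
    "ST": "STREET", "AVE": "AVENUE", "RD": "ROAD",
}
# Hypothetical unit designators: everything from here on is not a key element.
UNIT_WORDS = {"APT", "UNIT", "SUITE", "FLOOR"}


def clean_address(raw: str) -> str:
    """Return a standardized street address containing only key elements."""
    text = raw.upper()
    # Eliminate extraneous characters (punctuation, symbols).
    text = re.sub(r"[^A-Z0-9\s]", " ", text)
    cleaned = []
    for token in text.split():
        if token in UNIT_WORDS:
            break  # retain only the key address elements
        # Standardize spelling of common abbreviations.
        cleaned.append(ABBREVIATIONS.get(token, token))
    return " ".join(cleaned)


standardized = clean_address("344 E. 63rd St., Apt #4")
```

With the tables above, the sample input is reduced to the house number, direction, street name and street type, which can then be compared against the reference address database.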
Staff Expertise Recommendations
Task/condition | Direct collection | Matching
Existence of digital base map for country | Highly desirable | Highly desirable
Statistical staff with expertise in use of GPS | Essential | Not essential
Acquisition of large numbers of GPS receivers | Essential | Not essential
Geo-referenced list of addresses or equivalent | Not essential | Essential
Excellent address matching algorithms | Not essential | Essential
Existence of a rational, consistent, and locally recognized addressing system for housing units | Highly desirable | Essential
Geocoding: Benefits for National Statistical Offices
• Improved map creation for the field
• Customizable map outputs for specified regional activities
• Coding techniques are transparent and transferable
• Lays the groundwork for future statistical activities and coding schemes
Concluding Remarks
• Technologies are accessible and allow delineation irrespective of the existence of addresses
• Many methods and technologies are available to support accurate geocoding frameworks
• A geocoding system adds value for GIS-based spatial analysis of statistical data
Global Positioning Systems (GPS)
• Technology has revolutionized field mapping in recent years
• Prices of GPS receivers have dropped
• GPS methods have been integrated in many applications
• User groups are widespread (utilities management, surveying and navigation). GPS has contributed to and advanced field research in areas such as biology, forestry, geology, epidemiology and population studies
Global Positioning Systems (cont.)
• GPS has become a major tool in census cartographic applications
• Preparation and updating of enumeration area (EA) maps for census activities
• Location of point features such as service facilities or village centers
• Coordinates can be downloaded or entered manually into a digital mapping system or GIS, and can be combined with existing georeferenced information
How GPS Works
• GPS receivers collect the signals transmitted by a constellation of 24 satellites: 21 active satellites and three spares. The system is called NAVSTAR and is maintained by the U.S. Department of Defense.
• The satellites circle the earth in six orbital planes at an altitude of approximately 20,000 km. At any given time, five to eight GPS satellites are within the “field of view” of a user on the earth’s surface.
• The position on the earth’s surface is determined by measuring the distance from several satellites.
The global positioning system (GPS)
The global positioning system (cont.)
• GPS satellites circle the Earth twice a day …
• The satellite signal:
  • Carries three kinds of coded information essential for determining a position
• The receiver:
  1. Calculates the distance to the first satellite whose signal it is able to catch.
  2. Calculates the distance to a second satellite whose signal it is able to catch.
  3. Repeats the operation in step 2 with a third satellite.
How GPS determines a location’s coordinates
[Figure: a receiver position fixed from the measured distance to satellites, with points labelled a, b and c]
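The idea of fixing a position from measured distances to three known points can be sketched in two dimensions. This is a deliberately simplified illustration: an actual GPS receiver works in three dimensions and must also solve for its own clock error, which is why a fourth satellite is needed in practice. The function name and the sample coordinates are hypothetical.

```python
import math
from typing import Tuple

Point = Tuple[float, float]


def trilaterate_2d(a: Point, ra: float, b: Point, rb: float,
                   c: Point, rc: float) -> Point:
    """Find the point at distances ra, rb, rc from known points a, b, c.

    Subtracting the circle equation for a from those for b and c gives
    two linear equations in x and y, solved here with Cramer's rule.
    """
    a1 = 2 * (b[0] - a[0])
    b1 = 2 * (b[1] - a[1])
    c1 = ra**2 - rb**2 - a[0]**2 + b[0]**2 - a[1]**2 + b[1]**2
    a2 = 2 * (c[0] - a[0])
    b2 = 2 * (c[1] - a[1])
    c2 = ra**2 - rc**2 - a[0]**2 + c[0]**2 - a[1]**2 + c[1]**2
    det = a1 * b2 - a2 * b1
    if det == 0:
        raise ValueError("reference points are collinear")
    x = (c1 * b2 - c2 * b1) / det
    y = (a1 * c2 - a2 * c1) / det
    return x, y


# Hypothetical reference points and a receiver actually located at (3, 4).
a, b, c = (0.0, 0.0), (10.0, 0.0), (0.0, 10.0)
true_position = (3.0, 4.0)
ra, rb, rc = (math.dist(p, true_position) for p in (a, b, c))
estimate = trilaterate_2d(a, ra, b, rb, c, rc)
```

With error-free distances the estimate coincides with the true position; measurement errors in the distances translate directly into position error.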
Sources of GPS signal errors
• Poor visibility of satellites due to obstacles
• Signal multipath
• Sources of error beyond the user's control:
  • Atmospheric delays
  • Receiver clock errors
  • Orbital errors
Differential GPS
[Figure: Differential GPS, showing the space segment (GPS satellites), the control segment, the user segment, a DGPS ground station, a DGPS mobile receiving station and the correction signal]
GPS Accuracy
• Inexpensive GPS receivers
  • Within 15 to 100 meters for civilian applications
• Differential GPS reduces error further
  • Accuracy of about 3-10 m can be achieved with quite affordable hardware and shorter observation times
  • More expensive systems and longer data collection for each coordinate reading can yield sub-meter accuracy
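The principle behind differential correction can be illustrated with a simplified position-domain sketch: a ground station at a precisely known location measures its own GPS position, and the difference is applied to a nearby mobile receiver's reading taken at the same time. Operational DGPS systems typically correct the individual satellite range measurements instead, so this is an assumption-laden illustration; all names and coordinates (in projected meters) are hypothetical.

```python
from typing import Tuple

Point = Tuple[float, float]


def differential_correction(base_known: Point, base_measured: Point,
                            rover_measured: Point) -> Point:
    """Apply the base station's observed error to a rover reading.

    Assumes both receivers see the same error at the same moment,
    which holds approximately when they are close to each other.
    """
    dx = base_known[0] - base_measured[0]
    dy = base_known[1] - base_measured[1]
    return rover_measured[0] + dx, rover_measured[1] + dy


base_known = (500000.0, 1170000.0)      # surveyed ground station position
base_measured = (500006.5, 1169996.0)   # what the station's GPS reports
rover_measured = (500120.0, 1170250.0)  # uncorrected field reading
corrected = differential_correction(base_known, base_measured, rover_measured)
```

In this example the station's reading is off by 6.5 m east and 4 m south, so the field reading is shifted by the opposite amount.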
Problems with GPS
• In dense urban settings, the accuracy of standard GPS (errors of about 15 m, and up to 100 m) may not be sufficient
• Differential GPS can be used, or GPS readings can be cross-checked with other data sources:
  • published maps
  • aerial photographs
  • sketch maps produced during fieldwork
Selecting a GPS Unit
• Commercially available GPS receivers vary in price and capabilities
• Technical specifications determine the accuracy with which positions can be determined
• The more powerful a receiver, the more expensive it will be
• In many mapping applications, the accuracy of standard systems is quite sufficient
• Receivers also vary in terms of user-friendliness and tracking capabilities, which are useful in navigation
Summary: Advantages and Disadvantages of GPS
Advantages
• Fairly inexpensive, easy-to-use field data collection
• Modern units require very little training for proper use
• Collected data can be read directly into GIS databases
minimizing intermediate data entry or data conversion
steps
• Worldwide availability
• Sufficient accuracy for many census mapping applications; high accuracy achievable with differential correction
Summary: Advantages and Disadvantages of GPS
Disadvantages
• Signal may be obstructed in dense urban or wooded areas
• Standard GPS accuracy may be insufficient, requiring differential techniques
• Differential GPS is more expensive, requires more time in field data collection and needs more complex post-processing to obtain more accurate information
• A very large number of GPS units may be required for only a short period of data collection
Where’s your Datum
Geocoding Classifications (cont.)
• Initial creation of Civil Divisions through digitizing or segmentation/pixel-based approaches
• Low to zero levels of sampling through the accurate placing of coded units, but flexible enough to include changes
• Appropriate detail that fits with the boundaries of a geographic area for a given country