Introduction to Geographic Information Systems
Spring 2013 (INF 385T-28437)
Dr. David Arctur, Lecturer, Research Fellow
University of Texas at Austin

Lectures 8 & 9, Feb 28, 2013
8 - Spatial Analysis
9 - Geocoding

Geocoding Outline (Tutorial Ch. 7)
- Geocoding overview
- Linear (street) geocoding
- Problems and solutions
- Street map sources
- Polygon geocoding
- Geocoding in ArcGIS
- Useful Web sites

INF385T(28437) – Spring 2013 – Lecture 9

GEOCODING OVERVIEW

Geocoding
- Process of creating geometric representations for locations (e.g., points) from descriptions of locations (e.g., street addresses)
- Uses a computer program, called a geocoding engine, that employs code tables to standardize address components
- Two ways to geocode
  - Batch geocoding: attempts to match all addresses
  - Interactive rematching: sophisticated user interface to match addresses

Geocoding examples
- City's economic development department maps technology businesses by street address to see technology-rich areas in a city
- County health director maps personal care and nursing homes and compares them to elderly population by neighborhood
- Business maps store locations and compares them to competitor locations
- Emergency dispatch operators geocode an address to determine who should respond to an emergency call
- Others?

Geocoding files
- Tabular data (text or dBASE)
  - Street addresses
  - ZIP codes
  - Latitude and longitude
- Geographic data
  - Street centerlines
  - ZIP code polygons

Other geocoding files
- Lines: railroads, rivers
- Polygons: parcels, census blocks, tracts, MCD/CCDs, places, counties, etc.
- Points: landmarks such as churches, schools, and other cultural features, represented in TIGER as points

LINEAR (STREET) GEOCODING

Linear geocoding (streets)
- Urban street maps
- Each street segment carries four address numbers: a low and a high number for each side of the street
[Figure: Oak Street segment labeled with address numbers 100 and 101 at one end and 198 and 199 at the other]

Geocoding steps
- Original address: 125 East Oak Street 15213
- Address parsed: |125|East|Oak|Street|15213|
- Abbreviations standardized: |125|E|Oak|St|15213|
- Elements assigned to match keys: [HN]:125 [SN]:Oak [ST]:St [SD]:E [ZP]:15213
- Index values calculated: [HN]:125 [SN]:Oak (Soundex #) [ST]:St [SD]:E [ZP]:15213 (Index #)
- Candidates identified for 125 East Oak Street 15213:

  From  To   Street  Type  Side  Parity  Direction  Street_
  2     98   Oak     St    R     E       W          4344
  1     99   Oak     St    L     O       W          4345
  100   198  Oak     St    R     E       E          4346
  101   199  Oak     St    L     O       E          4357

- Candidates scored and filtered:

  From  To   Street  Type  Side  Parity  Direction  Street_
  100   198  Oak     St    R     E       E          4346
  101   199  Oak     St    L     O       E          4357

- Best candidate matched:

  From  To   Street  Type  Side  Parity  Direction  Street_
  101   199  Oak     St    L     O       E          4357

[Figure: Oak St at Pine Av, labeled with address ranges 2 to 98, 1 to 99, 100 to 198, and 101 to 199]

Address components
Example address: 123 Oak St E, Apt. 2, Pittsburgh, PA 15213
- Number: 123
- Street name: Oak
- Street type: St
- Direction, suffix: E
- Direction, prefix: E (as in 123 E Oak St)
- Unit number: Apt. 2
- Zone, city: Pittsburgh
- Zone, ZIP code: 15213

Items for a single-number street address:

  Address       Unit    City        ZIP Code
  123 Oak St E  Apt. 2  Pittsburgh  15213

PROBLEMS AND SOLUTIONS

Possible problems
- Variations in street names
  - Fifth Avenue, Fifth Ave., 5th AV
  - Saw Mill Run Blvd, Route 51
- Data entry errors
  - Fidth Avenue
  - Sawmill Run
- Place names
  - White House, Heinz Field, Empire State Building
- Intersections
  - Fifth Avenue and Craig Street
- Zones
  - 100 Main ST 15101, 100 Main ST 16202
- P.O. boxes
  - P.O. Box 125
- Missing street (TIGER) information

Solutions
- Clean data before geocoding
- Use postal address standards
  - Publication 28 of the U.S. Postal Service (2000)
  - Ultimate source on mailing address formats and codes
  - Provides standard street address formats
  - Standards include house number, prefix directional, street name, street suffix, common unit designator abbreviation for apartment, city, state abbreviation, five-digit ZIP code, ZIP+4 extension
- Use standard intersection connectors: & | @
- Use alias tables

  Alias                  Address
  White House            1600 Pennsylvania Avenue
  Heinz Field            100 Art Rooney Avenue
  Empire State Building  350 5th Ave

- Points of Interest (POI) databases; gazetteers
- Assign house numbers in rural areas
- Purchase or build high-quality maps (field verification)

STREET MAP SOURCES

Caution: source has changed
- TIGER/Line files
- 2000 Census street centerlines

Commercial sources for maps
- GDT / ETAK / TeleAtlas
  - TomTom.com
  - All the big pioneers of street maps are now TomTom
- Esri
  - StreetMap Premium for ArcGIS
  - No more free street maps
- MapBox.com (free & fee)
- Others…?
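Looking back at the geocoding steps and the alias-table solution, the following is a minimal Python sketch of how parsing, standardization, and candidate matching might fit together. It is not how ArcGIS or any particular geocoding engine works internally: the code tables, the function names `standardize` and `match`, and the exact-match rule are assumptions made for illustration. The segment records are the four candidates from the slides. A real engine scores near matches instead of requiring equality, and the returned fraction assumes house numbers are spread evenly along the segment.

```python
import re

# Hypothetical, deliberately tiny code tables; real engines ship far larger ones.
DIRECTIONS = {"EAST": "E", "E": "E", "WEST": "W", "W": "W"}
STREET_TYPES = {"STREET": "St", "ST": "St", "AVENUE": "Av", "AVE": "Av", "AV": "Av"}
# Alias table entry taken from the Solutions slide.
ALIASES = {"HEINZ FIELD": "100 Art Rooney Avenue"}

# Candidate segments copied from the slides:
# (from, to, street, type, side, parity, direction, Street_ id)
SEGMENTS = [
    (2, 98, "Oak", "St", "R", "E", "W", 4344),
    (1, 99, "Oak", "St", "L", "O", "W", 4345),
    (100, 198, "Oak", "St", "R", "E", "E", 4346),
    (101, 199, "Oak", "St", "L", "O", "E", 4357),
]


def standardize(address):
    """Parse an address and assign standardized elements to match keys."""
    address = ALIASES.get(address.strip().upper(), address)  # resolve place names
    keys = {"HN": None, "SN": "", "ST": None, "SD": None, "ZP": None}
    name_parts = []
    for token in re.sub(r"[.,]", " ", address).split():
        upper = token.upper()
        if keys["HN"] is None and token.isdigit():
            keys["HN"] = int(token)            # house number
        elif re.fullmatch(r"\d{5}", token):
            keys["ZP"] = token                 # five-digit ZIP code
        elif upper in DIRECTIONS:
            keys["SD"] = DIRECTIONS[upper]     # street direction
        elif upper in STREET_TYPES:
            keys["ST"] = STREET_TYPES[upper]   # street type
        else:
            name_parts.append(token)           # anything else is street name
    keys["SN"] = " ".join(name_parts)
    return keys


def match(keys):
    """Return the matching segment and the position along it, or None."""
    if keys["HN"] is None:
        return None
    parity = "E" if keys["HN"] % 2 == 0 else "O"
    for lo, hi, name, stype, side, seg_parity, direction, seg_id in SEGMENTS:
        if (name.upper() == keys["SN"].upper() and stype == keys["ST"]
                and direction == keys["SD"] and seg_parity == parity
                and lo <= keys["HN"] <= hi):
            # Linear interpolation between the low and high house numbers.
            fraction = (keys["HN"] - lo) / (hi - lo)
            return {"segment": seg_id, "side": side, "fraction": fraction}
    return None


result = match(standardize("125 East Oak Street 15213"))
```

With the slide's example address, `result` identifies segment 4357 on the left (odd) side, and the address falls 24/98 of the way along the 101 to 199 range. An address that resolves through the alias table but has no segment in `SEGMENTS` returns `None`, which corresponds to an unmatched address that would go to interactive rematching.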
OpenStreetMap.org (free, open source)

POLYGON GEOCODING

Polygon geocoding
- Suppose you wish to make a choropleth map showing the distribution of attendees at an event
- Need to geocode data whose identifier is a polygon (e.g., ZIP code, city, or county)
- Create an aggregate table with a single record for each unique polygon
- Count the records for each polygon
- Join the table to the corresponding polygon layer
- Symbolize with a choropleth map or graduated point symbols

Polygon geocoding (ZIP codes)
- Points created at ZIP code centroids
- Spatially join points to polygons to make a choropleth map
- Choropleth map result

GEOCODING IN ARCGIS

Create address locator
- ArcCatalog

Choose address locator style
- Skeleton of the address locator
- Based on data tables and reference layer

Choose reference layer
- Streets, ZIP codes

Address locator properties

Geocode in ArcMap
- Add tabular data and streets layer
- Add address locator
- Geocode addresses
- View geocoding results
- Interactively rematch addresses

Address rematching
- Investigate unmatched addresses
- Generally requires local knowledge of streets
- Compare street names in the attributes of the streets table and the address table

Prepare log file
- Log file includes reasons why addresses did not geocode
- Useful for future work on cleaning addresses or repairing street maps

  Incorrect address   Possible reason/solution
  490 Penn Avenue     Missing ZIP code
  111 Hawksworth      Spelled incorrectly
  900 Smallman Street TIGER street missing
  900 Lib Ave         Spelled incorrectly

USEFUL WEB SITES

- http://www.usps.gov/
- http://www.geocode.com/ (TomTom fee service)
- http://batchgeo.com/
- http://www.mapquest.com
- http://maps.google.com
- http://www.bing.com/maps/
- http://www.zipinfo.com
- http://zipskinny.com/
- Others?

Geocoding Summary
- Geocoding overview
- Linear (street) geocoding
- Problems and solutions
- Street map sources
- Polygon geocoding
- Geocoding in ArcGIS
- Useful Web sites

Complete Assignment 7-1 (7-2 optional)
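
The polygon geocoding workflow covered earlier (aggregate to one record per polygon, count, then join to the polygon layer) can be sketched in plain Python. This is an illustration only, not the ArcGIS procedure: the attendee records, the `zip_layer` stand-in for a polygon layer's attribute table, and the function names `aggregate` and `join_counts` are all hypothetical.

```python
from collections import Counter

# Hypothetical attendee records; only the ZIP code identifier matters here.
attendees = [
    {"name": "A", "zip": "15213"},
    {"name": "B", "zip": "15213"},
    {"name": "C", "zip": "15101"},
]

# Stand-in for the attribute table of a ZIP code polygon layer (geometry omitted).
zip_layer = [{"zip": "15213"}, {"zip": "15101"}, {"zip": "16202"}]


def aggregate(records, key="zip"):
    """Build the aggregate table: one count per unique polygon identifier."""
    return Counter(record[key] for record in records)


def join_counts(layer, counts, key="zip"):
    """Join the counts to the polygon table; polygons with no records get 0."""
    return [{**row, "count": counts.get(row[key], 0)} for row in layer]


joined = join_counts(zip_layer, aggregate(attendees))
```

Each row in `joined` now carries a `count` attribute that could drive choropleth classes or graduated point symbols. Keeping polygons with a count of zero is a design choice here, so that areas with no attendees still appear on the map.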