Introduction to Geographic Information Systems
Fall 2013 (INF 385T-28620)
Geodatabases
Dr. David Arctur
Research Fellow, Adjunct Faculty
University of Texas at Austin
Lecture 4
September 19, 2013
Outline
• Tables
• Geocodes
• Data table joins
• Spatial joins
• Spatial data formats
• Geodatabases
• Calculating geometry
INF385T(28620) – Fall 2013 – Lecture 4
Lecture 4
TABLES
Two kinds of tables in ArcGIS
• Feature attribute table of a map layer
  • Attribute data is part of the map layer
• Data table with geocodes (such as census IDs)
  • Can be added as a table to ArcMap
  • Can be joined to a map layer to add more attributes to the layer
  • Join via matching geocode values in both the data table and the map layer's attribute table
  • Census data example: there are too many census variables to supply in the feature attribute table, so download a custom table and join it to the appropriate polygon layer
Data table format
• Rectangular table with one value per cell
  • Columns (fields) are attributes
  • Rows are observations (records)
Data table format
• First row must have column names that are self-documenting labels
  • E.g., Shape, POP2000
  • First character of attribute name must be a letter
  • Remaining characters can be any letter, digit, or the underscore character (but no blanks)
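The naming rule above can be checked mechanically before a table is imported. A minimal sketch in plain Python, assuming only the rule as stated on the slide (ArcGIS may apply further limits, such as length or reserved words, that are not covered here); the function name is hypothetical:

```python
import re

# First character a letter, then any letters, digits, or underscores.
FIELD_NAME_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9_]*")


def is_valid_field_name(name):
    """Return True if `name` follows the column-naming rule on the slide."""
    return FIELD_NAME_PATTERN.fullmatch(name) is not None


# Collect the names that would need renaming before import.
header = ["Shape", "POP2000", "2000POP", "POP 2000"]
bad_names = [name for name in header if not is_valid_field_name(name)]
```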
Data table format

All additional rows of a data table must
contain only attribute values (raw data)

None of the rows can be sums, averages, or
other statistics for raw data rows
Primary keys
• Each table has a primary key attribute with two properties
  • Each value is unique
  • There are no null values
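Both primary key properties can be tested on a table before using it in a join. A minimal sketch, assuming the table has been read into a list of dictionaries (one per record); the function name, the GEOID field name, and the sample values are hypothetical:

```python
def primary_key_problems(rows, key):
    """Return (null_count, duplicate_values) for the `key` column.

    A column qualifies as a primary key only if the result is (0, []).
    """
    seen = set()
    duplicates = set()
    null_count = 0
    for row in rows:
        value = row.get(key)
        if value is None or value == "":
            null_count += 1          # violates "no null values"
        elif value in seen:
            duplicates.add(value)    # violates "each value is unique"
        else:
            seen.add(value)
    return null_count, sorted(duplicates)


# Made-up records: one repeated key and one missing key.
records = [
    {"GEOID": "A1", "POP2000": 2100},
    {"GEOID": "A2", "POP2000": 1850},
    {"GEOID": "A1", "POP2000": 990},
    {"GEOID": "", "POP2000": 400},
]
problems = primary_key_problems(records, "GEOID")
```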
Field calculator
• Add computed columns in ArcGIS
  • ArcGIS does not have the query capacity of relational database packages to compute new columns on the fly
  • So, must create permanent new columns
• Full range of computation
  • Can add, multiply, etc.
  • Has numeric and text functions
  • Can concatenate text values
Field calculator (numeric)
Field calculator (text)
• Concatenate house number and street fields
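The two kinds of calculation (numeric and text) amount to writing a new, permanent column computed from existing ones. The sketch below imitates that in plain Python on a list of records; it is not the ArcGIS Field Calculator itself, and all field names (POP2000, SQ_MILES, POP_DENS, HOUSE_NUM, STREET, ADDRESS) are assumptions for illustration:

```python
def add_density(rows):
    """Numeric calculation: add a POP_DENS column (people per square mile)."""
    return [dict(row, POP_DENS=row["POP2000"] / row["SQ_MILES"]) for row in rows]


def add_address(rows):
    """Text calculation: concatenate house number and street into ADDRESS."""
    return [
        dict(row, ADDRESS=f'{row["HOUSE_NUM"]} {row["STREET"]}')
        for row in rows
    ]


# Made-up input records.
tracts = add_density([{"POP2000": 1000, "SQ_MILES": 4}])
parcels = add_address([{"HOUSE_NUM": 1690, "STREET": "Seaton St"}])
```

Each function returns new records rather than changing the input, so the original table stays available for comparison.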
External table file formats for import into ArcGIS
• Plain ASCII text with comma-separated values (.csv)
  • Very transportable format, very large files
  • Each table record is a row terminated with a line-break character (invisible, nonprinting value)
  • Has values separated by a delimiter, usually a comma
  • For data values that contain the delimiter, enclose the value in double quotes
  • Sometimes columns get the wrong data type on import (use double quotes to force text data type for digits, say for house numbers)
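The quoting rules above can be seen with Python's standard csv module. This is a sketch of the file format only, not of the ArcGIS import step; the column names and values are made up. `csv.QUOTE_NONNUMERIC` makes the writer wrap every string value in double quotes, including the house number stored as text:

```python
import csv
import io

# Write one record whose OWNER value contains the delimiter (a comma).
buffer = io.StringIO()
writer = csv.writer(buffer, quoting=csv.QUOTE_NONNUMERIC)
writer.writerow(["HOUSE_NUM", "STREET", "OWNER"])
writer.writerow(["1690", "Seaton St", "Smith, J."])
csv_text = buffer.getvalue()

# Read it back: the quoted comma stays inside the OWNER value,
# and the csv module returns every value as text.
records = list(csv.DictReader(io.StringIO(csv_text)))
```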
External table file formats for import to ArcGIS
• Excel (.xls, .xlsx)
  • Excel 2003, up to 65,536 rows and 256 columns
  • Excel 2007, up to 1,048,576 rows and 16,384 columns
• dBase database table (.dbf)
  • Legacy format
  • ArcMap truncates field names to the first 10 characters
  • dBase IV has a maximum of 255 columns
  • Can open a dBase file in Excel but cannot save to dBase from Excel
• Microsoft Access database (.mdb)
  • Up to 2 GB file size
  • See the following for other limits: http://www.databasedev.co.uk/access_specifications.html
Lecture 4
GEOCODES
Geocodes (2000)
• Federal Information Processing Standards (FIPS)
  • Developed by the National Institute of Standards and Technology
  • Codes for place-names throughout the world
    • Countries
    • States/provinces
    • Counties
    • Metropolitan statistical areas (MSAs)
    • Cities
    • Places: Indian reservations, airports, and post offices in the US
See http://www.genesys-sampling.com/pages/Template2/site2/61/default.aspx for additional geocodes.
Geocodes: hierarchical
FIPS codes (political boundaries)
  Country: US
  State: 42 (Pennsylvania)
  County: 003 (Allegheny)
  Minor civil division: 4200361000 (Pittsburgh)
Census codes (statistical boundaries)
  Tract: 1917
  Block group: 003
  Block: 005 (US420031917003005)
Local government cadastral data (legal boundaries)
  Parcel block & lot number: 0096-P-00210000000 (1690 Seaton St, Pittsburgh, PA 15226)
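Because the code is hierarchical, the full block geocode is the concatenation of the codes at each level, and it can be split back into its parts by position. A minimal sketch; the field widths are read off this slide's example and are an assumption (official census identifiers may use different widths), and the function names are hypothetical:

```python
# (level name, number of characters), in hierarchical order.
# Widths follow the slide's example: US 42 003 1917 003 005.
GEOCODE_FIELDS = [
    ("country", 2),
    ("state", 2),
    ("county", 3),
    ("tract", 4),
    ("block_group", 3),
    ("block", 3),
]


def build_geocode(parts):
    """Concatenate the level codes, from country down to block."""
    return "".join(parts[name] for name, _ in GEOCODE_FIELDS)


def split_geocode(code):
    """Split a full geocode back into its levels by fixed position."""
    parts = {}
    position = 0
    for name, width in GEOCODE_FIELDS:
        parts[name] = code[position:position + width]
        position += width
    return parts


block_parts = split_geocode("US420031917003005")
county_prefix = block_parts["state"] + block_parts["county"]
```

The codes are kept as text throughout so that leading zeros (county 003) are not lost.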
World and US
US and state 42
State 42 and county 003
County 003 and municipality 61000
Municipality 61000 and tract 1917
Tract 1917 and block group 003
Block group 003 and block 005
Geocodes (2010)
• ANSI Codes
  • American National Standards Institute Codes
  • Replace the Federal Information Processing Standards (FIPS)
  • The entities covered include:
    • States and statistically equivalent entities
    • Counties and statistically equivalent entities
    • Named populated and related location entities (such as places and county subdivisions)
    • American Indian and Alaska Native areas
See http://www.census.gov/geo/www/ansi/ansi.html
Lecture 4
DATA TABLE JOINS
Review: Table joins
• Puts two tables together, on the fly, to make one table
  • One-to-one join (e.g., join state attribute data to state shapefile by StateName)
  • One-to-many join (e.g., join code table to feature attribute table to add code description. Many records can use the same code value.)
• Each table in a join must have a key attribute for matching
  • Must have same values and data types for key in both tables
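The matching logic of an attribute join can be sketched with a dictionary lookup on the key. This is plain Python rather than the ArcMap join dialog, and the function name, field names (LU_CODE, LU_DESC, PARCEL), and values are hypothetical. The example is the one-to-many case from the slide: several feature records share one code, and each picks up the description from the code table:

```python
def join_tables(target_rows, join_rows, key):
    """Append the join table's attributes to each matching target row.

    Target rows without a match are kept unchanged.
    """
    lookup = {row[key]: row for row in join_rows}
    joined = []
    for row in target_rows:
        match = lookup.get(row[key], {})
        joined.append({**row, **match})
    return joined


# Feature attribute table (many records) and code table (one row per code).
parcels = [
    {"PARCEL": "P1", "LU_CODE": "R"},
    {"PARCEL": "P2", "LU_CODE": "C"},
    {"PARCEL": "P3", "LU_CODE": "R"},
    {"PARCEL": "P4", "LU_CODE": "X"},  # no matching code
]
land_use_codes = [
    {"LU_CODE": "R", "LU_DESC": "Residential"},
    {"LU_CODE": "C", "LU_DESC": "Commercial"},
]
joined = join_tables(parcels, land_use_codes, "LU_CODE")
```

The lookup only succeeds when the key values compare equal, which is why the key must have the same values and data type in both tables.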
Example join
Problems with joins
• Field types are different (e.g., one is numeric and one is text)
  • Text values left align while numeric values right align
Solution
• Create a new field with the same type as the other table's key and fill it using Field Calculator
Solution
• Key fields in both tables now have the same field type
Problems with joins
Data format varies
Must remove dashes
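Both join problems (different field types and different formatting) come down to putting the key into one common text form before matching. A minimal sketch, assuming the keys should be compared as text without dashes; the function name, the optional zero-padding width, and the sample values are assumptions, not the values shown on the slides:

```python
def normalize_key(value, width=None):
    """Return a join key as text, without dashes or surrounding blanks.

    If `width` is given, pad with leading zeros to that many characters,
    which restores zeros lost when a code was stored as a number.
    """
    text = str(value).strip().replace("-", "")
    if width is not None:
        text = text.zfill(width)
    return text


# A numeric key and a dashed text key reduce to the same text value.
same_key = normalize_key(42003) == normalize_key("42-003")
padded = normalize_key(1003, width=5)
```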
Lecture 4
SPATIAL JOINS
Spatial joins
• Joins using shape (not attribute field)
• Enables data aggregation (counting or summing points by polygon)
• Common spatial joins
  • Points to polygons (counts)
  • Polygons to points (adds text)
  • Points to points (distances)
Points to polygons
• How many businesses are in each neighborhood?
  • Start with:
    • Business points
    • Neighborhood polygons
Points to polygons
Right-click neighborhoods > Joins and Relates > Join
Spatial join result
• New polygon layer with count of points (number of architects and engineers)
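The counting behind a points-to-polygons join can be sketched with a point-in-polygon test. The version below uses the standard ray-casting test on simple polygons given as lists of (x, y) vertices; it illustrates the idea and is not how ArcGIS implements the join. The polygon names and coordinates are made up, and points that fall exactly on a boundary are not handled specially:

```python
def point_in_polygon(x, y, polygon):
    """Ray casting: count edge crossings of a ray going right from (x, y)."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def count_points_by_polygon(points, polygons):
    """Return {polygon name: number of points inside it}."""
    counts = {name: 0 for name in polygons}
    for x, y in points:
        for name, ring in polygons.items():
            if point_in_polygon(x, y, ring):
                counts[name] += 1
                break  # assume polygons do not overlap
    return counts


neighborhoods = {
    "A": [(0, 0), (2, 0), (2, 2), (0, 2)],
    "B": [(2, 0), (4, 0), (4, 2), (2, 2)],
}
businesses = [(1, 1), (0.5, 1.5), (3, 1), (5, 5)]
counts = count_points_by_polygon(businesses, neighborhoods)
```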
Spatial join result
• Show as a choropleth map with labels, or table

Neighborhood Name            Count
Central Business District       53
Southside Flats                 14
Shadyside                        9
Bloomfield                       8
Lower Lawrenceville              8
North Shore                      8
Squirrel Hill South              6
Strip District                   6
Point Breeze                     4
Squirrel Hill North              4
Garfield                         3
South Oakland                    3
Friendship                       2
North Oakland                    2
Carrick                          2
Central Lawrenceville            2
East Allegheny                   2
Mount Washington                 2
East Liberty                     1
Central Northside                1
Westwood                         1
Banksville                       1
Brookline                        1
Perry North                      1
Highland Park                    1
Larimer                          1
Allegheny West                   1
Middle Hill                      1
Bluff                            1
Southside Slopes                 1
Polygons to points
• What neighborhood is a business in?
  • Start with:
    • Business points
    • Neighborhood polygons
Polygons to points
Right-click business points > Joins and Relates > Join
Spatial join result
• Point shapefile with neighborhood data on each business
Points to points
• How close is the nearest bus stop to a business?
  • Start with:
    • Business points
    • Bus stop points
Points to points

Right-click business points > Joins and Relates > Join
Result
• Distance field added to new layer of businesses and stops joined
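The distance field in a points-to-points join is the distance from each point to its nearest neighbor in the other layer. A minimal sketch using straight-line distance, which assumes coordinates in a projected (planar) system such as feet or meters, not longitude/latitude; the function name, stop names, and coordinates are made up:

```python
import math


def nearest_point(origin, candidates):
    """Return (name, distance) of the candidate closest to `origin`.

    `candidates` maps a name to an (x, y) pair in the same planar units.
    """
    best_name = None
    best_distance = math.inf
    for name, (cx, cy) in candidates.items():
        distance = math.hypot(cx - origin[0], cy - origin[1])
        if distance < best_distance:
            best_name, best_distance = name, distance
    return best_name, best_distance


bus_stops = {"Stop 1": (3, 4), "Stop 2": (10, 0)}
business = (0, 0)
stop_name, stop_distance = nearest_point(business, bus_stops)
```

This compares every stop with every business, which is fine for small layers; a spatial index would be the usual choice for large ones.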
Lecture 4
SPATIAL DATA FORMATS
Esri legacy format: Coverage
• Folder with multiple files
• Can have points, lines, and/or polygons
• Has several intermediate data products (topology) to speed up processing (now calculated on the fly)
Esri legacy format: Shapefile
• Multiple files, all with the same name but different file extensions
• No intermediate data products, but has indices to speed data processing
• Widely used to share spatial data files
Shapefiles
• ArcView native format
  • Minimum files
    • .shp: stores feature geometry
    • .shx: stores index of features
    • .dbf: stores attribute data
  • Additional files
    • .prj: projection data
    • .xml: metadata
    • .sbn and .sbx: store additional indices
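Since a shapefile is several files sharing one base name, a common mistake when copying or sharing one is to leave a required file behind. A small sketch that reports which of the three minimum files are missing next to a given .shp path; the function name is hypothetical, and the check is by lowercase extension only:

```python
from pathlib import Path

# The minimum set of files listed on the slide.
REQUIRED_EXTENSIONS = (".shp", ".shx", ".dbf")


def missing_shapefile_parts(shp_path):
    """Return the required extensions that do not exist beside `shp_path`."""
    base = Path(shp_path)
    return [
        extension
        for extension in REQUIRED_EXTENSIONS
        if not base.with_suffix(extension).exists()
    ]
```

For example, `missing_shapefile_parts("data/neighborhoods.shp")` returns an empty list when all three files are present in the (hypothetical) data folder.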
CAD drawings
• CAD software
  • Autodesk, AutoCAD (.dwg)
  • Bentley, Microstation (.dgn, .dxf)
• Often used by engineering companies
• Better digitizing precision
CAD drawings
Lecture 4
GEODATABASES
Geodatabases
• A geodatabase is a container used to hold a collection of datasets (GIS features, tables, raster images, and other objects)
• Example: World.gdb, containing a Country layer and a Graticule layer
Enterprise geodatabases
• Practically unlimited size and multiple simultaneous users
• Use enterprise data management systems
  • Store spatial datasets in a number of DBMSs: IBM DB2, Microsoft SQL Server, Oracle, or Postgres
Personal geodatabase
• Parallels enterprise geodatabase but on a PC
• Stores datasets in a Microsoft Access .mdb file
• Limited to 2 GB
• Much overhead in space and extra structure
• Tempting to apply one's own Access skills, but needs the ArcCatalog utility for manipulation
File geodatabase
• An Esri replacement for shapefiles
  • Vector and raster map layers
  • Other objects (tables)
  • Stores one or more datasets in a folder of files with .gdb extension
  • Can be up to 1 TB in size
  • Can be used across platforms
  • Can be compressed and encrypted for read-only, secure use
View geodatabases
• Cannot identify names in Windows Explorer
• Must use ArcCatalog
Non-Esri vector formats
• Interoperability
  • Ability of different vendors' hardware and software to share data
  • Driven by the Internet with standards evolving for open data access (International Organization for Standardization, Open Geospatial Consortium, US Federal Geographic Data Committee)
• Over 110 vector file formats available in ArcGIS Data Interoperability extension (http://www.esri.com/library/fliers/pdfs/data-interop-formats.pdf)
KML (Keyhole Markup Language)
• XML schema for Internet-based maps
  • Originally created by Keyhole, Inc. for satellite images; Keyhole was purchased by Google, and its viewer became Google Earth
  • Provides a set of features (points, lines, polygons, images, text, etc.) with lat/long coordinates plus altitude for 3D viewing
  • KMZ is zipped KML and associated files, needed for upload to Google Maps
• Portability
  • Can import and export KML/KMZ via ArcToolbox in ArcGIS
  • Can upload to Google Maps from your computer
X,y data
• Point data table with x and y attributes
• Increasingly popular to include x and y with data
• Commonly used for GPS data
Lecture 4
CALCULATING GEOMETRY
Point centroids
• When displaying or analyzing small polygons it is often better to use point centroids
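A polygon's centroid can be computed directly from its vertices. The sketch below uses the standard area-weighted centroid formula for a simple (non-self-intersecting) polygon in planar coordinates. It is an illustration, not necessarily the method ArcGIS uses, and the function name and rectangle are made up:

```python
def polygon_centroid(ring):
    """Return the (x, y) centroid of a simple polygon.

    `ring` is a list of (x, y) vertices in order; the closing edge back
    to the first vertex is implied.
    """
    twice_area = 0.0
    sum_x = 0.0
    sum_y = 0.0
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        cross = x1 * y2 - x2 * y1
        twice_area += cross
        sum_x += (x1 + x2) * cross
        sum_y += (y1 + y2) * cross
    # Centroid = sum / (6 * area), and area = twice_area / 2.
    return sum_x / (3 * twice_area), sum_y / (3 * twice_area)


# A 4-by-2 rectangle with its lower-left corner at the origin.
centroid = polygon_centroid([(0, 0), (4, 0), (4, 2), (0, 2)])
```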
Calculate x,y fields
Add new x and y fields in the attribute table
Calculate x,y fields
Calculate geometry for x field, repeat for y
X,y field results
• Results are x and y values based on map properties (e.g., Long/Lat or x,y feet)
Export table with x,y values
Add x,y data table
Export features
• X,y events should be exported as a permanent shapefile or feature class
Count point centroids
• Population can be spatially joined to a buffer around polluting companies
Other geometry calculations
• Area
• Perimeter
• Length
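These three measures follow from vertex coordinates. A minimal sketch in plain Python, assuming simple polygons and planar (projected) coordinates, so the results are in map units (or square map units for area); longitude/latitude values would not give meaningful distances this way. The function names are hypothetical:

```python
import math


def polygon_area(ring):
    """Area of a simple polygon by the shoelace formula."""
    n = len(ring)
    total = 0.0
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def polyline_length(points):
    """Length of a line: sum of the distances between consecutive vertices."""
    return sum(
        math.hypot(x2 - x1, y2 - y1)
        for (x1, y1), (x2, y2) in zip(points, points[1:])
    )


def polygon_perimeter(ring):
    """Perimeter: length of the ring including the closing edge."""
    return polyline_length(list(ring) + [ring[0]])


rectangle = [(0, 0), (4, 0), (4, 2), (0, 2)]
area = polygon_area(rectangle)
perimeter = polygon_perimeter(rectangle)
length = polyline_length([(0, 0), (3, 4), (3, 10)])
```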
Summary
• Tables
• Geocodes
• Data table joins
• Spatial joins
• Spatial data formats
• Geodatabases
• Calculating geometry