INTERMEDIATE GIS CONCEPTS AND SKILLS WITH ARCGIS
Stacey Maples - GIS Specialist & Instruction Coordinator at the Sterling
Memorial Library Map Department - stacey.maples@yale.edu
This session will build upon the skills and concepts introduced in the "Introduction to Geographic
Information Systems and ArcGIS for Spatial Analysis" session and participants will be expected to
attend that workshop, or have comparable experience with ArcGIS 10. Topics will include: Use of
Relates & Relationship Classes; Geoprocessing of geographic data; Geocoding of street addresses;
Overlay Analysis; and Advanced Manipulation of Tabular Data. This session is part of the Yale
University Library Map Collection GIS Workshop Series.
GIS RESOURCES:
www.library.yale.edu/maps Yale Map Dept. Website
http://mailman.yale.edu/mailman/listinfo/gis-l Yale GIS Listserv
http://guides.library.yale.edu/GIS Yale GIS Support Portal
DOWNLOAD TUTORIAL DATA FROM THE GIS LIBGUIDE
1. http://guides.library.yale.edu/gis ; Yale GIS Workshops Tab
2. Download the Datasets for the INTERMEDIATE GIS CONCEPTS AND SKILLS WITH
ARCGIS Workshop (Right-click; Save Target As; Save to c:\temp)
POINT PATTERN ANALYSIS
EXTRACT THE DATA TO THE C:\TEMP FOLDER
1. Browse to the c:\Temp folder, where you saved the Data
2. Right-click on the EX_02_John_Snow.zip and select Extract All…
3. Accept all defaults to extract the data file to C:\Temp
4. Browse into the EX_02_John_Snow folder and double-click on the EX_02_John_Snow.mxd to open it.
GENERATE A NEAR TABLE AND RELATE
1. Open the Search Window by hovering over the Tab at the right side of the Data Frame, or using the
Main Menu>Windows>Search Window
2. Search on the term “Near” and click on the Generate Near Table tool result to launch the tool.
3. Use the deaths_reverse_geocode layer as the Input Features and the Water_Pumps_Rev_Geocode layer
as the Near Features.
4. Save the Output Table as Death2Pump_Near_Table in your Snow_Cholera_Data.gdb
5. Make sure that the “Find only the closest feature” option is checked and click OK to run the tool
6. Right-click and Open the resulting Death2Pump_Near_Table to examine the results.
Note that this table contains IN_FID and NEAR_FID fields. These fields match the Input features
(IN_FID) to their nearest Near features (NEAR_FID) using the OBJECTID for each layer. This table can
be used to create a relationship class that allows you to select associated features between these
two feature classes.
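As a rough illustration of what the Near table holds, the sketch below builds the same kind of IN_FID / NEAR_FID / NEAR_DIST rows in plain Python. It is not the ArcGIS implementation; the coordinates are made up, and the function name generate_near_table is a hypothetical helper chosen for this example.

```python
import math

def generate_near_table(input_points, near_points):
    """Return one row per input point, pairing it with its closest near point."""
    rows = []
    for in_fid, (x, y) in input_points.items():
        # Measure the straight-line distance to every near feature and keep the smallest.
        near_fid, dist = min(
            ((fid, math.hypot(x - nx, y - ny)) for fid, (nx, ny) in near_points.items()),
            key=lambda pair: pair[1],
        )
        rows.append({"IN_FID": in_fid, "NEAR_FID": near_fid, "NEAR_DIST": dist})
    return rows

# Hypothetical coordinates keyed by OBJECTID (not the workshop data).
deaths = {1: (0.0, 0.0), 2: (9.0, 1.0), 3: (4.0, 4.0)}
pumps = {1: (1.0, 0.0), 2: (10.0, 0.0)}

near_table = generate_near_table(deaths, pumps)
for row in near_table:
    print(row)
```

Each death record ends up with exactly one row, which is the behavior the “Find only the closest feature” option requests.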
JOIN THE DEATHS DATA TO THE NEAR TABLE
1. Close the Death2Pump_Near_Table
2. Right-click on the deaths_reverse_geocode layer and select Joins and Relates>Join.
3. Join the deaths_reverse_geocode layer to the Death2Pump_Near_Table, using the OBJECTID to the
IN_FID as the Keyfield.
4. Examine the results to make sure the two tables are now joined.
RELATE THE DEATHS DATA TO THE WATER PUMPS
5. Right-click on the deaths_reverse_geocode layer again and select Joins and Relates>Relate.
6. Relate the deaths_reverse_geocode layer to the Water_Pumps_Rev_Geocode layer, using the NEAR_FID
to the OBJECTID as the Keyfield, and name the Relate Death2Pumps. Click OK.
7. Open the Attribute Table for the Water_Pumps_Rev_Geocode layer and select the Broadwick Pump
record by clicking on the gray box at the far left
8. Now, use the Related Tables Dropdown Button (at the top left of the Attribute Table Window) to
select the Death2Pumps Relate.
Note that the Attribute Table for the deaths_reverse_geocode layer should now be open and the
features in that layer that correspond to the Broadwick Pump should be selected (both in the table,
and in the Data Frame).
1. Right-Click on the Num_Cases field header and select Statistics to examine a basic statistical
summary of the deaths nearest the Broadwick Pump.
2. Close the Selection Statistics Window and clear the selection using the Main
Menu>Selection>Clear Selected Features.
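The Join and Relate steps above amount to two key lookups followed by a summary of the selected records. The sketch below mimics that chain in plain Python, assuming made-up rows; only the field names (OBJECTID, IN_FID, NEAR_FID, Num_Cases) come from the handout, and related_deaths is a hypothetical helper.

```python
from statistics import mean

# Hypothetical rows; values are invented for illustration.
deaths = [
    {"OBJECTID": 1, "Num_Cases": 3},
    {"OBJECTID": 2, "Num_Cases": 1},
    {"OBJECTID": 3, "Num_Cases": 2},
]
near_table = [
    {"IN_FID": 1, "NEAR_FID": 10},
    {"IN_FID": 2, "NEAR_FID": 20},
    {"IN_FID": 3, "NEAR_FID": 10},
]

# Join: attach NEAR_FID to each death record by matching OBJECTID to IN_FID.
near_by_in_fid = {row["IN_FID"]: row["NEAR_FID"] for row in near_table}
joined = [dict(d, NEAR_FID=near_by_in_fid[d["OBJECTID"]]) for d in deaths]

def related_deaths(pump_objectid):
    """Relate: return the deaths whose NEAR_FID equals the selected pump's OBJECTID."""
    return [d for d in joined if d["NEAR_FID"] == pump_objectid]

# Select pump 10 and summarize Num_Cases for the related deaths.
selected = related_deaths(10)
cases = [d["Num_Cases"] for d in selected]
stats = {"count": len(cases), "sum": sum(cases), "mean": mean(cases)}
print(stats)
```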
THIESSEN POLYGON (SPATIAL ALLOCATION)
Thiessen polygons allocate space in an area of interest to a single feature per polygon. That is,
every location within a Thiessen polygon is closer to the point that was used to generate that
polygon than to any other point in the feature set. In this case, we will create a set of Thiessen
polygons based upon the locations of the Water Pumps in our project.
1. Use the Search Tab to search for the Thiessen Polygon tool, using the Search Term “Thiessen” and
launch the tool from the result.
2. Select the Water_Pumps_Rev_Geocode as the Input Features and save the Output Feature Class as
Pump_Thiessen, in the Snow_Cholera_Data.gdb.
3. Set the Output Fields option to ALL.
4. Click on the Environments button at the bottom of the window and expand the Processing Extent
Option.
5. Set the Processing extent to the “Same as layer extent”
6. Click OK twice to apply the Environment Setting and run the Thiessen Polygon tool.
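The defining property of a Thiessen polygon can be checked without building any geometry: a location belongs to the polygon of whichever generating point is nearest. The sketch below tests that rule in plain Python. The pump coordinates are hypothetical and thiessen_owner is a name invented for this example, not an ArcGIS function.

```python
import math

def thiessen_owner(location, generators):
    """Return the id of the generating point whose Thiessen polygon contains the location."""
    x, y = location
    return min(
        generators,
        key=lambda gid: math.hypot(x - generators[gid][0], y - generators[gid][1]),
    )

# Two hypothetical pumps; the boundary between their polygons is the line x = 5.
pumps = {"A": (0.0, 0.0), "B": (10.0, 0.0)}

# Walk across the boundary and record which pump "owns" each location.
owners = [thiessen_owner((x, 3.0), pumps) for x in (1.0, 4.9, 5.1, 9.0)]
print(owners)
```

The ownership flips exactly where the two pumps are equidistant, which is where the tool would draw the shared polygon edge.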
SPATIAL JOIN (POINT AGGREGATION)
Now that you have created the Thiessen polygon layer, you will “allocate” each of the deaths to one
of the Thiessen polygons. To do this, we will use a Spatial Join.
1. Right-click on the deaths_reverse_geocode layer and select Joins and Relates>Join
2. Change the Method Dropdown to “Join data from another layer based on spatial location”
3. Save the Output Layer as Deaths_Allocated, in your Snow_Cholera_Data.gdb. Click OK
4. The resulting layer is added to the Map Document. Open its attribute table to confirm that the
attributes of the Water Pumps have been transferred.
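Conceptually, the spatial join above asks which polygon contains each point and then copies that polygon's attributes onto the point. The following is a minimal sketch of that idea using a standard ray-casting point-in-polygon test. The rectangles stand in for Thiessen polygons, all values are made up, and both function names are hypothetical.

```python
def point_in_polygon(point, ring):
    """Ray-casting test; ring is a list of (x, y) vertices with no repeated closing vertex."""
    x, y = point
    inside = False
    for i in range(len(ring)):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % len(ring)]
        # Only edges that straddle the point's y value can cross a horizontal ray.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def spatial_join(points, polygons):
    """Copy the attributes of the first containing polygon onto each point record."""
    joined = []
    for pt in points:
        for poly in polygons:
            if point_in_polygon(pt["xy"], poly["ring"]):
                joined.append({**pt, **poly["attrs"]})
                break
    return joined

# Hypothetical stand-ins for two Thiessen polygons and two death locations.
polygons = [
    {"ring": [(0, 0), (5, 0), (5, 10), (0, 10)], "attrs": {"REV_Addres_1": "Pump A"}},
    {"ring": [(5, 0), (10, 0), (10, 10), (5, 10)], "attrs": {"REV_Addres_1": "Pump B"}},
]
deaths = [
    {"OBJECTID": 1, "xy": (2, 3), "Num_Cases": 2},
    {"OBJECTID": 2, "xy": (8, 7), "Num_Cases": 1},
]

deaths_allocated = spatial_join(deaths, polygons)
print(deaths_allocated)
```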
SUMMARY STATISTICS
1. Use the Search Window to search on the term “Summary” and open the Summary Statistics tool.
2. Select the Deaths_Allocated Table as the Input.
3. Save the Output Table to your Snow_Cholera_Data.gdb and name it Deaths_Summary_by_Pumps.
4. For the Statistics Field(s), select the Num_Cases Field, twice, and set the Statistic Type to
SUM and MEAN.
5. Assign the REV_Addres_1 (the address field from the Water Pump data layer) as the case field and
click OK.
6. Open the resulting table and Sort descending on the SUM_Num_Cases field.
Note that the Broadwick Pump has the highest value for all three significant attributes: FREQUENCY
(No. of households), SUM_Num_Cases (Total Deaths) and MEAN_Num_Cases (Mean Deaths per Household).
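The summary table produced above is a group-by on the case field. The sketch below reproduces the FREQUENCY, SUM and MEAN columns in plain Python, assuming a handful of invented rows; summary_statistics is a hypothetical helper, not the tool itself.

```python
from collections import defaultdict

def summary_statistics(rows, value_field, case_field):
    """Group rows by the case field and report FREQUENCY, SUM and MEAN of the value field."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[case_field]].append(row[value_field])
    table = [
        {
            case_field: case,
            "FREQUENCY": len(values),
            f"SUM_{value_field}": sum(values),
            f"MEAN_{value_field}": sum(values) / len(values),
        }
        for case, values in groups.items()
    ]
    # Sort descending on the SUM column, as in step 6.
    return sorted(table, key=lambda r: r[f"SUM_{value_field}"], reverse=True)

# Hypothetical allocated death records (one row per household).
rows = [
    {"REV_Addres_1": "Pump A", "Num_Cases": 1},
    {"REV_Addres_1": "Pump B", "Num_Cases": 4},
    {"REV_Addres_1": "Pump B", "Num_Cases": 2},
    {"REV_Addres_1": "Pump A", "Num_Cases": 1},
    {"REV_Addres_1": "Pump B", "Num_Cases": 3},
]

summary = summary_statistics(rows, "Num_Cases", "REV_Addres_1")
for record in summary:
    print(record)
```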
SPATIAL CENTRAL TENDENCY
SPATIAL MEAN
1. Search for and open the Mean Center tool.
2. Select the deaths_reverse_geocode layer as the Input Feature Class
3. Save the Output Feature Class to the Snow_Cholera_Data.gdb and name it Deaths_Spatial_Mean.
4. Do not assign a Weight Field, yet. Click OK to calculate the Mean Center.
5. Change the Symbology for the Deaths_Spatial_Mean layer to something that contrasts with
the other symbologies.
WEIGHTED SPATIAL MEAN
1. Run the Mean Center tool again, this time assigning the deaths_reverse_geocode_Num_Cases field
as the Weight Field.
2. Save the Output Feature Class to the Snow_Cholera_Data.gdb and name it
Deaths_Weighted_Spatial_Mean.
3. Apply a symbology to the Deaths_Weighted_Spatial_Mean layer.
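The difference between the two Mean Center runs is only whether each point counts once or counts in proportion to its weight. A minimal sketch, assuming four made-up locations and case counts (mean_center is a hypothetical helper):

```python
def mean_center(points, weights=None):
    """Return the (optionally weighted) average x and y of a set of points."""
    if weights is None:
        weights = [1.0] * len(points)
    total = sum(weights)
    x = sum(w * px for w, (px, py) in zip(weights, points)) / total
    y = sum(w * py for w, (px, py) in zip(weights, points)) / total
    return x, y

# Hypothetical death locations at the corners of a square, with invented case counts.
points = [(0.0, 0.0), (4.0, 0.0), (4.0, 4.0), (0.0, 4.0)]
num_cases = [1, 1, 5, 1]

unweighted = mean_center(points)
weighted = mean_center(points, num_cases)
print(unweighted, weighted)
```

With these values the unweighted center sits in the middle of the square, while the weighted center is pulled toward the corner that carries the most cases.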
STANDARD DISTANCE
1. Search for and open the Standard Distance tool.
2. Select deaths_reverse_geocode as the Input feature class.
3. Save the Output Feature Class to the Snow_Cholera_Data.gdb and name it Deaths_Standard_Distance.
4. Select deaths_reverse_geocode_Num_Cases as the Weight Field.
5. Click OK to calculate the Standard Distance.
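Standard distance is the spatial counterpart of a standard deviation: it measures how far the points spread, on average, around their mean center. The sketch below uses the usual weighted formula (the square root of the weighted mean squared distance to the weighted mean center). It is an illustration with made-up data, and the exact options the ArcGIS tool applies (such as the circle size in standard deviations) are not modeled here.

```python
import math

def standard_distance(points, weights=None):
    """Return the weighted mean center and the weighted standard distance around it."""
    if weights is None:
        weights = [1.0] * len(points)
    total = sum(weights)
    cx = sum(w * x for w, (x, y) in zip(weights, points)) / total
    cy = sum(w * y for w, (x, y) in zip(weights, points)) / total
    # Weighted mean of squared distances from the center.
    variance = sum(
        w * ((x - cx) ** 2 + (y - cy) ** 2) for w, (x, y) in zip(weights, points)
    ) / total
    return (cx, cy), math.sqrt(variance)

# Hypothetical death locations and case counts.
points = [(0.0, 0.0), (4.0, 0.0), (4.0, 4.0), (0.0, 4.0)]
num_cases = [1, 1, 5, 1]

center, radius = standard_distance(points, num_cases)
print(center, radius)
```

The returned radius is the value one would use to draw a circle of one standard distance around the weighted mean center.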
INTERPOLATION (HOT SPOTS)
INVERSE DISTANCE WEIGHTED (IDW) INTERPOLATION
1. Search for and open the IDW tool.
2. Select the deaths_reverse_geocode layer as the Input Point features
3. Select the deaths_reverse_geocode.Num_Cases as the Z Value Field.
4. Save the Output Raster to the Snow_Cholera_Data.gdb and name it IDW_Deaths.
5. Set the Output Cell Size to 10 (this is in meters).
6. Leave the remaining settings at their defaults and click OK to calculate the IDW raster.
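IDW estimates the value at each raster cell as an average of the sample values, weighted by the inverse of the distance to each sample raised to a power. The sketch below is a simplified version: it assumes a power of 2 and uses every sample point rather than a limited search neighborhood, so it will not match the tool's output exactly. The sample values are invented and idw is a hypothetical helper.

```python
import math

def idw(location, samples, power=2.0):
    """Estimate a value at location from ((x, y), z) samples by inverse distance weighting."""
    weighted_sum = 0.0
    weight_total = 0.0
    for (sx, sy), z in samples:
        dist = math.hypot(location[0] - sx, location[1] - sy)
        if dist == 0.0:
            return float(z)  # The location coincides with a sample point.
        weight = 1.0 / dist ** power
        weighted_sum += weight * z
        weight_total += weight
    return weighted_sum / weight_total

# Two hypothetical sample points with Z values (e.g. Num_Cases).
samples = [((0.0, 0.0), 10.0), ((10.0, 0.0), 0.0)]

midpoint = idw((5.0, 0.0), samples)     # Equidistant from both samples.
near_first = idw((2.0, 0.0), samples)   # Closer to the high-valued sample.
print(midpoint, near_first)
```

To fill a raster, the same function would be evaluated at the center of every cell, with the cell spacing set by the Output Cell Size.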
AREAL INTERPOLATION OF ATTRIBUTES
In this tutorial, we will be performing what is referred to as “Areal Interpolation” of Census
Attributes. We have a set of boundaries (in this case the Major Watershed Basins of Connecticut,
our CT_Major_Basins Layer) for which we would like to summarize the population. Our problem is
that these watershed boundaries do not correspond with the geographic units that the U.S. Census
uses to collect and tabulate demographic data. Some of the Census Block Groups in our
CT_Block_Groups layer overlap more than one Watershed basin unit. What we will do in the
following steps is to calculate the proportion of overlap for each Census Block Group, relative to the
Watershed Boundaries, and use these proportions to assign an appropriate estimate of the
population to each watershed.
CALCULATING GEOMETRY FOR A DATA LAYER
First, we need to determine the initial area of each of our “intact” Census Block Groups. We can
refer to these as the “Parent” features.
1. Right-Click on the CT_Block_Groups Layer and Open the Attribute Table.
2. Take a few seconds to examine the data available in this dataset. This data describes the
demographic characteristics of every Census Block Group in our area of interest.
3. Click the Options Button at the Top of the Attribute Table and Select Add Field...
4. Add a Field with Name = AREA, and Type = Float.
5. Click OK.
6. Scroll to the far right of the Attribute Table to view the newly added AREA Field.
7. Right-Click on the AREA Field Header and Select Calculate Geometry… Click Yes when warned about
“Calculating Outside an Edit Session.”
8. Change the Units to Square Miles US [sq mi].
9. Click OK.
10. Note that the AREA Field should now be populated with the new values.
11. Close the Attribute Table
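For planar coordinates, the area that Calculate Geometry writes into the AREA field can be approximated with the shoelace formula. The sketch below assumes vertices in meters and converts using the international square mile (1609.344 m per mile); the ring is a made-up 2 mile by 1 mile rectangle, and polygon_area is a hypothetical helper.

```python
SQ_METERS_PER_SQ_MILE = 1609.344 ** 2  # Assumes the international mile.

def polygon_area(ring):
    """Shoelace formula; ring is a list of (x, y) vertices with no repeated closing vertex."""
    twice_area = 0.0
    for i in range(len(ring)):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % len(ring)]
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0

# Hypothetical block group: a 2 mi x 1 mi rectangle expressed in meters.
ring_m = [(0.0, 0.0), (3218.688, 0.0), (3218.688, 1609.344), (0.0, 1609.344)]

area_sq_mi = polygon_area(ring_m) / SQ_METERS_PER_SQ_MILE
print(round(area_sq_mi, 6))
```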
GEOPROCESSING: USING THE UNION TOOL
Now, we need to merge the Block Group and Watershed boundary files, so that those Block Groups that
span more than one watershed will be split into their sub-units of overlap, or “child” features. To
do this, we will use a technique generically referred to as “Geoprocessing.” Geoprocessing is the
act of applying any number of spatially transforming tools to a dataset. In this case, we will use
the Union Tool to create a new dataset.
1. Search for and open the Union Tool
2. Select the CT_Major_Basins and CT_Block_Groups Layers as the Input Features.
3. Click on the Show Help>> Button at the bottom of the Dialog Box and note that the Help
System is Context-Sensitive.
4. Save the Output Feature Class to your
C:\temp\Intermediate_GIS_Skills\CT_Watershed_Data.gdb and name it “Union”
5. Leave the remaining options at their default settings.
6. Click OK to Apply the Union Tool.
7. Click Close once the process has completed.
8. You should be left with a new Union Layer, at the top of your Table of Contents.
CALCULATING THE NEW AREA OF THE UNION RESULTS
Now we need to calculate the NEW AREA of those “Child” Block Groups that were split by the Union
Process and then the proportion of their original AREA.
1. Right-Click on the Union Layer and Open the Attribute Table.
2. Click on the Options Button and Select Add Field…
3. Add a new field: Name = SUBAREA, Type = Float. Click OK.
4. Add a new field: Name = WEIGHT, Type = Float. Click OK.
5. Add a new field: Name = WTPOP, Type = Short Integer. Click OK.
6. Scroll to the right of the Attribute Table to find the newly added SUBAREA Field.
7. Right-Click on the SUBAREA field header and Select Calculate Geometry…
8. Change the Units to Square Miles US [sq mi].
9. Click OK to apply the calculation.
CALCULATING THE WEIGHT VALUE
Now we will calculate the proportion of the child area to parent area, which will be used as a
weight to apply to the demographics we are interested in. First, we must exclude those polygons that
have an AREA = 0 (these are coastal “slivers” and are not important to the results of our analysis).
1. Click on the Select by Attributes Button
2. In the Query Argument panel, at the bottom of the Select by Attributes Dialog Box, enter the
query:
"AREA" <>0
3. This will select only those records that do not have an AREA = 0.
4. Click on the Verify Button to check your SQL Query Syntax.
5. Click Apply.
6. Click Close.
7. Right-Click on the WEIGHT field header and Select Field Calculator…
8. Use the Field Calculator to build the following argument:
[SUBAREA] / [AREA]
9. Click OK to apply the calculation and note that, because you have an active selection, the
calculation is only applied to the selected subset of records, thus avoiding a “divide by 0 error.”
10. Finally, Scroll to the far right of the Attribute Table, Right-Click on the WTPOP field header
and select Field Calculator…
11. Use the Field Calculator to build the following argument:
[POP2004] * [WEIGHT]
12. Click OK to apply the Calculation.
13. Save your work.
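The two Field Calculator expressions above can be traced by hand on a few records. The sketch below applies them to made-up Union rows, skipping the AREA = 0 slivers in the same way the active selection does. Rounding WTPOP to a whole number is an assumption made here to mirror the Short Integer field type; only the field names come from the handout.

```python
# Hypothetical rows from the Union layer: one intact block group, one block group
# split across two basins, and one coastal sliver with no parent area.
union_rows = [
    {"MAJOR": "Basin 1", "POP2004": 1000, "AREA": 2.0, "SUBAREA": 2.0},
    {"MAJOR": "Basin 1", "POP2004": 800, "AREA": 4.0, "SUBAREA": 1.0},
    {"MAJOR": "Basin 2", "POP2004": 800, "AREA": 4.0, "SUBAREA": 3.0},
    {"MAJOR": "Basin 2", "POP2004": 0, "AREA": 0.0, "SUBAREA": 0.5},
]

for row in union_rows:
    if row["AREA"] != 0:  # Same filter as the query "AREA" <>0.
        row["WEIGHT"] = row["SUBAREA"] / row["AREA"]
        row["WTPOP"] = round(row["POP2004"] * row["WEIGHT"])
    else:
        # Unselected records are left uncalculated, which avoids dividing by zero.
        row["WEIGHT"] = None
        row["WTPOP"] = None

# Summing WTPOP per basin gives the estimated watershed populations.
population = {}
for row in union_rows:
    if row["WTPOP"] is not None:
        population[row["MAJOR"]] = population.get(row["MAJOR"], 0) + row["WTPOP"]
print(population)
```

Note that the split block group's 800 residents are divided 200 / 600 between the two basins, in proportion to the share of its area that falls in each.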
SUMMARY STATISTICS
Now that we have a set of Census Boundary files that correspond to the watershed, and estimates of
the population of those new boundary units, we need to summarize those population estimates for each
of our watershed units.
1. On the Attribute Table, click on the Clear Selection button.
2. Search for and open the Summary Statistics Tool.
3. Select the Union Layer as the Input Table.
4. Browse to the CT_Watershed_Data.gdb and save the Output Table as “Population_Summary”
5. Select WTPOP as the Statistics Field, and select SUM as the Statistic Type.
6. Select MAJOR as the Case field.
7. Click OK.
8. Click Close when the tool completes.
9. Click on the Source Tab, at the Bottom of the Table of Contents.
10. Right-Click on the Population_Summary Table and Open it to observe the population counts
for the watersheds.
11. Close Attribute Table.
12. Save your work.
JOINS AND MULTIPART FEATURES
1. Join the Summary Table to the MAJOR_BASIN_POLY using the MAJOR field as a Keyfield and observe
the results.
Note that two of the records are repeated. This is because the Hudson and Southeast Coast features
are each represented in the dataset as two distinct features. The fix for this is to dissolve these
features into single “multipart” features.
2. Remove the Join by right-clicking on the Major_Basin layer and selecting Joins and
Relates>Remove Joins>Remove all joins.
3. Search for and Open the Dissolve tool.
4. Use the Dissolve Tool to dissolve the features that share a MAJOR value into single multipart
features, using MAJOR as the dissolve field. SUM the ACREAGE & AREA_SQMI. Call the result
BASIN_DISSOLVE and save it to the CT_Watershed_Data.gdb.
5. Join the Summary Table to the BASIN_DISSOLVE feature class and examine the attribute table.
6. Remove the original CT_MAJOR_BASIN feature class and save your Map Document.
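The Dissolve step can be pictured as grouping features by the MAJOR value, collecting their shapes as the parts of one multipart feature, and summing the chosen fields. A minimal sketch, assuming made-up basin records and MAJOR codes; the dissolve function and the "parts" key are hypothetical, and the SUM_ prefix on the output fields is used here only for illustration.

```python
def dissolve(features, dissolve_field, sum_fields):
    """Merge features that share a dissolve_field value into one multipart record."""
    dissolved = {}
    for feat in features:
        key = feat[dissolve_field]
        out = dissolved.setdefault(
            key,
            {dissolve_field: key, "parts": [], **{f"SUM_{f}": 0.0 for f in sum_fields}},
        )
        out["parts"].append(feat["ring"])  # Each original shape becomes one part.
        for field in sum_fields:
            out[f"SUM_{field}"] += feat[field]
    return list(dissolved.values())

# Hypothetical basin polygons: MAJOR code 1 appears as two separate features.
basins = [
    {"MAJOR": 1, "ACREAGE": 640.0, "AREA_SQMI": 1.0, "ring": [(0, 0), (1, 0), (1, 1)]},
    {"MAJOR": 2, "ACREAGE": 3200.0, "AREA_SQMI": 5.0, "ring": [(2, 0), (4, 0), (4, 2)]},
    {"MAJOR": 1, "ACREAGE": 1280.0, "AREA_SQMI": 2.0, "ring": [(5, 5), (6, 5), (6, 6)]},
]

basin_dissolve = dissolve(basins, "MAJOR", ["ACREAGE", "AREA_SQMI"])
for record in basin_dissolve:
    print(record["MAJOR"], len(record["parts"]), record["SUM_ACREAGE"], record["SUM_AREA_SQMI"])
```

Because each MAJOR value now appears exactly once, joining the summary table to the dissolved result no longer repeats records.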
ADD A BASEMAP FROM ARCGIS ONLINE
1. On the Main Menu, go to File>Add Data>Add Basemap and add a basemap of your choice to
the Map Document.
2. Turn off all other layers in the Map
Document.
CREATING A FEATURE CLASS FROM A TABLE OF XY COORDINATES
1. Click on the View by Source button at the top of the Table of Contents
2. Add the CT_TRI_Facilties table to ArcMap
3. Right-Click the CT_TRI_Facilties table and open it to examine the data.
4. Close the CT_TRI_Facilties table, right-click on it and select Display XY Data.
5. Edit the Coordinate System to Geographic>North America>NAD 1983
6. Export the “Events” layer to the CT_Watershed_Data.gdb as TRI_SITES, using the coordinate system
of the Data Frame, and Add the new Feature Class when prompted.
7. Remove the CT_TRI_Facilities Events Layer.
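Display XY Data turns each table row into a point by reading two coordinate columns and tagging the result with a coordinate system. The sketch below does the same with a small in-memory CSV. The column names (FACILITY, LATITUDE, LONGITUDE), the TRIFID values and the coordinates are all invented for this example, since the handout does not list the table's actual columns.

```python
import csv
import io

# Hypothetical table contents; the real table's columns and values may differ.
TABLE = """TRIFID,FACILITY,LATITUDE,LONGITUDE
T0001,Example Plant A,41.31,-72.92
T0002,Example Plant B,41.45,-72.82
"""

def xy_table_to_features(text, x_field, y_field, spatial_reference):
    """Convert rows with coordinate columns into simple point feature records."""
    features = []
    for row in csv.DictReader(io.StringIO(text)):
        # Longitude is the x value and latitude is the y value.
        x = float(row.pop(x_field))
        y = float(row.pop(y_field))
        features.append(
            {
                "geometry": {"type": "Point", "coordinates": (x, y)},
                "spatial_reference": spatial_reference,
                "attributes": row,
            }
        )
    return features

tri_sites = xy_table_to_features(TABLE, "LONGITUDE", "LATITUDE", "NAD 1983")
for site in tri_sites:
    print(site)
```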
RELATIONSHIP CLASSES IN THE GDB
1. Right-Click the CT_Watershed_Data.gdb and add a New>Relationship Class
a. Name = ToxicSites_to_Chem
b. Origin Table = TRI_SITES
c. Destination table = CT_TRI_Chemicals
d. Simple Relationship
e. prefix “to_” to the Relationship Labels
f. Cardinality = One to Many
g. no attributes
h. TRIFID = primary/foreign key
2. Once the relationship class is established, Use Select by Attributes to select all records in the
CT_TRI_Chemicals table where:
"CHEMNAME" IN( 'LEAD' , 'LEAD COMPOUNDS' )
3. Use the Related table tool to select the related TRI_SITES that release Lead and Lead Compounds
into the environment.
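A one-to-many relationship class keyed on TRIFID behaves like a lookup from each site to its list of chemical records, and it can also be followed in reverse. The sketch below runs the CHEMNAME query and then selects the related sites. The records are made up; only the field names TRIFID and CHEMNAME come from the handout.

```python
from collections import defaultdict

# Hypothetical origin (sites) and destination (chemical) records.
tri_sites = [
    {"TRIFID": "T1", "NAME": "Site One"},
    {"TRIFID": "T2", "NAME": "Site Two"},
    {"TRIFID": "T3", "NAME": "Site Three"},
]
chemicals = [
    {"TRIFID": "T1", "CHEMNAME": "LEAD"},
    {"TRIFID": "T1", "CHEMNAME": "TOLUENE"},
    {"TRIFID": "T2", "CHEMNAME": "AMMONIA"},
    {"TRIFID": "T3", "CHEMNAME": "LEAD COMPOUNDS"},
]

# One to Many: each site's TRIFID (primary key) maps to many chemical rows (foreign key).
chemicals_by_site = defaultdict(list)
for record in chemicals:
    chemicals_by_site[record["TRIFID"]].append(record)

# Equivalent of: "CHEMNAME" IN( 'LEAD' , 'LEAD COMPOUNDS' )
selected_chemicals = [c for c in chemicals if c["CHEMNAME"] in ("LEAD", "LEAD COMPOUNDS")]

# Follow the relationship back to the origin table to select the related sites.
related_ids = {c["TRIFID"] for c in selected_chemicals}
lead_sites = [s for s in tri_sites if s["TRIFID"] in related_ids]
print([s["NAME"] for s in lead_sites])
```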
GEOCODE THE ADDRESS DATA
1. Right-Click the Schools_Addresses Table and select Geocode Addresses
2. Click the Add… Button in the resulting window and browse to your CT_Watershed_Data.gdb. Select
the NH_STREETS_Locator and click Add.
3. Double-check that NH_STREETS_Locator is the highlighted Locator and click OK.
4. Use ZIPTXT as the ZIP and save the results to the GDB as Geocoding_Result_01 and click OK
5. Examine the Interactive Rematch interface after the Automatic Geocode (look at the ZipCode in
the Address Table and Reference Data)
FINDING THE NEAREST FEATURES
1. Search for and open the Generate Near Table tool.
2. Use the Generate Near Table tool to create a table that identifies the TRI sites within 5 miles
of each school in Geocoding_results_02.
3. Be sure to uncheck the option to “Find only closest feature.”
4. Save this table to the CT_Watershed_Data.gdb and name it NEAR_Schools_to_TRI
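With a search radius set and “Find only closest feature” unchecked, the Near table can hold several rows per school, or none at all. The sketch below shows that behavior in plain Python. It assumes planar coordinates measured in feet and uses made-up locations; generate_near_table and its closest_only flag are hypothetical names for this illustration.

```python
import math

FEET_PER_MILE = 5280.0

def generate_near_table(in_features, near_features, search_radius, closest_only=False):
    """Return IN_FID / NEAR_FID / NEAR_DIST rows for near features within search_radius."""
    rows = []
    for in_fid, (x, y) in in_features.items():
        matches = []
        for near_fid, (nx, ny) in near_features.items():
            dist = math.hypot(x - nx, y - ny)
            if dist <= search_radius:
                matches.append({"IN_FID": in_fid, "NEAR_FID": near_fid, "NEAR_DIST": dist})
        matches.sort(key=lambda r: r["NEAR_DIST"])
        # Keep every match unless only the closest one was requested.
        rows.extend(matches[:1] if closest_only else matches)
    return rows

# Hypothetical school and TRI site coordinates, in feet.
schools = {1: (0.0, 0.0), 2: (100000.0, 0.0)}
tri_sites = {1: (5280.0, 0.0), 2: (0.0, 21120.0), 3: (0.0, 31680.0)}

near_schools_to_tri = generate_near_table(schools, tri_sites, 5 * FEET_PER_MILE)
for row in near_schools_to_tri:
    print(row)
```

Here school 1 has two sites within 5 miles and school 2 has none, so school 2 does not appear in the table at all.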
RELATIONSHIP CLASSES IN THE MAP DOCUMENT
TRI_SITES TO THE NEAR TABLE
1. Right-Click on the TRI_SITES layer and create a Relate to the NEAR_Schools_to_TRI table based on
its ObjectID and the NEAR_FID
2. Call this Relate TRI_2_NEAR
SCHOOLS TO THE NEAR TABLE
1. Right-Click on the Geocoding_Results_01 layer and create a Relate to the NEAR_Schools_to_TRI
table based on the IN_FID and the OBJECTID of the Schools
EXPLORING RELATED TABLES
2. Open the Geocoding_Results_02 layer and Select The Strong School (using any method of selection
you prefer).
3. Use the Related Tables Tools to track through the table relationships until you have a selection
of related CT_TRI_Chemicals Records.
The selection of chemical records you have created represents the compounds being released within
5 miles of the Strong School.
SUMMARY STATISTICS
1. Search for and open the Summary Statistics tool
2. Run the Summary Stats Tool on the active selection in the CT_TRI_Chemicals table using the
CHEMNAME field as the Case field and the TTLAIR and TTLSURFWAT fields as the statistics fields, with
SUM as the statistic type.
3. Name the Output Table Strong_School_Exposure and save it to the CT_Watershed_Data.gdb.
4. Open the resulting Summary Table and examine the results.