Assignment 6 - University of Regina

Assignment 6: Vector Data Analysis
Due March 2, 2012
Introduction
A geographic information system comprises several components each of which plays a unique part in
ensuring its overall functionality. Analytical functions, however, could be considered a centerpiece of a
GIS, its raison d’être. Although other ways exist, analysis is the main vehicle of processing raw data into
information that could be used in the decision making process. Most GIS packages have a wide range of
analytical functions, including measurement techniques, queries, proximity analysis, overlay operations
and analysis of surfaces and networks. The application of these analyses, and the techniques (algorithms)
used to perform them, differ depending on the spatial data model.
This assignment gives you an opportunity to try your hand at analyzing vector data using measurement
and overlay tools. Given the relative complexity and length of time required to learn and practice these
techniques, this assignment contains only theory and guided tutorial sections.
Section 1: Theory
Textbook readings: Chapter 6 and lectures
Answers to these questions can be found in the textbook, in the guided tutorial, or online. After reading
the sources you find, please provide the answers in your own words.
(each question is worth 2 marks)
1. Why can the Calculate Geometry tool be used to calculate the area, length, or perimeter of
features only if the dataset is projected?
2. Explain the difference between a feature and an entity as it applies to the vector data model.
3. Which data encoding methods can be used for creating raster data and vector data?
4. Queries and overlays can be used to solve the same problems when analysing vector data. What
is the difference between the results produced by these two methods of analysis? Describe
situations in which you would prefer to use each of these methods.
5. Define the buffering procedure. How is it used in GIS projects?
6. Describe common principles on which all three polygon-on-polygon overlay procedures are
based.
Section 2: Guided Tutorial
This section of the assignment is based on the ideas and data available on the website of the National
Center for Ecological Analysis and Synthesis, a research center of the University of California, Santa
Barbara. The Center has a wealth of spatial datasets showing distribution of plants and animals in the
Western hemisphere and related data that can be downloaded for free.
In this part of the assignment you will learn to apply some of the tools available in ArcGIS for analysis of
vector data using a collection of point and polygon data sets showing distribution of two species of
armadillo – Greater Naked-Tailed Armadillo and Southern Armadillo – collected in South America. You
will find answers to four questions about the study area:
1. What is the area of each species range?
2. What is the total area of South America covered by (one or more) species ranges?
3. What portion of the continent is covered by both Armadillo species ranges?
4. Do all of the Armadillo species sightings occur inside the Armadillo species range?
Instructions
1. Open ArcMap and add the data from the T:\Class\Geography\geog303\Assignment6 folder to the map
document. This folder contains four datasets:
SouthAmericaStates.shp – a shapefile containing boundaries of South American countries.
GreaterArmadilloPts.shp – a shapefile containing locations of sightings of a greater naked-tailed
armadillo (Cabassous tatouay).
GreaterArmadilloPoly.shp – a shapefile containing boundary of the species range for the greater
naked-tailed armadillo.
SouthernArmadilloPoly.shp – a shapefile containing boundary of the species range for the southern
naked-tailed armadillo (Cabassous unicinctus).
2. If ArcCatalog is not open in your ArcGIS session, click the corresponding button on the Standard
toolbar at the top of the ArcGIS window to open it. In ArcCatalog, locate the folder with the data
used in this exercise. Right-click on the name of the first dataset listed in the Catalog window and
choose Properties. Click the X and Y Coordinate System tab to see the projection and coordinate
system your data are in. Repeat these steps for all four shapefiles. In this exercise you will measure the
area of polygon features and perform several overlays. Will the projection and coordinate system
your data are in be suitable for these operations? What type of projection would you want your data
to be in?
3. We will have to project all four datasets in order to be able to perform the analyses planned
for this exercise. To calculate meaningful polygon areas, we need to transform each dataset
into an equal area projection.
Open the ArcToolbox window. In the Data Management Tools kit, find the Projections and
Transformations tools for vector data, located in the Feature subset, and double-click on the Project tool.
In the window that opens, enter the name of the dataset you want to project. There are several
ways to do that. You can: (1) click on the name of the dataset in the Table of Contents of the map
document and drag and drop it into the corresponding line in the Project window; (2) navigate
to the dataset using the browse button on the right of the line; or (3) select the name
of the dataset from the drop-down list for that line.
For your output file, choose a name based on the name of the input dataset, for example SouthAmerica_prj.
As your Output Coordinate System, select South_America_Albers_Equal_Area_Conic from
Predefined > Projected Coordinate Systems > Continental > South America.
From the list of Geographic Transformation options ArcGIS provides in the corresponding drop down
list choose SAD_1969_To_WGS_1984_1.
Click OK to run the transformation. Repeat the same procedure for all four shapefiles.
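The need for an equal area projection can be illustrated with a minimal sketch outside ArcGIS. It assumes a spherical Earth with a mean radius of 6371 km (an assumption, not a value from this assignment) and uses a hypothetical helper, cell_area_km2, to compute the true ground area of a cell measuring 1 degree by 1 degree at different latitudes. The same "square degree" covers a different ground area depending on where it lies, which is why areas measured in unprojected coordinates are not meaningful.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumption: spherical Earth, mean radius


def cell_area_km2(lat_south_deg, dlat_deg=1.0, dlon_deg=1.0):
    """Ground area (sq. km) of a latitude/longitude cell on a sphere.

    The cell starts at lat_south_deg and extends dlat_deg to the north
    and dlon_deg to the east.
    """
    lat_s = math.radians(lat_south_deg)
    lat_n = math.radians(lat_south_deg + dlat_deg)
    dlon = math.radians(dlon_deg)
    return EARTH_RADIUS_KM ** 2 * dlon * (math.sin(lat_n) - math.sin(lat_s))


# Compare cells at the equator and further south.
for lat in (0, -25, -50):
    print(f"1 x 1 degree cell starting at latitude {lat}: "
          f"{cell_area_km2(lat):.0f} sq. km")
```

The printed areas shrink as the cell moves away from the equator, even though each cell is exactly one degree on a side.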
4. Remove the original files from the map document. Now you are all set to start the analysis.
5. To answer the first question -- What is the area of each species range? – you will need to use
the Calculate Geometry tool to calculate the area of each species range.
Open the attribute table of the projected GreaterArmadilloPoly shapefile. In order to calculate the
area of the features in this dataset you first need to add a field that will hold these values. Locate
the Table Options button in the upper left corner of the Table window and select Add Field option.
In the window that opens, name your field Area, set its Type to Double and Scale and Precision to 15
places (to allow for large numbers to be stored). Click OK to create the new field.
After the field is added to the table, right-click on its name and select Calculate Geometry option.
Since the dataset you are working with is small and the calculation is straightforward, you are going
to perform the calculation outside an Edit session. Ignore the warning that pops up. When the tool
window opens, make sure that the Property option is set to Area, accept defaults for the Coordinate
System option and choose square kilometres as your Units. Click OK to perform the calculation
based on the set parameters.
Repeat this sequence on the projected SouthernArmadilloPoly shapefile. Record the answers to this
question in the table below.
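As an illustration of what an area calculation on projected data involves, here is a minimal sketch of the planar "shoelace" formula. It is not the ArcGIS implementation; the function name polygon_area_km2 and the sample rectangle are made up for this example. It assumes a simple polygon without holes whose vertices are given in metres, as they would be in a projected coordinate system.

```python
def polygon_area_km2(ring_m):
    """Planar area (sq. km) of a simple polygon.

    ring_m is a list of (x, y) vertex pairs in metres. The ring does not
    need to repeat its first vertex at the end.
    """
    twice_area = 0.0
    n = len(ring_m)
    for i in range(n):
        x1, y1 = ring_m[i]
        x2, y2 = ring_m[(i + 1) % n]  # wrap around to close the ring
        twice_area += x1 * y2 - x2 * y1
    # abs() makes the result independent of vertex order (clockwise or not).
    area_m2 = abs(twice_area) / 2.0
    return area_m2 / 1_000_000.0  # square metres to square kilometres


# Hypothetical 2000 m by 1000 m rectangle.
rectangle = [(0, 0), (2000, 0), (2000, 1000), (0, 1000)]
print(polygon_area_km2(rectangle))
```

Because the formula treats coordinates as flat x and y distances, it only gives a meaningful ground area when those coordinates come from an equal area projection.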
6. To answer the second question – What is the total area of South America covered by at least one
of the two species ranges? – we need to overlay the three polygon datasets using a Union method.
Since the question has to do with an area of South American continent, you need to calculate the
area of the entities in the projected SouthAmericaStates shapefile before performing the overlay.
To do that, follow the steps outlined in step 5 above.
After you have prepared the SouthAmericaStates shapefile for further analysis, locate the Analysis Tools kit in
the ArcToolbox window. Expand the Overlay toolset and open the Union tool. In the window that
opens, add all three projected polygon datasets as input features. Make sure that you save your
output dataset where you can find it. For the rest of the settings in this window choose the default
options and click OK to run the Union procedure.
When the output dataset is added to the map document, open its attribute table and examine the
fields. You may choose to Hide most of the fields that describe the species, except the PRESENCE
fields (you should have two -- one for each species).
These fields contain ‘1’ for polygons representing the area where one of the two armadillo species is
present. Why can we make this assumption? (Question 7: 2 points) Hint: check the original
species range files and think about how the Union operation works. You can consult ArcHelp files.
Select these areas by performing an attribute query that will return a selection of features which
have either one of the species present. Hint: use the OR operator.
Now you can find how much of the area of the continent has at least one of these species of armadillo
present. Technically you can use any of the three area fields in the union output shapefile attribute
table. But can you really? If you want to get an accurate answer to this question you will have to
recalculate the geometry of one of the area fields. Why? (Question 8: 2 points)
Right-click on an Area field’s name and select Calculate Geometry option. Uncheck the Calculate
selected records only option to re-calculate area for all the records in the table while maintaining
your selection. Recalculate geometry of all three Area fields in the table and compare the values.
Are they different? Why? (Question 9: 2 points).
After you have done your calculations and compared the values, right-click on the Area field’s name and
select the Statistics option. This tool returns various summary statistics on the values in the field,
including the sum. Record the sum of the area in the table below. This is the answer to the second
question.
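The select-then-sum logic of this step can be sketched in a few lines. The rows below are hypothetical stand-ins for the Union output attribute table: the field names PRESENCE and PRESENCE_1 and all area values are assumptions made for illustration, so check the actual field names in your own table.

```python
# Hypothetical rows standing in for the Union output attribute table.
# Field names and area values are invented for illustration only.
rows = [
    {"PRESENCE": 1, "PRESENCE_1": 0, "Area": 120.0},
    {"PRESENCE": 0, "PRESENCE_1": 1, "Area": 80.0},
    {"PRESENCE": 1, "PRESENCE_1": 1, "Area": 40.0},
    {"PRESENCE": 0, "PRESENCE_1": 0, "Area": 900.0},
]


def area_with_any_species(table):
    """Sum the Area of records where either presence field equals 1."""
    # Equivalent of an attribute query using the OR operator.
    selected = [r for r in table
                if r["PRESENCE"] == 1 or r["PRESENCE_1"] == 1]
    # Equivalent of reading the Sum value from the Statistics window.
    return sum(r["Area"] for r in selected)


print(area_with_any_species(rows))
```

The record with both presence fields set to 1 is counted once, because the OR query selects each record at most one time.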
7. Clear the selected features in the Union dataset and turn this layer off. You will not need to work
with it any more.
To answer the third question in our practice research project -- What portion of the continent is
covered by both Armadillo species ranges? – we will perform another type of a polygon-on-polygon
overlay, Intersect. Can you explain why in this case we should choose this method? (Question 10: 2
points)
In the Analysis Tools kit, locate and open the Intersect tool. In the window that opens, add all three
projected polygon datasets as input features. (The order in which you add them determines the
order of attributes in the table of the output of the dataset.) Make sure that you save your output
dataset where you can find it. For the rest of the settings in this window choose the default options
and click OK to run the Intersect procedure.
Examine the output dataset when it is added to the map document. Why does it have only one
entity? (Question 11: 2 points) Hint: you may want to review how the Intersect procedure works in
ArcHelp files.
Open the attribute table of the output dataset and examine the fields. You may hide the fields that
describe the species. Again, you have three area fields in this attribute table. Compare their values.
Why are they different? (Question 12: 2 points). Again, you will have to re-calculate the area of the
output feature using the Calculate Geometry tool. Record the resulting area in the table below. This is the
answer to the third question.
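To see how a portion of the continent can be derived from an overlay result, consider the following simplified sketch. It replaces the real polygons with axis-aligned rectangles given as (xmin, ymin, xmax, ymax), which is a strong simplification; the helper names and all coordinates are hypothetical.

```python
def rect_intersection(a, b):
    """Overlap of two rectangles (xmin, ymin, xmax, ymax), or None."""
    if a is None or b is None:
        return None
    xmin, ymin = max(a[0], b[0]), max(a[1], b[1])
    xmax, ymax = min(a[2], b[2]), min(a[3], b[3])
    if xmin >= xmax or ymin >= ymax:
        return None  # the rectangles do not overlap
    return (xmin, ymin, xmax, ymax)


def rect_area(r):
    """Area of a rectangle, or 0 for an empty (None) result."""
    return 0.0 if r is None else (r[2] - r[0]) * (r[3] - r[1])


# Hypothetical stand-ins for the continent and the two species ranges.
continent = (0, 0, 100, 100)
range_a = (10, 10, 60, 60)
range_b = (40, 40, 90, 90)

# Keep only the area common to all three inputs.
overlap = rect_intersection(rect_intersection(continent, range_a), range_b)
portion = rect_area(overlap) / rect_area(continent)
print(overlap, portion)
```

Dividing the area common to all inputs by the area of the continent gives the portion asked for in the third question.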
8. Finally, let’s find the answer to the last question in our project – Do all of the Armadillo species
sightings occur inside the Armadillo species range? Based on the data we have, we can answer this
question only for the greater naked-tailed armadillo species. We will use the point-in-polygon
overlay procedure to get the answer.
Make sure that your output data sets are turned off and only projected polygon and point datasets
are displayed in the data view.
In the Analysis Tools kit, locate and open the Spatial Join tool. In the window that opens, set the
projected GreaterArmadilloPts shapefile as the Target dataset and the projected GreaterArmadilloPoly
shapefile as the Join dataset. Make sure that you save your output dataset where you can find it.
Leave the Join option as One-to-One (default). In Field Map of Join Features you may delete
duplicate fields. (Both the point and polygon files contain the same set of attributes.) Since we
want to know which of the sightings are within the species range boundary, choose WITHIN as the
Match Option. Click OK to run the Join.
When the output dataset is added to the map document, turn off the point file that was used as an
input dataset in this analysis. Open the attribute table of the output dataset and examine its
attributes. The Join Count field displays the results of the spatial join analysis – the cells containing ‘1’
belong to the records representing armadillo sightings within this species range. You will use this field
to present the result of this analysis on a map.
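The idea behind a point-in-polygon overlay can be sketched with the classic ray-casting test. This is a common textbook technique and not necessarily what ArcGIS uses internally. The function name, the square "range" and the sample sightings are all hypothetical, and points lying exactly on the boundary are not treated specially.

```python
def point_in_polygon(pt, ring):
    """Ray-casting test: True if pt lies inside the simple polygon ring.

    ring is a list of (x, y) vertices; pt is an (x, y) pair.
    """
    x, y = pt
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Does this edge cross the horizontal line through the point?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            # Count only crossings to the right of the point.
            if x < x_cross:
                inside = not inside
    return inside


# Hypothetical species range (a square) and three sightings.
range_ring = [(0, 0), (10, 0), (10, 10), (0, 10)]
sightings = [(5, 5), (12, 3), (1, 9)]

# Mimic a Join Count field: 1 = within the range, 0 = outside the range.
join_count = [1 if point_in_polygon(p, range_ring) else 0 for p in sightings]
print(join_count)
```

A point is inside when a ray drawn from it crosses the polygon boundary an odd number of times, which is what the inside flag tracks.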
Close the attribute table and double-click on the output dataset name to open the Properties
window. Open the Symbology tab and select the Categories > Unique values symbology to
display the features in the dataset. Make sure that Join Count is selected as the Value Field. Click
Add All Values button to add values from this field to the classification box. By clicking on the
corresponding text, change the Label for ‘0’ category to ‘outside the range’ and to ‘within the range’
for ‘1’ category. Remove the check mark next to All Other Values category and click OK to apply
your symbology and close the Properties window.
Create a map showing the result of Spatial Join analysis using LetterLandscape template on the
Traditional Layouts tab of the Select Template window.

Note: You can make your map more visually attractive by double-clicking on the data frame
(contains the map in the layout view) and choosing Focus Data Frame option. When the
data frame becomes editable you can zoom in or pan the map features using the tools you
would use in the data view.
Question 13
Submit a map showing the results of your work.
5 marks
Question 14
Submit a table showing your area calculations.
5 marks
Species | Area covered by species range, sq. km
Greater Naked-Tailed Armadillo |
Southern Naked-Tailed Armadillo |
At least one of these species |
Two species co-occur |