GEOG 070 – Spring 2005
Lab #6: Geographic Data and Quantitative Methods

Due Dates:
Wednesday Labs: Tuesday, April 12, 2005 by 5:00 PM
Thursday Labs: Wednesday, April 13, 2005 by 5:00 PM

Assignment Overview

In previous assignments, you’ve explored the use of a GIS to create maps for purposes of display or qualitative interpretation. In addition to these applications, we can also perform quantitative analyses using a GIS. The attribute data associated with spatial objects (whether they are raster cells or points, lines, and polygons in the vector model) often represent a specific quantity associated with that location. We can extract attribute values and use them to calculate descriptive statistics that tell us about the characteristics of different locations. In this lab, you’ll explore some of ArcView’s capabilities for summarizing and extracting attribute values from raster datasets in order to perform some very basic quantitative analyses on a dataset that describes the temperature and moisture conditions in the vicinity of Baltimore, Maryland on May 24, 2002.

Data for the Assignment

You’ll use the following ArcView shapefiles and GRIDs as data layers for this assignment:

climdiv.shp: This shapefile contains a single polygon representing the extent of Maryland Climate Division 6 (MD CD6).

counties.shp: This shapefile contains polygons representing the counties contained in Maryland Climate Division 6 (Baltimore County, Carroll County, Cecil County, the City of Baltimore, Frederick County, Harford County, Howard County, and Montgomery County).

LULC: This GRID (a raster data format used by ArcView) contains land-use/land-cover information. Each 1 km² cell contains an integer class value, which can be translated to a certain land use (these labels are stored in the GRID’s attribute table). There are a total of 8270 1 km² cells within MD CD6.
LST: This GRID contains land surface temperature information derived from the Terra MODIS sensor’s morning overpass of MD CD6 on May 24, 2002. Each 1 km² cell contains a floating-point value that specifies the temperature on that day in degrees Fahrenheit.

API: This GRID contains antecedent precipitation index information, derived from the NEXRAD Doppler radar network’s precipitation observations over a period of several months. Each 4.7 km² cell contains a floating-point value that specifies the antecedent moisture condition in terms of millimeters of accumulated precipitation (i.e. higher values indicate more rain has fallen more recently in the period preceding May 24, 2002).

*You will be graded on the 10 questions in this assignment.

Procedure

Open the Project, Save Your Own Version, Explore the Data

1. Start ArcView and open the following project in the course data directory:
J:\isis\html\courses\2005spring\geog\070\006\data\lab6\lab6.apr

2. Before you do anything else, save your own version of the project by using File > Save Project As… and specifying a location in your GEOG 070 student space:
J:\isis\html\courses\2005spring\geog\070\006\students\youronyen\lab6\lab6.apr

3. Before you proceed, also set your working directory so that the temporary files created in this lab are stored there. Use File > Set Working Directory… and specify the same directory where you stored your version of the project:
J:\isis\html\courses\2005spring\geog\070\006\students\youronyen\lab6\

4. Now familiarize yourself with the data. The project is set up to open with a View showing the LULC GRID with the county boundaries overlaid on it. You should be able to see the City of Baltimore near the middle of the climate division (note the cluster of red pixels that denote the Urban and Built-Up LULC class), and suburban Washington to the southwest of Baltimore.
Using the Legend Editor, you can view the Legend associated with each Theme, and you can use the checkmarks to turn the Themes on and off so you can view the distribution of each GRID’s values. Take a look at the pattern of temperatures in the LST (land surface temperature) GRID, as well as the pattern of moisture in the API GRID.

Tabulate Areas and Map Query

5. By using the Analysis > Tabulate Areas menu item, we can get ArcView to report counts of the number of GRID cells inside each of the county polygons. We will create a cross-tabulation of the count of each class of LULC cells within each county. Click on the Analysis > Tabulate Areas menu item, and fill in the pull-down menus with the following values before clicking OK:

6. The resulting table gives counts of each type of LULC cell within each of the counties. Since the cell size for the LULC GRID is 1 km², these counts also tell us the area in square kilometers of each land use within each county. ArcView also provides a convenient way to sum the values in a Field (a.k.a. column) using the Field > Statistics menu item. First, highlight a particular Field by clicking on its label. For example, in the figure above, I have highlighted the City of Baltimore Field. Now, when I click on the Field > Statistics menu item, a window of statistics for that Field pops up.

The value we are interested in here is the Sum. This tells us the total number of LULC cells within the City of Baltimore (212). We can now use that value to calculate the proportion of the City of Baltimore that falls in any particular LULC class by looking up the class count in the Table, e.g. 178/212 cells ≈ 84% of the City of Baltimore is in the Urban and Built-Up class in the LULC GRID. We can use this same approach to determine the total number of cells in the climate division by getting the statistics for the counts of LULC classes that can be found in the LULC GRID’s Table.
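The proportion arithmetic described above can also be checked outside ArcView. The following Python sketch is only an illustration under stated assumptions: the dictionary stands in for one county column of the Tabulate Areas table, and only the 178 Urban and Built-Up cells and the total of 212 come from the example above. The other class counts and the function name percent_of_total are made up.

```python
# Hypothetical column from a Tabulate Areas table: LULC class -> cell count.
# Only 178 (Urban and Built-Up) and the total of 212 come from the lab text;
# the split of the remaining 34 cells is invented for illustration.
city_of_baltimore = {
    "Urban and Built-Up": 178,
    "Deciduous Forest": 20,
    "Water": 14,
}


def percent_of_total(counts, lulc_class):
    """Return the share of cells in lulc_class as a percentage of all cells."""
    total = sum(counts.values())  # the number Field Statistics reports as Sum
    return 100.0 * counts[lulc_class] / total


share = percent_of_total(city_of_baltimore, "Urban and Built-Up")
print(f"proportion = {share:.1f}%")
```

Because each LULC cell covers 1 km², the same counts can be read directly as areas in square kilometers.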
Make the LULC Theme the active theme by clicking on its legend (when you do this, it should appear to be raised), and then open the Theme’s Table by clicking on the Table button. Once the LULC Table appears, highlight the count Field and use the Field > Statistics menu item to find the total number of cells with values other than ‘No Data’ in the LULC GRID.

Use these techniques to help you answer the following questions (report proportions in your writeup as percentages with one decimal place, i.e. proportion = xx.x%):

(Q1) What proportion of the climate division is Baltimore County?
(Q2) What proportion of Baltimore County is classified as Mixed Forest?
(Q3) What proportion of the climate division is Mixed Forest that is within Baltimore County?

Extracting Values from Floating-point GRIDs

7. Because the LULC GRID contains integer values, it has an associated Table that we can use with Tabulate Areas and Field Statistics. The other GRIDs in the project contain floating-point values (with decimal places) and do not have an associated Table, so we must use a different method to extract information from these GRIDs. Make the LST GRID visible so you can see the temperature pattern on May 24, 2002 (it’s always good to look at the data first). Suppose we want to find all the cells in the LST GRID that were warmer than 90 degrees Fahrenheit. We can use the Analysis > Map Query menu item for this purpose. Clicking on that menu item will bring up a dialog to specify a query. Here we are interested in the condition where ([LST] > 90), so this is what we need to type into the Map Query 1 dialog:

You can type the expression in, or use the buttons in the dialog to help you. Once your dialog looks like the figure above, click Evaluate. This will create a new GRID called Map Query 1, which will have values of 0 where the expression was not true (LST ≤ 90), and values of 1 where the expression was true (LST > 90).
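To see what a Map Query does to the cell values, here is a minimal Python sketch of the same idea. It is not ArcView code: the small temperature grid is made up, the function name map_query is hypothetical, and the choice to carry ‘No Data’ cells through unchanged is an assumption made for this illustration.

```python
# Made-up 3 x 4 temperature grid in degrees Fahrenheit; None marks 'No Data'.
lst = [
    [88.5, 91.2, 93.0, None],
    [86.1, 90.0, 92.4, 89.9],
    [None, 87.3, 90.1, 85.0],
]


def map_query(grid, predicate):
    """Return an integer grid: 1 where predicate is true, 0 where it is false.

    'No Data' cells (None) are passed through unchanged (an assumption here).
    """
    return [
        [None if value is None else int(predicate(value)) for value in row]
        for row in grid
    ]


query = map_query(lst, lambda temperature: temperature > 90)

# Count the 0 and 1 cells, as the count Field of the new GRID's Table would.
cells = [value for row in query for value in row if value is not None]
count_true = sum(cells)
count_false = len(cells) - count_true
percent_true = 100.0 * count_true / len(cells)
print(count_false, count_true, f"{percent_true:.1f}%")
```

Note that the cell holding exactly 90.0 receives a 0, because the expression tests for values strictly greater than 90.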
The new GRID is an integer GRID, so it has a Table that you can use to find the count of cells with each value.

(Q4) What proportion of the climate division had temperatures above 90 degrees F?

8. We can also form more complicated Map Queries by using operators like and, or, and not. For example, suppose we wanted to find all cells where the temperature was below 70 degrees and the LULC class is Deciduous Forest. We would use the following expression for the Map Query:

(([LST] < 70) and ([LULC.Lulcclass] = "Deciduous Forest"))

Note the use of brackets and the quotation marks around the class name; the query will not run if you do not get the syntax right.

(Q5) What proportion of the climate division had temperatures above 90 degrees F and was within the “Urban and Built-Up” LULC class?

Descriptive Statistics Using Summarize Zones

9. In addition to obtaining counts of cells with certain characteristics, ArcView also allows us to identify groups of cells and calculate descriptive statistics for them. We can do this using the Analysis > Summarize Zones menu item. First, make the counties.shp shapefile the active theme. Then click on the Analysis > Summarize Zones menu item. This will bring up a dialog that is used to specify the Field in the shapefile that defines the zones; we’ll use the Countyname Field, since it is more convenient to refer to the counties by their names than by their IDs.

This will bring up a further dialog that allows us to specify which GRID we wish to summarize. Let’s generate descriptive statistics for the API GRID.

This will produce a Table of statistics of API within each of the counties. The Table will be entitled Stats of API Within Zones of counties.shp, and will contain fields that give the minimum, maximum, range, mean, standard deviation, and sum of values of API cells within each of the counties.

(Q6) What were the lowest and highest mean API values and which counties had these values?

(Q7)
Given the locations of the counties with the lowest and highest mean API values in the climate division, what does this suggest about the pattern of precipitation preceding May 24, 2002?

10. We can also use the mean and standard deviation reported in the Table produced by Summarize Zones to determine the position of an individual cell’s value within the distribution of values for that zone by doing a little simple math. What we will do is calculate a z-score by taking the difference between an individual cell’s value and the mean value for the zone, and then dividing that difference by the standard deviation of the values for that zone. Expressed as a formula:

z = (cell value - mean) / standard deviation

A z-score tells you how a certain value in a distribution compares to the central tendency of that distribution. A negative z-score tells you the value is lower than the average for the distribution, while a positive z-score tells you the value is higher than average, and z-scores with larger absolute values are further from the average than z-scores with smaller absolute values. You can use the Calculator found in Start Menu > Programs > Accessories if you need help subtracting and dividing to calculate z-scores.

You can obtain the value of an individual cell in a GRID by using the Identify Tool (which is activated by clicking on its icon, although this Tool is usually active in the View by default), making sure that the proper theme is active before you do so. Make the API GRID the active theme and click on the cell that is furthest south in the climate division (located at about coordinates (1950, -401); the cursor coordinates are shown in the upper-right part of the ArcView window).
An Identify Results window should appear, which will indicate the API value at that position. (Before using the Identify Tool, you might first want to use the Zoom In Tool to zoom into the correct vicinity, making it easier for you to click on the correct cell.)

11. Use the method described above to obtain the value for the API cell furthest east in the climate division (at coordinates (2047, -320)).

(Q8) Calculate the z-score for the API value of the lower-right cell using the mean and standard deviation for the county where the cell is located to define the parameters of the distribution. Report z-scores in your writeup to two decimal places, i.e. z-score = x.xx

12. Use the Analysis > Summarize Zones menu item to obtain the mean and standard deviation of the API values for all the cells in the climate division by making climdiv.shp the active theme before clicking on Analysis > Summarize Zones and selecting the API GRID to summarize.

(Q9) Calculate another z-score for the API value of the lower-right cell, this time using the mean and standard deviation for the entire climate division (obtained in Step 12) to define the parameters of the distribution.

(Q10) What can we state about the moisture condition at this location by comparing these two z-scores for this particular cell? Did this cell receive an unusually small or large amount of rain in the period preceding May 24, 2002 when compared to other cells in the same county? Did this cell receive an unusually small or large amount of rain when compared to all cells in the climate division? What does this suggest about the amount of rain received in this particular county when it is compared to the whole climate division?
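The z-score comparison in Steps 10 through 12 can be sketched in a few lines of Python. This is only an illustration: the API values below are made up and are not taken from the lab data, the function name z_score is hypothetical, and the sketch assumes the population standard deviation (statistics.pstdev), which may or may not match the convention used in the Summarize Zones Table. In the lab itself, use the mean and standard deviation that ArcView reports.

```python
from statistics import mean, pstdev


def z_score(cell_value, zone_values):
    """Return (cell value - zone mean) / zone standard deviation."""
    zone_mean = mean(zone_values)
    zone_std = pstdev(zone_values)  # population std. dev. (an assumption)
    return (cell_value - zone_mean) / zone_std


# Made-up API values (mm) for the cells of one county and of the whole division.
county_api = [10.0, 12.0, 14.0, 16.0, 18.0]
division_api = county_api + [20.0, 22.0, 24.0, 26.0, 28.0]

cell_value = 18.0  # the value the Identify Tool would report for one cell

z_county = z_score(cell_value, county_api)
z_division = z_score(cell_value, division_api)
print(f"z-score within county   = {z_county:.2f}")
print(f"z-score within division = {z_division:.2f}")
```

With these made-up numbers the same cell is well above its county’s average but slightly below the division’s average, which is the kind of contrast the two z-scores are meant to reveal.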