Spatial Statistics II
RESM 575, Spring 2010, Lecture 8

Last time
- Part A: Spatial statistics
  - What are spatial measurements and statistics?
- Part B:
  - Measuring geographic distributions
  - Testing statistical significance
  - Identifying patterns

Review
- Used in lab7

What we will use spatial stats to find
1. How features are distributed
2. Where are the clusters
3. What is the pattern created by the features
4. What are the relationships between sets of features or values

Identifying patterns
- Part A: Measuring the pattern of discrete or noncontinuous features (points, lines, polygons)
- Part B: Measuring the spatial pattern of feature values

Measuring the pattern of discrete or noncontinuous features (points, lines, polygons)
Options:
1. Overlaying areas of equal size
2. Calculating the average distance between features
3. Counting the number of features within defined distances

Overlaying areas of equal size
- Termed quadrat analysis
- GIS overlays areas of equal size on the study area and counts the number of features in each
- Calculates the expected counts for a random distribution
- If fewer areas than expected contain most of the features and more areas than expected contain few or no features, the features form a clustered pattern

Notes
- Traditionally, squares are used for the quadrats
- The quadrat size will affect whether any patterns are identified!
- What size to use?

Quadrat size
- Quadrat size = twice the mean area per feature
- Length of a side of quadrat = $\sqrt{2 \cdot (A / n)}$, where A is the area of the extent and n is the number of features

Example
- Area = 6,000 m * 6,000 m = 36,000,000 m^2
- n = 36
- Length of a side of quadrat = $\sqrt{2 \cdot (36{,}000{,}000 / 36)} = \sqrt{2{,}000{,}000} \approx 1414.21$ m

Hawth's spatial ecology tools
- Already loaded in 317

Use Hawth's tools to create quadrat
- ArcMap -> View -> Toolbars
- Use the quadrat side length from the example above (about 1,414 m)

Quadrat analysis
- Because the method counts the number of features per unit area, it measures the density of features
- It doesn't take into account the proximity of features to each other or their arrangement

Calculating the quadrat analysis stat
- GIS can be used to count the number of features in each quadrat and then summarize the counts to create a frequency table
- We want a table listing the number of quadrats containing 0, 1, 2, etc. features
- How do we create this in GIS?

Creating a frequency table
1. Spatially join the point features to the quadrat polygons (right click on the quadrat polygons -> Joins and Relates -> Join...)
   - Make sure the join is based on spatial location
   - Join the point features
   - Choose Sum, because we want the total # of points per quadrat
   - The join creates a new output shapefile to be used in step 2
2. Now summarize the Count field from the output shapefile (right click the Count field)
   - This creates the final frequency table

Finished frequency table
- First column: # of points per quad
- Second column: # of quads that had that many points in them
- Right clicking on Options will allow us to export a dbf file for stat tests in Excel

Calculating the quadrat analysis stat
- We then need a frequency table for the expected (random) distribution, usually based on a Poisson distribution
- This means calculating the probability that a given number of features will occur in a quadrat
- $\lambda = n / k$, where
  - λ = the average # of features per quadrat
  - n = the # of features
  - k = the # of quadrats

Finding the probability P(x) of x features occurring in any given quadrat
- $P(x) = e^{-\lambda} \lambda^{x} / x!$
- e is Euler's number (the base of the natural logarithm), with a value of about 2.7183
- x! is x factorial (if x is 3, then 3! is 3*2*1 = 6)
- Excel can be used to easily calculate this equation and create a frequency table
- NOTE: After calculating P(x) for each value of x, we multiply each result by the total # of quadrats (k) to get the number of quadrats expected to contain that number of features

Analysis
- By comparing the two frequency tables, you can see whether the features create a pattern
- If the observed table has more quadrats with many features, and more quadrats with few or no features, than the table for the random distribution, then the features create a clustered pattern

Stat tests to find how much the frequency tables (and thus the patterns) differ
- Kolmogorov-Smirnov test: compares the cumulative difference in frequencies (as proportions) between the tables
- Chi-Square test: compares the difference in frequencies between the tables, class by class

Kolmogorov-Smirnov test
- Calculates the proportion of quads in each class for each line in the frequency table (the number of quads in the class divided by the total number of quads)
- Then creates a running cumulative total of proportions from the top of the table to the bottom; the total of all classes equals 1 (100% of the quads)
- Calculates the difference between the observed and expected cumulative proportions for each class and takes the largest difference
- Compares this value to the critical value for the confidence level you specify

Critical value of Kolmogorov-Smirnov test
- Critical value = $K / \sqrt{m}$, where
  - m = the number of quads
  - K = the constant for the confidence level you're using
- For a significance level of 0.20 (80% confidence), K is 1.07
- If there are 63 quads, critical value = $1.07 / \sqrt{63} \approx 0.135$
- If the calculated difference in proportions between the observed and random distribution is greater than the critical value, the difference can be considered statistically significant
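
The Poisson and Kolmogorov-Smirnov steps above can be sketched in a few lines of Python as an alternative to Excel. This is a minimal illustration, not the lab procedure: the observed frequency table is made up, the function names are hypothetical, and the last class is treated as "that many or more" so that the expected counts sum to the number of quadrats (an assumption the slides do not state).

```python
import math

def poisson_expected(n_features, n_quadrats, max_count):
    """Expected number of quadrats holding 0..max_count features (Poisson model).

    Assumption: the last class collects 'max_count or more' features,
    so the expected counts add up to n_quadrats.
    """
    lam = n_features / n_quadrats  # lambda = n / k
    probs = [math.exp(-lam) * lam ** x / math.factorial(x) for x in range(max_count)]
    probs.append(1.0 - sum(probs))  # remaining probability goes to the last class
    return [p * n_quadrats for p in probs]

def ks_statistic(observed, expected):
    """Largest absolute gap between the cumulative proportions of two tables."""
    obs_total, exp_total = sum(observed), sum(expected)
    obs_cum = exp_cum = d_max = 0.0
    for o, e in zip(observed, expected):
        obs_cum += o / obs_total
        exp_cum += e / exp_total
        d_max = max(d_max, abs(obs_cum - exp_cum))
    return d_max

def ks_critical(n_quadrats, k_const=1.07):
    """Critical value K / sqrt(m); 1.07 is the constant for the 0.20 level."""
    return k_const / math.sqrt(n_quadrats)

# Hypothetical observed table: number of quadrats containing 0, 1, 2, 3, 4 points
observed = [40, 10, 5, 3, 5]
n_quadrats = sum(observed)                                # 63 quadrats
n_features = sum(x * q for x, q in enumerate(observed))   # 49 points

expected = poisson_expected(n_features, n_quadrats, max_count=4)
d = ks_statistic(observed, expected)
crit = ks_critical(n_quadrats)
print(f"D = {d:.3f}, critical value = {crit:.3f}, significant = {d > crit}")
```

In this made-up table, far more quadrats are empty than the Poisson model expects, which is the kind of difference the test is meant to flag.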

Chi-Square
- Also based on the frequency table
- Tests whether two sets of frequencies (or two distributions) are significantly different

Chi-Square test
- You then compare the test value with the critical value
- If the χ² value is greater than the critical value, the distributions are significantly different, and you can consider the observed distribution to be not random, but either clustered or dispersed

Factors influencing the results of quadrat analysis
- Works best with small, tight clusters (earthquakes, bird studies)
- Since it does not measure the relationship or distance between features (just whether they fall in the quad or not), it may not recognize certain patterns

Measuring the pattern of discrete or noncontinuous features (points, lines, polygons)
- Next option: 2. Calculating the average distance between features

Calculating the Average Distance Between Features
- Referred to as the Nearest Neighbor index (NNI)
- Measures the distance between each feature and its nearest neighbor and then calculates the average
- Distributions that have a smaller average distance between features than a random distribution would have are considered clustered
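
As a rough illustration of the nearest neighbor calculation, the sketch below computes the observed mean nearest-neighbor distance for a set of points and compares it with the mean distance expected for a random pattern. The expected distance and standard error use the standard Clark-Evans formulas, which the slides do not give; the function name and the sample points are hypothetical.

```python
import math

def nearest_neighbor_index(points, area):
    """Return (observed mean distance, expected mean distance, index, z score).

    Assumptions: points are (x, y) tuples in planar units, area is the study
    area in the same units squared, and no edge correction is applied.
    """
    n = len(points)
    nearest = []
    for i, (xi, yi) in enumerate(points):
        # Distance from point i to its closest other point
        nearest.append(min(math.hypot(xi - xj, yi - yj)
                           for j, (xj, yj) in enumerate(points) if j != i))
    observed = sum(nearest) / n
    expected = 0.5 / math.sqrt(n / area)           # mean distance under randomness
    std_error = 0.26136 / math.sqrt(n * n / area)  # Clark-Evans standard error
    return observed, expected, observed / expected, (observed - expected) / std_error

# Hypothetical patterns in the same 30 x 30 study area
evenly_spaced = [(x, y) for x in (5, 15, 25) for y in (5, 15, 25)]
tight_cluster = [(x, y) for x in (14, 15, 16) for y in (14, 15, 16)]

for name, pts in (("evenly spaced", evenly_spaced), ("tight cluster", tight_cluster)):
    obs, exp, index, z = nearest_neighbor_index(pts, area=900.0)
    print(f"{name}: observed={obs:.2f}, expected={exp:.2f}, index={index:.2f}, z={z:.2f}")
```

An index below 1 (observed distance smaller than expected) points toward clustering, and an index above 1 points toward dispersion. Both patterns use the same study area, because the expected distance depends on it.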

Notes on Nearest Neighbor
- Use if there is a direct interaction between features
- Good for data analyzed along a line (e.g. plants or wildlife observations collected along a transect)
- Stat is calculated using the distance between features

Calculating the index in GIS

Interpretation of results

Testing the results statistically of NNI
- The null hypothesis is that the features are randomly distributed
- A spatially random distribution would result in a normal distribution of the NNI Z score
- A positive Z score indicates a dispersed pattern; a negative Z score indicates a clustered pattern
- At 95% confidence the z would have to be > 1.96 or < -1.96 to be statistically significant
- Can compare distributions (the more negative the z, the more clustered)
- Make sure study areas are the same size

Factors influencing the NNI
- For lines, the NNI could be misleading if endpoints are close and midpoints aren't
- It could also be misleading if single continuous lines are really several joined lines
- For centroids of large, convoluted polygons or polygons of different sizes, area boundaries may be close while the centroids are far apart
- Extent of the study area

Side note: finding points of lines, or points of polygons (centroids), in ArcGIS
- Select the Source tab

Measuring the pattern of discrete or noncontinuous features (points, lines, polygons)
- Next option: 3. Counting the number of features within defined distances

Counting the number of features within defined distances
- Known as the K-function, or Ripley's K-function
- Here you specify a distance interval and use GIS to calculate the average number of neighboring features within that distance of each feature
- If the average # of features found at a distance is greater than the average concentration of features throughout the study area, then the distribution is considered clustered at that distance
- [Figure: point pattern with distance bands drawn around a feature]
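
The counting idea behind the K-function can be sketched as follows: for each distance, average the number of neighbors found around every point and compare that with the count the overall density would predict. This is a simplified illustration with no edge correction and is not the formula a GIS package uses; the function names and sample points are hypothetical.

```python
import math

def mean_neighbors_within(points, distance):
    """Average number of other points within `distance` of each point."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        for j, (xj, yj) in enumerate(points):
            if i != j and math.hypot(xi - xj, yi - yj) <= distance:
                total += 1
    return total / len(points)

def expected_neighbors_within(n_points, area, distance):
    """Neighbors expected in a circle of radius `distance` at the overall density.

    Assumption: features are spread evenly over the study area (random pattern).
    """
    density = n_points / area
    return density * math.pi * distance ** 2

# Hypothetical data: two small clusters in a 100 x 100 study area
points = [(10, 10), (11, 10), (10, 11), (80, 80), (81, 80), (80, 81)]
area = 100.0 * 100.0

for d in (2, 5, 20):
    observed = mean_neighbors_within(points, d)
    expected = expected_neighbors_within(len(points), area, d)
    pattern = "clustered" if observed > expected else "not clustered"
    print(f"distance {d}: observed={observed:.2f}, expected={expected:.3f} -> {pattern}")
```

Running the comparison over several distances shows how the apparent pattern changes with the scale of analysis.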

Notes on K-function
- Good if you are interested in how the pattern changes at different scales of analysis
- Includes all distances to neighbors within a given distance
- Example uses: emergency calls, bird nests

K function in GIS
- Finds the distance from each point to every other point
- Then, for each point, counts up the number of surrounding points within a given distance

Interpreting the K stat
- Paper on class website for today

Factors influencing the K function
- Points near the edge of a study area are likely to have fewer neighboring points
- At larger distances it is more likely that fewer neighboring points will be found than actually exist

Comparing methods for measuring the pattern of feature locations