Spatial statistics II

Spatial Statistics II
RESM 575
Spring 2010
Lecture 8
Last time
Part A: Spatial statistics
• What are spatial measurements and statistics?
Part B:
• Measuring geographic distributions
• Testing statistical significance
• Identifying patterns
2
Review
Used in Lab 7
3
What we will use spatial stats to find
1. How features are distributed
2. Where are the clusters
3. What is the pattern created by the features
4. What are the relationships between sets of features or values
4
Identifying patterns
Part A:
Measuring the pattern of discrete or
noncontinuous features (points, lines,
polygons)
Part B:
Measuring the spatial pattern of feature
values
Measuring the pattern of discrete or
noncontinuous features (points, lines,
polygons)
Options:
1. Overlaying areas of equal size
2. Calculating the average distance between features
3. Counting the number of features within defined distances
6
Overlaying areas of equal size




• Termed quadrat analysis
• GIS overlays areas of equal size on the study area and counts the number of features in each
• Calculates the expected counts for a random distribution
• If fewer areas than expected contain most of the features and more areas than expected contain few or no features, the features form a clustered pattern
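The overlay-and-count step can be sketched outside the GIS as follows. This is a minimal illustration, not the ArcMap workflow: the function name `quadrat_counts`, the extent tuple, and the sample coordinates are all hypothetical, and it assumes square quadrats aligned to the lower-left corner of the extent.

```python
import numpy as np

def quadrat_counts(x, y, extent, side):
    """Count points in each square quadrat laid over the extent.

    extent is (xmin, ymin, xmax, ymax); side is the quadrat side length.
    Returns a 2-D integer array indexed as [x_bin, y_bin].
    """
    xmin, ymin, xmax, ymax = extent
    # Enough quadrats to cover the whole extent in each direction
    nx = int(np.ceil((xmax - xmin) / side))
    ny = int(np.ceil((ymax - ymin) / side))
    x_edges = xmin + side * np.arange(nx + 1)
    y_edges = ymin + side * np.arange(ny + 1)
    counts, _, _ = np.histogram2d(x, y, bins=[x_edges, y_edges])
    return counts.astype(int)

# Hypothetical data: 5 points in a 4 x 4 extent, quadrats of side 2
x = np.array([0.5, 0.7, 1.5, 3.0, 3.5])
y = np.array([0.5, 0.9, 1.2, 3.1, 0.4])
counts = quadrat_counts(x, y, (0, 0, 4, 4), side=2)
print(counts)
```

Three of the five sample points fall in the lower-left quadrat, which is the kind of uneven count that suggests clustering.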
7
Notes



• Traditionally, squares are used for the quadrats
• The quadrat size will affect whether any patterns are identified!
• What size to use?
8
Quadrat size
Quadrat size (area) = twice the mean area per feature
• Length of a side of quadrat = sqrt(2 * (area of the extent / # of features))

9
Example
Area = 6,000 m * 6,000 m = 36,000,000 m²
n = 36
Length of a side of quadrat = sqrt(2 * (36,000,000 / 36)) = sqrt(2,000,000) = 1414.21 m
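The arithmetic in this example can be checked with a few lines of Python. The function name `quadrat_side` is hypothetical; it simply applies the rule that the quadrat area is twice the mean area per feature.

```python
import math

def quadrat_side(extent_area, n_features):
    """Side length of a square quadrat whose area is twice the mean area per feature."""
    return math.sqrt(2 * extent_area / n_features)

# Values from the example: a 6,000 m x 6,000 m extent with 36 features
side = quadrat_side(6000 * 6000, 36)
print(round(side, 2))  # 1414.21
```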
10
Hawth’s spatial ecology tools
Already loaded in 317
11
Use Hawth’s tools to create quadrat

ArcMap ->View -> Toolbars ->
Quadrat side length to enter: 1414 m (from the example)
12
Quadrat analysis


• Because the method counts the number of features per unit area, it measures the density of features
• It doesn't take into account the proximity of features to each other or their arrangement
13
Calculating the quadrat analysis stat


• GIS can be used to count the number of features in each quadrat and then summarize the counts to create a frequency table
• We want a table listing the number of quadrats containing 0, 1, 2, etc. features
How do we create this in GIS?
14
Creating a frequency table
1. Spatially join the point features to the quadrat polygons (right click on the quadrat polygons -> Joins and Relates -> Join…)
Dialog notes:
• Make sure it is set to join based on spatial location
• The layer to join is the point features
• Choose Sum because we want the total # of points per quadrat
• Creates a new output shapefile to be used in step 2
15
Creating a frequency table
2. Now summarize the Count field from the output shapefile (right click the Count field)
• This will create the final frequency table shown on the next slide
16
Finished frequency table
Table columns:
• # of points per quad
• # of quads that had that many points in them
Right clicking on Options will allow us to export a dbf file for stat tests in Excel
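For readers working outside ArcMap, the same observed frequency table can be sketched in Python once the per-quadrat counts are available. The function name `frequency_table` and the sample counts are hypothetical; this is an illustration of the summarize step, not a replacement for the join workflow above.

```python
from collections import Counter

def frequency_table(counts_per_quadrat):
    """Return rows of (x, number of quadrats containing exactly x points)."""
    tally = Counter(counts_per_quadrat)
    # Include classes with zero quadrats so the table has no gaps
    return [(x, tally.get(x, 0)) for x in range(max(tally) + 1)]

# Hypothetical point counts for 9 quadrats
counts = [0, 0, 1, 3, 0, 2, 1, 0, 5]
table = frequency_table(counts)
for x, n_quads in table:
    print(x, n_quads)
```

Keeping the empty class (here, 4 points) makes the table line up row by row with the expected table used in the later tests.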
17
Calculating the quadrat analysis stat


• We then need a frequency table for the expected distribution, usually based on a Poisson distribution
• This requires calculating the probability that a given number of features will occur in a quadrat
λ = n / k
Where
λ = the average # of features per quadrat
n = the # of features
k = the # of quadrats
18
Finding the prob of a # of features occurring in any given quadrat, P(x)
$P(x) = e^{-\lambda} \lambda^{x} / x!$
e is Euler's number, with a value of approximately 2.7183
x! is x factorial (if x is 3, then 3! is 3*2*1 = 6)
Excel can be used to easily calculate this equation and create a frequency table
NOTE: After calculating P(x) for each x, we then multiply the results by the total # of quadrats (k) to get the number of quadrats expected to contain that number of features
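As an alternative to Excel, the expected (Poisson) frequency table can be sketched in Python. The function name `poisson_expected` and the sample values of n and k are hypothetical. The sketch scales each probability by the number of quadrats k, so the expected quadrat counts sum to roughly k.

```python
import math

def poisson_expected(n_features, n_quadrats, max_x):
    """Rows of (x, P(x), expected number of quadrats containing x features)."""
    lam = n_features / n_quadrats  # average number of features per quadrat
    rows = []
    for x in range(max_x + 1):
        p = math.exp(-lam) * lam ** x / math.factorial(x)
        rows.append((x, p, p * n_quadrats))
    return rows

# Hypothetical case: 36 features over 18 quadrats, so lambda = 2
rows = poisson_expected(36, 18, max_x=8)
for x, p, expected_quads in rows:
    print(x, round(p, 4), round(expected_quads, 2))
```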
19
20
Analysis


• By comparing the two frequency tables, you can see whether the features create a pattern
• If the observed table has more quadrats with many features, and more quadrats with few or no features, than the table for the random distribution, then the features create a clustered pattern
21
Stat tests to find how much the frequency tables (and thus the patterns) differ
• Kolmogorov-Smirnov test
  Compares the largest difference in cumulative proportions between the tables
• Chi-Square test
  Compares the differences in frequencies, summed over all classes
22
Kolmogorov-Smirnov test


• Calculates the proportion of quads in each class for each line in the frequency table (the number of quads in the class divided by the total number of quads)
• Then creates a running cumulative total of proportions from the top of the table to the bottom; the total of all classes equals 1 (100% of the quads)
23
Kolmogorov-Smirnov test
• The calculated difference in proportions is the largest difference between the observed and expected cumulative proportions
• Compares this value to the critical value for a confidence level you specify
24
Critical value of Kolmogorov-Smirnov test
Critical value = K / sqrt(m)
Where
m = the number of quads
K = the constant for the significance level you are using
For a significance level of 0.20 (80% confidence), K is 1.07
With 63 quads, critical value = 1.07 / sqrt(63) = 0.135
If the calculated difference in proportions between the observed and random distribution is greater than the critical value, the difference can be considered statistically significant.
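The Kolmogorov-Smirnov steps described on these slides can be sketched in a few lines. The function name `ks_quadrat_test` and both frequency lists are hypothetical. The default constant 1.07 is the value the slide gives for the 0.20 level, and the sketch assumes the test statistic is the largest absolute difference between the two cumulative proportions.

```python
import math

def ks_quadrat_test(observed, expected, k_const=1.07):
    """Compare two quadrat frequency tables, given as lists of quad counts per class.

    Returns (largest cumulative difference, critical value, is_significant).
    """
    m = sum(observed)            # total number of quads
    total_exp = sum(expected)
    cum_obs = cum_exp = 0.0
    d_max = 0.0
    for o, e in zip(observed, expected):
        cum_obs += o / m         # running cumulative proportion, observed
        cum_exp += e / total_exp # running cumulative proportion, expected
        d_max = max(d_max, abs(cum_obs - cum_exp))
    critical = k_const / math.sqrt(m)
    return d_max, critical, d_max > critical

# Hypothetical tables for 63 quads (classes: 0, 1, 2, 3, 4+ points)
observed = [30, 15, 8, 5, 5]
expected = [20, 23, 13, 5, 2]
d_max, critical, significant = ks_quadrat_test(observed, expected)
print(round(d_max, 3), round(critical, 3), significant)
```

With 63 quads the critical value matches the 0.135 shown on the slide.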
25
Chi-Square


• Also based on the frequency table
• Tests whether two sets of frequencies (or two distributions) are significantly different
26
Chi-Square test


• You then compare the test value with the critical value
• If the χ² value is greater than the critical value, the distributions are significantly different, and you can consider the observed distribution to be not random, but either clustered or dispersed
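The slides do not show the chi-square formula, so the sketch below assumes the standard Pearson form, the sum over classes of (observed - expected)² / expected. The function name `chi_square_stat` and the frequency lists are hypothetical, and the critical value must be looked up by the user. The 9.488 used here is the tabulated value for a 0.05 significance level with 4 degrees of freedom, and that choice of degrees of freedom is itself an assumption.

```python
def chi_square_stat(observed, expected):
    """Pearson chi-square statistic for two frequency tables of equal length."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical tables for 63 quads (classes: 0, 1, 2, 3, 4+ points)
observed = [30, 15, 8, 5, 5]
expected = [20, 23, 13, 5, 2]
chi2 = chi_square_stat(observed, expected)

critical_value = 9.488  # assumed: 0.05 level, 4 degrees of freedom
print(round(chi2, 3), chi2 > critical_value)
```

Classes with very small expected counts inflate the statistic, so in practice sparse classes are often merged before testing.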
27
Factors influencing the results of quadrat
analysis


• Works best with small, tight clusters (earthquakes, bird studies)
• Since it does not measure the relationship or distance between features (just whether they fall in the quad or not), it may not recognize certain patterns
28
Calculating the Average Distance Between
Features



• Referred to as the Nearest Neighbor index (NNI)
• Measures the distance between each feature and its nearest neighbor and then calculates the average
• Distributions that have a smaller average distance between features than a random distribution would have are considered clustered
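The index can be sketched directly from point coordinates. The slides do not give the expected distance for a random pattern, so the sketch assumes the commonly used value 0.5 / sqrt(n / A), where n is the number of points and A is the study area. The function name `nearest_neighbor_index` and the sample points are hypothetical.

```python
import numpy as np

def nearest_neighbor_index(points, area):
    """Ratio of the observed mean nearest-neighbor distance to the expected one.

    Values below 1 suggest clustering; values above 1 suggest dispersion.
    """
    pts = np.asarray(points, dtype=float)
    n = len(pts)
    # Pairwise distance matrix (brute force, fine for small point sets)
    diff = pts[:, None, :] - pts[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    np.fill_diagonal(dist, np.inf)  # a point is not its own neighbor
    observed = dist.min(axis=1).mean()
    expected = 0.5 / np.sqrt(n / area)  # assumed random-pattern expectation
    return observed / expected

# Hypothetical points in a study area of 4 square units
spread = [(0, 0), (1, 0), (0, 1), (1, 1)]
tight = [(0, 0), (0.1, 0), (0, 0.1), (0.1, 0.1)]
print(nearest_neighbor_index(spread, area=4))
print(nearest_neighbor_index(tight, area=4))
```

The evenly spaced set gives an index above 1 and the tightly packed set gives an index well below 1.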
30
Notes on Nearest Neighbor



• Use if there is a direct interaction between features
• Good for data analyzed along a line (e.g. plants or wildlife observations collected along a transect)
• The statistic is calculated using the distance between features
31
Calculating the index in GIS
32
Interpretation of results
33
Testing the results statistically of NNI



• The null hypothesis is that the features are randomly distributed
• Under spatial randomness, the NNI is approximately normally distributed
• The result is therefore tested with a Z score
34
Testing the results statistically of NNI



• A positive Z score indicates a dispersed pattern; a negative Z score indicates a clustered pattern
• At the 95% confidence level, z must be > 1.96 or < -1.96 to be statistically significant
• Can compare distributions (the lower, i.e. more negative, the z, the more clustered)
• Make sure study areas are the same size
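A minimal sketch of the Z score follows. The slides do not show the formula, so this assumes the usual form z = (observed - expected) / SE, with expected = 0.5 / sqrt(n / A) and SE = 0.26136 / sqrt(n² / A). The function name `nni_z_score` and the input values are hypothetical.

```python
import math

def nni_z_score(observed_mean_dist, n, area):
    """Z score for the average nearest-neighbor distance under a random pattern."""
    expected = 0.5 / math.sqrt(n / area)
    std_error = 0.26136 / math.sqrt(n ** 2 / area)
    return (observed_mean_dist - expected) / std_error

# Hypothetical case: 100 points in a 10,000 square-unit study area,
# with an observed mean nearest-neighbor distance of 3.5 units
z = nni_z_score(3.5, n=100, area=10_000)
print(round(z, 2))
if z < -1.96:
    print("significantly clustered at the 95% level")
elif z > 1.96:
    print("significantly dispersed at the 95% level")
else:
    print("not significantly different from random")
```

Because both the expected distance and the standard error depend on the area, the same points can give different Z scores under different study-area sizes, which is why the study areas should match when comparing.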
35
Factors influencing the NNI

• For lines, the NNI could be misleading if endpoints are close and midpoints aren't
• It could also be misleading if single continuous lines are really several joined lines
• For centroids of large, convoluted polygons or of polygons of different sizes, area boundaries may be close while the centroids are far apart
36
Side note: finding points of lines, or points
of polygons (centroids) in ArcGIS
Select the Source tab
37
Factors influencing the NNI

Extent of the study area
38
Counting the number of features within
defined distances



• Known as the K-function, or Ripley's K-function
• Here you specify a distance interval and use GIS to calculate the average number of neighboring features within that distance of each feature
• If the average # of features found at a distance is greater than the average concentration of features throughout the study area, then the distribution is considered clustered at that distance
Figure note: K is calculated for a series of distance bands
40
Notes on K-function



• Useful if you are interested in how the pattern changes at different scales of analysis
• Includes all distances to neighbors within a given distance
• Examples: emergency calls, bird nests
41
K function in GIS


• Finds the distance from each point to every other point
• Then, for each point, counts up the number of surrounding points within a given distance
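The two steps above can be sketched as follows. This is a naive estimate with no edge correction, assuming K(d) = A * (number of ordered pairs within d) / (n * (n - 1)), compared against πd², the value expected under a random pattern. The function name `ripley_k` and the sample points are hypothetical, and GIS packages may scale or correct the statistic differently.

```python
import numpy as np

def ripley_k(points, area, distances):
    """Naive Ripley's K (no edge correction) for a list of distance bands.

    Returns rows of (d, mean neighbors within d, K(d), pi * d**2).
    """
    pts = np.asarray(points, dtype=float)
    n = len(pts)
    # Distance from each point to every other point
    diff = pts[:, None, :] - pts[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    np.fill_diagonal(dist, np.inf)  # exclude self-pairs
    rows = []
    for d in distances:
        pair_count = int((dist <= d).sum())  # ordered pairs within d
        k = area * pair_count / (n * (n - 1))
        rows.append((d, pair_count / n, k, np.pi * d ** 2))
    return rows

# Hypothetical points in a study area of 4 square units
pts = [(0, 0), (1, 0), (0, 1), (1, 1)]
rows = ripley_k(pts, area=4, distances=[0.5, 1.0, 1.5])
for d, mean_neighbors, k, k_random in rows:
    print(d, mean_neighbors, round(k, 3), round(k_random, 3))
```

A K value above πd² at a given distance suggests clustering at that scale, and a value below it suggests dispersion.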
42
Interpreting the K stat
43
Paper on class website for today.
44
Factors influencing the K function


• Points near the edge of a study area are likely to have fewer neighboring points
• At larger distances it's more likely that fewer neighboring points will be found than actually exist
45
Comparing methods for measuring the
pattern of feature locations
46