Lecture 8

Introduction

• Statistical options tend to be limited in most GIS applications.

• This is likely to be redressed in the future.

• We will look at spatial statistics in general terms, and conclude with a review of the software available.

Basic Concepts

• Spatial statistics differ from ‘ordinary’ statistics by the inclusion of locational properties.

• This makes spatial statistics more complex.

• The book by Bailey and Gatrell (1995) provides an accessible introduction. They identify four categories:

– Point pattern data;

– Spatially continuous data;

– Areal data; and

– Interaction data.

• There is an obvious correspondence between these categories and the conceptual models.

Scale Levels

• Attribute data can be classified by measurement scale:

– Nominal: e.g. 1=females, 2=males.

– Ordinal: e.g. 1=good, 2=medium, 3=poor.

– Interval (+ ratio): e.g. degrees Centigrade, percentages.

• Bailey and Gatrell classify techniques by purpose:

– Visualisation;

– Exploration; and

– Modelling (this is involved in all statistical inference and hypothesis testing).

Random Variables

• Statistical models deal with phenomena that are stochastic (i.e. are subject to uncertainty).

• A random variable Y has values that are subject to uncertainty (but may not necessarily be random).

• The distribution of possible values is referred to as the probability distribution.

• It is represented by a function $f_Y(y)$.

• Random variables may be discrete or continuous.

Probabilities

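• The probability that Y lies between a and b is given by:

$$P(a \le Y \le b) = \sum_{y=a}^{b} f_Y(y) \quad \text{if } Y \text{ is discrete}$$

$$P(a \le Y \le b) = \int_a^b f_Y(y)\,dy \quad \text{if } Y \text{ is continuous (probability density)}$$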

• The cumulative probability (or distribution function) $F_Y$ is given by:

$$F_Y(y) = \sum_{u \le y} f_Y(u) \quad \text{if } Y \text{ is discrete}$$

$$F_Y(y) = \int_{-\infty}^{y} f_Y(u)\,du \quad \text{if } Y \text{ is continuous}$$
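As a small illustration of the discrete case, the sketch below evaluates both quantities for a fair six-sided die. The function names `f_Y`, `prob_between` and `F_Y` are chosen here for illustration only; exact fractions are used so the results can be checked by hand.

```python
from fractions import Fraction

# Probability function of a fair six-sided die: f_Y(y) = 1/6 for y = 1..6.
def f_Y(y):
    return Fraction(1, 6) if y in range(1, 7) else Fraction(0)

def prob_between(a, b):
    """P(a <= Y <= b) for a discrete Y: sum of f_Y(y) over y = a..b."""
    return sum((f_Y(y) for y in range(a, b + 1)), Fraction(0))

def F_Y(y):
    """Cumulative probability P(Y <= y): sum of f_Y(u) over u <= y."""
    return sum((f_Y(u) for u in range(1, y + 1)), Fraction(0))

p_2_to_4 = prob_between(2, 4)  # three outcomes out of six
cdf_at_4 = F_Y(4)              # four outcomes out of six
print(p_2_to_4, cdf_at_4)
```

For a continuous variable the sums would be replaced by integrals of the probability density.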

Expected Values

• The expected value of Y is its mean E(Y):

$$E(Y) = \sum_{y} y\, f_Y(y) \quad \text{or} \quad E(Y) = \int y\, f_Y(y)\,dy$$

• The expected value of a function of Y, say g(Y), is:

$$E(g(Y)) = \sum_{y} g(y)\, f_Y(y) \quad \text{or} \quad E(g(Y)) = \int g(y)\, f_Y(y)\,dy$$

• Variance is the expected squared deviation from the mean:

$$\mathrm{VAR}(Y) = E\left([Y - E(Y)]^2\right)$$

• The square root of this is the standard deviation ($\sigma_Y$).
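The definitions of mean, expectation of a function and variance can be checked numerically for a discrete variable. The sketch below again assumes a fair die; the helper `expectation` is a name introduced here, not one from the lecture.

```python
from fractions import Fraction
import math

# Probability function of a fair die, stored as {value: probability}.
f_Y = {y: Fraction(1, 6) for y in range(1, 7)}

def expectation(g, pmf):
    """E(g(Y)) = sum of g(y) * f_Y(y) over all possible values y."""
    return sum(g(y) * p for y, p in pmf.items())

mean = expectation(lambda y: y, f_Y)                    # E(Y)
variance = expectation(lambda y: (y - mean) ** 2, f_Y)  # E([Y - E(Y)]^2)
std_dev = math.sqrt(variance)                           # sigma_Y

print(mean, variance, round(std_dev, 3))
```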

Joint Probability

• Can generalise to situations where there is more than one random variable.

• Joint probability distribution (or density): $f_{XY}(x, y)$

• Covariance: $\mathrm{COV}(X, Y) = E\left((X - E(X))(Y - E(Y))\right)$

• Correlation: $\rho_{X,Y} = \mathrm{COV}(X, Y) / (\sigma_X \sigma_Y)$

• Independence: neither variable affects the other. The joint probability is the product of the individual probabilities: $f_{XY}(x, y) = f_X(x)\, f_Y(y)$
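To make covariance, correlation and independence concrete, the sketch below computes them from two small joint probability tables. Both tables are made-up examples: one for two independent fair coins coded 0/1, and one where X and Y always take the same value.

```python
from fractions import Fraction
import math

def marginals(joint):
    """Marginal distributions f_X and f_Y from a joint table {(x, y): p}."""
    fx, fy = {}, {}
    for (x, y), p in joint.items():
        fx[x] = fx.get(x, 0) + p
        fy[y] = fy.get(y, 0) + p
    return fx, fy

def covariance(joint):
    """COV(X, Y) = E((X - E(X)) * (Y - E(Y)))."""
    ex = sum(x * p for (x, _), p in joint.items())
    ey = sum(y * p for (_, y), p in joint.items())
    return sum((x - ex) * (y - ey) * p for (x, y), p in joint.items())

def correlation(joint):
    """COV(X, Y) divided by the product of the two standard deviations."""
    ex = sum(x * p for (x, _), p in joint.items())
    ey = sum(y * p for (_, y), p in joint.items())
    var_x = sum((x - ex) ** 2 * p for (x, _), p in joint.items())
    var_y = sum((y - ey) ** 2 * p for (_, y), p in joint.items())
    return float(covariance(joint)) / math.sqrt(var_x * var_y)

def is_independent(joint):
    """True if f_XY(x, y) = f_X(x) * f_Y(y) for every pair (x, y)."""
    fx, fy = marginals(joint)
    return all(joint.get((x, y), 0) == fx[x] * fy[y] for x in fx for y in fy)

independent = {(x, y): Fraction(1, 4) for x in (0, 1) for y in (0, 1)}
dependent = {(0, 0): Fraction(1, 2), (1, 1): Fraction(1, 2)}

print(covariance(independent), is_independent(independent))
print(covariance(dependent), correlation(dependent), is_independent(dependent))
```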

Statistical Models

• A statistical model specifies the probability distribution for the phenomenon being modelled.

• If modelling ozone levels in a region R we would have a probability distribution for each location s (where s is a 2x1 vector holding an x,y coordinate pair). Individual points can be referred to as $s_1$, $s_2$ etc.

• The complete set of random variables may be referred to as a spatial stochastic process.

• The probability distribution for near points will probably be more similar than for distant points, so our random variables will probably not be independent.

Specifying Models

• To specify a model we need to specify its probability distribution. For the ozone model we would need to specify the joint distribution of every possible subset of random variables.

• For a fair die: $f_Y(y) = 1/6$ for each of the six faces.

• For more complex models (e.g. ozone) we can use observed data: $(y_1, y_2, \ldots)$

• These data are a realisation, i.e. one outcome from the joint probability distribution of $\{Y_1, Y_2, \ldots\}$

• One set of data does not get us very far. Even with more data observations we must make reasonable assumptions, based either on theory or prior observations.

Specifying Models(2)

• Assumptions may be expressed in general terms (e.g. a Normal distribution, a regression model) with unspecified parameters.

• The model can be fitted using observed data to estimate the parameters.

• After evaluating the model we may decide to change its general form.

A Regression Model

• To illustrate, to model our ozone data we might make the following assumptions:

– The random variables $\{Y(s), s \in R\}$ are independent;

– They have the same distribution, but different means;

– Their means are a simple linear function of location, say $E(Y(s)) = \beta_0 + \beta_1 s_1 + \beta_2 s_2$ (here $s_1$ and $s_2$ denote the two coordinates of the location s);

– Each $Y(s)$ has a normal distribution about this mean with the same variance $\sigma^2$.

• These assumptions would enable us to estimate the parameters from the available data.

Maximum Likelihood

• The most frequently used estimation method is maximum likelihood.

• We can write down the general form of the joint probability distribution $f(y_1, y_2, \ldots, y_n; \theta)$, where $\theta$ is a vector of parameters, e.g. $(\beta_0, \beta_1, \beta_2, \sigma^2)$ in our regression model.

• Given that we have actual values for $y_1, \ldots, y_n$, this joint probability distribution, viewed as a function of $\theta$, gives the probability (or probability density) of getting these actual values. This is referred to as the likelihood and would usually be denoted $L(y_1, y_2, \ldots, y_n; \theta)$.

• Our objective is to identify the parameter values $\theta$ that maximise L. In practice we usually maximise the logarithm of L (the log likelihood), denoted $l(y_1, y_2, \ldots, y_n; \theta)$.
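A minimal sketch of the idea, under simplifying assumptions that are not part of the lecture: the observations are independent draws from a single normal distribution, the standard deviation is treated as known, and the only unknown parameter is the mean. The data values are invented, and a crude grid search stands in for a proper optimiser.

```python
import math

def normal_log_likelihood(data, mu, sigma):
    """Log likelihood l(y_1..y_n; mu) for independent Normal(mu, sigma^2) data."""
    n = len(data)
    return (-n * math.log(sigma * math.sqrt(2 * math.pi))
            - sum((y - mu) ** 2 for y in data) / (2 * sigma ** 2))

data = [41.0, 38.5, 44.2, 40.1, 39.7]  # hypothetical ozone readings
sigma = 2.0                            # assumed known, for illustration only

# Evaluate the log likelihood on a grid of candidate means (35.00 to 45.00).
candidates = [35 + 0.01 * i for i in range(1001)]
best_mu = max(candidates, key=lambda mu: normal_log_likelihood(data, mu, sigma))

sample_mean = sum(data) / len(data)
print(round(best_mu, 2), round(sample_mean, 2))
```

Under these assumptions the maximising value coincides, up to the grid spacing, with the sample mean.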

Parameter Estimation

• This is the basic approach, but the actual estimation may be complicated.

• Parameter estimation of our multiple linear regression involving assumptions of independence, normal distributions and equal variance reduces to using the method of ordinary least squares.

• Relaxing the independence and equal variance assumptions, we can still use generalised least squares.

• Standard errors provide a measure of the reliability of each parameter estimate.

• Likelihood ratios can be used to compare alternative models.
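The ordinary least squares case can be sketched for the linear trend in location used in the ozone example, with mean $\beta_0 + \beta_1 s_1 + \beta_2 s_2$. The code solves the normal equations directly using only the standard library; the locations and the noise-free responses are invented test values, and a real analysis would normally use a statistics package instead.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def fit_trend_surface(locations, y):
    """OLS estimates (b0, b1, b2) for E(Y(s)) = b0 + b1*s1 + b2*s2."""
    X = [[1.0, s1, s2] for s1, s2 in locations]
    p = 3
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(p)] for i in range(p)]
    xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(p)]
    return solve_linear_system(xtx, xty)  # normal equations (X'X) b = X'y

# Hypothetical sample locations and noise-free values from a known trend.
locations = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 1), (1, 2)]
y = [10 + 2 * s1 - 3 * s2 for s1, s2 in locations]

beta = fit_trend_surface(locations, y)
print([round(b, 6) for b in beta])
```

Because the invented data contain no noise, the fit recovers the coefficients used to generate them.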

Hypothesis Testing

• Hypothesis testing entails comparing the fit of two models, one of which incorporates assumptions which reflect the hypothesis, the other incorporating a less specific set of assumptions.

• All modelling inevitably involves some assumptions about the phenomenon under study; hence hypothesis testing will always involve comparison of the fit of a hypothesised model with that of an alternative which also incorporates assumptions, albeit of a more general nature.

Spatial Data Modelling

• Spatial data often exhibit spatial correlation (or autocorrelation). Assumptions of independence may therefore be unrealistic.

• Can make a distinction between:

– First order effects: variation in the mean due to global trend;

– Second order effects: caused by spatial correlation.

• Can illustrate using analogy of iron filings and magnets.

• Real-world patterns are often an outcome of a mix of first and second order effects.

Spatial Data Modelling(2)

• To allow for second order effects, spatial models may need to assume a covariance structure.

• The second order effects may be modelled as a stationary spatial process – i.e.

– Its statistical properties (mean, variance) are independent of absolute location;

– Covariance depends only on relative location.

• A process is said to be isotropic if it is stationary, and covariance depends only on distance and not direction.

• If the mean, variance or covariance ‘drifts’ over the study area, then the process exhibits non-stationarity or heterogeneity.

Spatial Data Modelling(3)

• Heterogeneity in the mean, combined with stationarity in second order effects, is a useful spatial modelling assumption.

• The modelling of a spatial process often tends to proceed by first identifying any heterogeneous 'trend' in mean value and then modelling the 'residuals', or deviations from this 'trend', as a stationary process.

Geographically Weighted Regression

• Covariates are often incorporated in a multiple regression model taking the general form:

$$y_i = \beta_0 + \sum_k \beta_k x_{ik} + \varepsilon_i$$

• The model assumes the coefficients are homogeneous or stationary.

• Fotheringham et al. proposed an alternative model:

$$y_i = \beta_0(u_i, v_i) + \sum_k \beta_k(u_i, v_i)\, x_{ik} + \varepsilon_i$$

• To allow the model to be fitted, it is assumed the parameters are non-stationary but are a function of location.

• Parameters can be mapped.
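One common way to let coefficients vary with location is to fit a separate weighted regression at each point, giving nearby observations more weight. The sketch below assumes a single covariate, a Gaussian distance-decay kernel and a fixed bandwidth; these are illustrative choices and the function names are hypothetical, so this should not be read as the exact procedure of Fotheringham et al.

```python
import math

def gaussian_weight(distance, bandwidth):
    """Assumed kernel: weight decays smoothly with distance from the fit point."""
    return math.exp(-0.5 * (distance / bandwidth) ** 2)

def local_regression(u, v, coords, x, y, bandwidth):
    """Weighted least squares fit of y = b0 + b1*x centred on location (u, v)."""
    w = [gaussian_weight(math.hypot(ui - u, vi - v), bandwidth) for ui, vi in coords]
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    b1 = sxy / sxx
    b0 = ybar - b1 * xbar
    return b0, b1

# Invented data: the slope is 1 in the west (u = 0) and 3 in the east (u = 10).
coords = [(0, 0), (0, 1), (0, 2), (0, 3), (10, 0), (10, 1), (10, 2), (10, 3)]
x = [1.0, 2.0, 3.0, 4.0, 1.0, 2.0, 3.0, 4.0]
y = [1.0, 2.0, 3.0, 4.0, 3.0, 6.0, 9.0, 12.0]

b0_west, b1_west = local_regression(0, 1.5, coords, x, y, bandwidth=1.0)
b0_east, b1_east = local_regression(10, 1.5, coords, x, y, bandwidth=1.0)
print(round(b1_west, 3), round(b1_east, 3))
```

Repeating the local fit over a grid of (u, v) locations yields a surface of coefficient values, which is what allows the parameters to be mapped.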

Point Pattern Techniques

• Bailey and Gatrell discuss various techniques, organised by data type.

• Point pattern techniques include:

– Quadrat analysis

– Kernel estimation

– Nearest neighbour analysis

– K-functions

• These are normally used to test the null hypothesis of complete spatial randomness (i.e. a homogeneous Poisson process), but can also examine heterogeneous Poisson processes.
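As one example of nearest neighbour analysis, the sketch below computes the ratio of the observed mean nearest-neighbour distance to the value expected under complete spatial randomness, $1/(2\sqrt{\lambda})$ with $\lambda$ the point density (the Clark and Evans index). No edge correction is applied and no significance test is carried out; the point sets are invented.

```python
import math

def nearest_neighbour_index(points, area):
    """Observed mean nearest-neighbour distance divided by its CSR expectation."""
    n = len(points)
    nn_distances = []
    for i, (xi, yi) in enumerate(points):
        nn_distances.append(min(math.hypot(xi - xj, yi - yj)
                                for j, (xj, yj) in enumerate(points) if j != i))
    observed = sum(nn_distances) / n
    expected = 0.5 / math.sqrt(n / area)  # 1 / (2 * sqrt(density)) under CSR
    return observed / expected

# A regular 4 x 4 lattice and a tight cluster, both in a 4 x 4 study area.
grid = [(x + 0.5, y + 0.5) for x in range(4) for y in range(4)]
cluster = [(2 + 0.01 * i, 2 + 0.01 * j) for i in range(4) for j in range(4)]

r_grid = nearest_neighbour_index(grid, area=16.0)
r_cluster = nearest_neighbour_index(cluster, area=16.0)
print(round(r_grid, 3), round(r_cluster, 3))
```

Values well above 1 suggest a regular pattern, values well below 1 suggest clustering, and values near 1 are consistent with randomness.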

Spatially Continuous Data

• Techniques used to explore field data.

• Sometimes referred to as geostatistics.

– Spatial moving averages

– Trend surface analysis

– Delaunay triangulation / Thiessen polygons / TINs

– Kernel estimation (for the values at sample points)

– Variograms / covariograms / kriging

– Principal components analysis / factor analysis

– Procrustes analysis

– Cluster analysis

– Canonical correlation
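To illustrate the variogram item in this list, the sketch below computes an empirical semivariogram value $\hat{\gamma}(h) = \frac{1}{2N(h)} \sum (z_i - z_j)^2$ over the $N(h)$ pairs of sample points roughly a distance h apart. The `tolerance` parameter and the transect data are assumptions made for the example; direction is ignored.

```python
import math

def empirical_semivariogram(coords, values, lag, tolerance):
    """Half the mean squared difference over pairs about `lag` apart."""
    total, count = 0.0, 0
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            d = math.dist(coords[i], coords[j])
            if abs(d - lag) <= tolerance:
                total += (values[i] - values[j]) ** 2
                count += 1
    return total / (2 * count) if count else float("nan")

# Invented samples along a transect with a steadily increasing value.
coords = [(0, 0), (1, 0), (2, 0), (3, 0)]
values = [1.0, 2.0, 3.0, 4.0]

gamma_1 = empirical_semivariogram(coords, values, lag=1.0, tolerance=0.1)
gamma_2 = empirical_semivariogram(coords, values, lag=2.0, tolerance=0.1)
print(gamma_1, gamma_2)
```

Semivariance rising with lag, as here, indicates that nearby values are more alike than distant ones.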

Area Data

• Techniques for analysing areal data (i.e. polygon attributes) include:

– Spatial moving averages

– Kernel estimation

– Spatial autocorrelation (Moran’s I, Geary’s c)

– Spatial correlation and regression

• Generalised linear models provide a family of techniques for dealing with special types of data: e.g. counts (Poisson regression), proportions (logistic regression).

• Bayesian techniques are often used to model rates based on small numbers.
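Of the areal techniques listed, Moran's I is simple enough to sketch directly: $I = \frac{n}{S_0} \cdot \frac{\sum_i \sum_j w_{ij} z_i z_j}{\sum_i z_i^2}$, where $z_i$ are deviations from the mean and $S_0$ is the sum of all weights. The example assumes binary adjacency weights for four areas arranged in a row; both the weights and the attribute values are invented.

```python
def morans_i(values, weights):
    """Moran's I for attribute values and a spatial weights matrix."""
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)
    numerator = sum(weights[i][j] * z[i] * z[j] for i in range(n) for j in range(n))
    denominator = sum(zi ** 2 for zi in z)
    return (n / s0) * numerator / denominator

# Four areas in a row; each is adjacent only to its immediate neighbours.
weights = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]

i_trend = morans_i([1.0, 2.0, 3.0, 4.0], weights)        # similar neighbours
i_alternating = morans_i([1.0, 4.0, 1.0, 4.0], weights)  # dissimilar neighbours
print(round(i_trend, 4), round(i_alternating, 4))
```

Positive values indicate that neighbouring areas tend to have similar values; negative values indicate that neighbours tend to differ.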

Spatial Interaction Data

• Techniques for modelling spatial interactions are mostly based on some variant of the gravity model.

• This postulates that the amount of interaction between two places is a function of their sizes (measured using an appropriate metric) and is inversely related to the distance between them.
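One simple variant can be sketched as follows, assuming a multiplicative form with a power-law distance decay, $T_{ij} = k\, P_i P_j / d_{ij}^{\beta}$. The constant `k`, the exponent `beta` and the population figures are hypothetical; in practice these parameters would be estimated from observed flows.

```python
def gravity_interaction(size_i, size_j, distance, k=1.0, beta=2.0):
    """Predicted interaction between two places under a simple gravity model."""
    return k * size_i * size_j / distance ** beta

# Two places with (hypothetical) sizes 100 and 200, 10 distance units apart.
flow_near = gravity_interaction(100, 200, 10)
# The same places twice as far apart: with beta = 2 the flow falls to a quarter.
flow_far = gravity_interaction(100, 200, 20)
print(flow_near, flow_far)
```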

Software

• ArcGIS. The Geostatistical Analyst extension is a step forward.

• Idrisi. The GIS Analysis | Statistics menu has a lot of options.

• S-Plus. The S+SpatialStats add-on provides a lot of options.

• R. R is an open-source implementation of the S language, on which S-Plus is also based. There are a number of projects currently developing tools for spatial statistics (e.g. sp, spatstat, DCluster, spgwr).

• BUGS. Software for Bayesian statistics. There is a free version for Windows (WinBUGS). It includes a spatial component called GeoBUGS.
