Review for Final Exam
GEOG 090 – Quantitative Methods in Geography
Spring 2006
• The Scientific Method
– Exploratory methods (descriptive statistics)
– Confirmatory methods (inferential statistics)
• Mathematical Notation
– Summation notation
– Pi notation
– Factorial notation
– Combinations
Summation Notation: Components
Σ_{i=1}^{n} x_i

• n (above the Σ) refers to where the sum of terms ends
• i = 1 (below the Σ) refers to where the sum of terms begins
• Σ indicates we are taking a sum
• x_i indicates what we are summing up
Summation Notation: Compound Sums
• We frequently use tabular data (or data drawn from
matrices), with which we can construct sums of
both the rows and the columns (compound sums),
using subscript i to denote the row index and the
subscript j to denote the column index:
            Columns
Rows    x11   x12   x13
        x21   x22   x23

Σ_{i=1}^{2} Σ_{j=1}^{3} x_ij = (x11 + x12 + x13 + x21 + x22 + x23)
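A minimal sketch of this compound sum in Python, using a hypothetical 2 × 3 table of values:

```python
# Compound sum over a 2-row, 3-column table: sum over i = 1..2 and j = 1..3 of x_ij
x = [[1, 2, 3],
     [4, 5, 6]]  # x[i][j]: row i, column j (hypothetical values)

total = sum(x[i][j] for i in range(2) for j in range(3))
print(total)  # 21, the same as adding all six entries
```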
Pi Notation
• Whereas the summation notation refers to the
addition of terms, the product notation applies to
the multiplication of terms
• It is denoted by the capital Greek letter Π (pi), and is used in the same way as the summation notation
Π_{i=1}^{n} x_i = x_1 · x_2 · … · x_n

Π_{i=1}^{n} (x_i + y_i) = (x_1 + y_1)(x_2 + y_2) … (x_n + y_n)
Factorial
• The factorial of a positive integer, n, is equal to
the product of the first n integers
• Factorials can be denoted by an exclamation
point
n! = Π_{i=1}^{n} i

5! = 5 × 4 × 3 × 2 × 1 = 120 = Π_{i=1}^{5} i
• There is also a convention that 0! = 1
• Factorials are not defined for negative integers
or nonintegers
Combinations
• Combinations refer to the number of possible
outcomes that particular probability experiments
may have
• Specifically, the number of ways that r items may
be chosen from a group of n items is denoted by:
(n choose r) = n! / [r!(n − r)!]

or

C(n, r) = n! / [r!(n − r)!]
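A sketch of both formulas using Python's standard library (math.comb requires Python 3.8 or later):

```python
import math

print(math.factorial(5))   # 120 = 5 * 4 * 3 * 2 * 1
print(math.factorial(0))   # 1, by convention

# Ways to choose r = 2 items from n = 5: n! / (r!(n - r)!)
n, r = 5, 2
print(math.comb(n, r))                                                    # 10
print(math.factorial(n) // (math.factorial(r) * math.factorial(n - r)))  # 10
```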
Scales of Measurement
• The data used in statistical analyses can be divided into four types:
  1. The Nominal Scale
  2. The Ordinal Scale
  3. The Interval Scale
  4. The Ratio Scale
• As we progress through these scales, the types of data they describe have increasing information content
Which one is better: mean, median,
or mode?
• The mean is valid only for interval data or ratio
data.
• The median can be determined for ordinal data as
well as interval and ratio data.
• The mode can be used with nominal, ordinal,
interval, and ratio data
• The mode is the only measure of central tendency that can be used with nominal data
Which one is better: mean, median,
or mode?
• It also depends on the nature of the distribution
[Figure: example distributions: multi-modal, unimodal skewed, and unimodal symmetric]
Which one is better: mean, median,
or mode?
• It also depends on your goals
• Consider a company that has nine employees with salaries of $35,000 a year, and their supervisor makes $150,000 a year
• What if you are a recruiting officer for the company who wants to make a good impression on a prospective employee?
• The mean is (35,000 × 9 + 150,000) / 10 = $46,500, so using the mean I would probably say: "The average salary in our company is $46,500"
Source: http://www.shodor.org/interactivate/discussions/sd1.html
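The same comparison in code, using Python's statistics module on the salary figures above:

```python
from statistics import mean, median

salaries = [35_000] * 9 + [150_000]  # nine employees plus the supervisor
print(mean(salaries))    # 46500: the figure a recruiter might quote
print(median(salaries))  # 35000.0: closer to what most employees actually earn
```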
Measures of Dispersion
• Range
• Variance
• Standard deviation
• Interquartile range
• z-score
• Coefficient of variation
Further Moments of the Distribution
• There are further statistics that describe the
shape of the distribution, using formulae that are
similar to those of the mean and variance
• 1st moment - Mean (describes central value)
• 2nd moment - Variance (describes dispersion)
• 3rd moment - Skewness (describes asymmetry)
• 4th moment - Kurtosis (describes peakedness)
How to Graphically Summarize Data?
• Histograms
• Box plots
Functions of a Histogram
• The function of a histogram is to graphically
summarize the distribution of a data set
• The histogram graphically shows the following:
  1. Center (i.e., the location) of the data
  2. Spread (i.e., the scale) of the data
  3. Skewness of the data
  4. Kurtosis of the data
  5. Presence of outliers
  6. Presence of multiple modes in the data
Box Plots
• We can also use a box plot to graphically
summarize a data set
• A box plot represents a graphical summary of
what is sometimes called a “five-number
summary” of the distribution
– Minimum
– Maximum
– 25th percentile
– 75th percentile
– Median
• Interquartile Range (IQR)
[Figure: box plot labeled with minimum, 25th percentile, median, 75th percentile, and maximum (Rogerson, p. 8)]
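A sketch of the five-number summary and IQR for a hypothetical sample, using NumPy percentiles:

```python
import numpy as np

data = np.array([2, 4, 4, 5, 7, 8, 9, 12, 15, 21])  # hypothetical sample

q1, med, q3 = np.percentile(data, [25, 50, 75])
iqr = q3 - q1  # interquartile range
print(data.min(), q1, med, q3, data.max())  # the five-number summary
print("IQR:", iqr)
```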
Probability-Related Concepts
• An event – Any phenomenon you can observe that
can have more than one outcome (e.g., flipping a
coin)
• An outcome – Any unique condition that can be the result of an event (e.g., flipping a coin: heads or tails), a.k.a. a simple event or sample point
• Sample space – The set of all possible outcomes
associated with an event
– e.g., flip a coin – heads (H) and tails (T)
– e.g., flip a coin twice – HH, HT, TH, TT
Probability-Related Concepts
• Associated with each possible outcome in a
sample space is a probability
• Probability is a measure of the likelihood of
each possible outcome
• Probability measures the degree of uncertainty
• Each of the probabilities is greater than or equal
to zero, and less than or equal to one
• The sum of probabilities over the sample space
is equal to one
How To Assign Probabilities
to Experimental Outcomes?
• There are numerous ways to assign probabilities
to the elements of sample spaces
• Classical method assigns probabilities based on
the assumption of equally likely outcomes
• Relative frequency method assigns probabilities
based on experimentation or historical data
• Subjective method assigns probabilities based on
the assignor’s judgment or belief
Probability Rules
• Rules for combining multiple probabilities
• A useful aid is the Venn diagram - depicts multiple
probabilities and their relations using a graphical
depiction of sets
• The rectangle that forms the area of
the Venn Diagram represents the
sample (or probability) space, which
we have defined above
• Figures that appear within the
sample space are sets that represent
events in the probability context, &
their area is proportional to their
probability (full sample space = 1)
[Figure: Venn diagram showing two event sets A and B within the sample space]
Probability Mass Function
• Example: number of malls in cities

  xi    p(X = xi)
  1     1/6 = 0.167
  2     1/6 = 0.167
  3     1/6 = 0.167
  4     3/6 = 0.5

[Figure: probability mass function, p(xi) plotted against xi = 1, 2, 3, 4]

• This plot uses thin lines to denote that the probabilities are massed at discrete values of this random variable
Continuous Probability Distributions
[Figure: density curve f(x) over x, with the area between a and b shaded]
• The probability of a continuous random variable X within an arbitrary interval is given by:

  p(a ≤ X ≤ b) = ∫_a^b f(x) dx

• Simply calculate the shaded area → if we know the density function, we can use calculus
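A sketch that evaluates this integral numerically with SciPy, using the standard normal density as an arbitrary example of f(x):

```python
from scipy import integrate
from scipy.stats import norm

# P(a <= X <= b) as the area under the density between a and b
a, b = -1.0, 1.0
area, _ = integrate.quad(norm.pdf, a, b)  # numerical integration of f(x)
print(round(area, 4))             # about 0.6827
print(norm.cdf(b) - norm.cdf(a))  # the same probability via the CDF
```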
Discrete Probability Distributions
• Discrete probability distributions
– The Uniform Distribution
– The Binomial Distribution
– The Poisson Distribution
• Each is appropriately applied in certain
situations and to particular phenomena
Source: http://en.wikipedia.org/wiki/Uniform_distribution_(discrete)
f(x) = 1/(b − a + 1) = 1/n,   for a ≤ x ≤ b
       0,                     otherwise

F(x) = P(X ≤ x) = 0,                         for x < a
                  (x − a + 1)/(b − a + 1),   for a ≤ x ≤ b
                  1,                         for x > b
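A direct transcription of these two formulas, checked on the fair six-sided die case (a = 1, b = 6):

```python
def uniform_pmf(x, a, b):
    """f(x) = 1/(b - a + 1) for integer a <= x <= b, else 0."""
    return 1 / (b - a + 1) if a <= x <= b else 0.0

def uniform_cdf(x, a, b):
    """F(x) = P(X <= x): 0 below a, (x - a + 1)/(b - a + 1) inside, 1 above b."""
    if x < a:
        return 0.0
    if x > b:
        return 1.0
    return (x - a + 1) / (b - a + 1)

print(uniform_pmf(3, 1, 6))  # 1/6, about 0.1667
print(uniform_cdf(3, 1, 6))  # 3/6 = 0.5
```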
The Uniform Distribution
• Example – Predict the direction of the prevailing
wind with no prior knowledge of the weather
system’s tendencies in the area
• We would have to begin with the idea that:

  P(xNorth) = 1/4    P(xEast) = 1/4    P(xSouth) = 1/4    P(xWest) = 1/4

[Figure: uniform probability mass function, P(xi) = 0.25 for each of N, E, S, W]

• Until we had an opportunity to sample and find out some tendency in the wind pattern based on those observations
The Binomial Distribution – Example
• Naturally, we can plot the probability mass
function produced by this binomial distribution:
  xi    P(xi)
  0     0.4096
  1     0.4096
  2     0.1536
  3     0.0256
  4     0.0016

[Figure: probability mass function P(xi) plotted against xi = 0, 1, 2, 3, 4]
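The slide does not restate n and p, but the tabulated probabilities are exactly those of n = 4 trials with success probability p = 0.2, so those values are assumed in this sketch:

```python
from math import comb

def binom_pmf(x, n, p):
    """P(X = x) = C(n, x) * p**x * (1 - p)**(n - x)"""
    return comb(n, x) * p**x * (1 - p)**(n - x)

# n = 4, p = 0.2 inferred from the table above
for x in range(5):
    print(x, round(binom_pmf(x, 4, 0.2), 4))
# 0 0.4096, 1 0.4096, 2 0.1536, 3 0.0256, 4 0.0016
```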
The Poisson Distribution
• Poisson distribution
P(x) = e^(−λ) · λ^x / x!
• The shape of the distribution depends strongly upon the value of λ: as λ increases, the distribution becomes less skewed, eventually approaching a normal-shaped distribution as λ gets quite large
• We can evaluate P(x) for any value of x, but large
values of x will have very small values of P(x)
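A direct transcription of the formula, with a hypothetical λ = 2:

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    """P(x) = e**(-lambda) * lambda**x / x!"""
    return exp(-lam) * lam**x / factorial(x)

# P(x) shrinks quickly for large x, as the slide notes
for x in range(6):
    print(x, round(poisson_pmf(x, 2.0), 4))
```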
Source: http://en.wikipedia.org/wiki/Normal_distribution
Finding the P(x) for Various Intervals

1. P(Z ≥ a) = (table value)
   • Table gives the value of P(x) in the tail above a
2. P(Z ≤ a) = [1 − (table value)]
   • Total area under the curve = 1, and we subtract the area of the tail
3. P(0 ≤ Z ≤ a) = [0.5 − (table value)]
   • Total area under the curve = 1, thus the area above 0 is equal to 0.5, and we subtract the area of the tail
4. P(Z ≤ a) = (table value), for a < 0
   • Table gives the value of P(x) in the tail below a; by symmetry, equivalent to P(Z ≥ |a|)
5. P(Z ≥ a) = [1 − (table value)], for a < 0
   • This is equivalent to P(Z ≤ |a|)
6. P(a ≤ Z ≤ 0) = [0.5 − (table value)], for a < 0
   • This is equivalent to P(0 ≤ Z ≤ |a|)
7. P(a ≤ Z ≤ b), if a < 0 and b > 0
   = (0.5 − P(Z < a)) + (0.5 − P(Z > b))
   = 1 − P(Z < a) − P(Z > b)
   or
   = [0.5 − (table value for a)] + [0.5 − (table value for b)]
   = 1 − [(table value for a) + (table value for b)]
• With this set of building blocks, you should be able to
calculate the probability for any interval using a standard
normal table
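Where a printed z-table gives one tail at a time, SciPy's standard normal CDF can check any of the building blocks above; the bounds a and b here are hypothetical:

```python
from scipy.stats import norm

a, b = -1.5, 2.0  # hypothetical bounds with a < 0 < b

print(1 - norm.cdf(b))                      # 1. P(Z >= b): upper tail
print(norm.cdf(a))                          # 4. P(Z <= a): lower tail, a < 0
print(0.5 - (1 - norm.cdf(b)))              # 3. P(0 <= Z <= b)
print(1 - norm.cdf(a) - (1 - norm.cdf(b)))  # 7. P(a <= Z <= b)
print(norm.cdf(b) - norm.cdf(a))            # same interval, directly from the CDF
```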
The Central Limit Theorem
• Suppose we draw a random sample of size n (x1, x2, x3, …, xn−1, xn) from a population random variable that is distributed with mean μ and standard deviation σ
• Do this repeatedly, drawing many samples from the population, and then calculate the mean x̄ of each sample
• We will treat the x̄ values as another distribution, which we will call the sampling distribution of the mean (X̄)
The Central Limit Theorem
• Given a distribution with a mean μ and variance σ2, the
sampling distribution of the mean approaches a
normal distribution with a mean (μ) and a variance
σ2/n as n, the sample size, increases
• The amazing and counter-intuitive thing about the central limit theorem is that no matter what the shape of the original (parent) distribution, the sampling distribution of the mean approaches a normal distribution
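A simulation sketch of the theorem: sample means drawn from a strongly skewed (exponential) parent still land close to the stated mean and variance:

```python
import numpy as np

rng = np.random.default_rng(0)

# Parent distribution: exponential with mu = 1, sigma**2 = 1 (far from normal)
n, reps = 30, 10_000
sample_means = rng.exponential(scale=1.0, size=(reps, n)).mean(axis=1)

print(sample_means.mean())  # close to the parent mean, mu = 1
print(sample_means.var())   # close to sigma**2 / n = 1/30, about 0.033
```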
Confidence Intervals for the Mean
• More generally, a (1 − α) × 100% confidence interval around the sample mean is:

  pr( x̄ − z_α · σ/√n  ≤  μ  ≤  x̄ + z_α · σ/√n ) = 1 − α

  (σ/√n is the standard error; z_α · σ/√n is the margin of error)
• Where zα is the value taken from the z-table that
is associated with a fraction α of the weight in the
tails (and therefore α/2 is the area in each tail)
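A sketch of the computation with hypothetical x̄, σ, and n; scipy's norm.ppf supplies the z value that a printed table otherwise would:

```python
import numpy as np
from scipy.stats import norm

x_bar, sigma, n = 6.5, 0.8, 40   # hypothetical sample mean, sigma, and n
alpha = 0.05                     # for a 95% confidence interval

z = norm.ppf(1 - alpha / 2)      # z with alpha/2 in each tail (about 1.96)
margin = z * sigma / np.sqrt(n)  # margin of error = z * standard error
print(x_bar - margin, x_bar + margin)
```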
Hypothesis Testing
• One-sample tests
– One-sample tests for the mean
– One-sample tests for proportions
• Two-sample tests
– Two-sample tests for the mean
Hypothesis Testing
1. State the null hypothesis, H0
2. State the alternative hypothesis, HA
3. Choose α, our significance level
4. Select a statistical test, and find the observed test
statistic
5. Find the critical value of the test statistic
6. Compare the observed test statistic with the critical
value, and decide to accept or reject H0
Hypothesis Testing - Errors
              H0 is true                H0 is false
Accept H0     Correct decision (1 − α)  Type II Error (β)
Reject H0     Type I Error (α)          Correct decision (1 − β)
p-value
• The p-value is the probability of getting a value of the test statistic as extreme as or more extreme than that observed by chance alone, if the null hypothesis, H0, is true
• It is the probability of wrongly rejecting the null hypothesis if it is in fact true
• It is equal to the significance level of the test for which we would only just reject the null hypothesis
p-value
• p-value vs. significance level
• Small p-values → the null hypothesis is unlikely to be true
• The smaller it is, the more convincing is the rejection of
the null hypothesis
One-Sample t-Tests
Data: Acidity data has been collected for a population of ~6,000 lakes in Ontario, with a mean pH of μ = 6.69 and σ = 0.83. A group of 27 lakes in a particular region of Ontario with acidic conditions is sampled and is found to have a mean pH of x̄ = 6.16 and s = 0.60.
Research question: Are the lakes in that particular region
more acidic than the lakes throughout Ontario?
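Using the numbers from this slide, a minimal sketch of the observed test statistic:

```python
from math import sqrt

# Lake pH example: is the regional mean lower than the Ontario-wide mean?
mu0, x_bar, s, n = 6.69, 6.16, 0.60, 27
t = (x_bar - mu0) / (s / sqrt(n))
print(round(t, 2))  # about -4.59, with df = n - 1 = 26
```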
One-Sample Tests for Proportions
• Data: A citywide survey finds that the proportion of
households that own cars is p0 = 0.2. We survey 50
households and find that 16 of them own a car (p = 16/50
= 0.32)
• Research question: Is the proportion of households in our survey that own a car different from the proportion found in the citywide survey?
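Again with the slide's numbers, a sketch of the usual one-sample z statistic for a proportion; the slide names the test but not the formula, so the standard form (p̂ − p0) / √(p0(1 − p0)/n) is assumed:

```python
from math import sqrt

# Car-ownership example from the slide
p0, n, successes = 0.2, 50, 16
p_hat = successes / n             # 0.32
se = sqrt(p0 * (1 - p0) / n)      # standard error under H0
z = (p_hat - p0) / se
print(round(z, 2))                # about 2.12
```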
Two-Sample t-tests
• Variances are equal (homoscedasticity):

  t = |x̄1 − x̄2| / [ sp · √(1/n1 + 1/n2) ]

• Pooled estimate of the standard deviation:

  sp = √[ ((n1 − 1)s1² + (n2 − 1)s2²) / (n1 + n2 − 2) ]

• df = n1 + n2 − 2
Two-Sample t-tests
• Variances are unequal:

  t = |x̄1 − x̄2| / √( s1²/n1 + s2²/n2 )

• df = min[(n1 − 1), (n2 − 1)]
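A sketch covering both cases from summary statistics; all inputs here are hypothetical:

```python
from math import sqrt

def two_sample_t(x1_bar, x2_bar, s1, s2, n1, n2, equal_var=True):
    """t statistic (absolute value) and df for a difference of two means."""
    if equal_var:
        sp = sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))
        t = abs(x1_bar - x2_bar) / (sp * sqrt(1 / n1 + 1 / n2))
        df = n1 + n2 - 2
    else:
        t = abs(x1_bar - x2_bar) / sqrt(s1**2 / n1 + s2**2 / n2)
        df = min(n1 - 1, n2 - 1)
    return t, df

print(two_sample_t(5.2, 4.6, 1.1, 0.9, 20, 25))  # hypothetical inputs
```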
Matched pairs t-tests
(x11, x12, …, x1n)
(x21, x22, …, x2n)
• Paired observations → reduced independence of observations
• Matched pairs t-test
Matched Pairs t-tests
t = |d̄| / ( sd / √n )

sd = √[ Σ (di − d̄)² / (n − 1) ]
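A sketch with hypothetical before/after pairs:

```python
from math import sqrt

def paired_t(x1, x2):
    """Matched-pairs t: differences d_i = x1_i - x2_i, df = n - 1."""
    d = [a - b for a, b in zip(x1, x2)]
    n = len(d)
    d_bar = sum(d) / n
    s_d = sqrt(sum((di - d_bar) ** 2 for di in d) / (n - 1))
    return abs(d_bar) / (s_d / sqrt(n)), n - 1

before = [5.1, 4.8, 6.0, 5.5, 4.9]  # hypothetical paired observations
after = [4.6, 4.9, 5.2, 5.0, 4.4]
print(paired_t(before, after))
```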
ANOVA
• Null hypothesis
  H0: μ1 = μ2 = … = μk
• Alternative hypothesis
  – Not all means are equal
ANOVA Table
Source of Variation   Sum of Squares   Degrees of Freedom   Mean Square (Variance)   F-Test
Between Groups        BSS              k − 1                BSS/(k − 1)              [BSS/(k − 1)] / [WSS/(N − k)]
Within Groups         WSS              N − k                WSS/(N − k)
Total Variation       TSS              N − 1

Critical value: F(α, k−1, N−k)
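A sketch computing BSS, WSS, and the F ratio from the table above, for three hypothetical groups:

```python
import numpy as np

groups = [np.array([4.1, 5.0, 4.6]),  # hypothetical samples, k = 3 groups
          np.array([5.8, 6.1, 5.5]),
          np.array([4.9, 5.2, 5.0])]

all_obs = np.concatenate(groups)
grand_mean = all_obs.mean()
k, N = len(groups), len(all_obs)

bss = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)  # between
wss = sum(((g - g.mean()) ** 2).sum() for g in groups)            # within
f = (bss / (k - 1)) / (wss / (N - k))
print(round(f, 2))  # about 11; compare with the critical F(alpha, k-1, N-k)
```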
Linear Correlation
Source: Earickson, RJ, and Harlin, JM. 1994. Geographic Measurement and
Quantitative Analysis. USA: Macmillan College Publishing Co., p. 209.
Covariance Formulae
Cov[X, Y] = [1 / (n − 1)] Σ_{i=1}^{n} (xi − x̄)(yi − ȳ)
Covariance Example
        TVDI (x)   Soil Moisture (y)   (x − x̄)   (y − ȳ)   (x − x̄)(y − ȳ)
        0.274      0.414               −0.227     0.107     −0.024305
        0.542      0.359                0.042     0.052      0.0021624
        0.419      0.396               −0.082     0.090     −0.007323
        0.286      0.458               −0.215     0.151     −0.032424
        0.374      0.350               −0.127     0.044     −0.005553
        0.489      0.357               −0.011     0.050     −0.000566
        0.623      0.255                0.122    −0.052     −0.006374
        0.506      0.189                0.005    −0.118     −0.000618
        0.768      0.171                0.267    −0.136     −0.036282
        0.725      0.119                0.225    −0.188     −0.042289
Mean    0.501      0.307                          Sum       −0.15357
                                                  Covariance −0.017063
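A sketch that reproduces the table's result from the ten (x, y) pairs:

```python
import numpy as np

tvdi = np.array([0.274, 0.542, 0.419, 0.286, 0.374,
                 0.489, 0.623, 0.506, 0.768, 0.725])
soil = np.array([0.414, 0.359, 0.396, 0.458, 0.350,
                 0.357, 0.255, 0.189, 0.171, 0.119])

n = len(tvdi)
cov = ((tvdi - tvdi.mean()) * (soil - soil.mean())).sum() / (n - 1)
print(round(cov, 6))             # about -0.0170, matching the table up to rounding
print(np.cov(tvdi, soil)[0, 1])  # same result from NumPy
```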
Pearson’s product-moment
correlation coefficient
r = Cov[X, Y] / (sX sY)

r = [ Σ_{i=1}^{n} (xi − x̄)(yi − ȳ) ] / [ (n − 1) sX sY ]

r = [ Σ Zx Zy ] / (n − 1)
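Continuing with the same TVDI and soil-moisture data, a sketch of r computed from the z-score form of the formula and checked against NumPy:

```python
import numpy as np

tvdi = np.array([0.274, 0.542, 0.419, 0.286, 0.374,
                 0.489, 0.623, 0.506, 0.768, 0.725])
soil = np.array([0.414, 0.359, 0.396, 0.458, 0.350,
                 0.357, 0.255, 0.189, 0.171, 0.119])
n = len(tvdi)

# r as standardized covariance: sum of z-score products over (n - 1)
zx = (tvdi - tvdi.mean()) / tvdi.std(ddof=1)
zy = (soil - soil.mean()) / soil.std(ddof=1)
r = (zx * zy).sum() / (n - 1)
print(round(r, 3))                    # about -0.87
print(np.corrcoef(tvdi, soil)[0, 1])  # same value from NumPy
```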
A Significance Test for r
t = r / SEr = r / √[ (1 − r²) / (n − 2) ] = r √(n − 2) / √(1 − r²)

df = n − 2
Spearman’s Rank
Correlation Coefficient
rs = 1 − [ 6 Σ_{i=1}^{n} di² ] / (n³ − n)
A Significance Test for rs
SErs = 1 / √(n − 1)

t = rs / SErs = rs √(n − 1)

df = n − 1
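A sketch with hypothetical observations; scipy.stats.rankdata supplies the ranks, and the t statistic uses the standard error from this slide:

```python
import numpy as np
from scipy.stats import rankdata

x = np.array([3.1, 1.2, 5.4, 2.2, 4.8])  # hypothetical observations
y = np.array([2.0, 1.1, 4.9, 3.0, 4.1])

d = rankdata(x) - rankdata(y)            # rank differences d_i
n = len(x)
rs = 1 - 6 * (d ** 2).sum() / (n ** 3 - n)
print(rs)                                # 0.9 for these values

t = rs * np.sqrt(n - 1)                  # t = rs / SE, with SE = 1 / sqrt(n - 1)
print(t)                                 # compare with t, df = n - 1
```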
Correlation → Regression
• Correlation → direction & strength
• We might wish to go a little further:
  – Rate of change
  – Predictability
Simple Linear Regression
• Model: y = a + bx + e
  – y: dependent variable
  – x: independent variable
  – a: intercept
  – b: slope
  – e: error term

[Figure: scatterplot of y (dependent) against x (independent), with a fitted line of intercept a and slope b]
Fitting a Line to a Set of Points
• Scatterplot → fitting a line
• Least squares method
• Minimize the error term e

[Figure: scatterplot of y (dependent) against x (independent) with a fitted line]
Minimizing the SSE
min over a, b of  Σ_{i=1}^{n} (yi − ŷi)²  =  min over a, b of  Σ_{i=1}^{n} (yi − a − bxi)²
Finding Regression Coefficients
• Least squares method →

  b = [ Σ_{i=1}^{n} (xi − x̄)(yi − ȳ) ] / [ Σ_{i=1}^{n} (xi − x̄)² ]

  a = ȳ − b x̄
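A sketch of the least-squares coefficients on hypothetical data; the last two lines anticipate the r² of the next slide:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])  # hypothetical data
y = np.array([2.1, 2.9, 3.8, 5.2, 5.9])

b = ((x - x.mean()) * (y - y.mean())).sum() / ((x - x.mean()) ** 2).sum()
a = y.mean() - b * x.mean()
print(a, b)                              # intercept about 1.01, slope about 0.99

y_hat = a + b * x                        # fitted values
r2 = ((y_hat - y.mean()) ** 2).sum() / ((y - y.mean()) ** 2).sum()
print(r2)                                # coefficient of determination, about 0.99
```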
Coefficient of Determination (r²)
• Regression sum of squares (SSR):

  SSR = Σ_{i=1}^{n} (ŷi − ȳ)²

• Total sum of squares (SST):

  SST = Σ_{i=1}^{n} (yi − ȳ)²

• Coefficient of determination:

  r² = SSR / SST

[Figure: scatterplot showing yi, ŷi, and ȳ for a fitted regression line]
Regression ANOVA Table
Component          Sum of Squares           df      Mean Square      F
Regression (SSR)   Σ_{i=1}^{n} (ŷi − ȳ)²    1       SSR / 1          MSSR / MSSE
Error (SSE)        Σ_{i=1}^{n} (yi − ŷi)²   n − 2   SSE / (n − 2)
Total (SST)        Σ_{i=1}^{n} (yi − ȳ)²    n − 1
Significance Test for Slope (b)
• H0: b = 0
t = b / sb

where sb is the standard deviation of the slope parameter:

  sb = √[ se² / ((n − 1) sx²) ]

df = n − 2
Significance Test for
Regression Intercept
t = a / sa

where sa is the standard deviation of the intercept:

  sa = √[ (se² Σ xi²) / (n Σ (xi − x̄)²) ]

and degrees of freedom = n − 2
Sampling designs
• Non-probability designs
  – Not concerned with being representative
• Probability designs
  – Aim to be representative of the population
Non-probability Sampling Designs
• Volunteer sampling
- Self-selecting
- Convenient
- Rarely representative
• Quota sampling
- Fulfilling counts of sub-groups
• Convenience sampling
- Availability/accessibility
• Judgmental or purposive sampling
- Preconceived notions
Probability Sampling Designs
• Random sampling
• Systematic sampling
• Stratified sampling
Point Pattern Analysis
[Figure: three point patterns: regular, random, and clustered]
Point Pattern Analysis
1. The Quadrat Method
2. Nearest Neighbor Analysis