Math 141 - Lecture 12: The LLN, CLT, and the Normal Distribution

Albyn Jones
Library 304
jones@reed.edu
www.people.reed.edu/~jones/courses/141
Properties of $\bar{X}_n$
Suppose $X_1, X_2, \ldots, X_n$ are IID random variables, with mean $\mu$ and standard deviation $\sigma$. We know that
$$E(\bar{X}_n) = \mu$$
and
$$SD(\bar{X}_n) = \frac{\sigma}{\sqrt{n}}$$
In other words, typical values for $\bar{X}_n$ are around
$$\mu \pm \sigma/\sqrt{n}$$
or more formally:
$$\bar{X}_n \xrightarrow{P} \mu$$
The Law of Large Numbers
The Law of Large Numbers,
$$\bar{X}_n \xrightarrow{P} \mu,$$
tells us that $\bar{X}_n$ gets as close to $\mu$ as you like as $n \to \infty$, with probability approaching 1.
It does not tell you how close you are for any particular $n$, or how large $n$ must be to guarantee you are as close as you would like to be!
The Central Limit Theorem
Another famous theorem, the Central Limit Theorem, answers that question!
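To see what the Law of Large Numbers does say, here is a minimal simulation sketch. The Normal draws, the values of `mu`, `sigma`, and `eps`, the seed, and the sample sizes are all illustrative assumptions, not taken from the lecture. The estimated probability that $\bar{X}_n$ lands within `eps` of $\mu$ should rise toward 1 as $n$ grows.

```r
# Sketch: estimate P(|Xbar_n - mu| < eps) by simulation for several n.
set.seed(141)            # arbitrary seed, for reproducibility
mu <- 0; sigma <- 1      # illustrative values (assumption)
eps <- 0.1               # how close we want Xbar_n to be to mu (assumption)
reps <- 2000             # simulated samples per value of n
ns <- c(10, 100, 1000)

prob_close <- sapply(ns, function(n) {
  # each replicate draws n IID Normal values and averages them
  xbar <- replicate(reps, mean(rnorm(n, mu, sigma)))
  mean(abs(xbar - mu) < eps)   # fraction of sample means within eps of mu
})
print(data.frame(n = ns, prob_close = prob_close))
```

The simulation shows the probabilities increasing with $n$, but it takes the CLT to predict their values in advance.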
First: The Normal Distribution
The so-called Normal distribution (aka Gaussian, or the
bell-shaped curve) has its origin in approximations to Binomial
probabilities for large n.
Before discussing that approximation, we study the properties
of the Normal distribution.
The Normal Distribution
[Figure: The Standard Normal Density. Horizontal axis: Z, from −3 to 3; vertical axis: density, from 0.0 to 0.4.]
The Normal Distribution, Part 2
The Normal Distribution has several important features:
- it is symmetric and unimodal,
- the mean, median, and mode coincide,
- it is completely characterized by the values of the mean $\mu$ and the standard deviation $\sigma$ or variance $\sigma^2$,
- every Normal distribution has the same shape.
Notation
The standard notation for a random variable $X$ that has a Normal distribution with mean $\mu$ and standard deviation $\sigma$ is
$$X \sim N(\mu, \sigma^2)$$
In other words, list the mean and the variance.
Warning! R functions are parametrized by the mean $\mu$ and the standard deviation $\sigma$, not the variance!
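A short sketch of why the warning matters, using an illustrative $N(50, 10^2)$ variable (the value 60 and the variable names are assumptions chosen for the example). Passing the variance where R expects the SD silently gives a different answer.

```r
# X ~ N(50, 10^2): mean 50, variance 100, so SD 10.
mu <- 50
sigma2 <- 100   # the variance, as written in N(mu, sigma^2)

p_right <- pnorm(60, mean = mu, sd = sqrt(sigma2))  # P(X <= 60), about 0.841
p_wrong <- pnorm(60, mean = mu, sd = sigma2)        # treats 100 as the SD: about 0.540
print(c(right = p_right, wrong = p_wrong))
```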
The Normal Distribution, Part 3
Roughly 95% of any Normal population lies within 2 SD’s of the mean, and about 99.7% lies within 3 SD’s of the mean.
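[Figure: Areas under the Standard Normal Curve. Horizontal axis: Z, from −3 to 3; vertical axis: density, from 0.0 to 0.4. Areas, left to right: 0.023 (below −2), 0.136 (−2 to −1), 0.341 (−1 to 0), 0.341 (0 to 1), 0.136 (1 to 2), 0.023 (above 2).]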
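The areas between whole-SD cut points can be recomputed with `pnorm()`. This is a small sketch; the variable names are illustrative, and the second calculation uses mean 50 and SD 10 as an example of a non-standard Normal.

```r
# Areas under the Standard Normal curve between cut points at whole SD's.
cuts <- c(-Inf, -2, -1, 0, 1, 2, Inf)
areas <- diff(pnorm(cuts))       # area between consecutive cut points
print(round(areas, 3))           # 0.023 0.136 0.341 0.341 0.136 0.023

within2 <- pnorm(2) - pnorm(-2)  # within 2 SD's: about 0.954
within3 <- pnorm(3) - pnorm(-3)  # within 3 SD's: about 0.997

# The same cut points, measured in SD's from the mean, give the same
# areas for a Normal with mean 50 and SD 10.
areas_50 <- diff(pnorm(50 + 10 * cuts, mean = 50, sd = 10))
print(round(areas_50, 3))
```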
The Normal(50, $10^2$) Curve
The corresponding regions for ANY Normal distribution contain
the same proportions of the population!
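[Figure: Areas under the Normal(50,100) Curve. Horizontal axis: Z, from 20 to 80; vertical axis: density, from 0.00 to 0.04. Areas, left to right: 0.023 (below 30), 0.136 (30 to 40), 0.341 (40 to 50), 0.341 (50 to 60), 0.136 (60 to 70), 0.023 (above 70).]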
Linear functions of Normal RV’s are Normal!
Let $Z \sim N(0, 1)$, and let $Y = \sigma Z + \mu$. Then
$$E(Y) = E(\sigma Z + \mu) = \sigma E(Z) + \mu = \mu$$
and
$$SD(Y) = SD(\sigma Z + \mu) = \sigma\, SD(Z) = \sigma$$
Thus:
$$Y \sim N(\mu, \sigma^2)$$
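A quick simulation sketch of this fact, assuming illustrative values $\mu = 50$ and $\sigma = 10$ (the seed and sample size are arbitrary). The rescaled draws should have mean near 50, SD near 10, and probabilities matching `pnorm()` with those parameters.

```r
set.seed(141)                  # arbitrary seed
mu <- 50; sigma <- 10          # illustrative values (assumption)
z <- rnorm(1e5)                # draws from N(0, 1)
y <- sigma * z + mu            # a linear function of Z

print(c(mean = mean(y), sd = sd(y)))   # close to 50 and 10
# The simulated proportion at or below 55 should be close to the N(50, 10^2) value
print(c(simulated = mean(y <= 55), normal = pnorm(55, mu, sigma)))
```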
Standardization
It works the other way too: let $Y \sim N(\mu, \sigma^2)$; then
$$Z = (Y - \mu)/\sigma$$
is a Standard Normal RV:
$$E(Z) = E\left(\frac{Y - \mu}{\sigma}\right) = \frac{1}{\sigma} E(Y - \mu) = \frac{1}{\sigma}\left(E(Y) - \mu\right) = 0$$
and similarly $SD(Z) = SD(Y)/\sigma = 1$.
A standardized RV is often called a Z-score, and represents the number of standard deviations the value is away from the mean.
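As a small illustration, Z-scores can be computed by hand, and probabilities agree on either scale. The values of `mu`, `sigma`, and `y` below are assumptions chosen for the example.

```r
mu <- 50; sigma <- 10        # illustrative values (assumption)
y <- c(30, 45, 55, 70)       # some values of Y
z <- (y - mu) / sigma        # Z-scores: -2.0 -0.5 0.5 2.0

# Probabilities agree whether computed on the Y scale or the Z scale
p_y <- pnorm(y, mu, sigma)
p_z <- pnorm(z)
print(cbind(y, z, p_y, p_z))
```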
Historical Footnote!
Standardizing data was a common practice back before
computers; then you only needed a table of probabilities for the
Standard Normal distribution! Tables are unnecessary now, but
it is still very useful to remember that areas under the Normal
density curve depend only on the mean and SD, and that
Z–scores measure in units of SD’s.
R Functions for Normal Probabilities
pnorm(a, µ, σ) gives P(Y ≤ a); dnorm(a, µ, σ) gives the height of the density curve at a.
[Figure: dnorm() and pnorm(). Horizontal axis: Z, from 20 to 80; vertical axis: density, from 0.00 to 0.04. Annotations: pnorm(55,50,10) = 0.691 and dnorm(55,50,10).]
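The two calls from the figure can be run directly. The upper-tail line is an added illustration using the complement rule, not something shown on the slide.

```r
area_left <- pnorm(55, 50, 10)   # P(Y <= 55), about 0.691
height <- dnorm(55, 50, 10)      # height of the density at 55, about 0.0352
upper <- 1 - pnorm(55, 50, 10)   # P(Y > 55), by the complement rule
print(c(area_left = area_left, height = height, upper = upper))
```

Note that `dnorm()` returns a density height, not a probability; only areas under the curve are probabilities.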
The Normal Density and CDF
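The density function for $X \sim N(\mu, \sigma^2)$ is given by
$$f(x) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$$
The CDF is the area under the curve to the left of the point $x$:
$$P(X \le x) = \int_{-\infty}^{x} \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\frac{(t-\mu)^2}{2\sigma^2}}\, dt$$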
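As a check on the formulas, the density can be coded by hand and integrated numerically. This is a sketch assuming $\mu = 50$, $\sigma = 10$, and the point 55; the function name `f` is illustrative. It uses base R's `integrate()`, which returns a list whose `value` element holds the estimate.

```r
mu <- 50; sigma <- 10   # illustrative values (assumption)

# The Normal density, written out from the formula
f <- function(x) exp(-(x - mu)^2 / (2 * sigma^2)) / sqrt(2 * pi * sigma^2)

print(c(by_hand = f(55), dnorm = dnorm(55, mu, sigma)))   # should agree

# The CDF at 55 as the area under f to the left of 55
cdf_55 <- integrate(f, lower = -Inf, upper = 55)$value
print(c(integral = cdf_55, pnorm = pnorm(55, mu, sigma))) # should agree closely
```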
Cumulative Normal Probabilities: pnorm() and qnorm()
The CDF, computed by pnorm(), is the area under the density curve up to a point; qnorm() is the inverse function of pnorm().
[Figure: The Cumulative Distribution Function. Horizontal axis: Z, from 20 to 80; vertical axis (labeled density), from 0.0 to 1.0. Annotations: pnorm(55,50,10) = 0.691 and qnorm(.691,50,10) = 55.]
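A short sketch of the inverse relationship, using the same mean 50 and SD 10 as the figure (variable names are illustrative).

```r
p <- pnorm(55, 50, 10)              # about 0.691
q <- qnorm(p, 50, 10)               # recovers 55
q_rounded <- qnorm(0.691, 50, 10)   # close to 55; 0.691 is a rounded probability
print(c(p = p, q = q, q_rounded = q_rounded))
```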
Another pnorm example
[Figure: pnorm(1)−pnorm(−1) : .68... Horizontal axis: x, from −3 to 3; vertical axis: density, from 0.0 to 0.4.]
pnorm() and qnorm()
QUIZ!
What is qnorm(pnorm(0))?
What is pnorm(qnorm(.5))?
Sample Means
Suppose $X_1, X_2, \ldots, X_n$ are IID random variables, with mean $\mu$ and standard deviation $\sigma$. We know that
$$E(\bar{X}_n) = \mu$$
and
$$SD(\bar{X}_n) = \frac{\sigma}{\sqrt{n}}$$
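These two facts can be checked by simulation. This is a sketch only: the Uniform(0, 1) draws (mean 0.5, SD $\sqrt{1/12}$), the sample size, the number of replicates, and the seed are assumptions chosen for illustration.

```r
set.seed(141)             # arbitrary seed
n <- 25                   # sample size (assumption)
reps <- 10000             # number of simulated samples
sigma <- sqrt(1 / 12)     # SD of a single Uniform(0, 1) draw; its mean is 0.5

xbar <- replicate(reps, mean(runif(n)))   # one sample mean per replicate
print(c(mean_xbar = mean(xbar),           # close to mu = 0.5
        sd_xbar = sd(xbar),               # close to sigma / sqrt(n)
        theory = sigma / sqrt(n)))
```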
The Central Limit Theorem
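Let $X_1, X_2, \ldots, X_n$ be IID random variables, with mean $\mu$ and standard deviation $\sigma$. Then as $n$ increases, the distribution of $\bar{X}_n$ approaches that of a Normal with mean $\mu$ and SD $\sigma/\sqrt{n}$:
$$P\left(\frac{\bar{X}_n - \mu}{\sigma/\sqrt{n}} \le x\right) \to \int_{-\infty}^{x} \frac{1}{\sqrt{2\pi}}\, e^{-\frac{t^2}{2}}\, dt$$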
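A simulation sketch of the theorem for a clearly non-Normal population. The Exponential(1) draws (mean 1, SD 1), the sample size, the replicate count, the seed, and the evaluation points are all assumptions made for illustration.

```r
set.seed(141)                # arbitrary seed
n <- 100                     # sample size (assumption)
reps <- 5000                 # number of simulated samples
mu <- 1; sigma <- 1          # mean and SD of an Exponential(rate = 1) draw

# Standardized sample means: (Xbar_n - mu) / (sigma / sqrt(n))
z <- replicate(reps, (mean(rexp(n, rate = 1)) - mu) / (sigma / sqrt(n)))

x <- c(-1, 0, 1)
empirical <- sapply(x, function(a) mean(z <= a))   # simulated P(Z_n <= x)
print(cbind(x, empirical, normal = pnorm(x)))      # columns should be close
```

The agreement is approximate; for a skewed population like this one it improves as $n$ increases.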
The Central Limit Theorem, Three versions
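We actually have three ways of describing the normal approximation (each holds approximately, for large $n$):
1. $\dfrac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim N(0, 1)$
2. $\bar{X} \sim N(\mu, \sigma^2/n)$
3. $\sum X_i \sim N(n\mu, n\sigma^2)$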
Interpretation
It is the CLT that allows us to say that
$$\bar{X} \approx \mu \pm \sigma/\sqrt{n}$$
For the Normal distribution, the SD really is a typical deviation!
Finally: this also explains why the SD is often more useful than
other measures of spread.
Example: Binomial
Let $X_i$ be $n$ IID Bernoulli($p$) RV’s. Then $\mu = p$ and $\sigma = \sqrt{p(1 - p)}$, while $\bar{X} = \hat{p}$.
Standardized Averages
$$\frac{\hat{p} - p}{\sqrt{p(1 - p)/n}} \sim N(0, 1)$$
Averages
$$\hat{p} \sim N(p, p(1 - p)/n)$$
Sums
$$\sum X_i \sim \text{Binomial}(n, p) \sim N(np, np(1 - p))$$
Example: Binomial(20, 1/2)
[Figure: Binomial(20,.5) and N(10, 20*.5*.5). Horizontal axis ticks from 0 to 19; vertical axis from 0.00 to 0.15.]
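The comparison in the figure can be reproduced numerically. This is a sketch; the variable names are illustrative, and note that `dnorm()` takes the SD $\sqrt{20 \cdot .5 \cdot .5}$ rather than the variance.

```r
n <- 20; p <- 0.5
k <- 0:n
exact <- dbinom(k, n, p)                                       # Binomial(20, .5) probabilities
normal <- dnorm(k, mean = n * p, sd = sqrt(n * p * (1 - p)))   # N(10, 5) density; sd, not variance
max_gap <- max(abs(exact - normal))                            # largest pointwise difference

print(round(cbind(k, exact, normal), 4))
print(max_gap)
```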
Example: Binomial(50, 1/2)
[Figure: Binomial(50,.5) and N(25, 50*.5*.5). Horizontal axis ticks from 0 to 48; vertical axis from 0.00 to 0.12.]
Example: Binomial(100, 1/2)
If $X \sim \text{Binomial}(100, 1/2)$, then
$$E(X) = np = 100/2 = 50$$
$$SD(X) = \sqrt{np(1 - p)} = \sqrt{100/4} = 5$$
So typical values for $X$ are around 45 or 55, and roughly 96% of the time,
$$40 \le X \le 60$$
sum(dbinom(40:60,100,.5)) gives 0.9648.
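For comparison, a sketch that places the exact Binomial probability next to the plain Normal approximation for the same interval. The variable names are illustrative, and the Normal calculation applies no adjustment for discreteness.

```r
n <- 100; p <- 0.5
mu <- n * p                      # 50
sigma <- sqrt(n * p * (1 - p))   # 5

exact <- sum(dbinom(40:60, n, p))                        # 0.9648
normal <- pnorm(60, mu, sigma) - pnorm(40, mu, sigma)    # about 0.954
print(c(exact = exact, normal = normal))
```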
Example: Binomial(100, 1/100)
If $X \sim \text{Binomial}(100, 1/100)$, then
$$E(X) = np = 100/100 = 1$$
$$SD(X) = \sqrt{np(1 - p)} = \sqrt{100 \cdot .01 \cdot .99} \approx 1$$
So typical values for $X$ are roughly 0 to 2, and according to the CLT roughly 95% of the time,
$$-1 \le X \le 3$$
sum(dbinom(0:3,100,.01)) gives 0.9816, while sum(dbinom(0:2,100,.01)) gives 0.92. The Poisson approximation works better here!
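A sketch comparing the three calculations for $P(X \le 3)$. It assumes the Poisson approximation uses rate $np = 1$; the variable names are illustrative.

```r
n <- 100; p <- 0.01
mu <- n * p                       # 1
sigma <- sqrt(n * p * (1 - p))    # about 0.995

exact <- pbinom(3, n, p)                              # same as sum(dbinom(0:3, n, p)): 0.9816
poisson <- ppois(3, lambda = n * p)                   # Poisson(1): about 0.981
normal <- pnorm(3, mu, sigma) - pnorm(-1, mu, sigma)  # Normal area over [-1, 3]: about 0.956
print(c(exact = exact, poisson = poisson, normal = normal))
```

With $p$ this small the Binomial is strongly skewed, so the symmetric Normal curve fits poorly.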
Example: Binomial(100, 1/100)
[Figure: Binomial(100,.01) and N(1,.995). Horizontal axis ticks from 0 to 16; vertical axis from 0.0 to 0.4.]
Summary
The Normal distribution originated as a means of approximating
probabilities for sums and averages.
For IID RV’s $X_i$ with mean $\mu$ and variance $\sigma^2$, approximately for large $n$,
$$\bar{X} \sim N(\mu, \sigma^2/n)$$