The Normal Distribution

ST 380
Probability and Statistics for the Physical Sciences
The Normal Distribution
The normal distribution plays a central role in probability theory and
in statistics.
It is often used as a model for the distribution of continuous random
variables. Like all models, it is always wrong, but sometimes useful.
Even when individual measurements are not normally distributed, the
central limit theorem implies that sums or averages of the
measurements are at least approximately normally distributed.
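The central limit theorem claim can be illustrated with a small simulation. This is only a sketch: the seed, the sample size n, the number of replications, and the choice of exponential measurements with rate 1 are all assumptions made for illustration.

```r
set.seed(380)   # arbitrary seed (assumption) so the sketch is reproducible
n <- 30         # measurements per average (assumed)
reps <- 10000   # number of simulated averages (assumed)

# Each row holds n exponential(1) measurements, which are strongly skewed.
samples <- matrix(rexp(n * reps, rate = 1), nrow = reps)
xbar <- rowMeans(samples)

# The averages should be centered near 1 with spread near 1 / sqrt(n).
clt_mean <- mean(xbar)
clt_sd <- sd(xbar)
c(clt_mean, clt_sd, 1 / sqrt(n))
```

A histogram of xbar would look roughly bell-shaped even though the individual measurements do not.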
Continuous Random Variables
Normal Distribution
Definition
A continuous random variable X is said to have a normal distribution,
with parameters µ and σ, if its pdf is
\[
f(x; \mu, \sigma) = \frac{1}{\sqrt{2\pi\sigma^2}} \, e^{-(x-\mu)^2/(2\sigma^2)},
\qquad -\infty < x < \infty.
\]
We write X ∼ N(µ, σ²), meaning “X is normally distributed with
parameters µ and σ”.
# Plot the N(1, 1) density on [-6, 6].
curve(dnorm(x, mean = 1, sd = 1), from = -6, to = 6)
# Overlay the N(-1, 4) density (sd = 2) on the same axes.
curve(dnorm(x, mean = -1, sd = 2), from = -6, to = 6, add = TRUE)
Standard Normal Distribution
If Z is normally distributed with µ = 0 and σ = 1, that is
Z ∼ N(0, 1), then Z has the standard normal distribution.
The pdf of Z is denoted ϕ(z):
\[
\phi(z) = \frac{1}{\sqrt{2\pi}} \, e^{-z^2/2}.
\]
The cdf of Z is denoted Φ(z):
\[
\Phi(z) = \int_{-\infty}^{z} \phi(y) \, dy.
\]
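In R, dnorm() and pnorm() with their default arguments correspond to ϕ and Φ. The sketch below checks this numerically; the helper names phi and Phi_numeric and the test point z = 1.5 are illustrative assumptions.

```r
# Standard normal pdf written out from its formula.
phi <- function(z) exp(-z^2 / 2) / sqrt(2 * pi)

# Standard normal cdf obtained by numerically integrating the pdf.
Phi_numeric <- function(z) integrate(phi, lower = -Inf, upper = z)$value

z <- 1.5                       # arbitrary test point (assumed)
c(phi(z), dnorm(z))            # the two pdf values should agree
c(Phi_numeric(z), pnorm(z))    # the two cdf values should agree closely
```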
Mean
For the standard normal distribution,
\[
E(Z) = \int_{-\infty}^{\infty} z \, \phi(z) \, dz = 0,
\]
and for the general normal distribution with parameters µ and σ,
\[
E(X) = \int_{-\infty}^{\infty} x \, f(x; \mu, \sigma) \, dx = \mu.
\]
That is, the parameter µ is also the expected value of X .
Variance
For the standard normal distribution,
\[
V(Z) = \int_{-\infty}^{\infty} z^2 \, \phi(z) \, dz = 1,
\]
and for the general normal distribution with parameters µ and σ,
\[
V(X) = \int_{-\infty}^{\infty} (x - \mu)^2 f(x; \mu, \sigma) \, dx = \sigma^2.
\]
That is, the parameter σ is also the standard deviation of X .
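The mean and variance integrals can be checked numerically. The following is a rough sketch, with µ = 1 and σ = 2 chosen arbitrarily for illustration.

```r
mu <- 1      # assumed illustrative mean
sigma <- 2   # assumed illustrative standard deviation

# Normal pdf with these parameters.
f <- function(x) dnorm(x, mean = mu, sd = sigma)

# E(X): integrate x * f(x) over the whole real line.
mean_X <- integrate(function(x) x * f(x), -Inf, Inf)$value

# V(X): integrate (x - mu)^2 * f(x) over the whole real line.
var_X <- integrate(function(x) (x - mu)^2 * f(x), -Inf, Inf)$value

c(mean_X, mu)        # should be close to each other
c(var_X, sigma^2)    # should be close to each other
```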
Standardizing
If X ∼ N(µ, σ²), then
\[
Z = \frac{X - \mu}{\sigma} \sim N(0, 1).
\]
Tables of the standard normal distribution tell us that, for instance,
P(|Z | < 1.96) = 0.95.
So
\[
0.95 = P(|Z| < 1.96)
     = P\left( \left| \frac{X - \mu}{\sigma} \right| < 1.96 \right)
     = P(\mu - 1.96\sigma < X < \mu + 1.96\sigma).
\]
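The same probability can be computed directly with pnorm() instead of tables. In this sketch the values µ = 10 and σ = 3 are arbitrary assumptions; any µ and σ > 0 should give the same answer.

```r
mu <- 10     # assumed illustrative mean
sigma <- 3   # assumed illustrative standard deviation

# P(|Z| < 1.96) for the standard normal.
p_z <- pnorm(1.96) - pnorm(-1.96)

# P(mu - 1.96 sigma < X < mu + 1.96 sigma) for X ~ N(mu, sigma^2).
p_x <- pnorm(mu + 1.96 * sigma, mean = mu, sd = sigma) -
  pnorm(mu - 1.96 * sigma, mean = mu, sd = sigma)

c(p_z, p_x)   # both are approximately 0.95
```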
Percentiles
Because the normal distribution is widely used, we use a special
notation for its percentiles:
\[
z_\alpha = [100(1 - \alpha)]\text{th percentile} = \eta_Z(1 - \alpha).
\]
That is, z_α is the value for which
\[
P(Z > z_\alpha) = \alpha.
\]
For example, tables show that z_{0.025} = 1.96.
Note
Other authors write this as z_{0.975}; be careful!
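In R, the upper-tail convention used here corresponds to qnorm() with lower.tail = FALSE. The helper name z_alpha below is an assumption introduced only for this sketch.

```r
# z_alpha: the value with upper-tail probability alpha under N(0, 1).
z_alpha <- function(alpha) qnorm(alpha, lower.tail = FALSE)

z_alpha(0.025)   # approximately 1.96
qnorm(0.975)     # the same value, indexed by lower-tail probability
```

The second call shows the other convention mentioned in the note: the same number is reached through the lower-tail probability 0.975.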
Other Normal Distributions
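If X ∼ N(µ, σ²), then
\[
F_X(x) = P(X \le x)
       = P\left( \frac{X - \mu}{\sigma} \le \frac{x - \mu}{\sigma} \right)
       = P\left( Z \le \frac{x - \mu}{\sigma} \right)
       = \Phi\left( \frac{x - \mu}{\sigma} \right).
\]
Differentiating,
\[
f_X(x) = \frac{1}{\sigma} \, \phi\left( \frac{x - \mu}{\sigma} \right).
\]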
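These two identities can be checked against R's built-in functions. This is a minimal sketch; µ, σ, and the evaluation points in x are arbitrary assumptions.

```r
mu <- 1                    # assumed illustrative mean
sigma <- 2                 # assumed illustrative standard deviation
x <- c(-3, 0, 1, 2.5)      # arbitrary evaluation points

# cdf and pdf of X built from the standard normal Phi and phi.
cdf_via_Phi <- pnorm((x - mu) / sigma)
pdf_via_phi <- dnorm((x - mu) / sigma) / sigma

# Compare with the general-normal versions of pnorm() and dnorm().
all.equal(cdf_via_Phi, pnorm(x, mean = mu, sd = sigma))
all.equal(pdf_via_phi, dnorm(x, mean = mu, sd = sigma))
```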
Percentiles
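If X ∼ N(µ, σ²) and η_X(p) is its (100p)th percentile, then
\[
p = F_X[\eta_X(p)] = \Phi\left( \frac{\eta_X(p) - \mu}{\sigma} \right) = \Phi[\eta_Z(p)],
\]
so
\[
\frac{\eta_X(p) - \mu}{\sigma} = \eta_Z(p),
\]
or
\[
\eta_X(p) = \mu + \sigma \eta_Z(p) = \mu + \sigma z_{(1-p)}.
\]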
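The percentile relationship can be verified with qnorm(). In this sketch µ = 100, σ = 15, and the probabilities in p are illustrative assumptions.

```r
mu <- 100                  # assumed illustrative mean
sigma <- 15                # assumed illustrative standard deviation
p <- c(0.05, 0.5, 0.975)   # arbitrary probabilities

# Percentiles of X from the standard normal percentiles.
eta_X <- mu + sigma * qnorm(p)

# Percentiles of X computed directly.
eta_X_direct <- qnorm(p, mean = mu, sd = sigma)

all.equal(eta_X, eta_X_direct)
```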
The Exponential Distribution
The continuous analog of the geometric distribution is the
exponential distribution, with pdf
\[
f(x; \lambda) =
\begin{cases}
\lambda e^{-\lambda x}, & x \ge 0, \\
0, & x < 0.
\end{cases}
\]
Integrating,
\[
F(x; \lambda) = \int_{-\infty}^{x} f(y; \lambda) \, dy =
\begin{cases}
1 - e^{-\lambda x}, & x \ge 0, \\
0, & x < 0.
\end{cases}
\]
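The closed-form cdf can be compared with numerical integration of the pdf and with R's pexp(). The rate λ = 0.5, the point x = 3, and the helper names F_closed and F_numeric are assumptions for this sketch.

```r
lambda <- 0.5   # assumed illustrative rate

# Closed-form cdf from the formula above.
F_closed <- function(x) ifelse(x >= 0, 1 - exp(-lambda * x), 0)

# cdf by integrating the pdf from 0 (the pdf is zero below 0).
F_numeric <- function(x) {
  integrate(dexp, lower = 0, upper = x, rate = lambda)$value
}

x <- 3          # arbitrary evaluation point
c(F_closed(x), F_numeric(x), pexp(x, rate = lambda))
```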
Mean and Variance
Integration by parts shows that
\[
E(X) = \int_0^{\infty} \lambda x e^{-\lambda x} \, dx = \frac{1}{\lambda}.
\]
Another integration by parts shows that
\[
V(X) = \frac{1}{\lambda^2},
\]
so the standard deviation of X is 1/λ, the same as the expected
value.
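As a numerical check of these results, the integrals can be evaluated with integrate(). This is only a sketch, with the rate λ = 2 chosen arbitrarily.

```r
lambda <- 2   # assumed illustrative rate

# E(X): integrate x * f(x; lambda) over [0, Inf).
mean_X <- integrate(function(x) x * dexp(x, rate = lambda), 0, Inf)$value

# V(X): integrate (x - 1/lambda)^2 * f(x; lambda) over [0, Inf).
var_X <- integrate(
  function(x) (x - 1 / lambda)^2 * dexp(x, rate = lambda), 0, Inf
)$value

c(mean_X, 1 / lambda)       # should be close to each other
c(var_X, 1 / lambda^2)      # should be close to each other
c(sqrt(var_X), mean_X)      # standard deviation matches the mean
```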
The “No Memory” Property
The exponential distribution is often used as a model for the time
you wait until some event occurs.
Suppose you wait until time t_0, and the event has not occurred; what
is the distribution of the remaining waiting time?
\[
P(X > t_0 + t \mid X > t_0)
  = \frac{P(X > t_0 + t)}{P(X > t_0)}
  = \frac{e^{-\lambda (t_0 + t)}}{e^{-\lambda t_0}}
  = e^{-\lambda t}
  = P(X > t).
\]
That is, given that you have already waited until time t_0, the
probability of waiting more than a further time t is the same as the
probability of waiting more than t from the start.
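The no-memory property can be checked numerically using the upper-tail form of pexp(). The values of λ, t0, and t and the helper name surv are assumptions for this sketch.

```r
lambda <- 0.5   # assumed illustrative rate
t0 <- 4         # time already waited (assumed)
t <- 1.5        # additional waiting time (assumed)

# Survival function P(X > x) for the exponential distribution.
surv <- function(x) pexp(x, rate = lambda, lower.tail = FALSE)

# Conditional probability of waiting beyond t0 + t, given X > t0.
p_conditional <- surv(t0 + t) / surv(t0)

# Unconditional probability of waiting beyond t.
p_unconditional <- surv(t)

c(p_conditional, p_unconditional)   # the two should agree
```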
Reliability
Suppose that the event is the failure of a piece of equipment.
If the time to failure has the exponential distribution, the equipment
does not age: the probability that it fails within the next interval
of a given length does not depend on how long it has been in service.
That makes it an interesting but unrealistic model for failure times.
The Gamma Distribution
The pdf of the Gamma distribution is of the form
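\[
f(x; \alpha, \beta) =
\begin{cases}
k x^{\alpha - 1} e^{-x/\beta}, & x \ge 0, \\
0, & x < 0,
\end{cases}
\]
for an appropriate normalizing constant k.
We determine k from the requirement
\[
\int_{-\infty}^{\infty} f(x; \alpha, \beta) \, dx = 1.
\]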
The definition of the Gamma function is
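\[
\Gamma(\alpha) = \int_0^{\infty} x^{\alpha - 1} e^{-x} \, dx,
\]
so, substituting u = x/β,
\[
1 = k \int_0^{\infty} x^{\alpha - 1} e^{-x/\beta} \, dx
  = k \beta^{\alpha} \int_0^{\infty} u^{\alpha - 1} e^{-u} \, du
  = k \beta^{\alpha} \Gamma(\alpha),
\]
and
\[
k = \frac{1}{\beta^{\alpha} \Gamma(\alpha)}.
\]
Finally,
\[
f(x; \alpha, \beta) =
\begin{cases}
\dfrac{1}{\beta^{\alpha} \Gamma(\alpha)} \, x^{\alpha - 1} e^{-x/\beta}, & x \ge 0, \\
0, & x < 0.
\end{cases}
\]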
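The normalizing constant can be checked numerically, and the resulting pdf compared with R's dgamma(), whose shape and scale arguments play the roles of α and β. In this sketch α = 2.5, β = 1.5, and the name kernel are illustrative assumptions.

```r
alpha <- 2.5   # assumed illustrative shape
beta <- 1.5    # assumed illustrative scale

# Unnormalized gamma density for x >= 0.
kernel <- function(x) x^(alpha - 1) * exp(-x / beta)

# Its integral should equal beta^alpha * Gamma(alpha).
kernel_integral <- integrate(kernel, 0, Inf)$value
k <- 1 / (beta^alpha * gamma(alpha))
k * kernel_integral   # should be close to 1

# The normalized pdf should agree with dgamma().
x <- c(0.5, 2, 6)     # arbitrary evaluation points
pdf_manual <- k * kernel(x)
all.equal(pdf_manual, dgamma(x, shape = alpha, scale = beta))
```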
Mean and Variance
Again using integration by parts and properties of the Gamma
function,
\[
E(X) = \alpha\beta
\]
and
\[
V(X) = \alpha\beta^2.
\]
Special Case
If α = 1, the gamma pdf simplifies to the exponential pdf with
parameter λ = 1/β.
We can view the gamma distribution as a generalization of the
exponential distribution, with a shape parameter α in addition to the
scale parameter β.
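Both the moments and the special case can be checked numerically. The following is a sketch under assumed values α = 3 and β = 2; the evaluation points in x are also arbitrary.

```r
alpha <- 3   # assumed illustrative shape
beta <- 2    # assumed illustrative scale

# Gamma pdf with these parameters.
f <- function(x) dgamma(x, shape = alpha, scale = beta)

# E(X) and V(X) by numerical integration over [0, Inf).
mean_X <- integrate(function(x) x * f(x), 0, Inf)$value
var_X <- integrate(function(x) (x - alpha * beta)^2 * f(x), 0, Inf)$value

c(mean_X, alpha * beta)      # should be close to each other
c(var_X, alpha * beta^2)     # should be close to each other

# Special case: shape 1 gives the exponential pdf with rate 1 / beta.
x <- c(0.1, 1, 5)
gamma_shape1 <- dgamma(x, shape = 1, scale = beta)
exponential <- dexp(x, rate = 1 / beta)
all.equal(gamma_shape1, exponential)
```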