4.2b - Continuous (N..

CHAPTER 4
• 4.1 - Discrete Models
   ▪ General distributions
   ▪ Classical: Binomial, Poisson, etc.
• 4.2 - Continuous Models
   ▪ General distributions
   ▪ Classical: Normal, etc.
~ The Normal Distribution ~
(a.k.a. “The Bell Curve”)
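X ~ N(μ, σ), with mean μ and standard deviation σ
Johann Carl Friedrich Gauss, 1777-1855
[Figure: bell curve over X, centered at the mean μ, with spread given by the standard deviation σ]
• Symmetric, unimodal
• Models many (but not all) natural systems
• Mathematical properties make it useful to work with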
Standard Normal Distribution
Z ~ N(0, 1)
Density function:
$\varphi(z) = \frac{1}{\sqrt{2\pi}}\, e^{-z^2/2}$
Total Area = 1
The cumulative distribution function (cdf) is denoted by Φ(z).
It is tabulated, and computable in R via the command pnorm.
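The relationship between the density and the cdf can be checked numerically. The following is a minimal sketch, not part of the original notes: `phi` is a hypothetical helper that types out the density formula, and `integrate` is used only as a numerical stand-in for "area under the curve".

```r
# Standard normal density written out from the formula
# (phi is a hypothetical helper name, not an R built-in)
phi <- function(z) exp(-z^2 / 2) / sqrt(2 * pi)

# Agreement with R's built-in density dnorm at a few points
z <- c(-2, -1, 0, 1, 2)
max_diff <- max(abs(phi(z) - dnorm(z)))

# Total area under the curve, by numerical integration
total_area <- integrate(phi, -Inf, Inf)$value

# The cdf at a point is the area to its left; by symmetry the area left of 0 is 1/2
area_left_of_0 <- integrate(phi, -Inf, 0)$value
cdf_at_0 <- pnorm(0)

print(c(max_diff, total_area, area_left_of_0, cdf_at_0))
```

Here `dnorm` plays the role of the density and `pnorm` the role of Φ(z); the integration results should agree with them up to numerical tolerance.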
Example
Find P(Z ≤ 1.2), where Z ~ N(0, 1).
• Use the included table.
Lecture Notes Appendix…
• Use R:
> pnorm(1.2)
[1] 0.8849303
Area to the left of the “z-score” 1.2: P(Z ≤ 1.2) = 0.88493
Area to the right: P(Z > 1.2) = 1 - 0.88493 = 0.11507
Note: Because this is a continuous distribution, P(Z = 1.2) = 0,
so there is no difference between P(Z > 1.2) and P(Z ≥ 1.2), etc.
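The left and right areas at z = 1.2 can be reproduced in a few lines. This is a small sketch using only `pnorm`; the variable names are illustrative.

```r
p_left  <- pnorm(1.2)                          # P(Z <= 1.2), area to the left
p_right <- 1 - p_left                          # P(Z > 1.2), by the complement rule
p_right_direct <- pnorm(1.2, lower.tail = FALSE)  # same upper-tail area, computed directly
p_mirror <- pnorm(-1.2)                        # by symmetry, P(Z < -1.2) equals P(Z > 1.2)

print(round(c(p_left, p_right), 5))   # 0.88493 0.11507
```

The `lower.tail = FALSE` form avoids the subtraction, which matters numerically only for very small tail areas.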
Standard Normal Distribution
Why be concerned about this, when most “bell curves”
don’t have mean = 0, and standard deviation = 1?
Any normal distribution can be transformed to the standard
normal distribution via a simple change of variable:
If X ~ N(μ, σ), then $Z = \frac{X - \mu}{\sigma}$ ~ N(0, 1).
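One way to see the change of variable at work is by simulation. The sketch below is illustrative only: the values of `mu` and `sigma`, the sample size, and the seed are arbitrary assumptions, not taken from the notes.

```r
set.seed(1)                       # arbitrary seed, for reproducibility only
mu <- 10; sigma <- 3              # illustrative parameters (assumed, not from the notes)

x <- rnorm(100000, mean = mu, sd = sigma)   # simulated draws from N(mu, sigma)
z <- (x - mu) / sigma                       # change of variable

# After standardizing, the sample mean should be near 0 and the sample sd near 1
z_mean <- mean(z)
z_sd   <- sd(z)
print(c(z_mean, z_sd))
```

With a sample this large, the two printed values should be close to 0 and 1, though not exactly equal to them.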
Example
POPULATION (Year 2010)
Random Variable X = Age at first birth
X ~ N(25.4, 1.5), i.e., μ = 25.4 and σ = 1.5
Question: What proportion of the
population had their first child
before the age of 27.2 years old?
P(X < 27.2) = ?
The x-score 27.2 must first be transformed to a corresponding z-score.
$Z = \frac{X - 25.4}{1.5}$, so x = 27.2 corresponds to $z = \frac{27.2 - 25.4}{1.5} = 1.2$.
P(X < 27.2) = P(Z < 1.2) = 0.88493
• Using R:
> pnorm(27.2, 25.4, 1.5)
[1] 0.8849303
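The two routes to the answer (standardize first, or pass the mean and standard deviation to `pnorm`) can be compared directly. A short sketch using the numbers from the example; the variable names are illustrative.

```r
mu <- 25.4; sigma <- 1.5   # population mean and standard deviation from the example
x  <- 27.2                 # the x-score of interest

z <- (x - mu) / sigma              # z-score; 1.2 up to floating-point rounding
p_via_z  <- pnorm(z)               # P(Z < z) on the standard normal scale
p_direct <- pnorm(x, mu, sigma)    # P(X < x) computed on the original scale

print(c(z, p_via_z, p_direct))
```

Both probabilities should agree with the value 0.8849303 shown above.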
Standard Normal Distribution
Z ~ N(0, 1)
What symmetric interval about the mean 0 contains 95% of the population values?
That is…
Find the critical values $-z_{.025}$ and $+z_{.025}$ that leave area 0.025 in each tail and area 0.95 between them.
• Use the included table.
• Use R:
> qnorm(.025)
[1] -1.959964
> qnorm(.975)
[1] 1.959964
The “.025 critical values” are $-z_{.025} = -1.96$ and $+z_{.025} = +1.96$,
so the interval -1.96 ≤ Z ≤ +1.96 contains 95% of the population values.
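As a check, the area between the two critical values can be computed with `pnorm`. This is a minimal sketch; it also shows the small effect of rounding the critical value to 1.96.

```r
z_crit <- qnorm(0.975)                            # upper .025 critical value
central_area <- pnorm(z_crit) - pnorm(-z_crit)    # area between -z_crit and +z_crit

# Using the rounded value 1.96 gives an area very slightly above 0.95
central_area_rounded <- pnorm(1.96) - pnorm(-1.96)

print(c(z_crit, central_area, central_area_rounded))
```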
X ~ N(25.4, 1.5)
What symmetric interval about the mean age
of 25.4 contains 95% of the population values?
Since $Z = \frac{X - 25.4}{1.5}$, the condition -1.96 ≤ Z ≤ +1.96 becomes
$-1.96 \le \frac{X - 25.4}{1.5} \le +1.96$
$X = 25.4 \pm (1.96)(1.5)$
$X = 25.4 \pm 2.94$
22.46 ≤ X ≤ 28.34 yrs
> areas = c(.025, .975)
> qnorm(areas, 25.4, 1.5)
[1] 22.46005 28.33995
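The same interval can be built by hand from the standard normal critical value, as mean ± z × (standard deviation). A brief sketch with the numbers from the example; the variable names are illustrative.

```r
mu <- 25.4; sigma <- 1.5
z_crit <- qnorm(0.975)                        # about 1.96

interval_by_hand <- mu + c(-1, 1) * z_crit * sigma     # mu -/+ z * sigma
interval_qnorm   <- qnorm(c(0.025, 0.975), mu, sigma)  # quantiles on the X scale

print(round(interval_by_hand, 2))   # 22.46 28.34
```

Both computations give the same endpoints, since `qnorm` with a mean and standard deviation performs the inverse change of variable internally.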
Similarly…
What symmetric interval about the mean 0 contains 90% of the population values?
Find the critical values $-z_{.05}$ and $+z_{.05}$ that leave area 0.05 in each tail and area 0.90 between them.
• Use the included table.
The cumulative area 0.95 is approximately the average of the tabulated areas 0.94950 and 0.95053,
so average the corresponding z-scores 1.64 and 1.65.
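The table lookup and averaging step can be mimicked in R. This is a sketch that regenerates the two table entries with `pnorm` rather than reading them from the printed table.

```r
# Cumulative areas at the two neighboring table z-scores, rounded as in a 5-digit table
table_z     <- c(1.64, 1.65)
table_areas <- round(pnorm(table_z), 5)    # 0.94950 and 0.95053

avg_area <- mean(table_areas)   # close to the target area 0.95
z_table  <- mean(table_z)       # averaged z-score, 1.645
z_exact  <- qnorm(0.95)         # value computed directly, for comparison

print(c(avg_area, z_table, z_exact))
```

The averaged value 1.645 should agree with `qnorm(0.95)` to about three decimal places.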
• Use R:
> qnorm(.05)
[1] -1.644854
> qnorm(.95)
[1] 1.644854
The “.05 critical values” are $-z_{.05} = -1.645$ and $+z_{.05} = +1.645$,
so the interval -1.645 ≤ Z ≤ +1.645 contains 90% of the population values.
In general…
What symmetric interval about the mean 0 contains
100(1 – α)% of the population values?
The “α/2 critical values” $-z_{\alpha/2}$ and $+z_{\alpha/2}$ leave area α/2 in each tail and area 1 – α between them.
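The general rule can be wrapped in a small function. The sketch below is illustrative: `critical_value` is a hypothetical helper name, and the three α values are chosen only as common examples.

```r
# Upper alpha/2 critical value of the standard normal
# (critical_value is a hypothetical helper, not an R built-in)
critical_value <- function(alpha) qnorm(1 - alpha / 2)

alphas <- c(0.10, 0.05, 0.01)            # illustrative choices of alpha
z      <- critical_value(alphas)
coverage <- pnorm(z) - pnorm(-z)         # area between -z and +z, should equal 1 - alpha

print(data.frame(alpha = alphas, z = round(z, 3), coverage = coverage))
```

For α = 0.10 and α = 0.05 this recovers the critical values 1.645 and 1.96 found above.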
Normal Approximation to the Binomial Distribution
(a continuous approximation to a discrete distribution)
Suppose a certain outcome exists in a population, with constant probability π.
We will select a random sample of n individuals, so that the binary
“Success vs. Failure” outcome of any individual is independent of the binary
outcome of any other individual, i.e., n Bernoulli trials (e.g., coin tosses).
Discrete random variable
X = # Successes in sample
(0, 1, 2, 3, …, n)
P(Success) = π
P(Failure) = 1 – π
Then X is said to follow a Binomial distribution,
written X ~ Bin(n, π), with “probability function”
$f(x) = \binom{n}{x}\, \pi^{x} (1 - \pi)^{n - x}$,
x = 0, 1, 2, …, n.
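The probability function can be typed out directly and compared with R's built-in `dbinom`. This is a minimal sketch: `binom_pmf` is a hypothetical helper, and n = 10, π = 0.3 are arbitrary illustrative values.

```r
# Binomial probability function written out from the formula
# (binom_pmf is a hypothetical helper; p stands for the success probability)
binom_pmf <- function(x, n, p) choose(n, x) * p^x * (1 - p)^(n - x)

n <- 10; p <- 0.3          # illustrative values, not from the notes
x <- 0:n

by_formula <- binom_pmf(x, n, p)
built_in   <- dbinom(x, n, p)

max_diff   <- max(abs(by_formula - built_in))   # should be essentially zero
total_prob <- sum(by_formula)                   # probabilities over x = 0..n sum to 1
print(c(max_diff, total_prob))
```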
Example: X ~ Bin(100, 0.2)
Area for P(X = 10):
> dbinom(10, 100, .2)
[1] 0.00336282
Area for P(X ≤ 10):
> pbinom(10, 100, .2)
[1] 0.005696381
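The link between `dbinom` (a single probability) and `pbinom` (a cumulative probability) can be made explicit by summing. A short sketch with the same n = 100 and success probability 0.2; the variable names are illustrative.

```r
n <- 100; p <- 0.2

p_equal_10 <- dbinom(10, n, p)            # P(X = 10)
p_at_most_10_sum <- sum(dbinom(0:10, n, p))   # P(X <= 10) as a sum of individual probabilities
p_at_most_10 <- pbinom(10, n, p)              # the same cumulative probability, computed directly
p_more_than_10 <- 1 - p_at_most_10            # complement: P(X > 10) = P(X >= 11)

print(c(p_equal_10, p_at_most_10_sum, p_at_most_10, p_more_than_10))
```

Unlike the continuous case, P(X = 10) is not zero here, so P(X ≤ 10) and P(X < 10) differ.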
Therefore, if…
X ~ Bin(n, π) with nπ ≥ 15 and n(1 – π) ≥ 15,
then…
$X \approx N\left(n\pi,\ \sqrt{n\pi(1 - \pi)}\right)$.
That is…
$\hat{\pi} = \frac{X}{n} \approx N\left(\pi,\ \sqrt{\frac{\pi(1 - \pi)}{n}}\right)$
“Sampling Distribution” of $\hat{\pi}$
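The quality of the approximation can be examined numerically for n = 100 and success probability 0.2. The sketch below is illustrative, and it adds a continuity correction of 0.5, which is a common refinement assumed here rather than something stated in the text above.

```r
n <- 100; p <- 0.2

# Rule-of-thumb conditions from the slide
conditions_met <- (n * p >= 15) && (n * (1 - p) >= 15)

mu    <- n * p                    # mean of X
sigma <- sqrt(n * p * (1 - p))    # standard deviation of X
se_phat <- sqrt(p * (1 - p) / n)  # standard deviation of the sample proportion X/n

x <- 0:n
exact  <- pbinom(x, n, p)               # exact P(X <= x)
approx <- pnorm(x + 0.5, mu, sigma)     # normal approximation with continuity correction (assumption)

max_error <- max(abs(exact - approx))   # largest discrepancy over all x
print(c(mu, sigma, se_phat, max_error))
```

The largest discrepancy is small in absolute terms, although the relative error can still be noticeable far out in the tails.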
● Normal distribution
● Log-Normal ~ X is not normally distributed (e.g., skewed), but Y = “logarithm of X” is normally distributed
● Student’s t-distribution ~ Similar to the normal distribution, more flexible
● F-distribution ~ Used when comparing multiple group means
● Chi-squared distribution ~ Used extensively in categorical data analysis
● Others for specialized applications ~ Gamma, Beta, Weibull…
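Two of these relationships can be illustrated briefly in R. This is a rough sketch under assumed settings: the log-normal parameters, the degrees of freedom, the sample size, and the seed are all arbitrary choices for illustration.

```r
set.seed(2)                                     # arbitrary seed
x <- rlnorm(100000, meanlog = 1, sdlog = 0.5)   # log-normal draws (illustrative parameters)
y <- log(x)                                     # Y = log(X) should look normal

right_skewed <- mean(x) > median(x)   # a simple symptom of right skew in X
y_mean <- mean(y)                     # should be near meanlog = 1
y_sd   <- sd(y)                       # should be near sdlog = 0.5

# Student's t critical values exceed the normal one and approach it as df grows
t_crit <- qt(0.975, df = c(5, 30, 1000))
z_crit <- qnorm(0.975)

print(c(y_mean, y_sd))
print(rbind(t_crit, z_crit))
```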