8-5 Normal Distributions
continuous distribution found in many application areas
e.g. SAT scores, people’s heights, leaf lengths
all are distributed according to the normal distribution
also called the Gaussian Distribution
it is the famous "bell-shaped" curve:
normal distributions have a mean μ
and a standard deviation σ
the above Normal has μ = 430 and σ = 100
For any Normal Distribution with mean μ and standard deviation σ:
68% of all scores lie within 1σ of the mean
95% are within 2σ
99.7% (virtually all) are within 3σ
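These three percentages can be checked numerically. Here is a minimal sketch in Python using only the standard library. The helper name `fraction_within` is made up for illustration; it relies on the standard identity that the area within k standard deviations of the mean equals erf(k/√2).

```python
import math

def fraction_within(k):
    """Fraction of any normal distribution lying within k standard deviations of its mean."""
    return math.erf(k / math.sqrt(2))

# Print the percentage for 1, 2 and 3 standard deviations.
for k in (1, 2, 3):
    print(k, round(fraction_within(k) * 100, 1))
```

The printed values should land close to 68%, 95% and 99.7%; the figures in the rule are rounded.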
Computing areas (probabilities) under normal distributions
Question:
What % of the people that take the SAT make scores between 330 and 530?
Analysis: mean μ = 430, s.d. σ = 100
Picture:
330 = μ - 1σ
530 = μ + 1σ
% of scores between μ - 1σ and μ + 1σ for any normal distribution?
We know that one; it's 68%!
Question:
What % of the people that take the SAT make scores between 430 and 600?
Analysis: mean μ = 430, s.d. σ = 100
600 = how many σ from the mean?
z = (600 - 430)/100 = 1.7
We only know percentages for within 1σ, 2σ, and 3σ of the mean.
Q: Can we find the area under a normal from μ to μ + 1.7σ?
A: Yes, we can look it up in a table.
Here's the picture:
z = 1.7
The table on page 723 gives us the area (for any normal distribution!)
from μ to μ + (any number of σ)
(look up 1.7 in the table) = .4554
Answer: 45.54%
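If no table is at hand, the same area can be computed directly. This is a sketch, not the textbook's method: the function name `area_mean_to_z` is hypothetical, and it uses the identity that the area from the mean out to z standard deviations is 0.5·erf(z/√2).

```python
import math

def area_mean_to_z(z):
    """Area under a normal curve between the mean and z standard deviations away."""
    return 0.5 * math.erf(abs(z) / math.sqrt(2))

z = (600 - 430) / 100              # number of standard deviations above the mean
print(z, round(area_mean_to_z(z), 4))
```

This plays the role of the table lookup: give it a z value, get back the area between the mean and that z.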
Z-statistics
by computing the number of σ's a score is from the mean
you are computing a number known as:
a z-statistic, or
a standardized score
the unit of measurement of a z-statistic is always standard deviations
Example: if μ = 10 and σ = 7, what is the z-statistic or standardized score corresponding to a score of 15?
z = (15 - 10)/7 = .714
Now, what % of scores lie between 10 and 15?
Compute the area under a normal between μ and μ + .714σ
Picture: z = .714
(look up .71 in the table) = .2611
Ans: 26%
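The two steps (standardize, then look up the area) can be sketched in Python. The names `z_score` and `area_mean_to_z` are illustrative only. Rounding z to two decimals imitates reading a printed table, which is an assumption about how the table is laid out.

```python
import math

def z_score(x, mean, sd):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / sd

def area_mean_to_z(z):
    """Area under a normal curve between the mean and z standard deviations away."""
    return 0.5 * math.erf(abs(z) / math.sqrt(2))

z = z_score(15, mean=10, sd=7)
table_style = area_mean_to_z(round(z, 2))   # z rounded to .71, as when reading a table
unrounded = area_mean_to_z(z)               # same area without rounding z first
print(round(z, 3), round(table_style, 4), round(unrounded, 4))
```

The two areas differ slightly in the third decimal place; either way the answer rounds to 26%.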
Example:
What % of scores lie between 5 and 10?
Picture: z = (5 - 10)/7 = -.714
since the normal curve is symmetric
the shaded area is the same as for between 10 and 15
and the answer = 26% as before
Example:
What is the area between scores of 7 and 15?
Picture: z = (7 - 10)/7 = -.43 and z = (15 - 10)/7 = .714
shaded area below μ (look up .43 in the table) = .1664
shaded area above μ (look up .714 in the table) = .2611
total shaded area = .1664 + .2611 = .4275
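An area that straddles the mean can also be found as a difference of two "area to the left" values. The sketch below assumes nothing beyond the standard library; `normal_cdf` and `area_between` are made-up helper names.

```python
import math

def normal_cdf(x, mean, sd):
    """Area under the normal curve to the left of x."""
    return 0.5 * (1 + math.erf((x - mean) / (sd * math.sqrt(2))))

def area_between(lo, hi, mean, sd):
    """Area under the normal curve between the scores lo and hi."""
    return normal_cdf(hi, mean, sd) - normal_cdf(lo, mean, sd)

print(round(area_between(7, 15, mean=10, sd=7), 4))
```

The result is close to, but not exactly, .4275, because the table lookups use z rounded to two decimals.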
Example:
What is the area above 15?
Picture: z = .714
since the total area above the mean is .5
and the area from 10 to 15 (look up .714) is .2611
the required area is .5 - .2611 = .2389
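A tail area like this one can be computed in one step with the complementary error function. This is a sketch with a hypothetical helper name, `area_above`; it uses the identity that the area to the right of z is 0.5·erfc(z/√2).

```python
import math

def area_above(x, mean, sd):
    """Area under the normal curve to the right of x."""
    z = (x - mean) / sd
    return 0.5 * math.erfc(z / math.sqrt(2))

print(round(area_above(15, mean=10, sd=7), 4))
```

As a sanity check, `area_above(10, mean=10, sd=7)` gives .5, the total area above the mean.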
Animal Crackers - segue to a new topic
"Animal crackers, and cocoa to drink.
That is the finest of suppers, I think."
The American Express Card is used by 40% of the people buying animal crackers in Austin. 20 animal cracker buyers are surveyed. What is the probability that from 6 to 12 of them use Amex?
this is a repeated Bernoulli experiment with
n = 20, p = .40, q = .60
if X = # of people using Amex, we want (using the binomial distribution)
$$P(6 \le X \le 12) = \sum_{i=6}^{12} C_{20,i}\,(.40)^{i}\,(.60)^{20-i}$$
That's 7 of those nasty terms to compute and add!
What a bore! Is there an easier way? Watch!
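For comparison, a computer has no trouble with the seven terms. This is a minimal sketch of the direct binomial sum. The function names are invented for illustration, and `math.comb` assumes Python 3.8 or later.

```python
from math import comb

def binomial_pmf(i, n, p):
    """Probability of exactly i successes in n Bernoulli trials with success probability p."""
    return comb(n, i) * p**i * (1 - p)**(n - i)

def binomial_range(lo, hi, n, p):
    """Probability that the number of successes is between lo and hi, inclusive."""
    return sum(binomial_pmf(i, n, p) for i in range(lo, hi + 1))

print(round(binomial_range(6, 12, n=20, p=0.40), 4))
```

The call adds the terms for i = 6, 7, ..., 12, exactly the sum described in the text.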
Areas under a binomial distribution
for simplicity, we use an example binomial with:
n = 5 and p = .4
Let's compute the probability that (X= 1 or X=2):
$p(1) + p(2) = C_{5,1}(.4)(.6)^{4} + C_{5,2}(.4)^{2}(.6)^{3} = .2592 + .3456 = .6048$
If we graph our binomial, we get:
Now notice: since the bar labeled 1 has
width = 1
height = (probability) = .2592
the area of bar 1 = 1 x .2592 = .2592 = p(1)
the area of bar 2 = 1 x .3456 = .3456 = p(2)
the area of bar 1 + the area of bar 2 = .2592 + .3456 = .6048 = p(1) + p(2)
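The bar heights and areas can be tabulated in a few lines. This is a sketch only; the variable names (`pmf`, `bar_width`, `shaded`) are chosen for illustration.

```python
from math import comb

n, p = 5, 0.4

# Height of each bar = binomial probability of that value of X.
pmf = {k: comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)}

bar_width = 1
for k, height in pmf.items():
    print(k, round(bar_width * height, 4))   # area of bar k = width x height

# Area of bar 1 plus area of bar 2.
shaded = bar_width * pmf[1] + bar_width * pmf[2]
print(round(shaded, 4))
```

Because every bar has width 1, each bar's area is numerically equal to its probability, and the areas of all six bars add up to 1.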
for discrete distributions :
areas (under the "curve") represent probabilities
just as for continuous distributions !
The shaded area is p(1) + p(2)
The Normal approximation to the binomial distribution
We start by overlaying an appropriately-shaped (but rather crudely-drawn) Normal on our binomial distribution:
This makes sense, because of our observation in the last section: one can interpret areas for both the normal and binomial distributions as probabilities
We are creating a normal approximation to the binomial , to be used in the following way:
instead of computing areas under the binomial
we will instead compute areas under the normal
they won't be exactly right . . .
only an approximation of the "real area" we want
how good an approximation do we have?
look:
Areas under the normal approximation will:
"leave out" pieces of the binomial appearing above the normal
but will "add in" some extra pieces outside the binomial appearing below the normal
these adjustments do not balance out completely
but in many cases will provide a good enough approximation!
Using the Normal approximation to compute a binomial probability
We go back to our original problem. We want to compute:
$p(1) + p(2) = C_{5,1}(.4)(.6)^{4} + C_{5,2}(.4)^{2}(.6)^{3}$
which was computed directly to be: .6048
Now we will try to approximate that by computing an area under the normal approximation :
Recall that for the binomial:
Mean: μ = np
Standard deviation: σ = √(npq)
Accordingly, we will use a Normal with:
μ = 5(.4) = 2
σ = √(5 × .4 × .6) = 1.1
we want to do p(1) + p(2) BUT . . .
the limits for the normal will go from .5 to 2.5
(not 1 to 2)
Why? Look at the picture!! Each bar has width 1 and is centered on its integer, so bar 1 runs from .5 to 1.5 and bar 2 from 1.5 to 2.5.
So we want to compute this area:
z = (.5 - 2)/1.1 = -1.36
z = (2.5 - 2)/1.1 = .45
shaded area below 2 (look up 1.36 in the table) = .4131
shaded area above 2 (look up .45 in the table) = .1736
total area: .5867
remember the exact calculation? .6048
not a bad approximation, huh?
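The whole comparison can be sketched in Python: the exact binomial sum on one side, and on the other the area under the matching Normal, with limits widened by .5 at each end. All function names are hypothetical. The code uses the unrounded σ, so its approximate value differs a little from the table-based .5867.

```python
import math
from math import comb

def normal_cdf(x, mean, sd):
    """Area under the normal curve to the left of x."""
    return 0.5 * (1 + math.erf((x - mean) / (sd * math.sqrt(2))))

def binomial_exact(lo, hi, n, p):
    """Exact probability that lo <= X <= hi for a binomial X."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(lo, hi + 1))

def binomial_normal_approx(lo, hi, n, p):
    """Normal approximation to the same probability, with limits lo - .5 and hi + .5."""
    mean = n * p
    sd = math.sqrt(n * p * (1 - p))
    return normal_cdf(hi + 0.5, mean, sd) - normal_cdf(lo - 0.5, mean, sd)

print(round(binomial_exact(1, 2, n=5, p=0.4), 4))
print(round(binomial_normal_approx(1, 2, n=5, p=0.4), 4))
```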
Animal Crackers accomplished - no pain!
The American Express Card is used by 40% of the people buying animal crackers in Austin. 20 animal cracker buyers are surveyed. What is the probability that from 6 to 12 of them use Amex?
use a Normal distribution with
Mean: μ = np = 20(.40) = 8
Standard deviation: σ = √(npq) = √(20 × .4 × .6) = 2.19
compute the area under a Normal
with mean 8 and standard deviation 2.19
from 6 - .5 = 5.5
to 12 + .5 = 12.5
where did the .5's come from? (re-read the previous section!)
Picture: z = (5.5 - 8)/2.19 = -1.14
z = (12.5 - 8)/2.19 = 2.05
shaded area below 8 (look up 1.14 in the table) = .3729
shaded area above 8 (look up 2.05 in the table) = .4798
total area: .8527
Ans: .8527
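The steps of this calculation can be retraced in code. This is a sketch under one assumption: z is rounded to two decimals to imitate reading a printed table. The helper name `area_mean_to_z` is made up for illustration.

```python
import math

def area_mean_to_z(z):
    """Area under a normal curve between the mean and z standard deviations away."""
    return 0.5 * math.erf(abs(z) / math.sqrt(2))

n, p = 20, 0.40
mean = n * p                          # 8
sd = math.sqrt(n * p * (1 - p))       # about 2.19

# Widen the limits by .5 on each side, then standardize and round as for a table.
z_lo = round((6 - 0.5 - mean) / sd, 2)
z_hi = round((12 + 0.5 - mean) / sd, 2)

below = area_mean_to_z(z_lo)          # shaded area below the mean
above = area_mean_to_z(z_hi)          # shaded area above the mean
total = below + above
print(z_lo, z_hi, round(total, 4))
```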
When is the Normal approximation appropriate?
The question arises: "When is the Normal Approximation a good enough approximation to the Binomial?"
Here's a rule-of-thumb to use:
Rule-of-Thumb Test
If μ - 3σ ≥ 0 and μ + 3σ ≤ n, the Normal Approximation is appropriate
try it on our binomial problem, parameters: n = 5, p = .4, μ = 2, σ = 1.1
remember we got a pretty good approximation?
BUT!
μ - 3σ = 2 - (3)(1.1) = -1.3, not ≥ 0
it fails the test already!
so we really shouldn't use the Normal approximation!
we were just lucky, I guess!
How about Animal Crackers? n = 20, p = .4, μ = 8, σ = 2.19
μ - 3σ = 1.43 ≥ 0 OK so far
μ + 3σ = 14.57 ≤ n OK
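The rule-of-thumb test is easy to automate. This is a minimal sketch; the function name `normal_approx_ok` is invented here, and it checks the two conditions μ - 3σ ≥ 0 and μ + 3σ ≤ n for a binomial with parameters n and p.

```python
import math

def normal_approx_ok(n, p):
    """Rule-of-thumb test: is the interval mean +/- 3 sd inside the range 0..n?"""
    mean = n * p
    sd = math.sqrt(n * p * (1 - p))
    return mean - 3 * sd >= 0 and mean + 3 * sd <= n

print(normal_approx_ok(5, 0.4))    # the small example
print(normal_approx_ok(20, 0.4))   # Animal Crackers
```

The first call fails the test and the second passes, matching the two checks worked by hand.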