Probabilities for Normal Random Variables in General

CONTINUOUS RANDOM VARIABLES
The binomial random variable is called a discrete random variable because it takes
on the discrete values 0, 1, 2, 3, ..., n.
Consider the probability histogram of X, a binomial random variable with n = 3 and p = .7.
Each bar has width 1, so the area of a bar equals the probability of the corresponding value.
P(X=2) =
P(X 2) =
The total area under this probability histogram is
p(0) + p(1) + p(2) + p(3) = 1
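The bar heights of this histogram can be checked numerically. The following is a minimal Python sketch (Python is not part of the original notes, and the helper name `binom_pmf` is our own) that evaluates the binomial formula for n = 3, p = .7 and confirms that the areas add up to 1.

```python
from math import comb

def binom_pmf(k, n, p):
    """P(X = k) for a binomial random variable with parameters n and p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 3, 0.7
pmf = [binom_pmf(k, n, p) for k in range(n + 1)]

for k, prob in enumerate(pmf):
    print(f"p({k}) = {prob:.3f}")

# Each bar has width 1, so the total area is the sum of the bar heights.
total_area = sum(pmf)
print(f"total area = {total_area:.3f}")
```

Any cumulative probability for X is then a sum of the relevant entries of `pmf`.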
A continuous random variable X is one which represents measurements that
(theoretically) can be made to any degree of accuracy.
For example, suppose X = the weight (in kg) of a randomly chosen newborn baby.
Depending on the accuracy of our scale, the weight X of a randomly selected baby
could be recorded as 3 or 3.3 or 3.26 or 3.258, etc.
The probability histogram for such a variable has to be formed in a very different
way than for a discrete random variable.
For example, in the case of “X = the birth weight of a newborn”, we could take a very
large sample from the population of newborns, measure the sample birth weights
very accurately (to many decimal places) and form a histogram of the birth
weights using classes of small width. If the vertical scale is adjusted so that the total area
under the histogram is one, then areas under the histogram can be used to calculate
(approximately) the probabilities. The larger the sample and the more accurate our
measurements, the more accurate these probabilities will be.
In the illustration below, a simple random sample (SRS) of 812 babies was used to form a probability
histogram.
[Figure: histogram titled "Birth Weight (kg) of 812 Newborns"; horizontal axis: WT (kg), from 1 to 5; vertical axis: Frequency, from 0 to 30.]
Let X be the birth weight of a randomly chosen newborn.
P(X ≤ 3) = shaded area
The larger the sample, the smaller we can make these rectangles, and the
“smoother” will be the resulting histogram.
Thus a “model” of the distribution could be obtained by fitting a curve to such a
histogram and using the area under the curve to calculate the probabilities (this
curve is called a probability density function).
Thus P(X ≤ 3) is the area under the curve to the left of 3, and the total area under the
curve must be 1.
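The idea of rescaling a histogram so that area equals probability can be tried out on simulated data. The sketch below is only an illustration under stated assumptions: the 812 actual birth weights are not reproduced here, so the weights are simulated from a hypothetical bell-shaped model (mean 3.3 kg, standard deviation 0.5 kg, both invented for this example), and the class width of 0.1 kg is likewise our choice.

```python
import random
from statistics import NormalDist

# Hypothetical population model, used only to simulate data for this sketch.
MEAN_KG, SD_KG = 3.3, 0.5
rng = random.Random(0)                       # fixed seed so the run is repeatable
weights = [rng.gauss(MEAN_KG, SD_KG) for _ in range(20_000)]

WIDTH = 0.1                                  # class width in kg
counts = {}
for w in weights:
    k = int(w // WIDTH)                      # class k covers [k*WIDTH, (k+1)*WIDTH)
    counts[k] = counts.get(k, 0) + 1

# Rescale the vertical axis: height = relative frequency / class width,
# so that the total area under the histogram is 1.
n = len(weights)
density = {k: c / (n * WIDTH) for k, c in counts.items()}
total_area = sum(h * WIDTH for h in density.values())

# The area of the bars lying to the left of 3 kg approximates P(X <= 3).
cut = round(3.0 / WIDTH)                     # classes with index below cut lie left of 3 kg
area_left_of_3 = sum(h * WIDTH for k, h in density.items() if k < cut)

# Area under the fitted curve (here: the model we simulated from).
model_value = NormalDist(MEAN_KG, SD_KG).cdf(3.0)

print(f"total area         = {total_area:.4f}")
print(f"histogram estimate = {area_left_of_3:.4f}")
print(f"area under curve   = {model_value:.4f}")
```

With a larger simulated sample and narrower classes, the histogram estimate moves closer to the area under the curve, which is the point made in the text.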
Note: As we know, there are many different shapes among various populations (e.g.
left skewed, right skewed, symmetric). In the class of bell-shaped curves there is
a specific one called the normal curve (there is a mathematical formula which
defines it exactly). If a population can be modeled by this particular bell-shaped curve,
the population is said to have a Normal (or Gaussian) Distribution.
The Normal Distribution
A Normal (or Gaussian) Population is one that can be modeled by a certain bell-shaped
curve called the normal curve. The population of weights of newborn babies
described above is an example of such a population. In describing such a population,
we need to know two quantities, μ and σ: μ stands for the population mean and σ
stands for the population standard deviation. In our example μ is the mean
(average) weight of all newborn babies in the population and σ measures the spread
of the population values about the mean μ.
A Normal (or Gaussian) random variable X represents a randomly chosen
measurement sampled from this population. Probabilities about X are found by
finding the appropriate areas under the curve, i.e. P(X ≤ x) = the area under the
normal curve to the left of x.
Note: (a) The total area under the curve is 1.
(b) For a normal random variable P(X=x) is always 0.
Practically speaking, this means that if we can measure observations very accurately,
the chance of finding a newborn weighing exactly 3.0000000000 kg, say, is very
small. Thus for all practical purposes P(X = 3) = 0.
One consequence of this is that P(X ≤ x) = P(X < x).
The population of Z-scores of a normal population is called the Standard Normal
Population. A randomly chosen measurement from the standard normal population
is denoted by Z. Z is simply the Z-score of a normal random variable X.
Z = (X - μ) / σ
For the standard normal random variable Z: μ_Z = 0 and σ_Z = 1.
We will show later that most of the time an observed value of Z will fall between –3
and +3.
Probabilities for the Standard Normal
Probabilities for the standard normal distribution can be obtained from Table A
on pages T-2 and T-3.
Examples: (a) (i) P(Z ≤ 1.5) =
(ii) P(Z > 1.5) =
(b) (i) P(Z ≤ -1.5) =
(ii) P(Z ≥ -1.5) =
(c) P(-1.5 ≤ Z ≤ 1.5) =
(d) P(-1.5 < Z < 2.21) =
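Table lookups of this kind can be cross-checked in software. The sketch below uses `NormalDist` from Python's standard `statistics` module in place of Table A (a substitution of ours, not something the notes prescribe); the helper name `phi` is also our own. It computes the left and right tail areas at 1.5 and at -1.5, and the two "between" areas in (c) and (d).

```python
from statistics import NormalDist

Z = NormalDist()                      # standard normal: mean 0, standard deviation 1

def phi(z):
    """Area under the standard normal curve to the left of z."""
    return Z.cdf(z)

left_of_1_5 = phi(1.5)                # P(Z <= 1.5)
right_of_1_5 = 1 - phi(1.5)           # P(Z > 1.5), the complement
left_of_minus_1_5 = phi(-1.5)         # area to the left of -1.5
right_of_minus_1_5 = 1 - phi(-1.5)    # area to the right of -1.5
between_c = phi(1.5) - phi(-1.5)      # P(-1.5 <= Z <= 1.5)
between_d = phi(2.21) - phi(-1.5)     # P(-1.5 < Z < 2.21)

for name, value in [
    ("left of 1.5", left_of_1_5),
    ("right of 1.5", right_of_1_5),
    ("left of -1.5", left_of_minus_1_5),
    ("right of -1.5", right_of_minus_1_5),
    ("between -1.5 and 1.5", between_c),
    ("between -1.5 and 2.21", between_d),
]:
    print(f"{name:>22}: {value:.4f}")
```

By symmetry the area to the right of 1.5 equals the area to the left of -1.5, and because P(Z = z) = 0 it makes no difference whether the inequalities are strict.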
Probabilities for Normal Random Variables in General
Example: The heart rate of patients suffering from heart disease is normally
distributed with a mean of 97 beats per minute and a standard deviation of 18 beats
per minute. For a randomly chosen patient, find the probability the heart rate is
(a) below 80 (b) more than 140, (c) between 55 and 90.
Let X = the heart rate of a randomly selected patient; μ = 97, σ = 18.
(a) P(X< 80) =
(b) P(X>140) =
(c) P(55<X<90) =
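One way to check answers to this example is to standardize each boundary with Z = (X - μ) / σ and then read off standard normal areas, exactly as one would with the table. The following Python sketch does that with the standard library's `NormalDist` standing in for Table A; the function name `z_score` is our own. Because the table rounds z to two decimals, hand answers may differ slightly in the third decimal place.

```python
from statistics import NormalDist

MU, SIGMA = 97, 18                    # mean and standard deviation given in the example
Z = NormalDist()                      # standard normal

def z_score(x):
    """Standardize a heart rate x: number of standard deviations from the mean."""
    return (x - MU) / SIGMA

p_a = Z.cdf(z_score(80))                           # P(X < 80)
p_b = 1 - Z.cdf(z_score(140))                      # P(X > 140)
p_c = Z.cdf(z_score(90)) - Z.cdf(z_score(55))      # P(55 < X < 90)

print(f"z for 80  = {z_score(80):.2f},  P(X < 80)      = {p_a:.4f}")
print(f"z for 140 = {z_score(140):.2f},  P(X > 140)     = {p_b:.4f}")
print(f"z for 55  = {z_score(55):.2f}, z for 90 = {z_score(90):.2f}, "
      f"P(55 < X < 90) = {p_c:.4f}")
```

The same numbers come out if the probabilities are computed directly from `NormalDist(97, 18)`, which is a useful consistency check on the standardization step.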
Example: Let X be any normal random variable with mean μ and standard
deviation σ. What is the probability that X is within two standard deviations of the
mean?
First note that the statement “X is within two standard deviations of the mean”
means that X lies between μ - 2σ and μ + 2σ.
Thus P(X is within two standard deviations of the mean)
= P(μ - 2σ < X < μ + 2σ)
=
Note: Similarly,
P(X is within one standard deviation of the mean)
=
Note:
P(X is within three standard deviations of the mean)
=
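Standardizing shows that these three probabilities do not depend on the particular mean and standard deviation: being within k standard deviations of the mean is the same event as -k < Z < k. The sketch below (Python standard library; the function name `prob_within` and the comparison distribution with mean 100 and standard deviation 10 are our own illustrative choices) evaluates the three cases.

```python
from statistics import NormalDist

Z = NormalDist()                      # standard normal

def prob_within(k):
    """P(X is within k standard deviations of its mean) = P(-k < Z < k)."""
    return Z.cdf(k) - Z.cdf(-k)

within = {k: prob_within(k) for k in (1, 2, 3)}
for k, prob in within.items():
    print(f"within {k} standard deviation(s): {prob:.4f}")

# Same calculation done directly on an arbitrary normal variable
# (illustrative values only) to show the result does not change.
X = NormalDist(100, 10)
direct_two_sd = X.cdf(100 + 2 * 10) - X.cdf(100 - 2 * 10)
print(f"direct check for k = 2:          {direct_two_sd:.4f}")
```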
Percentiles of the Standard Normal Distribution
(Using the Normal Tables backwards)
(a) Find the 95th percentile of the standard
normal distribution, i.e. find the value z0
such that
P(Z ≤ z0) = .95
(b) Find z0 such that
P ( Z  z0 ) = .41
(c) Find z0 such that
P(z0 ≤ Z ≤ 0) = .1
(d) Find z0 such that
P(-z0 ≤ Z ≤ z0) = .95
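Reading the table backwards corresponds to evaluating the inverse of the cumulative distribution function. As a hedged illustration, the sketch below uses `NormalDist.inv_cdf` from Python's standard library in place of the table for parts (a), (c) and (d). In each case the first step is to rewrite the condition as a left-tail area P(Z ≤ z0); the variable names are our own.

```python
from statistics import NormalDist

Z = NormalDist()                      # standard normal

# (a) 95th percentile: the left-tail area is already given.
z_a = Z.inv_cdf(0.95)

# (c) P(z0 <= Z <= 0) = .1. The area to the left of 0 is .5,
#     so the area to the left of z0 must be .5 - .1 = .4.
z_c = Z.inv_cdf(0.5 - 0.1)

# (d) P(-z0 <= Z <= z0) = .95. By symmetry each tail holds .025,
#     so the area to the left of z0 is .95 + .025 = .975.
z_d = Z.inv_cdf(0.95 + 0.025)

print(f"(a) z0 = {z_a:.3f}")
print(f"(c) z0 = {z_c:.3f}")
print(f"(d) z0 = {z_d:.3f}")
```

The value in (c) comes out negative, as it must, since z0 lies to the left of 0.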
Example: Let X be a normal random variable with mean μ = 100 and standard
deviation σ = 10. Find x0 such that
(a) P(X ≤ x0) = .80 [i.e. x0 is the 80th percentile of X]
(b) P ( X  x0 ) = .025
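For a general normal variable, a percentile is found in two steps: find the corresponding z0 for the standard normal, then undo the standardization with x0 = μ + z0 σ. The sketch below works through part (a) this way, using Python's `NormalDist.inv_cdf` as a stand-in for the table; the variable names are ours.

```python
from statistics import NormalDist

MU, SIGMA = 100, 10                   # values given in the example
Z = NormalDist()                      # standard normal

# Step 1: z0 with area .80 to its left under the standard normal curve.
z0 = Z.inv_cdf(0.80)

# Step 2: convert back to the scale of X by solving z0 = (x0 - MU) / SIGMA.
x0 = MU + z0 * SIGMA

print(f"z0 = {z0:.4f}")
print(f"x0 = {x0:.2f}")

# Check: the area to the left of x0 under the N(100, 10) curve should be .80.
area_check = NormalDist(MU, SIGMA).cdf(x0)
print(f"P(X <= x0) = {area_check:.4f}")
```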
Example: The scores on the Scholastic Aptitude Test (SAT) for verbal ability of
high school seniors are normally distributed with mean 430 and standard deviation
100. What score must a student attain in order to be in the top 5% of all the
students who took the test?
Example: The time to first failure for a certain model of television set is normally
distributed with mean 5 years and standard deviation 1.56 years. If the
manufacturer wishes to repair only 10% of the sets sold, for how long should he
guarantee his product?
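Both of these examples are percentile problems in disguise. The sketch below is one possible way to set them up in Python, under two stated readings: "top 5%" is taken to mean the 95th percentile of the scores, and "repair only 10% of the sets" is taken to mean that the guarantee period is the 10th percentile of the time to first failure. The helper name `percentile` is our own.

```python
from statistics import NormalDist

Z = NormalDist()                      # standard normal

def percentile(mu, sigma, left_area):
    """Value x0 with P(X <= x0) = left_area, via x0 = mu + z0 * sigma."""
    z0 = Z.inv_cdf(left_area)
    return mu + z0 * sigma

# SAT: top 5% means 95% of the scores lie below the cutoff.
sat_cutoff = percentile(430, 100, 0.95)

# Television sets: 10% of sets should fail before the guarantee expires.
guarantee_years = percentile(5, 1.56, 0.10)

print(f"SAT cutoff for the top 5%: {sat_cutoff:.1f}")
print(f"guarantee period:          {guarantee_years:.2f} years")
```

Note that the z0 for the guarantee problem is negative, because the 10th percentile lies below the mean.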
Normal Approximation to Binomial Probabilities
Let X be a binomial random variable with parameters n and p. The mean and standard
deviation of this distribution are given by
μ = np and σ = √(npq), where q = 1 - p.
Then if both np ≥ 5 and nq ≥ 5, the binomial probabilities may be closely
approximated by normal probabilities in the following way. For a = 0, 1, 2, 3, ..., n,
P(X ≤ a) ≈ P(Z ≤ (a + .5 - np) / √(npq))
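To see how close the approximation is, it can be compared with the exact binomial sum on a small case. The sketch below is a minimal Python illustration; the function names and the test values n = 40, p = .5, a = 18 are hypothetical choices of ours, not taken from the notes. The approximation function refuses to run when the rule of thumb np ≥ 5 and nq ≥ 5 fails.

```python
from math import comb, sqrt
from statistics import NormalDist

def binom_cdf_exact(a, n, p):
    """Exact P(X <= a), summing the binomial formula over 0..a."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(a + 1))

def binom_cdf_normal_approx(a, n, p):
    """Normal approximation to P(X <= a) with the +.5 continuity correction."""
    q = 1 - p
    if n * p < 5 or n * q < 5:
        raise ValueError("rule of thumb np >= 5 and nq >= 5 is not met")
    z = (a + 0.5 - n * p) / sqrt(n * p * q)
    return NormalDist().cdf(z)

# Hypothetical illustration values.
n, p, a = 40, 0.5, 18
exact = binom_cdf_exact(a, n, p)
approx = binom_cdf_normal_approx(a, n, p)
print(f"exact  P(X <= {a}) = {exact:.4f}")
print(f"approx P(X <= {a}) = {approx:.4f}")
```

The +.5 is the continuity correction: the bar for X = a in the binomial histogram extends from a - .5 to a + .5, so the area up to and including that bar ends at a + .5.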
Example: An automotive plant employs 5000 workers and suffers a daily absentee rate of
1%.
(a) What is the mean (or expected value) of the number of absentees on a given day?
(b) What are the variance and standard deviation of the number of absentees on a given day?
(c) Find the probability that on a given day
(i) at most 30 of the workers are absent,
(ii) at least 60 of the workers are absent,
(iii) between 30 and 60 (inclusive) of the workers are absent.
Solution: Let X = the number of workers absent on a given day. Then X is
binomial with n = 5000 and p = .01.
(a) μ = np =
(b) σ² = npq =
σ =
(c) Since these values of n and p are not in Table C (Binomial
Probabilities), and since using the binomial formula would take a very
long time (not to mention the difficulty of the calculation when n =
5000), we use the normal approximation to the binomial.
Check: np =          ; nq =
(i) P(X ≤ 30) ≈
(ii) P(X ≥ 60) ≈
(iii) P(30 ≤ X ≤ 60) ≈
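The steps of this solution can be reproduced in a few lines of Python, offered here as a sketch for checking hand calculations rather than as the method the notes prescribe. The helper name `approx_cdf` is ours. The key step in (ii) and (iii) is rewriting each event in terms of "at most" probabilities before applying the continuity correction: P(X ≥ 60) = 1 - P(X ≤ 59), and P(30 ≤ X ≤ 60) = P(X ≤ 60) - P(X ≤ 29).

```python
from math import sqrt
from statistics import NormalDist

n, p = 5000, 0.01                     # values from the example
q = 1 - p
mu = n * p                            # (a) mean
variance = n * p * q                  # (b) variance
sigma = sqrt(variance)                # (b) standard deviation

# Rule-of-thumb check before using the approximation.
assert n * p >= 5 and n * q >= 5

Z = NormalDist()                      # standard normal

def approx_cdf(a):
    """Normal approximation to P(X <= a) with the +.5 continuity correction."""
    return Z.cdf((a + 0.5 - mu) / sigma)

p_at_most_30 = approx_cdf(30)                     # (i)
p_at_least_60 = 1 - approx_cdf(59)                # (ii)
p_30_to_60 = approx_cdf(60) - approx_cdf(29)      # (iii)

print(f"mean = {mu:.1f}, variance = {variance:.1f}, sd = {sigma:.4f}")
print(f"(i)   P(X <= 30)       is approximately {p_at_most_30:.4f}")
print(f"(ii)  P(X >= 60)       is approximately {p_at_least_60:.4f}")
print(f"(iii) P(30 <= X <= 60) is approximately {p_30_to_60:.4f}")
```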
Example : A travel agency promotes vacation packages by telephoning households
at random in the evening hours. Historically, only 65% of heads of households are
at home when the agency calls. Suppose that 30 households are phoned in a given
evening. Find the probability that the agency will find
(a) between 15 and 25 households, inclusively, with the head of the household at
home.
(b) fewer than 23 households with the head of the household at home.
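(c) more than 17 but fewer than 28 households with the head of the household at
home, i.e. P(17 < X < 28), where X is the number of such households.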
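Here n = 30 is small enough that the exact binomial probabilities can be computed alongside the normal approximation, which gives a feel for the size of the approximation error. The sketch below assumes X = the number of households (out of 30) with the head of the household at home, so X is binomial with n = 30 and p = .65; the function names are our own. Strict inequalities are first converted to inclusive ranges of whole numbers: "fewer than 23" is X ≤ 22, and 17 < X < 28 is 18 ≤ X ≤ 27.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 30, 0.65
q = 1 - p
mu, sigma = n * p, sqrt(n * p * q)
assert n * p >= 5 and n * q >= 5      # rule-of-thumb check

Z = NormalDist()                      # standard normal

def approx_cdf(a):
    """Normal approximation to P(X <= a) with the +.5 continuity correction."""
    return Z.cdf((a + 0.5 - mu) / sigma)

def exact_cdf(a):
    """Exact binomial P(X <= a)."""
    return sum(comb(n, k) * p**k * q**(n - k) for k in range(a + 1))

# (a) 15 <= X <= 25   (b) X <= 22   (c) 18 <= X <= 27
approx_a = approx_cdf(25) - approx_cdf(14)
approx_b = approx_cdf(22)
approx_c = approx_cdf(27) - approx_cdf(17)

exact_a = exact_cdf(25) - exact_cdf(14)
exact_b = exact_cdf(22)
exact_c = exact_cdf(27) - exact_cdf(17)

for label, approx, exact in [("(a)", approx_a, exact_a),
                             ("(b)", approx_b, exact_b),
                             ("(c)", approx_c, exact_c)]:
    print(f"{label} normal approx = {approx:.4f}, exact binomial = {exact:.4f}")
```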