Chapter 1.4: Distribution of Continuous R.V.: Normal Distribution

Topics:
§1.4 What the Normal distribution is; its density function, mean, and variance
Standard Normal Distribution: (a) Calculating probabilities
(b) Calculating percentiles
General Normal Distribution: (a) Calculating probabilities
(b) Calculating percentiles
------------------------------------------------------------------------------------------------------
I. Normal random variable/Normal Distribution

A distribution for describing continuous random variables

Two common ways to describe a Normal distribution
1. Density plot
 Shape:
 Symmetric, centered at μ, which is also the median and the mean.
 Can be fully specified via two parameters: μ and σ². The distribution is
denoted by N(μ, σ); note that the second argument in this notation is the standard
deviation, not the variance. It can be shown that σ² is the variance of X (σ is the
standard deviation of X).
Ex. [Figure: example normal density curves labeled A, B, C, and D]
2. Density function (for your reference):
$$f(x; \mu, \sigma) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left\{ -\frac{(x-\mu)^2}{2\sigma^2} \right\}$$
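As a rough check of the density formula, it can be coded directly and compared with Python's standard-library `statistics.NormalDist`. This is only a minimal sketch: the helper name `normal_pdf` and the evaluation point x = 80 are illustrative choices, and the parameters are borrowed from the test-score example N(75, 5).

```python
import math
from statistics import NormalDist


def normal_pdf(x: float, mu: float, sigma: float) -> float:
    """Density of N(mu, sigma) at x, written out term by term from the formula."""
    coeff = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    exponent = -((x - mu) ** 2) / (2.0 * sigma ** 2)
    return coeff * math.exp(exponent)


# Illustrative parameters: mean 75 and SD 5 (assumed for this sketch)
mu, sigma = 75.0, 5.0

by_formula = normal_pdf(80.0, mu, sigma)
by_library = NormalDist(mu, sigma).pdf(80.0)

print(f"formula: {by_formula:.6f}")
print(f"library: {by_library:.6f}")
```

The two values should agree up to floating-point rounding. Evaluating `normal_pdf` at 70 and at 80 gives the same height, which reflects the symmetry about μ.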
What problems are we interested in solving regarding the normal distribution?
1. Know how to calculate probabilities from a given normal distribution
Ex. Test score X ~ N(75, 5). P(90 < X < 100)? P(X < 60)?
2. Be able to identify the percentiles of the population
Ex. Test score X ~ N(75, 5). (1) What are the largest 10% of the scores?
That is, we want to find x0.1 such that P[X ≥ x0.1] = 10% = 0.1
(2) What are the most extreme 5% of the scores?
That is, we want to find x0.05 such that P[X ≥ x0.05] = 5% = 0.05
II. Standard Normal Distribution

Normal distribution with mean=0 and SD=1. Denoted by N(0, 1).

Usually use Z to denote a standard normal r.v.

Why learn the standard normal distribution?
o Area under the normal curve can only be calculated numerically.
So statisticians have established a table that shows the left-tail area under the
standard normal curve for any given number (see the very first page of the
textbook).
o Later we can use this table to solve problems for any normal distribution.
How? One can STANDARDIZE any given N(μ, σ) to N(0, 1), and then use the
area table of the standard normal to solve the problem (Your HW2, Question #2, 2.61)

Use the area table of the standard normal curve
(1) Calculate probability
Ex. A variable Z ~ N(0, 1). Calculate the following probabilities:
1. P(Z ≤ 1.25) = 0.8944
2. P(Z ≤ -1.25) = 0.1056 (= 1 - 0.8944, by symmetry)
3. P(Z ≥ -1.25) = 1 - P(Z ≤ -1.25) = 1 - 0.1056 = 0.8944
4. P(-0.38 ≤ Z ≤ 0.25) = P[Z ≤ 0.25] - P[Z ≤ -0.38] = 0.5987 - 0.3520 = 0.2467
In general, P[a ≤ Z ≤ b] = P[Z ≤ b] - P[Z ≤ a].
5. P(Z ≤ -6) < P[Z ≤ -3.89] = 0.0000 (table value; the probability is essentially zero)
6. P(Z ≥ 2) = 1 - P[Z < 2] = 1 - P[Z ≤ 2] = 1 - 0.9772 = 0.0228
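The table lookups above can be cross-checked numerically. The sketch below assumes Python 3.8+ and uses `statistics.NormalDist().cdf`, which plays the role of the left-tail area table; the variable names `p1` to `p6` are illustrative and simply follow the item numbers.

```python
from statistics import NormalDist

Z = NormalDist(mu=0.0, sigma=1.0)  # standard normal; cdf(z) is the left-tail area

p1 = Z.cdf(1.25)                   # P(Z <= 1.25)
p2 = Z.cdf(-1.25)                  # P(Z <= -1.25)
p3 = 1.0 - Z.cdf(-1.25)            # P(Z >= -1.25), by the complement rule
p4 = Z.cdf(0.25) - Z.cdf(-0.38)    # P(-0.38 <= Z <= 0.25)
p5 = Z.cdf(-6.0)                   # P(Z <= -6), far below the table's range
p6 = 1.0 - Z.cdf(2.0)              # P(Z >= 2)

for label, p in [("1", p1), ("2", p2), ("3", p3), ("4", p4), ("5", p5), ("6", p6)]:
    print(f"item {label}: {p:.4f}")
```

Because Z is continuous, strict and non-strict inequalities give the same probability, so the same `cdf` call serves for both.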
(2) Obtain extreme values
Ex1. A variable Z ~ N(0, 1). Find the z* that satisfies each probability statement:
1. P(Z ≤ z*) = 0.1
z* ≈ -1.28 (the exact value is -1.281552)
2. P(Z ≤ z*) = 0.5
z* = 0
3. P(Z ≥ z* or Z ≤ -z*) = 0.1
By symmetry of N(0,1), P[Z ≤ -z*] = 0.05, so -z* = -1.645 and z* = 1.645.
Ex2. Consider a standard Normal r.v. Z ~ N(0,1). At what value z* is the area to the
right equal to 2.5%?
We want to find z* such that P[Z ≥ z*] = 0.025, or equivalently P[Z ≤ z*] = 0.975. The value is
z* = 1.96.
Ex3. Consider a standard Normal r.v. Z ~ N(0,1). At what value z* is the area
between -z* and z* equal to 68%?
P[-z* ≤ Z ≤ z*] = 0.68 ⇒ P[Z ≤ -z*] = (1 - 0.68)/2 = 0.16 ⇒ -z* = -0.995, so
z* = 0.995.
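Reading the table "backwards" to find a percentile corresponds to the inverse of the left-tail area function. The following sketch uses `statistics.NormalDist().inv_cdf` for that purpose; the variable names are illustrative, and each call is set up by first rewriting the statement as a left-tail probability, as in the examples above.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, SD 1

# Ex1: each statement rewritten as P(Z <= z*) = p
z_10 = Z.inv_cdf(0.10)         # item 1
z_50 = Z.inv_cdf(0.50)         # item 2
z_two_sided = Z.inv_cdf(0.95)  # item 3: 0.05 in each tail, so 0.95 to the left of z*

# Ex2: area to the right is 2.5%, so area to the left is 97.5%
z_right = Z.inv_cdf(0.975)

# Ex3: a middle area of 68% leaves 16% in each tail, so 84% lies to the left of z*
z_middle = Z.inv_cdf(1.0 - (1.0 - 0.68) / 2.0)

print(f"{z_10:.4f} {z_50:.4f} {z_two_sided:.4f} {z_right:.4f} {z_middle:.4f}")
```

The computed values carry more digits than the table, so small differences in the third decimal place from the table-based answers are expected.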
III. General Normal Distribution

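If X has a normal distribution with mean μ and SD σ, then we can standardize X to Z by
$$Z = \frac{X - \mu}{\sigma},$$
which has the standard normal distribution.
Therefore,
$$P[a \le X \le b] = P\left[\frac{a-\mu}{\sigma} \le \frac{X-\mu}{\sigma} \le \frac{b-\mu}{\sigma}\right] = P[a' \le Z \le b'], \quad \text{where } a' = \frac{a-\mu}{\sigma} \text{ and } b' = \frac{b-\mu}{\sigma}.$$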
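The standardization step can be sketched as a small function: convert both endpoints to the Z scale, then use only the standard normal left-tail area. The function name `prob_between` and the numbers in the usage lines (mean 75, SD 5, interval 70 to 85) are hypothetical, chosen just to illustrate the idea.

```python
from statistics import NormalDist


def prob_between(a: float, b: float, mu: float, sigma: float) -> float:
    """P[a <= X <= b] for X ~ N(mu, sigma), using only the standard normal cdf."""
    a_std = (a - mu) / sigma  # standardized lower endpoint
    b_std = (b - mu) / sigma  # standardized upper endpoint
    Z = NormalDist()          # N(0, 1)
    return Z.cdf(b_std) - Z.cdf(a_std)


# Illustrative values (assumed for this sketch)
via_standardizing = prob_between(70.0, 85.0, mu=75.0, sigma=5.0)

# Same probability computed directly on the original scale, for comparison
X = NormalDist(mu=75.0, sigma=5.0)
direct = X.cdf(85.0) - X.cdf(70.0)

print(f"{via_standardizing:.6f} {direct:.6f}")
```

Both routes should give the same number up to rounding, which is the point of standardizing: one table for N(0, 1) is enough for every normal distribution.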
Calculating probability and percentiles
Ex. A variable X ~ N(100, 5). Calculate the following probabilities:
1. P(90 ≤ X ≤ 125) = P[a' ≤ Z ≤ b'], where
a' = (90 - 100)/5 = -2 and b' = (125 - 100)/5 = 5. So
P(90 ≤ X ≤ 125) = P(-2 ≤ Z ≤ 5) = P(Z ≤ 5) - P(Z ≤ -2) ≈ 1 - 0.0228 = 0.9772.
2. P(X ≥ 98) = P[Z ≥ (98 - 100)/5 = -0.4] = 1 - P[Z ≤ -0.4] = 1 - 0.3446 = 0.6554
3. Find the x* such that P(X ≤ x*) = 0.1
P[X ≤ x*] = P[(X - 100)/5 ≤ (x* - 100)/5] = P[Z ≤ (x* - 100)/5] = 0.1. But P[Z ≤ -1.28] = 0.1, so
(x* - 100)/5 = -1.28, which gives x* = 100 + 5*(-1.28) = 93.6
4. Find the range that contains the MIDDLE 90% of the observations: we want to find a
such that X is in [100 - a, 100 + a] with 90% probability
P[100 - a ≤ X ≤ 100 + a] = P[-a/5 ≤ (X - 100)/5 ≤ a/5] = P[-a/5 ≤ Z ≤ a/5] = 0.9 ⇒ P[Z ≤ -a/5] = 0.05
So -a/5 = -1.645, a = 5*1.645 = 8.225. The range is [100 - 8.225, 100 + 8.225] =
[91.775, 108.225]
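The four items for X ~ N(100, 5) can be reproduced in a few lines. This is a sketch using `statistics.NormalDist`, with illustrative variable names; item 4 deliberately mirrors the hand calculation (half-width = SD times the standard normal 95th percentile) rather than calling `inv_cdf` on X directly.

```python
from statistics import NormalDist

X = NormalDist(mu=100.0, sigma=5.0)
Z = NormalDist()  # standard normal, used for item 4

p_between = X.cdf(125.0) - X.cdf(90.0)  # item 1: P(90 <= X <= 125)
p_above = 1.0 - X.cdf(98.0)             # item 2: P(X >= 98)
x_10 = X.inv_cdf(0.10)                  # item 3: 10th percentile of X

# item 4: middle 90% leaves 5% in each tail, so the half-width is sigma * z_0.95
half_width = X.stdev * Z.inv_cdf(0.95)
middle_90 = (X.mean - half_width, X.mean + half_width)

print(f"{p_between:.4f} {p_above:.4f} {x_10:.3f}")
print(f"[{middle_90[0]:.3f}, {middle_90[1]:.3f}]")
```

The results differ slightly from the hand answers in the last digits because the table values -1.28 and 1.645 are rounded.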
Ex. X is the diameter (in mm) of tires, normally distributed with mean 575 and SD 5.
1. P(575 < X < 579) = P[(575 - 575)/5 < (X - 575)/5 < (579 - 575)/5]
= P[0 < Z < 0.8] = P[Z < 0.8] - 0.5 = 0.7881 - 0.5 = 0.2881.
2. P(575 ≤ X ≤ 579) = 0.2881 (the same as item 1, because a continuous r.v. takes any single value with probability 0)
3. Find the diameter x* such that only 1% of tires have a larger diameter.
That is, P[X > x*] = 0.01 or equivalently P[X < x*] = 0.99. Since P[Z < 2.33] ≈
0.99, x* = 575 + 5 * 2.33 = 586.65.
4. Find the (diameters of) tires that have the most extreme 5% of diameters (here, the largest 5%).
That is, P[X > x*] = 0.05 or equivalently P[X < x*] = 0.95. Since P[Z < 1.645] =
0.95, x* = 575 + 5 * 1.645 = 583.225.
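One informal way to sanity-check the tire answers is by simulation: draw many diameters from N(575, 5) and count how often each event occurs. The sketch below is not part of the table-based method; the sample size and seed are arbitrary assumptions, and `NormalDist.samples` is the standard-library sampler.

```python
from statistics import NormalDist

tire = NormalDist(mu=575.0, sigma=5.0)

n = 100_000                          # arbitrary sample size for this sketch
diameters = tire.samples(n, seed=0)  # fixed seed so the run is repeatable

# Item 1: fraction of simulated diameters strictly between 575 and 579
frac_between = sum(575.0 < d < 579.0 for d in diameters) / n

# Items 3 and 4: fraction above the cutoffs found by hand
frac_above_1pct_cutoff = sum(d > 586.65 for d in diameters) / n
frac_above_5pct_cutoff = sum(d > 583.225 for d in diameters) / n

print(frac_between, frac_above_1pct_cutoff, frac_above_5pct_cutoff)
```

The three fractions should land near 0.2881, 0.01, and 0.05 respectively, with some random variation from run to run if the seed is changed.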
Putting everything together: an overall example
The diameter of a tire follows a normal distribution with mean 575 and SD 5. We have 4 tires,
and the diameters of these tires are independent of each other.
(a) What is the probability that a tire has its diameter between 570 and 580?
Let Xi be the diameter of tire i. Then P[570 < Xi < 580] = P[-1 < Z < 1] = 0.8413 - 0.1587 =
0.6826.
(b) What is the probability that all 4 tires have diameters between 570 and 580?
Let Ai = [570 < Xi < 580]. Then A1, A2, A3, A4 are independent. So
P[A1 ∩ A2 ∩ A3 ∩ A4] = P(A1) P(A2) P(A3) P(A4) = 0.6826^4 = 0.2171
(c) What is the probability that at least one tire is not between 570 and 580?
This probability = 1 - P[all tires are between 570 and 580]
= 1 - P[A1 ∩ A2 ∩ A3 ∩ A4] = 1 - 0.2171 = 0.7829
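Parts (a) to (c) chain together a normal probability, the multiplication rule for independent events, and the complement rule. A minimal sketch of that chain, with illustrative variable names and assuming independence exactly as stated in the example:

```python
from statistics import NormalDist

tire = NormalDist(mu=575.0, sigma=5.0)
n_tires = 4

# (a) one tire within 570..580, i.e. within one SD of the mean
p_one = tire.cdf(580.0) - tire.cdf(570.0)

# (b) all four within range: independent events, so the probabilities multiply
p_all = p_one ** n_tires

# (c) at least one tire out of range: complement of (b)
p_at_least_one_out = 1.0 - p_all

print(f"(a) {p_one:.4f}  (b) {p_all:.4f}  (c) {p_at_least_one_out:.4f}")
```

The computed values can differ from the hand answers in the fourth decimal place, since the table entries 0.8413 and 0.1587 are themselves rounded before being raised to the fourth power.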