Student version

Topic 3-5 Normal distribution
The PDF
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2}} \equiv N(x;\, \mu, \sigma)$$ (3-5-1)
is called a normal (or Gaussian) distribution, where $\mu$ equals its mean and $\sigma$ gives the standard deviation (these
can be shown by integration). Note how $\mu$ and $\sigma$ alter the appearance of the PDF: the curve is symmetric about the
vertical line $x = \mu$, while an increase in $\sigma$ causes it to flatten and spread outwards. Fig. 3-5-1 shows two normal
curves with the same $\mu$ but different $\sigma$s. One physical situation modeled by this is when X is an observable's
measured value, which is the true value plus a small random error. This error is equally likely to be positive or
negative, hence the curve is symmetric about $\mu$, interpreted as the true value. Moreover, smaller errors are more
likely than larger ones, hence the bell shape characterized by $\sigma$, the degree of precision.
[Fig. 3-5-1: PDF of X plotted against X (4.98 to 5.02) for two normal curves with the same mean ($\mu = 5$): higher precision ($\sigma = 0.002$) and lower precision ($\sigma = 0.006$).]
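As a rough numerical illustration of how $\sigma$ controls the shape, the following minimal sketch evaluates (3-5-1) at the peak $x = \mu$ for the two curves of Fig. 3-5-1. The function name `normal_pdf` is an illustrative choice, not something taken from these notes.

```python
import math

def normal_pdf(x, mu, sigma):
    """Evaluate N(x; mu, sigma) from equation (3-5-1)."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

mu = 5.0  # common mean of both curves in Fig. 3-5-1
for sigma in (0.002, 0.006):
    # The peak sits at x = mu, where the exponential factor equals 1,
    # so the height is 1 / (sigma * sqrt(2 * pi)).
    peak = normal_pdf(mu, mu, sigma)
    print(f"sigma = {sigma}: peak height f(mu) = {peak:.1f}")
```

The peak for $\sigma = 0.002$ comes out close to 200 and the peak for $\sigma = 0.006$ is one third of that, which agrees with the vertical scale of the figure: tripling $\sigma$ makes the curve three times lower and three times wider.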
The probability that X assumes any value between a and b is obtained by integrating f(x) from x = a to x = b,
$$P(a \le X \le b) = \int_{a}^{b} \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2}}\, dx$$ (3-5-2)
A standard way to evaluate this integral is to first introduce the variable change
$$Z = \frac{x-\mu}{\sigma}$$ (3-5-3)
into (3-5-2), hence
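$$P(a \le X \le b) = \int_{\frac{a-\mu}{\sigma}}^{\frac{b-\mu}{\sigma}} \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2}z^{2}}\, dz = \int_{\frac{a-\mu}{\sigma}}^{\frac{b-\mu}{\sigma}} y\, dz$$ (3-5-4)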
where $y \equiv y(z) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2}z^{2}}$ is $N(z; 0, 1)$, called the standard normal distribution. Its picture is shown
in Fig. 3-5-2.
[Fig. 3-5-2: the standard normal distribution $N(z; 0, 1)$ plotted against z (from -3 to 3, peak height about 0.4), with a point z* marked on the axis and an area labelled as the probability $P(Z \le z^*)$.]
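The variable change (3-5-3) can be checked numerically. The sketch below integrates both sides of (3-5-4) with a composite Simpson's rule and compares them. The limits a and b and the helper names are illustrative assumptions; the parameters are those of the lower-precision curve in Fig. 3-5-1.

```python
import math

def normal_pdf(x, mu, sigma):
    """Evaluate N(x; mu, sigma) from equation (3-5-1)."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def simpson(f, lo, hi, n=2000):
    """Composite Simpson's rule with an even number n of subintervals."""
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(lo + i * h)
    return total * h / 3.0

mu, sigma = 5.0, 0.006   # lower-precision curve of Fig. 3-5-1
a, b = 4.99, 5.01        # illustrative limits (an assumption)

# Left side of (3-5-4): integrate f(x) over [a, b], as in (3-5-2).
p_x = simpson(lambda x: normal_pdf(x, mu, sigma), a, b)

# Right side of (3-5-4): integrate N(z; 0, 1) over the transformed limits.
za, zb = (a - mu) / sigma, (b - mu) / sigma
p_z = simpson(lambda z: normal_pdf(z, 0.0, 1.0), za, zb)

print(f"direct: {p_x:.6f}   standardized: {p_z:.6f}")
```

The two results should agree to many decimal places, which is why one table for $N(z; 0, 1)$ serves every choice of $\mu$ and $\sigma$.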
Now, since the standard normal curve is symmetric about z = 0, the area on each side being 1/2 (since the total area
under the curve must be 1, i.e. $\int_{-\infty}^{\infty} y\, dz = 1$), it suffices to compute the definite integral
$\int_{0}^{z^*} \frac{1}{\sqrt{2\pi}}\, e^{-\frac{z^{2}}{2}}\, dz$ for
positive values of z* only, which is done and tabulated in most probability/statistics texts.
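When a printed table is not at hand, the tabulated area can be reproduced with the error function, since $\int_0^{z^*} y\, dz = \frac{1}{2}\operatorname{erf}(z^*/\sqrt{2})$. The sketch below is one way to do this in Python; the function names `table_area` and `prob_z_below` are hypothetical, chosen only for this illustration.

```python
import math

def table_area(z_star):
    """Area under N(z; 0, 1) between 0 and z_star, for z_star >= 0."""
    if z_star < 0:
        raise ValueError("the table covers non-negative z* only")
    return 0.5 * math.erf(z_star / math.sqrt(2.0))

def prob_z_below(z):
    """P(Z <= z) for any z, using the symmetry of the curve about z = 0."""
    if z >= 0:
        return 0.5 + table_area(z)
    return 0.5 - table_area(-z)

print(round(table_area(1.2), 4))     # 0.3849
print(round(prob_z_below(-1.2), 4))  # 0.1151
```

Negative arguments are handled purely by symmetry, mirroring how a one-sided table is used by hand.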
Solved problems
Problem 3-5-1
Suppose that the height (in inches) of a 25-year-old man is normal with $\mu = 71$, $\sigma^2 = 6.25$.
(a) What percentage of 25-year-old men are over 6 feet 2 inches tall? (ans. 0.12)
(b) What percentage of 25-year-old males taller than 6' are over 6 foot 5 inches? (ans. 0.024)
Solution:
(a) Let X be the height of a 25-year-old man. The standard deviation is $\sigma = \sqrt{6.25} = 2.5$, so
P(X > 74) = P((X - 71)/2.5 > (74 - 71)/2.5)
= P(Z > 1.2) = 0.5 - 0.3849 = 0.1151 ≈ 0.12,
i.e. about 12% are over 6 feet 2 inches tall.
(b) Given that X > 6', the conditional probability is
P(X > 77" | X > 6') = P(X > 77" and X > 6') / P(X > 6')
= P(X > 77) / P(X > 72)
= [1 - $\Phi$((77 - 71)/2.5)] / [1 - $\Phi$((72 - 71)/2.5)]
= [1 - $\Phi$(2.4)] / [1 - $\Phi$(0.4)] = (0.5000 - 0.4918) / (0.5000 - 0.1554)
= 0.0238,
where $\Phi(z) = P(Z \le z)$, i.e. 2.4% of 25-year-old men taller than 6' are over 6 foot 5 inches.
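The two answers of Problem 3-5-1 can be cross-checked without a table. This is a minimal sketch using the error function for $\Phi$; the variable names are illustrative.

```python
import math

def phi(z):
    """Standard normal CDF, P(Z <= z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

mu = 71.0                 # mean height in inches
sigma = math.sqrt(6.25)   # standard deviation = 2.5 inches

# (a) P(X > 74), i.e. taller than 6 feet 2 inches
p_over_74 = 1.0 - phi((74.0 - mu) / sigma)

# (b) P(X > 77 | X > 72) = P(X > 77) / P(X > 72)
p_over_77 = 1.0 - phi((77.0 - mu) / sigma)
p_over_72 = 1.0 - phi((72.0 - mu) / sigma)
p_cond = p_over_77 / p_over_72

print(f"(a) {p_over_74:.4f}   (b) {p_cond:.4f}")
```

Note that the standardization divides by $\sigma = 2.5$, the square root of the given variance, not by 6.25.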
Problem 3-5-2
A foundation engineer estimates that the settlement of a proposed structure will not exceed 2 inches with 95%
probability. From a record of performance of many similar structures built on similar soil conditions, he finds
that the coefficient of variation of the settlement is about 20%. If a normal distribution is assumed for the
settlement of the proposed structure, what is the probability that the proposed structure will settle more than 2.5
inches? (ans. 0.00047)
Solution:
Let X be the settlement of the proposed structure; $X \sim N(\mu_X, \sigma_X)$. Given probability:
$$P(X \le 2) = 0.95$$ (i)
$$\Rightarrow\; P\!\left(\frac{X-\mu_X}{\sigma_X} \le \frac{2-\mu_X}{\sigma_X}\right) = 0.95$$ (ii)
Also, since we're given the C.O.V. $= \sigma_X/\mu_X = 0.2$,
$$\sigma_X = 0.2\,\mu_X$$ (iii)
Substituting (iii) into (ii),
$$P\!\left(Z \le \frac{2-\mu_X}{0.2\,\mu_X}\right) = 0.95,$$
whereas Ang & Tang Table A.1 gives
$$P(Z \le 1.645) = 0.95$$
Comparing the two equalities gives $(2-\mu_X)/(0.2\,\mu_X) = 1.645$. Solving this algebraic equation for $\mu_X$, we have
$\mu_X = 2/1.329 \approx 1.504891$, so by (iii)
$\sigma_X \approx 0.300978$
With these parameters found, we may calculate the desired probability
$$P(X > 2.5) = P\!\left(\frac{X-\mu_X}{\sigma_X} > \frac{2.5-1.504891}{0.300978}\right) = P(Z > 3.30625) = 1 - P(Z \le 3.30625) = 1 - 0.99953 \approx 0.00047$$
Here's how you can use Excel to compute $P(Z \le 3.30625)$: in any blank cell, type in:
=NORMSDIST(3.30625)
(don't forget the "=" sign and the "S" in NORMSDIST)
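The same calculation can be sketched in Python as an alternative to the spreadsheet. The value 1.645 is the table value quoted in the solution; the variable names are illustrative.

```python
import math

def phi(z):
    """Standard normal CDF, P(Z <= z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

z95 = 1.645   # table value from the solution: P(Z <= 1.645) = 0.95
cov = 0.2     # coefficient of variation, sigma_X / mu_X

# (2 - mu) / (cov * mu) = z95  =>  mu = 2 / (1 + cov * z95)
mu = 2.0 / (1.0 + cov * z95)
sigma = cov * mu

# Probability that the settlement exceeds 2.5 inches
z = (2.5 - mu) / sigma
p_exceed = 1.0 - phi(z)

print(f"mu = {mu:.6f}, sigma = {sigma:.6f}, z = {z:.5f}, P = {p_exceed:.5f}")
```

Because $\sigma_X$ is proportional to $\mu_X$, the standardized value simplifies to $12.5/\mu_X - 5$, which is why z comes out as exactly 3.30625 for this choice of table value.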
Exercises
Exercise 3-5-1
A student has submitted a concrete cylinder to the concrete cylinder strength contest in Engineering Open
House. Suppose the strength of her concrete cylinder is normally distributed as N(80, 20) in kips. She was
scheduled to be the last contestant for load testing. Immediately prior to her cylinder being tested, the two
highest strengths in the contest thus far are 100 and 70 kips.
(a) What is the probability that she will be the second place winner?
(b) Suppose her cylinder is being tested, and it has not shown any sign of distress at a load of 90 kips. What is
the probability that she will win first place?
(c) Suppose she submitted a similar cylinder to another contest. Her boyfriend used an alternative procedure
to make his concrete cylinder, such that the strength is expected to be only 1% higher than hers, but the
c.o.v. is 50% higher. Who is more likely to score a higher strength in that contest? Justify.
Exercise 3-5-2
The daily SO2 concentration in the air for a given city is normally distributed with mean 0.03 ppm and a c.o.v. of
40%. Assume statistical independence between the SO2 concentrations on any 2 days. Suppose the criteria for
the clean air standard require that:
1. The weekly average SO2 concentration should not exceed 0.04 ppm.
2. The SO2 concentration should not exceed 0.075 ppm on more than 1 day during a given week.
Determine which one of these two criteria is more likely to be violated in this city. Substantiate your answer
with calculated probabilities.
Exercise 3-5-3
The daily flow rate of contaminant from an industrial plant is modelled by a normal random variable with mean
value 10 units and c.o.v. of 20%. When the contaminant flow rate exceeds 14 units on a given day, it is
considered excessive. Assume that the contaminant flow rates on any two days are statistically independent.
(a) What is the probability of having excessive contaminant flow rate on a given day?
(b) Regulation requires the measurement of contaminant flow rate for three days. The plant will be charged with
a violation if excessive contaminant flow rate is observed during the three day period. What is the
probability that the plant will not be charged with violation?
(c) Suppose there is a proposal to change the regulation such that contaminant flow rate will be measured for
five days and the plant will be charged with a violation if excessive contaminant flow rate is observed in
more than one of those five days. Will the plant be better off with the proposed change? Please justify.
(d) Return to Part (b). Although the plant cannot reduce the standard deviation of the daily contaminant flow
rate, it can reduce the mean daily contaminant flow rate by improving the chemical process. Suppose the
plant decides to limit the probability of violation to 1%. What should be the daily mean contaminant
overflow rate?