Topic 3-5 Normal distribution

The PDF

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2} \equiv N(x;\mu,\sigma) \tag{3-5-1}$$

is called a normal (or Gaussian) distribution, where $\mu$ equals its mean and $\sigma$ gives the standard deviation (these can be shown by integration). Note how $\mu$ and $\sigma$ alter the appearance of the PDF: the curve is symmetric about the vertical line $x = \mu$, while an increase in $\sigma$ causes it to flatten and spread outwards. Fig. 3-5-1 shows two normal curves with the same $\mu$ but different $\sigma$s.

One physical situation modeled by this is when X is an observable's measured value, which is the true value plus a small random error. This error is equally likely to be positive or negative, hence the curve is symmetric about $\mu$, interpreted as the true value. Moreover, smaller errors are more likely than larger ones, hence the bell shape characterized by $\sigma$, which measures the degree of precision (a smaller $\sigma$ means higher precision).

[Fig. 3-5-1: PDF of X for two normal curves with the same mean ($\mu = 5$): higher precision ($\sigma = 0.002$) and lower precision ($\sigma = 0.006$).]

The probability that X assumes any value between a and b is obtained by integrating f(x) from x = a to x = b,

$$P(a \le X \le b) = \int_a^b \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2} dx \tag{3-5-2}$$

A standard way to evaluate this integral is to first introduce the variable change

$$Z = \frac{X-\mu}{\sigma} \tag{3-5-3}$$

into (3-5-2), hence

$$P(a \le X \le b) = \int_{(a-\mu)/\sigma}^{(b-\mu)/\sigma} \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2}z^2} dz = \int_{(a-\mu)/\sigma}^{(b-\mu)/\sigma} y\, dz \tag{3-5-4}$$

where $y = y(z) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2}z^2}$ is $N(z;0,1)$, called the standard normal distribution. Its picture is shown in Fig. 3-5-2.

[Fig. 3-5-2: Standard normal distribution $N(z;0,1)$, with a value $z^*$ marked on the z axis and the associated probability $P(Z \le z^*)$.]

Now, since the standard normal curve is symmetric about z = 0, the area on each side being 1/2 (since the total area under the curve must be 1, i.e. $\int_{-\infty}^{\infty} y\, dz = 1$), it suffices to compute the definite integral

$$\int_0^{z^*} \frac{1}{\sqrt{2\pi}}\, e^{-\frac{z^2}{2}} dz$$

for positive values of $z^*$ only, which is done and tabulated in most probability/statistics texts.

Solved problems

Problem 3-5-1
Suppose that the height (in inches) of a 25-year-old man is normal with $\mu = 71$, $\sigma^2 = 6.25$.
(a) What percentage of 25-year-old men are over 6 feet 2 inches tall? (ans. 0.12)
(b) What percentage of 25-year-old men taller than 6 feet are over 6 feet 5 inches? (ans. 0.024)

Solution:
Here $\sigma = \sqrt{6.25} = 2.5$, and $\Phi$ denotes the standard normal CDF, $\Phi(z) = P(Z \le z)$.

(a) Let X be the height of a 25-year-old man (6 feet 2 inches = 74 inches).
P(X > 74) = P((X - 71)/2.5 > (74 - 71)/2.5) = P(Z > 1.2) = 0.5 - 0.3849 = 0.12,
i.e. 12% are over 6 feet 2 inches tall.

(b) Given that X > 72 (6 feet), the conditional probability is
P(X > 77 | X > 72) = P(X > 77 and X > 72) / P(X > 72)
= P(X > 77) / P(X > 72)
= [1 - $\Phi$((77 - 71)/2.5)] / [1 - $\Phi$((72 - 71)/2.5)]
= [1 - $\Phi$(2.4)] / [1 - $\Phi$(0.4)]
= (0.5000 - 0.4918) / (0.5000 - 0.1554)
= 0.0238,
i.e. 2.4% of 25-year-old men taller than 6 feet are over 6 feet 5 inches.

Problem 3-5-2
A foundation engineer estimates that the settlement of a proposed structure will not exceed 2 inches with 95% probability. From a record of performance of many similar structures built on similar soil conditions, he finds that the coefficient of variation of the settlement is about 20%. If a normal distribution is assumed for the settlement of the proposed structure, what is the probability that the proposed structure will settle more than 2.5 inches? (ans. 0.00047)

Solution:
Let X be the settlement of the proposed structure; $X \sim N(\mu_X, \sigma_X)$.
Given probability:

$$P(X \le 2) = 0.95 \tag{i}$$

$$\Rightarrow\; P\!\left(\frac{X-\mu_X}{\sigma_X} \le \frac{2-\mu_X}{\sigma_X}\right) = 0.95 \tag{ii}$$

Also, since we are given the C.O.V.,

$$\delta_X = \frac{\sigma_X}{\mu_X} = 0.2 \;\Rightarrow\; \sigma_X = 0.2\,\mu_X \tag{iii}$$

Substituting (iii) into (ii),

$$P\!\left(Z \le \frac{2-\mu_X}{0.2\,\mu_X}\right) = 0.95,$$

whereas Ang & Tang Table A.1 gives $P(Z \le 1.645) = 0.95$.

Comparing the two equalities gives $(2-\mu_X)/(0.2\,\mu_X) = 1.645$. Solving this algebraic equation for $\mu_X$, we have $\mu_X = 1.504890985$, so by (iii) $\sigma_X = 0.3009781790$.

With these parameters found, we may calculate the desired probability:

$$P(X > 2.5) = P\!\left(\frac{X-\mu_X}{\sigma_X} > \frac{2.5-1.504890985}{0.3009781790}\right) = P(Z > 3.30625) = 1 - P(Z \le 3.30625) = 1 - 0.999533 \approx 0.00047$$

Here's how you can use Excel to compute $P(Z \le 3.30625)$: in any blank cell, type in
=NORMSDIST(3.30625)
(don't forget the "=" sign and the "S" in NORMSDIST).

Exercises

Exercise 3-5-1
A student has submitted a concrete cylinder to the concrete cylinder strength contest in Engineering Open House. Suppose the strength of her concrete cylinder is normally distributed as N(80, 20) in kips. She was scheduled to be the last contestant for load testing. Immediately prior to her cylinder being tested, the two highest strengths in the contest thus far are 100 and 70 kips.
(a) What is the probability that she will be the second place winner?
(b) Suppose her cylinder is being tested, and it has not shown any sign of distress at a load of 90 kips. What is the probability that she will win first place?
(c) Suppose she submitted a similar cylinder to another contest. Her boyfriend used an alternative procedure to make his concrete cylinder, such that the strength is expected to be only 1% higher than hers, but the c.o.v. is 50% higher. Who is more likely to score a higher strength in that contest? Justify.

Exercise 3-5-2
The daily SO2 concentration in the air for a given city is normally distributed with mean 0.03 ppm and a c.o.v. of 40%. Assume statistical independence between the SO2 concentrations for any 2 days. Suppose the criteria for the clean air standard require that:
1. The weekly average SO2 concentration should not exceed 0.04 ppm.
2. The SO2 concentration should not exceed 0.075 ppm on more than 1 day during a given week.
Determine which one of these two criteria is more likely to be violated in this city. Substantiate your answer with calculated probabilities.

Exercise 3-5-3
The daily flow rate of contaminant from an industrial plant is modelled by a normal random variable with mean value 10 units and c.o.v. of 20%. When the contaminant flow rate exceeds 14 units on a given day, it is considered excessive. Assume that the contaminant flow rates on any two days are statistically independent.
(a) What is the probability of having an excessive contaminant flow rate on a given day?
(b) Regulation requires the measurement of contaminant flow rate for three days. The plant will be charged with a violation if an excessive contaminant flow rate is observed during the three-day period. What is the probability that the plant will not be charged with a violation?
(c) Suppose there is a proposal to change the regulation such that the contaminant flow rate will be measured for five days and the plant will be charged with a violation if an excessive contaminant flow rate is observed on more than one of those five days. Will the plant be better off with the proposed change? Please justify.
(d) Return to Part (b). Although the plant cannot reduce the standard deviation of the daily contaminant flow rate, it can reduce the mean daily contaminant flow rate by improving the chemical process. Suppose the plant decides to limit the probability of violation to 1%. What should the mean daily contaminant flow rate be?
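
Numerical check of the solved problems

The standardization in (3-5-3) and (3-5-4), and the table lookups used in Problems 3-5-1 and 3-5-2, can also be checked numerically. The following is a minimal Python sketch, not part of the original notes; it assumes Python 3.8 or later for the standard-library `statistics.NormalDist` class, and the helper names `prob_between` and `prob_exceeds` are hypothetical, chosen only for illustration.

```python
from statistics import NormalDist

# Standard normal distribution N(z; 0, 1)
STD_NORMAL = NormalDist(0.0, 1.0)


def prob_between(a, b, mu, sigma):
    """P(a <= X <= b) for X ~ N(mu, sigma), using Z = (X - mu) / sigma."""
    z_a = (a - mu) / sigma
    z_b = (b - mu) / sigma
    return STD_NORMAL.cdf(z_b) - STD_NORMAL.cdf(z_a)


def prob_exceeds(x, mu, sigma):
    """P(X > x) for X ~ N(mu, sigma), using Z = (X - mu) / sigma."""
    return 1.0 - STD_NORMAL.cdf((x - mu) / sigma)


# Problem 3-5-1: heights with mu = 71 and sigma^2 = 6.25, so sigma = 2.5
mu_h = 71.0
sigma_h = 6.25 ** 0.5
p_a = prob_exceeds(74, mu_h, sigma_h)                                    # part (a)
p_b = prob_exceeds(77, mu_h, sigma_h) / prob_exceeds(72, mu_h, sigma_h)  # part (b)

# Problem 3-5-2: P(X <= 2) = 0.95 with sigma_X = 0.2 * mu_X.
# (2 - mu_X) / (0.2 * mu_X) = z95  =>  mu_X = 2 / (1 + 0.2 * z95)
z95 = STD_NORMAL.inv_cdf(0.95)   # the table value 1.645, at full precision
mu_s = 2.0 / (1.0 + 0.2 * z95)
sigma_s = 0.2 * mu_s
p_settle = prob_exceeds(2.5, mu_s, sigma_s)

print(f"3-5-1 (a): {p_a:.4f}")
print(f"3-5-1 (b): {p_b:.4f}")
print(f"3-5-2: mu_X = {mu_s:.4f}, sigma_X = {sigma_s:.4f}, P(X > 2.5) = {p_settle:.5f}")
```

Because `inv_cdf(0.95)` is not rounded to 1.645, the intermediate values of $\mu_X$ and $\sigma_X$ differ from the hand calculation in the later decimal places, but the results should agree with the tabulated answers to the precision quoted (0.12, 0.024 and 0.00047).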