Topic 5-1 Confidence intervals

Solved problems

Problem 5-1-1

The average speed of vehicles on a highway is being studied.

(a) Suppose observations on 50 vehicles yield a sample mean of 65 mph. Assume the standard deviation of vehicle speed is known and equal to 6 mph. Determine the 2-sided 99% confidence interval for the mean speed.
(b) In part (a), how many additional vehicles' speeds should be observed so that the mean speed can be estimated to within ±1 mph with 99% confidence?
(c) Suppose John and Mary are assigned to collect data on the speed of vehicles on this highway. After each person has separately observed 10 vehicles, what is the probability that John's sample mean will exceed Mary's sample mean by more than 2 mph? (Hint: John's sample mean will be normally distributed about the true mean μ.)
(d) Repeat part (c) if each person has separately observed 100 vehicles instead.

Solution:

(a) Since $\bar{x} = 65$, $n = 50$ and $\sigma = 6$, a 2-sided 99% interval for the true mean is given by the limits

$$65 \pm z_{0.005}\,\frac{6}{\sqrt{50}},$$

where $\pm z_{0.005}$ are the points between which the standard normal curve covers an area of 0.99. From tables, $z_{0.005} = 2.58$ (or you can use Excel's "=NORMSINV(0.995)" to find the more precise value of about 2.5758, beyond which there is probability 1 - 0.995 = 0.005). Hence

$$\langle\mu\rangle_{0.99} = \left(65 - 2.58\,\frac{6}{\sqrt{50}},\; 65 + 2.58\,\frac{6}{\sqrt{50}}\right) = (62.81,\ 67.19) \text{ (in mph)}.$$

(b) Requiring

$$2.58\,\frac{6}{\sqrt{n}} \le 1 \;\Rightarrow\; n \ge (2.58 \times 6)^2 = 239.6304 \;\Rightarrow\; n = 240,$$

so (240 - 50) = 190 additional vehicles must be observed.

Before we proceed to (c) and (d), let $\bar{X}_J$ and $\bar{X}_M$ denote John's and Mary's sample means, respectively. Both are approximately normal with mean $\mu$ (the unknown true mean speed) and standard deviation $6/\sqrt{n}$. Since the two samples are collected separately (independently), their difference $D = \bar{X}_J - \bar{X}_M$ has an (approximate) normal distribution with mean $\mu_D = \mu - \mu = 0$ and standard deviation $\sigma_D = \sqrt{\frac{6^2}{n} + \frac{6^2}{n}} = 6\sqrt{\frac{2}{n}}$, i.e.
$$D \sim N\!\left(0,\ 6\sqrt{\tfrac{2}{n}}\right),$$

where the second parameter is the standard deviation of $D$, written $\sigma_D$ below (and $\mu_D = 0$ is its mean). Hence

(c) When n = 10,

$$P(\bar{X}_J - \bar{X}_M > 2) = P(D > 2) = P\!\left(\frac{D - \mu_D}{\sigma_D} > \frac{2 - 0}{6/\sqrt{5}}\right) = 1 - \Phi(0.745355992) \approx 0.228.$$

(d) When n = 100,

$$P(D > 2) = P\!\left(\frac{D - \mu_D}{\sigma_D} > \frac{2 - 0}{6/\sqrt{50}}\right) = 1 - \Phi(2.357022604) \approx 0.0092.$$

Problem 5-1-2

Suppose the annual maximum stream flow of a given river has been observed for ten years, yielding the following statistics:

sample mean = $\bar{x}$ = 10,000 cfs
sample variance = $s^2 = 9 \times 10^6$ (cfs)$^2$

(a) Establish the 2-sided 90% confidence interval on the mean annual maximum stream flow. Assume a normal population. (ans. [8261, 11739])
(b) If it is desired to estimate the mean annual maximum stream flow to within ±1,000 cfs with 90% confidence, how many additional years of observation will be required? Assume the sample (not the true value of the) variance based on the new set of data will be approximately $9 \times 10^6$ (cfs)$^2$. (ans. 17)

Solution:

(a) The formula to use is

$$\langle\mu\rangle_{1-\alpha} = \left[\bar{x} - t_{\alpha/2,\,n-1}\,\frac{s}{\sqrt{n}},\; \bar{x} + t_{\alpha/2,\,n-1}\,\frac{s}{\sqrt{n}}\right],$$

where $\alpha = 0.1$, $n = 10$, $\bar{x} = 10000$ cfs, $s = 3000$ cfs, and $t_{\alpha/2,\,n-1} = t_{0.05,9} \approx 1.8331$ (this can be found with Excel's TINV(0.1,9); note the first argument is $\alpha$, and "two tails" is assumed by default). Plugging all these numbers into the formula above, a 2-sided 90% confidence interval for the mean annual maximum flow is

[(10000 - 1739.044) cfs, (10000 + 1739.044) cfs] ≈ [8261 cfs, 11739 cfs]

(b) Since the confidence level $(1 - \alpha)$ is fixed at 90%, while $s$ is assumed to stay at approximately 3000 cfs, the "half-width" of the confidence interval, $t_{\alpha/2,\,n-1}\, s/\sqrt{n}$, can be considered as a function of the sample size $n$ only. To make it equal to 1000 cfs, one must have

$$t_{0.05,\,n-1}\,\frac{3000}{\sqrt{n}} = 1000 \;\Rightarrow\; \frac{t_{0.05,\,n-1}}{\sqrt{n}} = \frac{1}{3},$$

which can be solved by trial and error. We start with n = 20 (say), and increase the sample size until the half-width is narrowed to the desired 1000 cfs, i.e. until $t_{0.05,\,n-1}/\sqrt{n} \le 0.33333\ldots$
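The trial-and-error search for n can also be automated. The following is a minimal sketch rather than part of the original solution. It assumes only the Python standard library, so the Student t quantile that Excel's TINV supplies is approximated here by integrating the t density with Simpson's rule and inverting by bisection. The helper names (`t_pdf`, `t_cdf`, `t_upper_quantile`, `required_sample_size`) are hypothetical and introduced purely for illustration.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))


def t_cdf(x, df, steps=400):
    """P(T <= x) for x >= 0: 0.5 plus Simpson's rule on [0, x] (steps is even)."""
    h = x / steps
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3


def t_upper_quantile(alpha, df):
    """Value t with P(T > t) = alpha, by bisection on [0, 20] (0 < alpha < 0.5)."""
    lo, hi = 0.0, 20.0
    for _ in range(50):
        mid = (lo + hi) / 2
        if 1 - t_cdf(mid, df) > alpha:
            lo = mid  # upper tail still too heavy, so the quantile lies above mid
        else:
            hi = mid
    return (lo + hi) / 2


def required_sample_size(s, half_width, alpha, n_start):
    """Smallest n >= n_start with t_{alpha/2, n-1} * s / sqrt(n) <= half_width."""
    n = n_start
    while t_upper_quantile(alpha / 2, n - 1) * s / math.sqrt(n) > half_width:
        n += 1
    return n


# Part (a): 90% interval from n = 10, x_bar = 10000 cfs, s = 3000 cfs.
half = t_upper_quantile(0.05, 9) * 3000.0 / math.sqrt(10)
lower, upper = 10000.0 - half, 10000.0 + half
print(round(lower), round(upper))  # 8261 11739

# Part (b): total sample size for a half-width of 1000 cfs, and the extra years.
n_req = required_sample_size(s=3000.0, half_width=1000.0, alpha=0.10, n_start=10)
print(n_req, n_req - 10)  # 27 17
```

The search starts from the current sample size of 10 instead of 20; because the half-width decreases steadily with n over this range, it stops at the same value.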
With reference to the following table,

| Sample size, n | $t_{0.05,\,n-1}$ (found by Excel's TINV(0.1, n-1)) | $t_{0.05,\,n-1}/\sqrt{n}$ |
|---|---|---|
| 20 | 1.729131327 | 0.387 |
| 21 | 1.724718004 | 0.376 |
| 22 | 1.720743512 | 0.367 |
| 23 | 1.717144187 | 0.358 |
| 24 | 1.713870006 | 0.350 |
| 25 | 1.710882316 | 0.342 |
| 26 | 1.708140189 | 0.335 |
| 27 | 1.705616341 | 0.328 |

we see that a sample size of 27 will do, hence an additional (27 - 10) = 17 years of observation will be required.

Exercises

Exercise 5-1-1

Five piles have been load tested until failure; the load measured at failure denotes the actual capacity of the given pile. The following table summarizes the data from the load tests:

| Pile Test No. | Actual Capacity A | Predicted Capacity P | N = A/P |
|---|---|---|---|
| 1 | 20.5 | 13.6 | |
| 2 | 18.5 | 20.4 | |
| 3 | 10.0 | 8.8 | |
| 4 | 15.3 | 14.3 | |
| 5 | 26.2 | 22.8 | |

Observe that the capacity of each pile has also been predicted by a theoretical model and listed in the above table. The factor N is simply the ratio of the actual pile capacity to the predicted pile capacity.

(a) Complete the table by calculating the respective value of N for each test pile. (ans. Test No. 3: N = 1.136)
(b) Determine the sample mean and variance of N. (ans. 1.154, 0.048)
(c) Determine the 95% confidence interval of the mean value of N. (ans. (0.881, 1.427))
(d) In order to estimate the mean value of N to within ±0.02 with 90% confidence, how many additional piles should be tested? Assume the variance of N is known and equal to 0.045 for this part. (ans. 300)
(e) Assume N is a normal random variable whose mean value and variance are given exactly by the corresponding sample values from part (b). Consider a new site where a pile has been designed and its capacity is predicted by the model to be 15 tons. What is the probability that the pile will fail under a load of 12 tons? (ans. 0.0537) (Hint: Express the actual capacity of this pile, A, as a function of N first.)
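The quoted answers to parts (a), (b), (d) and (e) can be checked numerically. The sketch below is an illustration only, under stated assumptions: the sample variance uses the n - 1 divisor, part (d) uses the normal quantile $z_{0.05}$ with the stated variance 0.045, and part (e) writes A = 15 N so that failure under 12 tons means N < 12/15. All variable names are hypothetical.

```python
import math
from statistics import NormalDist, mean, variance

# Load-test data from the table (actual and predicted pile capacities).
actual = [20.5, 18.5, 10.0, 15.3, 26.2]
predicted = [13.6, 20.4, 8.8, 14.3, 22.8]

# (a) Ratio N = A / P for each test pile.
ratios = [a / p for a, p in zip(actual, predicted)]

# (b) Sample mean and sample variance (n - 1 divisor) of N.
n_mean = mean(ratios)
n_var = variance(ratios)

# (d) Known variance 0.045, half-width 0.02, 90% confidence:
#     require z_{0.05} * sigma / sqrt(n) <= 0.02 and round n up.
z = NormalDist().inv_cdf(0.95)
n_required = math.ceil((z * math.sqrt(0.045) / 0.02) ** 2)
additional = n_required - len(ratios)

# (e) A = 15 * N, so P(A < 12) = P(N < 12 / 15) with N ~ Normal(n_mean, sqrt(n_var)).
p_fail = NormalDist(mu=n_mean, sigma=math.sqrt(n_var)).cdf(12 / 15)

print([round(r, 3) for r in ratios])
print(round(n_mean, 3), round(n_var, 3))
print(n_required, additional)
print(round(p_fail, 4))
```

Part (c) is left out because it needs a Student t quantile, which the Python standard library does not provide.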
Exercise 5-1-2

Concrete placed on a structure was cored and the following results were obtained:

4142, 3405, 3402, 4039, 3372 psi

(a) Determine the 90% 2-sided confidence interval of the mean concrete strength. (ans. (3305.91, 4038.09))
(b) Suppose the confidence interval established in part (a) is too wide, and the engineer would like the confidence interval to be within ±300 psi of the computed sample mean concrete strength. Generally, more specimens of concrete would be needed to keep the same confidence level. However, without additional samples, what confidence level would he have based on the 5 measurements given above? (ans. 83.6%)