One- and Two-Sample Estimation Problems

Classical Methods of Estimation

A point estimate of some population parameter $\theta$ is a single value $\hat{\theta}$ of a statistic $\hat{\Theta}$.

Definition 9.1 A statistic $\hat{\Theta}$ is said to be an unbiased estimator of the parameter $\theta$ if $\mu_{\hat{\Theta}} = E(\hat{\Theta}) = \theta$.

Definition 9.2 If we consider all possible unbiased estimators of some parameter $\theta$, the one with the smallest variance is called the most efficient estimator of $\theta$.

The interval $\hat{\theta}_L < \theta < \hat{\theta}_U$, computed from the selected sample, is called a $(1-\alpha)100\%$ confidence interval. The fraction $1-\alpha$ is called the confidence coefficient or the degree of confidence, and the endpoints $\hat{\theta}_L$ and $\hat{\theta}_U$ are called the lower and upper confidence limits.

"Ideally we prefer a short interval with a high degree of confidence."

Single Sample: Estimating the Mean

Confidence Interval for $\mu$; $\sigma$ Known. If $\bar{x}$ is the mean of a random sample of size $n$ from a population with known variance $\sigma^2$, a $(1-\alpha)100\%$ confidence interval for $\mu$ is given by

$$\bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}} < \mu < \bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}},$$

where $z_{\alpha/2}$ is the z-value leaving an area of $\alpha/2$ to the right.

4/252) An electrical firm manufactures light bulbs that have a length of life that is approximately normally distributed with a standard deviation of 40 hours. If a sample of 30 bulbs has an average life of 780 hours, find a 96% confidence interval for the population mean of all bulbs produced by this firm.

Soln:
Given: $n = 30$, $\sigma = 40$ hours, $\bar{x} = 780$ hours, 96% confidence interval.
From Table A.3, $z_{\alpha/2} = z_{0.02} = 2.05$.

$$\bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}} < \mu < \bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$$
$$780 - 2.05\,\frac{40}{\sqrt{30}} < \mu < 780 + 2.05\,\frac{40}{\sqrt{30}}$$
$$765 < \mu < 795$$

Theorem 9.1 If $\bar{x}$ is used as an estimate of $\mu$, we can then be $(1-\alpha)100\%$ confident that the error will not exceed $z_{\alpha/2}\,\sigma/\sqrt{n}$. That is, the error $|\bar{x} - \mu|$ is at most the distance from $\bar{x}$ to either confidence limit, $\bar{x} - z_{\alpha/2}\,\sigma/\sqrt{n}$ or $\bar{x} + z_{\alpha/2}\,\sigma/\sqrt{n}$.

6/252) The heights of a random sample of 50 college students showed a mean of 174.5 centimeters and a standard deviation of 6.9 centimeters. Construct a 98% confidence interval for the mean height of all college students.
What can we assert with 98% confidence about the possible size of our error if we estimate the mean height of all college students to be 174.5 centimeters?

Soln:
Given: $n = 50$, $\bar{x} = 174.5$ cm, and $\sigma \approx s = 6.9$ cm (the sample standard deviation stands in for $\sigma$ because the sample is large, $n \ge 30$).

a) 98% confidence interval. From Table A.3, $z_{\alpha/2} = z_{0.01} = 2.33$.

$$\bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}} < \mu < \bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$$
$$174.5 - 2.33\,\frac{6.9}{\sqrt{50}} < \mu < 174.5 + 2.33\,\frac{6.9}{\sqrt{50}}$$
$$172.23 < \mu < 176.77$$

b) Error $\le 2.33\,(6.9)/\sqrt{50} \approx 2.27$ cm.

Theorem 9.2 If $\bar{x}$ is used as an estimate of $\mu$, we can be $(1-\alpha)100\%$ confident that the error will not exceed a specified amount $e$ when the sample size is

$$n = \left(\frac{z_{\alpha/2}\,\sigma}{e}\right)^2 .$$

8/253) How large a sample is needed in Exercise 4 if we wish to be 96% confident that our sample mean will be within 10 hours of the true mean?

Soln:
Given: $z_{\alpha/2} = 2.05$, $\sigma = 40$ hours, $e = 10$ hours.

$$n = \left(\frac{z_{\alpha/2}\,\sigma}{e}\right)^2 = \left(\frac{2.05(40)}{10}\right)^2 = 67.24$$

Rounding up, a sample of size $n = 68$ is needed.

10/253) An efficiency expert wishes to determine the average time that it takes to drill three holes in a certain metal clamp. How large a sample will he need to be 95% confident that his sample mean will be within 15 seconds of the true mean? Assume that it is known from previous studies that $\sigma = 40$ seconds.

Soln:
Given: $z_{\alpha/2} = z_{0.025} = 1.96$, $\sigma = 40$ seconds, $e = 15$ seconds.

$$n = \left(\frac{z_{\alpha/2}\,\sigma}{e}\right)^2 = \left(\frac{1.96(40)}{15}\right)^2 \approx 27.32$$

Rounding up, a sample of size $n = 28$ is needed.

Confidence Interval for $\mu$; $\sigma$ Unknown. If $\bar{x}$ and $s$ are the mean and standard deviation of a random sample from a normal population with unknown variance $\sigma^2$, a $(1-\alpha)100\%$ confidence interval for $\mu$ is given by

$$\bar{x} - t_{\alpha/2}\frac{s}{\sqrt{n}} < \mu < \bar{x} + t_{\alpha/2}\frac{s}{\sqrt{n}},$$

where $t_{\alpha/2}$ is the t-value with $v = n - 1$ degrees of freedom, leaving an area of $\alpha/2$ to the right.

13/253) A machine is producing metal pieces that are cylindrical in shape. A sample of pieces is taken and the diameters are 1.01, 0.97, 1.03, 1.04, 0.99, 0.98, 0.99, 1.01, and 1.03 centimeters. Find a 99% confidence interval for the mean diameter of pieces from this machine, assuming an approximate normal distribution.
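The single-sample formulas for the mean introduced so far (the z interval, the sample-size rule of Theorem 9.2, and the t interval) can be sketched in Python as a numerical cross-check. This is only an illustrative sketch: the function names are made up for this example, the z-value is computed from the standard normal distribution rather than read from Table A.3 (so it differs slightly from the rounded 2.05), and the t-value is passed in as a table lookup because the Python standard library has no t distribution.

```python
from math import ceil, sqrt
from statistics import NormalDist, mean, stdev


def z_value(confidence):
    """z_{alpha/2}: the z-value leaving an area of alpha/2 to the right."""
    alpha = 1.0 - confidence
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)


def z_interval(xbar, sigma, n, confidence):
    """Interval for the mean with sigma known: xbar -/+ z * sigma / sqrt(n)."""
    half_width = z_value(confidence) * sigma / sqrt(n)
    return xbar - half_width, xbar + half_width


def sample_size(sigma, e, confidence):
    """Theorem 9.2: n = (z * sigma / e)^2, rounded up to a whole observation."""
    return ceil((z_value(confidence) * sigma / e) ** 2)


def t_interval(data, t_crit):
    """Interval for the mean with sigma unknown; t_crit comes from a t-table."""
    n = len(data)
    xbar = mean(data)
    half_width = t_crit * stdev(data) / sqrt(n)  # stdev uses divisor n - 1
    return xbar - half_width, xbar + half_width


# Exercise 4: n = 30, sigma = 40, xbar = 780, 96% confidence.
low, high = z_interval(780, 40, 30, 0.96)

# Exercise 8: within 10 hours of the true mean, 96% confidence.
n_needed = sample_size(40, 10, 0.96)

# Exercise 13: nine diameters, 99% confidence.
# 3.355 is the table value t_{0.005} with 8 degrees of freedom.
diameters = [1.01, 0.97, 1.03, 1.04, 0.99, 0.98, 0.99, 1.01, 1.03]
t_low, t_high = t_interval(diameters, 3.355)

print(f"Exercise 4:  {low:.1f} < mu < {high:.1f}")
print(f"Exercise 8:  n = {n_needed}")
print(f"Exercise 13: {t_low:.3f} < mu < {t_high:.3f}")
```

Because the sketch carries full precision through the calculation, its limits may differ in the last digit from hand solutions that round intermediate values.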
Soln:

$$\bar{x} = \frac{1.01 + 0.97 + 1.03 + 1.04 + 0.99 + 0.98 + 0.99 + 1.01 + 1.03}{9} = \frac{9.05}{9} \approx 1.0056$$

$$s^2 = \frac{\sum_{i=1}^{9}(x_i - \bar{x})^2}{n-1} = \frac{0.0048}{8} = 0.0006, \qquad \text{therefore } s \approx 0.0245$$

From Table A.4, with $v = 8$ degrees of freedom, $t_{\alpha/2} = t_{0.005} = 3.355$.

$$\bar{x} - t_{\alpha/2}\frac{s}{\sqrt{n}} < \mu < \bar{x} + t_{\alpha/2}\frac{s}{\sqrt{n}}$$
$$1.0056 - 3.355\,\frac{0.0245}{\sqrt{9}} < \mu < 1.0056 + 3.355\,\frac{0.0245}{\sqrt{9}}$$
$$0.978 < \mu < 1.033$$

16/253) A random sample of 12 graduates of a certain secretarial school typed an average of 79.3 words per minute with a standard deviation of 7.8 words per minute. Assuming a normal distribution for the number of words typed per minute, find a 95% confidence interval for the average number of words typed by all graduates of this school.

Soln:
Given: $n = 12$, $\bar{x} = 79.3$ words/minute, $s = 7.8$ words/minute, 95% confidence interval.
From Table A.4, with $v = 11$ degrees of freedom, $t_{\alpha/2} = t_{0.025} = 2.201$.

$$\bar{x} - t_{\alpha/2}\frac{s}{\sqrt{n}} < \mu < \bar{x} + t_{\alpha/2}\frac{s}{\sqrt{n}}$$
$$79.3 - 2.201\,\frac{7.8}{\sqrt{12}} < \mu < 79.3 + 2.201\,\frac{7.8}{\sqrt{12}}$$
$$74.344 < \mu < 84.256$$

Standard Error of a Point Estimate

The standard error of $\bar{X}$ is $\sigma/\sqrt{n}$. Simply put, the standard error of an estimator is its standard deviation. For the case of $\bar{X}$, the computed confidence limit $\bar{x} \pm z_{\alpha/2}\,\sigma/\sqrt{n}$ is written as $\bar{x} \pm z_{\alpha/2}\,\text{s.e.}(\bar{x})$, where s.e. is the standard error. In the case where $\sigma$ is unknown and sampling is from a normal distribution, $s$ replaces $\sigma$ and the estimated standard error $s/\sqrt{n}$ is involved. Thus the confidence limits on $\mu$ are given by

$$\bar{x} \pm t_{\alpha/2}\frac{s}{\sqrt{n}} = \bar{x} \pm t_{\alpha/2}\,\text{s.e.}(\bar{x}).$$

Tolerance limits. For a normal distribution of measurements with unknown mean $\mu$ and unknown standard deviation $\sigma$, tolerance limits are given by $\bar{x} \pm ks$, where $k$ is determined so that one can assert with $100(1-\gamma)\%$ confidence that the given limits contain at least the proportion $1-\alpha$ of the measurements.

18/253) The following measurements were recorded for the drying time, in hours, of a certain brand of latex paint:

3.4 2.5 4.8 2.9 3.6
2.8 3.3 5.6 3.7 2.8
4.4 4.0 5.2 3.0 4.8

Assuming that the measurements represent a random sample from a normal population, find the 99% tolerance limits that will contain 95% of the drying times.
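The tolerance-limit calculation for these drying times can be sketched in a few lines of Python. This is an illustrative sketch, not a definitive implementation: the function name is made up here, and the factor k is not computed but passed in as a table lookup (3.507, the Table A.7 value for n = 15 quoted in the solution).

```python
from statistics import mean, stdev


def tolerance_limits(data, k):
    """Return (xbar - k*s, xbar + k*s); k is read from a tolerance-factor table."""
    xbar = mean(data)
    s = stdev(data)  # sample standard deviation, divisor n - 1
    return xbar - k * s, xbar + k * s


# Drying times in hours from Exercise 18.
drying_times = [3.4, 2.5, 4.8, 2.9, 3.6,
                2.8, 3.3, 5.6, 3.7, 2.8,
                4.4, 4.0, 5.2, 3.0, 4.8]

# k for n = 15, 99% confidence (1 - gamma) and 95% coverage (1 - alpha),
# taken from the table value quoted in the solution.
lower, upper = tolerance_limits(drying_times, k=3.507)
print(f"tolerance limits: {lower:.3f} to {upper:.3f} hours")
```

Note that these limits describe where individual measurements fall, so they are much wider than a confidence interval for the mean built from the same sample.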
Soln:

$$\bar{x} = \frac{\sum_{i=1}^{15} x_i}{n} = 3.787$$

$$s^2 = \frac{\sum_{i=1}^{15}(x_i - \bar{x})^2}{n-1}, \qquad \text{therefore } s = 0.971$$

With $n = 15$, $1-\gamma = 0.99$, and $1-\alpha = 0.95$, Table A.7 gives $k = 3.507$.

$$\bar{x} \pm ks = 3.787 \pm 3.507(0.971) = 3.787 \pm 3.405$$

Therefore, the tolerance limits are 0.382 and 7.192 hours.

Two Samples: Estimating the Difference Between Two Means

Confidence Interval for $\mu_1 - \mu_2$; $\sigma_1^2$ and $\sigma_2^2$ Known. If $\bar{x}_1$ and $\bar{x}_2$ are the means of independent random samples of size $n_1$ and $n_2$ from populations with known variances $\sigma_1^2$ and $\sigma_2^2$, respectively, a $(1-\alpha)100\%$ confidence interval for $\mu_1 - \mu_2$ is given by

$$(\bar{x}_1 - \bar{x}_2) - z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} < \mu_1 - \mu_2 < (\bar{x}_1 - \bar{x}_2) + z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}},$$

where $z_{\alpha/2}$ is the z-value leaving an area of $\alpha/2$ to the right.

2/263) Two kinds of thread are being compared for strength. Fifty pieces of each type of thread are tested under similar conditions. Brand A has an average tensile strength of 78.3 kilograms with a standard deviation of 5.6 kilograms, while brand B has an average tensile strength of 87.2 kilograms with a standard deviation of 6.3 kilograms. Construct a 95% confidence interval for the difference of the population means.

Soln:
Given:
Brand A: $n_A = 50$, $\sigma_A = 5.6$ kg, $\bar{x}_A = 78.3$ kg
Brand B: $n_B = 50$, $\sigma_B = 6.3$ kg, $\bar{x}_B = 87.2$ kg
95% confidence interval. From Table A.3, $z_{\alpha/2} = z_{0.025} = 1.96$.

$$(\bar{x}_B - \bar{x}_A) - z_{\alpha/2}\sqrt{\frac{\sigma_A^2}{n_A} + \frac{\sigma_B^2}{n_B}} < \mu_B - \mu_A < (\bar{x}_B - \bar{x}_A) + z_{\alpha/2}\sqrt{\frac{\sigma_A^2}{n_A} + \frac{\sigma_B^2}{n_B}}$$
$$(87.2 - 78.3) - 1.96\sqrt{\frac{(5.6)^2}{50} + \frac{(6.3)^2}{50}} < \mu_B - \mu_A < (87.2 - 78.3) + 1.96\sqrt{\frac{(5.6)^2}{50} + \frac{(6.3)^2}{50}}$$
$$6.56 < \mu_B - \mu_A < 11.24$$

3/263) A study was made to determine if a certain metal treatment has any effect on the amount of metal removed in a pickling operation. A random sample of 100 pieces was immersed in a bath for 24 hours without the treatment, yielding an average of 12.2 millimeters of metal removed and a sample standard deviation of 1.1 millimeters. A second sample of 200 pieces was exposed to the treatment followed by the 24-hour immersion in the bath, resulting in an average removal of 9.1 millimeters of metal with a sample standard deviation of 0.9 millimeter.
Compute a 98% confidence interval estimate for the difference between the population means. Does the treatment appear to reduce the mean amount of metal removed?

Soln:
Sample 1 (untreated): $n_1 = 100$, $\bar{x}_1 = 12.2$ mm, $\sigma_1 \approx s_1 = 1.1$ mm
Sample 2 (treated): $n_2 = 200$, $\bar{x}_2 = 9.1$ mm, $\sigma_2 \approx s_2 = 0.9$ mm
98% confidence interval. From Table A.3, $z_{\alpha/2} = z_{0.01} = 2.33$.

$$(\bar{x}_1 - \bar{x}_2) - z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} < \mu_1 - \mu_2 < (\bar{x}_1 - \bar{x}_2) + z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$
$$(12.2 - 9.1) - 2.33\sqrt{\frac{(1.1)^2}{100} + \frac{(0.9)^2}{200}} < \mu_1 - \mu_2 < (12.2 - 9.1) + 2.33\sqrt{\frac{(1.1)^2}{100} + \frac{(0.9)^2}{200}}$$
$$2.804 < \mu_1 - \mu_2 < 3.396$$

Because the entire interval lies above zero, the treatment appears to reduce the mean amount of metal removed.

Confidence Interval for $\mu_1 - \mu_2$; $\sigma_1^2 = \sigma_2^2$ but Unknown. If $\bar{x}_1$ and $\bar{x}_2$ are the means of independent random samples of size $n_1$ and $n_2$, respectively, from approximate normal populations with unknown but equal variances, a $(1-\alpha)100\%$ confidence interval for $\mu_1 - \mu_2$ is given by

$$(\bar{x}_1 - \bar{x}_2) - t_{\alpha/2}\,s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}} < \mu_1 - \mu_2 < (\bar{x}_1 - \bar{x}_2) + t_{\alpha/2}\,s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}},$$

where $s_p$ is the pooled estimate of the population standard deviation, given by

$$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2},$$

and $t_{\alpha/2}$ is the t-value with $v = n_1 + n_2 - 2$ degrees of freedom, leaving an area of $\alpha/2$ to the right.

8/264) An experiment reported in Popular Science, in 1981, compared fuel economies for two types of similarly equipped diesel mini-trucks. Let us suppose that 12 Volkswagen and 10 Toyota trucks are used in 90-kilometer per hour steady-speed tests. If the 12 Volkswagen trucks average 16 kilometers per liter with a standard deviation of 1.0 kilometer per liter and the 10 Toyota trucks average 11 kilometers per liter with a standard deviation of 0.8 kilometer per liter, construct a 90% confidence interval for the difference between the average kilometers per liter of these two mini-trucks. Assume that the distances per liter for each truck model are approximately normally distributed with equal variances.
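The two-sample intervals for a difference of means (variances known, and variances unknown but equal) can be sketched in Python as follows. This is a hedged illustration: the function names are invented for this example, the z-value is computed from the standard normal distribution instead of Table A.3, and the t-value is supplied by hand as a table lookup (1.725 is the Table A.4 value for 20 degrees of freedom quoted in the solution).

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z_interval(x1, x2, sd1, sd2, n1, n2, confidence):
    """Interval for mu1 - mu2 when both standard deviations are treated as known."""
    z = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)
    half_width = z * sqrt(sd1**2 / n1 + sd2**2 / n2)
    diff = x1 - x2
    return diff - half_width, diff + half_width


def pooled_variance(s1, s2, n1, n2):
    """Pooled estimate s_p^2 of the common variance."""
    return ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)


def pooled_t_interval(x1, x2, s1, s2, n1, n2, t_crit):
    """Interval for mu1 - mu2 with equal but unknown variances.

    t_crit is t_{alpha/2} with n1 + n2 - 2 degrees of freedom, from a table.
    """
    sp = sqrt(pooled_variance(s1, s2, n1, n2))
    half_width = t_crit * sp * sqrt(1.0 / n1 + 1.0 / n2)
    diff = x1 - x2
    return diff - half_width, diff + half_width


# Exercise 3: untreated (sample 1) versus treated (sample 2), 98% confidence.
z_low, z_high = two_sample_z_interval(12.2, 9.1, 1.1, 0.9, 100, 200, 0.98)

# Exercise 8: Volkswagen (sample 1) versus Toyota (sample 2), 90% confidence,
# t_{0.05} with 12 + 10 - 2 = 20 degrees of freedom.
sp2 = pooled_variance(1.0, 0.8, 12, 10)
t_low, t_high = pooled_t_interval(16, 11, 1.0, 0.8, 12, 10, t_crit=1.725)

print(f"Exercise 3: {z_low:.3f} < mu1 - mu2 < {z_high:.3f}")
print(f"Exercise 8: sp^2 = {sp2:.3f}, {t_low:.3f} < mu1 - mu2 < {t_high:.3f}")
```

Keeping the pooled variance in its own function makes it easy to compare against the hand-computed intermediate value before looking at the final limits.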
Soln:
Volkswagen: $n_1 = 12$, $\bar{x}_1 = 16$ km/l, $s_1 = 1.0$ km/l
Toyota: $n_2 = 10$, $\bar{x}_2 = 11$ km/l, $s_2 = 0.8$ km/l
90% confidence interval for the difference $\mu_1 - \mu_2$. From Table A.4, with $v = 20$ degrees of freedom, $t_{\alpha/2} = t_{0.05} = 1.725$.

$$s_p^2 = \frac{(12 - 1)(1.0)^2 + (10 - 1)(0.8)^2}{12 + 10 - 2} = 0.838, \qquad s_p = 0.915$$

$$(16 - 11) - 1.725(0.915)\sqrt{\frac{1}{12} + \frac{1}{10}} < \mu_1 - \mu_2 < (16 - 11) + 1.725(0.915)\sqrt{\frac{1}{12} + \frac{1}{10}}$$
$$4.324 < \mu_1 - \mu_2 < 5.676$$

Confidence Interval for $\mu_1 - \mu_2$; $\sigma_1^2 \ne \sigma_2^2$ and Unknown. If $\bar{x}_1$ and $s_1^2$, and $\bar{x}_2$ and $s_2^2$, are the means and variances of small independent samples of size $n_1$ and $n_2$, respectively, from approximate normal distributions with unknown and unequal variances, an approximate $(1-\alpha)100\%$ confidence interval for $\mu_1 - \mu_2$ is given by

$$(\bar{x}_1 - \bar{x}_2) - t_{\alpha/2}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} < \mu_1 - \mu_2 < (\bar{x}_1 - \bar{x}_2) + t_{\alpha/2}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}},$$

where $t_{\alpha/2}$ is the t-value with

$$v = \frac{\left(s_1^2/n_1 + s_2^2/n_2\right)^2}{\dfrac{(s_1^2/n_1)^2}{n_1 - 1} + \dfrac{(s_2^2/n_2)^2}{n_2 - 1}}$$

degrees of freedom, leaving an area of $\alpha/2$ to the right.

Paired Observations

Confidence Interval for $\mu_D = \mu_1 - \mu_2$ for Paired Observations. If $\bar{d}$ and $s_d$ are the mean and standard deviation of the normally distributed differences of $n$ random pairs of measurements, a $(1-\alpha)100\%$ confidence interval for $\mu_D = \mu_1 - \mu_2$ is

$$\bar{d} - t_{\alpha/2}\frac{s_d}{\sqrt{n}} < \mu_D < \bar{d} + t_{\alpha/2}\frac{s_d}{\sqrt{n}},$$

where $t_{\alpha/2}$ is the t-value with $v = n - 1$ degrees of freedom, leaving an area of $\alpha/2$ to the right, and

$$s_d = \sqrt{\frac{n\sum_{i=1}^{n} d_i^2 - \left(\sum_{i=1}^{n} d_i\right)^2}{n(n-1)}}.$$

Single Sample: Estimating a Proportion

Large-Sample Confidence Interval for $p$. If $\hat{p}$ is the proportion of successes in a random sample of size $n$, and $\hat{q} = 1 - \hat{p}$, an approximate $(1-\alpha)100\%$ confidence interval for the binomial parameter $p$ is given by

$$\hat{p} - z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}} < p < \hat{p} + z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}},$$

where $z_{\alpha/2}$ is the z-value leaving an area of $\alpha/2$ to the right.
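Two of the formulas above need no table lookup and are easy to sketch in Python: the approximate degrees of freedom for the unequal-variance interval and the large-sample interval for a proportion. The function names and all input numbers below are hypothetical, chosen only for illustration; none of them come from the exercises in these notes.

```python
from math import sqrt
from statistics import NormalDist


def unequal_variance_df(s1, s2, n1, n2):
    """Approximate degrees of freedom v for the unequal-variance t interval."""
    a = s1**2 / n1
    b = s2**2 / n2
    return (a + b) ** 2 / (a**2 / (n1 - 1) + b**2 / (n2 - 1))


def proportion_interval(successes, n, confidence):
    """Large-sample interval: p_hat -/+ z * sqrt(p_hat * q_hat / n)."""
    p_hat = successes / n
    q_hat = 1.0 - p_hat
    z = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)
    half_width = z * sqrt(p_hat * q_hat / n)
    return p_hat - half_width, p_hat + half_width


# Hypothetical samples: s1 = 3.07 with n1 = 15, s2 = 0.80 with n2 = 12.
v = unequal_variance_df(3.07, 0.80, 15, 12)

# Hypothetical survey: 340 successes in n = 500 trials, 95% confidence.
p_low, p_high = proportion_interval(340, 500, 0.95)

print(f"v = {v:.2f}")
print(f"{p_low:.4f} < p < {p_high:.4f}")
```

The computed v is generally not an integer. A common hedge when reading a t-table is to round it down, which gives a slightly wider and therefore more cautious interval.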
Theorem 9.3 If $\hat{p}$ is used as an estimate of $p$, we can be $(1-\alpha)100\%$ confident that the error will not exceed $z_{\alpha/2}\sqrt{\hat{p}\hat{q}/n}$.

Theorem 9.4 If $\hat{p}$ is used as an estimate of $p$, we can be $(1-\alpha)100\%$ confident that the error will be less than a specified amount $e$ when the sample size is approximately

$$n = \frac{z_{\alpha/2}^2\,\hat{p}\hat{q}}{e^2}.$$

Theorem 9.5 If $\hat{p}$ is used as an estimate of $p$, we can be at least $(1-\alpha)100\%$ confident that the error will not exceed a specified amount $e$ when the sample size is

$$n = \frac{z_{\alpha/2}^2}{4e^2}.$$

Two Samples: Estimating the Difference Between Two Proportions

Large-Sample Confidence Interval for $p_1 - p_2$. If $\hat{p}_1$ and $\hat{p}_2$ are the proportions of successes in random samples of size $n_1$ and $n_2$, respectively, with $\hat{q}_1 = 1 - \hat{p}_1$ and $\hat{q}_2 = 1 - \hat{p}_2$, an approximate $(1-\alpha)100\%$ confidence interval for the difference of two binomial parameters, $p_1 - p_2$, is given by

$$(\hat{p}_1 - \hat{p}_2) - z_{\alpha/2}\sqrt{\frac{\hat{p}_1\hat{q}_1}{n_1} + \frac{\hat{p}_2\hat{q}_2}{n_2}} < p_1 - p_2 < (\hat{p}_1 - \hat{p}_2) + z_{\alpha/2}\sqrt{\frac{\hat{p}_1\hat{q}_1}{n_1} + \frac{\hat{p}_2\hat{q}_2}{n_2}},$$

where $z_{\alpha/2}$ is the z-value leaving an area of $\alpha/2$ to the right.

Single Sample: Estimating the Variance

Confidence Interval for $\sigma^2$. If $s^2$ is the variance of a random sample of size $n$ from a normal population, a $(1-\alpha)100\%$ confidence interval for $\sigma^2$ is given by

$$\frac{(n-1)s^2}{\chi^2_{\alpha/2}} < \sigma^2 < \frac{(n-1)s^2}{\chi^2_{1-\alpha/2}},$$

where $\chi^2_{\alpha/2}$ and $\chi^2_{1-\alpha/2}$ are $\chi^2$-values with $v = n - 1$ degrees of freedom, leaving areas of $\alpha/2$ and $1-\alpha/2$, respectively, to the right.

Two Samples: Estimating the Ratio of Two Variances

Confidence Interval for $\sigma_1^2/\sigma_2^2$. If $s_1^2$ and $s_2^2$ are the variances of independent samples of size $n_1$ and $n_2$, respectively, from normal populations, then a $(1-\alpha)100\%$ confidence interval for $\sigma_1^2/\sigma_2^2$ is

$$\frac{s_1^2}{s_2^2}\,\frac{1}{f_{\alpha/2}(v_1, v_2)} < \frac{\sigma_1^2}{\sigma_2^2} < \frac{s_1^2}{s_2^2}\,f_{\alpha/2}(v_2, v_1),$$

where $f_{\alpha/2}(v_1, v_2)$ is an f-value with $v_1 = n_1 - 1$ and $v_2 = n_2 - 1$ degrees of freedom, leaving an area of $\alpha/2$ to the right, and $f_{\alpha/2}(v_2, v_1)$ is a similar f-value with $v_2 = n_2 - 1$ and $v_1 = n_1 - 1$ degrees of freedom.
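The sample-size rules of Theorems 9.4 and 9.5 and the two variance intervals above can be sketched in Python as a rough illustration. Every input number here is hypothetical, and the function names are invented for this example. The chi-square and f critical values are passed in by hand as assumed table readings, since the Python standard library provides neither distribution. The f-values in particular are approximate and should be checked against an actual table before use.

```python
from math import ceil
from statistics import NormalDist


def z_value(confidence):
    """z_{alpha/2}: the z-value leaving an area of alpha/2 to the right."""
    return NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)


def n_for_proportion(p_hat, e, confidence):
    """Theorem 9.4: n = z^2 * p_hat * q_hat / e^2, rounded up."""
    return ceil(z_value(confidence) ** 2 * p_hat * (1.0 - p_hat) / e**2)


def n_for_proportion_conservative(e, confidence):
    """Theorem 9.5: n = z^2 / (4 e^2), rounded up; needs no estimate of p."""
    return ceil(z_value(confidence) ** 2 / (4.0 * e**2))


def variance_interval(s2, n, chi2_upper, chi2_lower):
    """Interval for sigma^2.

    chi2_upper is the chi-square value leaving alpha/2 to the right and
    chi2_lower the value leaving 1 - alpha/2 to the right, both with
    n - 1 degrees of freedom (table lookups).
    """
    return (n - 1) * s2 / chi2_upper, (n - 1) * s2 / chi2_lower


def variance_ratio_interval(s1_sq, s2_sq, f_v1_v2, f_v2_v1):
    """Interval for sigma1^2 / sigma2^2 from f_{alpha/2}(v1, v2) and f_{alpha/2}(v2, v1)."""
    ratio = s1_sq / s2_sq
    return ratio / f_v1_v2, ratio * f_v2_v1


# Hypothetical: prior estimate p_hat = 0.68, error bound e = 0.02, 95% confidence.
n_with_estimate = n_for_proportion(0.68, 0.02, 0.95)
n_conservative = n_for_proportion_conservative(0.02, 0.95)

# Hypothetical: s^2 = 0.286 from n = 10, 95% confidence; assumed table values
# for 9 degrees of freedom are 19.023 (area 0.025) and 2.700 (area 0.975).
var_low, var_high = variance_interval(0.286, 10, 19.023, 2.700)

# Hypothetical: s1 = 3.07 (n1 = 15), s2 = 0.80 (n2 = 12), 98% confidence;
# assumed approximate table values f_{0.01}(14, 11) = 4.30, f_{0.01}(11, 14) = 3.87.
ratio_low, ratio_high = variance_ratio_interval(3.07**2, 0.80**2, 4.30, 3.87)

print(f"n (Theorem 9.4) = {n_with_estimate}, n (Theorem 9.5) = {n_conservative}")
print(f"{var_low:.3f} < sigma^2 < {var_high:.3f}")
print(f"{ratio_low:.3f} < sigma1^2/sigma2^2 < {ratio_high:.3f}")
```

Theorem 9.5 always gives the larger sample size because it replaces the product of the estimated proportions by its maximum possible value, 1/4.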