251solnO1 4/08/08 (Open this document in 'Page Layout' view!)

O. Estimation of Parameters.
1. Point and Interval Estimation. Properties of Estimators.
2. A Confidence Interval for $\mu$ When $\sigma$ is Known. Text 8.1-8.3, 8.8. (8.1-8.3, 8.8.)
3. A Confidence Interval for $\mu$ When $\sigma$ is not known. Text 8.10, 8.11, 8.17, 8.21, 8.89! on CD [8.10, 8.11, 8.15, 8.20*]. (8.10, 8.11, 8.15, 8.17*). O1.
4. A Confidence Interval for a Proportion. O2.
----------------------------------------------------------------------

Confidence Intervals when the Population Standard Deviation is Known

From the Outline: $\mu = \bar{x} \pm z_{\alpha/2}\,\sigma_{\bar{x}}$. This is not what you actually use most of the time! All that "$\sigma$ unknown" means is that we do not have a value of the population variance $\sigma^2$. If you only have the sample variance $s^2$, ignore this stuff and use the t table. Remember that $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$ or $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}}$, but that only the first formula is used in these exercises.

Exercise 8.1: If $\bar{x} = 85$, $\sigma = 8$ and $n = 64$, construct a 95% confidence interval for the population mean.
Solution: $\alpha = .05$. $z_{\alpha/2} = z_{.025} = 1.960$ can be found on the last line of the t table. So $\mu = \bar{x} \pm z_{\alpha/2}\frac{\sigma}{\sqrt{n}} = 85 \pm 1.960\,\frac{8}{\sqrt{64}} = 85 \pm 1.96$, or 83.04 to 86.96. More formally, $P(83.04 \le \mu \le 86.96) = .95$.

Exercise 8.2: If $\bar{x} = 125$, $\sigma = 24$ and $n = 36$, construct a 99% confidence interval for the population mean.
Solution: $\alpha = .01$. $z_{\alpha/2} = z_{.005} = 2.576$ from the last line of the t table. So $\mu = \bar{x} \pm z_{\alpha/2}\frac{\sigma}{\sqrt{n}} = 125 \pm 2.576\,\frac{24}{\sqrt{36}} = 125 \pm 10.30$, or $114.70 \le \mu \le 135.30$.

Exercise 8.3: A market researcher says that she has 95% confidence that the mean monthly sales of a product are between $170,000 and $200,000. Explain the meaning of this statement.
Solution: According to the Instructor's Solutions Manual, if all possible samples of the same size n are taken, 95% of them include the true population average monthly sales of the product within the interval developed. Thus we are 95 percent confident that this sample is one that does correctly estimate the true average amount.
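The arithmetic in Exercises 8.1 and 8.2 can be cross-checked with a short Python sketch. The helper name `z_interval` is our own, not something from the text, and the critical value is taken from the standard library's `NormalDist` instead of the last line of the t table, so the results may differ from the hand values in the third decimal place.

```python
from math import sqrt
from statistics import NormalDist


def z_interval(xbar, sigma, n, conf):
    """Interval for the mean when sigma is known (hypothetical helper).

    Uses the first standard-error formula only: sigma / sqrt(n).
    """
    alpha = 1 - conf
    z = NormalDist().inv_cdf(1 - alpha / 2)  # z_{alpha/2}
    half_width = z * sigma / sqrt(n)
    return xbar - half_width, xbar + half_width


# Exercise 8.1: xbar = 85, sigma = 8, n = 64, 95% confidence
lo1, hi1 = z_interval(85, 8, 64, 0.95)
# Exercise 8.2: xbar = 125, sigma = 24, n = 36, 99% confidence
lo2, hi2 = z_interval(125, 24, 36, 0.99)

print(f"8.1: {lo1:.2f} to {hi1:.2f}")
print(f"8.2: {lo2:.2f} to {hi2:.2f}")
```

Rounded to two decimals, both intervals should agree with the hand calculations above.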
Exercise 8.8: The quality control manager at a light bulb factory needs to estimate the mean life of a batch (population) of light bulbs. We assume that the (population) standard deviation is 100 hours. A random sample of 64 light bulbs from the batch yields a sample mean of 350.
a) Construct a 95% confidence interval for the population mean of light bulbs in this batch.
b) Do you think that the manufacturer has the right to state that the average life of the light bulbs is 400 hours? Explain.
c) Must you assume that the population of light bulb lives is Normal? Explain.
d) Suppose that the (population) standard deviation changed to 80 hours. How would this change your answers to a) and b)?

Solution: We are given $\alpha = .05$, $\bar{x} = 350$, $\sigma = 100$, $n = 64$. The answer below has been edited from the Instructor's Solutions Manuals for the 9th and 10th editions.
(a) The Instructor's Solutions Manual for the 9th edition does not show the details of this solution, but gives only a printout. The 10th edition actually says that $\bar{X} \pm Z\,\frac{\sigma}{\sqrt{n}} = 350 \pm 1.96\,\frac{100}{\sqrt{64}}$, or $325.5 \le \mu \le 374.50$. If you can't show this using the examples on the previous page, you have a serious problem!
(b) No. The manufacturer cannot support a claim that the bulbs last an average 400 hours. Based on the data from the sample, a mean of 400 hours would represent a distance of 4 standard deviations of the sample mean (standard errors, $\sigma_{\bar{x}} = 100/\sqrt{64} = 12.5$) above the sample mean of 350 hours. This is what is called a hypothesis test! Our so-called null hypothesis is $H_0: \mu = 400$. By showing that 400 does not fall in the confidence interval constructed from our sample statistic, we have shown that the mean is significantly different from 400 (at the 5% significance level).
(c) No. Since $\sigma$ is known and n = 64, from the central limit theorem, we may assume that the sampling distribution of $\bar{x}$ is approximately normal.
(d) in the 9th edition asks us to explain why an observed value of 320 hours would not be unusual even though it would be outside of the confidence interval just calculated. The Instructor's Solutions Manual for the 9th edition gives the following answer. An individual value of 320 is only 0.30 standard deviations below the sample mean of 350. The confidence interval represents bounds on the estimate of a sample of 64, not an individual value.

(d) in the 10th edition is (e) in the 9th edition. The confidence interval is narrower based on a process standard deviation of 80 hours rather than the original assumption of 100 hours.
(a) $\bar{X} \pm Z\,\frac{\sigma}{\sqrt{n}} = 350 \pm 1.96\,\frac{80}{\sqrt{64}}$, or $330.4 \le \mu \le 369.6$.
(b) Based on the smaller standard deviation, a mean of 400 hours would represent a distance of 5 standard deviations of the sample mean (standard errors, $\sigma_{\bar{x}} = 80/\sqrt{64} = 10$) above the sample mean of 350 hours. No, the manufacturer cannot support a claim that the bulbs last an average of 400 hours.

On the other hand, if the process standard deviation was sufficiently above 100 hours, we could deny that the mean was significantly different from 400 hours. How large would that have to be? Good question! The confidence interval is $\mu = \bar{x} \pm z_{\alpha/2}\frac{\sigma}{\sqrt{n}} = 350 \pm 1.960\,\frac{\sigma}{\sqrt{64}}$. If $350 + 1.960\,\frac{\sigma}{\sqrt{64}} \ge 400$, then $1.960\,\frac{\sigma}{\sqrt{64}} \ge 400 - 350 = 50$. If we take this as an equality we get $\sigma = \frac{50\sqrt{64}}{1.960} = 204.08$. So if $\sigma$ is larger than 204.08, we will not be able to say that the mean is significantly different from 400.

Confidence Intervals when the Population Standard Deviation is Unknown

From the Outline: $\mu = \bar{x} \pm t_{\alpha/2}^{(n-1)}\,s_{\bar{x}}$. This is what you actually use most of the time! All that "$\sigma$ unknown" means is that we do not have a value of the population variance. If you only have the sample variance, use the t table. Remember that $s_{\bar{x}} = \frac{s}{\sqrt{n}}$ or $s_{\bar{x}} = \frac{s}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}}$. Only the first formula is used in most text problems.
Exercise 8.10: Find $t_{\alpha/2}^{(n-1)}$ for a) a 95% confidence level and a sample size of 10, b) a 99% confidence level and a sample size of 10, c) a 95% confidence level and a sample size of 32, d) a 95% confidence level and a sample size of 65 and e) a 90% confidence level and a sample size of 16.
Solution: In each case we use $t_{\alpha/2}^{(n-1)}$.
(a) $1 - \alpha = .95$, so $\alpha = .05$ and $\alpha/2 = .025$. $n = 10$, so $n - 1 = 9$, and $t_{.025}^{(9)} = 2.262$.
(b) $t_{.005}^{(9)} = 3.250$
(c) $t_{.025}^{(31)} = 2.040$
(d) $t_{.025}^{(64)} = 1.998$
(e) $t_{.05}^{(15)} = 1.753$

Exercise 8.11: If $\bar{x} = 75$, $s = 24$, $n = 36$ and the parent population is Normal, find a 95% confidence interval for the population mean.
Solution: $\alpha = .05$. $t_{\alpha/2}^{(n-1)} = t_{.025}^{(35)} = 2.030$. $\mu = \bar{x} \pm t\,s_{\bar{x}} = 75 \pm 2.030\,\frac{24}{\sqrt{36}} = 75 \pm 8.12$, or $66.88 \le \mu \le 83.12$.

Exercise 8.17 [8.15 in 9th]: The problem is about a tread wear index, an important measure of a tire's performance. A brand of tires is graded 200. A random sample of 18 tires is taken by a consumer organization and gives a sample mean tread wear index of 195.3 and a sample standard deviation of 21.4. Assume a Normal distribution.
a. Set up a 95% confidence interval for the population mean.
b. Should the organization accuse the manufacturer of not meeting the standard for the grade?
c. Why is a tread wear index of 210 for a given tire not unusual?
Solution: We have $\bar{x} = 195.3$, $s = 21.4$, $n = 18$ and $\alpha = .05$, so $t_{\alpha/2}^{(n-1)} = t_{.025}^{(17)} = 2.110$.
(a) $\mu = \bar{x} \pm t\,s_{\bar{x}} = 195.3 \pm 2.110\,\frac{21.4}{\sqrt{18}} = 195.3 \pm 10.643$, or $184.657 \le \mu \le 205.942$.
(b) No, a grade of 200 is in the interval.
(c) It is not unusual. A tread-wear index of 210 for a particular tire is only 0.69 standard deviations above the sample mean of 195.3.

Exercise 8.89 [Not in 9th ??]: If we are sampling without replacement from a population of 200 and from a sample of 36 get a sample mean of 75 and a standard deviation of 24, construct a 95% confidence interval for the mean.
Solution: Our facts are $\bar{x} = 75$, $s = 24$, $n = 36$, $N = 200$ and $\alpha = .05$.
Our basic formula remains $\mu = \bar{x} \pm t_{\alpha/2}^{(n-1)}\,s_{\bar{x}}$, but because the sample is more than 5% of the population, we need a finite population correction. $s_{\bar{x}} = \frac{s}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}} = \frac{24}{\sqrt{36}}\sqrt{\frac{200-36}{200-1}} = 4\sqrt{\frac{164}{199}} = 4\sqrt{0.82412} = 4(.9078) = 3.631$. The value of $t_{\alpha/2}^{(n-1)}$ that we need is $t_{.025}^{(35)} = 2.030$. So the interval is $\mu = \bar{x} \pm t_{\alpha/2}^{(n-1)}\,s_{\bar{x}} = 75 \pm 2.030(3.631) = 75 \pm 7.37$, so we can say $P(67.63 \le \mu \le 82.37) = .95$.

Exercise 8.17 (in 8th edition only): In order to estimate dental expenses to plan for a proposed dental plan, a personnel department takes a random sample of dental expenses for the families of 10 employees over the previous year. (Dental data set on disk)
Expenses: 110 362 246 85 510 208 173 425 316 179
a. Set up a 90% confidence interval estimate of mean family dental expenses for all employees.
b. What assumption must be made about the population distribution in a)?
c. Give an example of a family dental expense that is outside the confidence interval but that is not unusual for an individual family and explain why this is not a contradiction.
d. Repeat a) for a 95% interval.
e. What would the effect be in a) of changing the fourth value from $85 to $585?

Solution: Since the file is available on your disk, I downloaded it to Minitab and got the following results with my comments added.

----- 4/9/2003 2:42:54 PM -----
Welcome to Minitab, press F1 for help.
MTB > Retrieve "E:\Content\Data Files\Minitab Data Files\Dental.MTW".
Retrieving worksheet from file: E:\Content\Data Files\Minitab Data Files\Dental.MTW
# Worksheet was saved on Mon Apr 27 1998
Results for: Dental.MTW

The data you have is labeled Expenses and is in C1. To compute the variance I squared the data and placed it in C2.

MTB > Let c2=c1*c1
MTB > print c1 c2

Data Display
Row  Expenses      C2
  1       110   12100
  2       362  131044
  3       246   60516
  4        85    7225
  5       510  260100
  6       208   43264
  7       173   29929
  8       425  180625
  9       316   99856
 10       179   32041

MTB > sum c1
Sum of Expenses
Sum of Expenses = 2614.0        ($\sum x = 2614$)
MTB > sum c2
Sum of C2
Sum of C2 = 856700        ($\sum x^2 = 856700$)

MTB > Describe c1
This is the basic command for describing a data set.

Descriptive Statistics: Expenses
Variable    N    Mean  Median  TrMean  StDev  SE Mean
Expenses   10   261.4   227.0   252.4  138.8     43.9

Variable  Minimum  Maximum     Q1     Q3
Expenses     85.0    510.0  157.3  377.8

MTB > tinterval 90 c1
This is the command to get a 90% confidence interval.

One-Sample T: Expenses
Variable    N   Mean  StDev  SE Mean  90.0% CI
Expenses   10  261.4  138.8     43.9  (180.9, 341.9)

So the interval is $180.9 \le \mu \le 341.9$.

(a) The Instructor's Solutions Manual gives only a printout for this part. If we do this by hand, we write down the two columns above.

   x      x^2
 110    12100
 362   131044
 246    60516
  85     7225
 510   260100
 208    43264
 173    29929
 425   180625
 316    99856
 179    32041
2614   856700

If $\alpha = .10$, $n = 10$, $\sum x = 2614$ and $\sum x^2 = 856700$, then $\bar{x} = \frac{\sum x}{n} = \frac{2614}{10} = 261.4$ and $s^2 = \frac{\sum x^2 - n\bar{x}^2}{n-1} = \frac{856700 - 10(261.4)^2}{9} = 19266.711$, so $s = \sqrt{19266.711} = 138.8046$. We use $t_{\alpha/2}^{(n-1)} = t_{.05}^{(9)} = 1.833$. $\mu = \bar{x} \pm t\,s_{\bar{x}} = 261.4 \pm 1.833\,\frac{138.80}{\sqrt{10}} = 261.40 \pm 80.45$, or 180.95 to 341.85.
(b) The population of dental expenses must be approximately normally distributed.
(c) I'll let you think about this one.
(d) If $\alpha = .05$, $n = 10$, $\bar{x} = 261.4$ and $s = 138.8046$, we use $t_{\alpha/2}^{(n-1)} = t_{.025}^{(9)} = 2.262$. $\mu = \bar{x} \pm t\,s_{\bar{x}} = 261.4 \pm 2.262\,\frac{138.80}{\sqrt{10}} = 261.40 \pm 2.262(43.892) = 261.40 \pm 99.28$, or 162.12 to 360.68.
(e) The additional $500 in dental expenses, divided across the sample of 10, raises the mean by $50 and increases the standard deviation by nearly $20. The interval half-width increases by nearly $11 in the process. The new interval is $\bar{x} \pm t\,\frac{s}{\sqrt{n}} = 311.40 \pm 1.8331\,\frac{157.056}{\sqrt{10}} = 311.40 \pm 91.04$, or 220.36 to 402.44.

Exercise 8.21 (8.20 in 9th): In New York a random sample was taken of the time required in days to approve 27 Savings Bank Life Insurance policies. (Insurance data set on disk)
Time: 73 31 92 19 56 63 16 22 50 64 18 51 28 45 69 28 48 16 31 17 17 90 17 60 17 56 91
a. Set up a 95% confidence interval estimate of mean processing time.
b.
What assumption must be made about the population distribution in a)?
c. Do you think that the assumption made in b) has been seriously violated? Explain.
d. Compare the conclusions reached in a) with those of Problem 3.61 on page 126.

Solution: Since the file was available on disk, I downloaded it to Minitab and got the results below.

MTB > Retrieve "C:\Berenson\Data_Files-9th\Minitab\INSURANCE.MTW".
Retrieving worksheet from file: C:\Berenson\Data_Files9th\Minitab\INSURANCE.MTW
# Worksheet was saved on Mon Apr 09 2001
Results for: INSURANCE.MTW

The data you have is labeled Time and is in C1. To compute the variance I squared the data and placed it in C2.

MTB > let c2 = c1*c1
MTB > print c1 c2

Data Display
Row  Time    C2
  1    73  5329
  2    19   361
  3    16   256
  4    64  4096
  5    28   784
  6    28   784
  7    31   961
  8    90  8100
  9    60  3600
 10    56  3136
 11    31   961
 12    56  3136
 13    22   484
 14    18   324
 15    45  2025
 16    48  2304
 17    17   289
 18    17   289
 19    17   289
 20    91  8281
 21    92  8464
 22    63  3969
 23    50  2500
 24    51  2601
 25    69  4761
 26    16   256
 27    17   289

MTB > sum c1
Sum of Time
Sum of Time = 1185.0        ($\sum x = 1185$)

MTB > sum c2
Sum of C2
Sum of C2 = 68629        ($\sum x^2 = 68629$)

MTB > describe c1
This is the basic command for describing a data set.

Descriptive Statistics: Time
Variable   N   Mean  Median  TrMean  StDev  SE Mean
Time      27  43.89   45.00   43.08  25.28     4.87

Variable  Minimum  Maximum     Q1     Q3
Time        16.00    92.00  18.00  63.00

MTB > tinterval 95 c1
This is the command to get a 95% confidence interval.

One-Sample T: Time
Variable   N   Mean  StDev  SE Mean  95.0% CI
Time      27  43.89  25.28     4.87  (33.89, 53.89)

If we do this by hand we get the following.

obs      x    x^2
  1     73   5329
  2     19    361
  3     16    256
  4     64   4096
  5     28    784
  6     28    784
  7     31    961
  8     90   8100
  9     60   3600
 10     56   3136
 11     31    961
 12     56   3136
 13     22    484
 14     18    324
 15     45   2025
 16     48   2304
 17     17    289
 18     17    289
 19     17    289
 20     91   8281
 21     92   8464
 22     63   3969
 23     50   2500
 24     51   2601
 25     69   4761
 26     16    256
 27     17    289
Sum   1185  68629
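The column totals and the Minitab interval above can be checked with a few lines of Python. This is only a sketch: the variable names are our own, the shortcut formula $s^2 = (\sum x^2 - n\bar{x}^2)/(n-1)$ is the usual computational one, and the critical value 2.056 for 26 degrees of freedom is copied from a t table and not computed here.

```python
from math import sqrt

# Processing times in days, in the row order of the data display above
times = [73, 19, 16, 64, 28, 28, 31, 90, 60, 56, 31, 56, 22, 18,
         45, 48, 17, 17, 17, 91, 92, 63, 50, 51, 69, 16, 17]

n = len(times)
sum_x = sum(times)                     # total of the x column
sum_x2 = sum(x * x for x in times)     # total of the x^2 column

xbar = sum_x / n
s2 = (sum_x2 - n * xbar ** 2) / (n - 1)   # computational formula for the variance
s = sqrt(s2)

t_crit = 2.056                         # t table value, .025 in each tail, 26 df (assumed input)
half_width = t_crit * s / sqrt(n)
lower, upper = xbar - half_width, xbar + half_width

print(n, sum_x, sum_x2)
print(f"mean = {xbar:.5f}, s = {s:.4f}")
print(f"95% interval: {lower:.2f} to {upper:.2f}")
```

Because the table value of t is rounded to three decimals, the endpoints can differ from Minitab's in the second decimal place.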
If $\alpha = .05$, $n = 27$, $\sum x = 1185$ and $\sum x^2 = 68629$, then $\bar{x} = \frac{\sum x}{n} = \frac{1185}{27} = 43.88889$ and $s^2 = \frac{\sum x^2 - n\bar{x}^2}{n-1} = \frac{68629 - 27(43.88889)^2}{26} = \frac{16620.6667}{26} = 639.2564$, so $s = \sqrt{639.2564} = 25.2835$. For a 2-sided interval use $t_{\alpha/2}^{(n-1)} = t_{.025}^{(26)} = 2.056$. $\mu = \bar{x} \pm t\,s_{\bar{x}} = 43.88889 \pm 2.056\,\frac{25.2835}{\sqrt{27}} = 43.89 \pm 10.00$, or 33.89 to 53.89, which means we can say $P(33.89 \le \mu \le 53.89) = .95$. Or make a diagram of an almost Normal curve with a mean at 43.89 and mark 33.89 and 53.89. Label the area between these two points with 95% and the area in each of the tails with 2.5%.

The remainder of the solution comes from the Instructor's Solutions Manual.
(b) The population distribution needs to be normally distributed.
(c) [Figure: Normal Probability Plot, Time against Z Value]
[Figure: Box-and-whisker Plot, Time]
Both the normal probability plot and the box-and-whisker plot show that the population distribution is not normally distributed and is skewed to the right.
(d) With a sample size of 27 and a population distribution that appears to be skewed, the method used in (a) is not reliable and, hence, any comparison with Problem 2.64 is likely to be invalid.

A Summary Exam-type Problem

We are using the following formulas.
When $\sigma$ is known: $\mu = \bar{x} \pm z_{\alpha/2}\,\sigma_{\bar{x}}$, where $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$, or $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}}$ when the sample is more than 5% of the population.
When $\sigma$ is unknown: $\mu = \bar{x} \pm t_{\alpha/2}^{(n-1)}\,s_{\bar{x}}$, where $s_{\bar{x}} = \frac{s}{\sqrt{n}}$, or $s_{\bar{x}} = \frac{s}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}}$ when the sample is more than 5% of the population.

Problem O1: If $n = 64$ and $\bar{x} = 11.50$, find 95% confidence intervals for the mean under the following circumstances:
a. $\sigma = 6.30$, $N = 3000$
b. $\sigma = 6.30$, $N = 300$
c. $s = 6.30$, $N = 3000$
d. $s = 6.30$, $N = 300$

Solution: Use the formulas from Table 3 of the syllabus supplement or from the outline.
a) $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{6.30}{\sqrt{64}} = .7875$ and $z_{\alpha/2} = z_{.025} = 1.960$, so $\mu = \bar{x} \pm z_{\alpha/2}\,\sigma_{\bar{x}} = 11.50 \pm 1.960(.7875) = 11.50 \pm 1.54$, or 9.96 to 13.04. More formally, we can say $P(9.96 \le \mu \le 13.04) = .95$ or make a diagram.
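b) $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}} = \frac{6.30}{\sqrt{64}}\sqrt{\frac{300-64}{300-1}} = 0.7875\sqrt{\frac{236}{299}} = 0.7875(.8884) = .6996$ and $z_{\alpha/2} = z_{.025} = 1.96$, so $\mu = \bar{x} \pm z_{\alpha/2}\,\sigma_{\bar{x}} = 11.50 \pm 1.960(.6996) = 11.50 \pm 1.37$, or 10.13 to 12.87.
c) $s_{\bar{x}} = \frac{s}{\sqrt{n}} = \frac{6.30}{\sqrt{64}} = .7875$ and $t_{\alpha/2}^{(n-1)} = t_{.025}^{(63)} = 1.998$, so $\mu = \bar{x} \pm t_{\alpha/2}^{(n-1)}\,s_{\bar{x}} = 11.50 \pm 1.998(.7875) = 11.50 \pm 1.57$, or 9.93 to 13.07.
d) $s_{\bar{x}} = \frac{s}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}} = \frac{6.30}{\sqrt{64}}\sqrt{\frac{300-64}{300-1}} = .6996$ and $t_{\alpha/2}^{(n-1)} = t_{.025}^{(63)} = 1.998$, so $\mu = \bar{x} \pm t_{\alpha/2}^{(n-1)}\,s_{\bar{x}} = 11.50 \pm 1.998(.6996) = 11.50 \pm 1.40$, or 10.10 to 12.90.

A Confidence Interval for a Proportion

Problem O2 (Black):
a) A researcher wants to know what share her company holds in a large city. A sample of 1003 people who bought CDs in the last month is taken and 256 turn out to have bought her company's products (CDs). Create a 95% confidence interval for the proportion that bought her company's products.
b) CD sales aren't what they used to be. What if we find out that there were only 10000 people who bought CDs in the city last month?

Solution:
a) $p = \bar{p} \pm z_{\alpha/2}\,s_{\bar{p}} = \bar{p} \pm z_{\alpha/2}\sqrt{\frac{\bar{p}\bar{q}}{n}}$. $\bar{p} = \frac{x}{n} = \frac{256}{1003} = .2552$, $\bar{q} = 1 - \bar{p} = .7448$. $s_{\bar{p}} = \sqrt{\frac{.2552(.7448)}{1003}} = \sqrt{.0001895} = .01377$. $z_{\alpha/2} = z_{.025} = 1.96$. So $p = .2552 \pm 1.960(.01377) = .2552 \pm .0270$, or .2282 to .2822.
b) If $N = 10000$ our sample is more than 5% of the population and we have $s_{\bar{p}} = \sqrt{\frac{N-n}{N-1}}\sqrt{\frac{\bar{p}\bar{q}}{n}} = \sqrt{\frac{10000-1003}{10000-1}}\,(.01377) = \sqrt{0.89979}\,(.01377) = 0.94857(.01377) = .01306$. So $p = .2552 \pm 1.960(.01306) = .2552 \pm .0256$.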
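Both parts of Problem O2 can be reproduced with a small Python sketch. The function name `proportion_interval` and its optional `N` argument are our own illustration, not part of the course materials; the finite population correction is applied only when the sample exceeds 5% of the population, following the rule stated in the summary formulas above.

```python
from math import sqrt
from statistics import NormalDist


def proportion_interval(x, n, conf, N=None):
    """Interval for a proportion, with an optional finite population correction.

    x: number of successes, n: sample size, N: population size (hypothetical parameter).
    """
    p_bar = x / n
    q_bar = 1 - p_bar
    se = sqrt(p_bar * q_bar / n)
    # Correct the standard error only when the sample is more than 5% of the population
    if N is not None and n > 0.05 * N:
        se *= sqrt((N - n) / (N - 1))
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)   # z_{alpha/2}
    half_width = z * se
    return p_bar - half_width, p_bar + half_width


a_lo, a_hi = proportion_interval(256, 1003, 0.95)            # part a)
b_lo, b_hi = proportion_interval(256, 1003, 0.95, N=10000)   # part b)

print(f"a) {a_lo:.4f} to {a_hi:.4f}")
print(f"b) {b_lo:.4f} to {b_hi:.4f}, half-width {(b_hi - b_lo) / 2:.4f}")
```

The correction factor is below 1, so the interval in part b) is slightly narrower than the one in part a).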