confidence interval estimation

CHAPTER 20
CONFIDENCE INTERVAL ESTIMATION

MULTIPLE CHOICE QUESTIONS

In the following multiple-choice questions, please circle the correct answer.

1. The confidence interval for a proportion is based on the assumption of a large sample size. A rule of thumb for checking the validity of this assumption is that \( n p_L \), \( n(1 - p_L) \), \( n p_U \), and \( n(1 - p_U) \) are all greater than what value?
a. 0  b. n  c. 2  d. 3  e. 5
ANSWER: e

2. When the samples we want to compare are paired in some natural way, such as pretest/posttest for each person or husband/wife pairs, a more appropriate form of analysis is to compare not two separate variables, but their ________.
a. difference  b. sum  c. ratio  d. total  e. product
ANSWER: a

3. Confidence intervals are a function of which of the following three things?
a. The population, the sample, and the standard deviation
b. The sample, the variable of interest, and the degrees of freedom
c. The data in the sample, the confidence level, and the sample size
d. The sampling distribution, the confidence level, and the degrees of freedom
e. The mean, the median, and the mode
ANSWER: c

4. The chi-square and F distributions are used primarily to make inferences about population ________.
a. means  b. variances  c. medians  d. modes  e. proportions
ANSWER: b

5. If you increase the confidence level, the width of the confidence interval ________.
a. decreases  b. increases  c. stays the same  d. may increase or decrease, depending on the sample data
ANSWER: b

6. A random sample allows us to use:
a. the rules of probabilities  b. the rules of large numbers  c. the laws of parameters  d. the laws of distributions  e. the laws of gravity
ANSWER: a

7. Suppose there are 500 accounts in a population. You sample 50 of them and find a sample total of $5,000. What would be your estimate for the population total?
a. $5,000  b. $50,000  c. $250,000  d. $2,500,000  e. None of the above
ANSWER: b

8. Suppose there are 400 accounts in a population. You sample 50 of them and find a sample mean of $500. What would be your estimate for the population total?
a. $5,000  b. $50,000  c. $250,000  d. $2,500,000  e. None of the above
ANSWER: e (the estimate is 400 × $500 = $200,000)

9. When we replace \( \sigma \) with the sample standard deviation (s), we introduce a new source of variability and the sampling distribution becomes the ________.
a. t distribution  b. F distribution  c. chi-square distribution  d. robust distribution
ANSWER: a

10. Another commonly used random mechanism, besides a simple random sample, is called:
a. interval estimation  b. a random hypothesis test  c. a randomized experiment  d. a nuisance sample
ANSWER: c

11. If the odds of a horse winning a race are 2 to 1, then the probability of this horse winning the race is ________.
a. 1/4  b. 1/3  c. 1/2  d. 2/3  e. 2/10
ANSWER: d

12. There are, generally speaking, two types of statistical inference. They are:
a. sample estimation and population estimation
b. confidence interval estimation and hypothesis testing
c. interval estimation for a mean and interval estimation for a proportion
d. independent sample estimation and dependent sample estimation
e. none of the above
ANSWER: b

13. The t distribution has ________ degrees of freedom.
a. n  b. 2  c. 10  d. n - 1  e. trillion
ANSWER: d

14. If you are constructing a confidence interval for a single mean, the width of the confidence interval ________ with an increase in the sample size.
a. decreases  b. increases  c. stays the same  d. may increase or decrease, depending on the sample data
ANSWER: a

15. As the sample size increases, the t distribution becomes more similar to the ________ distribution.
a. normal  b. exponential  c. F  d. chi-square  e. binomial
ANSWER: a

16. A parameter, such as \( \sigma \), is sometimes referred to as a ________ parameter, because many times we need its value even though it is not the parameter of primary interest.
a. special  b. random  c. nuisance  d. independent  e. dependent
ANSWER: c

17. When you calculate the sample size for a proportion, you use an estimate for the population proportion (\( p_{est} \)).
A conservative value for n can be obtained by using \( p_{est} \) = ________.
a. 0.0  b. 0.05  c. 0.10  d. 0.50  e. 1.00
ANSWER: d

QUESTIONS 18 THROUGH 23 ARE BASED ON THE FOLLOWING INFORMATION:
The following values have been calculated using the TDIST and TINV functions in Excel. These values come from a t distribution with 24 degrees of freedom.

These values represent the probability to the right of the given positive values.

Value           1.00     1.20     1.40
t probability   0.1636   0.1209   0.0872

These values represent the t value for a given two-tailed probability (the sum of both tails, as TINV returns it).

Probability   0.20     0.10     0.05
t value       1.3178   1.7109   2.0639

18. What is the probability of a t-value smaller than 1.00?
a. 0.1209  b. 0.1636  c. 0.8364  d. 0.8791
ANSWER: c

19. What is the probability of a t-value larger than 1.20?
a. 0.0872  b. 0.1209  c. 0.1636  d. 0.2000
ANSWER: b

20. What would be the t-value where 0.05 of the values are in the upper tail?
a. +1.000  b. +1.318  c. +1.711  d. +2.064
ANSWER: c

21. What is the probability of a t-value between -1.40 and +1.40?
a. 0.7582  b. 0.8256  c. 0.9128  d. 0.9500
ANSWER: b

22. What would be the t-values where 0.10 of the values are in both tails (sum of both tails)?
a. -1.000, +1.000  b. -1.318, +1.318  c. -1.711, +1.711  d. -2.064, +2.064
ANSWER: c

23. What would be the t-values where 0.95 of the values would fall within this interval?
a. -1.000, +1.000  b. -1.318, +1.318  c. -1.711, +1.711  d. -2.064, +2.064
ANSWER: d

QUESTIONS 24 THROUGH 29 ARE BASED ON THE FOLLOWING INFORMATION:
The following values have been calculated using the TDIST and TINV functions in Excel. These values come from a t distribution with 15 degrees of freedom.

These values represent the probability to the right of the given positive values.

Value           0.95     1.15     1.20
t probability   0.1786   0.1341   0.1244

These values represent the t value for a given two-tailed probability (the sum of both tails, as TINV returns it).

Probability   0.20    0.15    0.10
t value       1.341   1.517   1.753

24. What is the probability of a t-value smaller than 1.20?
a. 0.8756  b. 0.8659  c. 0.1341  d. 0.1244
ANSWER: a

25. What is the probability of a t-value between -0.95 and +0.95?
a. 0.1786  b. 0.3572  c. 0.6428  d. 0.8214
ANSWER: c

26. What is the probability of a t-value larger than 1.15?
a. 0.1786  b. 0.1341  c. 0.1244  d. 0.1500
ANSWER: b

27. What would be the t-value where 0.075 of the values are in the upper tail?
a. +1.000  b. +1.341  c. +1.517  d. +1.753
ANSWER: c

28. What would be the t-values where 0.80 of the values would fall within this interval?
a. -1.000, +1.000  b. -1.341, +1.341  c. -1.517, +1.517  d. -1.753, +1.753
ANSWER: b

29. What would be the t-values where 0.10 of the values are in both tails (sum of both tails)?
a. -1.000, +1.000  b. -1.341, +1.341  c. -1.517, +1.517  d. -1.753, +1.753
ANSWER: d

TEST QUESTIONS

30. You are told that a random sample of 150 people from Iowa has been given cholesterol tests, and 60 of these people had levels over the "safe" count of 200. Construct a 95% confidence interval for the population proportion of people in Iowa with cholesterol levels over 200.
ANSWER: \( n = 150 \), \( \hat{p} = 60/150 = 0.40 \)
\( \hat{p} \pm z\sqrt{\hat{p}(1-\hat{p})/n} = 0.40 \pm 1.96\sqrt{(0.40)(0.60)/150} = 0.40 \pm 0.0784 \)
Lower limit = 0.3216, and upper limit = 0.4784

31. You are trying to estimate the average amount a family spends on food during a year. In the past, the standard deviation of the amount a family has spent on food during a year has been approximately $1200. If you want to be 99% sure that you have estimated average family food expenditures within $60, how many families do you need to survey?
ANSWER: \( \sigma_{est} = 1200 \), z-multiple = 2.575, B = 60. The sample size for a mean is given by
\( n = \left( \dfrac{z\text{-multiple} \times \sigma_{est}}{B} \right)^2 = \left( \dfrac{2.575 \times 1200}{60} \right)^2 \approx 2653 \)

32. You have been assigned to determine whether more people prefer Coke to Pepsi. Assume that roughly half the population prefers Coke and half prefers Pepsi.
How large a sample would you need to take to ensure that you could estimate, with 95% confidence, the proportion of people preferring Coke within 3% of the actual value?
ANSWER: \( p_{est} = 0.50 \), z-multiple = 1.96, B = 0.03. The sample size for a proportion is given by
\( n = \left( \dfrac{z\text{-multiple}}{B} \right)^2 p_{est}(1 - p_{est}) = \left( \dfrac{1.96}{0.03} \right)^2 (0.50)(0.50) \approx 1068 \)

QUESTIONS 33 THROUGH 35 ARE BASED ON THE FOLLOWING INFORMATION:
A marketing research consultant hired by Coca-Cola is interested in determining the proportion of customers who favor Coke over other soft drinks. A random sample of 400 consumers was selected from the market under investigation and showed that 53% favored Coca-Cola over other brands.

33. Compute a 95% confidence interval for the true proportion of people who favor Coke. Do the results of this poll convince you that a majority of people favors Coke?
ANSWER: 0.53 ± 0.0489 = (0.4811, 0.5789). Since the confidence interval ranges from 48.1% to 57.9%, it is difficult to conclude that a majority of people favors Coke. The true proportion could be below 50%.

34. Suppose 2,000 (not 400) people were polled and 53% favored Coke. Would you now be convinced that a majority of people favor Coke? Why might your answer be different than in Question 33?
ANSWER: 0.53 ± 0.0219 = (0.5081, 0.5519). In this case the 95% confidence interval is entirely above 50%, so the data are more convincing than they were previously. The larger sample size reduces the standard error and narrows the interval.

35. How many people would have to be surveyed to be 95% confident that you can estimate the fraction of people who favor Coca-Cola within 1%?
ANSWER: 9,569.43 or 9,570.

QUESTIONS 36 AND 37 ARE BASED ON THE FOLLOWING INFORMATION:
The employee benefits manager of a medium size business would like to estimate the proportion of full-time employees who prefer adopting plan A of three available health care plans in the coming annual enrollment period. A reliable frame of the company's employees and their tentative health care preferences is available.
Using Excel, the manager chose a random sample of size 50 from the frame. There were 17 employees in the sample who preferred plan A.

36. Construct a 99% confidence interval for the proportion of company employees who prefer plan A. Assume that the population consists of the preferences of all employees in the frame.
ANSWER: \( n = 50 \), \( \hat{p} = 17/50 = 0.34 \)
\( \hat{p} \pm z\sqrt{\hat{p}(1-\hat{p})/n} = 0.34 \pm 2.575\sqrt{(0.34)(0.66)/50} = 0.34 \pm 0.1725 \)
Lower limit = 0.1675, upper limit = 0.5125

37. Interpret the 99% confidence interval constructed in Question 36.
ANSWER: We are 99% confident that the proportion of all employees who prefer plan A is between 0.1675 and 0.5125.

QUESTIONS 38 THROUGH 40 ARE BASED ON THE FOLLOWING INFORMATION:
Q-Mart is interested in comparing its male and female customers. Q-Mart would like to know if its female charge customers spend more money, on average, than its male charge customers. They have collected random samples of 25 female customers and 22 male customers. On average, women charge customers spend $102.23 and men charge customers spend $86.46. Some information is shown below.

Summary statistics for two samples
                             Female    Male
Sample sizes                 25        22
Sample means                 102.23    86.46
Sample standard deviations   93.393    59.695

Confidence interval for difference between means
Sample mean difference       15.77
Pooled standard deviation    79.466
Std error of difference      23.23

38. Using a t-value of 2.014, calculate a 95% confidence interval for the difference between the average female purchase and the average male purchase. Would you conclude that there is a significant difference between females and males in this case? Explain.
ANSWER: 15.77 ± 46.785 = (-31.015, 62.555). Since the interval includes 0, there does not appear to be a significant difference between the means of the two groups.

39. What are the degrees of freedom for the t-statistic in this calculation? Explain how you would calculate the degrees of freedom in this case.
ANSWER: \( n_1 + n_2 - 2 = 45 \)

40. What is the assumption in this case that allows you to use the pooled standard deviation for this confidence interval?
ANSWER: In order to use the pooled standard deviation for this confidence interval, we must assume that the two population standard deviations are equal (\( \sigma_1 = \sigma_2 \)).

QUESTIONS 41 AND 42 ARE BASED ON THE FOLLOWING INFORMATION:
A company employs two shifts of workers. Each shift produces a type of gasket where the thickness is the critical dimension. The average thickness and the standard deviation of thickness for shift 1, based on a random sample of 40 gaskets, are 10.85 mm and 0.16 mm, respectively. The similar figures for shift 2, based on a random sample of 30 gaskets, are 10.90 mm and 0.19 mm. Let \( \mu_1 - \mu_2 \) be the difference in mean thickness between shifts 1 and 2, and assume that the population variances are equal.

41. Construct a 95% confidence interval for \( \mu_1 - \mu_2 \).
ANSWER: \( n_1 = 40 \), \( \bar{X}_1 = 10.85 \), \( s_1 = 0.16 \); \( n_2 = 30 \), \( \bar{X}_2 = 10.90 \), \( s_2 = 0.19 \)
The pooled standard deviation is
\( s_p = \sqrt{\dfrac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}} = 0.1734 \)
\( (\bar{X}_1 - \bar{X}_2) \pm t\, s_p \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}} = -0.05 \pm 1.9955(0.1734)(0.2415) = -0.05 \pm 0.0836 \)
Lower limit = -0.1336, and upper limit = 0.0336.

42. Based on your answer to Question 41, are you convinced that the gaskets from shift 2 are, on average, thicker than those from shift 1? Why or why not?
ANSWER: The confidence interval extends from a negative number (indicating shift 2 thickness is larger) to a positive number (indicating shift 2 thickness is smaller). So we cannot be sure which mean is greater.

QUESTIONS 43 AND 44 ARE BASED ON THE FOLLOWING INFORMATION:
A sample of 9 production managers with over 15 years of experience has an average salary of $71,000 and a sample standard deviation of $18,000.

43. You can be 95% confident that the mean salary for all production managers with at least 15 years of experience is between what two numbers (the t-statistic with 8 degrees of freedom is 2.306)?
What assumption are you making about the distribution of salaries?
ANSWER: $71,000 ± $13,836 = ($57,164, $84,836). The assumption is that the population is normal or near normal. This is particularly important since the sample size is so small (9). However, the t distribution is rather robust to violations of normality.

44. What sample size would be needed to ensure that we could estimate the true mean salary of all production managers with more than 15 years of experience and have only 5 chances in 100 of being off by more than $600?
ANSWER: 69.18 or 70

QUESTIONS 45 THROUGH 50 REQUIRE THE USE OF EXCEL:

45. Compute \( P(-1.50 \le t_{10} \le 1.00) \), where \( t_{10} \) has a t-distribution with 10 degrees of freedom.
ANSWER: 0.74730

46. Compute \( P(-1.50 \le t_{100} \le 1.00) \), where \( t_{100} \) has a t-distribution with 100 degrees of freedom.
ANSWER: 0.77176

47. Compute \( P(-1.50 \le Z \le 1.00) \), where Z is a standard normal random variable.
ANSWER: 0.77454

48. Compare the result of Question 47 to the results obtained in Questions 45 and 46. How do you explain the difference in these probabilities?
ANSWER: The variance of a t with few degrees of freedom is larger than that of a t with many degrees of freedom, which in turn is larger than that of Z. This explains why the "between" probabilities in Questions 45, 46, and 47 increase.

49. Find the 75th percentile of the t-distribution with 25 degrees of freedom.
ANSWER: 0.6844

50. Find the 75th percentile of the t-distribution with 5 degrees of freedom.
ANSWER: 0.7267

QUESTIONS 51 AND 52 ARE BASED ON THE FOLLOWING INFORMATION:
A sample of 40 country CD recordings of Willie Nelson has been examined. The average playing time of these recordings is 51.3 minutes, and the standard deviation is 5.8 minutes.

51. Construct a 95% confidence interval for the mean playing time of all Willie Nelson recordings.
ANSWER: \( n = 40 \), \( \bar{X} = 51.3 \), \( s = 5.8 \)
\( \bar{X} \pm t(s/\sqrt{n}) = 51.3 \pm 2.0227(5.8/\sqrt{40}) = 51.3 \pm 1.855 \)
Lower limit = 49.445, and upper limit = 53.155

52. Interpret the confidence interval you constructed.
ANSWER: We are 95% confident that the mean playing time of all Willie Nelson recordings is between 49.445 and 53.155 minutes.

QUESTIONS 53 AND 54 ARE BASED ON THE FOLLOWING INFORMATION:
A department store is interested in the average balance that is carried on its store's credit card. A sample of 40 accounts reveals an average balance of $1,250 and a standard deviation of $350.

53. Find a 95% confidence interval for the mean account balance on this store's credit card (the t-statistic with 39 degrees of freedom is 2.02).
ANSWER: $1,250 ± $111.79 = ($1,138.21, $1,361.79).

54. What sample size would be needed to ensure that we could estimate the true mean account balance and have only 5 chances in 100 of being off by more than $100?
ANSWER: 49.98 or 50.

QUESTIONS 55 AND 56 ARE BASED ON THE FOLLOWING INFORMATION:
A market research consultant hired by Coke Classic Company is interested in estimating the difference between the proportions of female and male customers who favor Coke Classic over Pepsi Cola in Chicago. A random sample of 200 consumers from the market under investigation showed the following frequency distribution.

          Male   Female   Total
Coke      72     38       110
Pepsi     58     32       90
Total     130    70       200

55. Construct a 95% confidence interval for the difference between the proportions of male and female customers who prefer Coke Classic over Pepsi Cola.
ANSWER: \( n_1 \) = number of males = 130, \( n_2 \) = number of females = 70
\( \hat{p}_1 \) = proportion of males who favor Coke over Pepsi = 72/130 = 0.5538
\( \hat{p}_2 \) = proportion of females who favor Coke over Pepsi = 38/70 = 0.5429
\( SE(\hat{p}_1 - \hat{p}_2) = \sqrt{\dfrac{\hat{p}_1(1 - \hat{p}_1)}{n_1} + \dfrac{\hat{p}_2(1 - \hat{p}_2)}{n_2}} = 0.0738 \)
\( (\hat{p}_1 - \hat{p}_2) \pm z \cdot SE(\hat{p}_1 - \hat{p}_2) = 0.0109 \pm 1.96(0.0738) = 0.0109 \pm 0.1446 \)
Lower limit = -0.1337, and upper limit = 0.1555

56. Interpret the constructed confidence interval.
ANSWER: We are 95% confident that the population difference between these proportions is between -13.37% and 15.55%.
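
The calculation in Question 55 can also be checked outside Excel. The following is a minimal Python sketch, not part of the original test bank: the helper name `diff_proportion_ci` is a hypothetical choice, and the z-multiple is taken from the standard library's `statistics.NormalDist` rather than rounded to 1.96, so the limits may differ from the printed answer in the fourth decimal place.

```python
from math import sqrt
from statistics import NormalDist


def diff_proportion_ci(x1, n1, x2, n2, confidence=0.95):
    """Large-sample confidence interval for p1 - p2 (hypothetical helper)."""
    p1, p2 = x1 / n1, x2 / n2
    # Standard error of the difference between two independent sample proportions
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    # z-multiple for the requested confidence level (about 1.96 for 95%)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    diff = p1 - p2
    return diff - z * se, diff + z * se


# Question 55: 72 of 130 males and 38 of 70 females favor Coke
lower, upper = diff_proportion_ci(72, 130, 38, 70)
print(f"lower = {lower:.4f}, upper = {upper:.4f}")
```

Because the interval straddles zero, the sketch leads to the same reading as Question 56: the sample does not show a clear difference between the two proportions.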
QUESTIONS 57 THROUGH 60 ARE BASED ON THE FOLLOWING INFORMATION:
The percent defective for parts produced by a manufacturing process is targeted at 4%. The process is monitored daily by taking samples of size n = 160 units. Suppose that today's sample contains 14 defectives.

57. Determine a 95% confidence interval for the proportion defective for the process today.
ANSWER: 0.0875 ± 0.0438 = (0.0437, 0.1313).

58. Based on your answer to Question 57, is it still reasonable to think the overall proportion defective produced by today's process is actually the targeted 4%? Explain your reasoning.
ANSWER: No, since 4% falls outside of this range.

59. The confidence interval in Question 57 is based on the assumption of a large sample size. Is this sample size sufficiently large in this example? Explain how you arrived at your answer.
ANSWER: Yes, because \( n p_L \), \( n(1 - p_L) \), \( n p_U \), and \( n(1 - p_U) \) are all greater than 5.

60. How many units would have to be sampled to be 95% confident that you can estimate the fraction of defective parts within 2% (using the information from today's sample)?
ANSWER: 766.40 or 767.

QUESTIONS 61 AND 62 ARE BASED ON THE FOLLOWING INFORMATION:
Auditors of Independent Bank are interested in comparing the reported value of all 1775 customer savings account balances with their own findings regarding the actual value of such assets. Rather than reviewing the records of each savings account at the bank, the auditors randomly selected a sample of 100 savings account balances from the frame. The sample mean and sample standard deviation were $505.75 and $360.95, respectively.

61. Construct a 90% confidence interval for the total value of all savings account balances within this bank. Assume that the population consists of all savings account balances in the frame.
ANSWER: \( N = 1775 \), \( n = 100 \), \( \bar{X} = 505.75 \), \( s = 360.95 \)
\( N\bar{X} \pm 1.6604(Ns/\sqrt{n}) = 1775(505.75) \pm 1.6604(1775 \times 360.95/\sqrt{100}) = 897{,}706.25 \pm 106{,}379.55 \)
= ($791,326.70, $1,004,085.80)

62. Interpret the 90% confidence interval constructed in Question 61.
ANSWER: We are 90% confident that the total balance of all 1775 savings accounts within the bank is between $791,327 and $1,004,086.

QUESTIONS 63 AND 64 ARE BASED ON THE FOLLOWING INFORMATION:
A real estate agent has collected a random sample of 40 houses that were recently sold in Grand Rapids, Michigan. She is interested in comparing the appraised value and recent selling price (in thousands of dollars) of the houses in this particular market. The values of these two variables for each of the 40 randomly selected houses are shown below.

House   Value    Price     House   Value    Price
1       140.93   140.24    21      136.57   135.35
2       132.42   129.89    22      130.44   121.54
3       118.30   121.14    23      118.13   132.98
4       122.14   111.23    24      130.98   147.53
5       149.82   145.14    25      131.33   128.49
6       128.91   139.01    26      141.10   141.93
7       134.61   129.34    27      117.87   123.55
8       121.99   113.61    28      160.58   162.03
9       150.50   141.05    29      151.10   157.39
10      142.87   152.90    30      120.15   114.55
11      155.55   157.79    31      133.17   139.54
12      128.50   135.57    32      140.16   149.92
13      143.36   151.99    33      124.56   122.08
14      119.65   120.53    34      127.97   136.51
15      122.57   118.64    35      101.93   109.41
16      145.27   149.51    36      131.47   127.29
17      149.73   146.86    37      121.27   120.45
18      147.70   143.88    38      143.55   151.96
19      117.53   118.52    39      136.89   132.54
20      140.13   146.07    40      106.11   114.33

63. Using the sample data, generate a 95% confidence interval for the mean difference between the appraised values and selling prices of the houses sold in Grand Rapids.
ANSWER: We applied the paired-sample analysis using \( n = 40 \), \( \bar{X}_D = -1.612 \), \( s_D = 6.794 \), where D = Difference = Appraised value - Selling price.
\( \bar{X}_D \pm t(s_D/\sqrt{n}) = -1.612 \pm 2.0227(6.794/\sqrt{40}) = -1.612 \pm 2.173 \)
Lower limit = -3.785, and upper limit = 0.561 (in thousands of dollars)

64. Interpret the constructed confidence interval for the real estate agent.
ANSWER: We are 95% confident that the actual mean difference between the appraised values and selling prices of all the houses sold in Grand Rapids is between -$3,785 and $561.

QUESTIONS 65 THROUGH 69 REQUIRE THE USE OF EXCEL:

65. Compute \( P(t_{15} \ge 2.0) \), where \( t_{15} \) has a t-distribution with 15 degrees of freedom.
ANSWER: 0.03197

66. Compute \( P(t_{150} \ge 2.0) \), where \( t_{150} \) has a t-distribution with 150 degrees of freedom.
ANSWER: 0.02365

67. How do you explain the difference between the results obtained in Questions 65 and 66?
ANSWER: The smaller the degrees of freedom, the higher the variance of t, and so the larger the tail probabilities are.

68. Compute \( P(Z \ge 2.0) \), where Z is a standard normal random variable.
ANSWER: 0.02275

69. Compare the results of Question 68 to the results obtained in Questions 65 and 66. How do you explain the difference in these probabilities?
ANSWER: First, the variance of a t with few degrees of freedom is larger than that of a t with many degrees of freedom, which in turn is larger than that of Z. This explains why the tail probabilities in Questions 65, 66, and 68 decrease. Second, when the sample size is large, the degrees of freedom of t are large, and the t distribution and the standard normal distribution are practically indistinguishable. This explains why the probabilities in Questions 66 and 68 are close.

QUESTIONS 70 THROUGH 72 ARE BASED ON THE FOLLOWING INFORMATION:
Senior management of a consulting services firm is concerned about a growing decline in the firm's weekly number of billable hours. The firm expects each professional employee to spend at least 40 hours per week on work. In an effort to understand this problem better, management would like to estimate the standard deviation of the number of hours their employees spend on work-related activities in a typical week. Rather than reviewing the records of all the firm's full-time employees, the management randomly selected a sample of size 50 from the available frame. The sample mean and sample standard deviation were 48.5 and 7.5 hours, respectively.

70. Construct a 99% confidence interval for the standard deviation of the number of hours this firm's employees spend on work-related activities in a typical week.
ANSWER: \( n = 50 \), \( \bar{X} = 48.5 \), \( s = 7.5 \)
Lower limit = \( \sqrt{(n-1)s^2/\chi^2_{\alpha/2}} = \sqrt{49(7.5)^2/78.2305} = 5.936 \)
Upper limit = \( \sqrt{(n-1)s^2/\chi^2_{1-\alpha/2}} = \sqrt{49(7.5)^2/27.2494} = 10.057 \)

71. Interpret the 99% confidence interval constructed in Question 70.
ANSWER: We are 99% confident that the population standard deviation is between 5.936 and 10.057.

72. Given the target range of 40 to 60 hours of work per week, should senior management be concerned about the number of hours their employees are currently devoting to work? Explain why or why not.
ANSWER: The best guess for the population mean is 48.5 hours per week, and about 95% of all employees are within 2 standard deviations of this, where we are almost sure (99% sure) that this standard deviation is between 5.9 and 10.1. But even if the standard deviation is only 5.9, then 48.5 ± 2 standard deviations will produce the range 36.7 to 60.3. Maybe management should be concerned.

QUESTIONS 73 THROUGH 75 REQUIRE THE USE OF EXCEL:

73. Compute \( P(t_{20} \le -0.95) \), where \( t_{20} \) has a t-distribution with 20 degrees of freedom.
ANSWER: Because of the symmetry of the t distribution, this left-hand tail probability can be calculated exactly like the right-hand tail. The answer is 0.17673.

74. Compute \( P(t_2 \le -0.95) \), where \( t_2 \) has a t-distribution with 2 degrees of freedom.
ANSWER: Because of the symmetry of the t distribution, this left-hand tail probability can be calculated exactly like the right-hand tail. The answer is 0.22119.

75. How do you explain the difference between the results obtained in Questions 73 and 74?
ANSWER: The larger the degrees of freedom, the lower the variance of t, so the smaller the tail probabilities are. This explains why the probability in Question 73 is smaller than that in Question 74.
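
The Excel questions above rely on TDIST, but the same tail probabilities can be approximated with nothing more than the t density. The sketch below is an illustration in Python rather than the method the test bank uses: the function names `t_pdf` and `t_cdf` are hypothetical, and the cumulative probability is obtained by numerically integrating the density with Simpson's rule, so the results are approximations.

```python
from math import exp, lgamma, pi, sqrt


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    # lgamma avoids overflow that math.gamma would hit for large df
    log_coef = lgamma((df + 1) / 2) - lgamma(df / 2) - 0.5 * (df * pi > 0) * 0
    coef = exp(log_coef) / sqrt(df * pi)
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def t_cdf(x, df, steps=2000):
    """Approximate P(T <= x) by Simpson's rule on [0, x]; steps must be even."""
    h = x / steps
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    # By symmetry P(T <= 0) = 0.5; for negative x the step h is negative,
    # so the integral is subtracted automatically.
    return 0.5 + total * h / 3


# Left-tail probabilities from Questions 73 and 74
for df in (20, 2):
    print(df, round(t_cdf(-0.95, df), 5))

# Right-tail probability from Question 65
print(15, round(1 - t_cdf(2.0, 15), 5))
```

Running the sketch for several degrees of freedom shows the pattern described in Questions 67 and 75: the tail probability shrinks toward the standard normal value as the degrees of freedom grow.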
QUESTIONS 76 AND 77 ARE BASED ON THE FOLLOWING INFORMATION:
A sample of 10 quality control managers with over 15 years of experience has an average salary of $54,000 and a standard deviation of $15,000.

76. You can be 95% confident that the mean salary for all quality control managers with at least 15 years of experience is between what two numbers? What assumptions are you making about the distribution of salaries?
ANSWER: \( n = 10 \), \( \bar{X} = 54000 \), \( s = 15000 \)
\( \bar{X} \pm t(s/\sqrt{n}) = 54000 \pm 2.2622(15000/\sqrt{10}) = 54000 \pm 10730.557 \)
Lower limit = 43,269.443, and upper limit = 64,730.557
We must assume that the population distribution of salaries is normal, especially since the sample size is so small.

77. What size sample would be needed to ensure that we could estimate the true mean salary of all quality control managers with more than 15 years of experience and have only 2 chances in 100 of being off by more than $800?
ANSWER: \( \sigma_{est} = 15000 \), z-multiple = 2.326, B = 800. The approximate sample size required to produce a 98% confidence interval for the mean is given by
\( n = \left( \dfrac{z\text{-multiple} \times \sigma_{est}}{B} \right)^2 = \left( \dfrac{2.326 \times 15000}{800} \right)^2 \approx 1903 \)

QUESTIONS 78 THROUGH 80 ARE BASED ON THE FOLLOWING INFORMATION:
Q-Mart is interested in comparing customers who use its own charge card with those who use other types of credit cards. Q-Mart would like to know if customers who use the Q-Mart card spend more money per visit, on average, than customers who use some other type of credit card. They have collected information on a random sample of 38 charge customers and the data are presented below. On average, the person using a Q-Mart card spends $192.81 per visit and customers using another type of card spend $104.47 per visit. Use the information below to answer the following questions.
Summary statistics for two samples
                             Q-Mart    Other Charges
Sample sizes                 13        25
Sample means                 192.81    104.47
Sample standard deviations   115.243   71.139

Confidence interval for difference between means
Sample mean difference       88.34
Pooled standard deviation    88.323
Std error of difference      30.201

78. Using a t-value of 2.023, calculate a 95% confidence interval for the difference between the average Q-Mart charge and the average charge on another type of credit card. Would you conclude that there is a significant difference between the two types of customers in this case? Explain.
ANSWER: 88.34 ± 61.0966 = (27.2434, 149.4366). Since the interval does not include 0, there appears to be a significant difference between the means of the two groups. In this case, it appears as though the Q-Mart charge card holders spend more money than those who use other types of charge cards.

79. What are the degrees of freedom for the t-statistic in this calculation? Explain how you would calculate the degrees of freedom in this case.
ANSWER: \( n_1 + n_2 - 2 = 36 \)

80. What is the assumption in this case that allows you to use the pooled standard deviation for this confidence interval?
ANSWER: In order to use the pooled standard deviation for this confidence interval, we must assume that the two population standard deviations are equal (\( \sigma_1 = \sigma_2 \)).

QUESTIONS 81 THROUGH 84 ARE BASED ON THE FOLLOWING INFORMATION:
The average annual household income levels of citizens of selected U.S. cities are shown below.

City Index   Household Income    City Index   Household Income    City Index   Household Income
1            $54,300             21           $53,500             41           $61,500
2            $61,800             22           $45,600             42           $53,000
3            $61,400             23           $70,100             43           $51,000
4            $50,800             24           $108,700            44           $55,600
5            $56,200             25           $46,400             45           $51,600
6            $48,300             26           $56,700             46           $57,200
7            $61,600             27           $59,100             47           $54,300
8            $63,200             28           $46,300             48           $51,500
9            $55,200             29           $52,900             49           $53,500
10           $58,000             30           $56,300             50           $61,800
11           $77,600             31           $67,300             51           $44,800
12           $47,600             32           $63,800             52           $57,400
13           $62,700             33           $70,600             53           $48,100
14           $46,200             34           $49,800             54           $52,700
15           $64,300             35           $51,300             55           $57,400
16           $56,000             36           $56,600             56           $65,500
17           $53,400             37           $49,600             57           $59,600
18           $56,800             38           $67,400             58           $62,000
19           $51,200             39           $53,700             59           $49,700
20           $59,000             40           $48,700             60           $54,400

81. Use Excel to obtain a simple random sample of size 10 from this frame.
ANSWER: I used StatPro's Generate Random Samples to generate a sample of size 10, then used the VLOOKUP function to get the corresponding incomes. The following sample is obtained:

City Index   50       14       4        56       48       49       8        11       38       52
Income       61,800   46,200   50,800   65,500   51,500   53,500   63,200   77,600   67,400   57,400

82. Using the sample generated in Question 81, construct a 95% confidence interval for the mean average annual household income level of citizens in the selected U.S. cities. Assume that the population consists of all average annual household income levels in the given frame.
ANSWER: \( n = 10 \), \( \bar{X} = 59{,}490 \), \( s = 9{,}439.803 \)
\( \bar{X} \pm 2.2622(s/\sqrt{n}) = 59{,}490 \pm 2.2622(9{,}439.803/\sqrt{10}) = 59{,}490 \pm 6{,}752.9561 \)
Lower limit = 52,737.0439, upper limit = 66,242.9561

83. Interpret the 95% confidence interval constructed in Question 82.
ANSWER: We are 95% confident that the average of all incomes is between $52,737 and $66,243.

84. Does the 95% confidence interval contain the actual population mean? If not, explain why not. What proportion of many similarly constructed confidence intervals should include the true population mean value?
ANSWER: This confidence interval easily captures the true population mean of $57,043.
Approximately 95% of the confidence intervals constructed in this way should contain the true population mean.

QUESTIONS 85 THROUGH 91 ARE BASED ON THE FOLLOWING INFORMATION:
The personnel department of a large corporation wants to estimate the family dental expenses of its employees to determine the feasibility of providing a dental insurance plan. A random sample of 12 employees reveals the following family dental expenses (in dollars) for the year 2001.

115   370   250   93   540   225   177   425   318   182   275   228

85. Construct a 90% confidence interval estimate of the mean family dental expenses for all employees of this corporation.
ANSWER: $266.50 ± $67.24, so the lower limit is $199.26 and the upper limit is $333.74.

86. What assumption about the population distribution must be made to answer Question 85?
ANSWER: The population of dental expenses must be approximately normally distributed.

87. Interpret the 90% confidence interval constructed in Question 85.
ANSWER: We are 90% confident that the mean family dental expenses for all employees of this corporation is between $199.26 and $333.74.

88. Suppose you used a 95% confidence interval in Question 85. What would be your answer to Question 85?
ANSWER:

89. Suppose the fourth value were 593 instead of 93. What would be your answer to Question 88? What effect does this change have on the confidence interval?
ANSWER: The additional $500 in dental expenses, divided across the sample of 12, raises the mean by $41.67 and increases the standard deviation by nearly $18.20. The interval width increases by over $23 in the process.

90. Construct a 90% confidence interval estimate for the standard deviation of family dental expenses for all employees of this corporation.
ANSWER:

91. Interpret the 90% confidence interval constructed in Question 90.
ANSWER: We are 90% confident that the standard deviation for family dental expenses for all employees of this corporation is between 110.61 and 229.38.
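
The 90% interval for the mean quoted in Question 87 can be reproduced directly from the twelve dental expense values. This is a minimal Python sketch under stated assumptions, not the original Excel solution: the helper name `mean_ci` is hypothetical, and the t-multiple 1.7959 (upper 5% point of the t distribution with 11 degrees of freedom) is taken from a standard t table rather than from the source.

```python
from math import sqrt
from statistics import mean, stdev

# Family dental expenses (in dollars) for the 12 sampled employees
expenses = [115, 370, 250, 93, 540, 225, 177, 425, 318, 182, 275, 228]


def mean_ci(data, t_multiple):
    """Confidence interval for a mean: x-bar +/- t * s / sqrt(n) (hypothetical helper)."""
    n = len(data)
    center = mean(data)
    # stdev() is the sample standard deviation (divides by n - 1)
    half_width = t_multiple * stdev(data) / sqrt(n)
    return center - half_width, center + half_width


T_90_DF11 = 1.7959  # assumed table value for 90% confidence, 11 degrees of freedom
lower, upper = mean_ci(expenses, T_90_DF11)
print(f"mean = {mean(expenses):.2f}, interval = ({lower:.2f}, {upper:.2f})")
```

Passing a different t-multiple to `mean_ci` gives the interval for another confidence level, which is the change Question 88 asks about.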
QUESTIONS 92 AND 93 ARE BASED ON THE FOLLOWING INFORMATION:

An automobile dealer wants to estimate the proportion of customers who still own the cars they purchased six years ago. A random sample of 200 customers selected from the automobile dealer's records indicates that 88 still own cars that were purchased six years earlier.

92. Construct a 95% confidence interval estimate of the population proportion of all customers who still own the cars they purchased six years ago.

ANSWER: \( \hat{p} \pm Z\sqrt{\hat{p}(1-\hat{p})/n} = 0.44 \pm 1.96\sqrt{0.44(0.56)/200} = 0.44 \pm 0.0688 \)
Lower limit = 0.3712, and upper limit = 0.5088

93. How can the result in Question 92 be used by the automobile dealer to study satisfaction with cars purchased at the dealership?

ANSWER: The dealer can infer that the proportion of all customers who still own the cars they purchased at the dealership 6 years earlier is somewhere between 0.3712 and 0.5088 with a 95% level of confidence.

TRUE / FALSE QUESTIONS

94. The degrees of freedom for the t and chi-square distributions is a numerical parameter of the distribution that defines the precise shape of the distribution.

ANSWER: T

95. When all possible samples of size n are drawn from any population, then the sampling distribution of the sample mean \( \bar{X} \) is approximately normal provided that n is reasonably large.

ANSWER: T

96. The standard error of the sampling distribution of the sample proportion p̂, when the sample size n = 50 and the population proportion p = 0.25, is 0.00375.

ANSWER: F

97. The mean of the sampling distribution of the sample proportion p̂, when the sample size n = 100 and the population proportion p = 0.58, is 58.0.

ANSWER: F

98. In developing a confidence interval for the population standard deviation \( \sigma \), we make use of the fact that the sampling distribution of the sample standard deviation S is not the normal distribution or the t distribution, but rather a right-skewed distribution called the chi-square distribution, which (for this procedure) has n – 1 degrees of freedom.

ANSWER: T

99. As a general rule, the normal distribution is used to approximate the sampling distribution of the sample proportion p̂ only if the sample size n is greater than 30.

ANSWER: F

100. In general, the paired-sample procedure is appropriate when the samples are naturally paired in some way and there is a reasonably large positive correlation between the pairs. In this case, the paired-sample procedure makes more efficient use of the data and generally results in narrower confidence intervals.

ANSWER: T

101. A confidence interval is an interval estimate for which there is a specified degree of certainty that the actual value of the population parameter will fall within the interval.

ANSWER: T

102. If two random samples of sizes 30 and 35 are selected independently from two populations whose means are 85 and 90, then the mean of the sampling distribution of the sample mean difference, \( \bar{X}_1 - \bar{X}_2 \), equals 5.

ANSWER: F

103. If two random samples of size 40 each are selected independently from two populations whose variances are 35 and 45, then the standard error of the sampling distribution of the sample mean difference, \( \bar{X}_1 - \bar{X}_2 \), equals 1.4142.

ANSWER: T

104. If a random sample of size 250 is taken from a population, where it is known that the population proportion p = 0.4, then the mean of the sampling distribution of the sample proportion p̂ is 0.60.

ANSWER: F

105. If the standard error of the sampling distribution of the sample proportion p̂ is 0.0324 for samples of size 200, then the population proportion must be 0.30.

ANSWER: F

106. The 95% confidence interval for the population mean \( \mu \), given that the sample size n = 49 and the population standard deviation \( \sigma \) = 7, is \( \bar{X} \pm 1.96 \).

ANSWER: T

107. In order to construct a confidence interval estimate of the population mean \( \mu \), the value of \( \sigma \) must be given.

ANSWER: F

108. The interval estimate 18.5 ± 2.5 was developed for a population mean when the sample standard deviation S was 7.5. Had S equaled 15, the interval estimate would be 37 ± 5.0.

ANSWER: F

109. The t-distribution and the standard normal distribution are practically indistinguishable as the degrees of freedom increase.

ANSWER: T

110. The lower limit of the 95% confidence interval for the population proportion p, given that n = 300 and p̂ = 0.10, is 0.1339.

ANSWER: F

111. The upper limit of the 90% confidence interval for the population proportion p, given that n = 100 and p̂ = 0.20, is 0.2658.

ANSWER: T

112. In general, increasing the confidence level will narrow the confidence interval, and decreasing the confidence level widens the interval.

ANSWER: F

113. A 90% confidence interval estimate for a population mean \( \mu \) is determined to be 72.8 to 79.6. If the confidence level is reduced to 80%, the confidence interval for \( \mu \) becomes narrower.

ANSWER: T

114. We can form a confidence interval for the population total T by finding a confidence interval for the population mean \( \mu \) in the usual way, and then multiplying each end point of the confidence interval by the population size N.

ANSWER: T

115. In determining the sample size n for estimating the population proportion p, a conservative value of n can be obtained by using 0.50 as an estimate of p.

ANSWER: T

116. In developing a confidence interval for the difference between two population means using two independent samples, we use the pooled estimate \( s_p \) in estimating the standard error of the sampling distribution of the sample mean difference \( \bar{X}_1 - \bar{X}_2 \) if the populations are normal with equal variances.

ANSWER: T
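Several of the items above turn on short calculations with proportions and standard errors. The Python sketch below shows one way to check them, assuming the usual large-sample normal approximation. The function names are illustrative, and the critical value Z comes from the standard library's `statistics.NormalDist`, not from a printed table, so results may differ from the text in the last decimal place.

```python
from math import sqrt
from statistics import NormalDist


def proportion_interval(p_hat, n, confidence):
    """Large-sample interval: p_hat +/- Z * sqrt(p_hat * (1 - p_hat) / n)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    half_width = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width


def se_proportion(p, n):
    """Standard error of the sample proportion: sqrt(p * (1 - p) / n)."""
    return sqrt(p * (1 - p) / n)


def se_mean_difference(var1, n1, var2, n2):
    """Standard error of X1_bar - X2_bar for two independent samples."""
    return sqrt(var1 / n1 + var2 / n2)


# Question 92: 88 of 200 customers still own their cars, 95% confidence.
print(proportion_interval(88 / 200, 200, 0.95))

# True/false items on interval limits for a proportion.
print(proportion_interval(0.20, 100, 0.90))  # n = 100, p_hat = 0.20, 90%
print(proportion_interval(0.10, 300, 0.95))  # n = 300, p_hat = 0.10, 95%

# True/false items on standard errors.
print(se_proportion(0.25, 50))               # n = 50, p = 0.25
print(se_proportion(0.30, 200), se_proportion(0.70, 200))
print(se_mean_difference(35, 40, 45, 40))    # variances 35 and 45, n = 40 each
```

The last two standard errors printed for samples of size 200 are equal because p(1 - p) is symmetric about 0.5, which is why a standard error of 0.0324 does not pin the population proportion down to 0.30. For the n = 300 item, 0.1339 turns out to be the upper limit of the interval, not the lower one.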
