Chapter 8 Study Guide Solutions Study Guide 8.1a 8.1b 8.1c Ideal Response 1 Compute the mean number of pairs of shoes. In this case x 30.35 2 Compute the sample variance of the pairs of shoes. In this case sx2 202.77 3 Compute the sample proportion of those planning to attend the prom. In this case pˆ 0.72 . 4 Compute the sample proportion of those who would report cheating. In this case pˆ 0.11 . 9 The figure shows that 4 of the 25 confidence intervals did not contain the true parameter. This amounts to 16%. Therefore 84% of the intervals actually did contain the true parameter which suggests that these were 80% intervals (though they could have been 90%). 10 The figure shows that all of the 25 confidence intervals did contain the true mean. This suggests that the confidence level was quite high – probably 99%, but possibly 95%. 11 (a) If we were to repeat the sampling procedure many times, on average, the sample proportion would be within 3 percentage points of the true proportion in 95% of samples. (b) The 95% confidence interval is 0.63 to 0.69. We are 95% confident that the interval from 0.63 to 0.69 captures the true proportion of those who favor an amendment to the Constitution that would permit organized prayer in public schools. (c) If we were to repeat the sampling procedure many times, about 95% of the confidence intervals computed would contain the true proportion of those who favor an amendment to the Constitution that would permit organized prayer in public schools. 12 (a) If we were to repeat the sampling procedure many times, on average, the sample proportion would be within 3 percentage points of the true proportion in 95% of samples. (b) The 95% confidence interval is 0.56 to 0.62. We are 95% confident that the interval from 0.52 to 0.62 captures the true proportion of those who would like to lose weight. (c) If we were to repeat the sampling procedure many times, about 95% of the confidence intervals computed would contain the true proportion of those who would like to lose weight. 13 Some of the practical difficulties would include non-response (those who either do not answer the phone or those who refuse to answer) and undercoverage of those who do not have telephones. Also, the description says that the random numbers formed were for “residential numbers” which suggests that they did not include cell phones so they would have undercoverage of those people who only have cell phones. 14 There could be sources of error due to many sampling issues such as undercoverage (if calls were only made on certain days or time of day) and non-response bias (many people will not participate in telephone or mail-in surveys). 16 We are 95% confident that the interval from 0.120 to 0.297 captures the true difference in the proportions of younger teens and older teens who include false information on their profiles (younger – older). That is, we are 95% confident that between 12% and 29.7% more younger teens publish false information on their profiles than older teens. When we say 95% confident, we mean that if this sampling method were employed many times, approximately 95% of the resulting confidence intervals would capture the true difference between the proportions of younger teens and older teens who include false information on their profiles. Chapter 8 Study Guide Solutions 17 (a) Incorrect; the interval refers to the mean BMI of all women, not to individual BMI’s which will be much more variable. (b) This is not quite correct, although it is closer than the explanation given in part (a). 95% of future samples will be within 0.6 of the true mean (true mean 0.6 ), not within 0.6 of 26.8 (unless it happens that the true mean is 26.8). That is, future samples will not necessarily be close to the results of this sample; instead, they should be close to the truth. (c) Correct; we have given an interval which we believe contains the true mean. Therefore, the values in that interval are values which are believable as being that true mean. (d) Incorrect: it suggests that the population mean will be different in some samples (in 5% of samples it will not be between 26.2 and 27.4?). The population mean always stays the same, regardless of the sample taken. (e) Incorrect: we are reasonably sure that the population mean is between 26.2 and 27.4, but that does not rule out any other possibility absolutely. 18 (a) Incorrect; the probability is either 0 or 1, but we don’t know which. (b) Incorrect; the general form of these confidence intervals is x m , so x will always be in the center of the confidence interval. (c) Correct interpretation. (d) Incorrect; there is nothing magic about the interval from this one sample. Our method for computing confidence intervals is based on capturing the mean of the population, not a particular interval from one sample. (e) Correct interpretation. 19 The data must be random so that we can generalize our results to a larger population (sampling) or make inferences about cause-and-effect (experiment). We need Normality so that we know the sampling distribution of the statistic which, in turn, leads to the computation of the confidence interval. Finally, we need independence for calculating the appropriate standard deviations. 20 The data were not randomly collected; they come from a voluntary response sample. Those who respond to such online polls tend to be those with strong opinions about the issue at hand, and so are not representative of the population of interest. Since the data are not random, the confidence interval cannot be generalized to any larger group, rendering it basically useless. 8.1 MC 8.2a 21. B 31 Since 22. E 10.98 2 23. C 24. B 0.01 , z * for a 98% confidence interval can be found by looking for a left-tail area of 1 0.01 0.99 . The closest area is 0.9901 corresponding to a critical value z * of 2.33. 32 Since 10.93 2 0.035 , z * for a 93% confidence interval can be found by looking for a left-tail area of 1 0.035 0.965 . The closest area is 0.9649 corresponding to a critical value z * of 1.81. 8.2b 34 (a) The population of interest consists of the undergraduates at a large university. The parameter of interest is the true proportion who would be willing to report cheating. (b) Random: the sample was a simple random sample. Normal: there were 19 successes (willing to report) and 153 failures (not willing to report). Both of these numbers are at least 10. Independent: since this is a large university, 172 should be less than 10% of the undergraduate student population. 19 0.11 (c) For a 99% confidence interval z* 2.576 . For this sample pˆ 172 . So the confidence interval is 0.11 2.576 0.11(0.89) 172 0.11 0.06 . Therefore, the confidence interval is from 0.05 to 0.17. (d) We are 99% confident that the interval from 0.05 to 0.17 captures the true proportion of students who would be willing to report cheating. Chapter 8 Study Guide Solutions 36 (a) State: We want to estimate the actual proportion of all teens who have a photo of themselves on their online profiles at a 95% confidence level. We should use a one-sample z-interval for p if the conditions are satisfied. Random: the teens were selected randomly. Normal: there were 385 successes (teens with photos on profile) and 102 failures (teens without photos on profile). Both are at least 10. Independent: the sample is less than 10% of the population of all American teens. The conditions are met. A 95% confidence interval is given by Plan: Do: 0.791 1.96 0.791(0.209) 487 0.791 0.036 . Therefore, the confidence interval is from 0.755 to 0.827. We are 95% confident that the interval from 0.755 to 0.827 captures the true proportion of teens who have online profiles that have a photo of themselves on their profile. (b) The value 75% does not appear in our 95% confidence interval. While it is certainly possible that 75% have photos, it is not very likely. We are reasonably confident that it is more than that – somewhere between 75.5% and 82.7%. (b) The value 75% does not appear in our 95% confidence interval. While it is certainly possible that 75% have photos, it is not very likely. We are reasonably confident that it is more than that – somewhere between 75.5% and 82.7%. Conclude: 38 There could be sources of error due to many sampling issues such as undercoverage (if calls were only made on certain days or time of day) and non-response bias (many people will not participate in telephone or mail-in surveys). 40 State: Plan: Do: Conclude: We want to estimate the actual proportion of all adults who are satisfied with the way things are going in the United States at this time at a 90% confidence level. We should use a one-sample z-interval for p if the conditions are satisfied. Random: the adults were selected randomly. Normal: there were 256 successes (adults who were satisfied) and 769 failures (adults who were not satisfied). Both are at least 10. Independent: the sample is less than 10% of the population of all adults. The conditions are met. A 90% confidence interval is given by 0.25 1.645 0.25(0.75) 1025 0.25 0.02 . Therefore, the confidence interval is from 0.23 to 0.27. We are 90% confident that the interval from 0.23 to 0.27 captures the true proportion of adults who are satisfied with the way things are going in the United States at this time. 42 (a) We do not know the sample sizes for the men and for the women. (b) The margin of error for women alone would be greater than 0.03 because the sample size for women alone is smaller than 1019. 46 Our guess is p* 0.7 , so we need 1.645 0.7(0.3) n 0.04 or n 1.645 0.04 2 (0.7)(0.3) 355.17 . Take an SRS of n 356 students. 48 The margin of error is stated to be 0.01 and the sample proportion is pˆ know that the margin of error is z * 0.01 z * 0.6341(0.3659) 5594 pˆ (1 pˆ ) n 3547 5594 0.6341 . We so filling in what we know gives . When we solve for z * we get 1.55. The area is between 1.55 and 1.55 under the Standard Normal curve is 0.8788. The confidence level is likely 88%. (b) We do not know if those who did respond can reliably represent those who did not. 8.2 MC 49. A 50. D 51. C 52. A Chapter 8 Study Guide Solutions 8.3a 8.3b 57 (a) df 9 , t* 2.262 (b) df 19 , t* 2.861 58 (a) df 11 , t* 1.796 (c) df 29 , t* 2.045 65 Since 57 degrees of freedom is not in the table, we use 50 df and t* 2.678 . Using technology with the exact degrees of freedom we find t* 2.665 . If we are able to use the exact degrees of freedom we will have a slightly shorter interval with the same level of confidence. 66 Since 76 degrees of freedom is not in the table, we use 60 df and t* 1.671 . Using technology with the exact degrees of freedom we find t* 1.665 . If we are able to use the exact degrees of freedom we will have a slightly shorter interval with the same level of confidence. 71 The t-procedure is only valid if the original measurements follow a Normal distribution (or are at least approximately Normally distributed). In this case there are far too many outliers with only 20 measurements for us to think the Normal distribution is plausible. 72 The t-procedure is only valid if the original measurements follow a Normal distribution (or are at least approximately Normally distributed). In this case there are far too many outliers with only 20 measurements for us to think the Normal distribution is plausible. 73 (a) A t-procedure would not be appropriate here because we are trying to estimate a population proportion, not a population mean. (b) The t-procedure would not be appropriate here because the sample (male athletes) is not representative of the whole population (all male college students at this school). The sample was not selected randomly. (c) The t-procedure would not be appropriate here because there are too many outliers with only 25 observations. We cannot assume that the population values can be described by a Normal distribution. 74 (a) A t-procedure would not be appropriate here because we are trying to estimate a population proportion, not a population mean. (b) The t-procedure would not be appropriate here because the sample (male athletes) is not representative of the whole population (all male college students at this school). The sample was not selected randomly. (c) The t-procedure would not be appropriate here because there are too many outliers with only 25 observations. We cannot assume that the population values can be described by a Normal distribution. 55 The margin of error is defined to be 1 z * n . We are told that 7.5 and for 99% confidence z* 2.576 . Putting those numbers into the equation for the margin of error we get 1 2.576 56 7.5n . Solving this we get n 373.26 , so take a sample of 374 women. The margin of error is defined to be 2 z * n . We are told that 50 and for 95% confidence z* 1.96 . Putting those numbers into the equation for the margin of error we get 2 1.96 50n . Solving this we get n 2401 , so take a sample of 2401 students who took the SAT a second time. 59 SEM 9.3 27 1.7898 . In repeated sampling, the average distance between the sample means and the population mean will be about 1.7898 units. Chapter 8 Study Guide Solutions 60 SEM 21.88 20 4.8925 . In repeated sampling, the average distance between the sample means and the population mean will be about 4.8925 minutes. 63 State: Plan: Do: We want to estimate the true mean fuel efficiency for this vehicle at a 95% confidence level. We should construct a one-sample t-interval for if the conditions are met. Random: The data come from a random sample. Normal: There are only 20 measurements so we must check the shape. The normal probability plot is roughly linear (be sure to sketch the normal probability plot). Also, the histogram does not show any strong skewness or outliers so this condition is met (see histogram below). Independent: We have less than 10% of the possible records for mpg. The conditions are met. From the data we find that x 18.48 and s 3.116 and we have a sample of n 20 observations. This means that we have 19 degrees of freedom and t* 2.093 . Therefore, the confidence interval is 18.48 2.093 64 18.48 1.458 , or 17.022,19.938 . 3.116 20 Conclude: We are 95% confident that the interval from 17.022 to 19.938 captures the true mean miles per gallon for this car. State: We want to estimate the true mean amount of vitamin C in the CSB from this production at a 95% confidence level. We should construct a one-sample t-interval for if the conditions are met. Random: The data come from a random sample. Normal: There are only 8 measurements so we must check the shape. The normal probability plot is roughly linear (be sure to sketch the normal probability plot). Also, the dotplot does not show any strong skewness or outliers so this condition is met (see dotplot below). Independent: We have less than 10% of the possible samples from this production run. The conditions are met. Plan: Do: From the data we find that x 22.5 and s 7.19 and we have a sample of n 8 observations. This means that we have 7 degrees of freedom and t* 2.365 . Therefore, the confidence interval is 22.5 2.365 22.5 6.01 , or 7.19 8 16.49, 28.51 . Conclude: We are 95% confident that the interval from 16.49 to 28.51 captures the true mean amount of vitamin C in this production run. Chapter 8 Study Guide Solutions 8.3C 67 (a) State: Plan: Do: We want to estimate the true mean percent change in BMC in the population at a 99% confidence level. We should construct a one-sample t-interval for if the conditions are met. Random: The data come from a random sample. Normal: The sample size is larger than 30, therefore the CLT applies. Independent: We have less than 10% of nursing mothers. The conditions are met. We are told that x 3.587 and s 2.506 and we have a sample of n 47 observations. This means that we have 46 degrees of freedom and t* 2.704 from Table B (with 40 df). Therefore, the confidence interval is 3.587 2.704 2.506 47 3.587 0.988 , or 4.575, 2.599 . Using technology, the CI is 4.459, 2.605 . We are 99% confident that the interval from 4.459 to 2.605 captures the true mean percent change in BMC. (b) Yes. The interval includes only negative numbers which represent bone mineral loss, so we are quite confident that nursing mothers lose bone mineral. Conclude: 69 (a) State: Plan: Do: We want to estimate the true mean mean difference in the estimates from these two methods in the population at the 95% confidence level. We should construct a one-sample t-interval for if the conditions are met. Random: The data come from a random sample. Normal: We calculate the differences ( weight groove ) and we observe the normal probability plot of the differences is roughly linear (be sure to sketch the normal probability plot). Also, the histogram of the differences indicates that there is no strong skewness or outliers (see histogram below). Independent: We have less than 10% of tires. The conditions are met. We are told that x 4.556 and s 3.226 and we have a sample of n 16 observations. This means that we have 15 degrees of freedom and t* 2.131 from Table B (with 40 df). Therefore, the confidence interval is 4.556 1.719 , or 2.837,6.275 . Using technology, the CI 4.556 2.131 3.226 16 is 4.459, 2.605 . Conclude: We are 95% confident that the interval from 2.837 to 6.275 thousands of miles captures the true mean difference in tire wear. (b) Since the entire interval is positive, and we measured weight groove , we are reasonably sure that the measurement by the weight method is giving larger numbers than the measurement by the groove method, on average. Chapter 8 Study Guide Solutions 70 (a) State: Plan: Do: We want to estimate the true mean difference in the amount of zinc in th top and the bottom wells in this large region at the 95% confidence level. We should construct a one-sample t-interval for if the conditions are met. Random: The data come from a random sample. Normal: We calculate the differences ( bottom top ) and we observe the normal probability plot of the differences is roughly linear (be sure to sketch the normal probability plot). The dotplot of the differences indicates that there is no strong skewness but a possible outlier at 0.175 (see dotplot below), therefore look at the modified boxplot to confirm it is not an outlier (see boxplot below). Independent: We have less than 10% of tires. The conditions are met. We calculate the differences and find x 0.0804 and s 0.0523 and we have a sample of n 10 observations. This means that we have 9 degrees of freedom and t* 2.262 . Therefore, the confidence interval is 0.0804 2.262 0.0523 10 0.0804 0.0374 , or 0.043,0.1178 . Conclude: We are 95% confident that the interval from 0.043 to 0.1178 mg/L captures the true mean difference in the amount of zinc between the bottom and top of these wells. (b) Since the entire interval is positive, and we measured bottom – top, we are reasonably sure that, on average, the amount of zinc at the bottom of the wells is greater than the amount of zinc at the top of the wells. 8.3 MC 75. B 76. A 77. B 78. A