AP Statistics Inference for Comparisons Worksheet

Name: ________________________ Date: ___________________ Period: _________

Complete each of the following problems. Remember to use SCAD where applicable.

Comparing Proportions

1. A recent Harris Poll of 1020 randomly selected American adults found that 31% said they considered the economy one of the two major issues for the government to address. In a similar poll taken the previous year, 37% said they considered the economy one of the two major issues for the government to address. Using a 95% confidence interval for the difference between two proportions, decide if there is convincing evidence of a change in the proportion of adults who considered the economy one of the two major issues for the government to address. Assume that the sample sizes were the same in both years and that the samples were random and independent.

A. no, because the 95% confidence interval for the difference contains 6%
B. no, because the 95% confidence interval for the difference contains 0%
C. no, because the 95% confidence interval for the difference does not contain 0%
D. yes, because the 95% confidence interval for the difference contains 6%
E. yes, because the 95% confidence interval for the difference does not contain 0%

2. Compute and interpret the 95% confidence interval for the situation in Question 1. Then explain the meaning of being 95% confident. You may assume that conditions have been met.

Interval: (-0.1008, -0.0188). We are 95% confident that the change (this year minus the previous year) in the proportion of adults who considered the economy one of the two major issues for the government to address is between -0.1008 and -0.0188.

Being 95% confident means that if this sampling process were repeated many times, approximately 95% of the resulting intervals would contain the true change in the proportion of adults who considered the economy one of the two major issues for the government to address.

3.
A recent article published in the Journal of the American Medical Association reported a four-year double-blind study comparing the effect of Prempro (experimental), the most widely prescribed type of hormone therapy, and a placebo (control) in the reduction of Alzheimer’s disease (dementia). Of the study’s 4532 women age 65 and older, half were randomly assigned to receive Prempro, while the other half received a placebo. At the end of the study, there were 40 cases of dementia in the Prempro group and 21 in the placebo group.

a. Name an appropriate test to compare the two treatments, and check the conditions.

2-Proportion Z-test
SRS: Not explicitly stated. Assume that the 4532 women make up an SRS of the population, with the understanding that our results may be invalid.
Independence: It is reasonable to assume that more than 45,320 women age 65 and older were available for this study.
Normality: $n\hat{p}_1$, $n(1-\hat{p}_1)$, $n\hat{p}_2$, and $n(1-\hat{p}_2)$ are all greater than 5 (here n = 2266 per group, giving counts of 40, 2226, 21, and 2245). So, we can safely assume normality.

b. Formulate appropriate null and alternative hypotheses for a two-sided test.

$H_0: p_1 = p_2$
$H_a: p_1 \neq p_2$

c. Compute the test statistic and find the P-value.

z = 2.45
P-value = P(|Z| > 2.45) = 0.0143

d. What do you conclude?

At the 5% level, these sample results are statistically significant. Reject H0. We have sufficient evidence that there is a difference in the proportion of women who develop dementia with Prempro vs. a placebo.

Comparing Means

Remember to consider whether the samples are dependent or independent.

4. The National Highway Traffic Safety Administration (NHTSA) performed a series of brake tests to evaluate variability in the results obtained at its different test sites and among different drivers. This table displays the braking distance, in feet, from two different test sites for the Ford F-150 pickup truck on dry surface.
Stopping Distance Statistics, Ford F-150

Stop Number    Site 1    Site 2
1              157.14    163.41
2              159.83    163.48
3              156.95    165.23
4              155.22    159.17
5              156.63    167.45
6              158.23    161.33
7              158.86    165.85
8              155.72    168.09
9              159.87    162.35
10             159.25    166.78
11                       161.20
12                       165.33

Summary Statistics                                Site 1    Site 2
Average Stopping Distance (ft)                    157.77    164.14
Standard Deviation of Stopping Distances (ft)       1.68      2.77
Number of Stops Analyzed                              10        12

a. Is there statistically significant evidence that the average braking distance at the two test sites is different? Explain by conducting a hypothesis test at the 5% level of significance. As there is no information about the design of this study, treat it as an observational study when you write the conclusion.

Name the test. Test of significance for the difference between two means (two-sample t-test).

Check conditions. Plots of the two distributions show little skewness and no outliers. The design of the study is unknown, so we will treat it as an observational study.

State your hypotheses.

$H_0: \mu_1 = \mu_2$
$H_a: \mu_1 \neq \mu_2$

Compute the test statistic and P-value.

$$t = \frac{(157.77 - 164.14) - 0}{\sqrt{\dfrac{1.68^2}{10} + \dfrac{2.77^2}{12}}} \approx -6.636 \text{ (unpooled)}, \qquad P\text{-value} \approx 0.00000278$$

Write a conclusion in context. There is sufficient evidence to reject the null hypothesis because the t-statistic of -6.636 falls in the rejection region (or, because the P-value of 0.00000278 is less than 0.05). The difference between the two sample mean braking distances cannot reasonably be attributed to chance alone. Until we know how the study was designed, however, we cannot say that the reason was differences in the test sites. (It may have been differences in the trucks or drivers used.)

b. To further support your conclusion, construct a 95% confidence interval for the mean difference in braking distance.

We are 95% confident that the true difference in mean braking distance (Site 1 minus Site 2) for the Ford F-150 at these two sites is between -8.38 and -4.36 feet.

5. These rates of women’s participation in the labor force were collected by the U.S.
Department of Labor Statistics in 19 cities for 1968 and 1972. The purpose was to monitor the change in the percentage of women who were in the labor force during that period.

Percentage of Women in the Labor Force

City                        1972    1968
New York, NY                  45      42
Los Angeles, CA               50      50
Chicago, IL                   52      52
Philadelphia, PA              45      45
Detroit, MI                   46      43
San Francisco, CA             55      55
Boston, MA                    60      45
Pittsburgh, PA                49      34
St. Louis, MO                 35      45
Hartford, CT                  55      54
Washington, DC                52      42
Cincinnati, OH                53      51
Baltimore, MD                 57      49
Newark, NJ                    53      54
Minneapolis-St. Paul, MN      59      50
Buffalo, NY                   64      58
Houston, TX                   50      49
Paterson, NJ                  57      56
Dallas, TX                    64      63

a. Discuss whether a statistical test is appropriate for determining whether there was a change in the mean percentage of women in the labor force, by city, between 1968 and 1972. In other words, determine whether the conditions will be met for a hypothesis test.

b. Regardless of your response to part a, carry out a test of whether there is statistically significant evidence of a change in the mean percentage of women who were in the labor force, by city, between 1968 and 1972.

c. To further support your conclusion, construct a 95% confidence interval for the mean difference in percentage of women in the labor force.

a. The sample is not random. These are the largest U.S. cities, so they are unlikely to be representative of all cities in the United States. Thus, a test of significance based on matched pairs isn’t really appropriate.

b. Name the test. Test of significance of a mean difference based on paired differences (paired t-test).

Check conditions. The sample of cities wasn’t selected at random. These are the largest cities in the United States, and there is no reason to suppose that these large cities are representative of all cities or of the entire United States. The set of differences is slightly skewed with no outliers, but this doesn’t indicate a serious problem with non-normality.

State your hypotheses.
$H_0: \mu_D = 0$
$H_a: \mu_D \neq 0$
where $\mu_D$ is the mean difference (1972 minus 1968) in the percentage of women in the labor force.

Compute the test statistic and P-value.

$$t = \frac{3.3684 - 0}{5.9741/\sqrt{19}} \approx 2.4577 \text{ with df} = 18, \qquad P\text{-value} \approx 0.02435$$

Write a conclusion in context. Reject the null hypothesis. Because the P-value of 0.02435 is less than the 0.05 level of significance, there is statistically significant evidence of a change in the mean percentage of women participating in the labor force, by city, between 1968 and 1972 that cannot be explained by chance alone. Because these cities aren’t a random sample from all cities, all we can say is that the mean difference between the two years in these 19 cities is too far from 0 to be reasonably attributed to chance alone.
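
The paired t computation in Problem 5 can be checked with a short script. The following is a minimal sketch, not part of the original worksheet: it uses only the Python standard library, the function names (`t_pdf`, `two_sided_p`, `paired_t`) are made up for illustration, the P-value is approximated by numerically integrating the t density, and the critical value t* = 2.101 for df = 18 is assumed from a standard t-table. Under those assumptions it also produces the 95% confidence interval for the mean difference requested in part c.

```python
import math
import statistics

# Percentages from the worksheet table, in city order (New York ... Dallas).
pct_1972 = [45, 50, 52, 45, 46, 55, 60, 49, 35, 55, 52, 53, 57, 53, 59, 64, 50, 57, 64]
pct_1968 = [42, 50, 52, 45, 43, 55, 45, 34, 45, 54, 42, 51, 49, 54, 50, 58, 49, 56, 63]

# Assumption: 95% critical value for df = 18, read from a standard t-table.
T_STAR_DF18 = 2.101


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def two_sided_p(t, df, steps=2000):
    """Approximate P(|T| > |t|) with Simpson's rule on [0, |t|]; steps must be even."""
    upper = abs(t)
    h = upper / steps
    total = t_pdf(0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3  # area under the density between 0 and |t|
    return 1 - 2 * area


def paired_t(after, before, t_star):
    """Paired t statistic, two-sided P-value, and confidence interval for the mean difference."""
    diffs = [a - b for a, b in zip(after, before)]
    n = len(diffs)
    mean_d = statistics.mean(diffs)
    sd_d = statistics.stdev(diffs)  # sample standard deviation (divides by n - 1)
    se = sd_d / math.sqrt(n)
    t = mean_d / se
    return {
        "mean_diff": mean_d,
        "sd_diff": sd_d,
        "df": n - 1,
        "t": t,
        "p_value": two_sided_p(t, n - 1),
        "ci": (mean_d - t_star * se, mean_d + t_star * se),
    }


result = paired_t(pct_1972, pct_1968, T_STAR_DF18)
for key, value in result.items():
    print(key, value)
```

The mean difference, standard deviation, and t statistic printed by the script should agree with the values 3.3684, 5.9741, and 2.4577 used above. If the resulting interval excludes 0, it supports the same conclusion as the test, with the same caveat that these 19 cities are not a random sample.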