AP Statistics
Hypothesis Testing: Comparing 2 Populations or Treatments

Name: ___________________________ Date: ________________ Block: ___________

Independent OR Dependent?

For each of the situations presented below, determine if the samples are independent or dependent (matched pairs). EXPLAIN your choice.

A. Which coating (A or B) produces higher strength? A sample of 5 CDs is treated with coating A, while a second sample of 5 CDs is treated with coating B. Each CD is then tested for strength.

B. Does environment affect intelligence? Researchers identified 25 sets of identical twins in which one twin was raised by biological parents and the other twin was raised by adoptive parents. Each twin was given an IQ test.

C. Which drug (A or B) relieves severe headaches faster? A sample of 10 people who suffer from severe headaches is given drug A for their pain one month and drug B another month. Time to headache relief is measured.

D. A total of 20 people enter a study to determine the benefits of a new drug designed to reduce cholesterol. The new drug is given to 10 people, while a placebo is given to the other 10 people. After a period of three months, the reduction in cholesterol is measured for each person.

Adapted from: Introduction to Statistics & Data Analysis Chapter 11 Activities Worksheets

Section 13.1 – Comparing Population Means: Independent Samples CI

You are creating an interval that will estimate the true difference in the means of both populations. You will use calculator procedures to help. For the 2-Sample T-Interval, you have:

Formula: $(\bar{X}_1 - \bar{X}_2) \pm (t \text{ critical value}) \sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}$

Conditions: SRS, Independence, and Normality for both sampling distributions

Researchers at Rochester Institute of Technology investigated the use of isolation time-out with 155 emotionally disturbed students enrolled in a special education facility (Exceptional Children, Feb., 1995).
The students were randomly assigned to one of two types of classrooms: Type A classrooms (with a maximum of 12 students) and Type B classrooms (with a maximum of 6 students). Over the academic year the number of incidents resulting in an isolation time-out was recorded for each student. Summary statistics for the two groups are shown in the following table.

Classroom    # of Students    Mean # Time-outs    Std Dev
Type A       100              78.67               59.08
Type B       55               102.87              69.33

1. Explain why time-outs for the students in Type A classrooms are independent of time-outs for the students in Type B classrooms.

2. Create a 95% confidence interval for the difference in the mean number of time-outs for Type A versus Type B classrooms. Remember to use steps A – E (Population, Statistical Method, Sample, Statistical Results, and Conclusion).

3. Based on your CI from part 2, would you agree that "On average, students in Type A classrooms had significantly fewer time-out incidents than students in Type B classrooms"? EXPLAIN.

Section 13.1 – Comparing Population Means: Independent Samples Test

Your goal is to determine if there is a significant difference in the means of two populations.

2-Sample T-Test

Hypotheses:
H0: $\mu_1 = \mu_2$ ($\mu_1 - \mu_2 = 0$: no difference in population means)
Ha: $\mu_1 \ne \mu_2$, $\mu_1 < \mu_2$, or $\mu_1 > \mu_2$

Conditions: SRS, Independence, and Normality of Both Populations

Test Statistic: $t = \dfrac{(\bar{X}_1 - \bar{X}_2) - 0}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$. Use the df reported by your calculator. Do not use the pooled option!

According to WebMD (www.webmd.com), "normal" body temperature is an average. Not only is body temperature different for different people, it also changes during the day and is very sensitive to hormone levels. The table below summarizes body temperature data from the Journal of Statistics Education Data Archive (Shoemaker, 1996).
Body Temperature (°F)

Gender    n     Mean      StDev
Male      65    98.105    0.699
Female    65    98.394    0.743

Do the data suggest that there is a significant difference in average body temperature for men and women? Perform a hypothesis test to answer this question. Remember to use steps A – E (Population, Statistical Method, Sample, Statistical Results, and Conclusion).

Section 12.1 – Comparing Population Means: Dependent Samples CI (Paired T-test)

In a paired T-test, you are working with dependent data, so the 2-Sample procedures do not apply. Instead, calculate the difference between the two values in each pair, then use the single-sample t procedures on those differences.

1-Sample T-Interval (Paired)

Hypotheses:
H0: $\mu_D = 0$ (no difference)
Ha: $\mu_D \ne 0$, $\mu_D < 0$, or $\mu_D > 0$

Conditions: SRS, Independence, and Normality of the population of differences

Confidence Interval: $\bar{X}_D \pm (t \text{ critical value}) \dfrac{s_D}{\sqrt{n}}$

Test Statistic: $t = \dfrac{\bar{X}_D}{s_D / \sqrt{n}}$, df = n – 1

Jane usually takes her dry cleaning to Lilac Cleaners in Rochester, NY, but is considering changing to Leary's Cleaners. According to an article in the local newspaper, Leary's uses a dry-wet cleaning process which is more environmentally friendly. Jane expects to pay more for this service, but wants to know how much more before taking her business there. See the data in the table below.

Item                     Leary's    Lilac     Difference
Suit                     $12.25     $10.25
Shirt/Blouse             $5.99      $5.25
Slacks                   $5.89      $5.25
Skirt                    $5.89      $5.25
Sweater                  $5.39      $5.25
Winter coat              $13.49     $11.95
Comforter (full-size)    $14.25     $15.00
Raincoat                 $12.99     $11.95

Prices found in "The Bottom Line: A monthly comparison of goods and services", Rochester Democrat and Chronicle, Oct 16, 2005.

1. Explain why the prices in the table above are considered dependent samples.

2. Compute the differences (Leary's – Lilac) and write them in the column labeled "Difference".

3.
Create a 90% confidence interval for the mean difference in price for the two dry cleaners. Remember to use steps A – E (Population, Statistical Method, Sample, Statistical Results, and Conclusion).

Section 12.1 – Comparing Population Means: Dependent Samples Test

A 1998 article in Measurement in Physical Education & Exercise Science (Erdmann, Dolgener, and Hensley) examined the post-exercise heart rates of sixty-three middle school-age boys. The boys were instructed in self-pulse counting using the carotid artery on either side of the neck, and given the opportunity to briefly practice. Each boy was then connected to a telemetry system (to measure heart rate) and told to walk as fast as possible over a 720-meter course. Postwalk heart rates were simultaneously measured by the boys and via telemetry.

Postwalk Measurement                                           Mean     Std Dev
Telemetry Heart Rate                                           165.1    22.0
Self-Reported Heart Rate                                       143.6    31.3
Paired Difference in Heart Rate (Telemetry – Self-Reported)    21.6     23.3

1. What information do the paired differences provide here? Why are they more informative than the separate sets of self-reported and telemetry heart rates?

2. Do the data provide sufficient evidence to conclude that, on average, all middle school-age boys are underreporting their post-exercise heart rates? Perform a hypothesis test to answer this question. Remember to use steps A – E (Population, Statistical Method, Sample, Statistical Results, and Conclusion).

Section 13.1 – Estimating the Difference Between Two Proportions

You are creating an interval that will estimate the true difference in the proportions of both populations. You will use calculator procedures to help.
For the 2-proportion Z-Interval, you have:

Formula: $(\hat{p}_1 - \hat{p}_2) \pm (z \text{ critical value}) \sqrt{\dfrac{\hat{p}_1 (1 - \hat{p}_1)}{n_1} + \dfrac{\hat{p}_2 (1 - \hat{p}_2)}{n_2}}$

Conditions: SRS, Independence, and Normality for both sampling distributions. Normality is now verified as long as:
$n_1 \hat{p}_1 \ge 5$, $n_1 (1 - \hat{p}_1) \ge 5$
$n_2 \hat{p}_2 \ge 5$, $n_2 (1 - \hat{p}_2) \ge 5$

The National Highway Traffic Safety Administration published the Motor Vehicle Occupant Safety Survey in March 2000. Their survey of seat belt use asked 3569 male and 3893 female drivers, "When driving this [vehicle], how often do you wear your [lap/shoulder] belt?" In response, 74% of the male drivers and 84% of the female drivers answered "All of the time".

1. How many of the male drivers answered "All of the time"? How many of the female drivers?

2. Estimate, with 95% confidence, the difference between the proportion of all male drivers and the proportion of all female drivers who would answer that they wear their seat belt "all of the time". Remember to use steps A – E (Population, Statistical Method, Sample, Statistical Results, and Conclusion).

3. Based on your CI result, how much more likely are women to wear their seat belt than men?

Section 13.1 – Testing for a Difference Between Two Proportions

You are testing to determine if there is a significant difference in the proportions of both populations. You will use calculator procedures to help. For the 2-proportion Z-Test, you have:

Hypotheses:
H0: $p_1 = p_2$ ($p_1 - p_2 = 0$: no difference in population proportions)
Ha: $p_1 \ne p_2$, $p_1 < p_2$, or $p_1 > p_2$

Conditions: SRS, Independence, and Normality of Both Populations. Normality is now verified as long as $n \hat{p}_c \ge 5$ and $n (1 - \hat{p}_c) \ge 5$ for each sample.

Test Statistic: $z = \dfrac{\hat{p}_1 - \hat{p}_2}{\sqrt{\dfrac{\hat{p}_c (1 - \hat{p}_c)}{n_1} + \dfrac{\hat{p}_c (1 - \hat{p}_c)}{n_2}}}$, where $\hat{p}_c = \dfrac{\text{count of successes in both samples combined}}{\text{count of individuals in both samples combined}}$.

In 1954 an experiment was conducted to test the effectiveness of the Salk vaccine as protection against the devastating effects of polio. With their parents' consent, 200,745 children were injected with the vaccine, while 201,229 other children were injected with an ineffective saline solution. The experiment was "double blind" because the children being injected didn't know whether they were given the real vaccine or the placebo, and the doctors giving the injections didn't know either. The children were monitored for a period of years to determine if they developed paralytic polio. Thirty-three of the 200,745 vaccinated children later developed paralytic polio, whereas 115 of the 201,229 injected with the saline solution later developed paralytic polio ("An Evaluation of the 1954 Poliomyelitis Vaccine Trials," American Journal of Public Health, 1955).

Do the data provide sufficient evidence to conclude that the Salk vaccine is effective at lowering the risk of developing polio? Perform a hypothesis test to answer this question. Remember to use steps A – E (Population, Statistical Method, Sample, Statistical Results, and Conclusion).
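
For anyone who wants to check calculator output for a 2-proportion Z-Test, the pooled test statistic above can be sketched in a few lines of Python. This is only an illustrative sketch, not part of the original worksheet: the function name `two_proportion_z_test` and the counts in the example call are made up for demonstration, and the p-value shown is the lower-tail one, which assumes the alternative Ha: p1 < p2.

```python
import math


def two_proportion_z_test(x1, n1, x2, n2):
    """Return (z, lower-tail p-value) for H0: p1 = p2 using the pooled proportion."""
    p_hat_1 = x1 / n1
    p_hat_2 = x2 / n2
    # Pooled proportion: successes in both samples over individuals in both samples.
    p_hat_c = (x1 + x2) / (n1 + n2)
    # Normality check, applied to each sample size.
    for n in (n1, n2):
        if n * p_hat_c < 5 or n * (1 - p_hat_c) < 5:
            raise ValueError("normality condition not met")
    standard_error = math.sqrt(
        p_hat_c * (1 - p_hat_c) / n1 + p_hat_c * (1 - p_hat_c) / n2
    )
    z = (p_hat_1 - p_hat_2) / standard_error
    # Lower-tail p-value P(Z <= z), appropriate only for Ha: p1 < p2.
    p_value = 0.5 * (1 + math.erf(z / math.sqrt(2)))
    return z, p_value


# Hypothetical counts, invented for illustration only (not data from this worksheet).
z, p_value = two_proportion_z_test(x1=30, n1=200, x2=50, n2=200)
print(f"z = {z:.2f}, p-value = {p_value:.4f}")  # z = -2.50, p-value = 0.0062
```

To use it on a worksheet problem, replace the hypothetical counts with the successes and sample sizes from the problem, and adjust the tail of the p-value to match your alternative hypothesis.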