LESSON 2.2 INFERENCES ABOUT TWO PROPORTIONS
MGT204 - Unit 2 Hypothesis Testing with Two Samples

In this Unit
● In MGT201, we introduced methods of inferential statistics: confidence intervals and testing claims made about population parameters.
● The objective of this unit is to extend the methods for estimating values of population parameters and the methods for testing hypotheses to situations involving two sets of sample data instead of just one.
● See Lesson 2.1 to review these concepts.

In this Lesson
We present methods for:
1. Testing a claim made about two population proportions.
2. Constructing a confidence interval estimate of the difference between two population proportions.

Notation for Two Proportions
For population 1 we let
p1 = population proportion
n1 = size of the sample
x1 = number of successes in the sample
$\hat{p}_1 = x_1 / n_1$ (sample proportion)
$\hat{q}_1 = 1 - \hat{p}_1$ (complement of $\hat{p}_1$)
The corresponding notations apply to population 2.

Pooled Sample Proportion
The pooled sample proportion is denoted by $\bar{p}$ and is given by:
$\bar{p} = \frac{x_1 + x_2}{n_1 + n_2}$, with $\bar{q} = 1 - \bar{p}$

Requirements
1. The sample proportions are from two simple random samples that are independent. (Samples are independent if the sample values selected from one population are not related to, or somehow naturally paired or matched with, sample values selected from the other population.)
2. For each of the two samples, the number of successes is at least 5 and the number of failures is at least 5. (That is, $n\hat{p} \geq 5$ and $n\hat{q} \geq 5$ for each of the two samples.)

Test Statistic for Two Proportions (with H0: p1 = p2)
$z = \frac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)}{\sqrt{\frac{\bar{p}\,\bar{q}}{n_1} + \frac{\bar{p}\,\bar{q}}{n_2}}}$
where p1 - p2 = 0 (assumed in the null hypothesis).

Critical values
Use Appendix B.1. Based on the significance level α, find the area under the standard normal curve that determines the critical value. For a two-tailed test, the cumulative area up to the positive critical value is 1 - α/2; for a one-tailed test, the area in the single tail is α. Finally, find the z-score in Appendix B.1 that corresponds to that area; this z-score is the critical value.
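The pooled proportion, test statistic, and critical values described above can be sketched in a few lines of Python. This is a minimal illustration rather than part of the course materials: the function names `pooled_proportion`, `two_proportion_z`, and `critical_values` are hypothetical, and the standard library's `NormalDist().inv_cdf` is assumed here as a stand-in for looking up z-scores in Appendix B.1.

```python
from math import sqrt
from statistics import NormalDist


def pooled_proportion(x1, n1, x2, n2):
    """Pooled sample proportion: all successes divided by all observations."""
    return (x1 + x2) / (n1 + n2)


def two_proportion_z(x1, n1, x2, n2):
    """Test statistic for H0: p1 = p2 (so p1 - p2 = 0 is assumed)."""
    p_hat1 = x1 / n1
    p_hat2 = x2 / n2
    p_bar = pooled_proportion(x1, n1, x2, n2)
    q_bar = 1 - p_bar
    standard_error = sqrt(p_bar * q_bar / n1 + p_bar * q_bar / n2)
    return (p_hat1 - p_hat2) / standard_error


def critical_values(alpha, tail):
    """Critical z value(s); tail is 'left', 'right', or 'two'."""
    z = NormalDist()  # standard normal: mean 0, standard deviation 1
    if tail == "left":
        return (z.inv_cdf(alpha),)
    if tail == "right":
        return (z.inv_cdf(1 - alpha),)
    if tail == "two":
        upper = z.inv_cdf(1 - alpha / 2)
        return (-upper, upper)
    raise ValueError("tail must be 'left', 'right', or 'two'")
```

The one-tailed branches use an area of α in a single tail, while the two-tailed branch uses the cumulative area 1 - α/2, mirroring the table lookup described under Critical values.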
Confidence Interval Estimate of p1 - p2
The confidence interval estimate of the difference p1 - p2 is:
$(\hat{p}_1 - \hat{p}_2) - E < (p_1 - p_2) < (\hat{p}_1 - \hat{p}_2) + E$
where the margin of error E is given by:
$E = z_{\alpha/2} \sqrt{\frac{\hat{p}_1 \hat{q}_1}{n_1} + \frac{\hat{p}_2 \hat{q}_2}{n_2}}$
Rounding: Round the confidence interval limits to three significant digits.

Example 1 - Do Airbags Save Lives?
The table below lists results from a simple random sample of front-seat occupants involved in car crashes (based on data from “Who Wants Airbags?” by Meyer and Finney, Chance, Vol. 18, No. 2). Use a 0.05 significance level to test the claim that the fatality rate of occupants is lower for those in cars equipped with airbags.

                             Airbag Available    No Airbag Available
Occupant Fatalities                 41                   52
Total Number of Occupants       11,541                9,853

Step 0: Given
x1 = 41, n1 = 11,541 (airbag available)
x2 = 52, n2 = 9,853 (no airbag available)
α = 0.05
Claim: The fatality rate of occupants is lower for those in cars equipped with airbags.
Symbolically: The claim can be expressed as p1 < p2, where
p1 represents the fatality rate for occupants in cars with airbags available, and
p2 represents the fatality rate for occupants in cars with no airbags available.

Step 1: State the Null and Alternative Hypotheses
Since the claim p1 < p2 does not contain equality, it becomes the alternative hypothesis. The null hypothesis is the statement of equality. For tests of hypotheses made about two population proportions, we consider only tests having a null hypothesis of p1 = p2. So we have:
H0: p1 = p2
H1: p1 < p2 (Original Claim)
Since H1 contains the less-than symbol <, this is a left-tailed test.

Step 2: Determine which Test Statistic to use given the information you have
Since the claim is about two population proportions, we use the normal distribution (z-test) with the test statistic for two proportions.
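Before relying on the z-test, the second requirement from the Requirements slide (at least 5 successes and at least 5 failures in each sample) can be checked directly from the table. The sketch below is only an illustration; the helper name `meets_requirements` is hypothetical, and it checks the count requirement only, not random sampling or independence.

```python
def meets_requirements(x, n, minimum=5):
    """True if a sample has at least `minimum` successes and failures."""
    successes = x
    failures = n - x
    return successes >= minimum and failures >= minimum


# Example 1 data: fatalities are counted as "successes" in each sample.
airbag_ok = meets_requirements(41, 11541)
no_airbag_ok = meets_requirements(52, 9853)

print("Count requirement satisfied:", airbag_ok and no_airbag_ok)
```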
Step 3: The Decision Rule
Given the level of significance, the type of test, and the test statistic being used, we find the critical value(s) and the rejection region. We refer to Appendix B.1 and find that an area of α = 0.05 in the left tail corresponds to the critical value of z = -1.645.
Decision rule: Reject H0 if the test statistic falls in the left-tail region bounded by z = -1.645.

Step 4: Calculate the Test Statistic
$\hat{p}_1 = 41 / 11{,}541 \approx 0.003553$ and $\hat{p}_2 = 52 / 9{,}853 \approx 0.005278$
$\bar{p} = \frac{41 + 52}{11{,}541 + 9{,}853} = \frac{93}{21{,}394} \approx 0.004347$, so $\bar{q} \approx 0.995653$
$z = \frac{(\hat{p}_1 - \hat{p}_2) - 0}{\sqrt{\frac{\bar{p}\,\bar{q}}{11{,}541} + \frac{\bar{p}\,\bar{q}}{9{,}853}}} \approx \frac{-0.001725}{0.000902} \approx -1.91$

Step 5: Interpret Results
The test statistic of z = -1.91 falls in the critical region bounded by the critical value of z = -1.645. Therefore we reject the null hypothesis and accept the alternative hypothesis (the original claim). We conclude that there is sufficient evidence to support the claim that the proportion of accident fatalities for occupants in cars with airbags is less than the proportion of fatalities for occupants in cars without airbags.

Confidence Intervals
Using the format given earlier, we can construct a confidence interval estimate of the difference between population proportions (p1 - p2). If a confidence interval estimate of p1 - p2 does not include 0, we have evidence suggesting that p1 and p2 have different values.

Example 2 - Airbags
Use the sample data given in Example 1 to construct a 90% confidence interval estimate of the difference between the two population proportions. What does the result suggest about the effectiveness of airbags in an accident?

Step 0: Given
x1 = 41, n1 = 11,541 (airbag available)
x2 = 52, n2 = 9,853 (no airbag available)
With a 90% confidence level, $z_{\alpha/2} = 1.645$ (from Appendix B.1).
From Example 1: $\hat{p}_1 \approx 0.003553$ and $\hat{p}_2 \approx 0.005278$, so $\hat{p}_1 - \hat{p}_2 \approx -0.001725$.

Step 1: Calculate the value of the margin of error E
$E = 1.645 \sqrt{\frac{(0.003553)(0.996447)}{11{,}541} + \frac{(0.005278)(0.994722)}{9{,}853}} \approx 0.001507$

Step 2: Construct the confidence interval
$-0.001725 - 0.001507 < (p_1 - p_2) < -0.001725 + 0.001507$
$-0.00323 < (p_1 - p_2) < -0.000218$

Step 3: Interpret results
The confidence interval limits do not contain 0, implying that there is a significant difference between the two proportions.
The confidence interval suggests that the fatality rate is lower for occupants in cars with airbags than for occupants in cars without airbags. (The interval contains only negative numbers, so p1 - p2 is negative throughout the interval, which means p1 < p2.)
The confidence interval also provides an estimate of the size of the difference between the two fatality rates.

Practice 2.2 Handout
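The arithmetic in Examples 1 and 2 can be checked with a short Python sketch. It is offered as an illustration under stated assumptions, not as the course's prescribed method: the variable names are hypothetical, and the critical value 1.645 is taken from the examples rather than computed.

```python
from math import sqrt

# Sample data from Example 1 (population 1 = airbag available).
x1, n1 = 41, 11541
x2, n2 = 52, 9853

p_hat1 = x1 / n1
p_hat2 = x2 / n2
difference = p_hat1 - p_hat2

# Example 1: left-tailed test of H0: p1 = p2 using the pooled proportion.
p_bar = (x1 + x2) / (n1 + n2)
q_bar = 1 - p_bar
z = difference / sqrt(p_bar * q_bar / n1 + p_bar * q_bar / n2)
reject_h0 = z < -1.645  # critical value for alpha = 0.05, left tail

# Example 2: 90% confidence interval, which uses the unpooled proportions.
z_crit = 1.645
margin = z_crit * sqrt(
    p_hat1 * (1 - p_hat1) / n1 + p_hat2 * (1 - p_hat2) / n2
)
lower = difference - margin
upper = difference + margin

print(f"z = {z:.2f}, reject H0: {reject_h0}")
print(f"{lower:.3g} < p1 - p2 < {upper:.3g}")
```

Note the design difference between the two calculations: the hypothesis test pools the samples because H0 assumes p1 = p2, while the confidence interval makes no such assumption and therefore uses each sample's own proportion.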