Section 6.2 Confidence Intervals for the Difference between Two Population Means $\mu_1 - \mu_2$: Independent Samples

• Two random samples are drawn from the two populations of interest.
• Because we compare two population means, we use the statistic $\bar{x}_1 - \bar{x}_2$.

Population 1: parameters $\mu_1$ and $\sigma_1^2$ (values are unknown); sample size $n_1$; statistics $\bar{x}_1$ and $s_1^2$.
Population 2: parameters $\mu_2$ and $\sigma_2^2$ (values are unknown); sample size $n_2$; statistics $\bar{x}_2$ and $s_2^2$.

Estimate $\mu_1 - \mu_2$ with $\bar{x}_1 - \bar{x}_2$.

Sampling distribution model for $\bar{x}_1 - \bar{x}_2$?

$$E(\bar{x}_1 - \bar{x}_2) = \mu_1 - \mu_2$$

$$SD(\bar{x}_1 - \bar{x}_2) = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$

Estimate using

$$SE(\bar{x}_1 - \bar{x}_2) = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$

Shape? A t distribution with degrees of freedom

$$df = \frac{\left(\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}\right)^2}{\frac{1}{n_1 - 1}\left(\frac{s_1^2}{n_1}\right)^2 + \frac{1}{n_2 - 1}\left(\frac{s_2^2}{n_2}\right)^2}$$

A sometimes used (not always very good) estimate of the degrees of freedom is $\min(n_1 - 1, n_2 - 1)$.

Confidence Interval for $\mu_1 - \mu_2$

$$(\bar{x}_1 - \bar{x}_2) \pm t^*_{df}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$

where $t^*_{df}$ is the value from the t-table that corresponds to the confidence level, and $df$ is computed from the degrees-of-freedom formula above.

Example: "Cameron Crazies". Confidence interval for $\mu_1 - \mu_2$

Do the "Cameron Crazies" at Duke home games help the Blue Devils play better defense? Below are the points allowed by Duke (men) at home and on the road for the conference games from a recent season.

Pts allowed at home: 44, 56, 44, 54, 75, 101, 91, 81
Pts allowed on road: 56, 70, 74, 80, 67, 65, 79, 58

Home: $\bar{x}_1 = 68.25$, $s_1 = 21.8$, $n_1 = 8$
Road: $\bar{x}_2 = 68.63$, $s_2 = 8.9$, $n_2 = 8$

Calculate a 95% CI for $\mu_1 - \mu_2$, where
$\mu_1$ = mean points per game allowed by Duke at home.
$\mu_2$ = mean points per game allowed by Duke on the road.

• $n_1 = 8$, $n_2 = 8$; $s_1^2 = (21.8)^2 \approx 475.36$; $s_2^2 = (8.9)^2 \approx 79.41$ (the variances are computed from the unrounded standard deviations, so they differ slightly from the squares of 21.8 and 8.9)

$$df = \frac{\left(\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}\right)^2}{\frac{1}{n_1 - 1}\left(\frac{s_1^2}{n_1}\right)^2 + \frac{1}{n_2 - 1}\left(\frac{s_2^2}{n_2}\right)^2} = \frac{\left(\frac{475.36}{8} + \frac{79.41}{8}\right)^2}{\frac{1}{7}\left(\frac{475.36}{8}\right)^2 + \frac{1}{7}\left(\frac{79.41}{8}\right)^2} = 9.27$$

• To use the t-table, let's use df = 9; $t^*_9 = 2.2622$.
• The confidence interval estimator for the difference between two means is

$$(\bar{x}_1 - \bar{x}_2) \pm t^*_9\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} = (68.25 - 68.63) \pm 2.2622\sqrt{\frac{475.36}{8} + \frac{79.41}{8}} = -0.38 \pm 18.84 = (-19.22, 18.46)$$

Interpretation

• The 95% CI for $\mu_1 - \mu_2$ is (-19.22, 18.46).
• Since the interval contains 0, there appears to be no significant difference between $\mu_1$ = mean points per game allowed by Duke at home and $\mu_2$ = mean points per game allowed by Duke on the road.
• The Cameron Crazies appear to have no effect on the ability of the Duke men to play defense. How can this be?

Example: confidence interval for $\mu_1 - \mu_2$

• Example (p. 6)
– Do people who eat high-fiber cereal for breakfast consume, on average, fewer calories for lunch than people who do not eat high-fiber cereal for breakfast?
– A sample of 150 people was randomly drawn. Each person was identified as a consumer or a non-consumer of high-fiber cereal.
– For each person the number of calories consumed at lunch was recorded.

Consumers: 568 498 589 681 540 646 636 739 539 596 607 529 637 617 633 555 . . . .
Non-consumers: 705 819 706 509 613 582 601 608 787 573 428 754 741 628 537 748 . . . .

Solution: (all data on p. 6)

• The parameter to be tested is the difference between two means.
• The claim to be tested is: the mean caloric intake of consumers ($\mu_1$) is less than that of non-consumers ($\mu_2$).
• $n_1 = 43$, $n_2 = 107$; $s_1^2 = 4{,}103$; $s_2^2 = 10{,}670$

$$df = \frac{\left(\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}\right)^2}{\frac{1}{n_1 - 1}\left(\frac{s_1^2}{n_1}\right)^2 + \frac{1}{n_2 - 1}\left(\frac{s_2^2}{n_2}\right)^2} = 122.6$$

• Let's use df = 120; $t^*_{120} = 1.9799$.
• The confidence interval estimator for the difference between two means using the formula on p. 4 is

$$(\bar{x}_1 - \bar{x}_2) \pm t^*_{120}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} = (604.02 - 633.239) \pm 1.9799\sqrt{\frac{4103}{43} + \frac{10670}{107}} = -29.21 \pm 27.66 = (-56.87, -1.55)$$

Interpretation

• The 95% CI is (-56.87, -1.55).
• Since the interval is entirely negative (that is, it does not contain 0), there is evidence from the data that $\mu_1$ is less than $\mu_2$. We estimate that non-consumers of high-fiber cereal consume on average between 1.55 and 56.87 more calories for lunch than consumers.

Example (cont.): confidence interval for $\mu_1 - \mu_2$ using $\min(n_1 - 1, n_2 - 1)$ to approximate the df

• Let's use df = min(43 - 1, 107 - 1) = min(42, 106) = 42.
• $t^*_{42} = 2.0181$
• The confidence interval estimator for the difference between two means using the formula on p. 4 is

$$(\bar{x}_1 - \bar{x}_2) \pm t^*_{42}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} = (604.02 - 633.239) \pm 2.0181\sqrt{\frac{4103}{43} + \frac{10670}{107}} = -29.21 \pm 28.19 = (-57.40, -1.02)$$

Beware! Common Mistake!

A common mistake is to calculate a one-sample confidence interval for $\mu_1$, a one-sample confidence interval for $\mu_2$, and then to conclude that $\mu_1$ and $\mu_2$ are equal if the confidence intervals overlap.

This is WRONG because the variability in the sampling distribution of $\bar{x}_1 - \bar{x}_2$ from two independent samples is more complex and must take into account the variability coming from both samples. Hence the more complex formula for the standard error:

$$SE = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$

INCORRECT: Two single-sample 95% confidence intervals. The confidence interval for the male mean and the confidence interval for the female mean overlap, suggesting no significant difference between the true mean for males and the true mean for females.

| | Male | Female |
|---|---|---|
| mean | 19.4 | 17.9 |
| st. dev. s | 2.52 | 3.39 |
| n | 50 | 50 |

Male interval: (18.68, 20.12)
Female interval: (16.94, 18.86)

CORRECT: The 2-sample 95% confidence interval of the form

$$(\bar{y}_1 - \bar{y}_2) \pm t^*_{.025,\,df}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$

for the difference $\mu_{male} - \mu_{female}$ between the means is (.313, 2.69).
The interval is entirely positive, suggesting a significant difference between the true mean for males and the true mean for females (evidence that the true male mean is larger than the true female mean).

[Figure: number line marking 0, .313, 1.5, and 2.69; the interval from .313 to 2.69 lies entirely to the right of 0.]

Reason for Contradictory Result

For nonnegative $a$ and $b$, it is always true that $\sqrt{a + b} \le \sqrt{a} + \sqrt{b}$. Specifically,

$$\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} \le \frac{s_1}{\sqrt{n_1}} + \frac{s_2}{\sqrt{n_2}}, \qquad \text{that is,} \qquad SE(\bar{x}_1 - \bar{x}_2) \le SE(\bar{x}_1) + SE(\bar{x}_2).$$

Because the standard error of the difference is smaller than the sum of the two one-sample standard errors, two one-sample intervals can overlap even when the two-sample interval excludes 0.

Does smoking damage the lungs of children exposed to parental smoking?

Forced vital capacity (FVC) is the volume (in milliliters) of air that an individual can exhale in 6 seconds. FVC was obtained for a sample of children not exposed to parental smoking and a group of children exposed to parental smoking.

| Parental smoking | FVC mean $\bar{x}$ | s | n |
|---|---|---|---|
| Yes | 75.5 | 9.3 | 30 |
| No | 88.2 | 15.1 | 30 |

We want to know whether parental smoking decreases children's lung capacity as measured by the FVC test. Is the mean FVC lower in the population of children exposed to parental smoking?

$\mu_1$ = mean FVC of children with a smoking parent; $\mu_2$ = mean FVC of children without a smoking parent.

95% confidence interval for $(\mu_1 - \mu_2)$, with df = min(30 - 1, 30 - 1) = 29 and $t^* = 2.0452$:

$$(\bar{x}_1 - \bar{x}_2) \pm t^*\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} = (75.5 - 88.2) \pm 2.0452\sqrt{\frac{9.3^2}{30} + \frac{15.1^2}{30}} = -12.7 \pm 2.0452 \times 3.24 = -12.7 \pm 6.63 = (-19.33, -6.07)$$

We are 95% confident that mean lung capacity (FVC) is between 6.07 and 19.33 milliliters LESS in children of smoking parents.

Do left-handed people have a shorter life expectancy than right-handed people?

Some psychologists believe that the stress of being left-handed in a right-handed world leads to earlier deaths among left-handers. Several studies have compared the life expectancies of left-handers and right-handers. One such study resulted in the data shown in the table.
| Handedness | Mean age at death $\bar{x}$ | s | n |
|---|---|---|---|
| Left | 66.8 | 25.3 | 99 |
| Right | 75.2 | 15.1 | 888 |

[Images: star left-handed quarterback Steve Young; left-handed presidents; the "Bambino", left-handed Babe Ruth, baseball's all-time best player.]

We will use the data to construct a confidence interval for the difference in mean life expectancies for left-handers and right-handers. Is the mean life expectancy of left-handers less than the mean life expectancy of right-handers?

$\mu_1$ = mean life expectancy of left-handers; $\mu_2$ = mean life expectancy of right-handers.

95% confidence interval for $(\mu_1 - \mu_2)$, with df = min(99 - 1, 888 - 1) = 98 and $t^* = 1.9845$:

$$(\bar{x}_1 - \bar{x}_2) \pm t^*\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} = (66.8 - 75.2) \pm 1.9845\sqrt{\frac{(25.3)^2}{99} + \frac{(15.1)^2}{888}} = -8.4 \pm 1.9845 \times 2.59 = -8.4 \pm 5.14 = (-13.54, -3.26)$$

We are 95% confident that the mean life expectancy for left-handers is between 3.26 and 13.54 years LESS than the mean life expectancy for right-handers.
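
The calculations in this section can be checked with a few lines of Python. The sketch below is only an illustration, not part of the original slides: the function names (standard_error, welch_df, two_sample_ci) are hypothetical helpers, and the critical value t* is passed in by hand from the t-table, as in the examples, rather than computed. It assumes summary statistics (mean, standard deviation, sample size) for two independent samples.

```python
import math


def standard_error(s1, n1, s2, n2):
    """SE of (x1_bar - x2_bar) for two independent samples."""
    return math.sqrt(s1**2 / n1 + s2**2 / n2)


def welch_df(s1, n1, s2, n2):
    """Approximate degrees of freedom from the df formula in this section."""
    v1 = s1**2 / n1
    v2 = s2**2 / n2
    return (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))


def two_sample_ci(x1, s1, n1, x2, s2, n2, t_star):
    """CI for mu1 - mu2; t_star is looked up in a t-table by the caller."""
    diff = x1 - x2
    margin = t_star * standard_error(s1, n1, s2, n2)
    return diff - margin, diff + margin


# Cameron Crazies: df from the formula (the slides round s1 and s2,
# so the result differs slightly in the second decimal place).
print("Cameron df:", round(welch_df(21.8, 8, 8.9, 8), 2))

# FVC example, conservative df = 29, t* = 2.0452 from the t-table.
print("FVC CI:", two_sample_ci(75.5, 9.3, 30, 88.2, 15.1, 30, 2.0452))

# Handedness example, conservative df = 98, t* = 1.9845 from the t-table.
print("Handedness CI:", two_sample_ci(66.8, 25.3, 99, 75.2, 15.1, 888, 1.9845))

# Common-mistake slide: SE of the difference vs. sum of one-sample SEs.
se_diff = standard_error(2.52, 50, 3.39, 50)
se_sum = 2.52 / math.sqrt(50) + 3.39 / math.sqrt(50)
print("SE(diff) <= SE(male) + SE(female):", se_diff <= se_sum)
```

Small differences in the last decimal place relative to the slides are expected, because the slides round the standard error before multiplying by t*.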