BUSINESS STATISTICS
Chapter 10
INFERENCE: TWO POPULATIONS

Chapter Contents

This chapter includes two main parts:
1. Estimation for the difference of two population parameters
2. Hypothesis testing for the difference of two population parameters

Part 1: Goals

After completing this part, you should be able to:
  Form confidence intervals for the mean difference from dependent samples
  Form confidence intervals for the difference between two independent population means (standard deviations known or unknown)
  Compute confidence interval limits for the difference between two independent population proportions
  Create confidence intervals for a population variance
  Find chi-square values from the chi-square distribution table
  Determine the required sample size to estimate a mean or proportion within a specified margin of error

Estimation: Two Populations

Chapter topics, each with an example:
  Population means, dependent samples: same group before vs. after treatment
  Population means, independent samples: Group 1 vs. independent Group 2
  Population proportions: Proportion 1 vs. Proportion 2
  Population variance: variance of a normal distribution

Dependent Samples

Tests Means of 2 Related Populations
  Paired or matched samples
  Repeated measures (before/after)
  Use the difference between paired values: $d_i = x_i - y_i$
  Eliminates variation among subjects
  Assumption: both populations are normally distributed

Mean Difference

The ith paired difference is $d_i = x_i - y_i$.

The point estimate for the population mean paired difference is $\bar{d}$:
\[ \bar{d} = \frac{\sum_{i=1}^{n} d_i}{n} \]

The sample standard deviation is:
\[ S_d = \sqrt{\frac{\sum_{i=1}^{n} (d_i - \bar{d})^2}{n - 1}} \]

where n is the number of matched pairs in the sample.

Confidence Interval for Mean Difference

The confidence interval for the difference between population means, $\mu_d$, is
\[ \bar{d} - t_{n-1,\alpha/2} \frac{S_d}{\sqrt{n}} < \mu_d < \bar{d} + t_{n-1,\alpha/2} \frac{S_d}{\sqrt{n}} \]

where n = the sample size (number of matched pairs in the paired sample).

Confidence Interval for Mean Difference (continued)

The margin of error is
\[ ME = t_{n-1,\alpha/2} \frac{S_d}{\sqrt{n}} \]

$t_{n-1,\alpha/2}$ is the value from the Student's t distribution with (n - 1) degrees of freedom for which
\[ P(t_{n-1} > t_{n-1,\alpha/2}) = \frac{\alpha}{2} \]

Paired Samples Example

Six people sign up for a weight loss program.
You collect the following data:

  Person   Weight before (x)   Weight after (y)   Difference, d_i
  1        136                 125                 11
  2        205                 195                 10
  3        157                 150                  7
  4        138                 140                 -2
  5        175                 165                 10
  6        166                 160                  6
                               Total               42

\[ \bar{d} = \frac{\sum d_i}{n} = \frac{42}{6} = 7.0 \]
\[ S_d = \sqrt{\frac{\sum (d_i - \bar{d})^2}{n - 1}} = 4.82 \]

Paired Samples Example (continued)

For a 95% confidence level, the appropriate t value is $t_{n-1,\alpha/2} = t_{5,0.025} = 2.571$.

The 95% confidence interval for the difference between means, $\mu_d$, is
\[ \bar{d} - t_{n-1,\alpha/2} \frac{S_d}{\sqrt{n}} < \mu_d < \bar{d} + t_{n-1,\alpha/2} \frac{S_d}{\sqrt{n}} \]
\[ 7 - (2.571)\frac{4.82}{\sqrt{6}} < \mu_d < 7 + (2.571)\frac{4.82}{\sqrt{6}} \]
\[ 1.94 < \mu_d < 12.06 \]

Since this interval lies entirely above zero, we can be 95% confident, even with this limited data, that mean weight after the program is lower than mean weight before it.

Difference Between Two Means

Population means, independent samples

Goal: Form a confidence interval for the difference between two population means, $\mu_x - \mu_y$

Independent samples come from different, unrelated data sources: the sample selected from one population has no effect on the sample selected from the other population.

The point estimate is the difference between the two sample means: $\bar{x} - \bar{y}$

Difference Between Two Means (continued)

  $\sigma_x^2$ and $\sigma_y^2$ known: the confidence interval uses $z_{\alpha/2}$
  $\sigma_x^2$ and $\sigma_y^2$ unknown (assumed equal or assumed unequal): the confidence interval uses a value from the Student's t distribution

$\sigma_x^2$ and $\sigma_y^2$ Known

Assumptions:
  Samples are randomly and independently drawn
  Both population distributions are normal
  Population variances are known

$\sigma_x^2$ and $\sigma_y^2$ Known (continued)

When $\sigma_x^2$ and $\sigma_y^2$ are known and both populations are normal, the variance of $\bar{X} - \bar{Y}$ is
\[ \sigma_{\bar{X} - \bar{Y}}^2 = \frac{\sigma_x^2}{n_x} + \frac{\sigma_y^2}{n_y} \]

and the random variable
\[ Z = \frac{(\bar{x} - \bar{y}) - (\mu_x - \mu_y)}{\sqrt{\dfrac{\sigma_x^2}{n_x} + \dfrac{\sigma_y^2}{n_y}}} \]
has a standard normal distribution.

Confidence Interval, $\sigma_x^2$ and $\sigma_y^2$ Known

The confidence interval for $\mu_x - \mu_y$ is:
\[ (\bar{x} - \bar{y}) - z_{\alpha/2} \sqrt{\frac{\sigma_x^2}{n_x} + \frac{\sigma_y^2}{n_y}} < \mu_x - \mu_y < (\bar{x} - \bar{y}) + z_{\alpha/2} \sqrt{\frac{\sigma_x^2}{n_x} + \frac{\sigma_y^2}{n_y}} \]

$\sigma_x^2$ and $\sigma_y^2$ Unknown, Assumed Equal

Assumptions:
  Samples are randomly and independently drawn
  Populations are normally distributed
  Population variances are unknown but assumed equal

$\sigma_x^2$ and $\sigma_y^2$ Unknown, Assumed Equal (continued)

Forming interval estimates:
  The population variances are assumed equal, so use the two sample standard deviations and pool them to estimate $\sigma$
  Use a t value with $(n_x + n_y - 2)$ degrees of freedom

$\sigma_x^2$ and $\sigma_y^2$ Unknown, Assumed Equal (continued)

The pooled variance is
\[ s_p^2 = \frac{(n_x - 1)s_x^2 + (n_y - 1)s_y^2}{n_x + n_y - 2} \]

Confidence Interval, $\sigma_x^2$ and $\sigma_y^2$ Unknown, Equal

The confidence interval for $\mu_x - \mu_y$ is:
\[ (\bar{x} - \bar{y}) - t_{n_x + n_y - 2,\alpha/2} \sqrt{\frac{s_p^2}{n_x} + \frac{s_p^2}{n_y}} < \mu_x - \mu_y < (\bar{x} - \bar{y}) + t_{n_x + n_y - 2,\alpha/2} \sqrt{\frac{s_p^2}{n_x} + \frac{s_p^2}{n_y}} \]

where
\[ s_p^2 = \frac{(n_x - 1)s_x^2 + (n_y - 1)s_y^2}{n_x + n_y - 2} \]

Pooled Variance Example

You are testing two computer processors for speed. Form a confidence interval for the difference in CPU speed.
You collect the following speed data (in MHz):

                    CPUx    CPUy
  Number tested     17      14
  Sample mean       3004    2538
  Sample std dev    74      56

Assume both populations are normal with equal variances, and use 95% confidence.

Calculating the Pooled Variance

The pooled variance is:
\[ S_p^2 = \frac{(n_x - 1)S_x^2 + (n_y - 1)S_y^2}{(n_x - 1) + (n_y - 1)} = \frac{(17 - 1)74^2 + (14 - 1)56^2}{(17 - 1) + (14 - 1)} = 4427.03 \]

The t value for a 95% confidence interval is:
\[ t_{n_x + n_y - 2,\alpha/2} = t_{29,0.025} = 2.045 \]

Calculating the Confidence Limits

The 95% confidence interval is
\[ (\bar{x} - \bar{y}) - t_{n_x + n_y - 2,\alpha/2} \sqrt{\frac{s_p^2}{n_x} + \frac{s_p^2}{n_y}} < \mu_x - \mu_y < (\bar{x} - \bar{y}) + t_{n_x + n_y - 2,\alpha/2} \sqrt{\frac{s_p^2}{n_x} + \frac{s_p^2}{n_y}} \]
\[ (3004 - 2538) - (2.045)\sqrt{\frac{4427.03}{17} + \frac{4427.03}{14}} < \mu_x - \mu_y < (3004 - 2538) + (2.045)\sqrt{\frac{4427.03}{17} + \frac{4427.03}{14}} \]
\[ 416.89 < \mu_x - \mu_y < 515.11 \]

We are 95% confident that the mean difference in CPU speed is between 416.89 and 515.11 MHz.

$\sigma_x^2$ and $\sigma_y^2$ Unknown, Assumed Unequal

Assumptions:
  Samples are randomly and independently drawn
  Populations are normally distributed
  Population variances are unknown and assumed unequal

$\sigma_x^2$ and $\sigma_y^2$ Unknown, Assumed Unequal (continued)

Forming interval estimates:
  The population variances are assumed unequal, so a pooled variance is not appropriate
  Use a t value with $v$ degrees of freedom, where
\[ v = \frac{\left[ \dfrac{s_x^2}{n_x} + \dfrac{s_y^2}{n_y} \right]^2}{\left( \dfrac{s_x^2}{n_x} \right)^2 / (n_x - 1) + \left( \dfrac{s_y^2}{n_y} \right)^2 / (n_y - 1)} \]

Confidence Interval, $\sigma_x^2$ and $\sigma_y^2$ Unknown, Unequal

The confidence interval for $\mu_x - \mu_y$ is:
\[ (\bar{x} - \bar{y}) - t_{v,\alpha/2} \sqrt{\frac{s_x^2}{n_x} + \frac{s_y^2}{n_y}} < \mu_x - \mu_y < (\bar{x} - \bar{y}) + t_{v,\alpha/2} \sqrt{\frac{s_x^2}{n_x} + \frac{s_y^2}{n_y}} \]

where $v$ is the degrees-of-freedom value defined on the previous slide.

Two Population Proportions

Goal: Form a confidence interval for the difference between two population proportions, $P_x - P_y$

Assumptions: Both sample sizes are large (generally at least 40 observations in each sample)

The point estimate for the difference is $\hat{p}_x - \hat{p}_y$

Two Population Proportions (continued)

The random variable
\[ Z = \frac{(\hat{p}_x - \hat{p}_y) - (P_x - P_y)}{\sqrt{\dfrac{\hat{p}_x(1 - \hat{p}_x)}{n_x} + \dfrac{\hat{p}_y(1 - \hat{p}_y)}{n_y}}} \]
is approximately normally distributed.

Confidence Interval for Two Population Proportions

The confidence limits for $P_x - P_y$ are:
\[ (\hat{p}_x - \hat{p}_y) \pm z_{\alpha/2} \sqrt{\frac{\hat{p}_x(1 - \hat{p}_x)}{n_x} + \frac{\hat{p}_y(1 - \hat{p}_y)}{n_y}} \]

Example: Two Population Proportions

Form a 90% confidence interval for the difference between the proportion of men and the proportion of women who have college degrees.

In a random sample, 26 of 50 men and 28 of 40 women had an earned college degree.

Example: Two Population Proportions (continued)

Men: $\hat{p}_x = 26/50 = 0.52$
Women: $\hat{p}_y = 28/40 = 0.70$

\[ \sqrt{\frac{\hat{p}_x(1 - \hat{p}_x)}{n_x} + \frac{\hat{p}_y(1 - \hat{p}_y)}{n_y}} = \sqrt{\frac{0.52(0.48)}{50} + \frac{0.70(0.30)}{40}} = 0.1012 \]

For 90% confidence, $z_{\alpha/2} = 1.645$.

Example: Two Population Proportions (continued)

The confidence limits are:
\[ (\hat{p}_x - \hat{p}_y) \pm z_{\alpha/2} \sqrt{\frac{\hat{p}_x(1 - \hat{p}_x)}{n_x} + \frac{\hat{p}_y(1 - \hat{p}_y)}{n_y}} = (0.52 - 0.70) \pm 1.645(0.1012) \]

so the confidence interval is
\[ -0.3465 < P_x - P_y < -0.0135 \]

Since this interval does not contain zero, we are 90% confident that the two proportions are not equal.

Confidence Intervals for the Population Variance

Goal: Form a confidence interval for the population variance, $\sigma^2$
  The confidence interval is based on the sample variance, $s^2$
  Assumed: the population is normally distributed

Confidence Intervals for the Population Variance (continued)

The random variable
\[ \chi_{n-1}^2 = \frac{(n - 1)s^2}{\sigma^2} \]
follows a chi-square distribution with (n - 1) degrees of freedom.

The chi-square value $\chi_{n-1,\alpha}^2$ denotes the number for which
\[ P(\chi_{n-1}^2 > \chi_{n-1,\alpha}^2) = \alpha \]

Confidence Intervals for the Population Variance (continued)

The $100(1 - \alpha)\%$ confidence interval for the population variance is
\[ \frac{(n - 1)s^2}{\chi_{n-1,\alpha/2}^2} < \sigma^2 < \frac{(n - 1)s^2}{\chi_{n-1,1-\alpha/2}^2} \]

Example

You are testing the speed of a computer processor. You collect the following data (in MHz):

                    CPUx
  Sample size       17
  Sample mean       3004
  Sample std dev    74

Assume the population is normal. Determine the 95% confidence interval for $\sigma_x^2$.

Finding the Chi-square Values

n = 17, so the chi-square distribution has (n - 1) = 16 degrees of freedom.
$\alpha = 0.05$, so use the chi-square values with area 0.025 in each tail:
\[ \chi_{n-1,\alpha/2}^2 = \chi_{16,0.025}^2 = 28.85 \]
\[ \chi_{n-1,1-\alpha/2}^2 = \chi_{16,0.975}^2 = 6.91 \]

(Figure: chi-square density with 16 d.f., with probability 0.025 in each tail and cut-offs at 6.91 and 28.85.)

Calculating the Confidence Limits

The 95% confidence interval is
\[ \frac{(n - 1)s^2}{\chi_{n-1,\alpha/2}^2} < \sigma^2 < \frac{(n - 1)s^2}{\chi_{n-1,1-\alpha/2}^2} \]
\[ \frac{(17 - 1)(74)^2}{28.85} < \sigma^2 < \frac{(17 - 1)(74)^2}{6.91} \]
\[ 3037 < \sigma^2 < 12683 \]

Converting to standard deviation, we are 95% confident that the population standard deviation of CPU speed is between 55.1 and 112.6 MHz.

Sample Size Determination

Determining sample size:
  For the mean
  For the proportion

Margin of Error

The required sample size can be found to reach a desired margin of error (ME) with a specified level of confidence $(1 - \alpha)$.

The margin of error is also called sampling error:
  the amount of imprecision in the estimate of the population parameter
  the amount added and subtracted to the point estimate to form the confidence interval

Sample Size Determination: For the Mean

The confidence interval is
\[ \bar{x} \pm z_{\alpha/2} \frac{\sigma}{\sqrt{n}} \]

so the margin of error (sampling error) is
\[ ME = z_{\alpha/2} \frac{\sigma}{\sqrt{n}} \]

Sample Size Determination (continued)

Now solve $ME = z_{\alpha/2} \sigma / \sqrt{n}$ for n to get
\[ n = \frac{z_{\alpha/2}^2 \sigma^2}{ME^2} \]

Sample Size Determination (continued)

To determine the required sample size for the mean, you must know:
  The desired level of confidence $(1 - \alpha)$, which determines the $z_{\alpha/2}$ value
  The acceptable margin of error (sampling error), ME
  The standard deviation, $\sigma$

Required Sample Size Example

If $\sigma$ = 45, what sample size is needed to estimate
the mean within ±5 with 90% confidence?

\[ n = \frac{z_{\alpha/2}^2 \sigma^2}{ME^2} = \frac{(1.645)^2 (45)^2}{5^2} = 219.19 \]

So the required sample size is n = 220 (always round up).

Sample Size Determination: For the Proportion

The confidence interval is
\[ \hat{p} \pm z_{\alpha/2} \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}} \]

so the margin of error (sampling error) is
\[ ME = z_{\alpha/2} \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}} \]

Sample Size Determination (continued)

$\hat{p}(1 - \hat{p})$ cannot be larger than 0.25, the value it takes when $\hat{p} = 0.5$.

Substitute 0.25 for $\hat{p}(1 - \hat{p})$ in the margin of error and solve for n to get
\[ n = \frac{0.25\, z_{\alpha/2}^2}{ME^2} \]

Sample Size Determination (continued)

The sample and population proportions, $\hat{p}$ and P, are generally not known (since no sample has been taken yet).

P(1 - P) = 0.25 generates the largest possible margin of error, so it guarantees that the resulting sample size will meet the desired level of confidence.

To determine the required sample size for the proportion, you must know:
  The desired level of confidence $(1 - \alpha)$, which determines the critical $z_{\alpha/2}$ value
  The acceptable sampling error (margin of error), ME
  The estimate P(1 - P) = 0.25

Required Sample Size Example

How large a sample would be necessary to estimate the true proportion defective in a large population within ±3%, with 95% confidence?
Required Sample Size Example (continued)

Solution:
For 95% confidence, use $z_{0.025} = 1.96$
ME = 0.03
Estimate P(1 - P) = 0.25

\[ n = \frac{0.25\, z_{\alpha/2}^2}{ME^2} = \frac{(0.25)(1.96)^2}{(0.03)^2} = 1067.11 \]

So use n = 1068.

Part 1: Summary

  Compared two dependent samples (paired samples)
  Formed confidence intervals for the paired difference
  Compared two independent samples
  Formed confidence intervals for the difference between two means, population variance known, using z
  Formed confidence intervals for the difference between two means, population variance unknown, using t
  Formed confidence intervals for the difference between two population proportions
  Formed confidence intervals for the population variance using the chi-square distribution
  Determined required sample size to meet confidence and margin of error requirements

Part 2
Hypothesis Testing: Two Populations

Part 2: Goals

After completing this part, you should be able to:
  Test hypotheses for the difference between two population means:
    two means, matched pairs
    independent populations, population variances known
    independent populations, population variances unknown but equal
  Complete a hypothesis test for the difference between two proportions (large samples)
  Use the chi-square distribution for tests of the variance of a normal distribution
  Use the F table to find critical F values
  Complete an F test for the equality of two variances

Two Sample Tests

  Population means, matched pairs
  Population means, independent samples
  Population proportions
  Population variances

Examples: same group before vs. after treatment; Group 1 vs. independent Group 2; Proportion 1 vs. Proportion 2; Variance 1 vs.
Variance 2 (Note similarities to Part 1)

Matched Pairs

Tests Means of 2 Related Populations
  Paired or matched samples
  Repeated measures (before/after)
  Use the difference between paired values: $d_i = x_i - y_i$
  Assumption: both populations are normally distributed

Test Statistic: Matched Pairs

The test statistic for the mean difference is a t value, with n - 1 degrees of freedom:
\[ t = \frac{\bar{d} - D_0}{s_d / \sqrt{n}} \]

where
  $D_0$ = hypothesized mean difference
  $s_d$ = sample standard deviation of the differences
  n = the sample size (number of pairs)

Decision Rules: Matched Pairs

Lower-tail test:
  $H_0: \mu_x - \mu_y \ge 0$, $H_1: \mu_x - \mu_y < 0$
  Reject $H_0$ if $t < -t_{n-1,\alpha}$
Upper-tail test:
  $H_0: \mu_x - \mu_y \le 0$, $H_1: \mu_x - \mu_y > 0$
  Reject $H_0$ if $t > t_{n-1,\alpha}$
Two-tail test:
  $H_0: \mu_x - \mu_y = 0$, $H_1: \mu_x - \mu_y \ne 0$
  Reject $H_0$ if $t < -t_{n-1,\alpha/2}$ or $t > t_{n-1,\alpha/2}$

where $t = \dfrac{\bar{d} - D_0}{s_d / \sqrt{n}}$ has n - 1 d.f.

Matched Pairs Example

Assume you send your salespeople to a "customer service" training workshop. Has the training made a difference in the number of complaints? You collect the following data:

  Number of complaints
  Salesperson   Before (1)   After (2)   Difference, d_i = (2) - (1)
  C.B.          6            4            -2
  T.F.          20           6           -14
  M.H.          3            2            -1
  R.K.          0            0             0
  M.O.          4            0            -4
                             Total       -21

\[ \bar{d} = \frac{\sum d_i}{n} = \frac{-21}{5} = -4.2 \]
\[ S_d = \sqrt{\frac{\sum (d_i - \bar{d})^2}{n - 1}} = 5.67 \]

Matched Pairs: Solution

Has the training made a difference in the number of complaints (at the $\alpha$ = 0.01 level)?

$H_0: \mu_x - \mu_y = 0$
$H_1: \mu_x - \mu_y \ne 0$

$\alpha$ = .01, $\bar{d}$ = -4.2
d.f. = n - 1 = 4
Critical value = ±4.604

Test statistic:
\[ t = \frac{\bar{d} - D_0}{s_d / \sqrt{n}} = \frac{-4.2 - 0}{5.67 / \sqrt{5}} = -1.66 \]

(Figure: t distribution with rejection regions of area $\alpha/2$ below -4.604 and above 4.604; the test statistic -1.66 falls between them.)

Decision: Do not reject $H_0$ (the t statistic is not in the rejection region).

Conclusion: There is not a significant change in the number of complaints.
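The matched-pairs calculation above can be checked with a short Python sketch. This is only an illustration, not part of the original slides: the function name `paired_t_statistic` and the variable names are hypothetical, and the critical value 4.604 is copied from the example rather than computed.

```python
import math
import statistics


def paired_t_statistic(before, after, d0=0.0):
    """Return (d_bar, s_d, t) for paired differences d_i = after_i - before_i."""
    diffs = [a - b for b, a in zip(before, after)]
    n = len(diffs)
    d_bar = statistics.mean(diffs)
    s_d = statistics.stdev(diffs)  # sample standard deviation (divides by n - 1)
    t = (d_bar - d0) / (s_d / math.sqrt(n))
    return d_bar, s_d, t


# Complaint counts from the salesperson example (C.B., T.F., M.H., R.K., M.O.)
before = [6, 20, 3, 0, 4]
after = [4, 6, 2, 0, 0]

d_bar, s_d, t_stat = paired_t_statistic(before, after)

# Two-tail critical value for 4 d.f. at alpha = 0.01, taken from the example
T_CRITICAL = 4.604
reject_h0 = abs(t_stat) > T_CRITICAL

print(f"d_bar = {d_bar:.2f}, s_d = {s_d:.2f}, t = {t_stat:.2f}, reject H0: {reject_h0}")
```

Keeping the critical value as a constant mirrors the table lookup used in the slides; a statistics library could supply it instead if one is available.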
Difference Between Two Means

Population means, independent samples

Goal: Test hypotheses about the difference between two population means, $\mu_x - \mu_y$

Independent samples come from different, unrelated data sources: the sample selected from one population has no effect on the sample selected from the other population.

Difference Between Two Means (continued)

  $\sigma_x^2$ and $\sigma_y^2$ known: the test statistic is a z value
  $\sigma_x^2$ and $\sigma_y^2$ unknown (assumed equal or assumed unequal): the test statistic is a value from the Student's t distribution

$\sigma_x^2$ and $\sigma_y^2$ Known

Assumptions:
  Samples are randomly and independently drawn
  Both population distributions are normal
  Population variances are known

$\sigma_x^2$ and $\sigma_y^2$ Known (continued)

When $\sigma_x^2$ and $\sigma_y^2$ are known and both populations are normal, the variance of $\bar{X} - \bar{Y}$ is
\[ \sigma_{\bar{X} - \bar{Y}}^2 = \frac{\sigma_x^2}{n_x} + \frac{\sigma_y^2}{n_y} \]

and the random variable
\[ Z = \frac{(\bar{x} - \bar{y}) - (\mu_x - \mu_y)}{\sqrt{\dfrac{\sigma_x^2}{n_x} + \dfrac{\sigma_y^2}{n_y}}} \]
has a standard normal distribution.

Test Statistic, $\sigma_x^2$ and $\sigma_y^2$ Known

The test statistic for $\mu_x - \mu_y$ is:
\[ z = \frac{(\bar{x} - \bar{y}) - D_0}{\sqrt{\dfrac{\sigma_x^2}{n_x} + \dfrac{\sigma_y^2}{n_y}}} \]

Hypothesis Tests for Two Population Means

Two population means, independent samples

Lower-tail test:
  $H_0: \mu_x \ge \mu_y$, $H_1: \mu_x < \mu_y$
  i.e., $H_0: \mu_x - \mu_y \ge 0$, $H_1: \mu_x - \mu_y < 0$
Upper-tail test:
  $H_0: \mu_x \le \mu_y$, $H_1: \mu_x > \mu_y$
  i.e., $H_0: \mu_x - \mu_y \le 0$, $H_1: \mu_x - \mu_y > 0$
Two-tail test:
  $H_0: \mu_x = \mu_y$, $H_1: \mu_x \ne \mu_y$
  i.e., $H_0: \mu_x - \mu_y = 0$, $H_1: \mu_x - \mu_y \ne 0$

Decision Rules

Two population means, independent samples, variances known

Lower-tail test:
  $H_0: \mu_x - \mu_y \ge 0$, $H_1: \mu_x - \mu_y < 0$
  Reject $H_0$ if $z < -z_{\alpha}$
Upper-tail test:
  $H_0: \mu_x - \mu_y \le 0$, $H_1: \mu_x - \mu_y > 0$
  Reject $H_0$ if $z > z_{\alpha}$
Two-tail test:
  $H_0: \mu_x - \mu_y = 0$, $H_1: \mu_x - \mu_y \ne 0$
  Reject $H_0$ if $z < -z_{\alpha/2}$ or $z > z_{\alpha/2}$

$\sigma_x^2$ and $\sigma_y^2$ Unknown, Assumed Equal

Assumptions:
  Samples are randomly and independently drawn
  Populations are normally distributed
  Population variances are unknown but assumed equal

$\sigma_x^2$ and $\sigma_y^2$ Unknown, Assumed Equal (continued)

  The population variances are assumed equal, so use the two sample standard deviations and pool them to estimate $\sigma$
  Use a t value with $(n_x + n_y - 2)$ degrees of freedom

Test Statistic, $\sigma_x^2$ and $\sigma_y^2$ Unknown, Equal

The test statistic for $\mu_x - \mu_y$ is:
\[ t = \frac{(\bar{x} - \bar{y}) - (\mu_x - \mu_y)}{\sqrt{s_p^2 \left( \dfrac{1}{n_x} + \dfrac{1}{n_y} \right)}} \]

where t has $(n_x + n_y - 2)$ d.f., and
\[ s_p^2 = \frac{(n_x - 1)s_x^2 + (n_y - 1)s_y^2}{n_x + n_y - 2} \]

$\sigma_x^2$ and $\sigma_y^2$ Unknown, Assumed Unequal

Assumptions:
  Samples are randomly and independently drawn
  Populations are normally distributed
  Population variances are unknown and assumed unequal

$\sigma_x^2$ and $\sigma_y^2$ Unknown, Assumed Unequal (continued)

  The population variances are assumed unequal, so a pooled variance is not appropriate
  Use a t value with $v$ degrees of freedom, where
\[ v = \frac{\left[ \dfrac{s_x^2}{n_x} + \dfrac{s_y^2}{n_y} \right]^2}{\left( \dfrac{s_x^2}{n_x} \right)^2 / (n_x - 1) + \left( \dfrac{s_y^2}{n_y} \right)^2 / (n_y - 1)} \]

Test Statistic, $\sigma_x^2$ and $\sigma_y^2$ Unknown, Unequal

The test statistic for $\mu_x - \mu_y$ is:
\[ t = \frac{(\bar{x} - \bar{y}) - D_0}{\sqrt{\dfrac{s_x^2}{n_x} + \dfrac{s_y^2}{n_y}}} \]

where t has $v$ degrees of freedom, with $v$ as defined on the previous slide. Because the population variances are unknown, the sample variances $s_x^2$ and $s_y^2$ appear in the denominator.

Decision Rules

Two population means, independent samples, variances unknown

Lower-tail test:
  $H_0: \mu_x - \mu_y \ge 0$, $H_1: \mu_x - \mu_y < 0$
  Reject $H_0$ if $t < -t_{\nu,\alpha}$
Upper-tail test:
  $H_0: \mu_x - \mu_y \le 0$, $H_1: \mu_x - \mu_y > 0$
  Reject $H_0$ if $t > t_{\nu,\alpha}$
Two-tail test:
  $H_0: \mu_x - \mu_y = 0$, $H_1: \mu_x - \mu_y \ne 0$
  Reject $H_0$ if $t < -t_{\nu,\alpha/2}$ or $t > t_{\nu,\alpha/2}$

where the degrees of freedom are $\nu = n_x + n_y - 2$ when the variances are assumed equal and $\nu = v$ when they are assumed unequal.

Pooled Variance t Test: Example

You are a financial analyst for a brokerage firm. Is there a difference in dividend yield between stocks listed on the NYSE and NASDAQ? You collect the following data:

                    NYSE    NASDAQ
  Number            21      25
  Sample mean       3.27    2.53
  Sample std dev    1.30    1.16

Assuming both populations are approximately normal with equal variances, is there a difference in average yield ($\alpha$ = 0.05)?

Calculating the Test Statistic

The pooled variance is:
\[ S_p^2 = \frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{(n_1 - 1) + (n_2 - 1)} = \frac{(21 - 1)1.30^2 + (25 - 1)1.16^2}{(21 - 1) + (25 - 1)} = 1.5021 \]

The test statistic is:
\[ t = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{S_p^2 \left( \dfrac{1}{n_1} + \dfrac{1}{n_2} \right)}} = \frac{(3.27 - 2.53) - 0}{\sqrt{1.5021 \left( \dfrac{1}{21} + \dfrac{1}{25} \right)}} = 2.040 \]

Solution

$H_0: \mu_1 - \mu_2 = 0$, i.e. ($\mu_1 = \mu_2$)
$H_1: \mu_1 - \mu_2 \ne 0$, i.e. ($\mu_1 \ne \mu_2$)

$\alpha$ = 0.05
df = 21 + 25 - 2 = 44
Critical values: t = ±2.0154

Test statistic:
\[ t = \frac{3.27 - 2.53}{\sqrt{1.5021 \left( \dfrac{1}{21} + \dfrac{1}{25} \right)}} = 2.040 \]

(Figure: t distribution with rejection regions of area .025 below -2.0154 and above 2.0154; the test statistic 2.040 falls in the upper rejection region.)

Decision: Reject $H_0$ at $\alpha$ = 0.05.

Conclusion: There is evidence of a difference in means.
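The pooled-variance test in the NYSE/NASDAQ example can be reproduced from the summary statistics alone. The sketch below is illustrative only: the function names `pooled_t_statistic` and `unequal_variance_df` are hypothetical, and the critical value 2.0154 is copied from the example rather than computed. The second function evaluates the degrees-of-freedom formula $v$ for the unequal-variance case on the same data; the slides do not work that case, so its result is shown without a reference value.

```python
import math


def pooled_t_statistic(n_x, mean_x, s_x, n_y, mean_y, s_y, d0=0.0):
    """Return (pooled variance, t statistic, degrees of freedom)."""
    df = n_x + n_y - 2
    sp2 = ((n_x - 1) * s_x**2 + (n_y - 1) * s_y**2) / df
    standard_error = math.sqrt(sp2 * (1 / n_x + 1 / n_y))
    t = (mean_x - mean_y - d0) / standard_error
    return sp2, t, df


def unequal_variance_df(n_x, s_x, n_y, s_y):
    """Degrees of freedom v used when the variances are assumed unequal."""
    a = s_x**2 / n_x
    b = s_y**2 / n_y
    return (a + b) ** 2 / (a**2 / (n_x - 1) + b**2 / (n_y - 1))


# Summary statistics from the example: NYSE is sample x, NASDAQ is sample y
sp2, t_stat, df = pooled_t_statistic(21, 3.27, 1.30, 25, 2.53, 1.16)

# Two-tail critical value for 44 d.f. at alpha = 0.05, taken from the example
T_CRITICAL = 2.0154
reject_h0 = abs(t_stat) > T_CRITICAL

v = unequal_variance_df(21, 1.30, 25, 1.16)

print(f"sp2 = {sp2:.4f}, t = {t_stat:.3f}, df = {df}, reject H0: {reject_h0}")
print(f"v for the unequal-variance case = {v:.2f}")
```

Working from summary statistics matches how the example is stated; with raw data available, the sample means and standard deviations would be computed first.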
Two Population Proportions

Goal: Test hypotheses for the difference between two population proportions, $P_x - P_y$

Assumptions: Both sample sizes are large, nP(1 - P) > 9

Two Population Proportions (continued)

The random variable
\[ Z = \frac{(\hat{p}_x - \hat{p}_y) - (P_x - P_y)}{\sqrt{\dfrac{\hat{p}_x(1 - \hat{p}_x)}{n_x} + \dfrac{\hat{p}_y(1 - \hat{p}_y)}{n_y}}} \]
is approximately normally distributed.

Test Statistic for Two Population Proportions

The test statistic for $H_0: P_x - P_y = 0$ is a z value:
\[ z = \frac{\hat{p}_x - \hat{p}_y}{\sqrt{\dfrac{\hat{p}_0(1 - \hat{p}_0)}{n_x} + \dfrac{\hat{p}_0(1 - \hat{p}_0)}{n_y}}} \]

where
\[ \hat{p}_0 = \frac{n_x \hat{p}_x + n_y \hat{p}_y}{n_x + n_y} \]

Decision Rules: Proportions

Lower-tail test:
  $H_0: P_x - P_y \ge 0$, $H_1: P_x - P_y < 0$
  Reject $H_0$ if $z < -z_{\alpha}$
Upper-tail test:
  $H_0: P_x - P_y \le 0$, $H_1: P_x - P_y > 0$
  Reject $H_0$ if $z > z_{\alpha}$
Two-tail test:
  $H_0: P_x - P_y = 0$, $H_1: P_x - P_y \ne 0$
  Reject $H_0$ if $z < -z_{\alpha/2}$ or $z > z_{\alpha/2}$

Example: Two Population Proportions

Is there a significant difference between the proportion of men and the proportion of women who will vote Yes on Proposition A?
In a random sample, 36 of 72 men and 31 of 50 women indicated they would vote Yes. Test at the .05 level of significance.

Example: Two Population Proportions (continued)

The hypothesis test is:
  $H_0: P_M - P_W = 0$ (the two proportions are equal)
  $H_1: P_M - P_W \ne 0$ (there is a significant difference between proportions)

The sample proportions are:
  Men: $\hat{p}_M = 36/72 = .50$
  Women: $\hat{p}_W = 31/50 = .62$

The estimate for the common overall proportion is:
\[ \hat{p}_0 = \frac{n_x \hat{p}_x + n_y \hat{p}_y}{n_x + n_y} = \frac{72(36/72) + 50(31/50)}{72 + 50} = \frac{67}{122} = .549 \]

Example: Two Population Proportions (continued)

The test statistic for $P_M - P_W = 0$ is:
\[ z = \frac{\hat{p}_M - \hat{p}_W}{\sqrt{\dfrac{\hat{p}_0(1 - \hat{p}_0)}{n_1} + \dfrac{\hat{p}_0(1 - \hat{p}_0)}{n_2}}} = \frac{.50 - .62}{\sqrt{\dfrac{.549(1 - .549)}{72} + \dfrac{.549(1 - .549)}{50}}} = -1.31 \]

Critical values = ±1.96 for $\alpha$ = .05

(Figure: standard normal distribution with rejection regions of area .025 below -1.96 and above 1.96; the test statistic -1.31 falls between them.)

Decision: Do not reject $H_0$.

Conclusion: There is not significant evidence of a difference between men and women in the proportions who will vote Yes.

Hypothesis Tests of One Population Variance

Goal: Test hypotheses about the population variance, $\sigma^2$

If the population is normally distributed,
\[ \chi_{n-1}^2 = \frac{(n - 1)s^2}{\sigma^2} \]
follows a chi-square distribution with (n - 1) degrees of freedom.

Hypothesis Tests of One Population Variance (continued)

The test statistic for hypothesis tests about one population variance is
\[ \chi_{n-1}^2 = \frac{(n - 1)s^2}{\sigma_0^2} \]

Decision Rules: Variance

Lower-tail test:
  $H_0: \sigma^2 \ge \sigma_0^2$, $H_1: \sigma^2 < \sigma_0^2$
  Reject $H_0$ if $\chi_{n-1}^2 < \chi_{n-1,1-\alpha}^2$
Upper-tail test:
  $H_0: \sigma^2 \le \sigma_0^2$, $H_1: \sigma^2 > \sigma_0^2$
  Reject $H_0$ if $\chi_{n-1}^2 > \chi_{n-1,\alpha}^2$
Two-tail test:
  $H_0: \sigma^2 = \sigma_0^2$, $H_1: \sigma^2 \ne \sigma_0^2$
  Reject $H_0$ if $\chi_{n-1}^2 > \chi_{n-1,\alpha/2}^2$ or $\chi_{n-1}^2 < \chi_{n-1,1-\alpha/2}^2$

Hypothesis Tests for Two Variances

Tests for two population variances use the F test statistic.

Goal: Test hypotheses about two population variances

Lower-tail test: $H_0: \sigma_x^2 \ge \sigma_y^2$, $H_1: \sigma_x^2 < \sigma_y^2$
Upper-tail test: $H_0: \sigma_x^2 \le \sigma_y^2$, $H_1: \sigma_x^2 > \sigma_y^2$
Two-tail test: $H_0: \sigma_x^2 = \sigma_y^2$, $H_1: \sigma_x^2 \ne \sigma_y^2$

The two
populations are assumed to be independent and normally distributed.

Hypothesis Tests for Two Variances (continued)

The random variable
\[ F = \frac{s_x^2 / \sigma_x^2}{s_y^2 / \sigma_y^2} \]
has an F distribution with $(n_x - 1)$ numerator degrees of freedom and $(n_y - 1)$ denominator degrees of freedom.

Denote an F value with $\nu_1$ numerator and $\nu_2$ denominator degrees of freedom by $F_{\nu_1,\nu_2}$.

Test Statistic

The test statistic for a hypothesis test about two population variances is
\[ F = \frac{s_x^2}{s_y^2} \]
where F has $(n_x - 1)$ numerator degrees of freedom and $(n_y - 1)$ denominator degrees of freedom.

Decision Rules: Two Variances

Use $s_x^2$ to denote the larger of the two sample variances.

Upper-tail test:
  $H_0: \sigma_x^2 \le \sigma_y^2$, $H_1: \sigma_x^2 > \sigma_y^2$
  Reject $H_0$ if $F > F_{n_x - 1,\, n_y - 1,\, \alpha}$
Two-tail test:
  $H_0: \sigma_x^2 = \sigma_y^2$, $H_1: \sigma_x^2 \ne \sigma_y^2$
  Reject $H_0$ if $F > F_{n_x - 1,\, n_y - 1,\, \alpha/2}$

(Figure: F distributions showing the upper-tail rejection region of area $\alpha$ for the one-tail test and of area $\alpha/2$ for the two-tail test.)

Example: F Test

You are a financial analyst for a brokerage firm. You want to compare dividend yields between stocks listed on the NYSE and NASDAQ. You collect the following data:

              NYSE    NASDAQ
  Number      21      25
  Mean        3.27    2.53
  Std dev     1.30    1.16

Is there a difference in the variances between the NYSE and NASDAQ at the $\alpha$ = 0.10 level?

F Test: Example Solution

Form the hypothesis test:
  $H_0: \sigma_x^2 = \sigma_y^2$ (there is no difference between variances)
  $H_1: \sigma_x^2 \ne \sigma_y^2$ (there is a difference between variances)

Find the F critical value for $\alpha/2$ = .10/2 = .05.

Degrees of freedom:
  Numerator (NYSE has the larger standard deviation): $n_x - 1 = 21 - 1 = 20$ d.f.
  Denominator: $n_y - 1 = 25 - 1 = 24$ d.f.

\[ F_{n_x - 1,\, n_y - 1,\, \alpha/2} = F_{20,\,24,\,0.10/2} = 2.03 \]
F Test: Example Solution (continued)

$H_0: \sigma_x^2 = \sigma_y^2$
$H_1: \sigma_x^2 \ne \sigma_y^2$

The test statistic is:
\[ F = \frac{s_x^2}{s_y^2} = \frac{1.30^2}{1.16^2} = 1.256 \]

(Figure: F distribution with an upper rejection region of area $\alpha/2$ = .05 above $F_{20,\,24,\,0.10/2} = 2.03$.)

F = 1.256 is not in the rejection region, so we do not reject $H_0$.

Conclusion: There is not sufficient evidence of a difference in variances at $\alpha$ = .10.

Two-Sample Tests in EXCEL

For paired samples (t test):
  Tools | data analysis… | t-test: paired two sample for means
For independent samples, z test with variances known:
  Tools | data analysis | z-test: two sample for means
For variances, F test for two variances:
  Tools | data analysis | F-test: two sample for variances

Chapter Summary

  Compared two dependent samples (paired samples)
  Performed paired sample t test for the mean difference
  Compared two independent samples
  Performed z test for the difference in two means
  Performed pooled variance t test for the difference in two means
  Compared two population proportions
  Performed z test for two population proportions

Chapter Summary (continued)

  Used the chi-square test for a single population variance
  Performed F tests for the difference between two population variances
  Used the F table to find F critical values
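As a closing check on the F test worked for the NYSE and NASDAQ data, the following Python sketch computes the test statistic with the larger sample variance in the numerator, as the decision rule requires. It is an illustration under stated assumptions: the function name `f_test_statistic` is hypothetical, and the critical value 2.03 is copied from the example's F table lookup rather than computed.

```python
def f_test_statistic(n_a, s_a, n_b, s_b):
    """Return (F, numerator d.f., denominator d.f.) with the larger variance on top."""
    var_a, var_b = s_a**2, s_b**2
    if var_a >= var_b:
        return var_a / var_b, n_a - 1, n_b - 1
    return var_b / var_a, n_b - 1, n_a - 1


# Summary statistics from the example: NYSE (n = 21, s = 1.30), NASDAQ (n = 25, s = 1.16)
f_stat, df_num, df_den = f_test_statistic(21, 1.30, 25, 1.16)

# Upper critical value F(20, 24) at alpha/2 = 0.05, taken from the example
F_CRITICAL = 2.03
reject_h0 = f_stat > F_CRITICAL

print(f"F = {f_stat:.3f} with ({df_num}, {df_den}) d.f., reject H0: {reject_h0}")
```

Placing the larger variance in the numerator means only the upper tail of the F distribution needs to be consulted, which is why a single critical value suffices for the two-tail test.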