OPRE504 Chapter Study Guide
Chapter 12 Compare Two Groups
Chaodong Han, OPRE504 Data Analysis and Decisions, Class Handout

I. Two-Sample t-Test

Two-Sample t-Test

We assume that the two groups are independent of each other and may or may not have different variances.

1. State Hypotheses:
   H0: $\mu_1 - \mu_2 = 0$
   Ha: $\mu_1 - \mu_2 \neq 0$ (two-tailed)
   Ha: $\mu_1 - \mu_2 > 0$ (one-tail upper) or Ha: $\mu_1 - \mu_2 < 0$ (one-tail lower)

2. Calculate Standard Error of Mean Difference
   $$SE(\bar{y}_1 - \bar{y}_2) = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$
   s1 = standard deviation of Sample 1, n1 = size of Sample 1, s2 = standard deviation of Sample 2, n2 = size of Sample 2.

3. Determine Adjusted Degrees of Freedom
   $$df = \frac{\left(\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}\right)^2}{\frac{1}{n_1 - 1}\left(\frac{s_1^2}{n_1}\right)^2 + \frac{1}{n_2 - 1}\left(\frac{s_2^2}{n_2}\right)^2}$$
   [Note: the smaller of $(n_1 - 1)$ and $(n_2 - 1)$ $\le df \le n_1 + n_2 - 2$]

4. Determine Critical Value (t*) according to the degrees of freedom and significance level: $t^*_{df}$

5. Calculate t-statistic
   $$t = \frac{(\bar{y}_1 - \bar{y}_2) - (\mu_1 - \mu_2)}{SE(\bar{y}_1 - \bar{y}_2)} = \frac{\bar{y}_1 - \bar{y}_2}{SE(\bar{y}_1 - \bar{y}_2)}$$
   (the second form holds under H0, where $\mu_1 - \mu_2 = 0$)

6. Decision
   Reject H0 when $|t| > |t^*_{df}|$
   Fail to reject H0 when $|t| \le |t^*_{df}|$ (t falls between the two critical values $\pm t^*$)

Q12.1 [Sharpe 2011, Ch.10, E.25] In an investigation of environmental causes of diseases, data were collected on the annual mortality rate (deaths per 100,000) for males in 61 large towns in England and Wales. In addition, those towns are classified into two groups: North and South of Derby. Is there a significant difference in mortality rates in the two regions at the 5% significance level? Here are summary statistics:

Mortality   Count   Mean      Median   Standard Deviation
North       34      1631.59   1631     138.470
South       27      1388.85   1369     151.114

1. H0: ______   H1: ______

2. $SE(\bar{y}_N - \bar{y}_S) = \sqrt{\frac{s_N^2}{n_N} + \frac{s_S^2}{n_S}} =$ ______

3. With sample 1 = North and sample 2 = South:
   $df = \frac{\left(\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}\right)^2}{\frac{1}{n_1 - 1}\left(\frac{s_1^2}{n_1}\right)^2 + \frac{1}{n_2 - 1}\left(\frac{s_2^2}{n_2}\right)^2} =$ ______
   alpha = 5%, $t^*_{df} =$ ______   One-tailed or two-tailed? ______

4. $t = \frac{(\bar{y}_1 - \bar{y}_2) - (\mu_1 - \mu_2)}{SE(\bar{y}_1 - \bar{y}_2)} =$ ______

5. Compare |t| and |t*|; decision: ______

DDXL - Hypothesis Tests - 2 Var t Test:

More exercises: Chapter 12, Exercises 23, 24, and 26

II. Confidence Interval for the Difference Between Two Group Means

Two-Sample t-Interval

We assume that the two groups are independent of each other and may or may not have different variances.

Step 1: Calculate the standard error of the mean difference:
   $$SE(\bar{y}_1 - \bar{y}_2) = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$
   s1 = standard deviation of Sample 1, n1 = size of Sample 1, s2 = standard deviation of Sample 2, n2 = size of Sample 2.

Step 2: Calculate the adjusted degrees of freedom:
   $$df = \frac{\left(\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}\right)^2}{\frac{1}{n_1 - 1}\left(\frac{s_1^2}{n_1}\right)^2 + \frac{1}{n_2 - 1}\left(\frac{s_2^2}{n_2}\right)^2}$$

Step 3: Find the critical value $t^*_{df}$ according to the confidence level and the adjusted degrees of freedom (T-Table A-34 in Appendix C).

Step 4: $CI = (\bar{y}_1 - \bar{y}_2) \pm t^*_{df} \times SE(\bar{y}_1 - \bar{y}_2)$

Q12.2 [Sharpe 2011, Ch.12, Ex.4, p.386] A chain that specializes in healthy and organic food would like to compare the sales performance of two of its primary stores in the state of Maryland. These stores are both in urban, residential areas with similar demographics. A comparison of the weekly sales randomly sampled over two years yields the following information:

Store #   N   Mean     Standard Deviation   Min      Median   Max
1         9   242170   23937                211225   232901   292381
2         9   235338   29690                187475   232070   287838

a) Create a 95% confidence interval for the difference in the mean store weekly sales.
   $SE(\bar{y}_1 - \bar{y}_2) = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} =$ ______
   $df = \frac{\left(\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}\right)^2}{\frac{1}{n_1 - 1}\left(\frac{s_1^2}{n_1}\right)^2 + \frac{1}{n_2 - 1}\left(\frac{s_2^2}{n_2}\right)^2} =$ ______
   $t^*_{df} =$ ______
   $CI = (\bar{y}_1 - \bar{y}_2) \pm t^*_{df} \times SE(\bar{y}_1 - \bar{y}_2) =$ ______

b) How do you interpret the CI in this context?

c) Can you tell whether one store sells more per week, on average, than the other store?
d) Calculate the Margin of Error.

e) Calculate a 99% confidence interval for the difference in mean store weekly sales.
   $t^*_{df} =$ ______
   $CI = (\bar{y}_1 - \bar{y}_2) \pm t^*_{df} \times SE(\bar{y}_1 - \bar{y}_2) =$ ______

More exercises: Credit Card Spending, Guided Example, p.365; Chapter 12, Exercises 20, 22, 23, 39, 49, 50, 51

III. Pooled Samples

Pooled t-Test

We assume that the two groups are independent of each other and have the same variance, at least when the null hypothesis is true.

1. State Hypotheses:
   H0: $\mu_1 - \mu_2 = 0$
   Ha: $\mu_1 - \mu_2 \neq 0$ (two-tailed)
   Ha: $\mu_1 - \mu_2 > 0$ (one-tail upper) or Ha: $\mu_1 - \mu_2 < 0$ (one-tail lower)

2. Calculate Standard Error of Mean Difference
   $$SE_{pooled}(\bar{y}_1 - \bar{y}_2) = s_{pooled}\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$$
   where
   $$s_{pooled} = \sqrt{\frac{s_1^2 (n_1 - 1) + s_2^2 (n_2 - 1)}{n_1 + n_2 - 2}}$$
   n1 = size of Sample 1, n2 = size of Sample 2.

3. Determine Degrees of Freedom
   $df = n_1 + n_2 - 2$ (typically a slightly higher df than in the two-sample t-test without equal variances)

4. Determine Critical Value (t*) according to the degrees of freedom and significance level: $t^*_{df}$

5. Calculate t-statistic
   $$t = \frac{\bar{y}_1 - \bar{y}_2}{SE_{pooled}(\bar{y}_1 - \bar{y}_2)}$$

6. Decision
   Reject H0 when $|t| > |t^*_{df}|$
   Fail to reject H0 when $|t| \le |t^*_{df}|$ (t falls between the two critical values $\pm t^*$)

Q12.3 We want to know whether people offer a different amount, on average, for a used camera when buying from a friend than when buying from a stranger. The data from an experiment are as follows. Test your hypothesis at the 5% significance level.

            N   Mean Price   Standard Deviation
Friends     8   $281.88      $18.31
Strangers   7   $211.43      $46.43

1. State Hypotheses: ______

2. $s_{pooled} = \sqrt{\frac{s_1^2 (n_1 - 1) + s_2^2 (n_2 - 1)}{n_1 + n_2 - 2}} =$ ______
   $SE_{pooled}(\bar{y}_1 - \bar{y}_2) = s_{pooled}\sqrt{\frac{1}{n_1} + \frac{1}{n_2}} =$ ______

3. df = ______   $t^*_{df,5\%} =$ ______

4. $t = \frac{\bar{y}_1 - \bar{y}_2}{SE_{pooled}(\bar{y}_1 - \bar{y}_2)} =$ ______

5. Compare t and $t^*_{df,5\%}$; decision: ______

Pooled Confidence Interval

We assume that the two groups are independent of each other and have the same variance, at least when the null hypothesis is true.
1. Calculate Standard Error of Mean Difference
   $$SE_{pooled}(\bar{y}_1 - \bar{y}_2) = s_{pooled}\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$$
   where
   $$s_{pooled} = \sqrt{\frac{s_1^2 (n_1 - 1) + s_2^2 (n_2 - 1)}{n_1 + n_2 - 2}}$$
   n1 = size of Sample 1, n2 = size of Sample 2.

2. Determine Degrees of Freedom
   $df = n_1 + n_2 - 2$ (typically a slightly higher df than in the two-sample t-test without equal variances)

3. Determine Critical Value (t*) according to the degrees of freedom and confidence level: $t^*_{df}$, using T-Table A-34

4. $CI = (\bar{y}_1 - \bar{y}_2) \pm t^*_{df} \times SE_{pooled}(\bar{y}_1 - \bar{y}_2)$

Q12.4 We want to know whether people offer a different amount, on average, for a used camera when buying from a friend than when buying from a stranger. The data from an experiment are as follows. Construct a 95% confidence interval for the difference.

            N   Mean Price   Standard Deviation
Friends     8   $281.88      $18.31
Strangers   7   $211.43      $46.43

1. Find the standard error of the difference:
   $s_{pooled} = \sqrt{\frac{s_1^2 (n_1 - 1) + s_2^2 (n_2 - 1)}{n_1 + n_2 - 2}} =$ ______
   $SE_{pooled}(\bar{y}_1 - \bar{y}_2) = s_{pooled}\sqrt{\frac{1}{n_1} + \frac{1}{n_2}} =$ ______

2. df = ______

3. $t^*_{df,5\%} =$ ______

4. $CI = (\bar{y}_1 - \bar{y}_2) \pm t^*_{df} \times SE_{pooled}(\bar{y}_1 - \bar{y}_2) =$ ______

IV. Paired Data

Paired t-Test

Paired data may be used when the two groups are not independent of each other. Examples: a firm's sales in January 2007 and January 2008; a subject's response before and after a treatment in an experiment. Such a test is essentially a one-sample t-test in which the pairwise difference is treated as a single random variable.

1. State Hypotheses
   H0: $\mu_d = \Delta_0$
   Ha: $\mu_d \neq \Delta_0$ (two-tailed test); $\mu_d > \Delta_0$ (one-tailed upper test); or $\mu_d < \Delta_0$ (one-tailed lower test)

2. Determine Critical Value (t*) according to the degrees of freedom (n − 1) and significance level

3. Calculate Standard Error of the Paired Difference
   $$SE(\bar{d}) = \frac{s_d}{\sqrt{n}}$$
   where $s_d$ is the standard deviation of the pairwise differences and n = number of pairs.

4. Calculate t-statistic (shown for the usual case $\Delta_0 = 0$)
   $$t = \frac{\bar{d} - 0}{SE(\bar{d})} = \frac{\bar{d} - 0}{s_d / \sqrt{n}}$$

5. Decision
   Reject H0 when $|t| > |t^*_{df}|$
   Fail to reject H0 when $|t| \le |t^*_{df}|$ (t falls between the two critical values $\pm t^*$)

Q12.5 We want to know whether credit card spending changes, on average, from December to January for a market segment. Our data record the credit card expenditures in December 2004 and January 2005 made by 911 cardholders. The average pairwise difference is $788.18 (December 2004 minus January 2005) and the standard deviation of the differences is $3740.22.

a) Since we generally expect spending to decrease from December to January, develop a hypothesis test for this belief at the 5% significance level.

1. State Hypotheses: H0: $\mu_d = 0$; Ha: $\mu_d > 0$ (one-tailed upper test)

2. Critical Value: t* = ______

3. $SE(\bar{d}) = \frac{s_d}{\sqrt{n}} =$ ______

4. $t = \frac{\bar{d} - 0}{SE(\bar{d})} = \frac{\bar{d} - 0}{s_d / \sqrt{n}} =$ ______

5. Compare |t| and |t*|; decision: ______

b) Find a 95% confidence interval for the true mean difference in credit card charges between those two months for all cardholders in this segment.

1. t* given df and a 95% confidence level: ______

2. $ME = t^* \times SE(\bar{d}) =$ ______

3. $CI = \bar{d} \pm ME =$ ______

More exercises on paired t-tests: Chapter 12, Exercises 53, 55, 56, 57, 58, 63, 64, 66, 67, 68
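
The three procedures in this guide (two-sample, pooled, and paired) differ only in how the standard error and the degrees of freedom are computed. The following Python is a minimal sketch of those formulas working from summary statistics alone; it is not part of the handout. The names `TResult`, `welch_t`, `pooled_t`, `paired_t`, `confidence_interval`, `reject_h0_two_tailed`, and the parameter `delta0` (the hypothesized difference) are hypothetical and chosen for illustration. The critical value `t_star` is not computed here; it is assumed to be looked up in a t-table or software, as in the steps above.

```python
import math
from typing import NamedTuple, Tuple


class TResult(NamedTuple):
    se: float  # standard error of the (mean) difference
    df: float  # degrees of freedom
    t: float   # t-statistic for H0: difference = delta0


def welch_t(mean1, s1, n1, mean2, s2, n2, delta0=0.0) -> TResult:
    """Two-sample t-test without assuming equal variances (Part I)."""
    v1, v2 = s1 ** 2 / n1, s2 ** 2 / n2
    se = math.sqrt(v1 + v2)
    # Adjusted degrees of freedom from the handout's formula.
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return TResult(se, df, (mean1 - mean2 - delta0) / se)


def pooled_t(mean1, s1, n1, mean2, s2, n2, delta0=0.0) -> TResult:
    """Pooled t-test, assuming both groups share one variance (Part III)."""
    df = n1 + n2 - 2
    s_pooled = math.sqrt((s1 ** 2 * (n1 - 1) + s2 ** 2 * (n2 - 1)) / df)
    se = s_pooled * math.sqrt(1 / n1 + 1 / n2)
    return TResult(se, df, (mean1 - mean2 - delta0) / se)


def paired_t(d_bar, s_d, n, delta0=0.0) -> TResult:
    """Paired t-test on the pairwise differences (Part IV)."""
    se = s_d / math.sqrt(n)
    return TResult(se, n - 1, (d_bar - delta0) / se)


def confidence_interval(diff, t_star, se) -> Tuple[float, float]:
    """Return diff +/- margin of error, where ME = t_star * se."""
    me = t_star * se
    return diff - me, diff + me


def reject_h0_two_tailed(t, t_star) -> bool:
    """Two-tailed decision rule: reject H0 when |t| > |t*|."""
    return abs(t) > abs(t_star)


# Usage with the summary statistics given in Q12.5 (911 pairs).
result = paired_t(d_bar=788.18, s_d=3740.22, n=911)
print(f"SE = {result.se:.2f}, df = {result.df}, t = {result.t:.2f}")
```

For a two-tailed decision, pass the table value to `reject_h0_two_tailed(result.t, t_star)`. For a one-tailed upper test such as Q12.5 a), compare `result.t` with `t_star` directly. The same `confidence_interval` helper serves all three settings, because only the standard error and the degrees of freedom used to look up `t_star` change.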