Sociology 5811: Lecture 11: T-Tests for Difference in Means
Copyright © 2005 by Evan Schofer
Do not copy or distribute without permission

Announcements
• Problem Set #3 due today
• Midterm in 2 weeks
• Details coming soon
• We are a bit ahead of readings
• Try to start on readings for next week NOW!

Hypothesis Testing
• Definition: Two-tailed test: A hypothesis test in which the α-area of interest falls in both tails of a Z or T distribution.
• Example: H0: μ = 4; H1: μ ≠ 4
• Definition: One-tailed test: A hypothesis test in which the α-area of interest falls in just one tail of a Z or T distribution.
• Example: H0: μ ≥ 4; H1: μ < 4
• Example: H0: μ ≤ 4; H1: μ > 4
• This is called a “directional” hypothesis test.

Hypothesis Tests About Means
• A one-tailed test: H1: μ < 4
• The entire α-area is on the left, as opposed to half (α/2) on each side.
• So: the critical t-value changes.

Hypothesis Tests About Means
• The t-value changes because the alpha area (e.g., 5%) is all concentrated in one side of the distribution, rather than split half and half.
• One tail vs. two tails: a one-tailed test puts α = .05 in a single tail; a two-tailed test puts α/2 = .025 in each tail.

Looking Up T-Tables
• How much does the critical t-value for α = .05 change when you switch from a 2-tailed to a 1-tailed test?
• Two-tailed test (20 df): t = 2.086
• One-tailed test (20 df): t = 1.725

Review: Hypothesis Tests
• Concentrating the alpha area in one tail reduces the critical T-value needed to reject H0.

Tests for Differences in Means
• A more useful application: Two groups
• Issue: Whenever you compare two groups, you’ll observe different means
• Question: Is the difference due to the particular samples drawn, while the populations have the same mean?
• Or can we infer that the populations are different?
• Example: Test scores for 20 boys, 20 girls in the 2nd grade
• Y-bar (boys) = 72.75, s = 8.80
• Y-bar (girls) = 78.20, s = 9.55

Example: Boys’ Test Scores
[Figure: histogram of test scores for boys. Mean = 72.8, Std. Dev = 8.80, N = 20]

Example: Girls’ Test Scores
[Figure: histogram of test scores for girls. Mean = 78.2, Std. Dev = 9.55, N = 20]

Differences in Means
• Inferential statistics can help us determine if group population means are really different
• The hypotheses we must consider are:
$$H_0: \mu_{Boys} = \mu_{Girls} \qquad H_1: \mu_{Boys} \neq \mu_{Girls}$$
• An alternate (equivalent) formulation:
$$H_0: \mu_{Boys} - \mu_{Girls} = 0 \qquad H_1: \mu_{Boys} - \mu_{Girls} \neq 0$$

Differences in Means
• Issue: $\bar{Y}_{Boys} - \bar{Y}_{Girls} = -5.45$, but $\mu_{Boys} - \mu_{Girls} = \; ?$
• How likely is it to draw means with a difference of -5.45, if the difference in population means is really 0?
• If common, we can’t conclude anything
• If rare, we can conclude that the population means differ.

Strategy for Mean Difference
• We never know true population means
• So, we never know the true value of the difference in means
• So, we don’t know if groups really differ
• If we can figure out the sampling distribution of the difference in means…
• We can guess the range in which it typically falls
• If it is improbable for the sampling distribution to overlap with zero, then the population means probably differ
• An extension of the Central Limit Theorem provides the information necessary to do the calculations!

Strategy for Mean Difference
• Logic of tests about differences in means:
• The C.L.T.
defines how sample means (Y-bars) cluster around the true mean:
• The center and width of the sampling distribution
• This tells us the range of values where Y-bars fall
• For any two means, the difference will also fall in a certain range:
• Group 1 means range from about 6.0 to 8.0
• Group 2 means range from about 1.0 to 2.0
• Estimates of the difference in means will range from about 4.0 to 7.0!

A Corollary of the C.L.T.
• Visually: If each group has a sampling distribution of the mean, the difference does too:
[Figure: sampling distributions of the two group means, centered at μ1 and μ2 with standard errors $\sigma_{\bar{Y}_1}$ and $\sigma_{\bar{Y}_2}$, alongside the sampling distribution of differences in means, centered at μ2 − μ1 with standard error $\sigma_{(\bar{Y}_2 - \bar{Y}_1)}$]

A Corollary of the C.L.T.
• Example: If population means are 7 and 10, the observed difference in means will cluster around 3
[Figure: as above, with μ1 = 7, μ2 = 10, and μ2 − μ1 = 3]
• If the group 1 sample mean is 7.4 and the group 2 sample mean is 9.8, the difference is 2.4

A Corollary of the C.L.T.
• Example: If two groups have similar means, the difference will be near zero
[Figure: as above, with μ1 = 7, μ2 = 7.2, and μ2 − μ1 = .2]
• When group means are similar, differences are usually near zero.
• But, even if group means are identical, the difference in sample means won’t be exactly zero in most cases.

Sampling Distribution for Difference in Means
• The mean (Y-bar) is a variable that changes depending on the particular sample we took
• Similarly, the difference in means for two groups varies, depending on which two samples we chose
• The distribution of all possible estimates of the difference in means is a sampling distribution!
• The “sampling distribution of differences in means”
• It reflects the full range of possible estimates of the difference in means.

A Corollary of the C.L.T
• For any two random samples (of size N1, N2), drawn from populations with means μ1, μ2 and S.D. σ1, σ2:
• The sampling distribution for the difference of two means is normal, with mean and S.D.:
1. $\mu_{(\bar{Y}_1 - \bar{Y}_2)} = \mu_1 - \mu_2$
2.
$\sigma_{(\bar{Y}_1 - \bar{Y}_2)} = \sqrt{\dfrac{\sigma_1^2}{N_1} + \dfrac{\sigma_2^2}{N_2}}$

A Corollary of the C.L.T
• We can calculate the standard error of differences in means
• It is the standard deviation of the sampling distribution of differences in means:
$$\sigma_{(\bar{Y}_1 - \bar{Y}_2)} = \sqrt{\frac{\sigma_1^2}{N_1} + \frac{\sigma_2^2}{N_2}}$$
• This formula tells us the dispersion of our estimates of the difference in means.

A Corollary of the C.L.T
• Hypothesis tests using the Z-distribution depend on:
• N being large
• N of both groups > 60, ideally > 100
• And, we must estimate population standard deviations based on sample standard deviations:
$$\hat{\sigma}_{(\bar{Y}_1 - \bar{Y}_2)} = \sqrt{\frac{s_1^2}{N_1} + \frac{s_2^2}{N_2}}$$

Z-Values for Mean Differences
• Finally, we can calculate a Z-value using the Z-score formula:
$$Z_{(\bar{Y}_1 - \bar{Y}_2)} = \frac{\bar{Y}_1 - \bar{Y}_2}{\hat{\sigma}_{(\bar{Y}_1 - \bar{Y}_2)}} = \frac{\bar{Y}_1 - \bar{Y}_2}{\sqrt{\dfrac{s_1^2}{N_1} + \dfrac{s_2^2}{N_2}}}$$
• This will be compared to a critical Z-value

Z-Values for Mean Differences
• Visually:
[Figure: two panels, “Small Z” and “Large Z”, each marking the observed difference $\bar{Y}_1 - \bar{Y}_2$ against the standard error $\hat{\sigma}_{(\bar{Y}_1 - \bar{Y}_2)}$]
• Question: In which case can we reject H0?
• Answer: If the observed Z is large, it is improbable that the difference in means of the populations is zero.

Z-Values for Mean Differences
• Back to the example: Test score differences for boys and girls
• Y-bar (boys) = 72.75, s = 8.80
• Y-bar (girls) = 78.20, s = 9.55
• Pretend our total N (of both groups) is “large”
• Choose α = .05, two-tailed test: critical Z = 1.96
$$H_0: \mu_{Boys} = \mu_{Girls} \qquad H_1: \mu_{Boys} \neq \mu_{Girls}$$

Z-Values for Mean Differences
• Strategy: Calculate Z-value using formula:
$$Z_{(\bar{Y}_1 - \bar{Y}_2)} = \frac{\bar{Y}_1 - \bar{Y}_2}{\sqrt{\dfrac{s_1^2}{N_1} + \dfrac{s_2^2}{N_2}}} = \frac{-5.45}{\sqrt{\dfrac{8.80^2}{20} + \dfrac{9.55^2}{20}}}$$

Z-Values for Mean Differences
• Strategy: Calculate Z-value using formula:
$$Z_{(\bar{Y}_1 - \bar{Y}_2)} = \frac{-5.45}{\sqrt{3.87 + 4.56}} = \frac{-5.45}{2.90} = -1.88$$
• Observed |Z| = 1.88, critical Z = 1.96
• Question: Can we reject H0?
• Answer: NO! The observed Z does not reach the critical value, so we are less than 95% confident
• Also, our N is too small to do a Z-test.
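The Z calculation above can be checked with a few lines of Python. This is only a minimal sketch using the summary statistics from the slides; the function name `z_for_mean_difference` is a hypothetical helper, not something from the lecture, and the critical value comes from the standard library's `NormalDist`.

```python
import math
from statistics import NormalDist


def z_for_mean_difference(mean1, s1, n1, mean2, s2, n2):
    """Z-value for a difference in sample means (large-sample formula).

    Hypothetical helper for illustration; not part of the lecture.
    """
    # Estimated standard error: sqrt(s1^2/N1 + s2^2/N2)
    std_error = math.sqrt(s1**2 / n1 + s2**2 / n2)
    return (mean1 - mean2) / std_error


# Summary statistics from the slides (boys first, girls second)
z = z_for_mean_difference(72.75, 8.80, 20, 78.20, 9.55, 20)

alpha = 0.05
# Two-tailed test: alpha/2 goes in each tail of the normal distribution
critical_z = NormalDist().inv_cdf(1 - alpha / 2)
reject_h0 = abs(z) > critical_z

print(f"Z = {z:.2f}, critical Z = {critical_z:.2f}, reject H0: {reject_h0}")
```

The comparison uses the absolute value of Z because the test is two-tailed; the sign only reflects which group was subtracted from which.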
Mean Differences for Small Samples
• Sample Size: rule of thumb
• Total N (of both groups) > 100 can safely be treated as “large” in most cases
• Total N (of both groups) < 100 is possibly problematic
• Total N (of both groups) < 60 is considered “small” in most cases
• If N is small, the sampling distribution of the mean difference cannot be assumed to be normal
• Again, we turn to the T-distribution.

Mean Differences for Small Samples
• To use T-tests for small samples, the following criteria must be met:
• 1. Both samples are randomly drawn from normally distributed populations
• 2. Both samples have roughly the same variance (and thus same standard deviation)
• To the extent that these assumptions are violated, the T-test will become less accurate
• Check histogram to verify!
• But, in practice, T-tests are fairly robust.

Mean Differences for Small Samples
• For small samples, the estimator of the Standard Error is derived from the variance of both groups (i.e. it is “pooled”)
• Formula:
$$s_{(\bar{Y}_1 - \bar{Y}_2)} = \sqrt{\frac{(N_1 - 1)(s_1^2) + (N_2 - 1)(s_2^2)}{N_1 + N_2 - 2}}$$

Probabilities for Mean Difference
• A T-value may be calculated:
$$t_{(N_1 + N_2 - 2)} = \frac{\bar{Y}_1 - \bar{Y}_2}{s_{(\bar{Y}_1 - \bar{Y}_2)} \sqrt{\dfrac{1}{N_1} + \dfrac{1}{N_2}}}$$
• Where (N1 + N2 – 2) refers to the number of degrees of freedom
– Recall, t is a “family” of distributions
– Look up t-dist for “N1 + N2 - 2” degrees of freedom.

T-test for Mean Difference
• Back to the example: 20 boys & 20 girls
• Boys: Y-bar = 72.75, s = 8.80
• Girls: Y-bar = 78.20, s = 9.55
• Let’s do a hypothesis test to see if the means differ:
• Use α-level of .05
• H0: Means are the same (μboys = μgirls)
• H1: Means differ (μboys ≠ μgirls).
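As a rough sketch, the pooled formula and the t-value formula can be written as two small Python functions and applied to the boys and girls example. The function names (`pooled_sd`, `t_for_mean_difference`) are hypothetical and chosen for illustration only. The sketch stops at the t-value and degrees of freedom: Python's standard library has no t-distribution, so the critical value still comes from a t-table.

```python
import math


def pooled_sd(s1, n1, s2, n2):
    """Pooled estimate s_(Y1-Y2) built from both groups' variances.

    Hypothetical helper for illustration; not part of the lecture.
    """
    pooled_variance = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)
    return math.sqrt(pooled_variance)


def t_for_mean_difference(mean1, s1, n1, mean2, s2, n2):
    """Return (t-value, degrees of freedom) for a difference in means."""
    sp = pooled_sd(s1, n1, s2, n2)
    # Denominator of the t formula: s * sqrt(1/N1 + 1/N2)
    std_error = sp * math.sqrt(1 / n1 + 1 / n2)
    df = n1 + n2 - 2
    return (mean1 - mean2) / std_error, df


# Summary statistics from the slides (boys first, girls second)
sp = pooled_sd(8.80, 20, 9.55, 20)
t_value, df = t_for_mean_difference(72.75, 8.80, 20, 78.20, 9.55, 20)

print(f"pooled s = {sp:.2f}, t({df}) = {t_value:.2f}")
```

Because both groups have N = 20, the pooled denominator here equals the unpooled large-sample standard error, so the t-value matches the Z-value for the same data; only the reference distribution changes.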
T-test for Mean Difference
• Calculate t-value:
$$t_{(N_1 + N_2 - 2)} = \frac{\bar{Y}_1 - \bar{Y}_2}{s_{(\bar{Y}_1 - \bar{Y}_2)} \sqrt{\dfrac{1}{N_1} + \dfrac{1}{N_2}}}$$
$$t_{(38)} = \frac{-5.45}{s_{(\bar{Y}_1 - \bar{Y}_2)} \sqrt{\dfrac{1}{20} + \dfrac{1}{20}}}$$

T-Test for Mean Difference
• We need to calculate the pooled estimate that goes into the Standard Error of the difference in means:
$$s_{(\bar{Y}_1 - \bar{Y}_2)} = \sqrt{\frac{(N_1 - 1)(s_1^2) + (N_2 - 1)(s_2^2)}{N_1 + N_2 - 2}}$$
$$s_{(\bar{Y}_1 - \bar{Y}_2)} = \sqrt{\frac{(19)(8.80^2) + (19)(9.55^2)}{38}}$$

T-Test for Mean Difference
• Continuing the calculation:
$$s_{(\bar{Y}_1 - \bar{Y}_2)} = \sqrt{\frac{1471.36 + 1732.85}{38}} = \sqrt{84.32} = 9.18$$

T-test for Mean Difference
• Plugging in values:
$$t_{(38)} = \frac{-5.45}{(9.18) \sqrt{\dfrac{1}{20} + \dfrac{1}{20}}}$$

T-test for Mean Difference
$$t_{(38)} = \frac{-5.45}{(9.18)(.316)} = \frac{-5.45}{2.90} = -1.88$$

T-Test for Mean Difference
• Question: What is the critical value for α = .05, two-tailed T-test, 38 degrees of freedom (df)?
• Answer: Critical value = approx. 2.02
• Observed T-value: |t| = 1.88
• Can we reject the null hypothesis (H0)?
• Answer: No! Not quite!
• We reject only when |t| > critical value

T-Test for Mean Difference
• The two-tailed test hypotheses were:
$$H_0: \mu_{Boys} = \mu_{Girls} \qquad H_1: \mu_{Boys} \neq \mu_{Girls}$$
• Question: What hypotheses would we use for the one-tailed test?
$$H_0: \mu_{Boys} \geq \mu_{Girls} \qquad H_1: \mu_{Boys} < \mu_{Girls}$$

T-Test for Mean Difference
• Question: What is the critical value for α = .05, one-tailed T-test, 38 degrees of freedom (df)?
• Answer: Around 1.684 (the table value for 40 df)
• One-tailed test: |t| = 1.88 > 1.684
• We can reject the null hypothesis!!!
• Moral of the story:
• If you have strong directional suspicions ahead of time, use a one-tailed test. It increases your chances of rejecting H0.
• But, it wouldn’t have made a difference at α = .01

T-Test for Mean Difference
• Question: What if you wanted to compare 3 or more groups, instead of just two?
• Example: Test scores for students in different educational tracks: honors, regular, remedial
• Can you use T-tests for 3+ groups?
• Answer: Sort of… You can do a T-test for every combination of groups
• e.g., honors & reg, honors & remedial, reg & remedial
• But, the possibility of a Type I error proliferates… 5% for each test
• With 5 groups there are 10 pairwise tests, so the chance of at least one Type I error is roughly 40% if the tests are treated as independent, and can approach 50%
• Solution: ANOVA.
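A rough sketch of how the Type I error risk grows with the number of pairwise T-tests. Adding 5% per test gives an upper bound (50% for the 10 tests among 5 groups); treating the tests as independent gives about 40%. Independence is an assumption made here for illustration, since pairwise tests that share a group are not strictly independent, and the function names are hypothetical.

```python
import math


def pairwise_tests(num_groups):
    """Number of distinct two-group comparisons among num_groups groups."""
    return math.comb(num_groups, 2)


def familywise_error(num_tests, alpha=0.05):
    """Chance of at least one Type I error, ASSUMING independent tests."""
    return 1 - (1 - alpha) ** num_tests


def additive_bound(num_tests, alpha=0.05):
    """Upper bound from simply adding alpha once per test (capped at 1)."""
    return min(1.0, num_tests * alpha)


for groups in (3, 5):
    tests = pairwise_tests(groups)
    print(
        f"{groups} groups: {tests} tests, "
        f"independent-tests estimate = {familywise_error(tests):.2f}, "
        f"additive bound = {additive_bound(tests):.2f}"
    )
```

Either way the risk climbs quickly as groups are added, which is the motivation for a single overall test such as ANOVA.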