EDUC 200C
Two sample t-tests
November 9, 2010

Review: What are the following?
• Sampling distribution
• Standard error of the mean
• Central limit theorem
• Confidence interval
• Standard error of the difference between the means
  – Pooled variance

20 samples randomly drawn from a population with mean = 0 and sd = 2
[Figure: distributions of sample means; panels N = 5, N = 10, N = 20, N = 50; x-axis: sample means, from -4 to 4]

As the number of samples increases, the distribution of sample means approaches a normal distribution with a mean equal to the population mean
[Figure: density of sample means (mn); panels 10 samples, 20 samples, 50 samples, 100 samples]

Hypothesis testing cheat sheet

One sample, population standard deviation known (Yes: $\sigma$)
• Null hypothesis example: $H_0: \mu = 0$
• Variance of test sample: $\sigma^2$
• Standard error of the mean: $\sigma_{\bar{X}} = \dfrac{\sigma}{\sqrt{N}}$
• Test statistic: $z = \dfrac{\bar{X} - \mu}{\sigma_{\bar{X}}}$
• Confidence interval: $\bar{X} \pm z_{\alpha}\, \sigma_{\bar{X}}$

One sample, population standard deviation unknown (No: $s$)
• Null hypothesis example: $H_0: \mu = 0$
• Variance of test sample: $s^2$
• Standard error of the mean: $s_{\bar{X}} = \dfrac{s}{\sqrt{N}}$
• Test statistic: $t = \dfrac{\bar{X} - \mu}{s_{\bar{X}}}$
• Confidence interval: $\bar{X} \pm t_{\alpha}\, s_{\bar{X}}$

Two samples, population standard deviations unknown (No: $s_1$, $s_2$)
• Null hypothesis example: $H_0: \mu_1 - \mu_2 = 0$
• Variance of test sample: $s^2_{pooled} = \dfrac{(N_1 - 1)\, s_1^2 + (N_2 - 1)\, s_2^2}{N_1 + N_2 - 2}$
• Standard error of the difference between the means: $s_{\bar{X}_1 - \bar{X}_2} = \sqrt{s^2_{pooled} \left( \dfrac{1}{N_1} + \dfrac{1}{N_2} \right)}$
• Test statistic: $t = \dfrac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{s_{\bar{X}_1 - \bar{X}_2}}$
• Confidence interval: $(\bar{X}_1 - \bar{X}_2) \pm t_{\alpha}\, s_{\bar{X}_1 - \bar{X}_2}$

Practice Problem 1
• A high school social studies teacher decides to conduct action research in her classroom by investigating the effects of immediate testing on memory. She randomly divides her class into two groups. Group 1 studies a short essay for 20 minutes, while Group 2 studies the essay for 20 minutes and then immediately takes a 10-minute test on the essay. The results below are from a final exam on the essay, taken one month later:

  Group 1 (studied only) | Group 2 (studied and tested)

  – Set up the appropriate statistical hypotheses.
  – Perform the test (α = 0.05).
  – Draw final conclusions.
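
The two-sample row of the cheat sheet can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `pooled_t_test` is hypothetical, the scores are made-up values (not the practice problem's data), and the critical value 2.306 is the standard two-tailed t-table entry for α = 0.05 with 8 degrees of freedom.

```python
import math
from statistics import mean, variance


def pooled_t_test(x1, x2, null_diff=0.0):
    """Two-sample t statistic using the pooled variance (equal variances assumed)."""
    n1, n2 = len(x1), len(x2)
    # Pooled variance: the two sample variances weighted by their degrees of freedom
    s2_pooled = ((n1 - 1) * variance(x1) + (n2 - 1) * variance(x2)) / (n1 + n2 - 2)
    # Standard error of the difference between the means
    se_diff = math.sqrt(s2_pooled * (1 / n1 + 1 / n2))
    # Test statistic: observed difference minus the null difference, in standard-error units
    t = ((mean(x1) - mean(x2)) - null_diff) / se_diff
    df = n1 + n2 - 2
    return t, df, se_diff


# Hypothetical exam scores, for illustration only
group1 = [2, 4, 6, 8, 10]    # studied only
group2 = [5, 7, 9, 11, 13]   # studied and tested

t, df, se_diff = pooled_t_test(group1, group2)
t_crit = 2.306  # assumption: two-tailed critical t, alpha = 0.05, df = 8, from a t table

diff = mean(group1) - mean(group2)
ci = (diff - t_crit * se_diff, diff + t_crit * se_diff)

print(f"t = {t:.3f}, df = {df}")
print(f"95% CI for the difference: ({ci[0]:.3f}, {ci[1]:.3f})")
print("Reject H0" if abs(t) > t_crit else "Fail to reject H0")
```

With these made-up scores both sample variances equal 10, so the pooled variance is 10, the standard error of the difference is 2, and t = -1.5. Since |t| is smaller than the critical value, the null hypothesis is not rejected, and the confidence interval contains 0.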
Independent sample t-tests
• So far, we’ve talked about and calculated the difference in the mean of a certain variable between two independent populations.
  – Here, “independent” tells us that there was no connection between the two groups we were comparing.
• Our test statistic took the form
  $t = \dfrac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{s_{\bar{X}_1 - \bar{X}_2}}$
• This allowed us to make claims about how the mean of one population compared to the mean of the other.

Matched pairs t-tests
• Used when the observations are not independent:
  – Same person measured twice
  – Relevant connection between observations (e.g. parents and children)
  – Observations deliberately matched on particular attributes (e.g. intelligence)
• Tells us whether the mean difference within pairs of observations is significantly different from the null hypothesis value (usually 0).

Matched pairs t-statistic
• Based on how we’ve constructed our tests of significance so far, how would we go about testing a matched pairs hypothesis?

Stata
• Performing a matched pairs t-test uses the same command as performing an independent samples t-test; just leave off the “unpaired” option:
  ttest var1==var2

Questions?
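
Returning to the matched pairs question above: one standard approach is to reduce each pair to a single difference score $D$ and run a one-sample t-test on those differences, $t = \dfrac{\bar{D} - \mu_D}{s_D / \sqrt{N}}$ with $N - 1$ degrees of freedom, where $N$ is the number of pairs. The Python sketch below illustrates this under stated assumptions: the function name `paired_t_test` and the before/after scores are hypothetical and do not come from the slides.

```python
import math
from statistics import mean, stdev


def paired_t_test(first, second, null_mean_diff=0.0):
    """Matched pairs t statistic: a one-sample t-test on the within-pair differences."""
    if len(first) != len(second):
        raise ValueError("matched pairs require equally long lists")
    # One difference score per pair (second measurement minus first)
    diffs = [b - a for a, b in zip(first, second)]
    n = len(diffs)
    # Standard error of the mean difference
    se = stdev(diffs) / math.sqrt(n)
    t = (mean(diffs) - null_mean_diff) / se
    return t, n - 1, se


# Hypothetical scores for four people measured twice, for illustration only
before = [10, 12, 9, 14]
after = [11, 15, 12, 19]

t, df, se = paired_t_test(before, after)
print(f"t = {t:.3f}, df = {df}")
```

The resulting t is compared against the critical value for $N - 1$ degrees of freedom, exactly as in the one-sample case.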