Statistics – Spring 2008
Lab #7 – t-test (between and within)

Defined: Testing differences between group means
Variables: IV is categorical, DV is continuous
Relationship: Group differences
Example: Are males or females happier?
Assumptions: Normality. Homogeneity.

The “t-test” comprises two different types of statistical tests:
(1) Independent t-test – where the mean scores of two groups are compared
(2) Paired-sample t-test – where two means from the same subjects are compared

Here are some examples:
(1) Independent t-test – Are males more optimistic than females?
(2) Paired-sample t-test – Is there a change in anxiety scores from Time 1 to Time 2?

The “t-test” has an interesting history:
The t statistic was introduced by William Sealy Gosset for cheaply monitoring the quality of beer brews. "Student" was his pen name. Gosset was a statistician for the Guinness brewery in Dublin, Ireland, and was hired due to Claude Guinness's innovative policy of recruiting the best graduates from Oxford and Cambridge to apply biochemistry and statistics to Guinness' industrial processes. Gosset published the t test in Biometrika in 1908, but was forced to use a pen name by his employer, who regarded the fact that they were using statistics as a trade secret. In fact, Gosset's identity was unknown not only to fellow statisticians but to his employer; the company insisted on the pseudonym so that it could turn a blind eye to the breach of its rules. (from Wikipedia)

1. Graphing
The first step of any statistical analysis is to plot the data.
How do I plot the data?
1. Select Graphs --> Legacy Dialogs --> Error Bars
2. Click “Simple”, and “Define”
3. Move the categorical variable (IV) into the “Category Axis” box and move the DV into the “Variable” box.
4. Click OK.
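Each error bar in such a plot is a confidence interval around a group mean. The sketch below shows roughly what goes into one bar. It is not the SPSS procedure: the data are hypothetical, the names `mean_ci` and `level` are invented for illustration, and the interval uses the normal critical value as a large-sample approximation, whereas SPSS bases its bars on the t distribution.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def mean_ci(scores, level=0.95):
    """Return (mean, lower, upper) for a confidence interval around the mean.

    Assumption: normal critical value (large-sample approximation).
    With small samples a t-based interval would be somewhat wider.
    """
    m = mean(scores)
    standard_error = stdev(scores) / sqrt(len(scores))
    # Two-sided critical value, e.g. about 1.96 when level = 0.95
    critical = NormalDist().inv_cdf(0.5 + level / 2)
    return m, m - critical * standard_error, m + critical * standard_error


# Hypothetical scores for two groups (not the lab dataset)
groups = {
    "males": [3, 5, 2, 8, 4, 6, 1, 7],
    "females": [5, 6, 5, 7, 6, 5, 6, 4],
}

for label, scores in groups.items():
    m, low, high = mean_ci(scores)
    print(f"{label}: M = {m:.2f}, 95% CI [{low:.2f}, {high:.2f}]")
```

Passing a different `level` (for example 0.99) widens or narrows the interval, which is what changing the confidence percentage in the dialog does to the bars.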
Confidence intervals are usually calculated at 95%, but you can produce 90%, 99%, 99.9% (or whatever) confidence intervals you want by typing your preferred number in the open box that says “Bars represent.”

The output below is for sex (male, female) and errors1a.
a. Notice that females have a higher mean than males.
b. Also notice that the confidence intervals overlap somewhat. From this plot, I would suspect that the two groups are significantly different from each other because, even though the two confidence intervals overlap slightly, the sample size in the dataset is quite large, and we have learned that large sample sizes make it easier to reach significance (they lower p-values).
c. Also notice that the confidence interval for males is longer than the confidence interval for females. Males have more variance in their responses.
d. The final thing to notice is where the group means (and confidence intervals) fall along the scale range of the dependent variable, which is a useful function of plotting data. In this case, the dependent variable ranges from 1 to 13, so both means are in the lower-to-middle end of the scale.

FYI - SPSS does not directly provide a plot for Paired-sample t-test data. Your textbook on page 279 has a discussion of the various steps necessary to configure your data so that SPSS can produce a graphical plot of Paired-sample t-test data. The steps are somewhat complicated and take a while to conduct. I would recommend not plotting Paired-sample t-test data because the time involved outweighs the usefulness of having a plot with confidence intervals.

2. Assumptions - Homogeneity
When dealing with group means (t-test, ANOVA), there are two assumptions: Normality and Homogeneity. Normality was discussed in Lab #1. Homogeneity is whether the variances in the populations are equal.
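One common formal check of the homogeneity assumption is Levene's test. As a rough sketch of what its statistic measures, the code below computes Levene's W in its mean-centered form, which is a one-way ANOVA F ratio calculated on each score's absolute deviation from its own group mean. The function name `levene_w` and the data are hypothetical, and the sketch returns only the statistic; the p-value, which requires the F distribution, is left to the statistics package.

```python
from statistics import mean


def levene_w(*groups):
    """Levene's W (mean-centered): an ANOVA F ratio on absolute deviations."""
    # Absolute deviation of every score from its own group mean
    deviations = [[abs(score - mean(g)) for score in g] for g in groups]
    n_total = sum(len(g) for g in deviations)
    k = len(deviations)
    grand_mean = mean(d for g in deviations for d in g)
    # Spread of the group-level mean deviations around the grand mean
    between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in deviations)
    # Spread of individual deviations around their own group mean
    within = sum((d - mean(g)) ** 2 for g in deviations for d in g)
    return (n_total - k) / (k - 1) * between / within


# Hypothetical groups: the second is twice as spread out as the first
narrow = [1, 2, 3, 4, 5]
wide = [2, 4, 6, 8, 10]
print(f"W = {levene_w(narrow, wide):.2f}")
```

A W near zero means the groups have similar spread; larger values point toward unequal variances.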
When conducting a t-test or ANOVA, one of the options is to produce a test of homogeneity in the output, called Levene’s Test.
a. If Levene’s test is significant (p < .05), then equal variances are NOT assumed; this is called heterogeneity.
b. If Levene’s test is not significant (p > .05), then equal variances are assumed; this is called homogeneity.

It doesn’t really matter whether you have homogeneity or heterogeneity, since the SPSS output gives you all available information for both situations: equal variances assumed, and equal variances not assumed. See the next section, where I discuss the output generated by SPSS for t-tests.

FYI – Levene’s Test is not generated for Paired-sample t-tests, only Independent t-tests.

3. Independent t-test
You conduct an Independent t-test by:
1. Select Analyze --> Compare Means --> Independent-Samples T-test
2. Move the IV into “Grouping Variable”. Click “Define Groups”, and assign the same numbering as in the dataset. For example, if you move “sex” into the box, you would assign “1” to group 1 and “2” to group 2 because that is how the dataset is coded (1=males, 2=females).
3. Move the DV into the “Test Variables” box. For example, move errors1a into the box.
4. Click OK

The output shows two boxes.
a. The first box gives you descriptive data about your sample.
b. The second box indicates whether the test is significant under “Sig.” The box also tells you whether or not the data are homogeneous. Since Levene’s test is significant, equal variances are not assumed. You then report the line of data that corresponds to equal variances not assumed. In other words, the difference between males and females is significant at p = .035. The t-test statistic is negative (-2.135), which indicates that the group 2 mean is larger than the group 1 mean.

The output provides information about the significance value, but you have to calculate the effect size by hand. The formula is on page 302 of Field’s textbook.
The pieces of information required to calculate the effect size are reported above in the output from SPSS – df and t value. In this case the effect size is .19. Here is a website that calculates it for you -- http://web.uccs.edu/lbecker/Psy590/escalc3.htm

WRITE-UP. The information included in the “Results” section is the means, t value, df, and significance value. Lately, the field as a whole has recommended that you report effect size as well.
a. In this study males reported a stronger belief that it is better to acquit guilty people than convict innocent people (M = 4.84, M = 5.61, for males and females respectively), t(319) = -2.135, p = .035. The effect size is .19.

EVALUATION: When evaluating tests that involve differences between group means (t-test, ANOVA), you only really care about three pieces of information: (1) is the effect significant (p-value), (2) what is the size of the effect (effect size), and (3) what is the direction of the effect (which mean is larger than the other).

4. Paired-samples t-test
We don’t have a paired-samples factor in our dataset, so we can’t really use a Paired-sample t-test. However, imagine, just for the sake of argument, that system1 and system2 are the same question. Also imagine that we asked subjects to answer system1 before the study began and system2 after the study ended. The hypothesis would be that answering questions about the legal system changed how subjects answer this question about how much they trust the accuracy of convictions. For example, the hypothesis is that being exposed to the questions in the survey (such as flaws in the legal system regarding convicting innocent people, problems with DNA evidence, etc.) made the subjects trust LESS in the accuracy of convictions.

Here is how to conduct a paired-sample t-test:
1. Select Analyze --> Compare Means --> Paired-Samples T-test
2. Move over both system1 and system2
3. Click OK.
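Behind these menu steps, a paired-samples t-test is a one-sample test on the difference scores: the mean difference divided by its standard error, with df equal to the number of pairs minus one. A minimal sketch under that definition follows. The ratings are hypothetical (not the lab dataset), `paired_t` is an invented name, and the p-value, which requires the t distribution, is left to the statistics package.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(first, second):
    """Return (t, df) for a paired-samples t-test on two equal-length lists."""
    if len(first) != len(second):
        raise ValueError("each subject needs a score in both lists")
    # One difference score per subject: first measurement minus second
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    standard_error = stdev(diffs) / sqrt(n)
    return mean(diffs) / standard_error, n - 1


# Hypothetical ratings for six subjects before and after a study
before = [8, 7, 9, 6, 7, 8]
after = [7, 7, 8, 5, 6, 8]

t, df = paired_t(before, after)
print(f"t({df}) = {t:.2f}")
```

Because the differences are taken as first minus second, a positive t means the first measurement has the higher mean.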
The output below indicates that there is a significant difference between the questions, p < .001 (SPSS displays this as .000). The subjects reported a lower mean at the end of the study (M = 6.68) than at the beginning of the study (M = 7.32).

The output provides information about the significance value, but you have to calculate the effect size by hand. The formula is on page 294 of Field’s textbook. The pieces of information required to calculate the effect size are reported above in the output from SPSS – df and t value. In this case, the effect size is .22. Here is a website that calculates it for you -- http://web.uccs.edu/lbecker/Psy590/escalc3.htm

WRITE-UP. The information included in the write-up is the means, t value, df, and significance value. Lately the field as a whole has recommended that you report effect size as well.
a. Participants reported a lower level of trust in the accuracy of convictions after the experiment (M = 6.7) than before the experiment (M = 7.3), t(325) = 3.97, p < .001. The effect size is .22.

EVALUATION: Just as when evaluating any analysis involving group means (t-test, ANOVA), you only really care about three pieces of information: (1) is the effect significant (p-value), (2) what is the size of the effect (effect size), and (3) what is the direction of the effect (which mean is larger than the other).
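The hand calculation of effect size from t and df can be sketched in a few lines. This assumes the textbook formula is the common conversion r = sqrt(t² / (t² + df)); the function name `effect_size_r` is invented for illustration. The values plugged in are the t and df from the paired-samples write-up.

```python
from math import sqrt


def effect_size_r(t, df):
    """Convert a t statistic and its degrees of freedom to the effect size r.

    Assumed formula: r = sqrt(t^2 / (t^2 + df)).
    """
    return sqrt(t ** 2 / (t ** 2 + df))


# t and df from the paired-samples write-up: t(325) = 3.97
r = effect_size_r(3.97, 325)
print(f"r = {r:.2f}")  # prints "r = 0.22"
```

The sign of t does not matter here because t is squared, so r only describes the size of the effect; the direction still has to be read from the means.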