ANOVA
Richard Rivera (aka Rico)

ANOVA
• In this lesson I will kill two birds with one stone. You’ll learn how to
  – import Excel data files into SPSS
  – conduct analyses of variance in SPSS

Need some Data Files
• But first, you need to download an Excel data file (.xls) and an SPSS data file (.sav).
• Open your web browser (e.g., Internet Explorer).
• Copy and paste the following two links into the web browser.
  http://www.west.asu.edu/statlab/downloads/WorkshopExample.xls
  http://www.west.asu.edu/statlab/downloads/WorkshopExample.sav
• Please save them to your Desktop.

Two Topics
• Prepping an Excel spreadsheet to be imported into SPSS
• Types of analyses of variance (ANOVA)

Excel spreadsheet
• Each column represents a variable.
• The first row contains the variable names.
• Variable names should follow the same rules that apply to variable names in SPSS.
• The subsequent rows contain the data for each case (subject).
• Gender has two levels
  – Male = 1
  – Female = 2
• Age has three levels
  – < 25 = 1
  – 25 - 40 = 2
  – > 40 = 3
• Time has three levels
  – Baseline = time0
  – Time 1 = time1
  – Time 2 = time2
• Composite scores for “attitude towards research”

Excel spreadsheet
I sorted the data by the gender and age variables for instructional purposes.

Between subjects (aka independent samples)
• What are the two between-subjects factors that have independent samples?
  – Gender: 2 samples
  – Age: 3 samples

Within subjects (aka related samples)
• What is the one within-subjects factor that is indicative of three related samples?
  – Time: 3 samples

Importing Excel Data into SPSS
Stat Lab Staff: you may want to print this slide and follow the SPSS directions below.
• After formatting the data in Excel
  – First row contains the variable names
  – Other rows contain the data values
• Save and close the Excel file
• Open SPSS
  – Click on File > Open > Data
  – Navigate to the location where you saved the Excel file
  – In “Files of type:” choose either
    • Excel (*.xls)
    • or All files (*.*)
• Open the Excel file you saved
  – You’ll get a dialogue box called: Opening Excel Data Source
    • Make sure there is a check mark in the box: Read variable names from the first row of data
    • Worksheet: choose the worksheet in which the data is located
    • Click the OK button.
  – You just imported an Excel file into SPSS

Analysis of Variance
• One-way ANOVA
  – One between-subjects factor
  – Example: Age (discrete variable)
• Two-way ANOVA
  – Usually consists of two between-subjects factors
  – Example: Age and Gender
• Repeated Measures
  – One-way within-subjects ANOVA
    • One within-subjects factor
    • Example: Time
  – Two-way between- and within-subjects ANOVA
    • One between-subjects factor (e.g., Age)
    • And one within-subjects factor (e.g., Time)

One-way ANOVA
You may want to open the SPSS data file that you downloaded.
• Tests differences among 2 or more independent sample means with SPSS
• Analysis of variance: between-subjects factor
• Analyze > Compare Means > One-Way ANOVA…
• One dependent variable (e.g., baseline, time0) goes into the “Dependent List:”
• One between-subjects factor (e.g., Age) goes into “Factor:”
• If the factor has three or more levels, click on Post Hoc…
  – Click on Tukey and Dunnett’s C (unless your instructor wants you to use a different one).
  – Click on Continue
• Click on Options
  – Choose
    • Descriptive
    • Homogeneity of variance test
    • Optionally, Means plot
  – Click on Continue
• Click on OK

Descriptive Statistics

Descriptives: time0
                                                   95% Confidence Interval for Mean
Age group    N    Mean   Std. Deviation  Std. Error  Lower Bound  Upper Bound  Minimum  Maximum
Under 25     33   14.12  3.267           .569        12.96        15.28        7        20
25 - 40      37   13.24  3.840           .631        11.96        14.52        7        20
Over 40      25    9.76  3.018           .604         8.51        11.01        6        18
Total        95   12.63  3.837           .394        11.85        13.41        6        20

• The first column contains the three levels of the Age factor.
• The Mean column contains the mean “attitude toward statistics” score for each age group.
• Does it appear that there may be an age effect?
• Do you notice a trend?
• Which age group has the greatest mean?

F-test

ANOVA: time0
                 Sum of Squares  df  Mean Square  F       Sig.
Between Groups    293.219         2  146.610      12.364  .000
Within Groups    1090.886        92   11.857
Total            1384.105        94

• The F-test tells us whether there is an age effect.
• Is the F-test significant?
  – Look at the p value (in the column called Sig.)
  – Is it less than an alpha of .05?
• However, the F-test does not tell us which pairs of means are significantly different.
• We look at the multiple comparisons for that.

Multiple Comparisons

Multiple Comparisons
Dependent Variable: time0
                                    Mean Difference                      95% Confidence Interval
            (I) age    (J) age     (I-J)            Std. Error  Sig.   Lower Bound  Upper Bound
Tukey HSD   Under 25   25 - 40       .878           .824        .538   -1.09         2.84
                       Over 40      4.361*          .913        .000    2.19         6.54
            25 - 40    Under 25     -.878           .824        .538   -2.84         1.09
                       Over 40      3.483*          .891        .001    1.36         5.61
            Over 40    Under 25    -4.361*          .913        .000   -6.54        -2.19
                       25 - 40     -3.483*          .891        .001   -5.61        -1.36
Dunnett C   Under 25   25 - 40       .878           .850               -1.20         2.96
                       Over 40      4.361*          .829                2.31         6.42
            25 - 40    Under 25     -.878           .850               -2.96         1.20
                       Over 40      3.483*          .873                1.33         5.64
            Over 40    Under 25    -4.361*          .829               -6.42        -2.31
                       25 - 40     -3.483*          .873               -5.64        -1.33
*. The mean difference is significant at the .05 level.

• The Mean Difference column is calculated by subtracting the mean for the age category in column (J) from the mean for the age category in column (I).
• If there is an asterisk, the mean difference is significant.

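If you want to check the F-test by hand, the between-groups and within-groups sums of squares can be rebuilt from the group sizes, means, and standard deviations in the Descriptives table. The Python sketch below is not part of the SPSS workflow; the function name one_way_anova_from_summary and the dictionary layout are made up for illustration. Because the table reports rounded means and standard deviations, the results should land close to, but not exactly on, the SPSS values.

```python
# Summary statistics copied from the Descriptives table (time0 by age group).
groups = {
    "Under 25": {"n": 33, "mean": 14.12, "sd": 3.267},
    "25 - 40": {"n": 37, "mean": 13.24, "sd": 3.840},
    "Over 40": {"n": 25, "mean": 9.76, "sd": 3.018},
}


def one_way_anova_from_summary(groups):
    """Rebuild a one-way ANOVA table from per-group n, mean, and SD.

    Hypothetical helper for illustration; it is not an SPSS function.
    """
    stats = list(groups.values())
    total_n = sum(g["n"] for g in stats)

    # Grand mean is the n-weighted average of the group means.
    grand_mean = sum(g["n"] * g["mean"] for g in stats) / total_n

    # Between groups: spread of the group means around the grand mean.
    ss_between = sum(g["n"] * (g["mean"] - grand_mean) ** 2 for g in stats)
    # Within groups: pooled spread of scores around their own group mean.
    ss_within = sum((g["n"] - 1) * g["sd"] ** 2 for g in stats)

    df_between = len(stats) - 1
    df_within = total_n - len(stats)
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within

    return {
        "ss_between": ss_between,
        "ss_within": ss_within,
        "df_between": df_between,
        "df_within": df_within,
        "ms_between": ms_between,
        "ms_within": ms_within,
        "F": ms_between / ms_within,
    }


result = one_way_anova_from_summary(groups)
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

The F value is the ratio of the two mean squares, which is why a large spread between the age-group means relative to the spread within the groups produces a large F.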
Two-way ANOVA
• Example: two between-subjects factors
• Analyze > General Linear Model > Univariate…
• Dialogue box titled Univariate
  – Dependent Variable: move in one dependent variable (e.g., baseline: time0)
  – Fixed Factor(s): move in the between-subjects factors (e.g., Gender and Age).

Dialogue box titled Univariate (continued)
• Click on Plots
  – Move one of the factors into Horizontal Axis and the other into Separate Lines
  – Click Add
  – If you wish, you can do the inverse of that, then click Add again
  – Click Continue

Dialogue box titled Univariate (continued)
• Click on Post Hoc…
  – Choose only factors that have three or more levels (e.g., Age).
  – Click on Tukey and Dunnett’s C (unless your instructor wants you to use different ones).
  – Click on Continue

Dialogue box titled Univariate (continued)
• Click on Options…
  – Choose Descriptive statistics
  – Homogeneity tests
  – Click on Continue
• Click on OK

Repeated Measures
• Within-subjects ANOVA example
• Analyze > General Linear Model > Repeated Measures…
• Dialogue box: Repeated Measures Define Factor(s)
• Within-Subject Factor Name:
  – Change name to: time
  – Number of levels: 3 levels (i.e., time0, time1, and time2)
  – Click on Add
  – Click on Define

Dialogue Box: Repeated Measures
• Within-Subjects Variables (time):
  – Choose the three variables (time0, time1, time2), then insert them in the correct order.
  – Between-Subjects Factor(s):
    • Insert the factors that you are interested in.
    • In this case, enter Age.

Dialogue Box: Repeated Measures (continued)
• Click on Plots
  – I recommend that you move the Time factor into Horizontal Axis and the other (Age) into Separate Lines
  – Click Add
  – Click Continue

Dialogue Box: Repeated Measures (continued)
• Click on Post Hoc…
  – Choose only factor(s) that have three or more levels (e.g., Age).
  – Click on Tukey and Dunnett’s C (unless your instructor wants you to use different ones).
  – Click on Continue

Dialogue Box: Repeated Measures (continued)
• Click on Options…
  – Choose Descriptive statistics
  – Homogeneity tests
  – Click on Continue
• Click on OK

Results Coach
• In many instances there is a results coach in SPSS for certain types of output.
• In the SPSS output, you can right-click on a table and choose “Results Coach”.
• The results coach, which uses an example from a different data set, will introduce you to some of the concepts in the tables.
• Good luck!