Solving Two-Factor ANOVA Problems

Homework problems are multiple answer rather than multiple choice. The format for multiple answer questions is shown in the examples below. The directions for the problems instruct you to mark the check boxes for all of the statements that are true. One or more answers must be marked for each problem.

Full or partial credit is computed for each question. To receive full credit, you must mark all of the correct answers and not mark any of the incorrect answers. Partial credit is computed by summing the points for each correct response and subtracting points for each incorrect answer. If the computation for partial credit results in a negative number, zero credit is assigned.

Level of Measurement and Sample Size Requirement

In a two-way analysis of variance, the independent variables can be at any level of measurement that defines groups (dichotomous, nominal, ordinal, or grouped interval), and the dependent variable is required to be interval level. If the dependent variable is ordinal level, we will follow the common convention of treating ordinal variables as interval level, but we should note the use of an ordinal variable in the discussion of our findings.

I have imposed a minimum sample size requirement of 5 cases per cell for these problems. The cells are the possible combinations of categories for the two factors. If factor one contained two categories and factor two contained three categories, the total number of cells would be 6, as shown in the following table:

                           Factor one
                           Category 1    Category 2
Factor two    Category A   Cell 1        Cell 4
              Category B   Cell 2        Cell 5
              Category C   Cell 3        Cell 6

If the sample size requirement and the level of measurement requirement are satisfied, the check box “The level of measurement requirement and the sample size requirement are satisfied” should be marked.
If the level of measurement or sample size requirement is not satisfied, the correct answer to the problem is “Inappropriate application of the statistic.”

The Assumption of Normality

Analysis of variance assumes that the dependent variable is normally distributed, but there is general consensus that violations of this assumption do not seriously affect the probabilities needed for statistical decision making. The problems evaluate normality based on the criteria that the skewness and kurtosis of the dependent variable both fall within the range from -1.0 to +1.0.

If the dependent variable satisfies these criteria for skewness and kurtosis, the check box “The skewness and kurtosis of income satisfy the assumption of normality” should be marked. If the criteria for normality are not satisfied, the check box should remain unmarked and we should consider including a statement about the violation of this assumption in the discussion of our results.

In these problems we will not test transformations or consider removing outliers to improve the normality of the variable.

The Assumption of Homogeneity of Variance

Analysis of variance assumes that the variance of the dependent variable is homogeneous across all of the cells formed by the factors (independent variables). We will use the significance of Levene’s test for equality of variance, which SPSS provides as part of the output, as our criterion for satisfying the assumption.

Levene’s test is a diagnostic statistic that tests the null hypothesis that the variance is homogeneous or equal across all cells. The desired outcome, and support for satisfying the assumption, is to fail to reject the null hypothesis.

If the significance for the Levene test is greater than the alpha for diagnostic statistics, we fail to reject the null hypothesis and the check box “The assumption of homogeneity of variance is supported by Levene's test for equality of variances” should be marked.
If the criterion for homogeneity of variance is not satisfied, the check box should remain unmarked.

Analysis of variance is robust to violations of the assumption of homogeneity of variances provided the largest group variance is not more than 3 times the smallest group variance. If we violate this assumption, but the ratio is less than or equal to 3.0, we should consider including a statement about the violation of this assumption in the discussion of our results.

If we violate this assumption and the ratio of largest to smallest variance is greater than 3.0, we should not use analysis of variance for the data for these variables and we mark the check box “Inappropriate application of the statistic.” The check boxes for level of measurement and sample size, and for the assumption of normality, should remain marked if these conditions are satisfied.

The Existence of an Interaction Effect

Interaction effects represent the effects associated with combinations of the independent variables that are not detected when each independent variable is analyzed by itself.

An interaction effect is generally understood to contradict the interpretation of the main effects, such that main effects are not interpreted when there is a statistically significant interaction effect. The pattern that we might ascribe to a single independent variable changes when we look at it jointly with the other independent variable.

If the interaction effect is statistically significant, the check box “The relationship between income and sex cannot be interpreted independent of self-employment” is marked. If the interaction effect is not statistically significant, the check box is left blank.

If the interaction effect is statistically significant, none of the statements about main effects are marked, even though they might be statistically significant.
A significant interaction implies that the interpretation of the relationship changes for different categories of the other factor included in the analysis, making our statements about the individual main effects likely to be incorrect.

The problem statement does not include a statement interpreting the interaction effect because the interpretation is complex. However, the feedback for the problem will contain a statement about the interaction effect when it is found to be statistically significant.

Interpretation of Main Effects

Determining the correctness of statements about main effects is a two-stage process. First, the main effect must be statistically significant. Second, the statement must be a correct comparison of the direction of the means, based on either a direct comparison of the group means when the factor contains two categories, or a post-hoc test when the factor includes three or more categories.

There are two interpretive statements for each main effect. If the main effect is not statistically significant, neither of the two statements should be marked. If the effect is statistically significant, the one that is supported by the correct post-hoc test should be marked.

For these problems, we will use the Bonferroni pairwise comparison test to determine which pairs of means are and are not significantly different. The problems report the comparisons for the category with the largest mean. It is possible that it is significantly larger than the means of all of the other categories, or only some of the other categories. The problem should be answered in terms of the post-hoc comparison stated in the statements about main effects.

It is possible, but unlikely, that the main effect will be statistically significant, but the category with the highest mean does not meet the criteria for statistically significant post-hoc differences.
It is quite likely that there are other statements about post-hoc differences that could legitimately be made, but these differences are not germane to correctly answering the question.

If a main effect is statistically significant and both statements about the effect are marked, zero credit will be given for the answer, since the points will be counted for the correct answer but deducted for the incorrect answer.

Inappropriate application of the statistic

We should not use analysis of variance if we violate the level of measurement requirement, the minimum sample size requirement, or the assumption of homogeneity of variance when the ratio of largest to smallest group variance is larger than 3.0.

Solving Problems in SPSS

We will demonstrate the use of SPSS for an analysis of variance with this problem.

Level of Measurement

In a two-way analysis of variance, the independent variables can be at any level of measurement that defines groups (dichotomous, nominal, ordinal, or grouped interval), and the dependent variable is required to be interval level.

"Computer use" [compuse] is ordinal, satisfying the requirement for an independent variable. "Satisfaction with financial situation" [satfin] is dichotomous, satisfying the requirement for an independent variable.

The dependent variable "total family income" [income98] is ordinal level. However, we will follow the common convention of treating ordinal variables as interval level. This convention should be mentioned in the discussion of our findings.

Creating Two-Factor ANOVA Output with Univariate General Linear Model - 1

Select General Linear Model > Univariate from the Analyze menu.

Creating Two-Factor ANOVA Output with Univariate General Linear Model - 2

First, move income98 to the Dependent Variable text box. Second, move compuse and satfin to the Fixed Factor(s) list box. Third, click on the Options button.
Creating Two-Factor ANOVA Output with Univariate General Linear Model - 3

First, move all of the Factors and Factor Interactions to the Display Means for list box. Second, mark the check box Compare main effects. This will compute the post hoc tests for the main effects. Third, select Bonferroni from the Confidence interval adjustment drop-down menu. This will hold the error rate for our multiple comparisons to the specified alpha error rate.

Creating Two-Factor ANOVA Output with Univariate General Linear Model - 4

Next, mark the check boxes for Descriptive statistics, Estimates of effect size, Parameter estimates, and Homogeneity tests. Finally, click on the Continue button to close the dialog box.

Creating Two-Factor ANOVA Output with Univariate General Linear Model - 5

Next, click on the Plots button to request the plots that will assist us in evaluating an interaction effect.

Creating Two-Factor ANOVA Output with Univariate General Linear Model - 6

First, move the variable satfin to the Horizontal Axis text box. Second, move the compuse variable to the Separate Lines text box. Third, click on the Add button to add this to the list of plots. Since it is often easier to spot the interaction with one of the possible combinations rather than the other, we will create both.

Creating Two-Factor ANOVA Output with Univariate General Linear Model - 7

First, move the variable compuse to the Horizontal Axis text box. Second, move the satfin variable to the Separate Lines text box. Third, click on the Add button to add this to the list of plots.

Creating Two-Factor ANOVA Output with Univariate General Linear Model - 8

With both plots added, click on the Continue button to close the dialog box.

Creating Two-Factor ANOVA Output with Univariate General Linear Model - 9

Having completed all of the specifications, click on the OK button to generate the output.

Sample Size Requirement

The smallest cell in the analysis had 13 cases.
The sample size requirement of 5 or more cases per cell is satisfied.

Marking the Statement for the Level of Measurement and Sample Size Requirement

Since we satisfied both the level of measurement and the sample size requirements for analysis of variance, we mark the first check box for the problem.

The Assumption of Normality

The next statement in the problem focuses on the assumption of normality, using the skewness and kurtosis criteria that both statistical values should be between -1.0 and +1.0.

The Assumption of Normality - 1

To evaluate the assumption of normality, we will generate skewness and kurtosis with the Descriptives command. Select Descriptive Statistics > Descriptives from the Analyze menu.

The Assumption of Normality - 2

First, move the variable income98 to the Variable(s) text box. Second, click on the OK button to generate the output.

The Assumption of Normality - 3

"Total family income" [income98] satisfied the criteria for a normal distribution. The skewness of the distribution (-.628) was between -1.0 and +1.0 and the kurtosis of the distribution (-.248) was between -1.0 and +1.0.

Marking the Statement for the Assumption of Normality

Since the skewness and kurtosis were both between -1.0 and +1.0 for the variable, the assumption of normality is satisfied and the check box is marked.

Assumption of Homogeneity of Variance

The next statement in the problem focuses on the assumption of homogeneity of variance as tested by the Levene statistic.

Assumption of Homogeneity of Variance - 1

The probability associated with Levene's test for equality of variances (F(5, 181) = 2.32, p = .045) is greater than the alpha for diagnostic tests (0.01). The assumption of equal variances is satisfied.
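The two diagnostic decisions made so far can be written as simple comparisons. The following is a minimal Python sketch, not part of the SPSS procedure. The function names are hypothetical helpers, and treating the -1.0 to +1.0 range as inclusive is an assumption. The statistics passed in are the values reported above.

```python
# Sketch only: the thresholds and statistics come from the worked problem;
# the function names are hypothetical and are not SPSS features.

def within_normal_range(skewness, kurtosis, limit=1.0):
    """True when skewness and kurtosis both fall between -limit and +limit.

    The endpoints are treated as inside the range (an assumption).
    """
    return -limit <= skewness <= limit and -limit <= kurtosis <= limit


def levene_supports_homogeneity(p_value, diagnostic_alpha=0.01):
    """True when Levene's test fails to reject the null of equal variances."""
    return p_value > diagnostic_alpha


normality_ok = within_normal_range(skewness=-0.628, kurtosis=-0.248)
homogeneity_ok = levene_supports_homogeneity(p_value=0.045)
print(normality_ok, homogeneity_ok)
```

Both checks come out true for this problem, which matches the two check boxes marked in the text.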
Assumption of Homogeneity of Variance - 2

Had we violated the assumption of homogeneity of variance, we would use the table of descriptive statistics and square the standard deviation to compute the variance for each group or cell.

Assumption of Homogeneity of Variance - 3

For this problem, we would compute 8 variances:

5.575 ^ 2 = 31.081
5.325 ^ 2 = 28.356
5.014 ^ 2 = 25.140
5.528 ^ 2 = 30.559
5.339 ^ 2 = 28.505
4.017 ^ 2 = 16.136
3.453 ^ 2 = 11.923
4.605 ^ 2 = 21.206

The largest variance is 31.081. The smallest variance is 11.923. The ratio of the two variances is 2.607, less than the rule of thumb of 3.0. We could interpret this ANOVA even if Levene's test had indicated a violation of homogeneity.

Marking the Statement for the Assumption of Homogeneity of Variance

Since we satisfied the assumption of homogeneity of variance, we mark the check box.

The Interaction Effect

The next statement asks about the interaction effect. If there is an interaction effect, the main effects cannot be interpreted individually.

Interaction Effect - 1

The interaction between satisfaction with financial situation and computer use was not statistically significant, F(2, 181) = 0.167, p = .846, partial eta squared = .002. The null hypothesis of no interaction effect is not rejected. The relationship between satisfaction with financial situation and total family income is not contingent on the category of computer use.

Interaction Effect - 2

The non-significance of the interaction effect is supported by the profile plots, which show the lines for the mean total family income by computer use to be approximately parallel for all categories of satisfaction with financial situation.

Interaction Effect - 3

Just to make sure there is no interaction, we swap the variable represented by the separate lines and the variable plotted on the horizontal axis. Again the lines are approximately parallel.
Marking the Statement for the Interaction Effect

Since the interaction effect was not statistically significant, we leave the check box unmarked.

The Main Effect for Computer Use

The next two statements offer an interpretation of the main effect for computer use. We must first determine that there is a significant main effect and then select the statement supported by the post hoc test.

Main Effect for Computer Use

The main effect for total family income by computer use was statistically significant (F(1, 181) = 30.512, p < .001, partial eta squared = 0.14). The null hypothesis that "the mean total family income was equal across all categories of computer use" was rejected.

Interpreting the Main Effect for Computer Use - 1

When we do not have the same number of cases in the cells (an unbalanced design), the means that we report are the Estimated Marginal Means.

Survey respondents who said they used a computer had higher total family incomes (M=17.15, SE=0.48) compared to survey respondents who said they didn't use a computer (M=12.91, SE=0.60).

Interpreting the Main Effect for Computer Use - 2

To report the mean difference as a finding, the post hoc test must also be statistically significant. The Bonferroni pairwise comparison of the difference (4.24) was statistically significant (p < .001).

The statement that "survey respondents who said they used a computer had higher total family incomes than those who said they didn't use a computer" is correct.

Marking the Statement for Main Effect for Computer Use

Since the post hoc test and the pattern of the means supported the first statement, it is marked and the second statement is left blank.

The Main Effect for Satisfaction with Financial Situation

The next two statements offer an interpretation of the main effect for satisfaction with financial situation. We must first determine that there is a significant main effect and then select the statement supported by the post hoc test.
Main Effect for Satisfaction with Financial Situation

The main effect for total family income by satisfaction with financial situation was statistically significant (F(2, 181) = 12.483, p < .001, partial eta squared = 0.12). The null hypothesis that "the mean total family income was equal across all categories of satisfaction with financial situation" was rejected.

Interpreting the Main Effect for Satisfaction with Financial Situation - 1

The group with the highest mean was respondents who were satisfied with their financial situation. We will interpret the effect based on this category.

Survey respondents who said they were satisfied with their present financial situation had higher total family incomes (M=17.61, SE=0.79) compared to survey respondents who said they were more or less satisfied with their present financial situation (M=15.05, SE=0.48).

Interpreting the Main Effect for Satisfaction with Financial Situation - 2

The Bonferroni pairwise comparison of the difference (2.56) was statistically significant (p = .019).

Interpreting the Main Effect for Satisfaction with Financial Situation - 3

Survey respondents who said they were satisfied with their present financial situation had higher total family incomes (M=17.61, SE=0.79) compared to survey respondents who said they were not at all satisfied with their present financial situation (M=12.43, SE=0.69).

Interpreting the Main Effect for Satisfaction with Financial Situation - 4

The Bonferroni pairwise comparison of the difference (5.19) was statistically significant (p < .001).

Interpreting the Main Effect for Satisfaction with Financial Situation - 5

The statement that "survey respondents who said they were satisfied with their present financial situation had higher total family incomes than those who said they were more or less satisfied with their present financial situation and those who said they were not at all satisfied with their present financial situation" is correct.
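The partial eta squared values reported for the interaction and the two main effects can be cross-checked from the F statistics and their degrees of freedom, using the identity partial eta squared = (F × df_effect) / (F × df_effect + df_error). The sketch below is a hedged Python illustration. The function name is hypothetical, and the inputs are the F ratios and degrees of freedom quoted in this example.

```python
# Sketch only: recovers partial eta squared from reported F ratios.
# The function name is a hypothetical helper, not an SPSS feature.

def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared = SS_effect / (SS_effect + SS_error),
    rewritten in terms of the F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# (F, df_effect, df_error) as reported for each effect in this problem.
reported = {
    "interaction": (0.167, 2, 181),
    "computer use": (30.512, 1, 181),
    "satisfaction with financial situation": (12.483, 2, 181),
}

for effect, (f_value, df_effect, df_error) in reported.items():
    print(effect, round(partial_eta_squared(f_value, df_effect, df_error), 3))
```

Rounded to the precision used in the text, the results agree with the reported values of .002, 0.14, and 0.12.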
Marking the Main Effect for Satisfaction with Financial Situation

Since the post hoc test and the pattern of the means supported the first statement, it is marked and the second statement is left blank.

The Correct Answers Marked in BlackBoard

Based on the findings above, the check boxes for the correct answers are marked as shown in this picture.

The Problem Graded in BlackBoard

When this assignment was submitted, BlackBoard indicated that all marked answers were correct, and we received the full 10 points for the question.

Logic Diagram for Two-Factor Analysis of Variance Problems - 1

Level of measurement and sample size ok?
    No: Do not mark check box. Mark: Inappropriate application of the statistic. Stop.
    Yes: Mark check box for correct answer.

Ordinal dv?
    Yes: Mention convention in discussion of findings.

Assumption of normality ok? (skewness and kurtosis between +/-1)
    No: Do not mark check box. Mention violation in discussion of findings.
    Yes: Mark check box for correct answer.

Assumption of homogeneity of variance ok? (Levene Sig > diagnostic alpha)
    Yes: Mark check box for correct answer.
    No: Do not mark check box. Ratio of largest group variance to smallest group variance ≤ 3?
        Yes: Mention violation in discussion of findings.
        No: Mark: Inappropriate application of the statistic. Stop.

Logic Diagram for Two-Factor Analysis of Variance Problems - 2

Interaction effect statistically significant? (Sig < alpha)
    Yes: Mark check box for correct answer. Interpret interaction using means for combined cells. Do not interpret main effects. Stop.
    No: Do not mark check box.

Main effect statistically significant? (Sig < alpha)
    No: Do not mark check box.
    Yes: Relationship for group with largest mean statistically significant and correctly stated?
        Yes: Mark check box for correct answer.
        No: Do not mark check box.

Repeat for other main effects.
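The two logic diagrams can also be sketched as a single Python function. This is an illustration under stated assumptions, not the grading code. The function names and statement labels are hypothetical, and the alpha of 0.05 for the main tests is assumed because the problem statement is not reproduced here. The 0.01 diagnostic alpha and the variance ratio limit of 3.0 come from the text. Because the exact p-values for the main effects are reported only as "p < .001", the example passes 0.001 as a stand-in upper bound.

```python
# Sketch of the two logic diagrams. Labels and alpha=0.05 are assumptions;
# diagnostic_alpha=0.01 and the ratio limit of 3.0 are taken from the text.

INAPPROPRIATE = "Inappropriate application of the statistic"


def variance_ratio(standard_deviations):
    """Ratio of the largest to the smallest variance (variance = SD squared)."""
    variances = [sd ** 2 for sd in standard_deviations]
    return max(variances) / min(variances)


def boxes_to_mark(requirements_ok, normality_ok, levene_p, ratio,
                  interaction_p, main_effects,
                  alpha=0.05, diagnostic_alpha=0.01):
    """Return the check boxes to mark, in the order the diagrams visit them.

    main_effects maps a factor name to (p_value, supported_statement), where
    supported_statement is None when the post hoc test supports neither
    statement offered for that factor.
    """
    if not requirements_ok:
        return [INAPPROPRIATE]
    marks = ["level of measurement and sample size"]
    if normality_ok:
        marks.append("normality")
    if levene_p > diagnostic_alpha:
        marks.append("homogeneity of variance")
    elif ratio > 3.0:
        # Violation too severe for ANOVA; earlier marks stay in place.
        return marks + [INAPPROPRIATE]
    if interaction_p < alpha:
        # A significant interaction blocks interpretation of main effects.
        return marks + ["interaction"]
    for p_value, supported_statement in main_effects.values():
        if p_value < alpha and supported_statement is not None:
            marks.append(supported_statement)
    return marks


# Values from the worked problem; 0.001 stands in for the reported "p < .001".
reported_sds = [5.575, 5.325, 5.014, 5.528, 5.339, 4.017, 3.453, 4.605]
answer = boxes_to_mark(
    requirements_ok=True,
    normality_ok=True,
    levene_p=0.045,
    ratio=variance_ratio(reported_sds),
    interaction_p=0.846,
    main_effects={
        "computer use": (0.001, "computer users had higher incomes"),
        "financial satisfaction": (0.001, "satisfied respondents had higher incomes"),
    },
)
print(answer)
```

For the worked problem the function marks the requirement, normality, and homogeneity boxes plus one statement per main effect, and leaves the interaction box blank, which is the pattern described in the walkthrough.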