Solving Two-Factor ANOVA Problems

Homework problems are multiple answer rather than multiple choice. The format for multiple
answer questions is shown in the examples below.
The directions for the problems instruct you to mark the check boxes for all of the statements
that are true. One or more answers must be marked for each problem. Full or partial credit is
computed for each question. To receive full credit, you must mark all of the correct answers
and not mark any of the incorrect answers. Partial credit is computed by summing the points
for each correct response and subtracting points for each incorrect answer. If the computation
for partial credit results in a negative number, zero credit is assigned.
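The scoring rule above can be sketched in a few lines of Python. This is only an illustration: the function name `partial_credit` and the assumption that every answer is worth the same number of points (and that the deduction equals the award) are ours, not part of the grading system.

```python
def partial_credit(marked, correct, points_per_answer=1.0):
    """Score one multiple-answer question.

    Assumption: each answer carries the same weight, and an incorrect
    mark costs as much as a correct mark earns.
    """
    marked, correct = set(marked), set(correct)
    hits = len(marked & correct)    # correct answers that were marked
    misses = len(marked - correct)  # incorrect answers that were marked
    # A negative total is replaced by zero credit.
    return max(0.0, (hits - misses) * points_per_answer)
```

Marking one correct and one incorrect answer gives zero under these assumptions, because the deduction cancels the award.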
Level of Measurement and Sample Size Requirement
In a two-way analysis of variance, the level of measurement for the independent variables can
be any level that defines groups (dichotomous, nominal, ordinal, or grouped interval) and the
dependent variable is required to be interval level. If the dependent variable is ordinal level,
we will follow the common convention of treating ordinal variables as interval level, but we
should note the use of an ordinal variable in the discussion of our findings.
I have imposed a minimum sample size requirement of 5 cases per cell for these problems. The
cells are the possible combinations of categories for the two factors. If factor one contained
two categories and factor two contained three categories, the total number of cells would be 6,
as shown in the following table:
                           Factor one
                           Category 1    Category 2
Factor two    Category A   Cell 1        Cell 4
              Category B   Cell 2        Cell 5
              Category C   Cell 3        Cell 6
If the sample size requirement and the level of measurement requirement are satisfied, the
check box “The level of measurement requirement and the sample size requirement are
satisfied” should be marked. If the level of measurement or sample size requirement is not
satisfied, the correct answer to the problem is “Inappropriate application of the statistic.”
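As a rough check of the sample size requirement outside SPSS, the cell counts can be tallied directly. The sketch below assumes the data are available as (factor one category, factor two category) pairs; the function names are hypothetical.

```python
from collections import Counter

def smallest_cell(cases, levels_one, levels_two):
    """Return (cell, count) for the least-populated cell.

    cases is an iterable of (factor_one_category, factor_two_category)
    pairs. Every possible combination is checked, so a cell with no
    cases is reported with a count of 0.
    """
    counts = Counter(cases)
    cells = [(a, b) for a in levels_one for b in levels_two]
    cell = min(cells, key=lambda c: counts[c])
    return cell, counts[cell]

def sample_size_ok(cases, levels_one, levels_two, minimum=5):
    """True when every cell has at least `minimum` cases."""
    return smallest_cell(cases, levels_one, levels_two)[1] >= minimum
```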
The Assumption of Normality
Analysis of variance assumes that the dependent variable is normally distributed, but there is
general consensus that violations of this assumption do not seriously affect the probabilities
needed for statistical decision making.
The problems evaluate normality based on the criteria that the skewness and kurtosis of the
dependent variable fall within the range from -1.0 to +1.0. If the dependent variable satisfies
these criteria for skewness and kurtosis, the check box “The skewness and kurtosis of income
satisfy the assumption of normality” should be marked. If the criteria for normality are not
satisfied, the check box should remain unmarked and we should consider including a statement
about the violation of this assumption in the discussion of our results.
In these problems we will not test transformations or consider removing outliers to improve the
normality of the variable.
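For readers who want to see what the criteria involve, here is a minimal sketch of the skewness and kurtosis check. It uses the bias-corrected sample formulas, which we assume match the values SPSS reports. The function names are ours, and the sketch needs at least four cases with some variation in the values.

```python
import math

def _standard_scores(xs):
    """Return n and the scores (x - mean) / s, with s the sample SD."""
    n = len(xs)
    mean = sum(xs) / n
    s = math.sqrt(sum((x - mean) ** 2 for x in xs) / (n - 1))
    return n, [(x - mean) / s for x in xs]

def sample_skewness(xs):
    """Bias-corrected sample skewness."""
    n, z = _standard_scores(xs)
    return n / ((n - 1) * (n - 2)) * sum(v ** 3 for v in z)

def sample_kurtosis(xs):
    """Bias-corrected sample excess kurtosis (0 for a normal distribution)."""
    n, z = _standard_scores(xs)
    first = n * (n + 1) / ((n - 1) * (n - 2) * (n - 3)) * sum(v ** 4 for v in z)
    return first - 3 * (n - 1) ** 2 / ((n - 2) * (n - 3))

def normality_ok(xs, limit=1.0):
    """True when both statistics fall within -limit to +limit."""
    return abs(sample_skewness(xs)) <= limit and abs(sample_kurtosis(xs)) <= limit
```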
The Assumption of Homogeneity of Variance
Analysis of variance assumes that the variance of the dependent variable is homogeneous
across all of the cells formed by the factors (independent variables). We will use the
significance of Levene’s test for equality of variance, which SPSS provides as part of the
output, as our criterion for satisfying the assumption.
Levene’s test is a diagnostic statistic that tests the null hypothesis that the variance is
homogeneous or equal across all cells. The desired outcome, and support for satisfying the
assumption, is to fail to reject the null hypothesis.
If the significance for the Levene test is greater than the alpha for diagnostic statistics, we
fail to reject the null hypothesis and the check box “The assumption of homogeneity of variance
is supported by Levene's test for equality of variances” should be marked. If the criterion for
homogeneity of variance is not satisfied, the check box should remain unmarked.
Analysis of variance is robust to violations of the assumption of homogeneity of variances
provided the largest group variance is not more than 3 times the smallest group variance.
If we violate this assumption, but the ratio is less than or equal to 3.0, we should consider
including a statement about the violation of this assumption in the discussion of our results.
If we violate this assumption and the ratio of largest to smallest variance is greater than
3.0, we should not use analysis of variance for the data for these variables and we mark the
check box “Inappropriate application of the statistic.” The check boxes for level of
measurement and sample size, and the assumption of normality should remain marked if these
conditions are satisfied.
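The decision rule for homogeneity of variance can be summarized as a small function. This is a sketch under stated assumptions: the function name and return strings are ours, the default diagnostic alpha of 0.01 is the value used in the worked example later in these slides, and a ratio of exactly 3.0 is treated as acceptable, following the logic diagram.

```python
def homogeneity_decision(levene_sig, cell_variances, alpha=0.01, max_ratio=3.0):
    """Classify the homogeneity-of-variance outcome for one problem."""
    if levene_sig > alpha:
        # Fail to reject the null hypothesis of equal variances.
        return "assumption satisfied: mark check box"
    ratio = max(cell_variances) / min(cell_variances)
    if ratio <= max_ratio:
        # Violation, but ANOVA is considered robust at this ratio.
        return "violated but tolerable: mention in discussion"
    return "inappropriate application of the statistic"
```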
The Existence of an Interaction Effect
Interaction effects represent the effects associated with combinations of the independent
variables that are not detected when each independent variable is analyzed by itself. An
interaction effect is generally understood to contradict the interpretation of the main effects,
such that main effects are not interpreted when there is a statistically significant interaction
effect. The pattern that we might ascribe to a single independent variable changes when we
take into account the pattern that is exhibited when we look at it jointly with the other
independent variable.
If the interaction effect is statistically significant, the check box “The relationship between
income and sex cannot be interpreted independent of self-employment” is marked. If the
interaction effect is not statistically significant, the check box is left blank.
If the interaction effect is statistically significant, none of the statements about main effects
are marked, even though they might be statistically significant. A significant interaction
implies that the interpretation of the relationship changes for different categories of the other
factor included in the analysis, making our statement about the individual main effects likely
to be incorrect.
The problem statement does not include a statement interpreting the interaction effect
because the interpretation is complex. However, the feedback for the problem will contain a
statement about the interaction effect when it is found to be statistically significant.
Interpretation of Main Effects
Determination of the correctness of statements about main effects is a two stage process.
First, it is required that the main effect be statistically significant. Second, it is required that
the statement be a correct comparison of the direction of the means, based on either a direct
comparison of the group means when the factor contains two categories, or a post-hoc test
when the factor includes three or more categories.
There are two interpretive statements for each main effect. If the main effect is not
statistically significant, neither of the two statements should be marked. If the effect is
statistically significant, the one that is supported by the correct post-hoc test should be
marked. For these problems, we will use the Bonferroni pairwise comparison test to determine
which pairs of means are and are not statistically significant.
The problems report the comparisons for the category with the largest mean. It is possible that
it is significantly larger than the means of all of the other categories, or only some of the other
categories. The problem should be answered in terms of the post-hoc comparison stated in the
statements about main effects. It is possible, but unlikely, that the main effect will be
statistically significant, but the category with the highest mean does not meet the criteria for
statistically significant post-hoc differences.
It is quite likely that there are other statements about post-hoc differences that could
legitimately be made, but these differences are not germane to correctly answering the
question.
If a main effect is statistically significant and both statements about the effect are marked,
zero credit will be given for the answer, since the points counted for the correct answer are
cancelled by the points deducted for the incorrect answer.
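The rules for marking a main effect statement can be expressed as a short function. This is a sketch only: the function and parameter names are ours, and alpha is left as a required argument because the rules above do not fix its value.

```python
def mark_statement(interaction_sig, main_effect_sig, posthoc_sigs,
                   direction_correct, alpha):
    """Decide whether one main effect statement should be marked.

    posthoc_sigs holds the Bonferroni significance values for every
    pairwise comparison named in the statement.
    """
    if interaction_sig < alpha:
        # A significant interaction means no main effect statement is marked.
        return False
    if main_effect_sig >= alpha:
        # The main effect itself must be statistically significant.
        return False
    # The stated direction must be right and every named comparison significant.
    return direction_correct and all(p < alpha for p in posthoc_sigs)
```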
Inappropriate Application of the Statistic
We should not use analysis of variance if we violate the level of measurement requirement, the
minimum sample size requirement, or the assumption of homogeneity of variance when the
ratio of largest to smallest group variance is larger than 3.0.
Solving Problems in SPSS
We will demonstrate the use of SPSS for an analysis of variance with this problem.
Level of Measurement
In a two-way analysis of variance, the level of measurement for the independent variables can
be any level that defines groups (dichotomous, nominal, ordinal, or grouped interval) and the
dependent variable is required to be interval level.
"Computer use" [compuse] is dichotomous, satisfying the requirement for an independent
variable. "Satisfaction with financial situation" [satfin] is ordinal, satisfying the
requirement for an independent variable. The dependent variable "total family income"
[income98] is ordinal level. However, we will follow the common convention of treating ordinal
variables as interval level. This convention should be mentioned in the discussion of our
findings.
Creating Two-Factor ANOVA Output with Univariate General Linear Model - 1
Select General Linear Model > Univariate from the Analyze menu.
Creating Two-Factor ANOVA Output with Univariate General Linear Model - 2
First, move income98 to the Dependent Variable text box.
Second, move compuse and satfin to the Fixed Factor(s) list box.
Third, click on the Options button.
Creating Two-Factor ANOVA Output with Univariate General Linear Model – 3
First, move all of the Factors and Factor Interactions to the Display Means for list box.
Second, mark the check box Compare main effects. This will compute the post hoc tests for the
main effects.
Third, select Bonferroni from the Confidence interval adjustment drop-down menu. This will hold
the error rate for our multiple comparisons to the specified alpha error rate.
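To see how a Bonferroni adjustment holds the overall error rate, consider the standard form of the correction: each unadjusted probability is multiplied by the number of comparisons and capped at 1.0. The sketch below illustrates that arithmetic. We assume, but have not verified here, that the adjusted significance values SPSS prints are computed this way; the function name is ours.

```python
def bonferroni_adjust(p_values):
    """Return Bonferroni-adjusted p-values for a set of comparisons."""
    m = len(p_values)  # number of pairwise comparisons
    # Multiply each p-value by m, never exceeding 1.0.
    return [min(1.0, p * m) for p in p_values]
```

Comparing each adjusted value to alpha is equivalent to comparing each unadjusted value to alpha divided by the number of comparisons.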
Creating Two-Factor ANOVA Output with Univariate General Linear Model – 4
Next, mark the check boxes for
o Descriptive statistics,
o Estimates of effect size,
o Parameter estimates, and
o Homogeneity tests.
Finally, click on the Continue button to close the dialog box.
Creating Two-Factor ANOVA Output with Univariate General Linear Model – 5
Next, click on the Plots button to request the plots that will assist us in evaluating an
interaction effect.
Creating Two-Factor ANOVA Output with Univariate General Linear Model – 6
First, move the variable satfin to the Horizontal Axis text box.
Second, move the compuse variable to the Separate Lines text box.
Third, click on the Add button to add this to the list of plots.
Since it is often easier to spot the interaction with one of the possible combinations rather
than the other, we will create both.
Creating Two-Factor ANOVA Output with Univariate General Linear Model – 7
First, move the variable compuse to the Horizontal Axis text box.
Second, move the satfin variable to the Separate Lines text box.
Third, click on the Add button to add this to the list of plots.
Creating Two-Factor ANOVA Output with Univariate General Linear Model – 8
With both plots added, click on the Continue button to close the dialog box.
Creating Two-Factor ANOVA Output with Univariate General Linear Model – 9
Having completed all of the specifications, click on the OK button to generate the output.
Sample Size Requirement
The smallest cell in the analysis had 13 cases. The sample size requirement of 5 or more cases
per cell is satisfied.
Marking the Statement for the Level of Measurement and Sample Size Requirement
Since we satisfied both the level of measurement and the sample size requirements for analysis
of variance, we mark the first check box for the problem.
The Assumption of Normality
The next statement in the problem focuses on the assumption of normality, using the skewness
and kurtosis criteria that both statistical values should be between -1.0 and +1.0.
The Assumption of Normality - 1
To evaluate the assumption of normality, we will generate skewness and kurtosis with the
Descriptives command.
Select Descriptive Statistics > Descriptives from the Analyze menu.
The Assumption of Normality - 2
First, move the variable income98 to the Variable(s) text box.
Second, click on the OK button to generate the output.
The Assumption of Normality - 3
"Total family income" [income98] satisfied the criteria for a normal distribution. The skewness
of the distribution (-.628) was between -1.0 and +1.0 and the kurtosis of the distribution
(-.248) was between -1.0 and +1.0.
Marking the Statement for the Assumption of Normality
Since the skewness and kurtosis were both between -1.0 and +1.0 for the variable, the
assumption of normality is satisfied and the check box is marked.
Assumption of Homogeneity of Variance
The next statement in the problem focuses on the assumption of homogeneity of variance as
tested by the Levene Statistic.
Assumption of Homogeneity of Variance - 1
The probability associated with Levene's test for equality of variances (F(5, 181) = 2.32,
p = .045) is greater than the alpha for diagnostic tests (0.01). The assumption of equal
variances is satisfied.
Assumption of Homogeneity of Variance - 2
Had we violated the assumption of homogeneity of variance, we would use the table of
descriptive statistics to square the standard deviation to compute the variance for each group
or cell.
Assumption of Homogeneity of Variance - 3
For this problem, we would compute 8 variances:
5.575 ^ 2 = 31.081
5.325 ^ 2 = 28.356
5.014 ^ 2 = 25.140
5.528 ^ 2 = 30.559
5.339 ^ 2 = 28.505
4.017 ^ 2 = 16.136
3.453 ^ 2 = 11.923
4.605 ^ 2 = 21.206
The largest variance is 31.081. The smallest variance is 11.923. The ratio of the two variances
is 2.607, less than the rule of thumb of 3.0. We could interpret this ANOVA even if the
assumption of homogeneity had been violated.
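The arithmetic for the variance ratio is easy to reproduce. The short script below squares the eight standard deviations listed above and compares the largest and smallest variances; the variable names are ours.

```python
# Standard deviations taken from the table of descriptive statistics.
std_devs = [5.575, 5.325, 5.014, 5.528, 5.339, 4.017, 3.453, 4.605]

# The variance is the square of the standard deviation.
variances = [sd ** 2 for sd in std_devs]

# Ratio of the largest variance to the smallest variance.
ratio = max(variances) / min(variances)

print(round(max(variances), 3), round(min(variances), 3), round(ratio, 3))
# prints: 31.081 11.923 2.607
```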
Marking the Statement for the Assumption of Homogeneity of Variance
Since we satisfied the assumption of homogeneity of variance, we mark the check box.
The Interaction Effect
The next statement asks about the interaction effect. If there is an interaction effect, the
main effects cannot be interpreted individually.
Interaction Effect - 1
The interaction between satisfaction with financial situation and computer use was not
statistically significant, F(2, 181) = 0.167, p = .846, partial eta squared = .002. The null
hypothesis of no interaction effect is not rejected.
The relationship between satisfaction with financial situation and total family income is not
contingent on the category of computer use.
Interaction Effect – 2
The non-significance of the interaction effect is supported in the profile plots, which show
the lines for the mean total income by computer usage to be approximately parallel for all
categories of satisfaction with financial situation.
Interaction Effect – 3
Just to make sure there is no interaction, we swap the variable represented by the lines with
the variable plotted on the horizontal axis. Again the lines are approximately parallel.
Marking the Statement for the Interaction Effect
Since the interaction effect was not statistically significant, the check box is left blank.
The Main Effect for Computer Use
The next two statements offer an interpretation of the main effect for computer use. We must
first determine that there is a significant main effect and then select the statement supported
by the Post Hoc test.
Main Effect for Computer Use
The main effect for total family income by computer use was statistically significant
(F(1, 181) = 30.512, p < .001, partial eta squared = 0.14). The null hypothesis that "the mean
total family income was equal across all categories of computer use" was rejected.
Interpreting the Main Effect for Computer Use - 1
When we do not have the same number of cases in the cells (an unbalanced design), the means
that we report are the Estimated Marginal Means.
Survey respondents who said they used a computer had higher total family incomes (M=17.15,
SE=0.48) compared to survey respondents who said they didn't use a computer (M=12.91, SE=0.60).
Interpreting the Main Effect for Computer Use - 2
To report the mean difference as a finding, the post hoc test must also be statistically
significant. The Bonferroni pairwise comparison of the difference (4.24) was statistically
significant (p < .001).
The statement that "survey respondents who said they used a computer had higher total family
incomes than those who said they didn't use a computer" is correct.
Marking the Statement for Main Effect for Computer Use
Since the post hoc test and the pattern of the means supported the first statement, it is
marked and the second statement is left blank.
The Main Effect for Satisfaction with Financial Situation
The next two statements offer an interpretation of the main effect for financial situation. We
must first determine that there is a significant main effect and then select the statement
supported by the Post Hoc test.
Main Effect for Satisfaction with Financial Situation
The main effect for total family income by satisfaction with financial situation was
statistically significant (F(2, 181) = 12.483, p < .001, partial eta squared = 0.12). The null
hypothesis that "the mean total family income was equal across all categories of satisfaction
with financial situation" was rejected.
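The partial eta squared values reported with each F test can be recovered from the F statistic and its degrees of freedom, because partial eta squared is the effect sum of squares divided by the sum of the effect and error sums of squares. The sketch below applies that identity to the three tests in this problem; the function name is ours.

```python
def partial_eta_squared(f, df_effect, df_error):
    """Partial eta squared computed from F and its degrees of freedom.

    Uses SS_effect / (SS_effect + SS_error), rewritten in terms of F.
    """
    return f * df_effect / (f * df_effect + df_error)

# Values taken from the SPSS output described in these slides.
interaction = partial_eta_squared(0.167, 2, 181)
computer_use = partial_eta_squared(30.512, 1, 181)
satisfaction = partial_eta_squared(12.483, 2, 181)

print(round(interaction, 3), round(computer_use, 2), round(satisfaction, 2))
# prints: 0.002 0.14 0.12
```

The results agree with the effect sizes SPSS reported for the interaction and for the two main effects.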
Interpreting the Main Effect for Satisfaction with Financial Situation - 1
The group with the highest mean was respondents who were satisfied with their financial
situation. We will interpret the effect based on this category.
Survey respondents who said they were satisfied with their present financial situation had
higher total family incomes (M=17.61, SE=0.79) compared to survey respondents who said they
were more or less satisfied with their present financial situation (M=15.05, SE=0.48).
Interpreting the Main Effect for Satisfaction with Financial Situation – 2
The Bonferroni pairwise comparison of the difference (2.56) was statistically significant
(p = .019).
Interpreting the Main Effect for Satisfaction with Financial Situation – 3
Survey respondents who said they were satisfied with their present financial situation had
higher total family incomes (M=17.61, SE=0.79) compared to survey respondents who said they
were not at all satisfied with their present financial situation (M=12.43, SE=0.69).
Interpreting the Main Effect for Satisfaction with Financial Situation – 4
The Bonferroni pairwise comparison of the difference (5.19) was statistically significant
(p < .001).
Interpreting the Main Effect for Satisfaction with Financial Situation – 5
The statement that "survey respondents who said they were satisfied with their present
financial situation had higher total family incomes than those who said they were more or less
satisfied with their present financial situation and those who said they were not at all
satisfied with their present financial situation" is correct.
Marking the Main Effect for Satisfaction with Financial Situation
Since the post hoc test and the pattern of the means supported the first statement, it is
marked and the second statement is left blank.
The Correct Answers Marked in BlackBoard
Based on the findings above, the check boxes for the correct answers are marked as shown in
this picture.
The Problem Graded in BlackBoard
When this assignment was submitted, BlackBoard indicated that all marked answers were correct,
and we received the full 10 points for the question.
Logic Diagram for Two-Factor Analysis of Variance Problems – 1
Level of measurement and sample size ok?
   No:  Do not mark check box. Mark: Inappropriate application of the statistic. Stop.
   Yes: Mark check box for correct answer.
        Ordinal dv? Yes: Mention convention in discussion of findings.
Assumption of normality ok? (skewness and kurtosis between +/-1)
   No:  Do not mark check box. Mention violation in discussion of findings.
   Yes: Mark check box for correct answer.
Assumption of homogeneity of variance ok? (Levene Sig > diagnostic alpha)
   Yes: Mark check box for correct answer.
   No:  Do not mark check box.
        Ratio of largest group variance to smallest group variance ≤ 3?
        Yes: Mention violation in discussion of findings.
        No:  Mark: Inappropriate application of the statistic. Stop.
Logic Diagram for Two-Factor Analysis of Variance Problems – 2
Interaction effect statistically significant? (Sig < alpha)
   Yes: Mark check box for correct answer. Interpret interaction using means for combined
        cells. Do not interpret main effects. Stop.
   No:  Do not mark check box.
Main effect statistically significant? (Sig < alpha)
   No:  Do not mark check box.
   Yes: Relationship for group with largest mean statistically significant and correctly
        stated?
        Yes: Mark check box for correct answer.
        No:  Do not mark check box.
   Repeat for other main effects.