Uploaded by Monica Manalastas

ANCOVA-MANOVA Notes

ANCOVA
Analysis of Covariance
• Covariate: a variable other than the independent variable that may also account for variation in the
dependent variable; it is often treated as an extraneous (potentially confounding) variable.
INTRODUCTION
• The one-way ANCOVA (analysis of covariance) can be thought of as an extension of the one-way
ANOVA to incorporate a covariate.
• Like the one-way ANOVA, the one-way ANCOVA is used to determine whether there are any significant
differences between two or more independent (unrelated) groups on a dependent variable.
• However, whereas the ANOVA looks for differences in the group means, the ANCOVA looks for
differences in adjusted means (i.e., adjusted for the covariate).
• As such, compared to the one-way ANOVA, the one-way ANCOVA has the additional benefit of allowing
you to "statistically control" for a third variable (sometimes known as a "confounding variable"), which
you believe will affect your results.
Example
• Imagine that new students are assigned at random to three different introductory statistics groups,
using three different teaching methods. They have an hour’s session.
• Group 1 has an hour of traditional ‘chalk ‘n’ talk’.
• Group 2 has an hour of the same, only the lecture is interactive in that students can interrupt and
ask questions, and the lecturer will encourage this. This is traditional plus interactive.
• Group 3 is highly interactive in that the students work in groups with guidance from the lecturer.
Problem Statement: Is there a statistically significant difference between the performance of the
students exposed to different teaching methods when controlling for IQ?
Ho: There is no statistically significant difference between the performance of the students exposed to
different teaching methods when controlling for IQ.
H1: There is a statistically significant difference between the performance of the students exposed to
different teaching methods when controlling for IQ.
• However, assume that the ability to retain material in the lecture is related to IQ, irrespective of
teaching method.
• If IQ and ability to retain such material are associated, we would expect the association to be
positive: that is, IQ and scores on the statistics test should be positively correlated.
WHY ANCOVA?
• ANCOVA gets rid of the effects due to the covariate; that is, it reduces error variance, which leads to a
larger F-value.
• ANCOVA adjusts the means on the covariate for all of the groups, which leads to an adjustment in the
means of the y variable.
PRE-EXISTING GROUPS
• Imagine a case where there are three groups of women (nightclub hostesses, part-time secretaries,
and full-time high-powered scientists)
• We wish to test the hypothesis that the more complex the occupation, the higher the testosterone
level.
• Now think of your three groups. Is it likely that the mean age of the three groups would be the same?
• Not only does ANCOVA reduce the error variance by removing the variance due to the relationship
between age (covariate) and the DV (testosterone) (the first purpose), it also adjusts the means on the
covariate for all of the groups, leading to the adjustment of the y means (testosterone).
• In other words, what ANCOVA does is to answer the question: ‘What would the means of the groups
be (on y) if the means of the three groups (on x) were all the same?’
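The adjustment described above can be sketched in a few lines of Python. This is a minimal illustration, not output from any statistics package: the function name `adjusted_means`, the group labels, and the (age, testosterone) values are all hypothetical, chosen so that the groups differ on y only because they differ on x. It assumes the usual ANCOVA adjustment, in which each group mean on y is shifted along the pooled within-group regression slope to the grand mean of the covariate.

```python
from statistics import mean


def adjusted_means(groups):
    """Return covariate-adjusted y means.

    groups maps a group name to a list of (x, y) pairs, where x is the
    covariate and y is the dependent variable.
    """
    all_x = [x for pairs in groups.values() for x, _ in pairs]
    grand_mean_x = mean(all_x)

    # Pooled within-group slope: within-group cross-products divided by
    # within-group sums of squares on the covariate, summed over groups.
    sxy = sxx = 0.0
    group_means = {}
    for name, pairs in groups.items():
        mx = mean(x for x, _ in pairs)
        my = mean(y for _, y in pairs)
        group_means[name] = (mx, my)
        sxy += sum((x - mx) * (y - my) for x, y in pairs)
        sxx += sum((x - mx) ** 2 for x, _ in pairs)
    slope = sxy / sxx

    # Move each group's y mean to where it would sit if the group's
    # covariate mean equalled the grand mean of the covariate.
    return {name: my - slope * (mx - grand_mean_x)
            for name, (mx, my) in group_means.items()}


# Hypothetical (age, testosterone) pairs, invented for illustration only.
demo_groups = {
    "hostesses": [(20, 5), (22, 6), (24, 7)],
    "secretaries": [(30, 10), (32, 11), (34, 12)],
    "scientists": [(40, 15), (42, 16), (44, 17)],
}
adjusted = adjusted_means(demo_groups)
print(adjusted)
```

In this made-up data the raw y means are 6, 11, and 16, but every adjusted mean comes out as 11.0: once the groups are placed at the same mean age, no difference between them remains.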
PRETEST-POSTTEST DESIGNS
• One of the most common designs in which ANCOVA is used is the pretest–posttest design. This consists
of a test given before an experimental condition is carried out, followed by the same test after the
experimental condition.
• When carrying out a pretest–posttest study, researchers often wish to partial out (remove, hold
constant) the effect of the pretest, in order to focus on possible change following the intervention.
ASSUMPTIONS
1. Your dependent variable and covariate variable(s) should be measured on a continuous scale
2. Your independent variable should consist of two or more categorical, independent groups
3. You should have independence of observations, which means that there is no relationship between
the observations in each group or between the groups themselves.
4. There should be no significant outliers.
5. Your dependent variable should be approximately normally distributed for each category of the
independent variable.
6. The covariate should not differ significantly across the levels of the independent variable.
7. There needs to be homogeneity of variances.
8. The covariate should be linearly related to the dependent variable at each level of the independent
variable.
9. There needs to be homogeneity of regression slopes.
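Assumption 9 is commonly checked by comparing a model that gives every group its own regression slope with a model that forces one common slope. The sketch below computes only the resulting F ratio and its degrees of freedom (k − 1 and N − 2k); the p-value is left out because the F distribution is not in the Python standard library. The function name `slopes_homogeneity_f` and the data are hypothetical.

```python
from statistics import mean


def slopes_homogeneity_f(groups):
    """Return (F, df1, df2) for the test that all groups share one slope.

    groups maps a group name to a list of (covariate, dv) pairs.
    """
    k = len(groups)
    n = sum(len(pairs) for pairs in groups.values())
    sxx_total = sxy_total = syy_total = 0.0
    sse_separate = 0.0
    for pairs in groups.values():
        mx = mean(x for x, _ in pairs)
        my = mean(y for _, y in pairs)
        sxx = sum((x - mx) ** 2 for x, _ in pairs)
        sxy = sum((x - mx) * (y - my) for x, y in pairs)
        syy = sum((y - my) ** 2 for _, y in pairs)
        # Residual sum of squares when this group gets its own slope.
        sse_separate += syy - sxy ** 2 / sxx
        sxx_total += sxx
        sxy_total += sxy
        syy_total += syy
    # Residual sum of squares when every group shares the pooled slope.
    sse_common = syy_total - sxy_total ** 2 / sxx_total
    df1, df2 = k - 1, n - 2 * k
    f = ((sse_common - sse_separate) / df1) / (sse_separate / df2)
    return f, df1, df2


# Hypothetical data: the third group is given a much steeper slope.
noise = [1, -1, 1, -1, 1, -1, 1, -1, 1, -1]
demo = {
    "morning": [(x, 1.0 * x + e) for x, e in zip(range(10), noise)],
    "afternoon": [(x, 1.0 * x + 2 + e) for x, e in zip(range(10), noise)],
    "evening": [(x, 3.0 * x + e) for x, e in zip(range(10), noise)],
}
f, df1, df2 = slopes_homogeneity_f(demo)
print(f"F({df1},{df2}) = {f:.2f}")
```

With three groups of ten the degrees of freedom are (2, 24). A large F, as in this deliberately unequal example, signals that the slopes differ and the assumption is violated.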
ANCOVA Example
• At the local university, students were randomly allocated to one of three groups for their laboratory
work – a morning group, an afternoon group, and an evening group. At the end of the session, they
were given 20 questions to determine how much they remembered from the session. Their
motivation was measured because it might affect their test score.
1. What is the IV? Time
o Levels: Morning, afternoon, evening
2. What is the DV? Test Score
3. What is the covariate? Motivation
4. What is the design of the study? Between-subjects design (randomly allocated)
5. What statistical test is applicable for this study? One-way ANCOVA
o One-way ANCOVA – one IV with covariate
o Two-way ANCOVA – more than one IV with covariate
o One-way MANCOVA – one IV, more than one DV with covariate
o Two-way MANCOVA – more than one IV, more than one DV, with covariate
6. What is our problem statement?
o Is there a significant difference in the test scores of the students exposed to morning,
afternoon, and evening classes while controlling for their level of motivation?
o Does class time affect test scores while partialling out the effect of motivation?
7. What is Ho?
o There is no significant difference in the test scores of the students exposed to morning,
afternoon, and evening classes while controlling for their level of motivation.
o Class time does not affect test scores while partialling out the effect of motivation.
OUTPUT
Descriptive statistics show that the students in the morning class had test scores of M = 14.80,
SD = 2.49. The afternoon students had M = 15.20, SD = 1.99, and the evening students had
M = 10.90, SD = 1.79. Normality was checked using the Shapiro-Wilk test, which indicated that the
scores in all groups were normally distributed: morning (W = 0.98, p = 0.943), afternoon
(W = 0.89, p = 0.184), and evening (W = 0.98, p = 0.937). Levene’s test showed no violation
of homogeneity of variance (p = 0.669). The covariate (motivation) did not differ across the levels of
the independent variable (time of classes) (F(2,29) = 1.083, p = 0.353). There was also no violation of
homogeneity of regression slopes for time and motivation (F(2,24) = 2.71, p = 0.087). A matrix
scatterplot revealed that the covariate was linearly related to test scores in every group except the
evening class.
A one-way ANCOVA was used to analyze the effect of class time on test scores while partialling
out the effect of motivation. It revealed a significant difference between the test scores of
the students exposed to morning, afternoon, and evening classes (F(2,26) = 11.57, p < 0.001). Partial η²
= 0.471, which means that class time accounts for 47.1% of the variation in test scores that is not
explained by motivation. This is considered a large effect.
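The reported effect size can be recovered from the F ratio and its degrees of freedom using the standard identity partial η² = (F × df_effect) / (F × df_effect + df_error). The short sketch below applies it to the values reported above; the function name is illustrative.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Values taken from the ANCOVA output above: F(2,26) = 11.57.
eta = partial_eta_squared(11.57, 2, 26)
print(round(eta, 3))  # 0.471
```

The result agrees with the partial η² of 0.471 given in the output, which is a quick way to check that a reported F, its degrees of freedom, and the effect size are mutually consistent.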
Furthermore, pairwise comparisons using LSD revealed that the test scores of the students in the
morning and afternoon classes were not significantly different from each other (p = 0.979). Meanwhile,
the test scores of the morning and evening classes were significantly different from each other
(p < 0.001). Lastly, the test scores of the students in the afternoon and evening classes were also
significantly different from each other (p < 0.001).
MANOVA
Multivariate Analysis of Variance
CHAPTER OVERVIEW
• What MANOVA is
• The assumptions underlying the use of MANOVA, including:
o Multivariate normality
o Homogeneity of variance-covariance matrices
• MANOVA with:
o One between-participants IV and two DVs
o One within-participants IV and two DVs
o Each of these IVs will have two conditions
o Post-hoc testing of the contribution of each individual DV to the multivariate difference
between conditions of the IV
WHY USE MULTIVARIATE ANALYSES OF VARIANCE?
1. Univariate ANOVA is not adequate at all times
2. To explore the facets of one DV
3. Multiple t-tests can increase the familywise error rate
Example
• Quite often we may have research questions where univariate ANOVA is not adequate.
• For example, suppose, like Van Cappellen et al. (2016), you were interested in the effects of religion
on well-being.
o You might want to compare the well-being of churchgoers with that of atheists
o Well-being might be measured using a number of possible indices, including:
▪ Optimism about the future
▪ Happiness
▪ Enthusiasm for life
▪ Satisfaction with personal relationships
• Do not perform multiple independent t-tests, as doing so increases the probability of committing at
least one Type I error (the familywise error rate).
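The inflation of the familywise error rate is easy to quantify. The sketch below uses the textbook formula 1 − (1 − α)^k, which assumes the k tests are independent. That is a simplification here, since the four well-being indices are probably correlated, so the figure should be read as a rough guide rather than an exact rate. The function name is illustrative.

```python
def familywise_error_rate(alpha, n_tests):
    """Probability of at least one Type I error across n independent tests."""
    return 1 - (1 - alpha) ** n_tests


# One t-test per well-being index (four indices), each at alpha = 0.05.
rate = familywise_error_rate(0.05, 4)
print(f"{rate:.3f}")
```

With four separate tests at α = 0.05 the chance of at least one false positive is about 0.185, more than three times the nominal 0.05.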
LOGIC OF MANOVA
• When we have multiple DVs, MANOVA simply forms a linear combination of the DVs and then uses
this linear combination in the analysis in place of the individual DVs.
• That is, it combines the DVs into a new variable and then uses this new variable as the single DV in
the analyses
• A linear combination is an additive combination of the DVs in which each DV can be given its own
weight; the simplest case is an unweighted sum
o Well-being = Happiness + Enthusiasm + Optimism + Relationships
o There are infinitely many ways to form a linear combination of the dependent variables.
ASSUMPTIONS OF MANOVA
• As with any parametric statistics there are a number of assumptions associated with MANOVA that
have to be met in order for the analysis to be meaningful in any way.
1. Multivariate Normality
o A set of variables is multivariate normal when each variable is normally distributed and any
linear combination of the variables is also normally distributed.
o It is worth noting that MANOVA is still a valid test even with modest violations of the
assumption of multivariate normality, particularly when we have equal sample sizes and a
reasonable number of participants in each group
o There are infinitely many ways to form a linear combination of the dependent variables
▪ Well-being = Happiness + Enthusiasm + Optimism + Relationships
▪ Well-being = Enthusiasm + Optimism + Relationships + Happiness
2. Homogeneity of variance-covariance matrices
o Homogeneity of variance covariance matrices is the multivariate version of the univariate
assumption of Homogeneity of variance.
WHICH F-VALUE?
• SPSS gives us several different multivariate tests: that is, it uses several different ways of combining
the DVs and calculating the F-value. These tests are:
o Wilks’ lambda
o Pillai’s trace – the most robust of the four; prefer it when there is a violation in the assumptions
o Hotelling’s trace
o Roy’s largest root.
• When we say multivariate difference, we simply mean a difference in terms of the linear combination
of the DVs.
• Consequently, if we are to assume that these DVs measure well-being, we would conclude that there
was a difference between the well-being of churchgoers and atheists.
POST-HOC ANALYSES OF INDIVIDUAL DVS
• In the multivariate analyses, where we have more than one DV, once we find a multivariate difference,
we need to find out which DVs are contributing to this difference.
• We need to do this because it is likely, especially if we have many DVs, that not every DV will
contribute to the overall difference that we have observed.
CORRELATED DVS
• The above post-hoc procedure is recommended when we have DVs that are not correlated with each
other.
• Problems arise, however, when the DVs are correlated
• When we get a multivariate difference in our DVs, we then have to evaluate the contribution of each
DV to this overall effect (as we have just done).
• If we have uncorrelated DVs, this means that there is no overlap between their contribution to the
linear combination of the DVs.
• In such a situation, the univariate tests give a pure indication of the contribution of each DV to the
overall difference.
CORRELATIONS
Within-MANOVA Example
• A researcher wants to find out if a new therapy is effective in treating spider phobia. He
measured the participants’ fear of spiders before and after they were exposed to the therapy. He
used the Fear of Spiders Questionnaire (FSQ), an 18-item self-report measure of fear of spiders. The
total score on the questionnaire gives an indication of a person’s level of fear of spiders. The higher
the score on the questionnaire, the higher the fear. He also used the Behavioral Approach Test (BAT),
in which participants are asked to move nearer a live spider in stages until eventually the spider is
placed in the palm of their hand. A score of 0 was given if the participant refused to enter the room
where the spider was (maximum fear) and a score of 12 was given if the participant was able to have
the spider in the palm of their hand for at least 20 seconds (minimum fear).
1. What is the IV? The intervention / therapy
o Levels: Before therapy and after therapy
2. What is the DV? Spider Phobia – FSQ Score and BAT score
3. What is the covariate? None
4. What is the design of the study? Within-subjects design
5. What statistical test is applicable for this study? MANOVA
6. What is our problem statement?
o Is there a significant difference in the participants’ overall level of fear of spiders before
and after therapy?
o Is there a significant difference in the participants’ level of fear of spiders in terms of FSQ
before and after therapy?
o Is there a significant difference in the participants’ level of fear of spiders in terms of BAT
before and after therapy?
7. What is Ho?
o There is no significant difference in the participants’ overall level of fear of spiders before
and after therapy.
o There is no significant difference in the participants’ level of fear of spiders in terms
of FSQ before and after therapy.
o There is no significant difference in the participants’ level of fear of spiders in terms
of BAT before and after therapy.
OUTPUT
Descriptive statistics show that the participants’ level of fear of spiders on the FSQ before the
therapy was M = 77.18, SD = 33.97. On the BAT, their score before the therapy was M = 6.27,
SD = 3.02. After the therapy, their FSQ scores went down, M = 71.23, SD = 27.67, indicating less fear.
Their BAT scores went up, M = 7.50, SD = 2.45, which also indicates less fear because higher BAT
scores reflect lower fear.
Normality was checked using the Shapiro-Wilk test and there were no violations: FSQ_Pre (W = 0.94,
p = 0.217), FSQ_Post (W = 0.97, p = 0.751), BAT_Pre (W = 0.96, p = 0.539), BAT_Post (W = 0.94, p = 0.207).
Box plots showed that the data for each DV in each condition of the IV were approximately normally
distributed and therefore we can be reasonably confident that we have no major outliers and violations
of the assumption of multivariate normality.
A repeated-measures MANOVA was used to determine the effectiveness of the new therapy using FSQ
and BAT as the dependent variables. This revealed that there was a significant multivariate difference
between the pre- and posttreatment conditions (F(2,20) = 5.58, Wilks’ Λ = 0.642, p = 0.012). The partial
η² = 0.358, which is considered a large effect: 35.8% of the variation in the combined fear-of-spiders
measures is attributable to the therapy.
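When the IV has only two conditions, as it does here, Wilks’ Λ converts to an exact F ratio and to partial η² through two simple identities: F = ((1 − Λ) / Λ) × (df_error / df_hypothesis) and partial η² = 1 − Λ. The sketch below applies them to the values reported above. The function names are illustrative, and the shortcut is only assumed to hold for the two-condition case.

```python
def wilks_to_f(wilks_lambda, df_hypothesis, df_error):
    """Exact F for Wilks' lambda when the IV has two conditions."""
    return ((1 - wilks_lambda) / wilks_lambda) * (df_error / df_hypothesis)


def wilks_partial_eta_squared(wilks_lambda):
    """Partial eta squared for Wilks' lambda when the IV has two conditions."""
    return 1 - wilks_lambda


# Values taken from the output above: Wilks' lambda = 0.642, F(2,20).
f_value = wilks_to_f(0.642, 2, 20)
eta = wilks_partial_eta_squared(0.642)
print(round(f_value, 2), round(eta, 3))  # 5.58 0.358
```

Both figures match the reported F(2,20) = 5.58 and partial η² = 0.358, so a smaller Λ corresponds directly to a larger multivariate effect.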
As the two dependent variables are uncorrelated, the univariate test is valid to determine the
participants’ level of fear of spiders with respect to each dependent variable. There was no significant
difference in their level of fear of spiders on the self-report measure, FSQ, before and after the therapy
(F(1,21) = 2.18, p = 0.154). Meanwhile, there was a significant difference in their level of fear of spiders
using BAT (F(1,21) = 10.90, p = 0.003).
Between-MANOVA Example
• A researcher is interested to know how religion (belief) affects well-being. He gathered two groups
of participants: the first group are churchgoers, and the second group are atheists. He wanted to
measure well-being in two ways: happiness and optimism. The following are the participants’ scores
on the happiness and optimism scales.
1. How many IVs? One
2. What is the IV? Religion / belief
o Levels: Churchgoers and Atheists
3. How many DVs? Two
4. What are the DVs? Well-being measured in happiness and optimism
5. What is the covariate? None
6. What is the design of the study? Between-subjects design
7. What statistical test is applicable for this study? One-way MANOVA
8. What is our problem statement?
o Is there a significant difference between the happiness level of churchgoers and atheists?
o Is there a significant difference between the optimism level of churchgoers and atheists?
o Is there a multivariate difference between the well-being of churchgoers and atheists?
9. What is Ho?
o There is no significant difference between the happiness level of churchgoers and atheists.
o There is no significant difference between the optimism level of churchgoers and atheists.
o There is no multivariate difference between the well-being of churchgoers and atheists.
OUTPUT
Descriptive statistics show that the churchgoers’ average happiness was M = 6.50, SD = 1.45. For atheists,
average happiness was M = 6.00, SD = 1.54. Meanwhile, the optimism level of churchgoers was M
= 5.50, SD = 1.45, and that of atheists was M = 3.50, SD = 1.00. Multivariate normality was not
violated, as all the dependent variables were normally distributed at each level of the independent
variable: churchgoers’ happiness (W = 0.97, p = 0.897), atheists’ happiness (W = 0.94, p = 0.513),
churchgoers’ optimism (W = 0.96, p = 0.897), and atheists’ optimism (W = 0.91, p = 0.187). Box plots showed
that the data for each DV in each condition of the IV were approximately normally distributed and
therefore we can be reasonably confident that we have no major outliers and violations of the
assumption of multivariate normality. The assumption of homogeneity of variance-covariance matrices
was also not violated (Box’s M = 1.51, p = 0.715).
A one-way MANOVA was used to determine the multivariate difference between churchgoers’ and
atheists’ well-being. This revealed that the difference in their well-being is significant (F(2,21) = 7.55,
Wilks’ Λ = 0.582, p = 0.003). The partial η² = 0.418, which is a large effect. This means that 41.8% of the
variation in the participants’ combined well-being measures is attributable to the difference in belief.
As the two dependent variables are uncorrelated, the univariate test is valid to determine the
participants’ differences in optimism and happiness. There is no significant difference in their happiness
level (F(1,22) = 0.67, p = 0.421). There is a significant difference in their optimism level (F(1,22) = 15.53,
p = 0.001). This suggests that people who do not hold religious beliefs can be as happy as those who
do, whereas the churchgoers in this sample reported a more optimistic outlook on life than the
atheists.