Multivariate meta-analysis Session 3.2

Funded through the ESRC’s Researcher
Development Initiative
Session 3.2: Multivariate meta-analysis
Prof. Herb Marsh
Ms. Alison O’Mara
Dr. Lars-Erik Malmberg
Department of Education,
University of Oxford
[Flowchart: stages of a meta-analysis]
Establish research question → Define relevant studies → Locate and collate studies → Develop coding materials → Pilot coding; coding → Data entry and effect size calculation → Main analyses → Supplementary analyses
 Involves the analysis of multiple outcomes
simultaneously
 Multiple outcomes could be due to:
 Different outcomes (e.g., math achievement and verbal
achievement)
 Correlations with multiple variables (e.g., age with
achievement and age with aspirations)
 Evaluation of different treatments in the same publication
 More than one control/comparison group
 Violations of independence occur when studies
produce multiple effect sizes due to the presence of
multiple treatment groups or multiple outcome
measures
 Effect sizes from the same study are likely to be
more highly correlated than effect sizes from
different studies
 Issue of within versus between-study variation
 Choose one outcome of interest
 Separate analyses on each outcome
 Averaging the effect sizes (one outcome study)
 Shifting unit of analysis (Cooper, 1998)
 Multivariate multilevel modelling
 Select the outcome that is of most interest
 This is appropriate for many research questions
 However, it does not allow contrasts between
outcomes, thereby restricting the questions you
can ask
 For each analysis, each study contributes only one
outcome (effect size)
 E.g., run separate analyses on maths achievement
effect sizes, and a different set of analyses on the
verbal achievement effect sizes
 The effect sizes are independent within each
particular analysis, but this approach does not allow
direct comparison between the outcomes
 Therefore, this may not always make sense for the
research question under consideration (Rosenthal
& Rubin, 1986)
 Establish an independent set of effect sizes by
calculating the average of the effect sizes in the
study
 E.g., achievement, intelligence, satisfaction,
personality, obesity
 However, the dependent variables need to be
almost perfectly correlated for this method to work;
otherwise the mean effect size underestimates the
true effect (Rosenthal & Rubin, 1986)
 To make the results meaningful, outcomes should
be conceptually similar
 The outcomes are aggregated depending on the
level of analysis of interest—the study or outcome
level
 At the study level, all effect sizes from within a
study are aggregated to produce one outcome per
study
 For each moderator analysis, effect sizes are
aggregated based upon the particular moderator
variable, such that each study only includes one
effect size per outcome on that particular variable
 The effect sizes for two self-concept domains (e.g.,
physical and academic self-concept) from the same
primary study would initially be averaged to
produce a single effect size for calculations
involving the overall effect size for the sample
(study level)
 For the moderator analyses, the two self-concept
domains would be considered separately if the type
of domain was of interest, but would be aggregated
if the moderator variable of interest was, say, the
type of control group
 This means that the n of effect sizes contributing to
the analysis will change depending on the variables
being examined
Total n of effect sizes = 14; 4 publications; 2 outcomes per row (effect size = difference between treatment & control group); 2 interventions (1 = math intervention, 2 = verbal intervention)

Publication | Math achieve | Verbal achieve | Intervention
1 | .7 | .3 | 1
1 | .5 | .7 | 2
2 | .3 | .6 | 2
3 | .8 | .2 | 1
3 | .3 | .9 | 2
4 | .4 | .8 | 2
4 | .3 | .9 | 2
Study-level aggregation (moderator = intervention type): all effect sizes from the same publication and intervention are averaged into one.

Publication | Achieve | Intervention
1 | .5 | 1
1 | .6 | 2
2 | .45 | 2
3 | .5 | 1
3 | .6 | 2
4 | .6 | 2

For study 4, effect size = (.4 + .3 + .8 + .9)/4 = .6
Only publication 4 has more than one of the same type of intervention
Total n of effect sizes = 6
One effect size per study for maths interventions, one per study for verbal interventions
Outcome-level aggregation: one math effect size per study, one verbal effect size per study.

Publication | Math achieve | Verbal achieve | Intervention
1 | .7 | .3 | 1
1 | .5 | .7 | 2
2 | .3 | .6 | 2
3 | .8 | .2 | 1
3 | .3 | .9 | 2
4 | .4 | .8 | 2
4 | .3 | .9 | 2

Calculate the average for math for study 1: (.7 + .5)/2 = .6
Calculate the average for verbal for study 1: (.3 + .7)/2 = .5

Publication | Math achieve | Verbal achieve
1 | .6 | .5
2 | .3 | .6
3 | .55 | .55
4 | .35 | .85

Total n of effect sizes = 8
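The two shifting-unit aggregations above can be sketched in plain Python. This is a minimal illustration using the worked example's effect sizes; the variable names and rounding are our own, not part of the original method description.

```python
from collections import defaultdict
from statistics import mean

# Effect sizes from the worked example:
# (publication, math ES, verbal ES, intervention)
rows = [
    (1, .7, .3, 1), (1, .5, .7, 2),
    (2, .3, .6, 2),
    (3, .8, .2, 1), (3, .3, .9, 2),
    (4, .4, .8, 2), (4, .3, .9, 2),
]

# Outcome-level shift: one math and one verbal effect size per publication
by_pub = defaultdict(lambda: {"math": [], "verbal": []})
for pub, d_math, d_verbal, _ in rows:
    by_pub[pub]["math"].append(d_math)
    by_pub[pub]["verbal"].append(d_verbal)
outcome_level = {pub: (round(mean(v["math"]), 2), round(mean(v["verbal"]), 2))
                 for pub, v in by_pub.items()}
# 4 studies x 2 outcomes -> n = 8 effect sizes

# Study-level shift with intervention type as the moderator: every effect
# size from the same publication x intervention cell is averaged into one
by_cell = defaultdict(list)
for pub, d_math, d_verbal, intervention in rows:
    by_cell[(pub, intervention)].extend([d_math, d_verbal])
study_level = {cell: round(mean(ds), 2) for cell, ds in by_cell.items()}
# 6 publication x intervention cells -> n = 6 effect sizes
```

Note how the n of effect sizes changes with the unit of analysis, exactly as the slides describe.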
 Although this strategic compromise does not
eliminate the problem of independence, this
approach minimizes violations of assumptions
about the independence of effect sizes, whilst
preserving as much of the data as possible
(Cooper, 1998)
 Probably the most popular way of dealing with
multiple outcomes in fixed and random effects
models when explicitly interested in comparing
different outcomes
 Multilevel modelling accounts for dependencies in
the data because its nested structure allows for
correct estimation of standard errors on parameter
estimates and therefore accurate assessment of the
significance of predictor variables (Bateman &
Jones, 2003; Hox & de Leeuw, 2003; Raudenbush &
Bryk, 2002).
 Meta-analytic data is inherently hierarchical (i.e.,
effect sizes nested within studies) and has random
error that must be accounted for
 Effect sizes are not necessarily independent
 Multilevel modelling allows for multiple effect sizes
per study
 It also provides more precise and less biased
estimates of between-study variance than
traditional techniques
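To give a feel for why the joint covariance structure matters, here is a minimal fixed-effect bivariate pooling step in plain Python. This is only a sketch with invented study values, not the random-effects multilevel model fitted in MLwiN in this session: each study i supplies a pair of effect sizes d_i = (d_verbal, d_math) and a 2x2 within-study covariance matrix S_i, and the pooled mean is (sum of S_i^-1)^-1 (sum of S_i^-1 d_i).

```python
def inv2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def mat_vec(m, v):
    """2x2 matrix times length-2 vector."""
    return [m[0][0] * v[0] + m[0][1] * v[1],
            m[1][0] * v[0] + m[1][1] * v[1]]

def pooled_bivariate(ds, Ss):
    """Fixed-effect GLS pooled mean of bivariate effect sizes."""
    W = [[0.0, 0.0], [0.0, 0.0]]   # running sum of weight matrices S_i^-1
    z = [0.0, 0.0]                 # running sum of S_i^-1 d_i
    for d, S in zip(ds, Ss):
        Wi = inv2(S)
        for r in range(2):
            for c in range(2):
                W[r][c] += Wi[r][c]
        wd = mat_vec(Wi, d)
        z[0] += wd[0]
        z[1] += wd[1]
    return mat_vec(inv2(W), z)

# Invented example: three studies with identical covariance matrices;
# the GLS pooled mean then reduces to the simple componentwise mean
d_pairs = [[0.1, 0.3], [0.2, 0.5], [0.3, 0.4]]
S = [[0.04, 0.01], [0.01, 0.05]]
pooled = pooled_bivariate(d_pairs, [S, S, S])
```

When the S_i differ across studies, the off-diagonal (covariance) entries shift weight between the two outcomes, which is what the multilevel model exploits.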
 Scholastic Aptitude Test (SAT) coaching
effectiveness data reported in Kalaian and
Raudenbush (1996), and Kalaian & Kasim (in press)
 The differences between the coached and
uncoached groups on SAT scores in the collection
of the SAT coaching effectiveness studies
 SAT tests are widely claimed to be so broad and
generic (almost IQ-like) that they could not be
affected by a short-term training programme. Others
suggest that a limited amount of "familiarisation" is
useful but not much beyond this (i.e., a non-linear
effect of hours).
 Meta-analytic data information:
 Study ID
 Constant (cons)
 Effect-size for verbal SAT scores (dv)
 Effect-size for maths SAT scores (dm)
 Sampling variance (SE) and covariance (cov_VM) of the
effect sizes
 Explanatory variables (study and sample characteristics)
(hours, logHR, year)
1. Click on "responses" at the bottom
of the screen
2. Select "dv" and "dm" (the effect
sizes for verbal and maths
achievement, respectively)
1. Click on the equation
2. Indicate a two-level model
with L2 = study, L1 =
resp_indicator
3. Click "done" button
1. Click "add term"
2. Select variable "cons"
3. Click "add separate
coefficients" button
4. Click "Estimates"
1. Right-click on
cons.dv in the
equation
2. Select j
3. Click on Done
4. Right-click on
cons.dm in the
equation
5. Select j
6. Click on Done
Click on “Estimates”
Your screen should look like this
1. Click "add term"
2. Select variable “SE_V”
3. Click “add separate
coefficients” button
1. Click on
“estimates” to
reveal numbers
2. Right click on
SE_V.dm
3. Click on
“Delete Term”
1. Click "add term"
2. Select variable “SE_M”
3. Click “add separate
coefficients” button
1. Right click on
SE_M.dv
2. Click on
“Delete Term”
1. Right-click on SE_V.dv in the equation
2. Select j, unselect Fixed parameter
3. Click on Done
4. Right-click on SE_M.dm in the equation
5. Select j, unselect Fixed parameter
6. Click on Done
Your model now looks like this. Some of the
parameters in the random part of the model (the u’s) do
not make sense to be estimated. The only random
parameters that we want are those on the diagonal in
the variance-covariance matrix.
You can delete the unnecessary random
parameters by clicking on them. For example,
click on
u02
The following screen will pop up. Click on Yes
Delete all of the off-diagonal random
parameters for the SEs, until your variance-covariance matrix looks like this
Now we need to add the covariance value
Click on “add term”
Then select the covariance term, cov_VM,
and click on add Common coefficient
The following
window will pop
up
Select dv and dm
Click Done
 The covariance term needs to be manually
calculated (see Kalaian & Kasim, in press)
 The formula is

cov(d_{ip}, d_{ip'}) = \frac{n_{1ip} + n_{2ip}}{n_{1ip}\, n_{2ip}}\, r_{ip,ip'} + \frac{r_{ip,ip'}^{2}\, d_{ip}\, d_{ip'}}{2\,(n_{1ip} + n_{2ip})}

 Where n1 and n2 are the sample sizes for the 2
groups
 r_{ip,ip'} is the correlation between the two outcomes
 d_{ip} and d_{ip'} are the two effect size outcomes
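The formula can be transcribed directly into a small helper function. This is a minimal Python sketch with our own variable names; the example numbers are invented.

```python
def cov_effect_sizes(n1, n2, d1, d2, r):
    """Sampling covariance between two standardized mean differences
    (d1, d2) computed on the same treatment/control groups, where n1 and
    n2 are the group sample sizes and r is the correlation between the
    two outcomes (the formula from Kalaian & Kasim, in press)."""
    return (n1 + n2) / (n1 * n2) * r + (r ** 2 * d1 * d2) / (2 * (n1 + n2))

# e.g. two outcomes correlated at r = .66 in a study with 50 per group:
cov_vm = cov_effect_sizes(50, 50, 0.3, 0.5, 0.66)
```

A value like this would be entered as 'cov_VM' for each study before fitting the model.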
1. Click β4cov_VM.12j
2. Select the options as
below
Your equation window should now look like
this. Delete the off-diagonal covariance
components for u4j by clicking on them
Your equation window should now look like this.
Under “Model” in the menu bar, click on
“Constrain Parameters”
The following window should pop up
Click on the “random”
radio button
1
2
3
4
1. Select the SE and cov_VM
variances to be constrained
by entering a ‘1’ in the
boxes
2. Set them “to equal” 1
3. Choose a free column to
store the constraint matrix
in. In this case, we used
C20
4. Click on attach random
constraints
5. Go back to the “equations”
window
Select “Estimation” from the menu bar, then RIGLS
Click Done when the window pops up
Your model should look something like this... (you may need
to click on “Estimates” to show the numbers in blue font)
Click on START when you are ready to run the model
This is the mean effect size for the Verbal SAT scores
This is the mean effect size for the Maths SAT scores
This is the between-study random effects for Verbal SAT scores
This is the between-study
random effects for Maths
SAT scores
 On average, students scored higher on maths SAT
than verbal SAT
 However, variance was larger for maths
 There was no significant between-study variation
for maths (.012) or verbal (.004) SAT scores
 Given that there is no significant between-study
variation, we would not normally fit the model with
predictors.
[Figure 4. Box plot of SAT-Verbal and SAT-Math effect sizes; effect-size axis from -0.4 to 0.6]
 Let’s look at a predictor anyway for demonstration
purposes!
 Test whether a coaching intervention improves
maths and verbal SAT scores
 Will the effects (size, direction, & significance) of
the coaching be the same for the two outcomes?
1. Add Term
2. Select LogEHR (not LogHR)
3. Click on grand mean (mean = 19 hours)
4. Click on add separate coefficients
5. Run the model (“start”)
Studies with log coaching hours > 2.75 (about 15 non-logged,
'raw' hours) have a very nearly significant positive effect on
Verbal SAT scores (β = .102), and a significant positive effect on
Maths SAT scores (β = .290)
 Calculating the covariance (in this case, ‘cov_VM’)
requires knowing the correlation between the
outcomes
 Often, primary studies do not report the
correlations between the outcomes. Some methods
are being developed that bypass this problem
 Riley, Thompson, & Abrams (2008): An alternative model
for bivariate random-effects meta-analysis when the
within-study correlations are unknown
 However, these are confined to bivariate studies
 What to do when there are more than 2 outcomes?
 Model will become very complex
 Currently under development
 The multivariate results account for the
covariance between the verbal SAT and Maths
SAT effect sizes.
 Kalaian, S. A., & Kasim, R. M. (in press).
Applications of multilevel models for meta-analysis.
In A. O'Connell & B. D. McCoach (Eds.), Multilevel
analysis of educational data. Information Age
Publishing.
 Kalaian, H. A., & Raudenbush, S. W. (1996). A
multivariate mixed-effects linear model for
meta-analysis. Psychological Methods, 1(3), 227-235.
 Riley, R. D., Thompson, J. R., & Abrams, K. R. (2008).
An alternative model for bivariate random-effects
meta-analysis when the within-study correlations
are unknown. Biostatistics, 9, 172-186.