Introduction to meta

Wim Van den Noortgate
Katholieke Universiteit Leuven, Belgium
Belgian Campbell Group
Wim.VandenNoortgate@kuleuven-kortrijk.be
Workshop systematic reviews
Leuven June 4-6, 2012
1. Modelling heterogeneity
2. Publication bias
Growing popularity of evidence-based
thinking:
Decisions in practice and policy should be
based on scientific research about the effects
of these decisions/interventions
But: conflicting results (failures to replicate),
especially in social sciences!
1. The role of chance
- in measuring variables
- in sampling study participants
2. Study results may be systematically biased due to
- the way variables are measured
- the way the study is set up
3. Studies differ from each other (e.g., in the kind of
treatment, the duration of treatment, the dependent
variable, the characteristics of the investigated
population, …)
Differences between observed effect sizes due to
chance only
Population effect sizes all equal
($\delta_1 = \delta_2 = \dots = \delta_k$)
$H_0: \delta_1 = \delta_2 = \dots = \delta_k$
$H_a$: at least one $\delta_j$ differs from another
$Q = \sum_{j=1}^{k} w_j (g_j - \hat{\delta})^2$
Under $H_0$: $Q \sim \chi^2_{(k-1)}$
$I^2 = \frac{Q - df}{Q} \times 100\%$
= percentage of variability in effect estimates due
to heterogeneity rather than chance
Rough guidelines:
0% to 40%: might not be important
30% to 60%: may represent moderate heterogeneity
50% to 90%: may represent substantial heterogeneity
75% to 100%: considerable heterogeneity
Interpretation based on both I² and heterogeneity test!
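A minimal sketch of the two statistics above, applied to the teacher expectancy data (Raudenbush, 1984) used in these slides. It assumes the usual inverse-variance weights w_j = 1/σ²(g_j), which the slides do not spell out, reads the last data column as standard errors, and truncates I² at zero (a common convention). The function names are illustrative only. Because the tabulated values are rounded, Q comes out close to, but not exactly at, the reported 35.83.

```python
from math import fsum

# Teacher expectancy data (Raudenbush, 1984): observed effect sizes g_j and
# the column assumed here to hold their standard errors.
g = [0.03, 0.12, -0.14, 1.18, 0.26, -0.06, -0.02, -0.32, 0.27, 0.80,
     0.54, 0.18, -0.02, 0.23, -0.18, -0.06, 0.30, 0.07, -0.07]
se = [0.13, 0.15, 0.17, 0.37, 0.37, 0.10, 0.10, 0.22, 0.16, 0.25,
      0.30, 0.22, 0.29, 0.29, 0.16, 0.17, 0.14, 0.09, 0.17]


def homogeneity_q(effects, std_errors):
    """Return (Q, df, pooled estimate) using inverse-variance weights."""
    weights = [1.0 / s ** 2 for s in std_errors]
    pooled = fsum(w * e for w, e in zip(weights, effects)) / fsum(weights)
    q = fsum(w * (e - pooled) ** 2 for w, e in zip(weights, effects))
    return q, len(effects) - 1, pooled


def i_squared(q, df):
    """I^2 in percent, truncated at 0 when Q is smaller than df."""
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q * 100.0)


q, df, pooled = homogeneity_q(g, se)
print(f"Q = {q:.2f}, df = {df}, I^2 = {i_squared(q, df):.0f}%")
```

Q is then compared with a chi-square distribution with k - 1 degrees of freedom; the p-value is left out here to keep the sketch free of third-party libraries.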
(Raudenbush, S. W. (1984). Magnitude of teacher expectancy effects on pupil IQ as a
function of the credibility of expectancy induction: A synthesis of findings from 18
experiments. Journal of Educational Psychology, 76, 85-97.)
Study                               Weeks prior contact   g_j     σ(g_j)
 1. Rosenthal et al. (1974)         2                      0.03   0.13
 2. Conn et al. (1968)              3                      0.12   0.15
 3. Jose & Cody (1971)              3                     -0.14   0.17
 4. Pellegrini & Hicks (1972)       0                      1.18   0.37
 5. Pellegrini & Hicks (1972)       0                      0.26   0.37
 6. Evans & Rosenthal (1969)        3                     -0.06   0.10
 7. Fielder et al. (1971)           3                     -0.02   0.10
 8. Claiborn (1969)                 3                     -0.32   0.22
 9. Kester & Letchworth (1972)      0                      0.27   0.16
10. Maxwell (1970)                  1                      0.80   0.25
11. Carter (1970)                   0                      0.54   0.30
12. Flowers (1966)                  0                      0.18   0.22
13. Keshock (1970)                  1                     -0.02   0.29
14. Henrickson (1970)               2                      0.23   0.29
15. Fine (1972)                     3                     -0.18   0.16
16. Greiger (1970)                  3                     -0.06   0.17
17. Rosenthal & Jacobson (1968)     1                      0.30   0.14
18. Fleming & Anttonen (1971)       2                      0.07   0.09
19. Ginsburg (1970)                 3                     -0.07   0.17
Q = 35.83, df = 18, I² = 50%, p = .007



Making the set of studies more homogeneous is not always wise!
Heterogeneity can help to say something about ‘fruit’
Heterogeneity can help to draw detailed conclusions:
does the effect depend on the kind of fruit?
Population effect size possibly depends on study
category
Differences between observed effect sizes within
the same category due to chance only
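$Q = \sum_{j=1}^{k} w_j (g_j - \hat{\delta})^2$
Total variability in observed ES’s ($Q_T$) = variability between groups ($Q_B$) + variability within groups ($Q_W$)
With $k$ studies and $J$ groups:
$Q_T$: homogeneity test, under $H_0$: $Q_T \sim \chi^2_{k-1}$
$Q_B$: moderator test, under $H_0$: $Q_B \sim \chi^2_{J-1}$
$Q_W$: test for within-group homogeneity, under $H_0$: $Q_W \sim \chi^2_{k-J}$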
             χ²      df    p
Q total      35.83   18    0.007
Q between    20.38    3    0.0001
Q within     15.45   15    0.42
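The decomposition of Q can be sketched as follows for the teacher expectancy data, treating weeks of prior contact (0, 1, 2, 3) as a categorical moderator. This is an illustration under assumptions: inverse-variance weights, the last data column read as standard errors, and hypothetical function names. With the rounded table values the output only approximates the figures reported above.

```python
from collections import defaultdict
from math import fsum

weeks = [2, 3, 3, 0, 0, 3, 3, 3, 0, 1, 0, 0, 1, 2, 3, 3, 1, 2, 3]
g = [0.03, 0.12, -0.14, 1.18, 0.26, -0.06, -0.02, -0.32, 0.27, 0.80,
     0.54, 0.18, -0.02, 0.23, -0.18, -0.06, 0.30, 0.07, -0.07]
se = [0.13, 0.15, 0.17, 0.37, 0.37, 0.10, 0.10, 0.22, 0.16, 0.25,
      0.30, 0.22, 0.29, 0.29, 0.16, 0.17, 0.14, 0.09, 0.17]


def weighted_mean(effects, weights):
    return fsum(w * e for w, e in zip(weights, effects)) / fsum(weights)


def weighted_q(effects, weights):
    """Weighted sum of squared deviations from the weighted mean."""
    mean = weighted_mean(effects, weights)
    return fsum(w * (e - mean) ** 2 for w, e in zip(weights, effects))


def q_decomposition(effects, std_errors, groups):
    """Return (Q_T, Q_B, Q_W) for a categorical moderator."""
    weights = [1.0 / s ** 2 for s in std_errors]
    grand_mean = weighted_mean(effects, weights)
    by_group = defaultdict(lambda: ([], []))
    for e, w, grp in zip(effects, weights, groups):
        by_group[grp][0].append(e)
        by_group[grp][1].append(w)
    # Within: heterogeneity around each group's own weighted mean.
    q_within = fsum(weighted_q(es, ws) for es, ws in by_group.values())
    # Between: weighted squared distance of group means from the grand mean.
    q_between = fsum(fsum(ws) * (weighted_mean(es, ws) - grand_mean) ** 2
                     for es, ws in by_group.values())
    return weighted_q(effects, weights), q_between, q_within


q_t, q_b, q_w = q_decomposition(g, se, weeks)
df_b = len(set(weeks)) - 1          # J - 1
df_w = len(g) - len(set(weeks))     # k - J
print(f"Q_T = {q_t:.2f}, Q_B = {q_b:.2f} (df {df_b}), Q_W = {q_w:.2f} (df {df_w})")
```

Computing Q_B directly, rather than as Q_T minus Q_W, makes it possible to check numerically that the two parts add up to the total.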
[Figure: observed effect sizes (ES) for the 3 tasks (semantic categorization, lexical decision, naming), with the mean ES under the REM marked.]
Population effect size possibly depends on
continuous study characteristic
e.g., $\delta_j = \gamma_0 + \gamma_1 x_{1j} + \dots + \gamma_p x_{pj}$
After taking into account this study characteristic,
differences between observed effect sizes due to
chance only
The initial effect is moderate (0.41, p < .001), but
decreases with increasing prior contact (by
0.16 per week, p < .001)
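The estimates quoted here can be approximated with a weighted least squares regression of the observed effect sizes on weeks of prior contact. The sketch below assumes inverse-variance weights and a single moderator, and uses the closed-form solution for a two-parameter model; the function name is hypothetical. With the rounded table values the result should land near the reported 0.41 and -0.16, not exactly on them.

```python
from math import fsum, sqrt

weeks = [2, 3, 3, 0, 0, 3, 3, 3, 0, 1, 0, 0, 1, 2, 3, 3, 1, 2, 3]
g = [0.03, 0.12, -0.14, 1.18, 0.26, -0.06, -0.02, -0.32, 0.27, 0.80,
     0.54, 0.18, -0.02, 0.23, -0.18, -0.06, 0.30, 0.07, -0.07]
se = [0.13, 0.15, 0.17, 0.37, 0.37, 0.10, 0.10, 0.22, 0.16, 0.25,
      0.30, 0.22, 0.29, 0.29, 0.16, 0.17, 0.14, 0.09, 0.17]


def wls_meta_regression(effects, std_errors, x):
    """Fixed effects meta-regression with one continuous moderator.

    Returns (intercept, slope, se_intercept, se_slope).
    """
    w = [1.0 / s ** 2 for s in std_errors]
    sw = fsum(w)
    swx = fsum(wi * xi for wi, xi in zip(w, x))
    swxx = fsum(wi * xi * xi for wi, xi in zip(w, x))
    swg = fsum(wi * gi for wi, gi in zip(w, effects))
    swxg = fsum(wi * xi * gi for wi, xi, gi in zip(w, x, effects))
    det = sw * swxx - swx ** 2          # determinant of X'WX
    slope = (sw * swxg - swx * swg) / det
    intercept = (swg - slope * swx) / sw
    # Standard errors from the diagonal of the inverse of X'WX.
    return intercept, slope, sqrt(swxx / det), sqrt(sw / det)


b0, b1, se_b0, se_b1 = wls_meta_regression(g, se, weeks)
print(f"intercept = {b0:.2f} ({se_b0:.3f}), weeks = {b1:.2f} ({se_b1:.3f})")
```

The intercept is the expected effect with zero weeks of prior contact, and the slope is the change in effect per additional week.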
Population effect size possibly varies randomly
over studies
Differences between observed effect sizes are due
to
- chance
- ‘true’ differences
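One way to put numbers on this model is the DerSimonian-Laird moment estimator of the between-study variance, sketched below. This is a swapped-in, simple technique chosen for illustration; the slides do not say which estimator produced the REM results reported later (intercept 0.084, between-study variance 0.019), and those are presumably likelihood-based, so the output of this sketch need not coincide with them. Inverse-variance weights and the function name are assumptions.

```python
from math import fsum, sqrt

g = [0.03, 0.12, -0.14, 1.18, 0.26, -0.06, -0.02, -0.32, 0.27, 0.80,
     0.54, 0.18, -0.02, 0.23, -0.18, -0.06, 0.30, 0.07, -0.07]
se = [0.13, 0.15, 0.17, 0.37, 0.37, 0.10, 0.10, 0.22, 0.16, 0.25,
      0.30, 0.22, 0.29, 0.29, 0.16, 0.17, 0.14, 0.09, 0.17]


def random_effects_dl(effects, std_errors):
    """Return (between-study variance, mean effect, SE of the mean)."""
    v = [s ** 2 for s in std_errors]
    w = [1.0 / vi for vi in v]
    sw = fsum(w)
    fixed = fsum(wi * gi for wi, gi in zip(w, effects)) / sw
    q = fsum(wi * (gi - fixed) ** 2 for wi, gi in zip(w, effects))
    c = sw - fsum(wi ** 2 for wi in w) / sw
    # Moment estimator, truncated at zero when Q is below its df.
    tau2 = max(0.0, (q - (len(effects) - 1)) / c)
    # Random effects weights add the between-study variance to each
    # sampling variance, so small and large studies are weighted more evenly.
    w_star = [1.0 / (vi + tau2) for vi in v]
    mean = fsum(wi * gi for wi, gi in zip(w_star, effects)) / fsum(w_star)
    return tau2, mean, sqrt(1.0 / fsum(w_star))


tau2, mean, se_mean = random_effects_dl(g, se)
print(f"between-study variance = {tau2:.3f}, mean ES = {mean:.3f} ({se_mean:.3f})")
```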
Population effect size possibly depends on study
category
Differences between observed effect sizes within
the same category are due to
- chance
- ‘true’ differences
Population effect size possibly depends on
continuous study characteristic
e.g., $\delta_j = \gamma_0 + \gamma_1 x_{1j} + \dots + \gamma_p x_{pj} + u_j$
After taking into account this study
characteristic, differences between observed
effect sizes are due to
- chance
- ‘true’ differences
Random effects model with moderators:
◦ The least restrictive model: allows moderator
variables & random variation
◦ Also called a ‘Mixed effects model’
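Written out as a two-level model, the mixed effects model can be sketched as follows. Here g_j is the observed effect size of study j, δ_j its population effect size, the γ's are regression coefficients and u_j is the random study deviation. The sampling error e_j and the two normality assumptions are standard conventions assumed here, not taken from the slides.

```latex
\begin{aligned}
g_j      &= \delta_j + e_j,
         & e_j &\sim N\!\left(0, \sigma^2(g_j)\right) \\
\delta_j &= \gamma_0 + \gamma_1 x_{1j} + \dots + \gamma_p x_{pj} + u_j,
         & u_j &\sim N\!\left(0, \sigma_u^2\right)
\end{aligned}
```

Setting the between-study variance $\sigma_u^2$ to zero gives the fixed effects model with moderators; dropping the moderators gives the random effects model.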
Overview of the models: FEM and REM, each without moderator, with a categorical moderator, or with a continuous moderator
1. Is there an overall effect?
2. How large is this effect?
3. Is the effect the same in all studies?
4. How large is the variation over studies?
5. Is this variation related to study characteristics?
6. Is there variation that remains unexplained?
7. What is the effect in the specific studies?
Parameter                                REM
Fixed
  Intercept ($\gamma_0$)                 0.084 (0.052)
Between study variance ($\sigma_u^2$)    0.019 (0.023)
Parameter                                REM              MEM
Fixed
  Intercept ($\gamma_0$)                 0.084 (0.052)    0.41 (0.087)
  Weeks ($\gamma_1$)                                      -0.16 (0.036)
Between study variance ($\sigma_u^2$)    0.019 (0.023)    0.00 (-)
1. Models can include multiple moderators
2. REM assumes randomly sampled studies
3. REM requires enough studies
4. Association (over studies) ≠ causation!
   Be aware of potential confounding moderators
   (studies are not ‘RCT participants’!)
Dependencies between studies
◦ E.g., research group, country, …
Multiple effect sizes per study
◦ Several samples
◦ Same sample but, e.g., several indicator variables


Ignoring dependence? NO!
Avoiding dependence
◦ (Randomly choosing one ES for each study)
◦ Averaging ES’s within a study
◦ Performing separate meta-analyses for each kind of
treatment or indicator

Modelling dependence
◦ Performing a multivariate meta-analysis,
accounting for sampling covariance.
◦ Performing a three level analysis
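Of the options above, averaging ES's within a study is the easiest to sketch, but the average also needs a standard error that reflects the dependence. The snippet below is one possible approach, assuming a common correlation r between effect sizes from the same sample. The value of r has to be supplied by the analyst (r = 1 is the conservative choice), and the function name and the numbers in the usage line are hypothetical.

```python
from math import fsum, sqrt


def average_dependent_effects(effects, std_errors, r):
    """Mean of m correlated effect sizes and its standard error.

    r is an assumed common correlation between the estimates.
    """
    m = len(effects)
    mean = fsum(effects) / m
    var_sum = fsum(s ** 2 for s in std_errors)
    # Covariance terms for every ordered pair of distinct estimates.
    cov_sum = fsum(r * std_errors[i] * std_errors[j]
                   for i in range(m) for j in range(m) if i != j)
    return mean, sqrt((var_sum + cov_sum) / m ** 2)


# Hypothetical study reporting two outcome measures on the same sample.
mean_es, se_es = average_dependent_effects([0.20, 0.40], [0.15, 0.15], r=0.5)
print(f"averaged ES = {mean_es:.2f} (SE {se_es:.3f})")
```

Treating the two estimates as independent (r = 0) would give a smaller standard error and therefore too much weight to the study.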
(Egger, M., & Davey Smith, G. (1998). Meta-analysis: Bias in location and selection of
studies. British Medical Journal, 316, 61-66.
http://www.bmj.com/cgi/content/full/316/7124/61)
Proportion of trials published within 5 years after the
conference:
- 81% (of 233 trials) for significant results
- 68% (of 287 trials) for nonsignificant results
(Krzyzanowska, M. K., Pintilie, M., & Tannock, I. F. (2003). Factors associated with
failure to publish large randomized trials presented at an oncology meeting. Journal
of the American Medical Association, 290, 495-501.)
[Figures: two funnel plots of sample size (y-axis, 0 to 500) against observed effect sizes (x-axis, -1 to 1.5).]
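How selective publication distorts a funnel plot and the pooled estimate can be illustrated with a small simulation. Everything in this sketch is an assumption made for illustration: the true effect of 0.2, the range of sample sizes, the rough standard error formula for a standardized mean difference, and the rule that nonsignificant results are published with probability 0.3.

```python
import random
from math import sqrt


def simulate_studies(n_studies, true_effect, seed=1):
    """Return a list of (effect size, standard error, total sample size)."""
    rng = random.Random(seed)
    studies = []
    for _ in range(n_studies):
        n = rng.randint(10, 250)      # participants per group (assumed range)
        se = sqrt(2.0 / n)            # rough SE of a standardized mean difference
        studies.append((rng.gauss(true_effect, se), se, 2 * n))
    return studies


def publish(studies, p_nonsignificant, seed=2):
    """Keep significant positive results; keep the rest with a given probability."""
    rng = random.Random(seed)
    return [s for s in studies
            if s[0] / s[1] > 1.96 or rng.random() < p_nonsignificant]


def pooled_mean(studies):
    """Inverse-variance weighted mean effect size."""
    weights = [1.0 / se ** 2 for _, se, _ in studies]
    return sum(w * g for w, (g, _, _) in zip(weights, studies)) / sum(weights)


all_studies = simulate_studies(2000, true_effect=0.2)
published = publish(all_studies, p_nonsignificant=0.3)
print(f"all studies:    {len(all_studies)} studies, mean ES {pooled_mean(all_studies):.3f}")
print(f"published only: {len(published)} studies, mean ES {pooled_mean(published):.3f}")
```

Plotting the total sample size against the effect size for `published` gives a funnel with a thinned-out lower left corner, because small studies with small or negative effects are the ones most likely to be missing.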

Thorough search for all relevant published
and unpublished study results
a) Articles
b) Books
c) Conference papers
d) Dissertations
e) (Un)finished research reports
f) …
- outliers
  - detection using graphs (or tests)
  - conduct analysis with and without outliers
- calculation of effect sizes: several analyses
- publication bias: analysis with and without
  unpublished results
- design & quality: compare results from studies with
  strong design or good quality with those of all
  studies
- researcher: literature search, effect size calculation,
  coding quality, …, done by two researchers
- …
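The advice to run the analysis with and without outliers can be generalized to a leave-one-out analysis, in which each study is dropped in turn and the pooled estimate is recomputed. The sketch below uses a fixed effects (inverse-variance) pooled mean and the teacher expectancy data from these slides; leave-one-out is a technique chosen here for illustration, and the function names are hypothetical.

```python
from math import fsum

g = [0.03, 0.12, -0.14, 1.18, 0.26, -0.06, -0.02, -0.32, 0.27, 0.80,
     0.54, 0.18, -0.02, 0.23, -0.18, -0.06, 0.30, 0.07, -0.07]
se = [0.13, 0.15, 0.17, 0.37, 0.37, 0.10, 0.10, 0.22, 0.16, 0.25,
      0.30, 0.22, 0.29, 0.29, 0.16, 0.17, 0.14, 0.09, 0.17]


def pooled(effects, std_errors):
    """Inverse-variance weighted mean effect size."""
    w = [1.0 / s ** 2 for s in std_errors]
    return fsum(wi * e for wi, e in zip(w, effects)) / fsum(w)


def leave_one_out(effects, std_errors):
    """Pooled estimate with each study omitted in turn."""
    return [pooled(effects[:i] + effects[i + 1:],
                   std_errors[:i] + std_errors[i + 1:])
            for i in range(len(effects))]


overall = pooled(g, se)
for study, estimate in enumerate(leave_one_out(g, se), start=1):
    # A large shift flags a study with strong influence on the result.
    print(f"without study {study:2d}: {estimate:.3f} (shift {estimate - overall:+.3f})")
```

If the conclusions hold for every row of the output, they do not hinge on any single study.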
[Figure: observed effect sizes (ES, -1.5 to 6.5) per experiment (experiments 1 to 135).]


Spreadsheets (e.g., MS Excel, …)
Some general statistical software (note: often
not possible to fix the sampling variance)
SAS PROC MIXED, S-PLUS, R metafor package, …

Software for meta-analysis (note: often not
MEM; often only one moderator!)
CMA (http://www.meta-analysis.com/), RevMan, …

Software for multilevel/mixed models
HLM, MLwiN, …
Software
Calculation of effect sizes
Number of moderators
Funnel
Trim & Fill
Forest
Max. nr of levels
Flexibility
Price
Complexity
Excel
SAS
R
CMA
RevMan
X
X
X
X
X
2
√
X
∞
X
X
X
∞
√√
√
∞
√
√
√
2
√√
√√
1
√
√
√
2
X
X
1 (cat.)
√
X
√
2
X
Expensive
(but student
version)
Free
X
X
√
Expensive Free
(but limited
trial vers.)
√√
√√



Cooper, H., Hedges, L. V., & Valentine, J. C. (Eds.) (2009).
The handbook of research synthesis and meta-analysis.
New York: The Russell Sage Foundation.
Lipsey, M. W., & Wilson, D. B. (2001). Practical meta-analysis. Thousand Oaks, CA: Sage.
Van den Noortgate, W., & Onghena, P. (2005). Meta-analysis. In B. S. Everitt & D. C. Howell (Eds.),
Encyclopedia of Statistics in Behavioral Science (Vol. 3,
pp. 1206-1217). Chichester, UK: John Wiley & Sons.

Site of David Wilson
http://mason.gmu.edu/~dwilsonb/ma.html

Site of William Shadish
faculty.ucmerced.edu/wshadish/