Quantitative Research Syntheses: Meta-Analysis Advanced Biostatistics Dean C. Adams

Lecture 14
EEOB 590C
Today
•Methods for quantitative research synthesis
•Brief history of methods for combining results from prior studies
•Vote-counting
•Combined probability method
•Meta-analysis
For further information on these approaches see:
Cooper and Hedges (1994). Handbook of Research Synthesis.
Hedges and Olkin (1985). Statistical Methods for Meta-Analysis.
Rosenberg, Adams, and Gurevitch (2000). MetaWin: Statistical Software for Meta-Analysis. Version 2.
Synthesizing Prior Research
•One important goal of science is synthesizing existing knowledge
•What does a body of literature say about a particular topic?
•Does existing published evidence support a particular hypothesis?
•Is there a general ‘consensus’ about the importance of a hypothesis?
•This is an obvious question to ask (what do we already know?)
•Literature reviews are a common approach: usually narrative
•Other more quantitative methods exist
•Three main approaches:
•Vote-counting
•Combined probability methods
•Meta-analysis
Quantitative Research Synthesis: A Brief History
•Quantitative research synthesis as old as modern statistics
•First QRS: Pearson (1904) calculated average correlation from
several studies on effectiveness of typhoid vaccine
•Early 20th century: narrative reviews most common (and still are)
•1930’s: several methods for combining probabilities developed
(but infrequently used)
•1970’s: ‘modern’ meta-analytic methods for combining effect
sizes from independent studies developed by Glass (1976),
Rosenthal etc.
•Currently, meta-analytic methods common in social sciences and
medicine; use in ecology and evolutionary biology is increasing
QRS: Beginnings
•ANY research synthesis begins with a hypothesis (e.g., does
smoking significantly increase cancer rates?)
•Published studies* are then obtained via a literature search (e.g.,
keyword search on Web of Science, Scholar.Google, Biological
Abstracts, etc.)
•Unusable articles are discarded based on certain criteria (e.g.,
incomplete information)
•Remaining articles are reviewed and summarized in some way
*Note: unpublished studies that can be obtained from authors can also be included
QRS: Vote-Counting
•Begin with hypothesis and set of published studies
•Results from each study classified as 1 of 3 outcomes
•Significant in expected direction
•Significant in unexpected direction
•Not significant
•Calculate the proportion of studies in each class; the class with the highest
proportion represents the ‘support’ (for, against, equivocal)
•Advantages: quick and easy to calculate, intuitive
•Disadvantages: overly conservative, low statistical power (# nonsignificant findings can exceed # significant findings even when an effect exists), ignores magnitude
of effects of studies, not sensitive to sample sizes (all studies treated
equally)
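A minimal sketch of the vote-counting tally described above. The function name `vote_count`, the input layout (a p-value plus a direction flag per study), and the example values are all assumptions made for illustration; they are not data from any real synthesis.

```python
from collections import Counter

def vote_count(results, alpha=0.05):
    """Tally studies into the three vote-counting classes.

    results: list of (p_value, direction) pairs, where direction is +1 for
    the expected direction and -1 for the unexpected direction (assumed layout).
    Returns the proportion of studies in each class that occurs.
    """
    votes = Counter()
    for p, direction in results:
        if p >= alpha:
            votes["not significant"] += 1
        elif direction > 0:
            votes["significant, expected"] += 1
        else:
            votes["significant, unexpected"] += 1
    n = len(results)
    return {label: count / n for label, count in votes.items()}

# Hypothetical studies: (p-value, direction of effect)
studies = [(0.06, +1), (0.02, +1), (0.035, +1), (0.001, +1), (0.24, +1)]
proportions = vote_count(studies)
winner = max(proportions, key=proportions.get)
```

Note that the tally uses only significance and direction, so a study with 5 observations counts as much as one with 500, which is the insensitivity to sample size listed among the disadvantages.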
QRS: Combined Probability Methods
•Begin with hypothesis and set of published studies (with
significance levels)
•Combine probabilities in some way
•Many methods exist for various distributions (uniform, normal, t, $\chi^2$,
etc.; see Becker, 1994, in Cooper & Hedges, Handbook of Research Synthesis)
•Advantages: relatively easy to calculate, sample sizes taken into
account (b/c use exact probabilities), general approach (can almost
always obtain p-value from a study)
•Disadvantages: don’t directly assess magnitude of study effects,
cannot assess direction of effects, cannot assess whether effects are
homogeneous
•Often called omnibus tests (only depend on exact probabilities of each study)
Some Combined Probability Methods
•Minimum P method (Tippett, 1931): uses uniform distribution, significant if any study is significant at the $\alpha^*$-level: $\alpha^* = 1 - (1 - \alpha)^{1/n}$
•Sum of logs method (Fisher, 1932): uses inverse $\chi^2$ distribution, significant if the probability of $-2\sum_{i=1}^{n}\ln(p_i)$ is < 0.05 from $\chi^2$ with 2n df (n is # studies and $p_i$ are study significance levels)
•Sum of Z method (Stouffer et al., 1949): uses normal distribution, significant if the probability of $Z = \sum_{i=1}^{n} Z(p_i)/\sqrt{n}$ is < 0.05 ($Z(p_i)$ are Z-scores for study p-values)
•Sum of p method (Edgington, 1972): uses uniform distribution, significant if $P = \left(\sum_{i=1}^{n} p_i\right)^{n}/n!$ is < 0.05 (n is # studies and $p_i$ are study significance levels)
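Example: Fisher’s Approach
•Sum of logs method (Fisher, 1932): test $-2\sum_{i=1}^{n}\ln(p_i)$ against $\chi^2$ with 2n df (n is # studies and $p_i$ are study significance levels); natural logs are required
•$p_i$: 0.06; 0.02; 0.035; 0.001; 0.24
•$\ln(p_i)$: -2.81; -3.91; -3.35; -6.91; -1.43
•$-2\sum\ln(p_i)$ = 36.83; $P_{\chi^2}$ (10 df) ≈ 0.00006, significant
•Caution: base-10 logs (-1.22; -1.70; -1.46; -3; -0.62) give a statistic of 16 (P ≈ 0.10, NS), which understates the evidence; the $\chi^2$ reference distribution applies only to natural logs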
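The four combined probability methods can be sketched in a few lines of standard-library Python. This is an illustrative sketch, not a reference implementation: the function names are made up here, `Z(p_i)` is assumed to be the standard normal deviate of a one-tailed p-value, the chi-square tail uses a closed form that holds only for even df (always the case for Fisher's 2n df), and the Edgington formula as written is assumed valid only when the p-values sum to at most 1.

```python
import math
from statistics import NormalDist

def chi2_upper_tail_even_df(x, df):
    """P(chi-square > x); closed form that is valid only for even df."""
    half = x / 2.0
    terms = sum(half ** k / math.factorial(k) for k in range(df // 2))
    return math.exp(-half) * terms

def tippett(pvals, alpha=0.05):
    """Minimum P method: True if the smallest p beats alpha* = 1-(1-alpha)^(1/n)."""
    alpha_star = 1 - (1 - alpha) ** (1 / len(pvals))
    return min(pvals) < alpha_star

def fisher(pvals):
    """Sum of logs method: returns (-2 * sum(ln p), upper-tail probability)."""
    stat = -2 * sum(math.log(p) for p in pvals)  # math.log is the natural log
    return stat, chi2_upper_tail_even_df(stat, 2 * len(pvals))

def stouffer(pvals):
    """Sum of Z method: returns (combined Z, one-tailed probability)."""
    nd = NormalDist()
    z = sum(nd.inv_cdf(1 - p) for p in pvals) / math.sqrt(len(pvals))
    return z, 1 - nd.cdf(z)

def edgington(pvals):
    """Sum of p method: (sum p)^n / n!, assumed valid when sum(p) <= 1."""
    return sum(pvals) ** len(pvals) / math.factorial(len(pvals))

# The five study significance levels from the Fisher example
pvals = [0.06, 0.02, 0.035, 0.001, 0.24]
fisher_stat, fisher_p = fisher(pvals)
```

For these five p-values the Fisher statistic is about 36.8 on 10 df. None of the functions uses effect direction or magnitude, which is why these are described as omnibus tests.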
QRS: Meta-Analysis
•Approach that combines weighted effect sizes for each study to
assess overall significance
•Allows the interpretation of the strength of the statistical finding,
not just whether or not there is significance
•M-A model can be generalized to address more complicated
synthesis questions
•Requires calculating an effect size and weight for each study
•Meta-analysis has two steps:
•Calculate effect sizes (and weights) for each study
•Summarize effect sizes to address hypothesis (m-a model)
Effect Sizes
•Effect size: statistical measure of the magnitude of a factor's effect in the
data (how much does smoking increase cancer rates?)
•Different types of primary data require different effect size
estimates (some data types have several possible effect sizes)
•Many test statistics are a form of effect size (e.g., $t = (\bar{X}_1 - \bar{X}_2)/s_{\bar{X}_1 - \bar{X}_2}$ is a
standardized mean difference effect size)
•Use of effect sizes in QRS is desirable because they ‘standardize’
results from independent studies and express them in a common
way (e.g., all results expressed as t-values)
•Weights are inverse of effect size variance: $w = 1/v$
•Effect sizes are typically transformed so range is $-\infty$ to $+\infty$
Effect Sizes From $\bar{X}$ and $\sigma$
•Powerful effect sizes, but require much data from studies
•Require means, sample sizes, and std from experimental and
control group ($\bar{X}^C$ & $\bar{X}^E$, $s_C$ & $s_E$, $N_C$ & $N_E$)
•Most are variants on standardized mean difference (like t-test)

Name/s           Equation                                              Variance
Glass’ $\Delta$  $\Delta = (\bar{X}^E - \bar{X}^C)/s_C$                $v_\Delta = \frac{N_C + N_E}{N_C N_E} + \frac{\Delta^2}{2(N_C - 1)}$
Hedges’ g        $g = (\bar{X}^E - \bar{X}^C)/S$                       $v_g = \frac{N_C + N_E}{N_C N_E} + \frac{g^2}{2(N_C + N_E - 2)}$
Cohen’s d        $d_{Cohen} = (\bar{X}^E - \bar{X}^C)/\sigma$          $v_{d(Cohen)} = \left[\frac{N_C + N_E}{N_C N_E} + \frac{d^2}{2(N_C + N_E - 2)}\right]\left[\frac{N_C + N_E}{N_C + N_E - 2}\right]$
Hedges’ d        $d = \frac{\bar{X}^E - \bar{X}^C}{S}\,J$              $v_d = \frac{N_C + N_E}{N_C N_E} + \frac{d^2}{2(N_C + N_E)}$
response ratio   $\ln R = \ln\left(\bar{X}^E/\bar{X}^C\right)$         $v_{\ln R} = \frac{(s_E)^2}{N_E(\bar{X}^E)^2} + \frac{(s_C)^2}{N_C(\bar{X}^C)^2}$
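As a hedged illustration of the table above, the sketch below computes Hedges' d and the log response ratio with their variances. Two details are assumptions not spelled out on the slide: S is taken to be the usual pooled standard deviation, and J is taken to be the small-sample correction $J = 1 - 3/(4(N_C + N_E - 2) - 1)$. The function names are illustrative.

```python
import math

def hedges_d(mean_e, mean_c, sd_e, sd_c, n_e, n_c):
    """Return (d, var_d) for Hedges' d.

    Assumes S is the pooled SD and J = 1 - 3 / (4 * (n_c + n_e - 2) - 1).
    """
    df = n_e + n_c - 2
    s_pooled = math.sqrt(((n_e - 1) * sd_e ** 2 + (n_c - 1) * sd_c ** 2) / df)
    j = 1 - 3 / (4 * df - 1)
    d = (mean_e - mean_c) / s_pooled * j
    var_d = (n_c + n_e) / (n_c * n_e) + d ** 2 / (2 * (n_c + n_e))
    return d, var_d

def log_response_ratio(mean_e, mean_c, sd_e, sd_c, n_e, n_c):
    """Return (ln R, var_lnR); both means must share the same sign."""
    ln_r = math.log(mean_e / mean_c)
    var_ln_r = sd_e ** 2 / (n_e * mean_e ** 2) + sd_c ** 2 / (n_c * mean_c ** 2)
    return ln_r, var_ln_r

# Study 1 of the competition example later in these notes:
# Nc = Ne = 7, Xc = 78.14, Xe = 79.71, Sc = Se = 40.65
d1, v1 = hedges_d(79.71, 78.14, 40.65, 40.65, 7, 7)
```

Under these assumptions the first study of the competition example yields d ≈ 0.036 and var(d) ≈ 0.286, matching the tabled 0.0362 and 0.2858. The weight for that study would then be `1 / v1`.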
Effect Sizes From 2 X 2 Tables
•Common in medicine: for data summarized by 2 X 2 table

              Treatment     Control       Total
Response      A             B             A + B
No Response   C             D             C + D
Total         nt = A + C    nc = B + D    N = A + B + C + D

•From table calculate $P_t = A/n_t$ and $P_c = B/n_c$: used for effect sizes

Name/s                                  Equation                                     Variance
rate difference, risk difference        $RD = P_t - P_c$                             $v_{RD} = \frac{P_t(1 - P_t)}{n_t} + \frac{P_c(1 - P_c)}{n_c}$
relative rate, risk ratio, rate ratio   $RR = P_t/P_c$                               $v_{\ln RR} = \frac{1 - P_t}{n_t P_t} + \frac{1 - P_c}{n_c P_c}$
odds ratio, relative odds               $OR = \frac{P_t(1 - P_c)}{P_c(1 - P_t)}$     $v_{\ln OR} = \frac{1}{A} + \frac{1}{B} + \frac{1}{C} + \frac{1}{D}$
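The 2 X 2 effect sizes above translate directly into code. The sketch below is illustrative: the function name and the returned dictionary keys are assumptions, the counts in the example are invented, and no continuity correction is applied, so any zero cell will raise an error.

```python
import math

def two_by_two_effects(a, b, c, d):
    """Effect sizes from a 2 x 2 table.

    a, c: treatment group (response, no response)
    b, d: control group (response, no response)
    Returns RD, ln RR, ln OR and their variances.
    """
    n_t, n_c = a + c, b + d
    p_t, p_c = a / n_t, b / n_c
    return {
        "RD": p_t - p_c,
        "v_RD": p_t * (1 - p_t) / n_t + p_c * (1 - p_c) / n_c,
        "lnRR": math.log(p_t / p_c),
        "v_lnRR": (1 - p_t) / (n_t * p_t) + (1 - p_c) / (n_c * p_c),
        "lnOR": math.log((p_t * (1 - p_c)) / (p_c * (1 - p_t))),
        "v_lnOR": 1 / a + 1 / b + 1 / c + 1 / d,
    }

# Hypothetical counts: 15 of 20 treated respond, 5 of 20 controls respond
effects = two_by_two_effects(a=15, b=5, c=5, d=15)
```

RR and OR are analyzed on the log scale, which is why the variances are given for ln RR and ln OR; this also gives the effect sizes an unbounded range.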
Effect Sizes From Correlations
•Useful when only summary statistics are available
•Convert all test-statistics to correlations, then convert these to
Fisher’s Z-transform: $z = \frac{1}{2}\ln\left(\frac{1 + r}{1 - r}\right)$; variance: $v_z = \frac{1}{n - 3}$
•Common transformations

statistic        conversion
Z*               $r = Z/\sqrt{N}$
t                $r = \sqrt{t^2/(t^2 + df)}$
F                $r = \sqrt{F/(F + df)}$
$\chi^2_{(1)}$   $r = \sqrt{\chi^2/N}$

*Probabilities can be converted to Z as standard normal deviates
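A short sketch of these conversions, assuming nothing beyond the formulas above. The function names are illustrative. One caveat worth stating: the conversions from t, F, and $\chi^2$ return the magnitude of r only, so the sign has to be restored from the direction of the reported effect.

```python
import math

def r_from_z(z, n):
    """Correlation from a standard normal deviate Z and sample size N."""
    return z / math.sqrt(n)

def r_from_t(t, df):
    """Magnitude of r from a t statistic and its degrees of freedom."""
    return math.sqrt(t ** 2 / (t ** 2 + df))

def r_from_f(f, df):
    """Magnitude of r from an F statistic and its error degrees of freedom."""
    return math.sqrt(f / (f + df))

def r_from_chi2(chi2, n):
    """Magnitude of r from a 1-df chi-square statistic and sample size N."""
    return math.sqrt(chi2 / n)

def fisher_z(r, n):
    """Fisher's Z-transform of r and its variance 1 / (n - 3)."""
    z = 0.5 * math.log((1 + r) / (1 - r))
    return z, 1 / (n - 3)

# Hypothetical study reporting t = 2.0 with 12 df and n = 28
r = r_from_t(2.0, 12)
z, v_z = fisher_z(r, 28)
```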
Meta-Analytic Models
•Summarize effect sizes to assess significance
•Standard statistical summary variables: mean, variance
•Cumulative Effect Size: weighted mean of effect sizes
•Homogeneity Statistic: Quantifies variation in effect sizes
(analogous to SS) Are effect sizes homogeneous?
•Method of summary depends upon model for effect size variation
•No structure: all studies belong to one ‘population’
•Categorical structure: studies belong to groups
•Continuous structure: studies covary with continuous variable
•For models with structure (categorical, continuous), variables are
often called moderator variables (groups, covariate, etc.)
•All models are actually special cases of same model
Meta-Analysis: No Structure
•Model: All studies belong to same group
•Example Ho: Is there an effect of competition on plant
communities?
•Cumulative effect size: $\bar{E} = \sum_{i=1}^{n} w_i E_i \Big/ \sum_{i=1}^{n} w_i$; variance: $s^2_{\bar{E}} = 1 \Big/ \sum_{i=1}^{n} w_i$
•$CI = \bar{E} \pm t_{\alpha/2[n-1]} \times s_{\bar{E}}$: $\bar{E}$ significant if its CI does not bracket 0.0
•Homogeneity: $Q_T = \sum_{i=1}^{n} w_i E_i^2 - \frac{\left(\sum_{i=1}^{n} w_i E_i\right)^2}{\sum_{i=1}^{n} w_i}$ or $Q_T = \sum_{i=1}^{n} w_i\left(E_i - \bar{E}\right)^2$
•Test against $\chi^2$ (n-1 df)
•Significant QT implies samples are NOT homogeneous
•Implies structure in data: may be captured by a moderator variable
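The no-structure model can be sketched as follows. The function name and the example values are illustrative assumptions. Because the standard library has no t or chi-square quantile functions, the sketch returns the standard error and $Q_T$ and leaves the critical values to the reader.

```python
import math

def fixed_effect_summary(effects, variances):
    """Cumulative effect size, its standard error, and Q_T (fixed effects)."""
    w = [1.0 / v for v in variances]          # weights are inverse variances
    sum_w = sum(w)
    e_bar = sum(wi * ei for wi, ei in zip(w, effects)) / sum_w
    se = math.sqrt(1.0 / sum_w)               # s_E = sqrt(1 / sum of weights)
    q_t = sum(wi * (ei - e_bar) ** 2 for wi, ei in zip(w, effects))
    return e_bar, se, q_t

# Hypothetical effect sizes with equal variances
effects = [0.5, 1.0, 1.5]
variances = [0.25, 0.25, 0.25]
e_bar, se, q_t = fixed_effect_summary(effects, variances)

# The computational form of Q_T gives the same value
w = [1.0 / v for v in variances]
q_t_alt = (sum(wi * ei ** 2 for wi, ei in zip(w, effects))
           - sum(wi * ei for wi, ei in zip(w, effects)) ** 2 / sum(w))
```

A confidence interval then follows as `e_bar ± t_crit * se`, where `t_crit` is the two-tailed t critical value with n-1 df looked up separately.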
Meta-Analysis: Categorical Structure
•Model: Studies belong to different groups
•Example Ho: Does competition differ among habitats
(terrestrial, marine, etc.)?
•For each group calculate: $\bar{E}_j = \sum_{i=1}^{k_j} w_{ij}E_{ij} \Big/ \sum_{i=1}^{k_j} w_{ij}$, $s^2_{\bar{E}_j} = 1 \Big/ \sum_{i=1}^{k_j} w_{ij}$, $CI = \bar{E}_j \pm t_{\alpha/2[k_j-1]} \times s_{\bar{E}_j}$, and $Q_{W_j} = \sum_{i=1}^{k_j} w_{ij}\left(E_{ij} - \bar{E}_j\right)^2$
•Test if each group is different from zero
•Test if groups differ: $Q_T = Q_M + Q_E$, where $Q_M = \sum_{j=1}^{m}\sum_{i=1}^{k_j} w_{ij}\left(\bar{E}_j - \bar{E}\right)^2$ and $Q_E = \sum_{j=1}^{m} Q_{W_j} = \sum_{j=1}^{m}\sum_{i=1}^{k_j} w_{ij}\left(E_{ij} - \bar{E}_j\right)^2$;
test QM vs. $\chi^2$ with m-1 df, where m is # groups
•Significant QM implies groups are different (significant QE implies
there is still structure remaining)
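To make the partition $Q_T = Q_M + Q_E$ concrete, here is a small sketch. The function name, the dictionary layout, and the two made-up groups are assumptions for illustration only.

```python
def q_partition(groups):
    """Partition Q_T into between-group (Q_M) and within-group (Q_E) parts.

    groups: dict mapping group name -> list of (effect, variance) pairs.
    """
    def weighted_mean(pairs):
        return sum(e / v for e, v in pairs) / sum(1.0 / v for _, v in pairs)

    def weighted_ss(pairs, center):
        return sum((e - center) ** 2 / v for e, v in pairs)

    all_pairs = [p for pairs in groups.values() for p in pairs]
    grand_mean = weighted_mean(all_pairs)
    group_means = {g: weighted_mean(pairs) for g, pairs in groups.items()}
    q_w = {g: weighted_ss(pairs, group_means[g]) for g, pairs in groups.items()}
    q_m = sum(
        sum(1.0 / v for _, v in pairs) * (group_means[g] - grand_mean) ** 2
        for g, pairs in groups.items()
    )
    return {
        "group_means": group_means,
        "Q_T": weighted_ss(all_pairs, grand_mean),
        "Q_M": q_m,
        "Q_E": sum(q_w.values()),
        "Q_W": q_w,
    }

# Two hypothetical groups, every study with variance 0.25 (weight 4)
result = q_partition({
    "A": [(0.5, 0.25), (1.5, 0.25)],
    "B": [(2.5, 0.25), (3.5, 0.25)],
})
```

In this toy case $Q_T = 20$ splits into $Q_M = 16$ and $Q_E = 4$, mirroring how SS is partitioned in a one-way ANOVA.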
Meta-Analysis: Continuous Structure
•Model: Study effect sizes covary with continuous variable
•Example Ho: Does competition intensity change with age?
•Use Weighted GLM: $E_i = b_0 + b_1 X_i + \varepsilon$
$b_1 = \frac{\sum_{i=1}^{n} w_i X_i E_i - \frac{\sum_{i=1}^{n} w_i X_i \sum_{i=1}^{n} w_i E_i}{\sum_{i=1}^{n} w_i}}{\sum_{i=1}^{n} w_i X_i^2 - \frac{\left(\sum_{i=1}^{n} w_i X_i\right)^2}{\sum_{i=1}^{n} w_i}}$
$b_0 = \frac{\sum_{i=1}^{n} w_i E_i - b_1\sum_{i=1}^{n} w_i X_i}{\sum_{i=1}^{n} w_i}$
(test slope and intercept by $Z_{b_1} = b_1/s_{b_1}$ and $Z_{b_0} = b_0/s_{b_0}$)
•Homogeneity: $Q_M = b_1^2/s_{b_1}^2$ (QM vs. $\chi^2$ with 1 df)
•Significant QM implies X explains significant component of
variation in E
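A sketch of the weighted regression above. The slide does not give a formula for $s_{b_1}$; the code assumes the standard fixed-effects result $s^2_{b_1} = 1/\left(\sum w_i X_i^2 - (\sum w_i X_i)^2/\sum w_i\right)$, which holds when the weights are inverse variances. The function name and data are illustrative.

```python
import math

def weighted_regression(x, effects, variances):
    """Weighted least-squares slope, intercept, slope SE, and Q_M."""
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    sum_wx = sum(wi * xi for wi, xi in zip(w, x))
    sum_we = sum(wi * ei for wi, ei in zip(w, effects))
    sum_wxe = sum(wi * xi * ei for wi, xi, ei in zip(w, x, effects))
    sum_wx2 = sum(wi * xi ** 2 for wi, xi in zip(w, x))

    sxx = sum_wx2 - sum_wx ** 2 / sum_w       # weighted SS of X
    b1 = (sum_wxe - sum_wx * sum_we / sum_w) / sxx
    b0 = (sum_we - b1 * sum_wx) / sum_w
    s_b1 = math.sqrt(1.0 / sxx)               # assumed SE of the slope
    q_m = b1 ** 2 / s_b1 ** 2                 # compare to chi-square, 1 df
    return b0, b1, s_b1, q_m

# Hypothetical data: effect size increases exactly one unit per unit of X
b0, b1, s_b1, q_m = weighted_regression(
    x=[1.0, 2.0, 3.0], effects=[1.0, 2.0, 3.0], variances=[0.25, 0.25, 0.25]
)
```

Here the line fits perfectly, so all of the heterogeneity is explained by X: $Q_M = 8$, which equals $Q_T$ for these data.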
Meta-Analysis: Comments
•What are we doing? Summarizing effect sizes as if ‘primary’ data
•If $w_i = 1.0$, then we’re calculating standard means & SS: $\bar{E} = \sum_{i=1}^{n} w_i E_i \Big/ \sum_{i=1}^{n} w_i$ and $Q_T = \sum_{i=1}^{n} w_i\left(E_i - \bar{E}\right)^2$
Also note, QT is partitioned, just like SS
•Therefore, think of meta-analysis as ANOVA, regression, etc.
•Meta-analytic models are actually Weighted GLM
•Weighted GLM is a standard statistical method used to account for
different weights of objects (recall PGLS for phylogeny)
•Since meta-analysis is analyzed in this general framework, more
complicated designs can also be tested (e.g., ANCOVA, 2-factor
ANOVA, etc.)
Meta-Analysis: Weighted GLM
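•Represent analyses using standard matrix algebra: $\mathbf{E} = \mathbf{X}\boldsymbol{\beta} + \boldsymbol{\varepsilon}$
$\mathbf{E} = \begin{bmatrix} E_1 \\ \vdots \\ E_n \end{bmatrix}$, $\mathbf{X} = \begin{bmatrix} 1 & X_{11} & \cdots & X_{p1} \\ \vdots & \vdots & & \vdots \\ 1 & X_{1n} & \cdots & X_{pn} \end{bmatrix}$ (For no structure, X is vector of 1’s)
•Solve model as: $\boldsymbol{\beta} = \left(\mathbf{X}^t\mathbf{W}\mathbf{X}\right)^{-1}\mathbf{X}^t\mathbf{W}\mathbf{E}$
•W ‘in’ error term: $\mathbf{W} = \begin{bmatrix} w_1 & 0 & 0 \\ 0 & \ddots & 0 \\ 0 & 0 & w_n \end{bmatrix}$ ($w_i$ inverse of variance)
•QT, QM, etc. calculated as weighted SS
•Allows for simple to complicated designs
•Can be generalized to multivariate (though multivariate effect
sizes nearly impossible to obtain for a set of published studies!)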
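The matrix solution can be sketched with NumPy (an assumption: the notes do not name any software for this step). The function name `weighted_glm` is illustrative, and `numpy.linalg.solve` is used in place of an explicit inverse, which gives the same $\boldsymbol{\beta}$ with better numerical behavior.

```python
import numpy as np

def weighted_glm(X, E, variances):
    """Solve beta = (X' W X)^-1 X' W E with W = diag(1 / variance)."""
    X = np.asarray(X, dtype=float)
    E = np.asarray(E, dtype=float)
    W = np.diag(1.0 / np.asarray(variances, dtype=float))
    XtW = X.T @ W
    return np.linalg.solve(XtW @ X, XtW @ E)

E = [1.0, 2.0, 3.0]
variances = [0.25, 0.25, 0.25]

# No structure: X is a column of 1's, so beta is the weighted mean
beta_mean = weighted_glm(np.ones((3, 1)), E, variances)

# Continuous structure: a column of 1's plus one covariate
X_cov = np.column_stack([np.ones(3), [1.0, 2.0, 3.0]])
beta_reg = weighted_glm(X_cov, E, variances)
```

Changing only the design matrix switches between the no-structure, categorical (dummy-coded columns), and continuous models, which is the sense in which all of them are special cases of one model.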
Meta-Analysis: Fixed vs. Random Models
•All previous models are ‘fixed effects’ models
•Fixed-effects model: assume only one true effect size shared by
all studies (studies therefore only differ by sampling error)
•Random-effects model: assume studies differ by sampling error
and random component (pooled study variance: $\sigma^2_{pooled}$)
•$\sigma^2_{pooled}$ found from running a fixed effects model
•$\sigma^2_{pooled}$ is incorporated in weights for random model: $w_{i(rand)} = \frac{1}{v_i + \sigma^2_{pooled}}$
No Structure: $\sigma^2_{pooled} = \frac{Q_T - (n - 1)}{\sum_{i=1}^{n} w_i - \frac{\sum_{i=1}^{n} w_i^2}{\sum_{i=1}^{n} w_i}}$
Categorical Model: $\sigma^2_{pooled} = \frac{Q_E - (n - m)}{\sum_{j=1}^{m}\left(\sum_{i=1}^{k_j} w_{ij} - \frac{\sum_{i=1}^{k_j} w_{ij}^2}{\sum_{i=1}^{k_j} w_{ij}}\right)}$
Continuous Model: $\sigma^2_{pooled} = \frac{Q_E - (n - 2)}{\sum_{i=1}^{n} w_i - \frac{\sum_{i=1}^{n} w_i^2 \sum_{i=1}^{n} w_i X_i^2 - 2\sum_{i=1}^{n} w_i^2 X_i \sum_{i=1}^{n} w_i X_i + \sum_{i=1}^{n} w_i \sum_{i=1}^{n} w_i^2 X_i^2}{\sum_{i=1}^{n} w_i \sum_{i=1}^{n} w_i X_i^2 - \left(\sum_{i=1}^{n} w_i X_i\right)^2}}$
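For the no-structure case, the two-step procedure (fixed-effects run, then re-weighting) might look like the sketch below. Truncating a negative estimate of $\sigma^2_{pooled}$ to zero is a common convention that is assumed here; it is not stated on the slide. The function name and data are illustrative.

```python
def random_effects_weights(effects, variances):
    """Estimate sigma^2_pooled (no structure) and the random-model weights."""
    n = len(effects)
    w = [1.0 / v for v in variances]
    sum_w = sum(w)

    # Step 1: fixed-effects run to obtain Q_T
    e_bar = sum(wi * ei for wi, ei in zip(w, effects)) / sum_w
    q_t = sum(wi * (ei - e_bar) ** 2 for wi, ei in zip(w, effects))

    # Step 2: pooled variance, truncated at zero (assumed convention)
    denom = sum_w - sum(wi ** 2 for wi in w) / sum_w
    sigma2 = max(0.0, (q_t - (n - 1)) / denom)

    # Step 3: random-model weights w_i(rand) = 1 / (v_i + sigma^2_pooled)
    w_rand = [1.0 / (v + sigma2) for v in variances]
    return sigma2, w_rand

# Hypothetical heterogeneous effect sizes with equal variances
sigma2, w_rand = random_effects_weights(
    effects=[0.0, 1.0, 2.0, 3.0], variances=[0.25, 0.25, 0.25, 0.25]
)
```

The random-model weights are smaller and more nearly equal than the fixed-effects weights, so confidence intervals widen when studies are heterogeneous.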
Meta-Analysis: Example
•Competition in biological communities (Gurevitch et al., 1992)
•Subset of data (N=43) from 3 habitats (terrestrial, lentic, marine)
•Data: mean, std, n data from experiment/control
•Ho: Does competition differ among habitats?
Part of data:

Study  Habitat      Nc  Ne  Xc       Xe        Sc       Se       d        var(d)
1      Terrestrial  7   7   78.14    79.71     40.65    40.65    0.0362   0.2858
2      Terrestrial  7   7   18.86    26        9.17     9.17     0.7289   0.3047
3      Terrestrial  6   6   -1.8     -2.1      0.49     0.49     0.5651   0.3466
4      Terrestrial  5   5   -2.2     -2.8      0.224    0.447    1.5329   0.5175
5      Terrestrial  7   7   -2.1     -3        0.265    0.529    2.0139   0.4306
6      Terrestrial  6   6   -2.3     -4.2      0.49     1.225    1.8799   0.4806
7      Terrestrial  3   3   85.3     285.7     115.008  153.806  1.1806   0.7828
8      Terrestrial  3   3   0        3         0        2.425    1.3996   0.8299
9      Terrestrial  3   3   0        2         0        2.078    1.0889   0.7655
10     Terrestrial  3   3   0        1.67      0        1.732    1.0909   0.7658
11     Terrestrial  5   5   17       17        7.603    5.367    0        0.4
12     Terrestrial  5   5   47       37        10.286   9.391    -0.9171  0.4421
13     Terrestrial  4   4   87       272       37.712   183.532  1.2142   0.5921
14     Terrestrial  18  20  -0.113   0.294     0.255    0.215    1.6975   0.1435
15     Terrestrial  20  20  -0.163   0.412     0.588    0.218    1.2709   0.1202
16     Terrestrial  18  20  0.14     0.632     0.38     0.359    1.3051   0.128
17     Terrestrial  20  20  -0.184   0.259     0.326    0.238    1.5213   0.1289
18     Terrestrial  20  20  -0.075   0.354     0.487    0.182    1.1438   0.1164
19     Terrestrial  20  20  0.147    0.541     0.34     0.299    1.2062   0.1182
20     Lentic       4   4   281.11   -201.03   158.038  27.52    3.6961   1.3538
Meta-Analysis: Results
•E: Group effect sizes differed from zero (except lentic)
•QM: Effect sizes differed among groups
•Conclusion: competition occurs and differs among habitats
Group         #Studies   E+       df   95% CI
------------------------------------------------------------
Terrestrial   19         1.1417   18   0.8999 to 1.3
Lentic        2          4.1072   1    -7.1465 to 15.3609
Marine        22         0.7985   21   0.5419 to 1.0550
E++           43         1.0099        0.8408 to 1.1789

Model     df   Q         Prob(Chi-Square)
-------------------------------------------------
Between   2    16.4798   0.00026
Within    40   69.5016   0.00262
-------------------------------------------------
Total     42   85.9814   0.00007

[Figure: effect sizes for Lentic, Terrestrial, Grand Mean, and Marine; axis: Effect Size]
Meta-Analysis: Publication Bias
•Common concern is that only studies with significant results get
published, resulting in bias
•Can be assessed in a number of ways:
•Funnel Plot: plot effect size vs. sample size: should be funnel
shaped (larger variance with smaller n). If overabundance of
extreme values (for given n) with lack of data ‘in’ funnel, might be
publication bias
[Figure: funnel plot; y-axis: d, x-axis: Nc]
Meta-Analysis: Publication Bias Cont.
•Rank-Correlation Tests: Look at rank-correlation of
standardized effect size vs. sample size: $E_i^* = \left(E_i - \bar{E}\right)/\sqrt{v_i^*}$, where $v_i^* = v_i - \left(\sum_j 1/v_j\right)^{-1}$
•Fail-Safe Numbers: For the ‘file drawer problem’. How many
non-significant studies must be added to change result to non-significant (if large #, then result is robust): $N_R = \frac{\left[\sum_{i=1}^{n} Z(p_i)\right]^2}{Z_\alpha^2} - n$
•n: # studies, $Z(p_i)$: Z-scores for study significance values, $Z_\alpha$: Z-score for the 1-tail probability $\alpha$
•Normal Quantile Plot: Standardized effect size vs. normal
quantile (gaps or strange nonlinearities may indicate publication bias)
[Figure: normal quantile plot; y-axis: Standardized Effect Size, x-axis: Normal Quantile]
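The fail-safe number can be computed with the standard library alone. This sketch assumes the study p-values are one-tailed and that $Z(p_i)$ is the corresponding standard normal deviate; the function name and example p-values are illustrative.

```python
from statistics import NormalDist

def rosenthal_fail_safe(pvals, alpha=0.05):
    """Fail-safe number N_R = (sum of Z(p_i))^2 / Z_alpha^2 - n.

    Assumes one-tailed p-values; Z(p) is the normal deviate with upper tail p.
    """
    nd = NormalDist()
    z_sum = sum(nd.inv_cdf(1 - p) for p in pvals)
    z_alpha = nd.inv_cdf(1 - alpha)
    return z_sum ** 2 / z_alpha ** 2 - len(pvals)

# Four hypothetical studies, each exactly at p = 0.05
n_r = rosenthal_fail_safe([0.05, 0.05, 0.05, 0.05])
```

With four studies each at p = 0.05, the sum of Z-scores is $4Z_\alpha$, so $N_R = 16 - 4 = 12$: twelve unpublished null results would be needed to make the combined result non-significant.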
Cumulative Meta-Analysis
•Rank studies by some criterion (e.g., year of publication)
•Perform meta-analysis on 1st 2 studies, then 1st 3, 1st 4, etc.
•Plot cumulative effect sizes (with CI)
•Addresses when a synthesized result could be determined
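The cumulative procedure above can be sketched with running sums, so each step costs constant time. The sketch assumes a fixed-effects, no-structure model at each step and ranks by publication year; the function name and the three studies are made up for illustration.

```python
import math

def cumulative_meta_analysis(studies):
    """Cumulative effect size after the first 2, 3, ... studies.

    studies: list of (year, effect, variance); ranked here by year.
    Returns a list of (year, k, cumulative effect, standard error).
    """
    ordered = sorted(studies, key=lambda s: s[0])
    sum_w = 0.0
    sum_we = 0.0
    steps = []
    for k, (year, effect, variance) in enumerate(ordered, start=1):
        w = 1.0 / variance
        sum_w += w
        sum_we += w * effect
        if k >= 2:  # the first summary uses the first two studies
            steps.append((year, k, sum_we / sum_w, math.sqrt(1.0 / sum_w)))
    return steps

# Hypothetical studies: (year, effect size, variance)
steps = cumulative_meta_analysis([
    (2001, 1.0, 0.5),
    (1999, 0.0, 0.5),
    (2005, 2.0, 0.5),
])
```

Plotting the third element of each tuple with an interval built from the fourth shows the point at which the synthesized result stabilizes.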
Meta-Analysis: Resampling Tests
•Adams et al., 1997 (Ecology) proposed some resampling methods
•Randomization for assessing significance of Q-statistics
•Bootstrapping for assessing CI of cumulative effect sizes
•Removes assumptions of testing vs. $\chi^2$ distribution
Adams et al. 1997. Ecology. 78:1277-1283.
See also: Rosenberg, Adams, Gurevitch. 2000. MetaWin. Sinauer Assoc.
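One plausible reading of these two resampling ideas is sketched below; it is not claimed to reproduce the exact procedures of Adams et al. (1997). The bootstrap resamples whole studies with replacement and takes a percentile interval for the cumulative effect size, and the randomization test shuffles group labels to build a null distribution for $Q_M$. All names, defaults, and data are illustrative assumptions.

```python
import random

def weighted_mean(pairs):
    """Inverse-variance weighted mean of (effect, variance) pairs."""
    return sum(e / v for e, v in pairs) / sum(1.0 / v for _, v in pairs)

def bootstrap_ci(pairs, n_boot=2000, level=0.95, seed=1):
    """Percentile bootstrap interval for the cumulative effect size."""
    rng = random.Random(seed)
    means = sorted(
        weighted_mean([rng.choice(pairs) for _ in pairs]) for _ in range(n_boot)
    )
    lo_idx = int((1 - level) / 2 * n_boot)
    hi_idx = int((1 + level) / 2 * n_boot) - 1
    return means[lo_idx], means[hi_idx]

def q_between(pairs, labels):
    """Q_M: weighted squared deviations of group means from the grand mean."""
    grand = weighted_mean(pairs)
    q_m = 0.0
    for g in sorted(set(labels)):
        members = [p for p, lab in zip(pairs, labels) if lab == g]
        group_w = sum(1.0 / v for _, v in members)
        q_m += group_w * (weighted_mean(members) - grand) ** 2
    return q_m

def randomization_p(pairs, labels, n_perm=999, seed=1):
    """Proportion of label shuffles with Q_M at least as large as observed."""
    rng = random.Random(seed)
    observed = q_between(pairs, labels)
    shuffled = list(labels)
    count = 1  # the observed arrangement counts as one permutation
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if q_between(pairs, shuffled) >= observed - 1e-12:
            count += 1
    return count / (n_perm + 1)

# Hypothetical data: (effect size, variance) and a group label per study
pairs = [(0.5, 0.25), (1.5, 0.25), (2.5, 0.25), (3.5, 0.25)]
labels = ["a", "a", "b", "b"]
lo, hi = bootstrap_ci(pairs)
p_value = randomization_p(pairs, labels)
```

With only four studies the randomization p-value cannot be small, since few distinct label arrangements exist; the example is meant to show the mechanics, not a realistic analysis.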
Phylogenies and Meta-Analysis
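•When studies come from a set of related taxa, phylogenetic nonindependence is an issue
•Phylogenetic meta-analysis recently developed (Adams, 2008)
•Both PGLS and meta-analysis are GLS models, so can be combined
• M-A: $\boldsymbol{\beta} = \left(\mathbf{X}^t\mathbf{W}\mathbf{X}\right)^{-1}\mathbf{X}^t\mathbf{W}\mathbf{E}$
• PGLS: $\boldsymbol{\beta} = \left(\mathbf{X}^t\boldsymbol{\Sigma}^{-1}\mathbf{X}\right)^{-1}\mathbf{X}^t\boldsymbol{\Sigma}^{-1}\mathbf{Y}$
Steps
1. SVD of $\boldsymbol{\Sigma}$: obtain transformation matrix (D) [see Garland and Ives, 2000. Am. Nat.]
2. Transform X and E as: $\mathbf{E}_{new} = \mathbf{D}\mathbf{E}$, $\mathbf{X}_{new} = \mathbf{D}\mathbf{X}$
3. Solve meta-analysis with transformed data: $\boldsymbol{\beta}_{pma} = \left(\mathbf{X}_{new}^t\mathbf{W}\mathbf{X}_{new}\right)^{-1}\mathbf{X}_{new}^t\mathbf{W}\mathbf{E}_{new}$
•NOTE: this is a fixed effects, Brownian motion model (method
generalized by Lajeunesse, 2009)
Adams. 2008. Evolution. 62:567-572.
Also: Lajeunesse. 2009. Am. Nat. 174:369-381.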