Matching in case control studies

Matching
(in case control studies)
James Stuart, Fernando Simón
EPIET
Dublin, 2006
Remember confounding…
A confounding factor is a variable
independently associated with
• the exposure of interest
• the outcome
that distorts the measured association between them
Control of confounders
In the study design
• Restriction
• Matching
In the analysis
• Stratification
• Multivariate analysis
Matching
Selection of controls to match specific
characteristics of cases
a) Frequency matching
Select controls to get same distribution of
variable as cases (e.g. age group)
b) Individual matching
Select a specific control per case by
matching variable (e.g. date of birth)
… but matching introduces bias
because controls are no longer
representative of source population
To remove this selection bias:
• Stratify the analysis by the matching criteria
(matched design → matched analysis)
• The effect of the matching variables
on the outcome cannot be studied
a) Frequency matching
useful if distribution of cases for a
confounding variable differs markedly
from distribution of that variable in
source population
a) Frequency matching
Age (years)   Cases   Controls (unmatched)
0-14             50        20
15-29            30        20
30-44            15        20
45+               5        40
TOTAL           100       100
a) Frequency matching
Age (years)   Cases   Controls (unmatched)   Controls (matched)
0-14             50        10                     50
15-29            30        25                     30
30-44            15        25                     15
45+               5        40                      5
TOTAL           100       100                    100
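As a rough illustration of how the matched column can be obtained, the sketch below derives control quotas per age group from the age distribution of the cases. The function name `control_quotas` and the parameter `controls_per_case` are assumptions made for this example and do not come from the slides.

```python
from collections import Counter


def control_quotas(case_strata, controls_per_case=1):
    """Return the number of controls to recruit in each stratum so that
    controls follow the same distribution as the cases.

    case_strata: one stratum label (e.g. age group) per case.
    controls_per_case: hypothetical matching ratio, 1 by default.
    """
    counts = Counter(case_strata)
    return {stratum: n * controls_per_case for stratum, n in counts.items()}


# Age groups of the 100 cases shown in the frequency matching table
case_ages = ["0-14"] * 50 + ["15-29"] * 30 + ["30-44"] * 15 + ["45+"] * 5

quotas = control_quotas(case_ages)
print(quotas)  # {'0-14': 50, '15-29': 30, '30-44': 15, '45+': 5}
```

With `controls_per_case=2` the same function would ask for twice as many controls in every age group while keeping the distribution identical to that of the cases.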
a) Frequency matching: analysis
• Mantel-Haenszel odds ratio (weighted)
$$OR_{MH} = \frac{\sum_i a_i d_i / n_i}{\sum_i b_i c_i / n_i}$$
• Conditional logistic regression for
multiple variables
a) Frequency matching: analysis
• keep stratification by age group
0-14 years
Exposed   Cases    Controls   Total
Yes       45 (a)   30 (b)      75
No         5 (c)   20 (d)      25
Total     50       50         100 (n_i)
a_i d_i / n_i = 900/100 = 9
b_i c_i / n_i = 150/100 = 1.5
a) Frequency matching: analysis
15-29 years
Exposed   Cases    Controls   Total
Yes       15 (a)    4 (b)     19
No        15 (c)   26 (d)     41
Total     30       30         60 (n_i)
a_i d_i / n_i = 390/60 = 6.5
b_i c_i / n_i = 60/60 = 1.0
Same process for each age group:
OR_MH = (9 + 6.5 + etc.) / (1.5 + 1.0 + etc.)
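The stratified calculation can be sketched in a few lines of Python. The function name `mantel_haenszel_or` is an assumption for this example. Only the two age groups whose tables appear on the slides are included, so the result is a partial figure and not the OR_MH of the full study, which would also need the 30-44 and 45+ strata.

```python
def mantel_haenszel_or(strata):
    """Mantel-Haenszel odds ratio over a list of 2x2 tables.

    Each stratum is a tuple (a, b, c, d):
      a = exposed cases,   b = exposed controls,
      c = unexposed cases, d = unexposed controls.
    """
    numerator = 0.0
    denominator = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        if n == 0:
            continue  # an empty stratum contributes nothing
        numerator += a * d / n
        denominator += b * c / n
    if denominator == 0:
        raise ValueError("OR_MH is undefined: no stratum has b*c > 0")
    return numerator / denominator


# The two age groups shown on the slides (other strata are not given)
strata = [
    (45, 30, 5, 20),   # 0-14 years:  a*d/n = 9,   b*c/n = 1.5
    (15, 4, 15, 26),   # 15-29 years: a*d/n = 6.5, b*c/n = 1.0
]

partial_or = mantel_haenszel_or(strata)
print(round(partial_or, 2))  # 6.2, i.e. (9 + 6.5) / (1.5 + 1.0)
```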
b) Individual matching
Each pair can be considered as one stratum
4 possible outcomes per pair
                                 Case     Control
Exposure                         +  -     +  -      ad   bc
Both exposed                     1  0     1  0      0    0
Case exposed, control not        1  0     0  1      1    0
Both unexposed                   0  1     0  1      0    0
Case not exposed, control is     0  1     1  0      0    1
ad = zero unless case exposed, control not exposed
bc = zero unless control exposed, case not exposed
b) individual matching
$$OR_{MH} = \frac{\sum_i a_i d_i / n_i}{\sum_i b_i c_i / n_i}$$
The only pairs that contribute to the OR are
the discordant pairs
OR_MH = (number of discordant pairs where case exposed)
        / (number of discordant pairs where control exposed)
b) Individual matching
If the case and control data are presented in pairs:
                         Controls
                     Exposed       Unexposed
Cases   Exposed      e             f (ad = 1)
        Unexposed    g (bc = 1)    h
OR_MH = (number of discordant pairs where case exposed)
        / (number of discordant pairs where control exposed)
      = f/g
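A small sketch can show why the Mantel-Haenszel formula collapses to f/g when every pair is its own stratum. The function names and the list of pairs are invented for illustration only; the counts are not study data.

```python
def pair_counts(pairs):
    """Count the four pair types (e, f, g, h) from
    (case_exposed, control_exposed) tuples."""
    e = f = g = h = 0
    for case_exposed, control_exposed in pairs:
        if case_exposed and control_exposed:
            e += 1          # concordant, both exposed
        elif case_exposed:
            f += 1          # discordant, case exposed only
        elif control_exposed:
            g += 1          # discordant, control exposed only
        else:
            h += 1          # concordant, both unexposed
    return e, f, g, h


def mh_or_from_pairs(pairs):
    """Mantel-Haenszel OR treating each pair as a stratum with n_i = 2.
    Raises ZeroDivisionError if no pair has the control exposed only."""
    numerator = denominator = 0.0
    for case_exposed, control_exposed in pairs:
        a, b = int(case_exposed), int(control_exposed)
        c, d = 1 - a, 1 - b
        numerator += a * d / 2    # non-zero only for case-exposed-only pairs
        denominator += b * c / 2  # non-zero only for control-exposed-only pairs
    return numerator / denominator


# Hypothetical pairs, made up for this example: e=2, f=6, g=2, h=3
pairs = ([(True, True)] * 2 + [(True, False)] * 6
         + [(False, True)] * 2 + [(False, False)] * 3)

e, f, g, h = pair_counts(pairs)
print(f / g, mh_or_from_pairs(pairs))  # 3.0 3.0
```

Concordant pairs add zero to both sums, and the common factor 1/2 cancels, which leaves f/g.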
b) Individual matching: for n controls
Each set is analysed in pairs
The case is used in as many pairs as it has controls
Set     Case   Exposed controls (of 4)   C+/Ctr-   C-/Ctr+
1        +              1                   3         0
2        +              3                   1         0
3        -              0                   0         0
4        +              1                   3         0
5        -              1                   0         1
6        +              3                   1         0
7        +              4                   0         0
Total                                       8         1
OR = (pairs case exposed / control not) / (pairs case not / control exposed) = 8/1 = 8
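The pair counting for sets with several controls can be sketched as follows. The function name `discordant_pairs` is an assumption. The seven sets reproduce the counts from the slide, but which particular control in a set is exposed is arbitrary here, because only the number of exposed controls affects the result.

```python
def discordant_pairs(sets):
    """Count discordant case-control pairs in 1:n matched sets.

    sets: iterable of (case_exposed, [control_exposed, ...]).
    Returns (case exposed / control not, case not / control exposed).
    """
    case_only = 0
    control_only = 0
    for case_exposed, controls in sets:
        exposed_controls = sum(1 for exposed in controls if exposed)
        if case_exposed:
            # the case forms a discordant pair with each unexposed control
            case_only += len(controls) - exposed_controls
        else:
            # the case forms a discordant pair with each exposed control
            control_only += exposed_controls
    return case_only, control_only


T, F = True, False
# Seven sets of one case and four controls; positions of exposed controls
# within a set are arbitrary
matched_sets = [
    (T, [T, F, F, F]),
    (T, [T, T, T, F]),
    (F, [F, F, F, F]),
    (T, [T, F, F, F]),
    (F, [T, F, F, F]),
    (T, [T, T, T, F]),
    (T, [T, T, T, T]),
]

case_only, control_only = discordant_pairs(matched_sets)
print(case_only, control_only, case_only / control_only)  # 8 1 8.0
```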
Matched study: example
• 20 cases of cryptosporidiosis
• Hypothesis: associated with attendance at
local swimming pool
• 2 matched studies conducted
(i) controls from same general practice
and nearest date of birth
(ii) case nominated (friend) controls
Analysis: GP and age matched controls
Swimming pool exposure
                 Controls
                 +      -
Cases   +        1     15
        -        1      3
OR = f/g = 15/1 = 15.0
Analysis: friend controls
Swimming pool exposure
                 Controls
                 +      -
Cases   +       13      3
        -        1      3
OR = f/g = 3/1 = 3.0
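Both analyses can be checked from the four cell counts of each paired table. The sketch below uses a hypothetical helper, `paired_table_summary`, which also reports how many cases and how many controls were exposed. It suggests that friend controls shared the swimming pool exposure far more often than GP and age matched controls did.

```python
def paired_table_summary(e, f, g, h, expected_pairs=None):
    """Summarise a matched-pair table.

    e = both exposed, f = case exposed only,
    g = control exposed only, h = neither exposed.
    """
    pairs = e + f + g + h
    if expected_pairs is not None and pairs != expected_pairs:
        raise ValueError(f"expected {expected_pairs} pairs, got {pairs}")
    if g == 0:
        raise ValueError("OR is undefined when g = 0")
    return {
        "pairs": pairs,
        "cases_exposed": e + f,
        "controls_exposed": e + g,
        "odds_ratio": f / g,
    }


# Cell counts read from the two tables; 20 cases in each study
gp_age = paired_table_summary(e=1, f=15, g=1, h=3, expected_pairs=20)
friends = paired_table_summary(e=13, f=3, g=1, h=3, expected_pairs=20)

print(gp_age["odds_ratio"], gp_age["controls_exposed"])    # 15.0 2
print(friends["odds_ratio"], friends["controls_exposed"])  # 3.0 14
```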
Why do matched studies?
• Random sample may not be possible
• Quick and easy way to get controls
• Improves efficiency of study (smaller
sample size)
• Can control for confounding due to
factors that are difficult to measure or
even for unknown confounders.
Disadvantages of matching
• Cannot examine risks associated with
matching variable
• If no control can be identified for a case
(more likely with too many matching variables),
that case's data are lost, and vice versa
• Overmatching on exposure of interest
will bias OR towards 1
• May be residual confounding in
frequency matching
Over-matching
• matching on exposure to the risk
factor of interest
• under-estimates the true
association
• may fail to find a true
association
Key points
• Matching controls for confounding factors in study
design
• Matched design → matched analysis
• Matching for variables that are not confounders
complicates design
• Frequency matching simpler than individual
• Multivariable analysis reduces need to match