Effect Modification & Confounding
Kostas Danis
EPIET Introductory course,
Menorca 2012
Analytical epidemiology
Study design: cohorts & case control &
cross-sectional studies
Choice of a reference group
Biases
Impact
Causal inference
Stratification
- Effect modification
- Confounding
Matching
Multivariable analysis
Cohort studies
marching towards outcomes
Cohort study
              Total   Cases   Non-cases   Risk %
Exposed       100     50      50          50 %
Not exposed   100     10      90          10 %
Risk ratio = 50% / 10% = 5
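The arithmetic in the cohort table can be checked with a minimal sketch like the one below. The function name `risk_ratio` is only illustrative.

```python
def risk_ratio(cases_exp, total_exp, cases_unexp, total_unexp):
    """Risk in the exposed divided by risk in the unexposed."""
    risk_exp = cases_exp / total_exp
    risk_unexp = cases_unexp / total_unexp
    return risk_exp / risk_unexp

# Counts from the cohort table: 50/100 exposed and 10/100 unexposed became cases
rr = risk_ratio(50, 100, 10, 100)
print(f"Risk ratio = {rr:.1f}")
```

The same calculation gives a prevalence ratio when the counts come from a cross-sectional sample.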
Source population
Exposed
Cases
Sample
Unexposed
Controls:
Sample of the denominator
Representative with
regard to exposure
Controls
Controls are non cases
Cases
Source
popn
Low attack rate:
non-cases likely to represent
exposure in source pop
start
Non- cases
end
Cases
High attack rate:
non-cases unlikely to represent
exposure in source population
start
Non- cases
end
Case control study
                    Cases   Controls
Exposed             a       b
Not exposed         c       d
Total               a+c     b+d
Odds of exposure    a/c     b/d
Odds ratio: OR = (a/c) / (b/d) = ad / bc
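A short sketch of the cross-product formula, applied to the unstratified asbestos and lung cancer counts used later in this deck. The confidence interval uses the logit (Woolf) approximation, which is an assumption here: the slides do not say how their intervals were computed.

```python
import math

def odds_ratio(a, b, c, d, z=1.96):
    """a, b = exposed cases, exposed controls; c, d = unexposed cases, controls.
    Returns (OR, lower, upper) using a logit-based interval (assumed method)."""
    estimate = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(estimate) - z * se_log)
    upper = math.exp(math.log(estimate) + z * se_log)
    return estimate, lower, upper

# Asbestos: 693 exposed cases, 320 exposed controls, 307 and 680 unexposed
or_crude, ci_low, ci_high = odds_ratio(693, 320, 307, 680)
print(f"OR = {or_crude:.1f} ({ci_low:.1f}-{ci_high:.1f})")
```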
Who are the right controls?
Controls may not be easy to find
Cross-sectional study: Sampling
Sampling
Population
Sample
Target Population
Cross-sectional study
              Total   Cases   Non-cases   Prevalence %
Exposed       1,000   500     500         50 %
Not exposed   1,000   100     900         10 %
Prevalence ratio (PR) = 50% / 10% = 5
Should I believe my measurement?
Exposure
Outcome
RR = 4
True association
causal
non-causal
Chance?
Bias?
Confounding?
Exposure
Outcome
Third variable
Two main complications
(1) Effect modifier
- useful information
(2) Confounding factor
- bias
To analyse effect modification
To eliminate confounding
Solution =
stratification
stratified analysis
Create strata
according to categories
inside the range of values
taken by third variable
Effect modification
Effect modifier
Variation in the magnitude
of measure of effect
across levels of a third variable.
Happens when RR or OR
is different between strata
(subgroups of population)
Effect modifier
To identify a subgroup
with a lower or higher risk ratio
To target public health action
To study interaction
between risk factors
Effect modification
Factor A
(asbestos)
Disease
(lung cancer)
Factor B
(smoking)
Effect modifier = Interaction
Asbestos (As) and lung cancer (Ca)
Case-control study, unstratified data
As      Ca     Controls   OR
Yes     693    320        4.8
No      307    680        Ref.
Total   1000   1000
Asbestos
Lung cancer
Smoking
Smokers
As      Ca    Controls   OR
Yes     517   160        6.0
No      183   340        Ref.
Total   700   500
Nonsmokers
As      Ca    Controls   OR
Yes     176   160        3.0
No      124   340        Ref.
Total   300   500
Asbestos (As), smoking and lung cancer (Ca)
As    Smoking   Cases   Controls   OR
Yes   Yes       517     160        8.9
Yes   No        176     160        3.0
No    Yes       183     340        1.5
No    No        124     340        Ref.
1.5 * 3.0 < 8.9
1.5 * 3.0 * interaction = 8.9
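The interaction factor on the multiplicative scale can be worked out from the four exposure combinations. This is a sketch using the counts above; the variable names are illustrative.

```python
def odds_ratio(cases, controls, ref_cases, ref_controls):
    """OR of one exposure category against the doubly unexposed reference."""
    return (cases * ref_controls) / (controls * ref_cases)

REF = (124, 340)  # no asbestos, non-smoker: cases, controls

or_both = odds_ratio(517, 160, *REF)           # asbestos and smoking
or_asbestos_only = odds_ratio(176, 160, *REF)  # asbestos, non-smoker
or_smoking_only = odds_ratio(183, 340, *REF)   # smoker, no asbestos

# Under a purely multiplicative model the joint OR would be the product
expected_joint = or_asbestos_only * or_smoking_only
interaction = or_both / expected_joint

print(f"Joint OR {or_both:.1f}, expected {expected_joint:.1f}, "
      f"interaction factor {interaction:.1f}")
```

An interaction factor near 1 would mean the two exposures simply multiply; here the joint effect is about twice what multiplication alone predicts.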
Physical activity and MI
Physical activity   MI    Controls   OR, 95%CI
≥ 2500 kcal/d       190   264        0.64 (0.6-0.9)
< 2500 kcal/d       176   157        Ref.
Physical
activity
Infarction
Gender
Men
Physical activity   MI    Controls   OR, 95%CI
≥ 2500 kcal/d       141   208        0.53 (0.4-0.7)
< 2500 kcal/d       144   112        Ref.
Women
Physical activity   MI    Controls   OR, 95%CI
≥ 2500 kcal/d       49    56         1.2 (0.7-2.2)
< 2500 kcal/d       32    45         Ref.
Vaccine efficacy
VE = (ARU – ARV) / ARU
VE = 1 – RR
Vaccine efficacy
Status   Pop.      Cases   Cases per 1000   RR
V        301 545   150     0.49             0.28
NV       298 655   515     1.72             Ref.
Total    600 200   665     1.11
VE = 1 - RR = 1 - 0.28 = 72%
Vaccine
Disease
Age
Vaccine efficacy by age group
Age      Status   Pop.     Cases   Cases /1000   RR     VE
<1y      V        35 625   38      1.07          0.87   13%
         NV       24 375   30      1.23
1-4y     V        44 220   34      0.77          0.42   58%
         NV       46 780   86      1.84
5-9y     V        78 200   50      0.64          0.19   81%
         NV       75 000   250     3.33
10-24y   V        83 400   18      0.22          0.15   85%
         NV       82 600   120     1.45
> 24y    V        60 100   10      0.17          0.40   60%
         NV       69 900   29      0.41
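The stratum-specific efficacies follow from VE = 1 - RR applied within each age group. A minimal sketch with the counts from the table (the dictionary layout is an illustrative choice):

```python
# age group: (cases vaccinated, pop. vaccinated, cases unvaccinated, pop. unvaccinated)
strata = {
    "<1y":    (38, 35625, 30, 24375),
    "1-4y":   (34, 44220, 86, 46780),
    "5-9y":   (50, 78200, 250, 75000),
    "10-24y": (18, 83400, 120, 82600),
    ">24y":   (10, 60100, 29, 69900),
}

def vaccine_efficacy(cases_v, pop_v, cases_nv, pop_nv):
    """VE = 1 - RR, with RR = attack rate vaccinated / attack rate unvaccinated."""
    rr = (cases_v / pop_v) / (cases_nv / pop_nv)
    return 1 - rr

ve_by_age = {age: vaccine_efficacy(*counts) for age, counts in strata.items()}
for age, ve in ve_by_age.items():
    print(f"{age:>7}: VE = {ve:.0%}")
```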
Effect modification
Different effects (RR)
in different strata (age groups)
VE is modified by age
Test for homogeneity
among strata (Woolf test)
Any statistical test to help us?
Breslow-Day
Woolf test
Test for trends: Chi square
How to conduct a stratified analysis?
Crude analysis
Stratified analysis
1. Do stratum-specific estimates look different?
2. 95% CI of OR/RR do NOT overlap?
3. Is the Test of Homogeneity significant?
YES: EFFECT MODIFICATION (Report estimates by stratum)
NO: Check for confounding (compare crude RR/OR with MH RR/OR)
Stratified analysis:
Effect Modification
ORs / RRs
different across strata
ORs / RRs 95% C.I.
do not overlap
ORs / RRs C.I.
do overlap
Effect modification
Use Woolf's test
Woolf's test significant
Woolf's test not significant
Effect modification
Effect modification
unlikely
Discuss lack of power
of Woolf's test
Death from diarrhea according to breast feeding, Brazil, 1980s
(Crude analysis)
                    Diarrhea   Controls   OR (95% CI)
No breast feeding   120        136        3.6 (2.4-5.5)
Breast feeding      50         204        Ref
No breast feeding
Diarrhea
Age
Death from diarrhea according to breast feeding, Brazil, 1980s
Infants < 1 month of age
                    Cases   Controls   OR (95% CI)
No breast feeding   10      3          32 (6-203)
Breast feeding      7       68         Ref
Infants ≥ 1 month of age
                    Cases   Controls   OR (95% CI)
No breast feeding   110     133        2.6 (1.7-4.1)
Breast feeding      43      136        Ref
Woolf test (test of homogeneity): p=0.03
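One common form of the Woolf test compares each stratum's log odds ratio with their inverse-variance weighted mean. The sketch below assumes that form and uses the breast feeding counts. The p-value it returns need not match the figure quoted on the slide, which may come from a different variant or software.

```python
import math

def woolf_homogeneity(strata):
    """strata: list of (a, b, c, d) = exposed cases, exposed controls,
    unexposed cases, unexposed controls. No cell may be zero."""
    log_ors = [math.log((a * d) / (b * c)) for a, b, c, d in strata]
    weights = [1 / (1 / a + 1 / b + 1 / c + 1 / d) for a, b, c, d in strata]
    pooled = sum(w * l for w, l in zip(weights, log_ors)) / sum(weights)
    chi2 = sum(w * (l - pooled) ** 2 for w, l in zip(weights, log_ors))
    df = len(strata) - 1
    # Closed-form chi-square tail, valid for df = 1 only (two strata)
    p_value = math.erfc(math.sqrt(chi2 / 2)) if df == 1 else None
    return chi2, df, p_value

breast_feeding = [
    (10, 3, 7, 68),       # infants < 1 month
    (110, 133, 43, 136),  # infants >= 1 month
]
chi2, df, p_value = woolf_homogeneity(breast_feeding)
print(f"chi2 = {chi2:.2f}, df = {df}, p = {p_value:.4f}")
```

A small p-value suggests the stratum-specific ORs differ by more than chance, so they are reported separately rather than pooled.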
Risk of gastroenteritis by exposure, Outbreak X, Place, time X (crude analysis)
           Exposed: Yes     Exposed: No
Exposure   n     AR (%)*    n     AR (%)*    RR† (95% CI‡)
pasta      94    77         7     4.2        18.0 (8.8-38)
tuna       49    68         49    24         2.9 (2.1-3.8)
* AR = Attack Rate
† RR = Risk Ratio
‡ 95% CI = 95% confidence interval of the RR
Tuna
gastroenteritis
Pasta
Risk of gastroenteritis by exposure, Outbreak X, Place, time X (stratified analysis)
Pasta Yes
          Cases   Total   AR (%)   RR (95% CI)
Tuna      43      52      83       1.1 (0.9-1.3)
No tuna   46      60      77       Ref
Pasta No
          Cases   Total   AR (%)   RR (95% CI)
Tuna      4       17      24       11 (2.6-46)
No tuna   3       144     2        Ref
Woolf test (test of homogeneity): p=0.0007
Tuna, pasta and gastroenteritis
Tuna   Pasta   Cases   AR(%)   RR
Yes    Yes     43      83      42
Yes    No      4       23      12
No     Yes     46      76      38
No     No      3       2       Ref.
38 * 12 > 42
38 * 12 * interaction = 42
Risk of HIV by injecting drug use (idu), surveillance data, Spain, 1988-2004
         Cases   Total    AR (%)   RR (95% CI)
Idu      268     2,732    9.8      3.9 (3.3-4.4)
No idu   484     18,822   2.5      Ref
idu
hiv
gender
Risk of HIV by injecting drug use (idu), Spain, 1988-2004 (stratified analysis)
Males
         Cases   Total    AR (%)   RR (95% CI)
idu      86      693      12       20 (14-28)
No idu   52      8,306    0.6      Ref
Females
         Cases   Total    AR (%)   RR (95% CI)
idu      182     2,039    8.9      2.3 (1.9-2.6)
No idu   432     10,576   4.1      Ref
Woolf test (test of homogeneity): p < 0.00001
Idu, gender and hiv
Idu   Male   Cases   AR(%)   RR
Yes   Yes    86      12.4    3.0
Yes   No     182     8.9     2.2
No    Yes    52      0.6     0.14
No    No     432     4.1     Ref.
0.14 * 2.2 < 3.0
0.14 * 2.2 * interaction = 3.0
Confounding
Confounding
Distortion of measure of effect
because of a third factor
Should be prevented
Needs to be controlled for
Confounding
Skateboarding
Chlamydia
Age
Age not evenly distributed between
the 2 exposure groups
- skate-boarders, 90% young
- Non skate-boarders, 20% young
Exposure
(coffee)
Outcome
(Lung cancer)
Third variable
(smoking)
Grey hair
stroke
Age
Cases of Down syndrome by birth order
[Figure: cases per 100 000 live births (y-axis, 0 to 180) by birth order (x-axis, 1 to 5)]
Cases of Down syndrome by age groups
[Figure: cases per 100 000 live births (y-axis, 0 to 1000) by mother's age group (x-axis: < 20, 20-24, 25-29, 30-34, 35-39, 40+)]
Birth order
Down syndrome
Age of mother
Cases of Down syndrome by birth order and mother's age
[Figure: cases per 100 000 (y-axis, 0 to 1000) by birth order (1 to 5) and age group (< 20, 20-24, 25-29, 30-34, 35-39, 40+)]
Confounding
To be a confounding factor, 2 conditions must be met:
Exposure
Outcome
Third variable
Be associated with exposure
- without being the consequence of exposure
Be associated with outcome
- independently of exposure
Exposure
Outcome
Hypercholesterolaemia
Myocardial infarction
Third factor
Atheroma
Any factor which is a necessary step in
the causal chain is not a confounder
Salt
Myocardial
infarction
Hypertension
The nuisance introduced by
confounding factors
• May simulate an association
• May hide an association that does exist
• May alter the strength of the association
– Increased
– Decreased
Confounding factor
Apparent association
Ethnicity
Pneumonia
Crowding
Altered strength of association
Crowding
Pneumonia
Malnutrition
How to prevent/control confounding?
Prevention
– Randomization (experiment)
– Restriction to one stratum
– Matching
Control
– Stratified analysis
– Multivariable analysis
Are Mercedes more dangerous than Porsches?
Type       Total   Accidents   AR %   RR
Porsche    1 000   300         30     1.5
Mercedes   1 000   200         20     Ref.
Total      2 000   500         25
95% CI = 1.3 - 1.8
Car type
Accidents
Confounding factor:
Age of driver
< 25 years
Type       Total   Accidents   AR %   RR (95% CI)
Porsche    550     250         45.5   1.14 (0.9-1.3)
Mercedes   300     120         40.0
≥ 25 years
Type       Total   Accidents   AR %   RR (95% CI)
Porsche    450     50          11.1   0.97 (0.7-1.4)
Mercedes   700     80          11.4
Crude RR = 1.5
Adjusted RR = 1.1 (0.94 - 1.27)
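The adjusted RR can be reproduced with the Mantel-Haenszel weighted risk ratio. This is a sketch assuming the standard MH formula for cohort data, using the two driver-age strata; the function name is illustrative.

```python
def mh_risk_ratio(strata):
    """strata: list of (cases_exp, total_exp, cases_unexp, total_unexp).
    Mantel-Haenszel RR = sum(a * n0 / N) / sum(c * n1 / N) over strata."""
    numerator = sum(a * n0 / (n1 + n0) for a, n1, c, n0 in strata)
    denominator = sum(c * n1 / (n1 + n0) for a, n1, c, n0 in strata)
    return numerator / denominator

# Porsche = exposed, Mercedes = reference
by_driver_age = [
    (250, 550, 120, 300),  # drivers < 25 years
    (50, 450, 80, 700),    # drivers 25 years and over
]
crude_rr = (300 / 1000) / (200 / 1000)
adjusted_rr = mh_risk_ratio(by_driver_age)
print(f"Crude RR = {crude_rr:.2f}, MH-adjusted RR = {adjusted_rr:.2f}")
```

The adjusted value sits close to both stratum-specific RRs and well below the crude one, which is the pattern expected when driver age confounds the comparison.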
Incidence of malaria according to the presence of a radio set, Kahinbhi Pradesh
Crude data
            Malaria   Total   AR%   RR
Radio set   80        520     15    0.7
No radio    220       1080    20    Ref
RR: 0.7; 95% CI: 0.6 - 0.9; p < 0.02
Radio
Malaria
Confounding factor:
Mosquito net
Sleeping under mosquito net
           Malaria   Total   AR%    RR
Radio      30        400     7.5    1.02
No radio   50        680     7.4    Ref
No mosquito net
           Malaria   Total   AR%    RR
Radio      50        120     41.7   0.98
No radio   170       400     42.5   Ref
Crude RR = 0.7
Adjusted RR = 1.01
To identify confounding
Compare crude measure of effect
(RR or OR)
to
adjusted (weighted) measure of effect
(Mantel Haenszel RR or OR)
Any statistical test to help us?
When is ORMH different from crude OR ?
10 - 20 %
Mantel-Haenszel summary measure
Adjusted or weighted RR or OR
Advantages of MH
• Zeroes allowed
OR MH = SUM (ai di / ni) / SUM (bi ci / ni)
Stratum 1
        Cases   Controls
Exp+    a1      b1
Exp-    c1      d1
Total: n1
Stratum 2
        Cases   Controls
Exp+    a2      b2
Exp-    c2      d2
Total: n2
OR MH = [(a1 x d1) / n1 + (a2 x d2) / n2] / [(b1 x c1) / n1 + (b2 x c2) / n2]
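A minimal sketch of the OR MH formula. For illustration it uses the egg storage counts by season from the Salmonella example in the back-up slides; any set of 2x2 strata in the same layout would do.

```python
def mh_odds_ratio(strata):
    """strata: list of (a, b, c, d) = exposed cases, exposed controls,
    unexposed cases, unexposed controls. n is the stratum total."""
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return numerator / denominator

# Storage >= 2 weeks = exposed, < 2 weeks = unexposed
by_season = [
    (12, 2, 52, 64),  # summer
    (7, 3, 32, 36),   # other seasons
]
or_mh = mh_odds_ratio(by_season)
or_crude = (19 * 100) / (5 * 84)  # all seasons combined
print(f"Crude OR = {or_crude:.2f}, MH OR = {or_mh:.2f}")
```

Because each term is divided by its own stratum total, a stratum with a zero cell still contributes to the sums, which is the advantage noted on the slide.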
How to conduct a stratified analysis?
Crude analysis
Stratified analysis
1. Do stratum-specific estimates look different?
2. 95% CI of OR/RR do NOT overlap?
3. Is the Test of Homogeneity significant?
YES: EFFECT MODIFICATION (Report estimates by stratum)
NO: Check for confounding (compare crude RR/OR with MH RR/OR)
Risk of gastroenteritis by exposure, Outbreak X, Place, time X (crude analysis)
. cstable case pesto pasta
           Exposed                 Unexposed
Exposure   Total   Cases   AR%     Total   Cases   AR%     Risk Ratio           P
pasta      121     94      77.69   165     7       4.24    18.31 [8.81-38.04]   0.000
pesto      79      45      56.96   212     58      27.36   2.08 [1.56-2.79]     0.000
Stratified Analysis
. csinter case pesto, by(pasta)
pasta = Exposed
pesto       Total   Cases   Risk %
Exposed     56      43      76.79
UnExposed   65      51      78.46
Risk difference    -0.02   [-0.17-0.13]
Risk Ratio          0.98   [0.81-1.19]
Attrib.risk.exp     0.02   [-0.19-0.19]
Attrib.risk.pop     0.01   [.-.]
pasta = Unexposed
pesto       Total   Cases   Risk %
Exposed     20      1       5.00
UnExposed   145     6       4.14
Risk difference     0.01   [-0.09-0.11]
Risk Ratio          1.21   [0.15-9.53]
Attrib.risk.exp     0.17   [-5.52-0.90]
Attrib.risk.pop     0.02   [.-.]
Test of Homogeneity (M-H) : pvalue : 0.8366301
Crude RR for pesto : 2.08 [1.56-2.79]
MH RR for pesto adjusted for pasta : 0.99 [0.81-1.20]
Adjusted/crude relative change : -52.67 %
> 10-20%
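The last three summary lines of the output can be recomputed by hand. The sketch below assumes the standard Mantel-Haenszel risk ratio. The crude counts are entered separately because the stratum totals add up to slightly fewer records than the crude table.

```python
def risk_ratio(a, n1, c, n0):
    """a/n1 = risk in exposed, c/n0 = risk in unexposed."""
    return (a / n1) / (c / n0)

def mh_risk_ratio(strata):
    """Mantel-Haenszel RR over strata of (a, n1, c, n0)."""
    num = sum(a * n0 / (n1 + n0) for a, n1, c, n0 in strata)
    den = sum(c * n1 / (n1 + n0) for a, n1, c, n0 in strata)
    return num / den

# pesto exposed: cases, total; pesto unexposed: cases, total
crude = risk_ratio(45, 79, 58, 212)
adjusted = mh_risk_ratio([
    (43, 56, 51, 65),  # pasta = exposed
    (1, 20, 6, 145),   # pasta = unexposed
])
relative_change = (adjusted - crude) / crude * 100

print(f"Crude RR {crude:.2f}, MH RR {adjusted:.2f}, "
      f"change {relative_change:.2f} %")
```

A change of this size is far beyond the 10-20% rule of thumb, so pasta is treated as a confounder of the pesto association.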
Examples of stratified analysis
Example   Stratum 1   Stratum 2   Crude RR
1         4.00        4.00        4.00
2         1.01        1.03        4.00
3         3.05        5.20        4.00
4         1.02        1.86        4.00
5         1.07        9.40        4.00
Effect modifier
Belongs to nature
Different effects in different strata
Simple
Useful
Increases knowledge of biological mechanism
Allows targeting of PH action
Confounding factor
Belongs to study
Weighted RR different from crude RR
Distortion of effect
Creates confusion in data
Prevent (protocol)
Control (analysis)
Analyzing a third factor
Examine crude OR / RR
Examine ORs / RRs in each stratum
Identical ORs / RRs across strata
Different ORs / RRs across strata
Strata ORs / RRs similar to crude
(Crude value falls between strata)
Strata ORs / RRs different from crude
(Crude value does not fall between strata)
Effect modification
Third factor does not play a role
Confounding factor
Stop the analysis.
DO NOT adjust!
Report ONE crude OR/RR
Adjust using the
M-H technique
Report MULTIPLE ORs / RRs
for each stratum
Eliminate the confounding
Report ONE adjusted OR / RR
How to conduct a stratified analysis
Perform crude analysis
Measure the strength of association
List potential effect modifiers and confounders
Stratify data according to
potential modifiers or confounders
Check for effect modification
If effect modification present, show the data by stratum
If no effect modification present, check for confounding
If confounding, show adjusted data
If no confounding, show crude data
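The decision steps above can be sketched as a small helper. The 0.05 significance level and the 10% relative-change cut-off are assumptions chosen for illustration (the slides give a 10-20% range), and the function name is hypothetical.

```python
def interpret_third_variable(crude, adjusted, homogeneity_p,
                             alpha=0.05, change_threshold=0.10):
    """Return which estimate to report after stratifying on a third variable."""
    # Step 1: effect modification takes priority over confounding
    if homogeneity_p < alpha:
        return "effect modification: report stratum-specific estimates"
    # Step 2: compare crude and Mantel-Haenszel adjusted estimates
    relative_change = abs(adjusted - crude) / crude
    if relative_change > change_threshold:
        return "confounding: report adjusted estimate"
    return "no confounding: report crude estimate"

# Pesto adjusted for pasta (values from the Stata output in this deck)
pesto = interpret_third_variable(crude=2.08, adjusted=0.99, homogeneity_p=0.8366)
# Tuna stratified by pasta: homogeneity test p = 0.0007, so adjustment is moot
tuna = interpret_third_variable(crude=2.9, adjusted=2.9, homogeneity_p=0.0007)
print(pesto)
print(tuna)
```

A statistical rule like this only supports the judgement; whether stratum estimates look meaningfully different still needs to be assessed by eye.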
How to define the strata?
• Strata defined according to third variable:
– ‘Usual’ confounders
(e.g. age, sex, socio-economic status)
– Any other suspected confounder,
effect modifier or additional risk factor
– Stratum of public health interest
• For two risk factors:
– stratify on one to study the effect of the second
on outcome
• Two or more exposure categories:
– each is a stratum
• Residual confounding ?
Logical order of data analysis
How to deal with multiple risk factors:
Crude analysis
Multivariable analysis
1. stratified analysis
2. modelling
linear regression
logistic regression
Multivariate analysis
• Mathematical model
• Simultaneous adjustment of all
confounding and risk factors
• Can address effect modification
A train can mask a second train
A variable can mask another variable
Back-up slides
Risk factors for Salmonella
enteritidis infections, France, 1995
Delarocque-Astagneau et al. Epidemiol Infect 1998;121:561-7
Cases of Salmonella enteritidis gastroenteritis according to egg storage and season
Season          Duration of storage   Cases   Controls   OR (95%CI)
Summer          >= 2 weeks            12      2          7.4 (1.5-69.9)
                < 2 weeks             52      64
Other seasons   >= 2 weeks            7       3          2.6 (0.5-16.8)
                < 2 weeks             32      36
All seasons     >= 2 weeks            19      5          4.5 (1.5-16.1)
                < 2 weeks             84      100
Duration
of storage
Salmonellosis
Season
Cases of Salmonella enteritidis gastroenteritis according to egg storage and season
Summer (A)   “Long” storage (B)   Cases   Control   OR
Yes          Yes                  12      2         ORAB 6.8
Yes          No                   52      64        ORA 0.9
No           Yes                  7       3         ORB 2.6
No           No                   32      36        Ref
Advantages & Disadvantages of
Stratified Analysis
• Advantages
– straightforward to implement and comprehend
– easy way to evaluate interaction
• Disadvantages
– only one exposure-disease association at a time
– requires continuous variables to be grouped
• Loss of information; possible “residual confounding”
– deteriorates with multiple confounders
• e.g. suppose 4 confounders with 3 levels
– 3x3x3x3=81 strata needed
– unless the sample is huge, many cells contain “0” and many strata
have undefined effect measures
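The growth in the number of strata is easy to see by enumerating the combinations. In this sketch the sample size of 500 is purely hypothetical, chosen to show how thin the data become.

```python
from itertools import product

levels_per_confounder = [3, 3, 3, 3]  # 4 confounders with 3 levels each

# Every combination of confounder levels is one stratum
strata = list(product(*(range(n) for n in levels_per_confounder)))
n_strata = len(strata)

# Each stratum holds a 2x2 table (exposure by outcome), so 4 cells
n_cells = n_strata * 4

hypothetical_sample_size = 500  # assumption for illustration only
mean_per_cell = hypothetical_sample_size / n_cells

print(f"{n_strata} strata, {n_cells} cells, "
      f"about {mean_per_cell:.1f} subjects per cell on average")
```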