Strengthening Causal Inference in HIV Studies: Introduction and Practical Examples
CAPS Methods Core Presentation, April 18, 2012
Starley Shade, Sheri Lippman, Mi-Suk Kang Dufour & Carol Camlin
+
Outline
• Answering causal questions: common roadblocks in HIV research
• Causal inference framework and overview of methods
• Concrete example: using treatment and censoring weighting in Prevention with Positives
• Concrete example: G-computation for population-level attributable risk in the SHAZ! study
• Q&A
+
Roadblocks in HIV research: selection bias / who gets exposed
• Population surveillance and surveys in probability-based samples
  • Study participants (in testing, in survey research, etc.) almost always differ systematically from nonparticipants
• Observational studies
  • Using ‘comparison’ clinics or communities: systematic differences between study arms exist and/or may accrue over time
+
Common roadblocks in HIV research: loss to follow-up
• Cohort studies of HIV+ individuals: highly susceptible to loss to follow-up
  • >20% after 2 years, in resource-poor settings: medical records don’t capture patient mobility
  • Death registries are rarely available, and those who die are mistakenly assumed to be lost to follow-up
  • Those who drop out are systematically different from those who stay engaged in care
+
Roadblocks in HIV research: time-dependent confounding
[Diagram: causal graph over time points 0 to 3 linking C(&U)0 to C(&U)3, Expos0 to Expos3, and STI0 to STI3]
C = group of confounders
U = unmeasured confounders
Time-dependent confounding: if C is related to prior exposure and affects subsequent exposure
+
Common roadblocks in HIV research: complex, multicomponent intervention studies
• Increasing calls for comprehensive HIV prevention interventions addressing multiple levels and domains of influence on individual behavior
• Evaluation of such studies hampered by:
  • Diverse levels of exposure to individual intervention components
  • Difficult to distinguish relative contributions of individual intervention components to observed outcomes
+
Mending our comparison: the causal/counterfactual framework
• “We may define a cause to be an object followed by another… where, if the first object had not been, the second never had existed” (Hume 1748)
• An association can be considered causal when, if the exposure had been altered, the outcome would have been different
• Key part is the counterfactual element: reference to what would have happened if, contrary to fact, the exposure had been something other than what it actually was
+
Counterfactual framework
• “Ideal experiment” illustrates the framework
  • A hypothetical study which, if we could actually conduct it, would allow us to infer causality
• Ideal experiment:
  • Person or population experiences one exposure and is observed for outcome over a given time period
  • Roll back the clock
  • Change the exposure but leave everything else the same, observe for outcome over the same time period
+
Counterfactual framework
[Figure: timeline labeled OBSERVED, with AIDS marked along a Time axis]
Counterfactual question: how long would Person A have survived if he/she had not received treatment?
+
Counterfactual framework
[Figure: two timelines along a Time axis, one labeled OBSERVED and one labeled UNOBSERVED, each with AIDS marked]
+
Counterfactuals: specifying what we really want to know
• Thinking about the counterfactual outcome(s) as something we are missing and something we are trying to estimate when we analyze HIV studies or any epidemiologic data is instructive
  • Akin to a missing data problem
• When we compare groups of people observed as exposed or unexposed, we want to compare groups that best estimate the counterfactual outcomes that are unobserved or missing
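To make the missing-data view concrete, here is a minimal Python sketch. The four people and all of their outcomes are invented for illustration and are not from the studies in this presentation. Each person has two potential outcomes, Y(1) and Y(0), but only the one matching the exposure actually received is observed, so a naive comparison of observed groups need not equal the true average effect.

```python
# Hypothetical people: (id, exposure A, Y if exposed, Y if unexposed).
# In real data only one of the two potential outcomes is ever seen.
people = [
    ("p1", 1, 1, 1),
    ("p2", 1, 0, 1),
    ("p3", 0, 0, 0),
    ("p4", 0, 1, 1),
]

def observed_rows(rows):
    """Mask the potential outcome that contradicts the observed exposure."""
    out = []
    for pid, a, y1, y0 in rows:
        out.append({
            "id": pid,
            "A": a,
            "Y1": y1 if a == 1 else None,  # missing for the unexposed
            "Y0": y0 if a == 0 else None,  # missing for the exposed
        })
    return out

table = observed_rows(people)
n_missing = sum(r["Y1"] is None for r in table) + sum(r["Y0"] is None for r in table)

# True average effect: needs both potential outcomes for everyone.
true_effect = sum(y1 - y0 for _, _, y1, y0 in people) / len(people)

# Naive contrast: uses only what was observed in each exposure group.
exposed = [r["Y1"] for r in table if r["A"] == 1]
unexposed = [r["Y0"] for r in table if r["A"] == 0]
naive = sum(exposed) / len(exposed) - sum(unexposed) / len(unexposed)
```

In this toy data every person is missing exactly one potential outcome, and the naive contrast differs from the true average effect because the exposed and unexposed groups are not exchangeable.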
+
Notation for presentation
• A = treatment
• Y = outcome
• W = confounders (point treatment)
• L = confounders (longitudinal)
• The likelihood of the data simplifies to:
  L(O) = P(Y|A,W,L)P(A|W,L)
[Diagram: nodes A, (W, L), and Y]
+
Rationale for causal inference approach
• Basic regression models produce stratum-specific, or conditional, estimates (i.e., “while holding constant a set of covariates”)
$$E[Y \mid A, L] = b_0 + b_1 A + b_3 L(j) + \dots$$
where Y is the outcome, A is the observed exposure, and L is the matrix of time-dependent covariates
• Therefore, our estimates of effect are also conditional
$$E[Y \mid A = 1, L] - E[Y \mid A = 0, L] = b_1$$
+
Rationale for causal inference approach
• Causal inference approaches help us model our way back to the ideal (counterfactual) experiment
$$E[Y(a = 1) - Y(a = 0)]$$
where Y(a) is the outcome under the counterfactual in which all individuals are exposed (a = 1) or unexposed (a = 0)
+
Inverse Probability Weighting
+
Inverse Probability of Treatment Weighting (IPTW)
• Re-create the counterfactual data set by weighting
• IPTW assigns each subject a weight equal to the inverse probability of being in their exposure group at each interval (shown here for exposed subjects, A(j) = 1)
$$w_t = 1 / P[A(j) = 1 \mid \bar{A}(j-1), \bar{L}(j)]$$
• The treatment model is based on values of past and current covariates (L(j)) and past exposures (A(j-1))
$$E[A(j) \mid \bar{A}(j-1), \bar{L}(j)] = a_0 + a_2 L(j) + a_3 A(j-1) + a_4 L(j-1) + \dots$$
+
Inverse Probability of Treatment Weighting (IPTW)
• The treatment weights are applied to the observed population (e.g. weighted logistic regression)
$$w_t \, [E(Y \mid A)] = b_0 + b_1 A$$
• Creates a new pseudo-population in which the distribution of confounders is balanced between the two exposure groups, essentially mimicking a randomized trial
$$E[Y(a = 1) - Y(a = 0)] = b_1$$
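The weighting idea can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the analysis used in the presentation: it handles a single point treatment rather than repeated intervals, uses one binary confounder L, and estimates P(A = 1 | L) as a simple stratum proportion instead of fitting a regression model. The data and function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical records: (L, A, Y) with binary confounder L, exposure A, outcome Y.
data = (
    [(0, 1, 1), (0, 1, 0)] + [(0, 0, 0)] * 6
    + [(1, 1, 1)] * 6 + [(1, 0, 1), (1, 0, 0)]
)

def crude_risk_difference(rows):
    """Unweighted difference in mean outcome between exposure groups."""
    y1 = [y for _, a, y in rows if a == 1]
    y0 = [y for _, a, y in rows if a == 0]
    return sum(y1) / len(y1) - sum(y0) / len(y0)

def iptw_risk_difference(rows):
    """Weighted risk difference in the IPTW pseudo-population."""
    # Treatment model: P(A=1 | L), estimated here as a stratum proportion.
    n_by_l = defaultdict(int)
    treated_by_l = defaultdict(int)
    for l, a, _ in rows:
        n_by_l[l] += 1
        treated_by_l[l] += a
    p_treated = {l: treated_by_l[l] / n_by_l[l] for l in n_by_l}

    # Weight each subject by 1 / P(A = their observed exposure | L).
    num = {0: 0.0, 1: 0.0}
    den = {0: 0.0, 1: 0.0}
    for l, a, y in rows:
        p_obs = p_treated[l] if a == 1 else 1 - p_treated[l]
        w = 1 / p_obs
        num[a] += w * y
        den[a] += w
    return num[1] / den[1] - num[0] / den[0]

crude = crude_risk_difference(data)
weighted = iptw_risk_difference(data)
```

In this invented data set exposure is more common when L = 1 and L also raises the outcome risk, so the crude contrast overstates the effect. After weighting, L is balanced across exposure groups and the contrast matches the stratum-specific risk difference built into the data.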
+
Inverse Probability of Censoring Weighting (IPCW)
• Like IPTW, IPCW assigns a weight equal to the inverse probability of remaining in the study at each interval, based on values of observed covariates and past outcomes and exposures
$$w_c = 1 / P[C = 1 \mid \bar{A}(j), \bar{L}(j)]$$
• The censoring weights are applied to the observed population, creating a new pseudo-population in which censored subjects are “replaced” by up-weighting uncensored subjects with the same values of past exposures and covariates
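A minimal Python sketch of censoring weights follows, assuming a single follow-up interval, one binary covariate L, and retention probabilities estimated as stratum proportions rather than from a fitted model. The records are invented for illustration. With several intervals, the per-interval weights would be multiplied together for each subject.

```python
from collections import defaultdict

# Hypothetical records: (L, retained, Y). Y is None when the subject was censored.
rows = (
    [(0, 1, 1)] + [(0, 1, 0)] * 4 + [(0, 0, None)] * 5
    + [(1, 1, 1)] * 6 + [(1, 1, 0)] * 4
)

def complete_case_mean(records):
    """Mean outcome among retained subjects, ignoring who dropped out."""
    ys = [y for _, retained, y in records if retained]
    return sum(ys) / len(ys)

def ipcw_mean(records):
    """Mean outcome with retained subjects up-weighted by 1 / P(retained | L)."""
    n_by_l = defaultdict(int)
    kept_by_l = defaultdict(int)
    for l, retained, _ in records:
        n_by_l[l] += 1
        kept_by_l[l] += retained
    p_retained = {l: kept_by_l[l] / n_by_l[l] for l in n_by_l}

    num = den = 0.0
    for l, retained, y in records:
        if retained:
            w = 1 / p_retained[l]  # stands in for censored subjects with the same L
            num += w * y
            den += w
    return num / den

unweighted = complete_case_mean(rows)
weighted = ipcw_mean(rows)
```

Here half of the L = 0 stratum is lost, so each retained L = 0 subject counts twice. The weighted mean recovers the stratum mix of the full cohort, provided dropout depends only on L (an assumption of this sketch).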
+
Example: Prevention with Positives Demonstration Projects
• Fifteen HRSA-funded demonstration projects implemented prevention with positives in clinical settings
• Each site decided whether to randomize patients to:
  • Provider-delivered intervention vs. assessment
  • Specialist-delivered intervention vs. assessment
  • Mixed intervention vs. provider intervention
• How do we assess the effectiveness of each intervention type?
+
Example: Prevention with Positives
Patient characteristics, n (%)
Characteristic | Standard of care | Provider | Specialist | Mixed | p<
Male | 781 (74) | 490 (64) | 705 (72) | 530 (72) | .001
White | 410 (39) | 282 (37) | 332 (25) | 298 (22) | .001
Heterosexual | 453 (43) | 371 (48) | 478 (49) | 297 (39) | .001
Age 40 or more | 720 (68) | 423 (55) | 704 (72) | 431 (57) | .001
Education (less than HS) | 540 (51) | 377 (49) | 524 (54) | 371 (49) | ns
Employed | 411 (39) | 355 (46) | 324 (33) | 279 (37) | .001
CD4 < 200 | 152 (14) | 109 (14) | 154 (16) | 120 (16) | ns
VL < 75 | 381 (36) | 216 (28) | 418 (43) | 219 (29) | .001
+
Example: Prevention with Positives
Retention
• At the 12-month follow-up assessment,
  • 58% of patients were retained in the standard of care group;
  • 76% of patients were retained in the provider intervention sites;
  • 62% were retained in the specialist sites; and
  • 44% in the mixed intervention sites.
• There were differences in retention by patient characteristics.
  • Older, white, gay males with more than a high school education who did not use cocaine or injection drugs were more likely to be retained in the study at 12 months.
+
Example: Prevention with Positives
Risk Behavior
[Figure: risk behavior (y-axis 0% to 30%) at baseline, 6 months, and 12 months for the Provider-led, Specialist-led, Mixed, and Assessment groups]
+
Example: Prevention with Positives
Analysis
• Inverse probability of treatment weights
$$E[A \mid L] = a_0 + a_1(\text{male}) + a_2(\text{white}) + a_3(\text{gay}) + \dots$$
$$w_t = 1 / P(A \mid L)$$
+
Example: Prevention with Positives
Analysis
• Inverse probability of censoring weights
$$E[C(j) = 1 \mid A, L] = c + c(\text{provider}) + c(\text{specialist}) + \dots + c(\text{male}) + c(\text{white}) + c(\text{gay}) + \dots$$
$$w_c = 1 / P[C(j) \mid A, L] \times 1 / P[C(j-1) \mid A, L] \times \dots$$
• Weighted logistic regression
$$w_t \times w_c \times \text{logit}[E(Y \mid A)] = b_0 + b_1(\text{provider}) + b_2(\text{specialist}) + b_3(\text{mixed})$$
+
Example: Prevention with Positives
Results
Intervention type | 6 months OR (95% CI) | 12 months OR (95% CI)
Provider-delivered | 0.93 (0.60, 1.20) | 0.55 (0.32, 0.94)
Specialist-delivered | 0.58 (0.35, 0.96) | 0.67 (0.39, 1.14)
Mixed | 0.89 (0.53, 1.51) | 0.89 (0.53, 1.51)
Assessment only | Reference | Reference
+
G-computation and Population Intervention Models
+
G-computation
• Sometimes called the substitution estimation approach
• The G-computation approach is to model the exposure and outcome relationship and then “control” exposure in the population by substituting counterfactual exposures in your model
• Population intervention models use this approach to answer practical questions
+
Population Intervention Models
Standard regression models give a conditional estimate:
$$E(Y \mid A = 1, W = w) - E(Y \mid A = 0, W = w)$$
Marginal structural models allow a total effect estimate:
$$E_W(Y_1) - E_W(Y_0)$$
For interventions, what we care about is the population difference when the intervention is present or absent:
$$E_W(Y_a) - E_W(Y)$$
+
Analogous to Attributable Risk
• Traditional population attributable risk or attributable fraction:
  • The proportion of the disease risk in the total population associated with the exposure
$$\frac{\text{Incidence}_{\text{exposed}} - \text{Incidence}_{\text{unexposed}}}{\text{Incidence}_{\text{exposed}}} \times \text{proportion exposed} \times 100$$
This assumes the exposure causes the outcome and that there are no other causes, i.e. in the absence of that exposure there would be no outcome
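As a worked arithmetic example of the attributable fraction formula, the Python sketch below plugs in invented numbers (incidence of 20% in the exposed, 10% in the unexposed, 30% exposed). The function and parameter names are hypothetical, and `proportion_exposed` simply follows the slide's wording.

```python
def attributable_fraction_percent(incidence_exposed, incidence_unexposed, proportion_exposed):
    """(I_exposed - I_unexposed) / I_exposed x proportion exposed x 100."""
    excess_fraction = (incidence_exposed - incidence_unexposed) / incidence_exposed
    return excess_fraction * proportion_exposed * 100

# Hypothetical inputs: half of the risk among the exposed is excess risk,
# and 30% are exposed, giving roughly 15 percent.
example = attributable_fraction_percent(0.20, 0.10, 0.30)
```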
+
Why PIMS?
• Rarely looking at outcomes with only one important predictor/confounder
  • PIMS allow assessment of effect averaged across covariates
• Rarely able to completely eliminate a risk factor from population
  • PIMS allow estimation for realistic interventions
+
Population Intervention Models: estimation
1) Estimate outcome model
2) Create new dataset setting covariate(s) of interest to intervention levels
3) Predict outcome of interest using model estimated in step 1
4) Calculate the difference between predicted mean outcome and observed mean outcome
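The four steps above can be sketched in Python as follows. This is an illustration under simplifying assumptions, not the SHAZ! analysis: the outcome model is a saturated table of stratum means E[Y | A, W] rather than a fitted regression, there is one binary covariate W and one binary risk factor A, and the data are invented.

```python
from collections import defaultdict

# Hypothetical records: (W, A, Y) with covariate W, risk factor A, outcome Y.
data = (
    [(0, 1, 1), (0, 1, 0)] + [(0, 0, 0)] * 6
    + [(1, 1, 1)] * 6 + [(1, 0, 1), (1, 0, 0)]
)

def fit_outcome_model(rows):
    """Step 1: estimate E[Y | A, W] as the mean outcome in each (A, W) cell."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for w, a, y in rows:
        sums[(a, w)] += y
        counts[(a, w)] += 1
    return {cell: sums[cell] / counts[cell] for cell in counts}

def population_intervention_effect(rows, a_star):
    model = fit_outcome_model(rows)
    # Step 2: copy the data with A set to the intervention level for everyone.
    intervened = [(w, a_star) for w, _, _ in rows]
    # Step 3: predict each person's outcome under the intervention.
    predicted = [model[(a, w)] for w, a in intervened]
    # Step 4: predicted mean outcome minus observed mean outcome.
    observed_mean = sum(y for _, _, y in rows) / len(rows)
    return sum(predicted) / len(predicted) - observed_mean

# Effect on the population mean of removing the risk factor (A set to 0).
effect = population_intervention_effect(data, a_star=0)
```

A negative value means the predicted population mean outcome is lower under the hypothetical intervention than what was observed. With a regression outcome model, step 1 would be a model fit and step 3 a call to its prediction routine; the table of cell means is used here only to keep the sketch dependency-free.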
+
Example: SHAZ! study
• SHAZ! (Shaping the Health of Adolescents in Zimbabwe)
• Enrolled adolescent orphan girls ages 16 to 19
• Overall project was designed as an HIV prevention intervention based on provision of reproductive health services, economic livelihoods training and life-skills education
+
Example: SHAZ! study
• Using baseline data to look at a secondary outcome
• Interested in the potential of interventions to improve mental health for adolescent orphan girls
• Several structural factors considered as potentially modifiable with intervention
[Diagram: conceptual model of factors related to baseline mental health status (SSQ)]
• Orphaning: age at orphaning
• Socioeconomic status: food security, ability to pay for medication, ever homeless, changes in household, completed education
• Social environment: female caregiver relationship, social support, exposure to violence, feeling safe at home, caring for ill person
• Psychological distress (unmeasured)
• Poor physical health: general health status, viral infection
• Baseline self-efficacy
+
PIMS Question:
What is the potential impact of intervening on these factors on
this population’s mental health status?
+
Domain/variable | Prevalence in population, N | % | Hypothesized intervention level
Social environment | | |
Physical violence | 18 | 4.7% | no experience of physical violence
Sexual violence | 29 | 7.6% | no experience of sexual violence
Forced sex | 28 | 7.3% | no experience of forced sex
Unsafe home environment | 241 | 62.9% | home environment considered very safe
Household experience of violence | 34 | 8.9% | no one in the house experiencing violence
Caring for ill | 115 | 30.0% | not caring for someone ill in the household
Low social support | 231 | 60.3% | "enough" people you can count on
Absence of supportive female caregiver | 116 | 30.3% | presence of a female caregiver who is "often" or "always" supportive
Socioeconomic status | | |
Food security | 132 | 34.5% | never going to bed hungry or not eating because there is no food
Unable to buy medicine | 235 | 61.4% | able to buy needed medicine within 2 days
Changes in household location | 197 | 51.4% | no changes in household location within the past 5 years
Ever homeless | 86 | 22.5% | never homeless
Less than form 4 education | 99 | 25.8% | at least form 4 (secondary) education
Low baseline self efficacy | 335 | 87.5% | average response of "agree/strongly agree" with positive statements, "disagree/strongly disagree" with negative statements
Poor physical health | | |
Less than excellent health | 278 | 72.6% | excellent self reported health
Viral infection HIV/HSV-2 | 42 | 11.0% | no viral infection with HIV or HSV-2
+
Traditional regression results
Conditional effects parameter (standard regression), dichotomized
Domain/variable | OR
Social environment |
Physical violence | 3.67
Sexual violence | 0.61
Forced sex | 2.99
Unsafe home environment | 1.50
Household experience of violence | 1.85
Caring for ill | 5.19
Low social support | 1.64
Absence of supportive female caregiver | 2.57
Socioeconomic status |
Food security | 0.88
Unable to buy medicine | 1.30
Changes in household location | 1.11
Ever homeless | 2.40
Less than form 4 education | 1.38
Low baseline self efficacy | 4.84
Poor physical health |
Less than excellent health | 2.67
Viral infection HIV/HSV-2 | 2.57
+
Potential Impact of Interventions
Domain/variable | Prevalence in population, N | % | Population intervention parameter
Social environment | | |
Physical violence | 18 | 4.7% | -1.1%
Sexual violence | 29 | 7.6% | 0.0%
Forced sex | 28 | 7.3% | -0.7%
Unsafe home environment | 241 | 62.9% | -3.5%
Household experience of violence | 34 | 8.9% | -1.1%
Caring for ill | 115 | 30.0% | -5.8%
Low social support | 231 | 60.3% | -4.4%
Absence of supportive female caregiver | 116 | 30.3% | -3.9%
Socioeconomic status | | |
Food security | 132 | 34.5% | 0.4%
Unable to buy medicine | 235 | 61.4% | -2.7%
Changes in household location | 197 | 51.4% | -0.9%
Ever homeless | 86 | 22.5% | -2.8%
Less than form 4 education | 99 | 25.8% | -0.5%
Low baseline self efficacy | 335 | 87.5% | -9.2%
Poor physical health | | |
Less than excellent health | 278 | 72.6% | -7.4%
Viral infection HIV/HSV-2 | 42 | 11.0% | -1.3%
+
Extension of this approach to longitudinal context:
[Diagram: timeline of covariates, mental health, and intervention participation]
• Baseline: baseline covariates; baseline mental health
• Intervention participation: life-skills, Red Cross
• 6 months: 6 month covariates; mental health at 6 months
• Intervention participation: start vocational training
• 12 months: 12 month covariates; mental health at 12 months
• Intervention participation: finish vocational training
• 18 months: 18 month covariates; mental health at 18 months
• Intervention participation: receive grant
• 24 months: mental health at 24 months
+
Question:
Does poor mental health status affect participation in the
intervention over time?
+
Analytic approach
Interested in the effect of exposure (A) on outcome (Y) given covariates and past exposure and outcome
$$E_W[E_0(Y \mid A = 1, W) - E_0(Y \mid A = 0, W)]$$
where W includes past exposure and outcome and other covariates
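The parameter above averages the covariate-specific contrast over the observed distribution of W. The Python sketch below computes it with a plain substitution estimator on invented data. It is not the tmle(D/S/A) estimation used for the results in this presentation, and it assumes a single discrete W with every (A, W) cell observed.

```python
from collections import defaultdict

# Hypothetical records: (W, A, Y). W stands in for covariates plus past
# exposure and outcome, collapsed here to one binary stratum for brevity.
rows = (
    [(0, 1, 1)] + [(0, 1, 0)] * 3 + [(0, 0, 1)] * 2 + [(0, 0, 0)] * 2
    + [(1, 1, 1), (1, 1, 0)] + [(1, 0, 1)] * 3 + [(1, 0, 0)] * 3
)

def cell_means(records):
    """Estimate E[Y | A, W] as the mean outcome in each (A, W) cell."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for w, a, y in records:
        sums[(a, w)] += y
        counts[(a, w)] += 1
    return {cell: sums[cell] / counts[cell] for cell in counts}

def average_treatment_effect(records):
    """Average E[Y | A=1, W] - E[Y | A=0, W] over everyone's observed W."""
    m = cell_means(records)
    contrasts = [m[(1, w)] - m[(0, w)] for w, _, _ in records]
    return sum(contrasts) / len(contrasts)

ate = average_treatment_effect(rows)
```

In this toy data the contrast is -0.25 in the W = 0 stratum and 0 in the W = 1 stratum, and the two strata are equally common, so the averaged effect sits halfway between them.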
+
Analytic approach cont.
Fit a series of point treatment models for outcomes at
timepoints following exposure(s) of interest
+
Example 1:
[Diagram: point treatment model for the first interval]
• W = baseline covariates
• A = baseline mental health
• Y = intervention participation: life-skills, Red Cross
• Shown without a role in this model: 6 month covariates, mental health at 6 months, intervention participation (start vocational training)
+
Example 2:
[Diagram: point treatment model for the second interval]
• W = baseline covariates, 6 month covariates, baseline mental health, intervention participation (life-skills, Red Cross)
• A = mental health at 6 months
• Y = intervention participation: start vocational training
+
Odds of Completion of Intervention Components by Symptomatic Status for Mental Health Distress at Baseline, Conditional on Completing Previous Intervention Components: Estimates from Logistic Regression
Intervention component | Sample size | OR (95% CI)
Lifeskills | 300 | 1.1 (0.35, 3.42)
Red Cross | 282 | 0.57 (0.30, 1.11)
Start vocational training | 114 | 1.30 (0.14, 12.14)
Completed vocational training | 114 | 0.63 (0.26, 1.54)
Received grant | 78 | 0.54 (0.05, 6.37)
+
Difference in Intervention Component Completion by Mental Health Distress Symptoms, Conditional on Completing Previous Intervention Components: Average Treatment Effects (ATE) using tmle(D/S/A) estimation
Columns: Lifeskills, Red Cross, Start vocational training, Completed vocational training (each with sample size and ATE (95% CI))
Rows: Symptomatic at baseline, at 6 months, at 12 months, at 18 months
Symptomatic at baseline:
Intervention component | Sample size | ATE (95% CI)
Lifeskills | 300 | 0.03 (-0.02, 0.08)
Red Cross | 282 | -0.23 (-0.41, -0.05)
Start vocational training | 119 | -0.01 (-0.16, 0.14)
Completed vocational training | 114 | -0.18 (-0.43, 0.07)
Symptomatic at 6, 12, and 18 months (remaining cells, in source order):
Sample size | ATE (95% CI)
118 | 0.05 (0.02, 0.10)
113 | 0.04 (-0.19, 0.26)
110 | -0.01 (-0.28, 0.26)
Bold numbers indicate parameters statistically significant at p<0.05
+
Assumptions and Limitations
+
Assumptions
• No unmeasured confounding
  • There is no way to empirically test for no unmeasured confounding; collection of data on a complete set of covariates should be incorporated in the design phase
• Time-ordering (temporality)
  • Need to be certain that the covariates measured were prior to treatment if used in treatment weights, and that treatment is prior to outcome
• Experimental Treatment Assignment (ETA) or positivity
  • Groups defined by all possible combinations of covariates must have the potential to be in any (either) treatment group
  • If there are covariate groups that will only be observed in one treatment state, then we cannot estimate the effect of the exposure within that group
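A quick empirical check for positivity problems is to list the covariate groups that appear in only one treatment state. The Python sketch below does this for discrete covariates. The function name, the covariate labels, and the records are hypothetical, and a group passing this check in the sample does not by itself establish positivity in the population.

```python
from collections import defaultdict

def positivity_violations(rows, treatment_levels=(0, 1)):
    """Return covariate groups not observed in every treatment level.

    rows: iterable of (covariate_group, treatment) pairs.
    """
    seen = defaultdict(set)
    for group, a in rows:
        seen[group].add(a)
    required = set(treatment_levels)
    return sorted(g for g, levels in seen.items() if not required <= levels)

# Hypothetical records: covariate group is (sex, age band); treatment is 0/1.
rows = [
    (("male", "40+"), 1),
    (("male", "40+"), 0),
    (("female", "<40"), 1),
    (("female", "<40"), 1),
]
violations = positivity_violations(rows)
```

Here the second group is only ever treated, so the effect of the exposure cannot be estimated within it without extrapolating from a model.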
+
Acknowledgements
Thanks to:
• Alan Hubbard, UCB
• Mark van der Laan, UCB
• Jennifer Ahern, UCB