Brian C. Sauer, PhD MS
SLC VA Career Development Awardee
• PhD in Pharmacoepidemiology from College of
Pharmacy at University of Florida
• MS in Biomedical Informatics from University of Utah
• Assistant Research Professor in Division of
Epidemiology, Department of Internal
Medicine
• Mentors
– Matthew Samore, MD
– Tom Greene, PhD
– Jonathan Nebeker, MD
• Simulation
– Chen Wang, MS (Statistics)
• Primary References:
– Causal Inference Book: Jamie Robins & Miguel Hernán
• http://www.hsph.harvard.edu/faculty/miguel-hernan/causalinference-book/
– Modern Epidemiology, 3rd Ed., Chapter 12. Rothman, Greenland, Lash
• Causal Inference and the counterfactual framework
• Exchangeability & conditional exchangeability
• Use of Directed Acyclic Graphs (DAGs) to identify a minimal set of covariates to remove confounding.
• Understand the rationale for
– randomized controlled trials.
– covariate selection in observational research.
• Identify the minimal set of covariates needed to produce unbiased effect estimates.
• Develop terminology and language to describe these ideas with precision.
• Become familiar with notation for causal inference, which is a barrier to this literature.
• Neyman (1923)
– Effects of point exposures in randomized experiments
• Rubin (1974)
– Effects of point exposures in randomized and observational studies (potential outcomes and
Rubin Causal Framework)
• Robins (1986)
– Effects of time-varying exposures in randomized and observational studies. (counterfactuals)
Working Example:
• Zeus took the heart pill; 5 days later he died
• Had he not taken the heart pill, he would still be alive on that 5th day
– that is, all other things being equal
• Did the pill cause Zeus’s death?
Working Example:
• Hera didn’t take the pill
– 5 days later she was alive
• Had she taken the pill, she would still be alive 5 days later
• Did the pill cause Hera’s survival?
• Newt Gingrich and
William R Forstchen
• Historical Fiction
– Imagines how the war would have ended had there been a Confederate victory at Gettysburg
• Y=1 if patient died, 0 otherwise
– Y_z = 1, Y_h = 0
• A=1 if patient treated, 0 otherwise
– A_z = 1, A_h = 0

Pat ID | A | Y
Zeus   | 1 | 1
Hera   | 0 | 0
Outcome under No Treatment
• Y^{a=0} = 1 if the subject would have died had he not taken the pill
– Y_z^{a=0} = 0, Y_h^{a=0} = 0
Outcome under Treatment
• Y^{a=1} = 1 if the subject would have died had he taken the pill
– Y_z^{a=1} = 1, Y_h^{a=1} = 0
Pat ID | A | Y^{a=0} | Y^{a=1}
Zeus   | 1 | 0       | 1
Hera   | 0 | 0       | 0
ID      | A | Y | Y^{a=0} | Y^{a=1}
Zeus    | 1 | 1 | ?       | 1
Hera    | 0 | 0 | 0       | ?
Apollo  | 1 | 0 | ?       | 0
Cyclope | 0 | 0 | 0       | ?
Formal definition of causal effects:
– For Zeus: the pill has a causal effect because Y_z^{a=1} ≠ Y_z^{a=0}
– For Hera: the pill does not have a causal effect because Y_h^{a=1} = Y_h^{a=0}
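This individual-level definition can be stated as a minimal Python sketch, using the Zeus/Hera potential outcomes from the slides above:

```python
# Individual causal effects from the working example's potential outcomes:
# Zeus would die only if treated; Hera survives either way.
potential = {
    "Zeus": {"Y_a0": 0, "Y_a1": 1},
    "Hera": {"Y_a0": 0, "Y_a1": 0},
}

def has_individual_effect(po):
    """Treatment has a causal effect for a subject iff Y^{a=1} != Y^{a=0}."""
    return po["Y_a1"] != po["Y_a0"]

for name, po in potential.items():
    print(name, has_individual_effect(po))  # Zeus True, Hera False
```

The pill caused Zeus's death (the two counterfactual outcomes differ) but did not cause Hera's survival (they agree).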
Formal Definition of Average Causal Effects
• In the population, exposure A has a causal effect on the outcome Y if
– Pr[Y^{a=1}=1] ≠ Pr[Y^{a=0}=1]
• The causal null hypothesis holds if
– Pr[Y^{a=1}=1] = Pr[Y^{a=0}=1]
– equivalently, E[Y^{a=1}] = E[Y^{a=0}] for dichotomous Y
Causal effects can be measured on many scales
• Risk difference: Pr[Y^{a=1}=1] − Pr[Y^{a=0}=1]
• Risk ratio: Pr[Y^{a=1}=1] ÷ Pr[Y^{a=0}=1]
• Odds ratio, hazard ratio, etc.
No average causal effect:
• Risk difference: Pr[Y^{a=1}=1] − Pr[Y^{a=0}=1] = 0
• Risk ratio: Pr[Y^{a=1}=1] ÷ Pr[Y^{a=0}=1] = 1
• Are there individual Causal Effects?
Pat ID     | Y^{a=0} | Y^{a=1}
Rheia      | 0       | 1
Kronos     | 1       | 0
Demeter    | 0       | 0
Hades      | 0       | 0
Hestia     | 0       | 0
Poseidon   | 1       | 0
Hera       | 0       | 0
Zeus       | 0       | 1
Artemis    | 1       | 1
Apollo     | 1       | 0
Leto       | 0       | 1
Ares       | 1       | 1
Athena     | 1       | 1
Hephaestus | 0       | 1
Aphrodite  | 0       | 1
Cyclope    | 0       | 1
Persephone | 1       | 1
Hermes     | 1       | 0
Hebe       | 1       | 0
Dionysus   | 1       | 0
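The average causal effect can be computed directly from such a counterfactual table. The values below follow Table 1.1 of Hernán & Robins' Causal Inference book, which this slide's table is drawn from:

```python
# Average causal effect from counterfactual outcomes (Y^{a=0}, Y^{a=1});
# values follow Table 1.1 of Hernán & Robins' Causal Inference book.
counterfactuals = {
    "Rheia": (0, 1), "Kronos": (1, 0), "Demeter": (0, 0), "Hades": (0, 0),
    "Hestia": (0, 0), "Poseidon": (1, 0), "Hera": (0, 0), "Zeus": (0, 1),
    "Artemis": (1, 1), "Apollo": (1, 0), "Leto": (0, 1), "Ares": (1, 1),
    "Athena": (1, 1), "Hephaestus": (0, 1), "Aphrodite": (0, 1),
    "Cyclope": (0, 1), "Persephone": (1, 1), "Hermes": (1, 0),
    "Hebe": (1, 0), "Dionysus": (1, 0),
}

n = len(counterfactuals)
risk_a0 = sum(y0 for y0, _ in counterfactuals.values()) / n  # Pr[Y^{a=0}=1]
risk_a1 = sum(y1 for _, y1 in counterfactuals.values()) / n  # Pr[Y^{a=1}=1]

risk_difference = risk_a1 - risk_a0  # 0.0 -> causal null holds on average
risk_ratio = risk_a1 / risk_a0       # 1.0 -> causal null holds on average

# Yet individual causal effects exist wherever Y^{a=1} != Y^{a=0}:
n_individual_effects = sum(y0 != y1 for y0, y1 in counterfactuals.values())
```

The risk difference is 0 and the risk ratio is 1 (no average causal effect), even though 12 of the 20 subjects have individual causal effects.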
Definition
• Causal effects are calculated by contrasting counterfactual risks within the population.
• Counterfactual contrasts are by definition causal effects.
Observed in the real world – association ≠ causation
• Pr[Y=1|A=1] = Pr[Y=1|A=0]: treatment A and outcome Y are independent
• We can also quantify the strength of association – risk difference, risk ratio, OR, HR, etc.
• Pr[Y=1|A=1] − Pr[Y=1|A=0] = 0
• Pr[Y=1|A=1] ÷ Pr[Y=1|A=0] = 1
The key conceptual difference:
• A causal effect is defined as a comparison of the same subjects under different actions
– Assumes the counterfactual approach
– Everyone simultaneously treated and untreated
– Marginal effects
• Association is defined as a comparison of different subjects under different conditions
– Effects conditional on treatment assignment group
Question:
• Under what conditions can associational measures be used to estimate causal effects?
Answer:
• Ideal randomized experiments
• Randomization still leaves each subject’s counterfactual outcome under the other treatment missing
• The missing counterfactuals are missing completely at random (MCAR)
• Because of this, causal effects can be consistently estimated with ideal RCTs despite the missing data
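A small Python simulation sketch of this point (all numbers invented): each subject carries both potential outcomes, a coin flip decides which one is observed, and the associational risks recover the counterfactual risks.

```python
# Simulation sketch of an ideal RCT: each subject carries both potential
# outcomes, but randomization decides which one we observe, so the
# unobserved counterfactual is missing completely at random (MCAR).
import random

random.seed(7)
n = 200_000
treated_deaths = untreated_deaths = n_treated = 0

for _ in range(n):
    y_a0 = 1 if random.random() < 0.5 else 0  # Pr[Y^{a=0}=1] = 0.5
    y_a1 = 1 if random.random() < 0.5 else 0  # Pr[Y^{a=1}=1] = 0.5 (causal null)
    a = random.randint(0, 1)                  # randomized treatment
    y = y_a1 if a == 1 else y_a0              # consistency: observed outcome
    if a == 1:
        n_treated += 1
        treated_deaths += y
    else:
        untreated_deaths += y

# Associational risks consistently estimate the counterfactual risks:
assoc_risk_treated = treated_deaths / n_treated            # ~ Pr[Y^{a=1}=1]
assoc_risk_untreated = untreated_deaths / (n - n_treated)  # ~ Pr[Y^{a=0}=1]
```

Both associational risks come out near 0.5, the true counterfactual risks, despite half of each subject's data being missing.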
Exchangeability:
• The risk under the potential treatment value a among the treated equals the risk under the potential treatment value a among the untreated
• Pr[Y^a=1|A=1] = Pr[Y^a=1|A=0]
• Because these conditional risks are equal in all subsets defined by treatment status, they must also equal the marginal risk under treatment value a in the whole population
• Under exchangeability, the counterfactual risk under treatment in the treated subset of the population equals the counterfactual risk under treatment in the entire population.
• A= heart transplant
• Y= death
• L=prognostic factor
• Counts
– 13 of 20 (65%) gods were treated
– 9 of the 12 (75%) with the prognostic factor (L=1) were treated
– 3 of the 7 (43%) not treated had the prognostic factor
Pat ID     | L | A | Y
Rheia      | 0 | 0 | 0
Kronos     | 0 | 0 | 1
Demeter    | 0 | 0 | 0
Hades      | 0 | 0 | 0
Hestia     | 0 | 1 | 0
Poseidon   | 0 | 1 | 0
Hera       | 0 | 1 | 0
Zeus       | 0 | 1 | 1
Artemis    | 1 | 0 | 1
Apollo     | 1 | 0 | 1
Leto       | 1 | 0 | 0
Ares       | 1 | 1 | 1
Athena     | 1 | 1 | 1
Eros       | 1 | 1 | 1
Aphrodite  | 1 | 1 | 1
Cyclope    | 1 | 1 | 1
Persephone | 1 | 1 | 1
Hermes     | 1 | 1 | 0
Hebe       | 1 | 1 | 0
Dionysus   | 1 | 1 | 0
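From this observed data the associational (crude) risks follow directly. The rows below follow Table 2.2 of Hernán & Robins, which this slide's table mirrors (with Eros listed where the book lists Hephaestus):

```python
# Crude (associational) risks from the heart-transplant example.
# Rows are (L, A, Y) triples following Table 2.2 of Hernán & Robins.
data = {
    "Rheia": (0, 0, 0), "Kronos": (0, 0, 1), "Demeter": (0, 0, 0),
    "Hades": (0, 0, 0), "Hestia": (0, 1, 0), "Poseidon": (0, 1, 0),
    "Hera": (0, 1, 0), "Zeus": (0, 1, 1), "Artemis": (1, 0, 1),
    "Apollo": (1, 0, 1), "Leto": (1, 0, 0), "Ares": (1, 1, 1),
    "Athena": (1, 1, 1), "Eros": (1, 1, 1), "Aphrodite": (1, 1, 1),
    "Cyclope": (1, 1, 1), "Persephone": (1, 1, 1), "Hermes": (1, 1, 0),
    "Hebe": (1, 1, 0), "Dionysus": (1, 1, 0),
}

treated = [y for (_, a, y) in data.values() if a == 1]
untreated = [y for (_, a, y) in data.values() if a == 0]

pr_y1_given_a1 = sum(treated) / len(treated)      # Pr[Y=1|A=1] = 7/13
pr_y1_given_a0 = sum(untreated) / len(untreated)  # Pr[Y=1|A=0] = 3/7
crude_rr = pr_y1_given_a1 / pr_y1_given_a0
```

The crude risk ratio is about 1.26 even though the causal risk ratio is 1; the gap is confounding by the prognostic factor L.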
• Design 1:
– 13 of 20 treated: Randomly selected 65% for treatment
• Design 2:
– 9 of 12 (75%) in critical condition were treated
– 4 of 8 (50%) not in critical condition were treated
• Simply a combination of two marginally randomized experiments:
one conducted in the subset of the population with L=0, the other in the subset with L=1
• Values are not MCAR, but they are MAR conditional on the covariate L
• Marginal exchangeability is not achieved
• Randomization generates conditional exchangeability
• Question 1: How do you typically analyze a marginally randomized experiment?
• Hint: dependent and independent variables?
• Answer: A crude or unadjusted analysis with treatment and outcome.
• Question 2: How do you typically analyze a conditionally randomized experiment?
• Answer 2:
– Robins recommends standardization and IPW
– Stratification-type methods are more common
– Conditions where standardization ≠ stratification are reviewed in their text.
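Both recommended analyses can be sketched in Python on the conditionally randomized heart-transplant data; the rows are (L, A, Y) triples following Table 2.2 of Hernán & Robins, which this example mirrors:

```python
# Standardization and inverse probability weighting (IPW) on the
# conditionally randomized heart-transplant data (L, A, Y triples
# following Table 2.2 of Hernán & Robins).
rows = [
    (0, 0, 0), (0, 0, 1), (0, 0, 0), (0, 0, 0),
    (0, 1, 0), (0, 1, 0), (0, 1, 0), (0, 1, 1),
    (1, 0, 1), (1, 0, 1), (1, 0, 0),
    (1, 1, 1), (1, 1, 1), (1, 1, 1), (1, 1, 1), (1, 1, 1), (1, 1, 1),
    (1, 1, 0), (1, 1, 0), (1, 1, 0),
]
n = len(rows)

def pr_y_given(a, l):
    """Pr[Y=1 | A=a, L=l] from the observed data."""
    ys = [y for (li, ai, y) in rows if ai == a and li == l]
    return sum(ys) / len(ys)

def pr_l(l):
    return sum(1 for (li, _, _) in rows if li == l) / n

# Standardization: Pr[Y^a=1] = sum_l Pr[Y=1|A=a,L=l] * Pr[L=l]
std_risk = {a: sum(pr_y_given(a, l) * pr_l(l) for l in (0, 1)) for a in (0, 1)}

def pr_a_given_l(a, l):
    """Pr[A=a | L=l]: the (conditional) treatment probability."""
    sub = [ai for (li, ai, _) in rows if li == l]
    return sum(1 for ai in sub if ai == a) / len(sub)

# IPW: weight each observed outcome by 1 / Pr[A = observed a | L]
ipw_risk = {}
for a in (0, 1):
    ipw_risk[a] = sum(y / pr_a_given_l(a, l)
                      for (l, ai, y) in rows if ai == a) / n
```

Standardization averages the stratum-specific risks over the distribution of L; IPW reweights each subject by the inverse probability of the treatment actually received. Both return a risk of 0.5 under either treatment value, recovering the causal null that the crude analysis misses.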
• Randomization produces
– Marginal exchangeability
– Conditional exchangeability
• Exchangeability
– Allows us to use associational measure to estimate causal effects
• The investigator has no control over treatment assignment (e.g., no randomization)
• Cannot achieve exchangeability by design
• To estimate a causal contrast we must obtain valid observable substitute quantities for the desired counterfactual quantities
• If we don’t have good substitutes, then we have a confounded relationship, i.e., the associational RR ≠ causal RR
Conceptual justification
• Conceptualize observational studies as though they are conditionally randomized experiments.
• We assume that some components of the observational study happen by chance.
• Consistency: treatment levels are not assigned by the researcher but correspond to well-defined interventions
• Positivity: all conditional probabilities of treatment are greater than zero
• Conditional exchangeability: the conditional probabilities of being assigned to a specific treatment are not chosen by the investigator, but can be calculated from the data
Causal Inference
• Exchangeability and conditional exchangeability cannot be achieved by design.
• Question 1 : How do we address conditional exchangeability in Observational studies?
• Question 2: How should we pick covariates for our observational studies?
• Covariates should be selected to produce conditional exchangeability
• Confounding must be removed to produce conditional exchangeability
– A variable whose adjustment removes confounding is a confounder
• Adjusting for certain types of covariates can either block paths, open paths or do nothing
• We want to adjust variables that block all backdoor paths between the treatment and outcome, i.e., remove confounding.
• Mathematically formalized by
– Pearl (1988, 1995, 2000)
– Spirtes, Glymour, and Scheines (1993, 2000)
• DAGs are abstract mathematical objects.
• They encode an investigator’s a priori assumptions about the causal relations among the exposure, outcomes, and covariates.
• They represent:
– joint probability distributions
– causal structures.
• Support communication among researchers and clinicians
• Explicate our beliefs and background knowledge about causal structures
• Allow us to determine what needs to be measured to remove confounding
• Help us determine how bias can be induced
• Help us choose appropriate statistics
• Directed edges (arrows) linking nodes (variables)
• Variables joined by an arrow are said to be adjacent or neighbors
• Acyclic because there are no arrows from descendants (effects) to ancestors (causes)
• Descendants of a variable X are the variables affected either directly or indirectly by X
• Ancestors of X are all the variables that affect X directly or indirectly
• Paths between two variables can be directed or undirected
• Rules linking absence of open paths to statistical independencies
• Describe expected data distributions if the causal structure represented by the graph is correct
• Unconditional d-separation
– A path is open or unblocked if there is no collider on the path
– A collider blocks a path
• d-connected
– An open path exists between two variables
• Conditioning (adjusting) on a collider F on a path, or on any descendant of F, opens the path at F
– U1 and U2 are marginally independent, but conditionally associated (conditioning on F)
• Conditioning on a non-collider C closes the path and removes C as a source of association between A and Y
– A and Y are marginally associated, but conditionally independent (conditioning on C)
• Statistical: a confounder must
– Be associated with the exposure under study in the source population
– Be a risk factor for the outcome, though it need not actually cause the outcome
– Not be affected by the exposure or the outcome
• Graphical: a confounder must
– Be a common cause
– Have an unblocked back-door path
• Bias can be reduced to or explained by 3 structures
– Reverse causation: e.g., in case-control studies the outcome precedes exposure measurement, or the outcome can affect the exposure; includes measurement error and information bias
– Common cause: confounding, confounding by indication
– Conditioning on common effects: collider, selection bias, time varying confounding
• Adequate Background Knowledge
– Confounder identification must be grounded on an understanding of the causal structure linking the variables being studied (treatment and disease)
– Build a directed acyclic graph (DAG) to check whether the necessary criteria for confounding exists.
– Condition on the minimal set of variables necessary to remove confounding
• Inadequate Background Knowledge
– Remove known instrumental variables, colliders, and intermediates (variables measured post-treatment)
– Use automated selection procedures such as HDPS
• Under-adjustment occurs when
– an open backdoor path was not closed
• Over-adjustment can occur from adjusting for
– Instrumental variables
– Intermediate variables
– Colliders
– Variables caused by outcome
• Discussion of variable types
• Common Cause, i.e., confounder
• Confounder L distorts the effect of treatment A on disease Y
• Always adjust for confounders, unless the data set is small and the confounder has a strong association with treatment and a weak association with the outcome
• Goal is to produce conditional exchangeability
• A = treatment
– a=1 statin alone
– a=0 niacin alone
• L = Baseline Cholesterol
– l=1: LDL ≥ 160 mg/dL
– l=0: LDL < 160 mg/dL
• Y = Myocardial infarction
– Y=1: Yes
– Y=0: No
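This setup can be simulated. All probabilities below are invented for illustration; the only structural assumptions are the slide's: baseline LDL (L) affects both treatment choice (A) and MI (Y), and the true effect of A on Y is null.

```python
# Simulation sketch of confounding by a common cause (probabilities
# invented): high baseline LDL (L=1) makes both statin treatment (A=1)
# and MI (Y=1) more likely, while A has no effect on Y.
import random

random.seed(1)
n = 400_000
counts = {}  # (l, a) -> [subjects, MIs]

for _ in range(n):
    l = 1 if random.random() < 0.4 else 0
    pr_treat = 0.8 if l == 1 else 0.3  # treatment depends on L
    a = 1 if random.random() < pr_treat else 0
    pr_mi = 0.3 if l == 1 else 0.1     # outcome depends on L only (null effect of A)
    y = 1 if random.random() < pr_mi else 0
    counts.setdefault((l, a), [0, 0])
    counts[(l, a)][0] += 1
    counts[(l, a)][1] += y

def risk(a):
    subj = sum(v[0] for (l, ai), v in counts.items() if ai == a)
    mis = sum(v[1] for (l, ai), v in counts.items() if ai == a)
    return mis / subj

crude_rr = risk(1) / risk(0)  # biased upward: statin users look worse

# Within each LDL stratum the risk ratio is ~1 (the truth):
stratum_rr = {l: (counts[(l, 1)][1] / counts[(l, 1)][0]) /
                 (counts[(l, 0)][1] / counts[(l, 0)][0])
              for l in (0, 1)}
```

The crude risk ratio comes out well above 1 (confounding by indication), while the L-stratified risk ratios are approximately 1, the true null.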
• Adjusting for an intermediate variable I in a fixed-covariate model will remove the effect of treatment A on disease/outcome Y
• In a fixed-covariate model we do not want to include variables influenced by A or Y
• Time-varying treatment models do include time-varying confounders that are also intermediate variables
• A = treatment
– a=1 statin alone
– a=0 niacin alone
• I = Post-treatment Cholesterol
– i=1: LDL ≥ 160 mg/dL
– i=0: LDL < 160 mg/dL
• Y = Myocardial infarction
– Y=1: Yes
– Y=0: No
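A simulation sketch of why conditioning on an intermediate is harmful (probabilities invented): here the statin works only by lowering post-treatment LDL (I), and MI risk depends only on I.

```python
# Simulation sketch of over-adjustment for an intermediate: the statin
# (A=1) lowers post-treatment LDL (I), and MI (Y) depends only on I,
# so I carries the entire effect of A. Probabilities are invented.
import random

random.seed(2)
n = 400_000
tab = {}  # (a, i) -> [subjects, MIs]

for _ in range(n):
    a = random.randint(0, 1)              # randomized treatment
    pr_high_ldl = 0.2 if a == 1 else 0.6  # A affects the intermediate I
    i = 1 if random.random() < pr_high_ldl else 0
    pr_mi = 0.3 if i == 1 else 0.1        # Y depends only on I
    y = 1 if random.random() < pr_mi else 0
    tab.setdefault((a, i), [0, 0])
    tab[(a, i)][0] += 1
    tab[(a, i)][1] += y

def risk(a):
    subj = sum(v[0] for (ai, _), v in tab.items() if ai == a)
    mis = sum(v[1] for (ai, _), v in tab.items() if ai == a)
    return mis / subj

crude_rr = risk(1) / risk(0)  # < 1: the real (total) protective effect of A

# Within strata of the intermediate I, A and Y are unassociated:
stratum_rr = {i: (tab[(1, i)][1] / tab[(1, i)][0]) /
                 (tab[(0, i)][1] / tab[(0, i)][0])
              for i in (0, 1)}
```

The crude comparison shows the real protective effect, but within strata of I the treatment-outcome association vanishes: adjusting for the intermediate has "adjusted away" the very effect we want to estimate.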
• Adjusting for the collider C can produce bias
• Conditioning on a common effect F without adjustment for U1 or U2 will induce an association between U1 and U2, which will confound the association between A and Y
• A = antidepressant use
• Y = lung cancer
• U1 = depression
• U2 = smoking status
• F= cardiovascular disease
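This collider structure can be sketched with the slide's variables (all probabilities invented): depression (U1) and smoking (U2) are generated independently, and each raises the risk of cardiovascular disease (F).

```python
# Simulation sketch of collider bias: depression (U1) and smoking (U2)
# are independent causes of cardiovascular disease (F). Conditioning on
# F induces an association between them. Probabilities are invented.
import random

random.seed(3)
n = 400_000
both = u1_count = u2_count = 0
f_n = f_both = f_u1 = f_u2 = 0

for _ in range(n):
    u1 = 1 if random.random() < 0.2 else 0  # depression
    u2 = 1 if random.random() < 0.3 else 0  # smoking (independent of U1)
    pr_f = 0.05 + 0.3 * u1 + 0.3 * u2       # common effect: CVD
    f = 1 if random.random() < pr_f else 0
    u1_count += u1; u2_count += u2; both += u1 * u2
    if f:
        f_n += 1; f_u1 += u1; f_u2 += u2; f_both += u1 * u2

# Marginal independence: Pr[U1=1, U2=1] ~ Pr[U1=1] * Pr[U2=1]
marginal_gap = both / n - (u1_count / n) * (u2_count / n)

# Conditioning on the collider F=1 induces a (negative) association:
conditional_gap = f_both / f_n - (f_u1 / f_n) * (f_u2 / f_n)
```

Marginally, U1 and U2 are (near) independent; restricting to F=1 induces a clearly negative association between them, which can then confound the antidepressant-lung cancer comparison.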
• Inclusion of variables associated with treatment (A) only can cause bias and imprecision
• Variables associated with disease but not treatment (risk factors) can be included in models; they are expected to decrease the variance of the treatment effect without increasing bias
• Including variables associated with disease reduces the chance of missing important confounders
Shrier I, Platt RW. Reducing bias through directed acyclic graphs. BMC Medical Research Methodology. 2008;8:70.
• Produce a DAG and get clinical experts to agree on the underlying causal network
• Block (condition on) the variables that close open backdoor paths
– Open backdoor paths carry confounding
• Pearl’s 6-step approach for determining the minimal set of variables (illustrated by Shrier & Platt. Reducing bias through DAGs. BMC Medical Research Methodology. 2008;8:70)
• Subject matter knowledge is often not good enough to draw a DAG that can be used to determine the minimal set of covariates needed to produce conditional exchangeability
• In large database studies with many providers it is difficult to know all the factors that influence treatment decisions.
• Recommendation :
– Propensity score (PS) approach :
• Remove colliders and instruments (variables associated with treatment but not disease)
• In a large PS study we should include as many of the remaining variables as possible.
• Focus should be on variables that are a priori thought to be strongly causally related to outcomes (risk factors, confounders)
– Outcome models approach:
• Use a change in estimate approach to select variables
– Since evidence on the best variable-selection approaches is limited, researchers should explore the sensitivity of their results to different variable-selection strategies, as well as to removal and inclusion of variables that could be IVs or colliders.
• Fixed treatments
– Propensity Score
– Instrumental Variables
– IPW
• Time-varying treatments (sequential randomization)
– IPW
– G-estimation
– Doubly robust
• We have developed simulations to understand and teach these concepts.
• Poster at CDA conference.
• If interested, please contact me:
• Brian.sauer@va.gov; brian.sauer@utah.edu