Lifecourse Epidemiology:
Relevance for Studying Health
Disparities in Cognitive Aging
Maria Glymour
Friday Harbor Psychometrics Workshop
June 12, 2012
Acknowledgements
• Funded in part by Grant R13AG030995-01A1 from
the National Institute on Aging
• The views expressed in written conference materials
or publications and by speakers and moderators do
not necessarily reflect the official policies of the
Department of Health and Human Services; nor does
mention by trade names, commercial practices, or
organizations imply endorsement by the U.S.
Government.
Organization
• Motivating questions in lifecourse
epidemiology
• Causation vs statistical association
• Drawing and using DAGs
• Biases of special concern in studying racial
disparities and cognitive aging
– Survivor bias
– Baseline adjustment
– Adjusting for mediators
Epidemiology is a core tool in
public health
• We want to improve health
– Basic knowledge about how things work helps improve health but is not
the fundamental motivation
• Our questions always come down to something like
“Would changing some exposure/treatment X improve some
health outcome Y?”
– Caveats: Some people ask strictly clinical prediction questions like
“How likely is this person with characteristic X to keel over in the next
few years?”, to either warn the person or take aggressive preventive
action.
– Some people focus on surveillance (“Y is much more common in
recent years.”) but that is usually to motivate us to do something to
reduce the incidence of Y.
Lifecourse Epidemiology
• Is the study of how exposures at one point in life (fetal
development, early childhood, adolescence, early adulthood)
influence health outcomes much later in life.
• Basic models:
– Immediate risk: lightning strikes.
– Cumulative risk: you overeat every day and you become more and
more obese
– Critical/sensitive period: exposure matters most at a critical
developmental period (e.g., learning birdsong, or becoming a smoker)
– Trajectory models: any specific level of exposure would be fine, but
it’s a big problem to change (e.g., altitude sickness (??))
Why do we care?
• Understanding this helps establish when you can intervene to
prevent disease development.
• For cerebrovascular disease, some risk is probably incurred
very early in life, although we do not know if this is because of
– learned behavioral patterns (smoking initiation typically occurs in the teens),
– a trajectory that later exposes you to risk (poor school → poor diet →
diabetes/poor medical care), or
– physiologic changes that are already causing damage in
early life (vascular development, hypertension, obesity).
• Regardless, early exposures can increase the risk of the physiologic
event of acute stroke much later in life.
• For AD? Very strong evidence that education affects
performance on tests of memory and executive function (EF).
Why do we care?
• We want to know: “If we
– increased education
– taught a 2nd language
– gave more money
– provided a more interesting job
would this person (people) have lower risk of cognitive
decline?”
• We want to plan an intervention that will improve outcomes.
Why do we care?
• Some “exposures” we do not imagine intervening to change
(e.g., race, sex, geography) – we are primarily interested in
what mediates the association so we can intervene on the
mediating pathway.
• How to intervene if:
Female sex → fertility → time away from work → lower salary
vs
Female sex → sexual harassment → time away from work → lower salary
Social Exposures Become Physically
Embedded Across the Lifecourse
• Krieger calls this “embodiment”: something outside the body –
how other people treat you, the school you attend, the work
you do, the place you live, the kinds of medical care you get,
who you marry, who you have sex with – gets inside the
body and changes your risk of disease.
• Link and Phelan identify “fundamental” causes of disease as
factors that enable you to command health promoting
resources no matter what health threats you may face –
whether tuberculosis or myocardial infarction.
• What does race affect? What does education affect?
– These factors are so profound they affect almost every aspect of life
from before you are born until the day of your death and they pattern
almost every health outcome.
Causal Inference
Very commonly, we wish to know about causal
relations…
If we changed X, would Y also change?
But we observe only statistical associations…
People with high values of X also have high
values of Y.
Statistical versus Causal Language
Statistical claims:
• X and Y are correlated
• X predicts Y
• X predicts Y conditional on
(adjusting for or stratifying
on) Z
• The prevalence of Y among
those with X is twice as
high as the prevalence of Y
among those without X.
Causal claims:
• X causes Y
• X affects Y
• X increases (or decreases) Y
• X induces Z, which induces
Y
Counterfactuals or Potential Outcomes
• Everyone has a well-defined outcome value (Y), under all possible
values of the exposure (X), but we only get to observe one of the
possible outcomes.
• X is a cause of Y if the value of Y would have been different
under different values of X.
• X can be a cause of Y even if it is neither necessary nor sufficient
to produce Y.
• Extend to a population:
– X is a cause of Y if X is a cause of Y for some people in the population or
– X is a cause of Y if Y has different probability distributions under different
values of X
Counterfactuals or Potential Outcomes
What is the effect of living in poverty while aged 23-30 (X) on risk
of developing AD before age 75 (Y)?
• If Earnest lives in poverty, he will develop AD before age 75.
– Y_{X=1} = 1
• If he doesn’t live in poverty, he will not get AD before age 75.
– Y_{X=0} = 0
Earnest actually does live in poverty (graduate school? Starving
writer?) and he actually does develop AD. We never get to see what
would have happened to him if he’d taken that Wall Street job right
out of college and avoided poverty.
This is the fundamental problem of causal inference.
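The Earnest example can be written as a one-row potential-outcomes table. The sketch below is only an illustration: the `Person` class and its field names are hypothetical, and the values are the ones stated on the slide.

```python
from dataclasses import dataclass


@dataclass
class Person:
    name: str
    y_if_poor: int      # Y_{X=1}: develops AD before 75 if living in poverty
    y_if_not_poor: int  # Y_{X=0}: develops AD before 75 if not living in poverty
    x_actual: int       # exposure actually experienced (1 = poverty)

    def individual_effect(self) -> int:
        # Contrast of the two potential outcomes; never observable in real data.
        return self.y_if_poor - self.y_if_not_poor

    def observed_outcome(self) -> int:
        # Only the potential outcome matching the actual exposure is revealed.
        return self.y_if_poor if self.x_actual == 1 else self.y_if_not_poor


earnest = Person("Earnest", y_if_poor=1, y_if_not_poor=0, x_actual=1)
print("individual effect:", earnest.individual_effect())
print("observed outcome:", earnest.observed_outcome())
```

In a real study only `observed_outcome()` is available; the other potential outcome, and therefore the individual effect, stays hidden.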
Estimating counterfactual values from
observed values
Instead we observe the AD status of Francis, who’s a lot like
Earnest but avoided poverty. Francis did not develop AD. We
assume that Francis and Earnest are “exchangeable”, and conclude
that Earnest developed AD because of his poverty.
We observe the statistical association between poverty and AD, and
hope that the AD outcomes of people who were not impoverished
represent the outcomes people who were impoverished would have
had if they hadn’t been poor.
“Confounding is present if our substitute imperfectly represents what
our target would have been like under the counterfactual condition.”
– Maldonado and Greenland (2002)
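One way to see what exchangeability buys is a small simulation in which both potential outcomes are known. This is a sketch under invented assumptions: the unmeasured `risk_factor`, all probabilities, and the function name `simulate` are hypothetical and do not come from any study.

```python
import numpy as np


def simulate(confounded: bool, n: int = 200_000, seed: int = 0):
    """Return (true average effect, observed poor-vs-not-poor risk difference)."""
    rng = np.random.default_rng(seed)
    risk_factor = rng.random(n) < 0.5            # hypothetical unmeasured cause of AD
    p0 = np.where(risk_factor, 0.30, 0.10)       # AD risk if not poor
    p1 = p0 + 0.10                               # AD risk if poor (assumed effect: +0.10)
    u = rng.random(n)
    y0, y1 = u < p0, u < p1                      # both potential outcomes per person
    if confounded:
        p_poor = np.where(risk_factor, 0.8, 0.2) # risk factor also drives poverty
    else:
        p_poor = np.full(n, 0.5)                 # poverty assigned as if at random
    poor = rng.random(n) < p_poor
    y_obs = np.where(poor, y1, y0)               # only one outcome is ever observed
    true_effect = y1.mean() - y0.mean()
    observed_diff = y_obs[poor].mean() - y_obs[~poor].mean()
    return true_effect, observed_diff


print("exchangeable groups:", simulate(confounded=False))
print("confounded groups:  ", simulate(confounded=True))
```

When the non-poor are a good substitute for the poor, the observed difference tracks the true effect; when they are not, the observed difference is inflated by the shared cause.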
Statistical Independence vs
Statistical Association
• If knowing the value of X gives you no information
about the value of Y, then we say X and Y are
statistically independent
• If knowing the value of X gives you some
information about the value of Y, we say X and Y are
statistically dependent or associated
• If, among people with the same value of C, knowing the value of X
gives you some information about the value of Y, we say X and Y are
statistically dependent conditional on C.
Inferring Causation From Association
Statistical association between two variables X
and Y may be due to:
1. Random fluctuation
2. X caused Y
3. Y caused X
4. X and Y share a common cause
5. The statistical association was induced by
conditioning on a common effect of X and Y
(as in selection bias).
How we can use this
• Eliminating four of these explanations is usually the
goal of a causal analysis.
• Knowing these five sources of statistical association
helps identify the (set of) causal structure(s) that
could have generated the observed statistical
associations.
• We are always trying to go backwards from a set of
observed (conditional) statistical associations to the
unobserved causal structure that generated those
associations.
Causal Directed Acyclic Graphs
Non-parametric SEMs: show your
assumptions about the causal relationships
among X, Y, and possible covariates in a
causal diagram:
• If two variables shown in the graph have
a common cause, you must show the cause
in the graph.
[Figure: a diagram with nodes X, A, and Y, redrawn with a common cause U added]
• Do not allow causal “loops”.
[Figure: example diagram with nodes X, A, U, Y, B, and E]
Terminology
• Descendants
– The direct or indirect effects of a variable
• Paths
– A sequence of lines (edges) between two variables, regardless
of the direction of the arrows
– Not retracing any line segments or going through the same
variable twice
• Colliders
– Common effect of two variables in a path: where the arrows
‘collide’.
– The two causes must both be “on the path”.
– Any variable on a path that is not a collider is a “non-collider”.
• Conditioning
– Examining the distribution of one variable within levels of
another
– Regression adjustment, stratification, restriction
Colliders vs Non-Colliders
• Colliders: common effects
• Non-colliders: common causes (= confounders) or mediators
[Figure: three small diagrams with nodes A, B, and C, one for each case]
D-separation
• The assumptions shown in a causal diagram imply that a
variable X will be independent of a variable Y, after
conditioning on a set of variables {Z}, if every path
between X and Y is blocked by {Z}.
• {Z} blocks a path if and only if either:
1. The path contains a non-collider that is in {Z}, or
2. The path contains a collider that is not in {Z},
and no descendant of the collider is in {Z}.
• If there is an unblocked path linking X and Y, then X
and Y will typically be statistically dependent (unless
there is a perfectly offsetting balance between two
paths).
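The blocking rules can be applied mechanically to a small graph. The following is a minimal sketch, not a production algorithm: it enumerates every path by brute force, which is only practical for small diagrams. The dictionary encoding (parent mapped to a list of children), the function names, and the example graph with nodes U, X, Y, Z, W are all assumptions made for illustration.

```python
def descendants(dag, node):
    """All variables reachable from `node` by following arrows forward."""
    seen, stack = set(), [node]
    while stack:
        for child in dag.get(stack.pop(), ()):
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen


def simple_paths(dag, start, end):
    """Every path from start to end, ignoring arrow direction, never revisiting a variable."""
    neighbors = {}
    for parent, children in dag.items():
        for child in children:
            neighbors.setdefault(parent, set()).add(child)
            neighbors.setdefault(child, set()).add(parent)

    def walk(path):
        if path[-1] == end:
            yield path
            return
        for nxt in neighbors.get(path[-1], ()):
            if nxt not in path:
                yield from walk(path + [nxt])

    yield from walk([start])


def path_is_blocked(dag, path, z):
    for prev, mid, nxt in zip(path, path[1:], path[2:]):
        is_collider = mid in dag.get(prev, ()) and mid in dag.get(nxt, ())
        if is_collider:
            # Rule 2: a collider blocks unless it or one of its descendants is in Z.
            if mid not in z and not (descendants(dag, mid) & z):
                return True
        elif mid in z:
            # Rule 1: a non-collider blocks when it is in Z.
            return True
    return False


def d_separated(dag, x, y, z=frozenset()):
    z = set(z)
    return all(path_is_blocked(dag, p, z) for p in simple_paths(dag, x, y))


# Hypothetical graph: U is a common cause of X and Y; Z is their common effect; Z causes W.
dag = {"U": ["X", "Y"], "X": ["Z"], "Y": ["Z"], "Z": ["W"]}
print(d_separated(dag, "X", "Y"))              # open path through U
print(d_separated(dag, "X", "Y", {"U"}))       # U blocks; collider Z left alone
print(d_separated(dag, "X", "Y", {"U", "W"}))  # W is a descendant of the collider
```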
D-separation: intuition
• There may be many reasons that two variables are
associated (some confounding, some mediated
causation, etc.).
• Adjusting for a confounder of the two variables
blocks that source of association between the two
variables.
• Adjusting for a mediator between the two
variables blocks that source of association between
the two variables.
• Adjusting for a common effect of the two
variables creates an association between the two
variables.
Recap
Two variables X and Y will generally be
associated if:
1. X causes Y or Y causes X
• Exceptions?
2. X and Y share a common cause
• Exceptions?
3. You have conditioned on a common effect of
X and Y.
Conditioning on a Collider
If two variables are statistically independent, but
have a common effect, then, within levels of
this effect, they will be statistically dependent.
Really.
Usually.
A collider anecdote
Some tall people are fast, and some are slow.
Some short people are fast, and some are slow.
Knowing that somebody in the general population is
short does not give you information about whether
they are fast or slow.
NBA ball players must be either very tall, or very fast.
If you know an NBA ball player is short… what do you
know about his speed?
A collider anecdote
I throw a party, and I only invite people who are
either very rich or very funny.
You come to my party (you are very funny) and
get stuck talking to the most boring person
you have ever met.
Is he rich?
A Collider Illustration
• X~N(0,1)
• Y~N(0,1)
• e~N(0,1)
• Z=X+Y+e
• n=100
In this simulation,
– X has no effect on Y
– Y has no effect on X
– They share no common causes
Unconditionally, X and Y are independent.
A Collider Illustration
[Figure: two scatterplots, “Scatter X, Y” and “Scatter X, Z”]
A Collider Illustration
. reg y x
           Coef.    Std. Err.    t      P>|t|
   x  |    .0204    .1113       0.18    0.854
cons  |   -.0064    .10153     -0.06    0.950
. reg y x z
           Coef.    Std. Err.    t      P>|t|
   x  |   -.5046    .1042      -4.84    0.000
   z  |    .5217    .0609       8.57    0.000
cons  |   -.0219    .0770      -0.28    0.777
[Figure: “Scatter X, residual (Y|Z)”, plotting X against the residuals]
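The same illustration can be rerun outside Stata. This is a rough Python analog, assuming NumPy is available; the seed and the function name `collider_regressions` are arbitrary choices, so the numbers will not match the slide's output exactly.

```python
import numpy as np


def collider_regressions(n: int = 100, seed: int = 0):
    """Return (coef on x in y~x, coef on x in y~x+z, coef on z in y~x+z)."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(n)
    y = rng.standard_normal(n)      # generated independently of x
    e = rng.standard_normal(n)
    z = x + y + e                   # z is a common effect (collider) of x and y
    ones = np.ones(n)
    # Ordinary least squares with an intercept: y on x
    b_unadj, *_ = np.linalg.lstsq(np.column_stack([ones, x]), y, rcond=None)
    # Ordinary least squares with an intercept: y on x and z
    b_adj, *_ = np.linalg.lstsq(np.column_stack([ones, x, z]), y, rcond=None)
    return b_unadj[1], b_adj[1], b_adj[2]


print(collider_regressions(n=100))        # sample size used on the slide
print(collider_regressions(n=200_000))    # large sample, close to the limiting values
```

With a large sample the unadjusted coefficient on x approaches 0, while adjusting for z pushes it toward -0.5 and gives z a coefficient near 0.5, the same pattern as in the regression output above.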
Collider Bias and Nihilism
• Once you recognize the potential for collider bias, you may see
it everywhere.
• Or at least… the possibility of collider bias
• Collider bias is often small (try inducing it in a simulated data
set)
• Among the many reasons your data and analytic tools are
completely inadequate to answer your most interesting research
questions, collider bias may not even be in the top 3.
• But on occasion, it can be critical, especially if the associations
of the parents with the collider are very strong.
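To get a feel for how the size of the bias depends on the strength of the parent-collider associations, one can work out the limiting regression coefficient directly. The sketch assumes a specific hypothetical setup that extends the earlier illustration: Z = a*X + a*Y + e with X, Y, e independent N(0,1) and no true effect of X on Y. Under those assumptions the coefficient on X when regressing Y on X and Z works out to -a²/(1 + a²).

```python
def collider_bias(a: float) -> float:
    """Large-sample coefficient on X from regressing Y on X and Z,
    where Z = a*X + a*Y + e and X, Y, e are independent N(0,1).
    The true effect of X on Y is zero, so the whole coefficient is bias."""
    # Variances and covariances of the predictors (X, Z)
    var_x, cov_xz, var_z = 1.0, a, 2 * a**2 + 1.0
    # Covariances of each predictor with the outcome Y
    cov_xy, cov_zy = 0.0, a
    # Solve the 2x2 normal equations by Cramer's rule
    det = var_x * var_z - cov_xz**2
    return (cov_xy * var_z - cov_zy * cov_xz) / det


for a in (0.1, 0.5, 1.0, 3.0):
    print(f"a = {a}: bias = {collider_bias(a):.3f}")
```

Weak paths into the collider (a = 0.1) give a bias of about -0.01, while strong ones (a = 3) give about -0.9, consistent with the point that the bias is often small but can be critical.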
Example Causal Diagrams
[Figure: several example causal diagrams; node labels include A, A1, A2, A3, B, E, U, X, and Y]
Organization
• Motivating questions in lifecourse epidemiology
• Causation vs statistical association
• Drawing and using DAGs
• Biases of special concern in studying racial
disparities and cognitive aging
– Confounding
– Survivor bias
– Baseline adjustment
– Adjusting for mediators
Confounding
[Figure: causal diagram with nodes Education, Depression, and Memory]
– Education “confounds” the association between
depression and memory.
– Conditioning on education would be sufficient to
identify the effect of depression on memory.
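A simulated version of this diagram shows what conditioning on the confounder does to the estimate. Everything numeric here is a made-up assumption (linear effects, standard normal noise, a true depression effect of -0.3, and arrows from education to both depression and memory); only the variable names follow the slide.

```python
import numpy as np


def depression_coefficients(n: int = 200_000, true_effect: float = -0.3, seed: int = 0):
    """Return (crude, education-adjusted) coefficients for depression on memory."""
    rng = np.random.default_rng(seed)
    education = rng.standard_normal(n)
    # Assumed structure: education lowers depression and raises memory
    depression = -0.5 * education + rng.standard_normal(n)
    memory = 0.8 * education + true_effect * depression + rng.standard_normal(n)
    ones = np.ones(n)
    crude, *_ = np.linalg.lstsq(
        np.column_stack([ones, depression]), memory, rcond=None)
    adjusted, *_ = np.linalg.lstsq(
        np.column_stack([ones, depression, education]), memory, rcond=None)
    return crude[1], adjusted[1]


crude, adjusted = depression_coefficients()
print(f"crude: {crude:.2f}, adjusted for education: {adjusted:.2f}")
```

Under these assumptions the crude coefficient overstates the harm of depression (it lands near -0.62), and adjusting for education recovers a value close to the assumed -0.3.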
Confounding
[Figure: causal diagram with nodes Education, Income, Depression, and Memory]
– Education “confounds” the association between
depression and memory.
– Conditioning on either education or income would be
sufficient to identify the effect of depression on memory.
A DAG for Selection/Survivor Bias
– Imagine studying education and dementia in EPESE data.
– Education completed ~age 25, affects survival to age 65.
– EPESE enrollment ~age 65
[Figure: causal diagram with nodes Education, Survival to age 65, Dementia, and Some gene]
A DAG for Selection Bias
Here, we assume education has no effect on dementia.
Would it be statistically associated with dementia among
EPESE enrollees?
[Figure: causal diagram with nodes Education, Survival to age 65, Dementia, and Some gene]
Yes.
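The “Yes” can be checked with a simulation of this selection mechanism. The sketch assumes that the gene lowers survival and raises dementia risk, which is one reading of the diagram; all probabilities and the function name are invented for illustration. Education has no effect on dementia by construction.

```python
import numpy as np


def education_dementia_gap(n: int = 500_000, seed: int = 0):
    """Return the dementia risk difference (high minus low education)
    in the full birth cohort and among survivors to age 65."""
    rng = np.random.default_rng(seed)
    high_edu = rng.random(n) < 0.5
    gene = rng.random(n) < 0.3                       # hypothetical harmful gene
    # Survival to 65 is less likely with low education and with the gene
    p_survive = 0.9 - 0.3 * (~high_edu) - 0.3 * gene
    survived = rng.random(n) < p_survive
    # Dementia risk depends on the gene only, never on education
    dementia = rng.random(n) < np.where(gene, 0.6, 0.1)

    def gap(mask):
        return dementia[mask & high_edu].mean() - dementia[mask & ~high_edu].mean()

    return gap(np.ones(n, dtype=bool)), gap(survived)


cohort_gap, survivor_gap = education_dementia_gap()
print(f"whole cohort: {cohort_gap:.3f}, survivors only: {survivor_gap:.3f}")
```

In the whole cohort the gap is essentially zero, but restricting to survivors (as enrollment at age 65 does) produces a nonzero association, because low-education survivors are less likely to carry the gene.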
Stratifying on the Dependent Variable
[Figure: causal diagram with nodes X, Y, Y*, and U]
Suppose you want to know whether the effect of
education on MMSE score is larger or smaller for
individuals with cognitive impairment. Can you just
stratify by MMSE and examine the relationship?
Why would you condition on a collider?
Some “colliders” are not optional:
• Survival
• Diagnosis with a disease
• Selection into a study
• Providing complete data
Can you quantify the bias?
• Must make assumptions about the magnitude
and direction of each causal association
• This is not specified in the DAG; the DAG
only tells you about conditional
dependence/independence.
• Often the bias is small, but not always.
• Often the bias is negative.
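Unreliable Measures
[Figure: causal diagram with nodes Depression, Memory, CESD1, and e1]
Unreliable Measures
[Figure: causal diagram with nodes Unemployment, Depression, Memory, CESD1, and e1]
Unreliable Measures in Analyses of Change
[Figure: causal diagram with nodes U, X, C1, Change in C1, Y1, Y2 - Y1, and e1]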
Estimating Direct Effects
[Figure: causal diagram with nodes Race, Educ, and Y]
Standard decomposition of direct/indirect effects:
E(Y) = b0 + b1*Race
E(Y) = a0 + a1*Race + a2*Educ
Total effect = b1
Direct effect = a1
Indirect effect = b1 - a1
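The two-regression decomposition can be sketched on simulated data where the answer is known. The data-generating values below are purely hypothetical (a 0/1 `race` indicator, linear effects, no interaction, no unmeasured confounding, and perfectly measured `educ`), so they describe the idealized case in which the method works.

```python
import numpy as np


def difference_method(n: int = 200_000, seed: int = 0):
    """Return (total, direct, indirect) using the two-regression decomposition."""
    rng = np.random.default_rng(seed)
    race = (rng.random(n) < 0.5).astype(float)       # 0/1 indicator
    educ = -1.0 * race + rng.standard_normal(n)      # assumed effect on the mediator
    y = -0.2 * race + 0.5 * educ + rng.standard_normal(n)
    ones = np.ones(n)
    # E(Y) = b0 + b1*Race
    b, *_ = np.linalg.lstsq(np.column_stack([ones, race]), y, rcond=None)
    # E(Y) = a0 + a1*Race + a2*Educ
    a, *_ = np.linalg.lstsq(np.column_stack([ones, race, educ]), y, rcond=None)
    total, direct = b[1], a[1]
    return total, direct, total - direct


total, direct, indirect = difference_method()
print(f"total = {total:.2f}, direct = {direct:.2f}, indirect = {indirect:.2f}")
```

With these assumed values the total effect is near -0.7, the direct effect near -0.2, and the indirect effect near -0.5, which equals the assumed race-to-education effect (-1.0) times the education-to-outcome effect (0.5).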
Estimating Direct Effects
[Figure: causal diagram with nodes Race, Educ, Nutrition, and Y]
Problems if:
- Unmeasured confounding of Educ and Y
- Race and Educ interact to affect Y
- Imperfect measurement of Educ
- (Non-linear models)
Standard decomposition of direct/indirect effects:
E(Y) = b0 + b1*Race
E(Y) = a0 + a1*Race + a2*Educ
Total effect = b1
Direct effect = a1
Indirect effect = b1 - a1
END
Confounding
[Figure: causal diagram with nodes Childhood Cognitive Skills, Education, Memory, and Depression]
Confounding
[Figure: causal diagram with nodes Neurodegenerative Disease, Childhood Cognitive Skills, Education, Memory Change, Memory, and Depression]