Intermediate methods in
observational epidemiology
2008
Instructor: Moyses Szklo
Study Designs in Observational
Epidemiology
Epidemiologic reasoning
• To determine whether a statistical
association exists between a presumed
risk factor and disease
• To derive inferences regarding a
possible causal relationship from the
patterns of the statistical associations
To determine whether a statistical
association exists between a
presumed risk factor and a disease
• Studies using populations or groups of individuals as
units of observation
– Descriptive studies (prevalence, incidence, trends)
– Analysis of birth cohorts (cohort, age, period effects)
– Ecological studies
• Studies using individuals as units of observation
– Randomized clinical trials
– Cohort studies
– Case-control studies
– Cross-sectional studies
– Other (nested case-control, case-crossover study)
Studies using groups as units of observation
• ECOLOGIC STUDIES
– To assess the correlation between a presumed
risk factor and an outcome, mean values of the
outcome (e.g., rate, mean) are plotted against
mean values of the factor (e.g., average per capita
fat intake), using groups as units of observation
– Groups could be defined by place (geographical
comparisons) or time (temporal trends).
A plot of the population of Oldenburg at the end of each year against
the number of storks observed in that year, 1930-1936.
Ornithologische Monatsberichte 1936;44(2)
Relation between Anopheles inoculation and incidence of Plasmodium
falciparum parasitemia in cohorts of children in Western Kenya
McElroy et al: Am J Epidemiol 1997;145:945-56.
Ecological fallacy
“The bias that may occur because an
association observed between
variables on aggregate levels does not
necessarily represent the association
that exists at the individual level.”
Last: Dictionary of Epidemiology, 1995
Example of ecological bias*
Population A
Individual incomes: $10.5K, $34.5K, $28.5K, $12.2K, $45.6K, $17.5K, $19.8K
Traffic injuries: 4/7=57%
Mean income: $23,940
Population B
Individual incomes: $12.5K, $32.5K, $24.3K, $10.0K, $14.3K, $38.0K, $26.4K
Traffic injuries: 3/7=43%
Mean income: $22,430
Population C
Individual incomes: $28.7K, $30.2K, $13.5K, $23.5K, $10.8K, $22.7K, $20.5K
Traffic injuries: 2/7=29%
Mean income: $21,410
*Based on: Diez-Roux, Am J Public Health 1998;88:216.
Ecologic analysis [plot: traffic injuries (%) against mean income (US$, in 1000)]: Higher income is associated with higher injury rate
Individual-based analysis [plot: mean income (1000 US$) of injury cases and non-cases]: Injury cases have lower mean income than non-cases
• Which of the two levels of inference is wrong?
– Concluding that high income is a risk factor for injuries
(based on the ecologic data) is subject to ecologic fallacy.
– BUT … concluding that, because injury cases tend to have
lower income, communities with higher average income
should have lower injury rates is also wrong!
• The real problem is cross-level inference*
– Using ecologic data to make inferences at the individual level
(ecologic fallacy).
– Or using the individual data to make inferences at the group
(population) level.
• When used to make inferences at the proper
level, both approaches might be right:
– Individuals with a lower income are more likely to be injured.
– In communities with higher average incomes, there is a
greater number of cars, thus exposing lower income
individuals to injuries.
*Morgenstern: Ann Rev Public Health 1995;16:61-81.
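The ecologic side of the example can be checked with a few lines of Python. This is only a sketch: it uses the three group-level values stated on the slide (mean income and injured/total per population) and a hand-written Pearson correlation; the variable and function names are illustrative, not part of the original material. The individual-level analysis cannot be reproduced the same way, because the slide does not say which individuals were injured.

```python
from math import sqrt

# Group-level values as stated on the slide (mean income in US$ thousands).
mean_income = {"A": 23.94, "B": 22.43, "C": 21.41}
injured = {"A": (4, 7), "B": (3, 7), "C": (2, 7)}  # (injured, population size)


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


groups = sorted(mean_income)
incomes = [mean_income[g] for g in groups]
rates = [injured[g][0] / injured[g][1] for g in groups]

# With populations as units of observation the correlation is strongly
# positive: higher mean income goes with a higher injury rate.
r_ecologic = pearson(incomes, rates)
print(f"ecologic correlation across populations: r = {r_ecologic:.2f}")
```

A positive group-level correlation of this kind says nothing by itself about whether higher-income individuals are the ones being injured, which is the cross-level step the slide warns against.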
Types of ecologic variables
• Analogs of individual-level characteristics
  – Aggregate measures (proportion, mean)
    • Prevalence of disease
    • Mean saturated fat intake
    • Percentage with less than high school education
  – Environmental measures
    • Air pollution
• Global measures
  – Health care system
  – Gun control law
  – Herd immunity
Ecologic studies are the design
of choice in certain situations:
• When the level of inference of interest is at the
population level
– Food availability (e.g., Goldberger et al: Public Health Rep
1916;35:2673-714).
– Effects of tax hikes on cigarette sales
• When the variability of exposure within the population
is limited
– Salt intake and hypertension (Elliot, 1992)
– Fat intake and breast cancer (Wynder et al, 1997)
[Figure: hypothetical data on individuals from a world-wide population; x-axis: usual daily salt intake. Across all individuals: strong positive (linear) association.]
[Figure: the same plot restricted to individuals from country A; x-axis: usual daily salt intake. No association.]
[Figure: the same plot with individuals grouped by country (Countries A to G); x-axis: usual daily salt intake.]
[Figure: hypothetical ecologic data from 7 countries (Countries A to G); x-axis: mean usual daily salt intake. Strong positive (linear) association.]
Relation between sodium (Na) excretion and age-related increase in systolic blood
pressure (SBP) in centers in the INTERSALT cohort*
*Elliot, in Marmot and Elliot (eds.): Coronary Heart Disease Epidemiology, Oxford, 1992, pp. 166-78.
Studies based on individuals:
Prospective Studies
• Experimental (randomized clinical trial): study population → random allocation → intervention vs. control → follow-up → outcome in each group
• Non-experimental (observational*): study population → non-random allocation → intervention vs. control → follow-up → outcome in each group
*Cohort Study
Studies based on individuals
1.- Cohort studies
[Diagram: a cohort classified by suspected exposure is followed over time for an outcome (death, disease, recurrence, recovery).]
[Diagram: exposed and non-exposed groups followed over time and classified as diseased or non-diseased. Incidence in the exposed (Inc_e) divided by incidence in the non-exposed (Inc_ē) gives the RR.]
Cohort study
[Diagram: initial population followed over time to the final population, with events and losses to follow-up, drawn separately for the EXPOSED and the UNEXPOSED.]
INCIDENCE_EXP ÷ INCIDENCE_UNEXP = RR
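The relative risk in the cohort diagram is simply the ratio of two incidences. A minimal sketch, assuming cumulative incidence (events divided by the initial population) and no losses to follow-up; the counts below are hypothetical and the function names are illustrative.

```python
def cumulative_incidence(events, initial_population):
    """Proportion of the initial population that develops the outcome."""
    return events / initial_population


def relative_risk(events_exp, n_exp, events_unexp, n_unexp):
    """Incidence in the exposed divided by incidence in the unexposed."""
    return (cumulative_incidence(events_exp, n_exp)
            / cumulative_incidence(events_unexp, n_unexp))


# Hypothetical cohort: 30 events among 1,000 exposed, 10 among 1,000 unexposed.
rr = relative_risk(30, 1000, 10, 1000)
print(f"RR = {rr:.2f}")
```

When losses to follow-up are substantial, the denominators would usually be person-time rather than the initial population, which this sketch does not attempt.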
Cohort Studies
Strengths
• Allows calculation of incidence
• Time sequence is clear (exposure → outcome)
  Reduces potential for bias
• Allows calculation of all measures of association
• Multiple outcomes can be assessed
• Multiple exposures can be assessed
• New hypotheses can be tested as time goes by
• Efficient ways to evaluate associations
• Stored specimens can be analyzed later for new analytes / risk factors
Cohort Studies
Additional Advantages
• Can incorporate changes in exposures and
confounders over time:
As participants age
As exposure accumulates
• Exposures and outcomes do not (necessarily) have to
be identified a priori
• New endpoints can be assessed – e.g., cancer
• Examination of baseline associations
Cross-sectional bias less likely with subclinical
outcomes
• The cohort as an epidemiologic laboratory
Ancillary studies can be done
Rich database for analyses
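Studies based on individuals
2.- Case-control studies
[Diagram: diseased and non-diseased groups classified as exposed or non-exposed. Odds of exposure among the diseased (Odds_exp|D) divided by odds of exposure among the non-diseased (Odds_exp|D−) gives the OR.]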
Case-control study
[Diagram: hypothetical population followed over time, with losses; cases and controls are selected from it.]
Recruiting only cases with longest survival (prevalent cases): risk of duration (incidence-prevalence) bias
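INCIDENCE-PREVALENCE BIAS
\[ \frac{\text{Prevalence}}{1 - \text{Prevalence}} = \text{Prevalence Odds} = \text{Incidence} \times \text{Duration} \]
\[ \text{Prevalence Odds Ratio} = \frac{\text{Prev}_{exposed} / (1 - \text{Prev}_{exposed})}{\text{Prev}_{unexposed} / (1 - \text{Prev}_{unexposed})} = \frac{\text{Incid}_{exposed} \times \text{Dur}_{exposed}}{\text{Incid}_{unexposed} \times \text{Dur}_{unexposed}} \]
where the ratio Incid_exposed / Incid_unexposed is the Relative Risk.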
Incidence-Prevalence Bias
or
Duration bias
or
Survival bias
or
Selection bias
CASE-CONTROL STUDY INCLUDING ALL INCIDENT CASES AND NON-CASES*
[Diagram: hypothetical EXPOSED (n= 100) and UNEXPOSED (n= 100) populations followed over time, each with losses, cases, and controls.]
Exposed?   Cases   Controls*
Yes        4       96
No         4       96
Total      8       192
OR = (4 × 96) ÷ (96 × 4) = 1.0
*Assumption: All non-cases survive through the end of the follow-up
CASE-CONTROL STUDY INCLUDING ONLY POINT PREVALENT CASES, BUT ALL NON-CASES
Exposed?   Cases   Controls*
Yes        1       96
No         4       96
Total      5       192
OR = (1 × 96) ÷ (96 × 4) = 0.25
*Assumption: All non-cases survive through the end of the follow-up
SELECTION/SURVIVAL BIAS (ALSO KNOWN AS PREVALENCE-INCIDENCE BIAS)
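The two hypothetical case-control tables can be compared directly. The sketch below recomputes the cross-product odds ratio for the table with all incident cases and for the table with only point prevalent cases; the function name and the `bias_factor` variable are illustrative additions, not part of the original slides.

```python
def odds_ratio(exp_cases, exp_controls, unexp_cases, unexp_controls):
    """Cross-product odds ratio from a 2 x 2 case-control table."""
    return (exp_cases * unexp_controls) / (exp_controls * unexp_cases)


# All incident cases: 4 exposed and 4 unexposed cases, 96 controls in each row.
or_incident = odds_ratio(4, 96, 4, 96)

# Only point prevalent cases: 3 of the 4 exposed cases are no longer available.
or_prevalent = odds_ratio(1, 96, 4, 96)

# Ratio of the biased to the unbiased estimate; under the slide's
# assumptions it reflects the shorter survival of exposed cases.
bias_factor = or_prevalent / or_incident

print(f"OR, incident cases:  {or_incident:.2f}")
print(f"OR, prevalent cases: {or_prevalent:.2f}")
print(f"bias factor:         {bias_factor:.2f}")
```

The exposure has no effect on incidence in this example (OR = 1.0), so the apparent protective association with prevalent cases comes entirely from differential survival.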
Results from cross-sectional surveys can be
analyzed in a prospective or case-control mode
Prospective mode:
Exposure   Disease: Yes   Disease: No   Prevalence_dis
Yes        a              b             a/(a+b)
No         c              d             c/(c+d)
Case-control mode:
Exposure   Disease: Yes   Disease: No
Yes        a              b
No         c              d
Odds_exp   a/c            b/d
PREVALENCE OF DISEASE BY EXPOSURE
Exposed?   Disease: Yes   Disease: No   Total
Yes        1              96            97
No         4              96            100
PR = 1/97 ÷ 4/100 = 0.26
Assumption: All non-cases survive through the end of the follow-up
SELECTION/SURVIVAL BIAS (ALSO KNOWN AS PREVALENCE-INCIDENCE BIAS)
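The same cross-sectional table can be read in either mode. A minimal sketch, using the a, b, c, d cell labels (exposed with and without disease, then unexposed with and without disease) and the counts from the prevalence table; the function names are illustrative.

```python
def prevalence_ratio(a, b, c, d):
    """Prospective mode: prevalence of disease in exposed vs. unexposed."""
    return (a / (a + b)) / (c / (c + d))


def prevalence_odds_ratio(a, b, c, d):
    """Case-control mode: odds of exposure in diseased (a/c) vs. non-diseased (b/d)."""
    return (a / c) / (b / d)


# Counts from the table: exposed 1 diseased / 96 not; unexposed 4 diseased / 96 not.
pr = prevalence_ratio(1, 96, 4, 96)
por = prevalence_odds_ratio(1, 96, 4, 96)
print(f"prevalence ratio:      {pr:.2f}")
print(f"prevalence odds ratio: {por:.2f}")
```

Both measures are close here because the disease is rare in both exposure groups, and both carry the same survival bias relative to the incidence-based value of 1.0.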
Cross-Sectional Vs. “Retrospective” Case-Control Studies
Key concept: How “caseness” and exposure are ascertained
ANOTHER TYPE OF CROSS-SECTIONAL BIAS: REVERSE CAUSALITY
Cross-Sectional Exposure Assessment: Association of Low Serum Carotenoids with
Age-Related Macular Degeneration
Total Carotenoids             Cases      Controls
≤ 1.024 μmol/L (exposed)      107        115
> 1.024 μmol/L (unexposed)    284        462
Total                         391        577
Exposure Odds                 0.38:1.0   0.25:1.0
Odds Ratio= 1.5
IT IS NOT POSSIBLE TO DETERMINE WHAT CAME FIRST (EXPOSURE OR OUTCOME).
THUS, INDIVIDUALS WITH AGE-RELATED MACULAR DEGENERATION MAY CHANGE THEIR
DIETS, WHICH IN TURN MAY RESULT IN LOW CONCENTRATIONS OF TOTAL
CAROTENOIDS → ‘REVERSE CAUSALITY’
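The exposure odds and the odds ratio in the carotenoid table follow from the four cell counts. A short sketch of that arithmetic; the dictionary and variable names are illustrative.

```python
# Cell counts from the table (exposed = total carotenoids <= 1.024 umol/L).
cases = {"exposed": 107, "unexposed": 284}
controls = {"exposed": 115, "unexposed": 462}

# Odds of exposure within each group, then their ratio.
odds_cases = cases["exposed"] / cases["unexposed"]
odds_controls = controls["exposed"] / controls["unexposed"]
exposure_or = odds_cases / odds_controls

print(f"exposure odds, cases:    {odds_cases:.2f}:1.0")
print(f"exposure odds, controls: {odds_controls:.2f}:1.0")
print(f"odds ratio:              {exposure_or:.1f}")
```

The arithmetic is the same as in any case-control table; what the cross-sectional measurement cannot supply is the time order of exposure and disease.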
Cross-sectional Studies
National Center for Health Statistics (NCHS)
• National Health and Nutrition Examination Survey (NHANES)
– 20,000+ individuals
– Oversampled children, age>65, minorities
– Questionnaires, physical exam, laboratory data
• National Health Interview Survey (NHIS)
• National Immunization Survey (NIS)
• National Survey of Family Growth (NSFG)
www.cdc.gov/nchs
Cross-sectional survey
Point prevalence=
Snapshot of prevalence at
time of a cross-sectional
survey
Cross-sectional Studies
What can we learn?
Descriptions / Distributions:
Standardized centile curves of body mass index for Japanese children and
adolescents based on the 1978-1981 national survey data.
Ann Hum Biol. 2006 Jul-Aug;33(4):444-53.
Prevalence:
The prevalence of oral mucosal lesions in U.S. adults: data from the Third
National Health and Nutrition Examination Survey, 1988-1994.
J Am Dent Assoc. 2004 Sep;135(9):1279-86
Trends in prevalence:
Thirty-year trends in cardiovascular risk factor levels among US adults with
diabetes: National Health and Nutrition Examination Surveys, 1971- 2000
Am J Epidemiol. 2004 Sep 15;160(6):531-9
Association of exposure with prevalence of disease:
Prevalence of urinary schistosomiasis and HIV in females living in a rural
community of Zimbabwe: does age matter?
Trans R Soc Trop Med Hyg. 2006 Oct 23
Cross-sectional Studies
• Baseline examination of randomized trials
– Cross-sectional study of health-related quality of life
in African Americans with chronic renal insufficiency:
the African American Study of Kidney Disease and
Hypertension Trial.
Am J Kidney Dis. 2002 Mar;39(3):513-24.
• Baseline examination of cohort studies
– Association of kidney function and hemoglobin with
left ventricular morphology among African Americans:
the Atherosclerosis Risk in Communities (ARIC) study.
Am J Kidney Dis. 2004 May;43(5):836-45.
Cross-sectional Studies
Strengths and Limitations
• Strengths
– Primary method of estimating prevalence
– Logistically efficient: relatively fast (no follow-up required); can enroll large numbers of participants
– Large surveys can be used for many exposures and diseases
– Often generalizable; can oversample smaller subpopulations
• Limitations
– Large numbers needed for rare exposures / outcomes
– No information on timing of outcome relative to exposure (temporality)
– Includes only those individuals alive at the time of the study: prevalence-incidence bias
Case-control studies
within a defined cohort
• Case-Cohort Studies
• Nested Case-Control Studies
Example of case-cohort study
Association between CMV antibodies and incident coronary heart
disease (CHD) in the Atherosclerosis Risk in Communities (ARIC)
Study
(Sorlie et al: Arch Intern Med 2000;160:2027-32)
Cohort: 14,170 adults (45-64 yrs at baseline) from 4 US
communities (Jackson, MS; Minneapolis, MN; Forsyth Co, NC;
Washington Co, MD), free of CHD at baseline.
Followed up for up to 5 years.
• Cases: 221 incident CHD cases
• Controls: Random sample from baseline cohort, n=515 (included 10
subsequent cases).
“The population with the highest antibody levels of CMV (approximately
the upper 20%) showed an increased relative risk (RR) of CHD of
1.76 (95% confidence interval, 1.00-3.11), adjusting for age, sex, and
race.”
Case-cohort study
N~14,000
Option 1: thaw serum samples of 14,000 persons, classify by CMV titer (+) or (-), and follow up to calculate incidence in each group (exposed vs. unexposed)
Option 2: Case-cohort study
[Diagram: initial population followed over time (5 years) to the final population.]
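The saving offered by Option 2 can be sketched with a small simulation. The cohort size, number of cases, and subcohort size are the figures quoted for the ARIC example; which participants become cases is simulated at random here, so the overlap between cases and the subcohort is hypothetical (the study reported 10), and all names are illustrative.

```python
import random

rng = random.Random(2000)  # fixed seed so the sketch is reproducible

COHORT_SIZE = 14_170     # baseline cohort
N_CASES = 221            # incident CHD cases during follow-up
SUBCOHORT_SIZE = 515     # random sample of the baseline cohort

cohort = range(COHORT_SIZE)

# Hypothetical: the identity of the cases is simulated, not taken from the study.
cases = set(rng.sample(cohort, N_CASES))

# The comparison group is sampled from the whole baseline cohort,
# regardless of later case status, so it may include future cases.
subcohort = set(rng.sample(cohort, SUBCOHORT_SIZE))

overlap = cases & subcohort
assays_needed = len(cases | subcohort)

print(f"cases also in subcohort: {len(overlap)}")
print(f"serum samples to thaw:   {assays_needed} instead of {COHORT_SIZE}")
```

Because the subcohort is a random sample of the baseline cohort, it can also be used to estimate exposure prevalence in the cohort and can be reused for other outcomes.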
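Nested case-control study (within a cohort)
(“Incidence density sampling”)
[Diagram: initial population followed over time to the final population, with cases and losses marked; controls are drawn from the “risk sets”.]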
Example of nested case-control study
Inflammatory Markers and CHD Risk (Pai JK, et al, New Eng J Med 2004;351:2599-610)
Cohorts: Nurses’ Health Study (30-55 yrs old, n= 121 700 nurses) and Health
Professionals Follow-up Study (40-75 yrs old, n= 51 529) (follow-up: 6 years and 8
years, respectively)
• Cases: 239 women and 265 men who developed an MI
• Controls: Selected by risk set sampling using 2:1 ratio, matched for age, smoking, and
date of blood sampling from participants free of cardiovascular disease at the time CHD
was diagnosed in cases.
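Risk set sampling can be sketched as follows: for each case, controls are drawn from cohort members still under follow-up and event-free at the time the case occurs. This is a simplified illustration on a hypothetical eight-person cohort; it ignores the matching on age, smoking, and date of blood sampling used in the study, and the function and variable names are assumptions.

```python
import random


def risk_set_sample(exit_time, is_case, controls_per_case=2, seed=0):
    """Draw controls for each case from those still at risk at the case's event time.

    exit_time: id -> time of event, loss, or end of follow-up
    is_case:   id -> True if the exit was the event of interest
    """
    rng = random.Random(seed)
    sampled = {}
    for case_id, t_case in exit_time.items():
        if not is_case[case_id]:
            continue
        # Risk set: everyone whose follow-up extends beyond the case's event time.
        # Future cases are eligible as controls until their own event.
        risk_set = [i for i, t in exit_time.items() if t > t_case]
        k = min(controls_per_case, len(risk_set))
        sampled[case_id] = rng.sample(risk_set, k)
    return sampled


# Hypothetical cohort followed for 5 years (times in years).
exit_time = {1: 2.0, 2: 5.0, 3: 3.5, 4: 5.0, 5: 1.0, 6: 4.2, 7: 5.0, 8: 2.8}
is_case = {1: True, 2: False, 3: True, 4: False, 5: False, 6: True, 7: False, 8: False}

matched_sets = risk_set_sample(exit_time, is_case)
for case_id, controls in matched_sets.items():
    print(f"case {case_id} (t = {exit_time[case_id]}): controls {controls}")
```

Participant 5, lost at year 1.0, never enters a risk set, which is how this design accounts for previous losses.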
Rate Ratios of Coronary Heart Disease* During Follow-up According to
Quintiles of C-Reactive Protein at Baseline, Nurses Health Study
(Women) and Health Professionals Study (Men)
                            Quintile of Plasma Level of C-Reactive Protein
                            1      2      3      4      5
Women   Median (mg/liter)   0.50   1.18   2.20   4.02   9.14
        Rate Ratios         1.0    1.23   0.89   1.22   1.61
Men     Median (mg/liter)   0.27   0.60   1.08   2.05   5.24
        Rate Ratios         1.0    1.75   1.74   2.14   2.55
*Adjusted for socio-demographic and cardiovascular risk factors
(Pai JK, et al, New Eng J Med 2004;351:2599-610)
ADVANTAGES OF EACH CASE-CONTROL DESIGN WITHIN THE COHORT
Case-cohort Design (controls: cohort sample)
• It allows estimating the prevalence of the risk factor in the cohort, and thus the population attributable risk:
\[ AR_{POP} = \frac{Prev_{RF}\,(RR - 1.0)}{Prev_{RF}\,(RR - 1.0) + 1.0} \times 100 \]
• It allows studying correlations between risk factors in the sample for variables not measured in the whole cohort; and
• One control group (the cohort sample) can be used for different outcomes.
Nested Case-Control Design (controls: “risk set”)
• It is best for time-dependent exposures;
• It automatically matches for length of follow-up (and for previous losses).
(Disadvantage: for each case, a different matched control sample must be chosen.)
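The population attributable risk formula is easy to evaluate once the prevalence of the risk factor and the relative risk are known. In the sketch below the inputs are borrowed from the ARIC case-cohort example (high CMV antibody levels in roughly the upper 20%, RR of 1.76) purely for illustration; the resulting percentage is not reported in the source, and the function name is an assumption.

```python
def population_attributable_risk_pct(prev_rf, rr):
    """Population attributable risk (%), from risk factor prevalence and RR."""
    excess = prev_rf * (rr - 1.0)
    return excess / (excess + 1.0) * 100


# Illustrative inputs: prevalence 0.20, relative risk 1.76.
ar_pop = population_attributable_risk_pct(0.20, 1.76)
print(f"AR_pop = {ar_pop:.1f}%")
```

A nested case-control design does not yield the prevalence term directly, because its controls are matched samples from risk sets rather than a random sample of the cohort.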
• When are nested designs (case-cohort or
nested case-control) the best choice?
In well defined cohorts when additional (expensive or burdensome)
information needs to be collected
– Laboratory determination in samples from specimen repository
(e.g., serum bank).
– Additional record abstraction (e.g., medical, occupational records).
• Analytical techniques (analogous to
methods used in cohort studies, matched
case-control studies) are available.
A special type of case-control study: the case-crossover study
• Useful when exposures that vary over time can precipitate acute
events, such as sudden cardiac deaths, asthma episodes, etc.
• Cases serve as their own controls: The subject’s time of event of
interest (e.g., death) is the case period, and the subject’s other times
comprise the control period
• Advantages:
– Each participant is considered a matched stratum in a case-control study (self-matching) where “cases” and “controls” are
case and control times (no control selection bias)
– Self-matches for confounding variables that do not change over
time (sex, genetic factors, etc.)
• Disadvantages:
– Assumes no “carry over” (cumulative) effect of exposure of
interest
– Assumes no confounding or interaction by time-related variables
(e.g., ambient temperature, day of the week)
• Challenges:
– Lag time must be taken into account (relevant exposure period)
A special type of case-control study: the case-crossover study. Example: Valent et al, Pediatrics
2001;107:e23
• Objective: to evaluate the association between sleep (and
wakefulness) duration and childhood unintentional injury
• Sample: 292 unintentionally injured children
• Case period: 24 hours preceding injury
• Control period: 25-48 hours preceding injury
• Definition of exposure: Child slept <10 hours
• Analysis: matched-pair and conditional logistic regression
• Adjustment: for day of the week (weekend vs. weekday) and activity
risk level (higher vs. lower level of energy)
Odds Ratios and 95% CIs for Sleeping Less than 10 Hours
Study subjects   n     Ca+Co+   Ca+Co-   Ca-Co+   Ca-Co-   OR     95% CI
All cases        292   62       26       14       190      1.86   0.97, 3.55
Boys             181   40       21       9        111      2.33   1.07, 5.09
Girls            111   22       5        5        79       1.00   0.29, 3.45
(Valent et al, Pediatrics 2001;107:e23)
Ca = case period, Co = control period; + exposed, - unexposed.
For ascertainment of exposure:
• Case period: 24 hours preceding injury
• Control period: 25-48 hours preceding injury
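The odds ratios in the table depend only on the discordant pairs (exposed in the case period but not the control period, and the reverse). A sketch of the matched-pair calculation, assuming the usual ratio of discordant pairs with a log-based 95% confidence interval; the authors may have used a different interval method, and the function name is illustrative.

```python
from math import exp, log, sqrt


def matched_pair_or(case_only, control_only, z=1.96):
    """Matched-pair OR (ratio of discordant pairs) with a log-based CI."""
    odds_ratio = case_only / control_only
    se_log_or = sqrt(1 / case_only + 1 / control_only)
    lower = exp(log(odds_ratio) - z * se_log_or)
    upper = exp(log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Discordant pairs from the table: (Ca+Co-, Ca-Co+).
discordant = {"All cases": (26, 14), "Boys": (21, 9), "Girls": (5, 5)}

results = {group: matched_pair_or(*pairs) for group, pairs in discordant.items()}
for group, (or_, lo, hi) in results.items():
    print(f"{group:10s} OR = {or_:.2f} (95% CI {lo:.2f}, {hi:.2f})")
```

Under this assumption the results agree with the published odds ratios and limits to within rounding. The concordant pairs (62 and 190 for all cases) contribute nothing to the estimate, which is why only within-person changes in sleep duration carry information.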
Threats to Validity in Case-Crossover Studies
(Maclure M, Am J Epidemiol 1991;133:144-53)
• Within-individual confounding
– No confounding by the individual’s characteristics that remain constant, but
there can be confounding by variables that vary over time.
• Example: A person who drinks coffee only on colder days. If colder days
precipitate the event (e.g., angina pectoris), the association with coffee drinking
can be explained away by the fact that the day was colder.
• Selection bias
– Case-crossover study of incident nonfatal myocardial infarction and anger
episode (Moller et al, Psychosom Med 1999;61:842-9)
– “Survival bias implies that if cases being exposed to anger have a
better prognosis for surviving MI than those not exposed to anger, a
study of only nonfatal cases would overestimate the relative risk of
MI. Likewise, if cases exposed to anger right before their MI are less
inclined to participate, this would result in an underestimation.”
Threats to Validity in Case-Crossover Studies
(cont.) (Maclure M, Am J Epidemiol 1991;133:144-53)
• Information bias
– Recall bias: When interviews are done at the time of the event, quality of the
information obtained from the patient (or a proxy) about the case (hazard)
period may differ from that about the control period (e.g., when the case period
is the 24-hour period preceding the event, and the control period is the 25- to 48-hour period preceding the event)
• Bias can go in either direction:
– Faulty memory regarding the control period
– Exaggeration or denial of exposure in the case period
• External validity
– “In principle, generalizable to all acute-onset outcomes hypothesized to be
caused by brief exposures with transient effects.” (Maclure)