Case-Control Studies

CASE-CONTROL STUDIES
Nigel Paneth
EVOLUTION OF THE CASE-CONTROL STUDY
1. CASE
What is a case?
Consolidating several different signs
and symptoms into "caseness" was a
key development in medicine.
(for more details see: Paneth N, Susser E,
Susser M: The early history and development of the
case-control study. Social & Preventive Medicine
2002; 47: 282-288 and 359-365)
2. CASE-SERIES
•Aggregating many individual cases into
a group, and describing the features of
the group, began in earnest in the 18th
century.
•Key figure - PCA Louis in France.
"The numerical method".
•Currently perhaps the single
commonest kind of medical article.
3. CASE-CONTROL STUDY
• In its simplest form, comparing a case
series to a matched control series.
• Possibly the first c-c study was by
Whitehead in Broad Street pump episode,
1854 (Snow did not do a c-c study).
• First modern c-c study was Janet Lane-Claypon’s study of breast cancer and
reproductive history in 1926.
• Four c-c studies implicating smoking and
lung cancer appeared in 1950, establishing
the method in epidemiology.
FEATURES OF CASE-CONTROL STUDIES
1. DIRECTIONALITY:
Outcome to exposure
2. TIMING:
Retrospective for exposure, but
case-ascertainment can be either
retrospective or concurrent.
3. SAMPLING:
Almost always on outcome, with
matching of controls to cases
TWO CHARACTERISTICS
OF CASES
1. REPRESENTATIVENESS:
Ideally, cases are a random sample
of all cases of interest in the source
population (e.g. from vital data,
registry data). More commonly they
are a selection of available cases
from a medical care facility.
(e.g. from hospitals, clinics)
2. METHOD OF SELECTION
Selection may be from incident or
prevalent cases:
• Incident cases are those derived
from ongoing ascertainment of cases
over time.
• Prevalent cases are derived from a
cross-sectional survey.
CHARACTERISTICS OF CONTROLS
• Who is the best control?
• Where should controls come from?
• If cases are a random sample of all cases
in the population, then controls should be
a random sample of all non-cases in the
population sampled at the same time
(i.e. from the same study base)
• But if study cases are not a random
sample of the universe of all cases, it is
not likely that a random sample of the
population of non-cases will constitute a
good control population.
THREE QUALITIES NEEDED IN
CONTROLS
• Key concept: Comparability is more
important than representativeness in the
selection of controls
• The control must be at risk of getting the
disease.
• The control should resemble the case in
all respects except for the presence of
disease
COMPARABILITY VS.
REPRESENTATIVENESS
Usually, study cases are not a
random sample of all cases in the
population, and therefore controls
must be selected so as to mirror the
same biases that entered into the
selection of cases
It follows from the above that a pool
of potential controls must be defined.
This pool must mirror the study base
of the cases.
STUDY BASE
Therefore, imagining the study base
is a useful exercise before deciding
on control selection.
The study base is composed of a
population at risk of exposure over a
period of risk of exposure.
Cases emerge within a study base.
Controls should emerge from the
same study base, except that they
are not cases.
For example, if cases are selected
exclusively from hospitalized
patients, controls must also be
selected from hospitalized patients.
• If cases must have gone through a certain
ascertainment process (e.g. screening),
controls must have also.
(e.g. mammogram-detected breast cancer)
• If cases must have reached a certain age
before they can become cases, so must
controls. (thus we always match on age)
• If the exposure of interest is cumulative over
time, the controls and cases must each have
the same opportunity to be exposed to that
exposure. (if the case has to work in a
factory to be exposed to benzene, the control
must also have worked where he/she could
be exposed to benzene)
SIX ISSUES IN MATCHING CONTROLS
IN CASE-CONTROL STUDIES
1. Identify the pool from which controls
may come. This pool is likely to reflect
the way controls were ascertained
(hospital, screening test, telephone
survey).
2. Control selection is usually through
matching.
Matching variables (e.g. age), and
matching criteria (e.g. control must be
within the same 5 year age group) must
be set up in advance.
3. Controls can be individually matched or
frequency matched
INDIVIDUAL MATCHING: search for one (or
more) controls who have the required
MATCHING CRITERIA. PAIRED or TRIPLET
MATCHING is when there is one or two
controls individually matched to each case.
FREQUENCY MATCHING: select a population
of controls such that the overall
characteristics of the group match the overall
characteristics of the cases. e.g. if 15% of
cases are under age 20, 15% of the controls
are also.
4. AVOID OVER-MATCHING. Match only on
factors known to be causes of the disease.
5. Obtain POWER by matching MORE THAN
ONE CONTROL PER CASE. In general, N of
controls should be ≤ 4, because there is little
further gain in power beyond four controls per
case.
6. Obtain GENERALIZABILITY by matching
more than ONE TYPE OF CONTROL
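Two of the points above lend themselves to a short illustration. The sketch below uses Python, which is an assumption since the slides name no software. It frequency-matches controls to the age-group distribution of the cases, and it tabulates the commonly quoted approximation that m controls per case give a relative efficiency of about m/(m+1) compared with an unlimited number of controls. The function names, age groups and data are all hypothetical.

```python
import random
from collections import Counter

def frequency_match(case_groups, control_pool, n_controls, seed=0):
    """Draw controls so each age group's share matches its share among cases.

    case_groups: list of age-group labels, one per case.
    control_pool: dict mapping age-group label -> list of candidate control ids.
    """
    rng = random.Random(seed)
    counts = Counter(case_groups)
    n_cases = len(case_groups)
    selected = []
    for group, k in counts.items():
        quota = round(n_controls * k / n_cases)  # same proportion as in cases
        selected.extend(rng.sample(control_pool[group], quota))
    return selected

def relative_efficiency(m):
    """Approximate efficiency of m controls per case vs. unlimited controls."""
    return m / (m + 1)

# Hypothetical data: 15% of cases are under age 20.
cases = ["<20"] * 15 + ["20+"] * 85
pool = {"<20": list(range(0, 100)), "20+": list(range(100, 600))}
controls = frequency_match(cases, pool, n_controls=200)
share_under_20 = sum(1 for c in controls if c < 100) / len(controls)
print("share of controls under 20:", share_under_20)

for m in range(1, 7):
    print(m, "controls per case:", round(relative_efficiency(m), 3))
```

Under this approximation the step from three to four controls per case already adds little, and each further control adds less.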
ADVANTAGES AND DISADVANTAGES OF
C-C STUDIES
Advantages:
1. only realistic study design for
uncovering etiology in rare diseases
2. important in understanding new
diseases
3. commonly used in outbreak
investigation
4. useful if induction period is long
5. relatively inexpensive
Disadvantages:
1. Susceptible to bias if not carefully
designed
(and matched)
2. Especially susceptible to exposure
misclassification
3. Especially susceptible to recall bias
4. Restricted to single outcome
5. Incidence rates not usually calculable
6. Cannot assess effects of matching
variables
EXAMPLES OF PROBLEMS
• Doll’s 1951 study of smoking and lung
cancer. The problem was that the control
population (lung diseases other than
cancer) was biased in relation to the
exposure.
• MacMahon’s 1981 study of coffee and
pancreatic cancer. Problem was that some
of the controls may have been biased in
relation to the exposure, because gastrointestinal diseases were not excluded from the
control series, and patients with these diseases
may have reduced their coffee intake on
medical advice or because of symptoms.
SOME IMPORTANT DISCOVERIES
MADE IN CASE CONTROL STUDIES
1950's
• Cigarette smoking and lung cancer
1970's
• Diethylstilbestrol and vaginal
adenocarcinoma
• Post-menopausal estrogens and
endometrial cancer
1980's
• Aspirin and Reye’s syndrome
• Tampon use and toxic shock syndrome
• L-tryptophan and eosinophilia-myalgia
syndrome
• AIDS and sexual practices
1990's
• Vaccine effectiveness
• Diet and cancer
BASIC ANALYSIS OF CASE
CONTROL STUDIES
FOR ONE CONTROL
Data is expressed in a four-fold table, and
an odds ratio is calculated (relative risks
have no meaning here – why?).
             Cases   Controls
Exposed        a        b
Unexposed      c        d

OR = ad/bc
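As a minimal sketch of the calculation, the Python function below computes the odds ratio from the four cell counts (a = exposed cases, b = exposed controls, c = unexposed cases, d = unexposed controls). The 95% confidence interval uses Woolf's log-based method, which is an addition not mentioned in the slides, and the counts are hypothetical.

```python
import math

def odds_ratio(a, b, c, d):
    """Odds ratio and Woolf (log-based) 95% CI for a four-fold table.

    a: exposed cases, b: exposed controls,
    c: unexposed cases, d: unexposed controls.
    """
    or_ = (a * d) / (b * c)
    # Standard error of log(OR); an assumption beyond the slides.
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(or_) - 1.96 * se_log)
    upper = math.exp(math.log(or_) + 1.96 * se_log)
    return or_, lower, upper

# Hypothetical counts, for illustration only.
est, lo, hi = odds_ratio(a=40, b=20, c=60, d=80)
print(f"OR = {est:.2f}, 95% CI {lo:.2f} to {hi:.2f}")
```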
PAIRED ANALYSIS
FOR ONE CONTROL
Data is expressed in a four-fold table, and the
number of concordant and discordant pairs
are calculated. Test is McNemar’s chi squared
test for paired data.
                          Case
                     Exposed   Unexposed
Controls  Exposed     Both      Mixed
          Unexposed   Mixed     Neither
PAIRED ANALYSIS
FOR ONE CONTROL
                          Case
                     Exposed   Unexposed
Controls  Exposed       r         s
          Unexposed     t         u

McNemar chi-squared: $\chi^2 = \frac{(t - s)^2}{t + s}$
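A small Python sketch of the paired analysis may help. It uses only the discordant pairs: s (control exposed, case unexposed) and t (case exposed, control unexposed). It computes the McNemar statistic as (t - s)^2 / (t + s), without continuity correction, together with the matched-pair odds ratio t/s. The function name, the p-value step and the counts are illustrative assumptions rather than anything given in the slides.

```python
import math

def mcnemar(s, t):
    """McNemar chi-squared (1 df, no continuity correction), its p-value,
    and the matched-pair odds ratio.

    s: pairs with control exposed, case unexposed.
    t: pairs with case exposed, control unexposed.
    """
    chi2 = (t - s) ** 2 / (t + s)
    # Upper tail of a chi-squared distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    matched_or = t / s
    return chi2, p_value, matched_or

# Hypothetical discordant-pair counts.
chi2, p_value, matched_or = mcnemar(s=10, t=25)
print(f"chi2 = {chi2:.2f}, p = {p_value:.4f}, matched OR = {matched_or:.2f}")
```

The concordant pairs (r and u) carry no information about the association and do not enter either quantity.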
MORE POINTS ABOUT
CASE-CONTROL ANALYSIS
• The odds ratio is a good estimate of the
relative risk when the disease is rare
(prevalence < 20%).
• Can be extended to N > 1 controls.
• statistical testing is by simple chi-square
(unmatched analysis) or by McNemar’s chi
square (matched-pairs analysis).
• Can be extended to multiple strata
(Mantel-Haenszel chi-square)
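The first point above can be checked with simple arithmetic. The sketch below, with hypothetical risks chosen so that the true relative risk is always 2, shows how the odds ratio drifts away from the relative risk as the disease becomes more common.

```python
def rr_and_or(risk_exposed, risk_unexposed):
    """Relative risk and odds ratio implied by two cohort risks."""
    rr = risk_exposed / risk_unexposed
    odds_e = risk_exposed / (1 - risk_exposed)
    odds_u = risk_unexposed / (1 - risk_unexposed)
    return rr, odds_e / odds_u

# Hypothetical baseline risks; the exposed risk is always twice the baseline.
for baseline in (0.001, 0.01, 0.05, 0.10, 0.20):
    rr, or_ = rr_and_or(2 * baseline, baseline)
    print(f"baseline risk {baseline:.3f}: RR = {rr:.2f}, OR = {or_:.2f}")
```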
THEORETICAL FOUNDATION
of case-control studies
per MacMahon and Trichopoulos
1. "Case-control studies should be viewed
as efficient sampling schemes of the
disease experience of the underlying open
or closed cohorts" (MacMahon &
Trichopoulos, p. 230)
2. "The exposure odds ratio derived from
case-control studies equals the disease
odds ratio derived from cohort studies"
(p.231)
3. The incidence rate ratio:
$$\frac{X_e / T_e}{X_o / T_o}$$
can also be written as:
$$\frac{X_e / X_o}{T_e / T_o}$$
4. "In a case-control study based on a
dynamic population, Xe and Xo (exposed
and unexposed cases) are directly
ascertained, and the ratio Te/To can be
estimated in an unbiased way not
dependent on any rare disease
assumption by the ratio of exposed
versus unexposed prevalent individuals
at risk in the study base (the total study
period cancels out)."
5. "any particular group of prevalent
individuals at risk for the disease in the
source population during the study
period (i.e. the study base) that correctly
reflects the ratio of exposed to
unexposed person-time in this
population over this period can be used
for this purpose."
6. "To the extent that Ye/Yo (the exposure
odds among the controls) is an unbiased
estimate of Te/To, controls may be viewed
as reflecting the person-time by
exposure status," (p.231)
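The argument in points 3 to 6 can be illustrated with a small worked example in Python. All numbers are invented, and the controls are assumed to be sampled in exact proportion to person-time, which is the idealized situation the quotations describe. Under that assumption the exposure odds ratio from the case-control data reproduces the incidence rate ratio from the full study base.

```python
# Hypothetical study base (all numbers invented for illustration).
Te, To = 20_000.0, 80_000.0   # exposed / unexposed person-years at risk
Xe, Xo = 60, 80               # exposed / unexposed incident cases

# Incidence rate ratio computed from the full study base.
irr = (Xe / Te) / (Xo / To)

# Controls sampled in proportion to person-time: 1 per 200 person-years.
Ye, Yo = Te / 200, To / 200   # exposed / unexposed controls

# Exposure odds ratio, as estimated by the case-control study.
exposure_or = (Xe / Xo) / (Ye / Yo)

print("IRR from study base:      ", round(irr, 3))
print("OR from case-control data:", round(exposure_or, 3))
```

No rare-disease assumption is used here: the agreement depends only on Ye/Yo matching Te/To.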