Biostatistics in Cancer Control
Donna McClish
Cohort Study
• Observational study
• Comparison groups are identified
according to exposure status
• Subjects must be free of the disease of
interest at entry
• Followed over time
• Development of disease and length of time
observed (person-time) are noted
Cohort Study
               Disease    No Disease    Total
Exposure                                n1
No Exposure                             n2
Types of Cohort Studies
• Concurrent (prospective) – exposure
status assessed in present, prior to
disease development
– Advantages include assessing exposures in
common manner, not subject to recall
– Disadvantages include length of time needed
to complete study if disease has long latency
along with accompanying loss of patients
Types of cohort studies
• Nonconcurrent (retrospective or historic) –
exposure status assessed from previously
recorded information. (Outcome status
may be measured in past or present)
– Advantages include being less costly and less
time consuming
– Disadvantages include assessment methods
that are less than ideal or lacking in uniformity
Cohort studies – single cohort
• Select a single population, based on factors
unrelated to exposure (specific community,
membership in an organization, etc)
• Determine exposure status
• Advantages – many different exposures can be
assessed; different exposure groups are more
likely to be comparable; generalizability is
improved
• Disadvantage – inefficient in terms of sample size;
a very large sample is needed if exposure is rare
Cohort studies – multiple cohorts
• Select 2 or more cohorts based on exposure
status of risk factor of interest
• Sometimes the “unexposed” sample is an
external population such as the entire US
• Advantage – handles rare exposures without
requiring unreasonable sample size
• Disadvantage – hard to find cohorts that are
comparable except for exposure status; may be less
generalizable
Nurses Health Study
• Single cohort, concurrent
• In 1976, recruited registered nurses ages 30-55 working
in 11 most populous states (N=121,700 out of about
170,000 invited)
• Questionnaires about diet and risk factors every 2 years
– Added QOL questions in 1992 (every 4 years)
– Toenail samples taken in 1982, 1984 (mineral assessment)
– Blood samples (for biomarkers) in 1989-90, 2000-01
• Nurses were asked about occurrence of outcomes,
confirmed by medical record
• Response rates 90%
Nurses Health Study II
• Added second cohort of women aged 25-42 in 1989, n=116,686
• Particular interest in oral contraceptives,
diet, lifestyle risk factors
• Blood and urine samples taken for 30,000
in late 1990s
• 90% response rates
Nurses Health Study Results
• No relationship between dietary fiber and
colon cancer
• Risk of colon cancer can be reduced by physical
activity, somewhat by daily folic acid supplement
and aspirin.
• High calcium and Vitamin D may be protective
Nurses Health Study Results
• Birth control pills, having at least 2 children, and
eating fruits and vegetables are protective against
ovarian cancer
• Being overweight, consuming too many dairy
products, and using talcum powder on genitals may
increase risk of ovarian cancer
Nurses Health Study Results
• High estrogen levels, drinking alcohol, using
testosterone supplements, gaining 45 pounds since age
18, and having an apple-shaped body are all related to
increased risk of breast cancer
• Smoking, abortion, organochlorine chemicals and hair
dyes are not related to breast cancer
• Eating fruits and vegetables (particularly with vitamin A
and beta carotene) and exercising (particularly if
postmenopausal) are protective against breast cancer
Nurses Health Study Results
2005
• Drinking at least 2 alcoholic beverages/day
modestly increases risk of colorectal cancer
• Walking at least 90 minutes/week at ages 50-60
improves memory at ages 70+
• Drinking sugar-sweetened sodas results in
weight gain and increased risk of diabetes
• High levels of estrogen and androgens in
postmenopausal women increase risk of breast
cancer
Nurses Health Study Results
2006
• Nurses who began hormone therapy near
menopause had 30% lower risk for heart
disease than women who didn’t use
hormones.
• Nurses who started taking hormones at
least 10 years after menopause didn’t
have any benefit.
FELS Longitudinal Study
• Began in 1927
• Enrolls subjects during pregnancy
• Now has children, grandchildren, great-grandchildren, etc. enrolled
• Concentrates on physical growth and maturation,
skeletal and dental data, and body composition
• Collect data: 5/yr in 1st year; 2/yr until age 5, 1/yr
until age 8; 2/yr until puberty; 1/yr until 21; every
2 years in adulthood
Multiple cancer study
• Double cohort - nonconcurrent
• Exposed are those on SEER registries
• Unexposed are general population in
SEER areas (using census data)
• Outcome is occurrence of a second
primary cancer at selected sites
• Standardized Incidence Ratios (SIRs)
calculated and compared
Multiple cancer study
• Exposures include age, race, sex, marital
status, stage, site and histology of initial
cancer, treatment
• If SEER-Medicare data used, then
exposures also include comorbidity and
SES variables such as income and
education composition of the area where
the subject lives
Multiple cancer study results
• Very few differences between males and
females
• Many differences between old and young
(young had higher SIRs)
• Most cancers had significantly higher (or
lower) risk than the general population
Multiple cancer study results
• Second primary sites with > 50% increase in risk
for all age/gender strata included: urinary tract,
upper aerodigestive, melanoma and small bowel
• Large increases just for women: breast, non-Hodgkin lymphoma
• Large increases for men: kidney, endocrine
• Large increases for those younger than
65: testicular cancer, leukemia, lung, connective
tissue
Multiple cancer study results
• Initial primary sites with > 50% increase in risk of
a second primary for all age/gender strata:
melanoma, larynx, kidney or upper
aerodigestive
• Except for older males, large increases for those
with initial urinary cancer or Hodgkin disease
• Large increases or decreases just for women
(uterus), just for men (testis), just for those younger
than 65 (esophagus, lung, ovary, other female)
or just for those over 65 (pancreas).
Multiple cancer study results
• Women with breast cancer who have
radiation treatment are at higher risk for a
second cancer in the radiation field
(breast, trachea, lung, esophagus)
Rural Physician Cancer Prevention Project (RNP)
• Randomized trial of low intensity dietary
intervention in rural residents
• Recruited from 3 physician practices: physicians
“endorsed” the project
• Subjects in intervention group were mailed
information booklets along with individualized
dietary feedback on fat and fiber (based on the Fat
and Fiber Behavior-related Questionnaire
[FFB])
FIBERR Project
• Single cohort study
• Evaluate feasibility of recruiting first
degree relatives of rural colon cancer
patients
• Same intervention as Rural Nutrition
Project
• Same data collection as Rural Nutrition
Project
Combined RNP and FIBERR
• Combine data from intervention group of
RNP with FIBERR to create a multiple
cohort study
• Research question: will relatives of colon
cancer patients be more motivated to
improve their diet?
• Could this have been done as a
randomized study?
Cohort study – exposure measurement issues
• Exposure should be measured same way
on everyone
• Exposure needs to be assessed prior to
development of disease
• Exposures that change over time (e.g.,
diet, environment) make it hard to
establish that exposure was assessed
before even subclinical disease developed
Cohort study – disease measurement issues
• Definition of disease should be standardized
• Disease assessment should be blinded to
exposure
• Assessments can be by periodic exam, surveys
of hospitals, registries, death certificates
• Minimize loss to follow-up
• Knowledge of latency needed to determine
appropriate length of follow-up
Cohort study risk measure
• Disease rate (incidence rate) estimated as
number with newly diagnosed disease
divided by person years of observation
• Relative risk (RR) is the ratio of the incidence rate in
the exposed population to that in the unexposed population
• RR>1 implies increased risk, RR<1 implies
protective factor
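A minimal Python sketch of the two definitions above. The function names and all counts are hypothetical, chosen only for illustration; they are not data from any study cited in these slides.

```python
def incidence_rate(new_cases, person_years):
    """Newly diagnosed cases divided by person-years of observation."""
    if person_years <= 0:
        raise ValueError("person_years must be positive")
    return new_cases / person_years


def relative_risk(cases_exposed, py_exposed, cases_unexposed, py_unexposed):
    """Incidence rate in the exposed divided by the rate in the unexposed."""
    rate_unexposed = incidence_rate(cases_unexposed, py_unexposed)
    if rate_unexposed == 0:
        raise ValueError("RR is undefined when the unexposed rate is zero")
    return incidence_rate(cases_exposed, py_exposed) / rate_unexposed


# Hypothetical cohort: 30 new cases in 10,000 person-years among the exposed,
# 15 new cases in 10,000 person-years among the unexposed.
rate_exposed = incidence_rate(30, 10_000)    # 3 cases per 1,000 person-years
rate_unexposed = incidence_rate(15, 10_000)  # 1.5 cases per 1,000 person-years
rr = relative_risk(30, 10_000, 15, 10_000)   # RR of 2: increased risk
```

With these made-up counts the exposed group has twice the incidence rate of the unexposed group, so RR>1.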
Cohort Study risk measures
• Standardized Incidence Ratio (SIR)
compares observed number of events in
the exposed group to expected number of
events
• Expected number of events calculated by
multiplying the person years of
observation in the exposed group by the
incidence rate of the standard population
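The SIR calculation above can be sketched roughly as follows. Splitting the person-years into age strata is an assumption added here for illustration (the slide describes a single multiplication); the function names, rates, and counts are all hypothetical.

```python
def expected_events(person_years_by_stratum, standard_rate_by_stratum):
    """Sum over strata of exposed person-years times the standard rate."""
    return sum(
        person_years_by_stratum[stratum] * standard_rate_by_stratum[stratum]
        for stratum in person_years_by_stratum
    )


def standardized_incidence_ratio(observed, expected):
    """Observed events in the exposed group divided by expected events."""
    if expected <= 0:
        raise ValueError("expected number of events must be positive")
    return observed / expected


# Hypothetical exposed group followed for 10,000 person-years in total.
person_years = {"under 65": 4_000, "65 and over": 6_000}
# Hypothetical incidence rates (per person-year) in the standard population.
standard_rates = {"under 65": 0.001, "65 and over": 0.004}

expected = expected_events(person_years, standard_rates)  # 4 + 24 = 28 events
sir = standardized_incidence_ratio(42, expected)          # 42 / 28 = 1.5
```

An SIR above 1, as in this made-up example, means more events were observed in the exposed group than the standard population's rates would predict.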
Case-control study
• Comparison groups are selected
according to disease status: cases are
subjects with disease and controls are
subjects without disease
• Determine exposure status of each subject
Case-control Study
               Disease    No Disease
Exposure
No Exposure
Total          m1         m2
DES
• 8 cases of clear cell adenocarcinoma of the
vagina seen over short period in Boston in 15-22
year old women
• Matched each case with 4 controls born within 5
days on the same service
• Assessed exposure of parents and child
• Mothers of 7 of the 8 cases reported taking DES
during the first trimester; none of the controls had
the exposure
Case-control study – choice of cases
• Ideally, cases would be a random sample of
everyone who has the disease of interest
• Common sources of cases are hospitals or
registries
• Hospital cases may not be representative: they may be
more severe (if those with subclinical disease do not
seek treatment) or less severe (if the most severe cases
die prior to admission to hospital)
Case-control – choice of controls
• Controls should be as comparable with cases as
possible (internal validity) or representative of
population of individuals without cancer in terms
of exposure
• Population controls
– advantage: are representative
– disadvantage: sampling frame may be hard to get
– random digit dialing used to get representative
sample of those with phones
Case-control – choice of controls
• Hospital patients
– advantage: lists of eligible patients available
– disadvantage: hard to determine that they are
comparable to cases
• Neighbors, friends and relatives
– advantage: comparable in terms of SES,
lifestyle or genetics
– disadvantage: may not be representative
Case-control study – exposure assessment
• Exposure measured by interview,
questionnaire or medical record
• Knowledge of latency period needed to
determine appropriate time frame for
exposure assessment (at least 15 years
for epithelial tumors; less than 10 years for
hematopoietic neoplasms; may be less if there is a
genetic predisposition)
Case-control – bias in measuring exposure
• Recall bias – particularly bad if related to
disease status
• Knowledge of disease status may influence
selective gathering of assessment data (can be
controlled by blinded assessment)
• Different settings for assessment (e.g., hospital
for cases vs home for controls)
• Interviewing cases and controls at different time
points
• Interviewing surrogates
Case-control risk measure
• Odds ratio (OR) in case-control study compares
the odds of exposure among cases to the odds
of exposure among controls
– Odds of an event is the ratio of the number of ways
an event can occur to the number of ways it cannot
occur (ratio of probability of exposure to the
probability of no exposure)
– OR>1 implies increased risk, OR<1 implies protective
factor.
– For rare disease, OR approximates RR
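As a rough illustration of the points above, the sketch below computes an OR from a 2x2 table. The function name and the counts in the first two examples are hypothetical. The third example uses the DES counts quoted earlier in these slides (7 of 8 cases exposed, none of the controls, taking 4 controls per case to give 32), ignores the matching, and adds 0.5 to every cell (the Haldane-Anscombe correction) because a zero cell leaves the plain OR undefined. That correction is a choice made here for illustration, not a method described in the slides or attributed to the original study.

```python
def odds_ratio(exposed_cases, unexposed_cases,
               exposed_controls, unexposed_controls, correction=0.0):
    """Odds of exposure among cases divided by odds of exposure among controls.

    `correction` is added to every cell; 0.5 is the Haldane-Anscombe choice,
    an assumption used here only when a cell is zero.
    """
    a = exposed_cases + correction
    b = unexposed_cases + correction
    c = exposed_controls + correction
    d = unexposed_controls + correction
    if b == 0 or c == 0 or d == 0:
        raise ZeroDivisionError("OR is undefined with an empty cell")
    return (a / b) / (c / d)


# 1. Hypothetical study: 40 of 100 cases and 20 of 100 controls were exposed.
or_hypothetical = odds_ratio(40, 60, 20, 80)   # (40/60) / (20/80), about 2.67

# 2. Rare disease, hypothetical full cohort: 20 of 10,000 exposed and
#    10 of 10,000 unexposed people develop disease.
rr_rare = (20 / 10_000) / (10 / 10_000)        # ratio of risks, 2
or_rare = odds_ratio(20, 10, 9_980, 9_990)     # about 2.002, close to the RR

# 3. DES counts from the slides, unmatched, with 0.5 added to each cell.
or_des = odds_ratio(7, 1, 0, 32, correction=0.5)   # about 325
```

In the second example the OR and the risk ratio nearly coincide because almost everyone in both exposure groups remains disease-free; the risk ratio there is computed from cumulative incidence rather than person-years, which is a simplification.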
Advantages of case-control study
• Good for rare diseases
• Relatively quick to do
• Requires fewer subjects than cohort study
• Less costly
• Sometimes can use existing records
• Can study multiple exposures
Disadvantages of case-control study
• Relies on recall or records for past exposure
• Validation of exposure information is difficult or
impossible
• Control of extraneous variables may be
incomplete
• Selection of appropriate control group is difficult
• Can’t determine incidence or RR
• Can only assess one outcome
Advantages of cohort study
• Can get complete description of the
experience occurring after exposure
• Can better assess exposure if prospective
• Can study multiple outcomes
• Better chance of establishing that
exposure occurred before the outcome
• Can assess incidence and RR
Disadvantages of cohort study
• If disease is rare – very large sample is
needed
• May require very long follow-up time
• Hard to maintain follow-up
• Expensive