cohort - Boston University

EP711
COHORT STUDIES
Types of Epidemiologic Studies
• Experimental
  – Randomized controlled trials
• Observational
  – Cohort
  – Case-control
Cohort Studies
Definition:
A study in which two or more groups of people who are free of
disease and who differ in their extent of exposure
(e.g., exposed and unexposed) are compared with respect to
disease incidence.
Cohort studies are the observational equivalent of
experimental studies, but the researcher cannot allocate
exposure; he or she must locate a natural experiment in which
to observe the relationship between the exposure and the disease.
Randomized Controlled Trials
[Diagram: identify the study population (cohort); assign subjects to Exposed (active intervention) or Unexposed (comparison); then ask: outcome?]
Design:
• Investigator randomly assigns exposure (treatment)
• Subjects are then observed over time for the subsequent outcome
Cohort Studies
[Diagram: identify the cohort; group subjects as Exposed (smokers) or Unexposed (non-smokers); then ask: outcome?]
Design:
• Non-diseased subjects are grouped based on presence of exposure
• The subsequent outcome (e.g., disease) is then determined
Example: Is smoking associated with lung cancer?
Advantages of Cohort Studies
• Temporal sequence between exposure & disease is
  clear (e.g., smoking preceded cancer)
• Can directly calculate incidence, RD, PRD
• Good for looking at rare exposures or unusual risk
  factors (e.g., Agent Orange)
• Can evaluate multiple effects of a single factor
Retrospective Cohort Study
[Timeline: Past, Start of Study, Future. The cohort (have factor vs. don't have factor) and the comparison of incidence both lie in the past, before the start of the study.]
• Cheaper, faster
• Efficient for diseases with long latent periods
• Exposure data may be inadequate (limitation)
Prospective Cohort Study
[Timeline: Past, Start of Study, Future. The cohort (have factor vs. don't have factor) is assembled at the start of the study, and incidence is compared in the future.]
• More expensive, time consuming
• Not efficient for diseases with long latent periods
• Better exposure and confounder data
• Less vulnerable to bias
Ambidirectional Cohort Study
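[Timeline: Past, Start of Study, Future. Retrospective part: groups that have or don't have the factor are compared on incidence up to the start of the study. Prospective part: the same groups are compared on incidence from the start of the study into the future.]
Contains elements of both types of studies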
Types of Cohort Populations
• Open or Dynamic
  – Changeable characteristic
  – Members come and go
  – Losses may occur
  – Examples: never married; residents of Boston; aged 25-54
• Fixed
  – Irrevocable event
  – Does not add new members
  – Losses may occur
  – Examples: Baby Boomers, 9/11 survivors, RCT participants
• Closed
  – Irrevocable event
  – Does not add new members
  – No losses occur
  – Examples: church picnic or wedding attendees
Selection of Study Population
Choice depends upon hypothesis under study and feasibility
considerations
• For common risk factors (obesity, HBP):
  – A cohort from the general population
    (e.g., Framingham Heart Study, NHANES)
  – A special study group, e.g., doctors or nurses
    (e.g., the Nurses' Health Study, Black Women's Health Study)
• For unusual risk factors:
  – A special (rare) exposure group
    (e.g., Agent Orange, Hiroshima, occupational)
The Framingham Heart Study
• Initiated by NHLBI
• Objective was to identify the common factors or characteristics that
contribute to CVD by following healthy individuals
• The researchers recruited 5,209 men and women between the ages of 30
and 62 from the town of Framingham, Massachusetts
• Since 1948, the subjects have continued to return to the study every two
years for a detailed medical history, physical examination, and laboratory
tests
• In 1971, the study enrolled a second generation - the original
participants' adult children and their spouses
• In April 2002 the Study entered a new phase: the enrollment of a third
generation of participants, the grandchildren of the original cohort.
The Nurses' Health Study
• 2 cohorts that differ by age
• NHS I
  – Assembled in 1976
  – ~122,000 female nurses aged 30-55 years
• NHS II
  – Assembled in 1989
  – 117,000 female nurses aged 25-42 years
• Biennial postal questionnaires
The Nurses' Health Study
• The primary goal was to investigate the potential long-term
  consequences of the use of oral contraceptives in a
  population of normal women
• Primary outcomes include heart disease & cancer
  (common endpoints)
• Examines multiple common risk factors (diet, exercise,
  obesity, vitamin use)
• Subjects are able to respond with a high degree of accuracy
• Subjects are motivated to participate in a long-term study
• Subjects are easy to locate
Air Force Ranch Hand Study
(“Agent Orange Study”)
• The U.S. military sprayed some 11 million gallons of the defoliant over
southern and central Vietnam from 1962 to 1971 in an effort to expose
enemy supply lines, sanctuaries and bases.
• Airmen were exposed during spraying flights, while loading the chemical
and while performing maintenance on the aircraft and the spraying
equipment.
• Agent Orange was named for the orange-striped barrels it was shipped
in. It contains dioxin, a cancer-causing byproduct linked to medical
ailments in U.S. war veterans and their Vietnamese counterparts.
Selection of the Comparison Group
1) As similar as possible with respect to other factors that
could influence outcome
2) Comparable & accurate information
Counterfactual ideal
• The ideal comparison group consists of exactly the same
individuals in the exposed group – but without the
exposure
• Epidemiologists must select different sets of people who
are as similar as possible
Exposed
Unexposed
Sources of Comparison Group
• Internal comparison
• Comparison cohort
• General population comparison
[Figure: example groups shown include nurses (obese vs. lean), road crew/asphalt workers, landscapers/grounds crew, rubber workers, and the general population]
Which of the three comparison groups is best?
“Healthy-Worker Effect”
• Rates of morbidity and mortality among a working
  population are lower than those of the general population
• Health requirements for workers (especially physical
  laborers) tend to be stringent
• General population consists of both healthy and ill people
• Leads to underestimation of risk
Sources of Exposure
Information
Pre-Existing Records
Advantages
•Inexpensive
•Recorded before disease occurrence
Disadvantages
•Inadequate level of detail
•Missing records
•Little or no information on confounders
Sources of Exposure
Information
Questionnaires, Interviews
Advantages
•Good for information not routinely recorded
Disadvantages
•Potential for recall bias
Sources of Exposure
Information
Direct Testing
(Physical exams, tests, environmental
monitoring)
Advantages
•Good for certain exposures
Disadvantages
•Expensive
•Not feasible in large studies
Sources of Outcome
Information
• Death certificates
• Physician, hospital, health plan records
• Questionnaires (verify by records)
• Medical exams
You can use blinding to ensure that there
is comparable ascertainment of the outcomes
in both groups
Follow-up
Goal is to obtain complete follow-up information on
all subjects regardless of exposure status
• Ascertainment of outcome data involves following
all subjects from exposure into the future
• Time consuming process
• However, high losses to follow-up raise doubts
about the validity of the study (bias)
Loss to Follow Up
If likelihood of loss to follow up is related to the risk factor
and the outcome, the estimate of the association will be
biased
Example:
                                      OC users     Non-OC users
True incidence of thromboembolism:    20/10,000    10/10,000      True RR = 2.0
Subjects lost to follow up:           1,012        1,008
Subjects with TE lost to follow up:   12           2
Apparent incidence of TE:             8/8,988      8/8,992        Apparent RR = 1.0
Can occur in prospective cohort studies and in experimental studies
Effects: can produce over- or under- estimate of association.
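The arithmetic behind the thromboembolism example can be checked with a short sketch. The counts are the ones on the slide; the dictionary layout and function names are illustrative assumptions, not part of the original material.

```python
# Counts from the thromboembolism (TE) example: oral contraceptive (OC) users
# versus non-users, each starting with 10,000 subjects.
groups = {
    "OC users": {"n": 10_000, "cases": 20, "lost": 1_012, "cases_lost": 12},
    "Non-OC users": {"n": 10_000, "cases": 10, "lost": 1_008, "cases_lost": 2},
}


def true_risk(g):
    """Risk if every subject had been followed to the end."""
    return g["cases"] / g["n"]


def apparent_risk(g):
    """Risk observed among the subjects who were not lost to follow-up."""
    return (g["cases"] - g["cases_lost"]) / (g["n"] - g["lost"])


exposed, unexposed = groups["OC users"], groups["Non-OC users"]
true_rr = true_risk(exposed) / true_risk(unexposed)
apparent_rr = apparent_risk(exposed) / apparent_risk(unexposed)

print(f"True RR     = {true_rr:.1f}")
print(f"Apparent RR = {apparent_rr:.1f}")
```

Total losses are nearly equal in the two groups (1,012 vs. 1,008), but cases were lost far more often among OC users (12 vs. 2), and that difference is what pulls the apparent RR down to 1.0.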
Follow-up Resources
•Town lists
•Telephone books, 411
•Vital records (birth, death, marriage)
•Registry of Motor Vehicles (RMV) lists
•MD & Hospital records
•Internet
•Credit bureau
•Relatives, friends (“contacts”)
•Professional registries (AMA, RN, ABA, etc.)
Tuberculosis Treatment and Breast Cancer Study
Follow-up Strategies
• Begin with an interested group
• Collect identifiable information
  – Full name & address
  – DOB, SSN
  – Contact information
• Maintain frequent contact with all respondents
  – Regular mail (questionnaires, newsletters)
  – Telephone calls
  – Personal contact (if possible)
• Incentives (gifts, calendars, money)
Analysis of Cohort Study
• Basic analysis involves calculation of incidence of
  disease among exposed and unexposed groups
• Depending on available data, you can calculate
  cumulative incidence (CI) or incidence rates (IR)
• Recall the setup of 2 x 2 tables
Person-Time in a Prospective Cohort Study
[Figure: follow-up timelines for subjects A-L. x = when they got disease, D = death, ? = lost to follow-up]
Time at risk per subject (years, in the order extracted): 8.3, 11.0, 14.0, 14.0, 10.2, 3.0, 12.0, 7.0, 10.0, 3.0, 9.0, 6.2
Total time at risk = 107.7 person-years
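As a quick check, the twelve individual times at risk shown in the figure add up to the stated total. This is a minimal sketch; the variable names are illustrative, and the values are listed in extraction order rather than matched to specific subjects.

```python
# Years at risk contributed by each of the 12 subjects in the figure.
# Each subject stops contributing person-time at disease onset, death,
# loss to follow-up, or the end of the study, whichever comes first.
times_at_risk = [8.3, 11.0, 14.0, 14.0, 10.2, 3.0,
                 12.0, 7.0, 10.0, 3.0, 9.0, 6.2]

total_person_years = sum(times_at_risk)
print(f"{len(times_at_risk)} subjects, {total_person_years:.1f} person-years at risk")
```

This total is the denominator of an incidence rate: the number of new cases divided by the person-years at risk.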
Analysis of a Cohort Study
              Cases    Person-years of follow-up
Exposed       A        PYE
Unexposed     C        PYU
Total         A+C      PYE + PYU

IRE = A/PYE
IRU = C/PYU
RR = IRE/IRU
Interpretation: The RR is the risk of developing the outcome in the exposed
relative to the unexposed
Nurses' Health Study
Examine the association between obesity and CHD in a sample of 117,000 RNs
without cardiovascular disease
[Timeline: at the start of the study, nurses are grouped as obese (have risk factor) or lean (don't have it); follow-up surveys into the future are used to compare incidence of disease]
Analysis
                     CHD cases    Woman-years of follow-up
Obese (exposed)      85           99,573
Lean (unexposed)     41           177,356
Total                126          276,929

IR1 = 85/99,573 = 8.54/10,000 woman-years
IR0 = 41/177,356 = 2.31/10,000 woman-years
RR = IR1/IR0 = 3.7
Obese women had 3.7 times the risk of CHD compared to lean women
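The calculation above can be reproduced in a few lines. This is a sketch using the counts from the slide; the helper name `incidence_rate` is an assumption for illustration.

```python
def incidence_rate(cases, person_years):
    """New cases divided by the person-time at risk."""
    return cases / person_years


# CHD cases and woman-years of follow-up from the table above.
ir_obese = incidence_rate(85, 99_573)    # exposed
ir_lean = incidence_rate(41, 177_356)    # unexposed
rate_ratio = ir_obese / ir_lean

print(f"IR1 = {ir_obese * 10_000:.2f} per 10,000 woman-years")
print(f"IR0 = {ir_lean * 10_000:.2f} per 10,000 woman-years")
print(f"RR  = {rate_ratio:.1f}")
```

Because the denominators are woman-years rather than counts of women, the RR here is, strictly speaking, a ratio of incidence rates.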
Risk Ratio in the Nurses' Health Study
Obesity and rate of CHD?

BMI       CHD cases   Person-years of observation   Rate of CHD per 100,000 P-Yrs (incidence)   Risk ratio
<21       41          177,356                       23.1                                        1.0
21-<23    57          194,243                       29.3
23-<25    56          155,717                       36.0
25-<29    67          148,541                       45.1
≥29       85          99,573                        85.4                                        3.7

Risk Ratio = (85.4/100,000) / (23.1/100,000) = 3.7
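The slide fills in the ratio only for the heaviest group. The same calculation can be applied to every BMI category, each compared with the leanest group (BMI <21) as the reference. This is a sketch under that assumption; the tuple layout and variable names are illustrative.

```python
# (BMI category, CHD cases, person-years of observation) from the table.
bmi_table = [
    ("<21", 41, 177_356),      # reference (leanest) category
    ("21-<23", 57, 194_243),
    ("23-<25", 56, 155_717),
    ("25-<29", 67, 148_541),
    ("29+", 85, 99_573),       # heaviest category
]

_, ref_cases, ref_py = bmi_table[0]
reference_rate = ref_cases / ref_py

results = []
for label, cases, person_years in bmi_table:
    rate = cases / person_years
    # Store the rate per 100,000 person-years and the ratio to the reference.
    results.append((label, rate * 100_000, rate / reference_rate))

for label, rate_per_100k, ratio in results:
    print(f"{label:>7}  {rate_per_100k:6.1f} per 100,000 P-Yrs  ratio {ratio:.2f}")
```

The ratios rise steadily across the categories, which is the dose-response pattern the table is meant to show.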
Risk Difference in the Nurses' Health Study
Obesity and rate of CHD?

BMI       CHD cases   Person-years of observation   Rate of CHD per 100,000 P-Yrs (incidence)   Risk difference
<21       41          177,356                       23.1                                        0.0
21-<23    57          194,243                       29.3
23-<25    56          155,717                       36.0
25-<29    67          148,541                       45.1
≥29       85          99,573                        85.4                                        62.3

Risk Difference = 85.4/100,000 - 23.1/100,000
= 62.3 excess cases per 100,000 P-Yrs in heaviest group
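The difference measure can be sketched the same way, subtracting the reference rate (BMI <21) from each category's rate. Variable names are illustrative. Note that the slide subtracts the rounded rates (85.4 - 23.1 = 62.3), while the unrounded rates give a difference of about 62.2.

```python
# (BMI category, CHD cases, person-years of observation) from the table.
bmi_table = [
    ("<21", 41, 177_356),      # reference (leanest) category
    ("21-<23", 57, 194_243),
    ("23-<25", 56, 155_717),
    ("25-<29", 67, 148_541),
    ("29+", 85, 99_573),       # heaviest category
]

PER = 100_000  # express rates per 100,000 person-years
rates = [cases / person_years * PER for _, cases, person_years in bmi_table]

# Excess cases per 100,000 person-years relative to the leanest group.
differences = [rate - rates[0] for rate in rates]

for (label, _, _), diff in zip(bmi_table, differences):
    print(f"{label:>7}  {diff:5.1f} excess cases per 100,000 P-Yrs")

# The slide's figure, computed from rates rounded to one decimal place.
rounded_difference = round(round(rates[-1], 1) - round(rates[0], 1), 1)
```

Unlike the ratio, the difference is an absolute measure: it estimates how many additional cases occur per unit of person-time in the exposed group.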
Strengths of Cohort Studies
• Efficient for rare exposures
• Usually good information on exposures
• Can evaluate multiple effects of an exposure
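A Cohort Study Can Look at Multiple Outcomes
[Figure: obesity (yes/no), with person-years of follow-up, cross-tabulated against several outcomes (yes/no): orthopedic problems, breast cancer, cardiovascular problems, reproductive problems. Values as extracted, cell positions not recoverable: 139, 119, 169, 239, 217, 310,820, 227, 320,807, 3, 98, 138]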
Disadvantages to Cohort Studies
(especially prospective)
• May need large numbers of subjects for long periods of time
• Can be expensive and time consuming
• Not good for rare diseases or those with long latency
• Loss to follow up undermines validity
When Reading A Cohort Study,
Ask…
• How were the study groups selected or
defined?
• Did they differ in other ways that could
influence the outcome?
• Were the data accurate?
• Was data collection comparable for all
study groups?
• How complete was the follow-up?
The Black Women’s Health
Study (BWHS)
A Follow-up Study of
African-American Women
Boston University
Slone Epidemiology Center
Why Is The BWHS Needed?
• Rates of illness and death from many diseases are higher in
  African-American women
• Lack of health research studies involving African-American
  women, particularly large studies
[Figure: death rate per 100,000 women, Black vs. White, for heart disease, stroke, and cancer. Bar values as extracted: 156, 133, 109, 95, 40, 23]
Exposure and Outcome Information
• Biennial postal questionnaires
• Self-report
1995 Questionnaire Data:
Baseline
• Age
• Weight
• Height
• Waist, hip circumference
• Use of medical care
• Occupation
• Education
• Medical history (prevalent disease)
• Reproductive history
• Drugs (OCs, HRT, vitamins, medications)
• Cigarette smoking
• Alcohol use
• Diet (60 item Block-NCI questionnaire)
• Physical activity
• Family care responsibilities
1997-2007 Follow-up Questionnaires
Update “exposures” for previous 2-year period:
(e.g., OC use, weight, alcohol use, cigarette smoking, physical activity, etc.)
Record “outcomes” for previous 2-year period:
Incident disease, Births, Deaths
Additional questions:
Ancestry (race, ethnicity, where born)
Experiences and perceptions of racism
Family history of disease
Use of herbal remedies
Depression scale (CESD)*
Education *
Religion/Spirituality
Dental health
Siblings/birth order
Lupus symptom list
Hair straightener use
Exposure to violence
Individual health/belief system
Household Income
Diet *
Perceived stress and coping
Access to car/transportation
*Repeat Question
1995 – 2007 Questionnaire Data
Prevalent and incident diseases and conditions:
Hypertension
Diabetes
High cholesterol
Heart attack
Angina
Stroke
Clot in lung, leg
Cyst in breast
Fibroids
Endometriosis
Lupus
Sickle cell anemia
Breast cancer
Lung cancer
Colon/rectal cancer
Cervical cancer
Rheumatoid arthritis
Osteoarthritis
Gingivitis
Depression
Sarcoidosis
Asthma
Toxemia/Pre-eclampsia
Gastric/duodenal ulcer
Hydatidiform mole
Polycystic ovary
Glaucoma
Multiple Sclerosis
Kidney Stones
Other - specify
Validation
• Self-reported data
• Important for minimizing bias (misclassification)
• Must confirm:
  – Exposures (when feasible)
  – Vital status (deaths)
  – Outcomes
Validation: Exposures
• Expensive, not feasible in most cohort studies
• Not all exposures can be easily validated (subjective measures)
  – Perceptions and experiences of racism/unfair treatment
• Can be accomplished in a sample of the cohort
  – Anthropometric (height, weight, hip & waist circumference)
  – Physical activity
  – Diet
Diet Validation Study
• 408 BWHS Participants
• Over a 1-year period (quarterly) provided:
• 3 telephone 24 hour diet recalls
• 1 3-day food diary
• Compared nutrient intake estimates
• FFQ Data vs. Combined recall & diary data
Kumanyika et al., Ann Epidemiol 2003;13:111-118
Validation: Outcomes
Non-medical cohort:
• Symptoms of illness are nonspecific
• Participants may not know the diagnosis even if it was made
• Direct examinations not feasible in a large cohort study
Validation: Outcomes
Information requested depends on outcome studied
– Breast cancer and other cancers: hospital records, pathology reports, discharge summaries, CA registry data
– Coronary heart disease: hospital records, discharge summaries
– Lupus, RA, MS, sarcoidosis: hospital records, physician checklists
– Hypertension and diabetes: self-report plus use of appropriate drug, physician checklists
Challenges
Medical records
• Difficulties in obtaining records
  – Additional consent process (medical release)
  – Incomplete records
  – Records from multiple sources
  – Physician checklists: a burden on physician
• Remedies
  – Patient checklists
  – Registry data (cancer)
Validation: Deaths
• Important for follow-up (person-time and outcome)
• Reported by:
  » Next of kin
  » Post office
  » Internet
  » SSDMF
  » NDI
• Confirmed by:
  » Death certificate with date and cause(s) of death
• Supplied by:
  » State registrar
  » Next of kin
Challenges
Obtaining death certificates
– State IRB process (varies by city and state)
– State registry budget cuts
– Takes time to receive certificates
– Cost ranges from $0.37 to $20 per
certificate search
Remedy:
– NDI Plus
– Gives DOD and coded cause of death
– Still requires state IRB approval
Study Results
• ~100 publications (manuscripts and abstracts)
• Full list available at: www.bu.edu/bwhs/publications
Genomic Studies
• Collection of cheek samples (for extraction of
  DNA) from all BWHS participants between
  January 2004 and December 2007
• Samples sent to National Human Genome
  Center at Howard University
BWHS Genomic Studies
• Participants receive $15 AFTER their consent and
  sample have been received
• Samples stored at Howard University Human Genome
  Center for future analyses
• 56% (n~27,000) participation / ~5% refusal
• Non-responders followed up via telephone calls
>1 Move Between 1995-1997, by Age
Age      Moved
<30      70%
30-39    57%
40-49    45%
50-69    39%
Russell, et al. AJE. 2001;154:845-53