EP711 COHORT STUDIES

Types of Epidemiologic Studies
• Experimental
  – Randomized controlled trials
• Observational
  – Cohort
  – Case-control

Cohort Studies
Definition: A study in which two or more groups of people who are free of disease and who differ in their extent of exposure (e.g., exposed and unexposed) are compared with respect to disease incidence.

Cohort studies are the observational equivalent of experimental studies, but the researcher cannot allocate exposure; he/she must locate a natural experiment to observe the relationship between the exposure and disease.

Randomized Controlled Trials
[Diagram: Identify the study population (cohort) → Exposed (active intervention) vs. Unexposed (comparison) → Outcome?]
Design:
• Investigator randomly assigns exposure (treatment)
• Then observes over time for subsequent outcome

Cohort Studies
[Diagram: Identify the cohort → Exposed (smokers) vs. Unexposed (non-smokers) → Outcome?]
Design:
• Non-diseased subjects are grouped based on presence of exposure
• Then determine subsequent outcome (e.g., disease)
Example: Is smoking associated with lung cancer?

Advantages of Cohort Studies
• Temporal sequence between exposure & disease is clear (e.g., smoking preceded cancer)
• Can directly calculate incidence, RD, PRD
• Good for looking at rare exposures or unusual risk factors (e.g., Agent Orange)
• Can evaluate multiple effects of a single factor

Retrospective Cohort Study
[Diagram: timeline. Past: the cohort (have factor / don’t have factor) → compare incidence → Start of Study → Future]
• Cheaper, faster
• Efficient for diseases with a long latent period
• Exposure data may be inadequate (limitation)

Prospective Cohort Study
[Diagram: timeline. Past → Start of Study: the cohort (have factor / don’t have factor) → compare incidence → Future]
• More expensive, time consuming
• Not efficient for diseases with long latent periods
• Better exposure and confounder data
• Less vulnerable to bias

Ambidirectional Cohort Study
[Diagram: timeline. Past → Start of Study → Future. Retrospective part: compare incidence (have factor / don’t have factor) before the start of the study. Prospective part: compare incidence after it]
Contains elements of both types of studies

Types of Cohort Populations
• Open or Dynamic
  – Changeable characteristic
  – Members come and go
  – Losses may occur
  – Examples: never married, residents of Boston, aged 25-54
• Fixed
  – Irrevocable event
  – Does not add new members
  – Losses may occur
  – Examples: Baby Boomers, 9/11 survivors, RCT participants
• Closed
  – Irrevocable event
  – Does not add new members
  – No losses occur
  – Examples: church picnic or wedding attendees

Selection of Study Population
Choice depends upon hypothesis under study and feasibility considerations
• For common risk factors (obesity, HBP):
  – A cohort from the general population (e.g., Framingham Heart Study, NHANES)
  – A special study group, e.g., doctors or nurses (e.g., the Nurses’ Health Study, Black Women’s Health Study)
• For unusual risk factors:
  – A special (rare) exposure group (e.g., Agent Orange, Hiroshima, occupational)

The Framingham Heart Study
• Initiated by NHLBI
• Objective was to identify the common factors or characteristics that contribute to CVD by following healthy individuals
• The researchers recruited 5,209 men and women between the ages of 30 and 62 from the town of Framingham, Massachusetts
• Since 1948, the subjects have continued to return to the study every two years for a detailed medical history, physical examination, and laboratory tests
• In 1971, the study enrolled a second generation: the original participants' adult children and their spouses
• In April 2002 the study entered a new phase: the enrollment of a third generation of participants, the grandchildren of the original cohort

The Nurses’ Health Study
• 2 cohorts; differ by age
• NHS I
  – Assembled in 1976
  – ~122,000 female nurses aged 30-55 years
• NHS II
  – Assembled in 1989
  – 117,000 female nurses aged 25-42 years
• Biennial postal questionnaires

The Nurses’ Health Study
• The primary goal was to investigate the potential long-term consequences of the use of oral contraceptives in a population of normal women
• Primary outcomes include heart disease & cancer (common endpoints)
• Examines multiple common risk factors (diet, exercise, obesity, vitamin use)
• Subjects able to respond with a high degree of accuracy
• Motivated to participate in a long-term study
• Easy to locate

Air Force Ranch Hand Study (“Agent Orange Study”)
• The U.S. military sprayed some 11 million gallons of the defoliant over southern and central Vietnam from 1962 to 1971 in an effort to expose enemy supply lines, sanctuaries and bases.
• Airmen were exposed during spraying flights, while loading the chemical and while performing maintenance on the aircraft and the spraying equipment.
• Agent Orange was named for the orange-striped barrels it was shipped in.
• It contains dioxin, a cancer-causing byproduct linked to medical ailments in U.S. war veterans and their Vietnamese counterparts.

Selection of the Comparison Group
1) As similar as possible with respect to other factors that could influence outcome
2) Comparable & accurate information

Counterfactual ideal
• The ideal comparison group consists of exactly the same individuals as the exposed group, but without the exposure
• Epidemiologists must select different sets of people who are as similar as possible
[Diagram: Exposed vs. Unexposed]

Sources of Comparison Group
• Internal comparison: nurses, obese vs. lean
• Comparison cohort: road crew/asphalt workers vs. landscapers/grounds crew
• General population comparison: rubber workers vs. general population
Which of the three comparison groups is best?

“Healthy-Worker Effect”
• Rates of morbidity and mortality among a working population are lower than those of the general population
• Health requirements for workers (especially physical laborers) tend to be stringent
• General population consists of both healthy and ill people
• Leads to underestimation of risk

Sources of Exposure Information
Pre-Existing Records
  Advantages
  • Inexpensive
  • Recorded before disease occurrence
  Disadvantages
  • Inadequate level of detail
  • Missing records
  • Little or no information on confounders
Questionnaires, Interviews
  Advantages
  • Good for information not routinely recorded
  Disadvantages
  • Potential for recall bias
Direct Testing (physical exams, tests, environmental monitoring)
  Advantages
  • Good for certain exposures
  Disadvantages
  • Expensive
  • Not feasible in large studies

Sources of Outcome Information
• Death certificates
• Physician, hospital, health plan records
• Questionnaires (verify by records)
• Medical exams
You can use blinding to ensure that there is comparable ascertainment of the outcomes in both groups

Follow-up
Goal is to obtain complete follow-up information on all subjects regardless of exposure status
• Ascertainment of outcome data involves following all subjects from exposure into the future
• Time-consuming process
• However, high losses to follow-up raise doubts about the validity of the study (bias)

Loss to Follow Up
If likelihood of loss to follow up is related to the risk factor and the outcome, the estimate of the association will be biased
Example:
                                      OC users     Non-OC users
True incidence of thromboembolism     20/10,000    10/10,000      True RR = 2.0
Subjects lost to follow up            1,012        1,008
Subjects with TE lost to follow up    12           2
Apparent incidence of TE              8/8,988      8/8,992        Apparent RR = 1.0
Can occur in prospective cohort studies and in experimental studies
Effects: can produce over- or underestimate of association

Follow-up Resources
• Town lists
• Telephone books, 411
• Vital records (birth, death, marriage)
• Registry of Motor Vehicles (RMV) lists
• MD & hospital records
• Internet
• Credit bureau
• Relatives, friends (“contacts”)
• Professional registries (AMA, RN, ABA, etc.)

Tuberculosis Treatment and Breast Cancer Study

Follow-up Strategies
• Begin with an interested group
• Collect identifiable information
  – Full name & address
  – DOB, SSN
  – Contact information
• Maintain frequent contact with all respondents
  – Regular mail (questionnaires, newsletters)
  – Telephone calls
  – Personal contact (if possible)
• Incentives (gifts, calendars, money)

Analysis of Cohort Study
• Basic analysis involves calculation of incidence of disease among exposed and unexposed groups
• Depending on available data, you can calculate cumulative incidence (CI) or incidence rates (IR)
• Recall set up of 2 x 2 tables

Person-Time In A Prospective Cohort Study
[Figure: time at risk for subjects A–L plotted against periods P1–P13. Legend: x = when they got disease, D = death, ? = lost to follow-up. Times at risk as printed: 8.3, 11.0, 14.0, 14.0, 10.2, 3.0, 12.0, 7.0, 10.0, 3.0, 9.0, 6.2. Total time at risk = 107.7 person-years]

Analysis of a Cohort Study
             Cases    Person-years of follow-up
Exposed      A        PYE
Unexposed    C        PYU
Total        A+C      PYE + PYU
IRE = A/PYE
IRU = C/PYU
RR = IRE/IRU
Interpretation: The RR is the risk of developing the outcome in the exposed relative to the unexposed

Nurses’ Health Study
Examine the association between obesity and CHD in a sample of 117,000 RNs w/o cardiovascular disease
[Diagram: Start of Study: have risk factor (obese) vs. don’t have it (lean) → follow-up surveys → compare incidence of disease in the future]

Analysis
                    CHD cases    Woman-years of follow-up
Obese (exposed)     85           99,573
Lean (unexposed)    41           177,356
Total               126          276,929
IR1 = 85/99,573 = 8.54/10,000 woman-years
IR0 = 41/177,356 = 2.31/10,000 woman-years
RR = IR1/IR0 = 3.7
Obese women had 3.7 times the risk of CHD compared to lean women

Risk Ratio in the Nurses’ Health Study?
Obesity (BMI)    CHD cases    P-Yrs of observation    Rate of CHD per 100,000 P-Yrs (incidence)    Risk Ratio
<21              41           177,356                 23.1                                         1.0
21-<23           57           194,243                 29.3
23-<25           56           155,717                 36.0
25-<29           67           148,541                 45.1
≥29              85           99,573                  85.4                                         3.7
Risk Ratio = (85.4/100,000) / (23.1/100,000) = 3.7

Risk Difference in the Nurses’ Health Study?
Obesity (BMI)    CHD cases    P-Yrs of observation    Rate of CHD per 100,000 P-Yrs (incidence)    Risk Difference
<21              41           177,356                 23.1                                         0.0
21-<23           57           194,243                 29.3
23-<25           56           155,717                 36.0
25-<29           67           148,541                 45.1
≥29              85           99,573                  85.4                                         62.3
Risk Difference = 85.4/100,000 - 23.1/100,000 = 62.3 excess cases per 100,000 P-Yrs in heaviest group

Strengths of Cohort Studies
• Efficient for rare exposures
• Usually good information on exposures
• Can evaluate multiple effects of an exposure

A Cohort Study Can Look at Multiple Outcomes
[Figure: a single exposure, obesity (yes/no, with person-years for each group), compared across several outcomes: orthopedic problems, breast cancer, cardiovascular problems, reproductive problems. Case counts are not recoverable from the extracted text]

Disadvantages to Cohort Studies (especially prospective)
• May need large numbers of subjects for long periods of time
• Can be expensive and time consuming
• Not good for rare diseases or those with long latency
• Loss to follow up undermines validity

When Reading A Cohort Study, Ask…
• How were the study groups selected or defined?
• Did they differ in other ways that could influence the outcome?
• Were the data accurate?
• Was data collection comparable for all study groups?
• How complete was the follow-up?

The Black Women’s Health Study (BWHS)
A Follow-up Study of African-American Women
Boston University Slone Epidemiology Center

Why Is The BWHS Needed?
• Rates of illness and death from many diseases are higher in African-American women
• Lack of health research studies involving African-American women, particularly large studies
[Figure: death rate per 100,000 women, Black vs. White, for heart disease, stroke, and cancer. Bar values as printed: 156, 133, 109, 95, 40, 23]

Exposure and Outcome Information
• Biennial postal questionnaires
• Self-report

1995 Questionnaire Data: Baseline
• Age
• Weight
• Height
• Waist, hip circumference
• Use of medical care
• Occupation
• Education
• Medical history (prevalent disease)
• Reproductive history
• Drugs (OCs, HRT, vitamins, medications)
• Cigarette smoking
• Alcohol use
• Diet (60 item Block-NCI questionnaire)
• Physical activity
• Family care responsibilities

1997-2007 Follow-up Questionnaires
Update “exposures” for previous 2-year period (e.g., OC use, weight, alcohol use, cigarette smoking, physical activity, etc.)
Record “outcomes” for previous 2-year period: incident disease, births, deaths
Additional questions:
• Ancestry (race, ethnicity, where born)
• Experiences and perceptions of racism
• Family history of disease
• Use of herbal remedies
• Depression scale (CESD)*
• Education*
• Religion/Spirituality
• Dental health
• Siblings/birth order
• Lupus symptom list
• Hair straightener use
• Exposure to violence
• Individual health/belief system
• Household income
• Diet*
• Perceived stress and coping
• Access to car/transportation
* Repeat question

1995 – 2007 Questionnaire Data
Prevalent and incident diseases and conditions:
• Hypertension
• Diabetes
• High cholesterol
• Heart attack
• Angina
• Stroke
• Clot in lung, leg
• Cyst in breast
• Fibroids
• Endometriosis
• Lupus
• Sickle cell anemia
• Breast cancer
• Lung cancer
• Colon/rectal cancer
• Cervical cancer
• Rheumatoid arthritis
• Osteoarthritis
• Gingivitis
• Depression
• Sarcoidosis
• Asthma
• Toxemia/Pre-eclampsia
• Gastric/duodenal ulcer
• Hydatidiform mole
• Polycystic ovary
• Glaucoma
• Multiple sclerosis
• Kidney stones
• Other (specify)

Validation
• Self-reported data
• Important for minimizing bias (misclassification)
• Must confirm:
  – Exposures (when feasible)
  – Vital status (deaths)
  – Outcomes

Validation: Exposures
• Expensive, not feasible in most cohort studies
• Not all exposures can be easily validated (subjective measures)
  – Perceptions and experiences of racism/unfair treatment
• Can be accomplished in a sample of the cohort
  – Anthropometric (height, weight, hip & waist circumference)
  – Physical activity
  – Diet

Diet Validation Study
• 408 BWHS participants
• Over a 1-year period (quarterly) provided:
  – 3 telephone 24-hour diet recalls
  – 1 3-day food diary
• Compared nutrient intake estimates
  – FFQ data vs. combined recall & diary data
Kumanyika et al., Ann Epidemiol 2003;13:111-118

Validation: Outcomes
Non-medical cohort:
• Symptoms of illness are nonspecific
• Participants may not know the diagnosis even if it was made
• Direct examinations not feasible in a large cohort study
Information requested depends on outcome studied
– Breast cancer and other cancers: hospital records, pathology reports, discharge summaries, CA registry data
– Coronary heart disease: hospital records, discharge summaries
– Lupus, RA, MS, sarcoidosis: hospital records, physician checklists
– Hypertension and diabetes: self-report plus use of appropriate drug, physician checklists

Challenges: Medical records
Difficulties in obtaining records
• Additional consent process (medical release)
• Incomplete records
• Records from multiple sources
• Physician checklists: a burden on physician
Remedies
• Patient checklists
• Registry data (cancer)

Validation: Deaths
• Important for follow-up (person-time and outcome)
• Reported by:
  » Next of kin
  » Post office
  » Internet
  » SSDMF
  » NDI
• Confirmed by:
  » Death certificate with date and cause(s) of death
• Supplied by:
  » State registrar
  » Next of kin

Challenges: Obtaining death certificates
– State IRB process (varies by city and state)
– State registry budget cuts
– Takes time to receive certificates
– Cost ranges from $0.37 to $20 per certificate search
Remedy: NDI Plus
– Gives DOD and coded cause of death
– Still requires state IRB approval

Study Results
• ~100 publications (manuscripts and abstracts)
• Full list available at: www.bu.edu/bwhs/publications

Genomic Studies
• Collection of cheek samples (for extraction of DNA) from all BWHS participants between January 2004 and December 2007
• Samples sent to National Human Genome Center at Howard University

BWHS Genomic Studies
• Participants receive $15 AFTER their consent and sample have been received
• Samples stored at Howard University Human Genome Center for future analyses
• 56% (n~27,000) participation / ~5% refusal
• Non-responders followed up via telephone calls

>1 Move Between 1995-1997, by Age
Russell, et al. AJE. 2001;154:845-53
Age      >1 move
<30      70%
30-39    57%
40-49    45%
50-69    39%
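
Worked example: incidence rates in the Nurses’ Health Study
The rate, ratio, and difference arithmetic from the BMI tables earlier in these notes can be reproduced with a few lines of Python. This is only a sketch: the case counts and person-years are copied from the slides, while the names BMI_TABLE, incidence_rate, and compare_to_reference are illustrative and not part of any study software. The top category is written "29+" in the data.

```python
# Counts copied from the Nurses' Health Study BMI table in these notes.
# Each entry: (BMI category, CHD cases, person-years of observation).
BMI_TABLE = [
    ("<21", 41, 177_356),
    ("21-<23", 57, 194_243),
    ("23-<25", 56, 155_717),
    ("25-<29", 67, 148_541),
    ("29+", 85, 99_573),
]


def incidence_rate(cases, person_years, per=100_000):
    """Incidence rate expressed per `per` person-years."""
    return cases / person_years * per


def compare_to_reference(table, reference_index=0):
    """Return (category, rate, ratio, difference) rows versus the reference row."""
    _, ref_cases, ref_py = table[reference_index]
    ref_rate = incidence_rate(ref_cases, ref_py)
    rows = []
    for category, cases, person_years in table:
        rate = incidence_rate(cases, person_years)
        rows.append((category, rate, rate / ref_rate, rate - ref_rate))
    return rows


results = compare_to_reference(BMI_TABLE)
for category, rate, ratio, diff in results:
    print(f"{category:>7}  rate={rate:5.1f}  ratio={ratio:3.1f}  difference={diff:5.1f}")
```

For the heaviest group this gives a rate of about 85.4 per 100,000 person-years and a ratio of about 3.7, matching the slides. The difference computed from unrounded rates is about 62.2, slightly below the 62.3 the slide obtains by subtracting the already rounded rates. The sketch also fills in the ratios and differences for the middle BMI categories, which the slides leave blank.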
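
Worked example: how loss to follow-up can hide an association
The thromboembolism example from the Loss to Follow Up slide can be checked the same way. The first scenario uses the numbers from the slide. The second scenario is hypothetical (its loss counts are invented for illustration) and shows that when losses are unrelated to the outcome, the risk ratio is preserved. The function names are assumptions made for this sketch.

```python
def risk(cases, n):
    """Cumulative incidence: cases divided by subjects at risk."""
    return cases / n


def apparent_risk(true_cases, n, lost, lost_cases):
    """Risk observed among the subjects who remain under follow-up."""
    return (true_cases - lost_cases) / (n - lost)


# True risks from the slide: OC users 20/10,000, non-OC users 10/10,000.
true_rr = risk(20, 10_000) / risk(10, 10_000)

# Slide scenario: OC users with TE are lost far more often than non-users with TE.
biased_rr = (
    apparent_risk(20, 10_000, lost=1_012, lost_cases=12)
    / apparent_risk(10, 10_000, lost=1_008, lost_cases=2)
)

# Hypothetical scenario (not from the slides): 10% of each group is lost,
# and cases are lost in the same proportion as non-cases.
unbiased_rr = (
    apparent_risk(20, 10_000, lost=1_000, lost_cases=2)
    / apparent_risk(10, 10_000, lost=1_000, lost_cases=1)
)

print(f"true RR     = {true_rr:.2f}")
print(f"biased RR   = {biased_rr:.2f}")
print(f"unbiased RR = {unbiased_rr:.2f}")
```

In the slide scenario the apparent risks are 8/8,988 and 8/8,992, so the ratio collapses to about 1.0 even though the true ratio is 2.0. In the hypothetical scenario the apparent risks are 18/9,000 and 9/9,000, and the ratio stays at 2.0.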