Biostatistics in Cancer Control
Donna McClish

Cohort Study
• Observational study
• Comparison groups are identified according to exposure status
• Subjects must be free of the disease of interest at entry
• Subjects are followed over time
• Development of disease and length of time observed (person-time) are noted

Cohort Study
              Disease    No Disease
Exposure                               n1
No Exposure                            n2

Types of Cohort Studies
• Concurrent (prospective)
– Exposure status is assessed in the present, prior to disease development
– Advantages include assessing exposures in a common manner and not being subject to recall
– Disadvantages include the length of time needed to complete the study if the disease has a long latency, along with the accompanying loss of patients

Types of cohort studies
• Nonconcurrent (retrospective or historic)
– Exposure status is assessed from previously recorded information (outcome status may be measured in the past or present)
– Advantages include being less costly and less time consuming
– Disadvantages include assessment methods that are less than ideal or lacking in uniformity

Cohort studies – single cohort
• Select a single population, based on factors unrelated to exposure (specific community, membership in an organization, etc.)
• Determine exposure status
• Advantages: many different exposures can be assessed; different exposure groups are more likely to be comparable; generalizability is improved
• Disadvantage: not optimal for sample size; a very large sample is needed if the exposure is rare
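
As a minimal sketch of the person-time bookkeeping mentioned in the cohort study definition above, the Python below tallies new cases and person-years by exposure group. The record layout, the `tally` helper, and the four example subjects are hypothetical illustrations, not data from any study in these slides.

```python
from datetime import date

# Hypothetical follow-up records: (exposed?, entry date, exit date, developed disease?)
# Exit is the date of diagnosis, loss to follow-up, or end of study, whichever comes first.
subjects = [
    (True,  date(2000, 1, 1), date(2010, 1, 1), False),
    (True,  date(2000, 1, 1), date(2005, 1, 1), True),
    (False, date(2000, 1, 1), date(2010, 1, 1), False),
    (False, date(2002, 1, 1), date(2010, 1, 1), False),
]

def tally(records):
    """Return {exposed: (new_cases, person_years)} for each exposure group."""
    totals = {True: [0, 0.0], False: [0, 0.0]}
    for exposed, entry, exit_, diseased in records:
        # Each subject contributes only the time actually observed.
        years = (exit_ - entry).days / 365.25
        totals[exposed][0] += int(diseased)
        totals[exposed][1] += years
    return {group: (cases, py) for group, (cases, py) in totals.items()}

totals = tally(subjects)
for group, (cases, py) in totals.items():
    label = "exposed" if group else "unexposed"
    print(f"{label}: {cases} case(s) over {py:.1f} person-years")
```

Counting time rather than heads is what lets subjects who enter late or are lost early still contribute the follow-up they actually provided.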
Cohort studies – multiple cohorts
• Select 2 or more cohorts based on exposure status for the risk factor of interest
• Sometimes the “unexposed” sample is an external population such as the entire US
• Advantage: handles rare exposures without requiring an unreasonable sample size
• Disadvantage: hard to find cohorts that are comparable except for exposure status; may be less generalizable

Nurses Health Study
• Single cohort, concurrent
• In 1976, recruited registered nurses ages 30-55 working in the 11 most populous states (N=121,700 out of about 170,000 recruited)
• Questionnaires about diet and risk factors every 2 years
– Added QOL questions in 1992 (every 4 years)
– Toenail samples taken in 1982 and 1984 (mineral assessment)
– Blood samples (for biomarkers) in 1989-90 and 2000-01
• Nurses were asked about occurrence of outcomes, which were confirmed by medical record
• Response rates of 90%

Nurses Health Study II
• Added a second cohort of women aged 25-42 in 1989, n=116,686
• Particular interest in oral contraceptives, diet, and lifestyle risk factors
• Blood and urine samples taken from 30,000 participants in the late 1990s
• 90% response rates

Nurses Health Study Results
• No relationship between dietary fiber and colon cancer
• Risk of colon cancer can be reduced by physical activity, and somewhat by a daily folic acid supplement and aspirin
• High calcium and vitamin D may be protective

Nurses Health Study Results
• Birth control pills, having at least 2 children, and eating fruits and vegetables are protective against ovarian cancer
• Being overweight, consuming too many dairy products, and using talcum powder on the genitals may increase risk of ovarian cancer

Nurses Health Study Results
• High estrogen levels, drinking alcohol, using testosterone supplements, gaining 45 pounds since age 18, and having an apple-shaped body are all related to increased risk of breast cancer
• Smoking, abortion, organochlorine chemicals, and hair dyes are not related to breast cancer
• Eating fruits and vegetables (particularly those with vitamin A and beta carotene) and exercising (particularly if postmenopausal) are protective against breast cancer

Nurses Health Study Results 2005
• Drinking at least 2 alcoholic beverages/day modestly increases risk of colorectal cancer
• Walking at least 90 minutes/week at ages 50-60 improves memory at ages 70+
• Drinking sugar-sweetened sodas results in weight gain and increased risk of diabetes
• High levels of estrogen and androgens in postmenopausal women increase risk of breast cancer

Nurses Health Study Results 2006
• Nurses who began hormone therapy near menopause had a 30% lower risk of heart disease than women who didn’t use hormones
• Nurses who started taking hormones at least 10 years after menopause didn’t have any benefit
FELS Longitudinal Study
• Began in 1927
• Enrolls subjects during pregnancy; now has children, grandchildren, great-grandchildren, etc. enrolled
• Concentrates on physical growth and maturation, skeletal and dental data, and body composition
• Data collection: 5/yr in 1st year; 2/yr until age 5; 1/yr until age 8; 2/yr until puberty; 1/yr until age 21; every 2 years in adulthood

Multiple cancer study
• Double cohort, nonconcurrent
• Exposed are those on SEER registries
• Unexposed are the general population in SEER areas (using census data)
• Outcome is occurrence of a second primary cancer at selected sites
• Standardized Incidence Ratios (SIRs) calculated and compared

Multiple cancer study
• Exposures include age, race, sex, marital status, stage, site and histology of initial cancer, and treatment
• If SEER-Medicare data are used, then exposures also include comorbidity and SES variables such as the income and education composition of the area where the subject lives

Multiple cancer study results
• Very few differences between males and females
• Many differences between old and young (young had higher SIRs)
• Most cancers had significantly higher (or lower) risk than the general population

Multiple cancer study results
• Second primary sites with > 50% increase in risk for all age/gender strata included: urinary tract, upper aerodigestive, melanoma, and small bowel
• Large increases just for women: breast, non-Hodgkin's lymphoma
• Large increases for men: kidney, endocrine
• Large increases for those younger than 65: testicular cancer, leukemia, lung, connective tissue

Multiple cancer study results
• Initial primary sites with > 50% increase in risk of a second primary for all age/gender strata: melanoma, larynx, kidney, or upper aerodigestive
• Except for older males, large increases for those with initial urinary cancer or Hodgkin's disease
• Large increases or decreases just for women (uterus), men (testis), or just for those younger than 65 (esophagus, lung, ovary, other female) or over 65
(pancreas).

Multiple cancer study results
• Women with breast cancer who have radiation treatment are at higher risk for a second cancer in the radiation field (breast, trachea, lung, esophagus)

Rural Physician Cancer Prevention Project (RNP)
• Randomized trial of a low-intensity dietary intervention in rural residents
• Recruited from 3 physician practices; physicians “endorsed” the project
• Subjects in the intervention group were mailed information booklets along with individualized dietary feedback on fat and fiber (based on the Fat and Fiber Behavior-related Questionnaire [FFB])

FIBERR Project
• Single cohort study
• Evaluate feasibility of recruiting first-degree relatives of rural colon cancer patients
• Same intervention as Rural Nutrition Project
• Same data collection as Rural Nutrition Project

Combined RNP and FIBERR
• Combine data from the intervention group of RNP with FIBERR to create a multiple cohort study
• Research question: will relatives of colon cancer patients be more motivated to improve their diet?
• Could this have been done as a randomized study?
Cohort study – exposure measurement issues
• Exposure should be measured the same way on everyone
• Exposure needs to be assessed prior to development of disease
• Exposures that change over time (e.g., diet, environment) are problematic, because it is hard to establish that exposure was assessed prior to even a subclinical level of disease

Cohort study – disease measurement issues
• Definition of disease should be standardized
• Disease assessment should be blinded to exposure
• Assessments can be by periodic exam, surveys of hospitals, registries, or death certificates
• Minimize loss to follow-up
• Knowledge of latency is needed to determine the appropriate length of follow-up

Cohort study risk measure
• Disease rate (incidence rate) is estimated as the number with newly diagnosed disease divided by the person-years of observation
• Relative risk (RR) is the ratio of incidence in the exposed population to incidence in the unexposed population
• RR>1 implies increased risk, RR<1 implies protective factor

Cohort Study risk measures
• Standardized Incidence Ratio (SIR) compares the observed number of events in the exposed group to the expected number of events
• Expected number of events is calculated by multiplying the person-years of observation in the exposed group by the incidence rate of the standard population

Case-control study
• Comparison groups are selected according to disease status: cases are subjects with disease and controls are subjects without disease
• Determine exposure status of each subject

Case-control Study
              Disease    No Disease
Exposure
No Exposure
              m1         m2

DES
• 8 cases of clear cell adenocarcinoma of the vagina seen over a short period in Boston in 15-22 year old women
• Matched each case with 4 controls born within 5 days on the same service
• Assessed exposure of parents and child
• 7 of 8 cases reported that their mothers had taken DES during the first trimester; none of the controls had the exposure

Case-control study – choice of cases
• Ideally, cases would be a random sample of everyone who has the disease of interest
• Common sources of cases are
hospitals or registries
• Hospital cases may not be representative: they are more severe if those with subclinical stage do not seek treatment, or less severe if cases die prior to admission to hospital

Case-control – choice of controls
• Controls should be as comparable with cases as possible (internal validity) or representative of the population of individuals without cancer in terms of exposure
• Population controls
– Advantage: are representative
– Disadvantage: may be hard to get a sampling frame
– Random digit dialing is used to get a representative sample of those with phones

Case-control – choice of controls
• Hospital patients
– Advantage: lists of eligible patients are available
– Disadvantage: hard to determine that they are comparable to cases
• Neighbors, friends, and relatives
– Advantage: comparable in terms of SES, lifestyle, or genetics
– Disadvantage: may not be representative

Case-control study – Exposure assessment
• Exposure is measured by interview, questionnaire, or medical record
• Knowledge of the latency period is needed to determine the appropriate time frame for exposure assessment (at least 15 years for epithelial tumors, less than 10 years for hematopoietic neoplasms; may be less if there is a genetic predisposition)

Case-control, bias in measuring exposure
• Recall bias: particularly bad if related to disease status
• Knowledge of disease status may influence selective gathering of assessment data (can be controlled by blinded assessment)
• Different settings for assessment (e.g., hospital for cases vs home for controls)
• Interviewing cases and controls at different time points
• Interviewing surrogates

Case-control risk measure
• Odds ratio (OR) in a case-control study compares the odds of exposure among cases to the odds of exposure among controls
– Odds of an event is the ratio of the number of ways an event can occur to the number of ways it cannot occur (ratio of the probability of exposure to the probability of no exposure)
– OR>1 implies increased risk, OR<1 implies protective factor.
– For rare disease, OR approximates RR

Advantages of case-control study
• Good for rare diseases
• Relatively quick to do
• Requires fewer subjects than a cohort study
• Less costly
• Sometimes can use existing records
• Can study multiple exposures

Disadvantages of case-control study
• Relies on recall or records for past exposure
• Validation of exposure information is difficult or impossible
• Control of extraneous variables may be incomplete
• Selection of an appropriate control group is difficult
• Can’t determine incidence or RR
• Can only assess one outcome

Advantages of cohort study
• Can get a complete description of the experience occurring after exposure
• Can better assess exposure if prospective
• Can study multiple outcomes
• Better chance of establishing that exposure occurred before the outcome
• Can assess incidence and RR

Disadvantages of cohort study
• If disease is rare, a very large sample is needed
• May require very long follow-up time
• Hard to maintain follow-up
• Expensive
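
The risk measures defined in the cohort and case-control slides above (incidence rate, RR, SIR, and OR) can be sketched in a few lines of Python. This is an illustrative sketch only: the function names and every count below are hypothetical, and the SIR is computed for a single stratum with one standard rate, which is a simplifying assumption of this sketch and may not match how any particular study computes it.

```python
def incidence_rate(new_cases, person_years):
    """Newly diagnosed cases divided by person-years of observation."""
    return new_cases / person_years

def relative_risk(cases_exp, py_exp, cases_unexp, py_unexp):
    """Ratio of incidence in the exposed to incidence in the unexposed."""
    return incidence_rate(cases_exp, py_exp) / incidence_rate(cases_unexp, py_unexp)

def standardized_incidence_ratio(observed, person_years, standard_rate):
    """Observed events divided by expected events.

    Expected = person-years in the exposed group times the incidence
    rate of the standard population (single stratum, by assumption).
    """
    expected = person_years * standard_rate
    return observed / expected

def odds_ratio(exp_cases, unexp_cases, exp_controls, unexp_controls):
    """Odds of exposure among cases divided by odds of exposure among controls."""
    return (exp_cases / unexp_cases) / (exp_controls / unexp_controls)

# Hypothetical cohort: 30 cases in 10,000 exposed person-years
# versus 10 cases in 10,000 unexposed person-years.
rr = relative_risk(30, 10_000, 10, 10_000)

# Hypothetical exposed group: 12 observed events over 4,000 person-years,
# with a standard-population rate of 2 per 1,000 person-years.
sir = standardized_incidence_ratio(12, 4_000, 0.002)

# Hypothetical case-control table: 40 of 100 cases exposed,
# 20 of 100 controls exposed.
or_ = odds_ratio(40, 60, 20, 80)

print(f"RR = {rr:.2f}, SIR = {sir:.2f}, OR = {or_:.2f}")
```

A table with an empty cell makes the odds ratio undefined. In the DES example no control was exposed, so the odds of exposure among controls are zero and this function would raise a `ZeroDivisionError`; the sketch makes no attempt to handle that case.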