Vascular Surgery Biostatistics Seminar
• We have a website: http://www.phs.wfubmc.edu/public/edu_vascSurg.cfm
• Course is “experimental”
  – Ask questions during lectures
  – Let me know of specific statistical issues that you want covered
• Assignment: for last 2 sessions (review of student-selected publications)
  – Pick 2 articles for class review
  – Email PDFs of them to me by October 20th

Texts
1. Gehlbach: Interpreting the Medical Literature (ISBN 0-07-143789-4)
2. Dawson & Trapp: Basic and Clinical Biostatistics (ISBN 0-07-141017-1)
3. Good & Hardin: Common Errors in Statistics (ISBN 0-471-79431-7)
4. Huck: Reading Statistics and Research (ISBN 0205-51067-1)
5. van Belle: Statistical Rules of Thumb (ISBN 0471-40227-3)

Schedule
Seminar #  Topic                                                     Date   Time
1          Study design and data collection                          9/10   1:30 – 3:00
2          Probability and statistical inference                     9/17   2:00 – 4:00
3          Data summary measures and graphical display of results    10/1   2:00 – 4:00
4          Survey of statistical analysis techniques (part I)        10/8   2:00 – 4:00
5          Survey of statistical analysis techniques (part II)       10/15  2:00 – 4:00
6          Evidence-based medicine and decision analysis             11/5   2:00 – 4:00
7          Reading and reviewing analyses in medical literature*     11/19  2:00 – 4:00
8          Review of student-selected medical publications*          12/3   2:00 – 4:00

Study Design
Gehlbach: Chapters 1-6

Hypothetical example: factors affecting (causing) renovascular disease (RVD)
• Outcomes
  – Renal function (GFR, serum creatinine)
  – RVD by diagnostic test (ultrasound, angiogram)
  – End-stage renal disease (dialysis dependence)
  – Renal-related mortality
• Exposures
  – Hypertension
  – RVD repair
    • Open revascularization
    • Percutaneous repair
  – Risk factors: age, race, smoking, diabetes,…

Q: How can we examine a specific hypothesis as it relates to RVD?
A: Formulate a hypothesis and design a study!

Design Dilemma
• Ideal question one would pose
• Data one can collect or access

From Good & Hardin, Common Errors in Statistics and How to Avoid Them:
Before conducting the experiment, trial, survey, data analysis:
1. Write down the objectives
2. Translate those objectives into testable hypotheses
3. List potential findings and resulting conclusions

Research Question vs. Hypothesis
• Research Question: “How does diabetes affect renal function after renal revascularization?”
• Hypothesis: “In patients treated for RVD with endovascular repair, those with diabetes have poorer early renal function response than those without diabetes.”
Good & Hardin: Formulate hypotheses to be quantifiable, testable, and statistical in nature.

Classification of Study Designs
Observational studies
1. Descriptive or case-series
2. Retrospective (case-control)
3. Cross-sectional (prevalence), surveys
4. Prospective (cohort)
5. Retrospective cohort
Experimental studies
1. Controlled trials
   a) Parallel designs
   b) Sequential designs
   c) External controls
2. Studies with no controls
Meta-analyses
Adapted from Dawson & Trapp, Basic & Clinical Biostatistics (4th ed)

Observational Studies

Retrospective Designs
• Begin with disease/condition/outcome and look back for features (“exposure”) of those with and without outcome
• Useful for:
  – Hypothesizing causes of disease
  – Identifying risk factors
• Weaknesses:
  – Biased case and/or control selection
  – Biased exposure ascertainment
  – Temporal sequence of exposure/outcome

Retrospective Designs (cont.)
• Advantages:
  – Data availability (design of choice for chart reviews)
  – Usually inexpensive
  – Can be performed quickly
• Matching cases and controls:
  – Prevents imbalance of known risk factor and potential confounding
  – Can reduce variability (increase efficiency)
  – Requires special analysis techniques

Retrospective Design (example)
Lei et
al., “Familial aggregation of renal disease…” J Am Soc Neph (1998) 9:1270-1276
– Recruited 689 patients with new onset ESRD
– Used random-digit dialing to recruit 361 controls from geographic community
– Matched cases to controls (2:1) using 5-year age groups
– Obtained information on familial history of ESRD and other risk factors (age, race, sex, socioeconomic,…)
– Found patients with ≥ 2 relatives with ESRD at increased risk for ESRD

Retrospective Cohort Design
• Uses previously collected data on a well-defined cohort
• Common approach for disease or treatment registries since meticulous record-keeping is required
• All follow-up took place in the past
• Subject to many of the same biases as other retrospective designs
• Allows estimation of “prospective-like” measures

Retrospective Cohort (example)
Holland and Lam, “Predictors of hospitalization and death among pre-dialysis patients…” Nephrol Dial Transplant (2000) 15:650-658
– Identified predictors of first hospitalization in a cohort of 362 patients seen in “pre-dialysis” clinic
– Dialysis initiation and loss to follow-up were censored events
– Hospitalization (for any cause) was outcome
– Risk factors examined using survival analysis
– Took advantage of records kept in “pre-dialysis” clinic

Cross-sectional Designs
• Classifies a population or group with respect to both outcome and exposure at a single point in time
• Useful for:
  – Disease description
  – Diagnosis and staging
  – Describing disease processes, mechanisms
• Weaknesses:
  – Subject to sampling and recall biases
  – Temporal order problem
  – Can’t estimate disease incidence, only prevalence

Cross-sectional Design (example)
Hansen et al., “Prevalence of renovascular disease in the elderly…” J Vasc Surg (2002) 36:443-451.
– 834 participants in the CHS Study were examined with RDS at a single point in time
– RVD status determined and prevalence in CHS cohort estimated
– Increased age, lower HDL-c, and increased SBP associated with RVD

Surveys
• Single point-in-time studies; many utilize sampling techniques to assure generalizability
• Complex survey designs (e.g., NHANES, NIS HCUP) use probability sampling
  – Target population is divided into clusters; subsets of clusters are sampled randomly
  – Certain clusters may be “oversampled” to assure representation
  – Statistical analyses require special methods that correct variance for study design

Complex Survey (example)
Modrall et al., “Operative mortality for renal artery bypass in the United States” J Vasc Surg (2008) 48:317-322
– Examined RABG from NIS/H-CUP survey, 2000-2004
– Observed 10% in-hospital post-op mortality
– Risk factors for increased mortality included: age, female gender, Hx renal failure, CHF, lung disease
– In-hospital mortality higher than previously reported
– Used methods that accounted for survey design

Ecologic Studies
• Use data from large groups to compare rates of exposure and disease
• Data are on group-level (e.g., data on air pollution levels in specific cities could be compared to rates of lung cancer)
• Can lead to “ecologic fallacy”, because one doesn’t know whether the individuals with disease were actually subject to the exposure of interest
• Subject to “crackpot” biases

Ecologic Study (example)
Reynolds et al., “Childhood cancer and agricultural pesticide use…” Environ Health Perspect (2002) 110:319-324
– Examined incidence of childhood cancers in California in relation to pesticide use, 1988-1994
– Data sources: California Cancer Registry; U.S. Census; California Dept.
of Pesticide Regulations
– Looked at cancer of all types, and by specific types
– Found a significant association of childhood leukemia rates with the highest community-level use of propargite
– No other associations were observed

Prospective Designs
• Start with well-defined cohort and follow-up for occurrence of disease/outcome
• Considered the optimal design for observational studies
• Useful for:
  – Finding causes and estimating incidence of disease
  – Identification of risk factors
  – Following natural history, determining prognosis

Prospective Designs (cont.)
• Weaknesses:
  – Subject to selection bias (all studies are) and surveillance bias
  – Losses to follow-up or dropouts
  – Temporal changes in health habits (e.g., MRFIT)
• Can be expensive and always take time
• Advantages:
  – Correct temporal relationship between exposures and disease/outcome
  – Allows estimation of disease incidence and relative risks

Prospective Design (example)
Edwards et al., “Renovascular disease and the risk of adverse coronary events…” Arch Intern Med (2005) 165:207-213
– 840 CHS participants with RDS exams from Hansen et al.
– Followed for CVD events for an average of 14 months post-RDS
– Participants with RVD found to have nearly twice the rate of adverse CVD events during the observation period as those without RVD

Observational Designs
[Figure: timeline (relative to today) of participants by exposure status, E(+) or E(-), and by case or control status, for four designs: cross-sectional, retrospective (case-control), prospective (cohort), and retrospective cohort]

Experimental Studies

Clinical Trials
Participants are assigned to an experimental treatment and followed for event of interest
– Clinical trials may…
  a) …be randomized or non-randomized
  b) …include a control group or have no control group
  c) …compare current treatment to an historical control
  d) …employ parallel or cross-over design
  e) …employ blinding of investigator and/or participant
– The randomized, double-blind, placebo-controlled, parallel design is considered to be the best to determine efficacy

Clinical Trials (cont.)
Randomization
– Purpose: to balance groups on both observed and unobserved factors
– No guarantees: balance occurs in expectation (i.e., there is chance that some factors will not be balanced)
– In cross-over design, it’s best to randomize treatment order (if possible)
– Blocking used to assure treatment arm balance at fixed points
– Stratification used to assure balance on a factor of interest

Clinical Trial: Parallel Group Design
[Figure: participants screened for entry criteria are assigned to control treatment or experimental treatment; each arm ends with or without outcome; time axis: screening, baseline, treatment]

Clinical Trial (example 1)
Kay et al., “Acetylcysteine for prevention of acute deterioration of renal function…” JAMA (2003) 289:553-558.
– Experiment to test efficacy of antioxidant acetylcysteine to prevent acute nephrotoxicity
– 200 patients with moderate renal insufficiency undergoing elective coronary angiography
– Randomized, double-blind, placebo-controlled
– 12% with increase in SCr in placebo group vs.
4% in acetylcysteine group (P=0.03)

Clinical Trial: Crossover Design
[Figure: participants screened for entry criteria receive control treatment then experimental treatment, or experimental treatment then control treatment; each treatment period ends with or without outcome; time axis: screening, baseline (B/L), treatment (phase 1), washout, treatment (phase 2)]

Clinical Trial (example 2)
Whelton et al., “Effects of celecoxib and naproxen on renal function…” Arch Intern Med (2000) 160:1465-1470
– Experiment to compare effect of celecoxib vs. naproxen on renal function in elderly cohort
– 29 healthy elderly subjects took either celecoxib or naproxen for 10 days, had 7-day washout, then took other med for 10 days
– Randomized treatment order, single-blind design
– At day 6, GFR change on naproxen -7.5 mL/min/1.73 m² vs. -1.1 on celecoxib (P=0.004)

Clinical Trials (other types)
• Non-randomized trials: patients not assigned to treatment (or treatment order) via randomization; interpret with caution
• External or historical controls: compare current experiment to an external control group (e.g., from prior study or literature); interpret with caution
• Uncontrolled trial: experimental group only (no comparison); interpret with caution

Clinical Trial (example 3)
Gomes et al., “Acute renal dysfunction in high-risk patients after angiography…” Radiology (1989) 170:65-68
– 145 patients at “high-risk” for renal failure undergoing angiography after administration of iohexol (non-ionic contrast)
– Compared to 202 historical controls previously studied with ionic contrast
– Acute renal dysfunction observed in 5.5% of iohexol group vs.
10% of historical control group (P=NS)
– Authors use result to argue for new, randomized trial of two contrast agents

Clinical Trials (issues)
• Blinding: double-blind is optimal but not always feasible
  – Surgical trials usually impossible to blind both investigator and participant
  – Some trials are “open-label” and treat participants to a goal; others test a behavioral intervention
  – Group interventions are typically not blinded; must also account for “clustering” in intervention
• If possible, always blind staff performing measurements
• Avoid surveillance and/or ascertainment bias

Clinical Trials (issues)
• Look out for loss to follow-up, differential attrition, and poor adherence to treatments
• Intention-to-treat: when analyzing outcomes, participants are included in analyses based on treatment group assignment regardless of treatments received or adherence
  – Necessary to avoid potential bias due to self-selection
  – Preserves randomization
  – Drug and device companies love to do analyses based on treatments received

Meta-analysis
• Pools results across multiple studies
• A review article with quantitative summary
• Typically combines results of several experimental studies
  – Useful for combining small studies
  – Studies should have same or similar treatments
  – Pools results to get single measure of effect
• Beware: meta-analyses combining experimental and observational designs
• Dependent upon articles reporting sufficient data (N, effect measure, variance)

Meta-analysis (example)
Leertouwer et al., “Stent placement for renal arterial stenosis…” Radiology (2000) 80:78-85
– Compared studies of RVD repair with stent placement vs.
PTA alone
– Combined data on technical success rate, BP response, renal function response, anatomic F/U from 14 studies of stent placement and 10 studies of PTA
– Conclusion: “Renal artery stent placement is technically superior and clinically comparable to renal PTA alone.”

Data Collection for Statistical Analyses
1. Enter all or most of the data as numbers. Avoid entering letters, words, string variables (e.g., NA, 22%, <3.6), or anything that resembles a cartoon curse word (@#&*%). In Excel, all columns, with the exception of names and text comments, should be formatted as numbers or dates (not as general or text).
2. Give each column a unique, simple, 1-word name, 8 characters or less with no spaces, beginning with a letter, and place this name in the first row.
3. Put only one variable in a column. Do not combine variables in the same column.
4. Enter each patient (or unit of analysis) on a separate line, beginning on the second line.
5. Give each research participant or patient a unique case number (1, 2, 3, etc.) in the first column. Delete patient name, SS#, MR#, and any identifying information before sending it to a statistician. Always save the spreadsheet with a password.
http://biostat.mc.vanderbilt.edu/twiki/bin/view/Main/DataTransmissionProcedures?CGISESSID=9fe1d0d63a71d176ca460de518acf2cf

Data Collection for Statistical Analyses
6. Enter cases and controls in the same spreadsheet. Use one variable to define the control group (TREATED 0=no, 1=yes or GROUP 1=Drug A, 2=Drug B).
7. Quantify. Enter continuous measurements when possible.
8. Create a simple guide (or key) using a word processor to explain variable abbreviations, value coding, and how missing values were entered. Be consistent.
9. Think through the analysis before collecting any data.
10. Have a biostatistician review the coding before data entry and again after the first 10 patients have been entered.
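
Several of the rules above (numeric entries, short one-word column names, unique case numbers) can be checked automatically before a spreadsheet goes to a statistician. The following is only a minimal sketch in Python, assuming the sheet has been exported to CSV text. The function name check_sheet, the exempt_columns parameter, and the sample columns (id, age, treated, scr) are hypothetical and not part of the original guidance. Date and free-text comment columns are not parsed here and would have to be listed in exempt_columns.

```python
import csv
import io
import re

# Rule 2: one-word name, at most 8 characters, beginning with a letter.
NAME_RE = re.compile(r"[A-Za-z][A-Za-z0-9_]{0,7}")
# Rule 1: a plain integer or decimal number, optionally negative.
NUM_RE = re.compile(r"-?\d+(\.\d+)?")


def check_sheet(csv_text, exempt_columns=()):
    """Return a list of problems found in a CSV export of the spreadsheet.

    exempt_columns (hypothetical parameter) names columns allowed to hold
    dates or text comments. Empty cells are treated as missing values and
    are not flagged.
    """
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, data = rows[0], rows[1:]
    problems = []

    # Rule 2: check every column name in the first row.
    for name in header:
        if not NAME_RE.fullmatch(name):
            problems.append(f"bad column name: {name!r}")
    if len(set(header)) != len(header):
        problems.append("duplicate column names")

    # Rule 5: the first column should hold a unique case number.
    ids = [row[0] for row in data if row]
    if len(set(ids)) != len(ids):
        problems.append("case numbers are not unique")

    # Rule 1: all other cells should be numeric (data start on line 2).
    for line_no, row in enumerate(data, start=2):
        for name, cell in zip(header, row):
            if name in exempt_columns or cell == "":
                continue
            if not NUM_RE.fullmatch(cell):
                problems.append(
                    f"row {line_no}, {name}: non-numeric value {cell!r}"
                )
    return problems


# Made-up sample sheet with a repeated case number and two string entries.
SHEET = """id,age,treated,scr
1,67,1,1.4
2,72,0,<3.6
2,58,1,NA
"""

for problem in check_sheet(SHEET):
    print(problem)
```

On the made-up sample sheet, the sketch reports the repeated case number and the two string entries (<3.6 and NA) in the scr column. It does not cover rules that need judgment, such as whether a column combines two variables.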

[Figures: “Spreadsheet from Hell” and “Spreadsheet from Heaven”]