Introduction to Outcomes Research Methods and Data Resources
David C. Chang, PhD, MPH, MBA
Director of Outcomes Research
UCSD Department of Surgery

Surgery and public health

Problem in surgical clinical research
• Unregulated
• FDA regulation applies only to “devices” (whether a real device, or a molecular device in the form of a drug)
• Procedural medicine is not regulated
• Many reasons: complexity, difficulty in standardizing, difficulty of enforcement (“surgeons know best” attitude)
• Self-regulation

Erroneous literature

RCTs often too late
EVAR-1, DREAM
“Tipping Point”
OVER

Social responsibility
• It is our responsibility in academic medicine to shoulder the responsibility that, in other fields of medicine, has been assumed by the FDA
• To ensure that only good treatment modalities are applied to patients

Biggest barrier to good research?
• Not having a correctly constructed hypothesis
• Incorrect design
• Not knowing how to get data
• Fear of statistics

Typical questions
• Components
  • What/why/when/how
  • Verb
  • Condition
• “Why is the sky blue?”
• “What is the typical presentation of appendicitis?”
• Open-ended

Open-ended questions
• Descriptive analysis
• Observational study = no comparison = no statistical test
• Only one denominator
• May have more than one numerator, generating more than one ratio
• All ratios are calculated with the same denominator

Descriptive statistics
43% / 57%
P value not applicable to compare different parts of the same population

Value and pitfall
• To explore the unknown
• When you know nothing, the first step is to explore and document the numbers
• Risk of over-generalizing

Inferential statistics
43% / 57% vs. 45% / 55%
P value applicable for comparing parts of two populations

What is a hypothesis?
• Question ≠ hypothesis
• Questions: usually open-ended
• Hypothesis: usually closed-ended, asking for a yes/no answer
• Statistical testing can only give yes/no answers

The process – study design
Study design phase          Data preparation              Analysis phase
Question development        Select database               Univariate
Define population           Link database                 Bivariate
Define subset               Select data elements          Multivariable
Define outcome              Generate new data elements    Sensitivity
Define primary comparison                                 Subset analysis
Define covariates

Steps in constructing a hypothesis
• Specify the outcomes (O in PICO)
  • Common oversight: the focus is often on the P, while the O stays vague (a typical question: “What is the outcome (?) of xyz patients?”)
• Specify the comparisons (C in PICO)
  • Not done in open-ended questions
• Specify covariates (control variables, adjustment)

Hypothesis statement
• $y = b_1 X_1 + b_2 X_2 + b_3 X_3$
• Death = age + race + gender + insurance…

Inclusion/exclusion criteria
• Just like a clinical trial (“eligibility criteria”)
• Diagnosis and/or procedure codes?
• Common mistake

Comparison
43% / 57% vs. 55% / 45%

Outcome
• Mortality?
  • Rare
• Complications
• Length of stay
• Charges
• Be judicious

Covariates / independent variables
• Patient demographics
• Patient comorbidity
• Surgeon volume
• Hospital volume
• Hospital type (teaching vs non-teaching)
• Area (rural vs urban)

Hierarchy of influence on surgical outcomes
Nation → Region → Hospital → Surgeon → Technique and Management → Patient
Outcomes research (toward the Nation end); Clinical trials (toward the Patient end)

The process – data preparation

Overview of public and semi-public databases
Multi-specialty
• Administrative databases
  • Nationwide Inpatient Sample (NIS)
  • Medicare, Medicaid
  • California OSHPD
• Clinical databases
  • National Surgical Quality Improvement Program (NSQIP)
Specialty-specific
• Trauma
  • National Trauma Data Bank (NTDB)
• Oncology
  • Surveillance, Epidemiology, and End Results (SEER)
  • National Cancer Database (NCDB)
• Transplant
  • United Network for Organ Sharing (UNOS)

Administrative databases
Advantages
• Large patient numbers
• Less selection bias
• Can be linked to other databases containing other nonmedical information
Disadvantages
• Limited clinical course information
• Limited surgical procedure information

[Figure: In-Hospital Mortality: Single Reference Group. Odds ratio of in-hospital mortality, NSQIP vs. non-NSQIP, by year, 1995 to 2009.]

[Figure: Cumulative Incidence of Adhesive Small Bowel Obstruction After an Isolated Abdominal Surgery. Curves for Partial Colectomy, Gastric Bypass, Hysterectomy, Cholecystectomy, Appendectomy, C-Section.]

Number at Risk (columns are successive follow-up time points, left to right)
Partial Colectomy   150,782  116,131   92,750   72,608   54,877   39,231   24,848   12,419    576
Gastric Bypass       72,404   59,894   47,221   33,005   17,556    8,071    3,320    1,039     30
Cholecystectomy     488,387  411,674  346,067  285,565  225,971  166,769  111,778   56,541  2,896
Hysterectomy        431,380  382,672  332,072  279,197  223,313  162,445  106,503   53,150  2,349
Appendectomy        413,557  353,208  296,143  240,536  184,135  130,948   83,744   40,897  1,973
C-Section           822,811  672,777  546,099  442,431  351,351  269,203  189,146  102,206  5,045

Select data elements

Generate new data elements
• Most time-consuming step of outcomes analysis
• Not every component of your research question is readily available in the database
  • For example, comorbidity
  • Charlson Index, Elixhauser Index
• Some common concepts are actually undefined
  • Readmission?

What is a “re-admission”?
• Not all “admissions” are “re-admissions”
• 30-day?
• Elective?
• Transfers?
• Diagnosis-specific?
• Preventable?

The process – analysis

Hypothesis statement
• $y = b_1 X_1 + b_2 X_2 + b_3 X_3$
• Death = age + race + gender + insurance…

Table 1: Descriptive analysis
Table 2: Bi-variate analysis (unadjusted comparison)
Table 3: Multivariable analysis (adjusted analysis)

Analysis for Table 1
43% / 57%
P value not applicable to compare different parts of the same population

Analysis for Table 1
• % for categorical data
• Mean/median/SD for continuous data
• For exploratory studies, descriptive studies, case series, etc., this would be the end of the process
• Reminder: avoid overgeneralizing

Analysis for Table 2
• Think about data types…
  • Continuous data
  • Categorical data
  • (Ordinal data)

Analysis for Table 2
• Two questions to think about when picking a stats test…
• What is my outcome/dependent variable?
• What is my independent/input variable?
• What type of data do I have for each?
• 4 possible combinations:
  • 2 variables
  • 2 data types

Analysis for Table 2
                  Input: Cat.           Input: Cont.
Outcome: Cat.     χ²                    ROC
Outcome: Cont.    T-test / Rank sum     Correlation

Analysis for Table 3
                  Input: Cat.           Input: Cont.     Multivariable
Outcome: Cat.     χ²                    ROC              Logistic regression
Outcome: Cont.    T-test / Rank sum     Correlation      Linear regression

Subset analysis
• Consistency of findings
• Generalizability

Generalizability
“This is not research anymore”
“That guy”
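
The test-selection grid for Tables 2 and 3 can be written down as a small lookup, which may help when deciding on a test before opening a statistics package. The following is only a minimal sketch in plain Python, not the presenter's method. The names `pick_tests`, `pick_model`, and `chi_square_2x2` are hypothetical, and the assignment of tests to the outcome and input axes is an assumption based on standard usage. The chi-square helper implements the Pearson statistic for a 2x2 table without continuity correction. In the usage note, the counts of 100 patients per population are also an assumption, since the slides give only percentages.

```python
from math import erfc, sqrt

# Bivariate (Table 2) tests keyed by (outcome type, input type).
# Axis assignment is an assumption based on standard usage.
BIVARIATE_TESTS = {
    ("categorical", "categorical"): ["chi-square"],
    ("categorical", "continuous"): ["ROC"],
    ("continuous", "categorical"): ["t-test", "rank sum"],
    ("continuous", "continuous"): ["correlation"],
}

# Multivariable (Table 3) model keyed by outcome type.
MULTIVARIABLE_MODELS = {
    "categorical": "logistic regression",
    "continuous": "linear regression",
}


def pick_tests(outcome_type, input_type):
    """Return candidate unadjusted tests for one outcome/input pair."""
    return BIVARIATE_TESTS[(outcome_type, input_type)]


def pick_model(outcome_type):
    """Return the regression family for an adjusted analysis."""
    return MULTIVARIABLE_MODELS[outcome_type]


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square for the table [[a, b], [c, d]].

    Returns (statistic, p_value) with 1 degree of freedom and no
    continuity correction. All row and column totals must be non-zero.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    stat = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With 1 degree of freedom, P(X > stat) = erfc(sqrt(stat / 2)).
    p_value = erfc(sqrt(stat / 2))
    return stat, p_value
```

For example, `pick_tests("continuous", "categorical")` returns the t-test and rank sum options, and `chi_square_2x2(43, 57, 45, 55)` compares the two populations from the inferential statistics slide under the assumed 100 patients each. A single population split 43% / 57% has no second row to compare against, which is why no P value applies to descriptive statistics.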