University of Nebraska Medical Center
Methods Used in Comparative Effectiveness Research
September 18, 2012
by Susan D. Horn, Ph.D.
Institute for Clinical Outcomes Research
699 East South Temple, Suite 300
Salt Lake City, Utah 84102
801-466-5595 (V)
801-466-6685 (F)
shorn@isisicor.com
www.isisicor.com

The Problem
• How best to treat the patient in your office right now?
• “Scientific studies” provide imperfect guidance
• Clinical medicine is untidy; innumerable variables describe patients, providers, and practices
• Must consider clinical variability of patient populations, intervention combinations, and outcomes

Objectives of Presentation
1. Describe, compare, and contrast models to conduct comparative effectiveness research, including randomized controlled trials, analysis of claims databases, electronic medical record databases, condition or treatment registries, and practice-based evidence (PBE) study methodology.
2. Demonstrate how evidence-based practice and comparative effectiveness of treatments and treatment strategies can be derived using PBE study designs and health information systems, including clinical examples.
3. Present examples of PBE clinical findings related to quality, policy, and safety improvements.

Comparative Effectiveness Research – What is it?
• Comparative effectiveness research is designed to inform health-care decisions by providing evidence on the effectiveness, benefits, and harms of different treatment options.
• Evidence is generated from research studies that compare drugs, medical devices, tests, surgeries, or ways to deliver health care.

Comparative Effectiveness Research – What types of evidence are used?
There are two ways that evidence is found:
1. Systematic reviews of existing evidence. Researchers look at all available evidence about benefits and harms of each treatment choice for different groups of people from existing clinical trials, clinical studies, and other research.
2. New studies.
Researchers conduct studies that generate new evidence of effectiveness or comparative effectiveness of a test, treatment, procedure, or healthcare service.

What We Have and What We Need
We have: Efficacy trials that determine whether an intervention produces a specified result(s) under well controlled conditions in a selected population; these include randomized controlled trials (RCTs).
We need: Effectiveness trials that measure outcomes of an intervention under “real world” conditions in an unselected clinical population. Hypotheses and study designs of an effectiveness trial are formulated based on conditions of routine clinical practice and on outcomes essential for clinical decisions.

More Definitions
• Efficacy – benefit under the best possible circumstances (Can it work?)
  » Generation – RCT, explanatory
  » Synthesis – systematic review of RCTs (e.g., Cochrane)
• Effectiveness – benefit under ordinary practice (Does it work?)
  » Generation – effectiveness trial, pragmatic trial, observational studies
  » Synthesis – systematic review of RCTs and observational studies

Effectiveness Trials and RCTs
[Diagram labels: RCT; Practice effects of RCT results; Progenitor of RCTs; Effectiveness]

Databases for Effectiveness Trials
• RCT databases
• Large claims databases, e.g., Medicare, Medicaid, CDC
• HMO or VA databases from claims and electronic medical records
• Specific condition registries such as arthritis registry
• Practice-based evidence study registries
PBE studies overcome limitations of RCTs (that limit patient types and treatments)
More detailed patient, process, and outcome evaluation than is possible with traditional registries or large claims datasets

RCT Databases for Comparative Effectiveness Research
• RCT databases – gold standard for efficacy and value of interventions
• Advantages include:
  » High internal validity
  » Causal inferences can be made since patients with known confounders are excluded and randomization eliminates unknown confounders

RCT Databases for Comparative Effectiveness Research
• RCT databases – gold standard for efficacy and value of interventions
• Limitations include:
  » Small sample sizes – too small to detect uncommon risks
  » Follow-up periods too short to assess long-term benefits/risks
  » Higher-risk patients typically excluded; limited external validity
  » Level of monitoring more rigorous than done in routine practice
  » High rates of treatment discontinuation

Electronic Databases for Comparative Effectiveness Research
• Three primary sources of electronic databases:
  » Large health systems with electronic medical record data, e.g., HMO, HMO Research Network (HMORN), and VA databases
  » Insurance claims databases, e.g., Medicare, Medicaid, HCUP data
  » Disease or procedure-specific registries

Electronic Databases for Comparative Effectiveness Research
• Advantages include:
  » Identify and track patient populations over time
  » Measure treatment exposures over time
  » Assess some health status, health behaviors, and other potential confounders
  » Assess both positive and negative outcomes over time
  » Better suited to evaluate safety as opposed to effectiveness

Insurance Claims Electronic Databases for Comparative Effectiveness Research
• Many States and health care and pharmacy insurance providers make administrative claims data available, including Medicare, Medicaid, Healthcare Cost and Utilization Project (HCUP), Blue Cross Blue Shield, and United Health.
• Databases contain information on all health care encounters in which billable services were delivered

Insurance Claims Electronic Databases for Comparative Effectiveness Research
• Advantages include:
  » Cover millions of people
  » Ability to provide treatment exposures and adverse events, including hospitalizations and mortality, over extended periods of time
  » Provide population and subpopulation-based estimates for various outcomes

Insurance Claims Electronic Databases for Comparative Effectiveness Research
• Limitations include:
  » Functional and cognitive status, severity of illness, and health behaviors cannot be obtained and can be important unmeasured confounders
  » Possibility of exposure misclassification if patient does not apply for insurance coverage for some treatment

Health System Electronic Databases for Comparative Effectiveness Research
• Advantages include:
  » Data are usually of high quality
  » Cover millions of persons including minority and elderly populations
  » Exist in electronic form so some data elements may be exported for use in comparative effectiveness research

Health System Electronic Databases for Comparative Effectiveness Research
• Limitations include:
  » Restricted ability to capture patients’ severity of illness, functional and cognitive status, and health behaviors other than smoking or pain
  » Restricted access for CER unless researcher is part of organization that owns the data
  » Required variables are often stored as free text and so are not exportable

Clinical Registry Electronic Databases for Comparative Effectiveness Research
• Systematically collected and stored health-related information on specific patient populations, most often defined by a particular illness or procedure.
• Advantages include:
  » Designed to collect detailed information related to a particular illness or procedure, e.g., arthritis registry, CABG registry, cancer registries
• Limitations include:
  » Typically created as an add-on or separate database from those used for clinical care or payment
  » Limited information outside of particular illness or procedure

The Problems
• Minimizing Bias in Existing Data
• Capturing variation in intervention combinations
• Capturing variation in outcomes

The Balancing Act
[Diagram labels: Longer follow-up; Strong external validity; Strong internal validity; Balanced groups; Outcomes clearly defined; Different than routine care; Defined patient population; Confounded by life; Real world settings; Large sample size]

Causal Inference Comparison
| Criterion | Randomized Experiment | Quasi-Experiment |
|---|---|---|
| Cause precedes effect | Yes | Yes |
| Cause covaries with effect | Yes | Yes |
| Alternate explanations implausible | Yes | ? |

Basic Problems in Non-Randomized Studies
• Non-independence of Observations
  » Many statistical analyses are based on the assumption that the observations are independent.
  » Studying patients treated in a single hospital, unit, or provider breaks this assumption
  » Intra-correlation among observations

Modeling to Account for Non-Independence
• Hierarchical designs that account for the intra-correlation
• Generalized estimating equation (GEE) regression models for clustered data
• Robust standard errors

Basic Problems in Non-Randomized Studies
• Confounding by Indication
  » Therapies administered in non-random fashion
  » Prognostic characteristics influence therapy
  » Recipients of therapy at high risk for outcomes
  » Users differ from comparators in key respects

Propensity Score Theory
• Multivariable scoring method that collapses multiple observed predictors of treatment into a single value (a score)
  » Probability that a subject with given characteristics receives specified treatment
  » Removes confounding by components of the score
  » Used to: match, stratify, or model

Instrumental Variable Analysis Theory
• Identify instrument variables that are randomly associated with individual case, correlated with treatment, but uncorrelated with outcomes (other than through the treatment)
• Therefore, the instrument can effectively randomize subjects across treatment arms to achieve equal distribution
• Controls for underlying differences in groups that are unobservable (endogeneity bias due to unobserved heterogeneity)
• Supposedly permits estimation of causal effects even when important confounders are unmeasured

Instrumental Variables (IV) to address confounding by indication/selection bias
• Used to estimate effect of missing predictor variable or when there is measurement error in predictor variable
• Instrument itself does not belong in prediction equation but is correlated with missing predictor variable. For example,
  » Severity of illness measure is missing; instead use distance from patient home to treatment center as IV
  » No measure of smoking in a community; instead use amount of tobacco taxes collected as IV to assess smoking effect on health outcome.
• Disadvantage: Assumptions about IV not testable

Overcoming Selection Bias/Confounding by Indication
• Statistical adjustments:
  – Matching
  – Propensity scoring/instrumental variables
  – Covariate adjustments (Severity of Illness)
• Ongoing debate about the adequacy of adjustments

PBE Presentation Overview
• What is practice-based evidence (PBE) study methodology?
• How does it incorporate
  – patient heterogeneity
  – treatment heterogeneity
  – outcome heterogeneity
• Discuss examples of PBE comparative effectiveness findings
• Discuss implications for health information technology

PBE Methodology
What makes this approach different?

Comparative Effectiveness Issues Addressed Using PBE
• Both patients and providers report data
• Data come from existing EMR with standardized data elements about patient characteristics, treatments, processes, patient-reported data, and multiple outcomes
• Data are part of routine documentation, so not an ‘add-on’
• Rapid patient accrual since documentation is standard of care
• Longitudinal and ongoing

Comparative Effectiveness Issues Addressed Using PBE (cont)
• Patient comparability is addressed with Comprehensive Severity Index (CSI): disease-specific, physiologic-based, >2,200 criteria, >5,500 disease-specific criteria sets
• CSI addresses confounding by indication/selection bias
• Database includes all treatments with date/dose/intensity/route
• Can assess drug and non-drug combination therapies
• Findings of PBE-CER more readily translated into practice

Practice Based Evidence (PBE) Design
PBE is observational research that:
  » captures differential outcomes
  » associated with naturally occurring variations in treatment
  » while adjusting for severity of illness and injury, co-morbid conditions, and co-occurring treatments

Practice-Based Evidence Study Design
Standardize documentation of: Process Factors
• Management Strategies
• Interventions
• Medications
Control for: Patient Factors
• Psychosocial/demographic Factors
• Disease(s)
• Severity of Disease(s)
  › physiologic signs and symptoms
• Genetic information
• Measured at Multiple Points in Time
Measure: Outcomes of interest
• Clinical
• Health Status
• Functional
• Cost/LOS/Encounters
• Productivity

7 Signature Features of PBE Studies
1. Hypotheses can be focused or broad
2. All interventions are considered to determine relative contribution of each
3. Broad patient selection criteria maximize generalizability and external validity
4. Detailed characterization of the patient by robust measures of patient severity, genetic information, and functional status

7 Signature Features of PBE Studies
5. Patient differences controlled statistically rather than through randomization
6. Facility and clinical/patient buy-in through use of trans-disciplinary Clinical Practice Team
7. Strength of evidence built through the research process
PBE findings are more generalizable and transportable than RCT findings

PBE Signature Feature: Interventions
2. Consider all interventions to determine relative contribution of each.
Uses a detailed characterization of the care process through a well-designed point-of-care (POC) documentation system
  – User-defined and user friendly
  – Time sensitive characterization of all interventions

PBE Signature Feature: Patients
4. Detailed characterization of the patient by robust measures of individual severity and functional status
• Includes Comprehensive Severity Index (CSI®)
  – Over 2,200 condition-specific signs, symptoms, and physical findings
  – Continuous score: 0 to ∞
  – Admission, discharge, maximum during stay, visit
• Genetic information
• Includes Functional Independence Measure (FIM) and/or other measures of functional status

PBE Signature Feature: Clinical Practice Team
6. Facility and clinical buy-in through use of transdisciplinary Clinical Practice Team that:
• Develops and frames the questions
• Defines variables
• Gathers data
• Interprets data
• Implements findings
• Fosters clinical and individual buy-in (bottom-up)
• Facilitates knowledge translation

PBE Signature Feature: Strength of Evidence
7. Strength of evidence built through the research process
■ Added confounders preserve the significant association
■ A change in outcomes follows a change in treatment as predicted by the PBE model
■ Repeated studies on the same topic yield similar findings

PBE Hallmarks
• Decisions are made by front-line clinicians vs. researchers
• “Bottom-up” vs. “Top-down” approach
• Guidance from researchers (scientific advisory board) and patient experience

PBE Hallmarks
  – Non-experimental: Follows outcomes of treatments actually prescribed
  – Inclusive: Uses patient populations undergoing routine clinical care
  – Pragmatic: Uses actual clinical outcomes
  – Lower Cost than RCTs
  – Faster than RCTs

Practice-Based Evidence Study Design
• High external validity
  - Includes essentially all patients with specific condition
  - Captures confounders that could affect relevant treatment responses.
  - Reduces accidental associations between treatments and outcomes.
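
The point about capturing confounders can be illustrated with a small Python sketch. Everything in it is an assumption made for illustration: the counts are invented (they are not PBE study data), and a two-level severity label stands in for a real severity measure such as CSI. Sicker patients are treated more often, so the crude comparison makes the treatment look harmful, while the severity-stratified comparison shows a benefit in both strata.

```python
from collections import defaultdict


def build_cohort():
    """Return invented patient rows: (severity, treated, good_outcome)."""
    # (severity, treated, n_patients, n_good_outcomes); hypothetical counts
    cells = [
        ("low", 1, 20, 18),
        ("low", 0, 80, 64),
        ("high", 1, 80, 40),
        ("high", 0, 20, 6),
    ]
    rows = []
    for severity, treated, n, good in cells:
        rows += [(severity, treated, 1)] * good
        rows += [(severity, treated, 0)] * (n - good)
    return rows


def good_rate(rows):
    """Share of rows with a good outcome."""
    return sum(r[2] for r in rows) / len(rows)


def crude_difference(rows):
    """Good-outcome rate in treated minus untreated, ignoring severity."""
    treated = [r for r in rows if r[1] == 1]
    control = [r for r in rows if r[1] == 0]
    return good_rate(treated) - good_rate(control)


def adjusted_difference(rows):
    """Average of within-severity differences, weighted by stratum size."""
    strata = defaultdict(list)
    for r in rows:
        strata[r[0]].append(r)
    total = len(rows)
    return sum(len(m) / total * crude_difference(m) for m in strata.values())
```

With these made-up counts, `crude_difference(build_cohort())` is about -0.12 and `adjusted_difference(build_cohort())` is about +0.15: the sign of the apparent treatment effect flips once severity is controlled for.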
Practice-Based Evidence Study Design
Standardize documentation of: Process Factors
• Management Strategies
• Interventions
• Medications
Control for: Patient Factors
• Psychosocial/demographic/history factors
• Disease(s)
• Severity of Disease(s)
  › physiologic signs and symptoms
• Genetic information
• Multiple Points in Time
Measure: Outcomes of interest
• Clinical
• Health Status
• Functional
• Cost/LOS/Encounters
• Productivity

Examples of Severity Systems to address Selection Bias/Confounding by Indication
Diagnostic/Procedure Based Systems
• Clinical definition of severity
  » Body Systems Count
  » Charlson Comorbidity Index (1-yr death)
  » Disease Staging (hosp death)
Physiologic/Clinically Based Systems
  » APACHE II & III (ICU death)
  » Medisgroups (Atlas) (hosp death)
  » Patient Management Categories (hosp death)
• Resource-based definition of severity
  » Acuity Index Method (LOS)
  » APR DRGs (hosp $)
  » Patient Management Categories (hosp $)
  » Refined DRGs (hosp LOS, $)
CSI®

Comprehensive Severity Index (CSI®) used to avoid selection bias or confounding by indication
• Severity defined as “physiologic complexity presented to medical personnel due to the extent and interactions of a patient’s diseases”
• Disease-specific: 5,500 disease-specific groups; over 2,200 distinct criteria.
ICD-9 codes trigger disease-specific patient signs, symptoms, and physical findings used to score disease-specific and overall severity levels
• No treatments used as criteria
• Comprehensive (all diseases)
• Clinically credible: computes disease-specific and overall severity levels
• Can measure severity at multiple time points
• Allows statistical comparison of interventions without confounding by severity of illness

Characteristics of the CSI System
• Developed as an addition to ICD-9-CM
• Explicit severity criteria established for each ICD-9-CM code
• Severity criteria result in a continuous severity score for each diagnosis
• Failure to meet enough criteria results in a severity rating of zero
• Overall patient severity computed as continuous score, taking all diagnoses into account

CSI Severity Indicators
Physiological signs and symptoms of a disease
  - Vital signs
  - Laboratory values
  - Radiology findings
  - Other physical findings
Severity indicators are specific to each disease based on ICD-9-CM coding

Pneumonia Criteria Set
Codes: 480.0-486; 506.3; 507.0-507.1; 516.8; 517.1; 518.3; 518.5; 668.00-668.04; 997.3; 112.4; 136.3; 055.1
Criteria by category and severity level (1 to 4):
Cardiovascular
  1: pulse rate 51-100; ST segment changes-EKG; systolic BP 90mmHg
  2: pulse rate 100-129; 41-50; PACs, PAT, PVCs-EKG; systolic BP 80-89mmHg
  3: pulse rate 130; 31-40; systolic BP 61-79mmHg
  4: pulse rate 30; asystole, VT, VF, V flutter; systolic BP 60 mmHg
Fever
  1: 96.8-100.4 and/or chills
  2: 100.5-102.0 oral; 94.0-96.7
  3: 102.1-103.9; 90.1-93.9 and/or rigors
  4: 104.0; 90.0
Labs: ABGs
  1: pH 7.35-7.45; pO2 61mmHg
  2: pH >7.46; 7.25-7.34; pO2 51-60mmHg
  3: pH 7.10-7.24; pO2 50mmHg
  4: pH 7.09
Labs: Hematology
  1: WBC 4.5-11.0K/cu mm; bands <10%
  2: WBC 11.1-20.0K/cu mm; 2.4-4.4K/cu mm; bands 10-20%
  3: WBC 20.1-30.0K/cu mm; 1.0-2.3K/cu mm; bands 21-40%
  4: WBC 30.1K/cu mm; 1.0K/cu mm; bands 40%
Neuro Status (confusion; lowest Glasgow coma score)
  1: Glasgow coma score 12
  2: chronic confusion; Glasgow coma score 9-11
  3: acute confusion; Glasgow coma score 6-8
  4: unresponsive; Glasgow coma score 5
Radiology: Chest X-Ray or CT Scan (entries in increasing order of severity)
  - infiltrate and/or consolidation in 1 lobe; pleural effusion
  - infiltrate and/or consolidation in >1 but 3 lobes
  - infiltrate and/or consolidation in >3 lobes; cavitation or lung necrosis
Respiratory
  1: white, thin, mucoid sputum
  2 to 4 (entries in source order): dyspnea on exertion; stridor; rales 50%/3 lobes; decreased breath sounds 50%/3 lobes; positive for fremitus; stridor; hemoptysis NOS; blood tinged or purulent or frothy sputum; cyanosis present; dyspnea at rest; rales >50%/3 lobes; decreased breath sounds >50%/3 lobes; apnea; absent breath sounds >50%/3 lobes; frank hemoptysis
Copyright 2006. Susan D. Horn. All rights reserved. Do not quote, copy or cite without permission.

Uses of PBE Findings
PBE findings can be used to
• Create protocols depending on patient characteristics that result in better outcomes
• Evaluate treatments or programs

Examples of PBE Study Findings
• Children Hospitalized with RSV. Birth at 33-35 weeks GA is significantly associated with higher intubation rates, longer ICU stays, and longer hospital LOS (prompted guideline change for prophylaxis).
• Low-Level Stroke. More time spent in high-level rehabilitation activities, such as gait, upper extremity control, and problem solving in the first three hours of therapy is significantly associated with higher total, motor, and cognitive FIM at discharge.
Pediatric Bronchiolitis Study: Length of Stay
[Bar chart: length of stay (days) by site; x-axis order Site 6, 5, 7, 9, 4, 1, 3, 8, 2, 10]

Pediatric Bronchiolitis Study: Cost
[Bar chart: cost (dollars) by site; x-axis order Site 6, 5, 7, 9, 4, 1, 3, 8, 2, 10]

Pediatric Bronchiolitis Study
Outcome = Length of Stay
n=804, R2=.62
Assessment and Procedures predictors (sign of association, p-value):
- Age in months (.0006)
+ O2 used (.01)
+ MCSIC (.0001)
+ Steroids (.0001)
+ Antibiotics (.0001)
+ Intubation (.0001)
+ Lasix (.0001)
+ Interaction of chest physiotherapy and atelectasis (.0001)

Pediatric Bronchiolitis Study
Outcome = Cost
n=722, R2=.73
Assessment
- Age in months (.0001)
+ MCSIC (.0001)
Procedures
+ Admitted to PICU (.0001)
+ Arterial line (.04)
+ Central line (.003)
+ Continuous nebulization (.0002)
+ Interaction: chest pt & atelectasis (.005)
+ Intubation (.0001)
+ Ipratropium bromide (.005)
+ Lasix (.0001)
+ Ribavirin (.0001)
+ Steroids (.0003)
Willson, et al. PEDIATRICS 2001;108(4):851-855.

Prematurity and RSV Hospital Outcomes
Significant Differences by Gestational Age Groups
33-35 week GA infants had highest hospital resource use
| Outcome | < 32 wks | 33-35 wks | 36 wks | > 37 wks | p-value |
|---|---|---|---|---|---|
| Intubation | 21.4% | 38.7% | 20% | 12.1% | 0.002 |
| ICU LOS | 5.8 days | 7.7 days | 4.2 days | 3.8 days | 0.021 |
| Hospital LOS | 6.8 days | 8.4 days | 4.9 days | 4.1 days | <0.0001 |
| Admitted to ICU | 39.3% | 48.4% | 30.0% | 27.9% | 0.101 |
| HX of Hosp. for RSV/Bronchiolitis | 14.3% | 16.1% | 6.7% | 6.1% | 0.137 |

RSV Hospital Outcomes and Policy Changes
Conclusions
• 33-35 week GA infants had highest hospital resource use
• 36 week infants have risk similar to full term infants
• Changed guidelines for immunoprophylaxis for 33-35 week infants
• Changed guidelines for intubation – try ‘stimulating’ first

Post-Stroke Rehabilitation Study
2001 – 2003; 1,161 patients
Study Objectives
PBE study designed to discover what combinations of medical devices, therapies, medications, feeding approaches, and their interactions worked best for specific types of stroke patients treated in real-world practices.

Post-Stroke Rehabilitation Study
Trans-Disciplinary Project Clinical Team
• Physicians
• Nurses
• Social Workers
• Psychologists
• Physical Therapists
• Occupational Therapists
• Recreation Therapists
• Speech/Language Pathologists

Outcome: Discharge Motor FIM
Severe Stroke – Full Stay
General Assessment
- Age
+ Mild motor impairment
+ Admission Motor FIM
+ Admission Cognitive FIM
- Black race
PT Interventions
- Formal assessment
- Bed mobility
+ Gait
+ Advanced gait
OT Interventions
+ Home management
SLP Interventions
- Swallowing
- Orientation
+ Reading comprehension
Medications
- Anti-Parkinsons
- Modafinil
- Old SSRIs
+ Atypical antipsychotics
General Interventions
- Days onset to rehab
+ Enteral feeding

Outcome: Discharge Motor FIM
Severe Stroke – 1st 3 hour Therapy block only
General Assessment
- Age
- Severe motor impairment
+ Admission Motor FIM
+ Admission Cog. FIM
+ No Dysphagia
+ Neurotropic Impairments treated with meds
PT Interventions
- Bed mobility time in 1st 3 hrs
+ Gait time in 1st 3 hrs
+ Advanced gait time in 1st 3 hrs
OT Interventions
+ Home management
SLP Interventions
Medications
- Other Antidepressant
- Old SSRIs
+ Atypical antipsychotics
General Interventions
- Days onset to rehab
+ LOS
+ Enteral feeding
Horn et al., Arch Phys Med Rehabil 2005;86(12 Supplement 2):S101-S114

Policy Changes from Stroke PBE Study
• Early Rehabilitation Admission – get patient into rehabilitation as soon as possible after stroke onset; possibly start in Neuro ICU
• Early gait in PT – start gait as soon as possible after rehab admission; put patient in harness on treadmill for safety
• Early Feeding – continue or start enteral nutrition at rehab admission if patient is not able to eat full meals
• Use Opioids for Pain – continue or start opioids at rehab admission if patient misses therapy due to pain

RCTs, Observational Cohort Studies, & PBE Compared
| Feature | RCT | Observational Cohort | PBE |
|---|---|---|---|
| Researcher control of variables | Experimental | Observational | Observational |
| Time dimension | Prospective | Prospective or retrospective | Prospective |
| Intervention(s) | 1 or 2 discrete interventions; standardized protocols | 1 or many; often a more global intervention, e.g., a program of interventions | All interventions deemed relevant; documented separately and in great detail so that one can examine interactions among interventions |
| Hypotheses | Well-specified | Specified, general, or none | Focused or broad |
| Data source | Primary data | Primary data and/or large administrative data sets | Primary data supplemented by abstraction from clinical records |
| Protocol- vs. data-intensity | Protocol-intensive | Not inherently protocol- or data-intensive | Data intensive |
Archives Physical Medicine & Rehabilitation 2012.
RCTs, Observational Cohort Studies, & PBE Compared
| Feature | RCT | Observational Cohort | PBE |
|---|---|---|---|
| Exclusion criteria | Extensive exclusions to minimize variation | Minimal exclusions (permits studying specific subsets of patients) | Minimal exclusions (permits studying specific detailed subsets of patients) |
| Sample size | Typically small (e.g., <200) because of narrow hypothesis | Usually large | Large (e.g., >1,500) |
| Control for participant differences | Through exclusions and randomization | Through propensity scoring, instrumental variable (IV) analysis, and statistical control | Through detailed characterization of comprehensive patient severity of illness and other relevant characteristics, and statistical control |
| Blinding | Single, double, or triple blinding | No | No |
| Outcomes | Few | Few or many | Many, appropriate to study population |
| Effect size | Often small | Small or large | Usually small when samples are large |

RCTs, Observational Cohort Studies, & PBE Compared
| Feature | RCT | Observational Cohort | PBE |
|---|---|---|---|
| Attitude to confounders | Not interesting; exclude them | Affect outcomes but may not be measured so hard to control | Affect outcomes, measure as many as suggested by clinical team, and are interesting |
| Validity | Internal: high. External: low | Internal: low. External: moderate | Internal: moderate to high. External: high |
| Causality | Assigned | Assumed | Assumed; with drill-down analyses, implementation, and repeat studies can move close to ‘assigned’ |
| Ability to examine treatment effects on subgroups | Limited, unless predesigned and powered for subgroup analyses | More likely because of large data sets, but subgroup analyses may be limited by patient control strategies such as propensity scoring | Very likely since measure details about patients that are used to create specific subsets |
| Research culture (1) | Top-down | Varies | Highly collaborative; bottom-up |
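
The instrumental variable (IV) analysis named in the comparison above can be sketched with the simplest IV estimator, the Wald ratio for a binary instrument. This is a minimal Python sketch under stated assumptions, not a method taken from the studies cited here: the instrument "lives near the treatment center" echoes the distance example given earlier, all counts are invented, and the instrument is assumed to influence the outcome only through treatment (an assumption that, as noted earlier, cannot be tested).

```python
def build_records():
    """Return invented rows: (near_center, treated, good_outcome), all 0/1."""
    # (near_center, treated, n_patients, n_good_outcomes); hypothetical counts
    cells = [
        (1, 1, 70, 45),
        (1, 0, 30, 15),
        (0, 1, 30, 20),
        (0, 0, 70, 30),
    ]
    records = []
    for near, treated, n, good in cells:
        records += [(near, treated, 1)] * good
        records += [(near, treated, 0)] * (n - good)
    return records


def mean(values):
    return sum(values) / len(values)


def naive_difference(records):
    """Good-outcome rate in treated minus untreated, ignoring the instrument."""
    treated = [r[2] for r in records if r[1] == 1]
    control = [r[2] for r in records if r[1] == 0]
    return mean(treated) - mean(control)


def wald_iv_estimate(records):
    """Wald ratio: outcome gap across instrument levels / treatment gap."""
    near = [r for r in records if r[0] == 1]
    far = [r for r in records if r[0] == 0]
    outcome_gap = mean([r[2] for r in near]) - mean([r[2] for r in far])
    treatment_gap = mean([r[1] for r in near]) - mean([r[1] for r in far])
    if treatment_gap == 0:
        raise ValueError("instrument does not shift treatment; estimate undefined")
    return outcome_gap / treatment_gap
```

With these made-up counts, the naive treated-versus-untreated difference is about 0.20, while the Wald estimate is about 0.25 (an outcome gap of 0.10 divided by a treatment-uptake gap of 0.40).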
RCTs, Observational Cohort Studies, & PBE Compared
| Feature | RCT | Observational Cohort | PBE |
|---|---|---|---|
| Research culture (2) | Does not depend on local knowledge | Varies | Local knowledge highly valued and sought |
| Knowledge translation | Less buy-in | Varies | High level of buy-in; findings more “transportable” |
| Science of … | Confirmation | Confirmation | Confirmation, discovery, and innovation |
| Science of … | Efficacy | Associations | Associations and effectiveness |
| Cost | High | Varies | Moderate |
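
Propensity scoring, listed in the comparison as the observational-cohort approach to participant differences, can be sketched as follows. This is a minimal Python illustration under stated assumptions: the counts are invented, the two binary covariates (`older`, `severe`) are hypothetical, and the score is estimated as the observed treated fraction within each covariate pattern rather than by a fitted regression model. It shows the score collapsing several covariates into one value, and then its use for stratification.

```python
from collections import defaultdict


def build_patients():
    """Return invented patients with two covariates, treatment, and outcome."""
    # (older, severe, treated, n_patients, n_good_outcomes); hypothetical counts
    cells = [
        (0, 0, 1, 20, 18), (0, 0, 0, 80, 64),
        (0, 1, 1, 50, 30), (0, 1, 0, 50, 25),
        (1, 0, 1, 50, 35), (1, 0, 0, 50, 30),
        (1, 1, 1, 80, 40), (1, 1, 0, 20, 8),
    ]
    patients = []
    for older, severe, treated, n, good in cells:
        for i in range(n):
            patients.append({"x": (older, severe), "treated": treated,
                             "good": 1 if i < good else 0})
    return patients


def propensity_scores(patients):
    """Estimate P(treated | covariate pattern) as the observed treated share."""
    counts = defaultdict(lambda: [0, 0])  # pattern -> [n_treated, n_total]
    for p in patients:
        counts[p["x"]][0] += p["treated"]
        counts[p["x"]][1] += 1
    return {x: t / n for x, (t, n) in counts.items()}


def risk_difference(group):
    """Good-outcome rate in treated minus untreated within a group."""
    treated = [p["good"] for p in group if p["treated"] == 1]
    control = [p["good"] for p in group if p["treated"] == 0]
    return sum(treated) / len(treated) - sum(control) / len(control)


def stratified_effect(patients):
    """Stratify on the propensity score and average the stratum differences."""
    scores = propensity_scores(patients)
    strata = defaultdict(list)
    for p in patients:
        strata[round(scores[p["x"]], 6)].append(p)
    n = len(patients)
    return sum(len(g) / n * risk_difference(g) for g in strata.values())
```

In this made-up cohort the patterns (older, not severe) and (younger, severe) both have a score of 0.5 and fall into the same stratum. The crude difference, `risk_difference(build_patients())`, is about -0.02, while `stratified_effect(build_patients())` is about +0.10. The adjustment only removes confounding by the covariates that enter the score.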