Advanced Modeling Considerations within a Comparative Effectiveness Research Framework
Jim Murray, Ph.D.
The 35th Annual Midwest Biopharmaceutical Statistics Workshop
Ball State University Alumni Center, Muncie, Indiana
May 22, 2012

WHY CER? Why Now?

US Health Care System
US health care is a fragmented system with significant variability in process of care and patient outcomes.
• Heterogeneous delivery models
• Inadequate access and coverage
• Misaligned financial "incentives" and "disincentives"
• Misplaced financial risk
• Perspective oriented
[Map: medical expenditure per enrollee. Source: Dartmouth Atlas Project, The Dartmouth Atlas of Health Care.]

[Figure: Relationship Between Quality of Care and Medicare Spending, as Expressed by Overall Quality Ranking, 2000–2001. Data: Medicare administrative claims data and Medicare Quality Improvement Organization program data. Adapted and republished with permission of Health Affairs from Baicker and Chandra, "Medicare Spending, The Physician Workforce, and Beneficiaries' Quality of Care" (Web Exclusive), 2004. Source: McCarthy and Leatherman, Performance Snapshots, 2006. www.cmwf.org/snapshots]

Unsustainable Growth in Health Care Costs
[Figure. Source: CBO, Research on the Comparative Effectiveness of Medical Treatments: Issues and Options for an Expanded Federal Role, December 2007.]

CER Concept and Definitions

Comparative Effectiveness Research
Other definitions offer considerable overlap:
• American College of Physicians: Evaluation of the relative (clinical) effectiveness, safety, and cost of two or more medical services, drugs, devices, therapies, or procedures used to treat the same condition. [3]
• Institute of Medicine (IOM), Roundtable on Evidence-Based Medicine: Comparison of one diagnostic or treatment option to ≥1 others. Primary CER involves the direct generation of clinical information on the relative merits or outcomes of one intervention in comparison to ≥1 others; secondary CER involves the synthesis of primary studies to allow conclusions to be drawn. [4]
• Agency for Healthcare Research and Quality (AHRQ): A type of health care research that compares the results of one approach for managing a disease to the results of other approaches. CER usually compares ≥2 types of treatment, such as different drugs, for the same disease, but it can also compare medical procedures and tests. The results can be summarized in a systematic review. [5]
• Medicare Payment Advisory Commission (MedPAC): Evaluation of the relative value of drugs, devices, diagnostic and surgical procedures, diagnostic tests, and medical services. By value is meant the clinical effectiveness of a service compared with its alternatives. [6]
• Congressional Budget Office (CBO): A rigorous evaluation of the impact of different options that are available for treating a given medical condition for a particular set of patients. Such research may compare similar treatments, such as competing drugs, or it may analyze very different approaches, such as surgery and drug therapy. [7]
• Center for Medical Technology Policy (CMTP): The direct comparison of existing health care interventions to determine which work best for which patients and which pose the greatest benefits and harms. The core question of comparative effectiveness research is which treatment works best, for whom, and under what circumstances. [8]

Comparative Effectiveness Research: distinguishing characteristics
1. Informs a specific clinical decision from the individual patient perspective or a health policy decision from the population perspective
2. Compares at least two alternative interventions, each with the potential to be "best practice"
3. Describes results at the population and subgroup levels
4. Measures outcomes, both benefits and harms, that are important to patients and decision makers
5. Employs methods and data sources appropriate for the decision of interest (e.g., observational studies, pragmatic trials, systematic reviews)
6. Is conducted in settings that are similar to those in which the intervention will be used in practice
Source: Committee on Comparative Effectiveness Research Prioritization, Institute of Medicine. Initial National Priorities for Comparative Effectiveness Research. The National Academies Press, 2009.

Example comparisons: drug vs. drug; lifestyle vs. drug; screening vs. usual care; drug vs. device.

What's it really all about….

CER Statistical and Methodological Challenges
Current PCORI Request for Information on a Transformational Framework (proposed release May 2012)
Intrinsic translation framework components:
• Internal validity (bias)
• External validity (generalizability, or applicability to non-study settings and populations)
• Precision (sample size, length of follow-up, etc.)
• Heterogeneity in risk or benefit (e.g., subgroup or "personalized" evidence)
• Ethical dimensions of the study (including considerations of risk-benefit balance and study burden for study participants)
Extrinsic translation framework components:
• Time urgency (e.g., rapidly changing technology, policy, or public health needs)
• Logistical constraints (e.g., feasibility of collecting information from participants, number of participants available, study complexity)
• Data availability, quality, and completeness

IF CER was Simple…
"All models are wrong, some are useful" – George Box
• Efficacy vs. effectiveness
• Head-to-head vs. indirect comparisons
• Bias from non-representative populations
• Average treatment effects may obscure heterogeneity of treatment response in subpopulations (a small numerical illustration follows this slide)
  – Heterogeneity in risk or benefit (e.g., subgroup or "personalized" evidence)
• Signal detection: what might we find from a CER analysis?
  – Types of results
  – Precision vs. accuracy/validity (sample size, length of follow-up, etc.)
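To make the heterogeneity bullet above concrete, here is a minimal numerical sketch (not from the talk; the subgroup labels, population shares, and effect sizes are invented): a treatment that benefits a majority subgroup but harms a minority subgroup can still show a comfortably positive average treatment effect.

```python
# Invented numbers illustrating how an average treatment effect can hide
# harm in a subgroup (heterogeneity of treatment effects).

subgroups = {
    # hypothetical subgroup: (share of population, mean benefit of A over B)
    "biomarker-positive": (0.60, +10.0),
    "biomarker-negative": (0.40, -5.0),
}

average_effect = sum(share * effect for share, effect in subgroups.values())

print(f"Overall average treatment effect: {average_effect:+.1f}")
for name, (share, effect) in subgroups.items():
    print(f"  {name:18s} share={share:.0%}  effect={effect:+.1f}")
# The overall average is +4.0, yet 40% of patients are worse off on A,
# which is exactly the signal a subgroup-aware analysis is meant to catch.
```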
CER plays a key role in generating evidence for clinical and policy decision-making
[Diagram (Source: Drummond et al., 2008 [9]): the questions "Can it work?", "Does it work?", and "Is it worth it?", alongside RCTs, CER, EBM (evidence-based medicine), and HTA (health technology assessment), feeding into clinical guidelines, patient-level decision making, and coverage / conditional coverage decision making.]

What is clinical research?
• Research that takes place in clinical settings
  – Case studies
  – Case series
  – Descriptive and retrospective studies
  – Clinical trials
  – Randomized controlled trials

Efficacy vs. Effectiveness
• What is the best evidence?
  – Minimizes bias to the greatest extent possible
  – Is pertinent to clinical practice
• The evidence most highly valued, and ultimately judged to be the best, may differ based on which perspective predominates.
  – Efficacy and effectiveness approaches to research are often assumed to be synonyms; they are not, and the terms are often used incorrectly.
• Which question we answer (efficacy or effectiveness) will dictate, or is dictated by:
  – Data source
  – Study design
  – Analytical methods
  – Generalizability

Efficacy and effectiveness studies optimize different aspects of validity. Clinical research (intervention) studies lie on a continuum anchored by internal validity at one end and external validity at the other: efficacy studies (experimental designs, RCTs) sit at the internal-validity anchor, effectiveness studies (quasi-experimental and observational designs) at the external-validity anchor, and most studies fall somewhere in between these anchors.
From: Winstein & Lewthwaite, Eugene Michels Forum, CSM, Nashville, TN, February 7, 2004.

Comparing RCTs, Pragmatic Trials, and Observational Studies
• Focus: RCT: efficacy and safety; assess mechanistic effect ("Can it work?"). Pragmatic: effectiveness and safety; assess and inform decision-making ("Does it work under usual care conditions?"). Observational: effectiveness and safety ("Does it work in actual practice?").
• Setting: RCT: ideal/artificial. Pragmatic: real-world routine care (with potential minor departures). Observational: real-world routine care.
• Population: RCT: strictly defined, homogeneous. Pragmatic: typically broad, heterogeneous. Observational: broad, heterogeneous.
• Randomization: RCT: yes. Pragmatic: typically yes. Observational: no.
• Blinding: RCT: typically yes. Pragmatic: no. Observational: no.
• Interventions: RCT: fully interventional. Pragmatic: minimally interventional (e.g., randomization only). Observational: non-interventional.
• Outcomes: RCT: clinical surrogates, short term. Pragmatic: longer-term outcomes, PROs. Observational: long-term outcomes, PROs.
• Sample size: RCT: typically small. Pragmatic: typically larger. Observational: typically large.
• Validity: RCT: high internal (low bias), low external (limited generalizability). Pragmatic: moderate internal, moderate to high external. Observational: low internal, high external.
• Prospective/retrospective: RCT: prospective. Pragmatic: prospective. Observational: prospective or retrospective.
• Comparable cost: RCT: higher. Pragmatic: moderate. Observational: lower.
• Example sub-designs: RCT: adaptive design. Pragmatic: LST (large simple trial), adaptive design. Observational: database studies, cohort, case-control, cross-sectional.

Pragmatic trials
• Pragmatic trials are prospective medical research studies that have minimal interventional activities*, such as randomization, while maintaining routine care of the subjects. The goal of pragmatic trials is to provide comparative effectiveness evidence applicable to a broad range of patients and intended for health care decision makers to make informed health care decisions.
• Pragmatic trials take place during phase 3b and phase 4 of study drug development and can involve drug or non-drug interventions, such as a patient education program. They can be used in support of registration or for non-registration purposes, or to evaluate the economic impact of therapy. They tend to be simpler than RCTs and involve less data collection.
• Practical approaches are used to design and implement pragmatic trials. The level of pragmatism can be measured on several domains (see next slide), such as practitioner expertise, intervention flexibility, eligibility criteria, types of outcomes, intensity of follow-up, and adherence monitoring.
* "Minimal interventional activities" means limiting the number and degree of controlled conditions, that is, departures from standard routine care.

Head-to-Head vs. Indirect Comparisons
A head-to-head comparison comes from a trial in which A was directly compared to B. An indirect comparison comes from multiple studies in which A and B may have been compared to the same comparator (i.e., C) but have never been compared to each other in the same study.
Source: What is indirect comparison? Fujian Song, BMed MMed PhD, Reader in Research Synthesis, Faculty of Health, University of East Anglia. www.whatisseries.co.uk; http://www.medicine.ox.ac.uk/bandolier/painres/download/whatis/What_is_ind_comp.pdf
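A minimal numerical sketch of the adjusted indirect comparison defined on the next slide (often called the Bucher method) may help fix the logic; the log odds ratios and standard errors below are invented for illustration. With a common comparator C, the indirect A-vs-B estimate on the log scale is the difference of the two direct estimates, and because the trials are independent the variances add, so the indirect estimate is always less precise than either direct one.

```python
import math

# Hypothetical summary results from two separate randomized trials that
# share a common comparator C (all numbers invented for illustration).
log_or_AC, se_AC = -0.40, 0.15   # trial 1: A vs. C (log odds ratio, SE)
log_or_BC, se_BC = -0.10, 0.20   # trial 2: B vs. C (log odds ratio, SE)

# Adjusted indirect comparison: difference of the direct estimates,
# with variances adding because the two trials are independent.
log_or_AB = log_or_AC - log_or_BC
se_AB = math.sqrt(se_AC**2 + se_BC**2)

lo, hi = log_or_AB - 1.96 * se_AB, log_or_AB + 1.96 * se_AB
print(f"Indirect A vs. B odds ratio: {math.exp(log_or_AB):.2f} "
      f"(95% CI {math.exp(lo):.2f} to {math.exp(hi):.2f})")
# se_AB (0.25) exceeds both se_AC and se_BC: indirect evidence costs precision,
# which ties into the precision and sample-size points later in the talk.
```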
Indirect Comparisons
• Indirect comparison refers to a comparison of different healthcare interventions using data from separate studies, in contrast to a direct comparison within randomized controlled trials. Indirect comparison is often used because of a lack of, or insufficient, evidence from head-to-head comparative trials.
• Naive indirect comparison is a comparison of the results of individual arms from different trials as if they were from the same randomized trial. This method provides evidence equivalent to that of observational studies and should be avoided in the analysis of data from randomized trials.
• Adjusted indirect comparison (including mixed treatment comparison) is an indirect comparison of different treatments adjusted according to the results of their direct comparison with a common control, so that the strength of the randomized trials is preserved. Empirical evidence indicates that results of adjusted indirect comparison are usually, but not always, consistent with the results of direct comparison.
• Basic assumptions underlying indirect comparisons include a homogeneity assumption for standard meta-analysis, a similarity assumption for adjusted indirect comparison, and a consistency assumption for the combination of direct and indirect evidence. It is essential to fully understand and appreciate these basic assumptions in order to use adjusted indirect and mixed treatment comparisons appropriately.
Source: What is indirect comparison? Fujian Song, BMed MMed PhD, Reader in Research Synthesis, Faculty of Health, University of East Anglia. www.whatisseries.co.uk; http://www.medicine.ox.ac.uk/bandolier/painres/download/whatis/What_is_ind_comp.pdf

Bias and Generalizability
[Figure: an oval representing the full target population, with the inclusion/exclusion criteria for an RCT selecting a subset; the outcome's true estimate of effect is contrasted with a biased estimate of effect.]

"If it were not for the great variability among individuals, medicine might as well be a science and not an art."
– Sir William Osler, The Principles and Practice of Medicine, 1892

Heterogeneity of Treatment Effects
[Figure reprinted from Kravitz et al. Milbank Q. 2004;82:661–687.]

Analytical Strategy Depends on the Source of Population Heterogeneity
• Diversity in intrinsic population characteristics
• Underlying disease variability
• Multiple relevant targets for intervention
[Slide callout: "Low Hanging Fruit"]
• Differential ADME (efficacy and/or toxicity)
• Disease adaptation leading to treatment resistance

Signal Detection
What can come from CER:
• A > B or B > A: superiority (or inferiority)
• A ≥ B or B ≥ A: better than or the same as (may be conditional)
• A = B: identical (rare)
• A ~ B: equivalence (functional); does not guarantee interchangeability
• A ? B: unknown, due to the availability, quantity, or quality of evidence
• A ? B, B ? C, yet A > C: semi-order

Semi-Orders – Is there any hope?
• A semi-order is a type of ordering that may be determined for a set of items with numerical scores by declaring two items incomparable when their scores are within some threshold of each other, and otherwise using the numerical comparison of their scores (a small sketch appears below, after the "No Free Lunch" slide).
  – Precision
  – Sample size / power
  – Meaningful treatment effects

There is No Free Lunch
• Accuracy and certainty can be improved only if additional information, that is both relevant and appropriate, is included in the analysis.
• Bayesian methods might bring additional information by including "prior" information.
  – Finding, representing, and justifying the appropriate and relevant "prior" information can be "difficult".
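To make the prior-information point just above concrete, here is a minimal conjugate normal-normal sketch (not from the talk; the prior and data values are invented): the posterior mean for a treatment effect is a precision-weighted average of the prior mean and the new study's estimate, which is also where the difficulty the slide mentions enters, because a poorly justified prior pulls the answer in a poorly justified direction.

```python
# Minimal normal-normal conjugate update for a treatment effect
# (all numbers invented for illustration).

prior_mean, prior_sd = 0.0, 0.50   # skeptical prior centered on "no effect"
data_mean, data_sd = 0.60, 0.30    # estimate from a new, relatively small study

prior_prec = 1.0 / prior_sd**2     # precision = 1 / variance
data_prec = 1.0 / data_sd**2

post_prec = prior_prec + data_prec
post_mean = (prior_prec * prior_mean + data_prec * data_mean) / post_prec
post_sd = post_prec ** -0.5

print(f"Posterior treatment effect: {post_mean:.2f} (sd {post_sd:.2f})")
# The posterior is shrunk toward the prior and is more precise than the data
# alone; whether that added "certainty" is warranted depends entirely on
# whether the prior information was relevant and appropriate.
```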
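And here is the sketch promised on the Semi-Orders slide: a small, self-contained illustration (scores and threshold are invented) of declaring two treatments incomparable when their scores lie within a threshold, for example one tied to precision or to a minimally meaningful treatment effect. It also reproduces the "A ? B, B ? C, yet A > C" pattern from the Signal Detection slide, since incomparability within the threshold is not transitive.

```python
# Semi-order comparison: scores within `threshold` of each other are treated
# as incomparable ("~"); otherwise the ordinary numerical comparison applies.
# Scores and threshold are invented for illustration.

def semi_order(name_x: str, x: float, name_y: str, y: float, threshold: float) -> str:
    if abs(x - y) < threshold:
        return f"{name_x} ~ {name_y}"   # within threshold: incomparable
    return f"{name_x} > {name_y}" if x > y else f"{name_y} > {name_x}"

scores = {"A": 0.62, "B": 0.55, "C": 0.50}
threshold = 0.10   # e.g., reflecting precision or a minimally meaningful effect

print(semi_order("A", scores["A"], "B", scores["B"], threshold))  # A ~ B
print(semi_order("B", scores["B"], "C", scores["C"], threshold))  # B ~ C
print(semi_order("A", scores["A"], "C", scores["C"], threshold))  # A > C
```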
What to think about it all….

Final Thoughts – evolving and to be continued…
• The US health care system must develop high-quality, timely, and relevant CER to help inform individual treatment decisions made by physicians and patients.
• CER must not be used as a blunt policy instrument for cost control.
• All scientifically valid data from a broad range of data sources (e.g., EMRs, retrospective data) and study designs (e.g., RCTs, observational studies, and pragmatic clinical trials) for the clinical question under consideration should be included, given appropriate accommodations for known limitations.
Improved Quality of Care and Patient Outcomes

Good News….
Our future's so bright we gotta wear shades.

What do we do about it?
There are many statistical issues that need to be addressed to be able to conduct CER in a valid manner. Several sessions over the next two days will discuss and address them.
[Cartoon: Sidney Harris, used with permission.]

Discussion