Rewarding Provider Performance: Key Concepts, Available Evidence, Special Situations, and Future Directions

R. Adams Dudley, MD, MBA
Institute for Health Policy Studies, University of California, San Francisco
Support: Agency for Healthcare Research and Quality, California HealthCare Foundation, Robert Wood Johnson Foundation

Outline of Talk
• Review of obstacles to using incentives (using the example of public reporting)
• Summary of available data
• Addressing the tough decisions
• If we have time, consider the value of outcomes reports

Project Overview
Goals:
• Describe employer hospital report cards
• Explore what issues determine success
Qualitative study:
• 11 communities
• 37 semi-structured interviews with hospital and employer coalition representatives
• Coding and analysis using NVivo software
See Mehrotra A, et al. Health Affairs 2003;22(2):60.

11 Communities
[Map: Seattle, Maine, South Central Wisconsin, Buffalo, Detroit, Cleveland, Indianapolis, Memphis, East Tennessee, North Alabama, Orlando]

Summary of Report Cards
• Only 3 report cards begun before 1998
• Majority use billing data
• Majority use mortality and LOS outcomes; patient surveys also common
• In 4 of 11 communities, no public release

4 Issues Determining Success
1. Ambiguity of goals
2. Uncertainty on how to measure quality
3. Lack of consensus on how to use data
4. Relationships between local stakeholders

Ambiguity of Goals: Hospitals Skeptical of Employer Goals
Hospitals don't trust employers; they suspect employers' primary interest is still cost:
• "An organization that has been a negotiator of cost, first and foremost, that then declares it's now focused on quality, is a hard sell."
• "Ultimately, you're going to contract with me or not contract with me on the basis of cost. Wholly. End of story."

Uncertainty on How to Measure Quality: The Process vs. Outcome Debate
Clinicians: process measures are more useful
• "We should have moved from outcomes to process measures. Process measures are much more useful to hospitals who want to improve."
Employers: outcomes are better, and process measures are unnecessary
• "People want longer-lasting batteries. Duracell doesn't stand there with their hands on their hips and say, 'Tell us how to make longer-lasting batteries.' That's the job of Duracell."

Uncertainty on How to Measure Quality: The Case-Mix Adjustment Controversy
Clinicians: forever skeptical that case-mix adjustment is good enough
• "[The case-mix adjustment] still explained less than 30 percent of the differences that we saw…"
Employers: we cannot wait for perfect case-mix adjustment
• "My usual answer to that is 'OK, let's make you guys in charge of perfect, I'm in charge of progress. We have to move on with what we have today. When you find perfect, come back, and we'll change immediately.'"
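[Speaker's note: to make the mechanics of this dispute concrete, here is a minimal sketch in Python of the observed-to-expected (O/E) approach most report cards use for case-mix adjustment. The field names, numbers, and function are hypothetical placeholders, not any specific report card's method; the clinicians' complaint above is that the model supplying `predicted_risk` explains less than 30 percent of the differences between hospitals.]

```python
# Hedged sketch of observed-to-expected (O/E) case-mix adjustment.
# All names and values are illustrative, not a real report card's model.
def risk_adjusted_mortality(patients, population_rate):
    """Risk-adjusted rate = (observed deaths / expected deaths) * population rate."""
    observed = sum(p["died"] for p in patients)
    # Expected deaths come from a case-mix model (e.g., logistic
    # regression on age and comorbidities) fit across all hospitals.
    expected = sum(p["predicted_risk"] for p in patients)
    return (observed / expected) * population_rate

patients = [
    {"died": 1, "predicted_risk": 0.30},  # high-risk patient
    {"died": 0, "predicted_risk": 0.05},
    {"died": 0, "predicted_risk": 0.10},
]
print(risk_adjusted_mortality(patients, population_rate=0.12))  # ~0.27
```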
Lack of Consensus on How to Use Quality Data: Is Low Public Interest a Positive Trend?
• Low levels of consumer interest, at least initially
• One interviewee felt slow growth is better: "Food labeling is the right metaphor. You want some model which gets to one and a half to three percent of the people to pay attention. This gives hospitals time to fix their problems without horrible penalties.… But if they ignore it for five years, all of a sudden you're looking at a three or four percent market share shift."

Relationships Between Local Stakeholders: Market Factors
• "Market power does not remain constant. Sometimes purchasers are in the ascendancy and at other times, providers are in the ascendancy, like when hospitals consolidate. And that can vary from community to community at a point in time, too."

Key Elements of an Incentive Program
• Measures acceptable to both clinicians and the stakeholders creating the incentives
• Data available in a timely manner at reasonable cost
• Reliable methods to collect and analyze the data
• Incentives that matter to providers

CHART: California Hospital Assessment and Reporting Task Force
• A collaboration between California hospitals, clinicians, patients, health plans, and purchasers
• Supported by the California HealthCare Foundation

Participants in CHART
• All the stakeholders:
– Hospitals: e.g., HASC, hospital systems, individual hospitals
– Physicians: e.g., California Medical Association
– Consumers/Labor: e.g., Consumers Union, California Labor Federation
– Employers: e.g., PBGH, CalPERS
– Health Plans: e.g., Blue Shield, Wellpoint, Kaiser
– Regulators: e.g., JCAHO, OSHPD, NQF
– Government Programs: e.g., CMS, Medi-Cal

How CHART Might Play Out
[Flow diagram; a minimal sketch of the aggregation step follows.]
• Three kinds of CHART measures feed the pipeline:
– Clinical measures: administrative data OR specialized clinical data collection
– Patient experience and satisfaction measures: H-CAHPS "Plus" scores
– IT or other structural measures: surveys with audits
• A data aggregator produces one set of scores per hospital
• Reports go to hospitals, to health plans and purchasers, and to the public
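[Speaker's note: the sketch below makes the "one set of scores per hospital" step concrete. It assumes equal weights across the three measure domains named on the slide; the weights, scores, and hospital names are illustrative assumptions, not CHART's actual aggregation methodology.]

```python
# Hedged sketch of the "data aggregator" step: combine domain scores
# into one composite score per hospital. Equal weights are an
# illustrative assumption, not CHART's method.
DOMAIN_WEIGHTS = {"clinical": 1 / 3, "patient_experience": 1 / 3, "structural": 1 / 3}

def aggregate(domain_scores: dict[str, float]) -> float:
    """Weighted average of 0-100 domain scores for one hospital."""
    return sum(DOMAIN_WEIGHTS[d] * s for d, s in domain_scores.items())

hospitals = {
    "Hospital A": {"clinical": 82.0, "patient_experience": 74.0, "structural": 90.0},
    "Hospital B": {"clinical": 91.0, "patient_experience": 68.0, "structural": 70.0},
}
# The same composite would go to hospitals, plans/purchasers, and the public.
for name, scores in hospitals.items():
    print(name, round(aggregate(scores), 1))
```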
CHART Measures
• For public reporting in 2005-6:
– JCAHO core measures for MI, CHF, pneumonia, and surgical infection, from chart abstraction
– Maternity measures, from administrative data
– Leapfrog data
– Mortality rates for MI, pneumonia, and CABG
• For piloting in 2005-6:
– ICU processes (e.g., stress ulcer prophylaxis), mortality, and LOS, by chart abstraction
– ICU nosocomial infection rates, by infection control personnel
– Decubitus ulcer rates and falls, by annual survey

Tough Decisions: General Ideas and Our Experience in CHART
• Not because we've done it correctly in CHART, but just as a basis for discussion

Tough Decision #1: Collaboration vs. Competition?
• Among health plans
• Among providers
• With legislators and regulators

Tough Decision #2: Same Incentives for Everyone?
• Does it make sense to set up incentive programs that are the same for every provider?
– This would be the norm in other industries if providers were your employees, but unusual if you were contracting with suppliers.
• But providers differ in important ways:
– Baseline performance/potential to become a top provider
– Preferred rewards (more patients vs. more $)
– Monopolies and safety-net providers
• But do you want the complexity?

Tough Decision #3: Encourage Investment?
• Much of the difficulty we face in starting public reporting or P4P comes from the lack of flexible IT that can cheaply generate performance data.
• Similarly, much QI is best achieved by creating new team approaches to care.
• Should we explicitly pay for these changes, or make the value of these investments an implicit factor in our incentive programs?
• Explicit payment can be achieved by pay-for-participation, for instance.

Tough Decision #4: Moving Beyond HEDIS/JCAHO
• No other measure sets are routinely collected and audited as a current cost of doing business.
• If you want public reporting or P4P based on new measures, you must balance data collection and auditing costs vs. information gained:
– Administrative data involves less data collection cost, but equal or greater auditing costs
– Chart abstraction involves much more expensive data collection, but equal or less auditing
• If purchasers/policymakers drive the introduction of new quality measurement costs, who pays and how?
• So, who picks the measures?

Tough Decision #5: Use Only National Measures, or Local?
• Well, this is easy: national, right?
• Hmmm. Have you ever tried this? Is there any "there" there? Are there agreed-upon, nonproprietary data definitions and benchmarks? Even with the National Quality Forum?
• Maybe local initiatives should be leading national??

An Example of Collaboration: C-Section Rates in CHART
• Initial measure: total C-section rate (NQF)
• Collaborate/advocate within CHART:
– Some OB-GYNs convinced the group to develop an alternative: the C-section rate among nulliparous women with singleton, vertex, term (NSVT) presentations
• Collaborate with hospitals:
– NSVT status is not traditionally coded, so Medical Records personnel need to be trained

Tough Decision #6: Use Outcomes Data?
• An especially important issue as sample sizes get small…that is, when you try to move from groups to individual providers in "second generation" incentive programs.
• If we can't fix the sample size issue, we'll be forced to use general measures only (e.g., patient experience measures).

Outcome Reports
Some providers are concerned about random events causing variation in reported outcomes that could:
• Ruin reputations (if there is public reporting)
• Cause financial harm (if direct financial incentives are based on outcomes)

An Analysis of MI Outcomes and Hospital "Grades"
• From California hospital-level risk-adjusted MI mortality data:
– Fairly consistent pattern over 8 years: 10% of hospitals labeled "worse than expected", 10% "better", 80% "as expected"
– Processes of care for MI were worse among hospitals with higher mortality and better among those with lower mortality
• From these data, calculate mortality rates for the "worse", "better", and "as expected" groups

Probability Distribution of Risk-Adjusted Mortality Rate for the Mean Hospital in Each Sub-Group
[Figure. Scenario #3: 200 patients per hospital; trim points calculated using a normal distribution around the population mean, 2 tails, each with 2.5% of the distribution contained beyond the trim points. X-axis: risk-adjusted mortality rate (0% to 30%); y-axis: probability. Curves for poor-, good-, and superior-quality hospitals and for all hospitals in the model, with group means at 17.1%, 12.2%, and 8.6%, respectively, and low/high trim points at 7.6% and 16.6%. A minimal simulation sketch of this scenario follows.]
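[Speaker's note: here is a minimal simulation sketch of Scenario #3, not the original analysis code. Each hospital's deaths are drawn binomially from 200 patients; trim points sit 1.96 standard errors around the population mean, leaving 2.5% of the distribution in each tail. With an assumed population mean of 12.2%, the trim points land near 7.6% and 16.6%, matching the figure. The sub-group mortality rates are taken from the figure; the simulated-hospital count is an arbitrary choice.]

```python
import numpy as np

rng = np.random.default_rng(0)

N_PATIENTS = 200   # patients per hospital (Scenario #3)
POP_MEAN = 0.122   # population mean mortality rate (from the figure)
TRUE_RATES = {"superior": 0.086, "expected": 0.122, "poor": 0.171}

# Trim points: normal approximation around the population mean, two
# tails, 2.5% of the distribution beyond each trim point (z = 1.96).
se = np.sqrt(POP_MEAN * (1 - POP_MEAN) / N_PATIENTS)
low_trim, high_trim = POP_MEAN - 1.96 * se, POP_MEAN + 1.96 * se
print(f"trim points: {low_trim:.1%} / {high_trim:.1%}")  # ~7.7% / 16.7%

for group, rate in TRUE_RATES.items():
    # Observed mortality rates for 10,000 simulated hospitals per group
    observed = rng.binomial(N_PATIENTS, rate, size=10_000) / N_PATIENTS
    print(f"{group:9s} labeled worse: {np.mean(observed > high_trim):5.1%}, "
          f"better: {np.mean(observed < low_trim):5.1%}")
```

[In this sketch, even a hospital whose true rate equals the superior-group mean is flagged "better than expected" only a minority of the time on a single 200-patient measurement, which is exactly the random-variation problem the next slides summarize.]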
3 Groups of Hospitals with Repeated Measurements (3 Years)
[Figure: predictive values of 3-year star scores under Scenario #3. X-axis: hospital star score (3 to 9); y-axis: proportion of total hospitals (0% to 80%); separate distributions for superior-, expected-, and poor-quality hospitals.]

Outcomes Reports and Random Variation: Conclusions
• Random variation can have an important impact on any single measurement
• Repeating measures reduces the impact of chance
• Provider performance is more likely to align along a spectrum than to be lumped into two groups whose outcomes are quite similar
• Providers on the superior end of the performance spectrum will almost never be labeled poor

Conclusions
• Many tough decisions ahead
• Nonetheless, paralysis is undesirable
• Collaborate on the choice of measures
• Everyone is frustrated with the limited (JCAHO and HEDIS) measure sets; we need to figure out how to fund collecting and auditing new measures
• Consider varying incentives across providers