Quality Measurement: Is the
Information Sound Enough to be
Used by Decision Makers?
Cheryl L. Damberg, Ph.D., Director of Research
Pacific Business Group on Health
AcademyHealth: June 8, 2004
Reframed Question…
 How good is good enough?
 For use by whom for what purposes?
 Purchasers--changes in plan design to
reward higher-quality, more efficient
providers; steer enrollment (premiums,
out-of-pocket costs)
 Plans--incentive payments, tiering, narrow
networks, channeling or centers of
excellence
 Consumers--to guide treatment choices
 Providers--quality improvement
How Good is Good Enough?
 We don’t know what the right standard is
 Should standards apply in same way to all end users?
 What are the dangers of “noisy” information?
 Deming/Toyota studies (Six Sigma) showed that when
workers were given back noisy information on performance:
 Increased variation, decreased quality
 Disorienting; lost natural instinct for how to improve
 How do we make an optimal decision in the face of
uncertainty?
 Decision theory analysis could help to inform these
questions
 Need research in this area
Reality Check!
 What information?
 Measures exist--few implemented routinely or universally
 Most providers have no clue what their performance is
 “I’m following guidelines, it is someone else who isn’t”
 Is the current information better than no information?
 Absent information—choice is like a flip of the coin (50:50)
 Decisions will still be made with no information or
poor information
 Default position is to base decisions solely on price
 Consequences differ
 Patient—inconvenience for little gain in outcome
 Provider—ruined reputation, livelihood
What’s Currently Going On Out There
in Measurement?
 Two ends of the spectrum…examples
 Commercial vendors
 Using administrative data, often with poor case
mix adjustment
 omitted variables that can lead to biased results
 handling of missing data
 rank-ordering problems that lead the end user to incorrect
decisions
 Research-level work
 Applying shrinkage estimates to address the noise
problem without considering the quality of the
underlying data (see the sketch below)
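A minimal sketch of what such a shrinkage estimator does, assuming a simple normal-normal empirical Bayes setup; the provider counts, rates, and between-provider variance (tau2) below are illustrative, not taken from any CCMRP data:

```python
import numpy as np

def eb_shrink(successes, n, tau2):
    """Shrink observed provider rates toward the overall mean
    (normal-normal empirical Bayes; tau2 is an assumed
    between-provider variance, fixed here rather than estimated)."""
    rates = successes / n
    overall = successes.sum() / n.sum()
    sampling_var = rates * (1 - rates) / n    # per-provider sampling noise
    w = tau2 / (tau2 + sampling_var)          # small n -> small w -> more shrinkage
    return w * rates + (1 - w) * overall

# Three hypothetical providers; the smallest has the best raw rate.
successes = np.array([9, 80, 700])
n = np.array([10, 100, 1000])
print(eb_shrink(successes, n, tau2=0.002))
# -> roughly [0.75, 0.76, 0.70]: the n=10 provider's 0.90 is pulled
#    hardest toward the mean and no longer ranks first.
```

The slide's caution still applies: shrinkage tempers sampling noise but fixes none of the underlying data-quality problems.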
Where in the Measurement Process
Can Things Go Wrong?
 Measures
 Link to outcomes
 Importance
 Valid
 Reliable
 Implementation
 Poor data
 Small “n”
 Display/Reporting
 Will end user draw the correct conclusion based on how
results are reported?
Data: The Next Generation…
Underlying Problem of Data Quality
 One of the greatest threats to the validity of
performance results is the data that “feed”
the measures
 Even if a quality measure is good (i.e., reliable,
valid), it can still produce a bad (“biased”) result if
the data used to score performance are flawed or if
the data source omits key variables important
in predicting the outcome (illustrated below)
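A hedged illustration of the omitted-variable point (all numbers hypothetical): two hospitals with identical true quality, one of which treats sicker patients, look very different when severity is left out of the risk model:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20_000

# Hospital B treats sicker patients, but neither hospital has any true
# effect on death once severity is accounted for.
at_b = rng.random(n) < 0.5
severity = rng.normal(np.where(at_b, 1.0, 0.0), 1.0)
p_death = 1.0 / (1.0 + np.exp(-(-3.0 + severity)))  # severity alone drives risk
death = rng.random(n) < p_death

print("Hospital A mortality: %.1f%%" % (100 * death[~at_b].mean()))
print("Hospital B mortality: %.1f%%" % (100 * death[at_b].mean()))
# Unadjusted (severity omitted), B's mortality looks roughly twice A's,
# even though the two hospitals are equally good.
```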
Example 1: Risk-Adjusted Hospital
Outcome for Bypass Surgery
 CA CABG Mortality Reporting Program
 70 hospitals submitted data in 1999
 Concern about comparability across hospitals in
coding
 Potential impact on hospital scores
 Importance of “getting it right” given public reporting
 38 hospitals selected for audit
 Focused on outliers or near-outliers, with random
selection in the middle; oversampled high-risk cases
 2,408 cases audited
 Inter-rater reliability 97.6% (range: 95-99%; Cohen’s
Kappa)
Table 1: Comparison of Audited Data and CCMRP Submissions
for Acuity, All Hospitals, 1999 Data

                          Audited Data
CCMRP Data    Elective   Urgent   Emergent   Salvage    Total
Elective           447      140         16         1      604
Urgent             431      911        117        18    1,477
Emergent             7       53        199        29      288
Salvage              1        4          3         4       12
Total              886    1,108        335        52    2,381
Results of Audit
 Revealed downcoding and upcoding
problems
 Worst agreement: acuity (65.6%), angina
type (65.4%), angina class (45.8%), MI
(68.3%), and ejection fraction (78.0%)
 Missing data: incorrect classification of risk
based on the policy of replacing missing values
with the lowest-risk category
 Ejection fraction (15.8%), MI (38.1%)
Table 1: Agreement Statistics, All Hospitals, 1999 Data

Variable                                 Records   Missing   % Missing Values         %           % Lower Triangle
                                         Audited   Values    Incorrectly Classified   Agreement   Disagreement that Would
                                                                                                  Be Severity Weighted
Acuity                                     2,408         2             100.00              65.56         64.36
Angina Type (Stable/Unstable)              2,408         0                 NA              65.37         34.73
Angina (Yes/No)                            2,408         0                 NA              86.21         42.47
CCS Angina Class                           2,408       105              79.05              45.76         53.19
Congestive Heart Failure                   2,408        31              38.71              82.23         32.94
COPD                                       2,408         6               0.00              86.34         73.25
Creatinine (mg/dl)                         2,408       556               3.96              93.31         56.37
Cerebrovascular Disease                    2,408         3               0.00              87.67         45.79
Dialysis                                   2,408        91               0.00              98.13         86.67
Diabetes                                   2,408         3               0.00              94.73         45.67
Ejection Fraction (%)                      2,408       228              15.79              78.95         60.27
Method of measuring ejection fraction      2,408       406               0.00              74.34         Not Calculated
Hypertension                               2,408         7              85.71              84.39         40.43
Time from PTCA to surgery                    125        45              42.22              78.40         12.50
Left Main Stenosis                         2,408       388               7.22              85.96         51.46
Results of Audit
 Classification of some hospitals as outliers
may be a result of coding deficiencies
 When model was re-run, saw changes in
statistical significance and/or risk differential
 Death (outcome variable)—small levels of
disagreement can change hospital rating
 Change in rankings
 1 (no different → better than)
 6 (worse than → no different)
 1 (no different → worse than)
Impact on Fitted Model Characteristics when Replacing
Audited Records with Information from Audit, 1999 Data

                                   Model: CCMRP Data        Model: CCMRP Data and Audited Data
                                                            Where Record was Audited
Variable                         Estimate  p-value    OR*   Estimate  p-value     OR
Intercept                           -7.74     0.00      -      -9.11     0.00      -
Creatinine (mg/dl)                   0.18     0.00   1.20       0.01     0.15   1.01
Congestive Heart Failure             0.38     0.00   1.46       0.55     0.00   1.73
Hypertension                         0.14     0.18   1.15       0.23     0.04   1.25
Dialysis                             0.39     0.18   1.47       1.24     0.00   3.45
Diabetes                             0.19     0.04   1.21       0.25     0.01   1.29
Acuity: Elective                     Reference Group           Reference Group
Acuity: Urgent                       0.26     0.02   1.29       0.33     0.00   1.39
Acuity: Emergent                     1.24     0.00   3.46       1.33     0.00   3.77
Acuity: Salvage                      2.46     0.00  11.71       3.11     0.00  22.46

Fit Statistics:
R²                                          0.188                      0.202
c-statistic                                 0.818                      0.833
Hosmer-Lemeshow χ² (p-value)        9.303 (0.317)             23.068 (0.003)
Steps Taken to Safeguard Against
Getting it Wrong
 Audit
 Data cross validation
 Training on coding of variables; support to
hospital coders
 Display of confidence intervals
 Small hospital with zero deaths (CI: 0.0%-10.0%; see the sketch below)
 Combine data over multiple years
 Generate more stable estimates for small volume
hospitals
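For the zero-deaths example above, an exact (Clopper-Pearson) binomial interval reproduces the roughly 0%-10% bound; the case count of 36 is an assumption chosen to match the slide's figure:

```python
from scipy.stats import beta

def clopper_pearson(deaths, n, alpha=0.05):
    """Exact (Clopper-Pearson) confidence interval for a binomial rate."""
    lo = 0.0 if deaths == 0 else beta.ppf(alpha / 2, deaths, n - deaths + 1)
    hi = 1.0 if deaths == n else beta.ppf(1 - alpha / 2, deaths + 1, n - deaths)
    return lo, hi

lo, hi = clopper_pearson(deaths=0, n=36)
print("95%% CI: %.1f%%-%.1f%%" % (100 * lo, 100 * hi))  # -> 95% CI: 0.0%-9.7%
```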
Example 2: Pay for Performance
 Plan payouts to medical groups based on
rewarding those groups that rank at 75th
percentile or higher
 Rank ordering problems
 Medical groups with estimates based on small “n”
(i.e., noisy) more likely to fall in top or bottom part
of distribution
 Straight ranking ignores uncertainty in estimates
 Potential for rewarding the wrong players
 Rewarding noise, not signal (simulated below)
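A quick simulation of this rank-ordering hazard (group counts and panel sizes are made up): even when every group has identical true quality, small groups crowd the rewarded tail:

```python
import numpy as np

rng = np.random.default_rng(1)

# 200 hypothetical medical groups, ALL with the same true rate (70%),
# differing only in panel size.
sizes = rng.choice([20, 50, 200, 1000], size=200)
observed = rng.binomial(sizes, 0.70) / sizes

cutoff = np.quantile(observed, 0.75)        # "reward the top quartile"
rewarded = observed >= cutoff
print("median n, rewarded groups:    ", np.median(sizes[rewarded]))
print("median n, non-rewarded groups:", np.median(sizes[~rewarded]))
# The rewarded quartile is dominated by small (noisy) groups: straight
# ranking pays for chance, not performance.
```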
Example 3: Individual Physician
Performance Measurement
 Small “n” problem
 Physician lacks enough events (e.g., diabetics) to
score him/her at the level of the individual
indicator
 Estimates at indicator level are noisy (large SEs)
 Need to pool more information on physician’s
performance across conditions to improve
the signal to noise ratio
 Create summary scores (e.g., RAND QA Tools); see the sketch below
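A minimal sketch of why pooling helps, using made-up numerator/denominator pairs and a simple event-pooled composite (not the specific QA Tools scoring rule):

```python
import numpy as np

# Hypothetical physician: five indicators, each with few eligible patients.
passed   = np.array([3, 5, 2, 7, 4])
eligible = np.array([4, 8, 3, 9, 6])

rates = passed / eligible
se = np.sqrt(rates * (1 - rates) / eligible)
print("per-indicator SEs:", np.round(se, 2))  # each ~0.14-0.27: very noisy

# Pooling all events into one summary score cuts the standard error.
composite = passed.sum() / eligible.sum()
se_pooled = np.sqrt(composite * (1 - composite) / eligible.sum())
print("composite: %.2f (SE %.2f)" % (composite, se_pooled))  # 0.70 (SE 0.08)
```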
Can We Proceed?
 OK to start with Version 1.0 of the measures
 Means of soliciting feedback
 Help drive improvement in measurement
 Won’t get it perfect on first attempt
 Important to safeguard against possible
mistakes in classifying
 Check validity of data (audit, cross validate)
 Assess extent of disagreement
 Perform sensitivity analyses
Hedging Against Uncertainty
 Conservative ways of reporting so we don’t
mislead (convey the level of certainty in the estimate)
 Rank ordering—small groups may rank either in
the highest/lowest part of the distribution, yet we
are most uncertain of their true performance
 Cruder binning (categorization), as sketched below
 When faced with more uncertainty, or when the
consequences are higher
 Use measures as a tool to identify bottom
performers, then send out teams to find out
what is going on, as a way to validate
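One way to implement the cruder-binning idea (the counts and the normal-approximation intervals are illustrative): report a provider as above or below average only when its interval excludes the overall mean:

```python
import numpy as np

def bin_by_ci(successes, n, z=1.96):
    """Three-bin report: 'above'/'below' only when the approximate 95% CI
    excludes the overall mean; otherwise 'average'."""
    rates = successes / n
    overall = successes.sum() / n.sum()
    half = z * np.sqrt(rates * (1 - rates) / n)
    return np.where(rates - half > overall, "above",
           np.where(rates + half < overall, "below", "average"))

successes = np.array([17, 140, 820])
n = np.array([20, 200, 1000])
print(bin_by_ci(successes, n))
# -> ['average' 'below' 'average']: the 20-case provider has the highest
#    raw rate (85%) but stays 'average' because its interval is so wide.
```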
Measurement Issues Remain
 Existing measures
 OK, but difficult to implement (many rely on chart review)
 Hospital performance
 Complexity of what to measure (service line vs. overall)
 Physician performance
 Small “n” problem; challenges of pooling data
 Comprehensive assessment important, but too much
information will overwhelm end users
 Need for summary measures
 Need to improve data systems
Why Do We Need to Fill the Gaps?
 Lack of information and transparency
 Hard to improve if you don’t know where the
problem is
 Continue rewarding status quo
 Need to increase competition to improve
quality and contain costs
 Information is vital for competitive markets to
operate