CREATING COMPOSITES: Examples from Physician Profiling, Case-Mix, and Total Illness Burden

Overview
CREATING COMPOSITES:
Examples from Physician Profiling,
Case-Mix, and Total Illness Burden
Sherrie H. Kaplan, PhD, MPH,
Dara Sorkin, PhD,
Sheldon Greenfield, MD
UCI School of Medicine
ARM, June 5, 2007
Definition
• Composite measure: a representation
of an abstract construct defined in
terms of two or more individual
measures
Example:
Math is a complex multi-dimensional
construct (arithmetic, algebra, geometry,
trigonometry…) requiring multiple measures
to assess each dimension
Sports Composites…
• GmSc - Game Score - This is a value created
by Bill James that evaluates how good a pitcher's
start was.
• Start with 50 points. Add 1 point for each out
recorded (or 3 points per inning). Add 2 points for
each inning completed after the 4th. Add 1 point
for each strikeout. Subtract 2 points for each hit
allowed. Subtract 4 points for each earned run
allowed. Subtract 2 points for each unearned run
allowed. Subtract 1 point for each walk.
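The Game Score rule above is a simple additive composite, and it can be written out directly. The sketch below is only an illustration of the rule as stated on the slide; the function name and parameter names are hypothetical.

```python
def game_score(outs, innings_completed, strikeouts, hits,
               earned_runs, unearned_runs, walks):
    """Game Score as described on the slide (names are illustrative)."""
    score = 50                                  # start with 50 points
    score += outs                               # +1 per out recorded
    score += 2 * max(0, innings_completed - 4)  # +2 per inning completed after the 4th
    score += strikeouts                         # +1 per strikeout
    score -= 2 * hits                           # -2 per hit allowed
    score -= 4 * earned_runs                    # -4 per earned run allowed
    score -= 2 * unearned_runs                  # -2 per unearned run allowed
    score -= walks                              # -1 per walk
    return score


# Hypothetical start: 9 complete innings, 27 outs, 10 strikeouts, nothing allowed.
perfect = game_score(outs=27, innings_completed=9, strikeouts=10,
                     hits=0, earned_runs=0, unearned_runs=0, walks=0)
```

Every component is weighted by a fixed, published number of points, which makes the weighting fully transparent.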
Overview
• Definition of ‘composite’ measures
• Ubiquity of composites
• Role of purpose of measurement
• Methodologic ‘musts’ of composite
construction
• How to create composites
• Practical issues
• Examples from health, healthcare
Prevalence of Composites…
• Composite measures are currently used:
– To rank countries (e.g. OECD)
– To rate financial institutions
– To rank and reward schools
– To rank nursing homes
– To choose students (by colleges)
– To evaluate patient satisfaction
– To evaluate efficacy in clinical trials
– In sports….
Composites in Sports…
On-base plus slugging:
OPS = OBP + SLG
where OBP is on-base percentage, and SLG is
slugging percentage. These percentages are defined
as
SLG = TB / AB and
OBP = (H + BB + HBP) / (AB + BB + SF + HBP)
where:
H = Hits
BB = Bases on balls
HBP = Times hit by pitch
AB = At bats
SF = Sacrifice flies
TB = Total bases
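As a worked illustration of the OPS definitions above, the sketch below computes OBP, SLG, and their sum. The function names and the season totals are hypothetical, chosen only to show the arithmetic.

```python
def on_base_percentage(h, bb, hbp, ab, sf):
    # OBP = (H + BB + HBP) / (AB + BB + SF + HBP)
    return (h + bb + hbp) / (ab + bb + sf + hbp)


def slugging_percentage(tb, ab):
    # SLG = TB / AB
    return tb / ab


def ops(h, bb, hbp, ab, sf, tb):
    # OPS is an unweighted sum of two component rates.
    return on_base_percentage(h, bb, hbp, ab, sf) + slugging_percentage(tb, ab)


# Hypothetical season totals (not real player data).
example_ops = ops(h=150, bb=50, hbp=5, ab=500, sf=5, tb=250)
```

Note that the two components have different denominators, so OPS adds quantities measured on different scales; this is the kind of issue raised later under "adding up apples and airplanes".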
MrNFL.COM…
Role of Purpose of Measurement
• Changes content of aggregate
measure
• Changes tolerance of error
• Changes psychometric
requirements of aggregate
• Changes ‘level of confidence’,
dissemination strategy
RATIONALE FOR COMPOSITE
MEASURES: The Pros…
1. To summarize constructs:
• Over complex or multi-dimensional diseases,
conditions, patient characteristics or clinical
situations
• Summarize a large amount of information into a
simpler (interpretable) measure
• Standardization
• Facilitate ranking of providers
2. To improve reliability; potentially reduce the
list of quality measures
RATIONALE FOR COMPOSITE
MEASURES: The Cons…
1. Composite measures are hard to interpret
(what do units of measurement mean?)
2. Composite measures are hard to validate
3. Composite measures don’t guide quality
improvement
4. Information is ‘wasted’ or ‘hidden’ in
composite measures
5. Weighting isn’t transparent
RATIONALE FOR COMPOSITE
MEASURES: The Pros… (Cont’d)
3. To be ‘fairer’ (different ways to get good
scores)
4. To increase effective sample size when
characterizing constructs under study
• Increase power of study
5. To reduce length of “study”, e.g., the time
period over which to establish change in
construct
How to create composites:
Lessons from psychometrics…
1. Choose well tested physician-level
quality measures
– Assess “physician effect”, esp. on outcomes
2. Account for non-random clustering of
patients within physician (case-mix
bias)
3. Sampling, power (n pts/MD, n MDs)
How to create composites:
Lessons from psychometrics…
4. Test scoring methods for creating
aggregate scores (weighting, missing
data, etc.)
5. Test reliability/validity of profile
scores
6. Make sure composites are mutable
(i.e. physician, practice, system
characteristics related to high/low
profiles)
Models for Composite Scoring
• Conjunctive scoring (‘ands’): highest,
lowest levels achieved define score
– Rheumatoid arthritis trials: patient responded if:
• at least a 20% improvement in tender joint count and
• 20% improvement in swollen joint count and
• at least 20% improvement in 3 out of 5 of the
following: pain assessment, global assessment,
physician assessment, etc.
• Compensatory scoring (‘ors’): high scores
on one component make up for low scores
on another
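The difference between the two scoring models above can be made concrete with a short sketch. It assumes improvements are expressed as fractions (0.20 = 20%) and that the compensatory score is a plain average; the function names, the default cutoff argument, and the averaging rule are illustrative assumptions, not part of the original slides.

```python
def conjunctive_responder(tender_impr, swollen_impr, other_imprs,
                          cutoff=0.20, needed=3):
    """'Ands': every condition must be met for the patient to count as a responder.

    other_imprs holds the fractional improvements on the five additional
    assessments (pain, global, physician, etc.).
    """
    enough_others = sum(x >= cutoff for x in other_imprs) >= needed
    return tender_impr >= cutoff and swollen_impr >= cutoff and enough_others


def compensatory_score(components):
    """'Ors': a simple average, so a high component can offset a low one."""
    return sum(components) / len(components)


# Hypothetical patient: large tender-joint improvement, small swollen-joint improvement.
is_responder = conjunctive_responder(0.50, 0.10, [0.4, 0.4, 0.4, 0.4, 0.4])
average = compensatory_score([0.50, 0.10, 0.4, 0.4, 0.4, 0.4, 0.4])
```

Under the conjunctive rule this patient is not a responder because one required criterion falls short, while the compensatory average still looks favorable.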
To weight or not to weight?
• Decision to weight should be based on
purpose of weighting
• Weighting methods should be credible,
defensible, transparent
• Weighting methods must be tested to
demonstrate value over simple
summary methods
Scoring Strategies for Composites
1. Mean: measure and report estimate.
2. Tournament: measure and identify
those in a specific quantile.
3. Threshold: measure and identify those
that pass a certain threshold.
4. Change: Using (1), (2) or (3), measure
and identify those that change
“significantly”.
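The first three strategies above can be sketched on a set of provider-level composite scores. The provider labels, scores, top fraction, and cutoff below are hypothetical, and the "change" strategy is omitted because it would need a significance test that the slides do not specify.

```python
import math


def mean_report(scores):
    # Strategy 1: report each provider's estimate as measured.
    return dict(scores)


def tournament(scores, top_fraction=0.25):
    # Strategy 2: identify providers in the top quantile (here, top 25%).
    n_top = max(1, math.ceil(len(scores) * top_fraction))
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:n_top]


def threshold(scores, cutoff):
    # Strategy 3: identify providers at or above a fixed cutoff.
    return [name for name, s in scores.items() if s >= cutoff]


# Hypothetical composite scores for four providers.
scores = {"A": 0.9, "B": 0.7, "C": 0.5, "D": 0.3}
top_quartile = tournament(scores)
passing = threshold(scores, cutoff=0.6)
```

A tournament always singles out some providers regardless of absolute performance, whereas a threshold can be passed by all or by none.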
Models for weighting
• Expert defined
– Conditioned by ‘expert’ representation
• Regression-based
– Conditioned by database (provider,
patient sample, sample size)
• Reliability-based
– Conditioned by database (sample size)
Assessing the ‘provider effect’
• Evaluate measures to be included in
composites for attribution to physician,
‘site’ or group practice, institution,
health plan
Intraclass Correlations (ICCs)
For any provider-level quality
measure, if the mean-square estimate
of between provider variation is
large, and variation across patients
within a provider’s practice is small
(indicating a large physician effect),
then the ICC will be large
Inflation Factor (IF)
IF = (n-1)*ICC
Where n = number of patients
per provider
ICC = provider level intraclass
correlation
Translation
• Larger ICCs are better
• IFs > 6 are better
Optimizing the ‘Physician
Effect’ on HbA1c levels
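To make the ICC and IF definitions concrete, here is a minimal sketch. The slides do not say which ICC estimator was used; the one-way ANOVA estimator with equal numbers of patients per provider is an assumption made here for illustration, and the function names and data are hypothetical.

```python
def icc_oneway(groups):
    """One-way ANOVA ICC for equal-sized groups (an assumed estimator).

    groups: list of lists, one inner list of patient values per provider.
    """
    k = len(groups)       # number of providers
    n = len(groups[0])    # patients per provider (assumed equal)
    grand = sum(sum(g) for g in groups) / (k * n)
    means = [sum(g) / n for g in groups]
    # Between-provider and within-provider mean squares.
    msb = n * sum((m - grand) ** 2 for m in means) / (k - 1)
    msw = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g) / (k * (n - 1))
    return (msb - msw) / (msb + (n - 1) * msw)


def inflation_factor(n, icc):
    # IF = (n - 1) * ICC, as defined on the slide.
    return (n - 1) * icc


# Hypothetical HbA1c values for three providers with four patients each.
hba1c = [[7.0, 7.2, 6.9, 7.1], [8.1, 8.0, 8.3, 7.9], [7.5, 7.4, 7.7, 7.6]]
icc = icc_oneway(hba1c)
infl = inflation_factor(n=4, icc=icc)
```

When patients within a provider are similar and providers differ from one another, `msb` dominates `msw` and the ICC approaches 1.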
Patient vs. Physician ‘Effect’:
LDL levels
Table 2. Number of quality measures (k) needed for the
physician-level reliability desired and varying level of
intraclass correlation

Desired physician-level                  ICC
reliability (r_jj)        .01    .05    .10    .20    .30    .50
.65                       184     35     17      7      4      2
.70                       231     44     24      9      5      2
.75                       297     57     27     12      7      3
.80                       396     76     36     16      9      4
.85                       561    108     51     23     13      6
.90                       891    171     81     36     21      9

Based on Spearman-Brown Prophecy formula (46):

$$k = \frac{r_{jj}\,(1 - \mathrm{ICC})}{\mathrm{ICC}\,(1 - r_{jj})}$$
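The Spearman-Brown relationship behind Table 2 is easy to evaluate directly. The sketch below applies the formula as given; the function name is hypothetical, and rounding to the nearest whole measure is an assumption about how table entries were produced.

```python
def measures_needed(target_reliability, icc):
    """Number of measures k for a desired physician-level reliability.

    k = r (1 - ICC) / (ICC (1 - r)), the Spearman-Brown prophecy formula.
    """
    r = target_reliability
    return r * (1 - icc) / (icc * (1 - r))


# Evaluate a few cells; rounding to the nearest integer is an assumption.
k_80_20 = round(measures_needed(0.80, 0.20))
k_90_01 = round(measures_needed(0.90, 0.01))
k_65_50 = round(measures_needed(0.65, 0.50))
```

The formula shows why low ICCs are so costly: at an ICC of .01, hundreds of measures are required even for modest reliability.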
Creating a Composite Physician-Level Performance Score
An Example from the ADA/NCQA
Provider Recognition Program…
Creation of an Aggregate Profile Score

Measure                        Correlation with Total
Annual HbA1c                   .41
Annual lipids                  .73
Annual urine microalbumin      .30
Annual eye exam                .43
Annual foot exam               .39
HbA1c < 9%                     .44
LDL < 130 mg/dl                .61
HDL OK                         .63
Triglycerides < 200 mg/dl      .57
BP < 140/90                    .18

Cronbach’s α = .78
Creation of an Aggregate Profile Score

Measure                        Correlation with Total
Sum 5 process measures         .62
HbA1c < 9%                     .45
LDL < 130 mg/dl                .62
HDL OK                         .63
Triglycerides < 200 mg/dl      .61

Cronbach’s α = .82
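The two statistics reported in these tables, Cronbach's α and each measure's correlation with the total, can be computed as in the sketch below. The data are hypothetical, and the slides do not say whether the item-total correlations were corrected (item removed from the total); the uncorrected version is assumed here.

```python
from math import sqrt
from statistics import mean, variance


def pearson(x, y):
    # Plain Pearson correlation between two equal-length sequences.
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))


def cronbach_alpha(rows):
    """rows: one list of item scores per physician (all the same length)."""
    k = len(rows[0])
    totals = [sum(r) for r in rows]
    item_variance_sum = sum(variance(col) for col in zip(*rows))
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


def item_total_correlations(rows):
    # Uncorrected: each item is correlated with the total that includes it.
    totals = [sum(r) for r in rows]
    return [pearson(col, totals) for col in zip(*rows)]


# Hypothetical scores for five physicians on three quality measures.
rows = [[0.8, 0.7, 0.9], [0.6, 0.5, 0.7], [0.9, 0.8, 0.8],
        [0.4, 0.5, 0.6], [0.7, 0.6, 0.9]]
alpha = cronbach_alpha(rows)
correlations = item_total_correlations(rows)
```

A measure with a low correlation with the total, such as BP < 140/90 in the first table, pulls α down, which is consistent with α rising when the item set is revised.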
Practical Issues
• Enough items to create a reliable
composite
• Enough patients per item
(condition, quality construct)
• Adding up apples and airplanes…
• Weighting vs. simple sums
• Transparency
• Validation???
• Credibility
Summary: Why aggregate?
• Individual measures are not reliable;
may not accurately reflect quality
• Composite scores easier for public,
insurers, employers to use
• Composite scores are fairer to
physicians (multiple ways to get a
good score)
• Individual measures in aggregates can
still be used (e.g. for quality
improvement)
The Unreliability of Individual
Physician “Report Cards” for Assessing
the Costs and Quality of Care of a
Chronic Disease
Timothy P. Hofer, MD, MS
Rodney A. Hayward, MD
Sheldon Greenfield, MD
Edward H. Wagner, MD
Sherrie H. Kaplan, PhD, MPH
Willard G. Manning, PhD
Figure 1. Comparison of Physicians' Visit Rate Profiles
Other Examples: Case-Mix
• Self-Reliant/Provider Dependent Health
Care Orientation (Dr. Dara Sorkin)
• Total Illness Burden Index (Dr. Sheldon
Greenfield)