Christopher Aisbett, Laeta Pty Ltd and Casemix Unit

Has the Data Cleaning Hidden the Cost Effect?
Or Has the Baby Gone with the Bath Water?
Introduction
Activity Based Funding (ABF) model development is impeded by
• inconsistency in health product definition and
• inconsistency in attribution of incurred cost.
This leads to the removal of “bad” data from
analyses aimed at establishing model parameters.
By striving for reliability, the analyst may sacrifice
validity, especially in the assessment of effects (such as
Indigenous status) that may markedly affect outcomes.
This paper presents a meta-analytic method for
testing whether effects have been concealed by data
editing.
Method Part 1
• Inconsistencies in product definition and errors in cost allocation
tend to occur at establishment level. Therefore, studies within
establishment are less affected.
• The original unedited establishment data provide the basis for
validly assessing cost effects such as Indigenous Status.
• Phase 1
When data from many establishments are available, the findings at
establishment level may be pooled in a meta-analysis:
1. Partition the data in each establishment between contacts with
and without the study characteristic (e.g. Indigenous Status).
2. Form the ratio of the cost of each partition to its modelled cost,
and then the ratio of these to each other.
3. Regard the occurrence of a ratio of ratios greater than 1 as a
success in a Binomial trial.
4. Pool the results across establishments and test the effect
hypothesis against the Binomial distribution.
5. Record the number of Successes.
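The Phase 1 steps can be sketched in Python as follows. This is a minimal illustration, not the author's implementation. The record layout `(cell, status, cost, modelled_cost)`, the function names `phase1_trials` and `two_sided_binomial_p`, and the decision to skip cells that lack either partition are all assumptions made for the example. The test shown is an exact two-sided Binomial test with success probability 0.5.

```python
from collections import defaultdict
from math import comb


def phase1_trials(contacts):
    """Return (trials, successes) from (cell, status, cost, modelled_cost) records.

    `cell` identifies the unit being pooled (an establishment, or a
    hospital-by-case-type cell); `status` is 1 or 0.
    """
    # Step 1: partition each cell's cost and modelled cost by status.
    totals = defaultdict(lambda: {0: [0.0, 0.0], 1: [0.0, 0.0]})
    for cell, status, cost, modelled in contacts:
        totals[cell][status][0] += cost
        totals[cell][status][1] += modelled

    trials = successes = 0
    for parts in totals.values():
        (cost0, model0), (cost1, model1) = parts[0], parts[1]
        if min(cost0, model0, cost1, model1) <= 0:
            continue  # assumption: a cell without both partitions is not a trial
        # Step 2: cost-to-modelled-cost ratio per partition, then their ratio.
        ratio_of_ratios = (cost1 / model1) / (cost0 / model0)
        # Step 3: a ratio of ratios above 1 counts as a success.
        trials += 1
        successes += ratio_of_ratios > 1
    return trials, successes


def two_sided_binomial_p(successes, trials):
    """Step 4: exact two-sided p-value under Binomial(trials, 0.5)."""
    lower = sum(comb(trials, k) for k in range(successes + 1))
    upper = sum(comb(trials, k) for k in range(successes, trials + 1))
    return min(1.0, 2 * min(lower, upper) / 2 ** trials)


# Hypothetical records, two cells with both partitions present.
records = [
    ("A", 1, 120.0, 100.0), ("A", 0, 100.0, 100.0),
    ("B", 1, 80.0, 100.0), ("B", 0, 100.0, 100.0),
]
trials, successes = phase1_trials(records)  # Step 5: record the successes
p_value = two_sided_binomial_p(successes, trials)
```

Summing costs within each partition before dividing mirrors the summary tables in the demonstration, where each cell carries one expenditure total and one funding total per Status value.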
Method Part 2
• Phase 2
Working with a copy of the data:
1. Set the Status of all contacts to the value 0.
2. Within each Hospital by Case-type Cell, randomly
assign Status value 1 to the same number of contacts
that had Status value 1 in the original data.
3. Repeat Phase 1 above with these modified data.
• Phase 3
Repeat Phase 2 a sufficiently large number of times.
• Final Phase
Compare the outcome of Phase 1 with the distribution of
outcomes generated in Phase 3.
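Phases 2, 3 and the Final Phase amount to a randomization test, which can be sketched as below. The data layout (a list of cells, each a list of `(status, cost, modelled_cost)` contacts), the function names, the default of 1,000 repeats and the two-sided comparison in `randomization_p` are assumptions for illustration; the slides do not specify them. Shuffling the Status labels within a cell is used here as an equivalent of resetting all labels to 0 and reassigning the same number of 1s at random.

```python
import random


def count_successes(cells):
    """Phase 1 outcome: number of cells whose ratio of ratios exceeds 1."""
    successes = 0
    for cell in cells:
        sums = {0: [0.0, 0.0], 1: [0.0, 0.0]}
        for status, cost, modelled in cell:
            sums[status][0] += cost
            sums[status][1] += modelled
        (cost0, model0), (cost1, model1) = sums[0], sums[1]
        if min(cost0, model0, cost1, model1) > 0:
            successes += (cost1 / model1) / (cost0 / model0) > 1
    return successes


def randomization_distribution(cells, repeats=1000, seed=0):
    """Phases 2 and 3: Phase 1 outcomes under random Status assignment."""
    rng = random.Random(seed)
    outcomes = []
    for _ in range(repeats):
        shuffled_cells = []
        for cell in cells:
            statuses = [status for status, _, _ in cell]
            rng.shuffle(statuses)  # keeps the number of Status 1 contacts per cell
            shuffled_cells.append(
                [(s, cost, modelled) for s, (_, cost, modelled) in zip(statuses, cell)]
            )
        outcomes.append(count_successes(shuffled_cells))
    return outcomes


def randomization_p(observed, outcomes):
    """Final Phase: share of randomized outcomes at least as far from
    their own mean as the observed Phase 1 outcome (an assumed criterion)."""
    centre = sum(outcomes) / len(outcomes)
    extreme = sum(abs(o - centre) >= abs(observed - centre) for o in outcomes)
    return extreme / len(outcomes)
```

A large value from `randomization_p` would indicate that the observed count of successes is not unusual once the shape of the cost distribution is taken into account.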
Exemplar System
The Casemix System chosen to demonstrate the
approach is one for Emergency Care (EC):
– Urgency Related Groups v1p3 (URGs) and
– patient level costing (PLC) data from the National
Hospital Cost Data Collection, Round 15 (NHCDC15).
The work was conducted for the Independent
Hospital Pricing Authority (IHPA) in Australia.
The model was based on presentations
moderated by Indigenous Status.
Demonstration Part 1
First summarise the PLC data:
Establishment  URG   Status 1     Status 1  Status 0     Status 0  n      n1
                     Expenditure  Funding   Expenditure  Funding
1              1     8,695        5,640     254,749      169,200   93     3
1              3     4,392        4,028     123,367      51,022    41     3
1              ….    ….           ….        ….           ….        ….     ….
1              66    890          742       70,247       57,499    314    4
2              9     1,141        954       136,412      146,841   155    1
2              12    686          911       205,250      207,764   229    1
2              ….    ….           ….        ….           ….        ….     ….
2              61    7,435        7,347     465,975      568,397   3,840  49
3              1     452          3,760     9,368        45,120    26     2
3              3     213          1,343     6,874        21,483    17     1
….             ….    ….           ….        ….           ….        ….     ….
163            66    250          371       19,331       17,806    98     2
Demonstration Part 2
Then Calculate the Ratios and Compare with 1.
Establishment  URG   Status 1    Status 0    Ratio of  Trial    n      n1
                     Cost Ratio  Cost Ratio  Ratios    Outcome
1              1     1.542       1.506       1.024     1        93     3
1              3     1.090       2.418       0.451     0        41     3
1              ….    ….          ….          ….        ….       ….     ….
1              66    1.199       1.222       0.982     0        314    4
2              9     1.196       0.929       1.288     1        155    1
2              12    0.753       0.988       0.762     0        229    1
2              ….    ….          ….          ….        ….       ….     ….
2              61    1.012       0.820       1.234     1        3,840  49
3              1     0.120       0.208       0.579     0        26     2
….             ….    ….          ….          ….        ….       ….     ….
3              3     0.158       0.320       0.495     0        17     1
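As a worked check, the ratios for the rows displayed in the demonstration can be recomputed from the expenditure and funding totals of the first summary. The sketch below assumes that "Funding" plays the role of the modelled cost and that each cost ratio is expenditure divided by funding; the slides do not spell this out. The recomputed values agree with the displayed ones only to within rounding, presumably because the displayed dollar amounts are themselves rounded.

```python
# (establishment, URG, Status 1 expenditure, Status 1 funding,
#  Status 0 expenditure, Status 0 funding,
#  ratio of ratios as displayed, trial outcome as displayed)
rows = [
    (1, 1, 8695, 5640, 254749, 169200, 1.024, 1),
    (1, 3, 4392, 4028, 123367, 51022, 0.451, 0),
    (1, 66, 890, 742, 70247, 57499, 0.982, 0),
    (2, 9, 1141, 954, 136412, 146841, 1.288, 1),
    (2, 12, 686, 911, 205250, 207764, 0.762, 0),
    (2, 61, 7435, 7347, 465975, 568397, 1.234, 1),
    (3, 1, 452, 3760, 9368, 45120, 0.579, 0),
    (3, 3, 213, 1343, 6874, 21483, 0.495, 0),
]


def ratio_of_ratios(exp1, fund1, exp0, fund0):
    """Status 1 cost ratio divided by Status 0 cost ratio."""
    return (exp1 / fund1) / (exp0 / fund0)


recomputed = []
for est, urg, exp1, fund1, exp0, fund0, shown_ratio, shown_outcome in rows:
    ratio = ratio_of_ratios(exp1, fund1, exp0, fund0)
    outcome = int(ratio > 1)  # a success is a ratio of ratios above 1
    recomputed.append((est, urg, ratio, outcome))
    print(est, urg, round(ratio, 3), outcome)
```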
Demonstration Results
• Count the number of trials (Hospital by Case-type combinations in
test data): observed 4,734 (4,712 for cleaned data).
• Determine the p=0.01, 2-tailed “Acceptance Range” (AR) for
Bin(4,734, 0.5): [2278,2455] ([2259,2452] for cleaned data).
• Sum the Trial Outcomes (Ratios of Ratios greater than 1): observed
1,847 (1,853 for cleaned data).
• The observed value is outside the AR.
• Provisionally reject the hypothesis “No Adjustment Needed” and go
on to Phases 2 and 3 of the method.
• We find that the result is not unusual within the set of outcomes
obtained through randomization.
• Reverse the decision and determine that the need for an
adjustment has not been confirmed.
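The Acceptance Range step can be illustrated with a normal approximation to the Binomial distribution. This is a sketch under that stated approximation, and the function name `acceptance_range` is hypothetical; the slides do not say how the range was actually computed. For 4,734 trials the approximate bounds land close to the [2278,2455] quoted above.

```python
from math import sqrt
from statistics import NormalDist


def acceptance_range(trials, alpha=0.01, p=0.5):
    """Approximate two-tailed acceptance range for Binomial(trials, p)."""
    z = NormalDist().inv_cdf(1 - alpha / 2)  # about 2.576 for alpha = 0.01
    mean = trials * p
    sd = sqrt(trials * p * (1 - p))
    return mean - z * sd, mean + z * sd


low, high = acceptance_range(4734)
observed = 1847  # successes observed in the unedited data
inside = low <= observed <= high
print(round(low, 1), round(high, 1), inside)
```

The observed count sits roughly 15 standard deviations below the mean of 2,367, which is why the hypothesis is provisionally rejected before the randomization phases are run.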
Discussion
• The Phase 1 analysis appeared to confirm an effect of Indigenous status
in the IHPA model for emergency services. However, its direction
corresponds to that expected in long tailed cost distributions, irrespective
of any Status effect.
• An extra use of the data in the second summary form is to calculate the level
of adjustment needed.
• This is calculated by using:
– the inverse variance weighted average of Ratios of Ratios (the variance
estimate is (n+1)/(n1*(n-n1)), where n is total presentations and n1 is
Indigenous presentations).
• This returns an uplift multiplier estimate of 1.0035 (1.0046 for cleaned data).
• The apparent contradiction between the Binomial finding and the
estimated uplift indicated the need for the later Phase analyses.
• Note, however, that when the IHPA regression model approach is followed on
the cleaned data the estimate is 1.0398, very close to the 1.04 value
determined by IHPA.
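The uplift calculation described in this discussion can be sketched as follows, using the variance estimate (n+1)/(n1*(n-n1)) so that each cell's weight is n1*(n-n1)/(n+1). The function name `uplift_multiplier` and the input layout are assumptions for illustration. The sample cells are only the first two rows displayed in the demonstration, so the result is not expected to reproduce the 1.0035 estimate, which rests on all cells.

```python
def uplift_multiplier(cells):
    """Inverse variance weighted average of ratios of ratios.

    cells: iterable of (ratio_of_ratios, n, n1) with 0 < n1 < n, where n is
    total presentations and n1 is Indigenous presentations in the cell.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for ratio, n, n1 in cells:
        variance = (n + 1) / (n1 * (n - n1))
        weight = 1.0 / variance
        weighted_sum += weight * ratio
        weight_total += weight
    return weighted_sum / weight_total


# Two displayed cells only, for illustration.
sample = [(1.024, 93, 3), (0.451, 41, 3)]
estimate = uplift_multiplier(sample)
```

Cells with few Indigenous presentations, or with almost none of the other kind, receive small weights, so sparse cells with extreme ratios have limited influence on the estimate.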
Conclusion
Data editing has not affected the validity of the
Indigenous Status Estimate.
Despite its simplicity, the Phase 1 method allows
cross validation of findings (or otherwise), even
when the model development used edited data,
provided that the cost ratios have Bell shaped
probability curves with median close to mean.
The Phase 1 Binomial results allow assessment of
the Bell curve assumption. If it does not hold then
results from the (Empirical) Conditional Likelihood
approach (Phase 2 and 3) must be used.
Postscript 1
• Note that the Binomial test result suggests
that Indigenous Status is associated with
lower spending within Hospital by URG cell.
• When Hospital ratios rather than Hospital by
URG ratios are used:
– 107 trials, 34 successes (p=0.0001), uplift
multiplier estimate of 0.980
– Very similar outcome to the more detailed cell
analysis.
Postscript 2
• When URG ratios rather than Hospital by URG ratios
are used:
– 65 trials, 27 successes (p=0.1073), uplift multiplier estimate
of 1.031
– Quite different outcome from the more detailed cell
analysis.
– In line with IHPA determination
– In line with iterative regression estimates
• So what is the real story here?
• The answer is to be found in the shapes of the
distributions of cost ratios.
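The p-values quoted in the two postscripts can be explored with an exact Binomial tail probability. The sketch below assumes they are one-sided, lower-tail probabilities under Binomial(trials, 0.5); that reading is an inference from the magnitudes, not something the slides state, and the function name `lower_tail_p` is hypothetical.

```python
from math import comb


def lower_tail_p(successes, trials):
    """Exact P(X <= successes) for X ~ Binomial(trials, 0.5)."""
    favourable = sum(comb(trials, k) for k in range(successes + 1))
    return favourable / 2 ** trials


for label, successes, trials in [
    ("Hospital ratios", 34, 107),
    ("URG ratios", 27, 65),
]:
    print(label, trials, successes, lower_tail_p(successes, trials))
```

Integer arithmetic with `comb` keeps the sum exact until the final division, which avoids accumulating floating point error across the tail.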