Modelling Cardinal Utilities from Ordinal Utility data: An exploratory analysis

Peter Gilks, Chris McCabe, John
Brazier, Aki Tsuchiya, Josh Solomon
Background
• Limitations of conventional methods of utility
elicitation
• Early work suggesting ordinal data can predict
cardinal preferences
• SF-6D and HUI2 surveys used ranking exercises
as a warm-up prior to standard gamble (SG) valuation tasks
• Opportunity to test and develop methods proposed
by Solomon
SF-6D valuation data sets
• Ranked seven SF-6D states (including pits and full
health) and death
• SG valuations of five states against full health and
pits and then chained using valuation of pits
against full health and death (respondents asked to
confirm pits ranking against death)
• 611 respondents sampled from the general
population
• 249 mean SG health state values ranging from
0.21 to 0.99; an average of 14 valuations per state
HUI2 valuation data set
• Ranked 9 HUI2 states (including pits and full
health) and death
• SG valuations of 8 states against full health and
death (respondents asked to confirm ranking of
state against death)
• 198 respondents sampled from the general
population
• 51 mean SG health state values ranging from
0.064 to 0.77; an average of 24 valuations per state
Methods
Aim: To model the predicted health state valuations using
the ordinal preference data
1) Statistical model
Conditional logistic regression (McFadden choice
model) based on random utility theory (previous
attempts used Thurstone’s Comparative Judgement
Model)
2) Value function
Relating the health state descriptive system to the utility
value
The Statistical Model
• Respondent i has latent utility value for state j, Uij.
• Respondent will choose state j as best from a group of
states k = 1,…,n if Uij > Uik for all k ≠ j.
• Utility function Uij = μj + εij, where μj represents the
underlying tastes of the population and εij represents the
idiosyncratic preferences of the individual.
• Odds of choosing state j over state k are exp{μj – μk}
• So we want to model the dependent variable μ against the
dimensions of the descriptive systems: SF6D and HUI2.
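The odds statement above can be checked numerically. The following is a minimal sketch, assuming the error terms εij are independent type I extreme value draws (the standard assumption behind the McFadden model, not stated on the slide). The function name and the three μ values are hypothetical and purely illustrative.

```python
import math

def choice_probabilities(mu):
    """Probability that each state is chosen as best, given mean utilities mu."""
    m = max(mu)  # subtract the maximum for numerical stability
    weights = [math.exp(v - m) for v in mu]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical mean latent utilities for three states (illustrative values only).
mu = [0.0, -0.5, -2.0]
p = choice_probabilities(mu)

# Odds of state 0 over state 1 implied by the choice probabilities.
odds_01 = p[0] / p[1]

print("choice probabilities:", p)
print("odds of state 0 over state 1:", odds_01)
print("exp(mu_0 - mu_1):", math.exp(mu[0] - mu[1]))
```

The last two printed numbers agree, and they would still agree if a fourth state were added to `mu`, which is the property the next slide relies on.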
Assumption:
independence of irrelevant alternatives
• Model is based on assumption that the ranking exercise is equivalent to
the respondent making a series of individual choices from smaller and
smaller groups of states.
For example, to rank 10 health states:
• Selects first preference from all 10, rank 1
• Selects best from remaining 9, rank 2
• Selects best from remaining 8, rank 3, and so on
NB. This assumes that the choice over a given pair does not depend
on the other alternatives available
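Under this assumption, the likelihood of a complete ranking is the product of the successive "best of what remains" choice probabilities. The sketch below illustrates that decomposition; it is not the estimation code used for these results, and the function name and μ values are hypothetical.

```python
import math
from itertools import permutations

def rank_log_likelihood(mu_ranked):
    """Log-likelihood of one complete ranking as a sequence of logit choices.

    mu_ranked: mean utilities listed from the best-ranked state to the worst.
    """
    ll = 0.0
    # The last remaining state is chosen with probability 1, so stop one short.
    for pos in range(len(mu_ranked) - 1):
        remaining = mu_ranked[pos:]
        m = max(remaining)
        log_denom = m + math.log(sum(math.exp(v - m) for v in remaining))
        ll += mu_ranked[pos] - log_denom
    return ll

# Hypothetical mean utilities for three states (illustrative values only).
mu = [0.0, -0.5, -2.0]

ll_agree = rank_log_likelihood(mu)           # ranking that follows mu
ll_reversed = rank_log_likelihood(mu[::-1])  # ranking that reverses it

# The probabilities of all possible rankings should sum to one.
total_prob = sum(math.exp(rank_log_likelihood(list(r))) for r in permutations(mu))

print(ll_agree, ll_reversed, total_prob)
```

A ranking that follows the ordering of μ has a higher likelihood than its reverse, and estimation would choose the coefficients that maximise the sum of these log-likelihoods over respondents.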
Value function
The expected value of each unobserved utility was assumed to be a
linear function of the categorical ratings on the domains of each
descriptive system. The specifications are:
For HUI2: μ = β1S2 + β2S3 + β3S4 + β4M2 + β5M3 + β6M4 + β7M5 +
β8E2 + β9E3 + β10E4 + β11E5 + β12C2 + β13C3 + β14C4 + β15SC2 +
β16SC3 + β17SC4 + β18P2 + β19P3 + β20P4 + β21P5 + βdDeath
For SF6D: μ = β1PF2 + β2PF3 + β3PF4 + β4PF5 + β5PF6 + β6RL2 + β7RL3
+ β8RL4 + β9SF2 + β10SF3 + β11SF4 + β12SF5 + β13P2 + β14P3 + β15P4
+ β16P5 + β17P6 + β18MH2 + β19MH3 + β20MH4 + β21MH5 + β22V2 +
β23V3 + β24V4 + β25V5 + βdDeath
• Note: no constant term and a coefficient for death! This facilitates rescaling results on to the Full-Health Death (1,0) Scale.
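As an illustration of the HUI2 specification, the sketch below turns a health state into its active dummy variables and sums the matching coefficients. The dictionary layout and function names are hypothetical. The two coefficients are the unscaled HUI2 rank-model values reported in the results table for sens2 and pain3, and mapping them to S2 and P3 is an assumption about the table labels.

```python
# Number of levels per HUI2 dimension, read off the value function above.
# Level 1 is the baseline and has no dummy, since there is no constant term.
HUI2_LEVELS = {"S": 4, "M": 5, "E": 5, "C": 4, "SC": 4, "P": 5}

def dummy_code(state):
    """Return the set of active dummy names (e.g. 'S2', 'P3') for a state.

    state: dict mapping dimension -> level, or the string 'death'.
    """
    if state == "death":
        return {"Death"}
    active = set()
    for dim, level in state.items():
        if not 1 <= level <= HUI2_LEVELS[dim]:
            raise ValueError(f"invalid level {level} for dimension {dim}")
        if level > 1:
            active.add(f"{dim}{level}")
    return active

def latent_mu(state, beta):
    """mu is the sum of the coefficients of the active dummies."""
    return sum(beta[name] for name in dummy_code(state))

# Subset of unscaled rank-model coefficients (assumed mapping: sens2 -> S2, pain3 -> P3).
beta = {"S2": -0.9932932, "P3": -0.940143, "Death": -8.589516}

full_health = {"S": 1, "M": 1, "E": 1, "C": 1, "SC": 1, "P": 1}
example = {"S": 2, "M": 1, "E": 1, "C": 1, "SC": 1, "P": 3}

print(latent_mu(full_health, beta))  # no active dummies, so mu is 0
print(latent_mu(example, beta))      # sum of the S2 and P3 coefficients
print(latent_mu("death", beta))      # the death coefficient on its own
```

Because full health has no active dummies, its μ is zero by construction, which is what makes the later rescaling onto the full health and death anchors straightforward.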
Rescaling
The scale of the latent variable μ is arbitrarily defined by
the identifying assumptions in the model.
1) Normalise to observed SG scale (originally proposed by
Josh Solomon)
Multiply coefficients by the ratio:
βri = βi * min. obs. SG/ Predicted PITS value
2) Normalise to death
βri = βi / |βd|
This anchors death at zero and perfect health at 1
NB. states can still be valued as worse than death.
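The second normalisation can be reproduced from the reported coefficients. The sketch below applies it to a subset of the unscaled HUI2 rank-model coefficients from the results table. Reading a predicted value as 1 plus the sum of the rescaled coefficients is an assumption that follows from full health having μ = 0 and death being anchored at zero. The function names are hypothetical.

```python
# Subset of unscaled HUI2 rank-model coefficients from the results table.
rank_coeff = {
    "sens2": -0.9932932,
    "sens4": -2.116679,
    "pain5": -1.76543,
    "death": -8.589516,
}

def normalise_to_death(beta):
    """Divide every coefficient by the absolute value of the death coefficient."""
    scale = abs(beta["death"])
    return {name: value / scale for name, value in beta.items()}

def predicted_value(active_dummies, rescaled):
    """Assumed mapping onto the (1, 0) scale: full health = 1, death = 0."""
    return 1.0 + sum(rescaled[name] for name in active_dummies)

rescaled = normalise_to_death(rank_coeff)

for name in ("sens2", "sens4", "pain5", "death"):
    print(name, round(rescaled[name], 4))

# A state with sensation at level 2 and pain at level 5, all else at level 1.
value = predicted_value(["sens2", "pain5"], rescaled)
print("predicted value:", round(value, 4))
print("death:", predicted_value(["death"], rescaled))
```

The rounded values match the RescaledCoeff column of the HUI2 table. A state whose rescaled decrements sum to more than 1 in absolute value would receive a negative predicted value, which is how states worse than death arise.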
Model Assessment Methods
Main aim is to compare the predictive performance of
the rank model and the original standard gamble
model:
• Check coefficients for sign and consistency.
• Plot predictions against observed for rank model and
SG model for both datasets.
• Statistical tests of predictive performance.
• Look for systematic patterns in the errors.
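Several of the summary statistics reported in the results tables can be computed as in the sketch below. The observed and predicted values are hypothetical and purely illustrative, the function name is an assumption, and the LB statistic and the count of logical inconsistencies are not covered.

```python
import math

def assess(observed, predicted):
    """Summary measures of predictive performance for mean health state values."""
    errors = [p - o for o, p in zip(observed, predicted)]  # error = predict - mean
    n = len(errors)
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    n_gt_005 = sum(1 for e in errors if abs(e) > 0.05)
    n_gt_010 = sum(1 for e in errors if abs(e) > 0.10)

    # Pearson correlation between observed means and predictions.
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    corr = cov / math.sqrt(var_o * var_p)

    return {"n": n, "MAE": mae, "No.>0.05": n_gt_005,
            "No.>0.10": n_gt_010, "RMSE": rmse, "Corr": corr}

# Hypothetical observed means and model predictions (illustrative values only).
observed = [0.90, 0.70, 0.50, 0.30]
predicted = [0.84, 0.72, 0.62, 0.33]

stats = assess(observed, predicted)
print(stats)
```

Since MAE can never exceed RMSE for the same set of errors, comparing the two is a quick sanity check on any reported pair.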
HUI2
HUI2 Rank Model and SG Model (OLS)

Variable   RankCoeff     RescaledCoeff   SGCoeff
sens2      -0.9932932    -0.1156         -0.1151
sens3      -0.9350973    -0.1089         -0.1223
sens4      -2.116679     -0.2464         -0.2253
mobil2     -0.7287155    -0.0848         -0.0516
mobil3     -0.9887335    -0.1151         -0.1224
mobil4     -0.8041412    -0.0936         -0.1308
mobil5     -1.008526     -0.1174         -0.1103
emot2      -0.8122273    -0.0946         -0.0945
emot3      -1.0001       -0.1164         -0.1119
emot4      -1.429127     -0.1664         -0.1801
emot5      -1.43784      -0.1674         -0.1824
cogn2      -0.3222758    -0.0375         -0.0567
cogn3      -0.5438438    -0.0633         -0.0966
cogn4      -0.773194     -0.0900         -0.1676
sc2        -0.4409409    -0.0513         -0.0516
sc3        -0.692351     -0.0806         -0.1138
sc4        -0.7762394    -0.0904         -0.1158
pain2      -0.8131845    -0.0947         -0.1114
pain3      -0.940143     -0.1095         -0.1155
pain4      -1.216913     -0.1417         -0.1626
pain5      -1.76543      -0.2055         -0.2538
death      -8.589516     -1

Statistic                        Rank model   SG model
n                                51           51
MAE                              0.615        0.051
No.>0.05                         23           18
No.>0.10                         12           5
RMSE                             0.0775       0.0657
LB                               36.11        25.78
Corr(means)                      0.8814       0.921
No. of Logical Inconsistencies   2            1
[Figure: Mean values, predicted values and error (predict - mean) for
Rank model including death and SG Model (OLS), HUI2.
Panels: SG Model, Rank Model. Horizontal axis: health states 1 to 51;
vertical axis: -0.4 to 1.
Smooth line = mean health state values ranked by severity.
Top line is predictions. Bottom line is error.]
SF6D
SF6D Rank Model and SG Model (Mean '6')

Variable   RankCoeff     RescaledCoeff   SGCoeff
pf2        -0.363575     -0.0566         -0.0532
pf3        -0.431302     -0.0671         -0.0106
pf4        -0.9856325    -0.1534         -0.0402
pf5        -0.6340183    -0.0987         -0.0535
pf6        -1.447536     -0.2253         -0.1110
rl2        -0.3210761    -0.0500         -0.0530
rl3        -0.4069154    -0.0633         -0.0552
rl4        -0.4052777    -0.0631         -0.0503
sf2        -0.3626836    -0.0565         -0.0555
sf3        -0.4203095    -0.0654         -0.0668
sf4        -0.5737133    -0.0893         -0.0698
sf5        -0.8054821    -0.1254         -0.0866
pain2      -0.377161     -0.0587         -0.0467
pain3      -0.3635335    -0.0566         -0.0250
pain4      -0.6520135    -0.1015         -0.0561
pain5      -0.8187383    -0.1275         -0.0912
pain6      -1.191158     -0.1854         -0.1669
mh2        -0.2157184    -0.0336         -0.0490
mh3        -0.3371096    -0.0525         -0.0424
mh4        -0.7015521    -0.1092         -0.1092
mh5        -0.8992905    -0.1400         -0.1279
vit2       -0.173969     -0.0271         -0.0861
vit3       -0.2139943    -0.0333         -0.0606
vit4       -0.3226131    -0.0502         -0.0543
vit5       -0.5267463    -0.0820         -0.0907
death      -6.423983     -1.0000

Statistic                        Rank model   SG model
n                                249          249
MAE                              0.0882       0.0742
No.>0.05                         169          118
No.>0.10                         84           51
RMSE                             0.1096       0.0976
LB                               106.7200     169.5700
Corr(means)                      0.7111       0.7377
No. of logical inconsistencies   3            8
[Figure: Mean values, predicted values and error (predict - mean) for
Rank model including death and SG Model (6), SF6D.
Panels: Rank Model, SG Model. Horizontal axis: health states 1 to 249;
vertical axis: -0.4 to 1.
Smooth line = means. Top messy line is predictions.
Bottom messy line is error.]
Both Models:
• Under predict large means
• Over predict low means
Summary of Findings
• Rank models are able to predict actual mean SG
health state values nearly as well as the SG
models, with a modest increase in MAE
• Evidence that the rank model produced less
systematic error in the SF-6D data set and
improved consistency
Issues – taking results at face value
• Is the ranked model good enough? Could we start
using it?
• Given that ranking was only a warm-up, results could be
better if more care were taken over this part of the
exercise
• Ranked methods are probably cheaper
• What evidence is there that ranking exercises
impose a lower cognitive burden? There seem to be
higher levels of completion.
Issues – harder questions
• Is the selection process of the ranking task assumed by the
model correct?
• Why should the relationship between the latent utility
value and SG (in this case) cardinal values be linear?
– What other functional forms might theory suggest?
– Is the latent utility value similar to Dyer and Sarin’s
‘value function’ or something else?
• Does rank data elicit preferences or simply how good or
bad a health state is, and does it matter?
Issues – the death question
• Not a major problem here because all mean health
state values above zero
• The MVH EQ-5D data has been analysed in a
similar way by Josh Solomon, but the ranking of
death was very different to the implied ranking
from the TTO: only state 33333 is ranked worse
than death, compared to 16/43 states by TTO!
• Ranked model normalised to death and full health
does not predict TTO values worse than death
very well
Further work – more suggestions welcome
• See how well SG data predicts ranking at the
individual level
• Consider interactions
• Model different functional relationships between
latent variable and SG
• Examine completion rates and the extent to which
ranking will extend the vote to more vulnerable
populations