Utilising rank and DCE data to value health status on the ‘QALY’ scale using conventional and Bayesian methods

John Brazier and Theresa Cain with Aki Tsuchiya and Yaling Yang
Health Economics and Decision Science, ScHARR, University of Sheffield, UK
Prepared for the CHEBS Focus Fortnight

Outline
- Concerns with current cardinal methods for valuing health states
- Problems in using ordinal data
- Application of rank and DCE methods to valuing asthma health states using conventional methods
- Application of Bayesian methods to analysing DCE data
- Implications for research and policy

Problems with cardinal methods for valuing health states
- Time trade-off (TTO) and standard gamble (SG) are seen as cognitively complex tasks that may be too difficult for some (e.g. children, the very elderly)
- TTO values are contaminated by time preference, SG values by risk attitude and rating scale values by end point bias (among other things)
- Role for ordinal methods (rank and discrete choice)

Ordinal tasks: ranking and discrete choice experiments
- Ranking: respondents are asked to order a set of health states from best to worst; traditionally used as a warm-up exercise prior to VAS/SG/TTO based preference elicitation
- Discrete choice experiments (DCE): respondents are typically asked to choose between two health states (A and B)

Problems with using ordinal data to value health for QALYs
- DCE and rank models estimate a latent health state utility value, but with arbitrary anchors
- QALYs require health states to be valued on a scale where full health is one and being dead is zero
- Key problem is linking results of DCEs to the full health-dead scale

Previous work using ordinal data
Ranking
- Early application of Thurstone’s method by Kind (1982)
- Use of conditional logit on rank data by Salomon (2003) on EQ-5D and McCabe et al (2005) on SF-6D and HUI2, with some success
DCE
- DCE applications in health economics are mainly concerned with the relative weight of different attributes of health care rather than with valuing health per se
- DCE considered unsuitable for assessing cost
  effectiveness (because utility scale is not comparable between studies)

Past attempts to apply DCE to valuing HRQoL
- Hakim and Pathak (1999) applied DCE to valuing EQ-5D states; used ‘pick one’ from 12 choice sets (each containing 3 states plus dead); exploratory and did not produce weights
- McKenzie et al (2001) estimated weights for asthma symptoms; no link to full health-dead scale
- Viney et al (2004) included attributes for HRQoL and survival, but did not estimate health state values

Alternative approaches to using DCE
The latent utility scale needs to be anchored on the full health-dead scale, and there are a number of different ways to do this:
- Value the PITS state externally by TTO/SG (Ratcliffe and Brazier, 2005)
- Include a dead state in the pair wise choice set*
- Use the question ‘is this a state worth living’ in the best-worst scaling method (Flynn et al, 2005)
* Method used in this study

Background to AQLQ study
- The Asthma Quality of Life Questionnaire (AQLQ), developed by Professor Juniper, is a condition specific measure with 32 questions with 7 levels each covering 4 dimensions
- A simplified health state classification, the AQL-5D, was developed from the AQLQ based on a sample of items on 5 domains: concern, breathlessness, pollution and environment, sleep and activity

AQL-5D
1. Feel concerned about having asthma
   [1] None of the time  [2] A little or hardly any of the time  [3] Some of the time  [4] Most of the time  [5] All of the time
2. Feel short of breath as a result of asthma
   [1] None of the time  [2] A little or hardly any of the time  [3] Some of the time  [4] Most of the time  [5] All of the time
3. Experience asthma symptoms as a result of air pollution
   [1] None of the time  [2] A little or hardly any of the time  [3] Some of the time  [4] Most of the time  [5] All of the time
4. Asthma interferes with getting a good night’s sleep
   [1] None of the time  [2] A little or hardly any of the time  [3] Some of the time  [4] Most of the time  [5] All of the time
5. Overall, the activities I have done have been limited
   [1] Not at all  [2] A little  [3] Moderate or some  [4] Extremely or very  [5] Totally

Health state 32345
- Feel concerned about having asthma some of the time [3]
- Feel short of breath as a result of asthma a little or hardly any of the time [2]
- Experience asthma symptoms as a result of air pollution some of the time [3]
- Asthma interferes with getting a good night’s sleep most of the time [4]
- Overall, totally limited with all the activities done [5]

Valuation survey: sampling and interview
- Representative sample of adult general population invited to participate
- At the interview:
  - Ranked health states from best to worst (7 AQLQ health states, full health (i.e. best AQLQ state), the worst AQLQ state and immediate death)
  - Time trade-off (York MVH variant) of 8 AQLQ health states against shorter time in full health
- 100 health states valued in this way

Methods: postal follow-up
- Approx 4 weeks after interview, respondents received the DCE questionnaire in the post
- An optimal statistical design for the DCE, based upon level balance, orthogonality and minimum overlap, was produced by a programme in SAS (Huber and Zwerina, 1996)
- 12 pair wise comparisons were produced and randomly allocated to two versions of the questionnaire with 6 choices in each
- Two additional pairs were presented to respondents, each comparing an AQL-5D state with dead

Discrete choice question
Health State A
- Feel concerned about having asthma none of the time.
- Feel short of breath as a result of asthma none of the time.
- Experience asthma symptoms as a result of air pollution none of the time.
- Asthma interferes with getting a good night's sleep all of the time.
- Overall, a little limitation in every activity done.
Health State B
- Feel concerned about having asthma all of the time.
- Feel short of breath as a result of asthma a little or hardly any of the time.
- Experience asthma symptoms as a result of air pollution most of the time.
- Asthma interferes with getting a good night's sleep a little or hardly any of the time.
- Overall, moderate or some limitation in any activity done.
Which health state do you think is better? (please tick one box only)   [ ] A   [ ] B

Statistical model for rank and DCE data
General model: $\mu_{ij} = f(\beta' x_{ij} + \Phi D + u_{ij})$
where
- $\mu_{ij}$ is the latent utility function of respondent $i$ for state $j$
- $x$ is a vector of dummy explanatory variables $x_{\alpha\lambda}$, one for each level $\lambda$ of each dimension $\alpha$ of the classification. For example, $x_{32}$ denotes dimension $\alpha = 3$, level $\lambda = 2$
- $D$ is a dummy variable for the state of being dead, which takes the value 1 for being dead and zero otherwise

Modelling health state values
Modelling:
- TTO: individual level model (random effects)
- DCE: random effects probit model
- Ranking: rank ordered logit model
Rescaling:
- Re-scale by dividing the β coefficients on each dimension level by the coefficient for being dead
- These rescaled coefficients provide predictions for health state values on the same scale as TTO valuations, although the predicted values for health states may not necessarily be the same as those obtained using the TTO technique
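The rescaling step can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' code: the coefficient values are invented, the helper names (`state_to_dummies`, `rescale`, `predict_value`) are hypothetical, and it assumes that full health (state 11111) is the reference with latent utility zero, so that a predicted value is one minus the sum of the rescaled decrements.

```python
from typing import Dict, List

DIMENSIONS = ["concern", "breath", "pollution", "sleep", "activity"]


def state_to_dummies(state: str) -> List[str]:
    """Map an AQL-5D state such as '32345' to the names of its active dummies.

    Level 1 is treated as the reference level and contributes no dummy.
    """
    if len(state) != len(DIMENSIONS) or any(ch not in "12345" for ch in state):
        raise ValueError("state must be five digits, each between 1 and 5")
    return [f"{dim}{lvl}" for dim, lvl in zip(DIMENSIONS, state) if lvl != "1"]


def rescale(betas: Dict[str, float], phi: float) -> Dict[str, float]:
    """Divide each latent coefficient by the coefficient for being dead."""
    if phi == 0:
        raise ValueError("the dead coefficient must be non-zero")
    return {name: beta / phi for name, beta in betas.items()}


def predict_value(state: str, rescaled: Dict[str, float]) -> float:
    """Predicted value on the full health (1) to dead (0) scale.

    Assumes negative betas and a negative dead coefficient, so each
    rescaled term is a positive decrement from full health.
    """
    return 1.0 - sum(rescaled[name] for name in state_to_dummies(state))


# Invented latent-scale coefficients, for illustration only.
betas = {f"{dim}{lvl}": -0.1 * (lvl - 1) for dim in DIMENSIONS for lvl in range(2, 6)}
phi = -2.5  # invented coefficient on the dead dummy

rescaled = rescale(betas, phi)
for s in ("11111", "32345", "55555"):
    print(s, round(predict_value(s, rescaled), 3))
# Being dead maps to 1 - phi / phi = 0 by construction.
```

With these invented numbers the worst state stays above zero, which is the situation in which no state is predicted to be worse than dead.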
Results of valuation survey
Rank/TTO interview:
- 308 respondents (response rate 40%)
- Representative in terms of gender, age, education
- 2455 TTO valuations across 100 health states
DCE:
- 168 returned questionnaires (response rate 55%)
- 1336 pair wise comparisons

Results - impact of dimension level on TTO scores
(Individual level Random Effects model with main effects)

Level   Concern   Breath    Pollution   Sleep     Activity
2       -0.047*   -0.024    -0.017      -0.013    -0.029
3       -0.064*   -0.045*   -0.028      -0.029    -0.044*
4       -0.074*   -0.107*   -0.063*     -0.054*   -0.139*
5       -0.095*   -0.116*   -0.099*     -0.069*   -0.164*

* statistically significant at the 0.05 level
Dependent variable: TTO values
MAE = 0.051

Comparison of βs
[Figure: decrements (vertical axis) by dimension level (horizontal axis: Concern, Breath, Pollution, Sleep, Activity) for the TTO, Rank and DCE models]

Spearman rank correlations (n=100)
[Table layout not recoverable. Labels: TTO pred, Rank pred, DCE pred, Observed TTO. Values in source order: .918, .901, 0.790, .885, .688, .770]

Predicted health state valuations
[Figure: predicted health state values (vertical axis) against observed mean TTO values (horizontal axis)]
- TTO predictions: y = 0.73x + 0.2
- Rank predictions: y = 0.66x + 0.2
- DCE predictions: y = 1.21x - 0.2

Comparisons of models

                        TTO RE        Rank          DCE warm      DCE cold
Negative βs             20/20         20/20         15/20         19/20
Inconsistencies         0             0             3             1
MAE/MAD                 0.051         0.065         0.09          0.12
>0.05                   22            30            31            40
Mean error/difference   0.015         0.01          0.03          0.1
Scale range             0.457-1.00    0.405-1.00    0.172-1.00    0.121-1.00

Overall comparison
- TTO model predicts observed TTO values best (lowest MAE)
- Rank model predicts observed TTO values nearly as well as the TTO model
- DCE model is associated with the largest difference from observed TTO values and seems to have a steeper gradient (i.e. more extreme values)

Research questions
1. Is DCE really easier than TTO/SG or VAS?
2. Does DCE produce different estimates from TTO and SG?
3. Theoretical basis for using DCE rather than conventional TTO or SG
4. Basic DCE design issues
5. Analysis: mixed logit or Bayesian models
6. Does the dead dummy solve the problem?

Does including dead solve the problem?
- A more natural solution is to include survival as an attribute, but this has a multiplicative relationship to QoL and so would require a far larger design
- Using ‘dead’ requires the ‘pits’ health state of the classification to be considered worse than dead by some respondents, so it is not suitable for milder classifications
- What about those who do not think any state is worse than dead (85% in this sample)?
- For those who do not think any state is worse than dead, their data tell us nothing about their strength of preference for QoL compared to quantity of life
- Are the 85% all non-traders? SF-6D (67%), HUI3 (33%) and EQ-5D (14%)
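Worked example: predicting values from the TTO main-effects decrements

Returning to the TTO random effects decrements reported in the results, the short Python sketch below shows how a predicted value for any AQL-5D state could be assembled from them. It assumes an additive model with no constant term, so that state 11111 is valued at 1; that assumption is not stated on the slides, but it reproduces the reported TTO scale range of 0.457 to 1.00. The function names are hypothetical, and the MAE helper is a generic definition rather than the authors' exact calculation.

```python
from typing import Dict, List, Sequence

# Decrements for levels 1 to 5 of each dimension, copied from the TTO results
# table; level 1 is the reference level and has no decrement.
TTO_DECREMENTS: Dict[str, List[float]] = {
    "concern":   [0.0, -0.047, -0.064, -0.074, -0.095],
    "breath":    [0.0, -0.024, -0.045, -0.107, -0.116],
    "pollution": [0.0, -0.017, -0.028, -0.063, -0.099],
    "sleep":     [0.0, -0.013, -0.029, -0.054, -0.069],
    "activity":  [0.0, -0.029, -0.044, -0.139, -0.164],
}
ORDER = ["concern", "breath", "pollution", "sleep", "activity"]


def predict_tto(state: str) -> float:
    """Predicted TTO value: 1 plus the decrement for each dimension level.

    Assumes no constant term (an assumption, see the text above).
    """
    if len(state) != len(ORDER) or any(ch not in "12345" for ch in state):
        raise ValueError("state must be five digits, each between 1 and 5")
    return 1.0 + sum(TTO_DECREMENTS[dim][int(lvl) - 1] for dim, lvl in zip(ORDER, state))


def mean_absolute_error(predicted: Sequence[float], observed: Sequence[float]) -> float:
    """Mean of the absolute differences between predicted and observed values."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - o) for p, o in zip(predicted, observed)) / len(predicted)


print(round(predict_tto("55555"), 3))  # worst state: lower end of the scale range
print(round(predict_tto("32345"), 3))  # the example state described earlier
```

Under this assumption the worst state comes out at 0.457, matching the lower bound of the TTO scale range in the model comparison table, and the example state 32345 comes out at 0.666.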