A TWO LEVEL MONTE CARLO APPROACH TO CALCULATING EXPECTED VALUE OF SAMPLE INFORMATION: HOW TO VALUE A RESEARCH DESIGN

Alan Brennan, MSc (a)
Jim Chilcott, MSc (a)
Samer Kharroubi, PhD (b)
Anthony O'Hagan, PhD (b)
Johanna Cowan (a)

(a) School of Health and Related Research, The University of Sheffield, Regent Court, Sheffield S1 4DA, England.
(b) Department of Probability and Statistics, The University of Sheffield, Hounsfield Road, Sheffield S3 7RH, England.

Reprint requests to: a.brennan@sheffield.ac.uk

Acknowledgements:

The authors are members of CHEBS: The Centre for Bayesian Statistics in Health Economics. Particular thanks to Karl Claxton and Tony Ades, who helped finalise our thinking at a CHEBS "focus fortnight" event in April 2002, and to Gordon Hazen, Doug Coyle, Myriam Hunink and others who stimulated AB further at the viewing of the poster at the MDM conference 2002. Finally, thanks to the UK National Coordinating Centre for Health Technology Assessment, which originally commissioned two of the authors to review the role of modelling methods in the prioritisation of clinical trials (Grant: 96/50/02) and thereby started us on this path.

ABSTRACT (original MDM conference abstract – rewrite)

Purpose: To describe a generalised two level Monte Carlo simulation approach for calculating partial EVSI.

Background: Claxton has promoted the EVSI concept as a measure of the societal value of research designs, to help identify optimal sample sizes for primary data collection. An existing method (using the unit normal loss integral formula to calculate EVSI for sample size n) applies if the net benefit function is normally distributed and if the proposed sampling exercise measures all the model parameters. Whilst common in trials, such conditions are not universal, and a generalised method is needed for so-called partial EVSI.

Methods: A two level algorithm, originally developed for partial EVPI applications, has been further refined, tested and applied to undertake EVSI calculations. The paper also sets out a mathematical notation for EVSI for parameter(s). The algorithm begins by sampling a 1st value for the parameter of interest (e.g. the underlying true % response rate to a drug, θ1) and then… [sampling data (e.g. the response rate found in a trial of n=100, given θ1). A Bayesian update for the parameter of interest is calculated (merging the prior probability distribution with the sampled data). This has a revised mean and a smaller variance than the 'prior' because new information has been obtained. 1,000 Monte Carlo samples are now generated, allowing the parameter of interest to vary over its Bayesian updated distribution and all other parameters to vary according to their priors. The expected net benefit of each strategy is examined and the best strategy chosen.] The process in [ ] is repeated for a 2nd Monte Carlo sample of the parameter of interest, then a 3rd, and so on 1,000 times. The overall mean expected net benefit is compared to the expected net benefit without further data collection; the difference is the EVSI for the research. An illustrative model for two treatments, with costs and benefits, is used to show EVSI for a trial, a QALY study, both together, a long term follow up, and an all parameter study.

Results: A mathematical expression for partial EVSI is derived, clearly showing the two level Monte Carlo structure in terms of two 'expectations'.
Given sample data, Bayesian updating requires an analytic expression to revise the parameters of the distribution (given here for the normal, beta, gamma and lognormal distributions). More general distributions may require other approximations or MCMC methods. The algorithm applies equally to sets of parameters as to a single parameter.

Conclusion: This provides a significant step towards a generalised method for partial EVSI.

A TWO LEVEL MONTE CARLO APPROACH TO CALCULATING EXPECTED VALUE OF SAMPLE INFORMATION: HOW TO VALUE A RESEARCH DESIGN

Introduction

Clinical trials and other health research studies are investments, which should lead to reduced uncertainty about the performance of health technologies and to improved decision making on their use in practice. What is the added value of a proposed research study? How much more valuable would it be if its sample size were increased, or its design altered to measure additional data items or to collect information over a longer time? Would a different research question be more appropriate and valuable to decision makers than the proposed study? These questions are important for governmental, academic and indeed commercial healthcare research funding bodies throughout the world. The aim is to allow researchers and funding agencies to rank different options for data collection, including different sample sizes, additional data on resource use and costs, the added benefit of quality of life / utility studies, longer trial durations, or alternative study designs such as observational or epidemiological studies rather than randomised controlled trials.

Traditionally, the large investments in, for example, phase III clinical drug trials have been planned around proving a statistically significant clinical difference, delta (δ). The implied assumption is that, if a clinical delta is shown conclusively, then adoption of the new technology will automatically follow, i.e. in a sense the clinical delta is a proxy for the economic viability of the technology. This produces a relatively common problem in cost-effectiveness analysis, where a study may show a clinically important outcome but not be sufficiently powered to provide enough information for a reasonable estimation of cost-effectiveness. This leaves healthcare decision makers with uncertainty concerning adoption decisions, and sometimes with the need to acquire further information before a final decision is made.

Recent work on valuing research has followed either the value of information direction (as this paper does) or the so-called payback approach (Eddy 1989, Townsend and Buxton 1997, Davies et al 1999, Detsky 1990, Drummond et al 1992, Buxton et al 1994). The payback approach compares the costs and benefits of undertaking a specified technology assessment versus no data collection. Firstly, the analyst lists the possible results of the proposed technology assessment (so-called 'delta results'), then quantifies the likely changes in practice, health benefits and cost consequences for each delta result, and finally, by estimating the probability of each possible delta result, produces a weighted average (i.e. expected) cost and health benefit for the study. By comparing this with the anticipated cost and health benefit consequences of no data collection, the incremental cost-effectiveness of undertaking the proposed study can be calculated.
There are two main shortcomings of this method. Firstly, the starting point of the cost-benefit analysis is the evaluation of a single specified research proposal, which implicitly assumes that the suggested proposal is optimally designed; in fact, the optimal trial design can only be identified by comparing a range of possible designs. Secondly, in practice the likelihood of the different results of the trial (at its simplest, positive or negative) has very often been arbitrarily decided, giving rise to a lack of robustness in the payback model's calculations.

The expected value of sample information (EVSI) approach has been suggested as a tool to quantify the societal value of research and to identify optimal designs and sample sizes for primary research data collection (Claxton and Posnett 1996, Claxton et al 2001). EVSI differs from the payback approach in that it begins immediately with a cost-effectiveness model of the interventions of interest, including an assessment of the existing uncertainty. The decision to adopt a particular intervention policy is made on the basis of the cost-effectiveness decision rule: choose the option with the greatest expected health benefits after adjusting for cost differences between strategies (i.e. the largest expected net benefit). Where there is large uncertainty, the adoption decision based on current information could turn out to be sub-optimal, and so obtaining more data on uncertain parameters reduces uncertainty and adds value by reducing the chance of making a sub-optimal adoption decision. Because collecting data also incurs costs, a trade-off exists between the expected value of the additional data and the cost of obtaining it. In this framework, the research design problem involves two components: what data to collect (i.e. on which uncertain model parameters should data be collected) and how much data to collect (what sample size)?

To date, most methodological development has focussed on the expected value of perfect information (ref Brennan MDM or CHEBS discussion paper, Karl, Coyle), i.e. the expected value of perfect knowledge of all parameters in the model (overall EVPI) or of perfect knowledge of subgroups of parameters (partial EVPI). Earlier work on EVSI (Claxton and Posnett 1996, Claxton et al 2001) focussed on relatively simple proposed data collection exercises (i.e. collect data on every uncertain model parameter in one single data collection exercise with a single specified sample size) and was valid in a relatively small number of contexts (technically, the uncertainty in net benefit must be normally distributed for the so-called 'unit normal loss integral' formula to be valid). Given that most cost-effectiveness models of interventions and most proposed research studies concern more complex situations than this, a generalised method for calculating partial EVSI is required.

This paper describes a generalised method to quantify the value of alternative proposed research designs. The approach uses Bayesian methods and two level Monte Carlo sampling and simulation to produce the expected value of sample information for uncertain parameters in the cost-effectiveness model. A detailed algorithm for undertaking the EVSI calculation process is developed and the mathematical formulation for the calculations is presented.
An important component is the synthesis of the existing uncertainty about a parameter with the simulated data collection, which requires a process called Bayesian updating. The Bayesian updating procedure is described for the normal, beta, gamma and lognormal distributions, and methods for other distributions are discussed. To test its feasibility and to explore the interpretation of the results, the methodology is applied to an illustrative cost-effectiveness model comparing two interventions. EVSI results can be calculated separately for each different sample size, but this paper also examines an interpolation method so that EVSI for smaller or larger sample sizes can be estimated, reducing the need for larger numbers of simulations.

Method

The two level EVSI algorithm

The general algorithm to calculate EVSI involves setting up a decision model for the possible interventions, characterising uncertainty with probability distributions, simulating a variety of results for a proposed collection of further data, synthesising the existing (prior) evidence with the simulated collected data, and finally evaluating the impact of the data collected on the decision between the interventions. Box 1 describes the detailed structure of the algorithm.

Box 1: General Algorithm to Calculate EVSI

0. Set up a decision model - to compare the costs and health benefits of different interventions, including an adoption decision rule, e.g. "select a strategy if its marginal cost per QALY is < $50,000", enabling net benefits (i.e. $50,000 × QALYs − cost) to be calculated.

1. Characterise uncertainty - for each model parameter, identify an appropriate probability distribution to characterise the existing evidence and uncertainty, e.g. the % response rate to a drug is Beta(a, b).

2. Work out the 'baseline decision' given current evidence - use the probabilistic sensitivity analysis approach to simulate, say, 10,000 sample sets of uncertain parameter values by Monte Carlo sampling. Use the decision model to identify the intervention that provides the highest net benefit on average over the simulations.

3. Define a proposed data collection - decide on a proposed sample size for collecting data on each of the parameters of interest (e.g. a trial of n=100 to collect data on the % response rate to a drug).

4. Begin a loop to simulate the possible variety of results from the proposed data collection - there are two sources of variety in the possible results. The first is the uncertainty about the true underlying value of the parameter of interest. The second is that, even given the true underlying value of the parameter of interest, there is random chance associated with data collection of a specific finite sample size. Both need to be accounted for in the simulation. Sample the data collection:
a) sample a true underlying value for the parameter of interest θi from its prior uncertainty (e.g. use Monte Carlo to sample the true underlying value for the % response rate to a drug from the probability distribution identified in step 1; say sample 1 gives θi = 60%);
b) sample simulated data Xi given the sampled true underlying value of the parameter of interest (e.g. sample the mean response rate found in a trial of n=100, given θi = 60%).

5. Synthesise existing evidence with simulated data - for each parameter, combine the prior knowledge with the simulated data collection using Bayesian updating techniques.
The result is a simulated posterior probability distribution for the true value of the parameter of interest. Typically, the posterior has a revised mean and a smaller variance than the prior probability distribution.

6. Examine the impact of the simulated data collection on the decision between interventions - re-run the probabilistic sensitivity analysis on the decision model. Allow the parameters of interest to be sampled from their posterior probability distributions (step 5) and the remaining parameters (those with no additional collected data) to be sampled from their prior probability distributions (step 1). Identify the 'revised decision', the intervention that provides the highest net benefit on average over the simulations, and compare this to the 'baseline decision'.

7. Quantify the added value of the simulated data collection - if the 'revised decision' is different to the 'baseline decision' then the simulated data collection has added value. The value of the simulated data = the expected net benefit of the 'revised decision' (step 6) minus the expected net benefit of the 'baseline decision' (step 2). Record the 'revised decision' and its average net benefit. Then loop back to repeat steps 4 to 7, say, 1,000 times, in order to simulate the variety of results from the proposed data collection.

8. Quantify the expected value of sample information (EVSI) for the proposed data collection - after all of the loops to simulate the variety of possible results from the proposed data collection, the EVSI is simply the average of the results obtained in step 7. Hence, EVSI for the proposed data collection = average net benefit provided by the 'revised decisions' minus average net benefit provided by the 'baseline decision'.

Mathematical formulation

The general algorithm is reflected in the detailed mathematical formulation for the EVSI calculations, as described below.

Let
θ be the set of uncertain parameters for the model, with defined prior probability distributions;
d be the set of possible decisions, i.e. interventions;
NB(d, θ) be the net benefit function for decision d, which also depends on the parameters θ;
E[f(x)] denote the expected value of a function f(x);
θi be the parameters of interest for possible data collection;
θ-i be the other parameters, for which there will be no further data collection (note {θ} ≡ {θi} ∪ {θ-i});
p(θi) be the prior probability distribution for the true underlying value of θi;
p(θ-i) be the prior probability distribution for the true underlying value of θ-i;
Xi be the (simulated) data collected on θi;
p(θi|Xi) be the posterior probability distribution for the value of θi having obtained the data Xi.

First, consider the situation where we have only the current existing evidence (prior information). Given current information, we evaluate the expected net benefit of each decision strategy in turn and then choose the baseline decision that gives the highest expected net benefit:

Expected net benefit of 'baseline decision' = max_d E_θ [NB(d, θ)]     (1)

Next, consider a proposal to collect further information on the subset of parameters of interest, θi. Any data collected would be used to update our understanding of the true underlying value of θi, giving a new probability distribution with a revised posterior mean and typically a smaller standard deviation.
Given a particular simulated dataset Xi, the 'revised decision' can be made by evaluating each decision strategy in turn and then choosing the one with the highest expected net benefit, i.e. max_d E_{θ|Xi} [NB(d, θ) | Xi].

The variety of possible results of the data collection means this expression needs to be evaluated across all possible results of the data collection exercise. That is, the overall expected net benefit following the proposed data collection is given by:

Expected net benefit of 'revised decision' given proposed data = E_{Xi} [ max_d E_{θ|Xi} [NB(d, θ) | Xi] ]     (2)

This expression clearly shows the two levels of expectation involved. The outer expectation relates to the variety of possible results of the proposed data collection exercise (i.e. the loop which begins at step 4 of the algorithm). The inner expectation relates to the evaluation of the decision model under the remaining uncertainty having obtained the proposed data (i.e. step 6 of the algorithm, where the 'revised decision' is evaluated using the probabilistic sensitivity analysis approach, allowing the parameters of interest to be sampled from their posterior probability distributions p(θi|Xi) and the remaining parameters, those with no additional collected data, to be sampled from their prior probability distributions p(θ-i)).

Finally then, the expected value of sample information (EVSI) for the proposed data collection exercise is given by the expected net benefit of the 'revised decision' given the proposed data collection (2) minus the expected net benefit of the 'baseline decision' (1):

EVSI = E_{Xi} [ max_d E_{θ|Xi} [NB(d, θ) | Xi] ] − max_d E_θ [NB(d, θ)]     (3)
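To make the structure of (3) concrete, the following minimal sketch shows the two nested Monte Carlo loops in Python (the calculations in this paper were performed with EXCEL macros; Python is used here purely for illustration). The four functions passed in are placeholders to be supplied by the analyst for a particular decision model; their names are ours, not part of the method.

    import numpy as np

    def evsi(prior_sampler, data_sampler, posterior_sampler, net_benefit,
             n_outer=1000, n_inner=1000, rng=None):
        """Generic two level Monte Carlo EVSI, following Box 1 and equation (3).

        prior_sampler(rng, size)        -> draws of all parameters theta
        data_sampler(rng, theta)        -> one simulated dataset Xi given theta_i
        posterior_sampler(rng, x, size) -> draws with theta_i from p(theta_i | Xi)
                                           and theta_-i from its prior
        net_benefit(theta)              -> array of net benefits, one column per decision d
        """
        rng = np.random.default_rng() if rng is None else rng

        # Baseline: max_d E_theta[NB(d, theta)] under current information, equation (1)
        baseline = net_benefit(prior_sampler(rng, n_inner)).mean(axis=0).max()

        # Outer loop over possible datasets Xi (step 4 of the algorithm)
        revised = np.empty(n_outer)
        for j in range(n_outer):
            theta_true = prior_sampler(rng, 1)         # step 4a: true value of theta_i
            x = data_sampler(rng, theta_true)          # step 4b: simulated data
            post = posterior_sampler(rng, x, n_inner)  # step 5: Bayesian update
            revised[j] = net_benefit(post).mean(axis=0).max()  # step 6: inner expectation

        return revised.mean() - baseline               # steps 7-8, equation (3)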
Bayesian updating: synthesising the existing evidence with the simulated data

Technically, the most demanding part of undertaking EVSI calculations is synthesising the existing prior evidence with the simulated data to form a simulated posterior probability distribution for the parameter of interest. The general mathematical machinery is given by Bayes theorem (reference useful books from Tony's preliminary reading list on Bayes). The complexity of the synthesis required depends upon the form of the probability distributions for both the parameter concerned and the related data. For a limited set of distributions, the synthesis of existing evidence and simulated data can be done using a simple formula. These so-called conjugate distributions are such that the functional form of the probability distribution does not change as new data are added. Examples (detailed below) include the normal, beta (binomial), gamma (Poisson) and lognormal distributions. Other conjugate distributions include ….. (Tony / Samer complete). For more complex or non-conjugate distributions, simple analytic formulae for Bayesian updating are not possible, and estimation methods must be used instead. The most commonly used estimation method is Markov Chain Monte Carlo (reference Spiegelhalter / MCMC papers), which estimates the posterior probability distribution by simulation. Typically, this involves a potentially time consuming MCMC simulation (at step 5 of the algorithm) using freely available software such as WinBUGS (refs and website).

Bayesian updating: Normal distribution

The normal distribution can represent uncertain cost-effectiveness model parameters which are continuous variables, distributed symmetrically around a mean value (no skewness), and which might take both positive and negative values. The exact shape of the probability distribution is determined by the two normal distribution parameters, the mean μ and standard deviation σ. The normal distribution is particularly useful for representing the uncertainty in an average. An example of a cost-effectiveness model parameter that might take a normal distribution is the mean utility change for patients who respond to a drug treatment. Typically, the value of such a cost-effectiveness model parameter might be estimated from a clinical trial or observational study of individual patients' utilities before and after treatment. This individual patient level data (or the mean and 95% confidence interval from a published report of the study) would be used to estimate the prior mean μ0 and prior standard deviation σ0 for the cost-effectiveness model parameter, given current evidence.

Box 2 part A provides the mathematics for the process of Bayesian updating for normal distributions and three worked examples using illustrative data on a model parameter: the utility gain for drug responders. The process begins with the prior mean μ0 and standard deviation σ0 assigned from existing evidence, where necessary incorporating expert opinion. For the normal distribution updating formula a third parameter is needed, the population level standard deviation σpop, which represents patient level variability. (E.g. the individual patient level standard deviation in the utility gain for drug responders, σpop, may be 0.2, whilst the standard deviation of the mean, σ0, may be 0.1; the latter is lower because the variability across many patients averages out, giving less uncertainty in the mean.) The next step is to simulate a data collection exercise of the chosen sample size. For some other distributions it is necessary to simulate sample data at the individual patient level, but for the normal distribution the process can be shortcut by simulating the sample mean. First, sample a true underlying value for the cost-effectiveness model parameter, μ0sample, from the prior probability distribution Normal(μ0, σ0²). Second, simulate the mean of the data collected, X̄, given this sampled true underlying value, from the probability distribution Normal(μ0sample, σpop²/n). Thus we account, firstly, for the uncertainty in the true value of the parameter and, secondly, for the random variation in data collection due to the sample size n, which has greater effect for a smaller chosen sample size. The final step is to synthesise the simulated data with the prior information using the formulae in rows (j) and (k) of Box 2.

The worked examples show that as new simulated data are obtained, the estimated mean of the parameter is adjusted and the standard deviation (the uncertainty in the mean of the parameter) is reduced. Figure 1 illustrates the results for three separate simulated data samples, showing the impact on the mean and variance of the model parameter. We observe that the posterior mean can take values across the full range of the prior probability distribution. The posterior variance is exactly the same for all three simulated data sets: examining formula (k) shows that the posterior variance is independent of the simulated data X̄ and depends only on the size of the sample n. The formula also shows that the posterior variance is always lower than the prior variance. Finally, there is a clear relationship between the proposed sample size n and the posterior variance. If the sample size is very small, the posterior variance approaches the prior variance, i.e. we gain little additional certainty. As the sample size approaches infinity, the posterior variance approaches zero and we near 'perfect information'.
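The Box 2 part A formulae can be scripted directly. The sketch below (Python; the function name and arguments are ours, for illustration only) carries out one simulated data collection and conjugate update for a normal mean, and reproduces the posterior standard deviation of 0.0272 from the worked examples.

    import numpy as np

    def simulate_normal_update(mu0, sigma0, sigma_pop, n, rng):
        """One simulated data collection and conjugate update for a normal mean
        (Box 2, part A)."""
        mu_true = rng.normal(mu0, sigma0)                    # step 4a: true parameter value
        x_bar = rng.normal(mu_true, sigma_pop / np.sqrt(n))  # step 4b: simulated sample mean
        prec0 = 1.0 / sigma0**2                              # prior precision I0 (row c)
        prec_s = n / sigma_pop**2                            # precision of sample mean IS (row i)
        mu1 = (prec0 * mu0 + prec_s * x_bar) / (prec0 + prec_s)  # posterior mean (row j)
        sigma1 = np.sqrt(1.0 / (prec0 + prec_s))                 # posterior sd (row k)
        return mu1, sigma1

    rng = np.random.default_rng(1)
    print(simulate_normal_update(mu0=0.3, sigma0=0.1, sigma_pop=0.2, n=50, rng=rng))
    # sigma1 = sqrt(1 / (100 + 1250)) = 0.0272, whatever data are sampled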
Insert Figure 1.

Figure 1. Bayesian updating. Part A, Normal distribution: prior and posterior distributions after sampling 50 cases (frequency over 10,000 samples against utility gain for drug responder, for the prior and the posteriors after samples 1, 2 and 3). Part B, Beta distribution: prior and posterior distributions after sampling 50 cases (against % response rate for the drug). Part C, Gamma distribution: prior and posterior distributions after sampling 50 cases (against mean side effect rate).

Bayesian updating: Beta (Binomial) distribution

The beta distribution can represent uncertain cost-effectiveness model parameters which are proportions, probabilities or percentages. The beta distribution is fundamentally linked to data for which the answer is either "yes" or "no", i.e. binomial data, often coded with only the values 0 and 1. The exact shape of the probability distribution is determined by the two beta distribution parameters, a and b. An example of a cost-effectiveness model parameter that might take a beta distribution is the percentage of patients who respond successfully to a drug: at an individual level, each patient has only two possible results, "response" or "no response". Again, existing evidence from clinical trial or observational study data should be used to quantify the prior distribution, e.g. the percentage of responders will have distribution Beta(a0, b0), where a0 is the number of patients who responded, b0 is the number of patients who did not respond, and the prior mean for the response rate is a0/(a0 + b0).

Box 2 part B provides the mathematics for the process of Bayesian updating for beta distributions and three worked examples using illustrative data on another model parameter: the drug response rate. The process begins with quantifying the prior parameters a0 and b0. The next step is to simulate a data collection exercise of the chosen sample size. First, sample a true underlying value for the response rate parameter, resp_rate0sample, from the prior probability distribution Beta(a0, b0). Second, simulate the number of responders, y, which might arise from the proposed sample of size n given this sampled true underlying value, from the probability distribution Binomial(n, resp_rate0sample). Again, we account firstly for the uncertainty in the true value of the parameter and secondly for the random variation in data collection due to the sample size n, which has greater effect for a smaller chosen sample size. The final step is to synthesise the simulated data with the prior information to give the posterior distribution, using the very simple formula: the posterior is Beta(a0 + y, b0 + n − y).

The worked examples show that as new simulated data are obtained, the estimated mean of the parameter is adjusted and the standard deviation (the uncertainty in the mean of the parameter) is reduced. Figure 1 part B illustrates the results for three separate simulated data samples, showing the impact on the mean and variance of the model parameter. We observe that a beta distribution based on small sample sizes (as in the prior) is skewed, but as larger samples are obtained it appears normal (and can be approximated by a normal distribution). As with the updated normal distribution, the posterior mean can take values across the full range of the prior probability distribution. The posterior variance is not exactly the same for all three simulated data sets, because mathematically it depends on the sampled data; in these examples, the closer the sample is to a response rate of 100%, the smaller the posterior variance. Finally, again there is a relationship between the proposed sample size n and the posterior variance. If the sample size is very small, the posterior variance approaches the prior variance, i.e. we gain little additional certainty. As the sample size approaches infinity, the posterior variance approaches zero and again we near 'perfect information'.
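The corresponding sketch for the beta (binomial) case follows Box 2 part B; here the individual yes/no responses are summarised by the binomial count y, and the conjugate update is a simple parameter addition. Names are again illustrative.

    import numpy as np

    def simulate_beta_update(a0, b0, n, rng):
        """One simulated data collection and conjugate update for a response rate
        (Box 2, part B)."""
        p_true = rng.beta(a0, b0)       # step 4a: true response rate
        y = rng.binomial(n, p_true)     # step 4b: responders in a sample of n
        a1, b1 = a0 + y, b0 + n - y     # conjugate update
        post_mean = a1 / (a1 + b1)
        post_var = a1 * b1 / ((a1 + b1) ** 2 * (a1 + b1 + 1))
        return a1, b1, post_mean, post_var

    rng = np.random.default_rng(1)
    print(simulate_beta_update(a0=7, b0=3, n=50, rng=rng))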
Bayesian updating: Gamma (Poisson) distribution

The Poisson distribution is often used to model the number of events happening during a specified unit of time, under the assumption that events occur at random with a constant mean event rate λ. The gamma distribution is related and can be used to express the uncertainty in the mean event rate λ. The exact shape of the probability distribution is determined by the two gamma distribution parameters, a and b, and thus λ ~ Gamma(a, b). An example of a cost-effectiveness model parameter that might take a gamma distribution is the mean number of side effects experienced by patients in a year. At an individual level, each patient will experience a discrete number of side effects. The mean number of side effects cannot be negative, and the distribution may be skewed, as most patients will experience some side effects but a few may experience many. Again, existing evidence from clinical trial or observational study data should be used to quantify the prior distribution for the mean rate, e.g. the mean number of side effects experienced by patients in a year will have distribution Gamma(a0, b0), where a0 is the prior side-effect rate multiplied by the prior sample size reported from existing evidence, and b0 is 1 divided by the prior sample size. The resulting prior mean is a0·b0 and the prior variance is a0·b0².

Box 2 part C provides the mathematics for the process of Bayesian updating for gamma distributions and three worked examples using illustrative data on a third model parameter: the mean side-effect rate for a drug. The process begins with quantifying the prior parameters a0 and b0. The next step is to simulate a data collection exercise of the chosen sample size. First, sample a true underlying value for the side-effect rate parameter, sideeffect_rate0sample, from the prior probability distribution Gamma(a0, b0). Second, given this sampled true underlying value, simulate the number of side effects (y1, y2, …, yn) for each of the n patients in the proposed sample from the probability distribution Poisson(sideeffect_rate0sample). This is slightly different from the normal and beta distributions because it is necessary to sample individual level data. Again, we account firstly for the uncertainty in the true value of the parameter and secondly for the random variation in data collection due to the sample size n. The final step is to synthesise the simulated data with the prior information, using the simple formulae in Box 2 rows 8 to 10, to give the posterior distribution for the mean number of side effects per person.

The worked examples show that as new simulated data are obtained, the estimated mean of the parameter is adjusted and the standard deviation (the uncertainty in the mean of the parameter) is reduced. Figure 1 part C illustrates the results for three separate simulated data samples, showing the impact on the mean and variance of the model parameter. We observe that a gamma distribution based on small sample sizes (as in the prior) is skewed, but as larger samples are obtained it appears normal (and can be approximated by a normal distribution). As with the updated normal distribution, the posterior mean can take values across the full range of the prior probability distribution. The posterior variance is not exactly the same for all three simulated data sets, because mathematically it depends on the sampled data. Finally, again, if the sample size is very small then the posterior variance approaches the prior variance, i.e. we gain little additional certainty, whilst as the sample size approaches infinity, the posterior variance approaches zero and we near 'perfect information'.
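The gamma (Poisson) case differs in that individual level counts must be simulated before the update, as this sketch illustrates (Box 2 part C; names illustrative).

    import numpy as np

    def simulate_gamma_update(a0, b0, n, rng):
        """One simulated data collection and conjugate update for a mean event rate
        (Box 2, part C). Individual level counts are required here."""
        rate_true = rng.gamma(shape=a0, scale=b0)  # step 4a: true mean side-effect rate
        y = rng.poisson(rate_true, size=n)         # step 4b: counts for n patients
        a1 = a0 + y.sum()                          # conjugate update (Box 2 rows 8-10)
        b1 = 1.0 / (1.0 / b0 + n)
        return a1, b1, a1 * b1, a1 * b1**2         # posterior mean and variance

    rng = np.random.default_rng(1)
    print(simulate_gamma_update(a0=1.0, b0=0.25, n=50, rng=rng))
    # with a0=1, b0=0.25 and n=50, b1 = 1/54 = 0.0185, as in the worked examples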
Bayesian updating for other distributions

Some other distributions have functional forms related to the normal, beta and gamma distributions and can use adjustments to the analytic formulae presented in Box 2 for Bayesian updating. The lognormal distribution (where log(X) ~ Normal(μ, σ²)) is a useful skewed distribution allowing only positive values and can often be used for cost parameters. To update a lognormal parameter, it is necessary to convert it to a normal distribution (by taking the logarithm of the variable), form the Bayesian posterior for the normal distribution as shown above, and then convert the normal posterior back to a lognormal distribution (by exponentiating). Similarly, the exponential distribution is a special case of the gamma distribution (a = 1) and can use the same updating formula.

Other distributions do not have conjugate properties and cannot use simple formulae for Bayesian updating. Examples include the Erlang, Gumbel and Weibull distributions. If there is no simple conjugate formula for updating, the current standard approach is to use simulation to generate an estimate of the posterior distribution. These simulation approaches are based on Markov Chain Monte Carlo (references). Freely available software such as WinBUGS (http://www.mrc-bsu.cam.ac.uk/bugs/) (references) allows the user to input the prior distribution and the relevant data, and then run thousands of iterations of Markov Chain Monte Carlo analysis to create a posterior distribution. This process takes some time, depending on the complexity of the prior and the data.
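A minimal sketch of the lognormal procedure just described, assuming the normal updating formulae of Box 2 part A are applied on the log scale; the numerical values here are illustrative and are not taken from the paper.

    import numpy as np

    def simulate_lognormal_update(mu0, sigma0, sigma_pop, n, rng, n_draws=1000):
        """Update a lognormal parameter by taking logs, applying the normal
        conjugate update, and exponentiating posterior draws."""
        log_mu_true = rng.normal(mu0, sigma0)                        # prior, log scale
        log_x_bar = rng.normal(log_mu_true, sigma_pop / np.sqrt(n))  # simulated data, log scale
        prec0, prec_s = 1 / sigma0**2, n / sigma_pop**2
        mu1 = (prec0 * mu0 + prec_s * log_x_bar) / (prec0 + prec_s)
        sigma1 = np.sqrt(1 / (prec0 + prec_s))
        # convert the normal posterior back to the positive, skewed scale
        return np.exp(rng.normal(mu1, sigma1, size=n_draws))

    rng = np.random.default_rng(1)
    draws = simulate_lognormal_update(mu0=np.log(500), sigma0=0.2, sigma_pop=0.8, n=50, rng=rng)
    print(draws.mean())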
Box 2. Bayesian updating for the Normal, Beta and Gamma distributions

Part A: Normal distribution. Example: utility gain of responders.

    Row  Formula                                                      Sample 1   Sample 2   Sample 3
    (a)  μ0 = prior mean                                              0.3        0.3        0.3
    (b)  σ0 = prior standard deviation                                0.1        0.1        0.1
    (c)  I0 = 1/σ0² = prior precision                                 100        100        100
    (d)  σpop = patient level standard deviation                      0.2        0.2        0.2
    (e)  n = sample size                                              50         50         50
    (f)  μ0sample = random sample from Normal(μ0, σ0²)                0.3425     0.2389     0.1977
    (g)  X̄ = sample mean from Normal(μ0sample, σpop²/n)               0.3545     0.2615     0.1545
    (h)  σX̄² = σpop²/n = variance of the sample mean                  0.0008     0.0008     0.0008
    (i)  IS = 1/σX̄² = precision of the sample mean                    1250       1250       1250
    (j)  μ1 = (I0·μ0 + IS·X̄)/(I0 + IS) = posterior mean               0.3505     0.2644     0.1653
    (k)  σ1 = √(σ0²·σX̄²/(σ0² + σX̄²)) = posterior standard deviation   0.0272     0.0272     0.0272

Part B: Beta distribution. Example: % response rate for drug.

    Row  Formula                                                      Sample 1   Sample 2   Sample 3
    (l)  a0 = prior number of patients who responded                  7          7          7
    (m)  b0 = prior number of patients who did not respond            3          3          3
    (n)  μ0 = a0/(a0+b0) = prior mean                                 70%        70%        70%
    (o)  σ0² = a0·b0/((a0+b0)²·(a0+b0+1)) = prior variance            0.0191     0.0191     0.0191
    (p)  n = sample size                                              50         50         50
    (q)  resp_rate0sample = sample from Beta(a0, b0)                  50.12%     70.63%     90.24%
    (r)  y = sampled responders from Binomial(n, resp_rate0sample)    28         34         43
    (s)  a1 = a0 + y = posterior patients who responded               35         41         50
    (t)  b1 = b0 + n − y = posterior patients who did not respond     25         19         10
    (u)  μ1 = a1/(a1+b1) = posterior mean                             58%        68%        83%
    (v)  σ1² = a1·b1/((a1+b1)²·(a1+b1+1)) = posterior variance        0.0040     0.0035     0.0023

Part C: Gamma distribution. Example: side effects per annum.

    Row  Formula                                                      Sample 1   Sample 2   Sample 3
    (1)  a0 = prior event rate × prior number of patients             1          1          1
    (2)  b0 = 1/(prior number of patients)                            0.25       0.25       0.25
    (3)  μ0 = a0·b0 = prior mean                                      25.00%     25.00%     25.00%
    (4)  σ0² = a0·b0² = prior variance                                0.0625     0.0625     0.0625
    (5)  n = sample size                                              50         50         50
    (6)  sideeffect_rate0sample = sample from Gamma(a0, b0)           10.07%     21.66%     44.11%
    (7)  Σyi, with each yi from Poisson(sideeffect_rate0sample)       7          10         20
    (8)  a1 = a0 + Σyi; b1 = 1/(1/b0 + n)                             8; 0.0185  11; 0.0185 21; 0.0185
    (9)  μ1 = a1·b1 = posterior mean                                  14.81%     20.37%     38.89%
    (10) σ1² = a1·b1² = posterior variance                            0.0027     0.0038     0.0072

Illustrative Model

To test the algorithm and explore the results, a hypothetical cost-effectiveness model was developed comparing two strategies: treatment with drug T0 versus treatment with drug T1. Figure 2 shows the nineteen model parameters, with prior mean values shown for T0 (column a), T1 (column b) and hence the incremental analysis (column c). Costs for each strategy comprise the cost of the drug plus the cost of hospitalisations, i.e. "the percentage of patients receiving the drug who are admitted to hospital" × "days in hospital" × "cost per day in hospital" (e.g. cost of strategy T0 = £1,000 + 10% × 5.20 × £400 = £1,208).
Health benefits are measured as QALYs gained and come from two sources: responders receive a utility improvement for a specified duration, and some patients have side effects with a utility decrement for a specified duration (e.g. QALYs for strategy T0 = 70% responders × 0.3 × 3 years + 25% side effects × −0.1 × 0.5 years = 0.6175). The threshold cost per QALY (λ) is set at £10,000 (i.e. net benefit of T0 = £10,000 × 0.6175 − £1,208 = £4,967), and model results using only central estimates show that T1 (£5,405) has greater net benefit than T0 (£4,967), which means that our baseline decision should be to fund T1.

Insert Figure 2.

The uncertain model parameters are characterised with normal distributions; the standard deviations are shown in columns (d) and (e). Uncertainty in the mean drug cost per patient is very low, but the other cost related uncertainties (% admissions, length of stay, cost per day) are relatively large. For health benefits, uncertainties in the % response rates are relatively large; the uncertainty in the mean utility gain for a responder to T1 is less than that for a responder to T0 (short-term trials of the newer treatment T1 have measured the utility gain more accurately), but the uncertainty in the duration of response for T1 is higher (long-term observational data on T1 are not as complete as for T0). The patient level variability for each parameter is given in columns (f) and (g).

The general algorithm for EVSI calculation was applied to the illustrative model using 1,000 simulations for both the inner and the outer expectations and the Bayesian updating formulae for normal distributions from Box 2. The simulations were performed in EXCEL, sampling from normal distributions using the RAND and NORMINV functions, together with macro programming to loop the process. EVSI calculations were performed for sample sizes of 10, 25, 50, 100 and 200, in addition to EVPI calculations. EVSI was analysed for each individual parameter and also for five different proposed data collection exercises based on subgroups of parameters. The five subgroups were: a) a proposed randomised controlled clinical trial measuring only the response rate parameters (parameters 5, 15); b) an observational study on utility only (parameters 6, 16); c) a trial combined with utility data collection (parameters 5, 6, 15, 16); d) an observational study of the duration of response to therapy (parameters 7, 17); and finally e) a trial combined with a utility study alongside an observational study on duration of response (parameters 5, 6, 7, 15, 16, 17). A sketch reproducing the model's net benefit calculation is given below.
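For readers wishing to reproduce the illustrative model, its central estimate net benefit calculation can be written in a few lines. The sketch below (Python rather than EXCEL; parameter values taken from Figure 2) reproduces the net benefits of £4,967 and £5,405 quoted above.

    # Net benefit of the illustrative model at central estimates (threshold £10,000/QALY)
    LAMBDA = 10_000

    def net_benefit(drug_cost, p_admit, los_days, cost_per_day,
                    p_respond, u_gain, dur_resp, p_se, u_se, dur_se):
        cost = drug_cost + p_admit * los_days * cost_per_day
        qaly = p_respond * u_gain * dur_resp + p_se * u_se * dur_se
        return LAMBDA * qaly - cost

    nb_t0 = net_benefit(1000, 0.10, 5.2, 400, 0.70, 0.30, 3.0, 0.25, -0.10, 0.5)
    nb_t1 = net_benefit(1500, 0.08, 6.1, 400, 0.80, 0.30, 3.0, 0.20, -0.10, 0.5)
    print(round(nb_t0), round(nb_t1))   # 4967 and 5405, as in the text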
Figure 2: Illustrative Model

Part a) Model parameters and uncertainty. Parameters (1)-(10) relate to T0 and (11)-(20) to T1; the cost per day in hospital, (4) and (14), is common to both strategies.

    Parameter (T0 no., T1 no.)                Mean T0   Mean T1   Incr.    SD of mean   SD of mean   Patient level   Patient level
                                                                           T0 (d)       T1 (e)       SD T0 (f)       SD T1 (g)
    Cost of drug (1), (11)                    £1,000    £1,500    £500     1            1            500             500
    % admissions (2), (12)                    10%       8%        −2%      2%           2%           25%             25%
    Days in hospital (3), (13)                5.20      6.10      0.90     2.00         2.00         2.00            2.00
    Cost per day (4), (14)                    £400      £400      £0       200          200          200             200
    % responding (5), (15)                    70%       80%       10%      10%          10%          20%             20%
    Utility change if respond (6), (16)       0.300     0.300     0.000    0.100        0.050        0.200           0.200
    Duration of response, yrs (7), (17)       3.0       3.0       0.0      0.5          1.0          1.0             2.0
    % side effects (8), (18)                  25%       20%       −5%      10%          5%           20%             10%
    Utility change if side effect (9), (19)   −0.10     −0.10     0.00     0.02         0.02         0.10            0.10
    Duration of side effect, yrs (10), (20)   0.50      0.50      0.00     0.20         0.20         0.80            0.80

    Central estimate results (threshold cost per QALY = £10,000)
                        T0        T1        Incremental
    Total cost          £1,208    £1,695    £487
    Total QALYs         0.6175    0.7100    0.0925
    Cost per QALY       £1,956    £2,388    £5,267
    Net benefit         £4,967    £5,405    £437.80

Part b) Model results for overall uncertainty: cost-effectiveness plane from 1,000 samples (incremental QALYs against incremental cost) and cost-effectiveness acceptability curves for T1 versus T0 (probability cost effective against thresholds (MAICER) from £0 to £140,000).

Interpolating the EVSI curve

The expected value of sample information for a sample size of n=50 should clearly be higher than that for a sample size of n=10, and lower than the EVSI for n=100. Thus we expect a smooth curve for the relationship between EVSI and n. The lower bound of the curve, for a sample size of zero, is clearly zero (there is no value to be obtained from no data). The upper bound for the curve is given by the expected value of perfect information (there is no greater value to be obtained than that from an infinite sample, which produces perfect knowledge of the value of the uncertain parameter). Conceptually, we expect the curve to be monotonic (EVSI increases with n) and to show diminishing returns (the additional value obtained from increasing the sample size from 5,050 to 5,100 is lower than the additional value obtained by increasing the sample size from 50 to 100). Given this knowledge, a series of exponential form curves was investigated to determine a common functional form for the relationship between EVSI and the sample size n.

Results

Basic model results

Using the prior mean values for the illustrative model parameters, the results show a cost difference of £487, with strategy T1 the more expensive. Strategy T1 also has better health outcomes, providing 0.0925 more QALYs. The incremental cost per QALY is £5,267, which is below our decision threshold of £10,000 per QALY. When measured on the net benefit scale, T1 provides £5,405 compared with £4,967 for T0 (a difference of £437.80), which means that our baseline decision should be to adopt strategy T1.
Probabilistic sensitivity analysis confirms this, and also shows that T1 provides greater net benefit in 54.5% of samples. So, given our existing uncertainty, there is a high chance (45.5%) that the alternative strategy T0 would provide more net benefit than the current baseline decision T1. This suggests that obtaining more data on the uncertain parameters might help us with our decision. Overall EVPI measures the expected net benefit we would gain if we were to obtain perfectly accurate knowledge of the value of every parameter in the decision model. The overall EVPI for our illustrative model is £1,352 per patient, equivalent to an expected gain of 0.1352 QALYs if perfect knowledge of every parameter were obtained.

EVSI results for individual parameters

The EVSI results (Table 1 part a) show that further data collection on six of the nineteen individual model parameters would potentially be valuable. The uncertain parameter with the greatest impact on the current decision is the duration of response on treatment T1: the partial EVPI for this parameter is £803 and the EVSI for a sample of n=50 is £768. The duration of response for patients on T0 causes slightly less decision uncertainty (partial EVPI £267, EVSI for n=50 £256). The second set of important parameters concerns the utility change if a patient successfully responds to treatment; the utility change on response to T0 is a key uncertain parameter (EVPI £656, EVSI for n=50 £617). The final parameters causing decision uncertainty are the drug response rates, with both T1 and T0 showing EVSI results for n=50 of the order of £200. Each of the other thirteen model parameters had EVSI values lower than £5. For some of the parameters, even small samples would be valuable: some of the EVSI curves (Figure 3) are steep precisely because there is relatively little existing knowledge of the exact values of these important parameters. EVSI for a parameter is often higher if the existing uncertainty is higher, e.g. the standard deviation for the mean utility change on response to T0 is twice as high as that for the utility change on response to T1, and the EVSI results are correspondingly higher. However, large uncertainty does not always result in high EVSI values, because the uncertainty is only relevant if it affects the decision between strategies. For example, the standard deviation for % side effects on T0 is twice as high as that for T1, but both parameters have negligible EVSI results because this uncertainty does not affect the decision.

EVSI results for parameter subgroups

EVSI results for groups of parameters are often more useful and relevant in practice than those for single individual parameters. For example, a long-term observational study might measure the duration of response for both T0 and T1 (two model parameters), or a randomised controlled trial might measure the response rates and utility changes for both drugs (four model parameters). The EVSI results for parameter groups (Table 1 part b) show that the duration parameters remain the most important pairing, with EVPI for the duration of response to both T0 and T1 at £887 and EVSI (n=50) at £856.
A study of utilities for both drugs is slightly less valuable (EVSI for n=50 is £716), whilst a randomised clinical trial measuring the response rates to both drugs is considerably less valuable (EVSI for n=50 is £330). A study measuring all six of the important model parameters has an EVPI of £1,351 and an EVSI for n=50 of £1,319. These results also demonstrate that the EVSI for a combination of parameters is not the simple addition of the EVSIs for the individual parameters, because the parameters interact in the net benefit functions for each strategy. The EVSI curves for the parameter groups are relatively steep (Figure 4), with 90% to 95% of the EVPI provided by a sample of n=50, 98% by a sample of n=100, and 99% by a sample of n=200.

EVSI curve fitting

Investigation of the functional form of the EVSI curves produced a mathematical equation fitting the curves, at least for the illustrative model examined here. The first equations examined followed the forms EVSI(n) = EVPI × exp(−λ/n) and EVSI(n) = EVPI × (1 − 1/(1 + λn)), but neither provided an adequate fit. The functional form which did fit the curves was EVSI(n) = EVPI × (1 − exp(−λ√n)). Given the results for five different sample sizes, the best fit was calculated using the EXCEL Solver to minimise the sum of squared differences between the actual data and the fitted curve on a normalised scale (i.e. dividing each curve by the relevant EVPI so that the maximum value is 1.00). Figure 5 shows the actual data (symbols) and the fitted curve for one parameter, rising from zero and tending towards the EVPI. Figure 6 shows all of the curves, with fitted values of λ ranging from 0.22 to 0.57 (larger λ giving steeper curves). The sum of squared differences error statistics ranged from 0.0004 to 0.0033, showing a consistently good fit in our illustrative model.
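The EXCEL Solver fit described above can be reproduced with any least squares routine. The sketch below uses scipy.optimize.curve_fit on the normalised scale, with the Table 1 part b values for the group of all six main parameters; the recovered λ is close to the 0.4989 reported in Figure 6, and the fitted curve can then interpolate EVSI for sample sizes that were not simulated.

    import numpy as np
    from scipy.optimize import curve_fit

    # EVSI(n) = EVPI * (1 - exp(-lam * sqrt(n))), fitted on the normalised EVSI/EVPI scale,
    # mirroring the Excel Solver least squares fit described in the text.
    def evsi_curve(n, lam):
        return 1.0 - np.exp(-lam * np.sqrt(n))

    n = np.array([10, 25, 50, 100, 200])
    evsi = np.array([1100, 1217, 1277, 1310, 1321])  # all six parameters, Table 1 part b
    evpi = 1350.0

    lam, _ = curve_fit(evsi_curve, n, evsi / evpi, p0=[0.5])
    print(lam[0])                                          # roughly 0.5
    print(evpi * evsi_curve(np.array([75, 150]), lam[0]))  # interpolated EVSI values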
Table 1: EVSI per patient, based on 1,000 × 1,000 Monte Carlo simulations

Part a) EVSI results for individual parameters (£ per patient)

    Parameter                                 n=10   n=25   n=50   n=100  n=200  Perfect (EVPI)
    % responding to T0                        172    194    215    222    223    230
    Utility change if respond to T0           526    588    617    644    644    656
    Duration of response (years) on T0        207    244    256    261    266    267
    % responding to T1                        146    179    183    192    189    197
    Utility change if respond to T1           148    221    250    290    302    320
    Duration of response (years) on T1        654    741    768    777    796    803

Part b) EVSI results for parameter groups (£ per patient)

    Parameter group                           n=10   n=25   n=50   n=100  n=200  Perfect (EVPI)
    % responding to T0 and T1                 297    331    330    355    357    359
    Utility change if respond to T0 and T1    605    669    716    736    769    768
    Duration of response to T0 and T1         690    815    856    855    860    887
    % responding and utility change
      if respond to T0 and T1                 776    853    878    877    907    916
    % responding, utility change and
      duration of response to T0 and T1       1100   1217   1277   1310   1321   1350

Figure 3: EVSI per patient for individual parameters (EVSI against sample size n, up to n=250, for: duration of response T1; utility if respond T0; utility if respond T1; duration of response T0; % responding T0; and % responding T1).

Figure 4: EVSI per patient for parameter groups (EVSI against sample size n for: all six main parameters; trial + utility study (% responding and utility change, T0 and T1); duration study (duration of response to T0 and T1); utility study (utility T0 and T1); and trial (% responding T0 and T1)).

Figure 5: Fitting an exponential functional form to the EVSI curve for the parameter 'utility change if respond to T1': EVSI = EVPI × [1 − exp(−0.2181 × √n)], with EVPI = £320; symbols show the actual EVSI data, the line shows the fitted exponential curve.

Figure 6: Fitted exponential functional forms, EVSI = EVPI × [1 − exp(−λ√n)], for all parameters and groups; symbols show actual EVSI data, lines show the fitted curves. Fitted values of λ: all six main parameters 0.4989; trial + utility study 0.5654; duration study 0.4797; duration of response T1 0.5187; utility study 0.4489; utility change if respond to T0 0.4806; trial study 0.5256; utility change if respond to T1 0.2181; duration of response T0 0.4731; % responding T0 0.4034; % responding T1 0.4301.

Discussion

Summary of the approach

The methodology presented here assesses the potential value of different research designs in terms of the expected health and monetary gain accruing from the research. The approach demands a clear description of the health policy options in a decision model, a quantified characterisation of the uncertain parameters, a description of the proposed data collection exercise, and finally the calculation of the expected value of sample information via the two level Monte Carlo simulation incorporating Bayesian updating. The illustrative model demonstrates the feasibility of the approach. The results presented are typical, showing which research is most valuable to the policy decision and what sample sizes would be large enough to aid the decision maker.
Implications for model construction

Thinking in EVSI terms often has implications for decision model construction and for the characterisation of uncertainty, because considering possible further data collection forces additional rigour in specifying probability distributions. This rigour applies to the specification of confidence intervals on parameters. The simple task of quantifying the sample size n0 implied by these prior confidence intervals can be useful. For example, we specified a standard deviation of 0.1 for the uncertainty in the mean utility change following response to T0. Given our specified patient level standard deviation of 0.2, we can use a simple formula for the normal distribution case [σ0² = σpop²/n0] to work out the sample size implied by our specified prior uncertainty. The result is n0 = 4 for the mean utility change following response to T0 and n0 = 16 for the mean utility change following response to T1, while the equivalent calculation for the costs of the drugs gives n0 = 250,000. This explains the high EVSI for the utility parameters in our model and the negligible EVSI for the costs of the drugs.

The rigour also applies to the mathematical form of the probability distributions. Triangular distributions may sometimes be considered, but as soon as the analyst considers the properties of the individual patient level data, the triangular specification is often seen to be inappropriate. We characterised the uncertainty in the mean response rate to T0 as a normal distribution, which might be reasonable for large sample sizes, since the central limit theorem suggests that uncertainty in a sample mean follows a normal distribution. However, data on an individual patient will measure whether or not there is a response, i.e. a dichotomous yes/no variable, and so a beta distribution is the more correct characterisation of the uncertainty in a mean response rate. Conversely, analysts sometimes consider expressing the uncertainty in a mean utility as a beta distribution, the rationale being that it would lie between 0 and 1 and that many probability distribution shapes are possible. However, the beta distribution implies individual level data that are dichotomous, and individual patient level utility data would produce not just 0s and 1s but individual results anywhere in the range between; a normal distribution may well be the more appropriate characterisation of the uncertainty in a mean utility.

Correlation

EVSI also brings into focus the correlation between uncertain parameters. This can have at least two effects. Firstly, data collected about one parameter may also provide evidence about another parameter. For example, discovering that the utility change for a responder to T0 is actually at the high end of its prior uncertain range may well imply that the utility change for a responder to T1 is also higher than expected. Building correlations into the characterisation of uncertainty would therefore be important. One way to achieve this is to restructure the model so that the utility change on T0 is one uncertain parameter and the difference between the T1 and T0 utility changes is a second uncertain parameter. Another way would be to sample the two parameters from a joint multivariate normal distribution with a specified variance-covariance matrix, incorporating the covariance to reflect our knowledge about the correlation. This is less easy in EXCEL but straightforward in software such as R (a sketch is given below).
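A minimal sketch of this joint sampling approach (in Python rather than R; the correlation of 0.6 is purely illustrative and is not a value estimated in the paper):

    import numpy as np

    rng = np.random.default_rng(1)

    means = np.array([0.30, 0.30])   # mean utility change on response to T0 and T1
    sds = np.array([0.10, 0.05])     # standard deviations of the means (Figure 2)
    corr = 0.6                       # assumed correlation between the two parameters
    cov = np.array([[sds[0]**2, corr * sds[0] * sds[1]],
                    [corr * sds[0] * sds[1], sds[1]**2]])

    u_t0, u_t1 = rng.multivariate_normal(means, cov, size=10_000).T
    print(np.corrcoef(u_t0, u_t1)[0, 1])   # close to the assumed 0.6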
Both of these re-parameterisations would affect the cost-effectiveness plane scatter-plot and the cost-effectiveness acceptability curve for the decision model. The second effect is that the extent of the correlation may itself be an uncertain quantity. Correlation then moves from affecting the shape of the uncertainty in the cost-effectiveness plane scatter-plot to being a parameter within the model itself, upon which EVPI and EVSI calculations can be done: how valuable would policy decision makers consider data on the correlation between variables? This may be particularly useful in the context of epidemiological or observational studies, where the underlying causality of disease or health system processes is being explored. A further extension of the methodology would involve incorporating the existing evidence on related parameters into a WinBUGS based evidence synthesis, to determine an integrated analysis of correlations, variable dependence and probability distributions before the EVSI calculations are performed.

Further methods development and computation time

Some further development of the method would be useful, particularly on how to deal with probability distributions beyond the normal, beta and gamma. Work is needed on other conjugate distributions, to produce additional formulae for Bayesian updating and to examine their usefulness in the context of real decision models. For non-conjugate distributions there are two options. One option for Bayesian updating is to use MCMC methods (refs ?include Jim transplantation work), although this would incur significant additional computation for each sample of the possible collected data. The other option is to use approximation methods, the topic of our current investigations (Ref both Laplace approximation and Weibull MDM abstracts). Work is also needed to improve computational efficiency. In our study of EVPI calculations (ref EVPI paper) we explored the numbers of outer and inner level simulations required to achieve accurate results in the two level algorithm. The same issues apply to EVSI, and work to establish a rule of thumb for the numbers of inner and outer level simulations would be useful. For computationally intensive decision models, e.g. patient level simulation models taking one hour per model run, replacing the model with an emulator or meta-model is valuable (ref Paul Simon and Jim's MS EVI). The emulator produces similar results in a fraction of the time, enabling the second level simulations for EVSI to be undertaken in a reasonable time. A particularly useful approach is the Gaussian process emulator (Ref Jeremy and Matt).

Generalisability

The algorithm and mathematics presented are generalisable to any decision model with any decision rule. The exponential functional form for the EVSI curves is not, however: its shape implies that even small samples of data would be valuable. Consider an instance where a moderate amount of clinical trial data already exists and a further very small trial is planned. Conceptually, one would expect the new very small trial to add little to the policy decision, especially if the trial protocol or setting differs slightly from the earlier study, implying the potential for a biased or skewed sample from the underlying population.
Para 5 – Generalisability

The algorithm and mathematics presented are generalisable to any decision model with any decision rule. The exponential functional form of the EVSI curves, however, is not: its shape implies that even small samples of data would be valuable. Consider an instance where a moderate amount of clinical trial data already exists and a further very small trial is planned. Conceptually, one would expect the new very small trial to add little to the policy decision, especially if the trial protocol or setting is slightly different from the earlier study, implying the potential for a biased or skewed sample from the underlying population. If a larger new trial were planned, of size equal to the original, it might have a substantial impact on the decision, for example through formal meta-analysis of the two studies. If an extremely large trial were planned, say 50 times larger than the original, one would expect the new study to dominate the evidence and be of considerable value. This suggests that EVSI curves may be S shaped in some circumstances, rather than the inverted exponential shown in our study. This conceptual thinking also suggests that it may be necessary to extend EVSI calculation methods to incorporate possible bias, or heterogeneity, between planned trials and those making up the existing prior evidence.

Para 6 – Implications for related fields

How this approach relates to the traditional frequentist approach to calculating sample size for a clinical trial also requires research. The traditional approach (refs ? Machin,?) specifies a clinically significant difference δ, and assesses the probability that such a difference would be found in the trial by pure chance if in fact there were no true difference between treatments. The traditional levels of significance (α) and power (β) are used to determine the sample size n which would demonstrate a true clinically significant difference. There are many aspects of this approach that Bayesian statisticians view as ill-conceived, not least its failure to account for existing prior evidence. However, it is used across the world in commercial and governmental contexts and delivers the benefits of simplicity and usability. It would be useful to compare its sample size results against those of the EVSI approach. There is a link: in the traditional approach, the implied decision rule is that clinicians will adopt a treatment with a proven clinically significant difference δ; in the EVSI approach, the decision model rule is that policy makers will adopt a treatment if its cost-effectiveness ratio is below the threshold λ. Given the enormous expansion of health economics as an influence on the adoption of new treatments, these two modelled decision rules are linked and likely to be coming closer together. There are likely to be mathematical modelling methods for examining the interaction between the two approaches, as well as more pragmatic studies of how decisions are really made and which decision model is closest to real life. In the first instance, comparing the implied sample sizes of the two approaches in the design of clinical trials would aid understanding, as in the sketch below.
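For such a comparison, the traditional sample size is readily computed, for example with the power.t.test function in base R. The difference δ, standard deviation, α and β used here are illustrative assumptions only.

# Traditional frequentist sample size for a two-arm trial, for contrast
# with the EVSI result for the same proposed study.
power.t.test(delta = 0.5, sd = 1.0, sig.level = 0.05, power = 0.80)
# gives n of roughly 64 per arm for these (assumed) inputs

The EVSI approach would instead ask, for each candidate n, whether the expected gain from the two level calculation exceeds the cost of sampling, so the two implied sample sizes can be placed side by side.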
A key advantage of the EVSI approach is that it helps to consider the value of other kinds of data collection, including economic cost and quality of life data, not just clinical efficacy. The traditional approach to these issues is to consider collecting data on resource use, and sometimes quality of life, alongside clinical trials. The illustrative model demonstrates common problems with this: long-term follow-up on disease progression is often more important than another trial, resource use within clinical trials is often the same in the two arms, and quality of life data are necessary.

Para 7 – Importance of this approach

The EVSI calculations shown here take a societal perspective on the value of the research and are therefore directly relevant to governmental agencies funding health technology assessments. They are also of value to commercial investors in research, because they prefigure the likely cost-effectiveness and uncertainty analysis which governmental reimbursement authorities in many countries use to assess pharmaceuticals. Commercial companies might therefore use this approach to predict the likely response of governmental agencies to the evidence gaps in a reimbursement analysis, and hence commission studies to supplement their existing product evidence earlier in the process. Finally, of course, the decision rule used for adoption can be re-examined in terms of commercial incentives, and models of the sales, profit and payback on commercial research can be incorporated into the framework, in order to make explicit decisions within the R&D cycle of product development on the value of further data collection. (? Ref Neil Hawkins or papers from FF 6).

Para 8 – Conclusion and importance of this paper

In summary, this paper presents a clear methodology for assessing the value of different research designs in terms of the expected health and monetary gain accruing from the research. The mathematics and algorithm are generalisable to many situations, and the illustrative model demonstrates the feasibility of the approach. It is hoped that the approach will now be implemented in many more health economic evaluations, as a relatively simple extension of probabilistic sensitivity analysis. This, in turn, should lead to a more coherent analysis of the prioritisation and design of health technology assessment studies.

References

Eddy DM. Selecting technologies for assessment. Int J Technol Assess Health Care 1989; 5: 485-501.

Townsend J, Buxton M. Cost effectiveness scenario analysis for a proposed trial of hormone replacement therapy. Health Policy 1997; 39: 181-94.

Davies L, Drummond M, Papanikoloau P. Prioritising investments in Health Technology Assessment: can we assess the potential value for money? Discussion Papers. York: University of York Centre for Health Economics; 1999. 170.

Detsky AS. Using cost-effectiveness analysis to improve the efficiency of allocating funds to clinical trials. Stat Med 1990; 9: 173-84.

Drummond MF, Davies LM, Ferris FL III. Assessing the costs and benefits of medical research: the diabetic retinopathy study. Soc Sci Med 1992; 34: 973-81.

Buxton M, Elliott R, Hanney S, Henkel M, Keen J, Sculpher M, Youll P. Assessing payback from Department of Health research and development: preliminary report, Vol. 2, Eight case studies. Uxbridge: Health Economics Research Group, Brunel University; 1994. 19 Part 2.

Claxton K, Posnett J. An economic approach to clinical trial design and research priority-setting. Health Econ 1996; 5: 513-24.

Claxton K, Neumann PJ, Araki S, Weinstein MC. Bayesian value-of-information analysis. An application to a policy model of Alzheimer's disease. Int J Technol Assess Health Care 2001; 17: 38-55.