Detailed Comments from NIAAA on the OECD draft report on Harmful Use of Alcohol, received December 2014

The draft OECD report on Harmful Alcohol Use represents an important step forward in integrating data and evidence from multiple sources in a structured analytic framework to inform quantitative estimates of the potential effects of a range of policy interventions on multiple adverse outcomes associated with alcohol consumption. The microsimulation modeling approach that forms the core of the report’s analytic contribution is innovative in the context of research on preventing alcohol-related problems. The scope of the analysis is quite broad, predicting effects of 9 distinct policies plus three bundles representing different subsets of these 9 policies, all for three separate countries and over a 40-year period. Broadly, the findings of the report lend further support for the policy approaches examined as potential tools for reducing alcohol-related harm, mainly as a result of empirical support that appears in the extant literature.

The comments that follow focus primarily on the modeling and empirical issues that are the focus of Chapter 5, including the cost and cost-effectiveness findings reported there. Additional comments address issues that rely on or pertain to economic theory and methods that appear throughout the report. A key conclusion regarding the modeling is that substantial additional reporting, including some that may require further analyses, is needed to document the credibility, stability, and reliability of the modeling framework, its empirical implementation, and its results.

Simulation models of dynamic processes are particularly useful in identifying effects of interventions when the processes that lead to outcomes of interest include feedback loops or other complex interactions, or when there are multiple outcomes that are influenced by multiple factors.
The latter appears to be the primary issue addressed in the CDP-Alcohol model, with two primary factors (alcohol consumption quantities and patterns) influencing incidence rates for 10 categories of health outcomes. As shown in the diagrammatic representation of the basic structure in Figure A, the model’s logic does not incorporate significant feedback loops but focuses on the direct chain of causal linkages from alcohol consumption (both levels and patterns) to health outcomes. This framework does not incorporate a high degree of complexity, but it does admit significant heterogeneity across individuals in terms of consumption levels, patterns, health consequences, and policy effects. Such heterogeneity is a key point of interest from both epidemiologic and policy perspectives. One limitation of the report’s analyses is that they do not exploit this heterogeneity nearly as much as one might hope.

The value of a model is limited by factors that impinge on its credibility, including problems of transparency, calibration and validation, and stability or robustness. The presentation of the OECD model in this report suffers from these kinds of limitations to a significant degree. In addition, the report presents findings on the cost-effectiveness of several of the interventions examined. However, there are significant concerns with the data and methods used to assess cost-effectiveness.

Transparency and verification

The modeling approach is described in Box 1 as “…a Montecarlo semi-Markov discrete-event microsimulation model with a dynamic population … structured in discrete time and state transitions occur in yearly cycles.” The details of how these characteristics are manifest in the information flows that comprise the model are not described in the report or its appendices. This lack of transparency creates a fundamental obstacle to understanding how the reported estimates were generated.
The OECD report generally treats the model as a magic box, into which selected information from identified studies is added and from which – somehow, through the magic of the model – estimates of the various outcomes emerge. Little information is provided on the structure or inner workings of the model. The characterization cited above and the diagrammatic representation in Figure A of the “Basic Structure of the CDP-Alcohol model” are much too limited to determine whether the approach to modeling the various factors that contribute to the outcomes of interest is internally consistent and incorporates appropriate structural relationships, measurement, and levels of detail. The process of assessing the internal validity and credibility of the model structure is known as verification of the model, and information on verification of the CDP-Alcohol model is severely lacking in the report. Having constructed the model, the authors have enough information to provide such technical details as the flow of information across modules and the functions that define the probabilities of various outcomes as they depend on variables such as drinking quantity and pattern, treatment history, and demographics. Giving informed credence to model results depends in part on the ability to understand the model structure, and that ability is precluded by the lack of detailed information provided in the report.

Calibration

Another challenge for modelers is to calibrate a model so that it provides credible results, and to examine those results critically to determine whether the model provides a valid representation of the quantities it aims to describe. Models often involve a large number of parameters, only some of which are susceptible to direct estimation or for which useful estimates can be gleaned from the literature. The number of parameters required, and the functional structures into which they are incorporated, are not described.
In particular, the parameters in the baseline model are not described at all. The report provides some specific information on how certain key parameters associated with particular policies (e.g., price elasticities for estimating the effects of price-based policies) have been selected from the literature, but not enough to understand how those parameters figure into the overall determination of outcomes. A clue to the level of detail that goes into the structure of such a model is provided in the pithy description of simulating alcohol consumption at the individual and population levels that appears in Box 5.1:

“…Throughout the simulation, individuals may change their drinking behaviour and, therefore, their risk following country-specific algorithms derived from national survey data. In particular, alcohol consumption in a population is modelled to follow a negative binomial distribution (a zero-inflated negative binomial model was used for Canada), based on a gamma-Poisson mixture. In particular, the number of weekly drinks for each individual is determined on the basis of a Poisson distribution, in which the lambda parameter is a random function of the age, sex and socioeconomic status of each individual, following a gamma distribution. The current model does not distinguish between different types of alcohol (e.g. beer, wine, spirits, etc.) and places of consumption (e.g. on- and off-premises).” (Chapter 5, page 5)

As this passage shows, individual alcohol consumption during any given period (year) is modeled as a draw from a probability distribution that embeds an additional stochastic function of multiple characteristics, and these individual consumption levels are constrained to aggregate to a population-level distribution of a specific functional form. This specification is not problematic (in fact, it seems reasonable), but it is still quite opaque and therefore difficult to assess.
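The gamma-Poisson structure quoted from Box 5.1 can be made concrete with a short Python sketch. This is an illustration under stated assumptions, not the CDP-Alcohol implementation: the function names, the mean of 6 weekly drinks, and the gamma shape of 1.5 are hypothetical values chosen only for illustration, and the dependence of lambda on age, sex and SES (which the report does not specify) is collapsed into a single subgroup mean.

```python
import math
import random
import statistics


def poisson_draw(rng, lam):
    """Draw one Poisson(lam) count using Knuth's multiplication method."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def simulate_weekly_drinks(mean_drinks, shape, n_people, seed=0):
    """Simulate weekly drink counts from a gamma-Poisson mixture.

    Each person's rate lambda is drawn from a gamma distribution with the
    given shape and a scale chosen so that E[lambda] = mean_drinks; the
    count is then Poisson(lambda). Marginally the counts are negative
    binomial with variance mean_drinks + mean_drinks**2 / shape.
    """
    rng = random.Random(seed)
    scale = mean_drinks / shape
    return [poisson_draw(rng, rng.gammavariate(shape, scale))
            for _ in range(n_people)]


# Hypothetical subgroup: mean of 6 drinks per week, gamma shape 1.5.
drinks = simulate_weekly_drinks(mean_drinks=6.0, shape=1.5, n_people=50_000)
print("simulated mean:    ", round(statistics.fmean(drinks), 2))
print("simulated variance:", round(statistics.pvariance(drinks), 2))
print("implied variance:  ", 6.0 + 6.0**2 / 1.5)
```

Even in this reduced form, two parameters per subgroup must be chosen. A documented model would report the fitted equivalents of `mean_drinks` and `shape` for each age, sex and SES subgroup, so that simulated and observed consumption distributions could be compared.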
This specification is also of considerable importance, as drinking levels and patterns are the key drivers for the health outcomes examined. Unfortunately, the foregoing passage is by far the most detailed description of any specific aspect of the model contained anywhere in the report. Specifically, one may wonder, how are these functions parameterized? How, in a functional sense, does the lambda parameter depend on age, sex, and SES, and so how do the resulting simulated consumption levels vary across these dimensions? How have the values for these (and the dozens, or perhaps hundreds, of other parameters involved in the larger model) been determined? More importantly, how well do the resulting simulated distributions of consumption conform to information on actual consumption distributions in the populations being modeled, either in total or in the dimensions incorporated in the simulation (age, sex, SES)?

Validation

This last point raises the issue of model validation – assessing how well the model actually represents what it purports to represent. This is a challenging issue confronting modelers, and there is no cookbook for assuring model validity, nor is there any summary statistic or other standard measure for assessing whether a model does or does not meet the validity criterion. One standard approach is to run the model against historical data to assess the degree to which it can replicate known outcomes. Comparing the simulated dynamic trajectories of multiple model outputs to the corresponding trajectories in the historical data provides a basis for assessing the model’s validity. Such examinations are particularly needed for simulations that may be projected quite far into the future (40 years, in this case). Unfortunately, this report provides no indication that the modelers undertook this kind of validation effort at all. To the contrary, by selecting a baseline of 2010, they precluded any meaningful comparison of model outputs with historical data.
Had they instead selected a baseline of, say, 1990 or earlier, at least 20 years of historical data would have been available for fine-tuning model parameters and for assessing the model’s validity in forecasting effects of policies and other factors on outcomes of interest. There are additional strategies for assessing model validity that vary with the specific contexts and quantities the model aims to represent. The need for model validation arises precisely because the numerous and complex inter-relationships that simulation models can incorporate are not susceptible to testing by standard statistical methods. The OECD report fails to provide adequate information on steps taken to assess the validity of the CDP-Alcohol model, and this failure impinges on the credibility that can be attached to the findings.

Robustness and sensitivity analyses

The final section of Chapter 5 considers the robustness of model findings and discusses analyses of the sensitivity of results to parameter values and assumptions. However, the discussion of the methods used is too opaque to allow a reader to endorse the report’s conclusion that the model findings are indeed highly robust. Of particular concern, the entire sensitivity analysis effort is apparently conducted using the WHO software MCLeague, which is not described in the report. (On the WHO web site, it is described as “…a software program that represents uncertainty around costs and effects to decision-makers in the form of stochastic league tables. It provides additional information beyond that offered by the traditional treatment of uncertainty in cost-effectiveness analysis, presenting the probability that each intervention is included in the optimal intervention mix for given levels of resource availability.”) It appears that this software is geared specifically toward the cost-effectiveness aspects of the analysis, about which several concerns are identified below.
However, for complex simulation models, it is necessary to examine the implications of uncertainty about specific parameter values through numerous model runs that vary parameters individually and in groups to determine if any of the results are highly sensitive to specific parameter values. It is not clear that this kind of sensitivity analysis has been done; if so, it should be reported. Further strategies for assessing the robustness of a complex simulation model include consideration of alternate functional forms for key model processes to determine if results may be artifacts of arbitrary modeling decisions rather than of the underlying processes being modeled, and assessment of the variation attributable to stochastic elements embedded in structural components of the model. Again, the report does not indicate that these kinds of assessments have been conducted.

A pervasive issue with the presentation is that results obtained in the CDP-Alcohol model simulations are reported as unitary point estimates. As noted in Box A, “However, since the degree of stochastic variation in the results generated by the model is large compared with the size of the effects produced by the policies examined, a high number of simulations were run for each country (more for countries with a larger population, i.e. 5 000 for the Czech Republic; 10 000 for Canada, and 20 000 for Germany), in order to increase the degree of confidence around the results estimated.” Thus, it appears that the reported results are the mean values obtained from these large numbers of model runs.
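The relationship between the number of runs and the apparent precision of a mean can be illustrated with a small Python sketch. It is not the CDP-Alcohol model: each "run" here is simply a draw from a normal distribution with a hypothetical true policy effect of 1.0 and a hypothetical run-to-run standard deviation of 20.0, standing in for a model whose stochastic variation is large relative to the effect. The function name and all numeric values are assumptions made for illustration.

```python
import math
import random
import statistics


def mean_and_standard_error(n_runs, true_effect=1.0, run_sd=20.0, seed=0):
    """Average n_runs noisy simulated outcomes and return (mean, standard error)."""
    rng = random.Random(seed)
    outcomes = [rng.gauss(true_effect, run_sd) for _ in range(n_runs)]
    mean = statistics.fmean(outcomes)
    # Standard error of the mean: sample standard deviation / sqrt(number of runs).
    standard_error = statistics.stdev(outcomes) / math.sqrt(n_runs)
    return mean, standard_error


# Run counts matching those quoted from Box A for the three countries.
for n_runs in (5_000, 10_000, 20_000):
    mean, standard_error = mean_and_standard_error(n_runs)
    print(f"{n_runs:>6} runs: mean = {mean:.3f}, standard error = {standard_error:.3f}")
```

The standard error falls roughly with the square root of the number of runs, so quadrupling the runs from 5 000 to 20 000 about halves it. This reflects only the stochastic noise across runs, not any uncertainty in the model's parameters or structure.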
The degree of variation in the results is of some interest and should be reported. It should be noted, however, that confidence intervals around such estimates do not admit the standard interpretation, as precision may be increased to any desired degree simply by running the model more times until the desired degree of statistical significance is achieved (this consideration affects interpretation of the significance levels shown in Figure 5.2).

Another manifestation of the concern with oversimplified and condensed results is the reporting of aggregate values for changes over the 40-year simulation period, as in the key results shown in Figure 5.1. Changes in prevalence rates are shown for the entire 40-year simulation period for each policy in each country (i.e., difference in prevalence rates in year 40 vs. year 0), but no indication is provided of how these changes may be distributed over time (i.e., trajectories) or across population subgroups (as could be explored based on the heterogeneity incorporated in the model). The problems identified above are inherent here as well, as there is no description of how incidence risks for simulated individuals are estimated as a function of their alcohol consumption histories (quantity and pattern). The results are shown for three categories of drinkers (hazardous/harmful drinkers; heavy episodic drinkers; dependent drinkers) for which the category definitions are not provided and that do not precisely align with the three categories identified in the model description provided in Box A (regular drinkers, binge drinkers, dependent drinkers).

Costs and cost-effectiveness analysis

The report presents per-capita costs for implementing each policy examined as a single number for each of the 3 countries (Table A.7). The derivation of these estimates is not described except to refer to the WHO-CHOICE methods, for which the online descriptions also do not provide relevant details.
Many factors could affect the calculation of these costs, and the simple reporting of numbers in a table without details on their provenance is not good scientific practice. It is not clear, for example, what the components of implementation costs are, and the data sources and methods for estimating those (unidentified) cost components are not described or explained. One suspects that the implementation costs are limited to direct expenditures – perhaps even limited to public-sector expenditures – involved in establishing and executing the policies, which is not consistent with standard CEA methods, as it may omit other resource costs that are attributable to the policies. For example, expanding brief interventions in healthcare settings may involve significant inputs of patients’ time and resources that are not fully accounted for. As another example, tax interventions generally reduce the real purchasing power of consumers by an amount larger than the revenue transferred to governments (this is the “deadweight loss” associated with a commodity tax), and may also cause changes in sectoral employment and economic activity, sometimes including shifts between domestic production and imports. Such effects may represent genuine costs associated with these policies that are overlooked by considering only the expenditures required to establish and maintain the policies.

The OECD report’s approach to cost-effectiveness analysis (CEA) appears to depart from standard methods (see note 1) used in U.S. evaluations of health-related interventions in at least two important ways. First, as noted above, it omits some potentially important elements of cost that should be included. Reporting only the implementation costs systematically and perhaps seriously understates the true costs associated with the policies examined. Second, the cost-effectiveness analysis is presented as a comparison between implementation costs on one hand and reductions in health care expenditures on the other.
This is distinctly at odds with the standard cost-effectiveness ratio, in which the numerator is the net present value of all changes in resource use resulting from the policy, and the denominator is a (non-monetary) measure of change in health status. The preferred metric for health status change is the QALY (quality-adjusted life year), which facilitates comparisons across alternative interventions (although it has some well-documented limitations). The OECD’s preference for DALYs (disability-adjusted life years, which bear a greater conceptual distinction from QALYs than the similarity in their names suggests) as a synthetic measure of population health outcomes could be accommodated in a quite straightforward manner. However, treating changes in health care expenditures as the outcome and examining these in comparison to policy implementation costs (whatever those are deemed to comprise, but which apparently include some elements of health care expenditures as noted in the first bullet in section 5.8, page 30) is certainly a non-standard framework for reporting CEA results.

Note 1. The standard U.S. reference for cost-effectiveness analysis in the context of health-related interventions is Gold, M. R., Siegel, J. E., Russell, L. B., & Weinstein, M. (1996). Cost-effectiveness in health and medicine: report of the panel on cost-effectiveness in health and medicine. New York: Oxford Univ Press. This monograph was authored by a group of 13 non-government scholars convened by the U.S. Public Health Service. The OECD report does not cite this reference but does cite a WHO report from 2003 that bears some similarities to the Gold et al. effort.

Additional comments based on the entire report

The report is generally well-written and represents an admirable effort to provide balanced discussion of a number of challenging issues relating to alcohol consumption and specific policy interventions. It gives worthwhile attention to balancing the harms and benefits of alcohol consumption, including health effects but also the utility associated with consumption, as with any other consumer good. The report would benefit from acknowledgement that its perspective is generally that of economics. It would also benefit from a clear statement of purpose, identifying how it contributes to the literature and to general understanding of the issues it addresses.

The “goal of alcohol policy” is identified differently at various points throughout the report. For example, the first paragraph of section 1.4 provides a relatively complicated statement: “The goal of alcohol policies should be to reduce welfare losses to a minimum and preserve people’s freedom to consume alcoholic beverages when these are indeed a source of pleasure, provided no harms are caused to others as a result, and provided that drinkers are able to make a fully informed choice, based on a complete knowledge of the consequences of drinking alcohol.” Later in the same chapter the report says, “…the goal of alcohol policies is to curb the harmful use of alcohol….” (page 14), which is echoed in the first bullet in the introduction to Chapter 4. However, the first sentence of section 4.1 contradicts this, saying, “The goal of alcohol policies is typically to reduce alcohol-related harms.” Still later in Chapter 4 we are told, “One of the main goals of alcohol policies is to promote public health and social wellbeing” (page 6). There are real differences among these stated goals (and there may be additional variations represented in the report). In fact, the goals of policy makers may not correspond to any of these intentions, but may instead be focused on other objectives, such as increasing public revenues, or benefiting the economic interests of one constituency over others.
It may be simpler and more accurate to identify the goal of the report as evaluating the effects of alcohol policies in reducing alcohol-related harms through reductions in alcohol consumption – regardless of how that objective might figure in the actual motivations for policy-makers to adopt such measures.

Chapter 1, Page 5, paragraph 3: The large rise in rank of alcohol consumption as a source of global death and disability (from eighth to fifth) in just ten years demands some explanation. Did prevalence or severity of other major sources of health problems decline over the period? Did prevalence or severity of alcohol problems actually increase?

Chapter 1, pages 9-10: The discussion of “Individual choice and social impacts” needs to be buttressed with a discussion of causal linkages between alcohol consumption and the harms identified, acknowledging that not every harm associated with alcohol consumption is caused by that consumption, yet clarifying that strong evidence supports a causal role in a large fraction of these harmful outcomes.

Chapter 1, page 12: The discussion of rational addiction necessarily oversimplifies the economic theory of rational addiction, which most readers of this report will find unfamiliar, confusing, and hard to accept. Discussion of this conceptually challenging theoretical framework could safely be omitted from the report, but if it is retained, citation to the primary source material (Becker and Murphy, 1988) should be included. The discussion also underplays the significance of alcohol dependence as a major public health problem (e.g., consider the disability weights that have been attached to alcohol dependence in DALY estimations).

Chapter 1, page 14, “Policy responses” section: This section begins with a tendentious framing of “the question that policy makers have to answer” that ignores the possibility of strategies to reduce the harms directly and focuses exclusively on curbing alcohol use. The question of broad-based vs.
targeted prevention strategies is presented as essentially either/or, despite the promise that the book will examine combinations of policies that may accomplish a balancing of these considerations, at least in terms of net public health benefits (i.e., without necessarily addressing other non-health concerns that policy-makers must also consider). The section continues (page 15) with a brief discussion of minimum pricing that seems poorly thought out from an economic perspective. The claimed welfare loss borne by consumers due to minimum prices is transferred to sellers, for little net change in social surplus relative to a tax-driven price increase. Focusing on this welfare loss also neglects the presumed beneficial effects on health and other adverse alcohol-related outcomes that motivate the policies in the first place. Relative to a taxation strategy that generates public-sector revenues, minimum pricing reallocates the social surplus that arises from voluntary exchange from the public to the private sector, but it is not clear a priori whether the overall surplus would be the same, larger, or smaller under the minimum-pricing regime. The problematic framing from this discussion is exacerbated in the third of the “Key findings” bullets at the beginning of Chapter 4, which simply asserts “a welfare loss” without clarifying that most of the loss to consumers is transferred to sellers.

Chapter 2, page 4: Figure 2.2 raises important questions that are not resolved, and the accompanying text focuses on youth drinking, which is not depicted in the Figure, and on disclaiming the value of these kinds of aggregate data. If the information in Figure 2.2 is not central enough to deserve elaboration in the text, it should be omitted or relegated to an appendix. If it is retained, the text should address issues such as: What explains the large-scale changes reported in the 20-year span?
How much of this temporal variation is likely to be due to changes in data reporting and collection? Most important, can the changes reported be reconciled with the more disaggregated data reported in Figs 2.5-2.8? Discrepancies appear especially pronounced for France, Italy, and Ireland. The magnitude of changes reported appears potentially alarming, especially in countries not closely examined in this report (e.g., Russia, India, China, Brazil account for over 40% of world population and all show increases of 40% to 57% in per capita alcohol consumption over 20 years).

Chapter 4, pages 3-4: The discussion of how various categories of drinkers would benefit from reducing their alcohol consumption appears at odds with the standard welfare economics framework that seems to underlie much of this discussion. That framework attaches full weight to consumers’ own ex ante judgments regarding the utility they derive from their preferred consumption levels. Under that framework, well-informed, non-addicted, adult consumers would not be judged better off by policies, such as taxes, that induce them to reduce their consumption, but in fact would be harmed by such policies. Ex post avoidance of adverse effects that otherwise would be borne by knowledgeable sovereign consumers generally does not justify public intervention in the utility-maximizing consumption decisions of sovereign consumers under a standard welfare economics framework.

Chapter 4, pages 4-5: The exclusive focus on price policies in this section is not signaled, explained, or justified. No general framework for evaluating such policies is identified, but the discussion raises a number of issues that appear to serve as a piecemeal set of evaluation criteria. The presentation sounds defensive, summoning a range of half-arguments to justify such policies.
The discussion of the distribution of welfare losses resulting from a specific tax increase is imprecise enough to be objectionable to economists and terse enough to be completely opaque to most non-economists. Some better explanation is called for, along with a reference to a standard text on public economics.

Chapter 4, page 5, first paragraph: It is worth noting that the adverse health consequences of alcohol consumption may also be distributed in a manner that disadvantages those at the lower end of the income and wealth distributions. Research has not established the relationship between the distribution of health benefits and the net incidence of alcohol taxes.

Chapter 4, page 5, second paragraph: “Cross-border trade,” as it is generally understood, is not primarily a form of tax avoidance; most internationally traded alcoholic beverages are subject to taxation in one or both countries involved. Illicit cross-border trade, i.e., smuggling, tends to flow from lower-tax to higher-tax jurisdictions, thereby offsetting some potential effects of higher taxation in terms of both health and revenue.

Chapter 4, page 5, bullets under “Key messages….” First bullet – broader measures of health outcomes are useful for evaluating both health benefits and adverse health consequences of alcohol consumption, not just benefits. Second bullet – this assertion is tendentious and betrays a primary focus on reducing drinking rather than on improving health outcomes. This point also does not emerge from the foregoing section, which these few bullets claim to summarize. Third bullet – this claim is not firmly substantiated in the exposition to this point. It is also worth noting that effects on consumption among more moderate drinkers can have large effects on population health outcomes, reflecting the so-called “prevention paradox”.
Chapter 4, page 16, last paragraph: The selection of Miron et al., 2009 as the sole empirical study of MLDA in the United States to be cited in this section provides a highly skewed view of the evidence. A more comprehensive examination of MLDA effects is reported in Carpenter, C., & Dobkin, C. (2011). The minimum legal drinking age and public health. The Journal of Economic Perspectives, 25(2), 133-156.

Chapter 4, page 18: The assertion that regulation of outlet density and/or opening hours is “more strongly supported by evidence” than minimum legal drinking age restrictions seems at odds with the literature from the United States, and the few citations provided do not dispel this impression.