Detailed Comments from NIAAA on the OECD draft report on Harmful Use of Alcohol,
received December 2014
The draft OECD report on Harmful Alcohol Use represents an important step forward in
integrating data and evidence from multiple sources in a structured analytic framework to inform
quantitative estimates of the potential effects of a range of policy interventions on multiple
adverse outcomes associated with alcohol consumption. The microsimulation modeling approach
that forms the core of the report’s analytic contribution is innovative in the context of research on
preventing alcohol-related problems. The scope of the analysis is quite broad, predicting effects
of nine distinct policies plus three bundles representing different subsets of these nine policies, all for
three separate countries and over a 40-year period. Broadly, the findings of the report lend
further support for the policy approaches examined as potential tools for reducing alcohol-related
harm, mainly as a result of empirical support that appears in the extant literature.
The comments that follow concentrate primarily on the modeling and empirical issues that are the
focus of Chapter 5, including the cost and cost-effectiveness findings reported there. Additional
comments address issues that rely on or pertain to economic theory and methods that appear
throughout the report. A key conclusion regarding the modeling is that substantial additional
reporting, including some that may require further analyses, is needed to document the
credibility, stability, and reliability of the modeling framework, its empirical implementation,
and its results.
Simulation models of dynamic processes are particularly useful in identifying effects of
interventions when the processes that lead to outcomes of interest include feedback loops or
other complex interactions, or when there are multiple outcomes that are influenced by multiple
factors. The latter appears to be the primary issue addressed in the CDP-Alcohol model, with two
primary factors (alcohol consumption quantities and patterns) influencing incidence rates for 10
categories of health outcomes. As shown in the diagrammatic representation of the basic
structure in Figure A, the model’s logic does not incorporate significant feedback loops but
focuses on the direct chain of causal linkages from alcohol consumption (both levels and
patterns) to health outcomes. This framework does not incorporate a high degree of complexity,
but it does admit significant heterogeneity across individuals in terms of consumption levels,
patterns, health consequences, and policy effects. Such heterogeneity is a key point of interest
from both epidemiologic and policy perspectives. One limitation of the report’s analyses is that
they do not exploit this heterogeneity nearly as much as one might hope.
The value of a model is limited by factors that impinge on its credibility, including problems of
transparency, calibration and validation, and stability or robustness. The presentation of the
OECD model in this report suffers from these kinds of limitations to a significant degree. In
addition, the report presents findings on the cost-effectiveness of several of the interventions
examined. However, there are significant concerns with the data and methods used to assess
cost-effectiveness.
Transparency and verification
The modeling approach is described in Box 1 as “…a Montecarlo semi-Markov discrete-event
microsimulation model with a dynamic population … structured in discrete time and state
transitions occur in yearly cycles.” The details of how these characteristics are manifest in the
information flows that comprise the model are not described in the report or its appendices. This
lack of transparency creates a fundamental obstacle to understanding how the reported estimates
were generated. The OECD report generally treats the model as a magic box, into which selected
information from identified studies is added and from which – somehow, through the magic of
the model – estimates of the various outcomes emerge. Little information is provided on the
structure or inner workings of the model. The characterization cited above and the diagrammatic
representation in Figure A of the “Basic Structure of the CDP-Alcohol model” are much too
limited to determine whether the approach to modeling the various factors that contribute to the
outcomes of interest is internally consistent and incorporates appropriate structural relationships,
measurement, and levels of detail.
The process of assessing the internal validity and credibility of the model structure is known as
verification of the model, and information on verification of the CDP-Alcohol model is severely
lacking in the report. Having constructed the model, the authors have enough information to
provide such technical details as the flow of information across modules and the functions that
define the probabilities of various outcomes as they depend on variables such as drinking
quantity and pattern, treatment history, and demographics. Giving informed credence to model
results depends in part on the ability to understand the model structure, and that ability is
precluded by the lack of detailed information provided in the report.
Calibration
Another challenge for modelers is to calibrate a model so that it provides credible results, and to
examine those results critically to determine whether the model provides a valid representation
of the quantities it aims to describe. Models often involve a large number of parameters, only
some of which are susceptible to direct estimation or for which useful estimates can be gleaned
from the literature. The number of parameters required, and the functional structures into which
they are incorporated, are not described. In particular, the parameters in the baseline model are
not described at all. The report provides some specific information on how certain key
parameters associated with particular policies (e.g., price elasticities for estimating the effects of
price-based policies) have been selected from the literature, but not enough to understand how
those parameters figure into the overall determination of outcomes. A clue to the level of detail
that goes into the structure of such a model is provided in the pithy description of simulating
alcohol consumption at the individual and population levels that appears in Box 5.1:
…Throughout the simulation, individuals may change their drinking behaviour and,
therefore, their risk following country-specific algorithms derived from national survey
data. In particular, alcohol consumption in a population is modelled to follow a negative
binomial distribution (a zero-inflated negative binomial model was used for Canada),
based on a gamma-Poisson mixture. In particular, the number of weekly drinks for each
individual is determined on the basis of a Poisson distribution, in which the lambda
parameter is a random function of the age, sex and socioeconomic status of each
individual, following a gamma distribution. The current model does not distinguish
between different types of alcohol (e.g. beer, wine, spirits, etc.) and places of
consumption (e.g. on- and off-premises). (Chapter 5, page 5)
As this passage shows, individual alcohol consumption during any given period (year) is
modeled as a draw from a probability distribution that embeds an additional stochastic function
of multiple characteristics, and these individual consumption levels are constrained to aggregate
to a population-level distribution of a specific functional form. This specification is not
problematic (in fact, it seems reasonable), but it is still quite opaque and therefore difficult to
assess. It also is of considerable importance as drinking levels and patterns are the key drivers for
the health outcomes examined. Unfortunately, the foregoing passage is by far the most detailed
description of any specific aspect of the model contained anywhere in the report. Specifically,
one may wonder, how are these functions parameterized? How, in a functional sense, does the
lambda parameter depend on age, sex, and SES, and so how do the resulting simulated
consumption levels vary across these dimensions? How have the values for these (and the
dozens, or perhaps hundreds, of other parameters involved in the larger model) been determined?
More importantly, how well do the resulting simulated distributions of consumption conform to
information on actual consumption distributions in the populations being modeled, either in total
or in the dimensions incorporated in the simulation (age, sex, SES)?
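The questions above can be made concrete with a minimal sketch of the gamma-Poisson mixture that Box 5.1 describes. The mean and shape values, the function names, and the restriction to a single demographic group are illustrative assumptions; none is taken from the CDP-Alcohol model. The sketch shows the kind of check the report omits: for a gamma-Poisson mixture with mean mu and shape k, the simulated weekly drink counts should have mean mu and variance mu + mu^2/k, the moments of a negative binomial distribution.

```python
import math
import random
import statistics


def draw_poisson(lam, rng):
    """Draw one Poisson(lam) count using Knuth's multiplication method."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def simulate_weekly_drinks(mean, shape, n_people, seed=0):
    """Gamma-Poisson mixture: each person's lambda is a gamma draw with the
    given mean and shape, and that person's weekly drinks are Poisson(lambda)."""
    rng = random.Random(seed)
    scale = mean / shape  # gamma mean = shape * scale
    return [draw_poisson(rng.gammavariate(shape, scale), rng)
            for _ in range(n_people)]


# Hypothetical values for one age/sex/SES group (not from the report).
MEAN_DRINKS, SHAPE = 6.0, 1.5
drinks = simulate_weekly_drinks(MEAN_DRINKS, SHAPE, n_people=50_000)

sim_mean = statistics.fmean(drinks)
sim_var = statistics.pvariance(drinks)
nb_var = MEAN_DRINKS + MEAN_DRINKS ** 2 / SHAPE  # negative binomial variance

print(f"simulated mean {sim_mean:.2f} vs target {MEAN_DRINKS:.2f}")
print(f"simulated variance {sim_var:.2f} vs negative binomial {nb_var:.2f}")
```

In a documented model, the same comparison would be repeated for each age, sex, and SES cell against the national survey data from which the parameters were derived.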
Validation
This last point raises the issue of model validation – assessing how well the model actually
represents what it purports to represent. This is a challenging issue confronting modelers:
there is no cookbook for assuring model validity, nor is there any summary statistic or other
standard measure for assessing whether a model does or does not meet the validity criterion. One
standard approach is to run the model against historical data to assess the degree to which it can
replicate known outcomes. Comparing the simulated dynamic trajectories of multiple model
outputs to the corresponding trajectories in the historical data provides a basis for assessing the
model’s validity. Such examinations are particularly needed for simulations that may be
projected quite far into the future (40 years, in this case). Unfortunately, this report provides no
indication that the modelers undertook this kind of validation effort at all. To the contrary, by
selecting a baseline of 2010, they precluded any meaningful comparison of model outputs with
historical data. Had they instead selected a baseline of, say, 1990 or earlier, this would have provided at
least 20 years of historical data that would have allowed fine-tuning of model parameters and a basis for
assessing the model’s validity in forecasting effects of policies and other factors on outcomes of
interest.
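The historical comparison described above can be sketched as follows. The two series are placeholders invented purely for illustration (they are not data from any country or from the report), and mean absolute percentage error is only one of several possible discrepancy measures; as noted, no single statistic settles the validity question.

```python
def mean_abs_pct_error(simulated, observed):
    """Mean absolute percentage error between two equal-length series."""
    if len(simulated) != len(observed):
        raise ValueError("series must cover the same years")
    total = sum(abs(s - o) / abs(o) for s, o in zip(simulated, observed))
    return 100.0 * total / len(observed)


# Placeholder series, invented for illustration only: per-capita consumption
# at five-year intervals from a hypothetical 1990 baseline.
years = [1990, 1995, 2000, 2005, 2010]
observed = [10.0, 9.6, 9.4, 9.1, 8.8]
simulated = [10.0, 9.8, 9.3, 9.2, 8.6]

mape = mean_abs_pct_error(simulated, observed)
# Year in which the simulated value misses the observed value by the most.
worst_year = max(zip(years, simulated, observed),
                 key=lambda t: abs(t[1] - t[2]) / t[2])[0]
print(f"MAPE over {years[0]}-{years[-1]}: {mape:.2f}% (largest miss in {worst_year})")
```

The same comparison would be run for each model output (consumption, incidence, mortality), since a model can track one trajectory well while drifting on another.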
There are additional strategies for assessing model validity that vary with the specific contexts
and quantities the model aims to represent. The need for model validation arises precisely
because the numerous and complex inter-relationships that simulation models can incorporate
are not susceptible to testing by standard statistical methods. The OECD report fails to provide
adequate information on steps taken to assess the validity of the CDP-Alcohol model, and this
failure impinges on the credibility that can be attached to the findings.
Robustness and sensitivity analyses
The final section of Chapter 5 considers the robustness of model findings and discusses analyses
of the sensitivity of results to parameter values and assumptions. However, the discussion of the
methods used is too opaque to allow a reader to endorse the report’s conclusion that the model
findings are indeed highly robust. Of particular concern, the entire sensitivity analysis effort is
apparently conducted using the WHO software MCLeague, which is not described in the report.
(On the WHO web site, it is described as “…a software program that represents uncertainty
around costs and effects to decision-makers in the form of stochastic league tables. It provides
additional information beyond that offered by the traditional treatment of uncertainty in
cost-effectiveness analysis, presenting the probability that each intervention is included in the optimal
intervention mix for given levels of resource availability.”) It appears that this software is geared
specifically toward the cost-effectiveness aspects of the analysis, about which several concerns
are identified below. However, for complex simulation models, it is necessary to examine the
implications of uncertainty about specific parameter values through numerous model runs that
vary parameters individually and in groups to determine if any of the results are highly sensitive
to specific parameter values. It is not clear that this kind of sensitivity analysis has been done; if
so it should be reported. Further strategies for assessing the robustness of a complex simulation
model include consideration of alternate functional forms for key model processes to determine
if results may be artifacts of arbitrary modeling decisions rather than of the underlying processes
being modeled, and assessment of the variation attributable to stochastic elements embedded in
structural components of the model. Again, the report does not indicate that these kinds of
assessments have been conducted.
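A one-at-a-time parameter sweep of the kind described above can be sketched in a few lines. The function toy_model is a deliberately trivial stand-in for a full simulation run, and the parameter names, baseline values, and ranges are hypothetical; none comes from the report.

```python
def toy_model(params):
    """Stand-in for a full simulation run: percentage change in an
    alcohol-related harm after a price increase. NOT the CDP-Alcohol model."""
    consumption_change = params["price_elasticity"] * params["price_increase_pct"]
    return params["harm_per_pct_consumption"] * consumption_change


# Hypothetical baseline values and plausible ranges (illustrative only).
baseline = {"price_elasticity": -0.5, "price_increase_pct": 10.0,
            "harm_per_pct_consumption": 1.2}
ranges = {"price_elasticity": (-0.8, -0.3),
          "harm_per_pct_consumption": (0.8, 1.6)}


def one_at_a_time(model, baseline, ranges):
    """Vary each parameter over its range with the others held at baseline."""
    results = {}
    for name, (low, high) in ranges.items():
        outputs = [model({**baseline, name: value}) for value in (low, high)]
        results[name] = (min(outputs), max(outputs))
    return results


base_output = toy_model(baseline)
sweep = one_at_a_time(toy_model, baseline, ranges)
for name, (lo, hi) in sweep.items():
    print(f"{name}: output from {lo:.1f}% to {hi:.1f}% (baseline {base_output:.1f}%)")
```

Varying parameters in groups follows the same pattern, with several entries overridden at once; for a stochastic model each point in the sweep would itself be an average over many runs.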
A pervasive issue with the presentation is that results obtained in the CDP-Alcohol model
simulations are reported as unitary point estimates. As noted in Box A, “However, since the
degree of stochastic variation in the results generated by the model is large compared with the
size of the effects produced by the policies examined, a high number of simulations were run for
each country (more for countries with a larger population, i.e. 5 000 for the Czech Republic;
10 000 for Canada, and 20 000 for Germany), in order to increase the degree of confidence around
the results estimated.” Thus, it appears that the reported results are the mean values obtained
from these large numbers of model runs. The degree of variation in the results is of some interest
and should be reported, although it should be noted that confidence intervals around such
estimates do not admit the standard interpretation, as precision may be increased to any desired
degree simply by running the model more times until the desired degree of statistical significance
is achieved (this consideration affects interpretation of the significance levels shown in Figure
5.2).
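The point about precision can be illustrated with a small sketch. The effect size, the run-to-run standard deviation, and the normal noise model are all hypothetical assumptions chosen for illustration. The confidence interval around the mean narrows in proportion to one over the square root of the number of runs, while the run-to-run variation itself, which is the quantity of substantive interest, does not shrink at all.

```python
import math
import random
import statistics


def run_once(rng, true_effect=-0.2, run_sd=2.0):
    """One hypothetical model run: a small policy effect buried in
    run-to-run stochastic noise."""
    return rng.gauss(true_effect, run_sd)


def summarize(n_runs, seed=1):
    """Mean effect, run-to-run SD, and 95% CI half-width of the mean."""
    rng = random.Random(seed)
    effects = [run_once(rng) for _ in range(n_runs)]
    sd = statistics.stdev(effects)
    half_width = 1.96 * sd / math.sqrt(n_runs)
    return statistics.fmean(effects), sd, half_width


for n in (100, 10_000):
    mean, sd, hw = summarize(n)
    print(f"runs={n:>6}  mean={mean:+.3f}  run-to-run SD={sd:.2f}  "
          f"95% CI half-width={hw:.3f}")
```

Because the half-width depends on a run count that the modeler chooses freely, statistical significance of a simulated mean says little about the reliability of the underlying model.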
Another manifestation of the concern with oversimplified and condensed results is the reporting
of aggregate values for changes over the 40-year simulation period, as in the key results shown
in Figure 5.1. Changes in prevalence rates are shown for the entire 40-year simulation period for
each policy in each country (i.e., difference in prevalence rates in year 40 vs. year 0), but no
indication is provided of how these changes may be distributed over time (i.e., trajectories) or
across population subgroups (as could be explored based on the heterogeneity incorporated in the
model). The problems identified above are inherent here as well, as there is no description of
how incidence risks for simulated individuals are estimated as a function of their alcohol
consumption histories (quantity and pattern). The results are shown for three categories of
drinkers (hazardous/harmful drinkers; heavy episodic drinkers; dependent drinkers) for which the
category definitions are not provided and that do not precisely align with the three categories
identified in the model description provided in Box A (regular drinkers, binge drinkers,
dependent drinkers).
Costs and cost-effectiveness analysis
The report presents per-capita costs for implementing each policy examined as a single number
for each of the 3 countries (Table A.7). The derivation of these estimates is not described except
to refer to the WHO-CHOICE methods, for which the online descriptions also do not provide
relevant details. Many factors could affect the calculation of these costs, and the simple reporting
of numbers in a table without details on their provenance is not good scientific practice. It is not
clear, for example, just what are the components of implementation costs, and the data sources
and methods for estimating those (unidentified) cost components are not described or explained.
One suspects that the implementation costs are limited to direct expenditures – perhaps even
limited to public-sector expenditures – involved in establishing and executing the policies, which
is not consistent with standard CEA methods, as it may omit other resource costs that are
attributable to the policies. For example, expanding brief interventions in healthcare settings may
involve significant inputs of patients’ time and resources that are not fully accounted for. As another
example, tax interventions generally reduce the real purchasing power of consumers by an
amount larger than the revenue transferred to governments (this is the “dead-weight loss”
associated with a commodity tax), and may also cause changes in sectoral employment and
economic activity, sometimes including shifts among domestic production and imports. Such
effects may represent genuine costs associated with these policies that are overlooked by
considering only the expenditures required to establish and maintain the policies.
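The dead-weight loss point can be shown with a textbook worked example. The sketch assumes linear demand, a constant marginal cost so that the tax is fully passed through to the consumer price, and no health or external benefits; the numbers are arbitrary and the function name is hypothetical.

```python
def tax_incidence_linear(intercept, slope, price, tax):
    """Linear demand Q = intercept - slope * P, with constant marginal cost
    so the tax is fully passed through to the consumer price."""
    q_before = intercept - slope * price
    q_after = intercept - slope * (price + tax)
    if q_after < 0:
        raise ValueError("tax eliminates the market under these assumptions")
    revenue = tax * q_after
    deadweight_loss = 0.5 * tax * (q_before - q_after)
    # Loss in consumer surplus: the trapezoid between the old and new price.
    consumer_loss = revenue + deadweight_loss
    return revenue, deadweight_loss, consumer_loss


# Arbitrary illustrative values.
revenue, dwl, consumer_loss = tax_incidence_linear(
    intercept=100.0, slope=2.0, price=10.0, tax=5.0)
print(f"revenue={revenue:.0f}  dead-weight loss={dwl:.0f}  "
      f"consumer loss={consumer_loss:.0f}")
```

Under these assumptions the loss to consumers exceeds the revenue raised by exactly the dead-weight loss, a cost that an accounting limited to implementation expenditures would not capture.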
The OECD report’s approach to cost-effectiveness analysis (CEA) appears to depart from
standard methods (see footnote 1) used in U.S. evaluations of health-related interventions in at least two
important ways. First, as noted above, it omits some potentially important elements of cost that
should be included. Reporting only the implementation costs systematically and perhaps
seriously understates the true costs associated with the policies examined.
Second, the cost-effectiveness analysis is presented as a comparison between implementation
costs on one hand and reductions in health care expenditures on the other. This is distinctly at
odds with the standard cost-effectiveness ratio, in which the numerator is the net present value of
all changes in resource use resulting from the policy, and the denominator is a (non-monetary)
measure of change in health status. The preferred metric for health status change is the QALY
(quality-adjusted life year), which facilitates comparisons across alternative interventions
(although it has some well-documented limitations). The OECD’s preference for DALYs
(disability-adjusted life years, which bear a greater conceptual distinction from QALYs than the
similarity in their names suggests) as a synthetic measure of population health outcomes could be
accommodated in a quite straightforward manner. However, treating changes in health care
expenditures as the outcome and examining these in comparison to policy implementation costs
(whatever those are deemed to comprise, but which apparently include some elements of health
care expenditures as noted in the first bullet in section 5.8, page 30) is certainly a non-standard
framework for reporting CEA results.
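The standard ratio described above can be sketched as follows. The cost and outcome streams, the 3% discount rate, and the function names are illustrative assumptions, not figures from the report. Savings in health care expenditures enter the numerator as negative costs, alongside implementation costs and other resource costs, and the denominator is a non-monetary health measure (DALYs averted here, to match the OECD's preference).

```python
def present_value(stream, rate):
    """Discount a yearly stream (year 0 first) to present value."""
    return sum(x / (1.0 + rate) ** t for t, x in enumerate(stream))


def cost_effectiveness_ratio(net_costs, health_gains, rate=0.03):
    """Net present value of all resource-use changes per discounted unit
    of health gained."""
    pv_gain = present_value(health_gains, rate)
    if pv_gain <= 0:
        raise ValueError("ratio is undefined without a positive health gain")
    return present_value(net_costs, rate) / pv_gain


# Hypothetical three-year streams per 1 000 people (illustrative only).
implementation = [50_000.0, 20_000.0, 20_000.0]
other_resource_costs = [10_000.0, 10_000.0, 10_000.0]  # e.g. patients' time
health_care_savings = [0.0, 15_000.0, 25_000.0]
dalys_averted = [0.0, 2.0, 3.0]

net_costs = [i + o - s for i, o, s in
             zip(implementation, other_resource_costs, health_care_savings)]
ratio = cost_effectiveness_ratio(net_costs, dalys_averted)
print(f"net cost per DALY averted: {ratio:,.0f}")
```

Reporting the ratio in this form allows direct comparison across interventions, which a side-by-side listing of implementation costs and expenditure savings does not.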
Additional comments based on the entire report
The report is generally well-written and represents an admirable effort to provide balanced
discussion of a number of challenging issues relating to alcohol consumption and specific policy
interventions. It gives worthwhile attention to balancing the harms and benefits of alcohol
consumption, including health effects but also the utility associated with consumption, as with
any other consumer good. The report would benefit from acknowledgement that its perspective is
generally that of economics. It would also benefit from a clear statement of purpose, identifying
how it contributes to the literature and to general understanding of the issues it addresses.
Footnote 1. The standard U.S. reference for cost-effectiveness analysis in the context of health-related interventions is Gold,
M. R., Siegel, J. E., Russell, L. B., & Weinstein, M. (1996). Cost-effectiveness in health and medicine: report of the
panel on cost-effectiveness in health and medicine. New York: Oxford Univ Press. This monograph was authored by
a group of 13 non-government scholars convened by the U.S. Public Health Service. The OECD report does not cite
this reference but does cite a WHO report from 2003 that bears some similarities to the Gold et al. effort.
The “goal of alcohol policy” is identified differently at various points throughout the report. For
example, the first paragraph of section 1.4 provides a relatively complicated statement: “The
goal of alcohol policies should be to reduce welfare losses to a minimum and preserve people’s
freedom to consume alcoholic beverages when these are indeed a source of pleasure, provided no
harms are caused to others as a result, and provided that drinkers are able to make a fully
informed choice, based on a complete knowledge of the consequences of drinking alcohol.”
Later in the same chapter the report says, “…the goal of alcohol policies is to curb the harmful
use of alcohol….” (page 14), which is echoed in the first bullet in the introduction to Chapter 4.
However, the first sentence of section 4.1 contradicts this, saying, “The goal of alcohol policies
is typically to reduce alcohol-related harms.” Still later in Chapter 4 we are told, “One of the
main goals of alcohol policies is to promote public health and social wellbeing” (page 6). There
are real differences among these stated goals (and there may be additional variations represented
in the report). In fact, the goals of policy makers may not correspond to any of these intentions,
but may instead be focused on other objectives, such as increasing public revenues, or benefiting
the economic interests of one constituency over others. It may be simpler and more accurate to
identify the goal of the report as evaluating the effects of alcohol policies in reducing
alcohol-related harms through reductions in alcohol consumption – regardless of how that objective
might figure in the actual motivations for policy-makers to adopt such measures.
Chapter 1, Page 5, paragraph 3: The large rise in rank of alcohol consumption as a source of
global death and disability (from eighth to fifth) in just ten years demands some explanation. Did
prevalence or severity of other major sources of health problems decline over the period? Did
prevalence or severity of alcohol problems actually increase?
Chapter 1, pages 9-10: The discussion of “Individual choice and social impacts” needs to be
buttressed with a discussion of causal linkages between alcohol consumption and the harms
identified, acknowledging that not every harm associated with alcohol consumption is caused by
that consumption, yet clarifying that strong evidence supports a causal role in a large fraction of
these harmful outcomes.
Chapter 1, page 12: The discussion of rational addiction necessarily oversimplifies the economic
theory of rational addiction, which most readers of this report will find unfamiliar, confusing,
and hard to accept. Discussion of this conceptually challenging theoretical framework could
safely be omitted from the report, but if it is retained, citation to the primary source material
(Becker and Murphy, 1988) should be included. The discussion also underplays the significance
of alcohol dependence as a major public health problem (e.g., consider the disability weights that
have been attached to alcohol dependence in DALY estimations).
Chapter 1, page 14, “Policy responses” section: This section begins with a tendentious framing
of “the question that policy makers have to answer” that ignores the possibility of strategies to
reduce the harms directly and focuses exclusively on curbing alcohol use. The question of
broad-based vs. targeted prevention strategies is presented as essentially either/or, despite the promise
that the book will examine combinations of policies that may accomplish a balancing of these
considerations, at least in terms of net public health benefits (i.e., without necessarily addressing
other non-health concerns that policy-makers must also consider).
The section continues (page 15) with a brief discussion of minimum pricing that seems poorly
thought out from an economic perspective. The claimed welfare loss borne by consumers due to
minimum prices is transferred to sellers, for little net change in social surplus relative to a
tax-driven price increase. Focusing on this welfare loss also neglects the presumed beneficial effects
on health and other adverse alcohol-related outcomes that motivate the policies in the first place.
Relative to a taxation strategy that generates public-sector revenues, minimum pricing reallocates
the social surplus that arises from voluntary exchange from the public to the private sector, but it
is not clear a priori whether the overall surplus would be the same, larger, or smaller under the
minimum-pricing regime. The problematic framing from this discussion is exacerbated in the
third of the “Key findings” bullets at the beginning of Chapter 4, which simply asserts “a welfare
loss” without clarifying that most of the loss to consumers is transferred to sellers.
Chapter 2, page 4: Figure 2.2 raises important questions that are not resolved, and the
accompanying text focuses on youth drinking, which is not depicted in the Figure, and on
disclaiming the value of these kinds of aggregate data. If the information in Figure 2.2 is not
central enough to deserve elaboration in the text, it should be omitted or relegated to an
appendix. If it is retained, the text should address issues such as: What explains the large-scale
changes reported in the 20-year span? How much of this temporal variation is likely to be due to
changes in data reporting and collection? Most important, can the changes reported be reconciled
with the more disaggregated data reported in Figs 2.5-2.8? Discrepancies appear especially
pronounced for France, Italy, and Ireland. The magnitude of changes reported appears potentially
alarming, especially in countries not closely examined in this report (e.g., Russia, India, China,
and Brazil together account for over 40% of world population and all show increases of 40% - 57% in per
capita alcohol consumption over 20 years).
Chapter 4, pages 3-4: The discussion of how various categories of drinkers would benefit from
reducing their alcohol consumption appears at odds with the standard welfare economics
framework that seems to underlie much of this discussion. That framework attaches full weight
to consumers’ own ex ante judgments regarding the utility they derive from their preferred
consumption levels. Under that framework, well-informed, non-addicted, adult consumers would
not be judged better off by policies, such as taxes, that induce them to reduce their consumption,
but in fact would be harmed by such policies. Ex post avoidance of adverse effects that otherwise
would be borne by knowledgeable sovereign consumers generally does not justify public
intervention in the utility-maximizing consumption decisions of sovereign consumers under a
standard welfare economics framework.
Chapter 4, pages 4-5: The exclusive focus on price policies in this section is not signaled,
explained, or justified. No general framework for evaluating such policies is identified, but the
discussion raises a number of issues that appear to serve as a piecemeal set of evaluation criteria.
The presentation sounds defensive, summoning a range of half-arguments to justify such
policies. The discussion of the distribution of welfare losses resulting from a specific tax increase
is imprecise enough to be objectionable to economists and terse enough to be completely opaque
to most non-economists. Some better explanation is called for, along with a reference to a
standard text on public economics.
Chapter 4, page 5, first paragraph: It is worth noting that the adverse health consequences of
alcohol consumption may also be distributed in a manner that disadvantages those at the lower
end of the income and wealth distributions. Research has not established the relationship
between the distribution of health benefits and the net incidence of alcohol taxes.
Chapter 4, page 5, second paragraph: “Cross-border trade,” as it is generally understood, is not
primarily a form of tax avoidance; most internationally-traded alcoholic beverages are subject to
taxation in one or both countries involved. Illicit cross-border trade, i.e., smuggling, tends to
flow from lower-tax to higher-tax jurisdictions, thereby offsetting some potential effects of
higher taxation in terms of both health and revenue.
Chapter 4, page 5, bullets under “Key messages….”
• First bullet – broader measures of health outcomes are useful for evaluating both health
benefits and adverse health consequences of alcohol consumption, not just benefits.
• Second bullet – this assertion is tendentious and betrays a primary focus on reducing
drinking rather than on improving health outcomes. This point also does not emerge from
the foregoing section, which these few bullets claim to summarize.
• Third bullet – this claim is not firmly substantiated in the exposition to this point. It is also
worth noting that effects on consumption among more moderate drinkers can have large
effects on population health outcomes, reflecting the so-called “prevention paradox”.
Chapter 4, page 16, last paragraph: The selection of Miron et al., 2009 as the sole empirical
study of MLDA in the United States to be cited in this section provides a highly skewed view of
the evidence. A more comprehensive examination of MLDA effects is reported in
Carpenter, C., & Dobkin, C. (2011). The minimum legal drinking age and public health. The
Journal of Economic Perspectives, 25(2), 133-156.
Chapter 4, page 18: The assertion that regulation of outlet density and/or opening hours is “more
strongly supported by evidence” than minimum legal drinking age restrictions seems at odds
with the literature from the United States, and the few citations provided do not dispel this
impression.