Decision Analysis
Vol. 10, No. 2, June 2013, pp. 121–134
ISSN 1545-8490 (print) — ISSN 1545-8504 (online)
http://dx.doi.org/10.1287/deca.2013.0268
© 2013 INFORMS
Toward an Improved Methodology to Construct and
Reconcile Decision Analytic Preference Judgments
Richard M. Anderson
Puget Sound Institute, Center for Urban Waters, University of Washington Tacoma,
Tacoma, Washington 98421, rmanders@uw.edu
Robert Clemen
Fuqua School of Business, Duke University, Durham, North Carolina 27708, robert.clemen@duke.edu
Psychologists and behavioral economists have documented a variety of judgmental flaws that people make
when they face novel decision situations. Similar flaws arise when decision analysts work with decision
makers to assess their preferences and trade-offs, because the methods the analyst uses are often unfamiliar to
the decision makers. In this paper we describe a process designed to mitigate the occurrence of such biases;
it brings together three steps. In training, the decision maker is first given values to apply in judgment tasks
unrelated to the decision at hand, providing an introduction to thinking deliberately and quantitatively about
preferences. In practice, the learned tasks are then applied to a familiar decision, with the goal of developing
the next incremental level of expertise in using the methods. Finally, in application, the more deliberative style
of thinking is used to address the problem of interest. In an environmental resource setting with two oyster
habitat managers, we test the procedure by attempting to mitigate the prominence effect that has been reported
in the behavioral research literature. The resulting preference weights appear to be free of the prominence effect,
providing initial steps toward operationalizing the “building code” for preferences introduced by Payne et al.
[Payne JW, Bettman JR, Schkade DA (1999) Measuring constructed preferences: Towards a building code. J. Risk
Uncertainty 19(1–3):243–270].
Key words: behavioral decision making; biases; decision analysis; environment; multiattribute value theory;
practice
History: Received on December 5, 2012. Accepted by former Editor-in-Chief L. Robin Keller on February 11,
2013, after 1 revision.
1. Introduction
Decision analysis approaches decision complexity generally by decomposing a large and difficult decision problem into a representation consisting of alternatives, preferences, and beliefs. The analyst typically guides the decision maker in identifying and assessing these components so that they are consistent with the relevant normative theory and thus can be recombined using the rules of the theory. For example, many problems involve evaluating an alternative x described in terms of a set of attributes $x_1, x_2, \ldots, x_n$. The alternative might be a prospective job offer, and the decision problem to choose between job offers that differ in terms of salary, vacation days, and location. In this situation, the components commonly consist of single attribute value functions $v_i(x_i)$ and weights $w_i$ (where $\sum_i w_i = 1$), both representing decision maker preferences, which are assessed and combined using an additive multiattribute value function

$$v(x) = \sum_{i=1}^{n} w_i v_i(x_i). \qquad (1)$$
For (1) to be a valid representation of aggregate value
$v(x)$, a normative requirement is that the attributes are
structured in such a way that they may be thought of
as separable in the sense that variations in preference
for outcomes in terms of a given attribute are independent of preferences for outcomes in terms of the
others (Keeney and Raiffa 1976).
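To make the aggregation in (1) concrete, the following minimal sketch (in Python) scores an alternative under an additive value model. The job-offer attributes, single attribute value functions, and weights are invented for illustration; they are not taken from the paper.

```python
# Minimal sketch of the additive multiattribute value model in Eq. (1):
# v(x) = sum_i w_i * v_i(x_i), with the weights summing to 1.
# All attributes, value functions, and weights below are hypothetical.

def linear_value(worst, best):
    """Single attribute value function scaled to 0 at the worst level, 1 at the best."""
    return lambda x: (x - worst) / (best - worst)

value_functions = {
    "salary": linear_value(60_000, 80_000),  # dollars per year
    "vacation": linear_value(2, 4),          # weeks per year
    "coastal": lambda x: float(x),           # 1 if coastal, 0 otherwise
}
weights = {"salary": 0.5, "vacation": 0.25, "coastal": 0.25}  # sums to 1

def aggregate_value(alternative):
    """Evaluate v(x) for an alternative described attribute by attribute."""
    return sum(weights[a] * value_functions[a](alternative[a]) for a in weights)

offer_a = {"salary": 75_000, "vacation": 2, "coastal": 1}
offer_b = {"salary": 80_000, "vacation": 4, "coastal": 0}
print(aggregate_value(offer_a), aggregate_value(offer_b))  # 0.625 0.75
```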
A prescriptive recommendation is to assess redundant component judgments using different methods and to perform consistency checks to reconcile any discrepancies. The process of reconciliation is thought
to improve quality of the final assessment since it
provides opportunity for the decision maker to think
harder about her values, and thus arrive at greater
confidence that her values have indeed been captured
(von Winterfeldt and Edwards 1986, Huber et al. 1993,
Hobbs and Horn 1997). Consider, for example, the
weights wi in (1). These reflect preference judgments
about the ranges of attribute outcomes spanning the
alternatives under consideration. A basic notion of
consistency is as follows. Suppose the available difference in annual salary (over all job prospects) is
$20,000 and salary is weighted twice as highly as
vacation days, which range from two to four weeks.
Suppose further that a reduction in vacation days
from four to two weeks is equivalent in value to having a job in a coastal rather than noncoastal location
(assuming the coastal location is preferred). To be consistent, the decision maker should confirm that she
would be willing to give up the coastal job location in
return for an increase in salary of $10,000, assuming
that value varies linearly over these ranges.
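The arithmetic behind this consistency check can be written out explicitly, as in the short sketch below. The weight values are our assumption, chosen only to respect the stated 2:1 salary-to-vacation ratio.

```python
# Consistency check for the job-offer example, assuming value is linear over the
# stated ranges. Salary spans $20,000 and is weighted twice as highly as vacation
# days (2-4 weeks); the full vacation swing equals the coastal-vs.-noncoastal swing.
w_salary, w_vacation = 0.4, 0.2   # illustrative; any pair with a 2:1 ratio works
w_location = w_vacation           # full vacation swing is equivalent to the location swing
salary_range = 20_000             # dollars spanned by the alternatives

# Salary increase whose value equals the location swing:
# w_location * 1 = w_salary * (delta_salary / salary_range)
delta_salary = w_location / w_salary * salary_range
print(delta_salary)  # 10000.0 -- the $10,000 the decision maker should accept
```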
The results of a large body of research in behavioral
decision making have shown that violations of this
elementary notion of consistency often take place, and will typically reflect more than, say, random
errors due to routine lapses of mental attention (e.g.,
Lichtenstein and Slovic 2006). This research reports
that preference judgments display an extraordinary
degree of sensitivity as well as predictable directions of bias depending on the way in which preference elicitation questions are asked. For example, the
response to the final question in the last paragraph
has been shown to depend on whether the question
was put in the form of a choice between two alternatives or whether the respondent was given a matching task, i.e., asked to give a value that would induce
indifference between alternatives. Respondents have,
for example, been shown to prefer the alternative that
is superior on the prominent attribute more often in
choice than in matching, a widely replicated preference bias called the prominence effect (e.g., Tversky
et al. 1988, Fischer et al. 1999).
Indeed, these observations have been so ubiquitous
that one prominent team of researchers in this field
suggested that it may be the case that “the foundations of choice theory and decision analysis are called
into question” (Tversky et al. 1988, p. 84; see also
Fischer et al. 1999, p. 1074 for a similar comment). The
foundation of which they speak is in part the plausible notion that if well-defined preferences exist, they
should be invariant to different, but logically equivalent, procedures to elicit them. The reality has proven
more complex. What has instead been observed is
that preferences depend on details of how the question is asked (e.g., choice versus match, mentioned
earlier; Tversky et al. 1988), how the choice set is displayed (e.g., attribute order; Carlson et al. 2006), or
on characteristics of other options in the choice set
(e.g., whether an asymmetrically dominated option is
included; Huber et al. 1982), to give a few examples.
Behavioral decision researchers have concluded that
in complex and unfamiliar elicitation settings when
preferences are not available for direct recall, subjects are not merely revealing stable preferences, but
instead constructing them on the spot. Lacking precise knowledge of how their fundamental, familiar
values relate to the decision setting at hand, they
are prone to respond based on error-prone heuristics or on intuition readily influenced by the most
affectual or the most salient features of the elicitation procedures (Gilovich et al. 2002, Lichtenstein and
Slovic 2006). There is active debate on the mechanisms
behind such phenomena, but little doubt that preferences revealed depend on the questions asked (e.g.,
Fischer et al. 1999).
How should the proponent of decision analysis
respond? If sensitivity analysis reveals that these
errors matter, one response is to attempt to use what
is being learned about how people respond to elicitation procedures to improve the way in which
these procedures are conducted. Indeed, this has been
called one of the central tasks of behavioral decision theory (Edwards 1990). With respect to value
weights wi , there have been a number of attempts
to determine, first, features of elicitation protocols to
which people are sensitive and the magnitudes and
directions of bias that tend to result, and then, second,
ways of counteracting those biases, including warning and training respondents to guard against them
and designing analytical approaches to mitigate their
occurrence.
For example, one often replicated result is that
the particular version of “value tree” used to structure objectives impacts value weights, producing a
so-called splitting bias where presenting an attribute
in greater detail increases the weight that it receives
(e.g., Borcherding and von Winterfeldt 1988 report
on one of the earliest studies of this phenomenon).
In two more recent investigations, Hamalainen and
Alaja (2008) attempted, with partial success, to avoid
this problem by educating respondents about the danger, whereas Jacobi and Hobbs (2007) developed a
computational method to debias weights. Another
analytical approach to debiasing weights was aimed
at standard “matching” or trade-off task approaches
to obtaining weights (setting an attribute level so that
a pair of consequences is indifferent). In this case,
Anderson and Hobbs (2002) used a statistical procedure to adjust for the magnitude and direction of scale
compatibility bias also frequently reported on within
the literature, in which the attribute that is adjusted
in the response appears to be more salient, leading to
overweighting (e.g., Tversky et al. 1988, Slovic et al.
1990, Delquié 1993, Fischer et al. 1999). In contrast
to this analytical approach, Delquié (1997) proposed
a potential improvement to the matching procedure
itself in which both attributes of the pair are adjusted.
However, Fischer et al. (1999) found that a similar
procedure remained likely to show a similar bias. Yet
another approach has been to use weights derived
from a less cognitively demanding task such as simply ranking the attribute ranges under consideration,
followed by deriving weights via a specific algorithm
(Edwards and Barron 1994, Barron and Barrett 1996).
However, this would lead to a loss of specific ratio
scale information by the decision maker and potentially suboptimal decisions (Jia et al. 1998).
This paper returns to the initial recommendation for addressing error and bias in value weights
described earlier, that of making redundant assessments using different methods and reconciling any
inconsistencies. Respondents will usually find themselves in complex settings lacking everyday familiarity and will thus to some degree be constructing
their preferences and therefore prone to inconsistency. Our central idea is to use behavioral insights
to improve the process of weight elicitation and
particularly of weight reconciliation in such settings (cf. Edwards 1990, Keeney 2002). We distinguish between approaches described in the previous
paragraph where these insights were used to debias
weights “offline” from the process of interaction
with the decision maker and approaches in which
these insights are used to design a better interactive elicitation or reconciliation procedure. This paper
focuses on the latter category. The former category
of approaches will be most useful when little time
is available to interact with the decision maker and
in such cases provides a plausible approach given
known tendencies toward weighting bias. However,
it leaves unaddressed the prescriptive question of how
to use specific behavioral insight to interact with the decision maker in a way that moves toward judgments
that are both unbiased, providing normative validation, and in which the decision maker herself has a
high degree of confidence, providing user acceptance-based “validation.” To be clear, our ultimate goal is
to develop weights that are normatively valid—that
is, that accurately reflect the long-term preferences
of the elicitee or her agent—but, as we explain later,
user confidence should be one of validity’s first fruits
in an improved process. We propose that behavioral
insight can and should be used to explicitly develop
prescriptive guidance about how weights should be
elicited and then reconciled, seeking to build on previous studies that applied multiple methods but did
not report details of the underlying behavioral philosophy (e.g., Hobbs and Horn 1997). Thus we attempt
to operationalize the Payne et al. (1999) concept of a
“building code” for preference construction, providing a framework for a new way to think about preference elicitation and reconciliation (cf. Gregory 1999).
We illustrate these ideas in an environmental preference assessment exercise conducted with two professional natural resource managers. Although our
demonstration shows that our approach can be used
with real decision makers, we hasten to add that the
demonstration does not constitute experimental evidence one way or the other. Providing such evidence
remains for a future study.
We continue in §2 by reviewing existing literature
on reconciliation to see what it suggests about how
reconciliation has been and should be done. We ask
if these previous approaches to eliciting and reconciling value weighting judgments have led to weights
that satisfy two key aspects of validity: convergence and user acceptance. We conclude that challenges to
doing so remain, motivating the synthesis of the set of
procedures described in §3 for eliciting and reconciling the component weight judgments required for the
multiattribute value model (1). In §4 we describe how
this procedure was applied in elicitations conducted
with two North Carolina natural resource managers.
In §5 we describe how our approach represents initial steps toward implementation of the Payne et al.
(1999) building code for preference construction, and
we outline hypotheses that should be tested to translate those guidelines into generalizable and practical
guidance for decision analysts.
2. Studies on Reconciliation: Is Convergent Validity the Answer?
Beyond the standard advice to perform redundant
elicitations, little has been reported in the literature
on what actually happens or what should happen in
the reconciliation process. For one thing, merely asking respondents to “think harder” about their preferences might not lead to more accurate assessments
if the judgment task they have been assigned is of a
level of difficulty such that simpler but more error-prone heuristic thought procedures are employed to
carry them out (Payne et al. 1993). Thus the goal of
merely converging on a final response may produce
a result that, despite greater confidence on the part
of the respondent, is still confounded by errors of the
sort previously described. Are reconciled judgments
more bias free? Can we expect them to be so?
One perspective on inconsistencies in attempted
measures of preference is that the inconsistency measures are invalid and that decision-analytic attempts
to obtain useful expressions of preference are thus
subject to major if not fatal problems (e.g., Tversky
et al. 1988, Fischer et al. 1999). For example, many
studies have used a convergent validity approach that
compares values obtained for the same objects by
a holistic and a decomposed procedure. The holistic approach is the intuitive, unaided approach to,
say, ranking a set of multiattributed alternatives,
whereas an example of a decomposed procedure is
the typical decision-analytic decomposition into preferences, beliefs, and alternatives as exemplified by
value model (1). These studies have typically been
motivated by descriptive questions (e.g., When are the
judgments different?), as opposed to prescriptive considerations, where decomposition is recommended
for its ability to help address human difficulties in
making complex comparisons. The convergent validity principle predicts that if these different approaches
are measuring the same thing (e.g., preferentially separable, multiattributed objects), they should be highly
correlated; this is one approach to evaluating validity
when dealing with measures such as stated utilities
that are subjective and unobservable. The approach
is also used to assess multiattribute value or utility
weights elicited by different procedures as we do in
this paper, where the idea of avoiding lack of “validity” (in its broader sense; e.g., Cronbach and Meehl
1955) could be more narrowly expressed as minimizing elicitation mode bias. However, in anticipation of
future applications of the broader ideas, we retain use
of the term “validity” when we discuss convergence
(cf. Gregory 2004).
For the holistic versus decomposed comparisons,
in many cases correlations were high, though not
perfect, and arguably offered evidence in favor of
validity. Von Winterfeldt and Edwards (1986, p. 364)
listed 11 such studies, with correlations ranging from
0.7 to 0.9. However, two studies, one experimental
(John 1984) and the other a field study looking at
alternative future German energy scenarios (Keeney
et al. 1987), dealt with more challenging decision
tasks in settings where attributes were uncorrelated
or negatively correlated. In these studies, ordinal discrepancies suggesting lack of validity were obtained
between intuitive rankings and results of a multiattribute model. There are also many studies that
have reported low correlations and ordinal discrepancies in multiattribute utility models between preference weights derived by different procedures (e.g.,
Schoemaker and Waid 1982, Borcherding et al. 1991,
Hobbs et al. 1992, Bell et al. 2001).
Still, counter to the prior-stated suggestion that
such inconsistencies and biases call into question the
validity of these measures of preference or the usefulness of the procedures that attempt to obtain them,
the decision-analytic response is typically to note
that they fall short of violations of the normative
assumptions of utility theory, as in Ellsberg’s paradox, for example (e.g., von Winterfeldt and Edwards
1986). Rather, they are unintended though nonrandom inconsistencies that should and can be somehow reconciled. In fact, it has been further suggested
that such an error can be an asset rather than a
liability—that the attempt to reconcile the demands of
the decision-analytic models to decision maker intuition can generate a “creative stress” that provides
opportunity for deeper insight into the decision problem (von Winterfeldt and Edwards 1986). At a minimum, noticing and reconciling such inconsistencies
is an inherent part of any realistic decision analysis.
For example, John (1984) presented inconsistencies
between multiattribute utility-based and holistic rankings to subjects and asked, “Why are you inconsistent and which numbers would you like to change?”
Subjects attributed the disagreement to faulty judgment and were explicit about what they wanted to
change. Keeney et al. (1987) obtained a similar result.
But these authors did not address whether these reconciled judgments were themselves bias free.
Eppel (1990) studied the effect of reconciling inconsistent multiattribute value weights in a series of
laboratory experiments. His studies focused on the
weights themselves and did not involve choosing alternatives. His experiments were administered by computer, facilitating tailored wording and
sequencing of questions and immediate feedback and
reconciliation of weight inconsistencies to respondents. Four widely used weight elicitation methods
were compared: Ratio, Swing weighting, Indifference
Trade-off, and Pricing out. (See von Winterfeldt
and Edwards (1986) for a detailed explanation of
each method.) An additive representation with linear single attribute value functions was assumed
throughout.
Respondents were asked to reconcile any ordinal
inconsistencies between the weights from the different
methods using one of two modes, a Trade-off mode
and a Pricing mode. In the Trade-off mode, normalized weights were converted into indifference values
between two alternatives that were supposed to be
equal in value, where each alternative was described
in terms of two attributes. For each of the four weight
elicitation methods, the indifference values for a particular attribute were displayed simultaneously on
the screen and respondents were asked to indicate
what indifference value they felt most comfortable
with, one of the four or a new one. This process was
repeated for each attribute consistent with the normalization that the weights summed to one. The Pricing mode differed only in that results were displayed
and reconciled in terms of the amount society should
be willing to pay to move each attribute from its
worst to its best consequence (i.e., the cost attribute).
Respondents were undergraduates.
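A hedged sketch of what the Trade-off feedback computation could look like: given normalized weights and linear single attribute value functions, the indifference value displayed for one attribute follows from equating the values of the two alternatives. The attribute ranges and weight values here are hypothetical.

```python
# Convert normalized weights into a trade-off (indifference) value between two
# two-attribute alternatives, assuming linear single attribute value functions.
# Alternative 1: attribute A at its best, attribute B at its worst.
# Alternative 2: attribute A at its worst, attribute B at level x.
# Indifference requires w_A = w_B * (x - worst_B) / (best_B - worst_B).
def indifference_level(w_a, w_b, worst_b, best_b):
    if w_a > w_b:
        raise ValueError("no interior match exists when w_A exceeds w_B")
    return worst_b + (w_a / w_b) * (best_b - worst_b)

# Hypothetical example: w_A = 0.3, w_B = 0.5, attribute B ranges from 0 to 1000.
print(indifference_level(w_a=0.3, w_b=0.5, worst_b=0.0, best_b=1000.0))  # 600.0
```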
In contrast to Eppel (1990), who used a computerized instrument with undergraduates in a laboratory
setting, Hobbs and Horn (1997) held one-on-one interviews to reconcile inconsistent decision analysis judgments from citizen stakeholders in a realistic energy
resource planning process for a Canadian utility. Recommended alternatives would be added to a portfolio
of nonexclusive options to provide the desired energy
services. In this study, two different weight elicitation
methods were applied, a Ratio-questioning/Swing-weighting hybrid and Indifference Trade-off weighting, each combined with the same additive single attribute value functions. In a third approach,
alternatives were also scored holistically. Interviews
were held with stakeholders individually, each lasting
about 1.5 hours.
The reconciliation proceeded in two stages: In
the first stage, weights from the two elicitation
approaches were reflected back to the stakeholder,
compared, and discussed. Stakeholders were given
an opportunity to revise any judgments or choose a
preferred set of weights. Discussion then focused on
the alternatives recommended by the different sets of
weighting judgments. Each alternative could be categorized as either “in” or “out” in this formulation,
and reconciliation focused on cases where an alternative was ranked “in” under one weighting method
and “out” under the other. Stakeholders were asked
to indicate which of the two sets of weights was consistent with their values and why, leading ultimately
to a final set of weights and recommended alternatives with which they were most comfortable. In the
second stage, the final recommendation for each alternative (“in” or “out”) was compared to that made
by the holistic assessment. Any differences were discussed. A final recommendation for each alternative
was elicited, and when it differed from the recommendation of the preferred weights, a rationale was
requested.
As would be expected, both studies found that
different weight elicitation methods lead to different
weights, suggesting the need for reconciliation: different magnitude orderings of weights in Eppel (1990)
and different weights in the sense of recommending
different portfolios of alternatives in Hobbs and Horn
(1997). Hobbs and Horn (1997) also found (stage two)
disagreement in recommended alternatives between
final weights and holistic choices.
Eppel (1990) attempted to understand the strategies
used by respondents to reconcile inconsistent weights.
To do so, he categorized each final response (separately for Trade-off and Pricing modes) as being closest (in terms of absolute difference) to one of the
four responses (i.e., the feedback response) initially
obtained by the four weight elicitation methods. Each
respondent was categorized as reconciling toward the
method that had the lowest average (across attributes)
absolute difference between feedback and final values. The frequencies of these classifications for the
two reconciliation formats (Trade-off or Pricing) were
then analyzed.
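Eppel's classification rule can be stated compactly: assign each respondent to the elicitation method whose feedback values have the smallest average absolute difference from the final reconciled values. A sketch with made-up numbers (not data from Eppel 1990) follows.

```python
# Classify a respondent's reconciliation direction as the method whose feedback
# values are closest (lowest mean absolute difference across attributes) to the
# final reconciled values. All numbers below are illustrative.
def closest_method(final_values, feedback_by_method):
    def mean_abs_diff(feedback):
        return sum(abs(f - v) for f, v in zip(feedback, final_values)) / len(final_values)
    return min(feedback_by_method, key=lambda m: mean_abs_diff(feedback_by_method[m]))

final = [0.42, 0.33, 0.25]
feedback = {
    "Ratio":       [0.50, 0.30, 0.20],
    "Swing":       [0.45, 0.35, 0.20],
    "Trade-off":   [0.40, 0.34, 0.26],
    "Pricing out": [0.55, 0.25, 0.20],
}
print(closest_method(final, feedback))  # "Trade-off"
```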
In contrast, Hobbs and Horn (1997) focused on
the degree of stakeholder confidence in final recommended portfolios of alternatives. Most inconsistencies in this study were finally resolved in favor
of weight-recommended portfolios, but there were
some exceptions in which stakeholders finally rejected
the model-based recommendations. Typically, the reason given was disagreement with the assumptions
behind multiattributed descriptions of certain alternatives. Providing opportunity for such insights had
been viewed as one of the goals of the reconciliation process. Based on explicit stakeholder feedback,
Hobbs and Horn (1997) concluded that applying different preference elicitation methods plus reconciliation resulted in learning and increased confidence in
recommendations.
Although these studies shed light on what does happen in reconciliation of multiattribute value weights,
they also suggest additional things that should happen in reconciliation. Given the abundance of evidence
in the literature that preferences are often constructed
at the time the valuation question is asked and thus
will reflect features of the construction process, a more
informative reconciliation procedure might attempt to
indicate whether the reconciled weights are likely
to be free of assessment-related biases. If preferences
are really being constructed at the time of elicitation, then the reconciliation process in which initial
judgments are reviewed and then adjusted to better
reflect values should overlap considerably with the
construction process. User acceptance and confidence
in results of the reconciliation process is certainly of
critical importance (e.g., Hobbs and Horn 1997); the
goal, of course, being an expression of preference that
serves as a guide to action. However, we also wish
that this confidence rests on a firm foundation, one
that is not essentially determined by features of the
procedure by which these results (e.g., weights) were
initially elicited and/or reconciled. A more diagnostic
look at the processes involved in reconciliation could
suggest further avenues toward accomplishing these
dual objectives.
For example, Eppel (1990) reported evidence that
reconciled weights might not be bias free. Recall that
he calculated the frequencies of classifications for
the two reconciliation modes, Trade-off and Pricing.
So, for example, if a respondent’s final, reconciled
response via the Trade-off mode was closest to the
feedback response from the Ratio method, he was
categorized as reconciling toward the Ratio method.
Eppel (1990) found that for the Trade-off reconciliation mode, the largest fraction (6/16) of respondents
reconciled toward the Trade-off weighting feedback
value, and for the Pricing reconciliation mode, the
largest fraction (also 6/16) of respondents reconciled
toward the Pricing out weighting feedback value. He
interpreted this as evidence of a tendency to reconcile toward the method that is most compatible with
the corresponding feedback format. He noted that this
could be partly because of a memory effect, since the
feedback indifference values for the Trade-off weighting method were identical to the ones the respondent
gave in the initial elicitation.
However, Eppel (1990) also suggested that the question remains open whether this could be because of
a more fundamental “compatibility effect” similar to
the one discussed in the literature (e.g., Tversky et al.
1988, Slovic et al. 1990, Delquié 1993, Anderson and
Hobbs 2002), where the response (in this case, the reconciliation strategy) is biased toward the most compatible stimulus (in this case, the feedback format).
If so, reconciled weights may be biased by the particular process by which they were obtained. A second,
follow-up experiment by Eppel (1990) for a different
decision context with an expanded set of reconciliation modes produced confirmatory results. Eppel
(1990) also found that, in general, reconciled weights
failed to conform to certain other desirable requirements: they displayed the well-known “splitting bias”
(e.g., Borcherding and von Winterfeldt 1988, Weber
et al. 1988) and failed to appropriately respond to
changes in attribute ranges (e.g., Fischer 1995, Keeney
2002). Both of these effects had been present in the
initial weights as well.
In contrast to Eppel (1990), Hobbs and Horn’s
(1997) study reflected a shift away from a focus on satisfying such desirable criteria to a concern with user
acceptance and confidence in final weights. Hobbs
and Horn (1997) suggested that respondents found
the more flexible one-on-one format of personal interviews to be essential to a reconciliation process that
increases learning and builds user confidence. But it
would seem that this would tend to compromise one’s
ability to maintain experimental control across interviews because the style of the experimenter might
change from one interview to the next (as pointed out
by Eppel 1990). Analyst influence is one of the factors
reported to affect preferences in a constructive context
(e.g., Fischhoff et al. 1980, Brown 2005). Perhaps more
importantly, the behavioral rationale for the elicitation and reconciliation design is not specified, so no
basis is provided for having confidence that the final
weights are unbiased. Still, in light of the difficulties
encountered in obtaining satisfactory weights, Eppel
(1990) concluded that perhaps a broader viewpoint
on what constitutes a valid or high-quality preference
assessment process is needed, e.g., one that also considers user confidence. In other words, it may be that
no single preference assessment process is likely to
satisfy all aspects of validity.
Is such a compromise really necessary? In distinction from Eppel (1990), we suggest that an improved
process, both initial elicitation as well as an appropriately designed reconciliation, can lead to weights
that are both bias free and satisfy users. Freedom from
bias (here, elicitation mode bias) remains the primary
determinant of validity, but we would hope that the
implied user confidence that his or her preferences
have been accurately assessed would be one of the
immediate and most important fruits of the process.
This is what should happen in reconciliation of value
weights, though our observations in this section suggest that this is not necessarily what does happen. In
the next section, we turn to discussion of what a process that does both might look like, linking the proposed improvements to the particular issue they are
designed to address.
3. Elicitation and Reconciliation to Learn Both Methods and Preferences
To move toward the design of an elicitation and reconciliation procedure that addresses the dual goals
of freedom from bias and user acceptance, we offer
the following fundamental principle: because respondents are constructing or learning about their preferences during elicitation, decision-analytic techniques
should facilitate this process (Payne et al. 1999,
Gregory et al. 2005). Arguably, then, decision analysis should provide a framework so that respondents
may adjust their initial responses as they better understand how the relevant values (e.g., their basic values) apply to the decision situation at hand. In short,
it should provide an environment in which they can
learn about their preferences. That this is done in a
behaviorally effective way is the first requirement of
an improved elicitation and reconciliation design.
Previous reconciliation procedures may not effectively facilitate the learning process. Some respondents have found the trade-off weighting approach
to eliciting value weights (Keeney and Raiffa 1976)
too complex, suggesting this approach is sometimes
a hindrance rather than a help (e.g., Fischer 1995,
Hobbs and Horn 1997). More fundamentally, previous procedures may have failed to adequately instruct
respondents in conducting the more deliberate and
quantitative introspection about preferences that is
required. Thus it becomes unclear whether adjustments during reconciliation actually represent learning about (e.g., constructing) preferences. They may
instead signal confusion about how the methods
work, how to do the thinking required, or even about
what kind of thinking is required. Kahneman and
Frederick (2002) discuss a dual-process model of cognitive operations in which more deliberative, reason-driven judgments may endorse, correct, or override
judgments that are more intuitive, affective, and error
prone (see also Kahneman 2011). Facilitating the more
deliberate judgments is, of course, the role of decision analysis, but if respondents maintain what may
be a customarily more intuitive style in providing
judgments, then decision analysis fails to carry out its
approach toward facilitating learning. In fact, these
two styles of thinking, reason and intuition, are inextricably linked in decision making, so to be more
successful an elicitation procedure must bring the
two together (Finucaine et al. 2003). This suggests
that as a second requirement, an improved elicitation
and reconciliation design must seek to ensure that
the more deliberative way of thinking about preferences is actually taking place and interfacing with the
more intuitive.
These two requirements are addressed by a three-step process: (1) training, designed to build skill
and familiarity with the style of thinking required;
(2) practice, where the learned elicitation tasks are performed for a decision with which the respondent is
familiar; and (3) application, where the elicitation tasks,
including any necessary reconciliation of inconsistent
judgments, are applied to the actual decision context
that we are interested in for the analysis. We proceed
to explain each of these steps in detail, including our
rationale for how it satisfies the two requirements just
elucidated.
3.1. Training
In an approach we adapt here in our first step, Huber
et al. (2002) describe the use of training to prepare
respondents in a principal–agent study. The respondents are given a set of part-worths (the overall preference or value associated with each specified level of
each attribute1 ) that they are to apply in performing
a series of choice and matching tasks, and then given
feedback on the accuracy of their responses. Because
the respondents are able to focus exclusively on how
the judgment tasks are used to translate the known
part-worths, it is likely that by the end of training they
would be highly confident in making the judgments
and introduced to a way of thinking deliberately and
quantitatively about preferences.
1 For example, using value model (1), the part-worth associated with option x on attribute i would be $w_i v_i(x_i)$.
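A brief sketch of how such training feedback can be generated from known part-worths: the correct answer to a choice task is simply the alternative with the larger sum of part-worths. The part-worth values below are hypothetical stand-ins for those provided in Figure 1.

```python
# Training feedback from known part-worths w_i * v_i(x_i) (see footnote 1).
# The part-worth tables below are hypothetical stand-ins for Figure 1.
part_worths = {
    "total_cost":    {"$900": 0.00, "$600": 0.20, "$300": 0.45},
    "slope_quality": {"70": 0.00, "80": 0.15, "90": 0.30},
    "snow_chance":   {"50%": 0.00, "70%": 0.10, "90%": 0.25},
}

def total_value(option):
    """Sum the part-worths of an option described attribute by attribute."""
    return sum(part_worths[attr][level] for attr, level in option.items())

option_a = {"total_cost": "$300", "slope_quality": "70"}
option_b = {"total_cost": "$900", "slope_quality": "90"}
correct = "A" if total_value(option_a) > total_value(option_b) else "B"
print(correct)  # "A": the $300-vs-$900 cost swing outweighs the 70-vs-90 quality swing
```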
3.2. Practice
In this second step, the respondents first encode single attribute value functions, and then carry out the
choice and matching tasks just learned for a decision for which they have everyday familiarity. Value
weights can be derived from responses to the choice
and matching tasks. An example of a familiar decision is buying groceries, where there is direct interaction with the market, and where through repeated
experience a stable set of preferences is likely to
have been formed, in particular, monetary values
associated with the experienced consequences of purchasing decisions. This step involves only an incremental level of difficulty compared to the previous
step because, using methods with which they are now
familiar, respondents need only encode their personal,
well-known values.
We hypothesize that, after these first two steps,
the respondents’ status with respect to understanding
preference-elicitation methods and quantitative ways
of thinking about preferences will be significantly
improved in two specific ways. First, the respondents should possess a significantly greater comfort
level and skill with the methods involved in performing the judgment tasks. Second, the respondents
may have gained important insights through contrasting the outcomes of familiar decisions made in
a customarily more intuitive style with outcomes
of the decision analytic approach that seeks to be
more deliberate and consistent. We would hypothesize that any variability or inconsistency in preferences observed in the next step (application to the real problem) will be because of a lack of understanding of
(or certainty about) preferences, and not lack of understanding of the assessment methods or quantitative
way of thinking about preferences. With the removal
of this confounding factor related to method, an environment would have been created for respondents to
effectively learn quantitatively about (e.g., construct)
their preferences in the real problem of interest and
for unambiguous identification (by the analyst) of any
behavior indicating poor preference construction.
3.3. Application
Finally, the learned elicitation methods and more
deliberative thinking style are carried out to address
the decision problem of interest. Elicitation and reconciliation can now be expected to be a more deliberative process and to reflect quantitative learning
about one’s preferences, integrating this style of thinking with the more intuitive and qualitative. Because
the elicitation methods as well as the types of thinking involved in making the judgments are now better
understood and integrated, we hypothesize that this
learning will also produce a high level of user acceptance of the methods and their results.
Choice and matching tasks are chosen, based on a behavioral rationale, as the two different methods to use in carrying out the redundant assessments and consistency checks involved in deriving value weights. In one of the strongest preference biases documented in the decision-making literature, choice
tasks have been shown to lead to steeper (more
unequal) weights, whereas matching (direct trade-off) tasks have been shown to lead to flatter (more
equal) weights (Tversky et al. 1988, Fischer et al.
1999). This effect has been labeled the prominence
effect, for which a recent explanation is that response
tasks, like choice, that have as a goal differentiating
between alternatives tend to give more weight to the
more important (prominent) attribute than tasks such
as matching that have equating alternatives as their
goal (Fischer et al. 1999). Therefore, all other things
being equal, it is plausible that using them together
is likely to create offsetting tendencies and result in
weights that better reflect user preferences. This idea
has been described differently: (i) as using a portfolio
of assessment procedures for which the biases cancel out (Kleinmuntz 1990); and (ii) as attempting to
overcome inconsistencies among methods by triangulation of responses (Payne et al. 1999). Our point is
that using these two particular methods together provides a behavioral basis for expecting weighting judgments to be free of a major, known weighting bias.
4. Demonstration: Attempting to Mitigate the Prominence Effect Through Training and Practice in Elicitation Methods
Two state agency natural resource managers provided
decision-analytic preference judgments to construct
value model (1) for a decision problem involving setting harvest policy for management of oyster fishery habitat in North Carolina, United States. Decision maker 1 provided judgments from which single
attribute value functions and value weights were
derived using choice and matching tasks for a two-attribute problem, whereas decision maker 2 did the
same for a three-attribute version of the problem.
Before addressing the attributes of interest, both individuals underwent the training and practice steps
described earlier. As stated at the outset, although our
example demonstrates the feasibility of the approach,
it is not meant to provide controlled experimental evidence.
4.1. Training
The respondents were presented with part-worths,
which reflect linearity of the value functions, and
are assumed to be known (Figure 1). An example
of a choice training question, including the feedback
that was given if the chosen alternative is wrong, is
given in Figure 2(a). For this question, the correct
choice is A, because the length of the “Total cost” bar reflected by the poor–good difference is greater than that reflected by the “Ski slope quality” difference.
Several training choice tasks of differing complexity
were administered.
For training in matching tasks, the respondents
learned the correct answer ($573 in Figure 2(b))
after generating their estimates. Additional feedback depended on whether responses were within
10%, 20%, or >20% of the magnitude of the correct response. An answer within 10% elicited a “Very
good” response, 10%–20% elicited an “OK,” whereas
>20% elicited, “That’s not very accurate.” Again, several matching training tasks were administered. Both
respondents gradually developed facility in making
the judgments, including quite subtle ones.
4.2. Practice
In this step the respondents chose a familiar, two-attribute problem to provide single attribute value
function, choice, and matching judgments. For one
decision maker it was a decision on purchasing
a favorite fruit from the grocery store, with the
attributes cost per item and distance to the location
where the fruit was grown (as proxy for environmental performance). Attribute value functions were obtained by assigning zero points to the worst outcome and 100 points to the best and having the decision maker assign the midpoint of the attribute scale to a point (between 0 and 100) reflecting its importance relative to the attribute's endpoints (Fishburn 1967). Curvature in these functions was then fit with a one-parameter exponential function (Keeney 1992).

Figure 1. Known Part-Worths and Instructions Provided to Respondents in the Training Stage of Elicitation and Reconciliation
[Bar chart showing the relative importance of the shift from poor to fair to good for three attributes: Total cost ($900-$600-$300), Ski slope quality (C-70, B-80, A-90), and Likelihood of excellent snow (50%, 70%, 90%).]
Note. The length of each bar for each attribute reflects the value provided by that attribute. The dark part reflects the value of moving from poor to fair, whereas the lighter segment reflects the value of moving from fair to good.
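The value-function encoding just described (before Figure 1) can be sketched as follows: the decision maker's 0-100 judgment for the attribute midpoint pins down the constant of a one-parameter exponential value function. The bisection routine and example numbers are our reconstruction, not necessarily the exact procedure the authors used.

```python
# Fit a one-parameter exponential value function (cf. Keeney 1992) so that its
# value at the attribute midpoint matches the decision maker's 0-100 judgment
# (rescaled to 0-1). This is a reconstruction of the procedure in the text.
import math

def exp_value(x, lo, hi, c):
    """Exponential value function scaled to 0 at lo and 1 at hi."""
    z = (x - lo) / (hi - lo)
    if abs(c) < 1e-9:  # c -> 0 recovers the linear value function
        return z
    return (1 - math.exp(-c * z)) / (1 - math.exp(-c))

def fit_exponential(lo, hi, midpoint_value):
    """Find c so that the value at the midpoint equals the elicited value (0-1)."""
    def f(c):
        return exp_value((lo + hi) / 2, lo, hi, c) - midpoint_value
    a, b = -50.0, 50.0      # bracket; the midpoint value is increasing in c
    for _ in range(200):    # simple bisection
        m = (a + b) / 2
        if f(a) * f(m) <= 0:
            b = m
        else:
            a = m
    return (a + b) / 2

# Hypothetical example: fruit cost from $0.50 to $2.00, midpoint judged worth 70/100.
c = fit_exponential(0.50, 2.00, 0.70)
print(round(c, 3), round(exp_value(1.25, 0.50, 2.00, c), 3))  # c ~ 1.69, value ~ 0.70
```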
Our main interest was in the choice and matching judgments for which training had been provided. Both respondents carried out several choice and matching judgments on these familiar decision problems, improving both their facility with the methods as well as with the type of thinking involved in learning quantitatively about their preferences. There was informal evidence that the latter process was taking place. The respondent who used the fruit purchase as her familiar decision commented that prior to the exercise her intuitive judgment was that both attributes under consideration were "important," but that this approach forced her to think about her tradeoffs more specifically. She expressed surprise at identifying the limits on how much extra she would spend to purchase a fruit that was grown closer to the purchase location. She noted that she could feel the tension between the more analytic, deliberative, and the more intuitive style of thinking about the decision.

Figure 2. (a) Choice and (b) Matching Training Exercises Administered to Respondents
(a) Choice training exercise: Which one would you choose, A or B? Attributes: Total cost ($900-$600-$300); Ski slope quality (C-70, B-80, A-90).
A = ($300, 70); B = ($900, 90).
Feedback: The correct answer is A. The importance of $300 versus $900 in total cost is greater than the importance of 70 versus 90 in quality.
(b) Matching training exercise: What cost would make these equally valuable? Attributes: Total cost ($900-$600-$300); Likelihood of excellent snow (50%, 70%, 90%).
A = ($300, 50%); B = (cost to be filled in, 90%).
Feedback: Correct answer = $573. Within $515–$630: "Very good"; within $458–$687: "OK"; otherwise: "That's not very accurate."
4.3. Application
The problem of interest involved three attributes, listed in Table 1. Decision maker 1 provided judgments for the first two attributes, and decision maker 2 for all three.2 The first attribute (E) is constructed (Keeney and Gregory 2005) to consist of levels of factors that are combined to reflect the economic impact of the oyster fishery (Table 2). Decision maker 1 viewed the second attribute (C) as an indicator of habitat quantity for visual foragers, whereas decision maker 2 viewed it as an indicator of the presence of, or potential for, algal blooms. The third attribute (P) is an indication of the enhancement of species abundance or biodiversity because of the habitat effect of the oyster reef (Peterson et al. 2003) and was used to capture the concern of decision maker 2 for ecosystem resilience. First choice and then matching judgments were made; the tasks are displayed in Figures 3 and 4, respectively, for decision maker 1. To produce a set of value weights for the attributes, each of the six matching judgments was combined with the assessed single attribute value functions using a standard approach (Keeney and Raiffa 1976), yielding six replicate sets of value weights for each round of elicitations.

2 The process described here does not reflect a comprehensive introspection and structuring of values and objectives; instead, a few objectives and their associated attributes or measures are chosen to illustrate the issues raised in this paper.

Figure 3. Choice Tasks for Management of North Carolina Oyster Fishery Habitat for Decision Maker 1
Choice: Which one would you choose, A or B? Attributes: Economic impact of commercial fishery (1-3-5); Water clarity in lower NRE (0 m-1.5 m-3 m).
Task 1: A = (5, 1.5 meters) vs. B = (1, 3 meters)
Task 2: A = (5, 1.5 meters) vs. B = (3, 3 meters)
Task 3: A = (5, 1.5 meters) vs. B = (4, 3 meters)
Task 4: A = (5, 0 meters) vs. B = (3, 1.5 meters)
Task 5: A = (5, 0 meters) vs. B = (2, 1.5 meters)
Task 6: A = (5, 2 meters) vs. B = (4, 3 meters)
Task 7: A = (5, 2 meters) vs. B = (3, 3 meters)
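In the two-attribute case, a single matching judgment determines the weights once the single attribute value functions are fixed: at indifference, the weighted value gain on one attribute equals the weighted value loss on the other, and the sum-to-1 normalization pins down both weights. The sketch below assumes linear value functions and a hypothetical response to the first task shown in Figure 4; the actual analysis used the assessed (nonlinear) value functions.

```python
# Derive two-attribute weights from one matching (indifference) judgment,
# following the standard approach cited in the text (Keeney and Raiffa 1976).
# If A = (econ_a, clarity_a) and B = (econ_b, clarity_b) are judged equally
# valuable, model (1) gives
#   w_E * [v_E(econ_b) - v_E(econ_a)] = w_C * [v_C(clarity_a) - v_C(clarity_b)],
# and w_E + w_C = 1. Linear value functions are assumed here for simplicity.
def weights_from_match(econ_a, clarity_a, econ_b, clarity_b,
                       econ_range=(1.0, 5.0), clarity_range=(0.0, 3.0)):
    v_e = lambda e: (e - econ_range[0]) / (econ_range[1] - econ_range[0])
    v_c = lambda c: (c - clarity_range[0]) / (clarity_range[1] - clarity_range[0])
    ratio = (v_c(clarity_a) - v_c(clarity_b)) / (v_e(econ_b) - v_e(econ_a))  # w_E / w_C
    w_c = 1.0 / (1.0 + ratio)
    return 1.0 - w_c, w_c  # (w_E, w_C), normalized to sum to 1

# Hypothetical response to the first Figure 4 task: A = (econ 1, clarity 3 m)
# judged indifferent to B = (econ 5, clarity 1.2 m).
print(weights_from_match(1, 3.0, 5, 1.2))  # approximately (0.375, 0.625)
```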
Figure 5 shows the results of a series of elicitations with both decision makers, three rounds for
decision maker 1 (Figures 3 and 4 tasks repeated in
each round) and two for decision maker 2, culminating in a final value weight for each assessed attribute,
indicated by the arrows. By convention, weights were
normalized to sum to 1. Decision maker 1 had two
attributes, so all graphed points fall on the diagonal. For example, $(w_E, w_C) = (0.5, 0.5)$ would be in
the middle of the diagonal. In contrast, the fact that
decision maker 2 included a third attribute explains
why the results are pulled inward from the sum-to-1
line, because all three assessed weights were nonzero.
After each round of elicitations, the decision maker was asked to explain how she arrived at her judgments. If her responses reflected any methodological misunderstanding, this was explained, and another round of elicitations was carried out. Thus, variability in weights in the final round of elicitations is expected to reflect not method but preference uncertainty. The weights derived from the matching judgments along with the single attribute value functions were also used to score the alternatives in the choice tasks using value model (1) so that the degree of consistency between choice and matching tasks could be assessed (Figure 6). The elicitation and reconciliation process led to the following observation that relates to the issue of method understanding versus learning about preferences previously discussed.

Figure 4. Matching Tasks for Management of North Carolina Oyster Fishery Habitat for Decision Maker 1
Matching: What level of water clarity would make these equally valuable? Attributes: Economic impact of commercial fishery (1-3-5); Water clarity in lower NRE (0 m-1.5 m-3 m).
Task 1: A = (1, 3 meters) vs. B = (5, water clarity to be filled in)
Task 2: A = (1, 3 meters) vs. B = (4, water clarity to be filled in)
Task 3: A = (1, 3 meters) vs. B = (3, water clarity to be filled in)
Task 4: A = (1, 3 meters) vs. B = (2, water clarity to be filled in)
Task 5: A = (4, 3 meters) vs. B = (5, water clarity to be filled in)
Task 6: A = (3, 3 meters) vs. B = (5, water clarity to be filled in)

Table 1. Attribute Definitions and Ranges Used to Evaluate Oyster Restoration Policy Options
E: Annual statewide economic impact due to the north subtidal commercial oyster fishery. Range: 1–5 (unitless).
C: Water clarity depth just downstream of a subtidal oyster reef in the lower Neuse River estuary, North Carolina, U.S., given average, July, steady-state hydrographic conditions. Range: 0–3 m.
P: Enhanced annual production of finfish and crustaceans on subtidal oyster reef in the lower Neuse River estuary, North Carolina, U.S. Range: 0–5.2 kg/10 m2.

Table 2. Attribute for Annual Economic Impact of North Carolina North Subtidal Commercial Fishery, Where Level 1 Is Defined as the Worst Outcome and Level 5 Is Defined as the Best Outcome
Level | Ex-vessel value | Fishermen (w/crew) | Total statewide impact | Additional jobs created
1 | $600,000 | 617 | $976,512 | 4.9
2 | $900,000 | 789 | $1,500,346 | 7.8
3 | $1,200,000 | 895 | $2,050,878 | 9.9
4 | $1,500,000 | 993 | $2,653,057 | 11.6
5 | $1,800,000 | 1,013 | $3,193,762 | 13.8
Note. See North Carolina Division of Marine Fisheries (2007, Table 8.4).

Figure 5. Value Weights Derived from Matching Judgments for (a) Decision Maker 1 (See Figure 4 for the Actual Tasks) and (b) Decision Maker 2
[(a) Scatter plot of wC (vertical axis) against wE (horizontal axis) for decision maker 1; (b) scatter plot of wC (vertical axis) against wP (horizontal axis) for decision maker 2.]
Notes. Arrows indicate the final matching judgment in the final series of elicitations for each decision maker. Squares, elicitation 1; circles, elicitation 2; triangles, elicitation 3.
Matching- and choice-based assessments were
essentially in agreement by the end of the final
round of elicitations for both decision makers, obviating the need for separate reconciliation. This was
evaluated by using the computed weights from the
matching tasks to compute overall value of A versus B to see whether the choice recommended by
the assessed model matched the choice made by the
decision makers. The final matching judgment produced weights (indicated by the arrows in Figure 5)
that were in agreement with six of seven choices for
decision maker 1 and nine of nine choices for decision maker 2. The only “disagreement” for decision
maker 1 was for a choice task in which the alternatives were felt to be equally desirable (a tie), but
that was scored by the matching task in favor of one
of the alternatives. This process and result provide
a basis for regarding these final weights as a better
representation of respondent preferences than would
be obtained by either type of task alone. It suggests
that the well-documented tendency of choice and
matching tasks to produce inconsistent preferences is
absent, a result that may be because of the training
and practice conducted. Specifically, we hypothesize
that the training and practice in the steps enabled
the respondents to learn the methods well enough
so that at the application step distraction from focusing on integrating quantitative and qualitative learning about preferences was minimized. The resulting
weights appear to be free of the prominence effect
for these two decision makers, increasing validity: the
two methods agree, the weights being approximately
equal.
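The agreement check can be expressed directly: score both alternatives of each choice task with the matching-derived model and compare the model's recommendation with the stated choice. The weights, value functions, and task in this sketch are hypothetical rather than the elicited ones.

```python
# Check choice-matching agreement: score both alternatives of a choice task with
# matching-derived weights and compare the model's recommendation with the stated
# choice. The weights, value functions, and task below are hypothetical.
def score(option, weights, value_functions):
    return sum(weights[a] * value_functions[a](option[a]) for a in weights)

value_functions = {
    "econ":    lambda e: (e - 1) / 4.0,  # range 1-5, linear for illustration
    "clarity": lambda c: c / 3.0,        # range 0-3 m, linear for illustration
}
weights = {"econ": 0.45, "clarity": 0.55}  # hypothetical matching-derived weights

task = {"A": {"econ": 5, "clarity": 1.5}, "B": {"econ": 1, "clarity": 3.0}}
stated_choice = "A"
model_choice = max(task, key=lambda k: score(task[k], weights, value_functions))
print(model_choice == stated_choice)  # True: the model agrees with the stated choice
```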
5. Next Steps Toward Operationalization of the Payne et al. (1999) “Building Code”
In 2009, the North Carolina Coastal Federation
received a $5 million federal economic stimulus grant
for habitat conservation with the goal of putting private industry to work rebuilding the state’s oyster
reefs (North Carolina Coastal Federation 2009). The project moved oyster restoration plans ahead by several years, and relieved much of the financial constraint that was making decisions to take immediate action on oyster restoration difficult. The stimulus grant-funded project's emphasis on both environment and economy (e.g., job creation) was quite consistent with the desire of the decision makers, as reflected in our results, to put emphasis on economy as well as environment.

Figure 6. Example of 1 of 9 Choice Tasks for Decision Maker 2 Indicating Agreement Between Choice and Matching Tasks
Attributes: Economic impact (1-3-5); Water clarity (0 m-1.5 m-3 m); Enhanced production (0-2.6-5.2 kg/10 m2); Score.
A (chosen, "X"): 5 | 1.5 meters | 2.6 kg/10 m2 | score 0.73
B: 1 | 3 meters | 2.6 kg/10 m2 | score 0.55
Notes. "X" indicates his choice. The "Score" column indicates the evaluation of alternatives A and B using value model (1) based on elicited single attribute value functions and the final set of value weights (indicated by the arrow in Figure 5(b)).
This paper has proposed and demonstrated a procedure to elicit and reconcile multiattribute value
weights, resulting in weights that appear to (a) be
free of one known weighting bias and (b) elicit a high
degree of user acceptance and confidence. Although
we do not provide detailed evidence that would allow
us to generalize from our observations, we believe
that our results provide steps forward in operationalizing the “building code” of Payne et al. (1999).
In particular, we recommend that any such operationalization procedure should address the following
issues: First, how do we design training and practice
to provide greater facility with elicitation? Second,
how do we facilitate judgments that are both more bias free and more likely to elicit user confidence? Third, how can
the training and practice stages of a given procedure
lead to the reduction of method misunderstanding
so that remaining variability in preferences is mainly
because of uncertainty about preferences?
Acknowledgments
The authors thank two anonymous reviewers, an associate
editor, and former Editor-in-Chief L. Robin Keller for comments on an earlier draft. The opinions, findings, and conclusions expressed in this paper are those of the authors
and do not necessarily reflect the views of the National Science Foundation. This research was supported by the NSF
[SES-0922154].
References
Anderson RM, Hobbs BF (2002) Using a Bayesian approach
to quantify scale compatibility bias. Management Sci. 48(12):
1555–1568.
Barron FH, Barrett BE (1996) Decision quality using ranked
attribute weights. Management Sci. 42(11):1515–1523.
Bell ML, Hobbs BF, Elliott EM, Ellis JH, Robinson Z (2001) An evaluation of multi-criteria methods in integrated assessment of
climate policy. J. Multi-Criteria Decision Anal. 10(5):229–256.
Borcherding K, von Winterfeldt D (1988) The effect of varying
value trees on multiattribute evaluations. Acta Psychologica
68:153–170.
Borcherding K, Eppel T, von Winterfeldt D (1991) Comparison of
weighting judgments in multiattribute utility measurement.
Management Sci. 37(12):1603–1619.
Brown R (2005) Rational Choice and Judgment: Decision Analysis for
the Decider (Wiley, Hoboken, NJ).
Carlson KA, Meloy MG, Russo JE (2006) Leader-driven primacy:
Using attribute order to affect consumer choice. J. Consumer
Res. 32(4):513–518.
Cronbach L, Meehl P (1955) Construct validity in psychological
tests. Psych. Bull. 52(4):281–302.
Delquié P (1993) Inconsistent trade-offs between attributes: New
evidence in preference assessment biases. Management Sci.
39(11):1382–1395.
Delquié P (1997) “Bi-matching”: A new preference assessment
method to reduce compatibility effects. Management Sci. 43(5):
640–658.
Edwards W (1990) Unfinished tasks: A research agenda for behavioral decision theory. Hogarth RM, ed. Insights in Decision Making: A Tribute to Hillel J. Einhorn (University of Chicago Press,
Chicago), 44–65.
Edwards W, Barron FH (1994) SMARTS and SMARTER: Improved
simple methods for multiattribute utility measurement. Organ.
Behav. Human Decision Processes 60(3):306–325.
Eppel T (1990) Eliciting and reconciling multiattribute weights.
Unpublished doctoral dissertation, University of Southern
California, Los Angeles.
Finucaine ML, Peters E, Slovic P (2003) Judgment and decision making: The dance of affect and reason. Schneider SL,
Shanteau J, eds. Emerging Perspectives on Judgment and Decision Research (Cambridge University Press, Cambridge, UK),
327–364.
Fischer GW (1995) Range sensitivity of attribute weights in multiattribute value models. Organ. Behav. Human Decision Processes
62(3):252–266.
Fischer GW, Carmon Z, Ariely D, Zauberman G (1999) Goal-based
construction of preferences: Task goals and the prominence
effect. Management Sci. 45(8):1057–1075.
Fischhoff B, Slovic P, Lichtenstein S (1980) Knowing what you want:
Measuring labile values. Wallsten TS, ed. Cognitive Processes
in Choice and Decision Behavior (Lawrence Erlbaum Associates,
Hillsdale, NJ), 117–141.
Fishburn PC (1967) Methods of estimating additive utilities. Management Sci. 13(7):435–453.
Gilovich T, Griffin D, Kahneman D, eds. (2002) Heuristics and
Biases: The Psychology of Intuitive Judgment (Cambridge University Press, New York).
Gregory R (1999) Commentary on “Measuring constructed preferences: Towards a building code” by Payne, Bettman, and
Schkade. J. Risk Uncertainty 19(1–3):273–275.
Gregory R (2004) Valuing risk management choices. McDaniels T,
Small M, eds. Risk Analysis and Society: An Interdisciplinary
Characterization of the Field (Cambridge University Press, New
York), 213–250.
Gregory R, Fischhoff B, McDaniels T (2005) Acceptable input: Using
decision analysis to guide public policy deliberations. Decision
Anal. 2(1):4–16.
Hämäläinen RP, Alaja S (2008) The threat of weighting biases
in environmental decision analysis. Ecological Econom. 68(1–2):
556–569.
Hobbs BF, Horn GTF (1997) Building public confidence in energy
planning: A multimethod MCDM approach to demand-side
planning at BC Gas. Energy Policy 25(3):357–375.
Hobbs BF, Chankong V, Hamadeh W (1992) Does choice of multicriteria method matter? An experiment in water resources
planning. Water Resources Res. 28(7):1767–1779.
Huber J, Ariely D, Fischer G (2002) Expressing preferences in
a principal-agent task: A comparison of choice, rating, and
matching. Organ. Behav. Human Decision Processes 87(1):66–90.
Huber J, Payne JW, Puto CP (1982) Adding asymmetrically dominated alternatives: Violations of regularity and the similarity
hypothesis. J. Consumer Res. 9(1):90–98.
Huber J, Wittink DR, Fiedler JA, Miller R (1993) The effectiveness
of alternative preference elicitation procedures in predicting
choice. J. Marketing Res. 30(1):105–114.
Jacobi SK, Hobbs BF (2007) Quantifying and mitigating the splitting
bias and other value tree-induced weighting biases. Decision
Anal. 4(4):194–210.
Jia J, Fischer GW, Dyer J (1998) Attribute weighting methods and
decision quality in the presence of response error: A simulation
study. J. Behav. Decision Making 11(2):85–105.
John RS (1984) Value tree analysis of social conflicts about risky
technologies. Unpublished doctoral dissertation, University of
Southern California, Los Angeles.
Kahneman D (2011) Thinking, Fast and Slow (Farrar, Straus and
Giroux, New York).
Kahneman D, Frederick S (2002) Representativeness revisited:
Attribute substitution in intuitive judgment. Gilovich T,
Griffin D, Kahneman D, eds. Heuristics and Biases: The
Psychology of Intuitive Judgment (Cambridge University Press,
New York), 49–81.
Keeney RL (1992) Value-Focused Thinking (Harvard University Press,
Boston).
Keeney RL (2002) Common mistakes in making value trade-offs.
Oper. Res. 50(6):935–945.
Keeney RL, Gregory RS (2005) Selecting attributes to measure the
achievement of objectives. Oper. Res. 53(1):1–11.
Keeney RL, Raiffa H (1976) Decisions with Multiple Objectives: Preferences and Value Tradeoffs (John Wiley and Sons, New York).
Keeney RL, Renn O, von Winterfeldt D (1987) Structuring West
Germany’s energy objectives. Energy Policy 15(4):352–362.
Kleinmuntz DN (1990) Decomposition and the control of error in
decision analytic models. Hogarth RM, ed. Insights in Decision
Making: A Tribute to Hillel J. Einhorn (University of Chicago
Press, Chicago), 107–126.
Lichtenstein S, Slovic P, eds. (2006) The Construction of Preference
(Cambridge University Press, New York).
North Carolina Coastal Federation (2009) North Carolina estuary habitat restoration: Coastal restoration at work. Accessed
December 4, 2012, http://www.nccoast.org/Northeast/pdfs/
NOAA-stimulus-Fact-Sheet.pdf.
North Carolina Division of Marine Fisheries (2007) North Carolina
oyster fishery management plan amendment II. Division of
Marine Fisheries, North Carolina Department of Environment
and Natural Resources, Morehead City, NC.
Payne JW, Bettman JR, Johnson EJ (1993) The Adaptive Decision
Maker (Cambridge University Press, Cambridge, UK).
Payne JW, Bettman JR, Schkade DA (1999) Measuring constructed
preferences: Towards a building code. J. Risk Uncertainty 19(1–3):
243–270.
Peterson CH, Grabowski JH, Powers SP (2003) Estimated enhancement of fish production resulting from restoring oyster reef
habitat: Quantitative valuation. Marine Ecology Progress Ser.
264:249–264.
Schoemaker PJH, Waid CC (1982) An experimental comparison of
different approaches to determining weights in additive utility
models. Management Sci. 28(2):182–196.
Slovic P, Griffin D, Tversky A (1990) Compatibility effects in judgment and choice. Hogarth RM, ed. Insights in Decision Making: A Tribute to Hillel J. Einhorn (University of Chicago Press,
Chicago), 5–27.
Tversky A, Sattath S, Slovic P (1988) Contingent weighting in judgment and choice. Psych. Rev. 95(3):371–384.
von Winterfeldt D, Edwards W (1986) Decision Analysis and Behavioral Research (Cambridge University Press, New York).
Weber M, Eisenführ F, von Winterfeldt D (1988) The effects of splitting attributes on weights in multiattribute utility measurement. Management Sci. 34(4):431–445.
Richard M. Anderson is a research scientist at the Puget
Sound Institute at University of Washington Tacoma, where
he is using decision analysis to help guide protection and
restoration of the Puget Sound in Washington. He has
broad interests in engineering and decision analysis, having
previously worked at Pacific Northwest National Laboratory as a risk analyst and served as assistant professor in
the Nicholas School of the Environment at Duke University. Special areas of application include biases in preference assessment and Bayesian network models in natural
resource management. He holds a Ph.D. in environmental
systems analysis from the Department of Geography and
Environmental Engineering at Johns Hopkins University.
Robert Clemen is professor emeritus of decision sciences
at Duke University’s Fuqua School of Business. He has
broad interests in the use of decision analysis for organizational decision making, and special interests in the psychology of judgment, assessing expert probabilities, the effectiveness of decision-making techniques, and using decision
analysis to help organizations become environmentally sustainable. The author of the widely used text Making Hard
Decisions, he is also founding co-Editor-in-Chief of Decision
Analysis. He has published more than 40 scholarly articles
in journals such as Management Science,
Operations Research, the International Journal of Forecasting, and Decision Analysis.