Quantitative and qualitative approaches to reasoning under
uncertainty in medical decision making
John Fox, David Glasspool and Jonathan Bury
Imperial Cancer Research Fund Labs
Lincoln's Inn Fields
London WC2A 3PX
United Kingdom
jf@acl.icnet.uk; dg@acl.icnet.uk; jb@acl.icnet.uk
Abstract: Medical decision making frequently requires the effective
management and communication of uncertainty and risk. However, a tension
exists between classical probability theory, which is precise and rigorous but
which people find non-intuitive and difficult to use, and qualitative approaches
which are ad hoc but can be more versatile and easily comprehensible. In this
paper we review a range of approaches to uncertainty management, then
describe a logical approach, argumentation, which subsumes qualitative as well
as quantitative representations and has a clear formal semantics. The approach
is illustrated and evaluated in five decision support applications.
1 Introduction
Representing and managing uncertainty is central to understanding and supporting
much clinical decision-making and is extensively studied in AI, computer science and
psychology. Implementing effective decision models for practical clinical applications
presents a dilemma. On the one hand, informal and qualitative representations of
uncertainty may be natural for people to understand but they often lack formal rigour.
On the other hand formal approaches based on probability theory are precise but can
be awkward and non-intuitive to use. While in many cases probability theory is an
ideal for optimal decision-making, it is often impractical.
In this paper we review a range of approaches to uncertainty representation, then
describe a logical approach, argumentation, which can subsume both qualitative and
quantitative approaches within the same formal framework. We believe that this may
enable improvements in both the scope and comprehensibility of decision support
systems and illustrate the approach with five decision support applications developed
by our group.
2 Decision Theory and Decision Support
It is generally accepted that human reasoning and decision-making can exhibit various
shortcomings when compared with accepted prescriptive theories derived from
mathematical logic and statistical decision-making, and there are systematic patterns
of distortion and error in people's use of uncertain information [1, 2, 3, 4]. In the
1970s and 1980s cognitive scientists generally came to the view that these
characteristic failures come about because people revise their beliefs by processes that
bear little resemblance to formal mathematical calculation. Kahneman and Tversky
developed a celebrated account of human decision-making and its weaknesses in
terms of what they called heuristics and biases [5]. They argued that people judge
things to be highly likely when, for example, they come easily to mind or are typical
of a class rather than by means of a proper calculation of the relative probabilities.
Such heuristic methods are often reasonable approximations for practical decision
making but they can also lead to systematic errors.
If people demonstrate imperfect reasoning or decision-making then it would
presumably be desirable to support them with techniques that avoid errors and comply
with rational rules. Mathematical logic (notably "classical" propositional logic and the
predicate calculus) is traditionally taken as the gold standard for logical reasoning and
deduction, while expected utility theory (EUT) plays the equivalent role for
decision-making. A standard view on the "correct" way to take decisions is summarised by
Lindley as follows:
"... there is essentially only one way to reach a decision sensibly. First, the
uncertainties present in the situation must be quantified in terms of values
called probabilities. Second, the consequences of the courses of actions
must be similarly described in terms of utilities. Third, that decision must
be taken which is expected on the basis of the calculated probabilities to
give the greatest utility. The force of 'must' used in three places there is
simply that any deviation from the precepts is liable to lead the decision
maker into procedures which are demonstrably absurd" [6], p.vii.
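The procedure Lindley describes can be stated very compactly: for each candidate
action, sum the utilities of its possible outcomes weighted by their probabilities,
and choose the action with the greatest sum. A minimal sketch in Python, with
invented probabilities and utilities, purely for illustration:

    def expected_utility(outcomes):
        # outcomes: list of (probability, utility) pairs for one action
        return sum(p * u for p, u in outcomes)

    # Invented numbers: treat immediately vs. await a test result.
    actions = {
        "treat now": [(0.7, 80), (0.3, 20)],   # EU = 62.0
        "wait":      [(0.5, 95), (0.5, 40)],   # EU = 67.5
    }
    best = max(actions, key=lambda a: expected_utility(actions[a]))
    print(best)   # "wait", the option with the greatest expected utility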
However, many people think that this overstates the value of mathematical methods
and understates the capabilities of human decision makers. There are a number of
problems with EUT as a practical method for decision making, and there are
indications that, far from being "irrational", human decision processes depart from
EUT because they are optimised to make various tradeoffs to address these problems
in practical situations.
An expected-utility decision procedure requires that we know, or can estimate
reasonably accurately, all the required probability and utility parameters. This is
frequently difficult in real-world situations since a decision may still be urgently
required even if precise quantitative data are not available. Even when it is possible to
establish the necessary parameters, the cost of obtaining good estimates may
outweigh the expected benefits. Furthermore, in many situations a decision is needed
before the decision options, or the relevant information sources, are fully known. The
complete set of options may only emerge as the decision making process evolves. The
potential value of mathematical decision theory is frequently limited by the lack of
objective quantitative data on which to base the calculations, the limited range of
functions that it can be used to support, and the problem that the underlying numerical
representation of the decision is very different from the intuitive understanding of
human decision makers.
Human decision-making may also not be as "absurd" as the normative theories
appear to suggest. One school of thought argues that many apparent biases and
shortcomings are actually artefacts of the highly artificial situations that researchers
create in order to study reasoning and judgement in controlled laboratory conditions.
When we look at real-world decision-making we see that human reasoning and
decision-making are more impressive than the research implies.
Shanteau [7] studied expert decision-making, “investigating factors that lead to
competence in experts, as opposed to the usual emphasis on incompetence”. He
identified a number of important positive characteristics of expert decision-makers:
First, they know what is relevant to specific decisions, what to attend to in a busy
environment, and they know when to make exceptions to general rules¹. Secondly,
experts know a lot about what they know, and can make decisions about their own
decision processes: they know which decisions to make and when, and which to skip,
for example. They often have good communication skills and the ability to articulate
their decisions and how they arrived at them. They can adapt to changing task
conditions, and are frequently able to find novel solutions to problems. Classical
deduction and probabilistic reasoning do not deal with these meta-cognitive skills.

¹ “Good surgeons, the saying goes, know how to operate, better surgeons know when to
operate and the best surgeons know when not to operate. That's true for all of medicine.”
(Richard Smith, Editor of the British Medical Journal.)
It has also been strongly argued that people are well adapted for making decisions
under adverse conditions: time pressure, lack of detailed information and knowledge
etc. Gigerenzer, for instance, has suggested that people make decisions in a "fast and
frugal" way, which is to say human cognition is rational in the sense that it is
optimised for speed at the cost of occasional and usually inconsequential errors [4].
2.1 Tradeoffs in Effective Decision-Making
The strong view expressed by Lindley and others is that the only rational or
"coherent" way to reason with uncertainty is to require that we comply with certain
mathematical axioms - the axioms of EUT. In practice compliance with these axioms
is often difficult because there is insufficient data to permit a valid calculation of the
expected utility of the decision options, or a valid calculation may require too much
time. An alternative perspective is that human decision-making is rational in that it
incorporates practical tradeoffs that, for example, trade a lower cost (e.g. errors in
decision-making) against a higher one (e.g. the amount of input data required or the
time taken to calculate expected utility).
Tradeoffs of this kind not only simplify decision-making but in practice may
entail only modest costs in the accuracy or effectiveness of decision-making.
Consequently the claim that we should not model decision-support systems on human
cognitive processes is less compelling than it may at first appear.
This possibility has been studied extensively in the field of medical decision-making.
In the prediction of sudden infant death, for example, Carpenter et al. [8]
attempted to predict death from a simple linear combination of eight variables. They
found that weights can be varied across a broad range without decreasing predictive
accuracy.
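A model of this kind simply sums weighted risk factors; a minimal sketch (the
factors and weights are invented, not those of [8]):

    def linear_score(factors, weights):
        # simple linear combination: sum of weight * factor value
        return sum(w * x for w, x in zip(weights, factors))

    infant = [1, 0, 1, 1, 0, 0, 1, 0]                       # eight binary risk factors
    print(linear_score(infant, [1] * 8))                    # equal weights: 4
    print(linear_score(infant, [2, 1, 3, 1, 2, 1, 1, 1]))   # varied weights: 7

The robustness finding is that the rankings produced by such scores change little
as the weights are varied.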
In diagnosing patients suffering from dyspepsia, Fox et al. [9] found that giving
all pieces of evidence equal weights produced the same accuracy as a more precise
statistical method (and also much the same pattern of errors). In another study [10]
we developed a system for the interpretation of blood data in leukaemia diagnosis,
using the EMYCIN expert system shell. EMYCIN provided facilities to attach
numerical "certainty factors" to inference rules. Initially a system was developed
using the full range of available values (-1 to +1). These values were then replaced
with just two: if the rule made a purely categorical inference the certainty factor was
set to be 1.0 while if there was any uncertainty associated with the rule the certainty
factor was set to 0.5. The effect was to increase diagnostic accuracy by 5%!
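For readers unfamiliar with certainty factors, the following sketch (our
illustration, not the original system's code) shows how EMYCIN's standard
combination rule for positive evidence behaves under the two-valued scheme just
described:

    from functools import reduce

    def combine(cf1, cf2):
        # EMYCIN's rule for combining two positive certainty factors
        return cf1 + cf2 * (1 - cf1)

    print(reduce(combine, [0.5, 0.5]))   # 0.75: two uncertain rules agreeing
    print(reduce(combine, [1.0, 0.5]))   # 1.0: a categorical rule settles the claim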
In a study by O'Neil and Glowinski [11] of whether or not to admit patients with
suspected heart attacks to hospital, no advantage was found for a precise decision
procedure over simply "adding up the pros and cons". A similar comparison by
Pradhan et al in a diagnosis task [12] showed a slight increase in accuracy of
diagnosis with precise statistical reasoning, but the effect was so small it would have
no practical clinical value.
While the available evidence is not conclusive, a provisional hypothesis is that for
some decisions strict use of quantitatively precise decision-making methods may not
add much practical value to the design of decision support systems over simpler, more
“ad hoc” methods.
3 Qualitative Methods for Decision Making
Work in artificial intelligence raises an even more radical option for the designers of
decision support systems. The desire to develop versatile automata has stimulated a
great deal of research in new methods of decision making under uncertainty, ranging
from sophisticated refinements of probabilistic methods such as Bayesian networks
and Dempster-Shafer belief functions to non-probabilistic methods such as fuzzy
logic and possibility theory. Good overviews of the different approaches and their
applications can be found in [13, 14].
These "non-standard" approaches are similar to probability methods in that they
treat uncertainty as a matter of degree. However, even this apparently innocuous
assumption has been widely questioned, on the practical grounds that these methods
demand a great deal of quantitative input data, and that decision-makers often find
them difficult to understand because they do not capture intuitions about uncertainty -
the nature of "belief", "doubt" and the form of natural justifications for decision-making.
Consequently, interest has grown in AI in the use of non-numerical methods that
seem to have some "common-sense" validity for reasoning under uncertainty but are
not ad hoc from a formal point of view. These include non-classical logics such as
non-monotonic logics, default logic and defeasible reasoning. Cognitive approaches,
sometimes called reason-based decision making, are also gaining ground, including
the idea of using informal endorsements for alternative decision options [15] and
formalisations of everyday strategies of reasoning about competing beliefs and
actions based on logical arguments [16, 17, 18].
Argumentation is a formalisation of the idea that decisions are made on the basis
of arguments for or against a claim. Fox and Das [19] propose that argumentation
may be the basis of a generalised decision theory, embracing standard probability
theory as a special case, as well as other qualitative and semi-quantitative
approaches. To take an example, suppose we wished to make the following informal
argument:
If three or more first degree relatives of a patient have contracted breast
cancer, then this is one reason to believe that the patient carries a gene
predisposing to breast cancer.
In the scheme of [19] arguments are defined as logical structures having three
terms:
(Claim, Grounds, Qualifier)
In the example the Claim term would be the proposition "this patient carries a
gene predisposing to breast cancer" and the Grounds "three or more first degree
relatives of this patient have contracted breast cancer" is the justification for the
argument. The final term, Qualifier, specifies the nature and strength of the argument
which can be drawn from the grounds to the claim. In the example the qualifier is
informal "this is one reason to believe", but it could be more conventional, as in
"given the Grounds are true the Claim is true with a conditional probability of 0.43".
Qualifiers are specified with reference to a particular dictionary of terms, along
with an aggregation function that specifies how qualifiers in multiple arguments are
to be combined. One possible dictionary is the set of real numbers from 0.0 to 1.0,
which would allow qualifiers to be specified as precise numerical probability values
and, with an appropriate aggregation function based on the theorems of classical
probability, would allow the scheme to reduce to standard probability theory.
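As one simple illustration of a probabilistic aggregation (the particular
combination rule, which assumes the arguments rest on independent evidence, is our
own choice rather than one prescribed by [19]):

    def aggregate(probabilities):
        # If each argument independently establishes the claim with
        # probability p, the chance that at least one of them does so
        # is 1 - product of (1 - p) over all arguments.
        result = 1.0
        for p in probabilities:
            result *= (1.0 - p)
        return 1.0 - result

    print(aggregate([0.43, 0.2]))   # approximately 0.544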
In many situations little can be known of the strength of arguments, other than
that they indicate an increase or decrease in the overall confidence in the claim. In this
case the dictionary and aggregation function might be simpler. In [20] for example we
describe a system which assesses carcinogenicity of novel compounds using a
dictionary comprising the symbols + (argument in favour), - (argument against). The
aggregation function in this case simply sums the number of arguments in favour of a
proposition minus those against. Other dictionaries can include additional qualifiers,
such as ++ (the argument confirms the claim) and -- (the argument refutes the claim).
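A minimal sketch of such a dictionary and its aggregation function (ours, not code
from [20]; giving refuting arguments precedence over confirming ones is our
assumption):

    def aggregate_signs(qualifiers):
        if "--" in qualifiers:      # a refuting argument defeats the claim
            return float("-inf")
        if "++" in qualifiers:      # a confirming argument settles the claim
            return float("inf")
        # otherwise count arguments for, minus arguments against
        return qualifiers.count("+") - qualifiers.count("-")

    print(aggregate_signs(["+", "+", "-"]))   # 1: net support for the claim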
Another possible dictionary adopts linguistic confidence terms (e.g. probable,
improbable, possible, plausible etc.) and requires a logical aggregation function that
provides a formalised semantics for combining such terms according to common
usage. Such terms have been formally categorised by their logical structure, for
example in [21, 22].
A feature of argumentation theory is that it subsumes these apparently diverse
approaches within a single formalism which has a clear consistent formal semantics
[19]. Additionally, argumentation gives a different perspective on the decision
making process which we believe to be more in line with the way people naturally
think about probability and possibility than standard probability theory. Given the
evidence reviewed above that people do not naturally use EUT in their everyday
decision making, our tentative position is that expressing a preference or confidence
in terms of arguments for and against each option will be more accessible and
comprehensible than providing a single number representing its aggregate probability.
We have developed a number of decision support systems to explore this idea. In
the next sections we describe a technology which supports an argument-based
decision procedure, then outline several applications which have been built using it,
and finally consider quantitative evaluations of the applications.
4 Practical Applications of the Argumentation Method
We have used the argumentation framework in designing five decision support
systems to date.
4.1 CAPSULE
The CAPSULE (Computer Aided Prescribing Using Logic Engineering) system was
developed to assist GPs with prescribing decisions [19, 23]. CAPSULE analyses
patient notes and constructs a list of relevant candidate medications, together with
arguments for and against each option, based on nine different criteria including
efficacy, contra-indications, drug interactions, side effects and cost.
4.2 CADMIUM
The CADMIUM radiology workstation is an experimental package for combining
image processing with logic-based decision support. CADMIUM has mainly been
applied to screening for breast cancer, where an argument-based decision procedure
has the dual function of controlling the image processing functions which extract and
describe micro-calcifications in breast x-rays, and interpreting those descriptions in
terms of whether they are likely to indicate benign or malignant abnormalities. The
decision procedure assesses the arguments and presents them to
the user in a structured report.
4.3 RAGs
The RAGs (Risk Assessment in Genetics) system allows the user to describe a
patient's family tree incorporating information on the known incidence of cancers
within the family. This information is evaluated to assess the likelihood that a genetic
predisposition to a particular cancer is present. The software makes recommendations
for patient management in language which is comprehensible to both clinician and
patient.
A family tree graphic is used for incremental data entry (Figure 1). Data about
relatives are added by clicking on each relative’s icon, and completing a simple form.
RAGs analyses the data and provides detailed assessments of genetic risk by
weighing simple arguments like the example above rather than using numerical
probabilities. Based on the aggregate genetic risk level the patient is classified as at
high, moderate or low risk of carrying a BRCA1 genetic mutation, which implies an
80% lifetime risk of developing breast cancer. Appropriate referral advice can be
given for the patient based on this classification. The software can provide a
comprehensible explanation for its decisions based on the arguments it has applied
(Figure 1).
RAGs uses a set of 23 risk arguments (e.g. if the client has more than two first-degree relatives with breast cancer under the age of 50 then this is a risk factor) and a
simple dictionary of qualifiers which allows small positive and negative integer
values as well as plus and minus infinity (equivalent to confirming or refuting the
claim). This scheme thus preserves relative rather than absolute weighting of different
factors in the analysis.
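The flavour of this scheme can be sketched as follows; the weights, thresholds and
band labels here are invented for illustration and are not those used by RAGs:

    import math

    def risk_band(argument_weights, high=4, moderate=2):
        # Sum the integer weights; plus/minus infinity (a confirming or
        # refuting argument) dominates any finite total.
        total = sum(argument_weights)
        if total >= high:
            return "high"
        if total >= moderate:
            return "moderate"
        return "low"

    print(risk_band([2, 2, -1]))      # "moderate"
    print(risk_band([1, math.inf]))   # "high": an argument confirms the claim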
4.4 ERA
Patients with symptoms or signs which may indicate cancer should be
investigated quickly so that treatment may commence as soon as possible. The ERA
(Early Referrals Application) system has been designed in the context of the UK
Department of Health’s (DoH) “2 week” guideline, which states that patients with
suspected cancer should be seen and assessed by an appropriate specialist within 2
weeks of their presentation. Referral criteria are given for each of 12 main cancer
groups (e.g. Breast, Lung, Colorectal).
The practical decision as to whether or not to refer a particular patient is treated in
ERA as based on a set of patient-specific arguments (see figure 2). The published
referral criteria have been specifically designed to be unambiguous and easy to apply.
Should any of the arguments apply to a particular patient, an early referral is
warranted. Qualifiers as to the weight of each argument are thus unnecessary, with
arguments generally acting categorically in an all-or-nothing fashion. An additional
feature of this domain is that there are essentially no counter-arguments. The value of
argumentation in this example lies largely in the ability to give meaningful
explanations in the form of reasons for referral expressed in English.
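A minimal sketch of this categorical style of argumentation (the criteria shown are
invented placeholders, not the published DoH referral criteria):

    CRITERIA = [
        ("discrete breast lump in a patient aged 30 or over",
         lambda p: p.get("breast_lump", False) and p.get("age", 0) >= 30),
        ("rectal bleeding with a persistent change in bowel habit",
         lambda p: p.get("rectal_bleeding", False) and p.get("bowel_change", False)),
    ]

    def assess(patient):
        # any applicable criterion warrants early referral, and the
        # matching criteria double as the explanation
        reasons = [text for text, applies in CRITERIA if applies(patient)]
        return {"refer": bool(reasons), "reasons": reasons}

    print(assess({"breast_lump": True, "age": 45}))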
4.5 REACT
The applications described so far involve making a single decision - what drug to
prescribe, or whether to refer a patient. Most decisions are made in the context of
plans of action, however, where they may interact or conflict with other planned
actions or anticipated events. REACT (Risk, Events, Actions and their Consequences
over Time) is being developed to provide decision support for extended plans. In
effect REACT is a logical spreadsheet that allows a user to manipulate graphical
widgets representing possible clinical events and interventions on a timeline interface
and propagates their implications (both qualitative and quantitative) to numerical
displays of risk (or other parameters) and displays of arguments and counter-arguments.
While the REACT user creates a plan, a knowledge-based DSS analyses it according
to a set of definable rules and provides feedback on interactions between events.
Rules may specify, for example, that certain events are mutually exclusive, that
certain combinations of events are impossible, or that events have different
consequences depending on prior or simultaneous events. Global measures (for
example the predicted degree of risk or predicted cost or benefit of combinations of
events) can be displayed graphically alongside the planning timeline. Qualitative
arguments for and against each individual action proposed in the plan can be
reviewed, and can be used to generate recommended actions when specified
combinations of plan elements occur.
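The style of rule involved can be sketched as follows (the plan, event names and
rule are invented for illustration):

    def mutually_exclusive(a, b):
        # rule constructor: warns when a plan contains both interventions
        def rule(plan):
            names = {name for name, _week in plan}
            if a in names and b in names:
                return ["%s and %s are mutually exclusive" % (a, b)]
            return []
        return rule

    plan = [("surgery", 2), ("drug_A", 3), ("drug_B", 3)]   # (event, week)
    rules = [mutually_exclusive("drug_A", "drug_B")]

    for check in rules:
        for warning in check(plan):
            print(warning)   # feedback shown alongside the planning timeline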
Figure 1: The RAGs software application. A family tree has been input for
the patient Karen, who has had three relatives affected by relevant cancers. The
risk assessment system has determined a pattern of inheritance that may indicate
a genetic factor, and provides an explanation in the left-hand panel. Referral
advice for the patient is also available.
Figure 2: The ERA early referral application. After completing patient
details on a form, the program provides referral recommendations with advice,
and can contact the hospital to automatically make an appointment.
5 Evaluations
A number of these applications have been used to carry out controlled evaluations
(with the exception of ERA and REACT, which are still in development).
In the case of the CAPSULE prescribing system a controlled study with
Oxfordshire general practitioners showed the potential for substantial improvements
in the quality of their prescribing decisions [23]. With decision support there was a
70% increase in the number of times the GPs' decisions agreed with those of experts
considering the cases, and a 50% reduction in the number of times that they missed a
cheaper but equally effective medication.
The risk classifications generated by the RAGs software were compared, for 50
families, with those provided by the leading probabilistic risk assessment software. This
uses the Claus model [27], a mathematical model of genetic risk of breast cancer
based on a large dataset of cancer cases. Despite the use of a very simple weighting
scheme the RAGs system produced exactly the same risk classification (into high,
medium or low risk, according to established guidelines) for all cases as the
probabilistic system [24, 25]. RAGs resulted in more accurate pedigree taking and
more appropriate management decisions than either pencil and paper or standard
probabilistic software [26].
In the CADMIUM image-processing system users are provided with assistance in
interpreting mammograms using an argument-based decision making component.
Radiographers who were trained to interpret mammograms were asked to make
decisions as to whether observed abnormalities were benign or malignant, with and
without decision support. The decision support condition produced clear
improvements in the radiographers’ performance, in terms of increased hits and
correct rejections and reduced misses and false positives [28].
6 Summary and conclusions
Human reasoning and decision-making can exhibit various shortcomings when
compared with accepted prescriptive theories. Good decision-making is central to
many important human activities and much effort is directed at developing decision
support technologies to assist us. One school of thought argues that these systems
must be based on prescriptive axioms of decision-making since to do otherwise leads
inevitably to irrational conclusions and choices. Others argue that decision-making in
the real world demands practical tradeoffs and a failure to make those tradeoffs would
be irrational. Although the debate remains inconclusive, we have described a number
of examples of medical systems in which the use of simple logical argumentation
appears to provide effective decision support together with a versatile representation
of uncertainty which fits comfortably with people's intuitions.
7 Acknowledgements
The RAGs project was supported by the Economic and Social Research Council, and
much of the development and evaluation work was carried out by Andrew Coulson
and Jon Emery. Michael Humber carried out much of the implementation of the ERA
system.
References
1. Kahneman D., Slovic P. and Tversky A. (eds): Heuristics and Biases. Cambridge University Press, Cambridge (1982)
2. Evans J. St B. and Over D.E.: Rationality and Reasoning. Psychology Press, London (1996)
3. Wright G. and Ayton P. (eds): Subjective Probability. Wiley, Chichester, UK (1994)
4. Gigerenzer G. and Todd P.M.: Simple Heuristics That Make Us Smart. Oxford University Press, Oxford (1999)
5. Kahneman, D. and Tversky, A.: Judgement under uncertainty: Heuristics and biases. Science, Vol. 185 (1974) pp. 1124-1131
6. Lindley D.V.: Making Decisions (2nd edition). Wiley, Chichester, UK (1985)
7. Shanteau, J.: Psychological characteristics of expert decision makers. In: Mumpower J. (ed) Expert Judgement and Expert Systems. NATO ASI Series, Vol. F35 (1987)
8. Carpenter R.G., Gardner A., McWeeny P.M. and Emery J.L.: Multistage scoring system for identifying infants at risk of unexpected death. Archives of Disease in Childhood, 53(8) (1977) pp. 600-612
9. Fox J., Barber D.C. and Bardhan K.D.: Alternatives to Bayes? A quantitative comparison with rule-based diagnosis. Methods of Information in Medicine, 10(4) (1980) pp. 210-215
10. Fox J., Myers C.D., Greaves M.F. and Pegram S.: Knowledge acquisition for expert systems: experience in leukaemia diagnosis. Methods of Information in Medicine, 24(1) (1985) pp. 65-72
11. O'Neil M.J. and Glowinski A.J.: Evaluating and validating very large knowledge-based systems. Medical Informatics, 15(3) (1990) pp. 237-251
12. Pradhan M.: The sensitivity of belief networks to imprecise probabilities: an experimental investigation. Artificial Intelligence Journal, 84(1-2) (1996) pp. 365-397
13. Krause P. and Clark C.: Uncertainty and subjective probability in AI systems. In: Wright G. and Ayton P. (eds) Subjective Probability. Wiley (1994) pp. 501-527
14. Hunter, A. and Parsons, S. (eds): Applications of Uncertainty Formalisms. Springer-Verlag, LNAI 1455 (1998)
15. Cohen P.R.: Heuristic Reasoning: An Artificial Intelligence Approach. Pitman Advanced Publishing Program, Boston (1985)
16. Fox J.: On the necessity of probability: Reasons to believe and grounds for doubt. In: Wright G. and Ayton P. (eds) Subjective Probability. John Wiley, Chichester (1994)
17. Fox J., Krause P. and Ambler S.: Arguments, contradictions and practical reasoning. In: Neumann B. (ed) Proceedings of the 10th European Conference on AI (ECAI-92), Vienna, Austria (1992) pp. 623-627
18. Curley S.P. and Benson P.G.: Applying a cognitive perspective to probability construction. In: Wright G. and Ayton P. (eds) Subjective Probability. John Wiley & Sons, Chichester, England (1994) pp. 185-209
19. Fox J. and Das S.K.: Safe and Sound: Artificial Intelligence in Hazardous Applications. American Association of Artificial Intelligence and MIT Press (2000)
20. Tonnelier C.A.G., Fox J., Judson P., Krause P., Pappas N. and Patel M.: Representation of chemical structures in knowledge-based systems. J. Chem. Inf. Sci., 37 (1997) pp. 117-123
21. Elvang-Goransson M., Krause P.J. and Fox J.: Acceptability of arguments as logical uncertainty. In: Clarke M., Kruse R. and Moral S. (eds) Symbolic and Quantitative Approaches to Reasoning and Uncertainty, Proceedings of European Conference ECSQARU-93. Lecture Notes in Computer Science 747, Springer-Verlag (1993) pp. 85-90
22. Glasspool D.W. and Fox J.: Understanding probability words by constructing concrete mental models. In: Hahn M. and Stoness S.C. (eds) Proceedings of the 21st Annual Conference of the Cognitive Science Society (1999) pp. 185-190
23. Walton R.T., Gierl C., Yudkin P., Mistry H., Vessey M.P. and Fox J.: Evaluation of computer support for prescribing (CAPSULE) using simulated cases. British Medical Journal, 315 (1997) pp. 791-795
24. Emery J., Watson E., Rose P. and Andermann A.: A systematic review of the literature exploring the role of primary care in genetic services. Family Practice, 16 (1999) pp. 426-445
25. Coulson A.S., Glasspool D.W., Fox J. and Emery J.: Computerized genetic risk assessment from family pedigrees. MD Computing, in press
26. Emery J., Walton R., Murphy M., Austoker J., Yudkin P., Chapman C., Coulson A., Glasspool D. and Fox J.: Computer support for interpreting family histories of breast and ovarian cancer in primary care: comparative study with simulated cases. British Medical Journal, 321 (2000) pp. 28-32
27. Claus E., Schildkraut J., Thompson W.D. and Risch N.: The genetic attributable risk of breast and ovarian cancer. Cancer, 77 (1996) pp. 2318-2324
28. Taylor P., Fox J. and Todd-Pokropek A.: The development and evaluation of CADMIUM: a prototype system to assist in the interpretation of mammograms. Medical Image Analysis, 3(4) (1999) pp. 321-337