Subjective Evidential Reasoning

advertisement
Subjective Evidential Reasoning
Audun Jøsang
Distributed Systems Technology Centre
Level 12, S-Block, QUT, PO Box 2434, Brisbane, Qld 4001, Australia
email: ajosang@dstc.edu.au
We describe two operators called discounting 1
and consensus that operate on subjective uncertain beliefs, and show how they can be used for
assessing evidence from different sources. This
work builds on Demspter-Shafer belief theory [3].
Our consensus operator is different from Dempster’s rule but has the same purpose; namely of
combining uncertain beliefs, and a comparison is
provided at the end of the paper. Our discounting
operator is similar to the Shaferian discounting
operator, and can be used to model recommendation of beliefs. The operators form part of Subjective Logic which is described in Jøsang 2001 [4].
These operators can for example be applied to legal reasoning [5] and authentication in computer
networks[6].
Abstract
This paper describes a framework for
combining and assessing subjective evidence from different sources. The
approach is based on the DempsterShafer belief theory, but instead of using Dempster’s rule we introduce a new
rule called the consensus operator that
is based on statistical inference. We
show how this framework can be applied to subjective evidential reasoning.
1 Introduction
Several alternative calculi and logics which take
uncertainty and ignorance into consideration have
been proposed and quite successfully applied to
practical problems where conclusions have to be
made based on insufficient evidence (see for example Hunter 1996 [1] or Motro & Smets 1997
[2] for an analysis of some uncertainty logics and
calculi). Although including uncertainty in the
belief model is a significant step forward, it only
goes half the way in realising the real nature of
human beliefs. It is also necessary to take into account that beliefs always are held by individuals
and that beliefs for this reason are fundamentally
subjective.
2
Representing Uncertain Beliefs
The first step in applying the Dempster-Shafer belief model [3] is to define a set of possible situations, which is called the frame of discernment,
denoted by and which delimits a set of possible
states of a given system.
The powerset of , denoted by , contains all
possible unions of the states in including itself. Elementary states in a frame of discernment will be called atomic states because they
do not contain substates. It is assumed that only
one atomic state can be true at any one time. If a
state is assumed to be true, then all superstates are
considered true as well.
Appears in the proceedings of the 9th International Conference on Information Processing and Management of Uncertainty in Knowledge-Based Systems (IPMU 2002), Annecy, France, 1-5 July, 2002.
The work reported in this paper has been funded in
part by the Co-operative Research Centre for Enterprise Distributed Systems Technology (DSTC) through the Australian
Federal Government’s CRC Programme (Department of Industry, Science & Resources).
An observer who believes that one or more states
in the powerset of might be true can assign belief mass to these states. Belief mass on an atomic
1
1
Called recommendation in some earlier publications
is interpreted as the belief that the
state
state in question is true. Belief mass on a non is interpreted as the belief
atomic state
that one of the atomic states it contains is true,
but that the observer is uncertain about which of
them is true. The following definition is central in
the Dempster-Shafer theory.
The disbelief of corresponds to the doubt of
in Shafer’s book. However, we choose to use the
term ‘disbelief’ because we feel that for example the case when it is certain that a state is false
can better be described by ‘total disbelief’ than
by ‘total doubt’. Our next definition expresses
uncertainty regarding a given state as the sum of
belief masses on superstates or on partly overlapping states.
Definition 1 (Belief Mass Assignment) Let be a frame of discernment. If with each substate
a number
is associated such that:
!
Definition 4 (Uncertainty Function) Let be a
frame of discernment, and let
be a BMA on
. The uncertainty function corresponding with
is the function
defined by:
<=$ &)( + *,.< >@?B/ ADEGC F 32 4* J* 2 H> IC A
"
then
is called a belief mass assignment2 on
, or BMA for short. For each substate
,
is called the belief mass3 of .
A BMA with zero belief mass assigned to is
called a dogmatic BMA. The sum of the belief,
disbelief and uncertainty functions is equal to the
sum of the belief masses in a BMA which according to Def.1 sums up to 1. The following equality
is therefore trivial to prove:
A belief mass
expresses the belief as
signed to the state and does not express any belief in substates of in particular.
In contrast to belief mass, the belief in a state must
be interpreted as an observer’s total belief that a
particular state is true. The next definition from
the Dempster-Shafer theory will make it clear that
belief in not only depends on belief mass assigned to but also on belief mass assigned to
substates of .
# JK 6 LK < !M*
+
,
*
.#%$ '&)(
# 0,/ 1 32 4* 5* 2 2
2
(1)
J*2 2 J* 2 2
Definition 5 (Relative Atomicity) Let be a
. Then
frame of discernment and let
the relative atomicity of to is the function
defined by:
P $ '&)( +*,.P Q 2 O SR2 2 O * J* 2 OO
2 UT P Q 2 V ,
It can be2X
observed
that JR 2 W
T
P Q H! . In all other cases
and that
Definition 3 (Disbelief Function) Let be a
frame of discernment, and let
be a BMA on .
Then the disbelief function corresponding with
defined by:
is the function
3
"N OO
Similarly to belief, an observer’s disbelief must
be interpreted as the total belief that a state is not
true. The following definition is ours.
for
For the purpose of deriving probability expectation values of sets in , the relative number of
atomic sets is also needed in addition to belief
masses. For any particular state the atomicity
of is the number of atomic states it contains,
denoted by . If is a frame of discernment,
the atomicity of is equal to the total number of
atomic states it contains. Similarly, if
then the overlap between and relative to can
be expressed in terms of atomic states. Our next
definition captures this idea of relative atomicity:
Definition 2 (Belief Function) Let be a frame
of discernment, and let
be a BMA on . Then
the belief function corresponding with
is the
function
defined by:
67$ '&)( +*,.6 0.8 / :9; 32 4* 5* 2 Called basic probability assignment in Shafer 1976[3]
Called basic probability number in Shafer 1976[3]
it will have a value between 0 and 1.
2
P Q The relative atomicity of an atomic state to its
frame of discernment, denoted by
, can
simply be written as
, i.e. it refers to the
frame of discernment by default. The relative
atomicity function can be used to determine a
probability expectation value for any given state.
P A focused relative atomicity represents the
weighted average of relative atomicities of to
all other states in function of their uncertainty belief mass. Working with focused BMAs makes it
possible to represent the belief functions relative
to using a binary frame of discernment, making
the notation compact.
Definition 6 (Probability Expectation) Let be a frame of discernment with BMA
, then
the probability expectation function corresponding with
is the function
defined by:
3
$ & ( +*,. U/ 3 2 B P Q 2 4* 5* 2 (2)
0 The Opinion Space
We define a 3-dimensional belief metric called
opinion containing a 4th redundant component.
For compactness of notation the belief, disbelief,
uncertainty and relative atomicity functions will
and
respectively.
be denoted as , ,
# 6 < Definition 9 (Opinion) Let be a frame of discernment with 2 atomic states and , and let
be a BMA on where , , , and
are the belief, disbelief, uncertainty and relative
atomicity functions on in respectively. Then
the opinion about , denoted by , is the tuple:
Def.6 is equivalent to pignistic probability described in e.g. Smets and Kennes 1994 [7], and is
based on the principle of insufficient reason: a belief mass assigned to the union of atomic states
is split equally among these states.
In order to simplify the representation of uncertain beliefs for particular states we will define a
focused frame of discernment and a focused BMA.
The focused frame of discernment and the corresponding BMA will for a given state produce the
same belief, disbelief and uncertainty functions as
the original frame of discernment and BMA.
P
P # D*
6 D*
< *
( (4)
Director
# -3Q < P
Uncertainty
1
Definition 8 (Focused Belief Mass Assignment)
Let be a frame of discernment with BMA
and let
,
and
be the belief, disbelief
and uncertainty functions of in . Let be
the the focused frame of discernment with focus
on . The corresponding focused BMA and
on is defined by:
relative atomicity # *6 *< < # 6 < # * 6 * < *HP Because the three coordinates
are dependent through Eq.(1) they represent nothing more
than the traditional Belief, Plausibility pair of
Shaferian belief theory. It is useful to keep all
three coordinates in order to obtain simple operator expressions. Eq.(1) defines a triangle that can
be used to graphically illustrate opinions as shown
in Fig.1.
Definition 7 (Focused Frame Of Discernment)
Let be a frame of discernment and let .
The frame of discernment denoted by containing only and , where is the complement
of
in is then called a focused frame of
discernment with focus on .
# 6 P
0
0.5
ωx
0
Projector
0.5
Disbelief 1
0
(3)
0.5
0
0.5 ax E(x )
Probability axis
Figure 1: Opinion triangle with 3
1
1Belief
as example
+*7 +*7 +*S As an example the position of the opinion is indicated as a point
in the triangle. Also shown are the probability
expectation value and the relative atomicity. The
horizontal bottom line between the belief and disbelief corners in Fig.1 is called the probability
axis. The relative atomicity can be graphically
represented as a point on the probability axis. The
line joining the top corner of the triangle and the
relative atomicity point becomes the director. In
is represented as a point, and
Fig.1
the dashed line pointing at it represents the director.
The first criterion is self evident. The second criterion is less so, but it is supported by experimental findings described by Ellsberg 1961 [8]. The
third criterion is more an intuitive guess and so
is the priority between the second and third criterion, and before these assumptions can be supported by evidence from practical experiments we
invite the readers to judge whether they agree.
P 4
So far we have described the elements of a frame
of discernment as states. In practice states can be
verbally described as propositions; if for example consists of possible colours of a ball when
drawn from an urn with red and black balls, and
designates the state when the colour drawn from
the urn is red then it can be interpreted as the verbal proposition : “I will draw a red ball”.
The projector is parallel to the director and passes
through the opinion point . Its intersection with
the probability axis defines the probability expectation value which otherwise can be computed by
the formula of Def.6. The position of the proba
is shown. For simbility expectation
plicity the probability expectation of an
opinion
can
be
denoted
according
to
.
V Opinions are considered to be held by individuals, and will therefore have an ownership assigned
whenever relevant. In our notation, superscripts
indicate ownership, and subscripts indicate the
proposition to which the opinion applies. For example is an opinion held by agent about the
truth of proposition .
Opinions situated on the probability axis are
called dogmatic opinions, representing traditional
probabilities without uncertainty. The distance
between an opinion point and the probability axis
can be interpreted as the degree of uncertainty.
Opinions situated in the left or right corner, i.e.
or
are called absolute
with either
opinions. They represent situations where it is absolutely certain that a state is either true or false,
and correspond to TRUE or FALSE proposition
in binary logic.
# 6 In this section we describe the operators discounting and consensus which are suitable for subjective evidential reasoning. Other operators suitable
for subjective logic reasoning are described in [4].
4.1
Opinions can be ordered according to probability expectation value, but additional criteria are
needed in case of equal probability expectation
values. The following definition can be used to
determine the order of opinions:
Definition 10 (Ordering of Opinions) Let and be two opinions. They can be ordered
according to the following criteria by priority:
0
Evidential Operators
The opinion with the greatest probability expectation is the greatest opinion.
The opinion with the least uncertainty is the
greatest opinion.
The opinion with the least relative atomicity
is the greatest opinion.
4
Discounting
Assume two agents and where has an
opinion about in the form of the proposition:
“
is knowledgeable and will tell the truth”. In
addition has an opinion about a proposition .
Agent can then form an opinion about by discounting ’s opinion about with ’s opinion
about . There is no such thing as physical belief
discounting, and discounting of opinions therefore lends itself to different interpretations. The
main difficulty lies with describing the effect of disbelieving that will give a good advice. This
we will interpret as if thinks that is uncertain
about the truth value of so that also is uncertain about the truth value of no matter what ’s
actual advice is. Our next definition captures this
idea.
# * 6 * < * P # * 6 * < *P # * 6 * < *P # # # <6 #6 6 K < K # < L
P P *
is called the discounting of by # * 6 * < *P # * 6 * < *P < * < & Q<
<
< K < < < # * 6 * < *P for @
N # : # < K # < Q 6 6 < K 6 < Q L
< < < Q J
P for :
@
# 6 L
< V J
P and ,
is then called the consensus between representing an imaginary agent ( * - ’s opinion
about , as if she represented both and . By
using the symbol ‘ ’ to designate this operator,
.
we define Definition 11 (Discounting)
Let be agent ’s opinion
about agent as a recommender, and let be a
proposition where is ’s
opinion about expressed in a recommendation
be
to . Let the opinion such that:
Definition 12 (Consensus)
Let and be ’s and ’s opinions about
the same proposition . When ,
the relative dogmatism between and is
Let defined by so that .
. The opinion defined by:
The discounting function defined by Shafer[3]
uses a discounting rate that can be denote by ,
where the belief mass on each state in except
the belief mass on itself is multiplied by .
our definition becomes
By setting equivalent to Shafer’s definition.
#
It is easy to prove that is associative but not
commutative. This means that in case of a chain
the discounting of opinions can start in either end,
but that the order of opinions is significant. In a
chain with more than one advisor, opinion independence must be assumed, which for example
translates into not allowing the same entity to appear more than once in a chain.
4.2
then expressing ’s opinion about as a result of ’s
advice to . By using the symbol
‘ ’ to designate
.
this operator, we define It is easy to prove that the consensus operator is
both commutative and associative. Opinion independence must be assumed. The consensus of two
totally uncertain opinions results in a new totally
uncertain opinion, although the relative atomicity
is not well defined in that case.
Consensus
In [4] it is incorrectly stated that the consensus
operator can not be applied to two dogmatic opin. The definition above
ions, i.e. when rectifies this so that dogmatic opinions can be
combined. This result is obtained by computing the limits of , , , and ,
as using the relative dogmatism
. This result makes the consensus operator even more general than Dempster’s rule because the latter excludes the combination of two
totally conflicting beliefs.
The consensus of two possibly conflicting opinions is an opinion that reflects both opinions in a
fair and equal way, i.e. when two observers have
beliefs about the truth of resulting from distinct
pieces of evidence about , the consensus operator produces a consensus belief that combines
the two separate beliefs into one. If for example a process can produce two outcomes and ,
and and have observed the process over two
different time intervals so that they have formed
two independent opinions about the likelihood of
to occur, then the consensus opinion is the belief about to occur which a single agent would
have had after having observed the process during
both periods. See Jøsang 2001 [4] for a formal
justification of the consensus operator.
#
< * < Q & < <
6
<
P
In order to understand the meaning of the relative dogmatism , it is useful to consider a process with possible outcomes that produces
times as many as . For example when throwing a fair dice and some mechanism makes sure
5*
5
that only observes the outcome of “six” and
only observes the outcome of “one”, “two”,
“three”, “four” and “five”, then will think the
dice only produces “six” and will think that the
dice never produces “six”. After infinitely many
observations and will have the conflicting
dogmatic opinions “six”
and “six”
respectively. On
the average observes 5 times more events than
so that remains 5 times more dogmatic than
as “six”
, meaning that the relative
“six”
dogmatism between and is . By
combining their opinions according to the case
where in Def.12 and inserting the value
of , we get a combined opinion about
obtaining
.
a “six” with the dice as “six”
We have assumed that and initially agreed
on the relative atomicity of obtaining “six” with
the dice.
about the truth of . Assuming that the opinions
resulting from each witness are independent, they
can finally be combined using the consensus operator to produce the judge’s own opinion about
:
+* +* +* + *S +*M +* < *<
& +
,
*
.
$
&
(
N 0.8/ 9; 32 *
0.8 9 32 *
0.8 9; 32 and N
where in Dempster’s rule, and where in the non-
normalised version.
Suppose that we have a murder case with three
suspects; Peter, Paul and Mary and two witwho give highly conflicting
nesses and
testimonies. Table 1 gives the witnesses’ belief
masses in Zadeh’s example and the resulting belief masses after applying Dempster’s rule, the
non-normalised Dempster’s rule and the consensus operator.
x
Trust
Figure 2: Trust in testimony from witnesses
Because the frame of discernment in Zadeh’s example is ternary a focused binary frame of discernment must be derived in order to apply the
consensus operator. The resulting opinions are all
dogmatic, and the case where in Def.12
must be invoked. Because of the symmetry between and
we determine the relative dogmatism between and
to be .
The effect on the judge of each individual testimony can be computed using the discounting operator, so that for example ’s testimony causes
the judge to have the opinion:
C
A
Legend:
B
(5)
Comparing the Consensus Operator
with Dempster’s Rule
We consider a court case where 3 witnesses , and are giving testimony to express their opinions about a verbal proposition which has been
made about the accused. We assume that the verbal proposition is either true or false, and let each
witness express his or her opinion about the truth
of the proposition as the opinions , and ,
to the courtroom. The judge then has to determine his or her own opinion about as a function
of her trust : ’Witness is reliable’, and sim
ilarly for and . This situation is illustrated in
Fig.2 where the arrows denote trust or opinions
about truth.
J
Definition 13 Let be a frame of discernment,
and let and
be BMAs on . Then is a function such
that for all
:
Example: Assessment of Testimony from
Witnesses
We will start with the well known example that
Zadeh 1984 [9] used for the purpose of criticising
Dempster’s rule. Smets 1988 [10] used the same
example in defence of the non-normalised version of Dempster’s rule. The definition of Dempster’s rule and the no-normalised Dempster’s rule
is given below.
* *7+* 4.4
:Q
4.3
6
!
Peter
Paul
Mary
0.99
0.01
0.00
0.00
0.00
0.00
0.01
0.99
0.00
0.00
Dempster’s
rule
0.00
1.00
0.00
0.00
0.00
Nonnorm.
rule
0.0000
0.0001
0.0000
0.0000
0.9999
Peter, Paul, Mary . Table 2 gives the modified
Cons.
operator
0.495
0.010
0.495
0.000
0.000
BMAs and the results of applying the rules.
Peter
Paul
Mary
Peter
Paul
Mary
0.00
0.01
0.98
0.01
0.00
Nonnorm.
rule
0.0098
0.0003
0.0098
0.0001
0.9800
Cons.
operator
0.492
0.010
0.492
0.005
0.000
The column for the consensus operator is obtained by taking the ‘belief’ coordinate from the
consensus opinions. The consensus opinion values and their corresponding probability expectation values are:
* *D +* *
+* +*D +* *
* *D +* *
H *
H 7*
H S
Dempster’s
rule
0.490
0.015
0.490
0.005
0.000
Table 2: Comparison of operators after introducing uncertainty in Zadeh’s example
The column for the consensus operator is obtained by taking the ‘belief’ coordinate from the
consensus opinions. The consensus opinion values and their corresponding probability expectation values are:
0.98
0.01
0.00
0.01
0.00
Table 1: Comparison of operators in Zadeh’s example
Peter
Paul
Mary
Peter
* D* * D *
+* D* * D *
* D* * D *
Paul
Mary
Dempster’s rule selects the least suspected by
both witnesses as the guilty. The non-normalised
version acquits all the suspects and indicates that
somebody else is the guilty. This is explained by
Smets 1988 [10] with the so-called open world
interpretation of the frame of discernment.
V V *
Peter
Paul
V Mary
*
When uncertainty is introduced Dempster’s rule
corresponds well with intuitive human judgement. The non-normalised Dempster’s rule however still indicates that none of the suspects are
guilty and that new suspects must be found. Our
consensus operator corresponds well with human
judgement and gives almost the same result as
Dempster’s rule. Note that the values resulting
from the consensus operator have been rounded
off after the third decimal. The belief masses resulting from Dempster’s rule in Table 2 add up to
1. The ‘belief’ parameters of the consensus opinions do not add up to 1 because they are actually
taken from 3 different focused frames of discernment, but the following holds:
Murphy 2000 [11] argues that Dempster’s rule
can only be considered safe for combining beliefs
with low conflict and that statistical average is the
best way to combine highly conflicting beliefs.
The consensus operator gives precisely the average of beliefs to Peter and Mary, and leaves the
non-conflicting beliefs on Paul unaltered. This result is further consistent with classical estimation
theory (see e.g. comments to Smets 1988 [10]
p.278 by M.R.B.Clarke) which is based on taking the average of probability estimates when all
estimates have equal weight.
In the following example uncertainty is introduced by allocating some belief to the set 7
K
Peter
JK
Paul
Mary
References
The above example indicates that Dempster’s rule
and the Consensus operator give similar results.
However, this is not always the case as illustrated
by the following example. Let two agents and
have equal beliefs about the truth of a binary
proposition . The agents’ BMAs and the results
of applying the rules are give in Table 3.
0.90
0.00
0.10
0.00
0.90
0.00
0.10
0.00
Dempster’s
rule
0.99
0.00
0.01
0.00
Nonnorm.
rule
0.99
0.00
0.01
0.00
[1] A. Hunter. Uncertainty in information systems. McGraw-Hill, London, 1996.
[2] A. Motro and Ph. Smets. Uncertainty management in information systems: from needs
to solutions. Kluwer, Boston, 1997.
Cons.
operator
0.947
0.000
0.053
0.000
[3] G. Shafer. A Mathematical Theory of Evidence. Princeton University Press, 1976.
[4] A. Jøsang. A Logic for Uncertain Probabilities. International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems, 9(3):279–311, June 2001.
[5] A. Jøsang and V.A. Bondi. Legal Reasoning
with Subjective Logic. Artificial Reasoning
and Law, 8(4):289–315, winter 2000.
Table 3: Comparison of operators i.c.o. equal beliefs
and the correThe consensus opinion about
sponding probability expectation value are:
[6] A. Jøsang. An Algebra for Assessing Trust
in Certification Chains. In J. Kochmar,
editor, Proceedings of the Network and
Distributed Systems Security Symposium
(NDSS’99). The Internet Society, 1999.
* +* *D *
[7] Ph. Smets and R. Kennes. The transferable belief model. Artificial Intelligence,
66:191–234, 1994.
It is difficult to give an intuitive judgement of
these results. It can be observed that Dempster’s rule and the non-normalised version produce equal results because the witnesses’ BMAs
are non-conflicting. The two variants of Dempster’s rule amplify the combined belief twice as
much as the consensus operator and this difference needs an explanation. The consensus operator produces results that are consistent with statistical analysis [4] and in the absence of good criteria for intuitive judgement this constitutes a strong
argument in favour of the consensus operator.
5
[8] Daniel Ellsberg.
Risk, ambiguity, and
the Savage axioms. Quarterly Journal of
Ecomonics, 75:643–669, 1961.
[9] L.A. Zadeh. Review of Shafer’s A Mathematical Theory of Evidence. AI Magazine,
5:81–83, 1984.
[10] Ph. Smets. Belief functions. In Ph. Smets
et al., editors, Non-Standard Logics for Automated Reasoning, pages 253–286. Academic Press, 1988.
Conclusion
The Opinion metric described here provides a
simple and compact notation for beliefs in the
Shaferian belief model. We have defined an alternative to Dempster’s rule which we believe corresponds more closely with human intuitive reasoning. We have also defined a discounting operator
which combined with the consensus operator can
be used for subjective evidential reasoning.
[11] Catherine K. Murphy. Combining belief
functions when evidence conflicts. Decision
Support Systems, 29:1–9, 2000.
8
Download