Subjective Evidential Reasoning Audun Jøsang Distributed Systems Technology Centre Level 12, S-Block, QUT, PO Box 2434, Brisbane, Qld 4001, Australia email: ajosang@dstc.edu.au We describe two operators called discounting 1 and consensus that operate on subjective uncertain beliefs, and show how they can be used for assessing evidence from different sources. This work builds on Demspter-Shafer belief theory [3]. Our consensus operator is different from Dempster’s rule but has the same purpose; namely of combining uncertain beliefs, and a comparison is provided at the end of the paper. Our discounting operator is similar to the Shaferian discounting operator, and can be used to model recommendation of beliefs. The operators form part of Subjective Logic which is described in Jøsang 2001 [4]. These operators can for example be applied to legal reasoning [5] and authentication in computer networks[6]. Abstract This paper describes a framework for combining and assessing subjective evidence from different sources. The approach is based on the DempsterShafer belief theory, but instead of using Dempster’s rule we introduce a new rule called the consensus operator that is based on statistical inference. We show how this framework can be applied to subjective evidential reasoning. 1 Introduction Several alternative calculi and logics which take uncertainty and ignorance into consideration have been proposed and quite successfully applied to practical problems where conclusions have to be made based on insufficient evidence (see for example Hunter 1996 [1] or Motro & Smets 1997 [2] for an analysis of some uncertainty logics and calculi). Although including uncertainty in the belief model is a significant step forward, it only goes half the way in realising the real nature of human beliefs. It is also necessary to take into account that beliefs always are held by individuals and that beliefs for this reason are fundamentally subjective. 2 Representing Uncertain Beliefs The first step in applying the Dempster-Shafer belief model [3] is to define a set of possible situations, which is called the frame of discernment, denoted by and which delimits a set of possible states of a given system. The powerset of , denoted by , contains all possible unions of the states in including itself. Elementary states in a frame of discernment will be called atomic states because they do not contain substates. It is assumed that only one atomic state can be true at any one time. If a state is assumed to be true, then all superstates are considered true as well. Appears in the proceedings of the 9th International Conference on Information Processing and Management of Uncertainty in Knowledge-Based Systems (IPMU 2002), Annecy, France, 1-5 July, 2002. The work reported in this paper has been funded in part by the Co-operative Research Centre for Enterprise Distributed Systems Technology (DSTC) through the Australian Federal Government’s CRC Programme (Department of Industry, Science & Resources). An observer who believes that one or more states in the powerset of might be true can assign belief mass to these states. Belief mass on an atomic 1 1 Called recommendation in some earlier publications is interpreted as the belief that the state state in question is true. Belief mass on a non is interpreted as the belief atomic state that one of the atomic states it contains is true, but that the observer is uncertain about which of them is true. The following definition is central in the Dempster-Shafer theory. The disbelief of corresponds to the doubt of in Shafer’s book. However, we choose to use the term ‘disbelief’ because we feel that for example the case when it is certain that a state is false can better be described by ‘total disbelief’ than by ‘total doubt’. Our next definition expresses uncertainty regarding a given state as the sum of belief masses on superstates or on partly overlapping states. Definition 1 (Belief Mass Assignment) Let be a frame of discernment. If with each substate a number is associated such that: ! Definition 4 (Uncertainty Function) Let be a frame of discernment, and let be a BMA on . The uncertainty function corresponding with is the function defined by: <=$ &)( + *,.< >@?B/ ADEGC F 32 4* J* 2 H> IC A " then is called a belief mass assignment2 on , or BMA for short. For each substate , is called the belief mass3 of . A BMA with zero belief mass assigned to is called a dogmatic BMA. The sum of the belief, disbelief and uncertainty functions is equal to the sum of the belief masses in a BMA which according to Def.1 sums up to 1. The following equality is therefore trivial to prove: A belief mass expresses the belief as signed to the state and does not express any belief in substates of in particular. In contrast to belief mass, the belief in a state must be interpreted as an observer’s total belief that a particular state is true. The next definition from the Dempster-Shafer theory will make it clear that belief in not only depends on belief mass assigned to but also on belief mass assigned to substates of . # JK 6 LK < !M* + , * .#%$ '&)( # 0,/ 1 32 4* 5* 2 2 2 (1) J*2 2 J* 2 2 Definition 5 (Relative Atomicity) Let be a . Then frame of discernment and let the relative atomicity of to is the function defined by: P $ '&)( +*,.P Q 2 O SR2 2 O * J* 2 OO 2 UT P Q 2 V , It can be2X observed that JR 2 W T P Q H! . In all other cases and that Definition 3 (Disbelief Function) Let be a frame of discernment, and let be a BMA on . Then the disbelief function corresponding with defined by: is the function 3 "N OO Similarly to belief, an observer’s disbelief must be interpreted as the total belief that a state is not true. The following definition is ours. for For the purpose of deriving probability expectation values of sets in , the relative number of atomic sets is also needed in addition to belief masses. For any particular state the atomicity of is the number of atomic states it contains, denoted by . If is a frame of discernment, the atomicity of is equal to the total number of atomic states it contains. Similarly, if then the overlap between and relative to can be expressed in terms of atomic states. Our next definition captures this idea of relative atomicity: Definition 2 (Belief Function) Let be a frame of discernment, and let be a BMA on . Then the belief function corresponding with is the function defined by: 67$ '&)( +*,.6 0.8 / :9; 32 4* 5* 2 Called basic probability assignment in Shafer 1976[3] Called basic probability number in Shafer 1976[3] it will have a value between 0 and 1. 2 P Q The relative atomicity of an atomic state to its frame of discernment, denoted by , can simply be written as , i.e. it refers to the frame of discernment by default. The relative atomicity function can be used to determine a probability expectation value for any given state. P A focused relative atomicity represents the weighted average of relative atomicities of to all other states in function of their uncertainty belief mass. Working with focused BMAs makes it possible to represent the belief functions relative to using a binary frame of discernment, making the notation compact. Definition 6 (Probability Expectation) Let be a frame of discernment with BMA , then the probability expectation function corresponding with is the function defined by: 3 $ & ( +*,. U/ 3 2 B P Q 2 4* 5* 2 (2) 0 The Opinion Space We define a 3-dimensional belief metric called opinion containing a 4th redundant component. For compactness of notation the belief, disbelief, uncertainty and relative atomicity functions will and respectively. be denoted as , , # 6 < Definition 9 (Opinion) Let be a frame of discernment with 2 atomic states and , and let be a BMA on where , , , and are the belief, disbelief, uncertainty and relative atomicity functions on in respectively. Then the opinion about , denoted by , is the tuple: Def.6 is equivalent to pignistic probability described in e.g. Smets and Kennes 1994 [7], and is based on the principle of insufficient reason: a belief mass assigned to the union of atomic states is split equally among these states. In order to simplify the representation of uncertain beliefs for particular states we will define a focused frame of discernment and a focused BMA. The focused frame of discernment and the corresponding BMA will for a given state produce the same belief, disbelief and uncertainty functions as the original frame of discernment and BMA. P P # D* 6 D* < * ( (4) Director # -3Q < P Uncertainty 1 Definition 8 (Focused Belief Mass Assignment) Let be a frame of discernment with BMA and let , and be the belief, disbelief and uncertainty functions of in . Let be the the focused frame of discernment with focus on . The corresponding focused BMA and on is defined by: relative atomicity # *6 *< < # 6 < # * 6 * < *HP Because the three coordinates are dependent through Eq.(1) they represent nothing more than the traditional Belief, Plausibility pair of Shaferian belief theory. It is useful to keep all three coordinates in order to obtain simple operator expressions. Eq.(1) defines a triangle that can be used to graphically illustrate opinions as shown in Fig.1. Definition 7 (Focused Frame Of Discernment) Let be a frame of discernment and let . The frame of discernment denoted by containing only and , where is the complement of in is then called a focused frame of discernment with focus on . # 6 P 0 0.5 ωx 0 Projector 0.5 Disbelief 1 0 (3) 0.5 0 0.5 ax E(x ) Probability axis Figure 1: Opinion triangle with 3 1 1Belief as example +*7 +*7 +*S As an example the position of the opinion is indicated as a point in the triangle. Also shown are the probability expectation value and the relative atomicity. The horizontal bottom line between the belief and disbelief corners in Fig.1 is called the probability axis. The relative atomicity can be graphically represented as a point on the probability axis. The line joining the top corner of the triangle and the relative atomicity point becomes the director. In is represented as a point, and Fig.1 the dashed line pointing at it represents the director. The first criterion is self evident. The second criterion is less so, but it is supported by experimental findings described by Ellsberg 1961 [8]. The third criterion is more an intuitive guess and so is the priority between the second and third criterion, and before these assumptions can be supported by evidence from practical experiments we invite the readers to judge whether they agree. P 4 So far we have described the elements of a frame of discernment as states. In practice states can be verbally described as propositions; if for example consists of possible colours of a ball when drawn from an urn with red and black balls, and designates the state when the colour drawn from the urn is red then it can be interpreted as the verbal proposition : “I will draw a red ball”. The projector is parallel to the director and passes through the opinion point . Its intersection with the probability axis defines the probability expectation value which otherwise can be computed by the formula of Def.6. The position of the proba is shown. For simbility expectation plicity the probability expectation of an opinion can be denoted according to . V Opinions are considered to be held by individuals, and will therefore have an ownership assigned whenever relevant. In our notation, superscripts indicate ownership, and subscripts indicate the proposition to which the opinion applies. For example is an opinion held by agent about the truth of proposition . Opinions situated on the probability axis are called dogmatic opinions, representing traditional probabilities without uncertainty. The distance between an opinion point and the probability axis can be interpreted as the degree of uncertainty. Opinions situated in the left or right corner, i.e. or are called absolute with either opinions. They represent situations where it is absolutely certain that a state is either true or false, and correspond to TRUE or FALSE proposition in binary logic. # 6 In this section we describe the operators discounting and consensus which are suitable for subjective evidential reasoning. Other operators suitable for subjective logic reasoning are described in [4]. 4.1 Opinions can be ordered according to probability expectation value, but additional criteria are needed in case of equal probability expectation values. The following definition can be used to determine the order of opinions: Definition 10 (Ordering of Opinions) Let and be two opinions. They can be ordered according to the following criteria by priority: 0 Evidential Operators The opinion with the greatest probability expectation is the greatest opinion. The opinion with the least uncertainty is the greatest opinion. The opinion with the least relative atomicity is the greatest opinion. 4 Discounting Assume two agents and where has an opinion about in the form of the proposition: “ is knowledgeable and will tell the truth”. In addition has an opinion about a proposition . Agent can then form an opinion about by discounting ’s opinion about with ’s opinion about . There is no such thing as physical belief discounting, and discounting of opinions therefore lends itself to different interpretations. The main difficulty lies with describing the effect of disbelieving that will give a good advice. This we will interpret as if thinks that is uncertain about the truth value of so that also is uncertain about the truth value of no matter what ’s actual advice is. Our next definition captures this idea. # * 6 * < * P # * 6 * < *P # * 6 * < *P # # # <6 #6 6 K < K # < L P P * is called the discounting of by # * 6 * < *P # * 6 * < *P < * < & Q< < < K < < < # * 6 * < *P for @ N # : # < K # < Q 6 6 < K 6 < Q L < < < Q J P for : @ # 6 L < V J P and , is then called the consensus between representing an imaginary agent ( * - ’s opinion about , as if she represented both and . By using the symbol ‘ ’ to designate this operator, . we define Definition 11 (Discounting) Let be agent ’s opinion about agent as a recommender, and let be a proposition where is ’s opinion about expressed in a recommendation be to . Let the opinion such that: Definition 12 (Consensus) Let and be ’s and ’s opinions about the same proposition . When , the relative dogmatism between and is Let defined by so that . . The opinion defined by: The discounting function defined by Shafer[3] uses a discounting rate that can be denote by , where the belief mass on each state in except the belief mass on itself is multiplied by . our definition becomes By setting equivalent to Shafer’s definition. # It is easy to prove that is associative but not commutative. This means that in case of a chain the discounting of opinions can start in either end, but that the order of opinions is significant. In a chain with more than one advisor, opinion independence must be assumed, which for example translates into not allowing the same entity to appear more than once in a chain. 4.2 then expressing ’s opinion about as a result of ’s advice to . By using the symbol ‘ ’ to designate . this operator, we define It is easy to prove that the consensus operator is both commutative and associative. Opinion independence must be assumed. The consensus of two totally uncertain opinions results in a new totally uncertain opinion, although the relative atomicity is not well defined in that case. Consensus In [4] it is incorrectly stated that the consensus operator can not be applied to two dogmatic opin. The definition above ions, i.e. when rectifies this so that dogmatic opinions can be combined. This result is obtained by computing the limits of , , , and , as using the relative dogmatism . This result makes the consensus operator even more general than Dempster’s rule because the latter excludes the combination of two totally conflicting beliefs. The consensus of two possibly conflicting opinions is an opinion that reflects both opinions in a fair and equal way, i.e. when two observers have beliefs about the truth of resulting from distinct pieces of evidence about , the consensus operator produces a consensus belief that combines the two separate beliefs into one. If for example a process can produce two outcomes and , and and have observed the process over two different time intervals so that they have formed two independent opinions about the likelihood of to occur, then the consensus opinion is the belief about to occur which a single agent would have had after having observed the process during both periods. See Jøsang 2001 [4] for a formal justification of the consensus operator. # < * < Q & < < 6 < P In order to understand the meaning of the relative dogmatism , it is useful to consider a process with possible outcomes that produces times as many as . For example when throwing a fair dice and some mechanism makes sure 5* 5 that only observes the outcome of “six” and only observes the outcome of “one”, “two”, “three”, “four” and “five”, then will think the dice only produces “six” and will think that the dice never produces “six”. After infinitely many observations and will have the conflicting dogmatic opinions “six” and “six” respectively. On the average observes 5 times more events than so that remains 5 times more dogmatic than as “six” , meaning that the relative “six” dogmatism between and is . By combining their opinions according to the case where in Def.12 and inserting the value of , we get a combined opinion about obtaining . a “six” with the dice as “six” We have assumed that and initially agreed on the relative atomicity of obtaining “six” with the dice. about the truth of . Assuming that the opinions resulting from each witness are independent, they can finally be combined using the consensus operator to produce the judge’s own opinion about : +* +* +* + *S +*M +* < *< & + , * . $ & ( N 0.8/ 9; 32 * 0.8 9 32 * 0.8 9; 32 and N where in Dempster’s rule, and where in the non- normalised version. Suppose that we have a murder case with three suspects; Peter, Paul and Mary and two witwho give highly conflicting nesses and testimonies. Table 1 gives the witnesses’ belief masses in Zadeh’s example and the resulting belief masses after applying Dempster’s rule, the non-normalised Dempster’s rule and the consensus operator. x Trust Figure 2: Trust in testimony from witnesses Because the frame of discernment in Zadeh’s example is ternary a focused binary frame of discernment must be derived in order to apply the consensus operator. The resulting opinions are all dogmatic, and the case where in Def.12 must be invoked. Because of the symmetry between and we determine the relative dogmatism between and to be . The effect on the judge of each individual testimony can be computed using the discounting operator, so that for example ’s testimony causes the judge to have the opinion: C A Legend: B (5) Comparing the Consensus Operator with Dempster’s Rule We consider a court case where 3 witnesses , and are giving testimony to express their opinions about a verbal proposition which has been made about the accused. We assume that the verbal proposition is either true or false, and let each witness express his or her opinion about the truth of the proposition as the opinions , and , to the courtroom. The judge then has to determine his or her own opinion about as a function of her trust : ’Witness is reliable’, and sim ilarly for and . This situation is illustrated in Fig.2 where the arrows denote trust or opinions about truth. J Definition 13 Let be a frame of discernment, and let and be BMAs on . Then is a function such that for all : Example: Assessment of Testimony from Witnesses We will start with the well known example that Zadeh 1984 [9] used for the purpose of criticising Dempster’s rule. Smets 1988 [10] used the same example in defence of the non-normalised version of Dempster’s rule. The definition of Dempster’s rule and the no-normalised Dempster’s rule is given below. * *7+* 4.4 :Q 4.3 6 ! Peter Paul Mary 0.99 0.01 0.00 0.00 0.00 0.00 0.01 0.99 0.00 0.00 Dempster’s rule 0.00 1.00 0.00 0.00 0.00 Nonnorm. rule 0.0000 0.0001 0.0000 0.0000 0.9999 Peter, Paul, Mary . Table 2 gives the modified Cons. operator 0.495 0.010 0.495 0.000 0.000 BMAs and the results of applying the rules. Peter Paul Mary Peter Paul Mary 0.00 0.01 0.98 0.01 0.00 Nonnorm. rule 0.0098 0.0003 0.0098 0.0001 0.9800 Cons. operator 0.492 0.010 0.492 0.005 0.000 The column for the consensus operator is obtained by taking the ‘belief’ coordinate from the consensus opinions. The consensus opinion values and their corresponding probability expectation values are: * *D +* * +* +*D +* * * *D +* * H * H 7* H S Dempster’s rule 0.490 0.015 0.490 0.005 0.000 Table 2: Comparison of operators after introducing uncertainty in Zadeh’s example The column for the consensus operator is obtained by taking the ‘belief’ coordinate from the consensus opinions. The consensus opinion values and their corresponding probability expectation values are: 0.98 0.01 0.00 0.01 0.00 Table 1: Comparison of operators in Zadeh’s example Peter Paul Mary Peter * D* * D * +* D* * D * * D* * D * Paul Mary Dempster’s rule selects the least suspected by both witnesses as the guilty. The non-normalised version acquits all the suspects and indicates that somebody else is the guilty. This is explained by Smets 1988 [10] with the so-called open world interpretation of the frame of discernment. V V * Peter Paul V Mary * When uncertainty is introduced Dempster’s rule corresponds well with intuitive human judgement. The non-normalised Dempster’s rule however still indicates that none of the suspects are guilty and that new suspects must be found. Our consensus operator corresponds well with human judgement and gives almost the same result as Dempster’s rule. Note that the values resulting from the consensus operator have been rounded off after the third decimal. The belief masses resulting from Dempster’s rule in Table 2 add up to 1. The ‘belief’ parameters of the consensus opinions do not add up to 1 because they are actually taken from 3 different focused frames of discernment, but the following holds: Murphy 2000 [11] argues that Dempster’s rule can only be considered safe for combining beliefs with low conflict and that statistical average is the best way to combine highly conflicting beliefs. The consensus operator gives precisely the average of beliefs to Peter and Mary, and leaves the non-conflicting beliefs on Paul unaltered. This result is further consistent with classical estimation theory (see e.g. comments to Smets 1988 [10] p.278 by M.R.B.Clarke) which is based on taking the average of probability estimates when all estimates have equal weight. In the following example uncertainty is introduced by allocating some belief to the set 7 K Peter JK Paul Mary References The above example indicates that Dempster’s rule and the Consensus operator give similar results. However, this is not always the case as illustrated by the following example. Let two agents and have equal beliefs about the truth of a binary proposition . The agents’ BMAs and the results of applying the rules are give in Table 3. 0.90 0.00 0.10 0.00 0.90 0.00 0.10 0.00 Dempster’s rule 0.99 0.00 0.01 0.00 Nonnorm. rule 0.99 0.00 0.01 0.00 [1] A. Hunter. Uncertainty in information systems. McGraw-Hill, London, 1996. [2] A. Motro and Ph. Smets. Uncertainty management in information systems: from needs to solutions. Kluwer, Boston, 1997. Cons. operator 0.947 0.000 0.053 0.000 [3] G. Shafer. A Mathematical Theory of Evidence. Princeton University Press, 1976. [4] A. Jøsang. A Logic for Uncertain Probabilities. International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems, 9(3):279–311, June 2001. [5] A. Jøsang and V.A. Bondi. Legal Reasoning with Subjective Logic. Artificial Reasoning and Law, 8(4):289–315, winter 2000. Table 3: Comparison of operators i.c.o. equal beliefs and the correThe consensus opinion about sponding probability expectation value are: [6] A. Jøsang. An Algebra for Assessing Trust in Certification Chains. In J. Kochmar, editor, Proceedings of the Network and Distributed Systems Security Symposium (NDSS’99). The Internet Society, 1999. * +* *D * [7] Ph. Smets and R. Kennes. The transferable belief model. Artificial Intelligence, 66:191–234, 1994. It is difficult to give an intuitive judgement of these results. It can be observed that Dempster’s rule and the non-normalised version produce equal results because the witnesses’ BMAs are non-conflicting. The two variants of Dempster’s rule amplify the combined belief twice as much as the consensus operator and this difference needs an explanation. The consensus operator produces results that are consistent with statistical analysis [4] and in the absence of good criteria for intuitive judgement this constitutes a strong argument in favour of the consensus operator. 5 [8] Daniel Ellsberg. Risk, ambiguity, and the Savage axioms. Quarterly Journal of Ecomonics, 75:643–669, 1961. [9] L.A. Zadeh. Review of Shafer’s A Mathematical Theory of Evidence. AI Magazine, 5:81–83, 1984. [10] Ph. Smets. Belief functions. In Ph. Smets et al., editors, Non-Standard Logics for Automated Reasoning, pages 253–286. Academic Press, 1988. Conclusion The Opinion metric described here provides a simple and compact notation for beliefs in the Shaferian belief model. We have defined an alternative to Dempster’s rule which we believe corresponds more closely with human intuitive reasoning. We have also defined a discounting operator which combined with the consensus operator can be used for subjective evidential reasoning. [11] Catherine K. Murphy. Combining belief functions when evidence conflicts. Decision Support Systems, 29:1–9, 2000. 8