Belief Augmented Frames for
Knowledge Representation
Colin Tan,
Department of Computer Science,
School of Computing,
National University of Singapore.
Belief Augmented Frames
Motivation
• Frames
– Flexible, intuitive way of representing
knowledge.
– Frames represent an entity or a concept
– Frames consist of slots with values
• Represents relations between current frame and
other frames.
– Slots have events attached to them
• Can invoke procedures (“daemons”) whenever a
slot’s value is changed, removed etc.
Belief Augmented Frames
Motivation
• In the original definition of frames, slot-value pairs are “definite”.
• One improvement is to introduce
uncertainties into these relations.
Belief Augmented Frames
Motivation
• Statistical representations are not always ideal:
– If we are p% certain that a fact is true, this does not
mean that we are (100 − p)% certain that it is false.
• Various uncertainty reasoning methods introduced
to address this:
– Dempster-Shafer Theory
– Transferable Belief Model
– Probabilistic Argumentation Systems
Belief Augmented Frames
Motivation
• By combining uncertainty measures with
frames:
– Uncertainties in slot-value pair assignments
provide frames with greater expressiveness and
reasoning ability.
– Frames offer a neat, intuitive structure for
reasoning about uncertain relations.
Modeling Uncertainty in
Belief Augmented Frames
• Uncertainty not only on slot-value pair
assignments, but also on the existence of the
concept/object represented by the frame.
• The belief mass Tf is called the
“Supporting Mass”, and it is the degree of
support that the fact f is true.
• Likewise Ff is the Refuting Mass, and it is
the degree of support that the fact f is false.
Modeling Uncertainty in
Belief Augmented Frames
• In general:
– 0 ≤ Tf , Ff ≤ 1
– Tf is fully independent of Ff
– Tf + Ff need not sum to 1
• The Degree of Inclination DIf is defined as:
– DIf = Tf − Ff
• The Plausibility Plf is defined as:
– Plf = 1 − Ff
Combining Belief Masses
• Fuzzy-logic style min-max functions are
used to combine belief masses from
different facts.
• Given two facts P and Q:
– Conjunction
• TP∧Q = min(TP, TQ)
• FP∧Q = max(FP, FQ)
Combining Belief Masses
• Given two facts P and Q:
– Disjunction
• TP∨Q = max(TP, TQ)
• FP∨Q = min(FP, FQ)
– Negation
• T¬P = FP
• F¬P = TP
Reasoning Example
• Suppose we have the following facts:
– P :- A ∧ ¬B, P :- ¬A ∧ C ∧ ¬B, P :- ¬A ∨ B
– Suppose also that A, B and C are three facts
defined as follows (Fact, Tfact, Ffact):
• (A, 0.3, 0.4), (B, 0.9, 0.2), (C, 0.8, 0.3)
– Then to evaluate P:
• P = ((A ∧ ¬B) ∨ (¬A ∧ C ∧ ¬B)) ∧ (¬A ∨ B)
Reasoning Example
– P = ((A ∧ ¬B) ∨ (¬A ∧ C ∧ ¬B)) ∧ (¬A ∨ B)
• TP = min(max(min(TA, FB), min(FA, TC, FB)),
max(FA, TB))
= min(max(min(0.3, 0.2), min(0.4, 0.8, 0.2)),
max(0.4, 0.9))
= min(max(0.2, 0.2), max(0.4, 0.9))
= min(0.2, 0.9)
= 0.2
Reasoning Example
– P = ((A  B)  (A  C  B))  (A  B)
FP = max(min(max(FA, TB), max(TA, FC, TB)),
min(TA, TB))
= max(min(max(0.4, 0.9), max(0.3, 0.3, 0.9)),
min(0.3, 0.9))
= max(min(0.9, 0.9), min(0.3, 0.9))
= 0.9
DIP = 0.2 – 0.9 = -0.7
PlsP = 1 – 0.9 = 0.1
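The worked example can be checked mechanically. This sketch applies the min-max rules directly, using inline lambdas for brevity (the names AND/OR/NOT are illustrative, not from the slides).

```python
# Reproducing the worked reasoning example: each fact is a (T, F) pair.
A, B, C = (0.3, 0.4), (0.9, 0.2), (0.8, 0.3)

AND = lambda p, q: (min(p[0], q[0]), max(p[1], q[1]))
OR  = lambda p, q: (max(p[0], q[0]), min(p[1], q[1]))
NOT = lambda p: (p[1], p[0])

# P = ((A and not-B) or (not-A and C and not-B)) and (not-A or B)
P = AND(OR(AND(A, NOT(B)), AND(AND(NOT(A), C), NOT(B))),
        OR(NOT(A), B))

TP, FP = P
print(TP, FP)              # 0.2 0.9
print(round(TP - FP, 2))   # DI_P = -0.7
print(round(1 - FP, 2))    # Pl_P = 0.1
```

Because min and max simply select among their arguments, the masses of P are always masses of the original facts; no new values are introduced by combination.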
Major BAF Operations
• add_concept(con, ex_t, ex_f)
– Creates a new frame named “con” representing a
concept or an object. ex_t and ex_f are respectively the
supporting and refuting masses for the existence of
“con”.
• add_concept_inh(con, par, ex_t, ex_f, in_t, in_f)
– Creates a new frame “con” inheriting all the slot-value
assignments of a parent frame “par”.
– Sets up “isChild” and “isParent” relations between the
new and parent frames. in_t and in_f are respectively
the supporting and refuting masses for these relations.
Major BAF Operations
• add_rel(frame, slot, value, r_t, r_f)
– Adds a new relationship between the current frame
“frame” and another frame “value”, with the relation given
in “slot”. r_t and r_f are respectively the supporting and
refuting masses for this relation.
• abstract(frame, frame_set)
– Creates a new frame “frame” consisting of slot-value
assignments found in every frame in the given set
“frame_set”.
– The new frame will thus contain all the slot-value
assignments common to every frame in the set.
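The operations above can be sketched as methods on a small knowledge-base class. The class name, storage layout, and the choice of existence mass for an abstracted frame are assumptions for illustration, not the author's implementation.

```python
# Minimal sketch of the major BAF operations over an in-memory store.
class BAF:
    def __init__(self):
        # name -> {"exist": (T, F), "slots": {slot: {value: (T, F)}}}
        self.frames = {}

    def add_concept(self, con, ex_t, ex_f):
        self.frames[con] = {"exist": (ex_t, ex_f), "slots": {}}

    def add_rel(self, frame, slot, value, r_t, r_f):
        self.frames[frame]["slots"].setdefault(slot, {})[value] = (r_t, r_f)

    def add_concept_inh(self, con, par, ex_t, ex_f, in_t, in_f):
        parent = self.frames[par]
        self.add_concept(con, ex_t, ex_f)
        for slot, values in parent["slots"].items():  # inherit parent's slots
            self.frames[con]["slots"][slot] = dict(values)
        self.add_rel(con, "isChild", par, in_t, in_f)
        self.add_rel(par, "isParent", con, in_t, in_f)

    def abstract(self, frame, frame_set):
        # keep only slot-value assignments common to every frame in the set
        common = None
        for name in frame_set:
            pairs = {(s, v) for s, vs in self.frames[name]["slots"].items()
                     for v in vs}
            common = pairs if common is None else common & pairs
        self.frames[frame] = {"exist": (1.0, 0.0), "slots": {}}  # assumed mass
        for s, v in common or set():
            # take masses from the first frame in the set holding the pair
            first = next(n for n in frame_set
                         if v in self.frames[n]["slots"].get(s, {}))
            self.frames[frame]["slots"].setdefault(s, {})[v] = \
                self.frames[first]["slots"][s][v]
```

How abstract() should merge the belief masses of a shared slot-value pair is not specified in the slides; taking the first frame's masses is one simple placeholder choice.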
Daemons
• Daemons, or event handlers, may be
attached to slots to respond to any of the
following events.
– NewValue: A new value is being added to a slot.
– DelValue: An existing value is being deleted.
– UpdValue: The belief masses for an existing
value are being updated.
– NeedValue: An empty slot needs a value.
Daemons
• Daemons respond with:
– dm_Ignore: No action to be taken
– dm_Rule: Execute a returned rule set
– dm_Assign: Assign a returned value to the slot
– dm_Delete: Remove existing value from slot
• dm_Rule can be returned together with
dm_Assign, dm_Delete and dm_Ignore.
• dm_Assign has two modes:
– Exclusive: All previous slot assignments are removed
– Inclusive: Returned value is attached to previous
values.
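One way to picture the daemon mechanism is an event dispatch on the slot. The event names and response codes follow the slides; the dispatch mechanics and handler signature are assumptions.

```python
# Sketch of a slot whose NewValue daemon can veto or rewrite an assignment.
DM_IGNORE, DM_ASSIGN = "dm_Ignore", "dm_Assign"

class Slot:
    def __init__(self):
        self.values = {}    # value -> (T, F) belief masses
        self.daemons = {}   # event name -> handler

    def on(self, event, handler):
        self.daemons[event] = handler

    def add_value(self, value, t, f):
        handler = self.daemons.get("NewValue")
        if handler:
            action, payload = handler(value, (t, f))
            if action == DM_IGNORE:      # daemon vetoes the assignment
                return
            if action == DM_ASSIGN:      # inclusive mode: keep old values
                value, (t, f) = payload
        self.values[value] = (t, f)

# Example daemon: clamp any new supporting mass to at most 0.95.
slot = Slot()
slot.on("NewValue", lambda v, tf: (DM_ASSIGN, (v, (min(tf[0], 0.95), tf[1]))))
slot.add_value("mammal", 1.0, 0.0)
print(slot.values["mammal"])    # (0.95, 0.0)
```

Exclusive-mode dm_Assign would additionally clear self.values before storing, and dm_Delete/dm_Rule responses would follow the same dispatch pattern.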
Rules
• Rules may be attached to modify the
existence or relation belief values. E.g.
– rel(a, rel1, b) :- rel(c,rel2,d), rel(c,rel3,e)
– The Supporting and Refuting belief masses for
a rel1 b will depend on the Supporting and
Refuting belief masses for the right hand side.
– Rules are “Prolog-like”, everything on RHS is
taken to be a conjunction.
– Identically named rules are taken to be
conjunctions, and the final belief values are
evaluated as in the earlier example.
Application
• Text Classification
– Compute term frequency ft from all documents
in a class.
– Let fti be the term frequency of term t in
document i in class c. Let ftk be the term
frequency of term t in document k of class c′,
where c′ ≠ c.
• TCt = mini(fti)
• FCt = maxk(ftk)
Application
• Text Classification
– For an unseen document dm, extract all terms tm.
– Compute:
• TC = mint(TCt)
• FC = maxt(FCt)
– Here TCt, FCt are the Supporting and Refuting
masses for term t in class C, as computed earlier.
– Compute DIC = TC − FC
– The document is classified as class cmax =
argmaxC(DIC)
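The classification step above reduces to a min, a max, and an argmax per class. This sketch assumes the per-term masses have already been computed; the corpus and mass values are invented toy data.

```python
# BAF-style text classification: per class C, T_C is the minimum and
# F_C the maximum of the per-term masses, and we pick argmax_C DI_C.
def classify(doc_terms, class_masses):
    """class_masses: class -> {term: (T_Ct, F_Ct)}. Returns argmax_C DI_C."""
    best, best_di = None, None
    for c, masses in class_masses.items():
        known = [masses[t] for t in doc_terms if t in masses]
        if not known:
            continue            # no evidence for this class
        t_c = min(m[0] for m in known)
        f_c = max(m[1] for m in known)
        di = t_c - f_c          # degree of inclination DI_C
        if best_di is None or di > best_di:
            best, best_di = c, di
    return best

masses = {
    "sports":  {"goal": (0.8, 0.1), "team": (0.7, 0.2)},
    "finance": {"goal": (0.2, 0.6), "stock": (0.9, 0.1)},
}
print(classify(["goal", "team"], masses))   # sports
```

Note the asymmetry with Naive Bayes: a single strongly refuted term drives F_C (a max) and can sink a class, whereas Naive Bayes multiplies probabilities and can dilute such evidence.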
Application
• Text Classification
– Trained on 400 news articles broken into 5
categories.
– Tested on training set (inside test) and unseen
set of 100 articles.
– Evaluation of BAF vs. Naïve Bayes
Results – Inside Test
• Naive Bayes vs. BAF – Inside Test, accuracy (%):

Smoothing    Naïve Bayes   BAF
None         56.31         51
AddOne       85.1          88.4
Jperks       89.89         88.4
Results – Outside Test
• Naive Bayes vs. BAF – Outside Test, accuracy (%):

Smoothing    Naïve Bayes   BAF
None         36.54         64
AddOne       64.42         83.7
Jperks       67.3          83.7
Results - Analysis
• BAF outperforms Naïve Bayes for all
outside tests.
• BAF performs slightly less well than Naïve
Bayes without smoothing or with Jeffreys-Perks
smoothing for inside tests.
• Results are promising – use a thesaurus to
improve results?
Difficulties
• Semantics of slots is ill-defined
– There is no fixed way to use the slots to
represent relations between frames.
– This complicates the modeling of real-world
English sentences. E.g.
• The black cat stole the purse.
– Should this be modeled as stole(subject: cat,
object: purse), cat(action: stole, object: purse),
or purse(action: stolenby, subject: cat)?
Difficulties
• Many ways to derive a-priori Supporting
and Refuting masses.
– Some ways might be better than others.
• Separation of Supporting and Refuting
masses introduces additional problems that
can make modeling awkward and counterintuitive:
– E.g. Plf < Tf when Tf , Ff > 0.5
Difficulties
• The range for the sum of Tf and Ff falls in
[0, 2] instead of [0, 1]. This is again
counter-intuitive.
Future Work
• A model for measuring the quality of the
knowledge in the BAF knowledge base
should be developed.
• BAF should be applied to more problems to
study its behavior, especially against
established methods.