Discourse Understanding with
Discourse Representation Theory
and Belief Augmented Frames
Colin Tan,
Department of Computer Science,
School of Computing,
National University of Singapore.
Belief Augmented Frames
Motivation
• Frames
– Flexible, intuitive way of representing
knowledge.
– Frames represent an entity or a concept
– Frames consist of slots with values
• Represent relations between the current frame and
other frames.
– Slots have events attached to them
• Can invoke procedures (“daemons”) whenever a
slot’s value is changed, removed etc.
Belief Augmented Frames
Motivation
• In the original definition of frames, slot-value pairs are “definite”.
• One improvement is to introduce
uncertainties into these relations.
Belief Augmented Frames
Motivation
• Statistical representations are not always ideal:
– If we are p% certain that a fact is true, this doesn’t
mean that we are (100-p)% certain that it is false.
• Various uncertainty reasoning methods introduced
to address this:
– Dempster-Shafer Theory
– Transferable Belief Model
– Probabilistic Argumentation Systems
Belief Augmented Frames
Motivation
• By combining uncertainty measures with
frames:
– Uncertainties in slot-value pair assignments
provide frames with greater expressiveness and
reasoning ability.
– Frames offer a neat, intuitive structure for
reasoning about uncertain relations.
Modeling Uncertainty in
Belief Augmented Frames
• Uncertainty applies not only to slot-value pair
assignments, but also to the existence of the
concept/object represented by the frame.
• The belief mass Tf is called the
“Supporting Mass”, and it is the degree of
support that the fact f is true.
• Likewise Ff is the Refuting Mass, and it is
the degree of support that the fact f is false.
Modeling Uncertainty in
Belief Augmented Frames
• In general:
– 0 ≤ Tf, Ff ≤ 1
– Tf is fully independent of Ff
– Tf + Ff need not equal 1
• The Degree of Inclination DIf is defined as:
– DIf = Tf - Ff
• The Plausibility plf is defined as:
– plf = 1 - Ff
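The definitions above can be sketched in code. This is a minimal illustration; the class name BAFFact is ours, not from the talk:

```python
# Minimal sketch of a BAF belief pair, assuming the definitions above:
# 0 <= T, F <= 1, T independent of F, DI = T - F, pl = 1 - F.

class BAFFact:
    def __init__(self, t, f):
        assert 0.0 <= t <= 1.0 and 0.0 <= f <= 1.0
        self.t = t  # Supporting Mass: degree of support that the fact is true
        self.f = f  # Refuting Mass: degree of support that the fact is false

    def degree_of_inclination(self):
        return self.t - self.f  # DI falls in [-1, 1]

    def plausibility(self):
        return 1.0 - self.f  # pl falls in [0, 1]
```

Note that nothing forces t + f to equal 1, which is exactly the independence property stated above.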
Combining Belief Masses
• Fuzzy-logic style min-max functions are
used to combine belief masses from
different facts.
• Given two facts P and Q:
– Conjunction
• TP∧Q = min(TP, TQ)
• FP∧Q = max(FP, FQ)
Combining Belief Masses
• Given two facts P and Q:
– Disjunction
• TP∨Q = max(TP, TQ)
• FP∨Q = min(FP, FQ)
– Negation
• T¬P = FP
• F¬P = TP
Discourse Representation
Structures
• Discourse Representation Theory provides
the techniques and structures for resolving
important discourse-processing issues such as
anaphora and ellipsis.
• The main structure in DRT is the Discourse
Representation Structure, or DRS.
Example
• An example DRS representing “Pedro owns
a donkey” is shown below:
– [u1, u2: pedro(u1), donkey(u2), owns(u1, u2)]
• The symbols u1 and u2 are known as
referent markers.
Embedded DRSs
• Embedded DRSs are used to model more
complex relations:
– Conditionals: If Pedro owns a donkey he will
beat it.
[u1, u2: [pedro(u1), donkey(u2), owns(u1, u2)]
===> [u3, u4: u3 = u1, u4 = u2, beats(u3, u4)]]
Embedded DRSs
• Some, Few, Most, All, etc. are modeled
similarly. E.g.
– Some men who own donkeys love them.
[u1, u2: [men(u1), donkey(u2), own(u1, u2)]
=some=> [u3, u4: u3 = u1, u4 = u2, love(u3, u4)]]
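A DRS like the ones above can be represented as a simple structure: referent markers plus a list of conditions, where a condition may itself link two embedded DRSs with a quantifier. This is a hypothetical encoding, not the talk’s implementation:

```python
# Hypothetical encoding of a DRS: referent markers plus conditions.
# An implication condition is a (antecedent, quantifier, consequent) triple
# holding two embedded DRSs.

class DRS:
    def __init__(self, referents, conditions):
        self.referents = referents    # e.g. ["u1", "u2"]
        self.conditions = conditions  # predicates or embedded-DRS triples

# "If Pedro owns a donkey he will beat it."
antecedent = DRS(["u1", "u2"],
                 [("pedro", "u1"), ("donkey", "u2"), ("owns", "u1", "u2")])
consequent = DRS(["u3", "u4"],
                 [("=", "u3", "u1"), ("=", "u4", "u2"), ("beats", "u3", "u4")])
conditional = DRS([], [(antecedent, "every", consequent)])
```

Swapping the quantifier label (e.g. "some" instead of "every") gives the second example above.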
From DRS to BAF
• Conversion from DRS to BAF is straightforward:
– All nouns and objects are inserted as new frames in the
BAF:
• New frames for Pedro(u1) and Donkey(u2) are created.
– All relations between nouns and objects in the DRS are
modeled as slot-value pairs in the BAF. E.g.
• beats(u1, u2)
• u1 is resolved to Pedro and u2 to the donkey; a slot beats is
created in the Pedro frame, and the donkey frame is assigned to it.
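The conversion step can be sketched as follows. The function and frame names are illustrative, and referent resolution is assumed already done:

```python
# Sketch of DRS -> BAF conversion: each unary condition introduces a frame,
# and each binary relation becomes a slot-value pair on the first
# argument's frame.

def drs_to_baf(unary_conditions, binary_conditions):
    frames = {}
    resolve = {}  # referent marker -> frame name
    for pred, ref in unary_conditions:        # e.g. ("pedro", "u1")
        frames[pred] = {}
        resolve[ref] = pred
    for rel, a, b in binary_conditions:       # e.g. ("owns", "u1", "u2")
        frames[resolve[a]][rel] = resolve[b]  # slot 'owns' -> frame 'donkey'
    return frames

baf = drs_to_baf([("pedro", "u1"), ("donkey", "u2")],
                 [("owns", "u1", "u2")])
```

Here `baf` maps the frame "pedro" to a slot owns whose value is the frame "donkey".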
From DRS to BAF
• Uncertainties for slot-value assignments:
– For simple relations (e.g. Pedro owns a
donkey):
• Towns(pedro, donkey) = α
• Fowns(pedro, donkey) = 1 - α
– Here α is our degree of belief in the reliability
of the source that told us that Pedro owns a
donkey.
From DRS to BAF
• Alternatively, if person C says that Pedro
doesn’t own a donkey, then:
• Towns(pedro, donkey) = α
• Fowns(pedro, donkey) = β
– Here β is our degree of belief in C’s reliability.
– This example illustrates the expressive power
of making T and F separate and fully
independent.
From DRS to BAF
• “Fuzzy” relations like some, most, etc. can
be represented in BAF by using fuzzy-logic
style membership functions.
– E.g. Some boys beat their donkeys
• Let S be the set of boys who beat their donkeys.
– Tbeat(boys, donkeys) = f(|S|)
– Fbeat(boys, donkeys) = 1 - Tbeat(boys, donkeys)
– Here f(.) is a monotonically increasing function
defined on the range [0, 1], similar to the
fuzzy-logic S function.
From DRS to BAF
• Other “fuzzy” notions can also be similarly
expressed:
– All boys beat their donkeys
• Let S be the set of boys who beat their donkeys, and
let D be the set of boys who own donkeys. Then:
• Tbeat(boys, donkeys) = π(|S|, |D|)
• Fbeat(boys, donkeys) = 1 - Tbeat(boys, donkeys)
– Here π is a proximity function, defined on [0,
1], that increases as |S| approaches |D|.
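Concrete shapes for the membership and proximity functions might look like the sketch below. These particular definitions are our illustrative choices; the theory only requires the monotonicity properties stated above:

```python
# Illustrative membership functions for "some" and "all".
# f_some: monotonically increasing in |S|, saturating at 1 (our chosen shape).
# prox_all: approaches 1 as |S| approaches |D| (a simple ratio proximity).

def f_some(size, saturation=5):
    """Degree to which 'some' holds, given |S|; saturates at 1."""
    return min(size / saturation, 1.0)

def prox_all(s_size, d_size):
    """Degree to which 'all' holds: reaches 1 when |S| == |D|."""
    if d_size == 0:
        return 1.0  # vacuously true when no boy owns a donkey
    return s_size / d_size

t_some = f_some(3)        # T for "some boys beat their donkeys", |S| = 3
t_all = prox_all(3, 10)   # T for "all boys beat their donkeys", |S|=3, |D|=10
f_all = 1.0 - t_all       # F, per the complement rule above
```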
Applications
• Several applications of BAFs are currently being
developed:
– Q & A system
• Digests newswire articles and answers questions.
• Most direct application of the topics in today’s talk.
– Text Classification System
• Uses the abstraction feature of BAFs to learn the features of
document classes.
• Uses these features to classify unseen documents.
• Results are better than Naïve Bayes.
Difficulties
• Semantics of slots is ill-defined
– There is no fixed way to use the slots to
represent relations between frames.
– This complicates the modeling of real-world
English sentences. E.g.
• The black cat stole the purse.
– Should this be modeled as stole(subject: cat,
object: purse), cat(action: stole, object: purse), or
purse(action: stolenby, subject: cat)?
Difficulties
• Many ways to derive a-priori Supporting
and Refuting masses.
– Some ways might be better than others.
• Separation of Supporting and Refuting
masses introduces additional problems that
can make modeling awkward and counterintuitive:
– E.g. plf < Tf when Tf, Ff > 0.5
Difficulties
• The sum of Tf and Ff falls in the range
[0, 2] instead of [0, 1]. This is again
counter-intuitive.
Future Work
• More work to be done in incorporating
linguistic hedges like very and somewhat.
• A model for measuring the quality of the
knowledge in the BAF knowledge base
should be developed.