ICoS-4

Working with
Discourse Representation Theory
Patrick Blackburn & Johan Bos
Lecture 5
Applying DRT
Today
Given what we know about DRT, both
from a theoretical and practical
perspective, can we use it for practical
applications?
Outline
Spoken dialogue system with DRT
 Using DRT and inference to
control a mobile robot
Wide coverage parsing with DRT
 Recognising Textual Entailment
Human-Computer Dialogue
Focus on small domains
 Grammatical coverage ensured
 Background knowledge encoding
Spoken Dialogue system
 Godot the robot
 Speech recognition and synthesis
 People could give Godot directions, ask it
questions, tell it new information
 Godot was a REAL robot
Godot the Robot
Godot with Tetsushi Oka
Typical conversation with Godot
Simple dialogues in the beginning
Human: Robot?
Robot: Yes?
Human: Where are you?
Robot: I am in the hallway.
Human: OK. Go to the rest room!
Adding DRT to the robot
The language model that the robot used for
speech recognition returned DRSs
We used the model builder MACE and the
theorem prover SPASS for inference
The model produced by MACE was used to
find out what the robot should do
This was possible as MACE produces
minimal models
Of course we also checked for consistency
and informativity
Advanced conversation with Godot
Dealing with inconsistency and informativity
Human: Robot, where are you?
Robot: I am in the hallway.
Human: You are in my office.
Robot: No, that is not true.
Human: You are in the hallway.
Robot: Yes I know!
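The checks behind this exchange can be sketched in miniature. This is a hedged toy illustration, not the actual MACE/SPASS pipeline: the knowledge base is just a set of literals, and the `negate`/`respond` functions are invented for the sketch.

```python
# Toy knowledge base: a set of literals; "-" marks negation.
def negate(literal):
    return literal[1:] if literal.startswith("-") else "-" + literal

def respond(kb, statement):
    """Godot-style reaction to a new statement."""
    if negate(statement) in kb:    # inconsistent with what the robot knows
        return "No, that is not true."
    if statement in kb:            # consistent but uninformative
        return "Yes, I know!"
    kb.add(statement)              # consistent and informative: accept it
    return "OK."

kb = {"in(godot, hallway)"}
print(respond(kb, "-in(godot, hallway)"))  # No, that is not true.
print(respond(kb, "in(godot, hallway)"))   # Yes, I know!
```

In the real system the two tests are theorem-proving tasks over first-order translations of the DRSs, not set membership.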
Videos of Godot
Video 1: Godot in the basement of Buccleuch Place
Video 2: Screenshot of dialogue manager with DRSs and camera view of Godot
Minimal Models
Model builders normally generate
models by iteration over the domain
size
As a side-effect, the output is a model
with a minimal domain size
From a linguistic point of view, this is
interesting, as there is no redundant
information
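The iteration a model builder performs can be illustrated with a brute-force sketch. This is a hedged toy version, not MACE itself (the unary predicate `light` and the formulas are invented for illustration): try domain sizes in increasing order and return the first size at which the formula is satisfiable.

```python
from itertools import product

def satisfiable(formula, size):
    """Try every interpretation of the unary predicate `light`
    on a domain of `size` entities."""
    domain = range(size)
    for ext in product([False, True], repeat=size):
        if formula(domain, lambda x: ext[x]):
            return True
    return False

def minimal_model_size(formula, max_size=5):
    """Iterate over domain sizes, as MACE does; the first satisfiable
    size is, by construction, the minimal model size."""
    for size in range(1, max_size + 1):
        if satisfiable(formula, size):
            return size
    return None

# "There is a light": the minimal model has a single entity
exists_light = lambda dom, light: any(light(x) for x in dom)
print(minimal_model_size(exists_light))  # 1
```

Because the search starts from the smallest domain, the model that comes out contains no redundant entities, which is exactly the property the robot exploits.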
Using models
Examples:
Turn on a light.
Turn on every light.
Turn on everything except the radio.
Turn off the red light or the blue light.
Turn on another light.
Adding presupposition
Godot was connected to an automated home
environment
One day, I asked Godot to switch on all the
lights
However, Godot refused to do this,
responding that it was unable to do so.
Why was that?
 At first I thought that the theorem prover made a
mistake.
 But it turned out that one of the lights was already
on.
Intermediate Accommodation
Because I had coded "switch on X" with the precondition that X is not on, the theorem prover found a proof (the command was inconsistent with the state of the world).
Coding this as a presupposition would not give an inconsistency, but a beautiful case of intermediate accommodation.
In other words:
 Switch on all the lights!
[ All lights are off; switch them on.]
[=Switch on all the lights that are currently off]
Sketch of resolution
[x | Robot(x)] ;
[y | Light(y)] ⇒ [e | switch(e), Agent(e,x), Theme(e,y)]
with the presupposed DRS [ | Off(y)] still to be resolved
Global Accommodation
[x | Robot(x), Off(y)] ;
[y | Light(y)] ⇒ [e | switch(e), Agent(e,x), Theme(e,y)]
(y is not accessible from the global DRS, so Off(y) would contain a free variable)
Intermediate Accommodation
[x | Robot(x)] ;
[y | Light(y), Off(y)] ⇒ [e | switch(e), Agent(e,x), Theme(e,y)]
Local Accommodation
[x | Robot(x)] ;
[y | Light(y)] ⇒ [e | switch(e), Agent(e,x), Theme(e,y), Off(y)]
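A hedged toy encoding (my own, not from the lecture) makes the contrast between the three accommodation sites concrete: given a world where one light is already on, only the intermediate reading yields a sensible, consistent instruction.

```python
# Toy world for "Switch on all the lights!", presupposition: Off(y).
lights = {"l1": True, "l2": False, "l3": True}   # True = currently off

def interpret(reading, lights):
    """Which lights get switched under each accommodation site."""
    if reading == "global":
        # Off(y) lands in the outermost DRS, where y is not accessible:
        # this resolution is ruled out (free variable)
        return None
    if reading == "intermediate":
        # Off(y) joins the restrictor: switch every light that is off
        return {y for y, off in lights.items() if off}
    if reading == "local":
        # Off(y) joins the consequent: asserts every light is off,
        # inconsistent in a world where some light is already on
        return "inconsistent" if not all(lights.values()) else set(lights)

print(sorted(interpret("intermediate", lights)))  # ['l1', 'l3']
```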
Outline
Spoken dialogue system with DRT
 Using DRT and inference to
control a mobile robot
Wide coverage parsing with DRT
 Recognising Textual Entailment
Wide-coverage DRT
Nowadays we have robust wide-coverage parsers that use stochastic methods to produce parse trees
Trained on the Penn Treebank
Examples are the parsers of Collins and Charniak
Wide-coverage parsers
Say we wished to produce DRSs from the output of these parsers
We would need quite detailed syntax derivations
Closer inspection reveals that many of these parsers use several thousand phrase structure rules
Often, long-distance dependencies are not recovered
Combinatory Categorial Grammar
CCG is a lexicalised theory of grammar
(Steedman 2001)
Deals with complex cases of
coordination and long-distance
dependencies
Lexicalised, hence easy to implement
 English wide-coverage grammar
 Fast robust parser available
Categorial Grammar
Lexicalised theory of syntax
 Many different lexical categories
 Few grammar rules
Finite set of categories defined over a base of core categories
 Core categories: s, np, n, pp
 Combined categories: np/n, s\np, (s\np)/np
CCG: type-driven lexicalised grammar
Category        Name              Examples
N               noun              Ralph, car
NP              noun phrase       everyone
NP/N            determiner        a, the, every
S\NP            intrans. verb     walks, smiles
(S\NP)/NP       transitive verb   loves, hates
(S\NP)\(S\NP)   adverb            quickly
CCG: combinatorial rules
Forward Application (FA)
Backward Application (BA)
Generalised Forward Composition (FC)
Backward Crossed Composition (BC)
Type Raising (TR)
Coordination
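The two application rules can be sketched directly on category terms. A minimal illustration, assuming a tuple encoding of slash categories that is invented here for the sketch:

```python
# Categories: atoms are strings, slash categories are
# (slash, result, argument) triples -- an encoding invented for this sketch.

def fa(fun, arg):
    """Forward application: X/Y  Y  =>  X"""
    if isinstance(fun, tuple) and fun[0] == "/" and fun[2] == arg:
        return fun[1]
    return None

def ba(arg, fun):
    """Backward application: Y  X\\Y  =>  X"""
    if isinstance(fun, tuple) and fun[0] == "\\" and fun[2] == arg:
        return fun[1]
    return None

det = ("/", "np", "n")      # determiner:  np/n
iv  = ("\\", "s", "np")     # intransitive verb:  s\np

np = fa(det, "n")           # "a" + "spokesman"
print(np)                   # np
print(ba(np, iv))           # s  ("a spokesman" + "lied")
```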
CCG derivation
NP/N:a   N:spokesman   S\NP:lied
------------------------------- (FA)
NP: a spokesman
---------------------------------------- (BA)
S: a spokesman lied
Coordination in CCG
np:Artie  (s\np)/np:likes  (x\x)/x:and  np:Tony  (s\np)/np:hates  np:beans
TR: np:Artie ⇒ s/(s\np):Artie    TR: np:Tony ⇒ s/(s\np):Tony
FC: s/(s\np):Artie + (s\np)/np:likes ⇒ s/np: Artie likes
FC: s/(s\np):Tony + (s\np)/np:hates ⇒ s/np: Tony hates
FA: (x\x)/x:and + s/np:Tony hates ⇒ (s/np)\(s/np): and Tony hates
BA: s/np:Artie likes + (s/np)\(s/np):and Tony hates ⇒ s/np: Artie likes and Tony hates
FA: s/np:Artie likes and Tony hates + np:beans ⇒ s: Artie likes and Tony hates beans
The Glue
Use the Lambda Calculus to combine
CCG with DRT
Each lexical entry gets a DRS with
lambda-bound variables, representing
the “missing” information
Each combinatorial rule in CCG gets a
semantic interpretation, again using the
tools of the lambda calculus
Interpreting Combinatorial Rules
Each combinatorial rule in CCG is expressed in terms of the lambda calculus:
 Forward Application: FA(α,β) = α@β
 Backward Application: BA(α,β) = β@α
 Type Raising: TR(α) = λx.x@α
 Function Composition: FC(α,β) = λx.α@(β@x)
CCG: lexical semantics
Category   Semantics                                   Example
N          λx.[ | spokesman(x)]                        spokesman
NP/N       λp.λq.(([x | ] ; p@x) ; q@x)                a
S\NP       λx.(x @ λy.[e | lie(e), agent(e,y)])        lied
CCG derivation
NP/N:a        λp.λq.(([x | ] ; p@x) ; q@x)
N:spokesman   λz.[ | spokesman(z)]
S\NP:lied     λx.(x @ λy.[e | lie(e), agent(e,y)])
-------------------------------------------------------- (FA)
NP: a spokesman
λp.λq.(([x | ] ; p@x) ; q@x) @ λz.[ | spokesman(z)]
= λq.([x | spokesman(x)] ; q@x)
-------------------------------------------------------------------------------- (BA)
S: a spokesman lied
λx.(x @ λy.[e | lie(e), agent(e,y)]) @ λq.([x | spokesman(x)] ; q@x)
= λq.([x | spokesman(x)] ; q@x) @ λy.[e | lie(e), agent(e,y)]
= [x | spokesman(x)] ; [e | lie(e), agent(e,x)]
= [x e | spokesman(x), lie(e), agent(e,x)]
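The whole derivation can be replayed with Python closures standing in for lambda terms and pairs of sets standing in for DRS boxes. This is a hedged sketch of the composition, not the actual implementation (which is in Prolog); variable renaming is glossed over, so the discourse referent is hard-wired as x.

```python
# A DRS is (referents, conditions); merge (;) is componentwise union.
def drs(refs, conds):
    return (frozenset(refs), frozenset(conds))

def merge(d1, d2):
    return (d1[0] | d2[0], d1[1] | d2[1])

# Lexical semantics, following the slide:
a         = lambda p: lambda q: merge(merge(drs({"x"}, set()), p("x")), q("x"))
spokesman = lambda z: drs(set(), {f"spokesman({z})"})
lied      = lambda x: x(lambda y: drs({"e"}, {"lie(e)", f"agent(e,{y})"}))

np = a(spokesman)   # FA: "a spokesman"
s  = lied(np)       # BA: β@α, i.e. lied applied to the NP semantics
print(sorted(s[0]), sorted(s[1]))
# ['e', 'x'] ['agent(e,x)', 'lie(e)', 'spokesman(x)']
```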
The Clark & Curran Parser
Use standard statistical techniques
 Robust wide-coverage parser
 Clark & Curran (ACL 2004)
Grammar derived from CCGbank
 409 different categories
 Hockenmaier & Steedman (ACL 2002)
Results: 96% coverage WSJ
 Bos et al. (COLING 2004)
Applications
It has been used for different kinds of applications
 Question Answering
 Recognising Textual Entailment
Recognising Textual Entailment
A task for NLP systems to recognise
entailment between two (short) texts
Introduced in 2004/2005 as part of the PASCAL Network of Excellence
Proved to be a difficult, but popular task
PASCAL provided a development and test set of several hundred examples
RTE Example (entailment)
RTE 1977 (TRUE)
His family has steadfastly denied the
charges.
-----------------------------------------------------
The charges were denied by his family.
RTE Example (no entailment)
RTE 2030 (FALSE)
Lyon is actually the gastronomical capital
of France.
-----------------------------------------------------
Lyon is the capital of France.
Aristotle’s Syllogisms
ARISTOTLE 1 (TRUE)
All men are mortal.
Socrates is a man.
-------------------------------
Socrates is mortal.
How to deal with RTE
There are several methods
We will look at five of them to see how
difficult RTE actually is
Recognising Textual Entailment
Method 1:
Flipping a coin
Flipping a coin
Advantages
 Easy to implement
Disadvantages
 Just 50% accuracy
Recognising Textual Entailment
Method 2:
Calling a friend
Calling a friend
Advantages
 High accuracy (95%)
Disadvantages
 Lose friends
 High phone bill
Recognising Textual Entailment
Method 3:
Ask the audience
Ask the audience
RTE 893 (????)
The first settlements on the site of Jakarta were
established at the mouth of the Ciliwung, perhaps
as early as the 5th century AD.
----------------------------------------------------------------
The first settlements on the site of Jakarta were
established as early as the 5th century AD.
Human Upper Bound
RTE 893 (TRUE)
The first settlements on the site of Jakarta were
established at the mouth of the Ciliwung, perhaps
as early as the 5th century AD.
----------------------------------------------------------------
The first settlements on the site of Jakarta were
established as early as the 5th century AD.
Recognising Textual Entailment
Method 4:
Word Overlap
Word Overlap Approaches
Popular approach
Ranging in sophistication from a simple bag of words to the use of WordNet
Accuracy rates ca. 55%
Word Overlap
Advantages
 Relatively straightforward algorithm
Disadvantages
 Hardly better than flipping a coin
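A minimal word-overlap baseline fits in a few lines. This sketch scores the proportion of hypothesis words that also occur in the text; the entailment threshold is an invented placeholder, not a tuned value.

```python
def overlap(text, hyp):
    """Proportion of hypothesis words that also occur in the text."""
    t, h = set(text.lower().split()), set(hyp.lower().split())
    return len(h & t) / len(h)

def entails(text, hyp, threshold=0.7):
    # Threshold invented for this sketch; real systems tune it on dev data
    return overlap(text, hyp) >= threshold

t = "His family has steadfastly denied the charges"
h = "The charges were denied by his family"
print(round(overlap(t, h), 2))   # 0.71
```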
RTE State-of-the-Art
PASCAL RTE challenge
Hard problem
Requires semantics
[Histogram: accuracy of the n=25 systems at RTE 2004/5, binned from 0.49–0.50 up to 0.59–0.60]
Recognising Textual Entailment
Method 5:
Using DRT
Inference
How do we perform inference with
DRSs?
 Translate DRS into first-order logic,
use off-the-shelf inference engines.
What kind of inference engines?
 Theorem Provers
 Model Builders
Using Theorem Proving
Given a textual entailment pair T/H with
text T and hypothesis H:
 Produce DRSs for T and H
 Translate these DRSs into FOL
 Give this to the theorem prover:
T′ → H′
If the theorem prover finds a proof, then
T entails H
Vampire (Riazanov & Voronkov 2002)
Let’s try this. We will use the theorem
prover Vampire (currently the best
known theorem prover for FOL)
This gives us good results for:
 apposition
 relative clauses
 coordination
 intersective adjectives/complements
 passive/active alternations
Example (Vampire: proof)
RTE-2 112 (TRUE)
On Friday evening, a car bomb exploded
outside a Shiite mosque in Iskandariyah,
30 miles south of the capital.
-----------------------------------------------------
A bomb exploded outside a mosque.
Example (Vampire: proof)
RTE-2 489 (TRUE)
Initially, the Bundesbank opposed the
introduction of the euro but was compelled
to accept it in light of the political pressure
of the capitalist politicians who supported
its introduction.
-----------------------------------------------------
The introduction of the euro has been
opposed.
Background Knowledge
However, it doesn’t give us good results
for cases requiring additional
knowledge
 Lexical knowledge
 World knowledge
We will use WordNet as a start to get
additional knowledge
All of WordNet is too much, so we
create MiniWordNets
MiniWordNets
MiniWordNets
 Use hyponym relations from WordNet to
build an ontology
 Do this only for the relevant symbols
 Convert the ontology into first-order
axioms
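Turning hyponym edges into first-order axioms is mechanical. A hedged sketch with a hand-coded dictionary standing in for the relevant WordNet fragment; only the implication axioms are generated here.

```python
# Hand-coded hyponym -> hypernym edges standing in for the WordNet
# fragment relevant to one RTE pair.
hypernym = {"researcher": "person", "worker": "person", "smoker": "person"}

def axioms(hypernym):
    """One first-order implication per hyponym edge."""
    return [f"all x ({hypo}(x) -> {hyper}(x))"
            for hypo, hyper in hypernym.items()]

for ax in axioms(hypernym):
    print(ax)
```

In the full pipeline these formulas are handed to the theorem prover as background knowledge BK.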
MiniWordNet: an example
Example text:
There is no asbestos in our products now.
Neither Lorillard nor the researchers who
studied the workers were aware of any
research on smokers of the Kent cigarettes.
MiniWordNet: an example
Example text:
There is no asbestos in our products now.
Neither Lorillard nor the researchers who
studied the workers were aware of any
research on smokers of the Kent cigarettes.
∀x(user(x) → person(x))
∀x(worker(x) → person(x))
∀x(researcher(x) → person(x))
∀x(person(x) → ¬risk(x))
∀x(person(x) → ¬cigarette(x))
…
Using Background Knowledge
Given a textual entailment pair T/H with text T and hypothesis H:
 Produce DRSs for T and H
 Translate DRS(T) and DRS(H) into FOL
 Create background knowledge for T and H
 Give this to the theorem prover:
(BK ∧ T′) → H′
MiniWordNets at work
RTE 1952 (TRUE)
Crude oil prices soared to record levels.
-----------------------------------------------------
Crude oil prices rise.
Background Knowledge:
∀x(soar(x) → rise(x))
Troubles with theorem proving
Theorem provers are extremely
precise.
They won’t tell you when there is
“almost” a proof.
Even if there is a little background knowledge missing, Vampire will say: NO
Vampire: no proof
RTE 1049 (TRUE)
Four Venezuelan firefighters who were traveling to a training course in Texas were killed when their sport utility vehicle drifted onto the shoulder of a highway and struck a parked truck.
----------------------------------------------------------------
Four firefighters were killed in a car accident.
Using Model Building
Need a robust way of inference
Use model builder Paradox
 Claessen & Sorensson (2003)
Use size of (minimal) model
 Compare size of model of T and T&H
 If the difference is small, then it is likely
that T entails H
Using Model Building
Given a textual entailment pair T/H with text T and hypothesis H:
 Produce DRSs for T and H
 Translate these DRSs into FOL
 Generate background knowledge
 Give this to the model builder:
i) BK ∧ T′
ii) BK ∧ T′ ∧ H′
If the models for i) and ii) are similar in size,
then we predict that T entails H
Example 1
T: John met Mary in Rome
H: John met Mary
Model T: 3 entities
Model T+H: 3 entities
Model size difference: 0
Prediction: entailment
Example 2
T: John met Mary
H: John met Mary in Rome
Model T: 2 entities
Model T+H: 3 entities
Model size difference: 1
Prediction: no entailment
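The decision rule behind these two examples is tiny. A sketch of it; the `tolerance` parameter is an invented knob for illustration, and the real system compared model sizes more carefully.

```python
def predict(size_t, size_th, tolerance=0):
    """Predict entailment when adding H grows the minimal model of T
    by at most `tolerance` entities (tolerance is an invented knob)."""
    return "entailment" if size_th - size_t <= tolerance else "no entailment"

print(predict(3, 3))   # entailment     (John met Mary in Rome / John met Mary)
print(predict(2, 3))   # no entailment  (John met Mary / John met Mary in Rome)
```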
Model size differences
Of course this is a very rough
approximation
But it turns out to be a useful one
Gives us a notion of robustness
Of course we need to deal with
negation as well
 Give ¬T and ¬(T ∧ H) to the model builder
 Not necessarily one unique minimal model
Lack of Background Knowledge
RTE-2 235 (TRUE)
Indonesia says the oil blocks are within its borders,
as does Malaysia, which has also sent warships
to the area, claiming that its waters and airspace
have been violated.
---------------------------------------------------------------
There is a territorial waters dispute.
How well does this work?
We tried this at the RTE 2004/05
Combined this with a shallow approach
(word overlap)
Using standard machine learning methods to
build a decision tree
Features used:
 Proof (yes/no)
 Model size
 Model size difference
 Word overlap
 Task (source of RTE pair)
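A hand-written stand-in for the learned decision tree shows how deep and shallow features can combine; the thresholds below are invented for illustration and are not the ones the classifier actually learned.

```python
def rte_decision(proof, model_diff, word_overlap):
    """Combine deep and shallow features; thresholds are invented."""
    if proof:                                   # the prover found T' -> H'
        return True
    if model_diff <= 0.5 and word_overlap >= 0.6:
        return True                             # robust "almost proof"
    return False

print(rte_decision(False, 0.0, 0.71))   # True
print(rte_decision(False, 2.0, 0.10))   # False
```

In the actual experiments the combination was learned from data with standard machine-learning methods, not hand-written.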
RTE Results 2004/5
              Accuracy   CWS
Shallow        0.569     0.624
Deep           0.562     0.608
Hybrid (S+D)   0.577     0.632
Hybrid+Task    0.612     0.646
Bos & Markert 2005
Conclusions
We have got the tools for doing
computational semantics in a principled
way using DRT
For many applications, success
depends on the ability to systematically
generate background knowledge
 Small restricted domains [dialogue]
 Open domain
What we did in this course
 We introduced DRT, a notational variant of first-order
logic.
 Semantically, we can handle in DRT anything we can
in FOL, including events.
 Moreover, because it is so close to FOL, we can use
first-order methods to implement inference for DRT.
 The DRT box syntax is essentially about nesting contexts, which allows a uniform treatment of anaphoric phenomena.
 Moreover, this works not only on the theoretical
level, but is also implementable, and even
applicable.
What we hope you got out of it
First, we hope we made you aware that
nowadays computational semantics is able to
handle some difficult problems.
Second, we hope we made you aware that
DRT is not just a theory. It is a complete
architecture allowing us to experiment with
computational semantics.
Third, we hope you are aware that state-of-the-art inference engines can help to study or apply semantics.
Where you can find more
 For more on DRT
read the standard
textbook devoted to
DRT by Kamp and
Reyle. This book
discusses not only
the basic theory, but
also plurals, tense,
and aspect.
Where you can find more
 For more on the
basic architecture
underlying this work
on computational
semantics, and
particular on
implementations on
the lambda calculus,
and parallel use of
theorem provers and
model builders, see:
www.blackburnbos.org
Where you can find more
 All of the theory we
discussed in this course is
implemented in Prolog. This
software can be downloaded
from www.blackburnbos.org.
For an introduction to Prolog
written very much with this
software in mind, try
Learn Prolog Now!
www.learnprolognow.org