Reconnecting Computational Linguistics to
Artificial Intelligence and Cognitive Science
Raymond J. Mooney
Machine Learning Group
Department of Computer Sciences
University of Texas at Austin
Current Focus of Computational Linguistics
• Well-defined problems with large annotated corpora for
training and evaluating systems.
• Focus on engineering goals of achieving demonstrable,
measurable performance on complex realistic data.
• Competitions, Competitions, Competitions!
– CoNLL shared tasks
– Senseval/SemEval
– DUC
– MT-Eval
– BioCreative
• Research driven by the demands of government funding
agencies run by bureaucrats, not scientists.
• Research driven by the immediate needs of industry.
• Little focus on using computation to gain insight into
interesting and important linguistic, psycholinguistic,
cognitive science, or AI issues.
Computational Linguistics is Becoming
Boring and Incremental!
CompLing and AI
• Is CompLing a part of Linguistics?
• Is CompLing a part of AI?
• Real language understanding is an AI-complete
problem.
– Requires extensive knowledge of the world to resolve
ambiguities.
• The dog is in the pen. The ink is in the pen.
• John ate the spaghetti with a fork. John ate spaghetti with meatballs.
– Requires reasoning about beliefs, goals, plans, etc.
• Today was Jack's birthday. Penny and Janet went to the store. They
were going to get presents. Janet decided to get a kite. "Don't do
that," said Penny. "Jack has a kite. He will make you take it back."
AI–NLP in the 70’s and early 80’s
• Focused on knowledge representation and complex
inference.
• Focused on story understanding and interpreting characters'
goals and plans (Charniak, Schank et al.).
• Focused on dialog understanding that inferred participants'
goals and plans.
• Required extensive manual symbolic knowledge
engineering.
• Only worked on a small number of specially concocted
illustrative examples.
• Brittle and difficult to generalize to robust, reliable
performance on unseen examples.
Story Understanding
• My PhD thesis (1987) involved learning plan
schemata for story understanding in the
“Schankian” tradition.
• Used hand-coded symbolic knowledge to “deeply
understand” short, concocted stories by
understanding the plans and goals of the characters.
• GENESIS learned new plan schemata from a single
example using explanation-based learning to
improve its future understanding (a minimal sketch of the
schema-generalization step follows below).
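The "Generalizing" step can be illustrated with a small, hedged sketch: given an explanation as a list of ground facts, replace each constant with a variable to obtain a reusable schema. The list-of-tuples representation and the fact names below are illustrative assumptions, not GENESIS's actual knowledge representation, and the pruning and packaging steps are omitted.

def generalize_explanation(ground_facts):
    """Replace each distinct constant with a variable, turning a specific
    explanation into a reusable schema (the 'Generalizing' step only)."""
    variables = {}
    schema = []
    for pred, *args in ground_facts:
        new_args = []
        for a in args:
            if a not in variables:
                variables[a] = "?x{}".format(len(variables) + 1)
            new_args.append(variables[a])
        schema.append((pred, *new_args))
    return schema

# A fragment of the kidnapping explanation (illustrative facts only).
facts = [("captures", "john", "mary"),
         ("demands", "john", "fred", "ransom"),
         ("pays", "fred", "john", "ransom"),
         ("releases", "john", "mary")]
print(generalize_explanation(facts))
# -> [('captures', '?x1', '?x2'), ('demands', '?x1', '?x3', '?x4'), ...]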
GENESIS Trace
Initial Schema Learning
Input: Fred is Mary's father and is a millionaire. John approached Mary and pointed a gun at her. She
was wearing blue jeans. He told her if she did not get in his car then he would shoot her. He drove
her to his hotel and locked her in his room. John called Fred and told him John was holding Mary
captive. John told Fred if Fred gave him 250000 dollars at Trenos then John would release Mary.
Fred paid him the ransom and the kidnapper released Mary. Valerie is Fred's wife and he told her that
someone had kidnapped Mary.
Thematic goal achieved: John is happy that John has the $250000.
Explanation suitable for generalization. Pruning...Generalizing...Packaging...
Creating New Schema: (CaptureBargain ?x55 ?a34 ?b9 ?c4 ?r5 ?y5 ?l11)
?b9 is a person. ?c4 is a location. ?r5 is a room. ?c4 is in ?r5. ?x55 is a character. ?b9 is free. ?x55
captures ?b9 and locks him/her in ?r5. ?a34 is a character. ?x55 contacts ?a34 and tells it that ?b9 is
?x55's captive. ?y5 is a valuable. ?x55 wants to have ?y5 more than it wants ?b9 to be ?x55's captive.
?a34 has a positive relationship with ?b9. ?a34 has ?y5. ?x55 and ?a34 carry out a bargain in which
?x55 releases ?b9 and ?a34 gives ?x55 ?y5 at ?l11.
Unknown word 'ransom' refers to ?y5 in CaptureBargain
Unknown word 'kidnapper' refers to ?x55 in CaptureBargain
Having: ?x55 captured ?b9. suggest CaptureBargain
Having: ?x55 contacted ?a34 and told it that ?b9 was ?x55's captive. suggest CaptureBargain
Having: ?x55 and ?a34 carried out a bargain in which ?x55 released ?b9 and ?a34 gave ?x55 ?y5 at
?l11. suggest CaptureBargain
Unknown word '?x55 kidnap ?b9' refers to CaptureBargain
GENESIS Trace
Question Answering
Ready for questions:
>Summarize
Mary was free. John captured Mary and locked her in a room. John contacted Fred and told him that
Mary was John's captive. John wanted to have $250000 more than he wanted Mary to be John's
captive. Fred had a positive relationship with Mary. Fred had the $250000. John and Fred carried
out a bargain in which John released Mary and Fred gave John the $250000 at Trenos.
>Why did John approach Mary?
So John could aim the gun at Mary.
>Why did John aim the gun at Mary?
So John could threaten to shoot Mary with the gun unless Mary went from John into the car.
>Why did Mary get into the car?
Because Mary wanted not to be shot and because Mary believed that if Mary did not go from John
into the car then John would shoot Mary with the gun.
>Why did John threaten to shoot Mary unless she got in the car?
So John could drive Mary to the hotel in the car.
>Why did John drive Mary to the hotel?
So John could move Mary from the hotel into the room.
GENESIS Trace
Question Answering (Cont.)
>Why did John move Mary into the room?
So John could confine Mary in the room.
>Why did John confine Mary in the room?
So John could call Fred and could tell him that Mary was John's captive and so John and Fred could
carry out a bargain in which John released Mary and Fred gave John the $250000 at Trenos.
>Why did John tell Fred that John was holding Mary captive?
So John and Fred could carry out a bargain in which John released Mary and Fred gave John the
$250000 at Trenos.
>Why did Fred give John the money?
Because Fred believed that if Fred gave John the $250000 at Trenos then John would release Mary
and because Fred wanted Mary to be free more than he wanted to have the $250000.
>Why did Fred have $250000?
Because Fred had millions of dollars.
>Why did Fred believe that John was holding Mary captive?
Because John called Fred and told him that Mary was John's captive.
GENESIS Trace
Question Answering (Cont.)
>Why did Fred want Mary free?
Because Fred was Mary's father.
>Why did John release Mary?
Because John wanted to have the $250000 more than he wanted Mary to be John's captive and
because John believed that if John released Mary then Fred would give John the $250000 at
Trenos.
Explanation Graph of Story
GENESIS Trace
Using the Learned Schema
Input: Ted is Alice's husband. He won 100000 dollars in the lottery. Bob imprisoned Alice in his
basement. Bob got 75000 dollars and released Alice.
Thematic goal achieved: Ted is happy that Ted has the $100000.
Thematic goal achieved: Bob is happy that Bob has the $75000.
Ready for questions:
>Summarize
Alice was free. Bob captured Alice and locked her in a basement. Bob contacted Ted and told him
that Alice was Bob's captive. Bob wanted to have $75000 more than he wanted Alice to be Bob's
captive. Ted had a positive relationship with Alice. Ted had the $75000. Bob and Ted carried out a
bargain in which Bob released Alice and Ted gave Bob the $75000.
>Why did Bob lock Alice in his basement?
So Bob could contact Ted and could tell him that Alice was Bob's captive and so Bob and Ted could
carry out a bargain in which Bob released Alice and Ted gave Bob the $75000.
>Why did Bob release Alice?
Because Bob wanted to have the $75000 more than he wanted Alice to be Bob's captive and because
Bob believed that if Bob released Alice then Ted would give Bob the $75000.
GENESIS Trace
Question Answering (Cont.)
>How did Bob get the money?
Bob kidnapped Alice.
>Who gave Bob the money?
Ted gave Bob the $75000.
>Why did Ted give him the money?
Because Ted believed that if Ted gave Bob the $75000 then Bob would release Alice and because Ted
wanted Alice to be free more than he wanted to have the $75000.
>Why did Ted want Alice free?
Because Ted was Alice's husband.
>Why did Ted believe that Bob was holding Alice captive?
Because Bob contacted Ted and told him that Alice was Bob's captive.
CogSci NLP in the 70’s and early 80’s
• Computational models of human syntactic parsing
that explained phenomena like garden-path
sentences.
– The horse raced past the barn fell.
– While Anna dressed the baby spit up on the bed.
• Computational models of human word-sense
disambiguation that explained phenomena like
semantic priming and semantic garden-path
sentences.
– The astronomer married the star.
– The rabbi was hit on the temple.
Challenge Problem
Can one study the interesting NLP problems
of the 70’s using 21st century technology
and rigorous methodology to develop
robust, deep-understanding systems and
insightful and accurate computational
models of human language processing and
learning?
Promising Steps
• SemEval-07 FrameNet task
– Must identify semantic frames by disambiguating
trigger words and then fill their semantic roles.
• Results on modeling human reading time
(sentence processing time) using PCFGs (Levy,
2006).
• Computational Psycholinguistics workshops at
COLING 2004, ACL 2005 and 2007.
Language Grounding
• The meanings of many words are grounded in our
perception of the physical world: red, ball, cup, run, hit,
fall, etc.
– Symbol Grounding: Harnad (1990)
• Even many abstract words and meanings are metaphorical
abstractions of terms grounded in the physical world: up,
down, over, in, etc.
– Lakoff and Johnson’s (1980) Metaphors We Live By
• It's difficult to put my words into ideas.
• Interest in competitions is up.
• Most work in CompLing tries to represent meaning
without any connection to perception or to the physical
world, circularly defining the meanings of words in terms
of other words with no firm foundation.
Learning Language from Perceptual Context
• Children do not learn language from reading the
newspaper, surfing the web, or listening to the
radio.
• The natural way to learn language is to perceive
language in the context of its use in the physical
and social world.
• Must learn to infer the meanings of utterances
from their perceptual context.
– William James's "blooming, buzzing confusion"
Existing Work
• There is some work on learning grounded
language from perceptual context.
– Deb Roy et al.
– Yu and Ballard
– Siskind
• However, these researchers come from computer
vision and robotics and not computational
linguistics.
– Only simple language is used, with no linguistic
processing beyond finite-state HMMs.
Robots that Learn Language from
Narrowly Focused Context
• Deb Roy’s Toco Robot
Contextually Ambiguous
Sentence Meaning
• In reality, sentences are frequently uttered in
complex situations that suggest numerous
potential meanings.
• Assume each sentence is annotated with multiple
possible meanings inferred from context.
– Siskind (1996) uses this type of "referentially uncertain"
training data to learn a semantic lexicon.
Ambiguous Supervision for
Semantic Parsing
• NL training sentences annotated with a set of
possible MRs, only one of which is correct.
• We artificially obfuscated a normal NL-MR
corpus (GeoQuery) by adding extra distracter MRs
to each training pair (a sketch of this obfuscation follows below).
• We also created completely artificial ambiguous
training data for semantic parsing.
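As a rough illustration of how such ambiguous training data can be constructed, the sketch below pairs each sentence with its correct MR plus distracters drawn from other examples. The sentences and GeoQuery-style MR strings are illustrative stand-ins, not the actual corpus or its format.

import random

def obfuscate(corpus, n_distracters=2, seed=0):
    """Attach distracter MRs (drawn from other examples) to each NL-MR pair,
    giving every sentence a set of candidate MRs with only one correct."""
    rng = random.Random(seed)
    all_mrs = [mr for _, mr in corpus]
    ambiguous = []
    for sentence, mr in corpus:
        others = [m for m in all_mrs if m != mr]
        distracters = rng.sample(others, min(n_distracters, len(others)))
        # The correct MR is listed first only for readability; a learner
        # would not know which candidate is correct.
        ambiguous.append((sentence, [mr] + distracters))
    return ambiguous

# Illustrative GeoQuery-style examples (not the real corpus).
clean_corpus = [
    ("What is the capital of Texas?", "answer(capital(stateid('texas')))"),
    ("How many people live in Austin?", "answer(population(cityid('austin')))"),
    ("What rivers run through Ohio?", "answer(river(traverse(stateid('ohio'))))"),
]
for sentence, candidates in obfuscate(clean_corpus):
    print(sentence, candidates)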
Sample Ambig-ChildWorld Corpus
[Figure: NL sentences paired with sets of candidate MRs inferred from their contexts; each sentence is linked to several MRs, only one of which is correct.]
NL sentences:
  Daisy gave the clock to the mouse.
  Mommy saw that Mary gave the hammer to the dog.
  The dog broke the box.
  John gave the bag to the mouse.
  The dog threw the ball.
Candidate MRs appearing in the contexts:
  gave(daisy, clock, mouse)    ate(mouse, orange)    ate(dog, apple)
  saw(mother, gave(mary, dog, hammer))    broke(dog, box)
  gave(woman, toy, mouse)    gave(john, bag, mouse)    threw(dog, ball)
  runs(dog)    saw(john, walks(man, dog))
KRISPER
• Version of our KRISP semantic parser that learns
from ambiguous supervision.
• Uses an iterative EM-like method to gradually
converge on a correct meaning for each sentence
(a minimal sketch of this loop follows below):
1) Assume every possible meaning for a sentence is
correct.
2) Train KRISP on the resulting noisy data.
3) Test learned parser on each training sentence and
determine the most likely MR for each sentence from
the set of potential meanings.
4) Build a new training set using the most likely meaning
for each sentence and iterate.
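A minimal sketch of this loop, assuming a generic trainable parser exposed through a factory with train(pairs) and score(sentence, mr) methods; that interface is a hypothetical stand-in for KRISP, not its actual API.

def krisper_train(ambiguous_data, make_parser, n_iterations=5):
    """ambiguous_data: list of (sentence, [candidate_mr, ...]) pairs.
    make_parser: factory returning a parser with train(pairs) and
    score(sentence, mr) methods (hypothetical interface)."""
    # 1) Initially assume every candidate MR of a sentence is correct.
    pairs = [(s, mr) for s, mrs in ambiguous_data for mr in mrs]
    parser = None
    for _ in range(n_iterations):
        # 2) Train the base semantic parser on the (noisy) sentence-MR pairs.
        parser = make_parser()
        parser.train(pairs)
        # 3) Re-score each sentence's candidates; keep the most likely MR.
        pairs = [(s, max(mrs, key=lambda mr: parser.score(s, mr)))
                 for s, mrs in ambiguous_data]
        # 4) Build a new training set from these best meanings and iterate.
    return parser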
Results on Ambig-ChildWorld Corpus
Un-Enumerated Ambiguity
• It may not always be possible to narrow the
meaning of a sentence down to a relatively small
number of explicitly enumerated potential
meanings.
• If the context's meaning is represented as a
semantic network, a sentence's meaning can be taken
to be any connected subgraph of that context
(a minimal enumeration sketch follows below).
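A minimal sketch of that idea, assuming the context is an undirected graph given as an adjacency list and candidate meanings are its connected (induced) subgraphs up to a small size bound; the toy graph and the bound are illustrative assumptions, not a fixed proposal.

def connected_node_sets(adjacency, max_nodes=4):
    """Enumerate the node sets of connected induced subgraphs (up to
    max_nodes nodes) of a graph given as {node: set_of_neighbors}."""
    found = set()
    frontier = [frozenset([n]) for n in adjacency]
    while frontier:
        nodes = frontier.pop()
        if nodes in found:
            continue
        found.add(nodes)
        if len(nodes) < max_nodes:
            # Grow each candidate subgraph by one adjacent node at a time.
            for m in {v for n in nodes for v in adjacency[n]} - nodes:
                frontier.append(nodes | {m})
    return found

# Toy context: Mary petting Spot, who is chewing a bone.
context = {"Mary": {"Petting"}, "Petting": {"Mary", "Spot"},
           "Spot": {"Petting", "Chewing"}, "Chewing": {"Spot", "Bone"},
           "Bone": {"Chewing"}}
for nodes in connected_node_sets(context):
    print(sorted(nodes))   # each node set is one candidate sentence meaning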
Sample Ambiguous Context
“Juvenile caresses canine.”
[Figure: a semantic-network context graph with nodes including Spot, Dog, Black, Chewing, Bone, Mary, Child, Blonde, Petting, Possess, Barbie, Doll, and Hair, connected by relations such as isa, agent, patient, obj, attr, part, HasColor, and HasPart; the sentence's meaning corresponds to one connected subgraph of this context.]
Potential Approach
• Can employ an approach similar to WOLFIE
(Thompson & Mooney, 1999) to learn a semantic
lexicon from such graph-based ambiguous
supervision.
• A candidate meaning for a word is the largest
isomorphic common subgraph (LICS) of the
conceptual context graphs of a set of sentences in
which the word appears (a rough sketch of this step follows below).
• Use the candidate meanings of the words in a
sentence to determine the connected subgraph of
its conceptual context graph that is its most-likely
meaning.
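A rough sketch of the candidate-meaning step, with each conceptual context represented as a set of (head, relation, tail) triples and instance nodes abstracted to their types before intersecting. Intersecting abstracted triple sets is only a crude stand-in for computing a true largest isomorphic common subgraph, and the toy contexts, relation names, and type table are illustrative assumptions.

def abstract(context, types):
    """Replace instance nodes with their types so that structurally similar
    contexts become literally comparable."""
    return {(types.get(h, h), r, types.get(t, t)) for h, r, t in context}

def candidate_meaning(contexts, types):
    """Triples shared by every (abstracted) context graph in which a word
    appears; a crude approximation of the LICS computation."""
    shared = abstract(contexts[0], types)
    for ctx in contexts[1:]:
        shared &= abstract(ctx, types)
    return shared

# Toy contexts for two sentences containing the word "caresses".
types = {"Mary": "Child", "Sue": "Child", "Spot": "Dog", "Rex": "Dog"}
ctx1 = {("Mary", "agent", "Petting"), ("Petting", "patient", "Spot")}
ctx2 = {("Sue", "agent", "Petting"), ("Petting", "patient", "Rex")}
print(candidate_meaning([ctx1, ctx2], types))
# -> the shared structure: a Child is the agent of Petting whose patient is a Dog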
Conclusions
• CompLing is becoming disconnected from AI and
Cognitive Science and becoming boring.
• Many interesting and important problems that were
studied in the 70’s and 80’s are being ignored.
• Can we start to address these issues again using the
latest CompLing methods and systems to
automatically acquire the requisite knowledge and
improve robustness?
• Sample challenge tasks:
– Recognizing the goals and plans of people in narrative
text or of dialog participants.
– Learning grounded language from ambiguous perceptual
context.