Natural Language Agents Elaine Rich

What do they Need?
The linguistic facts
A KR system that is rich enough to encode everything we can
talk about
Understanding: Figuring out what the speaker meant and what
you should do about it
Generation: Figuring out what to say to achieve your goals
Turing’s Prediction
In 1950, Turing predicted that in about fifty years "an
average interrogator will not have more than a 70
percent chance of making the right identification after
five minutes of questioning".
Was Turing Right?
>What's the best place to eat Chinese food?
ALICE: My favorite places are California and Maine. Holland
is cool too.
>What do you like about Maine?
ALICE: The special effects are the best.
Another Application - MT
Austin Police are trying to find the person responsible for robbing a
bank in Downtown Austin.
El policía de Austin está intentando encontrar a la persona
responsable de robar un banco en Austin céntrica.
The police of Austin is trying to find the responsible person to rob a
bank in centric Austin.
MT - The Limits
A Florida teen charged with hiring an undercover policeman to shoot
and kill his mother instructed the purported hitman not to damage
the family television during the attack, police said on Thursday.
Un adolescente de la Florida cargado con emplear a un policía de
la cubierta interior para tirar y para matar a su madre mandó a
hitman pretendida para no dañar la televisión de la familia durante
el ataque, limpia dicho el jueves.
An adolescent of Florida loaded with using a police of the inner
cover to throw and to kill his mother commanded to hitman tried not
to damage the television of the family during the attack, clean said
Thursday.
MT - The Limits
I have a dream, that my four little children will one day live in a
nation where they will not be judged by the color of their skin but
by the content of their character. I have a dream today – Martin
Luther King
I am a sleepy, that my four small children a day of alive in a
nation in where they will not be judged by the color of its skin but
by the content of its character. I am a sleepy today. (Spanish)
http://www.shtick.org/Translation/translation47.htm
Going Both Ways
Notice that both of these applications require that we process
language in two directions:
•Understanding
•Generation
But also notice that it is possible to do a somewhat passable job
without going through any meaning representation.
When Meaning is Critical
English: Put the kid’s cereal on the bottom shelves.
Java
import java.io.File;
import java.io.FileNotFoundException;
import java.util.ArrayList;
import java.util.Scanner;
public class GroceryStore
{
    private int[][][] shelves;
    private ArrayList<GroceryItem> products = new ArrayList<>();
    // Read whitespace-separated product tokens, build one item per token,
    // then record each item's ID number at its preferred shelf location.
    public void placeProducts(String productFile) throws FileNotFoundException
    {   Scanner r = new Scanner(new File(productFile));
        GroceryItemFactory factory = new GroceryItemFactory();
        while(r.hasNext())
            products.add(factory.createItem(r.next()));
        r.close();
        ThreeDLoc startLoc;
        GroceryItem temp;
        for(int itemNum = 0; itemNum < products.size(); itemNum++)
        {   temp = products.get(itemNum);
            startLoc = temp.getPlacement(this);
            shelves[startLoc.getX()][startLoc.getY()][startLoc.getZ()] =
                temp.getIDNum();
        }
    }
}
Java, Continued
public class ChildrensCereal extends GroceryItem
{
    private static final int PREFERRED_X = -1;
    private static final int PREFERRED_Y = 0;
    private static final int PREFERRED_Z = 0;
    public ThreeDLoc getPlacement(GroceryStore store)
    {
        ThreeDLoc result = new ThreeDLoc();
        result.setX(store.find(this));
        result.setY(PREFERRED_Y);
        result.setZ(PREFERRED_Z);
        return result;
    }
}
It’s All about Mapping
What Are We Going to Map to?
English: Do you know how much it rains in Austin?
The database:
Months (Month, Days)
RainfallByStation (year, month, station, rainfall)
Stations (station, City)
English: What is the average rainfall, in Austin, in
months with 30 days?
SQL:
SELECT Avg(RainfallByStation.rainfall) AS AvgOfrainfall
FROM Stations INNER JOIN
(Months INNER JOIN RainfallByStation
ON Months.Month = RainfallByStation.month)
ON Stations.station = RainfallByStation.station
WHERE (((Stations.City)="Austin")
AND ((Months.Days)=30));
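To see the mapping run end to end, here is a minimal sketch using Python's built-in sqlite3 module. The three tables follow the schema on the slide; every row of data, the station codes, and the function name average_rainfall are made up for illustration. The query filters rows with WHERE and uses plain joins rather than the nested join syntax above.

```python
import sqlite3

# Toy data: every value below is invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Months (Month INTEGER PRIMARY KEY, Days INTEGER);
CREATE TABLE Stations (station TEXT PRIMARY KEY, City TEXT);
CREATE TABLE RainfallByStation (
    year INTEGER, month INTEGER, station TEXT, rainfall REAL);
INSERT INTO Months VALUES (3, 31), (4, 30), (6, 30);
INSERT INTO Stations VALUES ('A1', 'Austin'), ('D1', 'Dallas');
INSERT INTO RainfallByStation VALUES
    (2000, 3, 'A1', 2.0),
    (2000, 4, 'A1', 3.0),
    (2000, 6, 'A1', 5.0),
    (2000, 4, 'D1', 9.0);
""")

QUERY = """
SELECT AVG(r.rainfall)
FROM RainfallByStation AS r
JOIN Months AS m ON m.Month = r.month
JOIN Stations AS s ON s.station = r.station
WHERE s.City = ? AND m.Days = ?
"""

def average_rainfall(city, days):
    """Average rainfall for a city over months with the given number of days."""
    row = conn.execute(QUERY, (city, days)).fetchone()
    return row[0]  # None when no rows match

print(average_rainfall("Austin", 30))
```

The hard part of the NL problem is producing QUERY from the English question; the sketch only shows what the target of the mapping looks like once it exists.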
Designing a Mapping Function for NL Understanding
•Morphological Analysis and POS tagging
* The womans goed home.
•Syntactic Analysis (Parsing)
* Fishing went boys older
•Extracting Meaning
Colorless green ideas sleep furiously.
Sue cooked. The potatoes cooked.
* Sue and the potatoes cooked.
•Putting it All in Context
My cat saw a bird out the window. It batted at it.
•What isn’t Said
Winnie doesn’t like August.
He doesn’t like melted ice cream.
Ambiguity – the Core Problem
•Time flies like an arrow.
•I hit the boy with the blue shirt (a bat).
•I saw the Grand Canyon (a Boeing 747)
flying to New York.
•I know more beautiful women than Kylie.
•The boys may not come.
•I only want potatoes or rice and beans.
•Is there water in the fridge?
•Who cares?
•Have you finished writing your paper?
I’ve written the outline.
Morphological Analysis and POS Tagging
Morphological Analysis:
played = play + ed = play (V) + PAST
saw = see (V) + PAST
leaves = leaf (N) + PL
       = leave (N) + PL
       = leave (V) + 3rdS
compute
computer
computerize
computerization
POS Tagging:
I hit the bag.
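A minimal sketch of how analyses like these might be produced. The lexicon, the irregular-form table, and the suffix rules are all hypothetical and cover only the examples on the slide; a real analyzer would be far larger and would more likely be built as a finite state transducer.

```python
# Hypothetical mini-lexicon: stem -> set of parts of speech.
LEXICON = {
    "play": {"V"},
    "see": {"V"},
    "leaf": {"N"},
    "leave": {"N", "V"},
}

# Irregular forms that no suffix rule can derive.
IRREGULAR = {"saw": [("see", "V", "PAST")]}

# (suffix to strip, replacement, POS required of the stem, feature)
SUFFIX_RULES = [
    ("ed", "", "V", "PAST"),
    ("ves", "f", "N", "PL"),
    ("s", "", "N", "PL"),
    ("s", "", "V", "3rdS"),
]

def analyze(word):
    """Return every (stem, POS, feature) analysis the toy rules license."""
    analyses = list(IRREGULAR.get(word, []))
    for suffix, replacement, pos, feature in SUFFIX_RULES:
        if word.endswith(suffix):
            stem = word[: len(word) - len(suffix)] + replacement
            if pos in LEXICON.get(stem, set()):
                analyses.append((stem, pos, feature))
    return analyses

for w in ["played", "saw", "leaves"]:
    print(w, analyze(w))
```

Note that "leaves" comes back with three analyses. Morphology alone cannot choose among them; that is the job of POS tagging and later stages.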
Morphological Analysis Using a Finite
State Transducer
Stochastic POS Tagging
Naïve Bayes Classification: Choose the POS tag that is
most likely for the current word given its context. For
example:
Secretariat expected to race tomorrow.
Using Bayes' Rule:
$$P(t_i \text{ in context} \mid w) = \frac{P(w \mid t_i \text{ in context})\, P(t_i \text{ in context})}{P(w)}$$
We want to choose the tag $t_j$ with maximum likelihood:
$$t_j = \arg\max_i P(w \mid t_i)\, P(t_i \mid t_{j-1})$$
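A small sketch of the tag choice for "race" in the Secretariat example. All probabilities below are illustrative placeholders, not estimates from any corpus, and the names EMISSION, TRANSITION, and best_tag are assumptions of this sketch. P(w) is omitted because it is the same for every candidate tag.

```python
# Illustrative probabilities only; they are not estimated from any corpus.
EMISSION = {  # P(word | tag)
    ("race", "NN"): 0.0004,
    ("race", "VB"): 0.00003,
}
TRANSITION = {  # P(tag | previous tag)
    ("NN", "TO"): 0.02,
    ("VB", "TO"): 0.34,
    ("NN", "DT"): 0.5,
    ("VB", "DT"): 0.001,
}

def best_tag(word, previous_tag, candidate_tags):
    """Pick the tag maximizing P(word | tag) * P(tag | previous tag)."""
    def score(tag):
        emission = EMISSION.get((word, tag), 0.0)
        transition = TRANSITION.get((tag, previous_tag), 0.0)
        return emission * transition
    return max(candidate_tags, key=score)

print(best_tag("race", "TO", ["NN", "VB"]))  # after "to"
print(best_tag("race", "DT", ["NN", "VB"]))  # after "the"
```

With these numbers the verb reading wins after "to" even though "race" is more often a noun, because the transition probability outweighs the emission probability.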
The Importance of Parsing Even When
We’re Not Doing Full Understanding
Find me all the:
Lawyers whose clients committed fraud
vs
Lawyers who committed fraud
vs
Clients whose lawyers committed fraud
Parsing - Building a Tree
John hit the ball.
(S (NP (N John))
   (VP (V hit)
       (NP (DET the)
           (N ball))))
Grammar Rules
We can build such a parse tree using a grammar with rules
such as:
S → NP VP
NP → N
VP → V NP
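A minimal top-down parser sketch for these rules. Two things are assumptions added for the example: the rule NP → DET N (needed to cover "the ball") and the four-word lexicon. The function names are also hypothetical. It returns every complete parse as a bracketed string.

```python
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["N"], ["DET", "N"]],  # NP -> DET N is an added assumption
    "VP": [["V", "NP"]],
}
# Hypothetical lexicon: word -> part of speech.
LEXICON = {"John": "N", "ball": "N", "hit": "V", "the": "DET"}

def parse(symbol, words, start):
    """Yield (tree, next_position) for each way symbol can begin at start."""
    if symbol in GRAMMAR:
        for rhs in GRAMMAR[symbol]:
            for children, end in expand(rhs, words, start):
                yield "(" + symbol + " " + " ".join(children) + ")", end
    elif start < len(words) and LEXICON.get(words[start]) == symbol:
        yield "(" + symbol + " " + words[start] + ")", start + 1

def expand(rhs, words, start):
    """Yield (list of subtrees, next_position) for a rule's right-hand side."""
    if not rhs:
        yield [], start
        return
    for tree, middle in parse(rhs[0], words, start):
        for rest, end in expand(rhs[1:], words, middle):
            yield [tree] + rest, end

def parse_sentence(sentence):
    """Return all parses of the sentence that consume every word."""
    words = sentence.rstrip(".").split()
    return [tree for tree, end in parse("S", words, 0) if end == len(words)]

print(parse_sentence("John hit the ball."))
```

This naive strategy works here because the grammar has no left-recursive rules; a grammar with rules like NP → NP PP would need a chart parser or a similar technique.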
The Lexicon is Important
* The cat with a furry tail purred a collar.
Mary imagined a cat with a furry tail.
Mary decided to go.
* Mary decided a cat with a furry tail.
Mary decided a cat with a furry tail would be her next pet.
Mary gave Lucy the food.
* Mary decided Lucy the food.
Mary asked the cat.
Mary demanded a raise.
Mary asked for a raise.
Parsing: Dealing with Ambiguity
English:
Water the flowers with the hose.
Water the flowers with brown leaves.
Using Domain Knowledge
(plant (isa living thing))
(flower (isa plant)
(has parts leaf))
(water (isa action)
(instrument mustbe container))
(hose (isa container))
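A sketch of how these facts could settle where the "with" phrase attaches. The dictionaries transcribe the knowledge base above; the function names and the assumption that words have already been reduced to their stems ("flowers" to "flower", "leaves" to "leaf") belong to this example only.

```python
# The knowledge base from the slide, transcribed as dictionaries.
ISA = {"flower": "plant", "plant": "living thing", "hose": "container"}
HAS_PARTS = {"flower": {"leaf"}}
INSTRUMENT_MUST_BE = {"water": "container"}

def isa(concept, category):
    """Follow isa links upward from concept, looking for category."""
    while concept is not None:
        if concept == category:
            return True
        concept = ISA.get(concept)
    return False

def attach_with_phrase(verb, obj, pp_noun):
    """Return the attachments of 'with <pp_noun>' that the KB licenses."""
    attachments = []
    required = INSTRUMENT_MUST_BE.get(verb)
    if required is not None and isa(pp_noun, required):
        attachments.append("instrument of " + verb)
    if pp_noun in HAS_PARTS.get(obj, set()):
        attachments.append("part of " + obj)
    return attachments

print(attach_with_phrase("water", "flower", "hose"))
print(attach_with_phrase("water", "flower", "leaf"))
```

The hose qualifies as an instrument of watering, so the phrase attaches to the verb; leaves are parts of flowers, so the phrase attaches to the noun.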
A Harder One
John saw a boy and a girl with a red wagon with one blue and
one white wheel dragging on the ground under a tree with huge
branches.
How Bad is the Ambiguity?
•Kim (1)
•Kim and Sue (1)
•Kim and Sue or Lee (2)
•Kim and Sue or Lee and Ann (5)
•Kim and Sue or Lee and Ann or Jon (14)
•Kim and Sue or Lee and Ann or Jon and Joe (42)
•Kim and Sue or Lee and Ann or Jon and Joe or Zak (132)
•Kim and Sue or Lee and Ann or Jon and Joe or Zak and Mel (429)
•Kim and Sue or Lee and Ann or Jon and Joe or Zak and Mel or Guy (1430)
•Kim and Sue or Lee and Ann or Jon and Joe or Zak and Mel or Guy and Jan
(4862)
The number of parses for an expression with n conjunctions (n + 1 terms) is the n’th Catalan number:
$$\mathrm{Cat}(n) = \binom{2n}{n} - \binom{2n}{n-1}$$
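A short check of the counts, as a sketch. Here n is taken to be the number of conjunctions, so "Kim" alone is n = 0. The function names are hypothetical. The second function counts binary bracketings directly, which gives an independent confirmation of the closed form.

```python
from functools import lru_cache
from math import comb

def catalan(n):
    """Closed form: C(2n, n) - C(2n, n - 1), with Cat(0) = 1."""
    if n == 0:
        return 1
    return comb(2 * n, n) - comb(2 * n, n - 1)

@lru_cache(maxsize=None)
def count_bracketings(terms):
    """Count the binary groupings of a flat sequence with this many terms."""
    if terms == 1:
        return 1
    return sum(
        count_bracketings(left) * count_bracketings(terms - left)
        for left in range(1, terms)
    )

print([catalan(n) for n in range(10)])
```

For each n from 0 to 9, count_bracketings(n + 1) agrees with catalan(n), matching the growth shown in the list of coordinated names.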
Parsing: Gapping
English: Who did you say Mary gave the ball to?
Sentences like this make specifying the grammar difficult. They
also make it hard to use a simple, context-free parser.
Semantics: The Meaning of Words
Getting it right for the target application:
“month” → RainfallByStation.month
Dealing with ambiguity:
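“spring” → [alternative senses, shown as images on the original slide]
“stamp” → [alternative senses, shown as images on the original slide]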
Semantics: The Meaning of Phrases
Semantics is (mostly) compositional.
Olive oil
(oil (made-from olives))
Occasionally It Isn’t
Olive oil
But Usually It Is
Peanut oil
(oil (made-from peanuts))
Another One
Coconut oil
(oil (made-from coconut))
But What About This One?
Baby oil
(oil (used-on baby))
And Another One
Cooking oil
(oil (used-for cooking))
And Another One
Riding jacket
Leather jacket
Letter jacket
Rain jacket
Idioms Don’t Work This Way
•I’m going to give her a piece of my mind.
•He bent over backwards to make the sale.
•I’m going to brush up on my Spanish.
Putting Phrases Together
Bill cooked the potatoes.
The potatoes cooked in about an hour.
The heat from the fire cooked the potatoes in 30 minutes.
(cooking-event (agent )
               (object )
               (instrument )
               (time-frame ))
* Bill and the potatoes cooked.
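One common account of why the starred sentence fails is that conjoined subjects must fill the same slot in the frame. A minimal sketch of that check for intransitive "cooked" follows; the ANIMATE set and both function names are assumptions made for the example, not part of the slide.

```python
# Hypothetical list of animate entities; only these can be agents.
ANIMATE = {"Bill", "Sue"}

def intransitive_subject_role(noun):
    """Slot filled by the subject of intransitive 'cooked'."""
    return "agent" if noun in ANIMATE else "object"

def conjoined_subject_ok(nouns):
    """Conjoined subjects are acceptable only if they share one slot."""
    return len({intransitive_subject_role(n) for n in nouns}) == 1

print(conjoined_subject_ok(["Bill", "Sue"]))       # Bill and Sue cooked.
print(conjoined_subject_ok(["Bill", "potatoes"]))  # * Bill and the potatoes cooked.
```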
Language at its Most Straightforward –
Propositional Content
•Bill Clinton was the 42nd president of the United States.
•Texas is in France.
•The Matrix is playing at the Dobie.
•Lunch is at noon.
•What time is it?
When There’s More - Presuppositions
•What is Clinton famous for?
•Where’s The Matrix playing?
•Who is the king of France?
•Have you started making it to your morning classes?
•I’m going to check out all the five star restaurants in
Cleveland on this trip.
Coherence
Winnie doesn’t like melted ice cream. He always dreads August.
* Winnie doesn’t like melted ice cream. He always dreads January.
Winnie wanted to go to the store. He went to find Christopher Robin.
* Winnie wanted to go to the store. He counted quickly to 10.
Winnie walked into the room. Christopher Robin looked up and smiled.
* Winnie walked into the room. The earth rotates around the sun.
We Can’t Say it All
Christopher Robin and Winnie decided to go out for lunch.
They remembered that Coji’s doesn’t have hot dogs on
Saturdays, so they went to Buzzy’s. They got their food,
slathered on the mustard, and walked home.
Conversational Postulates
Grice’s maxims:
•The Maxim of Quantity:
Be as informative as required.
Don’t be more so.
•The Maxim of Quality:
Do not say what you believe to be false.
Do not say that for which you lack sufficient evidence.
•The Maxim of Relevance:
Be relevant.
•The Maxim of Manner:
Avoid obscurity of expression.
Avoid ambiguity.
Be brief.
Be orderly.
Conversational Postulates and Scalar
Implicature
A: Have you done the first math assignment yet?
B: I’m going to go buy the book tomorrow.
Another Example of Scalar Implicature
A: When did you get home last night?
B: I was in bed by midnight.
When There’s More – Conversational
Postulates and Inference
A: Joe doesn't seem to have a girl-friend these days.
B: He's been going to Dallas a lot lately.
When There’s More – Conversational
Postulates and Inference
A: Let’s go to the movies tonight.
B: I have to study for an exam.
When There’s More – Conversational
Postulates and Inference
Reviewer of new book:
It is well-bound and free of typographical errors.
When There’s More – Conversational
Postulates and Inference
A: What do you think of my new dress?
B: It’s interesting.
When There’s More – Conversational Postulates
and Illocutionary Force
•Do you know what time it is?
•What time is it?
•I’m freezing.
•Get up and go close the window.
Politeness
What About Applications Where Almost is OK?
•Searching the web
–Leaving some of the work for people
–Retrieval failures are ok
www.google.com
•Snooping
Going the Other Way: Generation
(c (isa cooking-event)
(agent x )
(object y)
(instrument z)
(time-frame ))
(x (isa man)
(name Bill)
(height 6')
(attire (headcovering h))
(born-location b))
(y (some-of potatoes)
(type-of Idaho)
(maturity new))
(z (isa microwave)
(brand Sharp))
(h (isa gimme)
(color red))
(gimme (subclass hat))
(b (isa city)
(name Austin))
One Problem - What is Reusable?
Because we’re dealing with a mapping problem, we can’t build
a single, definitive NL interface that is useful for all applications.
However, we can build:
•Lexicons that contain morphological and POS information, as
well as meanings given in terms of standard KBs such as
WordNet.
•Morphological analyzers
•Syntactic parsers that appeal to a statistical and/or semantic
filter through a common interface
•Inference engines
•Dialogue management systems that handle linguistic-level
issues and plug into domain-specific knowledge
Spoken Language - Understanding
[Figure: signal plot of the utterance "The discrete Fourier transform of a real valued signal is conjugate symmetric", labeled syllable by syllable.]
Spoken Language - Generation
The issues:
•Figuring out what to say (same as for written language)
•Pronouncing words
•Linking them together
•Getting the prosody right
A Recent Loebner Prize Winner
> Why do you think people want to go to the
moon?
Jabberwacky: You don't. Plastic and metal,
perhaps. No flesh.
> Can you imagine life before plastic?
Jabberwacky: Can you?
> Sure. Milk came in glass bottles.
Jabberwacky: You can come here whenever you
want. I will welcome you.