2D1380 Artificial Intelligence: Lecture 9
Communication
Patric Jensfelt
patric@kth.se
Communication
- Communication: the intentional exchange of information using signs from a shared system of conventional symbols.
- Can, for example, help agents get information from others without having to observe it directly themselves.
  Ex: you don't have to get out of bed, you can ask about the weather.
- Can be seen as an action for the agent.
Examples of communication: Human speech
Examples of communication: Body language
Examples of communication: Gestures
- Depends on the situation, country, etc.
- Check out http://en.wikipedia.org/wiki/Gesture
Examples of communication: Signs
- Example: STOP
- Example: warning for mothers with lunch bags?
- Example: warning for survey workers on the road

Examples of communication: Animals
- Animals communicate with sound, gestures, acts, etc.

Examples of communication: Bees
- Bees communicate the way home, the direction to nectar, etc.
Speech act
- Many forms of communication, as we have seen
- The communication action for an agent is called a speech act
- Used to change the state of other agents and their future actions

The speech act
- Speaker → Utterance → Hearer

Animals
- Animals typically use isolated symbols for sentences
- Restricted set of communication propositions
- No generative capability
Speech acts
- Different objectives for the speaker:
  - Inform: "Dani will give the next lecture"
  - Query: "Am I speaking loud enough?"
  - Command: "Don't fall asleep!"
  - Promise: "We will not fall asleep"
  - Acknowledgement: "OK, message received", ACK/NACK
Speech act
- The speaker must determine:
  - When to use a speech act
  - Which speech act to use
- The hearer must understand
- Speech is a planned action ⇒ recognizing the plan increases understanding of the speech

Language
- Formal languages
  - For example first-order logic, Java, etc.
  - Have strict definitions
- Natural languages
  - For example English, Swedish, etc.
  - No strict definitions
Formal language
- Defined as a set of strings
- Each string is a concatenation of terminal symbols (words)
- Ex: P and ∧ are terminal symbols of first-order logic
  "P ∧ Q" is a typical valid string
  "PQ∧" is an invalid string

Semantics
- Semantics gives a meaning to each valid string
- Ex: "X + Y" is the sum of X and Y in arithmetic
Grammar
- A grammar is a finite set of rules that specifies a language
- Each string in the language can be analyzed/generated by the grammar
- Formal languages have official grammars, e.g. first-order logic, ANSI C
- Natural languages have no official grammar
  (some foreign language teachers may disagree)
English phrase structure
- Strings are composed of substrings, or phrases
- Noun phrase (NP), such as "Patric"
- Verb phrase (VP), such as "is hungry"
- Sentence (S)
- NP, VP and S are nonterminal symbols

Nonterminal symbols
- Nonterminals are defined by rewrite rules
- NP and VP help describe the allowable strings:
  "Patric (NP) is hungry (VP)" is allowed, but not "hungry Patric is"
- S can consist of any NP followed by any VP
Backus-Naur Form (BNF)
- We use Backus-Naur form (BNF) notation for rewrite rules
- BNF has four components:
  - A set of terminal symbols
    The symbols or words that make up the strings; in English, letters (a, b, ...) or words (apple, banana, ...)
  - A set of nonterminal symbols
    Categorize substrings of the language; in English, "is hungry" is a verb phrase
  - A start symbol
    A nonterminal symbol that denotes a complete string; in English it is a sentence
  - A set of rewrite rules
    Sentence → NounPhrase VerbPhrase
BNF Ex: Rewrite rules of simple arithmetic

Expr     → Expr Operator Expr | (Expr) | Number
Number   → Digit | Number Digit
Digit    → 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
Operator → + | - | ÷ | ×
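As a quick illustration (a minimal sketch; the GRAMMAR dictionary and generate function are made-up names, not part of the lecture), the rewrite rules above can be used to generate random strings of the language:

    # Generate random strings from the arithmetic BNF grammar above.
    import random

    GRAMMAR = {
        "Expr":     [["Expr", "Operator", "Expr"], ["(", "Expr", ")"], ["Number"]],
        "Number":   [["Digit"], ["Number", "Digit"]],
        "Digit":    [[d] for d in "0123456789"],
        "Operator": [["+"], ["-"], ["÷"], ["×"]],
    }

    def generate(symbol="Expr", depth=0):
        """Expand a symbol by picking one of its rewrite rules at random."""
        if symbol not in GRAMMAR:              # terminal symbol: emit it as-is
            return symbol
        rules = GRAMMAR[symbol]
        if depth > 8:                          # keep the recursion from running away
            rules = [min(rules, key=len)]
        return "".join(generate(s, depth + 1) for s in random.choice(rules))

    print(generate())   # e.g. "3+(41×7)"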
Using BNF
- S can consist of any NP followed by any VP
- In BNF: S → NP VP
Generative capacity of grammar
- Grammars can be classified according to their generative capacity: the set of languages they can represent
- Chomsky suggested a hierarchy with four classes of grammars
- Grammars higher up in the hierarchy are more expressive, but
- The algorithms for them are less efficient

Chomsky's grammar hierarchy (most powerful first)
- Recursively enumerable grammars
  Unrestricted rules: any number of symbols on the left and right side
- Context-sensitive grammars
  The right-hand side must contain at least as many symbols as the left-hand side
  Ex: A S B → A X B
- Context-free grammars (CFG)
  The left-hand side is a single nonterminal symbol, which can be rewritten as the right-hand side in any context
  Ex: S → NP VP
- Regular grammars
  Each rule has a single nonterminal on the left-hand side and a terminal, optionally followed by a nonterminal, on the right-hand side
Stages in communication
Speaker:
- Intention: S wants to inform H that P
- Generation: S selects words W to express P in context C
- Synthesis: S utters words W
Hearer:
- Perception: H perceives W' in context C'
- Analysis: H infers possible meanings P1, P2, ..., Pn
- Disambiguation: H infers the intended meaning Pi
- Incorporation: H incorporates Pi into its KB

Analysis
- Three main parts:
  - Syntactic interpretation (build a parse tree)
  - Semantic interpretation (what does it mean)
  - Pragmatic interpretation (what does it mean in this context)
- Ex: "I am looking at the diamond"
  Jeweler: the diamond
  Baseball player: the baseball field

Ex: Wumpus world
- Speaker: Paul wants to tell Frank that the wumpus is dead
What could go wrong
- Insincerity (S is lying)
- Ambiguous utterance ("I am dead": dead or just tired?)
- Different understanding of the current context (C ≠ C')

Small grammar for English
- A formal grammar for English in the wumpus world
- We will call the language E0
- Cannot make a full grammar for English
- People have different ideas of what is proper/valid English

Lexicon for E0
Noun        → stench | breeze | glitter | nothing
            | wumpus | pit | pits | gold | east | ...
Verb        → is | see | smell | shoot | feel | stinks
            | go | grab | carry | kill | turn | ...
Adjective   → right | left | east | south | back | smelly | ...
Adverb      → here | there | nearby | ahead
            | right | left | east | south | back | ...
Pronoun     → me | you | I | it | ...
Name        → John | Mary | Boston | UCB | PAJC | ...
Article     → the | a | an | ...
Preposition → to | in | on | near | ...
Conjunction → and | or | but | ...
Digit       → 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
- Word categories are divided into closed and open classes
Grammar for E0
S         → NP VP                  I + feel a breeze
          | S Conjunction S        I feel a breeze + and + I smell a wumpus
NP        → Pronoun                I
          | Noun                   pits
          | Article Noun           the + wumpus
          | Digit Digit            3 4
          | NP PP                  the wumpus + to the east
          | NP RelClause           the wumpus + that is smelly
VP        → Verb                   stinks
          | VP NP                  feel + a breeze
          | VP Adjective           is + smelly
          | VP PP                  turn + to the east
          | VP Adverb              go + ahead
PP        → Preposition NP         to + the east
RelClause → that VP                that + is smelly
Problems with this grammar
- Overgenerates
  Generates sentences that are not correct
  Ex: "Me go Boston" or "I smell pit gold wumpus nothing east"
- Undergenerates
  Rejects many correct sentences
  Ex: "I think the wumpus is smelly"
Augmented grammar: Agreement
- Want to get rid of non-English sentences such as "Me is tired"
- "Me" is objective case and "I" is subjective case
- Cannot use the objective case as subject!
- Introduce new categories for pronouns
  PronounS → I | you | he | she | it | ...
  PronounO → me | you | him | her | it | ...
- Now we can split NP
  NPS → PronounS | Name | Noun | ...
  NPO → PronounO | Name | Noun | ...
- Still not enough
- Verb agreement: "I eats" is not OK, but "He eats" is OK
- Three forms: "I am", "You are" and "He is"
- For each subject and object form there are three of these agreement distinctions
- Many more distinctions to be made
- Results in an exponential increase in the number of rules!

Augmented grammar
- Augment the existing grammar instead
- Don't enumerate all possible cases; parameterize the existing grammar
  Ex: NP(case) → Pronoun(case) | Name | Noun | ...
- Called definite clause grammar (DCG)
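As a rough sketch of the parameterization idea (the names below are illustrative, not from the lecture), a single parameterized rule can stand in for both NPS and NPO:

    # One parameterized rule NP(case) instead of separate NPS and NPO rules.
    PRONOUN = {"S": ["I", "you", "he", "she", "it"],     # subjective case
               "O": ["me", "you", "him", "her", "it"]}   # objective case

    def np(case):
        """NP(case) -> Pronoun(case) | Name | Noun | ..."""
        return [[p] for p in PRONOUN[case]] + [["Name"], ["Noun"]]

    print(np("S"))   # expansions allowed where a subject NP is required
    print(np("O"))   # same rule, different value of the case parameter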
Parsing
- Parsing aims at finding a parse tree
- The leaves of the tree are the words in the string
Parsing cont'd
- Two extremes:
  - Top-down parsing
    Start with S and search for a tree whose leaves are the words
  - Bottom-up parsing
    Start with the words and look for a tree with root S
Top-down parsing
- Initial state
  Root S with unknown children: [S: ?]
- Successor function
  Replace a ? with a list built from a right-hand side in the grammar for that parent node
  Ex:
  1. [S: ?]
  2. [S: [S: ?][Conjunction: ?][S: ?]]
     [S: [NP: ?][VP: ?]]
  3. ...
- Goal test
  The leaves correspond exactly to the words in the string
Bottom-up parsing
- Initial state
  Each word is its own parse tree
- Successor function
  Look for a subsequence that matches the right-hand side of a grammar rule and replace it with a new tree whose category is the left-hand side and whose children are the subsequence
- Goal test
  A single tree with root S
Ex: Bottom-up parsing (worked example, step by step over several slides)
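Below is a minimal sketch of the bottom-up search just described, for a tiny fragment of the E0 grammar (the names GRAMMAR, LEXICON and bottom_up_parse are illustrative, not from the lecture):

    # Bottom-up parsing as search: states are forests of trees, the successor
    # function reduces a subsequence matching a rule, the goal is one S tree.
    GRAMMAR = {                      # category -> list of right-hand sides
        "S":  [("NP", "VP")],
        "NP": [("Pronoun",), ("Article", "Noun")],
        "VP": [("Verb",), ("VP", "NP")],
    }
    LEXICON = {"I": "Pronoun", "feel": "Verb", "a": "Article", "breeze": "Noun"}

    def bottom_up_parse(words):
        start = tuple((LEXICON[w], (w,)) for w in words)   # one tree per word
        frontier, seen = [start], {start}
        while frontier:
            forest = frontier.pop()
            if len(forest) == 1 and forest[0][0] == "S":   # goal test
                return forest[0]
            cats = [tree[0] for tree in forest]
            for lhs, rhss in GRAMMAR.items():              # successor function
                for rhs in rhss:
                    n = len(rhs)
                    for i in range(len(cats) - n + 1):
                        if tuple(cats[i:i + n]) == rhs:
                            new = forest[:i] + ((lhs, forest[i:i + n]),) + forest[i + n:]
                            if new not in seen:
                                seen.add(new)
                                frontier.append(new)
        return None

    print(bottom_up_parse("I feel a breeze".split()))
    # -> a tree ('S', (NP-tree, VP-tree)) covering "I feel a breeze"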
Improvements needed
- Both top-down and bottom-up parsing can be very inefficient
- A huge number of different parses is built for the various phrases
- Some sentences have exponentially many parse trees
- See the book for suggested improvements
- For example, store intermediate results for substrings

Semantics
- Ex: "John loves Mary"
- How do we get to the logical sentence Loves(John, Mary)?
- "John" is the NP and "loves Mary" is the VP
- The NP "John" corresponds to the logical term John
- The VP "loves Mary" is trickier
Parse tree for "John loves Mary"

Predicate
- We call "loves Mary" a predicate
- Combined with a person it gives a logical sentence
- Can use λ-notation:
  λx Loves(x, Mary)
- Can make predicates out of verbs as well:
  "loves" ⇒ λy λx Loves(x, y)
  "loves Mary" ⇒ λx Loves(x, Mary)
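The λ-composition above can be mimicked directly with Python lambdas (a minimal sketch; the string output merely stands in for the logical sentence):

    # "loves"      => λy λx Loves(x, y)
    loves = lambda y: lambda x: f"Loves({x}, {y})"
    # "loves Mary" => λx Loves(x, Mary)
    loves_mary = loves("Mary")
    # Applying the predicate to the NP "John" gives the logical sentence
    print(loves_mary("John"))   # -> Loves(John, Mary)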
Semantics
- Also need to account for:
  - Time and tense, e.g. "Loves" and "Loved"
  - Quantification, e.g. "Everyone loves someone"
    Do we all love the same person, or do we each have our own love?
Parsing real languages
- Ambiguity
- Anaphora
- Indexicality
- Vagueness
- Metonymy
- Metaphor
- Noncompositionality
Example: Ambiguity
- Examples of ambiguity:
  - Squad helps dog bite victim
  - Helicopter powered by human flies
  - American pushes bottle up Germans
  - I ate spaghetti with meatballs / salad / a fork / a friend
- Ambiguity can be lexical (polysemy), syntactic, semantic or referential

Indexicality
- Indexical sentences refer to the utterance situation (place, time, S/H, etc)
- Examples:
  - I am over here
  - Why did you do that?

Anaphora
- Using pronouns to refer back to entities already introduced in the text
- Examples:
  - After Mary proposed to John, they found a preacher and got married.
    For the honeymoon, they went to Hawaii
  - Mary saw a ring through the window and asked John for it
  - Mary threw a rock at the window and broke it
Metonymy
- Using one noun to stand for another
- Examples:
  - I dropped Russell and Norvig on the floor
  - The ham sandwich on table 4 wants another beer

Metaphor
- "Non-literal" usage of words and phrases, often systematic
- Examples:
  - Men are pigs
  - You are my sunshine
  - I've tried killing the process but it won't die. Its parent keeps it alive

Examples: Noncompositionality
- The meaning of the whole is not a simple composition of the meanings of the parts:
  - red book, red pen, red hair, red herring
  - small moon, large molecule

Example: Not keeping the words together
- The English influence on the Swedish language is large
- Swedes have a tendency to not keep compound words together
- This can change the meaning completely
  - En brunhårig sjuksköterska ⇒ a nurse with brown hair
    En brun hårig sjuk sköterska ⇒ a brown, hairy, sick nurse
  - KYCKLING LEVER: "the chicken is alive" instead of "chicken liver"
  - BAD SHORTS: "asked the shorts" instead of "swimming shorts"
  - HUGG ORM: "stab the snake" instead of "viper"
  - SKUM TOMTE: "weird Santa" instead of soft candy in the shape of Santa
  - RÖK FRITT: "smoke freely" instead of "free from smoking"
- See http://www.skrivihop.nu/
Disambiguation
- Disambiguation requires knowledge of different kinds
- More examples:
  basketball shoes, baby shoes, alligator shoes, designer shoes
- World model
  The likelihood that it occurs in the world
- Mental model (of the speaker)
  Would the speaker communicate this if it occurred?
- Language model
  The likelihood of choosing a certain string of words given what to communicate
- Acoustic model
  The likelihood that a particular sound will be generated given the string of words
Probabilistic Language Processing

Probabilistic language model
- Idea: instead of building up very complicated grammars for natural language, learn from text written by humans
- So-called corpus-based approach
- Defines a probability distribution over a set of strings
  (can be an infinite set of strings)
- Examples:
  - Unigram model
    Assign a probability to each word; each word is treated independently, P(S) = ∏_i P(w_i)
  - Bigram model
    Assign probabilities P(w_i | w_{i−1}) to each word w_i
  - n-gram model
    Assign probabilities P(w_i | w_{i−1}, w_{i−2}, ..., w_{i−n+1})

Building a probabilistic model
- The WWW provides an enormous amount of training data
- Roughly: count occurrences to get estimates of the probabilities
- Can handle any string
- Can handle many views on what is correct
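A minimal sketch of this count-based estimation (the tiny corpus and all names are illustrative, not from the lecture):

    # Estimate unigram and bigram probabilities by counting occurrences.
    from collections import Counter

    corpus = "i feel a breeze and i smell a wumpus".split()

    unigrams = Counter(corpus)                      # single-word counts
    bigrams = Counter(zip(corpus, corpus[1:]))      # adjacent word-pair counts
    N = len(corpus)

    def p_unigram(w):
        return unigrams[w] / N

    def p_bigram(w, prev):
        # P(w | prev) = count(prev, w) / count(prev)
        return bigrams[(prev, w)] / unigrams[prev] if unigrams[prev] else 0.0

    def p_sentence_unigram(words):
        # Unigram model: words are independent, P(S) = product of P(w_i)
        p = 1.0
        for w in words:
            p *= p_unigram(w)
        return p

    print(p_unigram("a"))               # 2/9
    print(p_bigram("wumpus", "a"))      # 1/2
    print(p_sentence_unigram("i feel a wumpus".split()))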
Examples of models
- Example sentences generated with the AI book as corpus:
  - Unigram:
    "logical are as are confusion a may right tries agent goal the was diesel more object then information-gathering search is"
  - Bigram:
    "planning purely diagnostic expert systems are very similar computational approach would be represented compactly using tic tac toe a predicate"
  - Trigram:
    "planning and scheduling are integrated the success of naive bayes model is just a possible prior source by that time"
- The trigram model is clearly best, and so say the models:
  trigram P(S) = 10^-10, bigram P(S) = 10^-29, unigram P(S) = 10^-59
AI-book as corpus
- Approximately 500,000 words
- 15,000 different words
- ⇒ the bigram model has 225 million word pairs
- Most of these pairs will never occur
- However, we cannot assign them probability zero
- That would make it impossible to generate them
Smoothing
- Need to smooth our model
- Simplest strategy: add-one smoothing
  - Every bigram gets at least count 1
  - Results are often not so good
- Better strategy: linear interpolation smoothing
  - Combine the unigram, bigram and trigram models:
    P̂(w_i | w_{i−1}, w_{i−2}) = c_1 P(w_i) + c_2 P(w_i | w_{i−1}) + c_3 P(w_i | w_{i−1}, w_{i−2})
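A minimal sketch of the interpolation formula above, assuming unigram, bigram and trigram estimates (p_uni, p_bi, p_tri) are already available; all names are illustrative:

    def p_interpolated(w, prev1, prev2, p_uni, p_bi, p_tri, c=(0.1, 0.3, 0.6)):
        """P_hat(w | prev1, prev2) = c1*P(w) + c2*P(w | prev1) + c3*P(w | prev1, prev2).
        prev1 is the previous word, prev2 the one before it; c1 + c2 + c3 = 1."""
        c1, c2, c3 = c
        return c1 * p_uni(w) + c2 * p_bi(w, prev1) + c3 * p_tri(w, prev1, prev2)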
Evaluation
- Split the corpus in two parts:
  - training data
  - validation data
- Learn the model on the training data
- Calculate the probability that the model assigns to the validation data (which we assume is correct)
- The higher the score, the better
- Problems with long strings
  See for example perplexity in the book: 2^(−log2(P(words))/N)

SCIgen - An Automatic CS Paper Generator
- Example: a system that can automatically generate a paper on computer science
- Full-length paper with figures, references, etc.
- http://pdos.csail.mit.edu/scigen/

Examples of using the corpus-based approach
- Information retrieval
- Information extraction
- Machine translation
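A minimal sketch of the perplexity measure mentioned in the Evaluation slide above (names are illustrative):

    # Perplexity = 2^(-log2(P(words))/N); lower is better on held-out text.
    import math

    def perplexity(log2_probs):
        """log2_probs: per-word log2 probabilities assigned by the model."""
        N = len(log2_probs)
        return 2 ** (-sum(log2_probs) / N)

    # A model that assigns probability 1/8 to each of four words:
    print(perplexity([math.log2(1 / 8)] * 4))   # -> 8.0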
Information retrieval (IR)
- Find information in a corpus that is of interest to the user
- Compare with Google
- Characterized by:
  - Document collection (corpus)
  - Query language (how to ask for information)
  - Result set (relevant documents)
  - Presentation of results (ranking?)
- Typically rather simple language models
  (huge amounts of data!)
Evaluating IR systems
- Recall
  How many of the relevant documents are in the result set?
  (Returning all documents gives 100% recall)
- Precision
  How many of the documents in the result set are relevant?
  (Returning all documents gives ≈0% precision)
- Recall and precision are typically presented in a ROC curve
  false negatives on the y-axis (good recall gives few false negatives)
  false positives on the x-axis (good precision gives few false positives)
- Time to answer
- Average reciprocal rank
  1 / (rank of the first relevant document), averaged over queries
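A minimal sketch of these measures (the function and variable names are illustrative, not from the lecture):

    # Recall, precision and reciprocal rank for a ranked result list.
    def recall(result_set, relevant):
        """Fraction of the relevant documents that appear in the result set."""
        return len(set(result_set) & set(relevant)) / len(relevant)

    def precision(result_set, relevant):
        """Fraction of the result set that is relevant."""
        return len(set(result_set) & set(relevant)) / len(result_set)

    def reciprocal_rank(ranked_results, relevant):
        """1 / rank of the first relevant document (0 if none is returned)."""
        for rank, doc in enumerate(ranked_results, start=1):
            if doc in relevant:
                return 1 / rank
        return 0.0

    ranked = ["d3", "d7", "d1", "d4"]
    relevant = {"d1", "d2"}
    print(recall(ranked, relevant), precision(ranked, relevant), reciprocal_rank(ranked, relevant))
    # -> 0.5 0.25 0.3333333333333333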
Information extraction
- Create database entries by searching for information automatically
- For example pricerunner
- Do not have to analyze everything in a document

Machine translation
- Translate text from one natural language to another
- Some example tasks:
  - Rough translation
    To get an idea of what a piece of text is about
  - Restricted-source translation
    Accurate translation of material in a limited area (e.g. weather reports)
  - Preedited translation
    A human pre-edits the text to a subset of the original language
    Ex: instruction manuals
  - Literary translation
    Preserve all nuances in the translation
    Not possible today!
- Try http://world.altavista.com/