Formal Semantics
Slides by Julia Hockenmaier, Laura
McGarrity, Bill McCartney, Chris
Manning, and Dan Klein
Question Answering:
IBM’s Watson
Jeopardy challenge:
https://www.youtube.com/watch?v=seNkjYyG3gI
Question Answering:
IBM’s Watson
What components does Watson need?
– Named-entity recognition
– Named-entity disambiguation
– Phrase chunking
– Relation extraction
– Word sense disambiguation
Formal Semantics
It comes in two flavors:
• Lexical Semantics: The meaning of words
• Compositional semantics: How the meaning
of individual units combine to form the
meaning of larger units
What is meaning?
• Meaning ≠ Dictionary entries
Dictionaries define words using words.
Circularity!
Reference
• Referent: the thing/idea in the world that a
word refers to
• Reference: the relationship between a word
and its referent
Reference
[Diagram: the words “Barack Obama” and “president” both pointing to the same person]
The president is the commander-in-chief.
= Barack Obama is the commander-in-chief.
Reference
[Diagram: “Barack Obama” and “president” again labeling the same person]
I want to be the president.
≠ I want to be Barack Obama.
Reference
• Tooth fairy?
• Phoenix?
• Winner of the 2016 presidential election?
What is meaning?
• Meaning ≠ Dictionary entries
• Meaning ≠ Reference
Sense
• Sense: The mental representation of a word
or phrase, independent of its referent.
Sense ≠ Mental Image
• A word may have different mental images for
different people.
– E.g., “mother”
• A word may conjure a typical mental image (a
prototype), but can signify atypical examples as
well.
Sense v. Reference
• A word/phrase may have sense, but no
reference:
– King of the world
– The camel in CIS 3203
– The greatest integer
– The
• A word may have reference, but no sense:
– Proper names: Dan McCloy, Kristi Krein
(who are they?!)
Sense v. Reference
• A word may have the same referent, but more
than one sense:
– The morning star / the evening star (Venus)
• A word may have one sense, but multiple
referents:
– Dog, bird
Some semantic relations
between words
• Hyponymy: subclass
– Poodle < dog
– Crimson < red
– Red < color
– Dance < move
• Hypernymy: superclass
• Synonymy:
– Couch/sofa
– Manatee / sea cow
• Antonymy:
– Dead/alive
– Married/single
Lexical Decomposition
• Word sense can be represented with semantic features, e.g., bachelor = [+HUMAN, +MALE, −MARRIED]
Compositional Semantics
Compositional Semantics
• The study of how meanings of small units
combine to form the meaning of larger units
The dog chased the cat ≠ The cat chased the dog.
i.e., the whole does not equal the sum of the parts.
The dog chased the cat = The cat was chased by the dog
i.e., syntax matters in determining meaning.
Principle of Compositionality
The meaning of a sentence is determined by
the meaning of its words in conjunction with
the way they are syntactically combined.
Exceptions to Compositionality
• Anomaly: when phrases are well-formed
syntactically, but not semantically
– Colorless green ideas sleep furiously. (Chomsky)
– That bachelor is pregnant.
Exceptions to Compositionality
• Metaphor: the use of an expression to refer
to something that it does not literally denote
in order to suggest a similarity
– Time is money.
– The walls have ears.
Exceptions to Compositionality
• Idioms: Phrases with fixed meanings not
composed of literal meanings of the words
– Kick the bucket = die
(*The bucket was kicked by John.)
– When pigs fly = ‘it will never happen’
(*She suspected pigs might fly tomorrow.)
– Bite off more than you can chew
= ‘to take on too much’
(*He chewed just as much as he bit off.)
Idioms in other languages
Logical Foundations
for Compositional Semantics
• We need a language for expressing the
meaning of words, phrases, and sentences
• Many possible choices; we will focus on
– First-order predicate logic (FOPL) with types
– Lambda calculus
Truth-conditional Semantics
• Linguistic expressions
– “Bob sings.”
• Logical translations
– sings(Bob)
– but could be p_5789023(a_257890)
• Denotation:
– [[bob]] = some specific person (in some context)
– [[sings(bob)]] = true, in situations where Bob is singing; false, otherwise
• Types on translations:
– bob: e(ntity)
– sings(bob): t(rue or false, a boolean type)
Truth-conditional Semantics
Some more complicated logical descriptions of language:
– “All girls like a video game.”
– x . y . girl(x)  [video-game(y)  likes(x,y)]
– “Alice is a former teacher.”
– (former(teacher))(Alice)
– “Alice saw the cat before Bob did.”
– ∃x, y, z, t1, t2 .
cat(x) ∧ see(y) ∧ see(z) ∧
agent(y, Alice) ∧ patient(y, x) ∧
agent(z, Bob) ∧ patient(z, x) ∧
time(y, t1) ∧ time(z, t2) ∧ <(t1, t2)
FOPL Syntax Summary
• A set of constants C = {c₁, …}
• A set of relations R = {r₁, …}, where each rᵢ is a subset of Cⁿ for some n
• A set of variables X = {x₁, …}
• Connectives and quantifiers: ∧, ∨, ¬, →, ↔, ∀, ∃
Truth-conditional semantics
• Proper names:
– Refer directly to some entity in the world
– Bob: bob
• Sentences:
– Are either t or f, so they are FOL sentences
– Bob sings: sings(bob)
• So what about verbs and VPs?
– sings must combine with bob to produce sings(bob)
– The λ-calculus is a notation for functions whose arguments are not yet
filled.
– sings: λx.sings(x)
– This is a predicate, a function that returns a truth value. In this case, it
takes a single entity as an argument, so we can write its type as e → t
Lambda calculus
• FOL + λ (new quantifier) will be our lambda calculus
• Intuitively, λ is just a way of creating a function
– E.g., girl() is a relation symbol; but
λx . girl(x) is a function that takes one argument.
• New inference rule: function application
(λx . L1(x)) (L2)
→ L1(L2)
E.g., (λx . x²) (3) → 3²
E.g., (λx . sings(x)) (Bob) → sings(Bob)
• Lambda calculus lets us describe the meaning of words individually.
– Function application (and a few other rules) then lets us combine those
meanings to come up with the meaning of larger phrases or sentences.
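Function application is easy to mimic with Python lambdas; a minimal sketch, where SINGERS and the constants are stand-ins for a real model:

```python
# Meanings as Python functions: beta reduction is just a function call.
SINGERS = {"bob"}

sings = lambda x: x in SINGERS   # λx . sings(x), a predicate of type e -> t
bob = "bob"                      # a constant of type e

print(sings(bob))                # (λx . sings(x))(bob) -> sings(bob) -> True

apply_to_bob = lambda f: f(bob)  # λf . f(bob), takes a predicate as argument
print(apply_to_bob(sings))       # -> sings(bob) -> True
```

The last line mirrors the third quiz item below: a function that takes a predicate and applies it to a fixed entity.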
Quiz: Lambda calculus
For each lambda calculus expression below, find
a simplified form:
• (λx . x) (-19)
• (λx . canFly(x)) (PollyParrot)
• (λf . f(PollyParrot)) (λx . canFly(x))
Answer: Lambda calculus
For each lambda calculus expression below, find
a simplified form:
• (λx . x) (-19)  -19
• (λx . canFly(x)) (PollyParrot) 
canFly(PollyParrot)
• (λf . f(PollyParrot)) (λx . canFly(x)) 
(λx . canFly(x)) (PollyParrot) 
canFly(PollyParrot)
Quiz: Lambda calculus 2
For each lambda calculus expression below, find
a factored form, where each factor contains
some portion of the original:
• canFly(PollyParrot)
• likes(SuzySueMae, JimmyJoeBob)
• 2
Answer: Lambda calculus 2
For each lambda calculus expression below, find a factored form,
where each factor contains some portion of the original:
• canFly(PollyParrot) →
λx . canFly(x), PollyParrot OR
λf . f(PollyParrot), λx . canFly(x)
• likes(SuzySueMae, JimmyJoeBob) →
λx . likes(x, JimmyJoeBob), SuzySueMae OR
λx . likes(SuzySueMae, x), JimmyJoeBob OR
λx . λy . likes(x, y), SuzySueMae, JimmyJoeBob OR EVEN
λf . λx . λy . f(x, y), SuzySueMae, JimmyJoeBob,
λa.λb.likes(a, b)
• 2
Can’t do it.
Only real option: λx . x, 2. But the first factor has nothing of the
original.
Compositional Semantics
with the λ-calculus
Associate a combination rule with each
grammar rule:
– S : β(α)  NP : α VP : β (function application)
– VP : λx. α(x) ∧ β(x)  VP : α and : ∅ VP : β
(intersection)
• Example:
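A minimal sketch of these two rules in Python, using callables as meanings; the lexical entries are illustrative stand-ins:

```python
# The two combination rules, with meanings as Python callables.
bob = "bob"
sings = lambda x: x == "bob"     # pretend only Bob sings
dances = lambda x: x == "bob"    # ... and only Bob dances

def s_rule(np, vp):
    # S : β(α) → NP : α  VP : β   (function application)
    return vp(np)

def vp_and_rule(vp1, vp2):
    # VP : λx. α(x) ∧ β(x) → VP : α  and  VP : β   (intersection)
    return lambda x: vp1(x) and vp2(x)

# "Bob sings and dances"
print(s_rule(bob, vp_and_rule(sings, dances)))   # True
```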
Composition: Some more examples
• Transitive verbs:
– likes : λx.λy.likes(y,x)
– VP “likes Amy” : λy.likes(y,Amy) is just a one-place predicate
• Quantifiers:
– What does “everyone” mean?
– Everyone : λf.x.f(x)
– Some problems:
• Have to change our NP/VP rule
• Won’t work for “Amy likes everyone”
– What about “Everyone likes someone”?
– Gets tricky quickly!
Composition: Some more examples
• Indefinites
– The wrong way:
• “Bob ate a waffle” : ate(bob,waffle)
• “Amy ate a waffle” : ate(amy,waffle)
– Better translation:
• ∃x.waffle(x) ∧ ate(bob, x)
Composition Example
∃x.waffle(x) ∧ ate(bob, x)
Use factoring to determine the meaning of each node in the tree.
Quiz: Composition
∃x.waffle(x) ∧ ate(bob, x)
bob
λy. ∃x.waffle(x) ∧ ate(y, x)
By repeatedly applying factoring, what is the lambda calculus form for
• ate?
• waffle?
• a?
Answer: Composition
∃x.waffle(x) ∧ ate(bob, x)
λy. ∃x.waffle(x) ∧ ate(y, x)
bob
λf. λy. ∃x.waffle(x) ∧ f(y, x)
λa. λb. ate(a, b)
λc. waffle(c)
λg. λf. λy. ∃x.g(x) ∧ f(y, x)
By repeatedly applying factoring, what is the lambda calculus form for
• ate? λa. λb. ate(a, b)
• waffle? λc. waffle(c)
• a? λg. λf. λy. ∃x.g(x) ∧ f(y, x)
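These factored meanings can be executed directly against a toy database, previewing the denotation slide below; a sketch, with the domain and facts invented for illustration:

```python
# Executing the factored meanings over a toy domain.
DOMAIN = {"bob", "waffle1", "pancake1"}
ATE = {("bob", "waffle1")}                 # made-up facts

waffle = lambda c: c == "waffle1"          # λc. waffle(c)
ate = lambda a, b: (a, b) in ATE           # λa. λb. ate(a, b)

# a : λg. λf. λy. ∃x. g(x) ∧ f(y, x); ∃ becomes any() over the finite domain
a = lambda g: lambda f: lambda y: any(g(x) and f(y, x) for x in DOMAIN)

a_waffle = a(waffle)(ate)                  # λy. ∃x. waffle(x) ∧ ate(y, x)
print(a_waffle("bob"))                     # True: Bob ate a waffle
print(a_waffle("waffle1"))                 # False
```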
Denotation
• What do we do with the logical form?
– It has fewer (no?) ambiguities
– Can check the truth-value against a database
– More usefully: can add new facts, expressed in
language, to an existing relational database
– Question-answering: can check whether a statement
in a corpus entails a question-answer pair:
“Bob sings and dances” →
Q: “Who sings?” has answer A: “Bob”
– Can chain together facts for story comprehension
Grounding
• What does the translation likes : λx. λy. likes(y,x) have
to do with actual liking?
• Nothing! (unless the denotation model says it does)
• Grounding: relating linguistic symbols to perceptual
referents
– Sometimes a connection to a database entry is enough
– Other times, you might insist on connecting “blue” to the
appropriate portion of the visual EM spectrum
– Or connect “likes” to an emotional sensation
• Alternative to grounding: meaning postulates
– You could insist, e.g., that likes(y,x) => knows(y,x)
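A minimal sketch of enforcing that postulate over a fact database; the facts are invented for illustration:

```python
# Close a fact set under the meaning postulate likes(y,x) => knows(y,x).
likes = {("amy", "bob")}
knows = {("bob", "amy")}

knows |= likes                    # add knows(y, x) for every likes(y, x)
print(("amy", "bob") in knows)    # True, forced by the postulate
```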
More representation issues
• Tense and events
– In general, you don’t get far with verbs as predicates
– Better to have event variables e
• “Alice danced” : danced(Alice) vs.
• “Alice danced” : ∃e.dance(e) ∧ agent(e, Alice) ∧ time(e) < now
– Event variables let you talk about non-trivial
tense/aspect structures:
“Alice had been dancing when Bob sneezed”
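One plausible event-based encoding of that sentence (the temporal predicates start, end, and time are assumptions, not from the slides):
∃e1, e2 . dance(e1) ∧ agent(e1, Alice) ∧ sneeze(e2) ∧ agent(e2, Bob) ∧ start(e1) < time(e2) ∧ time(e2) ≤ end(e1) ∧ time(e2) < now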
More representation issues
• Propositional attitudes (modal logic)
– “Bob thinks that I am a gummi bear”
• thinks(bob, gummi(me))?
• thinks(bob, “He is a gummi bear”)?
– Usually, the solution involves intensions (^p) which are,
roughly, the set of possible worlds in which predicate p is
true.
• thinks(bob, ^gummi(me))
– Computationally challenging
• Each agent has to model every other agent’s mental state
• This comes up all the time in language –
– E.g., if you want to talk about what your bill claims that you bought, vs.
what you think you bought, vs. what you actually bought.
More representation issues
• Multiple quantifiers:
“In this country, a woman gives birth every 15 minutes.
Our job is to find her, and stop her.”
-- Groucho Marx
• Deciding between readings
– “Bob bought a pumpkin every Halloween.”
– “Bob put a warning in every window.”
More representation issues
• Other tricky stuff
– Adverbs
– Non-intersective adjectives
– Generalized quantifiers
– Generics
• “Cats like naps.”
• “The players scored a goal.”
– Pronouns and anaphora
• “If you have a dime, put it in the meter.”
– … etc., etc.
Mapping Sentences
to Logical Forms
CCG Parsing
• Combinatory Categorial
Grammar
– Lexicalized PCFG
– Categories encode
argument sequences
• A/B means a category that
can combine with a B to
the right to form an A
• A \ B means a category
that can combine with a B
to the left to form an A
– A syntactic parallel to the
lambda calculus
Learning to map sentences
to logical form
• Zettlemoyer and Collins (IJCAI 05, EMNLP 07)
Some Training Examples
CCG Lexicon
Parsing Rules (Combinators)
Application
Right: X : f(a)  X/Y : f Y : a
Left: X : f(a)  Y : a X\Y : f
Additional rules:
• Composition
• Type-raising
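A minimal sketch of the two application combinators in Python; the tuple encoding of categories is an assumption for illustration:

```python
# Forward/backward application over CCG categories.
# Categories are tuples, semantics are Python callables;
# ("/", A, B) encodes A/B and ("\\", A, B) encodes A\B.

def forward_apply(left, right):
    # X : f(a) <- X/Y : f  Y : a
    (cat_f, sem_f), (cat_a, sem_a) = left, right
    if cat_f[0] == "/" and cat_f[2] == cat_a:
        return (cat_f[1], sem_f(sem_a))

def backward_apply(left, right):
    # X : f(a) <- Y : a  X\Y : f
    (cat_a, sem_a), (cat_f, sem_f) = left, right
    if cat_f[0] == "\\" and cat_f[2] == cat_a:
        return (cat_f[1], sem_f(sem_a))

# "Texas borders Kansas" with borders : (S\NP)/NP : λx.λy.borders(y,x)
texas  = ("NP", "texas")
kansas = ("NP", "kansas")
borders = (("/", ("\\", "S", "NP"), "NP"),
           lambda x: lambda y: "borders(%s,%s)" % (y, x))

vp = forward_apply(borders, kansas)   # S\NP : λy.borders(y,kansas)
s  = backward_apply(texas, vp)        # S : borders(texas,kansas)
print(s)                              # ('S', 'borders(texas,kansas)')
```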
CCG Parsing Example
Lexical Generation
Input Training Example
Sentence:
Texas borders Kansas.
Logical form:
borders(Texas, Kansas)
GENLEX
• Input: a training example (Si, Li)
• Computation:
– Create all substrings of consecutive words in Si
– Create categories from Li
– Create lexical entries that are the cross products
of these two sets
• Output: Lexicon Λ
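A sketch of the cross product in Python; the candidate categories are hand-supplied here, whereas GENLEX derives them from the logical form by rule:

```python
# GENLEX sketch: lexical candidates are the cross product of all
# substrings of the sentence with candidate categories.
def substrings(sentence):
    words = sentence.split()
    return [" ".join(words[i:j])
            for i in range(len(words))
            for j in range(i + 1, len(words) + 1)]

sentence = "Texas borders Kansas"
categories = [
    "NP : texas",
    "NP : kansas",
    r"(S\NP)/NP : λx.λy.borders(y,x)",
]

lexicon = [(phrase, cat)
           for phrase in substrings(sentence)
           for cat in categories]

print(len(lexicon))   # 6 substrings x 3 categories = 18 entries
print(lexicon[0])     # ('Texas', 'NP : texas')
```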
GENLEX Cross Product
Input Training Example
Sentence:
Texas borders Kansas.
Logical form:
borders(Texas, Kansas)
Output Lexicon
Output Substrings
Texas
borders
Kansas
Texas borders
borders Kansas
Texas borders Kansas
× (cross product)
Output Categories
NP : texas
NP : kansas
(S\NP)/NP : λx.λy.borders(y,x)
GENLEX Output Lexicon
Words                   Category
Texas                   NP : texas
Texas                   NP : kansas
Texas                   (S\NP)/NP : λx.λy.borders(y,x)
borders                 NP : texas
borders                 NP : kansas
borders                 (S\NP)/NP : λx.λy.borders(y,x)
…                       …
Texas borders Kansas    NP : texas
Texas borders Kansas    NP : kansas
Texas borders Kansas    (S\NP)/NP : λx.λy.borders(y,x)
Example Learned Lexical Entries
Geo880 Test Set
                              Precision   Recall   F1
Zettlemoyer & Collins 2007    95.49       83.20    88.93
Zettlemoyer & Collins 2005    96.25       79.29    86.95
Wong & Mooney 2007            93.72       80.00    86.31
Challenge revisited
Suppose this is your training data:
How well will your semantic parser process a question like
this one:
Who scored the most points in the 2005-2006 NHL season?
Building a Large-scale QA System
How can we generalize a QA system to all of the
information in Wikipedia?
In Freebase?
www.freebase.com
In IMDB, ESPN, Yahoo! Finance, Twitter, and …?
Learning meanings of words without
labeled examples
directed_by Table
Director      Film
Ang Lee       Life of Pi
Luca Boni     Zombie Massacre
Kay Hawtrey   Face-Off
…             …
Known semantics:
λx. λy. directed_by(y,x)
Want lexical entries:
directed by → λx. λy. directed_by(y,x)
directing → λx. λy. directed_by(y,x)
Web search results:
“Life of Pi,” directed by Ang Lee and based on
the novel by Yann Martel, features a young
man, a tiger and lots of talk about God.
A review of "Life of Pi," directed by Ang Lee.
Ang Lee poses with his award for best
directing for "Life of Pi" during the Oscars at
the Dolby Theatre on Feb.
Taiwanese-born Ang Lee won his second Oscar
for Best Directing on Sunday for Life of Pi.
Extract critical words: directed by, directing, director of, …
Create new lexical entries
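A sketch of the extraction step: take the words between a known (director, film) pair in each snippet as candidate triggers; the snippet handling is deliberately simplified:

```python
# Mine candidate trigger phrases from text between a known
# (film, director) pair. Greatly simplified for illustration.
import re

snippets = [
    'A review of "Life of Pi," directed by Ang Lee.',
    'Ang Lee won his second Oscar for Best Directing on Sunday for Life of Pi.',
]

film, director = "Life of Pi", "Ang Lee"

for s in snippets:
    # words between the two entity mentions, in either order
    m = (re.search(re.escape(film) + r'\W+(.*?)' + re.escape(director), s)
         or re.search(re.escape(director) + r'\W+(.*?)' + re.escape(film), s))
    if m:
        print(m.group(1).strip())
# -> directed by
# -> won his second Oscar for Best Directing on Sunday for
```

Candidates that recur across many (film, director) pairs, such as directed by, would then be turned into lexical entries with the known semantics λx. λy. directed_by(y,x); rare, noisy candidates like the second one above get filtered out.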
Summing Up
• Hypothesis: Principle of Compositionality
– Semantics of NL sentences and phrases can be composed
from the semantics of their subparts
• Rules can be derived which map syntactic analysis to semantic
representation (Rule-to-Rule Hypothesis)
– Lambda notation provides a way to extend FOPC to this
end
– But coming up with rule-to-rule mappings is hard
• Idioms, metaphors, and other non-compositional aspects of language make things tricky (e.g., fake gun)