winter-school-day4

Issues in Computational Linguistics:
Semantics
Dick Crouch & Tracy King
Overview

What is semantics?
– Aims & challenges of syntax-semantics interface

Introduction to Glue Semantics:
– Linear logic for meaning assembly

Topics in Glue:
– The glue logic
– Quantified NPs
– Type raising & intensional verbs
– Coordination
– Control
– Skeletons and modifiers
What is Semantics?

Traditional Definition:
– Study of logical relations between sentences

Formal Semantics:
– Map sentences onto logical representations making relations explicit

  All men are mortal          ∀x. man(x) → mortal(x)
  Socrates is a man           man(socrates)
  Socrates is mortal          mortal(socrates)

Computational Semantics:
– Algorithms for inference/knowledge-based applications
Logical & Collocational Semantics

Logical Semantics
– Map sentences to logical representations of meaning
– Enables inference & reasoning

Collocational semantics
– Represent word meanings as feature vectors
– Typically obtained by statistical corpus analysis
– Good for indexing, classification, language modeling, word sense
  disambiguation
– Currently does not enable inference

Complementary, not conflicting, approaches
What does semantics have
that f-structure doesn’t?

Repackaged information, e.g.:
– Logical formulas instead of AVMs
– Adjuncts wrap around modifiees

Extra information, e.g.:
– Aspectual decomposition of events
  break(e,x,y) & functional(y,start(e)) & functional(y,end(e))
– Argument role assignments
  break(e) & cause_of_change(e,x) & object_of_change(e,y)

Extra ambiguity, e.g.:
– Scope
– Modification of semantic event decompositions
  e.g. Ed was observed putting up a deckchair for 5 minutes
Example Semantic Representation
The wire broke
Syntax (f-structure):

  [ PRED  break<SUBJ>
    SUBJ  [ PRED wire
            SPEC def
            NUM  sg ]
    TENSE past ]

Semantics (logical form):

  ∃w. wire(w) & w=part25 &
  ∃t. interval(t) & t<now &
  ∃e. break_event(e) & occurs_during(e,t) &
      object_of_change(e,w) &
      ∃c. cause_of_change(e,c)
F-structure gives basic predicate-argument structure, but lacks:
– Standard logical machinery (variables, connectives, etc.)
– Implicit arguments (events, causes)
– Contextual dependencies (the wire = part25)

Mapping from f-structure to logical form is systematic, but can
introduce ambiguity (not illustrated here)
Mapping sentences to logical forms

Borrow ideas from compositional compilation of programming languages
(with adaptations):

  Computer Program → parse → compile   → Object Code  → Execution
  NL Utterance     → parse → interpret → Logical Form → Inference
The Challenge to Compositionality
Ambiguity & context dependence

Strict compositionality (e.g. Montague)
– Meaning is a function of (a) syntactic structure, (b) lexical
choice, and (c) nothing else
– Implies that there should be no ambiguity in absence of
syntactic or lexical ambiguity

Counter-examples? (no syntactic or lexical ambiguity)
– Contextual ambiguity
  » John came in. He sat down. So did Bill.
– Semantic ambiguity
  » Every man loves a woman.
  » Put up a deckchair for 5 minutes
  » Pets must be carried on escalator
  » Clothes must be worn in public
Semantic Ambiguity

Syntactic & lexical ambiguity in formal languages
– Practical problem for program compilation
» Picking the intended interpretation
– But not a theoretical problem
» Strict compositionality generates alternate meanings

Semantic ambiguity a theoretical problem, leading to
– Ad hoc additions to syntax (e.g. Chomskyan LF)
– Ad hoc additions to semantics (e.g. underspecification)
– Ad hoc additions to interface (e.g. quantifier storage)
Weak Compositionality

Weak compositionality
– Meaning of the whole is a function of (a) the meaning of its
parts, and (b) the way those parts are combined
– But (a) and (b) are not completely fixed by lexical choice and
syntactic structure, e.g.
» Pronouns: incomplete lexical meanings
» Quantifier scope: combination not fixed by syntax

Glue semantics
– Gives formally precise account of weak compositionality
Modular Syntax-Semantics Interfaces

Different grammatical formalisms
– LFG, HPSG, Categorial grammar, TAG, minimalism, …

Different semantic formalisms
– DRT, Situation semantics, Intensional logic, …

Need for modular syntax-semantics interface
– Pair different grammatical & semantic formalisms

Possible modular frameworks
– Montague’s use of lambda-calculus
– Unification-based semantics
– Glue semantics (interpretation as deduction)
Some Claims

Glue is a general approach to the syntax-semantics interface
– Alternative to unification-based semantics, Montagovian λ-calculus

Glue addresses semantic ambiguity/weak compositionality

Glue addresses syntactic & semantic modularity

(Glue may address context dependence & update)
Glue Semantics
Dalrymple, Lamping & Saraswat 1993 and subsequently

Syntax-semantics mapping as linear logic inference

Two logics in semantics:
– Meaning Logic: any suitable target semantic representation
– Glue Logic: a fragment of linear logic that deductively assembles the
  target meaning

Syntactic analysis produces lexical glue premises

Semantic interpretation uses deduction to assemble final meaning from
these premises
Linear Logic


Influential development in theoretical computer science (Girard 87)

Premises are resources consumed in inference
(Traditional logic: premises are non-resourced)

  Traditional                        Linear
  A, A→B |= B                        A, A -o B |= B
  A, A→B |= A & B   (A re-used)      A, A -o B ⊭ A ⊗ B   (A consumed)
  A, B |= B         (A discarded)    A, B ⊭ B            (cannot discard A)

• Linguistic processing typically resource sensitive
  – Words used exactly once
Glue Interpretation (Outline)


Parsing a sentence instantiates lexical entries to produce lexical glue
premises

Example lexical premise (verb “saw” in “John saw Fred”):

  see : g -o (h -o f)
  (meaning term, a 2-place predicate : glue formula)

  g, h, f: constituents in parse
  “consume meanings of g and h to produce meaning of f”

• Glue derivation: Γ |= M : f
  – Consume all lexical premises Γ
  – to produce meaning, M, for entire sentence, f
Glue Interpretation
Getting the premises
Syntactic Analysis:

        S                  f: [ PRED see<SUBJ,OBJ>
      NP    VP                  SUBJ g: [ PRED John ]
    John   V    NP              OBJ  h: [ PRED Fred ] ]
          saw  Fred

Lexicon:
  John  NP  john: ↑
  Fred  NP  fred: ↑
  saw   V   see: (↑ SUBJ) -o ((↑ OBJ) -o ↑)

Premises:
  john: g
  fred: h
  see: g -o (h -o f)
Glue Interpretation
Deduction with premises
Premises:
  john: g
  fred: h
  see: g -o (h -o f)

Linear Logic Derivation (using linear modus ponens):

  g -o (h -o f)    g
  ------------------
       h -o f         h
       -----------------
              f

Derivation with Meaning Terms:

  see: g -o (h -o f)    john: g
  -----------------------------
     see(john): h -o f    fred: h
     ----------------------------
        see(john)(fred): f

Linear modus ponens = function application
Modus Ponens = Function Application
The Curry-Howard Isomorphism
Curry-Howard Isomorphism:
Pairs LL inference rules with operations on meaning terms

  Fun: g -o f    Arg: g
  ---------------------
      Fun(Arg): f
Propositional linear logic inference constructs meanings
LL inference completely independent of meaning language
(Modularity of meaning representation)
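The pairing of linear modus ponens with function application can be made concrete in a few lines of Python. The following is a minimal illustrative sketch (not XLE or any published glue prover), covering only the modus-ponens fragment; hypothetical reasoning is omitted. Premise names and encodings are assumptions for the example.

```python
# Minimal sketch of glue deduction via linear modus ponens.
# A premise is (meaning, glue-formula); an implication g -o (h -o f)
# is encoded as the nested tuple ('g', ('h', 'f')). Each premise is
# consumed exactly once, reflecting linear logic's resource sensitivity.

def derive(premises, goal):
    """Exhaustively apply linear modus ponens; return meanings derived
    for `goal` once every premise has been consumed."""
    results = []

    def step(prems):
        for i, (f_mean, f_glue) in enumerate(prems):
            if not isinstance(f_glue, tuple):   # atoms cannot be applied
                continue
            ant, cons = f_glue
            for j, (a_mean, a_glue) in enumerate(prems):
                if j != i and a_glue == ant:
                    rest = [p for k, p in enumerate(prems) if k not in (i, j)]
                    new = (f"{f_mean}({a_mean})", cons)  # MP = application
                    if not rest and cons == goal:
                        results.append(new[0])
                    else:
                        step(rest + [new])

    step(list(premises))
    return results

# "John saw Fred": see: g -o (h -o f), john: g, fred: h
premises = [("see", ("g", ("h", "f"))), ("john", "g"), ("fred", "h")]
print(derive(premises, "f"))  # → ['see(john)(fred)']
```

Because the search consumes premises rather than reusing them, a premise set with leftover resources (or a missing one) yields no derivation, exactly as the linear logic slides predict.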
Semantic Ambiguity
Multiple derivations from single set of premises
Alleged criminal from London

  f: [ PRED criminal
       ADJS { alleged, from London } ]

Premises:
  criminal: f
  alleged: f -o f
  from-London: f -o f

Two distinct derivations:
  1. from-London(alleged(criminal))
  2. alleged(from-London(criminal))
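The two readings above fall out of the order in which the two f -o f modifiers are applied to the skeleton. A small sketch (names illustrative) enumerates the permutations:

```python
# Why N modifiers of type f -o f give N! readings: each permutation of
# the modifiers around the skeleton yields a distinct derivation.
from itertools import permutations

skeleton = "criminal"
modifiers = ["alleged", "from_London"]

readings = set()
for order in permutations(modifiers):
    meaning = skeleton
    for mod in order:
        meaning = f"{mod}({meaning})"   # apply each f -o f modifier in turn
    readings.add(meaning)

print(sorted(readings))
# → ['alleged(from_London(criminal))', 'from_London(alleged(criminal))']
```

With N modifiers the loop produces N! readings, which is the combinatorial explosion that packing (later slides) is designed to manage.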
Quantifier Scope Ambiguity

Every cable is attached to a base-plate
– Has 2 distinct readings
– ∀x. cable(x) → ∃y. plate(y) & attached(x,y)
– ∃y. plate(y) & ∀x. cable(x) → attached(x,y)

Quantifier scope ambiguity accounted for by
mechanism just shown
– Multiple derivations from single set of premises
– More on this later
Semantic Ambiguity & Modifiers

Multiple derivations from single premise set
– Arises through different ways of permuting α -o α modifiers around
  an α skeleton

Modifiers given formal representation in glue as α -o α logical
identities
– E.g. an adjective is a noun -o noun modifier

Modifiers prevalent in natural language, and lead to combinatorial
explosion
– Given N α -o α modifiers, N! ways of permuting them around an α
  skeleton
Packing & Ambiguity Management

Exploit the explicit skeleton-modifier structure of glue derivations to
implement efficient theorem provers that manage the combinatorial
explosion
– Packing of N! analyses
» Represent all N! analyses in polynomial space
» Compute representation in polynomial time
» Read off any given analysis in linear time
– Packing through structure re-use
» N! analyses through combinations of N sub-analyses
» Compute each sub-analysis once, and re-use

Combine with packed output from XLE
Summary

Glue: semantic interpretation as (linear logic) deduction
– Syntactic analysis yields lexical glue premises
– Standard inference combines premises to construct sentence meaning

Resource sensitivity of linear logic reflects resource sensitivity of
semantic interpretation

Gives modular & general syntax-semantics interface

Models semantic ambiguity / weak compositionality

Leads to efficient implementations
Topics in Glue

– The glue logic
– Quantified NPs and scope ambiguity
– Type raising and intensionality
– Coordination
– Control
– Why glue is a good computational theory
Two Rules of Inference
Modus ponens / -o elimination:

  A: a    F: a -o b
  -----------------
      F(A): b

  F is a function of type a -o b that takes arguments of type a to give
  results of type b

Hypothetical reasoning / -o introduction:

  [x: a]
    :
  F(x): b
  ---------------  (discharging assumption)
  λx.F(x): a -o b

  Assume a and thus prove b; hence a implies b.
  Have shown that there is some function taking arguments, x, of type a,
  to give results, F(x), of type b.
  Call this function λx.F(x), of type a -o b.
λ-terms describe propositional proofs
A direct proof of f from g -o f and g:

  A: g    F: g -o f
  -----------------
      F(A): f

A roundabout proof of f from g -o f and g:

  [x: g]    F: g -o f
  -------------------
       F(x): f
  ---------------
  λx.F(x): g -o f    A: g
  -----------------------
     (λx.F(x))(A): f

By λ-reduction: (λx.F(x))(A) = F(A)

Intimate relation between λ-calculus and propositional inference
(Curry-Howard)
– λ-terms are descriptions of proofs
– Equivalent λ-terms mean equivalent proofs
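The λ-reduction step can be checked concretely: the roundabout proof's term (λx.F(x))(A) computes the same value as the direct proof's term F(A). F and A here are illustrative stand-ins, not anything from the slides' lexicon.

```python
# (λx.F(x))(A) = F(A): the two proofs denote the same meaning term.
F = lambda g: f"F({g})"
A = "A"

direct = F(A)                       # direct proof's term
roundabout = (lambda x: F(x))(A)    # roundabout proof's term, η-expanded

print(direct, roundabout)  # → F(A) F(A)
```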
Digression: Structured Meanings

Glue proofs as an intermediate level of structure in
semantic theory
– Identity conditions given by λ-equivalence
– Used to explore notions of semantic parallelism (Asudeh &
Crouch)

Unlike Montague semantics
– MS allows nothing between syntax and model theory.
– Logical formulas are not linguistic structures; cannot build
theories off arbitrary aspects of their notation

Unlike Minimal Recursion Semantics
– MRS uses partial descriptions of logical formulas
– A theory built off aspects of logical notation
Two kinds of semantic resource

Some nodes, n, in f-structure give rise to entity-denoting semantic
resources, e(n)
– e(n) is a proposition stating that n has an entity-denoting resource

Other nodes, n, give rise to proposition/truth-value denoting semantic
resources, t(n)
– t(n) is a proposition stating that n has a truth-denoting resource

Notational convenience:
– Write e(n) as n_e, or just n (when kind of resource is unimportant)
– Write t(n) as n_t, or just n (when kind of resource is unimportant)
Variables over f-structure nodes

The glue logic allows universal quantification over f-structure nodes,
e.g.
  ∀N. (e(g) -o t(N)) -o t(N)
– Important for dealing with quantified NPs

But the logic is still essentially propositional
– Quantification allows matching of variable propositions with atomic
  propositions, e.g. t(N) with t(f)

Notational convenience:
– Drop explicit quantifiers, and write variables over nodes as upper
  case letters, e.g.
  (g_e -o N_t) -o N_t
Non-Quantified and Quantified NPs
Non-quantified:                    Quantified:

f: [ PRED sleep                    f: [ PRED sleep
     SUBJ g: [ PRED John ] ]            SUBJ g: [ PRED everyone
                                                  QUANT + ] ]

Premises:                          Premises:
  sleep: g_e -o f_t                  sleep: g_e -o f_t
  john: g_e                          everyone: (g_e -o X_t) -o X_t

Derivation:                        Derivation:
  john: g    sleep: g -o f           sleep: g_e -o f_t
  ------------------------           everyone: (g_e -o X_t) -o X_t
      sleep(john): f                 ------------------------------
                                         everyone(sleep): f_t

everyone = λP.∀x. person(x) → P(x)
everyone(sleep)
  = λP.∀x. person(x) → P(x) [sleep]
  = ∀x. person(x) → sleep(x)
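The reduction above can be sketched directly, with meanings represented as strings (an illustrative encoding only): 'everyone' consumes a property P and builds the universally quantified formula.

```python
# everyone = λP.∀x. person(x) → P(x), applied to sleep.
everyone = lambda P: f"∀x. person(x) → {P('x')}"
sleep = lambda x: f"sleep({x})"

print(everyone(sleep))  # → ∀x. person(x) → sleep(x)
```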
Quantifier Scope Ambiguity
Two derivations
f: [ PRED see
     SUBJ g: everyone
     OBJ  h: someone ]

Premises:
  see: g -o h -o f
  everyone: (g -o X) -o X
  someone: (h -o Y) -o Y

Shared core (hypothetical reasoning with [x: g] and [y: h]):

  see: g -o h -o f    [x: g]
  --------------------------
     see(x): h -o f    [y: h]
     ------------------------
         see(x,y): f

Derivation 1:                     Derivation 2:
  see(x,y): f                       see(x,y): f
  discharge [x: g]: g -o f          discharge [y: h]: h -o f
  apply (g -o X) -o X: f            apply (h -o Y) -o Y: f
  discharge [y: h]: h -o f          discharge [x: g]: g -o f
  apply (h -o Y) -o Y: f            apply (g -o X) -o X: f
Quantifier Scope Ambiguity
Two derivations (with meaning terms)

f: [ PRED see
     SUBJ g: everyone
     OBJ  h: someone ]

Premises:
  see: g -o h -o f
  everyone: (g -o X) -o X
  someone: (h -o Y) -o Y

Shared core:
  see: g -o h -o f    [x: g]
  --------------------------
     see(x): h -o f    [y: h]
     ------------------------
         see(x,y): f

Derivation 1 (someone outscopes everyone):
  see(x,y): f
  λx.see(x,y): g -o f                   everyone: (g -o X) -o X
  everyone(λx.see(x,y)): f
  λy.everyone(λx.see(x,y)): h -o f      someone: (h -o Y) -o Y
  someone(λy.everyone(λx.see(x,y))): f

Derivation 2 (everyone outscopes someone):
  see(x,y): f
  λy.see(x,y): h -o f                   someone: (h -o Y) -o Y
  someone(λy.see(x,y)): f
  λx.someone(λy.see(x,y)): g -o f       everyone: (g -o X) -o X
  everyone(λx.someone(λy.see(x,y))): f
No Additional Scoping Machinery



Scope ambiguities arise simply through application of the two standard
rules of inference for implication

Glue theorem prover automatically finds all possible derivations /
scopings

Very simple and elegant account of scope variation
Type Raising and Intensionality

Intensional verbs (seek, want, dream about)
– Do not take entities as arguments
  * ∃x. unicorn(x) & seek(ed, x)
– But rather quantified NP denotations
  seek(ed, λP.∃x. unicorn(x) & P(x))

Glue lexical entry for seek:
  λxλQ. seek(x,Q):
    (↑ SUBJ) -o                      (subject entity, x)
    (((↑ OBJ) -o N_t) -o N_t) -o     (object quant, Q)
    ↑                                (clause meaning)
Ed seeks a unicorn
f: [ PRED seek
     SUBJ g: Ed
     OBJ  h: a unicorn ]

Premises:
  ed: g
  λP.∃x. unicorn(x) & P(x): (h -o X) -o X
  λxλQ. seek(x,Q): g -o ((h -o Y) -o Y) -o f

Derivation (without meanings):

  g    g -o ((h -o Y) -o Y) -o f
  ------------------------------
  ((h -o Y) -o Y) -o f    (h -o X) -o X
  -------------------------------------
                 f

Derivation (with meanings):

  ed: g    λxλQ.seek(x,Q): g -o ((h -o Y) -o Y) -o f
  --------------------------------------------------
  λQ.seek(ed,Q): ((h -o Y) -o Y) -o f
  λP.∃x. unicorn(x) & P(x): (h -o X) -o X
  ---------------------------------------
  seek(ed, λP.∃x. unicorn(x) & P(x)): f
Ed seeks Santa Claus
f: [ PRED seek
     SUBJ g: Ed
     OBJ  h: Santa ]

Premises:
  ed: g
  santa: h
  λxλQ. seek(x,Q): g -o ((h -o Y) -o Y) -o f

Looks problematic:
– “seek” expects a quantifier from its object
– But we only have a proper name

Traditional solution (Montague):
– Uniformly give all proper names a more complicated, type-raised,
  quantifier-like semantics
  λP.P(santa): (h -o X) -o X

Glue doesn’t force you to do this
– Or rather, it does it for you
Type Raising in Glue
Propositional tautology: h |- (h -o X) -o X

  [h -o X]    h
  -------------
        X
  ---------------
  (h -o X) -o X

With meaning terms:

  [P: h -o X]    santa: h
  -----------------------
       P(santa): X
  ----------------------------
  λP. P(santa): (h -o X) -o X
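The type raising that this derivation performs can be written directly as a λ-term: an entity is lifted to a function over properties, λP. P(santa). An illustrative sketch:

```python
# Lift an entity of type e to a quantifier-like type (e -o t) -o t.
def type_raise(entity):
    return lambda P: P(entity)

raised_santa = type_raise("santa")

# The raised name consumes a property, just as a quantified NP would:
sleep = lambda x: f"sleep({x})"
print(raised_santa(sleep))  # → sleep(santa)
```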
Ed seeks Santa Claus
f: [ PRED seek
     SUBJ g: Ed
     OBJ  h: Santa ]

Premises:
  ed: g
  santa: h
  λxλQ. seek(x,Q): g -o ((h -o Y) -o Y) -o f

Derivation:

  ed: g    λxλQ.seek(x,Q): g -o ((h -o Y) -o Y) -o f
  --------------------------------------------------
  λQ.seek(ed,Q): ((h -o Y) -o Y) -o f

  [P: h -o X]    santa: h
  -----------------------
       P(santa): X
  ----------------------------
  λP. P(santa): (h -o X) -o X

  seek(ed, λP. P(santa)): f

Glue derivations will automatically type raise, when needed
Coordination
Incorrect Treatment
  [ PRED eat          [ PRED drink
    SUBJ [Ed] ]         SUBJ [Ed] ]   (one shared subject)

Premises:
  ed: g
  eat: g -o f1
  drink: g -o f2
  and: f1 -o f2 -o f

Resource deficit: there aren’t enough g’s to go round
Coordination: Correct Treatment
  [ PRED eat          [ PRED drink
    SUBJ [Ed] ]         SUBJ [Ed] ]

Premises:
  ed: g
  eat: g -o f1
  drink: g -o f2
  λP1λP2λx. P1(x) & P2(x): (g -o f1) -o (g -o f2) -o (g -o f)

Derivation:

  λP1λP2λx. P1(x)&P2(x): (g -o f1) -o (g -o f2) -o (g -o f)
  eat: g -o f1
  ------------------------------------------------------------
  λP2λx. eat(x)&P2(x): (g -o f2) -o (g -o f)    drink: g -o f2
  ------------------------------------------------------------
  λx. eat(x)&drink(x): g -o f    ed: g
  ------------------------------------
  eat(ed)&drink(ed): f
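The coordination meaning term above can be sketched with string meanings (an illustrative encoding): 'and' consumes the two consumers of g (the g -o f1 and g -o f2 predicates), returning one predicate that uses the shared subject exactly once.

```python
# λP1λP2λx. P1(x) & P2(x), applied to eat, drink, and ed.
conj = lambda P1: lambda P2: lambda x: f"{P1(x)} & {P2(x)}"
eat = lambda x: f"eat({x})"
drink = lambda x: f"drink({x})"

print(conj(eat)(drink)("ed"))  # → eat(ed) & drink(ed)
```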
Resolving Apparent Resource Deficits

Deficit:
– Multiple consumers for some resource g
– But only one instance of g

Resolution
– Consume the consumers of g, until there is only one

Applies to coordination, and also control
Control: Apparent resource deficit
  [ PRED  want<SUBJ, XCOMP>
    SUBJ  [Ed]
    XCOMP [ PRED sleep<SUBJ>
            SUBJ [Ed] ] ]     (shared subject)

Premises:
  want: e -o s -o w
  sleep: e -o s
  ed: e

Resource deficit: not enough e’s to go round
Resolve in same way as for coordination
Control: Deficit resolved
  [ PRED  want<SUBJ, XCOMP>
    SUBJ  [Ed]
    XCOMP [ PRED sleep<SUBJ>
            SUBJ [Ed] ] ]

Premises:
  want: e -o (e -o s) -o w
  sleep: e -o s
  ed: e

Derivation:

  ed: e    want: e -o (e -o s) -o w
  ---------------------------------
  want(ed): (e -o s) -o w    sleep: e -o s
  ----------------------------------------
  want(ed,sleep): w

Does this commit you to a property analysis of control?
i.e. want takes a property as its second argument
Property and/or Propositional Control
Property Control:
  λxλP. want(x,P): (↑ SUBJ) -o ((↑ SUBJ) -o (↑ XCOMP)) -o ↑

  ed: e    λxλP.want(x,P): e -o (e -o s) -o w
  -------------------------------------------
  λP.want(ed,P): (e -o s) -o w    sleep: e -o s
  ---------------------------------------------
  want(ed,sleep): w

Propositional Control:
  λxλP. want(x, P(x)): (↑ SUBJ) -o ((↑ SUBJ) -o (↑ XCOMP)) -o ↑

  ed: e    λxλP.want(x,P(x)): e -o (e -o s) -o w
  ----------------------------------------------
  λP.want(ed,P(ed)): (e -o s) -o w    sleep: e -o s
  -------------------------------------------------
  want(ed,sleep(ed)): w
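The two lexical entries can be contrasted with string meanings (an illustrative encoding; the λy rendering of the unapplied property is an assumption for display): property control embeds the property itself, propositional control applies it to the controller first.

```python
sleep = lambda x: f"sleep({x})"

# Property control: λxλP. want(x, P) — embed the unapplied property.
want_property = lambda x: lambda P: f"want({x}, λy.{P('y')})"

# Propositional control: λxλP. want(x, P(x)) — apply P to the controller.
want_propositional = lambda x: lambda P: f"want({x}, {P(x)})"

print(want_property("ed")(sleep))       # → want(ed, λy.sleep(y))
print(want_propositional("ed")(sleep))  # → want(ed, sleep(ed))
```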
Lexical Variation in Control

Glue does not commit you to either a
propositional or a property-based analysis of
controlled XCOMPs (Asudeh)

The type of analysis can be lexically specified
– Some verbs get property control
– Some verbs get propositional control
Why Glue Makes Computational Sense

The backbone of glue is the construction of
propositional linear logic derivations
– This can be done efficiently

Combinations of lexical meanings determined solely
by this propositional backbone
– Algorithms can factor out idiosyncrasies of meaning expressions

Search for propositional backbone can further factor
out skeleton (α) from modifier (α –o α) contributions,
leading to efficient free choice packing of scope
ambiguities
– Work still in progress