Natural Language Processing Lecture 2: Semantics

Last Lecture
 Motivation
 Paradigms for studying language
 Levels of NL analysis
 Syntax
 – Parsing
   Top-down
   Bottom-up
   Chart parsing
Today’s Lecture
 DCGs and parsing in Prolog
 Semantics
 – Logical representation schemes
 – Procedural representation schemes
 – Network representation schemes
 – Structured representation schemes
Parsing in PROLOG
 How do you represent a grammar in PROLOG?
Writing a CFG in PROLOG
 Consider the rule S -> NP VP
 We can reformulate this as an axiom:
 – A sequence of words is a legal S if it begins with a legal NP that is followed by a legal VP
 What about s(P1, P3) :- np(P1, P2), vp(P2, P3)?
 – There is an S between positions P1 and P3 if there is a position P2 such that there is an NP between P1 and P2 and a VP between P2 and P3
Inputs
 John ate the cat can be described
 – word(john, 1, 2)
 – word(ate, 2, 3)
 – word(the, 3, 4)
 – word(cat, 4, 5)
 Or (better) use a list representation:
 – [john, ate, the, cat]
Lexicon
 First representation:
 – isname(john), isverb(ate)
 – v(P1, P2) :- word(Word, P1, P2), isverb(Word)
 List representation:
 – name([john|T], T).
A simple PROLOG grammar
s(P1, P3):-np(P1, P2), vp(P2, P3).
np(P1, P3):-art(P1, P2), n(P2, P3).
np(P1, P3):-name(P1, P3).
pp(P1, P3):-p(P1, P2), np(P2, P3).
vp(P1, P2):-v(P1, P2).
vp(P1, P3):-v(P1, P2), np(P2, P3).
vp(P1, P3):-v(P1, P2), pp(P2, P3).
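Using the list representation from the previous slide, this grammar runs directly as a Prolog program. A minimal sketch (the lexicon entries are assumed for the example sentence, in the same difference-list style as name([john|T], T); the pp rules are omitted since this lexicon has no preposition):

```prolog
% Difference-list lexicon (assumed entries for the example sentence).
name([john|T], T).
v([ate|T], T).
art([the|T], T).
n([cat|T], T).

% The grammar rules from the slide.
s(P1, P3)  :- np(P1, P2), vp(P2, P3).
np(P1, P3) :- art(P1, P2), n(P2, P3).
np(P1, P3) :- name(P1, P3).
vp(P1, P2) :- v(P1, P2).
vp(P1, P3) :- v(P1, P2), np(P2, P3).
```

The query ?- s([john, ate, the, cat], []). succeeds: the sentence is a legal S exactly when the whole list is consumed, leaving the empty list.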
Definite clause grammars
 PROLOG provides an operator (-->) that supports DCGs
 Rules look like CFG notation
 PROLOG automatically translates these into ordinary clauses
DCGs and Prolog
Grammar (with explicit position arguments):
s(P1, P3):-np(P1, P2), vp(P2, P3).
np(P1, P3):-art(P1, P2), n(P2, P3).
np(P1, P3):-name(P1, P3).
pp(P1, P3):-p(P1, P2), np(P2, P3).
vp(P1, P2):-v(P1, P2).
vp(P1, P3):-v(P1, P2), np(P2, P3).
vp(P1, P3):-v(P1, P2), pp(P2, P3).
Equivalent DCG:
s --> np, vp.
np --> art, n.
np --> name.
pp --> p, np.
vp --> v.
vp --> v, np.
vp --> v, pp.
Lexicon
name([john|P], P).
v([ate|P],P).
art([the|P],P).
n([cat|P],P).
Lexicon in DCG notation
name --> [john].
v --> [ate].
art --> [the].
n --> [cat].
Building a tree with DCGs
 We can add extra arguments to DCGs to represent a tree:
 – s --> np, vp. becomes
 – s(s(NP, VP)) --> np(NP), vp(VP).
An ambiguous DCG
s(s(NP, VP)) --> np(NP), vp(VP).
np(np(ART, N)) --> art(ART), n(N).
np(np(NAME)) --> name(NAME).
pp(pp(P,NP)) --> p(P), np(NP).
vp(vp(V)) --> v(V).
vp(vp(V,NP)) --> v(V), np(NP).
vp(vp(V,PP)) --> v(V), pp(PP).
vp(vp(V,NP,PP)) --> v(V), np(NP), pp(PP).
np(np(ART, N, PP)) --> art(ART), n(N), pp(PP).
%Lexicon
art(art(the)) --> [the].
n(n(man)) --> [man].
n(n(boy)) --> [boy].
n(n(telescope)) --> [telescope].
v(v(saw)) --> [saw].
p(p(with)) --> [with].
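With the grammar above loaded, phrase/2 exposes the PP-attachment ambiguity; backtracking returns one tree per reading:

```prolog
?- phrase(s(Tree), [the, man, saw, the, boy, with, the, telescope]).
% One solution attaches the PP inside the NP (the boy who has the telescope):
%   s(np(art(the), n(man)),
%     vp(v(saw), np(art(the), n(boy), pp(p(with), np(art(the), n(telescope))))))
% Another attaches it to the VP (the seeing was done with the telescope):
%   s(np(art(the), n(man)),
%     vp(v(saw), np(art(the), n(boy)), pp(p(with), np(art(the), n(telescope)))))
```

The order in which the solutions appear follows the clause order of the vp and np rules.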
Semantics
 What does it mean?
Semantic ambiguity
 A sentence may have a single syntactic structure, but multiple semantic structures
 – Every boy loves a dog
 Vagueness – some senses are more specific than others
 – “Person” is more vague than “woman”
 – Quantifiers: Many people saw the accident
Logical forms
 Most common is first-order predicate calculus (FOPC)
 PROLOG is an ideal implementation language
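For example, the two readings of the earlier sentence "Every boy loves a dog" can be written as Prolog terms. The functors all/2, exists/2, implies/2, and and/2 are one common convention for FOPC in Prolog, not a fixed standard:

```prolog
% Reading 1: every boy loves some (possibly different) dog.
all(b, implies(boy(b), exists(d, and(dog(d), loves(b, d))))).

% Reading 2: there is one particular dog that every boy loves.
exists(d, and(dog(d), all(b, implies(boy(b), loves(b, d))))).
```

Lowercase atoms are used for the bound variables so the formulas are ground Prolog terms rather than open Prolog variables.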

Thematic roles

Consider the following sentences:
– John broke the window with the hammer
– The hammer broke the window
– The window broke

The syntactic structure is different, but
John, the hammer, and the window
have the same semantic roles in each
sentence
Themes/Cases
 We can define a notion of theme or case
 – John broke the window with the hammer
 – The hammer broke the window
 – The window broke
 John is the AGENT
 The window is the THEME (the syntactic OBJECT -- what was X-ed)
 The hammer is the INSTR(ument)
Case Frames
 Sarah fixed the chair with glue:
 – Predicate: fix
 – TIME: past
 – AGENT: Sarah
 – THEME: chair
 – INSTR: glue
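The same case frame can be encoded as a Prolog term. This particular encoding (an event/2 fact plus a role-lookup rule) is an illustrative choice, not a standard:

```prolog
% "Sarah fixed the chair with glue" as a case-frame fact.
event(fix, [time(past), agent(sarah), theme(chair), instr(glue)]).

% Look up the filler of a role by decomposing each slot term.
role(Role, Filler) :-
    event(_, Slots),
    member(Slot, Slots),
    Slot =.. [Role, Filler].
```

The query ?- role(agent, Who). binds Who = sarah.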
Network Representations

Examples:
– Semantic networks
– Conceptual dependencies
– Conceptual graphs
Semantic networks
 General term encompassing graph representations for semantics
 Good for capturing notions of inheritance
 – Think of OOP
Part of a type hierarchy
 [Diagram: a tree rooted at ALL, with nodes including PHYSOBJ, SITUATION, EVENT, ANIMATE, NON-ANIMATE, NON-LIVING, VEGETABLE, DOG, and PERSON]
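Inheritance over such a hierarchy is easy to express with isa/2 facts and a transitive rule. The specific edges below are an assumption about the diagram's exact shape:

```prolog
% Assumed edges from the type hierarchy diagram.
isa(dog, animate).
isa(person, animate).
isa(animate, physobj).
isa(physobj, all).
isa(event, situation).
isa(situation, all).

% X inherits from Y if there is an isa-chain from X to Y.
inherits(X, Y) :- isa(X, Y).
inherits(X, Y) :- isa(X, Z), inherits(Z, Y).
```

The query ?- inherits(dog, all). succeeds by chaining dog -> animate -> physobj -> all.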
Strengths of semantic networks
 Ease the development of lexicons through inheritance
 – Reasonable sized grammars can incorporate hundreds of features
 Provide a richer set of semantic relationships between word senses to support disambiguation
Conceptual dependencies
 Influential in early semantic representations
 Base representation on a small set of primitives
Primitives for conceptual dependency

Transfer
– ATRANS - abstract transfer (as in transfer of
ownership)
– PTRANS - physical transfer
– MTRANS - mental transfer (as in speaking)

Bodily activity
– PROPEL (applying force), MOVE (a body part),
GRASP, INGEST, EXPEL

Mental action
– CONC (conceptualize or think)
– MBUILD (perform inference)
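A classic illustration is representing "John gave Mary a book" with ATRANS and "John told Mary a story" with MTRANS. The slot names below are illustrative, not part of a fixed standard:

```prolog
% "John gave Mary a book": ownership is abstractly transferred.
cd(atrans, [actor(john), object(book), from(john), to(mary)]).

% "John told Mary a story": information is mentally transferred.
cd(mtrans, [actor(john), object(story), from(john), to(mary)]).
```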
Problems with conceptual dependency
 Very ambitious project
 – Tries to reduce all semantics to a single canonical form that is syntactically identical for all sentences with the same meaning
 Primitives turn out to be inadequate for inference
 – Must create larger structures out of primitives, compute on those structures
Structured representation schemes
 Frames
 Scripts
Frames
 Much of the inference required for NLU involves making assumptions about what is typically true about a situation
 Encode this stereotypical information in a frame
 Looks like themes, but on a higher level of abstraction
Frames
For an (old) PC:
Class PC(p):
Roles: Keyb, Disk1, MainBox
Constraints: Keyboard(Keyb) & PART_OF(Keyb, p) &
CONNECTED_TO(Keyb,KeyboardPlug(MainBox)) &
DiskDrive(Disk1) & PART_OF(Disk1, p) &
CONNECTED_TO(Disk1, DiskPort(MainBox)) &
CPU(MainBox) & PART_OF(MainBox, p)
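In Prolog, the frame's constraints translate almost directly into a rule. This is a sketch assuming the slide's predicates, lowercased, with facts about a particular machine supplied elsewhere:

```prolog
% P is a PC if it has a keyboard and a disk drive, both part of P
% and connected to the right ports of a CPU box that is also part of P.
pc(P) :-
    keyboard(K),   part_of(K, P), connected_to(K, keyboard_plug(M)),
    disk_drive(D), part_of(D, P), connected_to(D, disk_port(M)),
    cpu(M),        part_of(M, P).
```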

Scripts
 A means of identifying common situations in a particular domain
 A means of generating expectations
 – We precompile information, rather than recomputing from first principles
Scripts

Travel by plane:
– Roles: Actor, Clerk, Source, Dest, Airport, Ticket,
Money, Airplane
– Constraints: Person(Actor), Value(Money,
Price(Ticket)), . . .
– Preconditions: Owns(Actor, Money), At(Actor,
Source)
– Effects: not(Owns(Actor, Money)), not(At(Actor,
Source)), At(Actor, Dest)
– Decomposition:
   GoTo(Actor, Airport)
   BuyTicket(Actor, Clerk, Money, Ticket), . . .
Issues with Scripts

Script selection
– How do we decide which script is relevant?

Where are we in the script?
NLP -- Where are we?
 We’re five years away (??)
 Call 1-888-NUANCE9 (banking/airline ticket demo)
 1-888-LSD-TALK (weather information)
 Google
 Ask Jeeves
 Office Assistant