Natural Language Processing
Lecture 2: Semantics

Last Lecture
  Motivation
  Paradigms for studying language
  Levels of NL analysis
  Syntax – Parsing
    – Top-down
    – Bottom-up
    – Chart parsing

Today's Lecture
  DCGs and parsing in Prolog
  Semantics
    – Logical representation schemes
    – Procedural representation schemes
    – Network representation schemes
    – Structured representation schemes

Parsing in PROLOG
  How do you represent a grammar in PROLOG?

Writing a CFG in PROLOG
  Consider the rule S -> NP VP
  We can reformulate this as an axiom:
    – A sequence of words is a legal S if it begins with a legal NP that is followed by a legal VP
  What about s(P1, P3) :- np(P1, P2), vp(P2, P3)?
    – There is an S between positions P1 and P3 if there is a position P2 such that there is an NP between P1 and P2 and a VP between P2 and P3

Inputs
  "John ate the cat" can be described by:
    – word(john, 1, 2)
    – word(ate, 2, 3)
    – word(the, 3, 4)
    – word(cat, 4, 5)
  Or (better) use a list representation:
    – [john, ate, the, cat]

Lexicon
  First (positional) representation:
    – isname(john), isverb(ate)
    – v(P1, P2) :- word(Word, P1, P2), isverb(Word)
  List representation:
    – name([john|T], T).

A simple PROLOG grammar
  s(P1, P3) :- np(P1, P2), vp(P2, P3).
  np(P1, P3) :- art(P1, P2), n(P2, P3).
  np(P1, P3) :- name(P1, P3).
  pp(P1, P3) :- p(P1, P2), np(P2, P3).
  vp(P1, P2) :- v(P1, P2).
  vp(P1, P3) :- v(P1, P2), np(P2, P3).
  vp(P1, P3) :- v(P1, P2), pp(P2, P3).

Definite clause grammars (DCGs)
  PROLOG provides an operator that supports DCGs
  Rules look like CFG notation
  PROLOG automatically translates these into ordinary clauses

DCGs and Prolog grammars
  Prolog clauses:
    s(P1, P3) :- np(P1, P2), vp(P2, P3).
    np(P1, P3) :- art(P1, P2), n(P2, P3).
    np(P1, P3) :- name(P1, P3).
    pp(P1, P3) :- p(P1, P2), np(P2, P3).
    vp(P1, P2) :- v(P1, P2).
    vp(P1, P3) :- v(P1, P2), np(P2, P3).
    vp(P1, P3) :- v(P1, P2), pp(P2, P3).
  Equivalent DCG rules:
    s --> np, vp.
    np --> art, n.
    np --> name.
    pp --> p, np.
    vp --> v.
    vp --> v, np.
    vp --> v, pp.

Lexicon
  Difference-list form:
    name([john|P], P).
    v([ate|P], P).
    art([the|P], P).
    n([cat|P], P).
  DCG form:
    name --> [john].
    v --> [ate].
    art --> [the].
    n --> [cat].

Building a tree with DCGs
  We can add extra arguments to DCGs to represent a tree:
    – s --> np, vp.
  becomes
    – s(s(NP, VP)) --> np(NP), vp(VP).

An ambiguous DCG
  s(s(NP, VP)) --> np(NP), vp(VP).
  np(np(ART, N)) --> art(ART), n(N).
  np(np(NAME)) --> name(NAME).
  pp(pp(P, NP)) --> p(P), np(NP).
  vp(vp(V)) --> v(V).
  vp(vp(V, NP)) --> v(V), np(NP).
  vp(vp(V, PP)) --> v(V), pp(PP).
  vp(vp(V, NP, PP)) --> v(V), np(NP), pp(PP).
  np(np(ART, N, PP)) --> art(ART), n(N), pp(PP).
  % Lexicon
  art(art(the)) --> [the].
  n(n(man)) --> [man].
  n(n(boy)) --> [boy].
  n(n(telescope)) --> [telescope].
  v(v(saw)) --> [saw].
  p(p(with)) --> [with].
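Before moving on to semantics, here is a short usage sketch (the query, the expected answers, and the added john entry are illustrations, not part of the original slides) showing the ambiguous DCG in action. Note that the rule np(np(NAME)) --> name(NAME) refers to a name//1 non-terminal that the lexicon above never defines with a tree argument, so a dummy entry (or deleting that rule) is needed to avoid an undefined-procedure error on backtracking in systems such as SWI-Prolog. With that in place, querying the s/3 predicate that Prolog generates from the s rule yields exactly two parse trees, one for each attachment of "with the telescope":

  % Added lexical entry (assumption) so the np --> name rule is defined:
  name(name(john)) --> [john].

  ?- s(Tree, [the, man, saw, the, boy, with, the, telescope], []).

  % Solution 1: the PP attaches inside the object NP ("the boy with the telescope")
  Tree = s(np(art(the), n(man)),
           vp(v(saw),
              np(art(the), n(boy),
                 pp(p(with), np(art(the), n(telescope)))))) ;

  % Solution 2, on backtracking: the PP attaches to the VP ("saw ... with the telescope")
  Tree = s(np(art(the), n(man)),
           vp(v(saw),
              np(art(the), n(boy)),
              pp(p(with), np(art(the), n(telescope)))))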
Semantics
  What does it mean?

Semantic ambiguity
  A sentence may have a single syntactic structure but multiple semantic structures
    – Every boy loves a dog
  Vagueness – some senses are more specific than others
    – "Person" is more vague than "woman"
    – Quantifiers: Many people saw the accident

Logical forms
  Most common is first-order predicate calculus (FOPC)
  PROLOG is an ideal implementation language

Thematic roles
  Consider the following sentences:
    – John broke the window with the hammer
    – The hammer broke the window
    – The window broke
  The syntactic structure is different, but John, the hammer, and the window have the same semantic roles in each sentence

Themes/Cases
  We can define a notion of theme or case
    – John broke the window with the hammer
    – The hammer broke the window
    – The window broke
  John is the AGENT
  The window is the THEME (the syntactic OBJECT – what was X-ed)
  The hammer is the INSTR(ument)

Case Frames
  Sarah fixed the chair with glue
  [Case frame diagram: predicate fix; AGENT Sarah; THEME chair; INSTR glue; TIME past]
  (A Prolog sketch of case frames appears below, after the Scripts slides.)

Network Representations
  Examples:
    – Semantic networks
    – Conceptual dependencies
    – Conceptual graphs

Semantic networks
  General term encompassing graph representations for semantics
  Good for capturing notions of inheritance
  Think of OOP

Part of a type hierarchy
  [Type hierarchy diagram: nodes include ALL, PHYSOBJ, SITUATION, EVENT, ANIMATE, NON-ANIMATE, NON-LIVING, VEGETABLE, DOG, PERSON]

Strengths of semantic networks
  Ease the development of lexicons through inheritance
    – Reasonably sized grammars can incorporate hundreds of features
  Provide a richer set of semantic relationships between word senses to support disambiguation

Conceptual dependencies
  Influential in early semantic representations
  Base the representation on a small set of primitives

Primitives for conceptual dependency
  Transfer
    – ATRANS – abstract transfer (as in transfer of ownership)
    – PTRANS – physical transfer
    – MTRANS – mental transfer (as in speaking)
  Bodily activity
    – PROPEL (applying force), MOVE (a body part), GRASP, INGEST, EXPEL
  Mental action
    – CONC (conceptualize or think)
    – MBUILD (perform inference)

Problems with conceptual dependency
  Very ambitious project
    – Tries to reduce all semantics to a single canonical form that is syntactically identical for all sentences with the same meaning
  Primitives turn out to be inadequate for inference
    – Must create larger structures out of primitives and compute on those structures

Structured representation schemes
  Frames
  Scripts

Frames
  Much of the inference required for NLU involves making assumptions about what is typically true of a situation
  Encode this stereotypical information in a frame
  Looks like themes, but at a higher level of abstraction

Frames
  For an (old) PC:
    Class PC(p):
      Roles: Keyb, Disk1, MainBox
      Constraints:
        Keyboard(Keyb) & PART_OF(Keyb, p) &
        CONNECTED_TO(Keyb, KeyboardPlug(MainBox)) &
        DiskDrive(Disk1) & PART_OF(Disk1, p) &
        CONNECTED_TO(Disk1, DiskPort(MainBox)) &
        CPU(MainBox) & PART_OF(MainBox, p)

Scripts
  A means of identifying common situations in a particular domain
  A means of generating expectations
    – We precompile information rather than recomputing from first principles

Scripts
  Travel by plane:
    – Roles: Actor, Clerk, Source, Dest, Airport, Ticket, Money, Airplane
    – Constraints: Person(Actor), Value(Money, Price(Ticket)), . . .
    – Preconditions: Owns(Actor, Money), At(Actor, Source)
    – Effects: not(Owns(Actor, Money)), not(At(Actor, Source)), At(Actor, Dest)
    – Decomposition: GoTo(Actor, Airport), BuyTicket(Actor, Clerk, Money, Ticket), . . .

Issues with Scripts
  Script selection
    – How do we decide which script is relevant?
  Where are we in the script?
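As an illustrative sketch tying the case-frame idea above to the claim that PROLOG is a natural implementation language (the predicate names event/2, agent/2, theme/2, instr/2 and time/2 are assumptions made for this example, not part of the lecture), case frames can be stored as simple facts, one constant per event, which makes the shared roles of the three "broke" sentences directly queryable:

  :- discontiguous event/2, agent/2, theme/2, instr/2, time/2.

  % Case frame for "Sarah fixed the chair with glue"
  event(e1, fix).    time(e1, past).
  agent(e1, sarah).  theme(e1, chair).  instr(e1, glue).

  % "John broke the window with the hammer"
  event(e2, break).  agent(e2, john).  theme(e2, window).  instr(e2, hammer).

  % "The hammer broke the window"
  event(e3, break).  theme(e3, window).  instr(e3, hammer).

  % "The window broke"
  event(e4, break).  theme(e4, window).

  % The window is the THEME in all three "broke" sentences, whatever its syntactic position:
  % ?- event(E, break), theme(E, window).
  % E = e2 ;  E = e3 ;  E = e4.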
NLP – Where are we?
  We're five years away (??)
  Call 1-888-NUANCE9 (banking/airline ticket demo)
  1-888-LSD-TALK (weather information)
  Google
  Ask Jeeves
  Office Assistant