Representing Meaning
Lecture 18, 12 Sep 2007

Transition
First we did words (morphology). Then simple sequences of words. Then we looked at true syntax. Now we're moving on to meaning, where some would say we should have started to begin with.

Meaning
Language is useful and amazing because it allows us to encode and decode:
- Descriptions of the world
- What we're thinking
- What we think about what other people think
Don't be fooled by how natural and easy it is. In particular, you never really:
- Utter word strings that match the world
- Say what you're thinking
- Say what you think about what other people think
You're simply uttering linear sequences of words such that when other people read or hear and understand them, they come to know what you think of the world.

Meaning Representations
We're going to take the same basic approach to meaning that we took to syntax and morphology: we're going to create representations of linguistic inputs that capture the meanings of those inputs. But unlike parse trees and the like, these representations aren't primarily descriptions of the structure of the inputs. In most cases, meaning representations are simultaneously descriptions of the meanings of utterances and of some potential state of affairs in some world.

Introduction
Meaning representation languages capture the meaning of linguistic utterances in formal notation so that semantic processing becomes possible.
Example: deciding what to order at a restaurant by reading a menu, or giving advice about where to go for dinner. Requires knowledge about food, its preparation, what people like to eat, and what restaurants are like.
Example: answering a question on an exam. Requires background knowledge about the topic of the question.
Example: learning to use software by reading a manual. Requires knowledge about current computers, the specific software, similar software applications, and users in general.

Semantic Analysis
Semantic analysis: mapping between language and real life. "I have a car" can be represented in several equivalent styles:
1. First-order logic: ∃x,y Having(x) ∧ Haver(Speaker,x) ∧ HadThing(y,x) ∧ Car(y)
2. Semantic network: a Having node linked by a Haver arc to Speaker and a Had-Thing arc to Car (diagram omitted)
3. Conceptual dependency diagram: Car POSS-BY Speaker (diagram omitted)
4. Frame-based representation: a Having frame with Haver: Speaker and HadThing: Car
A meaning representation consists of structures composed from a set of symbols, or representational vocabulary.

Why are meaning representations needed? What should they do for us?
Example: giving advice about restaurants to tourists. A computer system accepts spoken-language queries from tourists and constructs appropriate responses using a knowledge base of relevant domain knowledge. We need representations that:
- Permit us to reason about their truth (their relationship to some world)
- Permit us to answer questions based on their content
- Permit us to perform inference (answer questions and determine the truth of things we don't actually know)

Semantic Processing
The touchstone application is often question answering. Can a machine answer questions involving the meaning of some text or discourse? What kind of representations do we need to mechanize that process?
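As a minimal illustrative sketch of the "I have a car" slide above, here are the frame-based and first-order-logic versions written down as plain Python data so the parallel between the two styles is easy to see. The predicate and slot names (Having, Haver, HadThing) follow the slide; the Python encoding itself is just one possible choice.

```python
# Sketch only: two of the slide's representations of "I have a car".

# Frame-based representation: a Having frame with its slots filled.
having_frame = {
    "type": "Having",
    "Haver": "Speaker",
    "HadThing": "Car",
}

# First-order-logic representation, kept as a structured term rather than a string:
# exists x,y. Having(x) ^ Haver(Speaker,x) ^ HadThing(y,x) ^ Car(y)
fol = ("exists", ["x", "y"],
       [("Having", "x"),
        ("Haver", "Speaker", "x"),
        ("HadThing", "y", "x"),
        ("Car", "y")])

print(having_frame["Haver"])   # Speaker
print(fol[2][1])               # ('Haver', 'Speaker', 'x')
```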
Verifiability
Verifiability: the ability to compare the state of affairs described by a representation to the state of affairs in some world modeled in a knowledge base.
Example: Does Anarkali serve vegetarian food?
Sample entry in the knowledge base (KB): Serves(Anarkali,VegetarianFood)
Convert the question to logical form and verify its truth value against the knowledge base.

Unambiguousness
Example: I want to eat someplace near Chowringhee. (multiple interpretations)
Interpretation is important; some interpretations are preferred. Regardless of ambiguity in the input, it is critical that a meaning representation language support representations that have a single unambiguous interpretation.

Vagueness
Vagueness: I want to eat Italian food. (what particular food?) A meaning representation language must support some vagueness.

Canonical form
Inputs that have the same meaning should have the same meaning representation. Distinct sentences with the same meaning:
- Does Anarkali have vegetarian dishes?
- Do they have vegetarian food at Anarkali?
- Are vegetarian dishes served at Anarkali?
- Does Anarkali serve vegetarian fare?
Words have different senses, and multiple words may have the same sense: having vs. serving; food vs. fare vs. dishes (each is ambiguous, but one sense of each matches the others). Alternative syntactic analyses have related meanings (e.g., active vs. passive).

Inference and variables; expressiveness
Inference and variables: Can vegetarians eat at Anarkali? I'd like to find a restaurant that serves vegetarian food. Serves(x,VegetarianFood)
Inference: the system's ability to draw valid conclusions based on the meaning representations of inputs and its store of background knowledge.
Expressiveness: the system must be able to handle a wide range of subject matter.

Semantic Processing
We're going to discuss two ways to attack this problem (just as we did with parsing):
- The theoretically motivated, correct and complete approach: computational/compositional semantics
- Practical approaches that have some hope of being useful and successful: information extraction

Meaning Structure of Language
The various methods by which human languages convey meaning:
- Form-meaning associations
- Word-order regularities
- Tense systems
- Conjunctions
- Quantifiers
- A fundamental predicate-argument structure
The predicate-argument structure asserts that specific relationships and dependencies hold among the concepts underlying the constituent words and phrases. This underlying structure permits the creation of a single composite meaning representation from the meanings of the various parts.

Predicate-argument structure
Sentences and their syntactic argument frames:
- I want Italian food.                       NP want NP
- I want to spend less than five dollars.    NP want Inf-VP
- I want it to be close by here.             NP want NP Inf-VP
The syntactic frames specify the number, position and syntactic category of the arguments that are expected to accompany a verb.
Thematic roles: e.g., the entity doing the wanting vs. the entity that is wanted (linking surface arguments with semantic/case roles).
Syntactic selection restrictions (a violation): *I found to fly to Dallas.
Semantic selection restrictions (a violation): *The risotto wanted to spend less than ten dollars.
Make a reservation for this evening for a table for two persons at eight: Reservation(Hearer,Today,8PM,2)
Any useful meaning representation language must be organized in a way that supports:
- Variable-arity predicate-argument structures
- The semantic labeling of arguments to predicates
- The statement of semantic constraints on the fillers of argument roles
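To make verifiability, canonical form, and variable-based inference concrete, here is a minimal sketch of a toy knowledge base queried both with ground facts and with a single variable. The facts follow the Anarkali examples above; the tuple encoding and the "?x" variable convention are invented for illustration.

```python
# Toy KB sketch: facts are tuples, variables are strings starting with "?".
KB = {
    ("Serves", "Anarkali", "VegetarianFood"),
    ("Restaurant", "Anarkali"),
}

def ask(query):
    """Verify a ground query (True/False), or find bindings for a query with variables."""
    pred, *args = query
    if not any(a.startswith("?") for a in args):
        return query in KB                      # verifiability: compare against the KB
    matches = []
    for fact in KB:
        if fact[0] == pred and len(fact) == len(query):
            binding, ok = {}, True
            for q, f in zip(args, fact[1:]):
                if q.startswith("?"):
                    binding[q] = f              # bind the variable to the KB constant
                elif q != f:
                    ok = False
                    break
            if ok:
                matches.append(binding)
    return matches

# All four canonical-form paraphrases above map to the same ground query:
print(ask(("Serves", "Anarkali", "VegetarianFood")))   # True
# "A restaurant that serves vegetarian food": Serves(x, VegetarianFood)
print(ask(("Serves", "?x", "VegetarianFood")))          # [{'?x': 'Anarkali'}]
```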
Model-theoretic semantics
Basic notions shared by representation schemes: the ability to represent objects, properties of objects, and relations among objects.
A model is a formal construct that stands for the particular state of affairs in the world that we are trying to represent. Expressions in a meaning representation language are mapped in a systematic way to the elements of the model.
Vocabulary of a meaning representation language:
- Non-logical vocabulary: an open-ended set of names for the objects, properties and relations (may appear as predicates, nodes, labels on links, labels in slots in frames, etc.)
- Logical vocabulary: a closed set of symbols, operators, quantifiers, links, etc. that provide the formal means for composing expressions
Each element of the non-logical vocabulary must have a denotation in the model.
Domain of a model: the set of objects that are part of the application. Properties of objects are captured by sets (of the domain elements having the property). Relations denote sets of tuples of elements of the domain.
Interpretation: a mapping from the non-logical vocabulary of our meaning representation to the corresponding denotations in the model.

Representational Schemes
We're going to make use of First Order Predicate Calculus (FOPC) as our representational framework. Not because we think it's perfect: all the alternatives turn out to be either too limiting, or they turn out to be notational variants.

FOPC
FOPC allows for:
- The analysis of truth conditions: it allows us to answer yes/no questions
- The use of variables: it allows us to answer questions through variable binding
- Inference: it allows us to answer questions that go beyond what we know explicitly
This choice isn't completely arbitrary or driven by the needs of practical applications. FOPC reflects the semantics of natural languages because it was designed that way by human beings. In particular…

First-order predicate calculus (FOPC)
Formula → AtomicFormula | Formula Connective Formula | Quantifier Variable … Formula | ¬ Formula | (Formula)
AtomicFormula → Predicate(Term, …)
Term → Function(Term, …) | Constant | Variable
Connective → ∧ | ∨ | ⇒
Quantifier → ∀ | ∃
Constant → A | VegetarianFood | Anarkali | …
Variable → x | y | …
Predicate → Serves | Near | …
Function → LocationOf | CuisineOf | …

Example
I only have five dollars and I don't have a lot of time.
Have(Speaker,FiveDollars) ∧ ¬Have(Speaker,LotOfTime)
With variables: Have(x,FiveDollars) ∧ ¬Have(x,LotOfTime)
Note: the grammar is recursive.

Semantics of FOPC
FOPC sentences can be assigned a value of true or false.
Anarkali is near RC: Near(LocationOf(Anarkali), LocationOf(RC))

Inference
Modus ponens: from α and α ⇒ β, conclude β.
Example:
VegetarianRestaurant(Joe's)
∀x VegetarianRestaurant(x) ⇒ Serves(x,VegetarianFood)
Therefore: Serves(Joe's,VegetarianFood)

Uses of modus ponens
- Forward chaining: as individual facts are added to the database, all derived inferences are generated.
- Backward chaining: starts from queries. Example: the Prolog programming language.
  father(X, Y) :- parent(X, Y), male(X).
  parent(john, bill).
  parent(jane, bill).
  female(jane).
  male(john).
  ?- father(M, bill).
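As a counterpart to the Prolog backward-chaining example, here is a minimal forward-chaining sketch: each rule is a single-antecedent implication, and new facts are derived by modus ponens until nothing changes. The rule shown is the vegetarian-restaurant rule above; the tuple encoding and "?x" variable convention are invented for illustration.

```python
# Forward chaining by repeated modus ponens over single-antecedent rules.
facts = {("VegetarianRestaurant", "Joes")}

# forall x. VegetarianRestaurant(x) => Serves(x, VegetarianFood)
rules = [
    (("VegetarianRestaurant", "?x"), ("Serves", "?x", "VegetarianFood")),
]

def match(antecedent, fact):
    """Return a variable binding if the fact matches the antecedent, else None."""
    if antecedent[0] != fact[0] or len(antecedent) != len(fact):
        return None
    binding = {}
    for a, f in zip(antecedent[1:], fact[1:]):
        if a.startswith("?"):
            binding[a] = f
        elif a != f:
            return None
    return binding

changed = True
while changed:
    changed = False
    for antecedent, consequent in rules:
        for fact in list(facts):
            binding = match(antecedent, fact)
            if binding is not None:
                derived = tuple(binding.get(t, t) for t in consequent)
                if derived not in facts:       # modus ponens adds the consequent
                    facts.add(derived)
                    changed = True

print(sorted(facts))
# [('Serves', 'Joes', 'VegetarianFood'), ('VegetarianRestaurant', 'Joes')]
```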
Variables and quantifiers
A restaurant that serves Mexican food near UM:
∃x Restaurant(x) ∧ Serves(x,MexicanFood) ∧ Near(LocationOf(x),LocationOf(UM))
All vegetarian restaurants serve vegetarian food:
∀x VegetarianRestaurant(x) ⇒ Serves(x,VegetarianFood)
If this sentence is true, it is also true for any substitution of x. However, if the antecedent is false, the sentence is always (vacuously) true.

Meaning Structure of Language
The semantics of human languages:
- Display a basic predicate-argument structure
- Make use of variables
- Make use of quantifiers
- Use a partially compositional semantics

Predicate-Argument Structure
Events, actions and relationships can be captured with representations that consist of predicates and arguments to those predicates. Languages display a division of labor where some words and constituents function as predicates and some as arguments.
Predicates: primarily verbs, VPs, PPs, sentences; sometimes nouns and NPs.
Arguments: primarily nouns, nominals, NPs, PPs; but also everything else (as we'll see, it depends on the context).

Example
Mary gave a list to John.
Giving(Mary, John, List)
More precisely, "gave" conveys a three-argument predicate:
- The first arg is the subject
- The second is the recipient, which is conveyed by the NP in the PP
- The third argument is the thing given, conveyed by the direct object

Not exactly
The statement "the first arg is the subject" can't be right: subjects can't be givers. We mean that the meaning underlying the subject phrase plays the role of the giver.

Better
It turns out this representation isn't quite as useful as it could be:
Giving(Mary, John, List)
Better would be:
∃x,y Giving(x) ∧ Giver(Mary,x) ∧ Given(y,x) ∧ Givee(John,x) ∧ Isa(y,List)

Predicates
The notion of a predicate just got more complicated. In this example, think of the verb/VP as providing a template like the following:
∃w,x,y,z Giving(x) ∧ Giver(w,x) ∧ Given(y,x) ∧ Givee(z,x)
The semantics of the NPs and the PPs in the sentence plug into the slots provided in the template.

Compositional Semantics
Compositional semantics: syntax-driven methods of assigning semantics to sentences.

Semantic Analysis
Semantic analysis is the process of taking in some linguistic input and assigning a meaning representation to it. There are a lot of different ways to do this that make more or less (or no) use of syntax. We're going to start with the idea that syntax does matter: the compositional rule-to-rule approach.

Semantic Processing
We're going to discuss two ways to attack this problem (just as we did with parsing):
- The theoretically motivated, correct and complete approach (computational/compositional semantics): create a FOL representation that accounts for all the entities, roles and relations present in a sentence.
- Practical approaches that have some hope of being useful and successful (information extraction): do a superficial analysis that pulls out only the entities, relations and roles that are of interest to the consuming application.

Compositional Analysis
Principle of Compositionality: the meaning of a whole is derived from the meanings of the parts.
What parts? The constituents of the syntactic parse of the input.
What could it mean for a part to have a meaning?

Example
AyCaramba serves meat.
∃e Serving(e) ∧ Server(e,AyCaramba) ∧ Served(e,Meat)

Augmented Rules
We'll accomplish this by attaching semantic formation rules to our syntactic CFG rules. Abstractly:
A → α1 … αn   { f(α1.sem, …, αn.sem) }
This should be read as: the semantics we attach to A can be computed from some function applied to the semantics of A's parts.
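Before the worked AyCaramba example on the next slides, here is a minimal sketch of the earlier "verb provides a template with slots" idea from the Predicates slide: the semantics of "gave" is a function that, given the meanings of the subject, object, and PP object, returns the event-style conjunction above. The role names follow the slide; the Python encoding and the fresh-variable helper are invented for illustration.

```python
# Sketch: the verb "gave" as a template whose slots are filled by the
# meanings of its syntactic arguments.
import itertools

_counter = itertools.count()

def gave_template(giver, given, givee):
    """Exists x. Giving(x) ^ Giver(giver,x) ^ Given(given,x) ^ Givee(givee,x)."""
    x = f"e{next(_counter)}"                  # fresh existential event variable
    return [("Giving", x),
            ("Giver", giver, x),
            ("Given", given, x),
            ("Givee", givee, x)]

# "Mary gave a list to John": the NP/PP meanings plug into the slots.
print(gave_template("Mary", "List", "John"))
# [('Giving', 'e0'), ('Giver', 'Mary', 'e0'), ('Given', 'List', 'e0'), ('Givee', 'John', 'e0')]
```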
Example
The easy parts:
NP → PropNoun         { PropNoun.sem }
NP → MassNoun         { MassNoun.sem }
PropNoun → AyCaramba  { AyCaramba }
MassNoun → meat       { MEAT }

Example
S → NP VP             { VP.sem(NP.sem) }
VP → Verb NP          { Verb.sem(NP.sem) }
Verb → serves         { ??? }  The answer: λx λy ∃e Serving(e) ∧ Server(e,y) ∧ Served(e,x)

Lambda Forms
A simple addition to FOPC: take a FOPC sentence with variables in it that are to be bound, and allow those variables to be bound by treating the lambda form as a function with formal arguments:
λx P(x)
λx P(x)(Sally) = P(Sally)
(Four worked-example slides step the lambda applications through the parse of "AyCaramba serves meat"; their content is not captured in this text.)

Syntax/Semantics Interface: Two Philosophies
1. Let the syntax do what syntax does well and don't expect it to know much about meaning. In this approach, the lexical entries' semantic attachments do all the work.
2. Assume the syntax does know something about meaning. Here the grammar gets complicated and the lexicon simpler (the constructional approach).

Example
Mary freebled John the nim.
Who has it? Where did he get it from? Why?

Example
Consider the attachments for the VPs:
VP → Verb NP NP   (gave Mary a book)
VP → Verb NP PP   (gave a book to Mary)
Assume the meaning representations should be the same for both. Under the lexicon-heavy scheme, the VP attachments are:
VP → Verb NP NP   { Verb.sem(NP1.sem, NP2.sem) }
VP → Verb NP PP   { Verb.sem(NP.sem, PP.sem) }

Example
Under a syntax-heavy scheme we might want to do something like:
VP → V NP NP   { V.sem ∧ Recip(NP1.sem) ∧ Object(NP2.sem) }
VP → V NP PP   { V.sem ∧ Recip(PP.sem) ∧ Object(NP1.sem) }
i.e., the verb only contributes the predicate; the grammar "knows" the roles.

Integration
Two basic approaches:
- Integrate semantic analysis into the parser (assign meaning representations as constituents are completed)
- Pipeline: assign meaning representations to trees only after they're completed

Example
From BERP: I want to eat someplace near campus.
Two parse trees, two meanings.

Pros and Cons
If you integrate semantic analysis into the parser as it is running, you can use semantic constraints to cut off parses that make no sense. But you assign meaning representations to constituents that don't take part in the correct (most probable) parse.

Mismatches
There are unfortunately some annoying mismatches between the syntax of FOPC and the syntax provided by our grammars. So we'll accept that we can't always directly create valid logical forms in a strictly compositional way; we'll get as close as we can and patch things up after the fact.

Complex Terms
Allow the compositional system to pass around representations like the following as objects with parts:
Complex-Term → <Quantifier variable body>
Example: <∃x Isa(x,Restaurant)>

Example
Our restaurant example winds up looking like:
∃e Serving(e) ∧ Server(e, <∃x Isa(x,Restaurant)>) ∧ Served(e,Meat)
Big improvement…

Conversion
So complex terms wind up being embedded inside predicates. Pull them out and redistribute the parts in the right way:
P(<Quantifier var body>) turns into: Quantifier var body Connective P(var)

Example
Server(e, <∃x Isa(x,Restaurant)>) becomes: ∃x Isa(x,Restaurant) ∧ Server(e,x)

Quantifiers and Connectives
If the quantifier is an existential, the connective is ∧ (and).
If the quantifier is a universal, the connective is ⇒ (implies).

Multiple Complex Terms
Note that the conversion technique pulls the quantifiers out to the front of the logical form. That leads to ambiguity if there's more than one complex term in a sentence.
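To make the rule-to-rule idea from the Lambda Forms and Augmented Rules slides concrete, here is a tiny sketch that composes "AyCaramba serves meat" with Python functions standing in for lambda forms: Verb.sem is a curried function, the VP rule applies it to the object's semantics, and the S rule applies the result to the subject's semantics. The attachments follow the slides; the Python encoding is invented for illustration.

```python
# Sketch of rule-to-rule composition with Python closures as lambda forms.
import itertools
_fresh = itertools.count()

# Verb -> serves : lambda x. lambda y. Exists e. Serving(e) ^ Server(e,y) ^ Served(e,x)
def serves_sem(x):
    def with_subject(y):
        e = f"e{next(_fresh)}"
        return [("Serving", e), ("Server", e, y), ("Served", e, x)]
    return with_subject

# Lexical attachments for the easy parts.
aycaramba_sem = "AyCaramba"
meat_sem = "Meat"

# VP -> Verb NP : { Verb.sem(NP.sem) }
vp_sem = serves_sem(meat_sem)

# S -> NP VP : { VP.sem(NP.sem) }
s_sem = vp_sem(aycaramba_sem)

print(s_sem)
# [('Serving', 'e0'), ('Server', 'e0', 'AyCaramba'), ('Served', 'e0', 'Meat')]
```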
Quantifier Ambiguity
Consider: Every restaurant has a menu.
That could mean that every restaurant has a menu (its own), or that there's some uber-menu out there and all restaurants have that menu.

Quantifier Scope Ambiguity
∀x Restaurant(x) ⇒ ∃e,y Having(e) ∧ Haver(e,x) ∧ Had(e,y) ∧ Isa(y,Menu)
∃y Isa(y,Menu) ∧ ∀x Restaurant(x) ⇒ ∃e Having(e) ∧ Haver(e,x) ∧ Had(e,y)

Ambiguity
This turns out to be a lot like the prepositional phrase attachment problem: the number of possible interpretations goes up exponentially with the number of complex terms in the sentence. The best we can do is to come up with weak methods to prefer one interpretation over another.

Non-Compositionality
Unfortunately, there are lots of examples where the meaning (loosely defined) can't be derived from the meanings of the parts: idioms, jokes, irony, sarcasm, metaphor, metonymy, indirect requests, etc.

English Idioms
Kick the bucket, buy the farm, bite the bullet, run the show, bury the hatchet, etc. Lots of these: constructions where the meaning of the whole is either totally unrelated to the meanings of the parts (kick the bucket) or related in some opaque way (run the show).

The Tip of the Iceberg
Describe this construction:
1. A fixed phrase with a particular meaning
2. A syntactically and lexically flexible phrase with a particular meaning
3. A syntactically and lexically flexible phrase with a partially compositional meaning
4. …

Example
Enron is the tip of the iceberg.
NP → "the tip of the iceberg"
Not so good… attested examples: the tip of Mrs. Ford's iceberg, the tip of a 1000-page iceberg, the merest tip of the iceberg. How about: That's just the iceberg's tip.

Example
What we seem to need is something like:
NP → an initial NP with "tip" as its head, followed by a PP with "of" as its head that has "iceberg" as the head of its NP
And that allows modifiers like merest, Mrs. Ford's, and 1000-page to modify the relevant semantic forms.

Quantified Phrases
Consider: A restaurant serves meat.
Assume that "a restaurant" looks like <∃x Isa(x,Restaurant)>.
If we do the normal lambda thing we get:
∃e Serving(e) ∧ Server(e, <∃x Isa(x,Restaurant)>) ∧ Served(e,Meat)

END

Examples from Russell & Norvig (1), exercise 7.2, p. 213
- Not all students take both History and Biology.
- Only one student failed History.
- Only one student failed both History and Biology.
- The best score in History was better than the best score in Biology.
- Every person who dislikes all vegetarians is smart.
- No person likes a smart vegetarian.
- There is a woman who likes all men who are vegetarian.
- There is a barber who shaves all men in town who don't shave themselves.
- No person likes a professor unless the professor is smart.
- Politicians can fool some people all of the time or all people some of the time, but they cannot fool all people all of the time.

Categories & Events
Categories:
- VegetarianRestaurant(Joe's) – categories are relations, not objects
- MostPopular(Joe's,VegetarianRestaurant) – not FOPC!
- ISA(Joe's,VegetarianRestaurant) – reification (turn all concepts into objects)
- AKO(VegetarianRestaurant,Restaurant)
Events: Reservation(Hearer,Joe's,Today,8PM,2)
Problems:
- Determining the correct number of roles
- Representing facts about the roles associated with an event
- Ensuring that all the correct inferences can be drawn
- Ensuring that no incorrect inferences can be drawn
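Looking back at the Quantifier Scope Ambiguity slide, here is a minimal sketch of how pulling the complex terms out in different orders yields the two readings of "Every restaurant has a menu". The formulas are built as plain strings just to show the two orderings; the encoding is invented for illustration.

```python
from itertools import permutations

# Complex terms from "Every restaurant has a menu":
# (quantifier, variable, restriction, connective used when the term is pulled out).
complex_terms = [
    ("forall", "x", "Restaurant(x)", "=>"),
    ("exists", "y", "Isa(y,Menu)", "^"),
]
body = "exists e. Having(e) ^ Haver(e,x) ^ Had(e,y)"

def scope(order, body):
    """Wrap the body with the complex terms, first element of order outermost."""
    formula = body
    for quant, var, restriction, conn in reversed(order):
        formula = f"{quant} {var}. {restriction} {conn} ({formula})"
    return formula

for order in permutations(complex_terms):
    print(scope(order, body))
# forall x. Restaurant(x) => (exists y. Isa(y,Menu) ^ (exists e. ...))   per-restaurant menus
# exists y. Isa(y,Menu) ^ (forall x. Restaurant(x) => (exists e. ...))   one uber-menu
```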
MUC-4 Example
On October 30, 1989, one civilian was killed in a reported FMLN attack in El Salvador.
INCIDENT: DATE                    30 OCT 89
INCIDENT: LOCATION                EL SALVADOR
INCIDENT: TYPE                    ATTACK
INCIDENT: STAGE OF EXECUTION      ACCOMPLISHED
INCIDENT: INSTRUMENT ID           -
INCIDENT: INSTRUMENT TYPE         -
PERP: INCIDENT CATEGORY           TERRORIST ACT
PERP: INDIVIDUAL ID               "TERRORIST"
PERP: ORGANIZATION ID             "THE FMLN"
PERP: ORG. CONFIDENCE             REPORTED: "THE FMLN"
PHYS TGT: ID                      -
PHYS TGT: TYPE                    -
PHYS TGT: NUMBER                  -
PHYS TGT: FOREIGN NATION          -
PHYS TGT: EFFECT OF INCIDENT      -
PHYS TGT: TOTAL NUMBER            -
HUM TGT: NAME                     -
HUM TGT: DESCRIPTION              "1 CIVILIAN"
HUM TGT: TYPE                     CIVILIAN: "1 CIVILIAN"
HUM TGT: NUMBER                   1: "1 CIVILIAN"
HUM TGT: FOREIGN NATION           -
HUM TGT: EFFECT OF INCIDENT       DEATH: "1 CIVILIAN"
HUM TGT: TOTAL NUMBER             -

Subcategorization frames
1. I ate
2. I ate a turkey sandwich
3. I ate a turkey sandwich at my desk
4. I ate at my desk
5. I ate lunch
6. I ate a turkey sandwich for lunch
7. I ate a turkey sandwich for lunch at my desk
There is no fixed "arity" (a problem for FOPC).

One possible solution
1. Eating1(Speaker)
2. Eating2(Speaker, TurkeySandwich)
3. Eating3(Speaker, TurkeySandwich, Desk)
4. Eating4(Speaker, Desk)
5. Eating5(Speaker, Lunch)
6. Eating6(Speaker, TurkeySandwich, Lunch)
7. Eating7(Speaker, TurkeySandwich, Lunch, Desk)
Meaning postulates are used to tie together the semantics of the predicates, e.g.:
∀w,x,y,z Eating7(w,x,y,z) ⇒ Eating6(w,x,y)
Scalability issues again!

Another solution
Say that everything is a special case of Eating7 with some arguments unspecified:
∃w,x,y Eating(Speaker,w,x,y)
Two problems again:
- Too many commitments (e.g., no eating except at meals: lunch, dinner, etc.)
- No way to individuate events:
  ∃w,x Eating(Speaker,w,x,Desk)
  ∃w,y Eating(Speaker,w,Lunch,y)
  cannot be combined into ∃w Eating(Speaker,w,Lunch,Desk)

Reification
∃w Isa(w,Eating) ∧ Eater(w,Speaker) ∧ Eaten(w,TurkeySandwich) – equivalent to sentence 2.
With reification:
- No need to specify a fixed number of arguments for a given surface predicate
- No more roles are postulated than are mentioned in the input
- No need for meaning postulates to specify logical connections among closely related examples

Representing time
1. I arrived in New York
2. I am arriving in New York
3. I will arrive in New York
All three share the same tenseless event representation:
∃w Isa(w,Arriving) ∧ Arriver(w,Speaker) ∧ Destination(w,NewYork)
Adding temporal information via intervals:
∃i,e,w Isa(w,Arriving) ∧ Arriver(w,Speaker) ∧ Destination(w,NewYork) ∧ IntervalOf(w,i) ∧ EndPoint(i,e) ∧ Precedes(e,Now)
∃i,w Isa(w,Arriving) ∧ Arriver(w,Speaker) ∧ Destination(w,NewYork) ∧ IntervalOf(w,i) ∧ MemberOf(i,Now)
∃i,s,w Isa(w,Arriving) ∧ Arriver(w,Speaker) ∧ Destination(w,NewYork) ∧ IntervalOf(w,i) ∧ StartPoint(i,s) ∧ Precedes(Now,s)

Representing time
We fly from San Francisco to Boston at 10.
Flight 1390 will be at the gate an hour from now.

Use of tenses
Flight 1902 arrived late.
Flight 1902 had arrived late.
"Similar" tenses:
When Mary's flight departed, I ate lunch.
When Mary's flight departed, I had eaten lunch. (reference point)

Aspect
Stative: I know my departure gate.
Activity: John is flying. (no particular end point)
Accomplishment: Sally booked her flight. (natural end point, resulting in a particular state)
Achievement: She found her gate.
Figuring out statives:
*I am needing the cheapest fare.
*I am wanting to go today.
*Need the cheapest fare!
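Tying the reification and time slides together, here is a minimal sketch that represents a reified Eating event as a set of role assertions, so sentences 1-7 differ only in how many role conjuncts they contribute, and an entailment such as "sentence 7 entails sentence 2" reduces to a subset check. The role names MealEaten and PlaceEaten are invented for illustration; Isa, Eater, and Eaten follow the slides.

```python
# Sketch: reified events as sets of role assertions about an event variable.

def eating_event(event, eater=None, eaten=None, meal=None, place=None):
    """Build the conjuncts for one Eating event; unstated roles are simply absent."""
    conjuncts = {("Isa", event, "Eating")}
    if eater: conjuncts.add(("Eater", event, eater))
    if eaten: conjuncts.add(("Eaten", event, eaten))
    if meal:  conjuncts.add(("MealEaten", event, meal))
    if place: conjuncts.add(("PlaceEaten", event, place))
    return conjuncts

# Sentence 2: "I ate a turkey sandwich"
s2 = eating_event("w", eater="Speaker", eaten="TurkeySandwich")
# Sentence 7: "I ate a turkey sandwich for lunch at my desk"
s7 = eating_event("w", eater="Speaker", eaten="TurkeySandwich",
                  meal="Lunch", place="Desk")

# Sentence 7 entails sentence 2: its conjuncts include all of sentence 2's.
print(s2.issubset(s7))   # True
```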
Representing beliefs
Want, believe, imagine, know: all introduce hypothetical worlds.
I believe that Mary ate British food.
Reified example:
∃u,v Isa(u,Believing) ∧ Isa(v,Eating) ∧ Believer(u,Speaker) ∧ BelievedProp(u,v) ∧ Eater(v,Mary) ∧ Eaten(v,BritishFood)
However, this also implies:
∃v Isa(v,Eating) ∧ Eater(v,Mary) ∧ Eaten(v,BritishFood)
that is, it asserts that the eating actually happened.
Modal operators:
Believing(Speaker, Eating(Mary,BritishFood)) – not FOPC! Predicates in FOPC hold between objects, not between relations.
Believes(Speaker, ∃v Isa(v,Eating) ∧ Eater(v,Mary) ∧ Eaten(v,BritishFood))

Modal operators
Beliefs, knowledge, assertions.
Issues: If you are interested in baseball, the Red Sox are playing tonight.

Examples from Russell & Norvig (2), exercise 7.3, p. 214
- One more outburst like that and you'll be in contempt of court.
- Annie Hall is on TV tonight if you are interested.
- Either the Red Sox win or I am out ten dollars.
- The special this morning is ham and eggs.
- Maybe I will come to the party and maybe I won't.
- Well, I like Sandy and I don't like Sandy.
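Relating back to the belief-representation slides above, here is a minimal sketch of the Believes(Speaker, P) idea: the believed proposition is stored as a nested object inside the belief fact rather than being asserted in the top-level knowledge base, so the system does not conclude that Mary actually ate British food. The tuple encoding and the `asserted` helper are invented for illustration.

```python
# Sketch: beliefs as facts whose second argument is a whole (reified) proposition.

# The embedded proposition: exists v. Isa(v,Eating) ^ Eater(v,Mary) ^ Eaten(v,BritishFood)
eating_prop = ("exists", "v",
               [("Isa", "v", "Eating"),
                ("Eater", "v", "Mary"),
                ("Eaten", "v", "BritishFood")])

# Top-level KB: only the belief itself is asserted.
KB = [("Believes", "Speaker", eating_prop)]

def asserted(kb, predicate):
    """Top-level facts with the given predicate; embedded propositions don't count."""
    return [fact for fact in kb if fact[0] == predicate]

print(asserted(KB, "Believes"))  # the belief is asserted
print(asserted(KB, "Eater"))     # [] : the eating itself is not asserted at the top level
```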