G22.2590 - Natural Language Processing - Spring 2001
Lecture 7 Outline
Prof. Grishman
February 28, 2001

Semantic Representation, cont'd

Basic Logical Form Representation

Based on predicate calculus:
- Use words or word senses for predicate names.
- Use linguistic quantifiers (all, most, some, the, a, …) with the general form
    (quantifier variable : restriction-proposition body-proposition)

Encoding Ambiguity

One source of ambiguity is quantifier scope:
    "A woman gives birth in the United States every five minutes."
We can represent the two readings in conventional predicate calculus using different quantifier scopes.

If we explicitly represent all the semantic ambiguities in a sentence in this way, we may have very many readings. It is therefore practical to initially produce (from the parse) a representation which captures multiple readings … which encodes (some of) the ambiguity. (And hope that this ambiguity can be resolved at a later stage of semantic analysis.)

In particular, Allen (p. 240) introduces unscoped quantifiers, using
    P(<quantifier var : restriction>)
to represent
    (quantifier var : restriction P(var))
so that both readings of "Every boy loves a dog." are represented by
    (LOVES1 <EVERY b1 (BOY1 b1)> <A d1 (DOG1 d1)>)

Labeling Arguments (Allen 8.5 and 8.6)

Predicate logic typically uses predicates with a fixed number of positional arguments. This is not convenient for many natural language predicates, which can appear with different numbers of arguments ("Fred ate." "Fred ate a sandwich."); we would need different predicates, and rules relating the meanings of those predicates. The situation is much worse if we consider all the modifiers (location, time, instrument, …) a verb can take. A practical solution is to have named, optional argument roles:
    (eat [subject Fred] [object a sandwich] [location kitchen])
By itself this is somewhat informal; it must be coupled with rules which relate the predicate with and without a given argument.
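As a concrete illustration of how unscoped quantifiers defer the scoping decision, here is a small sketch in Python. The tuple encoding and the helper `scope_readings` are my own illustrative choices, not Allen's implementation: quantified terms are left in argument position, and a separate scoping step enumerates the fully scoped readings by pulling the quantifiers out in every possible order.

```python
from itertools import permutations

# Unscoped form of "Every boy loves a dog." (tuple encoding assumed here):
# (LOVES1 <EVERY b1 (BOY1 b1)> <A d1 (DOG1 d1)>)
unscoped = ("LOVES1",
            ("EVERY", "b1", ("BOY1", "b1")),
            ("A", "d1", ("DOG1", "d1")))

def scope_readings(form):
    """Enumerate the scoped readings of a predication whose arguments
    may be unscoped quantifier terms (quant, var, restriction)."""
    pred, *args = form
    terms = [a for a in args if isinstance(a, tuple) and len(a) == 3]
    # Replace each quantifier term by its variable in the body proposition.
    body = (pred,) + tuple(a[1] if a in terms else a for a in args)
    readings = []
    for order in permutations(terms):
        reading = body
        # Wrap innermost-first, so the first term in `order` scopes widest:
        # (quantifier variable restriction-proposition body-proposition)
        for quant, var, restriction in reversed(order):
            reading = (quant, var, restriction, reading)
        readings.append(reading)
    return readings

for r in scope_readings(unscoped):
    print(r)
```

With two quantifiers this yields exactly the two readings: EVERY outscoping A, and A outscoping EVERY; the single unscoped form stands for both until a later stage of semantic analysis chooses between them.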
A more formal approach involves reification: treating events as objects which can be quantified:
    (exists e (eat e) & (subject e Fred) & (object e sandwich) & (location e kitchen))
We can consider the named-argument representation,
    (eat e [subject Fred] [object a sandwich] [location kitchen]),
an abbreviation for this.

Argument Roles

What names should we use for the arguments? We can use basically syntactic labels (subject, object). However, we then have to distinguish the predicates in
    John broke the window with a hammer.
    The hammer broke the window.
    The window broke.
because the "subject" plays quite different roles in the three cases. Some linguists and computational linguists have suggested using more semantic argument roles (thematic roles) which will capture these relations. In this case, John would be the agent, the window the theme, and the hammer the instrument. Such thematic roles can capture a number of verbal alternations (though they are difficult to apply uniformly over a large vocabulary).

Mapping Syntax to Semantics (Allen 8.1)

We want to compute the semantic representation of a sentence from the parse tree. We could embed this translation in a procedure, but, as in the case of parsing, it will be easier to develop and maintain this translation if it is rule driven. Furthermore, because the parse tree provides a structural framework, we will use a compositional, syntax-driven translation process. This means that we will associate a (partial) semantic interpretation with each node of the parse tree, to be computed (using a rule) from the interpretations of its children.

This approach works best if there is a close correspondence between the syntactic structure and the semantic structure. The correspondence is closer for the unscoped logical form discussed above: a noun phrase translates naturally into an unscoped quantifier, and a clause into a predicate with its arguments.
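The reified, named-argument representation can be sketched as follows; the function `make_event` and the tuple encoding are hypothetical illustrations, not part of Allen's text. Each event becomes an object (a fresh variable), and each role is a separate proposition about that object, so roles are naturally optional.

```python
import itertools

_event_ids = itertools.count(1)

def make_event(predicate, **roles):
    """Build the reified conjunction
    (exists e (pred e) & (role1 e filler1) & ...)
    as a list of propositions about a fresh event variable e."""
    e = f"e{next(_event_ids)}"
    props = [(predicate, e)] + [(role, e, filler)
                                for role, filler in roles.items()]
    return e, props

# "Fred ate a sandwich in the kitchen." -- all roles present:
e, props = make_event("eat", subject="Fred", object="sandwich",
                      location="kitchen")
print(props)

# "Fred ate." -- the optional roles are simply omitted, with no need
# for a separate predicate of different arity:
e2, props2 = make_event("eat", subject="Fred")
print(props2)
```

The design point is that dropping an argument removes one conjunct rather than changing the predicate, which is exactly what makes the first proposition entail the second.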
The semantics of a verb phrase is essentially the semantics of a clause with one argument (the subject) missing … a predicate with one unbound argument. We can represent this by a lambda expression (p. 265).

So the grammar can be extended to add a SEM feature, representing the semantic interpretation of a node. Each production will then incorporate the rule for computing its SEM value, and the SEM of the root will be the interpretation of the sentence (Allen 8.2).

Assignment #7 … wait till next week
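The compositional translation and the VP-as-lambda idea can be illustrated with a toy fragment; the grammar, SEM values, and sense names below are my own minimal sketch, not Allen's grammar. The VP's SEM is a function of the missing subject, and the rule for S -> NP VP simply applies it to the NP's SEM.

```python
# SEM of the NP "Fred" (treated as a constant, for simplicity).
sem_np = "FRED1"

# SEM of the VP "ate a sandwich": a predicate with the subject unbound,
# represented as a lambda expression that builds an unscoped logical form.
sem_vp = lambda subj: ("EATS1", subj, ("A", "s1", ("SANDWICH1", "s1")))

def s_rule(np_sem, vp_sem):
    """Rule attached to the production S -> NP VP: the SEM of S is the
    VP's SEM applied to the NP's SEM."""
    return vp_sem(np_sem)

sem_s = s_rule(sem_np, sem_vp)
print(sem_s)
```

Each node's SEM is computed only from its children's SEMs, so the translation is syntax driven in exactly the sense described above; the root's SEM is the (unscoped) interpretation of the whole sentence.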