Logic Form Representations
11/06/2003
Reading: Chap 14, Jurafsky & Martin
Slide set adapted from Vasile Rus, U. Memphis
Instructor: Rada Mihalcea
Problem Description
There is a need for knowledge bases
E.g.: Question Answering
find the answer to
Q471: What year did Hitler die?
in a collection of documents
1. A: “Hitler committed suicide in 1945”
2. How would one justify that this is the right answer? Using world
knowledge:
suicide – {kill yourself}
kill – {cause to die}
Create intelligent interfaces to databases:
E.g.: Where can I eat Italian food?
Or: I'd like some pizza for dinner. Where can I go?
How to Build Knowledge Bases?
Manually
- building common sense knowledge bases
- see Cyc, Open Mind Common Sense
Automatically
- from open text
- from dictionaries like WordNet
Logic Form Representation
• What representation to use?
• Logic Form (LF) is a knowledge representation introduced by
Jerry Hobbs (1983)
• Logic form is a first-order representation based on natural
language
First Order Representations
Fulfil the five main desiderata for representing meaning:
1. Verifiability:
Does Maharani serve vegetarian food?
Serves(Maharani, vegetarian food)
A representation that can be used to match a proposition against a
knowledge base
2. Unambiguous representations:
I would like to eat someplace close to UNT.
= eat in a place near UNT (intended reading)
= eat a place (spurious reading)
Get rid of ambiguity by assigning a sense to words, or by adding
additional information that rules out ambiguity.
A representation should be free of ambiguity.
First Order Representations
3. Canonical Form
Does Maharani serve vegetarian food?
Are vegetarian dishes served at Maharani?
Do they have vegetarian food at Maharani?
Texts that have the same meaning should have the same
representation.
4. Inference and Variables
The ability to draw inferences from the representations
Serves(x, Vegetarian Food) --> EatAt(Vegetarians, x)
5. Expressiveness
Representations should be expressive enough to handle a wide range
of subjects.
Induction, Abduction
Use first-order predicates (FOP) for automatic reasoning
How?
• Induction
• Abduction
Logic Form Transformations
First order representations
- have the characteristics of FOP
Add some extra information (e.g. POS, word sense)
Derived automatically from text, starting with parse trees
Used for automatic construction of knowledge bases:
- e.g. starting with WordNet
WordNet as a Source of World Knowledge
• [review]
• WordNet, developed at Princeton by Prof. George Miller, is an electronic
semantic network whose main element is the synset
– synset – a set of synonymous words that define a concept
– E.g.: {cocoa, chocolate, hot chocolate}
• a word may belong to more than one synset
• WordNet contains synsets for four parts of speech: noun, verb,
adjective and adverb
• synsets are related to each other via a set of relations: hypernymy
(ISA), hyponymy (reverse ISA), cause, entailment,
meronymy (PART-OF) and others
• hypernymy is the most important relation; it organizes
concepts in a hierarchy
• adjectives and adverbs are organized in clusters based on
similarity and antonymy relations
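These synsets, relations, and glosses can also be browsed programmatically. A minimal sketch using NLTK's WordNet interface (a tool that postdates these slides, shown purely for illustration):

# Sketch: exploring WordNet synsets, relations, and glosses with NLTK.
# Requires: pip install nltk, then nltk.download('wordnet').
from nltk.corpus import wordnet as wn

# All synsets a word belongs to (a word may belong to more than one)
for syn in wn.synsets('suicide'):
    print(syn.name(), '-', syn.definition())   # definition() returns the gloss

# Relations between synsets
kill = wn.synsets('kill', pos=wn.VERB)[0]
print(kill.hypernyms())     # more general concepts (ISA)
print(kill.entailments())   # events entailed by the verb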
WordNet glosses
• Each synset includes a small textual definition and one or more
examples that form a gloss.
• E.g.:
– {suicide:n#1} – {killing yourself}
– {kill:v#1} – {cause to die}
– {extremity, appendage, member} – {an external body part
that projects from the body “it is important to keep the
extremities warm”}
• Glosses are a rich source of world knowledge
• Can transform glosses into a computational representation
Logic Form Representation
• A predicate is a concatenation of the morpheme’s base form, part
of speech and WordNet semantic sense
– morpheme:POS#sense(list_of_arguments)
• There are two types of arguments:
– x – for entities
– e – for events
• The position of the arguments is important
– verb:v#sense(e, subject, direct_object, indirect_object)
– preposition(head, prepositional_object)
• A predicate is generated for each noun, verb, adjective and
adverb
• Complex nominals are represented using the predicate nn:
– e.g.: “goat hair” – nn(x1, x2, x3) & goat(x2) & hair(x3)
• The logic form of a sentence is the conjunction of individual
predicates
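As a toy illustration of the notation (not the original implementation), predicates in this format can be assembled mechanically:

# Sketch: assembling LF predicates in the morpheme:POS#sense(args) notation.
def predicate(base, pos, sense, *args):
    """Build one LF predicate, e.g. break:v#6(e1, x1, x2)."""
    return f"{base}:{pos}#{sense}({', '.join(args)})"

lf = " & ".join([
    predicate("someone", "n", "1", "x1"),
    predicate("break", "v", "6", "e1", "x1", "x2"),  # event, subject, direct object
    predicate("law", "n", "1", "x2"),
])
print(lf)  # someone:n#1(x1) & break:v#6(e1, x1, x2) & law:n#1(x2)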
An Example
• {lawbreaker, violator}: (someone who breaks the law)
• someone:n#1(x1) & break:v#6(e1, x1, x2) & law:n#1(x2)
• Reading break:v#6(e1, x1, x2): “:v” is the part of speech (categorial
information), “#6” the WordNet sense (semantic information), and the
arguments e1 (event), x1 (subject) and x2 (direct object) carry the
functional information
Logic Form Notation (cont’d)
• Ignores: plurals and sets, verb tenses, auxiliaries,
negation, quantifiers, comparatives
• Consequence:
– Glosses with comparatives cannot be fully
transformed into logic forms
• The original notation does not handle special cases of
postmodifiers (modifiers placed after the modifiee)
or relative adverbs (where, when, how, why)
Comparatives
• {tower}: (structure taller than its diameter)
• Does taller/JJR modify structure, diameter, or both?
• Solution: introduce a relation between structure and
diameter
• LF: structure(x1) & taller(x1, x2) & diameter(x2)
Postmodifiers
• {achromatic_lens}: (a compound lens system that forms an
image free from chromatic_aberration)
• Is free a modifier of image?
• What is the prepositional head of from?
• Solution: free_from – a NEW predicate
• LF: image(x1) & free_from(x1, x2) &
chromatic_aberration(x2)
Relative Adverbs
• {airdock}: (a large building at an airport where aircraft can be
stored)
• Equivalent to: (aircraft can be stored in a large building at an
airport)
• LF: large(x1) & building(x1) & at(x1, x2) & airport(x2) &
where(x1, e1) & aircraft(x3) & store(e1, x4, x3)
Logic Form Identification
• Take advantage of the structural information embedded in a
parse tree
(parse-tree diagram: S -> NP VP, with the NP under S as the subject, the
NP inside the VP as the direct object, and VP-ACT vs. VP-PASS marking
active vs. passive voice)
Architecture
Preprocess (Extract Defs, Tokenize) -> POS Tag -> Parse -> LF Transformer
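The pipeline reads naturally as function composition. A skeletal sketch with placeholder components (all names are hypothetical stubs, not the original system's modules):

# Sketch of the LF derivation pipeline: Preprocess -> POS Tag -> Parse -> LF.
def preprocess(gloss):
    """Extract the definition text and tokenize it."""
    return gloss.strip('()').split()

def pos_tag(tokens):
    """Assign a part-of-speech tag to each token (stub)."""
    raise NotImplementedError

def parse(tagged):
    """Build a parse tree over the tagged tokens (stub)."""
    raise NotImplementedError

def lf_transform(tree):
    """Apply the LF identification rules to the parse tree (stub)."""
    raise NotImplementedError

def gloss_to_lf(gloss):
    return lf_transform(parse(pos_tag(preprocess(gloss))))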
Example of Logic Form
(parse tree: (NP (NP (DT a) (NN monastery)) (VP (VBN ruled)
(PP (IN by) (NP (DT an) (NN abbot))))))
monastery:n(x1) & rule:v(e1, x2, x1) & abbot:n(x2)
Logic Form Derivation
• Take advantage of the syntactic information from the parser
• For each grammar rule derive one or more LF identification
rules
Identification rules, with examples from the {abbey:n#3} gloss (a code
sketch follows):

Grammar rule: NP -> DT NN
LF rule: Noun/NN -> noun(x)
Phrase: (NP (a/DT monastery/NN))

Grammar rule: VP -> VP PP
LF rule: Verb(e, -, -)/VP-PASS by/PP(-, x) -> verb(e, x, -) & by(e, x)
Phrase: (VP (ruled/VBN by/PP))
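In code, such a table might be a lookup from grammar rules to small LF-building functions. A hypothetical sketch (function names and data layout are illustrative):

# Sketch: dispatch table from grammar rules to LF-rule functions.
def np_dt_nn(dt, nn):
    # NP -> DT NN: determiners are ignored; the noun becomes noun(x)
    return f"{nn}(x1)"

def vp_vbn_by_pp(verb, by_obj):
    # VP -> VP PP, passive verb + by-phrase: the by-object fills the subject slot
    return f"{verb}(e1, {by_obj}, x1) & by(e1, {by_obj})"

LF_RULES = {
    ("NP", ("DT", "NN")): np_dt_nn,
    ("VP", ("VP", "PP")): vp_vbn_by_pp,
}

print(LF_RULES[("NP", ("DT", "NN"))]("a", "monastery"))  # monastery(x1)
print(LF_RULES[("VP", ("VP", "PP"))]("rule", "x2"))      # rule(e1, x2, x1) & by(e1, x2)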
Building Logic Forms from WordNet
• From definitions to axioms
• WordNet glosses transformed into axioms, to
enable automated reasoning
• Specific rules to derive axioms for each part of
speech:
– Nouns: the noun definition consists of a genus and a differentia. The generic axiom is:
concept(x) -> genus(x) & differentia(x)
• E.g.: abbey(x1) -> monastery(x1) & rule(e1, x2, x1) & abbot(x2)
– Verbs: are trickier, as syntactic functional changes can occur between the left-hand
side and the right-hand side
• E.g.: kill:v#1(e1, x1, x2, x3) -> cause(e2, x1, e3, x3) & die(e3, x2)
– Adjectives: borrow a virtual argument representing the head they modify
• E.g.: american:a#1(x1) -> of(x1, x2) & United_States_Of_America(x2)
– Adverbs: borrow a virtual event argument, as they usually modify an event
• E.g.: fast:r#1(e1) -> quickly:r#1(e1)
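For the noun case, the axiom is essentially string assembly from the gloss's logic form. A toy sketch (not the authors' code):

# Sketch: building a noun axiom concept(x) -> genus(x) & differentia(x)
# from the concept name and the logic form of its gloss.
def noun_axiom(concept, gloss_lf, var="x1"):
    return f"{concept}({var}) -> {gloss_lf}"

print(noun_axiom("abbey", "monastery(x1) & rule(e1, x2, x1) & abbot(x2)"))
# abbey(x1) -> monastery(x1) & rule(e1, x2, x1) & abbot(x2)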
Building a Knowledge Base from WordNet
• Parse all glosses and extract all grammar rules embedded in the
parse trees
Part of speech    Rules
Noun               5,392
Verb               1,837
Adjective          1,958
Adverb               639
Total              9,826
• The grammar is large
• Since a grammar rule can map into more than one LF
rule, the effort to analyse and implement all of them would be
tremendous
Coverage issue
• Group the grammar rules by the nonterminal on the Left Hand
Side (LHS); the most frequent rules in each class cover most of the
occurrences of rules belonging to that class
Phrase on the LHS    Occurrences    Unique Rules    Coverage of top ten
Base NP                   33,643             857                    69%
NP                        11,408             244                    95%
VP                        19,415             450                    70%
PP                        12,315              40                    99%
S                         14,740              35                    99%

The coverage of the 10 most frequent grammar rules per phrase type, as
measured in 10,000 noun glosses.
What does this remind you of?
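The skew (a Zipf-like pattern) can be measured directly once rules are extracted from the parse trees. A sketch:

# Sketch: per-category coverage of the 10 most frequent grammar rules.
from collections import Counter

def top10_coverage(rules):
    """rules: iterable of (lhs, rhs) pairs extracted from parse trees."""
    by_lhs = {}
    for lhs, rhs in rules:
        by_lhs.setdefault(lhs, Counter())[rhs] += 1
    return {
        lhs: sum(n for _, n in counts.most_common(10)) / sum(counts.values())
        for lhs, counts in by_lhs.items()
    }

# e.g. top10_coverage([("PP", ("IN", "NP")), ("NP", ("DT", "NN")), ...])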
Coverage issue (cont’d)
• Two phases:
– Phase 1: develop LF rules for most frequent rules and ignore
the others
– Phase 2: select more valuable rules
• The accuracy of each LF rule is almost perfect
• The performance issue is mainly about how many glosses are
entirely transformed into LF
• i.e. how many glosses the selected grammar rules fully map into
LF
Reduce the number of candidate grammar
rules (1)
• Selected grammar rules for base NPs (non-recursive NPs) have
a coverage of only 69%
• Selected grammar rules for VPs have only 70% coverage
• Before selecting rules for base NPs we apply transformations
that reduce more complex rules to simpler ones
(diagram: the coordinated base NP (NP (DT a) (NN ruler) (CC or)
(NN institution)) is rewritten as (NP (NP (DT a) (NN ruler)) (CC or)
(NP (NN institution))))
• Coordinated base NPs are transformed into coordinated NPs and
simple base NPs
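A sketch of that transformation using NLTK's Tree class (the helper function itself is hypothetical):

# Sketch: rewrite a coordinated base NP as a coordination of simple base NPs.
from nltk import Tree

def split_coordinated_np(np):
    kids = list(np)
    cc = [i for i, k in enumerate(kids)
          if isinstance(k, Tree) and k.label() == "CC"]
    if not cc:
        return np                      # nothing to do for a plain base NP
    i = cc[0]
    return Tree("NP", [Tree("NP", kids[:i]), kids[i], Tree("NP", kids[i + 1:])])

np = Tree.fromstring("(NP (DT a) (NN ruler) (CC or) (NN institution))")
print(split_coordinated_np(np))
# (NP (NP (DT a) (NN ruler)) (CC or) (NP (NN institution)))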
Reduce the number of candidate grammar
rules (2)
• Base NPs:
– Determiners are ignored (an increase of 11% in coverage for selected
grammar rules for base NPs)
– Plurals are ignored
– Everything in a prenominal position plays the role of a modifier
Base NP rules:
NP -> DT JJ NN|NNS|NNP|NNPS
NP -> DT VBG NN|NNS|NNP|NNPS
NP -> DT VBN NN|NNS|NNP|NNPS
• VPs:
– Negation is ignored
– Tenses are ignored (auxiliaries and modals)
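These normalizations are simple tag-level rewrites. An illustrative sketch (tag names follow the Penn tagset):

# Sketch: normalize a base NP before rule lookup: drop determiners,
# collapse plural tags onto their singular counterparts.
PLURAL_TO_SINGULAR = {"NNS": "NN", "NNPS": "NNP"}

def normalize_base_np(tagged):
    """tagged: list of (word, tag) pairs for one base NP."""
    return [(word, PLURAL_TO_SINGULAR.get(tag, tag))
            for word, tag in tagged if tag != "DT"]

print(normalize_base_np([("the", "DT"), ("goat", "NN"), ("hairs", "NNS")]))
# [('goat', 'NN'), ('hairs', 'NN')]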
Map grammar rules into LF rules
• Selected grammar rules map into one or more Logic Form rules
• Case 1: grammar rule is mapped into one LF rule
– Grammar rule: PP -> IN NP
– LFT: prep(_, x) -> prep(_, x) & headNP(x)
• Case 2: grammar rule is mapped into more than one LF rule
– Grammar rule: VP -> VP PP
– LFT 1: verb(e, x1, _) -> verb-PASS(e, x1, _) & prep-By(e, x1)
– LFT 2: verb(e, _, x2) -> verb-PASS(e, _, x2) & prep-nonBy(e, x2)
– To differentiate between the two cases, two features are used
(see the sketch below):
• the mood of the VP: active or passive
• the type of preposition: by or non-by
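A sketch of how those two features might drive the choice (illustrative only, not the original implementation):

# Sketch: pick the LF contribution of a VP -> VP PP attachment based on
# the verb's mood and the preposition type.
def vp_pp_lf(verb, mood, prep, obj, e="e1"):
    if mood == "passive" and prep == "by":
        # LFT 1: the by-phrase of a passive verb supplies the logical subject
        return f"{verb}({e}, {obj}, _) & by({e}, {obj})"
    # LFT 2: a non-by preposition contributes an ordinary prepositional predicate
    return f"{verb}({e}, _, _) & {prep}({e}, {obj})"

print(vp_pp_lf("rule", "passive", "by", "x2"))   # rule(e1, x2, _) & by(e1, x2)
print(vp_pp_lf("store", "passive", "in", "x2"))  # store(e1, _, _) & in(e1, x2)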
Logic Form Derivation Results
• Phase 1:
– From a corpus of 10,000 noun glosses extract grammar rules, sort them by
the nonterminal on the LHS, select the most frequent grammar rules and
generate LF rules for them
– Manually develop a test corpus of 400 glosses
– Test the implemented LF rules on 400 noun glosses
– 72% coverage (with almost 100% accuracy)
• Phase 2:
– Iteratively select additional rules, each bringing an increase in coverage of at
least a given threshold
– For glosses the threshold was established at 1%
• This resulted in a total of 70 grammar rules selected
• The new coverage achieved is 81%
• Open issue: how to fully cover the remaining 19% of glosses which are
not fully transformed
– using a set of heuristics
• E.g.: if the subject argument of a verb is missing, use the first
preceding noun as its subject (sketched below)
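That last heuristic is simple enough to sketch (the data layout is a hypothetical stand-in, not the authors' code):

# Sketch of the heuristic: fill a verb's missing subject slot with the
# variable of the nearest preceding noun.
def fill_missing_subjects(predicates):
    """predicates: ordered (name, pos, args) triples; args is a mutable list
    where, for verbs, args[0] is the event and args[1] the subject slot."""
    last_noun = None
    for name, pos, args in predicates:
        if pos == "n":
            last_noun = args[0]
        elif pos == "v" and args[1] is None and last_noun is not None:
            args[1] = last_noun
    return predicates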
Question Answering Application
• Given a question, the task is to select the answer
from a set of candidate answers and to automatically justify that
it is the right answer
• Ideal case: all the keywords from the question together with their
syntactic relationship exist in the answer
– Question: What year did Hitler die?
– Perfect Answer: Hitler died in 1945.
• Real case:
– Real Answer: Hitler committed suicide in 1945.
– Requires extra resources to link suicide to die: use WordNet as a
knowledge base
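The linking knowledge is exactly the gloss chain shown at the start of the lecture; with NLTK's WordNet interface it can be retrieved directly (an illustrative sketch):

# Sketch: retrieve the glosses that chain "suicide" to "die".
from nltk.corpus import wordnet as wn

print(wn.synset('suicide.n.01').definition())  # gloss along the lines of "killing yourself"
print(wn.synset('kill.v.01').definition())     # gloss along the lines of "cause to die"
# Transformed into logic forms, these glosses supply the axioms that
# justify "committed suicide in 1945" as an answer to "What year did Hitler die?"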