Feb 22: Logic for Representing Natural Language
Feb 24: Syntax and Compositional Semantics of Clauses
Mar 1: Syntax and Compositional Semantics of NPs
Mar 3: Coordination and Comparison OR Inference, Coreference, and Metonymy
Logic and natural language are the best two ways we know of representing information / knowledge
Natural language is too variable to compute with
Logic has traditionally been too narrow in what can be represented
A goal of knowledge representation research:
Develop logics with expressivity closer to natural language
Propositional constants: P, Q, R, ... Have values of either True or False.
Logical connectives:
and: &
or: v (the Latin word for “or” is “vel”)
not: ~ or ¬
equivalent, iff: <-->
imply: -->
Defined by truth tables:
P & Q:
         Q: T   F
P: T        T   F
   F        F   F
P v Q:
         Q: T   F
P: T        T   T
   F        T   F
~P:
P: T   ~P: F
   F       T
“Material implication”: either P is false or Q is true
Definitions of <--> and -->:
[P <--> Q] <--> [[P --> Q] & [Q --> P]]
[P --> Q] <--> [~P v Q]
Properties of logical connectives:
Modus ponens : [P & [P-->Q]] --> Q
& and v are associative and commutative:
[[P & Q] & R] <--> [P & [Q & R]]
[P & Q] <--> [Q & P]
[[P v Q] v R] <--> [P v [Q v R]]
[P v Q] <--> [Q v P]
so we can write [P & Q & R & ...] and [P v Q v R v ...]
(What about [[P --> Q] --> R] <-?-> [P --> [Q --> R]]?)
Relating &, v and ~:
~[P & Q] <--> ~P v ~Q
~[P v Q] <--> ~P & ~Q
[P & [Q v R]] <--> [[P & Q] v [P & R]]
[P v [Q & R]] <--> [[P v Q] & [P v R]]
Double negation: ~~P <--> P
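Because the connectives are defined by truth tables, each of the equivalences above can be checked mechanically by trying every assignment of True and False. The following is a minimal Python sketch of that idea; the helper names (implies, iff, is_tautology) are chosen here for illustration and are not part of the lecture.

```python
from itertools import product

def implies(p, q):
    # Material implication: either p is false or q is true.
    return (not p) or q

def iff(p, q):
    # Equivalence: both sides have the same truth value.
    return p == q

def is_tautology(formula, n_vars):
    """True if the formula holds under every assignment of truth values."""
    return all(formula(*values)
               for values in product([True, False], repeat=n_vars))

modus_ponens = lambda p, q: implies(p and implies(p, q), q)
de_morgan_and = lambda p, q: iff(not (p and q), (not p) or (not q))
de_morgan_or = lambda p, q: iff(not (p or q), (not p) and (not q))
distribute_and = lambda p, q, r: iff(p and (q or r), (p and q) or (p and r))
distribute_or = lambda p, q, r: iff(p or (q and r), (p or q) and (p or r))
double_negation = lambda p: iff(not (not p), p)
# The parenthesized question about --> can be tested the same way.
implication_assoc = lambda p, q, r: iff(implies(implies(p, q), r),
                                        implies(p, implies(q, r)))

for name, formula, n in [("modus ponens", modus_ponens, 2),
                         ("De Morgan (&)", de_morgan_and, 2),
                         ("De Morgan (v)", de_morgan_or, 2),
                         ("distribute &", distribute_and, 3),
                         ("distribute v", distribute_or, 3),
                         ("double negation", double_negation, 1),
                         ("--> associative?", implication_assoc, 3)]:
    print(name, is_tautology(formula, n))
```

The check is exhaustive only because propositional constants have just two possible values; it does not carry over directly to first-order logic.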
Clause form :
[P1 v P2 v ~P3 v ....] & [Q1 v ~Q2 v ...]
Each bracketed disjunction is a clause.
Negation applies only to propositional constants, not larger expressions.
Disjunctions (v) appear at the midlevel, outscoping negation and outscoped by conjunctions.
Conjunction at the highest level.
Eliminate --> and <--> with the rules
[P <--> Q] ==> [[P --> Q] & [Q --> P]]
[P --> Q] ==> [~P v Q]
Push ~ all the way inside with the rules:
~[P & Q] ==> ~P v ~Q
~[P v Q] ==> ~P & ~Q
Push v inside & with the rule:
[P v [Q & R]] ==> [[P v Q] & [P v R]]
Eliminate double negations with the rule:
~~P ==> P
It is always possible to reduce an expression to clause form.
(Conjunctive Normal Form)
~[P & [Q --> R]]
~[P & [~Q v R]]          Eliminate -->
~P v ~[~Q v R]           Move ~ inside
~P v [~~Q & ~R]          Move ~ inside
~P v [Q & ~R]            Cancel double negation
[~P v Q] & [~P v ~R]     Distribute v through &
Clause form with two clauses
We can rewrite this as two rules:
P --> Q and
P & R --> False
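The rewriting steps can be sketched in Python as follows. This is an illustrative sketch, not a definitive implementation: formulas are encoded here as nested tuples with the operator names 'not', 'and', 'or', 'imp' and 'iff', and propositional constants are plain strings; that encoding and all function names are assumptions made for the example.

```python
def eliminate_arrows(f):
    """Rewrite --> and <--> in terms of ~, v and &."""
    if isinstance(f, str):
        return f
    op, *args = f
    args = [eliminate_arrows(a) for a in args]
    if op == 'imp':
        return ('or', ('not', args[0]), args[1])
    if op == 'iff':
        a, b = args
        return ('and', ('or', ('not', a), b), ('or', ('not', b), a))
    return (op, *args)

def push_not(f):
    """Move ~ inward and cancel double negations (arrows already removed)."""
    if isinstance(f, str):
        return f
    op, *args = f
    if op == 'not':
        g = args[0]
        if isinstance(g, str):
            return f
        gop, *gargs = g
        if gop == 'not':
            return push_not(gargs[0])
        if gop == 'and':
            return ('or', push_not(('not', gargs[0])), push_not(('not', gargs[1])))
        if gop == 'or':
            return ('and', push_not(('not', gargs[0])), push_not(('not', gargs[1])))
    return (op, *[push_not(a) for a in args])

def distribute(f):
    """Push v inside & so that & ends up at the highest level."""
    if isinstance(f, str) or f[0] == 'not':
        return f
    op, a, b = f
    a, b = distribute(a), distribute(b)
    if op == 'or':
        if not isinstance(a, str) and a[0] == 'and':
            return ('and', distribute(('or', a[1], b)), distribute(('or', a[2], b)))
        if not isinstance(b, str) and b[0] == 'and':
            return ('and', distribute(('or', a, b[1])), distribute(('or', a, b[2])))
    return (op, a, b)

def literals(f):
    """Collect the literals of one disjunction as strings such as 'P' or '~P'."""
    if isinstance(f, str):
        return {f}
    if f[0] == 'not':
        return {'~' + f[1]}
    return literals(f[1]) | literals(f[2])

def clauses(f):
    """Split a formula in clause form into a set of clauses."""
    if not isinstance(f, str) and f[0] == 'and':
        return clauses(f[1]) | clauses(f[2])
    return {frozenset(literals(f))}

def to_clause_form(f):
    return clauses(distribute(push_not(eliminate_arrows(f))))

# The worked example: ~[P & [Q --> R]]
example = ('not', ('and', 'P', ('imp', 'Q', 'R')))
for clause in to_clause_form(example):
    print(sorted(clause))
```

For the worked example this yields the two clauses ~P v Q and ~P v ~R. Representing each clause as a set of literals relies on v being associative and commutative.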
Literals: P, ~Q, R, ...
Positive literals: P, R, ...
Horn clause : A clause with at most one positive literal.
~P v ~Q v R equivalent to [P & Q] --> R
~P v ~Q equivalent to [P & Q] --> False
Definite Horn clause: A clause with exactly one positive literal.
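As a small illustration of these definitions, the sketch below classifies a clause and prints it as a rule. A clause is assumed to be a list of literal strings, with '~' marking a negative literal; the function names are hypothetical.

```python
def positive_literals(clause):
    # Positive literals are the ones without a leading '~'.
    return [lit for lit in clause if not lit.startswith('~')]

def is_horn(clause):
    # Horn clause: at most one positive literal.
    return len(positive_literals(clause)) <= 1

def is_definite(clause):
    # Definite Horn clause: exactly one positive literal.
    return len(positive_literals(clause)) == 1

def as_rule(clause):
    """Render a Horn clause as an implication, e.g. [P & Q] --> R."""
    if not is_horn(clause):
        raise ValueError("not a Horn clause")
    body = [lit[1:] for lit in clause if lit.startswith('~')]
    head = positive_literals(clause)
    conclusion = head[0] if head else 'False'
    if not body:
        return conclusion
    return '[' + ' & '.join(body) + '] --> ' + conclusion

print(as_rule(['~P', '~Q', 'R']))   # [P & Q] --> R
print(as_rule(['~P', '~Q']))        # [P & Q] --> False
```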
Propositional logic: Don’t look inside propositions: P, Q, R, ...
First-order logic: Look inside propositions: p(x,y), like(J,M), ...
Constants: John1, Sam1, ..., Chair-46, ..., 0, 1, 2, ...
Variables: x, y, z, ....
Predicate symbols: p, q, r, ..., like, hate, ...
Function symbols: motherOf, sumOf, ...
All the logical connectives of propositional logic.
Predicates and functions apply to a fixed number of arguments:
Predicates: like(John1,Mary1), hate(Mary1,George1), tall(Sue3), ...
Functions: motherOf(Sam1) = Mary1, sumOf(2,3) = 5, ...
In the expression 3 + 2 > 4, + is a function and > is a predicate.
Predicates applied to arguments are propositions and yield True or False.
Functions applied to arguments yield entities in the domain.
Two different roles for variables:
Recall from high-school algebra:
(x + y)(x - y) = x^2 - y^2        universal statement: (A x,y)[(x + y)(x - y) = x^2 - y^2]
x^2 - 7x + 12 = 0                 existential statement: (E x)[x^2 - 7x + 12 = 0]
Universal quantifier: A or ∀: statement is true for all values of variable
Existential quantifier: E or ∃: statement is true for some value of variable
In (A x)[p(x) & q(y)] x is bound by the quantifier; y is not.
Both are in the scope of the quantifier.
We’ll only use variables that are bound by a quantifier.
The quantifier tells how the variable is being used.
Relation between A and E:
~(A x) p(x) <--> (E x)~p(x)        Negation can be moved inside
(A x) p(x) <--> (A y) p(y)         The variable doesn’t matter
(A x)[p(x)] & Q <--> (A x)[p(x) & Q], where no x in Q        No harm scoping over what doesn’t involve the variable
Eliminate --> and <-->
Move negation to the inside
Give differently quantified variables different names:
(A x)p(x) & (E x)q(x) ==> (A x)p(x) & (E y)q(y)
Eliminate existential quantifiers with Skolem constants and functions :
(E x)p(x) ==> p(A)                          Skolem constant
(A x)(E y)p(x,y) ==> (A x)p(x,f(x))         Skolem function
Move universal quantifiers to outside:
(A x)p(x) & (A y)[q(y) v r(y,f(y))] ==> (A x)(A y)[p(x) & [q(y) v r(y,f(y))]]
Prenex form: the quantifier prefix (A x)(A y) followed by the quantifier-free matrix.
Put matrix into clause form
(A x)[(E y)[p(x,y) --> q(x,y)] --> (E y)[r(x,y)]]
(A x)[~(E y)[~p(x,y) v q(x,y)] v (E y)[r(x,y)]]        Eliminate implication
(A x)[(A y)~[~p(x,y) v q(x,y)] v (E y)[r(x,y)]]        Move negation inside
(A x)[(A y)[p(x,y) & ~q(x,y)] v (E y)[r(x,y)]]         Move negation inside
(A x)[(A y)[p(x,y) & ~q(x,y)] v (E z)[r(x,z)]]         Rename variables
(A x)[(A y)[p(x,y) & ~q(x,y)] v r(x,f(x))]             Skolem function
(A x)(A y)[[p(x,y) & ~q(x,y)] v r(x,f(x))]             Prenex form
(A x)(A y)[[p(x,y) v r(x,f(x))] & [~q(x,y) v r(x,f(x))]]        Distribute v inside &
p(x,y) v r(x,f(x)), ~q(x,y) v r(x,f(x))                Break into clauses
Horn clause : A clause with at most one positive literal.
~p(x,y) v ~q(x) v r(x,y) is equivalent to
[p(x,y) & q(x)] --> r(x,y)
The antecedent is the procedure body and the consequent is the procedure name: the key idea in Prolog.
Implicative normal form:
(A x,y)[[p1(x,y) & p2(x,y) & ...] --> (E z)[q1(x,z) & q2(x,z) & ...]]
Useful for commonsense knowledge:
(A x)[[car(x) & intact(x)] --> (E z)[engine(z) & in(z,x)]]
Every intact car has an engine in it.
Logical theory :
The logic as we have defined it so far
+ A set of logical expressions that are taken to be true ( axioms )
Rules of inference :
Modus Ponens : From P , P --> Q infer Q
Universal instantiation : From (A x)p(x) infer p(A)
Theorems :
Expressions that can be derived from the axioms and the rules of inference.
What do the logical symbols mean? What do the axioms mean?
A logical theory is used to describe some domain.
We assign an individual or entity in the domain to each constant (the denotation of that constant).
To each unary predicate we assign a set of entities in the domain, those entities for which the predicate is true (the denotation or extension of the predicate).
To each binary predicate we assign a set of ordered pairs of entities, etc.
~P : true when P is not true.
P & Q : true when P is true and Q is true
P v Q : true when P is true or when Q is true
p(A) : true when the denotation of A is in the set assigned to p
(A x)p(x) : true when, for every assignment of an entity in the domain to x, that entity is in the set assigned to p
If all the axioms of the logical theory are true, then the domain is a model of the theory.
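These truth conditions can be sketched over a small finite domain. Everything in the example below is hypothetical: the three-entity domain, the denotations, and the extra predicate person were invented for illustration, although the constants and the predicates tall and like echo the earlier examples.

```python
# Hypothetical domain and denotations, chosen only for illustration.
domain = {"john", "mary", "sue"}
denotation = {"John1": "john", "Mary1": "mary", "Sue3": "sue"}
unary = {"tall": {"sue"}, "person": {"john", "mary", "sue"}}
binary = {"like": {("john", "mary")}}

def holds1(pred, const):
    # p(A): the denotation of A is in the set assigned to p.
    return denotation[const] in unary[pred]

def holds2(pred, c1, c2):
    # Binary predicates are assigned sets of ordered pairs.
    return (denotation[c1], denotation[c2]) in binary[pred]

def forall(pred):
    # (A x)p(x): every entity in the domain is in the set assigned to p.
    return all(entity in unary[pred] for entity in domain)

def exists_not(pred):
    # (E x)~p(x): some entity is outside the set assigned to p.
    return any(entity not in unary[pred] for entity in domain)

def is_model(axioms):
    # The domain is a model of the theory if every axiom is true in it.
    return all(axiom() for axiom in axioms)

axioms = [
    lambda: holds1("tall", "Sue3"),
    lambda: holds2("like", "John1", "Mary1"),
    lambda: forall("person"),
]
print(is_model(axioms))
print(is_model(axioms + [lambda: forall("tall")]))
```

Adding the axiom (A x)tall(x) makes this particular domain fail as a model, since only one entity is in the extension of tall.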
Logical theory:
Predicate: sum(x,y,z) (x is the “sum” of y and z)
Axiom 1: (A x,y,z,w)[(E u)[sum(u,x,y) & sum(w,u,z)]
<--> (E v)[sum(v,y,z) & sum(w,x,v)]]
(associativity)
Some models: addition of numbers, multiplication of numbers, concatenation of strings
Add Axiom 2: (A x,y,w)[sum(w,x,y) <--> sum(w,y,x)]
(commutativity)
Some models: addition of numbers, multiplication of numbers (concatenation of strings is no longer a model, since it is not commutative)
In general, adding axioms eliminates models.
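One way to see how an axiom eliminates a model is to test candidate interpretations on sample values. The sketch below assumes sum(w,x,y) is read as "w is the result of applying a total operation op to x and y", so Axiom 1 reduces to (x op y) op z = x op (y op z) and Axiom 2 to x op y = y op x. The sample values and function names are hypothetical, and a finite check like this can refute a candidate model but cannot prove that the axioms hold for all numbers or strings.

```python
from itertools import product

def associative(op, samples):
    # Axiom 1 under the functional reading: (x op y) op z == x op (y op z).
    return all(op(op(x, y), z) == op(x, op(y, z))
               for x, y, z in product(samples, repeat=3))

def commutative(op, samples):
    # Axiom 2 under the functional reading: x op y == y op x.
    return all(op(x, y) == op(y, x)
               for x, y in product(samples, repeat=2))

numbers = [0, 1, 2, 3, 5]
strings = ["", "a", "b", "ab"]

candidates = {
    "addition of numbers": (lambda x, y: x + y, numbers),
    "multiplication of numbers": (lambda x, y: x * y, numbers),
    "concatenation of strings": (lambda x, y: x + y, strings),
}

for name, (op, samples) in candidates.items():
    print(name, associative(op, samples), commutative(op, samples))
```

On these samples all three candidates pass the associativity check, and only concatenation of strings fails commutativity.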
Consistency : A theory is consistent if you can’t conclude a contradiction.
If a logical theory has a model, it is consistent.
Independence : Two axioms are independent if you can’t prove one from the other.
To show two axioms are independent, show that there is a model in which one is true and the other isn’t true.
Soundness : All the theorems of the logical theory are true in the model. (Precision = 100%)
Completeness : All the true statements in the model are theorems in the logical theory. (Recall = 100%)
The logical theory should tell the whole truth (complete) and nothing but the truth (sound)
Intension (not “intention”)
Extension: “president”    predicate --> set of entities
    ...., Clinton, Bush, Obama
Intension: “president”    predicate --> set of possible world-entity pairs
    W1: ...., Clinton, Bush, Obama
    W2: ...., Clinton, Gore, Obama
    W3: ...., Clinton, Bush, McCain
Frees meaning of predicate from accidents of how the world is
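The difference between the two can be sketched as data. In the minimal example below, the intension of “president” is a mapping from possible worlds to sets of entities, and an extension is what that mapping returns for one world. The entries elided by "...." above are simply left out, and treating W1 as the actual world is an assumption made for the example.

```python
# Intension of "president": one set of entities per possible world.
# Entries elided in the lists above are omitted here.
president_intension = {
    "W1": {"Clinton", "Bush", "Obama"},
    "W2": {"Clinton", "Gore", "Obama"},
    "W3": {"Clinton", "Bush", "McCain"},
}

def extension(intension, world):
    # The extension is the set of entities the predicate holds of in one world.
    return intension[world]

def world_entity_pairs(intension):
    # The same intension viewed as a set of possible world-entity pairs.
    return {(world, entity)
            for world, entities in intension.items()
            for entity in entities}

actual_world = "W1"  # assumption for this example
print(sorted(extension(president_intension, actual_world)))
print(("W2", "Gore") in world_entity_pairs(president_intension))
```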
Logic is about representing information.
Language conveys information.
Logic is a good way to represent the information conveyed by language.
A man builds a boat.
(E x,y)[man(x) & build(x,y) & boat(y)]
A tall man builds a small boat.
(E x,y)[tall(x) & man(x) & build(x,y) & small(y) & boat(y)]
Seems simple enough, but problems arise.
(e.g., the determiner “a”, the present tense, tall/small for what)
Two ways to deal with these problems:
Complicate the logic. (Much computational semantics)
Complicate our conceptualization of the underlying domain. (Me +)
(Davidson)
Events can be modified: John ran slowly.
Events can be placed in space and time: On Tuesday, John ran in Chicago.
Events can be causes and effects: John ran, because Sam was chasing him. Because John ran, he was tired.
Events can be objects of propositional attitudes: Sam believes John ran.
Events can be nominalized: John’s running tired him out.
Events can be referred to by pronouns: John ran, and Sam saw it.
To represent these, we need some kind of “handle” on the event.
We need constants and variables to be able to denote events.
We need to treat events as “things”: reify events (from Latin “re(s)”, thing)
Let e1 be John’s running. Then
slow(e1)
onDay(e1, ...), in(e1, Chicago)
cause(..., e1), cause(e1, ...)
believe(Sam,e1)
tiredOut(e1, John)
see(Sam, e1)
Why not this?
slow( run(John) )
Here run(John) evaluates to True or False, so slow would describe not John’s running, but True or False.
e1: run(John)
This is easily understood, but it takes us out of logic.
run’(e1,John)
This means “e1 is the event of John’s running”.
I’ll use this when I need to; run(John) otherwise.
Not just events, but states, conditions, properties:
John fell because the floor was slippery.
cause(e1,e2) & fall’(e2, j) & slippery’(e1, f)
The contract was invalid because John failed to sign it.
cause(e1,e2) & invalid’(e2,c) & fail’(e1,j, e3) & sign’(e3,j,c)
I will use the word “eventuality” to describe all these things -- events, states, conditions, etc.
Controversial
Jenny pushed the chair from the living room to the dining room for Sam yesterday.
Case : Agent (Jenny), Theme (the chair), Source (the living room), Goal (the dining room), Benefactor (Sam), Time (yesterday)
Could represent this like
push(Jenny, Chair1, LR, DR, Sam, 21Feb11, ...)
Or like
push’(e) & Agent(Jenny,e) & Theme(Chair1,e) & Source(LR,e) & Goal(DR,e) & Benefactor(Sam,e) & atTime(e, 21Feb11)
Or like
push’(e, Jenny, Chair1) & from(e, LR) & to(e, DR) & for(e, Sam) & yesterday(e, ...)
(push’(e, Jenny, Chair1) comes from the complements; the remaining conjuncts come from the adjuncts)
Equivalence of these: (A e,x,y)[push’(e,x,y) --> Agent(x,e) & Theme(y,e)]
John ran.
run’(e,J) & Past(e)        (Past(e) represents the tense)
John ran on Tuesday.
run’(e,J) & Past(e) & onDay(e,d) & Tuesday(d)
John ran in Chicago.
run’(e,J) & Past(e) & in(e,Chicago)
John ran slowly.
run’(e,J) & Past(e) & slow(e)
John ran reluctantly.
run’(e,J) & Past(e) & reluctant(J,e)
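Since every modifier is just another conjunct about the same event variable, the representations above can be stored as a flat set of atomic facts and queried through the event "handle". The sketch below is one hypothetical encoding: facts are tuples of a predicate name followed by its arguments, and e1 and d1 are constants invented for the example.

```python
# "John ran slowly in Chicago on Tuesday", as conjuncts about the event e1.
# The tuple encoding and the constants e1 and d1 are assumptions.
facts = {
    ("run'", "e1", "John"),
    ("Past", "e1"),
    ("onDay", "e1", "d1"),
    ("Tuesday", "d1"),
    ("in", "e1", "Chicago"),
    ("slow", "e1"),
}

def about(entity, facts):
    """Return every fact that has the given entity among its arguments."""
    return sorted(fact for fact in facts if entity in fact[1:])

for fact in about("e1", facts):
    print(fact)
```

Adding another modifier, such as reluctant(J,e), only adds one more tuple; the arity of run’ does not change.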
Some attributive adjectives have an implicit comparison set or scale:
A small elephant is bigger than a big mosquito.
That mosquito is big.
mosquito(x) & big(x, s)
where s is the implicit comparison set or scale, which must be determined from context
Proper names:
Could treat them as constants:
Springfield is the capital of Illinois. ==> capital(Springfield, Illinois)
But there are many Springfields; we could treat it as a predicate true of any town named Springfield:
capital(x,y) & Springfield(x) & Illinois(y)
Or we could treat the name as a string, related to the entity by the predicate name :
capital(x,y) & name(“Springfield”, x) & name(“Illinois”, y)
An indexical or deictic is a word or phrase that requires knowledge of the situation of utterance for its interpretation.
“I”, “you”, “we”, “here”, “now”, some uses of “this”, “that”, ...
The property of being “I” is being the speaker of the current utterance
Indexicals require an argument for the utterance or the speech situation.
I(x,u) : x is the speaker of utterance u
you(x,u) : x is the intended hearer of utterance u
we(s,u) : s is a set of people containing the speaker of utterance u
here(x,u) : x is the place of utterance u
now(t,u) : t is the time of utterance u
Chris said, “I see you now.”
==> say(Chris,u) & content(e,u) & see’(e,x,y) & I(x,u) & you(y,u) & atTime(e,t) & now(t,u)
(the utterance u comes from the quotation marks)
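The role of the utterance argument can be sketched as a lookup against a record of the speech situation. In the example below the Utterance class, the resolve function, and the hearer, time and place values are all hypothetical; only the speaker, Chris, comes from the example above. The indexical “we” is left out because it denotes a set rather than a single entity.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Utterance:
    # A hypothetical record of the situation of utterance.
    speaker: str
    hearer: str
    time: str
    place: str

def resolve(word, u):
    """Return the entity an indexical picks out relative to utterance u."""
    table = {
        "I": u.speaker,     # I(x,u): x is the speaker of u
        "you": u.hearer,    # you(x,u): x is the intended hearer of u
        "now": u.time,      # now(t,u): t is the time of u
        "here": u.place,    # here(x,u): x is the place of u
    }
    return table[word]

# Chris said, "I see you now." The hearer, time and place are made up.
u = Utterance(speaker="Chris", hearer="Pat", time="t0", place="p0")
print([resolve(word, u) for word in ["I", "you", "now"]])
```

The same word resolves differently for a different utterance, which is why the predicates take u as an argument.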