October 25, 2006 11-721: Grammars and Lexicons Lori Levin

Lexical Functional Grammar
• History:
– Joan Bresnan (linguist, MIT and Stanford)
– Ron Kaplan (computational psycholinguist,
Xerox PARC)
– Around 1978
What is Linguistic Theory
• Delimit the range of possible human languages.
– What do all languages have in common?
• Semantic roles, grammatical relations, pragmatic relations, some
constituent structure; only subjects can be controllees in matrix
coding as subject constructions; etc.
– What are the ways in which they can differ from each
other?
• Relative prominence of grammatical or pragmatic relations: word
order reflects grammatical relations in English and reflects focus
(new information) in Hungarian; Topic takes precedence over
subject in Chinese in determining antecedent of null pronouns;
Subject is more prominent in English.
– What never happens in a human language?
• Make a question by saying the sentence backwards.
Universalist view of language
• There is “a common organizing structure of
all languages that underlies their superficial
variations in modes of expression”
(Bresnan)
– E.g., Passives that look very different in
different languages can be described by a
universal passive rule.
• The common organizing structure is part of
human biology.
Some differences between English and Warlpiri
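English:
[s [np The two small children] [vp' [aux are] [vp [v chasing] [np that dog]]]]
Warlpiri:
[s [np Wita-jarra-rlu] [aux ka-pala] [v wajili-pi-nyi] [np yalumpu] [np kurdu-jarra-rlu] [np maliki]]
Wita-jarra-rlu ka-pala wajili-pi-nyi yalumpu kurdu-jarra-rlu maliki.
small-DU-ERG PRES-3duSUBJ chase-NPAST that.ABS child-DU-ERG dog.ABS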
Possible word orders in Warlpiri that are
not possible in English
• *The two small are chasing that children
dog.
• *The two small are dog chasing that
children.
• *Chasing are the two small that dog
children.
• *That are children chasing the two small
dog.
Non-configurational languages
• Free word order.
• May have discontinuous constituents.
• Tests for constituency do not yield evidence
for VP constituent.
Something that English and Warlpiri
have in common
• Lucy is hitting herself.
• *Herself is hitting Lucy.
• Napaljarri-rli ka-nyanu paka-rni
Napaljarri-ERG PRES-REFL hit-NONPAST
“Napaljarri is hitting herself.”
• *Napaljarri ka-nyanu paka-rni
Napaljarri.ABS PRES-REFL hit-NONPAST
“Herself is hitting Napaljarri.”
What English and Warlpiri have in common
according to Chomsky
English
Deep structure: [s np [vp' aux [vp v np]]]
Surface structure: [s np [vp' aux [vp v np]]]
What English and Warlpiri have in common
according to Chomsky
Warlpiri
Deep structure: [s np [vp' aux [vp v np]]]
Surface structure: [s np aux v np np np]
What English and Warlpiri have in
common according to Bresnan
• Same grammatical relations and semantic roles
– SUBJECT: the two small children: AGENT
– PREDICATE: are chasing
– OBJECT: that dog: PATIENT
• Different codings of grammatical relations:
– English subject: NP immediately under S
– Warlpiri subject: Ergative case marked NP (if verb is
transitive)
Strength of Chomsky’s approach
• Proposing that there is a VP in all languages
explains why there are subject-object
asymmetries in all languages.
Strength of Bresnan’s approach
• Doesn’t propose non-existent VPs:
– phrase structure is used for representing
constituency
– A different representation is used for
grammatical relations
Challenges for Bresnan and
Chomsky
• Bresnan:
– explain subject-object asymmetries in the absence of a
VP
– Explain in a principled way the range of possible
coding properties of grammatical relations
• Chomsky:
– explain in a principled way how the words get
scrambled out of VP;
– The phrase structure tree has to represent both
grammatical relations and constituent structure, which
may conflict with each other.
Levels of Representation in LFG
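Constituent structure: [s [np The bear] [vp ate [np a sandwich]]]
Grammatical encoding links the constituent structure to the functional structure.
Functional structure: SUBJ, PRED eat, OBJ
Lexical mapping links the functional structure to the thematic roles:
eat < agent patient >, with agent = SUBJ and patient = OBJ
Thematic roles: agent, patient
Grammatical encoding for English!!!
S → NP VP (the NP is the SUBJ)
VP → V NP (the NP is the OBJ)
VP → V PP (the PP is the OBL)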
Syntax
• Syntax is not about the form (phrase
structure) of sentences.
• It is about how strings of words are associated
with their semantic roles.
– Phrase structure is only part of the solution.
• Sam saw Sue
– Sam: perceiver
– Sue: perceived
Syntax
• Syntax is also about how to tell that two
sentences are thematic paraphrases of each
other (same phrases filling the same semantic
roles).
– It seems that Sam ate the sandwich.
– It seems that the sandwich was eaten by Sam.
– Sam seems to have eaten the sandwich.
– The sandwich seems to have been eaten by Sam.
How to associate phrases with
their semantic roles in LFG
• Starting from a constituent structure tree:
• Grammatical encoding tells you how to find
the subject.
– The bear is the subject.
• Lexical mapping tells you what semantic
role the subject has.
– The subject is the agent.
– Therefore, the bear is the agent.
Levels of Representation in LFG
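Constituent structure: [s [np The sandwich] [vp was eaten [pp by the bear]]]
Grammatical encoding links the constituent structure to the functional structure.
Functional structure: SUBJ, PRED eat, OBL
Lexical mapping links the functional structure to the thematic roles:
eat < agent patient >, with agent = OBL and patient = SUBJ
Thematic roles: patient, agent
Grammatical encoding for English!!!
S → NP VP (the NP is the SUBJ)
VP → V NP (the NP is the OBJ)
VP → V PP (the PP is the OBL)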
Active and Passive
• Active:
– Patient is mapped to OBJ in lexical mapping.
• Passive
– Patient is mapped to SUBJ in lexical mapping.
• Notice that the grammatical encodings are
the same for active and passive sentences!!!
Passive mappings
• Starting from the constituent structure tree.
• The grammatical encoding tells you that the
sandwich is the subject.
• The lexical mapping tells you that the subject is the
patient.
– Therefore, the sandwich is the patient.
• The grammatical encoding tells you that the bear is
oblique.
• The lexical mapping tells you that the oblique is
the agent.
– Therefore, the bear is the agent.
How you know that the active and
passive have the same meaning
• In both sentences, the mappings connect the
bear to the agent role.
• In both sentences, the mappings connect the
sandwich to the patient role (roll?)
• In both sentences, the verb is eat.
Levels of Representation in LFG
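Constituent structure: [s-bar [np what] [s did [np the bear] eat]]
Grammatical encoding links the constituent structure to the functional structure.
Functional structure: OBJ, SUBJ, PRED eat
Lexical mapping links the functional structure to the thematic roles:
eat < agent patient >, with agent = SUBJ and patient = OBJ
Thematic roles: patient, agent
Grammatical encoding for English!!!
S-bar → NP S (the NP is the OBJ)
S → NP VP (the NP is the SUBJ)
VP → V PP (the PP is the OBL)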
Wh-question
• Different grammatical encoding:
– In this example, the OBJ is encoded as the NP
immediately dominated by S-bar
• Same lexical mappings are used for:
– What did the bear eat?
– The bear ate the sandwich.
Principles
• Variability:
– Phrase structures and grammatical encodings
vary across languages.
• Universality
– Functional structures are largely invariant
across languages.
Functional Structure
[ SUBJ  [ PRED ‘bear’, NUM sg, PERS 3, DEF + ]
  PRED  ‘eat< agent patient >’ (agent: SUBJ, patient: OBJ)
  TENSE past
  OBJ   [ PRED ‘sandwich’, NUM sg, PERS 3, DEF - ] ]
Functional Structure
• Pairs of attributes (features) and values
– Attributes (in this example): SUBJ, PRED,
OBJ, NUM, PERS, DEF, TENSE
– Values:
• Atomic: sg, past, +, etc.
• Feature structure:
[num sg, pred ‘bear’, def +, person 3]
• Semantic form: ‘eat<subj obj>’, ‘bear’, ‘sandwich’
Semantic Forms
• Why are they values of a feature called
PRED?
– In some approaches to semantics, even nouns
like bear are predicates (functions) that take one
argument and return true or false.
– Bear(x) is true when the variable x is bound to
a bear.
– Bear(x) is false when x is not bound to a bear.
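A minimal sketch of this view of nouns as one-place predicates, assuming a toy model in which the set of bears is simply listed. The individuals named here are hypothetical.

```python
# A toy model of the world: the individuals that count as bears.
# The names are hypothetical and only serve the illustration.
BEARS = {"Yogi", "Baloo"}


def bear(x):
    """PRED 'bear' viewed as a one-place predicate: True or False for x."""
    return x in BEARS


print(bear("Yogi"))
print(bear("Lucy"))
```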
Why is it called a Functional
Structure?
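A function pairs each input with exactly one output:
x:         1  2  3  4   5
x squared: 1  4  9  16  25
An f-structure pairs features with values in the same way.
Each feature has a unique value.
Also, another term for grammatical relation is
grammatical function.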
We will use the terms functional structure,
f-structure and feature structure interchangeably.
Give a name to each function
f1 [ SUBJ  f2 [ PRED ‘bear’, NUM sg, PERS 3, DEF + ]
     PRED  ‘eat< agent patient >’ (agent: SUBJ, patient: OBJ)
     TENSE past
     OBJ   f3 [ PRED ‘sandwich’, NUM sg, PERS 3, DEF - ] ]
How to describe an f-structure
• f1(TENSE) = past
– Function f1 applied to TENSE gives the value
past.
• f1(SUBJ) = [PRED ‘bear’, NUM sg, PERS
3, DEF +]
• f2(NUM) = sg
Descriptions can be true or false
• f(a) = v
– Is true if the feature-value pair [a v] is in f.
– Is false if the feature-value pair [a v] is not in f.
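One way to make this concrete is to model an f-structure as a Python dictionary, so that applying the function to an attribute is a dictionary lookup. This is only a sketch: the helper name `holds` is an assumption, and semantic forms are stored as plain strings.

```python
# The f-structures f1, f2, f3 from the earlier slide, as nested dictionaries.
f2 = {"PRED": "bear", "NUM": "sg", "PERS": 3, "DEF": "+"}
f3 = {"PRED": "sandwich", "NUM": "sg", "PERS": 3, "DEF": "-"}
f1 = {"SUBJ": f2, "PRED": "eat<SUBJ OBJ>", "TENSE": "past", "OBJ": f3}


def holds(f, a, v):
    """The description f(a) = v is true iff the pair [a v] is in f."""
    return a in f and f[a] == v


print(holds(f1, "TENSE", "past"))  # True
print(holds(f2, "NUM", "pl"))      # False
print(holds(f1, "SUBJ", f2))       # True
```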
This is the notation we really use
• (f1 TENSE) = past
• Read it this way:
f1’s tense is past.
• (f1 SUBJ) = [PRED ‘bear’, NUM sg, PERS
3, DEF +]
• (f2 NUM) = sg
Chains of function application
• (f1 SUBJ) = f2
• (f2 NUM) = sg
• ((f1 SUBJ) NUM) = sg
• Write it this way.
(f1 SUBJ NUM) = sg
• Read it this way.
“f1’s subject’s number is sg.”
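A chain such as (f1 SUBJ NUM) can be read as repeated lookup, one attribute at a time. The sketch below assumes f-structures are nested dictionaries; `apply_path` is a hypothetical helper name.

```python
from functools import reduce


def apply_path(f, *attributes):
    """(f A1 A2 ...): apply f to A1, apply the result to A2, and so on."""
    return reduce(lambda g, attribute: g[attribute], attributes, f)


f2 = {"PRED": "bear", "NUM": "sg", "PERS": 3, "DEF": "+"}
f1 = {"SUBJ": f2, "TENSE": "past"}

# ((f1 SUBJ) NUM) and (f1 SUBJ NUM) denote the same value.
nested = apply_path(apply_path(f1, "SUBJ"), "NUM")
chained = apply_path(f1, "SUBJ", "NUM")
print(nested, chained)
```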
More f-descriptions
• (f a) = v
– f is something that evaluates to a function.
– a is something that evaluates to an attribute.
– v is something that evaluates to a function, symbol, or
semantic form.
• (f1 subj) = (f1 xcomp subj)
– Used for matrix coding as subject. A subject is shared by
the main clause and the complement clause (xcomp).
• (f1 (f6 case)) = f6
– Used for obliques
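A sketch of these two kinds of equation, assuming f-structures are Python dictionaries. Sharing a subject then means that two paths lead to the very same object, and the oblique equation uses the value of one f-structure's CASE as the attribute name in another. The variable names `host` and `same_object` are assumptions made for the example.

```python
lions = {"PRED": "lion", "NUM": "pl", "PERS": 3}
f1 = {
    "PRED": "seem<XCOMP>SUBJ",
    "SUBJ": lions,
    "XCOMP": {"PRED": "live<SUBJ OBL-loc>", "VFORM": "INF"},
}

# (f1 SUBJ) = (f1 XCOMP SUBJ): both paths point at one shared f-structure.
f1["XCOMP"]["SUBJ"] = f1["SUBJ"]
same_object = f1["SUBJ"] is f1["XCOMP"]["SUBJ"]
print(same_object)

# (host (f6 CASE)) = f6: the value of f6's CASE names the attribute in host.
f6 = {"CASE": "OBL-loc", "PRED": "in<OBJ>"}
host = f1["XCOMP"]
host[f6["CASE"]] = f6
print(sorted(host))
```

Because the subject is shared rather than copied, anything learned about it through one path is also visible through the other.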
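Lions seem to live in the forest
C-structure:
[s [np [n Lions]] [vp [v seem] [vp-bar [comp to] [vp [v live] [pp [p in] [np [det the] [n forest]]]]]]]
F-structure:
[ SUBJ  [ PRED ‘lion’, NUM pl, PERS 3 ]
  PRED  ‘seem < theme > SUBJ’ (theme: XCOMP)
  TENSE pres
  VFORM fin
  XCOMP [ SUBJ [ ] (shared with the SUBJ of seem)
          VFORM INF
          PRED ‘live< theme loc >’ (theme: SUBJ, loc: OBL-loc)
          OBL-loc [ CASE OBL-loc
                    PRED ‘in<OBJ>’
                    OBJ [ PRED ‘forest’, NUM sg, PERS 3, DEF + ] ] ] ]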
(Figure: the c-structure and f-structure for “Lions seem to live in the forest”, with the c-structure nodes named n1 to n14 and the f-structures named f1 to f6.)
Properties of the mapping from c-structure to f-structure
• Each c-structure node maps onto at most one f-structure node.
• More than one c-structure node can map onto
the same f-structure node.
• An f-structure node does not have to
correspond to any c-structure node. (But the
information it contains does come from
somewhere – either a grammar rule or lexical
entry.)
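These three properties are those of a many-to-one partial function that need not be onto. A small sketch, assuming node and f-structure names as plain strings; the f-structure `f_unexpressed` is hypothetical and stands for one with no corresponding node.

```python
from collections import defaultdict

# A dictionary gives each c-structure node at most one f-structure.
phi = {"S": "f1", "NP": "f2", "N": "f2", "VP": "f1", "V": "f1"}

# All f-structures in the analysis; "f_unexpressed" is a hypothetical
# f-structure that no c-structure node corresponds to.
f_structures = {"f1", "f2", "f_unexpressed"}

# Invert phi to see which nodes share an f-structure.
nodes_for = defaultdict(list)
for node, f in phi.items():
    nodes_for[f].append(node)

print(sorted(nodes_for["f1"]))           # ['S', 'V', 'VP']
print(f_structures - set(phi.values()))  # {'f_unexpressed'}
```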
The formalism for grammatical encoding:
Local co-description of partial structures
• Φ is a mapping from c-structure nodes to f-structure nodes.
– There are other mappings to semantic structures,
argument structures, discourse structures, etc.
• * is the “current” c-structure node (me).
• Φ(*) is “my f-structure” (↓)
• m(*) is “my c-structure mother”
• Φ(m(*)) is “my c-structure mother’s f-structure” (↑)
Local co-description of partial structures
• S → NP VP
NP: (↑ SUBJ) = ↓
VP: ↑ = ↓
NP says: My mother’s f-structure has a SUBJ
feature whose value is my f-structure.
VP says: My mother’s f-structure is my f-structure.
This rule simultaneously describes a piece of c-structure and a piece of f-structure.
It is local because each equation refers only to the
current node and its mother. (pages 119-120)
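Because each equation mentions only ↑ and ↓, instantiating it needs just two names: the mother's f-structure and the current node's f-structure. A sketch, assuming equations are kept as strings and f-structure names are assigned by hand; `instantiate` is a hypothetical helper.

```python
def instantiate(equation, mother, me):
    """Replace ↑ with the mother's f-structure name and ↓ with my own."""
    return equation.replace("↑", mother).replace("↓", me)


# The annotated rule S → NP VP, one (category, equation) pair per daughter.
rule = [("NP", "(↑ SUBJ) = ↓"), ("VP", "↑ = ↓")]

# F-structure names assigned to the nodes (assumed for this example).
names = {"S": "f1", "NP": "f2", "VP": "f3"}

equations = [instantiate(eq, names["S"], names[cat]) for cat, eq in rule]
print(equations)  # ['(f1 SUBJ) = f2', 'f1 = f3']
```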
Other types of equations
• F-structure composition
– (↑ SUBJ NUM) = sg
– My f-structure has a subj feature, whose value is another
f-structure, which has a num feature, whose value is sg.
– Usually, path names are not longer than two.
• Two features pointing to the same value:
– (↑ SUBJ) = (↑ XCOMP SUBJ)
– (↑ SUBJ) = (↑ TOPIC)
• (↑ (↓ CASE)) = ↓ (Dalrymple pages 152-153)
– Sam walked in the park.
– (↓ CASE) = OBL-loc
– (↑ OBL-loc) = ↓
The minimal solution
• The f-structure for a sentence is the minimal
f-structure that satisfies all of the equations.
(page 101).
Building an F-structure: informal, for linguists
• Annotate
– Assign a variable name to the f-structure corresponding
to each c-structure node.
– May find out later that some of them are the same.
• Instantiate
– Replace the arrows with the variable names.
• Solve
– Locate the f-structure named on the left side of the
equation.
– Locate the f-structure named on the right side of the
equation
– Unify them.
– Replace both of them with the result of unification.
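The Solve step can be sketched as a small program. This is an illustrative solver under several assumptions, not the algorithm of any particular LFG system: equations are already instantiated, an equation is a pair (left side, right side), a left side is either a variable name or a tuple (variable, attribute, ...), and merged f-structures are tracked with forwarding pointers. The equation f2 = f4 is assumed here to link the NP to its head noun.

```python
class FS:
    """An f-structure; `forward` is set once it has been merged into another."""

    def __init__(self):
        self.features = {}
        self.forward = None


def find(x):
    """Follow forwarding pointers to the current representative."""
    while isinstance(x, FS) and x.forward is not None:
        x = x.forward
    return x


def unify(a, b):
    a, b = find(a), find(b)
    if a is b:
        return a
    if isinstance(a, FS) and not a.features:  # [] U x = x
        a.forward = b
        return b
    if isinstance(b, FS) and not b.features:
        b.forward = a
        return a
    if isinstance(a, FS) and isinstance(b, FS):
        b.forward = a  # both names now refer to the merged structure
        for attr, value in b.features.items():
            if attr in a.features:
                unify(a.features[attr], value)
            else:
                a.features[attr] = value
        return a
    if a == b:  # identical atomic values
        return a
    raise ValueError(f"unification fails: {a!r} vs {b!r}")


def solve(equations, variables):
    env = {name: FS() for name in variables}
    for lhs, rhs in equations:
        target = env[rhs] if rhs in env else rhs  # variable or atomic value
        if isinstance(lhs, tuple):  # (f ATTR ...) = rhs
            name, *path = lhs
            host = find(env[name])
            for attr in path[:-1]:
                host = find(host.features.setdefault(attr, FS()))
            if path[-1] in host.features:
                unify(host.features[path[-1]], target)
            else:
                host.features[path[-1]] = target
        else:  # f = g
            unify(env[lhs], target)
    return env


def to_dict(x):
    x = find(x)
    return {a: to_dict(v) for a, v in x.features.items()} if isinstance(x, FS) else x


variables = ["f1", "f2", "f3", "f4", "f5"]
equations = [
    (("f1", "SUBJ"), "f2"), ("f1", "f3"),
    ("f2", "f4"),  # assumption: NP and its head N share an f-structure
    ("f3", "f5"),
    (("f4", "PRED"), "lion"), (("f4", "NUM"), "pl"), (("f4", "PERS"), 3),
    (("f5", "VFORM"), "fin"), (("f5", "SUBJ", "NUM"), "pl"),
]
result = to_dict(solve(equations, variables)["f1"])
print(result)
```

If the noun's equation were (f4 NUM) = sg, the last equation would try to unify sg with pl and `solve` would raise `ValueError`, which corresponds to an agreement failure.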
Unification
• [], empty feature structure, is identity
element
– [] U x = x
• Atomic value unified with an atomic value:
– x U x = x
– x U y = fail
• Atomic value unified with a non empty
feature structure: fail
Unification
• Feature structure f1 unified with feature
structure f2 to make feature structure f3:
– The set of features is the union of the features in
f1 and f2.
– The value of each feature in f3 is the value of
that feature in f1 unified with the value of that
feature in f2.
– Keep going recursively if there are embedded
feature structures.
– If any unification fails, then the whole thing
fails.
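The definition on these two slides translates almost line by line into a recursive function. The sketch below assumes feature structures are Python dictionaries and atomic values are strings or numbers; it builds a new structure rather than modifying its inputs, and it does not model shared (re-entrant) values. The exception name is an assumption.

```python
class UnificationFailure(Exception):
    """Raised when two values cannot be unified."""


def unify(x, y):
    # [] is the identity element: [] U x = x
    if x == {}:
        return y
    if y == {}:
        return x
    # Two feature structures: union of the features, unifying shared ones.
    if isinstance(x, dict) and isinstance(y, dict):
        result = dict(x)
        for feature, value in y.items():
            result[feature] = unify(x[feature], value) if feature in x else value
        return result
    # Two atomic values: x U x = x, anything else fails.
    # An atom against a non-empty feature structure also ends up here.
    if x == y:
        return x
    raise UnificationFailure(f"{x!r} does not unify with {y!r}")


f1 = {"num": "sg", "gender": "masc", "person": 3}
f2 = {"case": "nom", "def": "+", "person": 3}
f3 = unify(f1, f2)
print(f3)

try:
    unify(f1, {"case": "nom", "def": "+", "person": 2})
except UnificationFailure as failure:
    print("fails:", failure)
```

A failure anywhere in an embedded structure propagates upward as an exception, so the whole unification fails.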
Unification and Grammaticality
• Grammatical sentence:
– All unifications succeed and
– Phrase structure derivation succeeds
• Ungrammatical sentence:
– At least one unification fails or
– Phrase structure derivation fails
Unification Example
f1 [ num sg
     gender masc
     person 3 ]
f2 [ case nom
     def +
     person 3 ]
f1 U f2 = f3
f3 [ num sg
     gender masc
     person 3
     case nom
     def + ]
Unification Example
f1 [ num sg
     gender masc
     person 3 ]
f2 [ case nom
     def +
     person 2 ]
Unification fails. No f-structure is produced.
Unification Example
f1 [ subj [ num sg
            gender masc
            person 3 ]
     tense pres ]
f2 [ subj [ case nom
            def +
            person 3 ]
     tense pres
     neg + ]
f1 U f2 = f3
f3 [ subj [ num sg
            gender masc
            person 3
            case nom
            def + ]
     tense pres
     neg + ]
Unification Example
f1 [ subj [ num sg
            gender masc
            person 2 ]
     tense pres ]
f2 [ subj [ case nom
            def +
            person 3 ]
     tense pres
     neg + ]
Unification fails. No f-structure is produced.
Rule:
S → NP VP
NP: (↑ SUBJ) = ↓
VP: ↑ = ↓
    (↑ VFORM) = fin
Instantiated equations:
(f1 SUBJ) = f2
f1 = f3
(Figure: the c-structure and f-structure for “Lions seem to live in the forest”, with S labeled f1, NP labeled f2, and VP labeled f3. In the f-structure, f1 and f3 name the outermost f-structure and f2 names its SUBJ.)
lion: N
(↑ PRED) = ‘lion’
-s (suffix for nouns)
(↑ NUM) = pl
(↑ PERS) = 3
seem: V
(↑ PRED) = ‘seem < theme > SUBJ’ (theme: XCOMP)
(↑ SUBJ) = (↑ XCOMP SUBJ)
-Ø (suffix for verbs)
(↑ VFORM) = fin
(↑ SUBJ NUM) = pl
(Figure: the c-structure and f-structure for “Lions seem to live in the forest”, with the N Lions labeled f4 and the V seem labeled f5.)
lion: N
(f4 PRED) = ‘lion’
-s (suffix for nouns)
(f4 NUM) = pl
(f4 PERS) = 3
seem: V
(f5 PRED) = ‘seem < theme > SUBJ’ (theme: XCOMP)
(f5 SUBJ) = (f5 XCOMP SUBJ)
-Ø (suffix for verbs)
(f5 VFORM) = fin
(f5 SUBJ NUM) = pl
(Figure: the c-structure and f-structure for “Lions seem to live in the forest”, with the N Lions labeled f4 and the V seem labeled f5.)
• What is an XCOMP?
A non-finite clause, predicate nominal, predicate
adjective, or predicate PP
– Sam seemed to be happy (VP)
– Sam seemed happy (AP)
– Sam became a teacher (NP)
– We had them arrested (VP)
– We kept them in the drawer (PP)
• Has to be an argument of a verb:
– Arrested by the police, Sam had no alternative but to give up
his life of crime.
• This is an adjunct, not an XCOMP
• Gets its subject by sharing with another verb:
– I think that Sam is happy.
• This is a COMP, not an XCOMP
VP → V VP
V: ↑ = ↓
VP: (↑ XCOMP) = ↓
seem: V
(↑ PRED) = ‘seem < theme > SUBJ’ (theme: XCOMP)
(↑ SUBJ) = (↑ XCOMP SUBJ)
(↑ XCOMP VFORM) = INF
-Ø (suffix for verbs)
(↑ VFORM) = fin
(↑ SUBJ NUM) = pl
to: COMP
(↑ VFORM) = INF
-Ø (suffix for verbs)
(↑ VFORM) = INF
live: V
(↑ PRED) = ‘live<theme loc>’ (theme: SUBJ, loc: OBL)
(Figure: the c-structure and f-structure for “Lions seem to live in the forest”, with the upper VP labeled f3, the V seem f5, VP-bar f8, COMP f6, the lower VP f9, and the V live f7.)
VP → V VP
V: f3 = f5
VP: (f3 XCOMP) = f8
seem: V
(f5 PRED) = ‘seem < theme > SUBJ’ (theme: XCOMP)
(f5 SUBJ) = (f5 XCOMP SUBJ)
(f5 XCOMP VFORM) = INF
-Ø (suffix for verbs)
(f5 VFORM) = fin
(f5 SUBJ NUM) = pl
to: COMP
(f6 VFORM) = INF
-Ø (suffix for verbs)
(f7 VFORM) = INF
live: V
(f7 PRED) = ‘live<theme loc>’ (theme: SUBJ, loc: OBL)
(Figure: the c-structure and f-structure for “Lions seem to live in the forest”, with the upper VP labeled f3, the V seem f5, VP-bar f8, COMP f6, the lower VP f9, and the V live f7.)
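Lions try to live in the forest
C-structure:
[s [np [n Lions]] [vp [v try] [vp-bar [comp to] [vp [v live] [pp [p in] [np [det the] [n forest]]]]]]]
F-structure:
[ SUBJ  [ PRED ‘lion’, NUM pl, PERS 3 ]
  PRED  ‘try < agent theme >’ (agent: SUBJ, theme: XCOMP)
  TENSE pres
  VFORM fin
  XCOMP [ SUBJ [ ] (shared with the SUBJ of try)
          VFORM INF
          PRED ‘live< theme loc >’ (theme: SUBJ, loc: OBL-loc)
          OBL-loc [ CASE OBL-loc
                    PRED ‘in<OBJ>’
                    OBJ [ PRED ‘forest’, NUM sg, PERS 3, DEF + ] ] ] ]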
have: V
(↑ PRED) = ‘have < theme > SUBJ’ (theme: XCOMP)
(↑ SUBJ) = (↑ XCOMP SUBJ)
(↑ XCOMP VFORM) = PASTPART
-Ø (suffix for verbs)
(↑ VFORM) = fin
(↑ SUBJ NUM) = pl
Lions have lived in the forest
C-structure:
[s [np [n Lions]] [vp [v have] [vp [v lived] [pp [p in] [np [det the] [n forest]]]]]]
F-structure:
[ SUBJ  [ PRED ‘lion’, NUM pl, PERS 3 ]
  PRED  ‘have < theme > SUBJ’ (theme: XCOMP)
  TENSE pres
  VFORM fin
  XCOMP [ SUBJ [ ] (shared with the SUBJ of have)
          VFORM PASTPART
          PRED ‘live< theme loc >’ (theme: SUBJ, loc: OBL-loc)
          OBL-loc [ CASE OBL-loc
                    PRED ‘in<OBJ>’
                    OBJ [ PRED ‘forest’, NUM sg, PERS 3, DEF + ] ] ] ]
were: V
(↑ PRED) = ‘be < theme > SUBJ’ (theme: XCOMP)
(↑ SUBJ) = (↑ XCOMP SUBJ)
(↑ XCOMP VFORM) = PASSIVE
(↑ VFORM) = fin
(↑ SUBJ NUM) = pl
Lions were hunted in the forest
C-structure:
[s [np [n Lions]] [vp [v were] [vp [v hunted] [pp [p in] [np [det the] [n forest]]]]]]
F-structure:
[ SUBJ  [ PRED ‘lion’, NUM pl, PERS 3 ]
  PRED  ‘be < theme > SUBJ’ (theme: XCOMP)
  TENSE past
  VFORM fin
  XCOMP [ SUBJ [ ] (shared with the SUBJ of be)
          VFORM PASSIVE
          PRED ‘hunt<agent theme loc >’ (agent: Ø, theme: SUBJ, loc: OBL-loc)
          OBL-loc [ CASE OBL-loc
                    PRED ‘in<OBJ>’
                    OBJ [ PRED ‘forest’, NUM sg, PERS 3, DEF + ] ] ] ]