October 25, 2006
11-721: Grammars and Lexicons
Lori Levin

Lexical Functional Grammar
• History:
  – Joan Bresnan (linguist, MIT and Stanford)
  – Ron Kaplan (computational psycholinguist, Xerox PARC)
  – Around 1978

What is Linguistic Theory
• Delimit the range of possible human languages.
  – What do all languages have in common?
    • Semantic roles, grammatical relations, pragmatic relations, some constituent structure; only subjects can be controllees in matrix coding as subject constructions; etc.
  – What are the ways in which they can differ from each other?
    • Relative prominence of grammatical or pragmatic relations: word order reflects grammatical relations in English and reflects focus (new information) in Hungarian; Topic takes precedence over subject in Chinese in determining the antecedent of null pronouns; Subject is more prominent in English.
  – What never happens in a human language?
    • Make a question by saying the sentence backwards.

Universalist view of language
• There is “a common organizing structure of all languages that underlies their superficial variations in modes of expression” (Bresnan)
  – E.g., Passives that look very different in different languages can be described by a universal passive rule.
• The common organizing structure is part of human biology.

Some differences between English and Warlpiri
• English: [S [NP The two small children] [VP’ [Aux are] [VP [V chasing] [NP that dog]]]]
• Warlpiri: [S [NP Wita-jarra-rlu] [AUX ka-pala] [V wajili-pi-nyi] [NP yalumpu] [NP kurdu-jarra-rlu] [NP maliki]]
  Wita-jarra-rlu ka-pala wajili-pi-nyi yalumpu kurdu-jarra-rlu maliki.
  Small-DU-ERG pres-3duSUBJ chase-NPAST that.ABS child-DU-ERG dog.ABS

Possible word orders in Warlpiri that are not possible in English
• *The two small are chasing that children dog.
• *The two small are dog chasing that children.
• *Chasing are the two small that dog children.
• *That are children chasing the two small dog.

Non-configurational languages
• Free word order.
• May have discontinuous constituents.
• Tests for constituency do not yield evidence for a VP constituent.
Something that English and Warlpiri have in common
• Lucy is hitting herself.
• *Herself is hitting Lucy.
• Napaljarri-rli ka-nyanu paka-rni
  Napaljarri-ERG PRES-REFL hit-NONPAST
  “Napaljarri is hitting herself.”
• *Napaljarri ka-nyanu paka-rni
  Napaljarri.ABS PRES-REFL hit-NONPAST
  “Herself is hitting Napaljarri.”

What English and Warlpiri have in common according to Chomsky
• English
  – Deep structure: [S NP [VP’ Aux [VP V NP]]]
  – Surface structure: [S NP [VP’ Aux [VP V NP]]]
• Warlpiri
  – Deep structure: [S NP [VP’ Aux [VP V NP]]]
  – Surface structure: [S NP Aux V NP NP NP]

What English and Warlpiri have in common according to Bresnan
• Same grammatical relations and semantic roles
  – SUBJECT: the two small children: AGENT
  – PREDICATE: are chasing
  – OBJECT: that dog: PATIENT
• Different codings of grammatical relations:
  – English subject: NP immediately under S
  – Warlpiri subject: Ergative case marked NP (if verb is transitive)

Strength of Chomsky’s approach
• Proposing that there is a VP in all languages explains why there are subject-object asymmetries in all languages.

Strength of Bresnan’s approach
• Doesn’t propose non-existent VPs:
  – Phrase structure is used for representing constituency.
  – A different representation is used for grammatical relations.

Challenges for Bresnan and Chomsky
• Bresnan:
  – Explain subject-object asymmetries in the absence of a VP.
  – Explain in a principled way the range of possible coding properties of grammatical relations.
• Chomsky:
  – Explain in a principled way how the words get scrambled out of VP.
  – The phrase structure tree has to represent both grammatical relations and constituent structure, which may conflict with each other.
Levels of Representation in LFG
• Constituent structure: [s [np The bear] [vp ate [np a sandwich]]]
• Functional structure (linked to constituent structure by grammatical encoding): SUBJ, PRED eat, OBJ
• Thematic roles (linked to functional structure by lexical mapping): agent, patient
• Lexical mapping: eat < agent patient >, with agent mapped to SUBJ and patient mapped to OBJ
• Grammatical encoding for English!!!
  – SUBJ: the NP in S → NP VP
  – OBJ: the NP in VP → V NP
  – OBL: the PP in VP → V PP

Syntax
• Syntax is not about the form (phrase structure) of sentences.
• It is about how strings of words are associated with their semantic roles.
  – Phrase structure is only part of the solution.
• Sam saw Sue
  – Sam: perceiver
  – Sue: perceived

Syntax
• Syntax is also about how to tell that two sentences are thematic paraphrases of each other (same phrases filling the same semantic roles).
  – It seems that Sam ate the sandwich.
  – It seems that the sandwich was eaten by Sam.
  – Sam seems to have eaten the sandwich.
  – The sandwich seems to have been eaten by Sam.

How to associate phrases with their semantic roles in LFG
• Starting from a constituent structure tree:
• Grammatical encoding tells you how to find the subject.
  – The bear is the subject.
• Lexical mapping tells you what semantic role the subject has.
  – The subject is the agent.
  – Therefore, the bear is the agent.

Levels of Representation in LFG
• Constituent structure: [s [np The sandwich ] [vp was eaten [pp by the bear]]]
• Functional structure (linked to constituent structure by grammatical encoding): SUBJ, PRED eat, OBL
• Thematic roles (linked to functional structure by lexical mapping): patient, agent
• Lexical mapping: eat < agent patient >, with agent mapped to OBL and patient mapped to SUBJ
• Grammatical encoding for English!!!
  – SUBJ: the NP in S → NP VP
  – OBJ: the NP in VP → V NP
  – OBL: the PP in VP → V PP

Active and Passive
• Active:
  – Patient is mapped to OBJ in lexical mapping.
• Passive:
  – Patient is mapped to SUBJ in lexical mapping.
• Notice that the grammatical encodings are the same for active and passive sentences!!!

Passive mappings
• Starting from the constituent structure tree.
• The grammatical encoding tells you that the sandwich is the subject.
• The lexical mapping tells you that the subject is the patient.
  – Therefore, the sandwich is the patient.
• The grammatical encoding tells you that the bear is oblique.
• The lexical mapping tells you that the oblique is the agent.
  – Therefore, the bear is the agent.

How you know that the active and passive have the same meaning
• In both sentences, the mappings connect the bear to the agent role.
• In both sentences, the mappings connect the sandwich to the patient role (roll?)
• In both sentences, the verb is eat.

Levels of Representation in LFG
• Constituent structure: [s-bar [np what ] [s did [np the bear] eat ]]
• Functional structure (linked to constituent structure by grammatical encoding): OBJ, SUBJ, PRED eat
• Thematic roles (linked to functional structure by lexical mapping): patient, agent
• Lexical mapping: eat < agent patient >, with agent mapped to SUBJ and patient mapped to OBJ
• Grammatical encoding for English!!!
  – OBJ: the NP in S-bar → NP S
  – SUBJ: the NP in S → NP VP
  – OBL: the PP in VP → V PP

Wh-question
• Different grammatical encoding:
  – In this example, the OBJ is encoded as the NP immediately dominated by S-bar.
• Same lexical mappings are used for:
  – What did the bear eat?
  – The bear ate the sandwich.

Principles
• Variability:
  – Phrase structures and grammatical encodings vary across languages.
• Universality:
  – Functional structures are largely invariant across languages.

Functional Structure
  SUBJ   [PRED ‘bear’, NUM sg, PERS 3, DEF +]
  PRED   ‘eat < agent patient >’ (agent = SUBJ, patient = OBJ)
  TENSE  past
  OBJ    [PRED ‘sandwich’, NUM sg, PERS 3, DEF -]

Functional Structure
• Pairs of attributes (features) and values
  – Attributes (in this example): SUBJ, PRED, OBJ, NUM, PERS, DEF, TENSE
  – Values:
    • Atomic: sg, past, +, etc.
    • Feature structure: [num sg, pred ‘bear’, def +, person 3]
    • Semantic form: ‘eat<subj obj>’, ‘bear’, ‘sandwich’

Semantic Forms
• Why are they values of a feature called PRED?
  – In some approaches to semantics, even nouns like bear are predicates (functions) that take one argument and return true or false.
  – Bear(x) is true when the variable x is bound to a bear.
  – Bear(x) is false when x is not bound to a bear.

Why is it called a Functional Structure?
  X (features)   X squared (values)
  1              1
  2              4
  3              9
  4              16
  5              25
• Each feature has a unique value.
• Also, another term for grammatical relation is grammatical function.
• We will use the terms functional structure, f-structure and feature structure interchangeably.

Give a name to each function
  f1:  SUBJ   f2: [PRED ‘bear’, NUM sg, PERS 3, DEF +]
       PRED   ‘eat < agent patient >’ (agent = SUBJ, patient = OBJ)
       TENSE  past
       OBJ    f3: [PRED ‘sandwich’, NUM sg, PERS 3, DEF -]

How to describe an f-structure
• f1(TENSE) = past
  – Function f1 applied to TENSE gives the value past.
• f1(SUBJ) = [PRED ‘bear’, NUM sg, PERS 3, DEF +]
• f2(NUM) = sg

Descriptions can be true or false
• f(a) = v
  – Is true if the feature-value pair [a v] is in f.
  – Is false if the feature-value pair [a v] is not in f.

This is the notation we really use
• (f1 TENSE) = past
• Read it this way: f1’s tense is past.
• (f1 SUBJ) = [PRED ‘bear’, NUM sg, PERS 3, DEF +]
• (f2 NUM) = sg

Chains of function application
• (f1 SUBJ) = f2
• (f2 NUM) = sg
• ((f1 SUBJ) NUM) = sg
• Write it this way: (f1 SUBJ NUM) = sg
• Read it this way: “f1’s subject’s number is sg.”

More f-descriptions
• (f a) = v
  – f is something that evaluates to a function.
  – a is something that evaluates to an attribute.
  – v is something that evaluates to a function, symbol, or semantic form.
• (f1 subj) = (f1 xcomp subj)
  – Used for matrix coding as subject. A subject is shared by the main clause and the complement clause (xcomp).
• (f1 (f6 case)) = f6
  – Used for obliques

[Figure: c-structure and f-structure for “Lions seem to live in the forest”]
  c-structure: [S [NP [N Lions]] [VP [V seem] [VP-bar [COMP to] [VP [V live] [PP [P in] [NP [DET the] [N forest]]]]]]]
  f-structure:
    SUBJ   [PRED ‘lion’, NUM pl, PERS 3]
    PRED   ‘seem < theme > SUBJ’ (theme = XCOMP)
    TENSE  pres
    VFORM  fin
    XCOMP  [SUBJ [ ],
            VFORM INF,
            PRED ‘live< theme loc >’ (theme = SUBJ, loc = OBL-loc),
            OBL-loc [CASE OBL-loc, PRED ‘in<OBJ>’, OBJ [PRED ‘forest’, NUM sg, PERS 3, DEF +]]]

[Figure: the same c-structure and f-structure for “Lions seem to live in the forest”, with the c-structure nodes labeled n1 to n14 and the f-structures labeled f1 to f6.]

Properties of the mapping from c-structure to f-structure
• Each c-structure node maps onto at most one f-structure node.
• More than one c-structure node can map onto the same f-structure node.
• An f-structure node does not have to correspond to any c-structure node. (But the information it contains does come from somewhere: either a grammar rule or a lexical entry.)

The formalism for grammatical encoding: Local co-description of partial structures
• Φ is a mapping from c-structure nodes to f-structure nodes.
  – There are other mappings to semantic structures, argument structures, discourse structures, etc.
• * is the “current” c-structure node (me).
• Φ(*) is “my f-structure” (↓)
• m(*) is “my c-structure mother”
• Φ(m(*)) is “my c-structure mother’s f-structure” (↑)

Local co-description of partial structures
• S →  NP              VP
       (↑ SUBJ) = ↓    ↑ = ↓
• NP says: My mother’s f-structure has a SUBJ feature whose value is my f-structure.
• VP says: My mother’s f-structure is my f-structure.
• This rule simultaneously describes a piece of c-structure and a piece of f-structure.
• It is local because each equation refers only to the current node and its mother. (pages 119-120)

Other types of equations
• F-structure composition
  – (↑ SUBJ NUM) = sg
  – My f-structure has a subj feature, whose value is another f-structure, which has a num feature, whose value is sg.
  – Usually, path names are not longer than two.
• Two features pointing to the same value:
  – (↑ SUBJ) = (↑ XCOMP SUBJ)
  – (↑ SUBJ) = (↑ TOPIC)
• (↑ (↓ CASE)) = ↓ (Dalrymple pages 152-153)
  – Sam walked in the park.
  – (↓ CASE) = OBL-loc
  – (↑ OBL-loc) = ↓

The minimal solution
• The f-structure for a sentence is the minimal f-structure that satisfies all of the equations. (page 101)

Building an F-structure: informal, for linguists
• Annotate
  – Assign a variable name to the f-structure corresponding to each c-structure node.
  – May find out later that some of them are the same.
• Instantiate
  – Replace the arrows with the variable names.
• Solve
  – Locate the f-structure named on the left side of the equation.
  – Locate the f-structure named on the right side of the equation.
  – Unify them.
  – Replace both of them with the result of unification.

Unification
• [], the empty feature structure, is the identity element
  – [] U x = x
• Atomic value unified with an atomic value:
  – x U x = x
  – x U y = fail
• Atomic value unified with a non-empty feature structure: fail

Unification
• Feature structure f1 unified with feature structure f2 to make feature structure f3:
  – The set of features is the union of the features in f1 and f2.
  – The value of each feature in f3 is the value of that feature in f1 unified with the value of that feature in f2.
  – Keep going recursively if there are embedded feature structures.
  – If any unification fails, then the whole thing fails.
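The recursive definition of unification above can be sketched in Python. This is only an illustration under stated assumptions, not an implementation from the course: feature structures are modeled as plain dicts, atomic values as strings, and the names `unify` and `UnificationFailure` are made up for this sketch. A feature missing from one side is treated as [], so the identity rule covers it. The sketch does not model two features sharing one value.

```python
class UnificationFailure(Exception):
    """Raised when two values cannot be unified (hypothetical name)."""


def is_empty(value):
    """True for [], the empty feature structure."""
    return isinstance(value, dict) and not value


def unify(a, b):
    """Unify two values, each an atom (str) or a feature structure (dict)."""
    # [] is the identity element: [] U x = x
    if is_empty(a):
        return b
    if is_empty(b):
        return a
    a_is_fs, b_is_fs = isinstance(a, dict), isinstance(b, dict)
    if a_is_fs and b_is_fs:
        # Union of the features; a feature absent on one side counts as [].
        # Any failure in an embedded unification propagates upward.
        return {
            feat: unify(a.get(feat, {}), b.get(feat, {}))
            for feat in a.keys() | b.keys()
        }
    if a_is_fs or b_is_fs:
        # Atomic value against a non-empty feature structure: fail
        raise UnificationFailure(f"cannot unify {a!r} with {b!r}")
    if a == b:
        # x U x = x
        return a
    # x U y = fail
    raise UnificationFailure(f"atomic values differ: {a!r} vs {b!r}")


f1 = {"num": "sg", "gender": "masc", "person": "3"}
f2 = {"case": "nom", "def": "+", "person": "3"}
f3 = unify(f1, f2)  # succeeds: the shared feature person has the same value
```

Calling `unify(f1, {"person": "2"})` raises `UnificationFailure`, because the two values of person are different atoms.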
Unification and Grammaticality
• Grammatical sentence:
  – All unifications succeed and
  – Phrase structure derivation succeeds
• Ungrammatical sentence:
  – At least one unification fails or
  – Phrase structure derivation fails

Unification Example
  f1 [num sg, gender masc, person 3]
  f2 [case nom, def +, person 3]
  f3 [num sg, gender masc, person 3, case nom, def +]

Unification Example
  f1 [num sg, gender masc, person 3]
  f2 [case nom, def +, person 2]
• Unification fails. No f-structure is produced.

Unification Example
  f1 [subj [num sg, gender masc, person 3], tense pres]
  f2 [subj [case nom, def +, person 3], tense pres, neg +]
  f3 [subj [num sg, gender masc, person 3, case nom, def +], tense pres, neg +]

Unification Example
  f1 [subj [num sg, gender masc, person 2], tense pres]
  f2 [subj [case nom, def +, person 3], tense pres, neg +]
• Unification fails. No f-structure is produced.

Rule:
  S →  NP              VP
       (↑ SUBJ) = ↓    ↑ = ↓
                       (↑ VFORM) = fin
Instantiated equations:
  (f1 SUBJ) = f2
  f1 = f3
[Figure: c-structure and f-structure for “Lions seem to live in the forest”, with S labeled f1, the subject NP labeled f2, and the VP labeled f3.]

Lexical entries
  lion: N   (↑ PRED) = ‘lion’
  -s (suffix for nouns)   (↑ NUM) = pl
                          (↑ PERS) = 3
  seem: V   (↑ PRED) = ‘seem < theme > SUBJ’ (theme = XCOMP)
            (↑ SUBJ) = (↑ XCOMP SUBJ)
  -Ø (suffix for verbs)   (↑ VFORM) = fin
                          (↑ SUBJ NUM) = pl
[Figure: c-structure and f-structure for “Lions seem to live in the forest”, with the N labeled f4 and the V labeled f5.]

Instantiated lexical entries
  lion: N   (f4 PRED) = ‘lion’
  -s (suffix for nouns)   (f4 NUM) = pl
                          (f4 PERS) = 3
  seem: V   (f5 PRED) = ‘seem < theme > SUBJ’ (theme = XCOMP)
            (f5 SUBJ) = (f5 XCOMP SUBJ)
  -Ø (suffix for verbs)   (f5 VFORM) = fin
                          (f5 SUBJ NUM) = pl

What is an XCOMP
• A non-finite clause, predicate nominal, predicate adjective, or predicate PP
  – Sam seemed to be happy (VP)
  – Sam seemed happy (AP)
  – Sam became a teacher (NP)
  – We had them arrested (VP)
  – We kept them in the drawer (PP)
• Has to be an argument of a verb:
  – Arrested by the police, Sam had no alternative but to give up his life of crime.
    • This is an adjunct, not an XCOMP
• Gets its subject by sharing with another verb:
  – I think that Sam is happy.
    • This is a COMP, not an XCOMP

Rule:
  VP →  V        VP
        ↑ = ↓    (↑ XCOMP) = ↓
Lexical entries
  seem: V   (↑ PRED) = ‘seem < theme > SUBJ’ (theme = XCOMP)
            (↑ SUBJ) = (↑ XCOMP SUBJ)
            (↑ XCOMP VFORM) = INF
  -Ø (suffix for verbs)   (↑ VFORM) = fin
                          (↑ SUBJ NUM) = pl
  to: COMP  (↑ VFORM) = INF
  -Ø (suffix for verbs)   (↑ VFORM) = INF
  live: V   (↑ PRED) = ‘live<theme loc>’ (theme = SUBJ, loc = OBL)
[Figure: c-structure and f-structure for “Lions seem to live in the forest”, with the matrix VP labeled f3, seem labeled f5, to labeled f6, live labeled f7, the complement labeled f8, and the PP labeled f9.]

Instantiated rule and lexical entries
  VP →  V          VP
        f3 = f5    (f3 XCOMP) = f8
  seem: V   (f5 PRED) = ‘seem < theme > SUBJ’ (theme = XCOMP)
            (f5 SUBJ) = (f5 XCOMP SUBJ)
            (f5 XCOMP VFORM) = INF
  -Ø (suffix for verbs)   (f5 VFORM) = fin
                          (f5 SUBJ NUM) = pl
  to: COMP  (f6 VFORM) = INF
  -Ø (suffix for verbs)   (f7 VFORM) = INF
  live: V   (f7 PRED) = ‘live<theme loc>’ (theme = SUBJ, loc = OBL)

[Figure: c-structure and f-structure for “Lions try to live in the forest”]
  c-structure: [S [NP [N Lions]] [VP [V try] [VP-bar [COMP to] [VP [V live] [PP [P in] [NP [DET the] [N forest]]]]]]]
  f-structure:
    SUBJ   [PRED ‘lion’, NUM pl, PERS 3]
    PRED   ‘try < agent theme >’ (agent = SUBJ, theme = XCOMP)
    TENSE  pres
    VFORM  fin
    XCOMP  [SUBJ [ ],
            VFORM INF,
            PRED ‘live< theme loc >’ (theme = SUBJ, loc = OBL-loc),
            OBL-loc [CASE OBL-loc, PRED ‘in<OBJ>’, OBJ [PRED ‘forest’, NUM sg, PERS 3, DEF +]]]

Lexical entries
  have: V   (↑ PRED) = ‘have < theme > SUBJ’ (theme = XCOMP)
            (↑ SUBJ) = (↑ XCOMP SUBJ)
            (↑ XCOMP VFORM) = PASTPART
  -Ø (suffix for verbs)   (↑ VFORM) = fin
                          (↑ SUBJ NUM) = pl
[Figure: c-structure and f-structure for “Lions have lived in the forest”]
  c-structure: [S [NP [N Lions]] [VP [V have] [VP [V lived] [PP [P in] [NP [DET the] [N forest]]]]]]
  f-structure:
    SUBJ   [PRED ‘lion’, NUM pl, PERS 3]
    PRED   ‘have < theme > SUBJ’ (theme = XCOMP)
    TENSE  pres
    VFORM  fin
    XCOMP  [SUBJ [ ],
            VFORM PASTPART,
            PRED ‘live< theme loc >’ (theme = SUBJ, loc = OBL-loc),
            OBL-loc [CASE OBL-loc, PRED ‘in<OBJ>’, OBJ [PRED ‘forest’, NUM sg, PERS 3, DEF +]]]

Lexical entries
  were: V   (↑ PRED) = ‘be < theme > SUBJ’ (theme = XCOMP)
            (↑ SUBJ) = (↑ XCOMP SUBJ)
            (↑ XCOMP VFORM) = PASSIVE
            (↑ VFORM) = fin
            (↑ SUBJ NUM) = pl
[Figure: c-structure and f-structure for “Lions were hunted in the forest”]
  c-structure: [S [NP [N Lions]] [VP [V were] [VP [V hunted] [PP [P in] [NP [DET the] [N forest]]]]]]
  f-structure:
    SUBJ   [PRED ‘lion’, NUM pl, PERS 3]
    PRED   ‘be < theme > SUBJ’ (theme = XCOMP)
    TENSE  past
    VFORM  fin
    XCOMP  [SUBJ [ ],
            VFORM PASSIVE,
            PRED ‘hunt<agent theme loc >’ (agent = Ø, theme = SUBJ, loc = OBL-loc),
            OBL-loc [CASE OBL-loc, PRED ‘in<OBJ>’, OBJ [PRED ‘forest’, NUM sg, PERS 3, DEF +]]]
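The lexical entries for seem, have, and were all contain the equation (↑ SUBJ) = (↑ XCOMP SUBJ). As a rough Python illustration of what that equation asks for, assume f-structures are nested dicts and that "the same value" means the same Python object. The helper `get_path` and the variable names are hypothetical, chosen only for this sketch, and the XCOMP is abbreviated.

```python
def get_path(f, *attrs):
    """Evaluate a chain such as (f SUBJ NUM) by applying each attribute in turn."""
    for attr in attrs:
        f = f[attr]
    return f


# One f-structure for "lions", reachable along two different paths.
lions = {"PRED": "lion", "NUM": "pl", "PERS": "3"}

# Abbreviated f-structure for "Lions seem to live in the forest".
f1 = {
    "SUBJ": lions,
    "PRED": "seem<theme> SUBJ",
    "TENSE": "pres",
    "VFORM": "fin",
    "XCOMP": {
        "SUBJ": lions,  # (f1 SUBJ) = (f1 XCOMP SUBJ): shared, not copied
        "VFORM": "INF",
        "PRED": "live<theme loc>",
    },
}

# Both paths lead to the very same object, so information added to one
# (for example by a later equation) is visible through the other.
shared = get_path(f1, "SUBJ") is get_path(f1, "XCOMP", "SUBJ")
subject_number = get_path(f1, "XCOMP", "SUBJ", "NUM")
```

Sharing the object rather than copying it is the design choice that matters here: a copy would satisfy the equation only until one of the two copies received new information.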