Crossed Serial Dependencies: A low-power parseable extension to GPSG

Henry Thompson
Department of Artificial Intelligence and Program in Cognitive Science
University of Edinburgh
Hope Park Square, Meadow Lane
Edinburgh EH8 9NW
SCOTLAND

ABSTRACT

An extension to the GPSG grammatical formalism is proposed, allowing non-terminals to consist of finite sequences of category labels, and allowing schematic variables to range over such sequences. The extension is shown to be sufficient to provide a strongly adequate grammar for crossed serial dependencies, as found in e.g. Dutch subordinate clauses. The structures induced for such constructions are argued to be more appropriate to data involving conjunction than some previous proposals have been. The extension is shown to be parseable by a simple extension to an existing parsing method for GPSG.

I. INTRODUCTION

There has been considerable interest in the community lately with the implications of crossed serial dependencies in e.g. Dutch subordinate clauses for non-transformational theories of grammar. Although context-free phrase structure grammars under the standard interpretation are weakly adequate to generate such languages as a^n b^n, they are not capable of assigning the correct dependencies - that is, they are not strongly adequate.

In a recent paper (Bresnan, Kaplan, Peters and Zaenen 1982) (hereafter BKPZ), a solution to the Dutch problem was presented in terms of LFG (Kaplan and Bresnan 1982), which is known to have considerably more than context-free power. (Steedman 1983) and (Joshi 1983) have also made proposals for solutions in terms of Steedman/Ades grammars and tree adjunction grammars (Ades and Steedman 1982; Joshi, Levy and Yueh 1975).

In this paper I present a minimal extension to the GPSG formalism (Gazdar 1981c) which also provides a solution. It induces structures for the relevant sentences which are non-trivially distinct from those in BKPZ, and which I argue are more appropriate. It appears, when suitably constrained, to be similar to Joshi's proposal in making only a small increment in power, being incapable, for instance, of analysing a^n b^n c^n with crossed dependencies. And it can easily be parsed by a small modification to the parsing mechanisms I have already developed for GPSG.

II. AN EXTENSION TO GPSG

II.1 Extending the syntax

GPSG includes the idea of compound non-terminals, composed of pairs of standard category labels. We can extend this trivially to finite sequences of category labels. This in itself does not change the weak generative capacity of the grammar, as the set of non-terminals remains finite. GPSG also includes the idea of rule schemata - rules with variables over categories. If we further allow variables over sequences, then we get a real change.

At this point I must introduce some notation. I will write [a,b,c] for a non-terminal label composed of the categories a, b, and c. I will write Z_b* to indicate that the schematic variable Z ranges over sequences of the category b. We can then give the following grammar for a^n b^n with crossed dependencies:

    S   -> e
    S|Z -> a S|Z|b    (1)
    S|Z -> a S Z|b    (2)
    b|Z -> b Z        (3)

where we allow variables over sequences to appear not only alone, but in simple (that is, with constant terms only) concatenation, notated with a vertical bar (|). This gives us the following analysis for a^3 b^3, where I have used subscripts to record the dependencies, and marginal numbers to give the rule used at the adjacent node:

    [Figure: analysis tree for a1 a2 a3 b1 b2 b3. The rule applications it records are:]
    (1)  S          -> a1 [S,b1]
    (1)  [S,b1]     -> a2 [S,b1,b2]
    (2)  [S,b1,b2]  -> a3 S [b1,b2,b3]
         S          -> e
    (3)  [b1,b2,b3] -> b1 [b2,b3]
    (3)  [b2,b3]    -> b2 [b3]
    (3)  [b3]       -> b3

With the aid of this example, we see that rule 1 generates a's while accumulating b's, rule 2 brings this process to an end, and rule 3 successively generates the accumulated b's, in the correct, 'crossed', order.

This is essentially the structure we will produce for the Dutch examples as well, so it is important to point out exactly how the crossed dependencies are captured. This must come out in two ways - subcategorisation restrictions, and interpretation. That the subcategorisation is handled properly should be clear from the above example. Suppose that rather than terminals, a and b are pre-terminal categories, and that there are actually three sorts of a's and three sorts of b's, subcategorised for each other. If one used the standard GPSG mechanism for recording this dependency, namely by providing three rules, whose rule number would then appear as a feature on the pre-terminals they directly dominate, we would get the above structure, where we can reinterpret the subscripts as the rule numbers so introduced, and see that the (crossed) dependencies are correctly reflected.

II.2 Semantic interpretation

As for the required semantic interpretation, the untyped lambda calculus is sufficient to the task, albeit with a fair amount of work. The compound S and b nodes have compound interpretations, which are distributed appropriately higher up the tree. For this, we can use what amounts to sequences of interpretations. Following Church, we can represent a pair <l,r> as λf.[f(l)(r)]. If P is such a pair, then P0 = P(λx.λy.[x]) and P1 = P(λx.λy.[y]). Using this we can of course produce arbitrary sequences, as in Lisp. In what follows, I will use a Lisp-based shorthand, using CAR, CDR, CONS, and so on. These usages are discharged in Appendix I.

With this shorthand, we can give the following example of a set of semantic rules for association with the syntactic rules given above, which preserves the appropriate dependency, assuming that b'(a',S') is the desired result at each level:

    (1')  CONS(CADR(Q')(a')(CAR(Q')),CDDR(Q'))   where Q' is short for S|Z|b'
    (2')  CONS(CAR(Q')(a')(S'),CDR(Q'))          where Q' is short for Z|b'
    (3')  ADJOIN(Z',b')

These rules are most easily understood in reverse order. Rule 3 simply appends the interpretation of the immediately dominated b to the sequence of interpretations of the dominated sequence of b's. Rule 2 takes the first interpretation of such a sequence, applies it to the interpretations of the immediately dominated a and S, and prepends the result to the unused balance of the sequence of b interpretations. We now have a sequence consisting of first a sentential interpretation, and then a number of b interpretations. Rule 1 thus applies the second (b type) element of such a sequence to the interpretation of the immediately dominated a, and the first (S type) element of the sequence. The result is again prepended to the unused balance, if any. The patient reader can satisfy himself that this will produce the following interpretation:

    b1'(a1', b2'(a2', b3'(a3', S')))

II.3 Parsing

As for parsing context-free grammars with the non-terminals and schemata this proposal allows, very little needs to be added to the mechanisms I have provided to deal with non-sequence schemata in GPSG, as described in (Thompson 1981b). We simply treat all non-terminals as sequences, many of only one element. The same basic technique of a bottom-up chart parsing strategy, which substitutes for matched variables in the active version of the rule, will do the job. By restricting sequence variables to occur only once in each non-terminal pattern, the task of matching is kept simple and deterministic. Thus we allow e.g. S|Z|b but not Z|b|Z. The substitutions take place by concatenation, so that if we have an instance of rule (1) matching first [a] and then [S,b,b,b] in the course of bottom-up processing, the Z on the right hand side will match [b,b], and the resulting substitution into the left hand side will cause the constituent to be labeled [S,b,b].

In making this extension to my existing system, the changes required were all localised to that part of the code which matches rule parts against nodes, and here the price is paid only if a sequence variable is encountered. This suggests that the impact of this mechanism on the parsing complexity of the system is quite small.

III. APPLICATION TO DUTCH

Given the limited space available, I can present only a very high-level account of how this extension to GPSG can provide an account of crossed serial dependencies in Dutch. In particular I will have nothing to say about the difficult issue of the precise distribution of tensed and untensed verb forms.

III.1 The Dutch data

Discussion of the phenomenon of crossed serial dependencies in Dutch subordinate clauses is bedeviled by considerable disagreement about just what the facts are. The following five examples form the core of the basis for my analysis:

    1)   omdat ik probeer Nikki te leren Nederlands te spreken
    2)   omdat ik probeer Nikki Nederlands te leren spreken
    3)   omdat ik Nikki probeer te leren Nederlands te spreken
    4)   omdat ik Nikki Nederlands probeer te leren spreken
    5) * omdat ik Nikki probeer Nederlands te leren spreken.

With the proviso that (1) is often judged questionable, at least on stylistic grounds, this pattern of judgements seems fairly stable among native speakers of Dutch from the Netherlands. There is some suggestion that this is not the pattern of judgements typical of native speakers of Dutch from Belgium.

III.2 Grammar rules for the Dutch data

This pattern leads us to propose the following basic rules for subordinate clauses:

    A) S' -> omdat NP VP
    B) VP -> V VP          (probeer)
    C) VP -> NP V VP       (leren)
    D) VP -> NP V          (spreken).

Taken straight, these give us (1) only. For (2) - (4), we propose what amounts to a verb lowering approach, where verbs are lowered onto VPs, whence they lower again to form compound verbs. (5) is ruled out by requiring that a lowered verb have a target verb to compound with. The resulting compound verb may itself be lowered, but only as a unit. This approach is partially inspired by Seuren's transformational account in terms of predicate raising (Seuren 1972).

So the interpretation of the compound labels is that e.g. [V,V] is a compound verb, and [VP,V,V] is a VP with a compound verb lowered onto it. It follows that for each VP rule, we need an associated compound version which allows the lowering of (possibly compound) verbs from the VP onto the verb, so we would have e.g.

    Di) VP|Z -> NP Z|V,

where we now use Z as a variable over sequences of Vs. The other half of the process must be reflected in rules associated with each VP rule which introduces a VP complement, allowing the verb generated to be lowered onto the complement. For this we want e.g.

    Cii) VP|Z -> NP VP|Z|V.

Rather than enumerate such rules, we can use metarules to conveniently express what is wanted:

    I)  VP -> ... V ...    ==>  VP|Z -> ... Z|V ...
    II) VP -> ... V VP     ==>  VP|Z -> ... VP|Z|V.

(I) will apply to all three of (B) - (D), allowing compound verbs to be discharged at any point. (II) will apply only to (B) and (C), allowing the lowering (with compounding if needed) of verbs onto complements.

We need one more rule, to unpack the compound verbs, and the syntactic part of our effort is complete:

    E) W|Z -> W Z,

where W is an ordinary variable whose range consists of V. This slight indirection is necessary to insure that subcategorisation information propagates correctly.

III.3 Semantic rules for the Dutch data

The semantics follows that in section II.2 quite closely. For our purposes simple interpretations of (B) - (D) will suffice:

    B') V'(VP')
    C') V'(NP',VP')
    D') V'(NP').

The semantics for the metarules is also reasonably straightforward, given that we know where we are going:

    I')  F(V')      ==>  CONS(F(CAR(Z|V')),CDR(Z|V'))
    II') F(V',VP')  ==>  CONS(F(CADR(Q'),CAR(Q')),CDDR(Q')),

where Q' is short for VP|Z|V'. (I') will give semantics very much like those of rule (2) in section II.2, while (II') will give semantics like those of rule (1). (E') is just like (3):

    E') ADJOIN(Z',W')

Given the rules (A) - (E), together with the meta-generated rules (Bi) - (Di), (Bii) and (Cii), we can now generate examples (2) - (4). (4), which is fully crossed, is very similar to the example in section II.1, and uses meta-generated expansions for all its VP nodes:

    [Figure: analysis tree for (4), omdat ik Nikki Nederlands probeer te leren spreken, with the rule names (A), (Bii), (Cii), (Di), (E), (E) in the margin and the verbs labeled Vb (probeer), Vc (te leren), Vd (spreken).]

Once again I include the relevant rule name in the margin, and indicate with subscripts the rule name feature introduced to enforce subcategorisation. (2) is similar, but using rules (B), (Cii), and (Di). (3) will involve two meta-generated rules and one ordinary one:

    [Figure: analysis tree for (3), omdat ik Nikki probeer te leren Nederlands te spreken, with the rule names (A), (Bii), (Ci), (E), (D) in the margin and the node labels [VP,Vb] and [Vb,Vc].]

It is left to the enthusiastic reader to work through the examples and see that all of sentences (1) - (4) above in fact receive the same interpretation.

III.4 Which structure is right - evidence from conjunction

The careful reader will have noted that the structures I have proposed are not the same as those of BKPZ. Their structures have the compound verb depending from the highest VP, while ours depend from the lowest possible. With the exception of BKPZ's example (13), which none of my sources judge grammatical with the 'root Marie' as given, I believe my proposal accounts reasonably for all the judgements cited in their paper. On the other hand, I do not believe they can account for all of the following conjunction judgements, the first three based on (4), the next two on (3), whereas under the standard GPSG treatment of conjunction they all fall out of our analysis:

    6)   omdat ik Nikki Nederlands wil leren spreken en Frans wil laten schrijven
         because I want to teach Nikki to speak Dutch and let [Nikki] write French
    7) * omdat ik Nikki Nederlands wil leren spreken en Frans laten schrijven
    8)   omdat ik Nikki Nederlands wil leren spreken en Carla Frans wil laten schrijven
         because I want to teach Nikki to speak Dutch and let Carla write French.
    9)   omdat ik Nikki wil leren Nederlands te spreken en Frans te schrijven
         because I want to teach Nikki to speak Dutch and to write French
    10) * omdat ik Nikki wil leren Nederlands te spreken en Carla Frans te schrijven
         or ... en Frans (te) laten schrijven

(6) contains a conjoined [VP,V,V], (8) a conjoined [VP,V], and (7) fails because it attempts to conjoin a [VP,V,V] with a [VP,V]. (9) conjoins an ordinary VP inside a [VP,V], and (10) fails by trying to conjoin a VP with either a non-constituent or a [VP,V].

It is certainly not the case that adding this small amount of 'evidence' to the small amount already published establishes the case for the deep embedding, but I think it is at the very least suggestive. Taken together with the obvious way in which the deep embedding allows some vestige of compositionality to persist in the semantics, I think that at least a serious reconsideration of the BKPZ proposal is in order.

IV. CONCLUSIONS

It is of course too early to tell whether this augmentation will be of general use or significance. It does seem to me to offer a concise and satisfying account of at least the Dutch phenomena without radically altering the grammatical framework of GPSG. Further work is clearly needed to establish exactly the status of this augmented GPSG with respect to generative capacity and parsability.

Even in the weakest augmentation, allowing only one occurrence in any constituent of any rule of one variable over sequences, the apparent similarity of its power to that of the tree adjunction grammars of Joshi et al. is intriguing, and one may speculate as to their weak equivalence. It remains to be formally established, but it at least appears that like tree adjunction grammars, these grammars cannot generate a^n b^n c^n with both dependencies crossed, and like them, it can generate it with any one set crossed and the other nested. Neither can it generate WW, although it can with a sequence variable ranging over the entire alphabet. If it can be shown that it is indeed weakly equivalent to TAG, then strong support will be lent to the claim that an interesting new point on the Chomsky hierarchy between CFGs and the indexed grammars has been found.

ACKNOWLEDGEMENTS

The work described herein was partially supported by SERC Grant GR/B/93086. My thanks to Han Reichgelt, for renewing my interest in this problem by presenting a version of Seuren's analysis in a seminar, and providing the initial sentential data; to Ewan Klein, for telling me about Church's 'implementation' of pairs and conditionals in the lambda calculus; to Brian Smith, for introducing me to the wonderfully obscure power of the Y operator; and to Gerald Gazdar, Aravind Joshi, Martin Kay and Mark Steedman, for helpful discussion on various aspects of this work.

APPENDIX I
SEQUENCES IN THE UNTYPED LAMBDA CALCULUS

To imbed enough of Lisp in the lambda calculus for our needs, we require not just pairs, but NIL and conditionals as well. Conditionals are implemented similarly to pairs - "if p then q else r" is simply p applied to the pair <q,r>, where TRUE and FALSE are the left and right pair element selectors respectively. In order to effectively construct and manipulate lists, some method of determining their end is required. Numerous possibilities exist, but we have chosen a relatively inefficient but conceptually clear approach. We compose lists of triples, rather than pairs. Normal CONS pairs are given as <TRUE,car,cdr>, while NIL is <FALSE,,>. Given this approach, we can define the following shorthand, with which the semantic rules given in sections II.2 and III.3 can be translated into the lambda calculus:

    TRUE        -  λx.[λy.[x]]
    FALSE       -  λx.[λy.[y]]
    NIL         -  λf.[f(FALSE)(λp.[p])(λp.[p])]
    CONS(A,B)   -  λf.[f(TRUE)(A)(B)]
    CAR(L)      -  L(λx.[λy.[λz.[y]]])
    CDR(L)      -  L(λx.[λy.[λz.[z]]])
    CONSP(L)    -  L(λx.[λy.[λz.[x]]])
    CADR(L)     -  CAR(CDR(L))
    ADJOINFORM  -  λa.[λL.[λN.[CONSP(L)(CONS(CAR(L),a(CDR(L))(N)))(CONS(N,NIL))]]]
    Y           -  λf.[λx.[f(x(x))](λx.[f(x(x))])]
    ADJOIN(L,N) -  Y(ADJOINFORM)(L)(N)

Note that we use Church's Y operator to produce the required recursive definition of ADJOIN.

REFERENCES

Ades, A. and Steedman, M. 1982. On the order of words. Linguistics and Philosophy. to appear.

Bresnan, J.W., Kaplan, R., Peters, S. and Zaenen, A. 1982. Cross-serial dependencies in Dutch. Linguistic Inquiry 13.

Gazdar, G. 1981c. Phrase structure grammar. In P. Jacobson and G. Pullum, editors, The nature of syntactic representation. D. Reidel, Dordrecht.

Joshi, A. 1983. How much context-sensitivity is required to provide reasonable structural descriptions: Tree adjoining grammars, version submitted to this conference.

Joshi, A.K., Levy, L.S. and Yueh, K. 1975. Tree adjunct grammars. Journal of Computer and System Sciences.

Kaplan, R.M. and Bresnan, J. 1982. Lexical-functional grammar: A formal system of grammatical representation. In J. Bresnan, editor, The mental representation of grammatical relations. MIT Press, Cambridge, MA.

Seuren, P. 1972. Predicate Raising in French and Sundry Languages. ms., Nijmegen.

Steedman, M. 1983. On the Generality of the Nested Dependency Constraint and the reason for an Exception in Dutch. In Butterworth, B., Comrie, E. and Dahl, O., editors, Explanations of Language Universals. Mouton.

Thompson, H.S. 1981b. Chart Parsing and Rule Schemata in GPSG.
In Proceedings of the Nineteenth Annual Meeting of the Association for Computational Linguistics. ACL, Stanford, CA. Also DAI Research Paper 165, Dept. of Artificial Intelligence, Univ. of Edinburgh.
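
As an illustration of the list shorthand of Appendix I and the semantic rules of section II.2, the following is a minimal Python sketch, not part of the original system. Two departures are assumptions forced by Python's strict evaluation: the branches of the conditional in ADJOINFORM are wrapped in zero-argument thunks, and the Y operator is replaced by its eta-expanded (call-by-value) variant. The helper `mk_b`, which builds b-type interpretations as curried functions producing strings, and the names `rule1`, `rule2`, `rule3` are hypothetical.

```python
# Appendix I shorthand, transcribed as Python lambdas.
TRUE = lambda x: lambda y: x
FALSE = lambda x: lambda y: y
IDENT = lambda p: p
NIL = lambda f: f(FALSE)(IDENT)(IDENT)
CONS = lambda a, b: lambda f: f(TRUE)(a)(b)
CAR = lambda l: l(lambda x: lambda y: lambda z: y)
CDR = lambda l: l(lambda x: lambda y: lambda z: z)
CONSP = lambda l: l(lambda x: lambda y: lambda z: x)
CADR = lambda l: CAR(CDR(l))
CDDR = lambda l: CDR(CDR(l))

# Assumption: call-by-value fixed-point operator instead of the plain Y,
# which would loop forever under Python's eager evaluation.
Y = lambda f: (lambda x: f(lambda v: x(x)(v)))(lambda x: f(lambda v: x(x)(v)))

# Assumption: both branches are thunks; the selected one is then called.
ADJOINFORM = lambda a: lambda L: lambda N: CONSP(L)(
    lambda: CONS(CAR(L), a(CDR(L))(N)))(
    lambda: CONS(N, NIL))()
ADJOIN = lambda L, N: Y(ADJOINFORM)(L)(N)

# Hypothetical stand-in for a b-type interpretation: b'(a')(S') -> string.
mk_b = lambda name: lambda a: lambda s: f"{name}({a}, {s})"

# Semantic rules (1')-(3') of section II.2.
rule3 = lambda b, Z: ADJOIN(Z, b)                          # b|Z -> b Z
rule2 = lambda a, S, Q: CONS(CAR(Q)(a)(S), CDR(Q))         # S|Z -> a S Z|b
rule1 = lambda a, Q: CONS(CADR(Q)(a)(CAR(Q)), CDDR(Q))     # S|Z -> a S|Z|b

# Bottom-up interpretation of a1 a2 a3 b1 b2 b3.
bs = rule3(mk_b("b1'"), rule3(mk_b("b2'"), rule3(mk_b("b3'"), NIL)))
q = rule2("a3'", "S'", bs)
q = rule1("a2'", q)
q = rule1("a1'", q)
result = CAR(q)
print(result)
```

The final sequence has a single element, and each b interpretation ends up applied to the a with the same subscript, which is the crossed pairing described in section II.2.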
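
The matching step described in section II.3 can be sketched as follows. This is a hedged illustration in Python, not the code of the system described in (Thompson 1981b): labels are represented as tuples of category names, the marker `VAR` and the functions `match` and `substitute` are hypothetical, and the optional `element` argument is an assumed way of expressing that Z ranges over sequences of one category.

```python
VAR = "Z"  # hypothetical marker for the single sequence variable


def match(pattern, label, element=None):
    """Match a non-terminal pattern against a node label.

    Returns {} for a variable-free match, {VAR: binding} when the
    variable is bound, or None when the match fails.
    """
    pattern, label = tuple(pattern), tuple(label)
    if pattern.count(VAR) > 1:
        raise ValueError("at most one sequence variable per non-terminal")
    if VAR not in pattern:
        return {} if pattern == label else None
    i = pattern.index(VAR)
    prefix, suffix = pattern[:i], pattern[i + 1:]
    if len(label) < len(prefix) + len(suffix):
        return None
    if label[:len(prefix)] != prefix:
        return None
    if label[len(label) - len(suffix):] != suffix:
        return None
    middle = label[len(prefix):len(label) - len(suffix)]
    if element is not None and any(c != element for c in middle):
        return None
    return {VAR: middle}


def substitute(pattern, binding):
    """Substitute the bound sequence into a pattern by concatenation."""
    out = []
    for sym in pattern:
        if sym == VAR:
            out.extend(binding[VAR])
        else:
            out.append(sym)
    return tuple(out)


# Rule (1): S|Z -> a S|Z|b, matched bottom-up against [a] and [S,b,b,b].
binding = match(("S", VAR, "b"), ("S", "b", "b", "b"), element="b")
lhs = substitute(("S", VAR), binding)
print(binding, lhs)
```

Because the variable occurs at most once, the prefix and suffix fix its extent, so no search or backtracking is needed; this is the sense in which the match stays deterministic.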
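
The effect of metarules (I) and (II) of section III.2 on the basic rules (B) - (D) can also be sketched in Python. This is an illustrative assumption about how the metarules might be mechanised, not the author's implementation: rules are (left hand side, right hand side) pairs over tuple labels, and `metarule_i`, `metarule_ii`, and `fmt` are hypothetical names.

```python
Z = "Z"  # sequence variable over Vs

# Basic VP rules (B) - (D) from section III.2.
RULES = {
    "B": (("VP",), [("V",), ("VP",)]),
    "C": (("VP",), [("NP",), ("V",), ("VP",)]),
    "D": (("VP",), [("NP",), ("V",)]),
}


def metarule_i(lhs, rhs):
    """(I)  VP -> ... V ...  ==>  VP|Z -> ... Z|V ..."""
    if ("V",) not in rhs:
        return None
    return (lhs + (Z,), [(Z, "V") if c == ("V",) else c for c in rhs])


def metarule_ii(lhs, rhs):
    """(II) VP -> ... V VP  ==>  VP|Z -> ... VP|Z|V"""
    for i in range(len(rhs) - 1):
        if rhs[i] == ("V",) and rhs[i + 1] == ("VP",):
            return (lhs + (Z,), rhs[:i] + [("VP", Z, "V")] + rhs[i + 2:])
    return None  # no VP complement for the verb to lower onto


def fmt(rule):
    """Render a rule in the bar notation used in the text."""
    lhs, rhs = rule
    return "|".join(lhs) + " -> " + " ".join("|".join(c) for c in rhs)


generated = {}
for name, rule in RULES.items():
    for suffix, meta in (("i", metarule_i), ("ii", metarule_ii)):
        new = meta(*rule)
        if new is not None:
            generated[name + suffix] = fmt(new)

for name, text in generated.items():
    print(name, text)
```

Under these assumptions (I) yields a compound version of each of (B) - (D), while (II) yields one only for (B) and (C), since (D) introduces no VP complement.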