Axiomatizing Dependeny Parsing Using Set Constraints Denys Duhier Programming Systems Lab University of the Saarland, Saarbr uken duhierps.uni-sb.de Abstrat We propose a new formulation of dependeny grammar and develop a orresponding axiomatization of syntati well-formedness with a natural reading as a onurrent onstraint program. We demonstrate the expressivity and eetiveness of set onstraints, and desribe a treatment of ambiguity with wide appliability. Further, we provide a onstraint programming aount of dependent disjuntions that is both simple and eÆient and additionally provides the benets of onstrutive disjuntions. Our approah was implemented in Oz and yields parsers with very good performane for our urrently middle sale grammars. Constraint propagation an be observed to be remarkably eetive in pruning the searh spae. 1 Introdution Modern linguisti theories suh as hpsg (Pollard and Sag, 1987) and lfg (Kaplan and Bresnan, 1982) are primarily onerned with the formulation of general strutural priniples that determine syntatially well-formed entities, typially represented as hierarhial, possibly typed, feature strutures. While feature strutures are appealing for their perspiuity and are easily supported by languages with uniation, they have the disadvantage that the resulting grammatial formalisms are hard to parse for both theoretial and pratial reasons, even beoming undeidable in the worst ase (Kaplan and Bresnan, 1982). These difulties are magnied in languages like German where free word order and disontinuous onstituents ompliate the formal aount and handiap parsing tehniques relying on surfae order. While muh researh was devoted to the ompilation of grammatial frameworks based on typed feature strutures and on the ompat representation and eÆient proessing of disjuntive feature strutures, reent advanes in onstraint tehnology have reeived rather less attention in the omputational linguistis (CL) ommunity than they deserve, and their relevane has gone largely unnotied. Conurrent onstraint programming provides today far greater expressivity and eÆieny than is ommonly assumed. In partiular, set onstraints are emerging as an espeially elegant and omputationally eetive tool for appliations in CL. In an earlier paper (Duhier and Gardent, 1999), we desribed how they ould serve to eÆiently solve dominane onstraints and nd minimal models of tree desriptions. We propose here to show their appliation to parsing. In this paper, we are going to demonstrate how, with the help of state-of-the-art onstraint programming, an elegant and onise axiomati speiation of syntati well-formedness beomes naturally an eÆient program. In so doing, we develop a treatment of ambiguity with fairly general appliability. In partiular, we provide a onstraint-based aount of dependent disjuntions (Maxwell and Kaplan, 1989; Dorre and Eisele, 1990; Gerdemann, 1991; Grifth, 1990) that naturally supports onstrutive disjuntion semantis (lifting of ommon information). Our approah stands in sharp ontrast to most extant parsing tehniques. We abandon the generative view; we no longer build partial parses by ombination of smaller ones, as is the ase in e.g. hart parsing. Rather, we give a global well-formedness ondition and proeed to enumerate its models. We hoose dependeny grammar (DG) as our framework and axiomatize the notion of a syntatially well-formed dependeny tree in a manner that is partiularly well-suited to onstraint-based proessing: we propose to regard dependeny parsing as a nite onguration problem that an be formulated as a onstraint satisfation problem (CSP). This view takes advantage of the fat that for a sentene of length n, there are nitely many possible trees involving just n nodes. Out of this large number, we must selet those that are grammatial. We do not attempt this through explorative generation, but rather via model elimination. What onstraint programming affords us is eetive model elimination through onstraint propagation. Maruyama (1990) was the rst to propose a omplete treatment of dependeny grammar as a CSP and desribed parsing as a proess of inremental disambiguation. Harper (Harper et al., 1999) ontinues this line of researh and has proposed several algorithmi improvements within the MUSE CSP framework (Helzerman and Harper, 1993). Menzel (Menzel, 1998; Heineke et al., 1998; Menzel and Shroder, 1998) advoates the use of soft \graded" onstraints for robustness e.g. in parsing spoken language. His proposal turns parsing into a more expensive optimization problem, but adapts graefully to onstraint violations. Our presentation has the advantage over Maruyama's that it follows modern linguisti pratie: the grammar is speied by a lexion and a olletion of priniples. Further, we illustrate the expressiveness of set onstraints and seletion onstraints, and demonstrate how they an provide ompat and elegant enodings of various forms of ambiguity suh as lexial or attahment ambiguity. Finally, our axiomatization of dependeny parsing has the property that, modulo details of syntax, it an also be regarded as a program in a onurrent onstraint programming language suh as Oz (Smolka, 1995). Several parsers were implemented in Oz as desribed and provide exellent performane, without any sort of optimization. Consider the sentene1 below whih illusJoahim Niehren suggests the following sentene, whih exhibits the same struture but sounds more onvining to the German ear: \Genau diese Flashe Wein 1 hat mir mein Kommissionn ar versprohen auf der Auktion zu ersteigern." trates the sort of non-projetive analysis with fronting, srambling and extraposition that is typial of German sentenes. das Buh hat mir Peter versprohen zu lesen the book has me (dat) Peter promised to read (1) Sine both \das Buh" and \Peter" an be indifferently assigned nominative or ausative ase, either one may be subjet of \hat" while the other is ausative objet of \lesen." There are thus two readings. (Fig 1) demonstrates the eetiveness of onstraint propagation: the two readings are enumerated in approximately 190ms using a single hoie point.2 A graphial representation of the preferred reading is shown in the upper window, while the searh tree is displayed in the lower window. Figure 1: Parser Demo In our approah, onstraint propagation alone onstruts the dependeny tree. Searh is only needed to explore alternatives that annot be ruled out by onstraint propagation. In pratie, we observe that searh is required to enumerate readings, but very rarely leads to failure where e.g. baktraking is required. Setion 2 introdues the formal framework; Setion 3 presents the novel onstraint programming ideas suh as set and seletion onstraints; We expet better performane when e.g. the seletion onstraint is given low-level builtin support. It is urrently implemented in Oz itself. 2 and in Setion 4 we axiomatize a onstraint model for syntati well-formedness in DG and turn the problem into a CSP. 2 16 4 hWords; Cats; Agrs; Comps; Mods; Lexion; Rulesi string : at : agr : roles : 3 Words Cats 7 7 Agrs 5 2Comps We write attribute aess in funtional notation. If e is a lexial entry: string(e) is the full form of the orresponding word, at(e) is the ategory, agr(e) the agreement, and roles(e) the valeny expressed as a set of omplement roles. 4 er liest das Buh In our formal setting, a dependeny grammar G is a 7-tuple 2 66 4 det 3 Dependeny Grammar where Words is a nite set of strings notating the fully ineted forms of words, Cats is a nite set of ategories suh as n for noun, det for determiner, or vfin for nite verb, Agrs is a nite set of agreement tuples suh as hmas sing 3 nomi, Comps is a nite set of omplement role types suh as subjet or np dat for dative noun phrase, Mods is a nite set of modier role types, suh as adj for adjetives, disjoint from Comps. We write Roles = Comps ℄ Mods for the set of all role types; they will serve to label the edges of a dependeny tree. Lexion is a nite set of lexial entries (see below), and Rules is a family of binary prediates, indexed by role labels, expressing loal grammatial priniples: for eah 2 Roles, there is 2 Rules suh that (w1 ; w2 ) haraterizes the grammatial admissibility of an edge labeled from mother w1 to daughter w2 . A lexial entry is an attribute value matrix (AVM) with signature: np a 1 2 Formal Framework We now present our notion of a dependeny grammar and of dependeny trees. The formulation below ignores issues of word order as they are beyond the sope of this artile. However, the full treatment inludes ideas derived from Reape's word order domains (Reape, 1994) as well as from the theory of topologial elds in German sentenes. 2.1 2 subjet he string reads : at : agr : 2 26 4 roles : string : at : agr : 2 36 4 roles : string : 2 46 4 at : agr : roles : string : at : agr : roles : the book er pro hmas sing 3 nomi fg liest vfin hmas sing 3 nomi fsubjet,np ag das det hneut sing 3 ai fg Buh n hneut sing 3 ai fdetg 3 7 5 3 7 5 3 7 5 3 7 5 Figure 2: Example Dependeny Tree 2.2 Dependeny Trees We assume an innite set Nodes of nodes and dene a labeled direted edge to be an element of Nodes Nodes Roles. Thus, given a set V Nodes of nodes and a set E V V Roles of labeled edges between these node, hV ; Ei is a direted graph, in the lassial sense, with labeled edges. We will restrit our attention to nite graphs that are also trees. Every node of a dependeny tree ontributes a word to the sentene; the position where that word is ontributed will be represented by a mapping index from nodes to integers. Further, every node must be assigned syntati features (e.g. ase, agreement : : : ), whih will be realized by a mapping entry from nodes to lexial entries. A dependeny tree T is then dened as a 4tuple: hV ; E ; index; entryi where V is a nite set fw ; : : : ; wn g of nodes, E 1 is a nite set of labeled edges wi ! wj between elements of V , and hV ; Ei is a tree as dened above. index is a bijetion from V to [1: :n℄ assigning to eah node a linear position in the orresponding sentene, and entry : V 7! Lexion assigns a lexial entry to eah node in V . Example. Figure 2 ontains a graphial depition of a dependeny tree. The upper part shows nodes represented as boxes onneted by labeled direted edges. The integer appearing in eah box is the position whih index assigns to the node. For ease of reading, the linearization of the sentene is also provided and a dotted line indiates for eah node the orresponding word in the sentene. The lower part of the gure displays for eah node the lexial entry whih entry assigns to it. Figure 3 shows the dependeny tree for example sentene (1). In plaes where the value assignment is fully unonstrained (i.e. all assignments are aeptable), we have used the standard \don't are" notation ` '. Well-Formedness. A dependeny tree is grammatially admissible i onditions (2), (3) and (4) below are satised. First, any omplement required by a node's valeny must be realized preisely one: 8wi 2 V ; 8 2 roles(entry(wi )) 9!wj 2 V ; wi ! wj 2 E (2) Seond, if there is an edge emanating from wi , then it must be labeled either by a omplement type in wi 's valeny or by a modier type: 8wi ! wj 2 E 2 roles(entry(wi )) [ Mods (3) Third, whenever there is an edge wi ! wj , then the grammatial ondition (wi ; wj ) for 2 Rules must be satised in T : 8wi ! wj 2 E 2.3 T j= (wi ; wj ) Improving Lexial Eonomy (4) Typially, the same full form of a word may orrespond to several distint agreements. In the interest of a more ompat representation, we wish to ollapse together entries that dier only in their agreement. We replae attribute agr by whose values are now sets of agreement tuples. Although of less frequent appliability, we do the same for ategories and replae at by ats. Optional omplements are another soure of onsiderable redundany in the lexion. Instead of modeling the valeny with just one set roles(e) of omplement types, we propose to use instead a lower bound broles(e) and an upper bound drolese(e), suh that broles(e) drolese(e). broles(e) represents the required roles and drolese(e) the permissible roles. Optional roles are simply drolese(e) n broles(e). We all the new, more ompat representation a lexion entry to distinguish it from a lexial entry whih remains as dened earlier. A lexion entry is said to generate lexial entries. The lexial entries generated by (5) below are of the form (6) and orrespond to all solutions of onstraint (7). agrs 2 string 66 ats 64 agrs broles : : : drolese : string : 2 64 2C : at : agr : roles : ^ a2A ^ S C A Rlo Rhi S a r 3 77 75 (5) 3 75 rR (6) (7) This simple formulation illustrates how onstraints an be used to produe ompat representations of ertain forms of lexial ambiguity. Rlo hi 3 Constraint Programming The foundations of onurrent onstraint programming (CCP) are well doumented e.g. in (Saraswat et al., 1991; Smolka, 1995) and have been implemented in the programming language Oz (Mozart Consortium, 1998). Modern onstraint tehnology, as provided by CHIP (Dinbas et al., 1988), lp(FD) (Codognet and Diaz, 1996), ECLiPSe (Aggoun et al., 1995), ILOG Solver (ILOG, 1996), and Oz (Mozart Consortium, 1998) has proven quite suessful at solving pratial problems with high ombinatorial omplexity, suh as sheduling and onguration, that were resisting traditional methods of operations researh. 3 vp past subjet 5 6 np dat 4 8 np a 2 det vp zu 7 zu 1 das Buh hat mir Peter versprohen zu lesen 2 16 4 1 string : at : agr : 2 36 4 roles : string : at : agr : 2 56 4 roles : string : 2 76 4 at : agr : roles : string : at : agr : roles : 2 3 4 das det hneut sing 3 ai fg hat vfin hmas sing 3 nomi fsubjet,vp pastg Peter n hmas sing 3 nomi fg zu part fg 5 3 75 2 26 4 3 75 3 75 3 75 6 string 7 : at : agr : 2 46 4 roles : string : at : agr : 2 66 4 roles : string : 2 86 4 at : agr : roles : string : at : agr : roles : 8 Buh n hneut sing 3 ai fdetg 3 mir pro 7 h sing 3 dati 5 fg versprohen vpast fnp dat,vp zu3g lesen vinf fzu,np ag 3 75 3 75 75 Figure 3: Dependeny tree of demo sentene The key to suess, presented here, is to redue parsing to a problem that onstraint programming (CP) is good at, namely onguration. 3.1 Basi Notions A CSP onsists of a nite olletion of variables (xi ) taking values in nite domains (Di ) and a onstraint on these variables. A solution is an assignment of values to these variables suh that is satised. In CP, is omputed inrementally as a sequene of improving approximations: (xi ) is an approximation of the assignment to xi and is typially speied either by listing the remaining values or by providing upper and lower bounds. Initially the approximation of xi ontains all of Di . Solutions are derived by a searh proess onsisting of alternating propagation and distribution steps until all variables are determined or a ontradition is derived. A variable xi is said to be determined when its approximation (xi ) has been redued to a single value. In CCP, is alled the store and primitive onstraints are implemented by onurrent agents that observe the store and orrespondingly improve the urrent approximations by deriving new basi onstraints aording to their delarative semantis. Basi onstraints are e.g. I = 5, I 6= 5, 5 2 S or 5 62 S . Propagation. This is a proess of deterministi inferene. It removes from every approximation (xi ) all values that an be inferred to be inonsistent with . In pratie only a heap loal inferene proess is applied suh as \ar onsisteny." Distribution. When propagation has reahed a xed point but not all variables are determined, searh beomes neessary to e.g. enumerate the possible assignments of a non-determined variable x. Whih variable to pik and how to enumerate the values is expressed by a distribution (or labeling) strategy. For example the \rst-fail" strategy piks a variable that has fewest remaining possible values in its domain; its intended eet is to keep the branhing fator in the searh tree as small as possible. 3.2 Set Constraints Modern onstraint tehnology not only supports nite domain variables, denoting integers, but also nite set variables, denoting nite sets of integers (Gervet, 1995; Muller and Muller, 1997). One of our ontributions is to demonstrate the elegane, suintness and eÆieny aruing from the use of sets and set onstraints. A set variable S is approximated by a lower bound bS and an upper bound dS e: bS S dS e These bounds an be tightened by onstraint propagation. All the usual operations of set theory are supported as onstraints. In partiular, we often use S = S1 ℄ : : : ℄ Sn, where ℄ denotes disjoint union, to state that (Si ) forms a partition of S . Our formulation illustrates many appliations of sets. Of speial interest is the treatment of attahment ambiguity (or more generally of ambiguous graph onnetivity): we use set variables to represent sets of daughters for eah node in a tree. 3.3 Seletion Constraint An essential ontribution of this paper is a general tehnique for the ompat representation and eetive treatment of ambiguity suh as lexial ambiguity. Consider a variable X that may be equated with one of n variables (Vi ). We represent the hoie expliitly using a nite domain variable I 2 f1: :ng and the seletion onstraint : X = hV1 ; : : : ; Vn i[I ℄ (8) The notation hV1 ; : : : ; Vn i[I ℄ was hosen to suggest seletion of the I th element of sequene hV1 ; : : : ; Vni. The idea is that the onstrained value of X an be approximated from the set of variables that may still be seleted by I from hV1 ; : : : ; Vni. Conversely: the remaining domain of I must be restrited to the positions for whih Vi is still ompatible with X . This powerful idea was rst introdued in CHIP (Dinbas et al., 1988) for nite domain (FD) variables under the name of `element' onstraint. We extend it here to nite set (FS) variables. The seletion onstraint an be implemented eÆiently and we outline the intuition in the ase where X and (Vi ) are set variables. We write bX and dX e for the lower and upper approximations of X and dom(I ) for the approximation of I . Given the seletion onstraint (8), propagation maintains the following invariant: \ bVj bX X dX e j 2dom(I ) [ dVj e j 2dom(I ) and applies the following rule of inferene: bVj 6 dX e _ bX 6 dVj e ) I 6= j As a onsequene of the above invariant, the seletion onstraint implements a form of onstrutive disjuntion (lifting of information ommon to all remaining alternatives). The fat that the hoie is made expliit by variable I permits dependent seletions. For example in (9) the hoie of whih Vi to equate with X and whih Wi to equate with Y are mutually depend: X = hV1 ; : : : ; Vn i[I ℄ (9) Y = hW ; : : : ; W i[I ℄ 1 n These two onstraints an be viewed as realizing the following dependent (or named) disjuntions (Maxwell and Kaplan, 1989; Dorre and Eisele, 1990; Gerdemann, 1991; GriÆth, 1990) both labeled with name I : (X = V1 (Y = W1 _:::_ _:::_ X = Vn )I Y = Wn )I Notational variations on dependent disjuntion have been used to onisely express ovariant assignment of values to dierent features in feature strutures. The seletion onstraint provides the same notational onveniene and delarative semantis, but also gives it all the omputational benets that arue from stateof-the-art onstraint tehnology. 4 Constraint Model In this setion, we develop a onstraint model for the formal framework of Setion 2: we introdue FD and FS variables to enode the quantities and mappings it mentions, and formulate onstraints on them that preisely apture the onditions the formal model stipulates. What motivates this redution is the desire to take advantage of the powerful and very eÆient tehnologial support for onstraint propagation on nite domains and nite sets. For our appliation to parsing, we assume that we are given an input sentene onsisting of n words s1 s2 : : : sn . The solutions to the CSP obtained as its onstraint model orrespond to all admissible parse trees of the sentene. Our approah takes advantage of the following observations: (1) there are nitely many edges labeled with Roles that an be drawn between n nodes,3 (2) for eah word there are nitely many lexial entries. Thus the problem is to pik a set of edges and, for eah node, to pik a lexial entry so that (a) the result is a tree, (b) none of the grammatial onditions are violated. Viewed in this light, dependeny parsing redues to a onguration problem. 4.1 Representation we identify a node with the integer representing its position in the sentene; thus index is simply the identity funtion. The lexial entry assigned to eah node w is represented by its attributes. We write at(w), agr(w), and roles(w) for the variables denoting their values. Domains: eah ategory in Cats is enoded by a distint integer. Similarly for eah agreement tuple in Agrs4 and eah role in Roles. Thus every value is represented either by an integer or a set of integers. Daughter Sets: for eah node w 2 V and eah role 2 Roles, we write (w) for the set Nodes and Lexial Attributes: jV V Rolesj = n jRolesj In German, there are 72 distint agreement tuples: 4 ases 3 genders 3 persons 2 numbers 3 4 2 of immediate daughters of w whose dependeny edge is labeled , i.e. (w) = fw j w ! w 2 Eg. Thus, if there is no edge labeled emanating from w, (w) is the empty set. Sine (w) is a subset of V , i.e. a nite of set of integers in the onstraint model, we represent it by a FS variable. Lexion: we onsider the funtion Lex from words to sets of lexion entries. 0 Lex 0 (s) = fe 2 Lexion j string(e) = sg Without loss of generality, we an assume that Lex returns a sequene rather than a set whih allows us to identify the lexion entries of a given word by position in this sequene. The lexial entry assigned to w is generated by one of its lexion entries as desribed in Setion 2.3. We say that the latter is \seleted" and introdue variable entryIndex(w) to denote its position in the sequene returned by Lex for w. 4.2 Lexial Constraints We now expliate the assignment of lexial attributes to a node w and demonstrate how lexial ambiguity an be axiomatized by redution to the seletion onstraint. Lexial attribute assignment proeeds in two steps: (a) seletion of a lexion entry (b) generation of a lexial entry from this lexion entry as desribed in Setion 2.3. Let us write E for the lexion entry seleted for w and I for its position in the sequene returned by Lex. They are abstratly dened by the following equations: he ; : : : ; en i = 1 I E = = ( (w)) (w) he1 ; : : : ; en i[I ℄ Lex string entryIndex the lexial attributes are obtained as solutions of formula (7) whih translates into the following onstraints: (w) 2 ats(E ) (w) 2 agrs(E ) broles(E ) roles(w) drolese(E ) at agr For pratial reasons of implementation, the seletion onstraint is only provided for nite domains and nite sets, but E and (ei ) are AVMs. We overome this diÆulty by pushing attribute The subjet of a nite verb must be either a noun or a pronoun, it must agree with the verb in person and number, and must have nominative ase. We write nom for the set of agreement tuples with nominative ase: aess into the seletion: at(w ) 2 hats(e1 ); : : : ; ats(en )i[I ℄ agr(w ) 2 hagrs(e1 ); : : : ; agrs(en )i[I ℄ hbroles(e1 ); : : : ; broles(en )i[I ℄ roles(w) hdrolese(e1 ); : : : ; drolese(en )i[I ℄ roles(w) 4.3 Subjet. (w ) 2 fn; prog ^ agr(w) = agr(w ) ^ agr(w ) 2 nom (10) Adjetive. An adjetive may modify a noun and must agree with it: subjet Valeny Constraints Every daughter set (w) is a nite set of nodes ourring in the tree: 8w 2 V 8 2 Roles (w) V A omplement daughter set (w) is of ardinality at most 1 (e.g. a verb has at most 1 subjet), and it is non-empty i appears in w's valeny. We write j(w)j for (w)'s ardinality. 8 2 Comps 0 j(w)j 1 ^ j(w)j = 1 2 roles(w) Role Constraints 0 0 0 0 w0 2 (w) ) (w; w ) 0 Modern onstraint tehnology has eÆient support for suh impliative onditions.5 In partiular, if onstraint propagation an show that (w; w ) is inonsistent in T , then w 62 (w) is inferred whih improves the approximation of (w). 0 adj or In Oz, this is expressed by the onstrut: w 2 (w) ^ (w; w ) [ ℄ w 62 (w) end 0 det 0 0 0 0 (w; w ) 0 ^ ^ (w) = n at(w ) = adj agr(w ) = agr(w ) at 0 0 (11) (w; w ) 0 ^ ^ (w ) = det agr(w ) = agr(w ) (12) w = min(yield(w)) at 0 0 0 The yield of w is the set of nodes (inluding w) reahable from w by traversing downwards any number of dependeny edges. We introdue variable yield(w) for this quantity and develop, in Setion 4.6, its onstraint-based axiomatization. Sine yield(w) ontains w, it is a nonempty set of nodes i.e. integers (Setion 4.1). Thus it is meaningful to speak of its minimum element, whih we write min(yield(w)).6 4.5 Treeness Constraints Our formal framework simply assumed the \usual" denition of treeness: (a) every node has a unique mother exept for a distinguished node, alled the root, whih has none, (b) there are no yles. We must now provide an expliit axiomatization of this notion. For this purpose we introdue variables daughters(w) and mother(w ) for eah node w . The set of immediate daughters of w is simply dened as the union of its daughter sets: (w) = 0 5 at Determiner. The determiner of a noun must agree with the noun and our left-most in its yield: For eah role type 2 Roles our grammar speies a binary prediate expressing a grammatial ondition between mother and daughter. We assume that (w; w ) is dened by a formula of a onstraint language that an be interpreted over dependeny trees. We will not make this language expliit here, but detail below a few examples. These examples are intended to be illustrative rather than normative and we extend for them no laim of linguisti adequay. Aording to well-formedness ondition (4), for every edge w ! w , grammatial ondition (w; w ) must hold. Therefore the tree must satisfy the following ondition: 8w; w 2 V 8 2 Roles 0 0 A modier daughter set has no ardinality restrition (e.g. a noun an have any number of adjetives). 4.4 (w; w ) daughters [ 2Roles (w) Oz supports onstraints of the form I = (S ) between a nite domain variable I and a nite set variable S . 6 min We model the notion of `mother' of a node as a set of ardinality at most 1. Thus, we an aount for both the presene or absene of a mother. The latter ase is needed only for the root of the dependeny tree. mother (w) V 0 jmother(w)j 1 w is a mother of w0 i w0 is an immediate daughter of w: w 2 mother(w0 ) w0 2 daughters(w) Further, treeness requires the existene of a unique root. We therefore introdue the new variable root to denote the root of the dependeny tree. This is the only node without a mother: 8w 2 V root 2 V w = root jmother(w)j = 0 Eah node must ll preisely one role in the sentene; it must either be the root or an immediate daughter of another node: V = frootg ℄ ℄ (w) w2V 2 Roles In order to guarantee well-formedness, we must additionally enfore ayliity. We do this below in the axiomatization of yields. 4.6 Axiomatization of Yield The notion of yield of a lexial node, i.e. the set of nodes reahable through the transitive losure of immediate dominane edges (omplements and modiers), is essential for the expression of grammatial priniples. In a generative framework, the yield of a node annot be alulated until the full dependeny tree rooted at this node has been onstruted. In this setion, we exhibit a stati axiomatization of yields that fully exposes the underspeiation of a yield to the inferene mehanisms of onstraint propagation. Weighted Set. as a preliminary, we introdue the weighted set onstraint, between boolean variable B and FS variables S 1; S 2: S1 = B ? S2 whih has the delarative semantis: B ) :B ) S1 S1 = S2 =; Note that, if we make the lassial identiation of false with 0 and true with 1, we an also express it with the seletion onstraint: S1 = B ? S2 S1 = h;; S2 i[B + 1℄ The strit yield yield!(w) is the set of nodes stritly below w in the dependeny tree. In other words, it is formed by the disjoint union of the yields of its immediate daughters: ℄ (w) = yield! (w ) yield 0 w0 2daughters (w) But daughters(w) is not statially known; so, instead, we reformulate the above as a union of weighted sets: ℄ ... yield!(w ) = (w 2 daughters(w)) ? yield(w ) 0 0 w0 2V ... where w 2 daughters(w) is a \reied" onstraint: it denotes a boolean whih is true i w 2 daughters(w) is satised. To obtain the yield of w, we simply add w to its strit yield: 0 0 (w) = fwg ℄ yield!(w) yield (13) Disjoint union in (13) enfores the ondition that w must not appear in its own strit yield, thus ruling out loops. 4.7 Word-order onstraints Although the treatment of word-order onstraints lies well beyond the sope of this artile, we give here an idea of how they may be aommodated. The tehnique exploits powerful onstraints on sets, suh as \sequentiality" and \onvexity" (no holes). Consider that adjetives must be plaed between the determiner (if any) and the noun, and, for simpliity of presentation, ignoring the possibility of PPs, nothing else is allowed to land between the determiner and the noun. That onstraint an be expressed as follows: ^ (det(w); adj(w); fwg) (det(w) [ adj(w) [ fwg) Seq Convex where Seq(S1 ; : : : ; Sm ) is satised whenever for all i < j , all elements of Si are stritly smaller than all elements of Sj , and Convex(S ) when S is an interval (with no holes). 4.8 Creating and Solving the CSP The CSP for a given sentene is dened by the variables introdued above and by the onjuntion of all onstraints presented in the preeding setions. It is important to notie that all quantiation is of the form 8x 2 D (x) where D is a nite statially known set of integers. Suh V formulae are expanded in the CSP into (x). x D To solve the CSP, we need a \labeling" strategy. We have only experimented with the following obvious one: rst apply the default nave labeling strategy on the olletion of mother sets fmother(w) j w 2 Vg, then apply rstfail to the olletion of lexion entry seletors fentryIndex(w) j w 2 Vg, and nally use again the default nave labeling strategy on the olletion of daughter sets f(w) j 2 Roles w 2 Vg. 2 5 Results We developed several prototype parsers, for both German and English, using the tehniques desribed in this paper. In addition to what has been presented, we also support PPs (however, in the ase of ambiguous attahments, we expliitly enumerate all possibilities), relative lauses, innitive lauses, separable verb prexes in German (our tehniques are very eetive in dealing with the ambiguity arising from the multipliity of possible verb prexes for V2 sentenes). We over the following phenomena: topialization, fronting (inluding partial fronting), extraposition (although not in its full generality) and srambling. However we are still missing many important notions suh as oordination and nested extraposition elds, and our pratial overage is still quite far from what is possible in more mature grammatial frameworks. We have experimented with both relatively small hand-rafted lexions and one lexion automatially derived from an annotated orpus (quite large, 18000 entries, but in pratie disappointingly sparse). While moderate, our overage nonetheless permits experimentation with fairly intriate sentenes. Our experiene so far has been quite positive: performane is good and sales up well with sentene length and number of readings, but a systemati evaluation remains to be done. Sine our framework does not assume any apriori word order, the hallenge is not to permit diÆult linearizations to aount for suh phenomena as fronting, extraposition, or srambling, but rather to rule out those that are ungrammatial and to avoid over-generation. Our most omplete parser relies on set onstraints to express omplex priniples of linear preedene, and borrows ideas from the theory of topologial elds as well as from Reape's (1994) word order domains. While we have developed powerful tehniques to express omplex word-order onstraints, we have not yet arrived at a satisfatory formal aount for them. Our urrent researh proeeds along 4 lines: (a) extend the grammatial overage, (b) develop a formal framework for word-order onstraints in the style of the present artile, () push beyond the limits of dependeny grammar to aount for e.g. \headless" onstrutions, (d) integrate a onstraint-based treatment of semantis (Egg et al., 1998) with our treatment of syntax. 6 Conlusion In this artile, we ontributed a new presentation of dependeny grammar that follows modern linguisti pratie, and provided an axiomatization of syntati well-formedness whose eonomy and elegane arues primarily from the expressivity of set and seletion onstraints. This axiomatization regards dependeny parsing as a onguration problem and expresses it as a CSP. Within the paradigm of onurrent onstraint programming, our axiomatization also has a diret omputational reading as a program. As ould be observed in Figure 1, this program is quite eÆient in two respets: 1. Constraint propagation is very eetive at pruning the searh spae. The example of (Fig 1) makes preisely one hoie to enumerate the two possible readings of the sentene. In (Fig 4), parsing of an 18 word sentene with ambiguous PP attahment is demonstrated. Again, onstraint propagation is strong enough to permit optimal Figure 4: Enumerating PP Attahments enumeration of all parses without any failure. 2. Absolute performane is also quite satisfatory, even without any sort of optimization: the two readings of (Fig 1) are enumerated in 190ms and the 9 readings of (Fig 4) in 910ms. Our preliminary results appear to be typially about 1 order of magnitude faster than those reported by Harper (Harper et al., 1999). We intend to look into this more losely. We desribed an eetive treatment of ambiguity resting on set and seletion onstraints. In partiular we showed how seletion onstraints provided a CP aount of dependent disjuntions with the added benets of onstrutive disjuntion. In losing, it is our hope that this paper will help promote awareness of reent advanes in onurrent onstraint tehnology and of their relevane to omputational linguistis, and that the tehniques we desribed will nd appliations in other grammatial frameworks. Aknowledgements. The author is espeially grateful to Joahim Niehren for his extensive help in revising and improving this paper. Referenes Abderrahamane Aggoun, David Chan, Pierre Dufresne, Eamon Falvey, Hugh Grant, Alexander Herold, Georey Maartney, Miha Meier, David Miller, Shyam Mudambi, Bruno Perez, Emmanuel Van Rossum, Joahim Shimpf, Periklis Andreas Tsahageas, and Dominique Henry de Villeneuve. 1995. ECLiPSe 3.5. User manual, European Computer Industry Re- searh Centre (ECRC), Munih, Germany, Deember. Patrik Blakburn. 1995. Introdution: Stati and dynami aspets of syntati struture. Journal of Logi Language and Information, 4:1{4. Philippe Codognet and Daniel Diaz. 1996. Compiling onstraints in lp(FD). The Journal of Logi Programming, 27(3):185{226, June. Mehmet Dinbas, Pasal Van Hentenryk, Helmut Simonis, Abderrahamane Aggoun, Thomas Graf, and F. Berthier. 1988. The onstraint logi programming language CHIP. In Proeedings of the International Conferene on Fifth Generation Computer Systems FGCS-88, pages 693{702, Tokyo, Japan, Deember. Johen Dorre and Andreas Eisele. 1990. Feature logi with disjuntive uniation. In COLING-90, volume 2, pages 100{105. Denys Duhier and Claire Gardent. 1999. A onstraint-based treatment of desriptions. In Proeedings of the 3rd International Workshop on Computational Semantis (IWCS-3), Tilburg. Markus Egg, Joahim Niehren, Peter Ruhrberg, and Feiyu Xu. 1998. Constraints over lambda-strutures in semanti underspeiation. In Proeedings of the 17th International Conferene on Computational Linguistis and 36th Annual Meeting of the Assoiation for Computational Linguistis (COLING/ACL'98), pages 353{359, Mon- treal, Canada, August. Dale Gerdemann. 1991. Parsing and Generation of Uniation Grammars. Ph.D. thesis, University of Illinois. Carmen Gervet. 1995. Set Intervals in Constraint-Logi Programming: Denition and Implementation of a Language. Ph.D. thesis, Universite de Frane-Compte, September. European Thesis. John GriÆth. 1990. Modularizing ontexted onstraints. In COLING-96. Mary P. Harper, Stephen A. Hokema, and Christopher M. White. 1999. Enhaned onstraint dependeny grammar parsers. In Proeedings of the IASTED International Conferene on Artiial Intelligene and Soft Computing, Honolulu, Hawai USA, August. Johannes Heineke, Jurgen Kunze, Wolfgang Menzel, and Ingo Shroder. 1998. Eliminative parsing with graded onstraints. In Proeedings of the Joint Conferene COLINGACL, pages 526{530. Randall A. Helzerman and Mary P. Harper. 1993. Muse sp: An extension to the onstraint satisfation problem. Journal of Artiial Intelligene Researh, 1. Christian Holzbaur. 1990. Speiation of Constraint Based Inferene Mehanisms through Extended Uniation. Ph.D. thesis, Tehnish-Naturwissenshaftlihe Fakultat der Tehnishen Universitat Wien, Otober. ILOG. 1996. ILOG Solver: User manual, July. Version 3.2. Ronald M. Kaplan and Joan Bresnan. 1982. Lexial-funtional grammar: A formal system for grammatial representation. In Joan Bresnan, editor, The Mental Representation of Grammatial Relations, pages 173{281. MIT Press. Vinenzo Lombardo and Leonardo Lesmo. 1996. An earley-type dependeny parser for dependeny grammar. In COLING 96, pages 723{728, Kyoto, Japan. Hiroshi Maruyama. 1990. Constraint dependeny grammar. Researh Report RT0044, IBM Researh, Tokyo, Marh. John T. Maxwell and Ronald M. Kaplan. 1989. An overview of disjuntive onstraint satisfation. In Proeedings of the Internation Workshop on Parsing Tehnologies, pages 18{27. John T. Maxwell and Ronald M. Kaplan. 1998. Uniation-based parsers that automatially take advantage of ontext freeness. in preparation. Wolfgang Menzel and Ingo Shroder. 1998. Deision proedures for dependeny parsing using graded onstraints. In Proeedings of the COLING-ACL98 Workshop \Proessing of Dependeny-based Grammars". Wolfgang Menzel. 1998. Constraint satisfation for robust parsing of spoken language. Jour- nal of Experimental and Theoretial Artiial Intelligene, 10(1):77{89. The Mozart Consortium. 1998. The Mozart Programming System. http://www.mozartoz.org/. Tobias Muller and Martin Muller. 1997. Finite set onstraints in Oz. In Franois Bry, Burkhard Freitag, and Dietmar Seipel, editors, 13. Workshop Logishe Programmierung, pages 104{115, Tehnishe Universitat Munhen, 17{19 September. Peter Neuhaus and Norbert Broker. 1997. The omplexity of reognition of linguistially adequate dependeny grammars. In Proeeding of the 35th Annual Meeting of the ACL and the 8th Conferene of the EACL, pages 337{ 343, Madrid. Carl Pollard and Ivan Sag. 1987. InformationBased Syntax and Semantis, volume 1 of CSLI Leture Notes. CSLI. Mike Reape. 1994. Domain union and word order variation in german. In John Nerbonne, Klaus Netter, and Carl Pollard, editors, German in Head-Driven Phrase Struture Grammar, number 46. CSLI Publiations. Vijay A. Saraswat, Martin Rinard, and Prakash Panangaden. 1991. Semanti foundations of onurrent onstraint programming. In Con- ferene Reord of the Eighteenth Annual ACM Symposium on Priniples of Programming Languages, pages 333{352, Orlando, Florida, January 21{23,. ACM SIGACT-SIGPLAN, ACM Press. Preliminary report. Gert Smolka. 1995. The Oz Programming Model. In Computer Siene Today, volume 1000 of LNCS, pages 324{343.