1 The string-generative capacity of regular dependency languages

Marco Kuhlmann and Mathias Möhl

Abstract  This paper contributes to the formal theory of dependency grammar. We apply the classical concept of algebraic recognizability to characterize regular sets of dependency structures, and show how, in this framework, two empirically relevant structural restrictions on dependency analyses yield infinite hierarchies of ever more expressive string languages.

Keywords  dependency grammar, generative capacity

1.1 Introduction

Syntactic representations based on word-to-word dependencies have a long and venerable tradition in descriptive linguistics. Lately, they have also been used in many computational tasks; one of them is parsing. (Nivre, 2006, provides a good overview of the field.) In dependency parsing, there is a specific interest in non-projective dependency analyses, in which a word and its dependents may be spread out over a discontinuous region of the sentence; such structures naturally arise in the syntactic analysis of languages with flexible word order. Unfortunately, most formal results on non-projectivity are rather discouraging: for example, while grammar-driven dependency parsers that are restricted to projective structures can be as efficient as parsers for lexicalized context-free grammar (Eisner and Satta, 1999), parsing is prohibitively expensive when unrestricted forms of non-projectivity are permitted (Neuhaus and Bröker, 1997). A similar problem arises in data-driven dependency parsing (McDonald and Pereira, 2006).

FG-2007. Organizing Committee: Laura Kallmeyer, Paola Monachesi, Gerald Penn, Giorgio Satta. Copyright © 2009, CSLI Publications.

In search of a balance between the need for more expressivity and the disadvantage of increased processing complexity, several authors have proposed constraints that identify classes of 'mildly' non-projective dependency structures that are computationally well-behaved.
In this paper, we focus on two of these proposals: the gap-degree restriction, which puts a bound on the number of discontinuities in the region of a sentence covered by a word and its dependents, and the well-nestedness condition, which constrains the arrangement of dependency subtrees (Bodirsky et al., 2005). Both constraints have been shown to fit data from dependency treebanks very well (Kuhlmann and Nivre, 2006). However, very little is known about the formal properties of sets of restricted non-projective dependency structures.

Contents of the paper  In this paper, we contribute to the formal theory of dependency grammar. This theory was pioneered by Gaifman (1965), who presented a formalism that generates sets of projective dependency structures and is weakly equivalent to context-free grammar. We generalize and extend Gaifman's work by exploring languages based on restricted classes of non-projective dependency structures: We apply the classical concept of algebraic recognizability (Mezei and Wright, 1967) to dependency structures; this yields a canonical notion of regular dependency languages. The main contribution of the paper is the result that, in this framework, both the gap-degree parameter and the well-nestedness condition have an immediate impact on string-generative capacity. Regular dependency languages are weakly equivalent to the languages generated by lexicalized Linear Context-Free Rewriting Systems (LCFRS) (Weir, 1988; Kuhlmann and Möhl, 2007). We show that the string-language hierarchy known for LCFRS can be recovered by controlling the gap-degree parameter. Adding the well-nestedness condition leads to a proper decrease in generative capacity on nearly all levels of this hierarchy; it induces the language hierarchy known for Coupled Context-Free Grammars (Hotz and Pitsch, 1996). The hierarchies coincide only in the projective case (gap-degree 0), where we find the languages generated by Gaifman's formalism.
Structure of the paper  The remainder of the paper is structured as follows. Section 1.2 contains some basic notions related to strings and terms. In Section 1.3, we develop the notion of regular dependency languages. Then, in Section 1.4, we present our main results in the form of a hierarchy theorem. Section 1.5 concludes the paper.

1.2 Preliminaries

We expect the reader to be familiar with the basic concepts of universal algebra (see Denecke and Wismath, 2001, for an introduction), and only introduce our particular notation. Throughout the paper, we write [n] to refer to the set of positive integers up to and including n.

1.2.1 Strings

An alphabet is a non-empty, finite set of symbols. The set of strings over the alphabet A is denoted by A∗; the set of non-empty strings is denoted by A+. The empty string is denoted by ε. In the following, let w ∈ A∗ be a string over A. We write |w| for the length of w, and define the set of positions in w as the set pos(w) := { i ∈ N | 1 ≤ i ≤ |w| }.

1.2.2 Terms

Let S be a non-empty set of sorts. An S-sorted alphabet consists of an alphabet Σ and a type assignment type_Σ : Σ → S+. In the following, let σ ∈ Σ be a symbol. We usually write σ : s1 × ··· × sn → s instead of type_Σ(σ) = s1 ··· sn s. The integer n is called the rank of σ. A ranked alphabet is a sorted alphabet with a single sort; in this case, the type of each symbol is uniquely determined by its rank. If Σ is a sorted alphabet and A is an alphabet, ⟨Σ, A⟩ denotes the sorted alphabet Σ × A in which type_⟨Σ,A⟩(⟨σ, a⟩) = type_Σ(σ), for all ⟨σ, a⟩ ∈ Σ × A. In the following, let Σ be a sorted alphabet. The set of terms over Σ of sort s is defined by the recursive equation

  T_Σ^s := { σ(t1, …, tn) | σ : s1 × ··· × sn → s ∧ ∀i ∈ [n]. ti ∈ T_Σ^si } .

The set of all terms over Σ is defined as T_Σ := { t | s ∈ S ∧ t ∈ T_Σ^s }.
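The sorted-term discipline can be made concrete in code. The following sketch checks that a term is well-typed over a sorted alphabet and returns its sort; the nested-tuple term encoding and the example signature SIGMA are our own, purely illustrative choices.

```python
# Type assignment: symbol -> (argument sorts, result sort).
# Example signature (illustrative): f : s × s → s, g : t → s, a : → t.
SIGMA = {
    "f": (("s", "s"), "s"),
    "g": (("t",), "s"),
    "a": ((), "t"),  # a constant of sort t
}

def sort_of(term):
    """Return the sort of a well-typed term, or raise if it is ill-typed.
    Terms are nested tuples: (symbol, (subterm, ...))."""
    symbol, args = term
    arg_sorts, result = SIGMA[symbol]
    if len(args) != len(arg_sorts):
        raise TypeError(f"{symbol} expects {len(arg_sorts)} arguments")
    for expected, sub in zip(arg_sorts, args):
        if sort_of(sub) != expected:
            raise TypeError(f"argument of {symbol} has wrong sort")
    return result

t = ("f", (("g", (("a", ()),)), ("g", (("a", ()),))))
print(sort_of(t))  # prints: s  — the term f(g(a), g(a)) has sort s
```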
For a given term t, the set of nodes of t is a subset of N∗, defined recursively as nod(σ(t1, …, tn)) := {ε} ∪ { iu | i ∈ [n] ∧ u ∈ nod(ti) }, where juxtaposition denotes concatenation. We also use the notations t/u for the subterm of t rooted at u, and t(u) for the symbol at the node u of t. A context is a term c ∈ T_Σ in which some subterm c/u ∈ T_Σ^s has been replaced by the special symbol □_s, the hole of c. We write C_Σ for the set of all contexts obtained from terms in T_Σ. Given a context c ∈ C_Σ with hole □_s and a term t ∈ T_Σ^s, we write c · t for the term that results from replacing the hole of c by t. We also define the following notation for iterated replacement into contexts:

  c^0 · t := t ,  c^n · t := c · (c^(n−1) · t) , for n ≥ 1 .

FIGURE 1  Two labelled dependency structures

1.3 Regular dependency languages

In this section, we introduce dependency structures, and motivate the notion of regular dependency languages.

1.3.1 Dependency structures

For the purposes of this paper, dependency structures are defined relative to a ranked alphabet Σ of term constructors and an (unranked) alphabet A of labels. More precisely, a labelled dependency structure over Σ and A is a pair d = (t, u⃗), where t ∈ T_⟨Σ,A⟩ is a term, and u⃗ is a list of the nodes in t. Given two nodes u1, u2 ∈ nod(t), we say that u1 governs u2, and write u1 ⊴ u2, if u2 = u1w, for some sequence w ∈ N∗; we say that u1 precedes u2, u1 ≼ u2, if the position of u1 in u⃗ is equal to or properly precedes the position of u2; and we say that u1 is labelled with a, λ(u1) = a, if t(u1) = ⟨σ, a⟩, for some σ ∈ Σ. The surface string of a dependency structure d = (t, u1 ··· un) is defined as s(d) := λ(u1) ··· λ(un).
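The term operations defined above (node addresses, subterms, and replacement into contexts) can be sketched as follows. The nested-tuple encoding and the use of None for the hole □ are assumptions of this sketch, and addresses are encoded as digit strings, which restricts it to symbols of rank at most 9.

```python
def nod(term):
    """nod(t): the node addresses of a term, as strings of digits."""
    symbol, args = term
    addresses = {""}  # ε, the root address
    for i, sub in enumerate(args, start=1):
        addresses |= {str(i) + u for u in nod(sub)}
    return addresses

def subterm(term, u):
    """t/u: the subterm of t rooted at address u."""
    for ch in u:
        term = term[1][int(ch) - 1]
    return term

def plug(context, filler):
    """c · t: replace the hole (encoded as None) in the context by filler."""
    if context is None:
        return filler
    symbol, args = context
    return (symbol, tuple(plug(a, filler) for a in args))

def plug_n(context, filler, n):
    """c^n · t: iterated replacement, with c^0 · t = t."""
    for _ in range(n):
        filler = plug(context, filler)
    return filler

print(sorted(nod(("f", (("g", (("a", ()),)), ("a", ()))))))
# prints: ['', '1', '11', '2']
```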
Example 1  Figure 1 shows how we visualize dependency structures: nodes are represented by circles, governance is represented by arrows, precedence by the left-to-right order of the nodes, and labelling by the dotted lines. (The boxes are irrelevant for now.)

We write D_Σ(A) to refer to the set of all labelled dependency structures over Σ and A. A dependency language over Σ and A is a subset of D_Σ(A). Given such a dependency language L, the term language and the string language corresponding to L are defined as follows:

  tL := { t ∈ T_⟨Σ,A⟩ | (t, u⃗) ∈ L } ,  sL := { w ∈ A+ | d ∈ L ∧ s(d) = w } .

1.3.2 Structural constraints

We now define the classes of 'mildly' non-projective dependency structures that we want to investigate in this paper. Let d = (t, u⃗) be a dependency structure, and let u, u1, u2 ∈ nod(t) be nodes. The yield of u is the set of all nodes governed by u; it is denoted by ⌊u⌋. The interval between u1 and u2 is the set of all nodes preceded by min(u1, u2) and succeeded by max(u1, u2); it is denoted by [u1, u2]. Every yield ⌊u⌋ is partitioned into blocks: u1, u2 are in the same block of ⌊u⌋ if u ⊴ w, for all nodes w ∈ [u1, u2]. The number of blocks that constitute the yield ⌊u⌋ of u is called the block-degree of u; the gap-degree of u is the block-degree, minus 1. The class of all dependency structures in which every node has block-degree at most k is denoted by Dk. Two yields ⌊u1⌋, ⌊u2⌋ interleave if there are nodes v1, w1 ∈ ⌊u1⌋ and v2, w2 ∈ ⌊u2⌋ such that v1 ≺ v2, v2 ≺ w1, and w1 ≺ w2. A dependency structure is called well-nested if for every pair of interleaving yields ⌊u1⌋, ⌊u2⌋, either u1 ⊴ u2 or u2 ⊴ u1 holds. The class of all well-nested dependency structures is denoted by Dwn.

Example 2  Both structures in Figure 1 have block-degree 2.
To see this, consider the node b2: the yield of this node falls into two blocks (marked by the boxes), and this is the highest number of blocks per node. The left structure is well-nested. The right structure is not well-nested; for example, the yields ⌊b1⌋ and ⌊b2⌋ interleave, but neither does b1 govern b2, nor vice versa.

1.3.3 Local order annotations

Kuhlmann and Möhl (2007) show how to encode dependency structures into terms over a sorted set Ω of local order annotations. This result is based on the observation that the blocks of a node have a recursive structure that closely follows the term structure: the blocks of a node u can be decomposed into the singleton interval containing u, and the blocks of the children of u. The precedence relation of a dependency structure can therefore be represented, in a unique way, by annotating each node u with an order on its sub-blocks, together with information on how to merge adjacent sub-blocks to obtain the blocks of u.

Example 3  Consider the node b2 in the left structure in Figure 1. This node has two blocks: the first block consists of the first (and only) block of a2, followed by the first block of b3; the second block consists of b2 itself, followed by the second block of b3 and the first (and only) block of c2. If we number the direct dependents of b2 from left to right, then the local order among the constituents of the blocks of b2 can be annotated as ⟨12, 023⟩, where the ith occurrence of the symbol j refers to the ith block of the direct dependent j, the symbol 0 refers to b2 itself, and the two components of the tuple represent the two blocks of b2.

An order annotation ω of a node u with n children fixes the block-degree of u as the number |ω| of components of ω, and the block-degree of the ith child of u as the number #i(ω) of occurrences of the symbol i in ω. To reflect this, ω receives the type #1(ω) × ··· × #n(ω) → |ω|.
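The structural notions from Section 1.3.2, block-degree and well-nestedness, are straightforward to compute. The sketch below uses our own encoding (a structure given by a parent map and the precedence order as a list of nodes) and checks well-nestedness by brute force; the example structure is invented for illustration and is not one of the structures from Figure 1.

```python
def governs(parent, u, v):
    """u ⊴ v: u lies on the path from v up to the root (reflexively)."""
    while v is not None:
        if v == u:
            return True
        v = parent[v]
    return False

def block_degree(parent, order, u):
    """Number of blocks of the yield ⌊u⌋ under the given precedence order."""
    pos = {v: i for i, v in enumerate(order)}
    ys = sorted(pos[v] for v in order if governs(parent, u, v))
    # one block per maximal run of consecutive positions
    return 1 + sum(1 for a, b in zip(ys, ys[1:]) if b - a > 1)

def well_nested(parent, order):
    """Check that every pair of interleaving yields is related by governance."""
    pos = {v: i for i, v in enumerate(order)}
    def yld(u):
        return sorted(pos[v] for v in order if governs(parent, u, v))
    for u1 in order:
        for u2 in order:
            if governs(parent, u1, u2) or governs(parent, u2, u1):
                continue
            y1, y2 = yld(u1), yld(u2)
            if any(v1 < v2 < w1 < w2
                   for v1 in y1 for w1 in y1 for v2 in y2 for w2 in y2):
                return False
    return True

# An ill-nested example: the yields of u and v cross each other.
parent = {"r": None, "u": "r", "v": "r", "u2": "u", "v2": "v"}
order = ["r", "u", "v", "u2", "v2"]
print(block_degree(parent, order, "u"), well_nested(parent, order))
# prints: 2 False
```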
1.3.4 Regular sets of dependency structures

Based on the local order annotations, we can give an algebraic structure to dependency structures as follows: Let k ∈ N. The set D_Ω^k of segmented dependency structures over Ω of sort k is the set of all pairs (t, ⟨u⃗1, …, u⃗k⟩) with u⃗i ≠ ε for all i ∈ [k], and (t, u⃗1 ··· u⃗k) ∈ D_Ω.¹ To every order annotation ω ∈ Ω of a node with n children, we associate an n-ary algebraic operation f_ω on segmented dependency structures. This operation takes the disjoint union of its arguments, introduces a new root node, puts the blocks corresponding to the components of the argument structures and the block for the root node into the order defined by ω, and groups these ordered constituents into blocks according to the component structure of ω. An order annotation ω : k1 × ··· × kn → k thus is a compact description of a function f_ω : D_Ω^k1 × ··· × D_Ω^kn → D_Ω^k.

Example 4  The local order annotation ⟨12, 023⟩ gives rise to a composition operation f_⟨12,023⟩ : D_Ω^1 × D_Ω^2 × D_Ω^1 → D_Ω^2. In the composition of the left structure in Figure 1, this operation takes the sub-structure with yield {a2} (1 block), the sub-structure with yield {a3, b3, c1} (2 blocks), and the sub-structure with yield {c2} (1 block), and produces the sub-structure with yield {a2, a3, b2, b3, c1, c2} (2 blocks).

Once we are able to view dependency structures as the result of operations in an algebra, we can investigate recognizable sets of dependency structures (Mezei and Wright, 1967). The concept of algebraic recognizability provides a canonical notion of 'regularity' for sets of structures, be they strings, trees, graphs, or general relational structures (see Courcelle, 1996, for an overview). It is a fundamental concept with many different characterizations: it can be understood in terms of homomorphisms into finite algebras, automata, and definability in monadic second-order logic.
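A block-level sketch of the composition operation f_ω may help. Here an annotation is a tuple of component strings (⟨12, 023⟩ becomes ("12", "023")), a segmented structure is represented only by its tuple of blocks (lists of node names), and single digits restrict the sketch to at most nine children; all of these encodings are our own.

```python
def annotation_type(omega):
    """Type k1 × ... × kn → k of an annotation: ki = #i(ω), k = |ω|."""
    n = max((int(ch) for comp in omega for ch in comp), default=0)
    arg_sorts = tuple(sum(comp.count(str(i)) for comp in omega)
                      for i in range(1, n + 1))
    return arg_sorts, len(omega)

def compose(omega, root, args):
    """f_ω: order and merge the blocks of the arguments and the new root.
    args[i] is the tuple of blocks of child i+1; '0' stands for the root."""
    used = [0] * len(args)              # next unconsumed block per child
    blocks = []
    for comp in omega:                  # one component = one result block
        block = []
        for ch in comp:
            i = int(ch)
            if i == 0:
                block.append(root)      # the singleton block {root}
            else:
                block.extend(args[i - 1][used[i - 1]])
                used[i - 1] += 1
        blocks.append(block)
    return tuple(blocks)

# Example 4: f_⟨12,023⟩ applied to the three sub-structures of Figure 1 (left).
result = compose(("12", "023"), "b2",
                 [(["a2"],), (["a3"], ["b3", "c1"]), (["c2"],)])
print(result)  # prints: (['a2', 'a3'], ['b2', 'b3', 'c1', 'c2'])
```

The type that annotation_type computes for ⟨12, 023⟩ is 1 × 2 × 1 → 2, matching the signature of f_⟨12,023⟩ in Example 4.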
In this paper, we use its characterization through regular term grammars (see Denecke and Wismath, 2001):

Definition 1  Let Σ be an X-sorted alphabet, and let x ∈ X be a sort. A regular term grammar is a construct G = (N, Σ, S, P), where N is an X-indexed family of alphabets of non-terminal symbols, S ∈ N_x is a distinguished start symbol, and P is a finite set of productions of the form A → t, where A ∈ N_y and t ∈ T_Σ^y(N), for some sort y ∈ X.

Given that the conversion between dependency structures and terms over the signature Ω of local order annotations is one-to-one (Kuhlmann and Möhl, 2007), recognizable dependency languages are isomorphic to their corresponding term languages. This motivates the following definition:

¹ We ignore the aspect of labelling for the sake of simplicity.

FIGURE 2  Regular dependency grammars

Definition 2  A dependency language over Σ and A is called regular if its encoding as a set of terms over ⟨Ω, A⟩ is generated by some regular term grammar. In slight abuse of terminology, we refer to regular term grammars over (subsets of) ⟨Ω, A⟩ as regular dependency grammars. The class of all regular dependency languages is denoted by regd.

Note that, while the set Ω of all order annotations is infinite, regular dependency grammars require us to get by with a finite subset of these annotations per language. In particular, every regular dependency language is built on a finite set of sorts, corresponding to the numbers of blocks per child in the order annotations and composition operations.

Example 5  To illustrate the definition of regular dependency grammars, we present a grammar that generates a language L with sL = count(2), where

  count(k) := { a1^n b1^n ··· ak^n bk^n | n ≥ 1 } .
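As a concrete aside, membership in count(k) is easy to test mechanically; the encoding of strings as Python lists of symbols such as "a1", "b1" is an assumption of this sketch.

```python
def in_count(k, w):
    """Decide whether w (a list of symbols) lies in
    count(k) = { a1^n b1^n ... ak^n bk^n | n >= 1 }."""
    if len(w) == 0 or len(w) % (2 * k) != 0:
        return False
    n = len(w) // (2 * k)
    # the unique candidate: n copies of a1, then b1, ..., then ak, then bk
    expected = [f"{x}{i}" for i in range(1, k + 1) for x in "ab"
                for _ in range(n)]
    return w == expected

print(in_count(2, ["a1", "a1", "b1", "b1", "a2", "a2", "b2", "b2"]))
# prints: True
```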
Note that, for k = 1, count(k) is homomorphic to the familiar context-free language a^n b^n, and that for every k > 1, count(k) is not context-free. We present the productions of a grammar with start symbol S:

  S → ⟨⟨012314⟩, a1⟩(R, B1, A2, B2)
  S → ⟨⟨0123⟩, a1⟩(B1, A2, B2)
  R → ⟨⟨012, 314⟩, a1⟩(R, B1, A2, B2)
  R → ⟨⟨01, 23⟩, a1⟩(B1, A2, B2)
  A2 → ⟨⟨0⟩, a2⟩
  Bi → ⟨⟨0⟩, bi⟩ , for i ∈ [2].

Figure 2 shows a term generated by this grammar, and its corresponding dependency structure. Note that the structure is well-nested. The construction in the example can be extended to obtain a regular dependency grammar for every language count(k), k ∈ N. In this construction, we need exactly two sorts, 1 and k. The usage of the sort k implies that the resulting dependency language contains structures with block-degree k or, equivalently, gap-degree k − 1. In the next section, we show that count(k) actually enforces these structures.

1.4 Main results

In this section, we present the main technical results of this paper: that both the gap-degree parameter and the well-nestedness condition have immediate consequences for the string-generative capacity of regular dependency languages. The proof is based on two technical lemmata.

1.4.1 Technical lemmata

The first lemma is a pumping lemma for regular term languages; it can be seen as the correspondent of Ogden's lemma for context-free string languages. Let L ⊆ T_Σ be a term language, and let t ∈ L. A non-empty context p ∈ C_Σ is called pumpable if there is a context c ∈ C_Σ and a term t′ ∈ T_Σ such that t = c · p · t′ and c · p^n · t′ ∈ L, for all n ≥ 0.

Lemma 1 (Kuhlmann (2008))  For every regular term language L ⊆ T_Σ, there is a constant nL ≥ 1 such that, if t ∈ L and at least nL nodes in t are marked as distinguished, then there exists a pumpable context p ∈ C_Σ in t that contains at least one marked node.

Proof.
The proof is based on the observation that, for all terms t over symbols of rank at most n, if the number of distinguished nodes in t is large enough, then there is at least one root-to-leaf path in t that visits at least i + 1 nodes that qualify as roots of a context containing marked nodes, for any given i. By choosing i to be the number of non-terminals in a (normal-form) grammar for L, we thus obtain a pumpable context. ∎

The second lemma formulates two elementary results about intervals of positions in strings. Let w ∈ A∗ be a string. A mask for w is a non-empty sequence M = [i1, j1] ··· [in, jn] of intervals of positions in w such that jk < ik+1, for all k ∈ [n − 1]. We call the intervals [i, j] the blocks of the mask M, and use |M| to denote their number. We write B ∈ M if B is a block of M. Given a string w and a mask M for w, the set of positions corresponding to M is defined as

  pos([i1, j1] ··· [in, jn]) := { i ∈ pos(w) | ∃k ∈ [n]. i ∈ [ik, jk] } .

Given a set P of positions in w, we write P̄ for the set of remaining positions, and [P] for the minimal mask for w such that pos([P]) = P. We say that P contributes to a block B of some mask if P ∩ B ≠ ∅. For masks M with an even number of blocks, we define the fusion of M as

  F([i1, j1][i′1, j′1] ··· [in, jn][i′n, j′n]) := [i1, j′1] ··· [in, j′n] .

Lemma 2  Let w ∈ A∗ be a string, let M be a mask for w with an even number of blocks, and let P be a set of positions in w such that both P and P̄ contribute to every block of M. Then |[P]| ≥ |M|/2. Furthermore, if |[P]| ≤ |M|/2, then P ⊆ pos(F(M)).

Proof. For every block B ∈ [P], let n(B) be the number of blocks in M that B contributes to. We make two observations:

- Since P contributes to each block of M, |M| ≤ Σ_{B ∈ [P]} n(B).
- Since P̄ contributes to each block of M, no block B ∈ [P] can fully contain a block of M; therefore, n(B) ≤ 2, for all blocks B ∈ [P].
Putting these two observations together, we see that

  |M| ≤ Σ_{B ∈ [P]} n(B) ≤ Σ_{B ∈ [P]} 2 = 2 · |[P]| .

For the second part of the lemma, let M = [i1, j1][i′1, j′1] ··· [in, jn][i′n, j′n] and [P] = [k1, l1] ··· [kn, ln]. Then each block of [P] contributes to exactly two blocks of M. More precisely, for each h ∈ [n], the block [kh, lh] of [P] contributes to the blocks [ih, jh] and [i′h, j′h] of M. Because P̄ also contributes to [ih, jh] and [i′h, j′h], [kh, lh] is a proper subset of [ih, j′h], which is a block of F(M). Hence, P ⊆ pos(F(M)). ∎

1.4.2 String languages that enforce structural properties

We now show how certain families of string languages can be used to enforce structural properties in regular dependency languages. The first family that we consider is the family count from the example above. For what follows, put regd(D) := { L ∈ regd | L ⊆ D }.

Lemma 3  Let k ∈ N. Every language L ∈ regd with sL = count(k) contains structures with block-degree at least k.

Proof. Let L ⊆ D_Σ, and let d1 = (t1, u⃗1) be a dependency structure contained in L such that |s(d1)| ≥ nL, where nL is the constant from Lemma 1. Because of the one-to-one correspondence between the positions in the string s(d1) and the nodes in the term t1, we have |t1| ≥ nL. If we now mark all nodes in t1 as distinguished, Lemma 1 tells us that there exist contexts p, c ∈ C_Σ and a term t′ ∈ T_Σ with t1 = c · p · t′ such that the 'pumped' term t2 := c · p² · t′ is contained in tL. Hence, there must be a linearization u⃗2 of the nodes in t2 such that d2 := (t2, u⃗2) ∈ L. Now, let P be the set of positions in s(d2) that correspond to the subterm p · t′ of t2, and hence to a yield ⌊u⌋, and let M be the uniquely determined mask for s(d2) with 2k blocks in which each of the blocks covers |s(d2)|/2k positions. Since c · p · t′ ∈ tL and c · p² · t′ ∈ tL, the context p contributes to s(d2) at least one occurrence of every symbol ai, bi, for i ∈ [k].
Hence, both P and P̄ contribute to each block of M. With the first part of Lemma 2, we then deduce that |[P]| ≥ k. By the one-to-one correspondence between the positions in s(d2) and the nodes in d2, this means that ⌊u⌋ is distributed over at least k blocks; hence, d2 has block-degree at least k (gap-degree at least k − 1). ∎

To enforce ill-nested dependency structures, we choose the family

  resp(k) := { a1^m b1^m c1^n d1^n ··· ak^m bk^m ck^n dk^n | m, n ∈ N } ,

where resp(2) is from Weir (1988). Similar to count(k), resp(k) can be generated by a regular dependency grammar of degree k.

Lemma 4  Let k ∈ N, k > 1. Every L ∈ regd(Dk) with sL = resp(k) contains structures that are not well-nested.

Proof. Consider a dependency structure d1 = (t1, u⃗1) ∈ L with s(d1) = a1^m b1^m c1^n d1^n ··· ak^m bk^m ck^n dk^n, where m = n = nL is the constant from Lemma 1, and mark all nodes in t1 that are labelled with symbols from X := { xi | x ∈ {a, b}, i ∈ [k] } as distinguished. Because of the one-to-one correspondence between the positions in s(d1) and the nodes in t1, we thus have nL · |X| ≥ nL marked nodes. Lemma 1 then tells us that there exist contexts p, c ∈ C_Σ and a term t′ ∈ T_Σ such that t2 := c · p² · t′ belongs to tL. Thus, there is a linearization u⃗2 of the nodes in t2 such that the dependency structure d2 := (t2, u⃗2) belongs to L. Let v be the root node of the context p in t1, and let w be the root node of the instance of p in t2 that is farther away from the root. As another consequence of Lemma 1, at least one node in p is labelled with a symbol x ∈ X. We now show that the set of labels that occur in the subterm p · t′ of t1 equals X. Since both s(d1) and s(d2) are elements of resp(k), all symbols x′ ∈ X must occur equally often as labels in p, and since at least one node of p is labelled with x ∈ X, each element of X occurs at least once as a label in p, and hence also in p · t′.
To show the inclusion in the other direction, consider the set P of positions in s(d2) that are contributed by the yield ⌊w⌋, which equals the set of nodes in p · t′. Furthermore, let M be the (uniquely determined) mask for s(d2) with 2k blocks in which each block contains all the occurrences of one symbol in X. We now apply Lemma 2 to deduce that |[P]| ≥ k; since d2 ∈ Dk, we even have |[P]| = k, and P ⊆ pos(F(M)). Given that all positions in pos(F(M)) are labelled with symbols from X, all nodes in ⌊w⌋, and consequently all nodes in p · t′, are labelled with symbols in X.

We now apply Lemma 1 a second time, marking all elements of the set Y := { yi | y ∈ {c, d}, i ∈ [k] }. Analogously to the first application, we deduce the existence of a pumpable context with root node v′ such that the set of labels occurring in ⌊v′⌋ equals Y. Since X and Y are disjoint, neither v ⊴ v′ nor v′ ⊴ v; however, the yields ⌊v⌋ and ⌊v′⌋ interleave. Hence, the structure d2 is not well-nested. ∎

1.4.3 A hierarchy theorem

We are now ready to present our main result, a hierarchy theorem for the string languages corresponding to regular dependency languages. We define sregd(D) := { sL | L ∈ regd(D) }. From Example 5 and Lemmata 3 and 4, we get the following result:

Theorem 5 (Hierarchy theorem)  For every k ∈ N:

- sregd(Dk) ⊊ sregd(Dk+1),
- sregd(Dk ∩ Dwn) ⊊ sregd(Dk+1 ∩ Dwn),
- sregd(Dk ∩ Dwn) ⊊ sregd(Dk) if k > 1,
- sregd(Dk+1 ∩ Dwn) − sregd(Dk) ≠ ∅.

This theorem recovers and relates the string-language hierarchies for Linear Context-Free Rewriting Systems (Weir, 1988) and Coupled Context-Free Grammars (Hotz and Pitsch, 1996).
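To connect the theorem back to Example 5: evaluating that grammar's order annotations bottom-up reproduces exactly the strings a1^n b1^n a2^n b2^n. The sketch below uses our own encodings (annotations as tuples of digit strings, segmented structures as tuples of blocks, strings as lists of symbols) and derives the surface string for a given n.

```python
def apply_annotation(omega, root, args):
    """Evaluate an order annotation: order and merge the argument blocks
    and the singleton block of the new root (written '0')."""
    used = [0] * len(args)          # next unconsumed block per child
    blocks = []
    for comp in omega:              # one component = one result block
        block = []
        for ch in comp:
            i = int(ch)
            if i == 0:
                block.append(root)
            else:
                block.extend(args[i - 1][used[i - 1]])
                used[i - 1] += 1
        blocks.append(block)
    return tuple(blocks)

def derive(n):
    """Surface string of the structure that the grammar of Example 5
    derives with n - 1 applications of the R-productions."""
    A2, B1, B2 = (["a2"],), (["b1"],), (["b2"],)
    if n == 1:                      # S → ⟨⟨0123⟩, a1⟩(B1, A2, B2)
        (block,) = apply_annotation(("0123",), "a1", [B1, A2, B2])
        return block
    # innermost step: R → ⟨⟨01, 23⟩, a1⟩(B1, A2, B2)
    r = apply_annotation(("01", "23"), "a1", [B1, A2, B2])
    # further steps: R → ⟨⟨012, 314⟩, a1⟩(R, B1, A2, B2)
    for _ in range(n - 2):
        r = apply_annotation(("012", "314"), "a1", [r, B1, A2, B2])
    # top: S → ⟨⟨012314⟩, a1⟩(R, B1, A2, B2)
    (block,) = apply_annotation(("012314",), "a1", [r, B1, A2, B2])
    return block

print(derive(2))
# prints: ['a1', 'a1', 'b1', 'b1', 'a2', 'a2', 'b2', 'b2']
```

Every string derive(n) lies in count(2), and each recursive step manipulates a sort-2 (two-block) structure, mirroring the block-degree-2 structures that Lemma 3 shows are unavoidable for this string language.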
1.5 Conclusion

In this paper, we have motivated the notion of regular dependency languages (Kuhlmann and Möhl, 2007) by the notion of algebraic recognizability, and shown that, in this framework, both the gap-degree restriction and the well-nestedness condition have an immediate impact on string-generative capacity: they induce two infinite hierarchies of ever more expressive dependency languages. These hierarchies are intimately related to the hierarchies obtained for several mildly context-sensitive grammar formalisms.

The close link between 'mild' forms of non-projectivity and notions of formal power provides a promising starting point for future work. Structural properties such as gap-degree and well-nestedness are immediately observable in dependency analyses (for example, in dependency treebanks), and may therefore offer an interesting alternative to a comparison of grammar formalisms based on the relatively weak notion of string-generative capacity, or the very formalism-specific notion of strong generative capacity.

Acknowledgements  The research reported in this paper is funded by the German Research Foundation. We thank the anonymous reviewers of the paper for their detailed comments.

References

Bodirsky, Manuel, Marco Kuhlmann, and Mathias Möhl. 2005. Well-nested drawings as models of syntactic structure. In Tenth Conference on Formal Grammar and Ninth Meeting on Mathematics of Language. Edinburgh, Scotland, UK.

Courcelle, Bruno. 1996. Basic notions of universal algebra for language theory and graph grammars. Theoretical Computer Science 163:1–54.

Denecke, Klaus and Shelly L. Wismath. 2001. Universal Algebra and Applications in Theoretical Computer Science. Chapman and Hall/CRC.

Eisner, Jason and Giorgio Satta. 1999. Efficient parsing for bilexical context-free grammars and head automaton grammars. In 37th Annual Meeting of the Association for Computational Linguistics (ACL), pages 457–464.
College Park, Maryland, USA.

Gaifman, Haim. 1965. Dependency systems and phrase-structure systems. Information and Control 8:304–337.

Hotz, Günter and Gisela Pitsch. 1996. On parsing coupled-context-free languages. Theoretical Computer Science 161:205–233.

Kuhlmann, Marco. 2008. Ogden's lemma for regular tree languages. CoRR abs/cs/0810.4249.

Kuhlmann, Marco and Mathias Möhl. 2007. Mildly context-sensitive dependency languages. In 45th Annual Meeting of the Association for Computational Linguistics (ACL). Prague, Czech Republic.

Kuhlmann, Marco and Joakim Nivre. 2006. Mildly non-projective dependency structures. In 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics (COLING-ACL), Main Conference Poster Sessions, pages 507–514. Sydney, Australia.

McDonald, Ryan and Fernando Pereira. 2006. Online learning of approximate dependency parsing algorithms. In Eleventh Conference of the European Chapter of the Association for Computational Linguistics (EACL), pages 81–88. Trento, Italy.

Mezei, Jorge E. and Jesse B. Wright. 1967. Algebraic automata and context-free sets. Information and Control 11:3–29.

Neuhaus, Peter and Norbert Bröker. 1997. The complexity of recognition of linguistically adequate dependency grammars. In 35th Annual Meeting of the Association for Computational Linguistics (ACL), pages 337–343. Madrid, Spain.

Nivre, Joakim. 2006. Inductive Dependency Parsing, vol. 34 of Text, Speech and Language Technology. Dordrecht, The Netherlands: Springer-Verlag.

Weir, David J. 1988. Characterizing Mildly Context-Sensitive Grammar Formalisms. Ph.D. thesis, University of Pennsylvania, Philadelphia, Pennsylvania, USA.