1 The string-generative capacity of regular dependency languages

Marco Kuhlmann and Mathias Möhl
Abstract
This paper contributes to the formal theory of dependency grammar. We
apply the classical concept of algebraic recognizability to characterize regular
sets of dependency structures, and show how, in this framework, two empirically relevant structural restrictions on dependency analyses yield infinite
hierarchies of ever more expressive string languages.
Keywords dependency grammar, generative capacity
1.1 Introduction
Syntactic representations based on word-to-word dependencies have a
long and venerable tradition in descriptive linguistics. Lately, they have
also been used in many computational tasks; one of them is parsing.
(Nivre, 2006, provides a good overview of the field.) In dependency parsing, there is a specific interest in non-projective dependency analyses,
in which a word and its dependents may be spread out over a discontinuous region of the sentence; such structures naturally arise in the
syntactic analysis of languages with flexible word order. Unfortunately,
most formal results on non-projectivity are rather discouraging: for example, while grammar-driven dependency parsers that are restricted
to projective structures can be as efficient as parsers for lexicalized
context-free grammar (Eisner and Satta, 1999), parsing is prohibitively
expensive when unrestricted forms of non-projectivity are permitted
(Neuhaus and Bröker, 1997). A similar problem arises in data-driven
dependency parsing (McDonald and Pereira, 2006).
FG-2007. Organizing Committee: Laura Kallmeyer, Paola Monachesi, Gerald Penn, Giorgio Satta.
Copyright © 2009, CSLI Publications.
In search of a balance between the need for more expressivity and
the disadvantage of increased processing complexity, several authors
have proposed constraints to identify classes of ‘mildly’ non-projective
dependency structures that are computationally well-behaved. In this
paper, we focus on two of these proposals: the gap-degree restriction,
which puts a bound on the number of discontinuities in the region of a
sentence covered by a word and its dependents, and the well-nestedness
condition, which constrains the arrangement of dependency subtrees
(Bodirsky et al., 2005). Both constraints have been shown to fit data from dependency treebanks very well (Kuhlmann and Nivre, 2006). However, very little is known about the formal properties of sets of restricted non-projective dependency structures.
Contents of the paper
In this paper, we contribute to the formal theory of dependency grammar. This theory was pioneered by Gaifman (1965), who presented a
formalism that generates sets of projective dependency structures and
is weakly equivalent to context-free grammar. We generalize and extend
Gaifman’s work by exploring languages based on restricted classes of
non-projective dependency structures: We apply the classical concept of
algebraic recognizability (Mezei and Wright, 1967) to dependency structures; this yields a canonical notion of regular dependency languages.
The main contribution of the paper is the result that, in this framework, both the gap-degree parameter and the well-nestedness condition
have an immediate impact on string-generative capacity.
Regular dependency languages are weakly equivalent to the languages generated by lexicalized Linear Context-Free Rewriting Systems (lcfrs) (Weir, 1988; Kuhlmann and Möhl, 2007). We show that
the string-language hierarchy known for lcfrs can be recovered by
controlling the gap-degree parameter. Adding the well-nestedness condition leads to a proper decrease in generative capacity on nearly all
levels of this hierarchy; it induces the language hierarchy known for
Coupled Context-Free Grammars (Hotz and Pitsch, 1996). The hierarchies coincide only in the projective case (gap-degree 0), where we find
the languages generated by Gaifman’s formalism.
Structure of the paper
The remainder of the paper is structured as follows. Section 1.2 contains some basic notions related to strings and terms. In Section 1.3,
we develop the notion of regular dependency languages. Then, in Section 1.4 we present our main results in the form of a hierarchy theorem.
Section 1.5 concludes the paper.
1.2 Preliminaries
We expect the reader to be familiar with the basic concepts of universal
algebra (see Denecke and Wismath, 2001, for an introduction), and only
introduce our particular notation. Throughout the paper, we write [n]
to refer to the set of positive integers up to and including n.
1.2.1 Strings
An alphabet is a non-empty, finite set of symbols. The set of strings over the alphabet A is denoted by A∗; the set of non-empty strings is denoted by A⁺. The empty string is denoted by ε. In the following, let w ∈ A∗ be a string over A. We write |w| for the length of w, and define the set of positions in w as the set pos(w) := { i ∈ ℕ | 1 ≤ i ≤ |w| }.
1.2.2 Terms
Let S be a non-empty set of sorts. An S-sorted alphabet consists of an alphabet Σ and a type assignment type_Σ : Σ → S⁺. In the following, let σ ∈ Σ be a symbol. We usually write σ : s₁ × ⋯ × sₙ → s instead of type_Σ(σ) = s₁ ⋯ sₙ s. The integer n is called the rank of σ. A ranked alphabet is a sorted alphabet with a single sort; in this case, the type of each symbol is uniquely determined by its rank. If Σ is a sorted alphabet and A is an alphabet, ⟨Σ, A⟩ denotes the sorted alphabet Σ × A in which type_⟨Σ,A⟩(⟨σ, a⟩) = type_Σ(σ), for all ⟨σ, a⟩ ∈ Σ × A.
In the following, let Σ be a sorted alphabet. The set of terms over Σ of sort s is defined by the recursive equation
T_Σ^s := { σ(t₁, …, tₙ) | σ : s₁ × ⋯ × sₙ → s ∧ ∀i ∈ [n]. tᵢ ∈ T_Σ^{sᵢ} }.
The set of all terms over Σ is defined as T_Σ := { t | s ∈ S ∧ t ∈ T_Σ^s }.
For a given term t, the set of nodes of t is a subset of ℕ∗, defined recursively as nod(σ(t₁, …, tₙ)) := {ε} ∪ { iu | i ∈ [n] ∧ u ∈ nod(tᵢ) }, where juxtaposition denotes concatenation. We also use the notations t/u for the subterm of t rooted at u, and t(u) for the symbol at the node u of t. A context is a term c ∈ T_Σ in which some subterm c/u ∈ T_Σ^s has been replaced by the special symbol □_s, the hole of c. We write C_Σ for the set of all contexts obtained from terms in T_Σ. Given a context c ∈ C_Σ with hole □_s and a term t ∈ T_Σ^s, we write c · t for the term that results from replacing the hole of c by t. We also define the following notation for iterated replacement into contexts:
c⁰ · t := t,   cⁿ · t := c · (cⁿ⁻¹ · t), for n ≥ 1.
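To make these operations concrete, here is a minimal Python sketch (our own illustration; the paper itself contains no code), assuming terms are encoded as (symbol, children) pairs:

```python
# A minimal sketch of terms and contexts; names and encoding are our own.

HOLE = ("<hole>", [])   # stands for the special hole symbol of a context

def subterm(t, u):
    """t/u: the subterm of t rooted at node u (a tuple of 1-based child indices)."""
    if not u:
        return t
    sym, children = t
    return subterm(children[u[0] - 1], u[1:])

def plug(c, t):
    """c . t: the term obtained by replacing the (unique) hole of context c by t."""
    if c == HOLE:
        return t
    sym, children = c
    return (sym, [plug(child, t) for child in children])

def plug_n(c, t, n):
    """Iterated replacement: c^0 . t = t and c^n . t = c . (c^(n-1) . t)."""
    for _ in range(n):
        t = plug(c, t)
    return t

# 'Pumping' the context f(g(<hole>)) twice into the leaf a:
c = ("f", [("g", [HOLE])])
print(plug_n(c, ("a", []), 2))
# ('f', [('g', [('f', [('g', [('a', [])])])])])
```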
[FIGURE 1: Two labelled dependency structures, each over the nodes a₁, a₂, a₃, b₁, b₂, b₃, c₁, c₂, c₃]

1.3 Regular dependency languages
In this section, we introduce dependency structures, and motivate the
notion of regular dependency languages.
1.3.1 Dependency structures
For the purposes of this paper, dependency structures are defined relative to a ranked alphabet Σ of term constructors and an (unranked) alphabet A of labels. More precisely, a labelled dependency structure over Σ and A is a pair d = (t, u⃗), where t ∈ T_⟨Σ,A⟩ is a term, and u⃗ is a list of the nodes in t. Given two nodes u₁, u₂ ∈ nod(t), we say that u₁ governs u₂, and write u₁ ⊴ u₂, if u₂ = u₁w, for some sequence w ∈ ℕ∗; we say that u₁ precedes u₂, u₁ ⪯ u₂, if the position of u₁ in u⃗ is equal to or properly precedes the position of u₂; we say that u₁ is labelled with a, λ(u₁) = a, if t(u₁) = ⟨σ, a⟩, for some σ ∈ Σ. The surface string of a dependency structure d = (t, u₁ ⋯ uₙ) is defined as s(d) := λ(u₁) ⋯ λ(uₙ).
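Under the same illustrative encoding as in the sketch above, where a symbol of ⟨Σ, A⟩ is a pair (σ, a), the surface string can be read off as follows:

```python
# Sketch (our own): reading off the surface string of a dependency structure
# d = (term, order), where each term symbol is a pair (sigma, label).

def node_symbol(t, u):
    """t(u): the symbol at node u of term t = (symbol, [children])."""
    sym, children = t
    return sym if not u else node_symbol(children[u[0] - 1], u[1:])

def surface(term, order):
    """s(d): the labels of the nodes, read in precedence order."""
    return [node_symbol(term, u)[1] for u in order]   # second component: label

# The term <f, x>(<g, y>) with node list (1), epsilon has surface string y x:
t = (("f", "x"), [(("g", "y"), [])])
print(surface(t, [(1,), ()]))   # ['y', 'x']
```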
Example 1 Figure 1 shows how we visualize dependency structures:
nodes are represented by circles, governance is represented by arrows,
precedence by the left-to-right order of the nodes, and labelling by the
dotted lines. (The boxes are irrelevant for now.)
We write D_Σ(A) to refer to the set of all labelled dependency structures over Σ and A. A dependency language over Σ and A is a subset of D_Σ(A). Given such a dependency language L, the term language and the string language corresponding to L are defined as follows:
tL := { t ∈ T_⟨Σ,A⟩ | (t, u⃗) ∈ L },   sL := { w ∈ A⁺ | d ∈ L ∧ s(d) = w }.
1.3.2 Structural constraints
We now define the classes of ‘mildly’ non-projective dependency structures that we want to investigate in this paper. Let d = (t, u⃗) be a dependency structure, and let u, u₁, u₂ ∈ nod(t) be nodes. The yield of u is the set of all nodes governed by u; it is denoted by ⌊u⌋. The interval between u₁ and u₂ is the set of all nodes preceded by min(u₁, u₂) and succeeded by max(u₁, u₂); it is denoted by [u₁, u₂]. Every yield ⌊u⌋ is partitioned into blocks: u₁, u₂ are in the same block of ⌊u⌋ if u ⊴ w, for all nodes w ∈ [u₁, u₂]. The number of blocks that constitute the yield ⌊u⌋ of u is called the block-degree of u. The gap-degree of u is the block-degree, minus 1. The class of all dependency structures in which every node has block-degree at most k is denoted by D_k. Two yields ⌊u₁⌋, ⌊u₂⌋ interleave if there are nodes v₁, w₁ ∈ ⌊u₁⌋ and v₂, w₂ ∈ ⌊u₂⌋ such that v₁ ≺ v₂, v₂ ≺ w₁, and w₁ ≺ w₂. A dependency structure is called well-nested if for every pair of interleaving yields ⌊u₁⌋, ⌊u₂⌋, either u₁ ⊴ u₂ or u₂ ⊴ u₁ holds. The class of all well-nested dependency structures is denoted by D_wn.
Example 2 Both structures in Figure 1 have block-degree 2. As a witness, consider the node b₂: the yield of this node falls into two blocks (marked by the boxes in the figure), and this is the highest number of blocks per node. The left structure is well-nested. The right structure is not well-nested; for example, the yields ⌊b₁⌋ and ⌊b₂⌋ interleave, but neither does b₁ govern b₂, nor vice versa.
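Both conditions are straightforward to operationalize. The following Python sketch (our own; the parent/order encoding is an assumption, not the paper's) computes block-degrees and checks well-nestedness on a pattern analogous to the right structure in Figure 1:

```python
# Sketch: a structure is given by `parent` (node -> governor, None at the root)
# and `order`, the list of all nodes in precedence order.

def governs(u, v, parent):
    """u governs v (reflexively) iff u lies on the path from v to the root."""
    while v is not None:
        if v == u:
            return True
        v = parent[v]
    return False

def yield_of(u, parent):
    return {v for v in parent if governs(u, v, parent)}

def block_degree(u, parent, order):
    """Number of maximal contiguous runs that the yield of u forms in `order`."""
    pos = {v: i for i, v in enumerate(order)}
    ys = sorted(pos[v] for v in yield_of(u, parent))
    return 1 + sum(1 for a, b in zip(ys, ys[1:]) if b != a + 1)

def interleave(pos1, pos2):
    """True iff v1 < v2 < w1 < w2 for some v1, w1 in pos1 and v2, w2 in pos2."""
    tagged = sorted([(p, 1) for p in pos1] + [(p, 2) for p in pos2])
    pattern, i = (1, 2, 1, 2), 0
    for _, tag in tagged:            # greedy subsequence match of 1,2,1,2
        if i < 4 and tag == pattern[i]:
            i += 1
    return i == 4

def well_nested(parent, order):
    pos = {v: i for i, v in enumerate(order)}
    for u1 in parent:
        for u2 in parent:
            if governs(u1, u2, parent) or governs(u2, u1, parent):
                continue
            y1 = {pos[v] for v in yield_of(u1, parent)}
            y2 = {pos[v] for v in yield_of(u2, parent)}
            if interleave(y1, y2):
                return False
    return True

# Ill-nested pattern: b1 < b2 < c1 < c2 with c1 under b1 and c2 under b2.
parent = {"r": None, "b1": "r", "b2": "r", "c1": "b1", "c2": "b2"}
order = ["r", "b1", "b2", "c1", "c2"]
print(block_degree("b1", parent, order))   # 2: blocks {b1} and {c1}
print(well_nested(parent, order))          # False
```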
1.3.3 Local order annotations
Kuhlmann and Möhl (2007) show how to encode dependency structures
into terms over a sorted set Ω of local order annotations. This result
is based on the observation that the blocks of a node have a recursive
structure that closely follows the term structure: the blocks of a node u
can be decomposed into the singleton interval containing u, and the
blocks of the children of u. The precedence relation of a dependency
structure can therefore be represented, in a unique way, by annotating
each node u with an order on its sub-blocks, and information on how
to merge adjacent sub-blocks to obtain the blocks of u.
Example 3 Consider the node b2 in the left structure in Figure 1.
This node has two blocks: the first block consists of the first (and only)
block of a2 , followed by the first block of b3 ; the second block consists
of b2 itself, followed by the second block of b3 and the first (and only)
block of c₂. If we number the direct dependents of b₂ from left to right, then the local order among the constituents of the blocks of b₂ can be annotated as ⟨12, 023⟩, where the i-th occurrence of the symbol j refers to the i-th block of the direct dependent j, the symbol 0 refers to b₂ itself, and the two components of the tuple represent the two blocks of b₂.
An order annotation ω of a node u with n children fixes the block-degree of u as the number |ω| of components of ω, and the block-degree of the i-th child of u as the number #ᵢ(ω) of occurrences of the symbol i in ω. To reflect this, ω receives the type #₁(ω) × ⋯ × #ₙ(ω) → |ω|.
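This typing rule amounts to a simple count; a small sketch (our own, assuming at most nine children so that each symbol is a single digit):

```python
# Sketch: the type of an order annotation given as a tuple of component
# strings, e.g. ("12", "023") for the annotation <12, 023>.

def annotation_type(omega):
    """Return (block-degrees of the children, block-degree of the node)."""
    symbols = "".join(omega)
    n = max(int(s) for s in symbols)          # the highest child index mentioned
    return [symbols.count(str(i)) for i in range(1, n + 1)], len(omega)

print(annotation_type(("12", "023")))   # ([1, 2, 1], 2), i.e. 1 x 2 x 1 -> 2
```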
1.3.4 Regular sets of dependency structures
Based on the local order annotations, we can give an algebraic structure to dependency structures as follows: Let k ∈ ℕ. The set D_Ω^k of segmented dependency structures over Ω of sort k is the set of all pairs (t, ⟨u⃗₁, …, u⃗ₖ⟩) with u⃗ᵢ ≠ ε for all i ∈ [k], and (t, u⃗₁ ⋯ u⃗ₖ) ∈ D_Ω.¹ To
every order annotation ω ∈ Ω of a node with n children, we associate an
n-ary algebraic operation fω on segmented dependency structures. This
operation takes the disjoint union of its arguments, introduces a new
root node, puts the blocks corresponding to the components of the argument structures and the block for the root node into the order defined
by ω, and groups these ordered constituents into blocks according to
the component structure of ω. An order annotation ω : k₁ × ⋯ × kₙ → k thus is a compact description of a function f_ω : D_Ω^{k₁} × ⋯ × D_Ω^{kₙ} → D_Ω^k.
Example 4 The local order annotation ⟨12, 023⟩ gives rise to a composition operation f_⟨12,023⟩ : D_Ω^1 × D_Ω^2 × D_Ω^1 → D_Ω^2. In the composition of the left structure in Figure 1, this operation takes the sub-structure with yield {a₂} (1 block), the sub-structure with yield {a₃, b₃, c₁} (2 blocks), and the sub-structure with yield {c₂} (1 block), and produces the sub-structure with yield {a₂, a₃, b₂, b₃, c₁, c₂} (2 blocks).
Once we are able to view dependency structures as the result of operations in an algebra, we can investigate recognizable sets of dependency structures (Mezei and Wright, 1967). The concept of algebraic
recognizability provides a canonical notion of ‘regularity’ for sets of
structures, be they strings, trees, graphs, or general relational structures (see Courcelle, 1996, for an overview). It is a fundamental concept
with many different characterizations: it can be understood in terms
of homomorphisms into finite algebras, automata, and definability in
monadic second-order logic. In this paper, we use its characterization
through regular term grammars (see Denecke and Wismath, 2001):
Definition 1 Let Σ be an X-sorted alphabet, and let x ∈ X be a sort. A regular term grammar is a construct G = (N, Σ, S, P), where N is an X-indexed family of alphabets of non-terminal symbols, S ∈ N_x is a distinguished start symbol, and P is a finite set of productions of the form A → t, where A ∈ N_y and t ∈ T_Σ^y(N), for some sort y ∈ X.
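For illustration, a regular term grammar can be encoded directly; the following sketch (our own encoding, with a depth bound to keep the enumeration finite) derives terminal terms from a toy grammar:

```python
import itertools

# Sketch: productions map each non-terminal (a string) to a list of right-hand
# sides; terms are (symbol, [children]) pairs, non-terminals plain strings.

def derive(term, productions, depth):
    """Enumerate the terminal terms derivable from `term` within `depth` steps."""
    if isinstance(term, str):                 # a non-terminal: expand it
        if depth == 0:
            return
        for rhs in productions[term]:
            yield from derive(rhs, productions, depth - 1)
    else:
        sym, children = term                  # a constructor: expand the children
        expansions = [list(derive(c, productions, depth)) for c in children]
        for combo in itertools.product(*expansions):
            yield (sym, list(combo))

# Toy grammar for the terms f(g(...g(a)...)):
G = {"S": [("f", ["X"])], "X": [("g", ["X"]), ("a", [])]}
for t in derive("S", G, 3):
    print(t)
# ('f', [('g', [('a', [])])])
# ('f', [('a', [])])
```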
Given that the conversion between dependency structures and terms
over the signature Ω of local order annotations is one-to-one (Kuhlmann
and Möhl, 2007), recognizable dependency languages are isomorphic
to their corresponding term languages. This motivates the following
definition:
¹ We ignore the aspect of labelling for the sake of simplicity.
[FIGURE 2: Regular dependency grammars — a term generated by the grammar of Example 5, built from constructors such as ⟨⟨012314⟩, a₁⟩, ⟨⟨012, 314⟩, a₁⟩, ⟨⟨01, 23⟩, a₁⟩, ⟨⟨0⟩, b₁⟩, ⟨⟨0⟩, a₂⟩, ⟨⟨0⟩, b₂⟩, together with its corresponding dependency structure with surface string a₁a₁a₁b₁b₁b₁a₂a₂a₂b₂b₂b₂]
Definition 2 A dependency language over Σ and A is called regular if its encoding as a set of terms over ⟨Ω, A⟩ is generated by some regular term grammar.
In slight abuse of terminology, we refer to regular term grammars over (subsets of) ⟨Ω, A⟩ as regular dependency grammars. The class of all regular dependency languages is denoted by regd. Note that, while the set Ω of all order annotations is infinite, every regular dependency grammar gets by with a finite subset of these annotations. In particular, every regular dependency language is built on a finite set of sorts, corresponding to the numbers of blocks per child in the order annotations and composition operations.
Example 5 To illustrate the definition of regular dependency grammars, we present a grammar that generates a language L with sL = count(2), where
count(k) := { a₁ⁿ b₁ⁿ ⋯ aₖⁿ bₖⁿ | n ≥ 1 }.
Note that, for k = 1, count(k) is homomorphic to the familiar context-free language aⁿbⁿ, and that for every k > 1, count(k) is not context-free. We present the productions of a grammar with start symbol S:
S → ⟨⟨012314⟩, a₁⟩(R, B₁, A₂, B₂)
S → ⟨⟨0123⟩, a₁⟩(B₁, A₂, B₂)
R → ⟨⟨012, 314⟩, a₁⟩(R, B₁, A₂, B₂)
R → ⟨⟨01, 23⟩, a₁⟩(B₁, A₂, B₂)
A₂ → ⟨⟨0⟩, a₂⟩
Bᵢ → ⟨⟨0⟩, bᵢ⟩, for i ∈ [2].
Figure 2 shows a term generated by this grammar, and its corresponding dependency structure. Note that the structure is well-nested.
The construction in the example can be extended to obtain a regular dependency grammar for every language count(k), k ∈ ℕ. In this construction, we need exactly two sorts, 1 and k. The usage of the sort k implies that the resulting dependency language contains structures with block-degree k or, equivalently, gap-degree k − 1. In the next section, we show that count(k) actually enforces these structures.
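As a sanity check on the string side, count(k) itself is easy to enumerate directly; the following sketch (our own) generates the strings, though not the dependency structures:

```python
def count_language(k, max_n):
    """The strings a1^n b1^n ... ak^n bk^n for 1 <= n <= max_n, as symbol lists."""
    for n in range(1, max_n + 1):
        yield [s for i in range(1, k + 1)
                 for s in [f"a{i}"] * n + [f"b{i}"] * n]

for w in count_language(2, 2):
    print(" ".join(w))
# a1 b1 a2 b2
# a1 a1 b1 b1 a2 a2 b2 b2
```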
1.4 Main results
In this section, we present the main technical results of this paper: that
both the gap-degree parameter and the well-nestedness condition have
immediate consequences for the string-generative capacity of regular
dependency languages. The proof is based on two technical lemmata.
1.4.1 Technical lemmata
The first lemma is a pumping lemma for regular term languages; it can be seen as the correspondent of Ogden's lemma for context-free string languages. Let L ⊆ T_Σ be a term language, and let t ∈ L. A non-empty context p ∈ C_Σ is called pumpable if there is a context c ∈ C_Σ and a term t′ ∈ T_Σ such that t = c · p · t′ and c · pⁿ · t′ ∈ L, for all n ≥ 0.
Lemma 1 (Kuhlmann, 2008) For every regular term language L ⊆ T_Σ, there is a constant n_L ≥ 1 such that, if t ∈ L and at least n_L nodes in t are marked as distinguished, then there exists a pumpable context p ∈ C_Σ in t that contains at least one marked node.
Proof. The proof is based on the observation that, for all terms t with
rank at most n, if the number of distinguished nodes in t is large enough,
then there is at least one root-to-leaf path in t that visits at least i + 1
nodes that qualify as roots of a context containing marked nodes, for
any given i. By choosing i to be the number of non-terminals in a
(normal form) grammar for L, we thus obtain a pumpable context. □
The second lemma formulates two elementary results about intervals of positions in strings. Let w ∈ A∗ be a string. A mask for w is a non-empty sequence M = [i₁, j₁] ⋯ [iₙ, jₙ] of intervals of positions in w such that jₖ < iₖ₊₁, for all k ∈ [n − 1]. We call the intervals [i, j] the blocks of the mask M, and use |M| to denote their number. We write B ∈ M if B is a block of M. Given a string w and a mask M for w, the set of positions corresponding to M is defined as
pos([i₁, j₁] ⋯ [iₙ, jₙ]) := { i ∈ pos(w) | ∃k ∈ [n]. i ∈ [iₖ, jₖ] }.
Given a set P of positions in w, we write P̄ for the set of remaining positions, and [P] for the minimal mask for w such that pos([P]) = P. We say that P contributes to a block B of some mask if P ∩ B ≠ ∅. For masks M with an even number of blocks, we define the fusion of M as
F([i₁, j₁][i′₁, j′₁] ⋯ [iₙ, jₙ][i′ₙ, j′ₙ]) := [i₁, j′₁] ⋯ [iₙ, j′ₙ].
Lemma 2 Let w ∈ A∗ be a string, let M be a mask for w with an even number of blocks, and let P be a set of positions in w such that both P and P̄ contribute to every block of M. Then |[P]| ≥ |M|/2. Furthermore, if |[P]| ≤ |M|/2, then P ⊆ pos(F(M)).
Proof. For every block B ∈ [P], let n(B) be the number of blocks in M that B contributes to. We make two observations:
• Since P contributes to each block of M, |M| ≤ Σ_{B∈[P]} n(B).
• Since P̄ contributes to each block of M, no block B ∈ [P] can fully contain a block of M; therefore, n(B) ≤ 2, for all blocks B ∈ [P].
Putting these two observations together, we see that
|M| ≤ Σ_{B∈[P]} n(B) ≤ Σ_{B∈[P]} 2 = 2 · |[P]|.
For the second part of the lemma, let M = [i₁, j₁][i′₁, j′₁] ⋯ [iₙ, jₙ][i′ₙ, j′ₙ] and [P] = [k₁, l₁] ⋯ [kₙ, lₙ]. Then each block of [P] contributes to exactly two blocks of M. More precisely, for each h ∈ [n], the block [kₕ, lₕ] of [P] contributes to the blocks [iₕ, jₕ] and [i′ₕ, j′ₕ] of M. Because P̄ also contributes to [iₕ, jₕ] and [i′ₕ, j′ₕ], [kₕ, lₕ] is a proper subset of [iₕ, j′ₕ], which is a block of F(M). Hence, P ⊆ pos(F(M)). □
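The mask machinery is directly executable; the following sketch (our own) implements pos, the minimal mask [P], and the fusion F, and can be used to spot-check Lemma 2 on concrete position sets:

```python
# Sketch: masks as lists of (i, j) interval pairs over 1-based positions.

def positions(mask):
    """pos(M): the set of positions covered by the blocks of M."""
    return {p for (i, j) in mask for p in range(i, j + 1)}

def minimal_mask(P):
    """[P]: the unique shortest mask M with pos(M) = P."""
    mask = []
    for p in sorted(P):
        if mask and p == mask[-1][1] + 1:
            mask[-1] = (mask[-1][0], p)       # extend the current block
        else:
            mask.append((p, p))               # start a new block
    return mask

def fusion(mask):
    """F(M): fuse consecutive block pairs; M must have an even block count."""
    assert len(mask) % 2 == 0
    return [(mask[2 * h][0], mask[2 * h + 1][1]) for h in range(len(mask) // 2)]

# Spot-check: M has 4 blocks; P and its complement meet every block of M.
M = [(1, 3), (4, 6), (7, 9), (10, 12)]
P = {2, 3, 4, 8, 9, 10}
print(len(minimal_mask(P)))              # 2, which is >= |M|/2
print(P <= positions(fusion(M)))         # True: P lies inside [1,6] and [7,12]
```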
1.4.2 String languages that enforce structural properties
We now show how certain families of string languages can be used to
enforce structural properties in regular dependency languages. The first
family that we consider is the family count from the example above.
For what follows, put regd(D) := { L ∈ regd | L ⊆ D }.
Lemma 3 Let k ∈ ℕ. Every language L ∈ regd with sL = count(k) contains structures with block-degree at least k.
Proof. Let L ⊆ D_Σ, and let d₁ = (t₁, u⃗₁) be a dependency structure contained in L such that |s(d₁)| ≥ n_L, where n_L is the constant from Lemma 1. Because of the one-to-one correspondence between the positions in the string s(d₁) and the nodes in the term t₁, we have |t₁| ≥ n_L. If we now mark all nodes in t₁ as distinguished, Lemma 1 tells us that there exist contexts p, c ∈ C_Σ and a term t′ ∈ T_Σ with t₁ = c · p · t′ such that the ‘pumped’ term t₂ := c · p² · t′ is contained in tL. Hence, there must be a linearization u⃗₂ of the nodes in t₂ such that d₂ := (t₂, u⃗₂) ∈ L. Now, let P be the set of positions in s(d₂) that correspond to the subterm p · t′ of t₂, and hence to a yield ⌊u⌋, and let M be the uniquely determined mask for s(d₂) with 2k blocks in which each of the blocks covers |s(d₂)|/2k positions. Since c · p · t′ ∈ tL and c · p² · t′ ∈ tL, the context p contributes to s(d₂) at least one occurrence of every symbol aᵢ, bᵢ, for i ∈ [k]. Hence, both P and P̄ contribute to each block of M. With the first part of Lemma 2, we then deduce that |[P]| ≥ k. By the one-to-one correspondence between the positions in s(d₂) and the nodes in d₂, this means that ⌊u⌋ is distributed over at least k blocks; hence, d₂ has block-degree at least k (gap-degree at least k − 1). □
To enforce ill-nested dependency structures, we choose the family
resp(k) := { a₁ᵐ b₁ᵐ c₁ⁿ d₁ⁿ ⋯ aₖᵐ bₖᵐ cₖⁿ dₖⁿ | m, n ∈ ℕ },
where resp(2) is from Weir (1988). Similar to count(k), resp(k) can be generated by a regular dependency grammar of degree k.
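For concreteness, resp(k) can be enumerated just like count(k) above (again our own sketch of the string language only):

```python
def resp_language(k, max_mn):
    """The strings a1^m b1^m c1^n d1^n ... ak^m bk^m ck^n dk^n, as symbol lists."""
    for m in range(max_mn + 1):
        for n in range(max_mn + 1):
            yield [s for i in range(1, k + 1)
                     for s in [f"a{i}"] * m + [f"b{i}"] * m
                            + [f"c{i}"] * n + [f"d{i}"] * n]

for w in resp_language(2, 1):
    if w:
        print(" ".join(w))
# c1 d1 c2 d2
# a1 b1 a2 b2
# a1 b1 c1 d1 a2 b2 c2 d2
```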
Lemma 4 Let k ∈ ℕ, k > 1. Every L ∈ regd(D_k) with sL = resp(k) contains structures that are not well-nested.
Proof. Consider a dependency structure d₁ = (t₁, u⃗₁) ∈ L with
s(d₁) = a₁ᵐ b₁ᵐ c₁ⁿ d₁ⁿ ⋯ aₖᵐ bₖᵐ cₖⁿ dₖⁿ,
where m = n = n_L is the constant from Lemma 1, and mark all nodes in t₁ that are labelled with symbols from X := { xᵢ | x ∈ {a, b}, i ∈ [k] } as distinguished. Because of the one-to-one correspondence between the positions in s(d₁) and the nodes in t₁, we thus have n_L · |X| ≥ n_L marked nodes. Lemma 1 then tells us that there exist contexts p, c ∈ C_Σ and a term t′ ∈ T_Σ such that t₂ := c · p² · t′ belongs to tL. Thus, there is a linearization u⃗₂ of the nodes in t₂ such that the dependency structure d₂ := (t₂, u⃗₂) belongs to L. Let v be the root node of the context p in t₁, and let w be the root node of the instance of p in t₂ that is farther away from the root. As another consequence of Lemma 1, at least one node in p is labelled with a symbol x ∈ X.
We now show that the set of labels that occur in the subterm p · t′ of t₁ equals X. Since both s(d₁) and s(d₂) are elements of resp(k), all symbols x′ ∈ X must occur equally often as labels in p, and since at least one node of p is labelled with x ∈ X, each element of X occurs at least once as a label in p, and hence also in p · t′. To show the inclusion in the other direction, consider the set P of positions in s(d₂) that are contributed by the yield ⌊w⌋, which equals the set of nodes in p · t′. Furthermore, let M be the (uniquely determined) mask for s(d₂) with 2k blocks in which each block contains all the occurrences of one symbol in X. We now apply Lemma 2 to deduce that |[P]| ≥ k; since d₂ ∈ D_k, we even have |[P]| = k, and P ⊆ pos(F(M)). Given that all positions in pos(F(M)) are labelled with symbols from X, all nodes in ⌊w⌋, and consequently all nodes in p · t′, are labelled with symbols in X.
We now apply Lemma 1 a second time, marking all elements of the set Y := { yᵢ | y ∈ {c, d}, i ∈ [k] }. Analogously to the first application, we deduce the existence of a pumpable context with root node v′ such that the set of labels occurring in ⌊v′⌋ equals Y. Since X and Y are disjoint, neither v ⊴ v′ nor v′ ⊴ v; however, the yields ⌊v⌋ and ⌊v′⌋ interleave. Hence, the structure d₂ is not well-nested. □
1.4.3 A hierarchy theorem
We are now ready to present our main result, a hierarchy theorem for
the string languages corresponding to regular dependency languages.
We define sregd(D) := { sL | L ∈ regd(D) }. From Example 5 and
Lemmata 3 and 4, we get the following result:
Theorem 5 (Hierarchy theorem) For every k ∈ ℕ:
• sregd(D_k) ⊊ sregd(D_{k+1}),
• sregd(D_k ∩ D_wn) ⊊ sregd(D_{k+1} ∩ D_wn),
• sregd(D_k ∩ D_wn) ⊊ sregd(D_k) if k > 1,
• sregd(D_{k+1} ∩ D_wn) − sregd(D_k) ≠ ∅.
This theorem recovers and relates the string-language hierarchies for
Linear Context-Free Rewriting Systems (Weir, 1988) and Coupled
Context-Free Grammars (Hotz and Pitsch, 1996).
1.5 Conclusion
In this paper, we have motivated the notion of regular dependency languages (Kuhlmann and Möhl, 2007) by the concept of algebraic recognizability, and shown that, in this framework, both the gap-degree restriction and the well-nestedness condition have an immediate impact on string-generative capacity: they induce two infinite hierarchies of ever more expressive string languages. These hierarchies are intimately related to the hierarchies obtained for several mildly context-sensitive grammar formalisms.
The close link between ‘mild’ forms of non-projectivity and notions
of formal power provides a promising starting point for future work.
Structural properties such as gap-degree and well-nestedness are immediately observable in dependency analyses (for example, in dependency treebanks), and may therefore offer an interesting alternative to
a comparison of grammar formalisms based on the relatively weak notion of string-generative capacity, or the very formalism-specific notion
of strong generative capacity.
Acknowledgements The research reported in this paper is funded by
the German Research Foundation. We thank the anonymous reviewers
of the paper for their detailed comments.
References
Bodirsky, Manuel, Marco Kuhlmann, and Mathias Möhl. 2005. Well-nested
drawings as models of syntactic structure. In Tenth Conference on Formal
Grammar and Ninth Meeting on Mathematics of Language. Edinburgh,
Scotland, UK.
Courcelle, Bruno. 1996. Basic notions of universal algebra for language theory
and graph grammars. Theoretical Computer Science 163:1–54.
Denecke, Klaus and Shelly L. Wismath. 2001. Universal algebra and applications in theoretical computer science. Chapman and Hall/CRC.
Eisner, Jason and Giorgio Satta. 1999. Efficient parsing for bilexical context-free grammars and head automaton grammars. In 37th Annual Meeting
of the Association for Computational Linguistics (ACL), pages 457–464.
College Park, Maryland, USA.
Gaifman, Haim. 1965. Dependency systems and phrase-structure systems.
Information and Control 8:304–337.
Hotz, Günter and Gisela Pitsch. 1996. On parsing coupled-context-free languages. Theoretical Computer Science 161:205–233.
Kuhlmann, Marco. 2008. Ogden’s lemma for regular tree languages. CoRR
abs/cs/0810.4249.
Kuhlmann, Marco and Mathias Möhl. 2007. Mildly context-sensitive dependency languages. In 45th Annual Meeting of the Association for Computational Linguistics (ACL). Prague, Czech Republic.
Kuhlmann, Marco and Joakim Nivre. 2006. Mildly non-projective dependency structures. In 21st International Conference on Computational
Linguistics and 44th Annual Meeting of the Association for Computational Linguistics (COLING-ACL), Main Conference Poster Sessions,
pages 507–514. Sydney, Australia.
McDonald, Ryan and Fernando Pereira. 2006. Online learning of approximate
dependency parsing algorithms. In Eleventh Conference of the European
Chapter of the Association for Computational Linguistics (EACL), pages
81–88. Trento, Italy.
Mezei, Jorge E. and Jesse B. Wright. 1967. Algebraic automata and context-free sets. Information and Control 11:3–29.
Neuhaus, Peter and Norbert Bröker. 1997. The complexity of recognition
of linguistically adequate dependency grammars. In 35th Annual Meeting
of the Association for Computational Linguistics (ACL), pages 337–343.
Madrid, Spain.
Nivre, Joakim. 2006. Inductive Dependency Parsing, vol. 34 of Text, Speech
and Language Technology. Dordrecht, The Netherlands: Springer-Verlag.
Weir, David J. 1988. Characterizing Mildly Context-Sensitive Grammar Formalisms. Ph.D. thesis, University of Pennsylvania, Philadelphia, Pennsylvania, USA.