Document 10970793

advertisement
Axiomatizing Dependeny Parsing
Using Set Constraints
Denys Duhier
Programming Systems Lab
University of the Saarland, Saarbr
uken
duhierps.uni-sb.de
Abstrat
We propose a new formulation of dependeny
grammar and develop a orresponding axiomatization of syntati well-formedness with a
natural reading as a onurrent onstraint program. We demonstrate the expressivity and
eetiveness of set onstraints, and desribe a
treatment of ambiguity with wide appliability. Further, we provide a onstraint programming aount of dependent disjuntions that is
both simple and eÆient and additionally provides the benets of onstrutive disjuntions.
Our approah was implemented in Oz and yields
parsers with very good performane for our urrently middle sale grammars. Constraint propagation an be observed to be remarkably eetive in pruning the searh spae.
1 Introdution
Modern linguisti theories suh as hpsg (Pollard and Sag, 1987) and lfg (Kaplan and
Bresnan, 1982) are primarily onerned with
the formulation of general strutural priniples
that determine syntatially well-formed entities, typially represented as hierarhial, possibly typed, feature strutures. While feature
strutures are appealing for their perspiuity
and are easily supported by languages with uniation, they have the disadvantage that the
resulting grammatial formalisms are hard to
parse for both theoretial and pratial reasons, even beoming undeidable in the worst
ase (Kaplan and Bresnan, 1982). These difulties are magnied in languages like German where free word order and disontinuous
onstituents ompliate the formal aount and
handiap parsing tehniques relying on surfae
order.
While muh researh was devoted to the ompilation of grammatial frameworks based on
typed feature strutures and on the ompat
representation and eÆient proessing of disjuntive feature strutures, reent advanes in
onstraint tehnology have reeived rather less
attention in the omputational linguistis (CL)
ommunity than they deserve, and their relevane has gone largely unnotied. Conurrent onstraint programming provides today far
greater expressivity and eÆieny than is ommonly assumed.
In partiular, set onstraints are emerging as
an espeially elegant and omputationally eetive tool for appliations in CL. In an earlier paper (Duhier and Gardent, 1999), we desribed
how they ould serve to eÆiently solve dominane onstraints and nd minimal models of
tree desriptions. We propose here to show their
appliation to parsing.
In this paper, we are going to demonstrate
how, with the help of state-of-the-art onstraint
programming, an elegant and onise axiomati
speiation of syntati well-formedness beomes naturally an eÆient program. In so doing, we develop a treatment of ambiguity with
fairly general appliability. In partiular, we
provide a onstraint-based aount of dependent disjuntions (Maxwell and Kaplan, 1989;
Dorre and Eisele, 1990; Gerdemann, 1991; Grifth, 1990) that naturally supports onstrutive
disjuntion semantis (lifting of ommon information).
Our approah stands in sharp ontrast to
most extant parsing tehniques. We abandon
the generative view; we no longer build partial
parses by ombination of smaller ones, as is the
ase in e.g. hart parsing. Rather, we give a
global well-formedness ondition and proeed to
enumerate its models.
We hoose dependeny grammar (DG) as
our framework and axiomatize the notion of
a syntatially well-formed dependeny tree in
a manner that is partiularly well-suited to
onstraint-based proessing: we propose to regard dependeny parsing as a nite onguration problem that an be formulated as a onstraint satisfation problem (CSP).
This view takes advantage of the fat that for
a sentene of length n, there are nitely many
possible trees involving just n nodes. Out of
this large number, we must selet those that are
grammatial. We do not attempt this through
explorative generation, but rather via model
elimination. What onstraint programming affords us is eetive model elimination through
onstraint propagation.
Maruyama (1990) was the rst to propose a
omplete treatment of dependeny grammar as
a CSP and desribed parsing as a proess of inremental disambiguation. Harper (Harper et
al., 1999) ontinues this line of researh and
has proposed several algorithmi improvements
within the MUSE CSP framework (Helzerman
and Harper, 1993). Menzel (Menzel, 1998; Heineke et al., 1998; Menzel and Shroder, 1998)
advoates the use of soft \graded" onstraints
for robustness e.g. in parsing spoken language.
His proposal turns parsing into a more expensive optimization problem, but adapts graefully to onstraint violations.
Our presentation has the advantage over
Maruyama's that it follows modern linguisti
pratie: the grammar is speied by a lexion and a olletion of priniples. Further, we
illustrate the expressiveness of set onstraints
and seletion onstraints, and demonstrate how
they an provide ompat and elegant enodings
of various forms of ambiguity suh as lexial or
attahment ambiguity. Finally, our axiomatization of dependeny parsing has the property
that, modulo details of syntax, it an also be regarded as a program in a onurrent onstraint
programming language suh as Oz (Smolka,
1995). Several parsers were implemented in Oz
as desribed and provide exellent performane,
without any sort of optimization.
Consider the sentene1 below whih illusJoahim Niehren suggests the following sentene,
whih exhibits the same struture but sounds more onvining to the German ear: \Genau diese Flashe Wein
1
hat mir mein Kommissionn
ar versprohen auf der Auktion zu ersteigern."
trates the sort of non-projetive analysis with
fronting, srambling and extraposition that is
typial of German sentenes.
das Buh hat
mir
Peter versprohen zu lesen
the book has me (dat) Peter
promised
to read
(1)
Sine both \das Buh" and \Peter" an be indifferently assigned nominative or ausative ase,
either one may be subjet of \hat" while the
other is ausative objet of \lesen." There are
thus two readings.
(Fig 1) demonstrates the eetiveness of onstraint propagation: the two readings are enumerated in approximately 190ms using a single
hoie point.2 A graphial representation of the
preferred reading is shown in the upper window,
while the searh tree is displayed in the lower
window.
Figure 1: Parser Demo
In our approah, onstraint propagation alone
onstruts the dependeny tree. Searh is only
needed to explore alternatives that annot be
ruled out by onstraint propagation. In pratie, we observe that searh is required to enumerate readings, but very rarely leads to failure
where e.g. baktraking is required.
Setion 2 introdues the formal framework;
Setion 3 presents the novel onstraint programming ideas suh as set and seletion onstraints;
We expet better performane when e.g. the seletion onstraint is given low-level builtin support. It is
urrently implemented in Oz itself.
2
and in Setion 4 we axiomatize a onstraint
model for syntati well-formedness in DG and
turn the problem into a CSP.
2
16
4
hWords; Cats; Agrs; Comps; Mods; Lexion; Rulesi
string
:
at
:
agr
:
roles
:
3
Words
Cats 7
7
Agrs 5
2Comps
We write attribute aess in funtional notation.
If e is a lexial entry: string(e) is the full form of
the orresponding word, at(e) is the ategory,
agr(e) the agreement, and roles(e) the valeny
expressed as a set of omplement roles.
4
er liest das Buh
In our formal setting, a dependeny grammar G
is a 7-tuple
2
66
4
det
3
Dependeny Grammar
where Words is a nite set of strings notating the fully ineted forms of words, Cats is
a nite set of ategories suh as n for noun,
det for determiner, or vfin for nite verb,
Agrs is a nite set of agreement tuples suh
as hmas sing 3 nomi, Comps is a nite set
of omplement role types suh as subjet or
np dat for dative noun phrase, Mods is a nite
set of modier role types, suh as adj for adjetives, disjoint from Comps. We write Roles =
Comps ℄ Mods for the set of all role types; they
will serve to label the edges of a dependeny
tree. Lexion is a nite set of lexial entries
(see below), and Rules is a family of binary
prediates, indexed by role labels, expressing loal grammatial priniples: for eah 2 Roles,
there is 2 Rules suh that (w1 ; w2 ) haraterizes the grammatial admissibility of an edge
labeled from mother w1 to daughter w2 .
A lexial entry is an attribute value matrix
(AVM) with signature:
np a
1
2 Formal Framework
We now present our notion of a dependeny
grammar and of dependeny trees. The formulation below ignores issues of word order as they
are beyond the sope of this artile. However,
the full treatment inludes ideas derived from
Reape's word order domains (Reape, 1994) as
well as from the theory of topologial elds in
German sentenes.
2.1
2
subjet
he
string
reads
:
at
:
agr
:
2
26
4
roles
:
string
:
at
:
agr
:
2
36
4
roles
:
string
:
2
46
4
at
:
agr
:
roles
:
string
:
at
:
agr
:
roles
:
the
book
er
pro
hmas sing 3 nomi
fg
liest
vfin
hmas sing 3 nomi
fsubjet,np ag
das
det
hneut sing 3 ai
fg
Buh
n
hneut sing 3 ai
fdetg
3
7
5
3
7
5
3
7
5
3
7
5
Figure 2: Example Dependeny Tree
2.2
Dependeny Trees
We assume an innite set Nodes of nodes and
dene a labeled direted edge to be an element
of Nodes Nodes Roles. Thus, given a set
V Nodes of nodes and a set E V V Roles
of labeled edges between these node, hV ; Ei is a
direted graph, in the lassial sense, with labeled edges. We will restrit our attention to
nite graphs that are also trees.
Every node of a dependeny tree ontributes
a word to the sentene; the position where that
word is ontributed will be represented by a
mapping index from nodes to integers. Further,
every node must be assigned syntati features
(e.g. ase, agreement : : : ), whih will be realized by a mapping entry from nodes to lexial
entries.
A dependeny tree T is then dened as a 4tuple:
hV ; E ; index; entryi
where V is a nite set fw ; : : : ; wn g of nodes, E
1
is a nite set of labeled edges wi !
wj between
elements of V , and hV ; Ei is a tree as dened
above.
index is a bijetion from V to [1: :n℄ assigning
to eah node a linear position in the orresponding sentene, and entry : V 7! Lexion assigns a
lexial entry to eah node in V .
Example. Figure 2 ontains a graphial depition of a dependeny tree. The upper part
shows nodes represented as boxes onneted by
labeled direted edges. The integer appearing in
eah box is the position whih index assigns to
the node. For ease of reading, the linearization
of the sentene is also provided and a dotted line
indiates for eah node the orresponding word
in the sentene. The lower part of the gure
displays for eah node the lexial entry whih
entry assigns to it.
Figure 3 shows the dependeny tree for example sentene (1). In plaes where the value
assignment is fully unonstrained (i.e. all assignments are aeptable), we have used the standard \don't are" notation ` '.
Well-Formedness. A dependeny tree is
grammatially admissible i onditions (2), (3)
and (4) below are satised. First, any omplement required by a node's valeny must be realized preisely one:
8wi 2 V ; 8 2 roles(entry(wi ))
9!wj 2 V ; wi ! wj 2 E
(2)
Seond, if there is an edge emanating from wi ,
then it must be labeled either by a omplement
type in wi 's valeny or by a modier type:
8wi ! wj 2 E
2 roles(entry(wi )) [ Mods
(3)
Third, whenever there is an edge wi !
wj , then
the grammatial ondition (wi ; wj ) for 2
Rules must be satised in T :
8wi ! wj 2 E
2.3
T
j= (wi ; wj )
Improving Lexial Eonomy
(4)
Typially, the same full form of a word may orrespond to several distint agreements. In the
interest of a more ompat representation, we
wish to ollapse together entries that dier only
in their agreement. We replae attribute agr by
whose values are now sets of agreement tuples. Although of less frequent appliability, we
do the same for ategories and replae at by
ats.
Optional omplements are another soure of
onsiderable redundany in the lexion. Instead of modeling the valeny with just one
set roles(e) of omplement types, we propose
to use instead a lower bound broles(e) and an
upper bound drolese(e), suh that broles(e) drolese(e). broles(e) represents the required
roles and drolese(e) the permissible roles. Optional roles are simply drolese(e) n broles(e).
We all the new, more ompat representation a lexion entry to distinguish it from a lexial entry whih remains as dened earlier. A
lexion entry is said to generate lexial entries.
The lexial entries generated by (5) below are
of the form (6) and orrespond to all solutions
of onstraint (7).
agrs
2 string
66 ats
64 agrs
broles
:
:
:
drolese
:
string
:
2
64
2C
:
at
:
agr
:
roles
:
^ a2A ^
S
C
A
Rlo
Rhi
S
a
r
3
77
75
(5)
3
75
rR
(6)
(7)
This simple formulation illustrates how onstraints an be used to produe ompat representations of ertain forms of lexial ambiguity.
Rlo
hi
3 Constraint Programming
The foundations of onurrent onstraint programming (CCP) are well doumented e.g.
in (Saraswat et al., 1991; Smolka, 1995) and
have been implemented in the programming
language Oz (Mozart Consortium, 1998).
Modern onstraint tehnology, as provided
by CHIP (Dinbas et al., 1988), lp(FD)
(Codognet and Diaz, 1996), ECLiPSe (Aggoun
et al., 1995), ILOG Solver (ILOG, 1996), and
Oz (Mozart Consortium, 1998) has proven quite
suessful at solving pratial problems with
high ombinatorial omplexity, suh as sheduling and onguration, that were resisting traditional methods of operations researh.
3
vp past
subjet
5
6
np dat
4
8
np a
2
det
vp zu
7
zu
1
das Buh hat mir Peter versprohen zu lesen
2
16
4
1
string
:
at
:
agr
:
2
36
4
roles
:
string
:
at
:
agr
:
2
56
4
roles
:
string
:
2
76
4
at
:
agr
:
roles
:
string
:
at
:
agr
:
roles
:
2
3
4
das
det
hneut sing 3 ai
fg
hat
vfin
hmas sing 3 nomi
fsubjet,vp pastg
Peter
n
hmas sing 3 nomi
fg
zu
part
fg
5
3
75
2
26
4
3
75
3
75
3
75
6
string
7
:
at
:
agr
:
2
46
4
roles
:
string
:
at
:
agr
:
2
66
4
roles
:
string
:
2
86
4
at
:
agr
:
roles
:
string
:
at
:
agr
:
roles
:
8
Buh
n
hneut sing 3 ai
fdetg
3
mir
pro
7
h sing 3 dati 5
fg
versprohen
vpast
fnp dat,vp zu3g
lesen
vinf
fzu,np ag
3
75
3
75
75
Figure 3: Dependeny tree of demo sentene
The key to suess, presented here, is to redue parsing to a problem that onstraint programming (CP) is good at, namely onguration.
3.1
Basi Notions
A CSP onsists of a nite olletion of variables
(xi ) taking values in nite domains (Di ) and
a onstraint on these variables. A solution
is an assignment of values to these variables
suh that is satised. In CP, is omputed
inrementally as a sequene of improving approximations: (xi ) is an approximation of the
assignment to xi and is typially speied either
by listing the remaining values or by providing
upper and lower bounds. Initially the approximation of xi ontains all of Di . Solutions are
derived by a searh proess onsisting of alternating propagation and distribution steps until
all variables are determined or a ontradition is
derived. A variable xi is said to be determined
when its approximation (xi ) has been redued
to a single value.
In CCP, is alled the store and primitive onstraints are implemented by onurrent
agents that observe the store and orrespondingly improve the urrent approximations by deriving new basi onstraints aording to their
delarative semantis. Basi onstraints are e.g.
I = 5, I 6= 5, 5 2 S or 5 62 S .
Propagation. This is a proess of deterministi inferene. It removes from every approximation (xi ) all values that an be inferred to
be inonsistent with . In pratie only a heap
loal inferene proess is applied suh as \ar
onsisteny."
Distribution. When
propagation
has
reahed a xed point but not all variables
are determined, searh beomes neessary to
e.g. enumerate the possible assignments of a
non-determined variable x. Whih variable
to pik and how to enumerate the values is
expressed by a distribution (or labeling) strategy. For example the \rst-fail" strategy piks
a variable that has fewest remaining possible
values in its domain; its intended eet is to
keep the branhing fator in the searh tree as
small as possible.
3.2
Set Constraints
Modern onstraint tehnology not only supports
nite domain variables, denoting integers, but
also nite set variables, denoting nite sets of integers (Gervet, 1995; Muller and Muller, 1997).
One of our ontributions is to demonstrate the
elegane, suintness and eÆieny aruing
from the use of sets and set onstraints.
A set variable S is approximated by a lower
bound bS and an upper bound dS e:
bS S dS e
These bounds an be tightened by onstraint
propagation. All the usual operations of set theory are supported as onstraints. In partiular,
we often use S = S1 ℄ : : : ℄ Sn, where ℄ denotes disjoint union, to state that (Si ) forms a
partition of S .
Our formulation illustrates many appliations
of sets. Of speial interest is the treatment of
attahment ambiguity (or more generally of ambiguous graph onnetivity): we use set variables to represent sets of daughters for eah
node in a tree.
3.3
Seletion Constraint
An essential ontribution of this paper is a general tehnique for the ompat representation
and eetive treatment of ambiguity suh as lexial ambiguity. Consider a variable X that may
be equated with one of n variables (Vi ). We represent the hoie expliitly using a nite domain
variable I 2 f1: :ng and the seletion onstraint :
X
= hV1 ; : : : ; Vn i[I ℄
(8)
The notation hV1 ; : : : ; Vn i[I ℄ was hosen to suggest seletion of the I th element of sequene
hV1 ; : : : ; Vni.
The idea is that the onstrained value of
X an be approximated from the set of variables that may still be seleted by I from
hV1 ; : : : ; Vni. Conversely: the remaining domain of I must be restrited to the positions
for whih Vi is still ompatible with X .
This powerful idea was rst introdued in
CHIP (Dinbas et al., 1988) for nite domain
(FD) variables under the name of `element' onstraint. We extend it here to nite set (FS)
variables.
The seletion onstraint an be implemented
eÆiently and we outline the intuition in the
ase where X and (Vi ) are set variables. We
write bX and dX e for the lower and upper approximations of X and dom(I ) for the approximation of I . Given the seletion onstraint (8),
propagation maintains the following invariant:
\
bVj bX X dX e j 2dom(I )
[
dVj e
j 2dom(I )
and applies the following rule of inferene:
bVj 6 dX e _ bX 6 dVj e )
I
6= j
As a onsequene of the above invariant, the
seletion onstraint implements a form of onstrutive disjuntion (lifting of information
ommon to all remaining alternatives). The fat
that the hoie is made expliit by variable I
permits dependent seletions. For example in
(9) the hoie of whih Vi to equate with X and
whih Wi to equate with Y are mutually depend:
X = hV1 ; : : : ; Vn i[I ℄
(9)
Y = hW ; : : : ; W i[I ℄
1
n
These two onstraints an be viewed as realizing the following dependent (or named) disjuntions (Maxwell and Kaplan, 1989; Dorre and
Eisele, 1990; Gerdemann, 1991; GriÆth, 1990)
both labeled with name I :
(X = V1
(Y = W1
_:::_
_:::_
X = Vn )I
Y = Wn )I
Notational variations on dependent disjuntion
have been used to onisely express ovariant
assignment of values to dierent features in
feature strutures. The seletion onstraint
provides the same notational onveniene and
delarative semantis, but also gives it all the
omputational benets that arue from stateof-the-art onstraint tehnology.
4 Constraint Model
In this setion, we develop a onstraint model
for the formal framework of Setion 2: we introdue FD and FS variables to enode the quantities and mappings it mentions, and formulate
onstraints on them that preisely apture the
onditions the formal model stipulates. What
motivates this redution is the desire to take advantage of the powerful and very eÆient tehnologial support for onstraint propagation on
nite domains and nite sets.
For our appliation to parsing, we assume
that we are given an input sentene onsisting of
n words s1 s2 : : : sn . The solutions to the CSP
obtained as its onstraint model orrespond to
all admissible parse trees of the sentene.
Our approah takes advantage of the following observations: (1) there are nitely many
edges labeled with Roles that an be drawn between n nodes,3 (2) for eah word there are
nitely many lexial entries. Thus the problem is to pik a set of edges and, for eah node,
to pik a lexial entry so that (a) the result is
a tree, (b) none of the grammatial onditions
are violated. Viewed in this light, dependeny
parsing redues to a onguration problem.
4.1
Representation
we identify
a node with the integer representing its position in the sentene; thus index is simply the
identity funtion. The lexial entry assigned to
eah node w is represented by its attributes. We
write at(w), agr(w), and roles(w) for the variables denoting their values.
Domains: eah ategory in Cats is enoded
by a distint integer. Similarly for eah agreement tuple in Agrs4 and eah role in Roles. Thus
every value is represented either by an integer
or a set of integers.
Daughter Sets: for eah node w 2 V and
eah role 2 Roles, we write (w) for the set
Nodes and Lexial Attributes:
jV V Rolesj = n jRolesj
In German, there are 72 distint agreement tuples:
4 ases 3 genders 3 persons 2 numbers
3
4
2
of immediate daughters of w whose dependeny
edge is labeled , i.e. (w) = fw j w !
w 2 Eg.
Thus, if there is no edge labeled emanating
from w, (w) is the empty set. Sine (w) is
a subset of V , i.e. a nite of set of integers in
the onstraint model, we represent it by a FS
variable.
Lexion: we onsider the funtion Lex from
words to sets of lexion entries.
0
Lex
0
(s) = fe 2 Lexion j string(e) = sg
Without loss of generality, we an assume that
Lex returns a sequene rather than a set whih
allows us to identify the lexion entries of a
given word by position in this sequene.
The lexial entry assigned to w is generated
by one of its lexion entries as desribed in Setion 2.3. We say that the latter is \seleted" and
introdue variable entryIndex(w) to denote its
position in the sequene returned by Lex for w.
4.2
Lexial Constraints
We now expliate the assignment of lexial attributes to a node w and demonstrate how lexial ambiguity an be axiomatized by redution
to the seletion onstraint.
Lexial attribute assignment proeeds in two
steps: (a) seletion of a lexion entry (b) generation of a lexial entry from this lexion entry as
desribed in Setion 2.3. Let us write E for the
lexion entry seleted for w and I for its position in the sequene returned by Lex. They are
abstratly dened by the following equations:
he ; : : : ; en i =
1
I
E
=
=
(
(w))
(w)
he1 ; : : : ; en i[I ℄
Lex string
entryIndex
the lexial attributes are obtained as solutions
of formula (7) whih translates into the following onstraints:
(w) 2 ats(E )
(w) 2 agrs(E )
broles(E ) roles(w) drolese(E )
at
agr
For pratial reasons of implementation, the seletion onstraint is only provided for nite domains and nite sets, but E and (ei ) are AVMs.
We overome this diÆulty by pushing attribute
The subjet of a nite verb must be
either a noun or a pronoun, it must agree with
the verb in person and number, and must have
nominative ase. We write nom for the set of
agreement tuples with nominative ase:
aess into the seletion:
at(w ) 2 hats(e1 ); : : : ; ats(en )i[I ℄
agr(w ) 2 hagrs(e1 ); : : : ; agrs(en )i[I ℄
hbroles(e1 ); : : : ; broles(en )i[I ℄ roles(w)
hdrolese(e1 ); : : : ; drolese(en )i[I ℄ roles(w)
4.3
Subjet.
(w ) 2 fn; prog
^ agr(w) = agr(w )
^ agr(w ) 2 nom
(10)
Adjetive. An adjetive may modify a noun
and must agree with it:
subjet
Valeny Constraints
Every daughter set (w) is a nite set of nodes
ourring in the tree:
8w 2 V 8 2 Roles (w) V
A omplement daughter set (w) is of ardinality at most 1 (e.g. a verb has at most 1 subjet),
and it is non-empty i appears in w's valeny.
We write j(w)j for (w)'s ardinality.
8 2 Comps
0 j(w)j 1
^ j(w)j = 1 2 roles(w)
Role Constraints
0
0
0
0
w0
2 (w) ) (w; w )
0
Modern onstraint tehnology has eÆient support for suh impliative onditions.5 In partiular, if onstraint propagation an show that
(w; w ) is inonsistent in T , then w 62 (w)
is inferred whih improves the approximation
of (w).
0
adj
or
In Oz, this is expressed by the onstrut:
w 2 (w) ^ (w; w ) [ ℄ w 62 (w) end
0
det
0
0
0
0
(w; w ) 0
^
^
(w) = n
at(w ) = adj
agr(w ) = agr(w )
at
0
0
(11)
(w; w ) 0
^
^
(w ) = det
agr(w ) = agr(w )
(12)
w = min(yield(w))
at
0
0
0
The yield of w is the set of nodes (inluding
w) reahable from w by traversing downwards
any number of dependeny edges. We introdue
variable yield(w) for this quantity and develop,
in Setion 4.6, its onstraint-based axiomatization. Sine yield(w) ontains w, it is a nonempty set of nodes i.e. integers (Setion 4.1).
Thus it is meaningful to speak of its minimum
element, whih we write min(yield(w)).6
4.5
Treeness Constraints
Our formal framework simply assumed the
\usual" denition of treeness: (a) every node
has a unique mother exept for a distinguished
node, alled the root, whih has none, (b) there
are no yles. We must now provide an expliit
axiomatization of this notion. For this purpose we introdue variables daughters(w) and
mother(w ) for eah node w .
The set of immediate daughters of w is simply
dened as the union of its daughter sets:
(w) =
0
5
at
Determiner. The determiner of a noun must
agree with the noun and our left-most in its
yield:
For eah role type 2 Roles our grammar speies a binary prediate expressing a grammatial ondition between mother and daughter. We assume that (w; w ) is dened by a
formula of a onstraint language that an be interpreted over dependeny trees. We will not
make this language expliit here, but detail below a few examples. These examples are intended to be illustrative rather than normative
and we extend for them no laim of linguisti
adequay.
Aording to well-formedness ondition (4),
for every edge w !
w , grammatial ondition
(w; w ) must hold. Therefore the tree must
satisfy the following ondition:
8w; w 2 V 8 2 Roles
0
0
A modier daughter set has no ardinality restrition (e.g. a noun an have any number of
adjetives).
4.4
(w; w ) daughters
[
2Roles
(w)
Oz supports onstraints of the form I = (S ) between a nite domain variable I and a nite set variable S .
6
min
We model the notion of `mother' of a node as
a set of ardinality at most 1. Thus, we an
aount for both the presene or absene of a
mother. The latter ase is needed only for the
root of the dependeny tree.
mother
(w) V 0 jmother(w)j 1
w is a mother of w0 i w0 is an immediate daughter of w:
w 2 mother(w0 )
w0
2 daughters(w)
Further, treeness requires the existene of a
unique root. We therefore introdue the new
variable root to denote the root of the dependeny tree. This is the only node without a
mother:
8w 2 V
root 2 V
w = root jmother(w)j = 0
Eah node must ll preisely one role in the sentene; it must either be the root or an immediate daughter of another node:
V = frootg ℄
℄
(w)
w2V
2 Roles
In order to guarantee well-formedness, we must
additionally enfore ayliity. We do this below in the axiomatization of yields.
4.6
Axiomatization of Yield
The notion of yield of a lexial node, i.e. the
set of nodes reahable through the transitive
losure of immediate dominane edges (omplements and modiers), is essential for the expression of grammatial priniples. In a generative
framework, the yield of a node annot be alulated until the full dependeny tree rooted at
this node has been onstruted. In this setion,
we exhibit a stati axiomatization of yields that
fully exposes the underspeiation of a yield
to the inferene mehanisms of onstraint propagation.
Weighted Set. as a preliminary, we introdue the weighted set onstraint, between
boolean variable B and FS variables S 1; S 2:
S1
= B ? S2
whih has the delarative semantis:
B )
:B )
S1
S1
= S2
=;
Note that, if we make the lassial identiation
of false with 0 and true with 1, we an also
express it with the seletion onstraint:
S1
= B ? S2
S1
= h;; S2 i[B + 1℄
The strit yield yield!(w) is the set of nodes
stritly below w in the dependeny tree. In
other words, it is formed by the disjoint union
of the yields of its immediate daughters:
℄
(w) =
yield!
(w )
yield
0
w0 2daughters (w)
But daughters(w) is not statially known; so,
instead, we reformulate the above as a union of
weighted sets:
℄ ...
yield!(w ) =
(w 2 daughters(w)) ? yield(w )
0
0
w0 2V
...
where w 2 daughters(w) is a \reied" onstraint: it denotes a boolean whih is true i
w 2 daughters(w) is satised. To obtain the
yield of w, we simply add w to its strit yield:
0
0
(w) = fwg ℄ yield!(w)
yield
(13)
Disjoint union in (13) enfores the ondition
that w must not appear in its own strit yield,
thus ruling out loops.
4.7
Word-order onstraints
Although the treatment of word-order onstraints lies well beyond the sope of this artile, we give here an idea of how they may be
aommodated. The tehnique exploits powerful onstraints on sets, suh as \sequentiality"
and \onvexity" (no holes).
Consider that adjetives must be plaed between the determiner (if any) and the noun,
and, for simpliity of presentation, ignoring the
possibility of PPs, nothing else is allowed to
land between the determiner and the noun.
That onstraint an be expressed as follows:
^
(det(w); adj(w); fwg)
(det(w) [ adj(w) [ fwg)
Seq
Convex
where Seq(S1 ; : : : ; Sm ) is satised whenever for
all i < j , all elements of Si are stritly smaller
than all elements of Sj , and Convex(S ) when S
is an interval (with no holes).
4.8
Creating and Solving the CSP
The CSP for a given sentene is dened by the
variables introdued above and by the onjuntion of all onstraints presented in the preeding setions. It is important to notie that all
quantiation is of the form 8x 2 D (x) where
D is a nite statially known set of integers.
Suh
V formulae are expanded in the CSP into
(x).
x D
To solve the CSP, we need a \labeling" strategy. We have only experimented with the following obvious one: rst apply the default nave
labeling strategy on the olletion of mother
sets fmother(w) j w 2 Vg, then apply rstfail to the olletion of lexion entry seletors
fentryIndex(w) j w 2 Vg, and nally use again
the default nave labeling strategy on the olletion of daughter sets f(w) j 2 Roles w 2 Vg.
2
5 Results
We developed several prototype parsers, for
both German and English, using the tehniques
desribed in this paper. In addition to what
has been presented, we also support PPs (however, in the ase of ambiguous attahments, we
expliitly enumerate all possibilities), relative
lauses, innitive lauses, separable verb prexes in German (our tehniques are very eetive in dealing with the ambiguity arising from
the multipliity of possible verb prexes for V2
sentenes). We over the following phenomena: topialization, fronting (inluding partial
fronting), extraposition (although not in its full
generality) and srambling. However we are still
missing many important notions suh as oordination and nested extraposition elds, and our
pratial overage is still quite far from what
is possible in more mature grammatial frameworks.
We have experimented with both relatively
small hand-rafted lexions and one lexion automatially derived from an annotated orpus
(quite large, 18000 entries, but in pratie
disappointingly sparse). While moderate, our
overage nonetheless permits experimentation
with fairly intriate sentenes. Our experiene
so far has been quite positive: performane is
good and sales up well with sentene length
and number of readings, but a systemati evaluation remains to be done.
Sine our framework does not assume any apriori word order, the hallenge is not to permit diÆult linearizations to aount for suh
phenomena as fronting, extraposition, or srambling, but rather to rule out those that are ungrammatial and to avoid over-generation. Our
most omplete parser relies on set onstraints to
express omplex priniples of linear preedene,
and borrows ideas from the theory of topologial elds as well as from Reape's (1994) word
order domains. While we have developed powerful tehniques to express omplex word-order
onstraints, we have not yet arrived at a satisfatory formal aount for them.
Our urrent researh proeeds along 4 lines:
(a) extend the grammatial overage, (b) develop a formal framework for word-order onstraints in the style of the present artile, ()
push beyond the limits of dependeny grammar
to aount for e.g. \headless" onstrutions, (d)
integrate a onstraint-based treatment of semantis (Egg et al., 1998) with our treatment
of syntax.
6 Conlusion
In this artile, we ontributed a new presentation of dependeny grammar that follows modern linguisti pratie, and provided an axiomatization of syntati well-formedness whose
eonomy and elegane arues primarily from
the expressivity of set and seletion onstraints.
This axiomatization regards dependeny parsing as a onguration problem and expresses it
as a CSP.
Within the paradigm of onurrent onstraint
programming, our axiomatization also has a diret omputational reading as a program. As
ould be observed in Figure 1, this program is
quite eÆient in two respets:
1. Constraint propagation is very eetive at
pruning the searh spae. The example of
(Fig 1) makes preisely one hoie to enumerate the two possible readings of the sentene. In (Fig 4), parsing of an 18 word
sentene with ambiguous PP attahment
is demonstrated. Again, onstraint propagation is strong enough to permit optimal
Figure 4: Enumerating PP Attahments
enumeration of all parses without any failure.
2. Absolute performane is also quite satisfatory, even without any sort of optimization: the two readings of (Fig 1) are enumerated in 190ms and the 9 readings of
(Fig 4) in 910ms. Our preliminary results
appear to be typially about 1 order of
magnitude faster than those reported by
Harper (Harper et al., 1999). We intend
to look into this more losely.
We desribed an eetive treatment of ambiguity resting on set and seletion onstraints. In
partiular we showed how seletion onstraints
provided a CP aount of dependent disjuntions with the added benets of onstrutive
disjuntion.
In losing, it is our hope that this paper will
help promote awareness of reent advanes in
onurrent onstraint tehnology and of their
relevane to omputational linguistis, and that
the tehniques we desribed will nd appliations in other grammatial frameworks.
Aknowledgements. The author is espeially grateful to Joahim Niehren for his extensive help in revising and improving this paper.
Referenes
Abderrahamane Aggoun, David Chan, Pierre
Dufresne, Eamon Falvey, Hugh Grant,
Alexander Herold, Georey Maartney,
Miha Meier, David Miller, Shyam Mudambi, Bruno Perez, Emmanuel Van
Rossum, Joahim Shimpf, Periklis Andreas Tsahageas, and Dominique Henry
de Villeneuve. 1995. ECLiPSe 3.5. User
manual, European Computer Industry Re-
searh Centre (ECRC), Munih, Germany,
Deember.
Patrik Blakburn. 1995. Introdution: Stati
and dynami aspets of syntati struture.
Journal of Logi Language and Information,
4:1{4.
Philippe Codognet and Daniel Diaz. 1996.
Compiling onstraints in lp(FD). The Journal of Logi Programming, 27(3):185{226,
June.
Mehmet Dinbas, Pasal Van Hentenryk,
Helmut Simonis, Abderrahamane Aggoun,
Thomas Graf, and F. Berthier. 1988.
The onstraint logi programming language
CHIP. In Proeedings of the International
Conferene on Fifth Generation Computer
Systems FGCS-88, pages 693{702, Tokyo,
Japan, Deember.
Johen Dorre and Andreas Eisele. 1990. Feature logi with disjuntive uniation. In
COLING-90, volume 2, pages 100{105.
Denys Duhier and Claire Gardent. 1999. A
onstraint-based treatment of desriptions.
In Proeedings of the 3rd International Workshop on Computational Semantis (IWCS-3),
Tilburg.
Markus Egg, Joahim Niehren, Peter Ruhrberg,
and Feiyu Xu. 1998. Constraints over
lambda-strutures in semanti underspeiation. In Proeedings of the 17th International Conferene on Computational Linguistis and 36th Annual Meeting of the
Assoiation for Computational Linguistis
(COLING/ACL'98), pages 353{359, Mon-
treal, Canada, August.
Dale Gerdemann. 1991. Parsing and Generation of Uniation Grammars. Ph.D. thesis,
University of Illinois.
Carmen Gervet.
1995. Set Intervals in
Constraint-Logi Programming: Denition
and Implementation of a Language. Ph.D.
thesis, Universite de Frane-Compte, September. European Thesis.
John GriÆth. 1990. Modularizing ontexted
onstraints. In COLING-96.
Mary P. Harper, Stephen A. Hokema, and
Christopher M. White. 1999. Enhaned onstraint dependeny grammar parsers. In Proeedings of the IASTED International Conferene on Artiial Intelligene and Soft
Computing, Honolulu, Hawai USA, August.
Johannes Heineke, Jurgen Kunze, Wolfgang
Menzel, and Ingo Shroder. 1998. Eliminative parsing with graded onstraints. In Proeedings of the Joint Conferene COLINGACL, pages 526{530.
Randall A. Helzerman and Mary P. Harper.
1993. Muse sp: An extension to the onstraint satisfation problem. Journal of Artiial Intelligene Researh, 1.
Christian Holzbaur. 1990. Speiation of
Constraint Based Inferene Mehanisms
through Extended Uniation. Ph.D. thesis,
Tehnish-Naturwissenshaftlihe Fakultat
der Tehnishen Universitat Wien, Otober.
ILOG. 1996. ILOG Solver: User manual, July.
Version 3.2.
Ronald M. Kaplan and Joan Bresnan. 1982.
Lexial-funtional grammar: A formal system for grammatial representation. In Joan
Bresnan, editor, The Mental Representation
of Grammatial Relations, pages 173{281.
MIT Press.
Vinenzo Lombardo and Leonardo Lesmo.
1996. An earley-type dependeny parser for
dependeny grammar. In COLING 96, pages
723{728, Kyoto, Japan.
Hiroshi Maruyama. 1990. Constraint dependeny grammar. Researh Report RT0044,
IBM Researh, Tokyo, Marh.
John T. Maxwell and Ronald M. Kaplan. 1989.
An overview of disjuntive onstraint satisfation. In Proeedings of the Internation Workshop on Parsing Tehnologies, pages 18{27.
John T. Maxwell and Ronald M. Kaplan. 1998.
Uniation-based parsers that automatially
take advantage of ontext freeness. in preparation.
Wolfgang Menzel and Ingo Shroder. 1998.
Deision proedures for dependeny parsing
using graded onstraints. In Proeedings of
the COLING-ACL98 Workshop \Proessing
of Dependeny-based Grammars".
Wolfgang Menzel. 1998. Constraint satisfation
for robust parsing of spoken language. Jour-
nal of Experimental and Theoretial Artiial
Intelligene, 10(1):77{89.
The Mozart Consortium. 1998. The Mozart
Programming System. http://www.mozartoz.org/.
Tobias Muller and Martin Muller. 1997. Finite set onstraints in Oz. In Franois
Bry, Burkhard Freitag, and Dietmar Seipel,
editors, 13. Workshop Logishe Programmierung, pages 104{115, Tehnishe Universitat Munhen, 17{19 September.
Peter Neuhaus and Norbert Broker. 1997. The
omplexity of reognition of linguistially adequate dependeny grammars. In Proeeding
of the 35th Annual Meeting of the ACL and
the 8th Conferene of the EACL, pages 337{
343, Madrid.
Carl Pollard and Ivan Sag. 1987. InformationBased Syntax and Semantis, volume 1 of
CSLI Leture Notes. CSLI.
Mike Reape. 1994. Domain union and word order variation in german. In John Nerbonne,
Klaus Netter, and Carl Pollard, editors, German in Head-Driven Phrase Struture Grammar, number 46. CSLI Publiations.
Vijay A. Saraswat, Martin Rinard, and Prakash
Panangaden. 1991. Semanti foundations of
onurrent onstraint programming. In Con-
ferene Reord of the Eighteenth Annual ACM
Symposium on Priniples of Programming
Languages, pages 333{352, Orlando, Florida,
January 21{23,. ACM SIGACT-SIGPLAN,
ACM Press. Preliminary report.
Gert Smolka. 1995. The Oz Programming
Model. In Computer Siene Today, volume
1000 of LNCS, pages 324{343.
Download