Crossed Serial Dependencies: A low-power

advertisement
Crossed S e r i a l Dependencies:
i low-power parseable extension to GPSG
Henry Thompson
Department of Artificial Intelligence
and
Program in Cognitive Science
U n i v e r s i t y of Edinburgh
Hope Park Square, Meadow Lane
Edinburgh EH8 9NW
SCOTLAND
ABSTRACT
paper
I present
formalism
a minimal
(Gazdar
1981c)
extension
which
to
also
the
GPSC
provides
a
An extension to the GPSG grammatical formalism is
solution.
proposed,
allowing
non-terminals
to
consist
It induces
sequences
of category labels,
which
are
non-trivially
distinct
from
and allowing
those
schematic variables
for the relevant
of
sentences
finite
structures
in
BKPZ,
and
which
I
appropriate.
It
appears,
constrained,
to be similar
making
a
argue
are
more
to range over such sequences.
when
suitably
The extension is shown to be sufficient to provide
a
strongly
adequate
dependencies,
as
grammar
found
for
in e.g.
crossed
Dutch
The
structures
in
serial
only
small
increment
in
power,
being
subordinate
incapable,
clauses.
to Joshi's proposal
induced
for
for instance,
of analysing anbnc n with
such
crossed dependencies.
And it can easily be parsed
constructions are argued to be more appropriate to
by a small modification to the parsing mechanisms I
data
involving
conjunction
than
some
previous
have already developed for GPSG.
proposals have been.
parseable
by
a
The extension is shown to be
simple
extension
to
an
existing
II. AN EXTENSION TO GPSG
parsing method for GPSG.
II.I Extendin G the s~ntax
I. INTRODUCTION
GPSG includes the idea of compound non-terminals,
There
has
community
serial
been
lately
considerable
for
grammar.
in
e.g.
Dutch
non-transformational
Although
grammars
under
in
the
composed
with the implications of crossed
dependencies
clauses
interest
context-free
the
standard
can
subordinate
theories
phrase
of
are
not
dependencies
capable
-
that
recent
paper
of
is,
assigning
they
are
the
this
trivially
labels.
This
to finite
We
sequences
of
in itself does not change
the weak generative capacity of the grammar, as the
set
are
of
non-terminals
includes
weakly adequate to generate such languages as anb n,
they
extend
category
structure
interpretations
of pairs of standard category labels.
correct
notstrongly
the
idea
of
remains
rule
finite.
schemata
variables
over
categories.
If
variables
over
sequences,
then
we
we
CPSG also
- rules
further
get
a
with
allow
real
change.
adequate.
At this point I must introduce some notation.
In
a
Zaenen
1982)
(Bresnan
Kaplsn
(hereafter BKPZ),
Peters
a solution
end
I
will write
to the
[a,b ,c]
Dutch problem was presented in terms of LFG (Kaplan
and
Bresnan
1982),
considerably
(Steedman
proposals
grammars
more
1983)
and
which
than
(Joshi
for solutions
and
Steedman 1982;
tree
is
known
context-free
1983)
have
to
for a non-terminal label composed of the categories
have
a, b, and c. I will write
power.
Za b*
also made
in terms of Steedman/Ades
adjunction
grammars
Joshi Levy and Yueh 1975).
(Ades
to
and
indicate
that
the
schematic
variable
Z ranges
over sequences of the category b. We can then give
In this
the
16
following
grammar
for
anb n
with
crossed
II.2 Semantic interpretation
dependencies:
As
S -> e
S:Z - > a SIZ:b
.(I)
s:z -> a s z:b
blZ -> b z
for
the
required
(2)
(3),
-
sufficient
to
of work.
not
only
constant
alone,
but
in
simple,
that
terms only, concatenation,
is
bar
(I).
This
analysis
subscripts
to
for
the
gives
us
numbers
give
where
I
have
dependencies,
the
rule
and
pairs
which
admits
albeit with a fair amount
The
to a p a ......6 and
compound
b
(2)
s" [bI,2,b]
sequences
P
PI
of
interpretations.
is
such
a
pair,
= P(kxXx[y]).
of
course
In
produce
what
Using
(3)
arbitrary
follows
sequences,
I will
use
a
3
this
we see
that rule
I
the
'crossed',
and
accumulated
order.
rule
b's,
This
so it is important
the crossed dependencies
come
out
in
two ways
restrictions,
and
is
5
in
successively
the
correct,
essentially
shorthand,
we can give
the following
the
syntactic
rules
given
above,
clear
from
the
is
handled
example.
That
properly
Suppose
should
be
that
the
rather
terminals,
and that there are actually three sorts
sorts of b's, subcategorised
(2
(3
rules are most
the
immediately
Rule
2 takes
sequence,
result
of
for
each
other.
If
one
for recording
used
the
standard
as
appearing
structure,
a
feature
on
in them directly,
those
to
the
first
sequence
sequence
interpretation
of
it to the interpretations
dominated
a
and
S,
and
of
of b's.
such
a
of the
prepends
the
to the unused balance of the sequence of b
first
We now have a sequence consisting
a sentential
interpretation,
of h interpretations.
the second
(b type)
the interpretation
and
pre-terminals
The
we would get the above
where we can reinterpret
in reverse
Rule
and
then
a
I thus applies
GPSG
this dependency, namely by
providing three rules, whose rule number would then
appear
b
the dominated
the
interpretations.
number
mechanism
dominated
of
applies
immediately
than
easily understood
Rule 3 simply appends the interpretation of
interpretations
the
a and
three
pre-terminals
(~
where Q' is short for Ziqh' ,
ADJOIN(Z' ,b' ).
order.
This must
categories
of a's and
b are
assuming that
CO~S(CAR(Q')(a') (S') ,CDR(Q ' ))
These
in GPSG - subcategorisation
above
which
the
to point out exactly how
are captured.
interpretation.
subcategorisation
These
where Q' is short for SI,Z~,b' ,
structure we will produce for the Dutch examples as
well,
Lisp-based
CONS(CADR (Q')(a' )(CA~(Q' )),CDDR (Q ' ))
generates a's while accumulating b's, rule 2 brings
generates
in
the b'(a',S') is the desired result at each level:
the aid of this example,
end,
we
as
in Appendix I.
preserves the appropriate dependency,
an
PO
pairs
example of a set of semantic rules for association
with
to
then
Using
using CAR, CDR, CONS, and so on.
usages are discharged
(I)
~
process
have
we can represent a pair <l,r> as
If
and
shorthand,
(I)
al/~[S,bl]
this
nodes
the
can
S
and
Church,
~f(1)(r)].
the
Lisp.
With
is
still
used
adjacent node:
a
is
the
P(~x~x[x])
marginal
extension
calculus
notated with a
grammar
a3b 5,
record
the task,
approach.
Following
following
actual
lambda
compound
interpretations,
which are distributed
appropriately higher up the tree.
For this, we
with
need
vertical
no
untyped
We can use what amounts
unpacking
where we allow variables over sequences to appear
semantics
the
the subscripts
as the rule numbers so introduced, and see that the
the
result
17
to
a,
(S type)
is
element
again
balance,
if any.
himself
that
(crossed)
dependencies are correctly reflected.
first
element of such a sequence
of the immediately dominated
this
The
of
prepended
patient
will
interpretation:
the
sequence.
to
the
unused
reader
can
satisfy
produce
the
following
II.3 Parsin~
As
for
parsing
context-free
non-terminals a n d
schemata
grammars
this
with
proposal
the
I)
omdat ik probeer Nikki te leren Nederlands
te spreken
2)
omdat ik probeer Nikki Nederlands te leren
spreken
3)
omdat ik Nikki probeer te leren Nederlands
te spreken
4)
omdat ik Nikki Nederlands probeer te leren
spreken
allows,
very little needs to be added to the mechanisms I
have provided to deal with non-sequence schemata in
GPSG, as described in (Thompson 1981 b).
We simply
treat all non-terminals as sequences, many of only
one element.
up
chart
parsing
matched
rule,
The same basic technique of a bottomstrategy,
variables
will
sequence
do
in
the
job.
variable
terminal,
the
to
which
active
substitutes
for
version
the
By restricting
occur
once
in
of
5) * omdat ik Nikki probeer Nederlands te leren
spreken.
With
only one
each
non-
pattern
the task of matching is kept simple and
deterministic.
ZlblZ.
Thus
The
concatenation,
we allow e.g.
substitutions
so
that
rule (~) matching
if we have
place
the course of bottom-up
the
judgements
of
some
(I)
is
often
judged
on stylistic grounds,
seems
fairly
Dutch
from
the
suggestion
that
this
stable
this
among
Netherlands.
is n o t
the
pattern of judgements typical of native speakers of
an instance of
processing,
is
that
at least
speakers
There
by
first [a] and then [3,b,b,b]
proviso
of
native
SIZIb but not
take
the
questionable,
Dutch from Belgium.
in
Z on the
III.2 Grammar rules for the Dutch data
right hand side will match [b,b], and the resulting
This
substitution into the left hand side will cause the
pattern
leads
us
to propose
the
following
basic rules for subordinate clauses:
constituent to be labeled [S,b,b].
A) S' -> omdat NP VP
In making
the
changes
part of
nodes,
required
B) VP -> V VP
C) VP -> NP V VP
D) VP -> NP V
to my existing system,
were
all
localised
to
that
the code which matches rule parts against
and
sequence
that
this extension
the
here
the
variable
impact
is
of
price
is
paid
encountered.
this mechanism
only
This
if
Taken
a
(probeer)
(leren)
(spreken).
straight,
these
give us (I) only.
- (4), we propose what amounts
suggests
approach,
on the parsing
they
complexity of the system is quite small.
ruled
to a verb lowering
where verbs are lowered onto VPs, whence
lower again
out
by
to form compound verbs.
requiring
that
a
lowered
have a target verb to compound with.
III.
APPLICATION TO DUTCH
Given the limited space available,
only
a
very
high-level
account
have
the
nothing
precise
to say about
distribution
I can present
This
of
transformational
how
this
tensed
So
issue of
and
approach
partially
account
inspired
in
the interpretation of
VP
with
follows
III. 1 The Dutch data
the
is
terms
by
Seuren's
of
predicate
the compound
labels is
that e.g. [V,V] is a compound verb, and [VP,V,V! is
untensed
a
of
The resulting
raising (Seuren 1972).
verb forms.
Discussion
is
must
In particular I will
the difficult
of
(5)
verb
compound may itself be lowered, but only as a unit.
extension to GPSG can provide an account of crossed
serial dependencies in Dutch.
For (2)
associated
phenomenon
of
crossed
a
compound
that
for
verb
each
compound
lowered
VP
version
it.
It
we
need
an
which
lowering of (possibly compound)
serial
onto
rule,
verbs
allows
the
from the VP
onto the verb, so we would have e.g.
dependencies
bedeviled
what
in
Dutch
by considerable
the facts
are.
The
subordinate
disagreement
clauses
about
is
Di) VPIZ -> NP
just
ZIV,
where we now use Z as a variable over sequences of
following five examples
VS.
form the core of the basis for my analysis:
18
The
other
half
of
the
process
must
be
reflected
in
rules
associated
with
each
Sentences
VP rule
which introduces a VP complement, allowing the verb
generated
to be
of space,
lowered
onto
the complement.
As this rule
we want e.g.
metarules
enumerate
such
rules,
we
can
~Pa
use
I)
VP -> ... V ... ==> VPIZ -> ... ZlV ...
H)
vP -> ... v vP
(with
compounding
compound
if
allowing
needed)
and
the
is
an
of
verbs
onto
insure
syntactic
part
of
our
variable
whose
subcategorisation
combining
and (Cii),
(4).
(4),
similar
to the
meta-generated
~d
Vd
.
Nederlands
te !preken
follows
that in section II.2 quite
For our purposes
simple
interpretations
range
The semantics
for the metarules
straightforward,
information
the
rules
(A)
given
that
is also reasonably
we know
where
we are
example
is
fully
F(V') ==> CONS(F(CAR(Z:V')),CDR(Z',V'))
II')
(E),
F(V',VP') ==>
CONS(F(CADR(Q'),CAR(Q')),
cm~(Q')),
rules (Bi) - (Di),
we can now generate
which
-
examples
crossed,
in section
II.1,
is
where
(2)
Q'
is
semantics
very
short
very
section II.2,
and uses
for
much
expansions for all its VP nodes:
E')
VPlZl,V '.
like
(I') will
those
of
rule
give
(2)
in
while (II') will give semantics like
those of rule (I).
(E °) is just like (3):
ADJ01N(Z' ,W ' )
(A)
S'
It
(Cii )
(I)
Vc
i
te leren
the
enthusiastic
reader
to
work
the examples and see that all of sentences
-
(4)
above
in
fact
receive
the
same
III.4 Which structure
conjunction
is right - evidence
from
(E)
[Vc,Vd]
probeer
to
interpretation.
(E)
Vb
left
through
Nikki
Nederlands
is
(Bii)
(Di)
The
Vd
careful
structures
I
spreken
BKPZ.
from
and indicate with subscripts the rule name
the
reader
proposed
Their
depending
Once again I include the relevant rule name in the
margin,
Vb
going:
together with the meta-generated
(Bii)
~ ~
of (B) - (D) will suffice:
I')
suitably
(ci)
(E),(Di)
[vP,Zb]
closely.
propagates correctly.
By
is
B') v'(vP')
c') v' (NP' , ~ ' )
D') v'(NP').
ordinary
that
(2)
(Rii)
The semantics
to unpack the
consists of V. This slight indirection is necessary
to
below.
III.3 Semantic rules for the Dutch data
E) wlz -> W Z,
W
meta-
(II)
effort is complete:
where
is illustrated
pro~eer te ~cleren
the lowering
We need one more rule,
verbs,
(3)
two
For reasons
(A)
Nikki
compound verbs to be discharged at any point.
complements.
involve
vP
(I) will apply to all three of (B) - (D), allowing
(C),
each
.~Pc [Vb,Vc]~
o-> vPlz -> ... vP:z:v.
to (B) and
only
ik
to conveniently express what is wanted:
will apply
(3)
s'
cii) vPlz -> ~P wlzlv.
than
and
similar, but using rules (B), (Cii), and (Di).
must also expand VPs with verbs lowered onto them,
Rather
(2)
rules and one ordinary one.
will
are not
structures
have
the same
have
from
the highest
lowest
possible.
noted
the
VP,
With
that
as those of
compound
while
the
ours
verb
depend
the exception
of
BKPZ's example (~3), which none of my sources judge
feature introduced to enforce subcategorisation.
grammatical
19
with
the
'root
Marie'
as
given,
I
believe my proposal accounts for all the judgements
reasonably
cited in their paper.
least
believe
conjunction
(4),
On the other hand, I do not
they can account
the
judgement,
next
standard
two
GPSG
the
first
(3),
treatment
of
three
whereas
based
the
Further
all
establish
they
phenomena
account
without
of
at
radically
work
the
is
clearly
needed
of this
augmented
status
to
exactly
GPSG with
respect to generative capacity and parsability.
fall out of our analysis:
is
6)
Dutch
satisfying
on
under
conjunction
the
and
altering the grammatical framework of GPSG.
for all of the following
on
concise
intriguing
omdat ik Nikki Nederlanda wil leren spreken
en Frans wil laten schrijven
equivalence
because I want to teach Nikki to speak Dutch
and let [Nikki] write French
allowing
7) * omdat ik Nikki Nedrelands wil leren spreken
en Frans laten schrijven
apparent
Joahi
et
to
with
al.
speculate
the
tree
Even
in
the
weakest
one
occurence
in
any
constituent
similarity
of
to
adjunction
only
sequences
as
its
grammars
of
of
augmentation,
of one variable
their
It
weak
any
power
over
rule,
remains
the
to be
formally established, but it at least appears that
8)
omdat ik Nikki Nederlands wil leren spreken
en Carla Frans wil laten schrijven
like
tree
cannot
because I want to teach Nikki to speak Dutch
and let Carla write French.
adjunction
generate
grammars,
anbncn
with
these
both
dependencies
crossed, and like them, it can generate it with any
one set crossed and the other nested.
9)
grammars
omdat ik Nikki wil leren Nederlands te spreken
en Frans te schrijven
it
generate
variable
because I want to teach Nikki to speak Dutch
and to write French
WW,
although
ranging over
it
can
with
Neither can
a
sequence
the entire alphabet,
if it
can be shown that it is indeed weakly equivalent to
TAG,
IO) * omdat ik Nikki wil leren Nederlands te
spreken en Carla Frans te schrijven
or
...
en Frans (ts) laten schrijven
then strong support will be lent to the claim
that
an
interesting
new
point
on
the
Chomsky
hierarchy between CFGs and the indexed grammars has
been found.
(6) contains a conjoined [VP,V,V],
[VP,V],
and
(7)
fails
(8) a conjoined
because
it
conjoin a [VP,V,V] with a [VP,V].
ordinary
trying
VP
iaside
to
a
conjoin
[VP,V],
a
VP
attempts
to
ACKNOWLEDGEMENTS
(9) conjoins an
and
with
(10)
fails
either
a
The work described herein was partially supported
by
by
non-
SERC
Grant
GR/B/93086.
My
thanks
to
Reichgelt, for renewing my interest in this problem
constituent or a [VP,V].
by presenting a version of Seuren's analysis
It
small
is
certainly
amount
not
of
the
case
'evidence'
that
to
the
adding
small
together
but
with
embedding
I t h i n k it
the
allows
obvious
is
this
to
amount
way
some vestige
to persist
in the semantics,
very
a serious
least
suggestive.
in which
Klein,
for
telling
me
about
Church's
of pairs and conditionals
in the
lambda calculus; to Brian Smith, for introducing me
Taken
to the wonderfully obscure power of the Y operator;
the deep
and to Gerald Gazdar, Aravind Joshi, Martin Kay and
Mark
I think that at the
of
Ewan
'implementation'
of compositionality
reconsideration
in a
seminar, and providing the initial sentential data;
already published establishes the case for the deep
embedding,
Han
Steedman,
for
helpful
discussion
on
various
aspects of this work.
the BKPZ
proposal is in order.
APPENDIX I
SEQUENCES IN THE UNTYPED LAMBDA CALCULUS
IV. CONCLUSIONS
To
It
is of course
augmentation
significance.
too early to tell whether
will
be
It
does
of
seem
general
to
me
this
use
to
offer
imbed
enough of
Lisp
in the lambda cslculus
for our needs, we require not just pairs, but NIL
or
and
a
conditionals
implemented
20
as
well.
Conditionals
are
similarly to pairs - "if p then q else
r"
is
simply
p applied
to the pair <q,r>,
where
Joshi,
A.
Joehi,
A.K., Levy, L. So and Yueh, K. 1975.
Tree
adjunct grammars. Journal of Comp .... and
System Sciences.
TRUE and FALSE are the left and right pair element
selectors
respectively.
construct
and
determining
their
possibilities
relatively
end
exist,
lists,
is
of
inefficient
approach.
to effectively
some
method
required.
which
but
we
of
Numerous
have
chosen
conceptually
a
clear
Kaplan, R.M. and Bresnan, J. 1982.
Lexicalfunctional grammar:
A formal system of
grammatical representation. In J. Bresnan,
editor, The mental representation of
grammatical relations. MIT Press,
Cambridge, MA.
We compose lists of triples, rather than
pairs.
Normal
<TRUE,car,cdr>,
CONS
pairs
are
given
as
while NIL is <FALSE,,>.
Given this approach,
shorthand,
sections
In order
manipulate
we can define the following
with which the semantic
Seuren, P. 1972.
Predicate Raising in French and
Sundry Languages. ms., Nijmegen.
rules given in
II.2 and III.3 can be translated
into the
Steedman, M. 1983.
On the Generality of the
Nested Dependency Constraint and the
reason for an Exception in Dutch. In
Butterworth, B., Comrie, E. and Dahl, 0.,
editors, Explanations of Language
Universals. Mouton.
lambda calculus:
TR= - Ix [~y [~]]
FALSE-
~x.Lky.LyJ]
Thompson,
N I L - ~f.Ef(FALSE)(kp.[p])(~p.[p])l
C0NS(A,B) - ~f.Ef(TRUE)(A)(B)J
CAe(L) - L(~x.[ ~y[ ~z[y] ]3 )
CDR(L)
L()~x.t ),y.L ),z.[ z] ] j )
C0NSP(L) - T(~x [~y.[~z.[x]]])
CADR(L) - CAR(CDR(L))
ADJOINFORM - la.[ IL. [ ~N. [
CONSP(L)(CONS(CA~(L),
a(CD~(L))(N)))
(CONS(N,NIL)) ] ]]
- ~f.[ ~.[ f(x(~) )] (~x.[ f(x(x))])]
ADJOIN(L,N)
1983.
How much context-sensitivity is
required to provide reasonable structural
descriptions:
Tree adjoining
gran~nars, version submitted to this
conference.
- Y(ADJOI~0~M)(T)(N)
Note that we use Church's Y operator to produce the
required recursive definition of ADJOIN.
REFERENCES
Ades, A. and Steedman, M. 1982. On the order of
words. Linguistics and Philosophy. to
appear.
Bresnan, J.W., Kaplan, R., Peters, S. and Zaenen,
A. 1982. Cross-serial dependencies in
Dutch. Linguistic Inquir[ 13.
Cazdar, G. 1981c.
Phrase structure grammar. In P.
Jacobson and G. Pullum, editors, The
nature of syntactic representation. D.
Reidel, Dordrecht.
21
H.S.
1981b.
Chart Parsing and Rule
Schemata in GPSG. In Proceedings of the
Nineteenth Annual Meeting of the
Association for Computational Linguistics.
ACL, Stanford, CA. Also DAI Research Paper
165, Dept. of Artificial Intelligence,
Univ. of Edinburgh.
Download