Supporting Material: T0264P24_4 14.1 Feature-based Grammars

Patrick Blackburn and Kristina Striegnitz
Version 1.2.4 (20020829)
14.1 Feature-based Grammars
We motivated the introduction of feature structures in the previous section by
saying that we would like to be able to express things like
S → NP VP : number of NP = number of VP,
i.e. NP and VP have to agree in number in order to form a sentence. The basic
idea is that non-terminal symbols no longer are atomic, but are feature
structures, which specify what properties the constituent in question has to have.
So, instead of writing the (atomic) non-terminal symbols s, np, vp, we use feature
structures where the value of the cat attribute is s, np, vp. The rule s → np vp
becomes
[cat:s] → [cat:np] [cat:vp]
That doesn't look so exciting, yet. But what we can do now is to add further
information to the feature structures representing the non-terminal symbols. We
can, e.g., add the information that the np must have nominative case:
[cat:s] → [cat:np, case:nom] [cat:vp]
Further, we can add an attribute called num to the np and the vp and require that
the values be shared. Note how we express this requirement by co-indexing the
values.
[cat:s] → [cat:np, case:nom, num:(1)] [cat:vp, num:(1)]
Here is a feature based grammar for a (tiny) fragment of English.
14.2 Parsing Feature-based Grammars
Now that we know what feature based grammars look like, let's see what we
have to change in our active chart parsing algorithm for dealing with them. We
will first look at the fundamental rule and then at the general algorithm.
14.2.1 The Fundamental Rule
The fundamental rule for feature structure based grammars says that an active
arc and a passive arc can be combined if
1. the passive arc starts in the position where the active arc ends
2. the active arc is looking for a category which is unifiable with the
category that is provided by the passive arc.
The new arc starts in the starting position of the active arc and ends in the ending
position of the passive arc. The label of the new arc is built as follows. We take
the label of the active arc, we unify the category it is looking for with the
category provided by the passive arc, and advance the dot by one position.
Here are some examples.
The active arc
and the passive arc
can be combined by applying the fundamental rule. Why? The active arc ends in
0 and that's where the passive arc starts. Furthermore,
is unifiable with
The resulting new arc is
Here is a more interesting example. The active arc as above, but let the passive
arc be
These two arcs can also be combined: their starting and ending positions are
compatible with the requirements of the rule and furthermore
is unifiable with
The result of this unification is
So, the new arc that is created looks like this:
In the last example, the application of the fundamental rule not only moved the
dot, but also specified the active arc further. This mechanism together with coindexation of values within a rule can lead to propagation of information to other
parts of the rule. Let's see an example. Here are the active and the passive rule:
The result of combining them is
The new arc is looking for a vp. But because of the co-indexation it is not looking
just for any vp but for a singular vp.
So, this is how the fundamental rule works. If you compare it to the fundamental
rule for context free grammars, you will see we really just replaced identity of
atomic categories by unification of feature structures. I.e., at all places where we
said before that two categories have to be the same we now say that they have
to unify.
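The rule can be sketched in a few lines of Prolog. This is only an illustration under stated assumptions: feature structures are open-ended lists of attribute:value pairs with atomic values, and fs_val/3, fs_unify/2 and combine/3 are hypothetical helper names, not the predicates used later in these notes. The two arcs in demo_fundamental_rule/1 are made up for illustration; they follow the singular-vp example described above but are not copies of the original figures.

```prolog
% fs_val(+Attr, ?Val, ?FS): FS has Attr with value Val.
% If Attr is missing, it is added through the open tail of FS.
fs_val(F, V, FS) :- var(FS), !, FS = [F:V|_].
fs_val(F, V, [F0:V0|Rest]) :-
    (   F == F0
    ->  V = V0                 % same attribute: the values must unify
    ;   fs_val(F, V, Rest)
    ).

% fs_unify(?A, ?B): afterwards A and B carry the same information.
% (Assumption: a destructive, two-argument unification.)
fs_unify(A, B) :- fs_add_all(A, B), fs_add_all(B, A).

fs_add_all(FS, _) :- var(FS), !.
fs_add_all([F:V|Rest], Target) :-
    fs_val(F, V, Target),
    fs_add_all(Rest, Target).

% combine(+Active, +Passive, -New): the fundamental rule for one pair of arcs.
% Arcs have the form arc(Start, End, Cat, Done, ToFind).
combine(arc(I, J, Cat, Done, [Needed|Rest]),
        arc(J, K, Found, _, []),
        arc(I, K, Cat, [Needed|Done], Rest)) :-
    fs_unify(Needed, Found).

% Illustrative arcs: an s arc looking for an np and a vp that share num,
% and a passive arc providing a singular np.
demo_fundamental_rule(VpNum) :-
    Active  = arc(0, 0, [cat:s|_], [],
                  [[cat:np, num:N|_], [cat:vp, num:N|_]]),
    Passive = arc(0, 1, [cat:np, num:sg|_], _, []),
    combine(Active, Passive, arc(0, 1, _, _, [VP])),
    fs_val(num, VpNum, VP).
```

The query demo_fundamental_rule(X) binds X to sg: unifying the np that the active arc needs with the singular np from the passive arc also instantiates the shared variable N, so the remaining vp is required to be singular.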
14.2.2 Bottom-up Chart Parsing with Feature Based Grammars
Here is our general algorithm for active chart parsing.
1. Make initial chart and agenda.
2. Repeat until agenda is empty:
   a. Take first arc from agenda.
   b. Add arc to chart. (Only do this if edge is not already on the chart!)
   c. Use the fundamental rule to combine this arc with arcs from the
      chart. Any edges obtained in this way should be added to the
      agenda.
   d. Make hypotheses (i.e., active edges) about new constituents based
      on the arc and the rules of the grammar. Add these new arcs to the
      agenda.
   End repeat
3. See if the chart contains a passive edge from the first node to the last
   node that has the label s. If "yes", succeed. If "no", fail.
The general procedure can stay completely unchanged. We only have to take
care in those places where we deal with categories. Let's go through an example
and see where we have to adapt the general algorithm for dealing with feature
based grammars.
Assume we want to parse the sentence mia dances with the grammar given
above. We will use the general algorithm in a bottom-up fashion.
Step 1. We initialize the chart and agenda:
Step 2a. We take the first arc from the agenda.
Step 2b. Now, what do we have to check before we add the arc to the chart?
When working with plain context free grammars this was simple: we just checked
whether there already was an arc in the chart that was identical to the one we
wanted to add. Instead of doing that we will now check whether there is an arc
in the chart that starts and ends in the same positions, has the dot in the same
position, and whose feature structures subsume all corresponding feature
structures in the new arc.
If we, e.g., had the arc
in the chart, we would clearly not want to add our new arc. The two arcs are not
identical but they contain exactly the same information. And, therefore, our
condition for when to not add is satisfied: all feature structures of the arc in the
chart subsume all corresponding feature structures in the new arc. In particular,
What if we had the arc
in the chart? All feature structures in this arc subsume the corresponding feature
structures in the new arc. So, we wouldn't add the new arc in this case. Let's
think a bit about this. Why does it make sense? The arc that's already in the chart
is more general than the new one. This means that all combinations that we
might be able to do with the new arc can also be done with the one that's already
in the chart, as everything that's unifiable with
is also unifiable with
So, adding the new arc to the chart would not allow us to do any more
combinations than we can already do. Hence, we don't need it in the chart.
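The subsumption test itself can be sketched as follows. This is a simplified illustration, not the subsumes/2 predicate used later in these notes: fs_lookup/3 and fs_subsumes/2 are hypothetical names, values are assumed to be atomic, and co-indexing between values is ignored.

```prolog
% fs_lookup(+Attr, -Val, +FS): Attr is present in FS with value Val.
% Unlike a lookup that extends the open tail, this never adds anything.
fs_lookup(F, V, FS) :-
    nonvar(FS),
    FS = [F0:V0|Rest],
    (   F == F0
    ->  V = V0
    ;   fs_lookup(F, V, Rest)
    ).

% fs_subsumes(+General, +Specific): every attribute of General occurs in
% Specific, and every value that General fixes is the same in Specific.
fs_subsumes(General, _) :- var(General), !.   % open tail reached: done
fs_subsumes([F:V|Rest], Specific) :-
    fs_lookup(F, V1, Specific),
    (   var(V) -> true ; V == V1 ),
    fs_subsumes(Rest, Specific).
```

With these definitions, fs_subsumes([cat:np|_], [cat:np, num:sg|_]) succeeds, because the first structure is more general, while fs_subsumes([cat:np, num:sg|_], [cat:np|_]) fails, because the second structure says nothing about num.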
Returning to the example, we add the new arc to the chart.
Step 2c. We try to apply the fundamental rule. Nothing happens as there are not
yet any active arcs in the chart that we could combine with the new one.
Step 2d. The arc we are looking at is passive and we are doing bottom-up
parsing, so we have to build new hypotheses in this step. When dealing with
plain context free grammars we were looking for grammar rules such that the first
category on the right hand side was identical to the category provided by the
passive arc. Now, when dealing with feature based grammars, we are going to
look for grammar rules such that the first category on the right hand side is
unifiable with the category provided by the passive arc. So, in our example we
are going to look for all grammar rules such that the first symbol in the right hand
side is unifiable with
The rule
is such a rule. The new arc is built from this rule after carrying out the unification.
It looks like this:
14.3 Putting it in Prolog
The first thing we have to do is to decide how we are going to represent the
feature based grammar. Once we have done that we will change the active chart
parser active_chart_bottomup.pl so that it can handle grammars of this format.
14.3.1 Feature-based Grammars in Prolog
Let's start with the lexicon. A lexical entry for the word robber, for instance,
looked like this until now:
lex(robber,n).
Now, we want to have feature structures instead of atomic category symbols.
Using the Prolog representation of feature structures that we introduced in the
previous section, we want [cat:n|_] instead of n. So, we will write lexical entries
as follows:
lex(robber,N) :- N = [cat:n|_].
Of course, we could also have written
lex(robber,[cat:n|_]).
But when you start adding more features, lexical entries of the first format might
be more readable. It is very easy now to add the information that robber is also
singular:
lex(robber,N) :- N = [cat:n,num:sg|_].
Here is another example: the lexical entry for the pronoun him.
lex(him,N) :- N = [cat:pro,num:sg,case:acc|_].
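To see why the open tail |_ matters, here is a small sketch. The lookup predicate fs_val/3 is a hypothetical helper written for this illustration (it assumes atomic attribute names and values); it is not the val/4 predicate from the previous chapter.

```prolog
lex(robber, N) :- N = [cat:n, num:sg|_].
lex(him, N)    :- N = [cat:pro, num:sg, case:acc|_].

% fs_val(+Attr, ?Val, ?FS): FS has Attr with value Val.
% A missing attribute is added through the open tail of FS.
fs_val(F, V, FS) :- var(FS), !, FS = [F:V|_].
fs_val(F, V, [F0:V0|Rest]) :-
    (   F == F0
    ->  V = V0
    ;   fs_val(F, V, Rest)
    ).

demo_lexicon :-
    lex(robber, N),
    fs_val(num, sg, N),        % already present: succeeds
    fs_val(case, acc, N),      % not yet present: added via the open tail
    \+ fs_val(num, pl, N).     % clashes with num:sg: fails
```

The query demo_lexicon succeeds. Because the list is open-ended, further information such as case can still be added to the entry for robber, whereas conflicting information is rejected.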
How about the non-lexical rules of the grammar? So far, we have written them as
s ---> [np,vp].
Again we want to replace the atomic non-terminal symbols s, np, and vp by the
feature structures [cat:s|_], [cat:np|_], and [cat:vp|_]. And that's what we
are going to do. But, again, to make the code more readable we will not write
[cat:s|_] ---> [[cat:np|_], [cat:vp|_]].
but
S ---> [NP,VP] :-
    S = [cat:s|_],
    NP = [cat:np|_],
    VP = [cat:vp|_].
If we add the requirement that the np must be nominative we get
S ---> [NP,VP] :-
    S = [cat:s|_],
    NP = [cat:np, case:nom|_],
    VP = [cat:vp|_].
And if we further want to make sure that the NP and the VP agree in number, the
rule looks like this:
S ---> [NP,VP] :-
    S = [cat:s|_],
    NP = [cat:np, case:nom, num:NUM|_],
    VP = [cat:vp, num:NUM|_].
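The effect of the shared variable NUM can be tried out directly. The sketch below assumes that ---> is declared as an infix operator with op(700, xfx, --->), and it uses a hypothetical lookup helper fs_val/3 that is not part of the original code.

```prolog
:- op(700, xfx, --->).   % assumption: how the rule arrow is declared

S ---> [NP,VP] :-
    S = [cat:s|_],
    NP = [cat:np, case:nom, num:NUM|_],
    VP = [cat:vp, num:NUM|_].

% fs_val(+Attr, ?Val, ?FS): FS has Attr with value Val (hypothetical helper).
fs_val(F, V, FS) :- var(FS), !, FS = [F:V|_].
fs_val(F, V, [F0:V0|Rest]) :-
    (   F == F0
    ->  V = V0
    ;   fs_val(F, V, Rest)
    ).

% Fix the number of the NP and read off the number of the VP.
demo_agreement(VpNum) :-
    _S ---> [NP, VP],
    fs_val(num, sg, NP),
    fs_val(num, VpNum, VP).
```

The query demo_agreement(X) gives X = sg: once the np is known to be singular, the vp of the same rule instance is singular as well. Asking for a singular np and a plural vp in the same rule instance fails.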
14.3.2 Parsing Feature-based Grammars in Prolog
In this lecture we want to change the code in active_chart_bottomup.pl so that
it will work with feature based grammars in the format that we just introduced.
Now, what do we have to change for that? We already saw that the general
structure of the general algorithm didn't change at all. Similarly, the general
structure of the implementation in active_chart_bottomup.pl won't change. In
fact, we can reuse most of the code. We only have to change those places where
we access individual non-terminal symbols. I.e., we have to change
1. apply_fundamental_rule/2 (the fundamental rule),
2. predict_new_arcs_bottomup/2 (making hypotheses),
3. the place in process_agenda/1 where we check whether we should add
   the arc to the chart,
4. and the place in active_chart_recognize/1 where we check whether we have
   been successful.
These are the only places where we access individual non-terminals.
Let's look at apply_fundamental_rule. Here is the old version. It applies the
fundamental rule to the first argument and returns all arcs that can be built that
way in a list. It uses findall/3 to collect all solutions. We can apply the
fundamental rule if we have a passive arc and an active arc where the
symbol that the active arc is looking for next is the same as the symbol that the
passive arc is providing.
%%% apply_fundamental_rule(+arc, -list of arcs)
%%% We have an active arc; we are looking for a passive one that
%%% follows it.
apply_fundamental_rule(arc(I, J, Cat, Done, [SubCat|SubCats]), NewArcs) :-
    findall(arc(I, K, Cat, [SubCat|Done], SubCats),
            arc(J, K, SubCat, _, []),
            NewArcs
           ).
%%% We have a passive arc; we are looking for an active one that
%%% precedes it.
apply_fundamental_rule(arc(J, K, Cat, _, []), NewArcs) :-
    findall(arc(I, K, SuperCat, [Cat|Done], Cats),
            arc(I, J, SuperCat, Done, [Cat|Cats]),
            NewArcs
           ).
Here is the new version of apply_fundamental_rule. The symbols that the active
arc is looking for and the passive arc is providing now don't have to be the same
any more, but we want them to unify. unify_silent is the same unification
predicate that we saw in the last chapter, except that it can handle empty feature
structures properly and doesn't write any output to the screen but returns the
result of unification in its third argument.
%%% apply_fundamental_rule(+arc, -list of arcs)
apply_fundamental_rule(arc(I, J, Cat, Done, [CatNeeded|RestNeeded]), NewArcs) :-
    findall(arc(I, K, Cat, [CatUnified|Done], RestNeeded),
            (arc(J, K, CatFound, _, []),
             unify_silent(CatFound,CatNeeded,CatUnified)),
            NewArcs
           ).
apply_fundamental_rule(arc(J, K, CatFound, _, []), NewArcs) :-
    findall(arc(I, K, SuperCat, [CatUnified|Done], RestNeeded),
            (arc(I, J, SuperCat, Done, [CatNeeded|RestNeeded]),
             unify_silent(CatFound,CatNeeded,CatUnified)),
            NewArcs
           ).
The changes that we have to make to predict_new_arcs_bottomup are pretty
much of the same nature. Instead of demanding that the rules used for new
hypotheses have as first symbol on the right hand side a symbol which is
identical to the symbol provided by the passive arc, we require that these two
feature structures unify. Compare the old version:
%%% predict_new_arcs_bottomup(+arc, -list of arcs)
predict_new_arcs_bottomup(arc(J, _, Cat, _, []), NewArcs) :-
    findall(arc(J, J, SuperCat, [], [Cat|Cats]),
            SuperCat ---> [Cat|Cats],
            NewArcs
           ).
and the new version:
%%% predict_new_arcs_bottomup(+arc, -list of arcs)
predict_new_arcs_bottomup(arc(J, _, CatFound, _, []), NewArcs) :-
    findall(arc(J, J, SuperCat, [], [CatUnified|Cats]),
            (SuperCat ---> [CatNeeded|Cats],
             unify_silent(CatFound,CatNeeded,CatUnified)),
            NewArcs
           ).
Most of process_agenda stays the same. But we have to define a new predicate
subsuming_edge_in_chart that checks the subsumption conditions determining
whether we add an arc or throw it away.
%%% process_agenda(+agenda)
process_agenda([]).
process_agenda([Arc | Agenda]) :-
    %%% CHANGE: We add the Arc only if there is no subsuming edge
    %%% already in the chart.
    %%% Changed from: \+ Arc.
    \+ subsuming_edge_in_chart(Arc),
    !,
    assert(Arc),
    make_new_arcs_bottomup(Arc, NewArcs),
    append(NewArcs, Agenda, NewAgenda),
    process_agenda(NewAgenda).
process_agenda([_|Agenda]) :-
    process_agenda(Agenda).

%%% subsuming_edge_in_chart(+arc)
subsuming_edge_in_chart(arc(Start,End,Cat,Found,ToFind)) :-
    %%% There is an arc in the chart which starts and ends in the same
    %%% positions.
    arc(Start, End, CatX, FoundX, ToFindX),
    %%% The feature structures of this arc in the chart subsume all
    %%% corresponding feature structures of the arc in the argument.
    subsumes(CatX, Cat),
    subsumes_list(FoundX, Found),
    subsumes_list(ToFindX,ToFind).

%%% subsumes_list(+list of FS, +list of FS)
%%% The feature structures of the first list subsume the corresponding
%%% feature structures of the second list.
subsumes_list([],[]) :- !.
subsumes_list([H1|T1],[H2|T2]) :-
    subsumes(H1,H2),
    subsumes_list(T1,T2).
And finally, we have to adapt active_chart_recognize. Only the last line of the
old version is affected. We have to make sure that there is a passive arc in the
chart that spans the whole sentence and has recognized a constituent that has
the category s. We use the predicate val/4, which we introduced in the last
chapter, to check whether the feature structure on the left hand side of the rule
(with which the passive arc is labeled) contains the attribute value pair cat:s.
%%% active_chart_recognize(+sentence)
active_chart_recognize(Input) :-
    cleanup,
    initialize_chart_bottomup(Input, 0),
    initialize_agenda_bottomup(Agenda),
    process_agenda(Agenda),
    length(Input, N),
    %%% CHANGED from arc(0,N,s,_,[])
    arc(0, N, Cat, _, []),
    val(cat,s,Cat,_).
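To see all the pieces work together without the rest of active_chart_bottomup.pl, here is a compact, self-contained sketch of a bottom-up recognizer. It is a simplification and not the code from these notes: arcs drop the list of found categories (arc/4 instead of arc/5), the helpers fs_val/3, fs_unify/2, fs_lookup/3 and fs_subsumes/2 are hypothetical stand-ins for val/4, unify_silent/3 and subsumes/2, values are assumed to be atomic, and the toy grammar and lexicon are invented for this example.

```prolog
:- use_module(library(lists)).
:- use_module(library(apply)).
:- op(700, xfx, --->).
:- dynamic arc/4.                      % arc(Start, End, Cat, ToFind)

% Toy grammar and lexicon (assumptions made for this sketch).
[cat:s|_] ---> [[cat:np, num:N|_], [cat:vp, num:N|_]].
lex(mia,    [cat:np, num:sg|_]).
lex(dances, [cat:vp, num:sg|_]).
lex(dance,  [cat:vp, num:pl|_]).

% Lookup that may extend the open tail, and destructive unification.
fs_val(F, V, FS) :- var(FS), !, FS = [F:V|_].
fs_val(F, V, [F0:V0|Rest]) :- ( F == F0 -> V = V0 ; fs_val(F, V, Rest) ).

fs_unify(A, B) :- fs_add_all(A, B), fs_add_all(B, A).
fs_add_all(FS, _) :- var(FS), !.
fs_add_all([F:V|Rest], T) :- fs_val(F, V, T), fs_add_all(Rest, T).

% Lookup that never extends, and a simplified subsumption test.
fs_lookup(F, V, FS) :-
    nonvar(FS), FS = [F0:V0|Rest],
    ( F == F0 -> V = V0 ; fs_lookup(F, V, Rest) ).
fs_subsumes(G, _) :- var(G), !.
fs_subsumes([F:V|Rest], S) :-
    fs_lookup(F, V1, S),
    ( var(V) -> true ; V == V1 ),
    fs_subsumes(Rest, S).

subsumed_in_chart(arc(I, J, Cat, ToFind)) :-
    arc(I, J, CatX, ToFindX),
    fs_subsumes(CatX, Cat),
    maplist(fs_subsumes, ToFindX, ToFind), !.

% One passive arc per lexical entry of each word.
lexical_arcs([], _, []).
lexical_arcs([W|Ws], I, Arcs) :-
    J is I + 1,
    findall(arc(I, J, Cat, []), lex(W, Cat), Here),
    lexical_arcs(Ws, J, Rest),
    append(Here, Rest, Arcs).

process([]).
process([Arc|Agenda]) :-
    (   subsumed_in_chart(Arc)
    ->  New = []
    ;   assertz(Arc),
        fundamental(Arc, Combined),
        predict(Arc, Predicted),
        append(Combined, Predicted, New)
    ),
    append(New, Agenda, Agenda1),
    process(Agenda1).

% Fundamental rule: categories must unify instead of being identical.
fundamental(arc(I, J, Cat, [Need|Rest]), New) :- !,
    findall(arc(I, K, Cat, Rest),
            ( arc(J, K, Found, []), fs_unify(Need, Found) ), New).
fundamental(arc(J, K, Found, []), New) :-
    findall(arc(I, K, Cat, Rest),
            ( arc(I, J, Cat, [Need|Rest]), fs_unify(Need, Found) ), New).

% Bottom-up prediction from passive arcs only.
predict(arc(J, _, Found, []), New) :- !,
    findall(arc(J, J, Cat, [Need|Rest]),
            ( (Cat ---> [Need|Rest]), fs_unify(Need, Found) ), New).
predict(_, []).

recognize(Words) :-
    retractall(arc(_, _, _, _)),
    lexical_arcs(Words, 0, Agenda),
    process(Agenda),
    length(Words, N),
    arc(0, N, Cat, []),
    fs_val(cat, s, Cat), !.
```

With this toy grammar, recognize([mia, dances]) succeeds, while recognize([mia, dance]) fails because the singular np and the plural vp do not agree in number.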