Supporting Material: T0264P24_4
Patrick Blackburn and Kristina Striegnitz
Version 1.2.4 (20020829)

14.1 Feature-based Grammars

We motivated the introduction of feature structures in the previous section by saying that we would like to be able to express things like "number of NP = number of VP", i.e. the NP and the VP have to agree in number in order to form a sentence. The basic idea is that non-terminal symbols are no longer atomic, but are feature structures which specify what properties the constituent in question has to have. So, instead of writing the (atomic) non-terminal symbols s, np, vp, we use feature structures where the value of the attribute cat is s, np, vp. The rule s → np vp becomes

    [cat:s] → [cat:np] [cat:vp]

That doesn't look so exciting yet. But what we can do now is add further information to the feature structures representing the non-terminal symbols. We can, e.g., add the information that the np must have nominative case:

    [cat:s] → [cat:np, case:nom] [cat:vp]

Further, we can add an attribute called num to the np and the vp and require that the values be shared:

    [cat:s] → [cat:np, case:nom, num:[1]] [cat:vp, num:[1]]

Note how we express this requirement by co-indexing the values (the shared index is written [1] here).

Here is a feature based grammar for a (tiny) fragment of English.

14.2 Parsing Feature-based Grammars

Now that we know what feature based grammars look like, let's see what we have to change in our active chart parsing algorithm to deal with them. We will first look at the fundamental rule and then at the general algorithm.

14.2.1 The Fundamental Rule

The fundamental rule for feature structure based grammars says that an active arc and a passive arc can be combined if

1. the passive arc starts in the position where the active arc ends, and
2. the active arc is looking for a category which is unifiable with the category that is provided by the passive arc.

The new arc starts in the starting position of the active arc and ends in the ending position of the passive arc. The label of the new arc is built as follows: we take the label of the active arc, unify the category it is looking for with the category provided by the passive arc, and advance the dot by one position.

Here are some examples.
The active arc and the passive arc can be combined by applying the fundamental rule. Why? The active arc ends in 0 and that's where the passive arc starts. Furthermore, the category that the active arc is looking for is unifiable with the category provided by the passive arc. The resulting new arc is

Here is a more interesting example. The active arc is as above, but let the passive arc be

These two arcs can also be combined: their starting and ending positions are compatible with the requirements of the rule, and furthermore the category that the active arc is looking for is unifiable with the category of the passive arc. The result of this unification is

So, the new arc that is created looks like this:

In the last example, the application of the fundamental rule not only moved the dot, but also specified the active arc further. This mechanism, together with co-indexation of values within a rule, can lead to propagation of information to other parts of the rule. Let's see an example. Here are the active and the passive arc:

The result of combining them is

The new arc is looking for a vp. But because of the co-indexation it is not looking for just any vp, but for a singular vp.

So, this is how the fundamental rule works. If you compare it to the fundamental rule for context free grammars, you will see that we really just replaced identity of atomic categories by unification of feature structures. I.e., in all places where we said before that two categories have to be the same, we now say that they have to unify.

14.2.2 Bottom-up Chart Parsing with Feature Based Grammars

Here is our general algorithm for active chart parsing.

1. Make initial chart and agenda.
2. Repeat until agenda is empty:
   a. Take first arc from agenda.
   b. Add arc to chart. (Only do this if the arc is not already on the chart!)
   c. Use the fundamental rule to combine this arc with arcs from the chart. Any arcs obtained in this way should be added to the agenda.
   d. Make hypotheses (i.e., active arcs) about new constituents based on the arc and the rules of the grammar. Add these new arcs to the agenda.
   End repeat
3.
See if the chart contains a passive edge from the first node to the last node that has the label s. If "yes", succeed. If "no", fail.

The general procedure can stay completely unchanged. We only have to take care in those places where we deal with categories. Let's go through an example and see where we have to adapt the general algorithm for dealing with feature based grammars. Assume we want to parse the sentence mia dances with the grammar given above. We will use the general algorithm in a bottom-up fashion.

Step 1. We initialize the chart and agenda:

Step 2a. We take the first arc from the agenda.

Step 2b. Now, what do we have to check before we add the arc to the chart? When working with plain context free grammars this was simple: we just checked whether there already was an arc in the chart that was identical to the one we wanted to add. Instead of doing that, we will now check whether there is an arc in the chart that starts and ends in the same positions, has the dot in the same position, and whose feature structures subsume all corresponding feature structures in the new arc.

If we, e.g., had the arc in the chart, we would clearly not want to add our new arc. The two arcs are not identical, but they contain exactly the same information. And, therefore, our condition for when not to add an arc is satisfied: all feature structures of the arc in the chart subsume all corresponding feature structures in the new arc. In particular,

What if we had the arc in the chart? All feature structures in this arc subsume the corresponding feature structures in the new arc. So, we wouldn't add the new arc in this case.

Let's think a bit about this. Why does it make sense? The arc that's already in the chart is more general than the new one.
This means that all combinations that we might be able to do with the new arc can also be done with the one that's already in the chart, as everything that is unifiable with the more specific feature structure is also unifiable with the more general one. So, adding the new arc to the chart would not allow us to do any more combinations than we can already do. Hence, we don't need it in the chart.

Returning to the example, we add the new arc to the chart.

Step 2c. We try to apply the fundamental rule. Nothing happens, as there are not yet any active arcs in the chart that we could combine with the new one.

Step 2d. The arc we are looking at is passive and we are doing bottom-up parsing, so we have to build new hypotheses in this step. When dealing with plain context free grammars, we were looking for grammar rules such that the first category on the right hand side was identical to the category provided by the passive arc. Now, when dealing with feature based grammars, we are going to look for grammar rules such that the first category on the right hand side is unifiable with the category provided by the passive arc. So, in our example we are going to look for all grammar rules such that the first symbol on the right hand side is unifiable with

The rule is such a rule. The new arc is built from this rule after carrying out the unification. It looks like this:

14.3 Putting it in Prolog

The first thing we have to do is to decide how we are going to represent the feature based grammar. Once we have done that, we will change the active chart parser active_chart_bottomup.pl so that it can handle grammars of this format.

14.3.1 Feature-based Grammars in Prolog

Let's start with the lexicon. A lexical entry for the word robber, for instance, looked like this until now:

    lex(robber,n).

Now, we want to have feature structures instead of atomic category symbols. Using the Prolog representation of feature structures that we introduced in the previous section, we want [cat:n|_] instead of n.
So, we will write lexical entries as follows:

    lex(robber,N) :- N = [cat:n|_].

Of course, we could also have written

    lex(robber,[cat:n|_]).

But when you start adding more features, lexical entries of the first format might be more readable. It is very easy now to add the information that robber is also singular:

    lex(robber,N) :- N = [cat:n,num:sg|_].

Here is another example: the lexical entry for the pronoun him.

    lex(him,N) :- N = [cat:pro,num:sg,case:acc|_].

How about the non-lexical rules of the grammar? So far, we have written them as

    s ---> [np,vp].

Again, we want to replace the atomic non-terminal symbols s, np, and vp by the feature structures [cat:s|_], [cat:np|_], and [cat:vp|_]. And that's what we are going to do. But, again, to make the code more readable we will not write

    [cat:s|_] ---> [[cat:np|_], [cat:vp|_]].

but

    S ---> [NP,VP] :-
        S = [cat:s|_],
        NP = [cat:np|_],
        VP = [cat:vp|_].

If we add the requirement that the np must be nominative, we get

    S ---> [NP,VP] :-
        S = [cat:s|_],
        NP = [cat:np, case:nom|_],
        VP = [cat:vp|_].

And if we further want to make sure that the NP and the VP agree in number, the rule looks like this:

    S ---> [NP,VP] :-
        S = [cat:s|_],
        NP = [cat:np, case:nom, num:NUM|_],
        VP = [cat:vp, num:NUM|_].

14.3.2 Parsing Feature-based Grammars in Prolog

In this lecture we want to change the code in active_chart_bottomup.pl so that it will work with feature based grammars in the format that we just introduced. Now, what do we have to change for that? We already saw that the structure of the general algorithm didn't change at all. Similarly, the general structure of the implementation in active_chart_bottomup.pl won't change. In fact, we can reuse most of the code. We only have to change those places where we access individual non-terminal symbols. I.e., we have to change

1. apply_fundamental_rule/2 (the fundamental rule),
2. predict_new_arcs_bottomup/2 (making hypotheses),
3.
the place in process_agenda/1 where we check whether we should add the arc to the chart,
4. and the place in active_chart_recognize/1 where we check whether we have been successful.

These are the only places where we access individual non-terminals.

Let's look at apply_fundamental_rule. Here is the old version. It applies the fundamental rule to the first argument and returns all arcs that can be built that way in a list. It uses findall/3 to collect all solutions. We can apply the fundamental rule if we have a passive arc and an active arc where the symbol that the active arc is looking for next is the same as the symbol that the passive arc is providing.

    %%% apply_fundamental_rule(+arc, -list of arcs)

    %%% We have an active arc; we are looking for a passive one that
    %%% follows it.
    apply_fundamental_rule(arc(I, J, Cat, Done, [SubCat|SubCats]), NewArcs) :-
        findall(arc(I, K, Cat, [SubCat|Done], SubCats),
                arc(J, K, SubCat, _, []),
                NewArcs
               ).

    %%% We have a passive arc; we are looking for an active one that
    %%% precedes it.
    apply_fundamental_rule(arc(J, K, Cat, _, []), NewArcs) :-
        findall(arc(I, K, SuperCat, [Cat|Done], Cats),
                arc(I, J, SuperCat, Done, [Cat|Cats]),
                NewArcs
               ).

Here is the new version of apply_fundamental_rule. The symbols that the active arc is looking for and the passive arc is providing no longer have to be the same; instead, we want them to unify. unify_silent is the same unification predicate that we saw in the last chapter, except that it can handle empty feature structures properly, doesn't write any output to the screen, and returns the result of unification in its third argument.

    %%% apply_fundamental_rule(+arc, -list of arcs)

    %%% Active arc: find passive arcs that follow it and whose
    %%% category unifies with the category the active arc needs.
    apply_fundamental_rule(arc(I, J, Cat, Done, [CatNeeded|RestNeeded]), NewArcs) :-
        findall(arc(I, K, Cat, [CatUnified|Done], RestNeeded),
                (arc(J, K, CatFound, _, []),
                 unify_silent(CatFound,CatNeeded,CatUnified)),
                NewArcs
               ).
    %%% Passive arc: find active arcs that precede it and that need
    %%% a category unifying with the one the passive arc provides.
    apply_fundamental_rule(arc(J, K, CatFound, _, []), NewArcs) :-
        findall(arc(I, K, SuperCat, [CatUnified|Done], RestNeeded),
                (arc(I, J, SuperCat, Done, [CatNeeded|RestNeeded]),
                 unify_silent(CatFound,CatNeeded,CatUnified)),
                NewArcs
               ).

The changes that we have to make to predict_new_arcs_bottomup are pretty much of the same nature. Instead of demanding that the rules used for new hypotheses have as first symbol on the right hand side a symbol which is identical to the symbol provided by the passive arc, we require that these two feature structures unify. Compare the old version:

    %%% predict_new_arcs_bottomup(+arc, -list of arcs)
    predict_new_arcs_bottomup(arc(J, _, Cat, _, []), NewArcs) :-
        findall(arc(J, J, SuperCat, [], [Cat|Cats]),
                SuperCat ---> [Cat|Cats],
                NewArcs
               ).

and the new version:

    %%% predict_new_arcs_bottomup(+arc, -list of arcs)
    predict_new_arcs_bottomup(arc(J, _, CatFound, _, []), NewArcs) :-
        findall(arc(J, J, SuperCat, [], [CatUnified|Cats]),
                (SuperCat ---> [CatNeeded|Cats],
                 unify_silent(CatFound,CatNeeded,CatUnified)),
                NewArcs
               ).

Most of process_agenda stays the same. But we have to define a new predicate subsuming_edge_in_chart that checks the subsumption conditions determining whether we add an arc or throw it away.

    %%% process_agenda(+agenda)
    process_agenda([]).
    process_agenda([Arc | Agenda]) :-
        %%% CHANGE: We add the Arc only if there is no subsuming edge
        %%% already in the chart.
        %%% Changed from: \+ Arc.
        \+ subsuming_edge_in_chart(Arc), !,
        assert(Arc),
        make_new_arcs_bottomup(Arc, NewArcs),
        append(NewArcs, Agenda, NewAgenda),
        process_agenda(NewAgenda).
    process_agenda([_|Agenda]) :-
        process_agenda(Agenda).

    %%% subsuming_edge_in_chart(+arc)
    subsuming_edge_in_chart(arc(Start,End,Cat,Found,ToFind)) :-
        %%% There is an arc in the chart which starts and ends in the
        %%% same positions.
        arc(Start, End, CatX, FoundX, ToFindX),
        %%% The feature structures of this arc in the chart subsume all
        %%% corresponding feature structures of the arc in the argument.
        subsumes(CatX, Cat),
        subsumes_list(FoundX, Found),
        subsumes_list(ToFindX,ToFind).

    %%% subsumes_list(+list of FS, +list of FS)
    %%% The feature structures of the first list subsume the corresponding
    %%% feature structures of the second list.
    subsumes_list([],[]) :- !.
    subsumes_list([H1|T1],[H2|T2]) :-
        subsumes(H1,H2),
        subsumes_list(T1,T2).

And finally, we have to adapt active_chart_recognize. Only the last line of the old version is affected. We have to make sure that there is a passive arc in the chart that spans the whole sentence and has recognized a constituent that has the category s. We use the predicate val/4, which we introduced in the last chapter, to check whether the feature structure on the left hand side of the rule (with which the passive arc is labeled) contains the attribute-value pair cat:s.

    %%% active_chart_recognize(+sentence)
    active_chart_recognize(Input) :-
        cleanup,
        initialize_chart_bottomup(Input, 0),
        initialize_agenda_bottomup(Agenda),
        process_agenda(Agenda),
        length(Input, N),
        %%% CHANGED from arc(0,N,s,_,[])
        arc(0, N, Cat, _, []),
        val(cat,s,Cat,_).
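The parser above relies on unify_silent/3 and subsumes/2 from the previous chapter, which are not shown in this section. To make their role easier to picture, here is a minimal sketch of how unification and subsumption could work on the open-ended list representation used in the lexicon and rules. The predicate names fs_unify/3, fs_val/3, fs_lookup/3 and fs_subsumes/2 are hypothetical and are not the course's own definitions. The sketch assumes flat feature structures whose values are atoms or variables (nested values would need a recursive treatment), and it uses the ISO built-in subsumes_term/2.

```prolog
%%% A sketch only: hypothetical helpers for FLAT feature structures
%%% in open-ended list form, e.g. [cat:np,num:sg|_].

%%% fs_unify(+FS1, +FS2, -Unified)
%%% Adds every Attr:Val pair of FS1 to FS2. Fails on a value clash.
%%% Variables shared with other structures (co-indexation) get bound.
fs_unify(FS1, FS2, FS2) :-
    fs_add_all(FS1, FS2).

fs_add_all(FS1, _) :- var(FS1), !.      % open tail reached: done
fs_add_all([], _) :- !.                 % closed list: done
fs_add_all([Attr:Val|Rest], FS2) :-
    fs_val(Attr, Val, FS2),
    fs_add_all(Rest, FS2).

%%% fs_val(+Attr, ?Val, ?FS)
%%% FS carries Attr:Val; the pair is appended at the open tail
%%% if the attribute is not present yet.
fs_val(Attr, Val, FS) :- var(FS), !, FS = [Attr:Val|_].
fs_val(Attr, Val, [A:V|_]) :- A == Attr, !, Val = V.
fs_val(Attr, Val, [_|Rest]) :- fs_val(Attr, Val, Rest).

%%% fs_lookup(+Attr, +FS, -Val)
%%% Like fs_val/3, but never extends FS: fails if Attr is missing.
fs_lookup(_, FS, _) :- var(FS), !, fail.
fs_lookup(Attr, [A:V|_], Val) :- A == Attr, !, Val = V.
fs_lookup(Attr, [_|Rest], Val) :- fs_lookup(Attr, Rest, Val).

%%% fs_subsumes(+General, +Specific)
%%% Every pair of General occurs in Specific with a value that is
%%% at least as specific. Binds no variables in Specific.
fs_subsumes(General, _) :- var(General), !.
fs_subsumes([], _) :- !.
fs_subsumes([Attr:Val|Rest], Specific) :-
    fs_lookup(Attr, Specific, Val2),
    subsumes_term(Val, Val2),
    fs_subsumes(Rest, Specific).
```

With these definitions, a query such as NP = [cat:np,num:Num|_], VP = [cat:vp,num:Num|_], fs_unify([cat:np,num:sg|_], NP, _) binds Num to sg, so the VP structure now also asks for a singular verb phrase. This is the propagation through co-indexation described in 14.2.1. Likewise, fs_subsumes([cat:np|_], [cat:np,num:sg|_]) succeeds while the reverse query fails, which is the asymmetry that the chart check in subsuming_edge_in_chart depends on.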