Set 26. Eliminating unit productions

ELIMINATING UNIT
PRODUCTIONS
Definition.
A unit production is a production that we wish
to eliminate whose right-hand side consists of
a single symbol
We’ll abbreviate it as a “unit prod”
Example
B_EXPR → B_EXPR OR B_FACT
       | B_FACT
B_FACT → B_FACT AND B_SEC
       | B_SEC
B_SEC  → NOT B_PRIM
       | B_PRIM
B_PRIM → ARITH_EXPR = ARITH_EXPR
In the above grammar the three productions
B_EXPR → B_FACT, B_FACT → B_SEC and B_SEC → B_PRIM
are unit prods.
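As a rough illustration of the definition, the sketch below picks out the productions whose right-hand side is a single symbol. The dictionary encoding of the grammar and the function name unit_productions are assumptions made for this example only; they are not part of the slides.

```python
# Hypothetical encoding of the example grammar: each left-hand side maps
# to a list of right-hand sides, and each right-hand side is a list of symbols.
grammar = {
    "B_EXPR": [["B_EXPR", "OR", "B_FACT"], ["B_FACT"]],
    "B_FACT": [["B_FACT", "AND", "B_SEC"], ["B_SEC"]],
    "B_SEC": [["NOT", "B_PRIM"], ["B_PRIM"]],
    "B_PRIM": [["ARITH_EXPR", "=", "ARITH_EXPR"]],
}


def unit_productions(grammar):
    """Return (lhs, rhs) pairs for productions whose rhs is a single symbol."""
    return [
        (lhs, rhs[0])
        for lhs, alternatives in grammar.items()
        for rhs in alternatives
        if len(rhs) == 1
    ]


for lhs, rhs in unit_productions(grammar):
    print(lhs, "->", rhs)
```

Note that the definition also says "that we wish to eliminate", so in practice one might filter this list further; the sketch simply reports every single-symbol right-hand side.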
Unit productions such as those on
the previous slide play no role in
code generation.
Eliminating them both reduces the
size of the parser obtained and
increases its speed
A tree of unit productions is a graphical
representation of those that occur. For
instance, if the unit prods. in a grammar are
A → B, B → C, C → D, E → B, F → D and
F → G, then the tree involved would be:
[Diagram: tree with nodes A, E, B, C, F, D and G, with one edge for each unit production listed above.]
In the (upside down) tree on the previous
slide, the leaves are D and G. These are the
symbols that occur as rhs’s of unit prods but
do not occur as the lhs’s of any unit prod.
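The leaf rule can be sketched in a few lines. The example below applies it to the three unit prods of the boolean-expression grammar shown earlier; the pair-list representation and the function name leaves are assumptions made for illustration.

```python
# Unit productions of the boolean-expression grammar, as (lhs, rhs) pairs.
unit_prods = [
    ("B_EXPR", "B_FACT"),
    ("B_FACT", "B_SEC"),
    ("B_SEC", "B_PRIM"),
]


def leaves(unit_prods):
    """Symbols that occur as the rhs of a unit prod but never as an lhs."""
    lhs_symbols = {lhs for lhs, _ in unit_prods}
    rhs_symbols = {rhs for _, rhs in unit_prods}
    return sorted(rhs_symbols - lhs_symbols)


print(leaves(unit_prods))  # ['B_PRIM']
```

Here B_PRIM is the only leaf: it is the rhs of B_SEC → B_PRIM and is not the lhs of any unit prod.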
Algorithm for Eliminating Unit Productions from a Parsing Machine
1. For each state S of the machine in turn (including the new states
added to the machine in step 2),
do step 2 for each leaf x, if any, such that the x-successor of S has a
unit reduction. When all these iterations of step 2 are complete,
go on to steps 3, 4, 5.
2. Let x1,..,xn be the symbols (which will include x) for which actions are
defined at S such that we can derive x from xi entirely via unit
productions, and for 1<= i <= n, let the xi - successor of S be Ti. If any
state R is, or at a previous stage of the algorithm has been, a
combination of states T1,...,Tn, make R the new x-successor of S;
otherwise set up a new state T as the x-successor of S, where T is a
combination of states T1,...,Tn.
3. Delete all connections to states that represent transitions with respect to
left-hand sides of unit productions.
4. Delete all states which at this stage are not reachable from State 0.
5. Replace every reduce action y → ..., where y is the left-hand side of a
unit production by x → ..., where x is an arbitrarily selected leaf such
that x is derivable from y entirely via unit productions.
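Step 2 needs the set of symbols from which a leaf x can be derived entirely via unit productions. A minimal sketch of that computation follows, using the unit prods from the earlier tree example. The function name derives_via_units and the pair-list representation are assumptions for illustration, and the sketch does not model the further restriction to symbols that have actions defined at S.

```python
def derives_via_units(unit_prods, x):
    """Return every symbol (x included) from which x can be derived
    using unit productions only."""
    result = {x}
    changed = True
    while changed:  # repeat until no new symbol is added
        changed = False
        for lhs, rhs in unit_prods:
            if rhs in result and lhs not in result:
                result.add(lhs)
                changed = True
    return result


# Unit prods from the tree example: A→B, B→C, C→D, E→B, F→D, F→G.
unit_prods = [("A", "B"), ("B", "C"), ("C", "D"),
              ("E", "B"), ("F", "D"), ("F", "G")]

print(sorted(derives_via_units(unit_prods, "G")))  # ['F', 'G']
```

In step 2 this set would then be intersected with the symbols for which state S actually has actions, giving x1,..,xn.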
Example
Consider the grammar
E -> E + T | T
T -> T * a | a
The tree of unit productions here is the chain
E → T → a
and the sole leaf is a
The unit productions here are
E → T and T → a.
a is a leaf, as it occurs as the rhs of T → a
but does not occur as the lhs of a unit prod
The parsing machine for this grammar was
given in set 2, and is reproduced on the next
slide
There are unit productions at states 3 and 4.
These states are successors of
states 0 and 2
Step 1 of the algorithm, accordingly, asks us
to perform step 2 as applied to
states 0 and 2
Applying step 2 to state 0, we note that this
state has an a, T, and E successor.
These are all symbols from which one can
derive “a” entirely through unit productions.
For instance we can derive a from T via
T→a
and we can derive a from E via
E → T, T → a
So we combine the E, T and a successors of
state 0 (states 1, 3 and 4 respectively) to form the new
a-successor of state 0. This new state has all
the actions (other than unit prods.) defined at
it that states 1, 3 and 4 have.
For simplicity of exposition, we do not show
state 3, the T-successor of state 0, which
would still be present at this stage, and only
deleted in step 4.
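[Diagram: the machine after step 2 has been applied to state 0, showing the combined state 1,3,4 as the new a-successor of state 0. Labels visible: a, +, ACCEPT if -|.]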
Applying step 2 to state 2, we make the new
a-successor of state 2 one which combines the
actions (other than unit prods) of states 4 and 6.
At this stage state 6 is still present, and only gets
deleted in step 4. But for simplicity we have
omitted it from the diagram.
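Step 2 reuses a combined state if the same combination has already been formed. One way to sketch that bookkeeping is a table keyed by the set of original state numbers. The names combined and combined_state and the comma-separated labels are assumptions for illustration; the sketch tracks only the identity of each combined state, not its actions.

```python
# Maps a frozenset of original state numbers to the label of the
# combined state created for that set.
combined = {}


def combined_state(members):
    """Return the label of the combined state for `members`,
    creating it only if this combination has not been seen before."""
    key = frozenset(members)
    if key not in combined:
        combined[key] = ",".join(str(m) for m in sorted(key))
    return combined[key]


print(combined_state([4, 3, 1]))  # 1,3,4  (new a-successor of state 0)
print(combined_state([4, 6]))     # 4,6    (new a-successor of state 2)
print(combined_state([1, 3, 4]))  # 1,3,4  (reused, not created again)
print(len(combined))              # 2
```

Using a frozenset as the key means the order in which the successors are listed does not matter.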
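[Diagram: the machine after step 2 has been applied to states 0 and 2, showing the combined states 1,3,4 and 4,6. Labels visible: a, +, ACCEPT if -|.]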
Applying step 3 then produces:
[Diagram: the machine with the E and T transitions deleted; the combined states 1,3,4 and 4,6 are shown.]
States 1 and 4 (as well as states 3 and 6
which were omitted from the previous
diagrams) are at this stage not reachable
from state 0. So, in applying step 4, they are
deleted. The result then is:
[Diagram: the machine after the unreachable states have been deleted; the combined states 1,3,4 and 4,6 remain.]
Finally, in step 5, we change the reductions
which at present have E or T as their lhs’s, by
replacing this lhs by a. So the reduction
T → T * a becomes a → T * a
and E → E + T becomes a → E + T.
This produces:
[Diagram: the final machine, with the combined states 1,3,4 and 4,6 and the two reductions now written with a as their lhs.]
In using this parsing machine, whatever code
was associated with the reduction
T→T*a
now becomes associated with
a → T * a,
and whatever code was associated with
E→E+T
now becomes associated with
a → E + T
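The relabelling done in step 5 can be sketched as follows. The representation of a reduction as an (lhs, rhs) pair and the function name relabel are assumptions for illustration. The sketch also assumes the chosen leaf is derivable from every unit-production lhs, which holds here because a is the sole leaf.

```python
# Unit prods of the example grammar, as (lhs, rhs) pairs.
unit_prods = [("E", "T"), ("T", "a")]

# The non-unit reductions that remain in the machine.
reductions = [
    ("T", ["T", "*", "a"]),
    ("E", ["E", "+", "T"]),
]


def relabel(reductions, unit_prods, leaf):
    """Replace the lhs of each reduction by `leaf` whenever that lhs
    is also the lhs of some unit production."""
    unit_lhs = {lhs for lhs, _ in unit_prods}
    return [(leaf if lhs in unit_lhs else lhs, rhs) for lhs, rhs in reductions]


for lhs, rhs in relabel(reductions, unit_prods, "a"):
    print(lhs, "->", " ".join(rhs))
```

Only the lhs changes; the rhs, and hence the number of symbols popped and the code attached to the reduction, stay the same.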
Class Exercise
Employing the stacks Symbol List and State
No. List, provide a parse of a + a * a using
the parsing machine on the previous slide