Set 12. Short-circuit evaluation of boolean expressions, and the IF

SHORT-CIRCUIT EVALUATION OF BOOLEAN EXPRESSIONS
Let's consider this in connection with generating code for the IF statement. A suitable grammar for this
statement is:
IF_STATEMENT → SIMPLE_IF | SIMPLE_IF_ELSE_BLOCK
The production for the if statement without ELSE
SIMPLE_IF → IF_PREFIX TRUE_BLOCK ENDIF
Productions for the if statement including ELSE
SIMPLE_IF_ELSE_BLOCK → SIMPLE_IF_ELSE FALSE_BLOCK ENDIF
SIMPLE_IF_ELSE → IF_PREFIX TRUE_BLOCK ELSE
Productions employed for both cases
IF_PREFIX → IF BOOL_EXPRESSION
BOOL_EXPRESSION → BOOL_EXPRESSION OR BOOL_FACTOR
| BOOL_FACTOR
BOOL_FACTOR → BOOL_FACTOR AND BOOL_SECONDARY
| BOOL_SECONDARY
BOOL_SECONDARY → NOT BOOL_PRIMARY | BOOL_PRIMARY
BOOL_PRIMARY → ARITH_EXPRESSION = ARITH_EXPRESSION
| ARITH_EXPRESSION <> ARITH_EXPRESSION
| ARITH_EXPRESSION > ARITH_EXPRESSION
| ARITH_EXPRESSION >= ARITH_EXPRESSION
| ARITH_EXPRESSION < ARITH_EXPRESSION
| ARITH_EXPRESSION <= ARITH_EXPRESSION
| ( BOOL_EXPRESSION )
For the purpose of readability, let's employ the macros:
#define head_of_false_chain kind_of_Location
#define head_of_true_chain location
The code required for the various productions is described below.
BOOL_PRIMARY → ARITH_EXPRESSION = ARITH_EXPRESSION
For productions such as these we generate code to compare the arithmetic expressions, followed by a
conditional jump instruction which will be executed if the comparison involved is true. In the present case
we would generate in code_s:
JE 0000
coded as
0f 84 0000
and we would set $$.head_of_true_chain = code_i - 2
and
$$.head_of_false_chain = 0
As we carry out the present algorithm, each boolean expression that we generate code for produces an
entry on the symbol stack. Its head_of_true_chain points to a chain of (the offset fields within)
conditional branch instructions that branch if the boolean expression as a whole is true, and its
head_of_false_chain points to a similar chain for when the boolean expression is false. In the code
above, there are no branches for when the comparison is false, so $$.head_of_false_chain is 0. There is a
single branch for when the condition is true. Since it is the only one in the chain, it is itself the end of
the chain, so the offset field of the JE instruction is 0. $$.head_of_true_chain here points to this offset field.
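The code generation step for a comparison can be sketched as follows. This is only an illustration: the struct Attr, the function name emit_jump_if_true and the buffer size are assumptions made for the sketch, while code_s, code_i and the two chain names come from these notes. The code that evaluates the two arithmetic expressions and compares them is assumed to have been emitted already.

```c
#include <assert.h>
#include <stdint.h>

enum { CODE_MAX = 256 };
uint8_t code_s[CODE_MAX];   /* code segment (name from the notes) */
int code_i = 0;             /* offset of the next free byte (name from the notes) */

typedef struct {            /* hypothetical symbol-stack entry */
    int head_of_true_chain;
    int head_of_false_chain;
} Attr;

/* Emit "0F cc 00 00": a near conditional jump whose operand is still 0.
   cc is the condition byte, e.g. 0x84 for JE. */
Attr emit_jump_if_true(uint8_t cc) {
    Attr entry;
    assert(code_i + 4 <= CODE_MAX);
    code_s[code_i++] = 0x0F;
    code_s[code_i++] = cc;
    code_s[code_i++] = 0x00;                 /* operand, low byte  */
    code_s[code_i++] = 0x00;                 /* operand, high byte */
    entry.head_of_true_chain = code_i - 2;   /* offset of the operand field */
    entry.head_of_false_chain = 0;           /* no branch-if-false yet */
    return entry;
}
```

Calling emit_jump_if_true(0x84) on an empty code segment leaves the bytes 0f 84 00 00 at offsets 0 to 3 and returns an entry whose true chain head is 2.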
BOOL_SECONDARY → NOT BOOL_PRIMARY
Clearly all the code needs to do here is to set $$.head_of_false_chain to $2.head_of_true_chain, and
$$.head_of_true_chain to $2.head_of_false_chain.
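A minimal sketch of this action, assuming a hypothetical struct Attr for the symbol-stack entry (the notes use macros over existing fields instead). No code is generated and no opcode is changed: the branches already emitted simply change roles.

```c
#include <assert.h>

typedef struct {            /* hypothetical symbol-stack entry */
    int head_of_true_chain;
    int head_of_false_chain;
} Attr;

/* BOOL_SECONDARY -> NOT BOOL_PRIMARY: swap the two chain heads. */
Attr not_action(Attr primary) {
    Attr result;
    assert(primary.head_of_true_chain >= 0 && primary.head_of_false_chain >= 0);
    result.head_of_true_chain = primary.head_of_false_chain;
    result.head_of_false_chain = primary.head_of_true_chain;
    return result;
}
```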
BOOL_FACTOR → BOOL_FACTOR AND BOOL_SECONDARY
See steps 3 and 5 in the example below.
Consider the conditional branch right at the end of the code for $1 (i.e. BOOL_FACTOR). If this were a
branch for if $1 is true, then it would have to branch to the start of the code for evaluating $3. On the
other hand if it were a branch for if $1 is false, then it would constitute a branch for if $$ was false, and
the code would here avoid having to evaluate $3. So the first thing to do is to ensure that the conditional
branch at the end of the code for $1 is a branch for if $1 is false. You can tell which it is by checking
which of $1.head_of_false_chain or $1.head_of_true_chain is larger. If the conditional branch instruction
is a branch for if $1 is true, then you need to change it to a branch for the opposite condition to make it a
branch for if $1 is false. It so happens that this can be done in all cases by merely reversing the
low-order bit of the opcode involved (i.e. of the byte following the 0f byte). This can be done by means of
an exclusive-or with 1. E.g. if x is the value of $1.head_of_true_chain, one could use code such as:
code_s[x - 1] = code_s[x - 1] ^ 0x01;
You will now need to fix up the false chain, since it has been given a new head, and
amend the values of $1.head_of_true_chain and $1.head_of_false_chain accordingly. After all this is
done, you can make all the members of the true chain for $1 (conditional) branches to the start of the
code for $3. If y is the larger value of $1.head_of_false_chain and $1.head_of_true_chain, then the code
for $3 starts at code_s[y+2]. Now join together the false chains for $3 and $1, since these all are
branches for if $$ is false. Finally set $$.head_of_true_chain to $3.head_of_true_chain, and
$$.head_of_false_chain to the head of the combined false chains referred to above. Be careful to allow
for the case where the false chain for $3 is empty.
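The steps for AND can be sketched in C as below. This is a sketch under stated assumptions, not the course's reference code: the struct Attr and the helpers get16, put16, backpatch, join_chains and and_action are hypothetical names, the operand fields are assumed to be 16-bit little-endian, each chain member is assumed to store the offset of the previous member in its operand field (0 ends the chain), and resolved branches are stored as displacements relative to the byte after the operand.

```c
#include <assert.h>
#include <stdint.h>

uint8_t code_s[256];                       /* code segment (name from the notes) */

typedef struct {                           /* hypothetical symbol-stack entry */
    int head_of_true_chain;
    int head_of_false_chain;
} Attr;

/* Read / write the 16-bit little-endian operand field at offset off. */
int get16(int off) { return code_s[off] | (code_s[off + 1] << 8); }
void put16(int off, int v) {
    uint16_t u = (uint16_t)v;
    code_s[off] = (uint8_t)(u & 0xFF);
    code_s[off + 1] = (uint8_t)(u >> 8);
}

/* Turn every member of the chain into a branch to target. */
void backpatch(int head, int target) {
    while (head != 0) {
        int next = get16(head);            /* link to the previous member */
        put16(head, target - (head + 2));  /* relative displacement */
        head = next;
    }
}

/* Append chain b to the end of chain a; either may be empty (0). */
int join_chains(int a, int b) {
    int tail = a;
    if (a == 0) return b;
    while (get16(tail) != 0) tail = get16(tail);
    put16(tail, b);
    return a;
}

/* BOOL_FACTOR -> BOOL_FACTOR AND BOOL_SECONDARY */
Attr and_action(Attr left, Attr right) {
    Attr r;
    int x = left.head_of_true_chain;
    int f = left.head_of_false_chain;
    int right_start = (x > f ? x : f) + 2;     /* code for $3 starts here */
    assert(right_start > 2);
    if (x > f) {                               /* last branch of $1 is branch-if-true */
        code_s[x - 1] ^= 0x01;                 /* e.g. JE (84) becomes JNE (85) */
        left.head_of_true_chain = get16(x);    /* unlink it from the true chain */
        put16(x, f);                           /* link it onto the false chain */
        left.head_of_false_chain = x;
    }
    backpatch(left.head_of_true_chain, right_start);
    r.head_of_true_chain = right.head_of_true_chain;
    r.head_of_false_chain =
        join_chains(right.head_of_false_chain, left.head_of_false_chain);
    return r;
}
```

Because join_chains returns its second argument when the first is 0, the case where the false chain for $3 is empty needs no special handling in and_action.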
BOOL_EXPRESSION → BOOL_EXPRESSION OR BOOL_FACTOR
See steps 7 and 9 in the example below.
The code for this production follows similar principles to that described above for AND. In this case, in
order that the last conditional branch instruction in the code for $1 be a branch that, if taken, avoids
having to evaluate $3, it needs to be a branch for if $1 is true. If it isn't then you should change it by the
means described above, and adjust the true chain for $1, and the values of $1.head_of_false_chain and
$1.head_of_true_chain accordingly. Make the members of the false chain for $1 into branches to the
start of the code for $3. Then join together the true chains for $3 and $1, since these all are branches for
if $$ is true. Finally set $$.head_of_false_chain to $3.head_of_false_chain, and $$.head_of_true_chain
to the head of the combined true chains referred to above. Again be careful to allow for the case where
the true chain for $3 is empty.
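The OR action mirrors the AND action with the roles of the two chains exchanged. The sketch below makes the same assumptions as any such illustration must: Attr, get16, put16, backpatch, join_chains and or_action are hypothetical names, operands are 16-bit little-endian, unresolved operands hold the offset of the previous chain member, and resolved operands hold a displacement relative to the byte after the operand.

```c
#include <assert.h>
#include <stdint.h>

uint8_t code_s[256];                       /* code segment (name from the notes) */

typedef struct {                           /* hypothetical symbol-stack entry */
    int head_of_true_chain;
    int head_of_false_chain;
} Attr;

int get16(int off) { return code_s[off] | (code_s[off + 1] << 8); }
void put16(int off, int v) {
    uint16_t u = (uint16_t)v;
    code_s[off] = (uint8_t)(u & 0xFF);
    code_s[off + 1] = (uint8_t)(u >> 8);
}

/* Turn every member of the chain into a branch to target. */
void backpatch(int head, int target) {
    while (head != 0) {
        int next = get16(head);
        put16(head, target - (head + 2));
        head = next;
    }
}

/* Append chain b to the end of chain a; either may be empty (0). */
int join_chains(int a, int b) {
    int tail = a;
    if (a == 0) return b;
    while (get16(tail) != 0) tail = get16(tail);
    put16(tail, b);
    return a;
}

/* BOOL_EXPRESSION -> BOOL_EXPRESSION OR BOOL_FACTOR */
Attr or_action(Attr left, Attr right) {
    Attr r;
    int t = left.head_of_true_chain;
    int x = left.head_of_false_chain;
    int right_start = (t > x ? t : x) + 2;     /* code for $3 starts here */
    assert(right_start > 2);
    if (x > t) {                               /* last branch of $1 is branch-if-false */
        code_s[x - 1] ^= 0x01;                 /* reverse the condition */
        left.head_of_false_chain = get16(x);   /* unlink it from the false chain */
        put16(x, t);                           /* link it onto the true chain */
        left.head_of_true_chain = x;
    }
    backpatch(left.head_of_false_chain, right_start);
    r.head_of_false_chain = right.head_of_false_chain;
    r.head_of_true_chain =
        join_chains(right.head_of_true_chain, left.head_of_true_chain);
    return r;
}
```

With this way of joining, the head of the combined true chain is the head of the true chain for $3 whenever that chain is not empty, and the last member of that chain is linked to the head of the true chain for $1.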
IF_PREFIX → IF BOOL_EXPRESSION
As in the code for AND, ensure that the last branch in the code for $2 is a branch if $2 is false, as this is
the case where we do not execute the block of statements following. Adjust the false chain for $2 and
$2.head_of_false_chain and $2.head_of_true_chain accordingly. Now make all the members of the true
chain pointed to by $2.head_of_true_chain into branches to code_i. Set $$.head_of_false_chain to
$2.head_of_false_chain.
Code for the IF statement without ELSE
SIMPLE_IF → IF_PREFIX TRUE_BLOCK ENDIF
Make all the members of the false chain pointed to by $1.head_of_false_chain into branches to code_i.
Code for the IF statement including ELSE
SIMPLE_IF_ELSE → IF_PREFIX TRUE_BLOCK ELSE
Generate an unconditional branch with offset 0000, and set $$.head_of_true_chain to point to this offset
field. Make all the members of the false chain pointed to by $1.head_of_false_chain into branches to
code_i.
SIMPLE_IF_ELSE_BLOCK → SIMPLE_IF_ELSE FALSE_BLOCK ENDIF
Make the branch pointed to by $1.head_of_true_chain into a branch to code_i.
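The actions for the IF productions can be sketched together as follows. The names Attr, get16, put16, backpatch and the four *_action functions are hypothetical, as is the choice of opcode E9 (JMP with a 16-bit relative operand) for the unconditional branch, which these notes do not specify. Operands are assumed to be 16-bit little-endian, with resolved branches stored as displacements relative to the byte after the operand.

```c
#include <assert.h>
#include <stdint.h>

enum { CODE_MAX = 256 };
uint8_t code_s[CODE_MAX];                  /* code segment (name from the notes) */
int code_i = 0;                            /* offset of the next free byte */

typedef struct {                           /* hypothetical symbol-stack entry */
    int head_of_true_chain;
    int head_of_false_chain;
} Attr;

int get16(int off) { return code_s[off] | (code_s[off + 1] << 8); }
void put16(int off, int v) {
    uint16_t u = (uint16_t)v;
    code_s[off] = (uint8_t)(u & 0xFF);
    code_s[off + 1] = (uint8_t)(u >> 8);
}

/* Turn every member of the chain into a branch to target. */
void backpatch(int head, int target) {
    while (head != 0) {
        int next = get16(head);
        put16(head, target - (head + 2));
        head = next;
    }
}

/* IF_PREFIX -> IF BOOL_EXPRESSION */
Attr if_prefix_action(Attr cond) {
    int x = cond.head_of_true_chain;
    if (x > cond.head_of_false_chain) {        /* last branch is branch-if-true */
        code_s[x - 1] ^= 0x01;                 /* reverse the condition */
        cond.head_of_true_chain = get16(x);
        put16(x, cond.head_of_false_chain);
        cond.head_of_false_chain = x;
    }
    backpatch(cond.head_of_true_chain, code_i);  /* true block starts at code_i */
    cond.head_of_true_chain = 0;
    return cond;
}

/* SIMPLE_IF -> IF_PREFIX TRUE_BLOCK ENDIF */
void simple_if_action(Attr prefix) {
    backpatch(prefix.head_of_false_chain, code_i);
}

/* SIMPLE_IF_ELSE -> IF_PREFIX TRUE_BLOCK ELSE */
Attr else_action(Attr prefix) {
    Attr r;
    assert(code_i + 3 <= CODE_MAX);
    code_s[code_i++] = 0xE9;                   /* JMP rel16 (assumed encoding) */
    code_s[code_i++] = 0x00;
    code_s[code_i++] = 0x00;
    r.head_of_true_chain = code_i - 2;         /* operand of the JMP */
    r.head_of_false_chain = prefix.head_of_false_chain;
    backpatch(prefix.head_of_false_chain, code_i);  /* false block starts here */
    return r;
}

/* SIMPLE_IF_ELSE_BLOCK -> SIMPLE_IF_ELSE FALSE_BLOCK ENDIF */
void endif_action(Attr simple_if_else) {
    backpatch(simple_if_else.head_of_true_chain, code_i);
}
```

In a yacc grammar each function body would sit in the action of the production named in its comment, with the code for TRUE_BLOCK and FALSE_BLOCK advancing code_i between the calls.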
NOTE
Yacc reports a shift-reduce conflict for most grammars written for the IF statement.
For example, with the statement:
if A then if B then C else D
the shift-reduce conflict occurs in the state reached when the next input symbol
is “else”. If the shift action is chosen, the statement will be interpreted as:
if A then (if B then C else D)
whereas if the reduce action is taken, it will be interpreted as:
if A then (if B then C) else D
As you can see, if A is false in the first case, then D will not be executed (nor will C), whereas if A is false
in the second case, then D will be executed.
Nearly all programming languages adopt the first interpretation (obtained by taking the shift action). Since
the default action that Yacc takes in shift-reduce conflicts is in fact the shift action, the above conflict can
be ignored for such languages. While it is possible to produce grammars for the IF statement that
avoid the conflict, this is not necessary.
EXAMPLE
Let’s consider the problem of generating the short-circuit code for:
if (a and b and c) or (d and e and f) or (g and h and i)
go to place1
else
go to place2
where the letters a – i are conditions such as X > Y+2.
Let the letters A – I denote the code for evaluating a – i respectively.
By e.g. Cjt300h we mean the code for evaluating condition c followed by a jump to offset 300h in the
case where c is true. For example, if c is X > Y + 2, then Cjt300h could represent the following code [1]:
mov ax, y
add ax, 2
cmp x, ax
jg 300h
By Cjf300h we mean the code for the comparison as above, followed by a jump to offset 300h in the case
where c is false, i.e. the last instruction should be changed to jle 300h.
Using our productions for the if statement, the following is a derivation of the example’s if statement.
[1] By jg 300h we mean the conditional jump to offset 300h in the code segment, but jg is a relative jump whose operand
indicates how many bytes to jump forward or backward, so the actual operand required to branch to offset 300h has to be
evaluated.
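The operand of a near conditional jump is a displacement that the processor adds to the address of the instruction following the jump. Since the 16-bit operand is the last field of the instruction, that address is the operand's offset plus 2. A minimal sketch of the calculation (rel16 is a hypothetical helper name):

```c
#include <assert.h>
#include <stdint.h>

/* Displacement to store in the 16-bit operand at operand_offset so that
   the jump lands on target. Backward jumps wrap to two's complement. */
uint16_t rel16(int operand_offset, int target) {
    assert(operand_offset >= 0 && target >= 0);
    return (uint16_t)(target - (operand_offset + 2));
}
```

For instance, a jump whose operand field sits at offset 102h and which must reach 300h gets the operand 300h - 104h = 1FCh.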
DERIVATION OF THE EXAMPLE’S IF STATEMENT
The “bool_” prefixes in the nonterminals have been omitted for brevity.
The reductions are numbered in the order they occur in a parse using an LR(1) parsing machine.
The derivation of “go to true-place” from true-block, and of “go to false-place” from false-block
are not shown.
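[Derivation tree, given here as the list of its numbered reductions]
 1  primary → a
 2  secondary → primary
 3  factor → secondary
 4  primary → b
 5  secondary → primary
 6  factor → factor and secondary
 7  primary → c
 8  secondary → primary
 9  factor → factor and secondary
10  expression → factor
11  primary → ( expression )
12  secondary → primary
13  factor → secondary
14  expression → factor
15  primary → d
16  secondary → primary
17  factor → secondary
18  primary → e
19  secondary → primary
20  factor → factor and secondary
21  primary → f
22  secondary → primary
23  factor → factor and secondary
24  expression → factor
25  primary → ( expression )
26  secondary → primary
27  factor → secondary
28  expression → expression or factor
29  primary → g
30  secondary → primary
31  factor → secondary
32  primary → h
33  secondary → primary
34  factor → factor and secondary
35  primary → i
36  secondary → primary
37  factor → factor and secondary
38  expression → factor
39  primary → ( expression )
40  secondary → primary
41  factor → secondary
42  expression → expression or factor
43  if_prefix → if expression
44  simple_if_else → if_prefix true-block else
45  simple_if_else_block → simple_if_else false-block endif
46  if_statement → simple_if_else_block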
SNAPSHOTS OF THE PARSE OF THE EXAMPLE IF STATEMENT
Our snapshots show relevant parts of the content of the code segment and symbol stack after selected reductions, identified
by the numbering given in the derivation above. After a reduction by a production such as
bool_factor → bool_factor and bool_secondary
with our original method, in which the entries in symbol_stack were grammar symbols, we replaced the top three members of
the stack by bool_factor. Now we are employing a stack in which each entry is a pair of numbers, which we have named
head_of_true_chain and head_of_false_chain. Our algorithm is designed to make head_of_true_chain point [2] to a back
chain in the code segment of jumps that will occur if the expression derived from bool_factor is true, and head_of_false_chain
point to a back chain of such jumps that will occur if the expression involved is false. For illustration purposes, our snapshots
also depict the source code corresponding to the code that has been generated.
1. The situation after reduction 1: primary → a
symbol stack (abbrev. ss):  primary [t=n1 f=0]
code segment (abbrev. cs) and source code involved (abbrev. sc):
offset  cs       sc
n1      AJt0     ( a
Here the code for comparison a, followed by a jump if true with a zero operand, has been generated in the code segment, and
n1 is the offset of the operand field involved. In the symbol stack, “t=n1 f=0” means that the head of the true chain for $$ has
been set to n1, and the head of its false chain has been set to zero.
2. After reduction 5: secondary → primary
ss:  factor [t=n1 f=0]   and   secondary [t=n2 f=0]
offset  cs       sc
n1      AJt0     ( a
n2      BJt0     and b
3. After reduction 6: factor → factor and secondary
ss:  factor [t=n2 f=n1]
offset  cs       sc
n1      AJf0     ( a
n2      BJt0     and b
4. After reduction 8: secondary → primary
ss:  factor [t=n2 f=n1]   and   secondary [t=n3 f=0]
offset  cs       sc
n1      AJf0     ( a
n2      BJt0     and b
n3      CJt0     and c
[2] While one could use a C pointer for this purpose, by “points to” I mean here “supplies the offset in the code segment of”.
5. After reduction 9: factor → factor and secondary
Note that we have changed the BJt to BJf, since we want to avoid evaluating c in the case where b is false. If b is true,
we need to evaluate c to determine whether a and b and c is true or not. Had BJt been kept, it would have had to be a
conditional branch to C, and C would thus be evaluated whether or not b were true.
ss:  factor [t=n3 f=n2]
offset  cs       sc
n1      AJf0     ( a
n2      BJfn1    and b
n3      CJt0     and c
6. After reduction 27: factor → secondary
ss:  expression [t=n3 f=n2]   or   factor [t=n6 f=n5]
offset  cs       sc
n1      AJf0     ( a
n2      BJfn1    and b
n3      CJt0     and c )
n4      DJf0     or ( d
n5      EJfn4    and e
n6      FJt0     and f )
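7. After reduction 28: expression → expression or factor
ss:  expression [t=n6 f=n5]
offset  cs       sc
n1      AJfn3+2  ( a
n2      BJfn3+2  and b
n3      CJt0     and c )
n4      DJf0     or ( d
n5      EJfn4    and e
n6      FJtn3    and f )
The false chain for $1 (n2, n1) now branches to n3+2, the start of the code for $3. The true chains for $3 and $1 have been
joined, so the operand at n6 links to n3.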
8. After reduction 41: factor → secondary
ss:  expression [t=n6 f=n5]   or   factor [t=n9 f=n8]
offset  cs       sc
n1      AJfn3+2  ( a
n2      BJfn3+2  and b
n3      CJt0     and c )
n4      DJf0     or ( d
n5      EJfn4    and e
n6      FJtn3    and f )
n7      GJf0     or ( g
n8      HJfn7    and h
n9      IJt0     and i )
9. After reduction 42: expression → expression or factor
ss:  if   expression [t=n9 f=n8]
offset  cs       sc
n1      AJfn3+2  ( a
n2      BJfn3+2  and b
n3      CJt0     and c )
n4      DJfn6+2  or ( d
n5      EJfn6+2  and e
n6      FJtn3    and f )
n7      GJf0     or ( g
n8      HJfn7    and h
n9      IJtn6    and i )
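10. After reduction 43: if_prefix → if expression
ss:  if_prefix [t=0 f=n9]
offset  cs       sc
n1      AJfn3+2  ( a
n2      BJfn3+2  and b
n3      CJtn9+2  and c )
n4      DJfn6+2  or ( d
n5      EJfn6+2  and e
n6      FJtn9+2  and f )
n7      GJf0     or ( g
n8      HJfn7    and h
n9      IJfn8    and i )
n10     (next free offset, where the true block will start)
The last branch, IJt, has been changed to IJf and moved to the head of the false chain, so its operand links to n8.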
11. After reduction 44: simple_if_else → if_prefix tblk else
ss:  simple_if_else [t=n11 f=n9]
offset  cs        sc
n1      AJfn3+2   ( a
n2      BJfn3+2   and b
n3      CJtn9+2   and c )
n4      DJfn6+2   or ( d
n5      EJfn6+2   and e
n6      FJtn9+2   and f )
n7      GJfn11+2  or ( g
n8      HJfn11+2  and h
n9      IJfn11+2  and i )
n10     tblk      tblk
n11     J0        else
12. After reduction 45: simple_if_else_block → simple_if_else fblk endif
ss:  simple_if_else [t=n11 f=n9]   fblk   endif
offset  cs        sc
n1      AJfn3+2   ( a
n2      BJfn3+2   and b
n3      CJtn9+2   and c )
n4      DJfn6+2   or ( d
n5      EJfn6+2   and e
n6      FJtn9+2   and f )
n7      GJfn11+2  or ( g
n8      HJfn11+2  and h
n9      IJfn11+2  and i )
n10     tblk      tblk
n11     Jn13      else
n12     fblk      fblk
n13     (next free offset)