Optimization

CPSC 388 – Compiler Design and Construction
Optimization
Optimization Goal
 Produce Better Code
 Fewer instructions
 Faster Execution
 Do Not Change Behavior of Program!
Optimization Techniques
 Peep-hole optimization
 Done after code generation
 Makes small local changes to assembly
 Moving Loop-Invariants
 Done before code generation
 Find Computations in loops that can be moved
outside
 Strength Reduction in for loops
 Done before code generation
 Replace multiplications with additions
 Copy Propagation
 Done before code generation
 Replace use of variable with literal or other variable
Peep-hole Optimization
 Look through small window at assembly
code for common cases that can be
improved
1. Redundant load
2. Redundant push/pop
3. Replace a Jump to a jump
4. Remove a Jump to next instruction
5. Replace a Jump around jump
6. Remove Useless operations
7. Reduction in strength
Redundant Load
 Before
store Rx, M
load M, Rx
 After
store Rx, M
Redundant Push/Pop
 Before
push Rx
pop Rx
 After
… nothing …
Replace a jump to a jump
 Before
goto L1
…
L1:goto L2
 After
goto L2
…
L1:goto L2
Remove a Jump to next Instruction
 Before
goto L1
L1:…
 After
L1:…
Replace a jump around jump
 Before
if T0 = 0 goto L1
else goto L2
L1:…
 After
if T0 != 0 goto L2
L1:…
Remove useless operations
 Before
add T0, T0, 0
mul T0, T0, 1
 After
… nothing …
Reduction in Strength
 Before
mul T0, T0, 2
add T0, T0, 1
 After
shift-left T0
inc T0
One optimization may lead to another
load Tx, M
add Tx, 0
store Tx, M
 After One Optimization:
load Tx, M
store Tx, M
 After Another Optimization:
load Tx, M
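The cascade above can be sketched as a tiny peephole pass. This is a minimal illustration, not a real assembler pass: instructions are plain strings, the class and method names are hypothetical, and only two patterns are handled (the two-operand "add R, 0" and a store that directly follows a load of the same register and location).

```java
import java.util.ArrayList;
import java.util.List;

class PeepholeSketch {
    // Splits "add Tx, 0" into ["add", "Tx", "0"].
    static String[] parts(String instr) {
        return instr.trim().split("[ ,]+");
    }

    // One left-to-right pass. Each incoming instruction is compared with the
    // last instruction already kept, which acts as a two-instruction window.
    static List<String> onePass(List<String> code) {
        List<String> out = new ArrayList<>();
        for (String instr : code) {
            String[] p = parts(instr);
            // Useless operation: two-operand "add R, 0" leaves R unchanged.
            if (p.length == 3 && p[0].equals("add") && p[2].equals("0")) {
                continue;
            }
            // Redundant store: "store R, M" directly after "load R, M".
            if (p.length == 3 && p[0].equals("store") && !out.isEmpty()) {
                String[] q = parts(out.get(out.size() - 1));
                if (q.length == 3 && q[0].equals("load")
                        && q[1].equals(p[1]) && q[2].equals(p[2])) {
                    continue;
                }
            }
            out.add(instr);
        }
        return out;
    }

    // Repeat passes until a pass changes nothing, so that one rewrite can
    // expose another.
    static List<String> optimize(List<String> code) {
        List<String> current = new ArrayList<>(code);
        List<String> next = onePass(current);
        while (!next.equals(current)) {
            current = next;
            next = onePass(current);
        }
        return next;
    }

    public static void main(String[] args) {
        List<String> code = List.of("load Tx, M", "add Tx, 0", "store Tx, M");
        System.out.println(optimize(code)); // prints [load Tx, M]
    }
}
```

Removing the add is what makes the load and store adjacent, which is why the second rewrite only becomes possible after the first.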
You Try It
The code generated from this program contains opportunities for two of these
kinds (redundant load, jump to a jump). Can you explain how just by
looking at the source code?
public class Opt {

  public static void main() {
    int a;
    int b;
    if (true) {
      if (true) {
        b = 0;
      }
      else {
        b = 1;
      }
      return;
    }
    a = 1;
    b = a;
  }
}
Moving Loop-Invariant Computations Out of the Loop
 For greatest gain, optimize “hot
spots”, i.e. inner loops.
 An expression is loop invariant if the
same value is computed on every
iteration of the loop
 Compute the value once outside loop
and reuse value inside loop
Example
for (int i=0;i<100;i++) {
  for (int j=0;j<100;j++) {
    for (int k=0;k<100;k++) {
      A[i][j][k]=i*j*k;
    }
  }
}
Example
for (int i=0;i<100;i++) {
  for (int j=0;j<100;j++) {
    for (int k=0;k<100;k++) {
      T0=i*j*k;
      T1=FP+<offset of A>-i*40000-j*400-k*4;
      store T0, 0(T1)
    }
  }
}
Invariant to I loop: FP+<offset of A>
Invariant to J loop: i*40000
Invariant to K loop: i*j, j*400
Example
tmp0=FP + <offset of A>
for (int i=0;i<100;i++) {
  tmp1=tmp0-i*40000;
  for (int j=0;j<100;j++) {
    tmp2=tmp1-j*400;
    tmp3=i*j;
    for (int k=0;k<100;k++) {
      T0=tmp3*k;
      T1=tmp2-k*4;
      store T0, 0(T1)
    }
  }
}
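The same hoisting can be tried at the source level. The sketch below is an illustration under assumptions: the class and method names are hypothetical, and Java array references stand in for the address arithmetic in the slide. It checks that the hoisted version fills the array with the same values as the original.

```java
import java.util.Arrays;

class HoistSketch {
    // Original loop nest: everything is recomputed in the innermost loop.
    static int[][][] original(int n) {
        int[][][] a = new int[n][n][n];
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < n; j++) {
                for (int k = 0; k < n; k++) {
                    a[i][j][k] = i * j * k;
                }
            }
        }
        return a;
    }

    // Hoisted version: each value is computed in the outermost loop
    // whose index it depends on.
    static int[][][] hoisted(int n) {
        int[][][] a = new int[n][n][n];
        for (int i = 0; i < n; i++) {
            int[][] plane = a[i];          // invariant to the j and k loops
            for (int j = 0; j < n; j++) {
                int[] row = plane[j];      // invariant to the k loop
                int ij = i * j;            // invariant to the k loop
                for (int k = 0; k < n; k++) {
                    row[k] = ij * k;
                }
            }
        }
        return a;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.deepEquals(original(20), hoisted(20))); // prints true
    }
}
```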
Comparison before and after of innermost loop (executed 1 million times)
Original Code
 5 multiplications (3 for lvalue, 2 for rvalue)
 3 subtractions (for lvalue)
 1 indexed store
New Code
 2 multiplications (1 for lvalue, 1 for rvalue)
 1 subtraction (for lvalue)
 1 indexed store
Questions
 How do you recognize loop-invariant
expressions?
 When and where do we move the
computations of those expressions?
Recognizing Loop Invariants
An expression is invariant with respect
to a loop if for every operand, one of
the following holds:
 It is a literal
 It is a variable that gets its value only
from outside the loop
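A minimal sketch of this test, assuming the compiler has already collected the set of variables assigned anywhere inside the loop. The names are hypothetical, and "gets its value only from outside the loop" is approximated as "is not assigned inside the loop".

```java
import java.util.List;
import java.util.Set;

class InvariantCheck {
    // Assumption: literals are plain decimal integers.
    static boolean isLiteral(String operand) {
        return operand.matches("-?\\d+");
    }

    // An expression is treated as invariant if every operand is a literal
    // or a variable that is never assigned inside the loop.
    static boolean isInvariant(List<String> operands, Set<String> assignedInLoop) {
        for (String operand : operands) {
            if (!isLiteral(operand) && assignedInLoop.contains(operand)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Set<String> assignedInKLoop = Set.of("k", "T0", "T1");
        System.out.println(isInvariant(List.of("i", "j"), assignedInKLoop));    // prints true
        System.out.println(isInvariant(List.of("tmp3", "k"), assignedInKLoop)); // prints false
    }
}
```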
When and Where to move invariant expressions
 Must consider safety of move
 Must consider profitability of move
Safety of moving invariants
 A move is unsafe if evaluating the expression might cause
an error and the loop might not get executed:
b=a;
while (a != 0) {
  x = 1/b; //possible “/0” if moved
  a--;
}
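The hazard can be reproduced directly in Java, where integer division by zero throws ArithmeticException. The sketch below uses hypothetical names. The guardedHoist variant is one common remedy (test the loop condition once before the hoisted code) and is an assumption of this example, not something stated in the slides.

```java
class HoistSafety {
    // Original loop: 1/b is only evaluated if the loop body runs.
    static int original(int a) {
        int b = a;
        int x = 0;
        while (a != 0) {
            x = 1 / b;
            a--;
        }
        return x;
    }

    // Unsafe hoist: 1/b is evaluated even when the loop runs zero times.
    static int unsafeHoist(int a) {
        int b = a;
        int t = 1 / b;      // throws ArithmeticException when a == 0
        int x = 0;
        while (a != 0) {
            x = t;
            a--;
        }
        return x;
    }

    // Guarded hoist: the division happens only if the loop will run.
    static int guardedHoist(int a) {
        int b = a;
        int x = 0;
        if (a != 0) {
            int t = 1 / b;
            do {
                x = t;
                a--;
            } while (a != 0);
        }
        return x;
    }

    public static void main(String[] args) {
        System.out.println(original(0));      // prints 0
        System.out.println(guardedHoist(0));  // prints 0
        try {
            unsafeHoist(0);
        } catch (ArithmeticException e) {
            System.out.println("unsafe hoist failed: " + e.getMessage());
        }
    }
}
```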
Safety of moving invariants
 What about preserving order of
events?
 If the unoptimized code performed
output and THEN had a runtime error
 Is it valid for the optimized code to
simply have a runtime error?
 Changing order of computations may
change result for floating-point
computations due to differing
precisions
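A small illustration of the floating-point point, using hypothetical method names: the same three doubles added in two different orders give different results, because the small term is lost to rounding when it is added to the large one first.

```java
class FloatOrder {
    static double leftFirst(double a, double b, double c) {
        return (a + b) + c;
    }

    static double rightFirst(double a, double b, double c) {
        return a + (b + c);
    }

    public static void main(String[] args) {
        double a = 1e20, b = -1e20, c = 1.0;
        System.out.println(leftFirst(a, b, c));   // prints 1.0
        System.out.println(rightFirst(a, b, c));  // prints 0.0
    }
}
```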
Profitability of moving invariants
If the computation might NOT
execute in the original program then
moving the computation might
actually slow down the program!
Moving is Safe and Profitable If
 Loop will execute at least once
 Code will execute if loop does
 Isn’t inside any condition
 Is on all paths through loop (both if and
else portions)
 Expression is in non short-circuited
part of the loop test
 E.g. while (x < i+j*100)
You Try It
 What are some examples of loops for
which the compiler can be sure that
the loop will execute at least once?
Strength Reduction
 Concentrate on “hot spots”
 Replace expensive operations (*) with
cheaper ones (+)
Example Strength Reduction
For i from low to high do
…i*k1+k2
Where
 i is the loop index
 k1 and k2 are constant with respect to
the loop
Consider the sequence of values for i and expression
Examples Strength Reduction
Iteration #   i          i*k1+k2
1             low        low*k1+k2
2             low+1      (low+1)*k1+k2 = low*k1+k2+k1
3             low+1+1    (low+1+1)*k1+k2 = low*k1+k2+k1+k1
Example Strength Reduction
 Compute low*k1+k2 once before loop
 Store value in a temporary
 Use the temporary instead of the
expression inside loop
 Increment temporary by k1 at the
end of the loop
Example Strength Reduction
temp=low*k1+k2
For i from low to high do
  …temp…
  temp=temp+k1
end
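The transformation can be checked with a short Java sketch. The names are hypothetical and the code assumes low <= high: one method evaluates i*k1+k2 with a multiplication on every iteration, the other keeps a running temporary and adds k1, and both produce the same sequence.

```java
import java.util.Arrays;

class StrengthReductionSketch {
    // Direct form: one multiplication and one addition per iteration.
    static int[] withMultiply(int low, int high, int k1, int k2) {
        int[] out = new int[high - low + 1];
        for (int i = low; i <= high; i++) {
            out[i - low] = i * k1 + k2;
        }
        return out;
    }

    // Strength-reduced form: the multiplication happens once before the
    // loop, and each iteration only adds k1 to the temporary.
    static int[] withAddition(int low, int high, int k1, int k2) {
        int[] out = new int[high - low + 1];
        int temp = low * k1 + k2;
        for (int i = low; i <= high; i++) {
            out[i - low] = temp;
            temp = temp + k1;
        }
        return out;
    }

    public static void main(String[] args) {
        int[] a = withMultiply(3, 10, 7, -2);
        int[] b = withAddition(3, 10, 7, -2);
        System.out.println(Arrays.equals(a, b)); // prints true
    }
}
```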
Another Example
tmp0 = FP + offset A
for (i=0; i<100; i++) {
  tmp1 = tmp0 - i*40000          // i * -40000 + tmp0
  for (j=0; j<100; j++) {
    tmp2 = tmp1 - j*400          // j * -400 + tmp1
    tmp3 = i*j                   // j * i + 0
    for (k=0; k<100; k++) {
      T0 = tmp3 * k              // k * tmp3 + 0
      T1 = tmp2 - k*4            // k * -4 + tmp2
      store T0, 0(T1)
    }
  }
}
Now Perform Strength Reduction
tmp0 = FP + offset A
temp1 = tmp0                     // temp1 = 0*-40000+tmp0
for (i=0; i<100; i++) {
  tmp1 = temp1
  temp2 = tmp1                   // temp2 = 0*-400+tmp1
  temp3 = 0                      // temp3 = 0*i+0
  for (j=0; j<100; j++) {
    tmp2 = temp2
    tmp3 = temp3
    temp4 = 0                    // temp4 = 0*tmp3+0
    temp5 = tmp2                 // temp5 = 0*-4+tmp2
    for (k=0; k<100; k++) {
      T0 = temp4
      T1 = temp5
      store T0, 0(T1)
      temp4 = temp4 + tmp3
      temp5 = temp5 - 4
    }
    temp2 = temp2 - 400
    temp3 = temp3 + i
  }
  temp1 = temp1 - 40000
}
You Try It
 Suppose that the index variable is
incremented by something other than one
each time around the loop. For example,
consider a loop of the form:
for (i=low; i<=high; i+=2) ...
 Can strength reduction still be performed?
If yes, what changes must be made to the
proposed algorithm?
Copy Propagation
 A statement d of the form “x=y” is called
a copy statement. For every use, u, of
variable x reached by a copy statement d
such that:
 No other definition of x reaches u, and
 y can’t change between d and u
 You can replace the use of x at u with
a use of y.
Examples of Copy Propagation
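x=y
a=x+z
Yes: the use of x can be replaced with y

x=y
if (…) x=2
a=x+z
No: another definition of x (x=2) may reach the use

x=y
if (…) y=3
a=x+z
No: y may change between the copy and the use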
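A minimal sketch of copy propagation over straight-line code only (no branches), so it covers the first example and an unconditional version of the third. The representation is an assumption made for illustration: each statement is a String array holding the destination followed by its operands, and operators are omitted because they do not affect the propagation.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class CopyPropSketch {
    // {"x", "y"} stands for the copy x=y; {"a", "x", "z"} stands for a=x+z.
    static List<String[]> propagate(List<String[]> block) {
        Map<String, String> copies = new HashMap<>();   // x -> y for live copies x=y
        List<String[]> out = new ArrayList<>();
        for (String[] stmt : block) {
            String[] s = stmt.clone();
            // Replace each used variable by the source of a live copy, if any.
            for (int i = 1; i < s.length; i++) {
                s[i] = copies.getOrDefault(s[i], s[i]);
            }
            String dest = s[0];
            // Assigning dest invalidates copies into dest and copies out of dest.
            copies.remove(dest);
            copies.values().removeIf(src -> src.equals(dest));
            // Record a new copy statement.
            if (s.length == 2 && !s[1].equals(dest)) {
                copies.put(dest, s[1]);
            }
            out.add(s);
        }
        return out;
    }

    public static void main(String[] args) {
        List<String[]> block = new ArrayList<>();
        block.add(new String[] {"x", "y"});        // x=y
        block.add(new String[] {"a", "x", "z"});   // a=x+z
        for (String[] s : propagate(block)) {
            System.out.println(String.join(" ", s));
        }
    }
}
```

A full implementation would work on the control-flow graph with reaching-definitions information, which is what the two conditions on d and u express.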
Question
 Why is this a useful transformation?
 If ALL uses of x reached by definition
d are replaced, then the definition
d is useless, and can be removed.
tmp0 = FP + offset A
temp1 = tmp0                     // cannot be propagated
for (i=0; i<100; i++) {
  tmp1 = temp1
  temp2 = tmp1                   // cannot be propagated
  temp3 = 0                      // cannot be propagated
  for (j=0; j<100; j++) {
    tmp2 = temp2
    tmp3 = temp3
    temp4 = 0                    // cannot be propagated
    temp5 = tmp2                 // cannot be propagated
    for (k=0; k<100; k++) {
      T0 = temp4
      T1 = temp5
      store T0, 0(T1)
      temp4 = temp4 + tmp3
      temp5 = temp5 - 4
    }
    temp2 = temp2 - 400
    temp3 = temp3 + i
  }
  temp1 = temp1 - 40000
}
tmp0 = FP + offset A
temp1 = tmp0
for (i=0; i<100; i++) {
  temp2 = temp1
  temp3 = 0
  for (j=0; j<100; j++) {
    temp4 = 0
    temp5 = temp2
    for (k=0; k<100; k++) {
      store temp4, 0(temp5)
      temp4 = temp4 + temp3
      temp5 = temp5 - 4
    }
    temp2 = temp2 - 400
    temp3 = temp3 + i
  }
  temp1 = temp1 - 40000
}
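The final code can be checked against the original loop with a Java translation. This is a sketch under assumptions: the names are hypothetical, and a flat int array indexed upward by elements replaces the downward byte addresses (FP + offset, steps of 4, 400 and 40000), so the steps become 1, n and n*n.

```java
import java.util.Arrays;

class ReducedLoopSketch {
    // Reference: A[i][j][k] = i*j*k stored in a flat array.
    static int[] reference(int n) {
        int[] a = new int[n * n * n];
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < n; j++) {
                for (int k = 0; k < n; k++) {
                    a[i * n * n + j * n + k] = i * j * k;
                }
            }
        }
        return a;
    }

    // Same shape as the optimized code: only additions and copies in the loops.
    static int[] reduced(int n) {
        int[] a = new int[n * n * n];
        int plane = n * n;               // size of one i-slice, computed once
        int temp1 = 0;                   // index of A[i][0][0]
        for (int i = 0; i < n; i++) {
            int temp2 = temp1;           // index of A[i][j][0]
            int temp3 = 0;               // running value of i*j
            for (int j = 0; j < n; j++) {
                int temp4 = 0;           // running value of i*j*k
                int temp5 = temp2;       // index of A[i][j][k]
                for (int k = 0; k < n; k++) {
                    a[temp5] = temp4;
                    temp4 = temp4 + temp3;
                    temp5 = temp5 + 1;
                }
                temp2 = temp2 + n;
                temp3 = temp3 + i;
            }
            temp1 = temp1 + plane;
        }
        return a;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.equals(reference(20), reduced(20))); // prints true
    }
}
```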
Comparison before and after
Before
 5 *, 3 +/-, 1 indexed store in innermost loop
After
 2 +/- in innermost loop
 2 +/-, 2 copy statements in middle loop
 1 +/-, 1 copy in outer loop