FSU
DEPARTMENT OF COMPUTER SCIENCE
Improving Performance by Branch Reordering
by
Minghui Yang and David Whalley, Florida State University
and
Gang-Ryung Uh, Lucent Technologies
Outline of Presentation
• Motivation
• Detecting a Reorderable Sequence
• Selecting the Sequence Ordering
• Applying the Transformation
• Results
• Future Work
Example Sequence of Comparisons with the Same Variable
(a) Original Code Segment

    while ((c = getchar()) != EOF)
        if (c == '\n')
            X;
        else if (c == ' ')
            Y;
        else
            Z;

(b) Conventional Reordering

    while (1) {
        c = getchar();
        if (c == ' ')
            Y;
        else if (c == '\n')
            X;
        else if (c == EOF)
            break;
        else
            Z;
    }

(c) Improved Reordering

    while (1) {
        c = getchar();
        if (c > ' ')
            goto def;
        else if (c == ' ')
            Y;
        else if (c == '\n')
            X;
        else if (c == EOF)
            break;
        else
    def:    Z;
    }
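The improved ordering works because '\n', ' ' and EOF all compare less than or equal to ' ', so a single test `c > ' '` sends every other printable character straight to the default target. The sketch below makes the original and improved versions runnable so they can be compared. The names `Counts`, `next_char`, `count_original` and `count_improved` are hypothetical, the statements X, Y and Z are replaced by counters, and input is read from a string rather than from `getchar()`; it assumes an ASCII character set.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical counters standing in for the statements X, Y and Z. */
typedef struct { long newlines, spaces, others; } Counts;

/* Hypothetical stand-in for getchar(): reads from a string and
   returns EOF once the terminating NUL is reached. */
static int next_char(const char **s)
{
    return **s ? (unsigned char)*(*s)++ : EOF;
}

/* (a) Original order: EOF, then '\n', then ' ', then the default. */
Counts count_original(const char *s)
{
    Counts k = {0, 0, 0};
    int c;
    while ((c = next_char(&s)) != EOF)
        if (c == '\n')
            k.newlines++;
        else if (c == ' ')
            k.spaces++;
        else
            k.others++;
    return k;
}

/* (c) Improved order: one comparison routes every value above ' '
   to the default target before the explicit values are tested. */
Counts count_improved(const char *s)
{
    Counts k = {0, 0, 0};
    int c;
    while (1) {
        c = next_char(&s);
        if (c > ' ')
            goto def;
        else if (c == ' ')
            k.spaces++;
        else if (c == '\n')
            k.newlines++;
        else if (c == EOF)
            break;
        else
def:        k.others++;
    }
    return k;
}
```

Both functions should return identical counts for any input; only the number of comparisons executed per character differs.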
Overview of Compilation Process for Branch Reordering
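[Figure: two-pass compilation flow]
1. First compilation: C source program -> executable instrumented for profiling.
2. Instrumented executable + training input data -> profile data.
3. Second compilation: C source program + profile data -> executable with branches reordered.
4. The executable with branches reordered is run on the test input data.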
Ranges and Corresponding Range Conditions
Form   Range     Range Condition
1      c..c      v == c
2      MIN..c    v <= c
3      c..MAX    v >= c
4      c1..c2    c1 <= v && v <= c2
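The four forms can be captured by a single inclusive pair of bounds. The following is a minimal sketch, assuming that MIN and MAX correspond to `INT_MIN` and `INT_MAX` for an `int` variable; the type `Range` and the functions `range_form` and `in_range` are hypothetical names introduced for illustration.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Hypothetical representation: an inclusive range lo..hi. */
typedef struct { int lo, hi; } Range;

/* Returns the form number (1 to 4) from the table above. */
int range_form(Range r)
{
    if (r.lo == r.hi)    return 1;   /* c..c   */
    if (r.lo == INT_MIN) return 2;   /* MIN..c */
    if (r.hi == INT_MAX) return 3;   /* c..MAX */
    return 4;                        /* c1..c2 */
}

/* Evaluates the range condition that corresponds to the form. */
bool in_range(Range r, int v)
{
    switch (range_form(r)) {
    case 1:  return v == r.lo;
    case 2:  return v <= r.hi;
    case 3:  return v >= r.lo;
    default: return r.lo <= v && v <= r.hi;
    }
}
```

Forms 1 to 3 need a single comparison, while form 4 needs two, which is one reason the cost of testing a range condition can differ between ranges.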
Requirements for a Sequence to Be Reorderable
• All the ranges in the sequence are nonoverlapping.
• The sequence can only be entered through the first range condition.
• The sequence has no side effects.
• Each range condition can only contain comparisons and branches.
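The first requirement is easy to check mechanically. The sketch below is one possible check, not the compiler's actual implementation: it sorts the ranges by lower bound and verifies that each range starts after the previous one ends. The type `Range` and the function `ranges_nonoverlapping` are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical representation: an inclusive range lo..hi. */
typedef struct { int lo, hi; } Range;

/* qsort comparator: ascending by lower bound. */
static int by_lo(const void *a, const void *b)
{
    const Range *x = a, *y = b;
    return (x->lo > y->lo) - (x->lo < y->lo);
}

/* Returns true if no two ranges share a value.
   Note: sorts the array r in place. */
bool ranges_nonoverlapping(Range *r, size_t n)
{
    qsort(r, n, sizeof *r, by_lo);
    for (size_t i = 1; i < n; i++)
        if (r[i].lo <= r[i - 1].hi)
            return false;
    return true;
}
```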
Example of Detecting Range Conditions
(a) C Code Segment

    if (c >= 'a' && c <= 'z' ||
        c >= 'A' && c <= 'Z')
        T1;
    else if (c == '_')
        T2;
    else if (c <= '~')
        T3;
    else
        T4;

(b) Control Flow

    Block   Contents     True      False
    1       c < 97       block 3   block 2
    2       c <= 122     block 5   block 3
    3       c < 65       block 6   block 4
    4       c > 90       block 6   block 5
    5       T1
    6       c != 95      block 8   block 7
    7       T2
    8       c > 126      block 10  block 9
    9       T3
    10      T4
Example of Detecting Range Conditions (cont.)
Blocks   Range         Target
1,2      [97..122]     T1
3,4      [65..90]      T1
6        [95..95]      T2
8        [127..MAX]    T4
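The table lists only the ranges that the branches test explicitly; every value outside them reaches T3. As a hedged check of that reading, the sketch below encodes the table and compares a table-driven lookup against the original if-else chain. The names `RangeEntry`, `explicit_ranges`, `classify_by_ranges` and `classify_original` are hypothetical, MAX is assumed to be `INT_MAX`, and the character constants assume ASCII.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

enum { T1 = 1, T2, T3, T4 };

/* Hypothetical table entry: inclusive range and the target it selects. */
typedef struct { int lo, hi, target; } RangeEntry;

/* The four explicit ranges detected from the control flow. */
static const RangeEntry explicit_ranges[] = {
    { 97, 122,     T1 },   /* blocks 1,2 */
    { 65,  90,     T1 },   /* blocks 3,4 */
    { 95,  95,     T2 },   /* block 6    */
    {127, INT_MAX, T4 },   /* block 8    */
};

/* Looks c up in the explicit ranges; anything else is the default, T3. */
int classify_by_ranges(int c)
{
    size_t n = sizeof explicit_ranges / sizeof explicit_ranges[0];
    for (size_t i = 0; i < n; i++)
        if (explicit_ranges[i].lo <= c && c <= explicit_ranges[i].hi)
            return explicit_ranges[i].target;
    return T3;
}

/* The original C code segment, with each target returned as a value. */
int classify_original(int c)
{
    if ((c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z'))
        return T1;
    else if (c == '_')
        return T2;
    else if (c <= '~')
        return T3;
    else
        return T4;
}
```

Because the explicit ranges do not overlap, the order of the table entries does not affect the result, which is what makes the sequence reorderable.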
Explicit and Default Ranges
• An explicit range is a range that is checked by a range condition.
• A default range is a range that is not checked by a range condition.

[Figure: sequence with two explicit ranges]
    P  -> R1
    R1: T -> T1 [c1..c2]          F -> R2
    R2: T -> T2 [c3..c4]          F -> TD
    TD: default ranges [MIN..c1-1], [c2+1..c3-1], [c4+1..MAX]
Example of Reordering Range Conditions
(a) Original Sequence

    P  -> R1
    R1: T -> T1 [c1..c2]          F -> R2
    R2: T -> T2 [c3..c4]          F -> TD
    TD: [MIN..c1-1], [c2+1..c3-1], [c4+1..MAX]

(b) Equivalent Original Sequence

    P  -> R1
    R1: T -> T1 [c1..c2]          F -> R2
    R2: T -> T2 [c3..c4]          F -> R3
    R3: T -> TD [MIN..c1-1]       F -> R4
    R4: T -> TD [c2+1..c3-1]      F -> R5
    R5: T -> TD [c4+1..MAX]
Example of Reordering Range Conditions (cont.)
(c) Reordered Sequence

    P  -> R5
    R5: T -> TD [c4+1..MAX]       F -> R1
    R1: T -> T1 [c1..c2]          F -> R2
    R2: T -> T2 [c3..c4]          F -> R3
    R3: T -> TD [MIN..c1-1]       F -> R4
    R4: T -> TD [c2+1..c3-1]

(d) Equivalent Reordered Sequence

    P  -> R5
    R5: T -> TD [c4+1..MAX]       F -> R1
    R1: T -> T1 [c1..c2]          F -> R2
    R2: T -> T2 [c3..c4]          F -> TD
    TD: [MIN..c1-1], [c2+1..c3-1]
Sequence Cost Equations
$p_i$ is the probability that $R_i$ will exit the sequence.
$c_i$ is the cost of testing $R_i$.

$$\mathrm{Explicit\_Cost}([R_1, \ldots, R_n]) = p_1 c_1 + p_2 (c_1 + c_2) + \cdots + p_n (c_1 + c_2 + \cdots + c_n)$$

The optimal order of a sequence of explicit range conditions is achieved by sorting them in descending order of $p_i / c_i$.

$$\mathrm{Cost}([R_1, \ldots, R_n]) = \mathrm{Explicit\_Cost}([R_1, \ldots, R_n]) + \bigl(1 - (p_1 + \cdots + p_n)\bigr)(c_1 + \cdots + c_n)$$
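The cost equations translate directly into a running sum. The sketch below is an illustration under stated assumptions, not the authors' implementation: `RangeCond`, `sequence_cost` and `order_by_ratio` are hypothetical names, probabilities and costs are stored as `double`, and every cost is assumed to be positive so that ratios can be compared by cross-multiplication.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical record for one explicit range condition R_i. */
typedef struct {
    double p;   /* probability that R_i exits the sequence */
    double c;   /* cost of testing R_i (assumed > 0)        */
} RangeCond;

/* Cost([R_1..R_n]): Explicit_Cost plus the default case, which
   falls through every test and so pays c_1 + ... + c_n. */
double sequence_cost(const RangeCond *r, int n)
{
    double cost = 0.0, tested = 0.0, psum = 0.0;
    for (int i = 0; i < n; i++) {
        tested += r[i].c;           /* c_1 + ... + c_i */
        cost   += r[i].p * tested;  /* p_i (c_1 + ... + c_i) */
        psum   += r[i].p;
    }
    return cost + (1.0 - psum) * tested;
}

/* qsort comparator: descending p/c, compared without dividing. */
static int by_ratio_desc(const void *a, const void *b)
{
    const RangeCond *x = a, *y = b;
    double lhs = x->p * y->c, rhs = y->p * x->c;
    return (lhs < rhs) - (lhs > rhs);
}

/* Sorts the explicit range conditions in descending order of p_i/c_i. */
void order_by_ratio(RangeCond *r, int n)
{
    qsort(r, (size_t)n, sizeof *r, by_ratio_desc);
}
```

As a worked case, three unit-cost conditions with probabilities 0.1, 0.6 and 0.2 cost 0.1(1) + 0.6(2) + 0.2(3) + 0.1(3) = 2.2 in that order, and 0.6(1) + 0.2(2) + 0.1(3) + 0.1(3) = 1.6 after sorting.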
Selecting the Sequence Ordering
• We need to select one of t targets as the default.
• A potential default target having m ranges could have 2^m - 1 combinations of ranges that do not have to be explicitly checked.
• We used the ordering p_1/c_1 ≥ ... ≥ p_m/c_m to select the lowest cost from only m combinations of default range conditions for each target:
  {R_m}, {R_m-1, R_m}, ..., {R_1, ..., R_m}
• The minimum cost among the t targets is selected.
• Only the costs of n sequences are considered, where n is the total number of ranges for all of the targets.
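The selection can be sketched as a search over n candidates. The code below is a simplified illustration, assuming that every range carries a probability, a positive test cost and a target index, and that a candidate's cost is the Cost formula applied to the ranges that remain explicit. The names `Range`, `Choice`, `cost_with_default` and `select_default` are hypothetical and do not come from the authors' compiler.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical record: exit probability, test cost (> 0), target index. */
typedef struct { double p, c; int target; } Range;

/* Hypothetical result: chosen default target, how many of its
   lowest-ratio ranges are left unchecked, and the sequence cost. */
typedef struct { int target, ndefault; double cost; } Choice;

/* qsort comparator: descending p/c, compared without dividing. */
static int by_ratio_desc(const void *a, const void *b)
{
    const Range *x = a, *y = b;
    double lhs = x->p * y->c, rhs = y->p * x->c;
    return (lhs < rhs) - (lhs > rhs);
}

/* Cost of the sorted sequence when the last k ranges of `target`
   (those with the lowest p/c) become default ranges. */
static double cost_with_default(const Range *r, int n, int target, int k)
{
    int m = 0, seen = 0;
    double cost = 0.0, tested = 0.0, psum = 0.0;
    for (int i = 0; i < n; i++)
        if (r[i].target == target)
            m++;
    for (int i = 0; i < n; i++) {
        if (r[i].target == target && ++seen > m - k)
            continue;                   /* default range: never tested */
        tested += r[i].c;
        cost   += r[i].p * tested;
        psum   += r[i].p;
    }
    return cost + (1.0 - psum) * tested;
}

/* Tries the m suffix combinations for each target, n candidates in
   total, and returns the cheapest. Sorts r in place. */
Choice select_default(Range *r, int n, int ntargets)
{
    Choice best = { -1, 0, 0.0 };
    qsort(r, (size_t)n, sizeof *r, by_ratio_desc);
    for (int t = 0; t < ntargets; t++) {
        int m = 0;
        for (int i = 0; i < n; i++)
            if (r[i].target == t)
                m++;
        for (int k = 1; k <= m; k++) {
            double cost = cost_with_default(r, n, t, k);
            if (best.target < 0 || cost < best.cost) {
                best.target = t;
                best.ndefault = k;
                best.cost = cost;
            }
        }
    }
    return best;
}
```

With three single-range targets of probability 0.5, 0.3 and 0.2 and test costs 1, 1 and 2, the candidates cost 2.4, 2.0 and 1.5 respectively, so the sketch picks the third target as the default even though it is the least likely: leaving its expensive two-comparison range unchecked saves the most.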
Applying the Reordering Transformation
[Figure: control-flow graphs showing the first stages of the transformation. Node labels: P1 to P3, R1 to R3, R1' to R3', S1, S2, T1 to T3, TD.]
(a) Original Sequence
(b) After Duplicating the Sequence
(c) After Eliminating Intervening Side Effects
Applying the Reordering Transformation (cont.)
[Figure: control-flow graphs showing the final stages of the transformation. Node labels: P1 to P3, R1 to R4, R1' to R3', S1, S2, T1 to T3, TD.]
(d) After Reordering Range Conditions
(e) After Dead Code Elimination
(e) After Dead Code Elimination
Heuristics Used for Translating switch Statements
Term   Definition
n      Number of cases in a switch statement.
m      Number of possible values between the first and last case.

Heuristic Set   Indirect Jump       Binary Search              Linear Search
I               n ≥ 4 && m ≤ 3n     !indirect_jump && n ≥ 8    !indirect_jump && !binary_search
II              n ≥ 16 && m ≤ 3n    !indirect_jump && n ≥ 8    !indirect_jump && !binary_search
III             never               never                      always
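The three heuristic sets amount to a small decision function. The following sketch restates the table in C; the enum `Translation` and the function `choose_translation` are hypothetical names, and the heuristic set is passed as 1, 2 or 3.

```c
#include <assert.h>

typedef enum { INDIRECT_JUMP, BINARY_SEARCH, LINEAR_SEARCH } Translation;

/* set: heuristic set (1, 2 or 3)
   n:   number of cases in the switch statement
   m:   number of possible values between the first and last case */
Translation choose_translation(int set, long n, long m)
{
    if (set == 3)
        return LINEAR_SEARCH;              /* Set III: always linear */

    long min_cases = (set == 1) ? 4 : 16;  /* Set I: n >= 4, Set II: n >= 16 */
    if (n >= min_cases && m <= 3 * n)
        return INDIRECT_JUMP;
    if (n >= 8)
        return BINARY_SEARCH;              /* !indirect_jump && n >= 8 */
    return LINEAR_SEARCH;                  /* neither of the above */
}
```

Only switch statements translated as a linear search produce a sequence of range conditions that can be reordered, so Set III exposes the most candidates and Set I the fewest.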
Dynamic Frequency Measurements
Switch Translation                Original      Reordered   Reordered
Heuristics           Program      Insts         Insts       Branches
Set I                awk          13,611,150     -2.02%      -4.19%
                     cb           17,100,927     -7.65%     -15.46%
                     cpp          18,883,104     -0.13%      -0.19%
                     ctags        71,889,513     -9.10%     -14.72%
                     deroff       15,460,307     -1.53%      -2.63%
                     grep          9,256,749     -3.60%      -8.31%
                     hyphen       18,059,010     +3.42%      +3.40%
                     join          3,552,801     -1.68%      -2.12%
                     lex          10,005,018     -4.56%     -10.39%
                     nroff        25,307,809     -2.48%      -6.35%
                     pr           73,051,342    -16.25%     -29.96%
                     ptx          20,059,901     -9.18%     -13.28%
                     sdiff        14,558,535    -16.09%     -37.03%
                     sed          14,229,310     -1.16%      -2.03%
                     sort         23,146,400    -47.20%     -57.38%
                     wc           25,818,199    -15.05%     -26.26%
                     yacc         25,127,817     -0.25%      -0.44%
                     average      23,477,465     -7.91%     -13.37%
Set II               average      23,510,571     -8.37%     -14.30%
Set III              average      24,556,842    -12.72%     -20.75%
Execution Time
Machine          Heuristic Set   Average Execution Time
SPARC IPC        I               -4.94%
SPARC 20         I               -5.57%
SPARC Ultra I    II              -2.88%
Future Work
• Using Binary Search Instead of Linear Search
• Contrasting Various Semi-static Search Methods
  - Linear Search
  - Binary Search
  - Jump Table
  - Combinations of Methods
• Reordering Branches with a Common Successor