# Register Allocation by Puzzle Solving

Jens Palsberg
UCLA Computer Science Department
University of California, Los Angeles
[email protected]
This talk
• Register allocation
• Aliased registers and pre-coloring
• Optimal live-range splitting produces elementary programs
• Elementary programs have elementary interference graphs
• Coloring elementary graphs is the same as solving puzzles
• A linear-time puzzle solving algorithm
• Spilling
• Experimental results
Register allocation → a collection of puzzles
A Compiler
source language → parser → intermediate representation → code generator → machine code
A Better Compiler
source language → parser → intermediate representation → code generator with a register allocator → machine code
What is Register Allocation?
A = 10
B = 20
C = A + 30
Print C + 40 + B
Assume we have two registers
Register allocation = liveness analysis + graph coloring
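A = 10 (live afterwards: A)
B = 20 (live afterwards: B, A)
C = A + 30 (live afterwards: B, C)
Print C + 40 + B
Interference graph: vertices A, B, C with edges A-B and B-C (A and C are never live at the same time)
With colors: A and C share one color, B gets the other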
After Register Allocation
Before:
A = 10
B = 20
C = A + 30
Print C + 40 + B
After:
R1 = 10
R2 = 20
R1 = R1 + 30
Print R1 + 40 + R2
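The example can be replayed in a few lines of Python. This is only a sketch for straight-line code, assuming a greedy coloring in definition order; the function names (`live_out_sets`, `interference_edges`, `greedy_color`) are made up for this illustration and are not taken from the talk.

```python
from itertools import combinations

# The example program: (variable defined or None, variables used).
program = [
    ("A", []),            # A = 10
    ("B", []),            # B = 20
    ("C", ["A"]),         # C = A + 30
    (None, ["C", "B"]),   # Print C + 40 + B
]

def live_out_sets(prog):
    """Backward liveness for straight-line code: variables live after each statement."""
    live = set()
    out = [None] * len(prog)
    for i in range(len(prog) - 1, -1, -1):
        defined, used = prog[i]
        out[i] = set(live)
        live = (live - {defined}) | set(used)
    return out

def interference_edges(prog):
    """Two variables interfere if they are live at the same program point."""
    edges = set()
    for live in live_out_sets(prog):
        edges.update(combinations(sorted(live), 2))
    return edges

def greedy_color(variables, edges, registers):
    """Give each variable the first register not used by an already-colored neighbor."""
    neighbors = {v: set() for v in variables}
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)
    assignment = {}
    for v in variables:
        taken = {assignment[u] for u in neighbors[v] if u in assignment}
        free = [r for r in registers if r not in taken]
        if not free:
            return None  # not colorable in this order; a real allocator would spill
        assignment[v] = free[0]
    return assignment

edges = interference_edges(program)
allocation = greedy_color(["A", "B", "C"], edges, ["R1", "R2"])
```

With two registers, `allocation` maps A and C to R1 and B to R2, which is the assignment used in the rewritten program above.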
Core (Spill-free) Register Allocation Problem
Instance: a program P and a number K of available registers.
Problem: can each variable of P be mapped to one of the K registers such that variables with interfering live ranges are assigned to different registers?
Theorem (Chaitin et al., 1981): NP-complete
A program in SSA form has a chordal interference graph
• Proved by three groups independently in 2005:
  • Bouchez (ENS Lyon)
  • Brisk et al. (UCLA)
  • Hack (U. Karlsruhe)
• A chordal graph can be colored in linear time
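One standard way to color a chordal graph optimally is maximum cardinality search followed by greedy coloring in visit order. The sketch below is an illustration under that assumption, not the method of any of the three groups, and it is written for clarity: picking the next vertex with `max` makes it quadratic, whereas the linear-time version keeps the vertices in buckets. The graph is assumed to be a dict from each vertex to its set of neighbors.

```python
def mcs_order(graph):
    """Maximum cardinality search: repeatedly visit the unvisited vertex
    that has the most already-visited neighbors."""
    weight = {v: 0 for v in graph}
    order = []
    while weight:
        v = max(weight, key=weight.get)
        order.append(v)
        del weight[v]
        for u in graph[v]:
            if u in weight:
                weight[u] += 1
    return order

def color_chordal(graph):
    """Greedy coloring in MCS visit order; optimal when the graph is chordal."""
    coloring = {}
    for v in mcs_order(graph):
        used = {coloring[u] for u in graph[v] if u in coloring}
        c = 0
        while c in used:
            c += 1
        coloring[v] = c
    return coloring

# Two triangles sharing the edge b-c, plus a pendant vertex e: a chordal graph.
example = {
    "a": {"b", "c"},
    "b": {"a", "c", "d"},
    "c": {"a", "b", "d"},
    "d": {"b", "c", "e"},
    "e": {"d"},
}
coloring = color_chordal(example)
```

The largest clique in `example` has three vertices, and the coloring uses exactly three colors.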
Aliased registers on the Pentium
• 32 bits: EAX, EBX, ECX, EDX
• 16 bits: AX, BX, CX, DX
• 8 bits: AH AL, BH BL, CH CL, DH DL
Pre-coloring
Weighted Graphs and Aligned 1-2-coloring
• Nodes have weight one or two
• Two numbers 2i and 2i+1 are aligned
• Aligned 1-2-coloring:
  • assigns a color to every vertex of weight one
  • assigns two aligned colors to every vertex of weight two
• Partial aligned 1-2-coloring: partial function
  • models pre-coloring
Aligned 1-2-coloring Extension
Instance: a number 2K of colors, a weighted graph G, and a partial aligned 1-2-coloring C of G
Problem: Extend C to an aligned 1-2-coloring of G.
Special cases:
• Aligned 1-2-coloring: no vertex is pre-colored
• Coloring Extension: all vertices have weight one
• Coloring: no vertex is pre-colored; all weight one
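The definition can be turned into a small validity checker. This is a sketch under assumed conventions: colors are the integers 0 to 2K-1, a coloring maps each vertex to a tuple of colors, and the name `is_aligned_12_coloring` is invented here. Reading color pair (0, 1) as the two 8-bit halves of one 16-bit register is also just an illustrative assumption.

```python
def is_aligned_12_coloring(edges, weight, coloring, num_colors):
    """Check that `coloring` is an aligned 1-2-coloring.

    edges:      iterable of (u, v) pairs of interfering vertices
    weight:     dict vertex -> 1 or 2
    coloring:   dict vertex -> tuple of colors (one color per unit of weight)
    num_colors: the number 2K of available colors, numbered 0 .. 2K-1
    """
    for v, w in weight.items():
        colors = coloring.get(v)
        if colors is None or len(colors) != w:
            return False
        if any(not 0 <= c < num_colors for c in colors):
            return False
        if w == 2:
            lo, hi = sorted(colors)
            # Aligned means the pair is (2i, 2i+1) for some i.
            if lo % 2 != 0 or hi != lo + 1:
                return False
    # Interfering vertices must not share any color.
    return all(not set(coloring[u]) & set(coloring[v]) for u, v in edges)

edges = [("x", "y"), ("x", "z")]
weight = {"x": 2, "y": 1, "z": 1}   # x needs a 16-bit register, y and z 8-bit ones
ok = is_aligned_12_coloring(edges, weight, {"x": (0, 1), "y": (2,), "z": (3,)}, 8)
clash = is_aligned_12_coloring(edges, weight, {"x": (0, 1), "y": (2,), "z": (1,)}, 8)
misaligned = is_aligned_12_coloring(edges, weight, {"x": (1, 2), "y": (0,), "z": (3,)}, 8)
```

Here `ok` is true, `clash` is false because z reuses half of x's pair, and `misaligned` is false because (1, 2) is not of the form (2i, 2i+1).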
Related work
| Problem | General | Chordal | Interval | Elementary |
|---|---|---|---|---|
| Aligned 1-2-coloring extension | NP-complete | NP-complete | NP-complete | Linear time [this paper] |
| Aligned 1-2-coloring | NP-complete | NP-complete | NP-complete | Linear time [this paper] |
| Coloring extension | NP-complete | NP-complete | NP-complete | Linear time [this paper] |
| Coloring | NP-complete | Linear time | Linear time | Linear time |
From strict programs to elementary programs
• Optimal live range splitting: strict program → elementary program
• Used by Appel and George (PLDI 2001)
[Figure: a basic block in which a parallel copy is placed between Statement1 and Statement2]
A program P is an elementary program if:
1. P is strict
2. P is in static single assignment form
3. For any variable v of P, LR(v) contains at most one program point outside the basic block that contains def(v)
4. If two variables u,v of P interfere, then either def(u) = def(v), or kill(u) = kill(v)
5. If two variables u,v of P interfere, then either LR(u) ⊆ LR(v), or LR(v) ⊆ LR(u)
Interference graph of the example program
A clique substitution of P3
• P3 is a path with three vertices
Elementary graphs
• Definition: G is an elementary graph if and only if every connected component of G is a clique substitution of P3
• Theorem: An elementary program has an elementary interference graph.
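To make the definition concrete, here is a sketch that builds a clique substitution of P3. It assumes the usual reading of clique substitution: each of the three path vertices is replaced by a clique (called X, Y, Z here), and two vertices from different cliques are adjacent exactly when the path vertices they replace are adjacent. The function name is invented for this illustration.

```python
from itertools import combinations, product

def clique_substitution_of_p3(X, Y, Z):
    """Edges of the graph obtained from the path x - y - z by replacing
    x, y, z with cliques on the vertex lists X, Y, Z."""
    edges = set()
    # Vertices inside the same clique are pairwise adjacent.
    for clique in (X, Y, Z):
        edges.update(frozenset(pair) for pair in combinations(clique, 2))
    # The path edges x - y and y - z become complete connections X-Y and Y-Z.
    for left, right in ((X, Y), (Y, Z)):
        edges.update(frozenset(pair) for pair in product(left, right))
    return edges

g = clique_substitution_of_p3(["a", "b"], ["c"], ["d", "e"])
```

In `g`, the vertex c is adjacent to everything, while no vertex of X is adjacent to a vertex of Z, just as the two end vertices of P3 are not adjacent.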
Six classes of graphs
A puzzle board
The six kinds of pieces
A puzzle and a solution
From graphs to puzzles
• Given P_{X,Y,Z} we build a puzzle:
  • Vertex → piece
  • Color → column
  • X-clique → upper row
  • Y-clique → both rows
  • Z-clique → lower row
  • Precoloring → some pieces are on the board already
• Theorem: Aligned 1-2-coloring extension for clique substitutions of P3 and puzzle solving are equivalent under linear-time reductions
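The mapping can be sketched as a lookup from a vertex's clique and weight to the shape of its piece. The representation below (a tuple of rows plus a width in columns) is an assumption made for illustration: since a color corresponds to a column, a weight-one vertex is taken to be one column wide and a weight-two vertex two columns wide.

```python
# Which rows of the board a piece occupies, by the clique its vertex belongs to.
ROWS = {
    "X": ("upper",),
    "Y": ("upper", "lower"),
    "Z": ("lower",),
}

def piece_for(clique, weight):
    """Shape of the piece for a vertex: (rows occupied, width in columns)."""
    if weight not in (1, 2):
        raise ValueError("vertices have weight one or two")
    return ROWS[clique], weight

def piece_size(clique, weight):
    """Number of board cells the piece covers."""
    rows, width = piece_for(clique, weight)
    return len(rows) * width

all_kinds = {piece_for(c, w) for c in "XYZ" for w in (1, 2)}
```

Three cliques times two weights give six distinct shapes in `all_kinds`, one for each kind of piece.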
A rule, a pattern, and a mismatch
Example program
Counterexample 1
Lesson: use a size-2 piece before two size-1 pieces
Counterexample 2
Lesson: statements 7-10 must come before statements 11-14
Counterexample 3
Lesson: statement 15 must come before statements 11-14
Counterexample 4
Lesson: the order of statements 11-14 is crucial
From graph coloring to puzzle solving
• Theorem: A puzzle is solvable if and only if our program succeeds on the puzzle
• Our puzzle solving program runs in linear time
Spilling
• Visit each puzzle once
• If the puzzle is not solvable, then remove some pieces and try to solve again
• Each time we remove a piece, we also remove all other pieces that stem from the same variable in the original program
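The spilling loop can be sketched as follows. Everything concrete here is a stand-in: `is_solvable` and `choose_victim` are hypothetical callbacks supplied by the caller, the toy solvability test is a simple capacity check rather than the puzzle solver from the talk, and the sketch assumes that an empty puzzle is always solvable.

```python
def allocate_with_spilling(puzzles, is_solvable, choose_victim):
    """Visit each puzzle once; while it is unsolvable, spill a variable.

    puzzles: list of puzzles, each a list of (variable, piece) pairs.
    Returns the spilled variables and the puzzles without their pieces.
    """
    spilled = set()
    for puzzle in puzzles:
        while True:
            # Pieces of spilled variables are gone from every puzzle.
            current = [(var, p) for var, p in puzzle if var not in spilled]
            if is_solvable(current):
                break
            spilled.add(choose_victim(current))
    remaining = [[(var, p) for var, p in puzzle if var not in spilled]
                 for puzzle in puzzles]
    return spilled, remaining

# Toy stand-ins: a piece is just its size, and a puzzle "fits" if the
# total size is at most 4. The victim is the variable with the largest piece.
def fits(pieces):
    return sum(size for _, size in pieces) <= 4

def largest(pieces):
    return max(pieces, key=lambda vp: vp[1])[0]

puzzles = [
    [("a", 1), ("b", 2)],
    [("a", 1), ("c", 2), ("d", 2)],
]
spilled, remaining = allocate_with_spilling(puzzles, fits, largest)
```

In this toy run only the second puzzle is too full, so one variable (c) is spilled and every piece that stems from it disappears from all puzzles.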
Benchmark characteristics
| Benchmark | LoC | Asm | btcode |
|---|---|---|---|
| ASCI Purple:smg2000 | 74,875 | 73,039 | 303,037 |
| SPEC2000:175.vpr | 70,253 | 52,917 | 173,475 |
| SPEC2000:188.ammp | 54,335 | 35,567 | 149,245 |
| MallocBench:expresso | 52,853 | 45,041 | 250,770 |
| SPEC2000:197.parser | 49,388 | 32,849 | 163,025 |
| SPEC2000:164.gzip | 39,157 | 8,130 | 46,188 |
| (six more) | … | … | … |
| Total | 409,540 | 286,900 | 1,345,898 |
Puzzles and the number of times we solve them
| Benchmark | Puzzles | Avg | Max | Once |
|---|---|---|---|---|
| ASCI Purple:smg2000 | 52,791 | 1.33 | 8 | 33,822 |
| SPEC2000:175.vpr | 47,276 | 1.10 | 10 | 45,575 |
| SPEC2000:188.ammp | 33,428 | 1.09 | 9 | 28,515 |
| MallocBench:expresso | 43,791 | 1.06 | 3 | 38,925 |
| SPEC2000:197.parser | 30,868 | 1.05 | 4 | 28,992 |
| SPEC2000:164.gzip | 7,840 | 1.06 | 3 | 6,718 |
| (six more) | … | … | … | … |
| Total | 251,428 | 1.13 | 10 | 213,411 |
Execution time of the generated code vs. gcc
Conclusion
• If you want to do register allocation for the Pentium, your problem is to solve a collection of puzzles
• Fast compilation time, competitive code quality
• Future work: compare with run times of code generated by Smith-Ramsey-Holloway (PLDI 2004)
• Future work: