Carnegie Mellon

Optimal Scheduling “in a lifetime” for the SPIRAL compiler
  Frédéric de Mesmay
  Theodoros Strigkos
  based on Y. Voronenko’s idea

Scientific Code Generation Approaches
  Produces Blocked, Parallelized, SIMDized, Scheduled code
  Backend: C, asm, Verilog

The SPIRAL compiler
  The IR is simple
    SSA form
    no function calls
    no pointers
    type is uniform
    no control flow (this is dealt with by others)
  Huge simple DAG
  SPIRAL compiles libraries
    Plenty of time for compilation…

Scheduling & Register Allocation
  Traditionally done in two different passes
  The first pass imposes constraints on the second
  Allocate registers first
    introduces anti- and output dependencies
    used when targeting OoO architectures – few ISA registers, hardware scheduler
  Schedule first
    may run out of registers
    used when targeting in-order processors – plenty of registers, limited hardware scheduling

Scheduling & Register Allocation
  Optimality is achieved with Integer Linear Programming (ILP)
    Schedule code and allocate registers together
    Model processor with constraints
    Solve the equations
  Optimal Scheduling = NP-complete
  Optimal Register Allocation = NP-complete
  Simultaneous Allocation and Scheduling = Problematic

Node Packing Graph
  Example code
    a <- x + 3
    b <- x + 5
    c <- a + b
  Scheduling windows
    op   ASAP   ALAP
    a    1      2
    b    1      2
    c    2      3
  [Figure: node packing graph placing a, b, c on Adder #1 and Adder #2 over Cycle 1, Cycle 2, Cycle 3]
  a_{1,1} + a_{1,2} + a_{2,1} + a_{2,2} = 1, with a_{i,j} ∈ {0,1}
  Operation Assignment: 3 equ.
  FU Constraint: 4 equ.
  Precedence: 28 equ.
  + Number of registers
  + Spilling
  + Objective function
  + …

LP instead of ILP
  ILP can hardly schedule DFT[4]…
  Every ILP problem can be reformulated so that an LP solver solves it and still returns an integer solution!
  LP is polynomial!
  Where is the catch?
    Characterize integral facets: “good” inequalities
    Have an objective function whose optimum lies on those facets

Current Implementation
  Simple processor model for functional unit allocation
  DFT(8) doable (1 hour)
  Poor objective function
  Non-integer solutions, which we fix
    Lose optimality (keep good solution?)

What’s Left?
  Get rid of non-integer solutions if possible
    Get a better objective function
    If not, determine if our “fixing step” is costly
  Add register allocation equations
  Deal with necessary spilling

Thank You!
  Questions?
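
As a rough illustration of the three-operation example on the Node Packing Graph slide (a <- x + 3, b <- x + 5, c <- a + b on two adders), the sketch below recomputes the ASAP/ALAP windows, lists one 0/1 variable per (operation, cycle, adder), and counts the operation-assignment and FU constraints. Everything here is an assumption made for illustration rather than the SPIRAL implementation: the names (`DEPS`, `WINDOW`, `best_schedule`) are hypothetical, every add is assumed to take one cycle, and a brute-force enumeration stands in for the ILP/LP solver. Precedence is checked directly on candidate schedules, so the slide's 28 precedence equations are not reproduced.

```python
from itertools import product

# Dependence DAG of the slide's example: c <- a + b, with a and b independent.
DEPS = {"a": [], "b": [], "c": ["a", "b"]}
LATENCY = 1      # assumption: every add completes in one cycle
HORIZON = 3      # cycles 1..3, as on the slide
ADDERS = (1, 2)  # two functional units


def asap(op):
    """Earliest cycle in which op can issue."""
    return max((asap(p) + LATENCY for p in DEPS[op]), default=1)


def alap(op):
    """Latest cycle in which op can issue without exceeding HORIZON."""
    succs = [s for s, preds in DEPS.items() if op in preds]
    return min((alap(s) - LATENCY for s in succs), default=HORIZON)


# Scheduling window of each operation: cycles asap(op)..alap(op).
WINDOW = {op: range(asap(op), alap(op) + 1) for op in DEPS}

# One 0/1 decision variable per (operation, cycle, adder) inside the window.
VARIABLES = [(op, cyc, fu) for op in DEPS for cyc in WINDOW[op] for fu in ADDERS]


def count_constraints():
    """Return (operation-assignment count, FU-constraint count)."""
    assignment = len(DEPS)  # each operation is scheduled exactly once
    # An FU constraint is only needed where at least two operations compete
    # for the same adder in the same cycle.
    fu = sum(
        1
        for cyc in range(1, HORIZON + 1)
        for _ in ADDERS
        if sum(cyc in WINDOW[op] for op in DEPS) > 1
    )
    return assignment, fu


def feasible(slots):
    """slots maps op -> (cycle, adder); check FU exclusivity and precedence."""
    if len(set(slots.values())) < len(slots):
        return False  # two operations share an adder in the same cycle
    return all(
        slots[p][0] + LATENCY <= slots[op][0] for op in DEPS for p in DEPS[op]
    )


def best_schedule():
    """Enumerate all 0/1 assignments and keep the shortest feasible schedule."""
    ops = list(DEPS)
    choices = [[(c, f) for c in WINDOW[op] for f in ADDERS] for op in ops]
    best = None
    for combo in product(*choices):
        slots = dict(zip(ops, combo))
        if not feasible(slots):
            continue
        length = max(cycle for cycle, _ in slots.values())
        if best is None or length < best[0]:
            best = (length, slots)
    return best


if __name__ == "__main__":
    print({op: (asap(op), alap(op)) for op in DEPS})
    print(len(VARIABLES), count_constraints())
    print(best_schedule())
```

Exhaustive enumeration is only workable for a toy DAG like this one; it is used here purely to make the variables and constraints concrete, not as a substitute for the ILP or LP formulation discussed in the slides.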