demesmay-strigkos

Carnegie Mellon

Optimal Scheduling “in a lifetime” for the SPIRAL compiler
Frédéric de Mesmay, Theodoros Strigkos
Based on Y. Voronenko’s idea

Scientific Code Generation Approaches
- Produces blocked, parallelized, SIMDized, scheduled code
- Backends: C, asm, Verilog

The SPIRAL compiler
- The IR is simple:
  - SSA form
  - no function calls
  - no pointers
  - uniform type
  - no control flow (this is dealt with by other passes)
- The result is a huge, simple DAG
- SPIRAL compiles libraries, so there is plenty of time for compilation…

Scheduling & Register Allocation
- Traditionally done in two different passes
- The first pass imposes constraints on the second
- Allocate registers first:
  - introduces anti- and output dependencies
  - used when targeting OoO architectures (few ISA registers, hardware scheduler)
- Schedule first:
  - may run out of registers
  - used when targeting in-order processors (plenty of registers, limited hardware scheduling)

Scheduling & Register Allocation
- Optimality is achieved with Integer Linear Programming (ILP):
  - schedule code and allocate registers together
  - model the processor with constraints
  - solve the equations
- Optimal scheduling = NP-complete
- Optimal register allocation = NP-complete
- Simultaneous allocation and scheduling = problematic

Node Packing
Example code:
  a <- x + 3
  b <- x + 5
  c <- a + b

Scheduling windows:
      ASAP  ALAP
  a   1     2
  b   1     2
  c   2     3

[Figure: node packing graph showing the candidate placements of a, b, and c on Adder #1 and Adder #2 over Cycles 1 to 3]

Operation assignment, here for b:
  $a^{b}_{1,1} + a^{b}_{1,2} + a^{b}_{2,1} + a^{b}_{2,2} = 1$, with $a^{b}_{i,j} \in \{0,1\}$

Equation count for this example:
- Operation assignment: 3 equations
- FU constraint: 4 equations
- Precedence: 28 equations
- Plus number of registers, spilling, objective function, …

LP instead of ILP
- ILP can hardly schedule DFT[4]…
- Every ILP problem can be reformulated so that an LP solver can solve it and still return an integer solution!
- LP is polynomial! Where is the catch?
  - Characterize the integral facets: the “good” inequalities
  - Have an objective function whose optimum lies on those facets

Current Implementation
- Simple processor model for functional unit allocation
- DFT(8) is doable (1 hour)
- Poor objective function:
  - non-integer solutions, which we fix
  - we lose optimality (but keep a good solution?)

What’s Left?
- Get rid of non-integer solutions if possible:
  - get a better objective function
  - if not, determine whether our “fixing step” is costly
- Add register allocation equations
- Deal with necessary spilling

Thank You!
Questions?
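
The node packing formulation from the slides can be made concrete with a small sketch. The code below takes the three-operation example (a, b, c, with their ASAP/ALAP windows and two adders), creates one 0/1 variable per (operation, cycle, adder) slot, and checks the operation assignment, functional unit, and precedence constraints. Several things here are assumptions rather than the authors' method: unit latency for the adders, a single inequality per dependency edge (the slides count 28 precedence equations, so their encoding differs), the issue cycle of c as the objective, and brute-force enumeration standing in for an ILP or LP solver. All function and variable names are hypothetical.

```python
from itertools import product

# Scheduling windows (ASAP, ALAP) taken from the table on the slide.
WINDOWS = {"a": (1, 2), "b": (1, 2), "c": (2, 3)}
ADDERS = (1, 2)
# c <- a + b, so c must issue after a and b (unit latency is an assumption).
DEPENDENCIES = [("a", "c"), ("b", "c")]

# One 0/1 variable per (operation, cycle, adder) slot inside the window.
VARIABLES = [
    (op, cycle, adder)
    for op, (asap, alap) in WINDOWS.items()
    for cycle in range(asap, alap + 1)
    for adder in ADDERS
]


def issue_cycle(x, op):
    """Issue cycle of `op`, written as a linear expression in the 0/1 variables."""
    return sum(cycle * x[(o, cycle, adder)] for o, cycle, adder in VARIABLES if o == op)


def is_feasible(x):
    # Operation assignment: every operation is placed exactly once.
    for op in WINDOWS:
        if sum(x[v] for v in VARIABLES if v[0] == op) != 1:
            return False
    # Functional unit constraint: at most one operation per (cycle, adder).
    for slot in {(cycle, adder) for _, cycle, adder in VARIABLES}:
        if sum(x[v] for v in VARIABLES if v[1:] == slot) > 1:
            return False
    # Precedence: a consumer issues at least one cycle after its producer.
    for producer, consumer in DEPENDENCIES:
        if issue_cycle(x, consumer) - issue_cycle(x, producer) < 1:
            return False
    return True


def solve():
    """Enumerate all 0/1 assignments; return the feasible ones and a best one."""
    feasible = []
    for bits in product((0, 1), repeat=len(VARIABLES)):
        x = dict(zip(VARIABLES, bits))
        if is_feasible(x):
            feasible.append(x)
    # Assumed objective: finish the last operation (c) as early as possible.
    best = min(feasible, key=lambda x: issue_cycle(x, "c"))
    return feasible, best


def placement(x):
    """Map each operation to its chosen (cycle, adder)."""
    return {op: (cycle, adder) for (op, cycle, adder), bit in x.items() if bit}


if __name__ == "__main__":
    feasible, best = solve()
    print(len(feasible), "feasible schedules")
    print("one optimal placement:", placement(best))
```

Under these assumptions the enumeration finds 28 feasible placements, and the shortest ones issue a and b in cycle 1 on different adders and c in cycle 2. Enumeration only works because the example has 12 variables; the deck's point is that realistic DAGs need an ILP solver, or the LP reformulation it proposes.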
