An Overview of the SUIF2 System

Monica Lam
Stanford University
http://suif.stanford.edu/

The SUIF System

[Diagram: the SUIF compiler pipeline]
• Front ends: PGI Fortran, EDG C, EDG C++, Java
• Representations: OSUIF, SUIF2, MachSUIF
• High-level passes: Interprocedural Analysis, Parallelization, Locality Opt
• Back-end passes: Scalar opt, Inst. Scheduling, Register Allocation
• Targets: C, Alpha, x86
* C++ OSUIF to SUIF is incomplete

Overview of SUIF Components (I)

Basic Infrastructure
• Extensible IR and utilities
• Hoof: SUIF object specification language
• Standard IR
• Modular compiler system
• Pass submodule
• Data structures (e.g. hash tables)
• FE: PGI Fortran, EDG C/C++, Java
• SUIF1 / SUIF2 translators, S2c
• Interactive compilation: suifdriver
• Statement dismantlers
• SUIF IR consistency checkers
• Suifbrowser, TCL visual shell
• Linker

Object-oriented Infrastructure
• OSUIF representation
• Java
• OSUIF -> SUIF lowering
• Object layout and method dispatch

Backend Infrastructure
• MachSUIF program representation
• Optimization framework
• Scalar optimizations
  - common subexpression elimination
  - dead code elimination
  - peephole optimizations
• Graph coloring register allocation
• Alpha and x86 backends

Overview of SUIF Components (II)

High-Level Analysis Infrastructure
• Graphs, SCCs
• Iterated dominance frontier
• Dot graph output
• Region framework
• Interprocedural analysis framework
• Presburger arithmetic (omega)
• Farkas lemma
• Gaussian elimination package
• Intraprocedural analyses
  - copy propagation
  - dead code elimination
• Steensgaard’s alias analysis
• Call graph
• Control flow graphs
• Interprocedural region-based analyses:
  - array dependence & privatization
  - scalar reduction & privatization
• Interprocedural parallelization
• Affine partitioning for parallelism & locality
  - unifies: unimodular transform (interchange, reversal, skewing), fusion, fission, statement reindexing and scaling
• Blocking for nonperfectly nested loops

Motivation for Extensible IR

SUIF1 design
• A fixed set of flat C++ classes
• All code must change if we add new IR nodes, e.g.
  printing objects, reading and writing to file
• Higher level semantic objects
  - OSUIF (for object-oriented programming)
  - Verilog event-driven control flow semantics
  - Saturated arithmetic used in multimedia instruction sets
• Program analysis concepts
  - Phi nodes in SSA analysis
  - Synchronization operations introduced by parallelization
• Results of analysis

Concept I: Reflective IR
• Metaclass: captures representation of IR nodes in a data structure
• Enables common generic routines to implement
  - Persistence: reading/writing to disk
  - Cloning: routine that copies the data structure
  - Printing: a generic routine to print out the information
  - Walkers, iterators, visitors

Concept II: Object hierarchy & virtual aggregate fields

[Diagram: class hierarchy]
ExecutionObject
  Statement (get_child_statements)
    IfStatement (get_then_part, get_else_part)
    WhileStatement (get_body)

• Abstract names to refer to fundamental concepts in subclasses, e.g. Statement::get_child_statements at Statement level
  - IfStatement: get_then_part and get_else_part, or
  - WhileStatement: get_body
• Allows a pass to run on representation with extended semantics without recompilation, e.g. reuse a dead code elimination on SPMD code without recompilation

Concept III: Multiple Representations for High-Level Analyses
• Multiple representations for different semantic levels, e.g.
  FOR loops versus basic blocks in a control flow graph
  => Alternative representations
• Mixture of high-level and low-level constructs
• Dismantlers lower the representation

Concept IV: High-level object specification
• Insulates user from details

[Diagram: Object Definition (.hoof) -> SUIF Macro Generator (a general grammar-based tool) -> Interface for user (.h) and Implementation in Meta-Class System (.cpp); Meta-Class System: reading & writing to file in machine-independent format]

• Easy for the programmer
• Easy for the implementor to develop the system

Example of a Hoof Definition

hoof:

    concrete New {
      int x;
    }

C++:

    class New : public SuifObject {
    public:
      int get_x();
      void set_x(int the_value);
      ~New();
      void print(…);
      static const Lstring get_class_name();
      …
    }

• Uniform data access functions (get_ & set_)
• Automatic generation of meta class information etc.

Examples of Suif Nodes

    abstract Statement : ExecutionObject {
      virtual list<Statement* owner> child_statements;
      ...
    }

    concrete IfStatement : Statement {
      Expression * condition in source_ops;
      Statement * owner then_part in child_statements;
      Statement * owner else_part in child_statements;
    }

Motivation for a Modular Compiler System

SUIF1: All passes read and write suif files
• More modular and supportive of experimentation, but it is slow
• Data between passes are written out as annotations
• Annotations must be expressed as strings when written out
• Requires manual pickling of annotations
• Nontrivial effort to support any interactivity

SUIF2 Concept I: A Modular Compiler Architecture

[Diagram: architecture layers]
• Executable: suifdriver
• Modules:
  - Passes: analyses, optimizations
  - Kernel: suifkernel, iokernel
  - IR: suifnodes, basicnodes

Components
• Kernel: provides all basic functionality
  - iokernel: implements I/O
  - suifkernel: hides iokernel and provides modules support, cloning, command line parsing, list of factories, etc.
• Modules
  - passes: provides a pass framework
  - IR: basic program representations
• Suifdriver: provides execution control over modules and passes

Concept II: Dynamic Registration & Interactive Compilation
• Each module (a C++ class) has a unique module_name
• A DLL (dynamically linked library) has one or more modules
• A DLL registers its modules dynamically (init_<dllname>)

An interactive SUIF compiler

    > suifdriver
    suif> require basicnodes suifnodes
    suif> require mylibrary
    suif> load test.suif
    suif> mylibrary_pass1
    suif> print test.out
    suif> save test.tsuif

The Suifdriver:
• has a set of pre-registered modules (require, load, print, save)
• imports libraries dynamically, which can register new modules (new commands)

System can be used for demand-driven program analysis.

Memory/Memory vs File/File Passes

File/File: the compiler is a series of stand-alone programs

    Suif-file1 -> driver+module1 -> Suif-file2 -> driver+module2 -> Suif-file3 -> driver+module3 -> Suif-file4

Memory/Memory: a driver that imports & applies modules to the program in memory

    Suif-file1 -> Suifdriver (imports/executes module1, module2, module3) -> Suif-file4

Concept III: Easy to write a new analysis: subclass of a pass module

[Diagram: Executable: suifdriver; Passes: analyses, optimizations; Kernel: suifkernel, iokernel; IR: suifnodes, basicnodes]

Example Pass: Constant Folding of Procedures

    class mypass : public Pass {
    public:
      mypass(SuifEnv *env, const Lstring &name) : Pass(env, name) {}
      virtual ~mypass() {}
      Module *clone() const { return (Module*)this; }

      // Fold constants in the body of each procedure definition.
      void do_procedure_definition(ProcedureDefinition *proc_def) {
        if (is_kind_of<Statement>(proc_def->get_body())) {
          fold_statements(proc_def->get_body());
        }
      }
    };

    // Library entry point (init_<dllname>): registers the pass as a module.
    extern "C" void init_mypass(SuifEnv *suif_env) {
      suif_env->get_module_subsystem()->register_module(
          new mypass(suif_env, "mypass"));
    }

Research Infrastructure: Support at 3 Levels

I. Compose compiler with existing passes
   • Dynamic composition of different passes
II. Develop new passes
   • User concentrates on algorithmic issues
   • Infrastructure provides common functionalities
   • Write code once, can run on SUIF program in memory or a file
III. Develop new IR
   • High-level specification of the IR nodes
   • Old code works with new IR without recompilation

Status
• Base infrastructure is solid
• System getting populated with interesting analyses
• End of NCI project: PGI will no longer provide support
• Relies on community effort

Stanford Team
Gerald Aigner, Gerald Cheong, Amer Diwan, Andrew Fikes, David Heine, Monica Lam, Amy Lim, Vladimir Livshits, Virendra Mehta, Brian Murphy, Costa Sapuntzakis, Christopher Unkel, Hansel Wan, Christopher Wilson
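
Illustrative sketch: dynamic registration and pass composition

The dynamic registration and pass composition ideas from the slides (modules with unique names, a library entry point that registers them, and a driver that applies them to a program held in memory) can be illustrated without SUIF itself. The following is a minimal sketch, not SUIF2 code: `Program`, `Module`, `ModuleRegistry`, `RecordingPass`, and `init_mylibrary` are all hypothetical names invented for this example, and the real `suifdriver` and module subsystem may work quite differently.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a program kept in memory between passes.
struct Program {
    std::vector<std::string> log;  // names of the modules that ran, in order
};

// Hypothetical module interface: each module carries a unique name.
class Module {
public:
    explicit Module(std::string name) : name_(std::move(name)) {}
    virtual ~Module() = default;
    const std::string& name() const { return name_; }
    virtual void execute(Program& program) = 0;
private:
    std::string name_;
};

// Hypothetical registry playing the role of a driver's module table.
class ModuleRegistry {
public:
    // Registers a module under its own name, replacing any earlier entry.
    void register_module(std::unique_ptr<Module> module) {
        assert(module != nullptr);
        std::string key = module->name();
        modules_[key] = std::move(module);
    }

    bool has(const std::string& name) const {
        return modules_.count(name) != 0;
    }

    // Applies each whitespace-separated command to the same in-memory
    // program. Stops and returns false at the first unknown command.
    bool run_script(const std::string& script, Program& program) {
        std::istringstream in(script);
        std::string command;
        while (in >> command) {
            auto it = modules_.find(command);
            if (it == modules_.end()) return false;
            it->second->execute(program);
        }
        return true;
    }

private:
    std::map<std::string, std::unique_ptr<Module>> modules_;
};

// A trivial pass that only records that it ran.
class RecordingPass : public Module {
public:
    using Module::Module;
    void execute(Program& program) override { program.log.push_back(name()); }
};

// Analogue of an init_<dllname> entry point: the library registers its
// modules, which then become available as new driver commands.
void init_mylibrary(ModuleRegistry& registry) {
    registry.register_module(std::make_unique<RecordingPass>("mylibrary_pass1"));
    registry.register_module(std::make_unique<RecordingPass>("mylibrary_pass2"));
}
```

After `init_mylibrary(registry)` has run, a call such as `registry.run_script("mylibrary_pass1 mylibrary_pass2", program)` applies both passes to the same `Program` object without writing an intermediate file, which is the memory/memory style of composition the slides contrast with file/file passes.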