Evolving Legacy Software with a Generic Program Transformation Framework Using Meta-Programming and Domain-Specific Languages
Dissertation Defense
Joshua Yue
University of Alabama
May 13, 2015

Overview of Presentation

High Performance Computing (HPC)
• Provides solutions to problems that demand significant computational power
  – e.g., weather prediction systems, Geographic Information Systems (GIS)
• A vast body of legacy code in HPC
  – Fortran and C code

"Software that is being used must be continually adapted or it becomes progressively less satisfactory." -- Manny M. Lehman

Two Major Categories of Software Maintenance and Evolution
• Parallelization
  – Parallelizing sequential code with parallel programming models, e.g., MPI, OpenMP, and CUDA
• Utility functions
  – Logging
  – Profiling
  – Checkpointing

Challenges of Parallel Programming
• Manually parallelizing sequential code is tedious and error-prone
  – Programming models necessitate invasive reengineering of existing programs to insert parallel code
• Parallel applications are difficult to evolve
  – Core logic is often tangled with the code that accomplishes parallelization
• Separate management of sequential and parallel code

Challenges in Implementing Utility Functions
• Utility functions
  – Are cross-cutting concerns
  – Represent a considerable amount of the total LOC
  – Must be implemented in a modularized manner without degrading the overall performance of the program
• Example: timer implementation in the NAS parallel benchmarks

The Primary Research Objective
• To facilitate the process of software development and maintenance using meta-programming and Domain-Specific Languages (DSLs)
  – Separate sequential and parallel concerns
  – Separate cross-cutting concerns
• Main contributions
  – OpenFortran/OpenC
  – SPOT
  – OpenFoo
[1] Yue, S. (2013). Program transformation techniques applied to languages used in high performance computing. In Proceedings of the 2013 companion publication for conference on Systems, programming, & applications: software for humanity (pp. 49-52).

Domain-Specific Languages (DSLs)
• Offer expressive power focused on a particular problem domain
  – In contrast to general-purpose programming languages (GPLs)
• Provide appropriate abstractions and notations
  – e.g., HTML, SQL, and LaTeX

Meta-Object Protocol (MOP)
• Meta-programming
  – A technique for writing programs that generate or manipulate other programs
• MOP
  – A powerful tool for extending a programming language by providing meta-programming capability
  – Organizes a meta-level architecture
  – Provides interfaces to access the internal implementation of a program
• Examples: CLOS, OpenC++, and OpenJava

Meta-Object Protocol (MOP)
• Meta-level program and base-level program
• Meta-object and meta-class
[Diagram labels: Meta-Level Program, Class, Base-Level Program, Function]

OpenFortran
• A framework for building arbitrary source-to-source program transformation libraries and tools for Fortran programs [2]
• Provides control over compilation rather than over run-time execution
• Transformation engine
  – ROSE: Open Source Compiler Infrastructure
[2] Yue, S., & Gray, J. (2013). OpenFortran: Extending Fortran with Meta-programming. In the companion publication for The International Conference for High Performance Computing, Networking, Storage, and Analysis, SC2013.
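The split between a meta-level program and a base-level program can be sketched in a few lines of C++. This is only an illustration of the idea, not the OpenFortran API: the names `Program`, `MetaObject`, `LoggingMetaObject`, `translate`, and `applyMetaObject` are hypothetical, and the base-level program is modeled as a plain list of statement strings rather than a ROSE syntax tree. The meta-object decides, before compilation, how each base-level statement is rewritten; here it adds a logging statement ahead of every call.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a base-level program: one statement per entry.
using Program = std::vector<std::string>;

// Hypothetical meta-object (not the OpenFortran API): it decides how each
// base-level statement is translated before compilation.
class MetaObject {
public:
    virtual ~MetaObject() = default;

    // Default translation: emit the statement unchanged.
    virtual std::vector<std::string> translate(const std::string& stmt) const {
        return {stmt};
    }
};

// Meta-level code for a logging concern: every "call" statement is preceded
// by a print statement naming the callee.
class LoggingMetaObject : public MetaObject {
public:
    std::vector<std::string> translate(const std::string& stmt) const override {
        const std::string keyword = "call ";
        if (stmt.compare(0, keyword.size(), keyword) != 0) {
            return {stmt};  // not a call statement, leave it alone
        }
        // Callee name: text between "call " and the first '(' (or end of line).
        const std::size_t open = stmt.find('(', keyword.size());
        const std::string callee = stmt.substr(
            keyword.size(),
            open == std::string::npos ? std::string::npos
                                      : open - keyword.size());
        return {"print *, 'entering " + callee + "'", stmt};
    }
};

// Source-to-source step: run the meta-object over every base-level statement
// and collect the translated statements in order.
Program applyMetaObject(const Program& base, const MetaObject& meta) {
    Program transformed;
    for (const std::string& stmt : base) {
        for (const std::string& out : meta.translate(stmt)) {
            transformed.push_back(out);
        }
    }
    return transformed;
}
```

Calling `applyMetaObject(base, LoggingMetaObject{})` on a program containing `call solve(a, b)` (a made-up statement) yields the same program with a `print` statement inserted before that call. The base-level source itself never mentions logging, which is the separation of concerns a MOP aims for.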
The Transformation Process Using OpenFortran
[Diagram labels: Meta-level Transformation Code, OpenFortran, Extended Fortran Code, Common Fortran Parser, Base-level Fortran Code, ROSE]

Built-in Meta-Classes in OpenFortran
Meta-Class      Transformation Scope
MetaFunction    Program, function, subroutine, subprogram
MetaModule      Module
MetaClass       Derived type
MetaGlobal      Project

Member Functions in MetaObject
• ofExtendDefinition()
  – Transforms the definition of language constructs
• ofExtendFunctionCall(string funName)
  – Manipulates a function invocation where it is called
• ofExtendVariableRead(string varName)
  – Intercepts and translates the behavior of a variable read
• ofExtendVariableWrite(string varName)
  – Intercepts and translates the behavior of a variable write

Timer Implementation in NAS
• NAS: NASA Advanced Supercomputing
  – NAS parallel benchmarks (NPB-3.2)
• Timer
  – Measures the execution time between any two points in the program
• Function calls
  – timer_start
  – timer_stop
  – timer_read

A Code Snippet in EP (Embarrassingly Parallel)
    program EMBAR
      call mpi_barrier(MPI_COMM_WORLD, ierr)
      call vranlc(2 * nk, t1, a, x)
      ……
    end program

Desired Transformed Code
    call timer_start(1)
    call mpi_barrier(MPI_COMM_WORLD, ierr)
    call timer_stop(1)
    print *, 'mpi_barrier', timer_read(1)
    call timer_start(2)
    call vranlc(2 * nk, t1, a, x)
    call timer_stop(2)
    print *, 'vranlc', timer_read(2)

Timer Implementation in NAS
    class TimerEPMetaClass : public MetaFunction {
    public:
        TimerEPMetaClass(string name);
        virtual bool ofExtendDefinition();
    };

    bool TimerEPMetaClass::ofExtendDefinition() {
        int timerId = 1;
        // Wrap every call in funCallList with its own numbered timer.
        for (int i = 0; i < funCallList.size(); i++, timerId++) {
            SgStatement* targetStmt = getStmtsContainFunctionCall(funCallList[i]);
            // Start the timer just before the statement containing the call.
            insertStatementBefore(targetStmt,
                buildFunctionCallStmt("timer_start",
                    buildParaList(to_string(timerId))));
            // Stop the timer just after it.
            SgStatement* targetStmtStop = insertStatementAfter(targetStmt,
                buildFunctionCallStmt("timer_stop",
                    buildParaList(to_string(timerId))));
            // Report the callee name and the elapsed time.
            insertStatementAfter(targetStmtStop,
                buildFunctionCallStmt("print",
                    buildParaList("*", funCallList[i]->getName(),
                        buildFunctionCallStmt("timer_read",
                            buildParaList(to_string(timerId))))));
        }
        return true;
    }

Transformed Code with Timer Implementation
    call timer_start(1)
    call mpi_barrier(MPI_COMM_WORLD, ierr)
    call timer_stop(1)
    print *, 'mpi_barrier', timer_read(1)
    call timer_start(2)
    call vranlc(2 * nk, t1, a, x)
    call timer_stop(2)
    print *, 'vranlc', timer_read(2)

Apply Transformations to Base Code
    program EMBAR
      MetaFunction TimerEPMetaClass
      ……
      call mpi_barrier(MPI_COMM_WORLD, ierr)
      call vranlc(2 * nk, t1, a, x)
      ……
    end program

OpenC
[Diagram labels: Meta-level Transformation Code, OpenC MOP, Extended C Code, Common C Parser, Base-level C Code, ROSE]
[4] Yue, S., & Gray, J. (2015). Extending C with Computational Reflection. 24th International Conference on Software Engineering and Data Engineering, SEDE2015. (In preparation)

A Short Summary
• Benefits of MOPs
  – Handle cross-cutting concerns
  – Less invasive
  – Convenient to apply
• Challenges of MOPs
  – The steep learning curve
  – The difficulty of understanding the complex details of meta-programming and program transformation

SPOT: A DSL for Specifying PrOgram Transformations
• Design goal
  – To provide language constructs that allow developers to manipulate programs directly while hiding the accidental complexities of using a MOP [3]
• Design decisions
  – Use high-level programming concepts (e.g., functions, variables, statements, and classes) as the language constructs of SPOT
  – Facilitate systematic transformations, such as insert, delete, move, and update
[3] Yue, S., & Gray, J. (2014). SPOT: A DSL for Extending FORTRAN Programs With Meta-Programming. Advances in Software Engineering, Volume 2014, pp. 1-23.

SPOT Language Constructs
• Language constructs
  – File, Function, Statement, FunctionCall, VariableAccess
• Location and scope patterns
  – Within(Construct con){Patterns/Actions}
  – After/Before(Construct con){Patterns/Actions}
  – FORALL(Construct con){Patterns/Actions}
  – WildCard: *, %varName, $varName
• Actions
  – AddCallStatement(<loc>, <targetStmt>, <funName>, <parameterList>)
  – RenameVariable(<oldName>, <newName>)
  – Function <fun> = GetFunction(<name>)

Overview of the Transformation Process with SPOT
[Diagram labels: ANTLR + StringTemplate, Transformation Specified in SPOT, Code Generator, Meta-level Transformation Code, OpenFortran, Transformed Fortran Code, Original Fortran Code, ROSE]

Code Generator Design Structure
[Diagram labels: Code Generator, Transformation Specified in SPOT, Parser generated with ANTLR, StringTemplate, SPOT Grammar, Fortran Grammar, Template Store, Meta-level Transformation Code]

Case Studies
• Supporting aspect-oriented programming
  – A profiling tool
• Separating sequential and parallel concerns
  – An OpenMP library
• Supporting extension for new application domains
  – A checkpointing tool

Supporting AOP
• Profiling
  – A technique to gain an overview of system performance
• A cross-cutting concern
  – Scattered across multiple modules

A Program to Be Profiled
    PROGRAM exampleProg
      USE profiling_mod
      IMPLICIT NONE
      REAL a, b, c, result
      REAL Calc
      CALL profiling("exampleProg:Input")
      CALL Input(a, b, c)
      CALL profiling("exampleProg:Input")
      CALL profiling("exampleProg:Calc")
      result = Calc(a, b, c)
      CALL profiling("exampleProg:Calc")
    END

SPOT code implementing the profiling tool
    Transformer Profiling {
      Within(File *) {
        FORALL(Function %fun) {
          AddUseModuleStatement(profiling_mod);
          FORALL(FunctionCall %funCall) {
            AddCallStatement(Before, $funCall.statement, profiling, $fun.funName+":"+$funCall.funName);
            AddCallStatement(After, $funCall.statement, profiling, $fun.funName+":"+$funCall.funName);
          }
        }
      }
    }

Separating Sequential and Parallel Concerns
• OpenMP
  – A parallel model for developing multithreaded programs in a shared-memory setting
  – Supports C, C++, and Fortran
• An OpenMP library
  – Assists in instrumenting sequential code with OpenMP directives to parallelize it
  – Dijkstra's minimum graph distance algorithm

Actions for Inserting OpenMP Directives
• OmpUsePair(<directive>, <startStmt>, <endStmt>, <clauses>)
  – e.g., PARALLEL
• OmpUseSingleBefore/After(<directive>, <targetStmt>, <clauses>)
  – e.g., BARRIER
• OmpGetEnVariable(<name>, <var>)
  – e.g., OMP_GET_NUM_THREADS

Supporting Extension for New Application Domains
• Checkpointing
  – Provides fault tolerance
  – Saves a snapshot of critical data periodically to stable storage
  – Restores the execution in case of failure
• Extended SPOT to support checkpointing for Fortran applications

Constructs Added for Checkpointing
• Start checkpointing
  – StartCheckpointing(<location>, <statement>)
  – CKPSaveInteger(<variable name>)
• Restore
  – StartInitializing(<location>, <statement>)
  – CKPReadInteger(<variable name>)
• Additional features
  – CKPFrequencey(<number>)
  – CKPType(<Checkpointing Type>)

A Short Summary
• Express program transformations in terms of design intent rather than the underlying implementation
• Higher-level abstraction
  – Underlying transformations are transparent to the user
  – Code can be generated for different implementation languages
• Transformations happen at compile time
  – No harm to run-time performance

A Generic Framework for Extending Arbitrary GPLs with a MOP
• OpenFoo: an Extensible MOP Construction Approach
• Generalizing SPOT to Support New MOPs

OpenFoo: an Extensible MOP Construction Approach
• There is a general lack of infrastructure support for language extension in terms of building a MOP for an arbitrary language
• OpenFoo: an extensible prototype
  – Language-independent
  – Allows reusing artifacts and source code
• Models of MOP construction to assist extension
• Fortran 90 and C++
  – The C++ MOP is named OpenCpp to differentiate it from OpenC++

The Transformation Process Using OpenFoo
[Figure: the transformation process using OpenFoo]
[5] Yue, S., & Gray, J. (2015). OpenFoo: A Generalized Framework for Extending a Language with Meta-Object Protocols. Target Journal: Computer Languages, Systems and Structures. (In preparation)

OpenFoo Design Structure
[Figure: OpenFoo design structure]

OpenFoo Design Structure-Statement
[Figure: OpenFoo design structure for statements]

OpenCpp Design Structure
[Figure: OpenCpp design structure]

Abstract Syntax of SPOT
[Figure: abstract syntax of SPOT]

Extending Concrete Syntax of SPOT
    actionStatement
        : 'AddStatement' '(' locationKeyWord ',' target=statement ',' new=statement ')'
            -> ^('AddCallStatement' locationKeyWord $target $new)
        | 'ReplaceStatement' '(' oldStmt=statement ',' newStmt=statement ')'
            -> ^('ReplaceStatement' $oldStmt $newStmt)
        | 'DeleteStatement' '(' statement ')'
            -> ^('DeleteStatement' statement)
        | 'AddIncludeStatement' '(' ID '.h' ')'
            -> ^('AddIncludeStatement' ID)
        | 'AddNewStatement' '(' TYPENAME ID ')'
            -> ^('AddNewStatement' TYPENAME ID)
        ;

    statement
        : assignmentStatement -> ^(ASSIGN_STATEMENT assignmentStatement)
        | callStatement       -> ^(CALL_STATEMENT callStatement)
        | declareStatement    -> ^(DEC_STATEMENT declareStatement)
        | ifStatementWhole    -> ^(COND_STATEMENT ifStatementWhole)
        | doStatement         -> ^(DO_STATEMENT doStatement)
        | whileStatement      -> ^(WHILE_STATEMENT whileStatement)
        | forStatement        -> ^(FOR_STATEMENT forStatement)
        | switchStatement     -> ^(SWITCH_STATEMENT switchStatement)
        ;

Case Study: Implementing a Code Coverage Tool
• Code coverage tool
  – Determines the extent to which the source code is exercised by running a test suite
  – Statement coverage and decision coverage
• Implemented a code coverage tool for both Fortran and C++ code
  – Fast Fourier Transform (FFT) algorithm

Statement Coverage for C++
    Transformer statementCoverage {
      Within(File %file) {
        AddIncludeStatement(CodeCoverage.h);
        FORALL(Function *) {
          FORALL(Statement %stmt) {
            AddCallStatement(Before, $stmt.statement, Visited, $stmt.lineNum, $file.fileName);
          }
        }
      }
    }

Statement Coverage for Fortran
    Transformer statementCoverage {
      Within(File %file) {
        AddUseStatement(CodeCoverage);
        FORALL(Function *) {
          FORALL(Statement %stmt) {
            AddCallStatement(Before, $stmt.statement, Visited, $stmt.lineNum, $file.fileName);
          }
        }
      }
    }

How to Use the Generic Framework
[Diagram labels: Programmer, SPOT Program, Meta-Programmer, SPOT Syntax, Target Language Syntax, Parser, Template Store, Source Code, Code Generator, New Templates, Translated Source Code, OpenFoo, MOP Extension]

A GUI-Based Wizard for Program Transformation
[Figure: GUI-based wizard for program transformation]

Use MDE Techniques to Improve the Adoption of SPOT and OpenFoo
Level   Grammarware TS    Model-Driven Engineering (MDE) TS     Grammarware TS
M3      EBNF              MOF                                   EBNF
M2      SPOT Grammar      UML Metamodel                         C++ Grammar
M1      SPOT Programs     SPOT UML Model, OpenFoo UML Model     OpenFoo Implementation
(TS = Technical Space. Arrow labels in the diagram: Injection, Model Transformation, Extraction.)

Conclusion
• OpenFortran/OpenC
  – Bring the power of meta-programming to languages used in HPC
• SPOT
  – Reduces accidental complexities and simplifies the use of MOPs
• OpenFoo
  – A generalized framework suitable for generating a MOP for an arbitrary language

References
1. Yue, S. (2013). Program transformation techniques applied to languages used in high performance computing. In Proceedings of the 2013 companion publication for conference on Systems, programming, & applications: software for humanity (pp. 49-52).
2. Yue, S., & Gray, J. (2013). OpenFortran: Extending Fortran with Meta-programming. In the companion publication for The International Conference for High Performance Computing, Networking, Storage, and Analysis, SC2013.
3. Yue, S., & Gray, J. (2014). SPOT: A DSL for Extending FORTRAN Programs With Meta-Programming. Advances in Software Engineering, Volume 2014, pp. 1-23.
4. Yue, S., & Gray, J. (2015). Extending C with Computational Reflection. 24th International Conference on Software Engineering and Data Engineering, SEDE2015. (In preparation)
5. Yue, S., & Gray, J. (2015). OpenFoo: A Generalized Framework for Extending a Language with Meta-Object Protocols. Target Journal: Computer Languages, Systems and Structures. (In preparation)
6. OpenFoo source code and code generator implementation. https://gist.github.com/mountop/6875d1da35adf6cea516
7. Jacob, F., Yue, S., Gray, J., & Kraft, N. (2012). Modulo-F: A Modularization Language for FORTRAN Programs. In Journal of Convergence Information Technology, vol. 7, no. 12 (pp. 256-263).