ASSIGNMENT 7
Akshaya Misal 4240 17/09/2014

TITLE: Target code generation for optimized code.

PROBLEM STATEMENT:
Write a program in C/C++ to implement the code generation phase for optimized three-address code statements. The program should read the optimized intermediate code (IC) from an input file and generate equivalent assembly code as the target code.

LEARNING OBJECTIVES:
Understand the code generation phase
Understand the issues in the design of a code generator
Understand the data structures and the simple code generation algorithm

PREREQUISITES:
Ability to write regular expressions.
Use of concepts from Theory of Computation.
Basic understanding of optimized code and the structure of expressions.

THEORY:
Code Generation:
Goal: Take the intermediate representation of the source program (optimized three-address code statements) and produce an equivalent target program as output.
The requirements of a code generator are:
1) Correct target code
2) High quality code
3) Effective use of the resources of the target machine
4) The code generator itself should run quickly

Code generation algorithm:
The algorithm assumes that for each operator in the intermediate code there is a corresponding target-language operator. It also assumes that a computed result is left in a register for as long as possible and is stored to memory only if the register is needed for another computation, or just before a procedure call, jump, or labeled statement.

Data Structures used by algorithm:
To track which values are in registers and where a given value resides, the algorithm uses the following data structures:
1) Register Descriptor: Records, for each register, the variables whose values it currently holds.
2) Address Descriptor: Each variable has an address descriptor listing the locations (register, memory, or both) where its current value is stored.
The register descriptor can be computed from the address descriptors.
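The two descriptors described above can be sketched in C++ as follows. This is only an illustration under stated assumptions, not a prescribed layout: the type names AddressDescriptor and RegisterDescriptor, the helpers isRegister() and registersFrom(), and the convention that register names look like "R0", "R1", ... while a variable's memory home is written as the variable's own name are all hypothetical choices made for this example. It shows how the register descriptors can be derived from the address descriptors.

```cpp
#include <cassert>
#include <cctype>
#include <map>
#include <set>
#include <string>

// Address descriptor: variable -> set of locations holding its current value.
// A location is a register name ("R0") or the variable's own name (memory).
using AddressDescriptor = std::map<std::string, std::set<std::string>>;

// Register descriptor: register -> set of variables whose value it holds.
using RegisterDescriptor = std::map<std::string, std::set<std::string>>;

// Assumed naming convention for this sketch: 'R' followed by a digit.
bool isRegister(const std::string& loc) {
    return loc.size() >= 2 && loc[0] == 'R' &&
           std::isdigit(static_cast<unsigned char>(loc[1])) != 0;
}

// Derive the register descriptors by inverting the address descriptors.
// Registers that hold nothing simply do not appear in the result.
RegisterDescriptor registersFrom(const AddressDescriptor& addr) {
    RegisterDescriptor regs;
    for (const auto& entry : addr) {
        // Variable names must not collide with the register naming convention.
        assert(!isRegister(entry.first));
        for (const std::string& loc : entry.second)
            if (isRegister(loc)) regs[loc].insert(entry.first);
    }
    return regs;
}
```

For example, if the address descriptors say that t is in R0, u is in R1 and in memory, and a is only in memory, registersFrom() yields "R0 contains t" and "R1 contains u", with no entry for a.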
Three Aspects in Code generation:
1) Choosing registers
2) Generating instructions
3) Managing descriptors

Code Generation algorithm:
For each three-address statement x := y op z, perform the following actions:
1) Invoke the function getreg() to determine the location L where the result of y op z should be stored.
2) Consult the address descriptor for y to find its current location, preferring a register over memory. If the value of y is not already in L (the location returned by getreg()), generate the target code MOV y,L.
3) Generate the instruction OP z,L, using the current location of z. Update the address descriptor of x to indicate that x is in L. If L is a register, update its descriptor to show that it holds x, and remove x from all other register descriptors.
4) If the current values of y and/or z have no next use, are not live on exit from the block, and are in registers, alter the register descriptors to indicate that those registers no longer contain y and/or z.
5) Function getreg() chooses L as follows:
a) If y is in a register that holds no other variable (a copy such as x=y can make one register hold two names) and y is not live and has no further use, return the register of y for L and update the address descriptor of y to indicate that y is no longer in L.
b) Failing (a), return an empty register for L if there is one.
c) Failing (b), if x has a next use in the block, pick an occupied register, store its value in memory if it is not already there, and return that register.
d) If x is not used in the block, or no suitable register is found, select the memory location of x as L.

Example with test input and output:
d = (a-b)+(a-c)+(a-c)
The equivalent three-address code, with t, u and v as temporary variables and all registers initially empty, is:
t=a-b
u=a-c
v=t+u
d=v+u

Statement   Code         Register Descriptor   Address Descriptor
t=a-b       MOV a,R0     R0 contains t         t in R0
            SUB b,R0
u=a-c       MOV a,R1     R0 contains t         t in R0
            SUB c,R1     R1 contains u         u in R1
v=t+u       ADD R1,R0    R0 contains v         v in R0
                         R1 contains u         u in R1
d=v+u       ADD R1,R0    R0 contains d         d in R0
            MOV R0,d                           d in R0 and in memory

A conditional jump is handled in the same way:

Statement          Code
x=y+z              MOV y,R0
if x<0 goto z      ADD z,R0
                   MOV R0,x
                   CJ< z

iBURG: It is a simple and efficient code-generator generator. Given a tree-grammar specification, it emits a tree matcher that performs dynamic programming at compile time. BURG, by contrast, uses BURS (bottom-up rewrite system) theory to move the dynamic programming to compile-compile time.
BURS table generation is more complicated, but BURS matchers generate optimal code in constant time per node. The main disadvantage of BURS is that costs must be constants. iBURG generates a state function that uses a straightforward implementation of tree pattern matching. The matcher it generates is hard-coded rather than table-driven.

ALGORITHM:
1. Decide the target architecture, along with the number of registers and the addressing modes.
2. Read the quadruples from the input file and store them in a structure or linked list.
3. Implement the simple code generation algorithm and the getreg() function given above.
4. For every operator, check which case of the code generation algorithm applies, generate the equivalent assembly code, and write it to the output file.

CONCLUSION:
Hence we have written a program to implement the code generation phase for optimized three-address code statements using C/C++. The program reads the optimized IC from an input file and generates equivalent assembly code as the target code.
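As a worked illustration of the simple code generation algorithm and getreg() from the THEORY section, the following C++ sketch generates code for one basic block of quadruples. It is a simplified sketch under stated assumptions, not the assignment's reference solution. The names Quad, hasNextUse(), mnemonic(), generate() and the liveOnExit parameter are hypothetical. Each register holds at most one name, only + - * / are handled, case (d) of getreg() (using a memory location as L) is left out, reading the input file is omitted, and the set of variables live on exit from the block is supplied by the caller. Operand order follows the document's examples (MOV source,destination).

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

struct Quad { std::string x, y, op, z; };  // x = y op z

// True if `name` is read after statement i, before being redefined.
static bool hasNextUse(const std::vector<Quad>& q, std::size_t i, const std::string& name) {
    for (std::size_t j = i + 1; j < q.size(); ++j) {
        if (q[j].y == name || q[j].z == name) return true;
        if (q[j].x == name) return false;
    }
    return false;
}

static std::string mnemonic(const std::string& op) {
    if (op == "+") return "ADD";
    if (op == "-") return "SUB";
    if (op == "*") return "MUL";
    return "DIV";  // assumption: the input uses only + - * /
}

std::vector<std::string> generate(const std::vector<Quad>& q,
                                  const std::set<std::string>& liveOnExit,
                                  int numRegs) {
    assert(numRegs > 0);
    std::vector<std::string> code;
    std::vector<std::string> holds(numRegs);  // register descriptor; "" means empty
    std::set<std::string> dirty;              // names whose only current copy is in a register

    auto reg = [](int r) { return "R" + std::to_string(r); };
    auto regOf = [&](const std::string& n) -> int {
        for (int r = 0; r < numRegs; ++r)
            if (holds[r] == n) return r;
        return -1;
    };
    auto loc = [&](const std::string& n) -> std::string {  // prefer register over memory
        int r = regOf(n);
        if (r >= 0) return reg(r);
        return n;
    };

    for (std::size_t i = 0; i < q.size(); ++i) {
        const Quad& s = q[i];
        // A value may be dropped from its register if it is not read again in the
        // block and is either already in memory or not live on exit.
        auto dead = [&](const std::string& n) {
            return !hasNextUse(q, i, n) && !(dirty.count(n) && liveOnExit.count(n));
        };

        // getreg(): (a) reuse y's register, (b) take an empty one, (c) spill one.
        int L = regOf(s.y);
        if (L < 0 || !dead(s.y)) {
            L = regOf("");
            if (L < 0) {
                L = (numRegs > 1 && holds[0] == s.z) ? 1 : 0;  // avoid evicting z
                if (dirty.count(holds[L])) {                   // save the victim first
                    code.push_back("MOV " + reg(L) + "," + holds[L]);
                    dirty.erase(holds[L]);
                }
                if (holds[L] != s.y) holds[L].clear();
            }
        }

        // Steps 2 and 3: load y into L if needed, then apply the operator.
        if (holds[L] != s.y) code.push_back("MOV " + loc(s.y) + "," + reg(L));
        code.push_back(mnemonic(s.op) + " " + loc(s.z) + "," + reg(L));

        // Update descriptors: x now lives only in L.
        for (std::string& h : holds)
            if (h == s.x) h.clear();
        dirty.erase(holds[L]);
        holds[L] = s.x;
        dirty.insert(s.x);

        // Step 4: release registers of operands that are no longer needed.
        for (const std::string& n : {s.y, s.z}) {
            int r = regOf(n);
            if (r >= 0 && r != L && dead(n)) {
                holds[r].clear();
                dirty.erase(n);
            }
        }
    }

    // End of block: store values that are live on exit and only in a register.
    for (int r = 0; r < numRegs; ++r)
        if (dirty.count(holds[r]) && liveOnExit.count(holds[r]))
            code.push_back("MOV " + reg(r) + "," + holds[r]);
    return code;
}
```

Calling generate() on the four quadruples t=a-b, u=a-c, v=t+u, d=v+u with liveOnExit = {d} and two registers yields MOV a,R0; SUB b,R0; MOV a,R1; SUB c,R1; ADD R1,R0; ADD R1,R0; MOV R0,d, which matches the Code column of the example table. Keeping the descriptors as a plain vector and a set is a choice made for brevity; a fuller implementation would let a register hold several names and would handle case (d) of getreg().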