CPSC 388 – Compiler Design and Construction

Code Generation
  Global Variables
  Functions (entry and exit)
  Statements
  Expressions
  Assume only scalar variables, no arrays (you figure out arrays)
  Generate MIPS assembly code for Spim
  http://pages.cs.wisc.edu/~larus/spim.html

Spim Interpreter
  spim -file <name>
  <name> is the name of the file containing MIPS assembly code
  Program will run, giving output and errors to screen
  Also has graphical interface

Some Spim Registers
  Register    Purpose
  $sp         Stack pointer
  $fp         Frame pointer
  $ra         Return address
  $v0, $a0    Used for output and return value
  $t0-$t7     Temporaries

Helpers for Generating Assembly Code
  Constants used:
    SP, FP, RA, V0, A0, T0, T1, TRUE, FALSE
  Methods used:
    generate(opcode, arg1, arg2, arg3)
    generateIndexed(opcode, R1, R2, offset)
    genPush(R1)
    genPop(R1)
    String nextLabel()
    genLabel(label)

Global Variables
  For each global variable, v:
            .data
            .align 2    # align on word boundary
    _v:     .space N
  N is the size of the variable in bytes
    int: 4 bytes
    arrays: 4*(size of array)

Global Variable Example
  Given source code:
    int x;
    int y[10];
  Generate code:
            .data
            .align 2
    _x:     .space 4
            .data
            .align 2
    _y:     .space 40

Code Generation for Functions
  For each function, generate:
    Function preamble
    Function entry (set up AR)
    Function body (function's statements)
    Function exit (restore stack, return to caller)

Function Preamble
  For the main function generate:
            .text
            .globl main
    main:
  All other functions:
            .text
    _<fName>:
  where <fName> is the function name

Function Entry
  [Figure: the stack before and after function entry. Before: the parameters sit on top of the caller's AR, with SP at the top of the stack and FP in the caller's AR. After: the new AR holds, from the top of the stack down, space for local vars, the control link, the return address, and the parameters (marked by FP), above the caller's AR.]

Function Entry Steps
  Push RA
    sw   $ra, 0($sp)
    subu $sp, $sp, 4
  Push CL
    sw   $fp, 0($sp)
    subu $sp, $sp, 4
  Set FP
    addu $fp, $sp, <size of params + 8>
  Push space for local vars
    subu $sp, $sp, <size of locals in bytes>

Function Body
  FnBodyNode has children DeclListNode and StmtListNode
  No need for code from DeclListNode
  Call codeGen() for statement nodes in StmtListNode

Function Exit
  Pop AR
  Jump to RA field
    lw   $ra, -<paramsize>($fp)      # load return address
    move $t0, $fp                    # save FP so SP can be restored from it
    lw   $fp, -<paramsize+4>($fp)    # restore caller's FP (control link)
    move $sp, $t0                    # restore SP
    jr   $ra                         # return
  [Figure: the AR being popped: space for local vars, control link, return address, and parameters (marked by FP), above the caller's AR, with SP at the top of the stack.]

Function Returns
  A function returns at:
    Return statement
    End of function
  Two ways to handle this:
    Generate return code once and have return statements jump to this code (opcode is 'b' for branch)
    Generate return code for each return statement and at the end of the function

Return Statement's Value
  Return statements can return a value from an ExpNode
  ExpNodes will push values onto the stack
  Return statement should pop the top of stack and place the return value in register V0 before the rest of the return code

Statements
  Write a different codeGen() method for each kind of statement in the AST
  Hard to debug assembly code
  Alternate method:
    Write codeGen() for the WriteIntStmtNode and WriteStrStmtNode classes first (maybe one method)
    Test codeGen() for other kinds of statements and expressions by writing a c- program that computes and prints a value.

Write Statement
  Call codeGen for expression being printed
    Leaves value on top of stack (if int)
    Leaves address on top of stack (if String)
  Pop top of stack into A0 (register used for output)
  Set register V0 to 1 if int, 4 if String
  Generate: syscall

Write Statement Example
    myExp.codeGen();
    genPop(A0);
    if (type is int)
        generate("li", V0, 1);
    else if (type is String)
        generate("li", V0, 4);
    generate("syscall");

If Statement
  IfStmtNode has children ExpNode, DeclListNode, StmtListNode
  Two methods for generating code:
    Numeric Method
    Control-Flow Method

Numeric Method for If Statements
  Evaluate the condition, leave value on stack
  Pop top of stack into register T0
  Jump to FalseLabel if T0 == FALSE
  Code for statement list
  FalseLabel:
  Note: Every label in assembly code must be unique! I'm using FalseLabel here, but the actual label name comes from nextLabel() and is emitted with genLabel()

You Try It
  Write the actual code needed for IfStmtNode
  What is the form for IfElseStmtNode?
  What is the form for WhileStmtNode?
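
One possible shape for the numeric-method IfStmtNode code, as a sketch only. The class NumericIfSketch, its in-memory output list, the ".L" label format, and FALSE being 0 are all assumptions made for illustration; the course's real generate/genPush/genPop helpers may differ. The condition and body are passed in as Runnables that stand in for myExp.codeGen() and myStmtList.codeGen().

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for the code generator helpers; it collects
 *  assembly lines in a list instead of writing them to a file. */
final class NumericIfSketch {
    static final String SP = "$sp", T0 = "$t0", FALSE = "0"; // FALSE assumed to be 0

    final List<String> out = new ArrayList<>();
    private int labelCount = 0;

    void generate(String opcode, String... args) {
        out.add("\t" + opcode + (args.length == 0 ? "" : " " + String.join(", ", args)));
    }

    void generateIndexed(String opcode, String r1, String r2, int offset) {
        out.add("\t" + opcode + " " + r1 + ", " + offset + "(" + r2 + ")");
    }

    /** Push: store at 0($sp), then move SP down one word. */
    void genPush(String reg) {
        generateIndexed("sw", reg, SP, 0);
        generate("subu", SP, SP, "4");
    }

    /** Pop: the top value sits one word above SP. */
    void genPop(String reg) {
        generateIndexed("lw", reg, SP, 4);
        generate("addu", SP, SP, "4");
    }

    String nextLabel() { return ".L" + labelCount++; }

    void genLabel(String label) { out.add(label + ":"); }

    /** Numeric method for an if statement. */
    void genIf(Runnable cond, Runnable body) {
        String falseLab = nextLabel();
        cond.run();                            // leaves the condition value on the stack
        genPop(T0);
        generate("beq", T0, FALSE, falseLab);  // skip the body when the condition is false
        body.run();
        genLabel(falseLab);
    }

    public static void main(String[] args) {
        NumericIfSketch g = new NumericIfSketch();
        g.genIf(
            () -> { g.generate("li", T0, "1"); g.genPush(T0); },  // condition: literal 1
            () -> g.generate("nop"));                              // placeholder body
        g.out.forEach(System.out::println);
    }
}
```

Running main prints the condition code, the pop into $t0, a beq to the fresh label, the placeholder body, and finally the label itself. The forms for IfElseStmtNode and WhileStmtNode follow the same pattern with one more label each and are left as the exercise intends.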

Return Stmt
  ReturnStmtNode has child ExpNode
  Call codeGen() for ExpNode child (leaves result value on stack)
  Pop value off stack into V0
  Generate code for actual return
    Pop AR
    Jump to address in RA

Read Statement
  ReadStmtNode has child ExpNode
  Code:
    li   $v0, 5
    syscall
  Loads special value 5 into register V0, then does syscall.
  5 tells syscall to read in an integer and store it back in V0
  Need to write code to copy value from V0 back into address represented by ExpNode

ReadStmtNode Examples
    int x;
    int *p;
    int **q;
    p = &x;
    q = &p;
    scanf("%d", &x);
    scanf("%d", p);
    scanf("%d", *q);
  All three calls to scanf read a value into variable x.
  The value of the expression is the address of x.
  To store value into address do:
    Generate code to compute value of expression (value is pushed onto stack)
    Pop the value into T0
    Store from V0 to address in T0

ReadStmtNode Example
    generate("li", V0, 5);
    generate("syscall");
    myExp.codeGen();
    genPop(T0);
    generateIndexed("sw", V0, T0, 0);

Identifiers in Code Generation
  Function call (id is name of function)
    Need to jump-and-link to instruction using the name of function
  Expressions (can be just a name (id) or an id can be one of the operands)
    Generate code to fetch current value and push onto stack
  Assignment statements (id of left-hand side)
    Generate code to fetch the address of variable and push address onto stack

IdNode
  Needs several methods
    genJumpAndLink(): generate jump and link code for given IdNode
    codeGen(): pushes value of IdNode expression onto stack
    genAddr(): pushes address of IdNode onto stack

genJumpAndLink() for IdNode
  Simply generate a jump-and-link instruction (with opcode jal) using label as target of the jump.
  If the called function is "main", the label is just "main".
  For all other functions, the label is of the form: _<functionName>

codeGen() for IdNode
  Copy the value of the global / local variable into a register (e.g., T0), then push the value onto the stack
  Different for local or global variables
  Examples:
    lw $t0, _g         # load global g into T0
    lw $t0, -4($fp)    # load local into T0
  How do you tell if variable is local or global? Use the symbol table

genAddr() for IdNode
  Load the address of the identifier into a register then push onto the stack
  Uses the load-address opcode la rather than the load-value opcode lw
  Different for locals or globals
  Examples:
    la $t0, _g         # global
    la $t0, -8($fp)    # local

AssignStmtNode
  AssignStmtNode has children ExpNode (left-hand side) and ExpNode (right-hand side)
  Push the address of the left-hand-side expression onto the stack.
  Evaluate the right-hand-side expression, leaving the value on the stack.
  Store the top-of-stack value into the second-from-the-top address.

Expression Node codeGen
  Always generate code to leave value of expression on top of stack
    Literals: IntLitNode, StrLitNode
    Function call
    Non-short-circuited operators
    Short-circuited operators

IntLitNode
  Generate code to push the literal value onto the stack
  Generated code should look like:
    li   $t0, <value>    # load value into T0
    sw   $t0, ($sp)      # push onto stack
    subu $sp, $sp, 4

StrLitNode
  Store string literal in data area
  Push address of string onto stack
  Two string lits should be equal if they contain the same characters
  This means store only a single instance of a string literal no matter how often it appears in user code

Storing String Literals
  Code to store a string literal in data area:
             .data
    <label>: .asciiz <string value>
  <label> needs to be a new label; e.g., returned by a call to nextLabel.
  <string value> needs to be a string in quotes. You should be storing string literals that way, so just write out the value of the string literal, quotes and all.
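
A minimal sketch of storing each string literal only once, assuming a map from the quoted literal text to its data-area label. The class StringLitTable, the method labelFor, and the ".L" label format are hypothetical names used only for this illustration; output is collected in a list rather than written to the assembly file.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper: one data-area entry per distinct string literal. */
final class StringLitTable {
    private final Map<String, String> labels = new HashMap<>();
    final List<String> out = new ArrayList<>();
    private int labelCount = 0;

    private String nextLabel() { return ".L" + labelCount++; }

    /** Returns the data-area label for a literal, emitting its storage on first use.
     *  quoted is the literal exactly as written in the source, quotes included. */
    String labelFor(String quoted) {
        String label = labels.get(quoted);
        if (label == null) {
            label = nextLabel();
            labels.put(quoted, label);
            out.add("\t.data");
            out.add(label + ":\t.asciiz " + quoted);
        }
        return label;
    }

    public static void main(String[] args) {
        StringLitTable t = new StringLitTable();
        t.labelFor("\"hello\"");
        t.labelFor("\"hello\"");   // same text: reuses the first label, emits nothing
        t.labelFor("\"bye\"");
        t.out.forEach(System.out::println);
    }
}
```

A StrLitNode's codeGen() could call labelFor on its literal and then push the address of the returned label, so two literals with the same characters end up with the same address.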

Storing Strings Once
  To avoid storing the same string literal value more than once, keep a hashtable in which the keys are the string literals, and the associated information is the static-data-area label.
  When you process a string literal, look it up in the hashtable: if it is there, use its associated label; otherwise, generate code to store it in the static data area, and add it to the hashtable.

Pushing StrLitNodes onto stack
  Generated code:
         .text
    la   $t0, <label>    # load addr into $t0
    sw   $t0, ($sp)      # push onto stack
    subu $sp, $sp, 4

CallExpNode
  CallExpNode has children IdNode and ExpListNode
  Code should:
    1. Evaluate each actual parameter, pushing the values onto the stack.
    2. Jump and link (jump to the called function, leaving the return address in the RA register).
    3. Push the returned value (which will be in register V0) onto the stack.
  Since the codeGen method for an expression generates code to evaluate the expression, leaving the value on the stack, all we need to do for step 1 is call the codeGen method of the ExpListNode (which will in turn call the codeGen methods of each ExpNode in the list).
  For step 2, we just call the genJumpAndLink method of the IdNode.
  For step 3, we just call genPush(V0).

Also CallStmtNode
  CallStmtNode has child CallExpNode, which has children IdNode and ExpListNode
  CallExpNode pushes value onto stack (may be void, i.e. garbage from V0)
  CallStmtNode MUST pop value off stack

Non-Short-Circuited ExpNodes
  Plus, Minus, …, Not, Less, Equals, …
  All do the same basic sequence of tasks:
    Call each child's codeGen method to generate code that will evaluate the operand(s), leaving the value(s) on the stack.
    Generate code to pop the operand value(s) off the stack into register(s) (e.g., T0 and T1). Remember that if there are two operands, the right one will be on the top of the stack.
    Generate code to perform the operation (see Spim documentation for a list of opcodes).
    Generate code to push the result onto the stack.
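
To see why the pop order matters for a non-commutative operator, here is a small simulation of the four steps for a subtraction. It is not compiler code: the class OperandOrderSketch and the method evalMinus are hypothetical, and a Java Deque stands in for the runtime stack.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical simulation of the code generated for "left - right". */
final class OperandOrderSketch {
    static int evalMinus(int left, int right) {
        Deque<Integer> stack = new ArrayDeque<>();
        stack.push(left);      // step 1: code from the left child's codeGen()
        stack.push(right);     //         code from the right child's codeGen()
        int t1 = stack.pop();  // step 2: genPop(T1), the right operand is on top
        int t0 = stack.pop();  //         genPop(T0), the left operand is underneath
        int result = t0 - t1;  // step 3: generate("sub", T0, T0, T1)
        stack.push(result);    // step 4: genPush(T0)
        return stack.peek();
    }

    public static void main(String[] args) {
        System.out.println(evalMinus(7, 2));  // prints 5
    }
}
```

Popping into T0 first and T1 second would compute right minus left instead, which is the mistake the reminder about the right operand being on top is guarding against.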

Note on SPIM op-codes
  The not opcode is a bit-wise not (flips bits); this won't work for the Not boolean operation
  Suggest using the seq opcode:
    seq Rdest, Rsrc1, Src2

Example AddExpNode
  AddExpNode has two ExpNode children
    public void codeGen() {
        // step 1: evaluate both operands
        myExp1.codeGen();
        myExp2.codeGen();
        // step 2: pop values into T0 and T1
        genPop(T1);
        genPop(T0);
        // step 3: do the addition (T0 = T0 + T1)
        generate("add", T0, T0, T1);
        // step 4: push result
        genPush(T0);
    }

Short-Circuited Operators
  AndNode and OrNode
  Short-circuit means the right operand is evaluated ONLY if it needs to be evaluated
  Example: (J != 0) && (I/J > Epsilon)

AndNode Procedure
  Evaluate left operand
  If left operand is true then
    Evaluate right operand
    Expression value is value of right operand
  Else
    Expression value is false

OrNode Procedure
  Evaluate left operand
  If left operand is false then
    Evaluate right operand
    Expression value is value of right operand
  Else
    Expression value is true

Short-Circuit Nodes
  Need to do jump depending on values of sub-expressions
  Look at if-node code for example of this

You Try It
  Write code for AndExpNode

If Statement
  IfStmtNode has children ExpNode, DeclListNode, StmtListNode
  Two methods for generating code:
    Numeric Method
      Evaluate condition, pop off stack, jump on particular value
    Control-Flow Method
      Evaluate condition and jump to TrueLabel on true or FalseLabel on false (i.e. ALWAYS do a jump)
      Requires a new method for Expression Nodes (i.e. don't put value on the stack, instead do jump)
      Call new method genJumpCode(LabelTrue, LabelFalse)

codeGen for IfStmtNode (control-flow method)
    public void codeGen() {
        String trueLab = nextLabel();
        String doneLab = nextLabel();
        myExp.genJumpCode(trueLab, doneLab);
        genLabel(trueLab);
        myStmtList.codeGen();
        genLabel(doneLab);
    }

genJumpCode() for IdNode
  Old way:
    lw   $t0, <var's addr>
    push $t0
  New way:
    lw   $t0, <var's addr>
    beq  $t0, FALSE, falseLab
    b    trueLab

genJumpCode() for LessNode
  Old way:
    -- code to eval operands --
    -- pop values into T1, T0 --
    slt  $t2, $t0, $t1
    push $t2
  New way:
    -- code to eval operands --
    -- pop values into T1, T0 --
    blt  $t0, $t1, trueLab
    b    falseLab

genJumpCode() for Short-Circuited Operators (AndExpNode)
  AndExpNode has two ExpNode children
  Call genJumpCode() of left child.
    If child is false then jump to false label
    If child is true jump to right child
  Generate label for right child
  Call genJumpCode() of right child.
    If child is false jump to false label
    If child is true jump to true label

genJumpCode() for AndExpNode
    public void genJumpCode(String trueLab, String falseLab) {
        String newLab = nextLabel();
        myExp1.genJumpCode(newLab, falseLab);
        genLabel(newLab);
        myExp2.genJumpCode(trueLab, falseLab);
    }

Example with genJumpCode
    if (a && b>0) { …
  AST: IfStmtNode has children AndExpNode, DeclListNode, StmtListNode; AndExpNode has children IdNode and LessExpNode; the remaining subtrees are shown as …

genJumpCode() Example
  IfStmtNode
    Creates two labels, trueLab and doneLab
    Calls AndNode's genJumpCode(trueLab, doneLab)
    Generates trueLab
    -- code for StmtListNode --
    Generates doneLab
  AndNode
    Creates a label newLab
    Calls IdNode's genJumpCode(newLab, doneLab)
    Generates newLab
    Calls LessNode's genJumpCode(trueLab, doneLab)

You Try It
  What is the form of the code for genJumpCode()
    For an OrNode
    For a NotNode

Comparing Numeric and Control-Flow Methods
  Shown for an AndNode
  Numeric Method
    -- code to evaluate left operand, leaving value on stack
    Pop into T0
    Goto trueLab if T0 == TRUE
    Push FALSE
    Goto doneLab
    trueLab:
    -- code to evaluate right operand, leaving value on stack
    doneLab:
  Control-Flow Method
    -- code to evaluate left operand, including jumps to newLab and falseLab
    newLab:
    -- code to evaluate right operand, including jumps to trueLab and falseLab

You Try It
  Compare two approaches for OrNode and NotNode
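
The control-flow walk-through for an if statement with an && condition can be traced with a small self-contained sketch. Everything here is an assumption made for illustration: the class JumpCodeSketch, the nested Id, Less, and And classes, the ".L" label format, FALSE being 0, and the simplification that operands are global variables loaded straight into registers instead of being evaluated onto the stack and popped. OrNode and NotNode are deliberately left out.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical miniature of the control-flow method. */
final class JumpCodeSketch {
    static final List<String> out = new ArrayList<>();
    static int labelCount = 0;

    static String nextLabel() { return ".L" + labelCount++; }
    static void genLabel(String label) { out.add(label + ":"); }
    static void emit(String instr) { out.add("\t" + instr); }

    interface Exp { void genJumpCode(String trueLab, String falseLab); }

    /** A global variable used as a boolean. */
    static final class Id implements Exp {
        final String name;
        Id(String name) { this.name = name; }
        public void genJumpCode(String trueLab, String falseLab) {
            emit("lw $t0, _" + name);
            emit("beq $t0, 0, " + falseLab);   // FALSE assumed to be 0
            emit("b " + trueLab);
        }
    }

    /** left < right, with both operands simplified to global variables. */
    static final class Less implements Exp {
        final String left, right;
        Less(String left, String right) { this.left = left; this.right = right; }
        public void genJumpCode(String trueLab, String falseLab) {
            emit("lw $t0, _" + left);
            emit("lw $t1, _" + right);
            emit("blt $t0, $t1, " + trueLab);
            emit("b " + falseLab);
        }
    }

    /** Short-circuit and: the right child is reached only through newLab. */
    static final class And implements Exp {
        final Exp left, right;
        And(Exp left, Exp right) { this.left = left; this.right = right; }
        public void genJumpCode(String trueLab, String falseLab) {
            String newLab = nextLabel();
            left.genJumpCode(newLab, falseLab);
            genLabel(newLab);
            right.genJumpCode(trueLab, falseLab);
        }
    }

    /** Control-flow if statement; the body is represented by a comment line. */
    static List<String> genIf(Exp cond) {
        out.clear();
        labelCount = 0;
        String trueLab = nextLabel();
        String doneLab = nextLabel();
        cond.genJumpCode(trueLab, doneLab);
        genLabel(trueLab);
        emit("# code for the statement list");
        genLabel(doneLab);
        return new ArrayList<>(out);
    }

    public static void main(String[] args) {
        genIf(new And(new Id("a"), new Less("b", "c"))).forEach(System.out::println);
    }
}
```

In the printed assembly, the code for a branches to the done label when a is false and otherwise to the new label that precedes the comparison, so the comparison is never executed when a is false. No value is pushed or popped for the condition, which is the main difference from the numeric method.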