Computer Architecture and Design – ECEN 350
Part 3
[Some slides adapted from M. Irwin, D. Patterson and others]

Logical Operations

There are a number of bit-wise logical operations in the MIPS ISA:

    and $t0, $t1, $t2      #$t0 = $t1 & $t2
    or  $t0, $t1, $t2      #$t0 = $t1 | $t2
    nor $t0, $t1, $t2      #$t0 = not($t1 | $t2)

    R format (and): op = 0, rs = 9, rt = 10, rd = 8, funct = 0x24

    andi $t0, $t1, 0xff00  #$t0 = $t1 & ff00
    ori  $t0, $t1, 0xff00  #$t0 = $t1 | ff00

    andi: R[t] <- R[s] & (0^16 :: IR[15-0]), i.e., the 16-bit immediate is zero-extended to 32 bits

Logical Operations in Action

Logical operations operate on individual bits of the operand.

    $t2 = 0…0 0000 1101 0000
    $t1 = 0…0 0011 1100 0000

    and $t0, $t1, $t2      $t0 = 0…0 0000 1100 0000
    or  $t0, $t1, $t2      $t0 = 0…0 0011 1101 0000
    nor $t0, $t1, $t2      $t0 = 1…1 1100 0010 1111

How About Larger Constants?
We'd also like to be able to load a 32-bit constant into a register.
Must use two instructions, with a new "load upper immediate" instruction:

    lui $t0, 0xaaaa

    I format: op = f, rs = 0, rt = 8, immediate = 1010101010101010

Then must get the lower order bits right, i.e.,

    ori $t0, $t0, 0xaaaa

    $t0 after lui:   1010101010101010 0000000000000000
    ori immediate:   0000000000000000 1010101010101010
    $t0 after ori:   1010101010101010 1010101010101010

Shift Operations

Need operations to pack and unpack 8-bit characters into 32-bit words.
Shifts move all the bits in a word left or right:

    sll $t2, $s0, 8        #$t2 = $s0 << 8 bits
    srl $t2, $s0, 8        #$t2 = $s0 >> 8 bits

    R format (sll): op = 0, rt = 16, rd = 10, shamt = 8, funct = 0x00

Such shifts are called logical because they fill with zeros.
Notice that a 5-bit shamt field is enough to shift a 32-bit value 2^5 – 1 or 31 bit positions.

More Shift Operations

An arithmetic shift (sra) maintains the arithmetic correctness of the shifted value (i.e., a number shifted right one bit should be ½ of its original value; a number shifted left should be 2 times its original value).
sra uses the most significant bit (sign bit) as the bit shifted in.
sll works for arithmetic left shifts for 2's complement (so there is no need for a sla).

    sra $t2, $s0, 8        #$t2 = $s0 >> 8 bits

    R format (sra): op = 0, rt = 16, rd = 10, shamt = 8, funct = 0x03

Compiling Another While Loop

Compile the assembly code for the C while loop where i is in $s3, k is in $s5, and the base address of the array save is in $s6

    while (save[i] == k)
        i += 1;

    Loop: sll  $t1, $s3, 2
          add  $t1, $t1, $s6
          lw   $t0, 0($t1)
          bne  $t0, $s5, Exit
          addi $s3, $s3, 1
          j    Loop
    Exit: . . .

Review: MIPS Instructions, so far

    Instr                       OpC      Example                 Meaning

    Arithmetic (R & I format)
    add                         0 & 20   add  $s1, $s2, $s3      $s1 = $s2 + $s3
    subtract                    0 & 22   sub  $s1, $s2, $s3      $s1 = $s2 - $s3
    add immediate               8        addi $s1, $s2, 4        $s1 = $s2 + 4
    shift left logical          0 & 00   sll  $s1, $s2, 4        $s1 = $s2 << 4
    shift right logical         0 & 02   srl  $s1, $s2, 4        $s1 = $s2 >> 4 (fill with zeros)
    shift right arithmetic      0 & 03   sra  $s1, $s2, 4        $s1 = $s2 >> 4 (fill with sign bit)
    and                         0 & 24   and  $s1, $s2, $s3      $s1 = $s2 & $s3
    or                          0 & 25   or   $s1, $s2, $s3      $s1 = $s2 | $s3
    nor                         0 & 27   nor  $s1, $s2, $s3      $s1 = not ($s2 | $s3)
    and immediate               c        andi $s1, $s2, 0xff00   $s1 = $s2 & 0xff00
    or immediate                d        ori  $s1, $s2, 0xff00   $s1 = $s2 | 0xff00
    load upper immediate        f        lui  $s1, 0xffff        $s1 = 0xffff0000

    Data transfer (I format)
    load word                   23       lw   $s1, 100($s2)      $s1 = Memory($s2+100)
    store word                  2b       sw   $s1, 100($s2)      Memory($s2+100) = $s1
    load byte                   20       lb   $s1, 101($s2)      $s1 = Memory($s2+101)
    store byte                  28       sb   $s1, 101($s2)      Memory($s2+101) = $s1
    load half                   21       lh   $s1, 102($s2)      $s1 = Memory($s2+102)
    store half                  29       sh   $s1, 102($s2)      Memory($s2+102) = $s1

    Cond. branch (I & R format)
    br on equal                 4        beq  $s1, $s2, L        if ($s1==$s2) go to L
    br on not equal             5        bne  $s1, $s2, L        if ($s1 !=$s2) go to L
    set on less than immediate  a        slti $s1, $s2, 100      if ($s2<100) $s1=1; else $s1=0
    set on less than            0 & 2a   slt  $s1, $s2, $s3      if ($s2<$s3) $s1=1; else $s1=0

    Uncond. jump
    jump                        2        j    2500               go to 10000
    jump register               0 & 08   jr   $t1                go to $t1

Review: MIPS R3000 ISA

Instruction Categories
    Load/Store
    Computational
    Jump and Branch
    Floating Point - coprocessor
    Memory Management
    Special

Registers: R0 - R31, PC, HI, LO

3 Instruction Formats: all 32 bits wide

    R format:  OP (6 bits) | rs (5 bits) | rt (5 bits) | rd (5 bits) | shamt (5 bits) | funct (6 bits)
    I format:  OP (6 bits) | rs (5 bits) | rt (5 bits) | 16 bit number
    J format:  OP (6 bits) | 26 bit jump target

MIPS Organization

[Figure: MIPS organization. Processor: register file with 32 registers ($zero - $ra) of 32 bits each, 5-bit src1, src2, and dst addresses, 32-bit src1 data, src2 data, and write data; 32-bit ALU; PC with Fetch (PC = PC+4), Decode, and Exec, plus adders for PC + 4 and the branch offset. Memory: 2^30 words, 32-bit read/write address, 32-bit read data and write data; byte addresses (big Endian) grouped into word addresses 0…0000, 0…0100, 0…1000, 0…1100, up to 1…1100.]

Programming Styles

Procedures (subroutines, functions) allow the programmer to structure programs, making them easier to understand and debug and allowing code to be reused.
Procedures allow the programmer to concentrate on one portion of the code at a time.
    parameters act as barriers between the procedure and the rest of the program and data, allowing the procedure to be passed values (arguments) and to return values (results)

Six Steps in Execution of a Procedure

1. Main routine (caller) places parameters in a place where the procedure (callee) can access them
       $a0 - $a3: four argument registers
2. Caller transfers control to the callee
3. Callee acquires the storage resources needed
4. Callee performs the desired task
5. Callee places the result value in a place where the caller can access it
       $v0 - $v1: two value registers for result values
6. Callee returns control to the caller
       $ra: one return address register to return to the point of origin

Instruction for Calling a Procedure

MIPS procedure call instruction:

    jal ProcAddress        #jump and link

Saves PC+4 in register $ra as the link to the following instruction to set up the procedure return.

    Machine format (J format): op = 3, 26 bit address

Then the procedure can return with just

    jr $ra                 #return

Basic Procedure Flow

For a procedure that computes the GCD of two values i (in $t0) and j (in $t1)

    gcd(i,j);

The caller puts i and j (the parameter values) in $a0 and $a1 and issues a

    jal gcd                #jump to routine gcd

The callee computes the GCD, puts the result in $v0, and returns control to the caller using

    gcd: . . .             #code to compute gcd
         jr $ra            #return

Spilling Registers

What if the callee needs to use more registers than allocated to argument and return values?
    it uses a stack – a last-in-first-out queue
One of the general registers, $sp ($29), is used to address the stack (which "grows" from high address to low address)
    add data onto the stack – push
        $sp = $sp – 4
        data on stack at new $sp
    remove data from the stack – pop
        data from stack at $sp
        $sp = $sp + 4

[Figure: stack drawn from high addr (top) to low addr (bottom), with $sp pointing at the top of stack.]

Compiling a C Leaf Procedure

Leaf procedures are ones that do not call other procedures. Give the MIPS assembler code for

    int leaf_ex (int g, int h, int i, int j) {
        int f;
        f = (g+h) - (i+j);
        return f;
    }

where g, h, i, and j are in $a0, $a1, $a2, $a3

    leaf_ex: addi $sp,$sp,-8     #make stack room
             sw   $t1,4($sp)     #save $t1 on stack
             sw   $t0,0($sp)     #save $t0 on stack
             add  $t0,$a0,$a1
             add  $t1,$a2,$a3
             sub  $v0,$t0,$t1
             lw   $t0,0($sp)     #restore $t0
             lw   $t1,4($sp)     #restore $t1
             addi $sp,$sp,8      #adjust stack ptr
             jr   $ra

Example

Given the following C code

    int main() {
        int y;
        ...
        y = diffofsums(2,3,4,5);
        ...
    }

    int diffofsums(int f, int g, int h, int i) {
        int result;
        result = (f+g)-(h+i);
        return result;
    }

Fill in the blanks in the following translation of this C code into MIPS (registers $s0, $t0, and $t1 must be saved and restored):

    # $s0=y
    main:
        ...
        addi $a0,$0,2            # argument0=2
        addi $a1,$0,3            # argument1=3
        addi $a2,$0,4            # argument2=4
        addi $a3,$0,_            # argument3=5
        jal  ________            # call procedure
        ...

    # $s0=result
    diffofsums:
        addi $sp, $sp, ____      # make space on stack to store three registers
        sw   $s0, _____          # save $s0 on stack
        sw   $t0, _____          # save $t0 on stack
        sw   $t1, _____          # save $t1 on stack
        add  ___, ___, ___       # $t0=f+g
        add  ___, ___, ___       # $t1=h+i
        sub  ___, ___, ___       # result = (f+g)-(h+i)
        add  ___, ___, ___       # put return value in $v0
        lw   $t1, _____          # restore $t1 from stack
        lw   $t0, _____          # restore $t0 from stack
        lw   $s0, _____          # restore $s0 from stack
        addi $sp,$sp, __         # de-allocate stack space
        jr   __                  # return to caller

Example (solution)

    # $s0=y
    main:
        ...
        addi $a0,$0,2            # argument0=2
        addi $a1,$0,3            # argument1=3
        addi $a2,$0,4            # argument2=4
        addi $a3,$0,5            # argument3=5
        jal  diffofsums          # call procedure
        ...

    # $s0=result
    diffofsums:
        addi $sp, $sp, -12       # make space on stack to store three registers
        sw   $s0, 8($sp)         # save $s0 on stack
        sw   $t0, 4($sp)         # save $t0 on stack
        sw   $t1, 0($sp)         # save $t1 on stack
        add  $t0, $a0, $a1       # $t0=f+g
        add  $t1, $a2, $a3       # $t1=h+i
        sub  $s0, $t0, $t1       # result = (f+g)-(h+i)
        add  $v0, $s0, $0        # put return value in $v0
        lw   $t1, 0($sp)         # restore $t1 from stack
        lw   $t0, 4($sp)         # restore $t0 from stack
        lw   $s0, 8($sp)         # restore $s0 from stack
        addi $sp,$sp, 12         # de-allocate stack space
        jr   $ra                 # return to caller

Nested Procedures

What happens to return addresses with nested procedures?

    int rt_1 (int i) {
        if (i == 0) return 0;
        else return rt_2(i-1);
    }

    caller: jal  rt_1
    next:   . . .

    rt_1:   bne  $a0, $zero, to_2
            add  $v0, $zero, $zero
            jr   $ra
    to_2:   addi $a0, $a0, -1
            jal  rt_2
            jr   $ra
    rt_2:   . . .

Nested Procedures Outcome

On the call to rt_1, the return address (next in the caller routine) gets stored in $ra. What happens to the value in $ra (when i != 0) when rt_1 makes a call to rt_2?
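One way to explore the question above is a small C model. This is only a sketch under stated assumptions: the `Machine` struct, the helper names, and the two addresses are hypothetical stand-ins, and `jal` is modeled solely by its effect on `$ra`.

```c
#include <stdint.h>

/* Hypothetical addresses, chosen only for illustration. */
#define ADDR_NEXT          0x00400004u  /* "next:" in the caller            */
#define ADDR_AFTER_JAL_RT2 0x00400024u  /* instruction after "jal rt_2"     */

/* Minimal model: a return-address register and a small word stack. */
typedef struct {
    uint32_t ra;         /* models $ra                                      */
    uint32_t stack[16];  /* models stack memory, one word per slot          */
    int sp;              /* models $sp as an index; the stack grows down    */
} Machine;

void machine_init(Machine *m) { m->ra = 0; m->sp = 16; }

/* jal: only the link step is modeled, $ra = address of next instruction. */
void jal(Machine *m, uint32_t return_addr) { m->ra = return_addr; }

void push(Machine *m, uint32_t word) { m->stack[--m->sp] = word; }
uint32_t pop(Machine *m) { return m->stack[m->sp++]; }

/* Value of $ra when rt_1 reaches its final jr $ra, with no save. */
uint32_t ra_without_save(void) {
    Machine m;
    machine_init(&m);
    jal(&m, ADDR_NEXT);           /* caller: jal rt_1                       */
    jal(&m, ADDR_AFTER_JAL_RT2);  /* rt_1:   jal rt_2 overwrites $ra        */
    return m.ra;                  /* rt_2's jr $ra leaves $ra unchanged     */
}

/* Same sequence, but rt_1 pushes $ra before the nested call. */
uint32_t ra_with_save(void) {
    Machine m;
    machine_init(&m);
    jal(&m, ADDR_NEXT);           /* caller: jal rt_1                       */
    push(&m, m.ra);               /* rt_1 saves its return address          */
    jal(&m, ADDR_AFTER_JAL_RT2);  /* rt_1:   jal rt_2                       */
    m.ra = pop(&m);               /* restore before jr $ra                  */
    return m.ra;
}
```

In this model `ra_without_save()` yields the address inside rt_1 rather than `next`, so rt_1 could never return to its caller, while `ra_with_save()` yields `ADDR_NEXT`.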
Saving the Return Address, Part 1

Nested procedures (i passed in $a0, return value in $v0)

    rt_1: bne  $a0, $zero, to_2
          add  $v0, $zero, $zero
          jr   $ra
    to_2: addi $sp, $sp, -8
          sw   $ra, 4($sp)
          sw   $a0, 0($sp)
          addi $a0, $a0, -1
          jal  rt_2
    bk_2: lw   $a0, 0($sp)
          lw   $ra, 4($sp)
          addi $sp, $sp, 8
          jr   $ra

Save the return address (and arguments) on the stack.

[Figure: stack from high addr to low addr. Before the saves, $sp points at the old TOS. After the saves, the stack holds caller rt addr and old $a0 below the old TOS, and $sp points at old $a0. $ra changes from caller rt addr to bk_2 when jal rt_2 executes.]

Saving the Return Address, Part 2

[Figure: same code and stack after the return from rt_2. The saved values are loaded back, $ra changes from bk_2 to caller rt addr, and $sp returns to the old TOS.]

MIPS Register Convention

    Name         Register Number   Usage             Preserve on call?
    $zero        0                 the constant 0    n.a.
    $v0 - $v1    2-3               returned values   no
    $a0 - $a3    4-7               arguments         yes*
    $t0 - $t7    8-15              temporaries       no
    $s0 - $s7    16-23             saved values      yes
    $t8 - $t9    24-25             temporaries       no
    $gp          28                global pointer    yes
    $sp          29                stack pointer     yes
    $fp          30                frame pointer     yes
    $ra          31                return address    yes

    * The convention used in the book does not preserve $a0-$a3

What is and what is not preserved across a procedure call.

Compiling a Recursive Procedure

A procedure for calculating factorial

    int fact (int n) {
        if (n < 1) return 1;
        else return (n * fact (n-1));
    }

A recursive procedure (one that calls itself!)
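The recursive definition above can be exercised directly in C. The sketch below repeats `fact` as given and adds a hypothetical instrumented copy (`fact_traced`, `fact_max_depth`, and the two counters are illustrative names, not part of the slides) that counts how many invocations are active at once, which is the number of saved-register frames a straightforward translation needs on the stack.

```c
/* fact as written on the slide: any n < 1 returns 1. */
int fact(int n) {
    if (n < 1) return 1;
    else return n * fact(n - 1);
}

/* Hypothetical instrumentation: current and peak recursion depth. */
static int depth = 0;
static int max_depth = 0;

static int fact_traced(int n) {
    int result;
    depth++;                          /* one more invocation is active */
    if (depth > max_depth) max_depth = depth;
    result = (n < 1) ? 1 : n * fact_traced(n - 1);
    depth--;                          /* this invocation is returning  */
    return result;
}

/* Largest number of simultaneously active invocations for fact(n). */
int fact_max_depth(int n) {
    depth = 0;
    max_depth = 0;
    (void)fact_traced(n);
    return max_depth;
}
```

For n = 2 the peak depth is 3: the calls with n = 2, n = 1, and n = 0 are all active when the base case is reached.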
    fact (0) = 1
    fact (1) = 1 * 1 = 1
    fact (2) = 2 * 1 * 1 = 2
    fact (3) = 3 * 2 * 1 * 1 = 6
    fact (4) = 4 * 3 * 2 * 1 * 1 = 24
    ...

Assume n is passed in $a0; result returned in $v0

Compiling a Recursive Procedure

    fact: addi $sp, $sp, -8      #adjust stack pointer
          sw   $ra, 4($sp)       #save return address
          sw   $a0, 0($sp)       #save argument n
          slti $t0, $a0, 1       #test for n < 1
          beq  $t0, $zero, L1    #if n >=1, go to L1
          addi $v0, $zero, 1     #else return 1 in $v0
          addi $sp, $sp, 8       #adjust stack pointer
          jr   $ra               #return to caller
    L1:   addi $a0, $a0, -1      #n >=1, so decrement n
          jal  fact              #call fact with (n-1)
                                 #this is where fact returns
    bk_f: lw   $a0, 0($sp)       #restore argument n
          lw   $ra, 4($sp)       #restore return address
          addi $sp, $sp, 8       #adjust stack pointer
          mul  $v0, $a0, $v0     #$v0 = n * fact(n-1)
          jr   $ra               #return to caller

[Figure: Text and Data segments during recursion. Text: Caller (jal fact) and fact. Data: first invocation of fact ($a0 (n) = 4, $ra), second invocation of fact ($a0 (n) = 3, $ra).]

A Look at the Stack for $a0 = 2, Part 1

Stack state after execution of first encounter with the jal instruction (second call to fact routine with $a0 now holding 1)

    Stack (high addr to low addr): old TOS | caller rt addr | $a0 = 2   <- $sp
    $ra: caller rt addr -> bk_f
    $a0: 2 -> 1

    saved return address to caller routine (i.e., location in the main routine where first call to fact is made) on the stack
    saved original value of $a0 on the stack

A Look at the Stack for $a0 = 2, Part 2

Stack state after execution of second encounter with the jal instruction (third call to fact routine with $a0 now holding 0)

    Stack (high addr to low addr): old TOS | caller rt addr | $a0 = 2 | bk_f | $a0 = 1   <- $sp
    $ra: bk_f
    $a0: 1 -> 0

    saved return address of instruction in caller routine (instruction after jal) on the stack
    saved previous value of $a0 on the stack

A Look at the Stack for $a0 = 2, Part 3

Stack state after execution of first encounter with the first jr instruction ($v0 initialized to 1)

    Stack (high addr to low addr): old TOS | caller rt addr | $a0 = 2 | bk_f | $a0 = 1 | bk_f | $a0 = 0
    $ra: bk_f
    $a0: 0
    $v0: 1

    stack pointer updated to point to third call to fact

A Look at the Stack for $a0 = 2, Part 4

Stack state after execution of first encounter with the second jr instruction (return from fact routine after updating $v0 to 1 * 1)

    Stack (high addr to low addr): old TOS | caller rt addr | $a0 = 2 | bk_f | $a0 = 1 | bk_f | $a0 = 0
    $ra: bk_f
    $a0: 0 -> 1
    $v0: 1 -> 1 * 1

    return address to caller routine (bk_f in fact routine) restored to $ra from the stack
    previous value of $a0 restored from the stack
    stack pointer updated to point to second call to fact

A Look at the Stack for $a0 = 2, Part 5

Stack state after execution of second encounter with the second jr instruction (return from fact routine after updating $v0 to 2 * 1 * 1)

    Stack (high addr to low addr): old TOS | caller rt addr | $a0 = 2 | bk_f | $a0 = 1 | bk_f | $a0 = 0
    $ra: bk_f -> caller rt addr
    $a0: 1 -> 2
    $v0: 1 * 1 -> 2 * 1 * 1

    return address to caller routine (main routine) restored to $ra from the stack
    original value of $a0 restored from the stack
    stack pointer updated to point to first call to fact

Allocating Space on the Stack

The segment of the stack containing a procedure's saved registers and local variables is its procedure frame (aka activation record).
The frame pointer ($fp) points to the first word of the frame of a procedure – providing a stable "base" register for the procedure
    $fp is initialized using $sp on a call and $sp is restored using $fp on a return

    Frame layout (high addr to low addr):
        Saved argument regs (if any)           <- $fp
        Saved return addr
        Saved local regs (if any)
        Local arrays & structures (if any)     <- $sp

Allocating Space on the Heap

Static data segment for constants and other static variables (e.g., arrays)
Dynamic data segment (aka heap) for structures that grow and shrink (e.g., linked lists)
    Allocate space on the heap with malloc() and free it with free()

    Memory map:
        0x 7fff fffc   Stack                   <- $sp
                       Dynamic data (heap)
        0x 1000 8000                           <- $gp
        0x 1000 0000   Static data
        0x 0040 0000   Text (Your code)        <- PC
        0x 0000 0000   Reserved

MIPS Addressing Modes

Register addressing – operand is in a register
Base (displacement) addressing – operand is at the memory location whose address is the sum of a register and a 16-bit constant contained within the instruction
Immediate addressing – operand is a 16-bit constant contained within the instruction
PC-relative addressing – instruction address is the sum of the PC and a 16-bit constant contained within the instruction
Pseudo-direct addressing – instruction address is the 26-bit constant contained within the instruction concatenated with the upper 4 bits of the PC

Addressing Modes Illustrated

    1. Register addressing:       op | rs | rt | rd | funct   register holds the word operand
    2. Base addressing:           op | rs | rt | offset       base register + offset selects a word or byte operand in memory
    3. Immediate addressing:      op | rs | rt | operand
    4. PC-relative addressing:    op | rs | rt | offset       Program Counter (PC) + offset selects the branch destination instruction in memory
    5. Pseudo-direct addressing:  op | jump address           jump address || Program Counter (PC) selects the jump destination instruction in memory

Review: MIPS Instructions, so far

    Instr                       OpC      Example                 Meaning

    Arithmetic (R & I format)
    add                         0 & 20   add  $s1, $s2, $s3      $s1 = $s2 + $s3
    subtract                    0 & 22   sub  $s1, $s2, $s3      $s1 = $s2 - $s3
    add immediate               8        addi $s1, $s2, 4        $s1 = $s2 + 4
    shift left logical          0 & 00   sll  $s1, $s2, 4        $s1 = $s2 << 4
    shift right logical         0 & 02   srl  $s1, $s2, 4        $s1 = $s2 >> 4 (fill with zeros)
    shift right arithmetic      0 & 03   sra  $s1, $s2, 4        $s1 = $s2 >> 4 (fill with sign bit)
    and                         0 & 24   and  $s1, $s2, $s3      $s1 = $s2 & $s3
    or                          0 & 25   or   $s1, $s2, $s3      $s1 = $s2 | $s3
    nor                         0 & 27   nor  $s1, $s2, $s3      $s1 = not ($s2 | $s3)
    and immediate               c        andi $s1, $s2, 0xff00   $s1 = $s2 & 0xff00
    or immediate                d        ori  $s1, $s2, 0xff00   $s1 = $s2 | 0xff00
    load upper immediate        f        lui  $s1, 0xffff        $s1 = 0xffff0000

    Data transfer (I format)
    load word                   23       lw   $s1, 100($s2)      $s1 = Memory($s2+100)
    store word                  2b       sw   $s1, 100($s2)      Memory($s2+100) = $s1
    load byte                   20       lb   $s1, 101($s2)      $s1 = Memory($s2+101)
    store byte                  28       sb   $s1, 101($s2)      Memory($s2+101) = $s1
    load half                   21       lh   $s1, 102($s2)      $s1 = Memory($s2+102)
    store half                  29       sh   $s1, 102($s2)      Memory($s2+102) = $s1

    Cond. branch (I & R format)
    br on equal                 4        beq  $s1, $s2, L        if ($s1==$s2) go to L
    br on not equal             5        bne  $s1, $s2, L        if ($s1 !=$s2) go to L
    set on less than immediate  a        slti $s1, $s2, 100      if ($s2<100) $s1=1; else $s1=0
    set on less than            0 & 2a   slt  $s1, $s2, $s3      if ($s2<$s3) $s1=1; else $s1=0

    Uncond. jump
    jump                        2        j    2500               go to 10000
    jump register               0 & 08   jr   $t1                go to $t1
    jump and link               3        jal  2500               go to 10000; $ra=PC+4
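The R, I, and J formats used throughout this part can be sketched as bit-packing helpers in C. This is a minimal illustration, not an assembler: the function names are hypothetical, each field is simply masked to its width, and `jump_destination` assumes the 26-bit field is a word address that is shifted left by 2 before being joined with the upper 4 bits of the PC, which is consistent with `j 2500` going to 10000.

```c
#include <stdint.h>

/* R format: op(6) | rs(5) | rt(5) | rd(5) | shamt(5) | funct(6) */
uint32_t encode_r(uint32_t op, uint32_t rs, uint32_t rt,
                  uint32_t rd, uint32_t shamt, uint32_t funct) {
    return ((op    & 0x3Fu) << 26) |
           ((rs    & 0x1Fu) << 21) |
           ((rt    & 0x1Fu) << 16) |
           ((rd    & 0x1Fu) << 11) |
           ((shamt & 0x1Fu) <<  6) |
            (funct & 0x3Fu);
}

/* I format: op(6) | rs(5) | rt(5) | 16-bit number */
uint32_t encode_i(uint32_t op, uint32_t rs, uint32_t rt, uint32_t imm16) {
    return ((op & 0x3Fu) << 26) |
           ((rs & 0x1Fu) << 21) |
           ((rt & 0x1Fu) << 16) |
            (imm16 & 0xFFFFu);
}

/* J format: op(6) | 26-bit jump target */
uint32_t encode_j(uint32_t op, uint32_t target26) {
    return ((op & 0x3Fu) << 26) | (target26 & 0x03FFFFFFu);
}

/* Pseudo-direct addressing: upper 4 bits of the PC joined with the
   26-bit word address shifted left by 2 to form a byte address. */
uint32_t jump_destination(uint32_t pc, uint32_t target26) {
    return (pc & 0xF0000000u) | ((target26 & 0x03FFFFFFu) << 2);
}
```

With register numbers $t0 = 8, $t1 = 9, $t2 = 10, and $s0 = 16, `encode_r(0, 9, 10, 8, 0, 0x24)` gives 0x012A4024 for `and $t0, $t1, $t2`, and `encode_i(0xF, 0, 8, 0xAAAA)` gives 0x3C08AAAA for `lui $t0, 0xaaaa`.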