COM181 Computer Hardware
Lecture 4: The MIPS CPU

"Adapted from Computer Organization and Design, 4th Edition, Patterson & Hennessy, © 2008." This material may not be copied or distributed for commercial purposes without express written permission of the copyright holders. Also drawn from the work of Mary Jane Irwin (www.cse.psu.edu/~mji).

You should read the MIPS handout after this lecture. We will also revisit the MIPS instruction set in tutorials and in the next lecture.

08/08/13 www.eej.ulster.ac.uk/~ian/modules/COM181/files/COM181_L4.pptx

Assembly language
- High-level language, e.g. a = b + c;
- Machine language, e.g. 000000 01000 01001 01010 00000 100001
- Assembly language sits between high-level language and machine language.
- Each statement defines one machine operation.
- It directly represents the architecture, so if the hardware chip can't multiply there will be no multiply statement (you can still multiply by successive addition!).
- MIPS has very limited, simple instructions. It has 32 registers and can add, subtract, AND, OR, exclusive-OR and shift.
- There is one instruction to move (copy) 32-bit data from memory to a register and one instruction to move (copy) 32-bit data from a register to memory. You can't just add the contents of one memory location to another: both values have to be brought into registers to do the addition.
- An assembler program translates assembly language into machine language.

Instruction set architectures (ISA): types
CISC (complex instruction set computer): the traditional computer architecture, with unique instructions for as many operations as possible.
  Advantages                                         Disadvantages
  Each instruction can do more work                  More complex hardware circuits
  Programs use less memory                           More expensive to develop and build
  Easier to program directly or to write compilers   Usually slower
RISC (reduced instruction set computer): look at actual instruction use and focus on the most frequent ones.
  Advantages                                         Disadvantages
  Easier to learn                                    Larger, more complex programs
  Simpler circuits                                   Harder to program
  Cheaper and more reliable to design and build      Depends on compiler for optimization
  Faster, quicker to implement when foundry improves silicon processes

Stored program
Stored program concept
- Instructions and data are stored in the same memory.
- Instructions are simply another kind of data.
- Instructions are executed sequentially unless a branch sends execution elsewhere or the program stops.
Fetch-execute cycle
- Instruction fetch: get the next instruction from memory
- Decode: figure out what operation to perform on which operands
- Operand fetch: get the operand values
- Execute: perform the operation
- Store result
Repeat until done.

Instructions
Any instruction set must perform a basic set of operations. It may have more complex combinations or special operations as well.
Types of operations
- Data transfer: load, store
- Arithmetic: add, subtract, multiply, divide
- Logic: and, or, xor, complement
- Compare: equal, not equal, greater than, less than
- Branch/jump: change execution order

MIPS
- MIPS stands for "Microprocessor without Interlocked Pipeline Stages". The name is a pun on the acronym for "millions of instructions per second".
- RISC architecture developed in the mid-1980s
- Extended through several versions (current: MIPS IV)
- Used in many "embedded" applications
  - Game machines: Sony, Nintendo
  - TV set top boxes: LSI Logic shipped 7 million in 2001
  - Routers: Cisco
  - Laser printers
  - PDAs
- High-performance workstations: Silicon Graphics (Lord of the Rings, other films)
- "Over 100 million sold"

MIPS: machine model notes
Main memory
- data: 32-bit
- address: range from 0x00000000 to 0xFFFFFFFF (upper half of range reserved)
Processor
- 32 registers ($r0 to $r31, though $r0 is read-only and holds zero). These hold the data that operations read and write, and are faster than main memory.
- Load-store architecture: memory is accessed only through load and store instructions.
    load:  register <--- data from memory (lw instruction)
    store: register ---> data to memory (sw instruction)
  The amount of data in bytes (1, 2, 4, 8) depends on the instruction (we'll stick to 32 bits, i.e. 4 bytes at a time).
  All other operations use only registers or immediate values (contained in the instruction).
  Design Principle #2: "Smaller is faster."
- 16 floating point registers (ignore these)
- ALU: arithmetic-logic unit; performs operations on values in registers
- Control: determines how operations are executed ("computer within computer")

MIPS: instructions
The ALU performs arithmetic and logical operations (instructions). An instruction specifies:
1. The operation to perform.
2. The first operand (usually in a register).
3. The second operand (usually in a register).
4. The register that receives the result.
(We call MIPS a 3-address machine.)
MIPS has about 111 different instructions (we will look at about a dozen). All are 32 bits long and use one of 3 different formats (R-type, I-type and J-type):
- R-types all have three register addresses, for items 2, 3 and 4 above.
- I-types have 2 registers and a 16-bit constant (number).
- J-type is a simple JUMP format with a 26-bit address built into the 32-bit instruction (only the jump instructions j and jal use it).

MIPS: instruction example
Example: add unsigned
  addu $r10,$r8,$r9   # add 2 numbers     (this is assembler)
Syntax
- 3-operand instructions: all arithmetic/logical operations
- Operands are separated by commas.
- Design Principle #1: "Simplicity favors regularity."
- One operation per instruction, one instruction per line
    operation: addu
    registers: sources $r8, $r9; target $r10
    comment:   # add 2 numbers (a comment starts with # and ends at the end of the line)
Semantics
  $r10 = $r8 + $r9;        (what humans understand)
  R[10] <-- R[8] + R[9]    (RTL, register transfer language: an alternative notation humans understand)

MIPS: instruction fields
R-types use three registers and their format is always the same. The 32 bits are split into 6 fields of varying lengths: 6 bits, then 4 x 5 bits, then another 6 bits. (I-types have 4 fields: 6, 5, 5 and 16 bits.)
  addu $r10,$r8,$r9   # add 2 numbers
  hex:    0x01095021
  binary: 0000 0001 0000 1001 0101 0000 0010 0001

  field:  opcode   $rs      $rt      $rd      shamt    function
  bits:   b31-26   b25-21   b20-16   b15-11   b10-6    b5-0
  value:  000000   01000    01001    01010    00000    100001

R-types all have an opcode of six zeroes, with the actual function code in the rightmost 6 bits. (I-types use the opcode field and have no function field.)

View from 30,000 Feet
[Diagram of MIPS; some parts not shown.]
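The R-type field layout described in "MIPS: instruction fields" can be checked with a few lines of Python. This is a minimal sketch, not part of the original slides; the function names encode_r_type and decode_r_type are made up for illustration.

```python
def encode_r_type(rs, rt, rd, shamt, funct, opcode=0):
    """Pack the six R-type fields (6, 5, 5, 5, 5, 6 bits) into one 32-bit word."""
    return (
        (opcode & 0x3F) << 26
        | (rs & 0x1F) << 21
        | (rt & 0x1F) << 16
        | (rd & 0x1F) << 11
        | (shamt & 0x1F) << 6
        | (funct & 0x3F)
    )


def decode_r_type(word):
    """Split a 32-bit word back into its R-type fields."""
    return {
        "opcode": (word >> 26) & 0x3F,
        "rs": (word >> 21) & 0x1F,
        "rt": (word >> 16) & 0x1F,
        "rd": (word >> 11) & 0x1F,
        "shamt": (word >> 6) & 0x1F,
        "funct": word & 0x3F,
    }


# addu $r10,$r8,$r9: sources $r8 and $r9, target $r10, function code 100001
word = encode_r_type(rs=8, rt=9, rd=10, shamt=0, funct=0b100001)
print(f"0x{word:08x}")       # prints 0x01095021
print(decode_r_type(word))
```

Note that the register order in the machine word (rs, rt, rd) differs from the order in the assembler statement, where the target register comes first.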
The MIPS CPU is described in the textbook; note how the diagram below relates to lecture 3.
[Diagram: pipelined MIPS datapath. Labelled parts: PC, Instruction Memory, Register File, Sign Extend, Shift left 2, two adders, ALU, ALU control, Data Memory, Control, Branch/PCSrc and the Forward Unit, with pipeline registers IF/ID, ID/EX, EX/MEM and MEM/WB.]

Often you must manage complexity, using abstraction and partitioning to reduce systems to sizes that the human brain can cope with.
- When programming a MIPS CPU it is enough to maintain a "Programmer's Model" of the CPU.
- The CPU is designed as a RISC (Reduced Instruction Set Computer) machine. This suits the hardware implementation, but does not necessarily suit the humans programming it! Software tools help.
- A RISC machine has a fixed-length instruction (32 bits in the simple MIPS).
- A RISC machine has a limited number of addressing modes.
- A RISC machine has a limited number of operations (small instruction set).
- A RISC machine has, typically, a register bank and uses load/store instructions (only) to access main memory.

MIPS machines have three types of instruction: R-type for arithmetic instructions that use registers, I-type where the number needed is available immediately (inside the instruction), and J-type for control (jumps etc.). There are also a few others...
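Because every instruction is 32 bits with the opcode in the top 6 bits, the format of an instruction word can be read straight from that field. The sketch below is an illustration only and covers just the subset used in this lecture: opcode 0 is R-type, opcodes 2 (j) and 3 (jal) are J-type, and everything else is treated as I-type. The name instruction_format is hypothetical, and a real decoder would also have to handle coprocessor and other special opcodes.

```python
def instruction_format(word):
    """Classify a 32-bit MIPS instruction word as R, I or J (lecture subset only)."""
    opcode = (word >> 26) & 0x3F   # top 6 bits
    if opcode == 0:
        return "R"                 # operation chosen by the function field
    if opcode in (2, 3):
        return "J"                 # j and jal carry a 26-bit target
    return "I"                     # assumption: all remaining opcodes are I-type


print(instruction_format(0x01095021))  # addu $r10,$r8,$r9 -> R
print(instruction_format(0x8E480018))  # lw $t0, 24($s2)   -> I
print(instruction_format(0x080009C4))  # j 2500            -> J
```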
Example of some MIPS assembly language arithmetic statements
  add $t0, $s1, $s2
  sub $t0, $s1, $s2
- Each arithmetic instruction performs only one operation.
- Each arithmetic instruction specifies exactly three operands:
    destination <- source1 op source2
- Operand order is fixed (the destination is specified first).
- The operands are contained in the datapath's register file ($t0, $s1, $s2).
The registers above have been given symbolic names; the actual numbered registers run from $0 to $31. We use software to convert the statements above into 32-bit instructions. The ASSEMBLER program can also convert symbols into numbers.

MIPS Register File
Operands of arithmetic instructions must come from a limited number of special locations contained in the datapath's register file.
- Thirty-two 32-bit registers (2^5 = 32 locations, so register addresses are 5 bits)
- Two read ports (src1 addr and src2 addr in; src1 data and src2 data out, 32 bits each)
- One write port (dst addr and 32-bit write data)
Registers are
- Fast: smaller is faster, and make the common case fast
- Easy for a compiler to use: e.g., in (A*B) - (C*D) - (E*F) the multiplies can be done in any order
- Good for code density, since a register is named with fewer bits than a memory location
Register addresses are indicated by using $.

Naming Conventions for Registers
  Number   Name        Use
  0        $zero       constant 0 (hardware)
  1        $at         reserved for assembler
  2-3      $v0-$v1     expression evaluation & function results
  4-7      $a0-$a3     arguments
  8-15     $t0-$t7     temporary: caller saves (callee can clobber)
  16-23    $s0-$s7     callee saves (caller can clobber)
  24-25    $t8-$t9     temporary (cont'd)
  26-27    $k0-$k1     reserved for OS kernel
  28       $gp         pointer to global area
  29       $sp         stack pointer
  30       $fp         frame pointer
  31       $ra         return address (hardware)

Registers vs. Memory
- Arithmetic instruction operands must be in registers, and only thirty-two registers are provided.
- The compiler associates variables with registers.
- What about programs with lots of variables?
[Diagram: Processor (Control, Datapath), Memory, and Devices (Input, Output, Network).]

Processor - Memory Interconnections
- Memory is a large, single-dimensional array.
- An address acts as the index into the memory array.
[Diagram: the processor sends a 32-bit read/write address (the word address of the data) to memory and exchanges 32-bit read data and write data (the data stored in the memory). Memory is 32 bits wide, with words at addresses 0, 4, 8, ...; it holds 2^32 bytes (4 GB), i.e. 2^30 words (1 GW), with 4 bytes = 1 word.]

Accessing Memory
MIPS has two basic data transfer instructions for accessing memory (assume $s3 holds 24 in decimal):
  lw $t0, 4($s3)    # load word from memory (address 28)
  sw $t0, 8($s3)    # store word to memory (address 32)
The data transfer instruction must specify
- where in memory to read from (load) or write to (store): the memory address
- where in the register file to write to (load) or read from (store): the register destination (source)
The memory address is formed by summing the constant portion of the instruction and the contents of the second register.

Compiling with Loads and Stores
Assuming variable b is stored in $s2 and that the base address of array A is in $s3, what is the MIPS assembly code for the C statement
  A[8] = A[2] - b
Array layout in memory:
  ...
  A[3]   $s3+12
  A[2]   $s3+8
  A[1]   $s3+4
  A[0]   $s3
Answer:
  lw  $t0, 8($s3)
  sub $t0, $t0, $s2
  sw  $t0, 32($s3)

Compiling with a Variable Array Index
Assuming that the base address of array A is in register $s4, and variables b, c, and i are in $s1, $s2, and $s3, respectively, what is the MIPS assembly code for the C statement
  c = A[i] - b
Array layout in memory:
  ...
  A[3]   $s4+12
  A[2]   $s4+8
  A[1]   $s4+4
  A[0]   $s4
Answer:
  add $t1, $s3, $s3    # array index i is in $s3
  add $t1, $t1, $t1    # temp reg $t1 holds 4*i
  add $t1, $t1, $s4    # addr of A[i] now in $t1
  lw  $t0, 0($t1)
  sub $s2, $t0, $s1

Dealing with Constants
Small constants are used quite frequently (50% of operands in many common programs), e.g.
  A = A + 5;
  B = B + 1;
  C = C - 18;
Solutions? Why not?
- Put "typical constants" in memory and load them
- Create hard-wired registers (like $zero) for constants like 1, 2, 4, 10, ...
How do we make this work? How do we make the common case fast?

Constant (or Immediate) Operands
Include constants inside arithmetic instructions.
- This is much faster than loading them from memory separately (they come in from memory with the instruction itself).
MIPS immediate instructions
  addi $s3, $s3, 4    # $s3 = $s3 + 4
There is no subi instruction. Can you guess why not?
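The load/store sequences compiled above for A[8] = A[2] - b and c = A[i] - b can be traced with a small Python model. This is only a sketch under stated assumptions: memory is modelled as a dictionary keyed by byte address, the base address 0x1000, the array contents and b = 7 are invented test values, and the helper names are hypothetical.

```python
def load_word(memory, base, offset):
    """lw: read the word at byte address base + offset."""
    return memory[base + offset]


def store_word(memory, base, offset, value):
    """sw: write a word to byte address base + offset."""
    memory[base + offset] = value


def element_address(base, index):
    """Address of A[index], computed with adds only, as in the slide."""
    t1 = index + index   # add $t1, $s3, $s3   -> 2*i
    t1 = t1 + t1         # add $t1, $t1, $t1   -> 4*i
    return t1 + base     # add $t1, $t1, $s4   -> address of A[i]


A_BASE = 0x1000                                   # assumed base address of A
values = [10, 20, 30, 40, 50, 60, 70, 80, 90]     # assumed contents of A[0..8]
memory = {A_BASE + 4 * k: v for k, v in enumerate(values)}
b = 7                                             # assumed value of b

# A[8] = A[2] - b
t0 = load_word(memory, A_BASE, 8)      # lw  $t0, 8($s3)
t0 = t0 - b                            # sub $t0, $t0, $s2
store_word(memory, A_BASE, 32, t0)     # sw  $t0, 32($s3)
print(memory[A_BASE + 32])             # prints 23

# c = A[i] - b with i = 3
c = load_word(memory, element_address(A_BASE, 3), 0) - b
print(c)                               # prints 33
```

The offsets 8 and 32 are byte offsets: each word is 4 bytes, so A[2] lives at base + 8 and A[8] at base + 32.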
MIPS Instructions, so far
  Category        Instr           Example              Meaning
  Arithmetic      add             add $s1, $s2, $s3    $s1 = $s2 + $s3
                  subtract        sub $s1, $s2, $s3    $s1 = $s2 - $s3
                  add immediate   addi $s1, $s2, 4     $s1 = $s2 + 4
  Data transfer   load word       lw $s1, 32($s2)      $s1 = Memory($s2+32)
                  store word      sw $s1, 32($s2)      Memory($s2+32) = $s1

Machine Language - Arithmetic Instruction
Instructions, like registers and words of data, are also 32 bits long.
Example: add $t0, $s1, $s2
Registers have numbers: $t0=$8, $s1=$17, $s2=$18
Instruction format:
  op       rs      rt      rd      shamt   funct
  000000   10001   10010   01000   00000   100000
Can you guess what the field names stand for?

MIPS Instruction Fields
  op       rs       rt       rd       shamt    funct
  6 bits   5 bits   5 bits   5 bits   5 bits   6 bits   = 32 bits
- op: opcode indicating the operation to be performed
- rs: address of the first register source operand
- rt: address of the second register source operand
- rd: the register destination address
- shamt: shift amount (for shift instructions)
- funct: function code that selects the specific variant of the operation specified in the opcode field

Machine Language - Load Instruction
Consider the load-word and store-word instructions.
- What would the regularity principle have us do?
- But . . . good design demands compromise.
- Introduce a new type of instruction format: I-type for data transfer instructions (the previous format was R-type, for register).
Example: lw $t0, 24($s2)
  op                rs      rt      16 bit number
  35 (23 hex)       18      8       24
  100011            10010   01000   0000000000011000
Where's the compromise?

Memory Address Location
Example: lw $t0, 24($s2)
With $s2 holding 0x12004094, the address is 24 (decimal) + $s2:
    . . . 1001 0100
  + . . . 0001 1000
  = . . . 1010 1100   = 0x120040ac
The word stored at 0x120040ac (shown as 0x00000002 in the diagram) is loaded into $t0.
Note that the offset can be positive or negative.
[Diagram: memory drawn as a column of 32-bit data words with word addresses (hex) 0x00000000, 0x00000004, 0x00000008, 0x0000000c, ... up to 0xffffffff; $s2 points at 0x12004094.]

Machine Language - Store Instruction
Example: sw $t0, 24($s2)
  op       rs      rt      16 bit number
  43       18      8       24
  101011   10010   01000   0000000000011000
A 16-bit offset means access is limited to memory locations within +2^13 - 1 to -2^13 (~8,192) words, i.e. +2^15 - 1 to -2^15 (~32,768) bytes, of the address in the base register $s2. The offset is a 2's complement number (1 sign bit + 15 magnitude bits).

Machine Language - Immediate Instructions
What instruction format is used for addi?
  addi $s3, $s3, 4    # $s3 = $s3 + 4
Machine format (I format):
  op   rs   rt   16 bit immediate
  8    19   19   4
The constant is kept inside the instruction itself!
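The I-type examples above (lw, sw and addi) all share one field layout, so a single packing routine covers them. The Python below is a hedged sketch rather than anything from the slides; encode_i_type and sign_extend_16 are illustrative names. It also reproduces the address calculation 0x12004094 + 24 from the lw example.

```python
def encode_i_type(opcode, rs, rt, immediate):
    """Pack op (6 bits), rs (5), rt (5) and a signed 16-bit immediate into a word."""
    if not -(1 << 15) <= immediate <= (1 << 15) - 1:
        raise ValueError("immediate does not fit in 16 signed bits")
    return (
        (opcode & 0x3F) << 26
        | (rs & 0x1F) << 21
        | (rt & 0x1F) << 16
        | (immediate & 0xFFFF)      # 2's complement in the low 16 bits
    )


def sign_extend_16(value):
    """Interpret the low 16 bits of value as a 2's complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


lw = encode_i_type(35, 18, 8, 24)     # lw   $t0, 24($s2)
sw = encode_i_type(43, 18, 8, 24)     # sw   $t0, 24($s2)
addi = encode_i_type(8, 19, 19, 4)    # addi $s3, $s3, 4
print(f"0x{lw:08x} 0x{sw:08x} 0x{addi:08x}")
# prints 0x8e480018 0xae480018 0x22730004

# Effective address for lw $t0, 24($s2) when $s2 holds 0x12004094
address = 0x12004094 + sign_extend_16(lw & 0xFFFF)
print(f"0x{address:08x}")             # prints 0x120040ac
```

A negative constant is stored the same way: an immediate of -4 occupies the low 16 bits as 0xfffc and sign-extends back to -4, which is one reason a separate subtract-immediate instruction is not needed.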
Because the constant is carried inside the instruction, addi uses the I format (Immediate format). This limits immediate values to the range +2^15 - 1 to -2^15.

Instruction Format Encoding
We can reduce the complexity of having multiple formats by keeping them as similar as possible.
- The first three fields are the same in R-type and I-type.
- Each format has a distinct set of values in the op field.
  Instr   Frmt   op       rs    rt    rd    shamt   funct    address
  add     R      0        reg   reg   reg   0       32 ten   NA
  sub     R      0        reg   reg   reg   0       34 ten   NA
  addi    I      8 ten    reg   reg   NA    NA      NA       constant
  lw      I      35 ten   reg   reg   NA    NA      NA       address
  sw      I      43 ten   reg   reg   NA    NA      NA       address

Assembling Code
Remember the assembler code we compiled last lecture for the C statement
  A[8] = A[2] - b
  lw  $t0, 8($s3)      # load A[2] into $t0
  sub $t0, $t0, $s2    # subtract b from A[2]
  sw  $t0, 32($s3)     # store result in A[8]
Assemble the MIPS object code for these three instructions (decimal is fine):
        op   rs   rt   rd   shamt   funct   address
  lw    35   19   8                         8
  sub   0    8    18   8    0       34
  sw    43   19   8                         32

Review: MIPS Instructions, so far
Op codes in decimal; for R format the entry is op & funct.
  Category                  Instr           Op Code   Example               Meaning
  Arithmetic (R format)     add             0 & 32    add $s1, $s2, $s3     $s1 = $s2 + $s3
                            subtract        0 & 34    sub $s1, $s2, $s3     $s1 = $s2 - $s3
  Arithmetic (I format)     add immediate   8         addi $s1, $s2, 4      $s1 = $s2 + 4
  Data transfer (I format)  load word       35        lw $s1, 100($s2)      $s1 = Memory($s2+100)
                            store word      43        sw $s1, 100($s2)      Memory($s2+100) = $s1

MIPS Operand Addressing Modes Summary
1. Register addressing: the operand is in a register.
     [op | rs | rt | rd | funct] -> register holding a word operand
2. Base (displacement) addressing: the operand's address in memory is the sum of a register and a 16-bit constant contained within the instruction.
     [op | rs | rt | offset]; base register + offset -> word or byte operand in memory
3. Immediate addressing: the operand is a 16-bit constant contained within the instruction.
     [op | rs | rt | operand]

MIPS Instruction Addressing Modes Summary
4. PC-relative addressing: the instruction's address in memory is the sum of the PC and a 16-bit constant contained within the instruction.
     [op | rs | rt | offset]; Program Counter (PC) + offset -> branch destination instruction in memory
5. Pseudo-direct addressing: the instruction's address in memory is the 26-bit constant contained within the instruction concatenated with the upper 4 bits of the PC.
     [op | jump address]; upper bits of PC || jump address -> jump destination instruction in memory

Review: MIPS Instructions, so far
Op codes in hex; for R format the entry is op & funct.
  Category                   Instr                    OpC      Example                 Meaning
  Arithmetic (R & I format)  add                      0 & 20   add $s1, $s2, $s3       $s1 = $s2 + $s3
                             subtract                 0 & 22   sub $s1, $s2, $s3       $s1 = $s2 - $s3
                             add immediate            8        addi $s1, $s2, 4        $s1 = $s2 + 4
                             shift left logical       0 & 00   sll $s1, $s2, 4         $s1 = $s2 << 4
                             shift right logical      0 & 02   srl $s1, $s2, 4         $s1 = $s2 >> 4 (fill with zeros)
                             shift right arithmetic   0 & 03   sra $s1, $s2, 4         $s1 = $s2 >> 4 (fill with sign bit)
                             and                      0 & 24   and $s1, $s2, $s3       $s1 = $s2 & $s3
                             or                       0 & 25   or $s1, $s2, $s3        $s1 = $s2 | $s3
                             nor                      0 & 27   nor $s1, $s2, $s3       $s1 = not ($s2 | $s3)
                             and immediate            c        andi $s1, $s2, 0xff00   $s1 = $s2 & 0xff00
                             or immediate             d        ori $s1, $s2, 0xff00    $s1 = $s2 | 0xff00
                             load upper immediate     f        lui $s1, 0xffff         $s1 = 0xffff0000

  Category                     Instr                        OpC      Example               Meaning
  Data transfer (I format)     load word                    23       lw $s1, 100($s2)      $s1 = Memory($s2+100)
                               store word                   2b       sw $s1, 100($s2)      Memory($s2+100) = $s1
                               load byte                    20       lb $s1, 101($s2)      $s1 = Memory($s2+101)
                               store byte                   28       sb $s1, 101($s2)      Memory($s2+101) = $s1
                               load half                    21       lh $s1, 102($s2)      $s1 = Memory($s2+102)
                               store half                   29       sh $s1, 102($s2)      Memory($s2+102) = $s1
  Cond. branch (I & R format)  br on equal                  4        beq $s1, $s2, L       if ($s1==$s2) go to L
                               br on not equal              5        bne $s1, $s2, L       if ($s1 !=$s2) go to L
                               set on less than immediate   a        slti $s1, $s2, 100    if ($s2<100) $s1=1; else $s1=0
                               set on less than             0 & 2a   slt $s1, $s2, $s3     if ($s2<$s3) $s1=1; else $s1=0
  Uncond. jump                 jump                         2        j 2500                go to 10000
                               jump register                0 & 08   jr $t1                go to $t1
                               jump and link                3        jal 2500              go to 10000; $ra=PC+4

Review: MIPS R3000 ISA
Instruction categories
- Load/Store
- Computational
- Jump and Branch
- Floating Point (coprocessor)
- Memory Management
- Special
Registers: R0 - R31, PC, HI, LO
3 instruction formats, all 32 bits wide:
  R format:  OP (6 bits) | rs (5 bits) | rt (5 bits) | rd (5 bits) | shamt (5 bits) | funct (6 bits)
  I format:  OP | rs | rt | 16 bit number
  J format:  OP | 26 bit jump target

RISC Design Principles Review
- Simplicity favors regularity
  - fixed size instructions (32 bits)
  - small number of instruction formats
- Smaller is faster
  - limited instruction set
  - limited number of registers in register file
  - limited number of addressing modes
- Good design demands good compromises
  - three instruction formats
- Make the common case fast
  - arithmetic operands from the register file (load-store machine)
  - allow instructions to contain immediate operands

The Code Translation Hierarchy
C program -> compiler -> assembly code

Compiler
Transforms the C program into an assembly language program.
Advantages of high-level languages
- many fewer lines of code
- easier to understand and debug
- ...
Today's optimizing compilers can produce assembly code nearly as good as an assembly language programming expert, and often better for large programs (smaller code size, faster execution).

The Code Translation Hierarchy
C program -> compiler -> assembly code -> assembler -> object code

Assembler
Does a syntactic check of the code (i.e., did you type it in correctly) and then transforms the symbolic assembler code into object (machine) code.
Advantages of assembler
- much easier than remembering instructions' binary codes
- can use labels for addresses, and let the assembler do the arithmetic
- can use pseudo-instructions, e.g., "move $t0, $t1" exists only in assembler (it would be implemented using "add $t0,$t1,$zero")
When considering performance, you should count instructions executed, not code size.

The Two Main Tasks of the Assembler
1. Builds a symbol table which holds the symbolic names (labels) and their corresponding addresses.
   - A label is local if it is used only within the file where it is defined. Labels are local by default.
   - A label is external (global) if it refers to code or data in another file or if it is referenced from another file.
   - Global labels must be explicitly declared global (e.g., .globl main).
2. Translates each assembly language statement into object (machine) code by "assembling" the numeric equivalents of the opcodes, register specifiers, shift amounts, and jump/branch targets/offsets.

MIPS (spim) Memory Allocation
  Address (hex)   Region
  ffff fffc       Memory-mapped I/O; kernel code & data   (top of memory)
  7fff fffc       Stack                                   <- $sp
                  Dynamic data
  1000 8000       (inside static data)                    <- $gp
  1000 0000       Static data
  0040 0000       Text segment                            <- PC
  0000 0000       Reserved
The address space covers 2^30 words.

Other Tasks of the Assembler
- Converts pseudo-instructions to legal assembly code; register $at is reserved for the assembler to do this.
- Converts branches to far away locations into a branch followed by a jump.
- Converts instructions with large immediates into a lui followed by an ori.
- Converts numbers specified in decimal and hexadecimal into their binary equivalents, and characters into their ASCII equivalents.
- Deals with data layout directives (e.g., .asciiz).
- Expands macros (frequently used sequences of instructions).

Typical Object File Pieces
- Object file header: size and position of the following pieces of the file
- Text (code) segment (.text): assembled object (machine) code
- Data segment (.data): data accompanying the code
  - static data: allocated throughout the program
  - dynamic data: grows and shrinks as needed
- Relocation information: identifies instructions (data) that use (are located at) absolute addresses, i.e. addresses not relative to a register (including the PC)
  - on MIPS only j, jal, and some loads and stores (e.g., lw $t1, 100($zero)) use absolute addresses
- Symbol table: global labels with their addresses (if defined in this code segment) or without (if defined external to this code segment)
- Debugging information

An Example
Symbol table:
  Gbl?   Symbol   Address
         str      1000 0000
         cr       1000 000b
  yes    main     0040 0000
         loop     0040 000c
         brnc     0040 001c
         done     0040 0024
  yes    printf   ???? ????
Relocation info:
  Address     Data/Instr
  1000 0000   str
  1000 000b   cr
  0040 0018   j loop
  0040 0020   j loop
  0040 0024   jal printf
Source code:
        .data
        .align 0
  str:  .asciiz "The answer is "
  cr:   .asciiz "\n"
        .text
        .align 2
        .globl main
        .globl printf
  main: ori  $2, $0, 5
        syscall
        move $8, $2
  loop: beq  $8, $9, done
        blt  $8, $9, brnc
        sub  $8, $8, $9
        j    loop
  brnc: sub  $9, $9, $8
        j    loop
  done: jal  printf

The Code Translation Hierarchy
C program -> compiler -> assembly code -> assembler -> object code (main text segment) + library routines (printf text segment) -> linker -> machine code executable

Linker
Takes all of the independently assembled code segments and "stitches" (links) them together. It is faster to recompile and reassemble a patched segment than it is to recompile and reassemble the entire program.
1. Decides on the memory allocation pattern for the code and data segments of each module. Remember, modules were assembled in isolation, so each has assumed its code's starting location is 0x0040 0000 and its static data starting location is 0x1000 0000.
2. Relocates absolute addresses to reflect the new starting location of the code segment and its data segment.
3. Uses the symbol table information to resolve all remaining undefined labels: branches, jumps, and data addresses to/in external modules.
The linker produces an executable file.

Linker Code Schematic
[Diagram: the object file contains "main: ... jal ????" together with relocation info (call, printf); the C library contains "printf: ...". The linker combines them into an executable file holding "main: ... jal printf" followed by "printf: ...".]
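The symbol table and relocation list in "An Example" can be reproduced with a short two-step pass over the text segment. The Python below is a simplified sketch, not how a real assembler or spim is implemented: it assumes, as the slide's addresses do, that every statement (including the pseudo-instructions move and blt) occupies exactly one 4-byte word, and the function names are invented for illustration.

```python
TEXT_BASE = 0x00400000   # start of the text segment, as on the slide

# (label or None, statement) pairs copied from the example program
program = [
    ("main", "ori $2, $0, 5"),
    (None,   "syscall"),
    (None,   "move $8, $2"),
    ("loop", "beq $8, $9, done"),
    (None,   "blt $8, $9, brnc"),
    (None,   "sub $8, $8, $9"),
    (None,   "j loop"),
    ("brnc", "sub $9, $9, $8"),
    (None,   "j loop"),
    ("done", "jal printf"),
]


def build_symbol_table(program, base=TEXT_BASE):
    """Task 1: record the address of every label."""
    table = {}
    for index, (label, _) in enumerate(program):
        if label is not None:
            table[label] = base + 4 * index
    return table


def find_relocations(program, base=TEXT_BASE):
    """List the j/jal statements, which use absolute addresses."""
    relocations = []
    for index, (_, text) in enumerate(program):
        mnemonic, _, target = text.partition(" ")
        if mnemonic in ("j", "jal"):
            relocations.append((base + 4 * index, target.strip()))
    return relocations


symbols = build_symbol_table(program)
relocations = find_relocations(program)
# Labels that are used but not defined here must be resolved by the linker.
unresolved = sorted({target for _, target in relocations if target not in symbols})

for name, address in symbols.items():
    print(f"{name:5s} {address:08x}")
for address, target in relocations:
    print(f"{address:08x} -> {target}")
print("left for the linker:", unresolved)
```

The label addresses come out as main 00400000, loop 0040000c, brnc 0040001c and done 00400024, matching the slide, and printf is the one symbol left for the linker to resolve from the C library.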
Linking Two Object Files
[Diagram: File 1 (Hdr, Txtseg, Dseg, Reloc, Smtbl, Dbg) + File 2 (Hdr, Txtseg, Dseg, Reloc, Smtbl, Dbg) -> Executable (Hdr, Txtseg, Dseg, Reloc).]

The Code Translation Hierarchy
C program -> compiler -> assembly code -> assembler -> object code + library routines -> linker -> machine code executable -> loader -> memory

Loader
Loads (copies) the executable code now stored on disk into memory at the starting address specified by the operating system.
- Copies the parameters (if any) to the main routine onto the stack.
- Initializes the machine registers and sets the stack pointer to the first free location (0x7fff fffc).
- Jumps to a start-up routine (at PC addr 0x0040 0000 on xspim) that copies the parameters into the argument registers and then calls the main routine of the program with a jal main.

Dynamically Linked Libraries
Statically linking libraries means that the library becomes part of the executable code.
- It loads the whole library even if only a small part is used (e.g., the standard C library is 2.5 MB).
- What if a new version of the library is released?
(Lazy) dynamically linked libraries (DLL): library routines are not linked and loaded until a routine is called during execution.
- The first time the library routine is called, a dynamic linker-loader must find the desired routine, remap it, and "link" it to the calling routine (see book for more details).
- DLLs require extra space for dynamic linking information, but do not require the whole library to be copied or linked.
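The lazy linking idea can be modelled in a few lines of Python. This is only an analogy under stated assumptions, not how an operating system implements DLLs: the class LazyLibrary and its fields are invented, a dictionary stands in for the library file, and "linking" is reduced to caching a reference the first time a routine is called.

```python
class LazyLibrary:
    """Toy model of lazy dynamic linking: a routine is linked on first call only."""

    def __init__(self, routines):
        self._available = routines   # stands in for the library on disk
        self._linked = {}            # routines already linked into the program
        self.link_count = 0          # how many times the linker-loader had to run

    def call(self, name, *args):
        if name not in self._linked:
            # First call: find the routine and "link" it to the caller.
            self._linked[name] = self._available[name]
            self.link_count += 1
        # Later calls go straight to the linked routine.
        return self._linked[name](*args)


library = LazyLibrary({"strlen": len, "toupper": str.upper})
print(library.call("strlen", "The answer is "))   # prints 14
print(library.call("strlen", "\n"))               # prints 1
print(library.link_count)                         # prints 1
```

Only strlen was ever linked; toupper stays untouched because nothing called it, which is the saving the slide describes.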