CS.305 Computer Architecture
<local.cis.strath.ac.uk/teaching/ug/classes/CS.305>

Computer Abstractions and Technology

Adapted from Computer Organization and Design, Patterson & Hennessy, © 2005, and from slides kindly made available by Dr Mary Jane Irwin, Penn State University.

Instruction Sets

• Language of the Machine
• We'll be working with the MIPS instruction set architecture
  • Similar to other architectures developed since the 1980s
  • Almost 100 million MIPS processors manufactured in 2002
  • Used by NEC, Nintendo, Cisco, Silicon Graphics, Sony, …

Instructions: Language of the Computer CS305_03/2

MIPS is a RISC

• RISC: Reduced Instruction Set Computer
• RISC philosophy
  • fixed instruction lengths
  • load-store instruction sets
  • limited addressing modes
  • limited operations
• Examples: MIPS, Sun SPARC, HP PA-RISC, IBM PowerPC, Intel (Compaq) Alpha, …
• Instruction sets are measured by how well compilers use them, as opposed to how well assembly language programmers use them
• Design goals: speed, cost (design, fabrication, test, packaging), size, power consumption, reliability, memory space (embedded systems)

MIPS R3000 Instruction Set Architecture (ISA)

• Instruction Formats
  • R: Register
  • I: Immediate
  • J: Jump
• Instruction Categories
  • Computational
  • Load/Store
  • Jump and Branch
  • Floating Point (coprocessor)
  • Memory Management
  • Special
• Registers: R0 - R31, PC, HI, LO

MIPS Instruction Formats

Three basic formats:

  R-format   op | rs | rt | rd | shamt | funct
  I-format   op | rs | rt | 16-bit address/number
  J-format   op | 26-bit address

• Simple instructions, all 32 bits wide
• Very structured, no unnecessary baggage
• Rely on compiler to achieve performance: what are the compiler's goals? [Suggests another version of the acronym RISC ;-)]

Q: Why only three basic formats?
A: Design Principle #1…

Design Principle #1

Simplicity favours regularity

• The fixed width and limited number of instruction formats keep the hardware simple
• One example of this first underlying principle of hardware design in action

Registers vs. Memory

• Operands of arithmetic instructions must be registers, and only 32 registers are provided
• Compiler associates variables with registers
• What about programs with lots of variables?

[Figure: components of a computer: Processor (Control and Datapath), Memory, and I/O (Input and Output)]

Memory Organization

• Viewed as a large, single-dimension array, with an address
• A memory address is an index into the array
• "Byte addressing" means that the index points to a byte of memory

  Address   Contents
  0         8 bits of data
  1         8 bits of data
  2         8 bits of data
  3         8 bits of data
  4         8 bits of data
  5         8 bits of data
  6         8 bits of data
  ...

Memory Organization

• Bytes are nice, but most data items use larger "words"
• For MIPS, a word is 32 bits or 4 bytes

  Address   Contents
  0         32 bits of data
  4         32 bits of data
  8         32 bits of data
  12        32 bits of data
  ...

• Registers hold 32 bits of data
• 2^32 bytes with byte addresses from 0 to 2^32 - 1
• 2^30 words with byte addresses 0, 4, 8, ..., 2^32 - 4
• Words are aligned: what are the 2 least significant bits of a word address?

Machine Language

• Instructions, like registers and words of data, are also 32 bits long
• Example: add $t1,$s1,$s2
• Registers have numbers: $t1=9, $s1=17, $s2=18
• The machine language encoding of the add above:

  op       rs      rt      rd      shamt   funct
  000000   10001   10010   01001   00000   100000

• Can you guess what the field names, such as 'op', stand for?

MIPS Computational Operations

Computational (arithmetic and logical) instructions have 3 operands.
Example:

  C code:       a = b + c
  MIPS 'code':  add a, b, c

(we'll talk about registers in a bit)

"The natural number of operands for an operation like addition is three… requiring every instruction to have exactly three operands, no more and no less, conforms to the philosophy of keeping the hardware simple"

MIPS Arithmetic Instructions

• MIPS assembly language arithmetic statement examples:

    add $t0,$s1,$s2
    sub $t0,$s1,$s2

• Each arithmetic instruction performs only one operation
• Each arithmetic instruction fits in 32 bits and specifies exactly three operands: destination = source1 op source2
• Those operands are all contained in the datapath's register file ($t0,$s1,$s2), indicated by $
• Operand order is fixed (destination first in the assembly language statement)

MIPS Arithmetic

• Remember "Simplicity favours regularity"
• Of course this complicates some things...

    C code:       a = b + c + d;
    MIPS 'code':  add a, b, c
                  add a, a, d

• Each register contains 32 bits
• Operands must be registers, but only 32 registers are available

Q: Why only 32 registers?
A: Design Principle #2…

Design Principle #2

Smaller is Faster

• Operands of arithmetic instructions cannot be arbitrary (program) variables; they must come from a limited number of special operands called registers
• One major difference between program variables and registers is the limited number of registers: 32 in MIPS
• A very large number of registers would increase the clock cycle time, as electronic signals take longer the further they have to travel
• This is one illustration of this second underlying principle of hardware design

Aside: MIPS Register Conventions

  Name      Register Number   Usage                                          Preserved on call?
  $zero     0                 The constant value 0                           n.a.
  $at       1                 Reserved for the assembler                     n.a.
  $v0-$v1   2-3               Values for results and expression evaluation   no
  $a0-$a3   4-7               Arguments                                      no
  $t0-$t7   8-15              Temporaries                                    no
  $s0-$s7   16-23             Saved                                          yes
  $t8-$t9   24-25             More temporaries                               no
  $k0-$k1   26-27             Reserved for the operating system              n.a.
  $gp       28                Global pointer                                 yes
  $sp       29                Stack pointer                                  yes
  $fp       30                Frame pointer                                  yes
  $ra       31                Return address                                 yes

Aside: MIPS Register File

• Register File
  • Holds thirty-two 32-bit registers
  • Two read ports and one write port
• Registers are
  • Faster than main memory
    • But register files with more locations are slower (e.g., a 64 word file could be as much as 50% slower than a 32 word file)
    • Read/write port increase impacts speed quadratically
  • Easier for a compiler to use
  • Convenient places to hold variables
    • Code density improves (since registers are named with fewer bits than a memory location)

[Figure: register file with 32 locations of 32 bits each. Inputs: 5-bit src1 addr, src2 addr and dst addr, 32-bit write data, and a write control signal. Outputs: 32-bit src1 data and src2 data]

Recap: R-format Instructions

  Field   Width    Meaning
  op      6 bits   opcode that specifies the operation
  rs      5 bits   register file address of the first source operand
  rt      5 bits   register file address of the second source operand
  rd      5 bits   register file address of the result's destination
  shamt   5 bits   shift amount (for shift instructions)
  funct   6 bits   function code augmenting the opcode

Register Addressing Mode

• The register address fields are rs, rt, and rd. Each field is 5 bits wide

  op       rs       rt       rd       shamt    funct
  6 bits   5 bits   5 bits   5 bits   5 bits   6 bits

[Figure: register addressing. The rs, rt and rd fields of the instruction select Register ($rs), Register ($rt) and Register ($rd) in the register file]

Load and Store Instructions

Example (h is in $s2 and the base address of array A is in $s3):

  C code:     A[12] = h + A[8];
  MIPS code:  lw  $t0,32($s3)
              add $t0,$s2,$t0
              sw  $t0,48($s3)

• Destination is last in the store word assembly language statement
• Remember arithmetic operands are registers, not memory!
Can't write: add 48($s3),$s2,32($s3)

Our First Example

Can we figure out the code?

  void swap(int v[], int k)
  {
    int temp;
    temp = v[k];
    v[k] = v[k+1];
    v[k+1] = temp;
  }

  swap: sll $2,$5,2      # $2 = k * 4, the byte offset of v[k]
        add $2,$4,$2     # $2 = address of v[k]
        lw  $15,0($2)    # $15 = v[k]
        lw  $16,4($2)    # $16 = v[k+1]
        sw  $16,0($2)    # v[k] = old v[k+1]
        sw  $15,4($2)    # v[k+1] = old v[k]
        jr  $31          # return to caller

So far we've learned:

• MIPS
  • loading words but addressing bytes
  • arithmetic on registers only

  Instruction          Meaning
  add $s1,$s2,$s3      $s1 = $s2 + $s3
  sub $s1,$s2,$s3      $s1 = $s2 - $s3
  lw  $s1,100($s2)     $s1 = Memory[$s2+100]
  sw  $s1,100($s2)     Memory[$s2+100] = $s1

MIPS Load/Store Instruction Format

• Consider load-word and store-word instructions and the design principle "Simplicity favours regularity"…
• …so use another (existing) 32-bit instruction format besides R-type: I-type, for data transfer instructions
• Example: lw $t0,32($s2)

  op   rs   rt   16-bit number/offset
  35   18   8    32

Q: Why only a 16-bit number/offset?
A: Design Principle #3…

Design Principle #3

Good Design Demands Good Compromises

• A single (R-type) instruction format is not well suited to instructions, like lw and sw, that specify an address as well as register operands
• If the address were allocated to one of the 5-bit fields, say, then such instructions could only address 32 (2^5) words!
• The conflict between having instructions all the same length and the desire to have a single format leads to this third underlying principle of hardware design
• One compromise in MIPS is to have a small number of different fixed-width instruction formats rather than instructions of varying length
• Multiple formats do complicate the hardware, but the complexity can be minimised by keeping them similar
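
The trade-off behind the I-format can be checked with a little arithmetic. The following is a minimal Python sketch, not part of the original slides: the helper name encode_i_format is hypothetical, and the register numbers are taken from the register conventions table ($t0 = 8, $s2 = 18). It packs lw $t0,32($s2) into a 32-bit word and compares the reach of an unsigned 5-bit field with that of a signed 16-bit field.

```python
# Register numbers from the register conventions table; the helper name
# encode_i_format is hypothetical and not part of any real toolchain.
REG = {"$t0": 8, "$s2": 18}


def encode_i_format(op, rs, rt, imm):
    """Pack op (6 bits), rs (5), rt (5) and a 16-bit immediate into a 32-bit word."""
    return (op << 26) | (rs << 21) | (rt << 16) | (imm & 0xFFFF)


# lw $t0,32($s2): op = 35, rs = base register, rt = register being loaded
lw_word = encode_i_format(35, REG["$s2"], REG["$t0"], 32)
print(f"{lw_word:032b}")

# Reach of an unsigned 5-bit field versus a signed 16-bit field
five_bit_range = (0, 2**5 - 1)
sixteen_bit_range = (-(2**15), 2**15 - 1)
print(five_bit_range, sixteen_bit_range)
```

The 16-bit field occupies exactly the space that rd, shamt and funct take up in the R-format, which is why the two formats can share the op, rs and rt positions.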
MIPS Load/Store Memory Addressing

• MIPS has two basic data transfer instructions for accessing memory:

    lw $t0, 4($s3)   # load word from memory
    sw $t0, 8($s3)   # store word to memory

• Data is loaded into (lw) or stored from (sw) a register in the register file, which is identified by a 5-bit address
• The memory address, a 32-bit address, is formed by adding the contents of a base address register to an offset value
• A 16-bit field means access is limited to memory locations within a region of ±2^13 or 8,192 words (±2^15 or 32,768 bytes) of the address in the base register
• Note that the offset can be positive or negative

Base (displacement) Addressing Mode

• Base (displacement) addressing: operand is at the memory location whose address is the sum of a register and a 16-bit constant contained within the instruction

[Figure: base addressing. The offset field of the instruction (op | rs | rt | offset) is added to Register ($rs) to address a byte, halfword or word in memory]

Stored Program Concept

• Instructions are bits
• Programs are stored in memory, to be read or written just like data
• Fetch & Execute Cycle
  • Instructions are fetched and put into a special register
  • Bits in the register "control" the subsequent actions
  • Fetch the "next" instruction and continue

Control

• Decision making instructions alter the control flow, i.e., change the "next" instruction to be executed
• MIPS conditional branch instructions:

    bne $t0,$t1,Label
    beq $t0,$t1,Label

• Example:

    C:     if (i==j) h = i + j;

    MIPS:         bne $s0,$s1,Label
                  add $s3,$s0,$s1
           Label: ....

Control

• MIPS unconditional branch instructions:

    j label

• Example:

    C:     if (i!=j) h=i+j; else h=i-j;

    MIPS:        beq $s4,$s5,Lab1
                 add $s3,$s4,$s5
                 j   Lab2
           Lab1: sub $s3,$s4,$s5
           Lab2: ...

• Can you build a simple for loop?
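
To make the fetch-and-execute cycle and the branch examples above more concrete, here is a minimal Python sketch of a toy interpreter. It is an illustration only and makes several assumptions that are not in the slides: instructions are held as (label, op, args) tuples rather than 32-bit words, the program counter is a list index rather than a byte address, registers are unbounded Python integers, and the names run and IF_ELSE are hypothetical.

```python
def run(program, regs):
    """Run a list of (label, op, args) tuples with a fetch-execute loop."""
    labels = {label: i for i, (label, _, _) in enumerate(program) if label}
    pc = 0  # index of the next instruction (a real PC holds a byte address)
    while pc < len(program):
        _, op, args = program[pc]  # fetch
        pc += 1                    # by default the next instruction follows
        if op == "add":
            rd, rs, rt = args
            regs[rd] = regs[rs] + regs[rt]
        elif op == "sub":
            rd, rs, rt = args
            regs[rd] = regs[rs] - regs[rt]
        elif op == "beq":
            rs, rt, target = args
            if regs[rs] == regs[rt]:
                pc = labels[target]  # branch taken: change the "next" instruction
        elif op == "bne":
            rs, rt, target = args
            if regs[rs] != regs[rt]:
                pc = labels[target]
        elif op == "j":
            pc = labels[args[0]]
        elif op != "nop":
            raise ValueError(f"unsupported operation: {op}")
    return regs


# if (i != j) h = i + j; else h = i - j;   with i in $s4, j in $s5, h in $s3
IF_ELSE = [
    (None,   "beq", ("$s4", "$s5", "Lab1")),
    (None,   "add", ("$s3", "$s4", "$s5")),
    (None,   "j",   ("Lab2",)),
    ("Lab1", "sub", ("$s3", "$s4", "$s5")),
    ("Lab2", "nop", ()),
]

different = run(IF_ELSE, {"$s3": 0, "$s4": 5, "$s5": 3})
same = run(IF_ELSE, {"$s3": 99, "$s4": 4, "$s5": 4})
print(different["$s3"], same["$s3"])
```

A for loop uses the same ingredients: a conditional branch that exits when the loop test fails, and an unconditional jump back to the test at the end of the body.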
Recap:

  Instruction          Meaning
  add $s1,$s2,$s3      $s1 = $s2 + $s3
  sub $s1,$s2,$s3      $s1 = $s2 - $s3
  lw  $s1,100($s2)     $s1 = Memory[$s2+100]
  sw  $s1,100($s2)     Memory[$s2+100] = $s1
  bne $s4,$s5,Label    Next instruction is at Label if $s4≠$s5
  beq $s4,$s5,Label    Next instruction is at Label if $s4=$s5
  j   Label            Next instruction is at Label

Formats:

  R-format   op | rs | rt | rd | shamt | funct
  I-format   op | rs | rt | 16-bit address/number
  J-format   op | 26-bit address

More Control 'Instructions'

• We have beq and bne; what about blt (branch-if-less-than)?
• New instruction:

    slt $t0,$s1,$s2

  if $s1 < $s2 then $t0 = 1 else $t0 = 0
• Can use slt to synthesise "blt $s1,$s2,Label", and so can now build general control structures
• Note that the assembler needs a register to do this: $at

Constants

• Constant (immediate) operands are frequently used in programs, e.g., A = A + 4; B = B + 1; C = C - 16;
• Possible approaches?
  • put 'typical constants' in memory and load them
  • create hard-wired registers (like $zero) for constants like 1
  • have special instructions that contain constants!
• Note: small constants are very common (>50% of operands)

Q: Which instruction format(s) to use?
A: See Design Principle #4…

Design Principle #4

Make the Common Case Fast

• Analysis of a large variety of compiled programs reveals that the vast majority of constants used are quite small numbers: >90% are within the range of a 16-bit two's complement integer
• The obvious choice is to use the I-type format for instructions that have this most common case of constant as an operand
• Hence these typical MIPS 'Immediate' instructions:

    addi $sp,$sp,4     # $sp = $sp + 4
    slti $t0,$s2,15    # $t0 = 1 if $s2 < 15

What about larger constants?

There must be a way to 'load' a 32-bit constant into a register.
Compromise by using two instructions:

• "Load Upper Immediate" (lui) instruction:

    lui $t0,1010101010101010b

    $t0   1010101010101010 0000000000000000   (lower half zero filled)

• Followed by a "logical or" (ori) instruction:

    ori $t0,$t0,1010101010101010b

    $t0   1010101010101010 0000000000000000
    ori   0000000000000000 1010101010101010
    $t0   1010101010101010 1010101010101010

Assembly Language vs. Machine Language

• Assembly provides convenient symbolic representation
  • much easier than writing down numbers
  • e.g., destination first
• Machine language is the underlying reality
  • e.g., destination is no longer first
• Assembly can provide 'pseudoinstructions'
  • e.g., "move $t0,$t1" exists only in Assembly
  • would be implemented by "add $t0,$t1,$zero"
• When considering performance you should count real instructions

Addresses in Branches and Jumps

Instructions:

  bne $t4,$t5,Label    Next instruction is at Label if $t4≠$t5
  beq $t4,$t5,Label    Next instruction is at Label if $t4=$t5
  j   Label            Next instruction is at Label

Formats:

  I-format   op | rs | rt | 16-bit address
  J-format   op | 26-bit address

• Addresses are not 32 bits
• How do we handle this with load and store instructions?

Addresses in Branches

Instructions:

  bne $t4,$t5,Label    Next instruction is at Label if $t4≠$t5
  beq $t4,$t5,Label    Next instruction is at Label if $t4=$t5

Format:

  I-format   op | rs | rt | 16-bit address (offset)

• Could specify a register (like lw and sw did) and add it to address (offset)

Q: Which register?
A: Instruction Address Register (aka Program Counter - PC)

[Figure: PC-relative addressing. The offset field of the instruction (op | rs | rt | offset) is combined with the Program Counter (PC)]

Specifying Branch Destinations

Why PC?
• Its use is automatically implied by the instruction
  • PC gets updated (PC+4) during the fetch cycle so that it holds the address of the next instruction
• Limits the branch distance to -2^15 to +2^15 - 1 instructions from the (instruction after the) branch instruction, but most branches are local anyway (Principle of Locality)

[Figure: branch destination address. The 16-bit offset from the low order 16 bits of the branch instruction is sign-extended, has 00 appended to give a 32-bit value, and is added to PC + 4 to form the 32-bit branch dst address]

Addresses in Jumps

Instruction:

  j Label    Next instruction is at Label

Format:

  J-format   op | 26-bit address

• Jump instructions just use high order bits of PC
• A compromise: such jumps are limited by address boundaries of 256 MB, i.e. within blocks of 2^26 instructions

[Figure: jump destination address. The low order 26 bits of the jump instruction have 00 appended and are concatenated with the upper 4 bits of the PC to form a 32-bit address]

MIPS ISA So Far

  Category             Instr                        Op Code    Example               Meaning
  Arithmetic           add                          0 and 32   add  $s1, $s2, $s3    $s1 = $s2 + $s3
  (R & I format)       subtract                     0 and 34   sub  $s1, $s2, $s3    $s1 = $s2 - $s3
                       add immediate                8          addi $s1, $s2, 6      $s1 = $s2 + 6
                       or immediate                 13         ori  $s1, $s2, 6      $s1 = $s2 OR 6
  Data Transfer        load word                    35         lw   $s1, 24($s2)     $s1 = Memory($s2+24)
  (I format)           store word                   43         sw   $s1, 24($s2)     Memory($s2+24) = $s1
                       load byte                    32         lb   $s1, 25($s2)     $s1 = Memory($s2+25)
                       store byte                   40         sb   $s1, 25($s2)     Memory($s2+25) = $s1
                       load upper imm               15         lui  $s1, 6           $s1 = 6 * 2^16
  Cond. Branch         br on equal                  4          beq  $s1, $s2, L      if ($s1==$s2) go to L
  (I & R format)       br on not equal              5          bne  $s1, $s2, L      if ($s1 !=$s2) go to L
                       set on less than             0 and 42   slt  $s1, $s2, $s3    if ($s2<$s3) $s1=1 else $s1=0
                       set on less than immediate   10         slti $s1, $s2, 6      if ($s2<6) $s1=1 else $s1=0
  Uncond. Jump         jump                         2          j    2500             go to 10000
  (J & R format)       jump register                0 and 8    jr   $t1              go to $t1
                       jump and link                3          jal  2500             go to 10000; $ra=PC+4

Review of MIPS Operand Addressing Modes

• Register addressing: operand is in a register

    op | rs | rt | rd | funct   ->   Register: word operand

• Base (displacement) addressing: operand is at the memory location whose address is the sum of a register and a 16-bit constant contained within the instruction

    op | rs | rt | offset   ->   base register + offset   ->   Memory: word or byte operand

  • Register relative (indirect) with 0($a0)
  • Pseudo-direct with addr($zero)
• Immediate addressing: operand is a 16-bit constant contained within the instruction

    op | rs | rt | operand

Review of MIPS Instruction Addressing Modes

• PC-relative addressing: instruction address is the sum of the PC and a 16-bit constant contained within the instruction

    op | rs | rt | offset   ->   Program Counter (PC) + offset   ->   Memory: branch destination instruction

• Pseudo-direct addressing: instruction address is the 26-bit constant contained within the instruction concatenated with the upper 4 bits of the PC

    op | jump address   ->   Program Counter (PC) || jump address   ->   Memory: jump destination instruction

Instructions for Accessing Procedures

• MIPS 'procedure call' instruction:

    jal ProcedureAddress   # jump and link

• Saves PC+4 in register $ra ($31) to have a link to the next instruction for the procedure return
• Instruction format (J-format):

    op             26-bit address
    000011 (jal)

• Then can do procedure 'return' with a

    jr $ra   # return

• Instruction format (R-format):

    op            rs   rt   rd   shamt   funct
    000000 (jr)   rs                     001000

Aside: Spilling Registers

What if the callee needs more registers and/or the procedure is recursive?
• Use a stack, a last-in-first-out queue, in memory for passing additional values or saving (recursive) return address(es)
• One of the general registers, $sp, is used to address the stack (which "grows" from high address to low address)

[Figure: the stack in memory between high addr and low addr, with $sp pointing at the top of stack]

• Push (a register onto the stack):

    addi $sp,$sp,-4
    sw   $ra,0($sp)

• Pop (a register off the stack):

    lw   $ra,0($sp)
    addi $sp,$sp,4

Example: Nested Procedure Calls - MIPS code

  A:  ...
      jal  B              # Call B, save return addr in $ra
      ...

  B:  ...                 # Get ready to call C
      addi $sp,$sp,-4     # Adjust ToS to make room to...
      sw   $ra,0($sp)     # ...'push' the old return addr
      jal  C              # Call C, save return addr in $31
      lw   $ra,0($sp)     # Restore B's return address...
      addi $sp,$sp,4      # ...and re-adjust ToS ('pop')
      ...
      jr   $ra            # Return to proc that called B

  C:  ...
      jr   $ra            # Return to proc that called C

Passing Parameters to Procedures

• Conventions for passing parameters (arguments) may vary from machine to machine, language to language, and even compiler to compiler. MIPS uses $4 to $7 ($a0-$a3) as arguments.
• There must also be a convention for preserving registers across procedure calls. The two usual conventions are:
  • Caller save. The calling procedure (caller) has the responsibility for preserving affected registers. The called procedure (callee) can then modify any registers without constraint.
  • Callee save. The callee has the responsibility for saving and restoring any registers that it might use. The calling procedure (caller) uses registers without worrying about their preservation.

MIPS Pseudoinstructions

• In keeping with design principles, the MIPS ISA does not contain complex instructions, as these could compromise the performance of all instructions.
• However, a MIPS compiler/assembler can synthesise 'pseudoinstructions' from common variations of real instructions.
• Such pseudoinstructions simplify translation and programming.
• Pseudoinstructions give MIPS a richer set of assembly language instructions than those implemented by hardware.
• The assembler reserves one register, $at, that is used in the synthesis of many pseudoinstructions. For example…

Example MIPS Pseudoinstructions

  Pseudoinstruction      Real MIPS
  move $t0,$t1           add $t0,$t1,$zero
  clear $s0              add $s0,$zero,$zero
  blt $s1,$s2,label      slt $at,$s1,$s2
                         bne $at,$zero,label
  bge $s1,$s2,label      slt $at,$s1,$s2
                         beq $at,$zero,label

Summary of MIPS (RISC) Design Principles

• Simplicity favours regularity
  • fixed size instructions: 32 bits
  • small number of instruction formats
  • opcode always the first 6 bits
• Good design demands good compromises
  • three instruction formats
• Smaller is faster
  • limited instruction set
  • limited number of registers in register file
  • limited number of addressing modes
• Make the common case fast
  • arithmetic operands from the register file (load-store machine)
  • allow instructions to contain immediate operands

Fallacies and Pitfalls

• Fallacy: More powerful instructions mean higher performance. Such instructions often do more work than is required in the frequent case or don't match the requirements of the language.
• Pitfall: To obtain the highest performance, write in assembly language. The increasing sophistication of modern compilers means that the gap between compiled code and 'hand-crafted' code is closing fast. Even if the gap isn't closed completely, the drawbacks of writing in assembly language are longer time spent coding and debugging, the loss in portability, and difficulty of maintenance.
• Pitfall: Forgetting that sequential word addresses in memory differ by 4, not by 1.
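
The last pitfall is easy to check numerically. The sketch below is a small Python illustration, not part of the original slides: the helper names element_offset and is_word_aligned are hypothetical, and the base address is an arbitrary aligned value standing in for the contents of a base register such as $s3. It shows that consecutive words lie 4 bytes apart and reproduces the offsets 32 and 48 used earlier for A[8] and A[12].

```python
WORD_SIZE = 4  # bytes per MIPS word


def element_offset(index):
    """Byte offset of word element `index` from the start of an array of words."""
    return WORD_SIZE * index


def is_word_aligned(address):
    """A word address is a multiple of 4, so its two low-order bits are 00."""
    return (address & 0b11) == 0


base = 0x10010000  # hypothetical base address of A, standing in for $s3
addresses = [base + element_offset(i) for i in range(4)]
print([hex(a) for a in addresses])

# Offsets used by lw $t0,32($s3) and sw $t0,48($s3) for A[8] and A[12]
print(element_offset(8), element_offset(12))
```

Indexing by 1 instead of 4 would address a byte inside the same word rather than the next word, which is the mistake the pitfall warns about.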