Lecture 4 1 Last lecture MIPS Instruction Meaning add $s1, $s2, $s3 sub $s1, $s2, $s3 $s1 = $s2 + $s3 2 A word of motivation Assembly language is for the brave; C and other high level languages are for sissies! Engineering folklore 3 A word of advice The first day that someone writes an assembly program, only 2 persons know what it is about: the programmer and God. The second day, only God knows what is going on. Engineering folklore 4 Machine Language Consider the load-word and store-word instructions, What would the regularity principle have us do? New principle: Good design demands a compromise Introduce a new type of instruction format I-type for data transfer instructions other format was R-type for register Example: lw $t0, 32($s2) 35 18 9 op rs rt 32 16 bit number Where’s the compromise? 5 Control Decision making instructions alter the control flow, i.e., change the "next" instruction to be executed how do you specify the destination of a branch/jump? Most conditional branches go short distances from current program counter MIPS conditional branch instructions: beq r1,r2, addr => if (r1=r2) goto addr bne $t0, $t1, Label beq $t0, $t1, Label Example: if (i==j) h = i + j; bne $s0, $s1, Label add $s3, $s0, $s1 Label: .... 6 Control MIPS unconditional branch instructions: j label => PC <= label Example: if (i!=j) h=i+j; else h=i-j; beq $s4, $s5, Lab1 add $s3, $s4, $s5 j Lab2 Lab1: sub $s3, $s4, $s5 Lab2: ... Can you build a simple for loop? loop1: Beq $s4, $s5, out_of_loop # do_instructions of the loop addi $s4, $s4, 1 j loop1 out_of_loop: 7 More control Translate if (I=j) && (p!=q)) w=w+1 else w=5; bne $s4, $s5, lab1 Notice that a logical expression gets translated in to a path! bne $s6, $s7, lab1 addi $s8, $s8, 1 j nexti lab1: addi $s8, $s0, 5 nexti: 8 So far: Instruction Meaning add $s1,$s2,$s3 sub $s1,$s2,$s3 $s1 = $s2 + $s3 Formats: R op rs rt rd I op rs rt 16 bit address J op shamt funct 26 bit address 9 Control Flow We have: beq, bne, what about Branch-if-less-than? New instruction: if $s1 < $s2 then $t0 = 1 slt $t0, $s1, $s2 else $t0 = 0 Can use this instruction to build "blt $s1, $s2, Label" 10 2 Exercise: Translate if (I<j) w=w+1 else w=5 Note that the assembler needs a register to do this, 11 Policy of Use Conventions Name Register number $zero 0 $v0-$v1 2-3 $a0-$a3 4-7 $t0-$t7 8-15 $s0-$s7 16-23 $t8-$t9 24-25 $gp 28 $sp 29 $fp 30 $ra 31 Usage the constant value 0 values for results and expression evaluation arguments temporaries saved more temporaries global pointer stack pointer frame pointer return address Constants Small constants are used quite frequently (50% of operands) e.g., A = A + 5; B = B + 1; C = C - 18; Solutions? Why not? put ’typical constants’ in memory and load them. create hard-wired registers (like $zero) for constants like one. MIPS Instructions: addi $29, $29, 4 slti $8, $18, 10 andi $29, $29, 6 ori $29, $29, 4 13 How about larger constants? We’d like to be able to load a 32 bit constant into a register Must use two instructions, new "load upper immediate" instruction lui $t0, 1010101010101010 1010101010101010 filled with zeros 0000000000000000 Then must get the lower order bits right, i.e., ori $t0, $t0, 1010101010101010 1010101010101010 0000000000000000 0000000000000000 1010101010101010 1010101010101010 1010101010101010 ori 14 MIPS Instruction formats so far simple instructions all 32 bits wide very structured, no unnecessary baggage only three instruction formats R op rs rt rd I op rs rt 16 bit address J op shamt funct 26 bit address rely on compiler to achieve performance help compiler where we can 15 Addresses in Branches and Jumps Instructions: bne $t4,$t5,Label Next instruction is at Label if Next instruction is at Label if Next instruction is at Label Formats: I op J op rs rt 16 bit address 26 bit address Addresses are not 32 bits 16 Addresses in Branches Instructions: bne $t4,$t5,Label beq $t4,$t5,Label Next instruction is at Label if $t4!=$t5 Next instruction is at Label if $t4=$t5 Formats: op I rs rt 16 bit address Could specify a register (like lw and sw) and add it to address use Instruction Address Register (PC = program counter) most branches are local (principle of locality) Jump instructions just use high order bits of PC address boundaries of 256 MB 17 Branch and Jump Addressing Modes Branch (e.g., beq) uses PC-relative addressing mode (uses few bits if address typically close). That is, it uses base+displacement mode, with the PC being the base. If opcode is 6 bits, how many bits are available for displacement? How far can you jump? Jump uses pseudo-direct addressing mode. 26 bits of the address is in the instruction, the rest is taken from the PC. 6 instruction 26 4 program counter 26 00 jump destination address 18 Addressing Modes, Data and Control Transfer 1. Immediate addressing op rs rt Immediate 2. Register addressing op rs rt rd . .. funct Registers Register 3. Base addressing op rs rt Memor y Address + Register Byte Halfword Word 4. PC-relative addressing op rs rt Memor y Address PC + Word 5. Pseudodirect addressing op Address Memor y PC Word 19 Subroutines jal X # jump and link; PC+1 is stored in $ra (r31) jr $ra # jump register instruction parameters in $a0-$a3 And for more registers? We have a stack! 20 Learning by example Can we figure out the code? swap(int v[], int k); { int temp; temp = v[k] v[k] = v[k+1]; v[k+1] = temp; } swap: muli $2, $5, 4 add $2, $4, $2 lw $15, 0($2) lw $16, 4($2) sw $16, 0($2) sw $15, 4($2) jr $31 21 To summarize: 22 Review -- Instruction Execution in a CPU Registers10 R0 R1 R2 R3 R4 R5 ... 0 36 60000 45 198 12 ... op CPU Program Counter Memory2 10000 10000 10001100010000110100111000100000 10004 00000000011000010010100000010000 Instruction Buffer rs rt rd shamt 80000 00000000000000000000000000111001 immediate/disp in1 in2 operation out addr Load/Store Unit data 23 24 PowerPC Indexed addressing example: lw $t1,$a0+$s3 #$t1=Memory[$a0+$s3] What do we have to do in MIPS? Update addressing update a register as part of load (for marching through arrays) example: lwu $t0,4($s3) #$t0=Memory[$s3+4];$s3=$s3+4 What do we have to do in MIPS? Others: load multiple/store multiple bc Loop decrement counter, if not 0 goto loop 25 80x86 1978: The Intel 8086 is announced (16 bit architecture) 1980: The 8087 floating point coprocessor is added 1982: The 80286 increases address space to 24 bits, +instructions 1985: The 80386 extends to 32 bits, new addressing modes 1989-1995: The 80486, Pentium, Pentium Pro add a few instructions (mostly designed for higher performance) 1997: MMX is added 26 A dominant architecture: 80x86 See your textbook for a more detailed description Complexity: Instructions from 1 to 17 bytes long one operand must act as both a source and destination one operand can come from memory complex addressing modes Saving grace: the most frequently used instructions are not too difficult to build compilers avoid the portions of the architecture that are slow 27 And for the future See Transmeta team! 28 Summary Instruction complexity is only one variable lower instruction count vs. higher CPI / lower clock rate Design Principles: simplicity favors regularity smaller is faster good design demands compromise make the common case fast Instruction set architecture a very important abstraction indeed! 29 Assembly Language vs. Machine Language Assembly provides convenient symbolic representation much easier than writing down numbers Machine language is the underlying reality When considering performance you should count real instructions 30