Computing Systems
Instructions: language of the computer
claudio.talarico@mail.ewu.edu

Instructions
- Instructions are the language of the machine
- We'll be working with the MIPS instruction set architecture
- Design goals: find a language that makes it easy to build the hardware and the compiler while maximizing performance and minimizing cost
- Stored-program concept
  - programs (= instructions and data) can be stored in memory as numbers
- Fetch & execute cycles
  - Fetch the instruction from memory and put it into a special register (IR)
  - The bits in the register "control" the subsequent actions required to execute the task specified by the instruction
  - Continue with the "next" instruction

Instruction set characteristics
- Small: few primitive instructions
- Simple: few instruction formats
- Regular: instructions can be handled in similar ways
- Complete: supports all kinds of high-level language instructions
- Efficient: high-level language instructions can be mapped efficiently
- Compatible

Definition of the architecture
- Data types: bit, byte, word, unsigned integer, char, …
- Operations: arithmetic, logical, shift, flow control, data transfers, …
- Number of operands (3, 2, 1, or 0): the number of operands affects the instruction length
- Registers
- Memory organization

MIPS arithmetic
- All arithmetic instructions have 3 operands
- Operand order is fixed (destination first)

    C code:     c = a + b
    MIPS code:  add $s0, $s1, $s2

- Design principle: simplicity favors regularity. Why?
- Of course this complicates some things …

    C code:     f = (g + h) - (i + j)
    MIPS code:  add $t0, $s1, $s2
                add $t1, $s3, $s4
                sub $s0, $t0, $t1

Registers vs. memory
[Figure: processor (control, datapath), memory, input, output, I/O]
- Arithmetic operands must be registers
  - only 32 registers provided
  - each register is 32 bits (one word)
- Absolute truth?
- Design principle: smaller is faster. Why?
- The compiler associates variables with registers, but … what about programs with lots of variables or complex data structures?
- Only a small amount of data can be kept in the computer's registers; the rest is kept in memory
- We'll need instructions that transfer data between memory and registers

Memory organization
- Viewed as a large, single-dimension array, with an address
- A memory address is an index into the array
- "Byte addressing" means that the index points to a byte of memory

    Address   Contents
    0         8 bits of data
    1         8 bits of data
    2         8 bits of data
    3         8 bits of data
    4         8 bits of data
    5         8 bits of data
    6         8 bits of data
    ...

Memory organization
- Bytes are nice, but most data items use larger "words"
- For MIPS, a memory word is 32 bits wide (= 4 bytes)
  - 2^32 bytes with byte addresses from 0 to 2^32 - 1
  - 2^30 words with byte addresses 0, 4, 8, ..., 2^32 - 4
- Words are aligned (alignment restriction)
  - words start at addresses that are multiples of 4
  - what are the 2 least significant bits of a word address?

Byte order within a word
- Which byte is first and which is last? There are two choices
  - Little endian: the byte with the lowest address sits at the rightmost, least significant end of the word (= little end)
  - Big endian: the byte with the lowest address sits at the leftmost, most significant end of the word (= big end)
- MIPS uses big endian

    Byte addresses within each word, written from the leftmost (most significant) byte to the rightmost:

    Word address   Little endian   Big endian
    0              3 2 1 0         0 1 2 3
    4              7 6 5 4         4 5 6 7

MIPS data transfer instructions
- load: copies data from memory to a register
- store: copies data from a register to memory
- The memory address for load and store has two parts
  - a register whose content is known (called the base register or index register)
  - an offset to be added to the base register content
- This way the address is 32 bits
- The offset can be positive or negative (2's complement)

    C code:  A[12] = h + A[8]     (h is in $s2, the base address of A is in $s3)

    MIPS code              Meaning
    lw  $t0, 32($s3)       $t0 = Memory[$s3 + 32]
    add $t0, $s2, $t0      $t0 = $s2 + $t0
    sw  $t0, 48($s3)       Memory[$s3 + 48] = $t0

Machine language
- MIPS assembly instructions are translated into machine instructions ("binary numbers")
- Instructions, like registers and words of data, are also 32 bits long
- Example: add $t1, $s1, $s2
  - registers are represented as numbers: $t1 = 9, $s1 = 17, $s2 = 18, …

    op       rs       rt       rd       shamt    funct
    6 bits   5 bits   5 bits   5 bits   5 bits   6 bits

- The format above is called R-format (for register)
- Can you guess what the field names stand for?

Machine language
- Consider the load-word and store-word instructions
- We must specify two registers and a constant (= immediate operand)
- The regularity principle would suggest using one of the 5-bit fields for the constant
- Problem: the constant would be limited to only 32 values (2^5)!
- Solution: we introduce a new format called I-type (for immediate)
- The 16-bit number is in 2's complement form

    op       rs       rt       16-bit number
    6 bits   5 bits   5 bits   16 bits

- Design principle: good design demands good compromises

Examples
[Figure: I-format encodings; surviving field labels are (op) lw, sw; (rs) $s2, $s3; (rt) $t0, $t0]

MIPS logical operations

    Logical operation     Example              Meaning
    shift left logical    sll $s1, $s2, 10     $s1 = $s2 << 10
    shift right logical   srl $s1, $s2, 10     $s1 = $s2 >> 10
    bit-by-bit and        and $s1, $s2, $s3    $s1 = $s2 & $s3
    bit-by-bit or         or  $s1, $s2, $s3    $s1 = $s2 | $s3
    bit-by-bit nor        nor $s1, $s2, $s3    $s1 = ~($s2 | $s3)

- The above instructions are all R-type

Example

    Instruction          op   rs   rt   rd   shamt   funct
    sll $s1, $s2, 10     0    0    18   17   10      0
    srl $s1, $s3, 10     0    0    19   17   10      2

Control flow
- Decision-making instructions alter the program control flow, i.e., change the "next" instruction to be executed
- MIPS conditional branch instructions:

    bne $t0, $t1, Label
    beq $t0, $t1, Label

    C code:     if (i == j) h = i + j;

    MIPS code:         bne $s3, $s4, Label
                       add $s5, $s3, $s4
                Label: …

Control flow
- MIPS unconditional branch instruction (jump):

    j Label

    C code:     if (i == j) f = g + h; else f = g - h;

    MIPS code:        bne $t0, $t1, Else
                      add $s5, $s3, $s4
                      j   Exit
                Else: sub $s5, $s3, $s4
                Exit: …

- Can you build a simple for loop?
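As a sanity check on the R-format and I-format layouts introduced in the machine language slides, here is a minimal Python sketch that packs the fields into a 32-bit word. The function names `encode_r` and `encode_i` are hypothetical helpers, not part of any MIPS toolchain. The register numbers and the sll/srl field values come from the slides; the funct value 32 for add and the opcodes 35 (lw) and 8 (addi) are assumptions taken from the standard MIPS encoding rather than from this deck.

```python
def _check_field(value, width):
    """Raise if an unsigned field value does not fit in `width` bits."""
    if not 0 <= value < (1 << width):
        raise ValueError(f"field value {value} does not fit in {width} bits")


def encode_r(op, rs, rt, rd, shamt, funct):
    """Pack the six R-format fields (6/5/5/5/5/6 bits) into one 32-bit word."""
    for value, width in ((op, 6), (rs, 5), (rt, 5), (rd, 5), (shamt, 5), (funct, 6)):
        _check_field(value, width)
    return (op << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct


def encode_i(op, rs, rt, imm):
    """Pack the I-format fields (6/5/5/16 bits); imm is a signed 16-bit number."""
    for value, width in ((op, 6), (rs, 5), (rt, 5)):
        _check_field(value, width)
    if not -(1 << 15) <= imm < (1 << 15):
        raise ValueError(f"immediate {imm} does not fit in 16 bits")
    # Masking keeps the low 16 bits, i.e. the 2's complement form of imm.
    return (op << 26) | (rs << 21) | (rt << 16) | (imm & 0xFFFF)


# Register numbers used in the slides: $t0=8, $t1=9, $s1=17, $s2=18, $s3=19.
# Assumption (not in the slides): funct 32 selects add, opcode 35 is lw.
add_word = encode_r(0, 17, 18, 9, 0, 32)   # add $t1, $s1, $s2
sll_word = encode_r(0, 0, 18, 17, 10, 0)   # sll $s1, $s2, 10
lw_word = encode_i(35, 19, 8, 32)          # lw  $t0, 32($s3)

for name, word in (("add", add_word), ("sll", sll_word), ("lw", lw_word)):
    print(f"{name:4s} {word:032b}")
```

Printing the words in binary makes it easy to compare each 5-bit or 6-bit group against the field tables by eye. Note that the destination register, written first in assembly, ends up in the rd field near the middle of the word.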
Control flow
- We have: beq (test for equality) and bne (test for inequality)
- But sometimes it is useful to see if a variable is less than another one (branch-less-than)
- MIPS provides the instruction set-on-less-than (slt)
- The format is R-type

    Meaning:    if ($s1 < $s2) $t0 = 1; else $t0 = 0
    MIPS code:  slt $t0, $s1, $s2

Control flow
- A case/switch statement can be implemented as a chain of if-then-else statements or, more efficiently, as a table of addresses
- MIPS provides a jump register instruction (jr)
- It is an unconditional jump to the address specified in a register
- Example: jr $s0

Control flow – Instruction formats
- slt and jr are R-format

    Instruction          op   rs   rt   rd   shamt   funct
    slt $s1, $s2, $s3    0    18   19   17   0       42
    jr  $s1              0    17   0    0    0       8

Control flow – Instruction formats
- beq and bne are I-format instructions

    Instruction          op   rs   rt   16-bit number
    beq $s1, $s2, 100    4    17   18   25

- The 16-bit number is in 2's complement form
- The 16-bit number specifies the number of instructions to be skipped, counted from the instruction after the branch
- Most branches are local [principle of locality]
- Next memory address = PC + 4 + (16-bit number x 4)

Control flow – Instruction formats
- The unconditional jump instruction requires a new instruction format (J-format)
- Example: j 10000

    op   26-bit number
    2    2500

- The address of the next instruction is obtained by concatenating the 4 upper bits of the PC with the 26-bit address shifted left 2 bits
- Next memory address = {PC[31:28], INS[25:0], 2'b00}
- Address boundaries of 256 MB (= 2^28 bytes)

MIPS register convention

    Name       Register number   Usage
    $zero      0                 the constant value 0
    $v0-$v1    2-3               values for results and expression evaluation
    $a0-$a3    4-7               arguments
    $t0-$t7    8-15              temporaries
    $s0-$s7    16-23             saved
    $t8-$t9    24-25             more temporaries
    $gp        28                global pointer
    $sp        29                stack pointer
    $fp        30                frame pointer
    $ra        31                return address

- Register 1 ($at) is reserved for the assembler; registers 26-27 ($k0-$k1) are reserved for the operating system

Constants
- Small constant operands occur quite frequently in programs (50% of operands)
- By including constants inside instructions, execution becomes much faster than if constants were loaded from memory

    Without immediate instructions:   lw   $t0, AddrConstant4($s1)
                                      add  $s3, $s3, $t0
    With immediate instructions:      addi $s3, $s3, 4

- Design principle: make the common case fast

Constants

    Category             Instruction               Example               Meaning
    Arithmetic           add immediate             addi $s1, $s2, 100    $s1 = $s2 + 100
    Logical              and immediate             andi $s1, $s2, 100    $s1 = $s2 & 100
    Logical              or immediate              ori  $s1, $s2, 100    $s1 = $s2 | 100
    Conditional branch   set less than immediate   slti $s1, $s2, 100    if ($s2 < 100) $s1 = 1; else $s1 = 0

- Why doesn't MIPS have a subtract immediate operation?
- Constants are frequently short and fit into the 16-bit field
- But what if they are bigger? It would be convenient to have a 32-bit constant or address!

How about large constants?
- The compiler or the assembler must break large constants into pieces and then reassemble them into registers
- We'd like to be able to load a 32-bit constant into a register
- Must use two instructions; the new "load upper immediate" instruction sets the upper 16 bits and fills the lower 16 bits with zeros

    lui $t0, 1010101010101010

    $t0 = 1010101010101010 0000000000000000

- Then we must get the lower-order bits right, i.e.,

    ori $t0, $t0, 1010101010101010

             1010101010101010 0000000000000000
    ori      0000000000000000 1010101010101010
    ------------------------------------------
             1010101010101010 1010101010101010

Assembly language vs. machine language
- Assembly provides a convenient symbolic representation
  - much easier than writing down numbers
  - e.g., destination first
- Machine language is the underlying reality
  - e.g., destination is no longer first
- Assembly can provide "pseudoinstructions"
  - e.g., "move $t0, $t1" exists only in assembly
  - would be implemented using "add $t0, $t1, $zero"
- When considering performance you should count real instructions

Translating and starting a program
[Figure]

Overview of MIPS
- Simple instructions, all 32 bits wide
- Very structured
- Only three instruction formats

    R   op   rs   rt   rd   shamt   funct
    I   op   rs   rt   16-bit number
    J   op   26-bit number

- Rely on the compiler to achieve performance

Summary – MIPS operands

    32 registers
      Example:   $s0-$s7, $t0-$t9, $zero, $a0-$a3, $v0-$v1, $gp, $fp, $sp, $ra, $at
      Comments:  Fast locations for data. In MIPS, data must be in registers to perform arithmetic. MIPS register $zero always equals 0. Register $at is reserved for the assembler to handle large constants.

    2^30 memory words
      Example:   Memory[0], Memory[4], ..., Memory[4294967292]
      Comments:  Accessed only by data transfer instructions. MIPS uses byte addresses, so sequential words differ by 4. Memory holds data structures, such as arrays, and spilled registers, such as those saved on procedure calls.
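The lui/ori pair and the two address computations from the control flow slides can be sketched in a few lines of Python. This is a minimal illustration, not a simulator: all function names are hypothetical, registers are modeled as plain integers, and the branch offset is assumed to be counted from the instruction after the branch (PC + 4), which is what the "PC + 4 + 100" entry in the instruction summary table indicates. The jump target follows the slide's formula {PC[31:28], INS[25:0], 2'b00}.

```python
MASK32 = 0xFFFFFFFF


def split_constant(value):
    """Split a 32-bit constant into the 16-bit halves used by lui and ori."""
    value &= MASK32
    return value >> 16, value & 0xFFFF


def lui(imm16):
    """Model lui: imm16 goes in the upper half, the lower half is zeros."""
    return ((imm16 & 0xFFFF) << 16) & MASK32


def ori(reg, imm16):
    """Model ori: OR the register with the zero-extended 16-bit immediate."""
    return (reg | (imm16 & 0xFFFF)) & MASK32


def sign_extend16(field):
    """Interpret a 16-bit field as a 2's complement number."""
    field &= 0xFFFF
    return field - 0x10000 if field & 0x8000 else field


def branch_target(pc, field16):
    """PC-relative addressing: PC + 4 + (16-bit number x 4)."""
    return (pc + 4 + 4 * sign_extend16(field16)) & MASK32


def jump_target(pc, field26):
    """Pseudo-direct addressing: upper 4 PC bits, 26-bit field, then 00."""
    return (pc & 0xF0000000) | ((field26 & 0x03FFFFFF) << 2)


upper, lower = split_constant(0xAAAAAAAA)
t0 = ori(lui(upper), lower)          # rebuilds the full 32-bit constant
print(hex(t0))
print(branch_target(0x1000, 25))     # beq with 25 in the 16-bit field
print(jump_target(0x00000000, 2500)) # j 10000
```

Because the jump keeps the upper 4 bits of the PC, a j instruction can only reach targets inside the current 256 MB region, while the branch can reach roughly 2^15 instructions in either direction.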
Summary – MIPS instructions

MIPS assembly language

    Category             Instruction               Example               Meaning                                   Comments
    Arithmetic           add                       add  $s1, $s2, $s3    $s1 = $s2 + $s3                           Three operands; data in registers
                         subtract                  sub  $s1, $s2, $s3    $s1 = $s2 - $s3                           Three operands; data in registers
                         add immediate             addi $s1, $s2, 100    $s1 = $s2 + 100                           Used to add constants
    Data transfer        load word                 lw   $s1, 100($s2)    $s1 = Memory[$s2 + 100]                   Word from memory to register
                         store word                sw   $s1, 100($s2)    Memory[$s2 + 100] = $s1                   Word from register to memory
                         load byte                 lb   $s1, 100($s2)    $s1 = Memory[$s2 + 100]                   Byte from memory to register
                         store byte                sb   $s1, 100($s2)    Memory[$s2 + 100] = $s1                   Byte from register to memory
                         load upper immediate      lui  $s1, 100         $s1 = 100 * 2^16                          Loads constant in upper 16 bits
    Conditional branch   branch on equal           beq  $s1, $s2, 25     if ($s1 == $s2) go to PC + 4 + 100        Equal test; PC-relative branch
                         branch on not equal       bne  $s1, $s2, 25     if ($s1 != $s2) go to PC + 4 + 100        Not equal test; PC-relative
                         set on less than          slt  $s1, $s2, $s3    if ($s2 < $s3) $s1 = 1; else $s1 = 0      Compare less than; for beq, bne
                         set less than immediate   slti $s1, $s2, 100    if ($s2 < 100) $s1 = 1; else $s1 = 0      Compare less than constant
    Unconditional jump   jump                      j    2500             go to 10000                               Jump to target address
                         jump register             jr   $ra              go to $ra                                 For switch, procedure return
                         jump and link             jal  2500             $ra = PC + 4; go to 10000                 For procedure call

Addressing modes in MIPS
- Register addressing (e.g., add, sll, nor, slt, …): the operand is a register
- Immediate addressing (e.g., addi, ori, …): the operand is a constant within the instruction
- Base addressing (e.g., lw, sw) [register + offset]: the operand is at the memory location whose address is the sum of a register and a constant in the instruction
- PC-relative addressing (e.g., bne, beq) [PC + offset]: the address is the sum of the PC and a constant in the instruction
- Pseudo-direct addressing (e.g., j) [PC concatenation]: the address is the concatenation of a value in the instruction with the upper bits of the PC

Addressing modes in MIPS
[Figure]

Alternative architectures
- Design alternative:
  - provide more powerful operations
  - goal is to reduce the number of instructions executed
  - danger is a slower cycle time and/or a higher CPI
- RISC vs. CISC
  - virtually all instruction sets since 1982 are RISC
  - common architectures: 80x86, IA-32, PowerPC, …

"Spilling" registers
- Many programs have more variables than computers have registers
- The compiler tries to keep the most frequently used variables in registers and places the rest in memory
- The process of putting less commonly used variables (or those needed later) into memory is called spilling registers

Supporting procedures
- Goal: structure programs so that code is easier to understand and reuse
- Execution of a procedure
  1. Place parameters in a place where the procedure can access them
  2. Transfer control to the procedure
  3. Acquire the storage resources needed for the procedure
  4. Place the result value in a place where the calling program can access it
  5. Return control to the point of origin
- Basic MIPS facilities
  - $a0-$a3 -> 4 argument registers in which to pass parameters
  - $v0-$v1 -> 2 value registers in which to return values
  - $ra -> return address register to return to the point of origin
  - jal ProcedureAddress -> jump-and-link (it saves PC+4 in $ra)
  - jr $ra -> jump register

What if we need more registers?
- A compiler may need more registers for a procedure than the four argument and two return value registers
- The local variables we need may not even fit in the MIPS registers
- Also, any register needed by the caller must be restored to the value it contained before the procedure was invoked
- We need to spill the registers to memory
- The ideal data structure for spilling registers is a stack (last in, first out)
  - push: place data on the stack
  - pop: remove data from the stack

The stack
- MIPS allocates a register just for the stack: the stack pointer ($sp)
  - it points to the most recently allocated address in the stack
- Stacks grow from higher addresses to lower addresses
  - push values on the stack by subtracting from $sp
  - pop values from the stack by adding to $sp

Example – compiling a leaf procedure

    int leaf_example (int g, int h, int i, int j)
    {
        int f;
        f = (g + h) - (i + j);
        return f;
    }

[Figure: memory containing main, which executes "jal leaf_example", and leaf_example]

Example – compiling a leaf procedure

    leaf_example:
        addi $sp, $sp, -12      # adjust stack to make room for 3 items
        sw   $t1, 8($sp)        # save $t1 for later use by the caller
        sw   $t0, 4($sp)        # save $t0 for later use by the caller
        sw   $s0, 0($sp)        # save $s0 for later use by the caller
        add  $t0, $a0, $a1      # t0 = g+h
        add  $t1, $a2, $a3      # t1 = i+j
        sub  $s0, $t0, $t1      # s0 = t0-t1 = (g+h)-(i+j)
        add  $v0, $s0, $zero    # v0 = s0+0 (return f)
        lw   $s0, 0($sp)        # restore $s0 for caller
        lw   $t0, 4($sp)        # restore $t0 for caller
        lw   $t1, 8($sp)        # restore $t1 for caller
        addi $sp, $sp, 12       # adjust stack to delete 3 items
        jr   $ra                # jump back to calling routine

Reduce register spilling
- To avoid saving and restoring a register whose value is never used, MIPS separates temporary registers from saved registers:
  - $t0-$t9: are not preserved by the callee (called procedure)
  - $s0-$s7: must be preserved on a procedure call (if used, the callee saves and restores them)
- Thus, the assembly code generated for leaf_example by a compiler can be more compact!
- Can you guess what the code should look like?
- When a compiler finds a leaf procedure, it exhausts all temporary registers before using registers it must save

Nested procedures
- Procedures that do not call other procedures are called leaf procedures
- Procedures that call other procedures are called nested procedures
- When a procedure calls another procedure, the solution is again to push all the registers that must be preserved onto the stack
  - the caller pushes any argument registers ($a0-$a3) or temporary registers ($t0-$t9) that it still needs after the call
  - the callee (called procedure) pushes the return address $ra and any saved registers ($s0-$s7) that it uses

Automatic and static variables
- Automatic variables are local to a procedure and are discarded when the procedure exits
- Static variables exist across exits from and entries to procedures
  - variables declared outside of procedures and any variable declared using the keyword static
- MIPS reserves a register called the global pointer ($gp) to simplify access to static data

Allocating space for new data on the stack
- The stack is also used to store variables that are local to the procedure and do not fit in registers (e.g., local arrays or structures)
- The segment of the stack containing a procedure's saved registers and local variables is called a procedure frame
- MIPS uses a frame pointer ($fp) to point to the first word of the frame
- The stack pointer may move during the procedure, so references to a local variable in memory may have different offsets
- A frame pointer offers a stable base register within a procedure

Allocating space for new data on the heap
- Certain data structures (e.g., linked lists) tend to grow and shrink during their lifetime
- The segment for such data structures is called the heap
- Stack and heap grow toward each other
- Memory "leaks" can use up this segment
[Figure: memory layout; surviving labels are "program code" and "for the OS"]

Text processing
- Because of the popularity of text in some programs, MIPS provides instructions to move bytes and half words
- C uses 8 bits to represent characters (ASCII = American Standard Code for Information Interchange)
- Java uses 16 bits to represent characters (Unicode)
- load byte (lb) loads a byte from memory, placing it in the rightmost 8 bits of a register
- store byte (sb) takes a byte from the rightmost 8 bits of a register and writes it to memory
- load half word (lh)
- store half word (sh)

MIPS instructions and high level programming languages
- Arithmetic/logical instructions correspond to the operations found in assignments
- Data transfers are most likely to occur when dealing with data structures like arrays or structures
- Conditional branches are used in if statements and in loops
- Unconditional jumps are used in procedure calls and returns and for case/switch statements

Concluding remarks
- Instruction complexity is only one variable
  - lower instruction count vs. higher CPI / lower clock rate
- Design principles:
  - simplicity favors regularity
  - smaller is faster
  - good design demands compromise
  - make the common case fast
- Instruction set architecture: a very important abstraction!

Common misconceptions and mistakes
- More powerful instructions mean higher performance
- Writing in assembly language gives the highest performance
- Forgetting that sequential word addresses in machines with byte addressing do not differ by one
- Using a pointer to an automatic variable outside its defining procedure
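The third pitfall in the list above (word addresses step by 4, not by 1) can be made concrete with a minimal Python sketch. The helper names are hypothetical and memory is modeled only as arithmetic on integer addresses; the array offsets reproduce the A[8] and A[12] example from the data transfer slides, and the last helper shows the two byte orders discussed under "Byte order within a word".

```python
WORD_BYTES = 4  # a MIPS word is 32 bits = 4 bytes


def word_address(base, index):
    """Byte address of element `index` in an array of words starting at `base`."""
    return base + WORD_BYTES * index


def is_aligned(address):
    """A word address is aligned when its 2 least significant bits are 00."""
    return address & 0b11 == 0


def bytes_in_memory(word, byteorder):
    """Bytes of a 32-bit word, listed from the lowest byte address upward.

    byteorder is "big" or "little".
    """
    return list(word.to_bytes(WORD_BYTES, byteorder))


# Offsets used by "lw $t0, 32($s3)" and "sw $t0, 48($s3)" for A[8] and A[12].
print(word_address(0, 8), word_address(0, 12))

# Neighbouring words are 4 bytes apart, not 1.
print(word_address(100, 1) - word_address(100, 0))

# The same word stored with each byte order.
print([hex(b) for b in bytes_in_memory(0x0A0B0C0D, "big")])
print([hex(b) for b in bytes_in_memory(0x0A0B0C0D, "little")])
```

Multiplying the index by 4 is exactly the step that is easy to forget when translating array accesses by hand, which is why the offsets in the lw/sw example are 32 and 48 rather than 8 and 12.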