Computer Architecture
Chapter 2 Instructions: Language of the Computer
Yu-Lun Kuo 郭育倫
Department of Computer Science and Information Engineering
Tunghai University, Taichung, Taiwan R.O.C.
sscc6991@gmail.com
http://www.csie.ntu.edu.tw/~d95037/

Introduction
Computer designers have a common goal
  Find a language that makes it easy to build the hardware and the compiler
  Maximize performance and minimize cost
Instruction set
  The language of the machine; its vocabulary is called an instruction set
  The vocabulary of commands understood by a given architecture

Introduction
We'll be working with the MIPS instruction set architecture
  Similar to other architectures developed since the 1980s
  Almost 100 million MIPS processors manufactured in 2002
  Used by NEC, Nintendo, Cisco, Silicon Graphics, and Sony
Stored-program concept
  The idea that instructions and data of many types can be stored in memory as numbers, leading to the stored-program computer

CPU Manufacturer (1/2)
Intel: Pentium IV, IA-64, i3, i5, i7
AMD: K6-3, K7, Duron, Athlon
IBM: PowerPC
Sun: SPARC
HP: PA-RISC, IA-64
DEC: Alpha
MIPS: MIPS (Book)
VIA/Cyrix: C7 series
Motorola: DragonBall
  Used in Palm handheld devices

CPU Manufacturer (2/2)
[Chart, 1998 to 2002, vertical axis 0 to 1400; series: Other, SPARC, Hitachi SH, PowerPC, Motorola 68K, MIPS, IA-32, ARM]

RISC (Reduced Instruction Set Computer)
RISC philosophy
  Fixed instruction lengths
  Load-store instruction sets
  Limited addressing modes
  Limited operations
Instruction sets are measured by how well compilers use them, as opposed to how well assembly language programmers use them
Design goals: speed, cost (design, fabrication, test, packaging), size, power consumption, reliability, memory space

MIPS Instruction Set Architecture (ISA)
Instruction categories
  Computational
  Load/Store
  Jump and Branch
  Floating Point
  Memory Management
  Special
Registers: R0 - R31, PC, HI, LO
R format: OP | rs | rt | rd | sa | funct
I format: OP | rs | rt | immediate
J format: OP | jump target
3 instruction formats: all 32 bits wide

MIPS arithmetic
HLL -> MIPS Assembly -> MIPS Machine
High-level language statements are translated into assembly language
The translation process includes
  Assigning the variables in the high-level language statement to registers
  Translation into assembly

Operations of MIPS
All arithmetic instructions have 3 operands
Operand order is fixed (destination first)
Example:
  C code:    a = b + c;
  MIPS code: add a, b, c

MIPS Arithmetic Instructions
MIPS assembly language arithmetic statements
  add $t0, $s1, $s2
  sub $t0, $s1, $s2
Each arithmetic instruction performs only one operation
Each arithmetic instruction fits in 32 bits and specifies exactly three operands
  destination = source1 op source2
Operand order is fixed (destination first)

Operations of MIPS
Of course this complicates some things...
  C code:    a = b + c + d;
  MIPS code: add a, b, c
             add a, a, d
Operands must be registers
Only 32 registers are provided (MIPS)
Each register contains 32 bits (32 bits = 4 bytes = 1 word)

Example
Place the sum of variables b, c, d, and e into variable a
  add a, b, c
  add a, a, d
  add a, a, e    # a = b+c+d+e;
It takes three instructions to sum four variables
Text after # is a comment for the human reader

Operands of MIPS
All arithmetic instructions have 3 operands
“The natural number of operands for an operation like addition is three…requiring every instruction to have exactly three operands, no more and no less, conforms to the philosophy of keeping the hardware simple”
Each line of this language can contain at most one instruction

Operands of MIPS
Design Principle 1: Simplicity favors regularity
  A simple, fixed number of operands gives regularity
  Hardware for a variable number of operands is more complicated than hardware for a fixed number

Compiling C into MIPS
A C (Java) program contains the five variables a, b, c, d, and e
  a = b + c;
  d = a - e;
MIPS instructions
  Category    Instruction  Example       Meaning     Comments
  Arithmetic  add          add a, b, c   a = b + c   3 operands
              subtract     sub a, b, c   a = b - c   3 operands

Example
A complex statement contains five variables
  f = (g + h) - (i + j);
MIPS code:
  add t0, g, h     # temporary t0 = g + h
  add t1, i, j     # temporary t1 = i + j
  sub f, t0, t1    # f = t0 - t1

Operands of MIPS
Design Principle 2: Smaller is faster
  A very large number of registers may increase the clock cycle time, simply because electronic signals take longer when they must travel farther
Arithmetic instruction operands must be registers; only 32 registers are provided

Operands of MIPS (Registers)
Instructions could simply be written using numbers for registers, from 0 to 31
A name following a dollar sign represents a register
  Use $s0, $s1, … for registers that correspond to variables (variable registers)
  Use $t0, $t1, … for temporary registers

Example
It is the compiler's job to associate program variables with registers
  f = (g + h) - (i + j);
Variables f, g, h, i, and j are assigned to registers $s0, $s1, $s2, $s3, and $s4
  add $t0, $s1, $s2    # $t0 = g + h
  add $t1, $s3, $s4    # $t1 = i + j
  sub $s0, $t0, $t1    # f = $t0 - $t1

Registers vs. Memory
The processor can keep only a small amount of data in registers, but computer memory contains millions of data elements
[Figure: Processor (Control, Datapath), Memory, and I/O (Input, Output)]

Registers
Data is more useful in a register
MIPS registers both take less time to access and have higher throughput than memory
  Faster to access
  Highest performance
  Simpler to use

Memory Operands
Data transfer instructions
  Arithmetic operations occur only on registers in MIPS; thus MIPS must include instructions that transfer data between memory and registers
  To access a word in memory, the instruction must supply the memory address
  lw and sw: load copies from memory to a register, store copies from a register to memory
Addressing
  An address is a value used to delineate the location of a specific data element within a memory array

Memory Organization
Memory is viewed as a large, single-dimension array, with an address
A memory address is an index into the array
“Byte addressing” means that the index points to a byte of memory
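The array view of memory described above can be sketched in a few lines of Python. This is only a toy model under stated assumptions: the name `memory`, its 16-byte size, the helper `load_byte`, and the stored values are all made up for illustration and are not part of MIPS.

```python
# A toy model of byte-addressed memory: one array element per byte.
# The size (16 bytes) and the stored values are assumptions for illustration.
memory = bytearray(16)

# A memory address is simply an index into the array.
memory[0] = 0x12
memory[5] = 0xAB


def load_byte(address):
    """Return the byte stored at the given byte address (hypothetical helper)."""
    return memory[address]


# With byte addressing, consecutive addresses name consecutive bytes.
first = load_byte(0)
sixth = load_byte(5)
untouched = load_byte(6)  # never written, so still zero
```

Each index selects exactly one byte, which is why a 4-byte word occupies four consecutive addresses.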
Memory Organization
Copying data from memory to a register is called a load
  The MIPS name for this instruction is lw, standing for load word
The constant in the data transfer instruction is called the offset
The register added to form the address is called the base register
  It is the register used to access memory
[Figure: memory address = base + offset, with the base address held in $s3]

Operand is in Memory (Example)
A is an array of 100 bytes
The variables g and h are associated with registers $s1 and $s2
The starting address (base address) of the array is in $s3
C code:
  g = h + A[8];
MIPS code:
  lw  $t0, 8($s3)      # temporary $t0 gets the value at offset 8 from the base in $s3
  add $s1, $s2, $t0    # g = h + $t0

Memory Organization
Bytes are nice, but most data items use larger “words”
For MIPS, a word is 32 bits or 4 bytes
  [Figure: word addresses 0, 4, 8, 12, ..., each holding 32 bits of data]
Registers hold 32 bits of data
2^32 bytes with byte addresses from 0 to 2^32 - 1
Alignment restriction: words are aligned

Endian Problem
Since 8-bit bytes are so useful, most architectures address individual bytes in memory
  The memory address of a word must be a multiple of 4 (alignment restriction)
Big endian: the leftmost byte is the word address
  IBM 360/370, Motorola 68k, MIPS, SPARC, HP PA
Little endian: the rightmost byte is the word address
  Intel 80x86, DEC VAX, DEC Alpha (Windows NT)
[Figure: one word drawn from msb (left) to lsb (right); little endian numbers the bytes 3 2 1 0, so byte 0 is at the lsb end; big endian numbers them 0 1 2 3, so byte 0 is at the msb end]

Compiling Using Load and Store
Load and store instructions
  g is in register $s1, h is in register $s2
  $s3 is array A's base register
C code:
  A[12] = h + A[8];
MIPS code:
  lw  $t0, 32($s3)     # $t0 = A[8], at address $s3 + 4*8
  add $t0, $s2, $t0    # $t0 = h + A[8]
  sw  $t0, 48($s3)     # A[12] = $t0, at address $s3 + 4*12
Can refer to registers by name (e.g., $s2, $t0) instead of number
[Figure: array A in memory, showing A[8] at $s3 + 4*8, A[10], and A[12] at $s3 + 4*12]
Remember: arithmetic operands are registers, not memory!
Cannot write: add 48($s3), $s2, 32($s3)

Example
g is in register $s1, h is in register $s2, i is in register $s4
$s3 is array A's base register
C code:
  g = h + A[i];
MIPS code:
  add $t1, $s4, $s4    # $t1 = 2*i
  add $t1, $t1, $t1    # $t1 gets 4*i
  add $t1, $t1, $s3    # $t1 = address of A[i]
  lw  $t0, 0($t1)      # $t0 gets A[i]
  add $s1, $s2, $t0    # g = h + A[i]
Not legal: lw $t0, $t1($s3)    # the offset must be a constant, so the address is formed in $t1 first
[Figure: array A in memory, with A[0] at $s3 and A[i] at $s3 + 4*i]

Emphasis
Load: memory to register (lw)
Store: register to memory (sw)
[Figure: load copies from memory to a register, store copies from a register to memory; the memory address is base + offset]

MIPS Register Convention
Name        Register number  Usage                    Preserved on call?
$zero       0                constant 0 (hardware)    n.a.
$at         1                reserved for assembler   n.a.
$v0 - $v1   2-3              returned values          no
$a0 - $a3   4-7              arguments                yes
$t0 - $t7   8-15             temporaries              no
$s0 - $s7   16-23            saved values             yes
$t8 - $t9   24-25            temporaries              no
$gp         28               global pointer           yes
$sp         29               stack pointer            yes
$fp         30               frame pointer            yes
$ra         31               return addr (hardware)   yes
Register 1 ($at) is reserved for the assembler; registers 26-27 are reserved for the operating system

Compiling Using Load and Store

So Far We Learn
Design Principle 3: Make the common case fast
MIPS
  Loading words but addressing bytes
  Arithmetic on registers only
Instruction          Meaning
add $s1, $s2, $s3    $s1 = $s2 + $s3
sub $s1, $s2, $s3    $s1 = $s2 - $s3
lw  $s1, 100($s2)    $s1 = Memory[$s2+100]
sw  $s1, 100($s2)    Memory[$s2+100] = $s1

Spilling Registers
Many programs have more variables than computers have registers
  32 registers in MIPS
The compiler tries to keep the most frequently used variables in registers and places the rest in memory
The process of putting less commonly used variables (or those needed later) into memory is called spilling registers

Representing Instructions
Instructions, like registers and words of data, are also 32 bits long
Registers have numbers (0, 1, 2, …, 31)
  $s0 to $s7 map onto registers 16 to 23
  $t0 to $t7 map onto registers 8 to 15
Example: add $t1, $s1, $s2
  $t1 = 9, $s1 = 17, $s2 = 18
Instruction format (R format):
  op      rs     rt     rd     shamt  funct
  000000  10001  10010  01001  00000  100000
Can you guess what the field names stand for?
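The R-format example above can be reproduced by packing the six fields into one 32-bit word. The following Python sketch is illustrative only: the function name `encode_r_format` is hypothetical, while the field widths (6, 5, 5, 5, 5, 6) and the register numbers are the ones given on the slide.

```python
def encode_r_format(op, rs, rt, rd, shamt, funct):
    """Pack the six R-format fields (6, 5, 5, 5, 5, 6 bits) into one 32-bit word.

    Hypothetical helper for illustration; it rejects values that do not fit.
    """
    fields = [(op, 6), (rs, 5), (rt, 5), (rd, 5), (shamt, 5), (funct, 6)]
    word = 0
    for value, width in fields:
        if not 0 <= value < (1 << width):
            raise ValueError(f"field value {value} does not fit in {width} bits")
        # Make room for the next field, then place it in the low bits.
        word = (word << width) | value
    return word


# add $t1, $s1, $s2: op = 0, rs = $s1 = 17, rt = $s2 = 18,
# rd = $t1 = 9, shamt = 0, funct = 32
word = encode_r_format(0, 17, 18, 9, 0, 32)
bits = format(word, "032b")
```

Reading `bits` in groups of 6, 5, 5, 5, 5, and 6 digits gives back the six fields, which is one way to check the encoding by hand.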
Representing Instructions
Binary representation
  000000  10001   10010   01000   00000   100000
  6 bits  5 bits  5 bits  5 bits  5 bits  6 bits
Instruction format
  MIPS instructions are 32 bits long
  Simplicity favors regularity

Representing Instructions

MIPS Fields
Arithmetic instruction format (R format)
  add $t0, $s1, $s2
  op | rs | rt | rd | shamt | funct
op     6 bits  opcode that specifies the operation
rs     5 bits  register file address of the first source operand
rt     5 bits  register file address of the second source operand
rd     5 bits  register file address of the result's destination
shamt  5 bits  shift amount (for shift instructions)
funct  6 bits  function code augmenting the opcode

Representing Instructions
Machine language
  Binary representation used for communication within a computer system
Instruction format
  A form of representation of an instruction composed of fields of binary numbers

MIPS Memory Access Instructions
MIPS has two basic data transfer instructions for accessing memory
  lw $t0, 4($s3)    # load word from memory
  sw $t0, 8($s3)    # store word to memory
The load word instruction must specify two registers and a constant
  If the constant had to fit in one of the 5-bit fields, it would be limited to only 2^5 or 32
  A 5-bit field is too small to be useful (the constant is often much larger than 32)

One Size Fits All?
The compromise chosen by the MIPS designers
  Require different kinds of instruction formats for different kinds of instructions
  Keep all instructions the same length
Design Principle 4: Good design demands good compromises
We have 3 types of instructions
  R-type (register)
  I-type (immediate)
  J-type (jump)

Representing Instructions

Machine Language - Load Instruction
Load/store instruction format (I format)
  lw $t0, 24($s2)
  op | rs | rt | 16-bit offset
A 16-bit field means access is limited to memory locations within a region of ±2^15 or 32,768 bytes (±2^13 or 8,192 words) of the address in the base register
Note that the offset can be positive or negative

MIPS Instruction Encoding
Instruction     Format  op  rs   rt   rd    shamt  funct  address
add             R       0   reg  reg  reg   0      32     n.a.
sub             R       0   reg  reg  reg   0      34     n.a.
add immediate   I       8   reg  reg  n.a.  n.a.   n.a.   constant
lw              I       35  reg  reg  n.a.  n.a.   n.a.   address
sw              I       43  reg  reg  n.a.  n.a.   n.a.   address

Translate MIPS into Machine Language
Ex. $t1 has the base of array A, $s2 is h
  A[300] = h + A[300];
is compiled into
  lw  $t0, 1200($t1)    # $t0 gets A[300]
  add $t0, $s2, $t0     # $t0 gets h + A[300]
  sw  $t0, 1200($t1)    # A[300] gets h + A[300]
Field values (decimal)
  lw:  op 35 | rs 9  | rt 8 | address 1200
  add: op 0  | rs 18 | rt 8 | rd 8 | shamt 0 | funct 32
  sw:  op 43 | rs 9  | rt 8 | address 1200

Translate MIPS into Machine Language
Binary representation
  lw:  op 100011 | rs 01001 | rt 01000 | address 0000 0100 1011 0000
  add: op 000000 | rs 10010 | rt 01000 | rd 01000 | shamt 00000 | funct 100000
  sw:  op 101011 | rs 01001 | rt 01000 | address 0000 0100 1011 0000

So Far We Learn

Stored Program Concept
Instructions are bits
Programs are stored in memory, to be read or written just like data
Fetch and execute cycle
  Instructions are fetched and put into a special register (the instruction register)
  Bits in the register “control” the subsequent actions
  Fetch the “next” instruction and continue

Logical Operations
MIPS provides the usual bitwise logical instructions, which are also in x86
  and
  or
  nor (not or)
  and immediate
  or immediate
  shift left logical
  shift right logical
Table 2.10 shows a summary
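The logical instructions listed above can be mimicked in Python to see what each one computes. This is a sketch under stated assumptions, not a simulator: the helper names `sll`, `srl`, and `nor` and the constant `MASK32` are hypothetical, and results are masked to 32 bits to imitate the width of a MIPS register.

```python
MASK32 = 0xFFFFFFFF  # keep every result within 32 bits, like a MIPS register


def sll(value, shamt):
    """Shift left logical: zeros enter from the right, overflow bits are dropped."""
    return (value << shamt) & MASK32


def srl(value, shamt):
    """Shift right logical: zeros enter from the left."""
    return (value & MASK32) >> shamt


def nor(a, b):
    """NOT (a OR b), truncated to 32 bits."""
    return ~(a | b) & MASK32


# Sample register contents, chosen for illustration.
t1 = 0x00003C00
t2 = 0x00000D00

and_result = t1 & t2     # and
or_result = t1 | t2      # or
not_t1 = nor(t1, 0)      # nor with zero acts as NOT
shifted = sll(9, 4)      # 9 shifted left by 4 bits
restored = srl(shifted, 4)
```

MIPS has no separate NOT instruction, which is why the sketch builds it from `nor` with a zero operand.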
Logical Operations (MIPS instructions)
Logical operation  C operator  Java operator  MIPS instructions
Shift left         <<          <<             sll
Shift right        >>          >>>            srl
Bit-by-bit AND     &           &              and, andi
Bit-by-bit OR      |           |              or, ori
Bit-by-bit NOT     ~           ~              nor

Shift Operations
Shift left logical (sll)
  sll $t2, $s0, 4    # reg $t2 = reg $s0 << 4 bits
  0000 0000 0000 1001 = 9      ($s0)
  0000 0000 1001 0000 = 144    ($t2)
Shift right logical (srl)

and/or/not/nor
Example
  0000 0000 0000 0000 0000 1101 0000 0000    ($t2)
  0000 0000 0000 0000 0011 1100 0000 0000    ($t1)
and $t0, $t1, $t2
  0000 0000 0000 0000 0000 1100 0000 0000
or $t0, $t1, $t2
  0000 0000 0000 0000 0011 1101 0000 0000
not/nor
  A NOR 0 = NOT (A OR 0) = NOT (A)
nor $t0, $t1, $t3    # with $t3 = 0
  1111 1111 1111 1111 1100 0011 1111 1111

Summary of Logical Operations

Making Decisions
Decision-making instructions
  Alter the control flow, i.e., change the “next” instruction to be executed
MIPS conditional branch instructions (general form: beq register1, register2, L1):
  beq $s0, $s1, L1    # go to L1 if $s0 = $s1
  bne $s0, $s1, L1    # go to L1 if $s0 ≠ $s1
  beq: branch if equal
  bne: branch if not equal
Example:
  if (i == j) h = i + j;

         bne $s0, $s1, Label    # skip the add if i ≠ j
         add $s3, $s0, $s1      # h = i + j
  Label: ...

Branch-if-less-than
We have beq and bne; what about branch-if-less-than?
New instruction: set on less than
  slt $t0, $s1, $s2    # if $s1 < $s2 then $t0 = 1 else $t0 = 0
Branch on less than
  Can use this instruction to build “blt $s1, $s2, Label”
  Can now build general control structures
Note that the assembler needs a register to do this, so there are policy-of-use conventions for registers

Control
MIPS unconditional branch instructions:
  j  Label
  jr $s2       # jumps to the address held in $s2
Format:
  J: op | 26-bit address

Conditional Branches
Example
  if (i == j) f = g + h; else f = g - h;
[Figure: flowchart testing i == j; the true branch executes f = g + h, the false branch (Else:) executes f = g - h, and both reach Exit:]
        bne $s3, $s4, Else    # go to Else if i ≠ j
        add $s0, $s1, $s2     # f = g + h (skipped if i ≠ j)
        j   Exit              # go to Exit
  Else: sub $s0, $s1, $s2     # f = g - h (skipped if i = j)
  Exit:

Compiling a while Loop in C
Ex.
  while (save[i] == k) i = i + j;
[Figure: flowchart testing save[i] == k; the true branch executes i = i + j and loops back, the false branch exits]
  Loop: add $t1, $s3, $s3     # $t1 = 2 * i
        add $t1, $t1, $t1     # $t1 = 4 * i
        add $t1, $t1, $s6     # $t1 gets address of save[i]
        lw  $t0, 0($t1)       # $t0 gets save[i]
        bne $t0, $s5, Exit    # go to Exit if condition is false
        add $s3, $s3, $s4     # i = i + j
        j   Loop
  Exit:

Review of Instructions
Instruction            Meaning
add $s1, $s2, $s3      $s1 = $s2 + $s3
sub $s1, $s2, $s3      $s1 = $s2 - $s3
lw  $s1, 100($s2)      $s1 = Memory[$s2+100]
sw  $s1, 100($s2)      Memory[$s2+100] = $s1
bne $s4, $s5, Label    Next instr. is at Label if $s4 ≠ $s5
beq $s4, $s5, Label    Next instr. is at Label if $s4 = $s5
j   Label              Next instr. is at Label
Formats:
  R: op | rs | rt | rd | shamt | funct
  I: op | rs | rt | 16-bit address
  J: op | 26-bit address

More Branch Instructions (1/2)
The slt instruction: set on less than
  slt  $t0, $s3, $s4      # if $s3 < $s4 then $t0 = 1 else $t0 = 0
  slt  $t0, $s1, $zero    # compares $s1 to (register) zero
slti: slt immediate
  slti $t0, $s2, 10       # $t0 = 1 if $s2 < 10

More Branch Instructions (2/2)
There is no branch on “less than” directly
  • Because it is too complicated; two faster instructions are more useful
Can use slt, beq, bne, and the fixed value of 0 in register $zero to create other conditions
  – less than: blt $s1, $s2, Label
      slt $at, $s1, $s2        # $at set to 1 if $s1 < $s2
      bne $at, $zero, Label
  – less than or equal to: ble $s1, $s2, Label
  – greater than: bgt $s1, $s2, Label
  – greater than or equal to: bge $s1, $s2, Label

Case/Switch Statement
Jump address table
  A table of addresses of alternative instruction sequences
  An array of words
MIPS includes a jump register instruction (jr)
  An unconditional jump to the address specified in a register
  The program loads the appropriate entry from the jump table into a register
  It then jumps to the proper address using jump register
Described in Section 2.7

So far…
Arithmetic: add, sub
Data transfer: lw, sw
Logical: and, or, nor, andi, ori, sll, srl
Conditional branch: beq, bne, slt, slti
Unconditional jump: j

So Far We Learn: MIPS Operands
Name: 32 registers
  Example: $s0-$s7, $t0-$t9, $zero, $a0-$a3, $v0-$v1, $gp, $fp, $sp, $ra, $at
  Comments: Fast locations for data. In MIPS, data must be in registers to perform arithmetic. MIPS register $zero always equals 0. Register $at is reserved for the assembler to handle large constants.
Name: 2^30 memory words
  Example: Memory[0], Memory[4], ..., Memory[4294967292]
  Comments: Accessed only by data transfer instructions. MIPS uses byte addresses, so sequential words differ by 4. Memory holds data structures, such as arrays, and spilled registers, such as those saved on procedure calls.

So Far We Learn: MIPS Assembly Language
Category                     Instr                       Op code   Example             Meaning
Arithmetic (R & I format)    add                         0 and 32  add $s1, $s2, $s3   $s1 = $s2 + $s3
                             subtract                    0 and 34  sub $s1, $s2, $s3   $s1 = $s2 - $s3
                             add immediate               8         addi $s1, $s2, 6    $s1 = $s2 + 6
                             or immediate                13        ori $s1, $s2, 6     $s1 = $s2 OR 6
Data transfer (I format)     load word                   35        lw $s1, 24($s2)     $s1 = Memory($s2+24)
                             store word                  43        sw $s1, 24($s2)     Memory($s2+24) = $s1
                             load byte                   32        lb $s1, 25($s2)     $s1 = Memory($s2+25)
                             store byte                  40        sb $s1, 25($s2)     Memory($s2+25) = $s1
                             load upper imm              15        lui $s1, 6          $s1 = 6 * 2^16
Cond. branch (I & R format)  br on equal                 4         beq $s1, $s2, L     if ($s1==$s2) go to L
                             br on not equal             5         bne $s1, $s2, L     if ($s1 !=$s2) go to L
                             set on less than            0 and 42  slt $s1, $s2, $s3   if ($s2<$s3) $s1=1 else $s1=0
                             set on less than immediate  10        slti $s1, $s2, 6    if ($s2<6) $s1=1 else $s1=0
Uncond. jump (J & R format)  jump                        2         j 2500              go to 10000
                             jump register               0 and 8   jr $t1              go to $t1
                             jump and link               3         jal 2500            go to 10000; $ra=PC+4

So Far We Learn: MIPS Machine Language

Supporting Procedures
Registers play a major role in keeping track of information for function calls
Register conventions:
  Return address       $ra
  Arguments            $a0, $a1, $a2, $a3
  Return value         $v0, $v1
  Local variables      $s0, $s1, …, $s7
  Temporary variables  $t0, …, $t7, $t8, $t9
The stack is also used; more later

Instruction for Functions (C and MIPS)
  ... sum(a,b); ...
  /* a, b: $s0, $s1 */
  }
  int sum(int x, int y) {
    return x + y;
  }
MIPS addresses: 1000, 1004, 1008, 1012, 1016, …, 2000, 2004
In MIPS, all instructions are 4 bytes and are stored in memory just like data, so here we show the addresses at which the program is stored

Instruction for Functions
C:
  ... sum(a,b); ...    /* a, b: $s0, $s1 */
  }
  int sum(int x, int y) {
    return x + y;
  }
MIPS:
  address
  1000        add  $a0, $s0, $zero     # x = a
  1004        add  $a1, $s1, $zero     # y = b
  1008        addi $ra, $zero, 1016    # $ra = 1016
  1012        j    sum                 # jump to sum
  1016        ...
  …
  2000  sum:  add  $v0, $a0, $a1
  2004        jr   $ra                 # new instruction

Instruction for Functions
C:
  ... sum(a,b); ...    /* a, b: $s0, $s1 */
  }
  int sum(int x, int y) {
    return x + y;
  }
MIPS:
  2000  sum:  add  $v0, $a0, $a1
  2004        jr   $ra                 # new instruction
Question: Why use jr here? Why not simply use j?
Answer: sum might be called by many functions, so we can't return to a fixed place. The procedure calling sum must be able to say “return here” somehow.

Instruction for Functions
A single instruction to jump and save the return address: jump and link (jal)
Before:
  1008  addi $ra, $zero, 1016    # $ra = 1016
  1012  j    sum                 # go to sum
After:
  1008  jal  sum                 # $ra = 1012, go to sum
Why have a jal?
  Make the common case fast: function calls are very common
  Also, with jal you don't have to know where the code is loaded into memory

Instruction for Functions
The syntax for jal (jump and link) is the same as for j (jump):
  jal label
jal should really be called laj, for “link and jump”:
  Step 1 (link): Save the address of the next instruction into $ra (Why the next instruction? Why not the current one?)
  Step 2 (jump): Jump to the given label

Instruction for Functions
Syntax for jr (jump register):
  jr register
Instead of providing a label to jump to, the jr instruction provides a register that contains an address to jump to
Only useful if we know the exact address to jump to
Very useful for function calls:
  jal stores the return address in a register ($ra)
  jr $ra jumps back to that address

Instruction for Functions

The Stack Pointer

Procedure Call
  Must be preserved / need not be preserved

Memory Allocation on the Stack
  $fp and $sp

Memory Allocation on the Heap

Policy of Use Conventions

So Far We Learn

MIPS Machine Language

Loading, Storing Bytes
In addition to word data transfers (lw, sw), MIPS has byte and halfword data transfers
  load byte: lb
  store byte: sb
  load halfword: lh
  store halfword: sh
Same format as lw, sw

Assembly Language vs. Machine Language
Assembly provides a convenient symbolic representation
  Much easier than writing down numbers
  e.g., destination first
Machine language is the underlying reality
  e.g., destination is no longer first
Assembly can provide “pseudoinstructions”
  e.g., “move $t0, $t1” exists only in assembly
  It would be implemented as “add $t0, $t1, $zero”
When considering performance, you should count real instructions

Other Issues
Discussed in your assembly language programming lab:
  Support for procedures
  Linkers, loaders, memory layout
  Stacks, frames, recursion
  Manipulating strings and pointers
  Interrupts and exceptions
  System calls and conventions
Some of these we'll talk more about later
We'll talk about compiler optimizations when we hit Chapter 4

Overview of MIPS
Simple instructions, all 32 bits wide
Very structured, no unnecessary baggage
Only three instruction formats
  R: op | rs | rt | rd | shamt | funct
  I: op | rs | rt | 16-bit address
  J: op | 26-bit address
Rely on the compiler to achieve performance: what are the compiler's goals?
Help the compiler where we can

Addressing in Branches and Jumps
J type: 6 bits for the operation field, with the remaining 26 bits for the address
  e.g., j Loop     # go to label Loop
  or    j 10000    # go to location 10000
J-type format:
  2       10000
  6 bits  26 bits

Conditional branch instruction
  e.g., bne $s0, $s1, Exit    # go to Exit if $s0 != $s1
I-type format:
  5       16      17      Exit
  6 bits  5 bits  5 bits  16 bits
Restriction
  If the 16-bit field held the whole address, no program could be bigger than 2^16, which is far too small to be a realistic option today
Instead: program counter (PC) = register + branch address
  PC-relative addressing

Branch Far Away
Given a branch
  beq $s0, $s1, L1    # 16-bit offset
Replace it by a pair of instructions, which offers a much greater branching distance
      bne $s0, $s1, L2
      j   L1              # 26-bit address
  L2:

Addresses in Branches and Jumps
Instructions:
  bne $t4, $t5, Label    Next instruction is at Label if $t4 ≠ $t5
  beq $t4, $t5, Label    Next instruction is at Label if $t4 = $t5
  j   Label              Next instruction is at Label
Formats:
  I: op | rs | rt | 16-bit address
  J: op | 26-bit address
Addresses are not 32 bits
  How do we handle this with load and store instructions?
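One way to see how a 16-bit field can still reach useful targets is to compute the value a branch would store. The Python sketch below assumes the usual MIPS convention, which the slides do not spell out: the field holds a signed count of words relative to the instruction after the branch, so target = (PC + 4) + 4 * offset. The function names `branch_offset` and `branch_target` are hypothetical.

```python
def branch_offset(pc, target):
    """Return the signed 16-bit word offset for a branch located at address pc.

    Assumes the usual MIPS rule: target = (pc + 4) + 4 * offset.
    Raises ValueError when the target cannot be reached with a 16-bit field.
    """
    delta = target - (pc + 4)
    if delta % 4 != 0:
        raise ValueError("branch target must be word aligned")
    offset = delta // 4
    if not -(1 << 15) <= offset < (1 << 15):
        raise ValueError("target out of range; use a bne/j pair instead")
    return offset


def branch_target(pc, offset):
    """Invert branch_offset: recover the byte address the branch goes to."""
    return pc + 4 + 4 * offset


near = branch_offset(pc=1000, target=1016)   # forward branch
back = branch_offset(pc=1016, target=1000)   # backward branch
```

Because the offset counts words rather than bytes, the reachable range is four times larger than a byte offset would give, and a target beyond it needs the two-instruction replacement shown under "Branch Far Away".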
Addresses in Branches
Instructions:
  bne $t4, $t5, Label    Next instruction is at Label if $t4 ≠ $t5
  beq $t4, $t5, Label    Next instruction is at Label if $t4 = $t5
Format:
  I: op | rs | rt | 16-bit address
Could specify a register (like lw and sw) and add it to the address
  Use the instruction address register (PC = program counter)
  Most branches are local (principle of locality)

MIPS Addressing Mode

MIPS Instruction Formats

Translating and Starting a Program

Dynamically Linked Libraries
The traditional approach links libraries before the program is run
  This static approach is the fastest way to call library routines
Disadvantages
  If a new version of the library is released, the statically linked program keeps using the old version
  It loads the whole library even if not all of the library is used when the program is run
Dynamically linked libraries (DLLs)
  Not linked and loaded until the program is run

Starting a Java Program
The traditional model of executing a program
  Emphasis is on fast execution time for a program
Java was invented with a different set of goals
  Quickly run safely on any computer
  Even if it might slow execution time

Starting a Java Program (1/2)
Rather than compiling to the assembly language of a target computer, Java is compiled to the Java bytecode instruction set
  Bytecode instructions are easy to interpret
[Figure: Java program -> Compiler -> Class files (Java bytecode); the class files and the Java library routines run on the Java Virtual Machine (JVM); the Just In Time compiler (JIT) produces compiled Java methods]

Starting a Java Program (2/2)
Java Virtual Machine (JVM)
  http://java.com/zh_TW/download/installed.jsp
  A software interpreter that can execute Java bytecode
Just In Time compilers (JIT)
  Improve execution speed
  Typically profile the running program to find where the “hot” methods are
  Compile them into the native instruction set
  The compiled portion is saved for the next time the program is run, so that it can run faster each time it is run

How Compilers Optimize
High-level optimizations involve loop transformations
  Can reduce loop overhead
  Can improve memory access
In loops that execute many iterations
  Traditionally controlled by a for statement
  The optimization of loop unrolling is useful
Loop unrolling
  Taking a loop and replicating the body multiple times
  Reduces the loop overhead and provides opportunities for many other optimizations

How Compilers Optimize

MIPS (RISC) Design Principles
Simplicity favors regularity
  Fixed-size instructions: 32 bits
  Always requiring 3 register operands
  Small number of instruction formats
  Opcode always the first 6 bits
Good design demands good compromises
  Three instruction formats
Smaller is faster
  Limited instruction set
  Limited number of registers in the register file
  Limited number of addressing modes
Make the common case fast
  Arithmetic operands from the register file (load-store machine)
  Allow instructions to contain immediate operands

Alternative Architectures
Design alternative:
  Provide more powerful operations
  The goal is to reduce the number of instructions executed
  The danger is a slower cycle time and/or a higher CPI
  – “The path toward operation complexity is thus fraught with peril.
    To avoid these problems, designers have moved toward simpler instructions”
Let's look (briefly) at IA-32

IA-32
1978: The Intel 8086 is announced (16-bit architecture)
1980: The 8087 floating-point coprocessor is added
1982: The 80286 increases the address space to 24 bits and adds instructions
1985: The 80386 extends to 32 bits and adds new addressing modes
1989-1995: The 80486, Pentium, and Pentium Pro add a few instructions (mostly designed for higher performance)
1997: 57 new “MMX” instructions are added (Pentium II)
1999: The Pentium III adds another 70 instructions (SSE)
2001: Another 144 instructions are added (SSE2)
2003: AMD extends the architecture to increase the address space to 64 bits, widens all registers to 64 bits, and makes other changes (AMD64)
2004: Intel capitulates and embraces AMD64 (calling it EM64T) and adds more media extensions

IA-32 Overview
Complexity:
  Instructions from 1 to 17 bytes long
  One operand must act as both a source and destination
  One operand can come from memory
  Complex addressing modes, e.g., “base or scaled index with 8 or 32 bit displacement”
Saving grace:
  The most frequently used instructions are not too difficult to build
  Compilers avoid the portions of the architecture that are slow
“what the 80x86 lacks in style is made up in quantity, making it beautiful from the right perspective”

IA-32 Registers and Data Addressing
Registers in the 32-bit subset that originated with the 80386

IA-32 Register Restrictions
Registers are not “general purpose”; note the restrictions below

IA-32 Typical Instructions
Four major types of integer instructions:
  Data movement, including move, push, pop
  Arithmetic and logical (destination register or memory)
  Control flow (use of condition codes / flags)
  String instructions, including string move and string compare

IA-32 Instruction Formats
Typical formats (notice the different lengths)

Summary
Instruction complexity is only one variable
  Lower instruction count vs. higher CPI / lower clock rate
Design principles:
  Simplicity favors regularity
  Smaller is faster
  Good design demands compromise
  Make the common case fast
Instruction set architecture
  A very important abstraction indeed!