Execution Cycle

Outline
• (Brief) Review of MIPS Microarchitecture
• Execution Cycle
• Pipelining
• Big vs. Little Endian-ness
• CPU Execution Time
[Figure: staggered IF, ID, EX, MEM, WB stages of overlapping instructions]

MIPS Microarchitecture
• Recall the datapath for the lw (load word) instruction
• The first step was to fetch the instruction
• The next step was to decode the instruction
• Next, execute the instruction
• Next, access memory (if necessary)
• Finally, write back to a register

MIPS Microarchitecture
• Just described the classic 5-stage execution cycle
  • Fetch
  • Decode
  • Execute
  • Memory
  • Write Back
• 5-stage execution cycle is typical of RISC machines
  • RISC is easier to explain
  • CISC is more complicated…
  • x86 is CISC

Execution Cycle (aka Instruction Cycle)
• IF - Instruction Fetch
• ID - Instruction Decode
• EX - Execute
• MEM - Memory
• WB - Write Back

Execution Cycle - Fetch
• Send the program counter (PC) to memory
  • Fetch the current instruction from memory
• Update the PC
  • PC = PC + 4
  • (since each instruction is four bytes)

Execution Cycle - Decode
• Figure out the type of instruction (e.g., load, add, etc.)
  • Based on the “opcode”
• Determine the registers involved
  • aka “operands”
• Get things set up for execution
  • Control Unit sets the appropriate control signals

Execution Cycle - Execute
• ALU operates on the operands prepared during decode
• ALU performs a function based on the instruction type
  • Arithmetic (add, subtract, …)
  • Logic (equivalence, negation, …)

Execution Cycle - Memory
• If the instruction is a LOAD,
  • Read data from the effective memory address
  • Effective memory address was computed during EX
• If the instruction is a STORE,
  • Write data from a register to the effective memory address
  • Effective memory address was computed during EX
• MEM is an OPTIONAL execution stage
  • Memory access does not always occur

Execution Cycle - Write Back
• Write results “back” to a register
• Result type depends on the instruction
• Results could be from:
  • ALU computation, or
  • Memory access (i.e., load)

Execution Cycle - Fetch
• The execution cycle then repeats…
• The next instruction is already indicated by the PC
  • Recall that the PC was set to PC + 4 during the previous fetch

Pipelining
• It’s laundry day, and you have to complete the following tasks:
  1. Wash white clothes in washing machine
  2. Dry white clothes in dryer
  3. Wash color clothes in washing machine
  4. Dry color clothes in dryer
  5. Wash athletic clothes in washing machine
  6. Dry athletic clothes in dryer

Pipelining
• Would you do the following?
  • I.e., wait for each load to wash and dry before starting the next?
[Figure: timeline with the tasks done strictly one after another: wash whites, dry whites, wash colors, dry colors, wash athletic, dry athletic]

Pipelining
• Heck no!!
• What a waste of time!!
• What do you do instead?
Pipelining
• Overlap: wash one load while another is drying
• Do more things at once…
• Complete tasks in less time… FREE TIME!!
[Figure: timeline in which washing one load overlaps with drying the previous load]

Pipelining
• Can this be applied to the execution cycle?
  • Yes!!
• Fetch the next instruction while decoding the current instruction?
• Decode the next instruction while executing the current instruction?
• …
[Figure: the IF, ID, EX, MEM, WB stages of two instructions]

Pipelining
• Typical 5-stage pipeline of a RISC CPU
• 5th instruction is being fetched while 1st instruction is being written back
• There are much deeper and fancier pipelines…
[Figure: five instructions with staggered IF, ID, EX, MEM, WB stages]

Pipelining
• Typical 5-stage pipeline of a RISC CPU
• 9 clock cycles to complete 5 instructions

clock cycle   1    2    3    4    5    6    7    8    9
instr-1       IF   ID   EX   MEM  WB
instr-2            IF   ID   EX   MEM  WB
instr-3                 IF   ID   EX   MEM  WB
instr-4                      IF   ID   EX   MEM  WB
instr-5                           IF   ID   EX   MEM  WB

Pipelining
• Without pipelining
  • 25 clock cycles to complete 5 instructions
[Figure: instructions executed back to back: IF ID EX MEM WB … IF ID EX MEM WB]

Pipelining
• There are several things that can disrupt a pipeline
  • Called hazards
• E.g., what happens if the next instruction depends on the result of the current instruction?

Pipeline Hazards
• Three types of hazards
  • Control hazard
  • Data hazard
  • Structural hazard

Pipeline Hazards: Control
• Control Hazard
  • Occurs when pipelining branches (e.g., if statements)
  • … or other instructions that change the PC

Pipeline Hazards: Data
• Data Hazard
  • Occurs when an instruction tries to use data before it’s available
  • For example:
      1: R1 <- R2 + R3
      2: R4 <- R1 + R5
  • Instruction #2 may read the contents of R1 (register 1) before instruction #1 has finished writing its result there, so it would use a stale value.
  • Several types of data hazards…

Pipeline Hazards: Data
• Data Hazard
      1: R1 <- R2 + R3
      2: R4 <- R1 + R5

      1: IF   ID   EX   MEM  WB
      2:      IF   ID   EX   MEM  WB
• R1 is used in instruction #2’s execution before instruction #1 writes back

Pipeline Hazards: Structural
• Structural Hazard
  • Occurs when one hardware component is needed by two (pipelined) tasks at the same time
  • Example: read from and write to memory at the same time
    • Fetch an instruction from memory while writing data to memory
    • This is why instruction and data memory are separated

Pipelining: Solutions
• Ways to minimize pipeline hazards
  • Stall
  • Flush
  • Out-of-order execution
  • Forwarding
  • Bypassing
  • Branch prediction
  • …
• Beyond the scope of this course…
• Learn about / master pipeline hazards

Break Time!!!
I don’t fish, but this looks nice…

Big vs. Little Endian
• Some important jargon:

0x97 46 AB 07
1001 0111 0100 0110 1010 1011 0000 0111
• MSB: Most Significant Bit (the leftmost bit)
• LSB: Least Significant Bit (the rightmost bit)

Big vs. Little Endian

0x97 46 AB 07
1001 0111 0100 0110 1010 1011 0000 0111
• Most Significant Byte: 0x97 (1001 0111)
• Least Significant Byte: 0x07 (0000 0111)
• MSB can stand for most significant bit OR byte
• LSB can stand for least significant bit OR byte

Big vs. Little Endian
• Endianness refers to the ordering of bytes in multi-byte words
  • How the bytes are stored in memory
  • How the bytes are interpreted
• Whether the MSB comes “first” or “last”
• Whether the LSB comes “first” or “last”
• Here, MSB = Most Significant Byte and LSB = Least Significant Byte

Big Endian
• Most significant byte stored at smallest address
• Least significant byte stored at largest address

0x97 46 AB 07

address   byte
1000      97
1001      46
1002      AB
1003      07

Little Endian
• Most significant byte stored at largest address
• Least significant byte stored at smallest address

0x97 46 AB 07

address   byte
1000      07
1001      AB
1002      46
1003      97

Example
• Store 0x46 A0 B7 FF at addresses 1274 through 1277 using:

Big Endian          Little Endian
address   byte      address   byte
1274      46        1274      FF
1275      A0        1275      B7
1276      B7        1276      A0
1277      FF        1277      46

CPU Execution Time

$$\text{ExecutionTime} = (\#\,\text{instructions}) \left(\frac{\text{cycles}}{\text{instruction}}\right) \left(\frac{\text{seconds}}{\text{cycle}}\right)$$

Example
• What is the execution time required to complete:
  • 10 instructions
  • with 5 cycles per instruction
  • using a 100 Hz CPU?
• Hz = cycles / second
  • 100 Hz CPU = 100 cycles per second = 1 second per 100 cycles

$$\text{ExecutionTime} = (10) \left(\frac{5}{1}\right) \left(\frac{1}{100}\right) = 0.5 \text{ seconds}$$

Next Time...
• Review for midterm
• Bring your laptop (if you can)
• Linux bootstrap day
  • Install VMware onto your computer
  • Will need very soon!!
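
As a rough check of the cycle counts on the Pipelining slides (9 clock cycles with the 5-stage pipeline, 25 without, for 5 instructions), the sketch below counts cycles and rebuilds the staggered schedule. It assumes an ideal pipeline with no hazards or stalls, and the function names are made up for this illustration.

```python
# Illustrative sketch only: assumes an ideal 5-stage pipeline with no hazards or stalls.
STAGES = ["IF", "ID", "EX", "MEM", "WB"]


def unpipelined_cycles(num_instructions: int) -> int:
    """Each instruction runs through every stage before the next one starts."""
    return num_instructions * len(STAGES)


def pipelined_cycles(num_instructions: int) -> int:
    """The first instruction needs len(STAGES) cycles; each later one finishes one cycle after."""
    if num_instructions == 0:
        return 0
    return len(STAGES) + (num_instructions - 1)


def schedule(num_instructions: int) -> list[list[str]]:
    """Row i lists the stage that instruction i+1 occupies in each clock cycle ('' when idle)."""
    total = pipelined_cycles(num_instructions)
    rows = []
    for i in range(num_instructions):
        row = [""] * total
        for offset, stage in enumerate(STAGES):
            row[i + offset] = stage  # instruction i+1 enters IF in clock cycle i+1
        rows.append(row)
    return rows


if __name__ == "__main__":
    print(unpipelined_cycles(5), pipelined_cycles(5))
    for number, row in enumerate(schedule(5), start=1):
        print(f"instr-{number}", " ".join(f"{stage:<3}" for stage in row))
```

With more instructions the gap widens: under the same no-hazard assumption, the pipelined count grows by one cycle per extra instruction, while the unpipelined count grows by five.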
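
The dependency shown on the Pipeline Hazards: Data slides (instruction #2 reads R1, which instruction #1 writes) can be spotted mechanically. The sketch below is only an illustration: the `Instruction` tuple and `has_raw_dependency` are hypothetical names, it only checks one pair of instructions, and it says nothing about how real hardware resolves the hazard.

```python
from typing import NamedTuple


class Instruction(NamedTuple):
    """Hypothetical three-register form: dest <- src1 + src2."""
    dest: str
    src1: str
    src2: str


def has_raw_dependency(first: Instruction, second: Instruction) -> bool:
    """True if `second` reads the register that `first` writes (read-after-write)."""
    return first.dest in (second.src1, second.src2)


instr1 = Instruction("R1", "R2", "R3")  # 1: R1 <- R2 + R3
instr2 = Instruction("R4", "R1", "R5")  # 2: R4 <- R1 + R5

if __name__ == "__main__":
    print(has_raw_dependency(instr1, instr2))
```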
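
The byte orderings on the Big Endian and Little Endian slides can be checked with Python's built-in `int.to_bytes`. This is a small sketch rather than anything from the lecture: `memory_layout` is a hypothetical helper, and the addresses are plain integers, not real memory.

```python
def memory_layout(word: int, base_address: int, byteorder: str) -> list[tuple[int, str]]:
    """Return (address, byte-as-hex) pairs for a 4-byte word.

    byteorder is "big" or "little", as accepted by int.to_bytes.
    """
    raw = word.to_bytes(4, byteorder=byteorder)
    return [(base_address + offset, f"{value:02X}") for offset, value in enumerate(raw)]


if __name__ == "__main__":
    big = memory_layout(0x46A0B7FF, 1274, "big")
    little = memory_layout(0x46A0B7FF, 1274, "little")
    print("address  big  little")
    for (address, big_byte), (_, little_byte) in zip(big, little):
        print(f"{address}     {big_byte}   {little_byte}")
```

Calling `memory_layout(0x9746AB07, 1000, "little")` gives the layout for the word used on the earlier Little Endian slide.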
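
The CPU Execution Time formula is easy to turn into a few lines of code. The sketch below uses a made-up function name, `execution_time`, and takes the clock rate in Hz (cycles per second), so seconds per cycle is its reciprocal.

```python
def execution_time(num_instructions: int, cycles_per_instruction: float, clock_hz: float) -> float:
    """ExecutionTime = (# instructions) * (cycles / instruction) * (seconds / cycle)."""
    total_cycles = num_instructions * cycles_per_instruction
    # seconds per cycle = 1 / clock_hz, so multiplying by it is dividing by clock_hz
    return total_cycles / clock_hz


if __name__ == "__main__":
    # The slide example: 10 instructions, 5 cycles per instruction, 100 Hz CPU.
    print(execution_time(10, 5, 100), "seconds")
```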