Chapter 5: Basic MIPS Architecture
Adapted from Mary Jane Irwin at Penn State University for Computer Organization and Design, Patterson & Hennessy, © 2005
810:142 Lectures 3 & 4: Datapath & Control, Fall 2006

The Processor: Datapath & Control
Our implementation of the MIPS is simplified
- memory-reference instructions: lw, sw
- arithmetic-logical instructions: add, sub, and, or, slt
- control flow / branch instructions: beq, j
Generic implementation
- use the program counter (PC) to supply the instruction address and fetch the instruction from memory (and update the PC)
- decode the instruction (and read registers)
- execute the instruction
[Figure: Fetch (PC = PC+4), Decode, Exec loop]
All instructions (except j) use the ALU after reading the registers
- How? For memory-reference instructions? Arithmetic? Control flow?

Clocking Methodologies
The clocking methodology defines when signals can be read and when they are written
- An edge-triggered methodology
Typical execution
- read contents of state elements
- send values through combinational logic
- write results to one or more state elements
[Figure: State element 1, Combinational logic, State element 2; one clock cycle between clock edges]
Assumes state elements are written on every clock cycle; if not, an explicit write control signal is needed
- the write occurs only when both the write control is asserted and the clock edge occurs

Fetching Instructions
Fetching instructions involves
- reading the instruction from the Instruction Memory
- updating the PC to hold the address of the next instruction
[Figure: PC drives the Read Address input of the Instruction Memory and an adder that adds 4; the Instruction Memory outputs the Instruction]
PC is updated every cycle, so it does not need an explicit write control signal
Instruction Memory is read every cycle, so it does not need an explicit read control signal

Decoding Instructions
Decoding instructions involves
- sending the fetched instruction's opcode and function field bits to the control unit
- reading two values from the Register File; the Register File addresses are contained in the instruction
[Figure: Instruction feeds the Control Unit and the Register File (inputs Read Addr 1, Read Addr 2, Write Addr, Write Data; outputs Read Data 1, Read Data 2)]

Executing R Format Operations
R format operations (add, sub, slt, and, or)
R-type fields: op (bits 31-26), rs (25-21), rt (20-16), rd (15-11), shamt (10-6), funct (5-0)
- perform the (op and funct) operation on values in rs and rt
- store the result back into the Register File (into location rd)
[Figure: Register File read outputs feed the ALU (ALU control input; overflow and zero outputs); the ALU result returns to Write Data; RegWrite controls the Register File]
The Register File is not written every cycle (e.g. sw), so we need an explicit write control signal for the Register File

Executing Load and Store Operations
Load and store operations involve
- computing the memory address by adding the base register (read from the Register File during decode) to the 16-bit sign-extended offset field in the instruction
- for a store, writing the value (read from the Register File during decode) to the Data Memory
- for a load, writing the value read from the Data Memory to the Register File
[Figure: Register File, Sign Extend (16 to 32 bits), ALU, and Data Memory (Address, Write Data, Read Data), with control signals RegWrite, ALU control, MemWrite, and MemRead]

Executing Branch Operations
Branch operations involve
- comparing the operands read from the Register File during decode for equality (zero ALU output)
- computing the branch target address by adding the updated PC to the 16-bit sign-extended offset field in the instruction, shifted left by 2 bits
[Figure: PC + 4 adder, Sign Extend (16 to 32 bits), Shift left 2, and a second adder producing the branch target address; the ALU zero output goes to the branch control logic]
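
The address arithmetic for lw/sw and beq described above can be sketched in a few lines of Python. This is only an illustration: the function names and the 32-bit mask are assumptions of this sketch and do not appear in the slides.

```python
MASK32 = 0xFFFFFFFF  # keep results within 32 bits, like the hardware adders


def sign_extend16(value: int) -> int:
    """Sign-extend a 16-bit field to a 32-bit two's-complement pattern."""
    value &= 0xFFFF
    return (value | 0xFFFF0000) if (value & 0x8000) else value


def memory_address(base: int, offset16: int) -> int:
    """lw/sw effective address: base register + sign-extended offset."""
    return (base + sign_extend16(offset16)) & MASK32


def branch_target(pc: int, offset16: int) -> int:
    """beq target: updated PC (PC + 4) + (sign-extended offset shifted left 2)."""
    return (pc + 4 + (sign_extend16(offset16) << 2)) & MASK32
```

For example, `branch_target(0x00400000, 3)` adds 3 words (12 bytes) to the updated PC, and a negative offset such as `0xFFFC` (that is, -4) moves the target backwards.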
Executing Jump Operations
Jump operation involves
- replacing the lower 28 bits of the PC with the lower 26 bits of the fetched instruction shifted left by 2 bits
[Figure: PC, Instruction Memory, and the PC + 4 adder; the 26-bit field of the Instruction is shifted left 2 to form the 28-bit jump address]

Creating a Single Datapath from the Parts
Assemble the datapath segments and add control lines and multiplexors as needed
Single cycle design: fetch, decode, and execute each instruction in one clock cycle
- no datapath resource can be used more than once per instruction, so some must be duplicated (e.g., separate Instruction Memory and Data Memory, several adders)
- multiplexors are needed at the input of shared elements, with control lines to do the selection
- write signals control writing to the Register File and Data Memory
Cycle time is determined by the length of the longest path

Control signals for Fetch, Reg, and Memory
[Figure: PC, Instruction Memory, Register File, Sign Extend (16 to 32 bits), ALU, and Data Memory, with control signals RegWrite, ALUSrc, ALU control, MemtoReg, MemWrite, and MemRead]
What determines the values needed on these control signals?

Adding the Control
- Selecting the operations to perform (ALU, Register File and Memory read/write)
- Controlling the flow of data (multiplexor inputs)
Instruction formats
- R-type: op (bits 31-26), rs (25-21), rt (20-16), rd (15-11), shamt (10-6), funct (5-0)
- I-type: op (bits 31-26), rs (25-21), rt (20-16), address offset (15-0)
- J-type: op (bits 31-26), target address (25-0)
Observations
- the op field is always in bits 31-26
- the addresses of the registers to be read are always specified by the rs field (bits 25-21) and rt field (bits 20-16); for lw and sw, rs is the base register
- the address of the register to be written is in one of two places: in rt (bits 20-16) for lw; in rd (bits 15-11) for R-type instructions
- the offset for beq, lw, and sw is always in bits 15-0

Single Cycle Datapath with Control Unit
[Figure: single cycle datapath with the Control Unit driven by Instr[31-26] and producing ALUOp, Branch, MemRead, MemtoReg, MemWrite, ALUSrc, RegWrite, and RegDst; PCSrc selects between PC + 4 and the branch target; Instr[25-21] and Instr[20-16] are the read addresses; a multiplexor selects Instr[20-16] (0) or Instr[15-11] (1) as the write address; Instr[15-0] feeds Sign Extend; Instr[5-0] feeds ALU control]

R-type Instruction Data/Control Flow
[Figure: the single cycle datapath with control unit, showing the data and control flow for an R-type instruction]

Load Word Instruction Data/Control Flow
[Figure: the single cycle datapath with control unit, showing the data and control flow for lw]

Branch Instruction Data/Control Flow
[Figure: the single cycle datapath with control unit, showing the data and control flow for beq]
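
The observations about instruction fields and the control signals named in the figures can be put together in a small Python sketch. The slides do not list the signal values, so the table below is an assumption taken from the standard textbook control table for this datapath; the opcode numbers are the usual MIPS encodings, and all function and field names are hypothetical.

```python
from typing import NamedTuple, Optional


class Control(NamedTuple):
    reg_dst: Optional[int]     # 1: write register is rd, 0: rt, None: don't care
    alu_src: int               # 1: second ALU input is the sign-extended offset
    mem_to_reg: Optional[int]  # 1: register write data comes from Data Memory
    reg_write: int
    mem_read: int
    mem_write: int
    branch: int
    alu_op: int                # 2-bit ALUOp passed to the ALU control


# Assumed values (standard textbook table, not stated in the slides).
MAIN_CONTROL = {
    0x00: Control(1, 0, 0, 1, 0, 0, 0, 0b10),        # R-type
    0x23: Control(0, 1, 1, 1, 1, 0, 0, 0b00),        # lw
    0x2B: Control(None, 1, None, 0, 0, 1, 0, 0b00),  # sw
    0x04: Control(None, 0, None, 0, 0, 0, 1, 0b01),  # beq
}


def fields(instr: int) -> dict:
    """Split a 32-bit instruction into the fields used by the datapath."""
    return {
        "op": (instr >> 26) & 0x3F,     # bits 31-26
        "rs": (instr >> 21) & 0x1F,     # bits 25-21
        "rt": (instr >> 16) & 0x1F,     # bits 20-16
        "rd": (instr >> 11) & 0x1F,     # bits 15-11
        "funct": instr & 0x3F,          # bits 5-0
        "offset": instr & 0xFFFF,       # bits 15-0
        "target": instr & 0x03FFFFFF,   # bits 25-0
    }


def write_register(instr: int) -> Optional[int]:
    """Register number written by the instruction, or None if RegWrite is 0."""
    f = fields(instr)
    ctl = MAIN_CONTROL[f["op"]]
    if not ctl.reg_write:
        return None
    return f["rd"] if ctl.reg_dst else f["rt"]


def pc_src(instr: int, alu_zero: int) -> int:
    """PCSrc: take the branch target only if Branch is set and the ALU reports zero."""
    return MAIN_CONTROL[fields(instr)["op"]].branch & alu_zero


def jump_address(pc: int, instr: int) -> int:
    """j target: upper 4 bits of PC + 4 joined with the 26-bit target shifted left 2."""
    return ((pc + 4) & 0xF0000000) | (fields(instr)["target"] << 2)
```

As a usage example, `write_register(0x012A4020)` (an add with rd = 8) selects rd, while `write_register(0x8D280004)` (a lw with rt = 8) selects rt. The j instruction is handled by `jump_address` alone here, since its control row is not part of the assumed table.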
Adding the Jump Operation
[Figure: the single cycle datapath with control unit, extended for j: Instr[25-0] is shifted left 2 (26 to 28 bits) and combined with PC+4[31-28] to form the 32-bit jump address, which a multiplexor controlled by the Jump signal selects as the next PC]

Single Cycle Disadvantages & Advantages
- Uses the clock cycle inefficiently: the clock cycle must be timed to accommodate the slowest instruction
  - especially problematic for more complex instructions like floating point multiply
[Figure: clock timing for lw in Cycle 1 and sw in Cycle 2; the unused part of Cycle 2 is marked Waste]
- May be wasteful of area, since some functional units (e.g., adders) must be duplicated because they cannot be shared during a clock cycle
but
- Is simple and easy to understand

Multicycle Datapath Approach
Let an instruction take more than 1 clock cycle to complete
- Break up instructions into steps where each step takes a cycle while trying to
  - balance the amount of work to be done in each step
  - restrict each cycle to use only one major functional unit
- Not every instruction takes the same number of clock cycles
In addition to faster clock rates, multicycle allows functional units to be used more than once per instruction as long as they are used on different clock cycles. As a result
- only one memory is needed, but only one memory access per cycle
- only one ALU/adder is needed, but only one ALU operation per cycle

Multicycle Datapath Approach, continued
At the end of a cycle
- Store values needed in a later cycle by the current instruction in an internal register (not visible to the programmer). All (except IR) hold data only between a pair of adjacent clock cycles (no write control signal needed)
  - IR: Instruction Register
  - MDR: Memory Data Register
  - A, B: regfile read data registers
  - ALUout: ALU output register
- Data used by subsequent instructions are stored in programmer visible registers (i.e., register file, PC, or memory)
[Figure: multicycle datapath with PC, a single Memory (Instr. or Data), IR, MDR, Register File, A, B, ALU, and ALUout]

The Multicycle Datapath with Control Signals
[Figure: multicycle datapath with Control driven by Instr[31-26] and producing PCWriteCond, PCWrite, IorD, MemRead, MemWrite, MemtoReg, IRWrite, PCSource, ALUOp, ALUSrcB, ALUSrcA, RegWrite, and RegDst; Instr[25-0] is shifted left 2 (to 28 bits) and combined with PC[31-28]; Instr[15-0] feeds Sign Extend and Shift left 2; the constant 4 is one of the ALU inputs; Instr[5-0] feeds ALU control]

Multicycle Control Unit
Multicycle datapath control signals are not determined solely by the bits in the instruction
- e.g., op code bits tell what operation the ALU should be doing, but not what instruction cycle is to be done next
Must use a finite state machine (FSM) for control
- a set of states (current state stored in State Register)
- next state function (determined by current state and the input)
- output function (determined by current state and the input)
[Figure: combinational control logic that takes the Inst Opcode and the current state from the State Register as inputs and drives the datapath control points and the next state]
The Five Steps of the Load Instruction
[Figure: lw passes through IFetch, Dec, Exec, Mem, and WB in Cycles 1-5]
- IFetch: Instruction Fetch and Update PC
- Dec: Instruction Decode, Register Read, Sign Extend Offset
- Exec: Execute R-type; Calculate Memory Address; Branch Comparison; Branch and Jump Completion
- Mem: Memory Read; Memory Write Completion; R-type Completion (RegFile write)
- WB: Memory Read Completion (RegFile write)
INSTRUCTIONS TAKE FROM 3 - 5 CYCLES!

Multicycle Advantages & Disadvantages
- Uses the clock cycle efficiently: the clock cycle is timed to accommodate the slowest instruction step
[Figure: timing over Cycles 1-10: lw takes Cycles 1-5 (IFetch, Dec, Exec, Mem, WB), sw takes Cycles 6-9 (IFetch, Dec, Exec, Mem), and an R-type instruction starts with IFetch in Cycle 10]
- Multicycle implementations allow functional units to be used more than once per instruction as long as they are used on different clock cycles
but
- Requires additional internal state registers, more muxes, and more complicated (FSM) control

Single Cycle vs. Multiple Cycle Timing
Single Cycle Implementation:
[Figure: lw fills Cycle 1; sw finishes before the end of Cycle 2, and the remainder of the cycle is marked Waste]
Multiple Cycle Implementation:
[Figure: timing over Cycles 1-10: lw takes Cycles 1-5 (IFetch, Dec, Exec, Mem, WB), sw takes Cycles 6-9 (IFetch, Dec, Exec, Mem), and an R-type instruction starts with IFetch in Cycle 10]
- the multicycle clock period is somewhat longer than 1/5th of the single cycle clock period because of state register overhead

Next Lecture and Reminders
Next lecture
- MIPS pipelined datapath
  - Reading assignment: PH, Chapter 6.1-6.3
Reminders
- HW1 due September 7th
  - Ch 4 Exercises: 4.7, 4.17, 4.18, 4.45, 4.46, 4.51
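
Looking back at the multicycle step sequence and the single cycle vs. multiple cycle timing comparison, the cycle counts can be tallied with a short Python sketch. The step lists follow the five-step description in these slides; the equal step time, the `overhead` parameter, and all names are assumptions made for illustration, not measured values.

```python
# Steps each instruction class passes through, per the five-step description.
STEPS = {
    "lw":     ["IFetch", "Dec", "Exec", "Mem", "WB"],
    "sw":     ["IFetch", "Dec", "Exec", "Mem"],
    "R-type": ["IFetch", "Dec", "Exec", "Mem"],
    "beq":    ["IFetch", "Dec", "Exec"],
    "j":      ["IFetch", "Dec", "Exec"],
}


def next_state(state: str, instr_class: str) -> str:
    """FSM next-state function: depends on the current state and the instruction."""
    steps = STEPS[instr_class]
    i = steps.index(state)
    # After its last step, an instruction hands over to the next IFetch.
    return steps[i + 1] if i + 1 < len(steps) else "IFetch"


def multicycle_cycles(program: list) -> int:
    """Total clock cycles when each instruction takes only the steps it needs."""
    return sum(len(STEPS[name]) for name in program)


def multicycle_time(program: list, step_time: float, overhead: float = 0.0) -> float:
    """Each cycle lasts one step plus an assumed state-register overhead."""
    return multicycle_cycles(program) * (step_time + overhead)


def single_cycle_time(program: list, step_time: float) -> float:
    """Idealized: every instruction gets a cycle as long as the slowest one (lw)."""
    longest = max(len(steps) for steps in STEPS.values())
    return len(program) * longest * step_time
```

Under these idealized assumptions, the sequence lw, sw, R-type takes 13 multicycle cycles against 3 single cycle clocks that are each five steps long; a nonzero `overhead` narrows that gap, which is the point of the state register remark above.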