ELEC 5200-001/6200-001 Computer Architecture and Design Spring 2016 Datapath and Control (Chapter 4) Vishwani D. Agrawal James J. Danaher Professor Department of Electrical and Computer Engineering Auburn University, Auburn, AL 36849 http://www.eng.auburn.edu/~vagrawal vagrawal@eng.auburn.edu Spring 2016, Feb 15 . . . ELEC 5200-001/6200-001 Lecture 5 1 Von Neumann Kitchen Start Registers PC ALU Control My choice Processor Program Data Input Spring 2016, Feb 15 . . . Memory ELEC 5200-001/6200-001 Lecture 5 Output 2 Where Does It All Begin? In a register called program counter (PC). PC contains the memory address of the next instruction to be executed. In the beginning, PC contains the address of the memory location where the program begins. Spring 2016, Feb 15 . . . ELEC 5200-001/6200-001 Lecture 5 3 Where is the Program? Processor Memory Program counter (register) Start address Spring 2016, Feb 15 . . . ELEC 5200-001/6200-001 Lecture 5 Machine code of program 4 How Does It Run? Start PC has memory address where program begins Fetch instruction word from memory address in PC and increment PC ← PC + 4 to point to next instruction Decode instruction Execute instruction Save result in register or memory No Spring 2016, Feb 15 . . . Program complete? ELEC 5200-001/6200-001 Lecture 5 Yes STOP 5 Datapath and Control Datapath: Memory, registers, adders, ALU, and communication buses. Each step (fetch, decode, execute, save result) requires communication (data transfer) paths between memory, registers and ALU. Control: Datapath for each step is set up by control signals that set up dataflow directions on communication buses and select ALU and memory functions. Control signals are generated by a control unit consisting of one or more finitestate machines. Spring 2016, Feb 15 . . . ELEC 5200-001/6200-001 Lecture 5 6 Datapath for Instruction Fetch Add 4 PC Spring 2016, Feb 15 . . . Address Instruction Memory ELEC 5200-001/6200-001 Lecture 5 Instruction word to control unit and registers 7 Register File: A Datapath Component 5 Read registers Write register Write data 5 reg 1 reg 2 32 reg 1 data 32 Registers (reg. file) 5 32 reg 2 data 32 RegWrite from control Spring 2016, Feb 15 . . . ELEC 5200-001/6200-001 Lecture 5 8 Multi-Operation ALU Operation select ALU function 000 001 010 110 111 AND OR Add Subtract Set on less than Operation select from control 3 zero ALU result overflow zero = 1, when all bits of result are 0 Spring 2016, Feb 15 . . . ELEC 5200-001/6200-001 Lecture 5 9 R-Type Instructions Also known as arithmetic-logical instructions add, sub, slt Example: add $t0, $s1, $s2 – Machine instruction word 000000 10001 10010 01000 00000 100000 opcode $s1 $s2 $t0 function – Read two registers – Write one register – Opcode and function code go to control unit that generates RegWrite and ALU operation code. Spring 2016, Feb 15 . . . ELEC 5200-001/6200-001 Lecture 5 10 Datapath for R-Type Instruction 01000 10010 10001 000000 10001 10010 01000 00000 100000 Operation opcode $s1 $s2 $t0 function (add) select from control (add) 5 $s1 Read 3 32 register zero numbers 5 $s2 32 Registers ALU result (reg. file) 32 overflow Write reg. 5 $t0 number Write data 32 RegWrite from control activated Spring 2016, Feb 15 . . . ELEC 5200-001/6200-001 Lecture 5 11 Load and Store Instructions I-type instructions lw $t0, 1200 ($t1) # incr. in bytes 100011 01001 01000 0000 0100 1011 0000 opcode $t1 sw $t0 $t0, 1200 ($t1) 1200 # incr. in bytes 101011 01001 01000 0000 0100 1011 0000 opcode $t1 Spring 2016, Feb 15 . . . $t0 ELEC 5200-001/6200-001 Lecture 5 1200 12 Datapath for lw Instruction Read register numbers 01001 100011 01001 01000 0000 0100 1011 0000 opcode $t1 $t0 1200 5 $t1 3 32 01000 5 Operation select from control (add) result ALU Addr. Data memory overflow $t0 Write data Write data 0000 0100 1011 0000 Read data zero 5 32 Registers (reg. file) Write reg. number MemWrite 32 32 RegWrite from control activated MemRead activated Sign extend 16 mem. data to $t0 Spring 2016, Feb 15 . . . ELEC 5200-001/6200-001 Lecture 5 13 Datapath for sw Instruction 01001 Read register numbers 5 01000 101011 01001 01000 0000 0100 1011 0000 opcode $t1 $t0 1200 5 Write reg. number $t1 3 32 Operation select from control (add) Read data zero $t0 result ALU 32 Registers (reg. file) Addr. Data memory overflow 5 32 Write data 0000 0100 1011 0000 MemWrite activated $t0 data to mem. 32 Write data 32 RegWrite from control Sign extend MemRead 16 Spring 2016, Feb 15 . . . ELEC 5200-001/6200-001 Lecture 5 14 Branch Instruction (I-Type) beq $s1, $s2, 25 # if $s1 = $s2, advance PC through 25 instructions 16-bits 000100 10001 10010 0000 0000 0001 1001 opcode $s1 $s2 25 Note: Can branch within ± 215 words from the current instruction address in PC. Spring 2016, Feb 15 . . . ELEC 5200-001/6200-001 Lecture 5 15 Datapath for beq Instruction 10001 Read register numbers 5 10010 16-bits 000100 10001 10010 0000 0000 0001 1001 opcode $s1 $s2 25 5 Write reg. number Operation select from control (subtract) $s1 32 3 $s2 32 Registers zero (reg. file) ALU 5 32 result To branch control logic overflow 0000 0000 0001 1001 Write data 32 16 Sign extend Spring 2016, Feb 15 . . . PC+4 From instruction 32 fetch datapath RegWrite from control 32 Shift left 2 Add Branch target 32 32 ELEC 5200-001/6200-001 Lecture 5 16 J-Type Instruction j 2500 # jump to instruction 2,500 26-bits 000010 0000 0000 0000 0010 0111 0001 00 opcode 2,500 32-bit jump address 0000 0000 0000 0000 0010 0111 0001 0000 bits 28-31 from PC+4 Spring 2016, Feb 15 . . . ELEC 5200-001/6200-001 Lecture 5 17 Datapath for Jump Instruction Branch 4 Add Branch addr. 32 Jump 1 mux 0 32 32 0 mux 1 PC+4 4 32 32 PC Spring 2016, Feb 15 . . . 28 Shift left 2 Address Instruction Memory 26 32 ELEC 5200-001/6200-001 Lecture 5 6 opcode (bits 26-31) to control Instruction word to control and registers 18 Instr. mem. 16-20 11-15 Combined Datapaths 0-15 Sign ext. Shift left 2 0 mux 1 1 mux 0 ALU zero MemtoReg MemWrite MemRead Data mem. 0 mux 1 PC 1 mux 0 21-25 ALU 26-31 Branch Reg. File opcode CONTROL RegDst Add 4 Jump Shift left 2 1 mux 0 0-25 ALU Cont. 0-5 Spring 2016, Feb 15 . . . ELEC 5200-001/6200-001 Lecture 5 19 Control RegDst Jump Branch MemRead Instruction bits 26-31 opcode Control Logic MemtoReg ALUOp MemWrite ALUSrc RegWrite Instruction bits 0-5 funct. Spring 2016, Feb 15 . . . 2 ALU Control ELEC 5200-001/6200-001 Lecture 5 to ALU 20 Control Logic: Truth Table Inputs: instr. opcode bits RegDst Jump ALUSrc MemtoReg RegWrite MemRead MemWrite Branch ALOOp1 ALUOp2 Instr type Outputs: control signals R 0 0 0 0 0 0 1 0 0 0 1 0 0 0 1 0 lw 1 0 0 0 1 1 0 0 1 1 1 1 0 0 0 0 sw 1 0 1 0 1 1 X 0 1 X 0 0 1 0 0 0 beq 0 0 0 1 0 0 X 0 0 X 0 0 0 1 0 1 j 0 0 0 0 1 0 X 1 X X 0 X 0 X X X 31 30 29 28 27 26 Spring 2016, Feb 15 . . . ELEC 5200-001/6200-001 Lecture 5 21 How Long Does It Take? Assume control logic is fast and does not affect the critical timing. Major time delay components are ALU, memory read/write, and register read/write. Arithmetic-type (R-type) Fetch (memory read) Register read ALU operation Register write Total Spring 2016, Feb 15 . . . ELEC 5200-001/6200-001 Lecture 5 2ns 1ns 2ns 1ns 6ns 22 Time for lw and sw (I-Types) ALU (R-type) Load word (I-type) – Fetch (memory read) – Register read – ALU operation – Get data (mem. Read) – Register write – Total 6ns 2ns 1ns 2ns 2ns 1ns Store word (no register write) Spring 2016, Feb 15 . . . ELEC 5200-001/6200-001 Lecture 5 8ns 7ns 23 Time for beq (I-Type) ALU (R-type) Load word (I-type) Store word (I-type) Branch on equal (I-type) – Fetch (memory read) – Register read – ALU operation – Total Spring 2016, Feb 15 . . . 6ns 8ns 7ns 2ns 1ns 2ns 5ns ELEC 5200-001/6200-001 Lecture 5 24 Time for Jump (J-Type) ALU (R-type) Load word (I-type) Store word (I-type) Branch on equal (I-type) Jump (J-type) – Fetch (memory read) 2ns – Total Spring 2016, Feb 15 . . . 6ns 8ns 7ns 5ns 2ns ELEC 5200-001/6200-001 Lecture 5 25 How Fast Can Clock Run? If every instruction is executed in one clock cycle, then: – Clock period must be at least 8ns to perform the longest instruction, i.e., lw. – This is a single cycle machine. – It is slower because many instructions take less than 8ns but are still allowed that much time. Method of speeding up: Use multicycle datapath. Spring 2016, Feb 15 . . . ELEC 5200-001/6200-001 Lecture 5 26 A Single Cycle Example Delay of 1-bit full adder = 1ns Clock period ≥ 32ns a31 . . . a2 a1 a0 1-b full adder 1-b full adder b31 . . . b2 b1 b0 1-b full adder c32 s31 . . . s2 s1 s0 1-b full adder Time of adding words ~ 32ns Time of adding bytes ~ 32ns 0 Spring 2016, Feb 15 . . . ELEC 5200-001/6200-001 Lecture 5 27 a31 . . . a2 a1 a0 b31 . . . b2 b1 b0 Spring 2016, Feb 15 . . . Delay of 1-bit full adder = 1ns Clock period ≥ 1ns Time of adding words ~ 32ns Time of adding bytes ~ 8ns 1-b full adder c32 FF Initialize to 0 ELEC 5200-001/6200-001 Lecture 5 s31 . . . s2 s1 s0 Shift Shift Shift A Multicycle Implementation 28 Instr. mem. 16-20 Single-cycle Datapaths 0-15 11-15 Sign ext. Shift left 2 0 mux 1 1 mux 0 ALU zero MemtoReg MemWrite MemRead Data mem. 0 mux 1 PC 1 mux 0 21-25 ALU 26-31 Branch Reg. File opcode CONTROL RegDst Add 4 Jump Shift left 2 1 mux 0 0-25 ALU Cont. 0-5 Spring 2016, Feb 15 . . . ELEC 5200-001/6200-001 Lecture 5 29 ALUOut Reg. 4 ALU A Reg. B Reg. Register file Mem. Data (MDR) Data Instr. reg. (IR) Addr. Memory PC Multicycle Datapath One-cycle data transfer paths (need registers to hold data) Spring 2016, Feb 15 . . . ELEC 5200-001/6200-001 Lecture 5 30 Multicycle Datapath Requirements Only one ALU, since it can be reused. Single memory for instructions and data. Five registers added: – Instruction register (IR) – Memory data register (MDR) – Three ALU registers, A and B for inputs and ALUOut for output Spring 2016, Feb 15 . . . ELEC 5200-001/6200-001 Lecture 5 31 MUX in1 control in2 MemRead MemWrite Spring 2016, Feb 15 . . . IRWrite MemtoReg 0-15 Sign extend 0-5 ELEC 5200-001/6200-001 Lecture 5 ALUOut Reg. ALU ALUSrcB 28-31 ALUSrcA A Reg. 11-15 RegDst B Reg. out RegWrite 16-20 PCSource Shift left 2 21-25 Register file Data 0-25 Mem. Data (MDR) IorD Memory Addr. PC 26-31 to Control FSM Instr. reg. (IR) PCWrite etc. Multicycle Datapath 4 Shift left 2 ALU control ALUOp 32 3 to 5 Cycles for an Instruction Step R-type (4 cycles) Mem. Ref. (4 or 5 cycles) Branch type (3 cycles) J-type (3 cycles) Instruction fetch IR ← Memory[PC]; PC ← PC+4 Instr. decode/ Reg. fetch A ← Reg(IR[21-25]); B ← Reg(IR[16-20]) ALUOut ← PC + {(sign extend IR[0-15]) << 2} ALUOut ← A op B Execution, addr. Comp., branch & jump completion Mem. Access or R-type completion Reg(IR[1115]) ← ALUOut Memory read completion Spring 2016, Feb 15 . . . ALUOut ← A+sign extend (IR[0-15]) If (A= =B) then PC←ALUOut PC←PC[2831] || (IR[0-25]<<2) MDR←M[ALUout] or M[ALUOut]←B Reg(IR[16-20]) ← MDR ELEC 5200-001/6200-001 Lecture 5 33 Cycle 1 of 5: Instruction Fetch (IF) Read instruction into IR, M[PC] → IR Control signals used: IorD MemRead IRWrite = = = 0 1 1 select PC read memory write IR Increment PC, PC + 4 → PC Control signals used: ALUSrcA ALUSrcB ALUOp PCSource PCWrite Spring 2016, Feb 15 . . . = = = = = 0 01 00 00 1 ELEC 5200-001/6200-001 Lecture 5 select PC into ALU select constant 4 ALU adds select ALU output write PC 34 Cycle 2 of 5: Instruction Decode (ID) 31-26 25-21 20-16 15-11 10-6 5-0 R opcode | reg 1 | reg 2 | reg 3 | shamt | fncode I opcode | reg 1 | reg 2 | word address increment J opcode | word address jump Control unit decodes instruction Datapath prepares for execution R and I types, reg 1→ A reg, reg 2 → B reg No control signals needed Branch type, compute branch address in ALUOut ALUSrcA ALUSrcB ALUOp Spring 2016, Feb 15 . . . =0 = 11 = 00 select PC into ALU Instr. Bits 0-15 shift 2 into ALU ALU adds ELEC 5200-001/6200-001 Lecture 5 35 Cycle 3 of 5: Execute (EX) R type: execute function on reg A and reg B, result in ALUOut Control signals used: ALUSrcA ALUsrcB ALUOp = = = 1 00 10 A reg into ALU B reg into ALU instr. Bits 0-5 control ALU I type, lw or sw: compute memory address in ALUOut ← A reg + sign extend IR[0-15] Control signals used: ALUSrcA ALUSrcB ALUOp Spring 2016, Feb 15 . . . = = = 1 10 00 ELEC 5200-001/6200-001 Lecture 5 A reg into ALU Instr. Bits 0-15 into ALU ALU adds 36 Cycle 3 of 5: Execute (EX) I type, beq: subtract reg A and reg B, write ALUOut to PC Control signals used: ALUSrcA = 1 ALUsrcB = 00 ALUOp = 01 If zero = 1, PCSource = 01 If zero = 1, PCwriteCond =1 Instruction complete, go to IF A reg into ALU B reg into ALU ALU subtracts ALUOut to PC write PC J type: write jump address to PC ← IR[0-25] shift 2 and four leading bits of PC Control signals used: PCSource = 10 PCWrite = 1 Instruction complete, go to IF Spring 2016, Feb 15 . . . ELEC 5200-001/6200-001 Lecture 5 write PC 37 Cycle 4 of 5: Reg Write/Memory R type, write destination register from ALUOut Control signals used: RegDst = 1 MemtoReg = 0 RegWrite = 1 Instruction complete, go to IF Instr. Bits 11-15 specify reg. ALUOut into reg. write register I type, lw: read M[ALUOut] into MDR Control signals used: IorD MemRead = = 1 1 select ALUOut as mem adr. read memory to MDR I type, sw: write M[ALUOut] from B reg Control signals used: IorD = 1 MemWrite = 1 Instruction complete, go to IF Spring 2016, Feb 15 . . . select ALUOut as mem adr. write memory ELEC 5200-001/6200-001 Lecture 5 38 Cycle 5 of 5: Reg Write I type, lw: write MDR to reg[IR(16-20)] Control signals used: RegDst = 0 instr. Bits 16-20 are write reg MemtoReg = 1 MDR to reg file write input RegWrite = 1 write reg. file Instruction complete, go to IF For an alternative method of designing datapath, see N. Tredennick, Microprocessor Logic Design, the Flowchart Method, Digital Press, 1987. Spring 2016, Feb 15 . . . ELEC 5200-001/6200-001 Lecture 5 39 1-bit Control Signals Signal name Value = 0 Value =1 RegDst Write reg. # = bit 16-20 Write reg. # = bit 11-15 RegWrite No action Write reg. ← Write data ALUSrcA First ALU Operand ← PC First ALU Operand←Reg. A MemRead No action Mem.Data Output←M[Addr.] MemWrite No action M[Addr.]←Mem. Data Input MemtoReg Reg.File Write In←ALUOut Reg.File Write In←MDR IorD Mem. Addr. ← PC Mem. Addr. ← ALUOut IRWrite No action IR ← Mem.Data Output PCWrite No action PC is written PCWriteCond No action PC is written if zero(ALU)=1 zero(ALU) PCWriteCond PCWrite Spring 2016, Feb 15 . . . ELEC 5200-001/6200-001 Lecture 5 PCWrite etc. 40 2-bit Control Signals Signal name Value ALUOp ALUSrcB PCSource Action 00 ALU performs add 01 ALU performs subtract 10 Funct. field (0-5 bits of IR ) determines ALU operation 00 Second input of ALU ← B reg. 01 Second input of ALU ← 4 (constant) 10 Second input of ALU ← 0-15 bits of IR sign ext. to 32b 11 Second input of ALU ← 0-15 bits of IR sign ext. and left shift 2 bits 00 ALU output (PC +4) sent to PC 01 ALUOut (branch target addr.) sent to PC 10 Jump address IR[0-25] shifted left 2 bits, concatenated with PC+4[28-31], sent to PC Spring 2016, Feb 15 . . . ELEC 5200-001/6200-001 Lecture 5 41 Control: Finite State Machine Start State 0 Clock cycle 1 Instruction fetch State 1 Clock cycle 2 Instruction decode and register fetch FSM-M Clock cycles 3-5 Memory access instr. Spring 2016, Feb 15 . . . FSM-R R-type instr. FSM-B Branch instr. ELEC 5200-001/6200-001 Lecture 5 FSM-J Jump instr. 42 State 0: Instruction Fetch (CC1) MUX in1 control in2 MemRead = 1 MemWrite Spring 2016, Feb 15 . . . IRWrite =1 RegWrite RegDst MemtoReg 0-15 Sign extend 0-5 ELEC 5200-001/6200-001 Lecture 5 4 ALUOut Reg. ALU ALUSrcB=01 28-31 ALUSrcA=0 A Reg. 16-20 B Reg. out Shift left 2 21-25 Register file Data 0-25 Mem. Data (MDR) IorD=0 Memory Addr. PC 26-31 to Control FSM Instr. reg. (IR) PCWrite etc.=1 PCSource=00 Add Shift left 2 ALU control ALUOp =00 43 State 0 Control FSM Outputs Start State0 Instruction fetch MemRead =1 ALUSrcA = 0 IorD = 0 IRWrite = 1 ALUSrcB = 01 ALUOp = 00 PCWrite = 1 PCSource = 00 Spring 2016, Feb 15 . . . State 1 Instruction decode/ Register fetch/ Branch addr. ELEC 5200-001/6200-001 Lecture 5 Outputs? 44 State 1: Instr. Decode/Reg. Fetch/ Branch Address (CC2) MUX in1 control in2 MemRead MemWrite Spring 2016, Feb 15 . . . IRWrite RegWrite RegDst MemtoReg 0-15 Sign extend 0-5 ELEC 5200-001/6200-001 Lecture 5 4 ALUOut Reg. ALU ALUSrcB=11 28-31 ALUSrcA=0 A Reg. 16-20 B Reg. out Shift left 2 21-25 Register file Data 0-25 Mem. Data (MDR) IorD Memory Addr. PC 26-31 to Control FSM Instr. reg. (IR) PCWrite etc. PCSource Add Shift left 2 ALU control ALUOp = 00 45 State 1 Control FSM Outputs Start State0 Instruction fetch MemRead =1 (IF) ALUSrcA = 0 IorD = 0 IRWrite = 1 ALUSrcB = 01 ALUOp = 00 PCWrite = 1 PCSource = 00 State 1 Instruction decode (ID) / Register fetch / Branch addr. ALUSrcA = 0 ALUSrcB = 11 ALUOp = 00 Opcode = BEQ FSM-M Spring 2016, Feb 15 . . . FSM-R FSM-B ELEC 5200-001/6200-001 Lecture 5 Opcode = J-type FSM-J 46 State 1 (Opcode = lw) → FSM-M (CC3-5) PCSource MUX in1 control in2 MemRead=1 MemWrite Spring 2016, Feb 15 . . . IRWrite RegWrite=1 4 RegDst=0 MemtoReg=1 0-15 Sign extend 0-5 ELEC 5200-001/6200-001 Lecture 5 CC3 ALU ALUSrcB=10 ALUSrcA=1 28-31 ALUOut Reg. CC5 A Reg. 16-20 B Reg. out Shift left 2 21-25 Register file Data 0-25 Mem. Data (MDR) IorD=1 Memory Addr. PC 26-31 to Control FSM Instr. reg. (IR) PCWrite etc. CC4 Add Shift left 2 ALU control ALUOp = 00 47 State 1 (Opcode= sw)→FSM-M (CC3-4) CC4 MUX in1 control IRWrite CC4 in2 MemRead MemWrite=1 Spring 2016, Feb 15 . . . RegWrite RegDst=0 MemtoReg 0-15 Sign extend 0-5 ELEC 5200-001/6200-001 Lecture 5 4 ALU ALUSrcB=10 CC3 ALUOut Reg. 28-31 ALUSrcA=1 A Reg. 16-20 B Reg. out Shift left 2 21-25 Register file Data 0-25 Mem. Data (MDR) IorD=1 Memory Addr. PC 26-31 to Control FSM Instr. reg. (IR) PCWrite etc. PCSource Add Shift left 2 ALU control ALUOp = 00 48 FSM-M (Memory Access) From state 1 Opcode = “lw” or “sw” Opcode Read Memory data Compute mem addrress ALUSrcA =1 ALUSrcB = 10 Opcode ALUOp = 00 = “lw” = “sw” MemRead = 1 IorD = 1 Write register RegWrite = 1 MemtoReg = 1 RegDst = 0 Spring 2016, Feb 15 . . . ELEC 5200-001/6200-001 Lecture 5 Write memory MemWrite = 1 IorD = 1 To state 0 (Instr. Fetch) 49 State 1(Opcode=R-type)→FSM-R (CC3-4) MUX in1 control in2 MemRead MemWrite Spring 2016, Feb 15 . . . IRWrite RegWrite RegDst=0 4 CC4 MemtoReg=0 0-15 Sign extend 0-5 ELEC 5200-001/6200-001 Lecture 5 ALUOut Reg. ALU CC3 ALUSrcB=00 11-15 28-31 ALUSrcA=1 A Reg. 16-20 B Reg. out Shift left 2 21-25 Register file Data 0-25 Mem. Data (MDR) IorD Memory Addr. PC 26-31 to Control FSM Instr. reg. (IR) PCWrite etc. PCSource “funct. code” Shift left 2 ALU control ALUOp = 10 50 FSM-R (R-type Instruction) From state 1 Opcode = R-type ALU operation ALUSrcA =1 ALUSrcB = 00 ALUOp = 10 Write register RegWrite = 1 MemtoReg = 0 RegDst = 1 Spring 2016, Feb 15 . . . ELEC 5200-001/6200-001 Lecture 5 To state 0 (Instr. Fetch) 51 State 1 (Opcode = beq ) → FSM-B (CC3) If(zero) MUX in1 control in2 MemRead MemWrite Spring 2016, Feb 15 . . . IRWrite RegDst MemtoReg 0-15 Sign extend 0-5 ELEC 5200-001/6200-001 Lecture 5 4 ALUOut Reg. zero ALU ALUSrcB=00 CC3 28-31 ALUSrcA=1 A Reg. 16-20 11-15 Shift left 2 RegWrite B Reg. out 21-25 Register file Data 0-25 Mem. Data (MDR) IorD Memory Addr. PC 26-31 to Control FSM Instr. reg. (IR) PCWrite etc.=1 PCSource 01 subtract Shift left 2 ALU control ALUOp = 01 52 Write PC on “zero” zero=1 PCWriteCond=1 PCWrite etc.=1 PCWrite PCWrite = 1, unconditionally write PC Cycle 1, fetch Cycle 3, jump Spring 2016, Feb 15 . . . ELEC 5200-001/6200-001 Lecture 5 53 FSM-B (Branch) From state 1 Opcode = “beq” Write PC on branch condition ALUSrcA =1 ALUSrcB = 00 ALUOp = 01 PCWriteCond=1 PCSource=01 Branch condition: If A – B=0 zero = 1 To state 0 (Instr. Fetch) Spring 2016, Feb 15 . . . ELEC 5200-001/6200-001 Lecture 5 54 State 1 (Opcode = j) → FSM-J (CC3) MUX in1 control in2 MemRead MemWrite Spring 2016, Feb 15 . . . IRWrite Shift left 2 RegWrite RegDst MemtoReg 0-15 Sign extend 0-5 ELEC 5200-001/6200-001 Lecture 5 ALUOut Reg. zero ALU ALUSrcB 28-31 ALUSrcA 11-15 A Reg. 16-20 B Reg. out PCSource 10 21-25 Register file Data 0-25 Mem. Data (MDR) IorD Memory Addr. PC 26-31 to Control FSM Instr. reg. (IR) PCWrite etc. CC3 4 Shift left 2 ALU control ALUOp 55 Write PC zero PCWriteCond PCWrite=1 PCWrite etc.=1 PCWrite = 1, unconditionally write PC Cycle 1, fetch Cycle 3, jump Spring 2016, Feb 15 . . . ELEC 5200-001/6200-001 Lecture 5 56 FSM-J (Jump) From state 1 Opcode = “jump” Write jump addr. In PC PCWrite=1 PCSource=10 To state 0 (Instr. Fetch) Spring 2016, Feb 15 . . . ELEC 5200-001/6200-001 Lecture 5 57 Control FSM Start State 0 1 Instr. decode/reg. fetch/branch addr. Instr. fetch/ adv. PC R 3 2 Read memory data Compute memory addr. lw 5 Write register Spring 2016, Feb 15 . . . B ALU operation 6 Write PC on branch condition 8 sw 4 J Write jump addr. to PC 9 7 Write memory data Write register ELEC 5200-001/6200-001 Lecture 5 58 Control FSM (Controller) 6 inputs (opcode) 16 control outputs Combinational logic Present state Next state Reset Clock FF FF FF FF Spring 2016, Feb 15 . . . ELEC 5200-001/6200-001 Lecture 5 59 Designing the Control FSM Encode states; need 4 bits for 10 states, e.g., – State 0 is 0000, state 1 is 0001, and so on. Write a truth table for combinational logic: Opcode 000000 .... Present state 0000 .... Control signals 0001000110000100 .... Next state 0001 .... Synthesize a logic circuit from the truth table. Connect four flip-flops between the next state outputs and present state inputs. Spring 2016, Feb 15 . . . ELEC 5200-001/6200-001 Lecture 5 60 Block Diagram of a Processor MemWrite Reset Clock Mem. Addr. Datapath (PC, register file, registers, ALU) Mem. write data Mem. data out Spring 2016, Feb 15 . . . ELEC 5200-001/6200-001 Lecture 5 61 ALUOp 3-bits PCWriteCond PCWrite IRWrite IorD MemtoReg RegWrite ALUSrcB 2-bits PCSource 2-bits RegDst ALUSrcA Overflow Opcode 6-bits zero ALU control funct. [0,5] ALUOp 2-bits Controller (Control FSM) MemRead Exceptions or Interrupts Conditions under which the processor may produce incorrect result or may “hang”. – Illegal or undefined opcode. – Arithmetic overflow, divide by zero, etc. – Out of bounds memory address. EPC: 32-bit register holds the affected instruction address. Cause: 32-bit register holds an encoded exception type. For example, – 0 for undefined instruction – 1 for arithmetic overflow Spring 2016, Feb 15 . . . ELEC 5200-001/6200-001 Lecture 5 62 MUX in1 control 1 EPC Overflow to Control FSM 0 in2 EPCWrite=1 ALU ALUSrcB=01 ALUSrcA=0 Instr. reg. (IR) 26-31 to Control FSM CauseWrite=1 out PCSource 11 8000 0180(hex) PC PCWrite etc.=1 Implementing Exceptions Cause 32-bit register 4 Subtract ALU control Spring 2016, Feb 15 . . . ELEC 5200-001/6200-001 Lecture 5 ALUOp =01 63 How Long Does It Take? Again Assume control logic is fast and does not affect the critical timing. Major time components are ALU, memory read/write, and register read/write. Time for hardware operations, suppose Memory read or write Register read ALU operation Register write Spring 2016, Feb 15 . . . ELEC 5200-001/6200-001 Lecture 5 2ns 1ns 2ns 1ns 64 Single-Cycle Datapath R-type Load word (I-type) Store word (I-type) Branch on equal (I-type) Jump (J-type) Clock cycle time = Each instruction takes one cycle Spring 2016, Feb 15 . . . ELEC 5200-001/6200-001 Lecture 5 6ns 8ns 7ns 5ns 2ns 8ns 65 Performance Parameters Average cycles per instruction (CPI) Cycle time (clock period, T) Execution time of a program For single-cycle datapath: CPI = 1 T = hardware time for lw instruction Execution time of a program = T × CPI × (Total instructions executed) Spring 2016, Feb 15 . . . ELEC 5200-001/6200-001 Lecture 5 66 Multicycle Datapath Clock cycle time is determined by the longest operation, ALU or memory: Clock cycle time = 2ns Cycles per instruction: lw sw R-type beq j Spring 2016, Feb 15 . . . 5 4 4 3 3 ELEC 5200-001/6200-001 Lecture 5 (10ns) (8ns) (8ns) (6ns) (6ns) 67 CPI of a Multicycle Computer = ∑k (Instructions of type k) × CPIk —————————————— ∑k (instructions of type k) CPIk = Cycles for instruction of type k Note: CPI is dependent on the instruction mix of the program being run. Standard benchmark programs are used for specifying the performance of CPUs. CPI where Spring 2016, Feb 15 . . . ELEC 5200-001/6200-001 Lecture 5 68 Example Consider a program containing: loads stores branches jumps Arithmetic CPI CPI Spring 2016, Feb 15 . . . 25% 10% 11% 2% 52% = 0.25×5 + 0.10×4 + 0.11×3 + 0.02×3 + 0.52×4 = 4.12 for multicycle datapath = 1.00 for single-cycle datapath ELEC 5200-001/6200-001 Lecture 5 69 Multicycle vs. Single-Cycle Performance ratio = Single cycle time / Multicycle time = (CPI × cycle time) for single-cycle ——————————————— (CPI × cycle time) for multicycle = 1.00 × 8ns ————— 4.12 × 2ns = 0.97 Single cycle is faster in this case, but is it always so? Spring 2016, Feb 15 . . . ELEC 5200-001/6200-001 Lecture 5 70 Consider Another Example For this program: loads stores branches jumps Arithmetic CPI CPI Spring 2016, Feb 15 . . . 5% 5% 30% 10% 50% = 0.05×5 + 0.05×4 + 0.30×3 + 0.10×3 + 0.50×4 = 3.65 for multicycle datapath = 1.00 for single-cycle datapath ELEC 5200-001/6200-001 Lecture 5 71 Multicycle vs. Single-Cycle Performance ratio = Single cycle time / Multicycle time = (CPI × cycle time) for single-cycle ——————————————— (CPI × cycle time) for multicycle = 1.00 × 8ns ————— 3.65 × 2ns = 1.096 Multicycle is faster in this case, so the performance ratio depends on the instruction mix. Can we do better? Spring 2016, Feb 15 . . . ELEC 5200-001/6200-001 Lecture 5 72 Next: Pipelining https://www.youtube.com/watch?v=IjarLbD9r30 https://www.youtube.com/watch?v=ANXGJe6i3G8 https://www.youtube.com/watch?v=5lp4EbfPAtI Spring 2016, Feb 15 . . . ELEC 5200-001/6200-001 Lecture 5 73