inst.eecs.berkeley.edu/~cs61c/su06 CS61C : Machine Structures Lecture #16 – Datapath 2006-07-25 Andy Carle CS 61C L16 Datapath (1) A Carle, Summer 2006 © UCB Anatomy: 5 components of any Computer Personal Computer Computer Processor This week Control (“brain”) Datapath (“brawn”) Memory (where programs, data live when running) Devices Input Output Keyboard, Mouse Disk (where programs, data live when not running) Display, Printer CS 61C L16 Datapath (2) A Carle, Summer 2006 © UCB Outline • Design a processor: step-by-step • Requirements of the Instruction Set • Hardware components that match the instruction set requirements CS 61C L16 Datapath (3) A Carle, Summer 2006 © UCB How to Design a Processor: step-by-step • 1. Analyze instruction set architecture (ISA) => datapath requirements • meaning of each instruction is given by the register transfers • datapath must include storage element for ISA registers • datapath must support each register transfer • 2. Select set of datapath components and establish clocking methodology • 3. Assemble datapath meeting requirements • 4. Analyze implementation of each instruction to determine setting of control points that effects the register transfer. • 5. Assemble the control logic CS 61C L16 Datapath (4) A Carle, Summer 2006 © UCB Step 1: The MIPS Instruction Formats • All MIPS instructions are 32 bits long. 3 formats: 31 26 op • R-type rs 6 bits 31 • I-type 26 op 31 16 rt 5 bits 5 bits 21 rs 6 bits • J-type 21 5 bits 11 rd shamt funct 5 bits 5 bits 6 bits 16 6 bits • The different fields are: 0 0 address/immediate rt 5 bits 16 bits 26 op 6 0 target address 26 bits • op: operation (“opcode”) of the instruction • rs, rt, rd: the source and destination register specifiers • shamt: shift amount • funct: selects the variant of the operation in the “op” field • address / immediate: address offset or immediate value • target address: target address of jump instruction CS 61C L16 Datapath (5) A Carle, Summer 2006 © UCB Step 1: The MIPS-lite Subset for today • ADD and SUB 31 •addU rd,rs,rt op 31 op 31 •lw rt,rs,imm16 •sw rt,rs,imm16 31 • BRANCH: •beq rs,rt,imm16 CS 61C L16 Datapath (6) 26 op 6 bits rs 5 bits shamt funct 5 bits 5 bits 6 bits 0 16 bits 0 immediate 5 bits 21 0 rd 16 rt 5 bits 6 immediate 5 bits 21 rs 11 16 rt 5 bits 26 6 bits 5 bits 21 rs op 16 rt 5 bits 26 •ori rt,rs,imm166 bits • LOAD and STORE Word 21 rs 6 bits •subU rd,rs,rt • OR Immediate: 26 16 bits 16 rt 5 bits 0 immediate 16 bits A Carle, Summer 2006 © UCB Step 1: Register Transfer Language • RTL gives the meaning of the instructions {op , rs , rt , rd , shamt , funct} = MEM[ PC ] {op , rs , rt , Imm16} = MEM[ PC ] • All start by fetching the instruction inst Register Transfers ADDU R[rd] = R[rs] + R[rt]; PC = PC + 4 SUBU R[rd] = R[rs] – R[rt]; PC = PC + 4 ORI R[rt] = R[rs] | zero_ext(Imm16); PC = PC + 4 LOAD R[rt] = MEM[ R[rs] + sign_ext(Imm16)];PC = PC + 4 STORE MEM[ R[rs] + sign_ext(Imm16) ] = R[rt];PC = PC + 4 BEQ CS 61C L16 Datapath (7) if ( R[rs] == R[rt] ) then PC = PC + 4 + sign_ext(Imm16)] << 2 else PC = PC + 4 A Carle, Summer 2006 © UCB Step 1: Requirements of the Instruction Set • Memory (MEM) • instructions & data • Registers (R: 32 x 32) • read RS • read RT • Write RT or RD • PC • Extender (sign extend) • Add and Sub register or extended immediate • Add 4 or extended immediate to PC CS 61C L16 Datapath (8) A Carle, Summer 2006 © UCB Step 1: Abstract Implementation PC Clk Next Address ALU Control Ideal Instruction Instruction Control Signals Conditions Memory Rd Rs Rt 5 5 5 Instruction Address A Data Data 32 Address Rw Ra Rb 32 Ideal Out 32 32-bit 32 Data Data Registers B Memory In Clk 32 Clk Datapath CS 61C L16 Datapath (9) A Carle, Summer 2006 © UCB How to Design a Processor: step-by-step • 1. Analyze instruction set architecture (ISA) => datapath requirements • meaning of each instruction is given by the register transfers • datapath must include storage element for ISA registers • datapath must support each register transfer • 2. Select set of datapath components and establish clocking methodology • 3. Assemble datapath meeting requirements • 4. Analyze implementation of each instruction to determine setting of control points that effects the register transfer. • 5. Assemble the control logic (hard part!) CS 61C L16 Datapath (10) A Carle, Summer 2006 © UCB Step 2a: Components of the Datapath •Combinational Elements •Storage Elements • Clocking methodology CS 61C L16 Datapath (11) A Carle, Summer 2006 © UCB Combinational Logic: More Elements A B 32 Adder •Adder CarryIn 32 Sum Carry 32 Select B 32 MUX •MUX A 32 Y 32 OP A CS 61C L16 Datapath (12) B ALU •ALU 32 32 Result 32 A Carle, Summer 2006 © UCB ALU Needs for MIPS-lite + Rest of MIPS • Addition, subtraction, logical OR, ==: ADDU R[rd] = R[rs] + R[rt]; ... SUBU R[rd] = R[rs] – R[rt]; ... ORI R[rt] = R[rs] | zero_ext(Imm16)... BEQ if ( R[rs] == R[rt] )... • Test to see if output == 0 for any ALU operation gives == test. How? • P&H also adds AND, Set Less Than (1 if A < B, 0 otherwise) • ALU follows chap 5 CS 61C L16 Datapath (13) A Carle, Summer 2006 © UCB Storage Element: Idealized Memory Write Enable Address • Memory (idealized) Data In • One input bus: Data In 32 • One output bus: Data Out Clk DataOut 32 • Memory word is selected by: • Address selects the word to put on Data Out • Write Enable = 1: address selects the memory word to be written via the Data In bus • Clock input (CLK) • The CLK input is a factor ONLY during write operation • During read operation, behaves as a combinational logic block: - Address valid => Data Out valid after “access time.” CS 61C L16 Datapath (14) A Carle, Summer 2006 © UCB Storage Element: Register (Building Block) • Similar to D Flip Flop except - N-bit input and output - Write Enable input • Write Enable: - negated (or deasserted) (0): Data Out will not change - asserted (1): Data Out will become Data In CS 61C L16 Datapath (15) Write Enable Data In N Data Out N Clk A Carle, Summer 2006 © UCB Storage Element: Register File • Register File consists of 32 registers: • Two 32-bit output busses: busA and busB • One 32-bit input bus: busW • Register is selected by: RW RA RB Write Enable 5 5 5 busW 32 Clk busA 32 32 32-bit Registers busB 32 • RA (number) selects the register to put on busA (data) • RB (number) selects the register to put on busB (data) • RW (number) selects the register to be written via busW (data) when Write Enable is 1 • Clock input (CLK) • The CLK input is a factor ONLY during write operation • During read operation, behaves as a combinational logic block: - RA or RB valid => busA or busB valid after “access time.” CS 61C L16 Datapath (16) A Carle, Summer 2006 © UCB Administrivia • Project 2 due Friday • Hope you’ve already started • HW5 (maybe HW56?) out soon CS 61C L16 Datapath (17) A Carle, Summer 2006 © UCB Step 3: Assemble DataPath meeting requirements • Register Transfer Requirements Datapath Assembly • Instruction Fetch • Read Operands and Execute Operation CS 61C L16 Datapath (18) A Carle, Summer 2006 © UCB 3a: Overview of the Instruction Fetch Unit • The common RTL operations • Fetch the Instruction: mem[PC] • Update the program counter: - Sequential Code: PC = PC + 4 - Branch and Jump: PC = “something else” Clk PC Next Address Logic Address Instruction Memory CS 61C L16 Datapath (19) Instruction Word 32 A Carle, Summer 2006 © UCB 3b: Add & Subtract • R[rd] = R[rs] op R[rt] Ex.: addU rd,rs,rt • Ra, Rb, and Rw come from instruction’s Rs, Rt, 26 21 16 11 6 and Rd fields 31 op 6 bits rs 5 bits rt 5 bits rd 5 bits shamt 5 bits funct 6 bits 0 • ALUctr and RegWr: control logic after decoding the instruction Rd Rs Rt RegWr 5 5 5 32 32-bit Registers busA 32 busB 32 ALU busW 32 Clk Rw Ra Rb ALUctr Result 32 • Already defined register file, ALU CS 61C L16 Datapath (20) A Carle, Summer 2006 © UCB Clocking Methodology Clk . . . . . . . . . . . . • Storage elements clocked by same edge • Being physical devices, flip-flops (FF) and combinational logic have some delays • Gates: delay from input change to output change • Signals at FF D input must be stable before active clock edge to allow signal to travel within the FF, and we have the usual clock-to-Q delay • “Critical path” (longest path through logic) determines length of clock period CS 61C L16 Datapath (21) A Carle, Summer 2006 © UCB Register-Register Timing: One complete cycle Clk New Value PC Old Value Rs, Rt, Rd, Op, Func Old Value ALUctr Old Value RegWr Old Value busA, B Old Value busW Old Value Instruction Memory Access Time New Value Delay through Control Logic New Value New Value Register File Access Time New Value ALU Delay New Value Rd Rs Rt RegWr5 5 5 CS 61C L16 Datapath (22) busA 32 busB 32 ALU busW 32 Clk Rw Ra Rb 32 32-bit Registers ALUctr Register Write Occurs Here Result 32 A Carle, Summer 2006 © UCB 3c: Logical Operations with Immediate • R[rt] = R[rs] op ZeroExt[imm16] ] 31 26 21 op rs 31 6 bits 5 bits 16 rt 5 bits 16 15 rd? 11 0 immediate 16 bits 0 ALU immediate 0000000000000000 Rd Rt RegDst 16 bits 16 bits Mux Rt register read?? Rs Rt? What about ALUct RegWr 5 5 5 r busA Rw Ra Rb busW 32 Result 32 32-bit 32 Registers 32 busB Clk 32 Mux 16 ZeroExt imm16 32 ALUSrc • Already defined 32-bit MUX; Zero Ext? CS 61C L16 Datapath (23) A Carle, Summer 2006 © UCB 3d: Load Operations • R[rt] = Mem[R[rs] + SignExt[imm16]] Example: lw rt,rs,imm16 31 26 op rs 6 bits Rd RegDst Mux RegWr 5 32 Clk 0 rt 5 bits immediate 5 bits 16 bits Rt Rs Rt 5 5 Rw Ra Rb 32 32-bit Registers W_Src 32 busB 32 32 ExtOp 32 MemWr ?? ALUSrc Data In 32 Clk Mu x CS 61C L16 Datapath (24) busA Mux 16 ALUctr Extender imm16 16 ALU busW 21 WrEn Adr Data Memory 32 A Carle, Summer 2006 © UCB 3e: Store Operations • Mem[ R[rs] + SignExt[imm16] ] = R[rt] Ex.: sw rt, rs, imm16 31 26 21 op rs 6 bits 5 bits Rd Rt RegDst Mux RegWr5 5 rt 5 bits 32 ExtOp CS 61C L16 Datapath (25) W_Src 32 Data In32 Clk WrEn Adr 32 Data Memory Mux Extender 16 immediate 16 bits ALUctr MemWr ALU busA Rw Ra Rb 32 32 32-bit Registers busB 32 imm16 0 Rs Rt 5 Mux busW 32 Clk 16 ALUSrc A Carle, Summer 2006 © UCB 3f: The Branch Instruction 31 26 op 6 bits 21 rs 5 bits 16 rt 5 bits 0 immediate 16 bits • beq rs, rt, imm16 • mem[PC] Fetch the instruction from memory • Equal = R[rs] == R[rt] Calculate branch condition • if (Equal) Calculate the next instruction’s address - PC = PC + 4 + ( SignExt(imm16) x 4 ) else - PC = PC + 4 CS 61C L16 Datapath (26) A Carle, Summer 2006 © UCB Datapath for Branch Operations • beq rs, rt, imm16 Datapath generates condition (equal) 26 op 6 bits 21 00 Adder 32 PC Mux Adder PC Ext imm16 0 rs rt immediate 5 bits 5 bits 16 bits Inst Address nPC_sel 4 16 Rs Rt 5 busA Rw Ra Rb 32 32 32-bit Registers busB 32 Cond RegWr 5 5 busW Clk Equal? 31 Clk • Already MUX, adder, sign extend, zero CS 61C L16 Datapath (27) A Carle, Summer 2006 © UCB Putting it All Together:A Single Cycle Datapath Instruction<31:0> <0:15> <11:15> Rs <16:20> <21:25> Inst Memory Adr Rt Rd Imm16 RegDst ALUctr MemWr MemtoReg Equal Rt Rd 1 0 Rs Rt RegWr 5 5 5 busA Rw Ra Rb = busW 32 32 32-bit 0 32 32 Registers busB 0 32 Clk 32 WrEn Adr 1 1 Data In Data imm16 32 Clk 16 Clk Memory nPC_sel imm16 Mux ALU Extender PC Ext Adder Mux PC Mux Adder 00 4 ExtOp ALUSrc CS 61C L16 Datapath (28) A Carle, Summer 2006 © UCB Peer Instruction A. Our ALU is a synchronous device B. We should use the main ALU to compute PC=PC+4 C. The ALU is inactive for memory reads or writes. CS 61C L16 Datapath (29) A Carle, Summer 2006 © UCB Summary: Single cycle datapath °5 steps to design a processor • 1. Analyze instruction set => datapath requirements • 2. Select set of datapath components & establish clock methodology • 3. Assemble datapath meeting the requirements • 4. Analyze implementation of each instruction to determine setting of control points that effects the register transfer. Processor • 5. Assemble the control logic °Control is the hard part °Next time! CS 61C L16 Datapath (30) Input Control Memory Datapath Output A Carle, Summer 2006 © UCB