inst.eecs.berkeley.edu/~cs61c
CS61C : Machine Structures
Lecture 28 – Single Cycle CPU Control I
2004-11-02
Lecturer PSOE Dan Garcia
www.cs.berkeley.edu/~ddgarcia

Déjà vu all over again! Who Won?
As of 2am 2004-11-03, it looks like Bush was ahead but hadn’t yet clinched it. We may have to wait for a recount and an Ohio tabulation of provisional ballots. A Country Divided!
cnn.com

CS 61C L28 Single Cycle CPU Control I (1) Garcia, Fall 2004 © UCB

Review: How to Design a Processor: step-by-step
• 1. Analyze instruction set architecture (ISA) => datapath requirements
  • meaning of each instruction is given by the register transfers
  • datapath must include storage element for ISA registers
  • datapath must support each register transfer
• 2. Select set of datapath components and establish clocking methodology
• 3. Assemble datapath meeting requirements
• 4. Analyze implementation of each instruction to determine setting of control points that effects the register transfer.
• 5. Assemble the control logic

Why do we have two dirty bits?
    // Write port: when enabled and the target is not register 0,
    // store the data and mark both read ports as stale.
    if (wEnb)
      if (writeReg != 4'h0)
        begin
          array[writeReg] = writeD;
          dirty1 = 1'b1;
          dirty2 = 1'b1;
        end

    // Read port 1: re-read when its address changes or dirty1 changes.
    always @ (readReg1 or dirty1)
      begin
        readD1 = array[readReg1];
        dirty1 = 0;
      end

    // Read port 2: same behavior, with its own dirty bit.
    always @ (readReg2 or dirty2)
      begin
        readD2 = array[readReg2];
        dirty2 = 0;
      end

Review : 3e: Store Operations
• Mem[ R[rs] + SignExt[imm16] ] = R[rt]    Ex.: sw rt, rs, imm16

    Bits:    31-26     25-21     20-16     15-0
    Field:   op        rs        rt        immediate
    Width:   6 bits    5 bits    5 bits    16 bits

[Figure: store datapath. 32 32-bit Registers (Rw, Ra, Rb; busA, busB, busW), Extender (16 to 32 bits), ALU, Data Memory (Data In, Adr, WrEn, Clk), and muxes. Control signals: RegDst, RegWr, ALUSrc, ExtOp, ALUctr, MemWr, W_Src.]

3f: The Branch Instruction

    Bits:    31-26     25-21     20-16     15-0
    Field:   op        rs        rt        immediate
    Width:   6 bits    5 bits    5 bits    16 bits

• beq rs, rt, imm16
• mem[PC]    Fetch the instruction from memory
• Equal = R[rs] == R[rt]    Calculate branch condition
• if (Equal)    Calculate the next instruction’s address
    - PC = PC + 4 + ( SignExt(imm16) x 4 )
  else
    - PC = PC + 4

Datapath for Branch Operations
• beq rs, rt, imm16    Datapath generates condition (equal)

[Figure: branch datapath. Instruction fetch side: PC, two Adders (+4 and branch offset), PC Ext on imm16, Mux selected by nPC_sel, Inst Address. Register side: 32 32-bit Registers (Rw, Ra, Rb; busA, busB, busW; RegWr, Clk) feeding an "Equal?" comparator that produces Cond.]
• Already have MUX, adder, sign extend, zero

Putting it All Together: A Single Cycle Datapath
[Figure: complete single cycle datapath. Inst Memory (Adr) produces Instruction<31:0>, split into Rs <21:25>, Rt <16:20>, Rd <11:15>, Imm16 <0:15>. Instruction fetch: PC, Adders, PC Ext, Mux. Execution: 32 32-bit Registers (Rw, Ra, Rb; busA, busB, busW), Extender, ALU, Data Memory (Data In, Adr, WrEn, Clk), muxes. Control signals: nPC_sel, RegDst, RegWr, ExtOp, ALUSrc, ALUctr, MemWr, MemtoReg. Condition: Equal.]

An Abstract View of the Critical Path
Critical Path (Load Operation) =
    Delay clock through PC (FFs) +
    Instruction Memory’s Access Time +
    Register File’s Access Time +
    ALU to Perform a 32-bit Add +
    Data Memory Access Time +
    Stable Time for Register File Write
• This affects how much you can overclock your PC!
[Figure: PC and Next Address logic, Ideal Instruction Memory (Instruction Address, Instruction; Rd, Rs, Rt, Imm 16), 32 32-bit Registers (Rw, Ra, Rb; A, B), ALU, Ideal Data Memory (Data Address, Data In), all clocked by Clk.]

An Abstract View of the Implementation
[Figure: Control block receives the Instruction and Conditions and drives Control Signals into the Datapath. Datapath: PC, Next Address, Ideal Instruction Memory, 32 32-bit Registers (Rw, Ra, Rb; A, B), ALU, Ideal Data Memory (Data Address, Data In, Data Out), all clocked by Clk.]

Summary: A Single Cycle Datapath
• Rs, Rt, Rd, Imm16 connected to datapath
• We have everything except control signals
[Figure: single cycle datapath with the Instruction Fetch Unit drawn as one block. Instruction<31:0> split into Rs <21:25>, Rt <16:20>, Rd <11:15>, Imm16 <0:15>. Control signals: nPC_sel, RegDst, RegWr, ExtOp, ALUSrc, ALUctr, MemWr, MemtoReg. Condition: Zero.]

Anatomy
Review: 5 components of any Computer
Personal Computer
• Computer
  • Processor
    • Control (“brain”): Today
    • Datapath (“brawn”): Yesterday (& finish up)
  • Memory (where programs, data live when running)
  • Devices
    • Input
    • Output
• Keyboard, Mouse
• Disk (where programs, data live when not running)
• Display, Printer

Recap: Meaning of the Control Signals
• nPC_MUX_sel:    (“n” = next)
    0 => PC <– PC + 4
    1 => PC <– PC + 4 + {SignExt(Im16) , 00 }
• Later in lecture: higher-level connection between mux and branch cond
[Figure: instruction fetch unit. Inst Memory (Adr), PC, two Adders (+4 and imm16 through PC Ext with 00 appended), Mux selected by nPC_MUX_sel, Clk.]

Recap: Meaning of the Control Signals
° MemWr: 1 => write memory
° MemtoReg: 0 => ALU; 1 => Mem
° RegDst: 0 => “rt”; 1 => “rd”
° RegWr: 1 => write register
• ExtOp: “zero”, “sign”
• ALUsrc: 0 => regB; 1 => immed
• ALUctr: “add”, “sub”, “or”
[Figure: execution datapath. 32 32-bit Registers (Rw, Ra, Rb; busA, busB, busW), Extender, ALU, Data Memory (Data In, Adr, WrEn, Clk), muxes. Control signals: RegDst, RegWr, ExtOp, ALUSrc, ALUctr, MemWr, MemtoReg. Condition: Equal.]

Great talk today – Don’t miss
306 Soda Hall @ 4pm
• Dr. David Anderson
• Space Sciences Laboratory, U.C. Berkeley. SETI Director
“Public Resource Computing”
The majority of the world's computing power is no longer concentrated in supercomputer centers and machine rooms. Instead it is distributed around the world in hundreds of millions of personal computers and game consoles, many connected to the Internet. A new computing paradigm, “public-resource computing”, uses these PCs to do scientific supercomputing.
This paradigm enables new research in a number of areas and has social implications as well: it catalyzes global communities centered around common interests and goals, it encourages public awareness of current scientific research, and it may give the public a measure of control over the directions of science progress.

Administrivia
• Dan will be away at a conference on Thursday and Friday; Andrew will cover lecture.
• We regraded all the midterms and your TAs have them to return to you.

RTL: The Add Instruction

    Bits:    31-26     25-21     20-16     15-11     10-6      5-0
    Field:   op        rs        rt        rd        shamt     funct
    Width:   6 bits    5 bits    5 bits    5 bits    5 bits    6 bits

add rd, rs, rt
• MEM[PC]    Fetch the instruction from memory
• R[rd] = R[rs] + R[rt]    The actual operation
• PC = PC + 4    Calculate the next instruction’s address

Instruction Fetch Unit at the Beginning of Add
• Fetch the instruction from Instruction memory: Instruction = MEM[PC]
  • same for all instructions
[Figure: instruction fetch unit. Inst Memory (Adr) produces Instruction<31:0>; PC, two Adders (+4 and imm16 through PC Ext with 00 appended), Mux selected by nPC_MUX_sel, Clk.]

The Single Cycle Datapath during Add
• R[rd] = R[rs] + R[rt]
• Control settings: nPC_sel = +4, RegDst = 1, RegWr = 1, ALUSrc = 0, ALUctr = Add, ExtOp = x, MemWr = 0, MemtoReg = 0
[Figure: single cycle datapath annotated with the control settings above. Instruction<31:0> split into Rs <21:25>, Rt <16:20>, Rd <11:15>, Imm16 <0:15>; Instruction Fetch Unit, 32 32-bit Registers, Extender, ALU (Zero), Data Memory, muxes.]

Instruction Fetch Unit at the End of Add
• PC = PC + 4
  • This is the same for all instructions except: Branch and Jump
[Figure: instruction fetch unit. Inst Memory (Adr) produces Instruction<31:0>; PC, two Adders (+4 and imm16 with 00 appended), Mux selected by nPC_MUX_sel, Clk.]
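The three RTL steps for add (fetch MEM[PC], compute R[rd] = R[rs] + R[rt], then PC = PC + 4) can be traced in software. The following is a minimal Python sketch for illustration, not part of the lecture: the names `encode_add` and `step_add`, the dictionary used as instruction memory, and the 32-bit mask are all assumptions. The control values in `ADD_CONTROL` are the ones listed on the "Single Cycle Datapath during Add" slide.

```python
MASK32 = 0xFFFFFFFF  # assumption: model 32-bit wraparound with a mask

# Control settings from the "Single Cycle Datapath during Add" slide.
ADD_CONTROL = {
    "nPC_sel": "+4", "RegDst": 1, "RegWr": 1, "ALUSrc": 0,
    "ALUctr": "add", "ExtOp": "x", "MemWr": 0, "MemtoReg": 0,
}

def encode_add(rd, rs, rt):
    """Build the R-format word for add rd, rs, rt (op = 0, funct = 0x20)."""
    return (0 << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (0 << 6) | 0x20

def step_add(pc, regs, imem):
    """Execute one add instruction (hypothetical helper); return the next PC."""
    inst = imem[pc]                        # MEM[PC]: fetch the instruction
    rs = (inst >> 21) & 0x1F               # bits 25..21
    rt = (inst >> 16) & 0x1F               # bits 20..16
    rd = (inst >> 11) & 0x1F               # bits 15..11 (RegDst = 1 selects rd)
    if ADD_CONTROL["RegWr"] and rd != 0:   # register 0 always reads as zero
        regs[rd] = (regs[rs] + regs[rt]) & MASK32   # R[rd] = R[rs] + R[rt]
    return (pc + 4) & MASK32               # PC = PC + 4

regs = [0] * 32
regs[9], regs[10] = 5, 7                   # $t1 = 5, $t2 = 7
imem = {0: encode_add(rd=8, rs=9, rt=10)}  # add $t0, $t1, $t2
next_pc = step_add(0, regs, imem)
print(regs[8], next_pc)                    # prints: 12 4
```

The sketch hard-wires the add case on purpose. A fuller model would choose its control values per instruction, which is the job of the control logic this lecture introduces.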
Peer Instruction
Suppose we’re writing a MIPS interpreter in Verilog. Which sequence below is the best organization for the interpreter?
A. repeat loop that fetches instructions
B. while loop that fetches instructions
C. Decodes instructions using case statement
D. Decodes instr. using chained if statements
E. Executes each instruction
F. Increments PC by 4

1: ACEF
2: ADEF
3: AECF
4: AEDF
5: BCEF
6: BDEF
7: BECF
8: BEDF
9: EF
0: FAE

Summary: Single cycle datapath
° 5 steps to design a processor
  • 1. Analyze instruction set => datapath requirements
  • 2. Select set of datapath components & establish clock methodology
  • 3. Assemble datapath meeting the requirements
  • 4. Analyze implementation of each instruction to determine setting of control points that effects the register transfer.
  • 5. Assemble the control logic
° Control is the hard part
° MIPS makes that easier
  • Instructions same size
  • Source registers always in same place
  • Immediates same size, location
  • Operations always on registers/immediates
[Figure: the five components. Processor (Control, Datapath), Memory, Input, Output.]
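Because every MIPS instruction is the same size and keeps its register and immediate fields in fixed positions, decoding reduces to shifts and masks, and the next-PC choice from the nPC_sel recap reduces to one conditional. The Python sketch below illustrates this under stated assumptions and is not the lecture's implementation: the names `fields`, `sign_extend16`, and `next_pc` are hypothetical, and combining nPC_sel with Equal follows the beq RTL (PC = PC + 4 + SignExt(imm16) x 4 when Equal, else PC + 4).

```python
MASK32 = 0xFFFFFFFF  # assumption: model 32-bit wraparound with a mask

def fields(inst):
    """Slice a 32-bit instruction into its fixed-position fields."""
    return {
        "op":    (inst >> 26) & 0x3F,   # bits 31..26
        "rs":    (inst >> 21) & 0x1F,   # bits 25..21
        "rt":    (inst >> 16) & 0x1F,   # bits 20..16
        "rd":    (inst >> 11) & 0x1F,   # bits 15..11
        "imm16": inst & 0xFFFF,         # bits 15..0
    }

def sign_extend16(imm16):
    """Interpret a 16-bit field as a signed two's-complement value."""
    return imm16 - 0x10000 if imm16 & 0x8000 else imm16

def next_pc(pc, imm16, npc_sel, equal):
    """Take the branch only when nPC_sel is 1 and Equal holds; else PC + 4."""
    if npc_sel and equal:
        return (pc + 4 + sign_extend16(imm16) * 4) & MASK32
    return (pc + 4) & MASK32

# beq $t1, $t2, -2  (beq opcode is 4; 0xFFFE is -2 in 16 bits)
beq = (4 << 26) | (9 << 21) | (10 << 16) | 0xFFFE
f = fields(beq)
print(f["op"], f["rs"], f["rt"], sign_extend16(f["imm16"]))   # prints: 4 9 10 -2
print(next_pc(0x100, f["imm16"], npc_sel=1, equal=True))      # prints: 252
print(next_pc(0x100, f["imm16"], npc_sel=1, equal=False))     # prints: 260
```

Note that `fields` extracts rs, rt, rd, and imm16 without looking at the opcode first. That is the point of the fixed layout: the register file can be read in parallel with decoding.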