General Purpose Microprocessor Design
Dongbing Gu
School of Computer Science and Electronic Engineering, University of Essex, UK
Spring 2013

Outline
1. General Purpose Microprocessor
2. General Datapath
3. Datapath VHDL
4. Control Unit
5. Control Unit VHDL

Overview of the CPU Design
- The CPU is simply a dedicated microprocessor that only executes software instructions.

CPU Design
- Instruction set
  - How many instructions do you want?
  - What are the instructions?
  - What is the opcode for each instruction?
- Datapath
  - What functional units do you want?
  - A single register file or separate registers?
  - How are the different units connected together?
  - In particular, there is a program counter (PC) and an instruction register (IR).

General Datapath
- Its role is to perform the data operations for all the instructions.
- Its design is an incremental process: at each increment, one class of instructions is considered and a portion of the datapath is built.
- The complete datapath is generated by combining the partial datapaths.
- The only concern at this stage is how the control signals affect the flow of data and the function of the data units, not how the control signals are generated.

Incremental Design Process
- Step 1: program sequencing (instruction fetching). It involves an instruction register (IR), a program counter (PC), and an adder for incrementing the PC.
- Step 2: arithmetic instruction datapath. It involves an ALU and an accumulator. The operands of the ALU come from data memory and/or the immediate data of the instruction.
- Step 3: data transfer instruction datapath. It involves a data memory and the accumulator.
  The location of the data is given by the address field of the instruction.
- Step 4: control flow instruction datapath (jump instruction). It needs a path from the address field of the instruction to the PC input.

General Datapath
- The goal is to find the simplest and smallest datapath that matches the requirements of the problem as closely as possible.
- The requirement is to implement this algorithm for an 8-bit input number:

    sum = 0
    INPUT n
    WHILE (n != 0) {
        sum = sum + n
        n = n - 1
    }
    OUTPUT sum

Datapath Diagram
- Since two operands are required, a register file is needed.
- A multiplexer selects the data written to the register file: the external input or the ALU result.
- An ALU performs the operations.
- A tri-state buffer drives the output.

ALU Operations

    ALU2  ALU1  ALU0   Operation
    0     0     0      Pass through A
    0     0     1      A AND B
    0     1     0      A OR B
    0     1     1      NOT A
    1     0     0      A + B
    1     0     1      A - B
    1     1     0      A + 1
    1     1     1      A - 1

Control Words
- Control word 1 initializes sum to 0 by performing a subtraction: the register file (RF) supplies both operands from the same location, 00, so the difference is 0.
- At the active edge of the clock, data from RF location 00 is read from both ports and passed to the ALU.
- The ALU operates within the same clock cycle.
- The result, sum, is written back to RF location 00 at the next active clock edge.

    CW  Instruction    IE  WE  WA1,0  RAE  RAA1,0  RBE  RBA1,0  ALU2,1,0    OE
    1   sum = 0        0   1   00     1    00      1    00      101 (sub)   0
    2   INPUT n        1   1   01     0    xx      0    xx      xxx         0
    3   sum = sum + n  0   1   00     1    00      1    01      100 (add)   0
    4   n = n - 1      0   1   01     1    01      0    xx      111 (dec)   0
    5   OUTPUT sum     x   0   xx     1    00      0    xx      000 (pass)  1

Control Words
- Control word 2 inputs the value n and stores it in RF location 01 by setting IE=1, WE=1 and WA1,0=01.
- Control word 3 reads sum through port A by setting RAE=1 and RAA1,0=00, and n through port B by setting RBE=1 and RBA1,0=01. ALU2,1,0=100 selects the addition operation. The result is written back to RF location 00.
- Control word 4 decrements n by 1 by setting ALU2,1,0=111. It reads n from RF location 01 through port A and passes it to the A operand of the ALU. The result is written back to RF location 01.
- Control word 5 outputs the result stored in sum by reading RF location 00 via port A and passing it through the ALU. OE is asserted to enable the tri-state buffer for the output.

General Datapath VHDL Entity

    LIBRARY IEEE;
    USE IEEE.STD_LOGIC_1164.ALL;

    ENTITY datapath IS
      PORT (
        clock  : IN  STD_LOGIC;
        input  : IN  STD_LOGIC_VECTOR(7 DOWNTO 0);
        IE, WE : IN  STD_LOGIC;
        WA     : IN  STD_LOGIC_VECTOR(1 DOWNTO 0);
        RAE    : IN  STD_LOGIC;
        RAA    : IN  STD_LOGIC_VECTOR(1 DOWNTO 0);
        RBE    : IN  STD_LOGIC;
        RBA    : IN  STD_LOGIC_VECTOR(1 DOWNTO 0);
        ALUSel : IN  STD_LOGIC_VECTOR(2 DOWNTO 0);
        OE     : IN  STD_LOGIC;
        output : OUT STD_LOGIC_VECTOR(7 DOWNTO 0));
    END datapath;

General Datapath VHDL Architecture

    ARCHITECTURE Structural OF datapath IS
      COMPONENT mux2 PORT (
        S      : IN  STD_LOGIC;
        D1, D0 : IN  STD_LOGIC_VECTOR(7 DOWNTO 0);
        Y      : OUT STD_LOGIC_VECTOR(7 DOWNTO 0));
      END COMPONENT;

      COMPONENT regfile PORT (
        clk        : IN  STD_LOGIC;
        WE         : IN  STD_LOGIC;
        WA         : IN  STD_LOGIC_VECTOR(1 DOWNTO 0);
        RAE        : IN  STD_LOGIC;
        RAA        : IN  STD_LOGIC_VECTOR(1 DOWNTO 0);
        RBE        : IN  STD_LOGIC;
        RBA        : IN  STD_LOGIC_VECTOR(1 DOWNTO 0);
        input      : IN  STD_LOGIC_VECTOR(7 DOWNTO 0);
        Aout, Bout : OUT STD_LOGIC_VECTOR(7 DOWNTO 0));
      END COMPONENT;

General Datapath VHDL Architecture

      COMPONENT alu PORT (
        ALUSel : IN  STD_LOGIC_VECTOR(2 DOWNTO 0);
        A, B   : IN  STD_LOGIC_VECTOR(7 DOWNTO 0);
        F      : OUT STD_LOGIC_VECTOR(7 DOWNTO 0));
      END COMPONENT;

      COMPONENT TriStateBuffer PORT (
        E : IN  STD_LOGIC;
        D : IN  STD_LOGIC_VECTOR(7 DOWNTO 0);
        Y : OUT STD_LOGIC_VECTOR(7 DOWNTO 0));
      END COMPONENT;

      SIGNAL muxout, rfAout, rfBout : STD_LOGIC_VECTOR(7 DOWNTO 0);
      SIGNAL aluout, tristateout    : STD_LOGIC_VECTOR(7 DOWNTO 0);
    BEGIN
      -- Positional port maps follow the component declarations above.
      U0 : mux2 PORT MAP (IE, input, aluout, muxout);
      U1 : regfile PORT MAP (clock, WE, WA, RAE, RAA, RBE, RBA,
                             muxout, rfAout, rfBout);
      U2 : alu PORT MAP (ALUSel, rfAout, rfBout, aluout);
      U3 : TriStateBuffer PORT MAP (OE, aluout, tristateout);
      output <= tristateout;
    END Structural;

Control Unit
- It cycles through three main steps (the instruction cycle).
- Each step is executed in one state of the FSM.
- Step 1: fetch the instruction. The control unit reads the memory location specified by the PC and copies the content of that location into the IR.
- Step 2: decode the instruction. It extracts the opcode bits from the IR and determines what the current instruction is.
- Step 3: execute the instruction. It asserts the appropriate control signals.

Instruction Set Example
- There are five instructions, so three bits are required to encode the opcode.

    Instruction   Encoding   Operation
    IN A          011xxxxx   A ← Input
    OUT A         100xxxxx   Output ← A
    DEC A         101xxxxx   A ← A - 1
    JNZ address   110xaaaa   IF (A != 0) THEN PC = aaaa
    HALT          111xxxxx   Halt

    aaaa = four bits specifying a memory address
    x    = don't care
    A    = accumulator

The Datapath
(figure: datapath diagram)

The Dedicated Datapath
- The program ROM has 16 locations, so a 4-bit address is required and the PC is 4 bits wide.
- Each instruction is 8 bits wide, so the IR is 8 bits wide.
- A 4-bit increment unit is used for the PC.
- The PC needs to be loaded with either the result of the increment unit or the address from the JNZ instruction.
- The output of the PC is connected directly to the 4-bit memory address lines.
- The 8-bit memory output is connected to the input of the IR.
- The JNZ instruction requires an 8-bit OR gate connected to the output of the accumulator to test the condition (A != 0).
- Five control signals: IRload, PCload, INmux, Aload, and JNZmux.
- One status signal: (A != 0).

The Control Unit
- Executing an instruction involves decoding the instruction, generating the control signals, and writing PC+1 into the PC.
- Its design starts from a state diagram (figure: state diagram).

The Control Unit
- The first state, Start (000), serves as the initial reset state.
  No action is performed in this state, but it costs an extra clock cycle.
- The JNZ instruction requires an extra clock cycle to complete its operation, because the PC must be loaded with a new address value if the condition tests true. This new address value is loaded into the PC at the next clock cycle.

The Control Unit
The next step is to derive the next-state table from the state diagram.

    Current state    Next state D2 D1 D0, by opcode IR7 IR6 IR5
    Q2 Q1 Q0         000   001   010   011     100      101   110   111
                     NOP   NOP   NOP   INPUT   OUTPUT   DEC   JNZ   HALT
    000 Start        001   001   001   001     001      001   001   001
    001 Fetch        010   010   010   010     010      010   010   010
    010 Decode       000   000   000   011     100      101   110   111
    011 Input        000   000   000   000     000      000   000   000
    100 Output       000   000   000   000     000      000   000   000
    101 Dec          000   000   000   000     000      000   000   000
    110 Jnz          000   000   000   000     000      000   000   000
    111 Halt         111   111   111   111     111      111   111   111

Next-state equations (' denotes the complement):

    D2 = Q2 Q1 Q0 + Q2' Q1 Q0' IR7
    D1 = Q2 Q1 Q0 + Q2' Q1 Q0' (IR6 IR5 + IR7 IR6) + Q2' Q1' Q0
    D0 = Q2 Q1 Q0 + Q2' Q1 Q0' (IR6 IR5 + IR7 IR5) + Q2' Q1' Q0'

The Control Unit
And the output table from the state diagram:

    Control word   State Q2 Q1 Q0   IRload  PCload   INmux  Aload  JNZmux  Halt
    0              000 Start        0       0        0      0      0       0
    1              001 Fetch        1       1        0      0      0       0
    2              010 Decode       0       0        0      0      0       0
    3              011 Input        0       0        1      1      0       0
    4              100 Output       0       0        0      0      0       0
    5              101 Dec          0       0        0      1      0       0
    6              110 Jnz          0       1 or 0   0      0      1       0
    7              111 Halt         0       0        0      0      0       1

In the Jnz state, PCload is 1 if A != 0 and 0 otherwise. Output equations:

    IRload = Q2' Q1' Q0
    PCload = Q2' Q1' Q0 + Q2 Q1 Q0' (A != 0)
    INmux  = Q2' Q1 Q0
    Aload  = Q2' Q1 Q0 + Q2 Q1' Q0
    JNZmux = Q2 Q1 Q0'

The Circuit
Finally, the circuit diagram is drawn from the truth tables.
(figure: control unit circuit)

VHDL Entity

    LIBRARY IEEE;
    USE IEEE.STD_LOGIC_1164.ALL;

    ENTITY controlunit IS
      PORT (
        Reset, Clock : IN  STD_LOGIC;
        -- Control signals are driven by this unit, so they are outputs.
        IRload, PCload, INmux, Aload, JNZmux, Halt : OUT STD_LOGIC;
        IR  : IN  STD_LOGIC_VECTOR(2 DOWNTO 0);  -- opcode bits IR7..IR5
        ANZ : IN  STD_LOGIC);                    -- status signal (A != 0)
    END controlunit;

    ARCHITECTURE Behavioral OF controlunit IS
      TYPE state_type IS (s0, s1, s2, s3, s4, s5, s6, s7);
      SIGNAL state : state_type;
    BEGIN

VHDL Architecture

      next_state_logic : PROCESS (Reset, Clock)
      BEGIN
        IF (Reset = '1') THEN
          state <= s0;
        ELSIF (Clock'EVENT AND Clock = '1') THEN
          CASE state IS
            WHEN s0 => state <= s1;
            WHEN s1 => state <= s2;
            WHEN s2 =>
              IF    (IR = "011") THEN state <= s3;
              ELSIF (IR = "100") THEN state <= s4;
              ELSIF (IR = "101") THEN state <= s5;
              ELSIF (IR = "110") THEN state <= s6;
              ELSIF (IR = "111") THEN state <= s7;
              ELSE state <= s0;  -- NOP opcodes return to Start, as in the next-state table
              END IF;
            WHEN s7 => state <= s7;
            WHEN OTHERS => state <= s0;
          END CASE;
        END IF;
      END PROCESS;

VHDL Architecture

      output_logic : PROCESS (state)
      BEGIN
        CASE state IS
          WHEN s0 =>
            IRload <= '0'; PCload <= '0'; INmux <= '0'; Aload <= '0'; JNZmux <= '0';
            Halt <= '0';
          WHEN s1 =>
            IRload <= '1'; PCload <= '1'; INmux <= '0'; Aload <= '0'; JNZmux <= '0';
            Halt <= '0';
          WHEN s2 =>
            IRload <= '0'; PCload <= '0'; INmux <= '0'; Aload <= '0'; JNZmux <= '0';
            Halt <= '0';
          WHEN s3 =>
            IRload <= '0'; PCload <= '0'; INmux <= '1'; Aload <= '1'; JNZmux <= '0';
            Halt <= '0';
          WHEN s4 =>
            IRload <= '0'; PCload <= '0'; INmux <= '0'; Aload <= '0'; JNZmux <= '0';
            Halt <= '0';
          -- more cases
        END CASE;
      END PROCESS;
    END Behavioral;

Combining Datapath and Control Unit Together

    LIBRARY IEEE;
    USE IEEE.STD_LOGIC_1164.ALL;

    ENTITY mc IS
      PORT (
        Reset, Clock : IN  STD_LOGIC;
        Output       : OUT STD_LOGIC_VECTOR(7 DOWNTO 0);
        Input        : IN  STD_LOGIC_VECTOR(7 DOWNTO 0));
    END mc;

    ARCHITECTURE Behavioral OF mc IS
      COMPONENT controlunit PORT (
        --
      );
      END COMPONENT;
      COMPONENT datapath PORT (
        --
      );
      END COMPONENT;
      SIGNAL --
    BEGIN
      U0 : controlunit PORT MAP ( );
      U1 : datapath PORT MAP ( );
    END Behavioral;

A Sample Program
A sample program with a loop is:

            IN A
    loop:   OUT A
            DEC A
            JNZ loop
            HALT

    Memory address   Instruction encoding   Instruction
    0000             01100000               IN A
    0001             10000000               OUT A
    0010             10100000               DEC A
    0011             11000001               JNZ
    0100             11111111               HALT

The Complete Circuit
The datapath and control unit.
(figure: complete circuit)

The Hardware Implementation
(figure: hardware implementation)
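
As an illustration of how the sample program's encodings could be held in the 16-location, 8-bit program ROM described for the dedicated datapath, here is a minimal sketch. The entity name program_rom, its port names, and the choice to fill unused locations with HALT are assumptions made for this example; they do not come from the original design.

```vhdl
LIBRARY IEEE;
USE IEEE.STD_LOGIC_1164.ALL;
USE IEEE.NUMERIC_STD.ALL;

-- Hypothetical 16 x 8 program ROM (names are illustrative only).
ENTITY program_rom IS
  PORT (
    address : IN  STD_LOGIC_VECTOR(3 DOWNTO 0);   -- from the 4-bit PC
    data    : OUT STD_LOGIC_VECTOR(7 DOWNTO 0));  -- to the 8-bit IR input
END program_rom;

ARCHITECTURE Behavioral OF program_rom IS
  TYPE rom_type IS ARRAY (0 TO 15) OF STD_LOGIC_VECTOR(7 DOWNTO 0);
  -- Contents taken from the sample program table.
  CONSTANT rom : rom_type := (
    0      => "01100000",   -- IN A
    1      => "10000000",   -- loop: OUT A
    2      => "10100000",   -- DEC A
    3      => "11000001",   -- JNZ loop (address 0001)
    4      => "11111111",   -- HALT
    OTHERS => "11111111");  -- assumption: unused locations hold HALT
BEGIN
  -- Combinational read: the addressed word is always presented on data.
  data <= rom(TO_INTEGER(UNSIGNED(address)));
END Behavioral;
```

With this arrangement the PC output would drive address and data would feed the IR, matching the connections listed for the dedicated datapath. A synchronous (clocked) ROM is an equally plausible choice, but it would add a cycle to the fetch step.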