CST204 Computer Architecture and Organization
Computer Arithmetic and ALU Design
Dr. Ashok Kherodia, Assistant Professor, IIIT Kota
4/21/2021

Introduction

Half Adder
• Taking the delay of a basic gate as one unit, the sum requires 3 gate delays and the carry requires 1 gate delay.

Half Adder (Data Flow)

    library IEEE;
    use IEEE.STD_LOGIC_1164.ALL;
    use IEEE.STD_LOGIC_ARITH.ALL;
    use IEEE.STD_LOGIC_UNSIGNED.ALL;

    entity HALF_ADDER is
        Port ( a : in  std_logic;
               b : in  std_logic;
               s : out std_logic;    -- sum
               c : out std_logic);   -- carry
    end HALF_ADDER;

    architecture Behavioral of HALF_ADDER is
    begin
        s <= a XOR b;
        c <= a AND b;
    end Behavioral;

Half Adder (Structural)

    library IEEE;
    use IEEE.STD_LOGIC_1164.ALL;
    use IEEE.STD_LOGIC_ARITH.ALL;
    use IEEE.STD_LOGIC_UNSIGNED.ALL;

    entity HALF_ADDER_struct is
        port ( a, b : in  std_logic;
               s, c : out std_logic);
    end HALF_ADDER_struct;

    architecture struct of HALF_ADDER_struct is
        -- Two-input gates, assumed to be defined elsewhere
        component XOR_TWO is
            Port ( X : in  std_logic;
                   Y : in  std_logic;
                   Z : out std_logic);
        end component XOR_TWO;
        component AND_TWO is
            Port ( A : in  std_logic;
                   B : in  std_logic;
                   C : out std_logic);
        end component AND_TWO;
    begin
        X1 : XOR_TWO PORT MAP (a, b, s);   -- sum
        A1 : AND_TWO PORT MAP (a, b, c);   -- carry
    end struct;

Multi bit addition
• The addition of two n-bit numbers A and B requires a basic hardware circuit that accepts three inputs: a_i, b_i, and c_{i-1}.
• These three bits represent, respectively, the two current bits of the numbers A and B (at position i) and the carry bit from the previous bit position (at position i - 1).
• The circuit should produce two outputs, s_i and c_i, representing respectively the sum and the carry.

Full Adder
• A one-bit full adder is a combinational circuit that forms the arithmetic sum of three bits.
It consists of three inputs (a_i, b_i, c_{i-1}) and two outputs, s_i and c_i.
• A full adder (FA) implements these two functions (the same ones used in the data-flow model):
      s_i = a_i XOR b_i XOR c_{i-1}
      c_i = (a_i AND b_i) OR (b_i AND c_{i-1}) OR (c_{i-1} AND a_i)
Ref: Fundamentals of Computer Organization and Architecture, Mostafa Abd-El-Barr and Hesham El-Rewini, Wiley-Interscience, 2005

Full Adder (Data Flow)

    library IEEE;
    use IEEE.STD_LOGIC_1164.ALL;
    use IEEE.STD_LOGIC_ARITH.ALL;
    use IEEE.STD_LOGIC_UNSIGNED.ALL;

    entity FULL_ADDER_DATA is
        Port ( x, y, z : in  std_logic;
               S       : out std_logic;
               C       : out std_logic);
    end FULL_ADDER_DATA;

    architecture Behavioral of FULL_ADDER_DATA is
    begin
        S <= x XOR y XOR z;
        C <= (x AND y) OR (y AND z) OR (z AND x);
    end Behavioral;

Full Adder (structural)

    library IEEE;
    use IEEE.STD_LOGIC_1164.ALL;
    use IEEE.STD_LOGIC_ARITH.ALL;
    use IEEE.STD_LOGIC_UNSIGNED.ALL;

    entity FULL_ADDER_struct is
        Port ( FA1    : in  std_logic;
               FA2    : in  std_logic;
               FA3    : in  std_logic;    -- third input (carry in)
               FA_SUM : out std_logic;
               FA_CAR : out std_logic);
    end FULL_ADDER_struct;

    architecture struct of FULL_ADDER_struct is
        -- Port names follow the HALF_ADDER entity (a, b, s, c) so that
        -- default binding can find it.
        component HALF_ADDER is
            Port ( a : in  std_logic;
                   b : in  std_logic;
                   s : out std_logic;
                   c : out std_logic);
        end component HALF_ADDER;
        signal T1, T2, T3, SUM : std_logic;
    begin
        HA1 : HALF_ADDER port map (FA1, FA2, T1, T2);
        HA2 : HALF_ADDER port map (T1, FA3, SUM, T3);
        FA_CAR <= (T2 OR T3);
        FA_SUM <= SUM;
    end struct;

Parallel Adder
Carry-ripple through adder (CRT)/Ripple carry adder (RCA)
• A ripple carry adder circuit produces the arithmetic sum of two binary numbers. It consists of full adders connected in cascade, with the carry output from each full adder connected to the carry input of the next full adder in the chain.
• Total delay is proportional to n, the number of bits.
• A 16-bit RCA has twice the delay of an 8-bit RCA.
• A 32-bit RCA has 4 times the delay of an 8-bit RCA.
Ref: NPTEL Video Lectures, IIT Kharagpur

Parallel Subtractor / add-subtract circuit
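The add-subtract circuit named in the slide title can be sketched in VHDL as follows. This is a minimal sketch, assuming the usual arrangement in which a mode input inverts every bit of B and also feeds the first carry in, so that A - B is computed as A + (NOT B) + 1. The entity name `add_sub_sketch`, the generic `N`, and the port name `sub` are hypothetical and do not come from the slides.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;

-- Hypothetical N-bit ripple-carry adder/subtractor (illustrative names).
entity add_sub_sketch is
    generic ( N : positive := 4 );
    port ( a, b : in  std_logic_vector(N-1 downto 0);
           sub  : in  std_logic;    -- '0': a + b, '1': a - b
           s    : out std_logic_vector(N-1 downto 0);
           cout : out std_logic );
end add_sub_sketch;

architecture dataflow of add_sub_sketch is
    signal bx : std_logic_vector(N-1 downto 0);  -- b, inverted when sub = '1'
    signal c  : std_logic_vector(N downto 0);    -- carry chain
begin
    -- The mode bit doubles as the carry in: a + (not b) + 1 = a - b
    c(0) <= sub;

    stage : for i in 0 to N-1 generate
        bx(i)  <= b(i) xor sub;
        -- Full-adder equations, one stage per bit
        s(i)   <= a(i) xor bx(i) xor c(i);
        c(i+1) <= (a(i) and bx(i)) or (bx(i) and c(i)) or (c(i) and a(i));
    end generate stage;

    cout <= c(N);
end dataflow;
```

With `sub = '0'` the circuit reduces to a plain ripple carry adder, so the delay still grows in proportion to N.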
4 Bit RCA (structural)

    library IEEE;
    use IEEE.STD_LOGIC_1164.ALL;
    use IEEE.STD_LOGIC_ARITH.ALL;
    use IEEE.STD_LOGIC_UNSIGNED.ALL;

    entity FULL_ADDER_4BIT is
        Port ( Ain, Bin : in  std_logic_vector (3 downto 0);
               Carry_in : in  std_logic;
               Cout     : out std_logic_vector (3 downto 0);  -- the four sum bits
               Cary_out : out std_logic);                     -- final carry out
    end FULL_ADDER_4BIT;

    architecture struct of FULL_ADDER_4BIT is
        -- Declared with the name and ports of the FULL_ADDER_struct entity
        -- so that default binding can find it.
        component FULL_ADDER_struct is
            Port ( FA1, FA2, FA3 : in  std_logic;
                   FA_SUM        : out std_logic;
                   FA_CAR        : out std_logic);
        end component FULL_ADDER_struct;
        signal T : std_logic_vector (4 downto 0);  -- carry chain
    begin
        T(0) <= Carry_in;
        FA4B_1 : FULL_ADDER_struct PORT MAP (Ain(0), Bin(0), T(0), Cout(0), T(1));
        FA4B_2 : FULL_ADDER_struct PORT MAP (Ain(1), Bin(1), T(1), Cout(1), T(2));
        FA4B_3 : FULL_ADDER_struct PORT MAP (Ain(2), Bin(2), T(2), Cout(2), T(3));
        FA4B_4 : FULL_ADDER_struct PORT MAP (Ain(3), Bin(3), T(3), Cout(3), T(4));
        Cary_out <= T(4);
    end struct;

The carry-look-ahead (CLA) adder
• To speed up the addition process, the chain of dependence among the adder stages must be broken.
• The carry-look-ahead (CLA) adder is a well-known circuit of this kind.

The 4 bit carry-look-ahead (CLA) adder
• Assume P_i and G_i are already generated.
• The total delay is 8 gate delays, independent of n.
• Carry: 5 gate delays; sum: 8 gate delays.
• Complexity increases with the number of bits.
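The delay figures above follow from the carry equations, which can be written out as a short worked derivation. The symbols are the usual generate and propagate signals; $C_0$ denotes the carry in, and the assumption that an XOR costs 3 gate delays is taken from the half-adder slide.

```latex
\begin{align*}
G_i &= A_i B_i, \qquad P_i = A_i \oplus B_i, \qquad C_{i+1} = G_i + P_i C_i \\
C_1 &= G_0 + P_0 C_0 \\
C_2 &= G_1 + P_1 G_0 + P_1 P_0 C_0 \\
C_3 &= G_2 + P_2 G_1 + P_2 P_1 G_0 + P_2 P_1 P_0 C_0 \\
C_4 &= G_3 + P_3 G_2 + P_3 P_2 G_1 + P_3 P_2 P_1 G_0 + P_3 P_2 P_1 P_0 C_0 \\
S_i &= P_i \oplus C_i
\end{align*}
```

Under those assumptions, generating $P_i$ and $G_i$ takes 3 gate delays (the XOR), every carry is a two-level AND-OR expression and is ready 2 delays later (5 in total), and the final XOR for $S_i$ adds 3 more (8 in total). None of these terms depends on a neighbouring stage's carry, which is why the delay does not grow with n, while the widening product terms are the growing complexity the last bullet refers to.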
• Two-level AND-OR: delay = 2 gate delays.
• Delay of 5 for carry, 8 for sum.

16 bit adder using 4 CLA adder

The 4 bit carry-look-ahead (CLA) adder circuit

The 4 bit carry-look-ahead (CLA) adder

    library IEEE;
    use IEEE.STD_LOGIC_1164.ALL;

    -- Carry generation block
    entity cla_basic is
        port ( P   : in  std_logic_vector(3 downto 0);
               G   : in  std_logic_vector(3 downto 0);
               C   : out std_logic_vector(4 downto 0);
               cin : in  std_logic);
    end cla_basic;

    architecture Behavioral of cla_basic is
    begin
        C(0) <= cin;
        C(1) <= G(0) or (P(0) and cin);
        C(2) <= G(1) or (P(1) and G(0)) or (P(1) and P(0) and cin);
        C(3) <= G(2) or (P(2) and G(1)) or (P(2) and P(1) and G(0))
                or (P(2) and P(1) and P(0) and cin);
        C(4) <= G(3) or (P(3) and G(2)) or (P(3) and P(2) and G(1))
                or (P(3) and P(2) and P(1) and G(0))
                or (P(3) and P(2) and P(1) and P(0) and cin);
    end Behavioral;

    library IEEE;
    use IEEE.STD_LOGIC_1164.ALL;

    -- 4-bit CLA adder built around cla_basic
    entity cla_adder is
        port ( A, B : in  std_logic_vector(3 downto 0);
               cin  : in  std_logic;
               S    : out std_logic_vector(3 downto 0);
               cout : out std_logic);
    end cla_adder;

    architecture Behavioral of cla_adder is
        signal P, G : STD_LOGIC_VECTOR(3 DOWNTO 0) := "0000";
        signal C    : STD_LOGIC_VECTOR(4 DOWNTO 0) := "00000";
        component cla_basic is
            port ( P, G : in  std_logic_vector(3 downto 0);
                   C    : out std_logic_vector(4 downto 0);
                   cin  : in  std_logic);
        end component;
    begin
        -- first level: propagate and generate
        P <= A xor B;
        G <= A and B;
        -- second level: all carries in parallel
        gen_c : cla_basic port map (P, G, C, cin);
        -- third level: sum bits
        S    <= P xor C(3 downto 0);
        cout <= C(4);
    end Behavioral;

Carry Select Adder
• Basic building blocks of a carry select adder with block size 4: adder blocks and multiplexers.
• More hardware; avoids addition delay at the cost of a small multiplexer delay.
• Here also we see a rippling effect, but it is through multiplexers instead of adders.

Delay Comparison

Variable Sized adder

Uniform Sized adder

Carry Save adder

Example: Carry Save adder
• Advantageous for large adders, m > 3.
• Delay = 3 × 3 + parallel rippling adder delay.

Multiplication
• Let the two input arguments be the n-bit multiplier Q and the n-bit multiplicand M.

A. The Paper and Pencil Method (for Unsigned Numbers)
• Binary multiplication can be just a bunch of right shifts and adds.

B. The Add-Shift Method
• Multiplication is performed as a series of n conditional addition and shift operations.
• If the given bit of the multiplier is 0, only a shift operation is performed.
• If the given bit of the multiplier is 1, an addition of the partial products and a shift operation are performed.

Example:
• Multiplication of the two unsigned numbers 11 and 13.
[Table: add-shift trace for 11 × 13]
• A is a 4-bit register and is initialized to 0s.
• C is the carry bit from the most significant bit position.
• The process is repeated n = 4 times (the number of bits in the multiplier Q). If the bit of the multiplier is “1”, then A ← A + M and the concatenation of AQ is shifted one bit position to the right.
• If, on the other hand, the bit is “0”, then only a shift operation is performed on AQ.

Combinational Array Multiplier

Hardware structure for add-shift multiplication
• The control logic is used to determine the operation to be performed depending on the least significant bit in Q.
• An n-bit adder is used to add the contents of registers A and M.
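The add-shift procedure described above can be sketched as a combinational VHDL model. This is an illustrative sketch, not the hardware from the slides: the entity name `add_shift_mult_sketch` and the generic `N` are hypothetical, the loop stands in for the n clocked steps of the control logic, and the carry bit C is held as the top bit of the variable `ca`.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.NUMERIC_STD.ALL;

-- Hypothetical behavioural model of unsigned add-shift multiplication.
entity add_shift_mult_sketch is
    generic ( N : positive := 4 );
    port ( m : in  unsigned(N-1 downto 0);      -- multiplicand M
           q : in  unsigned(N-1 downto 0);      -- multiplier Q
           p : out unsigned(2*N-1 downto 0) );  -- product, left in A & Q
end add_shift_mult_sketch;

architecture behavioral of add_shift_mult_sketch is
begin
    process (m, q)
        variable ca : unsigned(N downto 0);    -- carry C (bit N) and register A
        variable qv : unsigned(N-1 downto 0);  -- working copy of Q
    begin
        ca := (others => '0');                 -- A initialized to 0s, C = 0
        qv := q;
        for step in 1 to N loop
            -- Multiplier bit 1: A <- A + M, carry out captured in C
            if qv(0) = '1' then
                ca := ca + resize(m, N + 1);
            end if;
            -- In both cases shift C, A, Q one position to the right
            qv := ca(0) & qv(N-1 downto 1);
            ca := '0' & ca(N downto 1);
        end loop;
        p <= ca(N-1 downto 0) & qv;
    end process;
end behavioral;
```

Taking M = 1011 (11) and Q = 1101 (13) as an assumed operand order for the example, the four steps leave A = 1000 and Q = 1111, that is 10001111 = 143.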
The Booth’s Algorithm
• Purpose: to speed up the multiplication operation.
• Booth’s algorithm effectively skips over runs of 1s and 0s encountered in the multiplier, by inspecting the bit pair Q(0)Q(-1).
• This skipping allows faster multipliers to be designed, because the average number of add-subtract steps is reduced, but it requires more complex timing and control circuitry.
• Booth’s algorithm gives a procedure for multiplying binary integers in signed 2’s complement representation.
• In this technique, two bits of the multiplier, Q(0)Q(-1), are inspected at a time.
• The action taken depends on the binary values of the two bits.
• If the two values are respectively 01, then A ← A + M.
• If the two values are 10, then A ← A - M (use 2’s complement).
• No action is needed if the values are 00 or 11.
• In all four cases, an arithmetic shift right operation on the concatenation of AQ is performed. The whole process is repeated n times (n is the number of bits in the multiplier).
• Booth’s algorithm requires an extra bit Q(-1) = 0, placed to the right of the least significant bit of the multiplier Q, at the beginning of the multiplication process.
• The hardware consists of an ALU that can perform the add/sub operation depending on the two bits Q(0)Q(-1).
• Control circuitry is also required to perform the ASR (AQ) and to issue the appropriate signals needed to control the number of cycles.
• Drawbacks: the variability in the number of add/sub operations, and the inefficiency of the algorithm when the bit pattern in Q becomes a repeated pair of a 0 (1) followed by a 1 (0).
• This last situation can be improved if three rather than two bits are inspected at a time.
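The steps above can be sketched as a behavioural VHDL model. This is a minimal sketch under stated assumptions: the entity name `booth_sketch` and the generic `N` are hypothetical, the loop stands in for the n clocked cycles, and the accumulator A is given one extra bit (an assumption not in the slides) so that A - M cannot overflow when M is the most negative value.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.NUMERIC_STD.ALL;

-- Hypothetical behavioural model of Booth's algorithm (2's complement).
entity booth_sketch is
    generic ( N : positive := 4 );              -- intended for N >= 2
    port ( m : in  signed(N-1 downto 0);        -- multiplicand M
           q : in  signed(N-1 downto 0);        -- multiplier Q
           p : out signed(2*N-1 downto 0) );    -- product, left in A & Q
end booth_sketch;

architecture behavioral of booth_sketch is
begin
    process (m, q)
        variable a   : signed(N downto 0);      -- accumulator A, one guard bit
        variable qv  : signed(N-1 downto 0);    -- working copy of Q
        variable q_1 : std_logic;               -- the extra bit Q(-1)
    begin
        a   := (others => '0');
        qv  := q;
        q_1 := '0';                             -- Q(-1) = 0 at the start
        for step in 1 to N loop
            if qv(0) = '0' and q_1 = '1' then   -- pair 01: A <- A + M
                a := a + resize(m, N + 1);
            elsif qv(0) = '1' and q_1 = '0' then -- pair 10: A <- A - M
                a := a - resize(m, N + 1);
            end if;                             -- pairs 00 and 11: no action
            -- Arithmetic shift right of A, Q, Q(-1); the sign of A is kept
            q_1 := qv(0);
            qv  := a(0) & qv(N-1 downto 1);
            a   := a(N) & a(N downto 1);
        end loop;
        p <= a(N-1 downto 0) & qv;
    end process;
end behavioral;
```

For M = 0111 (7) and Q = 0011 (3) the model performs one subtraction (pair 10) and one addition (pair 01) and ends with A & Q = 00010101 = 21. The run of two 1s in Q costs two add/sub operations instead of two additions, which illustrates both the skipping and why alternating bit patterns are the inefficient case.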
Example: The Booth’s Algorithm

The Booth’s Algorithm Hardware

Arithmetic Logic Shift Unit
• An ALU performs many different arithmetic and logic operations.
• The ALU is the “heart” of a processor, meaning everything else in the CPU is there to support the ALU.
• Bitwise operation: Y0 = A0 NAND B0, Y1 = A1 NAND B1.
Ref: NPTEL, Digital electronics Circuits, IIT Kharagpur
Ref: Computer Architecture and Organization, John P. Hayes, McGraw Hill, 3rd ed.

A Register Level view of 74181 4-bit ALU

Sequential ALU

Introduction
❖ Definition
    ❖ A key processing element of a microprocessor that performs arithmetic and logic operations such as ADD, SUB, NOT, OR, AND, XOR, etc.
    ❖ Control signals from the control unit determine what type of operation is to be performed.
    ❖ Input data consists of two operands, operand A and operand B, stored in registers and having n bits.
    ❖ The ALU also outputs status signals such as:
        o Zero (when the result of the operation is 0)
        o Negative (when the operation result is < 0)
        o Carry (when the operation results in a carry)
        o Overflow (when the result exceeds the number of bits allocated for its storage)

Full Adder Circuit
1. Construct the logic circuit for Sum.
2. Implement the logic circuit for CarryOut.
3. Connect all inputs with the same name.

Modular ALU Design
❖ Facts
    ❖ Building blocks (AND gate, OR gate, NOT gate, multiplexer, etc.) work with individual (I/O) bits.
    ❖ A 32-bit ALU works with 32-bit registers.
    ❖ The ALU performs a variety of tasks (+, -, *, /, shift, etc.).
❖ Principles
    ❖ Build 32 separate 1-bit ALUs.
    ❖ Build separate hardware blocks for each task.
    ❖ Perform all operations in parallel.
    ❖ Use a mux to choose the actual operation (make the decision).
❖ Advantages
    ❖ Easy to add new operations (instructions).
    ❖ Add new data lines into the muxes; inform “control” of the change.

Basic ALU Design
❖ Start with the simplest operations: AND, OR.
❖ Add a control signal.
❖ Incrementally add functionality and control.
❖ Make a 1-bit ALU.
❖ Modify the design for an n-bit ALU.
❖ Adder/Subtractor: S = a + b or S = a - b, with Cout, using -b = (NOT b) + 1.
❖ 1-bit ALU figures, in order of added functionality: AND, OR; ADDER, AND, OR; A±B, AND, OR, SUB.
Reference: Computer Organization and Design, D.A. Patterson, J.L. Hennessy, 5th ed., Elsevier Morgan Kaufmann Publishers, 2014

ALU Design
A NAND B = NOT (A AND B) = (NOT A) OR (NOT B)
A NOR B = NOT (A OR B) = (NOT A) AND (NOT B)

Set on Less Than
❖ Set a status bit if (a < b).
❖ If (a < b), then (a - b) is negative.
❖ Calculate (a - b) and observe the sign bit of the result.
❖ Add a new input to each bit (“Less”) which provides the result of slt.
❖ Set the Operation control to pass the Less input through as output.
❖ Set the Binvert and CarryIn controls to compute A + (- B).
❖ Pass the most significant bit of the sum back to the Less input of the least significant bit.
❖ Tie the Less input of the other bits to zero.
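The set-on-less-than wiring listed above can be sketched as a behavioural n-bit ALU. This is an illustrative sketch, not the circuit from the cited textbook: the entity name `alu_slt_sketch` and the generic `N` are hypothetical, and the Operation encoding (0 = AND, 1 = OR, 2 = add, 3 = Less) is assumed from the control settings given in the slides.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;

-- Hypothetical N-bit ALU with AND, OR, add/subtract and set-on-less-than.
entity alu_slt_sketch is
    generic ( N : positive := 32 );
    port ( a, b      : in  std_logic_vector(N-1 downto 0);
           ainvert   : in  std_logic;
           binvert   : in  std_logic;
           carryin   : in  std_logic;
           operation : in  std_logic_vector(1 downto 0);
           result    : out std_logic_vector(N-1 downto 0);
           carryout  : out std_logic );
end alu_slt_sketch;

architecture behavioral of alu_slt_sketch is
begin
    process (a, b, ainvert, binvert, carryin, operation)
        variable ai, bi, sum, less : std_logic_vector(N-1 downto 0);
        variable carry             : std_logic;
    begin
        -- Optional inversion of each operand bit
        for i in 0 to N-1 loop
            ai(i) := a(i) xor ainvert;
            bi(i) := b(i) xor binvert;
        end loop;

        -- Ripple-carry addition, one full adder per bit
        carry := carryin;
        for i in 0 to N-1 loop
            sum(i) := ai(i) xor bi(i) xor carry;
            carry  := (ai(i) and bi(i)) or (carry and (ai(i) xor bi(i)));
        end loop;

        -- Less inputs: the sign bit of the sum goes to bit 0, all others are 0
        less    := (others => '0');
        less(0) := sum(N-1);

        -- Multiplexer selecting the actual operation
        case operation is
            when "00"   => result <= ai and bi;
            when "01"   => result <= ai or bi;
            when "10"   => result <= sum;
            when others => result <= less;
        end case;
        carryout <= carry;
    end process;
end behavioral;
```

With `binvert = '1'`, `carryin = '1'` and `operation = "11"` the sum is a - b, so bit 0 of `result` is 1 exactly when the difference is negative. Like the slide, this sketch reads the sign bit directly and therefore ignores the case where a - b overflows.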
ALU Design
Control settings:
    A + B:     Binvert = 0; CarryIn = 0
    A - B:     Binvert = 1; CarryIn = 1
    a NAND b:  Ainvert = 1; Binvert = 1; Operation = 1
    a NOR b:   Ainvert = 1; Binvert = 1; Operation = 0
    slt(a, b): Binvert = 1; CarryIn = 1; Operation = 3
1-bit ALU figures, in order of added functionality: A±B, AND, OR, NOR, NAND, SLT; A±B, AND, OR, NOR, NAND, SLT, Overflow detection.
❖ The Set output (a < b) is taken from ALU31 only; it is the MSB sign bit.

Circuit for shifting by 2 bits
❖ A multiplexer with wires acts as a shifter (figure: 8-bit mux).
Reference: NPTEL, Prof. Anshul Kumar, CAO video course, 2009

Implementing BEQ and BNE
❖ A = B if A - B = 0.
So for BEQ:
❖ Compute A - B.
❖ OR the 32 resulting bits.
❖ Invert that result.
For BNE:
❖ As for BEQ, but no inverter.

A 32-bit ALU
❖ Combine Binvert and CarryIn into a single control line, Bnegate.
❖ Zero detection logic is just one big NOR gate.
❖ Any non-zero input to the NOR gate will cause its output to be zero.
[Figure: 32-bit ALU built from 1-bit ALUs; signals a, b, Ainvert, Bnegate, CarryIn, Operation, Less, Set, Sum, CarryOut, Overflow, Zero]

Main Control Unit
❖ The main control unit takes the 6 opcode bits as input and generates ALUOp and other control signals.
❖ The ALU control takes 8 bits as input (2-bit ALUOp and 6-bit function code) and generates the 4-bit ALU control signal.
❖ Uses multiple levels of decoding: the main control unit generates the ALUOp bits, which are then used as input to the ALU control that generates the actual signals to control the ALU unit.
❖ Using multiple levels of control can reduce the size of the main control unit.
❖ Using several smaller control units can potentially increase the speed of the control unit.
❖ The ALU uses 4 bits for control.

ALU Control – AND Instruction
Format: OP rd, rs, rt
Example: AND R3, R1, R2 (ALU control signal 0000; the ALU computes A & B)

    Field   Op     Rs   Rt   Rd   Shamt   Funct
    Value   0x0    1    2    3    0       0x24
    Bits    6      5    5    5    5       6

ALU – Control Signals
An Example ALU Control Unit: Truth Table of the ALU CU (instructions LW, SW, BEQ, ADD, SUB, AND, OR, SLT; inputs ALUOp and function bits)

    Function      Ainvert  Binvert  CarryIn  Operation (Op)  ALU Control input (A, B, Op)
    AND           0        0        0        00              0000
    OR            0        0        0        01              0001
    ADD, LW, SW   0        0        0        10              0010
    SUB, BEQ      0        1        1        10              0110
    NAND          1        1        0        01              1101
    NOR           1        1        0        00              1100
    SLT           0        1        1        11              0111
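The second decoding level can be sketched in VHDL as a small combinational block that maps ALUOp and the function field to the 4-bit ALU control codes in the table above. This is a sketch under assumptions that are not stated in the slides: the ALUOp encoding (00 for LW/SW, 01 for BEQ, 10 for R-type) and the function codes follow the MIPS convention of the cited textbook, and the entity name `alu_control_sketch` is hypothetical. NAND is left out because MIPS has no such instruction to decode.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;

-- Hypothetical ALU control unit: (ALUOp, funct) -> 4-bit ALU control input.
entity alu_control_sketch is
    port ( aluop    : in  std_logic_vector(1 downto 0);   -- from main control
           funct    : in  std_logic_vector(5 downto 0);   -- instruction bits 5..0
           alu_ctrl : out std_logic_vector(3 downto 0) ); -- Ainvert, Bnegate, Op
end alu_control_sketch;

architecture behavioral of alu_control_sketch is
begin
    process (aluop, funct)
    begin
        case aluop is
            when "00" =>                      -- LW, SW: address = base + offset
                alu_ctrl <= "0010";
            when "01" =>                      -- BEQ: compare by subtracting
                alu_ctrl <= "0110";
            when "10" =>                      -- R-type: decode the function field
                case funct is
                    when "100000" => alu_ctrl <= "0010";  -- ADD
                    when "100010" => alu_ctrl <= "0110";  -- SUB
                    when "100100" => alu_ctrl <= "0000";  -- AND
                    when "100101" => alu_ctrl <= "0001";  -- OR
                    when "100111" => alu_ctrl <= "1100";  -- NOR
                    when "101010" => alu_ctrl <= "0111";  -- SLT
                    when others   => alu_ctrl <= "XXXX";  -- not handled here
                end case;
            when others =>
                alu_ctrl <= "XXXX";           -- ALUOp value not used in this sketch
        end case;
    end process;
end behavioral;
```

Because the main control unit only has to produce the 2-bit ALUOp, adding another R-type operation means adding one line to the inner `case`, which is the size advantage of multi-level decoding mentioned above.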