Uploaded by Gaurav Kumar

Computer Arithmetic and ALU

4/21/2021
CST204
Computer Architecture and Organization
Computer Arithmetic and ALU Design
Dr. Ashok Kherodia
Assistant Professor
IIIT Kota
Introduction
Half Adder
• If a basic gate has a delay of one unit, the sum output requires 3 gate delays and the carry output requires 1 gate delay.
Half Adder (Data Flow)
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
entity HALF_ADDER is
  Port ( a : in std_logic;
         b : in std_logic;
         s : out std_logic;   -- sum
         c : out std_logic);  -- carry
end HALF_ADDER;
architecture Behavioral of HALF_ADDER is
begin
  s <= a XOR b;
  c <= a AND b;
end Behavioral;
Half Adder (Structural)
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
entity HALF_ADDER_struct is
  port ( a, b : in std_logic;
         s, c : out std_logic);
end HALF_ADDER_struct;
architecture struct of HALF_ADDER_struct is
  -- Two-input gates, assumed to be defined as separate entities.
  component XOR_TWO is
    Port ( X : in std_logic; Y : in std_logic;
           Z : out std_logic);
  end component XOR_TWO;
  component AND_TWO is
    Port ( A : in std_logic; B : in std_logic;
           C : out std_logic);
  end component AND_TWO;
begin
  X1 : XOR_TWO PORT MAP (a, b, s);  -- sum
  A1 : AND_TWO PORT MAP (a, b, c);  -- carry
end struct;
Multi bit addition
• The addition of two n-bit numbers A and B requires a basic hardware circuit that accepts three inputs: a_i, b_i, and c_(i-1).
• These three bits represent, respectively, the two current bits of the numbers A and B (at position i) and the carry bit from the previous bit position (at position i - 1).
• The circuit produces two outputs, s_i and c_i, representing the sum and the carry, respectively.
Full Adder
• A one-bit full adder is a combinational circuit that forms the arithmetic sum of three bits. It has three inputs (a_i, b_i, c_(i-1)) and two outputs (s_i and c_i).
• The full adder (FA) implements these two functions:
s_i = a_i XOR b_i XOR c_(i-1)
c_i = (a_i AND b_i) OR (b_i AND c_(i-1)) OR (c_(i-1) AND a_i)
Ref: Mostafa Abd-El-Barr and Hesham El-Rewini, Fundamentals of Computer Organization and Architecture, Wiley-Interscience, 2005
Full Adder (Structural)
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
entity FULL_ADDER_struct is
  Port ( FA1 : in std_logic; FA2 : in std_logic;
         FA3 : in std_logic;
         FA_SUM : out std_logic;
         FA_CAR : out std_logic);
end FULL_ADDER_struct;
architecture struct of FULL_ADDER_struct is
  -- Port names match the HALF_ADDER entity so that default binding works.
  component HALF_ADDER is
    Port ( a : in std_logic; b : in std_logic;
           s : out std_logic;
           c : out std_logic);
  end component HALF_ADDER;
  signal T1, T2, T3, SUM : std_logic;
begin
  HA1 : HALF_ADDER port map (FA1, FA2, T1, T2);  -- T1 = FA1 xor FA2, T2 = first carry
  HA2 : HALF_ADDER port map (T1, FA3, SUM, T3);  -- SUM = T1 xor FA3, T3 = second carry
  FA_CAR <= (T2 OR T3);
  FA_SUM <= SUM;
end struct;
Full Adder (Data Flow)
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
entity FULL_ADDER_DATA is
  Port ( x, y, z : in std_logic;
         S : out std_logic;
         C : out std_logic);
end FULL_ADDER_DATA;
architecture Behavioral of FULL_ADDER_DATA is
begin
  S <= x XOR y XOR z;
  C <= (x AND y) OR (y AND z) OR (z AND x);
end Behavioral;
Parallel Adder
Carry-ripple through adder (CRT)/Ripple carry adder (RCA)
• A ripple carry adder circuit produces the arithmetic sum of two binary numbers. It consists of full adders connected in cascade, with the carry output from each full adder connected to the carry input of the next full adder in the chain.
Ref: NPTEL Video Lectures, IIT Kharagpur
• Total delay is proportional to n, i.e. the number of bits.
• A 16 bit RCA has twice the delay of an 8 bit RCA.
• A 32 bit RCA has 4 times the delay of an 8 bit RCA.
4 Bit RCA (structural)
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
entity FULL_ADDER_4BIT is
  Port ( Ain, Bin : in std_logic_vector (3 downto 0);
         Carry_in : in std_logic;
         Cout : out std_logic_vector (3 downto 0);  -- the 4-bit sum
         Cary_out : out std_logic);                 -- carry out of bit 3
end FULL_ADDER_4BIT;
architecture struct of FULL_ADDER_4BIT is
  -- FULL_ADDER_COMP is assumed to be a full-adder entity with these ports.
  component FULL_ADDER_COMP is
    Port ( FA1, FA2, FA3 : in std_logic;
           FA_SUM : out std_logic;
           FA_CAR : out std_logic);
  end component FULL_ADDER_COMP;
  signal T : std_logic_vector (4 downto 0);  -- carry chain, T(0) = carry in
begin
  T(0) <= Carry_in;
  FA4B_1 : FULL_ADDER_COMP PORT MAP (Ain(0), Bin(0), T(0), Cout(0), T(1));
  FA4B_2 : FULL_ADDER_COMP PORT MAP (Ain(1), Bin(1), T(1), Cout(1), T(2));
  FA4B_3 : FULL_ADDER_COMP PORT MAP (Ain(2), Bin(2), T(2), Cout(2), T(3));
  FA4B_4 : FULL_ADDER_COMP PORT MAP (Ain(3), Bin(3), T(3), Cout(3), T(4));
  Cary_out <= T(4);
end struct;
Parallel Subtractor / add-subtract circuit
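The add-subtract circuit named in the heading above can be sketched in VHDL as follows. This is a minimal sketch, not the circuit from the lecture figure: the entity name ADD_SUB_4BIT, the 4-bit width, and the port names M, D and Cout are assumptions made for illustration. With M = 0 the unit adds. With M = 1 each bit of B is inverted by an XOR gate and M is also fed in as the carry-in, so the ripple chain forms A + NOT B + 1 = A - B.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;

-- Hypothetical 4-bit add-subtract unit (names are illustrative, not from the slides).
entity ADD_SUB_4BIT is
  port ( A, B : in  std_logic_vector(3 downto 0);
         M    : in  std_logic;                      -- 0: A + B, 1: A - B
         D    : out std_logic_vector(3 downto 0);   -- sum or difference
         Cout : out std_logic);                     -- carry out of the MSB
end ADD_SUB_4BIT;

architecture dataflow of ADD_SUB_4BIT is
  signal Bx : std_logic_vector(3 downto 0);  -- B, or its complement when M = 1
  signal C  : std_logic_vector(4 downto 0);  -- ripple carry chain
begin
  C(0) <= M;  -- a carry-in of 1 completes the 2's complement of B
  stage : for i in 0 to 3 generate
    Bx(i)  <= B(i) xor M;
    D(i)   <= A(i) xor Bx(i) xor C(i);
    C(i+1) <= (A(i) and Bx(i)) or (Bx(i) and C(i)) or (C(i) and A(i));
  end generate stage;
  Cout <= C(4);
end dataflow;
```

For example, A = 0111 and B = 0101 give D = 1100 with M = 0, and D = 0010 with M = 1.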
The carry-look-ahead (CLA) adder
• To speed up the addition process, the chain of dependence among the adder stages must be broken.
• The carry-look-ahead (CLA) adder is a well-known circuit that does this.
The 4 bit carry-look-ahead (CLA) adder
• Assume Pi and Gi are already generated: the carries then come from a two-level AND-OR network with a delay of 2 gate delays.
• Carry delay = 5 and sum delay = 8 gate delays (3 to form Pi and Gi, 2 for the AND-OR network, and 3 more for the sum XOR).
• The total delay of 8 is independent of n.
• Complexity increases with the number of bits.
16 bit adder using 4 CLA adder
The 4 bit carry-look-ahead (CLA) adder circuit
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
-- Carry generator: forms every carry directly from P, G and cin.
entity cla_basic is
  port ( P : in std_logic_vector(3 downto 0);
         G : in std_logic_vector(3 downto 0);
         C : out std_logic_vector(4 downto 0);
         cin : in std_logic);
end cla_basic;
architecture Behavioral of cla_basic is
begin
  C(0) <= cin;
  C(1) <= G(0) or (P(0) and cin);
  C(2) <= G(1) or (P(1) and G(0)) or (P(1) and P(0) and cin);
  C(3) <= G(2) or (P(2) and G(1)) or (P(2) and P(1) and G(0))
          or (P(2) and P(1) and P(0) and cin);
  C(4) <= G(3) or (P(3) and G(2)) or (P(3) and P(2) and G(1))
          or (P(3) and P(2) and P(1) and G(0))
          or (P(3) and P(2) and P(1) and P(0) and cin);
end Behavioral;
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
-- 4-bit CLA adder built around the carry generator.
entity cla_adder is
  port ( A, B : in std_logic_vector(3 downto 0);
         cin : in std_logic;
         S : out std_logic_vector(3 downto 0);
         cout : out std_logic);
end cla_adder;
architecture Behavioral of cla_adder is
  signal P, G : STD_LOGIC_VECTOR(3 DOWNTO 0) := "0000";
  signal C : STD_LOGIC_VECTOR(4 DOWNTO 0) := "00000";
  component cla_basic is
    port ( P, G : in std_logic_vector(3 downto 0);
           C : out std_logic_vector(4 downto 0);
           cin : in std_logic);
  end component;
begin
  --first level: propagate and generate signals
  P <= A xor B;
  G <= A and B;
  --second level: all carries in parallel
  gen_c : cla_basic port map (P, G, C, cin);
  --third level: sum bits
  S <= P xor C(3 downto 0);
  cout <= C(4);
end Behavioral;
Delay Comparison
Carry Select Adder
• Basic building blocks of a carry select adder with block size 4: adder blocks and multiplexers
• More hardware, avoiding addition delay at the cost of a small multiplexer delay
• Here also we see a rippling effect, but it passes through multiplexers instead of adders
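One block of the carry select adder described above can be sketched as follows. This is a minimal sketch under stated assumptions: the entity name CSEL_BLOCK4 and the signal names R0 and R1 are hypothetical, and the numeric_std "+" operator stands in for the two 4-bit adders that a real block would instantiate. Both candidate results are formed before the carry-in is known, and the multiplexers only pick one of them.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.NUMERIC_STD.ALL;

-- Hypothetical 4-bit carry-select block (entity and signal names are illustrative).
entity CSEL_BLOCK4 is
  port ( A, B : in  std_logic_vector(3 downto 0);
         Cin  : in  std_logic;
         S    : out std_logic_vector(3 downto 0);
         Cout : out std_logic);
end CSEL_BLOCK4;

architecture dataflow of CSEL_BLOCK4 is
  -- 5-bit results: bit 4 is the carry out, bits 3..0 are the sum.
  signal R0, R1 : unsigned(4 downto 0);
begin
  -- Both candidates are computed in parallel, before Cin is known.
  R0 <= resize(unsigned(A), 5) + resize(unsigned(B), 5);      -- assumes carry-in = 0
  R1 <= resize(unsigned(A), 5) + resize(unsigned(B), 5) + 1;  -- assumes carry-in = 1

  -- 2-to-1 multiplexers select the candidate that matches the real carry-in.
  S    <= std_logic_vector(R1(3 downto 0)) when Cin = '1'
          else std_logic_vector(R0(3 downto 0));
  Cout <= R1(4) when Cin = '1' else R0(4);
end dataflow;
```

Chaining several such blocks connects each Cout to the next block's Cin, so the carry ripples only through the multiplexers.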
Uniform Sized adder
Variable Sized adder
Carry Save adder
Example: Carry Save adder
• Advantageous for large adders, m > 3
• Delay = 3 × 3 + parallel rippling adder delay
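The carry save idea can be sketched for three operands as follows. This is a minimal sketch, not the example from the lecture figure: the entity name CSA_3OP, the 4-bit operand width, and the signal names PS and PC are assumptions. The first stage uses one full adder per bit and passes no carry between bit positions. Only the final stage propagates carries, and numeric_std "+" stands in for that adder here.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.NUMERIC_STD.ALL;

-- Hypothetical carry-save addition of three 4-bit operands (names are illustrative).
entity CSA_3OP is
  port ( X, Y, Z : in  std_logic_vector(3 downto 0);
         TOTAL   : out std_logic_vector(5 downto 0));  -- X + Y + Z fits in 6 bits
end CSA_3OP;

architecture dataflow of CSA_3OP is
  signal PS, PC : std_logic_vector(3 downto 0);  -- partial sums and saved carries
begin
  -- Carry-save stage: one full adder per bit, no carry passed between bits.
  PS <= X xor Y xor Z;
  PC <= (X and Y) or (Y and Z) or (Z and X);

  -- Final stage: a carry-propagating adder combines the two vectors.
  -- Each saved carry has the weight of the next bit position, hence the shift.
  TOTAL <= std_logic_vector(resize(unsigned(PS), 6) + resize(unsigned(PC) & '0', 6));
end dataflow;
```

With more operands, further carry-save stages reduce three vectors to two each time, and the carry-propagating adder is still needed only once at the end.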
Multiplication
• Let the two input arguments be the multiplier Q, given by Q = q_(n-1) q_(n-2) ... q_1 q_0,
• and the multiplicand M, given by M = m_(n-1) m_(n-2) ... m_1 m_0.
A. The Paper and Pencil Method (for Unsigned Numbers)
• Binary multiplication reduces to a sequence of shifts and adds.
B. The Add-Shift Method
• Multiplication is performed as a series of n conditional addition and shift operations.
• If the given bit of the multiplier is 0, only a shift operation is performed.
• If the given bit of the multiplier is 1, an addition of the partial products and a shift operation are performed.
Example:
• Multiplication of the two unsigned numbers 11 and 13. The process works as follows.
• A is a 4-bit register and is initialized to 0s.
• C is the carry bit from the most significant bit position.
• The process is repeated n = 4 times (the number of bits in the multiplier Q). If the bit of the multiplier is “1”, then A ← A + M and the concatenation of AQ is shifted one bit position to the right.
• If, on the other hand, the bit is “0”, then only a shift operation is performed on AQ.
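The add-shift steps above can be sketched as a behavioural model. This is a minimal sketch under stated assumptions: the entity name ADD_SHIFT_MUL4 and the variable names CA and QR are hypothetical, and the n = 4 steps are unrolled in a loop inside a combinational process rather than spread over clock cycles as real hardware with a control unit would do.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.NUMERIC_STD.ALL;

-- Hypothetical behavioural model of the add-shift method (names are illustrative).
entity ADD_SHIFT_MUL4 is
  port ( M, Q : in  std_logic_vector(3 downto 0);   -- multiplicand, multiplier
         P    : out std_logic_vector(7 downto 0));  -- product = final contents of AQ
end ADD_SHIFT_MUL4;

architecture behavioral of ADD_SHIFT_MUL4 is
begin
  process (M, Q)
    variable CA : unsigned(4 downto 0);  -- carry bit C (bit 4) and register A (bits 3..0)
    variable QR : unsigned(3 downto 0);  -- register Q
  begin
    CA := (others => '0');               -- A is initialized to 0s
    QR := unsigned(Q);
    for step in 1 to 4 loop              -- repeated n = 4 times
      if QR(0) = '1' then                -- multiplier bit is 1: A <- A + M
        CA := CA + resize(unsigned(M), 5);  -- C is 0 here, so bit 4 catches the carry
      end if;
      -- Shift C, A and Q one position to the right; C becomes 0 again.
      QR := CA(0) & QR(3 downto 1);
      CA := '0' & CA(4 downto 1);
    end loop;
    P <= std_logic_vector(CA(3 downto 0) & QR);
  end process;
end behavioral;
```

For the operands 11 (1011) and 13 (1101) the model ends with AQ = 10001111, which is 143.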
Hardware structure for add-shift multiplication
• The control logic determines the operation to be performed depending on the least significant bit in Q.
• An n-bit adder is used to add the contents of registers A and M.
Combinational Array Multiplier
The Booth’s Algorithm
• Booth’s algorithm is used to speed up the multiplication operation.
• It effectively skips over runs of 1s and 0s encountered in the multiplier, by inspecting the bit pair Q(0)Q(-1).
• This skipping allows faster multipliers to be designed, because the average number of add-subtract steps is reduced, but it requires more complex timing and control circuitry.
• Booth’s algorithm gives a procedure for multiplying binary integers in signed 2’s complement representation.
• In this technique, two bits of the multiplier are inspected at a time.
• The action taken depends on the binary values of the two bits.
• If the two values are 01, then A ← A + M.
• If the two values are 10, then A ← A - M (use 2’s complement).
• No action is needed if the values are 00 or 11.
• In all four cases, an arithmetic shift right operation on the concatenation of AQ is performed. The whole process is repeated n times (n is the number of bits in the multiplier).
• Booth’s algorithm requires the inclusion of a bit Q(-1) = 0 as the least significant bit in the multiplier Q at the beginning of the multiplication process.
The Booth’s Algorithm Hardware
• It consists of an ALU that can perform the add/sub operation depending on the two bits Q(0)Q(-1).
• Control circuitry is also required to perform the ASR (AQ) and to issue the appropriate signals needed to control the number of cycles.
• Drawbacks: the number of add/sub operations is variable, and the algorithm is inefficient when the bit pattern in Q alternates, i.e. a repeated pair of a 0 followed by a 1 (or a 1 followed by a 0).
• This last situation can be improved if three rather than two bits are inspected at a time.
Example : The Booth’s Algorithm
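The Booth steps described above can be sketched as a behavioural model. This is a minimal sketch under stated assumptions: the entity name BOOTH_MUL4 and the variable names are hypothetical, the operands are 4-bit 2's complement numbers, and A is given one extra guard bit (an addition not in the slides) so that A - M cannot overflow when M is the most negative value. The loop unrolls the n = 4 steps in a combinational process.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.NUMERIC_STD.ALL;

-- Hypothetical behavioural model of Booth's algorithm (names are illustrative).
entity BOOTH_MUL4 is
  port ( M, Q : in  std_logic_vector(3 downto 0);   -- multiplicand, multiplier (2's complement)
         P    : out std_logic_vector(7 downto 0));  -- signed product = final contents of AQ
end BOOTH_MUL4;

architecture behavioral of BOOTH_MUL4 is
begin
  process (M, Q)
    variable A    : signed(4 downto 0);            -- register A plus one guard bit (assumption)
    variable QR   : std_logic_vector(3 downto 0);  -- register Q
    variable Qm1  : std_logic;                     -- the appended bit Q(-1)
    variable pair : std_logic_vector(1 downto 0);  -- Q(0) and Q(-1)
  begin
    A   := (others => '0');
    QR  := Q;
    Qm1 := '0';                                    -- Q(-1) = 0 at the start
    for step in 1 to 4 loop
      pair := QR(0) & Qm1;
      if pair = "01" then
        A := A + resize(signed(M), 5);             -- A <- A + M
      elsif pair = "10" then
        A := A - resize(signed(M), 5);             -- A <- A - M
      end if;                                      -- 00 and 11: no action
      -- Arithmetic shift right of A, Q and Q(-1) as one register.
      Qm1 := QR(0);
      QR  := A(0) & QR(3 downto 1);
      A   := shift_right(A, 1);
    end loop;
    P <= std_logic_vector(A(3 downto 0)) & QR;
  end process;
end behavioral;
```

For M = 7 (0111) and Q = -3 (1101) the model performs a subtraction, an addition, a subtraction and one shift-only step, and ends with P = 11101011, which is -21.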
Arithmetic Logic Shift Unit
• An ALU performs many different arithmetic and logic operations.
• The ALU is the “heart” of a processor: everything else in the CPU is there to support the ALU.
Bitwise operation
Y0 = A0 NAND B0
Y1 = A1 NAND B1
Ref: NPTEL, Digital Electronics Circuits, IIT Kharagpur
A Register Level view of 74181 4-bit ALU
Sequential ALU
Ref: John P. Hayes, Computer Architecture and Organization, 3rd ed., McGraw-Hill
Introduction
❖ Definition
❖ A key processing element of a microprocessor that performs arithmetic and logic operations such as ADD, SUB, NOT, OR, AND, XOR, etc.
❖ Control signals from the control unit determine what type of operation is to be performed.
❖ Input data consists of two operands, operand A and operand B, stored in registers and having n bits.
❖ The ALU also outputs status signals such as:
o Zero (when the result of the operation is 0)
o Negative (when the operation result is < 0)
o Carry (when the operation results in a carry)
o Overflow (when the result exceeds the number of bits allocated for its storage), etc.
Modular ALU Design
❖ Facts
❖ Building blocks (AND gate, OR gate, NOT gate, multiplexer, etc.) work with individual (I/O) bits
❖ A 32-bit ALU works with 32-bit registers
❖ The ALU performs a variety of tasks (+, -, *, /, shift, etc.)
❖ Principles
❖ Build 32 separate 1-bit ALUs
❖ Build separate hardware blocks for each task
❖ Perform all operations in parallel
❖ Use a mux to choose the actual operation (make decision)
❖ Advantages
❖ Easy to add new operations (instructions)
❖ Add new data lines into the muxes; inform “control” of the change.
Full Adder Circuit
1. Construct the logic circuit for Sum
2. Implement the logic circuit for CarryOut
3. Connect all inputs with the same name
Basic ALU Design
❖ Start with the simplest operations: AND, OR
❖ Add a control signal
❖ Incrementally add functionality and control
❖ Make a 1-bit ALU
❖ Modify the design for an n-bit ALU
Basic ALU Design
AND, OR
ADDER, AND, OR
Adder/Subtractor
[Figure: adder computing S = a + b with Cout, and S = a - b with Cout]
-b = (NOT b) + 1
Ref: D.A. Patterson and J.L. Hennessy, Computer Organization and Design, 5th ed., Elsevier Morgan Kaufmann Publishers, 2014
ALU Design
A±B, AND, OR
SUB
A–B: Binvert = 1; CarryIn = 1.
A+B: Binvert = 0; CarryIn = 0.
A NAND B = NOT (A AND B) = (NOT A) OR (NOT B)
A NOR B = NOT (A OR B) = (NOT A) AND (NOT B)
a NAND b: Ainvert = 1; Binvert = 1; Operation = 1
a NOR b: Ainvert = 1; Binvert = 1; Operation = 0
Set on Less Than
❖ Set a status bit if (a < b)
❖ If (a < b), then (a - b) is negative
❖ Calculate (a - b) and observe the sign bit of the result
❖ Add a new input to each bit (“Less”), which provides the result of slt
❖ Set the Operation control to pass the Less input through as the output
❖ Set the Binvert and CarryIn controls to compute A + (− B)
❖ Pass the most significant bit of the sum back to the Less input of the least significant bit
❖ Tie the Less input of the other bits to zero
ALU Design
A±B, AND, OR, NOR, NAND, SLT
slt(a,b): Binvert = 1; CarryIn = 1; Operation = 3
❖ The Set output is taken from ALU31 only; for (a < b) it carries the MSB, the sign bit of a - b
A±B, AND, OR, NOR, NAND, SLT, Overflow detection
Implementing BEQ and BNE
❖ A = B if A–B = 0
So for BEQ:
❖ Compute A–B
❖ OR the 32 resulting bits
❖ Invert that result
For BNE:
❖ As for BEQ, but with no inverter
Circuit for shifting by 2 bits
❖ A multiplexer with wires acts as a shifter (figure labels: Mux, 8 bits)
Ref: NPTEL, Prof. Anshul Kumar, CAO video course, 2009
A 32-bit ALU
❖ Combine Binvert and CarryIn into a single control line, Bnegate
❖ Zero detection logic is just one big NOR gate
❖ Any non-zero input to the NOR gate will cause its output to be zero
[Figure: 1-bit ALU with signals a, b, Less, Ainvert, Binvert, CarryIn, Operation, Sum, Set, Overflow and CarryOut]
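The ALU features collected above (Ainvert, Bnegate, the Operation multiplexer, slt from the sign bit, and zero detection) can be sketched in one behavioural model. This is a minimal sketch, not the bit-sliced design from the figures: the entity name ALU32, the single Ctrl port packed as Ainvert, Bnegate, Operation, and the internal signal names are assumptions, and numeric_std "+" stands in for the 32 chained 1-bit adders. Like the scheme described above, slt simply uses the sign bit of A - B and does not correct for overflow.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.NUMERIC_STD.ALL;

-- Hypothetical behavioural 32-bit ALU (names are illustrative).
-- Ctrl(3) = Ainvert, Ctrl(2) = Bnegate, Ctrl(1 downto 0) = Operation.
entity ALU32 is
  port ( A, B   : in  std_logic_vector(31 downto 0);
         Ctrl   : in  std_logic_vector(3 downto 0);
         Result : out std_logic_vector(31 downto 0);
         Zero   : out std_logic);
end ALU32;

architecture behavioral of ALU32 is
  signal Ax, Bx, Sum, Slt, R : std_logic_vector(31 downto 0);
  signal Op : std_logic_vector(1 downto 0);
begin
  Op <= Ctrl(1 downto 0);
  Ax <= not A when Ctrl(3) = '1' else A;   -- Ainvert
  Bx <= not B when Ctrl(2) = '1' else B;   -- Binvert part of Bnegate

  -- Bnegate also drives CarryIn, so Bnegate = 1 gives A + (not B) + 1 = A - B.
  Sum <= std_logic_vector(unsigned(Ax) + unsigned(Bx) + 1) when Ctrl(2) = '1'
         else std_logic_vector(unsigned(Ax) + unsigned(Bx));

  -- Set on less than: the sign bit of the sum goes to bit 0, all other bits are 0.
  Slt <= (0 => Sum(31), others => '0');

  -- The Operation multiplexer picks one of the results computed in parallel.
  with Op select
    R <= (Ax and Bx) when "00",
         (Ax or Bx)  when "01",
         Sum         when "10",
         Slt         when others;

  Result <= R;
  Zero   <= '1' when unsigned(R) = 0 else '0';  -- behaves like one big NOR gate
end behavioral;
```

With Ctrl = 0110 the unit subtracts, so Zero = 1 signals A = B as needed for BEQ, and Ctrl = 1100 yields NOR because inverting both inputs of the AND gives (NOT A) AND (NOT B).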
ALU Control Signals
❖ Uses multiple levels of decoding: the main control unit generates the ALUOp bits, which are then used as input to the ALU control that generates the actual signals to control the ALU.
❖ Using multiple levels of control can reduce the size of the main control unit.
❖ Using several smaller control units can potentially increase the speed of the control unit.
❖ The ALU uses 4 bits for control.
Main Control Unit
❖ The main control unit takes the 6 opcode bits as input and generates ALUOp and other control signals.
❖ The ALU control takes 8 bits as input (2-bit ALUOp and 6-bit function code) and generates the 4-bit ALU control signal.
ALU Control – AND Instruction
OP rd, rs, rt
AND R3,R1,R2
Field   Op     Rs   Rt   Rd   Shamt   Funct
Value   0x0    1    2    3    0       0x24
Bits    6      5    5    5    5       6
❖ The ALU control signal is 0000 (AND), so the ALU outputs A & B.
ALU – Control Signals
An Example ALU Control Unit
Truth Table of the ALU CU (instructions: LW, SW, BEQ, ADD, SUB, AND, OR, SLT; inputs include the function bits)
Instruction   Ainvert   Binvert   CarryIn   Operation(Op)
AND           0         0         0         00
OR            0         0         0         01
ADD,LW,SW     0         0         0         10
SUB,BEQ       0         1         1         10
NAND          1         1         0         01
NOR           1         1         0         00
SLT           0         1         1         11
Instruction   ABOp (ALU Control input)
AND           0000
OR            0001
ADD,LW,SW     0010
SUB,BEQ       0110
NAND          1101
NOR           1100
SLT           0111
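The two-level decoding summarized in the tables above can be sketched as a small ALU control unit. This is a minimal sketch under stated assumptions: the entity name ALU_CONTROL and its port names are hypothetical, and the ALUOp encoding (00 for LW/SW, 01 for BEQ, otherwise R-type) and the 6-bit function codes are the MIPS values used in the Patterson and Hennessy textbook, not values listed on the slides. The 4-bit outputs are the ABOp codes from the table.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;

-- Hypothetical ALU control unit (names and input encodings are assumptions).
entity ALU_CONTROL is
  port ( ALUOp  : in  std_logic_vector(1 downto 0);   -- from the main control unit
         Funct  : in  std_logic_vector(5 downto 0);   -- function field of the instruction
         ALUCtl : out std_logic_vector(3 downto 0));  -- Ainvert, Bnegate, Operation
end ALU_CONTROL;

architecture behavioral of ALU_CONTROL is
begin
  process (ALUOp, Funct)
  begin
    case ALUOp is
      when "00" => ALUCtl <= "0010";          -- LW, SW: add
      when "01" => ALUCtl <= "0110";          -- BEQ: subtract
      when others =>                          -- R-type: decode the function field
        case Funct is
          when "100000" => ALUCtl <= "0010";  -- ADD
          when "100010" => ALUCtl <= "0110";  -- SUB
          when "100100" => ALUCtl <= "0000";  -- AND
          when "100101" => ALUCtl <= "0001";  -- OR
          when "100111" => ALUCtl <= "1100";  -- NOR
          when "101010" => ALUCtl <= "0111";  -- SLT
          when others   => ALUCtl <= "0000";  -- not covered by this sketch
        end case;
    end case;
  end process;
end behavioral;
```

Because LW, SW and BEQ are decoded from ALUOp alone, the main control unit never has to look at the function field, which is what keeps it small.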