International Journal of Engineering Trends and Technology (IJETT) – Volume 25 Number 4- July 2015
Implementation of High Speed, Low Power and Area Efficient
Parallel Prefix Adder in an FPGA
S.V.Padmajarani#1, Dr.M.Muralidhar*2
# Professor and HOD (ECE), Sree Venkateswara College of Engineering, Northrajupalem, Nellore, A.P, India
* Principal, Sree Venkateswara College of Engineering and Technology, Chittoor, A.P, India
Abstract -- In portable devices, the major design concerns are low power, high speed and small area. Most portable computing devices run sophisticated, power-hungry signal processing techniques, so their power consumption must be reduced. The binary adder is a fundamental unit in many arithmetic operations and signal processing applications, so the adders have a large impact on the overall performance of the entire system. The aim of this paper is to present a low power, high speed and area efficient parallel prefix adder architecture. The proposed adder architecture is implemented for 16-bit and 32-bit operands in VHDL using Xilinx 14.5, with a Spartan 3E as the target device. The experimental results are compared with the basic adder variants: Ripple Carry Adder, Carry Look-ahead Adder, Carry Bypass Adder and Carry Select Adder.
Key words -- Ripple Carry Adder (RCA), Carry Look-ahead Adder (CLA), Carry Bypass Adder (CBA), Carry Select Adder (CSLA), Parallel Prefix Adder (PPA), Very Large Scale Integration (VLSI)
I. Introduction
As the scale of integration keeps growing, more and more sophisticated signal processing systems are being implemented on a VLSI chip. These signal processing applications not only demand great computation capacity but also consume a considerable amount of energy. The main objectives of most system-level or circuit-level designs are high performance and power optimization. For high performance system design, minimizing the propagation delay plays an important role. Size, cost, performance and power consumption are the major issues in low power portable devices.
Binary addition is one of the primitive operations in computer arithmetic. VLSI integer adders are critical elements in general purpose and digital signal processing processors, since they are employed in the design of arithmetic-logic units. They are also employed in the implementation of encryption and hashing functions. The basic component of addition is a full adder, which adds three 1-bit numbers. Two-operand addition is a primitive operation included in practically all arithmetic algorithms. As a consequence, the efficiency of an arithmetic circuit strongly depends on the way the adders are implemented. A key point in two-operand adder implementation is the way the carry is computed. Several adder implementation techniques exist, such as the ripple carry adder, carry look-ahead adder, carry skip adder and carry select adder. When high operation speed is required, tree structures like parallel-prefix adders are proposed in the literature [11]. Parallel-prefix adders are suitable for VLSI implementation since they rely on simple cells and maintain regular connections between them. The prefix structures allow several trade-offs among the number of cells used, the logic levels required for implementation, the fan-out of cells, etc. [1-10].
ISSN: 2231-5381
The rest of the paper is organized as follows. Section II discusses the basic adder models: ripple carry adder, carry look-ahead adder, carry bypass adder and carry select adder. Section III discusses the parallel prefix addition procedure. Section IV presents the design of the hybrid parallel prefix adder. Section V presents the experimental results. Conclusions are drawn in Section VI.
II. Basic Adder Models
A. Ripple Carry Adder (RCA)
The ripple carry adder is one of the simplest adders to implement. It takes two N-bit inputs and produces (N+1) output bits: an N-bit sum and a 1-bit carry out. The ripple carry adder is built from N full adders cascaded together, with the carry out of one full adder connected to the carry in of the next full adder. The ripple carry adder for 4-bit addition is shown in Fig. 1. The implementation of the 16-bit ripple carry adder based on the 4-bit RCA is given in Fig. 2.
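The cascading of full adders described above can be illustrated with a small behavioural model. This is only a sketch in Python, not the VHDL used in this work; the function names `full_adder` and `ripple_carry_add` are chosen here for illustration.

```python
def full_adder(a, b, cin):
    """Add three 1-bit values and return (sum_bit, carry_out)."""
    s = a ^ b ^ cin
    cout = (a & b) | (cin & (a ^ b))
    return s, cout


def ripple_carry_add(a, b, n=16, cin=0):
    """Add two n-bit integers by chaining n full adders.

    The carry out of bit i becomes the carry in of bit i + 1,
    so the carry "ripples" from the least significant bit upward.
    Returns (n-bit sum, carry out).
    """
    total = 0
    carry = cin
    for i in range(n):
        s, carry = full_adder((a >> i) & 1, (b >> i) & 1, carry)
        total |= s << i
    return total, carry
```

The loop makes the cost of the structure visible: the carry into the last bit is available only after all earlier full adders have settled, which is why the delay grows linearly with N.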
http://www.ijettjournal.org
Page 212
Fig. 1 Logic circuit of 4-bit Ripple Carry Adder
Fig. 2 Circuit for 16-bit Ripple Carry Adder
Fig. 3 Logic Circuit of 4-bit Carry Look-ahead adder
Fig. 4 Circuit of 16-bit Carry Look-ahead Adder
B. Carry Look-ahead Adder (CLA)
Carry look-ahead logic uses the concepts of generating and propagating carries. For each bit pair, the carry look-ahead logic determines whether that pair will generate a carry or propagate a carry. This allows the circuit to “pre-process” the two numbers being added and determine the carries ahead of time. The carry-out signals of the 4-bit adder are computed by the following equations:
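As a sketch, assuming the standard carry look-ahead formulation (the symbols A_i, B_i for the operand bits, G_i, P_i for the generate and propagate signals and C_0 for the carry in are notation chosen here), the signals of a 4-bit block can be written as:

```latex
\begin{aligned}
G_i &= A_i B_i, \qquad P_i = A_i \oplus B_i, \qquad i = 0,\dots,3 \\
C_1 &= G_0 + P_0 C_0 \\
C_2 &= G_1 + P_1 G_0 + P_1 P_0 C_0 \\
C_3 &= G_2 + P_2 G_1 + P_2 P_1 G_0 + P_2 P_1 P_0 C_0 \\
C_4 &= G_3 + P_3 G_2 + P_3 P_2 G_1 + P_3 P_2 P_1 G_0 + P_3 P_2 P_1 P_0 C_0
\end{aligned}
```

Each carry depends only on the operand bits and C_0, not on the previous carry, which is what removes the ripple through the block.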
C. Carry Skip Adder (or) Carry Bypass Adder (CBA)
A carry skip adder consists of a simple ripple carry adder with a special speed-up carry chain called a skip chain. The carry skip adder addresses the long carry propagation of the ripple carry adder by looking at a group of bits and determining whether the carry out of that group can be taken directly from its carry in. The logic circuit of the 4-bit carry bypass adder is shown in Fig. 5. The implementation of the 16-bit carry bypass adder based on the 4-bit CBA is given in Fig. 6.
For the carry look-ahead adder of Section II-B, the logic circuit of the 4-bit version is shown in Fig. 3. The implementation of the 16-bit carry look-ahead adder based on the 4-bit CLA is given in Fig. 4.
Fig. 5 Logic Circuit of 4-bit Carry Bypass Adder
Step 2: Use one of the parallel prefix trees to compute the carry input signals (carry_i) for the final addition.
Step 3: Perform simple addition using the following equation.
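Assuming the usual formulation, in which p_i is the bit-level propagate signal obtained in Step 1 and carry_i is the carry into bit position i obtained in Step 2 (the symbol sum_i is notation chosen here), the final sum bit can be written as:

```latex
\mathrm{sum}_i = p_i \oplus \mathrm{carry}_i
```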
Fig. 6 Circuit of 16-bit Carry Bypass Adder
In Fig. 5, Cout = Cin when the BP (bypass) signal is 1; otherwise the carry computed at the last stage (C4) is the final Cout. P0, P1, P2 and P3 are the propagate signals from the full adders, as in carry look-ahead addition.
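The bypass selection can be sketched behaviourally as follows. The sketch assumes that BP is the AND of the four propagate signals P0 to P3, which is the usual definition but is not stated explicitly in the text; the function names are illustrative only, and the 16-bit wrapper assumes a width that is a multiple of 4.

```python
def carry_bypass_block(a, b, cin):
    """4-bit carry bypass block; a and b are 4-bit integers.

    Returns (4-bit sum, carry out).
    """
    carry = cin
    total = 0
    bp = 1  # assumed: BP = P0 & P1 & P2 & P3
    for i in range(4):
        ai, bi = (a >> i) & 1, (b >> i) & 1
        p = ai ^ bi                      # propagate signal of this bit
        bp &= p
        total |= (p ^ carry) << i        # sum bit
        carry = (ai & bi) | (p & carry)  # ripple carry inside the block
    # Multiplexer at the block output: bypass Cin when every bit propagates,
    # otherwise use the carry computed at the last stage (C4).
    cout = cin if bp else carry
    return total, cout


def carry_bypass_add(a, b, n=16, cin=0):
    """Chain 4-bit bypass blocks to add two n-bit integers (n multiple of 4)."""
    total, carry = 0, cin
    for k in range(0, n, 4):
        s, carry = carry_bypass_block((a >> k) & 0xF, (b >> k) & 0xF, carry)
        total |= s << k
    return total, carry
```

In this software model both branches of the selection give the same value; in hardware the benefit is that the bypassed carry does not have to wait for the ripple through the block.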
Fig. 8 Circuit of 16-bit Carry Select Adder
D. Carry Select Adder (CSLA)
The carry select adder divides the adder into blocks, each of which is computed twice with the same input operands and a different carry in. Each block performs two additions, one assuming the carry in is 1 (Cin = 1) and one assuming the carry in is 0 (Cin = 0), and chooses between the two results once the actual carry in is known. The logic circuit of the 4-bit carry select adder is shown in Fig. 7. The implementation of the 16-bit carry select adder based on the 4-bit CSLA is given in Fig. 8.
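A behavioural sketch of this selection is given below. It is an illustration under the assumption of equal-sized blocks; the helper `block_add` stands in for any block adder (for example a 4-bit ripple carry adder) and, like the other names, is hypothetical.

```python
def block_add(a, b, cin, width):
    """Reference adder for one block: returns (width-bit sum, carry out)."""
    total = a + b + cin
    return total & ((1 << width) - 1), total >> width


def carry_select_add(a, b, n=16, block=4, cin=0):
    """Add two n-bit integers with a carry select scheme.

    Every block is evaluated for Cin = 0 and for Cin = 1; the actual
    carry in only drives the final selection between the two results.
    """
    mask = (1 << block) - 1
    total, carry = 0, cin
    for k in range(0, n, block):
        x, y = (a >> k) & mask, (b >> k) & mask
        s0, c0 = block_add(x, y, 0, block)  # result assuming Cin = 0
        s1, c1 = block_add(x, y, 1, block)  # result assuming Cin = 1
        s, carry = (s1, c1) if carry else (s0, c0)
        total |= s << k
    return total, carry
```

The duplicated block logic is the source of the extra LUT usage of this adder type, while the short selection path explains its low delay.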
IV. Proposed Hybrid Parallel Prefix Adder
The proposed hybrid parallel prefix adder for 16-bit addition is presented in Fig. 9. It is built from four types of operators, namely the black, gray, O3black and O3gray operators. These operators receive generate and propagate signals from the previous level and compute the generate and propagate signals required by the next stage.
Fig. 7 Logic Circuit of 4-bit Carry Select Adder
III. Parallel Prefix Addition
The parallel prefix addition is done in three steps [12]:
Step 1: Calculate the generate and propagate signals with the following equations.
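The three steps of parallel prefix addition (pre-processing, prefix carry computation and final addition) can be sketched as follows. The bit-level definitions g_i = a_i AND b_i and p_i = a_i XOR b_i are the usual ones and are assumed here. The prefix tree in the sketch is a Kogge-Stone arrangement, chosen only because it is compact to write down; it is not the hybrid structure proposed in this paper, and all function names are illustrative.

```python
def black(hi, lo):
    """Prefix operator: combine the (g, p) pair of the more significant
    group `hi` with that of the less significant group `lo`."""
    g_hi, p_hi = hi
    g_lo, p_lo = lo
    return g_hi | (p_hi & g_lo), p_hi & p_lo


def parallel_prefix_add(a, b, n=16, cin=0):
    """Add two n-bit integers in three steps; returns (sum, carry out)."""
    # Step 1: bit-level generate and propagate signals.
    g = [((a >> i) & (b >> i)) & 1 for i in range(n)]
    p = [((a >> i) ^ (b >> i)) & 1 for i in range(n)]

    # Step 2: prefix tree (Kogge-Stone, an assumption of this sketch).
    # The carry in is modelled as an extra generate signal below bit 0.
    gp = [(cin, 0)] + list(zip(g, p))
    dist = 1
    while dist < len(gp):
        gp = [black(gp[j], gp[j - dist]) if j >= dist else gp[j]
              for j in range(len(gp))]
        dist *= 2
    carries = [gp[j][0] for j in range(n + 1)]  # carries[i]: carry into bit i

    # Step 3: final addition, sum_i = p_i XOR carry_i.
    total = 0
    for i in range(n):
        total |= (p[i] ^ carries[i]) << i
    return total, carries[n]
```

All carries are available after about log2(n) operator levels, which is the reason the logic depth of prefix adders grows slowly with the operand width.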
Fig. 9 Hybrid parallel prefix adder for 16-bit addition
The black and gray operators used in the proposed hybrid parallel prefix adder are shown in Fig. 10(a) and Fig. 10(b) respectively.
The O3gray operator takes three pairs of generate and propagate values (gi, pi), (gj, pj), (gk, pk) as inputs and produces only one generate signal output, as per equation (10).
These operators are implemented using a multiplexer based design [13].
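One way a multiplexer can realise the generate output is sketched below. The operator equations used are the standard two-input and three-input prefix forms and are assumed here, as is the particular multiplexer arrangement; the text does not spell out how the design in [13] is built. The multiplexer form relies on g and p of a pair never both being 1, which holds when p is defined with XOR. All names are illustrative.

```python
from itertools import product


def mux(sel, d0, d1):
    """2:1 multiplexer: returns d1 when sel is 1, otherwise d0."""
    return d1 if sel else d0


def black(gi, pi, gj, pj):
    """Two-input operator: returns (go, po)."""
    return gi | (pi & gj), pi & pj


def gray(gi, pi, gj):
    """Two-input operator that produces the generate output only."""
    return gi | (pi & gj)


def o3black(gi, pi, gj, pj, gk, pk):
    """Three-input operator: returns (go, po)."""
    return gi | (pi & gj) | (pi & pj & gk), pi & pj & pk


def o3gray(gi, pi, gj, pj, gk):
    """Three-input operator that produces the generate output only."""
    return gi | (pi & gj) | (pi & pj & gk)


def gray_mux(gi, pi, gj):
    """Assumed multiplexer form: pi selects between gi and gj."""
    return mux(pi, gi, gj)


def o3gray_mux(gi, pi, gj, pj, gk):
    """Assumed multiplexer form built from two cascaded 2:1 multiplexers."""
    return mux(pi, gi, mux(pj, gj, gk))


# (g, p) pairs that can occur when p is an XOR-based propagate signal.
VALID = [(0, 0), (0, 1), (1, 0)]


def mux_forms_match():
    """Check the multiplexer forms against the AND-OR forms exhaustively."""
    for (gi, pi), (gj, pj), gk in product(VALID, VALID, (0, 1)):
        if gray_mux(gi, pi, gj) != gray(gi, pi, gj):
            return False
        if o3gray_mux(gi, pi, gj, pj, gk) != o3gray(gi, pi, gj, pj, gk):
            return False
    return True
```

Calling `mux_forms_match()` confirms that, under the stated assumption on (g, p), the multiplexer forms and the AND-OR forms give the same generate output.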
Fig. 10 (a) Black operator (b) Gray operator
The black operator receives two sets of generate and propagate signals (gi, pi), (gj, pj) and computes one set of generate and propagate signals (go, po) by the following equations:
The gray operator receives two sets of generate and propagate signals (gi, pi), (gj, pj) and computes only one generate signal, using the same equation as equation (8).
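Assuming the standard prefix operator definitions, with (g_i, p_i) taken as the more significant input pair, the black operator outputs can be written as below; the gray operator keeps only the generate output g_o:

```latex
\begin{aligned}
g_o &= g_i + p_i \, g_j \\
p_o &= p_i \, p_j
\end{aligned}
```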
The O3black and O3gray operators are shown in Fig. 11(a) and Fig. 11(b) respectively.
V. Experimental Results
VHDL with Xilinx 14.5 is used, with the XC3S500E as the target device, for the implementation of the adders. The basic adders (ripple carry adder, carry look-ahead adder, carry bypass adder and carry select adder) and the hybrid parallel prefix adder are simulated and synthesized using the Xilinx tool for 16-bit and 32-bit addition. The results for 16-bit addition and 32-bit addition are tabulated in Table 1 and Table 2 respectively.
The comparison is done for three factors: speed, area and power consumption. The speed performance is evaluated with respect to delay. The area requirement is estimated from the number of slices and look-up tables (LUTs) used. The power consumption is analyzed by taking into account the switching power (dynamic power), which mainly depends on the input test vectors applied through the test bench. Static power is not considered because ASIC tools were not available.
The design summary of the synthesis and the power analysis results of the 32-bit hybrid parallel prefix adder are given in Figs. 12 and 13 respectively.
Table 1. Results of various adders for 16-bit addition
Adder type   Slices  LUT  Average fan out  Delay (ns)  Logic levels  Power (mW)
RCA          22      31   1.76             20.629      17            33
CLA          22      31   1.76             20.317      17            33
CBA          30      49   2.87             18.352      16            33
CSLA         22      40   2.59             14.384      11            33
Hybrid PPA   27      46   2.40             14.886      12            33
Fig. 11 (a) O3black operator (b) O3gray operator
The O3black operator takes three pairs of generate and propagate values (gi, pi), (gj, pj), (gk, pk) as inputs and produces the generate and propagate output values (go, po) as follows:
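Assuming the standard three-input extension of the prefix operator, with (g_i, p_i) the most significant and (g_k, p_k) the least significant input pair, the outputs can be written as below; the O3gray operator keeps only the generate output g_o:

```latex
\begin{aligned}
g_o &= g_i + p_i \, g_j + p_i \, p_j \, g_k \\
p_o &= p_i \, p_j \, p_k
\end{aligned}
```

Combining three pairs in one operator replaces two levels of two-input operators by a single level, at the cost of a wider cell.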
Table 2. Results of various adders for 32-bit addition
Adder type   Slices  LUT  Average fan out  Delay (ns)  Logic levels  Power (mW)
RCA          46      63   1.76             37.604      33            62
CLA          45      65   1.93             34.921      31            62
CBA          50      86   2.10             45.169      40            62
CSLA         46      84   2.71             23.083      19            62
Hybrid PPA   57      105  2.58             21.162      19            52
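To make the 32-bit comparison easier to read, the relative savings of the hybrid PPA can be derived from the tabulated values. The short sketch below does only this arithmetic; the delay and power figures are copied from Table 2, and the function names are illustrative.

```python
# Delay (ns) and dynamic power (mW) for 32-bit addition, copied from Table 2.
TABLE_2 = {
    "RCA": (37.604, 62),
    "CLA": (34.921, 62),
    "CBA": (45.169, 62),
    "CSLA": (23.083, 62),
    "Hybrid PPA": (21.162, 52),
}


def percent_reduction(baseline, value):
    """Reduction of `value` relative to `baseline`, in percent."""
    return 100.0 * (baseline - value) / baseline


def ppa_savings(table=TABLE_2, proposed="Hybrid PPA"):
    """Return {adder: (delay saving %, power saving %)} of the proposed
    adder relative to every other adder in the table."""
    ppa_delay, ppa_power = table[proposed]
    return {
        name: (percent_reduction(delay, ppa_delay),
               percent_reduction(power, ppa_power))
        for name, (delay, power) in table.items()
        if name != proposed
    }
```

With these figures the hybrid PPA delay is roughly 44% lower than that of the RCA and roughly 8% lower than that of the CSLA, and its dynamic power is roughly 16% lower than that of each of the other adders.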
Fig. 12 Design Summary of hybrid parallel prefix adder for 32-bit addition
Fig. 13 Power analysis results of hybrid parallel prefix adder for 32-bit addition
For better comparison, the results of the various types of adders are presented in Figs. 14 to 19.
Fig. 14 Utilization of Slices of various adder variants for 16-bit and 32-bit addition
Fig. 15 Utilization of LUTs of various adder variants for 16-bit and 32-bit addition
Fig. 16 Delay of various adder variants for 16-bit and 32-bit addition
Fig. 17 Average Fan-out of various adder variants for 16-bit and 32-bit addition
Fig. 18 Logic Levels of various adder variants for 16-bit and 32-bit addition
Fig. 19 Dynamic Power Consumption of various adder variants for 16-bit and 32-bit addition

VI. Conclusions
This paper presents the implementation of various adder variants, namely the ripple carry adder, carry look-ahead adder, carry bypass adder, carry select adder and the proposed hybrid parallel prefix adder, for 16-bit and 32-bit addition. To evaluate the performance of these adders, VHDL with Xilinx 14.5 is used, targeting the XC3S500E device of the Spartan 3E family.
Speed performance is estimated with the delay parameter. For 16-bit addition the delay of the CSLA is the lowest of all the adder models, and the delay of the hybrid parallel prefix adder is very close to it. When the bit width increases to 32 bits, the hybrid PPA has the smallest delay of all the models.
For 16-bit addition, none of the adders shows any deviation in dynamic power consumption. The 32-bit hybrid PPA consumes less power than all the other adder models.
The utilization of slices and LUTs for 16-bit and 32-bit addition is higher for the CBA and the hybrid PPA than for the other adder models. For 32-bit addition the hybrid PPA occupies a little more area than the CBA.
Hence, the implementation of the proposed hybrid parallel prefix adder in an FPGA shows its superiority in speed and low power consumption, but with a little area overhead.
The proposed high speed, low power, area efficient hybrid parallel prefix adder can be further used in cryptographic applications such as encryption and hashing functions and in signal processing applications such as FIR filters, to enhance the overall system performance.

References
[1] J. Sklansky, “Conditional sum addition logic”, IRE Transactions on Electronic Computers, vol. EC-9, pp. 226-231, June 1960.
[2] Y. Choi and E. E. Swartzlander, Jr., “Parallel prefix adder design with matrix representation”, in Proc. 17th IEEE Symposium on Computer Arithmetic (ARITH), pp. 90-98, 2005.
[3] P. Kogge and H. Stone, “A parallel algorithm for the efficient solution of a general class of recurrence equations”, IEEE Trans. Computers, vol. C-22, no. 8, pp. 786-793, Aug. 1973.
[4] Giorgos Dimitrakopoulos and Dimitris Nikolos, “High-speed parallel-prefix VLSI Ling adders”, IEEE Trans. Computers, vol. 54, no. 2, Feb. 2005.
[5] T. Han and D. Carlson, “Fast area-efficient VLSI adders”, Proc. 8th Symp. Computer Arithmetic, pp. 49-56, Sep. 1987.
[6] Taeko Matsunaga, Shinji Kimura and Yusuke Matsunaga, “Synthesis of parallel prefix adders considering switching activities”, IEEE International Conference on Computer Design, pp. 404-409, 2008.
[7] Taeko Matsunaga and Yusuke Matsunaga, “Timing-constrained area minimization algorithm for parallel prefix adders”, IEICE Trans. Fundamentals, vol. E90-A, no. 12, Dec. 2007.
[8] Jianhua Liu, Zhu, Haikun, Chung-Kuan Cheng and John Lillis, “Optimum prefix adders in a comprehensive area, timing and power design space”, Proceedings of the 2007 Asia and South Pacific Design Automation Conference, Washington, pp. 609-615, Jan. 2007.
[9] R. Ladner and M. Fischer, “Parallel prefix computation”, J. ACM, vol. 27, no. 4, pp. 831-838, Oct. 1980.
[10] R. Brent and H. Kung, “A regular layout for parallel adders”, IEEE Trans. Computers, vol. C-31, no. 3, pp. 260-264, March 1982.
[11] S.V.Padmajarani and M.Muralidhar, “Comparison of Parallel Prefix Adders Performance in an FPGA”, International Journal of Engineering Research and Development (IJERD), vol. 3, no. 6, pp. 62-67, September 2012.
[12] S.V.Padmajarani and M.Muralidhar, “A Hybrid Parallel Prefix Adder for high speed computing”, Proc. 7th National Conference on Advances in Electronics and Communications (ADELCO), 2011.
[13] S.V.Padmajarani and M.Muralidhar, “A New Approach to implement Parallel Prefix Adders in an FPGA”, International Journal of Engineering Research and Applications (IJERA), vol. 2, no. 4, pp. 1524-1528, July-August 2012.