Proceedings of the International Conference on Inventive Computing and Informatics (ICICI 2017)
IEEE Xplore Compliant - Part Number: CFP17L34-ART, ISBN: 978-1-5386-4031-9
Low-power Less-Area Bypassing-Based
Multiplier Design
Amit Kumar Sahu, Ms. Laxmi Kumre
Abstract—A low-power, low-area bypassing-based multiplier is proposed. It is obtained by replacing the XOR gate of the low-cost low-power bypassing-based multiplier with a modified XOR gate. Averaged over the tested samples, the proposed design has 49.41% less area, 13.6% lower power dissipation, 2.36% less delay and 20.89% lower power-delay product than the other types of multiplier compared, for 4x4 and 8x8 bits.
Keywords— Bypassing based multiplier, power dissipation, modified XOR, area reduction.
I. INTRODUCTION
Nowadays, more and more devices are becoming smaller and portable, so area and power are both important factors for these devices. Multiplication is an essential arithmetic operation that consumes considerable power. For CMOS circuits, power dissipation can be divided into static and dynamic power dissipation. Static power dissipation comes from leakage current and is proportional to the number of transistors used. Dynamic power dissipation, on the other hand, comes from the switching transistors and results from the charging and discharging of the load capacitances. In general, the average dynamic power dissipation of a CMOS gate can be obtained as
$P_{avg} = \frac{1}{2} C f V_{DD}^{2} N$,
where $C$ is the load capacitance, $f$ is the clock frequency, $V_{DD}$ is the power supply voltage and $N$ is the number of switching activities in a clock cycle. Clearly, if the switching activity of a given logic circuit is reduced without changing its function, the power consumption can be reduced.
The basic approach to multiplication is the Braun multiplier[1]. It is well known that multipliers consume most of the power in DSP computation[2]. Hence, it is very important for modern DSP systems to use low-power multipliers to reduce the power dissipation. Much research on reducing the switching activity[3] of low-power multipliers has been published. Another simple approach[4] is to reduce the power dissipation in an array multiplier. Other approaches reduce the power dissipation of multiplication by dynamic operand interchange[5] or by partially guarded computation[6]. Row bypassing[7] and column bypassing[8] are further methods for power reduction in multiplication. Next, the low power row and column bypassing based multiplier[10] was proposed, followed by the low cost low power bypassing based multiplier[11].
II. DIFFERENT TYPES OF MULTIPLIER
A. Braun multiplier
Consider the multiplication of two unsigned n-bit numbers, where $A = a_{n-1} a_{n-2} \ldots a_0$ is the multiplicand and $B = b_{n-1} b_{n-2} \ldots b_0$ is the multiplier. The product $P = p_{2n-1} p_{2n-2} \ldots p_0$ can be written as follows:
$P = p_{2n-1} p_{2n-2} \ldots p_0 = \sum_{i=0}^{n-1} \sum_{j=0}^{n-1} (a_i b_j)\, 2^{i+j}$    (1)
A 4x4 multiplication example is shown in figure 1.
Fig. 1 A 4x4 multiplier
A 4x4 multiplication with the help of half adder and full
adder is shown in figure 2.
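As an illustration of Equation (1), the following minimal Python sketch sums the weighted partial products bit by bit. It is a behavioural model only, not a gate-level description of the array, and the function name `braun_product` is a hypothetical name chosen for this example.

```python
def braun_product(a: int, b: int, n: int) -> int:
    """Sum the n*n partial products a_i*b_j, each weighted by 2**(i+j)."""
    total = 0
    for i in range(n):
        a_i = (a >> i) & 1          # bit i of the multiplicand
        for j in range(n):
            b_j = (b >> j) & 1      # bit j of the multiplier
            total += (a_i & b_j) << (i + j)
    return total


# Exhaustive check of the 4x4 case against ordinary integer multiplication.
assert all(braun_product(a, b, 4) == a * b
           for a in range(16) for b in range(16))
```

The double loop visits each AND-gate partial product once, which corresponds to the $n^2$ AND gates of the array.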
Amit Kumar Sahu, Department of Electronics and Communication Engineering, Maulana Azad National Institute of Technology, Bhopal, INDIA (e-mail: amitsahu341@gmail.com)
Ms. Laxmi Kumre, Assistant Professor, Department of Electronics and Communication Engineering, Maulana Azad National Institute of Technology, Bhopal, INDIA (e-mail: laxmikumre99@rediffmail.com)
978-1-5386-4031-9/17/$31.00 ©2017 IEEE
Fig. 2 Multiplication with half adders and full adders.
Fig. 4 4x4 column-bypass multiplier
An nxn bit parallel multiplier consists of n-1 rows of full adders and 1 row of half adders. A 4x4 multiplication uses 3 rows of full adders and 1 row of half adders.
B. Row bypassing based multiplier
In the row bypassing based multiplier[7], if the bit $b_j$ of the multiplier is 0, then all the partial products $a_i b_j$ are 0 for 0 ≤ i ≤ n-1, and the addition operations of the j-th row are bypassed, so that the outputs of the (j-1)-th row pass directly to the (j+1)-th row. To correct the final multiplication result, an extra correction circuit must be added. A 4x4 Braun multiplier with row bypassing is illustrated in figure 3.
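The row bypassing idea can be sketched behaviourally as follows. This is a simplified model under stated assumptions: it only records which rows have $b_j = 0$ and skips their additions, it does not model the tri-state buffers, multiplexers or the correction circuit, and `row_bypass_multiply` is a hypothetical name.

```python
from typing import List, Tuple


def row_bypass_multiply(a: int, b: int, n: int) -> Tuple[int, List[int]]:
    """Multiply two n-bit values, skipping every row j whose bit b_j is 0."""
    a &= (1 << n) - 1               # keep only the n multiplicand bits
    product = 0
    bypassed_rows = []
    for j in range(n):
        if (b >> j) & 1 == 0:
            bypassed_rows.append(j)  # all a_i*b_j are 0: no addition needed
            continue
        product += a << j            # row j contributes a * 2**j
    return product, bypassed_rows


# Multiplier 0101 has zeros in rows 1 and 3, so those two rows are skipped.
result, skipped = row_bypass_multiply(0b1011, 0b0101, 4)
assert result == 0b1011 * 0b0101
assert skipped == [1, 3]
```

The more zero bits the multiplier contains, the more rows are skipped, which is where the reduction in switching activity comes from.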
D. 2-dimensional bypassing based multiplier
For a 2-dimensional bypassing based multiplier[9], if the bit $a_i$ is 0 or the bit $b_j$ is 0, the addition operation in the (i+1)-th column or the j-th row can be bypassed. For correct propagation of the carry bit in the multiplication, the following condition must be considered: if the bits $a_i$ and $b_j$ are 0 and the carry bit $c_{i,j+1}$ is 1, the addition operation in the (i+1)-th column or the j-th row cannot be bypassed. Hence a correction circuit must be added, and it is so complicated that the power reduction ability is decreased. A 4x4 bit 2-dimensional bypassing based multiplier is illustrated in figure 5.
Fig. 3 4x4 row bypass multiplier
C. Column bypassing based multiplier
In the column bypassing based multiplier[8], if the bit $a_i$ of the multiplicand is 0, i.e. all the partial products $a_i b_j$ are 0 for 0 ≤ j ≤ n-1, then the addition operations of the i-th column are bypassed. If the multiplicand bit $a_i$ is 0, the inputs to the (i+1)-th column are disabled and the carry output of that column must be set to 0 to produce the correct output. This protection can be achieved by adding AND gates at the outputs of the last row, so no extra correction circuit is needed. A 4x4 Braun multiplier with column bypassing is illustrated in figure 4.
Fig. 5 4x4 bit 2-dimensional multiplier
For the 2-dimensional (row and column) bypassing based multiplier[10], the bypassing condition also considers the carry bit from the previous row. If the product $a_i b_j$ is 0 and the carry bit $c_{i,j-1}$ is 0, the addition operation in the (i+1, j)-th full adder is bypassed; if instead the product $a_i b_j$ is 1 and the carry bit $c_{i,j-1}$ is 0, the addition operation in the (i+1, j)-th full adder is executed. Under this bypassing condition, the (i+1, j)-th FA executes only the A+1 addition when the product $a_i b_j$ is 1 and the carry bit $c_{i,j-1}$ is 0, or when the product $a_i b_j$ is 0 and the carry bit $c_{i,j-1}$ is 1. On the other hand, the (i+1, j)-th FA executes only the A+2 addition when the product $a_i b_j$ is 1 and the carry bit $c_{i,j-1}$ is 1.
Hence, the carry bit in the (i+1, j)-th FA can be replaced by the AND of the product $a_i b_j$ and the carry bit $c_{i,j-1}$. For the addition operation, half adders are replaced by A+1 adders and full adders by A+B+1 adders. A 4x4 Braun multiplier with row and column bypassing is illustrated in figure 6.
Fig. 6 4x4 bit Braun multiplier with row and column bypassing
Fig. 8 4x4 bit low cost low power bypassing based multiplier
In the low cost low power bypassing based multiplier[11], if the product $a_i b_j$ is not equal to the carry bit $c_{i,j-1}$, then the (i+1, j)-th full adder executes the A+1 addition. On the other hand, when the product $a_i b_j$ is equal to the carry bit $c_{i,j-1}$, the addition result in the (i+1, j)-th full adder is obtained by adding 2 or 0. Therefore, the resultant carry bit $c_{i+1,j}$ in the (i+1, j)-th full adder can be bypassed from the previous carry bit $c_{i,j-1}$, and the (i+1, j)-th full adder can be replaced with a low-cost incremental adder, A+1. Besides that, each simplified A+1 adder in the CSA array is attached to only one tri-state buffer and two 2-to-1 multiplexers. Similarly, a half adder can also be replaced with a low-cost incremental adder, A+1, with the bypassing condition $a_i b_j = 0$. Figure 7 shows the bypassing-based design of a half adder and a full adder. Using these designs, a 4x4 low-cost bypassing-based multiplier is illustrated in Figure 8.
Fig. 7 Bypassing based (a) HA and (b) FA.
III. PROPOSED MULTIPLIER
In the low cost low power bypassing based multiplier, the addition operation in the (i+1, j)-th full adder is bypassed if the product $a_i b_j$ is equal to the carry bit $c_{i,j-1}$. Hence the XOR of the product $a_i b_j$ and the carry bit $c_{i,j-1}$ is used as the control signal for the bypassing condition.
The conventional XOR gate used to generate the control signal in the low cost low power bypassing based multiplier[11] is replaced by the modified XOR gate. The modified XOR gate is implemented with only 4 transistors, whereas the conventional XOR gate is implemented with 12 transistors. Therefore the total number of transistors in the proposed multiplier is reduced compared with the low cost low power bypassing based multiplier[11]. As the total number of transistors is reduced, the power consumption and area of the proposed multiplier are reduced. The 4-transistor modified XOR gate is illustrated in figure 9.
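The XOR-controlled bypassing condition of the low cost low power design can be checked against an ordinary full adder with a short Python sketch. It is a bit-level behavioural model under stated assumptions, not the transistor-level cell: `bypass_cell` and `full_adder` are hypothetical names, `a` stands for the incoming sum bit A, `p` for the product $a_i b_j$ and `c` for the carry bit $c_{i,j-1}$.

```python
def full_adder(a: int, p: int, c: int):
    """Reference cell: returns (sum bit, carry out) of a + p + c."""
    total = a + p + c
    return total & 1, total >> 1


def bypass_cell(a: int, p: int, c: int):
    """A+1 incrementer steered by the XOR of the product and the carry."""
    control = p ^ c
    if control:
        # Exactly one of p, c is 1: execute A+1.
        # The sum bit is the inverted A and the carry out equals A.
        return a ^ 1, a
    # p == c: adding 0 or 2 leaves the sum bit unchanged,
    # and the previous carry is forwarded as the carry out.
    return a, c


# The bypassing cell matches the full adder for all eight input patterns.
for a in (0, 1):
    for p in (0, 1):
        for c in (0, 1):
            assert bypass_cell(a, p, c) == full_adder(a, p, c)
```

Because one XOR gate drives each such cell, the transistor count of that gate directly affects the total area of the array.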
Fig. 9 Modified XOR
IV. ANALYSIS OF AREA ON THE BASIS OF TRANSISTOR COUNT
All the bypassing based multipliers use the following transistor-level implementations: full adder, 28 transistors; half adder, 14 transistors; conventional XOR gate, 8 transistors; AND gate, 6 transistors; OR gate, 6 transistors; 2:1 multiplexer, 4 transistors; tri-state buffer, 4 transistors; NAND gate, 4 transistors; A+1 adder or NOT gate, 2 transistors; modified XOR gate, 4 transistors.
For an nxn bit Braun multiplier[1], $n^2$ AND gates, $n^2-2n$ full adders and $n$ half adders are used. The total number of transistors is $34n^2-42n$. In an nxn bit row bypassing multiplier[7], $n-2$ NOT gates, $3n^2-7n+4$ tri-state buffers, $n^2+n-2$ AND gates, $2n^2-4n+2$ 2:1 multiplexers, $n^2-2n-2$ full adders and $2n-4$ half adders are used. The total number of transistors is $56n^2-64n+8$. In an nxn bit column bypassing multiplier[8], $2n^2-4n+2$ tri-state buffers, $n^2+n-1$ AND gates, $n^2-2n+1$ 2:1 multiplexers, $n^2-2n$ full adders and $n$ half adders are used. The total number of transistors is $46n^2-60n+6$. In an nxn bit 2-dimensional bypassing based multiplier[9], $n-3$ NAND gates, 1 NOT gate, $3n^2-6n+3$ tri-state buffers, $4n^2-13n+17$ AND gates, $2n^2-4n+2$ 2:1 multiplexers, $n^2-2n-2$ full adders and $2n-4$ half adders are used. The total number of transistors is $72n^2-142n+112$. In an nxn bit 2-dimensional (row and column) bypassing based multiplier[10], $3n^2-6n+4$ AND gates, $3n^2-6n+4$ OR gates, $2n^2-4n+2$ 2:1 multiplexers, $2n^2-5n+3$ tri-state buffers, $n-1$ A+1 adders, $n-2$ full adders and $n^2-3n+3$ half adders are used. The total number of transistors is $48n^2-84n+28$. In an nxn bit low cost low power bypassing based multiplier[11], $n^2$ AND gates, $n^2-n$ tri-state buffers, $n^2-2n$ XOR gates, $2n^2-2n$ 2:1 multiplexers and $n^2-n$ A+1 adders are used. The total number of transistors is $28n^2-30n$. In the proposed low power less area bypassing based multiplier, $n^2$ AND gates, $n^2-n$ tri-state buffers, $n^2-2n$ modified XOR gates, $2n^2-2n$ 2:1 multiplexers and $n^2-n$ A+1 adders are used. The total number of transistors is $24n^2-22n$.
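The closed-form totals can be cross-checked by summing the component counts. The sketch below does this for the Braun, column bypassing, low cost low power and proposed multipliers, using the per-gate transistor figures listed in this section (including the 8-transistor conventional XOR). The dictionary keys and function names are hypothetical labels introduced for the example.

```python
# Transistors per component, as listed in this section.
TRANSISTORS = {"FA": 28, "HA": 14, "XOR": 8, "MXOR": 4,
               "AND": 6, "MUX": 4, "TRI": 4, "INC": 2}  # INC = A+1 adder


def braun(n):
    return {"AND": n * n, "FA": n * n - 2 * n, "HA": n}


def column_bypass(n):
    return {"TRI": 2 * n * n - 4 * n + 2, "AND": n * n + n - 1,
            "MUX": n * n - 2 * n + 1, "FA": n * n - 2 * n, "HA": n}


def low_cost(n, xor="XOR"):
    # Pass xor="MXOR" to model the proposed multiplier.
    return {"AND": n * n, "TRI": n * n - n, xor: n * n - 2 * n,
            "MUX": 2 * n * n - 2 * n, "INC": n * n - n}


def total(components):
    return sum(TRANSISTORS[name] * count for name, count in components.items())


for n in range(2, 17):
    assert total(braun(n)) == 34 * n * n - 42 * n
    assert total(column_bypass(n)) == 46 * n * n - 60 * n + 6
    assert total(low_cost(n)) == 28 * n * n - 30 * n
    assert total(low_cost(n, "MXOR")) == 24 * n * n - 22 * n

# Area of the proposed 4x4 multiplier relative to the Braun baseline.
ratio_4x4 = total(low_cost(4, "MXOR")) / total(braun(4))
assert round(100 * ratio_4x4, 1) == 78.7
```

Swapping the XOR for the modified XOR removes 4 transistors from each of the $n^2-2n$ control gates, which accounts for the full difference between $28n^2-30n$ and $24n^2-22n$.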
Table I shows the area comparison of the different types of multiplier.
TABLE I
AREA COMPARISON
Type of multiplier                           Area (nxn bit)       4x4 bit         8x8 bit
Braun multiplier[1]                          $34n^2-42n$          376 (100.0%)    1840 (100.0%)
2-dimensional multiplier[9]                  $72n^2-142n+112$     696 (185.1%)    3584 (194.8%)
Row and column bypassing multiplier[10]      $48n^2-84n+28$       460 (122.3%)    2428 (132.0%)
Low cost low power multiplier[11]            $28n^2-30n$          328 (87.2%)     1552 (84.3%)
Proposed multiplier                          $24n^2-22n$          296 (78.7%)     1360 (73.9%)
In the area comparison, we compare the proposed low power less area bypassing based multiplier with the Braun multiplier[1], row bypassing multiplier[7], column bypassing multiplier[8], 2-dimensional bypassing based multipliers[9 and 10] and low cost low power bypassing based multiplier[11] for 4x4 and 8x8 bits. Averaged over the two sizes in Table I and expressed as a percentage of the Braun multiplier area, the proposed multiplier saves 23.7%, 113.65%, 50.85% and 9.45% relative to the Braun multiplier[1], the 2-dimensional bypassing multipliers[9] and [10] and the low cost low power multiplier[11], respectively.
V. EXPERIMENTAL RESULTS
A. Power, delay and power-delay product comparison of XOR gate
Table II shows that the modified XOR consumes 13.356 µW less power than the conventional XOR at 65 nm, and that the delay of the modified XOR is 16.733 ps less than that of the conventional XOR at 65 nm.
TABLE II
AREA, POWER, DELAY AND POWER-DELAY PRODUCT
                         Conventional XOR    Modified XOR
Area (transistors)       12                  4
Power                    16.25 µW            2.894 µW
Delay                    22.61 ps            5.877 ps
Power-delay product      0.367 fJ            0.17 fJ
B. Power comparison of different types of multiplier
The different types of multiplier are implemented in Cadence with 65 nm technology. The power dissipation is calculated with the supply voltage VDD set to 1.1 V for 65 nm and averaged over 25 different samples. Table III shows the power dissipation comparison on 65 nm technology for the different types of multiplier for 4x4 and 8x8 bits. The results show that the proposed multiplier dissipates 14.9%, 25.2%, 10.25% and 4.2% less average power than the Braun multiplier[1], the 2-dimensional bypassing multipliers[9 and 10] and the low cost low power bypassing based multiplier[11], respectively.
TABLE III
POWER DISSIPATION OF DIFFERENT TYPES OF MULTIPLIER
Type of multiplier                   Power dissipation (mW)
                                     4x4 bit          8x8 bit
Braun multiplier[1]                  0.987 (100%)     6.488 (100%)
2-dimensional multiplier[9]          1.098 (112.2%)   7.251 (117.6%)
Row and column multiplier[10]        0.934 (94.6%)    6.235 (96.1%)
Low cost low power multiplier[11]    0.869 (88.1%)    5.885 (90.7%)
Proposed multiplier                  0.822 (83.3%)    5.638 (86.9%)
C. Delay calculation
Table IV shows the delay comparison on 65 nm technology for the different types of multiplier for 4x4 and 8x8 bits. The results show that the proposed multiplier has 2.55%, 4.15%, 1.85% and 0.9% less average delay than the Braun multiplier[1], the 2-dimensional bypassing multipliers[9 and 10] and the low cost low power bypassing based multiplier[11], respectively.
TABLE IV
DELAY OF DIFFERENT TYPES OF MULTIPLIER
Type of multiplier                   Delay (ns)
                                     4x4 bit          8x8 bit
Braun multiplier[1]                  4.149 (100%)     4.679 (100%)
2-dimensional multiplier[9]          4.211 (101.5%)   4.758 (101.7%)
Row and column multiplier[10]        4.128 (99.5%)    4.637 (99.1%)
Low cost low power multiplier[11]    4.098 (98.8%)    4.581 (97.9%)
Proposed multiplier                  4.055 (97.73%)   4.548 (97.2%)
D. Power delay product (PDP)
Table V shows the power-delay product comparison on 65 nm technology for the different types of multiplier for 4x4 and 8x8 bits. The results show that the proposed multiplier has 17.05%, 49.85%, 11.75% and 4.9% less average power-delay product than the Braun multiplier[1], the 2-dimensional bypassing multipliers[9 and 10] and the low cost low power bypassing based multiplier[11], respectively.
TABLE V
POWER-DELAY PRODUCT OF DIFFERENT TYPES OF MULTIPLIER
Type of multiplier                   Power-delay product (mW·ns)
                                     4x4 bit          8x8 bit
Braun multiplier[1]                  4.095 (100%)     30.357 (100%)
2-dimensional multiplier[9]          4.624 (129.2%)   34.500 (136.4%)
Row and column multiplier[10]        3.856 (94.2%)    28.912 (95.2%)
Low cost low power multiplier[11]    3.561 (86.9%)    26.958 (88.8%)
Proposed multiplier                  3.332 (81.4%)    25.642 (84.5%)
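The power-delay product entries can be reproduced from the reported power and delay values. The sketch below multiplies the Table III power (mW) by the Table IV delay (ns), which gives an energy in mW·ns (equivalently pJ), and compares it with the Table V entries. The variable names are hypothetical, and the 0.002 tolerance is an assumption chosen to absorb rounding in the last printed digit.

```python
# (4x4, 8x8) values copied from Tables III, IV and V.
power_mw = {"Braun[1]": (0.987, 6.488), "2-dimensional[9]": (1.098, 7.251),
            "Row and column[10]": (0.934, 6.235),
            "Low cost low power[11]": (0.869, 5.885),
            "Proposed": (0.822, 5.638)}
delay_ns = {"Braun[1]": (4.149, 4.679), "2-dimensional[9]": (4.211, 4.758),
            "Row and column[10]": (4.128, 4.637),
            "Low cost low power[11]": (4.098, 4.581),
            "Proposed": (4.055, 4.548)}
reported_pdp = {"Braun[1]": (4.095, 30.357), "2-dimensional[9]": (4.624, 34.500),
                "Row and column[10]": (3.856, 28.912),
                "Low cost low power[11]": (3.561, 26.958),
                "Proposed": (3.332, 25.642)}


def pdp(name):
    """Power-delay product in mW*ns for the 4x4 and 8x8 designs."""
    return tuple(p * d for p, d in zip(power_mw[name], delay_ns[name]))


for name, reported in reported_pdp.items():
    for computed, table_value in zip(pdp(name), reported):
        assert abs(computed - table_value) < 0.002
```

The same product applied to the proposed and Braun entries gives the relative figures in the table, for example 3.332 / 4.095 for the 4x4 case.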
VI. CONCLUSION
Based on an XOR gate implementation with fewer transistors, we proposed a low power less area bypassing based multiplier. In terms of area and power consumption, the experimental results show that the proposed multiplier has 49.41% less area, 13.6% lower power dissipation, 2.36% less delay and 20.89% lower power-delay product in comparison to the other types of multiplier.
REFERENCES
[1] B. Parhami, Computer Arithmetic: Algorithms and Hardware Designs, Oxford University Press, 2000.
[2] T. Nishitani, “Micro-programmable DSP chip,” 14th Workshop on Circuits and Systems, pp. 279-280, 2001.
[3] V. G. Moshnyaga and K. Tamaru, “A comparative study of switching activity reduction techniques for design of low power multipliers,” IEEE International Symposium on Circuits and Systems, pp. 1560-1563, 1995.
[4] A. Wu, “High performance adder cell for low power pipelined multiplier,” IEEE International Symposium on Circuits and Systems, pp. 57-60, 1996.
[5] T. Ahn and K. Choi, “Dynamic operand interchange for low power,” Electronics Letters, Vol. 33, no. 25, pp. 2118-2120, 1997.
[6] J. Choi, J. Jeon and K. Choi, “Power minimization of functional units by partially guarded computation,” International Symposium on Low-power Electronics and Design, pp. 131-136, 2000.
[7] J. Ohban, V. G. Moshnyaga, and K. Inoue, “Multiplier energy reduction through bypassing of partial products,” IEEE Asia-Pacific Conference on Circuits and Systems, pp. 13-17, 2002.
[8] C. Wen, S. J. Wang and Y. M. Lin, “Low power parallel multiplier with column bypassing,” IEEE International Symposium on Circuits and Systems, pp. 1638-1641, 2005.
[9] G. N. Sung, Y. J. Ciou and C. C. Wang, “A power-aware 2-dimensional bypassing multiplier using cell-based design flow,” IEEE International Symposium on Circuits and Systems, pp. 3338-3341, 2008.
[10] J. T. Yan and Z. W. Chen, “Low-power multiplier design with row and column bypassing,” IEEE International SOC Conference, pp. 227-230, 2009.
[11] J. T. Yan and Z. W. Chen, “Low cost low power bypassing based multiplier design,” IEEE International Conference, pp. 2338-2341, 2010.
Author Profile:
Mr. AMIT KUMAR SAHU, student, is currently pursuing his M.Tech in VLSI and Embedded System Design in the ECE department of Maulana Azad National Institute of Technology, Bhopal (M.P.), INDIA. He completed his B.E. at Samrat Ashok Technological Institute, Vidisha (M.P.), INDIA. His areas of research are VLSI, embedded systems and digital system design.
Ms. LAXMI KUMRE, Assistant Professor, Department of Electronics and Communication Engineering, Maulana Azad National Institute of Technology, Bhopal (M.P.), INDIA.