International Journal of Engineering Trends and Technology (IJETT) – Volume 25 Number 4 – July 2015
ISSN: 2231-5381, http://www.ijettjournal.org

Implementation of High Speed, Low Power and Area Efficient Parallel Prefix Adder in an FPGA

S.V.Padmajarani#1, Dr.M.Muralidhar*2
# Professor and HOD (ECE), Sree Venkateswara College of Engineering, Northrajupalem, Nellore, A.P, India
* Principal, Sree Venkateswara College of Engineering and Technology, Chittoor, A.P, India

Abstract -- In the portable world, the major design issues are low power, high speed and small area. Most portable computing devices employ sophisticated and power-hungry signal processing techniques, hence there is a need to reduce the power consumption of these devices. The binary adder is a fundamental unit in many arithmetic operations and signal processing applications, so the adder has a large impact on the overall performance of the entire system. The aim of this paper is to present a low power, high speed and area efficient parallel prefix adder architecture. The proposed adder architecture is implemented in VHDL for 16-bit and 32-bit operands using Xilinx ISE 14.5, with a Spartan 3E as the target device. The experimental results are compared with those of the basic adder variants: the Ripple Carry Adder, Carry Look-ahead Adder, Carry Bypass Adder and Carry Select Adder.

Key words -- Ripple Carry Adder (RCA), Carry Look-ahead Adder (CLA), Carry Bypass Adder (CBA), Carry Select Adder (CSLA), Parallel Prefix Adder (PPA), Very Large Scale Integration (VLSI)

I. Introduction

As the scale of integration keeps growing, more and more sophisticated signal processing systems are being implemented on a VLSI chip. These signal processing applications not only demand great computation capacity but also consume a considerable amount of energy. The main objectives of most system-level or circuit-level designs are high performance and power optimization.
For high performance system design, propagation delay minimization plays an important role. Size, cost, performance and power consumption are the major issues in low power portable devices.

Binary addition is one of the primitive operations in computer arithmetic. VLSI integer adders are critical elements in general purpose and digital signal processors, since they are employed in the design of arithmetic-logic units. They are also employed in the implementation of encryption and hashing functions. The basic component of addition is the full adder, which adds three 1-bit numbers. Two-operand addition is a primitive operation included in practically all arithmetic algorithms. As a consequence, the efficiency of an arithmetic circuit strongly depends on the way the adders are implemented. A key point in two-operand adder implementation is the way the carry is computed. Several adder implementation techniques exist, such as the ripple carry adder, carry look-ahead adder, carry skip adder and carry select adder. When high operation speed is required, tree structures like parallel prefix adders are proposed in the literature [11]. Parallel prefix adders are suitable for VLSI implementation since they rely on the use of simple cells and maintain regular connections between them. The prefix structures allow several trade-offs among the number of cells used, the number of logic levels required for implementation, the fan-out of the cells, etc. [1-10].

The rest of the paper is organized as follows: In Section II, basic adder models such as the ripple carry adder, carry look-ahead adder, carry bypass adder and carry select adder are discussed. In Section III, the parallel prefix addition procedure is discussed. In Section IV, the design of the hybrid parallel prefix adder is presented. In Section V, experimental results are presented. The conclusions are drawn in Section VI.

II. Basic Adder Models

A. Ripple Carry Adder (RCA)

The ripple carry adder is one of the simplest adders to implement. This adder takes in two N-bit inputs and produces (N+1) output bits: an N-bit sum and a 1-bit carry out. The ripple carry adder is built from N full adders cascaded together, with the carry out of one full adder connected to the carry in of the next full adder. A ripple carry adder for 4-bit addition is shown in Fig. 1. The implementation of a 16-bit ripple carry adder based on the 4-bit RCA is given in Fig. 2.

Fig. 1 Logic circuit of 4-bit Ripple Carry Adder

Fig. 2 Circuit for 16-bit Ripple Carry Adder

B. Carry Look-ahead Adder (CLA)

Carry look-ahead logic uses the concepts of generating and propagating carries. For each bit pair, the carry look-ahead logic determines whether that bit pair will generate a carry or propagate a carry. This allows the circuit to "pre-process" the two numbers being added and determine the carries ahead of time. With $G_i$ and $P_i$ denoting the generate and propagate signals of bit position $i$ and $C_0$ the carry in, the carry out signals of a 4-bit adder are computed by the following equations:

$$C_1 = G_0 + P_0 C_0$$
$$C_2 = G_1 + P_1 G_0 + P_1 P_0 C_0$$
$$C_3 = G_2 + P_2 G_1 + P_2 P_1 G_0 + P_2 P_1 P_0 C_0$$
$$C_4 = G_3 + P_3 G_2 + P_3 P_2 G_1 + P_3 P_2 P_1 G_0 + P_3 P_2 P_1 P_0 C_0$$

The logic circuit of the 4-bit carry look-ahead adder is shown in Fig. 3. The implementation of a 16-bit carry look-ahead adder based on the 4-bit CLA is given in Fig. 4.

Fig. 3 Logic Circuit of 4-bit Carry Look-ahead Adder

Fig. 4 Circuit of 16-bit Carry Look-ahead Adder

C. Carry Skip Adder (or) Carry Bypass Adder (CBA)

A carry skip adder consists of a simple ripple carry adder with a special speed-up carry chain called a skip chain. The carry skip adder addresses the carry propagation delay of the ripple chain by looking at a group of bits and determining whether this group has a carry out or not. The logic circuit of the 4-bit carry bypass adder is shown in Fig. 5. The implementation of a 16-bit carry bypass adder based on the 4-bit CBA is given in Fig. 6.

Fig. 5 Logic Circuit of 4-bit Carry Bypass Adder

Fig. 6 Circuit of 16-bit Carry Bypass Adder

In Fig. 5, Cout = Cin when the BP (bypass) signal is 1; otherwise the carry computed at the last stage (C4) is the final Cout. P0, P1, P2 and P3 are the propagate signals from all the full adders, as in carry look-ahead addition.

D. Carry Select Adder (CSLA)

The carry select adder divides the adder into blocks that have the same input operands except for the carry in. Each block performs two additions, one assuming the carry in is 1 (Cin = 1) and one assuming the carry in is 0 (Cin = 0), and chooses between the two results once the actual carry in is known. The logic circuit of the 4-bit carry select adder is shown in Fig. 7. The implementation of a 16-bit carry select adder based on the 4-bit CSLA is given in Fig. 8.

Fig. 7 Logic Circuit of 4-bit Carry Select Adder

Fig. 8 Circuit of 16-bit Carry Select Adder

III. Parallel Prefix Addition

The parallel prefix addition is done in three steps [12]:

Step 1: Calculate the generate and propagate signals from the operand bits $a_i$ and $b_i$ with the following equations.

$$g_i = a_i \cdot b_i$$
$$p_i = a_i \oplus b_i$$

Step 2: Use one of the parallel prefix trees to compute the carry input signals ($carry_i$) for the final addition.

Step 3: Perform simple addition using the following equation.

$$sum_i = p_i \oplus carry_i$$

IV. Proposed Hybrid Parallel Prefix Adder

The proposed hybrid parallel prefix adder for 16-bit addition is presented in Fig. 9. It is designed using four types of operators, namely the black, gray, O3black and O3gray operators. These operators receive generate and propagate signals from the previous level and compute the required generate and propagate signals for the next stage.

Fig. 9 Hybrid parallel prefix adder for 16-bit addition

The black and gray operators used in the proposed hybrid parallel prefix adder are shown in Fig. 10(a) and Fig. 10(b), respectively.
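The three-step procedure of Section III can be illustrated in software. The following Python sketch is only a behavioural model under stated assumptions: Step 2 uses a Kogge-Stone-style prefix tree as a stand-in (it is not the hybrid tree of Fig. 9), the prefix cell is the textbook combination (g, p) o (g', p') = (g + p·g', p·p'), and the function name `parallel_prefix_add` and its parameters are hypothetical.

```python
def parallel_prefix_add(a, b, width=16, carry_in=0):
    """Behavioural model of three-step parallel prefix addition.

    Returns (sum, carry_out) for unsigned operands of the given width.
    Assumption: a Kogge-Stone-style tree is used for Step 2; this is a
    stand-in and NOT the hybrid prefix tree proposed in the paper.
    """
    mask = (1 << width) - 1
    a &= mask
    b &= mask

    # Step 1: per-bit generate (a AND b) and propagate (a XOR b) signals.
    g = [((a >> i) & 1) & ((b >> i) & 1) for i in range(width)]
    p = [((a >> i) & 1) ^ ((b >> i) & 1) for i in range(width)]

    # Step 2: prefix tree. After the loop, G[i] / P[i] are the group
    # generate / propagate signals of bit positions i down to 0.
    G, P = g[:], p[:]
    dist = 1
    while dist < width:
        next_G, next_P = G[:], P[:]
        for i in range(dist, width):
            # Textbook prefix cell: combine group ending at i with the
            # group ending at i - dist.
            next_G[i] = G[i] | (P[i] & G[i - dist])
            next_P[i] = P[i] & P[i - dist]
        G, P = next_G, next_P
        dist *= 2

    # carry[i] is the carry INTO bit i; carry[width] is the carry out.
    carry = [carry_in] + [G[i] | (P[i] & carry_in) for i in range(width)]

    # Step 3: sum bit = propagate XOR carry input of the same position.
    total = 0
    for i in range(width):
        total |= (p[i] ^ carry[i]) << i
    return total, carry[width]
```

As a usage example, `parallel_prefix_add(0xFFFF, 0x0001)` returns `(0, 1)`: the carry generated at bit 0 is propagated through all higher positions, so every sum bit is 0 and the carry out is 1. A hardware prefix tree evaluates all cells of one level concurrently; the nested loops here only mimic that level-by-level structure.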
Fig. 10 (a) Black operator (b) Gray operator

The black operator receives two sets of generate and propagate signals $(g_i, p_i)$, $(g_j, p_j)$ and computes one set of generate and propagate signals $(g_o, p_o)$ by the following equations:

$$g_o = g_i + p_i \cdot g_j \qquad (8)$$
$$p_o = p_i \cdot p_j \qquad (9)$$

The gray operator receives two sets of generate and propagate signals $(g_i, p_i)$, $(g_j, p_j)$ and computes only one generate signal, using the same equation as equation (8).

The O3black and O3gray operators are shown in Fig. 11(a) and Fig. 11(b), respectively.

Fig. 11 (a) O3black operator (b) O3gray operator

The O3black operator takes three pairs of generate and propagate values $(g_i, p_i)$, $(g_j, p_j)$, $(g_k, p_k)$ as inputs and produces the generate and propagate output values $(g_o, p_o)$ as follows:

$$g_o = g_i + p_i \cdot g_j + p_i \cdot p_j \cdot g_k \qquad (10)$$
$$p_o = p_i \cdot p_j \cdot p_k \qquad (11)$$

The O3gray operator takes three pairs of generate and propagate values $(g_i, p_i)$, $(g_j, p_j)$, $(g_k, p_k)$ as inputs and produces only one generate signal output, as per equation (10). These operators are implemented using a multiplexer-based design [13].

V. Experimental Results

The adders are implemented in VHDL using Xilinx ISE 14.5, with the XC3S500E as the target device. The basic adders (ripple carry adder, carry look-ahead adder, carry bypass adder and carry select adder) and the hybrid parallel prefix adder are simulated and synthesized using the Xilinx tool for 16-bit and 32-bit addition. The results of these adders for 16-bit addition and 32-bit addition are tabulated in Table 1 and Table 2, respectively. The comparison is done for three factors: speed, area and power consumption. The speed performance is evaluated with respect to logic levels and delay. The area requirement is estimated from the number of slices and look-up tables (LUTs) utilized. The power consumption is analyzed by taking the switching (dynamic) power into account, which mainly depends on the input test vectors applied through the test bench. Static power is not considered because of the lack of available ASIC tools. The design summary of the synthesis and the power analysis results of the 32-bit hybrid parallel prefix adder are given in Fig. 12 and Fig. 13, respectively.

Table 1. Results of various adders for 16-bit addition

Adder type    Slices  LUTs  Average fan-out  Delay (ns)  Logic levels  Power (mW)
RCA           22      31    1.76             20.629      17            33
CLA           22      31    1.76             20.317      17            33
CBA           30      49    2.87             18.352      16            33
CSLA          22      40    2.59             14.384      11            33
Hybrid PPA    27      46    2.40             14.886      12            33

Table 2. Results of various adders for 32-bit addition

Adder type    Slices  LUTs  Average fan-out  Delay (ns)  Logic levels  Power (mW)
RCA           46      63    1.76             37.604      33            62
CLA           45      65    1.93             34.921      31            62
CBA           50      86    2.10             45.169      40            62
CSLA          46      84    2.71             23.083      19            62
Hybrid PPA    57      105   2.58             21.162      19            52

Fig. 12 Design Summary of hybrid parallel prefix adder for 32-bit addition

Fig. 13 Power analysis results of hybrid parallel prefix adder for 32-bit addition

For better comparison, the results of the various types of adders are presented in Figs. 14 to 19.

Fig. 14 Utilization of Slices of various adder variants for 16-bit and 32-bit addition

Fig. 15 Utilization of LUTs of various adder variants for 16-bit and 32-bit addition

Fig. 16 Delay of various adder variants for 16-bit and 32-bit addition

Fig. 17 Average Fan-out of various adder variants for 16-bit and 32-bit addition

Fig. 18 Logic Levels of various adder variants for 16-bit and 32-bit addition

Fig. 19 Dynamic Power Consumption of various adder variants for 16-bit and 32-bit addition

The speed performance is estimated with the delay parameter. For 16-bit addition, the delay of the CSLA is the least of all the adder models, and the delay of the hybrid parallel prefix adder is very close to it (14.886 ns versus 14.384 ns). When the operand width increases to 32 bits, the hybrid PPA has the smallest delay of all the models (21.162 ns). For 16-bit addition, none of the adders shows any difference in dynamic power consumption (33 mW each). The 32-bit hybrid PPA consumes less power than all the other adder models (52 mW versus 62 mW).

The utilization of slices and LUTs for 16-bit and 32-bit addition is higher for the CBA and the hybrid PPA than for the other adder models. For 32-bit addition the hybrid PPA occupies somewhat more area than the CBA (57 slices and 105 LUTs versus 50 and 86), whereas for 16-bit addition it occupies slightly less (27 slices and 46 LUTs versus 30 and 49).

VI. Conclusions

This paper presents the implementation of various adder variants, namely the ripple carry adder, carry look-ahead adder, carry bypass adder, carry select adder and the proposed hybrid parallel prefix adder, for 16-bit and 32-bit addition. To evaluate the performance of these adders, VHDL designs are synthesized with Xilinx ISE 14.5 for the XC3S500E device of the Spartan 3E family. The implementation of the proposed hybrid parallel prefix adder in an FPGA shows its superiority in speed and low power consumption, particularly for 32-bit operands, at the cost of a small area overhead. The proposed hybrid parallel prefix adder can be further used in cryptographic applications such as encryption and hashing functions, and in signal processing applications such as FIR filters, to enhance the overall system performance.

References

[1] J. Sklansky, "Conditional-sum addition logic", IRE Transactions on Electronic Computers, vol. EC-9, pp. 226-231, June 1960.
[2] Y. Choi and E. E. Swartzlander, Jr., "Parallel prefix adder design with matrix representation", in Proc. 17th IEEE Symposium on Computer Arithmetic (ARITH), pp. 90-98, 2005.
[3] P. Kogge and H. Stone, "A parallel algorithm for the efficient solution of a general class of recurrence equations", IEEE Trans. Computers, vol. C-22, no. 8, pp. 786-793, Aug. 1973.
[4] Giorgos Dimitrakopoulos and Dimitris Nikolos, "High-speed parallel-prefix VLSI Ling adders", IEEE Trans. on Computers, vol. 54, no. 2, Feb. 2005.
[5] T. Han and D. Carlson, "Fast area-efficient VLSI adders", Proc. 8th Symp. Comp. Arith., pp. 49-56, Sep. 1987.
[6] Taeko Matsunaga, Shinji Kimura and Yusuke Matsunaga, "Synthesis of parallel prefix adders considering switching activities", IEEE International Conference on Computer Design, pp. 404-409, 2008.
[7] Taeko Matsunaga and Yusuke Matsunaga, "Timing-constrained area minimization algorithm for parallel prefix adders", IEICE Trans. Fundamentals, vol. E90-A, no. 12, Dec. 2007.
[8] Jianhua Liu, Haikun Zhu, Chung-Kuan Cheng and John Lillis, "Optimum prefix adders in a comprehensive area, timing and power design space", Proceedings of the 2007 Asia and South Pacific Design Automation Conference, Washington, pp. 609-615, Jan. 2007.
[9] R. Ladner and M. Fischer, "Parallel prefix computation", J. ACM, vol. 27, no. 4, pp. 831-838, Oct. 1980.
[10] R. Brent and H. Kung, "A regular layout for parallel adders", IEEE Trans. Computers, vol. C-31, no. 3, pp. 260-264, March 1982.
[11] S.V.Padmajarani and M.Muralidhar, "Comparison of Parallel Prefix Adders Performance in an FPGA", International Journal of Engineering Research and Development (IJERD), vol. 3, no. 6, pp. 62-67, September 2012.
[12] S.V.Padmajarani and M.Muralidhar, "A Hybrid Parallel Prefix Adder for high speed computing", Proc. 7th National Conference on Advances in Electronics and Communications (ADELCO), 2011.
[13] S.V.Padmajarani and M.Muralidhar, "A New Approach to implement Parallel Prefix Adders in an FPGA", International Journal of Engineering Research and Applications (IJERA), vol. 2, no. 4, pp. 1524-1528, July-August 2012.