Low Power and Area Efficient Carry Select Adder D.Dharmaja ,Kuppam N Chandra Sekhar, R.Mallikarjuna Reddy Abstract— Carry Select Adder (CSA) is one of the fastest adders used in many data-processing processors to perform fast arithmetic functions. From the structure of the CSA, it is clear that there is scope for reducing the area and power consumption in the CSA. This work uses a simple and efficient gate-level modification to significantly reduce the area and power of the CSA. Based on this modification 8-, 16-, 32-, and 64-b square-root CSA architecture have been developed and compared with the regular CSA architecture. The proposed design has reduced area and power as compared with the regular CSA with only a slight increase in the delay.This work evaluates the performance of the proposed designs in terms of delay, area, power, and their products by hand with logical effort and through custom design and layout in 0.18-µm CMOS process technology. The results analysis shows that the proposed CSA structure is better than the regular CSA.In our project we performed coding and simulation up to 64 bit and analysis up to 32-bit. Index Terms—CMOS, flip - flops, Double - edge triggered, power dissipation, delay and PDP. INTRODUCTION Very Large Scale Integration, or VLSI, is a general term for an integrated circuit chip that has 10,000 to 1 million transistors and related components. As with other technology developments, VLSI didn’t happen overnight, but came as a series of progressive steps, one improvement coming upon another. As integrated circuits crossed the VLSI threshold in the late 1970s, personal computers using these chips advanced from being hobbyist toys to useful systems for science and business.In the 1980s, chip makers had achieved 10,000 transistors on a single device. The transistors, which were invisible to the naked eye for SSI devices, became so small for VLSI you needed an electron microscope to see them. The greater number of transistors allowed chip makers to add new capabilities to microprocessors, such as math functions, RAM and ROM memory and other specialized subsystems. D.Dharmaja ,ECE Department, Jawaharlal Nehru Technology University,GVIC College,Madanapalle,Andhra Pradhesh ,INDIA,Mobile:7799877206.,e-mail:dharmasofty@gmail.com Kuppam N Chandra Sekhar, ECE Department, Jawaharlal Nehru Technology University,GVIC College,Madanapalle,Andhra Pradhesh ,INDIA,Mobile:9885229501, e-mail: chandrasekhar501@gmail.com. R.Mallikarjuna Reddy, ECE Department, Jawaharlal Nehru Technology University, GVIC College, Madanapalle, Andhra Pradhesh, INDIA,Mobile:8096330172 email:mallikarjunareddy.r416@gmail.com The decades following the 80s continued this trend, pushing transistor counts past the billion mark, and making technologies such as smart phones possible.Area and power have major role in the designing of integrated circuit because of the increase in popularity of portable systems as well as the rapid growth of power density in VLSI circuits. Addition usually influences strongly on the overall performance of digital systems and a crucial arithmetic function. Adders are most widely used in electronic applications. For example, in microprocessors, millions of instructions per second are performed. Due to the increase in the portability of the devices like mobile, laptop etc. require more battery backup. Low power and area efficient addition and multiplication have always been a fundamental requirement of high performance processors and systems. Designing efficient adder is the most difficult problem for researchers in VLSI design. Design of area- and power-efficient high-speed data path logic systems are one of the most substantial areas of research in VLSI system design. In digital adders, the speed of addition is limited by the time required to propagate a carry through the adder. The sum for each bit position in an elementary adder is generated sequentially only after the previous bit position has been summed and a carry propagated into the next position. The CSLA is used in many computational systems to alleviate the problem of carry propagation delay by independently generating multiple carries and then select a carry to generate the sum. However, the CSLA is not area efficient because it uses multiple pairs of Ripple Carry Adders (RCA) to generate partial sum and carry by considering carry input Cin=0 and Cin=1, then the final sum and carry are selected by the multiplexers (mux).The basic idea of this work is to use Binary to Excess-1 Converter(BEC) instead of RCA with Cin=1 in the regular CSLA to achieve lower area and power consumption. The main advantage of this BEC logic comes from the lesser number of logic gates than the n-bit Full Adder (FA) structure. II LITERATURE SURVEY 2.1.Full adder To understand the working of a ripple carry adder completely, you need to have a look at the full adder too. Full adder is a logic circuit that adds two input operand bits plus a Carry in bit and outputs a Carry out bit and a sum bit. The Sum out (Sout) of a full adder is the XOR of input operand bits A, B and the Carry in (Cin) bit. Truth table and schematic of a 1 bit Full adder is shown below 1 B CO UT NOT gate’s input is the propagation delay here. Similarly the carry propagation delay is the time elapsed between the application of the carry in signal and the occurance of the carry out (Cout) signal. 2.3 Carry Select Adder To solve the carry propagation delay CSA is developed which drastically reduces the delay to a great extent. The CSA is used in many computational systems design to moderate the problem of carry propagation delay by independently generating multiple carries and then select a carry to generate the sum. It uses independent ripple carry adders (for Cin=0 and Cin=1) to generate the resultant sum. However, the Regular CSA is not area and power efficient because it uses multiple pairs of Ripple Carry Adders (RCA) to generate partial sum and carry by considering carry input. A FA Ci n So utdiagram of full adder Figure 2.1 Block There is a simple trick to find results of a full adder. Consider the second last row of the truth table, here the operands are 1, 1, 0 ie (A, B, Cin). Add them together ie 1+1+0 = 10 . In binary system, the number order is 0, 1, 10, 11……. and so the result of 1+1+0 is 10 just like we get 1+1+0 =2 in decimal system. 2 in the decimal system corresponds to 10 in the binary system. Swapping the result “10″ will give S=0 and Cout = 1 and the second last row is justified. This can be applied to any row in the table. The final sum and carry are selected by the multiplexers (mux). To overcome the above problem, the basic idea of the proposed work is to use n-bit binary to excess-1 code converters (BEC) to improve the speed of addition. This logic can be replaced in RCA for Cin=1 to further improves the speed and thus reduces the delay. Using Binary to Excess-1 Converter (BEC) instead of RCA in the regular CSLA will achieve lower area, delay which speeds up the addition operation. The main advantage of this BEC logic comes from the lesser number of logic gates than the Full Adder (FA) structure because the number of gates used will be decreased. 2.2.Schematic diagram of full adder To draw full adder we need two ‘xor’ gates ,three ‘and’ gates and one ‘or ‘gate. Total number of gates need to construct full adder are six. Here A,B,C are inputs and sum(S),carry(Cout) are the outputs. A B C S COUT Figure 2. 3 Block diagram of 4-bit CSA 2.4 BINARY TO EXCESS-1 CONVERTER Figure 2.2 Schematic diagram of full adder 2.3. Ripple carry adder Multiple full adder circuits can be cascaded in parallel to add an N-bit number. For an N- bit parallel adder, there must be N number of full adder circuits. A ripple carry adder is a logic circuit in which the carry-out of each full adder is the carry in of the succeeding next most significant full adder. It is called a ripple carry adder because each carry bit gets rippled into the next stage. In a ripple carry adder the sum and carry out bits of any half adder stage is not valid until the carry in of that stage occurs.Propagation delays inside the logic circuitry is the reason behind this. Propagation delay is time elapsed between the application of an input and occurance of the corresponding output. Consider a NOT gate, When the input is “0″ the output will be “1″ and vice versa. The time taken for the NOT gate’s output to become “0″ after the application of logic “1″ to the The main advantage of this BEC logic comes from the lesser number of logic gates than the n-bit Full Adder (FA) structure.The importance of the BEC logic stems from the large silicon area reduction when the CSLA with large number of bits are designed. The Boolean expressions of the 4-bit BEC is listed as (note the functional symbols ˜ NOT, &AND,ˆXOR) B1 = ~ A1 B2 = A2 ^A1 B3 = (A1&A2)^ A3 The main function of bec-1 is it increases the given input by 1.that is it gives the output which is one bit greater than input i.e, if input is zero then bec-1 output is one. the logic of a 4-b BEC is shown below. 2 ripple carry adder with lower bits and desired cin will act as a select line. If the carry out is zero then the output of ripple carry adder with fixed carry input zero will be selected as output ,otherwise the output of the ripple carry adder with fixed input one will be selected as output. IV PROPOSED DESIGN 4.1 BLOCK DIAGRAMS OF PROPOSED CSA Figure 2.4 Schematic diagram of bec-1 III EXISTING DESIGN 3.1 BLOCK DIAGRAM OF EXISTING CSA Carry select adder divides the input into two parts. In 32-bit CSA lower bits i.e (A0,B0- - - - - - - A15,B15) are consider as first part and higher bits i.e (A16,B16- - - - - A31,B31) are considered as second part. CSA has mainly three sections of ripple carry adders, first part of the input is given to the first section RCA with desired carry input ,second part of the input is given to the second section RCA with fixed carry input ‘0’ and the same second part of the input is given to the third section RCA with fixed carry input ‘1’. The carry output of the first section ripple carry adder will acts as a selection line to all the multiplexers. If cout of the first section RCA is ‘1’ ,then the mux will choose the output of second section RCA with fixed carry input ‘1’, and vice versa. Figure 4.1 Block diagram of Existing 32 – bit CSA The CSA divides the problem into two parts that is lower bits and higher bits. Lower bits are given to the ripple carry adder with desired carry input and the higher bits are given to the ripple carry adder with fixed carry input zero and again the higher bits are given to the ripple carry adder with fixed carry input one . Here we use multiple multiplexers (higher bits plus one ,number of mux ) for all this mux the carry out from the Figure 4.1 Block diagram of proposed 32-bit CSA 3 Carry select adder divides the input into two parts. In 32-bit CSA lower bits i.e (A0,B0- - - - 15,B15) are consider as first part and higher bits i.e (A16,B16- - - - A31,B31) are considered as second part. CSA has mainly three sections of ripple carry adders, first part of the input is given to the first section RCA with desired carry input, second part of the input is given to the second section RCA with fixed carry input ‘0’ and the same second part of the input is given to the third section RCA with fixed carry input ‘1’. In proposed CSA the second section RCA with fixed carry input ‘1’ is replaced by binary to excess-1 converter and the output of second section RCA with fixed carry input ‘0’ is given as input to the BEC-1 . The carry output of the first section ripple carry adder will acts as a selection line to all the multiplexers. If cout of the first section RCA is ‘1’, then the mux will choose the BEC output, and vice versa. V SIMULATION RESULTS Figure 5.2 Simulation result of proposed 32-bit CSA 5.3.Layout of existing 32 bit-CSA This is the layout result for existing thirty two bit CSA generated from the micro wind tools. 5.1 Existing 32bit-CSA This is the simulation result for existing thirty two bit carry select adder, the inputs are ‘11110000111100001111000011110000’(a31,a30,a29,a28,a 27,a26,a25,a24,a23,a22,a21,a20,a19,a18,a17,a16,a15,a14,a1 3,a12,a11,a10,a9,a8,a7,a6,a5,a4,a3,a2,a1,a0) and ‘11110000111100001111000011110000’(b31,b30,b29,b28, b27,b26,b25,b24,b23,b22,b21,b20,b19,b18,b17,b16,b15,b14 ,b13,b12,b11,b10,b9,b8,b7,b6,b5,b4,b3,b2,b1,b0) and the output is ‘11100001111000011110000111100000’ with carry output one Figure 5.4 Layout of existing 32-bit CSA 5.4 Proposed 32-bitCSA This is the lay out result for proposed thirty two bit CSA generated from the micro wind tools. Figure 5.1 Simulation result of existing 32-bit CSA 5.2 Proposed 32bit-CSA This is the simulation result for proposed thirty two bit carry select adder,the inputs are 11110000111100001111000011110000(a31……….a0) and 11110000111100001111000011110000(b31……….b0) and the otput is 11100001111000011110000111100000 with carry output one. Figure 5.8 Layout of proposed 32-bit CSA 4 REFERENCES 5.5 RESULT ANALYSIS: Bit size Existing CSA Proposed CSA 16-bit 2.055mw 80.484µw 32-bit 0.947mw 0.315mw Table 1 Power analysis result 5.5.2 Area Analysis Bit size O. Bedrij, “Carry select adder,” IRE Trans. Electron. Comput., vol. EC-11, pp. 340–346, 1962. 2. T. Han and D. Carlson' "Fast area efficient VLSI adders'" Proceedings of the Eighth Symposium on Computer Arithmetic. Como' Italy' pp.49-56' September 1987. (Pubitemid 17613979) . 3. Ling, "High Speed Binary Parallel Adder", IEEE Transactions on Electronic Computers, EC-15, p.799809, October, 1966. 4. J. E. Robertson, “A Deterministic Procedure for the Design of Carry-Save Adders and Borrow-Save Subtractor,” University of Illinois, Urbana-Champaign, Dept. of Computer Science, Report No. 235, July 1967. 5. R. F. Hobson. “Optimal skip-block considerations for regenerative carry-skip adders” IEEE J. Solid-State Circuits, 30(9):1020– 1024, September 1995. 6. O. J. Bedrij, “Carry-select adder,” IRE Trans. Electron. Comput., pp. 340–344, 1962. 7. B. Ramkumar, H.M. Kittur, and P. M. Kannan, “ASIC implementation of modified faster carry save adder,” Eur. J. Sci. Res., vol. 42, no. 1, pp. 53–58, 2010. 8. T. Y. Ceiang and M. J. Hsiao, “Carry-select adder using single ripple carry adder,” Electron. Lett., vol. 34, no. 22, pp. 2101–2103, Oct. 1998. 9. Y. Kim and L.-S. Kim, “64-bit carry-select adder with reduced area,” Electron. Lett., vol. 37, no. 10, pp. 614– 615, May 2001. 10. J. M. Rabaey, Digtal Integrated Circuits—A Design Perspective. Upper Saddle River, NJ: Prentice-Hall, 2001. 11. Y. He, C. H. Chang, and J. Gu, “An area efficient 64-bit square root carry-select adder for lowpower applications,” in Proc. IEEE Int. Symp. Circuits Syst., 2005, vol. 4, pp. 4082–4085. 12. Cadence, “Encounter user guide,” Version 6.2.4, March 2008. 1. 5.5.1 Power Analysis Existing CSA Proposed CSA 16-bit 7.6875 nm2 5.309 nm2 32-bit 14.648 nm2 10.022 nm2 Table 2 Area analysis result Power analysis shows that proposed CSA consumes less power when compared to the regular CSA and the area analysis shows that the proposed CSA occupies less power when compared to the regular CSA. The quantum of area gain achieved in the proposed circuit increases with the increase in the word- size of the adders. CONCLUSION A simple approach is proposed in our project to reduce the area and power of CSA architecture. The reduced number of gates of this work offers the great advantage in the reduction of area and also the total power. The compared results show that the modified CSA has a slightly larger delay but the area and power of the modified 4 –bit CSA are significantly reduced by 19.4% and 34% respectively, for 8bit CSA the power and area are significantly reduced by 24.83% and 36% respectively, for 16-bit CSA the power and area are significantly reduced by 30.94% and 96% respectively and for 32 –bit CSA the power and area are significantly reduced by 31% and 66.7 % respectively . The power-delay product and also the area-delay product of the proposed design show a decrease for 16-, 32-, and 64-b sizes which indicates the success of the method and not a mere tradeoff of delay for power and area. The modified CSA architecture is therefore, low area, low power, simple and efficient for VLSI hardware implementation .It would be interesting to test the design of the modified 128 bit modified CSA. FUTURE SCOPE This work has been designed for 8-bit, 16-bit, 32bit and 64-bit word size and results are evaluated for parameters like area and power. This work can be further extended for higher number of bits. New architectures can be designed in order to reduce the power, area of the circuits. Steps may be taken to optimize the other parameters like frequency, number of gate clocks, length etc. BIOGRAPHIE D.Dharmaja , received his Bachelor Degree in Electronics and Commuication Engineering from Jawaharlal Nehru Technological University Anantapur in the year 2012. Currently he is doing his Master’s Degree in VLSI and ES System Design in Jawaharlal Nehru Technological University Anantapur. Her interested areas are Low Power VLSI Design, Digital Circuits Design and VLSI Technology Kuppam N Chandra Sekhar, received her Bachelor Degree in Electronics & Communication Engineering from Jawaharlal Nehru Technological University Hyderabad in the year 2010, Master’s Degree in VLSI System Design from Jawaharlal Nehru Technological University Hyderabad in the year 2013. Presently working as an Assistant Professor in ECE Department In GVIC at Madanapalle.His interested areas of research are Communication,VLSI System Design and Low Power VLSI Design. R.Mallikarjuna Reddy, received her Bachelor Degree in Electronics & Communication Engineering from Jawaharlal Nehru Technological University Hyderabad in the year 2004, Master’s Degree in VLSI System Design from Jawaharlal Nehru Technological University Hyderabad in the year 2008. Presently working as an Assistant Professor in ECE Department In GVIC at Madanapalle. Her interested areas of research are Nano Electronics, VLSI System Design and Low Power VLSI Design. 5