A POWER EFFICIENT DOMINO LOGIC DESIGN FOR WIDE FAN-IN GATES

ABSTRACT: The rapid growth of present-day technology demands low-power VLSI systems with improved performance. Domino logic is one of the most widely used logic styles in VLSI design. Domino logic circuits are often used in the performance-critical units of microprocessors and in high-speed implementations of high fan-in circuits. Domino logic overcomes the main disadvantage of dynamic logic, namely that two dynamic gates cannot be cascaded directly. It does so by simply inserting a static inverter between the two stages, so the result resembles two dynamic gates connected through a static gate. As technology is scaled down, the power supply must be scaled to decrease power consumption, which in turn increases leakage current and parasitic capacitance, contributing towards increased power dissipation. Here a new domino circuit is proposed which has very low power dissipation. This is achieved by current-comparison-based circuitry. The proposed circuit decreases the parasitic capacitance on the dynamic node, allowing a smaller keeper for wide fan-in gates so that fast and robust circuits can be implemented. This also decreases current contention and leakage current. The simulation is done using the Mentor Graphics EDA tool with 180 nm technology.

Index Terms: Domino logic, Leakage-Tolerant, Noise immunity, Wide fan-in, Current Comparison Logic.

I. INTRODUCTION

Domino logic consists of an n-type dynamic logic block followed by a static inverter, which allows dynamic gates to be cascaded. It operates in two phases: the pre-charge phase and the evaluation phase. During the pre-charge phase, the output of the n-type dynamic gate is charged up to VDD and the output of the inverter is set to 0.
During the evaluation phase, the dynamic gate conditionally discharges and the output of the inverter makes a conditional transition from 0 to 1. The static inverter has the additional advantage that the fan-out of the gate is driven by its low-impedance output, which increases noise immunity. It also reduces the capacitance of the dynamic output node. Since each dynamic gate has a static inverter, only non-inverting logic can be implemented. This is a major limiting factor. Apart from this disadvantage, very high speed can be achieved. However, implementing wide fan-in logic gates in this dynamic logic style has certain disadvantages, such as high capacitance on the dynamic node, leakage current during the evaluation phase, and high current contention due to keeper upsizing. Many logic styles were proposed to avoid these disadvantages, but each had its own demerits. A new domino logic is proposed here which has low leakage without dramatic speed degradation for wide fan-in gates. This technique uses the concept of current-comparison-based domino logic.

II. LITERATURE REVIEW

Many techniques exist for domino logic, each with its own advantages and disadvantages. The first technique proposed was the standard footless domino logic. J.M. Rabey et al., in 2002, proposed the Standard Footless Domino Logic (SFLD). This is the most popular and conventional dynamic logic. Here a PMOS keeper transistor is used to prevent unwanted discharge of the dynamic node caused by leakage current and by charge sharing in the Pull Down Network (PDN) during the evaluation phase, so noise robustness is high. But keeper upsizing increases current contention between the keeper transistor and the evaluation network, increasing the power consumption and evaluation delay of domino circuits.

H. Suzuki, C. Kim and K.
Roy proposed a logic called Diode Partitioned Domino in Feb 2002 for fast tag comparators. It reduces the parasitic capacitance and enables a smaller keeper in high fan-in gates. The diode circuit is further improved by an enhanced diode that boosts the gate voltage of the NMOS diode. Yet its power dissipation is slightly higher.

M.H Anis, M.W. Allam and M.I. Elmsary proposed High Speed Domino logic (HSD) in May 2002. A delayed clock is used to reduce the current drawn through the PMOS keeper and the NMOS PDN. This allows a large PMOS keeper to be kept without performance degradation or leakage current. However, the area and power overhead of the clock delay circuit remains.

H. Mahamoodi and K. Roy proposed Diode Footed Domino logic (DFD) in March 2004. A diode footer transistor is used in series with the evaluation network, so robustness and noise immunity increase. The main drawbacks are that, although the leakage current through the evaluation PDN is reduced, the current through the footer is increased, and the dynamic node does not discharge as fast as in the previous techniques.

A. Alvandpur, R. Krishnamurthy and K. Sourrty proposed Conditional Keeper Domino logic in Jan 2007. It consists of small and large keeper transistors. The conditional keeper domino has certain disadvantages, such as the increase in delay and power dissipation caused by upsizing.

Y. Lih, N. Tzartzanis and W.W Walker proposed a leakage current replica keeper dynamic circuit in Jan 2007. It improves the scaling of dynamic logic gates. The overhead is one FET per gate plus a portion of the replica circuit. For equal noise margin, more legs are possible, and the gate is faster with the same number of gates. A fairly large safety factor is needed to account for random on-die process variation, especially FET Vt variation.
Ali Peravi and Mohamed Asyaei proposed a robust low leakage controlled keeper based domino in 2012, which reduces leakage current and power but still suffers from serious efficiency issues in terms of area and delay.

III. PROPOSED DESIGN

The proposed design reduces power dissipation by reducing the capacitance, leakage current and current contention. In wide fan-in gates the capacitance of the dynamic node is large, so speed is decreased dramatically. In addition, the noise immunity of the gate is reduced by the many parallel leaky paths. Although upsizing the keeper transistor can improve noise robustness, power consumption and delay increase because of the large contention. These problems can be solved if the pull-down network that implements the logic function is separated from the keeper transistor by a comparison stage, in which the current of the pull-up network is compared with the worst-case leakage current. This idea is conceptually illustrated in FIG.8.

FIG.8 Proposed Conceptual Diagram

The following sections describe the ways of reducing power using the proposed Current Comparison Domino logic (CCD).

CAPACITANCE REDUCTION: Normally in domino logic, the pull-up network uses only one PMOS transistor instead of n transistors (n = 4, 8, 16, etc.), and the logic is implemented in the pull-down network. Reducing the number of transistors decreases the capacitance. Fewer transistors mean less switching (charging and discharging) of the capacitance. Dynamic power consumption therefore decreases, since the main source of dynamic power dissipation is capacitance charging and discharging.

LEAKAGE POWER REDUCTION: Leakage current is the unwanted current that flows between the source and the drain of a transistor. It can be reduced by the “stacking effect”.
The stacking effect states that when two or more series transistors are off, the leakage current is reduced.

CURRENT CONTENTION: This is where the keeper transistor comes in. A keeper transistor supplies a small current from the power-supply network to the dynamic node of a gate, so that the charge stored on the dynamic node is preserved when needed. The keeper is upsized for better robustness. The problem is that during the evaluation phase, when the pull-down network is on, contention arises between the keeper and the pull-down network. This can be avoided by temporarily disabling the keeper when the dynamic gate switches.

WORKING: The proposed CCD-based wide fan-in circuit is shown in FIG.9. In the proposed circuit, the current of the pull-up network is mirrored by transistor M2 and compared with a reference current, which replicates the leakage current of the pull-up network.

FIG.9 Proposed CCD OR gate

The proposed circuit uses an NMOS network to implement the logic function. The source and body terminals of the PMOS transistors are tied together so that the body effect is eliminated. As shown in the figure, transistor M1 is in diode configuration, meaning that the drain and the gate of M1 are tied together. This reduces the leakage current through the stacking effect when all inputs of the OR gate are low or the gate is in standby mode. Adding M1 reduces the subthreshold leakage of the evaluation network. The stacking effect does the following:
A) Makes the gate-to-source voltage of the evaluation transistors negative.
B) Decreases the drain-to-source voltage of the evaluation transistors and hence decreases the drain-induced barrier lowering (DIBL).

EFFECT OF NEGATIVE Vgs: A negative gate-to-source voltage drives the NMOS evaluation transistors deeper into cut-off, since the subthreshold current falls exponentially as Vgs decreases.
So only a very small subthreshold current flows between source and drain, reducing the leakage current.

EFFECT OF DECREASING Vds: A lower Vds weakens DIBL, the effect in which a high drain voltage lowers the threshold voltage and so increases the subthreshold current between drain and source. So reducing Vds reduces DIBL.

The reference circuit contains the transistors M5, M6, M7 and M8 and generates the reference current for the current comparison. M5 helps decrease power dissipation: it is on during active mode and off during standby mode, which reduces standby power dissipation. By the methods described above, power dissipation can be decreased effectively in wide fan-in gates, and hence in critical units.

TWO PHASES: The proposed circuit has two phases in the active mode, and its operation in each is explained below.

PRE-CHARGE PHASE: During this phase, the clock input is held low. The transistor states are tabulated in TABLE.1.

TABLE.1 Transistor states in pre-charge phase
ON STATE                     OFF STATE
Mpre, Mdis, Mk1, Mk2, M5     M1, M2, M3, Meval, M6, M7, M8

EVALUATION PHASE: During this phase there are two different conditions. The first condition is when all the inputs to the pull-down network are low. TABLE.2 shows the transistor states in this condition.

TABLE.2 When all inputs are in OFF state
ON STATE                     OFF STATE
M1, M2, Meval, M8            Mpre, Mdis, M5, M6, M7
M4, Mk1 and Mk2 are ON or OFF depending upon the inputs.

A small leakage current flows through M1, though it is off. This leakage current is mirrored by M2, but it is compensated by the keeper transistors, as they are on.

The second condition is when at least one input to the pull-down network is high. A conduction path to ground then exists, because the pull-down network is built from NMOS transistors. Since M1 is on, the current in the pull-up network is high. This is mirrored by M2.
So the dynamic node is discharged, which turns the keeper transistor off and reduces the current contention. TABLE.3 shows the transistor states when at least one input is pulled high. The CCD logic thus reduces power dissipation compared to other methods.

TABLE.3 When at least one input is in ON state
ON STATE                                             OFF STATE
M1, M2, Meval, MK1, MK2, M3, M4, Mdis, M7, M8        Mdis, M5, M6

IV. SIMULATION RESULTS

The proposed circuit is compared with the existing SFLD, HSD and DFD logics in terms of power and area for a fan-in of 4. The results are shown in TABLE.4.

TABLE.4 Power and Area comparison
LOGICS          POWER         AREA
SFLD            813.11 µW     14*12 µm
HSD             274.37 µW     19*12 µm
DFD             299.60 µW     21*12 µm
PROPOSED CCD    1.6877 µW     38*12 µm

From the table, the power dissipation of the proposed CCD logic is lower than that of the existing domino logics SFLD, HSD and DFD, but its area is larger. The comparison is done for a 4-input wide fan-in OR gate. Wide fan-in gates are chosen because they have many leaky paths, which increases power dissipation heavily. Incorporating CCD logic in wide fan-in gates decreases the power dissipation. Wide fan-in gates are main components of many devices such as microprocessors and DSP units, so power dissipation must be reduced in such critical units.

The existing SFLD and the proposed CCD are also compared in terms of power dissipation, delay and area for fan-ins of 8 and 16 inputs, for better analysis of the proposed concept. TABLE.5, TABLE.6, TABLE.7 and TABLE.8 show the comparisons between SFLD and the proposed CCD. From the tables it is clearly seen that power dissipation decreases visibly for the proposed logic, but there is still an area overhead due to the increased number of transistors.

TABLE.5 Comparison of Power Dissipation between SFLD and CCD logics
POWER DISSIPATION    SFLD           CCD
4 INPUT              818.342 μW     1.6677 μW
8 INPUT              851.50 μW      842.82 μW
16 INPUT             922.53 μW      880.177 μW

From TABLE.5 it is clear that the power dissipation is lower for the proposed CCD logic than for the existing SFLD logic.

TABLE.6 Comparison of Delay between SFLD and CCD logics
DELAY       SFLD          CCD
4 INPUT     13.90 ns      8.377 ns
8 INPUT     13.92 ns      8.410 ns
16 INPUT    13.905 ns     8.4202 ns

From TABLE.6 it is visible that the delay is considerably lower for the proposed CCD logic than for the existing SFLD logic, so the proposed logic is faster.

TABLE.7 Comparison of Area between SFLD and CCD logics
AREA        SFLD         CCD
4 INPUT     14*12 μm     38*12 μm
8 INPUT     20*12 μm     44*12 μm
16 INPUT    32*12 μm     56*12 μm

From TABLE.7 it is inferred that the proposed CCD logic has an area overhead compared with the existing SFLD logic.

TABLE.8 Input and Output noise voltage of proposed CCD logic
NOISE       INPUT       OUTPUT
4 INPUT     40.02 V     39.54 V
8 INPUT     37.29 V     35.94 V
16 INPUT    41.62 V     40.64 V

From TABLE.8, it is visible that the output noise voltage is lower than the input noise voltage of the CCD logic.

FIG.10 shows the simulation diagram of the 4-input CCD OR gate, and FIG.11 shows its simulation waveform. For an OR gate, the output goes high when at least one input is high; this can be clearly seen in the waveform.

Unity Noise Gain (UNG) is the input noise amplitude that causes the same noise amplitude at the output. Figure Of Merit (FOM) is a quantity used to characterize the performance of a device, system or method relative to its alternatives. The formula for calculating the UNG is

UNG = {V_in : V_noise = V_out}

The formula for calculating the Figure Of Merit (FOM) is given below

FIG.10 Simulation diagram of 4 input CCD OR gate
FIG.12 Layout of 4 input CCD OR gate
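The ratios between the two logic styles can be recomputed directly from the entries of TABLE.5, TABLE.6 and TABLE.7. The following Python sketch is only an illustration: the dictionary names and the `normalize` helper are assumptions introduced here, not part of the reported simulation flow, and the area is taken as the product of the two listed dimensions.

```python
# Values copied from TABLE.5, TABLE.6 and TABLE.7 as (SFLD, CCD) pairs,
# keyed by fan-in. All names here are illustrative assumptions.
POWER_UW = {4: (818.342, 1.6677), 8: (851.50, 842.82), 16: (922.53, 880.177)}
DELAY_NS = {4: (13.90, 8.377), 8: (13.92, 8.410), 16: (13.905, 8.4202)}
AREA_UM = {4: (14 * 12, 38 * 12), 8: (20 * 12, 44 * 12), 16: (32 * 12, 56 * 12)}


def normalize(table):
    """Return the CCD/SFLD ratio per fan-in, so that SFLD is 1 by construction."""
    return {n: ccd / sfld for n, (sfld, ccd) in table.items()}


if __name__ == "__main__":
    for name, table in (("power", POWER_UW), ("delay", DELAY_NS), ("area", AREA_UM)):
        ratios = normalize(table)
        print(name, {n: round(r, 3) for n, r in ratios.items()})
```

A ratio below 1 means the CCD gate is better than SFLD on that metric, and a ratio above 1 means it is worse.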
TABLE.8 Performance comparison of SFLD and CCD LOGIC
MEASURED PARAMETERS                SFLD n=4   SFLD n=8   SFLD n=16   CCD n=4   CCD n=8   CCD n=16
UNG                                1          1          1           1.04      1.12      1.12
NORMALISED POWER                   1          1          1           0.96      0.98      0.95
NORMALISED AREA                    1          1          1           2.96      2.2       1.75
NORMALISED DELAY                   1          1          1           0.36      0.36      0.36
NORMALISED SQUARE ROOT OF DELAY    1          1          1           0.77      0.79      0.77
FOM                                1          1          1           1.32      1.86      2.66

FIG.11 Waveform of CCD OR gate

As seen from TABLE.8, the measured parameters are normalized with respect to the SFLD logic. The Figure Of Merit of the proposed CCD logic is higher than that of the existing SFLD logic. In the table header, ‘n’ is the number of inputs. FIG.11 shows the comparison chart of the measured parameters for SFLD and the proposed CCD logic. From the chart it is clear that the proposed CCD logic has a high area overhead compared with the existing SFLD logic, but it has very low delay.

FIG.11 Comparison chart between SFLD and CCD OR Gate.

APPLICATION: CCD MULTIPLEXER

A new CCD-based multiplexer is proposed. FIG.12 shows the schematic of the CCD MUX. This multiplexer is compared with the existing diode footed multiplexer. Multiplexers are mostly used in the register bit lines of microprocessor memory, so it is very important that their power dissipation is low. The proposed CCD MUX shown in the figure is a 4:1 MUX. S0 to Sn are the select lines and D0 to Dn are the data inputs.

FIG.12 4:1 CCD MULTIPLEXER

Table.9 shows the comparison between the existing DFD MUX and the proposed CCD MUX.

TABLE.9 Performance comparison of DFD MUX and proposed CCD MUX
MEASURED PARAMETERS    DFD MUX      CCD MUX
POWER DISSIPATION      55.06 μW     11.644 μW
DELAY                  9.928 ns     13.328 ns
AREA                   32*23 μm     70*13 μm

From the table, the proposed CCD MUX has lower power dissipation than the existing DFD MUX, but it incurs area and delay overhead.

V.
CONCLUSION

Domino logic circuits suffer mainly from increased power dissipation, caused by leakage current in the evaluation network and by current contention resulting from the keeper upsizing that technology scaling requires. To avoid these problems, the Current Comparison Domino logic circuit is proposed. The main concept is to compare the evaluation current of the gate with the leakage current. The proposed CCD logic proves to be power efficient compared with the existing domino logics. It shows reduced power dissipation, but with an increase in area overhead. A multiplexer designed on the basis of CCD logic is compared with the existing DFD-based multiplexer; its power dissipation is visibly lower, but its area and delay are higher. So in the TSMC 180 nm CMOS process, the proposed circuit can operate reliably at low power compared with existing techniques. The simulation, power analysis, area analysis and noise analysis are carried out using the Mentor Graphics EDA tool.
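As a logic-level recap of the pre-charge and evaluation phases described for the wide fan-in OR gate, the following Python sketch models an n-input domino OR gate. It is an illustration under strong assumptions rather than the simulated circuit: the class name `DominoOrGate` and its `step` method are hypothetical, and timing, leakage, keeper sizing and the current-comparison stage are not modelled.

```python
class DominoOrGate:
    """Logic-level model of an n-input domino OR gate (no timing, no leakage)."""

    def __init__(self, fan_in):
        self.fan_in = fan_in
        self.dynamic_node = 1  # assume the gate starts pre-charged to VDD

    def step(self, clk, inputs):
        """Apply one clock level and input vector; return the inverter output."""
        if len(inputs) != self.fan_in:
            raise ValueError("expected %d inputs" % self.fan_in)
        if clk == 0:
            # Pre-charge phase: dynamic node is charged high, output goes to 0.
            self.dynamic_node = 1
        elif any(inputs):
            # Evaluation phase with a conducting pull-down path: node discharges.
            self.dynamic_node = 0
        # Evaluation phase with all inputs low: the node keeps its value,
        # which is the job of the keeper in the real circuit.
        return 1 - self.dynamic_node


if __name__ == "__main__":
    gate = DominoOrGate(4)
    print(gate.step(0, [1, 0, 0, 0]))  # pre-charge
    print(gate.step(1, [0, 1, 0, 0]))  # evaluate with one input high
```

Once the dynamic node has discharged during evaluation it stays low until the next pre-charge, which is why the output can only make a single 0 to 1 transition per clock cycle.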