Minimum Dynamic Power CMOS Design with Variable Input Delay Logic
Tezaswi Raja
PhD Dissertation, Dept. of ECE, Rutgers University
Thesis Advisors:
- Vishwani D. Agrawal, Dept. of ECE, Auburn University
- Michael L. Bushnell, Dept. of ECE, Rutgers University
Research Funded by: National Science Foundation

Talk Outline
- Motivation
- Background on Glitch Elimination Techniques
- Problem Statement
- New Variable Input Delay Logic
- Transistor Level Design of Variable Input Delay Gate
- Results
- Physical Level Implementation
- Conclusion and Future Work

Apr 23, 2004 Tezaswi Raja: PhD Dissertation

Motivation
Why low power design of circuits?
- Excessive power consumption discourages their use in portable systems.
- Excessive power increases packaging cost.
- Excessive power increases cooling equipment cost.

CMOS Power Dissipation
[Figure: schematic of a CMOS gate and electrical model of a gate for a rising transition; labels V, Ron, Cp, C, CL]
- Short circuit power
- Leakage power (IDDQ)
- Dynamic power
  - Essential transitions and glitches
  - Each transition dissipates CV^2/2

What Are Glitches?
[Figure: example gate with Delay = 1 and path delay labels 2, 2]
- Glitches occur due to differential (unbalanced) path delays.
- Glitches are transients that are unnecessary for the correct functioning of the circuit.
- Glitches waste power in CMOS circuits.

Prior Work
- Hazard Filtering for Glitch Elimination: glitch suppression by increasing the inertial delay of gates. Ref: Agrawal et al., VLSI Design '97, '99, '03, '04.
- Delay Balancing for Glitch Elimination: balancing delays by adding buffers on select paths. Ref: Chandrakasan and Brodersen, and other books.
- Gate Sizing for Glitch Elimination: every gate is modeled as an equivalent inverter. The model is non-linear. Ref: Berkelaar et al., IEEE Trans. on Circuits and Systems '96.
- Transistor Sizing for Area-Speed Optimization: size the width and length of every transistor to get the exact delay. The model is non-linear, with convergence problems due to the large search space. Ref: Fishburn et al., ICCAD '85.

Hazard Filtering
[Figure: filtering effect of a gate fed by two paths, P1 and P2; labels 3 and 2]
- Differential delay = |delay(P1) - delay(P2)|
- Glitch suppression condition: inertial delay > differential delay
Ref: Low Power Design by Hazard Filtering, Agrawal (VLSI Design '97).

Linear Constraint Set (LP)
[Figure: overall design flow: LP constraint set -> optimized gate and buffer delays -> power estimator; gate i with delay di, input windows (t1, T1) ... (tn, Tn) and output window (ti, Ti)]
- Get optimized delays using linear programming.
- Delays of gates (di) and buffers are variables.
- ti: earliest time of signal transition at gate i.
- Ti: latest time of signal transition at gate i.
Ref: Minimum dynamic power CMOS design by a reduced constraint set linear program, Raja, Agrawal and Bushnell (VLSI Design 2003).

Timing Window Constraints
[Figure: input transition windows (t1, T1), (t2, T2), ... (tn, Tn) of gate i with delay di, and the output transition interval (ti, Ti) on a time axis]
- Differential delay at gate i = largest window of possible transitions = |min(t1, t2, ..., tn) - max(T1, T2, ..., Tn)|
- t constraints: ti < min(t1, ..., tn) + di, written as ti < t1 + di, etc.
- T constraints: Ti > max(T1, ..., Tn) + di, written as Ti > T1 + di, etc.
- Glitch constraint: di > Ti - ti

Prior Work: Linear Constraint Set (LP)
Advantages:
- Constraints are gate specific.
- Constraint set size is linear in the number of gates in the circuit.
- Solution is provably identical to that obtained by path-oriented constraints.
Disadvantages:
- Delay buffers are inserted to meet the IO delay constraint.
- Example: c1355, containing 619 gates, needs 224 buffers when no speed degradation is permitted.
- Buffers reduce the best achievable power savings.

Example: Why Buffers Were Necessary?
[Figure: example circuit with unit gate delays (1, 1, 1); critical path delay = 3]
- The delay unit is the smallest delay possible for a gate in a given technology.
- The critical path is the longest delay path in the circuit and determines the speed of the circuit.

Example (cont.)
[Figure: signal arrival times at the first gate on a time axis]
For glitch-free operation of the first gate: differential delay at inputs < inertial delay. OK.

Example (cont.)
[Figure: signal arrival times at the second gate on a time axis]
For glitch-free operation of the second gate: differential delay at inputs < inertial delay. OK (assuming equality does not produce a glitch).

Example (cont.)
[Figure: signal arrival times at the third gate on a time axis]
For glitch-free operation of the third gate: differential delay at inputs < inertial delay. Not true for gate 3.

Example (cont.)
[Figure: example circuit with a delay buffer added]
For glitch-free operation with no IO delay increase:
- A delay buffer must be added.
- The buffer is necessary for conventional gate design, where only the gate output delay is controllable.

Controllable Input Delay Gates
[Figure: example circuit with controllable gate input delays]
- Assume gate input delays to be controllable.
- Glitches can be suppressed without buffers.

Problem Statement
Find a glitch reduction technique such that:
- All glitches are eliminated in the circuit.
- No delay buffers are inserted in the circuit.
- The circuit operates at the highest possible speed permitted by the device technology.
- The technique is scalable for large circuits.
- Circuits are realizable at the physical level of design.
Note: The objective is to minimize switching power.
Hence, no attempt is made to reduce short-circuit and leakage power, which are an order of magnitude lower in present CMOS technologies; those components of power may be addressed in future research.

New Variable Input Delay Logic
- I/O path delay through a gate = input delay + output delay
- Output delay: propagation delay through a gate from the inputs to the outputs.
- Input delay: extra delay that can be added on a single I/O path through the gate, and that can be controlled independently of the other input delays.
- Variable input delay logic: logic-level design of circuits using components with variable input and output delays along different I/O paths through the gate.

Delay Model for a New Gate
[Figure: gate 3 with inputs 1 and 2; path delays d3,1 + d3 and d3,2 + d3]
- Separate the output (inertial) and input delay variables.
- d3: output delay of the gate.
- d3,1: input delay of the gate along the path from 1 to 3.
- Technology constraint: 0 ≤ d3,1, d3,2 ≤ ub
- The input delay difference has an upper bound, which we define as the Gate Input Differential Delay Upper Bound (ub).

Gate Input Differential Delay Upper Bound (ub)
- It is a measure of the maximum difference in delay between any two I/O paths through the gate that can be designed in a given CMOS technology.
- Arbitrary input delays cannot be realized in practice due to technology limitations at the transistor and layout levels.
- The bound ub is the limit of flexibility allowed by the technology to the designer at the transistor and layout levels.
- The following feasibility condition must be imposed while determining delays for glitch suppression: 0 ≤ di,j ≤ ub

Overall New Approach
- Delays need to be translated to transistor sizes for implementing the circuit.
- Transistor models are inherently non-linear.
- Describing the LP with transistor sizes as variables makes the formulation non-linear.
- Large non-linear models are intractable.
- We propose a three-phase solution to deal with non-linearities in transistors:
  - Phase 1: Detailed study of the gate to determine the ub of the technology.
  - Phase 2: Formulate an LP using variable input delay gates with the ub determined above. This keeps the formulation linear.
  - Phase 3: Design gates for the delay assignments given by the LP above. Non-linearities are local to every gate and easier to handle.

New Linear Programs
We propose two new LPs for designing circuits, chosen according to the specifications of the design.
- Minimum dynamic power (MDP) LP: the circuit consumes the least power possible and operates at the highest possible speed for that power.
- Delay specification (DS) LP: the circuit meets a given delay requirement, but does so by adding fewer buffers than the earlier methods.

New Minimum Dynamic Power LP
Contains the following components:
- Variables
  - Gate inertial delay variables (di)
  - Input delay variables (di,j)
  - Timing window variables
- Constraints
  - Gate delay constraints
  - Gate input delay upper bound constraints
  - Differential delay constraints
  - Maximum delay constraints
- Objective function
Let us consider a simple example combinational circuit.

New MDP LP Example
[Figure: example circuit with primary inputs 1 to 4; gate 5 is fed by inputs 1 and 2, gate 6 by inputs 2 and 3, and gate 7 by gates 5 and 6 and input 4; each path into gate i from input j is labeled di,j + di]
- Gate inertial delay variables d5..d7
- Gate input delay variables di,j for every path through gate i from input j
- Corresponding window variables t5..t7 and T5..T7
New MDP LP Example (cont.)
[Figure: same example circuit]
- Inertial delay constraint for gate 5: d5 ≥ 1
- Input delay (feasibility) constraints for gate 5: 0 ≤ d5,1 ≤ ub; 0 ≤ d5,2 ≤ ub

New MDP LP Example (cont.)
[Figure: same example circuit]
Differential delay constraints for gate 5:
- T5 > T1 + d5,1 + d5; T5 > T2 + d5,2 + d5
- t5 < t1 + d5,1 + d5; t5 < t2 + d5,2 + d5
- d5 > T5 - t5

New MDP LP Example (cont.)
[Figure: same example circuit]
- IO delay constraint for each PO in the circuit: T7 ≤ maxdelay
- maxdelay is the parameter that gives the delay of the critical path. This determines the speed of operation of the circuit.

New MDP LP Example (cont.)
[Figure: same example circuit]
- Objective function: minimize maxdelay
- This gives the fastest possible, minimum dynamic power consuming circuit, given the feasibility condition for the technology.

Solution Curves
[Figure: power versus maxdelay. Previous solutions lie above the minimum dynamic power level by the power consumed by buffers; new MDP LP solutions lie at minimum dynamic power for ub = 0, 5, 10, 15 and ∞; the fastest possible design in any technology is marked on the maxdelay axis.]

Delay Specification LP
Used when the design needs to meet a given delay specification and the designer is willing to sacrifice some dynamic power by inserting buffers.
Modifications to the MDP LP:
- Insert buffer variables at every fanout stem and branch and at PIs (similar to the linear constraint set method by Raja et al.).
- maxdelay is a given parameter, which is the maximum delay of the critical path according to the specification.
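The constraints above can be illustrated by checking a candidate delay assignment rather than solving the LP. The following is a minimal Python sketch, not the dissertation's C++/AMPL flow. The three-gate chain (g1 fed by a and b, g2 by g1 and c, g3 by g2 and d), the value ub = 2, and all names are assumptions made for illustration; the topology is only a guess at the earlier buffer example. Equality between differential and inertial delay is treated as glitch free, as the earlier example slide assumes.

```python
from dataclasses import dataclass, field


@dataclass
class Gate:
    name: str
    inputs: list                                      # driving signals (primary inputs or gates)
    delay: float                                      # inertial (output) delay d_i
    input_delay: dict = field(default_factory=dict)   # d_{i,j} per input j


def check_assignment(gates, primary_inputs, ub):
    """Propagate timing windows (t_i, T_i) through gates given in
    topological order and collect constraint violations."""
    windows = {pi: (0.0, 0.0) for pi in primary_inputs}
    violations = []
    for g in gates:
        early, late = [], []
        for j in g.inputs:
            dij = g.input_delay.get(j, 0.0)
            if not 0.0 <= dij <= ub:                  # feasibility: 0 <= d_{i,j} <= ub
                violations.append((g.name, f"input delay on {j} outside [0, ub]"))
            t_j, T_j = windows[j]
            early.append(t_j + dij)
            late.append(T_j + dij)
        t_i = min(early) + g.delay                    # earliest output transition
        T_i = max(late) + g.delay                     # latest output transition
        windows[g.name] = (t_i, T_i)
        if T_i - t_i > g.delay:                       # glitch constraint d_i >= T_i - t_i
            violations.append((g.name, "differential delay exceeds inertial delay"))
    return windows, violations


PIS = ["a", "b", "c", "d"]

# Conventional gates: unit delays, no input delays.
conventional = [
    Gate("g1", ["a", "b"], 1),
    Gate("g2", ["g1", "c"], 1),
    Gate("g3", ["g2", "d"], 1),
]

# Variable input delay gates: extra delay on the early inputs c and d.
variable = [
    Gate("g1", ["a", "b"], 1),
    Gate("g2", ["g1", "c"], 1, {"c": 1}),
    Gate("g3", ["g2", "d"], 1, {"d": 2}),
]

conv_windows, conv_violations = check_assignment(conventional, PIS, ub=2)
var_windows, var_violations = check_assignment(variable, PIS, ub=2)
```

Under these assumptions only the third conventional gate violates the glitch constraint, while the variable input delay version has no violations and keeps the latest output transition at 3 delay units, so no buffer and no speed loss are needed.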
Delay Specification LP
Components of the LP:
- Gate constraints: unchanged
- Input delay (feasibility) constraints: unchanged for the same ub
- Differential delay constraints: unchanged
- Maxdelay constraints: unchanged, but maxdelay is a given parameter
- Objective function: minimize Σ dj, where j ∈ buffers

Solution Curves
[Figure: power versus maxdelay. Previous solutions, new MDP LP solutions and new DS LP solutions are shown relative to the minimum dynamic power level; the gap is the power consumed by buffers; curves for ub = 0, 5, 10, 15 and ∞; the fastest possible design in any technology is marked on the maxdelay axis.]

Transistor Level Implementation
[Figure: RC model of a gate with input paths d3,1 and d3,2; labels Ron, Cr, Cin, Cp]
Conventional CMOS gate design:
- Delay = Ron (Cr + Cin), where Cr is the routing capacitance and Cin is the input capacitance
- Energy = 0.5 (Cr + Cin) V^2
- Delay can be changed by changing the resistance or the capacitance.
- Resistance does not affect energy per transition.

Transistor Level Implementation
We propose three new implementations of the variable input delay gate:
- Capacitance manipulation method, where the input capacitance offered by the respective transistor pair is varied.
- Pass transistor added design, where an extra transistor is added to increase the resistance and thereby the input delay. We propose the addition of:
  - Single nMOS transistor
  - CMOS pass transistor
We describe the single nMOS transistor added design in detail here. The other two are documented in the thesis.
Single nMOSFET Added Design
[Figure: RC model with series resistance Rs on input path 1; labels Ron, Rs, Cr, Cin]
- d3,1 = Ron (Cr + Cin) + Rs Cin (output delay + input delay)
- d3,2 = Ron (Cr + Cin)
- Energy = 0.5 (Cr + Cin) V^2
- The input delay can be added by an nMOS transistor placed in series with the desired input path.
- The addition of resistance does not increase the energy per transition.

Effect of Input Slope
[Figure: input path with series resistance Rs]
- Too large a ub cannot be realized in practice due to noise issues.
- Increased resistance degrades the slope of a signal, and we use the CMOS gate following it to regenerate the slope.
- The regenerative capability of a gate is limited, and this governs the practical ub value.
- The slope allowed in a design depends on the noise specifications of the circuit.

Single nMOSFET Added Design
Advantages:
- Completely independent control of input delays.
- ub is very high compared to the capacitance manipulation method.
- Much lower overhead than a conventional buffer.
- Can be integrated into full-custom as well as standard cell place and route design flows.
Design issues:
- The nMOSFET degrades the signal when passing logic 1. Hence, it increases the leakage of the transistors in the fanout stages. However, this happens for certain input combinations only.
- Short circuit current is a function of the ratio of input/output slopes. Since the inserted resistance degrades the input slope, it might increase short circuit power by a minor amount.

CMOS Pass Transistor Added Design
[Figure: RC model with a CMOS pass transistor of resistance Rs on input path 1; labels Ron, Rs, Cr, Cin]
- d3,1 = Ron (Cr + Cin) + Rs Cin (output delay + input delay)
- d3,2 = Ron (Cr + Cin)
- Energy = 0.5 (Cr + Cin) V^2
- The input delay can be added by a CMOS pass transistor placed in series with the desired input path.
- This does not degrade the signal, as both transistors together conduct both logic values well.
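The first-order delay and energy expressions above can be evaluated numerically. This is a minimal sketch of those RC formulas only; the component values (10 kΩ, 4 fF, 6 fF, 1.8 V) and the function names are hypothetical and are not taken from the dissertation.

```python
import math


def conventional_delay(r_on, c_r, c_in):
    """Path delay without added resistance: Ron * (Cr + Cin)."""
    return r_on * (c_r + c_in)


def pass_transistor_delay(r_on, c_r, c_in, r_s):
    """Path delay with series resistance Rs: Ron * (Cr + Cin) + Rs * Cin."""
    return r_on * (c_r + c_in) + r_s * c_in


def transition_energy(c_r, c_in, vdd):
    """Energy per transition: 0.5 * (Cr + Cin) * V^2 (independent of Rs)."""
    return 0.5 * (c_r + c_in) * vdd ** 2


def series_resistance_for(input_delay, c_in):
    """Rs needed to add a given input delay in this first-order model."""
    return input_delay / c_in


# Hypothetical values, chosen only to make the arithmetic easy to follow.
R_ON, C_R, C_IN, VDD = 10e3, 4e-15, 6e-15, 1.8

d_32 = conventional_delay(R_ON, C_R, C_IN)          # unmodified input path
r_s = series_resistance_for(60e-12, C_IN)           # add 60 ps on input 1
d_31 = pass_transistor_delay(R_ON, C_R, C_IN, r_s)  # delayed input path
energy = transition_energy(C_R, C_IN, VDD)          # same for both paths
```

In this model the added resistance changes only the delay of the selected path, which is why the slides state that the energy per transition is unchanged; the slope and leakage effects listed under design issues are outside the model.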
Technology Mapping
[Figure: flow chart: delay required -> look-up table for sizes -> transistor sizes -> error acceptable? If yes, done; if no, use the sensitivity of each transistor size to delay, increment that transistor dimension, and check again.]
- Determine the sizes of transistors in a gate for the given delay and given load capacitance.
- The first guess is given by the look-up table.
- The second stage is sensitivity driven.
- This reduces the complexity of the transistor search.

Original Contributions
- Described a new variable input delay logic with independently controllable input delay gates.
- Proposed a new three-phase methodology that keeps the formulation linear.
- Proposed two new LPs using this new logic.
- Non-linear dependencies are summarized in a single parameter, ub.
- First linear formulation that is scalable for large circuits.
- First technique to exploit the entire design space effectively.

Original Contributions - II
- This technique can be used for speeding up the critical paths as well.
- Since the formulation is kept linear, it can be used for speeding up large circuits.
- Described three new implementations of a variable input delay gate.
- First technique to design such gates.
- Developed a method of determining the sizes of the transistors using a two-step approach.
- The complexity of gate sizing is reduced by using the two-step approach.
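The two-step sizing approach (look-up table first guess, then sensitivity-driven refinement) can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the dissertation's implementation: the delay model `toy_delay`, the table `TABLE`, the step size and the tolerance are all hypothetical, and the sketch moves the chosen dimension in whichever direction reduces the error, whereas the flow chart only mentions incrementing.

```python
def toy_delay(sizes, load):
    """Hypothetical delay model: delay falls as transistor widths grow."""
    return load * sum(1.0 / w for w in sizes)


def size_gate(target, load, lookup, delay_fn,
              step=0.1, tol=0.02, min_size=0.2, max_iter=200):
    """Return transistor sizes whose modeled delay is within tol of target."""
    # Step 1: first guess from the look-up table (nearest tabulated delay).
    nearest = min(lookup, key=lambda d: abs(d - target))
    sizes = list(lookup[nearest])

    for _ in range(max_iter):
        base = delay_fn(sizes, load)
        error = base - target
        if abs(error) <= tol * target:
            return sizes

        # Step 2: finite-difference sensitivity of delay to each size.
        sens = []
        for k in range(len(sizes)):
            trial = sizes.copy()
            trial[k] += step
            sens.append((delay_fn(trial, load) - base) / step)

        # Adjust the most sensitive dimension so that the error shrinks.
        k = max(range(len(sizes)), key=lambda i: abs(sens[i]))
        direction = -1.0 if error * sens[k] > 0 else 1.0
        sizes[k] = max(min_size, sizes[k] + direction * step)

    raise RuntimeError("no sizing found within max_iter iterations")


# Hypothetical table: modeled delay (unit load) -> transistor widths.
TABLE = {1.0: [2.0, 2.0], 2.0: [1.0, 1.0], 4.0: [0.5, 0.5]}

sized = size_gate(1.5, 1.0, TABLE, toy_delay)
achieved = toy_delay(sized, 1.0)
```

Because the search starts from a tabulated point and changes one dimension at a time, the non-linearity stays local to a single gate, which matches the motivation given for Phase 3 of the approach.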
Results: Procedure Outline
[Figure: flow: combinational circuit netlist -> C++ program -> constraint set (with ub from technology) -> AMPL -> optimized delays -> power estimator -> results]

Results for Speed of Circuit Using MDP LP
[Figure: normalized maxdelay versus ub, one curve per benchmark circuit]
- Maxdelay is normalized to the length of the critical path when all gates are of unit delay.
- Each curve is a different benchmark circuit.
- As we increase ub, the circuit becomes faster.
- The flexibility required for the fastest operation of a circuit is proportional to the size of the circuit.

Power Savings Using MDP LP (for ub = 10)

Circuit   No. of    maxdelay   Norm.    Unoptimized      Optimized
          vectors              delay    Avg.    Peak     Avg.    Peak
c432      56        71         4.17     1.0     1.0      0.65    0.55
c499      54        34         2.26     1.0     1.0      0.70    0.65
c880      78        45         1.50     1.0     1.0      0.48    0.45
c1355     87        67         2.05     1.0     1.0      0.47    0.36
c1908     144       173        4.32     1.0     1.0      0.54    0.44
c2670     82        35         1.09     1.0     1.0      0.68    0.56
c3540     200       347        7.38     1.0     1.0      0.53    0.43
c5315     157       542        11.06    1.0     1.0      0.53    0.44
c6288     141       124        1.87     1.0     1.0      0.22    0.18
c7552     158       50         1.16     1.0     1.0      0.28    0.26

Power Savings Using DS LP (for ub = 10)

Circuit   Norm.      Conventional gates                Variable input delay gates
          maxdelay   (Raja et al., VLSI Design '03)
                     Avg.    Peak    Buffers           Avg.    Peak    Buffers
c432      1.0        0.72    0.67    95                0.69    0.66    61
          2.0        0.62    0.60    66                0.65    0.55    0
c499      1.0        0.91    0.87    48                0.86    0.84    0
          2.0        0.70    0.66    0                 0.71    0.65    0
c880      1.0        0.68    0.54    62                0.58    0.45    1
          2.0        0.68    0.52    34                0.56    0.45    0
c1355     1.0        0.58    0.48    224               0.48    0.42    64
          2.0        0.57    0.48    192               0.44    0.39    32
c1908     1.0        0.69    0.59    219               0.56    0.46    5
          2.0        0.59    0.44    70                0.55    0.45    4
c2670     1.0        0.79    0.65    157               0.70    0.56    2
          2.0        0.71    0.58    35                0.69    0.57    0
c3540     1.0        0.64    0.44    239               0.57    0.46    3
          2.0        0.58    0.46    140               0.54    0.43    1
c5315     1.0        0.63    0.52    280               0.57    0.48    26
          2.0        0.60    0.45    171               0.55    0.46    4
c6288     1.0        0.40    0.36    294               0.91    0.87    584
          2.0        0.36    0.34    120               0.21    0.16    0
c7552     1.0        0.38    0.34    366               0.28    0.24    1
          2.0        0.36    0.32    111               0.27    0.24    0

Transistor Overhead
[Figure: transistor overhead for each design]
Legend:
- 1, 4: nMOS added design (for maxdelay = 1 and 2)
- 2, 5: CMOS added design (for maxdelay = 1 and 2)
- 3, 6: Buffer added design (for maxdelay = 1 and 2)

Physical Level Verification
Assumptions in the previous logic level analysis:
- Leakage and short circuit power are a small portion of the entire power consumption of the circuit.
- Area overhead is tolerable for the achieved power savings.
- Glitches are a major portion of the power consumption of the chip.
We present the results on an example circuit followed by a large circuit.
Example Circuit
[Figure: example circuit with nodes 1 to 7 in three versions: unoptimized circuit, buffer optimized circuit and nMOS optimized circuit, with gate delays labeled d=1 or d=2]

Example Circuit: Spectre Results
[Figure: Spectre waveforms versus time for the unoptimized circuit, the buffer optimized circuit and the nMOS optimized circuit]

Example Circuit: Energy Consumption
[Figure: energy consumption of the unoptimized circuit, the buffer optimized circuit and the nMOS optimized circuit]

Example Circuit: Leakage Analysis
[Figure: leakage of the nMOS optimized, CMOS optimized and unoptimized circuits for input vectors 000 and 111]
We observed a 0.2% increase in leakage power for the CMOS inserted method and a 0.45% increase for the nMOS inserted method for one input combination, and none for the other.

Physical Level Verification
[Figure: flow chart with steps: AMPL delays; technology mapping; transistor sizes; create cells using Prolific; standard cell library; standard cell place and route; layout; extract routing capacitance; routing acceptable? (no: routing load; yes: optimized layout); analog power simulations; energy consumption]

Physical Level Verification

                    c7552 un-optimized    c7552 optimized (ub = 10)
Gate count          3827                  3828
Transistor count    ≈ 40,000              ≈ 45,000
Critical delay      2.15 ns               2.15 ns
Area                710 x 710 um^2        760 x 760 um^2 (1.14)

Instantaneous Power Savings
[Figure: instantaneous power of the un-optimized and optimized circuits]
Peak power savings = 68%

Average Energy Savings
[Figure: average energy of the un-optimized and optimized circuits]
Average energy savings = 58%

Patents and Dissertations
Patents:
- V. D. Agrawal, “Low Power Circuits Through Hazard Pulse Suppression,” U.S. Patent 5,983,007, November 1999.
- T. Raja, V. D. Agrawal and M. L. Bushnell, “Variable Input Delay CMOS Logic and Its Application to Low Power Design,” to be submitted to USPTO through Rutgers Univ., May 2004.
Dissertations:
- T. Raja, Minimum Dynamic Power Design of CMOS Circuits using a Reduced Constraint Set Linear Program, MS Thesis, Dept. of ECE, Rutgers University, May 2002.
- T. Raja, Minimum Dynamic Power CMOS Design with Variable Input Delay Logic, PhD Thesis, Dept. of ECE, Rutgers University, May 2004.
- S. Uppalapati, Low Power Design of Standard Cell Digital VLSI Circuits, MS Thesis, Dept. of ECE, Rutgers University, October 2004.

Papers
- V. D. Agrawal, “Low-Power Design by Hazard Filtering,” Proc. 10th Int. Conf. VLSI Design, Jan. 1997, pp. 193-197.
- V. D. Agrawal, M. L. Bushnell, G. Parthasarathy, and R. Ramadoss, “Digital Circuit Design for Minimum Transient Energy and a Linear Programming Method,” Proc. 12th Int. Conf. VLSI Design, Jan. 1999, pp. 434-439.
- T. Raja, V. D. Agrawal, and M. L. Bushnell, “Minimum Dynamic Power CMOS Circuit Design by a Reduced Constraint Set Linear Program,” Proc. 16th Int. Conf. VLSI Design, Jan. 2003, pp. 527-532.
- T. Raja, V. D. Agrawal, and M. L. Bushnell, “CMOS Circuit Design for Minimum Dynamic Power and Highest Speed,” Proc. 17th Int. Conf. VLSI Design, Jan. 2004, pp. 1035-1040.

Conclusion
- Main idea: minimum dynamic power, high speed circuits can be designed if gates with variable input delays are used.
- The new design suppresses all glitches without any delay buffers.
- It decreases power without loss in speed and with very little increase in area.
- Developed a linear program solution to demonstrate the idea.
- Developed new gate designs for transistor level implementation.
- Results have been verified by physical layout design of large circuits.
- Results show average power savings of up to 58%.
- The technique is easily scalable for large circuits.