Process-Variation-Resistant Dynamic
Power Optimization for VLSI Circuits
Fei Hu
Department of ECE
Auburn University, AL 36849
Ph.D. Dissertation Committee:
Dr. Vishwani D. Agrawal
Dr. Foster Dai
Dr. Darrel Hankerson
Dr. Saad Biaz (Outside Reader)
November 16, 2005
Outline
Introduction
Background
– Dynamic power dissipation
– Glitch reduction
– Previous LP model
Process-variation-resistant LP model
– Process variation
– Delay model
– LP model based on worst-case timing
– LP model based on statistical timing
Input-specific optimization
– Without process-variation
– With process-variation
Experimental results
Conclusion
Nov. 16th, 2005
Fei Hu, PhD Dissertation
2
Introduction
Power component for CMOS circuits
– $P_{avg} = P_{static} + P_{dynamic}$
– $P_{dynamic} \approx \frac{1}{2} k C_L V_{dd}^2 f_{clk}$
Power dissipation problem
– For constant die size, total capacitance increases by
40% when transistor size is reduced by 70%
– Clock frequency is scaled up faster than the minimum
feature size (MFS)
– Leakage power increases dramatically as MFS
reduces into submicron region
– Architecture trend is towards programmability and
reusability – leads to more hunger for power
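The dynamic power expression can be evaluated directly. The sketch below assumes k is the average switching activity per clock cycle (the slide does not define it), and the operating point is hypothetical; it only illustrates that, at fixed Vdd and clock, power scales linearly with the 40% capacitance increase mentioned above.

```python
def dynamic_power(k, c_load, vdd, f_clk):
    """Dynamic power 1/2 * k * C_L * Vdd^2 * f_clk, in watts."""
    return 0.5 * k * c_load * vdd ** 2 * f_clk


# Hypothetical operating point, not taken from the slides.
base = dynamic_power(k=0.2, c_load=1.0e-9, vdd=1.2, f_clk=1.0e9)
# Same Vdd and clock with 40% more total capacitance.
scaled = dynamic_power(k=0.2, c_load=1.4e-9, vdd=1.2, f_clk=1.0e9)
print(base, scaled / base)
```

Because Vdd enters quadratically, a supply reduction has a larger effect than an equal relative reduction in capacitance or switching activity.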
VLSI Chip Power Density
Source: Intel
[Figure: power density (W/cm2, logarithmic axis from 1 to 10000) versus year (1970 to 2010) for the 4004, 8008, 8080, 8085, 8086, 286, 386, 486, Pentium®, and P6 processors, with reference levels for a hot plate, a nuclear reactor, a rocket nozzle, and the Sun's surface.]
Background
Dynamic power dissipation
– $P_{dyn} = P_{switching} + P_{short-circuit}$
Switching power dissipation
– $P_{switching} = \frac{1}{2} k C_L V_{dd}^2 f_{clk}$
[Figure: CMOS inverter charging the load capacitance CL from Vdd (supply current isupply) and discharging it to Gnd (current ic) as the PMOS and NMOS transistors switch on and off.]
Background
Short-circuit power dissipation
– Short-circuit current flows when both PMOS and NMOS are on
– Strongly affected by the rise and fall times of the input signals
– Significant when the input rise/fall time is much longer than the output rise/fall time
– Can be kept to an insignificant portion of Pdyn
[Figure: CMOS inverter with short-circuit current isupply flowing from Vdd to Gnd, with load CL.]
Background
Glitch reduction
– An important dynamic power reduction technique
Static glitch
Dynamic glitch
– Glitches account for 30 to 70% of Pdyn in typical circuits
– Related techniques
Balanced delay
Hazard filtering
Transistor/Gate sizing
Linear Programming approach
Glitch reduction
Balanced path / path balancing
– Equalize the delays of all paths incident on a gate
– Balancing requires insertion of delay buffers
Hazard/glitch filtering
– Utilize the glitch filtering effect of a gate
– Not necessary to insert buffers
[Figure: an original circuit with unit gate delays, a path-balanced version, and a hazard-filtered version, each annotated with its gate delays.]
Glitch reduction
Transistor/gate sizing
– Find transistor sizes in the circuit to realize the delay
– No need to insert buffers
– Suffers from nonlinearity of the delay model
– Large solution space; numeric convergence and global optimization not guaranteed
Linear programming approach
– Adopts both path balancing and hazard filtering
– Finds the optimal delay assignments of gates
– Uses technology mapping to map the gate delay assignments to transistor/gate dimensions
– Guaranteed optimal solution; a convenient way to solve a large-scale optimization problem
Previous LP approach
[Figure: example circuit with nodes numbered 1 to 29. Gate 7 has fan-ins 5 and 6 and delay d7; each node carries a timing window (t, T).]
Timing window (t, T)
Gate constraints:
$T_7 \ge T_5 + d_7$
$T_7 \ge T_6 + d_7$
$t_7 \le t_5 + d_7$
$t_7 \le t_6 + d_7$
$d_7 > T_7 - t_7$
Circuit delay constraints:
$T_{11} \le maxdelay$
$T_{12} \le maxdelay$
Objective:
Minimize sum of buffer delays
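The gate constraints of the previous LP model can be illustrated by propagating the tightest timing windows through a small netlist and checking the glitch-filtering condition d > T – t. This is a minimal sketch, not the LP itself: the circuit, the node names, and the assumption that primary inputs switch at time 0 are hypothetical.

```python
def timing_windows(fanins, delay, order):
    """Tightest (t, T) window per node under the gate constraints.

    fanins: node -> list of driving nodes (missing or empty for a primary input)
    delay:  node -> gate delay d
    order:  nodes in topological order
    """
    win = {}
    for g in order:
        ins = fanins.get(g, [])
        if not ins:
            win[g] = (0.0, 0.0)  # assumption: primary inputs switch at time 0
            continue
        t = min(win[j][0] for j in ins) + delay[g]  # t_g <= t_j + d_g for every fan-in j
        T = max(win[j][1] for j in ins) + delay[g]  # T_g >= T_j + d_g for every fan-in j
        win[g] = (t, T)
    return win


def unfiltered_gates(win, delay, fanins):
    """Gates that violate the glitch-filtering condition d > T - t."""
    return [g for g in win if fanins.get(g) and not delay[g] > win[g][1] - win[g][0]]


# Hypothetical circuit: g1 = NOT(a), g2 = AND(g1, b).
fanins = {"g1": ["a"], "g2": ["g1", "b"]}
order = ["a", "b", "g1", "g2"]
unit = {"g1": 1.0, "g2": 1.0}   # unit delays: g2 sees a window of width 1
slow = {"g1": 1.0, "g2": 1.5}   # larger inertial delay on g2 (hazard filtering)
print(unfiltered_gates(timing_windows(fanins, unit, order), unit, fanins))
print(unfiltered_gates(timing_windows(fanins, slow, order), slow, fanins))
```

In the LP formulation the delays d are decision variables and the same inequalities appear as constraints; the sketch only evaluates them for fixed delays.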
Process-variation-resistant optimization
Motivation
– Gate delay assumed fixed in previous models
– Variation of gate delay in real circuits
Environmental factors: temperature, Vdd
Physical factors: process variations
– Effect of delay variation
Glitch filtering conditions corrupted
Power dissipation increases from the optimized value
Leakage variation possible, requires separate investigation
– Our proposal
Consider delay variations in dynamic power optimization
Only consider process variations (major source of delay
variation)
Process and delay variations
Process variations
– Variations due to semiconductor process
VT, tox, Leff, Wwire, THwire,etc.
– Inter-die variation
Constant within a die, vary from one die to another die of a
wafer or wafer lot
– Intra-die variation
Variation within a die
Due to equipment limitations or statistical effects in the
fabrication process, e.g., variation in doping concentration
Spatial correlations and deterministic variation due to CMP and
optical proximity effect
Process and delay variations
Delay variation
– First order gate delay model
$T_d \propto \frac{C_L V_{dd}}{I} = \frac{C_L V_{dd}}{\frac{\mu C_{ox} (W/L)}{2} (V_{dd} - V_t)^2}$
– Gate delay sensitive to process variations
Related previous work
– Static timing analysis
Worst case timing analysis
Statistical timing analysis
– Power optimization under process-variations
Voltage scaling, multi-Vdd/Vth considering critical delay variations
Gate sizing using statistical delay model
No work on glitch power optimization
Delay model and implications
Random gate delay model
– $D_{total,i} = D_{nom,i} + D_{inter,i} + D_{intra,i}$
– Truncated normal distribution
– Assume independence
– Variation in terms of $\sigma/D_{nom,i}$ ratio
Effect of inter-die variations
– Depends on its effect on switching activities
– Definition of glitch-filtering probability $P_{glt} = P\{t_2 - t_1 < d\}$
Signal arrival times t1, t2
Gate inertial delay d
– Theorem 1 states the change of $P_{glt}$ due to inter-die variation
$\Delta P_{glt} = \frac{1}{2}\left[\mathrm{erf}\left(\frac{k}{\sqrt{2}}\right) - \mathrm{erf}\left(\frac{k}{\sqrt{2 + 2(r \cdot k)^2}}\right)\right]$
erf(), the error function
k, a path and gate dependent constant
r, $\sigma/D_{nom,i}$ ratio for inter-die variations
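The size of the inter-die effect can be checked numerically. The sketch below assumes Theorem 1 has the form ΔPglt = ½[erf(k/√2) − erf(k/√(2 + 2(r·k)²))], which is a reading of the slide's formula rather than a confirmed statement of the theorem, and scans k over a hypothetical range to find the largest magnitude for a given r.

```python
import math


def delta_pglt(k, r):
    """Change in glitch-filtering probability (assumed form of Theorem 1)."""
    return 0.5 * (math.erf(k / math.sqrt(2.0))
                  - math.erf(k / math.sqrt(2.0 + 2.0 * (r * k) ** 2)))


def max_delta_pglt(r, k_max=10.0, steps=10000):
    """Largest |delta_pglt| over an evenly spaced grid of k in [0, k_max]."""
    return max(abs(delta_pglt(i * k_max / steps, r)) for i in range(steps + 1))


print(max_delta_pglt(0.15))
```

Under this reading, the maximum for r = 0.15 stays below the 5.3×10⁻³ bound quoted for large inter-die variation.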
Delay model and implications
Effect of inter-die variations
– For a large inter-die variation, r = 0.15, $|\Delta P_{glt}| < 5.3 \times 10^{-3}$
– Negligible effect on switching activity
Delay model and implications
Process-variation-resistant design
– Can be achieved by path balancing and glitch filtering
– Critical delay may increase
Theorem 2 states that a solution is guaranteed only if circuit delay is allowed to increase
Proved by example, assuming 10% variation
[Figure: example circuit with gates A, B, and C and unit nominal gate delays, annotated with the values 2.1 and 3.9.]
LP model based on worst-case timing
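Timing model
[Figure: gate i with fan-in gates 1, ..., j, ..., k. The fan-in outputs carry timing windows (ta1, Ta1), ..., (taj, Taj), ..., (tak, Tak). The window at the inputs of gate i is (tbi, Tbi) with width Tbi – tbi; the window at its output is (tai, Tai) with width Tai – tai.]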
LP model based on worst-case timing
Constraints
– Gate constraints
$Tb_i \ge Ta_1$; $tb_i \le ta_1$;
$Tb_i \ge Ta_j$; $tb_i \le ta_j$;
$Tb_i \ge Ta_k$; $tb_i \le ta_k$;
$Ta_i \ge Tb_i + d_i (1 + 3r)$;
$ta_i \le tb_i + d_i (1 - 3r)$;
– Glitch filtering constraints
$Tb_i - tb_i < d_i (1 - 3r) \cdot \eta$, where $r < 0.33$ (33%)
– Delay constraints for POs
$Ta_i \le D_{max}$
Parameters
– r, $\sigma/D_{nom,i}$ ratio
– $D_{max}$, circuit delay parameter
– $\eta$, optimism factor $\in [1, \infty]$; 1 ≡ all glitches filtered, ∞ ≡ no glitch filtered
Objective
– Minimize the number of buffers inserted, expressed as the sum of buffer delays
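The worst-case constraints can be evaluated for fixed delays to see how the timing windows widen with depth. This is a minimal sketch under stated assumptions: the latest arrival uses d(1 + 3r), the earliest uses d(1 − 3r), the optimism factor is called `eta` here, and the three-gate chain is hypothetical.

```python
def worst_case_windows(fanins, delay, order, r):
    """Input (tb, Tb) and output (ta, Ta) windows under worst-case timing."""
    win = {}
    for g in order:
        ins = fanins.get(g, [])
        if not ins:
            # assumption: primary inputs switch at time 0
            win[g] = {"tb": 0.0, "Tb": 0.0, "ta": 0.0, "Ta": 0.0}
            continue
        tb = min(win[j]["ta"] for j in ins)   # earliest arrival at the gate inputs
        Tb = max(win[j]["Ta"] for j in ins)   # latest arrival at the gate inputs
        win[g] = {"tb": tb, "Tb": Tb,
                  "ta": tb + delay[g] * (1 - 3 * r),
                  "Ta": Tb + delay[g] * (1 + 3 * r)}
    return win


def glitch_filtered(w, d, r, eta=1.0):
    """Glitch-filtering constraint Tb - tb < d (1 - 3r) * eta."""
    return w["Tb"] - w["tb"] < d * (1 - 3 * r) * eta


# Hypothetical chain a -> g1 -> g2 -> g3 with unit nominal delays, r = 10%.
chain = {"g1": ["a"], "g2": ["g1"], "g3": ["g2"]}
order = ["a", "g1", "g2", "g3"]
delay = {"g1": 1.0, "g2": 1.0, "g3": 1.0}
win = worst_case_windows(chain, delay, order, r=0.1)
for g in ("g1", "g2", "g3"):
    print(g, win[g], glitch_filtered(win[g], delay[g], 0.1))
```

The input window grows by 6r·d per level, so deeper gates eventually fail the condition unless delays are increased or the optimism factor is raised, which illustrates why worst-case timing tends to be pessimistic.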
LP model based on statistical timing
Worst-case timing tends to be too pessimistic
Statistical timing model with random variables
[Figure: gate i with delay di and fan-in gates 1, ..., j, ..., k; the timing variables at the fan-in outputs (ta, Ta), at the inputs of gate i (tbi, Tbi), and at its output (tai, Tai) are treated as random variables.]
LP model based on statistical timing
Minimum-maximum statistics
– Needed for tbi, Tbi
$tb_i = \mathrm{Min}(ta_1, ta_j, ta_k)$; $Tb_i = \mathrm{Max}(Ta_1, Ta_j, Ta_k)$;
– Previous works
Min and Max of two normal random variables are not necessarily normally distributed
Can be approximated with a normal distribution
Require complex operations, e.g., integration, exponentiation, etc.
– Challenges for LP approach
Requires a simple approximation without nonlinear operations
Our approximation for C = Max(A, B), where A, B, and C are Gaussian RVs
$\mu_C = \mathrm{Max}(\mu_A, \mu_B)$
$\mu_C + 3\sigma_C = \mathrm{Max}(\mu_A + 3\sigma_A, \mu_B + 3\sigma_B)$
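The Max approximation uses only comparisons and a subtraction, which is what makes it expressible as linear constraints. A minimal sketch with hypothetical input values:

```python
def approx_max(mu_a, sigma_a, mu_b, sigma_b):
    """Gaussian approximation of C = Max(A, B) from the means and 3-sigma points.

    mu_C = Max(mu_A, mu_B)
    mu_C + 3 sigma_C = Max(mu_A + 3 sigma_A, mu_B + 3 sigma_B)
    """
    mu_c = max(mu_a, mu_b)
    upper = max(mu_a + 3 * sigma_a, mu_b + 3 * sigma_b)
    return mu_c, (upper - mu_c) / 3


# Hypothetical arrival-time statistics.
print(approx_max(10.0, 1.0, 8.0, 2.0))   # B has the later 3-sigma point
print(approx_max(10.0, 1.0, 2.0, 0.5))   # A dominates both the mean and the 3-sigma point
```

In an LP each Max is written as a pair of inequalities (the result is at least each argument), with the objective pushing the result down onto the larger one.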
LP model based on statistical timing
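Min-Max statistics approximation error
– Negligible when $|\mu_A - \mu_B| > 3(\sigma_A + \sigma_B)$
– Largest when $\mu_A = \mu_B$
$\mu_C = \mathrm{Max}(\mu_A, \mu_B)$
$\sigma_C = \frac{1}{3}\left[\mathrm{Max}(\mu_A + 3\sigma_A, \mu_B + 3\sigma_B) - \mu_C\right]$
[Figure: cumulative probability P (0 to 1) versus x, showing CDF_A, CDF_B, the actual CDF for Max(A,B), and the approximated CDF for Max(A,B), with $\mu_A$ and $\mu_B$ marked on the x axis.]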
LP model based on statistical timing
Variables
– Timing and delay variables with mean $\mu$ and standard deviation $\sigma$
– Auxiliary variables $TTb_i$, $ttb_i$, $W_i = Tb_i - tb_i$, $\mu_{W_i}$, $\sigma_{W_i}$
Constraints
– Gate constraints
Timing window at the inputs for a two-input gate i with fan-ins 1 and 2
$\mu_{Tb_i} \ge \mu_{Ta_1}$; $TTb_i \ge \mu_{Ta_1} + 3\sigma_{Ta_1}$;
$\mu_{tb_i} \le \mu_{ta_1}$; $ttb_i \le \mu_{ta_1} - 3\sigma_{ta_1}$;
$\mu_{Tb_i} \ge \mu_{Ta_2}$; $TTb_i \ge \mu_{Ta_2} + 3\sigma_{Ta_2}$;
$\mu_{tb_i} \le \mu_{ta_2}$; $ttb_i \le \mu_{ta_2} - 3\sigma_{ta_2}$;
$\sigma_{Tb_i} = (TTb_i - \mu_{Tb_i}) / 3$;
$\sigma_{tb_i} = (\mu_{tb_i} - ttb_i) / 3$;
Timing window at outputs
$\mu_{Ta_i} = \mu_{Tb_i} + \mu_{d_i}$; $\sigma_{Ta_i} = k(\sigma_{Tb_i} + r \cdot \mu_{d_i})$;
$\mu_{ta_i} = \mu_{tb_i} + \mu_{d_i}$; $\sigma_{ta_i} = k(\sigma_{tb_i} + r \cdot \mu_{d_i})$;
LP model based on statistical timing
Constraints
– Gate constraint
Linear approximation
$\sigma_{Ta_i} = \sqrt{\sigma_{Tb_i}^2 + (r \cdot \mu_{d_i})^2} \;\rightarrow\; \sigma_{Ta_i} = k(\sigma_{Tb_i} + r \cdot \mu_{d_i})$
$k \in [0.707, 1]$; choose k = 0.85, since $\frac{A + B}{\sqrt{2}} \le \sqrt{A^2 + B^2} \le A + B$
– Glitch filtering constraints
$\mu_{W_i} = \mu_{Tb_i} - \mu_{tb_i}$;
$\sigma_{W_i} = k(\sigma_{Tb_i} + \sigma_{tb_i})$;
$\mu_{d_i} - \mu_{W_i} > 3 \cdot k(\sigma_{W_i} + r \cdot \mu_{d_i})$;
[Figure: probability density P of $d_i - W_i$, with its mean placed $3\sigma$ above zero.]
– Circuit delay constraint
$\mu_{Ta_i} \cdot (1 + 3r) \le D_{max}$
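The cost of replacing √(A² + B²) by k(A + B) can be estimated numerically. The sketch below scans the ratio A : B on a hypothetical grid and reports the worst relative error for a given k; it is an illustration of the bound, not part of the LP model.

```python
import math


def linearization_error(k=0.85, steps=1000):
    """Worst relative error of k*(A+B) against sqrt(A^2+B^2) for A+B = 1, A, B >= 0."""
    worst = 0.0
    for i in range(steps + 1):
        a = i / steps
        b = 1.0 - a
        exact = math.hypot(a, b)          # sqrt(a^2 + b^2), between 0.707 and 1
        worst = max(worst, abs(k * (a + b) - exact) / exact)
    return worst


print(linearization_error(0.85))
```

With k = 0.85 the linear form overestimates the standard deviation most when A = B and underestimates it when one term is zero, so the choice sits between the two extremes of the interval [0.707, 1].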
LP model based on statistical timing
Parameters
– r, $\sigma/D_{nom,i}$ ratio
– $D_{max}$, circuit delay parameter
– $\eta$, optimism factor
$\mu_{d_i} - \mu_{W_i} > 3 \cdot k(\sigma_{W_i} + r \cdot \mu_{d_i}) \cdot \eta$;
$\eta = 1$, no relaxation
$\eta < 1$, optimistic about the actual glitch width
$\eta = 0$, reduces to the previous model
Objective
– Minimize the number of buffers inserted, expressed as the sum of buffer delays
Input-specific optimization
Motivation
– Previous LP models guarantee glitch filtering for any input vector sequence
Ti - ti < di for all gates
– Redundancy in optimization
Insertion of more buffers
Increased overhead in power and area
– In reality, circuits operate in embedded environments
Optimize for the input vector sequences that can actually occur at the circuit inputs, e.g., functional vectors
Same reduction in power dissipation with less overhead
Input-specific optimization
Glitch generation pattern
– Input vector pair that can potentially generate a glitch
– AND gate example:
[Figure: AND gate with input vector pairs, such as 01 followed by 10, that can produce a glitch at the output.]
Glitch generation probability Pg[i]
– Probability that a glitch-generation pattern occurs at the inputs of gate i
– Steady-state signal values match the pattern
Input-specific optimization
Application to previous model without process variation
– Static optimization
Only static glitches/hazards considered
– Relaxation of constraints
Relax glitch filtering constraints where glitches are unlikely to happen
$T_i - t_i < d_i \Rightarrow (T_i - t_i) \cdot \alpha_i < d_i$
Selective relaxation
$\alpha_i = 0$ if $P_g[i] = 0$; $\alpha_i = 1$ if $P_g[i] > 0$
Generalized relaxation
$\alpha_i = 1 - e^{-\beta P_g[i]}$
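The two relaxation schemes can be sketched as functions of the glitch generation probability Pg[i]. The names `alpha` and `beta` and the exponent form 1 − exp(−beta·Pg) are assumptions made for this illustration: the slide's own symbols did not survive extraction, and the exact placement of the tuning parameter in the exponent is not confirmed by the source.

```python
import math


def selective_factor(pg):
    """Selective relaxation: drop the constraint only when no glitch pattern can occur."""
    return 0.0 if pg == 0 else 1.0


def generalized_factor(pg, beta):
    """Generalized relaxation, assumed form 1 - exp(-beta * pg); beta is hypothetical."""
    return 1.0 - math.exp(-beta * pg)


def constraint_holds(T, t, d, alpha):
    """Relaxed glitch-filtering constraint (T - t) * alpha < d."""
    return (T - t) * alpha < d


# Hypothetical gate: window width 4, inertial delay 2.
print(constraint_holds(5.0, 1.0, 2.0, selective_factor(0.0)))   # constraint relaxed away
print(constraint_holds(5.0, 1.0, 2.0, selective_factor(0.3)))   # full constraint applies
print(constraint_holds(5.0, 1.0, 2.0, generalized_factor(0.3, beta=1.0)))
```

A factor of 0 removes the constraint for that gate, a factor of 1 keeps it unchanged, and intermediate values tolerate a window wider than the gate delay in proportion to how unlikely a glitch is.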
Input-specific optimization
Application to process-variation-resistant LP model based on statistical timing
– Static optimization
– Relaxation of constraints
$\mu_{d_i} > [\mu_{W_i} + 3 \cdot k(\sigma_{W_i} + r \cdot \mu_{d_i}) \cdot \eta] \cdot \alpha_i$;
Selective relaxation
Generalized relaxation
– Tuning factor
Original objective
Minimize $\sum_j d_j$; ($j \in$ buffers)
Current objective
Minimize $\sum_j d_j + TF \cdot \left(\frac{1}{N} \sum_i \mu_{d_i}\right)$; ($j \in$ buffers, $i \in$ other gates)
Input-specific optimization
Why a tuning factor is needed
– A dominating path affects the critical delay distribution
[Figure: a dominating path with delay 41 from the PIs to the PO, in parallel with other logic whose delay can lie anywhere in [1, 41].]
Experimental results
Experimental procedure
– Flow chart
[Figure: flow chart. Circuit → data extraction → constraint set data; together with Dmax, r, and the optimism factor, these feed the AMPL LP models, which produce gate delays → circuit generation → optimized circuit → logic simulations → results.]
– Power estimation
Event-driven logic simulation
Fanout-weighted sum of switching activities
Variations of CL and Vdd ignored
Monte Carlo simulation with 1,000 samples of delays under process variation
– Results analysis
Un-Opt., unit-delay circuit
Opt, previous optimization
Opt1, process-variation-resistant optimization, worst-case timing
Opt2, process-variation-resistant optimization, statistical timing
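The delay sampling behind the Monte Carlo step can be sketched as follows. It follows the random gate delay model D_total = D_nom + D_inter + D_intra with truncated normal components; truncation at ±3σ, one shared inter-die deviate per die, and the gate names are assumptions for illustration, and the logic simulation that turns each delay sample into a power figure is not shown.

```python
import random


def truncated_gauss(rng, sigma, limit=3.0):
    """Zero-mean normal sample, redrawn until it lies within limit * sigma."""
    while True:
        x = rng.gauss(0.0, sigma)
        if abs(x) <= limit * sigma:
            return x


def sample_delays(nominal, r_inter, r_intra, rng):
    """One die: D_total = D_nom + D_inter + D_intra for every gate."""
    z = truncated_gauss(rng, 1.0)  # assumption: one inter-die deviate shared by all gates
    out = {}
    for gate, d_nom in nominal.items():
        d_inter = z * r_inter * d_nom                   # same relative shift on every gate
        d_intra = truncated_gauss(rng, r_intra * d_nom) # independent per gate
        out[gate] = d_nom + d_inter + d_intra
    return out


rng = random.Random(0)                 # fixed seed for repeatability
nominal = {"g1": 1.0, "g2": 2.0}       # hypothetical nominal gate delays
samples = [sample_delays(nominal, 0.05, 0.05, rng) for _ in range(1000)]
mean_g1 = sum(s["g1"] for s in samples) / len(samples)
print(mean_g1)
```

Each of the 1,000 delay samples would then be simulated with the same input vectors, and the mean and maximum deviation of the resulting power values summarized per circuit.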
Experimental results – small variation
Power dissipation under no process variation

Circuit  UnOpt  | Opt (w/o proc var.)     | Opt1 (worst case proc) | Opt2 (statistical proc)
         Pwr.   | Pwr.   Buf.   maxdelay  | Pwr.   Buf.   Dmax     | Pwr.   Buf.   Dmax
c432     1.0    | 0.74   95     17        | 0.74   96     20       | 0.74   99     20
c432     1.0    | 0.74   66     34        | 0.74   91     40       | 0.74   91     40
c499     1.0    | 0.94   80     11        | 0.94   88     13       | 0.94   97     13
c499     1.0    | 0.94   48     22        | 0.94   88     26       | 0.94   129    26
c880     1.0    | 0.54   63     24        | 0.54   45     28       | 0.54   76     28
c880     1.0    | 0.54   29     72        | 0.54   37     83       | 0.54   37     83
c1355    1.0    | 0.93   224    24        | 0.93   296    28       | 0.93   305    28
c1355    1.0    | 0.93   160    72        | 0.93   296    83       | 0.93   273    83
c1908    1.0    | 0.53   84     40        | 0.53   68     46       | 0.52   136    46
c1908    1.0    | 0.55   54     120       | 0.53   92     138      | 0.52   198    138
c2670    1.0    | 0.74   157    32        | 0.79   244    37       | 0.73   313    37
c2670    1.0    | 0.74   26     96        | 0.75   80     111      | 0.73   168    111
c3540    1.0    | 0.60   219    47        | 0.59   228    55       | 0.59   306    55
c3540    1.0    | 0.59   103    141       | 0.61   152    163      | 0.59   303    163
c5315    1.0    | 0.56   281    49        | 0.62   228    57       | 0.55   401    57
c5315    1.0    | 0.56   113    147       | 0.58   130    170      | 0.55   460    170
c6288    1.0    | 0.13   881    124       | 0.15   801    143      | 0.14   1685   143
c6288    1.0    | 0.13   864    372       | 0.14   922    428      | 0.13   1213   428
c7552    1.0    | 0.52   369    43        | 0.64   180    50       | 0.52   464    50
c7552    1.0    | 0.52   62     129       | 0.56   162    149      | 0.52   879    149
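The Dmax values used for Opt1 and Opt2 in this table appear to follow from the maxdelay values of Opt through the circuit delay constraint, as Dmax = ⌈maxdelay · (1 + 3r)⌉ with r = 5%. This relationship is an observation from the tabulated numbers, not a statement made on the slide; the sketch below checks it with integer arithmetic.

```python
def dmax_from_maxdelay(maxdelay, r_percent):
    """Ceiling of maxdelay * (1 + 3r), with r given in percent (integer arithmetic)."""
    scaled = maxdelay * (100 + 3 * r_percent)
    return -(-scaled // 100)  # integer ceiling division


# Values copied from the table: maxdelay (Opt) and Dmax (Opt1, Opt2).
maxdelay = [17, 34, 11, 22, 24, 72, 24, 72, 40, 120,
            32, 96, 47, 141, 49, 147, 124, 372, 43, 129]
dmax = [20, 40, 13, 26, 28, 83, 28, 83, 46, 138,
        37, 111, 55, 163, 57, 170, 143, 428, 50, 149]

computed = [dmax_from_maxdelay(m, 5) for m in maxdelay]
print(computed == dmax)
```

The extra delay budget is the price of variation resistance: each optimized circuit is allowed a critical delay about 15% longer than the circuit optimized without process variation.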
Experimental results – small variation
Power distribution under 5% inter-die, 5% intra-die variation

Circuit  Maxdelay | Un-Opt           | Opt (w/o proc var.) | Opt1 (worst case proc) | Opt2 (statistical proc)
                  | Mean   Max. Dev. | Mean   Max. Dev.    | Mean   Max. Dev.       | Mean   Max. Dev.
                  | Pwr.   (%)       | Pwr.   (%)          | Pwr.   (%)             | Pwr.   (%)
c432     17       | 1.08   17.5      | 0.78   12.8         | 0.75   7.0             | 0.75   4.5
c432     34       | 1.08   17.5      | 0.76   8.2          | 0.74   0.1             | 0.74   0.1
c499     11       | 1.06   12.9      | 1.00   12.6         | 0.95   0.7             | 0.95   0.7
c499     22       | 1.06   12.9      | 0.99   12.6         | 0.94   0.0             | 0.94   0.1
c880     24       | 1.03   7.1       | 0.62   23.1         | 0.58   13.9            | 0.55   7.5
c880     72       | 1.03   7.1       | 0.57   12.8         | 0.55   1.1             | 0.54   1.0
c1355    24       | 1.10   18.1      | 0.99   10.6         | 0.96   5.5             | 0.95   4.2
c1355    72       | 1.10   18.1      | 0.98   8.8          | 0.93   0.3             | 0.93   0.1
c1908    40       | 1.15   21.0      | 0.64   28.6         | 0.62   22.8            | 0.58   21.6
c1908    120      | 1.15   21.0      | 0.64   21.5         | 0.54   5.9             | 0.54   6.5
c2670    32       | 1.17   21.8      | 0.80   11.6         | 0.81   5.5             | 0.75   4.8
c2670    96       | 1.17   21.8      | 0.77   6.1          | 0.78   5.2             | 0.74   1.8
c3540    47       | 1.15   18.9      | 0.66   15.2         | 0.65   12.9            | 0.63   9.7
c3540    141      | 1.15   18.9      | 0.62   7.2          | 0.63   5.1             | 0.59   1.3
c5315    49       | 1.12   14.9      | 0.62   13.8         | 0.67   9.9             | 0.59   9.1
c5315    147      | 1.12   14.9      | 0.60   10.3         | 0.61   6.8             | 0.56   3.7
c6288    124      | 1.46   49.9      | 0.27   131.6        | 0.28   105.9           | 0.24   93.6
c6288    372      | 1.46   49.9      | 0.26   128.3        | 0.23   76.8            | 0.18   56.0
c7552    43       | 1.17   19.6      | 0.57   12.4         | 0.72   13.3            | 0.57   11.8
c7552    129      | 1.17   19.6      | 0.56   9.3          | 0.58   5.1             | 0.53   3.5
Experimental results – small variation
Power timing analysis
– Example c432
maxdelay=17
maxdelay=26
– Complete suppression of power variation
Experimental results – small variation
Critical delay distribution
Nominal delay
Max. Deviation
– Similar nominal delay
– Reduced variation by Opt2 for c880, c2670, c5315, c7552
Experimental results – large variation
Power dissipation under no process variation

Circuit  Un-opt. | Opt (w/o proc var.)     | Opt1 (worst case proc) | Opt2 (statistical proc)
         Pwr.    | Pwr.   Buf.   maxdelay  | Pwr.   Buf.   Dmax     | Pwr.   Buf.   Dmax
c432     1.00    | 0.74   66     34        | 0.75   87     50       | 0.74   88     50
c432     1.00    | 0.74   58     68        | 0.74   81     99       | 0.74   106    99
c499     1.00    | 0.94   48     22        | 0.97   88     32       | 0.94   88     32
c499     1.00    | 0.94   0      33        | 0.97   0      48       | 0.94   129    48
c880     1.00    | 0.54   35     48        | 0.58   36     70       | 0.54   57     70
c880     1.00    | 0.54   30     120       | 0.59   29     174      | 0.54   62     174
c1355    1.00    | 0.93   192    48        | 0.95   264    70       | 0.93   305    70
c1355    1.00    | 0.93   128    120       | 0.96   264    174      | 0.93   305    174
c1908    1.00    | 0.53   62     80        | 0.55   41     116      | 0.52   135    116
c1908    1.00    | 0.54   34     200       | 0.56   12     290      | 0.52   190    290
c2670    1.00    | 0.74   34     64        | 0.80   39     93       | 0.74   249    93
c2670    1.00    | 0.74   9      160       | 0.78   95     232      | 0.73   211    232
c3540    1.00    | 0.59   139    94        | 0.62   149    137      | 0.59   281    137
c3540    1.00    | 0.59   78     235       | 0.65   52     341      | 0.59   311    341
c5315    1.00    | 0.56   167    98        | 0.66   93     143      | 0.55   399    143
c5315    1.00    | 0.56   53     245       | 0.60   144    356      | 0.55   418    356
c6288    1.00    | 0.13   870    228       | 0.14   1303   331      | 0.13   1121   331
c6288    1.00    | 0.13   857    620       | 0.13   939    899      | 0.13   1473   899
c7552    1.00    | 0.52   91     86        | 0.69   64     125      | 0.52   481    125
c7552    1.00    | 0.52   44     215       | 0.60   622    312      | 0.52   645    312
Experimental results – large variation
Power distribution under 15% intra-die and 5% inter-die variation

Circuit  Maxdelay | Un-opt           | Opt (w/o proc var.) | Opt1 (worst case proc) | Opt2 (statistical proc)
                  | Mean   Max. Dev. | Mean   Max. Dev.    | Mean   Max. Dev.       | Mean   Max. Dev.
                  | Pwr.   (%)       | Pwr.   (%)          | Pwr.   (%)             | Pwr.   (%)
c432     34       | 1.09   19.8      | 0.78   12.6         | 0.78   12.1            | 0.76   11.1
c432     68       | 1.09   19.8      | 0.77   10.3         | 0.75   6.1             | 0.74   3.7
c499     22       | 1.07   14.0      | 1.02   15.3         | 0.98   1.7             | 0.95   2.0
c499     33       | 1.07   14.0      | 0.99   10.2         | 0.97   1.4             | 0.95   1.0
c880     48       | 1.04   8.0       | 0.62   26.5         | 0.63   15.7            | 0.59   18.2
c880     120      | 1.04   8.0       | 0.60   22.7         | 0.60   5.6             | 0.55   8.6
c1355    48       | 1.13   21.8      | 1.06   19.7         | 0.98   7.3             | 0.98   10.2
c1355    120      | 1.13   21.8      | 1.05   18.8         | 0.97   1.7             | 0.94   3.0
c1908    80       | 1.16   23.1      | 0.72   49.6         | 0.66   30.1            | 0.64   35.8
c1908    200      | 1.16   23.1      | 0.66   32.3         | 0.62   18.8            | 0.58   21.4
c2670    64       | 1.19   25.4      | 0.81   13.6         | 0.90   16.0            | 0.80   13.6
c2670    160      | 1.19   25.4      | 0.80   11.2         | 0.82   8.6             | 0.76   6.2
c3540    94       | 1.16   20.7      | 0.67   19.5         | 0.69   16.9            | 0.66   17.8
c3540    235      | 1.16   20.7      | 0.66   16.1         | 0.71   11.7            | 0.62   10.1
c5315    98       | 1.13   16.5      | 0.67   24.6         | 0.74   16.3            | 0.63   20.8
c5315    245      | 1.13   16.5      | 0.64   19.0         | 0.66   13.9            | 0.60   13.4
c6288    228      | 1.45   52.2      | 0.43   274.3        | 0.36   193.4           | 0.38   223.8
c6288    620      | 1.45   52.2      | 0.41   264.0        | 0.31   161.5           | 0.26   125.3
c7552    86       | 1.17   21.9      | 0.64   25.8         | 0.78   16.0            | 0.59   18.7
c7552    215      | 1.17   21.9      | 0.60   20.2         | 0.65   11.2            | 0.56   11.8
Experimental results – large variation
Critical delay distribution
Nominal delay
Max. Deviation (%)
– Similar nominal delay
– Reduced delay variation by Opt2
Experimental results – input-specific optimization
Application to “Opt” under no process variation, IS-Opt

Circuit  maxdelay | Un-Opt | Opt (w/o proc var.)     | IS-Opt (input-specific w/o proc)
                  | Pwr.   | Pwr.   Delay   Buffers  | Pwr.   Delay   Buffers
c432     34       | 1.0    | 0.74   34      66       | 0.74   35      66
c432     68       | 1.0    | 0.74   68      58       | 0.74   69      41
c499     22       | 1.0    | 0.94   22      48       | 0.94   22      33
c499     33       | 1.0    | 0.94   33      0        | 0.95   33      0
c880     48       | 1.0    | 0.54   51      35       | 0.54   49      32
c880     120      | 1.0    | 0.54   121     30       | 0.54   122     24
c1355    48       | 1.0    | 0.93   48      192      | 0.93   48      113
c1355    120      | 1.0    | 0.93   121     128      | 0.93   120     25
c1908    80       | 1.0    | 0.53   82      62       | 0.54   86      52
c1908    200      | 1.0    | 0.54   203     34       | 0.53   204     3
c2670    64       | 1.0    | 0.74   65      34       | 0.74   66      30
c2670    160      | 1.0    | 0.74   163     9        | 0.74   162     1
c3540    94       | 1.0    | 0.59   95      139      | 0.59   101     122
c3540    235      | 1.0    | 0.59   239     78       | 0.59   239     73
c5315    98       | 1.0    | 0.56   100     167      | 0.56   104     170
c5315    245      | 1.0    | 0.56   249     53       | 0.56   250     52
c6288    228      | 1.0    | 0.13   226     870      | 0.13   228     870
c6288    620      | 1.0    | 0.13   620     857      | 0.13   620     853
c7552    86       | 1.0    | 0.52   89      91       | 0.52   88      84
c7552    215      | 1.0    | 0.52   220     44       | 0.52   221     38
Experimental results – input-specific optimization
Application to “Opt2” under process variation, IS-Opt2, under 15% intra-die and 5% inter-die variation

Cir.     Dmax | Un-opt. | Opt2 (statistical proc)          | IS-Opt2 (input-specific statistical proc)
              | Nom.    | Nom.   Mean   Max Dev.  No.      | Nom.   Mean   Max Dev.  No.
              | Pwr.    | Pwr.   Pwr.   (%)       Buf.     | Pwr.   Pwr.   (%)       Buf.
c432     50   | 1.0     | 0.74   0.76   11.1      88       | 0.74   0.76   9.3       81
c432     99   | 1.0     | 0.74   0.74   3.7       106      | 0.74   0.74   3.3       76
c499     32   | 1.0     | 0.94   0.95   2.0       88       | 0.94   0.95   1.9       88
c499     48   | 1.0     | 0.94   0.95   1.0       129      | 0.94   0.95   1.8       58
c880     70   | 1.0     | 0.54   0.59   18.2      57       | 0.54   0.59   20.4      38
c880     174  | 1.0     | 0.54   0.55   8.6       62       | 0.54   0.56   9.0       38
c1355    70   | 1.0     | 0.93   0.98   10.2      305      | 0.93   1.01   13.1      253
c1355    174  | 1.0     | 0.93   0.94   3.0       305      | 0.93   0.95   4.7       160
c1908    116  | 1.0     | 0.52   0.64   35.8      135      | 0.52   0.64   34.7      107
c1908    290  | 1.0     | 0.52   0.58   21.4      190      | 0.52   0.57   18.4      104
c2670    93   | 1.0     | 0.74   0.80   13.6      249      | 0.73   0.79   11.3      186
c2670    232  | 1.0     | 0.73   0.76   6.2       211      | 0.73   0.75   4.3       79
c3540    137  | 1.0     | 0.59   0.66   17.8      281      | 0.59   0.65   15.6      247
c3540    341  | 1.0     | 0.59   0.62   10.1      311      | 0.59   0.61   7.4       188
c5315    143  | 1.0     | 0.55   0.63   20.8      399      | 0.55   0.63   21.0      389
c5315    356  | 1.0     | 0.55   0.60   13.4      418      | 0.55   0.60   13.2      413
c6288    331  | 1.0     | 0.13   0.38   223.8     1121     | 0.13   0.38   225.2     1115
c6288    899  | 1.0     | 0.13   0.26   125.3     1473     | 0.13   0.26   125.5     1243
c7552    125  | 1.0     | 0.52   0.59   18.7      481      | 0.52   0.58   18.1      389
c7552    312  | 1.0     | 0.52   0.56   11.8      645      | 0.52   0.55   10.9      520
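The buffer savings of IS-Opt2 relative to Opt2 can be computed directly from the two "No. Buf." columns. The sketch below copies those columns (rows in table order, two Dmax settings per circuit) and reports the largest relative reduction.

```python
circuits = ["c432", "c499", "c880", "c1355", "c1908",
            "c2670", "c3540", "c5315", "c6288", "c7552"]

# Buffer counts copied from the table, two rows (Dmax settings) per circuit.
opt2 = [88, 106, 88, 129, 57, 62, 305, 305, 135, 190,
        249, 211, 281, 311, 399, 418, 1121, 1473, 481, 645]
is_opt2 = [81, 76, 88, 58, 38, 38, 253, 160, 107, 104,
           186, 79, 247, 188, 389, 413, 1115, 1243, 389, 520]

# Relative buffer reduction of IS-Opt2 over Opt2, in percent, per row.
reductions = [100.0 * (a - b) / a for a, b in zip(opt2, is_opt2)]
best = max(range(len(reductions)), key=reductions.__getitem__)
print(circuits[best // 2], round(reductions[best]))
```

The largest saving occurs for c2670 at the larger Dmax setting, where the buffer count drops from 211 to 79.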
Experimental results – input-specific optimization
Trade-off by generalized relaxation
– c432 circuit with varying generalized-relaxation parameter
– Reduction of #buffers with degradation of power distribution
Experimental results – input-specific optimization
Critical delay
Nominal delay
Max. deviation
– Similar performance for “Opt2” and “IS-Opt2”
Conclusions
Proposed a dynamic power optimization technique that is resistant to process variation
Considered process variation in terms of delay variations
– Inter-die and intra-die variations
– Proved that inter-die variation has a negligible effect on switching activity and power
Constructed two new LP models
– Worst-case timing analysis
– Statistical timing analysis
Input-specific optimization to reduce the number of buffers
– Circuit optimized for a certain input vector sequence
Experimental results
– Complete suppression of power variation for small circuits and small variations
– Significant reduction of power and delay variations for larger circuits and variations
53% reduction in power deviation and 40% reduction in delay deviation under 15% intra-die and 5% inter-die variation
– Input-specific optimization significantly reduces the buffer overhead with equivalent power and delay performance
IS-Opt2 vs. Opt2: up to 63% reduction in buffers
Questions
For more questions, contact me at hufei01@auburn.edu