Power and Performance Optimization of Static CMOS Circuits with Process Variation

Yuanlin Lu
Department of ECE, Auburn University, Auburn, AL 36849
Ph.D. Dissertation Committee:
Dr. Vishwani D. Agrawal
Dr. Fa Foster Dai
Dr. Charles Stroud
Dr. Douglas Leonard (Outside Reader)
May 25, 2007
Outline
Motivation
Problem Statement
Background
Proposed Techniques



MILP1 for Leakage and Glitch Minimization
MILP2 for Statistical Leakage Optimization under Process Variation
MILP3 for Statistical Glitch Power Reduction under Process Variation
Results
Conclusion
Suggestions for Future Work
7/16/2016
Ph.D. Final Oral Examination
2
Motivation
Leakage power has become a dominant contributor to total power consumption
  At 65nm, leakage is ~50% of total power consumption
Glitches consume 20%-70% of dynamic power
Variation of process parameters increases with technology scaling
  Both the average and the standard deviation of leakage power increase
  Some glitch elimination techniques (e.g., path balancing) are not effective
  Both power yield and timing yield are degraded
Problem Statement
Design a CMOS Circuit with Dual-Threshold Devices and Delay Elements to:
  Globally minimize subthreshold leakage
  Eliminate all glitches
  Maintain specified performance
Statistically Design a CMOS Circuit with Dual-Threshold Devices to:
  Reduce the effect of process variation on subthreshold leakage
  Achieve a specified timing yield
Statistically Design a CMOS Circuit by Dual-Threshold Assignment, Path Balancing and Gate Sizing to:
  Minimize leakage and dynamic power (capacitance reduction and glitch elimination)
  Reduce the effect of process variation on leakage and dynamic power
  Achieve a specified timing yield
Allow Performance-Power Tradeoff
Outline
Motivation
Problem Statement
Background
Proposed Techniques
Results
Conclusion
Future Work
Power Consumption in CMOS Circuit
Dynamic Switching Power + Short Circuit Power + Subthreshold Leakage Power + Gate Leakage Power
Leakage and Delay
Increasing Vth can exponentially decrease Isub:
$I_{sub} = \mu_0 C_{ox} \frac{W_{eff}}{L_{eff}} V_T^2 e^{1.8} \exp\left(\frac{V_{gs} - V_{th}}{n V_T}\right) \left(1 - \exp\left(\frac{-V_{ds}}{V_T}\right)\right)$
But gate delay increases at the same time (T. Sakurai and A. R. Newton, Alpha-power Law, 1990):
$T_{pd} \propto \frac{C V_{dd}}{(V_{dd} - V_{th})^{\alpha}}$
where α models channel effects (long channel α = 2, short channel α = 1.3)
When using dual-Vth techniques, the tradeoff between leakage reduction and performance degradation must be considered
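The tradeoff described above can be sketched numerically. The following is a minimal illustration, not part of the original slides: it keeps only the Vth-dependent factors of the two formulas, and the slope factor `n`, thermal voltage `v_t`, supply voltage and threshold values are assumed example numbers.

```python
import math

def leakage_ratio(delta_vth, n=1.5, v_t=0.026):
    """Isub(Vth + delta_vth) / Isub(Vth), from the exp((Vgs - Vth) / (n * VT)) factor.

    n and v_t (thermal voltage, in volts) are assumed values for illustration.
    """
    return math.exp(-delta_vth / (n * v_t))

def delay_ratio(vdd, vth_low, vth_high, alpha=1.3):
    """Tpd(vth_high) / Tpd(vth_low) from the alpha-power law."""
    return ((vdd - vth_low) / (vdd - vth_high)) ** alpha

# Hypothetical operating point: Vdd = 1.0 V, Vth raised from 0.2 V to 0.3 V.
leak = leakage_ratio(0.1)
slow = delay_ratio(1.0, 0.2, 0.3)
print(f"leakage scales by {leak:.3f}, delay scales by {slow:.3f}")
```

Under these assumed numbers the leakage drops by more than an order of magnitude while the delay grows by a much smaller factor, which is the asymmetry that dual-Vth assignment exploits.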
Dual Threshold CMOS
Dual Threshold Device library (NAND02 @ 70nm), Spice Simulation
Threshold | Subthreshold Leakage | Speed
Low Vth | High (~10nA) | Fast (~30ps)
High Vth | Low (~0.23nA) | Slow (~40ps)
To maintain performance, most gates on the critical path may be assigned low Vth
Most gates on the non-critical paths may be assigned high Vth to reduce leakage
Dynamic Power
$P_{dyn} = \frac{1}{2} C_L V_{dd}^2 A F$
  F – clock frequency
  A – switching activity
Dynamic Power = Logic Switching Power + Glitch Power
Techniques to Eliminate Glitches
Glitch-free condition: path delay difference < gate inertial delay [1]
Hazard Filtering (Gate/Transistor Sizing)
  Increase gate inertial delay
  Size the gate to change its delay
Path Balancing
  Decrease path delay difference
  Insert delay elements on the shorter-delay signal path
[Figure: example gate with annotated delays, before and after hazard filtering and path balancing]
[1] V. D. Agrawal, International Conference on VLSI Design, 1997
Timing Window
- for calculating path delay difference
[Figure: (a) an n-input NAND gate with input arrival times t1, t2, ..., tn and gate delay di; (b) timing window for the inputs and output of the gate in (a)]
Input Timing Window: from t1 (earliest input arrival) to tn (latest input arrival)
Output Timing Window: from ti = t1 + di to Ti = tn + di
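The timing-window idea can be sketched in a few lines. This is an illustrative sketch only: the function names and the numeric arrival times are assumptions, and the glitch test applies the condition "path delay difference < gate inertial delay" stated earlier in the talk.

```python
def output_window(input_windows, gate_delay):
    """Return (t, T) at the gate output.

    input_windows is a list of (earliest, latest) arrival times, one per input.
    """
    earliest = min(t for t, _ in input_windows)
    latest = max(T for _, T in input_windows)
    return earliest + gate_delay, latest + gate_delay

def is_glitch_free(input_windows, gate_delay):
    """True when the output window is narrower than the gate inertial delay."""
    t_out, T_out = output_window(input_windows, gate_delay)
    return (T_out - t_out) < gate_delay

# Hypothetical 3-input gate with delay 2: inputs arrive at times 1, 2 and 3.5.
unbalanced = [(1.0, 1.0), (2.0, 2.0), (3.5, 3.5)]
# Same gate after a delay element of 1 time unit is inserted on the earliest input.
balanced = [(2.0, 2.0), (2.0, 2.0), (3.5, 3.5)]

print(output_window(unbalanced, 2.0), is_glitch_free(unbalanced, 2.0))
print(output_window(balanced, 2.0), is_glitch_free(balanced, 2.0))
```

The second call illustrates path balancing: delaying the early input shrinks the window without changing the latest arrival time.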
Previous Work on Leakage Minimization
and Glitch Power Reduction
Leakage Power Minimization by Dual-Vth CMOS Devices

Heuristic Algorithms (locally optimum solutions)
Q. Wang and S. B. K. Vrudhula, "Static Power Optimization of Deep Submicron
CMOS Circuits for Dual VT Technology," Proc. ICCAD, 1998, pp. 490-496.
L. Wei, Z. Chen, M. Johnson and K. Roy, “Design and Optimization of Low
Voltage High Performance Dual Threshold CMOS Circuits,” Proc. DAC, 1998,
pp. 489-494.

Integer Linear Programming (globally optimum solutions)
D. Nguyen, A. Davare, M. Orshansky, D. Chinnery, B. Thompson and K. Keutzer,
“Minimization of Dynamic and Static Power Through Joint Assignment of
Threshold Voltages and Sizing Optimization,” Proc. ISLPED, 2003, pp. 158-163.
F. Gao and J. P. Hayes, “Gate Sizing and Vt Assignment for Active-Mode
Leakage Power Reduction,” Proc. ICCD, 2004, pp. 258-264
Glitch Power Elimination by Linear Programming
T. Raja, V. D. Agrawal and M. L. Bushnell, “Minimum Dynamic Power CMOS
Circuit Design by a Reduced Constraint Set Linear Program,” Proc. 16th
International Conference on VLSI Design, 2003, pp. 527-532.
Outline
Motivation
Problem Statement
Background
Proposed Techniques



MILP1 for Leakage and Glitch Minimization
MILP2 for Statistical Leakage Optimization under Process Variation
MILP3 for Statistical Glitch Power Reduction under Process Variation
Results
Conclusion
Future Work
MILP1: Minimize Leakage and Dynamic
Glitch Power Simultaneously
No process variation is considered.
MILP1 is a mixed integer linear program (both integer variables and continuous variables are used).
Objective: In a dual-threshold CMOS process
  Minimize leakage – MILP1 determines the optimal dual-threshold assignment
  Eliminate glitches – MILP1 determines delays and positions of delay elements used to balance path delays
MILP1: A Mixed Integer Linear Program
for Leakage and Glitch Power Reduction
Ideal objective function:
Minimize {Total leakage + No. of glitch
suppressing delay elements}
Alternative objective function (linear
approximation):
Minimize {C1·Total leakage + C2·Total glitch
suppressing delay}
Variables and Constants
Each gate has four variables and four constants:
Integer Variable:
  Xi: 0 or 1, specifies gate threshold voltage
Continuous-valued Variables:
  Ti: latest time at which the output of gate i can produce an event after the occurrence of an event at primary inputs.
  ti: earliest time at which the output of gate i can produce an event after the occurrence of an event at primary inputs.
  Δdi,j: delay of inserted delay element at the input of gate i coming from gate j.
Constants Determined by Spice Simulation:
  ILi and IHi: Leakage currents for low and high thresholds
  DLi and DHi: Delays for low and high thresholds
Constraints
[Figure: example circuit with gates 0-3, timing windows (t0,T0), (t1,T1), (t2,T2), (t3,T3), and delay elements Δd2,1 and Δd2,2 at the inputs of gate 2]
Glitch suppression constraints for each gate i (shown for gate 2):
$T_2 \ge T_0 + \Delta d_{2,1} + X_2 \cdot DL_2 + (1 - X_2) \cdot DH_2$ (1)
$T_2 \ge 0 + \Delta d_{2,2} + X_2 \cdot DL_2 + (1 - X_2) \cdot DH_2$ (2)
$t_2 \le t_0 + \Delta d_{2,1} + X_2 \cdot DL_2 + (1 - X_2) \cdot DH_2$ (3)
$t_2 \le 0 + \Delta d_{2,2} + X_2 \cdot DL_2 + (1 - X_2) \cdot DH_2$ (4)
$X_2 \cdot DL_2 + (1 - X_2) \cdot DH_2 \ge T_2 - t_2$ (5)
Constraints (1)-(5) ensure that the timing window of gate 2, T2 - t2, stays within the gate delay d2 = X2·DL2 + (1 - X2)·DH2.
Circuit delay constraint for each PO k:
$T_k \le T_{max}$, k = 1, 3
Tmax can be the delay of the critical path or the clock period specified by the circuit designer
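To make the role of Xi and the circuit delay constraint concrete, here is a small sketch that replaces the MILP solver with exhaustive enumeration over a four-gate toy circuit. It is not the author's formulation: it covers only the Vth assignment and the Tk ≤ Tmax constraint (no delay elements or glitch constraints), the netlist `FANIN` is hypothetical, and the delay and leakage constants reuse the NAND02 figures quoted earlier (~30ps / ~40ps, ~10nA / ~0.23nA).

```python
from itertools import product

DL, DH = 30.0, 40.0   # ps: gate delay with low / high Vth
IL, IH = 10.0, 0.23   # nA: leakage with low / high Vth

# Hypothetical netlist, listed in topological order: gate -> driving gates.
FANIN = {"g1": [], "g2": ["g1"], "g3": ["g2"], "g4": []}
OUTPUTS = ["g3", "g4"]

def latest_arrival(assign):
    """Compute Ti for every gate; assign[g] == 1 means low Vth (delay DL)."""
    T = {}
    for gate, preds in FANIN.items():
        delay = DL if assign[gate] == 1 else DH
        T[gate] = max((T[p] for p in preds), default=0.0) + delay
    return T

def best_assignment(t_max):
    """Enumerate all X vectors; keep the feasible one with the least leakage."""
    best = None
    for bits in product((0, 1), repeat=len(FANIN)):
        assign = dict(zip(FANIN, bits))
        T = latest_arrival(assign)
        if any(T[k] > t_max for k in OUTPUTS):
            continue  # violates Tk <= Tmax at some primary output
        leak = sum(IL if x == 1 else IH for x in bits)
        if best is None or leak < best[0]:
            best = (leak, assign)
    return best

tight = best_assignment(90.0)    # Tmax = Tc (all-low-Vth critical path delay)
relaxed = best_assignment(112.5) # Tmax = 1.25 * Tc
print(tight)
print(relaxed)
```

Enumeration grows as 2^n and is only usable for toy examples; it is shown here because it makes the feasible set and the objective explicit, which an MILP solver handles implicitly.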
Choices for a Delay Element
Two cascaded-inverter buffer – consumes additional short-circuit, subthreshold leakage and dynamic power.
  All delay buffers lie on non-critical paths and are assigned high Vth; they contribute little to leakage
  But they add to dynamic power
Transmission gate (always on) – increases resistance
  Smaller area overhead
  No subthreshold leakage
  Minimal capacitance increase
  Used before:
    T. Raja, V. D. Agrawal and M. L. Bushnell, “Variable Input Delay CMOS Logic for Low Power Design,” Proc. 18th International Conference on VLSI Design, January 2005, pp. 598-605.
    T. Raja, V. D. Agrawal and M. L. Bushnell, “Transistor Sizing of Logic Gates to Maximize Input Delay Variability,” JOLPE, vol. 2, no. 1, pp. 121-128, April 2006.
Transmission-Gate Delay Element with Minimum Capacitance
[Figure: transmission gate with terminals G, S, D, B and capacitances CGS, CGD, CGB, CSB, CDB; equivalent RC model between S and D with Rtotal and Ctotal]
Two types of capacitances:
  Channel capacitances: CGS, CGD
  $C_{chan} = C_3 \cdot L W$
  Diffusion capacitances: CSB, CDB
  $C_{diff} = C_1 \cdot W + C_2$
Rtotal = Rch
Ctotal = CGS + CGD + CSB + CDB
To minimize diffusion capacitances, we implement all the transmission-gate delay elements with minimum-width but longer-channel transistors
Transmission-Gate Delay Element with Minimum Capacitance (Cont.)
$t_p = \ln 2 \cdot R_{eq} C_L \propto \frac{L}{W}\left(C_{trans\_total} + C_{load\_chan}\right)$
$= \frac{L}{W}\left[(a \cdot LW + b \cdot W + c) + C_{load\_chan}\right]$
$= L\left(a \cdot L + b + \frac{c + C_{load\_chan}}{W}\right)$
To implement a specified delay, the smallest L is needed when W is at its minimum.
This reduces the channel capacitance of the transmission gate, which is proportional to L·W.
So, a minimal-width transmission gate has a minimum Ctotal and causes the smallest dynamic power overhead.
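The sizing argument above can be checked with a short numeric sketch. The coefficients a, b, c, the load capacitance and the target delay below are arbitrary illustrative values in normalized units (the proportionality constant is taken as 1); only the functional form comes from the slide.

```python
import math

def required_length(target, width, a=1.0, b=1.0, c=1.0, c_load=2.0):
    """Positive root L of L * (a*L + b + (c + c_load) / width) = target."""
    k = b + (c + c_load) / width
    return (-k + math.sqrt(k * k + 4.0 * a * target)) / (2.0 * a)

def gate_capacitance(length, width, a=1.0, b=1.0, c=1.0):
    """C_trans_total = a*L*W + b*W + c for the transmission gate."""
    return a * length * width + b * width + c

# Same target delay realized at three candidate widths (normalized units).
for width in (1.0, 2.0, 4.0):
    length = required_length(10.0, width)
    cap = gate_capacitance(length, width)
    print(f"W={width}: L={length:.3f}, C_total={cap:.3f}")
```

With these assumed coefficients, the narrowest device needs the shortest channel and carries the least capacitance for the same delay, matching the slide's conclusion.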
Outline
Motivation
Problem Statement
Background
Proposed Techniques



MILP1 for Leakage and Glitch Minimization
MILP2 for Statistical Leakage Optimization under Process Variation
MILP3 for Statistical Glitch Power Reduction under Process Variation
Results
Conclusion
Future Work
One Example: Process Variation Effect on Leakage and Performance
0.18um CMOS process
20X leakage variation
30% frequency variation
High-frequency but too leaky chips must be discarded
Low-leakage chips with too low a frequency must also be discarded
[Figure: chip population with the "too leaky" and "too slow" regions marked]
[Ref] S. Borkar, et al., DAC 2003.
Local and Global Process Variations
Inter-die Variation (Global Variation)
  refers to wafer-to-wafer variation, or die-to-die variation on the same wafer
  affects all devices on the same chip in the same way
Intra-die Variation (Local Variation)
  occurs across an individual die / chip
  devices at different locations on the same chip may have different process parameters
Comparison of Dynamic and Leakage Power
Variation of Un-Optimized C432 (1,000 Samples)
[Figure: probability vs. normalized dynamic power for nominal and 10%, 20%, 30% delay variation]
Dynamic power:
Delay variation | (mean - nominal) / nominal | STD / mean
10% | -0.05% | 0.65%
20% | -0.07% | 1.12%
30% | -0.16% | 1.50%
[Figure: probability vs. normalized leakage power for nominal and 10%, 20%, 30% Leff variation]
Leakage power:
Leff variation | (mean - nominal) / nominal | STD / mean
10% | 3.10% | 6.1%
20% | 8.75% | 30.7%
30% | 25.17% | 112.9%
Comparison of Leakage Distribution of C432 Due to
Different Process Parameters’ Variation (3σ = 15%)
[Figure: four histograms of probability vs. leakage power (uW), each with the nominal value marked: global vs. local Vth variation; global vs. local Leff variation; global Leff, Tox and Vth variation; local Leff, Tox and Vth variation]
Comparison of Leakage Distribution of C432 Due to
Different Process Parameters’ Variation (Cont.)
Subthreshold leakage is most sensitive to the variation in the effective channel length.
Global variation has a stronger effect on subthreshold leakage than local variation.
process parameter (3σ=15%) | variation | nominal (nW) | mean (nW) | standard dev. (nW) | std. dev. / mean | (mean - nominal) / nominal | max dev. from nominal (nW) | max dev. / nominal
Leff | local | 906.9 | 1059.0 | 103.6 | 9.8% | 16.8% | 611.6 | 67.4%
Leff | global | 906.9 | 1089.0 | 599.1 | 55.0% | 20.1% | 4652.0 | 513.0%
Tox | local | 906.9 | 939.6 | 33.7 | 3.6% | 3.6% | 136.9 | 15.1%
Tox | global | 906.9 | 938.6 | 199.9 | 21.3% | 3.5% | 795.8 | 87.7%
Vth | local | 906.9 | 956.7 | 36.4 | 3.8% | 5.5% | 171.0 | 18.9%
Vth | global | 906.9 | 964.4 | 219.8 | 22.8% | 6.3% | 1028.0 | 113.4%
Leff + Tox + Vth | local | 906.9 | 1155.0 | 140.8 | 12.2% | 27.4% | 1044.0 | 115.1%
Leff + Tox + Vth | global | 906.9 | 1164.0 | 719.4 | 61.8% | 28.3% | 5040.0 | 555.7%
Statistical Leakage Modeling
2000 samples of subthreshold leakage of one MUX cell @ 90nm by Monte Carlo Spice simulation
In the Spice model library, process parameters (Tox, Ndop, Vth) are random variables with Gaussian distribution
Statistical subthreshold leakage has a lognormal distribution
We use the statistical leakage model in [ref]
[ref] R. Rao, et al., “Parametric Yield Estimation Considering Leakage Variability,” DAC, 2004.
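Why Gaussian parameters produce lognormal leakage can be illustrated without Spice. The sketch below is a simplified stand-in, not the model of [ref]: it varies only Vth, keeps only the exp(-Vth / (n·VT)) factor of the subthreshold current, and uses assumed values for the Vth standard deviation and for n·VT.

```python
import math
import random
import statistics

def sample_leakage(nominal, sigma_vth, n_vt=0.039, samples=2000, seed=1):
    """Monte Carlo leakage samples for a Gaussian Vth shift (in volts).

    Leakage depends exponentially on the shift, so the samples are lognormal.
    sigma_vth and n_vt (slope factor times thermal voltage) are assumed values.
    """
    rng = random.Random(seed)
    return [nominal * math.exp(-rng.gauss(0.0, sigma_vth) / n_vt)
            for _ in range(samples)]

leak = sample_leakage(nominal=1.0, sigma_vth=0.015)
mean = statistics.fmean(leak)
median = statistics.median(leak)
log_mean = statistics.fmean(math.log(x) for x in leak)
print(f"mean={mean:.3f} median={median:.3f} mean of log={log_mean:.3f}")
```

The sample mean lands above the nominal value while the median stays near it, which is the skew visible in the leakage tables earlier in the talk, where mean leakage exceeds nominal leakage under variation.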
Statistical Delay Modeling
Deterministic
[Equation: deterministic gate delay as a function of the process parameters]
Statistical – normal distribution [ref]
$D_i = D_{nom,i}\left(1 + c_{i1}\frac{\Delta L_{eff}}{L_{eff0}} + c_{i2}\frac{\Delta T_{ox}}{T_{ox0}} + c_{i3}\frac{\Delta N_{dop}}{N_{dop0}}\right)$
[Equation: Vth expressed as Vth0 plus a sum over the process parameters Xi of terms in (Xi0 - Xi) / Xi0]
Xi is a process parameter, Xi0 is the nominal value of Xi
Let {X1, X2, X3} = {Leff, Tox, Ndop}
Let $r = c_{i1}\frac{\Delta L_{eff}}{L_{eff0}} + c_{i2}\frac{\Delta T_{ox}}{T_{ox0}} + c_{i3}\frac{\Delta N_{dop}}{N_{dop0}}$
Mean: $\mu_{D_i} = D_{nom,i}$
Standard Deviation: $\sigma_{D_i} = D_{nom,i} \cdot \sigma_r$
[ref] A. Davoodi and A. Srivastava, ISLPED, 2005.
MILP2 Formulation
(Deterministic vs. Statistical)
Deterministic Approach
  The delay and subthreshold current of every gate are assumed to be fixed, without any effect of process variation.
  Basic MILP1 – Minimize total leakage while keeping the circuit performance unchanged.
  Minimize $\sum_i I_{subnom,i}$, $\forall i \in$ gate number
  Subject to $T_{PO_k} \le T_{max}$, $\forall k \in$ PO
Statistical Approach
  Treat delay and timing intervals as random variables with normal distributions; leakage as a random variable with lognormal distribution.
  Basic MILP2 – Minimize total nominal leakage while keeping a certain timing yield (η).
  Minimize $\sum_i I_{subnom,i}$, $\forall i \in$ gate number
  Subject to $P(T_{PO_k} \le T_{max}) \ge \eta$, $\forall k \in$ PO
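For a normally distributed arrival time, a probabilistic constraint of the form P(T ≤ Tmax) ≥ η has a well-known deterministic equivalent, μ + Φ⁻¹(η)·σ ≤ Tmax. The sketch below shows that equivalence with the Python standard library; whether the dissertation linearizes the constraint in exactly this way is an assumption, and the numeric values are illustrative.

```python
from statistics import NormalDist

def yield_margin(eta):
    """Number of standard deviations needed for a timing yield eta (0 < eta < 1)."""
    return NormalDist().inv_cdf(eta)

def meets_timing_yield(mu_t, sigma_t, t_max, eta):
    """Deterministic equivalent of P(T <= t_max) >= eta for T ~ N(mu_t, sigma_t^2)."""
    return mu_t + yield_margin(eta) * sigma_t <= t_max

# Hypothetical primary output: mean arrival 1.00 ns, sigma 0.05 ns, Tmax 1.10 ns.
for eta in (0.95, 0.99):
    ok = meets_timing_yield(1.00, 0.05, 1.10, eta)
    print(f"eta={eta}: margin={yield_margin(eta):.3f} sigma, feasible={ok}")
```

Lowering η shrinks the required margin, which frees slack for more high-Vth gates; that is the mechanism behind the extra leakage savings reported for η = 99% and η = 95%. Note that `inv_cdf` is undefined at η = 1, so the 100% case corresponds to the deterministic formulation instead.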
Outline
Motivation
Problem Statement
Background
Proposed Techniques



MILP1 for Leakage and Glitch Minimization
MILP2 for Statistical Leakage Optimization under Process Variation
MILP3 for Statistical Glitch Power Reduction under Process Variation
Results
Conclusion
Future Work
Background
Dynamic power is normally much less sensitive to process variation because of its approximately linear relation to process parameters.
[Figure: probability vs. normalized dynamic power for C432 unoptimized for glitches, for nominal and 10%, 20%, 30% delay variation]
Deterministic path balancing becomes ineffective under process variation because the perfect hazard filtering conditions can easily be corrupted by a very slight variation in process parameters.
[Figure: probability vs. normalized dynamic power for C432 optimized by path balancing, for 10%, 20%, 30% delay variation]
Gate Distribution without Considering Process Variation
[Figure: gates plotted by gate delay di against timing window Ti - ti; the line di = Ti - ti separates the glitch-free region (di >= Ti - ti) from the region with glitches (di <= Ti - ti). Left panel: circuits unoptimized for glitch. Right panel: circuits optimized for glitch by path balancing.]
Gate Distribution under Process Variation
[Figure: gates plotted by gate delay di against timing window Ti - ti; the line di = Ti - ti separates the glitch-free region (di >= Ti - ti) from the region with glitches (di <= Ti - ti). Left panel: circuits unoptimized for glitch. Right panel: circuits optimized for glitch by path balancing.]
Glitch power of unoptimized circuits is not sensitive to process variation.
Glitch power of circuits optimized by path balancing is sensitive to process variation.
Technique of Enhancing the Resistance
of Glitch Power to Process Variations
[Figure: gates plotted by gate delay di against timing window Ti - ti; the line di = Ti - ti separates the glitch-free region (di >= Ti - ti) from the region with glitches (di <= Ti - ti)]
Leave a relaxed margin for process variation resistance in advance:
$D_i \ge T_i - t_i \;\Rightarrow\; \mu_{D_i} - 3\sigma_{D_i} \ge (\mu_{T_i} + 3\sigma_{T_i}) - (\mu_{t_i} - 3\sigma_{t_i})$
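The difference between the nominal glitch-free condition and the 3σ-margined one can be sketched as two predicates. This is an illustration under assumed numbers (all values in ps are hypothetical), and the 3σ form follows the worst-case reading of the inequality: smallest gate delay against latest T and earliest t.

```python
def glitch_free_nominal(d, T, t):
    """Deterministic condition: gate delay covers the timing window."""
    return d >= T - t

def glitch_free_3sigma(mu_d, sd_d, mu_T, sd_T, mu_t, sd_t):
    """Margined condition: still glitch free at the 3-sigma corners."""
    return mu_d - 3 * sd_d >= (mu_T + 3 * sd_T) - (mu_t - 3 * sd_t)

# Perfectly balanced gate (window equals delay): passes nominally,
# but has no margin once each quantity varies with sigma = 2 ps.
balanced_nominal = glitch_free_nominal(40.0, 100.0, 60.0)
balanced_margined = glitch_free_3sigma(40.0, 2.0, 100.0, 2.0, 60.0, 2.0)

# Same gate with the window narrowed in advance to 18 ps.
relaxed_margined = glitch_free_3sigma(40.0, 2.0, 90.0, 2.0, 72.0, 2.0)

print(balanced_nominal, balanced_margined, relaxed_margined)
```

The first pair mirrors the earlier observation that deterministically balanced circuits sit exactly on the boundary, so a slight variation pushes gates into the glitching region.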
Results for C432
Monte Carlo Simulation (15% local process variation)
  Blue: C432 optimized by the statistical MILP with greater emphasis on the resistance of glitch power to process variation (Section 5.2.3.1)
  Purple: C432 optimized by the deterministic MILP (Section 5.1.2)
Dynamic Power (logic simulation)
[Figure: probability vs. normalized dynamic power]
  statistical: µ=1.04, 3σ/µ=2.82%, (µ-N)/N=3.63%
  deterministic: µ=1.14, 3σ/µ=5.13%, (µ-N)/N=13.53%
Subthreshold Leakage (Spice simulation)
[Figure: probability vs. normalized leakage]
  statistical: N2=1.94, µ=2.25, σ/µ=10.24%, (µ-N2)/N2=15.22%
  deterministic: N1=1.00, µ=1.17, σ/µ=6.64%, (µ-N1)/N1=16.97%
Outline
Motivation
Problem Statement
Background
Proposed Techniques
Results
Conclusion
Future Work
Results of MILP1: Leakage reduction and performance tradeoff (27℃, 70nm)
Circuit | # gates | Critical Path Delay Tc (ns) | Unoptimized Ileak (μA) | Optimized Ileak (μA), Tmax = Tc | Leakage Reduction %, Tmax = Tc | Sun OS 5.7 CPU secs., Tmax = Tc | Optimized Ileak (μA), Tmax = 1.25Tc | Leakage Reduction %, Tmax = 1.25Tc | Sun OS 5.7 CPU secs., Tmax = 1.25Tc
C432 | 160 | 0.751 | 2.620 | 1.022 | 61.0 | 0.42 | 0.132 | 95.0 | 0.3
C499 | 182 | 0.391 | 4.293 | 3.464 | 19.3 | 0.08 | 0.225 | 94.8 | 1.8
C880 | 328 | 0.672 | 4.406 | 0.524 | 88.1 | 0.24 | 0.153 | 96.5 | 0.3
C1355 | 214 | 0.403 | 4.388 | 3.290 | 25.0 | 0.1 | 0.294 | 93.3 | 2.1
C1908 | 319 | 0.573 | 6.023 | 2.023 | 66.4 | 59 | 0.204 | 96.6 | 1.3
C2670 | 362 | 1.263 | 5.925 | 0.659 | 90.4 | 0.38 | 0.125 | 97.9 | 0.16
C3540 | 1097 | 1.748 | 15.622 | 0.972 | 93.8 | 3.9 | 0.319 | 98.0 | 0.74
C5315 | 1165 | 1.589 | 19.332 | 2.505 | 87.1 | 140 | 0.395 | 98.0 | 0.71
C6288 | 1177 | 2.177 | 23.142 | 6.075 | 73.8 | 277 | 0.678 | 97.1 | 7.48
C7552 | 1046 | 1.915 | 22.043 | 0.872 | 96.0 | 1.1 | 0.445 | 98.0 | 0.58
Results of MILP1: Leakage, Dynamic and Total Power Comparison (90℃, 70nm)
Circuit Name | No. of Gates | Pleak1 (uW) | Pleak2 (uW) | Leakage Reduction | Pdyn1 (uW) | Pdyn2 (uW) | Dynamic Reduction | Ptotal1 (uW) | Ptotal2 (uW) | Total Reduction
C432 | 160 | 35.77 | 11.87 | 66.8% | 101.0 | 73.3 | 8.63% | 136.8 | 104.15 | 23.86%
C499 | 182 | 50.36 | 39.94 | 20.7% | 225.7 | 160.3 | 18.13% | 276.1 | 224.72 | 18.61%
C880 | 328 | 85.21 | 11.05 | 87.0% | 177.3 | 128.0 | 16.23% | 262.5 | 159.57 | 39.21%
C1355 | 214 | 54.12 | 39.96 | 26.3% | 293.3 | 165.7 | 35.79% | 347.4 | 228.29 | 34.29%
C1908 | 319 | 92.17 | 29.69 | 67.8% | 254.9 | 197.7 | 8.39% | 347.1 | 263.20 | 24.17%
C2670 | 362 | 115.4 | 11.32 | 90.2% | 128.6 | 100.8 | 7.42% | 244.0 | 130.38 | 46.57%
C3540 | 1097 | 302.8 | 17.98 | 94.1% | 333.2 | 228.1 | 14.04% | 636.0 | 304.40 | 52.14%
C5315 | 1165 | 421.1 | 49.79 | 88.2% | 465.5 | 304.3 | 12.08% | 886.6 | 459.06 | 48.22%
C6288 | 1189 | 388.5 | 97.17 | 75.0% | 1691.2 | 405.6 | 68.73% | 2079.7 | 625.95 | 69.90%
C7552 | 1046 | 444.4 | 18.75 | 95.8% | 380.9 | 227.8 | 27.74% | 825.3 | 293.99 | 64.38%
Results of MILP 2: Comparison of nominal leakage power saving due to statistical modeling with two different timing yields (η).
Columns 4-5: Deterministic Optimization (η=100%). Columns 6-8: Statistical Optimization (η=99%). Columns 9-11: Statistical Optimization (η=95%).
Circuit Name | # gates | Un-opt. Leakage Power (μW) | Optimized Leakage Power (μW) | Run Time (s) | Optimized Leakage Power (μW) | Extra Power Saving | Run Time (s) | Optimized Leakage Power (μW) | Extra Power Saving | Run Time (s)
C432 | 160 | 2.620 | 1.003 | 0.00 | 0.662 | 33.9% | 0.44 | 0.589 | 41.3% | 0.32
C499 | 182 | 4.293 | 3.396 | 0.02 | 3.396 | 0.0% | 0.22 | 2.323 | 31.6% | 1.47
C880 | 328 | 4.406 | 0.526 | 0.02 | 0.367 | 30.2% | 0.18 | 0.340 | 35.4% | 0.18
C1355 | 214 | 4.388 | 3.153 | 0.00 | 3.044 | 3.5% | 0.17 | 2.158 | 31.6% | 0.48
C1908 | 319 | 6.023 | 1.179 | 0.03 | 1.392 | 21.7% | 11.21 | 1.169 | 34.3% | 17.5
C2670 | 362 | 5.925 | 0.565 | 0.03 | 0.298 | 47.2% | 0.35 | 0.283 | 49.8% | 0.43
C3540 | 1097 | 15.622 | 0.957 | 0.13 | 0.475 | 50.4% | 0.24 | 0.435 | 54.5% | 1.17
C5315 | 1165 | 19.332 | 2.716 | 1.88 | 1.194 | 56.0% | 67.63 | 0.956 | 64.8% | 19.7
C7552 | 1045 | 22.043 | 0.938 | 0.44 | 0.751 | 20.0% | 0.88 | 0.677 | 27.9% | 0.58
Average of ISCAS’85 benchmarks | | | | 0.24 | | 29.2% | 9.04 | | 41.3% | 4.64
ARM7 | 15.5k | 686.56 | 495.12 | 15.69 | 425.44 | 14.07% | 36.79 | 425.44 | 14.07% | 36.4
Statistical Dual-threshold Assignment
The leakage in high-Vth gates is less sensitive to process variation.
The higher the percentage of high-Vth gates in a circuit, the narrower the leakage power distribution (smaller standard deviation) and the lower the average leakage power (mean).
Under global process variation, all gate delays change by the same percentage, which does not affect the constraints in the MILP, so the dual-threshold assignment remains the same.
Subthreshold leakage is most sensitive to Leff variation.
So, we only simulate the leakage distribution of all statistically optimized circuits with local Leff variation (3σ=15%) by Spice.
To analyze the leakage distribution under process variation in the deterministic method, we considered the worst case, which is too pessimistic.
Results of MILP 2: Leakage Power Distribution
of Optimized Dual-Vth C7552
[Figure: probability vs. leakage power for C7552_d, C7552_p99 and C7552_p95]
Mean and standard deviation of leakage power are reduced by the statistical method.
Results of MILP 2: Comparison of leakage power distribution with two different timing yields (η).
Columns 3-5: Deterministic Optimization (η=100%). Columns 6-8: Statistical Optimization (η=99%). Columns 9-11: Statistical Optimization (η=95%).
Circuit Name | # gates | Nominal Leakage (uW) | Mean Leakage (uW) | Standard Deviation (uW) | Nominal Leakage (uW) | Mean Leakage (uW) | Standard Deviation (uW) | Nominal Leakage (uW) | Mean Leakage (uW) | Standard Deviation (uW)
C432 | 160 | 0.907 | 1.059 | 0.104 | 0.603 | 0.709 | 0.074 | 0.522 | 0.614 | 0.069
C499 | 182 | 3.592 | 4.283 | 0.255 | 3.592 | 4.283 | 0.255 | 2.464 | 2.905 | 0.197
C880 | 328 | 0.551 | 0.645 | 0.086 | 0.430 | 0.509 | 0.080 | 0.415 | 0.491 | 0.079
C1355 | 214 | 3.198 | 3.744 | 0.200 | 3.090 | 3.606 | 0.202 | 2.199 | 2.610 | 0.175
C1908 | 319 | 1.803 | 2.123 | 0.170 | 1.356 | 1.601 | 0.116 | 1.140 | 1.341 | 0.127
C2670 | 362 | 0.635 | 0.750 | 0.078 | 0.405 | 0.473 | 0.046 | 0.395 | 0.461 | 0.043
C3540 | 1097 | 1.055 | 1.243 | 0.119 | 0.527 | 0.611 | 0.032 | 0.493 | 0.575 | 0.031
C5315 | 1165 | 2.688 | 3.128 | 0.165 | 1.229 | 1.420 | 0.088 | 1.034 | 1.188 | 0.067
C7552 | 1045 | 0.924 | 1.073 | 0.069 | 0.774 | 0.903 | 0.049 | 0.701 | 0.823 | 0.045
Average of ISCAS’85 benchmarks | | | | 0.138 | | | 0.105 | | | 0.093
Results of MILP 2: Comparison of mean
of three leakage power distributions
[Figure: bar chart of mean leakage (nW) for c432, c499, c880, c1355, c1908, c2670, c3540, c5315 and c7552 under determ. 100%, sta. 99% and sta. 95%]
Results of MILP 2: Comparison of standard
deviation of three leakage power distributions
[Figure: bar chart of leakage standard deviation (nW) for c432, c499, c880, c1355, c1908, c2670, c3540, c5315 and c7552 under determ. 100%, sta. 99% and sta. 95%]
Conclusion
A new mixed integer linear programming technique
  Simultaneous minimization of leakage (dual-Vth) and elimination of glitches (path delay balancing).
  Global tradeoff between power and performance.
  Experimental results show 96%, 28% and 64% reductions in leakage, dynamic (glitch) and total power, respectively, for C7552.
A second mixed integer linear programming formulation
  Statistically minimizes the leakage power in a dual-Vth process under process variations.
  Experimental results show that 30% more leakage power reduction can be achieved by using this statistical approach.
  The mean and standard deviation of the leakage power distribution are both reduced when a small yield loss is permitted.
Conclusion (cont.)
A third mixed integer linear programming formulation
  Statistically minimizes the total power, the leakage or the dynamic power in a dual-Vth process under process variations.
  The effect of process variation on glitch power is minimized.
Future Work
Gate leakage
MILP complexity
  For an SOC, MILP constraints can be generated for its submodules at a lower level
    This may not guarantee a global optimum, but would still give a reasonable result within acceptable run time.
  Adopt a relaxed LP: use the LP solution as the starting point and then round off the variables
    An approximately optimal solution can be achieved with acceptable run time.
Future Work (Cont.)
Iterative MILP for dual-Vth design
  Timing violations were found
  The interdependency of delays of gates was neglected for simplicity in our MILP formulation.
[Figure: path of gates 1, 2, 3 between two flip-flops (FF) with an 8ns limit; gate delays in the LVT design: 7ns = 2 + 2 + 3; in the dual-Vth design: 8.2ns = 2 + 3 + 3.2]
If any timing violation is found, the new delays for all LVT cells are extracted from the current dual-Vth design and the MILP formulation is updated correspondingly. A different optimal solution is then given by the CPLEX solver with fewer timing violations. We continue iterations until all timing violations are eliminated.
Thank You All !
Questions?