ASPDAC2010_Xiang

advertisement
An Adaptive Parallel Flow for Power Distribution
Network Simulation Using Discrete Fourier Transform
Xiang Hu1, Wenbo Zhao2, Peng Du2,
Amirali Shayan2, Chung-Kuan Cheng2
1ECE
Dept., 2CSE Dept.
University of California, San Diego
01/19/2010
Outline
 Motivation
 Adaptive simulation flow using discrete Fourier transform (DFT)
– Basic DFT flow
– Problems with the basic DFT flow
– Adaptive DFT flow
 Experimental results
– Error analysis
– Time efficiency of the adaptive flow
– Time efficiency of parallel processing
 Conclusions
Page  2
Outline
 Motivation
 Adaptive simulation flow using discrete Fourier transform (DFT)
– Basic DFT flow
– Problems with the basic DFT flow
– Adaptive DFT flow
 Experimental results
– Error analysis
– Time efficiency of the adaptive flow
– Time efficiency of parallel processing
 Conclusions
Page  3
PDN Simulation: Why Frequency Domain?
 Huge PDN netlists
– Time-domain simulation: serial - slow
– Frequency-domain simulation: parallel – fast
 Frequency dependent parasitics
 Simulation results
– Time-domain: voltage drops, simultaneous switching noise (SSN) –
input dependent
– Frequency-domain: impedance, anti-resonance peaks – input
independent
Page  4
Transform Operations: Why Discrete Fourier Transform
 Laplace Transform [Wanping ’07]
– Input: Series of ramp functions
– Output: Rational expressing via vector fitting
• Vector fitting may introduce large errors
– Choice of frequency samples is case dependent
 Discrete Fourier Transform (DFT)
– Input: periodic signal
– Inverse DFT is straightforward: vector fitting is not needed
– Frequency sample points with uniform steps
Page  5
Outline
 Motivation
 Adaptive simulation flow using discrete Fourier transform (DFT)
– Basic DFT flow
– Problems with the basic DFT flow
– Adaptive DFT flow
 Experimental results
– Error analysis
– Time efficiency of the adaptive flow
– Time efficiency of parallel processing
 Conclusions
Page  6
Basic DFT Simulation Flow
Page  7
Problem with Basic DFT Flow
 “Wrap-around effect” requires long padding zeros at the end of the input
– Periodicity nature of DFT
 Small uniform time steps are needed to cover the input frequency range
T
x 10
8
T
-3
6
x 10
-3
7
5
6
output
5
4
3
Correct
4
Voltage (V)
Current (A)
Large number of
simulation points!
2T 3T 4T
3
2
2
1
1
0
0
T
-1
0
0.2
0.4
0.6
0.8
Time (sec)
8
DFT
repetition
7
6
Current (A)
Current (A)
5
4
3
x 10
-7
-1
Wrap-around
0
0.5
1
1.5
2
Time (sec)
-3
6
x 10
2.5
x 10
-3
-8
7
5
6
output
5
4
3
2
2
1
1
0
0
Distorted!
4
Voltage (V)
8
x 10
-3
1
x 10
3
2
1
-1
0
0.5
Page  8
1
1.5
Time (sec)
2
2.5
x 10
-8
-1
0
0
0.2
0.4
0.6
Time (sec)
0.8
1
x 10
-7
-1
0
0.5
1
1.5
Time (sec)
2
2.5
x 10
-8
Adaptive DFT Simulation
x 10
4T
-
-3
5
Voltage (V)
4
3
2
T
6
x 10
-3
6
5
5
4
4
3
3
Voltage (V)
6
2T 3T
Voltage (V)
T
2
1
1
0
0
0
0
0.2
0.4
0.6
Time (sec)
Correct
0.8
-1
1
x 10
-7
-1
0
0.2
0.4
0.6
Time (sec)
Distorted
0.8
1
x 10
-3
2
1
-1
x 10
0
0.5
1
1.5
2
Time (sec)
-7
2.5
x 10
Correct!
 Basic ideas of the adaptive DFT flow: cancel out the wrap-around effect
by subtracting the tail from the main part of the output
– Main part of the output: obtained with small time step and small period;
distorted by the wrap-around effect
– Tail of the output: low frequency oscillation; can be captured with large time
steps
Total number of simulation points is reduced significantly!
Page  9
-8
Adaptive DFT Flow
 Period[i]: the input period at each iteration
 Interval[i]: the simulation time step at each
iteration
 FreqUpBd[i]: the upper bound of the input
frequency range at each iteration
 vi(t): tentative time-domain output within the
frequency range [0, FreqUpBd] at each
iteration
 Iteration #1: obtain the main part of the
output
 Iteration #2~k: capture the oscillations in the
tail of the output (high, middle, and low
resonant frequencies)
 For each iteration #i, i=k, k-1, …, 2, subtract
the captured tail from the outputs at iteration
#j, j<i to eliminate the wrap-around effect
Page  10
Outline
 Motivation
 Adaptive simulation flow using discrete Fourier transform (DFT)
– Basic DFT flow
– Problems with the basic DFT flow
– Adaptive DFT flow
 Experimental results
– Error analysis
– Time efficiency of the adaptive flow
– Time efficiency of parallel processing
 Conclusions
Page  11
Experimental Results: Test Case & Input
 Test case: 3D PDN
– One resonant peak in the impedance profile
 Input current
– Time step: ∆t = 20ps
– Duration: T0 = 16.88ns
Impedance
5.5
8
5
7
4.5
6
3.5
3
2.5
4
3
2
2
1
1.5
0
1
0.5
10
Page  12
Original Input
-3
5
Current (A)
Impedance ( )
4
x 10
4
10
6
Frequency (Hz)
10
8
-1
0
0.5
1
Time (sec)
1.5
2
x 10
-8
Experimental Results: Adaptive Flow Process
T1
x 10
10
 Iteration #1: v1(t)
-3
v1(t), ∆t1=20ps, T1=20.48ns
8
– ∆t1=20ps
Voltage (V)
6
4
0
0
10
Voltage (V)
 Iteration #2: v2(t)
m 1
-2
-4
– T1=20.48ns
3
v1 (t )   v2 (mT1 : (m  1)T1 )
2
x 10
0.5
-3
1
1.5
2
Time (sec)
3T1
2T1
2.5
x 10
-8
4T1 · · ·8T1
v2(t), ∆t2=640ps, T2=163.84ns
5
= 640ps
– T2 = 8T1
0
-5
= 163.84ns
0
10
x 10
1
2
3
-3
4
5
6
7
8
Time (sec)
9
x 10
5
– Main part:
3
v1 (t )   v2 (mT1 : (m  1)T1 )
m 1
Final output-v (t)
– Tail:
1
0
-5
 Final output:
-8
Final output
v1(t)
Final output
Voltage (V)
– ∆t2 = 64∆t1
v2 (T1 : 8T1 )
0
Page  13
1
2
3
4
Time (sec)
5
6
7
8
9
x 10
-8
Experimental Results: DFT Flow vs. SPICE
10
x 10
-3
DFT flow
SPICE transient simulation
8
Voltage (V)
6
4
2
Relative error: 0.25%
0
-2
-4
Page  14
0
0.5
1
Time (sec)
1.5
2
x 10
-7
Error Analysis: Error Caused by Wrap-around Effect
6
x 10
-3
Output comparison
1.5
DFT flow, T=20.48ns
DFT flow, T=163.84ns
SPICE transient simulation
5
x 10
-4
Error relative to SPICE
Difference between DFT and SPICE, T=20.48ns
Difference between DFT and SPICE, T=163.84ns
1
Relative error: 0.12%
4
3
Voltage (V)
Voltage (V)
0.5
2
0
-0.5
1
-1
0
Relative error: 2.09%
-1
0
0.5
1
1.5
Time (sec)
2
2.5
x 10
-8
-1.5
0
0.5
1
1.5
Time (sec)
2
2.5
x 10
Theorem 1: Let be the initial value of the output voltage.
Suppose
for some
, then the mean
square error, i.e.,
is bounded by
.
Page  15
-8
Error Analysis: Error Caused by Different Interpolation Methods
6
x 10
-3
Output comparison
2
DFT flow, t=20ps
DFT flow, t=2.5ps
SPICE transient simulation
5
x 10
-5
Error relative to SPICE
Difference between DFT and SPICE, t=20ps
Difference between DFT and SPICE, t=2.5ps
1.5
1
4
Relative error: 0.12%
0.5
Voltage (V)
Voltage (V)
3
2
0
-0.5
1
-1
0
-1
-1.5
0
0.2
0.4
0.6
0.8
Time (sec)
x 10
Voltage (V)
1
x 10
-7
1.31
1.3
1.29
2.294
0
0.2
0.4
0.6
0.8
Time (sec)
 DFT: sinusoidal interpolation
DFT flow, t=20ps
DFT flow, t=2.5ps
SPICE transient simulation
Page  16
-2
 SPICE: PWL interpolation
-4
1.32
1.28
Relative error: 0.09%
2.2945
2.295
Time (sec)
2.2955
2.296
x 10
-8
1
x 10
-7
Time Complexity Analysis: Adaptive vs. Non-adaptive
k
 Adaptive flow time complexity: O( Ti / ti )
i 1
– Ti: simulation period at iteration #i, T1  T2 
 Tk
– ∆ti: simulation time step at iteration #i, t1  t2 
 tk
 Non-adaptive flow time complexity: O(Tk / t1 )
2.5
Non-adaptive method
Adaptive method
Relative error to Hspice (%)
2
1.5
1
0.5
Page  17
0
0
100
200
300
400
Time (sec)
500
600
700
Parallel Processing
16000
 Test case: 3D PDN
Basic DFT
Adpative DFT
14000
– ~ 0.17 million nodes
 The adaptive DFT flow has more
advantage when the number of
available processors is limited.
 Simulations between each iteration
of the adaptive flow need to be
processed serially
Simulation time (sec)
12000
10000
8000
6000
4000
2000
0
0
50
100
150
200
Number of processors
250
Simulation time with different number of processors (sec)
1 prc
4 prcs
8 prcs
16 prcs
64 prcs
128 prcs
256 prcs
Hspice
21374
NA
NA
NA
NA
NA
NA
Basic DFT
15976
6635
3225
1647
425
218
143
Adaptive DFT
4947
2096
904
490
180
120
115
Page  18
300
Parallel Processing: DFT Flow vs. SPICE
DFT error compared to Hspice
Simulation result
8
x 10
-3
3
Adaptive DFT
Hspice
7
x 10
-5
2
6
1
Voltage (V)
Voltage (V)
5
4
3
0
-1
2
-2
1
-3
0
-1
0
Page  19
0.5
1
1.5
Time (sec)
2
2.5
x 10
-8
-4
0
0.5
1
1.5
Time (sec)
2
2.5
x 10
-8
Outline
 Motivation
 Adaptive simulation flow using discrete Fourier transform (DFT)
– Basic DFT flow
– Problems with the basic DFT flow
– Adaptive DFT flow
 Experimental results
– Error analysis
– Time efficiency of the adaptive flow
– Time efficiency of parallel processing
 Conclusions
Page  20
Conclusions
 Implemented an adaptive flow for large PDN simulation using DFT
 Total number of simulation points is reduced significantly compared to the
basic DFT flow
 Achieved a relative error of the order of 0.1% compared to SPICE
 10x speed up with a single processor compared to SPICE.
 Parallel processing is incorporated to reduce the simulation time even
more significantly
Page  21
Page  22
Download