ASPDAC2010_Xiang

An Adaptive Parallel Flow for Power Distribution Network Simulation Using Discrete Fourier Transform Xiang Hu1, Wenbo Zhao2, Peng Du2, Amirali Shayan2, Chung-Kuan Cheng2 1ECE Dept., 2CSE Dept. University of California, San Diego 01/19/2010 Outline  Motivation  Adaptive simulation flow using discrete Fourier transform (DFT) – Basic DFT flow – Problems with the basic DFT flow – Adaptive DFT flow  Experimental results – Error analysis – Time efficiency of the adaptive flow – Time efficiency of parallel processing  Conclusions Page  2 Outline  Motivation  Adaptive simulation flow using discrete Fourier transform (DFT) – Basic DFT flow – Problems with the basic DFT flow – Adaptive DFT flow  Experimental results – Error analysis – Time efficiency of the adaptive flow – Time efficiency of parallel processing  Conclusions Page  3 PDN Simulation: Why Frequency Domain?  Huge PDN netlists – Time-domain simulation: serial - slow – Frequency-domain simulation: parallel – fast  Frequency dependent parasitics  Simulation results – Time-domain: voltage drops, simultaneous switching noise (SSN) – input dependent – Frequency-domain: impedance, anti-resonance peaks – input independent Page  4 Transform Operations: Why Discrete Fourier Transform  Laplace Transform [Wanping ’07] – Input: Series of ramp functions – Output: Rational expressing via vector fitting • Vector fitting may introduce large errors – Choice of frequency samples is case dependent  Discrete Fourier Transform (DFT) – Input: periodic signal – Inverse DFT is straightforward: vector fitting is not needed – Frequency sample points with uniform steps Page  5 Outline  Motivation  Adaptive simulation flow using discrete Fourier transform (DFT) – Basic DFT flow – Problems with the basic DFT flow – Adaptive DFT flow  Experimental results – Error analysis – Time efficiency of the adaptive flow – Time efficiency of parallel processing  Conclusions Page  6 Basic DFT Simulation Flow Page  7 Problem with Basic DFT Flow  “Wrap-around effect” requires long padding zeros at the end of the input – Periodicity nature of DFT  Small uniform time steps are needed to cover the input frequency range T x 10 8 T -3 6 x 10 -3 7 5 6 output 5 4 3 Correct 4 Voltage (V) Current (A) Large number of simulation points! 2T 3T 4T 3 2 2 1 1 0 0 T -1 0 0.2 0.4 0.6 0.8 Time (sec) 8 DFT repetition 7 6 Current (A) Current (A) 5 4 3 x 10 -7 -1 Wrap-around 0 0.5 1 1.5 2 Time (sec) -3 6 x 10 2.5 x 10 -3 -8 7 5 6 output 5 4 3 2 2 1 1 0 0 Distorted! 4 Voltage (V) 8 x 10 -3 1 x 10 3 2 1 -1 0 0.5 Page  8 1 1.5 Time (sec) 2 2.5 x 10 -8 -1 0 0 0.2 0.4 0.6 Time (sec) 0.8 1 x 10 -7 -1 0 0.5 1 1.5 Time (sec) 2 2.5 x 10 -8 Adaptive DFT Simulation x 10 4T - -3 5 Voltage (V) 4 3 2 T 6 x 10 -3 6 5 5 4 4 3 3 Voltage (V) 6 2T 3T Voltage (V) T 2 1 1 0 0 0 0 0.2 0.4 0.6 Time (sec) Correct 0.8 -1 1 x 10 -7 -1 0 0.2 0.4 0.6 Time (sec) Distorted 0.8 1 x 10 -3 2 1 -1 x 10 0 0.5 1 1.5 2 Time (sec) -7 2.5 x 10 Correct!  Basic ideas of the adaptive DFT flow: cancel out the wrap-around effect by subtracting the tail from the main part of the output – Main part of the output: obtained with small time step and small period; distorted by the wrap-around effect – Tail of the output: low frequency oscillation; can be captured with large time steps Total number of simulation points is reduced significantly! Page  9 -8 Adaptive DFT Flow  Period[i]: the input period at each iteration  Interval[i]: the simulation time step at each iteration  FreqUpBd[i]: the upper bound of the input frequency range at each iteration  vi(t): tentative time-domain output within the frequency range [0, FreqUpBd] at each iteration  Iteration #1: obtain the main part of the output  Iteration #2~k: capture the oscillations in the tail of the output (high, middle, and low resonant frequencies)  For each iteration #i, i=k, k-1, …, 2, subtract the captured tail from the outputs at iteration #j, j<i to eliminate the wrap-around effect Page  10 Outline  Motivation  Adaptive simulation flow using discrete Fourier transform (DFT) – Basic DFT flow – Problems with the basic DFT flow – Adaptive DFT flow  Experimental results – Error analysis – Time efficiency of the adaptive flow – Time efficiency of parallel processing  Conclusions Page  11 Experimental Results: Test Case & Input  Test case: 3D PDN – One resonant peak in the impedance profile  Input current – Time step: ∆t = 20ps – Duration: T0 = 16.88ns Impedance 5.5 8 5 7 4.5 6 3.5 3 2.5 4 3 2 2 1 1.5 0 1 0.5 10 Page  12 Original Input -3 5 Current (A) Impedance ( ) 4 x 10 4 10 6 Frequency (Hz) 10 8 -1 0 0.5 1 Time (sec) 1.5 2 x 10 -8 Experimental Results: Adaptive Flow Process T1 x 10 10  Iteration #1: v1(t) -3 v1(t), ∆t1=20ps, T1=20.48ns 8 – ∆t1=20ps Voltage (V) 6 4 0 0 10 Voltage (V)  Iteration #2: v2(t) m 1 -2 -4 – T1=20.48ns 3 v1 (t )   v2 (mT1 : (m  1)T1 ) 2 x 10 0.5 -3 1 1.5 2 Time (sec) 3T1 2T1 2.5 x 10 -8 4T1 · · ·8T1 v2(t), ∆t2=640ps, T2=163.84ns 5 = 640ps – T2 = 8T1 0 -5 = 163.84ns 0 10 x 10 1 2 3 -3 4 5 6 7 8 Time (sec) 9 x 10 5 – Main part: 3 v1 (t )   v2 (mT1 : (m  1)T1 ) m 1 Final output-v (t) – Tail: 1 0 -5  Final output: -8 Final output v1(t) Final output Voltage (V) – ∆t2 = 64∆t1 v2 (T1 : 8T1 ) 0 Page  13 1 2 3 4 Time (sec) 5 6 7 8 9 x 10 -8 Experimental Results: DFT Flow vs. SPICE 10 x 10 -3 DFT flow SPICE transient simulation 8 Voltage (V) 6 4 2 Relative error: 0.25% 0 -2 -4 Page  14 0 0.5 1 Time (sec) 1.5 2 x 10 -7 Error Analysis: Error Caused by Wrap-around Effect 6 x 10 -3 Output comparison 1.5 DFT flow, T=20.48ns DFT flow, T=163.84ns SPICE transient simulation 5 x 10 -4 Error relative to SPICE Difference between DFT and SPICE, T=20.48ns Difference between DFT and SPICE, T=163.84ns 1 Relative error: 0.12% 4 3 Voltage (V) Voltage (V) 0.5 2 0 -0.5 1 -1 0 Relative error: 2.09% -1 0 0.5 1 1.5 Time (sec) 2 2.5 x 10 -8 -1.5 0 0.5 1 1.5 Time (sec) 2 2.5 x 10 Theorem 1: Let be the initial value of the output voltage. Suppose for some , then the mean square error, i.e., is bounded by . Page  15 -8 Error Analysis: Error Caused by Different Interpolation Methods 6 x 10 -3 Output comparison 2 DFT flow, t=20ps DFT flow, t=2.5ps SPICE transient simulation 5 x 10 -5 Error relative to SPICE Difference between DFT and SPICE, t=20ps Difference between DFT and SPICE, t=2.5ps 1.5 1 4 Relative error: 0.12% 0.5 Voltage (V) Voltage (V) 3 2 0 -0.5 1 -1 0 -1 -1.5 0 0.2 0.4 0.6 0.8 Time (sec) x 10 Voltage (V) 1 x 10 -7 1.31 1.3 1.29 2.294 0 0.2 0.4 0.6 0.8 Time (sec)  DFT: sinusoidal interpolation DFT flow, t=20ps DFT flow, t=2.5ps SPICE transient simulation Page  16 -2  SPICE: PWL interpolation -4 1.32 1.28 Relative error: 0.09% 2.2945 2.295 Time (sec) 2.2955 2.296 x 10 -8 1 x 10 -7 Time Complexity Analysis: Adaptive vs. Non-adaptive k  Adaptive flow time complexity: O( Ti / ti ) i 1 – Ti: simulation period at iteration #i, T1  T2   Tk – ∆ti: simulation time step at iteration #i, t1  t2   tk  Non-adaptive flow time complexity: O(Tk / t1 ) 2.5 Non-adaptive method Adaptive method Relative error to Hspice (%) 2 1.5 1 0.5 Page  17 0 0 100 200 300 400 Time (sec) 500 600 700 Parallel Processing 16000  Test case: 3D PDN Basic DFT Adpative DFT 14000 – ~ 0.17 million nodes  The adaptive DFT flow has more advantage when the number of available processors is limited.  Simulations between each iteration of the adaptive flow need to be processed serially Simulation time (sec) 12000 10000 8000 6000 4000 2000 0 0 50 100 150 200 Number of processors 250 Simulation time with different number of processors (sec) 1 prc 4 prcs 8 prcs 16 prcs 64 prcs 128 prcs 256 prcs Hspice 21374 NA NA NA NA NA NA Basic DFT 15976 6635 3225 1647 425 218 143 Adaptive DFT 4947 2096 904 490 180 120 115 Page  18 300 Parallel Processing: DFT Flow vs. SPICE DFT error compared to Hspice Simulation result 8 x 10 -3 3 Adaptive DFT Hspice 7 x 10 -5 2 6 1 Voltage (V) Voltage (V) 5 4 3 0 -1 2 -2 1 -3 0 -1 0 Page  19 0.5 1 1.5 Time (sec) 2 2.5 x 10 -8 -4 0 0.5 1 1.5 Time (sec) 2 2.5 x 10 -8 Outline  Motivation  Adaptive simulation flow using discrete Fourier transform (DFT) – Basic DFT flow – Problems with the basic DFT flow – Adaptive DFT flow  Experimental results – Error analysis – Time efficiency of the adaptive flow – Time efficiency of parallel processing  Conclusions Page  20 Conclusions  Implemented an adaptive flow for large PDN simulation using DFT  Total number of simulation points is reduced significantly compared to the basic DFT flow  Achieved a relative error of the order of 0.1% compared to SPICE  10x speed up with a single processor compared to SPICE.  Parallel processing is incorporated to reduce the simulation time even more significantly Page  21 Page  22

ASPDAC2010_Xiang

Related documents

Products

Support

ASPDAC2010_Xiang

Related documents

Add this document to collection(s)

Add this document to saved

Suggest us how to improve StudyLib