Krylov Subspace-Based Dimension Reduction of Large-Scale Linear Dynamical Systems Roland W. Freund Department of Mathematics University of California, Davis, USA http://www.math.ucdavis.edu/˜freund/ Supported in part by NSF Outline • Motivation • From AWE to PVL • Descriptor systems • Preserving RCL structures • SPRIM • Using restarted Krylov subspace methods • Concluding remarks Outline • Motivation • From AWE to PVL • Descriptor systems • Preserving RCL structures • SPRIM • Using restarted Krylov subspace methods • Concluding remarks State-of-the-art VLSI circuits • 45 nm feature size • O(109) transistors • O(10) km of ‘wires’ (the interconnect) • Up to 15 layers VLSI interconnect • Wires are not ideal: Resistance Capacitance Inductance • Consequences: Timing behavior Noise Energy consumption Power distribution Interconnect now dominates Lumped-circuit paradigm • Replace ‘pieces’ of the interconnect by RCL networks: Need for dimension reduction Outline • Motivation • From AWE to PVL • Descriptor systems • Preserving RCL structures • SPRIM • Using restarted Krylov subspace methods • Concluding remarks A simple RC circuit i(t) R C v(t) • Impulse response: t 1 i(t) = 2 exp − , R C RC t≥0 • In frequency domain: 1/R2C I(s) = s + 1/RC =: H(s) , s∈C General case • Impulse response: i(t) = N X ki exp(tpi ), t≥0 i=1 • In frequency domain: N X ki , H(s) = s − p i i=1 s∈C • Simplest RC reduced-order model k̃ H1(s) := ≈ H(s) s − p̃ Simplest RC reduced-order model • Moment matching: Choose p̃ and k̃ such that H1(s) = H(s) + O s2 • Reduced circuit: Set k̃ p̃ R := − and C := 2 k̃ p̃ i(t) R v(t) • Elmore delay: τ := RC, t i(t) = i(0) exp − τ C AWE (Pillage and Rohrer, ‘90) • Transfer function of RCL network: H(s) = N X ki i=1 s − pi • Reduced-order model via approximation n X k̃i , Hn(s) = s − p̃ i i=1 where n≪N • Moment matching: Choose the k̃i’s and p˜i’s such that 2n Hn(s) = H(s) + O s • AWE generates Hn via explicit moment computations PVL (Feldmann and F., ‘94) • Based on the classical Lanczos-Padé connection • Write the transfer function in state-space form: H(s) = l T −1 I−(s−s0) A where r, A ∈ RN ×N , r, l ∈ RN • Run n steps of the Lanczos process (applied to A, starting vectors r and l) to obtain n × n tridiagonal matrix Tn • Theorem (Gragg, ‘74): The n-th Padé approximant Hn of H is given by T Hn(s) = l r eT 1 I − (s − s0) Tn where e1 is the first unit vector −1 e1 An RCL network with mostly C’s and L’s 0.014 Im s exact 0.012 PVL 60 iter. frequency range of interest Current(Amps) 0.01 poles 0.008 0.006 0.004 s0 0.002 Re s s−plane 0 0 0.5 1 1.5 2 2.5 3 Frequency (GHz) Exact and reduced-order model of size n = 60 3.5 4 4.5 5 9 x 10 The multi-input multi-output case • Matrix-valued transfer function H(s) = LH(I − (s − s0) A)−1R where A ∈ CN ×N , R ∈ CN ×m, L ∈ CN ×p • Band Lanczos process for any m and p Aliaga, Boley, F., and Hernández, ‘94, ‘96, and ‘00 F., ‘00, ‘03, and ‘09 • MPVL (Matrix-Padé Via Lanczos) algorithm (Feldmann and F., ‘95) • ‘Symmetric’ algorithm tailored to RC networks: SyMPVL (Feldmann and F., ‘97 and ‘98) Full chip SIV — statistics interconnect nets 24, 607 pins 91, 277 total capacitors 5, 413, 127 cross-coupled C’s 4, 955, 020 grounded C’s 458, 107 resistors 265, 941 potential violations 602 nets in cluster 2–7 total run time 2.5 hours SyMPVL run time 15 minutes Outline • Motivation • From AWE to PVL • Descriptor systems • Preserving RCL structures • SPRIM • Using restarted Krylov subspace methods • Concluding remarks RCL networks as descriptor systems • System of linear time-invariant DAEs of the form d C x(t) + G x(t) = B u(t) dt y(t) = BHx(t) where C, G ∈ CN ×N and B ∈ CN ×m • x(t) ∈ CN is the unknown vector of state variables • m inputs, m outputs Reduced-order models • System of DAEs of the same form: d Cn z(t) + Gn z(t) = Bn u(t) dt e (t) = BH y n z(t) • But now: Cn, Gn ∈ Cn×n where n ≪ N and Bn ∈ Cn×m Transfer functions • Original descriptor system: H(s) = BH (s C + G)−1 B • Reduced-order model: −1 Hn(s) = BH s C + G Bn ( ) n n n • ‘Good’ reduced-order model ⇐⇒ ‘Good’ approximation Hn ≈ H • Original dimension N ≈ 104−6 H(s) = BH BH n −1 s + C G • Reduced dimension n ≪ N (n ≈ 100−2) Hn(s) = BH n s Cn + Gn −1 Bn B Bn Padé approximation • Choose expansion point s0 ∈ C such that the matrix s0 C + G is nonsingular • Cn, Gn ∈ Cn×n, Bn ∈ Cn×m are such that q(n) Hn(s) = H(s) + O (s − s0) and q (n) is maximal n • q (n) ≥ 2 with equality in the ‘generic’ case m Padé-type approximation • Padé approximants have undesirable properties in general • Remedy: relax approximation property • Cn, Gn ∈ Cn×n, Bn ∈ Cn×m are such that q̃(n) Hn(s) = H(s) + O (s − s0) where q̃ (n) is no longer maximal n • Typical: q̃ (n) ≥ with equality in the ‘generic’ case m Reduction to one matrix • Transfer function: H(s) = BH (s C + G)−1 B = BH s0 C + G + (s − s0) C −1 • Set A := − (s0 C + G)−1 C and R := (s0 C + G)−1 B • Rewriting H gives H(s) = BH (I − (s − s0) A)−1 R B Krylov subspaces and Padé • Expanding about s0 gives H(s) = BH (I − (s − s0) A)−1 R = = ∞ X BH i=0 ∞ X i=0 i A R (s − s0)i i H H A B R (s − s0)i • Right and left block Krylov sequences: h R A R · · · AiR · · · i and B AH B · · · i AH B · · · Outline • Motivation • From AWE to PVL • Descriptor systems • Preserving RCL structures • SPRIM • Using restarted Krylov subspace methods • Concluding remarks Problem of structure preservation • Any RCL network is stable, passive, . . . • Reduced-order model should be stable, passive, . . . • More difficult problem: Reduced-order model of an RCL network should be synthesizable as an RCL network • Padé reduced-order models are not even stable in general! Preservation of RCL structure General RCL network equations • System of linear time-invariant DAEs of the form d C x(t) + G x(t) = B u(t) dt y(t) = BHx(t) where C C 1 = 0 0 0 0 C2 0 , 0 0 G G1 H = −G2 −GH 3 G2 G3 0 0 , 0 0 • Moreover: C0 (This implies passivity!) and G + GH 0 B B 1 = 0 0 0 0 B2 Dimension reduction via projection • PRIMA Passive Reduced Interconnect Macromodeling Algorithm (Odabasioglu, ’96; Odabasioglu, Celik, and Pileggi, ’97) • SPRIM Structure-Preserving Reduced Interconnect Macromodeling (F., ’04 and ’09) • PRIMA and SPRIM satisfy a Padé-type property: j Hn(s) = H(s) + O (s − s0) for some j = j (n) PRIMA does not preserve RCL structure • Structure of the data matrices: C 0 0 1 C = 0 C2 0 , 0 0 0 G1 G2 G3 H G = −G2 0 0 , −GH 0 0 3 B • Structure of PRIMA reduced-order matrices: Cn = , Gn = , Bn = B 1 = 0 0 0 0 B2 SPRIM does preserve RCL structure • Structure of SPRIM reduced-order matrices: C̃ 1 Cn = 0 0 0 C̃2 0 G̃1 G̃2 G̃3 0 B̃ 0 1 H 0 0 , Bn = 0 0 , Gn = −G̃2 0 0 0 B̃2 −G̃H 0 0 3 • Padé-type property: j Hn(s) = H(s) + O (s − s0) with j the same integer as for PRIMA • For SPRIM, we even have j ⇒ 2j . Why? An RCL network with mostly C’s and L’s 0 10 Exact PRIMA model SPRIM model −1 10 −2 10 −3 ( abs(Z 2,1)) 10 −4 10 −5 10 −6 10 −7 10 −8 10 0 0.5 1 1.5 2 2.5 3 Frequency (Hz) 3.5 4 4.5 5 9 x 10 Exact and models corresponding to block Krylov subspace of dimension n̂ = 120 Outline • Motivation • From AWE to PVL • Descriptor systems • Preserving RCL structures • SPRIM • Using restarted Krylov subspace methods • Concluding remarks Projection-based reduction • Let Vn ∈ CN ×n be any matrix with full column rank n • Use Vn to explicitly project the data matrices of d C x(t) + G x(t) = B u(t) dt y(t) = BHx(t) onto the subspace spanned by the columns of Vn Projection-based reduction, continued • Resulting reduced-order model where d Cn z(t) + Gn z(t) = Bn u(t) dt e (t) = BH y n z(t) Cn = VnH C Vn, Gn = VnH G Vn, Bn = VnH B • Passivity is preserved: C 0, G + GH 0 ⇒ Cn 0, Gn + GH n 0 Projection + Krylov • Choose an expansion point s0 ∈ C and re-write the original transfer function: H(s) = BH (s C + G)−1 B = BH (I − (s − s0) A)−1 R where −1 A := − s0 C + G C and −1 R := s0 C + G • Block Krylov sequence: R, AR, A2R, . . . , AiR, . . . B Projection + Krylov, continued • n̂-th block Krylov subspace: h Kn̂ (A, R) := colspann̂ R AR A2R · · · i • Choose the projection matrix Vn such that Kn̂(A, R) ⊆ Range Vn • Projection + Krylov subspace = Padé-type approximant: j Hn(s) = H(s) + O (s − s0) , where j ≥ ⌊n̂/m⌋ SPRIM • Let V̂n̂ be any matrix such that Kn̂(A, R) = Range V̂n̂ • Recall: C 0 0 1 , C= 0 C 0 2 0 0 0 G1 G2 G3 H , G= − G 0 0 2 −GH 0 0 3 B B 1 = 0 0 0 0 B2 SPRIM, continued • Partition V̂n̂ accordingly: (1) V n̂ (2) V̂n̂ = Vn̂ (3) Vn̂ • For l = 1, 2, 3: (i) (i) If Rank Vn̂ < n̂, replace Vn̂ by matrix of full column rank SPRIM, continued • Set Vn = Vn(1) 0 0 0 0 Vn(2) 0 0 Vn(3) • Block structure is preserved: C̃1 Cn = 0 0 0 C̃2 0 G̃1 G̃2 G̃3 B̃1 0 0 H , B = , G = 0 0 − G̃ 0 0 0 n n 2 0 B̃2 0 −G̃H 0 0 3 • Kn̂ (A, R) = Range Vn̂ ⊆ Range Vn ⇒ Padé-type property! An RCL network with mostly C’s and L’s 5 10 Exact PRIMA model 4 10 SPRIM model 3 abs(H 1,1 ) 10 2 10 1 10 0 10 −1 10 1.5 2 2.5 3 3.5 Frequency (Hz) 4 4.5 5 5.5 9 x 10 Exact and models corresponding to n̂ = 90 An RCL network with mostly C’s and L’s 1 10 0 abs(H 1,3 ) 10 −1 10 −2 10 Exact PRIMA model SPRIM model −3 10 1.5 2 2.5 3 3.5 Frequency (Hz) 4 4.5 5 5.5 9 x 10 Exact and models corresponding to n̂ = 90 A package example 4 10 Exact PRIMA model 3 SPRIM model abs(H8,1) 10 2 10 1 10 0 10 9 10 10 10 Frequency (Hz) 11 10 Exact and models corresponding to n̂ = 128 A package example 0 10 Exact PRIMA model SPRIM model −1 abs(H9,9) 10 −2 10 −3 10 9 10 10 10 Frequency (Hz) 11 10 Exact and models corresponding to n̂ = 128 Padé-type property • So far, we only know that both PRIMA and SPRIM produce Padé-type reduced-order models with Hn(s) = H(s) + O ((s − s0)q ) , where q ≥ ⌊n̂/m⌋ • Can we say more in the case of SPRIM? (3) • Easy in the case of no third subblock Vn (F. ’05) • General case: J-Hermitian linear dynamical systems (F. ’08) J-Hermitian systems • Recall: C d x(t) + G x(t) = B u(t) dt y(t) = BHx(t) where C 0 0 1 , C= 0 C 0 2 0 0 0 G1 G2 G3 H , − G G= 0 0 2 −GH 0 0 3 B B 1 = 0 0 0 0 B2 • C and G are J-Hermitian: J C = CH J and J G = GHJ, where J I 0 0 0 := 0 − I 0 0 −I J-Hermitian systems, continued • The input-output matrix B satisfies Range(J B) = Range(B) Jn-Hermitian property of SPRIM models • The SPRIM models Cn d z(t) + Gn z(t) = Bn u(t) dt y(t) = BH n z(t) preserve the structure of Cn, Gn, Bn • Therefore, Cn and Gn are Jn-Hermitian with I 0 Jn := 0 − I 0 0 0 0 −I and Range(Jn Bn) = Range(Bn) • Moreover, the projection matrix Vn satisfies J Vn = Vn Jn Padé-type property • Theorem (F., ’08) For J-Hermitian systems and real expansion points s0, the n-th SPRIM model is Jn-Hermitian and satisfies q̃ Hn(s) = H(s) + O (s − s0) • Twice as accurate as PRIMA! , where q̃ ≥ 2 ⌊n̂/m⌋ Outline • Motivation • From AWE to PVL • Descriptor systems • Preserving RCL structures • SPRIM • Using restarted Krylov subspace methods • Concluding remarks Using restarts (with Efrem Rensi) • To obtain a Padé-type property, we need to generate a matrix V̂n̂ such that Kn̂(A, R) = Range V̂n̂ • Use suitable variant of the Arnoldi process • But: prohibitive for large n̂ • Remedy: (thick) restarts Using restarts, continued • Motivated by recent work by Eiermann et al. • Restart after each cycle of r Arnoldi steps • Extract ‘good’ eigenvector information Y from the last batch of r Arnoldi vectors • Use the columns of Y as the first vectors in the next cycle • At each restart allow for changing expansion point: A(s0) = − (s0 C + G)−1 C ⇒ A(s̃0) = − (s̃0 C + G)−1 C Single vs. multiple expansion points 4 r = 100, K = 1, l min = 0 4 10 r = 15, K = 3, l min = 5 10 actual approx (rel diff=5.837e−005) actual approx (rel diff=4.576e−005) 3 3 10 |H (s)| 2 10 K K |H (s)| 10 1 1 10 10 0 10 8 10 2 10 0 9 10 frq Single point — no restarts n = 100 10 10 10 8 10 9 10 frq 3 points — thick restarts n = 45 10 10 Outline • Motivation • From AWE to PVL • Descriptor systems • Preserving RCL structures • SPRIM • Using restarted Krylov subspace methods • Concluding remarks Concluding remarks • Practical use of Krylov subspace-based dimension reduction was motivated by need to handle large-scale RCL networks • Lead to the development of new Krylov subspace methods • How to avoid the need to store N × n dense matrix in projection methods? • Krylov subspace methods with thick restarts? • Use with multiple expansion points?