Building blocks for fast circuit simulation

Aalto University publication series
DOCTORAL DISSERTATIONS 174/2012
Building blocks for fast circuit
simulation
Mikko Honkala
Doctoral dissertation for the degree of Doctor of Science in
Technology to be presented with due permission of the Aalto
University School of Electrical Engineering for public examination
and debate in Auditorium S1 at the Aalto University School of
Electrical Engineering (Otakaari 5, Espoo, Finland) on the 18th of
January, 2013, at 12 noon.
Aalto University
School of Electrical Engineering
Department of Radio Science and Engineering
Supervising professor
Professor Martti Valtonen
Thesis advisor
D.Sc. (Tech.) Janne Roos
Preliminary examiners
Associate professor Gabriela Ciuprina, Polytechnic University of
Bucharest, Romania
Professor Timo Rahkonen, University of Oulu, Finland
Opponent
Prof.dr. Wil H.A. Schilders, Technische Universiteit Eindhoven, The
Netherlands
Aalto University publication series
DOCTORAL DISSERTATIONS 174/2012
© Mikko Honkala
ISBN 978-952-60-4922-9 (printed)
ISBN 978-952-60-4923-6 (pdf)
ISSN-L 1799-4934
ISSN 1799-4934 (printed)
ISSN 1799-4942 (pdf)
http://urn.fi/URN:ISBN:978-952-60-4923-6
Unigrafia Oy
Helsinki 2012
Finland
Abstract
Aalto University, P.O. Box 11000, FI-00076 Aalto www.aalto.fi
Author
Mikko Honkala
Name of the doctoral dissertation
Building blocks for fast circuit simulation
Publisher School of Electrical Engineering
Unit Department of Radio Science and Engineering
Series Aalto University publication series DOCTORAL DISSERTATIONS 174/2012
Field of research Circuit theory
Manuscript submitted 8 June 2012
Date of the defence 18 January 2013
Permission to publish granted (date) 30 August 2012
Language English
Monograph
Article dissertation (summary + original articles)
Abstract
Modern electronic circuits are typically large, consisting of thousands of transistors and other
components. During the design process, there is a need to perform computationally demanding
numerical simulations to verify the functionality of the circuit. Thus, the need for fast and
accurate circuit simulation tools is obvious.
Four approaches to improve the speed and the convergence of the numerical circuit
simulation are presented.
The first approach utilizes efficient iteration methods for nonlinear DC analysis.
Newton–Raphson (NR) iteration is the most used nonlinear iteration method for nonlinear
circuit equations, but it lacks good global convergence properties. Some new variants of
nonlinear iteration methods are proposed to improve the convergence of DC analysis.
In the second approach, the computing time is reduced by using parallel processing.
Parallelization of harmonic balance (HB) analysis using multithreads is studied. Also, the
modified multilevel NR method that has improved convergence properties is presented.
The third approach concentrates on improving the convergence of iterative solvers for linear
systems using preconditioners. The emphasis is on the preconditioning of Jacobians of the HB
method. It is shown how to use time-domain preconditioners with frequency-domain
preconditioners in order to benefit from both.
The fourth approach to speed up the circuit simulation is to use model-order reduction
(MOR), where the idea is to approximate complex circuit models with simpler ones. This thesis
concentrates on MOR methods for linear circuits or the linear parts of nonlinear circuits.
Efficient partitioning-based MOR methods and a new global approach to projection-based
MOR are proposed.
Keywords Circuit simulation, numerical analysis, parallel processing, iterative methods,
model-order reduction, preconditioners
ISBN (printed) 978-952-60-4922-9
ISBN (pdf) 978-952-60-4923-6
ISSN-L 1799-4934
ISSN (printed) 1799-4934
ISSN (pdf) 1799-4942
Location of publisher Espoo
Pages 157
Location of printing Helsinki
Year 2012
urn http://urn.fi/URN:ISBN:978-952-60-4923-6
Tiivistelmä (Finnish abstract)
Aalto University, P.O. Box 11000, FI-00076 Aalto www.aalto.fi
Author
Mikko Honkala
Name of the doctoral dissertation
Nopean piirisimuloinnin rakennuspalikoita (Building blocks for fast circuit simulation)
Publisher School of Electrical Engineering
Unit Department of Radio Science and Engineering
Series Aalto University publication series DOCTORAL DISSERTATIONS 174/2012
Field of research Circuit theory
Manuscript submitted 8 June 2012
Permission to publish granted (date) 30 August 2012
Monograph
Date of the defence 18 January 2013
Language English
Article dissertation (summary + original articles)
Abstract
Modern electronic circuits are typically large, made up of thousands of transistors.
During the design process, their operation has to be verified with computationally
demanding simulations. Hence, there is a need for fast and accurate simulation tools.
This dissertation examines four different approaches to speeding up numerical
circuit simulation.
The first approach studies efficient iteration methods for solving nonlinear
circuit equations. The Newton–Raphson (NR) method is a commonly used
iteration method for solving nonlinear DC circuit equations. Its drawback
is the lack of global convergence properties. This dissertation presents some
new iteration methods for improving the convergence of DC analysis.
In the second approach, circuit simulation is accelerated with parallel computing.
The dissertation covers the parallelization of the harmonic balance (HB) method
using threads. In addition, a multilevel NR method suited to parallel computing is
presented, in which particular attention has been paid to aiding convergence.
The third approach concentrates on preconditioners for the iterative methods used to
solve linear systems of equations. Frequency-domain preconditioners are normally used
with HB equations, but this dissertation shows how frequency-domain preconditioners can
be combined with time-domain preconditioners so that the good properties of both
become available.
The fourth approach uses model-order reduction. Its idea is to reduce a large
circuit model to a smaller one while keeping the accuracy sufficient. This dissertation
concentrates on the model-order reduction of linear circuits and presents
partitioning-based methods as well as a new method based on global approximation.
Keywords Circuit simulation, numerical analysis, parallel processing, iterative
methods, model-order reduction, preconditioners
ISBN (printed) 978-952-60-4922-9
ISBN (pdf) 978-952-60-4923-6
ISSN-L 1799-4934
ISSN (printed) 1799-4934
ISSN (pdf) 1799-4942
Location of publisher Espoo
Pages 157
Location of printing Helsinki
Year 2012
urn http://urn.fi/URN:ISBN:978-952-60-4923-6
Preface
Starting from 1999 I have worked on many industrial projects (SYANIDE,
ARFSIM, MOSAICS, AMAZE, STONGA) mainly in the development of
APLAC’s analysis methods. Also, from 2005 to 2007, we had a joint project
NETMOR with NEC Europe on model-order reduction. In 2009–2010 I
also worked with the EU project ICESTARS. This thesis comprises the
collection of the publications that resulted from these projects.
I would especially like to thank my supervisor prof. Martti Valtonen
and my instructor D.Sc. (Tech.) Janne Roos for many reasons and Ville
Karanko and Jarmo Virtanen for co-operating in the APLAC projects. I
also wish to thank Pekka Miettinen, in particular, but also Dr. Achim
Basermann and Carsten Neff for fruitful co-operation in MOR research.
I am very grateful to Sakari Aaltonen for proof-reading my articles and
to Luis Costa for proof-reading the overview of this thesis. Many other
current and former members of the circuit theory group have, at least indirectly, influenced my thesis: Mikko Hulkkonen, Tuomo Kujanpää, Anu
Lehtovuori, Vesa Linja-aho, Timo Palenius, Dr. Neslihan Şengör (visiting scientist), D.Sc. (Tech.) Kimmo Silvonen, Taisto Tinttunen, Tuukka
Tuomisto, and D.Sc. (Tech.) Timo Veijola, to mention some of the most
important. Thank you. And, of course, the warmest thanks go to my
wife Sanna for her constant support.
The Jenny and Antti Wihuri Foundation and the Nokia Foundation have
partially funded this thesis.
Espoo, November 19, 2012,
Mikko Honkala
Contents
Preface
Contents
List of Publications
Author’s Contribution
1. Introduction
  1.1 Background
  1.2 Scope of study
2. Numerical circuit-analysis methods
  2.1 Circuit equations
  2.2 DC analysis
    2.2.1 Aiding the convergence
  2.3 Transient analysis
  2.4 Harmonic balance analysis
    2.4.1 Frequency selective harmonic balance analysis
    2.4.2 Preconditioning
3. Iterative methods for nonlinear equations
  3.1 Equation formulation
  3.2 Line-search methods
  3.3 Trust-region methods
  3.4 Dogleg
  3.5 Tensor methods
  3.6 Multilevel Newton–Raphson
4. Parallel processing in circuit simulation
5. Model-order reduction
  5.1 Overview
  5.2 PRIMA
    5.2.1 Equation formulation
    5.2.2 PRIMA algorithm
    5.2.3 Eigenvalue decomposition
    5.2.4 Macromodel synthesis by Matsumoto’s method
  5.3 Liao–Dai method
    5.3.1 T model
    5.3.2 Π model
    5.3.3 Circuit model of a port
6. Discussion
7. Summary of the publications
  7.1 Publication I: Nonmonotone norm-reduction method for circuit simulation
  7.2 Publication II: On nonlinear iteration methods for DC analysis of industrial circuits
  7.3 Publication III: New multilevel Newton–Raphson method for parallel circuit simulation
  7.4 Publication IV: A parallel harmonic balance simulator for shared memory multicomputers
  7.5 Publication V: Mixed preconditioners for harmonic balance Jacobians
  7.6 Publication VI: Frequency/time block preconditioners for harmonic balance Jacobians
  7.7 Publication VII: Study and development of an efficient RC-in–RC-out MOR method
  7.8 Publication VIII: Hierarchical model-order reduction flow
  7.9 Publication IX: GABOR: Global-approximation-based order reduction
  7.10 Publication X: PartMOR: Partitioning-based realizable model-order reduction method for RLC circuits
  References
Bibliography
Errata
Publications
List of Publications
This thesis consists of an overview and of the following publications which
are referred to in the text by their Roman numerals.
I M. Honkala. Nonmonotone norm-reduction method for circuit simulation. Electronics Letters, vol. 38, pp. 1316–1317, Oct. 2002.
II M. Honkala, J. Roos, and V. Karanko. On nonlinear iteration methods for DC analysis of industrial circuits. Mathematics in Industry 8: Progress in Industrial Mathematics at ECMI 2004, (A. D. Bucchianico, R. M. M. Mattheij, and M. A. Peletier, eds.), pp. 144–148, 2006.
III M. Honkala, J. Roos, and M. Valtonen. New multilevel Newton–Raphson method for parallel circuit simulation. Proceedings of European Conference on Circuit Theory and Design, vol. II, pp. 113–116, Aug. 2001.
IV V. Karanko and M. Honkala. A parallel harmonic balance simulator for shared memory multicomputers. Proceedings of the 34th European Microwave Conference, pp. 849–851, 2004.
V M. Honkala and V. Karanko. Mixed preconditioners for harmonic balance Jacobians. International Journal of RF and Microwave Computer-Aided Engineering, vol. 19, no. 2, pp. 211–217, 2009.
VI M. Honkala, V. Karanko, J. Roos, and M. Valtonen. Frequency/time block preconditioners for harmonic balance Jacobians. Proceedings of European Conference on Circuit Theory and Design, pp. 607–610, Aug. 2009.
VII P. Miettinen, M. Honkala, J. Roos, C. Neff, and A. Basermann. Study and development of an efficient RC-in–RC-out MOR method. Proceedings of the 15th IEEE International Conference on Electronics, Circuits and Systems, pp. 1277–1280, Aug. 2008.
VIII M. Honkala, P. Miettinen, J. Roos, and C. Neff. Hierarchical model-order reduction flow. Mathematics in Industry 14: Scientific Computing in Electrical Engineering SCEE 2008, (J. Roos and L. R. J. Costa, eds.), pp. 539–546, 2010.
IX J. Roos, M. Honkala, and P. Miettinen. GABOR: global-approximation-based order reduction. Mathematics in Industry 14: Scientific Computing in Electrical Engineering SCEE 2008, (J. Roos and L. R. J. Costa, eds.), pp. 517–514, 2010.
X P. Miettinen, M. Honkala, J. Roos, and M. Valtonen. PartMOR: partitioning-based realizable model-order reduction method for RLC circuits. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 30, no. 3, pp. 374–387, 2011.
Author’s Contribution
Publication I: “Nonmonotone norm-reduction method for circuit
simulation”
All the work, i.e., the implementation, the testing, and the paper preparation, was done by the author.
Publication II: “On nonlinear iteration methods for DC analysis of
industrial circuits”
The author has developed, implemented, and evaluated the algorithms.
Ville Karanko worked jointly with the author to develop and implement
the basic tensor methods. D.Sc. (Tech.) Janne Roos helped with the preparation of the paper.
Publication III: “New multilevel Newton–Raphson method for parallel
circuit simulation”
The development, implementation, and evaluation of the method were done
by the author. D.Sc. (Tech.) Janne Roos helped with the preparation of
the paper.
Publication IV: “A parallel harmonic balance simulator for shared
memory multicomputers”
The major part of the implementation (about 70 %) was done by Ville
Karanko. The author contributed to the implementation of the component-function evaluations and general parallelization. Ville Karanko
was responsible for the writing of the paper.
Publication V: “Mixed preconditioners for harmonic balance
Jacobians”
The idea of mixed preconditioners was proposed by the author and the
major part of the implementation was done by the author. Ville Karanko
contributed to the paper through discussions and helped with implementations.
Publication VI: “Frequency/time block preconditioners for harmonic
balance Jacobians”
The idea of this approach was proposed by the author. The author implemented the methods, building on the foundation laid by Ville Karanko,
who also contributed to writing some parts of the paper. The tests were
performed by the author.
Publication VII: “Study and development of an efficient
RC-in–RC-out MOR method”
The author was responsible for implementing the major part of the MOR
flow, analyzing the nodal-formulation-based circuit-equation approach, and
deriving the simplified macromodel approach together with D.Sc. (Tech.)
Janne Roos. A large portion of the method implementation and the extensive simulations were done by Pekka Miettinen. Janne Roos contributed
substantially to the paper through discussions and an initial study of the
Liao–Dai and PRIMA methods. Carsten Neff helped through discussions
on the implementation.
Publication VIII: “Hierarchical model-order reduction flow”
The paper and the related additional research were inspired by the discoveries made in the test simulations performed during the development of
[PVII]. Thus, the contributions stated above for paper [PVII] apply here to
some extent. In addition, the author performed test simulations to study
hierarchical analysis with PRIMA and discussed the hierarchical method
flow at a more general level. Pekka Miettinen’s contribution to the paper was writing two preliminary sections of the manuscript, discussions,
and commenting on the text. D.Sc. (Tech.) Janne Roos helped through
discussions and by commenting on the text.
Publication IX: “GABOR: global-approximation-based order
reduction”
The main idea behind this approach as well as the proof of the GABOR
method was proposed by D.Sc. (Tech.) Janne Roos. The author’s contribution was in the implementation of the method, the test simulations, and
inventing some parts of the method, e.g. frequency shifting and scaling.
Publication X: “PartMOR: partitioning-based realizable model-order
reduction method for RLC circuits”
The original idea as well as most of the method implementation and all
the test simulations were done by Pekka Miettinen. The author contributed to the theory at various points during the development of the
method. In addition, the idea for Fig. 11 was proposed by the author.
D.Sc. (Tech.) Janne Roos helped with preparing the manuscript and with
additional discussions.
Symbols
1            identity matrix
a            auxiliary vector
A            matrix of the linear equation
A            auxiliary matrix
A_i          ith diagonal block in the BBD matrix
b            right-hand side vector
b            auxiliary vector
B            trust region
B            selector matrix
B̃            reduced-order selector matrix
B_i          ith upper off-diagonal block in the BBD matrix
c            capacitance
C            nodal capacitance matrix
C̃            nodal capacitance matrix in the time domain
C̃            reduced-order capacitance matrix
C_i          ith lower off-diagonal block in the BBD matrix
D            diagonal block in the BBD matrix
e            voltage source
E            incidence matrix
E            auxiliary matrix
E            part of modified nodal matrix
f            function
f            frequency
F            function
g            conductance
g            gradient vector
G            conductance
G            nodal conductance matrix
G̃            nodal conductance matrix in the time domain
G̃            reduced-order resistance matrix
H            Hessian matrix
H            inductor stamp matrix
H            auxiliary matrix
i            index
i            current
i            vector of currents in the time domain
i_D          diode current
i_p          vector of port currents
I            vector of currents in the frequency domain
j            index
J            Jacobian matrix
J̃            Jacobian matrix in the time domain
k            index
K            number of block moments
K            preconditioner
L            inductance
L            linear model
L            inductance matrix
L            selector matrix
L̃            reduced-order selector matrix
m            moment
m            quadratic model
M            moment matrix
n            number of unknowns
N            number of harmonic frequencies
N, N_x, N_y  number of ports
N            resistor stamp matrix
q            order of reduction
q_c          number of complex poles
q_r          number of real poles
q            vector of charges
Q            capacitor stamp matrix
r            residual
R            resistance
R            auxiliary matrix
S            scattering parameter matrix
S            eigenvector matrix
t            time
T            tensor
u            vector of excitation currents
u_p          vector of port voltages
U            vector of excitation currents
v            nodal voltage
v            vector of voltages
x            unknown
x̃            state variable
x            vector of unknowns
x∗           solution vector
X            vector of unknowns in frequency domain
X            projection matrix
y            y parameter
Y            admittance matrix
Z_0          reference impedance
α            damping factor
α_k          weight of the time-domain difference operator
β            contracting factor
Γ            DFT matrix
δ            radius of trust region
∆            auxiliary matrix
∆x           update of the vector of unknowns
∆x_DL        dogleg update
∆x_NR        Newton–Raphson update
∆x_SD        steepest-descent update
             error limit
λ            damping factor
Λ            diagonal matrix containing the eigenvalues
ρ            gain factor
τ            maximum error
Ω            frequency matrix
ω            angular velocity
Abbreviations
2D        2 dimensional
AC        alternating current
APLAC     circuit simulator
AWE       asymptotic waveform evaluation
BBD       bordered-block diagonal
BE        backward Euler
CG        conjugate gradient
CP        Cauchy point
CPU       central processing unit
DAE       differential algebraic equations
DC        direct current
DFT       discrete Fourier transform
DL        dog leg
FD        frequency domain
FE        forward Euler
FFT       fast Fourier transform
FGMRES    flexible generalized minimal residual
FSHB      frequency selective harmonic balance
GABOR     global-approximation-based order reduction
GMRES     generalized minimal residual
GPU       graphics processing unit
HB        harmonic balance
HMOR      hierarchical model-order reduction
IDFT      inverse discrete Fourier transform
IFFT      inverse fast Fourier transform
LU        lower upper
MEMS      micro-electro-mechanical systems
MIMD      multiple instruction stream / multiple data stream
MLNA      multilevel Newton analysis
MLNR      multilevel Newton–Raphson
MNA       modified nodal analysis
MOR       model-order reduction
MPI       message-passing interface
MRHB      multirate harmonic balance
NOW       network of workstations
NR        Newton–Raphson
PartMOR   partitioning-based model-order reduction
PRIMA     passive reduced-order interconnect macromodeling algorithm
PVL       Padé via Lanczos
PVM       parallel virtual machine
RF        radio frequency
ROM       reduced-order model
SAPOR     second-order Arnoldi method for passive order reduction
SD        steepest descent
SIMD      single instruction / multiple data
SPICE     a well-known circuit simulator
SPRIM     structure-preserving reduced-order interconnect macromodeling algorithm
SVD       singular value decomposition
TBR       truncated balanced realization
TD        time domain
1. Introduction
1.1 Background
Modern electronic circuits are typically large, consisting of thousands of
transistors and other components. During the design process, there is
a need to perform computationally demanding numerical simulations to
verify the functionality of the circuit. Thus, the need for fast and accurate
circuit simulation tools is obvious.
A circuit simulator is a tool that analyzes (simulates) the behavior of
electrical circuits using numerical, or in some cases symbolic, algorithms.
The circuit to be simulated is constructed from components that can be
described with a mathematical model. In other words, these mathematical circuit models are interconnected and combined into a large system
of equations to be solved by the simulation algorithms. There are numerous tools for simulating analog electronics, such as SPICE [1] and its many
derivatives. Also, tools for digital circuits and mixed-mode simulators for
mixed analog and digital design exist [2]. For RF and microwave circuits,
there are sophisticated tools like APLAC [3, 4].
Competent circuit simulation is based on accurate circuit models
and efficient simulation algorithms. High-quality numerical analysis
methods are both fast and robust; e.g., the iterative equation-solving algorithms used by the simulation methods should converge under all circumstances. The obtained solution should be as accurate as the models permit.
However, sometimes less accurate but faster-to-evaluate circuit models
are needed, e.g., in the timing analysis of digital circuits. Also, memory
consumption can sometimes be the bottleneck. This leads to model-order
reduction (MOR) techniques, which have been heavily studied in the past
decades. In addition to improvements in the numerical algorithms, parallel processing (or concurrent programming) has become a standard approach to improve the efficiency of numerical computation, also in circuit
simulation.
1.2 Scope of study
The general topic of this thesis is numerical circuit-simulation methods
for analog and RF/microwave circuits. Four approaches to improve the
speed and the convergence of the numerical circuit simulation are presented. A very brief overview of the novel contributions of the publications [PI–PX] is presented in the following. A more detailed description of
each publication is presented in Section 7.
The first approach utilizes efficient iteration methods for nonlinear DC
analysis. When choosing the nonlinear iteration method for a circuit simulator, the special properties of both the circuit equations and the circuit
simulator have to be taken into account. Newton–Raphson (NR) iteration is
the most used nonlinear iteration method for nonlinear circuit equations,
but it lacks good global convergence properties. A nonmonotone line-search strategy for NR iteration is studied in [PI], and some new (variants
of) iteration methods for nonlinear equations are presented in [PII].
In the second approach, the computing time is reduced by using parallel
processing, which requires parallel hardware. Traditionally, parallel processing has been performed on supercomputers with
multiple processors, but these computers are usually very expensive. Recently, multi-core processors have become available at a low price. These
multi-core architectures can execute several threads concurrently. The
parallelization of the harmonic balance (HB) method using multiple threads
is studied in [PIV]. Also, the utilization of networks of workstations (NOW)
as parallel computers is studied. In networked parallel processing,
or distributed computing, each serial (or parallel) computer is used
as a processing unit and data is transferred via a local area network, such as
Ethernet. With this approach, the communication between processors is
expensive. Multilevel iteration methods have been
proposed to minimize the communication. [PIII] proposes a variant of the
multilevel Newton–Raphson (MLNR) method with improved global convergence properties.
The third approach concentrates on improving the convergence of iterative solvers for linear systems, like generalized minimal residual (GMRES) [5], using preconditioners. In this thesis, the emphasis is on the preconditioning of Jacobians of the HB method using time-domain (TD) preconditioners instead of the typical frequency-domain (FD) ones. The main contribution of this thesis is to show how to combine the FD preconditioners
with TD preconditioners [PV, PVI] in order to benefit from both.
The fourth approach is a little different. Sometimes improvements
in simulation algorithms and hardware are not enough to improve the speed (and reduce the memory consumption) of the circuit simulation. Then, there is a need for MOR, where the idea is to simplify complex circuit models by approximating them with simpler ones. In the
best case, the reduced-order model (ROM) is small compared to
the original but still describes the original system perfectly. In practice,
the reduction process introduces some error into the ROM in a trade-off for a
smaller system. There are MOR methods for linear and nonlinear circuits,
but this thesis concentrates on methods for linear circuits or the linear parts
of nonlinear circuits. [PVII, PVIII, PX] present partitioning-based
MOR for RC and RLC circuits. A new global approach to projection-based
MOR was invented and tested in [PIX].
The following highlights the new contributions:
• Application of a nonmonotone line-search (norm-reduction) method to a
nonlinear DC analysis method in [PI].
• Development of the nonmonotone trust-region dogleg method in [PII].
• Development of the nonmonotone line-search and trust-region tensor
methods in [PII].
• Comparison of different nonlinear iterative methods in DC analysis [PII].
• Development of the modified MLNR method with improved line-search
properties for DC analysis in [PIII].
• Convergence analysis of MLNR method in [PIII].
• Parallelization of HB analysis using multithreads in [PIV].
• Development of two mixed TD/FD preconditioners that combine TD and
FD preconditioners for HB Jacobians in [PV].
• Development of three TD/FD block preconditioners that divide the Jacobian into blocks that can be preconditioned with different preconditioners in [PVI].
• In [PVII], circuit-partitioning-based MOR methods for RC circuits are
developed and studied.
• The results in [PVII] are extended into the hierarchical MOR flow presented in [PVIII]. The MOR flow shows how to apply suitable reduction
methods to different parts of the linear circuit, e.g. PRIMA for RLC
circuit parts and Liao–Dai for RC circuit parts.
• A new global approach to projection-based MOR was invented and tested
in [PIX].
• A new partitioning-based low-order macromodel MOR method for RLC
circuits is proposed in [PX].
2. Numerical circuit-analysis methods
In this section, the basics of numerical circuit-analysis methods are presented briefly to provide a mathematical framework for the
rest of the thesis. Readers less familiar with numerical circuit-simulation methods can find more in the fundamental textbooks, e.g.,
[6, 7, 8, 9, 10, 11].
2.1 Circuit equations
There are many ways to formulate the circuit equations [7], e.g., nodal,
tableau, and mesh analysis, but the one most used in circuit simulation is
the modified nodal analysis (MNA) [12].
The MNA equations are formulated using Kirchhoff’s current law (the sum
of currents in a node is zero) and branch constitutive equations: e.g., a
resistor is modelled using Ohm’s law as $i = v/R$, and the behavior of a
linear capacitor is described by $i = \frac{dq(v)}{dt} = C \frac{dv}{dt}$. The MNA differs from
nodal analysis in that a circuit element that has no admittance representation (e.g., voltage sources and inductances) is represented using an
additional current variable.
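The assembly described above can be illustrated with a minimal sketch for a linear circuit. It assumes a three-unknown example (two node voltages and one voltage-source current) with made-up element values; the function names `stamp_conductance` and `stamp_voltage_source` are hypothetical and not taken from any particular simulator.

```python
def stamp_conductance(a, n1, n2, g):
    """Stamp a conductance g between nodes n1 and n2 (node 0 is ground)."""
    for row, col, value in ((n1, n1, g), (n2, n2, g), (n1, n2, -g), (n2, n1, -g)):
        if row and col:                      # ground rows/columns are dropped
            a[row - 1][col - 1] += value


def stamp_voltage_source(a, b, n_plus, n_minus, branch, e):
    """Stamp a voltage source e using the extra current unknown at index `branch`."""
    for node, sign in ((n_plus, 1.0), (n_minus, -1.0)):
        if node:
            a[node - 1][branch] += sign      # branch current appears in the node's KCL row
            a[branch][node - 1] += sign      # branch equation: v_plus - v_minus = e
    b[branch] = e


# Unknowns x = (v1, v2, iE): two nodal voltages plus one additional current variable.
size = 3
a = [[0.0] * size for _ in range(size)]
b = [0.0] * size
stamp_voltage_source(a, b, n_plus=1, n_minus=0, branch=2, e=5.0)  # assumed 5 V source
stamp_conductance(a, 1, 2, 0.5)    # assumed conductance between nodes 1 and 2
stamp_conductance(a, 2, 0, 0.25)   # assumed conductance from node 2 to ground
```

The voltage source contributes no admittance entry; instead it adds one row and one column, which is the extra current variable that distinguishes MNA from pure nodal analysis.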
Consider the system of nonlinear differential algebraic equations (DAEs)
\[ f(x(t), t) = \frac{dq(x(t))}{dt} + i(x(t), t) + u(t) = 0, \tag{2.1} \]
where x is the nodal vector of unknowns, i.e., nodal voltages and currents
of elements that have no admittance representation and that are required
as part of the solution, q is the nonlinear charge vector, i is the nonlinear
function of nodal currents, and u is the excitation (current) vector.
Figure 2.1. Example circuit.
For example, the MNA equations for the circuit in Fig. 2.1 are
\[
\begin{bmatrix} f_1 \\ f_2 \\ f_3 \end{bmatrix}
=
\begin{bmatrix} 0 \\ \frac{\partial q(v_2(t))}{\partial t} \\ 0 \end{bmatrix}
+
\begin{bmatrix} G(v_1(t) - v_2(t)) + i_E \\ -G(v_1(t) - v_2(t)) + i_D(v_2) \\ v_1(t) \end{bmatrix}
+
\begin{bmatrix} 0 \\ 0 \\ -e(t) \end{bmatrix}
=
\begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}.
\tag{2.2}
\]
Another way to formulate the MNA equations (the approach used in
APLAC [4]) is to use the gyrator transformation [13] and nodal analysis.
In this way, a voltage source is transformed into a gyrator and a current
source, and an inductance into a gyrator and a capacitance. Then, pure
nodal analysis can be applied to formulate the equations.
The linear equations arising from electrical circuits are commonly asymmetric and extremely sparse, and thus sparse LU factorization algorithms
[7, 14] are applied.
2.2 DC analysis
DC analysis is the basis of all circuit simulation. For example, the operating point has to be found before AC analysis, and, also, the DC solution is
the initial condition for transient analysis. Moreover, the DC characteristics themselves are sometimes of interest.
DC analysis solves the steady-state behavior of the circuit variables under the DC excitation. By setting $dq/dt = 0$, Eq. (2.1) reduces to the nonlinear algebraic circuit equation
\[ f(x) = 0, \tag{2.3} \]
where x is solved iteratively using the NR method (see Section 3.2).
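As an illustration of solving f(x) = 0 with plain NR iteration, the following sketch treats a source–conductance–diode circuit with unknowns x = (v1, v2, iE) at DC. The element values, the Shockley diode model, and all function names are assumptions made for the example; the small dense Gaussian elimination merely stands in for the sparse LU factorization a real simulator would use.

```python
import math

G, E = 1e-3, 5.0               # assumed conductance (S) and DC source voltage (V)
I_SAT, V_T = 1e-14, 0.025852   # assumed Shockley diode parameters


def residual(x):
    """DC residual f(x) for x = (v1, v2, iE), with dq/dt set to zero."""
    v1, v2, i_e = x
    i_d = I_SAT * (math.exp(v2 / V_T) - 1.0)
    return [G * (v1 - v2) + i_e, -G * (v1 - v2) + i_d, v1 - E]


def jacobian(x):
    """Jacobian of the residual; only the diode conductance depends on x."""
    g_d = I_SAT / V_T * math.exp(x[1] / V_T)
    return [[G, -G, 1.0], [-G, G + g_d, 0.0], [1.0, 0.0, 0.0]]


def solve(a, b):
    """Dense Gaussian elimination with partial pivoting (stand-in for sparse LU)."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def dc_analysis(x0, tol=1e-12, max_iter=100):
    """Plain NR iteration: solve J(x) dx = -f(x) and update x until |f| < tol."""
    x = list(x0)
    for iteration in range(max_iter):
        f = residual(x)
        if max(abs(fi) for fi in f) < tol:
            return x, iteration
        dx = solve(jacobian(x), [-fi for fi in f])
        x = [xi + dxi for xi, dxi in zip(x, dx)]
    raise RuntimeError("NR iteration did not converge")


x_dc, n_iter = dc_analysis([0.0, 0.6, 0.0])
```

From this starting point the first full NR step overshoots the diode voltage, after which each step shrinks it by only about one thermal voltage, so the undamped iteration needs tens of iterations before quadratic convergence sets in.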
2.2.1 Aiding the convergence
The local convergence of NR iteration is quadratic, but it has no global
convergence properties. If the NR iteration lacks a good initial guess
close enough to the solution, convergence is not guaranteed. Therefore,
convergence-aiding methods are needed.
There are several approaches to aid convergence. Using homotopy and
continuation methods [15, 16, 17, 18] is one way to improve the convergence of DC analysis and even to find multiple DC solutions. Another way
to help the convergence is to use line-search to damp iterations [19, 20]
such that the norm of the objective function reduces in every iteration.
Thus the term norm reduction is sometimes used for this strategy. A third
approach is to use totally different solution algorithms, e.g. piecewise linear analysis [21]. More details on different nonlinear iterative solvers are
found in Section 3.
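The norm-reduction strategy can be illustrated with a minimal sketch. It assumes a hypothetical one-node circuit (a voltage source E in series with a resistor R feeding a diode to ground); the component values, the function names, and the simple step-halving rule are illustrative assumptions and are not taken from the cited references.

```python
import math

def f(v, E=5.0, R=1e3, Is=1e-14, Vt=0.02585):
    """KCL at the diode node: resistor current plus diode current."""
    return (v - E) / R + Is * (math.exp(v / Vt) - 1.0)

def dfdv(v, R=1e3, Is=1e-14, Vt=0.02585):
    """Derivative of f, i.e. the 1x1 Jacobian."""
    return 1.0 / R + Is / Vt * math.exp(v / Vt)

def damped_newton(v0, tol=1e-12, max_iter=200):
    """NR iteration whose step is damped until |f| decreases."""
    v = v0
    for k in range(max_iter):
        fv = f(v)
        if abs(fv) < tol:
            return v, k
        dv = -fv / dfdv(v)          # full NR update
        lam = 1.0
        # Halve the damping factor until the residual norm is reduced.
        while abs(f(v + lam * dv)) >= abs(fv) and lam > 1e-10:
            lam *= 0.5
        v += lam * dv
    raise RuntimeError("no convergence")

v_dc, iterations = damped_newton(0.0)
```

Starting from v = 0, the undamped NR step would land at v = E, where the exponential diode current is enormous; the halving loop rejects such steps and keeps the iterate in the region where the linearization is useful.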
2.3 Transient analysis
Transient analysis is the computation of a time-domain transient response
of (nonlinear) DAEs using numerical integration methods. The analysis
is an initial value problem, where the initial values are typically obtained from
the DC solution.
DAEs are discretized with methods such as forward Euler (FE),
backward Euler (BE), or the trapezoidal rule, into a set of nonlinear
equations that are solved similarly to DC analysis (using NR iteration).
By using the FE formulation, in which i and u are evaluated at the previous time point, Eq. (2.1) transforms to
\[ \frac{q(x_{k+1}) - q(x_k)}{t_{k+1} - t_k} + i(x_k, t_k) + u(t_k) = 0, \tag{2.4} \]
and by using the BE formulation, in which i and u are evaluated at the new time point, to
\[ \frac{q(x_{k+1}) - q(x_k)}{t_{k+1} - t_k} + i(x_{k+1}, t_{k+1}) + u(t_{k+1}) = 0, \tag{2.5} \]
where k is the time index. Starting from the initial condition $x_0$, which
typically is the DC solution, and using (variable) time stepping, the time-domain transient response can be computed. At each time point, the nonlinear algebraic equation has to be solved, typically using NR iteration.
The time steps are selected such that the truncation error satisfies a prescribed error bound. BE and FE are first-order methods and the trapezoidal
integration rule is a second-order method. Many other integration methods
can be applied, e.g. Runge–Kutta methods [22].
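As an illustration of implicit time stepping with an NR solve at each time point, the following sketch applies backward Euler to a hypothetical single-node circuit with a linear capacitor (q = Cv), a nonlinear conductance i(v) = Gv + g3·v³, and a constant excitation u = −I0. All names and values are assumptions made for this example; a fixed step is used instead of truncation-error-based step control.

```python
def backward_euler(v0, h, steps, C=1e-6, G=1e-3, g3=1e-3, I0=2e-3):
    """Integrate C dv/dt + i(v) + u = 0 with fixed step h (backward Euler)."""
    def i_of(v):
        return G * v + g3 * v ** 3          # nonlinear resistive current

    def didv(v):
        return G + 3.0 * g3 * v ** 2        # its derivative

    u = -I0                                 # excitation, sign convention of Eq. (2.1)
    v, trace = v0, [v0]
    for _ in range(steps):
        x = v                               # NR initial guess: previous time point
        for _ in range(50):
            # Backward Euler residual: i and u are taken at the new time point.
            r = C * (x - v) / h + i_of(x) + u
            if abs(r) < 1e-15:
                break
            x -= r / (C / h + didv(x))      # NR update with the 1x1 Jacobian
        v = x
        trace.append(v)
    return trace

fine = backward_euler(0.0, h=1e-4, steps=200)
coarse = backward_euler(0.0, h=1e-3, steps=60)
```

With these values the steady state satisfies v + v³ = 2, i.e. v = 1. Both runs settle there, the second one with a step that exceeds the circuit time constant, which reflects the stability of the implicit formulation.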
2.4 Harmonic balance analysis
The harmonic balance (HB) [11, 23, 24] method is a frequency-domain
analysis technique for solving the periodic and quasi-periodic steady state.
It is widely used for RF and microwave circuits.
In HB analysis, the variables are presented in terms of Fourier coefficients:
\[ x(t) = \sum_{k=-N}^{N} X_k e^{j\omega_k t}, \tag{2.6} \]
where N is the number of harmonic frequencies, Xk is the kth Fourier
coefficient, and ωk the kth harmonic frequency. Since the HB equations
easily become huge, they are usually solved using the inexact Newton [25]
method, i.e. the NR method with an iterative linear solver like GMRES
[5].
There are two major ways to formulate the HB equations, namely piecewise HB [23] and nodal HB.
For simplicity, the following considers nodal equations instead of MNA.
By transforming (2.1) into the frequency domain using the (multidimensional) discrete Fourier transform (DFT) Γ, the HB equations become
\[ F(X) = I(X) + j\Omega Q(X) + U, \tag{2.7} \]
where U is the vector of excitation currents, X = Γx(t) is the nodal voltage
vector, and I = Γi(t) and Q = Γq(t) are the nonlinear nodal current and
charge vectors, respectively. Ω is the frequency domain differentiation
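matrix
\[
\Omega =
\begin{bmatrix}
\omega_{-N}\mathbf{1} & & & & \\
 & \omega_{-N+1}\mathbf{1} & & & \\
 & & \ddots & & \\
 & & & \omega_{N-1}\mathbf{1} & \\
 & & & & \omega_{N}\mathbf{1}
\end{bmatrix},
\tag{2.8}
\]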
where 1 is the identity (unity) matrix. In practice, the DFT is usually replaced with
the fast Fourier transform (FFT), which is an efficient implementation of
the DFT. Another important way to improve the efficiency of multitone
analysis is to use 1-D frequency mappings [11, 24, 26].
The whole HB analysis (using the inexact Newton method) is as follows:
ALGORITHM 1 HB(X0)
1. Set initial guess $X_0$, k = 0.
2. Inverse discrete Fourier transform (IDFT): $x(t) = \Gamma^{-1} X_k$.
3. Compute nonlinear functions i(x(t)) and q(x(t)).
4. DFT: $I = \Gamma i$ and $Q = \Gamma q$.
5. Solve new iterate for $X_{k+1}$.
6. Check convergence. If no convergence, go to step 2.
7. Solution found.
As mentioned before, the nonlinear equation is solved using inexact
Newton. It requires a Jacobian matrix
\[ J = j\Omega C + G, \tag{2.9} \]
where C and G are the nodal capacitance and conductance matrices, respectively.
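The steps of Algorithm 1 can be sketched for a hypothetical single-node, 1-tone circuit with i(v) = Gv + g3·v³, q(v) = Cv and u(t) = −amp·cos(ω0 t). The sketch is an assumption-laden illustration: it uses a plain O(M²) DFT instead of the FFT and a direct dense solve in place of an iterative solver such as GMRES, so it is an exact rather than an inexact Newton iteration. All function and parameter names are invented for the example.

```python
import cmath
import math

def dft(x):
    """Fourier coefficients X_k, k = -N..N, from M = 2N+1 uniform time samples."""
    M = len(x)
    N = M // 2
    return [sum(x[n] * cmath.exp(-2j * math.pi * k * n / M) for n in range(M)) / M
            for k in range(-N, N + 1)]

def idft(X):
    """Time samples x(t_n) = sum_k X_k exp(j*2*pi*k*n/M), cf. Eq. (2.6)."""
    M = len(X)
    N = M // 2
    return [sum(X[k + N] * cmath.exp(2j * math.pi * k * n / M) for k in range(-N, N + 1))
            for n in range(M)]

def solve(A, b):
    """Dense Gaussian elimination with partial pivoting (complex entries)."""
    n = len(b)
    A = [A[r][:] + [b[r]] for r in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        for r in range(c + 1, n):
            m = A[r][c] / A[c][c]
            for k in range(c, n + 1):
                A[r][k] -= m * A[c][k]
    x = [0j] * n
    for r in reversed(range(n)):
        x[r] = (A[r][n] - sum(A[r][k] * x[k] for k in range(r + 1, n))) / A[r][r]
    return x

def hb(N=5, w0=1.0, G=1.0, C=1.0, g3=0.1, amp=1.0, tol=1e-12):
    M = 2 * N + 1
    ks = list(range(-N, N + 1))
    # u(t) = -amp*cos(w0 t) has the coefficients U_{+1} = U_{-1} = -amp/2.
    U = [(-amp / 2 if abs(k) == 1 else 0.0) + 0j for k in ks]
    X = [0j] * M                                        # step 1: initial guess
    for it in range(30):
        x = idft(X)                                     # step 2: IDFT
        i_t = [G * v + g3 * v ** 3 for v in x]          # step 3: nonlinear functions
        q_t = [C * v for v in x]
        I, Q = dft(i_t), dft(q_t)                       # step 4: DFT
        F = [I[m] + 1j * ks[m] * w0 * Q[m] + U[m] for m in range(M)]   # Eq. (2.7)
        if max(abs(Fm) for Fm in F) < tol:              # step 6: convergence check
            return X, it
        # Jacobian of Eq. (2.9): column m is the response of F to a unit change of X_m.
        g_t = [G + 3 * g3 * v ** 2 for v in x]
        J = [[0j] * M for _ in range(M)]
        for m in range(M):
            dx = idft([1 + 0j if n == m else 0j for n in range(M)])
            dI = dft([g_t[n] * dx[n] for n in range(M)])
            dQ = dft([C * dx[n] for n in range(M)])
            for r in range(M):
                J[r][m] = dI[r] + 1j * ks[r] * w0 * dQ[r]
        dX = solve(J, [-Fm for Fm in F])                # step 5: new iterate
        X = [X[m] + dX[m] for m in range(M)]
    raise RuntimeError("HB did not converge")
```

For g3 = 0 the Jacobian is diagonal and a single Newton step gives the linear phasor solution. For g3 > 0 the time-varying conductance fills in the off-diagonal entries, which is the coupling between harmonics that the conversion matrices describe.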
2.4.1 Frequency selective harmonic balance analysis
A circuit design usually has parts that can have very different
operating frequencies: e.g., the mixer parts of a circuit have 2-tone frequencies,
while a DC biasing circuit has 1-tone frequencies or just DC.
In frequency selective HB (FSHB) or multirate HB (MRHB), which is implemented in APLAC [27, 28], each circuit element may be assigned a
different frequency set. A similar approach was presented by Rizzoli [29],
where each circuit block may have a reduced set of frequencies. The
circuit is, therefore, effectively decomposed into partitions having the same
frequencies. Nodal equations are required for the union of the frequencies of
each partition the node is referenced from, i.e., the boundary node equations differ from those of ordinary nodal HB with respect to the frequency selection.
This way the number of unknowns in the HB analysis can be reduced
while preserving good accuracy in the critical parts. The node equations
are then formed for each node from the set of frequencies contributing
from the elements connected to the corresponding node.
2.4.2 Preconditioning
In order to solve the general linear equation of the form
\[ Ax = b, \tag{2.10} \]
iterative solvers (like GMRES) can be used, and they typically are used if
the linear system is huge. In the following, consider the iterative solver
GMRES, which minimizes the norm of the residual
\[ r = b - Ax. \tag{2.11} \]
The iterative linear solver needs preconditioners to function efficiently.
The goal of preconditioning is to reduce the number of iterations of GMRES (or of other
iterative solvers, or of variants of GMRES more suitable for preconditioning,
like [30, 31]) by making the problem easier for the solver:
\[ K^{-1} A x = K^{-1} b, \tag{2.12} \]
where K is a preconditioner. For complicated problems, the number of
iterations cannot be directly analyzed as a function of the preconditioner.
However, a matrix K is often a good preconditioner if [32]
1. K is a good approximation to A in some sense,
2. the cost of the construction of K is not prohibitive, and
3. the solution of the preconditioner equation requires much less computation than solving the original equation.
In the best case, when $K^{-1}A \approx I$, Eq. (2.12) becomes $x = K^{-1}b$, and
GMRES converges in one iteration (or a few iterations), which requires only
one computationally cheap inversion of the preconditioner.
The preconditioning of GMRES in HB analysis has been studied, e.g., in
[33, 34, 35, 36, 37, 38]. In the inexact Newton method, the linear equation to be
solved is
\[ J \Delta X = -F(X). \tag{2.13} \]
As HB is a frequency-domain method, a natural choice for a preconditioner is a frequency-domain preconditioner. One of the most commonly used
preconditioners is the block Jacobi preconditioner, which consists of the diagonal blocks of the Jacobian matrix when ordered in frequency-major order:
\[
K_{FD} =
\begin{bmatrix}
J_{-N} & 0 & 0 & \cdots & 0 \\
0 & J_{-N+1} & 0 & \cdots & 0 \\
0 & 0 & J_{-N+2} & \cdots & 0 \\
\vdots & & & \ddots & \vdots \\
0 & 0 & 0 & \cdots & J_{N}
\end{bmatrix}.
\tag{2.14}
\]
Using this preconditioner is equivalent to zeroing the off-diagonal terms of the conversion matrices, which explains why strongly nonlinear problems are not handled well
by it. However, the cost of inverting this block equation
is low, because each diagonal block can be LU factored separately.
While frequency domain preconditioning is effective for weakly nonlinear circuits, for highly nonlinear cases, especially for frequency dividers,
time-domain preconditioners [34, 35, 36, 37], which take nonlinear behavior better into account, become attractive.
The following considers 1-tone HB analysis only.
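In the time domain, the Jacobian is
\[ \tilde{J} = \Gamma^{-1} J \Gamma = D\tilde{C} + \tilde{G}, \tag{2.15} \]
where
\[
\tilde{G} := \Gamma^{-1} G \Gamma =
\begin{bmatrix}
g_1 & 0 & 0 & \cdots & 0 \\
0 & g_2 & 0 & \cdots & 0 \\
0 & 0 & g_3 & \cdots & 0 \\
\vdots & & & \ddots & \vdots \\
0 & 0 & 0 & \cdots & g_n
\end{bmatrix}
\tag{2.16}
\]
and, similarly,
\[
\tilde{C} := \Gamma^{-1} C \Gamma =
\begin{bmatrix}
c_1 & 0 & 0 & \cdots & 0 \\
0 & c_2 & 0 & \cdots & 0 \\
0 & 0 & c_3 & \cdots & 0 \\
\vdots & & & \ddots & \vdots \\
0 & 0 & 0 & \cdots & c_n
\end{bmatrix},
\tag{2.17}
\]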
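where $g_k$ and $c_k$ are block matrices, and $\Gamma$ is, as before, the DFT operating
on each nodal variable. The difference operator $D = \Gamma^{-1} j\Omega \Gamma$ is a matrix
having the general form
\[
D =
\begin{bmatrix}
0 & \alpha_1\mathbf{1} & \alpha_2\mathbf{1} & \cdots & \alpha_{-1}\mathbf{1} \\
\alpha_{-1}\mathbf{1} & 0 & \alpha_1\mathbf{1} & \cdots & \alpha_{-2}\mathbf{1} \\
\alpha_{-2}\mathbf{1} & \alpha_{-1}\mathbf{1} & 0 & \cdots & \alpha_{-3}\mathbf{1} \\
\vdots & & & \ddots & \vdots \\
\alpha_1\mathbf{1} & \alpha_2\mathbf{1} & \cdots & \alpha_{-1}\mathbf{1} & 0
\end{bmatrix},
\tag{2.18}
\]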
where the coefficients αk are the weights of the time-domain difference
operator.
Since for strongly nonlinear circuits the resistive nonlinearities are dominant, it is tempting to approximate the equations by considering them in
this form and further approximating the difference operator D by some
typical finite difference.
In [PV] and [PVI] mixed FD and TD preconditioners are considered.
Several approaches to mix different preconditioners are proposed.
The paper [PVI] presents block preconditioners that can be used with
FSHB. In these preconditioners, highly nonlinear 1-tone parts can be preconditioned with TD preconditioners while the other parts can be preconditioned with FD preconditioners.
3. Iterative methods for nonlinear equations
In this section, some standard iterative methods for solving nonlinear
equations are presented. The section serves as an introduction to the studies
in [PI–PIII], where these methods are evaluated and further developed in
the context of DC circuit analysis.
3.1 Equation formulation
The nonlinear algebraic circuit equations were presented in Eq. (2.3).
In DC and transient analysis x is the vector of nodal voltages and currents, but in HB analysis the vector contains Fourier coefficients. The
function values can be directly obtained from the model equations and
the derivatives either using numerical perturbation or directly from the
model equations.
In order to use more sophisticated methods, the objective function is defined as
\[ F = \tfrac{1}{2}\|f(x)\|_2^2 = \tfrac{1}{2} f(x)^T f(x). \tag{3.1} \]
The gradient then is
\[ g = \nabla F = J^T f(x), \tag{3.2} \]
where J is the Jacobian matrix. The Hessian matrix
\[ H = \nabla^2 F = \frac{\partial g}{\partial x} = \frac{\partial J^T}{\partial x} f(x) + J^T J \tag{3.3} \]
can be obtained, but it would need expensive numerical computation.
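The relation g = J^T f of Eq. (3.2) can be checked numerically. The sketch below assumes a hypothetical two-equation test system (not a circuit from this thesis) and compares the analytic gradient with a central finite difference of F.

```python
def f(x):
    """Hypothetical nonlinear test system f(x) = 0."""
    return [x[0] ** 2 + x[1] ** 2 - 4.0, x[0] * x[1] - 1.0]

def jac(x):
    """Analytic Jacobian of f."""
    return [[2.0 * x[0], 2.0 * x[1]], [x[1], x[0]]]

def F(x):
    """Objective function of Eq. (3.1)."""
    return 0.5 * sum(fi * fi for fi in f(x))

def grad(x):
    """Gradient of Eq. (3.2): g = J^T f."""
    J, fx = jac(x), f(x)
    return [sum(J[r][c] * fx[r] for r in range(2)) for c in range(2)]

def grad_fd(x, h=1e-6):
    """Central finite-difference approximation of the gradient of F."""
    g = []
    for c in range(2):
        xp, xm = x[:], x[:]
        xp[c] += h
        xm[c] -= h
        g.append((F(xp) - F(xm)) / (2.0 * h))
    return g

x = [1.5, 0.5]
g_analytic, g_numeric = grad(x), grad_fd(x)
```

At x = (1.5, 0.5) the analytic gradient is (−4.625, −1.875), and the finite-difference estimate agrees with it to within the perturbation error.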
3.2 Line-search methods
This thesis studied damped iteration methods, where the new iterate
$x_{k+1}$ at the kth iteration is
\[ x_{k+1} = x_k + \lambda_k \Delta x_k. \tag{3.4} \]
The damping factor $\lambda_k$, $0 < \lambda_k \le 1$, and the update $\Delta x_k$ depend on the
iteration method used. For example, the updates for the NR, steepest descent (SD), and conjugate gradient (CG) methods are
\[ \Delta x_k^{NR} = -J_k^{-1} f_k, \tag{3.5} \]
\[ \Delta x_k^{SD} = -J_k^T f_k = -g_k, \tag{3.6} \]
\[ \Delta x_k^{CG} = -g_k + \beta_{k-1} \Delta x_{k-1}, \tag{3.7} \]
respectively, where $\beta$ can be computed in many ways, e.g. by using the Fletcher
and Reeves formula [39]
\[ \beta_{k-1} = \frac{\|g_k\|^2}{\|g_{k-1}\|^2}. \tag{3.8} \]
Alternative formulas can be found, e.g., in [40].
The local convergence of the NR method is quadratic, i.e.
\[ \|x_{k+1} - x^*\| < K \|x_k - x^*\|^2, \tag{3.9} \]
where $x^*$ is the solution of the nonlinear equation, and K is a constant.
The nonmonotone approach to line search has been proposed in several
papers, e.g. [41, 42]. The idea is that close to the solution the NR
iteration converges, i.e. $\|x_{k+1} - x^*\| < \|x_k - x^*\|$, even if the norm of the
objective function does not decrease. If this is the case, some nonmonotonicity in the decrease of the function norm can be allowed.
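The quadratic local convergence stated in Eq. (3.9) can be observed on a scalar example. The sketch below assumes the hypothetical test equation x² − 2 = 0 with undamped NR (λ = 1), for which the error obeys e_{k+1} = e_k² / (2 x_k), so that any constant K slightly above 1/(2√2) works once the iterates are at or above the root.

```python
import math

def newton_errors(x0=2.0, iters=5):
    """Errors |x_k - x*| of undamped NR applied to x**2 - 2 = 0."""
    root = math.sqrt(2.0)
    x, errs = x0, []
    for _ in range(iters):
        errs.append(abs(x - root))
        x -= (x * x - 2.0) / (2.0 * x)     # NR update, Eq. (3.5) in one dimension
    return errs

errs = newton_errors()
# Ratios e_{k+1} / e_k**2 estimate the constant K of Eq. (3.9).
ratios = [errs[k + 1] / errs[k] ** 2 for k in range(len(errs) - 1)]
```

The number of correct digits roughly doubles at each iteration, which is the behaviour the damping and trust-region strategies try to preserve close to the solution.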
3.3 Trust-region methods
A trust region $B = \{x \mid \|x - x_k\| \le \delta\}$, where $\delta$ is the trust-region radius,
is the region where the linear or quadratic model m(x) is assumed to
approximate f(x). In the trust-region methods, the iteration step, $\Delta x_k$, is
obtained by minimizing the model within the trust region:
\[ \min_{\|\Delta x\| \le \delta} m(x_k + \Delta x). \tag{3.10} \]
The trust-region radius, $\delta$, is adaptively adjusted during the iteration.
The quality of the linear model
\[ L(\Delta x) = f(x_k) + J_k \Delta x \tag{3.11} \]
is monitored using the gain ratio
\[ \rho = \frac{F(x_k) - F(x_k + \Delta x_k)}{L(0) - L(\Delta x_k)}, \tag{3.12} \]
i.e., the ratio between the actual and predicted decrease in function value.
A large value of $\rho$ indicates that the linear model is good and the trust-region size can be increased. A small $\rho$ indicates a poor model, and a
smaller step size should be used.
There are several trust-region methods, such as dogleg [43], Levenberg–Marquardt [44, 45],
and the tensor method with a 2D trust region [46]. The nonmonotone strategy has also been reported to be used with trust-region methods [47, 48].
3.4 Dogleg
As an example of trust-region methods, dogleg (DL) is presented in the
following. The method combines the NR and SD methods as illustrated in
Fig. 3.1. If the NR step is inside the trust region, it is accepted as a trial
step. Otherwise, the point that minimizes the objective function in the
direction of SD, the Cauchy point (CP), is computed, i.e., the minimizer of
the linear model of $F(x + \alpha \Delta x^{SD})$ is
\[ \alpha = -\frac{(\Delta x^{SD})^T J^T f}{\|J \Delta x^{SD}\|^2} = \frac{\|g\|^2}{\|Jg\|^2}. \tag{3.13} \]
If the CP is outside the trust region, a damped SD step to the trust-region
boundary is taken. When the CP is inside the trust region, a step is taken
to the trust-region boundary between the CP and the NR point:
\[ \Delta x^{DL} = \alpha \Delta x^{SD} + \beta(\Delta x^{NR} - \alpha \Delta x^{SD}). \tag{3.14} \]
By defining $a := \alpha \Delta x^{SD}$, $b := \Delta x^{NR}$, and $c := a^T(b - a)$, $\beta$ is obtained from the condition $\|a + \beta(b - a)\| = \delta$ as follows:
\[
\beta =
\begin{cases}
\left(-c + \sqrt{c^2 + \|b - a\|^2(\delta^2 - \|a\|^2)}\right) / \|b - a\|^2, & \text{if } c \le 0, \\
(\delta^2 - \|a\|^2) / \left(c + \sqrt{c^2 + \|b - a\|^2(\delta^2 - \|a\|^2)}\right), & \text{if } c > 0.
\end{cases}
\tag{3.15}
\]
Figure 3.1. Dogleg step.
The whole DL algorithm is as follows:
ALGORITHM 2 DL(x0, δ0)
1. Set k = 0, $x = x_0$, and $\delta = \delta_0$.
2. Compute $g = J^T f$.
3. While $\|f(x)\| > \varepsilon$ and $\|g\| > \varepsilon$ and $k < k_{max}$
(a) Compute CP.
(b) Solve NR step $\Delta x^{NR}$.
(c) Compute $\Delta x^{DL}$.
(d) If solution found, end iteration.
(e) $x_{k+1} = x_k + \Delta x^{DL}$.
(f) $\rho = \dfrac{F(x_k) - F(x_{k+1})}{L(0) - L(\Delta x^{DL})}$.
(g) If $\rho > 0$, accept step and compute $g_k$.
(h) If $\rho > 0.75$, then $\delta = \max\{\delta, 3\|\Delta x^{DL}\|\}$.
(i) If $\rho < 0.75$, then $\delta = \delta/2$.
(j) Set k := k + 1.
4. EndWhile.
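The dogleg iteration can be sketched for a two-variable problem as follows. The test system, the function names and the tolerances are assumptions made for illustration. The sketch interprets the denominator of the gain ratio as the decrease of half the squared norm of the linear model, it uses δ² − ‖a‖² when solving ‖a + β(b − a)‖ = δ for β, and it tests only ‖f‖ in the stopping criterion.

```python
import math

def f(x):                      # hypothetical 2-D test system
    return [x[0] ** 2 + x[1] ** 2 - 4.0, x[0] - x[1]]

def jac(x):
    return [[2.0 * x[0], 2.0 * x[1]], [1.0, -1.0]]

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def norm(v):
    return math.sqrt(dot(v, v))

def matvec(J, v):
    return [dot(row, v) for row in J]

def dogleg_step(J, fx, delta):
    g = [J[0][0] * fx[0] + J[1][0] * fx[1],
         J[0][1] * fx[0] + J[1][1] * fx[1]]                 # g = J^T f
    det = J[0][0] * J[1][1] - J[0][1] * J[1][0]
    nr = [-(J[1][1] * fx[0] - J[0][1] * fx[1]) / det,
          -(J[0][0] * fx[1] - J[1][0] * fx[0]) / det]       # NR step -J^{-1} f
    if norm(nr) <= delta:
        return nr                                           # NR step inside the region
    Jg = matvec(J, g)
    alpha = dot(g, g) / dot(Jg, Jg)                         # Cauchy step length
    a = [-alpha * gi for gi in g]                           # step to the Cauchy point
    if norm(a) >= delta:
        return [-delta * gi / norm(g) for gi in g]          # damped SD step to boundary
    d = [nr[0] - a[0], nr[1] - a[1]]                        # b - a
    c = dot(a, d)
    root = math.sqrt(c * c + dot(d, d) * (delta ** 2 - dot(a, a)))
    beta = (-c + root) / dot(d, d) if c <= 0 else (delta ** 2 - dot(a, a)) / (c + root)
    return [a[0] + beta * d[0], a[1] + beta * d[1]]

def dogleg(x, delta=1.0, eps=1e-10, kmax=100):
    for k in range(kmax):
        fx, J = f(x), jac(x)
        if norm(fx) <= eps:
            return x, k
        dx = dogleg_step(J, fx, delta)
        x_new = [x[0] + dx[0], x[1] + dx[1]]
        Jdx = matvec(J, dx)
        lin = [fx[0] + Jdx[0], fx[1] + Jdx[1]]              # linear model at dx
        f_new = f(x_new)
        # Gain ratio: actual over predicted decrease (the factors 1/2 cancel).
        rho = (dot(fx, fx) - dot(f_new, f_new)) / (dot(fx, fx) - dot(lin, lin))
        if rho > 0:
            x = x_new                                       # accept the step
        if rho > 0.75:                                      # thresholds as in Algorithm 2
            delta = max(delta, 3.0 * norm(dx))
        elif rho < 0.75:
            delta /= 2.0
    raise RuntimeError("dogleg did not converge")

x_sol, n_iter = dogleg([3.0, 0.5])
```

From the starting point (3, 0.5) the first step is a genuine dogleg step on the trust-region boundary; afterwards the NR step fits inside the region and the iteration converges to (√2, √2).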
3.5 Tensor methods
Tensor methods with line search were presented in [49]. In [46], tensor
methods with 2D trust-region methods were introduced.
In these methods, the quadratic model is
\[ m(x + \Delta x) = f(x_k) + J_k \Delta x + \tfrac{1}{2} T_k \Delta x \Delta x, \tag{3.16} \]
where $T_k$ is the tensor obtained from interpolating past function values.
Although a quadratic model is used, there is no need for the Hessian matrix. The iteration update $\Delta x$ is found by minimizing $\|m(x + \Delta x)\|$.
In [PII], nonmonotone trust-region and line-search tensor methods are
introduced.
3.6 Multilevel Newton–Raphson
Multilevel iteration methods are based on the hierarchical analysis of a
partitioned circuit. For hierarchical analysis, concepts like diakoptics and
tearing were introduced in the 1970s [50, 51, 52, 53, 54]. In the
1990s, the term domain decomposition was connected to these methods [55]. In these methods, the linear or linearized circuit equations are
ordered into bordered block diagonal (BBD) form, which can be decomposed into separately solved submatrices. The equations are solved by
using hierarchical LU factorization and forward-backward substitution.
The BBD ordering of the matrix can be done even recursively on multiple
levels of hierarchy. These methods have been efficiently utilized for parallel computation in DC and transient analysis. The BBD formulation and
efficient equation solvers are used also for MOR methods [56].
Consider a circuit that has n nodes and that can be decomposed into
m subcircuits consisting of $n_i$ internal nodes and $n_{E_i}$ external connection
nodes. The nonlinear systems of nodal equations for the internal and external
nodes can be written as
\[ f_i(x_i, x_E) = 0, \qquad f_E(x_1, \ldots, x_m, x_E) = 0, \tag{3.17} \]
respectively, where $x_i$ is the internal nodal voltage vector of subcircuit i
and $x_E$ contains the voltages of the external connection nodes of the subcircuits.
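The Jacobian matrix J has the BBD form [57]:
\[
J =
\begin{bmatrix}
A_1 & & & & B_1 \\
 & A_2 & & & B_2 \\
 & & \ddots & & \vdots \\
 & & & A_m & B_m \\
C_1 & C_2 & \cdots & C_m & D
\end{bmatrix},
\tag{3.18}
\]
where
\[
A_i = \frac{\partial f_i}{\partial x_i}, \quad
B_i = \frac{\partial f_i}{\partial x_E}, \quad
C_i = \frac{\partial f_E}{\partial x_i}, \quad
D = \frac{\partial f_E}{\partial x_E}.
\tag{3.19}
\]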
The function vector $f_E$, as well as the matrix D, can be further decomposed into parts that contain the contributions of the circuit elements of
the main circuit and each subcircuit:
\[ f_E = f_{E,0} + \sum_{i=1}^{m} f_{E_i}, \tag{3.20} \]
\[ D = D_{E,0} + \sum_{i=1}^{m} D_i. \tag{3.21} \]
In the BBD formulation above, the decomposition is performed on the linear equation level, but if the circuit is partitioned before linearization,
then, on the nonlinear equation level, nonlinear analysis methods like
MLNR methods or Multilevel Newton Analysis (MLNA) [58] can be applied. These methods have been applied to DC and transient analyses to
solve the system of (discretized) nonlinear equations [59, 60, 61, 62, 63,
64, 65]. They can be used also in the HB method [66] as well as in the
simulation of micro-electro-mechanical systems (MEMS) [67, 68, 69] and
mixed circuit/device systems [70, 71]. The multilevel methods can be effectively parallelized [59, 60, 61, 63, 64, 65, 72], because they apply circuit
hierarchy in a natural way. The circuit equations as well as the linearized
equations can be processed in parallel.
One of the first MLNR methods, MLNA [58], performs the iterations on
multiple levels. Between outer iterations, the external variables are kept
constant and only the inner variables of the subcircuits are iterated:
\[ x_i^{k,j+1} = x_i^{k,j} - \left(A_i^{k,j}\right)^{-1} f_i\left(x_i^{k,j}, x_{E_i}^{k}\right), \tag{3.22} \]
where j is the inner iteration index. The inner iteration is stopped at some
error level $\tau = \min(\tau^0, \|\Delta x_E\|^2)$ ($\tau^0$ is the maximum allowed error level),
which is needed for quadratic convergence of the outer-level iteration [58].
The initial guess for the inner variables, $x_i^{k,0}$, can be the same at every
inner iteration or it may be the ending value of the previous iteration.
The main-circuit variables are iterated using subcircuits as macromodels.
The MLNA [58] is summarized in the following.
ALGORITHM 3 MLNA(circuit)
1. Set $x_E^0$, $\varepsilon$ and $\tau^0$.
2. Begin outer iteration: Set k = 0.
3. Begin inner iterations for all subsystems i (in parallel): Set j = 0 and $x_i^{k,0}$.
(a) Solve $A_i^{k,j} \Delta x_i^{k,j} = -f_i(x_i^{k,j}, x_{E_i}^{k})$.
(b) Set $x_i^{k,j+1} = x_i^{k,j} + \Delta x_i^{k,j}$.
(c) Set j = j + 1.
(d) If $\|\Delta x_i^{k,j}\| > \tau$ go to Step 3 (a).
4. End inner iteration.
5. Solve $\left(D_{E,0}^{k} + \sum_{i=1}^{m} D_{Sub,i}^{k}\right) \Delta x_E = -\left(f_{E,0}^{k} + \sum_{i=1}^{m} f_{Sub,i}^{k}\right)$.
6. $\tau = \min(\tau^0, \|\Delta x_E\|^2)$.
7. Set k = k + 1.
8. End outer iteration if $\|\Delta x_E\| < \varepsilon$.
A method that utilizes parallel computing and has improved global
convergence properties is presented in [PIII].
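The two-level structure of MLNA can be sketched on a small hypothetical circuit: a source E feeds one external node through R0, and each subcircuit i consists of a resistor R_i from the external node to one internal node that carries a cubic conductance g_i·v³ to ground. The circuit, the names and the tolerances are illustrative assumptions, the inner solves are run sequentially although they are independent, and the macromodel conductance is formed as the Schur complement D_i − C_i A_i⁻¹ B_i.

```python
def mlna(E=5.0, R0=1.0, R=(1.0, 2.0), g=(0.5, 0.2), eps=1e-10, tau0=1e-3):
    m = len(R)
    xE, xi, tau = 0.0, [0.0] * m, tau0
    for k in range(50):                          # outer iteration on the external node
        fE = (xE - E) / R0                       # main-circuit part of f_E
        DE = 1.0 / R0                            # main-circuit part of D
        for i in range(m):                       # independent inner solves
            for j in range(100):                 # inner NR with x_E kept constant
                fi = (xi[i] - xE) / R[i] + g[i] * xi[i] ** 3
                Ai = 1.0 / R[i] + 3.0 * g[i] * xi[i] ** 2
                dxi = -fi / Ai
                xi[i] += dxi
                if abs(dxi) <= tau:
                    break
            Ai = 1.0 / R[i] + 3.0 * g[i] * xi[i] ** 2
            Bi = Ci = -1.0 / R[i]                # coupling derivatives
            fE += (xE - xi[i]) / R[i]            # subcircuit contribution to f_E
            DE += 1.0 / R[i] - Ci * Bi / Ai      # macromodel (Schur complement)
        dxE = -fE / DE                           # outer NR update
        xE += dxE
        tau = min(tau0, dxE ** 2)                # tighten the inner tolerance
        if abs(dxE) < eps:
            return xE, xi, k + 1
    raise RuntimeError("MLNA did not converge")

xE, xi, outer_iterations = mlna()
```

The inner loops start from the ending values of the previous outer iteration, which is one of the two initial-guess options mentioned above.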
4. Parallel processing in circuit simulation
Traditionally, computer software has been designed for serial computation. Parallel computing, in turn, uses multiple processing elements simultaneously to perform the computation concurrently (or in parallel).
The computation problem is decomposed into independent parts so that
each processing element can execute its part of the algorithm independently. The processing elements can be a computer with multiple processors (or cores), a network of workstations (NOW), or specialized hardware like the graphics processing unit (GPU), which can also be used for
double-precision floating-point computations.
Parallel processing can be performed on hardware where single- or multiprocessor computers are connected in a network and a software
backplane is used to control the processing. This system is treated as a
single multiprocessor computer (virtual machine). This kind of computing
is called networked computing or distributed computing.
Among the many types of architectures, multiple instruction stream
/ multiple data stream (MIMD) architectures execute multiple instruction
streams in parallel on multiple data [32]. Most multiprocessor computers belong to this class. The GPU architecture is a so-called single
instruction / multiple data (SIMD) architecture, where a single instruction (e.g., addition) can be performed on multiple data simultaneously.
The memory in a parallel computer is either shared memory (shared
between all processing elements in a single address space), or distributed
memory (in which each processing element has its own local address space).
In the message-passing programming model, each processor has its own
local memory and message passing is used to deliver data between processors. In the shared-memory model, the processors have shared data [32].
There are programming packages, like PVM [73] and MPI [74], available
for message passing programming.
In the multithreaded computing model, the algorithm is designed such that
concurrent processing is performed in different threads. Threads are the
smallest units of processing that can be scheduled by an operating system
and can even be executed simultaneously in different processing elements
(typically in cores). Usually multithreading utilizes shared memory.
The history of parallel circuit simulation is long. Decades ago, supercomputers were used for parallel processing [75]. More recently, studies
on parallel circuit simulation on multicore processors have been presented,
e.g., in [76], and the GPU has also been used for the same purpose [77], along with
proposed applications [78, 79]. Multilevel iteration methods suitable for
NOWs are presented in [63, 64].
A great deal of effort has been put into solving sparse linear systems in circuit
simulation, e.g. [80, 81, 82, 83]. Also, work on parallel HB has been
reported, e.g., in [84, 85, 86]. In this thesis, the variant of the MLNA
method with some convergence aiding [PIII] and an implementation of
parallel HB analysis for shared memory computers are presented [PIV].
5. Model-order reduction
5.1 Overview
The goal of model-order reduction (MOR) is to produce a smaller model
from a large one. Linear MOR concentrates on the reduction of RLCM
circuits or RLCM parts of nonlinear circuits. Nonlinear MOR reduces
also circuits consisting of transistors and other nonlinear components. An
overview on MOR is presented in [87, 88, 89, 90].
One of the first proposed MOR methods was the asymptotic waveform
evaluation (AWE) [91] in 1990. After that, a large number of projection-based MOR methods have been proposed. The AWE algorithm uses the
Padé approximation to obtain an approximation of the original transfer function. However, the direct matching of high-order moments causes
numerical instability problems. A solution to the instability was to use
implicit moment matching, i.e., to project the original moment space onto
an orthonormal Krylov subspace. The first such method was the Padé via
Lanczos (PVL) [92], where the Lanczos process is used to generate the
Krylov subspace. Alternatively, the Arnoldi process can be used to generate the subspace. The passive reduced-order interconnect macromodeling algorithm (PRIMA) [93] proposed in 1998 uses the Arnoldi process
to generate the Krylov subspace. The projection matrix constructed is
used to perform a congruence transformation of the original system into
a smaller system. PRIMA generates provably passive reduced-order models (ROM). The easy implementation and guaranteed passivity property
made PRIMA very popular. The structure-preserving reduced-order interconnect macromodeling algorithm (SPRIM) [94] proposed in 2004 added a
structure-preserving feature to the Arnoldi process. Thus, the reciprocity
of the system is preserved.
Also, methods based on singular value decomposition (SVD) have been
presented, e.g. truncated-balanced realization (TBR) [95, 96, 97, 98]. The
idea of these methods is to use different balancing techniques to capture
specific system properties. The methods have computable error bounds.
However, these methods have been considered very expensive.
Yet another approach to MOR is nodal-elimination methods like TICER
[99, 100] that is used for reduction of RC circuits. Its extension [101] is
used for RLC circuits. The outcome of these methods is by definition an
RC or RLC circuit.
One approach to MOR is to use partitioning-based reduction. In general,
the original circuit can be divided into subcircuits (or matrices into submatrices), and each subcircuit can be analyzed separately with any of the
previously presented methods. The partitioning itself can be performed
for the graph constructed from the RLC circuit. There are standard methods for partitioning the graph, e.g. hMETIS [102]. However, some more
sophisticated methods are substantially based on partitioning. For them
the partitioning is not an additional part of the MOR flow, but an essential part of the method. One of these partitioning-based MOR methods
was presented by Liao and Dai in 1999 [103]. Another partitioning based
method, SparseRC for RC circuits, is presented in [104]. For RLC circuits,
PartMOR [PX], and its extension for RLCM circuits in [105] were proposed. Some other partitioning-based approaches to MOR are presented
in [56, 106, 107, 108].
There are several MOR methods for nonlinear systems of equations, but
this field of MOR is beyond the scope of this thesis, as is parametric MOR,
where the model is parameterized with respect to, e.g., temperature, geometric dimensions, etc.
One often overlooked issue in the development of new MOR methods is
the realizability of the reduced-order models. If a potential MOR method
produces only a reduced mathematical model of a transfer function or
state equations instead of a realizable circuit macromodel, simulation
tools may need to be modified to handle these mathematical representations. Also, the realized RLC netlist allows the usage of all analysis modes
of a simulator, e.g. it is simpler to use an RLC circuit in transient analysis than in a frequency representation. In [109], the macromodel realizations for MOR are well studied, but contain voltage-controlled current and
charge sources in addition to standard RLC elements. Depending on the
design flow, this may severely limit the utility and usability of MOR. RLCSYN [110] can be utilized with structure preserving (reciprocal) methods
like SPRIM and SAPOR [111]. TICER and Liao–Dai methods produce, as
mentioned before, RC circuits.
In order to give a more detailed idea of MOR, some methods used in
[PVII–PX] are presented in the following: PRIMA with the macromodel
realization method proposed by Matsumoto and the partitioning-based
method proposed by Liao and Dai.
5.2 PRIMA
The passive reduced-order interconnect macromodeling algorithm (PRIMA)
[93] is based on the block Arnoldi algorithm and employs congruence
transformations to project a large system of equations onto a smaller subspace so that passivity is preserved during reduction. PRIMA uses the
Arnoldi iteration as a numerically stable method of generating the Krylov
subspace to match $K = \lfloor q/N \rfloor$ block moments of the N-port y-parameters,
where q is the order of reduction.
5.2.1 Equation formulation
The MNA equations of an N-port can be expressed as follows:
\[
\begin{cases}
C \dfrac{dx(t)}{dt} = -G x(t) + B u_p(t), \\
i_p(t) = L^T x(t),
\end{cases}
\tag{5.1}
\]
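where x(t) contains the nodal voltages and the branch currents of ports and inductances (x(0) = 0), and $u_p$ and $i_p$ denote the port voltages and currents,
respectively. B = L, where $B \in \mathbb{R}^{n \times N}$ is a selector matrix consisting of
ones, minus ones, and zeroes, and n is the total number of unknowns. The
MNA matrices $G \in \mathbb{R}^{n \times n}$ and $C \in \mathbb{R}^{n \times n}$ can be partitioned as
\[
C \equiv \begin{bmatrix} Q & 0 \\ 0 & H \end{bmatrix}, \quad
G \equiv \begin{bmatrix} N & E \\ -E^T & 0 \end{bmatrix}, \quad
x \equiv \begin{bmatrix} v \\ i \end{bmatrix}.
\tag{5.2}
\]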
N, Q, and H are symmetric non-negative definite matrices containing
the stamps from resistances, capacitances, and inductances, respectively.
Vector v is the nodal voltage vector and i contains the branch currents
of ports and inductances. The matrix E represents the current variable
contributions in the MNA equations.
Define $A \equiv -G^{-1}C$ and $R \equiv G^{-1}B$. Taking the Laplace transformation
of (5.1) and solving for the port current variables, the y-parameter matrix
Y(s) is
\[ Y(s) = L^T (\mathbf{1} - sA)^{-1} R. \tag{5.3} \]
The block moments of Y(s) are defined as the coefficients of the Taylor
expansion of Y around s = 0:
\[ Y(s) = M_0 + M_1 s + M_2 s^2 + \cdots. \tag{5.4} \]
The block moments can be computed using the relation
\[ M_i = L^T A^i R. \tag{5.5} \]
The reduction happens when this series is truncated to the first K moments. PRIMA does not compute these moments directly, but other methods,
such as PartMOR, compute the first 1–3 moments directly using (5.5).
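The step from (5.1) to (5.3) and (5.5) can be spelled out as a short derivation. It assumes zero initial conditions, as stated above, and values of s small enough for the Neumann series of $(\mathbf{1} - sA)^{-1}$ to converge; the Laplace-domain symbols X(s) and $U_p(s)$ are notation introduced here for illustration.

```latex
\begin{aligned}
(G + sC)\,X(s) &= B\,U_p(s)
  &&\Rightarrow\quad X(s) = (\mathbf{1} + sG^{-1}C)^{-1} G^{-1} B\,U_p(s)
                        = (\mathbf{1} - sA)^{-1} R\,U_p(s), \\
(\mathbf{1} - sA)^{-1} &= \sum_{i=0}^{\infty} s^i A^i
  &&\Rightarrow\quad Y(s) = L^T (\mathbf{1} - sA)^{-1} R
                        = \sum_{i=0}^{\infty} \left( L^T A^i R \right) s^i .
\end{aligned}
```

Comparing the coefficients of $s^i$ with the Taylor expansion of Y(s) gives $M_i = L^T A^i R$.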
5.2.2 PRIMA algorithm
The generation of the projection matrix X is considered in the following
algorithm:
ALGORITHM 4 PRIMA(G, C, B, K)
1. Solve GR = B for R.
2. Block orthonormalization: compute QR factorization $R = X_0 T$.
3. For j = 1, . . . , K
(a) Solve for $X_j$ in $G X_j = -C X_{j-1}$.
(b) For i = 1, . . . , j (modified Gram–Schmidt orthogonalization)
i. $\Delta = X_{j-i}^T X_j$.
ii. $X_j = X_j - X_{j-i} \Delta$.
iii. $H_{j-i,j-1} = H_{j-i,j-1} + \Delta$.
(c) EndFor.
(d) Block orthonormalization: compute QR factorization $X_j = X_j H_{j,j-1}$.
4. EndFor.
5. Collect generated vector blocks $X = [X_0 \ldots X_{K-1}]$.
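Using the projection matrix X, PRIMA transforms (5.1) into
\[
\begin{cases}
\tilde{C} \dfrac{d\tilde{x}(t)}{dt} = -\tilde{G}\tilde{x}(t) + \tilde{B}u(t), \\
i(t) = \tilde{L}^T \tilde{x}(t),
\end{cases}
\tag{5.6}
\]
where the reduced matrices are
\[ \tilde{C} = X^T C X, \quad \tilde{G} = X^T G X, \quad \tilde{B} = X^T B, \quad \tilde{L} = X^T L. \tag{5.7} \]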
These types of transformations are known as congruence transformations. The matrix X is an n × q matrix, which is obtained after q/N + 1
iterations of the block Arnoldi algorithm (the extra step is not necessary
if q/N is an integer).
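The projection and the resulting moment matching can be sketched for the single-port case (N = 1), where the blocks reduce to vectors and the block Arnoldi process becomes the ordinary Arnoldi process. The example is an illustration under stated assumptions: it uses a hypothetical current-driven six-node RC ladder in pure nodal form (so that G is symmetric positive definite and L = B) rather than a voltage-driven MNA port, and every new vector is orthogonalized against all previous basis vectors. All names are invented for the sketch.

```python
import math

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [A[r][:] + [b[r]] for r in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            m = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= m * M[c][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def matvec(A, v):
    return [dot(row, v) for row in A]

def krylov_basis(G, C, B, q):
    """Orthonormal basis of span{R, AR, ..., A^(q-1) R}, A = -G^-1 C, R = G^-1 B."""
    X, v = [], solve(G, B)
    for j in range(q):
        if j > 0:
            v = solve(G, [-c for c in matvec(C, X[-1])])   # G x_j = -C x_(j-1)
        for x in X:                                        # modified Gram-Schmidt
            h = dot(x, v)
            v = [vi - h * xi for vi, xi in zip(v, x)]
        nv = math.sqrt(dot(v, v))
        X.append([vi / nv for vi in v])
    return X

def congruence(Mat, X):
    """X^T Mat X, with the vectors stored in X acting as columns, cf. Eq. (5.7)."""
    MX = [matvec(Mat, x) for x in X]
    return [[dot(xi, mxj) for mxj in MX] for xi in X]

def moments(G, C, B, count):
    """M_i = B^T A^i R for i = 0..count-1, cf. Eq. (5.5) with L = B."""
    r, out = solve(G, B), []
    for _ in range(count):
        out.append(dot(B, r))
        r = solve(G, [-c for c in matvec(C, r)])
    return out

# Hypothetical ladder: unit resistors between nodes, last node to ground,
# unit capacitor at every node, port at node 0.
n, q = 6, 3
G = [[0.0] * n for _ in range(n)]
for k in range(n):
    G[k][k] = 1.0 if k == 0 else 2.0
    if k + 1 < n:
        G[k][k + 1] = G[k + 1][k] = -1.0
C = [[1.0 if r == c else 0.0 for c in range(n)] for r in range(n)]
B = [1.0] + [0.0] * (n - 1)

X = krylov_basis(G, C, B, q)
Gr, Cr, Br = congruence(G, X), congruence(C, X), [dot(x, B) for x in X]
full, reduced = moments(G, C, B, q), moments(Gr, Cr, Br, q)
```

The reduced 3 × 3 model reproduces the first q = 3 moments of the 6 × 6 ladder, although the moments themselves are never formed while the basis is built. This implicit matching is the property that distinguishes the Krylov-subspace approach from explicit moment matching.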
5.2.3 Eigenvalue decomposition
The mathematical formulation in this subsection is based on [109] and
[112].
The reduced model in (5.6) is described by dense block matrices. If
the direct stamping methods [93, 109], where the matrices are directly
stamped into components, are used for macromodel creation, they might
create many new components and, thus, lead to realizations that have
more components than the original circuit. Therefore, other sophisticated
macromodel realization methods are needed. Most of these methods need
eigenvalue decomposition as a preprocessing step, and it is presented in
the following.
If the first equation of Eq. (5.6) is premultiplied by $\tilde{G}^{-1}$, and assuming
that a basis of eigenvectors exists for the matrix $\tilde{G}^{-1}\tilde{C}$, then $\tilde{G}^{-1}\tilde{C} = S\Lambda S^{-1}$, where $\Lambda$ is a diagonal matrix containing the eigenvalues of $\tilde{G}^{-1}\tilde{C}$
as its diagonal elements and S has the corresponding q eigenvectors as its
columns. After premultiplying by $S^{-1}$, Eq. (5.6) can be written as
\[
\begin{cases}
S^{-1} S \Lambda S^{-1} \dfrac{d\tilde{x}(t)}{dt} = -S^{-1}\tilde{x}(t) + S^{-1}\tilde{G}^{-1}\tilde{B}u(t), \\
i(t) = \tilde{L}^T S S^{-1} \tilde{x}(t),
\end{cases}
\tag{5.8}
\]
or, with a change in variables $S^{-1}\tilde{x} \to \tilde{x}$, as
\[
\begin{cases}
\Lambda \dfrac{d\tilde{x}(t)}{dt} = -\mathbf{1}\tilde{x}(t) + Hu(t), \\
i(t) = E^T \tilde{x}(t),
\end{cases}
\tag{5.9}
\]
where $H = S^{-1}\tilde{G}^{-1}\tilde{B}$, $E = S^T\tilde{L}$, and 1 is the q × q unity matrix. Eq. (5.9)
has the same dimensions as Eq. (5.6), but the coefficient matrices $\Lambda$ and
1 are diagonal.
The real matrix $\tilde{G}^{-1}\tilde{C}$ has $q_r$ real eigenvalues and $q_c$ complex conjugate
pairs such that $q = q_r + 2q_c$. Consider one conjugate pair, $\Lambda_m^r \pm j\Lambda_m^i$. The
corresponding eigenvectors, and the corresponding rows of matrices H
and E in Eq. (5.9), are complex conjugate. Let the corresponding elements
of vector $\tilde{x}$ be $\tilde{x}_m^r \pm j\tilde{x}_m^i$. Multiplying by $S^{-1}$, the mth row in the first equation of
(5.9) is written as
\[ \Lambda_m \frac{d\tilde{x}_m(t)}{dt} = -\tilde{x}_m(t) + \sum_{j=1}^{N} H_{mj} u_j(t). \tag{5.10} \]
If $\tilde{x}_m^r \pm j\tilde{x}_m^i$ are inserted into Eq. (5.10) and the real and imaginary parts
of the equation are required to hold independently, Eq. (5.10) becomes
\[
\begin{cases}
\Lambda_m^r \dfrac{d\tilde{x}_m^r(t)}{dt} = -\tilde{x}_m^r(t) + \Lambda_m^i \dfrac{d\tilde{x}_m^i(t)}{dt} + \displaystyle\sum_{j=1}^{N} H_{mj}^r u_j(t), \\
\Lambda_m^r \dfrac{d\tilde{x}_m^i(t)}{dt} = -\tilde{x}_m^i(t) - \Lambda_m^i \dfrac{d\tilde{x}_m^r(t)}{dt} + \displaystyle\sum_{j=1}^{N} H_{mj}^i u_j(t).
\end{cases}
\tag{5.11}
\]
5.2.4 Macromodel synthesis by Matsumoto’s method
The macromodeling method proposed by Matsumoto [112] produces efficient macromodel realizations [112, 109] of the reduced-order models for
the matrices obtained from PRIMA. The method is a realization of Eqs.
(5.10) and (5.11). The equivalent circuit is presented in Fig. 5.1. The
nodal equations, e.g., for the circuit in Fig. 5.1(b), can be expressed as
$$
\sum_{j=1}^{N} H_{mj} U_j = (s\Lambda_m + 1)\tilde{X}_m.
\tag{5.12}
$$
5.3 Liao–Dai method
In principle, the method proposed by Liao and Dai (here, Liao–Dai) [103]
is a partitioning and macromodel-based RC MOR method, where the circuit is partitioned into subcircuits that are modeled with simple low-order
RC circuits. In this section, the original method is presented briefly for
reference. In [PVII], the method is investigated further: each
step is analyzed carefully and alternative approaches are proposed and
evaluated.
The Liao–Dai method begins by describing the circuit with scattering
(S) parameters, where each circuit element is described in S parameter
terms. The goal is to minimize the total number of entries of the S matrix.
This is done by decomposing the circuit into subcircuits that have as small
a number of ports, i.e., connection nodes between subcircuits, as possible.
Figure 5.1. Matsumoto’s equivalent-circuit realization [112]. (a) A port VCCS, (b) realization of a real eigenvalue, and (c) realization of a complex eigenvalue pair.
The partitioning in the original Liao–Dai method is done by considering the RC netlist as a weighted graph G(V, E). For RC circuits, vertices
V consist of nodes that connect more than two elements, and edges E
represent the adjacency (resistance or capacitance) between nodes in the
circuit. For each edge, a weight is determined based on the number of
ports in each subcircuit the edge is connected to, if the two subcircuits
were joined together. The partitioning process then chooses the elements
to be combined together into larger subcircuits. This is done by eliminating edges between the elements until the weight of all remaining edges is
greater than a preset maximum weight for an edge. In practice, an edge
between two subcircuits is eliminated if the inequality
$$
N_x^2 + N_y^2 > \beta (N_x + N_y - 2)^2
\tag{5.13}
$$
holds, where Nx and Ny are the numbers of ports in the two subcircuits, x and
y, and β is the user-defined contracting factor. The inequality is derived
from the size of the corresponding S-parameter matrix: the left-hand side
of the inequality describes the number of entries in the matrix before the
elimination and the right-hand side describes the number of entries after
the elimination of an edge.
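The criterion of Eq. (5.13) is simple enough to state as code. The following is a minimal sketch; the function name `edge_is_eliminated` is hypothetical, and the example port counts and values of β are illustrative only.

```python
def edge_is_eliminated(n_x: int, n_y: int, beta: float) -> bool:
    """Evaluate inequality (5.13) for two subcircuits with n_x and n_y ports.

    The left-hand side counts the S-matrix entries of the two separate
    subcircuits; the right-hand side counts the entries of the joined
    subcircuit, weighted by the contracting factor beta.
    """
    entries_before = n_x**2 + n_y**2
    entries_after = (n_x + n_y - 2) ** 2
    return entries_before > beta * entries_after


# Two 2-port subcircuits: 8 entries before, 4 after, so the edge is eliminated.
print(edge_is_eliminated(2, 2, beta=1.0))
# Two 5-port subcircuits: 50 entries before, 64 after, so the edge is kept
# for beta = 1 but eliminated for a smaller contracting factor.
print(edge_is_eliminated(5, 5, beta=1.0))
print(edge_is_eliminated(5, 5, beta=0.5))
```

A smaller β makes the test easier to satisfy, so more edges are eliminated and the resulting subcircuits become larger.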
As a result, the circuit is divided into partitions with a small number of
ports, i.e. smaller number of entries in the S-parameter matrix.
Figure 5.2. Liao–Dai circuit macromodels between pairs of ports.
At each step, the S parameters of subcircuits are updated by calculating new S parameters for each subcircuit. The S-parameter equations
are also truncated to the first two low-order terms, because higher-order
terms are not needed for the low-order macromodel synthesis. After the
partitioning is completed, the S parameters of each subcircuit are converted into y parameters as follows
$$
Y = Z_0^{-1} (\mathbf{1} + S)^{-1} (\mathbf{1} - S),
\tag{5.14}
$$
where Z0 is the reference impedance. These y parameters are used to
realize the subcircuits with small-order RC-macromodels.
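The conversion of Eq. (5.14) can be sketched numerically as follows. The sketch assumes a scalar reference impedance Z0 that is the same at every port and works at a single frequency point; the function names `s_to_y` and `y_to_s` are hypothetical, and the inverse conversion is included only to check the result.

```python
import numpy as np


def s_to_y(S, z0):
    """Y = Z0^{-1} (1 + S)^{-1} (1 - S), Eq. (5.14), for a scalar z0."""
    S = np.atleast_2d(np.asarray(S, dtype=complex))
    one = np.eye(S.shape[0])
    return np.linalg.solve(one + S, one - S) / z0


def y_to_s(Y, z0):
    """Inverse conversion S = (1 + Z0 Y)^{-1} (1 - Z0 Y), used for checking."""
    Y = np.atleast_2d(np.asarray(Y, dtype=complex))
    one = np.eye(Y.shape[0])
    return np.linalg.solve(one + z0 * Y, one - z0 * Y)


# Illustrative two-port: a single 100-ohm resistor between the two ports.
R = 100.0
Y_series = np.array([[1.0, -1.0], [-1.0, 1.0]]) / R
S = y_to_s(Y_series, z0=50.0)
print(np.allclose(s_to_y(S, z0=50.0), Y_series))
```

Using a linear solve instead of an explicit matrix inverse is a design choice of this sketch, not something prescribed by the method.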
For an N -port, the admittance between the ith port and ground is given
by the sum of the ith row of its Y matrix, Y(s). The admittance between
ports i and j is −yij . The admittances between a port and ground and
between pairs of ports are synthesized with R and C elements. In the
original Liao–Dai method [103], the admittance matrices of an RC circuit
are synthesized using a moment-matching technique. Once M0 and M1
have been calculated for the N -port using (5.5), each element of Y(s) can
be expressed as
$$
y_{ij} \approx m_0^{ij} + m_1^{ij} s.
\tag{5.15}
$$
Figure 5.2 presents the macromodel realizations between two ports i
and j in different situations. A terminal macromodel, such as shown in
Fig. 5.3, is also needed for each port. In case of a one-port, only the terminal macromodel is needed. Depending on $m_1^{ij}$, different macromodels
are used between ports i and j:
1. The T circuit in Fig. 5.2(a) may be used if $m_1^{ij} \ge 0$. Along with the port
impedance of Fig. 5.3 at two ports, this creates a 2Π circuit.
2. If $m_1^{ij}$ is negative, Fig. 5.2(b) must be used instead, which forms, combined with the port model of Fig. 5.3 at the two ports, a Π circuit.
Figure 5.3. Liao–Dai circuit macromodel of a port.
5.3.1 T model
The synthesis of the T model shown in Fig. 5.2(a), where the circuit
parameters are determined by matching the moments of yij , is presented
first. The T model together with the port model at the two ports (see
Section 5.3.3) forms a 2Π model.
By matching the first two moments, $m_0^{ii}$, $m_0^{ij}$, and $m_1^{ij}$ of the series in
(5.15) and the relation $m_1^{ii}/m_1^{jj}$ with the 2Π macromodel description, we
obtain the values
$$
\begin{cases}
R_{ij1} = \dfrac{-\sqrt{m_1^{jj}}}{m_0^{ij}\left(\sqrt{m_1^{ii}} + \sqrt{m_1^{jj}}\right)}, \\[3ex]
R_{ij2} = \dfrac{-\sqrt{m_1^{ii}}}{m_0^{ij}\left(\sqrt{m_1^{ii}} + \sqrt{m_1^{jj}}\right)}, \\[3ex]
C_{ij} = \dfrac{m_1^{ij}\left(\sqrt{m_1^{ii}} + \sqrt{m_1^{jj}}\right)^{2}}{\sqrt{m_1^{ii}\, m_1^{jj}}}.
\end{cases}
\tag{5.16}
$$
According to [103], the reason for matching $m_1^{ii}/m_1^{jj}$ instead of matching
$m_1^{ii}$ and $m_1^{jj}$ directly is that $m_1^{ii}$ and $m_1^{jj}$ are the second moments of $y_{ii}$
and $y_{jj}$, which are the total contribution of the circuit, not just a branch
between ports i and j.
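The element values of Eq. (5.16) can be computed and checked with a short script. This is a minimal sketch: the function name `t_model` is hypothetical, the moments $m_1^{ii}$ and $m_1^{jj}$ are assumed to be positive, and the check uses the moments of an isolated T branch (series R1, shunt C at the internal node, series R2), which is an assumption made only for this illustration.

```python
import math


def t_model(m0_ij, m1_ij, m1_ii, m1_jj):
    """Return (R_ij1, R_ij2, C_ij) of the T model according to Eq. (5.16)."""
    if m1_ij < 0:
        raise ValueError("the T model requires m1_ij >= 0")
    a = math.sqrt(m1_ii)
    b = math.sqrt(m1_jj)
    r_ij1 = -b / (m0_ij * (a + b))
    r_ij2 = -a / (m0_ij * (a + b))
    c_ij = m1_ij * (a + b) ** 2 / (a * b)
    return r_ij1, r_ij2, c_ij


# Illustrative check: moments of an isolated T branch with known elements.
R1, R2, Cap = 100.0, 300.0, 1e-12
m0_ij = -1.0 / (R1 + R2)                  # zeroth moment of y_ij
m1_ij = Cap * R1 * R2 / (R1 + R2) ** 2    # first moment of y_ij
m1_ii = Cap * R2**2 / (R1 + R2) ** 2      # first moment of y_ii (this branch only)
m1_jj = Cap * R1**2 / (R1 + R2) ** 2      # first moment of y_jj (this branch only)

print(t_model(m0_ij, m1_ij, m1_ii, m1_jj))
```

Only the ratio of `m1_ii` to `m1_jj` influences the result, which agrees with the remark above that the ratio, not the individual values, is matched.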
5.3.2 Π model
The above formulas could be used to synthesize all off-diagonal elements
of the Y matrix with $m_1^{ij} \ge 0$. However, $m_1^{ij}$ may be negative for circuits
with floating capacitances. In this case, a floating capacitance can be used
to realize the $y_{ij}$ between port i and port j as presented in Fig. 5.2(b). The
admittance-matrix elements of the reduced macromodel in Fig. 5.2(b) are
$$
\tilde{y}_{ii} = \tilde{y}_{jj} = -\tilde{y}_{ij} = \frac{1}{R_{ij}} + sC_{ij}.
\tag{5.17}
$$
Thus
$$
R_{ij} = -\frac{1}{m_0^{ij}},
\tag{5.18}
$$
$$
C_{ij} = -m_1^{ij}.
\tag{5.19}
$$
5.3.3 Circuit model of a port
For the diagonal elements $y_{ii}$, $y_{i0}$ needs to be synthesized, where
$$
y_{i0} = y_{ii} - \sum_{j=1\,(j \neq i)}^{N} \tilde{y}_{ij} = m_0^{i0} + m_1^{i0} s + \cdots
\tag{5.20}
$$
The parameter $y_{i0}$ is modeled with a parallel RC circuit in Fig. 5.3. If
$m_0^{i0} = 0$, $y_{i0}$ is modeled with a single capacitance. The parameters of
the model are
$$
R_{ii} = \frac{1}{m_0^{i0}},
\tag{5.21}
$$
$$
C_{ii} = m_1^{i0}.
\tag{5.22}
$$
In case the capacitance Cii computed in Eq. (5.22) is negative, [103]
suggests that Cii can be set to zero and all Cij scaled down to keep the total
capacitance unchanged. This way all the resistances and capacitances are
non-negative and the total macromodel is passive.
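The Π-model and port-model element values of Eqs. (5.18), (5.19), (5.21), and (5.22) can be sketched as follows. All function names are hypothetical. The rescaling function shows only one possible reading of the suggestion attributed to [103]: it assumes that the "total capacitance" is the sum of Cii and all Cij attached to the port, and that all Cij are scaled by a common factor. The text does not specify these details, so this part is an assumption.

```python
import math


def pi_model(m0_ij, m1_ij):
    """Return (R_ij, C_ij) of the Pi branch, Eqs. (5.18) and (5.19)."""
    return -1.0 / m0_ij, -m1_ij


def port_model(m0_i0, m1_i0):
    """Return (R_ii, C_ii) of the port model, Eqs. (5.21) and (5.22).

    A zero m0_i0 means that no resistor is needed (infinite resistance).
    """
    r_ii = math.inf if m0_i0 == 0 else 1.0 / m0_i0
    return r_ii, m1_i0


def rescale_negative_port_capacitance(c_ii, c_ij):
    """Assumed fix-up: set a negative C_ii to zero and scale all C_ij so that
    C_ii + sum(C_ij) stays unchanged (one possible reading of the text)."""
    if c_ii >= 0:
        return c_ii, list(c_ij)
    total = c_ii + sum(c_ij)
    if total <= 0:
        raise ValueError("total capacitance must stay positive")
    scale = total / sum(c_ij)
    return 0.0, [c * scale for c in c_ij]


print(pi_model(-0.01, -2e-12))
print(port_model(0.0, 1e-12))
print(rescale_negative_port_capacitance(-1e-12, [2e-12, 3e-12]))
```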
6. Discussion
This thesis presented four approaches to increase the speed and improve
the convergence of numerical circuit simulation.
In the first approach, several nonlinear iteration methods have been
evaluated and further developed. The main emphasis was on convergence, and as a result of the studies in [PI, PII], the Dogleg method
has been implemented in APLAC’s DC analysis as one of the many convergence strategies [3]. Also implemented is the nonmonotone norm-reduction method [PI] that the user may choose to use in APLAC’s DC
and HB analysis. Only some attention was paid to the computational cost
of the methods. In future studies, the possibility of applying nonlinear
methods to massive problems has to be taken better into account, especially how to use these methods efficiently with HB analysis.
Even though utilization of parallel processing for circuit simulation has
been studied for decades, it is still a relevant topic since parallel hardware is constantly developing; e.g., multicore processors exist in almost
every computer. The multithreaded HB analysis reported in [PIV] is the
basis of APLAC’s current HB implementation. The MLNR method [PIII]
was available in some older versions of APLAC but was then disabled due
to maintenance difficulties.
Preconditioning is an essential part of the development of HB algorithms.
Both FD and TD preconditioners have their own advantages. In this thesis, combining the TD preconditioner with the FD preconditioner was suggested in order to benefit from both. The mixed preconditioners seem to
improve the performance at least in some cases and, therefore, most of the
preconditioners are implemented in APLAC and can be specified for use
by the user. Unfortunately, TD preconditioners are applicable directly to
1-tone HB problems only. Therefore, mixing of multitone (FD) preconditioning, especially with MRHB, might be a good direction to focus future
studies.
The MOR research presented in this thesis was concentrated mainly
on partitioning-based methods (excluding GABOR [PIX]). The PartMOR
method presented in [PX] has been further developed and improved [105],
and the possibility to apply this and other partitioning-based methods for
tightly coupled problems efficiently is under study. GABOR [PIX] was one
attempt to open up a new approach for projection-based MOR but, unfortunately, was not very successful. However, it was an inspiration for another experimental method: Passive, Reciprocal, and Infinity-Observing
Reduction (PRIOR) [113]. To bring these MOR methods, and those
not presented in this thesis, into more widespread use,
there is a need for research or, at least, technical development work that
takes real-life problems into account.
7. Summary of the publications
7.1 Publication I: Nonmonotone norm-reduction method for circuit
simulation
A nonmonotone norm-reduction (line-search) method for aiding the convergence of NR analysis is presented. It has been implemented in APLAC’s
DC analysis and tested with some benchmark circuits. The test results
showed that the method can reduce the number of line searches during
the DC analysis and, thus, increase the speed of the NR iteration.
7.2 Publication II: On nonlinear iteration methods for DC analysis
of industrial circuits
This paper concentrates on some trust-region methods: DL and tensor
methods which should be efficient in the case of nearly singular Jacobian matrices and do not need the computation of Hessian matrices. The
convergence of these methods is also improved using the nonmonotone
strategy presented in [PI]. The efficiency of the above-mentioned methods
was compared to NR and some CG methods. All the methods have been
implemented in the in-house development version of APLAC.
Simulations with real-life circuits are presented. The results showed
that the DL method, especially with the nonmonotone search strategy,
was the most robust of the methods tested.
7.3 Publication III: New multilevel Newton–Raphson method for
parallel circuit simulation
In this paper, a variant of the multilevel Newton–Raphson method for
parallel circuit simulation is presented. The reduced communication between processors is the motivation to use multilevel methods in a network
of workstations. The proof for the quadratic local convergence is given,
and with a specific circuit-equation formulation, the multilevel method is
shown to be adjustable using line-search methods to achieve better global
convergence. Finally, experimental results are presented that show some
speed-up compared to the non-multilevel parallel NR method.
7.4 Publication IV: A Parallel harmonic balance simulator for
shared memory multicomputer
A parallelization of the HB simulator (APLAC) for a shared memory multiprocessor computer is presented. The paper shows how to utilize multithreading in some critical operations: computation of the values of nonlinear elements at each sample point, the computation of matrix-vector
products, and the construction and solution of the block-diagonal preconditioner. As a result, a reasonably scalable simulator is achieved.
7.5 Publication V: Mixed preconditioners for harmonic balance
Jacobians
The efficiency of a linear iterative solver depends heavily on the preconditioner used. Naturally, most preconditioners for HB equations are in the
frequency domain, one of the simplest being the block-diagonal preconditioner. While this is simple and effective for weakly nonlinear circuits, for
highly nonlinear cases, especially for frequency dividers, TD preconditioners, which take nonlinear behavior better into account, become attractive.
However, both TD and FD preconditioners may lack some good properties. In some situations, changing the preconditioner during the iteration
is needed.
In this paper, mixed FD/TD preconditioners for HB Jacobians are presented. The efficiency of mixed preconditioners is demonstrated with realistic simulation examples. The results showed that mixing the preconditioners is mostly a good strategy, but not superior.
7.6 Publication VI: Frequency/time block preconditioners for
harmonic balance Jacobians
The RF circuit may have a structure where different parts of the HB Jacobians require different preconditioners. [PV] shows how to mix frequency-domain and time-domain preconditioners for full HB Jacobians, but this
paper proposes block preconditioners that combine time- and frequency-domain preconditioners such that different preconditioners can be chosen
for each HB Jacobian block separately. The preconditioners were tested
with a circuit consisting of two parts, one suitable for an FD preconditioner and the
other for a TD preconditioner. The simulation results showed that these
preconditioners can improve the convergence for this type of circuit.
7.7 Publication VII: Study and development of an efficient
RC-in–RC-out MOR method
The paper presents a partitioning and macromodel-based MOR method
for RC circuits. The original method proposed by Liao and Dai [103] is
divided into three parts: circuit partitioning, moment calculation, and
macromodel synthesis. For each of these parts, alternative approaches
are presented. The alternatives are, then, analyzed and the most efficient
solutions to each of these steps are presented. As a result, a MOR method
that uses hMETIS as the partitioning algorithm, an MNA-based moment
calculation, and the simplified macromodeling method is constructed. In
other words, the revised method uses the same original idea as in [103],
but in a more efficient manner.
The revised partitioning-based RC MOR flow was compared to PRIMA
and TICER. For most RC circuits, the proposed method outperformed
TICER and PRIMA. PRIMA, of course, can be applied to RLC circuits,
too.
7.8 Publication VIII: Hierarchical model-order reduction flow
This paper presents a hierarchical model-order reduction (HMOR) flow,
where the linear parts of a hierarchically defined circuit are divided into
independently reducible subcircuits. The impact of the hierarchical structure and circuit partitioning on two MOR methods is discussed and simulation results are presented.
The benefits of performing MOR in a hierarchical manner are: The
problem can be divided into parts which can be solved separately, thus
allowing faster analysis using parallel processing. It also requires less
memory. The repeated subcircuits need to be analyzed only once, and a
different MOR method may be chosen for each individual part of the original problem. Circuit partitioning presents a natural way to benefit from
hierarchical analysis.
In addition to general discussion, the HMOR flow is demonstrated using
two MOR methods, the RC MOR method presented in [PVII], and PRIMA.
The simulation examples showed that also PRIMA benefits from the hierarchical analysis as well as the RC MOR method, where the partitioning
is a vital step of the method.
7.9 Publication IX: GABOR: Global-approximation-based order
reduction
This paper proposes a new approach for the MOR of RLC circuit blocks.
Instead of Taylor-series-like local fitting using implicit moment matching,
a global approximation of the matrix-valued s-domain transfer function
is generated. Then, the Krylov-like subspace spanned by the moments
of this approximation is used to set up the projection matrices needed
for MOR. The proposed MOR approach preserves passivity, reciprocity,
and the properties of the global approximation. The simulation example,
a reduction of a dispersive transmission line model, verifies the correct
operation of the method. However, there are numerical problems with
the method, which imply that GABOR is not a competitive method. The
concept of global approximation is still worth studying.
7.10 Publication X: PartMOR: Partitioning-based realizable
model-order reduction method for RLC circuits
PartMOR is an extension of the RC and RL MOR methods presented in
[PVII] and [114] into a general-purpose RLC MOR method.
This method partitions the RLC circuit into subcircuits that are modeled
with two macromodels that are used to match three moments of the original y parameters of each subcircuit. The matching is done simultaneously
partly at DC and partly at infinity.
Also, thanks to analysis at both DC and infinity, a singularity of the G
matrix at either frequency can be avoided.
The test simulations with RC, RL, and RLC circuits are presented, and
they show that the performance in terms of accuracy and reduction ratio
of PartMOR is generally better than that of SPRIM with the RLCSYN
macromodel synthesis method.
Bibliography
[1] L. W. Nagel, SPICE2: A Computer Program to Simulate Semiconductor
Circuits. PhD thesis, EECS Department, University of California, Berkeley, 1975.
[2] R. Saleh, S.-J. Jou, and A. R. Newton, Mixed-Mode Simulation and Analog
Multilevel Simulation. Boston, MA: Kluwer Academic Publisher, 1994.
[3] APLAC 8.6 manuals, 2012.
[4] M. Valtonen, P. Heikkilä, H. Jokinen, and T. Veijola, “APLAC — object-oriented circuit simulator and design tool,” in Low-power HF Microelectronics: a Unified Approach (G. A. S. Machado, ed.), pp. 333–372, London,
UK: IEE, 1996.
[5] Y. Saad and M. H. Schultz, “GMRES: a generalized minimal residual algorithm for solving nonsymmetric linear systems,” SIAM Journal on Scientific and Statistical Computing, vol. 7, pp. 856–869, 1986.
[6] L. O. Chua and P.-M. Lin, Computer-Aided Analysis of Electronic Circuits:
Algorithms and Computational Techniques. Prentice-Hall, 1975.
[7] J. Vlach and K. Singhal, Computer Methods for Circuit Analysis and Design. New York, NY, USA: John Wiley & Sons, Inc., 1983.
[8] J. Ogrodzki, Circuit Simulation Methods and Algorithms. Boca Raton-Ann
Arbor-Tokyo-London: CRC Press, 1994.
[9] W. J. McCalla, Fundamentals of Computer-Aided Circuit Simulation.
Boston/Dordrecht/Lancaster: Kluwer Academic Publishers, 1988.
[10] P. J. C. Rodrigues, Computer-aided analysis of nonlinear microwave circuits. Norwood, MA: Artech House, Inc., 1998.
[11] K. S. Kundert, J. K. White, and A. Sangiovanni-Vincentelli, Steady-State
Methods for Simulating Analog and Microwave Circuits. Boston: Kluwer
Academic Publishers, 1990.
[12] C.-W. Ho, A. E. Ruehli, and P. A. Brennan, “The modified nodal approach
to network analysis,” IEEE Transactions on Circuits and Systems, vol. 22,
pp. 504–509, June 1975.
[13] H. Gaunholt, P. Heikkilä, K. Mannersalo, V. Porra, and M. Valtonen, “Gyrator transformation — a better way for modified nodal approach,” in Proceedings of European Conference on Circuit Theory and Design, vol. 2,
pp. 864–872, July 1991.
[14] K. S. Kundert, “Sparse matrix techniques,” in Circuit Analysis, Simulation
and Design (A. E. Ruehli, ed.), pp. 281–324, Elsevier Science Publishers
B. V., 1986.
[15] V. Linja-aho, “Homotopy methods in DC circuit analysis,” Master’s thesis,
Helsinki University of Technology, 2006.
[16] L. Trajković, R. C. Melville, and S.-C. Fang, “Finding DC operating points
of transistor circuits using homotopy methods,” in Proceedings of IEEE
International Symposium on Circuits and Systems, pp. 758–761, 1991.
[17] K. Yamamura, T. Sekiguchi, and Y. Inoue, “A fixed-point homotopy method
for solving modified nodal equations,” IEEE Transactions on Circuits and
Systems I, vol. 46, pp. 654–665, June 1999.
[18] R. C. Melville, L. Trajković, S.-C. Fang, and L. T. Watson, “Artificial parameter homotopy methods for the DC operating point problem,” IEEE
Transactions on Computer-Aided Design, vol. 12, pp. 861–877, June 1993.
[19] C. T. Kelley, Iterative Methods for Linear and Nonlinear Equations.
Philadelphia: SIAM, 1995.
[20] H. R. Yeager and R. W. Dutton, “Improvement in norm-reducing Newton
methods for circuit simulation,” IEEE Transactions on Computer-Aided
Design, vol. 8, pp. 538–546, May 1989.
[21] J. Roos, Improving the Speed and Convergence of DC Analysis by Means of
Self-Generating Lookup Tables and Piecewise-Linear Analysis. PhD thesis,
Helsinki University of Technology, 1999.
[22] P. Maffezzoni, L. Codecasa, and D. D’Amore, “Time-domain simulation of
nonlinear circuits through implicit Runge–Kutta methods,” IEEE Transactions on Circuits and Systems I: Regular Papers, vol. 54, pp. 391–400,
Feb. 2007.
[23] M. S. Nakhla and J. Vlach, “A piecewise harmonic balance technique for
determination of the periodic response of nonlinear system,” IEEE Transactions on Circuits and Systems, vol. CAS-23, pp. 85–91, Feb. 1976.
[24] S. A. Maas, Nonlinear Microwave Circuits. MA: Artech House, Inc., 1988.
[25] R. Dembo, S. Eisenstat, and T. Steihaug, “Inexact Newton methods,”
SIAM Journal on Numerical Analysis, vol. 19, pp. 400–408, 1982.
[26] D. Valtchev and V. Georgiev, “Time-frequency transformation for the spectral balance methods,” International Journal of Electronics and Communications (AEÜ), vol. 49, no. 1, 1995.
[27] J. Virtanen, V. Karanko, T. Tinttunen, and M. Heimlich, “Frequency selective harmonic balance analysis,” in Proceedings of the EUMW’09, pp. 1070–
1073, 2009.
[28] V. Karanko and T. Tinttunen, “Multi-rate harmonic balance provides a new
solution for nonlinear simulation,” High Frequency Electronics, pp. 30–37,
2009.
[29] V. Rizzoli, D. Masotti, F. Mastri, and E. Montanari, “System-oriented
harmonic-balance algorithm for circuit-level simulation,” IEEE Transactions on Computer-Aided Design, vol. 30, pp. 256–269, Feb. 2011.
[30] Y. Saad, “A flexible inner-outer preconditioned GMRES algorithm,” SIAM
Journal on Scientific Computing, vol. 14, no. 2, pp. 461–469, 1993.
[31] H. A. van der Vorst and C. Vuik, “GMRESR: a family of nested GMRES
methods,” Numerical Linear Algebra with Applications, vol. 1, pp. 369–
386, 1994.
[32] J. J. Dongarra, I. S. Duff, D. C. Sorensen, and H. A. Van der Vorst, Numerical Linear Algebra for High-Performance Computers. Philadelphia: SIAM,
1998.
[33] P. Feldmann, B. Melville, and D. Long, “Efficient frequency domain analysis of large nonlinear analog circuits,” in Proceedings of the IEEE 1996
Custom Integrated Circuits Conference, pp. 461–464, May 1996.
[34] R. Telichevesky, K. Kundert, I. Elfadel, and J. White, “Fast simulation
algorithms for RF circuits,” in Proceedings of the IEEE 1996 Custom Integrated Circuits Conference, pp. 437–444, May 1996.
[35] D. Long, R. Melville, K. Ashby, and B. Horton, “Full-chip harmonic balance,” in Proceedings of the IEEE 1997 Custom Integrated Circuits Conference, pp. 379–382, May 1997.
[36] O. Nastov, Spectral methods for Circuit Analysis. PhD thesis, MIT, EECS,
1999.
[37] F. Veerse, “Efficient iterative time preconditioners for harmonic balance
RF circuit simulation,” in Proceedings of the 2003 IEEE/ACM International Conference on Computer-Aided Design, (Washington, DC, USA),
pp. 251–254, IEEE Computer Society, 2003.
[38] W. Dong and P. Li, “Hierarchical harmonic balance methods for frequency-domain analog circuit analysis,” IEEE Transactions on Computer-Aided
Design, vol. 26, pp. 2089–2101, Dec. 2007.
[39] R. Fletcher and C. M. Reeves, “Function minimization by conjugate gradients,” Computer Journal, vol. 7, pp. 149–154, 1964.
[40] C. T. Kelley, Iterative Methods for Optimization. Philadelphia: SIAM, 1999.
[41] L. Grippo, F. Lampariello, and S. Lucidi, “A nonmonotone line search technique for Newton’s method,” SIAM Journal on Numerical Analysis, vol. 23,
pp. 707–716, Aug. 1986.
[42] M. Ferris, S. Lucidi, and M. Roma, “Nonmonotone curvilinear line search
methods for unconstrained optimization,” Computational Optimization
and Applications, vol. 6, pp. 117–136, 1996.
[43] M. Powell, “A new algorithm for unconstrained optimization,” in Nonlinear
Programming (J. Rosen, O. Mangasarian, and K. Ritter, eds.), pp. 31–65,
New York: Academic Press, 1970.
[44] K. Levenberg, “A method for the solution of certain nonlinear problems
in least squares,” Quarterly of Applied Mathematics, vol. 4, pp. 164–168,
1944.
[45] D. W. Marquardt, “An algorithm for least-squares estimation of nonlinear
parameters,” Journal of the Society for Industrial and Applied Mathematics, vol. 11, pp. 431–441, 1963.
[46] A. Bouaricha and R. Schnabel, “TENSOLVE: A software package for solving systems of nonlinear equations and nonlinear least squares problems
using tensor methods,” preprint MCS-P463-0894, Mathematics and Computer Science Division, Argonne National Laboratory, Argonne, IL, 1994.
[47] N. Deng, Y. Xiao, and F. Zhou, “Nonmonotonic trust region algorithm,”
Journal of optimization theory and applications, vol. 76, pp. 259–285, Feb.
1993.
[48] P. L. Toint, “A non-monotone trust-region algorithm for nonlinear optimization subject to convex constraints,” Mathematical Programming,
vol. 77, pp. 69–94, 1997.
[49] R. Schnabel and P. Frank, “Tensor methods for nonlinear equations,” SIAM
Journal on Numerical Analysis, vol. 21, pp. 815–843, Oct. 1984.
[50] H. H. Happ, Diakoptics and Networks. New York: Academic Press, 1971.
[51] L. O. Chua and L.-K. Chen, “Diakoptic and generalized hybrid analysis,”
IEEE Transactions on Circuits and Systems, vol. CAS-23, pp. 694–705,
Dec. 1976.
[52] F. F. Wu, “Solution of large-scale networks by tearing,” IEEE Transactions
on Circuits and Systems, vol. CAS-23, pp. 706–713, Dec. 1976.
[53] G. Guardabassi and A. Sangiovanni-Vincentelli, “A two level algorithm
for tearing,” IEEE Transactions on Circuits and Systems, vol. CAS-23,
pp. 783–791, Dec. 1976.
[54] I. N. Hajj, “Sparsity considerations in network solution by tearing,” IEEE
Transactions on Circuits and Systems, vol. CAS-27, pp. 357–366, May
1980.
[55] U. Kleis, O. Wallat, U. Wever, and Q. Zheng, “Domain decomposition methods for circuit simulation,” in Proceedings of the 8th Workshop on Parallel
and Distributed Simulation, pp. 183–184, 1994.
[56] H. Yu, C. Chu, Y. Shi, D. Smart, L. He, and S. X.-D. Tan, “Fast analysis of
a large-scale inductive interconnect by block-structure-preserved macromodeling,” IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 18, pp. 1399–1411, Oct. 2010.
[57] M. Vlach, “LU decomposition algorithms for parallel and vector computation,” in Analog Methods for Computer-Aided Circuit Analysis and Design
(T. Ozawa, ed.), pp. 37–64, New York and Basel: Marcel Dekker Inc., 1988.
[58] N. B. G. Rabbat, A. L. Sangiovanni-Vincentelli, and H. Y. Hsieh, “A multilevel Newton algorithm with macromodeling and latency for the analysis
of large-scale nonlinear circuits in the time domain,” IEEE Transactions
on Circuits and Systems, vol. CAS-26, pp. 733–741, Sept. 1979.
[59] X. Zhang, R. H. Byrd, and R. B. Schnabel, “Parallel methods for solving
nonlinear block bordered system of equations,” SIAM Journal on Scientific
and Statistical Computing, vol. 13, pp. 841–859, July 1992.
[60] X. Zhang, “Dynamic and static load balancing for block bordered system
circuit equations on multiprocessors,” IEEE Transactions on Computer-Aided Design, vol. 11, pp. 1086–1094, Sept. 1992.
[61] J. Borchhardt, F. Grund, D. Horn, and M. Uhle, “MAGNUS — mehrstufige
analyse großer netzwerke und systeme,” Tech. Report 9, WIAS, Berlin,
1994.
[62] C. Cocchi, A. Benedetti, and Z. M. Kovàcs-V., “A new subcircuit ordering
algorithm for a multilevel circuit simulator,” in Proceedings of European
Conference on Circuit Theory and Design, pp. 1059–1062, 1995.
[63] U. Wever and Q. Zheng, “Parallel transient analysis for circuit simulation,”
in Proceedings of the 29th Annual Hawaii international Conference on Systems Sciences, pp. 442–447, 1996.
[64] N. Fröhlich, B. M. Riess, U. A. Wever, and Q. Zheng, “A new approach for
parallel simulation of VLSI circuits on a transistor level,” IEEE Transactions on Circuits and Systems I, vol. 45, pp. 601–613, June 1998.
[65] J. Borchhardt, F. Grund, and D. Horn, “Parallelized numerical methods
for large systems of differential-algebraic equations in industrial applications,” Surveys on Mathematics for Industry, vol. 8, pp. 201–211, 1999.
[66] V. Rizzoli, F. Mastri, and D. Masotti, “A hierarchical harmonic-balance technique for the efficient simulation of large size nonlinear microwave circuits,” in Proceedings of 25th European Microwave Conference, pp. 615–
619, 1995.
[67] N. R. Aluru and J. White, “A multi-level Newton method for static and
fundamental frequency analysis of electromechanical systems,” in International Conference on Simulation of Semiconductor Processes and Devices
SISPAD’97, pp. 125–128, 1997.
[68] S. D. Senturia, N. Aluru, and J. White, “Simulating the behavior of MEMS
devices: Computational methods and needs,” IEEE Computational Science
and Engineering, vol. 4, pp. 30–43, Jan. 1997.
[69] N. R. Aluru and J. White, “A multilevel Newton method for mixed-energy
domain simulation of MEMS,” Journal of Microelectromechanical Systems,
vol. 8, pp. 299–308, Sept. 1999.
[70] K. Mayaram and D. O. Pederson, “Coupling algorithms for mixed-level circuit and device simulation,” IEEE Transactions on Computer-Aided Design, vol. 11, pp. 1003–1012, Aug. 1992.
[71] F. M. Rotela, Mixed Circuit and Device Simulation for Analysis, Design,
and Optimization of Opto-Electronic, Radio Frequency, and High Speed
Semiconductor Devices. PhD thesis, Stanford University, Apr. 2000.
[72] G. Zanghirati, “Global convergence of nonmonotone strategies in parallel
methods for block-bordered nonlinear systems,” Journal of Computational
and Applied Mathematics, vol. 107, pp. 137–168, Jan. 2000.
[73] A. Geist, A. Beguelin, J. Dongarra, W. Jiang, R. Manchek, and V. Sunderam, PVM: Parallel Virtual Machine, A Users’ Guide and Tutorial for
Networked Parallel Computing. The MIT Press, 1994.
[74] Message Passing Interface Forum, “MPI: a message-passing interface
standard,” The International Journal of Supercomputer Applications,
vol. 8, no. 3/4, 1994.
[75] R. A. Saleh, K. A. Gallivan, M.-C. Chang, I. N. Hajj, D. Smart, and T. N.
Trick, “Parallel circuit simulation on supercomputers,” Proceedings of the
IEEE, vol. 77, pp. 1915–1931, Dec. 1989.
[76] X. Ye, W. Dong, P. Li, and S. Nassif, “Hierarchical multialgorithm parallel circuit simulation,” IEEE Transactions on Computer-Aided Design of
Integrated Circuits and Systems, vol. 30, pp. 45–58, Jan. 2011.
[77] M. Hulkkonen, “Graphics processing unit utilization in circuit simulation,”
Master’s thesis, Aalto University School of Electrical Engineering, 2011.
[78] R. E. Poore, “GPU-accelerated time-domain circuit simulation,” in IEEE
2009 Custom Integrated Circuits Conference, pp. 629–632, 2009.
[79] K. Gulati, J. Croix, S. Khatri, and R. Shastry, “Fast circuit simulation on
graphics processing units,” in Proceedings of 2009 Asia and South Pacific
Design Automation Conference, pp. 403 – 408, 2009.
[80] W. Bomhof, Iterative and Parallel methods for Linear Systems with Applications in Circuit Simulation. PhD thesis, Utrecht University, 2001.
[81] W. Bomhof and H. A. van der Vorst, “A parallel linear system solver for
circuit simulation problems,” Numerical Linear Algebra with Applications,
vol. 7, pp. 649–665, Oct.–Dec. 2000.
[82] A. Basermann, U. Jaekel, M. Nordhausen, and K. Hachiya, “Parallel iterative solvers for sparse linear systems in circuit simulation,” Future Generation Computer Systems, pp. 1275–1284, 2005.
[83] T. A. Davis and E. Palamadai Natarajan, “Algorithm 907: KLU, a direct
sparse solver for circuit simulation problems,” ACM Transactions on Mathematical Software, vol. 37, Sept. 2010.
[84] D. L. Rhodes and B. S. Perlman, “Parallel computation for microwave circuit simulation,” IEEE Transactions on Microwave Theory and Techniques,
vol. 45, pp. 587–592, May 1997.
[85] D. L. Rhodes and A. Gerasoulis, “A scheduling approach to parallel harmonic balance simulation,” Concurrency: Practice and Experience, vol. 12,
pp. 175–187, June 2000.
[86] W. Dong and P. Li, “A parallel harmonic balance approach to steady-state
and envelope-following simulation of driven and autonomous circuits,”
IEEE Transactions on Computer-Aided Design, vol. 8, pp. 409–501, Apr.
2009.
[87] M. Celik, L. Pileggi, and A. Odabasioglu, IC Interconnect Analysis.
Boston/Dordrecht/London: Kluwer Academic Publishers, 2002.
[88] S. X.-D. Tan and L. He, Advanced Model Order Reduction Techniques in
VLSI Design. New York, NY: Cambridge University Press, 2007.
[89] P. Benner, M. Hinze, and E. J. W. ter Maten, eds., Model Reduction for
Circuit Simulation, vol. 74 of Lecture Notes in Electrical Engineering.
Springer, 2011.
[90] A. C. Antoulas, Approximation of Large-Scale Dynamical Systems (Advances in Design and Control). Philadelphia, PA, USA: Society for Industrial and Applied Mathematics, 2005.
[91] L. T. Pillage and R. A. Rohrer, “Asymptotic waveform evaluation for timing
analysis,” IEEE Transactions on Computer-Aided Design, vol. 9, pp. 352–
366, Apr. 1990.
[92] P. Feldmann and R. Freund, “Efficient linear circuit analysis by Padé approximation via the Lanczos process,” IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 14, pp. 639–649, May
1995.
[93] A. Odabasioglu, M. Celik, and L. T. Pileggi, “PRIMA: passive reduced-order
interconnect macromodeling algorithm,” IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 17, pp. 645–654, Aug.
1998.
[94] R. W. Freund, “SPRIM: structure-preserving reduced-order interconnect
macromodeling,” in Proceedings of the 2004 IEEE/ACM International Conference on Computer-Aided Design, ICCAD ’04, pp. 80–87, 2004.
[95] B. Moore, “Principal component analysis in linear systems: controllability, observability, and model reduction,” IEEE Transactions on Automatic
Control, vol. 26, pp. 17–32, Feb. 1981.
[96] J. R. Phillips, L. Daniel, and L. M. Silveira, “Guaranteed passive balanced
transformation for model order reduction,” in Proceedings of Design Automation Conference, pp. 52–57, 2002.
[97] J. Phillips and L. Silveira, “Poor man’s TBR: a simple model reduction
scheme,” IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 24, pp. 43–55, Jan. 2005.
[98] T. Stykel, “Balancing-related model reduction of circuit equations using
topological structure,” in Model Reduction for Circuit Simulation (P. Benner, M. Hinze, and E. J. W. ter Maten, eds.), pp. 53–83, Springer, 2011.
[99] B. N. Sheehan, “TICER: realizable reduction of extracted RC circuits,” in
Proceedings of the 1999 IEEE/ACM International Conference on Computer-Aided Design, pp. 200–203, 1999.
[100] B. N. Sheehan, “Realizable reduction of RC networks,” IEEE Transactions
on Computer-Aided Design of Integrated Circuits and Systems, vol. 26,
pp. 1393–1407, Aug. 2007.
[101] C. S. Amin, M. H. Chowdhury, and Y. I. Ismail, “Realizable RLCK circuit
crunching,” in Proceedings of the 40th annual Design Automation Conference, (New York, USA), pp. 226–231, ACM, 2003.
[102] G. Karypis and V. Kumar, “hMETIS, a hypergraph partitioning package
version 1.5.3.”
[103] H. Liao and W. W.-M. Dai, “Partitioning and reduction of RC interconnect
networks based on scattering parameter macromodels,” in Digest of Technical Papers of IEEE/ACM International Conference on Computer Aided
Design, pp. 704–709, 1995.
[104] R. Ionutiu, J. Rommes, and W. Schilders, “SparseRC: sparsity preserving
model reduction for RC circuits with many terminals,” IEEE Transactions
on Computer-Aided Design of Integrated Circuits and Systems, vol. 30,
pp. 1828–1841, Dec. 2011.
[105] P. Miettinen, M. Honkala, J. Roos, and M. Valtonen, “Partitioning-based
reduction of circuits with mutual inductances,” in Scientific Computing in
Electrical Engineering SCEE 2010 (B. Michielsen and J.-R. Poirier, eds.),
Mathematics in Industry, Vol. 16, 2012.
[106] Y.-M. Lee and C. C.-P. Chen, “Hierarchical model order reduction for
signal-integrity interconnect synthesis,” in Proceedings of the 11th Great
Lakes symposium on VLSI, pp. 109–114, 2001.
[107] Y.-M. Lee, Y. Cao, T.-H. Chen, J. M. Wang, and C. C.-P. Chen, “HiPRIME:
hierarchical and passivity preserved interconnect macromodeling engine
for RLCK power delivery,” IEEE Transactions on Computer-Aided Design
of Integrated Circuits and Systems, vol. 24, pp. 797–806, June 2005.
[108] D. Li, S. X.-D. Tan, and L. Wu, “Hierarchical Krylov subspace based reduction of large interconnects,” Integration, the VLSI Journal, vol. 42, pp. 193–
202, Feb. 2009.
[109] T. Palenius and J. Roos, “Comparison of reduced-order interconnect macromodels for time-domain simulation,” IEEE Transactions on Microwave
Theory and Techniques, vol. 52, pp. 2240–2250, Sept. 2004.
[110] F. Yang, X. Zeng, Y. Su, and D. Zhou, “RLCSYN: RLC equivalent circuit
synthesis for structure-preserved reduced-order model of interconnect,” in
Proceedings of IEEE International Symposium on Circuits and Systems,
pp. 2710–2713, 2007.
[111] Y. Su, J. Wang, X. Zeng, Z. Bai, C. Chiang, and D. Zhou, “SAPOR: second-order Arnoldi method for passive order reduction of RCS circuits,” in Proceedings of International Conference on Computer-Aided Design (ICCAD),
pp. 74–79, 2004.
[112] Y. Matsumoto, Y. Tanji, and M. Tanaka, “Efficient SPICE-netlist representation of reduced-order interconnect model,” in Proceedings of European
Conference on Circuit Theory and Design, vol. 2, (Espoo, Finland), pp. 145–
148, Aug. 2001.
[113] J. Roos, M. Honkala, and P. Miettinen, “PRIOR: passive, reciprocal, and
infinity-observing reduction,” presentation at Autumn School on Future
Developments in Model Order Reduction, Terschelling, The Netherlands,
September 21–25, 2009.
[114] P. Miettinen, M. Honkala, and J. Roos, “Partitioning based RL-in–RL-out
MOR method,” in Scientific Computing in Electrical Engineering SCEE
2008 (L. R. Costa and J. Roos, eds.), pp. 119–120, 2008.
Errata
Publication III
Eq. (4):
\[
J = \begin{bmatrix}
A_1 &     &        &     & B_1 \\
    & A_2 &        &     & B_2 \\
    &     & \ddots &     & \vdots \\
    &     &        & A_m & B_m \\
C_1 & C_2 & \cdots & C_m & D
\end{bmatrix}
\]
Page 4: “– it is in direction of steepest descent” should be “– it is in the
descent direction”.
Publication V
Eq. (10) and (15): C and G should be C̃ and G̃. Eq. (17): C_k s and G_k s
should be c_k s and g_k.
Aalto-DD 174/2012

Modern electronic circuits are typically large, consisting of thousands of transistors and other components. During the design process, there is a need to perform computationally demanding numerical simulations to verify the functionality of the circuit. Thus, the need for fast and accurate circuit simulation tools is obvious. Four approaches to improve the speed and the convergence of the numerical circuit simulation are presented.