A quick and dirty introduction to IDR

Jens-Peter M. Zemke
zemke@tu-harburg.de
Institut für Numerische Simulation
Technische Universität Hamburg-Harburg
May 27th, 2011
Outline

Basics
  Internal guidelines
  Krylov subspace methods
  Hessenberg decompositions
  Polynomial representations
  Perturbations

IDR(s)
  IDR
  IDR(s)
  IDREig
  IDR(s)Stab(ℓ)
  QMRIDR
Basics
Internal guidelines

What is the problem you’re considering?

I am trying to motivate why the method of Induced Dimension Reduction (IDR) and its generalization IDR(s) are worth considering when looking for iterative solvers for your type of problem, e.g.,

  • (large sparse) linear systems: Ax = r_0, A ∈ C^{n×n}, r_0 ∈ C^n, or
  • (large sparse) eigenvalue problems: Av = vλ.

I have a general interest in Krylov subspace methods; for me IDR(s) is just a new Krylov subspace method that offers interesting new possibilities.

My personal interest lies in the error analysis of perturbed Krylov subspace methods and their convergence properties. These perturbations are

  • always caused by finite precision,
  • sometimes caused deliberately, e.g., in inexact methods.
Basics
Internal guidelines

Why do you find this interesting?

The error analysis of Krylov subspace methods is by no means simple:

  • Krylov subspace methods are highly sophisticated tools,
  • most analysis is based on the fact that, in theory, Krylov subspace methods are direct methods, which no longer remains true,
  • the error propagation is highly non-linear,
  • the short-term methods tend to deviate very soon but still converge, now at another “rate” of convergence.

The known analysis of short-term recurrence Krylov subspace methods is

  • mostly restricted to the simplest method, the symmetric Lanczos method,
  • based on tools from a variety of areas that do not seem to be related to Krylov subspace methods at all,
  • either for very specific implementations or offers very little insight.
Basics
Internal guidelines

What is the background?

Krylov subspace methods are based on very basic ideas from Linear Algebra, namely, linear combinations, subspaces, and projections. Yet, the analysis of these methods relates them to various other interesting areas.

The tools of the trade include:

  • Matrix Analysis (Matrix Functions),
  • Potential Theory (Green’s Functions, Capacity),
  • Holomorphic Functions (Residue Theorem),
  • Laurent Expansions,
  • (Padé) Approximation,
  • (Lagrange/Hermite) Interpolation,
  • (Formal) Orthogonal Polynomials,
  • Riemann–Stieltjes Integrals,
  • and many, many more . . .
Basics
Internal guidelines

What are you going to talk about?

I will

  • give a brief introduction to Krylov subspace methods,
  • present a sketch of IDR/IDR(s),
  • explain why it is different,
  • report on the observed behavior,
  • sketch possible generalizations.

If I succeed, you will have a feeling for some of the important aspects of IDR/IDR(s) and can read the papers on the subject for more details of particular methods.

In passing, I will note some aspects not to be found in the literature and outline some paths of possible generalizations.
Basics
Krylov subspace methods

Background

Large linear systems are solved by projection onto smaller subspaces,

  Ax = r_0,   x_k := Q_k z_k,   Q̂_k^H A x = (Q̂_k^H A Q_k) z_k = Q̂_k^H r_0.

Galërkin method:

  • Bubnov–Galërkin: Q̂_k = Q_k, Q_k^H Q_k = I_k (orthonormal basis),
  • Petrov–Galërkin: Q̂_k^H Q_k = I_k (bi-orthonormal bases).

Subspaces of increasing dimension. As starting vector use r_0, e.g.,

  Q_1 := q_1 := r_0/‖r_0‖,   H_1 := Q_1^H A Q_1,   z_1 := H_1^{-1} e_1 ‖r_0‖,   x_1 := Q_1 z_1.

Compute residual: r_1 := r_0 − A x_1 = Q_1 e_1 ‖r_0‖ − A Q_1 z_1. Both steps involve A q_1.

Expand space:

  K_2 := span{r_0, A r_0} = span{q_1, q_2}.
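A minimal NumPy sketch of this first projection step; the test matrix and right-hand side are made up for the illustration and are not from the talk:

    import numpy as np

    # One Bubnov-Galerkin step on a made-up, well-conditioned test problem.
    n = 100
    rng = np.random.default_rng(0)
    A = rng.standard_normal((n, n)) + n * np.eye(n)
    r0 = rng.standard_normal(n)

    q1 = r0 / np.linalg.norm(r0)          # first basis vector q_1
    Q1 = q1[:, None]                      # Q_1 as an n x 1 matrix
    H1 = Q1.T @ A @ Q1                    # projected matrix Q_1^H A Q_1
    z1 = np.linalg.solve(H1, np.array([np.linalg.norm(r0)]))
    x1 = Q1 @ z1                          # first approximation
    r1 = r0 - A @ x1                      # residual; both steps involve A q1
    print(np.linalg.norm(r1) / np.linalg.norm(r0))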
Basics
Krylov subspace methods

Krylov subspaces

Natural generalization of this simple idea: Krylov subspaces, obtained by multiplication of the last basis vector by A,

  K_k := span{r_0, A r_0, . . . , A^{k−1} r_0} = span{q_1, q_2, . . . , q_k}.

Krylov subspaces are isomorphic (up to a certain degree) to polynomial spaces,

  x ∈ K_k  ⇔  x = Σ_{j=0}^{k−1} A^j r_0 c_{j+1} = p_{k−1}(A) r_0,   p_{k−1}(z) = Σ_{j=0}^{k−1} c_{j+1} z^j.

Residual polynomials are polynomials that

  • satisfy r_k = ρ_k(A) r_0 and
  • are normalized by the condition ρ_k(0) = 1.

Residual polynomials arise because

  r_k := r_0 − A x_k = (I − A p_{k−1}(A)) r_0 =: ρ_k(A) r_0.
Basics
Krylov subspace methods

Krylov subspace methods

There are mainly two classes of Krylov subspace methods:

  • long-term (Hessenberg, Arnoldi),
  • short-term (Lanczos).

Arnoldi: Example of a long-term method building an orthonormal basis.

  r = r_0, q = r/‖r‖
  Q = [q], H = []
  for k = 1, . . .
    r = A q
    c = Q^H r
    r = r − Q c
    H = [H, c; 0^T, ‖r‖]
    q = r/‖r‖, Q = [Q, q]
  end
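A runnable NumPy transcription of this loop (a sketch; variable names follow the listing above):

    import numpy as np

    def arnoldi(A, r0, m):
        """Build an orthonormal Krylov basis Q and extended Hessenberg H
        with A @ Q[:, :m] = Q[:, :m+1] @ H (transcribes the slide's loop)."""
        n = r0.shape[0]
        Q = np.zeros((n, m + 1), dtype=complex)
        H = np.zeros((m + 1, m), dtype=complex)
        Q[:, 0] = r0 / np.linalg.norm(r0)
        for k in range(m):
            r = A @ Q[:, k]
            c = Q[:, :k + 1].conj().T @ r     # orthogonalization coefficients
            r = r - Q[:, :k + 1] @ c          # classical Gram-Schmidt step
            H[:k + 1, k] = c
            H[k + 1, k] = np.linalg.norm(r)
            Q[:, k + 1] = r / H[k + 1, k]
        return Q, H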
Basics
Hessenberg decompositions

Hessenberg decompositions

The construction of the basis vectors is reflected in the structure of the arising Hessenberg decomposition

  A Q_k = Q_{k+1} H_k,

where

  • Q_{k+1} = [Q_k, q_{k+1}] ∈ C^{n×(k+1)} collects the basis vectors,
  • H_k ∈ C^{(k+1)×k} is an unreduced extended Hessenberg matrix.

Aspects of perturbed Krylov subspace methods can be captured with perturbed Hessenberg decompositions

  A Q_k + F_k = Q_{k+1} H_k,

where F_k ∈ C^{n×k} accounts for the perturbations.
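Using the Arnoldi sketch above, the (unperturbed) decomposition can be checked numerically; the residual of the identity is at the level of rounding errors:

    import numpy as np

    # Check A Q_k = Q_{k+1} H_k for the Arnoldi basis (uses arnoldi() from above).
    n, m = 50, 10
    rng = np.random.default_rng(1)
    A = rng.standard_normal((n, n))
    Q, H = arnoldi(A, rng.standard_normal(n).astype(complex), m)
    print(np.linalg.norm(A @ Q[:, :m] - Q @ H))   # O(machine epsilon)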
Basics
Hessenberg decompositions

Karl Hessenberg & “his” matrix + decomposition

Behandlung linearer Eigenwertaufgaben mit Hilfe der Hamilton-Cayleyschen Gleichung [Treatment of linear eigenvalue problems with the aid of the Hamilton–Cayley equation], Karl Hessenberg, 1. Bericht der Reihe „Numerische Verfahren“, July 23rd, 1940, page 23:

  • Hessenberg decomposition, Eqn. (57),
  • Hessenberg matrix, Eqn. (58).

Karl Hessenberg (* September 8th, 1904, † February 22nd, 1959)
Basics
Polynomial representations

Important Polynomials

The vectors from Krylov subspaces can be described in terms of polynomials. This representation carries over to the perturbed case with minor changes.

The residuals of the OR approximation x_k := Q_k z_k and the MR approximation x̄_k := Q_k z̄_k with coefficient vectors

  z_k := H_k^{-1} e_1 ‖r_0‖   and   z̄_k := H_k^† e_1 ‖r_0‖

satisfy

  r_k := r_0 − A x_k = R_k(A) r_0   and   r̄_k := r_0 − A x̄_k = R̄_k(A) r_0

with residual polynomials R_k and R̄_k given by

  R_k(z) := det(I_k − z H_k^{-1})   and   R̄_k(z) := det(I_k − z H_k^† I_k).

The convergence of OR and MR depends on the Ritz and harmonic Ritz values, respectively.
Basics
Perturbations

Perturbed OR methods

We sketch briefly how the setting changes when perturbations enter the stage in the special case of an OR method.

In the perturbed case

  A Q_k + F_k = Q_{k+1} H_k,

under the assumption that all trailing square Hessenberg matrices are regular, the polynomial representation for the OR residuals changes to

  r_k = R_k(A) r_0 − Σ_{ℓ=1}^{k} z_{ℓk} R_{ℓ+1:k}(A) f_ℓ + F_k z_k,

where

  R_{ℓ+1:k}(z) := det(I_{k−ℓ} − z H_{ℓ+1:k}^{-1}).

We can expect convergence when F_k z_k remains bounded (inexact methods) and all R_{ℓ+1:k}(A) are “small”.
IDR(s)
IDR

Birth of a method

In 1976, Peter Sonneveld of TU Delft “stumbled upon” the three-term recurrence

  r_{k+1} = (I − A)(r_k + γ_k (r_k − r_{k−1})),   where   γ_k := p^H r_k / p^H (r_{k−1} − r_k).

This recurrence (almost) always results in the zero vector after 2n steps, where A ∈ C^{n×n} and r_0 ∈ C^n, r_1 = A r_0, and p ∈ C^n are arbitrarily chosen.

He realized that the recurrence constructs vectors in spaces G_j of shrinking dimensions:

  G_0 := K(A, r_0) = span{r_0, A r_0, A^2 r_0, . . .},
  G_j := (I − A)(G_{j−1} ∩ S),   S = span{p}^⊥,   j = 1, . . .

More precisely,

  r_{2j}, r_{2j+1} ∈ G_j,   j = 0, 1, . . .
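A small NumPy experiment with the recurrence on random data (purely illustrative, not from the talk); in exact arithmetic r_{2n} = 0, and for modest n the computed residual norm indeed collapses:

    import numpy as np

    # Sonneveld's three-term recurrence on random data; watch ||r_k|| collapse.
    n = 8
    rng = np.random.default_rng(2)
    A = rng.standard_normal((n, n))
    p = rng.standard_normal(n)
    r_prev = rng.standard_normal(n)              # r_0
    r = A @ r_prev                               # r_1 = A r_0
    for k in range(1, 2 * n + 1):
        gamma = (p @ r) / (p @ (r_prev - r))
        s = r + gamma * (r - r_prev)             # r_k + gamma_k (r_k - r_{k-1})
        r_prev, r = r, s - A @ s                 # r_{k+1} = (I - A) s
        print(k + 1, np.linalg.norm(r))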
IDR(s)
IDR

The origin of IDR: primitive IDR

With r_0 := b − A x_0, the Richardson iteration is carried out as follows:

  x_{k+1} = x_k + r_k,   r_{k+1} = (I − A) r_k.

In a Richardson-type IDR algorithm, the second equation is replaced by the update

  r_{k+1} = (I − A)(r_k + γ_k (r_k − r_{k−1})),   γ_k = p^H r_k / p^H (r_{k−1} − r_k).

The update of the iterates has to be modified accordingly,

  −A(x_{k+1} − x_k) = r_{k+1} − r_k = (I − A)(r_k + γ_k (r_k − r_{k−1})) − r_k
                                    = (I − A)(r_k − γ_k A(x_k − x_{k−1})) − r_k
                                    = −A(r_k + γ_k (I − A)(x_k − x_{k−1}))
  ⇔  x_{k+1} − x_k = r_k + γ_k (I − A)(x_k − x_{k−1})
                   = r_k + γ_k (x_k − x_{k−1} + r_k − r_{k−1}).
IDR(s)
IDR

The origin of IDR: primitive IDR

Sonneveld terms the outcome the Primitive IDR Algorithm (Sonneveld, 2006). Indexed form:

  r_0 = b − A x_0
  x_1 = x_0 + r_0
  r_1 = r_0 − A r_0
  for k = 1, 2, . . . do
    γ_k = p^T r_k / p^T (r_{k−1} − r_k)
    s_k = r_k + γ_k (r_k − r_{k−1})
    x_{k+1} = x_k + γ_k (x_k − x_{k−1}) + s_k
    r_{k+1} = s_k − A s_k
  done

In-place form, keeping only the last two iterates:

  x_old = x_0; r_old = b − A x_old
  x_new = x_old + r_old; r_new = r_old − A r_old
  while “not converged” do
    γ = p^T r_new / p^T (r_old − r_new)
    s = r_new + γ (r_new − r_old)
    x_tmp = x_new + γ (x_new − x_old) + s
    r_tmp = s − A s
    x_old = x_new; x_new = x_tmp
    r_old = r_new; r_new = r_tmp
  done

On the next slide we compare Richardson iteration (red) and PIA (blue).
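A direct NumPy transcription of the in-place listing (a sketch; the stopping rule with maxit and tol is an assumption added for the example, not part of the original algorithm):

    import numpy as np

    def primitive_idr(A, b, x0, p, maxit=200, tol=1e-10):
        """Primitive IDR Algorithm, transcribed from the listing above (sketch)."""
        x_old = np.asarray(x0, dtype=float)
        r_old = b - A @ x_old
        x_new = x_old + r_old
        r_new = r_old - A @ r_old
        for _ in range(maxit):
            gamma = (p @ r_new) / (p @ (r_old - r_new))
            s = r_new + gamma * (r_new - r_old)
            x_old, x_new = x_new, x_new + gamma * (x_new - x_old) + s
            r_old, r_new = r_new, s - A @ s
            if np.linalg.norm(r_new) <= tol * np.linalg.norm(b):
                break
        return x_new, r_new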
IDR(s)
IDR

The origin of IDR: primitive IDR

Impressions of “finite termination” and acceleration in finite precision:

[Figure: true and updated residuals versus matrix–vector multiplies for PIA, six panels: n = 5, 20, 100, each without and with scaling.]
IDR(s)
IDR

The origin of IDR: primitive IDR

Sonneveld never did use PIA, as he considered it to be too unstable; instead he went on with a corresponding acceleration of the Gauß-Seidel method. In (Sonneveld, 2008) he terms this method Accelerated Gauß-Seidel (AGS) and refers to it as “[t]he very first IDR-algorithm [..]”, see page 6, Ibid.

This part of the story took place “in the background” in the year 1976.

In September 1979 Sonneveld attended the IUTAM Symposium on Approximation Methods for Navier-Stokes Problems in Paderborn, Germany. At this symposium he presented a new variant of IDR based on a variable splitting I − ω_j A, where ω_j is fixed for two steps and otherwise could be chosen freely, but non-zero.

This algorithm, with minimization of every second residual, is included in the proceedings from 1980 (Wesseling and Sonneveld, 1980). The connection to Krylov methods, e.g., BiCG/Lanczos, is also given there.
IDR(s)
IDR

The origin of IDR: classical IDR

A numerical comparison of Richardson iteration, original IDR, and PIA.

[Figure: true and updated residuals versus matrix–vector multiplies, six panels: n = 5, 20, 100, each without and with scaling.]
IDR(s)
IDR

IDR: BiCGStab

Later, Peter Sonneveld developed CGS based on the ideas behind IDR and, together with Henk van der Vorst, rewrote the IDR variant to one that explicitly constructs the coefficients of the underlying Lanczos recurrence.

This rewritten variant was published by Henk van der Vorst under the name BiCGStab.

In short: BiCGStab is (almost mathematically equivalent to) IDR.
IDR(s)
IDR(s)

IDR(s)

IDR can be generalized: instead of using one hyperplane (span{p})^⊥, one uses the intersection of s hyperplanes. This makes the dimension reduction step less frequent, but the reduction a larger one.

This generalized IDR, termed IDR(s), was developed in 2006 by Peter Sonneveld and Martin van Gijzen.

In the context of Krylov subspace methods, IDR(s) can be thought of as a two-sided Lanczos method. There is a predecessor to such a method, namely, ML(k)BiCGStab by Man-Chung Yeung and Tony Chan.
IDR(s)
IDR(s)

Building blocks of IDR(s)

IDR(s) is a Krylov subspace method based on two building blocks:

  • Multiplication by polynomials in A. (IDR(s): linear, IDR(s)Stab(ℓ): higher degree)
  • Oblique projection perpendicular to P ∈ C^{n×s}.

IDR(s) constructs nested subspaces of shrinking dimensions.

The prototype IDR(s) method constructs spaces G_j as follows (see the numerical sketch below):

  • Define G_0 := K(A, r_0) = span{r_0, A r_0, A^2 r_0, . . .}.
  • Iterate G_j := (I − ω_j A)(G_{j−1} ∩ S), j = 1, 2, . . . , with ω_j ∈ C, ω_j ≠ 0.

Only sufficiently many vectors in each space are constructed.
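The dimension reduction can be observed numerically. The sketch below is illustrative only: the data are random, ω_j is fixed to an arbitrary nonzero value, and the intersection is formed via nullspace computations under generic-rank assumptions; it tracks dim G_j, which drops by s per step as the IDR theorem predicts:

    import numpy as np
    from scipy.linalg import null_space, orth

    # Watch dim(G_j) shrink by s per step (illustration with random data).
    n, s, omega = 12, 3, 0.7              # omega: arbitrary nonzero choice
    rng = np.random.default_rng(5)
    A = rng.standard_normal((n, n))
    P = rng.standard_normal((n, s))       # S = null(P^H): intersection of s hyperplanes
    G = np.eye(n)                         # for generic r_0, G_0 is all of C^n
    for j in range(1, 4):
        C = null_space(P.T @ G)           # coefficients spanning G_{j-1} ∩ S
        G = orth((np.eye(n) - omega * A) @ (G @ C))
        print(j, G.shape[1])              # prints n-s, n-2s, n-3s: 9, 6, 3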
IDR(s)
IDR(s)

IDR is Lanczos times something

It turns out that:

  • IDR(s) is a transpose-free variant of a Lanczos process with one right-hand side and s left-hand sides.
  • IDR(s) is a Lanczos-type product method, i.e., most residuals can be written as

      r^{IDR}_{j(s+1)+k} = Ω_j(A) ρ_{js+k}(A) r_0,   1 ≤ k ≤ s,

    where ρ_{js+k} are residual polynomials of the Lanczos process.

Reminder: Residual polynomials are polynomials that

  • satisfy r_k = ρ_k(A) r_0 and
  • are normalized by the condition ρ_k(0) = 1.
IDR(s)
IDR(s)

Generalized Hessenberg decomposition

IDR(s) can be captured using a generalized Hessenberg decomposition

  A Q_k U_k = Q_{k+1} H_k.

IDR based methods include BiCGStab (rewritten version of IDR) and CGS.

OR based IDR methods use

  x_k := Q_k U_k z_k,   z_k := H_k^{-1} e_1 ‖r_0‖;

the residual is described by

  r_k := r_0 − A x_k = r_0 − A Q_k U_k z_k = r_0 − Q_{k+1} H_k z_k
       = Q_k (e_1 ‖r_0‖ − H_k z_k) − q_{k+1} h_{k+1,k} e_k^T z_k
       = R_k(A) r_0,   R_k(z) := det(I_k − z U_k H_k^{-1}).

Tacitly assuming ‖q_{k+1}‖ = 1, we have ‖r_k‖ = |h_{k+1,k} e_k^T z_k|.
IDR(s)
IDR(s)

IDR: Sonneveld pencil and Sonneveld matrix

We consider the prototype IDR(s) by Sonneveld/van Gijzen (IDR(s)ORes).

The IDR(s)ORes pencil, the so-called Sonneveld pencil (Y◦_n, Y_n D_ω^{(n)}), can be depicted by (here s = 3; × nonzero, + nonzero subdiagonal, ◦ zero)

  × × × × ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦      × × × × ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦
  + × × × × ◦ ◦ ◦ ◦ ◦ ◦ ◦      ◦ × × × × ◦ ◦ ◦ ◦ ◦ ◦ ◦
  ◦ + × × × × ◦ ◦ ◦ ◦ ◦ ◦      ◦ ◦ × × × × ◦ ◦ ◦ ◦ ◦ ◦
  ◦ ◦ + × × × × ◦ ◦ ◦ ◦ ◦      ◦ ◦ ◦ × × × × ◦ ◦ ◦ ◦ ◦
  ◦ ◦ ◦ + × × × × ◦ ◦ ◦ ◦      ◦ ◦ ◦ ◦ × × × × ◦ ◦ ◦ ◦
  ◦ ◦ ◦ ◦ + × × × × ◦ ◦ ◦      ◦ ◦ ◦ ◦ ◦ × × × × ◦ ◦ ◦
  ◦ ◦ ◦ ◦ ◦ + × × × × ◦ ◦  ,   ◦ ◦ ◦ ◦ ◦ ◦ × × × × ◦ ◦
  ◦ ◦ ◦ ◦ ◦ ◦ + × × × × ◦      ◦ ◦ ◦ ◦ ◦ ◦ ◦ × × × × ◦
  ◦ ◦ ◦ ◦ ◦ ◦ ◦ + × × × ×      ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦ × × × ×
  ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦ + × × ×      ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦ × × ×
  ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦ + × ×      ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦ × ×
  ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦ + ×      ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦ ×

The upper triangular matrix Y_n D_ω^{(n)} could be inverted, which results in the Sonneveld matrix, a full unreduced Hessenberg matrix.
IDR(s)
IDR(s)

Understanding IDR: Purification

We know the eigenvalues ≈ roots of the kernel polynomials, 1/ω_j. We are only interested in the other eigenvalues.

The purified IDR(s)ORes pencil (Y◦_n, U_n D_ω^{(n)}), which has only the remaining eigenvalues and some infinite ones as eigenvalues, consists of the banded Hessenberg matrix Y◦_n depicted above together with the block-diagonal matrix U_n D_ω^{(n)}, which has s×s upper triangular blocks, each followed by a zero row:

  × × × ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦
  ◦ × × ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦
  ◦ ◦ × ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦
  ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦
  ◦ ◦ ◦ ◦ × × × ◦ ◦ ◦ ◦ ◦
  ◦ ◦ ◦ ◦ ◦ × × ◦ ◦ ◦ ◦ ◦
  ◦ ◦ ◦ ◦ ◦ ◦ × ◦ ◦ ◦ ◦ ◦
  ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦
  ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦ × × × ◦
  ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦ × × ◦
  ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦ × ◦
  ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦

We get rid of the infinite eigenvalues using a change of basis (Gauß/Schur).
IDR(s)
IDR(s)

Understanding IDR: Gaussian elimination

The deflated purified IDR(s)ORes pencil, after the elimination step (Y◦_n G_n, U_n D_ω^{(n)}), can be depicted by the following left-hand matrix Y◦_n G_n (the right-hand matrix U_n D_ω^{(n)} is unchanged from the previous slide):

  × × × × × × × ◦ ◦ ◦ ◦ ◦
  + × × × × × × ◦ ◦ ◦ ◦ ◦
  ◦ + × × × × × ◦ ◦ ◦ ◦ ◦
  ◦ ◦ ◦ + ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦
  ◦ ◦ + + × × × × × × × ◦
  ◦ ◦ ◦ ◦ + × × × × × × ◦
  ◦ ◦ ◦ ◦ ◦ + × × × × × ◦
  ◦ ◦ ◦ ◦ ◦ ◦ ◦ + ◦ ◦ ◦ ◦
  ◦ ◦ ◦ ◦ ◦ ◦ + + × × × ×
  ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦ + × × ×
  ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦ + × ×
  ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦ +

Using Laplace expansion of the determinant of z U_n D_ω^{(n)} − Y◦_n G_n we can get rid of the trivial constant factors corresponding to infinite eigenvalues. This amounts to a deflation.
IDR(s)
IDR(s)

Understanding IDR: Deflation

Let D denote a deflation operator that removes every (s + 1)th column and row from the matrix the operator is applied to.

The deflated purified IDR(s)ORes pencil, after the deflation step (D(Y◦_n G_n), D(U_n D_ω^{(n)})), can be depicted by

  × × × × × × ◦ ◦ ◦      × × × ◦ ◦ ◦ ◦ ◦ ◦
  + × × × × × ◦ ◦ ◦      ◦ × × ◦ ◦ ◦ ◦ ◦ ◦
  ◦ + × × × × ◦ ◦ ◦      ◦ ◦ × ◦ ◦ ◦ ◦ ◦ ◦
  ◦ ◦ + × × × × × ×      ◦ ◦ ◦ × × × ◦ ◦ ◦
  ◦ ◦ ◦ + × × × × ×  ,   ◦ ◦ ◦ ◦ × × ◦ ◦ ◦
  ◦ ◦ ◦ ◦ + × × × ×      ◦ ◦ ◦ ◦ ◦ × ◦ ◦ ◦
  ◦ ◦ ◦ ◦ ◦ + × × ×      ◦ ◦ ◦ ◦ ◦ ◦ × × ×
  ◦ ◦ ◦ ◦ ◦ ◦ + × ×      ◦ ◦ ◦ ◦ ◦ ◦ ◦ × ×
  ◦ ◦ ◦ ◦ ◦ ◦ ◦ + ×      ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦ ×

The block-diagonal matrix D(U_n D_ω^{(n)}) has invertible upper triangular blocks and can be inverted to expose the underlying Lanczos process.
IDR(s)
IDR(s)

IDR: a Lanczos process with multiple left-hand sides

Inverting the block-diagonal matrix D(U_n D_ω^{(n)}) gives an algebraic eigenvalue problem with a block-tridiagonal unreduced upper Hessenberg matrix

  L_n := D(Y◦_n G_n) · D(U_n D_ω^{(n)})^{-1} =

  × × × × × × ◦ ◦ ◦
  + × × × × × ◦ ◦ ◦
  ◦ + × × × × ◦ ◦ ◦
  ◦ ◦ + × × × × × ×
  ◦ ◦ ◦ + × × × × ×
  ◦ ◦ ◦ ◦ + × × × ×
  ◦ ◦ ◦ ◦ ◦ + × × ×
  ◦ ◦ ◦ ◦ ◦ ◦ + × ×
  ◦ ◦ ◦ ◦ ◦ ◦ ◦ + ×

This is the matrix of the underlying BiORes(s, 1) process.

This matrix (in the extended version) satisfies

  A Q_n = Q_{n+1} L_n,

where the reduced residuals q_{js+k}, k = 0, . . . , s − 1, j = 0, 1, . . . , are given by

  Ω_j(A) q_{js+k} = r_{j(s+1)+k}.
IDR(s)
IDREig

IDREig

The eigenvalues of the pencil (H_k, U_k) are the roots of the residual polynomials, and some of these converge to eigenvalues of A.

Suppose that Q_{k+1} has full rank. The pencil (H_k, U_k) arises as an oblique projection of (A, I_n), as

  Q̂_k^H (A, I_n) Q_k U_k = Q̂_k^H (A Q_k U_k, Q_k U_k)
                        = Q̂_k^H (Q_{k+1} H_k, Q_k U_k) = (I_k^T H_k, U_k) = (H_k, U_k),   (1)

where Q̂_k^H := I_k^T Q_{k+1}^†.

One uses a deflated pencil that only gives the Ritz values. The theory was developed by Martin Gutknecht and Z. (2010); currently we investigate how to select the parameters (s, ω_j, P) to obtain good eigenpair approximations (this is ongoing joint work with Olaf Rendel and Anisa Rizvanolli).
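Given the square parts H_k and U_k from some IDR(s) run, the pencil eigenvalues are one SciPy call away. Below, random stand-ins (hypothetical data, only the shapes match the setting) show the mechanics:

    import numpy as np
    from scipy.linalg import eig

    # Generalized eigenvalues of a pencil (H_k, U_k); data are random stand-ins.
    k = 6
    rng = np.random.default_rng(3)
    Hk = np.triu(rng.standard_normal((k, k)), -1)              # Hessenberg stand-in
    Uk = np.triu(rng.standard_normal((k, k))) + k * np.eye(k)  # invertible triangular
    ritz = eig(Hk, Uk, right=False)    # roots of the residual polynomial
    print(np.sort_complex(ritz))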
IDR(s)
IDR(s)Stab(ℓ)

IDRStab

Recently, IDR(s) was generalized by combining ideas from IDR(s) with the higher dimensional minimization underlying BiCGStab(ℓ).

The first paper was a Japanese two-sided sketch of a method named GIDR(s, L) by Masaaki Tanio and Masaaki Sugihara, followed independently by a joint paper by Gerard Sleijpen and Martin van Gijzen.

IDRStab is based on the computation of a Hessenberg matrix of basis matrices and a linear combination of the last column with polynomial coefficients to circumvent the need for the roots ω_j.

IDRStab and the eigenvalue approximations of the resulting Sonneveld pencils are currently being analyzed („Studienarbeit“ of Anisa Rizvanolli).
IDR(s)
QMRIDR

QMRIDR

MR methods use the extended Hessenberg matrix to compute the coefficients of the vector in the Krylov subspace, i.e.,

  x_k := Q_k z_k,   z_k := H_k^† e_1 ‖r_0‖.

In IDR based methods we have to extend the MR framework to generalized Hessenberg decompositions:

  x_k := Q_k U_k z_k,   z_k := H_k^† e_1 ‖r_0‖.

The implementation has many parameters that we should select “optimally”. Extensive numerical tests are currently being done by Olaf Rendel. As an example we show the convergence curves (the true residuals) for the matrix add20 from Matrix Market; the coefficient solve itself is sketched below.

Ongoing joint work with Olaf Rendel, Gerard Sleijpen, and Martin van Gijzen.
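The MR coefficient vector is a small least-squares solve with the extended Hessenberg matrix; a minimal sketch, with a random stand-in for H_k and a made-up value for ‖r_0‖:

    import numpy as np

    # z_k = pinv(H_k) e_1 ||r_0|| via least squares; H_k is a random stand-in.
    k = 6
    rng = np.random.default_rng(4)
    Hbar = np.triu(rng.standard_normal((k + 1, k)), -1)  # (k+1) x k extended Hessenberg
    rhs = np.zeros(k + 1)
    rhs[0] = 2.7                                         # ||r_0||, made up
    zk, *_ = np.linalg.lstsq(Hbar, rhs, rcond=None)
    print(zk)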
IDR(s)
QMRIDR

QMRIDR: add20

[Figure: residuals ‖r‖/‖b‖ versus matrix–vector multiplies (MVs) for add20; curves: BLAS1 orth, BLAS2 orth, BLAS1 bi-orth, BLAS2 bi-orth. s = 8; ω_j local minimization; next by maximal last; various orthogonalizations.]

[Figure: residuals versus MVs; curves: orthv, maxlast, prolast, protop. s = 8; ω_j local minimization; various expansions; MGS orthogonalization.]

[Figure: residuals versus MVs; curves: locmin, invray, invmeanritz. s = 8; ω_j various strategies; GS expansion; stable basis vectors.]

[Figure: residuals versus MVs; curves: s = 4, 8, 16, 32. Various s; ω_j inverse Rayleigh; stable expansion; GS expansion.]

[Figure: residuals versus MVs; curves: s = 4, 8, 16, 32. Various s; ω_j local minimization; stable expansion; MGS expansion.]

[Figure: residuals versus MVs; curves: s = 4, 8, 16, 32. Various s; ω_j local minimization; stable expansion; GS expansion.]
Conclusion

Conclusion and Outlook

  • We sketched some basic facts about Krylov subspace methods and Hessenberg decompositions.
  • We related convergence to Ritz values.
  • We sketched IDR and IDR(s).
  • We introduced the framework of generalized Hessenberg decompositions.
  • We briefly touched on generalizations of IDR(s), namely, IDREig, IDRStab, and QMRIDR.
  • We hopefully convinced you that IDR is an interesting Krylov subspace method and offers lots of even more interesting problems in the design and analysis of new IDR based methods.
  • What about inexact IDR/IDREig/IDRStab/QMRIDR?
Thank you for your attention!
References

Sonneveld, P. (2006). History of IDR: an example of serendipity. PDF file sent by Peter Sonneveld on Monday, 24th of July 2006. 8 pages; evolved into (Sonneveld, 2008).

Sonneveld, P. (2008). AGS-IDR-CGS-BiCGSTAB-IDR(s): The circle closed. A case of serendipity. In Proceedings of the International Kyoto Forum 2008 on Krylov subspace methods, pages 1–14.

Wesseling, P. and Sonneveld, P. (1980). Numerical experiments with a multiple grid and a preconditioned Lanczos type method. In Approximation Methods for Navier-Stokes Problems, volume 771 of Lecture Notes in Mathematics, pages 543–562. Springer.