A quick and dirty introduction to IDR

Jens-Peter M. Zemke
zemke@tu-harburg.de
Institut für Numerische Simulation
Technische Universität Hamburg-Harburg
May 27th, 2011
Outline

Basics
  Internal guidelines
  Krylov subspace methods
  Hessenberg decompositions
  Polynomial representations
  Perturbations

IDR(s)
  IDR
  IDR(s)
  IDREig
  IDR(s)Stab(ℓ)
  QMRIDR
Basics
Internal guidelines

What is the problem you’re considering?

I am trying to motivate why the method of Induced Dimension Reduction (IDR) and its generalization IDR(s) are worth considering when looking for iterative solvers for your type of problem, e.g.,

  • (large sparse) linear systems: Ax = r_0, A ∈ C^{n×n}, r_0 ∈ C^n, or
  • (large sparse) eigenvalue problems: Av = vλ.

I have a general interest in Krylov subspace methods; for me IDR(s) is just a new Krylov subspace method that offers interesting new possibilities.

My personal interest lies in the error analysis of perturbed Krylov subspace methods and their convergence properties. These perturbations are

  • always caused by finite precision,
  • sometimes caused deliberately, e.g., in inexact methods.
Basics
Internal guidelines

Why do you find this interesting?

The error analysis of Krylov subspace methods is by no means simple:

  • Krylov subspace methods are highly sophisticated tools,
  • most analysis is based on the fact that, in theory, Krylov subspace methods are direct methods, which no longer remains true,
  • the error propagation is highly non-linear,
  • the short-term methods tend to deviate very soon but still converge, now at another “rate” of convergence.

The known analysis of short-term recurrence Krylov subspace methods is

  • mostly restricted to the simplest method, the symmetric Lanczos method,
  • based on tools from a variety of areas that do not seem to be related to Krylov subspace methods at all,
  • either for very specific implementations or offers very little insight.
Basics
Internal guidelines

What is the background?

Krylov subspace methods are based on very basic ideas from Linear Algebra, namely, linear combinations, subspaces, and projections. Yet, the analysis of these methods relates them to various other interesting areas.

The tools of the trade include:

  • Matrix Analysis (Matrix Functions),
  • Potential Theory (Green’s Functions, Capacity),
  • Holomorphic Functions (Residue Theorem),
  • Laurent Expansions,
  • (Padé) Approximation,
  • (Lagrange/Hermite) Interpolation,
  • (Formal) Orthogonal Polynomials,
  • Riemann–Stieltjes Integrals,
  • and many, many more . . .
Basics
Internal guidelines

What are you going to talk about?

I will

  • give a brief introduction to Krylov subspace methods,
  • present a sketch of IDR/IDR(s),
  • explain why it is different,
  • report on the observed behavior,
  • sketch possible generalizations.

If I succeed, you will have a feeling for some of the important aspects of IDR/IDR(s) and can read the papers on the subject for more details of particular methods.

In passing, I will note some aspects not to be found in the literature and outline some paths of possible generalizations.
Basics
Krylov subspace methods

Background

Large linear systems are solved by projection onto smaller subspaces,

  Ax = r_0,   x_k := Q_k z_k,   Q̂_k^H A x = (Q̂_k^H A Q_k) z_k = Q̂_k^H r_0.

Galërkin method:

  • Bubnov–Galërkin: Q̂_k = Q_k, Q_k^H Q_k = I_k (orthonormal basis),
  • Petrov–Galërkin: Q̂_k^H Q_k = I_k (bi-orthonormal bases).

Subspaces of increasing dimension. As starting vector use r_0, e.g.,

  Q_1 := q_1 := r_0/‖r_0‖,   H_1 := Q_1^H A Q_1,   z_1 := H_1^{-1} e_1 ‖r_0‖,   x_1 := Q_1 z_1.

Compute residual: r_1 := r_0 − A x_1 = Q_1 e_1 ‖r_0‖ − A Q_1 z_1. Both steps involve A q_1.

Expand space:

  K_2 := span{r_0, A r_0} = span{q_1, q_2}.
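A minimal NumPy sketch of this first projection step; the test matrix and right-hand side are made up for the illustration and are not from the talk:

    import numpy as np

    # One Bubnov-Galerkin step on a made-up, well-conditioned test problem.
    n = 100
    rng = np.random.default_rng(0)
    A = rng.standard_normal((n, n)) + n * np.eye(n)
    r0 = rng.standard_normal(n)

    q1 = r0 / np.linalg.norm(r0)          # first basis vector q_1
    Q1 = q1[:, None]                      # Q_1 as an n x 1 matrix
    H1 = Q1.T @ A @ Q1                    # projected matrix Q_1^H A Q_1
    z1 = np.linalg.solve(H1, np.array([np.linalg.norm(r0)]))
    x1 = Q1 @ z1                          # first approximation
    r1 = r0 - A @ x1                      # residual; both steps involve A q1
    print(np.linalg.norm(r1) / np.linalg.norm(r0))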
Basics
Krylov subspace methods

Krylov subspaces

Natural generalization of this simple idea: Krylov subspaces, obtained by multiplication of the last basis vector by A,

  K_k := span{r_0, A r_0, . . . , A^{k−1} r_0} = span{q_1, q_2, . . . , q_k}.

Krylov subspaces are isomorphic (up to a certain degree) to polynomial spaces,

  x ∈ K_k  ⇔  x = Σ_{j=0}^{k−1} A^j r_0 c_{j+1} = p_{k−1}(A) r_0,   p_{k−1}(z) = Σ_{j=0}^{k−1} c_{j+1} z^j.

Residual polynomials are polynomials that

  • satisfy r_k = ρ_k(A) r_0 and
  • are normalized by the condition ρ_k(0) = 1.

Residual polynomials arise because

  r_k := r_0 − A x_k = (I − A p_{k−1}(A)) r_0 =: ρ_k(A) r_0.
Basics
Krylov subspace methods

Krylov subspace methods

There are mainly two classes of Krylov subspace methods:

  • long-term (Hessenberg, Arnoldi),
  • short-term (Lanczos).

Arnoldi: Example of a long-term method building an orthonormal basis.

  r = r_0, q = r/‖r‖
  Q = [q], H = []
  for k = 1, . . .
    r = A q
    c = Q^H r
    r = r − Q c
    H = [H, c; 0^T, ‖r‖]
    q = r/‖r‖, Q = [Q, q]
  end
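A runnable NumPy transcription of this loop (a sketch; variable names follow the listing above):

    import numpy as np

    def arnoldi(A, r0, m):
        """Build an orthonormal Krylov basis Q and extended Hessenberg H
        with A @ Q[:, :m] = Q[:, :m+1] @ H (transcribes the slide's loop)."""
        n = r0.shape[0]
        Q = np.zeros((n, m + 1), dtype=complex)
        H = np.zeros((m + 1, m), dtype=complex)
        Q[:, 0] = r0 / np.linalg.norm(r0)
        for k in range(m):
            r = A @ Q[:, k]
            c = Q[:, :k + 1].conj().T @ r     # orthogonalization coefficients
            r = r - Q[:, :k + 1] @ c          # classical Gram-Schmidt step
            H[:k + 1, k] = c
            H[k + 1, k] = np.linalg.norm(r)
            Q[:, k + 1] = r / H[k + 1, k]
        return Q, H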
Basics
Hessenberg decompositions

Hessenberg decompositions

The construction of the basis vectors is reflected in the structure of the arising Hessenberg decomposition

  A Q_k = Q_{k+1} H_k,

where

  • Q_{k+1} = [Q_k, q_{k+1}] ∈ C^{n×(k+1)} collects the basis vectors,
  • H_k ∈ C^{(k+1)×k} is an unreduced extended Hessenberg matrix.

Aspects of perturbed Krylov subspace methods can be captured with perturbed Hessenberg decompositions

  A Q_k + F_k = Q_{k+1} H_k,

where F_k ∈ C^{n×k} accounts for the perturbations.
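Using the Arnoldi sketch above, the (unperturbed) decomposition can be checked numerically; the residual of the identity is at the level of rounding errors:

    import numpy as np

    # Check A Q_k = Q_{k+1} H_k for the Arnoldi basis (uses arnoldi() from above).
    n, m = 50, 10
    rng = np.random.default_rng(1)
    A = rng.standard_normal((n, n))
    Q, H = arnoldi(A, rng.standard_normal(n).astype(complex), m)
    print(np.linalg.norm(A @ Q[:, :m] - Q @ H))   # O(machine epsilon)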
Basics
Hessenberg decompositions

Karl Hessenberg & “his” matrix + decomposition

Behandlung linearer Eigenwertaufgaben mit Hilfe der Hamilton-Cayleyschen Gleichung [Treatment of linear eigenvalue problems with the aid of the Hamilton–Cayley equation], Karl Hessenberg, 1. Bericht der Reihe „Numerische Verfahren“, July 23rd, 1940, page 23:

  • Hessenberg decomposition, Eqn. (57),
  • Hessenberg matrix, Eqn. (58).

Karl Hessenberg (* September 8th, 1904, † February 22nd, 1959)
Basics
Polynomial representations

Important Polynomials

The vectors from Krylov subspaces can be described in terms of polynomials. This representation carries over to the perturbed case with minor changes.

The residuals of the OR approximation x_k := Q_k z_k and the MR approximation x̄_k := Q_k z̄_k with coefficient vectors

  z_k := H_k^{-1} e_1 ‖r_0‖   and   z̄_k := H_k^† e_1 ‖r_0‖

satisfy

  r_k := r_0 − A x_k = R_k(A) r_0   and   r̄_k := r_0 − A x̄_k = R̄_k(A) r_0

with residual polynomials R_k and R̄_k given by

  R_k(z) := det(I_k − z H_k^{-1})   and   R̄_k(z) := det(I_k − z H_k^† I_k).

The convergence of OR and MR depends on the Ritz and harmonic Ritz values, respectively.
Basics
Perturbations

Perturbed OR methods

We sketch briefly how the setting changes when perturbations enter the stage in the special case of an OR method.

In the perturbed case

  A Q_k + F_k = Q_{k+1} H_k,

under the assumption that all trailing square Hessenberg matrices are regular, the polynomial representation for the OR residuals changes to

  r_k = R_k(A) r_0 − Σ_{ℓ=1}^{k} z_{ℓk} R_{ℓ+1:k}(A) f_ℓ + F_k z_k,

where

  R_{ℓ+1:k}(z) := det(I_{k−ℓ} − z H_{ℓ+1:k}^{-1}).

We can expect convergence when F_k z_k remains bounded (inexact methods) and all R_{ℓ+1:k}(A) are “small”.
IDR(s)
IDR

Birth of a method

In 1976, Peter Sonneveld of TU Delft “stumbled upon” the three-term recurrence

  r_{k+1} = (I − A)(r_k + γ_k (r_k − r_{k−1})),   where   γ_k := p^H r_k / p^H (r_{k−1} − r_k).

This recurrence (almost) always results in the zero vector after 2n steps, where A ∈ C^{n×n} and r_0 ∈ C^n, r_1 = A r_0, and p ∈ C^n are arbitrarily chosen.

He realized that the recurrence constructs vectors in spaces G_j of shrinking dimensions:

  G_0 := K(A, r_0) = span{r_0, A r_0, A^2 r_0, . . .},
  G_j := (I − A)(G_{j−1} ∩ S),   S = span{p}^⊥,   j = 1, . . .

More precisely,

  r_{2j}, r_{2j+1} ∈ G_j,   j = 0, 1, . . .
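A small NumPy experiment with the recurrence on random data (purely illustrative, not from the talk); in exact arithmetic r_{2n} = 0, and for modest n the computed residual norm indeed collapses:

    import numpy as np

    # Sonneveld's three-term recurrence on random data; watch ||r_k|| collapse.
    n = 8
    rng = np.random.default_rng(2)
    A = rng.standard_normal((n, n))
    p = rng.standard_normal(n)
    r_prev = rng.standard_normal(n)              # r_0
    r = A @ r_prev                               # r_1 = A r_0
    for k in range(1, 2 * n + 1):
        gamma = (p @ r) / (p @ (r_prev - r))
        s = r + gamma * (r - r_prev)             # r_k + gamma_k (r_k - r_{k-1})
        r_prev, r = r, s - A @ s                 # r_{k+1} = (I - A) s
        print(k + 1, np.linalg.norm(r))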
IDR(s)
IDR

The origin of IDR: primitive IDR

With r_0 := b − A x_0, the Richardson iteration is carried out as follows:

  x_{k+1} = x_k + r_k,   r_{k+1} = (I − A) r_k.

In a Richardson-type IDR algorithm, the second equation is replaced by the update

  r_{k+1} = (I − A)(r_k + γ_k (r_k − r_{k−1})),   γ_k = p^H r_k / p^H (r_{k−1} − r_k).

The update of the iterates has to be modified accordingly,

  −A(x_{k+1} − x_k) = r_{k+1} − r_k = (I − A)(r_k + γ_k (r_k − r_{k−1})) − r_k
                                    = (I − A)(r_k − γ_k A(x_k − x_{k−1})) − r_k
                                    = −A(r_k + γ_k (I − A)(x_k − x_{k−1}))
  ⇔  x_{k+1} − x_k = r_k + γ_k (I − A)(x_k − x_{k−1})
                   = r_k + γ_k (x_k − x_{k−1} + r_k − r_{k−1}).
IDR(s)
IDR

The origin of IDR: primitive IDR

Sonneveld terms the outcome the Primitive IDR Algorithm (Sonneveld, 2006). Indexed form:

  r_0 = b − A x_0
  x_1 = x_0 + r_0
  r_1 = r_0 − A r_0
  for k = 1, 2, . . . do
    γ_k = p^T r_k / p^T (r_{k−1} − r_k)
    s_k = r_k + γ_k (r_k − r_{k−1})
    x_{k+1} = x_k + γ_k (x_k − x_{k−1}) + s_k
    r_{k+1} = s_k − A s_k
  done

In-place form, keeping only the last two iterates:

  x_old = x_0; r_old = b − A x_old
  x_new = x_old + r_old; r_new = r_old − A r_old
  while “not converged” do
    γ = p^T r_new / p^T (r_old − r_new)
    s = r_new + γ (r_new − r_old)
    x_tmp = x_new + γ (x_new − x_old) + s
    r_tmp = s − A s
    x_old = x_new; x_new = x_tmp
    r_old = r_new; r_new = r_tmp
  done

On the next slide we compare Richardson iteration (red) and PIA (blue).
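A direct NumPy transcription of the in-place listing (a sketch; the stopping rule with maxit and tol is an assumption added for the example, not part of the original algorithm):

    import numpy as np

    def primitive_idr(A, b, x0, p, maxit=200, tol=1e-10):
        """Primitive IDR Algorithm, transcribed from the listing above (sketch)."""
        x_old = np.asarray(x0, dtype=float)
        r_old = b - A @ x_old
        x_new = x_old + r_old
        r_new = r_old - A @ r_old
        for _ in range(maxit):
            gamma = (p @ r_new) / (p @ (r_old - r_new))
            s = r_new + gamma * (r_new - r_old)
            x_old, x_new = x_new, x_new + gamma * (x_new - x_old) + s
            r_old, r_new = r_new, s - A @ s
            if np.linalg.norm(r_new) <= tol * np.linalg.norm(b):
                break
        return x_new, r_new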
IDR(s)
IDR

The origin of IDR: primitive IDR

Impressions of “finite termination” and acceleration in finite precision:

[Figure: true and updated residuals versus matrix–vector multiplies for PIA, six panels: n = 5, 20, 100, each without and with scaling.]
IDR(s)
IDR

The origin of IDR: primitive IDR

Sonneveld never did use PIA, as he considered it to be too unstable; instead he went on with a corresponding acceleration of the Gauß-Seidel method. In (Sonneveld, 2008) he terms this method Accelerated Gauß-Seidel (AGS) and refers to it as “[t]he very first IDR-algorithm [..]”, see page 6, Ibid.

This part of the story took place “in the background” in the year 1976.

In September 1979 Sonneveld attended the IUTAM Symposium on Approximation Methods for Navier-Stokes Problems in Paderborn, Germany. At this symposium he presented a new variant of IDR based on a variable splitting I − ω_j A, where ω_j is fixed for two steps and otherwise could be chosen freely, but non-zero.

This algorithm, with minimization of every second residual, is included in the proceedings from 1980 (Wesseling and Sonneveld, 1980). The connection to Krylov methods, e.g., BiCG/Lanczos, is also given there.
IDR(s)
IDR

The origin of IDR: classical IDR

A numerical comparison of Richardson iteration, original IDR, and PIA.

[Figure: true and updated residuals versus matrix–vector multiplies, six panels: n = 5, 20, 100, each without and with scaling.]
IDR(s)
IDR

IDR: BiCGStab

Later, Peter Sonneveld developed CGS based on the ideas behind IDR and, together with Henk van der Vorst, rewrote the IDR variant to one that explicitly constructs the coefficients of the underlying Lanczos recurrence.

This rewritten variant was published by Henk van der Vorst under the name BiCGStab.

In short: BiCGStab is (almost mathematically equivalent to) IDR.
IDR(s)
IDR(s)

IDR(s)

IDR can be generalized: instead of using one hyperplane (span{p})^⊥, one uses the intersection of s hyperplanes. This makes the dimension reduction step less frequent, but the reduction a larger one.

This generalized IDR, termed IDR(s), was developed in 2006 by Peter Sonneveld and Martin van Gijzen.

In the context of Krylov subspace methods, IDR(s) can be thought of as a two-sided Lanczos method. There is a predecessor to such a method, namely, ML(k)BiCGStab by Man-Chung Yeung and Tony Chan.
IDR(s)
IDR(s)

Building blocks of IDR(s)

IDR(s) is a Krylov subspace method based on two building blocks:

  • Multiplication by polynomials in A. (IDR(s): linear, IDR(s)Stab(ℓ): higher degree)
  • Oblique projection perpendicular to P ∈ C^{n×s}.

IDR(s) constructs nested subspaces of shrinking dimensions.

The prototype IDR(s) method constructs spaces G_j as follows (see the numerical sketch below):

  • Define G_0 := K(A, r_0) = span{r_0, A r_0, A^2 r_0, . . .}.
  • Iterate G_j := (I − ω_j A)(G_{j−1} ∩ S), j = 1, 2, . . . , with ω_j ∈ C, ω_j ≠ 0.

Only sufficiently many vectors in each space are constructed.
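The dimension reduction can be observed numerically. The sketch below is illustrative only: the data are random, ω_j is fixed to an arbitrary nonzero value, and the intersection is formed via nullspace computations under generic-rank assumptions; it tracks dim G_j, which drops by s per step as the IDR theorem predicts:

    import numpy as np
    from scipy.linalg import null_space, orth

    # Watch dim(G_j) shrink by s per step (illustration with random data).
    n, s, omega = 12, 3, 0.7              # omega: arbitrary nonzero choice
    rng = np.random.default_rng(5)
    A = rng.standard_normal((n, n))
    P = rng.standard_normal((n, s))       # S = null(P^H): intersection of s hyperplanes
    G = np.eye(n)                         # for generic r_0, G_0 is all of C^n
    for j in range(1, 4):
        C = null_space(P.T @ G)           # coefficients spanning G_{j-1} ∩ S
        G = orth((np.eye(n) - omega * A) @ (G @ C))
        print(j, G.shape[1])              # prints n-s, n-2s, n-3s: 9, 6, 3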
IDR(s)
IDR(s)

IDR is Lanczos times something

It turns out that:

  • IDR(s) is a transpose-free variant of a Lanczos process with one right-hand side and s left-hand sides.
  • IDR(s) is a Lanczos-type product method, i.e., most residuals can be written as

      r^{IDR}_{j(s+1)+k} = Ω_j(A) ρ_{js+k}(A) r_0,   1 ≤ k ≤ s,

    where ρ_{js+k} are residual polynomials of the Lanczos process.

Reminder: Residual polynomials are polynomials that

  • satisfy r_k = ρ_k(A) r_0 and
  • are normalized by the condition ρ_k(0) = 1.
IDR(s)
IDR(s)

Generalized Hessenberg decomposition

IDR(s) can be captured using a generalized Hessenberg decomposition

  A Q_k U_k = Q_{k+1} H_k.

IDR based methods include BiCGStab (rewritten version of IDR) and CGS.

OR based IDR methods use

  x_k := Q_k U_k z_k,   z_k := H_k^{-1} e_1 ‖r_0‖;

the residual is described by

  r_k := r_0 − A x_k = r_0 − A Q_k U_k z_k = r_0 − Q_{k+1} H_k z_k
       = Q_k (e_1 ‖r_0‖ − H_k z_k) − q_{k+1} h_{k+1,k} e_k^T z_k
       = R_k(A) r_0,   R_k(z) := det(I_k − z U_k H_k^{-1}).

Tacitly assuming ‖q_{k+1}‖ = 1, we have ‖r_k‖ = |h_{k+1,k} e_k^T z_k|.
IDR(s)
IDR(s)

IDR: Sonneveld pencil and Sonneveld matrix

We consider the prototype IDR(s) by Sonneveld/van Gijzen (IDR(s)ORes).

The IDR(s)ORes pencil, the so-called Sonneveld pencil (Y◦_n, Y_n D_ω^{(n)}), can be depicted by (here s = 3; × nonzero, + nonzero subdiagonal, ◦ zero)

  × × × × ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦      × × × × ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦
  + × × × × ◦ ◦ ◦ ◦ ◦ ◦ ◦      ◦ × × × × ◦ ◦ ◦ ◦ ◦ ◦ ◦
  ◦ + × × × × ◦ ◦ ◦ ◦ ◦ ◦      ◦ ◦ × × × × ◦ ◦ ◦ ◦ ◦ ◦
  ◦ ◦ + × × × × ◦ ◦ ◦ ◦ ◦      ◦ ◦ ◦ × × × × ◦ ◦ ◦ ◦ ◦
  ◦ ◦ ◦ + × × × × ◦ ◦ ◦ ◦      ◦ ◦ ◦ ◦ × × × × ◦ ◦ ◦ ◦
  ◦ ◦ ◦ ◦ + × × × × ◦ ◦ ◦      ◦ ◦ ◦ ◦ ◦ × × × × ◦ ◦ ◦
  ◦ ◦ ◦ ◦ ◦ + × × × × ◦ ◦  ,   ◦ ◦ ◦ ◦ ◦ ◦ × × × × ◦ ◦
  ◦ ◦ ◦ ◦ ◦ ◦ + × × × × ◦      ◦ ◦ ◦ ◦ ◦ ◦ ◦ × × × × ◦
  ◦ ◦ ◦ ◦ ◦ ◦ ◦ + × × × ×      ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦ × × × ×
  ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦ + × × ×      ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦ × × ×
  ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦ + × ×      ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦ × ×
  ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦ + ×      ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦ ×

The upper triangular matrix Y_n D_ω^{(n)} could be inverted, which results in the Sonneveld matrix, a full unreduced Hessenberg matrix.
IDR(s)
IDR(s)

Understanding IDR: Purification

We know the eigenvalues ≈ roots of the kernel polynomials, 1/ω_j. We are only interested in the other eigenvalues.

The purified IDR(s)ORes pencil (Y◦_n, U_n D_ω^{(n)}), which has only the remaining eigenvalues and some infinite ones as eigenvalues, consists of the banded Hessenberg matrix Y◦_n depicted above together with the block-diagonal matrix U_n D_ω^{(n)}, which has s×s upper triangular blocks, each followed by a zero row:

  × × × ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦
  ◦ × × ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦
  ◦ ◦ × ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦
  ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦
  ◦ ◦ ◦ ◦ × × × ◦ ◦ ◦ ◦ ◦
  ◦ ◦ ◦ ◦ ◦ × × ◦ ◦ ◦ ◦ ◦
  ◦ ◦ ◦ ◦ ◦ ◦ × ◦ ◦ ◦ ◦ ◦
  ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦
  ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦ × × × ◦
  ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦ × × ◦
  ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦ × ◦
  ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦

We get rid of the infinite eigenvalues using a change of basis (Gauß/Schur).
IDR(s)
IDR(s)

Understanding IDR: Gaussian elimination

The deflated purified IDR(s)ORes pencil, after the elimination step (Y◦_n G_n, U_n D_ω^{(n)}), can be depicted by the following left-hand matrix Y◦_n G_n (the right-hand matrix U_n D_ω^{(n)} is unchanged from the previous slide):

  × × × × × × × ◦ ◦ ◦ ◦ ◦
  + × × × × × × ◦ ◦ ◦ ◦ ◦
  ◦ + × × × × × ◦ ◦ ◦ ◦ ◦
  ◦ ◦ ◦ + ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦
  ◦ ◦ + + × × × × × × × ◦
  ◦ ◦ ◦ ◦ + × × × × × × ◦
  ◦ ◦ ◦ ◦ ◦ + × × × × × ◦
  ◦ ◦ ◦ ◦ ◦ ◦ ◦ + ◦ ◦ ◦ ◦
  ◦ ◦ ◦ ◦ ◦ ◦ + + × × × ×
  ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦ + × × ×
  ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦ + × ×
  ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦ +

Using Laplace expansion of the determinant of z U_n D_ω^{(n)} − Y◦_n G_n we can get rid of the trivial constant factors corresponding to infinite eigenvalues. This amounts to a deflation.
IDR(s)
IDR(s)

Understanding IDR: Deflation

Let D denote a deflation operator that removes every (s + 1)th column and row from the matrix the operator is applied to.

The deflated purified IDR(s)ORes pencil, after the deflation step (D(Y◦_n G_n), D(U_n D_ω^{(n)})), can be depicted by

  × × × × × × ◦ ◦ ◦      × × × ◦ ◦ ◦ ◦ ◦ ◦
  + × × × × × ◦ ◦ ◦      ◦ × × ◦ ◦ ◦ ◦ ◦ ◦
  ◦ + × × × × ◦ ◦ ◦      ◦ ◦ × ◦ ◦ ◦ ◦ ◦ ◦
  ◦ ◦ + × × × × × ×      ◦ ◦ ◦ × × × ◦ ◦ ◦
  ◦ ◦ ◦ + × × × × ×  ,   ◦ ◦ ◦ ◦ × × ◦ ◦ ◦
  ◦ ◦ ◦ ◦ + × × × ×      ◦ ◦ ◦ ◦ ◦ × ◦ ◦ ◦
  ◦ ◦ ◦ ◦ ◦ + × × ×      ◦ ◦ ◦ ◦ ◦ ◦ × × ×
  ◦ ◦ ◦ ◦ ◦ ◦ + × ×      ◦ ◦ ◦ ◦ ◦ ◦ ◦ × ×
  ◦ ◦ ◦ ◦ ◦ ◦ ◦ + ×      ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦ ×

The block-diagonal matrix D(U_n D_ω^{(n)}) has invertible upper triangular blocks and can be inverted to expose the underlying Lanczos process.
IDR(s)
IDR(s)

IDR: a Lanczos process with multiple left-hand sides

Inverting the block-diagonal matrix D(U_n D_ω^{(n)}) gives an algebraic eigenvalue problem with a block-tridiagonal unreduced upper Hessenberg matrix

  L_n := D(Y◦_n G_n) · D(U_n D_ω^{(n)})^{-1} =

  × × × × × × ◦ ◦ ◦
  + × × × × × ◦ ◦ ◦
  ◦ + × × × × ◦ ◦ ◦
  ◦ ◦ + × × × × × ×
  ◦ ◦ ◦ + × × × × ×
  ◦ ◦ ◦ ◦ + × × × ×
  ◦ ◦ ◦ ◦ ◦ + × × ×
  ◦ ◦ ◦ ◦ ◦ ◦ + × ×
  ◦ ◦ ◦ ◦ ◦ ◦ ◦ + ×

This is the matrix of the underlying BiORes(s, 1) process.

This matrix (in the extended version) satisfies

  A Q_n = Q_{n+1} L_n,

where the reduced residuals q_{js+k}, k = 0, . . . , s − 1, j = 0, 1, . . . , are given by

  Ω_j(A) q_{js+k} = r_{j(s+1)+k}.
IDR(s)
IDREig

IDREig

The eigenvalues of the pencil (H_k, U_k) are the roots of the residual polynomials, and some of these converge to eigenvalues of A.

Suppose that Q_{k+1} has full rank. The pencil (H_k, U_k) arises as an oblique projection of (A, I_n), as

  Q̂_k^H (A, I_n) Q_k U_k = Q̂_k^H (A Q_k U_k, Q_k U_k)
                        = Q̂_k^H (Q_{k+1} H_k, Q_k U_k) = (I_k^T H_k, U_k) = (H_k, U_k),   (1)

where Q̂_k^H := I_k^T Q_{k+1}^†.

One uses a deflated pencil that only gives the Ritz values. The theory was developed by Martin Gutknecht and Z. (2010); currently we investigate how to select the parameters (s, ω_j, P) to obtain good eigenpair approximations (this is ongoing joint work with Olaf Rendel and Anisa Rizvanolli).
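Given the square parts H_k and U_k from some IDR(s) run, the pencil eigenvalues are one SciPy call away. Below, random stand-ins (hypothetical data, only the shapes match the setting) show the mechanics:

    import numpy as np
    from scipy.linalg import eig

    # Generalized eigenvalues of a pencil (H_k, U_k); data are random stand-ins.
    k = 6
    rng = np.random.default_rng(3)
    Hk = np.triu(rng.standard_normal((k, k)), -1)              # Hessenberg stand-in
    Uk = np.triu(rng.standard_normal((k, k))) + k * np.eye(k)  # invertible triangular
    ritz = eig(Hk, Uk, right=False)    # roots of the residual polynomial
    print(np.sort_complex(ritz))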
IDR(s)
IDR(s)Stab(ℓ)

IDRStab

Recently, IDR(s) was generalized by combining ideas from IDR(s) with the higher dimensional minimization underlying BiCGStab(ℓ).

The first paper was a Japanese two-sided sketch of a method named GIDR(s, L) by Masaaki Tanio and Masaaki Sugihara, followed independently by a joint paper by Gerard Sleijpen and Martin van Gijzen.

IDRStab is based on the computation of a Hessenberg matrix of basis matrices and a linear combination of the last column with polynomial coefficients to circumvent the need for the roots ω_j.

IDRStab and the eigenvalue approximations of the resulting Sonneveld pencils are currently being analyzed („Studienarbeit“ of Anisa Rizvanolli).
IDR(s)
QMRIDR

QMRIDR

MR methods use the extended Hessenberg matrix to compute the coefficients of the vector in the Krylov subspace, i.e.,

  x_k := Q_k z_k,   z_k := H_k^† e_1 ‖r_0‖.

In IDR based methods we have to extend the MR framework to generalized Hessenberg decompositions:

  x_k := Q_k U_k z_k,   z_k := H_k^† e_1 ‖r_0‖.

The implementation has many parameters that we should select “optimally”. Extensive numerical tests are currently being done by Olaf Rendel. As an example we show the convergence curves (the true residuals) for the matrix add20 from Matrix Market; the coefficient solve itself is sketched below.

Ongoing joint work with Olaf Rendel, Gerard Sleijpen, and Martin van Gijzen.
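The MR coefficient vector is a small least-squares solve with the extended Hessenberg matrix; a minimal sketch, with a random stand-in for H_k and a made-up value for ‖r_0‖:

    import numpy as np

    # z_k = pinv(H_k) e_1 ||r_0|| via least squares; H_k is a random stand-in.
    k = 6
    rng = np.random.default_rng(4)
    Hbar = np.triu(rng.standard_normal((k + 1, k)), -1)  # (k+1) x k extended Hessenberg
    rhs = np.zeros(k + 1)
    rhs[0] = 2.7                                         # ||r_0||, made up
    zk, *_ = np.linalg.lstsq(Hbar, rhs, rcond=None)
    print(zk)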
IDR(s)
QMRIDR

QMRIDR: add20

[Figure: residuals ‖r‖/‖b‖ versus matrix–vector multiplies (MVs) for add20; curves: BLAS1 orth, BLAS2 orth, BLAS1 bi-orth, BLAS2 bi-orth. s = 8; ω_j local minimization; next by maximal last; various orthogonalizations.]

[Figure: residuals versus MVs; curves: orthv, maxlast, prolast, protop. s = 8; ω_j local minimization; various expansions; MGS orthogonalization.]

[Figure: residuals versus MVs; curves: locmin, invray, invmeanritz. s = 8; ω_j various strategies; GS expansion; stable basis vectors.]

[Figure: residuals versus MVs; curves: s = 4, 8, 16, 32. Various s; ω_j inverse Rayleigh; stable expansion; GS expansion.]

[Figure: residuals versus MVs; curves: s = 4, 8, 16, 32. Various s; ω_j local minimization; stable expansion; MGS expansion.]

[Figure: residuals versus MVs; curves: s = 4, 8, 16, 32. Various s; ω_j local minimization; stable expansion; GS expansion.]
Conclusion

Conclusion and Outlook

  • We sketched some basic facts about Krylov subspace methods and Hessenberg decompositions.
  • We related convergence to Ritz values.
  • We sketched IDR and IDR(s).
  • We introduced the framework of generalized Hessenberg decompositions.
  • We briefly touched on generalizations of IDR(s), namely, IDREig, IDRStab, and QMRIDR.
  • We hopefully convinced you that IDR is an interesting Krylov subspace method and offers lots of even more interesting problems in the design and analysis of new IDR based methods.
  • What about inexact IDR/IDREig/IDRStab/QMRIDR?
Thank you for your attention!
References

Sonneveld, P. (2006). History of IDR: an example of serendipity. PDF file sent by Peter Sonneveld on Monday, 24th of July 2006. 8 pages; evolved into (Sonneveld, 2008).

Sonneveld, P. (2008). AGS-IDR-CGS-BiCGSTAB-IDR(s): The circle closed. A case of serendipity. In Proceedings of the International Kyoto Forum 2008 on Krylov subspace methods, pages 1–14.

Wesseling, P. and Sonneveld, P. (1980). Numerical experiments with a multiple grid and a preconditioned Lanczos type method. In Approximation Methods for Navier-Stokes Problems, volume 771 of Lecture Notes in Mathematics, pages 543–562. Springer.