On the numerical behaviour of the CG method Tomáš Gergelits, Zdeněk Strakoš Department of Numerical Mathematics, Faculty of Mathematics and Physics, Charles University in Prague Student Krylov Day 2015, TU Delft 2 February 2015 1 Krylov Day ’15 T. Gergelits CG in finite precision computations /18 Content of the talk Composite polynomial bounds for CG in finite precision arithmetic Krylov subspaces generated by CG in finite precision arithmetic 2 Krylov Day ’15 T. Gergelits CG in finite precision computations /18 The essence of the CG method Consider preconditioned system A ∈ CN×N HPD matrix and b ∈ CN . Ax = b, CG is the projection method which minimizes the energy norm of the error xk ∈ x0 + Kk (A, r0 ), rk ⊥ Kk (A, r0 ), k = 1, 2, . . . Kk (A, r0 ) = span{r0 , Ar0 , A2 r0 , . . . , Ak−1 r0 } kx − xk kA = min {kx − y kA : y ∈ x0 + Kk (A, r0 )} . CG is a matrix formulation of the Gauss-Christoffel quadrature ⇒ The CG method is nonlinear. 3 Krylov Day ’15 T. Gergelits CG in finite precision computations /18 Linear bound for the nonlinear CG method The error in the CG method satisfies ( kx − xk kA = min ϕ(0)=1 deg(ϕ)≤k N X )1/2 2 2 |ξj | λj ϕ (λj ) ≤ j=1 min max |ϕ(λj )|kx − x0 kA . ϕ(0)=1 j=1,...,N deg(ϕ)≤k The error in the Chebyshev semi-iterative (CSI) method satisfies kx − xkCSI kA ≤ |χk (0)|−1 kx − x0 kA = min max ϕ(0)=1 λ∈[λ1 ,λN ] deg(ϕ)≤k |ϕ(λ)|kx − x0 kA . [Flanders, Shortley (1950), Lanczos (1953), Young (1954); Markov (1884)] Linear bound is relevant for the CSI method and trivially holds for CG kx − xk kA ≤ kx − xkCSI kA ≤ 2 √ k κ−1 √ kx − x0 kA . κ+1 [Rutishauser (1959)] 4 Krylov Day ’15 T. Gergelits CG in finite precision computations /18 Idea of composite polynomial convergence bounds In the case of m large outlying eigenvalues the composite polynomial qm (λ)χk−m (λ)/χk−m (0), where qm (λ) = (λ − λN ) . . . (λ − λN−m+1 ), χk−m ≡ (k − m)th Chebyshev polynomial shifted on [λ1 , λN−m ] gives for k ≥ m √ κm − 1 k−m kx − xk kA ≤2 √ kx − x0 kA κm + 1 λN−m . κm = λ1 [Axelsson (1976), Jennings (1977); cf. van der Sluis, van der Vorst (1986)] 5 Krylov Day ’15 T. Gergelits CG in finite precision computations /18 CG in finite precision arithmetic delay of convergence Short recurrences =⇒ loss of orthogonality =⇒ & rank deficiency Failure of the composite polynomial bound 0 exact CG FP CG comp. bound relative A−norm of the error 10 −5 10 −10 10 −15 10 0 20 40 60 80 100 iteration number 120 140 160 6 Krylov Day ’15 T. Gergelits CG in finite precision computations /18 Backward-like analysis ←→ b Eigenvalues of A(k) are tightly clustered around the eigenvalues of A. [Paige (1980), Greenbaum (1989), Strakoš (1991)] b In numerical experiments, A(k) can be replaced (with small inaccuracy) b by an “universal” A with sufficiently many eigenvalues in tiny clusters around the eigenvalues of A. [Greenbaum, Strakoš (1992)] 7 Krylov Day ’15 T. Gergelits CG in finite precision computations /18 Consequences Minimization problem which bounds the CG convergence behaviour in finite precision arithmetic is min max |ϕ(λj )| b ϕ(0)=1 λ∈σ(A) deg(ϕ)≤k with the spectrum of the matrix Ab o [ n bj,· ∈ [λj − ∆, λj + ∆] with tiny ∆. b ≡ bj,1 , . . . , λ bj,l , λ σ(A) λ j=1,...,N Consequently, the upper bound based on the composite polynomial must be based on χk−m (λ) max qm (λ) . χk−m (0) b λ∈σ(A) 8 Krylov Day ’15 T. Gergelits CG in finite precision computations /18 Consequences 5 10 k=4 12 8 10 k=3 4 10 k=2 0 10 relative A−norm of the error 10 0 10 −5 10 −10 10 −15 10 −4 10 bN,• ∈ σ(A) b λ 0 10 20 30 40 50 iteration number 60 70 80 The composite polynomial is very steep in the neighbourhood of the outlying eigenvalues and thus the corresponding bound (dashed line) blows up. 9 Krylov Day ’15 T. Gergelits CG in finite precision computations /18 Summary I Analysis assuming exact arithmetic is an oxymoron. + + short recurrences – loss of orthogonality. long recurrences – no CG method Uniform spectrum, small condition number – the CSI method. 10 Krylov Day ’15 T. Gergelits CG in finite precision computations /18 Content of the talk Composite polynomial bounds for CG in finite precision arithmetic Krylov subspaces generated by CG in finite precision arithmetic 11 Krylov Day ’15 T. Gergelits CG in finite precision computations /18 Idea of shift We relate: k-th iteration of FP CG ⇐⇒ l-th iteration of exact CG I I k −l k −l ≈ ≈ delay of convergence rank-deficiency of computed Krylov subspace We want to study: kx − x k kA × kx − xl kA x k × xl Kk (A, r0 ) × Kl (A, r0 ) rank of the computed Krylov subspace 50 45 40 35 rank−deficiency 30 25 20 15 10 5 FP CG computation 0 0 10 20 30 iteration number 40 50 12 Krylov Day ’15 T. Gergelits CG in finite precision computations /18 Comparison of trajectory of approximation vectors 0 energy norm 10 −5 10 kx − xl kA kx − x̄l kA −10 10 −15 10 0 5 10 15 20 25 iteration number 13 Krylov Day ’15 T. Gergelits CG in finite precision computations /18 Comparison of trajectory of approximation vectors 0 energy norm 10 −5 10 kx − xl kA kx − x̄k kA −10 10 −15 10 0 6 (6) 12 (12) 18 (23) 25 (50) l(k) 13 Krylov Day ’15 T. Gergelits CG in finite precision computations /18 Comparison of trajectory of approximation vectors 0 energy norm 10 −5 kx − xl kA kx − x̄k kA kx̄k − xl kA 10 −10 10 −15 10 0 6 (6) 12 (12) 18 (23) 25 (50) l(k) 13 Krylov Day ’15 T. Gergelits CG in finite precision computations /18 Comparison of trajectory of approximation vectors 0 energy norm 10 Observation kx − xl kA kx − x̄k kA kx̄k − xl kA −5 10 k x̄k −xl k A kx−xl k A kx̄k − xl kA 1 kx − xl kA −10 10 −15 10 0 6 (6) 12 (12) 18 (23) 25 (50) l(k) Trajectories of approximation vectors are very similar in space CN . 13 Krylov Day ’15 T. Gergelits CG in finite precision computations /18 Comparison of trajectory of approximation vectors Ax = b CN CN Ax = b delay at the k-th step x xk x̄k x̄k xl exact computation finite precision computation x0 x exact computation finite precision computation Trajectory of approximations x k generated by FP CG computations follows closely the trajectory of the exact CG approximations xl . 14 Krylov Day ’15 T. Gergelits CG in finite precision computations /18 Comparison of Krylov subspaces Principal angles and vectors ϑj = min min arccos ( p ∗ q ) ≡ arccos ( pj ∗ qj ) , p∈Fj q∈Gj kpk=1 kqk=1 j = 1, 2, . . . , l where Fj ≡ F ∩ {p1 , . . . , pj−1 }⊥ , Gj ≡ G ∩ {q1 , . . . , qj−1 }⊥ , F = Kk (A, r0 ), G = Kl (A, r0 ). 1 angles - in ◦ degrees Comparison of principal angles of subspaces K̄k and Kl 1.5 ϑl ϑl−1 ϑl−2 ϑl−3 0.5 0 0 Krylov Day ’15 6 (6) T. Gergelits 12 (12) l(k) 18 (23) 25 (50) CG in finite precision computations 15 /18 Departure of subspaces For more difficult problems, the subspaces can depart in few directions. angles - in ◦ degrees Comparison of principal angles of subspaces K̄k and Kl 100 ϑl ϑl−1 ϑl−2 ϑl−3 50 0 0 88 (175) 177 (505) l(k) 265 (994) 354 (1638) [data: bus494 from MatrixMarket] 16 Krylov Day ’15 T. Gergelits CG in finite precision computations /18 Summary II and outlook + The convergence rate of finite precision CG and exact CG typically significantly differs. + The trajectories of computed approximations are enclosed in a shrinking “cone”. + Apart from the delay, the computed Krylov subspaces do not depart much from their exact arithmetic counterparts. I Study further the properties of principal vectors, find relationship to the structure of invariant subspaces. I Study the effect of clustered eigenvalues. Acknowledgement This work has been supported by the ERC-CZ project LL1202, by the GACR grant 201/13-06684S and by the GAUK grant 695612. 17 Krylov Day ’15 T. Gergelits CG in finite precision computations /18 References References T. Gergelits and Z. Strakoš, Composite convergence bounds based on Chebyshev polynomials and finite precision conjugate gradient computations, Numer. Algorithms, 65 (2014), pp. 759–782. J. Liesen and Z. Strakoš, Krylov Subspace Methods: Principles and Analysis, Numerical Mathematics and Scientific Computation, Oxford University Press, Oxford, 2013. J. Málek and Z. Strakoš, Preconditioning and the Conjugate Gradient Method in the Context of Solving PDEs, SIAM Spotlight Series, 2015. 18 Krylov Day ’15 T. Gergelits CG in finite precision computations /18