slides

advertisement
Tracking the trajectory in finite precision CG
computations
Tomáš Gergelits 1
1
Zdeněk Strakoš 1,2
Faculty of Mathematics and Physics, Charles University in Prague
2
Institute of Computer Science AS CR
Seminar on Numerical Analysis 2014
Nymburk, 28th January 2014
T. Gergelits, Z. Strakoš
SNA’14, Nymburk
2014, January 28
1 / 21
Contents
1
Introduction
2
The essence of the CG method
3
CG in finite precision arithmetic
4
Trajectory of CG approximations
5
Inertia of computed Krylov subspaces
T. Gergelits, Z. Strakoš
SNA’14, Nymburk
2014, January 28
2 / 21
Introduction
Contents
1
Introduction
2
The essence of the CG method
3
CG in finite precision arithmetic
4
Trajectory of CG approximations
5
Inertia of computed Krylov subspaces
T. Gergelits, Z. Strakoš
SNA’14, Nymburk
2014, January 28
3 / 21
Introduction
Simple numerical experiment
Solve Ax = b by the CG method
A . . . 25 × 25 diagonal and positive definite matrix, b = ones(1, . . . , 1)
0
exact arithmetic
finite precision arithmetic
relative A−norm of the error
10
20 iterations of CG in exact
arithmetic
kx − x20 kA = 2.0611 × 10−6
−5
10
20 iterations of CG in finite
precision arithmetic (FP CG)
−10
kx − x̄20 kA = 1.0953 × 10−2
10
kx20 − x̄20 k∞ = 1.424 × 10−2
−15
10
x
0
10
T. Gergelits, Z. Strakoš
20
30
iteration number
40
50
SNA’14, Nymburk
x̄20
x20
2014, January 28
4 / 21
Introduction
Simple numerical experiment
Solve Ax = b by the CG method
A . . . 25 × 25 diagonal and positive definite matrix, b = ones(1, . . . , 1)
0
exact arithmetic
finite precision arithmetic
relative A−norm of the error
10
20 iterations of CG in exact
arithmetic
kx − x20 kA = 2.0611 × 10−6
−5
10
20 iterations of CG in finite
precision arithmetic (FP CG)
−10
kx − x̄20 kA = 1.0953 × 10−2
10
kx20 − x̄20 k∞ = 1.424 × 10−2
−15
10
x
0
10
T. Gergelits, Z. Strakoš
20
30
iteration number
40
50
SNA’14, Nymburk
x̄20
x20
2014, January 28
4 / 21
Introduction
Simple numerical experiment
Solve Ax = b by the CG method
A . . . 25 × 25 diagonal and positive definite matrix, b = ones(1, . . . , 1)
0
exact arithmetic
finite precision arithmetic
relative A−norm of the error
10
20 iterations of CG in exact
arithmetic
kx − x20 kA = 2.0611 × 10−6
−5
10
20 iterations of CG in finite
precision arithmetic (FP CG)
−10
kx − x̄20 kA = 1.0953 × 10−2
10
kx20 − x̄20 k∞ = 1.424 × 10−2
−15
10
x
0
10
T. Gergelits, Z. Strakoš
20
30
iteration number
40
50
SNA’14, Nymburk
x̄20
x20
2014, January 28
4 / 21
Introduction
Simple numerical experiment
Solve Ax = b by the CG method
A . . . 25 × 25 diagonal and positive definite matrix, b = ones(1, . . . , 1)
0
exact arithmetic
finite precision arithmetic
relative A−norm of the error
10
20 iterations of CG in exact
arithmetic
kx − x20 kA = 2.0611 × 10−6
−5
10
29 iterations of FP CG for
the same level of accuracy
−10
kx − x̄29 kA = 2.0057 × 10−6
10
kx20 − x̄29 k∞ = 1.304 × 10−7
−15
10
x̄29
0
10
T. Gergelits, Z. Strakoš
20
30
iteration number
40
50
SNA’14, Nymburk
x20
x
2014, January 28
5 / 21
Introduction
Simple numerical experiment
Solve Ax = b by the CG method
A . . . 25 × 25 diagonal and positive definite matrix, b = ones(1, . . . , 1)
0
exact arithmetic
finite precision arithmetic
relative A−norm of the error
10
20 iterations of CG in exact
arithmetic
kx − x20 kA = 2.0611 × 10−6
−5
10
29 iterations of FP CG for
the same level of accuracy
−10
kx − x̄29 kA = 2.0057 × 10−6
10
kx20 − x̄29 k∞ = 1.304 × 10−7
−15
10
x̄29
0
10
T. Gergelits, Z. Strakoš
20
30
iteration number
40
50
SNA’14, Nymburk
x20
x
2014, January 28
5 / 21
The essence of the CG method
Contents
1
Introduction
2
The essence of the CG method
3
CG in finite precision arithmetic
4
Trajectory of CG approximations
5
Inertia of computed Krylov subspaces
T. Gergelits, Z. Strakoš
SNA’14, Nymburk
2014, January 28
6 / 21
The essence of the CG method
The essence of the CG method I.
Ax = b,
A ∈ FN×N HPD matrix, b ∈ FN where F is C or R
Hestenes, Stiefel (1952), Lanczos (1952)
CG is a projection method which minimizes the energy norm of the error
xk ∈ x0 + Kk (A, r0 ),
rk ⊥ Kk (A, r0 ),
where
k = 1, 2, . . .
Kk (A, r0 ) = span{r0 , Ar0 , A2 r0 , . . . , Ak −1 r0 }
kx − xk kA = min {kx − ykA : y ∈ x0 + Kk (A, r0 )}
CG is conforming with the Galerkin approximation
(k )
k∇(u − uh )k2 = k∇(u − uh )k2 + kx − xk k2A
T. Gergelits, Z. Strakoš
SNA’14, Nymburk
2014, January 28
7 / 21
The essence of the CG method
The essence of the CG method I.
Ax = b,
A ∈ FN×N HPD matrix, b ∈ FN where F is C or R
Hestenes, Stiefel (1952), Lanczos (1952)
CG is a projection method which minimizes the energy norm of the error
xk ∈ x0 + Kk (A, r0 ),
rk ⊥ Kk (A, r0 ),
where
k = 1, 2, . . .
Kk (A, r0 ) = span{r0 , Ar0 , A2 r0 , . . . , Ak −1 r0 }
kx − xk kA = min {kx − ykA : y ∈ x0 + Kk (A, r0 )}
CG is conforming with the Galerkin approximation
(k )
k∇(u − uh )k2 = k∇(u − uh )k2 + kx − xk k2A
T. Gergelits, Z. Strakoš
SNA’14, Nymburk
2014, January 28
7 / 21
The essence of the CG method
The essence of the CG method I.
Ax = b,
A ∈ FN×N HPD matrix, b ∈ FN where F is C or R
Hestenes, Stiefel (1952), Lanczos (1952)
CG is a projection method which minimizes the energy norm of the error
xk ∈ x0 + Kk (A, r0 ),
rk ⊥ Kk (A, r0 ),
where
k = 1, 2, . . .
Kk (A, r0 ) = span{r0 , Ar0 , A2 r0 , . . . , Ak −1 r0 }
kx − xk kA = min {kx − ykA : y ∈ x0 + Kk (A, r0 )}
CG is conforming with the Galerkin approximation
(k )
k∇(u − uh )k2 = k∇(u − uh )k2 + kx − xk k2A
T. Gergelits, Z. Strakoš
SNA’14, Nymburk
2014, January 28
7 / 21
The essence of the CG method
The essence of the CG method II.
CG method is mathematically equivalent to the Lanczos process:
AVk = Vk Tk + δk +1 vk +1 ekT
with given
v1 = r0 /kr0 k
The CG approximations xk are given by
Tk yk = kr0 ke1 ,
xk = x0 + Vk yk .
Jacobi matrix Tk is the k × k matrix representation of the OG projected
operator
Ak ≡ Vk Vk∗ AVk Vk∗ : Kk (A, r0 ) → Kk (A, r0 ).
CG is a matrix formulation of the Gauss-Christoffel quadrature
ω(λ),
A, r0 /kr0 k
Z
∞
f (λ) dω(λ)
0
first 2k moments matched
ω (k ) (λ),
Tk , e1
T. Gergelits, Z. Strakoš
SNA’14, Nymburk
Pk
i=1
(k )
(k )
ωi f (λi )
2014, January 28
8 / 21
The essence of the CG method
The essence of the CG method II.
CG method is mathematically equivalent to the Lanczos process:
AVk = Vk Tk + δk +1 vk +1 ekT
with given
v1 = r0 /kr0 k
The CG approximations xk are given by
Tk yk = kr0 ke1 ,
xk = x0 + Vk yk .
Jacobi matrix Tk is the k × k matrix representation of the OG projected
operator
Ak ≡ Vk Vk∗ AVk Vk∗ : Kk (A, r0 ) → Kk (A, r0 ).
CG is a matrix formulation of the Gauss-Christoffel quadrature
ω(λ),
A, r0 /kr0 k
Z
∞
f (λ) dω(λ)
0
first 2k moments matched
ω (k ) (λ),
Tk , e1
T. Gergelits, Z. Strakoš
SNA’14, Nymburk
Pk
i=1
(k )
(k )
ωi f (λi )
2014, January 28
8 / 21
CG in FP arithmetic
Contents
1
Introduction
2
The essence of the CG method
3
CG in finite precision arithmetic
4
Trajectory of CG approximations
5
Inertia of computed Krylov subspaces
T. Gergelits, Z. Strakoš
SNA’14, Nymburk
2014, January 28
9 / 21
CG in FP arithmetic
Delay of convergence
Short recurrences =⇒ loss of orthogonality =⇒ delay of convergence
0
exact arithmetic
finite precision arithmetic
loss of orthogonality
relative A−norm of the error
10
−5
10
−10
10
delay of
convergence
−15
10
0
T. Gergelits, Z. Strakoš
10
20
30
iteration number
SNA’14, Nymburk
40
50
2014, January 28
10 / 21
CG in FP arithmetic
Scheme of backward-like analysis
Paige (1971, 1980), Greenbaum (1989), S (1991), Greenbaum and S (1992)
b
FN
FN(k )
A, FP Lanczos → Tk
b
A(k),
EXACT Lanczos → Tk
the first k steps
b
Eigenvalues of A(k)
are tightly clustered around the eigenvalues of A.
b
In numerical experiments, A(k)
can be replaced (with small inaccuracy)
b
by an “universal” A with sufficiently many eigenvalues in the tight clusters
around the eigenvalues of A.
T. Gergelits, Z. Strakoš
SNA’14, Nymburk
2014, January 28
11 / 21
CG in FP arithmetic
Illustration of backward-like analysis
0
relative energy norm of the error
10
b
exact CG on A
FP CG on A
exact CG on A
−5
10
−10
10
0
20
40
60
80
100
iteration number
120
140
160
kx − x̄k kA ≈ kxb − xbk kAb
We can relate k-th iteration of finite precision CG applied on A with k-th
b
iteration of exact CG applied on blurred problem A.
T. Gergelits, Z. Strakoš
SNA’14, Nymburk
2014, January 28
12 / 21
Trajectory of CG approximations
Contents
1
Introduction
2
The essence of the CG method
3
CG in finite precision arithmetic
4
Trajectory of CG approximations
5
Inertia of computed Krylov subspaces
T. Gergelits, Z. Strakoš
SNA’14, Nymburk
2014, January 28
13 / 21
Trajectory of CG approximations
Rank-deficiency and delay of convergence
FP CG convergence curve shifted correspondingly to the rank-deficiency of
the computed Krylov subspace gives the exact CG convergence curve.
0
10
45
relative A−norm of the error
rank of the computed Krylov subspace
50
40
35
30
rank−deficiency
25
20
15
−5
10
−10
10
10
FP CG computation
exact CG computation
5
0
0
10
20
30
iteration number
40
50
−15
10
shifted FP CG
exact CG
0
5
10
15
20
rank of the computed Krylov subspace
25
Rank deficiency: k − k̄ , where k̄ = rank(Kk (A, r0 )) is the numerical rank
of the computed Krylov subspace.
Threshold: 10−1 (correspondence to a significant loss of orthogonality)
T. Gergelits, Z. Strakoš
SNA’14, Nymburk
2014, January 28
14 / 21
Trajectory of CG approximations
Comparison of approximations
We can compare FP and exact CG approximations x̄k and xk̄ themselves,
both x̄k , xk̄ ∈ FN .
0
maximum norm
10
Observation
kx − xk̄ k∞
kx − x̄k k∞
kx̄k − xk̄ k∞
−5
10
kx̄k − xk̄ k∞
≪ 1,
kx − xk̄ k∞
k x̄k −xk̄ k ∞
kx−xk̄ k ∞
i.e., distance between
approximations is small
in comparison with the
actual level of error.
−10
10
−15
10
0
5
10
15
20
rank of the computed Krylov subspace
T. Gergelits, Z. Strakoš
SNA’14, Nymburk
25
2014, January 28
15 / 21
Trajectory of CG approximations
Comparison of approximations
Ax = b
FN
FN
Ax = b
delay at the k-th step
x
xk
x̄k
x̄k
xk̄
exact computation
finite precision computation
x0
x
exact computation
finite precision computation
Finite precision CG computation tightly follows the trajectory of exact CG
computations with the delay which is given by the rank-deficiency of the
computed Krylov subspace.
T. Gergelits, Z. Strakoš
SNA’14, Nymburk
2014, January 28
16 / 21
Inertia of computed Krylov subspaces
Contents
1
Introduction
2
The essence of the CG method
3
CG in finite precision arithmetic
4
Trajectory of CG approximations
5
Inertia of computed Krylov subspaces
T. Gergelits, Z. Strakoš
SNA’14, Nymburk
2014, January 28
17 / 21
Inertia of computed Krylov subspaces
Comparison of numerical ranks
Kk (A, r0 ): k-th Krylov subspace generated by FP CG
Kk̄ (A, r0 ): k̄-th Krylov subspace generated by exact CG
( rank(Kk̄ (A, r0 )) = k̄ )
k
56
80
100
196
362
528
611
664
k̄ = rank(Kk (A, r0 ))
56
73
80
93
112
126
131
132
rank(Kk (A, r0 ) ∪ Kk̄ (A, r0 ))
56
74
80
93
113
127
131
132
Bcsstk04 (from MatrixMarket)
T. Gergelits, Z. Strakoš
SNA’14, Nymburk
2014, January 28
18 / 21
Inertia of computed Krylov subspaces
Distance between subspaces
Principal angles between k̄ -dimensional subspace Kk̄ (A, r0 ) and the
k̄ -dimensional restriction of the subspace Kk (A, r0 ) which corresponds to
its numerical rank:
0 ≤ θ1 ≤ θ2 ≤ . . . ≤ θk̄ ≤ π/2
Distance:
distance Kk̄ (A, r0 ), restricted Kk (A, r0 )
0
10
= sin(θk̄ )
sin(θk̄ ) = distance
sin(θk̄−1 )
sin(θk̄−2 )
sin(θk̄−3 )
−5
10
0
20
T. Gergelits, Z. Strakoš
40
60
80
100
k̄ - dimension of the corresponding subspaces
SNA’14, Nymburk
120
2014, January 28
19 / 21
Concluding remarks and future work
The rate of convergence typically substantially differs for FP and exact
CG. However, the trajectories of FP and exact CG approximations seem
to be very close to each other.
Inertia of computed Krylov subspaces represents phenomenon which
needs to be studied.
T. Gergelits, Z. Strakoš
SNA’14, Nymburk
2014, January 28
20 / 21
Thank you for your kind attention
Acknowledgement
This work has been supported by the ERC-CZ project LL1202, by the GACR
grant 201/13-06684S and by the GAUK grant 695612.
Download