ETNA
Electronic Transactions on Numerical Analysis.
Volume 35, pp. 104-117, 2009.
Copyright 2009, Kent State University. ISSN 1068-9613.
http://etna.math.kent.edu
ACCELERATION OF IMPLICIT SCHEMES FOR LARGE SYSTEMS OF
NONLINEAR ODES∗
MOUHAMAD AL SAYED ALI† AND MILOUD SADKANE†
Abstract. Implicit integration schemes for large systems of nonlinear ODEs require, at each integration step,
the solution of a large nonlinear system. Typically, the nonlinear systems are solved by an inexact Newton method
that leads to a set of linear systems involving the Jacobian matrix of the ODE, which are solved by Krylov subspace
methods. The convergence of the whole process relies on the quality of initial solutions for both the inexact Newton
iteration and the linear systems. To improve global convergence, line search and trust region algorithms are used to
find effective initial solutions. The purpose of this paper is to construct subspaces of small dimension where descent
directions for line search and trust region algorithms and initial solutions for each linear system are found. Only
one subspace is required for each integration step. This approach can be seen as an improved predictor, leading to
a significant saving in the total number of integration steps. Estimates are provided that relate the quality of the
computed initial solutions to the step size of the discretization, the order of the implicit scheme and the dimension
of the constructed subspaces. Numerical results are reported.
Key words. nonlinear equations, nonlinear ODE systems, inexact Newton, GMRES, line search, trust region
AMS subject classifications. 65H10, 65L05
1. Introduction. Consider the system of ODEs

(1.1)        ẏ(t) = f(t, y(t)),   t_0 ≤ t ≤ T,   y(t_0) = y^(0),

where y(t) ∈ R^n and n is large.
A class of implicit schemes for solving (1.1) is given by

(1.2)        G_i(y_i) = y_i − a_i − βh f(t_i, y_i) = 0,   i = q+1, …, N,

where y_0 = y^(0) and y_1, …, y_q are assumed to be known, q ≪ N, y_i is an approximation to y(t_i) with t_i = t_0 + ih, h = (T − t_0)/N, β is a scalar and a_i is a vector depending on y_{i−k} and f(t_{i−k}, y_{i−k}), 1 ≤ k ≤ q+1. Most standard implicit schemes, such as implicit Euler, Crank-Nicolson, Adams-Moulton and BDF methods, can be written as in (1.2); see, for example, [15, 16].
These schemes have good stability properties, but necessitate the solution of the large
system (1.2) at each integration step. The case when f is an affine function of the form
f (t, y(t)) = Ay(t) + b(t) leads to linear systems. This case has been considered in [1]. In the
present paper we assume that f is a nonlinear function. The Newton method can be used to
solve (1.2), but to keep the storage and computational cost low, the inexact Newton method
is preferred [9, 11]. It is given by the following process:
    Choose y_i^(0) as initial guess for y_i
    For k = 0, 1, …, until convergence do:

(1.3)        Solve inexactly  G'_i(y_i^(k)) s_i^(k) = −G_i(y_i^(k))
(1.4)        Set  y_i^(k+1) = y_i^(k) + s̃_i^(k)

where G'_i(y_i^(k)) is the Jacobian matrix of G_i at y_i^(k) and s̃_i^(k) is an approximate solution to (1.3). Krylov subspace methods [20], for example GMRES [21], can be used to find s̃_i^(k) in the affine subspace ŝ_i^(k) + K_i^(k), where ŝ_i^(k) is an initial guess for the exact solution and K_i^(k) is a Krylov subspace constructed with the matrix G'_i(y_i^(k)) and the initial residual vector G'_i(y_i^(k)) ŝ_i^(k) + G_i(y_i^(k)). One generally asks that s̃_i^(k) satisfies

(1.5)        ‖G'_i(y_i^(k)) s̃_i^(k) + G_i(y_i^(k))‖ ≤ ǫ_i^(k) ‖G_i(y_i^(k))‖

with some tolerance threshold ǫ_i^(k). The results in [9] indicate how ǫ_i^(k) must be chosen to ensure local convergence of the inexact Newton method.

∗ Received May 27, 2008. Accepted for publication January 21, 2009. Published online June 24, 2009. Recommended by R. Varga.
† Université de Bretagne Occidentale, Département de Mathématiques, 6, avenue V. Le Gorgeu, CS 93837, 29238 Brest Cedex 3, France ({alsayed,sadkane}@univ-brest.fr).
Krylov subspace methods have been used, for example, in [9, 2, 5, 6] in the context of Newton's method and in [13, 3, 4, 12, 7, 17, 19] in the context of ODEs. They have the advantage of requiring only the multiplication of the Jacobian matrix G'_i(y_i^(k)) by a vector; the Jacobian matrix need not be formed explicitly. Also, good convergence can be expected provided an efficient preconditioner and/or a good initial guess ŝ_i^(k) is available. In this paper we concentrate our effort on the latter possibility.
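To fix ideas, here is a minimal Python sketch of the inexact Newton process (1.3)-(1.4) applied to one nonlinear system of the form (1.2). It is only an illustration under our own assumptions: a dense finite-difference Jacobian and a dense linear solve stand in for the Jacobian-free GMRES inner solver discussed in the paper, and the names and the test problem are ours.

# Sketch of the inexact Newton iteration (1.3)-(1.4) for one system of the form (1.2);
# a dense Jacobian and a dense solve replace the Krylov (GMRES) inner solver.
import numpy as np

def jacobian_fd(G, y, eps=1e-7):
    # Finite-difference approximation of G'(y); in practice only J @ v products are needed.
    n = y.size
    Gy = G(y)
    J = np.empty((n, n))
    for j in range(n):
        e = np.zeros(n)
        e[j] = eps
        J[:, j] = (G(y + e) - Gy) / eps
    return J

def inexact_newton(G, y0, tol=1e-10, maxit=20):
    # Solve G(y) = 0: at each step solve G'(y_k) s_k = -G(y_k) and set y_{k+1} = y_k + s_k.
    y = y0.copy()
    for _ in range(maxit):
        Gy = G(y)
        if np.linalg.norm(Gy) <= tol:
            break
        s = np.linalg.solve(jacobian_fd(G, y), -Gy)  # stand-in for the (inexact) GMRES solve
        y = y + s                                    # update (1.4)
    return y

# One implicit Euler step G(y) = y - a - h*f(t, y) = 0 for y' = -y**3 (hypothetical data).
h, a = 0.1, np.array([1.0, 2.0])
G = lambda y: y - a - h * (-y**3)
print(inexact_newton(G, a))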
Like Newton's method, the inexact Newton method is locally convergent: it converges rapidly provided the initial guess y_i^(0) is close enough to y_i. Traditionally, y_i^(0) is found with the aid of a predictor, usually an explicit scheme applied to (1.1). The drawback is that often the Newton method does not converge with such a predictor; see Section 4.2. For this reason, line search backtracking and trust region techniques have been used to improve global convergence [5, 6, 11] from any initial guess y_i^(0). These techniques generate a sequence of vectors that converge, under some conditions, to a local minimum or a saddle point of the function h_i(y) = (1/2)‖G_i(y)‖². Here and throughout the paper, the symbol ‖·‖ denotes the Euclidean norm. Each vector is obtained from the preceding one by adding a descent direction, on which the effectiveness of these techniques strongly depends. Of particular importance from an algorithmic point of view is the paper [5], where the solution of (1.3) and the line search and trust region algorithms all use Krylov subspaces. This results in a method that is effective for nonlinear equations, but rather expensive for ODEs, since the nonlinear equation (1.2) must be solved N − q times.
Our aim is to show that the subspace

(1.6)        V_i = span{f(t_{i−1}, y_{i−1}), …, f(t_{i−r}, y_{i−r})},   r ≪ n,

and some modifications thereof (namely, when y_{i−1}, …, y_{i−r} are replaced by approximations) contain a good approximation to y_i − a_i; see Theorems 2.1 and 2.2. As a consequence, the subspace V_i can be used to find y_i^(0) and ŝ_i^(k), k ≥ 0.
More precisely, we will see that good descent directions for line search and trust region algorithms and a good initial guess ŝ_i^(k) of the exact solution s_i^(k) can be found from V_i and the Petrov-Galerkin process. When applied to (1.3), this process finds ŝ_i^(k) in the subspace −y_i^(k) + a_i + V_i, such that

        G'_i(y_i^(k)) ŝ_i^(k) + G_i(y_i^(k)) ⊥ G'_i(y_i^(k)) V_i.

This means that ŝ_i^(k) is the best least squares solution to (1.3) in the subspace −y_i^(k) + a_i + V_i. Once ŝ_i^(k) is obtained as above, very few iterations by a Krylov subspace method are needed to satisfy (1.5). The subspace V_i does not depend on k and the main advantage here is that only one subspace of small dimension is required for each iteration i of the implicit scheme. The approach thus obtained leads to a significant saving in the total number of integration steps.
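This Petrov-Galerkin extraction of ŝ_i^(k) amounts to a small least-squares problem. The following Python sketch illustrates it under our own assumptions (dense explicit Jacobian, hypothetical names and data): with V an orthonormal basis of V_i, the coefficients c of ŝ = −y + a + V c minimize ‖G'(y)(a − y + V c) + G(y)‖, whose normal equations are exactly the orthogonality condition above.

# Sketch of the Petrov-Galerkin extraction of an initial guess in -y + a + V_i
# for the Newton system G'(y) s = -G(y); names and test data are ours.
import numpy as np

def petrov_galerkin_guess(Gy, Jy, y, a, V):
    # Return s_hat in -y + a + range(V) with (Jy @ s_hat + Gy) orthogonal to Jy @ range(V).
    W = Jy @ V                                   # G'(y) V
    r = -(Jy @ (a - y) + Gy)                     # right-hand side of the small LS problem
    c, *_ = np.linalg.lstsq(W, r, rcond=None)    # least-squares coefficients
    return (a - y) + V @ c

# Tiny illustration on a random quadratic G (hypothetical data).
rng = np.random.default_rng(0)
n, r = 30, 5
A = rng.standard_normal((n, n)) / np.sqrt(n)
G = lambda y: y + 0.1 * A @ (y * y) - 1.0        # some nonlinear G
J = lambda y: np.eye(n) + 0.1 * A @ np.diag(2 * y)
y = rng.standard_normal(n)
a = rng.standard_normal(n)
V, _ = np.linalg.qr(rng.standard_normal((n, r)))  # orthonormal basis of a subspace V_i

s_hat = petrov_galerkin_guess(G(y), J(y), y, a, V)
s_exact = np.linalg.solve(J(y), -G(y))
print(np.linalg.norm(J(y) @ s_hat + G(y)), np.linalg.norm(J(y) @ s_exact + G(y)))

Because the subspace V_i does not depend on k, the same small projection can be reused for every Newton iterate within integration step i.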
The paper is organized as follows. In Section 2 we describe the subspace of approximation Vi and provide estimates for the error between exact and approximate solutions of (1.2)
in terms of the step size h, the order of the implicit scheme and the dimension r of Vi . In
Section 3 we briefly review the line search and trust region algorithms and explain how these
algorithms can be used, in conjunction with Vi and the Petrov-Galerkin process, to compute
good descent directions and good initial guesses for each linear system of the form (1.3). In
Section 4 we discuss the algorithmic aspect of the proposed approach and show numerically
its effectiveness on a very stiff problem. The Krylov subspace method used throughout the
paper is GMRES.
2. Approximation of the initial guess. In practice we can only hope for an approximation ỹ_i to the exact solution y_i of (1.2). Therefore we replace (1.2) by

(2.1)        ‖G̃_i(ỹ_i)‖ ≤ ε,   i = q+1, …, N,

where G̃_i(x) = x − ã_i − βh f(t_i, x), the vector ã_i is obtained by replacing the y_j's in the expression of a_i by the ỹ_j's, and ε is some tolerance threshold.
Throughout this section we assume that f is Lipschitz with respect to the second variable. We also assume that the scheme (1.2) is stable and of order p and that, for i = 1, …, q, ỹ_i is computed with an i-step scheme such that

(2.2)        max_{0≤i≤q} ‖ỹ_i − y_i‖ = O(ε/h).

Then, there exists a constant C_0 such that (see, e.g., [18, Chap. 8])

        max_{q+1≤i≤N} ‖ỹ_i − y_i‖ ≤ C_0 ( max_{0≤i≤q} ‖ỹ_i − y_i‖ + Σ_{i=q+1}^{N} ‖G̃_i(ỹ_i)‖ )
                                  ≤ C_0 ( max_{0≤i≤q} ‖ỹ_i − y_i‖ + (N − q) ε )
                                  = O(ε/h).
The following theorem shows that the subspace

(2.3)        V_i = span{f(t_{i−k}, ỹ_{i−k}), 1 ≤ k ≤ r}

contains a good approximation to ỹ_i − ã_i.
THEOREM 2.1. Assume that g(t) = f(t, y(t)) ∈ C^r([t_0, T]) and f is continuously differentiable on [t_0, T] × R^n. Let V_i be the subspace defined in (2.3). Then, there exists a ỹ in ã_i + V_i such that, for i = q+1, …, N,

        ‖ỹ_i − ỹ‖ = O(h^{p+1}) + O(h^{r+1}) + O(ε),
        ‖G̃_i(ỹ)‖ = O(h^{p+1}) + O(h^{r+1}) + O(ε).
Proof. Since g is r-times continuously differentiable, from the Lagrange interpolation formula (see, e.g., [8, Chap. 3]), there exist constants α_l, 1 ≤ l ≤ r, such that

(2.4)        ‖g(t_i) − Σ_{l=1}^{r} α_l g(t_{i−l})‖ = O(h^r).

From (2.4) and the fact that the scheme (1.2) is stable and of order p and f is Lipschitz, we have

(2.5)        ‖f(t_i, ỹ_i) − Σ_{l=1}^{r} α_l f(t_{i−l}, ỹ_{i−l})‖ = O(h^p) + O(h^r) + O(ε/h).
Now let ỹ = ã_i + hβ Σ_{l=1}^{r} α_l f(t_{i−l}, ỹ_{i−l}) ∈ ã_i + V_i. Then

        ỹ_i − ỹ = ỹ_i − ã_i − hβ Σ_{l=1}^{r} α_l f(t_{i−l}, ỹ_{i−l})
                = G̃_i(ỹ_i) + hβ ( f(t_i, ỹ_i) − Σ_{l=1}^{r} α_l f(t_{i−l}, ỹ_{i−l}) ),

which, with (2.1) and (2.5), gives

        ‖ỹ_i − ỹ‖ = O(h^{p+1}) + O(h^{r+1}) + O(ε).

Now we have

        ‖G̃_i(ỹ)‖ ≤ ‖G̃_i(ỹ) − G̃_i(ỹ_i)‖ + ‖G̃_i(ỹ_i)‖
                 = ‖ỹ − ỹ_i + hβ ( f(t_i, ỹ_i) − f(t_i, ỹ) )‖ + ‖G̃_i(ỹ_i)‖
                 ≤ ‖ỹ − ỹ_i‖ + h|β| ‖f(t_i, ỹ) − f(t_i, ỹ_i)‖ + ε.

Hence,

        ‖G̃_i(ỹ)‖ = O(h^{p+1}) + O(h^{r+1}) + O(ε).
To reduce the cost in the Petrov-Galerkin process and in GMRES, we modify the subspace V_i by keeping only the last vectors whose computations necessitate the use of GMRES to satisfy (1.5). For example, suppose that at iteration i the Petrov-Galerkin process was enough to find ŝ_i^(k) such that y_i^(k+1) = y_i^(k) + ŝ_i^(k) with ‖G̃_i(y_i^(k+1))‖ ≤ ε. In this case we set s̃_i^(k) = ŝ_i^(k), ỹ_i = y_i^(k+1) and use the same subspace V_i for the next iteration.
Let us denote, again, by

(2.6)        V_i = span{f(t_{i−j}, ỹ_{i−j}), j = i_1, i_2, …, i_s}

the subspace associated with the last s vectors ỹ_{i−i_1}, ỹ_{i−i_2}, …, ỹ_{i−i_s} with i_1 < … < i_s, whose computations necessitate the use of GMRES to satisfy (1.5), and by

(2.7)        W_i = span{f(t_{i−j}, ỹ_{i−j}), j = 1, 2, …, i_1 − 1}

the set associated with the other vectors such that

        ‖G̃_{i−j}(ỹ_{i−j})‖ ≤ ε,   ỹ_{i−j} ∈ ã_{i−j} + V_i.

Such a situation is encountered in practice; see steps (TR3) and (N1.1) of Algorithm 4.1. Then we have the following theorem.
THEOREM 2.2. Assume that g(t) = f(t, y(t)) ∈ C^r([t_0, T]) with r = s + i_1 − 1 and f continuously differentiable on [t_0, T] × R^n. Let V_i be the subspace defined in (2.6). Then there exists ỹ in ã_i + V_i such that, for i = q+1, …, N,

        ‖ỹ_i − ỹ‖ = O(h^{p+1}) + O(h^{r+1}) + O(ε),
        ‖G̃_i(ỹ)‖ = O(h^{p+1}) + O(h^{r+1}) + O(ε).

Proof. From Theorem 2.1 we know that there exist z_1 ∈ ã_i + V_i, z_2 ∈ W_i, such that

(2.8)        ‖ỹ_i − (z_1 + z_2)‖ = O(h^{p+1}) + O(h^{r+1}) + O(ε).
Let us write z_2 = βh Σ_{j=1}^{i_1−1} γ_j f(t_{i−j}, ỹ_{i−j}) and define ỹ = z_1 + Σ_{j=1}^{i_1−1} γ_j (ỹ_{i−j} − ã_{i−j}). Note that ỹ ∈ ã_i + V_i and

        ỹ_i − ỹ = (ỹ_i − (z_1 + z_2)) + Σ_{j=1}^{i_1−1} γ_j ( hβ f(t_{i−j}, ỹ_{i−j}) − (ỹ_{i−j} − ã_{i−j}) ),

        ‖ỹ_i − ỹ‖ ≤ ‖ỹ_i − (z_1 + z_2)‖ + Σ_{j=1}^{i_1−1} |γ_j| ‖G̃_{i−j}(ỹ_{i−j})‖
                  = O(h^{p+1}) + O(h^{r+1}) + O(ε).

As in the proof of the previous theorem, we obtain

        ‖G̃_i(ỹ)‖ = O(h^{p+1}) + O(h^{r+1}) + O(ε).
REMARK 2.3.
1. In the sequel we will essentially work with the subspace V_i defined in (2.6).
2. In practice, r > p; for example, the numerical tests are carried out with r = 20 and p = 1 or 2. Thus, ‖ỹ_i − ỹ‖ and ‖G̃_i(ỹ)‖ behave like O(h^{p+1}) + O(ε).
3. Line search and trust region algorithms. Line search and trust region algorithms are two simple iterative methods for finding a local minimum of a function. They are based on the notion of descent directions that move the iterates towards a local minimum. More precisely, let

(3.1)        h_i(u) = (1/2) ‖G_i(u)‖²,   u ∈ R^n.

A vector p ∈ R^n is a descent direction of G_i at u if

(3.2)        ∇h_i(u)^T p < 0.

This inequality guarantees that, for small positive δ,

(3.3)        h_i(u + δp) < h_i(u).

In fact, inequality (3.2) ensures that h_i decreases at a rate proportional to ∇h_i(u)^T p.
In the next two subsections we briefly explain how these algorithms can be used to find a good initial guess for the nonlinear system (1.2). More details on these algorithms can be found, for example, in [10].
3.1. Backtracking line search algorithm. The line search algorithm computes a sequence of vectors u_i^(k) such that

(3.4)        u_i^(k+1) = u_i^(k) + λ_k p_i^(k),

where p_i^(k) is a descent direction of G_i at u_i^(k), and λ_k is a scalar chosen to satisfy the Goldstein-Armijo condition

(3.5)        h_i(u_i^(k+1)) ≤ h_i(u_i^(k)) + α λ_k ∇h_i(u_i^(k))^T p_i^(k),

where α ∈ (0, 1/2) is a parameter typically set to 10^-4.
Under some conditions, the sequence (u_i^(k))_{k≥0} converges to a local minimum of h_i or the sequence ∇h_i(u_i^(k)) converges to 0. Moreover, for each descent direction p_i^(k), Theorem 6.3.2 in [10] shows the existence of λ_k such that (3.5) is satisfied.
The scalar λ_k is computed by the backtracking method. This method starts with λ_k = 1 and repeatedly reduces λ_k until an acceptable iterate u_i^(k+1), satisfying (3.5), is found. When λ_k = 1, the natural question is which descent direction p_i^(k) leads approximately to (3.5) and u_i^(k+1) ≈ y_i. These conditions translate to

(3.6)        0 ≲ (1/2) ‖G_i(u_i^(k))‖² + α G_i(u_i^(k))^T G'_i(u_i^(k)) p_i^(k).
It turns out that the Newton direction

(3.7)        p_i^(k) = −(G'_i(u_i^(k)))^{-1} G_i(u_i^(k))

guarantees (3.6). However, this direction is expensive to compute. Therefore, using GMRES, we look for an approximation p̃_i^(k) to p_i^(k) that satisfies

(3.8)        ‖G'_i(u_i^(k)) p̃_i^(k) + G_i(u_i^(k))‖ ≤ η_i^(k) ‖G_i(u_i^(k))‖,

with some tolerance threshold η_i^(k) > 0. Then we have

        ∇h_i(u_i^(k))^T p̃_i^(k) = −‖G_i(u_i^(k))‖² + G_i(u_i^(k))^T ( G'_i(u_i^(k)) p̃_i^(k) + G_i(u_i^(k)) )
                                 ≤ (η_i^(k) − 1) ‖G_i(u_i^(k))‖².

In particular, if η_i^(k) ≪ 1, then p̃_i^(k) is a good descent direction.
In conclusion, when p̃_i^(k) satisfies (3.8) with 0 < η_i^(k) ≪ 1, then p̃_i^(k) is a descent direction that satisfies, or almost satisfies, (3.5) with λ_k = 1. In the latter case, the backtracking method will rapidly find a scalar λ_k that satisfies (3.5) and the sequence u_i^(k) converges to y_i. As we have mentioned, p̃_i^(k) is obtained with GMRES. To accelerate the computation, we start GMRES with p̂_i^(k), obtained by the Petrov-Galerkin process applied to the subspace −u_i^(k) + ã_i + V_i. With this initial solution, only a few iterations, and sometimes no iteration, of GMRES are needed to satisfy (3.8); see Tables 4.1 and 4.2.
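For completeness, a minimal sketch of the backtracking step with the Goldstein-Armijo test (3.5) is given below. It is an illustration only: the halving of λ is one common choice, and all names and the toy problem are ours.

# Sketch of the backtracking line search with the Goldstein-Armijo test (3.5);
# G, J (the Jacobian), u and p are supplied by the caller, names are ours.
import numpy as np

def backtracking(G, J, u, p, alpha=1e-4, lam_min=1e-12):
    # Return lambda <= 1 such that h(u + lam*p) <= h(u) + alpha*lam*grad_h(u)^T p.
    h = lambda x: 0.5 * np.dot(G(x), G(x))
    slope = np.dot(G(u), J(u) @ p)       # grad h(u)^T p = G(u)^T G'(u) p, negative for descent
    lam = 1.0
    while h(u + lam * p) > h(u) + alpha * lam * slope:
        lam *= 0.5                       # simple halving; a 1D quadratic/cubic model is also common
        if lam < lam_min:
            break
    return lam

# Example on a 1D nonlinear equation G(u) = u^3 - 2 (hypothetical data).
G = lambda u: u**3 - 2.0
J = lambda u: np.array([[3.0 * u[0]**2]])
u = np.array([2.0])
p = np.linalg.solve(J(u), -G(u))         # Newton direction (3.7)
lam = backtracking(G, J, u, p)
print(lam, u + lam * p)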
One drawback of the line search algorithm is that even when the condition (3.5) is not
satisfied, the descent direction is still kept unchanged. Only the scalar λk is updated using a one-dimensional quadratic or cubic model. The trust region algorithm overcomes this
drawback, at the price of increasing the computational cost.
3.2. Trust region algorithm. The trust region algorithm generates a sequence of vectors u_i^(k), such that u_i^(k+1) = u_i^(k) + d_i^(k), where d_i^(k) is the solution of

(3.9)        min_{‖d‖ ≤ δ_i^(k)} Ψ_i^(k)(d),

where δ_i^(k) > 0 and Ψ_i^(k) is given by the quadratic model

(3.10)       Ψ_i^(k)(d) = (1/2) ‖G_i(u_i^(k)) + G'_i(u_i^(k)) d‖²
                        = h_i(u_i^(k)) + ∇h_i(u_i^(k))^T d + (1/2) d^T B_i^(k) d,

with B_i^(k) = G'_i(u_i^(k))^T G'_i(u_i^(k)).
Since B_i^(k) is symmetric and positive definite, the solution of (3.9) is given by

        d_i^(k) = −(B_i^(k))^{-1} ∇h_i(u_i^(k))                    if ‖(B_i^(k))^{-1} ∇h_i(u_i^(k))‖ ≤ δ_i^(k),
        d_i^(k) = −(B_i^(k) + µ_i^(k) I_n)^{-1} ∇h_i(u_i^(k))      if ‖(B_i^(k))^{-1} ∇h_i(u_i^(k))‖ > δ_i^(k),

where µ_i^(k) ≥ 0 solves

(3.11)       ‖(B_i^(k) + µ_i^(k) I_n)^{-1} ∇h_i(u_i^(k))‖ = δ_i^(k).

Note that d_i^(k) is a descent direction of G_i at u_i^(k) and coincides with the Newton direction (3.7) when ‖(B_i^(k))^{-1} ∇h_i(u_i^(k))‖ ≤ δ_i^(k).
To compute the scalar µ_i^(k) that approximately satisfies (3.11), we will use two methods: the locally constrained optimal hook step method and the dogleg step method. The former finds µ_i^(k) such that ‖(B_i^(k) + µ_i^(k) I_n)^{-1} ∇h_i(u_i^(k))‖ ≈ δ_i^(k), and takes u_i^(k+1) = u_i^(k) − (B_i^(k) + µ_i^(k) I_n)^{-1} ∇h_i(u_i^(k)). The latter uses a piecewise linear approximation of the curve µ → u_i^(k) − (B_i^(k) + µ I_n)^{-1} ∇h_i(u_i^(k)) and takes u_i^(k+1) as the point on this approximation such that ‖u_i^(k+1) − u_i^(k)‖ = δ_i^(k). Note that both methods, as well as the computation of d_i^(k), necessitate the solution of large linear systems. To reduce the cost, we proceed as in [5] and replace the full quadratic model (3.10) by the following lower-dimensional one,

(3.12)       Φ_i^(k)(c) = h_i(u_i^(k)) + ∇h_i(u_i^(k))^T V_i c + (1/2) c^T (V_i^T B_i^(k) V_i) c,

where V_i is an n × p matrix whose columns form an orthonormal basis of V_i. Since V_i^T B_i^(k) V_i is symmetric positive definite, the next iterate u_i^(k+1) is given by u_i^(k) + V_i c_i^(k), where c_i^(k) is the solution of

(3.13)       min_{‖c‖ ≤ δ_i^(k)} Φ_i^(k)(c).

Note that c_i^(k) is not expensive to compute and that V_i c_i^(k) is a descent direction of G_i at u_i^(k). The condition for accepting u_i^(k+1) is the one written in (3.5), namely,

(3.14)       h_i(u_i^(k+1)) ≤ h_i(u_i^(k)) + α ∇h_i(u_i^(k))^T (u_i^(k+1) − u_i^(k)),

where α ∈ (0, 1/2). If u_i^(k+1) does not satisfy (3.14), then we reduce δ_i^(k) and return to (3.13) to compute c_i^(k) by the hook step or the dogleg step method. If u_i^(k+1) satisfies (3.14), then one must decide whether δ_i^(k) should be increased, decreased, or kept the same for the next step k + 1; see [10].
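As an illustration of the reduced trust-region step, the sketch below computes a dogleg solution of (3.13) from the small matrix V_i^T B_i^(k) V_i = (G'_i(u_i^(k)) V_i)^T (G'_i(u_i^(k)) V_i) and the reduced gradient V_i^T ∇h_i(u_i^(k)). It is a sketch under our own assumptions: the acceptance test (3.14) and the update of δ_i^(k) are omitted, and all names and data are ours.

# Sketch of a dogleg step for the reduced trust-region model (3.12)-(3.13);
# B_hat = (G'(u)V)^T (G'(u)V), g_hat = V^T grad h(u) = (G'(u)V)^T G(u). Names are ours.
import numpy as np

def dogleg_reduced(B_hat, g_hat, delta):
    # Return c with ||c|| <= delta approximately minimizing g_hat^T c + 0.5 c^T B_hat c.
    c_newton = -np.linalg.solve(B_hat, g_hat)          # unconstrained minimizer of the model
    if np.linalg.norm(c_newton) <= delta:
        return c_newton
    c_cauchy = -(g_hat @ g_hat) / (g_hat @ B_hat @ g_hat) * g_hat   # Cauchy point
    if np.linalg.norm(c_cauchy) >= delta:
        return -delta * g_hat / np.linalg.norm(g_hat)   # truncated steepest descent
    # otherwise walk along the segment from the Cauchy point to the Newton point
    d = c_newton - c_cauchy
    a, b = d @ d, 2.0 * (c_cauchy @ d)
    e = c_cauchy @ c_cauchy - delta**2
    tau = (-b + np.sqrt(b * b - 4.0 * a * e)) / (2.0 * a)
    return c_cauchy + tau * d

# Tiny illustration with a random SPD reduced model (hypothetical data).
rng = np.random.default_rng(1)
W = rng.standard_normal((40, 6))        # stands for G'(u) V with orthonormal V
B_hat = W.T @ W
g_hat = rng.standard_normal(6)
c = dogleg_reduced(B_hat, g_hat, delta=0.3)
print(np.linalg.norm(c))                 # <= 0.3 up to rounding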
4. Algorithmic aspect. In this section we write, in an informal way, an algorithm that computes the sequence (ỹ_i) defined in (2.1). The initial guess for each nonlinear system is obtained with a backtracking line search (part 3.1) or trust region (part 3.2), and the approximate solution ỹ_i is obtained with the inexact Newton method (part 3.3). The parameters r, ε_LS^(1), ε_LS^(2), ε, kmaxLS and kmaxIN should be fixed by the user.
4.1. The algorithm.
ALGORITHM 4.1.
1. Assume that for k = 1, …, q, the approximation ỹ_k is either given or is computed with a k-step scheme such that max_{1≤k≤q} ‖ỹ_k − y_k‖ = O(ε/h).
   Set i = q + 1.
2. Let R_i be the matrix formed by the last k_0 vectors f(t_{i−k}, ỹ_{i−k}), 1 ≤ k ≤ k_0, where k_0 = min(q + 1, r). Orthonormalize the columns of R_i in V_i.
3. Repeat until i = N
   3.0 Compute an initial solution u_i^(0) (e.g., by an explicit method applied to (1.1)).
   3.1 Line Search: Repeat until
           ‖G̃_i(u_i^(k))‖ ≤ ε_LS^(1)  or  ‖G̃_i(u_i^(k))‖ − ‖G̃_i(u_i^(k+1))‖ ≤ ε_LS^(2)  or  k > kmaxLS
       (LS1) Compute an initial solution p̂_i^(k) of (3.7) by applying the Petrov-Galerkin process to −u_i^(k) + ã_i + range(V_i).
             · If p̂_i^(k) satisfies (3.8), then set p̃_i^(k) = p̂_i^(k).
             · Otherwise, compute an approximate solution p̃_i^(k) of (3.7) by GMRES, so that (3.8) holds, starting with p̂_i^(k).
       (LS2) Compute the scalar λ_k by the backtracking method, so that (3.5) holds.
       (LS3) Set u_i^(k+1) = u_i^(k) + λ_k p̃_i^(k), k := k + 1, y_i^(0) = u_i^(k).
   3.2 Trust Region: replace the steps (LS1), (LS2), and (LS3) above by
       (TR1) Compute an approximation c_i^(k) to (3.13) by the hook step or the dogleg step.
       (TR2) Compute u_i^(k+1) = u_i^(k) + V_i c_i^(k) satisfying (3.14).
             Set k := k + 1, y_i^(0) = u_i^(k).
       (TR3) If ‖G̃_i(y_i^(0))‖ ≤ ε, then set ỹ_i = y_i^(0), i := i + 1 and go to 3.
   3.3 Inexact Newton: Repeat until
           ‖G̃_i(y_i^(k))‖ ≤ ε  or  k > kmaxIN
       (N1) Compute an approximation ŝ_i^(k) of the linear system (1.3) by applying the Petrov-Galerkin process to −y_i^(k) + ã_i + range(V_i).
       (N1.1) If ‖G̃_i(y_i^(k) + ŝ_i^(k))‖ ≤ ε, set ỹ_i = y_i^(k) + ŝ_i^(k), i := i + 1 and go to 3.
       (N1.2) Else, if ŝ_i^(k) satisfies (1.5), set y_i^(k+1) = y_i^(k) + ŝ_i^(k), k := k + 1 and go to 3.3.
       (N1.3) Else, compute an approximation s̃_i^(k) to (1.3) by GMRES, starting with ŝ_i^(k). Set ŝ_i^(k) = s̃_i^(k), and go to (N1.1).
   3.4 Set ỹ_i = y_i^(k).
   3.5 Let k_0 be the number of columns of R_i. If k_0 < r, then set R_{i+1} = [R_i, f(t_i, ỹ_i)], else R_{i+1} = [S_i, f(t_i, ỹ_i)], where S_i is the matrix formed by the last r − 1 columns of R_i. Orthonormalize the columns of R_{i+1} in V_{i+1}.
   3.6 i := i + 1.

In this algorithm, the subspace V_i = range(V_i) corresponds to the one of Theorem 2.2. If we want to use the subspace V_i of Theorem 2.1, then step (TR3) should be replaced by:
        If ‖G̃_i(y_i^(0))‖ ≤ ε, set ỹ_i = y_i^(0), and go to 3.5,
and step (N1.1) should be replaced by:
        If ‖G̃_i(y_i^(k) + ŝ_i^(k))‖ ≤ ε, set ỹ_i = y_i^(k) + ŝ_i^(k), and go to 3.5.
Note that if step (TR3) or step (N1.1) is satisfied, then ỹi ∈ ãi + Vi . For this reason we
do not change the subspace Vi for the next iteration i + 1.
In steps 2 and 3.5 the orthonormalization of R_i and R_{i+1} uses an updating QR factorization; see, e.g., [14, p. 594].
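A simple way to realize this update is sketched below: each new column f(t_i, ỹ_i) is orthogonalized against the current basis and appended. This is only a plain Gram-Schmidt variant of the QR updating idea (removing the oldest column when the window of r vectors is full would additionally require the downdating described in [14]), and the names are ours.

# Sketch of maintaining an orthonormal basis V_i of span(R_i) as columns are appended;
# a plain (re)orthogonalization stands in for the updating QR factorization of [14].
import numpy as np

def append_column(V, w, tol=1e-12):
    # Orthogonalize w against the columns of V (twice, for stability) and append it.
    if V is None or V.shape[1] == 0:
        nrm = np.linalg.norm(w)
        return (w[:, None] / nrm) if nrm > tol else V
    for _ in range(2):                      # two Gram-Schmidt passes
        w = w - V @ (V.T @ w)
    nrm = np.linalg.norm(w)
    if nrm <= tol:                          # w already (numerically) in span(V): keep V
        return V
    return np.hstack([V, w[:, None] / nrm])

# Illustration: build a basis from successive vectors f(t_{i-k}, y_{i-k}) (hypothetical data).
rng = np.random.default_rng(2)
n, r = 50, 4
V = None
for _ in range(r):
    V = append_column(V, rng.standard_normal(n))
print(np.allclose(V.T @ V, np.eye(V.shape[1])))   # True: columns are orthonormal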
4.2. Numerical results. We now show the behavior of Algorithm 4.1 on the Robertson chemical reaction with one-dimensional diffusion [17]:

        ∂u/∂t = −0.04 u + 10^4 vw + α ∂²u/∂x²,
        ∂v/∂t = 0.04 u − 10^4 vw − 3×10^7 v² + α ∂²v/∂x²,
        ∂w/∂t = 3×10^7 v² + α ∂²w/∂x²,

with 0 ≤ x ≤ 1, 0 ≤ t ≤ 1, α = 2×10^-2, Neumann boundary conditions ∂u/∂x = ∂v/∂x = ∂w/∂x = 0 at x = 0, 1, and initial values

        u(x, 0) = 1 + sin(2πx),   v(x, 0) = w(x, 0) = 0.

The second partial derivative is discretized on a uniform grid of 5×10^3 points and the boundary conditions at x = 0 (resp. x = 1) are discretized by forward (resp. backward) differences. We thus obtain an ODE system of the form (1.1) of size n = 15×10^3. This system is known to be very stiff.
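For reference, a Python sketch of the resulting right-hand side f(t, y) is given below (method of lines, unknowns stacked as [u, v, w], centered second differences in the interior). The one-sided treatment of the Neumann conditions and all names are our own assumptions, not necessarily the exact discretization used for the reported results.

# Sketch of the method-of-lines right-hand side for the Robertson reaction-diffusion
# problem; m grid points per species, unknowns stacked as [u, v, w]. Names are ours.
import numpy as np

def robertson_rhs(t, y, m=5000, alpha=2e-2):
    u, v, w = y[:m], y[m:2*m], y[2*m:]
    dx = 1.0 / (m - 1)

    def lap(z):
        d2 = np.empty_like(z)
        d2[1:-1] = (z[:-2] - 2.0 * z[1:-1] + z[2:]) / dx**2
        d2[0] = (z[1] - z[0]) / dx**2       # one-sided treatment of z_x = 0 at x = 0 (assumed)
        d2[-1] = (z[-2] - z[-1]) / dx**2    # and at x = 1 (assumed)
        return d2

    react_u = -0.04 * u + 1e4 * v * w
    react_v = 0.04 * u - 1e4 * v * w - 3e7 * v**2
    react_w = 3e7 * v**2
    return np.concatenate([react_u + alpha * lap(u),
                           react_v + alpha * lap(v),
                           react_w + alpha * lap(w)])

# Initial values u(x,0) = 1 + sin(2*pi*x), v = w = 0.
m = 5000
x = np.linspace(0.0, 1.0, m)
y0 = np.concatenate([1.0 + np.sin(2.0 * np.pi * x), np.zeros(m), np.zeros(m)])
print(robertson_rhs(0.0, y0).shape)   # (15000,)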
Recall that our aim is to show that the subspace V_i helps to find good initial solutions y_i^(0) and ŝ_i^(k). The implicit schemes used in (1.2) are implicit Euler and Crank-Nicolson. The numerical tests are carried out with the following parameters: q = 0, r ≡ dim(V_i) = 20, h = 1/100. The parameter ε in the inexact Newton method is given by 10^-5. The threshold parameters ǫ_i^(k) and η_i^(k), used in the computation of s̃_i^(k) in (1.5) and p̃_i^(k) in (3.8), are fixed at 10^-2. Note that the choice of a constant ǫ_i^(k) implies that the convergence of the sequence (y_i^(k))_{k≥0} is linear [9]. In steps (LS1) and (N1.3), the computations of p̃_i^(k) and s̃_i^(k) are carried out by GMRES(20), i.e., restarted GMRES with restart value 20. We say that GMRES fails to converge when the stopping criterion is not satisfied within a total number of iterations (i.e., number of matrix-vector multiplies) fixed at 10^4. The parameters kmaxIN, kmaxLS, ε_LS^(1) and ε_LS^(2) are given by 15, 15, 10^-2 and 10^-6, respectively. The initial solution u_i^(0) in step 3.0 is given by explicit Euler, i.e., u_i^(0) = ỹ_{i−1} + h f(t_{i−1}, ỹ_{i−1}).
We also compare the proposed approach with some "standard predictor schemes" where the initial solutions to (1.3) and to the line search are obtained with explicit Euler, explicit Runge-Kutta of order 4, or Adams-Bashforth of order r. Specifically, we remove steps 2 and 3.5 from Algorithm 4.1 and replace u_i^(0) in step 3.0, p̂_i^(k) in step (LS1) and ŝ_i^(k) in step (N1), respectively, by

        u_i^(0) = y_i^(e),   p̂_i^(k) = −u_i^(k) + y_i^(e)   and   ŝ_i^(k) = −y_i^(k) + y_i^(e),

where

        y_{i+1}^(e) = ỹ_i + h f(t_i, ỹ_i)

for explicit Euler,

        y_{i+1}^(e) = ỹ_i + (h/6) (k_1 + 2k_2 + 2k_3 + k_4)

for Runge-Kutta 4 with k_1 = f(t_i, ỹ_i), k_2 = f(t_i + h/2, ỹ_i + h k_1/2), k_3 = f(t_i + h/2, ỹ_i + h k_2/2), k_4 = f(t_{i+1}, ỹ_i + h k_3), and

        y_{i+1}^(e) = ỹ_i + h Σ_{k=0}^{r−1} β_k ∇^k f(t_i, ỹ_i)

for Adams-Bashforth with β_k = ∫_0^1 Π_{j=0}^{k−1} (j + s) ds / k!, ∇^0 f(t_i, ỹ_i) = f(t_i, ỹ_i) and ∇^{k+1} f(t_i, ỹ_i) = ∇^k f(t_i, ỹ_i) − ∇^k f(t_{i−1}, ỹ_{i−1}).
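These predictors are straightforward to code; the sketch below implements the three of them (the Adams-Bashforth variant uses the backward-difference form written above). All names and the small test at the end are ours.

# Sketch of the explicit predictors used for comparison: explicit Euler, classical
# Runge-Kutta 4 and Adams-Bashforth in backward-difference form. Names are ours.
import numpy as np
from math import factorial

def euler_pred(f, t, y, h):
    return y + h * f(t, y)

def rk4_pred(f, t, y, h):
    k1 = f(t, y)
    k2 = f(t + h / 2, y + h * k1 / 2)
    k3 = f(t + h / 2, y + h * k2 / 2)
    k4 = f(t + h, y + h * k3)
    return y + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)

def ab_pred(f_hist, y, h):
    # f_hist = [f(t_i, y_i), f(t_{i-1}, y_{i-1}), ...]; beta_k = int_0^1 prod_{j<k}(j+s) ds / k!.
    r = len(f_hist)
    diffs = [np.asarray(f_hist[0], dtype=float)]            # nabla^0 f
    table = [np.asarray(g, dtype=float) for g in f_hist]
    for k in range(1, r):                                    # nabla^k f by recursion
        table = [table[j] - table[j + 1] for j in range(len(table) - 1)]
        diffs.append(table[0])
    betas = []
    for k in range(r):
        poly = np.polynomial.Polynomial([1.0])
        for j in range(k):
            poly = poly * np.polynomial.Polynomial([j, 1.0])  # factor (s + j)
        betas.append(poly.integ()(1.0) / factorial(k))
    return y + h * sum(b * d for b, d in zip(betas, diffs))

# Quick check on y' = -y, y(0) = 1 (hypothetical data): all three predict y(h) ~ exp(-h).
f = lambda t, y: -y
h, y0 = 0.01, np.array([1.0])
hist = [f(0.0, y0), f(-h, np.exp(h) * y0)]    # exact past values for the 2-step AB predictor
print(euler_pred(f, 0.0, y0, h), rk4_pred(f, 0.0, y0, h), ab_pred(hist, y0, h))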
Figures 4.1-4.3 show the residual norm ‖G̃_i(y_i^(0))‖, where y_i^(0) is the initial solution computed by the line search or trust region algorithm.
FIG. 4.1. Line search. (Residual norm vs. iteration i of the implicit Euler scheme, left, and of the Crank-Nicolson scheme, right.)
FIG. 4.2. Trust region (dogleg step). (Residual norm vs. iteration i of the implicit Euler scheme, left, and of the Crank-Nicolson scheme, right.)
Tables 4.1 and 4.2 show some computational details when the inexact Newton method is used along with the line search algorithm. The second column of these tables shows ‖G̃_i(y_i^(k) + ŝ_i^(k))‖, where y_i^(k) is the k-th Newton iterate for computing y_i and ŝ_i^(k) is the initial guess of s_i^(k), obtained from V_i and the Petrov-Galerkin process. The third column shows the number of iterations required by GMRES to compute s̃_i^(k), starting with ŝ_i^(k), such that (1.5) holds. For example, when i = 1, Table 4.1 shows that 4329 + 5440 = 9769 iterations of GMRES have been used to satisfy the condition in step 3.3 of Algorithm 4.1, while 2320 iterations were sufficient when i = 10. Columns 4, 5, and 6 of these tables show analogous information for the line search (step 3.1 of Algorithm 4.1).
FIG. 4.3. Trust region (hook step). (Residual norm vs. iteration i of the implicit Euler scheme, left, and of the Crank-Nicolson scheme, right.)
From these tables, we see that at iteration i = 1 the number of iterations required by GMRES is large and the initial solutions ŝ_i^(k) and p̂_i^(k) are not good enough. This is because the subspace V_i contains only one vector. As dim(V_i) increases, the relative residual associated with p̂_i^(k) decreases, especially when the Crank-Nicolson scheme is used. As a consequence, fewer iterations, and sometimes no iteration, of GMRES are needed to compute p̃_i^(k). This allows the line search algorithm to compute a good initial solution y_i^(0); see also Figure 4.1. Similar comments apply when inexact Newton is used with trust region; see Tables 4.3-4.4 and Figures 4.2-4.3.
Table 4.5 shows the behavior of Algorithm 4.1 when the explicit Euler scheme is used as explained above. In this test GMRES did not converge. The norm ‖G̃_i(y_i^(k) + ŝ_i^(k))‖ stagnated around 10^4. The algorithm did not perform better when explicit Runge-Kutta 4 and Adams-Bashforth were used. With Runge-Kutta, ‖G̃_i(y_i^(k) + ŝ_i^(k))‖ stagnated around 10^24 for i = 1, k = 0, …, 12, and at iteration (i, k) = (1, 13), GMRES failed to compute s̃_i^(k). With Adams-Bashforth, ‖G̃_i(y_i^(k) + ŝ_i^(k))‖ stagnated around 10^4 for i = 1, k = 0, 1 and i = 2, k = 0, …, 5. At iteration (i, k) = (2, 6), GMRES failed to compute s̃_i^(k).
TABLE 4.1
Algorithm 4.1 with line search - Implicit Euler is used in (1.2).

Inexact Newton:
  (i,k)     ‖G̃_i(y_i^(k) + ŝ_i^(k))‖    # iter. GMRES s̃_i^(k)
  (1,0)     6.2976                       4329
  (1,1)     6.2976                       5440
  (10,0)    4.9076×10^-3                 2320
  (20,0)    2.6774×10^-3                 1760
  (30,0)    2.1495×10^-3                 1620
  (40,0)    1.9993×10^-3                 893
  (40,1)    1.9993×10^-3                 1600
  (50,0)    1.6472×10^-3                 536
  (50,1)    1.6472×10^-3                 940
  (90,0)    1.2606×10^-3                 900
  (100,0)   1.3811×10^-3                 201
  (100,1)   1.3811×10^-3                 1000

Line search:
  (i,k)     ‖G̃'_i(u_i^(k)) p̂_i^(k) + G̃_i(u_i^(k))‖ / ‖G̃_i(u_i^(k))‖    # iter. GMRES p̃_i^(k)
  (1,0)     1.8881×10^-4     0
  (1,1)     9.9999×10^-1     1192
  (1,2)     1.6321           557
  (1,3)     2.1002×10^1      2641
  (1,4)     9.8262×10^1      3393
  (10,0)    2.0496×10^-1     910
  (20,0)    2.4655×10^-1     793
  (30,0)    1.4118×10^-1     520
  (90,0)    6.8481×10^-2     130
TABLE 4.2
Algorithm 4.1 with line search - Crank-Nicolson is used in (1.2).

Inexact Newton:
  (i,k)     ‖G̃_i(y_i^(k) + ŝ_i^(k))‖    # iter. GMRES s̃_i^(k)
  (1,0)     6.2969                       2347
  (1,1)     6.2969                       2800
  (10,0)    1.6742×10^-2                 1540
  (20,0)    4.9327×10^-3                 817
  (20,1)    4.9327×10^-3                 1220
  (30,0)    2.8972×10^-3                 676
  (30,1)    2.8972×10^-3                 940
  (50,0)    3.2204×10^-3                 815
  (50,1)    3.2204×10^-3                 1100
  (80,0)    2.8631×10^-3                 686
  (80,1)    2.8631×10^-3                 960
  (100,0)   1.8893×10^-3                 547
  (100,1)   1.8893×10^-3                 700

Line search:
  (i,k)     ‖G̃'_i(u_i^(k)) p̂_i^(k) + G̃_i(u_i^(k))‖ / ‖G̃_i(u_i^(k))‖    # iter. GMRES p̃_i^(k)
  (1,0)     2.5662×10^-4     0
  (1,1)     9.9999×10^-1     642
  (1,2)     1.8962           760
  (1,3)     6.5812           890
  (1,4)     2.7293×10^1      1405
  (1,5)     1.3141×10^2      1837
  (10,0)    9.6104×10^-6     0
  (10,1)    5.2984×10^-1     728
  (20,0)    6.3566×10^-7     0
  (60,0)    8.6954×10^-8     0
  (80,0)    9.3942×10^-8     0
  (90,0)    7.4865×10^-8     0
  (100,0)   6.2467×10^-8     0
TABLE 4.3
Algorithm 4.1 with trust region - Implicit Euler is used in (1.2).

              Inexact Newton & dogleg step                   Inexact Newton & hook step
  (i,k)     ‖G̃_i(y_i^(k)+ŝ_i^(k))‖  # iter. GMRES s̃_i^(k)   ‖G̃_i(y_i^(k)+ŝ_i^(k))‖  # iter. GMRES s̃_i^(k)
  (1,0)     6.2976          1192      6.2976          1192
  (1,1)     6.2976          1057      6.2976          1057
  (1,2)     6.2976          1318      6.2976          1317
  (1,3)     6.2976          2393      6.2976          2393
  (1,4)     6.2976          3157      6.2976          3157
  (1,5)     6.2976          4007      6.2976          4007
  (1,6)     6.2976          5292      6.2976          5292
  (1,7)     6.2976          5440      6.2976          5440
  (10,0)    4.9086×10^-3    910       4.9072×10^-3    1361
  (10,1)    4.9086×10^-3    2320      4.9072×10^-3    2320
  (20,0)    3.3623×10^-3    1022      2.4509×10^-3    762
  (20,1)    3.3623×10^-3    2040      2.4509×10^-3    1640
  (30,0)    2.1081×10^-3    1016      1.9199×10^-3    766
  (30,1)    2.1081×10^-3    1960      1.9199×10^-3    1720
  (40,0)    1.3871×10^-3    435       2.6895×10^-3    1100
  (40,1)    1.3871×10^-3    900       2.6895×10^-3    2080
  (50,0)    1.8442×10^-3    1007      1.4765×10^-3    640
  (50,1)    1.8442×10^-3    2080      1.4765×10^-3    1600
  (60,0)    2.1950×10^-3    481       2.1745×10^-3    836
  (60,1)    2.1950×10^-3    1320      2.1745×10^-3    1560
  (70,0)    2.5777×10^-3    1282      1.9232×10^-3    780
  (70,1)    2.5777×10^-3    2140      1.9232×10^-3    1440
  (80,0)    1.7480×10^-3    633       2.4491×10^-3    769
  (80,1)    1.7480×10^-3    980       2.4491×10^-3    1640
  (90,0)    1.4131×10^-3    433       1.3438×10^-3    502
  (90,1)    1.4131×10^-3    1020      1.3438×10^-3    980
  (100,0)   1.7451×10^-3    791       1.7411×10^-3    381
  (100,1)   1.7451×10^-3    1520      1.7411×10^-3    1060
TABLE 4.4
Algorithm 4.1 with trust region - Crank-Nicolson is used in (1.2).

              Inexact Newton & dogleg step                   Inexact Newton & hook step
  (i,k)     ‖G̃_i(y_i^(k)+ŝ_i^(k))‖  # iter. GMRES s̃_i^(k)   ‖G̃_i(y_i^(k)+ŝ_i^(k))‖  # iter. GMRES s̃_i^(k)
  (1,0)     6.2969          642       6.2969          642
  (1,1)     6.2969          760       6.2969          760
  (1,2)     6.2969          890       6.2969          890
  (1,3)     6.2969          1405      6.2969          1405
  (1,4)     6.2969          1837      6.2969          1837
  (1,5)     6.2969          2347      6.2969          2347
  (1,6)     6.2969          2800      6.2969          2800
  (10,0)    1.6748×10^-2    877       1.6748×10^-2    877
  (10,1)    1.6748×10^-2    1540      1.6748×10^-2    1540
  (20,0)    4.8107×10^-3    839       4.8091×10^-3    839
  (20,1)    4.8107×10^-3    1220      4.8091×10^-3    1220
  (30,0)    3.3681×10^-3    707       3.3973×10^-3    701
  (30,1)    3.3681×10^-3    1020      3.3973×10^-3    1020
  (40,0)    2.9851×10^-3    807       2.6970×10^-3    825
  (40,1)    2.9851×10^-3    1100      2.6970×10^-3    1080
  (50,0)    2.7778×10^-3    846       3.3073×10^-3    832
  (50,1)    2.7778×10^-3    1100      3.3073×10^-3    1120
  (60,0)    3.0731×10^-3    829       2.3479×10^-3    846
  (60,1)    3.0731×10^-3    1060      2.3479×10^-3    1040
  (70,0)    2.4460×10^-3    634       2.3183×10^-3    613
  (70,1)    2.4460×10^-3    820       2.3183×10^-3    760
  (80,0)    3.1872×10^-3    552       2.6765×10^-3    482
  (80,1)    3.1872×10^-3    760       2.6765×10^-3    660
  (90,0)    2.7831×10^-3    667       1.8703×10^-3    471
  (90,1)    2.7831×10^-3    940       1.8703×10^-3    600
  (100,0)   2.6554×10^-3    559       2.2800×10^-3    262
  (100,1)   2.6554×10^-3    780       2.2800×10^-3    720
TABLE 4.5
Explicit Euler with line search - Crank-Nicolson is used in (1.2).

Inexact Newton:
  (i,k)   ‖G̃_i(y_i^(k) + ŝ_i^(k))‖   # iter. GMRES s̃_i^(k)
  (1,0)   3.1434×10^4                1956
  (1,1)   3.1434×10^4                2140

Line search:
  (i,k)   ‖G̃'_i(u_i^(k)) p̂_i^(k) + G̃_i(u_i^(k))‖ / ‖G̃_i(u_i^(k))‖   # iter. GMRES p̃_i^(k)
  (1,0)   1               10
  (1,1)   1.0511×10^2     80
  (1,2)   1.0465×10^4     524
  (1,3)   1.0465×10^4     1076
  (1,4)   1.2660×10^6     1425
  (2,0)   1               10
  (2,1)   1.0621×10^2     424
  (2,2)   7.2056×10^3     13508
  (2,3)   3.5319×10^4     *

∗ The number of matrix-vector multiplies in GMRES largely exceeds 10^4.
5. Conclusion. The purpose of this paper was to show one possibility for improving convergence of the linear and nonlinear systems that arise when solving large nonlinear systems of ODEs by implicit schemes.
The nonlinear systems are solved by the inexact Newton (IN) method and the linear systems that arise in IN are solved by GMRES. The convergence of both IN and GMRES can be greatly improved if good initial solutions are available. To this end, we have developed a strategy that allows the extraction of good initial solutions for IN and for the linear systems in IN. The strategy uses, at iteration i of the implicit scheme, a subspace V_i of small dimension that contains information on the last r iterates, the line search (LS) or trust region (TR) algorithm, and the Petrov-Galerkin process.
The efficiency of LS depends mainly on the quality of the descent directions. The Newton direction (3.7) is known to be a good descent direction but necessitates the solution of linear systems. The fact that the proposed strategy allows the construction of a good initial solution p̂_i^(k) to (3.7) facilitates the task of GMRES. The resulting method is cheap and efficient. The TR algorithm uses a large quadratic model, which, after projection onto V_i, leads to several small linear systems. This algorithm is also efficient but can be more expensive than LS if the latter uses a good descent direction.
Numerical tests with the approach where the initial solutions (to (1.3) and to LS) are obtained with explicit methods have been carried out. This approach led to a stagnation of GMRES at early iterations of the implicit scheme.
REFERENCES
[1] M. AL SAYED ALI AND M. SADKANE, Acceleration of implicit schemes for solving large linear systems of ordinary differential equations, in preparation.
[2] P. N. BROWN, A local convergence theory for combined inexact-Newton/finite-difference projection methods, SIAM J. Numer. Anal., 24 (1987), pp. 407–434.
[3] P. N. BROWN AND A. C. HINDMARSH, Matrix-free methods for stiff systems of ODE's, SIAM J. Numer. Anal., 23 (1986), pp. 610–638.
[4] ———, Reduced storage matrix methods in stiff ODE systems, Appl. Math. Comput., 31 (1989), pp. 40–91.
[5] P. N. BROWN AND Y. SAAD, Hybrid Krylov methods for nonlinear systems of equations, SIAM J. Sci. Statist. Comput., 11 (1990), pp. 450–481.
[6] ———, Convergence theory of nonlinear Newton-Krylov algorithms, SIAM J. Optim., 4 (1994), pp. 297–330.
[7] M. BÜTTNER, B. A. SCHMITT, AND R. WEINER, W-methods with automatic partitioning by Krylov techniques for large stiff systems, SIAM J. Numer. Anal., 32 (1995), pp. 260–284.
[8] P. J. DAVIS, Interpolation and Approximation, Blaisdell, New York, 1963.
[9] R. S. DEMBO, S. C. EISENSTAT, AND T. STEIHAUG, Inexact Newton methods, SIAM J. Numer. Anal., 19 (1982), pp. 400–408.
[10] J. E. DENNIS AND R. B. SCHNABEL, Numerical Methods for Unconstrained Optimization and Nonlinear Equations, Prentice-Hall, Englewood Cliffs, NJ, 1983.
[11] S. C. EISENSTAT AND H. F. WALKER, Globally convergent inexact Newton methods, SIAM J. Optim., 4 (1994), pp. 393–422.
[12] E. GALLOPOULOS AND Y. SAAD, Efficient solution of parabolic equations by Krylov approximation methods, SIAM J. Sci. Statist. Comput., 13 (1992), pp. 1236–1264.
[13] C. W. GEAR AND Y. SAAD, Iterative solution of linear equations in ODE codes, SIAM J. Sci. Statist. Comput., 4 (1983), pp. 583–601.
[14] G. H. GOLUB AND C. F. VAN LOAN, Matrix Computations, Second ed., The Johns Hopkins University Press, Baltimore, MD, 1989.
[15] E. HAIRER, S. P. NØRSETT, AND G. WANNER, Solving Ordinary Differential Equations I. Nonstiff Problems, Second ed., Springer, Berlin, 1993.
[16] E. HAIRER AND G. WANNER, Solving Ordinary Differential Equations II. Stiff and Differential-Algebraic Problems, Second ed., Springer, Berlin, 1996.
[17] M. HOCHBRUCK, C. LUBICH, AND H. SELHOFER, Exponential integrators for large systems of differential equations, SIAM J. Sci. Comput., 19 (1998), pp. 1552–1574.
[18] E. ISAACSON AND H. B. KELLER, Analysis of Numerical Methods, Wiley, New York, 1966.
[19] L. O. JAY, Inexact simplified Newton iterations for implicit Runge-Kutta methods, SIAM J. Numer. Anal., 38 (2001), pp. 1369–1388.
[20] Y. SAAD, Iterative Methods for Sparse Linear Systems, Second ed., SIAM, Philadelphia, PA, 2003.
[21] Y. SAAD AND M. H. SCHULTZ, GMRES: A generalized minimal residual algorithm for solving nonsymmetric linear systems, SIAM J. Sci. Statist. Comput., 7 (1986), pp. 856–869.