CRM-3227 - Centre de recherches mathématiques

Variable-step variable-order 4-stage
Hermite–Birkhoff–Obrechkoff ODE Solver
of order 5 to 14∗
Truong Nguyen-Ba†‡
Steven J. Desjardins†¶
Hemza Yagoub†§
Rémi Vaillancourt†‖
CRM-3227
September 2006
∗
This work was supported in part by the Natural Sciences and Engineering Research Council of Canada and the
Centre de recherches mathématiques of the Université de Montréal.
†
Department of Mathematics and Statistics, University of Ottawa, Ottawa, Ontario, Canada K1N 6N5
‡
tnguyen@mathstat.uottawa.ca
§
hy.emails@gmail.com
¶
desjards@mathstat.uottawa.ca
‖
remi@uottawa.ca
Abstract
Self-starting variable-step variable-order 4-stage Hermite–Birkhoff–Obrechkoff methods
of order 5 to 14, denoted by HBO(5-14)4, are constructed for solving nonstiff systems of
first-order differential equations of the form $y' = f(x, y)$, $y(x_0) = y_0$. The methods use $y'$
and $y''$ as in Obrechkoff’s methods. Forcing a Taylor expansion of the numerical solution
to agree with an expansion of the true solution leads to multistep- and Runge–Kutta-type
order conditions which are reorganized into Vandermonde-type linear systems. Fast
algorithms are developed for solving these systems to obtain Hermite–Birkhoff
interpolation polynomials in terms of generalized Lagrange basis functions. The new
methods have larger regions of absolute stability than 3-stage Hermite–Birkhoff–Obrechkoff
methods of comparable orders studied earlier and Adams–Bashforth–Moulton methods
of comparable orders in PECE mode. The stability regions of the HBO methods have a
remarkably good shape. The order and stepsize of these methods are controlled by four
local error estimators. HBO(5-14)4 is superior to Matlab’s ode113 in solving several
problems often used to test higher-order ODE solvers on the basis of the number of steps,
CPU time, and maximum global error. When programmed in C++, HBO(p)4 uses less
CPU time than Dormand–Prince DP(8,7)13M in solving costly problems at stringent
tolerance.
Mathematical Subject Classification. Primary 65L06; Secondary 65D05, 65D30
Keywords and phrases. general linear method for higher-order ODE’s, Hermite–Birkhoff
method, Obrechkoff method, Vandermonde-type systems, maximum global error, number of function evaluations, CPU time, Matlab’s ode113, comparing ODE solvers.
Résumé
Self-starting variable-step variable-order 4-stage Hermite–Birkhoff–Obrechkoff methods of
order 5 to 14, denoted by HBO(5-14)4, are constructed to solve nonstiff systems of
first-order differential equations of the form $y' = f(x, y)$, $y(x_0) = y_0$. The
methods use $y'$ and $y''$ as in Obrechkoff’s methods. Matching the truncated Taylor
expansions of the exact solution and of the numerical solution yields multistep- and
Runge–Kutta-type order conditions, which are reorganized into Vandermonde linear
systems. These systems are solved in $O(p^2)$ operations by new fast algorithms that
produce Hermite–Birkhoff interpolation polynomials on a basis of generalized Lagrange
functions. At comparable order, the regions of absolute stability of HBO(5-14)4, which
have a remarkable shape, are larger than those of HBO(5-15)3 and of
Adams–Bashforth–Moulton methods in PECE mode. The order and the stepsize are controlled
by means of 4 local error estimators. HBO(5-14)4 is superior to Matlab’s ode113 in
solving several problems often used to test high-order solvers, on the basis of the
number of steps, CPU time, and maximum global error. In C++, HBO(p)4 is faster than
Dormand–Prince DP(8,7)13M in solving costly problems at stringent tolerance.
1 Introduction
Variable-order explicit multistep Obrechkoff methods [23] and a 4-stage Runge–Kutta method of
order 4 are cast into variable-step, variable-order (VSVO) 4-stage Hermite–Birkhoff–Obrechkoff
methods of order 5 to 14, named HBO(5-14)4. The name was chosen because the method uses
Hermite–Birkhoff interpolation polynomials and, like Obrechkoff methods, uses $y'$ and $y''$ at
step points for solving $y' = f(x, y)$. The link between the two types of methods is that values
at off-step points are obtained by means of predictors which use values at previous points. A
special one-step HBO(5)4S of order 5 is used to make HBO(5-14)4 self-starting.
Milne [18] was perhaps the first to advocate the use of multiderivative, multistep Obrechkoff
formulae for the numerical solution of differential equations. More recently, Huang and Innanen
[14] introduced a new form of the classical Adams–Cowell methods and multiderivative, multistep
methods, some of which have larger stability intervals and smaller local truncation errors than
classical multistep methods.
Scientific computation widely uses VSVO Adams–Bashforth–Moulton multistep methods of orders 1 to 14 as implemented by Gear [10] and [11], Krogh [16], and up to order 13, by Shampine [26]
in Matlab’s ode113 ([2], [27]). The codes DVDQ of Krogh [15] and DIFSUB of Gear [11] prompted
the recognition of the effectiveness of a VSVO formulation of Adams methods. When the equation
is expensive to evaluate, high-order solvers appear to be more efficient than lower-order ones.
The new HBO(5-14)4 are designed for solving nonstiff systems of first-order initial value problems
of the form
\[ y' = f(x, y), \qquad y(x_0) = y_0, \qquad \text{where } {}' = \frac{d}{dx}, \tag{1} \]
where the derivative $y''$ can be obtained analytically or recursively. There are many such problems,
for instance in dynamical systems [3], [25].
Forcing a Taylor expansion of the numerical solution to agree with an expansion of the true
solution leads to a combination of multistep- and Runge–Kutta-type order conditions which are
reorganized into linear Vandermonde-type systems. The solutions of these systems are obtained as
generalized Lagrange basis functions by new fast algorithms.
HBO(5-14)4 have smaller error constants and larger scaled intervals of absolute stability than
Adams–Bashforth–Moulton methods of comparable orders in PECE mode for p > 7. They also have
larger intervals of absolute stability than 3-stage Hermite–Birkhoff–Obrechkoff methods,
HBO(4-14)3, of comparable orders studied earlier [22].
The performance of HBO(5-14)4, programmed in Matlab, and Matlab’s ode113 was compared on
several problems often used to test higher-order ODE solvers. It was seen that HBO(5-14)4 requires
fewer steps, uses less CPU time, and has higher accuracy than ode113. The performance of
HBO(5-14)4 and DP(8,7)13M [24], both programmed in C++, was also compared on these problems and it
was seen that HBO(5-14)4 uses less CPU time in solving costly equations at stringent tolerance.
Section 2 introduces general VSVO HBO(5-14)4 of order 5 to 14. Order conditions are listed
in Section 3. In Section 4, this general HBO(5-14)4 is represented in terms of Vandermonde-type
systems. In Section 5, new symbolic algorithms are derived to diagonalize the coefficient matrices
of these systems as functions of the parameters of the systems. Section 6 defines a particular VSVO
HBO(5-14)4. In Section 7, a family of particular VSVO HBO(5-14)4 is constructed by a fast solution
of the systems. Section 8 considers the regions of absolute stability of constant step HBO(5-14)4
and their principal local truncation coefficients. Section 9 deals with the step and order control. In
Section 10, three criteria are used to compare the performance of the methods considered in this
paper. Appendix A lists the algorithms and Appendix B describes the Matlab programming. The
one-step HBO(5)4S is described in Appendix C.
2 General VSVO HBO(5-14)4
A Hermite–Birkhoff–Obrechkoff (HBO) method is said to be a general VSVO HBO method if its
backstep and off-step points are variable parameters. If the off-step points are fixed, the method is
said to be a particular VSVO method.
A general HBO(p)4 of order p requires three predictors, P2, P3 and P4, an integration formula,
IF, and a step control predictor, P5, to perform the integration step from $x_n$ to $x_{n+1}$. For notational
simplicity, $c_1 = 0$ is used in the summations. The “floor” of a real number $q$, denoted by $\lfloor q \rfloor$, is the
largest integer less than or equal to $q$.
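(P2) A Hermite–Birkhoff polynomial of degree $(p-2)$ is used as predictor P2 to obtain $y_{n+c_2}$ to
order $(p-2)$,
\[ y_{n+c_2} = y_n + h_{n+1}\Bigl( a_{21} f_{n+c_1} + \sum_{j=1}^{\lfloor (p-3)/2 \rfloor} \beta_{2j} f_{n-j} \Bigr) + h_{n+1}^2 \Bigl( \sum_{j=0}^{\lfloor (p-4)/2 \rfloor} \gamma_{2j} f'_{n-j} \Bigr). \tag{2} \]
(P3) A Hermite–Birkhoff polynomial of degree $(p-1)$ is used as predictor P3 to obtain $y_{n+c_3}$ to
order $(p-2)$,
\[ y_{n+c_3} = y_n + h_{n+1}\Bigl( \sum_{j=1}^{2} a_{3j} f_{n+c_j} + \sum_{j=1}^{\lfloor (p-3)/2 \rfloor} \beta_{3j} f_{n-j} \Bigr) + h_{n+1}^2 \Bigl( \sum_{j=0}^{\lfloor (p-4)/2 \rfloor} \gamma_{3j} f'_{n-j} \Bigr). \tag{3} \]
(P4) A Hermite–Birkhoff polynomial of degree $p$ is used as predictor P4 to obtain $y_{n+c_4}$ to order
$(p-2)$,
\[ y_{n+c_4} = y_n + h_{n+1}\Bigl( \sum_{j=1}^{3} a_{4j} f_{n+c_j} + \sum_{j=1}^{\lfloor (p-3)/2 \rfloor} \beta_{4j} f_{n-j} \Bigr) + h_{n+1}^2 \Bigl( \sum_{j=0}^{\lfloor (p-4)/2 \rfloor} \gamma_{4j} f'_{n-j} \Bigr). \tag{4} \]
(IF) A Hermite–Birkhoff polynomial of degree $(p+1)$ is used as integration formula IF to obtain
$y_{n+1}$ to order $p$,
\[ y_{n+1} = y_n + h_{n+1}\Bigl( \sum_{j=1}^{4} b_{1j} f_{n+c_j} + \sum_{j=1}^{\lfloor (p-3)/2 \rfloor} \beta_{1j} f_{n-j} \Bigr) + h_{n+1}^2 \Bigl( \sum_{j=0}^{\lfloor (p-4)/2 \rfloor} \gamma_{1j} f'_{n-j} \Bigr). \tag{5} \]
(P5) A Hermite–Birkhoff polynomial of degree $(p+1)$ is used as step control predictor P5 to
obtain $\tilde{y}_{n+1}$ to order $(p-2)$,
\[ \tilde{y}_{n+1} = y_n + h_{n+1}\Bigl( \sum_{j=1}^{3} a_{5j} f_{n+c_j} + a_{54} f_{n+1} + \sum_{j=1}^{\lfloor (p-3)/2 \rfloor} \beta_{5j} f_{n-j} \Bigr) + h_{n+1}^2 \Bigl( \sum_{j=0}^{\lfloor (p-4)/2 \rfloor} \gamma_{5j} f'_{n-j} \Bigr). \tag{6} \]
We note that $f'_{n+1}$ is computed only once at $x_{n+1}$.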
3 Order conditions for general HBO(p)4
As in similar searches for ODE solvers [7], [19], [20], [21], we impose the following simplifying
assumptions on HBO(p)4:
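\[ \sum_{j=1}^{i-1} a_{ij} c_j^k + k!\, B_i(k+1) = \frac{1}{k+1}\, c_i^{k+1}, \qquad \begin{cases} i = 2, 3, 4, \\ k = 0, 1, 2, \ldots, p-3, \end{cases} \tag{7} \]
where
\[ B_i(j) = \sum_{\ell=1}^{\lfloor (p-3)/2 \rfloor} \beta_{i\ell} \left[ \frac{\eta_{\ell+1}^{j-1}}{(j-1)!} \right] + \sum_{\ell=1}^{\lfloor (p-4)/2 \rfloor} \gamma_{i\ell} \left[ \frac{\eta_{\ell+1}^{j-2}}{(j-2)!} \right], \qquad \begin{cases} i = 2, 3, 4, \\ j = 0, 1, 2, \ldots, p, \end{cases} \tag{8} \]
and
\[ \eta_j = -\frac{1}{h_{n+1}} (x_n - x_{n+1-j}) = -\frac{1}{h_{n+1}} \sum_{i=0}^{j-2} h_{n-i}, \qquad j = 2, 3, \ldots, 6. \tag{9} \]
Equation (9) will be frequently used in this paper without further reference.
There remain four sets of equations to be solved:
\[ \sum_{i=1}^{4} b_{1i} c_i^k + k!\, B_1(k+1) = \frac{1}{k+1}, \qquad k = 0, 1, \ldots, p, \tag{10} \]
\[ \sum_{i=2}^{4} b_{1i} \left[ \sum_{j=1}^{i-1} a_{ij} \frac{c_j^{p-2}}{(p-2)!} + B_i(p-1) \right] + B_1(p) = \frac{1}{p!}, \tag{11} \]
\[ \sum_{i=2}^{4} b_{1i} \frac{c_i}{p} \left[ \sum_{j=1}^{i-1} a_{ij} \frac{c_j^{p-2}}{(p-2)!} + B_i(p-1) \right] + B_1(p+1) = \frac{1}{(p+1)!}, \tag{12} \]
\[ \sum_{i=2}^{4} b_{1i} \left[ \sum_{j=1}^{i-1} a_{ij} \left( \sum_{k=1}^{j-1} a_{jk} \frac{c_k^{p-2}}{(p-2)!} + B_j(p-1) \right) + B_i(p) \right] + B_1(p+1) = \frac{1}{(p+1)!}, \tag{13} \]
where
\[ B_1(j) = \sum_{i=1}^{\lfloor (p-3)/2 \rfloor} \beta_{1i} \frac{\eta_{i+1}^{j-1}}{(j-1)!} + \sum_{i=1}^{\lfloor (p-4)/2 \rfloor} \gamma_{1i} \frac{\eta_{i+1}^{j-2}}{(j-2)!}, \qquad j = 1, 2, \ldots, p+1. \tag{14} \]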
We note that equations (10), for k = 0, 1, . . . , p − 2, are multistep-type order conditions. On
the other hand, equation (10) for k = p − 1, p, and equations (11) to (13) are Runge–Kutta-type
order conditions. The numbers B1 (k), B2 (k), B3 (k) and B4 (k) are associated with IF, P2 , P3 and
P4 , respectively.
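As a small illustration of definition (9), the following sketch computes the scaled backstep nodes $\eta_j$ from a step-size history. The function name `eta` and the list layout `past_steps[i] = h_{n-i}` are assumptions made for this example only; they are not taken from the Matlab or C++ codes described in the paper.

```python
def eta(h_next, past_steps, j_max=6):
    """Return {j: eta_j} for j = 2..j_max, following eq. (9).

    h_next        -- the new step size h_{n+1}
    past_steps[i] -- h_{n-i} = x_{n-i} - x_{n-i-1} (illustrative layout)
    """
    out = {}
    acc = 0.0
    for j in range(2, j_max + 1):
        acc += past_steps[j - 2]   # running sum h_n + ... + h_{n-(j-2)}
        out[j] = -acc / h_next     # eta_j = -(x_n - x_{n+1-j}) / h_{n+1}
    return out


# With a constant step size the nodes are the negative integers -(j - 1).
constant = eta(0.5, [0.5] * 5)
# A variable-step history: h_{n+1} = 0.2, h_n = 0.1, h_{n-1} = 0.3.
variable = eta(0.2, [0.1, 0.3], j_max=3)
```

For a constant step size, $\eta_j = -(j-1)$, which is the situation used in Section 8 for the stability analysis.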
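4 Vandermonde-type formulation of general HBO(p)4
4.1 Integration formula IF
The $(p+1)$-vector of the reordered coefficients of integration formula IF in (5),
\[ u^1 = [b_{11}, \gamma_{10}, b_{14}, b_{13}, b_{12}, \beta_{11}, \gamma_{11}, \beta_{12}, \gamma_{12}, \beta_{13}, \gamma_{13}, \ldots, \beta_{1,\lfloor (p-3)/2 \rfloor}, \gamma_{1,\lfloor (p-4)/2 \rfloor}]^T, \]
is the solution of the Vandermonde-type system of order conditions
\[ M^1 u^1 = r^1, \tag{15} \]
where
\[ M^1 = \begin{bmatrix}
1 & 0 & 1 & 1 & 1 & 1 & 0 & \cdots & 1 & 0 \\
0 & 1 & c_4 & c_3 & c_2 & \eta_2 & 1 & \cdots & \eta_{\lfloor (p-3)/2 \rfloor+1} & 1 \\
0 & 0 & \dfrac{c_4^2}{2!} & \dfrac{c_3^2}{2!} & \dfrac{c_2^2}{2!} & \dfrac{\eta_2^2}{2!} & \eta_2 & \cdots & \dfrac{\eta_{\lfloor (p-3)/2 \rfloor+1}^2}{2!} & \eta_{\lfloor (p-4)/2 \rfloor+1} \\
\vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & & \vdots & \vdots \\
0 & 0 & \dfrac{c_4^{p}}{p!} & \dfrac{c_3^{p}}{p!} & \dfrac{c_2^{p}}{p!} & \dfrac{\eta_2^{p}}{p!} & \dfrac{\eta_2^{p-1}}{(p-1)!} & \cdots & \dfrac{\eta_{\lfloor (p-3)/2 \rfloor+1}^{p}}{p!} & \dfrac{\eta_{\lfloor (p-4)/2 \rfloor+1}^{p-1}}{(p-1)!}
\end{bmatrix} \tag{16} \]
and $r^1 = r_1(1 : p+1)$ has components
\[ r_1(i) = 1/i!, \qquad i = 1, 2, \ldots, p+1. \]
The leading error term of IF is
\[ \left[ b_{14} \frac{c_4^{p+1}}{(p+1)!} + b_{13} \frac{c_3^{p+1}}{(p+1)!} + b_{12} \frac{c_2^{p+1}}{(p+1)!} + \sum_{j=1}^{\lfloor (p-2)/2 \rfloor} \beta_{1j} \frac{\eta_{j+1}^{p+1}}{(p+1)!} + \sum_{j=1}^{\lfloor (p-3)/2 \rfloor} \gamma_{1j} \frac{\eta_{j+1}^{p}}{p!} - \frac{1}{(p+2)!} \right] h_{n+1}^{p+2}\, y_n^{(p+2)}. \]
The detailed structure of columns 6 to $p+1$ of $M^1 \in \mathbb{R}^{(p+1)\times(p+1)}$ is as follows:
\[ \begin{cases}
M^1(i, j) = \eta_{\lfloor (j-1)/2 \rfloor}^{\,i-1} / (i-1)!, \\[4pt]
M^1(i, j+1) = \dfrac{d}{d\eta_{\lfloor (j-1)/2 \rfloor}}\, M^1(i, j),
\end{cases} \qquad \begin{array}{l} i = 1, 2, \ldots, p+1, \\ j = 6, 8, \ldots, 2\lfloor (p+1)/2 \rfloor, \end{array} \]
provided $j + 1 \le p + 1$ in the second equation.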
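4.2 Predictor P2
The $(p-2)$-vector of the reordered coefficients of predictor P2 in (2),
\[ u^2 = [a_{21}, \gamma_{20}, \beta_{21}, \gamma_{21}, \beta_{22}, \gamma_{22}, \beta_{23}, \gamma_{23}, \ldots, \beta_{2,\lfloor (p-3)/2 \rfloor}, \gamma_{2,\lfloor (p-4)/2 \rfloor}]^T, \]
is the solution of the system of order conditions
\[ M^2 u^2 = r^2, \tag{17} \]
where
\[ M^2 = \begin{bmatrix}
1 & 0 & 1 & 0 & \cdots & 1 & 0 \\
0 & 1 & \eta_2 & 1 & \cdots & \eta_{\lfloor (p-3)/2 \rfloor+1} & 1 \\
0 & 0 & \dfrac{\eta_2^2}{2!} & \eta_2 & \cdots & \dfrac{\eta_{\lfloor (p-3)/2 \rfloor+1}^2}{2!} & \eta_{\lfloor (p-4)/2 \rfloor+1} \\
\vdots & \vdots & \vdots & \vdots & & \vdots & \vdots \\
0 & 0 & \dfrac{\eta_2^{p-3}}{(p-3)!} & \dfrac{\eta_2^{p-4}}{(p-4)!} & \cdots & \dfrac{\eta_{\lfloor (p-3)/2 \rfloor+1}^{p-3}}{(p-3)!} & \dfrac{\eta_{\lfloor (p-4)/2 \rfloor+1}^{p-4}}{(p-4)!}
\end{bmatrix} \]
and $r^2 = r_2(1 : p-2)$ has components
\[ r_2(i) = c_2^i / i!, \qquad i = 1, 2, \ldots, p-2. \]
The detailed structure of columns 3 to $p-2$ of $M^2 \in \mathbb{R}^{(p-2)\times(p-2)}$ is as follows:
\[ \begin{cases}
M^2(i, j) = \eta_{\lfloor j/2+1 \rfloor}^{\,i-1} / (i-1)!, \\[4pt]
M^2(i, j+1) = \dfrac{d}{d\eta_{\lfloor j/2+1 \rfloor}}\, M^2(i, j),
\end{cases} \qquad \begin{array}{l} i = 1, 2, \ldots, p-2, \\ j = 3, 5, \ldots, 2\lfloor (p-3)/2 \rfloor + 1, \end{array} \]
provided $j + 1 \le p - 2$ in the second equation.
A truncated Taylor expansion of the right-hand side of (2) about $x_n$ gives
\[ \sum_{j=0}^{p+1} S_2(j)\, h_{n+1}^j\, y_n^{(j)} \tag{18} \]
with coefficients
\[ S_2(j) = M^2(j, 1 : p-2)\, u^2 = r_2(j) = \frac{c_2^j}{j!}, \qquad j = 1, 2, \ldots, p-2, \]
\[ S_2(j) = \sum_{i=1}^{\lfloor (p-3)/2 \rfloor} \beta_{2i} \frac{\eta_{i+1}^{j-1}}{(j-1)!} + \sum_{i=0}^{\lfloor (p-4)/2 \rfloor} \gamma_{2i} \frac{\eta_{i+1}^{j-2}}{(j-2)!}, \qquad j = p-1, p, p+1. \]
We note that P2 is of order $(p-2)$ since it satisfies the order conditions
\[ S_2(j) = c_2^j / j!, \qquad j = 1, 2, \ldots, p-2, \]
and its leading error term is
\[ \left[ S_2(p-1) - \frac{c_2^{p-1}}{(p-1)!} \right] h_{n+1}^{p-1}\, y_n^{(p-1)}. \]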
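The structure of the P2 system can be checked numerically. The sketch below assembles $M^2$ and $r^2$ column by column as described above and solves the system with a dense solver, not with the fast algorithms of Sections 5 and 7. It then verifies that the resulting predictor reproduces $y(c_2)$ for the monomials $y = x^d$, $d = 1, \ldots, p-2$, taking $x_n = 0$ and $h_{n+1} = 1$. The names `p2_system` and `predictor_error`, the choice $p = 8$ and the constant-step nodes are assumptions made for this example.

```python
import math
import numpy as np


def p2_system(p, c2, eta_nodes):
    """Assemble M^2 and r^2 of system (17); eta_nodes = [eta_2, eta_3, ...]."""
    m = p - 2
    nb, ng = (p - 3) // 2, (p - 4) // 2
    # Each column is (node, shift): shift 0 for an f-value, 1 for an f'-value.
    cols = [(0.0, 0), (0.0, 1)]                 # a_21 (f_n) and gamma_20 (f'_n)
    for k in range(1, nb + 1):
        cols.append((eta_nodes[k - 1], 0))      # beta_{2k} at eta_{k+1}
        if k <= ng:
            cols.append((eta_nodes[k - 1], 1))  # gamma_{2k} at eta_{k+1}
    M = np.zeros((m, m))
    for j, (node, shift) in enumerate(cols):
        for i in range(1, m + 1):
            e = i - 1 - shift
            if e >= 0:
                M[i - 1, j] = node ** e / math.factorial(e)
    r = np.array([c2 ** i / math.factorial(i) for i in range(1, m + 1)])
    return M, r, cols


def predictor_error(c2, u, cols, deg):
    """|P2 applied to y = x**deg  minus  y(c2)| for x_n = 0, h_{n+1} = 1."""
    total = 0.0
    for (node, shift), coef in zip(cols, u):
        order = 1 + shift                       # f = y', f' = y''
        e = deg - order
        if e >= 0:
            total += coef * math.factorial(deg) / math.factorial(e) * node ** e
    return abs(total - c2 ** deg)


p, c2 = 8, 1.0 / 3.0
M, r, cols = p2_system(p, c2, eta_nodes=[-1.0, -2.0])   # constant step size
u = np.linalg.solve(M, r)                                # dense solve, for checking only
worst = max(predictor_error(c2, u, cols, d) for d in range(1, p - 1))
```

The quantity `worst` should be at rounding level, which is the statement that P2 satisfies $S_2(j) = c_2^j/j!$ for $j = 1, \ldots, p-2$.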
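4.3 Predictor P3
The $(p-1)$-vector of the reordered coefficients of predictor P3 in (3),
\[ u^3 = [a_{31}, \gamma_{30}, a_{32}, \beta_{31}, \gamma_{31}, \beta_{32}, \gamma_{32}, \beta_{33}, \gamma_{33}, \ldots, \beta_{3,\lfloor (p-3)/2 \rfloor}, \gamma_{3,\lfloor (p-4)/2 \rfloor}]^T, \]
is the solution of the system of order conditions
\[ M^3 u^3 = r^3, \tag{19} \]
where
\[ M^3 = \begin{bmatrix}
1 & 0 & 1 & 1 & 0 & \cdots & 1 & 0 \\
0 & 1 & c_2 & \eta_2 & 1 & \cdots & \eta_{\lfloor (p-3)/2 \rfloor+1} & 1 \\
0 & 0 & \dfrac{c_2^2}{2!} & \dfrac{\eta_2^2}{2!} & \eta_2 & \cdots & \dfrac{\eta_{\lfloor (p-3)/2 \rfloor+1}^2}{2!} & \eta_{\lfloor (p-4)/2 \rfloor+1} \\
\vdots & \vdots & \vdots & \vdots & \vdots & & \vdots & \vdots \\
0 & 0 & \dfrac{c_2^{p-2}}{(p-2)!} & \dfrac{\eta_2^{p-2}}{(p-2)!} & \dfrac{\eta_2^{p-3}}{(p-3)!} & \cdots & \dfrac{\eta_{\lfloor (p-3)/2 \rfloor+1}^{p-2}}{(p-2)!} & \dfrac{\eta_{\lfloor (p-4)/2 \rfloor+1}^{p-3}}{(p-3)!}
\end{bmatrix}. \]
The first $(p-2)$ components of $r^3 = r_3(1 : p-1)$ are
\[ r_3(i) = c_3^i / i!, \qquad i = 1, 2, \ldots, p-2, \]
and the $(p-1)$-st component is
\[ r_3(p-1) = \frac{1}{(1 - c_3)\, b_3} \left[ \frac{1}{(p+1)!} - \bigl( (1 - c_2)\, b_2\, S_2(p-1) + B_1(p) - p\, B_1(p+1) \bigr) \right], \]
which corresponds to the Runge–Kutta order condition (11) minus order condition (12).
The detailed structure of columns 4 to $p-1$ of $M^3 \in \mathbb{R}^{(p-1)\times(p-1)}$ is as follows:
\[ \begin{cases}
M^3(i, j) = \eta_{\lfloor j/2 \rfloor}^{\,i-1} / (i-1)!, \\[4pt]
M^3(i, j+1) = \dfrac{d}{d\eta_{\lfloor j/2 \rfloor}}\, M^3(i, j),
\end{cases} \qquad \begin{array}{l} i = 1, 2, \ldots, p-1, \\ j = 4, 6, \ldots, 2\lfloor (p-1)/2 \rfloor, \end{array} \]
provided $j + 1 \le p - 1$ in the second equation.
A truncated Taylor expansion of the right-hand side of (3) about $x_n$ gives
\[ \sum_{j=0}^{p+1} S_3(j)\, h_{n+1}^j\, y_n^{(j)} \tag{20} \]
with coefficients
\[ S_3(j) = M^3(j, 1 : p-1)\, u^3 = r_3(j) = \frac{c_3^j}{j!}, \qquad j = 1, 2, \ldots, p-2, \]
\[ S_3(j) = a_{32}\, S_2(j-1) + \sum_{i=1}^{\lfloor (p-3)/2 \rfloor} \beta_{3i} \frac{\eta_{i+1}^{j-1}}{(j-1)!} + \sum_{i=1}^{\lfloor (p-4)/2 \rfloor} \gamma_{3i} \frac{\eta_{i+1}^{j-2}}{(j-2)!}, \qquad j = p-1, p, p+1. \]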
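4.4 Predictor P4
The $p$-vector of the reordered coefficients of predictor P4 in (4),
\[ u^4 = [a_{41}, \gamma_{40}, \beta_{41}, \gamma_{41}, \beta_{42}, \gamma_{42}, \beta_{43}, \gamma_{43}, \ldots, \beta_{4,\lfloor (p-3)/2 \rfloor}, \gamma_{4,\lfloor (p-4)/2 \rfloor}, a_{43}, a_{42}]^T, \]
is the solution of the system of order conditions
\[ M^4 u^4 = r^4, \tag{21} \]
where
\[ M^4 = \begin{bmatrix}
1 & 0 & 1 & 0 & \cdots & 1 & 0 & 1 & 1 \\
0 & 1 & \eta_2 & 1 & \cdots & \eta_{\lfloor (p-3)/2 \rfloor+1} & 1 & c_3 & c_2 \\
0 & 0 & \dfrac{\eta_2^2}{2!} & \eta_2 & \cdots & \dfrac{\eta_{\lfloor (p-3)/2 \rfloor+1}^2}{2!} & \eta_{\lfloor (p-4)/2 \rfloor+1} & \dfrac{c_3^2}{2!} & \dfrac{c_2^2}{2!} \\
\vdots & \vdots & \vdots & \vdots & & \vdots & \vdots & \vdots & \vdots \\
0 & 0 & \dfrac{\eta_2^{p-2}}{(p-2)!} & \dfrac{\eta_2^{p-3}}{(p-3)!} & \cdots & \dfrac{\eta_{\lfloor (p-3)/2 \rfloor+1}^{p-2}}{(p-2)!} & \dfrac{\eta_{\lfloor (p-4)/2 \rfloor+1}^{p-3}}{(p-3)!} & \dfrac{c_3^{p-2}}{(p-2)!} & \dfrac{c_2^{p-2}}{(p-2)!} \\
0 & 0 & \dfrac{\eta_2^{p-1}}{(p-1)!} & \dfrac{\eta_2^{p-2}}{(p-2)!} & \cdots & \dfrac{\eta_{\lfloor (p-3)/2 \rfloor+1}^{p-1}}{(p-1)!} & \dfrac{\eta_{\lfloor (p-4)/2 \rfloor+1}^{p-2}}{(p-2)!} & S_3(p-1) & S_2(p-1)
\end{bmatrix}. \tag{22} \]
The first $(p-2)$ components of $r^4 = r_4(1 : p)$ are
\[ r_4(i) = c_4^i / i!, \qquad i = 1, 2, \ldots, p-2, \]
and the $(p-1)$st and $p$th components are
\[ r_4(p-1) = \frac{1}{b_{14}} \left[ \frac{1}{p!} - b_{12}\, S_2(p-1) - b_{13}\, S_3(p-1) - B_1(p) \right], \]
\[ r_4(p) = \frac{1}{b_{14}} \left[ \frac{1}{(p+1)!} - b_{12}\, S_2(p) - b_{13}\, S_3(p) - B_1(p+1) \right], \]
which correspond to the Runge–Kutta order conditions (11) and (13), respectively.
The detailed structure of columns 3 to $p-2$ of $M^4 \in \mathbb{R}^{p\times p}$ is as follows:
\[ \begin{cases}
M^4(i, j) = \eta_{\lfloor j/2+1 \rfloor}^{\,i-1} / (i-1)!, \\[4pt]
M^4(i, j+1) = \dfrac{d}{d\eta_{\lfloor j/2+1 \rfloor}}\, M^4(i, j),
\end{cases} \qquad \begin{array}{l} i = 1, 2, \ldots, p, \\ j = 3, 5, \ldots, 2\lfloor (p-3)/2 \rfloor + 1, \end{array} \]
provided $j + 1 \le p - 2$ in the second equation.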
4.5 Step control predictor P5
The $(p+1)$-vector of the reordered coefficients of the step control predictor P5 in (6) is
\[ \tilde{u}^5 = [a_{51}, \gamma_{50}, \beta_{51}, \gamma_{51}, \beta_{52}, \gamma_{52}, \ldots, \beta_{5,\lfloor (p-3)/2 \rfloor}, \gamma_{5,\lfloor (p-4)/2 \rfloor}, a_{54}, a_{53}, a_{52}]^T. \]
The choice
\[ a_{54} = b_{14} + \omega_4, \qquad a_{53} = b_{13} + \omega_3, \qquad a_{52} = b_{12} + \omega_2, \]
reduces $\tilde{u}^5$ to the $(p-2)$-vector $u^5$ which is the solution of the system of order conditions
\[ M^2 u^5 = r^5. \tag{23} \]
The $(p-2)$ components of $r^5 = r_5(1 : p-2)$ are
\[ r_5(i) = \frac{1}{i!} - (b_{14} + \omega_4) \frac{c_4^{i-1}}{(i-1)!} - (b_{13} + \omega_3) \frac{c_3^{i-1}}{(i-1)!} - (b_{12} + \omega_2) \frac{c_2^{i-1}}{(i-1)!}, \qquad i = 1, 2, \ldots, p-2, \]
where $\omega_4 \omega_3 \omega_2 \ne 0$ can be chosen arbitrarily. For any such choice, P5 yields $\tilde{y}_{n+1}$ to order $(p-2)$.
Based on numerical experimentation, a good choice was found to be $\omega_4 = -0.025$, $\omega_3 = 0.029$ and
$\omega_2 = -0.015$.
5 Symbolic construction of elementary matrix functions
Consider the coefficient matrices
\[ M^\ell \in \mathbb{R}^{m_\ell \times m_\ell}, \qquad \ell = 1, 2, 3, 4, 5, \tag{24} \]
of the Vandermonde-type systems (15), (17), (19), (21), and (23), where
\[ m_1 = p + 1, \qquad m_2 = p - 2, \qquad m_3 = p - 1, \qquad m_4 = p, \qquad m_5 = p - 2. \tag{25} \]
A fast solution of these systems in $O(m_\ell^2)$ operations will be achieved by decomposing $(M^\ell)^{-1}$ into
the product of lower and upper bidiagonal matrices, one diagonal matrix and one upper tridiagonal
matrix.
The purpose of this section is to construct elementary lower and upper bidiagonal matrices as
symbolic functions of the parameters of HBO(p)4 to be used in Section 7 to diagonalize $M^\ell$, $\ell =
1, 2, 3, 4, 5$. These matrices are most easily constructed by means of symbolic software.
Since the Vandermonde-type matrices $M^\ell$ can be decomposed into the product of a diagonal
matrix containing reciprocals of factorials and a confluent Vandermonde matrix, the factorizations
used in this paper hold following the approach of Björck and Pereyra [5], Krogh [16], Galimberti
and Pereyra [9], and Björck and Elfving [4]. Pivoting is not needed in this decomposition because
of the special structure of Vandermonde-type matrices.
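To give a feel for why such systems can be solved in $O(m^2)$ operations without pivoting, the sketch below implements the classical Björck–Pereyra scheme for an ordinary (non-confluent) Vandermonde interpolation problem: a sweep of divided differences followed by a sweep that converts the Newton form to monomial coefficients. This is the textbook algorithm for distinct nodes, shown only as background; it is not the confluent, factorial-scaled variant constructed in this paper, and the function name is an assumption.

```python
import numpy as np


def bjorck_pereyra_dual(x, f):
    """Return coefficients a with sum_j a[j] * x[i]**j == f[i], in O(n^2) operations.

    Classical Bjorck-Pereyra scheme for distinct nodes x (illustrative only).
    """
    x = np.asarray(x, dtype=float)
    c = np.array(f, dtype=float)
    n = len(x)
    # Stage 1: Newton divided differences, computed in place.
    for k in range(n - 1):
        for i in range(n - 1, k, -1):
            c[i] = (c[i] - c[i - 1]) / (x[i] - x[i - k - 1])
    # Stage 2: convert the Newton form to monomial coefficients.
    for k in range(n - 2, -1, -1):
        for i in range(k, n - 1):
            c[i] -= c[i + 1] * x[k]
    return c


nodes = np.array([0.0, 1.0 / 3.0, 2.0 / 3.0, 1.0, -1.0])
a_true = np.array([1.0, -2.0, 0.5, 3.0, -1.0])
values = np.polynomial.polynomial.polyval(nodes, a_true)
a_fast = bjorck_pereyra_dual(nodes, values)
```

Both sweeps are products of bidiagonal and diagonal factors applied to the right-hand side, which is the same kind of factorization that is built symbolically in the subsections that follow.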
5.1 Symbolic construction of lower bidiagonal matrices for $M^\ell$, $\ell = 1, 2, 3, 4$
We first describe the zeroing process of a general vector $x = [x_1, x_2, \ldots, x_m]^T$ with no zero elements.
The lower bidiagonal matrix
\[ L_k = \begin{bmatrix}
I_{k-1} & 0 & 0 & \cdots & 0 \\
0 & 1 & 0 & \cdots & 0 \\
0 & 1 & -\tau_{k+1} & & 0 \\
\vdots & & \ddots & \ddots & \vdots \\
0 & 0 & \cdots & 1 & -\tau_m
\end{bmatrix} \tag{26} \]
defined by the multipliers
\[ \tau_i = \frac{x_{i-1}}{x_i} = -L_k(i, i), \qquad i = k+1, k+2, \ldots, m, \tag{27} \]
zeros the last $(m - k)$ components, $x_{k+1}, \ldots, x_m$, of $x$.
For $k = 3, 4, \ldots, m_\ell - 1$, left multiplying $T = L_{k-1}^\ell \cdots L_4^\ell L_3^\ell M^\ell$ by $L_k^\ell$ zeros the last $(m_\ell - k)$
components of the $k$th column of $T$. Thus we obtain the upper triangular matrix
\[ L^\ell M^\ell = L_{m_\ell - 1}^\ell \cdots L_4^\ell L_3^\ell M^\ell \tag{28} \]
in $(m_\ell - 3)$ steps. We note that $L^\ell$ does not change the first two rows of $M^\ell$.
Process 1. At the $k$th step, starting with $k = 3$,
• $M^{\ell(k-1)} = L_{k-1}^\ell L_{k-2}^\ell \cdots L_3^\ell M^\ell$ is an upper triangular matrix in columns 1 to $k - 1$.
• The multipliers in $L_k^\ell$ are obtained from $M^{\ell(k-1)}(k+1 : m_\ell, k)$ since $M^\ell(i, k) \ne 0$ for $i =
k+1, k+2, \ldots, m_\ell$.
Algorithm 1 in Appendix A describes this process.
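The zeroing step defined by (26) and (27) can be sketched numerically as follows. The helper name `lower_bidiagonal` and the sample vector are assumptions made for this example; the paper builds these matrices symbolically, whereas the sketch uses floating-point numbers.

```python
import numpy as np


def lower_bidiagonal(x, k):
    """Build L_k of (26) for the vector x, using 1-based k as in the text."""
    m = len(x)
    L = np.eye(m)
    for i in range(k + 1, m + 1):      # 1-based row index i = k+1, ..., m
        tau = x[i - 2] / x[i - 1]      # tau_i = x_{i-1} / x_i, eq. (27)
        L[i - 1, i - 2] = 1.0          # subdiagonal entry
        L[i - 1, i - 1] = -tau         # diagonal entry -tau_i
    return L


x = np.array([2.0, 3.0, 5.0, 7.0, 11.0])
L2 = lower_bidiagonal(x, k=2)
zeroed = L2 @ x                        # components 3..5 are annihilated
```

Row $i > k$ of $L_k x$ equals $x_{i-1} - \tau_i x_i = 0$, while the first $k$ components are left unchanged.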
5.2 Symbolic construction of initializing upper tridiagonal matrices $U_1^\ell$ for $M^\ell$, $\ell = 1, 2, 3$
The second step in diagonalizing $M^\ell$ transforms the first two rows of $L^\ell M^\ell$ by right multiplication
by an upper tridiagonal matrix $U_1^\ell$ such that
\[ L^\ell M^\ell U_1^\ell (1 : 2, 1 : m_\ell) = \begin{bmatrix} 1 & 0 & 0 & \cdots & 0 \\ 0 & 1 & 1 & \cdots & 1 \end{bmatrix}. \tag{29} \]
The action of $U_1^\ell$ amounts to taking the first divided difference of the columns of $L^\ell M^\ell$ whose first
component is 1 (cf. [4]). The precise form of $U_1^\ell$ is postponed until Section 7 since it is defined in
terms of each $M^\ell$.
5.3 Symbolic construction of upper bidiagonal matrices for $M^\ell$, $\ell = 1, 2, 3$
We construct upper bidiagonal matrices $U_k^\ell$, $k = 2, 3, \ldots, m_\ell - 1$, whose right multiplication on
$L^\ell M^\ell U_1^\ell$ amounts to taking divided differences of order $k$, for $k = 2, 3, \ldots, m_\ell - 1$, of the columns
$k$ to $m_\ell$ of the matrices on which they act.
Specifically, consider the two-row matrix:
\[ L^\ell M^\ell U_1^\ell \cdots U_{k-1}^\ell (k : k+1, 1 : m_\ell) = \begin{bmatrix}
y_{k1} & \cdots & y_{k,k-1} & 1 & 1 & \cdots & 1 & 1 \\
y_{k+1,1} & \cdots & y_{k+1,k-1} & y_{k+1,k} & y_{k+1,k+1} & \cdots & y_{k+1,m_\ell-1} & y_{k+1,m_\ell}
\end{bmatrix} \tag{30} \]
and define the upper bidiagonal matrix
\[ U_k^\ell = \begin{bmatrix}
I_{k-1} & 0 & \cdots & 0 & \cdots & 0 & 0 \\
0 & 1 & -\sigma_{k+1} & 0 & \cdots & 0 & 0 \\
0 & 0 & \sigma_{k+1} & -\sigma_{k+2} & \cdots & 0 & 0 \\
\vdots & \vdots & & \ddots & \ddots & & \vdots \\
0 & 0 & 0 & \cdots & \sigma_{m_\ell-2} & -\sigma_{m_\ell-1} & 0 \\
0 & 0 & 0 & \cdots & 0 & \sigma_{m_\ell-1} & -\sigma_{m_\ell} \\
0 & 0 & 0 & \cdots & 0 & 0 & \sigma_{m_\ell}
\end{bmatrix} \tag{31} \]
by means of the divisors
\[ \sigma_i = \frac{1}{y_{2,i} - y_{2,i-1}} = U_k^\ell(i, i), \qquad i = k+1, k+2, \ldots, m_\ell. \tag{32} \]
Then, right multiplying (30) by $U_k^\ell$ zeros the 1’s in positions $k+1, \ldots, m_\ell$ in the first row and puts
1’s in positions $k+1, \ldots, m_\ell$ in the second row:
\[ L^\ell M^\ell U_1^\ell \cdots U_{k-1}^\ell U_k^\ell (k : k+1, 1 : m_\ell) = \begin{bmatrix}
y_{k1} & \cdots & y_{k,k-1} & 1 & 0 & \cdots & 0 & 0 \\
y_{k+1,1} & \cdots & y_{k+1,k-1} & y_{k+1,k} & 1 & \cdots & 1 & 1
\end{bmatrix}. \tag{33} \]
Applying $U_k^\ell$, $k = 2, 3, \ldots, m_\ell - 1$, on the right of the upper triangular matrix $L^\ell M^\ell U_1^\ell$, we obtain
the diagonal matrix
\[ D^\ell = L^\ell M^\ell U^\ell = L_{m_\ell-1}^\ell \cdots L_4^\ell L_3^\ell M^\ell U_1^\ell U_2^\ell \cdots U_{m_\ell-1}^\ell \tag{34} \]
in $(m_\ell - 2)$ steps.
Process 2. At the $k$th step, starting with $k = 2$,
• $M^{\ell(k-1)} = (L^\ell M^\ell U_1^\ell) U_2^\ell \cdots U_{k-1}^\ell$ is a diagonal matrix in rows 1 to $k - 1$.
• The divisors in $U_k^\ell$ are obtained from $M^{\ell(k-1)}(k+1, k+1 : m_\ell)$ since $M^{\ell(k-1)}(k+1, j) -
M^{\ell(k-1)}(k+1, j-1) \ne 0$ for $j = k+1, k+2, \ldots, m_\ell$.
Algorithm 2 in Appendix A describes this process.
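The effect of one upper bidiagonal factor on the two-row block (30) can be sketched as follows. The helper `upper_bidiagonal`, the sample numbers, and the choice to read the divisors from the second row of the block are assumptions made for this example.

```python
import numpy as np


def upper_bidiagonal(second_row, k):
    """Build U_k of (31) from the second row of the block (30), 1-based k."""
    m = len(second_row)
    U = np.eye(m)
    for i in range(k + 1, m + 1):      # 1-based column index i = k+1, ..., m
        sigma = 1.0 / (second_row[i - 1] - second_row[i - 2])   # divisor (32)
        U[i - 1, i - 1] = sigma        # diagonal entry sigma_i
        U[i - 2, i - 1] = -sigma       # superdiagonal entry -sigma_i
    return U


k = 2
block = np.array([[0.3, 1.0, 1.0, 1.0, 1.0],       # row k of the product
                  [0.7, -1.0, -2.0, -3.5, -4.0]])  # row k + 1 of the product
U2 = upper_bidiagonal(block[1], k)
result = block @ U2
```

Column $i > k$ of the product is $\sigma_i$ times the difference of columns $i$ and $i-1$, so the 1’s of the first row become zeros and the second row becomes a row of 1’s from position $k+1$ on, as in (33).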
5.4 Symbolic construction of upper bidiagonal matrices for $M^4$
The construction of the first $m_4 - 3$ upper bidiagonal matrices for $M^4$ differs slightly from the
construction for $M^\ell$, $\ell = 1, 2, 3$, and requires a consideration of its own. For the matrix $L^4 M^4 U_1^4$,
we construct upper bidiagonal matrices $U_k^4$, $k = 2, 3, \ldots, m_4 - 2$. The right multiplication of $U_k^4$,
$k = 2, 3, \ldots, m_4 - 2$, on $L^4 M^4 U_1^4$ amounts to taking modified divided differences of order $k$, for
$k = 2, 3, \ldots, m_4 - 2$, of the columns $k$ to $m_4$ of the matrices on which they act.
Specifically, consider the two-row matrix:
\[ L^4 M^4 U_1^4 \cdots U_{k-1}^4 (k : k+1, 1 : m_4) = \begin{bmatrix}
y_{k1} & \cdots & y_{k,k-1} & 1 & 1 & \cdots & 1 & 1 \\
y_{k+1,1} & \cdots & y_{k+1,k-1} & y_{k+1,k} & y_{k+1,k+1} & \cdots & y_{k+1,m_4-1} & y_{k+1,m_4}
\end{bmatrix} \tag{35} \]
and define the upper tridiagonal matrix
\[ U_k^4 = \begin{bmatrix}
I_{k-1} & 0 & \cdots & 0 & \cdots & 0 & 0 \\
0 & 1 & -\sigma_{k+1} & 0 & \cdots & 0 & 0 \\
0 & 0 & \sigma_{k+1} & -\sigma_{k+2} & \cdots & 0 & 0 \\
\vdots & \vdots & & \ddots & \ddots & & \vdots \\
0 & 0 & 0 & \cdots & \sigma_{m_4-2} & -\sigma_{m_4-1} & -\sigma_{m_4} \\
0 & 0 & 0 & \cdots & 0 & \sigma_{m_4-1} & 0 \\
0 & 0 & 0 & \cdots & 0 & 0 & \sigma_{m_4}
\end{bmatrix} \tag{36} \]
by means of the divisors
\[ \begin{aligned}
\sigma_i &= \frac{1}{y_{2,i} - y_{2,i-1}} = U_k^4(i, i), \qquad i = k+1, k+2, \ldots, m_4 - 1, \\
\sigma_{m_4} &= \frac{1}{y_{2,m_4} - y_{2,m_4-2}} = U_k^4(m_4, m_4).
\end{aligned} \tag{37} \]
We note that $U_k^4$, $k = 2, 3, \ldots, m_4 - 2$, differs from $U_k^\ell$ of subsection 5.3 in the last column.
Then, right multiplying (35) by $U_k^4$ zeros the 1’s in positions $k+1, \ldots, m_4$ in the first row and
puts 1’s in positions $k+1, \ldots, m_4$ in the second row:
\[ L^4 M^4 U_1^4 \cdots U_{k-1}^4 U_k^4 (k : k+1, 1 : m_4) = \begin{bmatrix}
y_{k1} & \cdots & y_{k,k-1} & 1 & 0 & \cdots & 0 & 0 \\
y_{k+1,1} & \cdots & y_{k+1,k-1} & y_{k+1,k} & 1 & \cdots & 1 & 1
\end{bmatrix}. \tag{38} \]
Applying $U_k^4$, $k = 2, 3, \ldots, m_4 - 2$, on the right of the upper triangular matrix $L^4 M^4 U_1^4$, we obtain a
nonsingular bidiagonal matrix with only one nonzero off-diagonal element in position $(m_4 - 1, m_4)$,
\[ W^4 = L^4 M^4 U^4 = L_{m_4-1}^4 \cdots L_4^4 L_3^4 M^4 U_1^4 U_2^4 \cdots U_{m_4-2}^4, \tag{39} \]
in $(m_4 - 3)$ steps.
Process 3. At the $k$th step, starting with $k = 2$,
• $M^{4(k-1)} = (L^4 M^4 U_1^4) U_2^4 \cdots U_{k-1}^4$ is a diagonal matrix in rows 1 to $k - 1$.
• The divisors in $U_k^4$ are obtained from $M^{4(k-1)}(k+1, k+1 : m_4)$ since
$M^{4(k-1)}(k+1, j) - M^{4(k-1)}(k+1, j-1) \ne 0$ for $j = k+1, k+2, \ldots, m_4 - 1$,
and $M^{4(k-1)}(k+1, m_4) - M^{4(k-1)}(k+1, m_4 - 2) \ne 0$.
Algorithm 3 in Appendix A describes this process for constructing the first $m_4 - 3$ upper bidiagonal
matrices $U_k$ for $M^4$.
6 Particular VSVO HBO(p)4 of order 5 to 14
In the general HBO(5-14)4 of Section 2, $c_2$ and $c_3$ are free parameters. After an appropriate choice
of $c_2$ and $c_3$, one has a particular VSVO HBO(p)4 of order p = 5 to 14 that depends only on $h_{n+1}$
and the previous nodes, $x_n, x_{n-1}, \ldots, x_{n-\lfloor (p-3)/2 \rfloor}$, which determine $\eta_2, \ldots, \eta_{\lfloor (p-3)/2 \rfloor+1}$ in (9).
After extensive numerical experimentation, a good simple particular VSVO HBO(5-14)4 was
obtained with
\[ c_1 = 0, \qquad c_2 = \frac{1}{3}, \qquad c_3 = \frac{2}{3}, \qquad c_4 = 1. \tag{40} \]
The remainder of this paper is concerned with the particular VSVO HBO(5-14)4 with coefficients
$c_j$ given in (40).
The procedure to advance integration from $x_n$ to $x_{n+1}$ is as follows.
(a) The order p is obtained by the procedure of Section 9. Then, the stepsize, $h_{n+1}$, is obtained
by formula (53) of Section 9 with $\kappa = p - 1$.
(b) The numbers $\eta_2, \ldots, \eta_{\lfloor (p-3)/2 \rfloor+2}$, defined in (9), are calculated.
(c) The coefficients of integration formula IF, predictors P2, P3, P4 and step control predictor P5
are obtained successively as solutions of systems (15), (17), (19), (21) and (23), respectively.
(d) The values $y_{n+c_2}$, $y_{n+c_3}$, $y_{n+c_4}$, $y_{n+1}$, and $\tilde{y}_{n+1}$ are obtained by formulae (2)–(6).
(e) The step is accepted if $|y_{n+1} - \tilde{y}_{n+1}|$ is smaller than the chosen tolerance and the program
goes to (a) with n replaced by n + 1. Otherwise the program returns to (a) with the same
order p and smaller stepsize $0.7 h_{n+1}$.
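The accept/reject logic of step (e) can be sketched in isolation. In the sketch below, `heun_euler_pair` is a hypothetical stand-in that returns a solution value and a lower-order control value; it is not HBO(p)4, and the order selection of step (a) and formula (53) are not reproduced. Only the comparison against the tolerance and the reduction of a rejected step by the factor 0.7 follow the text.

```python
def attempt_step(step_pair, x, y, h, tol, shrink=0.7, max_tries=30):
    """Retry a step with h <- 0.7 h until |y_new - y_ctrl| < tol (step (e))."""
    for _ in range(max_tries):
        y_new, y_ctrl = step_pair(x, y, h)
        if abs(y_new - y_ctrl) < tol:
            return x + h, y_new, h       # step accepted
        h *= shrink                      # step rejected: same order, smaller h
    raise RuntimeError("no acceptable step size found")


def heun_euler_pair(x, y, h, f=lambda x, y: -y):
    """Hypothetical stand-in pair for y' = -y: Heun value and Euler control."""
    k1 = f(x, y)
    k2 = f(x + h, y + h * k1)
    return y + 0.5 * h * (k1 + k2), y + h * k1


x1, y1, h_used = attempt_step(heun_euler_pair, 0.0, 1.0, 1.0, 1e-3)
```

For this stand-in pair the two values differ by $h^2/2$ at $y = 1$, so starting from $h = 1$ with tolerance $10^{-3}$ the step is rejected nine times before it is accepted.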
7 Fast solution of particular HBO(p)4 of order 5 to 14
The elementary matrix functions $L_k^\ell$ and $U_k^\ell$, $\ell = 1, 2, 3, 4$, are constructed only once as functions of
$\eta_2, \ldots, \eta_6$. Then they are used by the fast Algorithm 4 listed in Appendix A to solve the systems $M^\ell u^\ell = r^\ell$
in (15), (17), (19), (21), and (23) at each integration step. The solution steps are described in the
next subsections.
7.1 Solution of $M^\ell u^\ell = r^\ell$ for $\ell = 1, 2, 3, 5$
We recall, from (25), that
\[ m_1 = p + 1, \qquad m_2 = p - 2, \qquad m_3 = p - 1, \qquad m_5 = p - 2, \]
and $M^5 = M^2$.
Firstly, the elimination procedure of subsection 5.1 is applied to $M^\ell$ to construct $m_\ell \times m_\ell$ lower
bidiagonal matrices $L_k^\ell$, $k = 3, \ldots, m_\ell - 1$, of the form (26) defined by the multipliers
\[ \tau_i = \frac{i + 2 - k}{\mu_\ell(k)} = -L_k^\ell(i, i), \qquad i = k+1, k+2, \ldots, m_\ell, \tag{41} \]
where
\[ \mu_\ell(k) = \begin{cases} M^\ell(2, k), & \text{if } M^\ell(1, k) = 1, \\ M^\ell(3, k), & \text{if } M^\ell(1, k) = 0, \end{cases} \qquad k = 3, 4, \ldots, m_\ell - 1. \]
Left multiplying $M^\ell$ by $L_k^\ell$, $k = 3, \ldots, m_\ell - 1$, produces an upper triangular matrix $L^\ell M^\ell =
L_{m_\ell-1}^\ell \cdots L_3^\ell M^\ell$ of the form (28).
Secondly, we construct the $m_\ell \times m_\ell$ upper tridiagonal matrix $U_1^\ell$ which transforms $M^\ell$ into the
matrix $M^\ell U_1^\ell$ of first-order divided differences of the columns of $M^\ell$ whose first component is 1 and
where the divisors are taken from the second row of $M^\ell$. We note that $U_1^5 = U_1^2$ since $M^5 = M^2$.
For given $p$ and $\ell$, Table 1 lists the diagonal elements of $U_1^\ell$ for $i = 1, 2, \ldots, m_\ell$.

Table 1: The diagonal entries of $U_1^\ell$.

 i    U_1^1(i,i)     U_1^2(i,i) = U_1^5(i,i)   U_1^3(i,i)     U_1^4(i,i)
 1    1              1                         1              1
 2    1              1                         1              1
 3    1/c4           1/η2                      1/c2           1/η2
 4    1/(c3 − c4)    1                         1/(η2 − c2)    1
 5    1/(c2 − c3)    1/(η3 − η2)               1              1/(η3 − η2)
 6    1/(η2 − c2)    1                         1/(η3 − η2)    1
 7    1              1/(η4 − η3)               1              1/(η4 − η3)
 8    1/(η3 − η2)    1                         1/(η4 − η3)    1
 9    1              1/(η5 − η4)               1              1/(η5 − η4)
10    1/(η4 − η3)    1                         1/(η5 − η4)    1
11    1              1/(η6 − η5)               1              1/(η6 − η5)
12    1/(η5 − η4)    1                         1/(η6 − η5)    1
13    1                                                       1/(c3 − η6)
14    1/(η6 − η5)                                             1/(c2 − η6)

The nonzero off-diagonal elements of $U_1^\ell$ are given in the following four sets of equations:
\[ \begin{aligned}
& U_1^1(i-2, i) = -U_1^1(i, i), \qquad i = 3, 8, 10, \ldots, m_1 - 1, \\
& U_1^1(3, 4) = -U_1^1(4, 4), \qquad U_1^1(4, 5) = -U_1^1(5, 5), \qquad U_1^1(5, 6) = -U_1^1(6, 6), \\[4pt]
& U_1^2(i-2, i) = -U_1^2(i, i), \qquad i = 3, 5, 7, \ldots, m_2 - 1, \\[4pt]
& U_1^3(i-2, i) = -U_1^3(i, i), \qquad i = 3, 6, 8, \ldots, m_3 - 1, \\
& U_1^3(3, 4) = -U_1^3(4, 4), \\[4pt]
& U_1^4(i-2, i) = -U_1^4(i, i), \qquad i = 3, 5, 7, \ldots, m_4 - 1, \\
& U_1^4(m_4 - 3, m_4) = -U_1^4(m_4, m_4).
\end{aligned} \]
The first two rows of $M^\ell U_1^\ell$ are as in (29).
Thirdly, the elimination procedure of subsection 5.3 is used to construct $m_\ell \times m_\ell$ upper bidiagonal
matrices $U_k^\ell$, $k = 2, \ldots, m_\ell - 1$, of the form (31) defined by the divisors
\[ \sigma_i = \frac{k}{\mu_\ell(i) - \mu_\ell(i - k)} = U_k^\ell(i, i), \qquad i = k+1, k+2, \ldots, m_\ell. \tag{42} \]
Right multiplying $L^\ell M^\ell$ by $U_k^\ell$, $k = 1, \ldots, m_\ell - 1$, produces the diagonal matrix
\[ D^\ell = L_{m_\ell-1}^\ell L_{m_\ell-2}^\ell \cdots L_3^\ell M^\ell U_1^\ell U_2^\ell \cdots U_{m_\ell-1}^\ell, \]
where
\[ D^\ell(i, i) = 1, \qquad i = 1, 2, 3, \]
and
\[ D^\ell(i, i) = \frac{(i-1)!}{2\,[-\mu_\ell(3)]\,[-\mu_\ell(4)] \cdots [-\mu_\ell(i-1)]}, \qquad i = 4, 5, \ldots, m_\ell. \]
Lastly, $M^\ell$ is decomposed into the product of elementary matrices,
\[ M^\ell = \bigl( L_{m_\ell-1}^\ell L_{m_\ell-2}^\ell \cdots L_3^\ell \bigr)^{-1} D^\ell \bigl( U_1^\ell U_2^\ell \cdots U_{m_\ell-1}^\ell \bigr)^{-1}, \]
and the solution of $M^\ell u^\ell = r^\ell$ is
\[ u^\ell = U_1^\ell U_2^\ell \cdots U_{m_\ell-1}^\ell (D^\ell)^{-1} L_{m_\ell-1}^\ell L_{m_\ell-2}^\ell \cdots L_3^\ell\, r^\ell, \tag{43} \]
where fast computation goes from right to left.
Process 4. Procedure (43) is implemented in the following two steps:
Step 1 Algorithm 4 in Appendix A overwrites $r^\ell = r_\ell(1 : m_\ell)$ with
$U_2^\ell \cdots U_{m_\ell-1}^\ell (D^\ell)^{-1} L_{m_\ell-1}^\ell L_{m_\ell-2}^\ell \cdots L_3^\ell\, r^\ell$ in $O(m_\ell^2)$ operations. The input is $M = M^\ell$; $m = m_\ell$;
$r = r^\ell$; $L_k = L_k^\ell$, $k = 3, 4, \ldots, m_\ell - 1$; $U_k = U_k^\ell$, $k = 2, \ldots, m_\ell - 1$; and $D = D^\ell$.
Step 2 For each value of $\ell$, one of the following three cases is computed:
Case 1 ($\ell = 1$) The following iteration overwrites $r^1 = r_1(1 : m_1)$ with $U_1^1 r^1$:
\[ \begin{aligned}
r_1(3) &= r_1(3)\, U_1^1(3, 3), \\
r_1(4) &= r_1(4)\, U_1^1(4, 4), \\
r_1(5) &= r_1(5)\, U_1^1(5, 5), \\
r_1(i) &= r_1(i)\, U_1^1(i, i), & i &= 6, 8, \ldots, 2\lfloor m_1/2 \rfloor, \\
r_1(1) &= r_1(1) - r_1(3), \\
r_1(3) &= r_1(3) - r_1(4), \\
r_1(4) &= r_1(4) - r_1(5), \\
r_1(5) &= r_1(5) - r_1(6), & & \text{if } m_1 = 6, \\
r_1(i) &= r_1(i) - r_1(i+2), & i &= 6, 8, \ldots, 2\lfloor m_1/2 \rfloor - 2.
\end{aligned} \]
Case 2 ($\ell = 2$) The following iteration overwrites $r^2 = r_2(1 : m_2)$ with $U_1^2 r^2$:
\[ \begin{aligned}
r_2(i) &= r_2(i)\, U_1^2(i, i), & i &= 3, 5, 7, \ldots, 2\lfloor (m_2+1)/2 \rfloor - 1, \\
r_2(i) &= r_2(i) - r_2(i+2), & i &= 1, 3, 5, \ldots, 2\lfloor (m_2+1)/2 \rfloor - 3.
\end{aligned} \]
Case 3 ($\ell = 3$) The following iteration overwrites $r^3 = r_3(1 : m_3)$ with $U_1^3 r^3$:
\[ \begin{aligned}
r_3(3) &= r_3(3)\, U_1^3(3, 3), \\
r_3(i) &= r_3(i)\, U_1^3(i, i), & i &= 4, 6, 8, \ldots, 2\lfloor m_3/2 \rfloor, \\
r_3(1) &= r_3(1) - r_3(3), \\
r_3(3) &= r_3(3) - r_3(4), & & \text{if } m_3 = 4, \\
r_3(i) &= r_3(i) - r_3(i+2), & i &= 4, 6, 8, \ldots, 2\lfloor m_3/2 \rfloor - 2.
\end{aligned} \]
Solution of M 4 u4 = r 4
We recall, from (25), that
m4 = p.
Since the Runge–Kutta-type order condition (13) changes the Vandermonde-type system M 4 u4 =
r 4 , the construction of the lower and upper bidiagonal matrices for M 4 differs slightly from the
construction for M ` , ` = 1, 2, 3.
Firstly, the elimination procedure of subsection 5.1 is applied to M 4 to construct the first (m4 −4)
m4 × m4 lower bidiagonal matrices L4k , k = 3, . . . , m4 − 2, of the form (26) defined by the multipliers
τi =
where
i+2−k
= −L4k (i, i),
µ4 (k)
(
µ4 (k) =
M 4 (2, k),
M 4 (3, k),
i = k + 1, k + 2, . . . , m4 ,
if M 4 (1, k) = 1,
if M 4 (1, k) = 0,
(44)
k = 3, 4, . . . , m4 − 1.
Secondly, the elimination procedure of subsection 5.1 is applied to M 4 to construct the last
m4 × m4 lower bidiagonal matrices L4m4 −1 of the form (26) where the only multiplier τm4 is found
by this elimination procedure:
1
(45)
= −L4m4 −1 (m4 , m4 ),
τm4 =
∆
where
µ
¶
ÁY
p−2
c3 1 S3 (p − 1)(p − 1)!
p−4
∆=
+
− c3 c3
ζc3 (k),
(46)
3
3
cp−2
3
k=3
13
with
(
ζc3 (k) =
c3 − M 4 (2, k),
c3 − M 4 (3, k),
if M 4 (1, k) = 1,
if M 4 (1, k) = 0,
k = 3, 4, . . . , m4 − 2.
Left multiplying M 4 by L4k , k = 3, . . . , m4 − 1, produces the upper triangular matrix L4 M 4 =
L4m4 −1 · · · L43 M 4 of the form (28).
Thirdly, we construct the m4 × m4 upper tridiagonal matrix U14 which transforms M 4 into the
matrix of first-order divided differences M 4 U14 of the columns of M 4 whose first component is 1 and
where the divisors are taken from the second row of M 4 .
Fourthly, the elimination procedure of subsection 5.3 is used to construct the first (m4 − 3)
m4 × m4 upper bidiagonal matrices Uk4 , k = 2, . . . , m4 − 2, of the form (36) defined by the divisors
k
i = k + 1, k + 2, . . . , m4 − 1,
= Uk4 (i, i),
µ4 (i) − µ4 (i − k)
k
=
= Uk4 (m4 , m4 ).
µ4 (m4 ) − µ4 (m4 − k − 1)
σi =
σm4
(47)
Fifthly, the elimination procedure of subsection 5.3 is used to construct the last m4 × m4 upper
4
bidiagonal matrices Um
of the form (31) where the only divisor σm4 is found by this elimination
4 −1
procedure:
σm4 = −
∆
4
= Um
(m4 , m4 ).
4 −1
∆2 − ∆
(48)
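where $\Delta$ is defined in (46) and $\Delta_2$ is
\[
\Delta_2 = \frac{c_2}{3} + \frac{1}{3}\left(\frac{S_2(p-1)\,(p-1)!}{c_2^{\,p-2}} - c_2\right) c_2^{\,p-4} \Bigg/ \prod_{k=3}^{p-2}\zeta_{c_2}(k), \tag{49}
\]
with
\[
\zeta_{c_2}(k) = \begin{cases} c_2 - M^4(2,k), & \text{if } M^4(1,k) = 1,\\ c_2 - M^4(3,k), & \text{if } M^4(1,k) = 0,\end{cases} \qquad k = 3, 4, \ldots, m_4-2.
\]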
Right multiplying $L^4M^4$ by $U^4_k$, $k = 1,\ldots,m_4-1$, produces the diagonal matrix
\[
D^4 = L^4_{m_4-1}L^4_{m_4-2}\cdots L^4_3\,M^4\,U^4_1U^4_2\cdots U^4_{m_4-1},
\]
where
\[
D^4(i,i) = 1, \qquad i = 1, 2, 3,
\]
and
\[
\begin{aligned}
D^4(i,i) &= \frac{(i-1)!}{2\,[-\mu_4(3)]\,[-\mu_4(4)]\cdots[-\mu_4(i-1)]}, \qquad i = 4, 5, \ldots, m_4-1,\\
D^4(m_4,m_4) &= \frac{(m_4-2)!}{2\,[-\mu_4(3)]\,[-\mu_4(4)]\cdots[-\mu_4(m_4-2)]}.
\end{aligned}
\]
Lastly, $M^4$ is decomposed into the product of elementary matrices,
\[
M^4 = \bigl(L^4_{m_4-1}L^4_{m_4-2}\cdots L^4_3\bigr)^{-1} D^4 \bigl(U^4_1U^4_2\cdots U^4_{m_4-1}\bigr)^{-1},
\]
and the solution of $M^4u^4 = r^4$ is
\[
u^4 = U^4_1U^4_2\cdots U^4_{m_4-1}\,(D^4)^{-1}\,L^4_{m_4-1}L^4_{m_4-2}\cdots L^4_3\,r^4, \tag{50}
\]
where fast computation goes from right to left.
Process 5. Procedure (50) is implemented in the following two steps:
Step 1 Algorithm 5 in Appendix A overwrites $r^4 = r^4(1:m_4)$ with $U^4_2\cdots U^4_{m_4-1}(D^4)^{-1}L^4_{m_4-1}L^4_{m_4-2}\cdots L^4_3\,r^4$ in $O(m_4^2)$ operations. The input is $M = M^4$; $m = m_4$; $r = r^4$; $L_k = L^4_k$, $k = 3, 4, \ldots, m_4-1$; $U_k = U^4_k$, $k = 2, \ldots, m_4-1$; and $D = D^4$.
Step 2 The following iteration overwrites $r^4 = r^4(1:m_4)$ with $U^4_1r^4$:
\[
\begin{aligned}
r^4(i) &= r^4(i)\,U^4_1(i,i), && i = 3, 5, 7, \ldots, m_4-2,\\
r^4(m_4-1) &= r^4(m_4-1)\,U^4_1(m_4-1,m_4-1),\\
r^4(m_4) &= r^4(m_4)\,U^4_1(m_4,m_4),\\
r^4(i) &= r^4(i) - r^4(i+2), && i = 1, 3, 5, \ldots, m_4-4,\\
r^4(m_4-3) &= r^4(m_4-3) - r^4(m_4-1), && \text{if } m_4 \text{ is even},\\
r^4(m_4-3) &= r^4(m_4-3) - r^4(m_4), && \text{if } m_4 \text{ is even},\\
r^4(m_4-2) &= r^4(m_4-2) - r^4(m_4-1), && \text{if } m_4 \text{ is odd},\\
r^4(m_4-2) &= r^4(m_4-2) - r^4(m_4), && \text{if } m_4 \text{ is odd}.
\end{aligned}
\]
8
Regions of absolute stability and principal error term
To obtain the regions of absolute stability, $R$, of HBO(p)4, $p = 5, 6, \ldots, 14$, we apply the predictors P2, P3, P4 and the integration formula IF, with constant $h$, to the linear test equation
\[
y' = \lambda y, \qquad y_0 = 1.
\]
This gives the difference equation and the corresponding characteristic equation
\[
\sum_{j=0}^{\lfloor (p-1)/2\rfloor}\gamma_j\,y_{n+j} = 0, \qquad \sum_{j=0}^{\lfloor (p-1)/2\rfloor}\gamma_j\,r^j = 0, \tag{51}
\]
respectively, where $\lfloor (p-1)/2\rfloor$ is the number of steps of the method. A complex number $\lambda h$ is in $R$ if the $\lfloor (p-1)/2\rfloor$ roots $r_s$ of the characteristic equation satisfy the root condition: $|r_s| \le 1$, and the multiple roots satisfy $|r_s| < 1$ (see [12, pp. 256–257]).
The root condition is used to find the regions of absolute stability of HBO(5-14)4 shown in grey
in Fig. 1.
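As an illustration of the root condition only, the following minimal sketch tests it on the two-step Adams–Bashforth method (AB2), whose characteristic equation for y' = λy is r² − (1 + 3z/2) r + z/2 = 0 with z = λh. AB2 is a stand-in chosen because its polynomial is a quadratic that can be solved in closed form; the coefficients γj of HBO(p)4 are not reproduced here, and all function names are hypothetical.

```python
import cmath

def quadratic_roots(a, b, c):
    """Both roots of a*r**2 + b*r + c = 0 (a != 0), possibly complex."""
    disc = cmath.sqrt(b * b - 4 * a * c)
    return [(-b + disc) / (2 * a), (-b - disc) / (2 * a)]

def satisfies_root_condition(roots, tol=1e-12):
    """Root condition: every |r| <= 1 and every repeated root has |r| < 1."""
    for i, r in enumerate(roots):
        if abs(r) > 1 + tol:
            return False
        repeated = any(abs(r - s) <= tol
                       for j, s in enumerate(roots) if j != i)
        if repeated and abs(r) >= 1 - tol:
            return False
    return True

def ab2_is_stable(z):
    """True if z = lambda*h lies in the stability region of AB2
    (illustrative method, not HBO(p)4)."""
    return satisfies_root_condition(
        quadratic_roots(1.0, -(1.0 + 1.5 * z), 0.5 * z))

def real_stability_abscissa(is_stable, step=1e-3, z_min=-10.0):
    """Scan the negative real axis from 0 and return the last stable z."""
    z, last = -step, 0.0
    while z >= z_min and is_stable(z):
        last = z
        z -= step
    return last
```

Scanning the negative real axis in this way gives an abscissa close to −1 for AB2, its known interval of absolute stability; the same scan applied to the characteristic polynomial (51) would produce the abscissae α of the kind listed for the HBO methods.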
Let ABM(p, p − 1) denote the ABM method with predictor of order p − 1 and corrector of order
p in PECE mode and let ABM(5-13) denote the family of such methods of order 5 to 13. Table 2
lists the intervals of absolute stability (α, 0) of HBO(5-14)4 and ABM(5-13) [26, pp. 135–140]. It is
seen that the HBO methods have larger intervals of absolute stability than the ABM methods of
comparable order for p > 7.
The principal error term of the two- to six-step HBO(p)4, p = 5, 6, . . . , 14, is of the form
\[
\delta\,\bigl[\{{}_2 f^{\,p-1}\}_2\bigr]\,h^{p+1}, \tag{52}
\]
expressed in terms of elementary differentials defined in [6], [17] and [12].
Table 3 lists the principal local truncation coefficient (PLTC) δ of the principal error term and
the scaled norms kPLTCk2 of HBO(5-14)4 and ABM(5-13). The scaling factor is the number of
function evaluations per step. It is observed that the scaled norms of HBO(p)4 are much smaller
than those of ABM(p, p − 1).
[Figure omitted: ten panels, one for each of HBO(5)4, HBO(6)4, ..., HBO(14)4.]
Figure 1: Regions of absolute stability of HBO(5-14)4.
Table 2: For order p, scaled abscissae of absolute stability, α, for HBO(p)4 and ABM(p, p − 1).
 p     α/5, HBO(p)4     α/2, ABM(p, p − 1)
 5        −0.50              −0.70
 6        −0.42              −0.52
 7        −0.39              −0.39
 8        −0.37              −0.30
 9        −0.36              −0.22
10        −0.30              −0.17
11        −0.25              −0.13
12        −0.21              −0.11
13        −0.17              −0.03
14        −0.14
Table 3: For given order p, the table lists the principal local truncation coefficients (PLTC), δ, of two- to six-step HBO(p), p = 5, . . . , 14, and the scaled norm ‖PLTC‖2 for HBO(5-14)4 and ABM(5-13).
 p     δ                  5 × ‖PLTC‖2, HBO(p)4     2 × ‖PLTC‖2, ABM(p, p − 1)
 5     −3/2877649              5.20e-06                  2.44e-01
 6     −2/3034349              3.30e-06                  2.18e-02
 7     −3/7129880              2.11e-06                  2.00e-01
 8     −1/4800736              1.04e-06                  1.86e-01
 9     −1/8689239              5.75e-07                  1.75e-01
10     −1/18241984             2.74e-07                  1.65e-01
11     −1/34609018             1.45e-07                  1.57e-01
12     −1/73241643             6.80e-08                  1.51e-01
13     −1/141993369            3.52e-08                  1.45e-01
14     −1/300535179            1.66e-08
9
Controlling stepsize and order
A variant of the procedure described in [26] is used to control the stepsize, hn+1 , and order, p, of
VSVO HBO(5-14)4. For simplicity, in this section the order of the step control predictor P5 will be
denoted by q = p − 2.
• The program computes the maximum norm
\[
E_q = \|y_n - \tilde y_{n,q}\|_\infty,
\]
where $\tilde y_{n,q} := \tilde y_n$ is the value obtained by the step control predictor P5.
• The stepsize $h_{n+1}$ is obtained by the formula (see [13]):
\[
h_{n+1} = \min\left\{ h_{\max},\ \beta\,h_n\left(\frac{\text{tolerance}}{E_q}\right)^{1/\kappa},\ 4\,h_n \right\}, \tag{53}
\]
with κ = p − 1 and safety factor β = 0.81.
• The coefficients of integration formula IF, predictors P2 , P3 , P4 and step control predictor P5
are obtained successively as fast solutions of the linear systems (15), (17), (19), (21) and (23).
• The step to xn+1 is accepted if Eq ≤ tolerance, else it is rejected and the program returns to
the previous step with smaller step 0.7 hn+1 .
• If the step to $x_{n+1}$ is successful, besides P5, three other order and stepsize control predictors of order ρ = q ± 1 and ρ = q − 2 similar to P5,
\[
\tilde y_{n+1} = y_n + h_{n+1}\left[\sum_{j=1}^{3} a_{5j}\,f_{n+c_j} + a_{54}\,f_{n+1} + \sum_{j=1}^{\lfloor(\rho-1)/2\rfloor}\beta_{5j}\,f_{n-j}\right] + h_{n+1}^2\left[\sum_{j=0}^{\lfloor(\rho-2)/2\rfloor}\gamma_{5j}\,f'_{n-j}\right],
\]
are used to produce the three values $\tilde y_{n+1,\rho}$ to control the order and stepsize by means of the following three maximum norms:
\[
E_{q\pm1} = \|y_{n+1} - \tilde y_{n+1,q\pm1}\|_\infty, \qquad E_{q-2} = \|y_{n+1} - \tilde y_{n+1,q-2}\|_\infty,
\]
which estimate the local error at $x_{n+1}$ had the step to $x_{n+1}$ been taken at orders q ± 1 and q − 2, respectively.
To choose the lowest satisfactory order, the following rules are used:
(a) The order is lowered if
Eq−1 ≤ min{Eq , Eq+1 } or Eq ≥ max{Eq−1 , Eq−2 }.
(b) The order is raised only if the following stronger conditions,
Eq+1 < Eq < max{Eq−1 , Eq−2 },
are satisfied.
(c) When the order q of P5 is 12, Eq+1 is not available; thus, the order is lowered if
Eq ≥ max{Eq−1 , Eq−2 }.
(d) When q = 2, the order is raised only if
Eq+1 < Eq .
• After selecting the order, κ and Eq are reassigned accordingly. For example, if the order is
to be lowered in the next step, κnew = κold − 1 and Eq = Eq−1 . The stepsize hn+1 is then
controlled by formula (53).
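The control logic of this section can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' program: the function names are hypothetical, the error norms are passed in as plain numbers, an estimate that is unavailable is passed as None, and the lowering test is applied before the raising test, which is one reading of "the lowest satisfactory order".

```python
def next_stepsize(h_n, err, tol, p, h_max, beta=0.81):
    """Formula (53): min{h_max, beta*h_n*(tol/err)**(1/kappa), 4*h_n},
    with kappa = p - 1 and safety factor beta."""
    kappa = p - 1
    if err == 0.0:
        # Assumption: a zero error estimate leaves only the two caps active.
        return min(h_max, 4.0 * h_n)
    return min(h_max, beta * h_n * (tol / err) ** (1.0 / kappa), 4.0 * h_n)

def step_accepted(err, tol):
    """The step to x_{n+1} is accepted if E_q <= tolerance."""
    return err <= tol

def order_change(e_q, e_qm1=None, e_qm2=None, e_qp1=None):
    """Return -1, 0 or +1 following rules (a)-(d).

    e_q, e_qm1, e_qm2, e_qp1 stand for E_q, E_{q-1}, E_{q-2}, E_{q+1};
    None marks an estimate that is not available (top or bottom order).
    """
    at_top = e_qp1 is None
    at_bottom = e_qm1 is None or e_qm2 is None
    if at_bottom:
        # Rule (d): only raising is possible.
        return 1 if (not at_top and e_qp1 < e_q) else 0
    worst_lower = max(e_qm1, e_qm2)
    if e_q >= worst_lower:
        return -1                      # rule (a), second test; rule (c)
    if not at_top and e_qm1 <= min(e_q, e_qp1):
        return -1                      # rule (a), first test
    if not at_top and e_qp1 < e_q < worst_lower:
        return 1                       # rule (b)
    return 0
```

For example, with hn = 0.1, Eq = 1e-8, tolerance 1e-6 and p = 5, the unconstrained factor is 0.81 × 100^(1/4), so the new stepsize is about 0.256, well inside the cap 4 hn.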
10
Numerical results
The numerical performance of HBO(5-14)4 and Matlab’s ode113 is compared on the following
problems: Arenstorf’s orbits [1], the Brusselator and the Pleiades [12, pp. 244–249], the restricted
three-body problem [26, pp. 232–259], and the following nonstiff DETEST problems: growth problem B1 of two conflicting populations and two-body problems D1–D5 [13].
HBO(5-14)4 is started at x0 with the special one-step HBO(5)4S described in Appendix C. The
initial step size, h1 , is chosen by a method similar to steps (a) and (b) of [12, p. 169].
A comparison of HBO(5-14)4 with DP(8,7)13M, both in C++, is also made.
Computations were performed on a Mac with a dual 2.5 GHz PowerPC G5 and 4 GB DDR
SSRAM running under Mac OS X Version 10.4.2 and Matlab Version 7.0.4.352 (R14) Service Pack
2. Algorithm 4 in Appendix A was written in C and made into system-dependent Matlab mex files
for speed.
10.1
Comparison of HBO(5-14)4 in Matlab and Matlab’s ode113
The maximum global error, MGE, was obtained from the errors at every integration step. These
errors were calculated from the numerical value $y_{n+1}$ of HBO(5-14)4 and the “exact solution” obtained by Matlab’s ode113 with stringent tolerance $5\times10^{-14}$.
The CPU time was obtained from the curves which fit, in a least-squares sense, the data
$(\log_{10}(|\mathrm{MGE}|), \log_{10}(\mathrm{CPU}))$ using Matlab’s polyfit.
In Fig. 2, CPU (horizontal axis) is plotted versus log10 (|MGE|) (vertical axis) for the problems
in hand. It is seen from the figures that HBO(5-14)4, programmed in Matlab, compares favorably
with Matlab’s ode113 on the basis of CPU versus MGE.
The CPU percentage efficiency gain (CPU PEG) is defined by the formula (cf. Sharp [28]),
\[
(\text{CPU PEG})_i = 100\left(\frac{\sum_j \mathrm{CPU}_{2,ij}}{\sum_j \mathrm{CPU}_{1,ij}} - 1\right), \tag{54}
\]
where $\mathrm{CPU}_{1,ij}$ and $\mathrm{CPU}_{2,ij}$ are the CPU of methods 1 and 2, respectively, associated with problem
$i$, and $j = -\log_{10}(|\mathrm{MGE}|)$.
The CPU PEG for the problems in hand is listed in Table 4. One sees that HBO(5-14)4 often
performs better than ode113.
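The computation behind formula (54) can be sketched as follows. This is a minimal illustration with hypothetical function names: it assumes a straight-line least-squares fit of log10(CPU) against log10(|MGE|) (the degree passed to polyfit is not stated in the text) and evaluates the fitted CPU at chosen values of j = −log10(|MGE|).

```python
import math

def fit_line(xs, ys):
    """Least-squares straight line y = slope*x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

def cpu_at_digits(mge, cpu, digits):
    """Fitted CPU at each j in digits, where j = -log10(|MGE|)."""
    slope, intercept = fit_line([math.log10(abs(e)) for e in mge],
                                [math.log10(c) for c in cpu])
    return [10.0 ** (slope * (-j) + intercept) for j in digits]

def cpu_peg(cpu1, cpu2):
    """Formula (54): 100 * (sum_j CPU_2 / sum_j CPU_1 - 1) for one problem."""
    return 100.0 * (sum(cpu2) / sum(cpu1) - 1.0)
```

With this definition a positive value means that method 2 needs more CPU time than method 1; for instance, if method 2 takes twice as long at every accuracy level, the gain of method 1 is 100%.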
10.2
Comparison of HBO(5-14)4 in C++ with DP(8,7)13M in C++
Fig. 3 plots the CPU time of HBO(5-14)4 and DP(8,7)13M, both programmed in C++, against the
maximum global error (MGE) for the Brusselator and the cubic wave. The horizontal axis is CPU
for a given tolerance and the vertical axis is log10 (|MGE|).
[Figure omitted: nine panels, one for each of Arenstorf, Brusselator, Pleiades, Restr. 3-body, B1, D2, D3, D4 and D5.]
Figure 2: CPU (horizontal axis) versus log10 (|MGE|) (vertical axis) for nine problems in hand.
HBO(5-14)4: ◦; ode113: ·. Programs are in Matlab.
Table 4: CPU PEG of HBO(5-14)4 over ode113 for the listed problems. Programs are in Matlab.
Problem              CPU PEG
Arenstorf              32%
Brusselator            29%
Pleiades               22%
Restricted 3 body      19%
B1                      1%
D1                      0%
D2                     13%
D3                     27%
D4                     41%
D5                     65%
[Figure omitted: two panels, Brusselator and Cubic wave.]
Figure 3: CPU (horizontal axis) versus log10 (|MGE|) (vertical axis) for the Brusselator and the
cubic wave. VSVO HBO(5-14)4: ◦; DP(8,7): ·. Programs are in C++.
Table 5: CPU PEG of HBO(5-14)4 over DP(8,7)13M for the listed problems. Programs are in
C++.
Problem        CPU PEG
Brusselator      26%
Cubic wave      151%
The CPU PEG defined by formula (54) is listed in Table 5 for the Brusselator and the cubic
wave.
Similar to test results in [8], it is seen from Fig. 3 and Table 5 that, for the Brusselator and the
cubic wave problems, whose derivative evaluations are relatively expensive, the new VSVO HBO(5-14)4 wins over DP(8,7)13M at stringent tolerances.
11
Conclusion
A self-starting fast variable-step variable-order 4-stage Hermite–Birkhoff–Obrechkoff method of order 5 to 14 was constructed by solving Vandermonde-type systems satisfying multistep- and Runge–
Kutta-type order conditions. The stability regions of the HBO methods have a remarkably good
shape. The stepsize and order are controlled by four local error estimators. This method, in its vectorized Lagrange form, was tested on the Brusselator, Arenstorf’s orbits, the restricted three-body
problem, the Pleiades, and the following nonstiff DETEST problems: two-body problems of class
D and the growth problem of two conflicting populations of class B. The new method, when programmed in Matlab, was found generally to use less CPU time than Matlab’s ode113 at stringent
tolerances. When programmed in C++, HBO(5-14)4 wins over DP(8,7)13M on expensive problems
at stringent tolerance.
Acknowledgment
Thanks are due to Philip W. Sharp for helpful discussions and observations.
A
Algorithms
We recall from section 7 that Algorithms 1, 2 and 3 are used to construct the elementary matrix
functions $L^\ell_k$ and $U^\ell_k$, $\ell = 1, 2, 3, 4, 5$, only once; these algorithms are not needed at run time.
Algorithm 1. This algorithm constructs lower bidiagonal matrices $L_k$ as functions of
$c_2, c_3, c_4$ and $\eta_j$, $j = 2 : 6$.
For $k = 3 : m-1$, do the following iteration:
For $i = m : -1 : k+1$, do the following two steps:
Step (1) $L_k(i,i) = -M^\ell(i-1,k)/M^\ell(i,k)$.
Step (2) For $j = k : m$, compute:
$M^\ell(i,j) = M^\ell(i-1,j) + M^\ell(i,j)\,L_k(i,i)$.
Algorithm 2. This algorithm constructs upper bidiagonal matrices $U_k$ for $M^\ell$, $\ell = 1, 2, 3$,
as functions of $c_2, c_3, c_4$ and $\eta_j$, $j = 2 : 6$.
For $k = 2 : m-1$, do the following iteration:
For $j = m : -1 : k+1$, do the following two steps:
Step (1) $U_k(j,j) = 1/[M^\ell(k+1,j) - M^\ell(k+1,j-1)]$.
Step (2) For $i = k : j$, compute
$M^\ell(i,j) = (M^\ell(i,j) - M^\ell(i,j-1))\,U_k(j,j)$.
Algorithm 3. This algorithm constructs the first $m-3$ upper bidiagonal matrices $U_k$
for $M^4$ as functions of $c_2, c_3, c_4$ and $\eta_j$, $j = 2 : 6$.
For $k = 2 : m-2$, do the following two steps:
Step 1 For $j = m-1 : -1 : k+1$, do the following two steps:
Step 1.a $U_k(j,j) = 1/[M^4(k+1,j) - M^4(k+1,j-1)]$.
Step 1.b For $i = k : j$, compute
$M^4(i,j) = (M^4(i,j) - M^4(i,j-1))\,U_k(j,j)$.
Step 2 Do the following two steps:
Step 2.a $U_k(m,m) = 1/[M^4(k+1,m) - M^4(k+1,m-2)]$.
Step 2.b For $i = m-1 : m$, compute
$M^4(i,m) = (M^4(i,m) - M^4(i,m-2))\,U_{m-1}(m,m)$.
Algorithm 4. This algorithm overwrites $r = r(1 : m)$ with
$U_2\cdots U_{m-1}D^{-1}L_{m-1}L_{m-2}\cdots L_3\,r$ in $O(m^2)$ operations for IF, P2 and P3.
Given $[\eta_2, \eta_3, \ldots, \eta_6]$ and $r = r(1 : m)$, the following algorithm overwrites $r$ with
$U_2\cdots U_{m-1}D^{-1}L_{m-1}L_{m-2}\cdots L_3\,r$.
Step (1) The following iteration overwrites $r = r(1 : m)$ with $L_{m-1}L_{m-2}\cdots L_3\,r$:
for $k = 3 : m-1$, compute
$r(i) = r(i-1) + r(i)\,L_k(i,i)$, $\quad i = m : -1 : k+1$.
Step (2) The following iteration overwrites $r = r(1 : m)$ with $U_2U_3\cdots U_{m-1}D^{-1}r$:
$r(i) = r(i)/D(i,i)$, $\quad i = 1 : m$.
For $k = m-1 : -1 : 2$, compute
$r(i) = r(i)\,U_k(i,i)$, $\quad i = k+1 : m$,
$r(i) = r(i) - r(i+1)$, $\quad i = k : m-1$.
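The two steps of Algorithm 4 translate almost line by line into code. The sketch below is a direct transcription under stated assumptions: the diagonal entries Lk(i, i), Uk(i, i) and D(i, i) are taken as already computed and are supplied as nested dictionaries with the paper's 1-based indices, and the function name is hypothetical. It does not build the factors, which is the job of Algorithms 1 to 3.

```python
def apply_factors(r, L, U, D):
    """Return U_2 ... U_{m-1} D^{-1} L_{m-1} ... L_3 r, following Algorithm 4.

    r       : list of length m holding r(1), ..., r(m)
    L[k][i] : multiplier L_k(i, i), k = 3..m-1, i = k+1..m
    U[k][i] : divisor    U_k(i, i), k = 2..m-1, i = k+1..m
    D[i]    : diagonal entry D(i, i), i = 1..m
    """
    m = len(r)
    x = [0.0] + list(r)                 # x[1..m]: 1-based, as in the paper

    # Step (1): r <- L_{m-1} L_{m-2} ... L_3 r
    for k in range(3, m):               # k = 3 : m-1
        for i in range(m, k, -1):       # i = m : -1 : k+1
            x[i] = x[i - 1] + x[i] * L[k][i]

    # Step (2): r <- U_2 U_3 ... U_{m-1} D^{-1} r
    for i in range(1, m + 1):           # i = 1 : m
        x[i] = x[i] / D[i]
    for k in range(m - 1, 1, -1):       # k = m-1 : -1 : 2
        for i in range(k + 1, m + 1):   # i = k+1 : m
            x[i] = x[i] * U[k][i]
        for i in range(k, m):           # i = k : m-1
            x[i] = x[i] - x[i + 1]
    return x[1:]
```

Each of the two sweeps touches at most m entries for each of at most m values of k, which is where the O(m²) operation count comes from, and only the vector itself is stored.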
Algorithm 5. This algorithm overwrites $r = r(1 : m)$ with
$U_2\cdots U_{m-1}D^{-1}L_{m-1}L_{m-2}\cdots L_3\,r$ in $O(m^2)$ operations for P4.
Given $[\eta_2, \eta_3, \ldots, \eta_6]$ and $r = r(1 : m)$, the following algorithm overwrites $r$ with
$U_2\cdots U_{m-1}D^{-1}L_{m-1}L_{m-2}\cdots L_3\,r$.
Step (1) The following iteration overwrites $r = r(1 : m)$ with $L_{m-1}L_{m-2}\cdots L_3\,r$:
for $k = 3 : m-1$, compute
$r(i) = r(i-1) + r(i)\,L_k(i,i)$, $\quad i = m : -1 : k+1$.
Step (2) The following iteration overwrites $r = r(1 : m)$ with $U_2U_3\cdots U_{m-1}D^{-1}r$:
$r(i) = r(i)/D(i,i)$, $\quad i = 1 : m$.
$r(m) = r(m)\,U_{m-1}(m,m)$,
$r(m-1) = r(m-1) - r(m)$.
For $k = m-2 : -1 : 2$, compute
$r(i) = r(i)\,U_k(i,i)$, $\quad i = k+1 : m$,
$r(i) = r(i) - r(i+1)$, $\quad i = k : m-2$,
$r(m-2) = r(m-2) - r(m)$.
Algorithms 4 and 5 use minimum storage since the solution is obtained by successively transforming the right-hand side into the solution vector. This is an advantage compared to generating
m × m triangular matrices as an intermediate result at each integration step.
B
Matlab programming
This appendix is included for the benefit of Matlab users. Algorithms 4 and 5, which solve systems
IF, P2, P3 and P4, were programmed in C and compiled by the Matlab mex command into mex files,
say, IF.macmex, P2.macmex, P3.macmex and P4.macmex.
Algorithm 4, which solves the P5 system and three additional similar systems to produce four
local error estimators $\tilde y_{n+1,p-j}$, $j = 1, \ldots, 4$, of order $p - j$, was programmed in C and compiled by the
Matlab mex command into a mex file, say, P5.macmex.
At runtime, the data of the differential equations were input. Then, IF.macmex, P2.macmex, P3.macmex,
P4.macmex and P5.macmex were called and run to calculate the values of the coefficients of IF, P2,
P3, P4, P5 and three other step control predictors at each integration step until completion of
the integration. CPU time and the number of evaluations of $f(x, y)$ and $f'(x, y)$ for the runtime of
Algorithms 4 and 5 were recorded.
The option MGE can be run.
Matlab’s ode113 can be run with appropriate tolerance for comparison with HBO(5-14)4.
The elementary matrix functions $L^\ell_k$ and $U^\ell_k$, $\ell = 1, 2, 3, 4, 5$, are constructed by Algorithms 1, 2
and 3 as functions of $\eta_j$, for $j = 2, 3, \ldots, 6$. These algorithms are not needed at runtime since these
matrix functions are already implemented in the above Matlab mex files.
C
Self-starter one-step HBO(5)4S
A self-starter one-step HBO(5)4S of order 5 requires three predictors, P2 , P3 and P4 , an integration
formula, IF, and a step control predictor, P5 , to perform the integration step at x0 and, possibly,
in case of discontinuity at xn .
(P2) A Hermite–Birkhoff polynomial of degree 2 is used as predictor P2 to obtain $y_{n+c_2}$ to order 2,
\[
y_{n+c_2} = y_n + h_{n+1}\,a_{21}\,f_{n+c_1} + h_{n+1}^2\,\gamma_{20}\,f'_n. \tag{55}
\]
(P3) A Hermite–Birkhoff polynomial of degree 3 is used as predictor P3 to obtain $y_{n+c_3}$ to order 2,
\[
y_{n+c_3} = y_n + h_{n+1}\left(\sum_{j=1}^{2} a_{3j}\,f_{n+c_j}\right) + h_{n+1}^2\,\gamma_{30}\,f'_n. \tag{56}
\]
(P4) A Hermite–Birkhoff polynomial of degree 4 is used as predictor P4 to obtain $y_{n+c_4}$ to order 2,
\[
y_{n+c_4} = y_n + h_{n+1}\left(\sum_{j=1}^{3} a_{4j}\,f_{n+c_j}\right) + h_{n+1}^2\,\gamma_{40}\,f'_n. \tag{57}
\]
(IF) A Hermite–Birkhoff polynomial of degree 5 is used as integration formula IF to obtain $y_{n+1}$ to order 5,
\[
y_{n+1} = y_n + h_{n+1}\left(\sum_{j=1}^{4} b_{1j}\,f_{n+c_j}\right) + h_{n+1}^2\,\gamma_{10}\,f'_n. \tag{58}
\]
(P5) A Hermite–Birkhoff polynomial of degree 5 is used as step control predictor P5 to obtain $\tilde y_{n+1}$ to order 2,
\[
\tilde y_{n+1} = y_n + h_{n+1}\left(\sum_{j=1}^{3} a_{5j}\,f_{n+c_j} + a_{54}\,f_{n+1}\right) + h_{n+1}^2\,\gamma_{50}\,f'_n. \tag{59}
\]
The coefficients of integration formula IF, predictors P2, P3, P4 and step control predictor P5 of
HBO(5)4S are obtained successively as solutions of systems similar to (15), (17), (19), (21) and (23),
respectively, with the $c_i$ as in (40), by deleting the columns containing $\eta_j$ and the corresponding rows,
since HBO(5)4S does not have the backstep parts. A one-step HBO(5)4S of order 5 is obtained
which generalizes the 3/8-rule RK4,
\[
\begin{aligned}
y_{n+\frac13} &= y_n + \frac{h_{n+1}}{3}\,f_n + \frac{h_{n+1}^2}{18}\,f'_n,\\
y_{n+\frac23} &= y_n + h_{n+1}\left(f_{n+\frac13} - \frac13 f_n\right) - \frac{h_{n+1}^2}{9}\,f'_n,\\
\hat y_{n+1} &= y_n + h_{n+1}\left(\frac{18}{13} f_{n+\frac23} - \frac{36}{13} f_{n+\frac13} + \frac{31}{13} f_n\right) + \frac{h_{n+1}^2}{2}\,f'_n,\\
y_{n+1} &= y_n + h_{n+1}\left(\frac{13}{120}\hat f_{n+1} + \frac{9}{20} f_{n+\frac23} + \frac{9}{40} f_{n+\frac13} + \frac{13}{60} f_n\right) + \frac{h_{n+1}^2}{60}\,f'_n,\\
\tilde y_{n+1} &= y_n + h_{n+1}\left(\frac{1}{12} f_{n+1} + \frac{479}{1000} f_{n+\frac23} + \frac{21}{100} f_{n+\frac13} + \frac{683}{3000} f_n\right) + h_{n+1}^2\,\frac{41}{1500}\,f'_n.
\end{aligned} \tag{60}
\]
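A single step of the starter can be sketched as follows. This is a minimal scalar illustration under stated assumptions: the coefficients of (60) are read as 1/3 and 1/18 for the first stage; 1, −1/3 and −1/9 for the second; 18/13, −36/13, 31/13 and 1/2 for the predicted end value; 13/120, 9/20, 9/40, 13/60 and 1/60 for IF; and 1/12, 479/1000, 21/100, 683/3000 and 41/1500 for P5. The function name and the argument df, which must return the total derivative f' = y'' along the solution, are hypothetical.

```python
import math

def hbo5_4s_step(f, df, x, y, h):
    """One step of the one-step starter, scalar case, as read from (60).

    f(x, y)  : right-hand side y' = f(x, y)
    df(x, y) : total derivative f' = y'' (assumed supplied by the user)
    Returns (y_new, err), with err = |y_new - y_tilde| from predictor P5.
    """
    f0 = f(x, y)
    d0 = df(x, y)
    # P2: value at x + h/3
    y13 = y + h * f0 / 3 + h * h * d0 / 18
    f13 = f(x + h / 3, y13)
    # P3: value at x + 2h/3
    y23 = y + h * (f13 - f0 / 3) - h * h * d0 / 9
    f23 = f(x + 2 * h / 3, y23)
    # P4: predicted value at x + h
    y_hat = y + h * (18 * f23 - 36 * f13 + 31 * f0) / 13 + h * h * d0 / 2
    f_hat = f(x + h, y_hat)
    # IF: integration formula
    y_new = (y + h * (13 * f_hat / 120 + 9 * f23 / 20
                      + 9 * f13 / 40 + 13 * f0 / 60)
             + h * h * d0 / 60)
    f_new = f(x + h, y_new)
    # P5: step control predictor
    y_tilde = (y + h * (f_new / 12 + 479 * f23 / 1000
                        + 21 * f13 / 100 + 683 * f0 / 3000)
               + 41 * h * h * d0 / 1500)
    return y_new, abs(y_new - y_tilde)
```

With this reading, one step applied to y' = y (so that f' = y as well) reproduces the degree-5 Taylor polynomial of e^h, which is consistent with the stated order 5; the difference between y_new and y_tilde then plays the role of the error estimate Eq.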
The region of absolute stability of HBO(5)4S is shown in grey in Fig. 4 with abscissa of absolute
stability α = −3.18.
The principal error term of HBO(5)4S is
\[
\Bigl[\delta_1\{f^5\} + 10\delta_2\{\{f^2\}f^2\} + 5\delta_3\{\{f^3\}f\} + 5\delta_4\{\{{}_2f^2\}_2 f\} + \delta_5\{{}_2f^4\}_2 + 4\delta_6\{{}_2\{f^2\}f\}_2 + \delta_7\{{}_3f^3\}_3 + \delta_8\{{}_4f^2\}_4\Bigr]\,h^6, \tag{61}
\]
where {·} and {·}j are elementary differentials defined in [6], [17] and [12]. The principal local
truncation coefficients of the principal error term of HBO(5)4S are
¸
·
−5
−5
1
1
1
−1
, 0,
,
,
, 0,
,
[δ1 , 10δ2 , 5δ3 , 5δ4 , δ5 , 4δ6 , δ7 , δ8 ] =
64800
10800 3600 12960
2160 720
with $\ell_2$-norm 2.07e-03.
[Figure omitted: one panel, HBO(5)4S.]
Figure 4: Region of absolute stability of starting HBO(5)4S.
References
[1] R. F. Arenstorf, Periodic solutions of the restricted three-body problem representing analytic
continuations of Keplerian elliptic motions, Amer. J. Math., LXXXV (1963), 27–35.
[2] R. Ashino, M. Nagase and R. Vaillancourt, Behind and beyond the Matlab ODE Suite, Comput.
& Math. with Applics., 40 (2000), 491–512.
[3] R. Barrio, F. Blesa and M. Lara, VSVO formulation of the Taylor method for the numerical
solution of ODEs, Comput. & Math. with Applics., 50 (2005), 93–111.
[4] A. Björck and T. Elfving, Algorithms for confluent Vandermonde systems, Numer. Math., 21
(1973), 130–137.
[5] A. Björck and V. Pereyra, Solution of Vandermonde systems of equations, Math. Comp., 24
(1970), 893–903.
[6] J. C. Butcher, Coefficients for the study of Runge–Kutta integration processes, J. Aust. Math.
Soc., 3 (1963), 185–201.
[7] J. C. Butcher, A modified multistep method for the numerical integration of ordinary differential equations, J. Assoc. Comput. Mach., 12 (1965), 124–135.
[8] W. H. Enright and T. E. Hull, The test results on initial value methods for non-stiff ordinary
differential equations, SIAM J. Numer. Anal., 13 (1976), 944–961.
[9] G. Galimberti and V. Pereyra, Solving confluent Vandermonde systems of Hermite type, Numer. Math., 18 (1971), 44–60.
[10] C. W. Gear, The numerical integration of ordinary differential equations, Math. Comp., 21
(1967), 146–156.
[11] C. W. Gear, Numerical Initial Value Problems in Ordinary Differential Equations, Prentice-Hall, Englewood Cliffs, NJ, 1971.
[12] E. Hairer, S. P. Nørsett and G. Wanner, Solving Ordinary Differential Equations I. Nonstiff
Problems, Section III.8, Springer-Verlag, Berlin, 1993.
[13] T. E. Hull, W. H. Enright, B. M. Fellen, and A. E. Sedgwick, Comparing numerical methods
for ordinary differential equations, SIAM J. Numer. Anal., 9 (1972), 603–637.
[14] T. Y. Huang and K. Innanen, A survey of multiderivative multistep integrators, Astronomical
J., 112(3) (1996), 1254–1262.
[15] F. T. Krogh, VODQ/SVDQ/DVDQ—variable order integrators for the numerical solution of
ordinary differential equations, TU Doc. No. CP-2308, NPO-11643, May 1969, Jet Propulsion
Laboratory, Pasadena, CA.
[16] F. T. Krogh, Changing stepsize in the integration of differential equations using modified
divided differences, in Proc. Conf. on the Numerical Solution of Ordinary Differential Equations,
University of Texas at Austin 1972 (Ed. D.G. Bettis), Lecture Notes in Mathematics No. 362,
Springer-Verlag, Berlin, 22–71, 1974.
[17] J. D. Lambert, Computational Methods in Ordinary Differential Equations, Ch. 5, Wiley,
London, 1973.
[18] W. E. Milne, A note on the numerical integration of differential equations, J. Res. Nat. Bur.
Standards, 43 (1949), 537–542.
[19] T. Nguyen-Ba and R. Vaillancourt, Hermite–Birkhoff differential equation solvers, Scientific
Proceedings of Riga Technical University, 5-th series: Computer Science, 46-th thematic issue,
21 (2004), 47–64.
[20] T. Nguyen-Ba and R. Vaillancourt, Hermite–Birkhoff–Obrechkoff 3-stage ODE Solver of order
14, Can. Appl. Math. Quarterly, 13(2) (Summer 2005), 171–201.
[21] T. Nguyen-Ba, H. Yagoub, Y. Li and R. Vaillancourt, Variable-step variable-order 3-stage
Hermite–Birkhoff ODE Solver of order 5 to 15, Can. Appl. Math. Quarterly. In press.
[22] T. Nguyen-Ba, H. Yagoub, Y. Zhang and R. Vaillancourt, Variable-step variable-order 3-stage
Hermite–Birkhoff–Obrechkoff ODE Solver of order 4 to 14, submitted to the Can. Appl. Math.
Quarterly.
[23] N. Obrechkoff, Neue Quadraturformeln, Abh. Preuss. Akad. Wiss. Math. Nat. Kl., No. 4,
(1940), 1–20.
[24] P. J. Prince and J. R. Dormand, High order embedded Runge–Kutta formulae, J. Comput.
Appl. Math., 7(1) (1981), pp. 67–75.
[25] E. Rabe, Determination and survey of periodic Trojan orbits in the restricted problem of three
bodies, Astronomical J., 66(9) (November 1961) 500–513.
[26] L. F. Shampine and M. K. Gordon, Computer Solution of Ordinary Differential Equations:
The Initial Value Problem, Freeman, San Francisco, CA, 1975.
[27] L. F. Shampine and M. W. Reichelt, The Matlab ODE suite, SIAM J. Sc. Comp., 18(1) (1997),
1–22.
[28] P. W. Sharp, Numerical comparison of explicit Runge–Kutta pairs of orders four through eight,
Trans. on Mathematical Software, 17 (1991), 387–409.