CRM-3227 - Centre de recherches mathématiques

Variable-step variable-order 4-stage Hermite–Birkhoff–Obrechkoff ODE Solver of order 5 to 14∗

Truong Nguyen-Ba†‡, Steven J. Desjardins†¶, Hemza Yagoub†§, Rémi Vaillancourt†‖

CRM-3227, September 2006

∗ This work was supported in part by the Natural Sciences and Engineering Research Council of Canada and the Centre de recherches mathématiques of the Université de Montréal.
† Department of Mathematics and Statistics, University of Ottawa, Ottawa, Ontario, Canada K1N 6N5
‡ tnguyen@mathstat.uottawa.ca
§ hy.emails@gmail.com
¶ desjards@mathstat.uottawa.ca
‖ remi@uottawa.ca

Abstract

Self-starting variable-step variable-order 4-stage Hermite–Birkhoff–Obrechkoff methods of order 5 to 14, denoted by HBO(5-14)4, are constructed for solving nonstiff systems of first-order differential equations of the form $y' = f(x, y)$, $y(x_0) = y_0$. The methods use $y'$ and $y''$ as in Obrechkoff's methods. Forcing a Taylor expansion of the numerical solution to agree with an expansion of the true solution leads to multistep- and Runge–Kutta-type order conditions which are reorganized into Vandermonde-type linear systems. Fast algorithms are developed for solving these systems to obtain Hermite–Birkhoff interpolation polynomials in terms of generalized Lagrange basis functions. The new methods have larger regions of absolute stability than 3-stage Hermite–Birkhoff–Obrechkoff methods of comparable orders studied earlier and Adams–Bashforth–Moulton methods of comparable orders in PECE mode. The stability regions of the HB methods have a remarkably good shape. The order and stepsize of these methods are controlled by four local error estimators. HBO(5-14)4 is superior to Matlab's ode113 in solving several problems often used to test higher-order ODE solvers on the basis of the number of steps, CPU time, and maximum global error. When programmed in C++, HBO(p)4 uses less CPU time than Dormand–Prince DP(8,7)13M in solving costly problems at stringent tolerance.

Mathematical Subject Classification.
Primary 65L06; Secondary 65D05, 65D30

Keywords and phrases. general linear method for higher-order ODEs, Hermite–Birkhoff method, Obrechkoff method, Vandermonde-type systems, maximum global error, number of function evaluations, CPU time, Matlab's ode113, comparing ODE solvers.

Résumé

Self-starting variable-step variable-order 4-stage Hermite–Birkhoff–Obrechkoff methods of order 5 to 14, denoted by HBO(5-14)4, are constructed for solving nonstiff systems of first-order differential equations of the form $y' = f(x, y)$, $y(x_0) = y_0$. The methods use $y'$ and $y''$ as in Obrechkoff's methods. Matching the truncated Taylor expansions of the exact solution and of the numerical solution yields multistep- and Runge–Kutta-type order conditions, which are reorganized into Vandermonde-type linear systems. These systems are solved in $O(p^2)$ operations by new fast algorithms, which give Hermite–Birkhoff interpolation polynomials on a basis of generalized Lagrange functions. At comparable order, the regions of absolute stability of HBO(5-14)4, which have a remarkable shape, are larger than those of HBO(5-15)3 and those of Adams–Bashforth–Moulton in PECE mode. The order and the stepsize are controlled by means of 4 local error estimators. HBO(5-14)4 is superior to Matlab's ode113 in solving several problems often used to test high-order solvers, on the basis of the number of steps, CPU time, and maximum global error. In C++, HBO(p)4 is faster than Dormand–Prince DP(8,7)13M in solving costly problems at stringent tolerance.

1 Introduction

Variable-order explicit multistep Obrechkoff methods [23] and a 4-stage Runge–Kutta method of order 4 are cast into variable-step, variable-order (VSVO) 4-stage Hermite–Birkhoff–Obrechkoff methods of order 5 to 14, named HBO(5-14)4.
The method's name was chosen because, like Obrechkoff methods, it uses Hermite–Birkhoff interpolation polynomials together with $y'$ and $y''$ at step points for solving $y' = f(x, y)$. The link between the two types of methods is that values at off-step points are obtained by means of predictors which use values at previous points. A special one-step HBO(5)4S of order 5 is used to make HBO(5-14)4 self-starting.

Milne [18] was perhaps the first to have advocated the use of multiderivative, multistep Obrechkoff formulae for the numerical solution of differential equations. More recently, Huang and Innanen [14] introduced a new form of the classical Adams–Cowell methods and multiderivative, multistep methods, some of which have a larger stability interval and a smaller local truncation error than classical multistep methods.

Scientific computation widely uses VSVO Adams–Bashforth–Moulton multistep methods of orders 1 to 14 as implemented by Gear [10], [11] and Krogh [16], and, up to order 13, by Shampine [26] in Matlab's ode113 ([2], [27]). The codes DVDQ of Krogh [15] and DIFSUB of Gear [11] prompted the recognition of the effectiveness of a VSVO formulation of Adams methods. When the equation is expensive to evaluate, high-order solvers appear to be more efficient than lower-order ones.

The new HBO(5-14)4 are designed for solving nonstiff systems of first-order initial value problems of the form
\[
y' = f(x, y), \qquad y(x_0) = y_0, \qquad \text{where } {}' = \frac{d}{dx}, \tag{1}
\]
where the derivative $y''$ can be obtained analytically or recursively. There are many such problems, for instance in dynamical systems [3], [25].

Forcing a Taylor expansion of the numerical solution to agree with an expansion of the true solution leads to a combination of multistep- and Runge–Kutta-type order conditions which are reorganized into linear Vandermonde-type systems. The solutions of these systems are obtained as generalized Lagrange basis functions by new fast algorithms.
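Since the methods need $y''$ in addition to $y'$, it may help to recall how $y''$ follows from (1) by the chain rule, $y'' = f_x + f_y f$. The short sketch below is illustrative only: the scalar test problem $y' = -2xy$, $y(0) = 1$, and the function names are assumptions made for the example and are not taken from the paper.

```python
import math

def f(x, y):
    """Right-hand side of the illustrative problem y' = -2*x*y (an assumption)."""
    return -2.0 * x * y

def second_derivative(x, y):
    """y'' = f_x + f_y * f, with the partial derivatives worked out by hand."""
    f_x = -2.0 * y          # partial derivative of f with respect to x
    f_y = -2.0 * x          # partial derivative of f with respect to y
    return f_x + f_y * f(x, y)

def exact(x):
    """Exact solution y = exp(-x^2) of the illustrative problem with y(0) = 1."""
    return math.exp(-x * x)

x = 0.7
y = exact(x)
# Differentiating y = exp(-x^2) twice gives y'' = (4*x^2 - 2) * y.
ypp_reference = (4.0 * x * x - 2.0) * y
ypp_chain_rule = second_derivative(x, y)
```

For systems, $f_y$ becomes a Jacobian matrix and the product $f_y f$ a matrix-vector product; the scalar case is shown only to keep the example short.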
HBO(5-14)4 have smaller error constants and larger scaled intervals of absolute stability than Adams–Bashforth–Moulton methods of comparable orders in PECE mode for $p > 7$. They also have larger intervals of absolute stability than 3-stage Hermite–Birkhoff–Obrechkoff methods, HBO(4-14)3, of comparable orders studied earlier [22].

The performance of HBO(5-14)4, programmed in Matlab, and Matlab's ode113 was compared on several problems often used to test higher-order ODE solvers. It was seen that HBO(5-14)4 requires fewer steps, uses less CPU time, and has higher accuracy than ode113. The performance of HBO(5-14)4 and DP(8,7)13M [24], both programmed in C++, was also compared on these problems and it was seen that HBO(5-14)4 uses less CPU time in solving costly equations at stringent tolerance.

Section 2 introduces general VSVO HBO(5-14)4 of order 5 to 14. Order conditions are listed in Section 3. In Section 4, this general HBO(5-14)4 is represented in terms of Vandermonde-type systems. In Section 5, new symbolic algorithms are derived to diagonalize the coefficient matrices of these systems as functions of the parameters of the systems. Section 6 defines a particular VSVO HBO(5-14)4. In Section 7, a family of particular VSVO HBO(5-14)4 is constructed by a fast solution of the systems. Section 8 considers the regions of absolute stability of constant-step HBO(5-14)4 and their principal local truncation coefficients. Section 9 deals with the step and order control. In Section 10, three criteria are used to compare the performance of the methods considered in this paper. Appendix A lists the algorithms and Appendix B describes the Matlab programming. The one-step HBO(5)4S is described in Appendix C.

2 General VSVO HBO(5-14)4

A Hermite–Birkhoff–Obrechkoff (HBO) method is said to be a general VSVO HBO method if its backstep and off-step points are variable parameters. If the off-step points are fixed, the method is said to be a particular VSVO method.
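A general HBO($p$)4 of order $p$ requires three predictors, P$_2$, P$_3$ and P$_4$, an integration formula, IF, and a step control predictor, P$_5$, to perform the integration step from $x_n$ to $x_{n+1}$. For notational simplicity, $c_1 = 0$ is used in the summations. The "floor" of a real number $q$, denoted by $\lfloor q \rfloor$, is the largest integer smaller than or equal to $q$.

(P$_2$) A Hermite–Birkhoff polynomial of degree $(p-2)$ is used as predictor P$_2$ to obtain $y_{n+c_2}$ to order $(p-2)$,
\[
y_{n+c_2} = y_n + h_{n+1}\Bigl( a_{21} f_{n+c_1} + \sum_{j=1}^{\lfloor (p-3)/2 \rfloor} \beta_{2j} f_{n-j} \Bigr) + h_{n+1}^2 \Bigl( \sum_{j=0}^{\lfloor (p-4)/2 \rfloor} \gamma_{2j} f'_{n-j} \Bigr). \tag{2}
\]

(P$_3$) A Hermite–Birkhoff polynomial of degree $(p-1)$ is used as predictor P$_3$ to obtain $y_{n+c_3}$ to order $(p-2)$,
\[
y_{n+c_3} = y_n + h_{n+1}\Bigl( \sum_{j=1}^{2} a_{3j} f_{n+c_j} + \sum_{j=1}^{\lfloor (p-3)/2 \rfloor} \beta_{3j} f_{n-j} \Bigr) + h_{n+1}^2 \Bigl( \sum_{j=0}^{\lfloor (p-4)/2 \rfloor} \gamma_{3j} f'_{n-j} \Bigr). \tag{3}
\]

(P$_4$) A Hermite–Birkhoff polynomial of degree $p$ is used as predictor P$_4$ to obtain $y_{n+c_4}$ to order $(p-2)$,
\[
y_{n+c_4} = y_n + h_{n+1}\Bigl( \sum_{j=1}^{3} a_{4j} f_{n+c_j} + \sum_{j=1}^{\lfloor (p-3)/2 \rfloor} \beta_{4j} f_{n-j} \Bigr) + h_{n+1}^2 \Bigl( \sum_{j=0}^{\lfloor (p-4)/2 \rfloor} \gamma_{4j} f'_{n-j} \Bigr). \tag{4}
\]

(IF) A Hermite–Birkhoff polynomial of degree $(p+1)$ is used as integration formula IF to obtain $y_{n+1}$ to order $p$,
\[
y_{n+1} = y_n + h_{n+1}\Bigl( \sum_{j=1}^{4} b_{1j} f_{n+c_j} + \sum_{j=1}^{\lfloor (p-3)/2 \rfloor} \beta_{1j} f_{n-j} \Bigr) + h_{n+1}^2 \Bigl( \sum_{j=0}^{\lfloor (p-4)/2 \rfloor} \gamma_{1j} f'_{n-j} \Bigr). \tag{5}
\]

(P$_5$) A Hermite–Birkhoff polynomial of degree $(p+1)$ is used as step control predictor P$_5$ to obtain $\tilde y_{n+1}$ to order $(p-2)$,
\[
\tilde y_{n+1} = y_n + h_{n+1}\Bigl( \sum_{j=1}^{3} a_{5j} f_{n+c_j} + a_{54} f_{n+1} + \sum_{j=1}^{\lfloor (p-3)/2 \rfloor} \beta_{5j} f_{n-j} \Bigr) + h_{n+1}^2 \Bigl( \sum_{j=0}^{\lfloor (p-4)/2 \rfloor} \gamma_{5j} f'_{n-j} \Bigr). \tag{6}
\]

We note that $f'_{n+1}$ is computed only once at $x_{n+1}$.

3 Order conditions for general HBO(p)4

As in similar searches for ODE solvers [7], [19], [20], [21], we impose the following simplifying assumptions on HBO($p$)4:
\[
\sum_{j=1}^{i-1} a_{ij} c_j^k + k!\, B_i(k+1) = \frac{c_i^{k+1}}{k+1}, \qquad i = 2, 3, 4, \quad k = 0, 1, 2, \ldots, p-3, \tag{7}
\]
where
\[
B_i(j) = \sum_{\ell=1}^{\lfloor (p-3)/2 \rfloor} \beta_{i\ell} \Bigl[ \frac{\eta_{\ell+1}^{j-1}}{(j-1)!} \Bigr] + \sum_{\ell=1}^{\lfloor (p-4)/2 \rfloor} \gamma_{i\ell} \Bigl[ \frac{\eta_{\ell+1}^{j-2}}{(j-2)!} \Bigr], \qquad i = 2, 3, 4, \quad j = 0, 1, 2, \ldots, p, \tag{8}
\]
and
\[
\eta_j = -\frac{1}{h_{n+1}} (x_n - x_{n+1-j}) = -\frac{1}{h_{n+1}} \sum_{i=0}^{j-2} h_{n-i}, \qquad j = 2, 3, \ldots, 6. \tag{9}
\]
Equation (9) will be frequently used in this paper without further reference.

There remain four sets of equations to be solved:
\[
\sum_{i=1}^{4} b_{1i} c_i^k + k!\, B_1(k+1) = \frac{1}{k+1}, \qquad k = 0, 1, \ldots, p, \tag{10}
\]
\[
\sum_{i=2}^{4} b_{1i} \Bigl[ \sum_{j=1}^{i-1} a_{ij} \frac{c_j^{p-2}}{(p-2)!} + B_i(p-1) \Bigr] + B_1(p) = \frac{1}{p!}, \tag{11}
\]
\[
\sum_{i=2}^{4} b_{1i} \frac{c_i}{p} \Bigl[ \sum_{j=1}^{i-1} a_{ij} \frac{c_j^{p-2}}{(p-2)!} + B_i(p-1) \Bigr] + B_1(p+1) = \frac{1}{(p+1)!}, \tag{12}
\]
\[
\sum_{i=2}^{4} b_{1i} \Bigl[ \sum_{j=1}^{i-1} a_{ij} \Bigl( \sum_{k=1}^{j-1} a_{jk} \frac{c_k^{p-2}}{(p-2)!} + B_j(p-1) \Bigr) + B_i(p) \Bigr] + B_1(p+1) = \frac{1}{(p+1)!}, \tag{13}
\]
where
\[
B_1(j) = \sum_{i=1}^{\lfloor (p-3)/2 \rfloor} \beta_{1i} \frac{\eta_{i+1}^{j-1}}{(j-1)!} + \sum_{i=1}^{\lfloor (p-4)/2 \rfloor} \gamma_{1i} \frac{\eta_{i+1}^{j-2}}{(j-2)!}, \qquad j = 1, 2, \ldots, p+1. \tag{14}
\]
We note that equations (10), for $k = 0, 1, \ldots, p-2$, are multistep-type order conditions. On the other hand, equation (10) for $k = p-1, p$, and equations (11) to (13) are Runge–Kutta-type order conditions. The numbers $B_1(k)$, $B_2(k)$, $B_3(k)$ and $B_4(k)$ are associated with IF, P$_2$, P$_3$ and P$_4$, respectively.

4 Vandermonde-type formulation of general HBO(p)4

4.1 Integration formula IF

The $(p+1)$-vector of the reordered coefficients of integration formula IF in (5),
\[
u^1 = [b_{11}, \gamma_{10}, b_{14}, b_{13}, b_{12}, \beta_{11}, \gamma_{11}, \beta_{12}, \gamma_{12}, \beta_{13}, \gamma_{13}, \ldots, \beta_{1,\lfloor (p-3)/2 \rfloor}, \gamma_{1,\lfloor (p-4)/2 \rfloor}]^T,
\]
is the solution of the Vandermonde-type system of order conditions
\[
M^1 u^1 = r^1, \tag{15}
\]
where
\[
M^1 = \begin{pmatrix}
1 & 0 & 1 & 1 & 1 & 1 & 0 & \cdots & 1 & 0 \\
0 & 1 & c_4 & c_3 & c_2 & \eta_2 & 1 & \cdots & \eta_{\lfloor (p-3)/2 \rfloor+1} & 1 \\
0 & 0 & \frac{c_4^2}{2!} & \frac{c_3^2}{2!} & \frac{c_2^2}{2!} & \frac{\eta_2^2}{2!} & \eta_2 & \cdots & \frac{\eta_{\lfloor (p-3)/2 \rfloor+1}^2}{2!} & \eta_{\lfloor (p-4)/2 \rfloor+1} \\
\vdots & & & & & & & & & \vdots \\
0 & 0 & \frac{c_4^{p}}{p!} & \frac{c_3^{p}}{p!} & \frac{c_2^{p}}{p!} & \frac{\eta_2^{p}}{p!} & \frac{\eta_2^{p-1}}{(p-1)!} & \cdots & \frac{\eta_{\lfloor (p-3)/2 \rfloor+1}^{p}}{p!} & \frac{\eta_{\lfloor (p-4)/2 \rfloor+1}^{p-1}}{(p-1)!}
\end{pmatrix} \tag{16}
\]
and $r^1 = r_1(1 : p+1)$ has components $r_1(i) = 1/i!$, $i = 1, 2, \ldots, p+1$. The leading error term of IF is
\[
\Bigl[ b_{14} \frac{c_4^{p+1}}{(p+1)!} + b_{13} \frac{c_3^{p+1}}{(p+1)!} + b_{12} \frac{c_2^{p+1}}{(p+1)!} + \sum_{j=1}^{\lfloor (p-2)/2 \rfloor} \beta_{1j} \frac{\eta_{j+1}^{p+1}}{(p+1)!} + \sum_{j=1}^{\lfloor (p-3)/2 \rfloor} \gamma_{1j} \frac{\eta_{j+1}^{p}}{p!} - \frac{1}{(p+2)!} \Bigr] h_{n+1}^{p+2} y_n^{(p+2)}.
\]
The detailed structure of columns 6 to $p+1$ of $M^1 \in \mathbb{R}^{(p+1)\times(p+1)}$ is as follows:
\[
M^1(i, j) = \frac{\eta_{\lfloor (j-1)/2 \rfloor}^{i-1}}{(i-1)!}, \qquad
M^1(i, j+1) = \frac{d}{d\eta_{\lfloor (j-1)/2 \rfloor}} M^1(i, j), \qquad
i = 1, 2, \ldots, p+1, \quad j = 6, 8, \ldots, 2\lfloor (p+1)/2 \rfloor,
\]
provided $j + 1 \le p + 1$ in the second equation.

4.2 Predictor P2

The $(p-2)$-vector of the reordered coefficients of predictor P$_2$ in (2),
\[
u^2 = [a_{21}, \gamma_{20}, \beta_{21}, \gamma_{21}, \beta_{22}, \gamma_{22}, \beta_{23}, \gamma_{23}, \ldots, \beta_{2,\lfloor (p-3)/2 \rfloor}, \gamma_{2,\lfloor (p-4)/2 \rfloor}]^T,
\]
is the solution of the system of order conditions
\[
M^2 u^2 = r^2, \tag{17}
\]
where
\[
M^2 = \begin{pmatrix}
1 & 0 & 1 & 0 & \cdots & 1 & 0 \\
0 & 1 & \eta_2 & 1 & \cdots & \eta_{\lfloor (p-3)/2 \rfloor+1} & 1 \\
0 & 0 & \frac{\eta_2^2}{2!} & \eta_2 & \cdots & \frac{\eta_{\lfloor (p-3)/2 \rfloor+1}^2}{2!} & \eta_{\lfloor (p-4)/2 \rfloor+1} \\
\vdots & & & & & & \vdots \\
0 & 0 & \frac{\eta_2^{p-3}}{(p-3)!} & \frac{\eta_2^{p-4}}{(p-4)!} & \cdots & \frac{\eta_{\lfloor (p-3)/2 \rfloor+1}^{p-3}}{(p-3)!} & \frac{\eta_{\lfloor (p-4)/2 \rfloor+1}^{p-4}}{(p-4)!}
\end{pmatrix}
\]
and $r^2 = r_2(1 : p-2)$ has components $r_2(i) = c_2^i/i!$, $i = 1, 2, \ldots, p-2$.

The detailed structure of columns 3 to $p-2$ of $M^2 \in \mathbb{R}^{(p-2)\times(p-2)}$ is as follows:
\[
M^2(i, j) = \frac{\eta_{\lfloor j/2+1 \rfloor}^{i-1}}{(i-1)!}, \qquad
M^2(i, j+1) = \frac{d}{d\eta_{\lfloor j/2+1 \rfloor}} M^2(i, j), \qquad
i = 1, 2, \ldots, p-2, \quad j = 3, 5, \ldots, 2\lfloor (p-3)/2 \rfloor + 1,
\]
provided $j + 1 \le p - 2$ in the second equation.

A truncated Taylor expansion of the right-hand side of (2) about $x_n$ gives
\[
\sum_{j=0}^{p+1} S_2(j)\, h_{n+1}^{j}\, y_n^{(j)} \tag{18}
\]
with coefficients
\[
S_2(j) = M^2(j, 1 : p-2)\, u^2 = r_2(j) = \frac{c_2^j}{j!}, \qquad j = 1, 2, \ldots, p-2,
\]
\[
S_2(j) = \sum_{i=1}^{\lfloor (p-3)/2 \rfloor} \beta_{2i} \frac{\eta_{i+1}^{j-1}}{(j-1)!} + \sum_{i=0}^{\lfloor (p-4)/2 \rfloor} \gamma_{2i} \frac{\eta_{i+1}^{j-2}}{(j-2)!}, \qquad j = p-1, p, p+1.
\]
We note that P$_2$ is of order $(p-2)$ since it satisfies the order conditions $S_2(j) = c_2^j/j!$, $j = 1, 2, \ldots, p-2$, and its leading error term is
\[
\Bigl[ S_2(p-1) - \frac{c_2^{p-1}}{(p-1)!} \Bigr] h_{n+1}^{p-1} y_n^{(p-1)}.
\]

4.3 Predictor P3

The $(p-1)$-vector of the reordered coefficients of predictor P$_3$ in (3),
\[
u^3 = [a_{31}, \gamma_{30}, a_{32}, \beta_{31}, \gamma_{31}, \beta_{32}, \gamma_{32}, \beta_{33}, \gamma_{33}, \ldots, \beta_{3,\lfloor (p-3)/2 \rfloor}, \gamma_{3,\lfloor (p-4)/2 \rfloor}]^T,
\]
is the solution of the system of order conditions
\[
M^3 u^3 = r^3, \tag{19}
\]
where
\[
M^3 = \begin{pmatrix}
1 & 0 & 1 & 1 & 0 & \cdots & 1 & 0 \\
0 & 1 & c_2 & \eta_2 & 1 & \cdots & \eta_{\lfloor (p-3)/2 \rfloor+1} & 1 \\
0 & 0 & \frac{c_2^2}{2!} & \frac{\eta_2^2}{2!} & \eta_2 & \cdots & \frac{\eta_{\lfloor (p-3)/2 \rfloor+1}^2}{2!} & \eta_{\lfloor (p-4)/2 \rfloor+1} \\
\vdots & & & & & & & \vdots \\
0 & 0 & \frac{c_2^{p-2}}{(p-2)!} & \frac{\eta_2^{p-2}}{(p-2)!} & \frac{\eta_2^{p-3}}{(p-3)!} & \cdots & \frac{\eta_{\lfloor (p-3)/2 \rfloor+1}^{p-2}}{(p-2)!} & \frac{\eta_{\lfloor (p-4)/2 \rfloor+1}^{p-3}}{(p-3)!}
\end{pmatrix}.
\]
The first $(p-2)$ components of $r^3 = r_3(1 : p-1)$ are $r_3(i) = c_3^i/i!$, $i = 1, 2, \ldots, p-2$, and the $(p-1)$st component is
\[
r_3(p-1) = \frac{1}{(1 - c_3)\, b_{13}} \Bigl[ \frac{1}{(p+1)!} - \bigl( (1 - c_2)\, b_{12}\, S_2(p-1) + B_1(p) - p\, B_1(p+1) \bigr) \Bigr],
\]
which corresponds to the Runge–Kutta order condition (11) minus $p$ times order condition (12).

The detailed structure of columns 4 to $p-1$ of $M^3 \in \mathbb{R}^{(p-1)\times(p-1)}$ is as follows:
\[
M^3(i, j) = \frac{\eta_{\lfloor j/2 \rfloor}^{i-1}}{(i-1)!}, \qquad
M^3(i, j+1) = \frac{d}{d\eta_{\lfloor j/2 \rfloor}} M^3(i, j), \qquad
i = 1, 2, \ldots, p-1, \quad j = 4, 6, \ldots, 2\lfloor (p-1)/2 \rfloor,
\]
provided $j + 1 \le p - 1$ in the second equation.

A truncated Taylor expansion of the right-hand side of (3) about $x_n$ gives
\[
\sum_{j=0}^{p+1} S_3(j)\, h_{n+1}^{j}\, y_n^{(j)} \tag{20}
\]
with coefficients
\[
S_3(j) = M^3(j, 1 : p-1)\, u^3 = r_3(j) = \frac{c_3^j}{j!}, \qquad j = 1, 2, \ldots, p-2,
\]
\[
S_3(j) = a_{32} S_2(j-1) + \sum_{i=1}^{\lfloor (p-3)/2 \rfloor} \beta_{3i} \frac{\eta_{i+1}^{j-1}}{(j-1)!} + \sum_{i=1}^{\lfloor (p-4)/2 \rfloor} \gamma_{3i} \frac{\eta_{i+1}^{j-2}}{(j-2)!}, \qquad j = p-1, p, p+1.
\]

4.4 Predictor P4

The $p$-vector of the reordered coefficients of predictor P$_4$ in (4),
\[
u^4 = [a_{41}, \gamma_{40}, \beta_{41}, \gamma_{41}, \beta_{42}, \gamma_{42}, \beta_{43}, \gamma_{43}, \ldots, \beta_{4,\lfloor (p-3)/2 \rfloor}, \gamma_{4,\lfloor (p-4)/2 \rfloor}, a_{43}, a_{42}]^T,
\]
is the solution of the system of order conditions
\[
M^4 u^4 = r^4, \tag{21}
\]
where
\[
M^4 = \begin{pmatrix}
1 & 0 & 1 & 0 & \cdots & 1 & 0 & 1 & 1 \\
0 & 1 & \eta_2 & 1 & \cdots & \eta_{\lfloor (p-3)/2 \rfloor+1} & 1 & c_3 & c_2 \\
0 & 0 & \frac{\eta_2^2}{2!} & \eta_2 & \cdots & \frac{\eta_{\lfloor (p-3)/2 \rfloor+1}^2}{2!} & \eta_{\lfloor (p-4)/2 \rfloor+1} & \frac{c_3^2}{2!} & \frac{c_2^2}{2!} \\
\vdots & & & & & & & & \vdots \\
0 & 0 & \frac{\eta_2^{p-2}}{(p-2)!} & \frac{\eta_2^{p-3}}{(p-3)!} & \cdots & \frac{\eta_{\lfloor (p-3)/2 \rfloor+1}^{p-2}}{(p-2)!} & \frac{\eta_{\lfloor (p-4)/2 \rfloor+1}^{p-3}}{(p-3)!} & \frac{c_3^{p-2}}{(p-2)!} & \frac{c_2^{p-2}}{(p-2)!} \\
0 & 0 & \frac{\eta_2^{p-1}}{(p-1)!} & \frac{\eta_2^{p-2}}{(p-2)!} & \cdots & \frac{\eta_{\lfloor (p-3)/2 \rfloor+1}^{p-1}}{(p-1)!} & \frac{\eta_{\lfloor (p-4)/2 \rfloor+1}^{p-2}}{(p-2)!} & S_3(p-1) & S_2(p-1)
\end{pmatrix}. \tag{22}
\]
The first $(p-2)$ components of $r^4 = r_4(1 : p)$ are $r_4(i) = c_4^i/i!$, $i = 1, 2, \ldots, p-2$, and the $(p-1)$st and $p$th components are
\[
r_4(p-1) = \frac{1}{b_{14}} \Bigl[ \frac{1}{p!} - b_{12} S_2(p-1) - b_{13} S_3(p-1) - B_1(p) \Bigr],
\]
\[
r_4(p) = \frac{1}{b_{14}} \Bigl[ \frac{1}{(p+1)!} - b_{12} S_2(p) - b_{13} S_3(p) - B_1(p+1) \Bigr],
\]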
which correspond to the Runge–Kutta order conditions (11) and (13), respectively.

The detailed structure of columns 3 to $p-2$ of $M^4 \in \mathbb{R}^{p\times p}$ is as follows:
\[
M^4(i, j) = \frac{\eta_{\lfloor j/2+1 \rfloor}^{i-1}}{(i-1)!}, \qquad
M^4(i, j+1) = \frac{d}{d\eta_{\lfloor j/2+1 \rfloor}} M^4(i, j), \qquad
i = 1, 2, \ldots, p, \quad j = 3, 5, \ldots, 2\lfloor (p-3)/2 \rfloor + 1,
\]
provided $j + 1 \le p - 2$ in the second equation.

4.5 Step control predictor P5

The $(p+1)$-vector of the reordered coefficients of the step control predictor P$_5$ in (6) is
\[
\tilde u^5 = [a_{51}, \gamma_{50}, \beta_{51}, \gamma_{51}, \beta_{52}, \gamma_{52}, \ldots, \beta_{5,\lfloor (p-3)/2 \rfloor}, \gamma_{5,\lfloor (p-4)/2 \rfloor}, a_{54}, a_{53}, a_{52}]^T.
\]
The choice
\[
a_{54} = b_{14} + \omega_4, \qquad a_{53} = b_{13} + \omega_3, \qquad a_{52} = b_{12} + \omega_2,
\]
reduces $\tilde u^5$ to the $(p-2)$-vector $u^5$ which is the solution of the system of order conditions
\[
M^2 u^5 = r^5. \tag{23}
\]
The $(p-2)$ components of $r^5 = r_5(1 : p-2)$ are
\[
r_5(i) = \frac{1}{i!} - (b_{14} + \omega_4) \frac{c_4^{i-1}}{(i-1)!} - (b_{13} + \omega_3) \frac{c_3^{i-1}}{(i-1)!} - (b_{12} + \omega_2) \frac{c_2^{i-1}}{(i-1)!}, \qquad i = 1, 2, \ldots, p-2,
\]
where $\omega_4$, $\omega_3$ and $\omega_2$ can be chosen arbitrarily provided $\omega_4 \omega_3 \omega_2 \ne 0$. For any such choice, P$_5$ yields $\tilde y_{n+1}$ to order $(p-2)$. Based on numerical experimentation, a good choice was found to be $\omega_4 = -0.025$, $\omega_3 = 0.029$ and $\omega_2 = -0.015$.

5 Symbolic construction of elementary matrix functions

Consider the coefficient matrices
\[
M^\ell \in \mathbb{R}^{m_\ell \times m_\ell}, \qquad \ell = 1, 2, 3, 4, 5, \tag{24}
\]
of the Vandermonde-type systems (15), (17), (19), (21), and (23), where
\[
m_1 = p + 1, \quad m_2 = p - 2, \quad m_3 = p - 1, \quad m_4 = p, \quad m_5 = p - 2. \tag{25}
\]
A fast solution of these systems in $O(m_\ell^2)$ operations will be achieved by decomposing $(M^\ell)^{-1}$ into the product of lower and upper bidiagonal matrices, one diagonal matrix and one upper tridiagonal matrix.

The purpose of this section is to construct elementary lower and upper bidiagonal matrices as symbolic functions of the parameters of HBO($p$)4, to be used in Section 7 to diagonalize $M^\ell$, $\ell = 1, 2, 3, 4, 5$. These matrices are most easily constructed by means of symbolic software.
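As a reference point for the fast solvers, the smallest of these systems can simply be assembled and solved directly. The sketch below does this for system (17) with $p = 5$, where $u^2 = [a_{21}, \gamma_{20}, \beta_{21}]^T$ and $m_2 = 3$. It is an illustration under stated assumptions, not the fast algorithm of the paper: the Gauss–Jordan routine `solve` and the function name `p2_coefficients` are hypothetical helpers, a constant stepsize is assumed so that $\eta_2 = -1$, and $c_2 = 1/3$ is the value adopted later in (40).

```python
from fractions import Fraction

def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination in exact rational arithmetic."""
    n = len(b)
    aug = [list(row) + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        # Bring a row with a nonzero pivot into position `col`.
        piv = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[piv] = aug[piv], aug[col]
        pivot = aug[col][col]
        aug[col] = [v / pivot for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n] for row in aug]

def p2_coefficients(c2, eta2):
    """Return [a21, gamma20, beta21] from M^2 u^2 = r^2 in (17) for p = 5."""
    one, zero = Fraction(1), Fraction(0)
    M2 = [
        [one,  zero, one],             # row 1: entries with exponent 0
        [zero, one,  eta2],            # row 2: exponent 1
        [zero, zero, eta2 ** 2 / 2],   # row 3: exponent 2, divided by 2!
    ]
    r2 = [c2, c2 ** 2 / 2, c2 ** 3 / 6]   # r2(i) = c2^i / i!
    return solve(M2, r2)

# Constant stepsize (eta2 = -1) and c2 = 1/3 are assumptions of this example.
a21, gamma20, beta21 = p2_coefficients(Fraction(1, 3), Fraction(-1))
```

A direct solve of this kind costs $O(m_\ell^3)$ operations and must be repeated whenever the stepsize ratios change, which is the cost the bidiagonal factorization is designed to avoid.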
Since the Vandermonde-type matrices $M^\ell$ can be decomposed into the product of a diagonal matrix containing reciprocals of factorials and a confluent Vandermonde matrix, the factorizations used in this paper hold, following the approach of Björck and Pereyra [5], Krogh [16], Galimberti and Pereyra [9], and Björck and Elfving [4]. Pivoting is not needed in this decomposition because of the special structure of Vandermonde-type matrices.

5.1 Symbolic construction of lower bidiagonal matrices for $M^\ell$, $\ell = 1, 2, 3, 4$

We first describe the zeroing process of a general vector $x = [x_1, x_2, \ldots, x_m]^T$ with no zero elements. The lower bidiagonal matrix
\[
L_k = \begin{pmatrix}
I_{k-1} & 0 & 0 & \cdots & 0 \\
0 & 1 & 0 & & 0 \\
0 & 1 & -\tau_{k+1} & & 0 \\
\vdots & & \ddots & \ddots & \vdots \\
0 & 0 & \cdots & 1 & -\tau_m
\end{pmatrix} \tag{26}
\]
defined by the multipliers
\[
\tau_i = \frac{x_{i-1}}{x_i} = -L_k(i, i), \qquad i = k+1, k+2, \ldots, m, \tag{27}
\]
zeros the last $(m-k)$ components, $x_{k+1}, \ldots, x_m$, of $x$.

For $k = 3, 4, \ldots, m_\ell - 1$, left multiplying $T = L^\ell_{k-1} \cdots L^\ell_4 L^\ell_3 M^\ell$ by $L^\ell_k$ zeros the last $(m_\ell - k)$ components of the $k$th column of $T$. Thus we obtain the upper triangular matrix
\[
L^\ell M^\ell = L^\ell_{m_\ell-1} \cdots L^\ell_4 L^\ell_3 M^\ell \tag{28}
\]
in $(m_\ell - 3)$ steps. We note that $L^\ell$ does not change the first two rows of $M^\ell$.

Process 1. At the $k$th step, starting with $k = 3$,

• $M^{\ell(k-1)} = L^\ell_{k-1} L^\ell_{k-2} \cdots L^\ell_3 M^\ell$ is an upper triangular matrix in columns 1 to $k-1$.

• The multipliers in $L^\ell_k$ are obtained from $M^{\ell(k-1)}(k+1 : m_\ell, k)$ since $M^\ell(i, k) \ne 0$ for $i = k+1, k+2, \ldots, m_\ell$.

Algorithm 1 in Appendix A describes this process.

5.2 Symbolic construction of initializing upper tridiagonal matrices $U_1^\ell$ for $M^\ell$, $\ell = 1, 2, 3$

The second step in diagonalizing $M^\ell$ transforms the first two rows of $L^\ell M^\ell$ by right multiplication by an upper tridiagonal matrix $U_1^\ell$ such that
\[
L^\ell M^\ell U_1^\ell (1 : 2, 1 : m_\ell) = \begin{pmatrix} 1 & 0 & 0 & \cdots & 0 \\ 0 & 1 & 1 & \cdots & 1 \end{pmatrix}. \tag{29}
\]
The action of $U_1^\ell$ amounts to taking the first divided difference of the columns of $L^\ell M^\ell$ whose first component is 1 (cf. [4]).
The precise form of $U_1^\ell$ is postponed until Section 7 since it is defined in terms of each $M^\ell$.

5.3 Symbolic construction of upper bidiagonal matrices for $M^\ell$, $\ell = 1, 2, 3$

We construct upper bidiagonal matrices $U_k^\ell$, $k = 2, 3, \ldots, m_\ell - 1$, whose right multiplication on $L^\ell M^\ell U_1^\ell$ amounts to taking divided differences of order $k$, for $k = 2, 3, \ldots, m_\ell - 1$, of the columns $k$ to $m_\ell$ of the matrices on which they act. Specifically, consider the two-row matrix
\[
L^\ell M^\ell U_1^\ell \cdots U_{k-1}^\ell (k : k+1, 1 : m_\ell) =
\begin{pmatrix}
y_{k1} & \cdots & y_{k,k-1} & 1 & 1 & \cdots & 1 & 1 \\
y_{k+1,1} & \cdots & y_{k+1,k-1} & y_{k+1,k} & y_{k+1,k+1} & \cdots & y_{k+1,m_\ell-1} & y_{k+1,m_\ell}
\end{pmatrix} \tag{30}
\]
and define the upper bidiagonal matrix
\[
U_k^\ell = \begin{pmatrix}
I_{k-1} & 0 & \cdots & 0 & \cdots & 0 & 0 \\
0 & 1 & -\sigma_{k+1} & 0 & \cdots & 0 & 0 \\
0 & 0 & \sigma_{k+1} & -\sigma_{k+2} & \cdots & 0 & 0 \\
\vdots & & & \ddots & \ddots & & \vdots \\
0 & 0 & 0 & \cdots & \sigma_{m_\ell-2} & -\sigma_{m_\ell-1} & 0 \\
0 & 0 & 0 & \cdots & 0 & \sigma_{m_\ell-1} & -\sigma_{m_\ell} \\
0 & 0 & 0 & \cdots & 0 & 0 & \sigma_{m_\ell}
\end{pmatrix} \tag{31}
\]
by means of the divisors
\[
\sigma_i = \frac{1}{y_{2,i} - y_{2,i-1}} = U_k^\ell(i, i), \qquad i = k+1, k+2, \ldots, m_\ell. \tag{32}
\]
Then, right multiplying (30) by $U_k^\ell$ zeros the 1's in positions $k+1, \ldots, m_\ell$ in the first row and puts 1's in positions $k+1, \ldots, m_\ell$ in the second row:
\[
L^\ell M^\ell U_1^\ell \cdots U_{k-1}^\ell U_k^\ell (k : k+1, 1 : m_\ell) =
\begin{pmatrix}
y_{k1} & \cdots & y_{k,k-1} & 1 & 0 & \cdots & 0 & 0 \\
y_{k+1,1} & \cdots & y_{k+1,k-1} & y_{k+1,k} & 1 & \cdots & 1 & 1
\end{pmatrix}. \tag{33}
\]
Applying $U_k^\ell$, $k = 2, 3, \ldots, m_\ell - 1$, on the right of the upper triangular matrix $L^\ell M^\ell U_1^\ell$, we obtain the diagonal matrix
\[
D^\ell = L^\ell M^\ell U^\ell = L^\ell_{m_\ell-1} \cdots L^\ell_4 L^\ell_3 M^\ell U_1^\ell U_2^\ell \cdots U^\ell_{m_\ell-1} \tag{34}
\]
in $(m_\ell - 2)$ steps.

Process 2. At the $k$th step, starting with $k = 2$,

• $M^{\ell(k-1)} = (L^\ell M^\ell U_1^\ell) U_2^\ell \cdots U_{k-1}^\ell$ is a diagonal matrix in rows 1 to $k-1$.

• The divisors in $U_k^\ell$ are obtained from $M^{\ell(k-1)}(k+1, k+1 : m_\ell)$ since $M^{\ell(k-1)}(k+1, j) - M^{\ell(k-1)}(k+1, j-1) \ne 0$ for $j = k+1, k+2, \ldots, m_\ell$.

Algorithm 2 in Appendix A describes this process.
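The action of the elementary matrices of subsections 5.1 and 5.3 can be checked on small numerical data. The following sketch builds $L_k$ from (26)–(27) for a given vector and $U_k$ from (31)–(32) for a given second row of a two-row matrix of the form (30). The function names and the sample numbers are assumptions chosen for illustration; the paper constructs these matrices symbolically, whereas this sketch uses exact rational arithmetic on concrete values.

```python
from fractions import Fraction

def identity(m):
    """m-by-m identity matrix with rational entries."""
    return [[Fraction(int(i == j)) for j in range(m)] for i in range(m)]

def lower_bidiagonal(x, k):
    """L_k of (26) with multipliers tau_i = x_{i-1}/x_i, i = k+1..m (1-based k)."""
    L = identity(len(x))
    for i in range(k, len(x)):            # 0-based i is the 1-based row i+1
        tau = Fraction(x[i - 1]) / x[i]
        L[i][i - 1] = Fraction(1)
        L[i][i] = -tau
    return L

def upper_bidiagonal(y, k):
    """U_k of (31) with divisors sigma_i = 1/(y_i - y_{i-1}), i = k+1..m (1-based k)."""
    U = identity(len(y))
    for i in range(k, len(y)):            # 0-based i is the 1-based column i+1
        sigma = 1 / (Fraction(y[i]) - y[i - 1])
        U[i][i] = sigma
        U[i - 1][i] = -sigma
    return U

def mat_vec(A, x):
    """Matrix times column vector."""
    return [sum(a * v for a, v in zip(row, x)) for row in A]

def row_times_mat(row, A):
    """Row vector times matrix."""
    return [sum(row[i] * A[i][j] for i in range(len(row))) for j in range(len(A))]

# Zeroing process of subsection 5.1 with k = 2: entries 3..5 of x are annihilated.
x = [3, 1, 2, 4, 8]
Lx = mat_vec(lower_bidiagonal(x, 2), x)

# Two-row matrix of the form (30) with k = 2 and m = 5 (sample values).
first_row = [5, 1, 1, 1, 1]
second_row = [7, 2, 3, 5, 9]
U2 = upper_bidiagonal(second_row, 2)
new_first = row_times_mat(first_row, U2)
new_second = row_times_mat(second_row, U2)
```

With these sample values, `Lx` keeps its first two entries and has zeros elsewhere, while `new_first` and `new_second` have the pattern shown in (33): the trailing 1's of the first row become 0's and the trailing entries of the second row become 1's.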
5.4 Symbolic construction of upper bidiagonal matrices for $M^4$

The construction of the first $m_4 - 3$ upper bidiagonal matrices for $M^4$ differs slightly from the construction for $M^\ell$, $\ell = 1, 2, 3$, and requires a consideration of its own.

For the matrix $L^4 M^4 U_1^4$, we construct upper bidiagonal matrices $U_k^4$, $k = 2, 3, \ldots, m_4 - 2$. The right multiplication of $U_k^4$, $k = 2, 3, \ldots, m_4 - 2$, on $L^4 M^4 U_1^4$ amounts to taking modified divided differences of order $k$, for $k = 2, 3, \ldots, m_4 - 2$, of the columns $k$ to $m_4$ of the matrices on which they act. Specifically, consider the two-row matrix
\[
L^4 M^4 U_1^4 \cdots U_{k-1}^4 (k : k+1, 1 : m_4) =
\begin{pmatrix}
y_{k1} & \cdots & y_{k,k-1} & 1 & 1 & \cdots & 1 & 1 \\
y_{k+1,1} & \cdots & y_{k+1,k-1} & y_{k+1,k} & y_{k+1,k+1} & \cdots & y_{k+1,m_4-1} & y_{k+1,m_4}
\end{pmatrix} \tag{35}
\]
and define the upper tridiagonal matrix
\[
U_k^4 = \begin{pmatrix}
I_{k-1} & 0 & \cdots & 0 & \cdots & 0 & 0 \\
0 & 1 & -\sigma_{k+1} & 0 & \cdots & 0 & 0 \\
0 & 0 & \sigma_{k+1} & -\sigma_{k+2} & \cdots & 0 & 0 \\
\vdots & & & \ddots & \ddots & & \vdots \\
0 & 0 & 0 & \cdots & \sigma_{m_4-2} & -\sigma_{m_4-1} & -\sigma_{m_4} \\
0 & 0 & 0 & \cdots & 0 & \sigma_{m_4-1} & 0 \\
0 & 0 & 0 & \cdots & 0 & 0 & \sigma_{m_4}
\end{pmatrix} \tag{36}
\]
by means of the divisors
\[
\sigma_i = \frac{1}{y_{2,i} - y_{2,i-1}} = U_k^4(i, i), \quad i = k+1, k+2, \ldots, m_4 - 1, \qquad
\sigma_{m_4} = \frac{1}{y_{2,m_4} - y_{2,m_4-2}} = U_k^4(m_4, m_4). \tag{37}
\]
We note that $U_k^4$, $k = 2, 3, \ldots, m_4 - 2$, differs from $U_k^\ell$ of subsection 5.3 in the last column. Then, right multiplying (35) by $U_k^4$ zeros the 1's in positions $k+1, \ldots, m_4$ in the first row and puts 1's in positions $k+1, \ldots, m_4$ in the second row:
\[
L^4 M^4 U_1^4 \cdots U_{k-1}^4 U_k^4 (k : k+1, 1 : m_4) =
\begin{pmatrix}
y_{k1} & \cdots & y_{k,k-1} & 1 & 0 & \cdots & 0 & 0 \\
y_{k+1,1} & \cdots & y_{k+1,k-1} & y_{k+1,k} & 1 & \cdots & 1 & 1
\end{pmatrix}. \tag{38}
\]
Applying $U_k^4$, $k = 2, 3, \ldots, m_4 - 2$, on the right of the upper triangular matrix $L^4 M^4 U_1^4$, we obtain a nonsingular bidiagonal matrix with only one nonzero off-diagonal element in position $(m_4 - 1, m_4)$,
\[
W^4 = L^4 M^4 U^4 = L^4_{m_4-1} \cdots L^4_4 L^4_3 M^4 U_1^4 U_2^4 \cdots U^4_{m_4-2}, \tag{39}
\]
in $(m_4 - 3)$ steps.

Process 3. At the $k$th step, starting with $k = 2$,

• $M^{4(k-1)} = (L^4 M^4 U_1^4) U_2^4 \cdots U_{k-1}^4$ is a diagonal matrix in rows 1 to $k-1$.
• The divisors in $U_k^4$ are obtained from $M^{4(k-1)}(k+1, k+1 : m_4)$ since $M^{4(k-1)}(k+1, j) - M^{4(k-1)}(k+1, j-1) \ne 0$ for $j = k+1, k+2, \ldots, m_4 - 1$, and $M^{4(k-1)}(k+1, m_4) - M^{4(k-1)}(k+1, m_4 - 2) \ne 0$.

Algorithm 3 in Appendix A describes this process for constructing the first $m_4 - 3$ upper bidiagonal matrices $U_k$ for $M^4$.

6 Particular VSVO HBO(p)4 of order 5 to 14

In the general HBO(5-14)4 of Section 2, $c_2$ and $c_3$ are free parameters. After an appropriate choice of $c_2$ and $c_3$, one has a particular VSVO HBO($p$)4 of order $p = 5$ to 14 that depends only on $h_{n+1}$ and the previous nodes, $x_n, x_{n-1}, \ldots, x_{n-\lfloor (p-3)/2 \rfloor}$, which determine $\eta_2, \ldots, \eta_{\lfloor (p-3)/2 \rfloor+1}$ in (9). After extensive numerical experimentation, a good simple particular VSVO HBO(5-14)4 was obtained with
\[
c_1 = 0, \qquad c_2 = \frac{1}{3}, \qquad c_3 = \frac{2}{3}, \qquad c_4 = 1. \tag{40}
\]
The remainder of this paper is concerned with the particular VSVO HBO(5-14)4 with coefficients $c_j$ given in (40).

The procedure to advance integration from $x_n$ to $x_{n+1}$ is as follows.

(a) The order $p$ is obtained by the procedure of Section 9. Then, the stepsize, $h_{n+1}$, is obtained by formula (53) of Section 9 with $\kappa = p - 1$.

(b) The numbers $\eta_2, \ldots, \eta_{\lfloor (p-3)/2 \rfloor+2}$, defined in (9), are calculated.

(c) The coefficients of integration formula IF, predictors P$_2$, P$_3$, P$_4$ and step control predictor P$_5$ are obtained successively as solutions of systems (15), (17), (19), (21) and (23), respectively.

(d) The values $y_{n+c_2}$, $y_{n+c_3}$, $y_{n+c_4}$, $y_{n+1}$, and $\tilde y_{n+1}$ are obtained by formulae (2)–(6).

(e) The step is accepted if $|y_{n+1} - \tilde y_{n+1}|$ is smaller than the chosen tolerance, and the program goes to (a) with $n$ replaced by $n + 1$. Otherwise the program returns to (a) with the same order $p$ and the smaller stepsize $0.7 h_{n+1}$.

7 Fast solution of particular HBO(p)4 of order 5 to 14

The elementary matrix functions $L_k^\ell$ and $U_k^\ell$, $\ell = 1, 2, 3, 4$, are constructed only once as functions of $\eta_2, \ldots, \eta_6$.
Then they are used by the fast Algorithm 4 listed in Appendix A to solve systems $M^\ell u^\ell = r^\ell$ in (15), (17), (19), (21), and (23) at each integration step. The solution steps are described in the next subsections.

7.1 Solution of $M^\ell u^\ell = r^\ell$ for $\ell = 1, 2, 3, 5$

We recall, from (25), that $m_1 = p + 1$, $m_2 = p - 2$, $m_3 = p - 1$, $m_5 = p - 2$, and $M^5 = M^2$.

Firstly, the elimination procedure of subsection 5.1 is applied to $M^\ell$ to construct $m_\ell \times m_\ell$ lower bidiagonal matrices $L_k^\ell$, $k = 3, \ldots, m_\ell - 1$, of the form (26) defined by the multipliers
\[
\tau_i = \frac{i + 2 - k}{\mu_\ell(k)} = -L_k^\ell(i, i), \qquad i = k+1, k+2, \ldots, m_\ell,
\]
where
\[
\mu_\ell(k) = \begin{cases} M^\ell(2, k), & \text{if } M^\ell(1, k) = 1, \\ M^\ell(3, k), & \text{if } M^\ell(1, k) = 0, \end{cases} \qquad k = 3, 4, \ldots, m_\ell - 1. \tag{41}
\]
Left multiplying $M^\ell$ by $L_k^\ell$, $k = 3, \ldots, m_\ell - 1$, produces an upper triangular matrix $L^\ell M^\ell = L^\ell_{m_\ell-1} \cdots L^\ell_3 M^\ell$ of the form (28).

Secondly, we construct the $m_\ell \times m_\ell$ upper tridiagonal matrix $U_1^\ell$ which transforms $M^\ell$ into the matrix $M^\ell U_1^\ell$ of first-order divided differences of the columns of $M^\ell$ whose first component is 1 and where the divisors are taken from the second row of $M^\ell$. We note that $U_1^5 = U_1^2$ since $M^5 = M^2$. For given $p$ and $\ell$, Table 1 lists the diagonal elements of $U_1^\ell$ for $i = 1, 2, \ldots, m_\ell$.

Table 1: The diagonal entries of $U_1^\ell$.

 i    U_1^1(i,i)      U_1^2(i,i) = U_1^5(i,i)   U_1^3(i,i)      U_1^4(i,i)
 1    1               1                         1               1
 2    1               1                         1               1
 3    1/c4            1/η2                      1/c2            1/η2
 4    1/(c3 − c4)     1                         1/(η2 − c2)     1
 5    1/(c2 − c3)     1/(η3 − η2)               1               1/(η3 − η2)
 6    1/(η2 − c2)     1                         1/(η3 − η2)     1
 7    1               1/(η4 − η3)               1               1/(η4 − η3)
 8    1/(η3 − η2)     1                         1/(η4 − η3)     1
 9    1               1/(η5 − η4)               1               1/(η5 − η4)
10    1/(η4 − η3)     1                         1/(η5 − η4)     1
11    1               1/(η6 − η5)               1               1/(η6 − η5)
12    1/(η5 − η4)                               1/(η6 − η5)     1
13    1                                                         1/(c3 − η6)
14    1/(η6 − η5)                                               1/(c2 − η6)

The nonzero off-diagonal elements of $U_1^\ell$ are given in the following four sets of equations:
\[
\begin{aligned}
U_1^1(i-2, i) &= -U_1^1(i, i), \qquad i = 3, 8, 10, \ldots, m_1 - 1, \\
U_1^1(3, 4) &= -U_1^1(4, 4), \qquad U_1^1(4, 5) = -U_1^1(5, 5), \qquad U_1^1(5, 6) = -U_1^1(6, 6),
\end{aligned}
\]
\[
U_1^2(i-2, i) = -U_1^2(i, i), \qquad i = 3, 5, 7, \ldots, m_2 - 1,
\]
\[
\begin{aligned}
U_1^3(i-2, i) &= -U_1^3(i, i), \qquad i = 3, 6, 8, \ldots, m_3 - 1, \\
U_1^3(3, 4) &= -U_1^3(4, 4),
\end{aligned}
\]
\[
\begin{aligned}
U_1^4(i-2, i) &= -U_1^4(i, i), \qquad i = 3, 5, 7, \ldots, m_4 - 1, \\
U_1^4(m_4 - 3, m_4) &= -U_1^4(m_4, m_4).
\end{aligned}
\]
The first two rows of $M^\ell U_1^\ell$ are as in (29).

Thirdly, the elimination procedure of subsection 5.3 is used to construct $m_\ell \times m_\ell$ upper bidiagonal matrices $U_k^\ell$, $k = 2, \ldots, m_\ell - 1$, of the form (31) defined by the divisors
\[
\sigma_i = \frac{k}{\mu_\ell(i) - \mu_\ell(i-k)} = U_k^\ell(i, i), \qquad i = k+1, k+2, \ldots, m_\ell. \tag{42}
\]
Right multiplying $L^\ell M^\ell$ by $U_k^\ell$, $k = 1, \ldots, m_\ell - 1$, produces the diagonal matrix
\[
D^\ell = L^\ell_{m_\ell-1} L^\ell_{m_\ell-2} \cdots L^\ell_3 M^\ell U_1^\ell U_2^\ell \cdots U^\ell_{m_\ell-1},
\]
where
\[
D^\ell(i, i) = 1, \quad i = 1, 2, 3, \qquad \text{and} \qquad
D^\ell(i, i) = \frac{(i-1)!}{2[-\mu_\ell(3)]\,[-\mu_\ell(4)] \cdots [-\mu_\ell(i-1)]}, \quad i = 4, 5, \ldots, m_\ell.
\]
Lastly, $M^\ell$ is decomposed into the product of elementary matrices,
\[
M^\ell = \bigl( L^\ell_{m_\ell-1} L^\ell_{m_\ell-2} \cdots L^\ell_3 \bigr)^{-1} D^\ell \bigl( U_1^\ell U_2^\ell \cdots U^\ell_{m_\ell-1} \bigr)^{-1},
\]
and the solution of $M^\ell u^\ell = r^\ell$ is
\[
u^\ell = U_1^\ell U_2^\ell \cdots U^\ell_{m_\ell-1} (D^\ell)^{-1} L^\ell_{m_\ell-1} L^\ell_{m_\ell-2} \cdots L^\ell_3\, r^\ell, \tag{43}
\]
where fast computation goes from right to left.

Process 4. Procedure (43) is implemented in the following two steps:

Step 1. Algorithm 4 in Appendix A overwrites $r^\ell = r_\ell(1 : m_\ell)$ with $U_2^\ell \cdots U^\ell_{m_\ell-1} (D^\ell)^{-1} L^\ell_{m_\ell-1} L^\ell_{m_\ell-2} \cdots L^\ell_3\, r^\ell$ in $O(m_\ell^2)$ operations. The input is $M = M^\ell$; $m = m_\ell$; $r = r^\ell$; $L_k = L_k^\ell$, $k = 3, 4, \ldots, m_\ell - 1$; $U_k = U_k^\ell$, $k = 2, \ldots, m_\ell - 1$; and $D = D^\ell$.

Step 2. For each value of $\ell$, one of the following three cases is computed:

Case 1 ($\ell = 1$). The following iteration overwrites $r^1 = r_1(1 : m_1)$ with $U_1^1 r^1$:
\[
\begin{aligned}
r_1(3) &= r_1(3)\, U_1^1(3, 3), \\
r_1(4) &= r_1(4)\, U_1^1(4, 4), \\
r_1(5) &= r_1(5)\, U_1^1(5, 5), \\
r_1(i) &= r_1(i)\, U_1^1(i, i), & i &= 6, 8, \ldots, 2\lfloor m_1/2 \rfloor, \\
r_1(1) &= r_1(1) - r_1(3), \\
r_1(3) &= r_1(3) - r_1(4), \\
r_1(4) &= r_1(4) - r_1(5), \\
r_1(5) &= r_1(5) - r_1(6), & & \text{if } m_1 = 6, \\
r_1(i) &= r_1(i) - r_1(i+2), & i &= 6, 8, \ldots, 2\lfloor m_1/2 \rfloor - 2.
\end{aligned}
\]
Case 2 ($\ell = 2$). The following iteration overwrites $r^2 = r_2(1 : m_2)$ with $U_1^2 r^2$:
\[
\begin{aligned}
r_2(i) &= r_2(i)\, U_1^2(i, i), & i &= 3, 5, 7, \ldots, 2\lfloor (m_2 + 1)/2 \rfloor - 1, \\
r_2(i) &= r_2(i) - r_2(i+2), & i &= 1, 3, 5, \ldots, 2\lfloor (m_2 + 1)/2 \rfloor - 3.
\end{aligned}
\]

Case 3 ($\ell = 3$). The following iteration overwrites $r^3 = r_3(1 : m_3)$ with $U_1^3 r^3$:
\[
\begin{aligned}
r_3(3) &= r_3(3)\, U_1^3(3, 3), \\
r_3(i) &= r_3(i)\, U_1^3(i, i), & i &= 4, 6, 8, \ldots, 2\lfloor m_3/2 \rfloor, \\
r_3(1) &= r_3(1) - r_3(3), \\
r_3(3) &= r_3(3) - r_3(4), & & \text{if } m_3 = 4, \\
r_3(i) &= r_3(i) - r_3(i+2), & i &= 4, 6, 8, \ldots, 2\lfloor m_3/2 \rfloor - 2.
\end{aligned}
\]

7.2 Solution of $M^4 u^4 = r^4$

We recall, from (25), that $m_4 = p$. Since the Runge–Kutta-type order condition (13) changes the Vandermonde-type system $M^4 u^4 = r^4$, the construction of the lower and upper bidiagonal matrices for $M^4$ differs slightly from the construction for $M^\ell$, $\ell = 1, 2, 3$.

Firstly, the elimination procedure of subsection 5.1 is applied to $M^4$ to construct the first $(m_4 - 4)$ $m_4 \times m_4$ lower bidiagonal matrices $L_k^4$, $k = 3, \ldots, m_4 - 2$, of the form (26) defined by the multipliers
\[
\tau_i = \frac{i + 2 - k}{\mu_4(k)} = -L_k^4(i, i), \qquad i = k+1, k+2, \ldots, m_4,
\]
where
\[
\mu_4(k) = \begin{cases} M^4(2, k), & \text{if } M^4(1, k) = 1, \\ M^4(3, k), & \text{if } M^4(1, k) = 0, \end{cases} \qquad k = 3, 4, \ldots, m_4 - 1. \tag{44}
\]
Secondly, the elimination procedure of subsection 5.1 is applied to $M^4$ to construct the last $m_4 \times m_4$ lower bidiagonal matrix $L^4_{m_4-1}$ of the form (26), where the only multiplier $\tau_{m_4}$ is found by this elimination procedure:
\[
\tau_{m_4} = \frac{1}{\Delta} = -L^4_{m_4-1}(m_4, m_4), \tag{45}
\]
where
\[
\Delta = \frac{c_3}{3} + \frac{1}{3} \Bigl( \frac{S_3(p-1)\,(p-1)!}{c_3^{p-2}} - c_3 \Bigr) c_3^{p-4} \Big/ \prod_{k=3}^{p-2} \zeta_{c_3}(k), \tag{46}
\]
with
\[
\zeta_{c_3}(k) = \begin{cases} c_3 - M^4(2, k), & \text{if } M^4(1, k) = 1, \\ c_3 - M^4(3, k), & \text{if } M^4(1, k) = 0, \end{cases} \qquad k = 3, 4, \ldots, m_4 - 2.
\]
Left multiplying $M^4$ by $L_k^4$, $k = 3, \ldots, m_4 - 1$, produces the upper triangular matrix $L^4 M^4 = L^4_{m_4-1} \cdots L^4_3 M^4$ of the form (28).
Thirdly, we construct the $m_4 \times m_4$ upper tridiagonal matrix $U_1^4$ which transforms $M^4$ into the matrix of first-order divided differences $M^4 U_1^4$ of the columns of $M^4$ whose first component is 1 and where the divisors are taken from the second row of $M^4$.

Fourthly, the elimination procedure of subsection 5.3 is used to construct the first $(m_4 - 3)$ $m_4 \times m_4$ upper bidiagonal matrices $U_k^4$, $k = 2, \ldots, m_4 - 2$, of the form (36) defined by the divisors
\[
\sigma_i = \frac{k}{\mu_4(i) - \mu_4(i-k)} = U_k^4(i, i), \quad i = k+1, k+2, \ldots, m_4 - 1, \qquad
\sigma_{m_4} = \frac{k}{\mu_4(m_4) - \mu_4(m_4 - k - 1)} = U_k^4(m_4, m_4). \tag{47}
\]
Fifthly, the elimination procedure of subsection 5.3 is used to construct the last $m_4 \times m_4$ upper bidiagonal matrix $U^4_{m_4-1}$ of the form (31), where the only divisor $\sigma_{m_4}$ is found by this elimination procedure:
\[
\sigma_{m_4} = -\frac{\Delta}{\Delta_2 - \Delta} = U^4_{m_4-1}(m_4, m_4), \tag{48}
\]
where $\Delta$ is defined in (46) and $\Delta_2$ is
\[
\Delta_2 = \frac{c_2}{3} + \frac{1}{3} \Bigl( \frac{S_2(p-1)\,(p-1)!}{c_2^{p-2}} - c_2 \Bigr) c_2^{p-4} \Big/ \prod_{k=3}^{p-2} \zeta_{c_2}(k), \tag{49}
\]
with
\[
\zeta_{c_2}(k) = \begin{cases} c_2 - M^4(2, k), & \text{if } M^4(1, k) = 1, \\ c_2 - M^4(3, k), & \text{if } M^4(1, k) = 0, \end{cases} \qquad k = 3, 4, \ldots, m_4 - 2.
\]
Right multiplying $L^4 M^4$ by $U_k^4$, $k = 1, \ldots, m_4 - 1$, produces the diagonal matrix
\[
D^4 = L^4_{m_4-1} L^4_{m_4-2} \cdots L^4_3 M^4 U_1^4 U_2^4 \cdots U^4_{m_4-1},
\]
where $D^4(i, i) = 1$, $i = 1, 2, 3$, and
\[
D^4(i, i) = \frac{(i-1)!}{2[-\mu_4(3)]\,[-\mu_4(4)] \cdots [-\mu_4(i-1)]}, \quad i = 4, 5, \ldots, m_4 - 1, \qquad
D^4(m_4, m_4) = \frac{(m_4 - 2)!}{2[-\mu_4(3)]\,[-\mu_4(4)] \cdots [-\mu_4(m_4 - 2)]}.
\]
Lastly, $M^4$ is decomposed into the product of elementary matrices,
\[
M^4 = \bigl( L^4_{m_4-1} L^4_{m_4-2} \cdots L^4_3 \bigr)^{-1} D^4 \bigl( U_1^4 U_2^4 \cdots U^4_{m_4-1} \bigr)^{-1},
\]
and the solution of $M^4 u^4 = r^4$ is
\[
u^4 = U_1^4 U_2^4 \cdots U^4_{m_4-1} (D^4)^{-1} L^4_{m_4-1} L^4_{m_4-2} \cdots L^4_3\, r^4, \tag{50}
\]
where fast computation goes from right to left.

Process 5. Procedure (50) is implemented in the following two steps:

Step 1. Algorithm 5 in Appendix A overwrites $r^4 = r_4(1 : m_4)$ with $U_2^4 \cdots U^4_{m_4-1} (D^4)^{-1} L^4_{m_4-1} L^4_{m_4-2} \cdots L^4_3\, r^4$ in $O(m_4^2)$ operations.
The input is $M = M^4$; $m = m_4$; $r = r^4$; $L_k = L_k^4$, $k = 3, 4, \dots, m_4 - 1$; $U_k = U_k^4$, $k = 2, \dots, m_4 - 1$; and $D = D^4$.

Step 2 The following iteration overwrites $r^4 = r^4(1:m_4)$ with $U_1^4 r^4$:
\[
\begin{aligned}
r^4(i) &= r^4(i)\,U_1^4(i,i), & i &= 3, 5, 7, \dots, m_4 - 2,\\
r^4(m_4-1) &= r^4(m_4-1)\,U_1^4(m_4-1, m_4-1),\\
r^4(m_4) &= r^4(m_4)\,U_1^4(m_4, m_4),\\
r^4(i) &= r^4(i) - r^4(i+2), & i &= 1, 3, 5, \dots, m_4 - 4,\\
r^4(m_4-3) &= r^4(m_4-3) - r^4(m_4-1), & &\text{if $m_4$ is even},\\
r^4(m_4-3) &= r^4(m_4-3) - r^4(m_4), & &\text{if $m_4$ is even},\\
r^4(m_4-2) &= r^4(m_4-2) - r^4(m_4-1), & &\text{if $m_4$ is odd},\\
r^4(m_4-2) &= r^4(m_4-2) - r^4(m_4), & &\text{if $m_4$ is odd}.
\end{aligned}
\]

8 Regions of absolute stability and principal error term

To obtain the regions of absolute stability, $R$, of HBO($p$)4, $p = 5, 6, \dots, 14$, we apply the predictors P2, P3, P4 and the integration formula IF, with constant $h$, to the linear test equation
\[
y' = \lambda y, \qquad y_0 = 1.
\]
This gives the difference equation and the corresponding characteristic equation
\[
\sum_{j=0}^{\lfloor (p-1)/2 \rfloor} \gamma_j\, y_{n+j} = 0, \qquad \sum_{j=0}^{\lfloor (p-1)/2 \rfloor} \gamma_j\, r^j = 0, \tag{51}
\]
respectively, where $\lfloor (p-1)/2 \rfloor$ is the number of steps of the method. A complex number $\lambda h$ is in $R$ if the $\lfloor (p-1)/2 \rfloor$ roots of the characteristic equation satisfy the root condition: $|r_s| \le 1$, and the multiple roots satisfy $|r_s| < 1$ (see [12, pp. 256–257]). The root condition is used to find the regions of absolute stability of HBO(5-14)4 shown in grey in Fig. 1.

Let ABM($p$, $p-1$) denote the ABM method with predictor of order $p-1$ and corrector of order $p$ in PECE mode and let ABM(5-13) denote the family of such methods of order 5 to 13. Table 2 lists the intervals of absolute stability $(\alpha, 0)$ of HBO(5-14)4 and ABM(5-13) [26, pp. 135–140]. It is seen that the HBO methods have larger intervals of absolute stability than the ABM methods of comparable order for $p > 7$.

The principal error term of the two- to six-step HBO($p$)4, $p = 5, 6, \dots, 14$, is of the form
\[
\delta \left[ \{{}_2 f^{\,p-1}\}_2 \right] h^{p+1}, \tag{52}
\]
expressed in terms of elementary differentials defined in [6], [17] and [12]. Table 3 lists the principal local truncation coefficient (PLTC) $\delta$ of the principal error term and the scaled norms $\|\mathrm{PLTC}\|_2$ of HBO(5-14)4 and ABM(5-13). The scaling factor is the number of function evaluations per step. It is observed that the scaled norms of HBO($p$)4 are much smaller than those of ABM($p$, $p-1$).

[Figure 1 consists of ten panels, one for each of HBO(5)4, HBO(6)4, ..., HBO(14)4.]
Figure 1: Regions of absolute stability of HBO(5-14)4.

Table 2: For order $p$, scaled abscissae of absolute stability, $\alpha$, for HBO($p$)4 and ABM($p$, $p-1$).

  p     α/5, HBO(p)4     α/2, ABM(p, p−1)
  5     −0.50            −0.70
  6     −0.42            −0.52
  7     −0.39            −0.39
  8     −0.37            −0.30
  9     −0.36            −0.22
  10    −0.30            −0.17
  11    −0.25            −0.13
  12    −0.21            −0.11
  13    −0.17            −0.03
  14    −0.14

Table 3: For given order $p$, the table lists the principal local truncation coefficients (PLTC) of two- to six-step HBO($p$), $p = 5, \dots, 14$, and the scaled norm $\|\mathrm{PLTC}\|_2$ for HBO(5-14)4 and ABM(5-13).

  p     δ                 5 × ‖PLTC‖₂, HBO(p)4     2 × ‖PLTC‖₂, ABM(p, p−1)
  5     −3/2877649        5.20e-06                 2.44e-01
  6     −2/3034349        3.30e-06                 2.18e-02
  7     −3/7129880        2.11e-06                 2.00e-01
  8     −1/4800736        1.04e-06                 1.86e-01
  9     −1/8689239        5.75e-07                 1.75e-01
  10    −1/18241984       2.74e-07                 1.65e-01
  11    −1/34609018       1.45e-07                 1.57e-01
  12    −1/73241643       6.80e-08                 1.51e-01
  13    −1/141993369      3.52e-08                 1.45e-01
  14    −1/300535179      1.66e-08

9 Controlling stepsize and order

A variant of the procedure described in [26] is used to control the stepsize, $h_{n+1}$, and order, $p$, of VSVO HBO(5-14)4.
For simplicity, in this section the order of the step control predictor P5 will be denoted by $q = p - 2$.

• The program computes the maximum norm
\[
E_q = \| y_n - \tilde{y}_{n,q} \|_\infty,
\]
where $\tilde{y}_{n,q} := \tilde{y}_n$ is the value obtained by the step control predictor P5.

• The stepsize $h_{n+1}$ is obtained by the formula (see [13]):
\[
h_{n+1} = \min\left\{ h_{\max},\ \beta\, h_n \left( \frac{\text{tolerance}}{E_q} \right)^{1/\kappa},\ 4\, h_n \right\}, \tag{53}
\]
with $\kappa = p - 1$ and safety factor $\beta = 0.81$.

• The coefficients of integration formula IF, predictors P2, P3, P4 and step control predictor P5 are obtained successively as fast solutions of the linear systems (15), (17), (19), (21) and (23).

• The step to $x_{n+1}$ is accepted if $E_q \le$ tolerance, else it is rejected and the program returns to the previous step with smaller step $0.7\, h_{n+1}$.

• If the step to $x_{n+1}$ is successful, besides P5, three other order and stepsize control predictors of order $\rho = q \pm 1$ and $\rho = q - 2$ similar to P5,
\[
\tilde{y}_{n+1} = y_n + h_{n+1} \left[ \sum_{j=1}^{3} a_{5j} f_{n+c_j} + a_{54} f_{n+1} + \sum_{j=1}^{\lfloor (\rho-1)/2 \rfloor} \beta_{5j} f_{n-j} \right] + h_{n+1}^2 \left[ \sum_{j=0}^{\lfloor (\rho-2)/2 \rfloor} \gamma_{5j} f'_{n-j} \right],
\]
are used to produce the three values $\tilde{y}_{n+1,\rho}$ to control the order and stepsize by means of the following three maximum norms:
\[
E_{q\pm1} = \| y_{n+1} - \tilde{y}_{n+1,q\pm1} \|_\infty, \qquad E_{q-2} = \| y_{n+1} - \tilde{y}_{n+1,q-2} \|_\infty,
\]
which estimate the local error at $x_{n+1}$ had the step to $x_{n+1}$ been taken at orders $q \pm 1$ and $q - 2$, respectively. To choose the lowest satisfactory order, the following rules are used:

(a) The order is lowered if $E_{q-1} \le \min\{E_q, E_{q+1}\}$ or $E_q \ge \max\{E_{q-1}, E_{q-2}\}$.

(b) The order is raised only if the following stronger conditions, $E_{q+1} < E_q < \max\{E_{q-1}, E_{q-2}\}$, are satisfied.

(c) When the order $q$ of P5 is 12, $E_{q+1}$ is not available; thus, the order is lowered if $E_q \ge \max\{E_{q-1}, E_{q-2}\}$.

(d) When $q = 2$, the order is raised only if $E_{q+1} < E_q$.

• After selecting the order, $\kappa$ and $E_q$ are reassigned accordingly. For example, if the order is to be lowered in the next step, $\kappa_{\text{new}} = \kappa_{\text{old}} - 1$ and $E_q = E_{q-1}$. The stepsize $h_{n+1}$ is then controlled by formula (53).
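The stepsize rule (53) and the acceptance test can be sketched in a few lines of Python. This is a minimal illustration, not the Matlab or C++ code used for the experiments: the function names `new_stepsize` and `accept_step`, the argument names, and the guard for a zero error estimate are assumptions made here.

```python
def new_stepsize(h, err, tol, p, h_max, beta=0.81):
    """Stepsize proposed by formula (53) for a method of order p.

    h is the current stepsize h_n, err the error estimate E_q and tol the
    tolerance. The exponent uses kappa = p - 1; growth is capped at 4*h.
    """
    kappa = p - 1
    if err == 0.0:
        # Assumption of this sketch: with a zero estimate only the caps apply.
        return min(h_max, 4.0 * h)
    return min(h_max, beta * h * (tol / err) ** (1.0 / kappa), 4.0 * h)


def accept_step(err, tol):
    """A step is accepted when E_q <= tolerance."""
    return err <= tol


# Illustrative values only.
h_next = new_stepsize(h=0.1, err=1e-8, tol=1e-6, p=5, h_max=1.0)
print(accept_step(1e-8, 1e-6), h_next)
```

With a very small error estimate the factor 4 becomes the binding bound, and with a large current step the bound $h_{\max}$ does; the safety factor only matters in between.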
10 Numerical results

The numerical performance of HBO(5-14)4 and Matlab's ode113 is compared on the following problems: Arenstorf's orbits [1], the Brusselator and the Pleiades [12, pp. 244–249], the restricted three-body problem [26, pp. 232–259], and the following nonstiff DETEST problems: growth problem B1 of two conflicting populations and two-body problems D1–D5 [13].

HBO(5-14)4 is started at $x_0$ with the special one-step HBO(5)4S described in Appendix C. The initial step size, $h_1$, is chosen by a method similar to steps (a) and (b) of [12, p. 169]. A comparison of HBO(5-14)4 with DP(8,7)13M, both in C++, is also made.

Computations were performed on a Mac with a dual 2.5 GHz PowerPC G5 and 4 GB DDR SSRAM running under Mac OS X Version 10.4.2 and Matlab Version 7.0.4.352 (R14) Service Pack 2. Algorithm 4 in Appendix A was written in C and made into system-dependent Matlab mex files for speed.

10.1 Comparison of HBO(5-14)4 in Matlab and Matlab's ode113

The maximum global error, MGE, was obtained from the errors at every integration step. These errors were calculated from the numerical value $y_{n+1}$ of HBO(5-14)4 and the "exact solution" obtained by Matlab's ode113 with stringent tolerance $5 \times 10^{-14}$. The CPU time was obtained from the curves which fit, in a least-squares sense, the data $(\log_{10}(|\mathrm{MGE}|), \log_{10}(\mathrm{CPU}))$ using Matlab's polyfit.

In Fig. 2, CPU (horizontal axis) is plotted versus $\log_{10}(|\mathrm{MGE}|)$ (vertical axis) for the problems in hand. It is seen from the figures that HBO(5-14)4, programmed in Matlab, compares favorably with Matlab's ode113 on the basis of CPU versus MGE.

The CPU percentage efficiency gain (CPU PEG) is defined by the formula (cf. Sharp [28]),
\[
(\text{CPU PEG})_i = 100 \left( \frac{\sum_j \mathrm{CPU}_{2,ij}}{\sum_j \mathrm{CPU}_{1,ij}} - 1 \right), \tag{54}
\]
where $\mathrm{CPU}_{1,ij}$ and $\mathrm{CPU}_{2,ij}$ are the CPU of methods 1 and 2, respectively, associated with problem $i$, and $j = -\log_{10}(|\mathrm{MGE}|)$. The CPU PEG for the problems in hand is listed in Table 4. One sees that HBO(5-14)4 often performs better than ode113.
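As a small worked instance of formula (54), the following Python sketch computes the CPU PEG for one problem. The function name `cpu_peg` and the timings are made up for illustration; they are not measurements from this paper.

```python
def cpu_peg(cpu1, cpu2):
    """CPU percentage efficiency gain (54) for one problem.

    cpu1[j] and cpu2[j] are the CPU times of methods 1 and 2 at the same
    accuracy level j = -log10(|MGE|). A positive value means that method 1
    used less CPU time in total.
    """
    if len(cpu1) != len(cpu2):
        raise ValueError("both methods need CPU times at the same accuracy levels")
    return 100.0 * (sum(cpu2) / sum(cpu1) - 1.0)


# Hypothetical timings at two accuracy levels.
print(cpu_peg([1.0, 2.0], [1.5, 3.0]))  # 50.0
```

Because the sums are taken before the ratio, accuracy levels with large CPU times weigh more heavily than they would in an average of per-level ratios.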
10.2 Comparison of HBO(5-14)4 in C++ with DP(8,7)13M in C++

The CPU has been plotted in Fig. 3 versus the maximum global error (MGE) for HBO(5-14)4 in C++ and DP(8,7)13M in C++ for the Brusselator and the cubic wave. The horizontal axis is CPU for a given tolerance and the vertical axis is $\log_{10}(|\mathrm{MGE}|)$.

[Figure 2 consists of nine panels: Arenstorf, Brusselator, Restr. 3-body, Pleiades, B1, D2, D3, D4 and D5.]
Figure 2: CPU (horizontal axis) versus $\log_{10}(|\mathrm{MGE}|)$ (vertical axis) for nine problems in hand. HBO(5-14)4 (◦) and ode113. Programs are in Matlab.

Table 4: CPU PEG of HBO(5-14)4 over ode113 for the listed problems. Programs are in Matlab.

  Problem              CPU PEG      Problem    CPU PEG
  Arenstorf            32%          D1         0%
  Brusselator          29%          D2         13%
  Pleiades             22%          D3         27%
  Restricted 3 body    19%          D4         41%
  B1                   1%           D5         65%

[Figure 3 consists of two panels: Brusselator and Cubic wave.]
Figure 3: CPU (horizontal axis) versus $\log_{10}(|\mathrm{MGE}|)$ (vertical axis) for the Brusselator and the cubic wave. VSVO HBO(5-14)4 (◦) and DP(8,7). Programs are in C++.

Table 5: CPU PEG of HBO(5-14)4 over DP(8,7)13M for the listed problems. Programs are in C++.

  Problem        CPU PEG
  Brusselator    26%
  Cubic wave     151%

The CPU PEG defined by formula (54) is listed in Table 5 for the Brusselator and the cubic wave. Similar to the test results in [8], it is seen from Fig. 3 and Table 5 that, for the Brusselator and the cubic wave problem, whose derivative evaluations are relatively expensive, the new VSVO HBO(5-14)4 wins over DP(8,7)13M at stringent tolerance.
11 Conclusion

A self-starting fast variable-step variable-order 4-stage Hermite–Birkhoff–Obrechkoff method of order 5 to 14 was constructed by solving Vandermonde-type systems satisfying multistep- and Runge–Kutta-type order conditions. The stability regions of the HBO methods have a remarkably good shape. The stepsize and order are controlled by four local error estimators. This method, in its vectorized Lagrange form, was tested on the Brusselator, Arenstorf's orbits, the restricted three-body problem, the Pleiades, and the following nonstiff DETEST problems: two-body problems of class D and the growth problem of two conflicting populations of class B. The new method, when programmed in Matlab, was found generally to use less CPU time than Matlab's ode113 at stringent tolerances. When programmed in C++, HBO(5-14)4 wins over DP(8,7)13M on expensive problems at stringent tolerance.

Acknowledgment

Thanks are due to Philip W. Sharp for helpful discussions and observations.

A Algorithms

We recall from section 7 that Algorithms 1, 2 and 3 are used to construct the elementary matrix functions $L_k^\ell$ and $U_k^\ell$, $\ell = 1, 2, 3, 4, 5$, only once and these algorithms are not needed at run time.

Algorithm 1. This algorithm constructs lower bidiagonal matrices $L_k$ as functions of $c_2$, $c_3$, $c_4$ and $\eta_j$, $j = 2 : 6$.

For $k = 3 : m - 1$, do the following iteration:
  For $i = m : -1 : k + 1$, do the following two steps:
    Step (1) $L_k(i,i) = -M^\ell(i-1,k) / M^\ell(i,k)$.
    Step (2) For $j = k : m$, compute: $M^\ell(i,j) = M^\ell(i-1,j) + M^\ell(i,j)\, L_k(i,i)$.

Algorithm 2. This algorithm constructs upper bidiagonal matrices $U_k$ for $M^\ell$, $\ell = 1, 2, 3$, as functions of $c_2$, $c_3$, $c_4$ and $\eta_j$, $j = 2 : 6$.

For $k = 2 : m - 1$, do the following iteration:
  For $j = m : -1 : k + 1$, do the following two steps:
    Step (1) $U_k(j,j) = 1 / [M^\ell(k+1,j) - M^\ell(k+1,j-1)]$.
    Step (2) For $i = k : j$, compute $M^\ell(i,j) = (M^\ell(i,j) - M^\ell(i,j-1))\, U_k(j,j)$.

Algorithm 3.
This algorithm constructs the first $m - 3$ upper bidiagonal matrices $U_k$ for $M^4$ as functions of $c_2$, $c_3$, $c_4$ and $\eta_j$, $j = 2 : 6$.

For $k = 2 : m - 2$, do the following two steps:
  Step 1 For $j = m - 1 : -1 : k + 1$, do the following two steps:
    Step 1.a $U_k(j,j) = 1 / [M^4(k+1,j) - M^4(k+1,j-1)]$.
    Step 1.b For $i = k : j$, compute $M^4(i,j) = (M^4(i,j) - M^4(i,j-1))\, U_k(j,j)$.
  Step 2 Do the following two steps:
    Step 2.a $U_k(m,m) = 1 / [M^4(k+1,m) - M^4(k+1,m-2)]$.
    Step 2.b For $i = m - 1 : m$, compute $M^4(i,m) = (M^4(i,m) - M^4(i,m-2))\, U_{m-1}(m,m)$.

Algorithm 4. This algorithm overwrites $r = r(1:m)$ with $U_2 \cdots U_{m-1} D^{-1} L_{m-1} L_{m-2} \cdots L_3\, r$ in $O(m^2)$ operations for IF, P2 and P3.

Given $[\eta_2, \eta_3, \dots, \eta_6]$ and $r = r(1:m)$, the following algorithm overwrites $r$ with $U_2 \cdots U_{m-1} D^{-1} L_{m-1} L_{m-2} \cdots L_3\, r$.

  Step (1) The following iteration overwrites $r = r(1:m)$ with $L_{m-1} L_{m-2} \cdots L_3\, r$: for $k = 3 : m - 1$, compute
    $r(i) = r(i-1) + r(i)\, L_k(i,i)$,   $i = m : -1 : k + 1$.
  Step (2) The following iteration overwrites $r = r(1:m)$ with $U_2 U_3 \cdots U_{m-1} D^{-1} r$:
    $r(i) = r(i) / D(i,i)$,   $i = 1 : m$.
    For $k = m - 1 : -1 : 2$, compute
      $r(i) = r(i)\, U_k(i,i)$,   $i = k + 1 : m$,
      $r(i) = r(i) - r(i+1)$,   $i = k : m - 1$.

Algorithm 5. This algorithm overwrites $r = r(1:m)$ with $U_2 \cdots U_{m-1} D^{-1} L_{m-1} L_{m-2} \cdots L_3\, r$ in $O(m^2)$ operations for P4.

Given $[\eta_2, \eta_3, \dots, \eta_6]$ and $r = r(1:m)$, the following algorithm overwrites $r$ with $U_2 \cdots U_{m-1} D^{-1} L_{m-1} L_{m-2} \cdots L_3\, r$.

  Step (1) The following iteration overwrites $r = r(1:m)$ with $L_{m-1} L_{m-2} \cdots L_3\, r$: for $k = 3 : m - 1$, compute
    $r(i) = r(i-1) + r(i)\, L_k(i,i)$,   $i = m : -1 : k + 1$.
  Step (2) The following iteration overwrites $r = r(1:m)$ with $U_2 U_3 \cdots U_{m-1} D^{-1} r$:
    $r(i) = r(i) / D(i,i)$,   $i = 1 : m$.
    $r(m) = r(m)\, U_{m-1}(m,m)$,
    $r(m-1) = r(m-1) - r(m)$.
    For $k = m - 2 : -1 : 2$, compute
      $r(i) = r(i)\, U_k(i,i)$,   $i = k + 1 : m$,
      $r(i) = r(i) - r(i+1)$,   $i = k : m - 2$,
      $r(m-2) = r(m-2) - r(m)$.
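The in-place sweeps of Algorithm 4 can be sketched in Python as follows. This is only an illustration of the loop structure stated above: the function name `apply_algorithm4` and the storage layout (`L[k][i]` for $L_k(i,i)$, `U[k][i]` for $U_k(i,i)$, `D[i]` for $D(i,i)$, with an unused entry at index 0 so that indices match the 1-based notation) are assumptions, and the numerical values in the usage lines are arbitrary rather than entries computed from an actual $M^\ell$.

```python
def apply_algorithm4(r, L, U, D):
    """Overwrite r[1..m] following Steps (1) and (2) of Algorithm 4.

    r and D are lists with a dummy entry at index 0. L and U map k to a
    mapping from i to the diagonal entries L_k(i,i) and U_k(i,i).
    """
    m = len(r) - 1
    # Step (1): apply L_3, then L_4, ..., then L_{m-1}.
    for k in range(3, m):              # k = 3 : m-1
        for i in range(m, k, -1):      # i = m : -1 : k+1
            r[i] = r[i - 1] + r[i] * L[k][i]
    # Step (2): divide by D, then apply U_{m-1}, ..., U_2.
    for i in range(1, m + 1):
        r[i] = r[i] / D[i]
    for k in range(m - 1, 1, -1):      # k = m-1 : -1 : 2
        for i in range(k + 1, m + 1):  # scaling by the diagonal of U_k
            r[i] = r[i] * U[k][i]
        for i in range(k, m):          # i = k : m-1, differences
            r[i] = r[i] - r[i + 1]
    return r


# Arbitrary illustrative data for m = 4.
r = [None, 1.0, 2.0, 3.0, 4.0]
L = {3: {4: 2.0}}
U = {2: {3: 0.5, 4: 0.25}, 3: {4: 0.5}}
D = [None, 1.0, 1.0, 1.0, 2.0]
print(apply_algorithm4(r, L, U, D)[1:])
```

Only the diagonal entries of the bidiagonal factors are stored and the right-hand side is updated in place, so no $m \times m$ matrix is formed during the solve.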
Algorithms 4 and 5 use minimum storage since the solution is obtained by successively transforming the right-hand side into the solution vector. This is an advantage compared to generating $m \times m$ triangular matrices as an intermediate result at each integration step.

B Matlab programming

This appendix is included for the benefit of Matlab users. Algorithms 4 and 5, which solve systems IF, P2, P3 and P4, were programmed in C and compiled by the Matlab mex command into mex files, say, IF.macmex, P2.macmex, P3.macmex and P4.macmex. Algorithm 4, which solves the P5 system and three additional similar systems to produce four local error estimators $\tilde{y}_{n+1,p-j}$, $j = 1, \dots, 4$, of order $p - j$, was programmed in C and compiled by the Matlab mex command into a mex file, say, P5.macmex.

At runtime, the data of the differential equations were input. Then, IF.macmex, P2.macmex, P3.macmex, P4.macmex and P5.macmex were called and run to calculate the values of the coefficients of IF, P2, P3, P4, P5 and three other step control predictors at each integration step until completion of the integration. CPU time and the number of evaluations of $f(x, y)$ and $f'(x, y)$ for the runtime of Algorithms 4 and 5 were recorded. The option MGE can be run. Matlab's ode113 can be run with appropriate tolerance for comparison with HBO(5-14)4.

The elementary matrix functions $L_k^\ell$ and $U_k^\ell$, $\ell = 1, 2, 3, 4, 5$, are constructed by Algorithms 1, 2 and 3 as functions of $\eta_j$, for $j = 2, 3, \dots, 6$. These algorithms are not needed at runtime since these matrix functions are already implemented in the above Matlab mex files.

C Self-starter one-step HBO(5)4S

A self-starter one-step HBO(5)4S of order 5 requires three predictors, P2, P3 and P4, an integration formula, IF, and a step control predictor, P5, to perform the integration step at $x_0$ and, possibly, in case of discontinuity at $x_n$.
(P2) A Hermite–Birkhoff polynomial of degree 2 is used as predictor P2 to obtain $y_{n+c_2}$ to order 2,
\[
y_{n+c_2} = y_n + h_{n+1}\, a_{21} f_{n+c_1} + h_{n+1}^2\, \gamma_{20} f_n'. \tag{55}
\]

(P3) A Hermite–Birkhoff polynomial of degree 3 is used as predictor P3 to obtain $y_{n+c_3}$ to order 2,
\[
y_{n+c_3} = y_n + h_{n+1} \left( \sum_{j=1}^{2} a_{3j} f_{n+c_j} \right) + h_{n+1}^2\, \gamma_{30} f_n'. \tag{56}
\]

(P4) A Hermite–Birkhoff polynomial of degree 4 is used as predictor P4 to obtain $y_{n+c_4}$ to order 2,
\[
y_{n+c_4} = y_n + h_{n+1} \left( \sum_{j=1}^{3} a_{4j} f_{n+c_j} \right) + h_{n+1}^2\, \gamma_{40} f_n'. \tag{57}
\]

(IF) A Hermite–Birkhoff polynomial of degree 5 is used as integration formula IF to obtain $y_{n+1}$ to order 5,
\[
y_{n+1} = y_n + h_{n+1} \left( \sum_{j=1}^{4} b_{1j} f_{n+c_j} \right) + h_{n+1}^2\, \gamma_{10} f_n'. \tag{58}
\]

(P5) A Hermite–Birkhoff polynomial of degree 5 is used as step control predictor P5 to obtain $\tilde{y}_{n+1}$ to order 2,
\[
\tilde{y}_{n+1} = y_n + h_{n+1} \left( \sum_{j=1}^{3} a_{5j} f_{n+c_j} + a_{54} f_{n+1} \right) + h_{n+1}^2\, \gamma_{50} f_n'. \tag{59}
\]

The coefficients of integration formula IF, predictors P2, P3, P4 and step control predictor P5 of HBO(5)4S are obtained successively as solutions of systems similar to (15), (17), (19), (21) and (23), respectively, with the $c_i$ as in (40), by deleting the columns containing $\eta_j$ and the rows accordingly, since HBO(5)4S does not have the backstep parts. A one-step HBO(5)4S of order 5 is obtained which generalizes the 3/8-rule RK4,
\[
\begin{aligned}
y_{n+\frac13} &= y_n + \frac{h_{n+1}}{3} f_n + \frac{h_{n+1}^2}{18} f_n',\\
y_{n+\frac23} &= y_n + h_{n+1} \Bigl( f_{n+\frac13} - \frac{1}{3} f_n \Bigr) - \frac{h_{n+1}^2}{9} f_n',\\
\hat{y}_{n+1} &= y_n + h_{n+1} \Bigl( \frac{18}{13} f_{n+\frac23} - \frac{36}{13} f_{n+\frac13} + \frac{31}{13} f_n \Bigr) + \frac{h_{n+1}^2}{2} f_n',\\
y_{n+1} &= y_n + h_{n+1} \Bigl( \frac{13}{120} \hat{f}_{n+1} + \frac{9}{20} f_{n+\frac23} + \frac{9}{40} f_{n+\frac13} + \frac{13}{60} f_n \Bigr) + \frac{h_{n+1}^2}{60} f_n',\\
\tilde{y}_{n+1} &= y_n + h_{n+1} \Bigl( \frac{1}{12} f_{n+1} + \frac{479}{1000} f_{n+\frac23} + \frac{21}{100} f_{n+\frac13} + \frac{683}{3000} f_n \Bigr) + \frac{41}{1500} h_{n+1}^2 f_n'.
\end{aligned}
\tag{60}
\]
The region of absolute stability of HBO(5)4S is shown in grey in Fig. 4 with abscissa of absolute stability $\alpha = -3.18$.
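One step of the starter can be sketched in Python for a scalar equation, with the coefficients as we read them from (60). This is a minimal illustration and not the authors' implementation: the function name `hbo5_4s_step` and the argument `df`, a user-supplied total derivative $f' = y''$ along the solution, are assumptions of this sketch, and the returned difference $|y_{n+1} - \tilde{y}_{n+1}|$ stands in for the error estimate of the step control predictor.

```python
import math


def hbo5_4s_step(f, df, x, y, h):
    """One step of the one-step starter with the coefficients read from (60).

    f(x, y) is the right-hand side and df(x, y) its total derivative y''.
    Returns the new value y_{n+1} and |y_{n+1} - ytilde_{n+1}|.
    """
    fn = f(x, y)
    dfn = df(x, y)
    # Predictors P2, P3 and P4.
    y13 = y + h * fn / 3 + h * h * dfn / 18
    f13 = f(x + h / 3, y13)
    y23 = y + h * (f13 - fn / 3) - h * h * dfn / 9
    f23 = f(x + 2 * h / 3, y23)
    y_hat = y + h * (18 * f23 - 36 * f13 + 31 * fn) / 13 + h * h * dfn / 2
    f_hat = f(x + h, y_hat)
    # Integration formula IF.
    y_new = (y + h * (13 * f_hat / 120 + 9 * f23 / 20 + 9 * f13 / 40 + 13 * fn / 60)
             + h * h * dfn / 60)
    # Step control predictor P5.
    f_new = f(x + h, y_new)
    y_ctl = (y + h * (f_new / 12 + 479 * f23 / 1000 + 21 * f13 / 100 + 683 * fn / 3000)
             + 41 * h * h * dfn / 1500)
    return y_new, abs(y_new - y_ctl)


# Test equation y' = y, for which y'' = y as well.
y1, est = hbo5_4s_step(lambda x, y: y, lambda x, y: y, 0.0, 1.0, 0.1)
print(y1, math.exp(0.1), est)
```

For $y' = y$ the stage values combine so that $y_{n+1}$ equals the Taylor polynomial $1 + h + h^2/2 + \cdots + h^5/120$ of $e^h$, which is consistent with the stated order 5.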
The principal error term of HBO(5)4S is
\[
\Bigl[ \delta_1 \{f^5\} + 10\delta_2 \{\{f^2\} f^2\} + 5\delta_3 \{\{f^3\} f\} + 5\delta_4 \{\{{}_2 f^2\}_2 f\} + \delta_5 \{{}_2 f^4\}_2 + 4\delta_6 \{{}_2 \{f^2\} f\}_2 + \delta_7 \{{}_3 f^3\}_3 + \delta_8 \{{}_4 f^2\}_4 \Bigr] h^6, \tag{61}
\]
where $\{\cdot\}$ and $\{\cdot\}_j$ are elementary differentials defined in [6], [17] and [12]. The principal local truncation coefficients of the principal error term of HBO(5)4S are
\[
[\delta_1, 10\delta_2, 5\delta_3, 5\delta_4, \delta_5, 4\delta_6, \delta_7, \delta_8] = \left[ \frac{1}{64800},\ 0,\ \frac{-5}{10800},\ \frac{-5}{3600},\ \frac{1}{12960},\ 0,\ \frac{1}{2160},\ \frac{-1}{720} \right],
\]
with $\ell_2$-norm 2.07e-03.

Figure 4: Region of absolute stability of starting HBO(5)4S.

References

[1] R. F. Arenstorf, Periodic solutions of the restricted three-body problem representing analytic continuations of Keplerian elliptic motions, Amer. J. Math., LXXXV (1963), 27–35.
[2] R. Ashino, M. Nagase and R. Vaillancourt, Behind and beyond the Matlab ODE Suite, Comput. & Math. with Applics., 40 (2000), 491–512.
[3] R. Barrio, F. Blesa and M. Lara, VSVO formulation of the Taylor method for the numerical solution of ODEs, Comput. & Math. with Applics., 50 (2005), 93–111.
[4] A. Björck and T. Elfving, Algorithms for confluent Vandermonde systems, Numer. Math., 21 (1973), 130–137.
[5] A. Björck and V. Pereyra, Solution of Vandermonde systems of equations, Math. Comp., 24 (1970), 893–903.
[6] J. C. Butcher, Coefficients for the study of Runge–Kutta integration processes, J. Aust. Math. Soc., 3 (1963), 185–201.
[7] J. C. Butcher, A modified multistep method for the numerical integration of ordinary differential equations, J. Assoc. Comput. Mach., 12 (1965), 124–135.
[8] W. H. Enright and T. E. Hull, The test results on initial value methods for non-stiff ordinary differential equations, SIAM J. Numer. Anal., 13 (1976), 944–961.
[9] G. Galimberti and V. Pereyra, Solving confluent Vandermonde systems of Hermite type, Numer. Math., 18 (1971), 44–60.
[10] C. W. Gear, The numerical integration of ordinary differential equations, Math. Comp., 21 (1967), 146–156.
[11] C. W.
Gear, Numerical Initial Value Problems in Ordinary Differential Equations, Prentice-Hall, Englewood Cliffs, NJ, 1971.
[12] E. Hairer, S. P. Nørsett and G. Wanner, Solving Ordinary Differential Equations I. Nonstiff Problems, Section III.8, Springer-Verlag, Berlin, 1993.
[13] T. E. Hull, W. H. Enright, B. M. Fellen, and A. E. Sedgwick, Comparing numerical methods for ordinary differential equations, SIAM J. Numer. Anal., 9 (1972), 603–637.
[14] T. Y. Huang and K. Innanen, A survey of multiderivative multistep integrators, Astronomical J., 112(3) (1996), 1254–1262.
[15] F. T. Krogh, VODQ/SVDQ/DVDQ – variable order integrators for the numerical solution of ordinary differential equations, TU Doc. No. CP-2308, NPO-11643, May 1969, Jet Propulsion Laboratory, Pasadena, CA.
[16] F. T. Krogh, Changing stepsize in the integration of differential equations using modified divided differences, in Proc. Conf. on the Numerical Solution of Ordinary Differential Equations, University of Texas at Austin 1972 (Ed. D. G. Bettis), Lecture Notes in Mathematics No. 362, Springer-Verlag, Berlin, 22–71, 1974.
[17] J. D. Lambert, Computational Methods in Ordinary Differential Equations, Ch. 5, Wiley, London, 1973.
[18] W. E. Milne, A note on the numerical integration of differential equations, J. Res. Nat. Bur. Standards, 43 (1949), 537–542.
[19] T. Nguyen-Ba and R. Vaillancourt, Hermite–Birkhoff differential equation solvers, Scientific Proceedings of Riga Technical University, 5-th series: Computer Science, 46-th thematic issue, 21 (2004), 47–64.
[20] T. Nguyen-Ba and R. Vaillancourt, Hermite–Birkhoff–Obrechkoff 3-stage ODE Solver of order 14, Can. Appl. Math. Quarterly, 13(2) (Summer 2005), 171–201.
[21] T. Nguyen-Ba, H. Yagoub, Y. Li and R. Vaillancourt, Variable-step variable-order 3-stage Hermite–Birkhoff ODE Solver of order 5 to 15, Can. Appl. Math. Quarterly. In press.
[22] T. Nguyen-Ba, H. Yagoub, Y. Zhang and R.
Vaillancourt, Variable-step variable-order 3-stage Hermite–Birkhoff–Obrechkoff ODE Solver of order 4 to 14, submitted to the Can. Appl. Math. Quarterly.
[23] N. Obrechkoff, Neue Quadraturformeln, Abh. Preuss. Akad. Wiss. Math. Nat. Kl., No. 4, (1940), 1–20.
[24] P. J. Prince and J. R. Dormand, High order embedded Runge–Kutta formulae, J. Comput. Appl. Math., 7(1) (1981), 67–75.
[25] E. Rabe, Determination and survey of periodic Trojan orbits in the restricted problem of three bodies, Astronomical J., 66(9) (November 1961), 500–513.
[26] L. F. Shampine and M. K. Gordon, Computer Solution of Ordinary Differential Equations: The Initial Value Problem, Freeman, San Francisco, CA, 1975.
[27] L. F. Shampine and M. W. Reichelt, The Matlab ODE suite, SIAM J. Sc. Comp., 18(1) (1997), 1–22.
[28] P. W. Sharp, Numerical comparison of explicit Runge–Kutta pairs of orders four through eight, Trans. on Mathematical Software, 17 (1991), 387–409.
