
BIT
2002, Vol. 42, No. 1, pp. 206–213
0006-3835/02/4201-0206 $16.00
© Swets & Zeitlinger
A UNIFIED CONVERGENCE THEORY FOR
NEWTON-TYPE METHODS FOR ZEROS OF
NONLINEAR OPERATORS IN BANACH SPACES ∗
XINGHUA WANG¹, CHONG LI², and MING-JUN LAI³ †
¹ Department of Mathematics, Zhejiang University, Hangzhou, 310028, P. R. China
² Department of Applied Mathematics, Southeast University, Nanjing, 210096, P. R. China
³ Department of Mathematics, University of Georgia, Athens, GA 30602, USA. email: mjlai@math.uga.edu
Abstract.
The paper is concerned with the convergence of Newton-type methods for finding zeros of nonlinear operators in Banach spaces. Several families of nonlinear operators are defined by different Lipschitz conditions, and a “universal constant” is introduced so that a unified convergence criterion for these methods is established for the defined families.
AMS subject classification: 47H10, 65J15, 65H10.
Key words: Newton’s method, Banach space, convergence.
1 Introduction.
Let $E$ and $F$ be real or complex Banach spaces with norm $\|\cdot\|$. For $x_0 \in E$ and $r > 0$, let $B(x_0, r)$ and $\overline{B(x_0, r)}$ denote the open and closed balls with radius $r$ and center $x_0$, respectively. Let $f : \overline{B(x_0, r)} \to F$ be a nonlinear operator with continuous Fréchet derivative $f'$. In this paper, we assume that the inverse $f'(x_0)^{-1}$ of $f'(x_0)$ exists. The Newton method and its variations are the major numerical methods for solving the equation $f(x) = 0$. Regarding the existence and uniqueness of the solution and the convergence of Newton’s method, the most famous result is the Kantorovich theorem.
Theorem 1.1 (Kantorovich Theorem [6]). Suppose that $f'(x_0)^{-1} f'(x)$ satisfies
$$\|f'(x_0)^{-1}(f'(x') - f'(x))\| \le \gamma \|x' - x\|, \quad \forall x', x,\ \|x' - x\| + \|x - x_0\| \le \frac{1}{\gamma}. \tag{1.1}$$
Let $\beta = \|f'(x_0)^{-1} f(x_0)\|$. If $\gamma\beta \le 1/2$, then the Newton sequence $\{x_n\}$ defined by
$$x_{n+1} = x_n - f'(x_n)^{-1} f(x_n), \quad n = 0, 1, 2, \ldots, \tag{1.2}$$
converges to the unique solution of the equation $f(x) = 0$ in $\overline{B(x_0, 1/\gamma)}$.
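As an illustration of Theorem 1.1, the following minimal sketch checks the condition $\gamma\beta \le 1/2$ and then runs the Newton iteration (1.2) for a scalar equation. The test function $f(x) = x^2 - 2$, the starting point $x_0 = 1.5$ and the helper name `newton` are assumptions chosen for illustration only; they do not come from the paper.

```python
import math

def newton(f, df, x0, steps=20, tol=1e-14):
    """Newton iteration (1.2) for a scalar function f with derivative df."""
    x = x0
    for _ in range(steps):
        step = f(x) / df(x)
        x -= step
        if abs(step) < tol:
            break
    return x

# Illustrative example (not from the paper): f(x) = x^2 - 2, x0 = 1.5.
f = lambda x: x * x - 2.0
df = lambda x: 2.0 * x
x0 = 1.5

# Here f'(x0)^{-1} (f'(x') - f'(x)) = (2 / f'(x0)) (x' - x),
# so the Lipschitz constant in (1.1) is gamma = 2 / |f'(x0)|.
gamma = 2.0 / abs(df(x0))
beta = abs(f(x0) / df(x0))            # beta = |f'(x0)^{-1} f(x0)|
kantorovich_ok = gamma * beta <= 0.5  # hypothesis of Theorem 1.1

root = newton(f, df, x0)
```

For this example the hypothesis holds with room to spare, and the computed `root` lies within distance $1/\gamma$ of $x_0$.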
∗ Received September 2000. Revised April 2001. Communicated by Åke Björck.
† This project is jointly supported by the Special Funds for Major State Basic Research Projects (Grant No. G19990328), the National (Grant No. 19971013) and Jiangsu Provincial (Grant No. BK99001) Natural Science Foundation of China.
Since then, many similar convergence results for the Newton method and its variations have been established in the literature under various similar conditions; see, for example, [7, 12, 5, 1]. However, there is no unified convergence theorem. The purpose of the present paper is to establish such a theorem. After reviewing some preliminary results on majorizing functions and recalling the definitions of the Newton method and its variations in Section 2, we prove some convergence theorems for the Newton method in Section 3 and for its variations in Section 4. Finally, we present some conclusions in Section 5.
2 Preliminaries.
In our study, the following cubic majorizing function $h$ plays a key role:
$$h(t) = \beta - t + \frac{1}{2}\gamma t^2 + \frac{1}{6} L t^3, \tag{2.1}$$
where $\beta$, $\gamma$, $L$ are some fixed constants.
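Proposition 2.1. Let
$$r = \frac{2}{\gamma + \sqrt{\gamma^2 + 2L}}, \qquad b = \frac{2\left(\gamma + 2\sqrt{\gamma^2 + 2L}\right)}{3\left(\gamma + \sqrt{\gamma^2 + 2L}\right)^2}. \tag{2.2}$$
Then the function $h$ is monotonically decreasing on $[0, r]$ and monotonically increasing on $[r, +\infty)$. Moreover, if $\beta \le b$, then
$$h(r) = \beta - b \le 0, \qquad h(\beta) > 0, \qquad h(+\infty) > \beta > 0.$$
Thus $h$ has a unique zero in each of the two intervals $[0, r]$ and $[r, +\infty)$, denoted by $t^*$ and $t^{**}$, respectively. They satisfy
$$\beta < t^* < \frac{r}{b}\beta < r < t^{**} \tag{2.3}$$
when $\beta < b$, and $t^* = t^{**}$ when $\beta = b$.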
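The quantities in Proposition 2.1 are easy to check numerically. The sketch below evaluates $r$ and $b$ from (2.2), locates the two zeros of $h$ by bisection, and exposes the values needed to verify $h'(r) = 0$, $h(r) = \beta - b$ and the chain (2.3). The sample values $\beta = 0.3$, $\gamma = L = 1$, the helper names and the bracket $10r$ for the larger zero are assumptions made for illustration.

```python
import math

def majorant_constants(gamma, L):
    """Return (r, b) as defined in (2.2)."""
    s = math.sqrt(gamma * gamma + 2.0 * L)
    r = 2.0 / (gamma + s)
    b = 2.0 * (gamma + 2.0 * s) / (3.0 * (gamma + s) ** 2)
    return r, b

def h(t, beta, gamma, L):
    """Cubic majorizing function (2.1)."""
    return beta - t + 0.5 * gamma * t * t + L * t ** 3 / 6.0

def bisect(g, lo, hi, iters=200):
    """Plain bisection; assumes g(lo) and g(hi) have opposite signs."""
    g_lo = g(lo)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        g_mid = g(mid)
        if (g_mid > 0) == (g_lo > 0):
            lo, g_lo = mid, g_mid   # zero lies in the upper half
        else:
            hi = mid                # zero lies in the lower half
    return 0.5 * (lo + hi)

# Illustrative values (assumptions, not from the paper); beta < b here.
beta, gamma, L = 0.3, 1.0, 1.0
r, b = majorant_constants(gamma, L)
g = lambda t: h(t, beta, gamma, L)

dh_at_r = -1.0 + gamma * r + 0.5 * L * r * r   # h'(r), expected to vanish
t_star = bisect(g, 0.0, r)                     # zero in [0, r]
t_star2 = bisect(g, r, 10.0 * r)               # zero in [r, +inf), ad hoc bracket
```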
Next we introduce some families of nonlinear operators. For $\gamma$, $L$, $r$, $b$ given as above, let $C^1(x_0, r)$ denote the set of all operators $f$ mapping from $\overline{B(x_0, r)}$ to $F$ such that $f'$ is continuous on $\overline{B(x_0, r)}$ and $f'(x_0)^{-1}$ exists, while $C^2(x_0, r)$ denotes the subset of $C^1(x_0, r)$ such that $f''$ is continuous on $\overline{B(x_0, r)}$. Recall the Kantorovich condition (1.1). Let us give more similar conditions:
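$$\|f'(x_0)^{-1}(f'(x) - f'(x_0))\| \le \gamma\|x - x_0\|, \quad \forall x,\ \|x - x_0\| \le r, \tag{2.4}$$
$$\|f'(x_0)^{-1}(f'(x') - f'(x))\| \le \left(\gamma + L\|x - x_0\| + \tfrac{1}{2}L\|x' - x\|\right)\|x' - x\|, \quad \forall x', x,\ \|x' - x\| + \|x - x_0\| \le r, \tag{2.5}$$
$$\|f'(x_0)^{-1}(f'(x) - f'(x_0))\| \le \left(\gamma + \tfrac{1}{2}L\|x - x_0\|\right)\|x - x_0\|, \quad \forall x,\ \|x - x_0\| \le r, \tag{2.6}$$
$$\|f'(x_0)^{-1} f''(x_0)\| \le \gamma, \tag{2.7}$$
$$\|f'(x_0)^{-1}(f''(x') - f''(x))\| \le L\|x' - x\|, \quad \forall x', x,\ \|x' - x\| + \|x - x_0\| \le r, \tag{2.8}$$
$$\|f'(x_0)^{-1}(f''(x) - f''(x_0))\| \le L\|x - x_0\|, \quad \forall x,\ \|x - x_0\| \le r. \tag{2.9}$$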
We define
$$K^{(1)}(x_0, \gamma) = \{f \in C^1(x_0, r) : \text{(1.1) holds}\},$$
$$K^{(1)}_{\mathrm{cent}}(x_0, \gamma) = \{f \in C^1(x_0, r) : \text{(2.4) holds}\},$$
$$K^{(1)}(x_0, \gamma, L) = \{f \in C^1(x_0, r) : \text{(2.5) holds}\},$$
$$K^{(1)}_{\mathrm{cent}}(x_0, \gamma, L) = \{f \in C^1(x_0, r) : \text{(2.6) holds}\},$$
$$K^{(2)}(x_0, \gamma, L) = \{f \in C^2(x_0, r) : \text{(2.7) and (2.8) hold}\},$$
$$K^{(2)}_{\mathrm{cent}}(x_0, \gamma, L) = \{f \in C^2(x_0, r) : \text{(2.7) and (2.9) hold}\}.$$
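Then we have the following:
Proposition 2.2.
(i) $K^{(1)}_{\mathrm{cent}}(x_0, \gamma) = K^{(1)}_{\mathrm{cent}}(x_0, \gamma, 0)$;
(ii) $K^{(1)}(x_0, \gamma) = K^{(1)}(x_0, \gamma, 0)$;
(iii) $K^{(2)}(x_0, \gamma, L) \subset K^{(2)}_{\mathrm{cent}}(x_0, \gamma, L) \subset K^{(1)}(x_0, \gamma, L) \subset K^{(1)}_{\mathrm{cent}}(x_0, \gamma, L)$.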
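Proof. It suffices to prove that
$$K^{(2)}_{\mathrm{cent}}(x_0, \gamma, L) \subset K^{(1)}(x_0, \gamma, L), \tag{2.10}$$
since the other relations are clear.
Observe that for any $x'$, $x$ with $\|x' - x\| + \|x - x_0\| \le r$,
$$\begin{aligned}
\|f'(x_0)^{-1}(f'(x') - f'(x))\| &\le \|f'(x_0)^{-1} f''(x_0)\|\,\|x' - x\| + \left\|\int_0^1 f'(x_0)^{-1}\bigl(f''(x + \tau(x' - x)) - f''(x_0)\bigr)\,d\tau\right\| \|x' - x\| \\
&\le \left(\gamma + L\|x - x_0\| + \tfrac{1}{2}L\|x' - x\|\right)\|x' - x\|,
\end{aligned}$$
so that (2.10) holds. This completes the proof.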
We also need some other lemmas. Their proofs are straightforward and hence are omitted here. Throughout the paper, we always denote $\beta = \|f'(x_0)^{-1} f(x_0)\|$.
Lemma 2.3. Let $f \in K^{(1)}_{\mathrm{cent}}(x_0, \gamma, L)$ and suppose that $\beta \le b$. Then for any $x \in B(x_0, r)$, $f'(x)^{-1}$ exists and satisfies
$$\|f'(x)^{-1} f'(x_0)\| \le -\frac{1}{h'(\|x - x_0\|)}. \tag{2.11}$$
Lemma 2.4. Let $f \in K^{(2)}_{\mathrm{cent}}(x_0, \gamma, L)$ and suppose that $\beta \le b$. Then for any $x \in B(x_0, r)$,
$$\|f'(x_0)^{-1} f''(x)\| \le h''(\|x - x_0\|). \tag{2.12}$$
3 Convergence theorems for the Newton method.
The following two theorems can be obtained from Theorems 3.1 and 1.5 in [9]
by taking L(u) = γ + Lu. However, to make this paper self-contained, we include
their proofs here.
Theorem 3.1. Suppose that $f \in K^{(1)}(x_0, \gamma, L)$. If $\beta \le b$, then the Newton sequence $\{x_n\}$ defined by (1.2) converges to the solution of the equation $f(x) = 0$ in $\overline{B(x_0, r)}$.
Proof. Let $\{t_n\}$ denote the sequence generated by the Newton method (1.2) for the majorizing function $h$ with the initial point $t_0 = 0$. It is easy to check that $t_n$ increases monotonically to $t^*$. We will inductively show the following inequality:
$$\|x_n - x_{n-1}\| \le t_n - t_{n-1}, \quad n = 1, 2, \ldots. \tag{3.1}$$
Obviously, (3.1) holds for $n = 1$. Now assume that (3.1) holds for some $n$. Then $x_{n-1+\tau} \in \overline{B(x_0, t^*)}$ for all $0 \le \tau \le 1$, where
$$x_{n-1+\tau} = x_{n-1} + \tau(x_n - x_{n-1}). \tag{3.2}$$
Set
$$t_{n-1+\tau} = t_{n-1} + \tau(t_n - t_{n-1}). \tag{3.3}$$
We have that
$$\begin{aligned}
f(x_n) &= f(x_n) - f(x_{n-1}) - f'(x_{n-1})(x_n - x_{n-1}) \\
&= \int_0^1 \{f'(x_{n-1+\tau}) - f'(x_{n-1})\}(x_n - x_{n-1})\,d\tau.
\end{aligned}$$
Using the condition (2.5), it follows from the induction assumption that
$$\begin{aligned}
\|f'(x_0)^{-1} f(x_n)\| &\le \int_0^1 \left(\gamma + L\|x_{n-1} - x_0\| + \tfrac{1}{2}L\tau\|x_n - x_{n-1}\|\right)\tau\|x_n - x_{n-1}\|^2\,d\tau \\
&\le \int_0^1 \left(\gamma + L t_{n-1} + \tfrac{1}{2}L\tau(t_n - t_{n-1})\right)\tau(t_n - t_{n-1})^2\,d\tau \\
&= h(t_n).
\end{aligned}$$
Then, from Proposition 2.2 and Lemma 2.3, we have that
$$\|x_{n+1} - x_n\| \le \|f'(x_n)^{-1} f'(x_0)\|\,\|f'(x_0)^{-1} f(x_n)\| \le -\frac{h(t_n)}{h'(t_n)} = t_{n+1} - t_n,$$
that is, (3.1) holds for all $n = 1, 2, \ldots$.
Therefore, $\{x_n\} \subset \overline{B(x_0, t^*)}$ converges to an element, say, $x^*$, and $\{f'(x_n)\}$ is uniformly bounded by condition (2.5). It follows that $f(x^*) = 0$, proving the theorem.
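The majorization (3.1) can be observed numerically. The following sketch runs the Newton iteration (1.2) both on a scalar test function and on the majorizing function $h$ of (2.1), then compares the step lengths. The test function, the choice $x_0 = 0$, the values $\beta = 0.3$, $\gamma = L = 1$ and the rounding tolerance $10^{-12}$ are assumptions made for illustration; they are not taken from the paper.

```python
def newton_steps(f, df, x0, n):
    """Return the Newton iterates x_0, ..., x_n of (1.2) for a scalar f."""
    xs = [x0]
    for _ in range(n):
        x = xs[-1]
        xs.append(x - f(x) / df(x))
    return xs

# Illustrative scalar example (an assumption, not from the paper), x0 = 0:
# f'(0) = 1, f''(0) = 1 and f'' is Lipschitz with constant 1, so f satisfies
# (2.7) and (2.8) with gamma = L = 1 and, by Proposition 2.2, also (2.5).
beta, gamma, L = 0.3, 1.0, 1.0
f = lambda x: x ** 3 / 6.0 + x ** 2 / 2.0 + x - beta
df = lambda x: x ** 2 / 2.0 + x + 1.0

# Majorizing function (2.1) and its derivative.
h = lambda t: beta - t + gamma * t ** 2 / 2.0 + L * t ** 3 / 6.0
dh = lambda t: -1.0 + gamma * t + L * t ** 2 / 2.0

n = 6
xs = newton_steps(f, df, 0.0, n)   # Newton sequence {x_n}
ts = newton_steps(h, dh, 0.0, n)   # majorizing sequence {t_n}, t_0 = 0

# Inequality (3.1): |x_k - x_{k-1}| <= t_k - t_{k-1}, up to rounding error.
majorized = all(
    abs(xs[k] - xs[k - 1]) <= ts[k] - ts[k - 1] + 1e-12
    for k in range(1, n + 1)
)
```

In this example $\beta = 0.3$ is below $b \approx 0.3987$, so the hypotheses of Theorem 3.1 are met and `majorized` is expected to be true.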
Theorem 3.2. Suppose that $f \in K^{(1)}_{\mathrm{cent}}(x_0, \gamma, L)$. If $\beta \le b$, then the equation $f(x) = 0$ has a unique solution in the closed ball $\overline{B(x_0, r)}$.
Proof. In fact, we can show the following two more general results:
Claim I. Suppose that $f$ satisfies condition (2.6) in $\overline{B(x_0, t^*)}$. Then $f(x) = 0$ has at least one solution in $\overline{B(x_0, t^*)}$ when $\beta \le b$.
Claim II. Suppose that $f$ satisfies condition (2.6) in $\overline{B(x_0, \xi)}$, where $\xi = t^*$ if $\beta = b$ and $t^* \le \xi < t^{**}$ if $\beta < b$. Then $f(x) = 0$ has a unique solution in $\overline{B(x_0, \xi)}$ when $\beta \le b$.
Proof of Claim I. Define
$$x_{n+1} = x_n - f'(x_0)^{-1} f(x_n), \quad n = 0, 1, 2, \ldots,$$
$$t_{n+1} = t_n + h(t_n), \quad n = 0, 1, 2, \ldots, \quad t_0 = 0.$$
Similarly, we can inductively show that inequality (3.1) holds for the above defined $\{x_n\}$. Indeed, using the same notation as in (3.2) and (3.3), we have
$$x_{n+1} - x_n = -\int_0^1 \{f'(x_0)^{-1} f'(x_{n-1+\tau}) - I\}(x_n - x_{n-1})\,d\tau.$$
From condition (2.6) and the induction assumption it follows that
$$\begin{aligned}
\|x_{n+1} - x_n\| &\le \int_0^1 \left(\gamma + \tfrac{1}{2}L\|x_{n-1+\tau} - x_0\|\right)\|x_{n-1+\tau} - x_0\|\,\|x_n - x_{n-1}\|\,d\tau \\
&\le \int_0^1 \left(\gamma + \tfrac{1}{2}L t_{n-1+\tau}\right) t_{n-1+\tau}(t_n - t_{n-1})\,d\tau \\
&= \int_0^1 \{h'(t_{n-1+\tau}) + 1\}(t_n - t_{n-1})\,d\tau = t_{n+1} - t_n.
\end{aligned}$$
This proves that $\{x_n\} \subset \overline{B(x_0, t^*)}$ and that it converges to a solution $x^*$ of $f(x) = 0$, since $t_n$ increases monotonically to $t^*$. The proof of Claim I is complete.
Proof of Claim II. In order to show Claim II, for any $x_0' \in \overline{B(x_0, \xi)}$, define two sequences as follows:
$$x'_{n+1} = x'_n - f'(x_0)^{-1} f(x'_n), \quad n = 0, 1, \ldots,$$
$$t'_{n+1} = t'_n + h(t'_n), \quad n = 0, 1, \ldots,$$
where $t'_0 = \|x'_0 - x_0\|$. Let $x_n^\tau = x_n + \tau(x'_n - x_n)$. Then
$$x'_{n+1} - x_{n+1} = -\int_0^1 \{f'(x_0)^{-1} f'(x_n^\tau) - I\}(x'_n - x_n)\,d\tau.$$
With similar arguments as before, we can obtain that
$$\|x'_n - x_n\| \le t'_n - t_n, \quad n = 0, 1, 2, \ldots,$$
so that $x'_n$ converges to $x^*$, the limit of $x_n$, too. This proves the conclusion of Claim II and completes the proof of the theorem.
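The simplified iteration used in the proof of Claim I, which freezes the derivative at $x_0$, can be sketched in the same way. The code below iterates $x_{n+1} = x_n - f'(x_0)^{-1} f(x_n)$ together with $t_{n+1} = t_n + h(t_n)$ and compares step lengths. The scalar test function, $x_0 = 0$, the constants $\beta = 0.3$, $\gamma = L = 1$ and the tolerance are illustrative assumptions, not data from the paper.

```python
def iterate(step, start, n):
    """Return [u_0, ..., u_n] with u_{k+1} = u_k + step(u_k)."""
    us = [start]
    for _ in range(n):
        us.append(us[-1] + step(us[-1]))
    return us

# Illustrative scalar example (an assumption, not from the paper), x0 = 0.
# Here f'(x) - f'(0) = x + x^2 / 2, so (2.6) holds with gamma = L = 1.
beta, gamma, L = 0.3, 1.0, 1.0
f = lambda x: x ** 3 / 6.0 + x ** 2 / 2.0 + x - beta
df0 = 1.0                                    # f'(x0) at x0 = 0
h = lambda t: beta - t + gamma * t ** 2 / 2.0 + L * t ** 3 / 6.0

n = 60
xs = iterate(lambda x: -f(x) / df0, 0.0, n)  # x_{k+1} = x_k - f'(x0)^{-1} f(x_k)
ts = iterate(h, 0.0, n)                      # t_{k+1} = t_k + h(t_k), t_0 = 0

# Step-length comparison as in (3.1), up to rounding error.
majorized = all(
    abs(xs[k] - xs[k - 1]) <= ts[k] - ts[k - 1] + 1e-12
    for k in range(1, n + 1)
)
```

Both sequences converge only linearly here, which is why many more iterations are used than for the Newton method itself.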
It should be remarked that, in Claim I and Claim II, the restrictions on the nonlinear operator $f$ depend upon $t^*$ and $t^{**}$, and so upon the majorizing function $h$, although the claims are more general than Theorem 3.2. Note that the family of nonlinear operators in Theorem 3.2 is independent of $h$ and $\beta$.
Obviously, when $L = 0$, Theorems 3.1 and 3.2 give the Kantorovich theorem. Moreover, together with Proposition 2.2, they also give the following result, which is the main result obtained by Huang [5] for $f \in K^{(2)}(x_0, \gamma, L)$ and by Gutiérrez [1] for $f \in K^{(2)}_{\mathrm{cent}}(x_0, \gamma, L)$.
Corollary 3.3. Suppose that $f \in K^{(2)}(x_0, \gamma, L)$ or $f \in K^{(2)}_{\mathrm{cent}}(x_0, \gamma, L)$. If $\beta \le b$, then the Newton sequence $\{x_n\}$ defined by (1.2) converges to the unique solution of the equation $f(x) = 0$ in $\overline{B(x_0, r)}$.
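The reduction to the Kantorovich theorem at $L = 0$ can be checked directly from (2.1) and (2.2). The short computation below is a worked illustration, assuming $\gamma > 0$:

```latex
\begin{align*}
L = 0:\quad
r &= \frac{2}{\gamma + \sqrt{\gamma^2}} = \frac{1}{\gamma}, \\
b &= \frac{2(\gamma + 2\gamma)}{3(\gamma + \gamma)^2}
   = \frac{6\gamma}{12\gamma^2} = \frac{1}{2\gamma}, \\
h(t) &= \beta - t + \tfrac{1}{2}\gamma t^2 .
\end{align*}
```

Hence the condition $\beta \le b$ becomes $\gamma\beta \le 1/2$ and the radius $r$ becomes $1/\gamma$, which are the quantities appearing in Theorem 1.1.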
4 Convergence theorems for some variations of the Newton method.
There are many modified Newton methods. However, the ones that clearly improve the computational efficiency are the following two variations, which were proposed by King [7] and Werner [12]:
$$x_{n+1} = x_n - f'(x_{m[n/m]})^{-1} f(x_n), \quad n = 0, 1, \ldots, \tag{4.1}$$
where $[n/m]$ denotes the integer part of $n/m$, and
$$\begin{aligned}
x_{n+1} &= x_n - f'(y_n)^{-1} f(x_n), \\
y_{n+1} &= x_{n+1} - \tfrac{1}{2} f'(y_n)^{-1} f(x_{n+1}),
\end{aligned} \quad n = 0, 1, \ldots, \quad y_0 = x_0. \tag{4.2}$$
We in particular recommend the iteration (4.2). Its convergence order is raised to $1 + \sqrt{2}$, although the number of function evaluations is twice that of the Newton method.
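The two iterations can be sketched for a scalar equation as follows. This is a minimal illustration of (4.1) and (4.2), not the authors' implementation; the function names `king` and `werner`, the test function, the starting point $x_0 = 0$ and the step counts are assumptions chosen for the example.

```python
def king(f, df, x0, m, n_steps):
    """Iteration (4.1): the derivative is re-evaluated only every m steps."""
    x = x0
    d = df(x0)
    for n in range(n_steps):
        if n % m == 0:            # n = m[n/m], so refresh f'(x_{m[n/m]})
            d = df(x)
        x = x - f(x) / d
    return x

def werner(f, df, x0, n_steps):
    """Iteration (4.2) with y_0 = x_0."""
    x, y = x0, x0
    for _ in range(n_steps):
        d = df(y)                 # f'(y_n), shared by both updates
        x = x - f(x) / d          # x_{n+1}
        y = x - 0.5 * f(x) / d    # y_{n+1}, uses f(x_{n+1}) and the same f'(y_n)
    return x

# Illustrative scalar example (an assumption, not from the paper), x0 = 0.
f = lambda x: x ** 3 / 6.0 + x ** 2 / 2.0 + x - 0.3
df = lambda x: x ** 2 / 2.0 + x + 1.0

x_king = king(f, df, 0.0, m=2, n_steps=30)
x_werner = werner(f, df, 0.0, n_steps=10)
```

Each pass of `werner` evaluates the derivative once and the function twice, which is the trade-off described above.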
Theorem 4.1. Let $f \in K^{(1)}(x_0, \gamma, L)$ and suppose that $\beta \le b$. Then the sequence $\{x_n\}$ generated by (4.1) with the initial point $x_0$ converges to the unique solution $x^*$ of the equation $f(x) = 0$.
Proof. Let $\{t_n\}$ be the sequence corresponding to $\{x_n\}$ when the iteration (4.1) is applied to the real function $h$. Then it is easy to show that $\{t_n\}$ is monotonically increasing and tends to $t^*$. Furthermore, note that
$$\begin{aligned}
f(x_{n+1}) &= f(x_{n+1}) - f(x_n) - f'(x_{m[n/m]})(x_{n+1} - x_n) \\
&= \int_0^1 [f'(x_n + \tau(x_{n+1} - x_n)) - f'(x_n)]\,d\tau\,(x_{n+1} - x_n) \\
&\quad + [f'(x_n) - f'(x_{m[n/m]})](x_{n+1} - x_n).
\end{aligned}$$
From $f \in K^{(1)}(x_0, \gamma, L)$ and Lemma 2.3 we can inductively prove that
$$\|x_{n+1} - x_n\| \le t_{n+1} - t_n, \quad n = 0, 1, \ldots,$$
so that $\{x_n\}$ converges to the unique solution $x^*$ of the equation $f(x) = 0$, which completes the proof.
Theorem 4.2. Let $f \in K^{(2)}(x_0, \gamma, L)$ and suppose that $\beta \le b$. Then the sequence $\{x_n\}$ generated by (4.2) with the initial point $x_0$ converges to the unique solution $x^*$ of the equation $f(x) = 0$.
Proof. Let $\{x_n\}$ and $\{y_n\}$ be the two sequences defined by (4.2). We introduce the sequence $\{z_n\}$ such that $y_n = \tfrac{1}{2}(x_n + z_n)$. Thus (4.2) can be rewritten as
$$\begin{aligned}
x_{n+1} &= x_n - f'(y_n)^{-1} f(x_n), \\
z_{n+1} &= x_{n+1} - f'(y_n)^{-1} f(x_{n+1}),
\end{aligned} \quad n = 0, 1, \ldots, \quad y_0 = x_0. \tag{4.3}$$
Let $\{t_n\}$, $\{s_n\}$ and $\{r_n\}$ be the sequences corresponding to $\{x_n\}$, $\{y_n\}$ and $\{z_n\}$ for the function $h$, where $s_0 = t_0 = 0$. Then we can inductively prove that
$$0 = t_0 = s_0 \le t_n \le r_n \le t_{n+1} \le r_{n+1} \le t^* \tag{4.5}$$
and $\lim_{n\to\infty} t_n = \lim_{n\to\infty} s_n = \lim_{n\to\infty} r_n = t^*$. Furthermore, observe that
$$\begin{aligned}
f(x_{n+1}) &= f(x_{n+1}) - f(x_n) - f'(y_n)(x_{n+1} - x_n) \\
&= \{f(x_{n+1}) - f(z_n) - f'(z_n)(x_{n+1} - z_n)\} + \{f'(z_n) - f'(y_n)\}(x_{n+1} - z_n) \\
&\quad + \{f(z_n) - f(y_n) - f'(y_n)(z_n - y_n)\} - \{f(x_n) - f(y_n) - f'(y_n)(x_n - y_n)\} \\
&= \int_0^1 [f'(z_n + \tau(x_{n+1} - z_n)) - f'(z_n)]\,d\tau\,(x_{n+1} - z_n) + \{f'(z_n) - f'(y_n)\}(x_{n+1} - z_n) \\
&\quad + \int_0^1 f''(y_n + \tau(z_n - y_n))(1 - \tau)\,d\tau\,(z_n - y_n)^2 - \int_0^1 f''(y_n + \tau(x_n - y_n))(1 - \tau)\,d\tau\,(x_n - y_n)^2 \\
&= \int_0^1 [f'(z_n + \tau(x_{n+1} - z_n)) - f'(z_n)]\,d\tau\,(x_{n+1} - z_n) + \{f'(z_n) - f'(y_n)\}(x_{n+1} - z_n) \\
&\quad + \int_0^1 [f''(y_n + \tau(y_n - x_n)) - f''(y_n + \tau(x_n - y_n))](1 - \tau)\,d\tau\,(x_n - y_n)^2,
\end{aligned}$$
using $z_n - y_n = y_n - x_n$. Hence, from Lemma 2.3 and Lemma 2.4, we can show, by induction, that $x_n, y_n, z_n \in \overline{B(x_0, t^*)}$, $n = 0, 1, \ldots$,
$$\|f'(y_n)^{-1} f(x_n)\| \le -h'(s_n)^{-1} h(t_n), \quad n = 0, 1, \ldots,$$
and
$$\|f'(y_n)^{-1} f(x_{n+1})\| \le -h'(s_n)^{-1} h(t_{n+1}), \quad n = 0, 1, \ldots,$$
so that
$$\|z_n - x_n\| \le r_n - t_n, \quad n = 0, 1, \ldots,$$
and
$$\|x_{n+1} - z_n\| \le t_{n+1} - r_n, \quad n = 0, 1, \ldots.$$
This completes the proof.
5 Conclusions.
We have established unified convergence theorems which include many known results on the convergence of the Newton method and its variations. It is worth remarking that the unification lies not only in the convergence theorems themselves, but also in the fact that they hold under the same condition $\beta \le b$. Therefore the constant $b$ is often called the “universal constant”. Moreover, the idea and technique developed in this paper can be used to establish the same result for Halley’s method [10, 2, 4, 13]; that is, Halley’s iteration sequence converges under the same condition $\beta \le b$. Thus Halley’s method is also included in our unified framework. In other words, the “universal constant” does not depend on the particular method used for solving a nonlinear equation. The condition $\beta \le b$ serves to guarantee the existence of zeros of the majorizing function $h$, so this condition is the same for Newton’s method, Newton-type methods, Halley’s method and others (Chebyshev, super-Halley, etc.). In short, our results achieve a high degree of unification.
REFERENCES
1. J. M. Gutiérrez, A new semilocal convergence theorem for Newton method, J. Comput. Appl. Math., 79 (1997), pp. 131–145.
2. D. F. Han and X. Wang, The error estimates of Halley’s method, Numer. Math.
JCU, 6 (1997), pp. 231–240.
3. D. F. Han and X. Wang, Convergence on a deformed Newton method, Appl. Math.
Comput., 94 (1998), pp. 65–72.
4. M. A. Hernández, A note on Halley’s method, Numer. Math., 59 (1991), pp. 273–276.
5. Z. D. Huang, A note on the Kantorovich theorem for Newton iteration, J. Comput.
Appl. Math., 47 (1993), pp. 211–217.
6. L. V. Kantorovich and G. P. Akilov, Functional Analysis, Pergamon Press, New
York, 1982.
7. R. F. King, Tangent method for nonlinear equations, Numer. Math., 18 (1972),
pp. 298–304.
8. X. Wang, On error estimates for some numerical root-finding methods, Acta Math.
Sinica, 22 (1979), pp. 638–642 (in Chinese).
9. X. Wang, Convergence of Newton’s method and inverse function in Banach spaces,
Math. Comp., 68 (1999), pp. 169–186.
10. X. Wang, Convergence of the iteration of Halley family and Smale operator class in
Banach space, Science in China (Ser. A), 41 (1998), pp. 700–709.
11. X. Wang, Convergence of the iteration of Halley’s family in weak condition, Chinese
Science Bulletin, 42 (1997), pp. 552–555.
12. W. Werner, Über ein Verfahren der Ordnung $1 + \sqrt{2}$ zur Nullstellenbestimmung,
Numer. Math., 32 (1979), pp. 333–342.
13. T. Yamamoto, On the method of tangent hyperbolas in Banach spaces, J. Comput.
Appl. Math., 21 (1988), pp. 75–86.