SIAM J. OPTIM. © 2013 Society for Industrial and Applied Mathematics
Vol. 23, No. 2, pp. 1237–1256
APPROXIMATE SOLUTIONS FOR ABSTRACT INEQUALITY
SYSTEMS∗
CHONG LI† AND K. F. NG‡
Abstract. We consider conic inequality systems of the type F(x) ≥K 0, with an approximate
solution x0 associated with a parameter τ, where F is a twice Fréchet differentiable function between
Hilbert spaces X and Y, and ≥K is the partial order in Y defined by a nonempty convex (not
necessarily closed) cone K ⊆ Y. We prove that, under suitable conditions, the system F(x) ≥K 0
is solvable and that the ratio of the distance from x0 to the solution set S over the distance from F(x0)
to the cone K has an upper bound given explicitly in terms of τ and x0. We show that this upper
bound is sharp. Applications to analytic function inequality/equality systems on Euclidean spaces
are given, and the corresponding results of Dedieu [SIAM J. Optim., 11 (2000), pp. 411–425] are
extended and significantly improved.
Key words. conic inequality systems, approximate solution, stability
AMS subject classifications. 47J99, 90C31
DOI. 10.1137/120885176
1. Introduction. The constraint sets in most constrained optimization problems can be described as conic inequality systems of the following type:
(1.1)  F(x) ≥K 0,
where F is a function between Banach spaces X, Y and Y is endowed with the partial
order (or preorder) ≥K defined by a nonempty convex cone K ⊆ Y . One of the most
important issues for conic inequality systems is the issue of error bounds. We say that
a constant c is a (local) error bound for (1.1) at x0 ∈ X if there exists a neighborhood
U of x0 such that
(1.2)  d(x, S) ≤ c · e(x)  for any x ∈ U,
where S denotes the solution set of (1.1) (consisting of all x satisfying (1.1)), d(·, S)
is the distance function associated with S defined by
d(x, S) := inf{‖x − z‖ : z ∈ S}  for each x ∈ X,
and e(·) is some suitable residual function measuring a degree of violation with respect
to (1.1).
In the case when X = R^n, Y = R^m, and K = R^m_− (consisting of all m-vectors
(α1, . . . , αm) ∈ R^m such that αi ≤ 0 for all i), system (1.1) reduces to a system of finitely
many inequalities. In this case, the most important and celebrated result regarding
∗ Received by the editors July 19, 2012; accepted for publication (in revised form) March 19, 2013;
published electronically June 20, 2013.
http://www.siam.org/journals/siopt/23-2/88517.html
† Department of Mathematics, Zhejiang University, Hangzhou 310027, P. R. China (cli@zju.
edu.cn). This author’s work was supported in part by the National Natural Science Foundation
of China (grant 11171300) and by Zhejiang Provincial Natural Science Foundation of China (grant
Y6110006).
‡ Department of Mathematics (and IMS), Chinese University of Hong Kong, Hong Kong, P. R.
China (kfng@math.cuhk.edu.hk). This author’s work was supported by an earmarked grant (GRF)
from the Research Grant Council of Hong Kong (Projects CUHK 403110 and 402612).
Copyright © by SIAM. Unauthorized reproduction of this article is prohibited.
the error bound is due to Hoffman, who proved that if F is affine and the solution set
S is nonempty, then there exists c > 0 such that (1.2) holds with e(·) = d(F(·), R^m_−)
and U = X (see [6, 7]). Extensions of Hoffman’s result to convex inequalities have
been well established in the literature; see, for example, [8, 9, 10, 11, 12, 19, 20, 28]
and references therein. However, the study of error bounds for nonconvex inequalities is
not as rich. In the special case when (1.1) is a single inequality system F(x) ≤ 0
(i.e., m = 1), Luo and Luo [13] and Luo and Pang [14] established the global error
bound inequality (1.2), for polynomial and analytic inequalities respectively, but with
an unknown fractional exponent applied to d(F(x), R−). Other related works in this direction
can be found in [15, 17, 18, 27].
For the general case when K is a closed convex cone, Ng and Yang [16] established
some error bound results for abstract linear inequality systems (i.e., F is affine) in
general Banach spaces, which in particular extend the corresponding results in finite
dimensional spaces due to Robinson [21]. For general differentiable systems, Robinson
[22] proved that if the so-called Robinson constraint qualification (RCQ) holds at
x0 ∈ S, then there exist c > 0 and a neighborhood U of x0 such that (1.2) holds
with e(x) = d(F (x), K). By weakening the RCQ assumption, this result has been
extended in [2] by He and Sun.
We observe that all results mentioned above are obtained under two additional assumptions: (a) K is closed, and (b) S is preassumed nonempty with x0 ∈ S. In
this paper we assume that X and Y are Hilbert spaces and that system (1.1) has an
“approximate solution” x0 in the sense that F (x0 ) + c ≥K 0 for some c ∈ Y with
small norm or small residual value, but we drop the two additional assumptions mentioned above. Given an approximate solution x0 , the main concern of this paper is to
show, under suitable assumptions, that the solution set S of (1.1) is nonempty and
its distance d(x0 , S) from x0 has an upper estimate in terms of x0 together with a parameter. In a more concrete situation we will consider, as applications, the following
system with Y = Rl ,
(1.3)  fi(x) > 0,  i = 1, . . . , lS,
       fi(x) ≥ 0,  i = lS + 1, . . . , l,

and the system with Y = R^{m+l} involving inequality and equality constraints,

(1.4)  fi(x) > 0,  i = 1, . . . , lS,
       fi(x) ≥ 0,  i = lS + 1, . . . , l,
       gj(x) = 0,  j = 1, . . . , m.
Here fi and gj are real-valued functions on X, and l, lS , m are nonnegative integers
such that 0 ≤ lS ≤ l. Associated with (1.3) and (1.4), let f : X → Rl , g : X → Rm ,
and (f, g) : X → Rl × Rm be defined, respectively, by
(1.5)
f (x) := (f1 (x), . . . , fl (x)), g(x) := (g1 (x), . . . , gm (x)), and (f, g)(x) := (f (x), g(x))
for each x ∈ X. We assume throughout that the involved functions fi, gj (and so f,
g and (f, g)) are analytic (at least locally around the point x under consideration);
thus we have the following real constants:
(1.6)  Γ(f; x) := sup_{k≥2} ‖f^{(k)}(x)/k!‖^{1/(k−1)}  and  Γ(f, g; x) := sup_{k≥2} ‖(f, g)^{(k)}(x)/k!‖^{1/(k−1)},
Copyright © by SIAM. Unauthorized reproduction of this article is prohibited.
APPROXIMATE SOLUTIONS FOR INEQUALITY SYSTEMS
1239
Downloaded 06/27/13 to 140.117.35.140. Redistribution subject to SIAM license or copyright; see http://www.siam.org/journals/ojsa.php
where, e.g., f^{(k)}(x) denotes the kth derivative of f at x and

(1.7)  ‖f^{(k)}(x)‖ := sup{‖f^{(k)}(x)(u1, . . . , uk)‖ : each ui ∈ X with ‖ui‖ ≤ 1}.
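As a hedged numerical illustration of (1.6)–(1.7) (ours, not the paper's): for the scalar analytic function f = exp at x = 0, every derivative norm ‖f^{(k)}(0)‖ equals 1, so the supremum in (1.6) is attained at k = 2 and Γ(exp; 0) = (1/2!)^{1/(2−1)} = 1/2. Truncating the supremum at a finite kmax suffices here because the terms decrease in k:

```python
import math

def gamma_const(deriv_norms, kmax=60):
    """Estimate Gamma(f; x) = sup_{k>=2} ||f^{(k)}(x)/k!||^{1/(k-1)}
    by truncating the supremum at kmax (an illustrative sketch)."""
    return max((deriv_norms(k) / math.factorial(k)) ** (1.0 / (k - 1))
               for k in range(2, kmax + 1))

# For f = exp at x = 0, ||f^{(k)}(0)|| = 1 for every k.
g = gamma_const(lambda k: 1.0)
print(g)  # 0.5, attained at k = 2 since (1/k!)^{1/(k-1)} decreases in k
```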
Let f′(x)∗ denote the conjugate (adjoint) of f′(x); thus f′(x)f′(x)∗ is a linear operator
from R^l into itself and is, as usual, represented as an l × l matrix. For a matrix A,
we use A† to denote the Moore–Penrose generalized inverse of A, and ‖A‖ the usual
operator norm of A. In connection with (1.4), one defines an (l + m) × (l + m) matrix
G(x) for x ∈ X by

(1.8)  G(x) := [ f′(x)f′(x)∗ + Diag(4f(x)+)    f′(x)g′(x)∗ ]
               [ g′(x)f′(x)∗                   g′(x)g′(x)∗ ],

where Diag(4f(x)+) is the diagonal matrix with diagonal entries 4fi(x)+, i = 1, . . . , l.
Here and throughout, we use the notations a+ := max{a, 0} and a− := max{−a, 0}
for any real number a. Let
(1.9)  δ(f; x) := ‖(f′(x)f′(x)∗ + Diag(4f(x)+))†‖^{1/2}  and  δ(f, g; x) := ‖G(x)†‖^{1/2}.
One of the main results of Dedieu in [3] for system (1.4) with X = R^n can be stated
as follows: if x0 ∈ R^n and σ := Σ_{k=0}^∞ (1/2)^{2^k − 1} = 1.63284 . . . are such that

(1.10)  fi(x0) > (σ δ(f; x0) ‖f(x0)−‖)²  for each i = 1, . . . , lS,

(1.11)  the matrix G(x0) is nonsingular,

and

(1.12)  (‖f(x0)−‖² + ‖g(x0)‖²)^{1/2} < 1 / (8(1 + Γ(f, g; x0)) δ(f, g; x0) max{1, δ(f, g; x0)}),

then the solution set (denoted by (f, g)≥) of (1.4) is nonempty, and its distance from
x0 satisfies

(1.13)  d(x0, (f, g)≥) ≤ σ δ(f, g; x0) (‖f(x0)−‖² + ‖g(x0)‖²)^{1/2}.
Similar treatments have also been done for system (1.3) with X = R^n by Dedieu;
see [3] and references therein for more details and background information. From
some points of view, assumptions (1.10) and (1.11) are too restrictive: for example,
when all fi with lS < i ≤ l and all gj are zero functions, the matrix G(x0) is clearly
singular, so the results of Dedieu mentioned above cannot be applied, even though in
this case (1.10) itself trivially entails that x0 is a solution of (1.4).
More generally, we study the corresponding issues for the general system (1.1), for
which we assume from now on that F is twice Fréchet differentiable (at least locally
around x ∈ X, with first and second derivatives F′(·) and F″(·)). We say that
F satisfies the γ-condition at x0 ∈ X on the open ball B(x0, r) (with center x0 ∈ X
and radius r > 0) if 0 ≤ rγ ≤ 1 and

(1.14)  ‖F′(x0)† F″(x)‖ ≤ 2γ / (1 − γ‖x − x0‖)³  for each x ∈ B(x0, r),
where F′(x0)† is the Moore–Penrose generalized inverse of F′(x0) and the norm of
F′(x0)† F″(x) is defined similarly as done in (1.7); see section 3 for more notation matters. The γ-condition (in the special case when F′(x0) is nonsingular) was
introduced by Wang (see [25, 26]) for improving the famous point estimate theory
presented by Smale [24]. An important example is of course the case when F is analytic: if F : X → Y is analytic on B(x0, r), then F satisfies the γ-condition at x0 on
B(x0, r0) with any γ ∈ [γF(x0), +∞) and r0 := min{r, 1/γ}, where the quantity γF(x0)
was introduced by Shub and Smale [23] (see also [1]) and is defined by

(1.15)  γF(x0) := sup_{k≥2} ‖F′(x0)† F^{(k)}(x0)/k!‖^{1/(k−1)}.
Our main results are described in terms of the γ-condition and cover cases when K
is not necessarily closed and/or the case when the nonsingularity is not assumed. As
we will see later, an advantage of our approach is that the corresponding constant
c in the error bound result can be pointwise determined explicitly. In particular,
our estimate is optimal in some sense; see Example 2.1. As applications, we extend and improve the results of Dedieu mentioned above for system (1.3) and system
(1.4). In particular, we show that for (1.13) to hold, both (1.11) and (1.12) can be
replaced by weaker assumptions, while (1.10) is not needed, as explained before Theorem 2.2 in section 2. Another by-product of our main results can be cast in the
stability point of view (Robinson initiated the study of stability in [21, 22] for (1.1)).
Under suitable assumptions, we show that if (1.1) is solvable, then the perturbed
systems
(1.16)  F(x) − y ≥K 0

(with ‖y‖ small enough) are also solvable, and the solution map y → S(y) enjoys
some calmness property.
The paper is organized as follows. In section 2, we summarize the main theorems
and their corollaries. Some basic concepts and known facts needed in the rest of the
paper are listed in section 3. Section 4 contains the proofs of the main theorems,
while applications are provided in section 5.
2. Main results. Throughout the paper let X and Y be (real) Hilbert spaces,
and K ⊆ Y be a cone which defines a binary relation ≥K by
y1 ≥K y2 ⇐⇒ y1 − y2 ∈ K.
Given a bounded linear operator A (or a matrix), let kerA, imA, and A∗ respectively
denote the kernel, the image, and the conjugate operator of A. Let A† denote the
Moore–Penrose generalized inverse of A when it exists uniquely (e.g., if imA is closed).
The following theorem is our main result regarding the solution set S := {x ∈ X :
F(x) ∈ K} of system (1.1), where we assume that F is a function from X into Y.
The usual convention that a/0 = +∞ is always adopted for any a > 0. Further, for
a nonempty set D in X (or in other normed spaces), we use the notation ‖D‖ to
denote the distance from the origin to D, namely ‖D‖ := inf{‖u‖ : u ∈ D}, with the
convention that ‖∅‖ := +∞. The closure and the relative interior of D are denoted by
D̄ and ri D, respectively. For an element x ∈ X, the notation D − x stands for the
set consisting of all elements d − x with d ∈ D.
Theorem 2.1. Let x0 ∈ X and r̄, γ ∈ [0, +∞). Let F : X → Y be continuous on
B̄(x0, r̄) and twice Fréchet differentiable on B(x0, r̄) with the following properties:
(1) im F′(x0) is closed;
(2) F satisfies the γ-condition at x0 on B(x0, r̄);
(3)

(2.1)  K ∪ {F(x0)} ∪ im F″(x) ⊆ im F′(x0)  for each x ∈ B(x0, r̄).

Let ξ, r∗ be defined by

(2.2)  ξ := ‖F′(x0)†(K − F(x0))‖

and

(2.3)  r∗ := (1 + γξ − √((1 + γξ)² − 8γξ)) / (4γ)  if 0 < γ ≤ (3 − 2√2)/ξ;   r∗ := ξ  if γ = 0.

Suppose that r∗ ≤ r̄ and that there exists τ ∈ R such that

(2.4)  1 < τ ≤ (2 + √2)/2  and  ξ ≤ (τ − 1) / (γτ(2τ − 1)).

Then there exists x∗ ∈ X such that F(x∗) ∈ K̄ and

(2.5)  ‖x0 − x∗‖ ≤ τ ‖F′(x0)†‖ d(F(x0), K).

Moreover, if
(a) K is closed
or
(b) r∗ < r̄, ξ < (3 − 2√2)/γ, and the relative interior ri K of K is nonempty,
then S is nonempty and

(2.6)  d(x0, S) ≤ τ ‖F′(x0)†‖ d(F(x0), K).
Remark 2.1. If ξ = 0 in (2.2), then assumption (2.1) entails that F(x0) ∈ K̄. In
fact, assume that ξ = 0 in (2.2) and that (2.1) holds. Then ‖F′(x0)†(K − F(x0))‖ = 0,
and, by definition, we can choose {un} ⊆ K such that F′(x0)†(un − F(x0)) → 0.
Thus F′(x0)F′(x0)†(un − F(x0)) → 0. Since F′(x0)F′(x0)† is simply the projection
Π_{im F′(x0)} (see (3.3) of the next section) and since un − F(x0) lies in im F′(x0) by
(2.1), it follows that un − F(x0) → 0. Therefore F(x0) ∈ K̄, as claimed.
Similarly, if γ = 0, then the γ-condition at x0 on B(x0, r̄) together with assumption (2.1) entails that F″(x) = 0 for each x ∈ B(x0, r̄); that is, F is affine on
B(x0, r̄).
The following example shows that the bound τ‖F′(x0)†‖ before d(F(x0), K) in
(2.6) cannot be improved.
Example 2.1. Let γ ∈ (0, +∞) and τ ∈ (1, (2 + √2)/2] (so (τ − 1)/(τ(2τ − 1)) ≤ 3 − 2√2; see (4.2)
and (4.3)). Define F : (−∞, 1/γ) ⊆ R → R by

(2.7)  F(x) := (τ − 1)/(γτ(2τ − 1)) − x + γx²/(1 − γx)  for each x ∈ (−∞, 1/γ).

Consider the inequality system (1.1) with K := R− and F defined above, namely

(2.8)  F(x) ≤ 0.

Let r̄ := 1/(2γ) and x0 = 0. Then F′(x0) = −1, and it is routine to check that F possesses
properties (1)–(3) of Theorem 2.1. Moreover, we have that

(2.9)  ξ = ‖F′(x0)†(K − F(x0))‖ = (τ − 1)/(γτ(2τ − 1)) ≤ (3 − 2√2)/γ < 1/γ,

so (2.4) is satisfied and r∗ < r̄, where r∗ is defined by (2.3). Note that the solution set
S = [r∗, r∗∗], where r∗∗ := (1 + γξ + √((1 + γξ)² − 8γξ))/(4γ) ≤ r̄. Therefore, d(x0, S) = |0 − r∗| =
r∗. Noting that ξ ≠ 0 by (2.9), one can check by elementary calculation that

(1 + γξ − √((1 + γξ)² − 8γξ))/(4γ) = τξ  ⇐⇒  ξ = (τ − 1)/(γτ(2τ − 1)).

It follows from (2.9) that r∗ = (1 + γξ − √((1 + γξ)² − 8γξ))/(4γ) = τξ. Thus we have

d(x0, S) = r∗ = τ ‖F′(x0)^{−1}‖ d(F(x0), K)

because d(F(x0), K) = ξ and ‖F′(x0)^{−1}‖ = 1. This shows that the bound τ‖F′(x0)^{−1}‖ is optimal.
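As a quick sanity check (ours, not the paper's), Example 2.1 can be evaluated numerically; with the illustrative choice γ = 1, τ = 3/2 one gets ξ = 1/6, r∗ = 1/4 = τξ, and r∗∗ = 1/3:

```python
import math

gamma, tau = 1.0, 1.5
xi = (tau - 1) / (gamma * tau * (2 * tau - 1))      # = 1/6, as in (2.9)

def F(x):
    # F from (2.7): F(x) = xi - x + gamma*x^2 / (1 - gamma*x)
    return xi - x + gamma * x**2 / (1 - gamma * x)

disc = (1 + gamma * xi) ** 2 - 8 * gamma * xi
r_star = (1 + gamma * xi - math.sqrt(disc)) / (4 * gamma)   # smaller root, (2.3)
r_sstar = (1 + gamma * xi + math.sqrt(disc)) / (4 * gamma)  # larger root

print(r_star, tau * xi)        # both ~ 0.25: d(x0, S) = r* = tau*xi
print(F(r_star), F(r_sstar))   # both ~ 0: endpoints of S = [r*, r**]
```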
The following theorem, Theorem 2.2, improves the corresponding result of Dedieu
mentioned in section 1 in three aspects: we drop his assumption (1.10), weaken his
nonsingularity assumption, and replace the constant 1/8 on the right-hand side of (1.12)
by a bigger constant. Indeed, if we take τ = σ := Σ_{k=0}^∞ (1/2)^{2^k − 1} = 1.63284 . . . in
Theorems 2.2 and 2.3, then

(τ − 1)/(τ(2τ − 1)) = (σ − 1)/(σ(2σ − 1)) = 0.1710 . . . > 1/8.
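Numerically (our check, not part of the paper), the constant σ and the ratio above can be confirmed as follows:

```python
# Partial sums of sigma = sum_{k>=0} (1/2)^(2^k - 1); the terms decay doubly
# exponentially, so a handful of them already gives full double precision.
sigma = sum(0.5 ** (2 ** k - 1) for k in range(10))
ratio = (sigma - 1) / (sigma * (2 * sigma - 1))

print(sigma)  # ~ 1.63284
print(ratio)  # ~ 0.17106, which indeed exceeds 1/8 = 0.125
```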
Recall that Γ and δ are defined as in (1.6) and (1.9).
Theorem 2.2. Consider the inequality/equality system (1.4), where each fi and
each gj are real-valued functions on X, and (f, g) is defined as in (1.5). Let x0 ∈ X,
and suppose that (f, g) is analytic on B(x0, (2 − √2)/(2(1 + Γ(f, g; x0)))) such that

(2.10)  (R^l × {0m}) ∪ {(f(x0), g(x0))} ∪ im(f^{(k)}(x0), g^{(k)}(x0)) ⊆ im G(x0)  for each k = 2, . . .

(e.g., G(x0) is nonsingular), where G(x0) is defined as in (1.8). Let τ ∈ (1, (2 + √2)/2] be
such that

(2.11)  (‖f(x0)−‖² + ‖g(x0)‖²)^{1/2} < (τ − 1) / (τ(2τ − 1)(1 + Γ(f, g; x0)) δ(f, g; x0) max{1, δ(f, g; x0)}).

Then the solution set S(f, g) of (1.4) is nonempty, and the following estimate holds:

(2.12)  d(x0, S(f, g)) ≤ τ δ(f, g; x0) (‖f(x0)−‖² + ‖g(x0)‖²)^{1/2}.
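To see the quantities in play, here is a small affine illustration of our own (so Γ = 0 and δ = 1; not from the paper), with l = 1, lS = 0, m = 1, f(x) = x1, g(x) = x2 − 1, and x0 = (−0.05, 0.9); the solution set is {x : x1 ≥ 0, x2 = 1}, so the bound (2.12) can be compared with the exact distance:

```python
import math

# Our illustrative data (not from the paper): f(x) = x1, g(x) = x2 - 1,
# l = 1, l_S = 0, m = 1, so S(f, g) = {x : x1 >= 0, x2 = 1}.
x0 = (-0.05, 0.9)
f_minus = max(-x0[0], 0.0)          # ||f(x0)_-|| = 0.05
g_val = x0[1] - 1.0                 # g(x0) = -0.1
residual = math.sqrt(f_minus**2 + g_val**2)

tau, Gamma, delta = 1.5, 0.0, 1.0   # affine data: Gamma = 0; here G(x0) = I, so delta = 1
threshold = (tau - 1) / (tau * (2*tau - 1) * (1 + Gamma) * delta * max(1.0, delta))
bound = tau * delta * residual      # right-hand side of (2.12)

d_exact = math.sqrt(x0[0]**2 + (x0[1] - 1.0)**2)  # distance to the nearest solution (0, 1)
print(residual < threshold)   # True: (2.11) holds
print(d_exact <= bound)       # True: (2.12) is confirmed, with room to spare
```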
Similarly we also extend the result of [3, Theorem 1] for system (1.3) as follows.
Theorem 2.3. Consider system (1.3), where each fi is a real-valued function
on X and f is defined as in (1.5). Let x0 ∈ X, and suppose that f is analytic on
B(x0, (2 − √2)/(2(1 + Γ(f; x0)))) such that

(2.13)  the matrix f′(x0)f′(x0)∗ + Diag(4f(x0)+) is nonsingular.

Let τ ∈ (1, (2 + √2)/2] be such that

(2.14)  ‖f(x0)−‖ < (τ − 1) / (τ(2τ − 1)(1 + Γ(f; x0)) δ(f; x0) max{1, δ(f; x0)}).

Then the solution set f^> of (1.3) is nonempty, and the following estimate holds:

(2.15)  d(x0, f^>) ≤ τ δ(f; x0) ‖f(x0)−‖.
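For intuition, a one-dimensional nonlinear sketch of our own (not from the paper): take X = R, f(x) = x + x², l = lS = 1, so f^> = (−∞, −1) ∪ (0, ∞). Then Γ(f; x0) = ‖f″/2!‖ = 1 for all x0, and at x0 = −0.05 the 1 × 1 matrix in (2.13) equals f′(x0)² = 0.81, so δ(f; x0) = 0.81^{−1/2}:

```python
# Our toy nonlinear instance (not from the paper): X = R, f(x) = x + x^2.
x0, tau = -0.05, 1.5
fx = x0 + x0**2                          # f(x0) = -0.0475
f_minus = max(-fx, 0.0)                  # ||f(x0)_-||
fp = 1 + 2 * x0                          # f'(x0) = 0.9
Gamma = 1.0                              # sup_k ||f^{(k)}/k!||^{1/(k-1)} = f''/2! = 1
delta = (fp * fp + 4 * max(fx, 0.0)) ** -0.5   # via (1.9); here = 1/0.9

threshold = (tau - 1) / (tau * (2*tau - 1) * (1 + Gamma) * delta * max(1.0, delta))
bound = tau * delta * f_minus            # right-hand side of (2.15)
d_exact = abs(x0)                        # nearest solution is x = 0

print(f_minus < threshold)  # True: (2.14) holds
print(d_exact <= bound)     # True: (2.15) is confirmed
```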
We defer the proofs of the above three theorems to sections 4 and 5. As a direct
application of Theorem 2.1, we present below a proof for the following stability result
for the perturbed system (1.16).
Theorem 2.4. Let γ ∈ [0, +∞), and let x0 ∈ X be a solution of system (1.1),
where the function F : X → Y is twice Fréchet differentiable on B(x0, (2 − √2)/(2γ)) with
properties (1)–(3) in Theorem 2.1 but with r̄ replaced by (2 − √2)/(2γ). If K is closed or the
relative interior ri K of K is nonempty, then, for any y ∈ B(0, (3 − 2√2)/(γ‖F′(x0)†‖)) ∩ im F′(x0),
the perturbed system (1.16) is solvable, and its solution set S(y) satisfies

(2.16)  d(x0, S(y)) ≤ ((2 + √2)/2) ‖F′(x0)†‖ ‖y‖  for any y ∈ B(0, (3 − 2√2)/(γ‖F′(x0)†‖)) ∩ im F′(x0).

(In particular, the solution map S(·), restricted to im F′(x0), is calm at (0, x0).)
Proof. Consider τ := (2 + √2)/2, and note that (τ − 1)/(τ(2τ − 1)) = 3 − 2√2. Let y ∈
B(0, (3 − 2√2)/(γ‖F′(x0)†‖)) ∩ im F′(x0), define the function

(2.17)  Fy(·) := F(·) − y,  and let  ξy := ‖Fy′(x0)†(K − Fy(x0))‖.

Then Fy′ ≡ F′ and Fy″ ≡ F″. Since x0 is a solution of system (1.1), one has F(x0) ∈ K,
and so y = F(x0) − Fy(x0) ∈ K − Fy(x0); it follows that

(2.18)  ξy ≤ ‖Fy′(x0)† y‖ = ‖F′(x0)† y‖ ≤ ‖F′(x0)†‖ ‖y‖ < (3 − 2√2)/γ = (τ − 1)/(γτ(2τ − 1)).

Therefore the corresponding ry∗ (defined as in (2.3) but with ξ replaced by ξy) satisfies

ry∗ ≤ (1 + γξy)/(4γ) < (1 + 3 − 2√2)/(4γ) = (2 − √2)/(2γ).

Pick r̄ ∈ (ry∗, (2 − √2)/(2γ)). This together with the given assumptions implies that Fy satisfies
the γ-condition on B(x0, r̄) and that

K ∪ {Fy(x0)} ∪ im Fy″(x) ⊆ im Fy′(x0)  for each x ∈ B(x0, r̄)

(noting that im Fy′(x0) = im F′(x0) contains y and K, and that Fy(x0) ∈ K − y).
Therefore, if K is closed or ri K is nonempty, one can apply Theorem 2.1, and we
conclude that the perturbed problem

F(x) − y = Fy(x) ≥K 0

is solvable, and its solution set S(y) satisfies

d(x0, S(y)) ≤ τ ‖Fy′(x0)†‖ d(Fy(x0), K) ≤ τ ‖Fy′(x0)†‖ ‖Fy(x0) − F(x0)‖ = ((2 + √2)/2) ‖F′(x0)†‖ ‖y‖.

This proves (2.16), and the proof is complete.
3. Notations, notions, and auxiliary lemmas. Let N denote the set of all
natural numbers, and let k ∈ N. We consider a k-multilinear bounded operator
Λ : X^k → Y. As in (1.7), we define the norm ‖Λ‖ by

‖Λ‖ := sup{‖Λ(x1, . . . , xk)‖ : (x1, . . . , xk) ∈ X^k, ‖xi‖ ≤ 1 for each i};

also, let im Λ denote the image of Λ:

im Λ := {Λ(x1, . . . , xk) : (x1, . . . , xk) ∈ X^k}.
Recall that if A is a bounded linear operator from X to Y such that imA is closed,
then it has a unique Moore–Penrose generalized inverse (to be denoted by A† ), which
is, by definition, the bounded linear operator from Y to X such that
(3.1)  A A† A = A,  A† A A† = A†,  (A† A)∗ = A† A,  and  (A A†)∗ = A A†,

where A∗ denotes the adjoint operator of A. For convenience, we list some facts
regarding the Moore–Penrose inverse:

(3.2)  (A∗)† = (A†)∗,  A† = (A∗ A)† A∗ = A∗ (A A∗)†,  (A A∗)† = (A∗)† A†,

and

(3.3)  A† A = Π_{(ker A)⊥},  A A† = Π_{im A},  A† Π_{im A} = A† A A† = A†,
where ΠD denotes the projection on the subset D; see, for example, [4] for more details.
Notice by (3.2) that (A A∗)† = (A∗)† A† = (A†)∗ A†. It follows that

(3.4)  ‖(A A∗)†‖ = ‖A†‖².
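A hedged numeric sanity check of (3.4) (our illustration, not the paper's): for a diagonal operator A = diag(a1, . . . , an) with nonzero entries, A† = diag(1/a1, . . . , 1/an), the operator norm is the largest absolute diagonal entry, and ‖(AA∗)†‖ = ‖A†‖² holds exactly:

```python
# Diagonal case: A = diag(a_i), so A† = diag(1/a_i) (Moore-Penrose inverse),
# AA* = diag(a_i^2), and operator norms are maxima of absolute diagonal entries.
a = [2.0, -5.0, 0.5]

norm_A_dagger = max(1.0 / abs(ai) for ai in a)       # ||A†|| = 2
norm_AAstar_dagger = max(1.0 / ai**2 for ai in a)    # ||(AA*)†|| = 4

print(norm_AAstar_dagger == norm_A_dagger ** 2)  # True, confirming (3.4)
```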
Lemma 3.1. Let A be a bounded linear operator from X to Y with closed image.
Let D be a closed convex subset of im A. Then there exists u0 ∈ D such that ‖A† u0‖ =
‖A† D‖.
Proof. Choose a sequence {un} ⊆ D such that ‖A† un‖ → ‖A† D‖. Then {A A† un}
is bounded. Since {un} ⊆ D ⊆ im A by assumption, and since A A† is the projection
Π_{im A} on im A (see (3.3)), one has that un = A A† un, and so {un} is bounded. Thus
{un} has a subsequence, denoted by itself, which converges weakly to some point
u0 ∈ D; consequently, by a standard result in functional analysis, there exists a
sequence {vn} ⊆ D with each of its elements representable as a convex combination
of some elements of {un} such that vn converges to u0 strongly. By the continuity and
linearity of A†, it is then routine to check that ‖A† u0‖ = ‖A† D‖, and so we complete
the proof.
Lemma 3.2. Let γ > 0, and suppose that F satisfies the γ-condition at some
x̄ ∈ X on B(x̄, r) with some r ∈ (0, 1/γ]. Suppose further that

(3.5)  im F′(x) ⊆ im F′(x̄)  for each x ∈ B(x̄, r).

Let x ∈ X be such that

(3.6)  ‖x − x̄‖ < min{r, (2 − √2)/(2γ)}.

Then

(3.7)  im F′(x) = im F′(x̄)

and

(3.8)  ‖F′(x)† F′(x̄)‖ ≤ (2 − 1/(1 − γ‖x − x̄‖)²)^{−1} = (1 − γ‖x − x̄‖)² / (1 − 4γ‖x − x̄‖ + 2γ²‖x − x̄‖²);

moreover, if F(x̄) ∈ im F′(x̄), then

(3.9)  F(x) ∈ im F′(x̄) = im F′(x).
Proof. Since

(3.10)  F′(x) = F′(x̄) + ∫₀¹ F″(x̄ + t(x − x̄))(x − x̄) dt,

assumption (3.5) implies that the image of F′(x) is contained in im F′(x̄). Since, when
restricted to im F′(x̄), F′(x̄)F′(x̄)† is the identity map (see (3.3)), we then have that
F′(x̄)F′(x̄)† F′(x) = F′(x). Letting W := F′(x̄)†(F′(x̄) − F′(x)) and letting IX denote the
identity map on X, it follows from (3.1) that

(3.11)  F′(x) = F′(x̄)F′(x̄)†(F′(x) − F′(x̄)) + F′(x̄) = F′(x̄)(IX − W).

Assuming that IX − W is invertible, (3.7) is seen to hold (by (3.11)), and F′(x̄) =
F′(x)(IX − W)^{−1}, and so

(3.12)  ‖F′(x)† F′(x̄)‖ = ‖F′(x)† F′(x)(IX − W)^{−1}‖ ≤ ‖(IX − W)^{−1}‖

because F′(x)† F′(x) is a projection operator (and thus of norm bounded by 1) by
(3.3). Thus (3.8) will also be shown, provided that

(3.13)  ‖(IX − W)^{−1}‖ ≤ (1 − γ‖x − x̄‖)² / (1 − 4γ‖x − x̄‖ + 2γ²‖x − x̄‖²).

To accomplish this, we use (3.10) and the assumed γ-condition to note that

‖W‖ ≤ ∫₀¹ ‖F′(x̄)† F″(x̄ + t(x − x̄))‖ ‖x − x̄‖ dt ≤ ∫₀¹ 2γ‖x − x̄‖ / (1 − γt‖x − x̄‖)³ dt = 1/(1 − γ‖x − x̄‖)² − 1.

Further, since γ‖x − x̄‖ < (2 − √2)/2 by (3.6), we also see that ‖W‖ < 1 as

1/(1 − γ‖x − x̄‖)² < 1/(1 − (2 − √2)/2)² = 2.

By the Banach lemma, IX − W is therefore invertible, and (3.13) holds.
Finally, suppose that F(x̄) ∈ im F′(x̄). To establish (3.9), we need only show that
F(x) ∈ im F′(x̄). But this is clearly true from the identity

F(x) = F(x̄) + ∫₀¹ F′(x̄ + t(x − x̄))(x − x̄) dt

because F′(x̄ + t(x − x̄))(x − x̄) ∈ im F′(x̄) by (3.7) for any t ∈ [0, 1].
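A quick numeric check of the chain above (ours, not the paper's): with the illustrative values γ = 1 and ‖x − x̄‖ = 0.2, the bound ‖W‖ ≤ 1/(1 − γ‖x − x̄‖)² − 1 gives 0.5625 < 1, and the Banach-lemma estimate 1/(1 − ‖W‖) agrees with the right-hand side of (3.13):

```python
gamma, rho = 1.0, 0.2   # rho plays the role of ||x - x_bar||; gamma*rho < (2 - sqrt(2))/2

w_bound = 1.0 / (1 - gamma * rho) ** 2 - 1   # upper bound on ||W||, = 0.5625
banach = 1.0 / (1 - w_bound)                 # Banach lemma: ||(I - W)^{-1}|| <= this
rhs_313 = (1 - gamma * rho) ** 2 / (1 - 4 * gamma * rho + 2 * (gamma * rho) ** 2)

print(w_bound < 1)                       # True, so I - W is invertible
print(abs(banach - rhs_313) < 1e-12)     # True: the two expressions in (3.13) coincide
```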
4. Proofs of main results. For convenience, we shall use w to denote the
elementary function defined by

(4.1)  w(t) := 2 / (1 + t + √((1 + t)² − 8t))  for each t ∈ [0, 3 − 2√2].

Then

(4.2)  w : [0, 3 − 2√2] → [1, (2 + √2)/2]  is strictly increasing and bijective,

with the inverse map w^{−1} : [1, (2 + √2)/2] → [0, 3 − 2√2] given by

(4.3)  w^{−1}(t) := (t − 1) / (t(2t − 1))  for each t ∈ [1, (2 + √2)/2].
Let ξ, γ ∈ [0, +∞), and define the function φξ,γ by

(4.4)  φξ,γ(t) := ξ − t + γt²/(1 − γt)  for each t ∈ (−∞, 1/γ).

Suppose that

(4.5)  0 ≤ ξ ≤ (3 − 2√2)/γ.

Then the smaller root r∗ of φξ,γ on [0, 1/γ) is given by (2.3), that is,

(4.6)  r∗ := (1 + γξ − √((1 + γξ)² − 8γξ)) / (4γ)  if 0 < γ ≤ (3 − 2√2)/ξ;   r∗ := ξ  if γ = 0;

we note that

(4.7)  0 ≤ r∗ ≤ (1 + γξ)/(4γ) ≤ (2 − √2)/(2γ).

Notice further that the inequality in (2.4) and the definition for r∗ given by (4.6) can
be rewritten as

(4.8)  γξ ≤ w^{−1}(τ)
and

(4.9)  r∗ = w(γξ)ξ,
respectively. If there is no ambiguity, we shall write φ for φξ,γ (which is defined as in
(4.4)). Let h : D ⊆ X → Y be a nonlinear map with continuous Fréchet derivative.
Let {tn} and {xn} be sequences generated by the (Gauss–)Newton method for φ and
for h, respectively, with initial points t0 = 0 and x0; that is,

(4.10)  tn+1 := tn − φ′(tn)^{−1} φ(tn)  for each n = 0, 1, . . .

and

(4.11)  xn+1 := xn − h′(xn)† h(xn)  for each n = 0, 1, . . . .
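For intuition, the scalar Newton iteration (4.10) for φξ,γ can be run directly (our sketch, with illustrative parameters); with ξ = 1/6 and γ = 1 the iterates increase monotonically to the smaller root r∗ = 1/4 of φ:

```python
# Newton iteration (4.10) applied to phi(t) = xi - t + gamma*t^2/(1 - gamma*t),
# starting from t0 = 0 (illustrative parameters of ours: xi = 1/6, gamma = 1).
xi, gamma = 1.0 / 6.0, 1.0

def phi(t):
    return xi - t + gamma * t**2 / (1 - gamma * t)

def phi_prime(t):
    # derivative of phi: -1 + (2*gamma*t - (gamma*t)^2) / (1 - gamma*t)^2
    return -1.0 + (2 * gamma * t - (gamma * t) ** 2) / (1 - gamma * t) ** 2

t = 0.0
for _ in range(30):
    t = t - phi(t) / phi_prime(t)

print(t)  # converges to the smaller root r* = 0.25 (cf. (4.6))
```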
The main part of the following result is taken from [5].
Proposition 4.1. Let γ ∈ [0, +∞), Ω ⊆ X, and let x0 be an interior point of
Ω. Let h : Ω → Y be continuous and twice Fréchet differentiable on the interior of Ω
with surjective derivative h′(x0). Let ξ be defined by

ξ := ‖h′(x0)† h(x0)‖,

let r∗ be defined by (4.6), and let φ := φξ,γ. Suppose that

(4.12)  ξ ≤ (3 − 2√2)/γ,

(4.13)  B(x0, r∗) ⊆ Ω,

and that h satisfies the γ-condition at x0 on B(x0, r∗). Then the Newton method (4.11)
with initial point x0 is well defined, and the generated sequence {xn} converges to a
zero x∗ of h with x∗ ∈ B(x0, r∗) satisfying

(4.14)  ‖xn − x∗‖ ≤ r∗ − tn  for each n = 0, 1, . . . .
Proof. If ξ = 0, then, by definition, one has xn = x0 = x∗ and tn = r∗ = 0 for
each n, and (4.14) holds trivially. Suppose that ξ > 0 but γ = 0. Then, by the given
assumptions of surjectivity, the γ-condition, and continuity, h is affine on
B(x0, r∗):

h(x) = h(x0) + h′(x0)(x − x0)  for each x ∈ B(x0, r∗).

Thus we have that

xn = x∗ = x0 − h′(x0)† h(x0)  and  tn = r∗ = ξ  for each n > 0.

Therefore, (4.14) is seen to hold because

‖xn − x∗‖ = ξ = r∗ − t0  if n = 0,  and  ‖xn − x∗‖ = 0 = r∗ − tn  if n > 0.

Finally, for the remaining case (namely, ξ > 0, γ > 0), the proof given in [5, Theorem
3.3] can be adopted here, although the setting of [5] is in finite dimensional spaces
X and Y.
Now we are ready to give the proof of Theorem 2.1.
Proof of Theorem 2.1. By the assumptions, Y0 := im F′(x0) is a closed subspace
such that

K ∪ {F(x0)} ∪ im F″(x) ⊆ im F′(x0) = Y0  for each x ∈ B(x0, r̄).

Set r̄0 := min{r̄, (2 − √2)/(2γ)} (so r∗ ≤ r̄0 by (4.7) and thanks to the assumption r∗ ≤ r̄).
By the given γ-condition on B(x0, r̄), we can apply Lemma 3.2 (to x0, r̄ in place of
x̄, r) to conclude that

(4.15)  F(x) ∈ Y0 = im F′(x0)  for each x ∈ B(x0, r̄0) (and so, by continuity, for each x ∈ B̄(x0, r̄0)).

Moreover, by Lemma 3.1, the distance from the origin to the set F′(x0)†(K̄ − F(x0))
is attained: there exists some c0 ∈ K̄ (so F(x) − c0 ∈ Y0 for each x ∈ B̄(x0, r̄0)) such
that

(4.16)  ‖F′(x0)†(c0 − F(x0))‖ = ‖F′(x0)†(K̄ − F(x0))‖ = ‖F′(x0)†(K − F(x0))‖ = ξ,

where the last equality holds by the definition of ξ given in (2.2). Let Ω := B̄(x0, r̄0)
and define h : Ω → Y0 by

h(x) := F(x) − c0  for each x ∈ Ω.

Then h is continuous on Ω, and

(4.17)  h′(x) = F′(x)  and  h″(x) = F″(x)  for each x ∈ int Ω = B(x0, r̄0).

In particular, h′(x0) = F′(x0), and so h′(x0) : X → Y0 is surjective. Note further
that h′(x0)† is exactly the restriction to Y0 of the operator F′(x0)†. This together
with (4.15) implies that

(4.18)  h′(x0)† h(x0) = F′(x0)†(F(x0) − c0)  and  h′(x0)† h″(x) = F′(x0)† F″(x)

for each x ∈ B(x0, r̄0). It follows from the assumed γ-condition for F that h satisfies
the γ-condition on B(x0, r̄0). Further, by (4.16), (4.18), and (4.8), together with (4.2),
we have (4.12) because

(4.19)  ‖h′(x0)† h(x0)‖ = ξ ≤ w^{−1}(τ)/γ ≤ (3 − 2√2)/γ.

Finally, since r∗ ≤ r̄0 as noted earlier, one has that B(x0, r∗) ⊆ Ω. Thus, Proposition
4.1 is applicable, and we conclude that the sequence {xn} generated by the Newton method
(4.11) for h with initial point x0 converges to a zero x∗ of h and (4.14) holds. Hence
F(x∗) = c0 ∈ K̄, and (4.14) in particular implies that

(4.20)  ‖x0 − x∗‖ ≤ r∗ = w(γξ)ξ ≤ τξ

(see (4.8) and (4.9)). Clearly, one has

‖F′(x0)†(K − F(x0))‖ ≤ ‖F′(x0)†‖ d(F(x0), K).

By (4.16), this means that ξ ≤ ‖F′(x0)†‖ d(F(x0), K), and so (2.5) holds thanks to
(4.20). In particular, if K is closed, then x∗ ∈ S, and the conclusion is proved for
case (a).
It remains to consider case (b): we assume that

(4.21)  r∗ < r̄,  ξ < (3 − 2√2)/γ,  and  ri K ≠ ∅.

Then there exists a sequence {cn} ⊆ ri K such that cn → 0. Without loss of generality,
we may assume further that, for each n = 1, 2, . . . ,

(4.22)  ξ̄n := ξ + ‖F′(x0)† cn‖ < (3 − 2√2)/γ,

(4.23)  τ̄n := w(γξ̄n) ∈ (1, (2 + √2)/2],

and

(4.24)  r̄n∗ := w(γξ̄n)ξ̄n < r̄

(noting (4.2) and that ξ̄n → ξ and r̄n∗ → r∗ by (4.9)). Let Fn : X → Y be the map
defined by Fn(·) := F(·) − cn, and consider the inequality system (1.1) with Fn in
place of F, that is, the inequality system

(4.25)  Fn(x) ≥K 0.

Then Fn′ = F′, and so Fn′(x0)† = F′(x0)† and Fn(x0) ∈ im Fn′(x0), thanks to
(4.15). Moreover, we have Fn′(x0)†(K − Fn(x0)) = F′(x0)†(K − F(x0) + cn), and
so

(4.26)  ξn := ‖Fn′(x0)†(K − Fn(x0))‖ ≤ ξ + ‖F′(x0)† cn‖ = ξ̄n,

(4.27)  τn := w(γξn) ≤ w(γξ̄n) = τ̄n ∈ (1, (2 + √2)/2],

and

rn∗ := w(γξn)ξn ≤ w(γξ̄n)ξ̄n = r̄n∗ < r̄

thanks to (4.2). This and the given assumptions imply that Fn satisfies the γ-condition
on B(x0, rn∗), and both (2.4) and (2.1) hold with Fn, τn in place of F, τ. (By (4.27)
and the definition of w^{−1} in (4.3), γξn = w^{−1}(τn) = (τn − 1)/(τn(2τn − 1)).) Moreover, by (4.9),
the quantity rn∗ defined above is exactly the corresponding quantity r∗ defined by
(2.3) with ξn in place of ξ. Then we apply the (already established) first conclusion
of Theorem 2.1 and conclude that there exists x∗n ∈ X satisfying Fn(x∗n) ∈ K̄ such
that

‖x0 − x∗n‖ ≤ τn ‖F′(x0)†‖ d(Fn(x0), K).

By the choice of cn, we have that F(x∗n) ∈ K̄ + cn ⊆ ri K ⊆ K. Hence x∗n ∈ S (and
so S is nonempty), and it follows that

d(x0, S) ≤ ‖x0 − x∗n‖ ≤ τn ‖F′(x0)†‖ d(Fn(x0), K).

Letting n → ∞, we establish (2.6), as it is clear that d(Fn(x0), K) → d(F(x0), K)
and τn = w(γξn) → w(γξ) ≤ τ. Thus the proof is complete.
5. Proofs of Theorems 2.2 and 2.3. Before our discussion on the applications
of Theorem 2.1 to the study of systems (1.3) and (1.4), let us prove a simple lemma
regarding the γ-condition for analytic functions. As assumed in the previous sections,
let X and Y be Hilbert spaces.
Lemma 5.1. Let F : X → Y be analytic on B(x0 , r) for some x0 ∈ X and
r ∈ (0, +∞). Let γ ∈ [γF (x0 ), +∞), where γF (x0 ) is defined by (1.15), and let
r0 := min{r, 1/γ}. Then the following assertions hold:
(i) F satisfies the γ-condition at x0 on B (x0 , r0 ).
(ii) If im F′(x0) is closed and im F^(k)(x0) ⊆ im F′(x0) for each k = 2, . . . , then
(5.1) im F′(x) ⊆ im F′(x0) for each x ∈ B(x0, r0).
Proof. The proof of assertion (i) is standard and is a direct application of the Taylor expansion; see, for example, [25, Theorem 1]. Below we verify assertion (ii). For this purpose, assume that im F^(k)(x0) ⊆ im F′(x0) for each k = 2, . . . , and let x ∈ B(x0, r0). Since F is analytic on B(x0, r), it follows from the Taylor expansion that
F′(x) = F′(x0) + Σ_{k≥0} [F^(k+2)(x0)/(k + 1)!] (x − x0)^{k+1}.
Hence the conclusion follows, as im F′(x0) is closed.
As applications of Theorem 2.1 to systems (1.3) and (1.4), respectively, we derive
the following corollaries, where we assume that the involved functions f and g are
defined by (1.5) and are analytic around the involved point x0 .
Corollary 5.2. Consider the inequality system (1.4), and let F : X → R^l × R^m be defined by F := (f; g). Let τ ∈ (1, (2 + √2)/2], γ ∈ [γF(x0), +∞), and x0 ∈ X be such that
(5.2) (R^l × {0m}) ∪ {F(x0)} ∪ im F^(k)(x0) ⊆ im F′(x0) for each k = 2, . . .
and
(5.3) (‖f(x0)−‖² + ‖g(x0)‖²)^{1/2} < (τ − 1)/(γτ(2τ − 1)‖F′(x0)†‖).
Suppose that f and g are analytic on B(x0, (2 − √2)/(2γ)). Then the solution set S(f, g) of (1.4) is nonempty and
(5.4) d(x0, S(f, g)) ≤ τ ‖F′(x0)†‖ (‖f(x0)−‖² + ‖g(x0)‖²)^{1/2}.
Proof. Let Y := R^l × R^m, and define
K := {(u, 0m) ∈ R^l₊ × R^m : ui > 0 for all 1 ≤ i ≤ lS}.
Then, with F := (f; g), systems (1.1) and (1.4) are identical, d(F(x0), K) = (‖f(x0)−‖² + ‖g(x0)‖²)^{1/2}, and F′(x0) = (f′(x0); g′(x0)) satisfies, by the assumptions,
(5.5) K ∪ {F(x0)} ∪ im F^(k)(x0) ⊆ im F′(x0) for each k = 2, . . . .
Set ξ := d(0, F′(x0)†(K − F(x0))). Since, by (5.3),
(5.6) ‖F′(x0)†‖ d(F(x0), K) = ‖F′(x0)†‖ (‖f(x0)−‖² + ‖g(x0)‖²)^{1/2} < (τ − 1)/(γτ(2τ − 1)),
it follows that
(5.7) ξ ≤ ‖F′(x0)†‖ d(F(x0), K) < (τ − 1)/(γτ(2τ − 1)) ≤ (3 − 2√2)/γ
(see (4.2) and (4.3)). Set r0 := (2 − √2)/(2γ) (≤ 1/γ). Thus, by Lemma 5.1, F satisfies the γ-condition at x0 on B(x0, r0), and (5.1) holds thanks to (5.5). Therefore, (2.1) is satisfied. Moreover, r* < r0, where r* is defined as in (2.3) (see also (4.6)). To see this, we may suppose that γ ≠ 0. Then by (2.3) and (5.7), one has
r* ≤ (1 + γξ)/(4γ) < (1 + 3 − 2√2)/(4γ) = r0,
as claimed. Let r̄ ∈ (r*, r0). Then B(x0, r̄) ⊆ B(x0, (2 − √2)/(2γ)). Therefore, F″ is continuous on B(x0, r̄), as f and g are analytic on B(x0, (2 − √2)/(2γ)) by assumption. Noting that ri K ≠ ∅, the assumption stated in (b) of Theorem 2.1 is satisfied. Thus Theorem 2.1 is applicable here, and we conclude that the solution set S(f, g) of (1.1) (namely that of (1.4)) is nonempty, and
d(x0, S(f, g)) ≤ τ ‖F′(x0)†‖ d(F(x0), K);
i.e., d(x0, S(f, g)) ≤ τ ‖F′(x0)†‖ (‖f(x0)−‖² + ‖g(x0)‖²)^{1/2}. This completes the proof of the corollary.
Taking gj = 0 for each j in the preceding corollary, we get the following corollary.
Recall that γf (x) is defined by (1.15) (with f in place of F ).
Corollary 5.3. Consider the inequality system (1.3). Let τ ∈ (1, (2 + √2)/2], γ ∈ [γf(x0), +∞), x0 ∈ X, and suppose that f is analytic on B(x0, (2 − √2)/(2γ)) such that f′(x0) : X → R^l is surjective and
(5.8) ‖f(x0)−‖ < (τ − 1)/(τ(2τ − 1)γ‖f′(x0)†‖).
Then the solution set f^> of (1.3) is nonempty, and the following estimate holds:
(5.9) d(x0, f^>) ≤ τ ‖f′(x0)†‖ ‖f(x0)−‖.
Proof. Consider the inequality system (1.4) with f given as in the corollary and g = 0. Then f and g are analytic on B(x0, (2 − √2)/(2γ)), and the solution set S(f; g) of system (1.4) coincides with the solution set f^> of (1.3). Let F := (f; g). Then
(5.10) F′(x0)† = (f′(x0)†; 0)^T and F^(k)(x0) = (f^(k)(x0); 0m) for each k.
This implies that
‖F′(x0)†‖ = ‖f′(x0)†‖, γF(x0) = γf(x0),
and
im F^(k)(x0) = im f^(k)(x0) × {0m} ⊆ R^l × {0m} = im F′(x0) for any k ≥ 2,
thanks to the surjectivity assumption on f′(x0). Thus assumptions (5.2) and (5.3) in Corollary 5.2 are satisfied. Hence Corollary 5.2 is applicable, and we conclude that f^> = S(f; g) is nonempty and
d(x0, f^>) = d(x0, S(f, g)) ≤ τ ‖F′(x0)†‖ (‖f(x0)−‖² + ‖g(x0)‖²)^{1/2} = τ ‖f′(x0)†‖ ‖f(x0)−‖.
The proof is complete.
Remark 5.1. Consider the inequality system (1.3) in the special case when x0 satisfies
(5.11) fi(x0) < 0 for each 1 ≤ i ≤ l.
Then, for this x0, Corollary 5.3 is a stronger result than Theorem 2.3. To see this, suppose that (5.11) holds and choose γ = γf(x0). Note then that
(5.12) Diag(f(x0)+) = 0,
and hence assumption (2.13) simply means that f′(x0) is surjective. Moreover, by (5.12), (1.9), and (3.4),
(5.13) δ(f; x0) = ‖(f′(x0)f′(x0)∗)†‖^{1/2} = ‖f′(x0)†‖.
Finally, recalling (1.6) and the elementary fact that
(5.14) sup_{k≥2} t^{1/(k−1)} = t if t > 1, and sup_{k≥2} t^{1/(k−1)} = 1 if t ≤ 1,
we have
γ = γf(x0) = sup_{k≥2} ‖f′(x0)† f^(k)(x0)/k!‖^{1/(k−1)} ≤ sup_{k≥2} ‖f′(x0)†‖^{1/(k−1)} Γ(f, x0) = max{1, δ(f; x0)}Γ(f, x0),
and so the quantity given on the right-hand side of (2.14) is no larger than that of (5.8); thus (2.14) ⇒ (5.8).
Similarly, Corollary 5.2 is a stronger result than Theorem 2.2 for system (1.4) in
the special case when x0 satisfies (5.11).
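The elementary fact (5.14) is easy to confirm numerically (a sketch; for t ≤ 1 the supremum equals 1 but is approached only as k → ∞, so the check below compares a finite tail against 1):

```python
def sup_root(t, kmax=500):
    # Approximates sup_{k >= 2} t^(1/(k-1)) by a finite maximum
    return max(t ** (1.0 / (k - 1)) for k in range(2, kmax + 1))

# t > 1: the supremum is attained at k = 2 and equals t
assert abs(sup_root(3.0) - 3.0) < 1e-12

# t <= 1: t^(1/(k-1)) increases toward 1 as k grows, so the supremum is 1
assert 0.99 < sup_root(0.5) < 1.0
assert sup_root(1.0) == 1.0
```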
For the proofs of Theorems 2.2 and 2.3, it is convenient to make some preparations first. Let (x; v) ∈ X × R^l. We use (F′(x); Diag(v)) to denote the operator from X × R^l to R^l × R^m defined by
(F′(x); Diag(v))(z; u) := F′(x)z + (Diag(v)u; 0m) for each (z; u) ∈ X × R^l.
Using matrix notation, the operator (F′(x); Diag(v)) can be represented as
(5.15) (F′(x); Diag(v))(z; u) = [f′(x), Diag(v); g′(x), 0_{m×l}] (z; u) for each (z; u) ∈ X × R^l,
where the 2 × 2 block matrix acts on the column vector (z; u).
Let G(x; v) be the (l + m) × (l + m) matrix defined by
(5.16) G(x; v) := [f′(x)f′(x)∗ + Diag(v²), f′(x)g′(x)∗; g′(x)f′(x)∗, g′(x)g′(x)∗],
where, for any v = (vi) ∈ R^l, v² stands for the vector (vi²) in R^l. The following lemma will be useful for us.
Lemma 5.4. Let (x; v) ∈ X × R^l. Then we have the following assertions:
(5.17) (F′(x); Diag(v)) ◦ (F′(x); Diag(v))∗ = G(x; v)
and
(5.18) im(F′(x); Diag(v)) = im G(x; v).
Proof. We claim that
(5.19) (F′(x); Diag(v))∗(w1; w2) = [f′(x)∗, g′(x)∗; Diag(v), 0_{l×m}] (w1; w2) for each (w1; w2) ∈ R^l × R^m.
In fact, by definition, for any (z; u) ∈ X × R^l, one checks that
⟨(z; u), (F′(x); Diag(v))∗(w1; w2)⟩ = ⟨(F′(x); Diag(v))(z; u), (w1; w2)⟩
= ⟨f′(x)z, w1⟩ + ⟨Diag(v)u, w1⟩ + ⟨g′(x)z, w2⟩
= ⟨z, f′(x)∗w1 + g′(x)∗w2⟩ + ⟨u, Diag(v)w1⟩
= ⟨(z; u), [f′(x)∗, g′(x)∗; Diag(v), 0_{l×m}] (w1; w2)⟩,
and so (5.19) is established. This together with (5.15) implies that
(F′(x); Diag(v)) ◦ (F′(x); Diag(v))∗ = [f′(x), Diag(v); g′(x), 0_{m×l}] [f′(x)∗, g′(x)∗; Diag(v), 0_{l×m}],
and (5.17) is seen to hold.
To show (5.18), set Y0 := im(F′(x); Diag(v)). By (5.17), it suffices to check that im((F′(x); Diag(v)) ◦ (F′(x); Diag(v))∗) ⊇ Y0, as the converse inclusion holds trivially. Since Y0 is a finite dimensional subspace of R^l × R^m, we can choose a finite dimensional subspace X0 such that Y0 = (F′(x); Diag(v))(X0) and the restriction A := (F′(x); Diag(v))|X0 to X0 is a bijection. Hence A∗ coincides with the restriction of (F′(x); Diag(v))∗ to Y0, and A∗ is also a bijection. Consequently, the composite A ◦ A∗ is a bijection on Y0. This implies that
Y0 = (A ◦ A∗)(Y0) ⊆ ((F′(x); Diag(v)) ◦ (F′(x); Diag(v))∗)(R^l × R^m) = im((F′(x); Diag(v)) ◦ (F′(x); Diag(v))∗),
and the proof is complete.
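Identity (5.17) can be illustrated in finite dimensions (a small numerical sketch with X = R^n; the matrices fp, gp and the vector v below are arbitrary stand-ins for f′(x), g′(x), and v, and the block layouts follow (5.15) and (5.16)):

```python
import numpy as np

rng = np.random.default_rng(0)
n, l, m = 5, 3, 2                        # dim X = n, codomain R^l x R^m

fp = rng.standard_normal((l, n))         # stand-in for f'(x)
gp = rng.standard_normal((m, n))         # stand-in for g'(x)
v = rng.standard_normal(l)

# (F'(x); Diag(v)) as the block matrix of (5.15)
A = np.block([[fp, np.diag(v)],
              [gp, np.zeros((m, l))]])

# G(x; v) as the block matrix of (5.16)
G = np.block([[fp @ fp.T + np.diag(v**2), fp @ gp.T],
              [gp @ fp.T,                 gp @ gp.T]])

# (5.17): (F'(x); Diag(v)) composed with its adjoint equals G(x; v)
assert np.allclose(A @ A.T, G)

# Consistent with (5.18): the two matrices have the same rank, hence the
# same (finite dimensional) image
assert np.linalg.matrix_rank(A) == np.linalg.matrix_rank(G)
```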
Proof of Theorem 2.2. Let H : X × R^l → R^l × R^m be the operator defined by H := (h; g) with h = (hi) and g = (gj), where each hi and each gj is given by
(5.20) hi(x; v) := fi(x) − vi², 1 ≤ i ≤ l, and gj(x; v) := gj(x), 1 ≤ j ≤ m,
for each (x; v) ∈ X × R^l with v = (v1, . . . , vl) ∈ R^l. Consider the following inequality system on X × R^l:
(5.21) hi(x; v) > 0, i = 1, . . . , lS; hi(x; v) ≥ 0, i = lS + 1, . . . , l; gj(x; v) = 0, j = 1, . . . , m.
As before, we use S(h; g) to denote the solution set of the above system. Then it is easy to check that x ∈ S(f; g) if (x; v) ∈ S(h; g) for some v ∈ R^l; that is,
(5.22) S0 := {x ∈ X : ∃ v ∈ R^l s.t. (x; v) ∈ S(h; g)} ⊆ S(f; g).
Let v0 := (√(fi(x0)+)) ∈ R^l, and notice that
(5.23) ‖h(x0; v0)−‖² + ‖g(x0; v0)‖² = ‖f(x0)−‖² + ‖g(x0)‖²,
(5.24) H(x0; v0) = F(x0) − (f(x0)+; 0) ∈ F(x0) + R^l × {0m},
and that
(5.25) G(x0; ±2v0) = G(x0)
by (5.16) and (1.8). Recalling (1.6) and (1.9), we define
(5.26) γ := max{1, δ(f, g; x0)}(1 + Γ(f, g; x0)).
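The choice v0 := (√(fi(x0)+)) behind (5.23) can be checked on a hypothetical value of f(x0) (a sketch; the vector f0 below is an arbitrary stand-in for f(x0)): since hi(x0; v0) = fi(x0) − fi(x0)+ = −fi(x0)−, the negative parts of h(x0; v0) and f(x0) coincide.

```python
import numpy as np

f0 = np.array([2.0, -3.0, 0.5, -1.0])    # hypothetical f(x0)
v0 = np.sqrt(np.maximum(f0, 0.0))        # v0_i = sqrt(f_i(x0)^+)

h0 = f0 - v0**2                          # h(x0; v0) per (5.20)

# h(x0; v0) = -f(x0)^-, so the negative parts agree, which gives (5.23)
assert np.allclose(np.maximum(-h0, 0.0), np.maximum(-f0, 0.0))
assert np.isclose(np.linalg.norm(np.maximum(-h0, 0.0)),
                  np.linalg.norm(np.maximum(-f0, 0.0)))
```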
We will apply Corollary 5.2 to the inequality system (5.21) in place of (1.4) (and so to H and (x0; v0) in place of F and x0). To do this, note first that, since γ ≥ 1 + Γ(f, g; x0), the analyticity assumption implies that H is analytic on B((x0; v0), (2 − √2)/(2γ)). Next note that
(5.27) H′(x0; v0) = (F′(x0); Diag(−2v0)) and H^(k)(x0; v0) = F^(k)(x0) for each k > 2,
and
(5.28) H″(x0; v0) = (F″(x0); −2Dl),
where (F″(x0); −2Dl) is the bilinear map from (X × R^l) × (X × R^l) to R^l × R^m defined by
(F″(x0); −2Dl)(z1; u1)(z2; u2) := F″(x0)z1z2 − 2Dl(u1; u2)
for any (z1; u1), (z2; u2) ∈ X × R^l, and Dl : R^l × R^l → R^l × R^m is the bilinear operator defined by
(5.29) Dl(u; v) := (Diag(u)v; 0m) for any (u; v) ∈ R^l × R^l.
This implies in particular that
(5.30) im H″(x0; v0) = im F″(x0) + im(−2Dl) ⊆ im F″(x0) + R^l × {0m},
and note also that
(5.31) im H^(k)(x0; v0) = im F^(k)(x0) for each k > 2.
Moreover, by (5.25) and (5.27), we have from Lemma 5.4 that
(5.32) G(x0) = G(x0; −2v0) = H′(x0; v0) ◦ H′(x0; v0)∗
and
(5.33) im G(x0) = im H′(x0; v0).
By the given assumption (2.10), it follows that the vector space im H′(x0; v0) contains (R^l × {0m}) ∪ {F(x0)} ∪ ∪_{k≥2} im F^(k)(x0) (and so contains (R^l × {0m}) ∪ {H(x0; v0)} ∪ ∪_{k≥2} im H^(k)(x0; v0) by (5.24), (5.31), and (5.30)). Another consequence of (5.32) is that ‖G(x0)†‖ = ‖H′(x0; v0)†‖² (see (3.4)). Thus, by (1.9),
(5.34) δ(f, g; x0) = ‖G(x0)†‖^{1/2} = ‖H′(x0; v0)†‖.
Combining this with (5.23) and (5.26), we see that assumption (2.11) means that
(5.35) (‖h(x0; v0)−‖² + ‖g(x0; v0)‖²)^{1/2} < (τ − 1)/(γτ(2τ − 1)‖H′(x0; v0)†‖).
Further, since
‖H″(x0; v0)‖ ≤ 2 + ‖F″(x0)‖ and H^(k)(x0; v0) = F^(k)(x0) for each k > 2,
and making use of (5.14), one has the following estimate for γH(x0; v0) (see (1.15)):
(5.36) γH(x0; v0) ≤ sup_{k≥2} ‖H′(x0; v0)†‖^{1/(k−1)} · sup_{k≥2} ‖H^(k)(x0; v0)/k!‖^{1/(k−1)} ≤ max{1, δ(f, g; x0)}(1 + Γ(f, g; x0));
thus one has γ ∈ [γH(x0; v0), +∞) by (5.26). Therefore, one can apply Corollary 5.2 to the inequality system (5.21) (with (x0; v0) in place of x0) to get that S(h; g) ≠ ∅ and
d((x0; v0), S(h; g)) ≤ τ ‖H′(x0; v0)†‖ (‖h(x0; v0)−‖² + ‖g(x0; v0)‖²)^{1/2}.
Using (5.23) and (5.34), we obtain that
d((x0; v0), S(h; g)) ≤ τ δ(f, g; x0) (‖f(x0)−‖² + ‖g(x0)‖²)^{1/2}.
Let ε > 0, and let (x*; v*) ∈ S(h; g) be such that
(5.37) ‖(x0; v0) − (x*; v*)‖ ≤ τ δ(f, g; x0) (‖f(x0)−‖² + ‖g(x0)‖²)^{1/2} + ε.
Then x* ∈ S(f, g) by (5.22), and
d(x0, S(f, g)) ≤ ‖x0 − x*‖ ≤ ‖(x0; v0) − (x*; v*)‖.
This together with (5.37) implies (2.12) because ε > 0 is arbitrary. The proof is complete.
Proof of Theorem 2.3. This is similar to the proof of Corollary 5.3, obtained by taking g = 0 in Theorem 2.2.
Acknowledgment. The authors are indebted to two anonymous referees for
their helpful remarks and comments.
REFERENCES
[1] L. Blum, F. Cucker, M. Shub, and S. Smale, Complexity and Real Computation, Springer-Verlag, New York, 1997.
[2] Y. R. He and J. Sun, Error bounds for degenerate cone inclusion problems, Math. Oper. Res.,
33 (2005), pp. 701–717.
[3] J.-P. Dedieu, Approximate solutions of analytic inequality systems, SIAM J. Optim., 11 (2000),
pp. 411–425.
[4] C. W. Groetsch, Generalized Inverses of Linear Operators: Representation and Approximation, Marcel Dekker, New York, 1977.
[5] J. He, J. Wang, and C. Li, Newton’s method for underdetermined systems of equations under
the γ-condition, Numer. Funct. Anal. Optim., 28 (2007), pp. 663–679.
[6] O. Güler, A. J. Hoffman, and U. G. Rothblum, Approximations to solutions to systems of
linear inequalities, SIAM J. Matrix Anal. Appl., 16 (1995), pp. 688–696.
[7] A. Hoffman, On approximate solutions of systems of linear inequalities, J. Res. Nat. Bur.
Standards, 49 (1952), pp. 263–265.
[8] D. Klatte and W. Li, Asymptotic constraint qualifications and global error bounds for convex inequalities, Math. Program. Ser. A, 84 (1999), pp. 137–160.
[9] G. Li, On the asymptotically well behaved functions and global error bound for convex polynomials, SIAM J. Optim., 20 (2010), pp. 1923–1943.
[10] G. Li, Global error bounds for piecewise convex polynomials, Math. Program., 137 (2013), pp.
37–64.
[11] W. Li, Abadie’s constraint qualification, metric regularity, and error bounds for differentiable
convex inequalities, SIAM J. Optim., 7 (1997), pp. 966–978.
[12] W. Li, Error bounds for piecewise convex quadratic programs and applications, SIAM J. Control
Optim., 33 (1995), pp. 1510–1529.
[13] X.-D. Luo and Z.-Q. Luo, Extension of Hoffman’s error bound to polynomial systems, SIAM
J. Optim., 4 (1994), pp. 383–392.
[14] X. Luo and J. Pang, Error bounds for analytic systems and their applications, Math. Program., 67 (1995), pp. 1–28.
[15] K. F. Ng and W. H. Yang, Regularities and their relations to error bounds, Math. Program.
Ser. A, 99 (2004), pp. 521–538.
[16] K. F. Ng and W. H. Yang, Error bounds for abstract linear inequality systems, SIAM J.
Optim., 13 (2002), pp. 24–43.
[17] K. F. Ng and X. Y. Zheng, Global error bounds with fractional exponents, Math. Program.
Ser. B, 88 (2000), pp. 357–370.
[18] K. F. Ng and X. Y. Zheng, Error bounds for lower semicontinuous functions in normed
spaces, SIAM J. Optim., 12 (2001), pp. 1–17.
[19] J. S. Pang, Error bounds in mathematical programming, Math. Program. Ser. B, 79 (1997),
pp. 299–332.
[20] S. M. Robinson, An application of error bounds for convex programming in a linear space,
SIAM J. Control, 13 (1975), pp. 271–273.
[21] S. M. Robinson, Stability theory for systems of inequalities. Part I: Linear systems, SIAM J.
Numer. Anal., 12 (1975), pp. 754–769.
[22] S. M. Robinson, Stability theory for systems of inequalities, Part II: Differentiable nonlinear
systems, SIAM J. Numer. Anal., 13 (1976), pp. 497–513.
[23] M. Shub and S. Smale, Complexity of Bezout’s theorem IV: Probability of success; extensions,
SIAM J. Numer. Anal., 33 (1996), pp. 128–148.
[24] S. Smale, Newton’s method estimates from data at one point, in The Merging of Disciplines:
New Directions in Pure, Applied and Computational Mathematics, R. Ewing, K. Gross,
and C. Martin, eds., Springer, New York, 1986, pp. 185–196.
[25] X. H. Wang, Convergence on the iteration of Halley family in weak conditions, Chinese Sci.
Bull., 42 (1997), pp. 552–555.
[26] X. H. Wang and D. F. Han, Criterion α and Newton’s method, Chinese J. Numer. Appl.
Math., 19 (1997), pp. 96–105.
[27] Z. L. Wu and J. J. Ye, On error bounds for lower semicontinuous functions, Math. Program.
Ser. A, 92 (2002), pp. 301–314.
[28] W. H. Yang, Error bounds for convex polynomials, SIAM J. Optim., 19 (2009), pp. 1633–1647.