© 2013 Society for Industrial and Applied Mathematics
SIAM J. Optim., Vol. 23, No. 2, pp. 1237–1256

APPROXIMATE SOLUTIONS FOR ABSTRACT INEQUALITY SYSTEMS∗

CHONG LI† AND K. F. NG‡

Abstract. We consider conic inequality systems of the type $F(x) \ge_K 0$ with an approximate solution $x_0$ associated to a parameter $\tau$, where $F$ is a twice Fréchet differentiable function between Hilbert spaces $X$ and $Y$, and $\ge_K$ is the partial order in $Y$ defined by a nonempty convex (not necessarily closed) cone $K \subseteq Y$. We prove that, under suitable conditions, the system $F(x) \ge_K 0$ is solvable, and the ratio of the distance from $x_0$ to the solution set $S$ over the distance from $F(x_0)$ to the cone $K$ has an upper bound given explicitly in terms of $\tau$ and $x_0$. We show that this upper bound is sharp. Applications to analytic inequality/equality systems on Euclidean spaces are given, and the corresponding results of Dedieu [SIAM J. Optim., 11 (2000), pp. 411–425] are extended and significantly improved.

Key words. conic inequality systems, approximate solution, stability

AMS subject classifications. 47J99, 90C31

DOI. 10.1137/120885176

1. Introduction. The constraint sets in most constrained optimization problems can be described as conic inequality systems of the following type:

(1.1) $F(x) \ge_K 0$,

where $F$ is a function between Banach spaces $X$, $Y$, and $Y$ is endowed with the partial order (or preorder) $\ge_K$ defined by a nonempty convex cone $K \subseteq Y$. One of the most important issues for conic inequality systems is that of error bounds.
We say that a constant $c$ is a (local) error bound for (1.1) at $x_0 \in X$ if there exists a neighborhood $U$ of $x_0$ such that

(1.2) $d(x, S) \le c\, e(x)$ for any $x \in U$,

where $S$ denotes the solution set of (1.1) (consisting of all $x$ satisfying (1.1)), $d(\cdot, S)$ is the distance function associated with $S$, defined by $d(x, S) := \inf\{\|x - z\| : z \in S\}$ for each $x \in X$, and $e(\cdot)$ is a suitable residual function measuring the degree of violation of (1.1).

∗Received by the editors July 19, 2012; accepted for publication (in revised form) March 19, 2013; published electronically June 20, 2013. http://www.siam.org/journals/siopt/23-2/88517.html
†Department of Mathematics, Zhejiang University, Hangzhou 310027, P. R. China (cli@zju.edu.cn). This author's work was supported in part by the National Natural Science Foundation of China (grant 11171300) and by Zhejiang Provincial Natural Science Foundation of China (grant Y6110006).
‡Department of Mathematics (and IMS), Chinese University of Hong Kong, Hong Kong, P. R. China (kfng@math.cuhk.edu.hk). This author's work was supported by an earmarked grant (GRF) from the Research Grants Council of Hong Kong (Projects CUHK 403110 and 402612).

In the case when $X = \mathbb{R}^n$, $Y = \mathbb{R}^m$, and $K = \mathbb{R}^m_-$ (consisting of all $m$-vectors $(\alpha_1, \ldots, \alpha_m) \in \mathbb{R}^m$ such that $\alpha_i \le 0$ for all $i$), system (1.1) reduces to finitely many inequalities. In this case, the most important and celebrated result regarding the error bound is due to Hoffman, who proved that if $F$ is affine and the solution set $S$ is nonempty, then there exists $c > 0$ such that (1.2) holds with $e(\cdot) = d(F(\cdot), \mathbb{R}^m_-)$ and $U = X$ (see [6, 7]).
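Hoffman's bound in the affine case can be checked numerically. The following sketch (our own illustration, not from the paper) takes the single affine inequality $x_1 + x_2 \le 0$ in $\mathbb{R}^2$, for which the distance to the half-space $S$ is available in closed form, and verifies (1.2) globally with the explicit constant $c = 1/\|A\|$:

```python
import numpy as np

# Toy affine system: F(x) = A @ x with A = [1, 1] and K = R_-, so that
# F(x) >=_K 0 reads x1 + x2 <= 0.  The solution set S is a half-space, and
#   e(x)    = d(F(x), R_-) = max(x1 + x2, 0)       (residual),
#   d(x, S) = max(x1 + x2, 0) / ||A||              (distance to half-space),
# so Hoffman's inequality (1.2) holds with U = X and c = 1 / ||A||.
A = np.array([1.0, 1.0])
c = 1.0 / np.linalg.norm(A)

rng = np.random.default_rng(0)
for x in rng.normal(size=(100, 2)):
    residual = max(A @ x, 0.0)            # e(x) = d(F(x), K)
    dist = residual / np.linalg.norm(A)   # exact distance d(x, S)
    assert dist <= c * residual + 1e-12   # Hoffman's bound (1.2)
```

The constant $1/\|A\|$ is special to this one-row example; Hoffman's theorem only asserts that *some* $c > 0$ works for a general affine $F$.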
Extensions of Hoffman's result to convex inequalities have been well established in the literature; see, for example, [8, 9, 10, 11, 12, 19, 20, 28] and the references therein. However, the study of error bounds for nonconvex inequalities is less developed. In the special case when (1.1) is a single inequality system $F(x) \le 0$ (i.e., $m = 1$), Luo and Luo [13] and Luo and Pang [14] established the global error bound inequality (1.2), for polynomial and analytic inequalities respectively, but with an unknown fractional exponent on $d(F(x), \mathbb{R}_-)$. Other related works in this direction can be found in [15, 17, 18, 27]. For the general case when $K$ is a closed convex cone, Ng and Yang [16] established some error bound results for abstract linear inequality systems (i.e., $F$ affine) in general Banach spaces, which in particular extend the corresponding finite-dimensional results due to Robinson [21]. For general differentiable systems, Robinson [22] proved that if the so-called Robinson constraint qualification (RCQ) holds at $x_0 \in S$, then there exist $c > 0$ and a neighborhood $U$ of $x_0$ such that (1.2) holds with $e(x) = d(F(x), K)$. By weakening the RCQ assumption, this result was extended by He and Sun [2]. We observe that all the results mentioned above are obtained under two additional assumptions: (a) $K$ is closed, and (b) $S$ is preassumed nonempty with $x_0 \in S$. In this paper we assume that $X$ and $Y$ are Hilbert spaces and that system (1.1) has an "approximate solution" $x_0$ in the sense that $F(x_0) + c \ge_K 0$ for some $c \in Y$ with small norm or small residual value, but we drop the two additional assumptions just mentioned. Given an approximate solution $x_0$, the main concern of this paper is to show, under suitable assumptions, that the solution set $S$ of (1.1) is nonempty and that its distance $d(x_0, S)$ from $x_0$ has an upper estimate in terms of $x_0$ together with a parameter.
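To make the flavor of such an estimate concrete, the following one-dimensional sketch (our own, anticipating Example 2.1 below) takes $K = \mathbb{R}_-$, so that (1.1) reads $F(x) \le 0$, starts from an infeasible $x_0 = 0$, and checks numerically that the distance to the solution set equals $\tau\,|F'(x_0)^{-1}|\,d(F(x_0), K)$. The particular values $\gamma = 1$ and $\tau = 1.5$ are illustrative choices only.

```python
import math

# Illustrative parameters (our choice): gamma = 1, tau = 1.5, and then
# xi = (tau - 1) / (gamma * tau * (2*tau - 1)), as in Example 2.1.
gamma, tau = 1.0, 1.5
xi = (tau - 1.0) / (gamma * tau * (2.0 * tau - 1.0))  # = 1/6

def F(x):  # the function of Example 2.1; the system is F(x) <= 0
    return xi - x + gamma * x * x / (1.0 - gamma * x)

# x0 = 0 is only an approximate solution: F(0) = xi > 0, and F'(0) = -1.
# The smaller root r* of F gives d(x0, S) = r*, while the paper's estimate
# is d(x0, S) <= tau * |F'(x0)^{-1}| * d(F(x0), K) = tau * xi.
a = 1.0 + gamma * xi
r_star = (a - math.sqrt(a * a - 8.0 * gamma * xi)) / (4.0 * gamma)

# Newton's method from x0 = 0 converges to the nearest solution r*.
x = 0.0
for _ in range(50):
    dF = -1.0 + gamma * (2.0 * x - gamma * x * x) / (1.0 - gamma * x) ** 2
    x -= F(x) / dF

assert abs(x - r_star) < 1e-9
assert r_star <= tau * xi + 1e-12   # the estimate, with equality here
```

Equality in the last line is what Example 2.1 exploits to show that the bound is sharp.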
In a more concrete situation we will consider, as applications, the following system with $Y = \mathbb{R}^l$,

(1.3) $f_i(x) > 0$, $i = 1, \ldots, l_S$; $\quad f_i(x) \ge 0$, $i = l_S + 1, \ldots, l$,

and the system with $Y = \mathbb{R}^{l+m}$ involving inequality and equality constraints,

(1.4) $f_i(x) > 0$, $i = 1, \ldots, l_S$; $\quad f_i(x) \ge 0$, $i = l_S + 1, \ldots, l$; $\quad g_j(x) = 0$, $j = 1, \ldots, m$.

Here $f_i$ and $g_j$ are real-valued functions on $X$, and $l$, $l_S$, $m$ are nonnegative integers such that $0 \le l_S \le l$. Associated with (1.3) and (1.4), let $f : X \to \mathbb{R}^l$, $g : X \to \mathbb{R}^m$, and $(f, g) : X \to \mathbb{R}^l \times \mathbb{R}^m$ be defined, respectively, by

(1.5) $f(x) := (f_1(x), \ldots, f_l(x))$, $\quad g(x) := (g_1(x), \ldots, g_m(x))$, and $(f, g)(x) := (f(x), g(x))$ for each $x \in X$.

We assume throughout that the involved functions $f_i$, $g_j$ (and so $f$, $g$, and $(f, g)$) are analytic (at least locally around the point $x$ under consideration); thus we have the following real constants:

(1.6) $\Gamma(f; x) := \sup_{k \ge 2} \left\| \frac{f^{(k)}(x)}{k!} \right\|^{\frac{1}{k-1}}$ and $\Gamma(f, g; x) := \sup_{k \ge 2} \left\| \frac{(f, g)^{(k)}(x)}{k!} \right\|^{\frac{1}{k-1}}$,

where, e.g., $f^{(k)}(x)$ denotes the $k$th derivative of $f$ at $x$ and

(1.7) $\|f^{(k)}(x)\| := \sup\{\|f^{(k)}(x)(u_1, \ldots, u_k)\| : \text{each } u_i \in X \text{ with } \|u_i\| \le 1\}$.

Let $f'(x)^*$ denote the conjugate (adjoint) of $f'(x)$; thus $f'(x)f'(x)^*$ is a linear operator from $\mathbb{R}^l$ into itself and is, as usual, represented as an $l \times l$ matrix. For a matrix $A$, we use $A^\dagger$ to denote the Moore–Penrose generalized inverse of $A$, and $\|A\|$ the usual operator norm of $A$. In connection with (1.4), one defines an $(l+m) \times (l+m)$ matrix $G(x)$ for $x \in X$ by

(1.8) $G(x) := \begin{pmatrix} f'(x)f'(x)^* + \mathrm{Diag}(4f(x)_+) & f'(x)g'(x)^* \\ g'(x)f'(x)^* & g'(x)g'(x)^* \end{pmatrix}$,

where $\mathrm{Diag}(4f(x)_+)$ is the diagonal matrix with diagonal entries $4f_i(x)_+$, $i = 1, \ldots, l$.
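The block matrix $G(x)$ of (1.8), and the quantity $\delta(f, g; x) = \|G(x)^\dagger\|^{1/2}$ it feeds into in (1.9) below, are straightforward to assemble numerically. The following sketch uses illustrative data of our own choosing ($X = \mathbb{R}^3$, $l = 2$, $m = 1$, with made-up values for $f(x)$ and the Jacobians):

```python
import numpy as np

# Illustrative data (our choice): at some point x,
#   f(x)  = (1.0, -0.5)          values of f_1, f_2
#   Df    = f'(x), a 2x3 Jacobian;  Dg = g'(x), a 1x3 Jacobian.
fx = np.array([1.0, -0.5])
Df = np.array([[1.0, 0.0, 2.0],
               [0.0, 1.0, 1.0]])
Dg = np.array([[1.0, 1.0, 0.0]])

# a_+ = max(a, 0) componentwise, matching the paper's notation.
f_plus = np.maximum(fx, 0.0)

# G(x) from (1.8): an (l+m) x (l+m) symmetric block matrix.
G = np.block([[Df @ Df.T + np.diag(4.0 * f_plus), Df @ Dg.T],
              [Dg @ Df.T,                         Dg @ Dg.T]])

# delta(f, g; x) = ||G(x)^dagger||^(1/2) as in (1.9); pinv computes the
# Moore-Penrose inverse, and the spectral norm is the operator norm.
delta = np.linalg.norm(np.linalg.pinv(G), 2) ** 0.5
```

For this data $G$ is nonsingular, so `pinv` coincides with the ordinary inverse; in the degenerate cases discussed below, `pinv` is exactly the $G(x)^\dagger$ the paper works with.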
Here and throughout, we use the notations $a_+ := \max\{a, 0\}$ and $a_- := \max\{-a, 0\}$ for any real number $a$. Let

(1.9) $\delta(f; x) := \|(f'(x)f'(x)^* + \mathrm{Diag}(4f(x)_+))^\dagger\|^{\frac{1}{2}}$ and $\delta(f, g; x) := \|G(x)^\dagger\|^{\frac{1}{2}}$.

One of the main results of Dedieu in [3] for system (1.4) with $X = \mathbb{R}^n$ can be stated as follows: if $x_0 \in \mathbb{R}^n$ and $\sigma := \sum_{k=0}^{\infty} \left(\frac{1}{2}\right)^{2^k - 1} = 1.63284\ldots$ are such that

(1.10) $f_i(x_0) > (\sigma\, \delta(f; x_0)\, \|f(x_0)_-\|)^2$ for each $i = 1, \ldots, l_S$,

(1.11) the matrix $G(x_0)$ is nonsingular, and

(1.12) $(\|f(x_0)_-\|^2 + \|g(x_0)\|^2)^{\frac{1}{2}} < \frac{1}{8(1 + \Gamma(f, g; x_0))\, \delta(f, g; x_0) \max\{1, \delta(f, g; x_0)\}}$,

then the solution set (denoted by $(f, g)^\ge$) of (1.4) is nonempty, and its distance from $x_0$ satisfies

(1.13) $d(x_0, (f, g)^\ge) \le \sigma\, \delta(f, g; x_0)\, (\|f(x_0)_-\|^2 + \|g(x_0)\|^2)^{\frac{1}{2}}$.

Similar treatments have also been carried out for system (1.3) with $X = \mathbb{R}^n$ by Dedieu; see [3] and the references therein for more details and background information. From some points of view, assumptions (1.10) and (1.11) are too restrictive; for example, when all $f_i$ with $l_S < i \le l$ and all $g_j$ are zero functions, the matrix $G(x_0)$ is clearly singular even though (1.10) itself trivially entails that $x_0$ is a solution of (1.4), so the results of Dedieu mentioned above cannot be applied. More generally, we study the corresponding issues for the general system (1.1), for which we assume from now on that $F$ is twice Fréchet differentiable (at least locally around $x \in X$, with first and second derivatives $F'(\cdot)$ and $F''(\cdot)$). We say that $F$ satisfies the $\gamma$-condition at $x_0 \in X$ on the open ball $B(x_0, r)$ (with center $x_0 \in X$ and radius $r > 0$) if $0 \le r\gamma \le 1$ and

(1.14) $\|F'(x_0)^\dagger F''(x)\| \le \frac{2\gamma}{(1 - \gamma\|x - x_0\|)^3}$ for each $x \in B(x_0, r)$,
where $F'(x_0)^\dagger$ is the Moore–Penrose generalized inverse of $F'(x_0)$ and the norm of $F'(x_0)^\dagger F''(x)$ is defined similarly to (1.7); see section 3 for further notational matters. The $\gamma$-condition (in the special case when $F'(x_0)$ is nonsingular) was introduced by Wang (see [25, 26]) to improve the famous point estimate theory presented by Smale [24]. An important example is of course the case when $F$ is analytic: if $F : X \to Y$ is analytic on $B(x_0, r)$, then $F$ satisfies the $\gamma$-condition at $x_0$ on $B(x_0, r_0)$ for any $\gamma \in [\gamma_F(x_0), +\infty)$ and $r_0 := \min\{r, \frac{1}{\gamma}\}$, where the quantity $\gamma_F(x_0)$ was introduced by Shub and Smale [23] (see also [1]) and is defined by

(1.15) $\gamma_F(x_0) := \sup_{k \ge 2} \left\| \frac{F'(x_0)^\dagger F^{(k)}(x_0)}{k!} \right\|^{\frac{1}{k-1}}$.

Our main results are described in terms of the $\gamma$-condition and cover cases in which $K$ is not necessarily closed and/or nonsingularity is not assumed. As we will see later, an advantage of our approach is that the corresponding constant $c$ in the error bound result can be determined pointwise and explicitly. In particular, our estimate is optimal in some sense; see Example 2.1. As applications, we extend and improve the results of Dedieu mentioned above for systems (1.3) and (1.4). In particular, we show that for (1.13) to hold, both (1.11) and (1.12) can be replaced by weaker assumptions, while (1.10) is not needed, as explained before Theorem 2.2 in section 2. Another by-product of our main results can be cast from the stability point of view (Robinson initiated the study of stability for (1.1) in [21, 22]). Under suitable assumptions, we show that if (1.1) is solvable, then the perturbed systems

(1.16) $F(x) - y \ge_K 0$

(with $\|y\|$ small enough) are also solvable, and the solution map $y \mapsto S(y)$ enjoys a calmness property. The paper is organized as follows. In section 2, we summarize the main theorems and their corollaries. Some basic concepts and known facts needed in the rest of the paper are listed in section 3.
Section 4 contains the proofs of the main theorems, while applications are provided in section 5.

2. Main results. Throughout the paper let $X$ and $Y$ be (real) Hilbert spaces, and let $K \subseteq Y$ be a cone, which defines a binary relation $\ge_K$ by

$y_1 \ge_K y_2 \iff y_1 - y_2 \in K$.

Given a bounded linear operator $A$ (or a matrix), let $\ker A$, $\mathrm{im}\,A$, and $A^*$ denote the kernel, the image, and the conjugate operator of $A$, respectively. Let $A^\dagger$ denote the Moore–Penrose generalized inverse of $A$ when it exists uniquely (e.g., if $\mathrm{im}\,A$ is closed). The following theorem is our main result regarding the solution set $S := \{x \in X : F(x) \in K\}$ of system (1.1), where we assume that $F$ is a function from $X$ into $Y$. The usual convention that $\frac{a}{0} = +\infty$ is always adopted for any $a > 0$. Further, for a nonempty set $D$ in $X$ (or in other normed spaces), we use the notation $\|D\|$ to denote the distance from the origin to $D$, namely $\|D\| := \inf\{\|u\| : u \in D\}$, with the convention that $\|\emptyset\| = +\infty$. The closure and the relative interior of $D$ are denoted by $\overline{D}$ and ri $D$, respectively. For an element $x \in X$, the notation $D - x$ stands for the set consisting of all elements $d - x$ with $d \in D$.

Theorem 2.1. Let $x_0 \in X$ and $\bar r, \gamma \in [0, +\infty)$. Let $F : X \to Y$ be continuous on $\overline{B(x_0, \bar r)}$ and twice Fréchet differentiable on $B(x_0, \bar r)$ with the following properties:
(1) $\mathrm{im}\,F'(x_0)$ is closed;
(2) $F$ satisfies the $\gamma$-condition at $x_0$ on $B(x_0, \bar r)$;
(3)
(2.1) $K \cup \{F(x_0)\} \cup \mathrm{im}\,F''(x) \subseteq \mathrm{im}\,F'(x_0)$ for each $x \in B(x_0, \bar r)$.

Let $\xi$, $r^*$ be defined by

(2.2) $\xi := \|F'(x_0)^\dagger(K - F(x_0))\|$

and

(2.3) $r^* := \begin{cases} \dfrac{1 + \gamma\xi - \sqrt{(1 + \gamma\xi)^2 - 8\gamma\xi}}{4\gamma}, & 0 < \gamma \le \dfrac{3 - 2\sqrt{2}}{\xi}, \\[2mm] \xi, & \gamma = 0. \end{cases}$

Suppose that $r^* \le \bar r$ and that there exists $\tau \in \mathbb{R}$ such that

(2.4) $1 < \tau \le \frac{2 + \sqrt{2}}{2}$ and $\xi \le \frac{\tau - 1}{\gamma\tau(2\tau - 1)}$.
Then there exists $x^* \in X$ such that $F(x^*) \in \overline{K}$ and

(2.5) $\|x_0 - x^*\| \le \tau\, \|F'(x_0)^\dagger\|\, d(F(x_0), K)$.

Moreover, if (a) $K$ is closed or (b) $r^* < \bar r$, $\xi < \frac{3 - 2\sqrt{2}}{\gamma}$, and the relative interior ri $K$ of $K$ is nonempty, then $S$ is nonempty and

(2.6) $d(x_0, S) \le \tau\, \|F'(x_0)^\dagger\|\, d(F(x_0), K)$.

Remark 2.1. If $\xi = 0$ in (2.2), then assumption (2.1) entails that $F(x_0) \in \overline{K}$. In fact, assume that $\xi = 0$ in (2.2) and that (2.1) holds. Then $\|F'(x_0)^\dagger(K - F(x_0))\| = 0$, and, by definition, we can choose $\{u_n\} \subseteq K$ such that $F'(x_0)^\dagger(u_n - F(x_0)) \to 0$. Thus $F'(x_0)F'(x_0)^\dagger(u_n - F(x_0)) \to 0$. Since $F'(x_0)F'(x_0)^\dagger$ is simply the projection $\Pi_{\mathrm{im}\,F'(x_0)}$ (see (3.3) of the next section) and since $u_n - F(x_0)$ lies in $\mathrm{im}\,F'(x_0)$ by (2.1), it follows that $u_n - F(x_0) \to 0$. Therefore $F(x_0) \in \overline{K}$, as claimed. Similarly, if $\gamma = 0$, then the $\gamma$-condition at $x_0$ on $B(x_0, \bar r)$ together with assumption (2.1) entails that $F''(x) = 0$ for each $x \in B(x_0, \bar r)$; that is, $F$ is affine on $B(x_0, \bar r)$.

The following example shows that the bound $\tau\|F'(x_0)^\dagger\|$ before $d(F(x_0), K)$ in (2.6) cannot be improved.

Example 2.1. Let $\gamma \in (0, +\infty)$ and $\tau \in (1, \frac{2+\sqrt{2}}{2}]$ (so $\frac{\tau - 1}{\tau(2\tau - 1)} \le 3 - 2\sqrt{2}$; see (4.2) and (4.3)). Define $F : (-\infty, \frac{1}{\gamma}) \subseteq \mathbb{R} \to \mathbb{R}$ by

(2.7) $F(x) := \frac{\tau - 1}{\gamma\tau(2\tau - 1)} - x + \frac{\gamma x^2}{1 - \gamma x}$ for each $x \in \big({-\infty}, \frac{1}{\gamma}\big)$.

Consider the inequality system (1.1) with $K := \mathbb{R}_-$ and $F$ defined above, namely

(2.8) $F(x) \le 0$.

Let $\bar r := \frac{1}{2\gamma}$ and $x_0 = 0$. Then $F'(x_0) = -1$, and it is routine to check that $F$ possesses properties (1)–(3) of Theorem 2.1. Moreover, we have that

(2.9) $\xi = \|F'(x_0)^\dagger(K - F(x_0))\| = \frac{\tau - 1}{\gamma\tau(2\tau - 1)} \le \frac{3 - 2\sqrt{2}}{\gamma} < \frac{1}{\gamma}$,

so (2.4) is satisfied and $r^* < \bar r$, where $r^*$ is defined by (2.3). Note that the solution set $S = [r^*, r^{**}]$, where $r^{**} := \frac{1 + \gamma\xi + \sqrt{(1 + \gamma\xi)^2 - 8\gamma\xi}}{4\gamma} \le \bar r$.
Therefore, $d(x_0, S) = |0 - r^*| = r^*$. Noting that $\xi \ne 0$ by (2.9), one can check by elementary calculation that

$\frac{1 + \gamma\xi - \sqrt{(1 + \gamma\xi)^2 - 8\gamma\xi}}{4\gamma} = \tau\xi \iff \xi = \frac{\tau - 1}{\gamma\tau(2\tau - 1)}$.

It follows from (2.9) that $r^* = \frac{1 + \gamma\xi - \sqrt{(1 + \gamma\xi)^2 - 8\gamma\xi}}{4\gamma} = \tau\xi$. Thus we have

$d(x_0, S) = r^* = \tau\, \|F'(x_0)^{-1}\|\, d(F(x_0), K)$

because $d(F(x_0), K) = \xi$ and $\|F'(x_0)^{-1}\| = 1$. This shows that the bound $\tau\|F'(x_0)^{-1}\|$ is optimal.

The following theorem, Theorem 2.2, improves the corresponding result of Dedieu mentioned in section 1 in three aspects: we drop his assumption (1.10), weaken his nonsingularity assumption, and replace the constant $\frac{1}{8}$ on the right-hand side of (1.12) by a bigger constant. Indeed, if we take $\tau = \sigma = \sum_{k=0}^{\infty} (\frac{1}{2})^{2^k - 1} = 1.63284\ldots$ in Theorems 2.2 and 2.3, then

$\frac{\tau - 1}{\tau(2\tau - 1)} = \frac{\sigma - 1}{\sigma(2\sigma - 1)} = 0.1710\ldots > \frac{1}{8}$.

Recall that $\Gamma$ and $\delta$ are defined as in (1.6) and (1.9).

Theorem 2.2. Consider the inequality/equality system (1.4), where each $f_i$ and each $g_j$ is a real-valued function on $X$, and $(f, g)$ is defined as in (1.5). Let $x_0 \in X$, and suppose that $(f, g)$ is analytic on $B\big(x_0, \frac{2 - \sqrt{2}}{2(1 + \Gamma(f, g; x_0))}\big)$ and such that

(2.10) $(\mathbb{R}^l \times \{0_m\}) \cup \{(f(x_0), g(x_0))\} \cup \mathrm{im}(f^{(k)}(x_0), g^{(k)}(x_0)) \subseteq \mathrm{im}\,G(x_0)$ for each $k = 2, \ldots$

(e.g., $G(x_0)$ is nonsingular), where $G(x_0)$ is defined as in (1.8). Let $\tau \in (1, \frac{2+\sqrt{2}}{2}]$ be such that

(2.11) $(\|f(x_0)_-\|^2 + \|g(x_0)\|^2)^{\frac{1}{2}} < \frac{\tau - 1}{\tau(2\tau - 1)(1 + \Gamma(f, g; x_0))\, \delta(f, g; x_0) \max\{1, \delta(f, g; x_0)\}}$.

Then the solution set $S(f, g)$ of (1.4) is nonempty, and the following estimate holds:

(2.12) $d(x_0, S(f, g)) \le \tau\, \delta(f, g; x_0)\, (\|f(x_0)_-\|^2 + \|g(x_0)\|^2)^{\frac{1}{2}}$.

Similarly we also extend the result of [3, Theorem 1] for system (1.3) as follows.

Theorem 2.3.
Consider system (1.3), where each $f_i$ is a real-valued function on $X$ and $f$ is defined as in (1.5). Let $x_0 \in X$, and suppose that $f$ is analytic on $B\big(x_0, \frac{2 - \sqrt{2}}{2(1 + \Gamma(f; x_0))}\big)$ and such that

(2.13) the matrix $f'(x_0)f'(x_0)^* + \mathrm{Diag}(4f(x_0)_+)$ is nonsingular.

Let $\tau \in (1, \frac{2+\sqrt{2}}{2}]$ be such that

(2.14) $\|f(x_0)_-\| < \frac{\tau - 1}{\tau(2\tau - 1)(1 + \Gamma(f; x_0))\, \delta(f; x_0) \max\{1, \delta(f; x_0)\}}$.

Then the solution set $f^>$ of (1.3) is nonempty, and the following estimate holds:

(2.15) $d(x_0, f^>) \le \tau\, \delta(f; x_0)\, \|f(x_0)_-\|$.

We defer the proofs of the above three theorems to sections 4 and 5. As a direct application of Theorem 2.1, we present below a proof of the following stability result for the perturbed system (1.16).

Theorem 2.4. Let $\gamma \in [0, +\infty)$, and let $x_0 \in X$ be a solution of system (1.1), where the function $F : X \to Y$ is twice Fréchet differentiable on $B\big(x_0, \frac{2 - \sqrt{2}}{2\gamma}\big)$ with properties (1)–(3) in Theorem 2.1 but with $\bar r$ replaced by $\frac{2 - \sqrt{2}}{2\gamma}$. If $K$ is closed or the relative interior ri $K$ of $K$ is nonempty, then, for any $y \in B\big(0, \frac{3 - 2\sqrt{2}}{\gamma\|F'(x_0)^\dagger\|}\big) \cap \mathrm{im}\,F'(x_0)$, the perturbed system (1.16) is solvable, and its solution set $S(y)$ satisfies

(2.16) $d(x_0, S(y)) \le \frac{2 + \sqrt{2}}{2}\, \|F'(x_0)^\dagger\|\, \|y\|$ for any $y \in B\big(0, \frac{3 - 2\sqrt{2}}{\gamma\|F'(x_0)^\dagger\|}\big) \cap \mathrm{im}\,F'(x_0)$.

(In particular, the solution map $S(\cdot)$, restricted to $\mathrm{im}\,F'(x_0)$, is calm at $(0, x_0)$.)

Proof. Consider $\tau := \frac{2 + \sqrt{2}}{2}$, and note that $\frac{\tau - 1}{\tau(2\tau - 1)} = 3 - 2\sqrt{2}$. Let $y \in B\big(0, \frac{3 - 2\sqrt{2}}{\gamma\|F'(x_0)^\dagger\|}\big) \cap \mathrm{im}\,F'(x_0)$, define the function

(2.17) $F_y(\cdot) := F(\cdot) - y$, and let $\xi_y := \|F_y'(x_0)^\dagger(K - F_y(x_0))\|$.

Then $F_y' \equiv F'$ and $F_y'' \equiv F''$. Since $x_0$ is a solution of system (1.1), one has $F(x_0) \in K$, and so $y = F(x_0) - F_y(x_0) \in K - F_y(x_0)$; it follows that

(2.18) $\xi_y \le \|F_y'(x_0)^\dagger\|\,\|y\| = \|F'(x_0)^\dagger\|\,\|y\| < \frac{3 - 2\sqrt{2}}{\gamma} = \frac{\tau - 1}{\gamma\tau(2\tau - 1)}$.

Therefore the corresponding $r_y^*$ (defined as in (2.3) but with $\xi$ replaced by $\xi_y$) satisfies

$r_y^* \le \frac{1 + \gamma\xi_y}{4\gamma} < \frac{1 + 3 - 2\sqrt{2}}{4\gamma} = \frac{2 - \sqrt{2}}{2\gamma}$.

Pick $\bar r \in \big(r_y^*, \frac{2 - \sqrt{2}}{2\gamma}\big)$.
This together with the given assumptions implies that $F_y$ satisfies the $\gamma$-condition on $B(x_0, \bar r)$ and that

$K \cup \{F_y(x_0)\} \cup \mathrm{im}\,F_y''(x) \subseteq \mathrm{im}\,F_y'(x_0)$ for each $x \in B(x_0, \bar r)$

(noting that $\mathrm{im}\,F_y'(x_0) = \mathrm{im}\,F'(x_0)$ contains $y$ and $K$, and that $F_y(x_0) \in K - y$). Therefore, if $K$ is closed or ri $K$ is nonempty, one can apply Theorem 2.1 to conclude that the perturbed problem $F(x) - y = F_y(x) \ge_K 0$ is solvable and that its solution set $S(y)$ satisfies

$d(x_0, S(y)) \le \tau\, \|F_y'(x_0)^\dagger\|\, d(F_y(x_0), K) \le \tau\, \|F_y'(x_0)^\dagger\|\, \|F_y(x_0) - F(x_0)\| = \frac{2 + \sqrt{2}}{2}\, \|F'(x_0)^\dagger\|\, \|y\|$.

This proves (2.16), and the proof is complete.

3. Notations, notions, and auxiliary lemmas. Let $\mathbb{N}$ denote the set of all natural numbers, and let $k \in \mathbb{N}$. We consider a $k$-multilinear bounded operator $\Lambda : X^k \to Y$. As in (1.7), we define the norm $\|\Lambda\|$ by

$\|\Lambda\| := \sup\{\|\Lambda(x_1, \ldots, x_k)\| : (x_1, \ldots, x_k) \in X^k, \|x_i\| \le 1 \text{ for each } i\}$;

also, let $\mathrm{im}\,\Lambda$ denote the image of $\Lambda$:

$\mathrm{im}\,\Lambda := \{\Lambda(x_1, \ldots, x_k) : (x_1, \ldots, x_k) \in X^k\}$.

Recall that if $A$ is a bounded linear operator from $X$ to $Y$ such that $\mathrm{im}\,A$ is closed, then it has a unique Moore–Penrose generalized inverse (denoted by $A^\dagger$), which is, by definition, the bounded linear operator from $Y$ to $X$ such that

(3.1) $A A^\dagger A = A$, $\quad A^\dagger A A^\dagger = A^\dagger$, $\quad (A^\dagger A)^* = A^\dagger A$, and $(A A^\dagger)^* = A A^\dagger$,

where $A^*$ denotes the adjoint operator of $A$. For convenience, we list some facts regarding the Moore–Penrose inverse:

(3.2) $(A^*)^\dagger = (A^\dagger)^*$, $\quad A^\dagger = (A^* A)^\dagger A^* = A^* (A A^*)^\dagger$, $\quad (A A^*)^\dagger = (A^*)^\dagger A^\dagger$,

and

(3.3) $A^\dagger A = \Pi_{(\ker A)^\perp}$, $\quad A A^\dagger = \Pi_{\mathrm{im}\,A}$, $\quad A^\dagger \Pi_{\mathrm{im}\,A} = A^\dagger A A^\dagger = A^\dagger$,

where $\Pi_D$ denotes the projection onto the subspace $D$; see, for example, [4] for more details. Notice by (3.2) that $(A A^*)^\dagger = (A^*)^\dagger A^\dagger = (A^\dagger)^* A^\dagger$. It follows that

(3.4) $\|(A A^*)^\dagger\| = \|A^\dagger\|^2$.

Lemma 3.1.
Let $A$ be a bounded linear operator from $X$ to $Y$ with closed image. Let $D$ be a closed convex subset of $\mathrm{im}\,A$. Then there exists $u_0 \in D$ such that $\|A^\dagger u_0\| = \|A^\dagger D\|$.

Proof. Choose a sequence $\{u_n\} \subseteq D$ such that $\|A^\dagger u_n\| \to \|A^\dagger D\|$. Then $\{A A^\dagger u_n\}$ is bounded. Since $\{u_n\} \subseteq D \subseteq \mathrm{im}\,A$ by assumption, and since $A A^\dagger$ is the projection $\Pi_{\mathrm{im}\,A}$ onto $\mathrm{im}\,A$ (see (3.3)), one has $u_n = A A^\dagger u_n$, and so $\{u_n\}$ is bounded. Thus $\{u_n\}$ has a subsequence, denoted by itself, which converges weakly to some point $u_0 \in D$; consequently, by a standard result in functional analysis, there exists a sequence $\{v_n\} \subseteq D$, each of whose elements is representable as a convex combination of some elements of $\{u_n\}$, such that $v_n$ converges to $u_0$ strongly. By the continuity and linearity of $A^\dagger$, it is then routine to check that $\|A^\dagger u_0\| = \|A^\dagger D\|$, which completes the proof.

Lemma 3.2. Let $\gamma > 0$, and suppose that $F$ satisfies the $\gamma$-condition at some $\bar x \in X$ on $B(\bar x, r)$ with some $r \in (0, \frac{1}{\gamma}]$. Suppose further that

(3.5) $\mathrm{im}\,F''(x) \subseteq \mathrm{im}\,F'(\bar x)$ for each $x \in B(\bar x, r)$.

Let $x \in X$ be such that

(3.6) $\|x - \bar x\| < \min\big\{r, \frac{2 - \sqrt{2}}{2\gamma}\big\}$.

Then

(3.7) $\mathrm{im}\,F'(x) = \mathrm{im}\,F'(\bar x)$

and

(3.8) $\|F'(x)^\dagger F'(\bar x)\| \le \left(2 - \frac{1}{(1 - \gamma\|x - \bar x\|)^2}\right)^{-1} = \frac{(1 - \gamma\|x - \bar x\|)^2}{1 - 4\gamma\|x - \bar x\| + 2\gamma^2\|x - \bar x\|^2}$;

moreover, if $F(\bar x) \in \mathrm{im}\,F'(\bar x)$, then

(3.9) $F(x) \in \mathrm{im}\,F'(\bar x) = \mathrm{im}\,F'(x)$.

Proof. Since

(3.10) $F'(x) = F'(\bar x) + \int_0^1 F''(\bar x + t(x - \bar x))(x - \bar x)\,dt$,

assumption (3.5) implies that the image of $F'(x)$ is contained in $\mathrm{im}\,F'(\bar x)$. Since, when restricted to $\mathrm{im}\,F'(\bar x)$, $F'(\bar x)F'(\bar x)^\dagger$ is the identity map (see (3.3)), we then have that $F'(\bar x)F'(\bar x)^\dagger F'(x) = F'(x)$. Letting $W := F'(\bar x)^\dagger(F'(\bar x) - F'(x))$ and $I_X$ denote the identity map on $X$, it follows from (3.1) that

(3.11) $F'(x) = F'(\bar x)F'(\bar x)^\dagger(F'(x) - F'(\bar x)) + F'(\bar x) = F'(\bar x)(I_X - W)$.
Assuming that $I_X - W$ is invertible, (3.7) is seen to hold (by (3.11)), and $F'(\bar x) = F'(x)(I_X - W)^{-1}$, and so

(3.12) $\|F'(x)^\dagger F'(\bar x)\| = \|F'(x)^\dagger F'(x)(I_X - W)^{-1}\| \le \|(I_X - W)^{-1}\|$

because $F'(x)^\dagger F'(x)$ is a projection operator (and thus of norm bounded by 1) by (3.3). Thus (3.8) will also be shown, provided that

(3.13) $\|(I_X - W)^{-1}\| \le \frac{(1 - \gamma\|x - \bar x\|)^2}{1 - 4\gamma\|x - \bar x\| + 2\gamma^2\|x - \bar x\|^2}$.

To accomplish this, we use (3.10) and the assumed $\gamma$-condition to note that

$\|W\| \le \int_0^1 \|F'(\bar x)^\dagger F''(\bar x + t(x - \bar x))\|\, \|x - \bar x\|\,dt \le \int_0^1 \frac{2\gamma}{(1 - \gamma t\|x - \bar x\|)^3}\, \|x - \bar x\|\,dt = \frac{1}{(1 - \gamma\|x - \bar x\|)^2} - 1$.

Further, since $\gamma\|x - \bar x\| < \frac{2 - \sqrt{2}}{2}$ by (3.6), we also see that $\|W\| < 1$, as

$\frac{1}{(1 - \gamma\|x - \bar x\|)^2} < \frac{1}{\big(1 - \frac{2 - \sqrt{2}}{2}\big)^2} = 2$.

By the Banach lemma, $I_X - W$ is therefore invertible, and (3.13) holds. Finally, suppose that $F(\bar x) \in \mathrm{im}\,F'(\bar x)$. To establish (3.9), we need only show that $F(x) \in \mathrm{im}\,F'(\bar x)$. But this is clearly true from the identity

$F(x) = F(\bar x) + \int_0^1 F'(\bar x + t(x - \bar x))(x - \bar x)\,dt$

because $F'(\bar x + t(x - \bar x))(x - \bar x) \in \mathrm{im}\,F'(\bar x)$ by (3.7) for any $t \in [0, 1]$.

4. Proofs of main results. For convenience, we shall use $w$ to denote the elementary function defined by

(4.1) $w(t) := \frac{2}{1 + t + \sqrt{(1 + t)^2 - 8t}}$ for each $t \in [0, 3 - 2\sqrt{2}]$.

Then

(4.2) $w : [0, 3 - 2\sqrt{2}] \to \big[1, \frac{2 + \sqrt{2}}{2}\big]$ is strictly increasing and bijective,

with inverse map $w^{-1} : \big[1, \frac{2 + \sqrt{2}}{2}\big] \to [0, 3 - 2\sqrt{2}]$ given by

(4.3) $w^{-1}(t) := \frac{t - 1}{t(2t - 1)}$ for each $t \in \big[1, \frac{2 + \sqrt{2}}{2}\big]$.

Let $\xi, \gamma \in [0, +\infty)$, and define the function $\phi_{\xi,\gamma}$ by

(4.4) $\phi_{\xi,\gamma}(t) := \xi - t + \frac{\gamma t^2}{1 - \gamma t}$ for each $t \in \big({-\infty}, \frac{1}{\gamma}\big)$.

Suppose that

(4.5) $0 \le \xi \le \frac{3 - 2\sqrt{2}}{\gamma}$.

Then the smaller root $r^*$ of $\phi_{\xi,\gamma}$ on $[0, 1/\gamma)$ is given by (2.3); that is,

(4.6) $r^* := \begin{cases} \dfrac{1 + \gamma\xi - \sqrt{(1 + \gamma\xi)^2 - 8\gamma\xi}}{4\gamma}, & 0 < \gamma \le \dfrac{3 - 2\sqrt{2}}{\xi}, \\[2mm] \xi, & \gamma = 0; \end{cases}$

we note that

(4.7) $0 \le r^* \le \frac{1 + \gamma\xi}{4\gamma} \le \frac{2 - \sqrt{2}}{2\gamma}$.
Notice further that the inequality in (2.4) and the definition of $r^*$ given by (4.6) can be rewritten as

(4.8) $\gamma\xi \le w^{-1}(\tau)$

and

(4.9) $r^* = w(\gamma\xi)\,\xi$,

respectively. If there is no ambiguity, we shall write $\phi$ for $\phi_{\xi,\gamma}$ (defined as in (4.4)). Let $h : D \subseteq X \to Y$ be a nonlinear map with continuous Fréchet derivative. Let $\{t_n\}$ and $\{x_n\}$ be the sequences generated by the (Gauss–)Newton method for $\phi$ and for $h$, respectively, with initial points $t_0 = 0$ and $x_0$; that is,

(4.10) $t_{n+1} := t_n - \phi'(t_n)^{-1}\phi(t_n)$ for each $n = 0, 1, \ldots$

and

(4.11) $x_{n+1} := x_n - h'(x_n)^\dagger h(x_n)$ for each $n = 0, 1, \ldots$.

The main part of the following result is taken from [5].

Proposition 4.1. Let $\gamma \in [0, +\infty)$, $\Omega \subseteq X$, and let $x_0$ be an interior point of $\Omega$. Let $h : \Omega \to Y$ be continuous and twice Fréchet differentiable on the interior of $\Omega$ with surjective derivative $h'(x_0)$. Let $\xi$ be defined by $\xi := \|h'(x_0)^\dagger h(x_0)\|$, let $r^*$ be defined by (4.6), and let $\phi := \phi_{\xi,\gamma}$. Suppose that

(4.12) $\xi \le \frac{3 - 2\sqrt{2}}{\gamma}$,

(4.13) $\overline{B(x_0, r^*)} \subseteq \Omega$,

and $h$ satisfies the $\gamma$-condition at $x_0$ on $B(x_0, r^*)$. Then the Newton method (4.11) with initial point $x_0$ is well defined, and the generated sequence $\{x_n\}$ converges to a zero $x^*$ of $h$ with $x^* \in \overline{B(x_0, r^*)}$ satisfying

(4.14) $\|x_n - x^*\| \le r^* - t_n$ for each $n = 0, 1, \ldots$.

Proof. If $\xi = 0$, then, by definition, one has $x_n = x_0 = x^*$ and $t_n = r^* = 0$ for each $n$, and (4.14) holds trivially. Suppose that $\xi > 0$ but $\gamma = 0$. Then, by the given assumptions of surjectivity, the $\gamma$-condition, and continuity, $h$ is affine on $B(x_0, r^*)$:

$h(x) = h(x_0) + h'(x_0)(x - x_0)$ for each $x \in B(x_0, r^*)$.
Thus we have that

$x_n = x^* = x_0 - h'(x_0)^\dagger h(x_0)$ and $t_n = r^* = \xi$ for each $n > 0$.

Therefore, (4.14) is seen to hold because

$\|x_n - x^*\| = \begin{cases} \xi = r^* - t_0, & n = 0, \\ 0 = r^* - t_n, & n > 0. \end{cases}$

Finally, for the remaining case (namely, $\xi > 0$, $\gamma > 0$), the proof given in [5, Theorem 3.3] can be adopted here, although the setting of [5] is in finite-dimensional spaces $X$ and $Y$.

Now we are ready to give the proof of Theorem 2.1.

Proof of Theorem 2.1. By the assumptions, $Y_0 := \mathrm{im}\,F'(x_0)$ is a closed subspace such that

$K \cup \{F(x_0)\} \cup \mathrm{im}\,F''(x) \subseteq \mathrm{im}\,F'(x_0) = Y_0$ for each $x \in B(x_0, \bar r)$.

Set $\bar r_0 := \min\big\{\bar r, \frac{2 - \sqrt{2}}{2\gamma}\big\}$ (so $r^* \le \bar r_0$ by (4.7) and thanks to the assumption $r^* \le \bar r$). By the given $\gamma$-condition on $B(x_0, \bar r)$, we can apply Lemma 3.2 (with $x_0$, $\bar r$ in place of $\bar x$, $r$) to conclude that

(4.15) $F(x) \in Y_0 = \mathrm{im}\,F'(x_0)$ for each $x \in B(x_0, \bar r_0)$

(and so for each $x \in \overline{B(x_0, \bar r_0)}$). Moreover, by Lemma 3.1, the distance from the origin to the set $F'(x_0)^\dagger(\overline{K} - F(x_0))$ is attained: there exists some $c_0 \in \overline{K}$ (so $F(x) - c_0 \in Y_0$ for each $x \in \overline{B(x_0, \bar r_0)}$) such that

(4.16) $\|F'(x_0)^\dagger(c_0 - F(x_0))\| = \|F'(x_0)^\dagger(\overline{K} - F(x_0))\| = \|F'(x_0)^\dagger(K - F(x_0))\| = \xi$,

where the last equality holds by the definition of $\xi$ given in (2.2). Let $\Omega := \overline{B(x_0, \bar r_0)}$ and define $h : \Omega \to Y_0$ by $h(x) := F(x) - c_0$ for each $x \in \Omega$. Then $h$ is continuous on $\Omega$, and

(4.17) $h'(x) = F'(x)$ and $h''(x) = F''(x)$ for each $x \in \mathrm{int}\,\Omega = B(x_0, \bar r_0)$.

In particular, $h'(x_0) = F'(x_0)$, and so $h'(x_0) : X \to Y_0$ is surjective. Note further that $h'(x_0)^\dagger$ is exactly the restriction to $Y_0$ of the operator $F'(x_0)^\dagger$. This together with (4.15) implies that

(4.18) $h'(x_0)^\dagger h(x_0) = F'(x_0)^\dagger(F(x_0) - c_0)$ and $h'(x_0)^\dagger h''(x) = F'(x_0)^\dagger F''(x)$ for each $x \in B(x_0, \bar r_0)$.
It follows from the assumed $\gamma$-condition for $F$ that $h$ satisfies the $\gamma$-condition on $B(x_0, \bar r_0)$. Further, by (4.16), (4.18), and (4.8), together with (4.2), we have (4.12) because

(4.19) $\|h'(x_0)^\dagger h(x_0)\| = \xi \le \frac{w^{-1}(\tau)}{\gamma} \le \frac{3 - 2\sqrt{2}}{\gamma}$.

Finally, since $r^* \le \bar r_0$ as noted earlier, one has $\overline{B(x_0, r^*)} \subseteq \Omega$. Thus, Proposition 4.1 is applicable, and we conclude that the sequence $\{x_n\}$ generated by the Newton method (4.11) for $h$ with initial point $x_0$ converges to a zero $x^*$ of $h$, and (4.14) holds. Hence $F(x^*) = c_0 \in \overline{K}$, and (4.14) in particular implies that

(4.20) $\|x_0 - x^*\| \le r^* = w(\gamma\xi)\,\xi \le \tau\xi$

(see (4.8) and (4.9)). Clearly, one has $\|F'(x_0)^\dagger(K - F(x_0))\| \le \|F'(x_0)^\dagger\|\, d(F(x_0), K)$. By (4.16), this means that $\xi \le \|F'(x_0)^\dagger\|\, d(F(x_0), K)$, and so (2.5) holds thanks to (4.20). In particular, if $K$ is closed, then $x^* \in S$, and the conclusion is proved for case (a).

It remains to consider case (b): we assume that

(4.21) $r^* < \bar r$, $\quad \xi < \frac{3 - 2\sqrt{2}}{\gamma}$, and ri $K \ne \emptyset$.

Then there exists a sequence $\{c_n\} \subseteq$ ri $K$ such that $c_n \to 0$. Without loss of generality, we may assume further that, for each $n = 1, 2, \ldots$,

(4.22) $\bar\xi_n := \xi + \|F'(x_0)^\dagger c_n\| < \frac{3 - 2\sqrt{2}}{\gamma}$,

(4.23) $\bar\tau_n := w(\gamma\bar\xi_n) \in \big(1, \frac{2 + \sqrt{2}}{2}\big)$,

and

(4.24) $\bar r_n^* := w(\gamma\bar\xi_n)\,\bar\xi_n < \bar r$

(noting (4.2) and that $\bar\xi_n \to \xi$ and $\bar r_n^* \to r^*$ by (4.9)). Let $F_n : X \to Y$ be the map defined by $F_n(\cdot) := F(\cdot) - c_n$, and consider the inequality system (1.1) with $F_n$ in place of $F$, that is, the inequality system

(4.25) $F_n(x) \ge_K 0$.

Then $F_n' = F'$, and so $F_n'(x_0)^\dagger = F'(x_0)^\dagger$ and $F_n(x_0) \in \mathrm{im}\,F_n'(x_0)$, thanks to (4.15).
Moreover, we have $F_n'(x_0)^\dagger(K - F_n(x_0)) = F'(x_0)^\dagger(K - F(x_0) + c_n)$, and so

(4.26) $\xi_n := \|F_n'(x_0)^\dagger(K - F_n(x_0))\| \le \xi + \|F'(x_0)^\dagger c_n\| = \bar\xi_n$,

(4.27) $\tau_n := w(\gamma\xi_n) \le w(\gamma\bar\xi_n) = \bar\tau_n \in \big(1, \frac{2 + \sqrt{2}}{2}\big)$,

and

$r_n^* := w(\gamma\xi_n)\,\xi_n \le w(\gamma\bar\xi_n)\,\bar\xi_n = \bar r_n^* < \bar r$,

thanks to (4.2). This and the given assumptions imply that $F_n$ satisfies the $\gamma$-condition on $B(x_0, r_n^*)$, and both (2.4) and (2.1) hold with $F_n$, $\tau_n$ in place of $F$, $\tau$. (By (4.27) and the definition of $w^{-1}$ in (4.3), $\gamma\xi_n = w^{-1}(\tau_n) = \frac{\tau_n - 1}{\tau_n(2\tau_n - 1)}$.) Moreover, by (4.9), the quantity $r_n^*$ defined above is exactly the corresponding quantity $r^*$ defined by (2.3) with $\xi_n$ in place of $\xi$. We then apply the (already established) first conclusion of Theorem 2.1 to conclude that there exists $x_n^* \in X$ satisfying $F_n(x_n^*) \in \overline{K}$ such that

$\|x_0 - x_n^*\| \le \tau_n\, \|F'(x_0)^\dagger\|\, d(F_n(x_0), K)$.

By the choice of $c_n$, we have $F(x_n^*) \in \overline{K} + c_n \subseteq$ ri $K \subseteq K$. Hence $x_n^* \in S$ (and so $S$ is nonempty), and it follows that

$d(x_0, S) \le \|x_0 - x_n^*\| \le \tau_n\, \|F'(x_0)^\dagger\|\, d(F_n(x_0), K)$.

Letting $n \to \infty$, we establish (2.6), since clearly $d(F_n(x_0), K) \to d(F(x_0), K)$ and $\tau_n = w(\gamma\xi_n) \to w(\gamma\xi) \le \tau$. Thus the proof is complete.

5. Proofs of Theorems 2.2 and 2.3. Before discussing the applications of Theorem 2.1 to systems (1.3) and (1.4), let us prove a simple lemma regarding the $\gamma$-condition for analytic functions. As in the previous sections, let $X$ and $Y$ be Hilbert spaces.

Lemma 5.1. Let $F : X \to Y$ be analytic on $B(x_0, r)$ for some $x_0 \in X$ and $r \in (0, +\infty)$. Let $\gamma \in [\gamma_F(x_0), +\infty)$, where $\gamma_F(x_0)$ is defined by (1.15), and let $r_0 := \min\{r, \frac{1}{\gamma}\}$. Then the following assertions hold:
(i) $F$ satisfies the $\gamma$-condition at $x_0$ on $B(x_0, r_0)$.
(ii) If $\mathrm{im}\,F'(x_0)$ is closed and $\mathrm{im}\,F^{(k)}(x_0) \subseteq \mathrm{im}\,F'(x_0)$ for each $k = 2, \ldots$
. , then (5.1) imF (x) ⊆ imF (x0 ) for each x ∈ B (x0 , r0 ) . Proof. The proof of assertion (i) is standard and is a direct application of the Taylor expansion; see, for example, [25, Theorem 1]. Below we verify assertion (ii). For this purpose, assume that imF (k) (x0 ) ⊆ imF (x0 ) for each k = 2, . . . , and let x ∈ B (x0 , r0 ). Since F is analytic on B(x0 , r), it follows from the Taylor expansion that F (x) = F (k+2) (x0 ) (x − x0 )k . k! k≥0 Hence the conclusion follows, as imF (x0 ) is closed. As applications of Theorem 2.1 to systems (1.3) and (1.4), respectively, we derive the following corollaries, where we assume that the involved functions f and g are defined by (1.5) and are analytic around the involved point x0 . Corollary 5.2. Consider the inequality system (1.4), and let F : X → Rl × Rm √ be defined by F := (f, g). Let τ ∈ (1, 2+2 2 ], γ ∈ [γF (x0 ), +∞), and x0 ∈ X be such that (5.2) (Rl × {0m }) ∪ {F (x0 )} ∪ imF (k) (x0 ) ⊆ imF (x0 ) for each k = 2, . . . and τ −1 . γ τ (2τ − 1) F (x0 )† √ Suppose that f and g are analytic on B x0 , 2−2γ 2 . Then the solution set S(f, g) of (1.4) is nonempty and (5.3) (5.4) 1 (f (x0 )− 2 + g(x0 )2 ) 2 < 1 d(x0 , S(f, g)) ≤ τ F (x0 )† (f (x0 )− 2 + g(x0 )2 ) 2 . Proof. Let Y := Rl × Rm , and define K := {(u, 0m ) ∈ Rl+ × Rm : ui > 0 ∀1 ≤ i ≤ lS }. Then, with F := (f ; g), systems (1.1) and (1.4) are identical, d(F (x0 ), K) = (f (x0 )− 2 1 + g(x0 )2 ) 2 , and F (x0 ) = (f (x0 ); g (x0 )) satisfies by the assumptions that (5.5) K ∪ {F (x0 )} ∪ imF (k) (x0 ) ⊆ imF (x0 ) for each k = 2, . . . . Set ξ := F (x0 )† (K − F (x0 )). Since by (5.3) (5.6) 1 F (x0 )† d(F (x0 ), K) = F (x0 )† (f (x0 )− 2 + g(x0 )2 ) 2 < τ −1 , γ τ (2τ − 1) Copyright © by SIAM. Unauthorized reproduction of this article is prohibited. APPROXIMATE SOLUTIONS FOR INEQUALITY SYSTEMS 1251 Downloaded 06/27/13 to 140.117.35.140. 
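The threshold on the right-hand side of (5.6) is governed by the map $\tau \mapsto \frac{\tau-1}{\tau(2\tau-1)}$, i.e., the function $w^{-1}$ of (4.3). As a quick numerical sanity check (not part of the original argument), one can confirm that this map is increasing on $(1, \frac{2+\sqrt 2}{2}]$ and attains its maximum $3 - 2\sqrt 2$ at the right endpoint, which is exactly the chain of inequalities used in (5.7):

```python
import math

def w_inv(tau):
    """The map w^{-1}(tau) = (tau - 1) / (tau * (2*tau - 1)) from (4.3)."""
    return (tau - 1.0) / (tau * (2.0 * tau - 1.0))

tau_max = (2.0 + math.sqrt(2.0)) / 2.0      # right endpoint of (1, (2+sqrt(2))/2]

# w^{-1} is strictly increasing on the interval: check on a fine grid.
grid = [1.0 + (tau_max - 1.0) * k / 1000.0 for k in range(1, 1001)]
vals = [w_inv(t) for t in grid]
assert all(a < b for a, b in zip(vals, vals[1:]))

# Its maximum is 3 - 2*sqrt(2), attained at tau = (2 + sqrt(2))/2.
assert abs(w_inv(tau_max) - (3.0 - 2.0 * math.sqrt(2.0))) < 1e-12
```

The same computation explains why the right-hand side of (5.6) never exceeds $(3-2\sqrt 2)/\gamma$, the Smale-type radius appearing in (5.7).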
it follows that

(5.7)  $\xi \le \|F'(x_0)^\dagger\|\, d(F(x_0), K) < \dfrac{\tau - 1}{\gamma\,\tau(2\tau - 1)} \le \dfrac{3 - 2\sqrt 2}{\gamma}$

(see (4.2) and (4.3)). Set $r_0 := \frac{2-\sqrt 2}{2\gamma}$ $\big(\le \frac{1}{\gamma}\big)$. Thus, by Lemma 5.1, $F$ satisfies the $\gamma$-condition at $x_0$ on $B(x_0, r_0)$, and (5.1) holds thanks to (5.5). Therefore, (2.1) is satisfied. Moreover, $r^* < r_0$, where $r^*$ is defined as in (2.3) (see also (4.6)). To see this, we may suppose that $\gamma \ne 0$. Then by (2.3) and (5.7), one has

$r^* \le \dfrac{1 + \gamma\xi}{4\gamma} < \dfrac{1 + 3 - 2\sqrt 2}{4\gamma} = r_0,$

as claimed. Let $\bar r \in (r^*, r_0)$. Then $B(x_0, \bar r) \subseteq B\big(x_0, \frac{2-\sqrt 2}{2\gamma}\big)$. Therefore, $F'$ is continuous on $B(x_0, \bar r)$, as $f$ and $g$ are analytic on $B\big(x_0, \frac{2-\sqrt 2}{2\gamma}\big)$ by assumption. Noting $\operatorname{ri} K \ne \emptyset$, the assumption stated in (b) of Theorem 2.1 is satisfied. Thus Theorem 2.1 is applicable here, and we conclude that the solution set $S(f, g)$ of (1.1) (namely that of (1.4)) is nonempty and $d(x_0, S(f, g)) \le \tau \|F'(x_0)^\dagger\|\, d(F(x_0), K)$; i.e.,

$d(x_0, S(f, g)) \le \tau \|F'(x_0)^\dagger\| \big(\|f(x_0)^-\|^2 + \|g(x_0)\|^2\big)^{1/2}.$

This completes the proof of the corollary.

Taking $g_j = 0$ for each $j$ in the preceding corollary, we get the following corollary. Recall that $\gamma_f(x)$ is defined by (1.15) (with $f$ in place of $F$).

Corollary 5.3. Consider the inequality system (1.3). Let $\tau \in \big(1, \frac{2+\sqrt 2}{2}\big]$, $\gamma \in [\gamma_f(x_0), +\infty)$, $x_0 \in X$, and suppose that $f$ is analytic on $B\big(x_0, \frac{2-\sqrt 2}{2\gamma}\big)$ such that $f'(x_0): X \to \mathbb{R}^l$ is surjective and

(5.8)  $\|f(x_0)^-\| < \dfrac{\tau - 1}{\tau(2\tau - 1)\,\gamma\,\|f'(x_0)^\dagger\|}.$

Then the solution set $f_>$ of (1.3) is nonempty, and the following estimate holds:

(5.9)  $d(x_0, f_>) \le \tau \|f'(x_0)^\dagger\|\, \|f(x_0)^-\|.$

Proof. Consider the inequality system (1.4) with $f$ given as in the corollary and $g = 0$. Then $f$ and $g$ are analytic on $B\big(x_0, \frac{2-\sqrt 2}{2\gamma}\big)$, and the solution set $S(f; g)$ of system (1.4) coincides with the solution set $f_>$ of (1.3). Let $F := (f; g)$. Then

(5.10)  $F'(x_0)^\dagger = (f'(x_0)^\dagger; 0)^T$ and $F^{(k)}(x_0) = (f^{(k)}(x_0); 0_m)$ for each $k$.
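The estimate $r^* = w(\gamma\xi)\,\xi \le \frac{1+\gamma\xi}{4\gamma}$ used above can also be checked numerically. The sketch below inverts $w^{-1}$ by bisection (the closed form of $w$ is not reproduced in this section, so the bisection is our own device, valid because $w^{-1}$ is increasing) and verifies $r^* < r_0$ on sample values of $\gamma\xi \in (0, 3-2\sqrt 2)$:

```python
import math

def w_inv(tau):
    # w^{-1}(tau) = (tau - 1)/(tau(2tau - 1)), increasing on (1, (2+sqrt2)/2]; see (4.3)
    return (tau - 1.0) / (tau * (2.0 * tau - 1.0))

def w(s):
    # Invert w_inv by bisection on (1, (2+sqrt2)/2]; valid for s in (0, 3 - 2*sqrt2].
    lo, hi = 1.0, (2.0 + math.sqrt(2.0)) / 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if w_inv(mid) < s:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

gamma = 2.0                                  # arbitrary positive gamma for the check
r0 = (2.0 - math.sqrt(2.0)) / (2.0 * gamma)  # r0 = (2 - sqrt2)/(2*gamma)
for s in [0.01, 0.05, 0.1, 0.15, 3.0 - 2.0 * math.sqrt(2.0) - 1e-6]:
    xi = s / gamma                           # s plays the role of gamma * xi
    r_star = w(s) * xi                       # r* = w(gamma*xi) * xi, cf. (2.3)/(4.6)
    assert r_star <= (1.0 + s) / (4.0 * gamma) + 1e-12
    assert r_star < r0
```

At $\gamma\xi = 3 - 2\sqrt 2$ both bounds collapse to $r_0$, which is why the strict inequality $r^* < r_0$ requires the strict inequality in (5.7).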
This implies that $\|F'(x_0)^\dagger\| = \|f'(x_0)^\dagger\|$, $\gamma_F(x_0) = \gamma_f(x_0)$, and

$\operatorname{im} F^{(k)}(x_0) = \operatorname{im} f^{(k)}(x_0) \times \{0_m\} \subseteq \mathbb{R}^l \times \{0_m\} = \operatorname{im} F'(x_0)$ for any $k \ge 2$,

thanks to the surjectivity assumption on $f'(x_0)$. Thus assumptions (5.2) and (5.3) in Corollary 5.2 are satisfied. Hence Corollary 5.2 is applicable, and we conclude that $f_> = S(f; g)$ is nonempty and

$d(x_0, f_>) = d(x_0, S(f, g)) \le \tau \|F'(x_0)^\dagger\| \big(\|f(x_0)^-\|^2 + \|g(x_0)\|^2\big)^{1/2} = \tau \|f'(x_0)^\dagger\|\, \|f(x_0)^-\|.$

The proof is complete.

Remark 5.1. Consider the inequality system (1.3) for the special case when $x_0$ satisfies

(5.11)  $f_i(x_0) < 0$ for each $1 \le i \le l$.

Then, for this $x_0$, Corollary 5.3 is a stronger result than Theorem 2.3. To see this, we suppose that (5.11) holds and choose $\gamma = \gamma_f(x_0)$. Note then that

(5.12)  $\operatorname{Diag}(f(x_0)^+) = 0,$

and hence assumption (2.13) simply means that $f'(x_0)$ is surjective. Moreover, by (5.12), (1.9), and (3.4),

(5.13)  $\delta(f; x_0) = \big\|\big(f'(x_0) f'(x_0)^*\big)^\dagger\big\|^{1/2} = \|f'(x_0)^\dagger\|.$

Finally, recalling (1.6) and the elementary fact that, for $t > 0$,

(5.14)  $\sup_{k \ge 2} t^{\frac{1}{k-1}} = \begin{cases} t & \text{if } t > 1, \\ 1 & \text{if } t \le 1, \end{cases}$

we have

$\gamma = \gamma_f(x_0) = \sup_{k \ge 2} \Big\| f'(x_0)^\dagger\, \frac{f^{(k)}(x_0)}{k!} \Big\|^{\frac{1}{k-1}} \le \sup_{k \ge 2} \|f'(x_0)^\dagger\|^{\frac{1}{k-1}}\, \Gamma(f, x_0) = \max\{1, \delta(f; x_0)\}\, \Gamma(f, x_0),$

and so the quantity given on the right-hand side of (2.14) is smaller than that of (5.8); thus (2.14) $\Longrightarrow$ (5.8). Similarly, Corollary 5.2 is a stronger result than Theorem 2.2 for system (1.4) in the special case when $x_0$ satisfies (5.11).

For the proofs of Theorems 2.2 and 2.3, it would be convenient to do some preparations first. Let $(x; v) \in X \times \mathbb{R}^l$. We use $(F'(x); \operatorname{Diag}(v))$ to denote the operator from $X \times \mathbb{R}^l$ to $\mathbb{R}^l \times \mathbb{R}^m$ defined by

$(F'(x); \operatorname{Diag}(v))(z; u) := F'(x)z + (\operatorname{Diag}(v)u; 0_m)$ for each $(z; u) \in X \times \mathbb{R}^l.$
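The elementary fact (5.14), stated for $t > 0$, is easy to illustrate numerically; note that for $t \le 1$ the supremum is approached as $k \to \infty$ rather than attained. A small check (not part of the original argument):

```python
# Numerical illustration of the elementary fact (5.14):
#   sup_{k>=2} t^(1/(k-1)) = t if t > 1, and = 1 if 0 < t <= 1.
ks = range(2, 2001)

t = 3.0                                    # t > 1: the supremum is attained at k = 2
assert max(t ** (1.0 / (k - 1)) for k in ks) == t

t = 0.5                                    # 0 < t <= 1: values increase toward 1
vals = [t ** (1.0 / (k - 1)) for k in ks]
assert all(v < 1.0 for v in vals)          # never reaches the supremum ...
assert 1.0 - vals[-1] < 1e-3               # ... but gets arbitrarily close to 1
```

This is exactly what makes $\sup_{k\ge 2}\|f'(x_0)^\dagger\|^{1/(k-1)} = \max\{1, \|f'(x_0)^\dagger\|\}$ in the displayed estimate for $\gamma_f(x_0)$.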
Using matrix notation, the operator $(F'(x); \operatorname{Diag}(v))$ can be represented as

(5.15)  $(F'(x); \operatorname{Diag}(v))(z; u) = \begin{pmatrix} f'(x) & \operatorname{Diag}(v) \\ g'(x) & 0_{m \times l} \end{pmatrix} \begin{pmatrix} z \\ u \end{pmatrix}$ for each $(z; u) \in X \times \mathbb{R}^l.$

Let $G(x; v)$ be the $(l+m) \times (l+m)$ matrix defined by

(5.16)  $G(x; v) := \begin{pmatrix} f'(x) f'(x)^* + \operatorname{Diag}(v^2) & f'(x) g'(x)^* \\ g'(x) f'(x)^* & g'(x) g'(x)^* \end{pmatrix},$

where, for any $v = (v_i) \in \mathbb{R}^l$, $v^2$ stands for the vector $(v_i^2)$ in $\mathbb{R}^l$. The following lemma will be useful for us.

Lemma 5.4. Let $(x; v) \in X \times \mathbb{R}^l$. Then we have the following assertions:

(5.17)  $(F'(x); \operatorname{Diag}(v)) \circ (F'(x); \operatorname{Diag}(v))^* = G(x; v)$

and

(5.18)  $\operatorname{im}(F'(x); \operatorname{Diag}(v)) = \operatorname{im} G(x; v).$

Proof. We claim that

(5.19)  $(F'(x); \operatorname{Diag}(v))^*(w_1; w_2) = \begin{pmatrix} f'(x)^* & g'(x)^* \\ \operatorname{Diag}(v) & 0_{l \times m} \end{pmatrix} \begin{pmatrix} w_1 \\ w_2 \end{pmatrix}$ for each $(w_1; w_2) \in \mathbb{R}^l \times \mathbb{R}^m.$

In fact, by definition, for any $(z; u) \in X \times \mathbb{R}^l$, one checks that

$\langle (z; u), (F'(x); \operatorname{Diag}(v))^*(w_1; w_2) \rangle = \langle (F'(x); \operatorname{Diag}(v))(z; u), (w_1; w_2) \rangle$
$= \langle f'(x)z, w_1 \rangle + \langle \operatorname{Diag}(v)u, w_1 \rangle + \langle g'(x)z, w_2 \rangle$
$= \langle z, f'(x)^* w_1 + g'(x)^* w_2 \rangle + \langle u, \operatorname{Diag}(v) w_1 \rangle$
$= \Big\langle (z; u), \begin{pmatrix} f'(x)^* & g'(x)^* \\ \operatorname{Diag}(v) & 0_{l \times m} \end{pmatrix} \begin{pmatrix} w_1 \\ w_2 \end{pmatrix} \Big\rangle,$

and so (5.19) is established. This together with (5.15) implies that

$(F'(x); \operatorname{Diag}(v)) \circ (F'(x); \operatorname{Diag}(v))^* = \begin{pmatrix} f'(x) & \operatorname{Diag}(v) \\ g'(x) & 0_{m \times l} \end{pmatrix} \begin{pmatrix} f'(x)^* & g'(x)^* \\ \operatorname{Diag}(v) & 0_{l \times m} \end{pmatrix},$

and (5.17) is seen to hold. To show (5.18), we set $Y_0 := \operatorname{im}(F'(x); \operatorname{Diag}(v))$. By (5.17), it suffices to check that $\operatorname{im}\big((F'(x); \operatorname{Diag}(v)) \circ (F'(x); \operatorname{Diag}(v))^*\big) \supseteq Y_0$, as the converse inclusion holds trivially. Since $Y_0$ is a finite dimensional subspace of $\mathbb{R}^l \times \mathbb{R}^m$, we can choose a finite dimensional subspace $X_0$ such that $Y_0 = (F'(x); \operatorname{Diag}(v))(X_0)$ and the restriction $A := (F'(x); \operatorname{Diag}(v))|_{X_0}$ to $X_0$ is a bijection. Hence $A^*$ coincides with the restriction of $(F'(x); \operatorname{Diag}(v))^*$ to $Y_0$, and $A^*$ is also a bijection. Consequently, the composite $A \circ A^*$ is a bijection on $Y_0$.
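In finite dimensions, the identities (5.17) and (5.18) of Lemma 5.4 can be sanity-checked with random matrices standing in for $f'(x)$ and $g'(x)$ (the dimensions below are arbitrary choices, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)
l, m, n = 3, 2, 5                         # l inequalities, m equalities, dim X = n

A = rng.standard_normal((l, n))           # stands in for f'(x)
B = rng.standard_normal((m, n))           # stands in for g'(x)
v = rng.standard_normal(l)
D = np.diag(v)

# The operator (F'(x); Diag(v)) of (5.15) as an (l+m) x (n+l) block matrix.
M = np.block([[A, D], [B, np.zeros((m, l))]])

# G(x; v) as defined in (5.16).
G = np.block([[A @ A.T + np.diag(v ** 2), A @ B.T],
              [B @ A.T,                   B @ B.T]])

# (5.17): M M^* = G(x; v).
assert np.allclose(M @ M.T, G)

# (5.18): im M = im G; since G = M M^T we have im G ⊆ im M, so equality of
# ranks suffices for the check.
assert np.linalg.matrix_rank(M) == np.linalg.matrix_rank(G)
```

The rank test mirrors the bijection argument in the proof: the Gram matrix $MM^*$ has exactly the same range as $M$.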
This implies that

$Y_0 = (A \circ A^*)(Y_0) \subseteq \big((F'(x); \operatorname{Diag}(v)) \circ (F'(x); \operatorname{Diag}(v))^*\big)(\mathbb{R}^l \times \mathbb{R}^m) = \operatorname{im}\big((F'(x); \operatorname{Diag}(v)) \circ (F'(x); \operatorname{Diag}(v))^*\big),$

and the proof is complete.

Proof of Theorem 2.2. Let $H: X \times \mathbb{R}^l \to \mathbb{R}^l \times \mathbb{R}^m$ be the operator defined by $H := (h; g)$ with $h = (h_i)$ and $g = (g_j)$, where each $h_i$ and each $g_j$ is given by

(5.20)  $h_i(x; v) := f_i(x) - v_i^2$, $g_j(x; v) := g_j(x)$, $1 \le i \le l$, $1 \le j \le m$,

for each $(x; v) \in X \times \mathbb{R}^l$ with $v = (v_1, \dots, v_l) \in \mathbb{R}^l$. Consider the following inequality system on $X \times \mathbb{R}^l$:

(5.21)  $h_i(x; v) > 0$, $i = 1, \dots, l_S$;  $h_i(x; v) \ge 0$, $i = l_S + 1, \dots, l$;  $g_j(x; v) = 0$, $j = 1, \dots, m$.

As before, we use $S(h; g)$ to denote the solution set of the above system. Then it is easy to check that $x \in S(f; g)$ if $(x; v) \in S(h; g)$ for some $v \in \mathbb{R}^l$; that is,

(5.22)  $S_0 := \{x \in X : \exists\, v \in \mathbb{R}^l \text{ s.t. } (x; v) \in S(h; g)\} \subseteq S(f; g).$

Let $v_0 := \big(\sqrt{f_i(x_0)^+}\big) \in \mathbb{R}^l$, and notice that

(5.23)  $\|h(x_0; v_0)^-\|^2 + \|g(x_0; v_0)\|^2 = \|f(x_0)^-\|^2 + \|g(x_0)\|^2,$

(5.24)  $H(x_0; v_0) = F(x_0) - (f(x_0)^+; 0_m) \in F(x_0) + \mathbb{R}^l \times \{0_m\},$

and that

(5.25)  $G(x_0; \pm 2v_0) = G(x_0)$

by (5.16) and (1.8). Recalling (1.6) and (1.9), we define

(5.26)  $\gamma := \max\{1, \delta(f, g; x_0)\}\,(1 + \Gamma(f, g; x_0)).$

We will apply Corollary 5.2 to the inequality system (5.21) in place of (1.4) (so to $H$ and $(x_0; v_0)$ in place of $F$ and $x_0$). To do this, note first that, since $\gamma \ge 1 + \Gamma(f, g; x_0)$, the analyticity assumption implies that $H$ is analytic on $B\big((x_0; v_0), \frac{2-\sqrt 2}{2\gamma}\big)$.
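The slack-variable substitution (5.20) with $v_0 = (\sqrt{f_i(x_0)^+})$ and the residual identity (5.23) can be illustrated on toy data (the sample vectors below are arbitrary):

```python
import numpy as np

f0 = np.array([1.5, -0.3, 0.0, -2.0])      # sample values of f(x0), toy data
g0 = np.array([0.4, -0.1])                 # sample values of g(x0)

v0 = np.sqrt(np.maximum(f0, 0.0))          # v0 = (sqrt(f_i(x0)+)), as in the proof
h0 = f0 - v0 ** 2                          # h(x0; v0), cf. (5.20)

neg = lambda y: np.minimum(y, 0.0)         # negative part y^- (kept as a vector)

# h_i(x0; v0) = f_i(x0) - f_i(x0)+ = -f_i(x0)^-, so the residuals coincide: (5.23)
assert np.allclose(h0, neg(f0))
lhs = np.linalg.norm(neg(h0)) ** 2 + np.linalg.norm(g0) ** 2
rhs = np.linalg.norm(neg(f0)) ** 2 + np.linalg.norm(g0) ** 2
assert np.isclose(lhs, rhs)
```

In other words, passing from $(f; g)$ to $(h; g)$ removes the positive part of $f(x_0)$ without changing the violation measure, which is what makes Corollary 5.2 applicable to (5.21) with the same right-hand side.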
Next note that

(5.27)  $H'(x_0; v_0) = (F'(x_0); \operatorname{Diag}(-2v_0))$,  $H^{(k)}(x_0; v_0) = F^{(k)}(x_0)$ for each $k > 2$,

and

(5.28)  $H''(x_0; v_0) = (F''(x_0); -2D_l),$

where $(F''(x_0); -2D_l)$ is the bilinear map from $(X \times \mathbb{R}^l) \times (X \times \mathbb{R}^l)$ to $\mathbb{R}^l \times \mathbb{R}^m$ defined by

$(F''(x_0); -2D_l)(z_1; u_1)(z_2; u_2) := F''(x_0) z_1 z_2 - 2 D_l(u_1; u_2)$

for any $(z_1; u_1), (z_2; u_2) \in X \times \mathbb{R}^l$, and $D_l: \mathbb{R}^l \times \mathbb{R}^l \to \mathbb{R}^l \times \mathbb{R}^m$ is the bilinear operator defined by

(5.29)  $D_l(u; v) := \begin{pmatrix} \operatorname{Diag}(u)\, v \\ 0_m \end{pmatrix}$ for any $(u; v) \in \mathbb{R}^l \times \mathbb{R}^l.$

This implies in particular that

(5.30)  $\operatorname{im} H''(x_0; v_0) = \operatorname{im} F''(x_0) + \operatorname{im}(-2D_l) \subseteq \operatorname{im} F''(x_0) + \mathbb{R}^l \times \{0_m\},$

and note also that

(5.31)  $\operatorname{im} H^{(k)}(x_0; v_0) = \operatorname{im} F^{(k)}(x_0)$ for each $k > 2$.

Moreover, by (5.25) and (5.27), we have from Lemma 5.4 that

(5.32)  $G(x_0) = G(x_0; -2v_0) = H'(x_0; v_0) \circ H'(x_0; v_0)^*$

and

(5.33)  $\operatorname{im} G(x_0) = \operatorname{im} H'(x_0; v_0).$

By the given assumption (2.10), it follows that the vector space $\operatorname{im} H'(x_0; v_0)$ contains $(\mathbb{R}^l \times \{0_m\}) \cup \{F(x_0)\} \cup \bigcup_{k \ge 2} \operatorname{im} F^{(k)}(x_0)$ (and so contains $(\mathbb{R}^l \times \{0_m\}) \cup \{H(x_0; v_0)\} \cup \bigcup_{k \ge 2} \operatorname{im} H^{(k)}(x_0; v_0)$ by (5.24), (5.31), and (5.30)). Another consequence of (5.32) is $\|G(x_0)^\dagger\| = \|H'(x_0; v_0)^\dagger\|^2$ (see (3.4)). Thus, by (1.9),

(5.34)  $\delta(f, g; x_0) = \|G(x_0)^\dagger\|^{1/2} = \|H'(x_0; v_0)^\dagger\|.$

Combining this with (5.23) and (5.26), we see that assumption (2.11) means that

(5.35)  $\big(\|h(x_0; v_0)^-\|^2 + \|g(x_0; v_0)\|^2\big)^{1/2} < \dfrac{\tau - 1}{\gamma\,\tau(2\tau - 1)\,\|H'(x_0; v_0)^\dagger\|}.$

Further, since $\|H''(x_0; v_0)\| \le 2 + \|F''(x_0)\|$ and $H^{(k)}(x_0; v_0) = F^{(k)}(x_0)$ for each $k > 2$, and making use of (5.14), one has the following estimate for $\gamma_H(x_0; v_0)$ (see (1.15)):

(5.36)  $\gamma_H(x_0; v_0) \le \sup_{k \ge 2} \|H'(x_0; v_0)^\dagger\|^{\frac{1}{k-1}} \cdot \sup_{k \ge 2} \Big\| \frac{H^{(k)}(x_0; v_0)}{k!} \Big\|^{\frac{1}{k-1}} \le \max\{1, \delta(f, g; x_0)\}\,(1 + \Gamma(f, g; x_0));$

thus one has $\gamma \in [\gamma_H(x_0; v_0), +\infty)$ by (5.26).
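The block structure (5.27) of $H'(x_0; v_0)$ — the $v$-part of the Jacobian is $\operatorname{Diag}(-2v_0)$ in the $h$-rows and $0$ in the $g$-rows — can be confirmed by finite differences on a toy example (all concrete functions below are illustrative choices, not from the paper):

```python
import numpy as np

# Toy smooth f, g on R^2: for H(x; v) = (f(x) - v^2; g(x)), the Jacobian with
# respect to v has the block form (Diag(-2v); 0), as claimed in (5.27).
f = lambda x: np.array([x[0] ** 2 + x[1], np.sin(x[0])])
g = lambda x: np.array([x[0] * x[1]])

def H(x, v):
    return np.concatenate([f(x) - v ** 2, g(x)])

x0 = np.array([0.3, -0.7])
v0 = np.array([0.5, 1.2])

# Central-difference Jacobian of H(x0, .) at v0 (exact here, since the
# v-dependence is quadratic).
eps = 1e-6
l = v0.size
J_v = np.column_stack([
    (H(x0, v0 + eps * e) - H(x0, v0 - eps * e)) / (2 * eps)
    for e in np.eye(l)
])

expected = np.vstack([np.diag(-2.0 * v0), np.zeros((g(x0).size, l))])
assert np.allclose(J_v, expected, atol=1e-6)
```

This is the same computation that gives $H''(x_0; v_0) = (F''(x_0); -2D_l)$ in (5.28): differentiating $\operatorname{Diag}(-2v)$ once more in $v$ yields the constant bilinear part $-2D_l$.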
Therefore, one can apply Corollary 5.2 to the inequality system (5.21) (with $(x_0; v_0)$ in place of $x_0$) to get that $S(h; g) \ne \emptyset$ and

$d\big((x_0; v_0), S(h; g)\big) \le \tau \|H'(x_0; v_0)^\dagger\| \big(\|h(x_0; v_0)^-\|^2 + \|g(x_0; v_0)\|^2\big)^{1/2}.$

Using (5.23) and (5.34), we obtain that

$d\big((x_0; v_0), S(h; g)\big) \le \tau\, \delta(f, g; x_0)\, \big(\|f(x_0)^-\|^2 + \|g(x_0)\|^2\big)^{1/2}.$

Let $\varepsilon > 0$, and let $(x^*; v^*) \in S(h; g)$ be such that

(5.37)  $\|(x_0; v_0) - (x^*; v^*)\| \le \tau\, \delta(f, g; x_0)\, \big(\|f(x_0)^-\|^2 + \|g(x_0)\|^2\big)^{1/2} + \varepsilon.$

Then $x^* \in S(f, g)$ by (5.22) and

$d(x_0, S(f, g)) \le \|x_0 - x^*\| \le \|(x_0; v_0) - (x^*; v^*)\|.$

This together with (5.37) implies (2.12) because $\varepsilon > 0$ is arbitrary. The proof is complete.

Proof of Theorem 2.3. This is similar to the proof of Corollary 5.3, by taking $g = 0$ in Theorem 2.2.

Acknowledgment. The authors are indebted to two anonymous referees for their helpful remarks and comments.

REFERENCES

[1] L. Blum, F. Cucker, M. Shub, and S. Smale, Complexity and Real Computation, Springer-Verlag, New York, 1997.
[2] Y. R. He and J. Sun, Error bounds for degenerate cone inclusion problems, Math. Oper. Res., 30 (2005), pp. 701–717.
[3] J.-P. Dedieu, Approximate solutions of analytic inequality systems, SIAM J. Optim., 11 (2000), pp. 411–425.
[4] C. W. Groetsch, Generalized Inverses of Linear Operators: Representation and Approximation, Marcel Dekker, New York, 1977.
[5] J. He, J. Wang, and C. Li, Newton's method for underdetermined systems of equations under the γ-condition, Numer. Funct. Anal. Optim., 28 (2007), pp. 663–679.
[6] O. Güler, A. J. Hoffman, and U. G. Rothblum, Approximations to solutions to systems of linear inequalities, SIAM J. Matrix Anal. Appl., 16 (1995), pp. 688–696.
[7] A. Hoffman, On approximate solutions of systems of linear inequalities, J. Res. Nat. Bur. Standards, 49 (1952), pp. 263–265.
[8] D. Klatte and W. Li, Asymptotic constraint qualifications and global error bounds for convex inequalities, Math. Program. Ser. A, 84 (1999), pp. 137–160.
[9] G. Li, On the asymptotically well behaved functions and global error bound for convex polynomials, SIAM J. Optim., 20 (2010), pp. 1923–1943.
[10] G. Li, Global error bounds for piecewise convex polynomials, Math. Program., 137 (2013), pp. 37–64.
[11] W. Li, Abadie's constraint qualification, metric regularity, and error bounds for differentiable convex inequalities, SIAM J. Optim., 7 (1997), pp. 966–978.
[12] W. Li, Error bounds for piecewise convex quadratic programs and applications, SIAM J. Control Optim., 33 (1995), pp. 1510–1529.
[13] X.-D. Luo and Z.-Q. Luo, Extension of Hoffman's error bound to polynomial systems, SIAM J. Optim., 4 (1994), pp. 383–392.
[14] X. Luo and J. Pang, Error bounds for analytic systems and their applications, Math. Program., 67 (1995), pp. 1–28.
[15] K. F. Ng and W. H. Yang, Regularities and their relations to error bounds, Math. Program. Ser. A, 99 (2004), pp. 521–538.
[16] K. F. Ng and W. H. Yang, Error bounds for abstract linear inequality systems, SIAM J. Optim., 13 (2002), pp. 24–43.
[17] K. F. Ng and X. Y. Zheng, Global error bounds with fractional exponents, Math. Program. Ser. B, 88 (2000), pp. 357–370.
[18] K. F. Ng and X. Y. Zheng, Error bounds for lower semicontinuous functions in normed spaces, SIAM J. Optim., 12 (2001), pp. 1–17.
[19] J. S. Pang, Error bounds in mathematical programming, Math. Program. Ser. B, 79 (1997), pp. 299–332.
[20] S. M. Robinson, An application of error bounds for convex programming in a linear space, SIAM J. Control, 13 (1975), pp. 271–273.
[21] S. M. Robinson, Stability theory for systems of inequalities. Part I: Linear systems, SIAM J. Numer. Anal., 12 (1975), pp. 754–769.
[22] S. M. Robinson, Stability theory for systems of inequalities. Part II: Differentiable nonlinear systems, SIAM J. Numer. Anal., 13 (1976), pp. 497–513.
[23] M. Shub and S. Smale, Complexity of Bezout's theorem IV: Probability of success; extensions, SIAM J. Numer. Anal., 33 (1996), pp. 128–148.
[24] S. Smale, Newton's method estimates from data at one point, in The Merging of Disciplines: New Directions in Pure, Applied and Computational Mathematics, R. Ewing, K. Gross, and C. Martin, eds., Springer, New York, 1986, pp. 185–196.
[25] X. H. Wang, Convergence on the iteration of Halley family in weak conditions, Chinese Sci. Bull., 42 (1997), pp. 552–555.
[26] X. H. Wang and D. F. Han, Criterion α and Newton's method, Chinese J. Numer. Appl. Math., 19 (1997), pp. 96–105.
[27] Z. L. Wu and J. J. Ye, On error bounds for lower semicontinuous functions, Math. Program. Ser. A, 92 (2002), pp. 301–314.
[28] W. H. Yang, Error bounds for convex polynomials, SIAM J. Optim., 19 (2009), pp. 1633–1647.