SIAM J. OPTIM., Vol. 13, No. 1, pp. 228–239
© 2002 Society for Industrial and Applied Mathematics
NONLINEARLY CONSTRAINED BEST APPROXIMATION IN
HILBERT SPACES: THE STRONG CHIP AND THE BASIC
CONSTRAINT QUALIFICATION∗
CHONG LI† AND XIAO-QING JIN‡
Abstract. We study best approximation problems with nonlinear constraints in Hilbert spaces.
The strong “conical hull intersection property” (CHIP) and the “basic constraint qualification”
(BCQ) condition are discussed. Best approximations with differentiable constraints and convex
constraints are characterized. The analysis generalizes some linearly constrained results of recent
works [F. Deutsch, W. Li, and J. Ward, J. Approx. Theory, 90 (1997), pp. 385–444; F. Deutsch,
W. Li, and J. D. Ward, SIAM J. Optim., 10 (1999), pp. 252–268].
Key words. best approximation, strong CHIP, BCQ condition, differentiable constraint, convex
constraint
AMS subject classifications. 41A65, 41A29
PII. S1052623401385600
1. Introduction. In recent years, much attention has been devoted to constrained best approximation problems in Hilbert spaces; see, e.g., [5, 6, 9, 10, 11, 16, 17]. These problems find applications (cf. [2]) in statistics, mathematical modeling,
curve fitting, and surface fitting. The setting is as follows. Let X be a Hilbert space,
C a nonempty closed convex subset of X, and A a bounded linear operator from X
to a finite-dimensional Hilbert space Y . Given “data” b ∈ Y , the problem consists of
finding the best approximation PK (x) to any x ∈ X from the set
K := C ∩ A⁻¹(b) = C ∩ {x ∈ X : Ax = b}.
Generally, it is easier to compute the best approximation from C than from K.
Therefore, the interest of several papers [5, 6, 9, 11, 16, 17] was centered on the
following problem: for any x ∈ X, does there exist a y ∈ Y such that PK (x) =
PC(x + A∗y)? It was proved in [9] that a necessary and sufficient condition for an affirmative answer to this question is that the pair {C, A⁻¹(b)} satisfy the strong “conical hull intersection property” (CHIP).
Very recently, Deutsch, Li, and Ward in [10] considered a more general problem
of finding the best approximation PK (x) to any x ∈ X from the set
(1.1)
K = C ∩ {x ∈ X : Ax ≤ b}
and established a result similar to that of [9]. More precisely, they proved the following
theorem (see Theorem 3.2 and Lemma 3.1 in [10]).
Theorem DLW. Let A be defined on X by

Ax := (⟨x, h1⟩, ⟨x, h2⟩, . . . , ⟨x, hm⟩)
∗ Received by the editors February 26, 2001; accepted for publication (in revised form) February
11, 2002; published electronically July 16, 2002.
http://www.siam.org/journals/siopt/13-1/38560.html
†Department of Mathematics, Zhejiang University, Hangzhou 310027, P. R. China (cli@seu.edu.cn). The research of this author is supported by the National (grant 19971013) and Jiangsu Provincial (grant BK99001) Natural Science Foundations of China.
‡ Faculty of Science and Technology, University of Macau, Macau, P. R. China (xqjin@umac.mo).
The research of this author is supported by research grants RG010/99-00S/JXQ/FST and RG026/
00-01S/JXQ/FST from the University of Macau.
for some hi ∈ X \ {0}, i = 1, 2, . . . , m. Let b ∈ Rᵐ and x∗ ∈ K = C ∩ {x ∈ X : Ax ≤ b}. Then the following two statements are equivalent:
(i) For any x ∈ X, x∗ = PK(x) ⇐⇒ x∗ = PC(x − Σ_{i=1}^m λi hi) for some λi ≥ 0 with λi(⟨x∗, hi⟩ − bi) = 0 for all i.
(ii) {C, H1, . . . , Hm} has the strong CHIP at x∗, where Hi := {x ∈ X : ⟨x, hi⟩ ≤ bi} for all i.
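To see the content of Theorem DLW in the simplest possible setting, the following numerical sketch (a toy instance of our own in X = R², not taken from the paper) checks statement (i) by brute force: C is the closed unit ball, one linear constraint ⟨x, h1⟩ ≤ b1 cuts it down to a half-disk K, and the best approximation PK(x) is recovered as PC(x − λ1 h1) with an active multiplier.

```python
import numpy as np

# Toy instance of Theorem DLW in X = R^2 (illustrative data of our own
# choosing): C is the closed unit ball, and the single linear constraint
# <x, h1> <= b1 with h1 = (1, 0), b1 = 0 cuts it to the left half-disk
# K = C ∩ H1.
rng = np.random.default_rng(0)
h1, b1 = np.array([1.0, 0.0]), 0.0
x = np.array([1.0, 0.5])

def proj_C(y):
    """Exact projection onto the closed unit ball."""
    n = np.linalg.norm(y)
    return y if n <= 1.0 else y / n

# Approximate P_K(x) by brute force over a dense sample of K.
pts = rng.uniform(-1.0, 1.0, size=(500_000, 2))
K = pts[(np.linalg.norm(pts, axis=1) <= 1.0) & (pts @ h1 <= b1)]
x_star = K[np.argmin(np.linalg.norm(K - x, axis=1))]

# Statement (i): x* = P_K(x) iff x* = P_C(x - lam1*h1) for some lam1 >= 0
# with lam1*(<x*, h1> - b1) = 0.  Here lam1 = 1 works, and the constraint
# is active at x* = (0, 0.5).
lam1 = 1.0
assert np.allclose(proj_C(x - lam1 * h1), [0.0, 0.5])
assert np.linalg.norm(x_star - [0.0, 0.5]) < 0.1   # sampled P_K(x)
```

The multiplier λ1 = 1 was found by hand from the complementarity condition; the point of the theorem is that a closed-form projection onto C plus a search over multipliers replaces computing PK directly.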
Theorem DLW gives an unconstrained reformulation for the linearly constrained
system, for which a complete theory has been established. The importance of such
a theory was described in detail in [10, 11], etc. One natural problem is: can one
extend such a theory to a nonlinearly constrained system? Admittedly, this problem
for a general nonlinearly constrained system is quite difficult. In this paper, we shall
relax the linearity assumption made on the operator A in the constraint (1.1) in two
ways. First, we study the case in which A is assumed to be Fréchet differentiable, and
second, we examine the case in which A is convex (i.e., each component is convex).
In the Fréchet differentiable case, we will give a theorem (Theorem 4.1) that is similar to Theorem DLW, where hi in Theorem DLW is replaced by the Fréchet derivative A′i(x∗) of Ai at x∗, for i = 1, 2, . . . , m. Note that, when A is nonlinear, the approximating set K is, in general, nonconvex (see Example 4.1). Thus Theorem DLW does not work in this case, since K cannot be re-expressed as the intersection of C and a polyhedron. In addition, the nonconvexity of the set K makes the original problem very complicated; in fact, there is no known way to characterize best approximations from general nonconvex sets. The merit of the present results lies in converting a nonconvex constrained problem into a convex unconstrained one.
In the convex case, the sets Hi , i = 1, 2, . . . , m, may not be well defined, although
K is convex and, in general, Theorem DLW does not work either (see Example 5.1).
To establish a similar unconstrained reformulation result, we introduce the concept of
the “basic constraint qualification” (BCQ) relative to C, which is a generalization of
the BCQ considered in [12, 13]. We prove that the BCQ relative to C is a necessary and sufficient condition for the following “perturbation property”: for any x ∈ X, PK(x) = x∗ if and only if PC(x − Σ_{i=1}^m λi hi) = x∗ for some hi ∈ ∂Ai(x∗) and λi ≥ 0 with λi(Ai(x∗) − bi) = 0. Clearly, in either case, the present results generalize the main results in [10].
The paper is organized as follows. We describe some notation and a useful proposition in section 2. The linearization of differentiable constraints is carried out in section 3. Unconstrained reformulation results for
differentiable constraints and convex constraints are established in sections 4 and 5,
respectively. Finally, a concluding remark is given in section 6.
2. Preliminaries. Let X be a Hilbert space. For a nonempty subset S of X,
the convex hull (resp., conical hull) of S, denoted by conv S (resp., cone S), is the intersection of all convex sets (resp., convex cones) containing S, while the dual cone S° of S is defined by

S° = {x ∈ X : ⟨x, y⟩ ≤ 0 ∀y ∈ S}.

Then the normal cone of S at x is defined by NS(x) = (S − x)°. The closure (resp., interior, relative interior) of any set S is denoted by cl S (resp., int S, ri S).
For a function f from X to R, the subdifferential of f at x ∈ X, denoted by ∂f(x), is defined by

∂f(x) = {z ∈ X : f(x) + ⟨z, y − x⟩ ≤ f(y) ∀y ∈ X}.
It is well known that ∂f(x) ≠ ∅ for all x ∈ X if f is a continuous convex function.
Let G be a nonempty closed convex set in X. Then for any x ∈ X, there exists a unique best approximation PG(x) from G to x. We define

τ(x, y) = lim_{t→0+} (‖x + ty‖ − ‖x‖)/t.

Since, for x ≠ 0, x/‖x‖ is the unique supporting functional of x, we have

τ(x, y) = ⟨x, y⟩/‖x‖.
The following well-known characterization of the best approximation is useful; see
[9, 10].
Proposition 2.1. Let G be a convex subset of X, x ∈ X, and g0 ∈ G. Then

PG(x) = g0 ⇐⇒ ⟨x − g0, g0 − g⟩ ≥ 0 for any g ∈ G ⇐⇒ x − g0 ∈ (G − g0)°.
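Proposition 2.1 is easy to test numerically. The sketch below (with G the closed unit ball in R³ and a data point x of our own choosing) samples G densely and verifies the variational inequality ⟨x − g0, g0 − g⟩ ≥ 0.

```python
import numpy as np

# Numerical check of Proposition 2.1 for G the closed unit ball in R^3
# (an illustrative choice of G and x; any closed convex set with a
# computable projection would do).
rng = np.random.default_rng(1)

def proj_G(y):
    """Exact projection onto the closed unit ball G."""
    n = np.linalg.norm(y)
    return y if n <= 1.0 else y / n

x = np.array([2.0, -1.0, 2.0])           # ||x|| = 3, so P_G(x) = x / 3
g0 = proj_G(x)

# Sample many g in G and verify <x - g0, g0 - g> >= 0, i.e.
# <g - g0, x - g0> <= 0 for every sampled g.
g = rng.normal(size=(10_000, 3))
g /= np.maximum(np.linalg.norm(g, axis=1, keepdims=True), 1.0)  # force g into G
inner = (g - g0) @ (x - g0)
assert np.allclose(g0, x / 3.0)
assert np.all(inner <= 1e-9)
```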
3. Linearization of the constraints. In the remainder of the paper, we always assume that C ≠ ∅ is a closed convex subset of X. Suppose that
A(·) = (A1 (·), . . . , Am (·))
is Fréchet differentiable from X to Rᵐ and b = (b1, . . . , bm) ∈ Rᵐ. Let me ∈ {0, 1, . . . , m} be a fixed integer. Define
K0 = {x ∈ X : Ai (x) = bi , i ∈ E} ∩ {x ∈ X : Ai (x) ≤ bi , i ∈ I}
and
K = C ∩ K0 ,
where
E = {1, 2, . . . , me },
I = {me + 1, . . . , m}.
Furthermore, let
I(x∗) = {i ∈ I : Ai(x∗) = bi}  ∀x∗ ∈ K.
The following concepts can be found in standard texts on constrained optimization; see, e.g., [14, 20].
Definition 3.1. Let x∗ ∈ K. A vector d ≠ 0 is called a feasible direction of K at x∗ if there exists δ > 0 such that

x∗ + td ∈ K  ∀t ∈ [0, δ].
The set of all feasible directions of K at x∗ is denoted by FD(x∗ ).
Definition 3.2. Let x∗ ∈ K. A vector d is called a linearized feasible direction of K at x∗ if

⟨d, A′i(x∗)⟩ = 0  ∀i ∈ E

and

⟨d, A′i(x∗)⟩ ≤ 0  ∀i ∈ I(x∗),
where A′i(x∗) is the Fréchet derivative of Ai at x∗. The set of all linearized feasible directions of K at x∗ is denoted by LFD(x∗).
Definition 3.3. Let x∗ ∈ K. A vector d is called a sequentially feasible direction of K at x∗ if there exist a sequence {dk} ⊂ X and a sequence {δk} of positive real numbers such that

dk → d,  δk → 0,  x∗ + δk dk ∈ K,  k = 1, 2, . . . .

The set of all sequentially feasible directions of K at x∗ is denoted by SFD(x∗).
Obviously, we have the following inclusions among the various feasible directions.
Proposition 3.1. Let x∗ ∈ K. Then
FD(x∗ ) ⊆ SFD(x∗ ) ⊆ LFD(x∗ ).
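The inclusions in Proposition 3.1 can be strict. The following small computation (a constructed instance of our own, not one from the paper) takes C = X = R², me = 1, and the single equality constraint A1(x) = x2 − x1² = 0, so that K is a parabola: at x∗ = (0, 0) the direction d = (1, 0) is sequentially feasible and linearized feasible, while FD(x∗) contains no nonzero vector.

```python
import numpy as np

# Constructed example: K = {x in R^2 : A1(x) = x2 - x1^2 = 0}, x* = (0, 0).
x_star = np.array([0.0, 0.0])
A1 = lambda p: p[1] - p[0] ** 2
d = np.array([1.0, 0.0])

# d is NOT a feasible direction: x* + t*d = (t, 0) leaves the parabola
# for every t > 0, so no whole segment [0, delta] stays in K.
assert all(A1(x_star + t * d) != 0.0 for t in (1e-3, 1e-2, 1e-1))

# d IS a sequentially feasible direction: dk = (1, delta_k) -> d and
# x* + delta_k * dk = (delta_k, delta_k^2) lies in K for every k.
for delta in (0.1, 0.01, 0.001):
    dk = np.array([1.0, delta])
    assert abs(A1(x_star + delta * dk)) < 1e-15

# d also lies in LFD(x*): <d, A1'(x*)> = 0 with A1'(x*) = (-2*x1, 1) = (0, 1).
grad = np.array([-2.0 * x_star[0], 1.0])
assert d @ grad == 0.0
```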
For convenience, let

KS(x∗) = cl conv(x∗ + SFD(x∗)) ∩ C

and

KL(x∗) = (x∗ + LFD(x∗)) ∩ C,

where cl conv denotes the closed convex hull. Then KS(x∗) and KL(x∗) are closed convex sets.
The following two theorems describe the equivalence of the best approximation
from K and from KS (x∗ ), which plays an important role in our study.
Theorem 3.1. Let x∗ ∈ K. Then, for any x ∈ X, if x∗ ∈ PK(x), we have PKS(x∗)(x) = x∗.
Proof. For any x̄ ∈ x∗ + SFD(x∗), set d = x̄ − x∗ ∈ SFD(x∗); then there exist dk ∈ X with dk → d and δk > 0 with δk → 0 such that x∗ + δk dk ∈ K. It follows from x∗ ∈ PK(x) that

‖x − x∗‖ ≤ ‖x − x∗ − δk dk‖,  k = 1, 2, . . . .

Since

τ(x − x∗, x∗ − x̄) = lim_k (‖x − x∗ − δk d‖ − ‖x − x∗‖)/δk ≥ lim_k (‖x − x∗ − δk dk‖ − ‖x − x∗‖)/δk ≥ 0,

where the first inequality follows from

|‖x − x∗ − δk d‖ − ‖x − x∗ − δk dk‖|/δk ≤ ‖dk − d‖ → 0,

it follows that

⟨x − x∗, x∗ − x̄⟩ ≥ 0  ∀x̄ ∈ x∗ + SFD(x∗).

Since KS(x∗) ⊆ cl conv(x∗ + SFD(x∗)) and the inequality above is preserved under convex combinations and limits, we have

⟨x − x∗, x∗ − x̄⟩ ≥ 0  ∀x̄ ∈ KS(x∗).

This, with Proposition 2.1, implies that PKS(x∗)(x) = x∗, and the theorem follows.
Theorem 3.2. Let x∗ ∈ K. Then the following two statements are equivalent:
(i) K ⊆ KS(x∗).
(ii) For any x ∈ X, PKS(x∗)(x) = x∗ =⇒ x∗ ∈ PK(x).
Proof. It suffices to prove that (ii) =⇒ (i). Let G = cl conv(x∗ + SFD(x∗)). Suppose on the contrary that K ⊄ KS(x∗). Then K ⊄ G, so that there is x̄ ∈ K with x̄ ∉ G. Let g0 = PG(x̄) and x = x̄ − g0 + x∗. Then PG(x) = x∗. In fact, since G = x∗ + cl conv(SFD(x∗)), for any g ∈ G, there exist ḡ0, ḡ ∈ cl conv(SFD(x∗)) such that

g0 = x∗ + ḡ0  and  g = x∗ + ḡ.

Note that G is a cone with vertex x∗. It follows that

g + g0 − x∗ = x∗ + ḡ + ḡ0 ∈ G,

which, by Proposition 2.1, implies that

⟨x̄ − g0, g0 − (g + g0 − x∗)⟩ ≥ 0

as g0 = PG(x̄). Thus we have

⟨x − x∗, x∗ − g⟩ = ⟨x̄ − g0, g0 − (g + g0 − x∗)⟩ ≥ 0,

which proves that PG(x) = x∗. Now define
xt = x∗ + t(x − x∗)  ∀t > 0.

From

‖xt − x∗‖ = t‖x − x∗‖ ≤ t‖x − ((1 − 1/t)x∗ + (1/t)g)‖ = ‖xt − g‖  ∀g ∈ G, t > 1,

it follows that PG(xt) = x∗ for t > 1. Therefore, from (ii) and KS(x∗) ⊆ G, we have x∗ ∈ PK(xt) for t > 1.
On the other hand, for t > 1 we obtain

‖xt − x̄‖² = ‖x∗ + t(x̄ − g0) − x̄‖² = ‖x∗ − g0 + (t − 1)(x̄ − g0)‖²
= (t − 1)²‖x̄ − g0‖² + 2(t − 1)⟨x̄ − g0, x∗ − g0⟩ + ‖x∗ − g0‖².

Since g0 = PG(x̄), it follows from Proposition 2.1 that ⟨x̄ − g0, x∗ − g0⟩ ≤ 0, and hence

‖xt − x̄‖² ≤ (t − 1)²‖x̄ − g0‖² + ‖x∗ − g0‖²
= t²‖x̄ − g0‖² − 2t‖x̄ − g0‖² + ‖x̄ − g0‖² + ‖x∗ − g0‖²
< t²‖x̄ − g0‖² = ‖xt − x∗‖²

for all t > 1 large enough. This means that x∗ ∉ PK(xt), which is a contradiction. The proof is complete.
Similarly, we have the following result for KL (x∗ ).
Theorem 3.3. Let x∗ ∈ K. Then the following statements are equivalent:
(i) K ⊆ KL(x∗).
(ii) For any x ∈ X, PKL(x∗)(x) = x∗ =⇒ x∗ ∈ PK(x).
Corollary 3.1. Let x∗ ∈ K. Consider the following statements:
(i) K ⊆ KL(x∗) and KS(x∗) = KL(x∗).
(ii) For any x ∈ X, x∗ ∈ PK(x) ⇐⇒ PKL(x∗)(x) = x∗.
(iii) For any x ∈ X, x∗ ∈ PK(x) =⇒ PKL(x∗)(x) = x∗.
Then (i) =⇒ (ii) =⇒ (iii). Furthermore, if K ⊆ KS (x∗ ), then (i) ⇐⇒ (ii) ⇐⇒ (iii).
Proof. If (i) holds, by Theorems 3.1 and 3.2, we have

x∗ ∈ PK(x) ⇐⇒ PKS(x∗)(x) = x∗ ⇐⇒ PKL(x∗)(x) = x∗.

Therefore (ii) holds. The implication (ii) =⇒ (iii) is trivial. Now assume that K ⊆ KS(x∗). If (iii) holds, then, for any x ∈ X,

PKS(x∗)(x) = x∗ =⇒ x∗ ∈ PK(x) =⇒ PKL(x∗)(x) = x∗.

Thus, with almost the same arguments as in the proof of Theorem 3.2, we have KL(x∗) ⊆ KS(x∗). By Proposition 3.1, KL(x∗) = KS(x∗), and so K ⊆ KL(x∗); i.e., (i) holds.
It should be noted that if K is convex (e.g., A1, . . . , Ame are linear and Ame+1, . . . , Am are convex), then K ⊆ KS(x∗) holds. We therefore have the following corollary.
Corollary 3.2. Let x∗ ∈ K. If K is convex, then the following statements are
equivalent:
(i) KS (x∗ ) = KL (x∗ ).
(ii) For any x ∈ X, PK (x) = x∗ ⇐⇒ PKL (x∗ ) (x) = x∗ .
(iii) For any x ∈ X, PK (x) = x∗ =⇒ PKL (x∗ ) (x) = x∗ .
4. Reformulations of differentiable constraints. The following notion of the strong CHIP, taken from [9, 10], plays an important role in optimization theory; see, e.g., [7, 8, 12, 18].
Definition 4.1. Let {C0, . . . , Cm} be a collection of closed convex sets and x ∈ ∩_{j=0}^m Cj. Then {C0, . . . , Cm} is said to have the strong CHIP at x if

(∩_{j=0}^m Cj − x)° = Σ_{j=0}^m (Cj − x)°.
Now, for convenience, we write

A_{i+m} := −Ai,  b_{i+m} := −bi,  i = 1, 2, . . . , me,
b̄i = bi − Ai(x∗) + ⟨x∗, A′i(x∗)⟩,  i = 1, 2, . . . , m + me,
Hi = {d ∈ X : ⟨d, A′i(x∗)⟩ ≤ b̄i},  i = 1, 2, . . . , m + me,

and

E0 = E ∪ I(x∗) ∪ {m + 1, . . . , m + me},  E1 = I \ I(x∗).

We define the bounded linear mapping A′(x∗)| from X to Rᵐᵉ by

A′(x∗)|x = (⟨x, A′1(x∗)⟩, . . . , ⟨x, A′me(x∗)⟩) ∈ Rᵐᵉ  ∀x ∈ X.

The inverse of A′(x∗)|, which is in general a set-valued mapping, is denoted by A′(x∗)|⁻¹. Let b̄ = (b̄1, . . . , b̄me).
Then we are ready to give the main result of this section.
Theorem 4.1. Let x∗ ∈ K. Suppose that K ⊆ KL(x∗) and KS(x∗) = KL(x∗). Then the following statements are equivalent:
(i) {C, A′(x∗)|⁻¹(b̄), Hi, i ∈ I(x∗)} has the strong CHIP at x∗.
(ii) {C, A′(x∗)|⁻¹(b̄), Hi, i ∈ I} has the strong CHIP at x∗.
(iii) For any x ∈ X,

x∗ ∈ PK(x) ⇐⇒ PC(x − Σ_{i=1}^m λi A′i(x∗)) = x∗

for some λi, i = 1, . . . , m, with λi ≥ 0 for all i ∈ I and λi = 0 for all i ∉ E ∪ I(x∗).
Proof. We first assume that (i) holds. Since x∗ ∈ int(∩_{i∈E1} Hi), it follows from Proposition 2.3 of [10] that {C ∩ (∩_{i∈E0} Hi), Hi, i ∈ E1} has the strong CHIP at x∗. Thus (i) implies that {C, A′(x∗)|⁻¹(b̄), Hi, i = 1, . . . , m} has the strong CHIP at x∗. Therefore, (ii) holds.
Now suppose that (ii) holds. By Corollary 3.1, we have that, for any x ∈ X, x∗ ∈ PK(x) ⇐⇒ PKL(x∗)(x) = x∗. We will show that PKL(x∗)(x) = x∗ if and only if PKL0(x∗)(x) = x∗, where KL0(x∗) = C ∩ (∩_{i=1}^{m+me} Hi). In fact, it is clear that PKL(x∗)(x) = x∗ implies PKL0(x∗)(x) = x∗. Conversely, assume that PKL0(x∗)(x) = x∗. Since KL(x∗) ∩ U(x∗, r) ⊆ KL0(x∗) for some r > 0, where U(x∗, r) denotes the open ball with center x∗ and radius r, x∗ is a best approximation to x from KL(x∗) ∩ U(x∗, r); that is, x∗ is a local best approximation to x from KL(x∗), and hence PKL(x∗)(x) = x∗ by [3]. Note that any finite collection of half-spaces has the strong CHIP [9]. It follows that {C, A′(x∗)|⁻¹(b̄), Hi : i ∈ I} has the strong CHIP at x∗ ⇐⇒ {C, Hi : i = 1, 2, . . . , m + me} has the strong CHIP at x∗. Thus, using Theorem DLW, we have

PKL0(x∗)(x) = x∗ ⇐⇒ PC(x − Σ_{i=1}^{m+me} λi A′i(x∗)) = x∗

for some λi ≥ 0, i = 1, . . . , m + me, with λi(⟨x∗, A′i(x∗)⟩ − b̄i) = 0. Consequently, (iii) holds.
Finally, if (iii) holds, it follows from Corollary 3.1 that, for any x ∈ X,

PKL(x∗)(x) = x∗ ⇐⇒ x∗ ∈ PK(x) ⇐⇒ PC(x − Σ_{i=1}^m λi A′i(x∗)) = x∗

for some λi, i = 1, . . . , m, with λi ≥ 0 for all i ∈ I and λi = 0 for all i ∉ E ∪ I(x∗). Consequently,

PKL(x∗)(x) = x∗ ⇐⇒ PC(x − Σ_{i∈E∪I(x∗)} λi A′i(x∗)) = x∗

for some λi, i ∈ E ∪ I(x∗), with λi ≥ 0 for all i ∈ I(x∗), or equivalently,

PKL(x∗)(x) = x∗ ⇐⇒ PC(x − Σ_{i∈E0} λi A′i(x∗)) = x∗

for some λi ≥ 0, i ∈ E0. Thus, using Theorem DLW again, we know that {C, Hi, i ∈ E0} has the strong CHIP at x∗, and so does {C, A′(x∗)|⁻¹(b̄), Hi, i ∈ I(x∗)}; i.e., (i) holds. The proof of the theorem is complete.
Remark 4.1. Recall that the constraint qualification condition on span(C − x∗) is satisfied at x∗ if SFD(x∗) ∩ span(C − x∗) = LFD(x∗) ∩ span(C − x∗); this condition plays an important role in nonlinear optimization theory; see [1, 14]. Clearly, if the constraint qualification condition is satisfied at x∗ (as is the case if each Ai, i ∈ I(x∗), is linear, or if the Mangasarian–Fromovitz constraint qualification on span(C − x∗) (see [15]) is satisfied and x∗ ∈ ri C), then KS(x∗) = KL(x∗).
The following proposition shows that the conditions K ⊆ KL (x∗ ) and KS (x∗ ) =
KL (x∗ ) are “almost” necessary.
Proposition 4.1. Suppose that the conclusion of Theorem 4.1 is valid. Suppose
in addition that one of conditions (i)–(iii) holds; then K ⊆ KL (x∗ ). Moreover, if
K ⊆ KS (x∗ ), in particular if K is convex, then KS (x∗ ) = KL (x∗ ).
Proof. Under the assumption of Proposition 4.1, we have that, for any x ∈ X, x∗ ∈ PK(x) ⇐⇒ PKL(x∗)(x) = x∗. Thus, by Theorem 3.1, K ⊆ KL(x∗). Moreover, if K ⊆ KS(x∗), we have KS(x∗) = KL(x∗) from Corollary 3.1.
Now we give an example to illustrate the main theorem of this section.
Example 4.1. Let X = R², C = {(x1, x2) : (x1 − 4)² + x2² ≤ 16}, and

A1(x) = x2 − sin x1,  A2(x) = −x1 − x2  ∀x = (x1, x2) ∈ X.
For x∗ = (0, 0) we have

A′1(x∗) = (−1, 1),  A′2(x∗) = (−1, −1),

and

KL(x∗) = KS(x∗) = {(x1, x2) : x2 ≤ x1, −x1 ≤ x2}.

Let me = 0. Clearly, K ⊂ KL(x∗). Since int C ∩ H1 ∩ H2 ≠ ∅, it follows from Proposition 2.3 of [10] that {C, H1, H2} has the strong CHIP. Then, by Theorem 4.1, for any x = (x1, x2) ∈ X, x∗ ∈ PK(x) if and only if there exist λ1, λ2 ≥ 0 such that PC(x − λ1(−1, 1) − λ2(−1, −1)) = x∗. Observe that, for any y = (y1, y2), PC(y) = x∗ if and only if y1 ≤ 0 and y2 = 0. It follows that x∗ ∈ PK(x) if and only if x = (x1, x2) satisfies x1 + x2 ≤ 0 and x1 − x2 ≤ 0. We remark that this result cannot be deduced from Theorem DLW.
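The conclusion of Example 4.1 can be checked numerically. In the sketch below (the test points and tolerances are our own; b = (0, 0) is implicit, both constraints being active at x∗), PK is approximated by brute force over a dense sample of the nonconvex set K.

```python
import numpy as np

# Sanity check of Example 4.1: K = C ∩ {A1 <= 0} ∩ {A2 <= 0} with C the
# disk of radius 4 about (4, 0).  P_K is approximated by sampling K.
rng = np.random.default_rng(2)
pts = rng.uniform([-1.0, -9.0], [9.0, 9.0], size=(1_000_000, 2))
in_K = ((pts[:, 0] - 4.0) ** 2 + pts[:, 1] ** 2 <= 16.0) \
       & (pts[:, 1] - np.sin(pts[:, 0]) <= 0.0) \
       & (-pts[:, 0] - pts[:, 1] <= 0.0)
K = pts[in_K]

def nearest_in_K(x):
    return K[np.argmin(np.linalg.norm(K - x, axis=1))]

# x = (-1, 0.5) satisfies x1 + x2 <= 0 and x1 - x2 <= 0, so the example
# predicts that x* = (0, 0) is a best approximation to x from K ...
assert np.linalg.norm(nearest_in_K(np.array([-1.0, 0.5]))) < 0.1
# ... while x = (1, 0) violates x1 - x2 <= 0; indeed (1, 0) itself lies
# in K, so the best approximation stays far from the origin.
assert np.linalg.norm(nearest_in_K(np.array([1.0, 0.0]))) > 0.5
```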
5. Reformulations of convex constraints. Throughout this section, we always assume that Ai, i = 1, . . . , m, are continuous convex functions. Without loss of generality, let

Ci = {x ∈ X : Ai(x) ≤ 0},  i = 1, . . . , m,

and

K = C ∩ (∩_{i=1}^m Ci).
We first introduce the concept of the BCQ relative to C. For convenience, in what follows, cone{∂Ai(x) : Ai(x) = 0} is understood to be {0} when Ai(x) < 0 for all i.
Definition 5.1. Let x ∈ K. The system of convex inequalities

(5.1)  A1(x) ≤ 0, . . . , Am(x) ≤ 0

is said to satisfy the BCQ relative to C at x if

NK(x) = NC(x) + cone{∂Ai(x) : Ai(x) = 0}.

The system of convex inequalities (5.1) is said to satisfy the BCQ relative to C if it satisfies the BCQ relative to C at every x ∈ K.
Remark 5.1. When C = X, the BCQ relative to C at x is just the BCQ at x
considered in [12, 13]. Note that if x ∈ K and Ai (x) = 0, then cone(∂Ai (x)) ⊆ NCi (x),
and the equality holds if x is not a minimizer of Ai ; see [4, Corollary 1, p. 50].
Similar to the general BCQ, we also have the following properties about the BCQ
relative to C.
Proposition 5.1. Let x ∈ K. The system (5.1) satisfies the BCQ relative to C
at x if and only if
NK (x) ⊆ NC (x) + cone{∂Ai (x) : Ai (x) = 0}.
Proof. Note that

NC(x) + cone{∂Ai(x) : i ∈ I(x)} ⊆ NC(x) + Σ_{i∈I(x)} NCi(x) ⊆ NC(x) + Σ_{i=1}^m NCi(x) ⊆ NK(x).

The result follows.
Proposition 5.2. Let x ∈ K. Suppose that the system (5.1) satisfies the BCQ
relative to C at x. Then {C, C1 , . . . , Cm } has the strong CHIP at x.
Definition 5.2. The system (5.1) is said to satisfy the weak Slater condition on
C if there exists some x̄ ∈ (riC) ∩ K, called a weak Slater point, such that for any i,
Ai is affine or Ai (x̄) < 0.
Remark 5.2. When C = X, the weak Slater condition on C is just the weak
Slater condition studied in [12, 13].
The following proposition is a generalization of Corollary 7 of [12].
Proposition 5.3. Suppose that the system (5.1) satisfies the weak Slater condition on C. Then it satisfies the BCQ relative to C.
Proof. Let I0 = {i ∈ I : Ai is affine}, H0 = ∩_{i∉I0} Ci, and H = ∩_{i∈I0} Ci. From Theorem 5.1 and Proposition 2.3 of [10], it follows that {C, H} and {C ∩ H, H0} have the strong CHIP. Thus, for any x ∈ K, we have

NK(x) = NC∩H(x) + NH0(x) = NC(x) + NH(x) + NH0(x).

Observe that the system (5.1) satisfies the weak Slater condition of [12]; then Remark 5.1 implies that the system (5.1) satisfies the BCQ. Hence

NH(x) + NH0(x) = cone{∂Ai(x) : i ∈ I(x)},

since {H, H0} has the strong CHIP by Proposition 2.3 of [10]. Therefore, the system (5.1) satisfies the BCQ relative to C. The proof is complete.
The following lemma isolates a condition that does not depend upon the BCQ but still allows the computation of PK(x) via a perturbation technique.
Lemma 5.1. Let x∗ = PC(x − Σ_{i=1}^m λi hi) ∈ K for some hi ∈ ∂Ai(x∗) and λi ≥ 0 with λi = 0 for i ∉ I(x∗). Then x∗ = PK(x).
Proof. Since λi = 0 for all i ∉ I(x∗) and x∗ = PC(x − Σ_{i∈I(x∗)} λi hi), it follows from Proposition 2.1 that

x − Σ_{i∈I(x∗)} λi hi − x∗ ∈ (C − x∗)°.

Hence

x − x∗ ∈ (C − x∗)° + Σ_{i∈I(x∗)} λi hi ⊆ (C − x∗)° + cone{∂Ai(x∗) : i ∈ I(x∗)} ⊆ (K − x∗)°.

Using Proposition 2.1 again, we have x∗ = PK(x).
The main theorem of this section is stated as follows.
Theorem 5.1. Let x∗ ∈ K. Then the following two statements are equivalent:
(i) The system (5.1) satisfies the BCQ relative to C at x∗ .
(ii) For any x ∈ X,

PK(x) = x∗ ⇐⇒ x∗ = PC(x − Σ_{i=1}^m λi hi)

for some hi ∈ ∂Ai(x∗) and λi ≥ 0 with λi = 0 for i ∉ I(x∗).
Proof. Assume that (i) holds. To show (ii), by Lemma 5.1, we need only to prove that, for any x ∈ X, PK(x) = x∗ implies that x∗ = PC(x − Σ_{i=1}^m λi hi) for some hi ∈ ∂Ai(x∗) and λi ≥ 0, with λi = 0 for i ∉ I(x∗). From Proposition 2.1 and (i), we have

x − x∗ ∈ (K − x∗)° ⊆ (C − x∗)° + cone{∂Ai(x∗) : i ∈ I(x∗)}.

Therefore, there exist hi ∈ ∂Ai(x∗) and λi ≥ 0 for i ∈ I(x∗) such that

x − x∗ ∈ (C − x∗)° + Σ_{i∈I(x∗)} λi hi.

That is,

x − Σ_{i∈I(x∗)} λi hi − x∗ ∈ (C − x∗)°.

It follows from Proposition 2.1 that x∗ = PC(x − Σ_{i=1}^m λi hi), and (ii) holds.
Conversely, assume that (ii) holds. For z ∈ (K − x∗)°, let x = z + x∗. Observe that x − x∗ ∈ (K − x∗)° implies that PK(x) = x∗. It follows from (ii) that x∗ = PC(x − Σ_{i=1}^m λi hi) for some hi ∈ ∂Ai(x∗) and λi ≥ 0, with λi = 0 for i ∉ I(x∗). Using Proposition 2.1, we have

z = x − Σ_{i=1}^m λi hi − x∗ + Σ_{i=1}^m λi hi ∈ (C − x∗)° + cone{∂Ai(x∗) : i ∈ I(x∗)}.

Hence

(K − x∗)° ⊆ (C − x∗)° + cone{∂Ai(x∗) : i ∈ I(x∗)}.

From Proposition 5.1, (i) holds. The proof is complete.
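As a concrete finite-dimensional illustration of Theorem 5.1 (a toy instance of our own, with C = X so that the BCQ relative to C reduces to the ordinary BCQ of [12, 13]), take the single nondifferentiable convex constraint A1(x) = |x1| + |x2| − 1 ≤ 0 in R². The origin is a Slater point, so Proposition 5.3 guarantees the BCQ, and the perturbation property can be checked by brute force.

```python
import numpy as np

# K is the l1 ball in R^2; since C = X, P_C is the identity, and
# Theorem 5.1 reads: P_K(x) = x* iff x* = x - lam*h for some
# h in the subdifferential of A1 at x* and lam >= 0.
rng = np.random.default_rng(3)
x = np.array([2.0, 1.0])

# Brute-force P_K(x) over a dense sample of the l1 ball.
pts = rng.uniform(-1.0, 1.0, size=(500_000, 2))
K = pts[np.abs(pts).sum(axis=1) <= 1.0]
x_star = K[np.argmin(np.linalg.norm(K - x, axis=1))]   # true value: (1, 0)

# At x* = (1, 0): the subdifferential of A1 is {(1, t) : t in [-1, 1]},
# and x - lam*h = x* holds with lam = 1, h = (1, 1).
lam, h = 1.0, np.array([1.0, 1.0])
assert np.linalg.norm(x_star - [1.0, 0.0]) < 0.1       # sampled P_K(x)
assert np.allclose(x - lam * h, [1.0, 0.0])
```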
Corollary 5.1. The following two statements are equivalent:
(i) The system of convex inequalities (5.1) satisfies the BCQ relative to C.
(ii) For any x ∈ X and x∗ ∈ K, PK(x) = x∗ ⇐⇒ x∗ = PC(x − Σ_{i=1}^m λi hi) for some hi ∈ ∂Ai(x∗) and λi ≥ 0, with λi = 0 for i ∉ I(x∗).
The following corollary describes the relationship between the BCQ and the strong
CHIP.
Corollary 5.2. Let x∗ ∈ K. Suppose that Ai , i = 1, . . . , m, are, in addition,
differentiable at x∗ . Let KS (x∗ ), KL (x∗ ), and Hi , i ∈ I(x∗ ), be defined as in the
previous sections. Then the following statements are equivalent:
(i) The system of convex inequalities (5.1) satisfies the BCQ relative to C at x∗ .
(ii) {C, H1, H2, . . . , Hm} has the strong CHIP at x∗ and KS(x∗) = KL(x∗).
(iii) For any x ∈ X, PK(x) = x∗ ⇐⇒ x∗ = PC(x − Σ_{i=1}^m λi A′i(x∗)) for some λi ≥ 0, with λi = 0 for all i ∉ I(x∗).
Proof. The equivalence of (i) and (iii) is a direct consequence of Theorem 5.1; hence we need only to prove that (ii) is equivalent to (iii). Since K is convex, Theorem 4.1 gives the implication (ii) =⇒ (iii). Conversely, assume that (iii) holds. By Lemma 3.1 of [10], we have that

PK(x) = x∗ =⇒ PKL(x∗)(x) = x∗.

Then from Corollary 3.2 it follows that KS(x∗) = KL(x∗). Again using Theorem 4.1, we have that {C, H1, H2, . . . , Hm} has the strong CHIP at x∗. The proof is complete.
Finally, we give an example with nondifferentiable convex constraints.
Example 5.1. Let X = l² and let C be the half-space defined by

C = {x = (x1, x2, . . .) ∈ l² : x1 ≤ 0}.

Define

A(x) = Σ_{k=1}^∞ |xk| − 1  ∀x = (x1, x2, . . .) ∈ l²,

and take x∗ = (x∗k) ∈ K, where

x∗k = 0 for k = 2n + 1, n = 0, 1, 2, . . . ,  and  x∗k = 1/2ⁿ for k = 2n, n = 1, 2, . . . .
Then x∗ ∈ K, A(x∗) = 0, and

∂A(x∗) = {z = (z1, z2, . . .) : z2n = 1, z2n+1 ∈ [−1, 1], n = 0, 1, 2, . . .}.

Since the system of convex inequalities A(x) ≤ 0 satisfies the weak Slater condition on C, it satisfies the BCQ relative to C. Thus, using Theorem 5.1, we get that, for any x = (x1, x2, . . .) ∈ l², PK(x) = x∗ if and only if there exists b ≥ 0 such that

x1 ≥ −b,  x2n = 1/2ⁿ + b,  x2n+1 ∈ [−b, b],  n = 1, 2, . . . .

In fact, for any x ∈ l², PC(x) = x∗ if and only if x1 ≥ 0 and xk = x∗k for all k > 1. By Theorem 5.1, PK(x) = x∗ if and only if there exist λ ≥ 0 and t2n+1 ∈ [−1, 1] such that PC(x − λ(t1, 1, t3, 1, . . .)) = x∗. From this we can deduce the desired result.
6. Concluding remark. Nonlinear best approximation problems in Hilbert
spaces have been studied in this paper. As in the case of linear constraints, the
strong CHIP is used to characterize the “perturbation property” of best approximations in the case of differentiable constraints. However, this is the first time that
the “perturbation property” has been characterized using the generalized BCQ for
convex constraints. Our main results are Theorems 4.1 and 5.1. In particular, for
both differentiable and convex constraints, the equivalence of the generalized BCQ,
the “perturbation property,” and the strong CHIP with the constraint qualification
condition KL (x∗ ) = KS (x∗ ) has been obtained. Moreover, some examples with nonlinear constraints have been given to show that our main results genuinely generalize
recent results of [9, 10] on best approximation with linear constraints.
Acknowledgments. We wish to thank the referees for their valuable comments
and suggestions. We wish to express our gratitude to Dr. K. F. Ng and Dr. W. Li for
their careful reading of drafts of the present paper and for their helpful remarks.
REFERENCES
[1] M. Bazaraa, J. Goode, and C. Shetty, Constraint qualifications revisited, Management Sci.,
18 (1972), pp. 567–573.
[2] C. De Boor, On “best” interpolation, J. Approx. Theory, 16 (1976), pp. 28–48.
[3] B. Brosowski and F. Deutsch, On some geometric properties of suns, J. Approx. Theory, 10
(1974), pp. 245–267.
[4] F. Clarke, Optimization and Nonsmooth Analysis, John Wiley & Sons, New York, 1983.
[5] C. Chui, F. Deutsch, and J. Ward, Constrained best approximation in Hilbert space, Constr.
Approx., 6 (1990), pp. 35–64.
[6] C. Chui, F. Deutsch, and J. Ward, Constrained best approximation in Hilbert space II, J.
Approx. Theory, 71 (1992), pp. 231–238.
[7] F. Deutsch, The role of the strong conical hull intersection property in convex optimization
and approximation, in Approximation Theory IX, Vol. I: Theoretical Aspects, C. Chui and
L. Schumaker, eds., Vanderbilt University Press, Nashville, TN, 1998, pp. 105–112.
[8] F. Deutsch, W. Li, and J. Swetits, Fenchel duality and the strong conical hull intersection property, J. Optim. Theory Appl., 102 (1999), pp. 681–695.
[9] F. Deutsch, W. Li, and J. Ward, A dual approach to constrained interpolation from a convex
subset of Hilbert space, J. Approx. Theory, 90 (1997), pp. 385–444.
[10] F. Deutsch, W. Li, and J. D. Ward, Best approximation from the intersection of a closed
convex set and a polyhedron in Hilbert space, weak Slater conditions, and the strong conical
hull intersection property, SIAM J. Optim., 10 (1999), pp. 252–268.
[11] F. Deutsch, V. Ubhaya, J. Ward, and Y. Xu, Constrained best approximation in Hilbert
space III: Application to n-convex functions, Constr. Approx., 12 (1996), pp. 361–384.
[12] H. Bauschke, J. Borwein, and W. Li, Strong conical hull intersection property, bounded
linear regularity, Jameson’s property (G), and error bounds in convex optimization, Math.
Program., 86 (1999), pp. 135–160.
[13] J. Hiriart-Urruty and C. Lemarechal, Convex Analysis and Minimization Algorithms I,
Grundlehren Math. Wiss. 305, Springer, New York, 1993.
[14] O. Mangasarian, Nonlinear Programming, McGraw–Hill, New York, 1969.
[15] O. L. Mangasarian and S. Fromovitz, The Fritz John necessary optimality conditions in
the presence of equality constraints, J. Math. Anal. Appl., 17 (1967), pp. 37–47.
[16] C. Micchelli, P. Smith, J. Swetits, and J. Ward, Constrained Lp -approximation, Constr.
Approx., 1 (1985), pp. 93–102.
[17] C. A. Micchelli and F. I. Utreras, Smoothing and interpolation in a convex subset of a
Hilbert space, SIAM J. Sci. Statist. Comput., 9 (1988), pp. 728–746.
[18] I. Singer, Duality for optimization and best approximation over finite intersection, Numer.
Funct. Anal. Optim., 19 (1998), pp. 903–915.
[19] R. Rockafellar, Convex Analysis, Princeton University Press, Princeton, NJ, 1970.
[20] Y. Yuan and W. Sun, Optimization Theory and Methods, Science Press, Beijing, 1997 (in
Chinese).