OPERATIONS RESEARCH CENTER Working Paper
MASSACHUSETTS INSTITUTE OF TECHNOLOGY

Solving Variational Inequality and Fixed Point Problems by Averaging and Optimizing Potentials

Thomas L. Magnanti* and Georgia Perakis†

OR 324-97        December 1997

Abstract

We introduce a general adaptive averaging framework for solving fixed point and variational inequality problems. Our goals are to develop schemes that (i) compute solutions when the underlying map satisfies properties weaker than contractiveness (for example, weaker forms of nonexpansiveness), (ii) are more efficient than the classical methods even when the underlying map is contractive, and (iii) unify and extend several convergence results from the fixed point and variational inequality literatures. To achieve these goals, we consider line searches that optimize certain potential functions. As a special case, we introduce a modified steepest descent method for solving systems of equations that does not require a condition imposed previously in the literature (that the square of the Jacobian matrix be positive definite). Since the line searches we propose might be difficult to perform exactly, we also consider inexact line searches.

Key Words: Fixed Point Problems, Variational Inequalities, Averaging Schemes, Nonexpansive Maps, Strongly-f-Monotone Maps.

*Sloan School of Management and Department of Electrical Engineering and Computer Science, MIT, Cambridge, MA 02139.
†Operations Research Center, MIT, Cambridge, MA 02139.
Preparation of this paper was supported, in part, by NSF Grant 9634736-DMI from the National Science Foundation.

1 Introduction

Fixed point and variational inequality problems define a wide class of problems that arise in optimization as well as in areas as diverse as economics, game theory, transportation science, and regional science. Their widespread applicability motivates the need for developing and studying efficient algorithms for solving them. Often, algorithms for solving problems in these various settings establish an algorithmic map T : K ⊆ R^n → K, defined over a closed, convex (constraint) set K in R^n, whose fixed points solve the original problem:

FP(T, K): Find x* ∈ K ⊆ R^n satisfying T(x*) = x*.   (1)

The algorithmic map T might be defined through the solution of a subproblem. For variational inequality problems, examples of an algorithmic map T include a projection operator (see Proposition 1) or the solution of a simpler variational inequality subproblem (see, for example, [12] and [44]).

A standard method for solving fixed point problems for contractive maps T(.) in R^n is to apply the iterative procedure

x_{k+1} = T(x_k).   (2)

The classical Banach fixed point theorem shows that this method converges from any starting point to the unique fixed point of T. When the map is nonexpansive instead of contractive, this algorithm need not converge and, indeed, the map T need not have a fixed point (or it might have several). For example, the sequence that the classical iteration (2) induces for the 90 degree rotation mapping shown in Figure 1 does not converge to the solution x*.
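To make the contrast concrete, the following small numerical sketch (ours, not the paper's; the constant step a_k = 1/2 is just one admissible averaging choice) applies the classical iteration (2) and the averaged iteration x_{k+1} = x_k + a_k(T(x_k) - x_k) to the 90 degree rotation map of Figure 1. The rotation is nonexpansive (an isometry) but not contractive, so iteration (2) circles around the fixed point forever while the averaged iterates contract to x* = 0.

```python
import numpy as np

R = np.array([[0.0, -1.0],
              [1.0,  0.0]])      # 90 degree rotation; its only fixed point is the origin
T = lambda x: R @ x

x_picard = np.array([1.0, 1.0])
x_avg = np.array([1.0, 1.0])
for k in range(200):
    x_picard = T(x_picard)                    # classical iteration (2): stays on a circle of radius sqrt(2)
    x_avg = x_avg + 0.5 * (T(x_avg) - x_avg)  # averaging with a_k = 1/2 (a valid Dunn sequence)

print(np.linalg.norm(x_picard))   # ~1.414: no progress toward x* = 0
print(np.linalg.norm(x_avg))      # ~1e-30: converges to x* = 0
```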
[Figure 1: 90 Degree Rotation]

In these situations, researchers (see [2], [10], [13], [27], [38], [39], [46]) have established convergence of recursive averaging schemes of the type x_{k+1} = x_k + a_k(T(x_k) - x_k), assuming that 0 ≤ a_k ≤ 1 and Σ_{k=1}^∞ a_k(1 - a_k) = +∞. Equivalently, this condition states that Σ_{k=1}^∞ min(a_k, 1 - a_k) = +∞, implying that the iterates lie "far enough" from the previous iterates as well as from the image of the previous iterates under the fixed point mapping. For variational inequality problems, averaging schemes of this type give rise to convergent algorithms using algorithmic maps T that are nonexpansive rather than contractive (see Magnanti and Perakis [38]).

Even though recursive averaging methods converge to a solution whenever the underlying fixed point map is nonexpansive, they might converge very slowly. In fact, when the underlying map T is a contraction, recursive averaging might converge more slowly than the classical iterative method (2). Motivated by these observations, in this paper we introduce and study a recursive averaging framework for solving fixed point and variational inequality problems. Our goals are to (i) design methods permitting a larger range of step sizes with better rates of convergence than the classical iterative method (2) even when applied to contractive maps, (ii) impose assumptions on the map T that are weaker than contractiveness, (iii) understand the role of nonexpansiveness, and (iv) unify and extend several convergence results from the fixed point and variational inequality literature.

To achieve these goals, we introduce a general adaptive averaging framework that determines step sizes by "intelligently" updating them dynamically as we apply the algorithm. For example, to determine step sizes for nonexpansive maps, we adopt the following idea: unless the current iterate lies close to a solution, the next iterate should not lie close to either the current iterate or its image under the fixed point map T. As a special case, we introduce a modification of the classical steepest descent method for solving fixed point problems (or, equivalently, unconstrained asymmetric variational inequality problems). This new method does not require the square of the Jacobian matrix to be positive definite, as does the classical steepest descent method, but rather that the Jacobian matrix be positive semidefinite. Moreover, it is a globally convergent method even in the general nonlinear case.

1.1 Preliminaries: Fixed Points and Variational Inequalities

Fixed point problems are closely related to variational inequality problems

VI(f, K): Find x* ∈ K ⊆ R^n : f(x*)^t (x - x*) ≥ 0, for all x ∈ K,   (3)

defined over a closed, convex (constraint) set K in R^n. In this formulation, f : K ⊆ R^n → R^n is a given function. The following well-known proposition provides a connection between the two problems. In stating this result and in the remainder of this paper, we let G be a positive definite and symmetric matrix. We also let Pr_K^G denote the projection operator on the set K with respect to the norm ||x||_G = (x^t G x)^{1/2}.

Proposition 1 ([25]): Let p be a positive constant and T be the map T = Pr_K^G(I - pG^{-1} f). Then the solutions of the fixed point problem FP(T, K) are the same as the solutions of the variational inequality problem VI(f, K), if any.

Corollary 1: Let f = I - T. Then the fixed point problem FP(T, K) has the same solutions, if any, as the variational inequality problem VI(f, K).
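As a concrete illustration of Proposition 1, the sketch below (the test problem and helper name are ours, not the paper's) builds T = Pr_K^G(I - pG^{-1} f) with G = I and K a box, so the projection reduces to a componentwise clip. For a strongly monotone affine f and a small enough p, the map T is contractive, and the classical iteration (2) applied to T recovers the solution of VI(f, K).

```python
import numpy as np

def projection_map(f, lo, hi, p=0.1):
    """T(x) = Pr_K(x - p*f(x)) for the box K = [lo, hi]^n, taking G = I."""
    return lambda x: np.clip(x - p * f(x), lo, hi)

# Illustrative VI map: f(x) = M x - c with positive definite (asymmetric) M,
# so f is strongly monotone and T is contractive for small p.
M = np.array([[2.0, 1.0],
              [0.0, 2.0]])
c = np.array([1.0, -3.0])
f = lambda x: M @ x - c

T = projection_map(f, lo=0.0, hi=1.0)
x = np.zeros(2)
for _ in range(500):
    x = T(x)          # Banach/Picard iteration on the contractive map T

print(x)  # approaches x* = (0.5, 0.0); f(x*) = (0, 3) satisfies f(x*)^t (y - x*) >= 0 on the box
```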
To state the results in this paper, we need to impose several conditions on the underlying fixed point map T or the variational inequality map f. We first introduce a definition that captures many well known concepts from the literature.

Definition 1: A map T : K → K is A-domain and B-range coercive on the set K with coercive constants A ∈ R and B ∈ R if

||T(x) - T(y)||_G^2 ≤ A ||x - y||_G^2 + B ||x - T(x) - y + T(y)||_G^2, for all x, y ∈ K.   (4)

T is a (range) coercive map on K relative to the ||.||_G norm if A ≥ 0 and B = 0. T is a (range) contraction on K relative to the ||.||_G norm if 0 ≤ A < 1 and B = 0, and is (range) nonexpansive on K relative to the ||.||_G norm if A = 1 and B = 0. T is nonexpansive-contractive if A = 1 and B < 1, T is contractive-nonexpansive if A < 1 and B = 1, and T is nonexpansive-nonexpansive if A = 1 and B = 1. T is firmly contractive on K relative to the ||.||_G norm if 0 ≤ A < 1 and B = -1, and T is firmly nonexpansive on K relative to the ||.||_G norm if A = 1 and B = -1.

Observe that any firmly contractive map is a contraction and any firmly nonexpansive map is nonexpansive. When y in the definition of a nonexpansive map is restricted to be a fixed point solution, researchers have referred to the map T as pseudo nonexpansive relative to the ||.||_G norm (see [55]). Observe that any nonexpansive map is also nonexpansive-contractive. The following example shows that the converse isn't necessarily true.

Example: If T(x) = -3x, then x* = 0. T is an expansive map about x* since ||T(x) - T(x*)|| > ||x - x*|| for all x ≠ 0. Nevertheless, the map T satisfies the nonexpansive-contractive condition with A = 1 and B = 1/2, since ||x - T(x)||^2 = ||4x||^2 = 16||x||^2 and, therefore,

9||x||^2 = ||T(x) - T(x*)||^2 = ||x - x*||^2 + B ||x - T(x)||^2 = ||x||^2 + (1/2) 16||x||^2.

Definition 2: For variational inequality problems VI(f, K), the following notions are useful. f is b-domain and d-range monotone (on the set K) with monotonicity constants b and d if

[f(x) - f(y)]^t G [x - y] ≥ b ||x - y||_G^2 + d ||f(x) - f(y)||_G^2, for all x, y ∈ K.

As special cases: (i) f is strongly domain and range monotone if b > 0 and d > 0. (ii) f is strongly monotone if b > 0 and d = 0. (iii) f is strongly-f-monotone if b = 0 and d > 0. (iv) f is monotone if b = d = 0.

Contraction, nonexpansiveness, firm nonexpansiveness, pseudo nonexpansiveness, monotonicity, and strong monotonicity are standard conditions in the literature (especially when G = I). Some authors refer to strong monotonicity and strong-f-monotonicity as coercivity and co-coercivity (see [54], [57]).

Lemma 1: Let T : R^n → R^n and f : R^n → R^n be two given maps with T = I - f. Then T is domain and range coercive with coercivity constants A and B if and only if f is domain and range monotone with monotonicity constants b = (1-A)/2 and d = (1-B)/2. (We permit the coercivity and monotonicity constants to be negative.) That is,

||T(x) - T(y)||_G^2 ≤ A ||x - y||_G^2 + B ||x - T(x) - y + T(y)||_G^2

if and only if

[f(x) - f(y)]^t G [x - y] ≥ ((1-A)/2) ||x - y||_G^2 + ((1-B)/2) ||f(x) - f(y)||_G^2.

Proof: Substitute I - f = T, expand ||T(x) - T(y)||_G^2, and rearrange terms.

Corollary 2:
1. T is nonexpansive relative to the ||.||_G norm if and only if f is strongly-f-monotone relative to the matrix G with a monotonicity constant d = 1/2.
2. If T is contractive relative to the ||.||_G norm, then f is strongly monotone relative to the matrix G with a monotonicity constant b = (1-A)/2.
3. T is nonexpansive-nonexpansive (nonexpansive-contractive) relative to the ||.||_G norm with a coercivity constant B ∈ [0, 1] (B < 1 in the nonexpansive-contractive case) if and only if f is strongly-f-monotone with constant d = (1-B)/2.
4.
T is firmly contractive relative to the II6G norm with constant A E [0, 1) if and only if f is domain and range monotone with constants b = A and d = 1. Proof of the Corollary: Take A = 1 and B = 0 for part a, A < 1 and B = 0 for part b, A = 1 and B = 1 (or B < 1) in part c, and B = -1 in part d. The following related result is also useful. Proposition 2 ([8]): Let p > 0 be a given constant, G a positive definite and symmetric matrix, and f : K C R n - R n a given function. Then the map T = I - pG-1f is nonexpansive with respect to the II.IIG norm if and only if f is strongly-f-monotone. We structure the remainder of this paper as follows. In Section 2 we introduce a general adaptive framework for solving fixed point problems that evaluates step sizes through the optimization of a potential. In Section 3 we introduce and study several choices of potentials as special cases. At each step, we optimize these potentials to dynamically update the step sizes of the recursive averaging scheme. We also examine rates of convergence for the various choices of potentials. To make the computation of the step sizes easier to perform, in Section 4 we attempt to understand when the schemes have a better rate of convergence than the classic iteration (2). In Section 5 we consider inexact line searches. Finally, in Section 6 we address some open questions. 6 Potential Optimizing Methods: A General Case 2 2.1 An Adaptive Averaging Framework We are interested in finding "good" step sizes ak for the following general iterative scheme Xk+1 = xk(ak) = Xk + ak(T(xk) - xk). For any positive definite and symmetric matrix G, the fixed point problem FP(T, K) is equivalent to the minimization problem (5) minxEK][X - T(x) I. The difficulty in using the equivalent optimization formulation (5) is that even when T is a contractive map relative to the 1l.[IG norm, the potential llx - T(x)iiG need not be a convex function. Example: Let T(x) = xl/ 2 and K = [1/2, 1]. Then the fixed point problem becomes Find x* E [1/2,1]: (x*) 1/ 2 = *. x* = 1. The mapping T is contractive on K since IIT(x) - T(y)ll = IX1/2 yl/2 = < [x-yl IIx1/2 + y1 2 11 2 -Yll, for all x, y > 1/2. The potential g(x) = (x-T(x))2 = (x-x l/ 2 )2 is not convex for all x E [1/2. 9/16). In fact, since g(x)" = 2 < 0 for all x < 9/16, g(x) is concave in the interval [1/2, 9/16). This example shows that the minimization problem (5) need not be a convex programming problem. Nevertheless, the equivalent minimization problem does motivate a class of potential functions and a general scheme for computing the step sizes ak. Adaptive Averaging Framework ak = argmin({as}g(xk(a)). Later, we consider several choices for the potential g(xk(a)). We assume the step size search set S is a subset of R. Examples include S = [0,1], S = R+ and S = [-1,0]. We first make several observations, beginning with a relationship between Ix- *G' 7 x - T(x)ll2 and Preliminary Observations Proposition 3 : If T is a coercive map relative to the I IG norm with a coercive constant A > 0 and if x 54 x*, then the following inequality is valid, (1 + V?1) 2 > IIx- T(x)11G > (1- _) -IIx-X*11 -6 2 (6) Proof: Lemma 1 with y = x* and B = 0 together with the inequality I(x - T(x))tG(x - x*)l < IIx - T(x)IIG.Ilx - x*IIG imply that lix - T(x)IIG.llx - X IIG 1-A ---I -2 X*I12 ± 1 + x 2 - I1 T(x)11 Dividing by ilx- x* I on both sides and setting y = Ix-/*11 c 'l we obtain y2-y + ¾_A < 0. inequality holds for the values of y lying between the two roots yl = 1 + vT and Y2 = 1 - This a of the binomial. 
This result implies the inequality (6). Q.E.D. Definition 3 : The sequence {k} is asymptotically regular (with respect to the map T) if limk-o,, lXk - T(xk)II = 0. Proposition 4 : i. If T is a continuous map and the sequence {Xk} converges to some fixed point x*, then the sequence {k } is asymptotically regular. ii. If the sequence {Xk} is bounded and asymptotically regular, then every limit point of this sequence is also a fixed point solution. Proof: Property (i) is a direct consequence of continuity since if the sequence {k} converges to some fixed point x*, then the continuity of T implies that limk-+,o l[xk - T(xk)I2 = 0. Result (ii) follows from the observation that if the sequence {k } is bounded, then it has at least one limit point. Asymptotic regularity then implies that every limit point is also a fixed point solution. Q.E.D. Remark: If the sequence Ilxk - x* 11is convergent for every fixed point solution x*. then the entire sequence {(k} converges to a fixed point solution. This remark follows from property (ii) of Proposition 4 since every limit point xt is also a fixed point solution x* and the sequence llxk - x*IIG is convergent for every limit point x* = further implies that the limit point . This result of the sequence {xk} must be unique. Therefore, the entire sequence {(k} converges to a fixed point solution. 8 Corollary 3 ([38]): If T is a nonexpansive map and for each k = 1, 2,..., Xk+l = Xk + ak(T(xk) - xk), with ak E [0, 1], then the following two statements are equivalent: i. For some fixed point x*, limk.+o -x*I Ek - = 0. ii. The sequence {xk} is asymptotically regular. Corollary 4 ([38]): Let T be any coercive map. Let xk(a) + a(T(xk) - Xk) for any 0 < a < 1 = Xk and let x* be any fixed point of T. Then, i. Ixk(a) - X*IG < [1 - a(1 ii. 11k(a) - T(xk(a))IG < and - V/)]lxk - X*l11, [1 - a(l - V)]llxk - T(xk)lIG. Remark: Corollary 4 implies that for any potential function g(xk(a)) with a E [0, 1], if T is nonexpansive and x* is any fixed point of T, then the iterates Xk of the general scheme satisfy the following conditions: i. Ixk i. l[xk - x* IG is nonincreasing and, therefore, a convergent sequence. - T(xk)lG is nondecreasing and, therefore, a convergent sequence. In analyzing the convergence of the adaptive averaging framework, we will impose several conditions on the map T and on the step sizes ak} in the iterative scheme xk+l = xk + ak(T(xk) For this purpose, we will select the step sizes from a class C of sequences {ak}. - Xk). For example, the class C might be the set of all sequences generated by a family of potential functions g, that is, ak = argminaESg(xk(a)), for some step size search set S. some potential function g and xk(a) = Xk + a(T(xk) - Xk). Alternatively, C might be the set of all Dunn sequences, that is, sequences satisfying the conditions 0 < ak < 1 and k-=l ak(1 - ak) = +0o. Convergence Analysis For some positive definite, symmetric matrix G and any sequence of step sizes {ak} from class C, for a fixed point solution x* of the map T. we define the quantity __Ixk Ak(X*) - x*11_- lIT(xk)- - IiXk - T(Xk) 11 2 T(·x*) 11 + ( - ak). Observe that if T is nonexpansive, then the first term in the definition of Ak(x*) is nonnegative. If, additionally, 0 < ak 1, then the quantity Ak(x*) is nonnegative. Therefore, when 0 < ak < 1, the condition Ak(x*) > 0 provides a generalization of the condition T is a nonexpansive map. We use a generalization of this condition by imposing the following assumptions in our convergence analysis. 9 Al. 
For any fixed point x* of the map T, akAk(x*) is nonnegative. A2. For any fixed point x* of the map T, if akAk(x*) converges to zero, then every limit point of the sequence {Xk} is a fixed point solution. As we have already observed, whenever T is nonexpansive and 0 < ak < 1, assumption Al is valid. The following theorem establishes a convergence result for the general adaptive averaging framework. Theorem 1 : Consider iterates of the type Xk+l = xk + ak(T(xk) - Xk), with step sizes ak chosen from a class C. Assume that the map T has a fixed point solution x*. If the map T and the sequence {ak} in the class C generated by the adaptive averaging framework satisfy conditions Al and A2, then the sequence of iterates {Xk} converges to a fixed point solution. Proof: Let F(x) = x - T(x). Consider any fixed point x* of map T. Observe that I1Xk X* 11 G - IT(Xk) |Xk - T(Xk)IG )I= -kX Suppose F(Xk) # t T(x*) |1 + ak) = + 2(F(Xk) - F(x*)) G(Xk |IF(xk) - F(x*)1 0. Since x* is a fixed point solution of the map T, F(x*) IlXk+1 - X1 X* + = lXk - a | |F(Xk) - F(x*)|Ij - 2ak(F(Xk) - F(x*))t - akilXk - T(k)I[ Xk -1 X*) 0 and. therefore, |*i = I1Xk+1 -X* II = |1Xk -akF(k) JIjXk - = - T(k) - jXk (Xk - x*). T(X) -TT(Xk)f + (1 - ak)] That is, Xk+±1 - X*IIG = IXk - X*l2 - akAk(X ) Xk - T(xk) Relation (7) and assumption Al imply that the sequence lxk -X*IZ is convergent. This result implies that either (i) llXk - T(Xk) l .12 (7) is nonincreasing and, therefore, converges to zero and. therefore, Proposition 4 implies that every limit point of the sequence of iterates {Xzk} is also a fixed point solution, or (ii) akAk(x*) converges to zero and then assumption A2 implies that every limit point of the sequence of iterates {Xk} is a fixed point solution. If either (i) or (ii) holds, then the entire sequence of iterates {Xk} converges to a fixed point solution since as we already established in (7), IIXk - X"*11 is convergent for every fixed point x*. Q.E.D. -0 le-2kFx)-Fx)t~k-x) 10 This result extends Banach's fixed point theorem since when T is a contraction, a choice of ak = 1 for all k satisfies assumptions Al and A2 (see also Lemma 2). As we next show, this theorem also includes as special case Dunn's averaging results (see [13], [38]). Theorem 2 : Suppose that T is a nonexpansive map and the step sizes ak from the class C are a Dunn sequence and that T has a fixed point solution x*. Then the map T and the step sizes ak in the class C satisfy assumptions Al and A2. Proof: Assume as in Dunn's Theorem that T is a nonexpansive map and that ak condition Zk ak(l - ak) = +ccx. E [0, 1] satisfies the Since T is a nonexpansive map and ak E [0,1], Ak(x*) O0(that is, Al holds). To establish the validity of assumption A2, we suppose that it does not hold. If akAk(x*) converges to zero, some limit point of the sequence {zk} is not a solution. In the proof of Theorem 1, we showed that IXk+l - x* 112 - IXk - X*112 < -akAk(x*)]xk which implies (since akAk(x*) > 0) that the sequence {(Ixk - - T(xk)I, x*112} converges for every fixed point solution x*. Corollary 4 implies that since ak E [0, 1], the sequence ({ sk - T(xk) l} converges. Since we assumed that some limit point of the sequence {(k} will not be a solution, flxk -T(xk)l > B >0 for all k > k, for sufficiently large constant ko. Then for k > k0 , 2 211 < lim llk+1_X 112- lxk0 E akAk(x*)B. - k=ko But since T is nonexpansive, Sk akAk(x*) > Ek ak(l-ak) = +x. Therefore, k akAk(x*) = +xC. 
This result is a contradiction since it implies that +o > lim Ixk+l k - x*112 _x - - X*I 2 < - akAk(x )B = -c. k=ko Therefore, if akAk(x*) converges to zero, then every limit point of the sequence {(k} is a fixed point solution (that is, assumption A2 is valid). We conclude that the assumptions of Dunn's theorem imply the assumptions of Theorem 1 and, therefore, Dunn's theorem becomes a special case. Q.E.D. 11 Example: Consider the map T(x) = xv/1 -ixii over the set ({x: lxIll < 1}. The fixed point solution x* = 0 is unique. The map T is nonexpansive around solutions, since IjT(x) - T(x*)ll = Ilxllv1 -Ilxji< lix - x*. Ilxll = If we choose step sizes ak = 1 for all k, then Dunn's averaging result does not apply since ak) = 0 < +oo. Ek ak(l - Banach's fixed point theorem also does not apply since T is a nonexpansive but not a contractive map. Nevertheless, Theorem 1 ensures convergence since ak = 1 for all k is bounded away from zero and Ak(x*) 2 l1--llT(Zxk)-z*1 +1-ak and therefore, IIXk 11 112 assumptions Al and A2 hold. Therefore, Theorem 1 applies for this choice of the map T and step = Zlk- I =- = Ilk-T(k)ll sizes. Remarks: 1. Observe that the iterates we considered in Theorem 1 do not necessarily require that the step sizes lie between zero and one. 2. Theorem 1 does not require the map T to be nonexpansive. Rather it requires assumptions Al and A2. The convergence result depends upon not only the step sizes ak, but also the quantity Ak(x*) + (1 - ak) which measures both how far the step lXzk -X' II -IIT(xk)-T(x*)II = lIXk -T(Xk) 11 size ak lies from one and "how far" the map T lies from "nonexpansiveness"' relative to some C norm. Next we examine how restrictive assumptions Al and A2 are and how we can replace them with the various versions of coerciveness we defined in Definition 1. (a) If S C [0, 1] and T is a nonexpansive map relative to a G norm around fixed point solutions, then assumptions Al and A2 are valid, that is, akAk(x*) = a l xk - x* - i|T(xk) - - T(xV*)lG + ak(1 - ak) > ak(l - ak) > 0. (8) Ixk - T(xk)iG Conclusion from (a): If T is a nonexpansive map relative to a 0 norm and the set S C [0, 1], then all schemes that "repel" ak from zero and one. unless at a solution, induce a sequence of iterates that converges to a solution. (b) As a special case of subcase (a) assume that the step size search set is S = [cl. c 2] for some 0 < cl < c2 < 1 (see, for example, scheme III in the next section) and that T is a nonexpansive map. Relation (8) becomes akAk(*) > cl(1 - c2) > 0. Then assumptions Al and A2 follow easily. This observation is valid regardless of the choice of potential g. 12 (c) Theorem 1 is valid for maps T that are weaker than nonexpansive. What happens if a map T satisfies the nonexpansive-contractiveness condition around solutions x*? Then for step sizes 0 < ak < c < 1 - B, Ak(x*) > 1 - B - ak > 1 - B - c > O. Assumptions Al and A2 follow for all schemes that "repel" ak from zero unless at a solution. It is important to observe that there are maps T satisfying the nonexpansive-contractiveness condition that are weaker than nonexpansive. Conclusion from (c): If S = [0, c] C [0, 1 - B) and T is a nonexpansive-contractivemap with coercivity constant O < B < 1, then schemes that "repel" ak from zero unless at a solution induce a sequence of iterates that converges to a solution. (d) Another class of maps T weaker than nonexpansive that satisfy Theorem 1 are maps that satisfy the following condition Vx E K, for some constant B > 1. 
(9) > llx-x*ll+Bllx-T(x)112 lIT(x)-T(x*)Il Maps T satisfying this condition are expansive. Then for step sizes 0 > ak Ak(x*) < 1 - B - ak > c > 1 - B. 1 - B - c < O. Assumptions Al and A2 follow for all schemes that "repel" ak from zero unless at a solution. Conclusion from (d): If S = [c. 0] C (1 - B,0] and T is a map satisfying condition (9), then schemes that "repel" ak from zero unless at a solution induce a sequence of iterates that converges to a solution. (e) Finally, we observe that for contractive maps, if we consider step sizes in S = [0. 1] and schemes that "repel" ak from zero unless at a solution, then assumptions Al and A2 are also valid. The following lemma illustrates this observation. Lemma 2 : Let T be a contractive map relative to a 1.HllG norm. Then for any choice of step sizes ak E [0, 1] and schemes that "repel" ak from zero unless at a solution, the sequence {Xk} satisfies assumptions Al and A2. Proof: If T is a contractive map relative to the G norm with a contractive constant A E (0, 1). then Corollary 2 implies that Ak(x*) 2(F(xk) - F(x*))G(xk - X*) kIF(Xk) - F(*) _a > ) X IF(xk) - Ak- F(x*)I + > -ak k (10) 13 (1- A) Xk -X [IF(xk) -_F(x*)[[ > 0. It is easy to see that expression (10) and the fact that the sequence Ijxk - x* 1 is nonin- creasing implies assumptions Al and A2. Q.E.D. Conclusion from (e): If S = [0, 1] and T is a contractive map, then algorithmic schemes that "repel" ak from zero unless at a solution, induce a sequence of iterates that converge to a solution. The discussion in (a)-(e) suggests that in order to satisfy assumptions Al and A2 in Theorem 1, we may either (i) restrict the line searches we perform to a set S C [0, 1) or (-1,0], so that our results apply for maps that satisfy a condition weaker than nonexpansiveness, or (ii) extend the line searches to a set S D [0, 1] or [-1, 0] but as a result we might need to impose stronger assumptions either on the map T or on the step sizes ak. Generalized Norms Our analysis so far applies if we consider the conditions of contractiveness, nonexpansiveness and strictly weak nonexpansiveness relative to a .11 G norm. We observe that the analysis also applies if we impose versions of the conditions with respect to a generalized norm P. In particular, we might consider a generalized norm P : K x K -- R satisfying the following properties: 1. P(x*, T(x*)) = 0 if and only if x* is a solution. 2. P(x, x) = 0. 3. P(x,T(x)) > 0, P(x,x*) > 0. 4. P is convex relative to the first component, that is, P(y,.) is convex for all fixed y, and for all points xl, zl and a E R. some constant D > 0 satisfies the condition P(axi + (1 - a)zi, y) < aP(xl, y) + (1 - a)P(zl, y) - Dia(1 - a)P(zl, x). (If D1 > 0, then P is strictly convex). Some Examples of Generalized Norms: 1) P(x, y) = x1- ylIG or P(x. y) = x - yil for some positive definite and symmetric matrix G. 14 2) For variational inequality problems VI(f, K), let P(x, y) = f(x)t(x - y). Suppose f(x) = Mx - c and the matrix M is positive semidefinite. Then a possible choice of a map T could be T(x) E argminyeKf(x)ty. 3) For variational inequality problems VI(f, K), let P(x, y) = (f(x) - f(y))t(x - y). Using generalized norms, we can extend the notions of coerciveness as follows. Definition 4 : A map T is coercive around solutions x* relative to a generalized norm P, if for some constant A > 0 and for all x P(T(x), x*) < v, P(x,x*). If A = 1, then T is a nonexpansive map relative to the generalized norm P around solutions. 
If A < 1, then T is a contractive map relative to the generalized norm P around solutions. It will be useful in our subsequent analysis to observe that if P(x, x*) = 0 and P(T(x), x*) = 0 then P(x, T(x)) = 0. This observation follows from property 4 of the generalized norm conditions. Property 1 then also implies that x is a solution. Let us now consider a generalized norm P with the following two additional assumptions. 5. . The generalized norm P is convex with respect to the second component. That is. for a fixed x and for all points Z2, Y2 and a E R + , some constant D 2 > 0, satisfying P(x, ay2 + (1 - a)z2) < aP(x, Y2) + (1 - a)P(x, Z2) - D 2 a(1 - a)P(z2 , Y2). (If D 2 > 0, then P is strictly convex). 6. The generalized norm P satisfies the triangle inequality, P(x, y) < P(z, z) + P(z. y). Lemma 3 : Under assumptions 1-6 on the generalized norm P and for a map T that is coercive relative to P, for all a E [0. 1], P(xk(a), T(xk (a))) < [1 - a(l - 15 )] P(xk, T(xk)). Proof: The convexity assumption 4 implies that for all a E [0, 1], P(xk(a),T(xk(a))) < aP(T(xk), T(xk(a))) + (1 - a)P(xk,T(xk(a))) (then the triangle inequality 6 implies that) < P(T(xk),T(xk(a))) + (1 - a)P(xk,T(xk)) (the definition of coerciveness further implies that) < P(xk,xk(a)) + (1 - a)P(xk,T(xk)) (the convexity assumption 5 implies that) < aP(xk,T(xk)) + (1 - a)P(xk, Xk) + (1 - a)P(xk,T(xk)) (finally, assumption 2 implies that) < avP(k, T(xk)) + (1 - a)P(k,T(xk ) ). Q.E.D. In the previous analysis, we presented choices of maps T and step sizes ak that imply assumptions Al and A2. Next we illustrate a specific class of potentials satisfying assumptions Al and A2 as well. This class of potentials is only a special case. Theorem 1 also applies to other potential functions as we will show in detail in Section 3. To state these results in a more general form, we use the generalized norm concept that we have introduced in this section. 2.2 A Class of Potential Functions The previous discussion related assumptions Al and A2 to the notions of contractiveness, nonexpansiveness, and strictly weak nonexpansiveness. We next address the following natural question: Is there a class of potentials satisfying assumptions Al and A2? To provide partial answers to these questions, we consider potentials of the form g(xk(a)) = P(xk(a), T(xk(a)) 2 -/3h(a)P(xk, T(xk)) 2 . As we will see, these potentials satisfy assumptions Al and A2 if we impose the following conditions: P : K x K -* R is a continuous function, h : R -* R+ is a given continuous function, and j > 0 is a given constant satisfying the following assumptions 16 (i) P is a generalized norm (that is, satisfies assumptions 1-6 from the last section). (ii) h(O) = 0. (iii) For some point a E S n (0, 1), h(a) > 0. (iv) h is bounded from above on S n [0, 1], that is, 0 < C = SupaEsn[o,llh(a) < +oo. (v) 1 < C. The following result provides a bound on the rate of convergence of the adaptive averaging framework for choices of potentials satisfying properties (i)-(v). Proposition 5 : Consider the class of potentials g satisfying the properties (i)-(iv). The general scheme converges at a rate 2 P(xk+l,T(xk+l))2 < [1 - /3(C - h(ak))]P(xk,T(xk)) . (11) Furthermore,for step sizes ak E [0, 1] n S, P(xk+l,T(xk+l))2 < [1 - 2 2 ak(l - V)]1P(xk,T(xk)) . (12) Proof: Select a* E S n [0, 1]. The iteration of the general averaging framework implies that 9(xk+l) = P(xk+l,T(k+l))2 - /3h(ak)P(xk,T(xk))2 < P(xk(a*), T(xk(a*))) 2 - 3h(a*)P(xk, T(xk)) 2 . 
Lemma 3 implies that P(xk(a*), T(xk(a*)) 2 < P(xk, T(xk)) 2 and so g(xk+l) = P(xk+l,T(xk+l))2 - fh(ak)P(xk,T(xk)) 2 < P(xk,T(xk)) 2 -/3h(a*)P(xk,T(xk)) 2 , which, by letting h(a*) approach C, implies the inequality (11). On the other hand, if ak E [0, 1] n S. then Lemma 3 implies that 0 < P(xk+l,T(xk+l))2 < (1 - ak(l - 2 )) P(xk,T(xk)) 2 . Q.E.D. Remarks: (1) Relation (12) is valid for any choice of 3 (negative as well as positive). (2) If C is not a limit point of h(ak), then for some constant h < C. h(ak) h, for all k > ko. Therefore, for all k > ko, P(xk+l,T(xk+l))2 < [1 - 17 (C - h)]P(xk, T(xk)) 2 . (13) (3) If the step sizes ak are bounded away from zero, i.e., P(xk+1,T(xk+l)) 2 ak > c > 0 for all k > k0 , then xk,T(xk)) 2 . [1 - c(1 - V)]2P( (14) The following proposition shows that potentials satisfying properties (i)-(v) also satisfy assumptions Al and A2. Proposition 6 : Consider potentials of the type g(xk(a)) = P(xk(a), T(xk(a)) 2 - h(a)P(xk, T(xk)) 2 . Assume that P : R x R - R is a continuous function, h : R - R + is a given continuous function, and /3 > 0 is a given constant satisfying properties (i)-(v). Any such potential satisfies the following properties: a. limk-+,o ak = 0 implies that every limit point of {Xk } is a fixed point solution (that is, assumption A1 is valid). b. h(ak) < 0 implies that Xk is a fixed point solution. c. limk -+c h(ak) = 0 implies that every limit point of {xk} is a fixed point solution. d. If h(ak) < akAk(x*) and ak > O, then assumptions Al and A2 are valid. Proof: Before proving parts a-d, we observe that Proposition 5 implies that limk-+x P(xk, T(xk)) exists. a. If limk-+o ak = 0, then the iteration of the general scheme implies that lim g(xk+l) = k +oc lim P(Xk+l,T(xk+l))2 -,3h(ak)P(xk,T(k)) 2 k-+c lim P(xk,T(k))2 [1 - k-+oo = h(ak)] = lim P(xk,T(xk))2 < k¢-+x 2 lim [P(xk(a). T(xk(a)) -/3h(a)P(xk, T(xk)) 2 ], k +o for any a. Therefore, lim P(xk(a),T(xk(a))2 > (1 + /h(a)) k-+oo lim P(xk,T(xk)) 2 . k-+oc Select a E (0, 1] n S satisfying h(a) > 0. Then Proposition 5 implies that P(xk(a), T(xk(a)) 2 < [1- d(1 - V/A)] 2 P(xk,T(xk)) 2 . Therefore. [1 -( 1 - vA)] 2 lim P(xk. T(Xk)) 2 18 > (1 + h(a)) lim P(Xk T(Xk)) 2 Consequently, either limk-+, P(xk, T(xk)) = 0, implying (by property (i) and the continuity of P) that every limit point of {xk} is a fixed point solution, [1 -5(l - 1V)]2 > 1 + 3h(a). But h(d) > 0 implies that [1 - (l - v/)] 2 > 1 which is a contradiction. We conclude that for this class of potentials, assumption Al is valid. b. Select an a so that < h(a). Since < 3, such an a exists. If xk is not a fixed point solution, then P(xk, T(xk)) # O. The generalized nonexpansive property and the choice of 5 imply that g(xk(a)) = P(Xk(d),T(xk(a)) 2 - h(d)P(Xk,T(xk))2 < P(xk, T(xk)) 2 - P(Xk,T(k)) 2 = 0. But if h(ak) < O, then 2 g(xk+l) = P(Xk+l,T(Xk+l)) Therefore, 0 < g(xk+l) < g(xk(a)) < - 3h(ak)P(xk,T(xk))2 > O. 0, which is a contradiction. O. Therefore, as we have c. Suppose that limk_+,~ h(ak) = 0 and that limk_+, P(xk, T(xk)) shown in part a, lim g(Xk+l) k-+oo k--+oo P(xk+lT(Xk+l))2 = lim k--+oc P(k,T(xk))2 > 0. The condition 1 - 3C < 0 implies that for some a E [0, 1] n S, 1 -/3h(a) Lemma 3 implies that P(xk(a),T(xk(a)) k lim g(xk(a)) = -+oo k < < O0.Consequently, since P(xk,T(Xk)), lim P(xk(a),T(xk(a)) 2 -/h(a) + lim P(Xk,T(Xk)) k -+oo 2 - k lim P(xkT(xk)) lim P(xk,T(xk))2 k-+oo oo < = 0. Therefore, 0 < limk-+o g(xk+1) < limk-+o, g(xk(a)) < O, which is a contradiction. 
This result implies that limP(xk, T(xk)) = 0 and, therefore, that every limit point is a fixed point solution. d. Part b and the fact that h(O) = 0 imply that h(ak) > 0 and ak > 0 whenever xk is not a solution. Consequently, if xk is not a solution, then Ak(x*) > 0 as well (that is, assumption A2 is valid). Moreover, if limk_+o Ak(x*) = 0. then limk-+,o h(ak) = 0. Then part c implies that every limit point of {Xk} is a fixed point solution (that is, assumption A2 is valid). Q.E.D. Remark: For example, if h(a) = a(l - a) and as > 4, then we can choose any so that 5 < h(a). 19 - < a < 2+ Potential Optimizing Methods: Specific Cases 3 In this section we apply the results from the previous section to special cases. We consider several choices for the potential function g and examine the convergence behavior for the sequence they generate. Table I summarizes some of these results. Let 3 E R be a given constant and g(xk(a)) = Ik(a) - T(xk(a))llG + 3llxk(a) - (15) k(0)G. We consider several choices for the constant 3. Scheme I: Let S = [0, 1] and > 0, then (15) becomes gl(xk(a)) = lIxk(a)-T(xk(a))I[+/3lIxk(a)- d Xk(0) This potential stems from (5) and the observation that we do not allow the iterates to lie "far away" from each other. Lemma 4 Suppose T is a contraction relative to the I.IIG norm and 3 limk-+, 0, the sequence ak = Proof: If limk-+, lim ak = Then if converges to a solution (that is, assumption Al is valid). 0, then IXk - T(xk) 1 < k-+oo (since {Xk} 1 - A. lim IT(xk) - T 2 (xk) k +c + /3 lim lxk - T(xk)l < k +oo /3 < 1 - A) (A + 3) k-lim IlXk +oo T(Xk) < lim I1Xk -T(Xk) k-~+ao Therefore, limk-+o Hlxk - T(xk)ll2 = 0. Proposition 4 and Corollary 4 imply that the sequence {(k} converges to a fixed point solution. Q.E.D. Theorem 3 : If T is a contractive map relative to the I.HIG norm, with a contraction constant 0 < A < 1, then the sequence induced by scheme I converges to a solution for any choice 0 < /3 < 1 - A. Proof: · First, we restrict the line searches so that ak E [0, 1]. In this case, the proof follows from Theorem 1. In fact, Lemma 4 implies that for any choice /3 < 1 - A, the step sizes ak satisfy assumption Al. Moreover, Lemma 2 implies that Ak(x*) satisfies assumptions Al and A2. · More generally we perform unrestricted line searches in the sense of choosing step sizes ak E R+. 20 Then the proof follows since the general iteration of the algorithm implies that Ilxk+ - T(Xk+l)11G + a2klXk - T(Xk)11 < IIXk - T(xk) G. Then IIXk+1 - T(Xk+l)G - < -a2IIxk --IIk T(xk)I11 - T(xk)2 < 0. Therefore, lim ak Xk - T(xk)[11 = 0. k -+oo This result implies that (i) either limk-+, ak = 0, or (ii) limk-+, Ixk - T(xk)[[~ = 0. In the former case, Lemma 4 implies that for any choice 3 < 1 - A, the sequence {Xk} converges to a solution. In case (ii), Proposition 3 implies that the sequence {(k} converges to a solution. Q.E.D. As the next corollary shows, scheme I extends to also include nonexpansive affine maps relative to the II.IG norm. Corollary 5 : If T is an affine, nonexpansive map, then the iterates {(k} of scheme I converge to a fixed point solution. Proof: Let dk = T(xk) -Xk. 
matrix, and since lXk = For affine problems, the map MI = I - T is a positive semidefinite -dk, ak dtMtGdk 2 + ldkll IMdkll (16) Observe that if we let G = MtGM + fG in the definition of Ak(x*), then replacing this choice of 0 and accounting for the fact that MtG is positive semidefinite, we see that Ak(X*) > MdMtGdk + (Xk - x*)tM'ItG(xk - x*) -IIMdkll + (F(xk) - F(x*))tG(xk - x*) Ad dk1 + ,311dkil dlldkl 2 Strong-f-monotonicity of f (or, equivalently, nonexpansiveness of T) easily implies that Ak(x*) > 0 and that if either ak or Ak(x*) converge to zero, then {Xk} converges to a solution (that is, assumptions Al and A2 hold). Theorem 1 implies the conclusion. Q.E.D. Lemma 5 : If T is affine and is domain and range coercive with coercivity constants A and B, then all iterates with dk # 0 satisfy the inequality ak > 1-B + 2(1+V)2+!) (-A1 (l-B) 2 If T is nonexpansive, that is 0 < A < 1 and B = O, then for any choice of 3 < 1 - A, ak is bounded away from 1/2. Proof: Lemma 1 implies, since T is domain and range coercive with coercivity constants A and B, 4 that wtGMw > 1-'lwII ak + i-B IIwli. Moreover, if dk M Gdk dtk/Gdk ak jIMdkI ± jd|k1 0, then Proposition 3 implies that - - (1 - B) B + 11-A-(1-(17) >1-B 2 21 # 2(1 + V-A)2 + ) Q.E.D. The following proposition provides a rate of convergence for scheme I. Proposition 7 : Formaps T that are contractive relative to the G norm with a contraction constant A satisfying the condition 3 < 1 - A, the iterates generated by scheme I satisfy the estimates: IlXk+l - T(xk+l)ll < [A + (1 - a)]xk - (18) T(xk)ll-. Proof: The iteration of scheme I implies the inequality IjXk+1 - T(xk+l)!|G + al lxk - T(xk)|HG < llT(xk) - T2 ( xk)[ + /3Ixk - T(xk)[l1. Since the map T is a contraction, [Ixk+1 - T(xk+l)11G < (A + 3(1 - a))llxk - T(k)llG, implying (18). If 3 < 1 - A, then I - T is a contraction with coercivity constant 0 < A + 3 < 1. Q.E.D. Remark: If ak >> 1, then scheme I has a strictly better rate of convergence than the classical iteration (2). Observe when T is an affine, firmly contractive map, then a choice of 0 < 3 << iA and Lemma 5 imply that ak = dt Mt Gdk Idkl+ >> 1. Requiring the iterates to lie close to each other might be too much of a restriction. In fact, Proposition 7 shows that the rate of convergence of scheme I is, in the nonlinear contractive case, worse than the rate of convergence of the classic iteration (2) unless the step size ak is greater than 1. Therefore, motivated as before by the minimization problem (5), we consider the following potential. Scheme II: Let 3 = 0 in expression (15). Then g2(xk(a)) = lxk(a) - T(xk(a)) [ and S = R + . Itoh, Fukushima and Ibaraki [28] have studied a line search scheme of this type in the context of unconstrained variational inequality problems. They consider only strongly monotone problem functions, which correspond to contractive fixed point maps. Remark: Let T(x) = x - lzIx + c be an affine, nonexpansive map relative to the IiG norm and let dk T(xk) - xk. Then ak = argmin{a>o} g2(xk(a)) = argmin{a>O}l[Mxk(a) - c 22 G -- I (dk) = Lemma 6 : If T is an affine, contractive map relative to the II.IIG norm, with coercivity constant 0 < A < 1, then for all iterates k with dk bounded away from 0 O, ak > 2 + 'A Therefore, if A < 1, then ak is 2. Proof: The proof follows from Lemma 5 with/3 = B = 0. These results show that ak > in the nonexpansive case (when A = 1). 
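The affine remark above admits a compact implementation. The following sketch (our own test data; G = I throughout, so the closed-form step reads a_k = d_k^t M d_k / ||M d_k||^2) runs scheme II on an affine map T(x) = x - Mx + c whose matrix M has symmetric part I. For this choice, T is nonexpansive but not contractive (it is a rotation plus a shift), so Banach's theorem does not apply to iteration (2), yet Theorem 4 guarantees convergence of scheme II.

```python
import numpy as np

def scheme_II_affine(M, c, x0, iters=200, tol=1e-12):
    """Scheme II with exact line search for T(x) = x - Mx + c and G = I."""
    x = np.asarray(x0, dtype=float)
    for _ in range(iters):
        d = -(M @ x - c)                     # d_k = T(x_k) - x_k = -F(x_k)
        if np.linalg.norm(d) < tol:          # x is (numerically) a fixed point
            break
        Md = M @ d
        a = float(d @ Md) / float(Md @ Md)   # a_k = argmin_a ||x_k(a) - T(x_k(a))||^2
        x = x + a * d
    return x

# M has symmetric part I and M^t M = 2I, so T = I - M (plus the shift c) acts as a
# nonexpansive rotation: the classical iteration (2) orbits x*, but scheme II converges.
M = np.array([[1.0, -1.0],
              [1.0,  1.0]])
c = np.array([1.0, 2.0])
x_star = np.linalg.solve(M, c)               # unique fixed point, i.e. the solution of Mx = c
x = scheme_II_affine(M, c, np.zeros(2))
print(np.linalg.norm(x - x_star))            # ~0
```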
Lemma 7 : If T is a contractionwith coercivity constant A, then limk-,, ak = 0 implies that every limit point of the sequence of iterates {xk} is a fixed point solution (that is, assumption Al holds). Proof: If limk,, ak = 0, but limk c IIXk - T(xk)G O. Then, ItXk - T(xk)lIG < IIT(xk) - T(T(Xk))llG Since T is a contraction mapping, I[[T(xk)-T(T(xk)) llG < AIXk -T(Xk) T(xk)Ij' < Alimk-,, 1Gand, therefore, limk_, llxk - Ixk-T(xk)lG, which is a contradiction. Consequently limk-.oc I[Xk-T(Xk)llG = Oand so Proposition 4 implies the conclusion. Q.E.D. Remark: The proof of this lemma also follows from Proposition 6 part a with hi = 0. The class of potential described in this proposition need to satisfy properties (i)-(v), and in particular, to be contractive with constant 0 < A < 1. The following theorem shows when scheme II works. Theorem 4 : If T is an affine, nonexpansive map relative to the l.llG norm, or if T is a nonlinear, contractive map relative to the IIc norm with coercivity constant A, then the sequence that scheme II induces converges to a fixed point solution. Proof: First, we illustrate how the proof of this theorem follows from Theorem 1. Lemma 7 implies that the step sizes ak satisfy assumption Al. Moreover, 1) If T is a contractive map relative to the 1l.llG norm and our linesearch is restricted in [0, 1]. then Lemma 2 implies assumptions Al and A2. 2) In the affine, nonexpansive case, we do not need to restrict the linesearch. In fact, dt 4GA(dk) ak = k]~l. II~dkII, 1 > - 2 and so a choice of G = MtGAI implies, using strong-f-monotonicity, that Ak(x*) = ak > . That is. assumptions Al, Al and A2 are valid. In both cases, Theorem 1 implies the result. 23 Alternatively, for nonlinear, contractive mappings the result follows from the observation that G < T(xk) - T(T(xk)) 11 < Allk - IiXk+1 - T(xk+l) )lG. T(Xk This result implies that the sequence {(Ixk -T(xk)ll } converges to zero. Proposition 3 then implies the conclusion. Q.E.D. The following proposition provides a rate of convergence for scheme II. Proposition 8 : For nonlinear, contractive maps relative to the 11.IG norm, with contraction constant A E (0, 1), the line search we considered in scheme II gives rise to the estimates: (19) IIXk+1 - T(Xk+1)G < AlXk - T(Xk) 11. Proof: For nonlinear, contractive mappings T, IlT(xk) - T(T(xk))11G < AiT(xk) -Xk ]lXk+l - T(k+l)l]_< G. Q.E.D. In the following discussion, to improve on the rate of convergence of the general scheme, we consider potentials that involve a penalty term that pulls the iterates away from zero unless they are approaching a solution. Scheme III: Let /3 < 0 in expression (15). This is equivalent to replacing : with - and letting 3 > 0. Then g3(xk(a)) = Ilxk(a) - T(xk(a))fl 1Ixk(a) - - k(0)ll. in (15) Moreover, we set S = [0,cl], with 0 < cl < 1. Alternatively, letting P(x,T(x)) = Ilx - T(x)IIG and h(a) = a2 and / > 0. g3(xk(a)) = P(xk(a),T(xk(a))) 2 -/3h(a)P(xk, T(xk)) 2. Remarks: 1) For affine, nonexpansive maps relative to the I-IlG norm, we compute ak = argmin({a[O,cj]}93(xk(a)). Let ak = IlMdk ll-Idkl' if 1Mdk11 < 3ldk1 or k > cl then ak = cl, otherwise, ak = ak 2) Observe that when T is a contractive map with a contractive constant A < 1. Proposition 2 implies that for a choice of < (±1A)2, llldk -5-dk-l > 24 1. Lemma 8 : Let T be an aftfine, contractive map, with a contractive constant 0 < A < 1. Then if IIMdk11' >/311dkl11 and p < 4, ajk > 1 - 2 + /3+1-A )2 _ 3) 2((1 + Proof: The proof follows from Lemma 5 with -/ replacing In the nonexpansive case, A = 1. 
Therefore, that ak > and B = 0. Q.E.D. /3 i + 2+ Lemma 9 : If T is a nonexpansive map relative to the IIl11i - norm, then Al holds. Proof: Observe that in scheme III, h(a) = a2 , a E [O,cl]. Consequently, this lemma becomes a special case of part a in Proposition 6. Q.E.D. Theorem 5 : If T is a nonexpansive map relative to the I.lcG norm with a coercivity constant 0 < A < 1 and ak < c1 < 1, then the sequence induced by scheme III converges to a fixed point solution. Proof: First, let cl < 1. The facts that ak < cl < 1 and T is a nonexpansive map imply that Ak(x*) > 1 - c1 > 0 and, as a result, that assumptions Al and A2 are valid. Moreover, Lemma 9 implies assumption A2 and Theorem 1 implies the result. If we let cl = 1, then we need to assume that T is a contractive map, that is, 0 < A < 1. - xI - IXk Assumptions Al and A2 are valid because Ak(x*) > (1- A) llF(xk)-F( x*)ll,' . Theorem 1 again implies the result. Q.E.D. The following proposition provides a rate of convergence for scheme III. Proposition 9 : For a contractive (or nonexpansive) map T relative to the G norm with a coercivity constant 0 < A < 1, the line search we consider in scheme III gives rise to estimates: Ilxk+1 -T(xk+l)11 Proof: In scheme III, P(x, y) = 11x - < (A -/3(c YIIG -a2)))lxk -T(xk)112 , (20) < ci < 1. and h(a) = a2 . a E [0, cl], with 0 < cl < 1. Then a* = cl and C = c2. Then as in the proof of Proposition 5, the iteration of scheme III implies that 1lxk+1 - T(Xk+l) - aklXk - T(xk)lG < lxk(c) - T(xk(cl)) - 33cllxk - T(Xk) Therefore, |IXk+1 - T(xk+l)t11 < (A - 3(c Q.E.D. 25 2 - a)))lXk - T(xk) 11G . Remarks: 1. Observe that when the map T is contractive, 0 < A < 1, scheme III is at least as good as the classical iterative method (2). Scheme III has a better rate of convergence than the classical iterative method (2) when, for example, the step sizes ak converge to zero. 2. The choice of potential in scheme III requires that for nonexpansive maps, we perform a line search with step sizes bounded away from 1. "How far" should the step sizes we consider lie away from 1 ? Perhaps considering step sizes bounded away from 1 is too much of a restriction. For this reason, we modify the potential function of scheme III as follows, Scheme IV: S =R g4(xk(a)) = Ilxk(a) - T(Xk(a))IG - 1[Ixk(a) - xk(O)llGjxk(a) - Xk(l)][G, and set +. Proposition 10 : For nonexpansive maps T relative to the G norm, 1 12 IXk+1 - T(xk+l) G _< • xk()2 - T(xk( 2 ))11G - (ak- 122 2 )21 )Ik - T(xk) I G< (21) (21) 1 [1 - Proof: Observe that P(x, y) = lx - yli] (ak - )2]lxk - T(xk)ll G. and a* = . Therefore, the result follows from Proposition 5. Q.E.D. Remarks: 1. The sequence {(1xk - T(xk)llG} is nonincreasing and, therefore, converges despite the fact that we did not restrict the line search in the set S = [0, 1]. 2. Proposition 10 provides a convergence rate for scheme IV. Lemma 10 : If T is a nonexpansive map relative to the limk+, ak = IG norm, then Al is valid. Moreover, if 1, then every limit point of the sequence of iterates {Xk} is a fixed point solution. Proof: The proof follows from parts a and c in Proposition 6. Q.E.D. The following theorem provides a convergence result as well as a convergence rate for this choice of potential function. Theorem 6 : For nonexpansive maps T relative to the G norm, the sequence {(k} that scheme IV generates converges to a fixed point solution. 26 Proof: Lemma 10 implies that assumption Al is valid. Proposition 10 implies that if a limit point of {Xk } is not a solution, then ak converges to 2. 
This result combined with.the nonexpansiveness of the map T, implies that for k0o large enough, for all k > ko, Ak(x*) > 1 - ak > 0, that is, assumption A2 applies. The proof then follows from Theorem 1. Q.E.D. Remark: For affine, nonexpansive maps T relative to the 11.IlG norm, ak 2 l/3lidk Il + 2d GMdk 112dkl + 211Mdkll' Lemma 11 : Suppose T is an affine, nonexpansive map relative to the I.11G norm, with a coercivity constant 0 < A < 1. Then ak > 1 1-A + 2 23 + 2(1 + V)2 - If T is a contraction, that is, 0 < A < 1, then ak is bounded away from 2. Proof: ak > Therefore, whenever dk 0, ak > ak11dkllG + IlMdkll2 + (1 - A)ldkll 211dk11 + 21IMdkl12 2 + +2(1+-) 2,3+2(1+V/)2' Q.E.D. Observe that in the contractive case, a choice of 3 > 1 - (1A)2 implies that all ak < 1. Corollary 6 : For affine, nonexpansive maps T relative to the G norm, the sequence {Xk} that scheme IV generates converges to a fixed point solution. Proof: In the affine case of scheme IV, ak = 211dk1l12+2d1G(1)d 2011dk jj+2jMdk 11' 2MtGM+2/3G in the definition of Ak (x*), then Ak (*) = we choose sce since ½ f is strongly- > 2 . Then if 2d' Mt Gdk+3I11dk 11 k + = >> 2 f-monotone relative to the 11.11G norm. Therefore, assumptions Al and A2 are valid and so Theorem 1 implies the result. Q.E.D. Remarks: 1. The previous analysis does not restrict the line search to the set S = [0. 1]. If we do restrict the search to S = [0. 1] then the following theorem implies convergence. Theorem 7 : If T is a nonexpansive map relative to the G norm, then the sequence that scheme IV generates converges to a fixed point solution. 27 Proof: The proof follows from Theorem 1. Lemma 10 implies assumption Al. Moreover, assumptions Al and A2 follow from Lemma 2. Q.E.D. 2. Observe that the previous analysis did not restrict the choice of the constant /3. If we choose / > 4, then the line search in scheme IV always yields a step length ak < 1 unless the current point is a solution. This result follows from property (2) of Proposition 6. 3. We can view scheme III as a form of scheme IV if we can find 0 < a4 (k) < 1 for which a3 (k) = v/a4(k)(1 - a4 (k)). Then /a 4 (k)( -a xk(a3) = xk(a4) = Xk + 4 (k))(T(xk) Xk) and the potential 93(xk(a3)) = 93(Xk(a4)) = Ixk(a4) - T(xk(a 4 ))11G - a4 (k)(l - a4(k))llxk - T(xk)Ic. Note that we can find these values a4 (k) only if a3 (k) < . More generally, if a3 (k) < cl < 1, then we can find 0 < a4 (k) 1 and a positive integer m, so that a 3 (k) is the mth root of a4 (k)(1 - a4 (k)). 4. Suppose ak is bounded away from for a choice of /3 > -, by a constant 0 < c If T is a contractive map, then the rate of convergence for scheme IV is better than that of the classical iterative method (2). For example, in the contractive case, if the sequence of the step sizes ak converges either to zero or to a constant , then scheme IV strictly improves the rate of convergence of the classical iterative method (2). Scheme V: gs(xk(a)) = [(xk(a) - T(xk(a))) t ( xk - T(xk))] 2 , and S = R +. This method is the classical steepest descent method (see [47]) as studied by Hammond and Magnanti [23], as applied to solving asymmetric system of equations: find x* E K satisfying F(x*) = x* T(x*) = 0. Remark: If T(x) = x - Mx + c is an affine map and dk = -(Mxk - c), then mina>o[(xk(a) - T(xk(a))) t (xk - T(xk))] 2 = mina>o[(AIxk(a) - c)t (lxk implies that ak lldk l2 dt ldk 28 - c)]2 . ak = 0 implies that every limit point of Lemma 12 : If T is a contraction mapping, then limk -+, the sequence of iterates {xk} is a fixed point solution. 
Proof: Assume T is a contraction map with coercivity constant A E (0, 1). If limk-+x ak then limk-+ [(xk - T(xk))t(xk - T(xk))] 2 result implies that limk-+ < limk--+[(T(xk) - T 2 (xk))t(xk - T(xk))] 2 . 0, This 4 2 4 = I xk - T(xk) 11 < limk-+c~ A [Ixk - T(xk)11 , but since A < 1, this is a contradiction, unless limk-+ Ixk - T(xk)IG = 0. Q.E.D. Theorem 8 : If T = x - Mx + c is an affine map, with M and M 2 positive definite matrices, then the sequence {Xk} that scheme V induces converges to the solution. Proof: See [23]. This result also follows from Theorem 1. Lemma 12 implies assumption Al. Moreover, since ak = t, dk Mdk' assumptions Al and A2 hold when M2 and I are positive definite matrices. = M+Mlt. 2 To show that assumptions Al and A2 are valid. we select Ak(x*) - (xk - X*)tA/ 2 (Xk - x*) + (k - x*)tAItMI(xk - x*) _ ak (xk - x*)tMt( MM)(t (replacing ak = Then Ak(x*) becomes xk x*) - d,d (xk - x*)tM 2( (xk - x*)tM xk - x*) + dkdk 2( xk - x*) dtMdk dt Mdk Therefore, whenever MI2 is a positive definite matrix, Ak (x*) = (xk - x*)tM 2 (xk - x*) dt Mdk (for xk # x*), t , i2+(A/2) _2 Amin( Amax (MtGMI) ) = c> 0. Q.E.D. The following proposition characterizes the rate of convergence of scheme V. Proposition 11 : If T = I - MI is an affine mapping and M and M 2 are positive definite matrices and I= M+M a then the sequence induced by scheme V contracts to a solution through the estimate, Xk+1 -X* • [1- Amin((M n~~~ma(I 29 )]lihX - x*111. (22) Proof: See [23]. Example: Let K = R n and T(x) = [x 2, -xl]. Then x* = (0, 0) is the solution of the fixed point problem FP(T, Rn). The steepest descent algorithm starting from the point x ° = (1, 1) generates the iterates x1 = (1, -1), X2 = (-1, -1), 3 4 = (-1,1), x = XO = (1, 1) and, therefore, the algorithm cycles. Remark: [ 1-1 is positive definite, but M 2 = 0 -2 is positive 1 0 semidefinite. This example illustrates that the choice of potential we considered in scheme V could In this example M = I - T = produce iterates that cycle unless the matrices M and M 2 are positive definite. -2 1 r -1 1 X*=(O,O) X 1 -1 Figure 2: Cycling Example for Scheme V Scheme VI: To remedy this cycling behavior, we modify the potential function in scheme V by introducing a penalty term as follows, g6 (xk(a)) = [(xk(a) - T(xk(a) )G(xk - T(Xk))] 2- /[(xk(a) - Xk()) G(xk - T(xk))].[(xk(a) - Xk(l)) G(xk - T(xk))]. We choose S = R + . We first make the following observation. Lemma 13 : If T is a nonlinear, nonexpansive map relative to the /3 > 4 implies that if ak > 1, then xk is a fixed point solution. 30 H11.G norm, then a choice of Proof: The remark at the end of Section 2 implies that potential g6 satisfies (i)-(v) in Subsection 2.2. Then this lemma follows from part b of Proposition 6. Q.E.D. Remark: Consider an affine map T(x) = x - Mx + c, and let ak = t G d k) 0ldk 1+21ldk 112(d' M t 2 Gdk) k 2[Illd l+(d~M ] +(dMtGdk) 2[I3IdkI Then the step length solution ak to the problem minaE[o,1]g6(Xk(a)), is ak = min1, ak}. (23) We observe the following, (1) If dk # O, then ak > 1/2. (2) If ak > 1, then a choice of /3> 1 implies that t Gdk)2 + ( - 1)lldkllG + (lldkll 0 < (dMA - dtM t Gdk)2 O. This result implies that dk = 0 and, therefore, that xk is a fixed point solution. Lemma 14 : Let T be a nonexpansive map relative to the G norm and suppose nonlinear case (or > 4 in the general > 1 in the affine case). If either ak converges to zero or to one, then every limit point of the sequence {zk} is a fixed point solution. 
Proof: If ak converges to zero, then lim g(xk(O))= lim k -+oc k-+oeo If limk-+o dk IdklG < lim k +oo g(xk(a)) < [1 - 3a(1 O, then for all a E (0,1), limk-+,c g(xk(O)) > [1 - -a)] pa(1 - lim k-+oo Idk G (24) a)]lldkllG, contradicting (24). We conclude that limk-,+o dk = 0 and therefore, every limit point of xk is a fixed point solution. Moreover, if ak converges to one, then if T is an affine map, if we choose 0 < lim (d'MItGdk)2 + (/ - 1) lim k--+oo This result implies that k-+oc Ildk l + lim (dkG - d MIt'dk) 2 > 1, = 0. k-+oc lldkllG converges to zero. Therefore, Proposition 4 implies that every limit point of the sequence {xk} is a fixed point solution. If T is a nonlinear, nonexpansive map then if ak converges to one, then property c. of Proposition 6 implies that for a choice of / > 4, every limit point of the sequence {xk} is a fixed point solution. Q.E.D. 31 Theorem 9 : If T is a nonexpansive map relative to the 1I.[IG norm and > 4 (or 3 > 1 in the affine case), then the sequence that scheme VI generates converges to a fixed point solution. Proof: Lemma 14 together with the nonexpansiveness of T imply that assumptions Al and A2 are valid. The proof follows from Theorem 1. Q.E.D. Remarks: 1) Observe that scheme VI extends the steepest descent method to include nonexpansive maps and does not require any positive definiteness condition on the square of the Jacobian matrix. Moreover, scheme VI provides a global convergence result even when the map T is nonlinear. 2) In this section we have considered special cases of the general scheme we introduced in Theorem 1. Several other schemes we have not presented in this section are also a special case of our general scheme. In particular, consider a variational inequality problem. Then Proposition 1 states that variational inequality VI(f, K) is equivalent to a fixed point problem FP(T.K). The operator T might be a projection operator T = PrG(I - pG-lf) or perhaps be defined so that T(x) = y, and y is a solution of a simpler variational inequality or minimization subproblem (see for example [12], [20], [44], [56]). Fukushima [20] considered the variational inequality problem VI(f, K). He considered the projection operator T(x) = PrG (x-pG-f(x)) = argminyeK[f(x)t(Yy- x)+ 1 y-x2 ]. The fixed point solutions corresponding to this map T are in fact the solutions of variational inequality VI(f, K) (see Proposition 1). Using the potential g(x) = -f(x)t (T(x- x)x) - -lT(x) 2 xG, (25) Fukushima established a scheme that, when f is strongly monotone. computes a variational inequality solution. Namely, each iteration of the scheme computes a point k+1 = xk(ak) = Xk + ak(T(xk) - Xk), with ak = argmina[o,1]g(xk(a)). We observe that this scheme is in fact a special case of our general scheme for the choice of potential g that we described in expression (25). Observe that when f(x) = x - T(x), the potential in (25) is the same as the potential we considered in scheme II. Taji, Fukushima and Ibaraki [52] established an alternative scheme for strongly monotone variational inequality problems, using an operator T that generates points through a Newton procedure. Zhu and Marcotte [58] modified Fukushima's scheme [20] to also include monotone problems. WNu. Florian and Marcotte [56] have generalized Fukushima's scheme. They considered as T the operator that maps a point x in the set G(x) = argminyEK[f(x)t (y - x) + 32 1O(x.y)]. 
Wu, Florian and Marcotte [56] have generalized Fukushima's scheme. They considered as T the operator that maps a point x to

G(x) = argmin_{y∈K} [f(x)^t (y - x) + (1/ρ) φ(x, y)].

The function φ : K × K → R has the following properties: (a) φ is continuously differentiable, (b) φ is nonnegative, (c) φ is uniformly strongly convex with respect to y, (d) φ(x, y) = 0 is equivalent to x = y, and (e) ∇_x φ(x, y) is uniformly Lipschitz continuous on K with respect to x. Observe that when φ(x, y) = (1/2)||x - y||^2_G, then T(x) becomes the projection operator as in Fukushima [20]. The fixed point solutions corresponding to this map T are also solutions of the variational inequality problem VI(f, K) (see [56] for more details). Furthermore, Wu, Florian and Marcotte [56] considered the potential

g(x) = -f(x)^t (T(x) - x) - (1/ρ) φ(T(x), x).   (26)

Notice that, as in the previous cases, this scheme becomes a special case of our general scheme for the choice of potential g given in (26). The convergence of these schemes follows from Theorem 1, where we established the convergence of the general scheme. Observe that, as we established in Lemma 2, when f is strongly monotone (which the developers of these schemes impose), assumptions A1 and A2 hold for schemes that "repel" a_k from zero unless the iterate is at a solution. Furthermore, observe that this assumption indeed follows for these schemes (as in Lemma 7) using the descent property of the potential functions involved in each of the previous schemes. For a proof of the latter, see [20] and [56]. Finally, for all these schemes (see also Section 5), we can apply an Armijo-type procedure to compute inexact solutions to the line search procedures. This computation is easy to perform.
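Returning to the subproblem that defines the operator T above, the following small check (an illustrative sketch with assumed data, not taken from the paper) confirms numerically that, with φ(x, y) = (1/2)||y - x||^2 and a box constraint set, the Wu-Florian-Marcotte subproblem reduces to the projection map used by Fukushima.

import numpy as np

# Check:  argmin_{y in K} [ f(x)'(y - x) + (1/rho)*(1/2)||y - x||^2 ]  ==  clip(x - rho f(x))
# for a box K; the data (rho, box, affine f) are assumptions made for illustration.
rho, lo, hi = 0.5, 0.0, 1.0
A = np.array([[3.0, 1.0], [-1.0, 2.0]])
b = np.array([-1.0, 2.0])
f = lambda x: A @ x + b

def subproblem(x, n_grid=401):
    # the objective is separable across coordinates, so solve it coordinate by coordinate
    ys = np.linspace(lo, hi, n_grid)
    obj = lambda yi, i: f(x)[i] * (yi - x[i]) + (1.0 / rho) * 0.5 * (yi - x[i]) ** 2
    return np.array([ys[np.argmin(obj(ys, i))] for i in range(len(x))])

x = np.array([0.7, 0.2])
print(subproblem(x))                      # approximately equal (up to grid resolution) to
print(np.clip(x - rho * f(x), lo, hi))    # the projection map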
4 On the Rate of Convergence

In this section, we examine the following question: when do the general scheme we introduced in this paper and its special cases exhibit a better rate of convergence than the classic iteration (2)? To illustrate this possibility, we first establish a "best" step size under which the adaptive averaging scheme achieves a better convergence rate.

Proposition 12: Let

a*_k = (x_k - T(x_k))^t G (x_k - x*) / ||x_k - T(x_k)||^2_G.

If a*_k ≠ 1, then for some step size, namely a*_k, the general iteration x_{k+1} = x_k(a*_k) = x_k + a*_k(T(x_k) - x_k) has a rate of convergence at least as good as the classic iteration (2). Moreover, if |a*_k - 1| ≥ c > 0, then it has a better rate of convergence.

Proof: First observe that

||x_k(a) - x*||^2_G = ||x_k - x*||^2_G - 2a (x_k - T(x_k))^t G (x_k - x*) + a^2 ||x_k - T(x_k)||^2_G   (27)

is a strictly convex function of a. Moreover, if x_k ≠ T(x_k), then

argmin_a ||x_k(a) - x*||^2_G = (x_k - T(x_k))^t G (x_k - x*) / ||x_k - T(x_k)||^2_G = a*_k.

Therefore, since a*_k minimizes ||x_k(a) - x*||_G over a, we have ||x_k(a*_k) - x*||^2_G ≤ ||T(x_k) - T(x*)||^2_G ≤ A ||x_k - x*||^2_G. Moreover, expression (27) implies that

||x_k(a*_k) - x*||^2_G = ||T(x_k) - T(x*)||^2_G - (a*_k - 1)^2 ||x_k - T(x_k)||^2_G.

Proposition 3 implies that

||x_k(a*_k) - x*||^2_G ≤ [A - (a*_k - 1)^2 (1 - √A)^2] ||x_k - x*||^2_G.

Therefore, if |a*_k - 1| ≥ c > 0, then ||x_k(a*_k) - x*||^2_G ≤ [A - c^2 (1 - √A)^2] ||x_k - x*||^2_G, implying that the sequence {x_k(a*_k)} has a better rate of convergence than the sequence {T(x_k)}. Q.E.D.

Remark: Which maps T give rise to step sizes a*_k ≈ 1, and which to step sizes |a*_k - 1| ≥ c? We next examine this question.

1. Let T be a nonexpansive map that is tight, that is, ||T(x) - T(x*)||_G = ||x - x*||_G. Then

a*_k = (x_k - T(x_k))^t G (x_k - x*) / ||x_k - T(x_k)||^2_G = 1/2,

and so |a*_k - 1| = 1/2. Observe that in this case we can achieve the "best" rate of convergence by moving half way at each step.

2. Let T be a firmly nonexpansive map that is not tight, that is, ||T(x) - T(x*)||^2_G < ||x - x*||^2_G - ||x - T(x) - x* + T(x*)||^2_G. Setting A = 1, B = -1 and y = x* in Lemma 1, stated as a strict inequality, implies that this condition is equivalent to

(x - T(x) - x* + T(x*))^t G (x - x*) > ||x - T(x)||^2_G.

Consequently, a*_k = (x_k - T(x_k))^t G (x_k - x*) / ||x_k - T(x_k)||^2_G > 1.

3. Let T be a firmly contractive map. Then setting y = x* and B = -1 in Lemma 1 implies that

(x - T(x) - x* + T(x*))^t G (x - x*) ≥ ||x - T(x)||^2_G + ((1 - A)/2) ||x - x*||^2_G.

Then Proposition 3 implies that a*_k - 1 ≥ (1 - A) / (2(1 + √A)^2) > 0.

4. Finally, let T be contractive but with a constant A "close" to 1; that is, suppose that for some constant λ ∈ (0, 1),

A ||x - x*||^2_G ≥ ||T(x) - T(x*)||^2_G ≥ ||x - x*||^2_G - λ ||x - T(x)||^2_G.

Then 1 - a*_k ≥ (1 - λ)/2 > 0.

The previous analysis shows that for certain types of maps and for some step sizes, adaptive averaging provides a better rate of convergence than the classical iterative scheme x_{k+1} = T(x_k). We have shown that to achieve the improved convergence rate, we need to choose the step size as a*_k. Since a*_k involves a fixed point solution x*, its computation is not possible. Accordingly, we extend our choice of step sizes to establish an allowable range of step sizes for which adaptive averaging schemes have a better rate of convergence. Moreover, we show that the schemes we studied in the previous section have step sizes within this range.

Proposition 13: For fixed point problems with maps T satisfying the condition

|a*_k - 1| = |(x - T(x))^t G (x - x*) / ||x - T(x)||^2_G - 1| ≥ c > 0, for all x ∈ K,   (28)

adaptive averaging schemes with choices of step size a_k lying within the range

a*_k - |a*_k - 1| + d ≤ a_k ≤ a*_k + |a*_k - 1| - d,   (29)

with 0 < d < c, have a better rate of convergence than the classic iteration (2).

Proof: Using expression (27) we see that

||x_k(a) - x*||^2_G = ||T(x_k) - T(x*)||^2_G + (a^2 - 2a a*_k + 2a*_k - 1) ||x_k - T(x_k)||^2_G.

Therefore, the binomial a^2 - 2a a*_k + 2a*_k - 1 ≤ -d^2 for choices of step sizes a satisfying the condition a*_k - |a*_k - 1| + d ≤ a ≤ a*_k + |a*_k - 1| - d, with 0 < d < c. Consequently, we need |a*_k - 1| ≥ c > 0, that is, condition (28). Maps of the type we discussed in the previous remark satisfy this inequality. A choice of step sizes a_k lying within the range (29) then provides the rate of convergence

||x_k(a_k) - x*||^2_G ≤ ||T(x_k) - T(x*)||^2_G - d^2 ||x_k - T(x_k)||^2_G.

Using the fact that T is contractive and Proposition 3, we obtain

||x_k(a_k) - x*||^2_G ≤ [A - d^2 (1 - √A)^2] ||x_k - x*||^2_G.   (30)

Q.E.D.

Corollary 7: For fixed point problems with maps T satisfying expression (28), a choice of step sizes within (29) guarantees a better rate of convergence than the classic iteration (2). Furthermore,

||x_{k+1} - x*||^2_G ≤ ||T(x_k) - T(x*)||^2_G - (a_k - 1)(A_k(x*) - 1) ||x_k - T(x_k)||^2_G ≤ [A - (a_k - 1)(A_k(x*) - 1)(1 - √A)^2] ||x_k - x*||^2_G.   (31)

Proof: Expression (31) follows similarly to our development of (7). Moreover, observe that A_k(x*) = 2a*_k - a_k. Therefore, to obtain a better rate of convergence than the classical iteration (2), we need (a_k - 1)(A_k(x*) - 1) bounded away from 0. Since (a_k - 1)(A_k(x*) - 1) = -(a_k^2 - 2a_k a*_k + 2a*_k - 1), the conclusion follows as in Proposition 13. Q.E.D.
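The algebra behind expression (27) and Proposition 13 is easy to check numerically. The sketch below (illustrative only; the affine contractive map and the data are assumptions, not taken from the paper) verifies, with G = I, the identity ||x(a) - x*||^2 = ||T(x) - x*||^2 + (a^2 - 2a a* + 2a* - 1) ||x - T(x)||^2 for several step sizes a.

import numpy as np

# Verify the identity used in the proof of Proposition 13 (G = I).
rng = np.random.default_rng(0)
B = 0.6 * np.array([[0.2, -0.9], [0.9, 0.2]])     # contractive affine map T(x) = Bx + c (assumed)
c = np.array([1.0, -0.5])
T = lambda x: B @ x + c
x_star = np.linalg.solve(np.eye(2) - B, c)        # fixed point of T

x = rng.normal(size=2)
d = x - T(x)
a_star = d @ (x - x_star) / (d @ d)
for a in rng.uniform(0.0, 2.0, size=5):
    lhs = np.linalg.norm(x + a * (T(x) - x) - x_star) ** 2
    rhs = np.linalg.norm(T(x) - x_star) ** 2 + (a**2 - 2*a*a_star + 2*a_star - 1) * (d @ d)
    print(f"a = {a:.3f}   lhs = {lhs:.6f}   rhs = {rhs:.6f}")   # lhs and rhs agree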
Remarks: 1) Condition (29) is equivalent to assuming that

(a_k - 1)(A_k(x*) - 1) ≥ d^2 > 0.   (32)

Moreover, it is equivalent to assuming that A_k(x*) ∈ [min{1, 2a*_k - 1} + d, max{1, 2a*_k - 1} - d]. Observe that when (29) (or (32)) holds, then assumptions A1 and A2 also follow.

2) In the following discussion, we show that the step sizes we used in the schemes of Section 3 satisfy (29). Therefore, the schemes we studied in Section 3 exhibit better rates of convergence than the classical iteration (2). In discussing these methods, we let a^j_k and A^j_k(x*), with j = I, ..., VI, denote the step size and the quantity A_k(x*) for scheme j at the kth iteration. In particular, consider the affine problem with I - T = M. We need to keep in mind that d_k = x_k - T(x_k) = M(x_k - x*) and a*_k = d_k^t G (x_k - x*) / ||d_k||^2_G.

• Consider scheme I. If T is firmly contractive, then relation (29) follows. If we set G = M^t M + βI, then for firmly contractive maps T a choice of β < (1 - A)/2 implies that

a^I_k - 1 ≥ (1 - A - 2β) ||d_k||^2 / (2(||M d_k||^2 + β ||d_k||^2)),

which implies, using Proposition 3, that

a^I_k - 1 ≥ (1 - A - 2β) / (2(β + (1 + √A)^2)) = d > 0.

Therefore, (a^I_k - 1)(A^I_k(x*) - 1) = (a^I_k - 1)(2a*_k - a^I_k - 1) ≥ d^2 with d = (1 - A - 2β)/(2(β + (1 + √A)^2)), that is, relation (29).

• Consider scheme II. If T is contractive, then relation (29) follows: for an appropriate choice of the matrix G in a^II_k (with G = I in a*_k), the step size a^II_k coincides with a*_k, which satisfies relation (29).

• Consider scheme III. If T is contractive, then relation (29) follows. For G = I, the step size satisfies a^III_k ≤ c_1 < 1 (with c_1 the constant of Theorem 5), so that 1 - a^III_k ≥ 1 - c_1 = d > 0; a similar argument bounds 1 - A^III_k(x*) from below. Therefore, (1 - a^III_k)(1 - A^III_k(x*)) ≥ d^2, that is, relation (29).

• Consider scheme IV. If T is firmly contractive, then relation (29) follows. If we set G = I, then

a^IV_k = (β ||d_k||^2 + 2 d_k^t M d_k) / (2β ||d_k||^2 + 2 ||M d_k||^2).

If T is a firmly contractive map, Lemma 1 with B = -1 and Proposition 3 imply that a choice of β < 1 - A guarantees that

a^IV_k - 1 ≥ (1 - A - β) / (2(β + (1 + √A)^2)) = d > 0,

and that A^IV_k(x*) - 1 = 2a*_k - a^IV_k - 1 ≥ d. Therefore, a choice of β < 1 - A guarantees that (a^IV_k - 1)(A^IV_k(x*) - 1) ≥ d^2, that is, relation (29).

• Consider scheme V, that is, the steepest descent method. If T is contractive and M is symmetric, then relation (29) follows. When M = M^t, a choice of G = I in a^V_k and G = M in a*_k implies that a^V_k = a*_k, which satisfies relation (29).

• Consider scheme VI. If T is contractive and M is symmetric, then relation (29) follows. For an appropriate choice of β (in terms of A and a constant c > 1) and G = I,

a^VI_k ≥ [β ||d_k||^4 + 2 ||d_k||^2 (d_k^t M^t d_k)] / (2[β ||d_k||^4 + (d_k^t M^t d_k)^2]),

so that a^VI_k - 1 ≥ c - 1 = d > 0 and A^VI_k(x*) - 1 = 2a*_k - a^VI_k - 1 ≥ a^VI_k - 1 ≥ c - 1 = d > 0; the last inequality is valid since d_k^t M d_k ≤ ||d_k|| ||M d_k|| ≤ (1 + √A) ||d_k||^2 ≤ 2 ||d_k||^2. Therefore, (a^VI_k - 1)(A^VI_k(x*) - 1) ≥ d^2, that is, relation (29).

Finally, we need to consider how our previous results extend to the general nonlinear case. That is, when do the general scheme we introduced in Section 2, as well as the special cases we studied in Section 3, satisfy relation (29) for general nonlinear operators T? In that case, these schemes would demonstrate a better rate of convergence than the classical iteration (2). Observe that if, for example, a*_k - 1 ≥ c - 1 = d > 0, then in order to establish (29), we need to determine for which choices of G and β, and for what maps T, the step sizes satisfy a_k ≤ a*_k.
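To illustrate the rate comparison of this section on the simplest "tight" nonexpansive map, the sketch below (illustrative only; it uses the known solution x* = 0 solely to report distances) contrasts the classical iteration with averaging at the "best" step size a* = 1/2 for the 90-degree rotation.

import numpy as np

# For the 90-degree rotation T, a tight nonexpansive map, a* = 1/2.  The classical
# iteration x_{k+1} = T(x_k) never approaches x* = 0, while averaging with a = 1/2
# contracts the distance to x* by a factor 1/sqrt(2) per step.
T = lambda x: np.array([x[1], -x[0]])

x_classic = np.array([1.0, 1.0])
x_avg = np.array([1.0, 1.0])
for k in range(10):
    x_classic = T(x_classic)
    x_avg = x_avg + 0.5 * (T(x_avg) - x_avg)
print("classical:", np.linalg.norm(x_classic))   # stays at sqrt(2)
print("a = 1/2  :", np.linalg.norm(x_avg))       # roughly sqrt(2) * (1/sqrt(2))**10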
5 Inexact Line Searches

A natural question arises when attempting to implement any of the methods we have examined in Section 3: how easy is it to perform the line searches? The line searches might not be easy to perform in the general nonlinear case. For this reason, in this section we consider inexact line searches. We determine step sizes at each step by applying an Armijo-type rule of the following form. For positive constants D > 0 and 0 < b < 1, find the smallest integer l_k so that the step length a_k = b^{l_k} satisfies the condition

g_i(x_k(a_k)) - g_i(x_k(0)) ≤ -D b^{l_k} ||d_k||^2_G.   (33)

In this expression, g_i denotes the potential function we have used previously for scheme i. We next show that the Armijo-type inexact line search (33) works for the various choices of potentials we considered in Section 3, if we impose appropriate assumptions on the fixed point map T. Since the potentials we considered in Section 3 involve a term ||x_k(a) - T(x_k(a))||^2_G, we first make some preliminary observations concerning this term; that is, we first examine scheme II.

Theorem 10: Consider a fixed point problem FP(T, K). Let T be a contractive map relative to the ||·||_G norm with coercivity constant A ∈ (0, 1). Then a choice of D ≤ 1 - √A in an Armijo-type search (33) applied to the potential g_2(x(a)) = ||x(a) - T(x(a))||^2_G generates a sequence that converges to a fixed point solution.

Proof: We first need to show that the Armijo-type line search (33) has a solution. To see why this is true, we observe that the contractiveness of the map T and Corollary 4 imply that all 0 ≤ a ≤ 1 satisfy the inequality

g_2(x_k(a)) - g_2(x_k) ≤ -(1 - √A) a ||d_k||^2_G.

Therefore, if we choose D ≤ 1 - √A, then all step lengths 0 ≤ a ≤ 1 satisfy the Armijo-type rule (33). If in the Armijo-type inequality (33) we choose a step length a_k = b^{l_k} ≤ 1, so that l_k is the smallest integer satisfying (33), then 1 ≥ a_k ≥ b, which implies that

||x_k - x*||^2_G - ||x_{k+1} - x*||^2_G ≥ a_k^2 ||x_k - T(x_k)||^2_G.   (34)

The observation that 1 ≥ a_k ≥ b then implies that ||x_k - x*||^2_G - ||x_{k+1} - x*||^2_G ≥ b^2 ||x_k - T(x_k)||^2_G. This result in turn implies that the sequence ||x_k - T(x_k)||^2_G converges to zero. Proposition 4 then implies that the entire sequence {x_k} converges to a fixed point solution. Q.E.D.
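A minimal sketch of the Armijo-type rule (33) applied to the potential g_2(x) = ||x - T(x)||^2 follows; the contractive test map and the constants D and b are illustrative choices (with D ≤ 1 - √A, as Theorem 10 requires), not values from the paper.

import numpy as np

# Armijo-type rule (33) for g_2(x) = ||x - T(x)||^2 with G = I on an assumed
# contractive affine map; here ||T(x)-T(y)||^2 <= A ||x-y||^2 with A = 0.64.
B = 0.8 * np.array([[0.0, -1.0], [1.0, 0.0]])
c = np.array([0.5, 1.0])
T = lambda x: B @ x + c
g2 = lambda x: np.linalg.norm(x - T(x)) ** 2

D, b = 0.1, 0.5          # D <= 1 - sqrt(A) = 0.2, as Theorem 10 requires

x = np.array([5.0, -3.0])
for k in range(40):
    d = x - T(x)
    a, l = 1.0, 0        # try a = b^l for l = 0, 1, 2, ...; take the first that satisfies (33)
    while g2(x - a * d) - g2(x) > -D * a * (d @ d):
        l += 1
        a = b ** l
    x = x - a * d
print("||x - T(x)|| =", np.linalg.norm(x - T(x)))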
Theorem 11: Consider fixed point problems FP(T, K) with nonexpansive maps T relative to the ||·||_G norm. If we apply the Armijo-type line search (33) to the potential functions g_i, i = 3, 4, then the sequence {x_k} that these line searches generate converges to a fixed point solution.

Proof: We first need to show that the Armijo-type line searches (33) have a solution. In Corollary 4 we have shown that for nonexpansive maps T,

||x_k(a) - T(x_k(a))||^2_G - ||x_k - T(x_k)||^2_G ≤ 0, for all 0 ≤ a ≤ 1.

Applying this result to potential functions of the type g_i(x_k(a)) = ||x_k(a) - T(x_k(a))||^2_G - β h_i(a) ||x_k - T(x_k)||^2_G implies that

g_i(x_k(a)) - g_i(x_k) ≤ -β h_i(a) ||x_k - T(x_k)||^2_G, for all 0 ≤ a ≤ 1.   (35)

Therefore, the step sizes a_k will satisfy the Armijo-type line search condition (33) as long as β h_i(a_k) ≥ D a_k. To complete the proof, we next consider the specific choices of h_i(a) that correspond to the choices of potentials g_i, i = 3, 4, that we considered in Section 3.

1) Consider the potential g_3(x_k(a)) = ||x_k(a) - T(x_k(a))||^2_G - β ||x_k(a) - x_k(0)||^2_G. As we have shown in Theorem 5, we consider only step sizes a ≤ c_1. We need to select constants D and β so that D/β ≤ c_1 < 1. In this case it is easy to see that, since h_3(a) = a^2, all step sizes D/β ≤ a ≤ c_1 satisfy the inequality β h_3(a) ≥ D a. As a result, these step sizes satisfy the inexact Armijo-type inequality (33) as well. Choosing a_k = b^{l_k}, so that l_k is the smallest integer satisfying (33), implies that a_k ≥ b and, therefore, max(b, D/β) ≤ a_k ≤ c_1. Convergence follows since max(b, D/β) ≤ a_k ≤ c_1 implies, as we argued in Theorem 1, that

||x_k - x*||^2_G - ||x_{k+1} - x*||^2_G ≥ c a_k^2 ||x_k - T(x_k)||^2_G ≥ c max(b, D/β)^2 ||x_k - T(x_k)||^2_G,   (36)

for a positive constant c. Expression (36) implies that the sequence {x_k} converges to a fixed point solution.

2) Consider the potential g_4(x_k(a)) = ||x_k(a) - T(x_k(a))||^2_G - β ||x_k(a) - x_k(0)||_G ||x_k(a) - x_k(1)||_G. Then h_4(a) = a(1 - a), implying that for a choice of β > D, all step sizes 0 ≤ a ≤ 1 - D/β satisfy the Armijo-type inequality (33). Moreover, choosing a_k = b^{l_k}, with l_k the smallest integer satisfying (33), implies that a_k ≥ b. Therefore, the Armijo step size a_k lies in the interval [b, 1 - D/β]. As we have shown in Theorem 7,

||x_k - x*||^2_G - ||x_{k+1} - x*||^2_G ≥ a_k(1 - a_k) ||x_k - T(x_k)||^2_G,   (37)

and the right-hand side is bounded away from zero relative to ||x_k - T(x_k)||^2_G since a_k ∈ [b, 1 - D/β]. Arguments similar to those used in the proof of Theorem 7 then imply that the sequence {x_k} converges to a fixed point solution. Q.E.D.

6 Conclusions - Open Questions

In this paper we introduced adaptive averaging schemes for solving fixed point problems. They allowed us to show how to solve classes of fixed point problems whose maps satisfy conditions weaker than nonexpansiveness. We considered a general scheme for determining step sizes dynamically by optimizing a potential function. We considered several choices of potential functions that measure, in some sense, "how far" the current point lies from the image of the current point under the fixed point mapping. We established convergence results for these choices of potential functions. Moreover, we studied when our general scheme has a better rate of convergence than the classic iteration (2). The line searches we proposed might be hard to perform exactly. For that reason, we also considered inexact line searches.

Several open questions naturally follow from our analysis:

• How does the behavior of the schemes we introduced compare for the various choices of potentials in practice?

• It might be preferable to consider averages of the type x̄_k = (a_1 x_1 + a_2 T(x_1) + ... + a_k T(x_{k-1}))/(a_1 + ... + a_k) and optimize potentials involving all the a_i's, instead of moving along x_k(a) = x_k + a_k(T(x_k) - x_k) and optimizing potentials involving only a_k.

• Would these results still be valid if we impose conditions that are weaker than nonexpansiveness?

• Can we establish rates of convergence for other choices of potentials?

References

[1] A. Auslender (1976), Optimisation: Méthodes Numériques, Masson, Paris.

[2] J.B. Baillon (1975), "Un théorème de type ergodique pour les contractions non linéaires dans un espace de Hilbert", C.R. Acad. Sc. Paris, Série A, t. 280, 1511-1514.

[3] D.P. Bertsekas (1976), "On the Goldstein-Levitin-Polyak gradient projection method", IEEE Transactions on Automatic Control, 21, 174-184.

[4] D.P. Bertsekas (1982), "Projected Newton methods for optimization problems with simple constraints", SIAM Journal on Control and Optimization, 20, 221-246.

[5] D.P. Bertsekas (1996), Nonlinear Programming, Athena Scientific, Belmont, MA.

[6] K.C. Border (1989), Fixed Point Theorems with Applications to Economics, Cambridge University Press.

[7] F.E. Browder (1967), "Convergence of approximants to fixed points of nonexpansive nonlinear mappings in Banach spaces", Arch. Rational Mech. Anal., 24, 82-90.

[8] F.E. Browder and W.V. Petryshyn (1967), "Construction of fixed points of nonlinear mappings in a Hilbert space", Journal of Mathematical Analysis and Applications, 20, 197-228.
[9] R.E. Bruck, Jr. (1973), "The iterative solution of the equation y ∈ x + T(x) for a maximal monotone map T in Hilbert space", Journal of Mathematical Analysis and Applications, 48, 114-126.

[10] R.E. Bruck, Jr. (1977), "On the weak convergence of an ergodic iteration for the solution of variational inequalities for monotone operators in Hilbert space", Journal of Mathematical Analysis and Applications, 61, 159-164.

[11] P.H. Calamai and J.J. Moré (1987), "Projected gradient methods for linearly constrained problems", Mathematical Programming, 39, 93-116.

[12] S.C. Dafermos (1983), "An iterative scheme for variational inequalities", Mathematical Programming, 26, 40-47.

[13] J.C. Dunn (1973), "On recursive averaging processes and Hilbert space extensions of the contraction mapping principle", Journal of the Franklin Institute, 295, 117-133.

[14] J.C. Dunn (1980), "Convergence rates for conditional gradient sequences generated by implicit step length rules", SIAM Journal on Control and Optimization, 18, 5, 473-487.

[15] J.C. Dunn (1980), "Newton's method and the Goldstein step length rule for constrained minimization problems", SIAM Journal on Control and Optimization, 18, 659-674.

[16] J.C. Dunn (1981), "Global and asymptotic convergence rate estimates for a class of projected gradient processes", SIAM Journal on Control and Optimization, 19, 3, 368-400.

[17] J.C. Dunn and S. Harshbarger (1978), "Conditional gradient algorithms with open loop step size rules", Journal of Mathematical Analysis and Applications, 62, 432-444.

[18] J. Eckstein (1989), Splitting Methods for Monotone Operators with Applications to Parallel Optimization, Ph.D. thesis, Laboratory for Information and Decision Systems, MIT, Cambridge, MA.

[19] J. Eckstein and D.P. Bertsekas (1992), "On the Douglas-Rachford splitting method and the proximal point algorithm for maximal monotone operators", Mathematical Programming, 55, 293-318.

[20] M. Fukushima (1992), "Equivalent differentiable optimization problems and descent methods for asymmetric variational inequality problems", Mathematical Programming, 53, 1, 99-110.

[21] D. Gabay (1983), "Applications of the method of multipliers to variational inequalities", in M. Fortin and R. Glowinski, eds., Augmented Lagrangian Methods: Applications to the Numerical Solution of Boundary-Value Problems, North-Holland, Amsterdam.

[22] B. Halpern (1967), "Fixed points of nonexpansive maps", Bulletin of the American Mathematical Society, 73, 957-961.

[23] J.H. Hammond and T.L. Magnanti (1987), "Generalized descent methods for asymmetric systems of equations", Mathematics of Operations Research, 12, 678-699.

[24] P.T. Harker (1993), "Lectures on computation of equilibria with equation-based methods", CORE Lecture Series, Center for Operations Research and Econometrics, University of Louvain, Louvain-la-Neuve, Belgium.

[25] P.T. Harker and J.S. Pang (1990), "Finite-dimensional variational inequality and nonlinear complementarity problems: a survey of theory, algorithms and applications", Mathematical Programming, 48, 161-220.

[26] T. Hoang (1991), "Computing fixed points by global optimization methods", in Fixed Point Theory and Applications (M.A. Théra and J.-B. Baillon, eds.), 231-244.

[27] S. Ishikawa (1976), "Fixed points and iteration of a nonexpansive mapping in a Banach space", Proceedings of the American Mathematical Society, 59, 1, 65-71.
[28] T. Itoh, M. Fukushima and T. Ibaraki (1988), "An iterative method for variational inequalities with application to traffic equilibrium problems", Journal of the Operations Research Society of Japan, 31, 1, 82-103.

[29] S. Kaniel (1971), "Construction of a fixed point for contractions in Banach space", Israel Journal of Mathematics, 9, 535-540.

[30] W.A. Kirk (1990), Topics in Metric Fixed Point Theory, Cambridge Studies in Advanced Mathematics, 28, Cambridge University Press, New York.

[31] W.A. Kirk (1991), "An application of a generalized Krasnoselskii-Ishikawa iteration process", in Fixed Point Theory and Applications (M.A. Théra and J.-B. Baillon, eds.), 261-268.

[32] D. Kinderlehrer and G. Stampacchia (1980), An Introduction to Variational Inequalities and Their Applications, Academic Press, New York.

[33] K. Knopp (1947), Theory and Application of Infinite Series, Hafner Publishing Company, New York.

[34] A.N. Kolmogorov and S.V. Fomin (1970), Introduction to Real Analysis, Dover Publications, New York.

[35] I. Konnov (1994), "A combined relaxation method for variational inequalities with nonlinear constraints", preprint.

[36] T.L. Magnanti and G. Perakis (1995), "A unifying geometric solution framework and complexity analysis for variational inequalities", Mathematical Programming, 71, 327-351.

[37] T.L. Magnanti and G. Perakis (1997), "The orthogonality theorem and the strong-f-monotonicity condition for variational inequality algorithms", SIAM Journal on Optimization, 7, 1, 248-273.

[38] T.L. Magnanti and G. Perakis (1997), "Averaging schemes for variational inequalities and systems of equations", Mathematics of Operations Research, 22, 3, 568-587.

[39] W.R. Mann (1953), "Mean value methods in iteration", Proceedings of the American Mathematical Society, 4, 506-510.

[40] P. Marcotte and J.-P. Dussault (1987), "A note on a globally convergent Newton method for solving monotone variational inequalities", Operations Research Letters, 6, 1, 35-42.

[41] P. Marcotte and J.H. Wu (1995), "On the convergence of projection methods: application to the decomposition of affine variational inequalities", Journal of Optimization Theory and Applications, 85, 2, 347-362.

[42] Z. Opial (1967), "Weak convergence of the successive approximations for non-expansive mappings in Banach spaces", Bulletin of the American Mathematical Society, 73, 123-135.

[43] J.M. Ortega and W.C. Rheinboldt (1970), Iterative Solution of Nonlinear Equations in Several Variables, Academic Press, New York.

[44] J.S. Pang and D. Chan (1982), "Iterative methods for variational and complementarity problems", Mathematical Programming, 24, 284-313.

[45] J.S. Pang (1994), "Complementarity problems", in Handbook of Global Optimization (R. Horst and P. Pardalos, eds.), Kluwer Academic Publishers, 271-329.

[46] G.B. Passty (1979), "Ergodic convergence to a zero of the sum of monotone operators in Hilbert space", Journal of Mathematical Analysis and Applications, 72, 383-390.

[47] G. Perakis (1993), Geometric, Interior Point, and Classical Methods for Solving Finite Dimensional Variational Inequality Problems, Ph.D. dissertation, Department of Applied Mathematics, Brown University, Providence, RI.

[48] R.T. Rockafellar (1976), "Monotone operators and the proximal point algorithm", SIAM Journal on Control and Optimization, 14, 877-898.

[49] M. Sibony (1970), "Méthodes itératives pour les équations et inéquations aux dérivées partielles non linéaires de type monotone", Calcolo, 7, 65-183.

[50] R.C. Sine (1983), Fixed Points and Nonexpansive Mappings, Contemporary Mathematics, 18, American Mathematical Society.
[51] J. Sun (1990), "An affine scaling method for linearly constrained convex programs", preprint, Department of Industrial Engineering and Management Sciences, Northwestern University.

[52] K. Taji, M. Fukushima and T. Ibaraki (1993), "A globally convergent Newton method for solving strongly monotone variational inequalities", Mathematical Programming, 58, 3, 369-383.

[53] M.J. Todd (1976), The Computation of Fixed Points and Applications, Lecture Notes in Economics and Mathematical Systems, 124, Springer-Verlag.

[54] P. Tseng (1990), "Further applications of a matrix splitting algorithm to decomposition in variational inequalities and convex programming", Mathematical Programming, 48, 249-264.

[55] P. Tseng, D.P. Bertsekas and J.N. Tsitsiklis (1990), "Partially asynchronous, parallel algorithms for network flow problems and other problems", SIAM Journal on Control and Optimization, 28, 3, 678-710.

[56] J.H. Wu, M. Florian and P. Marcotte (1993), "A general descent framework for the monotone variational inequality problem", Mathematical Programming, 61, 3, 281-300.

[57] D.L. Zhu and P. Marcotte (1996), "Co-coercivity and its role in the convergence of iterative schemes for solving variational inequalities", SIAM Journal on Optimization, 6, 3, 714-726.

[58] D.L. Zhu and P. Marcotte (1993), "Modified descent methods for solving the monotone variational inequality problem", Operations Research Letters, 14, 2, 111-120.