Qamrul Hasan Ansari

Advanced Functional Analysis

Department of Mathematics
Aligarh Muslim University, Aligarh
E-mail: qhansari@gmail.com

SYLLABUS
M.A. / M.Sc. II SEMESTER
ADVANCED FUNCTIONAL ANALYSIS

Course Title: Advanced Functional Analysis
Course Number: MMM-2009
Credits: 4
Course Category: Compulsory
Prerequisite Courses: Functional Analysis, Linear Algebra, Real Analysis
Contact Course: 4 Lecture + 1 Tutorial
Type of Course: Theory
Course Assessment: Sessional (1 hour) 30%
End Semester Examination: (2:30 hrs) 70%

Course Objectives: To discuss some advanced topics from Functional Analysis, namely orthogonality, orthonormal bases, orthogonal projections, bilinear forms, spectral theory of continuous linear operators, differential calculus on normed spaces, and the geometry of Banach spaces. These topics play a central role in research and in the advancement of various branches of mathematics.

Course Outcomes: After undertaking this course, students will understand:
◮ spectral theory of continuous linear operators
◮ orthogonality, orthogonal complements, orthonormal bases
◮ orthogonal projection, bilinear forms and the Lax-Milgram lemma
◮ differential calculus on normed spaces
◮ geometry of Banach spaces

Syllabus

UNIT I: Orthogonality, Orthonormal Bases, Orthogonal Projection and Bilinear Forms (14 lectures)
Orthogonality, Orthogonal complements, Orthonormal bases, Orthogonal projections, Projection theorem, Projection on convex sets, Sesquilinear forms, Bilinear forms and their basic properties, Lax-Milgram lemma

UNIT II: Spectral Theory of Continuous Linear Operators (13 lectures)
Eigenvalues and eigenvectors, Resolvent operators, Spectrum, Spectral properties of bounded linear operators, Compact linear operators on normed spaces, Finite dimensional domain and range, Sequences of compact linear operators, Weak convergence, Spectral theory of compact linear operators

UNIT III: Differential Calculus on Normed Spaces (14 lectures)
Gâteaux derivative, Gradient of a function, Fréchet derivative, Chain rule, Mean value theorem, Properties of Gâteaux and Fréchet derivatives, Taylor's formula, Subdifferential and its properties

UNIT IV: Geometry of Banach Spaces (15 lectures)
Strict convexity, Modulus of convexity, Uniform convexity, Duality mapping and its properties, Smooth Banach spaces, Modulus of smoothness

Total: 56 lectures

Recommended Books:

1. Q. H. Ansari: Topics in Nonlinear Analysis and Optimization, World Education, Delhi, 2012.
2. Q. H. Ansari, C. S. Lalitha and M. Mehta: Generalized Convexity, Nonsmooth Variational Inequalities and Nonsmooth Optimization, CRC Press, Taylor and Francis Group, Boca Raton, London, New York, 2014.
3. C. Chidume: Geometric Properties of Banach Spaces and Nonlinear Iterations, Springer, London, 2009.
4. M. C. Joshi and R. K. Bose: Some Topics in Nonlinear Functional Analysis, Wiley Eastern Limited, New Delhi, 1985.
5. E. Kreyszig: Introductory Functional Analysis with Applications, John Wiley and Sons, New York, 1989.
6. M. T. Nair: Functional Analysis: A First Course, Prentice-Hall of India Private Limited, New Delhi, 2002.
7. A. H. Siddiqi: Applied Functional Analysis, CRC Press, London, 2003.

1 Orthogonality, Orthonormal Bases, Orthogonal Projection and Bilinear Forms

Throughout these notes, 0 denotes the zero vector of the corresponding vector space, and ⟨·, ·⟩ denotes the inner product on an inner product space.
1.1 1.1.1 Orthogonality and Orthonormal Bases Orthogonality One of the major differences between an inner product and a normed space is that in an inner product space we can talk about the angle between two vectors. Definition 1.1.1. The angle θ between two vectors x and y of an inner product space X is defined by the following relation: cos θ = hx, yi . kxk kyk (1.1) Definition 1.1.2. Let X be an inner product space whose inner product is denoted by h., .i. (a) Two vectors x and y in X are said to be orthogonal if hx, yi = 0. When two vectors x and y are orthogonal, we denoted by x ⊥ y. (b) A vector x ∈ X is said to be orthogonal to a nonempty subset A of X, denoted by x⊥A, if hx, yi = 0 for all y ∈ A. 5 Qamrul Hasan Ansari Advanced Functional Analysis Page 6 (c) Let A be a nonempty subset of X. The set of all vectors orthogonal to A, denoted by A⊥ , is called the orthogonal complement of A, that is, A⊥ = {x ∈ X : hx, yi = 0 for all y ∈ A}. A⊥⊥ = (A⊥ )⊥ denotes the orthogonal complement of A⊥ , that is, A⊥⊥ = (A⊥ )⊥ = {x ∈ X : hx, yi = 0 for all y ∈ A⊥ }. (c) Two subsets A and B of X are said to be orthogonal, denoted by A⊥B, if hx, yi = 0 for all x ∈ A and all y ∈ B. Clearly, x and y are orthogonal if and only if the angle θ between is 90◦ , that is, cos θ = 0 which is equivalent to (in view of (1.1)) hx, yi = 0 ⇔ x ⊥ y. Remark 1.1.1. (a) Since hx, yi = hy, xi (conjugate of hy, xi) hx, yi = 0 implies that hy, xi = 0 or hy, xi = 0 and vice versa. Hence, x⊥y if and only if y ⊥ x, that is, all vectors in X are mutually orthogonal. (b) Since hx, 0i = 0 for all x, x ⊥ 0 for every x belonging to an inner product space. By the definition of the inner product, 0 is the only vector orthogonal to itself. (c) Clearly, {0}⊥ = X and X ⊥ = {0}. (d) If A ⊥ B, then A ∩ B = {0}. (e) Nonzero mutually orthogonal vectors, x1 , x2 , x3 , . . . , xn , of an inner product space are linearly independent (Prove it!). Example 1.1.1. Let A = {(x, 0, 0) ∈ R3 : x ∈ R} be a line in R3 and B = {(0, y, z) ∈ R3 : y, z ∈ R} be a plane in R3 . Then A⊥ = B and B ⊥ = A. Example 1.1.2. Let X = R3 and A be its subspace spanned by a non-zero vector x. The orthogonal complement of A is the plane through the origin and perpendicular to the vector x. Example 1.1.3. Let A be a subspace of R3 generated by the set {(1, 0, 1), (0, 2, 3)}. An element of A can be expressed as x = (x1, x2, x3 ) = λ(1, 0, 1) + µ(0, 2, 3) = λi + 2µj + (λ + 3µ)k ⇒ x1 = λ, x2 = 2µ, x3 = λ + 3µ. Thus, the element of A is of the form x1 , x2 , x1 + 23 x2 . The orthogonal complement of A can be constructed as follows: Let x = (x1 , x2 , x3 ) ∈ A⊥ . Then for y = (y1 , y2 , y3 ) ∈ A, we have 3 hx, yi = x1 y1 + x2 y2 + x3 y3 = x1 y1 + x2 y2 + x3 y1 + y2 2 3 = (x1 + x3 ) y1 + x2 + x3 y2 = 0. 2 Qamrul Hasan Ansari Advanced Functional Analysis Page 7 Since y1 and y2 are arbitrary, we have 3 x1 + x3 = 0 and x2 + x3 = 0. 2 Therefore, ⊥ A 3 = x = (x1 , x2 , x3 ) : x1 = −x3 , x2 = − x3 2 3 = x ∈ R3 : x = −x3 , − x3, x3 . 2 Exercise 1.1.1. Let A be a subspace of R3 generated by the set {(1, 1, 0), (0, 1, 1)}. Find A⊥ . Answer. A⊥ is the straight line spanned by the vector (1, −1, 1). Theorem 1.1.1. Let X be an inner product space and A be a subset of X. Then A⊥ is a closed subspace of X. Proof. Let x, y ∈ A⊥ . Then, hx, zi = 0 for all z ∈ A and hy, zi = 0 for all z ∈ A. Since for arbitrary scalars α, β, hαx + βy, zi = αhx, zi + βhy, zi = 0, we get hαx + βy, zi = 0; that is, αx + βy ∈ A⊥ . So A⊥ is a subspace of X. 
To show that A⊥ is closed, let {xn } ∈ A⊥ such that xn → y. We need to show that y must belongs to A⊥ . Since xn ∈ A⊥ , hx, xn i = 0 for all x ∈ X and all n. Since h., .i is a continuous function, we have lim hx, xn i = lim hxn , xi = h lim xn , xi = hy, xi = 0. n→∞ n→∞ n→∞ Hence, y ∈ A⊥ . Exercise 1.1.2. Let X be an inner product space and A and B be subsets of X. Prove the following assertions: (a) A ∩ A⊥ ⊆ {0}. A ∩ A⊥ = {0} if and only if A is a subspace. (b) A ⊆ A⊥⊥ . (c) If B ⊆ A, then B ⊥ ⊇ A⊥ . Proof. (a) If y ∈ A ∩ A⊥ and y ∈ A⊥ , then y ∈ {0}. If A is a subspace, then 0 ∈ A and 0 ∈ A ∩ A⊥ . Hence, A ∩ A⊥ = {0}. (b) Let y ∈ A, but y ∈ / A⊥⊥ . Then there exists an element z ∈ A⊥ such that hy, zi = 6 0. ⊥ Since z ∈ A , hy, zi = 0 which is a contradiction. Hence, y ∈ A⊥⊥ . (c) Let y ∈ A⊥ . Then hy, zi = 0 for all z ∈ A. Since every z ∈ B is an element of A, we have hy, zi = 0 for all z ∈ B. Hence, y ∈ B ⊥ , and so B ⊥ ⊃ A⊥ . Qamrul Hasan Ansari Advanced Functional Analysis Page 8 Exercise 1.1.3. Let X be an inner product space and A and B be subsets of X. Prove the following assertions: (a) If A ⊆ B, then A⊥⊥ ⊆ B ⊥⊥ . (b) A⊥ = A⊥⊥⊥ . (c) If A is dense in X, that is, A = X, then A⊥ = {0}. (d) If A is an orthogonal set and 0 ∈ / A, then prove that A is linearly independent. Exercise 1.1.4. Let A be a nonempty subset of a Hilbert space X. Show that (a) A⊥⊥ = spanA; (b) spanA is dense in X whenever A⊥ = {0}. Hint: See [5], pp. 149. Exercise 1.1.5. Let A be a nonempty subset of a Hilbert space X. Show that A is closed if and only if A = A⊥⊥ . The well-known Pythagorean theorem of plane geometry says that the sum of the squares of the base and the perpendicular in a right-angled triangle is equal to the square of the hypotenuse. Its infinite-dimensional analogue is as follows. Theorem 1.1.2. Let X be an inner product space and x, y ∈ X. Then for x ⊥ y, we have kx + yk2 = kxk2 + kyk2. Proof. Note that kx + yk2 = hx + y, x + yi = hx, xi + hy, xi + hx, yi + hy, yi. Since x⊥y, hx, yi = 0 and hy, xi = 0, we have kx + yk2 = kxk2 + kyk2 . Exercise 1.1.6. Let K and D be subset of an inner product space X. Show that (K + D)⊥ = K ⊥ ∩ D ⊥ , where K + D = {x + y : x ∈ K, y ∈ D}. Exercise 1.1.7. For each i = 1, 2, . . . , n, let Ki be a subspace of a Hilbert space X. If hxi , xj i = 0 when i 6= j for each xi ∈ Ki and yj ∈ Kj , then show that the subspace K1 + K2 + · · · + Kn is closed. Is this property true for an incomplete inner product space? Exercise 1.1.8. Let X be an inner product space and for a nonzero vector y ∈ X, Ky := {x ∈ X : hx, yi = 0}. Determine the subspace Ky⊥ . Qamrul Hasan Ansari 1.1.2 Advanced Functional Analysis Page 9 Orthonormal Sets and Orthonormal Bases Definition 1.1.3. Let X be an inner product space. (a) A subset A of nonzero vectors in X is said to orthogonal if any two distinct elements in A are orthogonal. (b) A set of vectors A in X is said to be orthonormal if it is orthogonal and kxk = 1 for all x ∈ A, that is, for all x, y ∈ A, 0, if x 6= y (1.2) hx, yi = 1, if x = y. If an orthogonal / orthonormal set in X is countable, then it can be arranged as a sequence {xn } and in this case we call it an orthogonal sequence / orthonormal sequence, respectively. More generally, let Λ be any index set. (a) A family of vectors {xα }α∈Λ in an inner product space X is said to be orthogonal if xα ⊥ xβ for all α, β ∈ Λ, α 6= β. 
(b) A family of vectors {xα }α∈Λ in an inner product space X is said to be orthonormal if it is orthogonal and kxα k = 1 for all xα , that is, for all α, β ∈ Λ, we have 0, if α 6= β (1.3) hxα , yβ i = δαβ = 1, if α = β. 1 Example 1.1.4. The standard / canonical basis for Rn (with usual inner product) e1 = (1, 0, 0, . . . , 0), e2 = (0, 1, 0, . . . , 0), .. .. . . en = (0, 0, 0, . . . , 1), form an orthonormal set as hei , ej i = δij = 0, 1, if i 6= j if i = j. (1.4) Qamrul Hasan Ansari Advanced Functional Analysis Recall that for p ≥ 1, ( ℓp = c00 = x = {xn } ⊆ K : ∞ [ ∞ X n=1 |xn |p < ∞} Page 10 ) {{x1 , x2 , . . .} ⊆ K : xj = 0 for j ≥ k} k=1 ℓ∞ = {{xn } ⊆ K : sup |xn | < ∞} n∈N C[a, b] = The space of all continuous real-valued functions defined on the interval [a, b] P [a, b] = The space of polynomials defined on the interval [a, b] Clearly, c00 ⊆ ℓ∞ . P [a, b] is complete with respect to the norm kf k∞ = supx∈[a,b] |f (x)|. However, P [a, b] is dense in C[a, b] with k · k∞ . ℓ2 is a Hilbert space with inner product defined by hx, yi = ∞ X n=1 xn yn , for all x = {xn }, y = {yn } ∈ ℓ2 . The norm on ℓ2 is defined by kxk = hx, xi 1/2 = ∞ X n=1 |xn | 2 !1/2 . The space ℓp with p 6= 2 is not an inner product space, and hence not a Hilbert space. However, ℓp with p 6= 2 is a Banach space. For 0 < p < ∞, p L [a, b] = f : [a, b] → K : f is measurable and Z b a p |f | dµ < ∞ For 1 ≤ p < ∞, Lp [a, b] is a complete normed space with respect to the norm kf kp = Z a b p |f | dµ 1/p . Note that kf kp does not define a norm on Lp [a, b] for 0 < p < 1. Example 1.1.5. Consider ℓ2 space and its subset E = {e1 , e2 , . . .} with en = δnj , that is, Qamrul Hasan Ansari Advanced Functional Analysis Page 11 en = {0, 0, . . . 0, 1, 0, . . .} (1 is at nth place). Then E forms an orthonormal set for ℓ2 , and {en } is an orthonormal sequence. Example 1.1.6. Consider the space c00 with the inner product hx, yi = ∞ X for all x = {x1 , x2 , . . .}, y = {y1 , y2, . . .} ∈ c00 . xn yn , n=1 The set E = {e1 , e2 , . . .} with en = δnj , that is, en = {0, 0, . . . 0, 1, 0, . . .} (1 is at nth place) forms an orthonormal set for c00 , and {en } is an orthonormal sequence. Example 1.1.7. Consider the space C[0, 2π] with the inner product Z 2π hf, gi = f (t)g(t) dt, for all f, g ∈ C[0, 2π]. 0 Consider the sets E = {u1 , u2 , . . .} and G = {v1 , v2 , . . .} or the sequences {un } and {vn }, where un (t) = cos nt, for all n = 0, 1, 2, . . . , and vn (t) = sin nt, for all n = 1, 2, . . . . Then, E = {u1 , u2 , . . .} is an orthogonal set and {un } is an orthogonal sequence. Also, G = {v1 , v2 , . . .} is an orthogonal set and {vn } is an orthogonal sequence. Indeed, by integrating, we obtain hum , un i = Z hvm , vn i = Z and 2π 0 2π 0, π, cos mt cos nt dt = 2π sin mt sin nt dt = 0 0, π, if m 6= n if m = n = 1, 2, . . . if m = n = 0, if m 6= n if m = n = 1, 2, . . . . Also, {e1 , e2 , . . .} is an orthonormal set and {en } is an orthonormal sequence, where 1 e0 (t) = √ , 2π en = cos nt un (t) = √ , kun k π for n = 1, 2, . . . . Similarly, {ẽ1 , ẽ2 , . . .} is an an orthonormal set and {ẽn } is an orthonormal sequence, where ẽn = sin nt vn (t) = √ , kvn k π for n = 1, 2, . . . . Note that um ⊥ vn for all m and n (Prove it!). Exercise 1.1.9 (Pythagorean Theorem). If {x1 , x2 , . . . , xn } is an orthogonal subset of an inner product space X, then prove that n X i=1 2 xi = n X i=1 kxi k2 . Qamrul Hasan Ansari Advanced Functional Analysis Page 12 Proof. 
We have

‖ ∑_{i=1}^{n} x_i ‖² = ⟨ ∑_{i=1}^{n} x_i , ∑_{j=1}^{n} x_j ⟩ = ∑_{i=1}^{n} ⟨x_i, x_i⟩ + ∑_{i,j=1, i≠j}^{n} ⟨x_i, x_j⟩ = ∑_{i=1}^{n} ⟨x_i, x_i⟩ = ∑_{i=1}^{n} ‖x_i‖²,

since ⟨x_i, x_j⟩ = 0 for i ≠ j.

Lemma 1.1.1 (Linear independence). An orthonormal set of vectors is linearly independent.

Proof. Let {u_i} be an orthonormal set. Consider the linear combination

α_1 u_1 + α_2 u_2 + · · · + α_n u_n = 0.

Taking the inner product with any fixed u_j, we get

0 = ⟨0, u_j⟩ = ⟨α_1 u_1 + α_2 u_2 + · · · + α_n u_n, u_j⟩ = α_1 ⟨u_1, u_j⟩ + α_2 ⟨u_2, u_j⟩ + · · · + α_j ⟨u_j, u_j⟩ + · · · + α_n ⟨u_n, u_j⟩.

Since ⟨u_i, u_j⟩ = δ_ij, we have 0 = α_j ⟨u_j, u_j⟩, and hence α_j = 0 as u_j ≠ 0. This shows that {u_i} is a set of linearly independent vectors.

Exercise 1.1.10. Determine an orthogonal set in L²[0, 2π].

Hint: (See [6], pp. 179) Consider u_1(t) = 1/√(2π), and for n ∈ N,

u_{2n}(t) = sin(nt)/√π,  u_{2n+1}(t) = cos(nt)/√π,

and then check that E = {u_1, u_2, . . .} is an orthogonal set in L²[0, 2π].

Exercise 1.1.11. Construct a set of 3 vectors in R³ and determine whether it is a basis and, if it is, whether it is orthogonal, orthonormal or neither.

Exercise 1.1.12. Show that every orthonormal set in a separable inner product space X is countable.

Hint: (See [6], pp. 179)

Advantages of an Orthonormal Sequence. A great advantage of orthonormal sequences over arbitrary linearly independent sequences is the following: if we know that a given x can be represented as a linear combination of some elements of an orthonormal sequence, then the orthonormality makes the actual determination of the coefficients very easy.

Let {u_1, u_2, . . .} be an orthonormal sequence in an inner product space X and x ∈ span{u_1, u_2, . . . , u_n}, where n is fixed. Then x can be written as a linear combination of u_1, u_2, . . . , u_n, that is,

x = ∑_{k=1}^{n} α_k u_k,  for scalars α_k.  (1.5)

Taking the inner product with a fixed u_j, we obtain

⟨x, u_j⟩ = ⟨ ∑_{k=1}^{n} α_k u_k, u_j ⟩ = ∑_{k=1, k≠j}^{n} α_k ⟨u_k, u_j⟩ + α_j ⟨u_j, u_j⟩ = α_j ‖u_j‖² = α_j,

as ‖u_j‖ = 1 since {u_1, u_2, . . .} is an orthonormal sequence. Therefore, the unknown coefficients α_k in (1.5) can be easily calculated.

The following Gram-Schmidt process shows how to obtain an orthogonal sequence and an orthonormal sequence from an arbitrary linearly independent sequence.

Gram-Schmidt Orthogonalization Process. Let {x_n} be a linearly independent sequence in an inner product space X. Then we obtain an orthogonal sequence {v_n} and an orthonormal sequence {u_n} with the following property for every n:

span{u_1, u_2, . . . , u_n} = span{x_1, x_2, . . . , x_n}.

1st step: Take v_1 = x_1 and u_1 = v_1 / ‖v_1‖.

2nd step: Take
v_2 = x_2 − ⟨x_2, u_1⟩ u_1 = x_2 − (⟨x_2, v_1⟩ / ⟨v_1, v_1⟩) v_1  and  u_2 = v_2 / ‖v_2‖.

3rd step: Take
v_3 = x_3 − ⟨x_3, u_1⟩ u_1 − ⟨x_3, u_2⟩ u_2 = x_3 − ∑_{j=1}^{2} (⟨x_3, v_j⟩ / ⟨v_j, v_j⟩) v_j  and  u_3 = v_3 / ‖v_3‖.

nth step: Take
v_n = x_n − ∑_{j=1}^{n−1} ⟨x_n, u_j⟩ u_j = x_n − ∑_{j=1}^{n−1} (⟨x_n, v_j⟩ / ⟨v_j, v_j⟩) v_j  and  u_n = v_n / ‖v_n‖.

Then {v_n} is an orthogonal sequence of vectors in X and {u_n} is an orthonormal sequence in X. Also, for every n:

span{u_1, u_2, . . . , u_n} = span{x_1, x_2, . . . , x_n}.

Theorem 1.1.3. Let {x_n} be a linearly independent sequence in an inner product space X. Let v_1 = x_1, and

v_n = x_n − ∑_{j=1}^{n−1} (⟨x_n, v_j⟩ / ⟨v_j, v_j⟩) v_j,  for n = 2, 3, . . . .

Then {v_1, v_2, . . .} is an orthogonal set, {u_n} is an orthonormal sequence, where u_n = v_n / ‖v_n‖, and

span{x_1, x_2, . . . , x_k} = span{u_1, u_2, . . . , u_k},  for all k = 1, 2, . . . , n.

Proof.
Since {xn } is a sequence of linearly independent vectors, so xn 6= 0 for all n. Define v1 = x1 and hx2 , v1 i v2 = x2 − v1 . hv1 , v1 i Clearly, v2 ∈ span{x1 , x2 } and hv2 , v1 i = hx2 , v1 i − hx2 , v1 i hv1 , v1 i = 0, hv1 , v1 i Qamrul Hasan Ansari Advanced Functional Analysis Page 15 that is, v2 and v1 are orthogonal. Since {x1 , x2 } is linearly independent, v2 6= 0. Then, by Exercise 1.1.3 (d), {v1 , v2 } is linearly independent, and hence, it follows that span{v1 , v2 } = span{x1 , x2 }. Continuing in this way, we define an orthogonal set {v1 , v2 , . . . , vn−1 } such that span{x1 , x2 , . . . , xn−1 } = span{v1 , v2 , . . . , vn−1 }. Let vn = xn − n−1 X hxn , vk i j=1 hvj , vj i vj . Then we have vk ∈ span{x1 , x2 , . . . , xk } and hvk , vi i = 0 for i < k. Again, since {x1 , x2 , . . . , xk } is linearly independent, vk 6= 0. Thus, {v1 , v2 , . . . , vn } is the required orthogonal set and {u1 , u2, . . . , un } is the required orthonormal set Exercise 1.1.13. Let Y be the plane in R3 spanned by the vectors x1 = (1, 2, 2) and x2 = (−1, 0, 2), that is, Y = span{x1 , x2 }. Find orthonormal basis for Y and for R3 . Solution. x1 , x2 is a basis for the plane Y . We can extend it to a basis for R3 by adding one vector from the standard basis. For instance, vectors x1 , x2 and x2 = (0, 0, 1) form a basis for R3 because 1 2 2 1 2 −1 0 2 = = 2 6= 0. −1 0 0 0 1 By using the Gram-Schmidt process, we orthogonalize the basis x1 = (1, 2, 2), x2 = (−1, 0, 2) and x3 = (0, 0, 1): v1 = x1 = (1, 2, 2), hx2 , v1 i v2 = x2 − v1 hv1 , v1 i 3 = (−1, 0, 2) − (1, 2, 2) = (−4/3, −2/3, 4/3) 9 hx3 , v1 i hx3 , v2 i v3 = x3 − v1 − v2 hv1 , v1 i hv2 , v2 i 4/3 2 (−4/3, −2/3, 4/3) = (2/9, −2/9, 1/9). = (0, 0, 1) − (1, 2, 2) − 9 4 Now, v1 = (1, 2, 2), v2 = (−4/3, −2/3, 4/3), v3 = (2/9, −2/9, 1/9) is an orthogonal basis for R3 , while v1 , v2 is an orthogonal basis for Y . The orthonromal basis for Y is u1 = kvv11 k = 1 (1, 2, 2), u2 = kvv22 k = 31 (−2, −1, 2). 3 The orthonromal basis for R3 is u1 = 1 (2, −2, 1). 3 v1 kv1 k = 31 (1, 2, 2), u2 = v2 kv2 k = 31 (−2, −1, 2), u3 = v3 kv3 k = Qamrul Hasan Ansari Advanced Functional Analysis Page 16 Exercise 1.1.14. Let {un } be an orthonormal sequence in an inner product space X. Prove the following statements (Use Pythagorean theorem). (a) If w = P∞ n=1 αn un , then kwk = (b) If x ∈ X and sN = PN P∞ n=1 |αn | n=1 hx, un iun , 2 , where αn ’s are scalars. then kxk2 = kx − sN k2 + ksN k2 . P (c) If x ∈ X and sN = N n=1 hx, un iun , and XN = span{u1 , u2 , . . . uN }, then kx − sN k = miny∈XN kx − yk (It is called best approximation property). Theorem 1.1.4 (Bessel’s inequality). Let {uk } be an orthonormal set in an inner product space X. Then for any x ∈ X, we have ∞ X k=1 |hx, uk i|2 ≤ kxk2 . P Proof. Let xn = nk=1 hx, uk iuk be the nth partial sum. Then, by using the properties of the inner product and applying the fact that 0, if i 6= j hui , uj i = δij = 1, if i = j, we have 0 ≤ kx − xn k2 = hx − xn , x − xn i = kxk2 − hxn , xi − hx, xn i + kxn k2 * n + * + n X X = kxk2 − hx, uk iuk , x − x, hx, uk iuk + kxn k2 k=1 = kxk2 − 2 n X k=1 k=1 hx, uk ihuk , xi − 2 = kxk − kxn k . Therefore, kxn k2 ≤ kxk2 , and hence, the conclusion. Pn k=1 |hx, uk i| 2 n X k=1 hx, uk ihx, uk i + kxn k2 ≤ kxk2 . Taking limit as n → ∞, we get Exercise 1.1.15. Let {ui } be a countably infinite orthonormal set in a Hilbert space X. 
Then prove the following statements: (a) The infinite series ∞ P n=1 ∞ P αn un , where αn ’s are scalars, converges if and only if the series n=1 |αn |2 converges, that is, ∞ P n=1 |αn |2 < ∞. Qamrul Hasan Ansari (b) If ∞ P Advanced Functional Analysis Page 17 αn un converges and n=1 x= ∞ X αn u n = n=1 ∞ P then αn = βn for all n and kxk2 = Proof. (a) Let ∞ P n=1 ∞ X βn un , n=1 |αn |2 . αn un be convergent and assume that n=1 x= ∞ X αn u n , or equivalently, lim N →∞ n=1 x− N X 2 αn u n = 0. n=1 Now, hx, um i = = *∞ X αn u n , u m n=1 ∞ X n=1 + αn hun , um i, for m = 1, 2, . . . (as {ui } is orthonormal). = αm By the Bessel inequality, we get ∞ X m=1 which shows that ∞ P n=1 2 |hx, um i| = m=1 |αm |2 ≤ kxk2 , |αn |2 converges. To prove the converse, assume that n P ∞ X ∞ P n=1 αi ui . Then, we have |αn |2 is convergent. Consider the finite sum sn = i=1 ksn − sm k 2 = = * n X i=m+1 n X i=m+1 αi u i , n X αi u i i=m+1 + |αi |2 → 0 as n, m → ∞. This means that {sn } is a Cauchy sequence. Since X is complete, the sequence of partial ∞ P αn un converges. sums {sn } is convergent in X, and therefore, the series n=1 Qamrul Hasan Ansari Advanced Functional Analysis (b) We first prove that kxk2 = 2 kxk − N X n=1 |αn | 2 ∞ P n=1 * x, x − x− ≤ Since N P |αn |2 . We have = hx, xi − = Page 18 N X N X hαn un , αm um i n=1 m=1 N X αn u n n=1 N X αn u n n=1 + + *N X kxk + αn u n , x − n=1 N X αn u n n=1 ! N X αn u n n=1 + = M. αn un converges to x, the M converges to zero, proving the result. n=1 If x = ∞ X αn u n = n=1 ∞ X βn un , then n=1 0 = lim N →∞ " N X n=1 (αn − βn ) un # ⇒ ∞ X 0= n=1 |αn − βn |2 , by (a), implying that αn = βn for all n. Exercise 1.1.16. Let {un } be an orthonormal sequence in a Hilbert space X, and ∞ X n=1 2 |αn | < ∞ and Prove that u ∞ X αn u n ∞ X n=1 and v = n=1 |βn |2 < ∞. ∞ X βn un n=1 are convergent series with respect to the norm of X and hu, vi = P∞ n=1 αn βn . Proof. Let uN N X αn u n and vN = n=1 Then for M < N, we have N X βn un . n=1 2 kuN − uM k = N X n=M |αn |2 → 0 as M → ∞, and so, {uN } is a Cauchy sequence in a complete space X and thus converging to some u ∈ X. Similarly, {vN } is a Cauchy sequence in a complete space X that converges to some Qamrul Hasan Ansari Advanced Functional Analysis Page 19 v ∈ X. Finally, huN , vN i = N X hαj uj , βk uk i = j,k=1 N X j,k=1 αj βk huj , uk i = N X αj βj , j=1 since huj , wk i = 0 for j 6= k and hwj , wj i = 1. Taking P∞the limit as N → ∞ and using the Pythagorean theorem, huN , vN i → hu, vi gives hu, vi n=1 αn βn . Recall that if {u1 , u2 , . . . , un } is a basis of a linear space X, then for every x ∈ X, there exists scalars α1 , α2 , . . . , αn such that x = α1 u1 + α2 u2 + · · · + αn un . Definition 1.1.4. (a) An orthogonal set of vectors {ui } in an inner product space X is called an orthogonal basis if for any x ∈ X, there exist scalars αi such that x= ∞ X αi u i . i=1 If the set {ui } is orthonormal, then it is called an orthonormal basis. (b) An orthonormal basis {ui } in a Hilbert space X is called maximal or complete if there is no unit vector u0 in X such that {u0 , u1, u2 , . . .} is an orthonormal set. In other words, the sequence {ui } of orthonormal basis in X is complete if and only if the only vector orthogonal to each of ui ’s is the null vector. In general, an orthonormal set E in an inner product space X is complete or maximal if it is a maximal orthonormal set in X, that is, E is an orthonormal set, and for every e satisfying E ⊆ E, e we have E e = E. 
orthonormal set E (c) Let {ui } be an orthonormal basis in a Hilbert space X, then the numbers αi = hx, ui i are called the Fourier coefficients of the element x with respect to the system {ui } and P ∞ i=1 αi ui is called the Fourier series of the element x. Example 1.1.8. The set {ei : i ∈ N}, where ei = (0, 0, . . . , 0, 1, 0, . . .) with 1 lies in the ith place, forms an orthonormal basis for ℓ2 (C). Example 1.1.9. Let X = L2 (−π, π) be a complex Hilbert space and un be the element of X defined by 1 un (t) = √ exp(i n t), for n = 0, ±1, ±2, . . . . 2π Then 1 cos nt sin nt √ , √ , √ : n = 1, 2, . . . π π 2π forms an orthonormal basis for X as exp(i n t) = cos nt + i sin nt. Qamrul Hasan Ansari Advanced Functional Analysis Page 20 Theorem 1.1.5. Let {ui : i ∈ N} be an orthonormal set in a Hilbert space X. Then the following assertions are equivalent: (a) {ui : i ∈ N} is an orthonormal basis for X. (b) For all x ∈ X, x = ∞ X i=1 (c) For all x ∈ X, kxk2 = hx, ui iui . ∞ X i=1 |hx, ui i|2 . (d) hx, ui i = 0 for all i implies x = 0. Proof. (a) ⇔ (b): Let {ui : i ∈ N} be an orthonormal basis for X. Then we can write x= ∞ X αi u i , that is x = lim n→∞ i=1 For k ≤ n in N, we have * n X αi u i , u k i=1 + n X = i=1 n X αi u i . i=1 αi hui , uk i = uk . By letting n → ∞ and using the continuity of the inner product, we obtain hx, uk i = lim = αk , n→∞ and hence (b) holds. The same argument shows that if (b) holds, then this expansion is unique and so {ui : i ∈ N} is an orthonormal basis for X. (b) ⇔ (c): By Pythagorean theorem and continuity of the inner product, we have 2 kxk = ∞ X i=1 2 hx, ui iui = i=1 (c) ⇔ (d): Let hx, ui i = 0 for all i. Then kxk2 = x = 0. (d) ⇔ (b): Take any x ∈ X and let y = x − ∞ X i=1 hy, uk i = hx, uk i − lim n→∞ * ∞ X |hx, ui i|2 . P∞ i=1 |hx, ui i|2 = 0 which implies that hx, ui iui . Then for each k ∈ N, we have ∞ X hx, ui iui , uk i=1 + =0 Qamrul Hasan Ansari Advanced Functional Analysis since eventually n ≥ k. It follows from (d) that y = 0, and hence x = Page 21 ∞ X i=1 hx, ui iui . Theorem 1.1.6 (Fourier Series Representation). Let Y be the closed subspace spanned by a countable orthonormal set {ui } in a Hilbert space X. Then every element x ∈ Y can be written uniquely as ∞ X x= hx, ui iui . (1.6) i=1 Proof. Uniqueness of (1.6) is a consequence of Exercise 1.1.15 (b). For any x ∈ Y , we can write M X x = lim αi ui , for M ≥ N N →∞ i=1 as Y is closed. From Theorem 1.1.4 and Exercise 1.1.15, it follows that x− M X i=1 hx, ui iui ≤ x − M X αi u i , i=1 and as N → ∞, we get the desired result. Theorem 1.1.7 (Fourier Series Theorem). For any orthonormal set {un } in a separable Hilbert space X, the following statements are equivalent: (a) Every x ∈ X can be represented by the Fourier series in X; that is, x= ∞ X i=1 hx, ui iui . (1.7) (b) For any pair of vectors x, y ∈ X, we have hx, yi = ∞ X i=1 hx, ui ihy, uii = ∞ X αi βi , (1.8) i=1 where αi = hui , xi are Fourier coefficients of x, and βi = hy, ui i are Fourier coefficients of y. (c) For any x ∈ X, one has 2 kxk = ∞ X i=1 |hx, ui i|2 . (1.9) Qamrul Hasan Ansari Advanced Functional Analysis Page 22 (d) Any subspace Y of X that contains {ui } is dense in X. Proof. (a) ⇒ (b). It follows from (1.6) and the fact that {ui } is orthonormal. (b) ⇒ (c). Put x = y in (1.8) to get (1.9). (a) ⇒ (d). The statement (d) is equivalent to the statement that the orthogonal projection onto S, the closure of S, is the identity. In view of Theorem 1.1.6, statement (d) is equivalent to statement (a). Exercise 1.1.17. 
Let X be a Hilbert space and E be an orthonormal basis of X. Prove that E is countable if and only if X is separable. Hint: See Theorem 4.10 on page 187 in [6]. Exercise 1.1.18. Let X be a Hilbert space and E be an orthonormal basis of X. Prove that E is a basis of X if and only if X is finite dimension. Hint: See Theorem 4.13 on page 189 in [6]. Exercise 1.1.19. Let X be a Hilbert space. Prove that E is an orthonormal basis of X if and only if spanE is dense in X. Exercise 1.1.20. If X is a Hilbert space, then show that E is an orthonormal basis if and only if X hx, yi = hx, ui hy, ui, for all x, y ∈ X. u∈E 1.2 1.2.1 Orthogonal Projections and Projection Theorem Orthogonal Projection Let K be a nonempty subset of a normed space X. Recall that distance from an element x ∈ X to the set K is defined by ρ := inf kx − yk. y∈K (1.10) It is important to know that whether there is a z ∈ K such that kx − zk = inf kx − yk. y∈K If such point exists, whether it is unique? (1.11) Qamrul Hasan Ansari Advanced Functional Analysis b Page 23 x z K y b Figure 1.1: The distance from a point x to K b x b ρ x ρ K is an open segment No z in K that satisfies (1.12) K is an open segment z is unique that satisfies (1.12) b x ρ K is circular arc; Infinitely many z’s which satisfy (1.12) Figure 1.2: The distance from a point x to K One can see in the following figures that even in the simple space R2 , there may be no z satisfying (1.12), or precisely one such z, or more than one z. To get the existence and uniqueness of such z, we recall the concept of a convex set. Definition 1.2.1. A subset K of a vector space X is said to be a convex set if for all x, y ∈ K and α, β ≥ 0 such that α + β = 1, we have αx + βy ∈ K, that is, for all x, y ∈ K and α ∈ [0, 1], we have αx + (1 − α)y ∈ K. Theorem 1.2.1. Let K be a nonempty closed convex subset of a Hilbert space X. Then for any given x ∈ X, there exists a unique z ∈ K such that kx − zk = inf kx − yk. y∈K (1.12) Proof. Existence. Let ρ := inf kx − yk. By the definition of the infimum, there exists a y∈K sequence {yn } in K such that kx − yn k → ρ as n → ∞. We will prove that {yn } is a Cauchy sequence. Qamrul Hasan Ansari Advanced Functional Analysis Page 24 y x y x A convex set A nonconvex set Figure 1.3: A convex set and a nonconvex set b x z K y b Figure 1.4: Existence and uniqueness of z that minimizes the distance from K By using parallelogram law, we have yn + ym kyn − ym k + 4 x − 2 2 2 = 2 kx − yn k2 + kx − ym k2 , for all n, m ≥ 1. Since K is a convex subset of X and yn , ym ∈ K, we have 12 (yn + ym ) ∈ K. Therefore, m ≥ ρ. Hence x − yn +y 2 kyn − ym k 2 yn + ym = 2 kx − yn k + kx − ym k − 4 x − 2 2 2 2 ≤ 2 kx − yn k + kx − ym k − 4ρ . 2 2 2 Let n, m → ∞, then we have kx − yn k → ρ and kx − ym k → ρ and 0 ≤ lim kyn − ym k2 ≤ 4ρ2 − 4ρ2 = 0. n,m→∞ Therefore, lim kyn − ym k2 = 0, and thus, {yn } is a Cauchy sequence. Since X is complete, n,m→∞ there exists z ∈ X such that lim yn = z. Since yn ∈ K and K is closed, z ∈ K. In n→∞ conclusion, we have kx − zk = inf kx − yk. y∈K Qamrul Hasan Ansari Advanced Functional Analysis Page 25 Uniqueness. Suppose that there is also ẑ ∈ K such that kx − ẑk = inf kx − yk. y∈K z+ẑ 2 By using parallelogram law and x − kz − ẑk2 + 4 x − z + ẑ 2 that is, 2 ≥ ρ (since 21 (z + ẑ) ∈ K), we have = 2 kx − zk2 + kx − ẑk2 = 4ρ2 , z + ẑ 0 ≤ kz − ẑk = 4ρ − 4 x − 2 2 2 2 ≤ 0. Thus, kz − ẑk = 0, and hence, z = ẑ. Remark 1.2.1. Theorem 1.2.1 does not hold in the setting of Banach spaces. 
For example, c0 is a closed subspace of ℓ∞ , but there is no closest sequence in c0 to the sequence {1, 1, 1, . . .}. In fact, the distance between c0 and the sequence {1, 1, 1, . . .} is 1, and this is achieved by any bounded sequence {xn } with xn ∈ [0, 2]. Theorem 1.2.2. Let K be a closed subspace of a Hilbert space X and x ∈ X be given. There exists a unique z ∈ K which satisfies (1.12) and x − z is orthogonal to K, that is, x − z ∈ K ⊥. Proof. Existence. Existence of z ∈ K follows from previous theorem as every subspace is convex. Orthogonality. Clearly, hx − z, 0i = 0. Take y ∈ K, y 6= 0. Then we shall prove that hx − z, yi = 0. Since z ∈ K satisfies (1.12) and z + λy ∈ K (as K is a subspace), we have kx − zk2 ≤ kx − (z + λy)k2 = kx − zk2 + |λ|2kyk2 − λhy, x − zi − λhx − z, yi, that is, Putting λ = hx−z,yi kyk2 0 ≤ |λ|2 kyk2 − λhx − z, yi − λhx − z, yi. in the above inequality, we obtain |hx − z, yi|2 ≤ 0, kyk2 which is only happened when hx − z, yi = 0. Since y was arbitrary, x − z is orthogonal to K. Uniqueness. Suppose that there is also ẑ ∈ K such that x − ẑ ∈ K ⊥ . Then z − ẑ = (x − ẑ) − (x − z) ∈ K ⊥ . On the other hand, z − ẑ ∈ K since z, ẑ ∈ K and K is a subspace. So, z − ẑ ∈ K ∩ K ⊥ ⊂ {0}. Therefore, z − ẑ = 0, and hence, z = ẑ. Qamrul Hasan Ansari Advanced Functional Analysis Page 26 Lemma 1.2.1. If K is a proper closed subspace of a Hilbert space X, then there exists a nonzero vector x ∈ X such that x ⊥ K. Proof. Let u ∈ / K and ρ = inf ku − yk, the distance from u to K. By Theorem 1.2.1, there y∈K exists a unique element z ∈ K such that ku − zk = ρ. Let x = u − z. Then x 6= 0 as ρ > 0. (If x = 0, then u − z = 0 and ku − zk = 0 implies that ρ = 0.) Now, we show that x ⊥ K. For this, we show that for arbitrary y ∈ K, hx, yi = 0. For any scalar α, we have kx − αyk = ku − z − αyk = ku − (z + αy)k. Since K is a subspace, z + αy ∈ K whenever z, y ∈ K. Thus, z + αy ∈ K implies that kx − αyk ≥ ρ = kxk or kx − αyk2 − kxk2 ≥ 0 or hx − αy, x − αyi − kxk2 ≥ 0. Since hx − αy, x − αyi = hx, xi − αhy, xi − αhx, yi + ααhy, yi = kxk2 − αhx, yi − αhy, xi + |α|2kyk2 , we have, −αhx, yi − αhx, yi + |α|2kyk2 ≥ 0. Putting α = βhx, yi in the above inequality, β being an arbitrary real number, we get −2β|hx, yi|2 + β 2 |hx, yi|2kyk2 ≥ 0. If we put a = |hx, yi|2 and b = kyk2 in the above inequality, we obtain −2βa + β 2 ab ≥ 0, or βa(βb − 2) ≥ 0, for all real β. If a > 0, the above inequality is false for all sufficiently small positive β. Hence, a must be zero, that is, a = |hx, yi|2 = 0 or hx, yi = 0 for all y ∈ K. Lemma 1.2.2. If M and N are closed subspaces of a Hilbert space X such that M ⊥ N, then the subspace M + N = {x + y ∈ X : x ∈ M and y ∈ N} is also closed. Proof. It is a well-known result of vector spaces that M + N is a subspace of X. We show that it is closed, that is, every limit point of M + N belongs to it. Let z be an arbitrary limit point of M + N. Then there exists a sequence {zn } of points of M + N such that zn → z. M ⊥ N implies that M ∩ N = {0}. So, every zn ∈ M + N can be written uniquely in the form zn = xn + yn , where xn ∈ M and yn ∈ N. By the Pythagorean theorem for elements (xm − xn ) and (ym − yn ), we have kzm − zn k2 = k(xm − xn ) + (ym − yn )k2 = kxm − xn k2 + kym − yn k2 (1.13) Qamrul Hasan Ansari Advanced Functional Analysis Page 27 (It is clear that (xm − xn ) ⊥ (ym − yn ) for all m, n.) Since {zn } is convergent, it is a Cauchy sequence and so kzm − zn k2 → 0. Hence, from (1.13), we see that kxm − xn k → 0 and kym − yn k → 0 as m, n → ∞. 
Hence, {xm } and {yn } are Cauchy sequences in M and N, respectively. Being closed subspaces of a complete space, M and N are also complete. Thus, {xm } and {yn } are convergent in M and N, respectively, say xm → x ∈ M and yn → y ∈ N, x + y ∈ M + N as x ∈ M and y ∈ N. Then z = lim zn = lim (xn + yn ) = lim xn + lim y n→∞ n→∞ n→∞ n→∞ = x + y ∈ M + N. This proves that an arbitrary limit point of M + N belongs to it and so it is closed. Definition 1.2.2. A vector space X is said to be the direct sum of two subspaces Y and Z of X, denoted by X = Y ⊕ Z, if each x ∈ X has a unique representation x = y + z for y ∈ Y and z ∈ Z. Theorem 1.2.3 (Orthogonal Decomposition). If K is a closed subspace of a Hilbert space X, then every x ∈ X can be uniquely represented as x = z + y for z ∈ K and y ∈ K ⊥ , that is, X = K ⊕ K ⊥ . Proof. Since every subspace is a convex set, by previous two results, for every x ∈ X, there is a z ∈ K such that x − z ∈ K ⊥ , that is, there is a y ∈ K ⊥ such that y = x − z which is equivalently to x = z + y for z ∈ K and y ∈ K ⊥ . To prove the uniqueness, assume that there is also ŷ ∈ K ⊥ such that x = ŷ + ẑ for ẑ ∈ K. Then x = y + z = ŷ + ẑ, and therefore, y − ŷ = ẑ − z. Since y − ŷ ∈ K ⊥ whereas ẑ − z ∈ K, we have y − ŷ ∈ K ∩ K ⊥ = {0}. This implies that y = ŷ, and hence also z = ẑ. x y = PK ⊥ (x) z = PK (x) K Figure 1.5: Orthogonal decomposition Qamrul Hasan Ansari Advanced Functional Analysis Page 28 Example 1.2.1. (a) Let X = L2 (−1, 1). Then X = K ⊕ K ⊥ , where K is the space of even functions, that is, K = {f ∈ L2 (−1, 1) : f (−t) = f (t) for all t ∈ (−1, 1)}, and K ⊥ is the space of odd functions, that is, K ⊥ = {f ∈ L2 (−1, 1) : f (−t) = −f (t) for all t ∈ (−1, 1)}. (b) Let X = L2 [a, b]. For c ∈ [a, b], let K = {f ∈ L2 [a, b] : f (t) = 0 almost everywhere in (a, c)} and K ⊥ = {f ∈ L2 [a, b] : f (t) = 0 almost everywhere in (c, b)}. Then X = K ⊕ K ⊥ . Exercise 1.2.1. Give examples of representations of R3 as a direct sum of a subspace and its orthogonal complement. Exercise 1.2.2. Let K be a subspace of an inner product space X. Show that x ∈ K ⊥ if and only if kx − yk ≥ kxk for all y ∈ K. Definition 1.2.3. Let K be a closed subspace of a Hilbert space X. A mapping PK : X → K defined by PK (x) = z, where x = z + y and (z, y) ∈ K × K ⊥ , is called the orthogonal projection of X onto K. 2 Let X and Y be normed spaces and T : X → Y be an operator. (a) The range of T is R(T ) := {T (x) ∈ Y : x ∈ X}. (b) The null space or kernel of T is N (T ) := {x ∈ X : T (x) = 0}. (c) The operator T is called an idempotent if T 2 = T . 2 Let X be a vector space. A linear operator P : X → X is called projection operator if P ◦ P = P 2 = P . Theorem 1.2.4. If P : X → X is a projection operator from a vector space X to itself, then X = R(P ) ⊕ N (P ), where R(P ) is the range set of P and N (P ) = {x ∈: P (x) = 0} is the null space of P . Theorem 1.2.5. If a vector space X is expressed as the directed sum of its subspaces Y and Z, then there is a uniquely determined projection P : X → X such that Y = R(P ) and Z = N (P ) = R(I − P ), where I be the identity mapping on X. Qamrul Hasan Ansari Advanced Functional Analysis Page 29 Theorem 1.2.6 (Existence of Projection Mapping). Let K be a closed subspace of a Hilbert space X. Then there exists a unique mapping PK from X onto K such that R(PK ) = K. Proof. By Theorem 1.2.3, X = K ⊕ K ⊥ . Theorem 1.2.5 ensures the existence of a unique projection PK such that R(PK ) = K and N (PK ) = K ⊥ . 
This projection is an orthogonal projection as its null space and range are orthogonal. Similarly, it can be verified that the orthogonal projection I − PK corresponds to the case R(I − PK ) = K ⊥ and N (I − PK ) = K. Exercise 1.2.3. Let K be a closed subspace of a Hilbert space X and I be the identity mapping on X. Then prove that there exists a unique mapping PK from X onto K such that I − PK maps X onto K ⊥ . Such map PK is the projection mapping of X onto K. Exercise 1.2.4 (Properties of Projection Mapping). Let K be a closed subspace of a Hilbert space X, I be the identity mapping on X and PK is the projection mapping from X onto K. Then prove that the following properties hold for all x, y ∈ X. (a) Each element x ∈ X has a unique representation as a sum of an element of K and an element of K ⊥ , that is, x = PK (x) + (I − PK )(x). (1.14) (Hint: Compare with Theorem 1.2.3) (b) kxk2 = kPK (x)k2 + k(I − PK )(x)k2 . (c) x ∈ K if and only if PK (x) = x. (d) x ∈ K ⊥ if and only if PK (x) = 0. (e) If K1 and K2 are closed subspaces of X such that K1 ⊆ K2 , then PK1 (PK2 (x)) = PK1 (x). (f) PK is a linear mapping, that is, for all α, β ∈ R and all x, y ∈ X, PK (αx + βy) = αPK (x) + βPK (y). (g) PK is a continuous mapping, that is, xn −→ x (that is, kxn − xk −→ 0) implies n→∞ PK (xn ) −→ PK (x) (that is, kPK (xn ) − PK (x) −→ 0). n→∞ n→∞ n→∞ Exercise 1.2.5 (Properties of Projection Mapping). Let K be a closed subspace of a Hilbert space X, I be the identity mapping on X and PK is the projection mapping from X onto K. Then prove that the following properties hold for all x, y ∈ X. Qamrul Hasan Ansari Advanced Functional Analysis Page 30 (a) Each element z ∈ X can be written uniquely as z = x + y, where x ∈ R(PK ) and y ∈ N (PK ). (b) The null space N (PK ) and the range set R(PK ) are closed subspaces of X. (c) N (PK ) = (R(PK ))⊥ and R(PK ) = N (PK )⊥ . (d) PK is idempotent. Exercise 1.2.6. Let K1 and K2 be closed subspaces of a Hilbert space X and PK1 and PK2 be orthogonal projections onto K1 and K2 , respectively. If hx, yi = 0 for all x ∈ K1 and y ∈ K2 , then prove that (a) K1 + K2 is a closed subspace of X; (b) PK1 + PK2 is the orthogonal projection onto K1 + K2 ; (c) PK1 PK2 ≡ 0 ≡ PK2 PK1 . 1.2.2 Projection on Convex Sets We discuss here the concepts of projection and projection operator on convex sets which are of vital importance in such diverse fields as optimization, optimal control and variational inequalities. Definition 1.2.4. Let K be a nonempty closed convex subset of a Hilbert space X. For x ∈ X, by projection of x on K, we mean the element z ∈ K, denoted by PK (x), such that kx − PK (x)k ≤ kx − yk, for all y ∈ K, (1.15) equivalently, kx − zk = inf kx − yk. y∈K (1.16) An operator on X into K, denoted by PK , is called the projection operator if PK (x) = z, where z is the projection of x on K. In view of Theorem 1.2.1, there always exists a z ∈ K which satisfies (1.16) Theorem 1.2.7 (Variational Characterization of Projection). Let K be a nonempty closed convex subset of a Hilbert space X. For any x ∈ X, z ∈ K is the projection of x if and only if hx − z, y − zi ≤ 0, for all y ∈ K. (1.17) Qamrul Hasan Ansari Advanced Functional Analysis Page 31 Proof. Let z be the projection of x ∈ X. Then for any α, 0 ≤ α ≤ 1, since K is convex, αy + (1 − α)z ∈ K for all y ∈ K. Define a real-valued function g : [0, 1] → R by g(α) := kx − (αy + (1 − α)z)k2 , for all α ∈ [0, 1]. (1.18) Then g is a twice continuously differentiable function of α. 
Moreover, g ′ (α) = 2hx − αy − (1 − α)z, z − yi g ′′ (α) = 2hz − y, z − yi. b (1.19) x z K y b Figure 1.6: The projection of a point x onto K Now, for z to be the projection of x, it is clear that g ′ (0) ≥ 0, which is (1.17). In order to prove the converse, let (1.17) be satisfied for some element z ∈ K. This implies that g ′(0) is non-negative, and by (1.19), g ′′ (α) is non-negative. Hence, g(0) ≤ g(1) for all y ∈ K such that (1.16) is satisfied. Remark 1.2.2. The inequality (1.17) shows that x − z and y − z subtend a non-acute angle between them. The projection PK (x) of x on K can be interpreted as the result of applying to x the operator PK : X → K, which is called projection operator. Note that PK (x) = x for all x ∈ K. Theorem 1.2.8. The projection operator PK defined on a Hilbert space X into its nonempty closed convex subset K has the following properties: (a) PK is a nonexpansive, that is, kPK (x) − PK (y)k ≤ kx − yk for all x, y ∈ X; which implies that PK is continuous. (b) hPK (x) − PK (y), x − yi ≥ 0 for all x, y ∈ X. Proof. (a) From (1.17), we obtain hPK (x) − x, PK (x) − yi ≤ 0, for all y ∈ K. (1.20) Qamrul Hasan Ansari Advanced Functional Analysis Page 32 Put x = x1 in (1.20), we get hPK (x1 ) − x1 , PK (x1 ) − yi ≤ 0, for all y ∈ K. (1.21) for all y ∈ K. (1.22) Put x = x2 in (1.20), we get hPK (x2 ) − x2 , PK (x2 ) − yi ≤ 0, Since PK (x2 ) and PK (x1 ) ∈ K, choose y = PK (x2 ) and y = PK (x1 ), respectively, in (1.21) and (1.22), we obtain hPK (u1 ) − u1 , PK (u1 ) − PK (u2 )i ≤ 0 hPK (u2 ) − u2 , PK (u2 ) − PK (u1 )i ≤ 0. From above two inequalities, we obatin hPK (x1 ) − x1 − PK (x2 ) + x2 , PK (x1 ) − PK (x2 )i ≤ 0, or hPK (x1 ) − PK (x2 ), PK (x1 i − PK (x2 )i ≤ hx1 − x2 , PK (x1 ) − PK (x2 )i, equivalently, kPK (x1 ) − PK (x2 )k2 ≤ hx1 − x2 , PK (x1 ) − PK (x2 )i. (1.23) Therefore, by the Cauchy-Schwartz-Bunyakowski inequality, we get kPK (x1 ) − PK (x2 )k2 ≤ kx1 − x2 k kPK (x1 ) − PK (x2 )k , (1.24) kPK (x1 ) − PK (x2 )k ≤ kx1 − x2 k . (1.25) and hence, (b) follows from (1.23). The geometric interpretation of the nonexpansivity of PK is given in the following figure. We observe that if strict inequality holds in (a), then the projection operator PK reduces the distance. However, if the equality holds in (a), then the distance is conserved. Qamrul Hasan Ansari Advanced Functional Analysis x̃ x b y b b ỹ b Page 33 PK (x̃) b PK (x) b K b PK (ỹ) PK (y) b Figure 1.7: The nonexpansiveness of the projection operator 1.3 Bilinear Forms and Lax-Milgram Lemma Let X and Y be inner product spaces over the same field K (= R or C). A functional a(·, ·) : X × Y → K will be called a form. Definition 1.3.1. Let X and Y be inner product spaces over the same field K (= R or C). A form a(·, ·) : X × Y → K is called a sesquilinear functional or sesquilinear form if the following conditions are satisfied for all x, x1 , x2 ∈ X, y, y1, y2 ∈ Y and all α, β ∈ K: (i) a(x1 + x2 , y) = a(x1 , y) + a(x2 , y). (ii) a(αx, y) = αa(x, y). (iii) a(x, y1 + y2 ) = a(x, y1 ) + a(x, y2 ). (iv) a(x, βy) = βa(x, y). Remark 1.3.1. (a) The sesquilinear functional is linear in the first variable but not so in the second variable. A sesquilinear functional which is also linear in the second variable is called a bilinear form or a bilinear functional. Thus, a bilinear form a(·, ·) is a mapping defined from X × Y into K which satisfies conditions (i) - (iii) of the above definition and a(x, βy) = βa(x, y). 
(b) If X and Y are real inner product spaces, then the concepts of sesquilinear functional and bilinear form coincide. (c) An inner product is an example of a sesquilinear functional. The real inner product is an example of a bilinear form. Qamrul Hasan Ansari Advanced Functional Analysis Page 34 (d) If a(·, ·) is a sesquilinear functional, then g(x, y) = a(y, x) is a sesquilinear functional. Definition 1.3.2. Let X and Y be inner product spaces. A form a(·, ·) : X × Y → K is called: (a) symmetric if a(x, y) = a(y, x) for all (x, y) ∈ X × Y ; (b) bounded or continuous if there exists a constant M > 0 such that |a(x, y)| ≤ Mkxk kyk, for all x ∈ X, y ∈ Y, and the norm of a is defined as |a(x, y)| kak = sup = sup a x6=0 y6=0 kxk kyk x6=0 y6=0 = sup |a(x, y)|. y x , kxk kyk kxk=kyk=1 It is clear that |a(x, y)| ≤ kak kxk kyk. Remark 1.3.2. Let a(·, ·) : X×Y → K be a continuous form and {xn } and {yn } be sequences in X and Y , respectively, such that xn → x and yn → y. Then a(xn , yn ) → a(x, y). Indeed, |a(xn , yn ) − a(x, y)| ≤ |a(xn − x, yn )| + |a(x, yn − y)| ≤ kak (kxn − xk kyn k + kxk kyn − yk) . Definition 1.3.3. Let X be an inner product space. A form a(·, ·) : X × X → K is called: (a) positive if a(x, x) ≥ 0 for all x ∈ X; (b) positive definite if a(x, x) ≥ 0 for all x ∈ X and a(x, x) = 0 implies that x = 0; (c) coercive or X-elliptic if there exists a constant α > 0 such that a(x, x) ≥ αkxk2 for all x ∈ X. Example 1.3.1. Let X = Rn with the usual Euclidean inner product. Then any n × n metrix with real entries defines a continuous bilinear form. If A = (aij ), 1 ≤ i, j ≤ n, and if we have x = (x1 , . . . , xn ) and y = (y1 , . . . , yn ), then the bilinear form is defined as n X a(x, y) := aij xj yi = y ⊤ Ax, i,j=1 where x and y are considered as column vectors and y ⊤ denotes the transpose of y. By the Cauchy-Schwarz inequality, we have |a(x, y)| = |y ⊤ Ax| = |hy, Axi| ≤ kyk kAxk ≤ kAk kxk kyk. Qamrul Hasan Ansari Advanced Functional Analysis Page 35 If A is a symmetric and positive definite matrix, then the bilinear form is symmetric and coercive since we know that n X aij yj yi ≥ αkyk2, i,j=1 where α > 0 is the smallest eigenvalue of the matrix A. Theorem 1.3.1 (Extended form of Riesz Representation Theorem). Let X and Y be Hilbert spaces and a(·, ·) : X × Y → K be a bounded sesquilinear form. Then there exists a unique bounded linear operator T : X → Y such that a(x, y) = hT (x), yi, for all (x, y) ∈ X × Y, (1.26) and kak = kT k. Proof. For each fixed x ∈ X, define a functional fx : Y → K by fx (y) = a(x, y), for all y ∈ Y. (1.27) Then, fx is a linear functional since for all y1 , y2 ∈ Y and all α ∈ K, we have fx (y1 + y2 ) = a(x, y1 + y2 ) = a(x, y1 ) + a(x, y2 ) = fx (y1 ) + fx (y2 ) fx (αy) = a(x, αy) = αa(x, y) = αfx (y). Since a(·, ·) is bounded, we have |fx (y)| = |a(x, y)| = |a(x, y)| ≤ kak kxk kyk, that is, kfx k ≤ kak kxk. Thus, fx is a bounded linear functional on Y . By Riesz representation theorem3 , there exists a unique vector y ∗ ∈ Y such that fx (y) = hy, y ∗i, for all y ∈ Y. (1.28) The vector y ∗ depends on the choice vector x. Therefore, we can write y ∗ = T (x) where T : X → Y . We observe that a(x, y) = hy, T (x)i or a(x, y) = hT (x), yi, for all x ∈ X and y ∈ Y. Since y ∗ is unique, the operator T is uniquely determined. 3 Riesz Representation Theorem. 
If f is a bounded linear functional on a Hilbert space X, then there exists a unique vector y ∈ X such that f (x) = hx, yi for all x ∈ X and kf k = kyk Qamrul Hasan Ansari Advanced Functional Analysis Page 36 The operator T is linear in view of the following relations: For all y ∈ Y , x, x1 , x2 ∈ X and α ∈ K, we have hT (x1 + x2 ), yi = a(x1 + x2 , y) = a(x1 , y) + a(x2 , y) = hT (x1 ), yi + hT (x2 , yi, hT (αx1 ), yi = a(αx, y) = αa(x, y) = αhT (x), yi. Moreover, T is continuous as kfx k = ky ∗ k = kT (x)k ≤ kak kxk implies that kT k ≤ kak. To prove that kT k = kak, it is enough to show that kT k ≥ kak which follows from the following relation: |a(x, y)| |hT (x), yi| = sup x6=0 y6=0 kxk kyk x6=0 y6=0 kxk kyk kT (x)k kyk ≤ sup = kT k. kxk kyk x6=0 y6=0 kak = sup To prove the uniqueness of T , let us assume that there is another linear operator S : X → Y such that a(x, y) = hS(x), yi, for all (x, y) ∈ X × Y. Then, for every x ∈ X and y ∈ Y , we have a(x, y) = hT (x), yi = hS(x), yi equivalently, h(T − S)(x), yi = 0. This implies that (T − S)(x) = 0 for all x ∈ X, that is, T ≡ S. This proves that there exists a unique bounded linear operator T such that a(x, y) = hT (x), yi. Remark 1.3.3 (Converse of above theorem). Let X and Y be Hilbert spaces and T : X → Y be a bounded linear operator. Then the form a(·, ·) : X × Y → K defined by a(x, y) = hT (x), yi, for all (x, y) ∈ X × Y, (1.29) is a bounded sesquilinear form on X × Y . Proof. Since T is a bounded linear operator on X × Y and the inner product is a sesquilinear mapping, we have that a(x, y) = hT (x), yi is sesquilinear. Since |a(x, y)| = |hT (x), yi| ≤ kT k kxk kyk, by the Cauchy-Schwartz-Bunyakowski inequality, we have sup |a(x, y)| ≤ kT k, and hence a(·, ·) is bounded. kxk=kyk=1 Qamrul Hasan Ansari Advanced Functional Analysis Page 37 Corollary 1.3.1. Let X be a Hilbert space and T : X → X be a bounded linear operator. Then the complex-valued function b(·, ·) : X × X → C defined by b(x, y) = hx, T (y)i is a bounded bilinear form on X and kbk = kT k. Conversely, if b(·, ·) : X × X → C is a bounded bilinear form, then there is a unique bounded linear operator T : X → X such that b(x, y) = hx, T (y)i for all (x, y) ∈ X × X. Proof. Define a function a(·, ·) : X × X → C by a(x, y) = b(y, x) = hT (x), yi. By Theorem 1.3.1, a(x, y) is a bounded bilinear form on X and kak = kT k. Since we have b(x, y) = a(y, x); b is also bounded bilinear on X and kbk = sup kxk=kyk=1 |b(x, y)| = sup kxk=kyk=1 |a(y, x)| = kak = kT k. Conversely, if b is given, we define a bounded bilinear form a(·, ·) : X × X → C by a(x, y) = b(y, x), for all x, y ∈ X. Again, by Theorem 1.3.1, there is a bounded linear operator T on X such that a(x, y) = hT (x), yi, for all (x, y) ∈ X × X. Therefore, we have b(x, y) = a(y, x) = hT (y), xi = hx, T (y)i for all (x, y) ∈ X × X. Corollary 1.3.2. Let X be a Hilbert space. If T is a bounded linear operator on X, then kT k = sup |hx, T (y)i| = sup |hT (x), yi|. kxk=kyk=1 kxk=kyk=1 Proof. By Theorem 1.3.1, for every bounded linear operator on X, there is a bounded bilinear form a such that a(x, y) = hT (x), yi and kak = kT k. Then, kak = sup kxk=kyk=1 From this, we conclude that kT k = |a(x, y)| = sup sup kxk=kyk=1 hT (x), yi. t|hT (x), yi|. kxk=kyk=1 Definition 1.3.4. Let X be a Hilbert space and a(·, ·) : X × X → K be a form. Then the operator F : X → K is called a quadratic form associated with a(·, ·) if F (x) = a(x, x) for all x ∈ X. A quadratic form F is called real if F (x) is real for all x ∈ X. 
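In coordinates, the correspondence of Theorem 1.3.1 (and the norm identity of Corollary 1.3.2) can be checked directly. On R^n with the usual inner product (Example 1.3.1), a bounded bilinear form is a(x, y) = ⟨Ax, y⟩ = yᵀAx for some n × n matrix A, the associated operator is T(x) = Ax, and ‖a‖ = ‖T‖ is the largest singular value of A. The following numerical sketch, which assumes NumPy is available and uses a randomly generated matrix purely for illustration, is not part of the notes but checks this identity.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5
A = rng.standard_normal((n, n))        # illustrative matrix: a(x, y) = <A x, y> = y^T A x

def a(x, y):
    return y @ (A @ x)                 # the bilinear form associated with T(x) = A x

# ||T|| = largest singular value of A (operator 2-norm).
op_norm = np.linalg.norm(A, 2)

# For a fixed unit vector x, the supremum over unit y of |a(x, y)| equals ||A x||
# (Cauchy-Schwarz, attained at y = A x / ||A x||).  Sampling unit vectors x thus
# gives lower bounds for ||a|| that approach ||T|| from below.
best = 0.0
for _ in range(20000):
    x = rng.standard_normal(n)
    x /= np.linalg.norm(x)
    best = max(best, np.linalg.norm(A @ x))

# The supremum is attained at the top right-singular vector v1 of A.
_, s, Vt = np.linalg.svd(A)
v1 = Vt[0]
y1 = A @ v1 / np.linalg.norm(A @ v1)
print(f"||T||                  = {op_norm:.6f}")
print(f"sampled sup |a(x, y)|  = {best:.6f}   (<= ||T||)")
print(f"|a(v1, y1)|            = {abs(a(v1, y1)):.6f}")
```

The first and third printed values coincide, and the sampled bound approaches them from below, in line with ‖a‖ = ‖T‖.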
Qamrul Hasan Ansari Remark 1.3.4. Advanced Functional Analysis Page 38 (a) We immediately observe that F (αx) = |α|2F (x) and |F (x)| ≤ kak kxk. (b) The norm of F is defined as kF k = sup x6=0 |F (x)| = sup |F (x)|. kxk2 kxk=1 Remark 1.3.5. If a(·, ·) is any fixed sesquilinear form and F (x) is an associated quadratic form on a Hilbert space X. Then (a) 1 2 [a(x, y) + a(y, x)] = F x+y 2 −F x−y 2 ; (b) a(x, y) = 14 [F (x + y) − F (x − y) + iF (x + iy) − iF (x − iy)]. Varification. By using linearity of the bilinear form a, we have F (x + y) = a(x + y, x + y) = a(x, x) + a(y, x) + a(x, y) + a(y, y) and F (x − y) = a(x − y, x − y) = a(x, x) − a(y, x) − a(x, y) + a(y, y). By subtracting the second of the above equation from the first, we get F (x + y) − F (x − y) = 2a(x, y) + 2a(y, x). (1.30) Replacing y by iy in (1.30), we obtain F (x + iy) − F (x − iy) = 2a(x, iy) + 2a(iy, x), or F (x + iy) − F (x − iy) = 2ia(x, y) + 2ia(y, x). (1.31) Multiplying (1.31) by i and adding it to (1.30), we get the result. Lemma 1.3.1. A bilinear form a(·, ·) : X × X → K is symmetric if and only if the associated quadratic functional F (x) is real. Proof. If a(x, y) is symmetric, then we have F (x) = a(x, x) = a(x, x) = F (x). This implies that F (x) is real. Conversely, let F (x) be real, then by Remark 1.3.5 (d) and in view of the relation F (x) = F (−x) = F (ix) Qamrul Hasan Ansari and Advanced Functional Analysis Page 39 F (x) = a(x, x), F (−x) = a(x, x) = a(−x, −x), we obtain, F (ix) = a(ix, ix) = iia(x, x) = a(x, x) , 1 [F (x + y) − F (y − x) + iF (y + ix) − iF (y − ix)] 4 1 [F (x + y) − F (x − y) + iF (x − iy) − iF (x + iy)] = 4 1 = [F (x + y) − F (x − y) + iF (x + iy) − iF (x − iy)] 4 = a(x, y). a(y, x) = Hence, a(·, ·) is symmetric. Lemma 1.3.2. A bilinear form a(·, ·) : X × X → K is bounded if and only if the associated quadratic form F is bounded. If a(·, ·) is bounded, then kF k ≤ kak ≤ 2kF k. Proof. Suppose that a(·, ·) is bounded. Then we have sup |F (x)| = sup |a(x, x)| ≤ kxk=1 kxk=1 sup kxk=kyk=1 |a(x, y)| = kak, and, therefore, F is bounded and kF k ≤ kak. On the other hand, suppose F is bounded. From Remark 1.3.5 (d) and the parallelogram law, we get 1 kF k(kx + yk2 + kx − yk2 + kx + iyk2 + kx − iyk2) 4 1 kF k2 kxk2 + kyk2 + kxk2 + kyk2 = 4 = kF k kxk2 + kyk2 , |a(x, y)| ≤ or sup kxk=kyk=1 |a(x, y)| ≤ 2kF k. Thus, a(·, ·) is bounded and kak ≤ 2kF k. Theorem 1.3.2. Let X be a Hilbert space and T : X → X be a bounded linear operator. Then the following statements are equivalent: (a) T is self-adjoint. (b) The bilinear form a(·, ·) on X defined by a(x, y) = hT (x), yi is symmetric. Qamrul Hasan Ansari Advanced Functional Analysis Page 40 Proof. (a) ⇒ (b): F (x) = hT (x), xi = hx, T (x)i = hT (x), xi = F (x). In view of Lemma 1.3.1, we obtain the result. (b) ⇒ (a): hT (x), yi = a(x, y) = a(y, x) = hT (y), xi = hx, T (y)i. This shows that T ∗ ≡ T that T is self-adjoint. Theorem 1.3.3. Let X be a Hilbert space. If a bilinear form a(·, ·) : X × X → K is bounded and symmetric, then kak = kF k, where F is the associated quadratic functional. The following theorem, known as the Lax-Milgram lemma proved by PD Lax and AN Milgram in 1954, has important applications in different fields. Theorem 1.3.4 (Lax-Milgram Lemma). Let X be a Hilbert space, a(·, ·) : X × X → R be a coercive bounded bilinear form, and f : X → R be a bounded linear functional. Then there exists a unique element x ∈ X such that a(x, y) = f (y), for all y ∈ X. (1.32) Proof. 
Since a(·, ·) is bounded, there exists a constant M > 0 such that |a(x, y)| ≤ Mkxk kyk. (1.33) By Theorem 1.3.1, there exists a bounded linear operator T : X → X such that a(x, y) = hT (x), yi, for all (x, y) ∈ X × X. By Riesz representation theorem4 , there exists a continuous linear functional f : X → R such that equation a(x, y) = f (y) can be rewritten as, for all λ > 0, hλT (x), yi = λhf, yi, (1.34) or This implies that hλT (x) − λf, yi = 0, for all y ∈ X. λT (x) = λf. (1.35) We will show that (1.35) has a unique solution by showing that for appropriate values of parameter ρ > 0, the affine mapping for y ∈ X, y 7→ y − ρ(λT (y) − λf ) ∈ X is a contraction mapping. For this, we observe that ky − ρλT (y)k2 = hy − ρλT (y), y − ρλT (y)i = kyk2 − 2ρhλT (y), yi + ρ2 kλT (y)k2 ≤ kyk2 − 2ραkyk2 + ρ2 M 2 kyk2, 4 (by applying inner product axioms) Riesz Representation Theorem. If f is a bounded linear functional on a Hilbert space X, then there exists a unique vector y ∈ X such that f (x) = hx, yi for all x ∈ X and kf k = kyk Qamrul Hasan Ansari Advanced Functional Analysis Page 41 as a(y, y) = hλT (y), yi ≥ αkyk2 (by the coercivity), (1.36) and kλT (y)k ≤ Mkyk (by boundedness of T ). Therefore, ky − ρλT (y)k2 ≤ (1 − 2ρα + ρ2 M 2 )kyk2 , (1.37) ky − ρλT (y)k ≤ (1 − 2ρα + ρ2 M 2 )1/2 kyk. (1.38) or Let S(y) = y − ρ (λT (y) − λf ). Then kS(y) − S(z)k = k(y − ρ(λT (y) − λf (u))) − (z − ρ(λT (z) − λf (u)))k = k(y − z) − ρ(λT (y − z))k ≤ (1 − 2ρα + ρ2 M 2 )1/2 ky − zk (by (1.38). (1.39) This implies that S is a contraction mapping if 0 < 1−2ρα+ρ2 M 2 < 1 which is equivalent to the condition that ρ ∈ (0, 2α/M 2). Hence, by the Banach contraction fixed point theorem, S has a unique fixed point which is the unique solution. Remark 1.3.6 (Abstract Variational Problem). Find an element x such that a(x, y) = f (y), for all y ∈ X, where a(x, y) and f are as in Theorem 1.3.4. This problem is known as abstract variational problem. In view of the Lax-Milgram lemma, it has a unique solution. 2 Spectral Theory of Continuous and Compact Linear Operators Let X and Y be linear spaces and T : X → Y be a linear operator. Recall that the range R(T ) and null space N (T ) of T are defined, respectively, as R(T ) = {T (x) : x ∈ X} and N (T ) = {x ∈ X : T (x) = 0}. The dimension of R(T ) is called the rank of T and the dimension of N (T ) is called the nullity of T . It can be easily seen that a linear operator T : X → Y is one-one if and only if N (T ) = {0}. Recall that c = = c0 = = c00 space of convergent sequences of real or complex numbers {{xn } ⊆ K : {xn } is convergent} space of convergent sequences of real or complex numbers that converge to zero {{xn } ⊆ K : xn → 0 as n → ∞} ∞ [ {{x1 , x2 , . . .} ⊆ K : xj = 0 for j ≥ k} = k=1 ℓ∞ = {{xn } ⊆ K : sup |xn | < ∞} n∈N Clearly, c00 ⊆ c0 ⊆ c ⊆ ℓ∞ . 42 Qamrul Hasan Ansari 2.1 Advanced Functional Analysis Page 43 Compact Linear Operators on Normed Spaces Recall that the set {T (x) : kxk ≤ 1} is closed and bounded if T : X → Y is a bounded linear operator from a normed space X to another normed space Y . However, if T : X → Y is bounded linear operator of finite rank, then the set {T (x) : kxk ≤ 1} is compact as every closed and bounded subset of a finite dimensional normed space is compact. But this is not true if the rank of T (dimension of R(T ) is called rank of T ) is infinity. For example, consider the identity operator I : X → X on an infinite dimensional normed space X, then the above set reduces to the closed unit ball {x ∈ X : kxk ≤ 1} which is not compact. 
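The non-compactness of the closed unit ball just claimed comes down, in ℓ² for instance, to the fact that the orthonormal vectors e_n satisfy ‖e_n − e_m‖ = √2 for n ≠ m, so the bounded sequence {e_n} has no Cauchy, hence no convergent, subsequence. The following minimal sketch, assuming NumPy and using a finite truncation of the basis purely for illustration, displays this distance computation; the truncated picture is of course finite dimensional, and only the computation is being shown.

```python
import numpy as np

# Finite truncation of the first N standard basis vectors e_1, ..., e_N of l^2.
# Each e_n lies in the closed unit ball, but ||e_n - e_m|| = sqrt(2) for n != m,
# so {e_n} can have no Cauchy (hence no convergent) subsequence.
N = 6
E = np.eye(N)
dists = [np.linalg.norm(E[i] - E[j]) for i in range(N) for j in range(i + 1, N)]
print(np.unique(np.round(dists, 12)))   # a single value: sqrt(2) = 1.414213562373...
```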
Let T : X → Y be a bounded linear operator from a normed space X to another normed space Y . Then for any r > 0, we have {T (x) : kxk ≤ r} is compact ⇔ {T (x) : kxk ≤ 1} is compact {T (x) : kxk < r} is compact ⇔ {T (x) : kxk < 1} is compact Definition 2.1.1 (Compact linear operator). Let X and Y be normed spaces. A linear operator T : X → Y is said be compact or completely continuous if the image T (M) of every bounded subset M of X is relatively compact, that is, T (M) is compact for every bounded subset M of X. Lemma 2.1.1. Let X and Y be normed spaces. (a) Every compact linear operator T : X → Y is bounded, and hence continuous. (b) If dimX = ∞, then the identity operator I : X → X (which is always continuous) is not compact. Proof. (a) Since the unit space S = {x ∈ X : kxk = 1} is bounded and T is a compact linear operator, T (S) is compact, and hence is bounded1 . Therefore, sup kT (x)k < ∞. kxk=1 Hence T is bounded and so it is continuous. (b) Note that the closed unit ball B = {x ∈ X : kxk ≤ 1} is bounded. If dimX = ∞, then B cannot be compact2 . Therefore, I(B) = B = B is not relatively compact. 1 2 Every compact subset of a normed space is closed and bounded The normed space X is finite dimensional if and only if the closed unit ball is compact Qamrul Hasan Ansari Advanced Functional Analysis Page 44 Exercise 2.1.1. Let X and Y be normed spaces and T : X → Y be a linear operator. Then prove that the following statements are equivalent. (a) T is a compact operator. (b) {T (x) : kxk < 1} is compact in Y . (c) {T (x) : kxk ≤ 1} is compact in Y . Proof. Clearly, (a) implies (b) and (c). Assume that (c) holds, that is, {T (x) : kxk ≤ 1} is compact in Y . Let M be a bounded subset of X. Then, there exists r > 0 such that M ⊆ {x ∈ X : kxk ≤ r}. Since T (M) ⊆ {T (x) ∈ Y : x ∈ X, kxk < r} ⊆ {T (x) ∈ Y : x ∈ X, kxk ≤ r}, and the fact that a closed subset of a compact set is compact, it follows that (c) implies (b) and (a), and (b) implies (a). Theorem 2.1.1 (Compactness criterion). Let X and Y be normed spaces and T : X → Y be a linear operator. Then T is compact if and only if it maps every bounded sequence {xn } in X onto a sequence {T (xn )} in Y which has a convergent subsequence. Proof. If T is compact and {xn } is bounded. Then we can assume that kxn k ≤ c for every n ∈ N and some constant c > 0. Let M = {x ∈ X : kxk ≤ c}. Then {T (xn )} is a sequence in the closure of {T (xn )} in Y which is compact, and hence it contains a convergent subsequence. Conversely, assume that every bounded sequence {xn } contains a subsequence {xnk } such that {T (xnk )} converges in Y . Let B be a bounded subset of X. To show that T (B) is compact, it is enough to prove that every sequence in it has a convergent subsequence. Suppose that {yn } be any sequence in T (B). Then yn = T (xn ) for some xn ∈ B and {xn } is bounded since B is bounded. By assumption, {T (xn )} contains a convergent subsequence. Hence T (B) is compact3 because {yn } in T (B) was arbitrary. It shows that T is compact. Remark 2.1.1. Sum T1 + T2 of two compact linear operators T1 , T2 : X → Y is compact. Also, for all α scalar, αT1 is compact. Therefore the set of compact linear operators, denoted by K(X, Y ) from a normed space X to another normed space Y forms a vector space. Exercise 2.1.2. Let X and Y be normed spaces. Prove that K(X, Y ) is a subspace of B(X, Y ) the space of all bounded linear operators from X to Y . Exercise 2.1.3. 
Let T : X → X be a compact linear operator and S : X → X be a bounded linear operator on a normed space X. Then prove that T S and ST are compact. 3 A set is compact if every sequence has a convergent subsequence Qamrul Hasan Ansari Advanced Functional Analysis Page 45 Proof. Let B be any bounded subset of X. Since S is bounded, S(B) is a bounded set and T (S(B)) = T S(B) is relatively compact because T is compact. Hence T S is a linear compact operator. To prove ST is also compact, let {xn } be any bounded sequence in S. Then {T (xn )} has convergent subsequence {T (xnk )} by Theorem 2.1.1 and {ST (xnk } is convergent. Hence ST is compact again by Theorem 2.1.1. Example 2.1.1. Let 1 ≤ p ≤ ∞ and X = ℓp . Let T : X → X be the right shift operator on X defined by 0, if i = 1, (T (x)(i)) := x(i − 1), if i > 1. Since T (en ) = en+1 , ken − em k = 21/p , 1, if 1 ≤ p < ∞, if = ∞, for all n, m ∈ N, n 6= m, it follows that, corresponding to the bounded sequence {en }, {A(en )} does not have a convergent subsequence. Hence, by Theorem 2.1.1, the operator T is not compact. Exercise 2.1.4. Prove that the left shift operator on ℓp space is not compact for any p with 1 ≤ p ≤ ∞. Definition 2.1.2. An operator T ∈ B(X, Y ) with dimT (X) < ∞ is called an operator of finite rank. Theorem 2.1.2 (Finite dimensional domain or range). Let X and Y be normed spaces and T : X → Y be a linear operator. (a) If T is bounded and dimT (X) < ∞, then the operator T is compact. That is, every bounded linear operator of finite rank is compact. (b) If dim(X) < ∞, then the operator T is compact. Proof. (a) Let {xn } be any bounded sequence in X. Then the inequality kT (xn )k ≤ kT k kxn k shows that the sequence {T (xn )} is bounded. Hence {T (xn )} is relatively compact4 since dimT (X) < ∞. It follows that {T (xn )} has a convergent subsequence. Since {xn } was an arbitrary bounded sequence in X, the operator T is compact by Theorem 2.1.1. (b) It follows from (a) by noting that dim(X) < ∞ implies the boundedness of T 5 . Exercise 2.1.5. Prove that the identity operator on a normed space is compact if and only if the space is of finite dimension. 4 5 In a finite dimensional space, a set is compact if and only if it is closed and bounded Every linear operator is bounded on a finite dimensional normed space X Qamrul Hasan Ansari Advanced Functional Analysis Page 46 Theorem 2.1.3 (Sequence of compact linear operators). Let {Tn } be a sequence of compact linear operators from a normed space X to a Banach space Y . If {Tn } is uniformly operator convergent to an operator T (that is, kTn − T k → 0), then the limit operator T is compact. Proof. Let {Tn } be a sequence in K(X, Y ) such that kTn − T k → 0 as n → ∞. In order to prove that T ∈ K(X, Y ), it is enough to show that for any bounded sequence {xn } in X, the image sequence {T (xn )} has a convergent subsequence, and then apply Theorem 2.1.1. Let {xn } be a bounded sequence in X, and ε > 0 be given. Since {Tn } is a sequence in K(X, Y ), there exists N ∈ N such that kTn − T k < ε, for all n ≥ N. Since TN ∈ K(X, Y ), there exists a subsequence {x̃n } of {xn } such that {TN (x̃n )} is convergent. In particular, there exists n0 ∈ N such that kTN (x̃n ) − TN (x̃m )k < ε, for all m, n ≥ n0 . Hence we obtain for n, m ≥ n0 kT (x̃n ) − T (x̃m )k ≤ kkT (x̃n ) − TN (x̃n )k + kTN (x̃n ) − TN (x̃m )k + kTN (x̃m ) − T (x̃m )k ≤ kT − TN k kx̃j k + kTN (x̃n ) − TN (x̃m )k + kTN − T k kx̃m k ≤ (2c + 1)ε, where c > 0 is such that kxn k ≤ c for all n ∈ N. 
This shows that {T (x̃n )} is a Cauchy sequence and hence converges since Y is complete. Remembering that {x̃n } is a subsequence of the arbitrary bounded sequence {xn }, we see that Theorem 2.1.1 implies compactness of the operator T . Remark 2.1.2. The above theorem does not hold if we replace unform operator convergence by strong operator convergence kTn (x) − T (x)k → 0. For example, consider the sequence Tn : ℓ2 → ℓ2 defined by Tn (x) = (ξ1 , . . . , ξn , 0, 0, . . .), where x = {ξj } ∈ ℓ2 . Since T is linear and bounded, Tn is compact by Theorem 2.1.2 (a). Clearly, Tn (x) → x = I(x), but I is not compact since dimℓ2 = ∞ (see Lemma 2.1.1 (b). Remark 2.1.3. As a particular case of the above theorem, we can say that if X is a Banach space, and if {Tn } is a sequence of finite rank operators in B(X) such that kTn − T k → 0 as n → ∞ for some T ∈ B(X), then T is a compact operator. By using the above theorem, we give the example of compact operator. Example 2.1.2. The operator T : ℓ2 → ℓ2 defined by T (x) = y, where x = {ξj } ∈ ℓ2 and y = {ηj } with ηj = ξ/j for all j = 1, 2, . . ., is a compact linear operator. Qamrul Hasan Ansari Advanced Functional Analysis Page 47 Clearly, if x = {ξj } ∈ ℓ2 , then y = {ηj } ∈ ℓ2 . Let Tn : ℓ2 → ℓ2 be defined by ξ1 ξ2 ξ3 ξn Tn (x) = , , , . . . , , 0, 0, . . . . 1 2 3 n Then Tn is linear and bounded, and is compact by Theorem 2.1.2 (a). Furthermore, ∞ X k(T − Tn )(x)k2 = j=n+1 |ηj |2 = ∞ X 1 2 |ξj | j j=n+1 ∞ X kxk2 1 2 |ξ | ≤ . j (n + 1)2 j=n+1 (n + 1)2 ≤ Taking the supremum over all x of norm 1, we see that kT − Tn k ≤ 1 . n+1 Hence Tn → T , and T is compact by Theorem 2.1.3. Example 2.1.3. Let {λn } be a sequence of scalars such that λn → 0 as n → ∞. Let T : ℓp → ℓp (1 ≤ p ≤ ∞) be defined by (T (x))(i) = λi x(i), for all x ∈ ℓp , i ∈ N. Then we see that T is a compact operator. For each n ∈ N, let (Tn (x))(i) = λi x(i), 0, if 1 ≤ i ≤ n if i > n. Then for each n, clearly Tn : ℓp → ℓp is a bounded operator of finite rank. In particular, each Tn is a compact operator. It also follows that k(T − Tn )(x)kp ≤ sup |λi | kxkp , for all x ∈ ℓp , n ∈ N. i>n Since λn → 0 as n → ∞, we obtain kT − Tn kp ≤ sup |λi | → 0 as n → ∞. i>n Then by Theorem 2.1.3 is a compact operator. Since T (en ) = λn en for all n ∈ N, T is of infinite rank whenever λn 6= 0 for infinitely many n. Exercise 2.1.6. Prove that the operator T defined in the Example 2.1.3 is not compact if λn → λ 6= 0 as n → ∞. Exercise 2.1.7. Show that the zero operator on any normed space is compact. Qamrul Hasan Ansari Advanced Functional Analysis Page 48 Exercise 2.1.8. If T1 , T2 : X → Y are compact linear operators from a normed space X to another normed space Y and α is a scalar, then show that T1 + T2 and αT1 are also compact linear operators. Exercise 2.1.9. Show that the projection of a Hilbert space H onto a finite dimensional subspace of H is compact. Exercise 2.1.10. Show that the operator T : ℓ2 → ℓ2 defined by T (x) = y, where x = {ξ1 , ξ2 , . . .} and y = {η1 , η2 , . . .} with ηi = ξi /2i , is compact. Exercise 2.1.11. Show that the operator T : ℓp → ℓp , 1 ≤ p < ∞, defined by T (x) = y, where x = {ξ1 , ξ2 , . . .} and y = {η1 , η2 , . . .} with ηi = ξi /i, is compact. Exercise 2.1.12. Show that the operator T : ℓ∞ → ℓ∞ defined by T (x) = y, where x = {ξ1 , ξ2 , . . .} and y = {η1 , η2 , . . .} with ηi = ξi /i, is compact. Theorem 2.1.4. Let X and Y be normed spaces and T : X → Y be a linear compact operator. Suppose that the sequence {xn } in X is weakly convergent, say, xn ⇀ x. 
Then {T (xn )} converges strongly to T (x) in Y . Proof. We write yn = T (xn ) and y = T (x). We first show that yn ⇀ y and then yn → y. Let g be any bounded linear functional on Y . We define a functional f on X by setting f (z) = g(T (z)), for all z ∈ X. Then f is linear. Also, f is bounded because T is compact, hence bounded, and |f (z)| = |g(T (z))| ≤ kgk kT (z)k ≤ kgk kT k kzk. By definition, xn ⇀ x implies f (xn ) → f (x), hence by the definition, g(T (xn )) → g(T (x)), that is, g(yn) → g(y). Since g was arbitrary, this proves yn ⇀ y. Now we prove yn → y. Assume that it does not hold. Then {yn } has a subsequence {ynk } such that kynk − yk ≥ δ, (2.1) for some δ > 0. Since {xn } is weakly convergent, {xn } is bounded, and so is {xnk }. Compactness of T implies that (by Theorem 2.1.1) {T (xnk )} has a convergent subsequence, say {ỹj }. Let ỹj → ỹ. Then of course ỹj ⇀ ỹ. Hence ỹ = y because yn ⇀ y. Consequently, kỹj − yk → 0 but kỹj − yk ≥ δ > 0 by (2.1). This contradicts, so that yn → y. Remark 2.1.4. In general, the converse of the above theorem does not hold. For example, consider the space X = ℓ1 , then by Schur’s lemma6 , every weakly convergent sequence in ℓ1 is convergent. Thus, every bounded operator on ℓ1 maps every weakly convergent sequence onto a convergent sequence. Obviously, every bounded operator on ℓ1 is not compact. However, if the space X is reflexive, then the converse of Theorem 2.1.4 does hold. 6 Schur’s Lemma. Every weakly convergent sequence in ℓ1 is convergent Qamrul Hasan Ansari Advanced Functional Analysis Page 49 Theorem 2.1.5. Let X and Y be normed spaces such that X is reflexive, and T : X → Y be a linear operator such that for any sequence {xn } in X, xn ⇀ x implies T (xn ) → T (x). Then T is a compact operator. Proof. It is enough to show that for every bounded sequence {xn } in X, {T (xn )} has a convergent sequence. Let {xn } be a bounded sequence in X. By Eberlein-Shmulyan theorem7 , {xn } has a weakly convergent subsequence, say {x̃n }. Then by hypothesis, {T (x̃n )} converges. Exercise 2.1.13. Let X be an infinite dimensional normed space and T :→ X be a compact linear operator. If λ is a nonzero scalar, then prove that λI − T is not a compact operator. Further deduce that the operator α3 α4 S : (α1 , α2 , . . .) 7→ α1 + α2 , α2 + , α3 + , . . . 2 3 is not a compact operator on ℓp , 1 ≤ p ≤ ∞. Exercise 2.1.14. Let 1 ≤ p ≤ ∞ and q be the conjugate exponent of p, that is, 1p + 1q = 1. Let (aij ) be an infinite matrix with aij ∈ K, i, j ∈ N. Show that the operator (T (x)(i)) = P ∞ p p p j=1 aij x(j), x ∈ ℓ , i ∈ N, is well defined and T : ℓ → ℓ is a compact operator in each of the following cases: (a) 1 ≤ p ≤ ∞, 1 ≤ r ≤ ∞ and P∞ (c) 1 < p ≤ ∞, 1 ≤ r ≤ ∞ and P∞ → 0 as i → ∞. r P P∞ (b) 1 ≤ p ≤ ∞, 1 ≤ r < ∞ and ∞ |a | < ∞. ij i=1 j=1 (d) 1 < p ≤ ∞, 1 ≤ r < ∞ and j=1 |aij | j=1 |aij | q P∞ P∞ i=1 → 0 as i → ∞. q j=1 |aij | r/q < ∞. Exercise 2.1.15. Let X be a Hilbert space and T : X → X be a bounded linear operator. Show that T is compact if and only if for every sequence {xn } in X hxn , ui → hx, ui, for all u ∈ X implies T (xn ) → T (x). Exercise 2.1.16. Let X and Y be infinite dimensional normed spaces. If T : X → Y is a surjective linear operator, then prove that T is compact. 7 Eberlein-Shmulyan Theorem. Every bounded sequence in reflexive space has a weakly convergent subsequence Qamrul Hasan Ansari 2.2 Advanced Functional Analysis Page 50 Eigenvalues and Eigenvectors Let X and Y be linear spaces and T : X → Y be a linear operator. 
Recall that the range R(T ) and null space N (T ) of A are defined, respectively, as R(T ) = {T (x) : x ∈ X} and N (T ) = {x ∈ X : T (x) = 0}. The dimension of R(T ) is called the rank of T and the dimension of N (T ) is called the nullity of T . It can be easily seen that a linear operator T : X → Y is one-one if and only if N (T ) = {0}. Definition 2.2.1. Let X be a linear space and T : X → X be a linear operator. A scalar λ ∈ K is called an eigenvalue of T if there exists a nonzero vector x ∈ X such that T (x) = λx. In this case, x is called an eigenvector of T corresponding to eigenvector λ. The set of all eigenvalues of T is known as eigenspectrum of T or point of spectrum, and it is denoted by σeig (T ). Thus, σeig (T ) := {λ ∈ K : ∃x 6= 0 such that T (x) = λx}. Remark 2.2.1. Note that λ ∈ σeig (T ) ⇔ N (T − λI) 6= {0}, and nonzero element of N (T − λI) are eigenvectors of T corresponding to the eigenvalue λ. The subspace N (T − λI) is called the eigenspace of T corresponding to the eigenvalue λ. Remark 2.2.2. A linear operator may not have any eigenvalue at all. For example, the linear operator T : R2 → R2 defined by T ((α1 , α2 )) = (α2 , −α1 ), for all (α1 , α2 ) ∈ R2 has no eigenvalue. Remark 2.2.3. It can be easy seen that • λ ∈ K is an eigenvalue of T if and only if the operator Tλ I is not injective; • λ ∈ K is an eigenvalue of T if and only if the operator Tλ I is not surjective. Qamrul Hasan Ansari Advanced Functional Analysis Page 51 Example 2.2.1. Let X be any of the sequence spaces c00 , c0 , c, ℓp . (a) Let {λn } be a bounded sequence of scalars. Let T : X → X be the diagonal operator defined by T (x)(j) = λj x(j), for all x ∈ X and j ∈ N. Then it is easy to see that, for λ ∈ K, the equation T (x) = λx is satisfied for a nonzero x ∈ X if and only if λ = λj for some j ∈ N. Hence, σeig (T ) = {λ1 , λ2 , . . .}. In fact, for n ∈ N, en ∈ X defined by en (j) = δij is an eigenvector of T corresponding to the eigenvalue λn . (b) Let T : X → X be the right shift operator, that is, T : (α1 , α2 , . . .) 7→ (0, α1 , α2 , . . .). Let λ ∈ K. Then the equation T (x) = λx is satisfied for some x = (α1 , α2 , . . .) ∈ X if and only if 0 = λα1 , αj = λαj+1, for all j ∈ N. This is possible only if αj = 0 for all j ∈ N. Thus, σeig (T ) = ∅. (c) Let T : X → X be the left shift operator, that is, T : (α1 , α2 , . . .) 7→ (α2 , α3 , . . .). Then for x = (α1 , α2 , . . .) ∈ X and λ ∈ K, T (x) = λx ⇔ αn+1 = λn α1 . From this, we can infer the following: Clearly, λ = 0 is an eigenvalue of T with a corresponding eigenvector e1 . Now suppose that λ 6= 0. If λ is an eigenvalue, then a corresponding eigenvector is of the form x = α1 (1, λ, λ2, λ3 . . .) for some nonzero α1 . Note that if α1 6= 0 and λ 6= 0, then x = α1 (1, λ, λ2, λ3 . . .) does not belong to c00 . Thus, if X = c00 , then σeig (T ) = {0}. Next consider the cases of X = c0 or X = ℓp for 1 ≤ p < ∞. In these cases, we see that (1, λ, λ2, λ3 . . .) ∈ X if and only if |λ| < 1, so that σeig (T ) = {λ : |λ| < 1}. For the case of X = c, we see that (1, λ, λ2 , λ3 . . .) ∈ X if and only if either |λ| < 1 or λ = 1. Thus, in this case σeig (T ) = {λ : |λ| < 1} ∪ {1}. If X = ℓ∞ , then (1, λ, λ2, λ3 . . .) ∈ X if and only if |λ| ≤ 1. Thus, in this case σeig (T ) = {λ : |λ| ≤ 1}. Qamrul Hasan Ansari Advanced Functional Analysis Page 52 Theorem 2.2.1. Let X be a normed space and T : X → X be a compact linear operator. Then zero is the only possible limit point of σeig (T ). In particular, σeig (T ) is a countable subset of K. Proof. 
Since σeig (T ) \ {0} = ∞ n [ n=1 o λ ∈ σeig (T ) : |λ| ≥ 1/n , it is enough to show that the set Er := {λ ∈ σeig (T ) : |λ| ≥ r} is finite for each r > 0. Assume that there is an r > 0 such that Er is an infinite set. Let {λn } be a sequence of distinct elements in Er , that is, {λn } be a sequence of distinct eigenvalues of T such that |λn | ≥ r. For n ∈ N, let xn be eigenvector of T corresponding to the eigenvalue λn , and let Xn := span{x1 , x2 , . . . , xn }, n ∈ N. Then each Xn is a proper closed subspace of Xn+1 . By Riesz Lemma8 , there exists a sequence {un } ∈ X such that un ∈ Xn , kun k = 1 for all n ∈ N and 1 dist(un , Xm ) ≥ , for all m < n. 2 Therefore, for every m, n ∈ N with m < n, we have kT (un ) − T (um )k = k(T − λn I)(un ) − (T − λm I)(um ) + λn xn − λm xm k = kλn un − [λm um + (T − λm I)(um ) − (T − λn I)(un )k Note that um ∈ Xm ⊆ Xn−1 and (T − λn I)(un ) ∈ Xn−1 , (T − λm I)(um ) ∈ Xm−1 ⊆ Xn−1 . Therefore, we have kT (un ) − T (um )k ≥ |λn |dist(un , Xn−1) ≥ |λn | r ≥ . 2 2 Thus, {T (un )} has no convergent subsequence, contradicting the fact that T is a compact operator. Let X be a normed space and T : X → X be a linear operator. Assume that λ is not an eigenvalue of T . Then we can say that for y ∈ X, the operator equation T (x) − λx = y can have atmost one solution which depends continuously on y. In other words, one would like to know that the inverse operator (T − λI)−1 : R(T − λI) → X 8 Riesz Lemma. Let X0 be a proper closed subspace of a normed space X. Then for every r ∈ (0, 1), there exists xr ∈ X such that kxr k = 1 and dist(xr , X0 ) ≥ r. Qamrul Hasan Ansari Advanced Functional Analysis Page 53 is continuous which is equivalent to say that the operator T − λI is bounded below, that is, there exists c > 0 such that kT (x) − λxk ≥ ckxk, for all x ∈ X. Motivated by the above requirement, we generalize the concept of eigenspectrum. Definition 2.2.2. Let X be a normed space and T : X → X be a linear operator. A scalar λ is said to be an approximate eigenvalue of T if T − λI is not bounded below. The set of all approximate eigenvalues of T is called the approximate eigenspectrum of T , and it is denoted by σapp (T ), that is, σapp (T ) = {λ ∈ K : T − λI not bounded below}. Remark 2.2.4. By the result9 , λ ∈ / σapp (T ) if and only if T −λI is injective and (T −λI)−1 : R(T − λI) → X is continuous. The following result provides the characterization of σapp (T ). Theorem 2.2.2. Let X be a normed space, T : X → X be a linear operator and λ ∈ K. Then λ ∈ σapp (T ) if and only if there exists a sequence {xn } in X such that kxn k = 1 for all n ∈ N, and kT (xn ) − λxn k → 0 as n → ∞. Proof. If λ ∈ / σapp (T ), that is, if there exists c > 0 such that kT (x) − λxk ≥ ckxk for all x ∈ X, then there would not exist any sequence {xn } in X such that kxn k = 1 for all n ∈ N and kT (xn ) − λxn k → 0 as n → ∞. Conversely, assume that λ ∈ σapp (T ), that is, there does not exist any c > 0 such that kT (x) − λxk ≥ ckxk for all x ∈ X. Then for all n ∈ N, there exists un ∈ X such that 1 kun k, for all n ∈ N. n un for all n ∈ N, then we have Clearly, un = 6 0 for all n ∈ N. Taking xn = kun k kT (un ) − λun k < kxn k = 1 for all n ∈ N and kT (xn ) − λxn k < 1 → 0 as n → ∞. n This completes the proof. 9 Let X and Y be normed spaces and T : X → Y be a linear operator. Then there exists γ > 0 such that kT (x)k ≥ γkxk for all x ∈ X if and only if T is injective and T −1 : R(T ) → X is continuous, and in that case, kT −1(y)k ≤ γ1 kyk for all y ∈ R(T ). 
Qamrul Hasan Ansari Advanced Functional Analysis Page 54 Theorem 2.2.3. Let X be a normed space and T : X → X be a linear operator. Then, σeig (T ) ⊆ σapp (T ). If X is a finite dimensional space, then σeig (T ) = σapp (T ). Proof. Clearly, λ ∈ / σapp (T ) implies T − λI is injective (one-one) so that λ ∈ / σeig (T ). Thus, σeig (T ) ⊆ σapp (T ). Now, assume that X is a finite dimensional space. If λ ∈ / σeig (T ), then T − λI is injective (one-one) so that using the finite dimensionality of X, it follows that T − λI is surjective as well, and hence the operator (T − λI)−1 is continuous. Consequently, T − λI is bounded below, that is, λ ∈ / σapp (T ). Thus, if X is finite dimensional, then σeig (T ) = σapp (T ). The following example illustrates that the strict inclusion in σeig (A) ⊆ σapp (A) can occur if the space X is infinite dimensional. Example 2.2.2. Let X be any of the sequence spaces c00 , c0 , c, ℓp with any norm satisfying ken k = 1 for all n ∈ N. Let T : X → X be defined by (T (x))(j) = λj x(j), for all x ∈ X and all j ∈ N. where {λn } is a bounded sequence of scalars. As in Example 2.2.1, we have σeig (T ) = {λ1 , λ2 , . . .}. Now assume that λn → λ as n → ∞. Then we have kT (en ) − λen k = |λn − λ| ken k = |λn − λ| → 0 as n → ∞. Thus, we can conclude that λ ∈ σapp (T ). Note that if λ 6= λn for every n ∈ N, then λ∈ / σeig (T ). Theorem 2.2.4. Let X be a normed space and T : X → X be a linear compact operator. Then the following assertions hold: (a) σapp (T )\{0} = σeig (T )\{0}. (b) If T is a finite rank operator, then σapp (T ) = σeig (T ). (c) If X is infinite dimensional, then 0 ∈ σapp (T ). (d) 0 is the only possible limit point of σapp (T ). Qamrul Hasan Ansari Advanced Functional Analysis Page 55 Proof. (a) We have already observed that σeig (T ) ⊆ σapp (T ). Now, suppose that 0 6= λ ∈ σapp (T ). We show that λ ∈ σeig (T ). Let {xn } be a sequence in X such that kxn k = 1 for every n ∈ N and kT (xn ) − λxn k → 0 as n → ∞. Since T is compact operator, there exists a subsequence {x̃n } of {xn } and y ∈ X such that T (x̃n ) → y. Hence, λx̃n = T (x̃n ) − (T (x̃n ) − λx̃n ) → y. Then it follows that kyk = |λ| and y = lim T (x̃n ) = T n→∞ so that T (y) = λy, showing that λ ∈ σeig (T ). y λ , (b) Suppose that T is a finite rank operator. In view of (a), it is enough to show that 0 ∈ σapp (T ) implies 0 ∈ σeig (A). Suppose that 0 ∈ / σeig (T ). Then T is injective so that by the hypothesis that T is of finite rank, X is finite dimensional. Therefore, σapp (T ) = σeig (T ), and consequently, 0 ∈ / σapp (T ). (c) Let X be infinite dimensional. Suppose that 0 ∈ / σapp (T ), that is, T is bounded below. We show that every bounded sequence in X has a Cauchy subsequence so that X would be of finite dimension, contradicting the assumption. Let {xn } be a bounded sequence in X. Since A is compact, there is subsequence {x̃n } of {xn } such that {T (x̃n )} converges. Since T is bounded below, it follows that {x̃n } is Cauchy subsequence of {xn }. (d) It follows from the proof of (a) and Theorem 2.2.1. From the above theorem part (c), we can observe that an operator defined on an infinite dimensional space is not compact. The following example illustrates this point of view. Example 2.2.3. Let X = ℓp with 1 ≤ p ≤ ∞. Let T be the right shift operator on X defined as T : (α1 , α2 , . . .) 7→ (0, α1 , α2 , . . .), or the diagonal operator on X defined as T : (α1 , α2 , . . .) 7→ (λ1 α1 , λ2 α2 , . . .) associated with a sequence {λn } of nonzero scalars which converges to a nonzero scalar. 
We know that T is not a compact operator but bounded below. Hence 0 ∈ / σapp (T ). Thus, the fact that T is not compact as follows from Theorem 2.2.4 (c). Qamrul Hasan Ansari Advanced Functional Analysis Page 56 We know that the range of an infinite rank compact operator on a Banach space is not closed. Does Theorem 2.2.4 (c) hold for every bounded operator with nonclosed range as well? The answer is in the affirmative if X is a Banach space, as the following theorem shows. Theorem 2.2.5. Let X be a Banach space and T : X → X be a bounded linear operator. If the range R(T ) of T is not closed in X, then 0 ∈ σapp (T ). Proof. The proof follows from result “Let T : X → Y be a bounded linear operator from a Banach space X to a normed space Y . If T is bounded below, then the range R(T ) of T is a closed subspace of Y .” Now we prove a topological property of σapp (T ). Theorem 2.2.6. Let X be a normed space and T : X → X be a bounded linear operator. Then σapp (T ) is a closed subset of K. Proof. Let {λn } be a sequence in σapp (T ) such that λn → λ for some λ ∈ K. Suppose that λ∈ / σapp (T ). Let c > 0 be such that kT (x) − λxk ≥ ckxk, for all x ∈ X. Observe that, for every x ∈ X, n ∈ N, kT (x) − λn xk = k(T (x) − λx) − (λn − λ)xk ≥ kT (x) − λxk − |λn − λ)|kxk ≥ (c − |λn − λ|)kxk. Thus, for all large enough n, T − λn I is bounded below. More precisely, let N ∈ N be such that |λn − λ| ≤ c/2 for all n ≥ N. Then we have c kT (x) − λn xk ≤ kxk, 2 for all x ∈ X and all n ≥ N, which shows that λn ∈ / σapp (T ) for all n ≥ N. Thus, we arrive at a contradiction. The above result, in particular, shows that if {λn } is a sequence of eigenvalues of T ∈ B(X) (T : X → X is bounded linear operator) such that λn → λ, then λ is an approximate eigenvalue. One may ask whether every approximate eigenvalue arises in this manner. The answer is, in general, negative, as the following examples shows. Example 2.2.4. Let X = ℓ1 and T be the right shift operator on ℓ1 . Then we know that σeig (T ) = ∅. We show that σapp (T ) 6= ∅. Let {xn } in ℓ1 be defined by xn (j) = 1 , n 0, if j ≤ n, if j > n. Qamrul Hasan Ansari Advanced Functional Analysis Page 57 Then we see that kxn k = 1 for all n ∈ N, and kT (xn ) − xn k1 = 2/n → 0 as n → ∞ so that 1 ∈ σapp (T ). Few other examples of operators describing eigenspectrum and approximate eigenspctrum completely are given in the book by M. T. Nair: Functional Analysis: A First Course, Prentice-Hall of India Private Limited, New Delhi, 2002. Qamrul Hasan Ansari 2.3 Advanced Functional Analysis Page 58 Resolvent Operators Let X be a normed space and T : X → X be a linear operator. We have seen in Remark 2.2.4 that λ ∈ / σapp (T ) if and only if T − λI is injective and (T − λI)−1 : R(T − λI) → X is continuous. That is, a scalar λ is not an approximate eigenvalue of T if and only if for every y ∈ R(T − λI), there exists a unique x ∈ X such that T (x) − λx = y, and the map y 7→ x is continuous. Thus, if x and y are as above, and if {yn } is a sequence in R(T − λI) such that yn → y, and {xn } in X satisfies T (xn ) − λxn = yn , then xn → x. One would like to have the above situation not only for every y ∈ R(T − λI), but also for every y ∈ X. Motivated by this requirement, we have the concept of spectrum of T . Definition 2.3.1. The resolvent set of T , denoted by ρ(T ), is defined as ρ(T ) = {λ ∈ K : T − λI is bijective and (T − λI)−1 ∈ B(X)}, where B(X) denotes the set of all bounded linear operators from X into itself. 
The complement of ρ(T ) in K is called the spectrum of T and is denoted by σ(T ). Thus, λ ∈ σ(T ) if and only if either T − λI is not bijective or else (T − λI)−1 ∈ / B(X). The elements of the spectrum are called the spectral values of T . We observe that, for T ∈ B(X), 0 ∈ ρ(T ) ⇔ ∃S ∈ B(X) such that T S = I = ST, and, in that case, S = T −1 . If 0 ∈ ρ(T ), then we say that T is invertible in B(X). We note that if T, S ∈ B(X) are invertible, then T S is invertible, and (T S)−1 = S −1 T −1 . In view of Proposition A10 , if λ ∈ ρ(T ), then T − λI is bounded below. Hence, every approximate eigenvalue is a spectral value, that is, σapp (T ) ⊆ σ(T ). 10 Proposition A: Let X and Y be normed spaces and T : X → Y be a linear operator. Then there exists γ > 0 such that kT (x)k ≥ γkxk for all x ∈ X if and only if T is injective and T −1 : R(T ) → X is continuous, and in that case, kT −1 (y)k ≤ γ1 kyk for all y ∈ R(T ). Qamrul Hasan Ansari Advanced Functional Analysis Page 59 Clearly, if X is a finite dimensional space, then σeig (T ) = σapp (T ) = σ(T ). We have seen examples of infinite rank operators T for which σeig (T ) 6= σapp (T ). The following example shows that strict inclusion is possible in σapp (T ) ⊆ σ(T ) as well. Example 2.3.1. Let X = ℓp , 1 ≤ p ≤ ∞, and T be the right shift operator on X. We have seen in Example 2.2.3 that 0 ∈ / σapp (T ). But 0 ∈ σ(T ), since T is not onto. In fact, e1 ∈ / R(T ). Now we give some characterizations of the spectrum. Theorem 2.3.1. Let X be a Banach space, T : X → X be a bounded linear operator and λ ∈ K. Then λ ∈ σ(T ) if and only if either λ ∈ σapp (T ) or R(T − λI) is not dense in X. Proof. Clearly, if λ ∈ σapp (T ) or R(T − λI) is not dense in X, then λ ∈ σ(T ). Conversely, suppose that λ ∈ σ(T ). If λ ∈ / σapp (T ), then by Proposition A11 , Proposition 12 B , the operator T − λI is injective, and its inverse (T − λI)−1 : R(T − λI) → X is continuous, and R(T − λI) is closed. Hence, R(T − λI) is not dense in X; otherwise, T − λI would become bijective and (T − λI)−1 ∈ B(X), which is a contradiction to the assumption that λ ∈ σ(T ). 11 Proposition A: Let X and Y be normed spaces and T : X → Y be a linear operator. Then there exists γ > 0 such that kT (x)k ≥ γkxk for all x ∈ X if and only if T is injective and T −1 : R(T ) → X is continuous, and in that case, kT −1 (y)k ≤ γ1 kyk for all y ∈ R(T ). 12 Proposition B: Let T : X → Y be a bounded linear operator from a Banach space X to a normed space Y . If T is bounded below, then the range R(T ) of T is a closed subspace of Y Qamrul Hasan Ansari 2.4 Advanced Functional Analysis Page 60 Spectral Theory of Compact Linear Operators Theorem 2.4.1 (Null Space). Let X be a normed space and T : X → X be a linear compact operator. Then for every λ 6= 0, the null space N (Tλ ) = {x ∈ D(Tλ ) : Tλ (x) = 0} of Tλ = T − λI is finite dimensional. Proof. We prove it by showing that the closed unit ball B = {x ∈ N (Tλ ) : kxk ≤ 1} is compact as a normed space is finite dimensional if the closed unit ball in it is compact. Let {xn } be in B. Then {xn } is bounded as kxn k ≤ 1. Since T is compact, by Theorem 2.1.4, {T (xn )} has a convergent subsequence {T (xnk )}. Now xn ∈ B ⊂ N (Tλ ) implies Tλ (xn ) = T (xn )−λxn = 0, so that xn = λ−1 T (xn ) because λ 6= 0. Consequently, {xnk } = {λ−1 T (xnk )} also converges and its limit lies in B as B is closed. Since {xn } was arbitrary, it says that every sequence in B has convergent subsequence, and therefore, B is compact. This implies that domN (T ) < ∞. Theorem 2.4.2. 
Let X be a normed space and T : X → X be a linear compact operator. Then for every λ 6= 0, the range of Tλ = T − λI is closed. Proof. The proof is divided into three steps. / Tλ (X) and a Step 1. Suppose that Tλ (X) is not closed. Then there is a y ∈ Tλ (X), y ∈ sequence {xn } in X such that yn = Tλ (xn ) → y. (2.2) Since Tλ (X) is a vector space, 0 ∈ Tλ (X). But y ∈ / Tλ (X), so that y 6= 0. This implies that yn 6= 0 and xn ∈ / N (Tλ ) for all sufficiently large n. Without loss of generality, we may assume that this holds for all n. Since N (Tλ) is closed, the distance δn from xn to N (Tλ ) is positive, that is, δn = inf kxn − zk > 0. z∈N (Tλ ) By the definition of an infimum, there is a sequence {zn } in N (Tλ) such that an = kxn − zn k < 2δn . (2.3) Step 2. We show that an = kxn − zn k → ∞, as n → ∞. (2.4) Assume that it does not hold. Then {xn −zn } has bounded subsequence. Since T is compact, it follows from Theorem 2.1.1 that {T (xn − zn )} has a convergent subsequence. Now from Tλ = T − λI and λ 6= 0, we have I = λ−1 (T − Tλ ). Since zn ∈ N (Tλ), we have Tλ (zn ) = 0 and thus we obtain 1 1 xn − zn = (T − Tλ )(xn − zn ) = [T (xn − zn ) − Tλ (xn )]. λ λ Qamrul Hasan Ansari Advanced Functional Analysis Page 61 {T (xn − zn )} has convergent subsequence and {Tλ (xn )} converges by (2.2); hence {xn − zn } has convergent subsequence, say, xnk − znk → v. Since T is compact, T is continuous and so is Tλ . Hence Tλ (xnk − znk ) → Tλ (v). Here Tλ (znk ) = 0 because zn ∈ N (Tλ), so by (2.2) we also have Tλ (xnk − znk ) = Tλ (xnk ) → y. hence Tλ (v) = y. Thus y ∈ Tλ (X), which contradicts y ∈ / Tλ (X) (we assumed it in Step 1). This is a contradiction and hence an = kxn − zn k → ∞ as n → ∞. Step 3. Using an as in (2.4) and setting wn = 1 (xn − zn ), an (2.5) we have kwn k = 1. Since an → ∞, whereas Tλ (zn ) = 0 and {Tλ (zn )} converges, it follows that 1 Tλ (wn ) = Tλ (xn ) → 0, (2.6) an Using again I = λ−1 (T − Tλ ), we obatin wn = 1 T (wn ) − Tλ (wn )). λ (2.7) Since T is compact and {wn } is bounded, {T (wn )} has convergent subsequence. Furthermore, {Tλ (wn )} converges by (2.6). Hence (2.7) shows that {wn } has a convergent subsequence, say wnj → w. (2.8) A comparison with (2.6) implies that Tλ (w) = 0. Hence w ∈ N (Tλ ). Since zn ∈ N (Tλ ), also un = zn + an w ∈ N (Tλ ). Hence for the distance from xn to un , we must have kxn − un k ≥ δn . Writing un out and using (2.5) and (2.3), we thus obtain δn ≤ = = < Dividing by 2δn > 0, we have 1 2 kxn − zn − an wk kan wn − an wk an kwn − wk 2δn kwn − wk. < kwn −wk. This contradicts (2.8) and proves the result. Exercise 2.4.1. Let X be a normed space and T :→ X be a linear operator. Let λ ∈ K be such that T − λI is injective. Show that (T − λI)−1 : R(T − λI) → X is continuous if and only if λ is not an approximate eigenvalue. Qamrul Hasan Ansari Advanced Functional Analysis Page 62 Exercise 2.4.2. Let X be a normed space and T : X → X be a bounded linear operator. Let λ ∈ K be such that |λ| > kT k. Show that (a) T − λI is bounded below, (b) R(T − λI) is dense in X, (c) (T − λI)−1 : R(T − λI) → X is continuous. Exercise 2.4.3. Let X be a Banach space and T : X → X be a bounded linear operator. Show that λ ∈ σeig (T ) if and only if there exists a nonzero operator S ∈ B(X) such that (T − λI)S = 0. Exercise 2.4.4. Give an example of a bijective operator T on a normed space X such that 0 ∈ σ(A). 
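Exercise 2.4.2 is closely related to the classical Neumann series: if |λ| > ‖T‖, then λ ∈ ρ(T) and (T − λI)⁻¹ = −(1/λ) Σ_{k≥0} (T/λ)^k, since ‖T/λ‖ < 1. The following minimal sketch (an illustrative addition using a hypothetical 3 × 3 nilpotent, shift-like matrix as a finite-dimensional stand-in for T) compares the partial sums of this series with the directly computed inverse.

import numpy as np

# Illustrative sketch: Neumann series for the resolvent when |lam| > ||T||,
# namely (T - lam*I)^{-1} = -(1/lam) * sum_{k>=0} (T/lam)^k.
T = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0],
              [0.0, 0.0, 0.0]])           # nilpotent "shift-like" stand-in, ||T|| = 1
lam = 2.0                                 # |lam| > ||T||, so the series converges

S = np.zeros_like(T)
term = np.eye(3)
for k in range(50):                       # partial sum of the Neumann series
    S += term
    term = term @ (T / lam)
resolvent_series = -S / lam

resolvent_direct = np.linalg.inv(T - lam * np.eye(3))
print(np.allclose(resolvent_series, resolvent_direct))   # True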
3 Differential Calculus on Normed Spaces 3.1 Directional Derivatives and Their Properties Throughout this section, unless otherwise specified, we assume that X is a real vector space and f : X → R ∪ {±∞} is an extended real-valued function. In this section, we discuss directional derivatives of f and present some of their basic properties. Definition 3.1.1. Let f : X → R ∪ {±∞} be a function and x ∈ Rn be a point where f is finite. (a) The right-sided directional derivative of f at x in the direction d ∈ X is defined by f+′ (x; d) = lim+ t→0 f (x + td) − f (x) , t if the limit exists in [−∞, +∞], that is, finite or not. (b) The left-sided directional derivative of f at x in the direction d ∈ X is defined by f−′ (x; d) = lim− t→0 f (x + td) − f (x) , t if the limit exists in [−∞, +∞], that is, finite or not. For d = 0 the zero vector in X, f+′ (x; 0) = f−′ (x; 0) = 0. Since f+′ (x; −d) = lim+ t→0 f (x − td) − f (x) f (x + τ d) − f (x) = lim− = −f−′ (x; d), τ →0 t −τ 63 Qamrul Hasan Ansari Advanced Functional Analysis Page 64 we have −f+′ (x; −d) = f−′ (x; d). If f+′ (x; d) exists and f+′ (x; d) = f−′ (x; d), then it is called the directional derivative of f at x in the direction d. Thus, the directional derivative of f at x in the direction d ∈ X is defined by f (x + td) − f (x) , f ′ (x; d) = lim t→0 t provided the limit exists in [−∞, +∞], that is, finite or not. Remark 3.1.1. (a) If f ′ (x; d) exists, then f ′ (x; −d) = −f ′ (x; d). (b) If f : Rn → R is differentiable, then the directional derivative of f at x ∈ X in the direction d is given by ′ f (x; d) = n X i=1 di ∂f (x) = h∇f (x), di. ∂xi In particular, if d = (0, 0, . . . , 0, 1, 0, . . . , 0, 0) = ei , where 1 is at the ith place, then ∂f (x) the partial derivative of f with respect to xi . f ′ (x; ei ) = ∂xi For an extended convex function1 f : X → R ∪ {±∞}, the following proposition shows that f (x + td) − f (x) is monotonically increasing on (0, ∞). the function t 7→ t Proposition 3.1.1. Let f : X → R ∪ {±∞} be an extended real-valued convex function and x be a point in X where f is finite. Then, for each direction d ∈ X, function f (x + td) − f (x) t 7→ is monotonically nondecreasing on (0, ∞). t Proof. Let x ∈ X be any point such that f (x) is finite, and s, t ∈ (0, ∞) with s ≤ t. Then, by convexity of f , we have s s x (x + td) + 1 − f (x + sd) = f t t s s ≤ f (x + td) + 1 − f (x). t t It follows that f (x + sd) − f (x) f (x + td) − f (x) ≤ . s t f (x + td) − f (x) Thus, function t 7→ is monotonically nondecreasing on (0, ∞). t 1 A function f : X → R ∪ {±∞} is said to be convex if for all x, y ∈ X with f (x), f (y) 6= ±∞, and all α ∈ [0, 1], f (αx + (1 − α)y) ≤ αf (x) + (1 − α)f (y). Qamrul Hasan Ansari Advanced Functional Analysis Page 65 The following result ensures the existence of f+′ (x; d) and f−′ (x; d) when f is a convex function. Proposition 3.1.2. Let f : X → R ∪ {±∞} be an extended real-valued convex function and x be a point in X where f is finite. Then, f+′ (x; d) and f−′ (x; d) exist for every direction d ∈ X. Also, f (x + td) − f (x) f+′ (x; d) = inf , (3.1) t>0 t and f (x + td) − f (x) f−′ (x; d) = sup . (3.2) t t<0 Proof. Let x ∈ X be any point such that f (x) is finite. For given t > 0, by the convexity of f , we have 1 t (x − d) + (x + td) f (x) = f 1+t 1+t t 1 f (x − d) + f (x + td) ≤ 1+t 1+t 1 = (tf (x − d) + f (x + td)) . 1+t It follows that (1 + t)f (x) ≤ tf (x − d) + f (x + td), and so, f (x + td) − f (x) ≥ f (x) − f (x − d). 
t f (x + td) − f (x) , as t → 0+ , is bounded below by t the constant f (x) − f (x − d). Thus, the limit in the definition of f+′ (x; d) exists and is given by f (x + td) − f (x) f (x + td) − f (x) f+′ (x; d) = lim+ = inf . t>0 t→0 t t Since f+′ (x; d) exists in every direction d, the equality −f+′ (x; −d) = f−′ (x; d) implies that f−′ (x; d) exists in every direction d. Hence the decreasing sequence of values The relation (3.2) can be established on the lines of the proof given to derive (3.1). Proposition 3.1.3. Let f : X → R ∪ {±∞} be an extended real-valued convex function and x be a point in X where f is finite. Then, f+′ (x; d) is a convex and positively homogeneous functiona of d and f−′ (x; d) ≤ f+′ (x; d). (3.3) A function f : X → R is said to be (a) convex if for all x, y ∈ X and all α ∈ [0, 1], f (αx+(1−α)y) ≤ αf (x) + (1 − α)f (y); (b) positive homogeneous if for all x ∈ X and all r ≥ 0, f (rx) = rf (x). a Qamrul Hasan Ansari Advanced Functional Analysis Page 66 Proof. Let λ > 0 be a real number. Then, f+′ (x; λd) = lim+ λt→0 λ(f (x + λtd) − f (x)) = λf+′ (x; d). λt Hence, f+′ (x; ·) is positively homogeneous. Similarly, we can show that f−′ (x; ·) is also positively homogeneous. Next, we show that f+′ (x; ·) is convex. Let d1 , d2 ∈ X and λ1 , λ2 ≥ 0 be such that λ1 +λ2 = 1. From the convexity of f , we have f (x + t(λ1 d1 + λ2 d2 )) − = = ≤ = f (x) f ((λ1 + λ2 )x + t(λ1 d1 + λ2 d2 )) − (λ1 + λ2 )f (x) f (λ1 (x + td1 ) + λ2 (x + td2 )) − λ1 f (x) − λ2 f (x) λ1 f (x + td1 ) + λ2 f (x + td2 ) − λ1 f (x) − λ2 f (x) λ1 (f (x + td1 ) − f (x)) + λ2 (f (x + td2 ) − f (x)) for all sufficiently small t. Dividing by t > 0 and letting t → 0+ , we obtain f+′ (x; λ1 d1 + λ2 d2 ) ≤ λ1 f+′ (x; d1 ) + λ2 f+′ (x; d2 ). Hence f+′ (x; d) is convex in d. By subadditivity of f+′ (x; d) in d with f+′ (x; d) < +∞ and f+′ (x; −d) < +∞, we obtain f+′ (x; d) + f+′ (x; −d) ≥ f+′ (x; 0) = 0, and thus, f+′ (x; d) ≥ −f+′ (x; −d) = f−′ (x; d). If f+′ (x; d) = +∞ or f+′ (x; −d) = +∞, then the inequality (3.3) holds trivially. Corollary 3.1.1. Let f : X → R ∪ {±∞} be an extended real-valued convex function and x be a point in X where f is finite. Then, for each direction d ∈ X, f (x + td) − f (x) . t∈(0,∞) t f ′ (x; d) = inf Proposition 3.1.4. Let f : X → R ∪ {±∞} be an extended real-valued convex function and x be a point in X where f is finite. Then the following assertions hold: (a) f ′ (x; ·) is sublinear.a (b) For every y ∈ X, f ′ (x; y − x) + f (x) ≤ f (y). (3.4) Qamrul Hasan Ansari Advanced Functional Analysis Page 67 A function f : X → R is said to be sublinear if f (λx) = λf (x) and f (x + y) ≤ f (x) + f (y) for all x, y ∈ X and all λ ≥ 0. a Proof. (a) It follows from Proposition 3.1.3. (b) If y is not in Dom(f ), then the inequality (3.4) trivially holds. So, let y ∈ Dom(f ). For t ∈ (0, 1), we have f ((1 − t)x + ty) − f (x) ≤ t(f (y) − f (x)), which implies that f ((1 − t)x + ty) − f (x) ≤ f (y) − f (x). t Letting limit as t → 0. we obtain f ′ (x; y − x) + f (x) ≤ f (y). Corollary 3.1.2. Let f : Rn → R ∪ {+∞} be an extended real-valued convex function and x ∈ Rn be such that f (x) is finite and f is differentiable at x. Then, f (y) ≥ f (x) + h∇f (x), y − xi, for all y ∈ X, where ∇f (x) denotes the gradient of f at x. Corollary 3.1.3. Let f : X → R ∪ {+∞} be an extended real-valued convex function and x, y ∈ X be such that f (x) and f (y) are finite. Then, f+′ (y; y − x) ≥ f+′ (x; y − x), (3.5) f−′ (y; y − x) ≥ f−′ (x; y − x). (3.6) h∇f (y) − ∇f (x), y − xi ≥ 0. 
(3.7) and In particular, if f : Rn → R is differentiable at x and y, then Proof. From Corollary 3.1.2, we have f (y) ≥ f (x) + f+′ (x; y − x), (3.8) f (x) ≥ f (y) + f+′ (y; x − y). (3.9) and Qamrul Hasan Ansari Advanced Functional Analysis Page 68 By adding inequalities (3.8) and (3.9), we obtain −f+′ (y; x − y) ≥ f+′ (x; y − x). Since −f+′ (x; −d) = f−′ (x; d), by using inequality (3.3), we get f+′ (y; y − x) ≥ f−′ (y; y − x) = −f+′ (y; x − y) ≥ f+′ (x; y − x). Hence, the inequality (3.5) holds. Similarly, we can establish the inequality (3.6). The inequality (3.7) holds using Remark 3.1.1 (b). Qamrul Hasan Ansari 3.2 Advanced Functional Analysis Page 69 Gâteaux Derivative and Its Properties Definition 3.2.1. Let X be a normed space. A function f : X → (−∞, ∞] is said to be Gâteaux2 differentiable at x ∈ int(Dom(f )) if there exists a continuous linear functional, denoted by fG′ (x), on X such that f ′ (x; d) = fG′ (x)(d), that is, lim t→0 fG′ (x) at d. for all d ∈ X, (3.10) f (x + td) − f (x) exists for all d ∈ X and it is equal to the value of the functional t The continuous linear functional fG′ (x) : X → R is called the Gâteaux derivative of f at x. fG′ (x; d) is called the value of the Gâteaux derivative of f at x in the direction d. Similarly, the Gâteaux derivative of an operator T : X → Y from a normed space X to another normed space Y can be defined as follows: Definition 3.2.2. Let X and Y be normed spaces. An operator T : X → Y is said to be Gâteaux differentiable at x ∈ int(Dom(T )) if there exists a continuous linear operator TG′ (x) : X → Y such that T (x + td) − T (x) = TG′ (x)(d), t→0 t lim for all d ∈ X. (3.11) The continuous linear operator TG′ (x) : X → Y is called the Gâteaux derivative of T at x. TG′ (x; d) is called the value of the Gâteaux derivative of T at x in the direction d. The relation (3.11) is equivalent to the following relation lim t→0 T (x + td) − T (x) − TG′ (x; d) = 0. t (3.12) Remark 3.2.1. If fG′ (x; d) exists, then fG′ (x; −d) = −fG′ (x; d). Remark 3.2.2. If X = Rn is an Euclidean space with the standard inner product. If f : Rn → R has continuous partial derivatives of order 1, then f is Gâteaux differentiable at x = (x1 , x2 , . . . , xn ) ∈ Rn and in the direction d = (d1 , d2 , . . . , dn ) ∈ Rn , and it is given by fG′ (x; d) = n X ∂f (x) k=1 2 ∂xk dk , René Gâteaux (1889-1914) had died in the First World War and his work was published by Lévy in 1919 with some improvement. Qamrul Hasan Ansari Advanced Functional Analysis Page 70 ∂f (x) denotes a partial derivative of f at the point x with respect to xk . Thus, ∂xk ∂f (x) ∂f (x) ∂f (x) ∇G f (x) = , ,..., is gradient of f at the point x. ∂x1 ∂x2 ∂xn where Remark 3.2.3. Let X = Rn and Y = Rm be Euclidean spaces with the standard inner product. If T : Rn → Rm be given by T = (f1 , f2 , . . . , fm ) and A = (aij ) be a m × n matrix, where fi : Rn → R be functions for each i = 1, 2, . . . , m. Let d = ej = (0, 0, . . . , 1, . . . , 0, 0) where 1 at jth place. Then lim t→0 T (x + td) − T (x) − Ad = 0 t implies that fi (x + tej ) − fi (x) − aij = 0, t→0 t for all i = 1, 2, . . . , m and all j = 1, 2, . . . , n. This shows that fi has partial derivatives at x and ∂fi (x) = aij , for i = 1, 2, . . . , m and j = 1, 2, . . . , n. ∂xj Hence ∂f1 (x) . . . ∂f∂x1 (x) ∂x1 n .. .. .. TG′ (x) = . . . . lim ∂fm (x) ∂x1 ... ∂fm (x) ∂xn We establish that the Gâteaux derivative is unique. Proposition 3.2.1. Let X and Y be normed spaces, T : X → Y be an operator and x ∈ int(Dom(T )). 
The Gâteaux derivative TG′ (x) of T at x is unique, provided it exists. Proof. Assume that there exist two continuous linear operator TG′ (x) and TG∗′ (x) which satisfy (3.12). Then, for all d ∈ X, and for sufficiently small t, we have T (x + td) − T (x) ′ ∗′ ′ kTG (x; d) − TG (x; d)k = − TG (x; d) t T (x + td) − T (x) ∗′ − − TG (x; d) t T (x + td) − T (x) − TG′ (x; d) ≤ t T (x + td) − T (x) + − TG∗′ (x0 ; d) t → 0 as t → 0. Therefore, kTG′ (x; d) − TG∗′ (x; d)k = 0 for all d ∈ X. Hence, TG′ (x; d) = TG∗′ (x; d), and thus, TG′ (x) ≡ TG∗′ (x). Qamrul Hasan Ansari Advanced Functional Analysis Page 71 Theorem 3.2.1. Let K be a nonempty open convex subset of a normed space X and f : K → R be a convex function. If f is Gâteaux differentiable at x ∈ K, then fG′ (x; d) is linear in d. Conversely, if f+′ (x; d) is linear in d, then f is Gâteaux differentiable at x. Proof. Let f be Gâteaux differentiable at x ∈ K, then for all d ∈ X −f+′ (x; −d) = f−′ (x; d) = f+′ (x; d). Therefore, for all d, u ∈ X, we have f+′ (x; d) + f+′ (x; u) ≥ = ≥ = f+′ (x; d + u) −f+′ (x; −(d + u)) −f+′ (x; −d) − f+′ (x; −u) f+′ (x; d) + f+′ (x; u), and thus, f+′ (x; d + u) = f+′ (x; d) + f+′ (x; u). Since fG′ (x; d) = f+′ (x; d) = f−′ (x; d), we have fG′ (x; d + u) = fG′ (x; d) + f G (x; u). For α ∈ R with α 6= 0, we have α(f (x + tαd) − f (x)) = αfG′ (x; d). αt→0 αt fG′ (x; αd) = lim Hence fG′ (x; d) is linear in d. Conversely, assume that f+′ (x; d) is linear in d. Then, 0 = f+′ (x; d − d) = f+′ (x; d) + f+′ (x; −d). Therefore, for all d ∈ X, we have f−′ (x; d) = −f+′ (x; −d) = f+′ (x; d). Thus, f is Gâteaux differentiable at x. Remark 3.2.4. (a) A nonconvex function f : X → R may be Gâteaux differentiable at a point but the Gâteaux derivative may not be linear at that point. For example, consider the function f : R2 → R defined by ( x2 x 1 2 , if x 6= (0, 0), x21 +x22 f (x) = 0, if x = (0, 0), Qamrul Hasan Ansari Advanced Functional Analysis Page 72 where x = (x1 , x2 ). For d = (d1 , d2 ) 6= (0, 0) and t 6= 0, we have f ((0, 0) + t(d1 , d2)) − f (0, 0) d2 d2 = 2 1 2. t d1 + d2 Then, d2 d2 f ((0, 0) + t(d1 , d2)) − f (0, 0) = 2 1 2. t→0 t d1 + d2 fG′ ((0, 0); d) = lim Therefore, f is Gâteaux differentiable at (0, 0), but fG′ ((0, 0); d) is not linear in d. (b) For a real valued function f defined on Rn , the partial derivatives may exist at a point but f may not be Gâteaux differentiable at that point. For example, consider the function f : R2 → R defined by ( xx 1 2 , if x 6= (0, 0), x21 +x22 f (x) = 0, if x = (0, 0), where x = (x1 , x2 ). For d = (d1 , d2 ) 6= (0, 0) and t 6= 0, we have f ((0, 0) + t(d1 , d2 )) − f (0, 0) d1 d2 = . t t(d21 + d22 ) Then, d1 d2 f ((0, 0) + t(d1 , d2 )) − f (0, 0) = lim , 2 t→0 t(d2 t→0 t 1 + d2 ) lim exists only if d = (d1 , 0) or d = (0, d2). That is, fG′ (0; 0) does not exist but 0= ∂f (0, 0) , where 0 = (0, 0) is the zero vector in R2 . ∂x2 ∂f (0, 0) = ∂x1 (c) The existence, linearity and continuity of fG′ (x; d) in d do not imply the continuity of the function f . For example, consider the function f : R2 → R defined by ( x3 1 , if x1 6= 0 and x2 6= 0, x2 f (x) = 0, if x1 = 0 or x2 = 0, where x = (x1 , x2 ). Then, t3 d31 = 0, t→0 t2 d2 fG′ ((0, 0); d) = lim for all d = (d1 , d2 ) ∈ R2 with (d1 , d2 ) 6= (0, 0). Thus, fG′ (0; d) exists and it is continuous and linear in d but f is discontinuous at (0, 0). The function f is Gâteaux differentiable but not continuous. Hence a Gâteaux differentiable function is not necessarily continuous. 
Qamrul Hasan Ansari Advanced Functional Analysis Page 73 (d) The Gâteaux derivative fG′ (x; d) of a function f is positively homogeneous in the second argument, that is, fG′ (x; rd) = rfG′ (x; d) for all r > 0. But, as we have seen in part (a), in general, fG′ (x; d) is not linear in d. Remark 3.2.5. The Gâteaux derivative of a linear operator T : X → Y is also a linear operator. Indeed, if T : X → Y is a linear operator, then we have T (x + td) − T (x) T (x) + tT (d) − T (x) = lim = T (d). t→0 t→0 t t TG′ (x; d) = lim Hence TG′ (x; d) = T (d) for all x ∈ X and d ∈ X. The following theorem shows that the partial derivatives and Gâteaux derivative are the same if the function f defined on X is convex. Theorem 3.2.2. Let K be nonempty convex subset of Rn and f : K → R be a convex function. If the partial derivatives of f at x ∈ K exist, then f is Gâteaux differentiable at x. Proof. Suppose that the partial derivatives of f at x ∈ K exist. Then, the Gâteaux derivative of f at x is the linear functional fG′ (x; d) = n X ∂f (x) k=1 ∂xk dk , for d = (d1 , d2, . . . dn ) ∈ Rn . For each fixed x ∈ K, define a function g : K → R by g(d) = f (x + d) − f (x) − fG′ (x; d). ∂g(0) = 0 for all k = 1, 2, . . . , n, since the partial derivatives of f ∂xk exist at x. Now, if {e1 , e2 , . . . , en } is the standard basis for Rn , then by the convexity of g, we have for λ 6= 0 ! n n n X X 1X g(nλdk ek ) dk ek ≤ g(λd) = g λ g (nλdk ek ) = λ . n nλ k=1 k=1 k=1 Then, g is convex and So, n g(λd) X g(nλdk ek ) ≤ , λ nλ for λ > 0, k=1 and Since n g(λd) X g(nλdk ek ) ≥ , λ nλ k=1 ∂g(0) g(nλdk ek ) = = 0, λ→0 nλ ∂dk lim for λ < 0. for all k = 1, 2, . . . , n, Qamrul Hasan Ansari Advanced Functional Analysis Page 74 we have g(λd) = 0, λ→0 λ and so, f is Gâteaux differentiable at x. lim The mean value theorem in terms of Gâteaux derivative is the following. Theorem 3.2.3. Let X and Y be normed spaces, K be a nonempty open subset of X and T : X → Y be Gâteaux differentiable with Gâteaux derivative fG′ (x; d) at x ∈ X in the direction d ∈ X. Then for any points x ∈ X and x + d ∈ X, there exists s ∈ ]0, 1[ such that T (x + d) − T (x) = TG′ (x + sd; d). (3.13) Proof. Since K is an open subset of X, we can select an open interval I of real numbers, which contains the numbers 0 and 1, such that x + λd belongs to K for all λ ∈ I. For all λ ∈ I, define ϕ(λ) = T (x + λd). Then, ϕ(λ + τ ) − ϕ(λ) τ →0 τ T (x + λd + τ d) − T (x + λd) = lim τ →0 τ = TG′ (x + λd; d). ϕ′ (λ) = lim (3.14) By applying the mean value theorem for real-valued functions of one variable to the restriction of the function ϕ : I → R to the closed interval [0, 1], we obtain ϕ(1) − ϕ(0) = ϕ′ (s), for some s ∈ ]0, 1[. By using (3.14) and the definition of ϕ : [0, 1] → R, we obtain the desired result. For the differentiable function, we have the following result which follows from the above theorem. Corollary 3.2.1. If in the above theorem T is a differentiable function from Rn to R, then there exists s ∈ ]0, 1[ such that T (x + d) − T (x) = hTG′ (x + sd), i = h∇T (x + sd), di. Now we give the characterization of a convex functional in terms of Gâteaux derivative. Theorem 3.2.4. Let X be a normed space and f : X → (−∞, ∞] be a proper function. Let K be a convex subset of int(Dom(f )) such that f is Gâteaux differentiable at each point of K. Then the following are equivalent: (a) f is convex on K. Qamrul Hasan Ansari Advanced Functional Analysis Page 75 (b) f (y) − f (x) ≥ fG′ (x)(y − x) for all x, y ∈ K. (c) fG′ (y)(y − x) − fG′ (x)(y − x) ≥ 0 for all x, y ∈ K. 
Proof. (a) ⇒ (b). Suppose that f is convex on K. Let x, y ∈ K. Then f ((1 − t)x + ty) ≤ (1 − t)f (x) + tf (y), for all t ∈ (0, 1). f (x + t(y − x)) − f (x) ≤ f (y) − f (x), t for all t ∈ (0, 1). It follows that Letting limit as t → 0, we obtain fG′ (x)(y − x) ≤ f (y) − f (x). Thus, (b) holds. (b)⇒(c). Suppose that (b) holds. Let x, y ∈ K. Note that fG′ (y)(x − y) ≤ f (x) − f (y) and fG′ (x)(y − x) ≤ f (y) − f (x). Adding the above inequalities, we obtain fG′ (y)(y − x) − fG′ (x)(y − x) ≥ 0. (c) ⇒ (a). Suppose that (c) holds. Then we have fG′ (u)(u − v) − fG′ (v)(u − v) ≥ 0, for all u, v ∈ K. (3.15) Let x, y ∈ K. Define a function g : [0, 1] → R by g(t) = f (x + t(y − x)), for all t ∈ [0, 1]. Then g ′ (t) = fG′ (x + t(y − x))(y − x). Consider u = (1 − t)x + ty and v = (1 − s)x + sy in (3.15), for 0 ≤ s < t ≤ 1. Then we have (fG′ ((1 − t)x + ty) − fG′ ((1 − s)x + sy)) ((1 − t)x + ty − ((1 − s)x + sy)) ≥ 0, which implies that (g ′ (t) − g ′(s))(t − s) = (fG′ ((1 − t)x + ty) − fG′ ((1 − s)x + sy)) (y − x) ≥ 0. Hence g ′ is monotonic increasing on [0, 1] and hence g is convex on [0, 1]. Thus, g(λ) ≤ (1 − λ)g(0) + λg(1), it follows that f is convex on K. for all λ ∈ (0, 1), Qamrul Hasan Ansari Advanced Functional Analysis Exercise 3.2.1. Let f : R2 → R be defined by 2x2 e−x−2 1 , −2x−2 2 1 f (x1 , x2 ) = x2 +e 0, if x1 6= 0, if x1 = 0. Prove that f is Gâteaux differentiable at 0 but not continuous there. Page 76 Qamrul Hasan Ansari 3.3 Advanced Functional Analysis Page 77 Fréchet Derivative and Its Properties Definition 3.3.1. Let X and Y be normed spaces. An operator (possibly nonlinear) T : X → Y is said to be Fréchet differentiable at a point x ∈ int(Dom(T )) if there exists a continuous linear operator T ′ (x) : X → Y such that kT (x + d) − T (x) − T ′ (x)(d)k lim = 0. (3.16) kdk→0 kdk In this case, T ′ (x), also denoted by DT (x), is called Fréchet derivative of T at the point x. The operator T ′ : X → B(X, Y ) which assigns a continuous linear operator T ′ (x) to a vector x is known as the Fréchet derivative3 of T . The domain of the operator T ′ contains naturally all vectors in X at which the Fréchet derivative can be defined. The meaning of the relation (3.16) is that for each ε > 0, there exists a δ > 0 (depending on ε) such that kT (x + d) − T (x) − T ′ (x)(d)k < ε, kdk for all d ∈ X satisfying the condition kdk < δ. Example 3.3.1. Let X = Rn and Y = Rm be Euclidean spaces with the standard inner product. If T : Rn → Rm is Fréchet differentiable at a point x ∈ Rn , then T is represented by T (x) = (f1 (x1 , . . . , xn ), . . . , fm (x1 , . . . , xn )), where fj : Rn → R be a function for each j = 1, 2, . . . , m. Let {ei : i = 1, 2, .P . . n} denote the standardPbasis in Rn . Then the vector n d ∈ R can be represented as d = in=1 di ei and f ′ (x)(d) = ni=1 di f ′ (x)(ei ). Therefore we find that (f1 (·, xi + t, ·), . . . , fm (·, xi + t, ·)) − (f1 (·, xi , ·), . . . , fm (·, xi , ·)) lim t→0 t ∂fm (x) ∂f1 (x) ,..., = T ′ (x)(ei ). = ∂xi ∂xi Thus the Fréchet derivative T ′ is expressed in the following form n X ∂f1 (x) ∂fm (x) ′ T (x)(d) = di ,..., ∂xi ∂xi i=1 n X ∂f1 (x) ∂fm (x) = di , . . . , di ∂xi ∂xi i=1 ∂f1 (x) . . . ∂f∂x1 (x) d 1 ∂x1 n .. .. .. .. = . . . . . ∂fm (x) ∂fm (x) dn . . . ∂x ∂x 1 3 n The Fréchet derivative is introduced by the French mathematician Gil Fréchet in 1925. Qamrul Hasan Ansari Advanced Functional Analysis Page 78 This shows that the Fréchet derivative T ′ (x) at a point x is a linear operator represented by the Jacobian matrix. Remark 3.3.1. 
If the operators λT (λ is a scalar) and T + S are Fréchet differentiable, then for all d ∈ X, (λT )′ (d) = αT ′ (d) and (T + S)′ (d) = T ′ (d) + S ′ (d). We establish the relation between Gâteaux and Fréchet differentiability. Proposition 3.3.1. Let X and Y be normed spaces. If the operator T : X → Y is Fréchet differentiable at x ∈ X, it is Gâteaux differentiable at x and these two derivatives are equal. Proof. Since T is Fréchet differentiable at x, we have kT (x + d) − T (x) − T ′ (x)(d)k = 0. kdk→0 kdk lim Set d = td0 for t > 0 and for any fixed d0 6= 0. Then kT (x + td0 ) − T (x) − tT ′ (x)(d0 )k t→0 tkd0 k kT (x + td0 ) − T (x) 1 = lim − T ′ (x)(d0 ) t→0 t kd0k 0 = lim which implies that T (x + td0 ) − T (x) = TG′ (x)(d0 ), for all d0 ∈ X. t→0 t T ′ (x)(d0 ) = lim Hence TG′ (x) ≡ T ′ (x). The following example shows that the converse of Proposition 3.3.1 is not true, that is, if an operator T : X → Y is Gâteaux differentiable, then it may not be Fréchet differentiable. Example 3.3.2. Let X = R2 with the Euclidean norm k · k and f : X → R be a function defined by x3 y , if (x, y) 6= (0, 0), x4 +x2 f (x, y) = 0, if (x, y) = (0, 0). It can be easily seen that the f is Gâteaux differentiable at (0, 0) with Gâteaux derivative fG′ (0, 0) = 0. Since for (x, x2 ) ∈ X with (x, x2 ) 6= (0, 0), we have |x3 x3 | 1 1 |f (x, x2 )| √ = = √ , 2 4 4 2 4 k(x, x )k 2 1 + x2 (x + x )( x + x ) Therefore, f is not Fréchet differentiable at (0, 0). for k = h2 . Qamrul Hasan Ansari Advanced Functional Analysis Page 79 Theorem 3.3.1. Let X and Y be normed spaces. If the operator T : X → Y is Fréchet differentiable at x ∈ X, then it is continuous at x. Proof. Since T has a Fréchet derivative at x ∈ X, for each ε1 > 0, there exists a δ1 > 0 (depending on ε1 ) such that kT (y) − T (x) − T ′ (x)(y − x)k < ε1 ky − xk, for all y ∈ X satisfying ky − xk < δ1 . By the triangle inequality kT (y) − T (x) − T ′ (x)(y − x)k ≥ kT (y) − T (x)k − kT ′ (x)(y − x)k, we find for ky − xk < δ1 that kT (y) − T (x)k < ε1 ky − xk + kT ′ (x)(y − x)k ≤ (ε1 + kT ′ (x)k)ky − xk. Choose δ = min{δ1 , ε/(ε1 + kT ′ (x)k)} for each ε > 0. Then for all y ∈ X, we have kT (y) − T (x)k < ε whenever ky − xk < δ, that is, T is continuous at x. Theorem 3.3.2 (Chain Rule). Let X, Y and Z be normed spaces. If T : X → Y and S : Y → Z are Fréchet differentiable, then the operator R := S ◦ T : X → Z is also Fréchet differentiable and its Fréchet derivative is given by R′ (x) = S ′ (T (x)) ◦ T ′ (x). Proof. For exercise. Theorem 3.3.3 (Mean Value Theorem). Let K be an open convex subset of a normed space X, a, b ∈ K and T : K → X be a Fréchet differentiable such that at each x ∈ (a, b) (open line segment joining a and b) and T (x) is continuous on closed line segment [a, b]. Then kT (b) − T (a)k ≤ sup kT ′ (y)k kb − ak. (3.17) y∈(a,b) Proof. Let F be a continuous linear functional on X and ϕ : [0, 1] → R be a function defined by ϕ(λ) = F ((T ((1 − λ)a + λb))), for all λ ∈ [0, 1]. Qamrul Hasan Ansari Advanced Functional Analysis Page 80 By Classical Mean Value Theorem of Calculus for ϕ, we have that for some λ̂ ∈ [0, 1] and x = (1 − λ̂)a + λ̂b, F (T (b) − T (a)) = F (T (b)) − F (T (a)) = ϕ(1) − ϕ(0) = ϕ′ (λ̂) = F (T ′ (x)(b − a)), where we have used the Chain Rule and the fact that a bounded linear functional is its own derivative. Therefore, for each continuous linear functional F on X, we have kF (T (b) − T (a))k ≤ kF k kT ′(x)k kb − ak. 
(3.18) Now, if we define a function G on the subspace [T (b) −T (a)] of X as G(α(F (b)) −F (a)) = α, then kGk = kT (b) − T (a)k−1 . If F is a Hahn-Banach extension of G to entire X, we find by substitution in (3.18) that 1 = kF (T (b) − T (a))k ≤ kT (b) − T (a)k−1 kT ′ (x)k kb − ak, which gives (3.17). Definition 3.3.2. If T : X → Y is Fréchet differentiable on an open set Ω ⊂ X and the first Fréchet derivative T ′ at x ∈ Ω is Fréchet differentiable at x, then the Fréchet derivative of T ′ at x is called the second derivative of T at x and is denoted by T ′′ (x). Definition 3.3.3. Let X be a normed space. A function f : X → R is said to be twice Fréchet differentiable at x ∈ int(Dom(T )) if there exists A ∈ B(X, X ∗ ) such that kf ′ (x + d) − f ′ (x) − A(d)k = 0. lim t→0 t The second derivative of f at x and is f ′′ (x) = A. It may be observed that if T : X → Y is Fréchet differentiable on an open set Ω ⊂ X, then T ′ is a mapping on X into B[X, Y ]. Consequently, if T ′′ (x) exists, it is a bounded linear mapping from X into B[X, Y ]. If T ′′ exists at every point of Ω, then T ′′ : X → B[X, B[X, Y ]]. Theorem 3.3.4 (Taylor’s Formula for Differentiable Functions). Let T : Ω ⊂ X → Y and let [a, a + h] be any closed segment in Ω. If T is Fréchet differentiable at a, then T (a + h) = T (a) + T ′ (a)h + khkε(h), lim ε(h) = 0. h→0 Theorem 3.3.5 (Taylor’s Formula for Twice Fréchet Differentiable Functions). Let T : Ω ⊂ X → Y and [a, a + h] be any closed segment lying in Ω. If T is differentiable in Ω Qamrul Hasan Ansari Advanced Functional Analysis Page 81 and twice differentiable at a, then 1 T (a + h) = T (a) + T ′ (a)h + (T ′′ (a)h)h + khk2 ε(h), 2 lim ε(h) = 0. h→0 For proofs of these two theorems and other related results, we refer to the book by H. Cartan, Differential Calculus, Herman, 1971. Qamrul Hasan Ansari 3.4 Advanced Functional Analysis Page 82 Some Related Results Let X be a Hilbert space and f : X → (−∞, ∞] be a proper functional such that f is Gâteaux differentiable at a point x ∈ int(Dom(f )). Then, by Riesz representation theorem4 ,there exists exactly one vector, denoted by ∇G f (x) in X such that fG′ (x)(d) = h∇G f (x), di, for all d ∈ X and kfG′ (x)k∗ = k∇G f (x)k. (3.19) We say that ∇G f (x) is the Gâteaux gradient vector of f at x. Alternatively, we have fG′ (x)(d) = h∇G f (x), di = f (x + td) − f (x) , t∈R, t→0 t lim for all d ∈ X. Example 3.4.1. Let X be p a real inner product space and f : X → R be a functional defined by f (x) = kxk = hx, xi for all x ∈ X. Then f is differentiable on X \ {0} with 1 ∇G f (x) = kxk x for 0 6= x ∈ X. In fact, for x, d ∈ X with x 6= 0, we have p p kxk2 + 2thx, di + t2 kdk2 − kxk2 f (x + td) − f (x) = kxk2 + 2thx, di + t2 kdk2 − kxk2 p p = kxk2 + 2thx, di + t2 kdk2 + kxk2 2thx, di + t2 kdk2 p , = p kxk2 + 2thx, di + t2 kdk2 + kxk2 for all t ∈ R, which implies that f (x + td) − f (x) 1 = hx, di = h∇G f (x), di, t→0 t kxk fG′ (x)(d) = lim where ∇G f (x) = 1 x. kxk Lemma 3.4.1 (Descent lemma). Let X be a Hilbert space and f : X → R be a differentiable convex function such that ∇f : X → X is ∇f is β-Lipschitz continuous. Then the following assertions hold: (a) For all x, y ∈ X, f (y) − f (x) ≤ β ky − xk2 + hy − x, ∇f (x)i. 2 Qamrul Hasan Ansari Advanced Functional Analysis (b) For all x ∈ X, f 1 x − ∇f (x) β ≤ f (x) − Page 83 1 k∇f (x)k2 . 2β Proof. (a) Let x, y ∈ X. Define φ : [0, 1] → R by φ(t) = f (x + t(y − x)), for all t ∈ [0, 1]. Noticing that φ(0) = f (x) and φ′ (t) = hy − x, ∇f (x + t(y − x))i. 
φ(1) = f (y), Hence f (y) = f (x) + = f (x) + Z 1 Z0 1 0 = f (x) + Z ≤ f (x) + ≤ f (x) + (b) Replacing y by x − 0 hy − x, ∇f (x + t(y − x))idt 1 hy − x, ∇f (x + t(y − x)) − ∇f (x)idt + hy − x, ∇f (x)i 0 Z φ′ (t)dt 1 ky − xk k∇f (x + t(y − x)) − ∇f (x)kdt + hy − x, ∇f (x)i β ky − xk2 + hy − x, ∇f (x)i. 2 1 ∇f (x) in (a), we get (b). 2β Definition 3.4.1. Let X be an inner product space. An operator T : X → X is said to be γ-inverse strongly monotone or γ-cocercive if there exists γ > 0 such that hT (x) − T (y), x − yi ≥ γkT (x) − T (y)k2, for all x, y ∈ X. Proposition 3.4.1. Let X be a Hilbert space and f : X → R be a Fréchet differentiable convex function such that ∇f : X → X is ∇f is β-Lipschitz continuous for some β > 0. 1 Then ∇f is -inverse strongly monotone, that is, β h∇f (x) − ∇f (y), x − yi ≥ 1 k∇f (x) − ∇f (y)k2. β Proof. Let x ∈ X. Define g : X → R by g(z) = f (z) − f (x) − h∇f (x), z − xi, for all z ∈ X. Qamrul Hasan Ansari Advanced Functional Analysis Page 84 Note g(x) = 0 ≤ f (z) − f (x) − h∇f (x), z − xi = g(z), for all z ∈ X and ∇g(z) = ∇f (z) − ∇f (x), for allz ∈ X. Clearly, inf g(z) = 0. One can see that z∈X k∇g(u) − ∇g(v)k = k∇f (u) − ∇f (v)k ≤ βku − vk, for all u, v ∈ X. Let y ∈ X. From Lemma 3.4.1(b), we have inf g(z) ≤ g(y) − z∈X 1 k∇g(y)k2, 2β which implies that 0 ≤ f (y) − f (x) − h∇f (x), y − xi − 1 k∇f (y) − ∇f (x)k2 . 2β Similarly, we have 0 ≤ f (x) − f (y) − h∇f (y), x − yi − Thus, we have 0 ≤ h∇f (x) − ∇f (y), x − yi − 1 k∇f (x) − ∇f (y)k2. 2β 1 k∇f (x) − ∇f (y)k2. β Definition 3.4.2. Let X be a inner product space. An operator T : X → X is said to be (a) nonexpansive if kT (x) − T (y)k ≤ kx − yk, for all x, y ∈ X; (b) firmly nonexpansive if kT (x) − T (y)k2 + k(I − T )(x) − (I − T )(y)k2 ≤ kx − yk2, for all x, y ∈ X, where I is the identity operator. It can be easily seen that every firmly nonexpansive mapping is nonexpansive but converse may not hold. For example, consider the negative of identity operator, that is, (−I). Corollary 3.4.1. Let X be a Hilbert space and f : X → R be a Fréchet differentiable convex function. Then ∇f is nonexpansive ⇔ ∇f is firmly nonexpansive. Qamrul Hasan Ansari Advanced Functional Analysis Page 85 Exercise 3.4.1. Let X be a Hilbert space and Y be an inner product space, A ∈ B(X, Y ) and b ∈ Y . Define a functional f : X → R by 1 f (x) = kA(x) − bk2 , 2 for all x ∈ X. Then prove that f is Fŕechet differentiable on X with ∇f (x) = A∗ (Ax−b) and ∇2 f (x) = A∗ A for each x ∈ X. Proof. Let x ∈ X. Then, for y ∈ X, we have = = = = f (x + y) − f (x) 1 hA(x) − b + A(y), A(x) − b + A(y)i − f (x) 2 1 [hA(x) − b, A(x) − bi + hA(x) − b, A(y)i + hA(y), A(x) − bi + hA(y), A(y)i] − f (x) 2 1 hA(x) − b, A(y)i + kA(y)k2 2 1 ∗ hA (Ax − b), yi + kA(y)k2. 2 Thus, kAk2 1 kyk2, |f (x + y) − f (x) − hA∗ (Ax − b), yi| = kA(y)k2 ≤ 2 2 for all y ∈ X. Therefore, f is Fŕechet differentiable on X with f ′ (x)y = h∇f (x), yi, for all x ∈ X, where ∇f (x) = A∗ (A(x) − b). It is easy to see that ∇2 f (x) = A∗ A. Exercise 3.4.2. Let X an inner product space and a ∈ X. Define a functional f : X → R by 1 f (x) = kx − ak2 , for all x ∈ X. 2 Then prove that f is Fŕechet differentiable on X with ∇f (x) = x − a and ∇2 f (x) = I for each x ∈ X. Exercise 3.4.3. Let X be a Hilbert space X and A : X → X be a bounded linear operator. Let b ∈ X, c ∈ R and define 1 f (x) = hAx, xi − hb, xi + c, 2 x ∈ C. 1 Then prove that f is Fréchet differentiable on X with ∇f (x) = (A + A∗ )(x) − b and 2 1 2 ∗ ∇ f (x) = (A + A ) for each x ∈ X. 
2 Qamrul Hasan Ansari Advanced Functional Analysis Page 86 Proof. Let x ∈ X. Then, for y ∈ X, we have f (x + y) = = = = = Thus, Therefore, 1 hA(x + y), x + yi − hb, (x + y)i + c 2 1 [hA(x) + A(y), xi + hA(x) + A(y), yi] − hb, (x + y)i + c 2 1 [hA(x), xi + hA(y), xi + hA(x), yi + hA(y), yi] − hb, (x + y)i + c 2 1 1 hA(x), xi − hb, xi + c + [hy, A∗(x)i + hA(x), yi + hA(y), yi] − hb, yi 2 2 1 1 f (x) + h (A + A∗ )(x) − b, yi + hA(y), yi. 2 2 1 kf (x + y) − f (x) − h (A + A∗ )(x) − b, yik ≤ kAkkyk2, 2 for all y ∈ X. kf (x + y) − f (x) − h 21 (A + A∗ )(x) − b, yik lim = 0, kyk→0 kyk 1 i.e., f is differentiable with ∇f (x) = (A + A∗ )(x) − b. One can see that 2 1 ∇2 f (x) = (A + A∗ ). 2 Exercise 3.4.4. Let X be a Hilbert space, b ∈ X and A : X → X be a self-adjoint, bounded, linear operator and strongly positive, i.e., there exists α > 0 such that hA(x), xi ≥ αkxk2 , for all x ∈ X. Let b ∈ X and define a quadratic function f : X → R by 1 f (x) = hA(x), xi + hx, bi, 2 for all x ∈ X. Then prove that ∇f (·) = A(·) + b is α-strongly monotone and kAk-Lipschitz continuous. When X = RN is finite dimensional, then the above operator A coincides with a positive definite matrix. Then ∇2 f (x) = A and λmin kxk2 ≤ hAx, xi ≤ λmax kxk2 , for all x ∈ RN , where λmin and λmax are the minimum and maximum eigenvalues of A, respectively. Hence α = λmin ≤ λmax = kAk. Qamrul Hasan Ansari 3.5 Advanced Functional Analysis Page 87 Subdifferential and Its Properties The concept of a subdifferential plays an important role in problems of optimization and convex analysis. In this section, we study subgradients and subdifferentials of R∞ -valued convex functions and their properties in normed spaces. We have already seen in Theorem 3.2.4 that if X is a normed space, f : X → (−∞, ∞] is a proper convex function and x ∈ int(Dom(f )), then the following inequality holds: fG′ (x)(y − x) + f (x) ≤ f (y), for all y ∈ X. (3.20) The inequality (3.20) motivates us to introduce the notion of another kind of differentiability when f is not Gâteaux differentiable at x, but the inequality (3.20) holds. Definition 3.5.1. Let X be a normed space, f : X → (−∞, ∞] be a proper function and x ∈ Dom(f ). Then an element j ∈ X ∗ is said to be a subgradient of f at x if f (x) ≤ f (y) + hx − y, ji for all y ∈ X. (3.21) The set (possibly nonempty) ∂f (x) := {j ∈ X ∗ : f (x) ≤ f (y) + hx − y, ji, for all y ∈ X}, of subgradients of f at x is called the subdifferential or Fenchel subdifferential of f at x. Clearly, ∂f (x) may be empty set even if f (x) ∈ R. But for the case x ∈ / Dom(f ), we consider ∂f (x) = ∅. Thus, subdifferential of a proper convex function f is a set-valued mapping ∂f : X ⇒ X ∗ defined by ∂f (x) = {j ∈ X ∗ : f (x) ≤ f (y) + hx − y, ji for all y ∈ X}. The domain of the subdifferential ∂f is defined by Dom(∂f ) = {x ∈ X : ∂f (x) 6= ∅}. Obviously, Dom(∂f ) ⊆ Dom(f ). Remark 3.5.1. (a) If f (x) 6= ∞, then Dom(∂f ) is a subset of Dom(f ). (b) If f (x) = ∞ for some x, then ∂f (x) = ∅. Definition 3.5.2. Let X be a Hilbert space, f : X → (−∞, ∞] be a proper function. The subdifferential of f is the set-valued map ∂f : X ⇒ X defined by ∂f (x) = {u ∈ X : f (x) ≤ f (y) + hx − y, ui for all y ∈ X}, for x ∈ X. (3.22) Then f is said to be subdifferentiable at x ∈ X if ∂f (x0 ) 6= ∅. The elements of ∂f (x) are called the subgradients of f at x. Qamrul Hasan Ansari Advanced Functional Analysis Page 88 Example 3.5.1. Let f : R → R be a function defined by f (x) = |x| for x ∈ R. Then if x < 0, {−1}, [−1, 1], if x = 0, ∂f (x) = {1}, if x > 0. 
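This computation of ∂f can be verified directly from the defining inequality (3.21): a number j is a subgradient of f(x) = |x| at 0 exactly when |y| ≥ jy for every y ∈ R. The following minimal numerical sketch (plain Python with NumPy; the helper name is_subgradient is ours and not part of these notes) checks this condition on a grid of test points.

```python
import numpy as np

def is_subgradient(j, x, f, ys):
    """Check the subgradient inequality f(y) >= f(x) + j*(y - x) on a grid of test points ys."""
    return all(f(y) >= f(x) + j * (y - x) - 1e-12 for y in ys)

f = abs                                  # f(x) = |x|, convex but not differentiable at 0
ys = np.linspace(-5.0, 5.0, 1001)        # test points y

# Every j in [-1, 1] satisfies |y| >= j*y for all y, so it is a subgradient of f at 0 ...
print(all(is_subgradient(j, 0.0, f, ys) for j in np.linspace(-1.0, 1.0, 21)))   # True
# ... while any j with |j| > 1 violates the inequality at some y, so it is not a subgradient.
print(is_subgradient(1.5, 0.0, f, ys))                                          # False
```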
Note that f is convex and continuous, but not differentiable at 0. Clearly, f is subdifferentiable at 0 with ∂f (0) = [−1, 1]. Also Dom(∂f ) = Dom(f ) = R. Example 3.5.2. Define f : R → (−∞, ∞] by 0, if x = 0, f (x) = ∞, otherwise. Then ∂f (x) = ∅, R, if x 6= 0, if x = 0. Note that f is not continuous at 0, but f is subdifferentiable at 0 with ∂f (0) = R. Example 3.5.3. Define f : R → (−∞, ∞] by ∞, √ f (x) = − x, Then ∂f (x) = ∅, − 2√1 x , if x < 0, if x ≥ 0. if x ≤ 0, if x > 0. Note that Dom(f ) = [0, ∞) and f is not continuous at 0. Moreover, ∂f (0) = ∅ and Dom(∂f ) = (0, ∞). Thus, f is not subdifferentianble at 0 even 0 ∈ Dom(f ). We now consider some more general functions. Example 3.5.4. Let X be a inner product space, a ∈ X and define f : X → R by f (x) = kx − ak for x ∈ X. Then S1 (0), if x = a, ∂f (x) = x − a, if x 6= a, where S1 (0) is open unit ball at 0 ∈ X Example 3.5.5. Let K be a nonempty closed convex subset of a normed space X and iK the indicator function of K, i.e., 0, if x ∈ K, iK (x) = ∞, otherwise. Then ∂iK (x) = {j ∈ X ∗ : hx − y, ji ≥ 0 for all y ∈ K} , for x ∈ K. Qamrul Hasan Ansari Advanced Functional Analysis Page 89 Proof. Since the indicator function is a proper lower semicontinuous convex function on X, from (3.21), we have ∂iK (x) = {j ∈ X ∗ : iK (x) − iK (y) ≤ hx − y, ji for all y ∈ K} . Remark 3.5.2. Dom(iK ) = Dom(∂iK ) = K and ∂iK (x) = {0} for each x ∈ int(K). 3.5.1 Properties of Subdifferentials Definition 3.5.3. Let X be an inner product space. A set-valued mapping T : X ⇒ X is said to be (a) monotone if for all x, y ∈ X, hu − v, x − yi ≥ 0, for all u ∈ T (x) and v ∈ T (y); (b) maximal monotone if it is monotone and its graph Graph(T ) := {(x, u) ∈ X × X : u ∈ T (x)} is not contained properly in the graph of any other monotone set-valued mapping. Theorem 3.5.1. Let X be a Hilbert space and f : X → (−∞, ∞] be a proper convex function. Then ∂f is monotone. Proof. Let x, y ∈ and u ∈ ∂f (x), v ∈ ∂f (y) be arbitrary. Then f (x) ≤ f (z) + hx − z, ui, for all z ∈ X (3.23) f (y) ≤ f (w) + hy − w, vi, for all w ∈ X. (3.24) and Taking z = y in (3.23) and w = x in (3.24) and adding the resultants, we get f (x) + f (y) ≤ f (y) + f (x) + hx − y, ui + hy − x, vi, which implies that hu − v, x − yi ≥ 0. Thus, ∂f is monotone. Qamrul Hasan Ansari Advanced Functional Analysis Page 90 Theorem 3.5.2. Let X be a Hilbert space and f : X → (−∞, ∞] be a proper lower semicontinuous convex function. Then R(I + ∂f ) = X. Proof. Noticing that R(I + ∂f ) ⊆ X. It is suffices to show that X ⊆ R(I + ∂f ). For this, let x0 ∈ X and define 1 ψ(x) = kxk2 + f (x) − hx, x0 i, 2 for all x ∈ X. Note that ψ has an affine lower bound and lim ψ(x) = ∞. Hence, from Theorem A5 , there kxk→∞ exists z ∈ Dom(f ) such that ψ(z) = inf ψ(x). x∈X Thus, for all x ∈ X, from Proposition P6 , we have kxk2 ≤ kzk2 + 2hx − z, xi and 1 1 kzk2 + f (z) − hz, x0 i ≤ kxk2 + f (x) − hx, x0 i, 2 2 which imply that 1 f (z) ≤ f (x) + (kxk2 − kzk2 ) + hz − x, x0 i 2 ≤ f (x) + hx − z, xi + hz − x, x0 i = f (x) + hx − z, x − x0 i. Let u ∈ X. Define zt = (1 − t)z + tu for t ∈ (0, 1). Hence, for t ∈ (0, 1), we obtain f (z) ≤ (1 − t)f (z) + tf (u) + thu − z, zt − x0 i, which gives us that f (z) ≤ f (u) + hu − z, zt − x0 i. Letting limit as t → 0+ , we get f (z) ≤ f (u) + hu − z, z − x0 i. Hence x0 − z ∈ ∂f (z), i.e., x0 ∈ (I + ∂f )(z) ⊆ R(I + ∂f ). Thus, X ⊆ R(I + ∂f ). From Theorem 3.5.2, we have 5 Theorem A. 
Let K be a nonempty closed convex subset of a Hilbert space X and f : K → (−∞, +∞] be a proper lower semicontinuous function such that f (xn ) → ∞ as kxn k → ∞. Then there exists x̄ ∈ K such that f (x̄) = inf f (x). x∈K 6 Let X be an inner product space. Then for any x, y ∈ X, kxk2 ≤ kyk2 − 2hy − x, xi Qamrul Hasan Ansari Advanced Functional Analysis Page 91 Corollary 3.5.1. Let X be a Hilbert space and f : X → (−∞, ∞] be a proper lower semicontinuous convex function. Then R(I + λ∂f ) = X for all λ ∈ (0, ∞). Theorem 3.5.3. Let X be a Hilbert space and f : X → (−∞, ∞] be a proper lower semicontinuous convex function. Then ∂f is maximal monotone. Theorem 3.5.4. Let X be a Hilbert space and f : X → (−∞, ∞] be a proper convex function. Then, for each x ∈ Dom(f ), ∂f (x) is closed and convex. Proof. Exercise. Theorem 3.5.5. Let X be a Hilbert space and f : X → (−∞, ∞] be a proper convex function. Then ∂f −1 (0) is closed and convex. Proof. Exercise. We now study some calculus of subgradients. Proposition 3.5.1. Let X be a Hilbert space and f : X → (−∞, ∞] be a proper function. Then ∂(λf ) = λ∂f, for all λ ∈ (0, ∞). Proof. Let λ ∈ (0, ∞). Then, for x ∈ X, we have z ∈ ∂(λf )(x) ⇔ λf (x) ≤ λf (y) + hx − y, zi, 1 ⇔ f (x) ≤ f (y) + hx − y, zi, λ 1 ⇔ z ∈ ∂f (x) λ ⇔ z ∈ λ∂f x). for all y ∈ X for all y ∈ X Therefore, ∂(λf ) = λ∂f, for all λ ∈ (0, ∞). Theorem 3.5.6. Let X be a Hilbert space. Let f, g : X → (−∞, ∞] be proper convex functions and there exists x0 ∈ Dom(f ) ∩ Dom(g) where f is continuous. Then ∂(f + g) = ∂f + ∂g. Qamrul Hasan Ansari Advanced Functional Analysis Page 92 Theorem 3.5.7. Let X be a Hilbert space and f : X → (−∞, ∞] be a proper convex function. Let x ∈ Dom(f ) and u ∈ X. Then u ∈ ∂f (x) ⇔ hy, ui ≤ f ′ (x; y), for all y ∈ X. Proof. Suppose that u ∈ ∂f (x) and y ∈ X. From (3.22), we have f (x) ≤ f (x + ty) + hx − (x + ty), ui, Hence hy, ui ≤ f (x + ty) − f (x) , t Letting limit as t → 0, we get for all t ∈ (0, ∞). for all t ∈ (0, ∞). hy, ui ≤ f ′ (x; y). Conversely, suppose that hy, ui ≤ f ′ (x; y), for all y ∈ X. (3.25) From (3.4) and (3.25), we have hy − x, ui ≤ f ′ (x; y − x) ≤ f (y) − f (x), for all y ∈ X. This shows that u ∈ ∂f (x). We now give a relation between Gâteaux differentiability and subdifferentiability. Theorem 3.5.8. Let X be a Banach space and f : X → (−∞, ∞] a proper convex function. Let f be Gâteaux differentiable at a point x0 ∈ Dom(f ). Then x0 ∈ Dom(∂f ) and ∂f (x0 ) = {fG′ (x0 )}. In this case, d f (x0 + ty) dt t=0 = hy, ∂f (x0 )i = hy, fG′ (x0 )i, for all y ∈ X. Proof. Since f is Gâteaux differentiable at x0 ∈ Dom(f ). Then hy, fG ′ (x0 )i = lim t→0 f (x0 + ty) − f (x0 ) , t for all y ∈ X. By the convexity of f , we have f (x0 + λ(y − x0 )) = f ((1 − λ)x0 + λy) ≤ (1 − λ)f (x0 ) + λf (y), for all y ∈ X and λ ∈ (0, 1), i.e, f (x0 + λ(y − x0 )) − f (x0 ) ≤ f (y) − f (x0 ), λ for all y ∈ X and λ ∈ (0, 1), Qamrul Hasan Ansari Advanced Functional Analysis Page 93 It follows that hy − x0 , fG′ (x0 )i ≤ f (y) − f (x0 ), for all y ∈ X, i.e., fG′ (x0 ) ∈ ∂f (x0 ). This shows that x0 ∈ Dom(∂f ). Now, let jx0 ∈ ∂f (x0 ). Then, we have f (x0 ) − f (u) ≤ hx0 − u, jx0 i, for all u ∈ X. Let h ∈ X and let ut = x0 + λh for λ ∈ (0, ∞). Then f (x0 + λh) − f (x0 ) ≥ hh, jx0 i, λ for all λ ∈ (0, ∞). Letting limit as λ → 0, we get hh, f ′G (x0 ) − jx0 i ≥ 0, for all h ∈ X, i.e., jx0 = fG′ (x0 ). Therefore, f is Gâteaux differentiable at x0 and fG′ (x0 ) = ∂f (x0 ). Corollary 3.5.2. 
Let X be a Hilbert space and f : X → (−∞, ∞] be a proper convex function such that f is Gâteaux differentiable at a point x0 ∈ Dom(f ). Then x0 ∈ Dom(∂f ) and ∂f (x0 ) = {∇G f (x0 )}. In this case, d f (x0 + ty) dt t=0 = hy, ∂f (x0 )i = hy, ∇G f (x0 )i, for all y ∈ X. Exercise 3.5.1. Let X be a Banach space. Then prove that ∂kxk = {j ∈ X ∗ : hx, ji = kxk kjk∗ , kjk∗ = 1} , for all x ∈ X \ {0}. Proof. Let j ∈ ∂kxk. Then hy − x, ji ≤ kxk − kyk ≤ ky − xk, for all y ∈ X. (3.26) It follows that j ∈ X ∗ and kjk ≤ 1. It is clear from (3.26) that kxk ≤ hx, ji, which gives hx, ji = kxk and kjk∗ = 1. Thus, ∂kxk ⊆ {j ∈ X ∗ : hx, ji = kxk and kjk∗ = 1} . Suppose that j ∈ X ∗ such that j ∈ {f ∈ X ∗ : hx, f i = kxk and kf k∗ = 1}. Then hx, ji = kxk and kjk∗ = 1. Thus, hy − x, ji = hy, ji − kxk ≤ kyk − kxk, for all y ∈ X, that is, j ∈ ∂kxk. It follows that {j ∈ X ∗ : hx, ji = kxk and kjk∗ = 1} ⊆ ∂kxk. Therefore, ∂kxk = {j ∈ X ∗ : hx, ji = kxk and kjk∗ = 1} Qamrul Hasan Ansari Advanced Functional Analysis Page 94 Exercise 3.5.2. Let X be a Hilbert space and a ∈ X. Define f : X → R by 1 f (x) = kx − ak2 , 2 for all x ∈ X. Then prove that ∂f (x) = {x − a} for all x ∈ X. Hint 3.5.1. It is easy to see that f is differentiable with ∇f (x) = x − a for all x ∈ X by Proposition 3.4.1. Exercise 3.5.3. Let X be a Hilbert space. Then prove that ∂ 12 k · k2 = I. 4 Geometry of Banach Spaces Among all infinite dimensional Banach spaces, Hilbert spaces have the most important and useful geometric properties. Namely, the inner product on an inner product space satisfies the parallelogram law. It is well known that a normed space is an inner product space if and only if its norm satisfies the parallelogram law. The geometric properties of an inner product space make numerous problems posed in inner product space more manageable than those in normed spaces. Consequently, to extend some of inner product techniques and inner product properties, we study the geometric properties of normed spaces. In this chapter, we study strict convexity, modulus of convexity, uniform convexity and smoothness of normed spaces. Most of the results presented in this chapter are given in the standard books on functional analysis, convex analysis and geometry of Banach spaces, namely, recommended books 1 and 3. 4.1 Strict Convexity and Modulus of Convexity It is well known that the norm of a normed space X is convex, that is, kλx + (1 − λ)yk ≤ λkxk + (1 − λ)kyk, for all x, y ∈ X and λ ∈ [0, 1]. There are several norms of normed spaces which are strictly convex, that is, kλx + (1 − λ)yk < λkxk + (1 − λ)kyk, for all x, y ∈ X with x 6= y and λ ∈ (0, 1). (4.1) We denote by SX the unit sphere SX = {x ∈ X : kxk = 1} in a normed space X. If x, y ∈ SX with x 6= y, then (4.1) reduces to kλx + (1 − λ)yk < 1, for all λ ∈ (0, 1), which says that the unit sphere SX contains no line segments. This suggests strict convexity of normed space. 95 Qamrul Hasan Ansari Advanced Functional Analysis Page 96 Definition 4.1.1. A normed space X is said to be strictly convex if x, y ∈ SX with x 6= y ⇒ kλx + (1 − λ)yk < 1, for all λ ∈ (0, 1). Geometrically speaking, the normed space X is strictly convex if the boundary of the unit sphere in X contains no line segments. Clearly, if k · k is strictly convex, then X is strictly convex. Also, kλx + (1 − λ)yk < 1 = λkxk + (1 − λ)kyk (because kxk = kyk = 1) implies that k · k is strictly convex. Before giving the examples of strictly convex normed spaces, we present the following characterizations. Proposition 4.1.1. 
The following assertions are equivalent: (a) X is strictly convex. (b) If x 6= y and kxk = kyk = 1 (that is, x, y ∈ SX ), then kx + yk < 2. (c) If for any x, y, z ∈ X, kx − yk = kx − zk + kz − yk, then there exists λ ∈ [0, 1] such that z = λx + (1 − λ)y. Proof. (a) ⇒ (b): Assume that X is strictly convex. Then for any x, y ∈ SX , we have kxk = kyk = 1 and therefore, by strict convexity of X, we have kλx + (1 − λ)yk < 1 for all λ ∈ [0, 1]. Take λ = 21 , then we obtain kx + yk < 2, that is, (b) holds. (b) ⇒ (a): Suppose contrary that for each x, y ∈ X, x 6= y, kxk = kyk = 1 and λ0 ∈ (0, 1), we have kλ0 x + (1 − λ0 )yk = 1, that is, λ0 x + (1 − λ0 )y ∈ SX . Take λ0 < λ < 1, then λ0 λ0 λ0 x + (1 − λ0 )y = y, [λx + (1 − λ)y] + 1 − λ λ λ0 λ0 (1 − λ) and hence + 1− as 1 − λ0 = λ λ λ0 λ0 1 = kλ0 x + (1 − λ0 yk ≤ kλx + (1 − λ)yk + 1 − kyk. λ λ This implies that λ0 λ0 λ0 = , kλx + (1 − λ)yk ≥ 1 − 1 − λ λ λ that is, kλx + (1 − λ)yk ≥ 1. Similarly, for 0 < λ < λ0 , we can have kλx + (1 − λ)yk ≥ 1. So for particular λ = 12 , we have 1 kx + yk ≥ 1, that is, kx + yk ≥ 2, a contradiction of the condition of strict convexity. 2 Qamrul Hasan Ansari Advanced Functional Analysis Page 97 (a) ⇒ (c): Let x, y, z ∈ X such that kx − yk = kx − zk + kz − yk. Suppose that kx − zk = 6 0, kz − yk = 6 0 and kx − zk ≤ kz − yk. Then 1 z−y 1 x−z · + · 2 kx − zk 2 kz − yk 1 x−z 1 z−y 1 z−y 1 z−y ≥ − · + · · − · 2 kx − zk 2 kx − zk 2 kx − zk 2 kz − yk 1 (z − y)kz − yk − (z − y)kx − zk 1 z−y 1 x−z − · · + · = 2 kx − zk 2 kx − zk 2 kx − zk kz − yk 1 kx − yk 1 kz − yk − kx − zk = · − · 2 kx − zk 2 kx − zk 1 kx − yk − kz − yk + kx − zk = · 2 kx − zk 1 kx − zk + kz − yk − kz − yk + kx − zk = 1, = · 2 kx − zk since kx − yk = kx − zk + kz − yk. Now since x−z kx−zk ∈ SX and z−y kz−yk x−z kx−zk = 1 and z−y kz−yk = 1, that is, ∈ SX , we have 1 z−y 1 x−z < 1. · + · 2 kx − zk 2 kz − yk Hence, x−z z−y = 2. + kx − zk kz − yk Therefore, x−z z−y = , kx − zk kz − yk and this yields z= by (b) kx − zk kz − yk ·x+ · y. kx − zk + kz − yk kx − zk + kz − yk (c) ⇒ (b): Let x 6= y such that kxk = kyk = x+y = 1. Then kx + yk = kxk + kyk. 2 y. Consequently, there exists λ ∈ (0, 1) such that z = 0 = λx − (1 − λ)y, that is, x = 1−λ λ 1−λ So that kxk = λ kyk. Since kxk = kyk = 1, we have λ = 1/2. Therefore, x = y, a contradiction. Remark 4.1.1. (a) The assertion (b) in Proposition 4.1.1 says that the midpoint (x+y)/2 of two distinct points x and y on the unit sphere SX of X does not lie on SX . In other words, if x, y ∈ SX with kxk = kyk = k(x + y)/2k, then x = y. (b) The assertion (c) in Proposition 4.1.1 says that any three point x, y, z ∈ X satisfying kx − yk = kx − zk + kz − yk must lie ona line; specially, if kx − zk = r1 , ky − zk = r2 r2 r1 and kx − yk = r = r1 + r2 , then z = r x + r y. Qamrul Hasan Ansari Advanced Functional Analysis Page 98 We give some examples of strict convex spaces. Example 4.1.1. Consider X = Rn , n ≥ 2 with norm kxk2 defined by kxk2 = n X i=1 x2i !1/2 , x = (x1 , x2 , . . . , xn ) ∈ Rn . Let x = (1, 0, 0,√ . . . , 0) ∈ Rn and y = (0, 1, 0, . . . , 0) ∈ Rn . Then x 6= y, kxk2 = 1 = kyk2, but kx + yk2 = 2 < 2. Hence X is strictly convex. R SX R The unit sphere in R2 with p respect to the norm kxk2 = k(x1 , x2 )k = x21 + x22 Example 4.1.2. Consider X = Rn , n ≥ 2 with norm k · k1 defined by kxk1 = |x1 | + |x2 | + · · · + |xn |, x = (x1 , x2 , . . . , xn ) ∈ Rn . Then X is not strictly convex. To see this, let x = (1, 0, 0, . . . , 0) ∈ Rn and y = (0, 1, 0, . . . , 0) ∈ Rn . Then x 6= y, kxk1 = 1 = kyk1, but kx + yk1 = 2. 
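Examples 4.1.1 and 4.1.2 can also be checked numerically. The following minimal sketch (plain Python with NumPy, purely illustrative) takes x = (1, 0, ..., 0) and y = (0, 1, 0, ..., 0) and compares ‖λx + (1 − λ)y‖ with λ‖x‖ + (1 − λ)‖y‖: the inequality is strict for the Euclidean norm and an equality for the norm ‖·‖₁, in agreement with the two examples.

```python
import numpy as np

n = 4
x = np.zeros(n); x[0] = 1.0      # x = (1, 0, ..., 0)
y = np.zeros(n); y[1] = 1.0      # y = (0, 1, 0, ..., 0); x != y and ||x|| = ||y|| = 1 in both norms

for lam in (0.25, 0.5, 0.75):
    z = lam * x + (1 - lam) * y
    # Euclidean norm: strict inequality ||lam*x + (1 - lam)*y||_2 < lam*||x||_2 + (1 - lam)*||y||_2 = 1
    print(np.linalg.norm(z, 2), "<", lam * np.linalg.norm(x, 2) + (1 - lam) * np.linalg.norm(y, 2))
    # l1 norm: equality, so the segment [x, y] lies on the unit sphere and ||.||_1 is not strictly convex
    print(np.linalg.norm(z, 1), "=", lam * np.linalg.norm(x, 1) + (1 - lam) * np.linalg.norm(y, 1))
```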
Qamrul Hasan Ansari Advanced Functional Analysis Page 99 R SX R The unit sphere in R2 with respect to the norm kxk1 = k(x1 , x2 )k1 = |x1 | + |x2 | Example 4.1.3. Consider X = Rn , n ≥ 2 with norm k · k∞ defined by kxk∞ = max |xi |, 1≤i≤n x = (x1 , x2 , . . . , xn ) ∈ Rn . Then X is not strictly convex. Indeed, for x = (1, 0, 0, . . . , 0) ∈ Rn and y = (1, 1, 0, . . . , 0) ∈ Rn , we have, x 6= y, kxk∞ = 1 = kyk∞, but kx + yk∞ = 2. R SX R The unit sphere in R2 with respect to the norm kxk∞ = k(x1 , x2 )k∞ = max{|x1 |, |x2 |} Example 4.1.4. The space C[a, b] of all real-valued continuous functions defined on [a, b] with the norm kf k = sup |f (t)|, is not strictly convex. Indeed, choose two functions f and a≤t≤b g defined as follows: f (t) = 1, for all t ∈ [a, b] and g(t) = b−t , b−a for all t ∈ [a, b]. Qamrul Hasan Ansari Advanced Functional Analysis Page 100 Then, clearly, f, g ∈ C[a, b], kf k = kgk = k(f + g)/2k = 1, however, f 6= g. Therefore, C[a, b] is not strictly convex. Exercise 4.1.1. Show that the spaces L1 , L∞ and c0 are not strictly convex. The following proposition provides some equivalent conditions of strict convexity. Proposition 4.1.2. Let X be a normed space. Then X is strictly convex if and only if for each nonzero f ∈ X ∗ , there exists at most one point x ∈ X with kxk = 1 such that hx, f i = f (x) = kf k∗ . Proof. Let X be a strictly convex normed space and f ∈ X ∗ . Suppose there exist two distinct points x, y ∈ X with kxk = kyk = 1 such that f (x) = f (y) = kf k∗. If λ ∈ (0, 1), then kf k∗ = = ≤ < λf (x) + (1 − λ)f (y) (since f (x) = f (y) = kf k∗ ) f (λx + (1 − λ)y) (because f is linear) kf k∗kλx + (1 − λ)yk kf k∗, (since kλx + (1 − λ)yk < 1) which is a contradiction. Therefore, there exists at most one point x in X with kxk = 1 such that f (x) = kf k∗. Conversely, assume that x, y ∈ SX with x 6= y such that k(x + y)/2k = 1. By Hahn-Banach Theorem (Corollary 6.0.1), there exists a functional j ∈ SX ∗ such that kjk∗ = 1 and h(x + y)/2, ji = k(x + y)/2k. Since hx, ji ≤ kxk kjk = 1 and hy, ji ≤ kyk kjk = 1, we have hx, ji = hy, ji because x+y x+y ,j = = 1 ⇔ hx + y, ji = 2 ⇔ hx, ji + hy, ji = 2. 2 2 This implies, by hypothesis, that x = y. Therefore, X is strictly convex. Proposition 4.1.3. A normed space X is strictly convex if and only if the functional h(x) := kxk2 is strictly convex, that is, kλx + (1 − λ)yk2 < λkxk2 + (1 − λ)kyk2, for all x, y ∈ X, x 6= y and λ ∈ (0, 1). Proof. Suppose that X is strictly convex. Let x, y ∈ X, λ ∈ (0, 1). Then we have kλx + (1 − λ)yk2 ≤ = ≤ = (λkxk + (1 − λ)kyk)2 λ2 kxk2 + 2λ(1 − λ)kxk kyk + (1 − λ)2 kyk2 λ2 kxk2 + 2λ(1 − λ) kxk2 + kyk2 + (1 − λ)2 kyk2 λkxk2 + (1 − λ)kyk2. (4.2) (4.3) (4.4) Qamrul Hasan Ansari Advanced Functional Analysis Page 101 Hence h is convex. Now we show that the equality can not hold. Assume that there are x, y ∈ X, x 6= y with kλ0 x + (1 − λ0 )yk2 = λ0 kxk2 + (1 − λ0 )kyk2 , for some λ0 ∈ (0, 1). Then from (4.3), we obtain 2kxk kyk = kxk2 + kyk2. Hence kxk = kyk = kλ0 x + (1 − λ0 )yk which is impossible. Conversely, assume that the functional h(x) := kxk2 is strictly convex. Let x, y ∈ X be such that x 6= y, kxk = kyk = 1 with kλx + (1 − λ)yk = 1 for some λ ∈ (0, 1). Then kλx + (1 − λ)yk2 = 1 = λkxk2 + (1 − λ)kyk2 a contradiction that h is strictly convex. Exercise 4.1.2. Let X be a normed space. Prove that X is strictly convex if and only if for every 1 < p < ∞, kλx + (1 − λ)ykp < λkxkp + (1 − λ)kykp, for all x, y ∈ X, x 6= y and λ ∈ (0, 1). Proof. 
Suppose that X is strictly convex, and let x, y ∈ X with x 6= y. Then by strict convexity of X and hence by strict convexity of k · k, we have kλx + (1 − λ)yk < λkxk + (1 − λ)kyk, for all λ ∈ (0, 1). Therefore, for every 1 < p < ∞, we have kλx + (1 − λ)ykp < (λkxk + (1 − λ)kyk)p , for all λ ∈ (0, 1). (4.5) If kxk = kyk, then kλx + (1 − λ)ykp < kxkp = λkxkp + (1 − λ)kykp. Assume that kxk = 6 kyk, and consider the function λ 7→ λp for 1 < p < ∞. Then it is a convex function and p ap + bp a+b < , for all a, b ≥ 0 and a 6= b. 2 2 Hence from (4.5) with λ = 1/2, we have x+y 2 p ≤ kxk + kyk 2 p < 1 (kxkp + kykp) . 2 (4.6) Qamrul Hasan Ansari Advanced Functional Analysis Page 102 If λ ∈ (0, 1/2], then from (4.5), we have p kλx + (1 − λ)ykp = < < < < x+y 2λ (after adding and substracting λy) + (1 − 2λ)y 2 p x+y + (1 − 2λ)kyk 2λ 2 p x+y + (1 − 2λ)kykp 2λ 2 1 p p 2λ kxk + kyk + (1 − 2λ)kykp 2 λkxkp + (1 − λ)kykp. (by (4.6)) The proof is similar if λ ∈ (1/2, 1). The converse part is obvious. Proposition 4.1.4. Let X be a normed space. Then X is strictly convex if and only if for any two linearly independent elements x, y ∈ X, kx + yk < kxk + kyk. In other words, X is strictly convex if and only if kx + yk = kxk + kyk for 0 6= x ∈ X and y ∈ X, then there exists λ ≥ 0 such that y = λx. Proof. Suppose that X is not strictly convex. Then there exist x and y in X such that kxk = kyk = 1, x 6= y and kx + yk = 2. By hypothesis, for any two linearly independent elements x, y ∈ X, kx + yk < kxk + kyk. Since kx + yk = kxk + kyk, x and y are linearly dependent. Then, x = αy for some α ∈ R, and therefore, kxk = |α| kyk for some α ∈ R which implies that |α| = 1 because kxk = kyk = 1. If α = 1, then x = y, contradicting that x 6= y. So we have α = −1, and therefore, 2 = kx + yk = k − y + yk = 0. This is a contradiction. Conversely, suppose that X is a strictly convex space and there exist linearly independent elements x and y in X such that kx + yk = kxk + kyk. Without loss of generality, we may Qamrul Hasan Ansari Advanced Functional Analysis Page 103 assume that 0 < kxk ≤ kyk. Then, we have 2 > = = = ≥ = x y + kxk kyk x y because = = 1 and X is strictly convex kxk kyk 1 k(xkyk + ykxk)k kxkkyk 1 k(xkyk + ykyk − ykyk + ykxk)k kxkkyk 1 k[kyk(x + y) − (kyk − kxk)y]k kxkkyk 1 k[kykkx + yk − (kyk − kxk)kyk]k kxkkyk 1 k[kyk(kxk + kyk) − (kyk − kxk)kyk]k = 2 kxkkyk (because kx + yk = kxk + kyk). This is a contradiction. We now present the existence and uniqueness of elements of minimal norm in convex subsets of strictly convex normed spaces. Proposition 4.1.5. Let X be a strictly convex normed space and C be a nonempty convex subset of X. Then there is at most one point x ∈ C such that kxk = inf {kzk : z ∈ C}. Proof. Assume that there exist two points x, y ∈ C, x 6= y such that kxk = kyk = inf{kzk : z ∈ C} = d (say). If λ ∈ (0, 1), then by the strict convexity of X, we have kλx + (1 − λ)yk < λkxk + (1 − λ)kyk = λd + (1 − λ)d = d, which is a contradiction, since λx + (1 − λ)y ∈ C by convexity of C. Proposition 4.1.6. Let C be a nonempty closed convex subset of a reflexive strictly convex Banach space X. Then there exists a unique point x ∈ C such that kxk = inf {kzk : z ∈ C}. Proof. Let d := inf {kzk : z ∈ C}. Then there exists a sequence {xn } in C such that lim kxn k = d. Since X is reflexive, by Theorem 6.0.8, there exists a subsequence {xni } n→∞ Qamrul Hasan Ansari Advanced Functional Analysis Page 104 of {xn } that converges weakly to an element x in C. 
The weak lower semicontinuity of the norm gives kxk ≤ lim kxn k = d. n→∞ Therefore, d = kxk. The uniqueness of x follows from Proposition 4.1.5. Definition 4.1.2. Let C be a nonempty subset of a normed space X and x ∈ X. The distance from the point x to the set C is defined as d(x, C) = inf{kx − yk : y ∈ C}. Proposition 4.1.7. Let C be a nonempty closed convex subset of a reflexive strictly convex Banach space X. Then for all x ∈ X, there exists a unique point zx ∈ C such that kx − zx k = d(x, C). Proof. Let x ∈ C. Since C is a nonempty closed convex subset the Banach space X, D = C − x := {y − x : y ∈ C} is a nonempty closed convex subset of X. By Proposition 4.1.6, there exists a unique point ux ∈ D such that kux k = inf{ky −xk : y ∈ C}. For ux ∈ D, there exists a point zx ∈ C such that ux = zx − x. Hence, there exists a unique point zx ∈ C such that kzx − xk = d(x, C). In order to measure the degree of strict convexity of X, we define its modulus of convexity. Definition 4.1.3. Let X be a normed space. A function δX : [0, 2] → [0, 1] defined by kx + yk : kxk ≤ 1, kyk ≤ 1, kx − yk ≥ ε δX (ε) = inf 1 − 2 is called the modulus of convexity of X . Roughly speaking, δX measures how deeply the midpoint of the linear segment joining points in the sphere SX of X must lie within SX . The notion of the modulus of convexity was introduced by Clarkson in 19361 . It allows us to measure the convexity and rotundity of the unit ball of a normed space. Remark 4.1.2. (a) It is easy to see that δX (0) = 0 and δX (ε) ≥ 0 for all ε ≥ 0. (b) The function δ is increasing on [0, 2], that is, if ε1 ≤ ε2 , then δX (ε1 ) ≤ δX (ε2 ). (c) The function δX is continuous on [0, 2), but not necessarily continuous at ε = 2. (d) The modulus of convexity of an inner product space H is r ε2 δH (ε) = 1 − 1 − . 4 1 J.A. Clarkson: Uniform convex spaces, Trans. Amer. Math. Soc., 40 (1936), 396–414. Qamrul Hasan Ansari Advanced Functional Analysis Page 105 (e) The modulus of convexity of ℓp (1 ≤ p < ∞) is p 1/p ε2 . δℓp (ε) = 1 − 1 − 4 (f) δX (ε) ≤ δH (ε) for any normed space X and any inner product space H. That is, an inner product space is the most convex normed space. Remark 4.1.3. We note that for any ε > 0, the number δX (ε) is the largest number for which the following implication always holds: For any x, y ∈ X, kxk ≤ 1, kyk ≤ 1, kx − yk ≥ ε ⇒ x+y ≤ 1 − δX (ε). 2 (4.7) Example 4.1.5. Let X = R2 be a normed space equipped with one of the following norms: k(x1 , x2 )k = kx1 k + kx2 k or k(x1 , x2 )k = max {kx1 k, kx2 k} , for all (x1 , x2 ) ∈ X. Then, δX (ε) = 0 for all ε ∈ [0, 2]. Example 4.1.6. Let X = R2 be a normed space equipped with the following norm: x2 x2 , for all (x1 , x2 ) ∈ X. k(x1 , x2 )k = max kx2 k, x1 + √ , x1 − √ 3 3 Then the unit sphere is a regular hexagon and 1 lim δX (ε) = δX (2) = . 2 ε→2 We now give some important properties of the modulus of convexity of normed spaces. Theorem 4.1.1. A normed space X is strictly convex if and only if δX (2) = 1. Proof. Let X be a strictly convex normed space with modulus of convexity δX (ε). Suppose kxk = kyk = 1 and kx − yk = 2 with x 6= −y. By strict convexity of X, we have 1= x−y x + (−y) = < 1, 2 2 a contradiction. Hence x = −y. Therefore, δX (2) = 1. Conversely, suppose δX (2) = 1. Let x, y ∈ X such that kxk = kyk = k(x + y)/2k = 1, that is, kx + yk = 2 or kx − (−y)k = 2. Then x−y x + (−y) = ≤ 1 − δX (2) = 0, 2 2 which implies that x = y. Thus, kxk = kyk and kx + yk = 2 = kxk + kyk imply that x = y. Therefore, X is strictly convex. 
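Before turning to uniform convexity, we illustrate the modulus of convexity numerically. The rough sketch below (plain Python with NumPy; a Monte-Carlo estimate, so it only approximates the infimum from above) compares the estimated modulus for the Euclidean norm on R² with the exact formula δ_H(ε) = 1 − √(1 − ε²/4) of Remark 4.1.2(d), and shows that the corresponding estimate for ‖·‖₁ is essentially zero, as stated in Example 4.1.5.

```python
import numpy as np

rng = np.random.default_rng(0)

def delta_estimate(norm, eps, trials=100_000):
    """Crude Monte-Carlo upper estimate of the modulus of convexity
    delta_X(eps) = inf{ 1 - ||x + y||/2 : ||x|| <= 1, ||y|| <= 1, ||x - y|| >= eps } in R^2."""
    best = 1.0
    for _ in range(trials):
        x, y = rng.uniform(-1, 1, 2), rng.uniform(-1, 1, 2)
        if norm(x) <= 1 and norm(y) <= 1 and norm(x - y) >= eps:
            best = min(best, 1 - norm(x + y) / 2)
    return best

eps = 1.0
l2 = lambda v: np.linalg.norm(v, 2)
l1 = lambda v: np.linalg.norm(v, 1)

print(delta_estimate(l2, eps))        # upper estimate, slightly above the exact value below
print(1 - np.sqrt(1 - eps**2 / 4))    # exact delta_H(eps) = 1 - sqrt(1 - eps^2/4), about 0.134
print(delta_estimate(l1, eps))        # essentially 0: ||.||_1 on R^2 is not uniformly convex
```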
Qamrul Hasan Ansari 4.2 Advanced Functional Analysis Page 106 Uniform Convexity The strict convexity of a normed space X says that the midpoint (x + y)/2 of the segment joining two distinct points x, y ∈ SX with kx − yk ≥ ε > 0 does not lie on SX , that is, x+y < 1. 2 In such spaces, we have no information about 1 − k(x + y)/2k, the distance of midpoints from the unit sphere SX . A stronger property than the strict convexity which provides information about the distance 1 − k(x + y)/2k is uniform convexity. Definition 4.2.1. A normed space X is said to be uniformly convex if for any ε, 0 < ε ≤ 2, the inequalities kxk ≤ 1, kyk ≤ 1 and kx − yk ≥ ε imply that there exists a δ = δ(ε) > 0 such that k(x + y)/2k ≤ 1 − δ. This says that if x and y are in the closed unit ball BX := {x ∈ X : kxk ≤ 1} with kx − yk ≥ ε > 0, the midpoint of x and y lies inside the unit ball BX at a distance of at least δ from the unit sphere SX . Roughly speaking, if two points on the unit sphere of a uniformly convex space are far apart, then their midpoint must be well within it. The concept of uniform convexity was introduced by Clarkson2 . Example 4.2.1. Every Hilbert space H is a uniformly convex space. In fact, the parallelogram law gives us kx + yk2 = 2(kxk2 + kyk2) − kx − yk2 , for all x, y ∈ H. Suppose x, y ∈ BH with x 6= y and kx − yk ≥ ε. Then kx + yk2 ≤ 4 − ε2 . Therefore, where δ(ε) = 1 − p k(x + y)/2k ≤ 1 − δ(ε), 1 − ε2 /4. Thus, H is uniformly convex. Example 4.2.2. The spaces ℓ1 and ℓ∞ are not uniformly convex. To see it, take x = (1, 0, 0, 0, . . .), y = (0, −1, 0, 0, . . .) ∈ ℓ1 and ε = 1. Then kxk1 = 1, kyk1 = 1, kx − yk1 = 2 > 1 = ε. 2 J.A. Clarkson: Uniform convex spaces, Trans. Amer. Math. Soc., 40 (1936), 396–414. Qamrul Hasan Ansari Advanced Functional Analysis Page 107 However, k(x + y)/2k1 = 1 and there is no δ > 0 such that k(x + y)/2k1 ≤ 1 − δ. Thus, ℓ1 is not uniformly convex. Similarly, if we take x = (1, 1, 1, 0, 0, . . .), y = (1, 1, −1, 0, 0, . . .) ∈ ℓ∞ and ε = 1, then kxk∞ = 1, kyk∞ = 1, kx − yk∞ = 2 > 1 = ε. Since k(x + y)/2k∞ = 1, ℓ∞ is not uniformly convex. Exercise 4.2.1. Fix µ > 0 and let C[0, 1] be the space with the norm k · kµ defined by kxkµ = kxk0 + µ Z 1 1/2 x (t)dt , 2 0 where k · k0 is the usual supremum norm. Then kxk0 ≤ kxkµ ≤ (1 + µ)kxk0 , for all x ∈ C[0, 1], and the two norms are equivalent with k · kµ near k · k0 for small µ. However (C[0, 1], k · k0 ) is not strictly convex while for any µ > 0, (C[0, 1], k · kµ ) is. On the other hand, it is easy to see that for any ε ∈ (0, 2), there exist functions x, y, ∈ C[0, 1] with kxkµ = kykµ = 1, kx − yk = ε and k(x + y)/2k arbitrary near 1. Thus, (C[0, 1], k · kµ ) is not uniformly convex. Exercise 4.2.2. Show that the normed spaces ℓp , ℓnp (whenever n is a nonnegative integer), and Lp [a, b] with 1 < p < ∞ are uniformly convex. Exercise 4.2.3. Show that the normed spaces ℓa , c, ℓ∞ , L1 [a, b], C[a, b] and L∞ [a, b] are not strictly convex. Theorem 4.2.1. Every uniformly convex normed space is strictly convex. Proof. It follows directly from Definition 4.2.1. Remark 4.2.1. The converse of Theorem 4.2.1 is not true in general. Let β > 0 and X = c0 the space of all sequences of scalars which converge to zero, that is, c0 = {x = (x1 , x2 , . . . , xn , . . .) : {xi }∞ i=1 is convergent to zero} with the norm k · kβ defined by kxkβ = kxkc0 + β ∞ X xi 2 i=1 i !1/2 , x = {xi } ∈ c0 . 
The spaces (c0 , k · kβ ) for β > 0 are strictly convex, but not uniformly convex, while c0 with its usual norm kxk∞ = sup |xi |, is not strictly convex. i∈N Remark 4.2.2. The strict convexity and uniform convexity are equivalent in finite dimensional spaces. Qamrul Hasan Ansari Advanced Functional Analysis Page 108 Theorem 4.2.2. Let X be a normed space. Then X is uniformly convex if and only if for two sequences {xn } and {yn } in X, kxn k ≤ 1, kyn k ≤ 1 and lim kxn + yn k = 2 ⇒ n→∞ lim kxn − yn k = 0. n→∞ (4.8) Proof. Let X be uniformly convex. Assume that {xn } and {yn } are two sequences in X such that kxn k ≤ 1, kyn k ≤ 1 for all n ∈ N and lim kxn + yn k = 2. Suppose contrary that n→∞ lim kxn − yn k = 6 0. Then for some ε > 0, there exists a subsequence {ni } of {n} such that n→∞ kxni − yni k ≥ ε. Since X is uniformly convex, there exists δ(ε) > 0 such that kxni + yni k ≤ 2(1 − δ(ε)). (4.9) Since lim kxn + yn k = 2, it follows from (4.9) that n→∞ 2 ≤ 2(1 − δ(ε)), a contradiction. Conversely, assume that the condition (4.8) is satisfied. If X is not uniformly convex, then for ε > 0, there is no δ(ε) such that kxk ≤ 1, kyk ≤ 1, kx − yk ≥ ε ⇒ kx + yk ≤ 2(1 − δ(ε)), and we can find sequences {xn } and {yn } in X such that (i) kxn k ≤ 1, kyn k ≤ 1, (ii) kxn + yn k ≥ 2(1 − 1/n), (iii) kxn − yn k ≥ ε. Clearly kxn − yn k ≥ ε which contradicts the hypothesis, since (ii) gives lim kxn + yn k = 2. n→∞ Thus, X must be uniformly convex. Theorem 4.2.3. A normed space X is uniformly convex if and only if δX (ε) > 0 for all ε ∈ (0, 2]. Proof. Let X be a uniformly convex normed space. Then for ε > 0, there exists δ(ε) > 0 ≤ 1 − δ(ε), that is, such that x+y 2 0 < δ(ε) ≤ 1 − x+y 2 Qamrul Hasan Ansari Advanced Functional Analysis Page 109 for all x, y ∈ X with kxk ≤ 1, kyk ≤ 1 and kx − yk ≥ ε. Therefore, from the definition of modulus of convexity, we have δX (ε) > 0. Conversely, suppose that X is a normed space with modulus of convexity δX such that δX (ε) > 0 for all ε ∈ (0, 2]. Let x, y ∈ X such that kxk = 1, kyk = 1 with kx − yk ≥ ε for fixed ε ∈ (0, 2]. By the definition of modulus of convexity δX (ε), we have 0 < δX (ε) ≤ 1 − It follows that x+y . 2 x+y ≤ 1 − δX (ε), 2 which is independent of x and y. Therefore, X is uniformly convex. Theorem 4.2.4. Let {xn } be a sequence in an uniformly convex Banach space X. Then, xn ⇀ x, kxn k → kxk ⇒ xn → x. Proof. If x = 0, then it is obvious that xn → 0. So, let x 6= 0. Put yn = kxxnn k for n large x . By construction, kyn k = kyk = 1, yn ⇀ y, and thus yn + y ⇀ 2y. enough, and y = kxk Suppose that xn 6→ x. Then, yn 6→ y. This implies that there exist ε > 0 and a subsequence {ynk } of {yn } such that kynk − yk ≥ ε. Since X is uniformly convex, there exists δX (ε) > 0 such that y nk + y ≤ 1 − δX (ε). 2 Since ynk ⇀ y without loss of generality, we have kyk ≤ lim inf k→∞ y nk + y ≤ 1 − δX (ε), 2 which contradicts kyk = 1. Therefore, xn → x. For the class of uniform convex Banach spaces, we have the following important results. Theorem 4.2.5. Every uniformly convex Banach space is reflexive. Proof. Let X be a uniformly convex Banach space. Let SX ∗ := {j ∈ X ∗ : kjk∗ = 1} be the unit sphere in X ∗ and f ∈ SX ∗ . Suppose that {xn } is a sequence in SX such that f (xn ) → 1. We show that {xn } is a Cauchy sequence. Assume contrary that there exist ε > 0 and two subsequences {xni } and {xnj } of {xn } such that kxni − xnj k ≥ ε. The uniform convexity of X guarantees that there exists δX (ε) > 0 such that k(xni + xnj )/2k < 1 − δX (ε). 
Observe that |f ((xni + xnj )/2)| ≤ kf k∗ k(xni + xnj )/2k < kf k∗ (1 − δX (ε)) = 1 − δX (ε) Qamrul Hasan Ansari Advanced Functional Analysis Page 110 and f (xn ) → 1, yield a contradiction. Hence {xn } is a Cauchy sequence and there exists a point x in X such that xn → x. Clearly x ∈ SX . In fact, kxk = k lim xn k = lim kxn k = 1. n→∞ n→∞ Using James Theorem 6.0.3 (which states that a Banach space is reflexive if and only if for each f ∈ SX ∗ , there exists x ∈ SX such that f (x) = 1), we conclude that X is reflexive. Remark 4.2.3. Every finite-dimensional Banach space is reflexive, but it need not be unin X n formly convex. For example, X = R , n ≥ 2 with the norm kxk1 = |xi | is not uniformly convex. However, it is finite dimensional space. i=1 Combining Proposition 4.1.6 and Theorems 4.2.1 and 4.2.5, we obtain the following interesting result. Theorem 4.2.6. Let C be a nonempty closed convex subset of a uniformly convex Banach space X. Then C has a unique element of minimum norm, that is, there exists a unique element x ∈ C such that kxk = inf {kzk : z ∈ C}. Theorem 4.2.7 (Intersection Theorem). Let {Cn }∞ n=1 be a decreasing sequence of nonempty bounded closed convex subsets of a uniformly convex Banach space X. Then, the inter∞ \ section Cn is a nonempty closed convex subset of X. n=1 Proof. Let x be a point in X which does not belong to C1 , rn = d(x, Cn ) and r = lim rn . n→∞ Also, let {qn } be a sequence of positive numbers that decreases to zero, Dn = {y ∈ Cn : kx − yk ≤ r + qn }, and dn the diameter of Dn . If y and z belong to Dn and ky − zk ≥ dn − qn , then ky − zk y+z ≤ 1−δ (r + qn ), x− 2 r + qn and dn − qn (r + qn ). rn ≤ 1 − δ r + qn Let lim dn = d, then we obtain a contradiction unless d = 0. This in turn implies that n→∞ ∞ ∞ \ \ Dn 6= ∅, and so is Cn 6= ∅. n=1 n=1 Remark 4.2.4. Theorem 4.2.7 remains valid if the sequence {Cn }∞ n=1 is replaced by an arbitrary decreasing net of nonempty bounded closed convex sets. However, Theorem 4.2.7 does not hold in arbitrary Banach spaces. For example, consider the space X = C[0, 1] and Cn = {x ∈ C[0, 1] : 0 ≤ x(t) ≤ tn for all 0 ≤ t ≤ 1 and x(1) = 1}. Qamrul Hasan Ansari 4.3 Advanced Functional Analysis Page 111 Duality Mapping and Its Properties Before defining the duality mapping and giving its fundamental properties, we mention the following notations and definitions: Let T : X ⇒ X ∗ be a set-valued mapping. The domain Dom(T ), range R(T ), inverse T −1 , and graph G(T ) are defined as Dom(T ) = {x ∈ X : T (x) 6= ∅}, [ R(T ) = T (x), x∈Dom(T ) T −1 (y) = {x ∈ X : y ∈ T (x)}, G(T ) = {(x, y) ∈ X × X ∗ : y ∈ T (x), x ∈ Dom(T )}. The graph G(T ) of T is a subset of X × X ∗ . The mapping T is said to be injective if T (x) ∩ T (y) = ∅ for all x 6= y. Definition 4.3.1. Let X ∗ be the dual of a normed space X. A set-valued mapping J : X ⇒ X ∗ is said to be normalized duality if J(x) = j ∈ X ∗ : hx, ji = kxk2 = kjk2∗ , equivalently, J(x) = {j ∈ X ∗ : hx, ji = kxk kjk and kxk = kjk} . Example 4.3.1. In a real Hilbert space H, the normalized duality mapping is the identity mapping. Indeed, let x ∈ H with x 6= 0. Since H = H ∗ and hx, xi = kxk · kxk, we have x ∈ J(x). Assume that y ∈ J(x). By the definition of J, we have hx, yi = kxkkyk and kxk = kyk. Since kx − yk2 = kxk2 + kyk2 − 2hx, yi, it follows that x = y. Therefore, J(x) = {x}. The following theorem presents some fundamental properties of duality mappings in Banach spaces. Proposition 4.3.1. Let X be a Banach space and J : X ⇒ X ∗ be a normalized duality mapping. 
Then the following assertions hold: (a) J(0) = {0}. (b) For each x ∈ X, J(x) is nonempty closed convex and bounded subset of X ∗ . (c) J(λx) = λJ(x) for all x ∈ X and real λ, that is, J is homogeneous. Qamrul Hasan Ansari Advanced Functional Analysis Page 112 (d) J is a monotone set-valued map, that is, hx − y, jx − jy i ≥ 0, for all x, y ∈ X, jx ∈ J(x) and jy ∈ J(y). (e) kxk2 − kyk2 ≥ 2hx − y, ji, for all x, y ∈ X and j ∈ J(y). (f) If X ∗ is strictly convex, then J is single-valued. (g) If X is strictly convex, then J is injective, that is, x 6= y ⇒ J(x) ∩ J(y) = ∅. (h) If X is reflexive with strictly convex dual X ∗ , then J is demicontinuous, that is, if xn → x in X implies J(xn ) ⇀ J(x). Proof. (a) It is obvious. (b) Let x ∈ X. If x = 0, then it is done by Part (a). So, we assume that x 6= 0. Then, by the Hahn-Banach Theorem, there exists f ∈ X ∗ such that hx, f i = kxk and kf k∗ = 1. Set j := kxkf . Then hx, ji = kxkhx, f i = kxk2 and kjk∗ = kxk, and it follows that J(x) is nonempty for each x 6= 0. So, we can assume that f1 , f2 ∈ J(x). Then, we have hx, f1 i = kxkkf1 k∗ , kxk = kf1 k∗ hx, f2 i = kxkkf2 k∗ , kxk = kf2 k∗ , and and therefore, for t ∈ (0, 1), we have hx, tf1 + (1 − t)f2 i = kxk (tkf1 k∗ + (1 − t)kf2 k∗ ) = kxk2 . Since kxk2 = hx, tf1 + (1 − t)f2 i ≤ ktf1 + (1 − t)f2 k∗ kxk ≤ (tkf1 k∗ + (1 − t)kf2 k∗ ) kxk = kxk2 , we have which gives us kxk2 ≤ kxkktf1 + (1 − t)f2 k∗ ≤ kxk2 , kxk2 = kxkktf1 + (1 − t)f2 k∗ , that is, ktf1 + (1 − t)f2 k∗ = kxk. Therefore, hx, tf1 + (1 − t)f2 i = kxk ktf1 + (1 − t)f2 k∗ and kxk = ktf1 + (1 − t)f2 k∗ , and thus, tf1 + (1 − t)f2 ∈ J(x) for all t ∈ (0, 1), that is, J(x) is a convex set. Similarly, we can show that J(x) is a closed and bounded set in X ∗ . Qamrul Hasan Ansari Advanced Functional Analysis Page 113 (c) For λ = 0, it is obvious that J(0x) = 0J(x). Assume that j ∈ J(λx) for λ 6= 0. We first show that J(λx) ⊆ λJ(x). Since j ∈ J(λx), we have hλx, ji = kλxkkjk∗ and kλxk = kjk∗ , and thus, hλx, ji = kjk2∗ . Hence hx, λ−1 ji = λ−1 hλx, λ−1 ji = λ−2 hλx, ji = λ−2 kλxkkjk∗ = λ−1 kjk∗ kjk∗ = kλ−1 jk2∗ = kxk2 . This shows that λ−1 j ∈ J(x), that is, j ∈ λJ(x). Therefore, J(λx) ⊆ λJ(x). Similarly, we can show that λJ(x) ⊆ J(λx). Thus, J(λx) = λJ(x). (d) Let jx ∈ J(x) and jy ∈ J(y) for x, y ∈ X. Then, we have hx − y, jx − jy i = ≥ ≥ = hx, jx i − hx, jy i − hy, jx i + hy, jy i kxk2 + kyk2 − kxkkjy k∗ − kykkjxk∗ kxk2 + kyk2 − 2kxkkyk (kxk − kyk)2 ≥ 0. (e) Let j ∈ J(x), x, y ∈ X. Then, we have kxk2 kyk2 − 2hx − y, ji = = = ≥ kxk2 − kyk2 − 2hx, ji + 2hy, ji kxk2 − kyk2 − 2hx, ji + 2kyk2 kxk2 + kyk2 − 2hx, ji kxk2 + kyk2 − 2kxk kyk = (kxk − kyk)2 ≥ 0. (f) Let j1 , j2 ∈ J(x) for x ∈ X. Then, we have hx, j1 i = kj1 k2∗ = kxk2 and hx, j2 i = kj2 k2∗ = kxk2 . Adding the above identities, we obtain hx, j1 + j2 i = 2kxk2 . Since 2kxk2 = hx, j1 + j2 i ≤ kxkkj1 + j2 k∗ , we have kj1 k∗ + kj2 k∗ = 2kxk ≤ kj1 + j2 k∗ . It follows from the fact kj1 + j2 k∗ ≤ kj1 k∗ + kj2 k∗ that kj1 + j2 k∗ = kj1 k∗ + kj2 k∗ . (4.10) Qamrul Hasan Ansari Advanced Functional Analysis Page 114 Since X ∗ is strictly convex and kj1 + j2 k∗ = kj1 k∗ + kj2 k∗ , there exists λ ∈ R such that j1 = λj2 . Since hx, j2 i = hx, j1 i = hx, λj2 i = λhx, j2 i, this implies that λ = 1, and hence, j1 = j2 . Therefore, J is single-valued. (g) Suppose that j ∈ J(x) ∩ J(y) for x, y ∈ X. Since j ∈ J(x) and j ∈ J(y), it follows from kjk2∗ = kxk2 = kyk2 = hx, ji = hy, ji that kxk2 = h(x + y)/2, ji ≤ k(x + y)/2kkxk, which gives that kxk = kyk ≤ k(x + y)/2k ≤ kxk. Hence kxk = kyk = k(x + y)/2k. 
Since X is strictly convex and kxk = kyk = k(x + y)/2k, we have x = y. Therefore, J is one-one. (h) It is sufficient to prove the demicontinuity of J on the unit sphere SX . For this, let {xn } be a sequence in SX such that xn → z in X. Then kJ(xn )k∗ = kxn k = 1 for all n ∈ N, that is, {J(xn )} is bounded. Since X is reflexive, so is X ∗ . Then, there exists a subsequence {J(xnk )} of {J(xn )} in X ∗ such that {J(xnk )} converges weakly to some j in X ∗ . Since xnk → z and J(xnk ) ⇀ j, we have hz, ji = lim hxnk , J(xnk )i = lim kxnk k2 = 1. k→∞ k→∞ Moreover, kjk∗ ≤ = lim kJxnk k∗ = lim (kJxnk k∗ kxnk k) k→∞ k→∞ lim hxnk , Jxnk i = hz, ji = kjk∗ , k→∞ that is, kjk = hz, ji. This shows that hz, ji = kjk∗ kzk and kjk∗ = kzk, (because z ∈ SX and so kzk = 1, also hz, ji = 1 and so kjk = 1). This implies that j = J(z). Thus, every subsequence {J(xni )} converging weakly to j ∈ X ∗ . This gives J(xn ) ⇀ J(z). Therefore, J is demicontinuous. The following inequalities are very useful in many applications. Corollary 4.3.1. Let X be a Banach space and J : X ⇒ X ∗ be the duality mapping. Then the following statements hold: (a) kx + yk2 ≥ kxk2 + 2hy, jx i, for all x, y ∈ X, where jx ∈ J(x). (b) kx + yk2 ≤ kyk2 + 2hx, jx+y i, for all x, y ∈ X, where jx+y ∈ J(x + y). Qamrul Hasan Ansari Advanced Functional Analysis Page 115 Proof. (a) Replacing y by x + y in (4.11), we get the inequality. (b) Replacing x by x + y in (4.11), we get the result. Proposition 4.3.2. Let X be a Banach space and J : X ⇒ X ∗ be a normalized duality mapping. For each x, y ∈ X, the following statements are equivalent: (a) kxk ≤ kx + tyk, for all t > 0. (b) There exists j ∈ J(x) such that hy, ji ≥ 0. Proof. (a) ⇒ (b). For t > 0, let ft ∈ J(x + ty). Then hx + ty, ft i = kx + tyk kftk. Define gt = kffttk∗ . Then kgt k∗ = 1. Since gt ∈ kft k−1 ∗ J(x + ty), we have kxk ≤ kx + tyk = kft k−1 ∗ hx + ty, ft i = hx + ty, gti = hx, gt i + thy, gti ≤ kxk + thy, gt i. (since kgt k∗ = 1) By the Banach-Alaoglu Theorem 6.0.4 (which states that the unit ball in X ∗ is weak*compact), the net {gt } has a limit point g ∈ X ∗ such that kgk∗ ≤ 1, hx, gi ≥ kxk and hy, gi ≥ 0. Observe that kxk ≤ hx, gi ≤ kxkkgk∗ = kxk, which gives that hx, gi = kxk and kgk∗ = 1. Set j = gkxk, then j ∈ J(x) and hy, ji ≥ 0. (b) ⇒ (a). Assume that for x, y ∈ X with x 6= 0, there exists j ∈ J(x) such that hy, ji ≥ 0. Then for t > 0, kxk2 = hx, ji ≤ hx, ji + hty, ji = hx + ty, ji ≤ kx + tykkxk, which implies that kxk ≤ kx + tyk. Qamrul Hasan Ansari Advanced Functional Analysis Page 116 Proposition 4.3.3. Let X be a Banach space and ϕ : X → R be a function defined by ϕ(x) = kxk2 /2. Then the subdifferential ∂ϕ of ϕ coincides with the normalized duality mapping J : X ⇒ X ∗ defined by J(x) = {j ∈ X ∗ : hx, ji = kxkkjk∗ , kjk∗ = kxk} , for x ∈ X. Proof. We first show that J(x) ⊆ ∂ (kxk2 /2). Let x 6= 0 and j ∈ J(x). Then for y ∈ X, we have kyk2 kxk2 kyk2 kxk2 − − hy − x, ji = − − hy, ji + hx, ji 2 2 2 2 kyk2 kxk2 ≥ − − kyk kjk∗ + kxk kjk∗ 2 2 (because hy, ji ≤ kyk kjk∗ and hx, ji = kxk kjk∗) kyk2 kxk2 = − − kyk kxk + kxk2 (because kjk∗ = kxk) 2 2 kxk2 kyk2 + − kxkkyk ≥ 2 2 (kxk − kyk)2 = ≥ 0. 2 It follows that kxk2 kyk2 − ≤ hx − y, ji. 2 2 Hence j ∈ ∂ (kxk2 /2). Thus, J(x) ⊆ ∂ (kxk2 /2) for all x 6= 0. We now prove ∂ (kxk2 /2) ⊆ J(x) for all x 6= 0. Suppose j ∈ ∂ kxk2 kyk2 − ≤ hx − y, ji, 2 2 kxk2 2 for all y ∈ X. for 0 6= x ∈ X. 
Then, (4.11) Observe that kxkkjk∗ = sup {hy, jikxk : kyk = 1} (since j is a continuous linear functional) = sup {hy, ji : kxk = kyk = 1} ≤ sup {hy, ji : kxk = kyk} kyk2 kxk2 ≤ sup hx, ji + − : kxk = kyk (by using (4.11)) 2 2 ≤ kxkkjk∗ . Thus, hx, ji = kxkkjk∗ . (4.12) Qamrul Hasan Ansari Advanced Functional Analysis Page 117 To see j ∈ J(x), we show that kjk∗ = kxk. For t > 1, we take y = tx ∈ X in (4.11), then we obtain kxk2 t2 kxk2 − ≤ hx − tx, ji, 2 2 that is, (1 − t2 ) kxk2 ≤ (1 − t)hx, ji, 2 which implies that hx, ji ≤ (t + 1) kxk2 . 2 Letting t → 1, we get hx, ji ≤ kxk2 . (4.13) Further, for t > 0, we take y = (1 − t)x ∈ X in (4.11), then we obtain kxk2 k(1 − t)2 xk2 − ≤ hx − (1 − t)x, ji, 2 2 that is, 1 − (1 − t)2 It follows that (2 − t) kxk2 2 ≤ thx, ji. kxk2 ≤ hx, ji. 2 Letting t → 0, we get kxk2 ≤ hx, ji. (4.14) From (4.12), (4.13) and (4.14), we obtain kjk∗ = kxk. Thus, ∂ (kxk2 /2) ⊆ J(x). Therefore, J(x) = ∂ (kxk2 /2) for all x 6= 0. Qamrul Hasan Ansari 4.4 Advanced Functional Analysis Page 118 Smooth Banach Spaces and Modulus of Smoothness Let C be a nonempty closed convex subset of a normed space X such that the origin belongs to the interior of C. A linear functional j ∈ X ∗ is said to be a tangent to C at the point x0 ∈ ∂C if j(x0 ) = sup{j(x) : x ∈ C}, where ∂C denotes the boundary of C. If H = {x ∈ X : j(x) = 0} is the hyperplane, then the set H + x0 is called a tangent hyperplane to C at x0 . Definition 4.4.1. A Banach space X is said to be smooth if for each x ∈ SX , there exists a unique functional jx ∈ X ∗ such that hx, jx i = kxk and kjx k = 1. In other words, X is smooth if for all x ∈ SX , there exists jx ∈ SX ∗ such that hx, jx i = 1. Geometrically, the smoothness condition means that at each point x of the unit sphere, there is exactly one supporting hyperplane {jx = 1} := {y ∈ X : hy, jx i = 1}. This means that the hyperplane {jx = 1} is tangent at x to the unit ball and this unit ball is contained in the half space {jx ≤ 1} := {y ∈ X : hy, jx i ≤ 1}. Example 4.4.1. ℓp , Lp (1 < p < ∞) are smooth Banach spaces. However, c0 , ℓ1 , L1 , ℓ∞ , L∞ are not smooth. Theorem 4.4.1. Let X be a Banach space. Then the following assertions hold. (a) If X ∗ is strictly convex, then X is smooth. (b) If X ∗ is smooth, then X is strictly convex. Proof. (a) Assume that X is not smooth. Then there exist x0 ∈ SX and j1 , j2 ∈ SX ∗ with j1 6= j2 such that hx0 , j1 i = hx0 , j2 i = 1. Since kj1 + j2 k ≤ kj1 k + kj2 k = 2, and hx0 , j1 + j2 i = hx0 , j1 i + hx0 , j2 i = 2, we have (j1 + j2 )/2 ∈ SX ∗ . Hence X ∗ is not strictly convex. (b) Suppose that X is not strictly convex. Then there exist x, y ∈ SX with x 6= y such that , j = 1. Then, we have kx + yk = 2. Take j ∈ SX ∗ with x+y 2 x+y 1 1 1 1 1= , j = hx, ji + hy, ji ≤ + , 2 2 2 2 2 and hence, hx, ji = hy, ji = kjk = 1. Since x, y ∈ X ⊆ X ∗∗ , we have x, y ∈ J(j). So, for x 6= y, we have X ∗ is not smooth. It is well known that for a reflexive Banach space X, the dual spaces X and X ∗ can be equivalently renormed as strictly convex spaces such that the duality is preserved. By using this fact, we have the following result. Qamrul Hasan Ansari Advanced Functional Analysis Page 119 Theorem 4.4.2. Let X be a reflexive Banach space. Then the following assertions hold. (a) X is smooth if and only if X ∗ is strictly convex. (b) X is strictly convex if and only if X ∗ is smooth. We now establish a relation between smoothness and Gâteaux differentiability of a norm. Theorem 4.4.3. 
A Banach space X is smooth if and only if the norm is Gâteaux differentiable on X\{0}. Proof. Since the proper convex continuous functional ϕ is Gâteaux differentiable if and only if it has a unique subgradient, we have norm is Gâteaux differentiable at x ⇔ ∂kxk = {j ∈ X ∗ : hx, ji = kxk, kjk∗ = 1} is singleton ⇔ there exists a unique j ∈ X ∗ such that hx, ji = kxk and kjk∗ = 1 ⇔ smooth. Corollary 4.4.1. Let X be a Banach space and J : X ⇒ X ∗ be a duality mapping. Then the following statements are equivalent: (a) X is smooth. (b) J is single-valued. (c) The norm of X is Gâteaux differentiable with ▽kxk = kxk−1 J(x). We now study the continuity property of duality mappings. Theorem 4.4.4. Let X be a smooth Banach space and J : X → X ∗ be a single-valued duality mapping. Then J is norm to weak*-continuous. Proof. We show that xn → x implies J(xn ) → J(x) in the weak* topology. Let xn → x and set fn := J(xn ). Then hxn , fn i = kxn kkfn k∗ and kxn k = kfn k∗ . Qamrul Hasan Ansari Advanced Functional Analysis Page 120 Since {xn } is bounded, {fn } is bounded in X ∗ . Then there exists a subsequence {fnk } of {fn } such that fnk → f ∈ X ∗ in the weak* topology. Then we show that f = J(x). Since the norm of X ∗ is lower semicontinuous in weak* topology, we have kf k∗ ≤ lim inf kfnk k∗ = lim inf kxnk k = kxk. k→∞ k→∞ Since hx, f − fnk i → 0 and hx − xnk , fnk i → 0, it follows from the fact |hx, f i − kxnk k2 | = |hx, f i − hxnk , fnk i| ≤ |hx, f − fnk i| + |hx − xnk , fnk i| → 0 that hx, f i = kxk2 . As a result kxk2 = hx, f i ≤ kf k∗ kxk. Thus, we have hx, f i = kxk2 , kxk = kf k∗ . Therefore, f = J(x). Qamrul Hasan Ansari 4.5 Advanced Functional Analysis Page 121 Metric Projection on Normed Spaces Let C be a nonempty subset of a normed space X and x ∈ X. An element y0 ∈ C is said to be a best approximation to x if kx − y0 k = d(x, C), where d(x, C) = inf kx − yk. The number d(x, C) is called the distance from x to C. y∈C The (possibly empty) set of all best approximations from x to C is denoted by PC (x) = {y ∈ C : kx − yk = d(x, C)}. This defines a mapping PC from X into 2C and it is called the metric projection onto C. The metric projection mapping is also known as the nearest point projection mapping, proximity mapping or best approximation operator. The set C is said to be proximinal (respectively, Chebyshev) set if each x ∈ X has at least (respectively, exactly) one best approximation in C. Remark 4.5.1. (a) C is proximinal if PC (x) 6= ∅ for all x ∈ X. (b) C is Chebyshev if PC (x) is singleton for each x ∈ X. (c) The set of best approximations is convex if C is convex. Proposition 4.5.1. If C is a proximinal subset of a Banach space X, then C is closed. Proof. Suppose contrary that C is not closed. Then there exists a sequence {xn } in C such that xn → x and x ∈ / C, but x ∈ X. It follows that d(x, C) ≤ kxn − xk → 0, so that, d(x, C) = 0. Since x ∈ / C, we have kx − yk > 0, for all y ∈ C. This implies that PC (x) = ∅ which contradicts PC (x) 6= ∅. Theorem 4.5.1 (The Existence of Best Approximations). Let C be a nonempty weakly compact convex subset of a Banach space X and x ∈ X. Then x has a best approximation in C, that is, PC (x) 6= ∅. Proof. Define the function f : C → R+ by f (y) = kx − yk, for all y ∈ C. Then, f is lower semicontinuous. Since C is weakly compact, by Theorem 6.0.10, there exists y0 ∈ C such that kx − y0 k = inf kx − yk. y∈C Qamrul Hasan Ansari Advanced Functional Analysis Page 122 Corollary 4.5.1. 
Theorem 4.5.1 (The Existence of Best Approximations). Let C be a nonempty weakly compact convex subset of a Banach space X and x ∈ X. Then x has a best approximation in C, that is, P_C(x) ≠ ∅.

Proof. Define the function f : C → R₊ by f(y) = ‖x − y‖ for all y ∈ C. Then f is a continuous convex functional, and in particular proper and lower semicontinuous. Since C is weakly compact and convex, by Theorem 6.0.10 there exists y₀ ∈ C such that
‖x − y₀‖ = inf_{y∈C} ‖x − y‖,
that is, y₀ ∈ P_C(x).

Corollary 4.5.1. Let C be a nonempty closed convex subset of a reflexive Banach space X. Then each element x ∈ X has a best approximation in C. (Indeed, it suffices to apply Theorem 4.5.1 to C ∩ {y ∈ X : ‖x − y‖ ≤ r} for any r > d(x, C); this set is closed, convex and bounded, hence weakly compact by Theorem 6.0.6.)

Theorem 4.5.2 (The Uniqueness of Best Approximations). Let C be a nonempty convex subset of a strictly convex Banach space X. Then for each x ∈ X, C contains at most one best approximation to x.

Proof. Assume that y₁, y₂ ∈ C are best approximations to x ∈ X. Since the set of best approximations is convex, (y₁ + y₂)/2 is also a best approximation to x. Set r := d(x, C). Then
r = ‖x − y₁‖ = ‖x − y₂‖ = ‖x − (y₁ + y₂)/2‖.
If r = 0, then y₁ = x = y₂. So assume r > 0. Then
‖(x − y₁) + (x − y₂)‖ = 2r = ‖x − y₁‖ + ‖x − y₂‖.
By the strict convexity of X, equality in the triangle inequality for the nonzero vectors x − y₁ and x − y₂ implies that x − y₁ = t(x − y₂) for some t > 0. Taking norms in this relation, we obtain r = tr, that is, t = 1, which gives y₁ = y₂.

The following example shows that the strict convexity cannot be dropped in Theorem 4.5.2.

Example 4.5.1. Let X = R² with the norm ‖x‖₁ = |x₁| + |x₂| for all x = (x₁, x₂) ∈ R². As we have seen, X is not strictly convex. Let
C = {(x₁, x₂) ∈ R² : ‖(x₁, x₂)‖₁ ≤ 1} = {(x₁, x₂) ∈ R² : |x₁| + |x₂| ≤ 1}.
Then C is a closed convex set. The distance from z = (−1, −1) to the set C is one, and this distance is attained at more than one point of C; for instance, at (−1, 0), (0, −1) and (−1/2, −1/2).

The following example shows that the uniqueness of best approximations in Theorem 4.5.2 need not hold for nonconvex sets.

Example 4.5.2. Let X = R² with the Euclidean norm ‖x‖₂ = (x₁² + x₂²)^{1/2} for all x = (x₁, x₂) ∈ R². Let
C = S_X = {(x₁, x₂) ∈ R² : x₁² + x₂² = 1}.
Then X is strictly convex and C is not convex. However, all points of C are best approximations to (0, 0) ∈ X.

Theorem 4.5.3. Let X be a Banach space. If every element of X possesses at most one best approximation with respect to every convex subset of X, then X is strictly convex.

Proof. Assume, on the contrary, that X is not strictly convex. Then there exist x, y ∈ X with x ≠ y such that ‖x‖ = ‖y‖ = ‖(x + y)/2‖ = 1. Furthermore, ‖tx + (1 − t)y‖ = 1 for all t ∈ [0, 1]. Set C := co({x, y}), the convex hull of the set {x, y}. Then ‖0 − z‖ = 1 = d(0, C) for all z ∈ C. It follows that every element of C is a best approximation to zero, which contradicts the uniqueness assumption.

From Theorems 4.5.1 and 4.5.2, we obtain the following result.

Theorem 4.5.4. Let C be a nonempty weakly compact convex subset of a strictly convex Banach space X. Then each x ∈ X has a unique best approximation in C, that is, P_C is a single-valued metric projection mapping from X onto C.

Corollary 4.5.2. Let C be a nonempty closed convex subset of a strictly convex reflexive Banach space X and let x ∈ X. Then there exists a unique element x₀ ∈ C such that ‖x − x₀‖ = d(x, C).

5 Appendix: Basic Results from Analysis - I

Definition 5.0.1. A function f : R^n → R ∪ {±∞} is said to be
(a) positively homogeneous if f(rx) = rf(x) for all x ∈ R^n and all r ≥ 0;
(b) subadditive if f(x + y) ≤ f(x) + f(y) for all x, y ∈ R^n;
(c) sublinear if it is positively homogeneous and subadditive;
(d) subodd if f(x) ≥ −f(−x) for all x ∈ R^n \ {0}.

Every real-valued odd function is subodd. It can be seen that the function f : R → R defined by f(x) = x² is subodd, but it is neither odd nor subadditive.

Remark 5.0.1. (a) It can be easily seen that f is subodd if and only if f(x) + f(−x) ≥ 0 for all x ∈ R^n \ {0}.
(b) If f is sublinear, is not identically −∞, and satisfies f(0) ≥ 0, then f is subodd.
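As a quick worked illustration of these notions (the function is chosen only for illustration), let f(x) = ‖x‖ be the Euclidean norm on R^n. Then f is positively homogeneous, since ‖rx‖ = r‖x‖ for r ≥ 0, and subadditive by the triangle inequality, hence sublinear; since f is finite-valued and f(0) = 0, Remark 5.0.1(b) shows that f is subodd, which is also immediate from ‖x‖ ≥ −‖−x‖.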
Definition 5.0.2. Let f : R^n → R ∪ {±∞} be an extended real-valued function.
(a) The effective domain of f is defined as dom(f) := {x ∈ R^n : f(x) < +∞}.
(b) The function f is called proper if f(x) < +∞ for at least one x ∈ R^n and f(x) > −∞ for all x ∈ R^n.
(c) The graph of f is defined as graph(f) := {(x, y) ∈ R^n × R : y = f(x)}.
(d) The epigraph of f is defined as epi(f) := {(x, α) ∈ R^n × R : f(x) ≤ α}.
(e) The hypograph of f is defined as hyp(f) := {(x, α) ∈ R^n × R : f(x) ≥ α}.
(f) The lower level set of f at level α ∈ R is defined as L(f, α) := {x ∈ R^n : f(x) ≤ α}.
(g) The upper level set of f at level α ∈ R is defined as U(f, α) := {x ∈ R^n : f(x) ≥ α}.

The epigraph (hypograph) is thus a subset of R^{n+1} consisting of all the points of R^{n+1} lying on or above (on or below) the graph of f. From the above definitions, we have (x, α) ∈ epi(f) if and only if x ∈ L(f, α), and (x, α) ∈ hyp(f) if and only if x ∈ U(f, α).

Definition 5.0.3. A function f : R^n → R is said to be
(a) bounded above if there exists a real number M such that f(x) ≤ M for all x ∈ R^n;
(b) bounded below if there exists a real number m such that f(x) ≥ m for all x ∈ R^n;
(c) bounded if it is bounded above as well as bounded below.

For f : R^n → R ∪ {±∞}, we write
inf f := inf{f(x) : x ∈ R^n},
argmin f := argmin{f(x) : x ∈ R^n} := {x ∈ R^n : f(x) = inf f}.

Definition 5.0.4. A function f : R^n → R ∪ {±∞} is said to be lower semicontinuous at a point x ∈ R^n if f(x) ≤ lim inf_{m→∞} f(x_m) whenever x_m → x as m → ∞. f is said to be lower semicontinuous on R^n if it is lower semicontinuous at each point of R^n.
A function f : R^n → R ∪ {±∞} is said to be upper semicontinuous at a point x ∈ R^n if f(x) ≥ lim sup_{m→∞} f(x_m) whenever x_m → x as m → ∞. f is said to be upper semicontinuous on R^n if it is upper semicontinuous at each point of R^n.

Remark 5.0.2. A function f : R^n → R is lower (respectively, upper) semicontinuous on R^n if and only if the lower level set L(f, α) (respectively, the upper level set U(f, α)) is closed in R^n for all α ∈ R. Also, f is lower (respectively, upper) semicontinuous on R^n if and only if epi(f) (respectively, hyp(f)) is closed. Equivalently, f is lower (respectively, upper) semicontinuous on R^n if and only if the set {x ∈ R^n : f(x) > α} (respectively, the set {x ∈ R^n : f(x) < α}) is open in R^n for all α ∈ R.

Definition 5.0.5. A function f : R^n → R is said to be differentiable at x ∈ R^n if there exist a vector ∇f(x), called the gradient of f at x, and a function α : R^n → R such that
f(y) = f(x) + ⟨∇f(x), y − x⟩ + ‖y − x‖ α(y − x),  for all y ∈ R^n,
where lim_{y→x} α(y − x) = 0. If f is differentiable, then
f(x + λv) = f(x) + λ⟨∇f(x), v⟩ + o(λ),  for all x + λv ∈ R^n,
where lim_{λ→0} o(λ)/λ = 0. The gradient of f at x = (x_1, x_2, ..., x_n) is the vector in R^n given by
∇f(x) = (∂f(x)/∂x_1, ∂f(x)/∂x_2, ..., ∂f(x)/∂x_n).

Definition 5.0.6. An n × n symmetric matrix M of real numbers is said to be positive semidefinite if ⟨y, My⟩ ≥ 0 for all y ∈ R^n. It is called positive definite if ⟨y, My⟩ > 0 for all y ≠ 0.

Definition 5.0.7. Let f = (f_1, ..., f_ℓ) : R^n → R^ℓ be a vector-valued function such that the partial derivative ∂f_i(x)/∂x_j exists for i = 1, 2, ..., ℓ and j = 1, 2, ..., n. Then the Jacobian matrix J(f)(x) is the ℓ × n matrix whose (i, j)-th entry is ∂f_i(x)/∂x_j, that is,
J(f)(x) = [∂f_i(x)/∂x_j],  i = 1, 2, ..., ℓ,  j = 1, 2, ..., n,
where x = (x_1, x_2, ..., x_n) ∈ R^n.
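As a small worked example (the functions are chosen only for illustration), let f(x_1, x_2) = x_1² + 3x_1x_2 on R². Then ∇f(x) = (2x_1 + 3x_2, 3x_1). Similarly, for the vector-valued map g = (g_1, g_2) : R² → R² with g_1(x) = x_1x_2 and g_2(x) = x_1 + x_2², the Jacobian matrix of Definition 5.0.7 is the 2 × 2 matrix whose first row is (x_2, x_1), the partial derivatives of g_1, and whose second row is (1, 2x_2), the partial derivatives of g_2.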
Definition 5.0.8. A function f : R^n → R is said to be twice differentiable at x ∈ R^n if there exist a vector ∇f(x), an n × n symmetric matrix ∇²f(x), called the Hessian matrix, and a function α : R^n → R such that
f(y) = f(x) + ⟨∇f(x), y − x⟩ + (1/2)⟨y − x, ∇²f(x)(y − x)⟩ + ‖y − x‖² α(y − x),  for all y ∈ R^n,
where lim_{y→x} α(y − x) = 0. If f is twice differentiable, then
f(x + λv) = f(x) + λ⟨∇f(x), v⟩ + (λ²/2)⟨v, ∇²f(x)v⟩ + o(λ²),  for all x + λv ∈ R^n,
where lim_{λ→0} o(λ²)/λ² = 0. The Hessian matrix of f at x = (x_1, x_2, ..., x_n) is the n × n matrix whose (i, j)-th entry is ∂²f(x)/∂x_i∂x_j, that is,
∇²f(x) ≡ H(x) = [∂²f(x)/∂x_i∂x_j],  i, j = 1, 2, ..., n.

Definition 5.0.9. Let K be a nonempty convex subset of R^n. A function f : K → R is said to be
(a) convex if for all x, y ∈ K and all λ ∈ [0, 1], f(λx + (1 − λ)y) ≤ λf(x) + (1 − λ)f(y);
(b) strictly convex if for all x, y ∈ K with x ≠ y and all λ ∈ ]0, 1[, f(λx + (1 − λ)y) < λf(x) + (1 − λ)f(y).
A function f : K → R is said to be (strictly) concave if −f is (strictly) convex.

Geometrically speaking, a function f : K → R defined on a convex subset K of R^n is convex if the line segment joining any two points on the graph of the function lies on or above the portion of the graph between these points. Similarly, f is concave if the line segment joining any two points on the graph lies on or below the portion of the graph between these points. A function for which the line segment joining any two points on the graph lies strictly above the portion of the graph between these points (except at the endpoints) is called strictly convex.

Some examples of convex functions defined on R are f(x) = e^x, f(x) = x, f(x) = |x| and f(x) = max{0, x}. The functions f(x) = −log x and f(x) = x^α for α < 0 or α > 1 are strictly convex on the interval ]0, ∞[. Clearly, every strictly convex function is convex, but the converse may not be true. For example, the function f(x) = x defined on R is convex but not strictly convex. The function f(x) = |x + x³| is a nondifferentiable strictly convex function on R.

Proposition 5.0.1. A function f : K → R defined on a nonempty convex subset K of R^n is convex if and only if its epigraph is a convex set.
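As a worked illustration connecting Definitions 5.0.6, 5.0.8 and 5.0.9 (a standard computation, included here only as a sketch), let M be an n × n symmetric matrix and f(x) = (1/2)⟨x, Mx⟩ on R^n. Then ∇f(x) = Mx and ∇²f(x) = M for every x. For x, y ∈ R^n and λ ∈ [0, 1], a direct expansion gives
λf(x) + (1 − λ)f(y) − f(λx + (1 − λ)y) = (1/2) λ(1 − λ) ⟨x − y, M(x − y)⟩.
Hence f is convex whenever M is positive semidefinite, and strictly convex whenever M is positive definite, since the right-hand side is then strictly positive for x ≠ y and λ ∈ ]0, 1[.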
6 Appendix: Basic Results from Analysis - II

Theorem 6.0.1 (Finite Intersection Property). A topological space X is compact if and only if every collection {C_α}_{α∈Λ} of closed subsets of X having the finite intersection property (that is, C_{α_1} ∩ ... ∩ C_{α_n} ≠ ∅ for every finite subcollection) satisfies ∩_{α∈Λ} C_α ≠ ∅.

Theorem 6.0.2 (Hahn-Banach Theorem). Let C be a subspace of a linear space X, p be a sublinear functional on X and f be a linear functional defined on C such that f(x) ≤ p(x) for all x ∈ C. Then there exists a linear extension F of f to X such that F(x) ≤ p(x) for all x ∈ X.

The following corollary gives the existence of nontrivial bounded linear functionals on an arbitrary normed space.

Corollary 6.0.1. Let x be a nonzero element of a normed space X. Then there exists j ∈ X* such that j(x) = ‖x‖ and ‖j‖* = 1.

Definition 6.0.1. Let X be a normed space and X* be its dual space. The duality pairing between X and X* is the functional ⟨·, ·⟩ : X × X* → R defined by ⟨x, j⟩ = j(x) for all x ∈ X and j ∈ X*.

Theorem 6.0.3 (James' Theorem). A Banach space X is reflexive if and only if for each j ∈ S_X* there exists x ∈ S_X such that j(x) = 1.

Let X be a Banach space with its dual X*. We say that a sequence {xₙ} in X converges to x if lim_{n→∞} ‖xₙ − x‖ = 0. This kind of convergence is also called norm convergence or strong convergence. It is the convergence associated with the strong (norm) topology on X, which has the balls B_r(0) = {x ∈ X : ‖x‖ < r}, r > 0, as a neighborhood base at the origin.

There is also a weak topology on X, generated by the bounded linear functionals on X. A set G ⊆ X is said to be open in the weak topology if for every x ∈ G there are bounded linear functionals f_1, f_2, ..., f_n and positive real numbers ε_1, ε_2, ..., ε_n such that
{y ∈ X : |f_i(x) − f_i(y)| < ε_i, i = 1, 2, ..., n} ⊆ G.
Hence a base of neighborhoods of a point x̄ ∈ X for the weak topology σ(X, X*) is given by the sets
V(f_1, f_2, ..., f_n; ε) = {x ∈ X : |⟨x − x̄, f_i⟩| < ε for all i = 1, 2, ..., n},
where f_1, ..., f_n ∈ X* and ε > 0. In particular, a sequence {xₙ} in X converges to x with respect to the weak topology σ(X, X*) if and only if ⟨xₙ, f⟩ → ⟨x, f⟩ for all f ∈ X*.

Definition 6.0.2. A sequence {xₙ} in a normed space X is said to converge weakly to x ∈ X if f(xₙ) → f(x) for all f ∈ X*. In this case, we write xₙ ⇀ x or weak-lim_{n→∞} xₙ = x.

Definition 6.0.3. A subset C of a normed space X is said to be weakly closed if it is closed in the weak topology.

Definition 6.0.4. A subset C of a normed space X is said to be weakly compact if it is compact in the weak topology.

Remark 6.0.1. In finite dimensional spaces, weak convergence and strong convergence are equivalent.

Theorem 6.0.4 (Banach-Alaoglu Theorem). Let X be a normed space and X* be its dual. Then the closed unit ball of X* is weak*-compact.

Proposition 6.0.1. Let C be a nonempty convex subset of a normed space X. Then C is weakly closed if and only if it is closed.

Proposition 6.0.2. Every weakly compact subset of a Banach space is bounded.

Proposition 6.0.3. Every closed convex subset of a weakly compact set is weakly compact.

Theorem 6.0.5 (Kakutani's Theorem). Let X be a Banach space. Then X is reflexive if and only if the closed unit ball B_X := {x ∈ X : ‖x‖ ≤ 1} is weakly compact.

Theorem 6.0.6. Let X be a Banach space. Then X is reflexive if and only if every closed convex bounded subset of X is weakly compact.

Theorem 6.0.7. Let C be a subset of a reflexive Banach space X. Then C is weakly compact if and only if C is bounded and weakly closed.

Theorem 6.0.8. Let X be a Banach space. Then X is reflexive if and only if every bounded sequence in X has a weakly convergent subsequence.

Theorem 6.0.9. Let X be a compact topological space and f : X → (−∞, ∞] be a lower semicontinuous functional. Then there exists an element x̄ ∈ X such that f(x̄) = inf_{x∈X} f(x).

Proof. For each α ∈ R, let G_α := {x ∈ X : f(x) > α}. Since f is lower semicontinuous, G_α is open, and X = ∪_{α∈R} G_α. By the compactness of X, there exists a finite family {G_{α_i}}, i = 1, ..., n, of {G_α}_{α∈R} such that
X = G_{α_1} ∪ ... ∪ G_{α_n}.
Set α_0 = min{α_1, α_2, ..., α_n}. Then f(x) > α_0 for all x ∈ X, so f is bounded below and m := inf{f(x) : x ∈ X} exists and is finite. For each β > m, set F_β := {x ∈ X : f(x) ≤ β}. Then F_β is a nonempty closed subset of X, and the family {F_β}_{β>m} has the finite intersection property. Hence, by the finite intersection property (Theorem 6.0.1), we have
∩_{β>m} F_β ≠ ∅.
For any point x̄ of this intersection, we have f(x̄) ≤ β for every β > m, so f(x̄) ≤ m, and therefore f(x̄) = m.

Theorem 6.0.10. Let C be a weakly compact convex subset of a Banach space X and f : C → (−∞, ∞] be a proper lower semicontinuous convex functional. Then there exists x̄ ∈ C such that f(x̄) = inf{f(x) : x ∈ C}.
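To see that the lower semicontinuity in Theorem 6.0.9 cannot be dropped, consider the following one-dimensional sketch. Let X = [0, 1], which is compact, and define f(0) = 1 and f(x) = x for x ∈ ]0, 1]. Then inf_{x∈X} f(x) = 0, but f(x) > 0 for every x ∈ X, so the infimum is not attained; indeed, f fails to be lower semicontinuous at 0, since lim inf_{x→0} f(x) = 0 < 1 = f(0).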
Remark 6.0.2. If f is a strictly convex function in Theorem 6.0.10, then x̄ ∈ C is the unique point such that f(x̄) = inf_{x∈C} f(x).

Recall that every closed convex bounded subset of a reflexive Banach space is weakly compact (Theorem 6.0.6). Using this fact, we have the following result.

Theorem 6.0.11. Let C be a nonempty closed convex bounded subset of a reflexive Banach space X and f : X → (−∞, ∞] be a proper lower semicontinuous convex functional. Then there exists x̄ ∈ C such that f(x̄) = inf_{x∈C} f(x).

In Theorem 6.0.11, the boundedness of C may be replaced by the following weaker assumption (called a coercivity condition):
lim_{x∈C, ‖x‖→∞} f(x) = ∞.

Theorem 6.0.12. Let C be a nonempty closed convex subset of a reflexive Banach space X and f : C → (−∞, ∞] be a proper lower semicontinuous convex functional such that f(xₙ) → ∞ whenever {xₙ} ⊆ C and ‖xₙ‖ → ∞. Then there exists x̄ ∈ C such that f(x̄) = inf_{x∈C} f(x).

Proof. Let m = inf{f(x) : x ∈ C}. Choose a minimizing sequence {xₙ} in C, that is, f(xₙ) → m. If {xₙ} were not bounded, there would exist a subsequence {x_{n_i}} of {xₙ} with ‖x_{n_i}‖ → ∞, and hence, by the coercivity assumption, f(x_{n_i}) → ∞, which contradicts f(x_{n_i}) → m < ∞ (note that m < ∞ since f is proper). Hence {xₙ} is bounded. Since X is reflexive, by Theorem 6.0.8 there exists a subsequence {x_{n_j}} of {xₙ} such that x_{n_j} ⇀ x̄; since C is closed and convex, it is weakly closed (Proposition 6.0.1), so x̄ ∈ C. Since f is convex and lower semicontinuous, it is lower semicontinuous with respect to the weak topology, and therefore
m ≤ f(x̄) ≤ lim inf_{j→∞} f(x_{n_j}) = lim_{n→∞} f(xₙ) = m.
Therefore, f(x̄) = m.
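As a closing illustration (a sketch tying Theorem 6.0.12 back to Section 4.5), fix x ∈ X and a nonempty closed convex subset C of a reflexive Banach space X, and set f(y) = ‖x − y‖ for y ∈ C. Then f is proper, convex and continuous, and f(y) ≥ ‖y‖ − ‖x‖ → ∞ as ‖y‖ → ∞ with y ∈ C, so the coercivity condition holds. Theorem 6.0.12 therefore provides x₀ ∈ C with ‖x − x₀‖ = d(x, C), which recovers Corollary 4.5.1; combined with Theorem 4.5.2, it also recovers Corollary 4.5.2 when X is in addition strictly convex.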