Qamrul Hasan Ansari
Advanced Functional Analysis
Advanced Functional Analysis
Qamrul Hasan Ansari
Department of Mathematics
Aligarh Muslim University, Aligarh
E-mail: qhansari@gmail.com
SYLLABUS
M.A. / M.Sc. II SEMESTER
ADVANCED FUNCTIONAL ANALYSIS
Course Title: Advanced Functional Analysis
Course Number: MMM-2009
Credits: 4
Course Category: Compulsory
Prerequisite Courses: Functional Analysis, Linear Algebra, Real Analysis
Contact Course: 4 Lecture + 1 Tutorial
Type of Course: Theory
Course Assessment: Sessional (1 hour) 30%
End Semester Examination: (2:30 hrs) 70%

Course Objectives: To discuss some advanced topics from Functional Analysis, namely orthogonality, orthonormal bases, orthogonal projections, bilinear forms, spectral theory of continuous linear operators, differential calculus on normed spaces, and the geometry of Banach spaces. These topics play a central role in research and in the advancement of various topics in mathematics.

Course Outcomes: After undertaking this course, students will understand:
◮ spectral theory of continuous linear operators
◮ orthogonality, orthogonal complements, orthonormal bases
◮ orthogonal projection, bilinear forms and the Lax-Milgram lemma
◮ differential calculus on normed spaces
◮ geometry of Banach spaces
Syllabus
UNIT I: Orthogonality, Orthonormal Bases, Orthogonal Projection and Bilinear Forms (14 Lectures)
Orthogonality, Orthogonal complements, Orthonormal bases, Orthogonal projections, Projection theorem, Projection on convex sets, Sesquilinear forms, Bilinear forms and their basic properties, Lax-Milgram lemma

UNIT II: Spectral Theory of Continuous Linear Operators (13 Lectures)
Eigenvalues and eigenvectors, Resolvent operators, Spectrum, Spectral properties of bounded linear operators, Compact linear operators on normed spaces, Finite dimensional domain and range, Sequence of compact linear operators, Weak convergence, Spectral theory of compact linear operators

UNIT III: Differential Calculus on Normed Spaces (14 Lectures)
Gâteaux derivative, Gradient of a function, Fréchet derivative, Chain rule, Mean value theorem, Properties of Gâteaux and Fréchet derivatives, Taylor's formula, Subdifferential and its properties

UNIT IV: Geometry of Banach Spaces (15 Lectures)
Strict convexity, Modulus of convexity, Uniform convexity, Duality mapping and its properties, Smooth Banach spaces, Modulus of smoothness

Total: 56 Lectures
Recommended Books:
1. Q. H. Ansari: Topics in Nonlinear Analysis and Optimization, World Education, Delhi,
2012.
2. Q. H. Ansari, C. S. Lalitha and M. Mehta: Generalized Convexity, Nonsmooth Variational Inequalities, and Nonsmooth Optimization, CRC Press, Taylor and Francis Group, Boca Raton, London, New York, 2014.
3. C. Chidume: Geometric Properties of Banach Spaces and Nonlinear Iterations, Springer,
London, 2009.
4. M. C. Joshi and R. K. Bose: Some Topics in Nonlinear Functional Analysis, Wiley Eastern
Limited, New Delhi, 1985.
5. E. Kreyszig: Introductory Functional Analysis with Applications, John Wiley and Sons, New York, 1989.
6. M. T. Nair: Functional Analysis: A First Course, Prentice-Hall of India Private Limited,
New Delhi, 2002.
7. A. H. Siddiqi: Applied Functional Analysis, CRC Press, London, 2003.
1 Orthogonality, Orthonormal Bases, Orthogonal Projection and Bilinear Forms
Throughout these notes, 0 denotes the zero vector of the corresponding vector space, and ⟨·, ·⟩ denotes the inner product on an inner product space.
1.1 Orthogonality and Orthonormal Bases

1.1.1 Orthogonality
One of the major differences between an inner product space and a normed space is that in an inner product space we can talk about the angle between two vectors.

Definition 1.1.1. The angle θ between two nonzero vectors x and y of an inner product space X is defined by the relation
    cos θ = ⟨x, y⟩ / (‖x‖ ‖y‖).        (1.1)
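For a quick numerical illustration of (1.1), here is a hypothetical sketch (not part of the original notes), using NumPy:

```python
import numpy as np

x = np.array([1.0, 2.0, 2.0])
y = np.array([2.0, 0.0, 1.0])

# cos(theta) = <x, y> / (||x|| ||y||), as in (1.1)
cos_theta = np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y))
theta = np.arccos(cos_theta)
print(np.degrees(theta))  # angle between x and y in degrees
```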
Definition 1.1.2. Let X be an inner product space whose inner product is denoted by ⟨·, ·⟩.

(a) Two vectors x and y in X are said to be orthogonal if ⟨x, y⟩ = 0. When two vectors x and y are orthogonal, we write x ⊥ y.

(b) A vector x ∈ X is said to be orthogonal to a nonempty subset A of X, denoted by x ⊥ A, if ⟨x, y⟩ = 0 for all y ∈ A.

(c) Let A be a nonempty subset of X. The set of all vectors orthogonal to A, denoted by A⊥, is called the orthogonal complement of A, that is,
    A⊥ = {x ∈ X : ⟨x, y⟩ = 0 for all y ∈ A}.
A⊥⊥ = (A⊥)⊥ denotes the orthogonal complement of A⊥, that is,
    A⊥⊥ = (A⊥)⊥ = {x ∈ X : ⟨x, y⟩ = 0 for all y ∈ A⊥}.

(d) Two subsets A and B of X are said to be orthogonal, denoted by A ⊥ B, if ⟨x, y⟩ = 0 for all x ∈ A and all y ∈ B.

Clearly, x and y are orthogonal if and only if the angle θ between them is 90°, that is, cos θ = 0, which in view of (1.1) is equivalent to ⟨x, y⟩ = 0, that is, x ⊥ y.
Remark 1.1.1. (a) Since ⟨x, y⟩ is the complex conjugate of ⟨y, x⟩, ⟨x, y⟩ = 0 implies that ⟨y, x⟩ = 0, and vice versa. Hence, x ⊥ y if and only if y ⊥ x, that is, orthogonality is a symmetric relation on X.

(b) Since ⟨x, 0⟩ = 0 for all x, we have x ⊥ 0 for every x belonging to an inner product space. By the definiteness of the inner product, 0 is the only vector orthogonal to itself.

(c) Clearly, {0}⊥ = X and X⊥ = {0}.

(d) If A ⊥ B, then A ∩ B ⊆ {0}.

(e) Nonzero mutually orthogonal vectors x1, x2, x3, . . . , xn of an inner product space are linearly independent (Prove it!).
Example 1.1.1. Let A = {(x, 0, 0) ∈ R3 : x ∈ R} be a line in R3 and B = {(0, y, z) ∈ R3 :
y, z ∈ R} be a plane in R3 . Then A⊥ = B and B ⊥ = A.
Example 1.1.2. Let X = R3 and A be its subspace spanned by a non-zero vector x. The
orthogonal complement of A is the plane through the origin and perpendicular to the vector
x.
Example 1.1.3. Let A be the subspace of R³ generated by the set {(1, 0, 1), (0, 2, 3)}. An element of A can be expressed as
    x = (x1, x2, x3) = λ(1, 0, 1) + µ(0, 2, 3) = λ i + 2µ j + (λ + 3µ) k,
so that x1 = λ, x2 = 2µ, x3 = λ + 3µ. Thus, an element of A is of the form (x1, x2, x1 + (3/2)x2). The orthogonal complement of A can be constructed as follows: Let x = (x1, x2, x3) ∈ A⊥. Then for y = (y1, y2, y3) ∈ A, we have
    ⟨x, y⟩ = x1y1 + x2y2 + x3y3 = x1y1 + x2y2 + x3(y1 + (3/2)y2)
           = (x1 + x3) y1 + (x2 + (3/2)x3) y2 = 0.
Since y1 and y2 are arbitrary, we have
    x1 + x3 = 0   and   x2 + (3/2)x3 = 0.
Therefore,
    A⊥ = {x = (x1, x2, x3) : x1 = −x3, x2 = −(3/2)x3} = {x ∈ R³ : x = (−x3, −(3/2)x3, x3)}.
Exercise 1.1.1. Let A be a subspace of R3 generated by the set {(1, 1, 0), (0, 1, 1)}. Find
A⊥ .
Answer. A⊥ is the straight line spanned by the vector (1, −1, 1).
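Numerically, the orthogonal complement of a finitely generated subspace of Rⁿ can be obtained as a null space. A hypothetical sketch for Exercise 1.1.1 (assuming NumPy and SciPy are available):

```python
import numpy as np
from scipy.linalg import null_space

# rows generate the subspace A of R^3 from Exercise 1.1.1
A = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0]])

# A-perp is the null space of A: vectors x with <row, x> = 0 for every row
perp = null_space(A)
print(perp.ravel())  # proportional to (1, -1, 1)
```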
Theorem 1.1.1. Let X be an inner product space and A be a subset of X. Then A⊥ is a closed subspace of X.

Proof. Let x, y ∈ A⊥. Then ⟨x, z⟩ = 0 and ⟨y, z⟩ = 0 for all z ∈ A. Since for arbitrary scalars α, β we have ⟨αx + βy, z⟩ = α⟨x, z⟩ + β⟨y, z⟩ = 0, it follows that αx + βy ∈ A⊥. So A⊥ is a subspace of X.

To show that A⊥ is closed, let {xn} be a sequence in A⊥ such that xn → y. We need to show that y belongs to A⊥. Since xn ∈ A⊥, we have ⟨xn, x⟩ = 0 for all x ∈ A and all n. Since ⟨·, ·⟩ is a continuous function, we have
    0 = lim_{n→∞} ⟨xn, x⟩ = ⟨lim_{n→∞} xn, x⟩ = ⟨y, x⟩,   for all x ∈ A.
Hence, y ∈ A⊥.
Exercise 1.1.2. Let X be an inner product space and A and B be subsets of X. Prove the following assertions:

(a) A ∩ A⊥ ⊆ {0}, with equality if A is a subspace.

(b) A ⊆ A⊥⊥.

(c) If B ⊆ A, then B⊥ ⊇ A⊥.

Proof. (a) If y ∈ A ∩ A⊥, then y ∈ A and y ∈ A⊥, so ⟨y, y⟩ = 0 and hence y = 0; thus A ∩ A⊥ ⊆ {0}. If A is a subspace, then 0 ∈ A and 0 ∈ A ∩ A⊥. Hence, A ∩ A⊥ = {0}.

(b) Let y ∈ A, but y ∉ A⊥⊥. Then there exists an element z ∈ A⊥ such that ⟨y, z⟩ ≠ 0. But since z ∈ A⊥ and y ∈ A, we have ⟨y, z⟩ = 0, which is a contradiction. Hence, y ∈ A⊥⊥.

(c) Let y ∈ A⊥. Then ⟨y, z⟩ = 0 for all z ∈ A. Since every z ∈ B is an element of A, we have ⟨y, z⟩ = 0 for all z ∈ B. Hence, y ∈ B⊥, and so B⊥ ⊇ A⊥.
Exercise 1.1.3. Let X be an inner product space and A and B be subsets of X. Prove the
following assertions:
(a) If A ⊆ B, then A⊥⊥ ⊆ B ⊥⊥ .
(b) A⊥ = A⊥⊥⊥ .
(c) If A is dense in X, that is, Ā = X, then A⊥ = {0}.

(d) If A is an orthogonal set and 0 ∉ A, then prove that A is linearly independent.
Exercise 1.1.4. Let A be a nonempty subset of a Hilbert space X. Show that

(a) A⊥⊥ is the closure of span A;

(b) span A is dense in X whenever A⊥ = {0}.

Hint: See [5], pp. 149.
Exercise 1.1.5. Let A be a subspace of a Hilbert space X. Show that A is closed if and only if A = A⊥⊥.
The well-known Pythagorean theorem of plane geometry says that the sum of the squares
of the base and the perpendicular in a right-angled triangle is equal to the square of the
hypotenuse. Its infinite-dimensional analogue is as follows.
Theorem 1.1.2. Let X be an inner product space and x, y ∈ X. Then for x ⊥ y, we have ‖x + y‖² = ‖x‖² + ‖y‖².

Proof. Note that ‖x + y‖² = ⟨x + y, x + y⟩ = ⟨x, x⟩ + ⟨y, x⟩ + ⟨x, y⟩ + ⟨y, y⟩. Since x ⊥ y, we have ⟨x, y⟩ = 0 and ⟨y, x⟩ = 0, and therefore ‖x + y‖² = ‖x‖² + ‖y‖².
Exercise 1.1.6. Let K and D be subsets of an inner product space X. Show that
    (K + D)⊥ = K⊥ ∩ D⊥,
where K + D = {x + y : x ∈ K, y ∈ D}.

Exercise 1.1.7. For each i = 1, 2, . . . , n, let Ki be a subspace of a Hilbert space X. If ⟨x, y⟩ = 0 whenever x ∈ Ki and y ∈ Kj with i ≠ j, then show that the subspace K1 + K2 + · · · + Kn is closed. Is this property true for an incomplete inner product space?

Exercise 1.1.8. Let X be an inner product space and, for a nonzero vector y ∈ X, let Ky := {x ∈ X : ⟨x, y⟩ = 0}. Determine the subspace Ky⊥.
1.1.2 Orthonormal Sets and Orthonormal Bases
Definition 1.1.3. Let X be an inner product space.

(a) A subset A of nonzero vectors in X is said to be orthogonal if any two distinct elements in A are orthogonal.

(b) A set of vectors A in X is said to be orthonormal if it is orthogonal and ‖x‖ = 1 for all x ∈ A, that is, for all x, y ∈ A,
    ⟨x, y⟩ = 0 if x ≠ y,   and   ⟨x, y⟩ = 1 if x = y.        (1.2)

If an orthogonal / orthonormal set in X is countable, then it can be arranged as a sequence {xn}, and in this case we call it an orthogonal sequence / orthonormal sequence, respectively.

More generally, let Λ be any index set.

(a) A family of vectors {xα}α∈Λ in an inner product space X is said to be orthogonal if xα ⊥ xβ for all α, β ∈ Λ, α ≠ β.

(b) A family of vectors {xα}α∈Λ in an inner product space X is said to be orthonormal if it is orthogonal and ‖xα‖ = 1 for all α, that is, for all α, β ∈ Λ, we have
    ⟨xα, xβ⟩ = δαβ = 0 if α ≠ β,   and   1 if α = β.        (1.3)
Example 1.1.4. The standard / canonical basis for Rⁿ (with the usual inner product),
    e1 = (1, 0, 0, . . . , 0),
    e2 = (0, 1, 0, . . . , 0),
    . . .
    en = (0, 0, 0, . . . , 1),
forms an orthonormal set, as
    ⟨ei, ej⟩ = δij = 0 if i ≠ j,   and   1 if i = j.        (1.4)
Recall that, for p ≥ 1,

    ℓᵖ = { x = {xn} ⊆ K : Σ_{n=1}^∞ |xn|ᵖ < ∞ },

    c00 = ∪_{k=1}^∞ { {x1, x2, . . .} ⊆ K : xj = 0 for j ≥ k },

    ℓ∞ = { {xn} ⊆ K : sup_{n∈N} |xn| < ∞ },

    C[a, b] = the space of all continuous real-valued functions defined on the interval [a, b],

    P[a, b] = the space of polynomials defined on the interval [a, b].

Clearly, c00 ⊆ ℓ∞.

P[a, b] is not complete with respect to the norm ‖f‖∞ = sup_{x∈[a,b]} |f(x)|. However, P[a, b] is dense in C[a, b] with respect to ‖·‖∞.

ℓ² is a Hilbert space with the inner product defined by
    ⟨x, y⟩ = Σ_{n=1}^∞ xn ȳn,   for all x = {xn}, y = {yn} ∈ ℓ².
The norm on ℓ² is defined by
    ‖x‖ = ⟨x, x⟩^{1/2} = ( Σ_{n=1}^∞ |xn|² )^{1/2}.

The space ℓᵖ with p ≠ 2 is not an inner product space, and hence not a Hilbert space. However, ℓᵖ with p ≥ 1, p ≠ 2, is a Banach space.

For 0 < p < ∞,
    Lᵖ[a, b] = { f : [a, b] → K : f is measurable and ∫ₐᵇ |f|ᵖ dµ < ∞ }.
For 1 ≤ p < ∞, Lᵖ[a, b] is a complete normed space with respect to the norm
    ‖f‖p = ( ∫ₐᵇ |f|ᵖ dµ )^{1/p}.
Note that ‖f‖p does not define a norm on Lᵖ[a, b] for 0 < p < 1.
Example 1.1.5. Consider the space ℓ² and its subset E = {e1, e2, . . .}, where en = {δnj}_{j∈N}, that is, en = (0, 0, . . . , 0, 1, 0, . . .) (1 is at the nth place). Then E forms an orthonormal set in ℓ², and {en} is an orthonormal sequence.
Example 1.1.6. Consider the space c00 with the inner product
    ⟨x, y⟩ = Σ_{n=1}^∞ xn ȳn,   for all x = {x1, x2, . . .}, y = {y1, y2, . . .} ∈ c00.
The set E = {e1, e2, . . .}, where en = {δnj}_{j∈N}, that is, en = (0, 0, . . . , 0, 1, 0, . . .) (1 is at the nth place), forms an orthonormal set in c00, and {en} is an orthonormal sequence.
Example 1.1.7. Consider the space C[0, 2π] with the inner product
    ⟨f, g⟩ = ∫₀^{2π} f(t)g(t) dt,   for all f, g ∈ C[0, 2π].
Consider the sets E = {u0, u1, u2, . . .} and G = {v1, v2, . . .}, or the sequences {un} and {vn}, where
    un(t) = cos nt,   for n = 0, 1, 2, . . . ,   and   vn(t) = sin nt,   for n = 1, 2, . . . .
Then E is an orthogonal set and {un} is an orthogonal sequence. Also, G is an orthogonal set and {vn} is an orthogonal sequence. Indeed, by integrating, we obtain
    ⟨um, un⟩ = ∫₀^{2π} cos mt cos nt dt = 0 if m ≠ n;   π if m = n = 1, 2, . . . ;   2π if m = n = 0,
and
    ⟨vm, vn⟩ = ∫₀^{2π} sin mt sin nt dt = 0 if m ≠ n;   π if m = n = 1, 2, . . . .
Also, {e0, e1, e2, . . .} is an orthonormal set and {en} is an orthonormal sequence, where
    e0(t) = 1/√(2π),   en(t) = un(t)/‖un‖ = cos nt/√π,   for n = 1, 2, . . . .
Similarly, {ẽ1, ẽ2, . . .} is an orthonormal set and {ẽn} is an orthonormal sequence, where
    ẽn(t) = vn(t)/‖vn‖ = sin nt/√π,   for n = 1, 2, . . . .
Note that um ⊥ vn for all m and n (Prove it!).
Exercise 1.1.9 (Pythagorean Theorem). If {x1, x2, . . . , xn} is an orthogonal subset of an inner product space X, then prove that
    ‖ Σ_{i=1}^n xi ‖² = Σ_{i=1}^n ‖xi‖².
Proof. We have
    ‖ Σ_{i=1}^n xi ‖² = ⟨ Σ_{i=1}^n xi , Σ_{i=1}^n xi ⟩
                      = Σ_{i=1}^n ⟨xi, xi⟩ + Σ_{i≠j} ⟨xi, xj⟩
                      = Σ_{i=1}^n ⟨xi, xi⟩
                      = Σ_{i=1}^n ‖xi‖².
Lemma 1.1.1 (Linear independence). An orthonormal set of vectors is linearly independent.

Proof. Let {ui} be an orthonormal set. Consider the linear combination
    α1u1 + α2u2 + · · · + αnun = 0.
Taking the inner product with any fixed uj, we get
    0 = ⟨0, uj⟩ = ⟨α1u1 + α2u2 + · · · + αnun, uj⟩
      = α1⟨u1, uj⟩ + α2⟨u2, uj⟩ + · · · + αj⟨uj, uj⟩ + · · · + αn⟨un, uj⟩.
Since ⟨ui, uj⟩ = δij, we have
    0 = αj⟨uj, uj⟩,   and hence αj = 0 as uj ≠ 0.
This shows that {ui} is a set of linearly independent vectors.
Exercise 1.1.10. Determine an orthogonal set in L²[0, 2π].

Hint: (See [6], pp. 179) Consider u1(t) = 1/√(2π), and for n ∈ N,
    u2n(t) = sin nt/√π,   u2n+1(t) = cos nt/√π,
and then check that E = {u1, u2, . . .} is an orthogonal set in L²[0, 2π].
Exercise 1.1.11. Construct a set of 3 vectors in R3 and determine whether it is a basis
and, if it is, then whether it is orthogonal, orthonormal or neither.
Exercise 1.1.12. Show that every orthonormal set in a separable inner product space X is
countable.
Hint: (See [6], pp. 179)
Advantages of an Orthonormal Sequence. A great advantage of orthonormal sequences over arbitrary linearly independent sequences is the following: if we know that a given x can be represented as a linear combination of some elements of an orthonormal sequence, then the orthonormality makes the actual determination of the coefficients very easy.

Let {u1, u2, . . .} be an orthonormal sequence in an inner product space X and x ∈ span{u1, u2, . . . , un}, where n is fixed. Then x can be written as a linear combination of u1, u2, . . . , un, that is,
    x = Σ_{k=1}^n αk uk,   for scalars αk.        (1.5)
Taking the inner product with a fixed uj, we obtain
    ⟨x, uj⟩ = ⟨ Σ_{k=1}^n αk uk , uj ⟩ = Σ_{k=1, k≠j}^n αk ⟨uk, uj⟩ + αj ⟨uj, uj⟩ = αj ‖uj‖² = αj,
since ‖uj‖ = 1 because {u1, u2, . . .} is an orthonormal sequence. Therefore, the unknown coefficients αk in (1.5) can be easily calculated as αk = ⟨x, uk⟩.
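A small numerical illustration of this coefficient formula (a hypothetical sketch using NumPy; the vectors and coefficients are made up for the example):

```python
import numpy as np

# an orthonormal set in R^3: the columns of an orthogonal matrix
U = np.linalg.qr(np.random.default_rng(0).normal(size=(3, 3)))[0]
u1, u2, u3 = U.T

x = 2.0 * u1 - 1.5 * u2 + 0.5 * u3   # x lies in span{u1, u2, u3}

# alpha_k = <x, u_k> recovers the coefficients, as in (1.5)
print(np.dot(x, u1), np.dot(x, u2), np.dot(x, u3))  # approximately 2.0, -1.5, 0.5
```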
The following Gram-Schmidt process shows how to obtain an orthonormal sequence from a given arbitrary linearly independent sequence.

Gram-Schmidt Orthogonalization Process. Let {xn} be a linearly independent sequence in an inner product space X. Then we obtain an orthogonal sequence {vn} and an orthonormal sequence {un} with the following property for every n:
    span{u1, u2, . . . , un} = span{x1, x2, . . . , xn}.
1st Step: Take v1 = x1 and u1 = v1/‖v1‖.

2nd Step: Take
    v2 = x2 − ⟨x2, u1⟩u1 = x2 − (⟨x2, v1⟩/⟨v1, v1⟩) v1,   and   u2 = v2/‖v2‖.

3rd Step: Take
    v3 = x3 − ⟨x3, u1⟩u1 − ⟨x3, u2⟩u2 = x3 − Σ_{j=1}^{2} (⟨x3, vj⟩/⟨vj, vj⟩) vj,   and   u3 = v3/‖v3‖.

. . .

nth Step: Take
    vn = xn − Σ_{j=1}^{n−1} ⟨xn, uj⟩uj = xn − Σ_{j=1}^{n−1} (⟨xn, vj⟩/⟨vj, vj⟩) vj,   and   un = vn/‖vn‖.

Then {vn} is an orthogonal sequence of vectors in X and {un} is an orthonormal sequence in X. Also, for every n:
    span{u1, u2, . . . , un} = span{x1, x2, . . . , xn}.
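A compact sketch of the process above (a hypothetical helper, not from the notes; it implements classical Gram-Schmidt with NumPy, while production code would typically prefer a QR factorization for numerical stability):

```python
import numpy as np

def gram_schmidt(xs):
    """Return orthonormal vectors u_n spanning the same subspaces as xs.

    xs: list of linearly independent 1-D NumPy arrays.
    """
    us = []
    for x in xs:
        v = x.astype(float).copy()
        for u in us:
            v -= np.dot(x, u) * u          # subtract <x_n, u_j> u_j
        us.append(v / np.linalg.norm(v))   # normalize: u_n = v_n / ||v_n||
    return us

# example: the basis used later in Exercise 1.1.13
u1, u2, u3 = gram_schmidt([np.array([1., 2., 2.]),
                           np.array([-1., 0., 2.]),
                           np.array([0., 0., 1.])])
print(u1, u2, u3)  # approximately (1,2,2)/3, (-2,-1,2)/3, (2,-2,1)/3
```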
Theorem 1.1.3. Let {xn} be a linearly independent sequence in an inner product space X. Let v1 = x1, and
    vn = xn − Σ_{j=1}^{n−1} (⟨xn, vj⟩/⟨vj, vj⟩) vj,   for n = 2, 3, . . . .
Then {v1, v2, . . .} is an orthogonal set, {un} is an orthonormal sequence, where un = vn/‖vn‖, and
    span{x1, x2, . . . , xk} = span{u1, u2, . . . , uk},   for all k = 1, 2, . . . .
Proof. Since {xn} is a sequence of linearly independent vectors, xn ≠ 0 for all n. Define v1 = x1 and
    v2 = x2 − (⟨x2, v1⟩/⟨v1, v1⟩) v1.
Clearly, v2 ∈ span{x1, x2} and
    ⟨v2, v1⟩ = ⟨x2, v1⟩ − (⟨x2, v1⟩/⟨v1, v1⟩) ⟨v1, v1⟩ = 0,
that is, v2 and v1 are orthogonal. Since {x1, x2} is linearly independent, v2 ≠ 0. Then, by Exercise 1.1.3 (d), {v1, v2} is linearly independent, and hence it follows that
    span{v1, v2} = span{x1, x2}.
Continuing in this way, we obtain an orthogonal set {v1, v2, . . . , vn−1} such that
    span{x1, x2, . . . , xn−1} = span{v1, v2, . . . , vn−1}.
Let
    vn = xn − Σ_{j=1}^{n−1} (⟨xn, vj⟩/⟨vj, vj⟩) vj.
Then vn ∈ span{x1, x2, . . . , xn} and ⟨vn, vi⟩ = 0 for i < n. Again, since {x1, x2, . . . , xn} is linearly independent, vn ≠ 0. Thus, {v1, v2, . . . , vn} is the required orthogonal set and {u1, u2, . . . , un} is the required orthonormal set.
Exercise 1.1.13. Let Y be the plane in R³ spanned by the vectors x1 = (1, 2, 2) and x2 = (−1, 0, 2), that is, Y = span{x1, x2}. Find an orthonormal basis for Y and for R³.

Solution. {x1, x2} is a basis for the plane Y. We can extend it to a basis for R³ by adding one vector from the standard basis. For instance, the vectors x1, x2 and x3 = (0, 0, 1) form a basis for R³ because
    det [ 1 2 2 ; −1 0 2 ; 0 0 1 ] = det [ 1 2 ; −1 0 ] = 2 ≠ 0.
By using the Gram-Schmidt process, we orthogonalize the basis x1 = (1, 2, 2), x2 = (−1, 0, 2), x3 = (0, 0, 1):
    v1 = x1 = (1, 2, 2),
    v2 = x2 − (⟨x2, v1⟩/⟨v1, v1⟩) v1 = (−1, 0, 2) − (3/9)(1, 2, 2) = (−4/3, −2/3, 4/3),
    v3 = x3 − (⟨x3, v1⟩/⟨v1, v1⟩) v1 − (⟨x3, v2⟩/⟨v2, v2⟩) v2
       = (0, 0, 1) − (2/9)(1, 2, 2) − ((4/3)/4)(−4/3, −2/3, 4/3) = (2/9, −2/9, 1/9).
Now, v1 = (1, 2, 2), v2 = (−4/3, −2/3, 4/3), v3 = (2/9, −2/9, 1/9) is an orthogonal basis for R³, while v1, v2 is an orthogonal basis for Y. The orthonormal basis for Y is
    u1 = v1/‖v1‖ = (1/3)(1, 2, 2),   u2 = v2/‖v2‖ = (1/3)(−2, −1, 2).
The orthonormal basis for R³ is
    u1 = v1/‖v1‖ = (1/3)(1, 2, 2),   u2 = v2/‖v2‖ = (1/3)(−2, −1, 2),   u3 = v3/‖v3‖ = (1/3)(2, −2, 1).
Exercise 1.1.14. Let {un} be an orthonormal sequence in an inner product space X. Prove the following statements (use the Pythagorean theorem).

(a) If w = Σ_{n=1}^∞ αn un, then ‖w‖² = Σ_{n=1}^∞ |αn|², where the αn's are scalars.

(b) If x ∈ X and sN = Σ_{n=1}^N ⟨x, un⟩un, then ‖x‖² = ‖x − sN‖² + ‖sN‖².

(c) If x ∈ X, sN = Σ_{n=1}^N ⟨x, un⟩un, and XN = span{u1, u2, . . . , uN}, then ‖x − sN‖ = min_{y∈XN} ‖x − y‖ (this is called the best approximation property).
Theorem 1.1.4 (Bessel's inequality). Let {uk} be an orthonormal sequence in an inner product space X. Then for any x ∈ X, we have
    Σ_{k=1}^∞ |⟨x, uk⟩|² ≤ ‖x‖².

Proof. Let xn = Σ_{k=1}^n ⟨x, uk⟩uk be the nth partial sum. Then, by using the properties of the inner product and the fact that ⟨ui, uj⟩ = δij (that is, 0 if i ≠ j and 1 if i = j), we have
    0 ≤ ‖x − xn‖² = ⟨x − xn, x − xn⟩ = ‖x‖² − ⟨xn, x⟩ − ⟨x, xn⟩ + ‖xn‖²
      = ‖x‖² − ⟨ Σ_{k=1}^n ⟨x, uk⟩uk , x ⟩ − ⟨ x , Σ_{k=1}^n ⟨x, uk⟩uk ⟩ + ‖xn‖²
      = ‖x‖² − Σ_{k=1}^n |⟨x, uk⟩|² − Σ_{k=1}^n |⟨x, uk⟩|² + ‖xn‖²
      = ‖x‖² − ‖xn‖²,
since ⟨x, uk⟩⟨uk, x⟩ = |⟨x, uk⟩|² and ‖xn‖² = Σ_{k=1}^n |⟨x, uk⟩|².
Therefore, ‖xn‖² ≤ ‖x‖², and hence Σ_{k=1}^n |⟨x, uk⟩|² ≤ ‖x‖². Taking the limit as n → ∞, we get the conclusion.
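A quick numerical sanity check of Bessel's inequality (a hypothetical sketch: a few coefficients of an arbitrary function on [0, 2π] with respect to the trigonometric orthonormal system, approximated with NumPy):

```python
import numpy as np

t = np.linspace(0.0, 2.0 * np.pi, 20001)
dt = t[1] - t[0]
f = np.exp(np.sin(t))                      # any square-integrable function

def ip(g, h):
    # L^2(0, 2pi) inner product, approximated by a Riemann sum
    return np.sum(g * h) * dt

norm_sq = ip(f, f)
# orthonormal trig system: 1/sqrt(2pi), cos(nt)/sqrt(pi), sin(nt)/sqrt(pi)
basis = [np.ones_like(t) / np.sqrt(2 * np.pi)]
for n in range(1, 6):
    basis += [np.cos(n * t) / np.sqrt(np.pi), np.sin(n * t) / np.sqrt(np.pi)]

bessel_sum = sum(ip(f, u) ** 2 for u in basis)
print(bessel_sum <= norm_sq + 1e-9, bessel_sum, norm_sq)  # True; sum <= ||f||^2
```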
Exercise 1.1.15. Let {un} be a countably infinite orthonormal set in a Hilbert space X. Then prove the following statements:

(a) The infinite series Σ_{n=1}^∞ αn un, where the αn's are scalars, converges if and only if the series Σ_{n=1}^∞ |αn|² converges, that is, Σ_{n=1}^∞ |αn|² < ∞.

(b) If Σ_{n=1}^∞ αn un converges and
    x = Σ_{n=1}^∞ αn un = Σ_{n=1}^∞ βn un,
then αn = βn for all n and ‖x‖² = Σ_{n=1}^∞ |αn|².
Proof. (a) Let Σ_{n=1}^∞ αn un be convergent and assume that
    x = Σ_{n=1}^∞ αn un,   or equivalently,   lim_{N→∞} ‖ x − Σ_{n=1}^N αn un ‖² = 0.
Now, for m = 1, 2, . . . ,
    ⟨x, um⟩ = ⟨ Σ_{n=1}^∞ αn un , um ⟩ = Σ_{n=1}^∞ αn ⟨un, um⟩ = αm   (as {un} is orthonormal).
By the Bessel inequality, we get
    Σ_{m=1}^∞ |αm|² = Σ_{m=1}^∞ |⟨x, um⟩|² ≤ ‖x‖²,
which shows that Σ_{n=1}^∞ |αn|² converges.

To prove the converse, assume that Σ_{n=1}^∞ |αn|² is convergent. Consider the finite sum sn = Σ_{i=1}^n αi ui. Then, for m < n, we have
    ‖sn − sm‖² = ⟨ Σ_{i=m+1}^n αi ui , Σ_{i=m+1}^n αi ui ⟩ = Σ_{i=m+1}^n |αi|² → 0   as n, m → ∞.
This means that {sn} is a Cauchy sequence. Since X is complete, the sequence of partial sums {sn} is convergent in X, and therefore the series Σ_{n=1}^∞ αn un converges.
(b) We first prove that ‖x‖² = Σ_{n=1}^∞ |αn|². We have
    | ‖x‖² − Σ_{n=1}^N |αn|² | = | ⟨x, x⟩ − ⟨ Σ_{n=1}^N αn un , Σ_{m=1}^N αm um ⟩ |
      = | ⟨ x , x − Σ_{n=1}^N αn un ⟩ + ⟨ Σ_{n=1}^N αn un , x − Σ_{n=1}^N αn un ⟩ |
      ≤ ( ‖x‖ + ‖ Σ_{n=1}^N αn un ‖ ) ‖ x − Σ_{n=1}^N αn un ‖ = M_N.
Since Σ_{n=1}^∞ αn un converges to x, M_N converges to zero, proving the result.

If x = Σ_{n=1}^∞ αn un = Σ_{n=1}^∞ βn un, then
    0 = lim_{N→∞} ‖ Σ_{n=1}^N (αn − βn) un ‖,   and hence   0 = Σ_{n=1}^∞ |αn − βn|²   by (a),
implying that αn = βn for all n.
Exercise 1.1.16. Let {un} be an orthonormal sequence in a Hilbert space X, and
    Σ_{n=1}^∞ |αn|² < ∞   and   Σ_{n=1}^∞ |βn|² < ∞.
Prove that
    u = Σ_{n=1}^∞ αn un   and   v = Σ_{n=1}^∞ βn un
are convergent series with respect to the norm of X, and ⟨u, v⟩ = Σ_{n=1}^∞ αn β̄n.

Proof. Let
    uN = Σ_{n=1}^N αn un   and   vN = Σ_{n=1}^N βn un.
Then for M < N, we have
    ‖uN − uM‖² = Σ_{n=M+1}^N |αn|² → 0   as M, N → ∞,
and so {uN} is a Cauchy sequence in the complete space X, thus converging to some u ∈ X. Similarly, {vN} is a Cauchy sequence in the complete space X that converges to some v ∈ X. Finally,
    ⟨uN, vN⟩ = Σ_{j,k=1}^N ⟨αj uj, βk uk⟩ = Σ_{j,k=1}^N αj β̄k ⟨uj, uk⟩ = Σ_{j=1}^N αj β̄j,
since ⟨uj, uk⟩ = 0 for j ≠ k and ⟨uj, uj⟩ = 1. Taking the limit as N → ∞ and using the continuity of the inner product, ⟨uN, vN⟩ → ⟨u, v⟩, which gives ⟨u, v⟩ = Σ_{n=1}^∞ αn β̄n.
Recall that if {u1, u2, . . . , un} is a basis of a linear space X, then for every x ∈ X, there exist scalars α1, α2, . . . , αn such that x = α1u1 + α2u2 + · · · + αnun.

Definition 1.1.4. (a) An orthogonal set of vectors {ui} in an inner product space X is called an orthogonal basis if for any x ∈ X, there exist scalars αi such that
    x = Σ_{i=1}^∞ αi ui.
If the set {ui} is orthonormal, then it is called an orthonormal basis.

(b) An orthonormal basis {ui} in a Hilbert space X is called maximal or complete if there is no unit vector u0 in X such that {u0, u1, u2, . . .} is an orthonormal set. In other words, the orthonormal sequence {ui} in X is complete if and only if the only vector orthogonal to each of the ui's is the null vector.

In general, an orthonormal set E in an inner product space X is complete or maximal if it is a maximal orthonormal set in X, that is, E is an orthonormal set and, for every orthonormal set Ẽ satisfying E ⊆ Ẽ, we have Ẽ = E.

(c) Let {ui} be an orthonormal basis in a Hilbert space X. Then the numbers αi = ⟨x, ui⟩ are called the Fourier coefficients of the element x with respect to the system {ui}, and Σ_{i=1}^∞ αi ui is called the Fourier series of the element x.
Example 1.1.8. The set {ei : i ∈ N}, where ei = (0, 0, . . . , 0, 1, 0, . . .) with 1 in the ith place, forms an orthonormal basis for ℓ²(C).

Example 1.1.9. Let X = L²(−π, π) be a complex Hilbert space and un be the element of X defined by
    un(t) = (1/√(2π)) exp(int),   for n = 0, ±1, ±2, . . . .
Then
    { 1/√(2π), cos nt/√π, sin nt/√π : n = 1, 2, . . . }
forms an orthonormal basis for X, as exp(int) = cos nt + i sin nt.
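As an illustration of Fourier coefficients with respect to this trigonometric basis, here is a hypothetical numerical sketch (NumPy, simple Riemann sums, not part of the notes) that reconstructs a function from a partial Fourier series:

```python
import numpy as np

t = np.linspace(-np.pi, np.pi, 40001)
dt = t[1] - t[0]
x = np.abs(t)                              # a sample element of L^2(-pi, pi)

# orthonormal trig basis: 1/sqrt(2pi), cos(nt)/sqrt(pi), sin(nt)/sqrt(pi)
basis = [np.ones_like(t) / np.sqrt(2 * np.pi)]
for n in range(1, 30):
    basis += [np.cos(n * t) / np.sqrt(np.pi), np.sin(n * t) / np.sqrt(np.pi)]

coeffs = [np.sum(x * u) * dt for u in basis]         # alpha_i = <x, u_i>
partial = sum(a * u for a, u in zip(coeffs, basis))   # Fourier partial sum
print(np.max(np.abs(x - partial)))                    # small approximation error
```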
Theorem 1.1.5. Let {ui : i ∈ N} be an orthonormal set in a Hilbert space X. Then the following assertions are equivalent:

(a) {ui : i ∈ N} is an orthonormal basis for X.

(b) For all x ∈ X, x = Σ_{i=1}^∞ ⟨x, ui⟩ui.

(c) For all x ∈ X, ‖x‖² = Σ_{i=1}^∞ |⟨x, ui⟩|².

(d) ⟨x, ui⟩ = 0 for all i implies x = 0.

Proof. (a) ⇔ (b): Let {ui : i ∈ N} be an orthonormal basis for X. Then we can write
    x = Σ_{i=1}^∞ αi ui,   that is,   x = lim_{n→∞} Σ_{i=1}^n αi ui.
For k ≤ n in N, we have
    ⟨ Σ_{i=1}^n αi ui , uk ⟩ = Σ_{i=1}^n αi ⟨ui, uk⟩ = αk.
By letting n → ∞ and using the continuity of the inner product, we obtain
    ⟨x, uk⟩ = αk,
and hence (b) holds. The same argument shows that if (b) holds, then this expansion is unique, and so {ui : i ∈ N} is an orthonormal basis for X.

(b) ⇒ (c): By the Pythagorean theorem and the continuity of the inner product, we have
    ‖x‖² = ‖ Σ_{i=1}^∞ ⟨x, ui⟩ui ‖² = Σ_{i=1}^∞ |⟨x, ui⟩|².

(c) ⇒ (d): Let ⟨x, ui⟩ = 0 for all i. Then ‖x‖² = Σ_{i=1}^∞ |⟨x, ui⟩|² = 0, which implies that x = 0.

(d) ⇒ (b): Take any x ∈ X and let y = x − Σ_{i=1}^∞ ⟨x, ui⟩ui. Then for each k ∈ N, we have
    ⟨y, uk⟩ = ⟨x, uk⟩ − lim_{n→∞} ⟨ Σ_{i=1}^n ⟨x, ui⟩ui , uk ⟩ = 0,
since eventually n ≥ k. It follows from (d) that y = 0, and hence x = Σ_{i=1}^∞ ⟨x, ui⟩ui.
Theorem 1.1.6 (Fourier Series Representation). Let Y be the closed subspace spanned by a countable orthonormal set {ui} in a Hilbert space X. Then every element x ∈ Y can be written uniquely as
    x = Σ_{i=1}^∞ ⟨x, ui⟩ui.        (1.6)

Proof. Uniqueness of (1.6) is a consequence of Exercise 1.1.15 (b). For any x ∈ Y, since Y is the closed subspace spanned by {ui}, we can write
    x = lim_{N→∞} Σ_{i=1}^{M} αi ui,   for M ≥ N.
From Theorem 1.1.4 and the best approximation property (Exercise 1.1.14 (c)), it follows that
    ‖ x − Σ_{i=1}^M ⟨x, ui⟩ui ‖ ≤ ‖ x − Σ_{i=1}^M αi ui ‖,
and as N → ∞, we get the desired result.
Theorem 1.1.7 (Fourier Series Theorem). For an orthonormal set {ui} in a separable Hilbert space X, the following statements are equivalent:

(a) Every x ∈ X can be represented by its Fourier series in X; that is,
    x = Σ_{i=1}^∞ ⟨x, ui⟩ui.        (1.7)

(b) For any pair of vectors x, y ∈ X, we have
    ⟨x, y⟩ = Σ_{i=1}^∞ ⟨x, ui⟩⟨ui, y⟩ = Σ_{i=1}^∞ αi β̄i,        (1.8)
where αi = ⟨x, ui⟩ are the Fourier coefficients of x, and βi = ⟨y, ui⟩ are the Fourier coefficients of y.

(c) For any x ∈ X, one has
    ‖x‖² = Σ_{i=1}^∞ |⟨x, ui⟩|².        (1.9)

(d) Any subspace Y of X that contains {ui} is dense in X.

Proof. (a) ⇒ (b): It follows from (1.7) and the fact that {ui} is orthonormal.

(b) ⇒ (c): Put y = x in (1.8) to get (1.9).

(a) ⇔ (d): Statement (d) is equivalent to the statement that the orthogonal projection onto Ȳ, the closure of Y, is the identity mapping. In view of Theorem 1.1.6, statement (d) is therefore equivalent to statement (a).
Exercise 1.1.17. Let X be a Hilbert space and E be an orthonormal basis of X. Prove
that E is countable if and only if X is separable.
Hint: See Theorem 4.10 on page 187 in [6].
Exercise 1.1.18. Let X be a Hilbert space and E be an orthonormal basis of X. Prove that E is a (Hamel) basis of X if and only if X is finite dimensional.
Hint: See Theorem 4.13 on page 189 in [6].

Exercise 1.1.19. Let X be a Hilbert space and E be an orthonormal set in X. Prove that E is an orthonormal basis of X if and only if span E is dense in X.

Exercise 1.1.20. Let X be a Hilbert space and E be an orthonormal set in X. Show that E is an orthonormal basis if and only if
    ⟨x, y⟩ = Σ_{u∈E} ⟨x, u⟩⟨u, y⟩,   for all x, y ∈ X.
1.2 Orthogonal Projections and Projection Theorem

1.2.1 Orthogonal Projection

Let K be a nonempty subset of a normed space X. Recall that the distance from an element x ∈ X to the set K is defined by
    ρ := inf_{y∈K} ‖x − y‖.        (1.10)
It is important to know whether there is a z ∈ K such that
    ‖x − z‖ = inf_{y∈K} ‖x − y‖,        (1.11)
and, if such a point exists, whether it is unique.
[Figure 1.1: The distance from a point x to K.]

One can see in the figures that, even in the simple space R², there may be no z satisfying (1.12), or precisely one such z, or more than one z.

[Figure 1.2: The distance from a point x to K. Left panel: K is an open segment; no z in K satisfies (1.12). Middle panel: K is an open segment; the z satisfying (1.12) is unique. Right panel: K is a circular arc; infinitely many z's satisfy (1.12).]
To get the existence and uniqueness of such z, we recall the concept of a convex set.
Definition 1.2.1. A subset K of a vector space X is said to be a convex set if for all
x, y ∈ K and α, β ≥ 0 such that α + β = 1, we have αx + βy ∈ K, that is, for all x, y ∈ K
and α ∈ [0, 1], we have αx + (1 − α)y ∈ K.
Theorem 1.2.1. Let K be a nonempty closed convex subset of a Hilbert space X. Then for any given x ∈ X, there exists a unique z ∈ K such that
    ‖x − z‖ = inf_{y∈K} ‖x − y‖.        (1.12)
[Figure 1.3: A convex set and a nonconvex set.]

[Figure 1.4: Existence and uniqueness of the z that minimizes the distance from x to K.]

Proof. Existence. Let ρ := inf_{y∈K} ‖x − y‖. By the definition of the infimum, there exists a sequence {yn} in K such that ‖x − yn‖ → ρ as n → ∞. We will prove that {yn} is a Cauchy sequence.

By using the parallelogram law, we have
    ‖yn − ym‖² + 4 ‖x − (yn + ym)/2‖² = 2 ( ‖x − yn‖² + ‖x − ym‖² ),   for all n, m ≥ 1.
Since K is a convex subset of X and yn, ym ∈ K, we have (yn + ym)/2 ∈ K. Therefore, ‖x − (yn + ym)/2‖ ≥ ρ. Hence
    ‖yn − ym‖² = 2 ( ‖x − yn‖² + ‖x − ym‖² ) − 4 ‖x − (yn + ym)/2‖²
               ≤ 2 ( ‖x − yn‖² + ‖x − ym‖² ) − 4ρ².
Letting n, m → ∞, we have ‖x − yn‖ → ρ and ‖x − ym‖ → ρ, and
    0 ≤ lim_{n,m→∞} ‖yn − ym‖² ≤ 4ρ² − 4ρ² = 0.
Therefore, lim_{n,m→∞} ‖yn − ym‖² = 0, and thus {yn} is a Cauchy sequence. Since X is complete, there exists z ∈ X such that lim_{n→∞} yn = z. Since yn ∈ K and K is closed, z ∈ K. By the continuity of the norm, ‖x − z‖ = lim_{n→∞} ‖x − yn‖ = ρ, that is,
    ‖x − z‖ = inf_{y∈K} ‖x − y‖.
Uniqueness. Suppose that there is also ẑ ∈ K such that
    ‖x − ẑ‖ = inf_{y∈K} ‖x − y‖.
By using the parallelogram law and ‖x − (z + ẑ)/2‖ ≥ ρ (since (z + ẑ)/2 ∈ K), we have
    ‖z − ẑ‖² + 4 ‖x − (z + ẑ)/2‖² = 2 ( ‖x − z‖² + ‖x − ẑ‖² ) = 4ρ²,
that is,
    0 ≤ ‖z − ẑ‖² = 4ρ² − 4 ‖x − (z + ẑ)/2‖² ≤ 0.
Thus, ‖z − ẑ‖ = 0, and hence z = ẑ.
Remark 1.2.1. Theorem 1.2.1 does not hold in the setting of Banach spaces. For example, c0 is a closed subspace of ℓ∞, but there is no unique closest element of c0 to the sequence {1, 1, 1, . . .}. In fact, the distance from the sequence {1, 1, 1, . . .} to c0 is 1, and it is attained by every sequence {xn} ∈ c0 with xn ∈ [0, 2] for all n.
Theorem 1.2.2. Let K be a closed subspace of a Hilbert space X and x ∈ X be given. There exists a unique z ∈ K which satisfies (1.12), and x − z is orthogonal to K, that is, x − z ∈ K⊥.

Proof. Existence. The existence of z ∈ K follows from the previous theorem, as every subspace is convex.

Orthogonality. Clearly, ⟨x − z, 0⟩ = 0. Take y ∈ K, y ≠ 0. We shall prove that ⟨x − z, y⟩ = 0. Since z ∈ K satisfies (1.12) and z + λy ∈ K (as K is a subspace), we have
    ‖x − z‖² ≤ ‖x − (z + λy)‖² = ‖x − z‖² + |λ|²‖y‖² − λ̄⟨x − z, y⟩ − λ⟨y, x − z⟩,
that is,
    0 ≤ |λ|²‖y‖² − λ̄⟨x − z, y⟩ − λ⟨y, x − z⟩.
Putting λ = ⟨x − z, y⟩/‖y‖² in the above inequality, we obtain
    |⟨x − z, y⟩|² / ‖y‖² ≤ 0,
which happens only when ⟨x − z, y⟩ = 0. Since y was arbitrary, x − z is orthogonal to K.

Uniqueness. Suppose that there is also ẑ ∈ K such that x − ẑ ∈ K⊥. Then z − ẑ = (x − ẑ) − (x − z) ∈ K⊥. On the other hand, z − ẑ ∈ K since z, ẑ ∈ K and K is a subspace. So, z − ẑ ∈ K ∩ K⊥ ⊆ {0}. Therefore, z − ẑ = 0, and hence z = ẑ.
Lemma 1.2.1. If K is a proper closed subspace of a Hilbert space X, then there exists a nonzero vector x ∈ X such that x ⊥ K.

Proof. Let u ∉ K and ρ = inf_{y∈K} ‖u − y‖, the distance from u to K; since K is closed and u ∉ K, we have ρ > 0. By Theorem 1.2.1, there exists a unique element z ∈ K such that ‖u − z‖ = ρ. Let x = u − z. Then x ≠ 0 as ρ > 0. (If x = 0, then u − z = 0 and ‖u − z‖ = 0, which implies that ρ = 0.)

Now, we show that x ⊥ K. For this, we show that for arbitrary y ∈ K, ⟨x, y⟩ = 0. For any scalar α, we have ‖x − αy‖ = ‖u − z − αy‖ = ‖u − (z + αy)‖. Since K is a subspace, z + αy ∈ K whenever z, y ∈ K. Thus, z + αy ∈ K implies that ‖x − αy‖ ≥ ρ = ‖x‖, so that ‖x − αy‖² − ‖x‖² ≥ 0, that is, ⟨x − αy, x − αy⟩ − ‖x‖² ≥ 0. Since
    ⟨x − αy, x − αy⟩ = ⟨x, x⟩ − ᾱ⟨x, y⟩ − α⟨y, x⟩ + αᾱ⟨y, y⟩
                     = ‖x‖² − ᾱ⟨x, y⟩ − α⟨y, x⟩ + |α|²‖y‖²,
we have
    −ᾱ⟨x, y⟩ − α⟨y, x⟩ + |α|²‖y‖² ≥ 0.
Putting α = β⟨x, y⟩ in the above inequality, β being an arbitrary real number, we get
    −2β|⟨x, y⟩|² + β²|⟨x, y⟩|²‖y‖² ≥ 0.
If we put a = |⟨x, y⟩|² and b = ‖y‖² in the above inequality, we obtain
    −2βa + β²ab ≥ 0,   or   βa(βb − 2) ≥ 0,   for all real β.
If a > 0, the above inequality is false for all sufficiently small positive β. Hence, a must be zero, that is, a = |⟨x, y⟩|² = 0, or ⟨x, y⟩ = 0 for all y ∈ K.
Lemma 1.2.2. If M and N are closed subspaces of a Hilbert space X such that M ⊥ N,
then the subspace M + N = {x + y ∈ X : x ∈ M and y ∈ N} is also closed.
Proof. It is a well-known result of vector spaces that M + N is a subspace of X. We show
that it is closed, that is, every limit point of M + N belongs to it. Let z be an arbitrary limit
point of M + N. Then there exists a sequence {zn } of points of M + N such that zn → z.
M ⊥ N implies that M ∩ N = {0}. So, every zn ∈ M + N can be written uniquely in the
form zn = xn + yn , where xn ∈ M and yn ∈ N.
By the Pythagorean theorem applied to the elements (xm − xn) and (ym − yn), we have
    ‖zm − zn‖² = ‖(xm − xn) + (ym − yn)‖² = ‖xm − xn‖² + ‖ym − yn‖².        (1.13)
(It is clear that (xm − xn) ⊥ (ym − yn) for all m, n.) Since {zn} is convergent, it is a Cauchy sequence, and so ‖zm − zn‖² → 0. Hence, from (1.13), we see that ‖xm − xn‖ → 0 and ‖ym − yn‖ → 0 as m, n → ∞. Hence, {xm} and {yn} are Cauchy sequences in M and N, respectively. Being closed subspaces of a complete space, M and N are also complete. Thus, {xm} and {yn} are convergent in M and N, respectively, say xm → x ∈ M and yn → y ∈ N, and x + y ∈ M + N as x ∈ M and y ∈ N. Then
    z = lim_{n→∞} zn = lim_{n→∞} (xn + yn) = lim_{n→∞} xn + lim_{n→∞} yn = x + y ∈ M + N.
This proves that an arbitrary limit point of M + N belongs to it, and so it is closed.
Definition 1.2.2. A vector space X is said to be the direct sum of two subspaces Y and Z
of X, denoted by X = Y ⊕ Z, if each x ∈ X has a unique representation x = y + z for y ∈ Y
and z ∈ Z.
Theorem 1.2.3 (Orthogonal Decomposition). If K is a closed subspace of a Hilbert
space X, then every x ∈ X can be uniquely represented as x = z + y for z ∈ K and
y ∈ K ⊥ , that is, X = K ⊕ K ⊥ .
Proof. Since every subspace is a convex set, by the previous two results, for every x ∈ X there is a z ∈ K such that x − z ∈ K⊥, that is, there is a y ∈ K⊥ such that y = x − z, which is equivalent to x = z + y with z ∈ K and y ∈ K⊥.

To prove the uniqueness, assume that there is also ŷ ∈ K⊥ such that x = ŷ + ẑ for some ẑ ∈ K. Then x = y + z = ŷ + ẑ, and therefore y − ŷ = ẑ − z. Since y − ŷ ∈ K⊥ whereas ẑ − z ∈ K, we have y − ŷ ∈ K ∩ K⊥ = {0}. This implies that y = ŷ, and hence also z = ẑ.
[Figure 1.5: Orthogonal decomposition x = z + y, with z = PK(x) ∈ K and y = PK⊥(x) ∈ K⊥.]
Example 1.2.1. (a) Let X = L2 (−1, 1). Then X = K ⊕ K ⊥ , where K is the space of
even functions, that is,
K = {f ∈ L2 (−1, 1) : f (−t) = f (t) for all t ∈ (−1, 1)},
and K ⊥ is the space of odd functions, that is,
K ⊥ = {f ∈ L2 (−1, 1) : f (−t) = −f (t) for all t ∈ (−1, 1)}.
(b) Let X = L2 [a, b]. For c ∈ [a, b], let
K = {f ∈ L2 [a, b] : f (t) = 0 almost everywhere in (a, c)}
and
K ⊥ = {f ∈ L2 [a, b] : f (t) = 0 almost everywhere in (c, b)}.
Then X = K ⊕ K ⊥ .
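As a concrete illustration of part (a), any f ∈ L²(−1, 1) splits into its even part (in K) and its odd part (in K⊥), and the two pieces are orthogonal. A hypothetical numerical sketch (NumPy, not part of the notes):

```python
import numpy as np

t = np.linspace(-1.0, 1.0, 20001)
dt = t[1] - t[0]
f = np.exp(t)                      # a sample element of L^2(-1, 1)

f_even = 0.5 * (f + f[::-1])       # f_even(t) = (f(t) + f(-t)) / 2, lies in K
f_odd  = 0.5 * (f - f[::-1])       # f_odd(t)  = (f(t) - f(-t)) / 2, lies in K-perp

print(np.allclose(f, f_even + f_odd))            # True: f = f_even + f_odd
print(abs(np.sum(f_even * f_odd) * dt) < 1e-8)   # True: <f_even, f_odd> = 0
```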
Exercise 1.2.1. Give examples of representations of R3 as a direct sum of a subspace and
its orthogonal complement.
Exercise 1.2.2. Let K be a subspace of an inner product space X. Show that x ∈ K⊥ if and only if ‖x − y‖ ≥ ‖x‖ for all y ∈ K.
Definition 1.2.3. Let K be a closed subspace of a Hilbert space X. A mapping PK : X → K
defined by
PK (x) = z, where x = z + y and (z, y) ∈ K × K ⊥ ,
is called the orthogonal projection of X onto K.
Let X and Y be normed spaces and T : X → Y be an operator.

(a) The range of T is R(T) := {T(x) ∈ Y : x ∈ X}.

(b) The null space or kernel of T is N(T) := {x ∈ X : T(x) = 0}.

(c) The operator T is called idempotent if T² = T.

Let X be a vector space. A linear operator P : X → X is called a projection operator if P ∘ P = P² = P.

Theorem 1.2.4. If P : X → X is a projection operator from a vector space X to itself, then X = R(P) ⊕ N(P), where R(P) is the range of P and N(P) = {x ∈ X : P(x) = 0} is the null space of P.

Theorem 1.2.5. If a vector space X is expressed as the direct sum of its subspaces Y and Z, then there is a uniquely determined projection P : X → X such that Y = R(P) and Z = N(P) = R(I − P), where I is the identity mapping on X.
Theorem 1.2.6 (Existence of Projection Mapping). Let K be a closed subspace of a
Hilbert space X. Then there exists a unique mapping PK from X onto K such that
R(PK ) = K.
Proof. By Theorem 1.2.3, X = K ⊕ K ⊥ . Theorem 1.2.5 ensures the existence of a unique
projection PK such that R(PK ) = K and N (PK ) = K ⊥ . This projection is an orthogonal
projection as its null space and range are orthogonal.
Similarly, it can be verified that the orthogonal projection I − PK corresponds to the case
R(I − PK ) = K ⊥ and N (I − PK ) = K.
Exercise 1.2.3. Let K be a closed subspace of a Hilbert space X and I be the identity
mapping on X. Then prove that there exists a unique mapping PK from X onto K such
that I − PK maps X onto K ⊥ .
Such map PK is the projection mapping of X onto K.
Exercise 1.2.4 (Properties of Projection Mapping). Let K be a closed subspace of a Hilbert
space X, I be the identity mapping on X and PK is the projection mapping from X onto
K. Then prove that the following properties hold for all x, y ∈ X.
(a) Each element x ∈ X has a unique representation as a sum of an element of K and an
element of K ⊥ , that is,
x = PK (x) + (I − PK )(x).
(1.14)
(Hint: Compare with Theorem 1.2.3)
(b) ‖x‖² = ‖PK(x)‖² + ‖(I − PK)(x)‖².
(c) x ∈ K if and only if PK (x) = x.
(d) x ∈ K ⊥ if and only if PK (x) = 0.
(e) If K1 and K2 are closed subspaces of X such that K1 ⊆ K2 , then PK1 (PK2 (x)) =
PK1 (x).
(f) PK is a linear mapping, that is, for all α, β ∈ R and all x, y ∈ X, PK(αx + βy) = αPK(x) + βPK(y).

(g) PK is a continuous mapping, that is, xn → x as n → ∞ (that is, ‖xn − x‖ → 0) implies PK(xn) → PK(x) as n → ∞ (that is, ‖PK(xn) − PK(x)‖ → 0).
Exercise 1.2.5 (Properties of Projection Mapping). Let K be a closed subspace of a Hilbert
space X, I be the identity mapping on X and PK is the projection mapping from X onto
K. Then prove that the following properties hold for all x, y ∈ X.
(a) Each element z ∈ X can be written uniquely as
z = x + y,
where x ∈ R(PK ) and y ∈ N (PK ).
(b) The null space N (PK ) and the range set R(PK ) are closed subspaces of X.
(c) N (PK ) = (R(PK ))⊥ and R(PK ) = N (PK )⊥ .
(d) PK is idempotent.
Exercise 1.2.6. Let K1 and K2 be closed subspaces of a Hilbert space X and PK1 and PK2 be the orthogonal projections onto K1 and K2, respectively. If ⟨x, y⟩ = 0 for all x ∈ K1 and y ∈ K2, then prove that

(a) K1 + K2 is a closed subspace of X;

(b) PK1 + PK2 is the orthogonal projection onto K1 + K2;

(c) PK1 PK2 ≡ 0 ≡ PK2 PK1.
1.2.2 Projection on Convex Sets
We discuss here the concepts of projection and projection operator on convex sets which are
of vital importance in such diverse fields as optimization, optimal control and variational
inequalities.
Definition 1.2.4. Let K be a nonempty closed convex subset of a Hilbert space X. For x ∈ X, by the projection of x on K, we mean the element z ∈ K, denoted by PK(x), such that
    ‖x − PK(x)‖ ≤ ‖x − y‖,   for all y ∈ K,        (1.15)
equivalently,
    ‖x − z‖ = inf_{y∈K} ‖x − y‖.        (1.16)
The operator from X into K, denoted by PK, which assigns to each x its projection z, is called the projection operator.

In view of Theorem 1.2.1, there always exists a unique z ∈ K which satisfies (1.16).
Theorem 1.2.7 (Variational Characterization of Projection). Let K be a nonempty closed convex subset of a Hilbert space X. For any x ∈ X, z ∈ K is the projection of x if and only if
    ⟨x − z, y − z⟩ ≤ 0,   for all y ∈ K.        (1.17)
Proof. Let z be the projection of x ∈ X. Then for any α, 0 ≤ α ≤ 1, since K is convex, αy + (1 − α)z ∈ K for all y ∈ K. Define a real-valued function g : [0, 1] → R by
    g(α) := ‖x − (αy + (1 − α)z)‖²,   for all α ∈ [0, 1].        (1.18)
Then g is a twice continuously differentiable function of α. Moreover,
    g′(α) = 2⟨x − αy − (1 − α)z, z − y⟩   and   g′′(α) = 2⟨z − y, z − y⟩ ≥ 0.        (1.19)

[Figure 1.6: The projection of a point x onto K.]

Now, if z is the projection of x, then g attains its minimum over [0, 1] at α = 0, so g′(0) ≥ 0; since g′(0) = 2⟨x − z, z − y⟩ = −2⟨x − z, y − z⟩, this is exactly (1.17).

In order to prove the converse, let (1.17) be satisfied for some element z ∈ K. This implies that g′(0) is non-negative, and by (1.19), g′′(α) is non-negative, so g is convex. Hence, g(0) ≤ g(1) = ‖x − y‖² for every y ∈ K, that is, (1.16) is satisfied.
Remark 1.2.2. The inequality (1.17) shows that x − z and y − z subtend a non-acute angle
between them. The projection PK (x) of x on K can be interpreted as the result of applying
to x the operator PK : X → K, which is called projection operator. Note that PK (x) = x
for all x ∈ K.
Theorem 1.2.8. The projection operator PK from a Hilbert space X onto its nonempty closed convex subset K has the following properties:

(a) PK is nonexpansive, that is, ‖PK(x) − PK(y)‖ ≤ ‖x − y‖ for all x, y ∈ X; in particular, PK is continuous.

(b) ⟨PK(x) − PK(y), x − y⟩ ≥ 0 for all x, y ∈ X.

Proof. (a) From (1.17), we obtain
    ⟨PK(x) − x, PK(x) − y⟩ ≤ 0,   for all y ∈ K.        (1.20)
Putting x = x1 in (1.20), we get
    ⟨PK(x1) − x1, PK(x1) − y⟩ ≤ 0,   for all y ∈ K.        (1.21)
Putting x = x2 in (1.20), we get
    ⟨PK(x2) − x2, PK(x2) − y⟩ ≤ 0,   for all y ∈ K.        (1.22)
Since PK(x2), PK(x1) ∈ K, we may choose y = PK(x2) in (1.21) and y = PK(x1) in (1.22), obtaining
    ⟨PK(x1) − x1, PK(x1) − PK(x2)⟩ ≤ 0,
    ⟨PK(x2) − x2, PK(x2) − PK(x1)⟩ ≤ 0.
From the above two inequalities, we obtain
    ⟨PK(x1) − x1 − PK(x2) + x2, PK(x1) − PK(x2)⟩ ≤ 0,
or
    ⟨PK(x1) − PK(x2), PK(x1) − PK(x2)⟩ ≤ ⟨x1 − x2, PK(x1) − PK(x2)⟩,
equivalently,
    ‖PK(x1) − PK(x2)‖² ≤ ⟨x1 − x2, PK(x1) − PK(x2)⟩.        (1.23)
Therefore, by the Cauchy-Schwarz-Bunyakovsky inequality, we get
    ‖PK(x1) − PK(x2)‖² ≤ ‖x1 − x2‖ ‖PK(x1) − PK(x2)‖,        (1.24)
and hence,
    ‖PK(x1) − PK(x2)‖ ≤ ‖x1 − x2‖.        (1.25)

(b) follows from (1.23).
The geometric interpretation of the nonexpansivity of PK is given in the following figure.
We observe that if strict inequality holds in (a), then the projection operator PK reduces
the distance. However, if the equality holds in (a), then the distance is conserved.
[Figure 1.7: The nonexpansiveness of the projection operator.]
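As a simple illustration of Theorems 1.2.7 and 1.2.8 (a hypothetical sketch, not from the notes), consider the projection onto a closed ball in Rⁿ, for which PK has a closed form; the variational inequality (1.17) and the nonexpansiveness (1.25) can then be checked numerically:

```python
import numpy as np

def project_ball(x, radius=1.0):
    """Projection of x onto the closed ball K = {y : ||y|| <= radius}."""
    norm = np.linalg.norm(x)
    return x if norm <= radius else (radius / norm) * x

rng = np.random.default_rng(1)
x1, x2 = rng.normal(size=3) * 3, rng.normal(size=3) * 3
p1, p2 = project_ball(x1), project_ball(x2)

# variational inequality (1.17): <x - P_K(x), y - P_K(x)> <= 0 for every y in K
y = project_ball(rng.normal(size=3))          # an arbitrary point of K
print(np.dot(x1 - p1, y - p1) <= 1e-12)        # True

# nonexpansiveness (1.25): ||P_K(x1) - P_K(x2)|| <= ||x1 - x2||
print(np.linalg.norm(p1 - p2) <= np.linalg.norm(x1 - x2) + 1e-12)  # True
```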
1.3 Bilinear Forms and Lax-Milgram Lemma
Let X and Y be inner product spaces over the same field K (= R or C). A functional
a(·, ·) : X × Y → K will be called a form.
Definition 1.3.1. Let X and Y be inner product spaces over the same field K (= R or C).
A form a(·, ·) : X × Y → K is called a sesquilinear functional or sesquilinear form if the
following conditions are satisfied for all x, x1 , x2 ∈ X, y, y1, y2 ∈ Y and all α, β ∈ K:
(i) a(x1 + x2, y) = a(x1, y) + a(x2, y).

(ii) a(αx, y) = α a(x, y).

(iii) a(x, y1 + y2) = a(x, y1) + a(x, y2).

(iv) a(x, βy) = β̄ a(x, y).

Remark 1.3.1. (a) A sesquilinear functional is linear in the first variable but not linear (it is conjugate-linear) in the second variable. A form which is linear in the second variable as well is called a bilinear form or a bilinear functional. Thus, a bilinear form a(·, ·) is a mapping defined from X × Y into K which satisfies conditions (i)-(iii) of the above definition and a(x, βy) = β a(x, y).
(b) If X and Y are real inner product spaces, then the concepts of sesquilinear functional
and bilinear form coincide.
(c) An inner product is an example of a sesquilinear functional. The real inner product is
an example of a bilinear form.
(d) If a(·, ·) is a sesquilinear functional, then g(x, y) defined as the complex conjugate of a(y, x) is also a sesquilinear functional.
Definition 1.3.2. Let X and Y be inner product spaces. A form a(·, ·) : X × Y → K is called:

(a) symmetric if a(x, y) is the complex conjugate of a(y, x) for all (x, y) ∈ X × Y;

(b) bounded or continuous if there exists a constant M > 0 such that
    |a(x, y)| ≤ M ‖x‖ ‖y‖,   for all x ∈ X, y ∈ Y,
and the norm of a is defined as
    ‖a‖ = sup_{x≠0, y≠0} |a(x, y)| / (‖x‖ ‖y‖) = sup_{‖x‖=‖y‖=1} |a(x, y)|.
It is clear that |a(x, y)| ≤ ‖a‖ ‖x‖ ‖y‖.
Remark 1.3.2. Let a(·, ·) : X × Y → K be a continuous form and {xn} and {yn} be sequences in X and Y, respectively, such that xn → x and yn → y. Then a(xn, yn) → a(x, y). Indeed,
    |a(xn, yn) − a(x, y)| ≤ |a(xn − x, yn)| + |a(x, yn − y)| ≤ ‖a‖ ( ‖xn − x‖ ‖yn‖ + ‖x‖ ‖yn − y‖ ).
Definition 1.3.3. Let X be an inner product space. A form a(·, ·) : X × X → K is called:

(a) positive if a(x, x) ≥ 0 for all x ∈ X;

(b) positive definite if a(x, x) ≥ 0 for all x ∈ X and a(x, x) = 0 implies that x = 0;

(c) coercive or X-elliptic if there exists a constant α > 0 such that a(x, x) ≥ α‖x‖² for all x ∈ X.
Example 1.3.1. Let X = Rⁿ with the usual Euclidean inner product. Then any n × n matrix with real entries defines a continuous bilinear form. If A = (aij), 1 ≤ i, j ≤ n, and if x = (x1, . . . , xn) and y = (y1, . . . , yn), then the bilinear form is defined as
    a(x, y) := Σ_{i,j=1}^n aij xj yi = yᵀAx,
where x and y are considered as column vectors and yᵀ denotes the transpose of y. By the Cauchy-Schwarz inequality, we have
    |a(x, y)| = |yᵀAx| = |⟨y, Ax⟩| ≤ ‖y‖ ‖Ax‖ ≤ ‖A‖ ‖x‖ ‖y‖.
If A is a symmetric and positive definite matrix, then the bilinear form is symmetric and coercive, since
    a(y, y) = Σ_{i,j=1}^n aij yj yi ≥ α‖y‖²,
where α > 0 is the smallest eigenvalue of the matrix A.
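A small numerical check of this coercivity bound (a hypothetical sketch with NumPy): for a symmetric positive definite A, the quantity yᵀAy is bounded below by the smallest eigenvalue of A times ‖y‖².

```python
import numpy as np

rng = np.random.default_rng(2)
B = rng.normal(size=(4, 4))
A = B @ B.T + 4 * np.eye(4)          # symmetric positive definite matrix

alpha = np.linalg.eigvalsh(A).min()  # smallest eigenvalue = coercivity constant

y = rng.normal(size=4)
a_yy = y @ A @ y                     # a(y, y) = y^T A y
print(a_yy >= alpha * np.dot(y, y) - 1e-10)  # True: a(y, y) >= alpha ||y||^2
```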
Theorem 1.3.1 (Extended form of the Riesz Representation Theorem). Let X and Y be Hilbert spaces and a(·, ·) : X × Y → K be a bounded sesquilinear form. Then there exists a unique bounded linear operator T : X → Y such that
    a(x, y) = ⟨T(x), y⟩,   for all (x, y) ∈ X × Y,        (1.26)
and ‖a‖ = ‖T‖.
Proof. For each fixed x ∈ X, define a functional fx : Y → K by
    fx(y) = a(x, y),   for all y ∈ Y.        (1.27)
Then fx is a linear functional, since for all y1, y2 ∈ Y and all α ∈ K, we have
    fx(y1 + y2) = a(x, y1 + y2) = a(x, y1) + a(x, y2) = fx(y1) + fx(y2),
    fx(αy) = a(x, αy) = α a(x, y) = α fx(y).
Since a(·, ·) is bounded, we have
    |fx(y)| = |a(x, y)| ≤ ‖a‖ ‖x‖ ‖y‖,
that is, ‖fx‖ ≤ ‖a‖ ‖x‖. Thus, fx is a bounded linear functional on Y. By the Riesz representation theorem³, there exists a unique vector y* ∈ Y such that
    fx(y) = ⟨y, y*⟩,   for all y ∈ Y.        (1.28)
The vector y* depends on the choice of the vector x. Therefore, we can write y* = T(x), where T : X → Y. We observe that
    a(x, y) = ⟨y, T(x)⟩,   or   a(x, y) = ⟨T(x), y⟩,   for all x ∈ X and y ∈ Y.
Since y* is unique, the operator T is uniquely determined.

³ Riesz Representation Theorem. If f is a bounded linear functional on a Hilbert space X, then there exists a unique vector y ∈ X such that f(x) = ⟨x, y⟩ for all x ∈ X and ‖f‖ = ‖y‖.
The operator T is linear in view of the following relations: for all y ∈ Y, x, x1, x2 ∈ X and α ∈ K, we have
    ⟨T(x1 + x2), y⟩ = a(x1 + x2, y) = a(x1, y) + a(x2, y) = ⟨T(x1), y⟩ + ⟨T(x2), y⟩,
    ⟨T(αx), y⟩ = a(αx, y) = α a(x, y) = α ⟨T(x), y⟩.
Moreover, T is continuous, as
    ‖fx‖ = ‖y*‖ = ‖T(x)‖ ≤ ‖a‖ ‖x‖
implies that ‖T‖ ≤ ‖a‖.

To prove that ‖T‖ = ‖a‖, it is enough to show that ‖T‖ ≥ ‖a‖, which follows from the relation
    ‖a‖ = sup_{x≠0, y≠0} |a(x, y)| / (‖x‖ ‖y‖) = sup_{x≠0, y≠0} |⟨T(x), y⟩| / (‖x‖ ‖y‖)
        ≤ sup_{x≠0, y≠0} ‖T(x)‖ ‖y‖ / (‖x‖ ‖y‖) = ‖T‖.

To prove the uniqueness of T, let us assume that there is another linear operator S : X → Y such that
    a(x, y) = ⟨S(x), y⟩,   for all (x, y) ∈ X × Y.
Then, for every x ∈ X and y ∈ Y, we have
    a(x, y) = ⟨T(x), y⟩ = ⟨S(x), y⟩,
equivalently, ⟨(T − S)(x), y⟩ = 0. This implies that (T − S)(x) = 0 for all x ∈ X, that is, T ≡ S. This proves that there exists a unique bounded linear operator T such that a(x, y) = ⟨T(x), y⟩.
Remark 1.3.3 (Converse of the above theorem). Let X and Y be Hilbert spaces and T : X → Y be a bounded linear operator. Then the form a(·, ·) : X × Y → K defined by
    a(x, y) = ⟨T(x), y⟩,   for all (x, y) ∈ X × Y,        (1.29)
is a bounded sesquilinear form on X × Y.

Proof. Since T is a bounded linear operator and the inner product is a sesquilinear mapping, a(x, y) = ⟨T(x), y⟩ is sesquilinear.
Since |a(x, y)| = |⟨T(x), y⟩| ≤ ‖T‖ ‖x‖ ‖y‖ by the Cauchy-Schwarz-Bunyakovsky inequality, we have sup_{‖x‖=‖y‖=1} |a(x, y)| ≤ ‖T‖, and hence a(·, ·) is bounded.
Corollary 1.3.1. Let X be a Hilbert space and T : X → X be a bounded linear operator. Then the complex-valued function b(·, ·) : X × X → C defined by b(x, y) = ⟨x, T(y)⟩ is a bounded bilinear form on X and ‖b‖ = ‖T‖.
Conversely, if b(·, ·) : X × X → C is a bounded bilinear form, then there is a unique bounded linear operator T : X → X such that b(x, y) = ⟨x, T(y)⟩ for all (x, y) ∈ X × X.

Proof. Define a function a(·, ·) : X × X → C by
    a(x, y) = b(y, x) = ⟨T(x), y⟩.
By Theorem 1.3.1, a(x, y) is a bounded bilinear form on X and ‖a‖ = ‖T‖. Since b(x, y) = a(y, x), b is also a bounded bilinear form on X and
    ‖b‖ = sup_{‖x‖=‖y‖=1} |b(x, y)| = sup_{‖x‖=‖y‖=1} |a(y, x)| = ‖a‖ = ‖T‖.
Conversely, if b is given, we define a bounded bilinear form a(·, ·) : X × X → C by
    a(x, y) = b(y, x),   for all x, y ∈ X.
Again, by Theorem 1.3.1, there is a bounded linear operator T on X such that
    a(x, y) = ⟨T(x), y⟩,   for all (x, y) ∈ X × X.
Therefore, we have b(x, y) = a(y, x) = ⟨T(y), x⟩ = ⟨x, T(y)⟩ for all (x, y) ∈ X × X.
Corollary 1.3.2. Let X be a Hilbert space. If T is a bounded linear operator on X, then
    ‖T‖ = sup_{‖x‖=‖y‖=1} |⟨x, T(y)⟩| = sup_{‖x‖=‖y‖=1} |⟨T(x), y⟩|.

Proof. By Theorem 1.3.1, for every bounded linear operator T on X, there is a bounded bilinear form a such that a(x, y) = ⟨T(x), y⟩ and ‖a‖ = ‖T‖. Then
    ‖a‖ = sup_{‖x‖=‖y‖=1} |a(x, y)| = sup_{‖x‖=‖y‖=1} |⟨T(x), y⟩|.
From this, we conclude that ‖T‖ = sup_{‖x‖=‖y‖=1} |⟨T(x), y⟩|.
Definition 1.3.4. Let X be a Hilbert space and a(·, ·) : X × X → K be a form. Then the
operator F : X → K is called a quadratic form associated with a(·, ·) if F (x) = a(x, x) for
all x ∈ X.
A quadratic form F is called real if F (x) is real for all x ∈ X.
Remark 1.3.4. (a) We immediately observe that F(αx) = |α|²F(x) and |F(x)| ≤ ‖a‖ ‖x‖².

(b) The norm of F is defined as
    ‖F‖ = sup_{x≠0} |F(x)| / ‖x‖² = sup_{‖x‖=1} |F(x)|.
Remark 1.3.5. If a(·, ·) is any fixed sesquilinear form and F(x) is the associated quadratic form on a Hilbert space X, then

(a) (1/2)[a(x, y) + a(y, x)] = F((x + y)/2) − F((x − y)/2);

(b) a(x, y) = (1/4)[F(x + y) − F(x − y) + iF(x + iy) − iF(x − iy)].

Verification. By using the sesquilinearity of a, we have
    F(x + y) = a(x + y, x + y) = a(x, x) + a(y, x) + a(x, y) + a(y, y)
and
    F(x − y) = a(x − y, x − y) = a(x, x) − a(y, x) − a(x, y) + a(y, y).
By subtracting the second of the above equations from the first, we get
    F(x + y) − F(x − y) = 2a(x, y) + 2a(y, x).        (1.30)
Replacing y by iy in (1.30), we obtain
    F(x + iy) − F(x − iy) = 2a(x, iy) + 2a(iy, x),
or
    F(x + iy) − F(x − iy) = −2i a(x, y) + 2i a(y, x).        (1.31)
Multiplying (1.31) by i and adding it to (1.30), we get the result.
Lemma 1.3.1. A bilinear form a(·, ·) : X × X → K is symmetric if and only if the associated quadratic functional F(x) is real.

Proof. If a(x, y) is symmetric, then F(x) = a(x, x) equals the complex conjugate of a(x, x), which is the complex conjugate of F(x). This implies that F(x) is real.

Conversely, let F(x) be real. Note that
    F(x) = F(−x) = F(ix),
since F(−x) = a(−x, −x) = a(x, x) and F(ix) = a(ix, ix) = i(−i)a(x, x) = a(x, x). Then, by Remark 1.3.5 (b), we obtain
    a(y, x) = (1/4)[F(y + x) − F(y − x) + iF(y + ix) − iF(y − ix)]
            = (1/4)[F(x + y) − F(x − y) + iF(x − iy) − iF(x + iy)]
            = the complex conjugate of (1/4)[F(x + y) − F(x − y) + iF(x + iy) − iF(x − iy)]   (as F is real-valued)
            = the complex conjugate of a(x, y).
Hence, a(·, ·) is symmetric.
Lemma 1.3.2. A bilinear form a(·, ·) : X × X → K is bounded if and only if the associated quadratic form F is bounded. If a(·, ·) is bounded, then ‖F‖ ≤ ‖a‖ ≤ 2‖F‖.

Proof. Suppose that a(·, ·) is bounded. Then we have
    sup_{‖x‖=1} |F(x)| = sup_{‖x‖=1} |a(x, x)| ≤ sup_{‖x‖=‖y‖=1} |a(x, y)| = ‖a‖,
and therefore F is bounded and ‖F‖ ≤ ‖a‖.

On the other hand, suppose F is bounded. From Remark 1.3.5 (b) and the parallelogram law, we get
    |a(x, y)| ≤ (1/4) ‖F‖ ( ‖x + y‖² + ‖x − y‖² + ‖x + iy‖² + ‖x − iy‖² )
              = (1/4) ‖F‖ ( 2(‖x‖² + ‖y‖²) + 2(‖x‖² + ‖y‖²) )
              = ‖F‖ ( ‖x‖² + ‖y‖² ),
or
    sup_{‖x‖=‖y‖=1} |a(x, y)| ≤ 2‖F‖.
Thus, a(·, ·) is bounded and ‖a‖ ≤ 2‖F‖.
Theorem 1.3.2. Let X be a Hilbert space and T : X → X be a bounded linear operator. Then the following statements are equivalent:

(a) T is self-adjoint.

(b) The bilinear form a(·, ·) on X defined by a(x, y) = ⟨T(x), y⟩ is symmetric.

Proof. (a) ⇒ (b): F(x) = ⟨T(x), x⟩ = ⟨x, T(x)⟩, which is the complex conjugate of ⟨T(x), x⟩ = F(x); thus F(x) is real. In view of Lemma 1.3.1, we obtain the result.

(b) ⇒ (a): ⟨T(x), y⟩ = a(x, y) = the conjugate of a(y, x) = the conjugate of ⟨T(y), x⟩ = ⟨x, T(y)⟩. This shows that T* ≡ T, that is, T is self-adjoint.
Theorem 1.3.3. Let X be a Hilbert space. If a bilinear form a(·, ·) : X × X → K is bounded and symmetric, then ‖a‖ = ‖F‖, where F is the associated quadratic functional.
The following theorem, known as the Lax-Milgram lemma, proved by P. D. Lax and A. N. Milgram in 1954, has important applications in different fields.
Theorem 1.3.4 (Lax-Milgram Lemma). Let X be a Hilbert space, a(·, ·) : X × X → R
be a coercive bounded bilinear form, and f : X → R be a bounded linear functional. Then
there exists a unique element x ∈ X such that
a(x, y) = f (y),
for all y ∈ X.
(1.32)
Proof. Since a(·, ·) is bounded, there exists a constant M > 0 such that
|a(x, y)| ≤ Mkxk kyk.
(1.33)
By Theorem 1.3.1, there exists a bounded linear operator T : X → X such that
a(x, y) = hT (x), yi,
for all (x, y) ∈ X × X.
By Riesz representation theorem4 , there exists a continuous linear functional f : X → R
such that equation a(x, y) = f (y) can be rewritten as, for all λ > 0,
hλT (x), yi = λhf, yi,
(1.34)
or
This implies that
hλT (x) − λf, yi = 0,
for all y ∈ X.
λT (x) = λf.
(1.35)
We will show that (1.35) has a unique solution by showing that for appropriate values of
parameter ρ > 0, the affine mapping for y ∈ X, y 7→ y − ρ(λT (y) − λf ) ∈ X is a contraction
mapping. For this, we observe that
ky − ρλT (y)k2 = hy − ρλT (y), y − ρλT (y)i
= kyk2 − 2ρhλT (y), yi + ρ2 kλT (y)k2
≤ kyk2 − 2ραkyk2 + ρ2 M 2 kyk2,
4
(by applying inner product axioms)
Riesz Representation Theorem. If f is a bounded linear functional on a Hilbert space X, then there
exists a unique vector y ∈ X such that f (x) = hx, yi for all x ∈ X and kf k = kyk
Qamrul Hasan Ansari
Advanced Functional Analysis
Page 41
since
a(y, y) = ⟨T(y), y⟩ ≥ α‖y‖²   (by the coercivity of a),   (1.36)
and
‖T(y)‖ ≤ M‖y‖   (by the boundedness of T).
Therefore,
‖y − ρT(y)‖² ≤ (1 − 2ρα + ρ²M²)‖y‖²,   (1.37)
or
‖y − ρT(y)‖ ≤ (1 − 2ρα + ρ²M²)^{1/2} ‖y‖.   (1.38)
For u, v ∈ X we then have
‖S(u) − S(v)‖ = ‖(u − ρ(T(u) − z)) − (v − ρ(T(v) − z))‖
             = ‖(u − v) − ρT(u − v)‖
             ≤ (1 − 2ρα + ρ²M²)^{1/2} ‖u − v‖   (by (1.38)).   (1.39)
This implies that S is a contraction mapping if 0 < 1 − 2ρα + ρ²M² < 1, which holds precisely when ρ ∈ (0, 2α/M²). Hence, by the Banach contraction fixed point theorem, S has a unique fixed point, which is the unique solution of T(x) = z and hence of (1.32).
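The argument above is constructive: in Rⁿ with the usual inner product, the iteration y ↦ y − ρ(T(y) − z) can be run directly. The following Python sketch is an illustration only; the matrix A (playing the role of T), the vector z and the numerical constants are hypothetical choices, not part of the lemma.

import numpy as np

n = 5
rng = np.random.default_rng(0)
B = rng.standard_normal((n, n))
A = B @ B.T + n * np.eye(n) + 0.5 * (B - B.T)   # a(x, y) = <A x, y>: bounded, coercive, not symmetric
z = rng.standard_normal(n)                       # Riesz representer of f, i.e. f(y) = <z, y>

alpha = np.linalg.eigvalsh(0.5 * (A + A.T)).min()   # coercivity constant: a(x, x) >= alpha ||x||^2
M = np.linalg.norm(A, 2)                            # bound: |a(x, y)| <= M ||x|| ||y||
rho = alpha / M ** 2                                # any rho in (0, 2*alpha/M^2) works

x = np.zeros(n)
for _ in range(5000):
    x = x - rho * (A @ x - z)                       # the contraction S from the proof

print(np.linalg.norm(A @ x - z))                    # residual of T(x) = z, essentially 0
print(np.allclose(x, np.linalg.solve(A, z)))        # agrees with the direct solve

The choice ρ = α/M² lies in the admissible interval (0, 2α/M²) and gives the contraction constant (1 − α²/M²)^{1/2} < 1.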
Remark 1.3.6 (Abstract Variational Problem). Find an element x such that
a(x, y) = f (y),
for all y ∈ X,
where a(x, y) and f are as in Theorem 1.3.4.
This problem is known as abstract variational problem. In view of the Lax-Milgram lemma,
it has a unique solution.
2 Spectral Theory of Continuous and Compact Linear Operators
Let X and Y be linear spaces and T : X → Y be a linear operator. Recall that the range
R(T ) and null space N (T ) of T are defined, respectively, as
R(T ) = {T (x) : x ∈ X} and N (T ) = {x ∈ X : T (x) = 0}.
The dimension of R(T ) is called the rank of T and the dimension of N (T ) is called the
nullity of T .
It can be easily seen that a linear operator T : X → Y is one-one if and only if N (T ) = {0}.
Recall that
c = the space of convergent sequences of real or complex numbers
  = {{xn} ⊆ K : {xn} is convergent},
c₀ = the space of sequences of real or complex numbers that converge to zero
  = {{xn} ⊆ K : xn → 0 as n → ∞},
c₀₀ = the space of sequences with only finitely many nonzero terms
  = ⋃_{k=1}^∞ {{x₁, x₂, . . .} ⊆ K : xⱼ = 0 for j ≥ k},
ℓ∞ = {{xn} ⊆ K : sup_{n∈N} |xn| < ∞}.
Clearly, c00 ⊆ c0 ⊆ c ⊆ ℓ∞ .
2.1 Compact Linear Operators on Normed Spaces
Recall that the set {T(x) : ‖x‖ ≤ 1} is bounded whenever T : X → Y is a bounded linear operator from a normed space X to another normed space Y. If, moreover, T is of finite rank, then the closure of {T(x) : ‖x‖ ≤ 1} is compact, since every closed and bounded subset of a finite dimensional normed space is compact. This is no longer true if the rank of T (the dimension of R(T) is called the rank of T) is infinite. For example, for the identity operator I : X → X on an infinite dimensional normed space X, the above set reduces to the closed unit ball {x ∈ X : ‖x‖ ≤ 1}, which is not compact.
Let T : X → Y be a bounded linear operator from a normed space X to another normed space Y. Then for any r > 0, we have
the closure of {T(x) : ‖x‖ ≤ r} is compact ⇔ the closure of {T(x) : ‖x‖ ≤ 1} is compact,
the closure of {T(x) : ‖x‖ < r} is compact ⇔ the closure of {T(x) : ‖x‖ < 1} is compact.
Definition 2.1.1 (Compact linear operator). Let X and Y be normed spaces. A linear operator T : X → Y is said to be compact or completely continuous if the image T(M) of every bounded subset M of X is relatively compact, that is, the closure of T(M) is compact for every bounded subset M of X.
Lemma 2.1.1. Let X and Y be normed spaces.
(a) Every compact linear operator T : X → Y is bounded, and hence continuous.
(b) If dimX = ∞, then the identity operator I : X → X (which is always continuous)
is not compact.
Proof. (a) Since the unit sphere S = {x ∈ X : ‖x‖ = 1} is bounded and T is a compact linear operator, the closure of T(S) is compact, and hence T(S) is bounded¹. Therefore,
sup_{‖x‖=1} ‖T(x)‖ < ∞.
Hence T is bounded and so it is continuous.
(b) Note that the closed unit ball B = {x ∈ X : ‖x‖ ≤ 1} is bounded. If dim X = ∞, then B cannot be compact². Since B is closed, the closure of I(B) = B is B itself, so B is not relatively compact and I is not a compact operator.

¹Every compact subset of a normed space is closed and bounded.
²A normed space X is finite dimensional if and only if its closed unit ball is compact.
Exercise 2.1.1. Let X and Y be normed spaces and T : X → Y be a linear operator. Then
prove that the following statements are equivalent.
(a) T is a compact operator.
(b) The closure of {T(x) : ‖x‖ < 1} is compact in Y.
(c) The closure of {T(x) : ‖x‖ ≤ 1} is compact in Y.
Proof. Clearly, (a) implies (b) and (c), since the sets in (b) and (c) are images of bounded sets. Assume that (c) holds. Let M be a bounded subset of X. Then there exists r > 0 such that M ⊆ {x ∈ X : ‖x‖ ≤ r}, and hence
T(M) ⊆ {T(x) : ‖x‖ ≤ r} = r{T(x) : ‖x‖ ≤ 1}.
Since the closure of {T(x) : ‖x‖ ≤ 1} is compact, so is the closure of r{T(x) : ‖x‖ ≤ 1}; the closure of T(M), being a closed subset of a compact set, is therefore compact. Thus (c) implies (a). Finally, every T(x) with ‖x‖ ≤ 1 is the limit of {T((1 − 1/n)x)} and ‖(1 − 1/n)x‖ < 1, so the closures of the sets in (b) and (c) coincide; hence (b) and (c) are equivalent.
Theorem 2.1.1 (Compactness criterion). Let X and Y be normed spaces and T : X →
Y be a linear operator. Then T is compact if and only if it maps every bounded sequence
{xn } in X onto a sequence {T (xn )} in Y which has a convergent subsequence.
Proof. Suppose T is compact and {xn} is bounded, say ‖xn‖ ≤ c for every n ∈ N and some constant c > 0. Let M = {x ∈ X : ‖x‖ ≤ c}. Then {T(xn)} is a sequence in the closure of T(M) in Y, which is compact, and hence it contains a convergent subsequence.
Conversely, assume that every bounded sequence {xn } contains a subsequence {xnk } such
that {T(xnk)} converges in Y. Let B be a bounded subset of X. To show that the closure of T(B) is compact, it is enough to prove that every sequence in T(B) has a convergent subsequence³. Let {yn} be any sequence in T(B). Then yn = T(xn) for some xn ∈ B, and {xn} is bounded since B is bounded. By assumption, {T(xn)} = {yn} contains a convergent subsequence. Since {yn} in T(B) was arbitrary, the closure of T(B) is compact. This shows that T is compact.
Remark 2.1.1. The sum T₁ + T₂ of two compact linear operators T₁, T₂ : X → Y is compact. Also, for every scalar α, αT₁ is compact. Therefore the set of compact linear operators from a normed space X to another normed space Y, denoted by K(X, Y), forms a vector space.
Exercise 2.1.2. Let X and Y be normed spaces. Prove that K(X, Y ) is a subspace of
B(X, Y ) the space of all bounded linear operators from X to Y .
Exercise 2.1.3. Let T : X → X be a compact linear operator and S : X → X be a bounded
linear operator on a normed space X. Then prove that T S and ST are compact.
³A set is relatively compact if and only if every sequence in it has a convergent subsequence.
Proof. Let B be any bounded subset of X. Since S is bounded, S(B) is a bounded set
and T (S(B)) = T S(B) is relatively compact because T is compact. Hence T S is a linear
compact operator.
To prove that ST is also compact, let {xn} be any bounded sequence in X. Then {T(xn)} has a convergent subsequence {T(xnk)} by Theorem 2.1.1 and, since S is continuous, {S(T(xnk))} is convergent. Hence ST is compact, again by Theorem 2.1.1.
Example 2.1.1. Let 1 ≤ p ≤ ∞ and X = ℓp . Let T : X → X be the right shift operator on
X defined by
(T(x))(i) := 0 if i = 1, and (T(x))(i) := x(i − 1) if i > 1.
Since T(en) = e_{n+1} and
‖en − em‖ = 2^{1/p} if 1 ≤ p < ∞, and ‖en − em‖ = 1 if p = ∞,
for all n, m ∈ N with n ≠ m, it follows that, corresponding to the bounded sequence {en}, {T(en)} does not have a convergent subsequence. Hence, by Theorem 2.1.1, the operator T is not compact.
Exercise 2.1.4. Prove that the left shift operator on ℓp space is not compact for any p with
1 ≤ p ≤ ∞.
Definition 2.1.2. An operator T ∈ B(X, Y ) with dimT (X) < ∞ is called an operator of
finite rank.
Theorem 2.1.2 (Finite dimensional domain or range). Let X and Y be normed spaces
and T : X → Y be a linear operator.
(a) If T is bounded and dimT (X) < ∞, then the operator T is compact. That is, every
bounded linear operator of finite rank is compact.
(b) If dim(X) < ∞, then the operator T is compact.
Proof. (a) Let {xn } be any bounded sequence in X. Then the inequality kT (xn )k ≤ kT k kxn k
shows that the sequence {T (xn )} is bounded. Hence {T (xn )} is relatively compact4 since
dimT (X) < ∞. It follows that {T (xn )} has a convergent subsequence. Since {xn } was an
arbitrary bounded sequence in X, the operator T is compact by Theorem 2.1.1.
(b) It follows from (a) by noting that dim(X) < ∞ implies the boundedness of T 5 .
Exercise 2.1.5. Prove that the identity operator on a normed space is compact if and only
if the space is of finite dimension.
⁴In a finite dimensional normed space, a set is compact if and only if it is closed and bounded.
⁵Every linear operator defined on a finite dimensional normed space X is bounded.
Theorem 2.1.3 (Sequence of compact linear operators). Let {Tn } be a sequence of
compact linear operators from a normed space X to a Banach space Y . If {Tn } is
uniformly operator convergent to an operator T (that is, kTn − T k → 0), then the limit
operator T is compact.
Proof. Let {Tn } be a sequence in K(X, Y ) such that kTn − T k → 0 as n → ∞. In order to
prove that T ∈ K(X, Y ), it is enough to show that for any bounded sequence {xn } in X, the
image sequence {T (xn )} has a convergent subsequence, and then apply Theorem 2.1.1.
Let {xn } be a bounded sequence in X, and ε > 0 be given. Since {Tn } is a sequence in
K(X, Y ), there exists N ∈ N such that
kTn − T k < ε,
for all n ≥ N.
Since TN ∈ K(X, Y ), there exists a subsequence {x̃n } of {xn } such that {TN (x̃n )} is convergent. In particular, there exists n0 ∈ N such that
kTN (x̃n ) − TN (x̃m )k < ε,
for all m, n ≥ n0 .
Hence we obtain, for n, m ≥ n₀,
‖T(x̃n) − T(x̃m)‖ ≤ ‖T(x̃n) − T_N(x̃n)‖ + ‖T_N(x̃n) − T_N(x̃m)‖ + ‖T_N(x̃m) − T(x̃m)‖
               ≤ ‖T − T_N‖ ‖x̃n‖ + ‖T_N(x̃n) − T_N(x̃m)‖ + ‖T_N − T‖ ‖x̃m‖
               ≤ (2c + 1)ε,
where c > 0 is such that kxn k ≤ c for all n ∈ N. This shows that {T (x̃n )} is a Cauchy
sequence and hence converges since Y is complete. Remembering that {x̃n } is a subsequence
of the arbitrary bounded sequence {xn }, we see that Theorem 2.1.1 implies compactness of
the operator T .
Remark 2.1.2. The above theorem does not hold if we replace uniform operator convergence by strong operator convergence ‖Tn(x) − T(x)‖ → 0. For example, consider the sequence Tn : ℓ² → ℓ² defined by Tn(x) = (ξ₁, . . . , ξn, 0, 0, . . .), where x = {ξⱼ} ∈ ℓ². Since each Tn is linear, bounded and of finite rank, Tn is compact by Theorem 2.1.2 (a). Clearly, Tn(x) → x = I(x) for every x, but I is not compact since dim ℓ² = ∞ (see Lemma 2.1.1 (b)).
Remark 2.1.3. As a particular case of the above theorem, we can say that if X is a Banach
space, and if {Tn } is a sequence of finite rank operators in B(X) such that kTn − T k → 0 as
n → ∞ for some T ∈ B(X), then T is a compact operator.
Using the above theorem, we now give an example of a compact operator.
Example 2.1.2. The operator T : ℓ² → ℓ² defined by T(x) = y, where x = {ξⱼ} ∈ ℓ² and y = {ηⱼ} with ηⱼ = ξⱼ/j for all j = 1, 2, . . ., is a compact linear operator.
Clearly, if x = {ξⱼ} ∈ ℓ², then y = {ηⱼ} ∈ ℓ². Let Tn : ℓ² → ℓ² be defined by
Tn(x) = (ξ₁/1, ξ₂/2, ξ₃/3, . . . , ξn/n, 0, 0, . . .).
Then Tn is linear and bounded, and is compact by Theorem 2.1.2 (a). Furthermore,
‖(T − Tn)(x)‖² = Σ_{j=n+1}^∞ |ηⱼ|² = Σ_{j=n+1}^∞ (1/j²)|ξⱼ|² ≤ (1/(n + 1)²) Σ_{j=n+1}^∞ |ξⱼ|² ≤ ‖x‖²/(n + 1)².
Taking the supremum over all x of norm 1, we see that
‖T − Tn‖ ≤ 1/(n + 1).
Hence Tn → T in the operator norm, and T is compact by Theorem 2.1.3.
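The bound ‖T − Tn‖ ≤ 1/(n + 1) can be observed on finite truncations, where T and Tn act as diagonal matrices on the first N coordinates. A minimal Python sketch; the truncation size N is an assumption made only for the illustration.

import numpy as np

N = 200
j = np.arange(1, N + 1)
T = np.diag(1.0 / j)                                # T(x) = (xi_1/1, xi_2/2, xi_3/3, ...)

for n in [1, 5, 10, 50]:
    Tn = np.diag(np.where(j <= n, 1.0 / j, 0.0))    # the finite rank truncation T_n
    print(n, np.linalg.norm(T - Tn, 2), 1.0 / (n + 1))   # spectral norm equals 1/(n+1)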
Example 2.1.3. Let {λn } be a sequence of scalars such that λn → 0 as n → ∞. Let
T : ℓp → ℓp (1 ≤ p ≤ ∞) be defined by
(T (x))(i) = λi x(i),
for all x ∈ ℓp , i ∈ N.
Then we see that T is a compact operator.
For each n ∈ N, let
(Tn(x))(i) = λᵢx(i) if 1 ≤ i ≤ n, and (Tn(x))(i) = 0 if i > n.
Then for each n, clearly Tn : ℓᵖ → ℓᵖ is a bounded operator of finite rank. In particular, each Tn is a compact operator. It also follows that
‖(T − Tn)(x)‖_p ≤ (sup_{i>n} |λᵢ|) ‖x‖_p,  for all x ∈ ℓᵖ, n ∈ N.
Since λn → 0 as n → ∞, we obtain
‖T − Tn‖ ≤ sup_{i>n} |λᵢ| → 0 as n → ∞.
Then, by Theorem 2.1.3, T is a compact operator.
Since T (en ) = λn en for all n ∈ N, T is of infinite rank whenever λn 6= 0 for infinitely many
n.
Exercise 2.1.6. Prove that the operator T defined in the Example 2.1.3 is not compact if
λn → λ 6= 0 as n → ∞.
Exercise 2.1.7. Show that the zero operator on any normed space is compact.
Exercise 2.1.8. If T1 , T2 : X → Y are compact linear operators from a normed space X to
another normed space Y and α is a scalar, then show that T1 + T2 and αT1 are also compact
linear operators.
Exercise 2.1.9. Show that the projection of a Hilbert space H onto a finite dimensional
subspace of H is compact.
Exercise 2.1.10. Show that the operator T : ℓ2 → ℓ2 defined by T (x) = y, where x =
{ξ1 , ξ2 , . . .} and y = {η1 , η2 , . . .} with ηi = ξi /2i , is compact.
Exercise 2.1.11. Show that the operator T : ℓp → ℓp , 1 ≤ p < ∞, defined by T (x) = y,
where x = {ξ1 , ξ2 , . . .} and y = {η1 , η2 , . . .} with ηi = ξi /i, is compact.
Exercise 2.1.12. Show that the operator T : ℓ∞ → ℓ∞ defined by T (x) = y, where x =
{ξ1 , ξ2 , . . .} and y = {η1 , η2 , . . .} with ηi = ξi /i, is compact.
Theorem 2.1.4. Let X and Y be normed spaces and T : X → Y be a linear compact
operator. Suppose that the sequence {xn } in X is weakly convergent, say, xn ⇀ x. Then
{T (xn )} converges strongly to T (x) in Y .
Proof. We write yn = T (xn ) and y = T (x). We first show that yn ⇀ y and then yn → y.
Let g be any bounded linear functional on Y . We define a functional f on X by setting
f (z) = g(T (z)),
for all z ∈ X.
Then f is linear. Also, f is bounded because T is compact, hence bounded, and
|f (z)| = |g(T (z))| ≤ kgk kT (z)k ≤ kgk kT k kzk.
By definition, xn ⇀ x implies f (xn ) → f (x), hence by the definition, g(T (xn )) → g(T (x)),
that is, g(yn) → g(y). Since g was arbitrary, this proves yn ⇀ y.
Now we prove yn → y. Assume that it does not hold. Then {yn } has a subsequence {ynk }
such that
kynk − yk ≥ δ,
(2.1)
for some δ > 0. Since {xn } is weakly convergent, {xn } is bounded, and so is {xnk }. Compactness of T implies that (by Theorem 2.1.1) {T (xnk )} has a convergent subsequence, say
{ỹj }. Let ỹj → ỹ. Then of course ỹj ⇀ ỹ. Hence ỹ = y because yn ⇀ y. Consequently,
kỹj − yk → 0 but kỹj − yk ≥ δ > 0 by (2.1).
This is a contradiction; hence yn → y.
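Theorem 2.1.4 can be watched numerically on truncations of ℓ²: the unit vectors en converge weakly to 0 but not in norm, while their images under the compact diagonal operator T(x) = (ξⱼ/j) of Example 2.1.2 do converge to 0 in norm. A minimal sketch; the truncation length N is an assumption of the illustration.

import numpy as np

N = 1000
j = np.arange(1, N + 1)

for n in [1, 10, 100, 1000]:
    e_n = np.zeros(N)
    e_n[n - 1] = 1.0
    # ||e_n|| = 1 for every n (the weak limit 0 is not a norm limit),
    # while ||T(e_n)|| = 1/n -> 0 for the compact diagonal operator T(x) = (xi_j / j).
    print(n, np.linalg.norm(e_n), np.linalg.norm(e_n / j))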
Remark 2.1.4. In general, the converse of the above theorem does not hold. For example, consider the space X = ℓ¹. By Schur's lemma⁶, every weakly convergent sequence in ℓ¹ is norm convergent. Thus, every bounded operator on ℓ¹ maps every weakly convergent sequence onto a convergent sequence, but obviously not every bounded operator on ℓ¹ is compact (for instance, the identity operator is not). However, if the space X is reflexive, then the converse of Theorem 2.1.4 does hold.
⁶Schur's Lemma. Every weakly convergent sequence in ℓ¹ is norm convergent.
Theorem 2.1.5. Let X and Y be normed spaces such that X is reflexive, and T : X → Y
be a linear operator such that for any sequence {xn } in X,
xn ⇀ x implies T (xn ) → T (x).
Then T is a compact operator.
Proof. It is enough to show that for every bounded sequence {xn} in X, {T(xn)} has a convergent subsequence.
Let {xn } be a bounded sequence in X. By Eberlein-Shmulyan theorem7 , {xn } has a weakly
convergent subsequence, say {x̃n }. Then by hypothesis, {T (x̃n )} converges.
Exercise 2.1.13. Let X be an infinite dimensional normed space and T : X → X be a compact linear operator. If λ is a nonzero scalar, then prove that λI − T is not a compact operator.
Further deduce that the operator
S : (α₁, α₂, . . .) ↦ (α₁ + α₂, α₂ + α₃/2, α₃ + α₄/3, . . .)
is not a compact operator on ℓᵖ, 1 ≤ p ≤ ∞.
Exercise 2.1.14. Let 1 ≤ p ≤ ∞ and q be the conjugate exponent of p, that is, 1/p + 1/q = 1. Let (aᵢⱼ) be an infinite matrix with aᵢⱼ ∈ K, i, j ∈ N. Show that the operator (T(x))(i) = Σ_{j=1}^∞ aᵢⱼ x(j), x ∈ ℓᵖ, i ∈ N, is well defined and T : ℓᵖ → ℓʳ is a compact operator in each of the following cases:
(a) 1 ≤ p ≤ ∞, 1 ≤ r ≤ ∞ and Σ_{j=1}^∞ |aᵢⱼ| → 0 as i → ∞;
(b) 1 ≤ p ≤ ∞, 1 ≤ r < ∞ and Σ_{i=1}^∞ (Σ_{j=1}^∞ |aᵢⱼ|)^r < ∞;
(c) 1 < p ≤ ∞, 1 ≤ r ≤ ∞ and Σ_{j=1}^∞ |aᵢⱼ|^q → 0 as i → ∞;
(d) 1 < p ≤ ∞, 1 ≤ r < ∞ and Σ_{i=1}^∞ (Σ_{j=1}^∞ |aᵢⱼ|^q)^{r/q} < ∞.
Exercise 2.1.15. Let X be a Hilbert space and T : X → X be a bounded linear operator.
Show that T is compact if and only if for every sequence {xn } in X
hxn , ui → hx, ui,
for all u ∈ X
implies T (xn ) → T (x).
Exercise 2.1.16. Let X and Y be infinite dimensional normed spaces. If T : X → Y is a surjective bounded linear operator, then prove that T is not compact.
⁷Eberlein–Šmulian Theorem. Every bounded sequence in a reflexive normed space has a weakly convergent subsequence.
2.2 Eigenvalues and Eigenvectors
Let X and Y be linear spaces and T : X → Y be a linear operator. Recall that the range
R(T) and null space N(T) of T are defined, respectively, as
R(T ) = {T (x) : x ∈ X} and N (T ) = {x ∈ X : T (x) = 0}.
The dimension of R(T ) is called the rank of T and the dimension of N (T ) is called the
nullity of T .
It can be easily seen that a linear operator T : X → Y is one-one if and only if N (T ) = {0}.
Definition 2.2.1. Let X be a linear space and T : X → X be a linear operator. A scalar
λ ∈ K is called an eigenvalue of T if there exists a nonzero vector x ∈ X such that
T (x) = λx.
In this case, x is called an eigenvector of T corresponding to the eigenvalue λ.
The set of all eigenvalues of T is known as the eigenspectrum of T or the point spectrum of T, and it is denoted by σeig(T). Thus,
σeig (T ) := {λ ∈ K : ∃x 6= 0 such that T (x) = λx}.
Remark 2.2.1. Note that
λ ∈ σeig(T) ⇔ N(T − λI) ≠ {0},
and the nonzero elements of N(T − λI) are the eigenvectors of T corresponding to the eigenvalue λ.
The subspace N (T − λI) is called the eigenspace of T corresponding to the eigenvalue λ.
Remark 2.2.2. A linear operator may not have any eigenvalue at all. For example, the
linear operator T : R2 → R2 defined by
T ((α1 , α2 )) = (α2 , −α1 ),
for all (α1 , α2 ) ∈ R2
has no eigenvalue.
Remark 2.2.3. It can be easily seen that
• λ ∈ K is an eigenvalue of T if and only if the operator T − λI is not injective;
• if X is finite dimensional, then λ ∈ K is an eigenvalue of T if and only if the operator T − λI is not surjective.
Qamrul Hasan Ansari
Advanced Functional Analysis
Page 51
Example 2.2.1. Let X be any of the sequence spaces c00 , c0 , c, ℓp .
(a) Let {λn } be a bounded sequence of scalars. Let T : X → X be the diagonal operator
defined by
T (x)(j) = λj x(j), for all x ∈ X and j ∈ N.
Then it is easy to see that, for λ ∈ K, the equation T (x) = λx is satisfied for a nonzero
x ∈ X if and only if λ = λj for some j ∈ N. Hence,
σeig (T ) = {λ1 , λ2 , . . .}.
In fact, for n ∈ N, en ∈ X defined by en(j) = δ_{nj} is an eigenvector of T corresponding to the eigenvalue λn.
(b) Let T : X → X be the right shift operator, that is,
T : (α1 , α2 , . . .) 7→ (0, α1 , α2 , . . .).
Let λ ∈ K. Then the equation T (x) = λx is satisfied for some x = (α1 , α2 , . . .) ∈ X if and
only if
0 = λα1 , αj = λαj+1, for all j ∈ N.
This is possible only if αj = 0 for all j ∈ N. Thus, σeig (T ) = ∅.
(c) Let T : X → X be the left shift operator, that is,
T : (α1 , α2 , . . .) 7→ (α2 , α3 , . . .).
Then for x = (α₁, α₂, . . .) ∈ X and λ ∈ K,
T(x) = λx ⇔ α_{j+1} = λαⱼ for all j ∈ N ⇔ α_{n+1} = λⁿα₁ for all n ∈ N.
From this, we can infer the following:
Clearly, λ = 0 is an eigenvalue of T with a corresponding eigenvector e1 .
Now suppose that λ 6= 0. If λ is an eigenvalue, then a corresponding eigenvector is of the
form x = α1 (1, λ, λ2, λ3 . . .) for some nonzero α1 . Note that if α1 6= 0 and λ 6= 0, then
x = α1 (1, λ, λ2, λ3 . . .) does not belong to c00 . Thus, if X = c00 , then σeig (T ) = {0}.
Next consider the cases of X = c0 or X = ℓp for 1 ≤ p < ∞. In these cases, we see that
(1, λ, λ2, λ3 . . .) ∈ X if and only if |λ| < 1, so that σeig (T ) = {λ : |λ| < 1}.
For the case of X = c, we see that (1, λ, λ2 , λ3 . . .) ∈ X if and only if either |λ| < 1 or λ = 1.
Thus, in this case
σeig (T ) = {λ : |λ| < 1} ∪ {1}.
If X = ℓ∞ , then (1, λ, λ2, λ3 . . .) ∈ X if and only if |λ| ≤ 1. Thus, in this case
σeig (T ) = {λ : |λ| ≤ 1}.
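For finite truncations the eigenvalues can be computed directly and compared with Example 2.2.1. This is only an illustration: the truncated right shift is nilpotent and so has the eigenvalue 0, whereas the right shift on the sequence spaces has no eigenvalue at all, so finite sections need not reproduce the infinite dimensional picture. A short Python sketch (the size N and the choice λⱼ = 1/j are assumptions of the illustration):

import numpy as np

N = 6
lam = 1.0 / np.arange(1, N + 1)

D = np.diag(lam)                      # truncation of the diagonal operator of part (a)
S = np.diag(np.ones(N - 1), k=-1)     # truncation of the right shift: S e_n = e_{n+1}

print(np.sort(np.linalg.eigvals(D).real))   # the prescribed values 1/6, ..., 1/2, 1
print(np.linalg.eigvals(S))                 # all zeros: the truncated shift is nilpotent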
Theorem 2.2.1. Let X be a normed space and T : X → X be a compact linear operator.
Then zero is the only possible limit point of σeig (T ). In particular, σeig (T ) is a countable
subset of K.
Proof. Since
σeig(T) \ {0} = ⋃_{n=1}^∞ {λ ∈ σeig(T) : |λ| ≥ 1/n},
it is enough to show that the set Er := {λ ∈ σeig (T ) : |λ| ≥ r} is finite for each r > 0.
Assume that there is an r > 0 such that Er is an infinite set. Let {λn } be a sequence of
distinct elements in Er , that is, {λn } be a sequence of distinct eigenvalues of T such that
|λn | ≥ r. For n ∈ N, let xn be eigenvector of T corresponding to the eigenvalue λn , and let
Xn := span{x1 , x2 , . . . , xn }, n ∈ N. Then each Xn is a proper closed subspace of Xn+1 . By
Riesz Lemma⁸, there exists a sequence {un} in X such that un ∈ Xn, ‖un‖ = 1 for all n ∈ N, and
dist(un, Xm) ≥ 1/2,  for all m < n.
Therefore, for every m, n ∈ N with m < n, we have
‖T(un) − T(um)‖ = ‖(T − λnI)(un) − (T − λmI)(um) + λnun − λmum‖
               = ‖λnun − [λmum + (T − λmI)(um) − (T − λnI)(un)]‖.
Note that um ∈ Xm ⊆ X_{n−1} and
(T − λnI)(un) ∈ X_{n−1},  (T − λmI)(um) ∈ X_{m−1} ⊆ X_{n−1}.
Therefore, we have
‖T(un) − T(um)‖ ≥ |λn| dist(un, X_{n−1}) ≥ |λn|/2 ≥ r/2.
Thus, {T (un )} has no convergent subsequence, contradicting the fact that T is a compact
operator.
Let X be a normed space and T : X → X be a linear operator. Assume that λ is not an
eigenvalue of T . Then we can say that for y ∈ X, the operator equation
T (x) − λx = y
can have at most one solution, which depends continuously on y. In other words, one would
like to know that the inverse operator
(T − λI)−1 : R(T − λI) → X
⁸Riesz Lemma. Let X₀ be a proper closed subspace of a normed space X. Then for every r ∈ (0, 1), there exists x_r ∈ X such that ‖x_r‖ = 1 and dist(x_r, X₀) ≥ r.
is continuous, which is equivalent to saying that the operator T − λI is bounded below, that is,
there exists c > 0 such that
kT (x) − λxk ≥ ckxk,
for all x ∈ X.
Motivated by the above requirement, we generalize the concept of eigenspectrum.
Definition 2.2.2. Let X be a normed space and T : X → X be a linear operator. A scalar
λ is said to be an approximate eigenvalue of T if T − λI is not bounded below.
The set of all approximate eigenvalues of T is called the approximate eigenspectrum of T ,
and it is denoted by σapp (T ), that is,
σapp (T ) = {λ ∈ K : T − λI not bounded below}.
Remark 2.2.4. By the result⁹, λ ∉ σapp(T) if and only if T − λI is injective and (T − λI)⁻¹ : R(T − λI) → X is continuous.
The following result provides the characterization of σapp (T ).
Theorem 2.2.2. Let X be a normed space, T : X → X be a linear operator and λ ∈ K.
Then λ ∈ σapp (T ) if and only if there exists a sequence {xn } in X such that kxn k = 1
for all n ∈ N, and
kT (xn ) − λxn k → 0 as n → ∞.
Proof. If λ ∉ σapp(T), that is, if there exists c > 0 such that ‖T(x) − λx‖ ≥ c‖x‖ for all x ∈ X, then there cannot exist any sequence {xn} in X such that ‖xn‖ = 1 for all n ∈ N and ‖T(xn) − λxn‖ → 0 as n → ∞.
Conversely, assume that λ ∈ σapp(T), that is, there does not exist any c > 0 such that ‖T(x) − λx‖ ≥ c‖x‖ for all x ∈ X. Then for every n ∈ N, there exists un ∈ X such that
‖T(un) − λun‖ < (1/n)‖un‖.
Clearly, un ≠ 0 for all n ∈ N. Taking xn = un/‖un‖ for all n ∈ N, we have
‖xn‖ = 1 for all n ∈ N and ‖T(xn) − λxn‖ < 1/n → 0 as n → ∞.
This completes the proof.
⁹Let X and Y be normed spaces and T : X → Y be a linear operator. Then there exists γ > 0 such that ‖T(x)‖ ≥ γ‖x‖ for all x ∈ X if and only if T is injective and T⁻¹ : R(T) → X is continuous, and in that case, ‖T⁻¹(y)‖ ≤ (1/γ)‖y‖ for all y ∈ R(T).
Theorem 2.2.3. Let X be a normed space and T : X → X be a linear operator. Then,
σeig (T ) ⊆ σapp (T ).
If X is a finite dimensional space, then
σeig (T ) = σapp (T ).
Proof. Clearly, λ ∉ σapp(T) implies that T − λI is injective (one-one), so that λ ∉ σeig(T). Thus,
σeig(T) ⊆ σapp(T).
Now, assume that X is a finite dimensional space. If λ ∉ σeig(T), then T − λI is injective (one-one), so that, using the finite dimensionality of X, it follows that T − λI is surjective as well, and hence the operator (T − λI)⁻¹ is continuous. Consequently, T − λI is bounded below, that is, λ ∉ σapp(T). Thus, if X is finite dimensional, then σeig(T) = σapp(T).
The following example illustrates that the strict inclusion in σeig(T) ⊆ σapp(T) can occur
if the space X is infinite dimensional.
Example 2.2.2. Let X be any of the sequence spaces c00 , c0 , c, ℓp with any norm satisfying
ken k = 1 for all n ∈ N. Let T : X → X be defined by
(T (x))(j) = λj x(j),
for all x ∈ X and all j ∈ N.
where {λn } is a bounded sequence of scalars. As in Example 2.2.1, we have
σeig (T ) = {λ1 , λ2 , . . .}.
Now assume that λn → λ as n → ∞. Then we have
kT (en ) − λen k = |λn − λ| ken k = |λn − λ| → 0 as n → ∞.
Thus, we can conclude that λ ∈ σapp(T). Note that if λ ≠ λn for every n ∈ N, then λ ∉ σeig(T).
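The same phenomenon is easy to check numerically. For instance, for the (assumed) choice λn = 1 + 1/n one has ‖T(en) − 1·en‖ = 1/n → 0, so 1 is an approximate eigenvalue although no λn equals 1. A minimal sketch on a truncation of the sequence space:

import numpy as np

N = 10_000
n = np.arange(1, N + 1)
lam = 1.0 + 1.0 / n                  # lambda_n -> 1, with lambda_n != 1 for every n

for k in [1, 10, 100, 1000]:
    e_k = np.zeros(N)
    e_k[k - 1] = 1.0
    # ||T(e_k) - 1*e_k|| = |lambda_k - 1| = 1/k -> 0, so 1 is an approximate eigenvalue
    print(k, np.linalg.norm(lam * e_k - e_k))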
Theorem 2.2.4. Let X be a normed space and T : X → X be a linear compact operator.
Then the following assertions hold:
(a) σapp (T )\{0} = σeig (T )\{0}.
(b) If T is a finite rank operator, then σapp (T ) = σeig (T ).
(c) If X is infinite dimensional, then 0 ∈ σapp (T ).
(d) 0 is the only possible limit point of σapp (T ).
Proof. (a) We have already observed that σeig (T ) ⊆ σapp (T ). Now, suppose that 0 6= λ ∈
σapp (T ). We show that λ ∈ σeig (T ).
Let {xn } be a sequence in X such that kxn k = 1 for every n ∈ N and kT (xn ) − λxn k → 0 as
n → ∞. Since T is compact operator, there exists a subsequence {x̃n } of {xn } and y ∈ X
such that T (x̃n ) → y. Hence,
λx̃n = T (x̃n ) − (T (x̃n ) − λx̃n ) → y.
Then it follows that ‖y‖ = |λ| > 0 and
y = lim_{n→∞} T(x̃n) = T(lim_{n→∞} x̃n) = T(y/λ),
so that T(y) = λy with y ≠ 0, showing that λ ∈ σeig(T).
(b) Suppose that T is a finite rank operator. In view of (a), it is enough to show that 0 ∈ σapp(T) implies 0 ∈ σeig(T). Suppose that 0 ∉ σeig(T). Then T is injective, so that, by the hypothesis that T is of finite rank, X is finite dimensional. Therefore, σapp(T) = σeig(T), and consequently, 0 ∉ σapp(T).
(c) Let X be infinite dimensional. Suppose that 0 ∉ σapp(T), that is, T is bounded below. We show that every bounded sequence in X has a Cauchy subsequence, so that X would be of finite dimension, contradicting the assumption.
Let {xn} be a bounded sequence in X. Since T is compact, there is a subsequence {x̃n} of {xn} such that {T(x̃n)} converges. Since T is bounded below, it follows that {x̃n} is a Cauchy subsequence of {xn}.
(d) It follows from the proof of (a) and Theorem 2.2.1.
From part (c) of the above theorem, we can observe that an operator on an infinite dimensional space which is bounded below cannot be compact. The following example illustrates this point of view.
Example 2.2.3. Let X = ℓp with 1 ≤ p ≤ ∞. Let T be the right shift operator on X
defined as
T : (α1 , α2 , . . .) 7→ (0, α1 , α2 , . . .),
or the diagonal operator on X defined as
T : (α1 , α2 , . . .) 7→ (λ1 α1 , λ2 α2 , . . .)
associated with a sequence {λn} of nonzero scalars which converges to a nonzero scalar. In both cases T is bounded below, and hence 0 ∉ σapp(T). Thus, the fact that T is not compact follows from Theorem 2.2.4 (c).
We know that the range of an infinite rank compact operator on a Banach space is not closed.
Does Theorem 2.2.4 (c) hold for every bounded operator with nonclosed range as well? The
answer is in the affirmative if X is a Banach space, as the following theorem shows.
Theorem 2.2.5. Let X be a Banach space and T : X → X be a bounded linear operator.
If the range R(T ) of T is not closed in X, then 0 ∈ σapp (T ).
Proof. The proof follows from result “Let T : X → Y be a bounded linear operator from a
Banach space X to a normed space Y . If T is bounded below, then the range R(T ) of T is
a closed subspace of Y .”
Now we prove a topological property of σapp (T ).
Theorem 2.2.6. Let X be a normed space and T : X → X be a bounded linear operator.
Then σapp (T ) is a closed subset of K.
Proof. Let {λn} be a sequence in σapp(T) such that λn → λ for some λ ∈ K. Suppose that λ ∉ σapp(T). Let c > 0 be such that
‖T(x) − λx‖ ≥ c‖x‖,  for all x ∈ X.
Observe that, for every x ∈ X and n ∈ N,
‖T(x) − λnx‖ = ‖(T(x) − λx) − (λn − λ)x‖
             ≥ ‖T(x) − λx‖ − |λn − λ| ‖x‖
             ≥ (c − |λn − λ|)‖x‖.
Thus, for all large enough n, T − λnI is bounded below. More precisely, let N ∈ N be such that |λn − λ| ≤ c/2 for all n ≥ N. Then we have
‖T(x) − λnx‖ ≥ (c/2)‖x‖,  for all x ∈ X and all n ≥ N,
which shows that λn ∉ σapp(T) for all n ≥ N. Thus, we arrive at a contradiction.
The above result, in particular, shows that if {λn } is a sequence of eigenvalues of T ∈ B(X)
(T : X → X is bounded linear operator) such that λn → λ, then λ is an approximate
eigenvalue. One may ask whether every approximate eigenvalue arises in this manner. The
answer is, in general, negative, as the following example shows.
Example 2.2.4. Let X = ℓ1 and T be the right shift operator on ℓ1 . Then we know that
σeig (T ) = ∅. We show that σapp (T ) 6= ∅.
Let {xn} in ℓ¹ be defined by
xn(j) = 1/n if j ≤ n, and xn(j) = 0 if j > n.
Then we see that kxn k = 1 for all n ∈ N, and kT (xn ) − xn k1 = 2/n → 0 as n → ∞ so that
1 ∈ σapp (T ).
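The computation ‖T(xn) − xn‖₁ = 2/n is easy to reproduce on finite truncations. A minimal Python sketch (the finite array length is an assumption of the illustration):

import numpy as np

def right_shift(x):
    y = np.zeros_like(x)
    y[1:] = x[:-1]
    return y

for n in [2, 10, 100, 1000]:
    x_n = np.zeros(n + 1)
    x_n[:n] = 1.0 / n                          # x_n(j) = 1/n for j <= n, 0 otherwise
    diff = right_shift(x_n) - x_n
    print(n, np.abs(diff).sum(), 2.0 / n)      # the l^1 norm of T(x_n) - x_n equals 2/n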
A few other examples of operators whose eigenspectrum and approximate eigenspectrum are described completely are given in the book by M. T. Nair: Functional Analysis: A First Course, Prentice-Hall of India Private Limited, New Delhi, 2002.
2.3 Resolvent Operators
Let X be a normed space and T : X → X be a linear operator. We have seen in Remark
2.2.4 that λ ∉ σapp(T) if and only if T − λI is injective and (T − λI)⁻¹ : R(T − λI) → X is
continuous. That is, a scalar λ is not an approximate eigenvalue of T if and only if for every
y ∈ R(T − λI), there exists a unique x ∈ X such that
T (x) − λx = y,
and the map y 7→ x is continuous. Thus, if x and y are as above, and if {yn } is a sequence
in R(T − λI) such that yn → y, and {xn } in X satisfies T (xn ) − λxn = yn , then xn → x.
One would like to have the above situation not only for every y ∈ R(T − λI), but also for
every y ∈ X. Motivated by this requirement, we have the concept of spectrum of T .
Definition 2.3.1. The resolvent set of T , denoted by ρ(T ), is defined as
ρ(T ) = {λ ∈ K : T − λI is bijective and (T − λI)−1 ∈ B(X)},
where B(X) denotes the set of all bounded linear operators from X into itself.
The complement of ρ(T ) in K is called the spectrum of T and is denoted by σ(T ).
Thus, λ ∈ σ(T) if and only if either T − λI is not bijective or else (T − λI)⁻¹ ∉ B(X).
The elements of the spectrum are called the spectral values of T .
We observe that, for T ∈ B(X),
0 ∈ ρ(T) ⇔ there exists S ∈ B(X) such that TS = I = ST,
and, in that case, S = T −1 . If 0 ∈ ρ(T ), then we say that T is invertible in B(X). We note
that if T, S ∈ B(X) are invertible, then T S is invertible, and
(T S)−1 = S −1 T −1 .
In view of Proposition A10 , if λ ∈ ρ(T ), then T − λI is bounded below. Hence, every
approximate eigenvalue is a spectral value, that is,
σapp (T ) ⊆ σ(T ).
¹⁰Proposition A: Let X and Y be normed spaces and T : X → Y be a linear operator. Then there exists γ > 0 such that ‖T(x)‖ ≥ γ‖x‖ for all x ∈ X if and only if T is injective and T⁻¹ : R(T) → X is continuous, and in that case, ‖T⁻¹(y)‖ ≤ (1/γ)‖y‖ for all y ∈ R(T).
Clearly, if X is a finite dimensional space, then
σeig (T ) = σapp (T ) = σ(T ).
We have seen examples of infinite rank operators T for which σeig (T ) 6= σapp (T ). The
following example shows that strict inclusion is possible in σapp (T ) ⊆ σ(T ) as well.
Example 2.3.1. Let X = ℓp , 1 ≤ p ≤ ∞, and T be the right shift operator on X. We
have seen in Example 2.2.3 that 0 ∉ σapp(T). But 0 ∈ σ(T), since T is not onto. In fact, e₁ ∉ R(T).
Now we give some characterizations of the spectrum.
Theorem 2.3.1. Let X be a Banach space, T : X → X be a bounded linear operator
and λ ∈ K. Then λ ∈ σ(T ) if and only if either λ ∈ σapp (T ) or R(T − λI) is not dense
in X.
Proof. Clearly, if λ ∈ σapp (T ) or R(T − λI) is not dense in X, then λ ∈ σ(T ).
Conversely, suppose that λ ∈ σ(T). If λ ∉ σapp(T), then by Proposition A (see the footnote above) and Proposition B¹², the operator T − λI is injective, its inverse (T − λI)⁻¹ : R(T − λI) → X is continuous, and R(T − λI) is closed. Hence, R(T − λI) is not dense in X; otherwise, T − λI would be bijective with (T − λI)⁻¹ ∈ B(X), which contradicts the assumption that λ ∈ σ(T).

¹²Proposition B: Let T : X → Y be a bounded linear operator from a Banach space X to a normed space Y. If T is bounded below, then the range R(T) of T is a closed subspace of Y.
2.4 Spectral Theory of Compact Linear Operators
Theorem 2.4.1 (Null Space). Let X be a normed space and T : X → X be a linear
compact operator. Then for every λ 6= 0, the null space N (Tλ ) = {x ∈ D(Tλ ) : Tλ (x) =
0} of Tλ = T − λI is finite dimensional.
Proof. We prove this by showing that the closed unit ball B = {x ∈ N(Tλ) : ‖x‖ ≤ 1} is compact, since a normed space is finite dimensional if and only if its closed unit ball is compact.
Let {xn} be a sequence in B. Then {xn} is bounded as ‖xn‖ ≤ 1. Since T is compact, by Theorem 2.1.1, {T(xn)} has a convergent subsequence {T(xnk)}. Now xn ∈ B ⊆ N(Tλ) implies Tλ(xn) = T(xn) − λxn = 0, so that xn = λ⁻¹T(xn) because λ ≠ 0. Consequently, {xnk} = {λ⁻¹T(xnk)} also converges, and its limit lies in B as B is closed. Since {xn} was arbitrary, every sequence in B has a convergent subsequence with limit in B; therefore, B is compact. This implies that dim N(Tλ) < ∞.
Theorem 2.4.2. Let X be a normed space and T : X → X be a linear compact operator.
Then for every λ 6= 0, the range of Tλ = T − λI is closed.
Proof. The proof is divided into three steps.
Step 1. Suppose that Tλ(X) is not closed. Then there is a y in the closure of Tλ(X) with y ∉ Tλ(X), and a sequence {xn} in X such that
yn = Tλ(xn) → y.   (2.2)
Since Tλ(X) is a vector space, 0 ∈ Tλ(X). But y ∉ Tλ(X), so that y ≠ 0. This implies that yn ≠ 0 and xn ∉ N(Tλ) for all sufficiently large n. Without loss of generality, we may assume that this holds for all n. Since N(Tλ) is closed, the distance δn from xn to N(Tλ) is positive, that is,
δn = inf_{z∈N(Tλ)} ‖xn − z‖ > 0.
By the definition of an infimum, there is a sequence {zn } in N (Tλ) such that
an = kxn − zn k < 2δn .
(2.3)
Step 2. We show that
an = kxn − zn k → ∞,
as n → ∞.
(2.4)
Assume that this does not hold. Then {xn − zn} has a bounded subsequence. Since T is compact, it follows from Theorem 2.1.1 that {T(xn − zn)} has a convergent subsequence. Now from Tλ = T − λI and λ ≠ 0, we have I = λ⁻¹(T − Tλ). Since zn ∈ N(Tλ), we have Tλ(zn) = 0, and thus we obtain
xn − zn = (1/λ)(T − Tλ)(xn − zn) = (1/λ)[T(xn − zn) − Tλ(xn)].
{T (xn − zn )} has convergent subsequence and {Tλ (xn )} converges by (2.2); hence {xn − zn }
has convergent subsequence, say, xnk − znk → v. Since T is compact, T is continuous and so
is Tλ . Hence
Tλ (xnk − znk ) → Tλ (v).
Here Tλ (znk ) = 0 because zn ∈ N (Tλ), so by (2.2) we also have
Tλ (xnk − znk ) = Tλ (xnk ) → y.
Hence Tλ(v) = y. Thus y ∈ Tλ(X), which contradicts y ∉ Tλ(X) (as assumed in Step 1). Hence an = ‖xn − zn‖ → ∞ as n → ∞.
Step 3. Using an as in (2.4) and setting
wn = (1/an)(xn − zn),   (2.5)
we have ‖wn‖ = 1. Since an → ∞, while Tλ(zn) = 0 and {Tλ(xn)} converges, it follows that
Tλ(wn) = (1/an)Tλ(xn) → 0.   (2.6)
Using again I = λ⁻¹(T − Tλ), we obtain
wn = (1/λ)(T(wn) − Tλ(wn)).   (2.7)
Since T is compact and {wn} is bounded, {T(wn)} has a convergent subsequence. Furthermore, {Tλ(wn)} converges by (2.6). Hence (2.7) shows that {wn} has a convergent subsequence, say
wnj → w.   (2.8)
A comparison with (2.6) implies that Tλ (w) = 0. Hence w ∈ N (Tλ ). Since zn ∈ N (Tλ ), also
un = zn + an w ∈ N (Tλ ). Hence for the distance from xn to un , we must have
kxn − un k ≥ δn .
Writing un out and using (2.5) and (2.3), we thus obtain
δn ≤ ‖xn − zn − anw‖ = ‖anwn − anw‖ = an‖wn − w‖ < 2δn‖wn − w‖.
Dividing by 2δn > 0, we have 1/2 < ‖wn − w‖ for all n. This contradicts (2.8) and proves the result.
Exercise 2.4.1. Let X be a normed space and T : X → X be a linear operator. Let λ ∈ K be
such that T − λI is injective. Show that (T − λI)−1 : R(T − λI) → X is continuous if and
only if λ is not an approximate eigenvalue.
Exercise 2.4.2. Let X be a normed space and T : X → X be a bounded linear operator.
Let λ ∈ K be such that |λ| > kT k. Show that
(a) T − λI is bounded below,
(b) R(T − λI) is dense in X,
(c) (T − λI)−1 : R(T − λI) → X is continuous.
Exercise 2.4.3. Let X be a Banach space and T : X → X be a bounded linear operator.
Show that λ ∈ σeig (T ) if and only if there exists a nonzero operator S ∈ B(X) such that
(T − λI)S = 0.
Exercise 2.4.4. Give an example of a bijective operator T on a normed space X such that
0 ∈ σ(T).
3 Differential Calculus on Normed Spaces
3.1 Directional Derivatives and Their Properties
Throughout this section, unless otherwise specified, we assume that X is a real vector space
and f : X → R ∪ {±∞} is an extended real-valued function. In this section, we discuss
directional derivatives of f and present some of their basic properties.
Definition 3.1.1. Let f : X → R ∪ {±∞} be a function and x ∈ X be a point where f is
finite.
(a) The right-sided directional derivative of f at x in the direction d ∈ X is defined by
f′₊(x; d) = lim_{t→0⁺} [f(x + td) − f(x)]/t,
if the limit exists in [−∞, +∞], that is, finite or not.
(b) The left-sided directional derivative of f at x in the direction d ∈ X is defined by
f′₋(x; d) = lim_{t→0⁻} [f(x + td) − f(x)]/t,
if the limit exists in [−∞, +∞], that is, finite or not.
For d = 0 the zero vector in X, f+′ (x; 0) = f−′ (x; 0) = 0.
Since
f′₊(x; −d) = lim_{t→0⁺} [f(x − td) − f(x)]/t = lim_{τ→0⁻} [f(x + τd) − f(x)]/(−τ) = −f′₋(x; d),
we have
−f′₊(x; −d) = f′₋(x; d).
If f′₊(x; d) exists and f′₊(x; d) = f′₋(x; d), then it is called the directional derivative of f at x in the direction d. Thus, the directional derivative of f at x in the direction d ∈ X is defined by
f′(x; d) = lim_{t→0} [f(x + td) − f(x)]/t,
provided the limit exists in [−∞, +∞], that is, finite or not.
Remark 3.1.1.
(a) If f ′ (x; d) exists, then f ′ (x; −d) = −f ′ (x; d).
(b) If f : Rⁿ → R is differentiable, then the directional derivative of f at x ∈ Rⁿ in the direction d is given by
f′(x; d) = Σ_{i=1}^n dᵢ ∂f(x)/∂xᵢ = ⟨∇f(x), d⟩.
In particular, if d = eᵢ = (0, . . . , 0, 1, 0, . . . , 0), where 1 is at the ith place, then f′(x; eᵢ) = ∂f(x)/∂xᵢ, the partial derivative of f with respect to xᵢ.
For an extended convex function¹ f : X → R ∪ {±∞}, the following proposition shows that the function t ↦ [f(x + td) − f(x)]/t is monotonically nondecreasing on (0, ∞).
Proposition 3.1.1. Let f : X → R ∪ {±∞} be an extended real-valued convex function and x be a point in X where f is finite. Then, for each direction d ∈ X, the function
t ↦ [f(x + td) − f(x)]/t
is monotonically nondecreasing on (0, ∞).
Proof. Let x ∈ X be any point such that f(x) is finite, and s, t ∈ (0, ∞) with s ≤ t. Then, by the convexity of f, we have
f(x + sd) = f((s/t)(x + td) + (1 − s/t)x)
          ≤ (s/t)f(x + td) + (1 − s/t)f(x).
It follows that
[f(x + sd) − f(x)]/s ≤ [f(x + td) − f(x)]/t.
Thus, the function t ↦ [f(x + td) − f(x)]/t is monotonically nondecreasing on (0, ∞).

¹A function f : X → R ∪ {±∞} is said to be convex if for all x, y ∈ X with f(x), f(y) ≠ ±∞ and all α ∈ [0, 1], f(αx + (1 − α)y) ≤ αf(x) + (1 − α)f(y).
The following result ensures the existence of f′₊(x; d) and f′₋(x; d) when f is a convex function.
Proposition 3.1.2. Let f : X → R ∪ {±∞} be an extended real-valued convex function and x be a point in X where f is finite. Then, f′₊(x; d) and f′₋(x; d) exist for every direction d ∈ X. Also,
f′₊(x; d) = inf_{t>0} [f(x + td) − f(x)]/t,   (3.1)
and
f′₋(x; d) = sup_{t<0} [f(x + td) − f(x)]/t.   (3.2)
Proof. Let x ∈ X be any point such that f(x) is finite. For given t > 0, by the convexity of f, we have
f(x) = f((t/(1 + t))(x − d) + (1/(1 + t))(x + td))
     ≤ (t/(1 + t))f(x − d) + (1/(1 + t))f(x + td)
     = (1/(1 + t))(t f(x − d) + f(x + td)).
It follows that (1 + t)f(x) ≤ t f(x − d) + f(x + td), and so,
[f(x + td) − f(x)]/t ≥ f(x) − f(x − d).
Hence the family of values [f(x + td) − f(x)]/t, which is nonincreasing as t ↓ 0 by Proposition 3.1.1, is bounded below by the constant f(x) − f(x − d). Thus, the limit in the definition of f′₊(x; d) exists and is given by
f′₊(x; d) = lim_{t→0⁺} [f(x + td) − f(x)]/t = inf_{t>0} [f(x + td) − f(x)]/t.
Since f′₊(x; d) exists in every direction d, the equality −f′₊(x; −d) = f′₋(x; d) implies that f′₋(x; d) exists in every direction d.
The relation (3.2) can be established along the lines of the proof of (3.1).
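Propositions 3.1.1 and 3.1.2 are easy to observe numerically: for a convex function the right-sided quotients decrease to f′₊(x; d) as t ↓ 0 and the left-sided quotients increase to f′₋(x; d). A minimal Python sketch for the assumed example f(x) = x² + |x| on R, which is convex but not differentiable at 0:

def f(x):
    return x * x + abs(x)        # convex on R, not differentiable at x = 0

x, d = 0.0, 1.0
for t in [2.0, 1.0, 0.5, 0.25, 0.01, 1e-6]:
    q_plus = (f(x + t * d) - f(x)) / t        # nonincreasing as t decreases; tends to f'_+(x; d) = 1
    q_minus = (f(x - t * d) - f(x)) / (-t)    # nondecreasing as t decreases; tends to f'_-(x; d) = -1
    print(t, q_plus, q_minus)

# Here f'_-(x; d) = -1 < 1 = f'_+(x; d), in agreement with (3.3).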
Proposition 3.1.3. Let f : X → R ∪ {±∞} be an extended real-valued convex function and x be a point in X where f is finite. Then, f′₊(x; d) is a convex and positively homogeneousᵃ function of d, and
f′₋(x; d) ≤ f′₊(x; d).   (3.3)

ᵃA function f : X → R is said to be (a) convex if for all x, y ∈ X and all α ∈ [0, 1], f(αx + (1 − α)y) ≤ αf(x) + (1 − α)f(y); (b) positively homogeneous if for all x ∈ X and all r ≥ 0, f(rx) = rf(x).
Proof. Let λ > 0 be a real number. Then,
f′₊(x; λd) = lim_{t→0⁺} λ[f(x + λtd) − f(x)]/(λt) = λf′₊(x; d).
Hence, f′₊(x; ·) is positively homogeneous. Similarly, we can show that f′₋(x; ·) is also positively homogeneous.
Next, we show that f′₊(x; ·) is convex. Let d₁, d₂ ∈ X and λ₁, λ₂ ≥ 0 be such that λ₁ + λ₂ = 1. From the convexity of f, we have
f(x + t(λ₁d₁ + λ₂d₂)) − f(x) = f((λ₁ + λ₂)x + t(λ₁d₁ + λ₂d₂)) − (λ₁ + λ₂)f(x)
                             = f(λ₁(x + td₁) + λ₂(x + td₂)) − λ₁f(x) − λ₂f(x)
                             ≤ λ₁f(x + td₁) + λ₂f(x + td₂) − λ₁f(x) − λ₂f(x)
                             = λ₁(f(x + td₁) − f(x)) + λ₂(f(x + td₂) − f(x))
for all sufficiently small t > 0. Dividing by t > 0 and letting t → 0⁺, we obtain
f′₊(x; λ₁d₁ + λ₂d₂) ≤ λ₁f′₊(x; d₁) + λ₂f′₊(x; d₂).
Hence f′₊(x; d) is convex in d.
By the subadditivity of f′₊(x; ·), if f′₊(x; d) < +∞ and f′₊(x; −d) < +∞, we obtain
f′₊(x; d) + f′₊(x; −d) ≥ f′₊(x; 0) = 0,
and thus,
f′₊(x; d) ≥ −f′₊(x; −d) = f′₋(x; d).
If f′₊(x; d) = +∞ or f′₊(x; −d) = +∞, then the inequality (3.3) holds trivially.
Corollary 3.1.1. Let f : X → R ∪ {±∞} be an extended real-valued convex function and x be a point in X where f is finite. Then, for each direction d ∈ X,
f′(x; d) = inf_{t∈(0,∞)} [f(x + td) − f(x)]/t.
Proposition 3.1.4. Let f : X → R ∪ {±∞} be an extended real-valued convex function
and x be a point in X where f is finite. Then the following assertions hold:
(a) f′₊(x; ·) is sublinear.ᵃ
(b) For every y ∈ X,
f′₊(x; y − x) + f(x) ≤ f(y).   (3.4)
ᵃA function f : X → R is said to be sublinear if f(λx) = λf(x) for all x ∈ X and all λ ≥ 0, and f(x + y) ≤ f(x) + f(y) for all x, y ∈ X.
Proof. (a) It follows from Proposition 3.1.3.
(b) If y ∉ Dom(f), then the inequality (3.4) trivially holds. So, let y ∈ Dom(f). For t ∈ (0, 1), by the convexity of f we have
f((1 − t)x + ty) − f(x) ≤ t(f(y) − f(x)),
which implies that
[f(x + t(y − x)) − f(x)]/t ≤ f(y) − f(x).
Letting t → 0⁺, we obtain
f′₊(x; y − x) + f(x) ≤ f(y).
Corollary 3.1.2. Let f : Rn → R ∪ {+∞} be an extended real-valued convex function
and x ∈ Rn be such that f (x) is finite and f is differentiable at x. Then,
f(y) ≥ f(x) + ⟨∇f(x), y − x⟩,  for all y ∈ Rⁿ,
where ∇f (x) denotes the gradient of f at x.
Corollary 3.1.3. Let f : X → R ∪ {+∞} be an extended real-valued convex function
and x, y ∈ X be such that f (x) and f (y) are finite. Then,
f′₊(y; y − x) ≥ f′₊(x; y − x),   (3.5)
and
f′₋(y; y − x) ≥ f′₋(x; y − x).   (3.6)
In particular, if f : Rⁿ → R is differentiable at x and y, then
⟨∇f(y) − ∇f(x), y − x⟩ ≥ 0.   (3.7)
Proof. From Proposition 3.1.4 (b), we have
f(y) ≥ f(x) + f′₊(x; y − x),   (3.8)
and
f(x) ≥ f(y) + f′₊(y; x − y).   (3.9)
By adding inequalities (3.8) and (3.9), we obtain
−f+′ (y; x − y) ≥ f+′ (x; y − x).
Since −f+′ (x; −d) = f−′ (x; d), by using inequality (3.3), we get
f+′ (y; y − x) ≥ f−′ (y; y − x) = −f+′ (y; x − y) ≥ f+′ (x; y − x).
Hence, the inequality (3.5) holds. Similarly, we can establish the inequality (3.6). The
inequality (3.7) holds using Remark 3.1.1 (b).
3.2 Gâteaux Derivative and Its Properties
Definition 3.2.1. Let X be a normed space. A function f : X → (−∞, ∞] is said to be Gâteaux² differentiable at x ∈ int(Dom(f)) if there exists a continuous linear functional, denoted by f′_G(x), on X such that
f′(x; d) = f′_G(x)(d),  for all d ∈ X,   (3.10)
that is, lim_{t→0} [f(x + td) − f(x)]/t exists for all d ∈ X and equals the value of the functional f′_G(x) at d.
The continuous linear functional f′_G(x) : X → R is called the Gâteaux derivative of f at x. f′_G(x; d) is called the value of the Gâteaux derivative of f at x in the direction d.
Similarly, the Gâteaux derivative of an operator T : X → Y from a normed space X to another normed space Y can be defined as follows:
Definition 3.2.2. Let X and Y be normed spaces. An operator T : X → Y is said to be Gâteaux differentiable at x ∈ int(Dom(T)) if there exists a continuous linear operator T′_G(x) : X → Y such that
lim_{t→0} [T(x + td) − T(x)]/t = T′_G(x)(d),  for all d ∈ X.   (3.11)
The continuous linear operator T′_G(x) : X → Y is called the Gâteaux derivative of T at x. T′_G(x; d) is called the value of the Gâteaux derivative of T at x in the direction d.
The relation (3.11) is equivalent to the following relation:
lim_{t→0} ‖[T(x + td) − T(x)]/t − T′_G(x; d)‖ = 0.   (3.12)
Remark 3.2.1. If fG′ (x; d) exists, then fG′ (x; −d) = −fG′ (x; d).
Remark 3.2.2. Let X = Rⁿ be the Euclidean space with the standard inner product. If f : Rⁿ → R has continuous partial derivatives of order 1, then f is Gâteaux differentiable at x = (x₁, x₂, . . . , xₙ) ∈ Rⁿ, and in the direction d = (d₁, d₂, . . . , dₙ) ∈ Rⁿ its value is given by
f′_G(x; d) = Σ_{k=1}^n (∂f(x)/∂x_k) d_k,
where ∂f(x)/∂x_k denotes the partial derivative of f at the point x with respect to x_k. Thus,
∇_G f(x) = (∂f(x)/∂x₁, ∂f(x)/∂x₂, . . . , ∂f(x)/∂xₙ)
is the gradient of f at the point x.

²René Gâteaux (1889–1914) died in the First World War; his work was published by Lévy in 1919 with some improvements.
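The formula f′_G(x; d) = ⟨∇f(x), d⟩ can be checked by comparing a symmetric difference quotient with the gradient. A minimal Python sketch for the assumed smooth example f(x) = exp(x₁) + x₁x₂²:

import numpy as np

def f(x):
    return np.exp(x[0]) + x[0] * x[1] ** 2

def grad_f(x):
    return np.array([np.exp(x[0]) + x[1] ** 2, 2.0 * x[0] * x[1]])

x = np.array([0.3, -1.2])
d = np.array([2.0, 1.0])
t = 1e-6
quotient = (f(x + t * d) - f(x - t * d)) / (2.0 * t)   # numerical value of f'_G(x; d)
print(quotient, grad_f(x) @ d)                          # both approximate <grad f(x), d>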
Remark 3.2.3. Let X = Rⁿ and Y = Rᵐ be Euclidean spaces with the standard inner products. Let T : Rⁿ → Rᵐ be given by T = (f₁, f₂, . . . , fₘ), where each fᵢ : Rⁿ → R is a function, and let A = (aᵢⱼ) be an m × n matrix. Let d = eⱼ = (0, 0, . . . , 1, . . . , 0, 0), where 1 is at the jth place. Then
lim_{t→0} ‖[T(x + td) − T(x)]/t − Ad‖ = 0
implies that
lim_{t→0} [fᵢ(x + teⱼ) − fᵢ(x)]/t − aᵢⱼ = 0,
for all i = 1, 2, . . . , m and all j = 1, 2, . . . , n. This shows that fᵢ has partial derivatives at x and
∂fᵢ(x)/∂xⱼ = aᵢⱼ,  for i = 1, 2, . . . , m and j = 1, 2, . . . , n.
Hence T′_G(x) is represented by the m × n Jacobian matrix
T′_G(x) = (∂fᵢ(x)/∂xⱼ),  i = 1, . . . , m, j = 1, . . . , n.
We establish that the Gâteaux derivative is unique.
Proposition 3.2.1. Let X and Y be normed spaces, T : X → Y be an operator and
x ∈ int(Dom(T )). The Gâteaux derivative TG′ (x) of T at x is unique, provided it exists.
Proof. Assume that there exist two continuous linear operators T′_G(x) and T*′_G(x) which satisfy (3.12). Then, for all d ∈ X and for sufficiently small t, we have
‖T′_G(x; d) − T*′_G(x; d)‖ = ‖([T(x + td) − T(x)]/t − T*′_G(x; d)) − ([T(x + td) − T(x)]/t − T′_G(x; d))‖
                           ≤ ‖[T(x + td) − T(x)]/t − T′_G(x; d)‖ + ‖[T(x + td) − T(x)]/t − T*′_G(x; d)‖
                           → 0 as t → 0.
Therefore, ‖T′_G(x; d) − T*′_G(x; d)‖ = 0 for all d ∈ X. Hence, T′_G(x; d) = T*′_G(x; d) for all d ∈ X, and thus T′_G(x) ≡ T*′_G(x).
Qamrul Hasan Ansari
Advanced Functional Analysis
Page 71
Theorem 3.2.1. Let K be a nonempty open convex subset of a normed space X and
f : K → R be a convex function. If f is Gâteaux differentiable at x ∈ K, then fG′ (x; d)
is linear in d. Conversely, if f+′ (x; d) is linear in d, then f is Gâteaux differentiable at
x.
Proof. Let f be Gâteaux differentiable at x ∈ K. Then for all d ∈ X,
−f′₊(x; −d) = f′₋(x; d) = f′₊(x; d).
Therefore, for all d, u ∈ X, we have
f′₊(x; d) + f′₊(x; u) ≥ f′₊(x; d + u)
                     = −f′₊(x; −(d + u))
                     ≥ −f′₊(x; −d) − f′₊(x; −u)
                     = f′₊(x; d) + f′₊(x; u),
and thus,
f′₊(x; d + u) = f′₊(x; d) + f′₊(x; u).
Since f′_G(x; d) = f′₊(x; d) = f′₋(x; d), we have
f′_G(x; d + u) = f′_G(x; d) + f′_G(x; u).
For α ∈ R with α ≠ 0, we have
f′_G(x; αd) = lim_{αt→0} α[f(x + tαd) − f(x)]/(αt) = αf′_G(x; d).
Hence f′_G(x; d) is linear in d.
Conversely, assume that f′₊(x; d) is linear in d. Then,
0 = f′₊(x; d − d) = f′₊(x; d) + f′₊(x; −d).
Therefore, for all d ∈ X, we have
f′₋(x; d) = −f′₊(x; −d) = f′₊(x; d).
Thus, f is Gâteaux differentiable at x.
Remark 3.2.4. (a) For a nonconvex function f : X → R, all directional derivatives may exist at a point without being linear in the direction, so that f fails to be Gâteaux differentiable there. For example, consider the function f : R² → R defined by
f(x) = x₁²x₂/(x₁² + x₂²) if x ≠ (0, 0), and f(x) = 0 if x = (0, 0),
where x = (x₁, x₂). For d = (d₁, d₂) ≠ (0, 0) and t ≠ 0, we have
[f((0, 0) + t(d₁, d₂)) − f(0, 0)]/t = d₁²d₂/(d₁² + d₂²).
Then,
f′((0, 0); d) = lim_{t→0} [f((0, 0) + t(d₁, d₂)) − f(0, 0)]/t = d₁²d₂/(d₁² + d₂²).
Therefore, the directional derivative of f at (0, 0) exists in every direction, but f′((0, 0); d) is not linear in d.
(b) For a real-valued function f defined on Rⁿ, the partial derivatives may exist at a point even though f is not Gâteaux differentiable at that point. For example, consider the function f : R² → R defined by
f(x) = x₁x₂/(x₁² + x₂²) if x ≠ (0, 0), and f(x) = 0 if x = (0, 0),
where x = (x₁, x₂). For d = (d₁, d₂) ≠ (0, 0) and t ≠ 0, we have
[f((0, 0) + t(d₁, d₂)) − f(0, 0)]/t = d₁d₂/(t(d₁² + d₂²)).
Then the limit
lim_{t→0} [f((0, 0) + t(d₁, d₂)) − f(0, 0)]/t = lim_{t→0} d₁d₂/(t(d₁² + d₂²))
exists only if d = (d₁, 0) or d = (0, d₂). That is, f′_G((0, 0); d) does not exist, but
∂f(0, 0)/∂x₁ = ∂f(0, 0)/∂x₂ = 0, where (0, 0) is the zero vector in R².
(c) The existence, linearity and continuity of f′_G(x; d) in d do not imply the continuity of the function f. For example, consider the function f : R² → R defined by
f(x) = x₁³/x₂ if x₁ ≠ 0 and x₂ ≠ 0, and f(x) = 0 if x₁ = 0 or x₂ = 0,
where x = (x₁, x₂). Then,
f′_G((0, 0); d) = lim_{t→0} t³d₁³/(t²d₂) = 0
for all d = (d₁, d₂) ∈ R² with d₁, d₂ ≠ 0, and the difference quotient is identically 0 when d₁ = 0 or d₂ = 0. Thus, f′_G((0, 0); d) = 0 exists and is continuous and linear in d, but f is discontinuous at (0, 0) (for instance, f(x₁, x₁³) = 1 for x₁ ≠ 0). Hence a Gâteaux differentiable function is not necessarily continuous.
(d) The Gâteaux derivative fG′ (x; d) of a function f is positively homogeneous in the second
argument, that is, fG′ (x; rd) = rfG′ (x; d) for all r > 0. But, as we have seen in part (a),
in general, fG′ (x; d) is not linear in d.
Remark 3.2.5. The Gâteaux derivative of a linear operator T : X → Y is the operator itself. Indeed, if T : X → Y is a linear operator, then we have
T′_G(x; d) = lim_{t→0} [T(x + td) − T(x)]/t = lim_{t→0} [T(x) + tT(d) − T(x)]/t = T(d).
Hence T′_G(x; d) = T(d) for all x ∈ X and d ∈ X.
The following theorem shows that, for a convex function on Rⁿ, the existence of the partial derivatives at a point already implies Gâteaux differentiability at that point.
Theorem 3.2.2. Let K be nonempty convex subset of Rn and f : K → R be a convex
function. If the partial derivatives of f at x ∈ K exist, then f is Gâteaux differentiable
at x.
Proof. Suppose that the partial derivatives of f at x ∈ K exist. We show that f is Gâteaux differentiable at x with Gâteaux derivative given by the linear functional
f′_G(x; d) = Σ_{k=1}^n (∂f(x)/∂x_k) d_k,  for d = (d₁, d₂, . . . , dₙ) ∈ Rⁿ.
For the fixed x ∈ K, define a function g by
g(d) = f(x + d) − f(x) − Σ_{k=1}^n (∂f(x)/∂x_k) d_k.
Then g is convex, g(0) = 0, and ∂g(0)/∂x_k = 0 for all k = 1, 2, . . . , n, since the partial derivatives of f exist at x. Now, if {e₁, e₂, . . . , eₙ} is the standard basis for Rⁿ, then by the convexity of g, we have for λ ≠ 0,
g(λd) = g(λ Σ_{k=1}^n d_k e_k) = g((1/n) Σ_{k=1}^n nλd_k e_k) ≤ (1/n) Σ_{k=1}^n g(nλd_k e_k).
So,
g(λd)/λ ≤ Σ_{k=1}^n g(nλd_k e_k)/(nλ),  for λ > 0,
and
g(λd)/λ ≥ Σ_{k=1}^n g(nλd_k e_k)/(nλ),  for λ < 0.
Since
lim_{λ→0} g(nλd_k e_k)/(nλ) = d_k ∂g(0)/∂x_k = 0,  for all k = 1, 2, . . . , n,
we have
lim_{λ→0} g(λd)/λ = 0,
and so f is Gâteaux differentiable at x.
The mean value theorem in terms of Gâteaux derivative is the following.
Theorem 3.2.3. Let X be a normed space, K be a nonempty open subset of X, and T : K → R be Gâteaux differentiable at every point of K, with Gâteaux derivative T′_G(x; d) at x in the direction d. Then for any points x and x + d in K such that the segment {x + λd : 0 ≤ λ ≤ 1} lies in K, there exists s ∈ ]0, 1[ such that
T(x + d) − T(x) = T′_G(x + sd; d).   (3.13)
Proof. Since K is open and contains the segment from x to x + d, we can select an open interval I of real numbers containing [0, 1] such that x + λd belongs to K for all λ ∈ I. For all λ ∈ I, define
ϕ(λ) = T(x + λd).
Then,
ϕ′(λ) = lim_{τ→0} [ϕ(λ + τ) − ϕ(λ)]/τ = lim_{τ→0} [T(x + λd + τd) − T(x + λd)]/τ = T′_G(x + λd; d).   (3.14)
By applying the mean value theorem for real-valued functions of one variable to the restriction of ϕ to the closed interval [0, 1], we obtain
ϕ(1) − ϕ(0) = ϕ′(s),  for some s ∈ ]0, 1[.
By using (3.14) and the definition of ϕ, we obtain the desired result.
For a differentiable function on Rⁿ, the above theorem yields the following result.
Corollary 3.2.1. If, in the above theorem, T is a differentiable function from Rⁿ to R, then there exists s ∈ ]0, 1[ such that T(x + d) − T(x) = ⟨∇T(x + sd), d⟩.
Now we give the characterization of a convex functional in terms of Gâteaux derivative.
Theorem 3.2.4. Let X be a normed space and f : X → (−∞, ∞] be a proper function.
Let K be a convex subset of int(Dom(f )) such that f is Gâteaux differentiable at each
point of K. Then the following are equivalent:
(a) f is convex on K.
Qamrul Hasan Ansari
Advanced Functional Analysis
Page 75
(b) f (y) − f (x) ≥ fG′ (x)(y − x) for all x, y ∈ K.
(c) fG′ (y)(y − x) − fG′ (x)(y − x) ≥ 0 for all x, y ∈ K.
Proof. (a) ⇒ (b). Suppose that f is convex on K. Let x, y ∈ K. Then
f((1 − t)x + ty) ≤ (1 − t)f(x) + tf(y),  for all t ∈ (0, 1).
It follows that
[f(x + t(y − x)) − f(x)]/t ≤ f(y) − f(x),  for all t ∈ (0, 1).
Letting t → 0⁺, we obtain
f′_G(x)(y − x) ≤ f(y) − f(x).
Thus, (b) holds.
(b)⇒(c). Suppose that (b) holds. Let x, y ∈ K. Note that
fG′ (y)(x − y) ≤ f (x) − f (y)
and
fG′ (x)(y − x) ≤ f (y) − f (x).
Adding the above inequalities, we obtain
fG′ (y)(y − x) − fG′ (x)(y − x) ≥ 0.
(c) ⇒ (a). Suppose that (c) holds. Then we have
fG′ (u)(u − v) − fG′ (v)(u − v) ≥ 0,
for all u, v ∈ K.
(3.15)
Let x, y ∈ K. Define a function g : [0, 1] → R by
g(t) = f (x + t(y − x)),
for all t ∈ [0, 1].
Then
g ′ (t) = fG′ (x + t(y − x))(y − x).
Taking u = (1 − t)x + ty and v = (1 − s)x + sy in (3.15), for 0 ≤ s < t ≤ 1, and noting that u − v = (t − s)(y − x), we have
(g′(t) − g′(s))(t − s) = (f′_G((1 − t)x + ty) − f′_G((1 − s)x + sy))((t − s)(y − x)) ≥ 0.
Hence g′ is monotonically increasing on [0, 1], and therefore g is convex on [0, 1]. Thus,
g(λ) ≤ (1 − λ)g(0) + λg(1),  for all λ ∈ (0, 1),
that is, f(x + λ(y − x)) ≤ (1 − λ)f(x) + λf(y), and
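The characterizations (b) and (c) of Theorem 3.2.4 are easy to test numerically for a smooth convex function on Rⁿ. A minimal Python sketch for the assumed example f(x) = log Σᵢ exp(xᵢ), whose gradient is the softmax vector:

import numpy as np

def f(x):
    return np.log(np.exp(x).sum())      # log-sum-exp: a smooth convex function on R^n

def grad_f(x):
    e = np.exp(x)
    return e / e.sum()

rng = np.random.default_rng(1)
for _ in range(5):
    x, y = rng.standard_normal(4), rng.standard_normal(4)
    gap = f(y) - f(x) - grad_f(x) @ (y - x)        # (b): should be >= 0
    mono = (grad_f(y) - grad_f(x)) @ (y - x)       # (c): should be >= 0
    print(round(gap, 6), round(mono, 6))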
Exercise 3.2.1. Let f : R² → R be defined by
f(x₁, x₂) = 2x₂ e^{−1/x₁²} / (x₂² + e^{−2/x₁²}) if x₁ ≠ 0, and f(x₁, x₂) = 0 if x₁ = 0.
Prove that f is Gâteaux differentiable at 0 but not continuous there.
3.3 Fréchet Derivative and Its Properties
Definition 3.3.1. Let X and Y be normed spaces. An operator (possibly nonlinear) T : X → Y is said to be Fréchet differentiable at a point x ∈ int(Dom(T)) if there exists a continuous linear operator T′(x) : X → Y such that
lim_{‖d‖→0} ‖T(x + d) − T(x) − T′(x)(d)‖ / ‖d‖ = 0.   (3.16)
In this case, T′(x), also denoted by DT(x), is called the Fréchet derivative of T at the point x. The operator T′ : X → B(X, Y), which assigns the continuous linear operator T′(x) to a vector x, is known as the Fréchet derivative³ of T.
The domain of the operator T ′ contains naturally all vectors in X at which the Fréchet
derivative can be defined.
The meaning of the relation (3.16) is that for each ε > 0, there exists a δ > 0 (depending on
ε) such that
kT(x + d) − T(x) − T′(x)(d)k / kdk < ε,
for all d ∈ X satisfying 0 < kdk < δ.
Example 3.3.1. Let X = Rn and Y = Rm be Euclidean spaces with the standard inner product. If T : Rn → Rm is Fréchet differentiable at a point x ∈ Rn, then T is represented by T(x) = (f1(x1, . . . , xn), . . . , fm(x1, . . . , xn)), where each fj : Rn → R is a function, j = 1, 2, . . . , m. Let {ei : i = 1, 2, . . . , n} denote the standard basis in Rn. Then any vector d ∈ Rn can be represented as d = Σ_{i=1}^n di ei and T′(x)(d) = Σ_{i=1}^n di T′(x)(ei). Therefore,
T′(x)(ei) = lim_{t→0} [(f1(. . . , xi + t, . . .), . . . , fm(. . . , xi + t, . . .)) − (f1(. . . , xi, . . .), . . . , fm(. . . , xi, . . .))]/t
 = (∂f1(x)/∂xi, . . . , ∂fm(x)/∂xi).
Thus the Fréchet derivative T′(x) is expressed in the form
T′(x)(d) = Σ_{i=1}^n di (∂f1(x)/∂xi, . . . , ∂fm(x)/∂xi)
 = (Σ_{i=1}^n di ∂f1(x)/∂xi, . . . , Σ_{i=1}^n di ∂fm(x)/∂xi),
that is, T′(x)(d) is obtained by applying the m × n matrix [∂fj(x)/∂xi], j = 1, . . . , m, i = 1, . . . , n, to the column vector (d1, . . . , dn)ᵀ.
³The Fréchet derivative was introduced by the French mathematician Maurice Fréchet in 1925.
This shows that the Fréchet derivative T ′ (x) at a point x is a linear operator represented by
the Jacobian matrix.
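As an illustration of this representation (a sketch added for these notes, with an arbitrarily chosen map T : R³ → R²), the Jacobian can be approximated by forward differences and the Fréchet condition (3.16) checked numerically:

```python
# Numerical sketch: the Fréchet derivative of a smooth map R^n -> R^m is its Jacobian.
import numpy as np

def T(x):
    # example map R^3 -> R^2, chosen only for illustration
    return np.array([x[0] * x[1], np.sin(x[2]) + x[0] ** 2])

def jacobian_fd(T, x, h=1e-6):
    m, n = len(T(x)), len(x)
    J = np.zeros((m, n))
    for i in range(n):
        e = np.zeros(n); e[i] = h
        J[:, i] = (T(x + e) - T(x)) / h   # i-th column: partial derivatives w.r.t. x_i
    return J

x = np.array([1.0, 2.0, 0.5])
d = np.array([1e-3, -2e-3, 5e-4])
J = jacobian_fd(T, x)
# Fréchet condition: ||T(x+d) - T(x) - J d|| is small compared with ||d||
print(np.linalg.norm(T(x + d) - T(x) - J @ d) / np.linalg.norm(d))
```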
Remark 3.3.1. If T, S : X → Y are Fréchet differentiable at x ∈ X and λ is a scalar, then λT and T + S are Fréchet differentiable at x with
(λT)′(x) = λT′(x) and (T + S)′(x) = T′(x) + S′(x).
We establish the relation between Gâteaux and Fréchet differentiability.
Proposition 3.3.1. Let X and Y be normed spaces. If the operator T : X → Y is
Fréchet differentiable at x ∈ X, it is Gâteaux differentiable at x and these two derivatives
are equal.
Proof. Since T is Fréchet differentiable at x, we have
lim_{kdk→0} kT(x + d) − T(x) − T′(x)(d)k / kdk = 0.
Set d = td0 for t > 0 and any fixed d0 ≠ 0. Then
0 = lim_{t→0} kT(x + td0) − T(x) − tT′(x)(d0)k / (t kd0k)
  = lim_{t→0} (1/kd0k) k [T(x + td0) − T(x)]/t − T′(x)(d0) k,
which implies that
T′(x)(d0) = lim_{t→0} [T(x + td0) − T(x)]/t = TG′(x)(d0), for all d0 ∈ X.
Hence TG′(x) ≡ T′(x).
The following example shows that the converse of Proposition 3.3.1 is not true, that is, if an
operator T : X → Y is Gâteaux differentiable, then it may not be Fréchet differentiable.
Example 3.3.2. Let X = R² with the Euclidean norm k · k and let f : X → R be the function defined by
f(x, y) = x³y/(x⁴ + y²), if (x, y) ≠ (0, 0), and f(0, 0) = 0.
It can easily be seen that f is Gâteaux differentiable at (0, 0) with Gâteaux derivative fG′(0, 0) = 0.
On the other hand, for points of the form (x, x²) with x ≠ 0, we have
|f(x, x²)| / k(x, x²)k = |x³ · x²| / [(x⁴ + x⁴) √(x² + x⁴)] = 1/(2√(1 + x²)) → 1/2 ≠ 0 as x → 0.
Therefore, f is not Fréchet differentiable at (0, 0).
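The following small numerical sketch (illustrative only; the sample directions and step sizes are arbitrary choices) shows the two phenomena side by side: every directional quotient f(td)/t tends to 0, while the remainder quotient |f(h)|/khk along h = (x, x²) stays near 1/2.

```python
# Numerical sketch for Example 3.3.2: Gâteaux differentiable at (0,0), not Fréchet.
import numpy as np

def f(x, y):
    return 0.0 if (x, y) == (0.0, 0.0) else x**3 * y / (x**4 + y**2)

# Gâteaux: f(t d)/t -> 0 for every direction d
for d in [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (2.0, -3.0)]:
    t = 1e-4
    print(d, f(t * d[0], t * d[1]) / t)          # all close to 0

# Fréchet would need |f(h)|/||h|| -> 0 for ALL h -> 0; along h = (x, x^2) it tends to 1/2
for x in [1e-1, 1e-2, 1e-3]:
    h = np.array([x, x**2])
    print(abs(f(*h)) / np.linalg.norm(h))        # close to 0.5
```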
Theorem 3.3.1. Let X and Y be normed spaces. If the operator T : X → Y is Fréchet
differentiable at x ∈ X, then it is continuous at x.
Proof. Since T has a Fréchet derivative at x ∈ X, for each ε1 > 0, there exists a δ1 > 0
(depending on ε1 ) such that
kT (y) − T (x) − T ′ (x)(y − x)k < ε1 ky − xk,
for all y ∈ X satisfying ky − xk < δ1 . By the triangle inequality
kT (y) − T (x) − T ′ (x)(y − x)k ≥ kT (y) − T (x)k − kT ′ (x)(y − x)k,
we find for ky − xk < δ1 that
kT (y) − T (x)k < ε1 ky − xk + kT ′ (x)(y − x)k
≤ (ε1 + kT ′ (x)k)ky − xk.
Choose δ = min{δ1 , ε/(ε1 + kT ′ (x)k)} for each ε > 0. Then for all y ∈ X, we have
kT (y) − T (x)k < ε whenever
ky − xk < δ,
that is, T is continuous at x.
Theorem 3.3.2 (Chain Rule). Let X, Y and Z be normed spaces. If T : X → Y and
S : Y → Z are Fréchet differentiable, then the operator R := S ◦ T : X → Z is also
Fréchet differentiable and its Fréchet derivative is given by
R′ (x) = S ′ (T (x)) ◦ T ′ (x).
Proof. For exercise.
Theorem 3.3.3 (Mean Value Theorem). Let K be an open convex subset of a normed space X, a, b ∈ K, and let T : K → X be Fréchet differentiable at each x ∈ (a, b) (the open line segment joining a and b) and continuous on the closed line segment [a, b]. Then
kT(b) − T(a)k ≤ sup_{y∈(a,b)} kT′(y)k kb − ak.   (3.17)
Proof. Let F be a continuous linear functional on X and ϕ : [0, 1] → R be a function defined
by
ϕ(λ) = F ((T ((1 − λ)a + λb))), for all λ ∈ [0, 1].
By the classical mean value theorem of calculus applied to ϕ, there exist λ̂ ∈ (0, 1) and x = (1 − λ̂)a + λ̂b ∈ (a, b) such that
F(T(b) − T(a)) = F(T(b)) − F(T(a)) = ϕ(1) − ϕ(0) = ϕ′(λ̂) = F(T′(x)(b − a)),
where we have used the Chain Rule and the fact that a bounded linear functional is its own derivative. Therefore, for each continuous linear functional F on X,
|F(T(b) − T(a))| ≤ kFk kT′(x)k kb − ak ≤ kFk sup_{y∈(a,b)} kT′(y)k kb − ak.   (3.18)
Now, assuming T(b) ≠ T(a) (otherwise (3.17) is trivial), define a functional G on the one-dimensional subspace spanned by T(b) − T(a) by G(α(T(b) − T(a))) = α. Then kGk = kT(b) − T(a)k⁻¹. If F is a Hahn-Banach extension of G to the entire space, then kFk = kT(b) − T(a)k⁻¹ and, substituting in (3.18),
1 = |F(T(b) − T(a))| ≤ kT(b) − T(a)k⁻¹ sup_{y∈(a,b)} kT′(y)k kb − ak,
which gives (3.17).
Definition 3.3.2. If T : X → Y is Fréchet differentiable on an open set Ω ⊂ X and the
first Fréchet derivative T ′ at x ∈ Ω is Fréchet differentiable at x, then the Fréchet derivative
of T ′ at x is called the second derivative of T at x and is denoted by T ′′ (x).
Definition 3.3.3. Let X be a normed space. A function f : X → R is said to be twice Fréchet differentiable at x ∈ int(Dom(f)) if there exists A ∈ B(X, X∗) such that
lim_{kdk→0} kf′(x + d) − f′(x) − A(d)k∗ / kdk = 0.
In this case the second derivative of f at x is f′′(x) = A.
It may be observed that if T : X → Y is Fréchet differentiable on an open set Ω ⊂ X, then T′ is a mapping from Ω into B[X, Y]. Consequently, if T′′(x) exists, it is a bounded linear mapping from X into B[X, Y]. If T′′ exists at every point of Ω, then T′′ : Ω → B[X, B[X, Y]].
Theorem 3.3.4 (Taylor’s Formula for Differentiable Functions). Let T : Ω ⊂ X → Y
and let [a, a + h] be any closed segment in Ω. If T is Fréchet differentiable at a, then
T(a + h) = T(a) + T′(a)h + khk ε(h), where lim_{h→0} ε(h) = 0.
Theorem 3.3.5 (Taylor’s Formula for Twice Fréchet Differentiable Functions). Let T : Ω ⊂ X → Y and let [a, a + h] be any closed segment lying in Ω. If T is differentiable in Ω and twice differentiable at a, then
T(a + h) = T(a) + T′(a)h + (1/2)(T′′(a)h)h + khk² ε(h), where lim_{h→0} ε(h) = 0.
For proofs of these two theorems and other related results, we refer to the book by H. Cartan, Differential Calculus, Hermann, 1971.
3.4 Some Related Results
Let X be a Hilbert space and f : X → (−∞, ∞] be a proper functional such that f is Gâteaux
differentiable at a point x ∈ int(Dom(f)) with fG′(x) ∈ X∗. Then, by the Riesz representation theorem, there exists exactly one vector in X, denoted by ∇G f(x), such that
fG′(x)(d) = h∇G f(x), di for all d ∈ X, and kfG′(x)k∗ = k∇G f(x)k.   (3.19)
We say that ∇G f (x) is the Gâteaux gradient vector of f at x. Alternatively, we have
fG′(x)(d) = h∇G f(x), di = lim_{t→0} [f(x + td) − f(x)]/t,   for all d ∈ X.
Example 3.4.1. Let X be a real inner product space and f : X → R be the functional defined by f(x) = kxk = √(hx, xi) for all x ∈ X. Then f is Gâteaux differentiable on X \ {0} with ∇G f(x) = x/kxk for 0 ≠ x ∈ X.
In fact, for x, d ∈ X with x ≠ 0 and all t ∈ R, we have
f(x + td) − f(x) = √(kxk² + 2thx, di + t²kdk²) − kxk
 = (2thx, di + t²kdk²) / (√(kxk² + 2thx, di + t²kdk²) + kxk),
which implies that
fG′(x)(d) = lim_{t→0} [f(x + td) − f(x)]/t = hx, di/kxk = h∇G f(x), di,
where ∇G f(x) = x/kxk.
Lemma 3.4.1 (Descent lemma). Let X be a Hilbert space and f : X → R be a differentiable convex function such that ∇f : X → X is β-Lipschitz continuous for some β > 0. Then the following assertions hold:
(a) For all x, y ∈ X,
f(y) − f(x) ≤ (β/2) ky − xk² + hy − x, ∇f(x)i.
(b) For all x ∈ X,
f(x − (1/β)∇f(x)) ≤ f(x) − (1/(2β)) k∇f(x)k².
Proof. (a) Let x, y ∈ X. Define φ : [0, 1] → R by
φ(t) = f (x + t(y − x)),
for all t ∈ [0, 1].
Note that φ(0) = f(x), φ(1) = f(y) and φ′(t) = hy − x, ∇f(x + t(y − x))i. Hence
f(y) = f(x) + ∫_0^1 φ′(t) dt
 = f(x) + ∫_0^1 hy − x, ∇f(x + t(y − x))i dt
 = f(x) + ∫_0^1 hy − x, ∇f(x + t(y − x)) − ∇f(x)i dt + hy − x, ∇f(x)i
 ≤ f(x) + ∫_0^1 ky − xk k∇f(x + t(y − x)) − ∇f(x)k dt + hy − x, ∇f(x)i
 ≤ f(x) + ∫_0^1 β t ky − xk² dt + hy − x, ∇f(x)i
 = f(x) + (β/2) ky − xk² + hy − x, ∇f(x)i.
(b) Replacing y by x − (1/β)∇f(x) in (a), we get (b).
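For a concrete illustration of the descent lemma (a sketch added here, not part of the notes), consider the quadratic f(x) = (1/2)hQx, xi on Rⁿ with Q symmetric positive semidefinite, for which ∇f(x) = Qx is Lipschitz with constant β = kQk; the matrix and the test points below are arbitrary choices.

```python
# Numerical sketch of Lemma 3.4.1 for a quadratic f(x) = 0.5 x^T Q x.
import numpy as np

rng = np.random.default_rng(0)
M = rng.standard_normal((4, 4))
Q = M.T @ M                       # symmetric positive semidefinite
beta = np.linalg.eigvalsh(Q)[-1]  # Lipschitz constant of the gradient

f = lambda x: 0.5 * x @ Q @ x
grad = lambda x: Q @ x

x, y = rng.standard_normal(4), rng.standard_normal(4)
lhs = f(y) - f(x)
rhs = 0.5 * beta * np.linalg.norm(y - x) ** 2 + (y - x) @ grad(x)
print(lhs <= rhs + 1e-12)         # part (a)

z = x - grad(x) / beta
print(f(z) <= f(x) - np.linalg.norm(grad(x)) ** 2 / (2 * beta) + 1e-12)  # part (b)
```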
Definition 3.4.1. Let X be an inner product space. An operator T : X → X is said to be γ-inverse strongly monotone or γ-cocoercive if there exists γ > 0 such that
hT (x) − T (y), x − yi ≥ γkT (x) − T (y)k2,
for all x, y ∈ X.
Proposition 3.4.1. Let X be a Hilbert space and f : X → R be a Fréchet differentiable convex function such that ∇f : X → X is β-Lipschitz continuous for some β > 0. Then ∇f is (1/β)-inverse strongly monotone, that is,
h∇f(x) − ∇f(y), x − yi ≥ (1/β) k∇f(x) − ∇f(y)k².
Proof. Let x ∈ X. Define g : X → R by
g(z) = f (z) − f (x) − h∇f (x), z − xi,
for all z ∈ X.
Note that
g(x) = 0 and g(z) = f(z) − f(x) − h∇f(x), z − xi ≥ 0, for all z ∈ X,
by the convexity of f, and
∇g(z) = ∇f(z) − ∇f(x), for all z ∈ X.
Clearly, inf_{z∈X} g(z) = g(x) = 0. One can also see that
k∇g(u) − ∇g(v)k = k∇f(u) − ∇f(v)k ≤ β ku − vk, for all u, v ∈ X.
Let y ∈ X. From Lemma 3.4.1(b) applied to g, we have
inf_{z∈X} g(z) ≤ g(y) − (1/(2β)) k∇g(y)k²,
which implies that
0 ≤ f(y) − f(x) − h∇f(x), y − xi − (1/(2β)) k∇f(y) − ∇f(x)k².
Similarly, interchanging the roles of x and y, we have
0 ≤ f(x) − f(y) − h∇f(y), x − yi − (1/(2β)) k∇f(x) − ∇f(y)k².
Adding the last two inequalities, we obtain
0 ≤ h∇f(x) − ∇f(y), x − yi − (1/β) k∇f(x) − ∇f(y)k².
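Proposition 3.4.1 can likewise be checked numerically for the same kind of quadratic (again an illustrative sketch with arbitrary data, not part of the notes):

```python
# Numerical sketch of the cocoercivity inequality for f(x) = 0.5 x^T Q x, beta = ||Q||.
import numpy as np

rng = np.random.default_rng(1)
M = rng.standard_normal((5, 5))
Q = M.T @ M
beta = np.linalg.eigvalsh(Q)[-1]
grad = lambda x: Q @ x

x, y = rng.standard_normal(5), rng.standard_normal(5)
g = grad(x) - grad(y)
print((g @ (x - y)) >= (g @ g) / beta - 1e-12)   # 1/beta-inverse strong monotonicity
```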
Definition 3.4.2. Let X be an inner product space. An operator T : X → X is said to be
(a) nonexpansive if
kT (x) − T (y)k ≤ kx − yk,
for all x, y ∈ X;
(b) firmly nonexpansive if
kT (x) − T (y)k2 + k(I − T )(x) − (I − T )(y)k2 ≤ kx − yk2,
for all x, y ∈ X,
where I is the identity operator.
It can easily be seen that every firmly nonexpansive mapping is nonexpansive, but the converse may not hold. For example, consider the negative of the identity operator, that is, −I.
Corollary 3.4.1. Let X be a Hilbert space and f : X → R be a Fréchet differentiable
convex function. Then
∇f is nonexpansive ⇔ ∇f is firmly nonexpansive.
Exercise 3.4.1. Let X be a Hilbert space and Y be an inner product space, A ∈ B(X, Y) and b ∈ Y. Define a functional f : X → R by
f(x) = (1/2) kA(x) − bk², for all x ∈ X.
Then prove that f is Fréchet differentiable on X with ∇f(x) = A∗(A(x) − b) and ∇²f(x) = A∗A for each x ∈ X.
Proof. Let x ∈ X. Then, for y ∈ X, we have
f(x + y) − f(x) = (1/2) hA(x) − b + A(y), A(x) − b + A(y)i − f(x)
 = (1/2) [hA(x) − b, A(x) − bi + hA(x) − b, A(y)i + hA(y), A(x) − bi + hA(y), A(y)i] − f(x)
 = hA(x) − b, A(y)i + (1/2) kA(y)k²
 = hA∗(A(x) − b), yi + (1/2) kA(y)k².
Thus,
|f(x + y) − f(x) − hA∗(A(x) − b), yi| = (1/2) kA(y)k² ≤ (kAk²/2) kyk², for all y ∈ X.
Therefore, f is Fréchet differentiable on X with
f′(x)y = h∇f(x), yi, for all y ∈ X,
where ∇f(x) = A∗(A(x) − b). It is easy to see that ∇²f(x) = A∗A.
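A quick numerical sanity check of this exercise (an illustrative sketch; the matrix A, the vector b and the step size are arbitrary) compares the closed-form gradient A∗(Ax − b) with a central finite difference:

```python
# Gradient check for f(x) = 0.5 ||Ax - b||^2.
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((6, 4))
b = rng.standard_normal(6)
f = lambda x: 0.5 * np.linalg.norm(A @ x - b) ** 2

x = rng.standard_normal(4)
grad = A.T @ (A @ x - b)

h = 1e-6
fd = np.array([(f(x + h * e) - f(x - h * e)) / (2 * h) for e in np.eye(4)])
print(np.max(np.abs(grad - fd)))   # tiny: finite differences agree with the closed form
```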
Exercise 3.4.2. Let X be an inner product space and a ∈ X. Define a functional f : X → R by
f(x) = (1/2) kx − ak², for all x ∈ X.
Then prove that f is Fréchet differentiable on X with ∇f(x) = x − a and ∇²f(x) = I for each x ∈ X.
Exercise 3.4.3. Let X be a Hilbert space and A : X → X be a bounded linear operator. Let b ∈ X, c ∈ R and define
f(x) = (1/2) hAx, xi − hb, xi + c, for all x ∈ X.
Then prove that f is Fréchet differentiable on X with ∇f(x) = (1/2)(A + A∗)(x) − b and ∇²f(x) = (1/2)(A + A∗) for each x ∈ X.
Proof. Let x ∈ X. Then, for y ∈ X, we have
f(x + y) = (1/2) hA(x + y), x + yi − hb, x + yi + c
 = (1/2) [hA(x) + A(y), xi + hA(x) + A(y), yi] − hb, x + yi + c
 = (1/2) [hA(x), xi + hA(y), xi + hA(x), yi + hA(y), yi] − hb, x + yi + c
 = (1/2) hA(x), xi − hb, xi + c + (1/2) [hy, A∗(x)i + hA(x), yi + hA(y), yi] − hb, yi
 = f(x) + h(1/2)(A + A∗)(x) − b, yi + (1/2) hA(y), yi.
Thus,
|f(x + y) − f(x) − h(1/2)(A + A∗)(x) − b, yi| ≤ (1/2) kAk kyk², for all y ∈ X.
Therefore,
lim_{kyk→0} |f(x + y) − f(x) − h(1/2)(A + A∗)(x) − b, yi| / kyk = 0,
i.e., f is Fréchet differentiable with ∇f(x) = (1/2)(A + A∗)(x) − b. One can see that ∇²f(x) = (1/2)(A + A∗).
Exercise 3.4.4. Let X be a Hilbert space and A : X → X be a self-adjoint, bounded linear operator which is strongly positive, i.e., there exists α > 0 such that
hA(x), xi ≥ α kxk², for all x ∈ X.
Let b ∈ X and define a quadratic function f : X → R by
f(x) = (1/2) hA(x), xi + hx, bi, for all x ∈ X.
Then prove that ∇f(·) = A(·) + b is α-strongly monotone and kAk-Lipschitz continuous.
When X = R^N is finite dimensional, the operator A above is a positive definite matrix. Then ∇²f(x) = A and
λ_min kxk² ≤ hAx, xi ≤ λ_max kxk², for all x ∈ R^N,
where λ_min and λ_max are the minimum and maximum eigenvalues of A, respectively. Hence α = λ_min ≤ λ_max = kAk.
3.5 Subdifferential and Its Properties
The concept of a subdifferential plays an important role in problems of optimization and
convex analysis. In this section, we study subgradients and subdifferentials of R∞ -valued
convex functions and their properties in normed spaces.
We have already seen in Theorem 3.2.4 that if X is a normed space, f : X → (−∞, ∞] is a proper convex function which is Gâteaux differentiable at x ∈ int(Dom(f)), then the following inequality holds:
fG′(x)(y − x) + f(x) ≤ f(y), for all y ∈ X.   (3.20)
The inequality (3.20) motivates us to introduce the notion of another kind of differentiability
when f is not Gâteaux differentiable at x, but the inequality (3.20) holds.
Definition 3.5.1. Let X be a normed space, f : X → (−∞, ∞] be a proper function and
x ∈ Dom(f ). Then an element j ∈ X ∗ is said to be a subgradient of f at x if
f (x) ≤ f (y) + hx − y, ji for all y ∈ X.
(3.21)
The set (possibly empty)
∂f(x) := {j ∈ X∗ : f(x) ≤ f(y) + hx − y, ji, for all y ∈ X}
of subgradients of f at x is called the subdifferential or Fenchel subdifferential of f at x.
Clearly, ∂f(x) may be empty even if f(x) ∈ R. For the case x ∉ Dom(f), we set ∂f(x) = ∅. Thus, the subdifferential of a proper convex function f is a set-valued mapping ∂f : X ⇒ X∗ defined by
∂f (x) = {j ∈ X ∗ : f (x) ≤ f (y) + hx − y, ji for all y ∈ X}.
The domain of the subdifferential ∂f is defined by
Dom(∂f ) = {x ∈ X : ∂f (x) 6= ∅}.
Obviously, Dom(∂f ) ⊆ Dom(f ).
Remark 3.5.1.
(a) Dom(∂f) is always a subset of Dom(f).
(b) If f(x) = ∞ for some x ∈ X, then ∂f(x) = ∅.
Definition 3.5.2. Let X be a Hilbert space, f : X → (−∞, ∞] be a proper function. The
subdifferential of f is the set-valued map ∂f : X ⇒ X defined by
∂f (x) = {u ∈ X : f (x) ≤ f (y) + hx − y, ui for all y ∈ X},
for x ∈ X.
(3.22)
Then f is said to be subdifferentiable at x ∈ X if ∂f(x) ≠ ∅. The elements of ∂f(x) are called the subgradients of f at x.
Example 3.5.1. Let f : R → R be the function defined by f(x) = |x| for x ∈ R. Then
∂f(x) = {−1}, if x < 0;   ∂f(x) = [−1, 1], if x = 0;   ∂f(x) = {1}, if x > 0.
Note that f is convex and continuous, but not differentiable at 0. Clearly, f is subdifferentiable at 0 with ∂f(0) = [−1, 1]. Also Dom(∂f) = Dom(f) = R.
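The subgradient inequality in Example 3.5.1 is easy to test numerically (an illustrative sketch; the sampled grid and the chosen values of j are assumptions of the example):

```python
# Check the subgradient inequality f(0) <= f(y) + (0 - y) * j for f(x) = |x| at x = 0.
import numpy as np

ys = np.linspace(-5, 5, 101)
for j in [-1.0, -0.3, 0.0, 0.7, 1.0]:
    print(j, np.all(0.0 <= np.abs(ys) + (0.0 - ys) * j))   # True: j in [-1, 1]
print(1.5, np.all(0.0 <= np.abs(ys) + (0.0 - ys) * 1.5))   # False: 1.5 is not a subgradient
```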
Example 3.5.2. Define f : R → (−∞, ∞] by
f(x) = 0, if x = 0, and f(x) = ∞, otherwise.
Then
∂f(x) = R, if x = 0, and ∂f(x) = ∅, if x ≠ 0.
Note that f is not continuous at 0, but f is subdifferentiable at 0 with ∂f(0) = R.
Example 3.5.3. Define f : R → (−∞, ∞] by
f(x) = ∞, if x < 0, and f(x) = −√x, if x ≥ 0.
Then
∂f(x) = ∅, if x ≤ 0, and ∂f(x) = {−1/(2√x)}, if x > 0.
Note that Dom(f) = [0, ∞) and f is not continuous at 0. Moreover, ∂f(0) = ∅ and Dom(∂f) = (0, ∞). Thus, f is not subdifferentiable at 0 even though 0 ∈ Dom(f).
We now consider some more general functions.
Example 3.5.4. Let X be an inner product space, a ∈ X, and define f : X → R by f(x) = kx − ak for x ∈ X. Then
∂f(x) = B1(0) := {u ∈ X : kuk ≤ 1}, if x = a, and ∂f(x) = {(x − a)/kx − ak}, if x ≠ a,
where B1(0) is the closed unit ball centred at 0 ∈ X.
Example 3.5.5. Let K be a nonempty closed convex subset of a normed space X and iK the indicator function of K, i.e.,
iK(x) = 0, if x ∈ K, and iK(x) = ∞, otherwise.
Then
∂iK(x) = {j ∈ X∗ : hx − y, ji ≥ 0 for all y ∈ K}, for x ∈ K.
Proof. Since the indicator function is a proper lower semicontinuous convex function on X,
from (3.21), we have
∂iK (x) = {j ∈ X ∗ : iK (x) − iK (y) ≤ hx − y, ji for all y ∈ K} .
Remark 3.5.2. Dom(iK ) = Dom(∂iK ) = K and ∂iK (x) = {0} for each x ∈ int(K).
3.5.1 Properties of Subdifferentials
Definition 3.5.3. Let X be an inner product space. A set-valued mapping T : X ⇒ X is
said to be
(a) monotone if for all x, y ∈ X,
hu − v, x − yi ≥ 0,
for all u ∈ T (x) and v ∈ T (y);
(b) maximal monotone if it is monotone and its graph Graph(T ) := {(x, u) ∈ X × X :
u ∈ T (x)} is not contained properly in the graph of any other monotone set-valued
mapping.
Theorem 3.5.1. Let X be a Hilbert space and f : X → (−∞, ∞] be a proper convex
function. Then ∂f is monotone.
Proof. Let x, y ∈ X and u ∈ ∂f(x), v ∈ ∂f(y) be arbitrary. Then
f(x) ≤ f(z) + hx − z, ui, for all z ∈ X,   (3.23)
and
f(y) ≤ f(w) + hy − w, vi, for all w ∈ X.   (3.24)
Taking z = y in (3.23) and w = x in (3.24) and adding the resulting inequalities, we get
f(x) + f(y) ≤ f(y) + f(x) + hx − y, ui + hy − x, vi,
which implies that
hu − v, x − yi ≥ 0.
Thus, ∂f is monotone.
Theorem 3.5.2. Let X be a Hilbert space and f : X → (−∞, ∞] be a proper lower
semicontinuous convex function. Then R(I + ∂f ) = X.
Proof. Clearly, R(I + ∂f) ⊆ X, so it suffices to show that X ⊆ R(I + ∂f). For this, let x0 ∈ X and define
ψ(x) = (1/2) kxk² + f(x) − hx, x0i, for all x ∈ X.
Note that ψ has an affine lower bound and lim_{kxk→∞} ψ(x) = ∞. Hence, from Theorem A⁵, there exists z ∈ Dom(f) such that
ψ(z) = inf_{x∈X} ψ(x).
Thus, for all x ∈ X, from Proposition P⁶, we have
kxk² ≤ kzk² + 2hx − z, xi
and
(1/2) kzk² + f(z) − hz, x0i ≤ (1/2) kxk² + f(x) − hx, x0i,
which imply that
f(z) ≤ f(x) + (1/2)(kxk² − kzk²) + hz − x, x0i
 ≤ f(x) + hx − z, xi + hz − x, x0i
 = f(x) + hx − z, x − x0i.
Let u ∈ X. Define zt = (1 − t)z + tu for t ∈ (0, 1). Hence, for t ∈ (0, 1), we obtain
f (z) ≤ (1 − t)f (z) + tf (u) + thu − z, zt − x0 i,
which gives us that
f (z) ≤ f (u) + hu − z, zt − x0 i.
Letting limit as t → 0+ , we get
f (z) ≤ f (u) + hu − z, z − x0 i.
Hence x0 − z ∈ ∂f (z), i.e., x0 ∈ (I + ∂f )(z) ⊆ R(I + ∂f ). Thus, X ⊆ R(I + ∂f ).
From Theorem 3.5.2, we have
⁵Theorem A. Let K be a nonempty closed convex subset of a Hilbert space X and f : K → (−∞, +∞] be a proper lower semicontinuous convex function such that f(xn) → ∞ as kxnk → ∞. Then there exists x̄ ∈ K such that f(x̄) = inf_{x∈K} f(x).
⁶Proposition P. Let X be an inner product space. Then for any x, y ∈ X, kxk² ≤ kyk² − 2hy − x, xi.
Corollary 3.5.1. Let X be a Hilbert space and f : X → (−∞, ∞] be a proper lower
semicontinuous convex function. Then R(I + λ∂f ) = X for all λ ∈ (0, ∞).
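Theorem 3.5.2 and Corollary 3.5.1 say that the resolvent (I + λ∂f)⁻¹ is defined on all of X. For the model case f(x) = |x| on R this resolvent is the familiar soft-thresholding map; the following sketch (illustrative, not from the notes; the test points and λ = 0.5 are arbitrary) verifies the defining inclusion x0 − z ∈ λ∂f(z) at a few points.

```python
# Resolvent of the subdifferential of f(x) = |x| on R: soft thresholding.
def soft_threshold(x0, lam):
    # the unique z with x0 in z + lam * subdiff(|.|)(z)
    if x0 > lam:
        return x0 - lam
    if x0 < -lam:
        return x0 + lam
    return 0.0

lam = 0.5
for x0 in [-2.0, -0.3, 0.0, 0.2, 1.7]:
    z = soft_threshold(x0, lam)
    sub = [1.0] if z > 0 else ([-1.0] if z < 0 else [-1.0, 1.0])  # subdifferential of |.| at z
    ok = min(sub) - 1e-12 <= (x0 - z) / lam <= max(sub) + 1e-12
    print(x0, z, ok)   # ok is True in every case
```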
Theorem 3.5.3. Let X be a Hilbert space and f : X → (−∞, ∞] be a proper lower
semicontinuous convex function. Then ∂f is maximal monotone.
Theorem 3.5.4. Let X be a Hilbert space and f : X → (−∞, ∞] be a proper convex
function. Then, for each x ∈ Dom(f ), ∂f (x) is closed and convex.
Proof. Exercise.
Theorem 3.5.5. Let X be a Hilbert space and f : X → (−∞, ∞] be a proper convex
function. Then (∂f)⁻¹(0) is closed and convex.
Proof. Exercise.
We now study some calculus of subgradients.
Proposition 3.5.1. Let X be a Hilbert space and f : X → (−∞, ∞] be a proper
function. Then
∂(λf ) = λ∂f, for all λ ∈ (0, ∞).
Proof. Let λ ∈ (0, ∞). Then, for x ∈ X, we have
z ∈ ∂(λf)(x) ⇔ λf(x) ≤ λf(y) + hx − y, zi, for all y ∈ X
 ⇔ f(x) ≤ f(y) + hx − y, (1/λ)zi, for all y ∈ X
 ⇔ (1/λ)z ∈ ∂f(x)
 ⇔ z ∈ λ∂f(x).
Therefore,
∂(λf) = λ∂f, for all λ ∈ (0, ∞).
Theorem 3.5.6. Let X be a Hilbert space and let f, g : X → (−∞, ∞] be proper convex functions. Suppose that there exists x0 ∈ Dom(f) ∩ Dom(g) at which f is continuous. Then
∂(f + g) = ∂f + ∂g.
Theorem 3.5.7. Let X be a Hilbert space and f : X → (−∞, ∞] be a proper convex
function. Let x ∈ Dom(f ) and u ∈ X. Then
u ∈ ∂f (x) ⇔ hy, ui ≤ f ′ (x; y),
for all y ∈ X.
Proof. Suppose that u ∈ ∂f(x) and let y ∈ X. From (3.22), we have
f(x) ≤ f(x + ty) + hx − (x + ty), ui, for all t ∈ (0, ∞).
Hence
hy, ui ≤ [f(x + ty) − f(x)]/t, for all t ∈ (0, ∞).
Letting t → 0⁺, we get
hy, ui ≤ f′(x; y).
Conversely, suppose that
hy, ui ≤ f′(x; y), for all y ∈ X.   (3.25)
From (3.4) and (3.25), we have
hy − x, ui ≤ f ′ (x; y − x) ≤ f (y) − f (x),
for all y ∈ X.
This shows that u ∈ ∂f (x).
We now give a relation between Gâteaux differentiability and subdifferentiability.
Theorem 3.5.8. Let X be a Banach space and f : X → (−∞, ∞] a proper convex function. Let f be Gâteaux differentiable at a point x0 ∈ Dom(f). Then x0 ∈ Dom(∂f) and ∂f(x0) = {fG′(x0)}. In this case,
(d/dt) f(x0 + ty)|_{t=0} = hy, ∂f(x0)i = hy, fG′(x0)i, for all y ∈ X.
Proof. Since f is Gâteaux differentiable at x0 ∈ Dom(f), we have
hy, fG′(x0)i = lim_{t→0} [f(x0 + ty) − f(x0)]/t, for all y ∈ X.
By the convexity of f, we have
f(x0 + λ(y − x0)) = f((1 − λ)x0 + λy) ≤ (1 − λ)f(x0) + λf(y), for all y ∈ X and λ ∈ (0, 1),
i.e.,
[f(x0 + λ(y − x0)) − f(x0)]/λ ≤ f(y) − f(x0), for all y ∈ X and λ ∈ (0, 1).
It follows that
hy − x0 , fG′ (x0 )i ≤ f (y) − f (x0 ),
for all y ∈ X,
i.e., fG′ (x0 ) ∈ ∂f (x0 ). This shows that x0 ∈ Dom(∂f ).
Now, let jx0 ∈ ∂f(x0). Then we have
f(x0) − f(u) ≤ hx0 − u, jx0i, for all u ∈ X.
Let h ∈ X and take u = x0 + λh for λ ∈ (0, ∞). Then
[f(x0 + λh) − f(x0)]/λ ≥ hh, jx0i, for all λ ∈ (0, ∞).
Letting λ → 0⁺, we get
hh, fG′(x0) − jx0i ≥ 0, for all h ∈ X,
and, replacing h by −h, it follows that jx0 = fG′(x0). Therefore, ∂f(x0) = {fG′(x0)}.
Corollary 3.5.2. Let X be a Hilbert space and f : X → (−∞, ∞] be a proper convex function such that f is Gâteaux differentiable at a point x0 ∈ Dom(f). Then x0 ∈ Dom(∂f) and ∂f(x0) = {∇G f(x0)}. In this case,
(d/dt) f(x0 + ty)|_{t=0} = hy, ∂f(x0)i = hy, ∇G f(x0)i, for all y ∈ X.
Exercise 3.5.1. Let X be a Banach space. Then prove that
∂kxk = {j ∈ X ∗ : hx, ji = kxk kjk∗ , kjk∗ = 1} ,
for all x ∈ X \ {0}.
Proof. Let j ∈ ∂kxk. Then
hy − x, ji ≤ kyk − kxk ≤ ky − xk, for all y ∈ X.   (3.26)
It follows that j ∈ X∗ and kjk∗ ≤ 1. Taking y = 0 in (3.26) gives kxk ≤ hx, ji ≤ kxk kjk∗ ≤ kxk, which gives
hx, ji = kxk and kjk∗ = 1.
Thus,
∂kxk ⊆ {j ∈ X ∗ : hx, ji = kxk and kjk∗ = 1} .
Suppose that j ∈ X ∗ such that j ∈ {f ∈ X ∗ : hx, f i = kxk and kf k∗ = 1}. Then hx, ji = kxk
and kjk∗ = 1. Thus,
hy − x, ji = hy, ji − kxk ≤ kyk − kxk,
for all y ∈ X,
that is, j ∈ ∂kxk. It follows that
{j ∈ X ∗ : hx, ji = kxk and kjk∗ = 1} ⊆ ∂kxk.
Therefore, ∂kxk = {j ∈ X ∗ : hx, ji = kxk and kjk∗ = 1}
Exercise 3.5.2. Let X be a Hilbert space and a ∈ X. Define f : X → R by
f(x) = (1/2) kx − ak², for all x ∈ X.
Then prove that ∂f(x) = {x − a} for all x ∈ X.
Hint 3.5.1. It is easy to see that f is Fréchet differentiable with ∇f(x) = x − a for all x ∈ X (see Exercise 3.4.2); then apply Corollary 3.5.2.
Exercise 3.5.3. Let X be a Hilbert space. Then prove that ∂((1/2) k · k²) = I.
4 Geometry of Banach Spaces
Among all infinite dimensional Banach spaces, Hilbert spaces have the most important and useful geometric properties. Namely, the norm induced by the inner product satisfies the parallelogram law, and it is well known that a normed space is an inner product space if and only if its norm satisfies the parallelogram law. These geometric properties make numerous problems posed in inner product spaces more manageable than their counterparts in general normed spaces. Consequently, to extend some inner product techniques and properties, we study the geometric properties of normed spaces. In this chapter, we
study strict convexity, modulus of convexity, uniform convexity and smoothness of normed
spaces. Most of the results presented in this chapter are given in the standard books on
functional analysis, convex analysis and geometry of Banach spaces, namely, recommended
books 1 and 3.
4.1 Strict Convexity and Modulus of Convexity
It is well known that the norm of a normed space X is convex, that is,
kλx + (1 − λ)yk ≤ λkxk + (1 − λ)kyk,
for all x, y ∈ X and λ ∈ [0, 1].
There are several norms of normed spaces which are strictly convex, that is,
kλx + (1 − λ)yk < λkxk + (1 − λ)kyk,   (4.1)
for all x, y ∈ X which do not lie on a common ray emanating from the origin and all λ ∈ (0, 1).
We denote by SX the unit sphere SX = {x ∈ X : kxk = 1} in a normed space X. If x, y ∈ SX
with x 6= y, then (4.1) reduces to
kλx + (1 − λ)yk < 1,
for all λ ∈ (0, 1),
which says that the unit sphere SX contains no line segments. This suggests strict convexity
of normed space.
Definition 4.1.1. A normed space X is said to be strictly convex if
x, y ∈ SX with x 6= y
⇒
kλx + (1 − λ)yk < 1,
for all λ ∈ (0, 1).
Geometrically speaking, the normed space X is strictly convex if the boundary of the unit
sphere in X contains no line segments.
Clearly, if the norm k · k is strictly convex in the sense of (4.1), then X is strictly convex. Conversely, if X is strictly convex and x, y ∈ SX with x ≠ y, then kλx + (1 − λ)yk < 1 = λkxk + (1 − λ)kyk (because kxk = kyk = 1) for all λ ∈ (0, 1).
Before giving the examples of strictly convex normed spaces, we present the following characterizations.
Proposition 4.1.1. The following assertions are equivalent:
(a) X is strictly convex.
(b) If x 6= y and kxk = kyk = 1 (that is, x, y ∈ SX ), then kx + yk < 2.
(c) If for any x, y, z ∈ X, kx − yk = kx − zk + kz − yk, then there exists λ ∈ [0, 1]
such that z = λx + (1 − λ)y.
Proof. (a) ⇒ (b): Assume that X is strictly convex. Let x, y ∈ SX with x ≠ y. Then kxk = kyk = 1 and, by the strict convexity of X, kλx + (1 − λ)yk < 1 for all λ ∈ (0, 1). Taking λ = 1/2, we obtain kx + yk < 2, that is, (b) holds.
(b) ⇒ (a): Suppose, on the contrary, that there exist x, y ∈ X with x ≠ y, kxk = kyk = 1 and λ0 ∈ (0, 1) such that kλ0x + (1 − λ0)yk = 1, that is, λ0x + (1 − λ0)y ∈ SX. Take λ0 < λ < 1. Then
λ0x + (1 − λ0)y = (λ0/λ)[λx + (1 − λ)y] + (1 − λ0/λ)y
(as 1 − λ0 = λ0(1 − λ)/λ + 1 − λ0/λ), and hence
1 = kλ0x + (1 − λ0)yk ≤ (λ0/λ) kλx + (1 − λ)yk + (1 − λ0/λ) kyk.
This implies that
(λ0/λ) kλx + (1 − λ)yk ≥ 1 − (1 − λ0/λ) = λ0/λ,
that is, kλx + (1 − λ)yk ≥ 1. Similarly, for 0 < λ < λ0, we have kλx + (1 − λ)yk ≥ 1. In particular, for λ = 1/2, we get k(x + y)/2k ≥ 1, that is, kx + yk ≥ 2, contradicting (b).
(a) ⇒ (c): Let x, y, z ∈ X be such that kx − yk = kx − zk + kz − yk. If kx − zk = 0 or kz − yk = 0, the conclusion is immediate, so suppose that kx − zk ≠ 0, kz − yk ≠ 0 and, without loss of generality, kx − zk ≤ kz − yk. Then
k (1/2)(x − z)/kx − zk + (1/2)(z − y)/kz − yk k
 ≥ k (1/2)(x − z)/kx − zk + (1/2)(z − y)/kx − zk k − k (1/2)(z − y)/kx − zk − (1/2)(z − y)/kz − yk k
 = (1/2) kx − yk/kx − zk − (1/2) (kz − yk − kx − zk)/kx − zk
 = (1/2) (kx − yk − kz − yk + kx − zk)/kx − zk
 = (1/2) (kx − zk + kz − yk − kz − yk + kx − zk)/kx − zk = 1,
since kx − yk = kx − zk + kz − yk. Now (x − z)/kx − zk and (z − y)/kz − yk belong to SX; if they were distinct, then by (b) we would have
k (1/2)(x − z)/kx − zk + (1/2)(z − y)/kz − yk k < 1,
a contradiction. Therefore,
(x − z)/kx − zk = (z − y)/kz − yk,
and this yields
z = [kz − yk/(kx − zk + kz − yk)] x + [kx − zk/(kx − zk + kz − yk)] y,
which is of the form z = λx + (1 − λ)y with λ ∈ [0, 1].
(c) ⇒ (b): Suppose, on the contrary, that there exist x ≠ y with kxk = kyk = k(x + y)/2k = 1. Then kx − (−y)k = kx + yk = 2 = kx − 0k + k0 − (−y)k. By (c) applied to the points x, −y and z = 0, there exists λ ∈ [0, 1] such that 0 = λx + (1 − λ)(−y) = λx − (1 − λ)y, that is, x = ((1 − λ)/λ) y (note λ ≠ 0, 1 since kxk = kyk = 1). Hence kxk = ((1 − λ)/λ) kyk, and since kxk = kyk = 1, we get λ = 1/2. Therefore, x = y, a contradiction.
Remark 4.1.1. (a) The assertion (b) in Proposition 4.1.1 says that the midpoint (x+y)/2
of two distinct points x and y on the unit sphere SX of X does not lie on SX . In other
words, if x, y ∈ SX with kxk = kyk = k(x + y)/2k, then x = y.
(b) The assertion (c) in Proposition 4.1.1 says that any three points x, y, z ∈ X satisfying kx − yk = kx − zk + kz − yk must lie on a line; specifically, if kx − zk = r1, ky − zk = r2 and kx − yk = r = r1 + r2, then z = (r2/r)x + (r1/r)y.
We give some examples of strictly convex spaces.
Example 4.1.1. Consider X = Rn, n ≥ 2, with the norm k·k2 defined by
kxk2 = (Σ_{i=1}^n xi²)^{1/2}, x = (x1, x2, . . . , xn) ∈ Rn.
Then X is strictly convex. Indeed, since k·k2 comes from an inner product, the parallelogram law gives, for x ≠ y with kxk2 = kyk2 = 1,
kx + yk2² = 2kxk2² + 2kyk2² − kx − yk2² = 4 − kx − yk2² < 4,
so that kx + yk2 < 2. For instance, for x = (1, 0, . . . , 0) and y = (0, 1, 0, . . . , 0), we have kxk2 = kyk2 = 1 and kx + yk2 = √2 < 2.
[Figure: the unit sphere SX in R² with respect to the norm kxk2 = k(x1, x2)k2 = (x1² + x2²)^{1/2}.]
Example 4.1.2. Consider X = Rn , n ≥ 2 with norm k · k1 defined by
kxk1 = |x1 | + |x2 | + · · · + |xn |,
x = (x1 , x2 , . . . , xn ) ∈ Rn .
Then X is not strictly convex. To see this, let x = (1, 0, 0, . . . , 0) ∈ Rn and y = (0, 1, 0, . . . , 0) ∈
Rn . Then x 6= y, kxk1 = 1 = kyk1, but kx + yk1 = 2.
[Figure: the unit sphere SX in R² with respect to the norm kxk1 = k(x1, x2)k1 = |x1| + |x2|.]
Example 4.1.3. Consider X = Rn, n ≥ 2, with the norm k·k∞ defined by
kxk∞ = max_{1≤i≤n} |xi|, x = (x1, x2, . . . , xn) ∈ Rn.
Then X is not strictly convex. Indeed, for x = (1, 0, 0, . . . , 0) ∈ Rn and y = (1, 1, 0, . . . , 0) ∈ Rn, we have x ≠ y, kxk∞ = 1 = kyk∞, but kx + yk∞ = 2.
[Figure: the unit sphere SX in R² with respect to the norm kxk∞ = k(x1, x2)k∞ = max{|x1|, |x2|}.]
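Examples 4.1.1–4.1.3 can be illustrated with a short computation (a sketch added here; the chosen vectors are those of the examples):

```python
# Midpoint norms of two distinct unit vectors in R^2 for the 2-, 1- and sup-norms.
import numpy as np

x2, y2 = np.array([1.0, 0.0]), np.array([0.0, 1.0])
print(np.linalg.norm((x2 + y2) / 2, 2))        # ~0.707 < 1: the 2-norm is strictly convex

x1, y1 = np.array([1.0, 0.0]), np.array([0.0, 1.0])
print(np.linalg.norm((x1 + y1) / 2, 1))        # 1.0: the 1-norm is not strictly convex

xi, yi = np.array([1.0, 0.0]), np.array([1.0, 1.0])
print(np.linalg.norm((xi + yi) / 2, np.inf))   # 1.0: the sup-norm is not strictly convex
```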
Example 4.1.4. The space C[a, b] of all real-valued continuous functions defined on [a, b] with the norm kfk = sup_{a≤t≤b} |f(t)| is not strictly convex. Indeed, choose two functions f and g defined as follows:
f(t) = 1, for all t ∈ [a, b], and g(t) = (b − t)/(b − a), for all t ∈ [a, b].
Then, clearly, f, g ∈ C[a, b], kf k = kgk = k(f + g)/2k = 1, however, f 6= g. Therefore,
C[a, b] is not strictly convex.
Exercise 4.1.1. Show that the spaces L1 , L∞ and c0 are not strictly convex.
The following proposition provides some equivalent conditions of strict convexity.
Proposition 4.1.2. Let X be a normed space. Then X is strictly convex if and only if
for each nonzero f ∈ X ∗ , there exists at most one point x ∈ X with kxk = 1 such that
hx, f i = f (x) = kf k∗ .
Proof. Let X be a strictly convex normed space and f ∈ X∗ nonzero. Suppose there exist two distinct points x, y ∈ X with kxk = kyk = 1 such that f(x) = f(y) = kfk∗. If λ ∈ (0, 1), then
kfk∗ = λf(x) + (1 − λ)f(y)   (since f(x) = f(y) = kfk∗)
 = f(λx + (1 − λ)y)   (because f is linear)
 ≤ kfk∗ kλx + (1 − λ)yk
 < kfk∗   (since kλx + (1 − λ)yk < 1),
which is a contradiction. Therefore, there exists at most one point x in X with kxk = 1 such that f(x) = kfk∗.
Conversely, suppose that for each nonzero f ∈ X∗ there is at most one such point, and assume that x, y ∈ SX with x ≠ y satisfy k(x + y)/2k = 1. By the Hahn-Banach Theorem (Corollary 6.0.1), there exists a functional j ∈ X∗ such that
kjk∗ = 1 and h(x + y)/2, ji = k(x + y)/2k = 1.
Since hx, ji ≤ kxk kjk∗ = 1, hy, ji ≤ kyk kjk∗ = 1 and hx, ji + hy, ji = hx + y, ji = 2, we must have hx, ji = hy, ji = 1 = kjk∗. This implies, by hypothesis, that x = y. Therefore, X is strictly convex.
Proposition 4.1.3. A normed space X is strictly convex if and only if the functional
h(x) := kxk2 is strictly convex, that is,
kλx + (1 − λ)yk2 < λkxk2 + (1 − λ)kyk2,
for all x, y ∈ X, x 6= y and λ ∈ (0, 1).
Proof. Suppose that X is strictly convex. Let x, y ∈ X and λ ∈ (0, 1). Then we have
kλx + (1 − λ)yk² ≤ (λkxk + (1 − λ)kyk)²   (4.2)
 = λ²kxk² + 2λ(1 − λ)kxk kyk + (1 − λ)²kyk²   (4.3)
 ≤ λ²kxk² + λ(1 − λ)(kxk² + kyk²) + (1 − λ)²kyk²   (4.4)
 = λkxk² + (1 − λ)kyk².
Hence h is convex.
Now we show that equality cannot hold for x ≠ y. Assume that there are x, y ∈ X, x ≠ y, with
kλ0x + (1 − λ0)yk² = λ0kxk² + (1 − λ0)kyk², for some λ0 ∈ (0, 1).
Then equality holds in (4.2) and (4.4); from (4.4) we obtain 2kxk kyk = kxk² + kyk², that is, kxk = kyk, and from (4.2) we get kλ0x + (1 − λ0)yk = kxk = kyk. Since x ≠ y, this contradicts the strict convexity of X (after normalizing x and y to unit vectors).
Conversely, assume that the functional h(x) := kxk² is strictly convex. Let x, y ∈ X be such that x ≠ y, kxk = kyk = 1 and kλx + (1 − λ)yk = 1 for some λ ∈ (0, 1). Then
kλx + (1 − λ)yk² = 1 = λkxk² + (1 − λ)kyk²,
a contradiction to the strict convexity of h. Therefore, X is strictly convex.
Exercise 4.1.2. Let X be a normed space. Prove that X is strictly convex if and only if
for every 1 < p < ∞,
kλx + (1 − λ)ykp < λkxkp + (1 − λ)kykp,
for all x, y ∈ X, x 6= y and λ ∈ (0, 1).
Proof. Suppose that X is strictly convex, and let x, y ∈ X with x ≠ y. By the strict convexity of X, we have
kλx + (1 − λ)yk < λkxk + (1 − λ)kyk, for all λ ∈ (0, 1),
whenever x and y do not lie on a common ray from the origin; when they do, the claimed inequality follows directly from the strict convexity of t ↦ tᵖ on [0, ∞), so we may assume strict inequality here. Therefore, for every 1 < p < ∞, we have
kλx + (1 − λ)ykᵖ < (λkxk + (1 − λ)kyk)ᵖ, for all λ ∈ (0, 1).   (4.5)
If kxk = kyk, then
kλx + (1 − λ)ykp < kxkp = λkxkp + (1 − λ)kykp.
Assume that kxk ≠ kyk, and consider the function t ↦ tᵖ for 1 < p < ∞. It is strictly convex, so
((a + b)/2)ᵖ < (aᵖ + bᵖ)/2, for all a, b ≥ 0 with a ≠ b.
Hence from (4.5) with λ = 1/2, we have
k(x + y)/2kᵖ ≤ ((kxk + kyk)/2)ᵖ < (1/2)(kxkᵖ + kykᵖ).   (4.6)
If λ ∈ (0, 1/2], then from (4.5), we have
kλx + (1 − λ)ykᵖ = k2λ (x + y)/2 + (1 − 2λ)ykᵖ   (after adding and subtracting λy)
 ≤ (2λ k(x + y)/2k + (1 − 2λ)kyk)ᵖ
 ≤ 2λ k(x + y)/2kᵖ + (1 − 2λ)kykᵖ
 < 2λ (1/2)(kxkᵖ + kykᵖ) + (1 − 2λ)kykᵖ   (by (4.6))
 = λkxkᵖ + (1 − λ)kykᵖ.
The proof is similar if λ ∈ (1/2, 1).
The converse part is obvious.
Proposition 4.1.4. Let X be a normed space. Then X is strictly convex if and only if for any two linearly independent elements x, y ∈ X, kx + yk < kxk + kyk. In other words, X is strictly convex if and only if, whenever kx + yk = kxk + kyk for 0 ≠ x ∈ X and y ∈ X, there exists λ ≥ 0 such that y = λx.
Proof. Suppose that X is not strictly convex. Then there exist x and y in X such that
kxk = kyk = 1, x 6= y and kx + yk = 2. By hypothesis, for any two linearly independent
elements x, y ∈ X, kx + yk < kxk + kyk. Since kx + yk = kxk + kyk, x and y are linearly
dependent. Then, x = αy for some α ∈ R, and therefore, kxk = |α| kyk for some α ∈ R
which implies that |α| = 1 because kxk = kyk = 1. If α = 1, then x = y, contradicting that
x 6= y. So we have α = −1, and therefore,
2 = kx + yk = k − y + yk = 0.
This is a contradiction.
Conversely, suppose that X is a strictly convex space and that there exist linearly independent elements x and y in X such that kx + yk = kxk + kyk. Without loss of generality, we may assume that 0 < kxk ≤ kyk. Since x/kxk and y/kyk are distinct points of SX, the strict convexity of X gives
2 > k x/kxk + y/kyk k
 = (1/(kxk kyk)) k x kyk + y kxk k
 = (1/(kxk kyk)) k x kyk + y kyk − y kyk + y kxk k
 = (1/(kxk kyk)) k kyk(x + y) − (kyk − kxk)y k
 ≥ (1/(kxk kyk)) [ kyk kx + yk − (kyk − kxk)kyk ]
 = (1/(kxk kyk)) [ kyk(kxk + kyk) − (kyk − kxk)kyk ] = 2   (because kx + yk = kxk + kyk).
This is a contradiction.
We now present the existence and uniqueness of elements of minimal norm in convex subsets
of strictly convex normed spaces.
Proposition 4.1.5. Let X be a strictly convex normed space and C be a nonempty convex subset of X. Then there is at most one point x ∈ C such that kxk = inf {kzk : z ∈ C}.
Proof. Assume that there exist two points x, y ∈ C, x ≠ y, such that
kxk = kyk = inf{kzk : z ∈ C} = d (say).
If d = 0, then x = y = 0, contradicting x ≠ y; so d > 0. If λ ∈ (0, 1), then by the strict convexity of X,
kλx + (1 − λ)yk < λkxk + (1 − λ)kyk = λd + (1 − λ)d = d,
which is a contradiction, since λx + (1 − λ)y ∈ C by the convexity of C.
Proposition 4.1.6. Let C be a nonempty closed convex subset of a reflexive strictly
convex Banach space X. Then there exists a unique point x ∈ C such that kxk =
inf {kzk : z ∈ C}.
Proof. Let d := inf{kzk : z ∈ C}. Then there exists a sequence {xn} in C such that lim_{n→∞} kxnk = d. Since X is reflexive, by Theorem 6.0.8, there exists a subsequence {xni} of {xn} converging weakly to some element x, and x ∈ C because C is closed and convex, hence weakly closed. The weak lower semicontinuity of the norm gives
kxk ≤ lim inf_{i→∞} kxnik = d.
Since x ∈ C, we also have kxk ≥ d; therefore kxk = d. The uniqueness of x follows from Proposition 4.1.5.
Definition 4.1.2. Let C be a nonempty subset of a normed space X and x ∈ X. The
distance from the point x to the set C is defined as
d(x, C) = inf{kx − yk : y ∈ C}.
Proposition 4.1.7. Let C be a nonempty closed convex subset of a reflexive strictly
convex Banach space X. Then for all x ∈ X, there exists a unique point zx ∈ C such
that kx − zx k = d(x, C).
Proof. Let x ∈ X. Since C is a nonempty closed convex subset of the Banach space X, the set D = C − x := {y − x : y ∈ C} is a nonempty closed convex subset of X. By Proposition 4.1.6, there exists a unique point ux ∈ D such that kuxk = inf{ky − xk : y ∈ C} = d(x, C). For this ux ∈ D, there exists a point zx ∈ C such that ux = zx − x. Hence, there exists a unique point zx ∈ C such that kzx − xk = d(x, C).
In order to measure the degree of strict convexity of X, we define its modulus of convexity.
Definition 4.1.3. Let X be a normed space. The function δX : [0, 2] → [0, 1] defined by
δX(ε) = inf{1 − k(x + y)/2k : kxk ≤ 1, kyk ≤ 1, kx − yk ≥ ε}
is called the modulus of convexity of X.
Roughly speaking, δX measures how deep inside the unit ball the midpoint of a segment joining two points of the unit ball at distance at least ε must lie.
The notion of the modulus of convexity was introduced by Clarkson in 19361 . It allows us
to measure the convexity and rotundity of the unit ball of a normed space.
Remark 4.1.2.
(a) It is easy to see that δX (0) = 0 and δX (ε) ≥ 0 for all ε ≥ 0.
(b) The function δX is nondecreasing on [0, 2], that is, if ε1 ≤ ε2, then δX(ε1) ≤ δX(ε2).
(c) The function δX is continuous on [0, 2), but not necessarily continuous at ε = 2.
(d) The modulus of convexity of an inner product space H is
δH(ε) = 1 − √(1 − ε²/4).
(e) The modulus of convexity of ℓp (2 ≤ p < ∞) is
δℓp(ε) = 1 − (1 − (ε/2)ᵖ)^{1/p}.
(f) δX(ε) ≤ δH(ε) for any normed space X and any inner product space H. That is, an inner product space is the most convex normed space.
¹J.A. Clarkson: Uniformly convex spaces, Trans. Amer. Math. Soc., 40 (1936), 396–414.
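The closed form in (d) can be compared with a crude Monte Carlo estimate of the infimum defining δX(ε) in the Euclidean plane (an illustrative sketch; the sample size, the seed and ε = 1 are arbitrary choices, and the sampled minimum only approximates the infimum from above):

```python
# Rough numerical estimate of the modulus of convexity of the Euclidean plane at eps = 1.
import numpy as np

rng = np.random.default_rng(3)
eps = 1.0
best = np.inf
for _ in range(100_000):
    x, y = rng.uniform(-1, 1, 2), rng.uniform(-1, 1, 2)
    if np.linalg.norm(x) <= 1 and np.linalg.norm(y) <= 1 and np.linalg.norm(x - y) >= eps:
        best = min(best, 1 - np.linalg.norm((x + y) / 2))
print(best, 1 - np.sqrt(1 - eps**2 / 4))   # sampled estimate vs. closed form (about 0.134)
```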
Remark 4.1.3. We note that for any ε > 0, the number δX(ε) is the largest number for which the following implication always holds: for any x, y ∈ X,
kxk ≤ 1, kyk ≤ 1, kx − yk ≥ ε ⇒ k(x + y)/2k ≤ 1 − δX(ε).   (4.7)
Example 4.1.5. Let X = R² be equipped with one of the following norms:
k(x1, x2)k1 = |x1| + |x2| or k(x1, x2)k∞ = max{|x1|, |x2|},
for all (x1, x2) ∈ X. Then δX(ε) = 0 for all ε ∈ [0, 2].
Example 4.1.6. Let X = R² be equipped with the norm
k(x1, x2)k = max{|x2|, |x1 + x2/√3|, |x1 − x2/√3|}, for all (x1, x2) ∈ X.
Then the unit sphere is a regular hexagon and
lim_{ε→2} δX(ε) = δX(2) = 1/2.
We now give some important properties of the modulus of convexity of normed spaces.
Theorem 4.1.1. A normed space X is strictly convex if and only if δX (2) = 1.
Proof. Let X be a strictly convex normed space with modulus of convexity δX. Suppose kxk = kyk = 1, kx − yk = 2 and x ≠ −y. Then x and −y are distinct points of SX, so by the strict convexity of X,
1 = k(x − y)/2k = k(x + (−y))/2k < 1,
a contradiction. Hence x = −y, so that k(x + y)/2k = 0 for every such pair, and therefore δX(2) = 1.
Conversely, suppose δX(2) = 1. Let x, y ∈ X be such that kxk = kyk = k(x + y)/2k = 1, that is, kx + yk = kx − (−y)k = 2. Applying (4.7) to the pair x, −y, we get
k(x − y)/2k = k(x + (−y))/2k ≤ 1 − δX(2) = 0,
which implies that x = y. Thus kxk = kyk = 1 and kx + yk = 2 imply x = y, and therefore X is strictly convex.
4.2 Uniform Convexity
The strict convexity of a normed space X says that the midpoint (x + y)/2 of the segment joining two distinct points x, y ∈ SX with kx − yk ≥ ε > 0 does not lie on SX, that is,
k(x + y)/2k < 1.
In such spaces, we have no information about 1 − k(x + y)/2k, the distance of the midpoint from the unit sphere SX. A stronger property than strict convexity, which provides information about this distance, is uniform convexity.
Definition 4.2.1. A normed space X is said to be uniformly convex if for any ε with 0 < ε ≤ 2, there exists a δ = δ(ε) > 0 such that the inequalities kxk ≤ 1, kyk ≤ 1 and kx − yk ≥ ε imply k(x + y)/2k ≤ 1 − δ.
This says that if x and y are in the closed unit ball BX := {x ∈ X : kxk ≤ 1} with
kx − yk ≥ ε > 0, the midpoint of x and y lies inside the unit ball BX at a distance of at
least δ from the unit sphere SX .
Roughly speaking, if two points on the unit sphere of a uniformly convex space are far apart,
then their midpoint must be well within it.
The concept of uniform convexity was introduced by Clarkson2 .
Example 4.2.1. Every Hilbert space H is uniformly convex. In fact, the parallelogram law gives us
kx + yk² = 2(kxk² + kyk²) − kx − yk², for all x, y ∈ H.
Suppose x, y ∈ BH with kx − yk ≥ ε. Then
kx + yk² ≤ 4 − ε².
Therefore,
k(x + y)/2k ≤ 1 − δ(ε), where δ(ε) = 1 − √(1 − ε²/4).
Thus, H is uniformly convex.
Example 4.2.2. The spaces ℓ1 and ℓ∞ are not uniformly convex. To see it, take x =
(1, 0, 0, 0, . . .), y = (0, −1, 0, 0, . . .) ∈ ℓ1 and ε = 1. Then
kxk1 = 1, kyk1 = 1, kx − yk1 = 2 > 1 = ε.
²J.A. Clarkson: Uniformly convex spaces, Trans. Amer. Math. Soc., 40 (1936), 396–414.
However, k(x + y)/2k1 = 1 and there is no δ > 0 such that k(x + y)/2k1 ≤ 1 − δ. Thus, ℓ1
is not uniformly convex.
Similarly, if we take x = (1, 1, 1, 0, 0, . . .), y = (1, 1, −1, 0, 0, . . .) ∈ ℓ∞ and ε = 1, then
kxk∞ = 1, kyk∞ = 1, kx − yk∞ = 2 > 1 = ε.
Since k(x + y)/2k∞ = 1, ℓ∞ is not uniformly convex.
Exercise 4.2.1. Fix µ > 0 and let C[0, 1] be the space with the norm k · kµ defined by
kxkµ = kxk0 + µ (∫_0^1 x(t)² dt)^{1/2},
where k · k0 is the usual supremum norm. Then
kxk0 ≤ kxkµ ≤ (1 + µ)kxk0 ,
for all x ∈ C[0, 1],
and the two norms are equivalent, with k · kµ close to k · k0 for small µ. However, (C[0, 1], k · k0) is not strictly convex while, for any µ > 0, (C[0, 1], k · kµ) is. On the other hand, it is easy to see that for any ε ∈ (0, 2), there exist functions x, y ∈ C[0, 1] with kxkµ = kykµ = 1, kx − ykµ ≥ ε and k(x + y)/2kµ arbitrarily near 1. Thus, (C[0, 1], k · kµ) is not uniformly convex.
Exercise 4.2.2. Show that the normed spaces ℓp, ℓpⁿ (n a positive integer) and Lp[a, b] with 1 < p < ∞ are uniformly convex.
Exercise 4.2.3. Show that the normed spaces ℓ1, c, ℓ∞, L1[a, b], C[a, b] and L∞[a, b] are not strictly convex.
Theorem 4.2.1. Every uniformly convex normed space is strictly convex.
Proof. It follows directly from Definition 4.2.1.
Remark 4.2.1. The converse of Theorem 4.2.1 is not true in general. Let β > 0 and let X = c0 be the space of all sequences of scalars converging to zero, equipped with the norm k · kβ defined by
kxkβ = kxkc0 + β (Σ_{i=1}^∞ (xi/i)²)^{1/2}, x = {xi} ∈ c0.
The spaces (c0, k · kβ), β > 0, are strictly convex but not uniformly convex, while c0 with its usual norm kxk∞ = sup_{i∈N} |xi| is not strictly convex.
Remark 4.2.2. The strict convexity and uniform convexity are equivalent in finite dimensional spaces.
Theorem 4.2.2. Let X be a normed space. Then X is uniformly convex if and only if for any two sequences {xn} and {yn} in X,
kxnk ≤ 1, kynk ≤ 1 and lim_{n→∞} kxn + ynk = 2 ⇒ lim_{n→∞} kxn − ynk = 0.   (4.8)
Proof. Let X be uniformly convex. Assume that {xn} and {yn} are two sequences in X such that kxnk ≤ 1, kynk ≤ 1 for all n ∈ N and lim_{n→∞} kxn + ynk = 2. Suppose, on the contrary, that kxn − ynk does not converge to 0. Then for some ε > 0 there exists a subsequence {ni} of {n} such that kxni − ynik ≥ ε. Since X is uniformly convex, there exists δ(ε) > 0 such that
kxni + ynik ≤ 2(1 − δ(ε)).   (4.9)
Since lim_{n→∞} kxn + ynk = 2, it follows from (4.9) that 2 ≤ 2(1 − δ(ε)), a contradiction.
Conversely, assume that condition (4.8) is satisfied. If X is not uniformly convex, then for some ε ∈ (0, 2] there is no δ(ε) > 0 such that
kxk ≤ 1, kyk ≤ 1, kx − yk ≥ ε ⇒ kx + yk ≤ 2(1 − δ(ε)),
and we can find sequences {xn } and {yn } in X such that
(i) kxn k ≤ 1, kyn k ≤ 1,
(ii) kxn + yn k ≥ 2(1 − 1/n),
(iii) kxn − yn k ≥ ε.
Clearly kxn − yn k ≥ ε which contradicts the hypothesis, since (ii) gives lim kxn + yn k = 2.
n→∞
Thus, X must be uniformly convex.
Theorem 4.2.3. A normed space X is uniformly convex if and only if δX (ε) > 0 for all
ε ∈ (0, 2].
Proof. Let X be a uniformly convex normed space. Then for each ε ∈ (0, 2], there exists δ(ε) > 0 such that k(x + y)/2k ≤ 1 − δ(ε), that is,
0 < δ(ε) ≤ 1 − k(x + y)/2k,
for all x, y ∈ X with kxk ≤ 1, kyk ≤ 1 and kx − yk ≥ ε. Therefore, from the definition of the modulus of convexity, we have δX(ε) ≥ δ(ε) > 0.
Conversely, suppose that X is a normed space with modulus of convexity δX such that δX(ε) > 0 for all ε ∈ (0, 2]. Let x, y ∈ X be such that kxk ≤ 1, kyk ≤ 1 and kx − yk ≥ ε for a fixed ε ∈ (0, 2]. By the definition of the modulus of convexity, we have
0 < δX(ε) ≤ 1 − k(x + y)/2k.
It follows that
k(x + y)/2k ≤ 1 − δX(ε),
where δX(ε) is independent of x and y. Therefore, X is uniformly convex.
Theorem 4.2.4. Let {xn} be a sequence in a uniformly convex Banach space X. Then
xn ⇀ x and kxnk → kxk ⇒ xn → x.
Proof. If x = 0, then kxnk → 0 and hence xn → 0. So let x ≠ 0. Since kxnk → kxk > 0, we may assume xn ≠ 0 for all n; put yn = xn/kxnk and y = x/kxk. By construction, kynk = kyk = 1, yn ⇀ y, and thus yn + y ⇀ 2y.
Suppose that xn does not converge to x. Then yn does not converge to y, and so there exist ε > 0 and a subsequence {ynk} of {yn} such that kynk − yk ≥ ε. Since X is uniformly convex, there exists δX(ε) > 0 such that
k(ynk + y)/2k ≤ 1 − δX(ε).
Since ynk + y ⇀ 2y, the weak lower semicontinuity of the norm gives
kyk ≤ lim inf_{k→∞} k(ynk + y)/2k ≤ 1 − δX(ε),
which contradicts kyk = 1. Therefore, xn → x.
For the class of uniformly convex Banach spaces, we have the following important results.
Theorem 4.2.5. Every uniformly convex Banach space is reflexive.
Proof. Let X be a uniformly convex Banach space. Let SX ∗ := {j ∈ X ∗ : kjk∗ = 1} be the
unit sphere in X ∗ and f ∈ SX ∗ . Suppose that {xn } is a sequence in SX such that f (xn ) → 1.
We show that {xn } is a Cauchy sequence. Assume contrary that there exist ε > 0 and two
subsequences {xni } and {xnj } of {xn } such that kxni − xnj k ≥ ε. The uniform convexity of
X guarantees that there exists δX (ε) > 0 such that k(xni + xnj )/2k < 1 − δX (ε). Observe
that
|f ((xni + xnj )/2)| ≤ kf k∗ k(xni + xnj )/2k < kf k∗ (1 − δX (ε)) = 1 − δX (ε)
together with f(xn) → 1, yield a contradiction. Hence {xn} is a Cauchy sequence and there exists a
point x in X such that xn → x. Clearly x ∈ SX . In fact,
kxk = k lim xn k = lim kxn k = 1.
n→∞
n→∞
Using James Theorem 6.0.3 (which states that a Banach space is reflexive if and only if for
each f ∈ SX ∗ , there exists x ∈ SX such that f (x) = 1), we conclude that X is reflexive.
Remark 4.2.3. Every finite-dimensional Banach space is reflexive, but it need not be uniformly convex. For example, X = Rn, n ≥ 2, with the norm kxk1 = Σ_{i=1}^n |xi| is not uniformly convex, although it is a finite dimensional space.
Combining Proposition 4.1.6 and Theorems 4.2.1 and 4.2.5, we obtain the following interesting result.
Theorem 4.2.6. Let C be a nonempty closed convex subset of a uniformly convex Banach space X. Then C has a unique element of minimum norm, that is, there exists a
unique element x ∈ C such that kxk = inf {kzk : z ∈ C}.
Theorem 4.2.7 (Intersection Theorem). Let {Cn}_{n=1}^∞ be a decreasing sequence of nonempty bounded closed convex subsets of a uniformly convex Banach space X. Then the intersection ∩_{n=1}^∞ Cn is a nonempty closed convex subset of X.
Proof. Let x be a point in X which does not belong to C1, let rn = d(x, Cn) and r = lim_{n→∞} rn (the limit exists since {rn} is nondecreasing and bounded). Also, let {qn} be a sequence of positive numbers decreasing to zero, let Dn = {y ∈ Cn : kx − yk ≤ r + qn}, and let dn be the diameter of Dn. If y and z belong to Dn and ky − zk ≥ dn − qn, then, by uniform convexity applied in the ball of radius r + qn centred at x,
kx − (y + z)/2k ≤ (1 − δ((dn − qn)/(r + qn)))(r + qn),
and, since (y + z)/2 ∈ Cn,
rn ≤ (1 − δ((dn − qn)/(r + qn)))(r + qn).
Let d = lim_{n→∞} dn; letting n → ∞ we obtain a contradiction unless d = 0. Thus the Dn form a decreasing sequence of nonempty closed sets whose diameters tend to zero, so ∩_{n=1}^∞ Dn ≠ ∅, and hence ∩_{n=1}^∞ Cn ≠ ∅.
Remark 4.2.4. Theorem 4.2.7 remains valid if the sequence {Cn }∞
n=1 is replaced by an
arbitrary decreasing net of nonempty bounded closed convex sets. However, Theorem 4.2.7
does not hold in arbitrary Banach spaces. For example, consider the space X = C[0, 1] and
Cn = {x ∈ C[0, 1] : 0 ≤ x(t) ≤ tn for all 0 ≤ t ≤ 1 and x(1) = 1}.
4.3 Duality Mapping and Its Properties
Before defining the duality mapping and giving its fundamental properties, we mention the
following notations and definitions:
Let T : X ⇒ X∗ be a set-valued mapping. The domain Dom(T), range R(T), inverse T⁻¹ and graph G(T) are defined as
Dom(T) = {x ∈ X : T(x) ≠ ∅},
R(T) = ∪_{x∈Dom(T)} T(x),
T⁻¹(y) = {x ∈ X : y ∈ T(x)},
G(T) = {(x, y) ∈ X × X∗ : y ∈ T(x), x ∈ Dom(T)}.
The graph G(T) of T is a subset of X × X∗.
The mapping T is said to be injective if T (x) ∩ T (y) = ∅ for all x 6= y.
Definition 4.3.1. Let X∗ be the dual of a normed space X. The set-valued mapping J : X ⇒ X∗ defined by
J(x) = {j ∈ X∗ : hx, ji = kxk² = kjk²∗},
equivalently,
J(x) = {j ∈ X∗ : hx, ji = kxk kjk∗ and kxk = kjk∗},
is called the normalized duality mapping.
Example 4.3.1. In a real Hilbert space H, the normalized duality mapping is the identity
mapping. Indeed, let x ∈ H with x 6= 0. Since H = H ∗ and hx, xi = kxk · kxk, we have
x ∈ J(x). Assume that y ∈ J(x). By the definition of J, we have hx, yi = kxkkyk and
kxk = kyk. Since
kx − yk2 = kxk2 + kyk2 − 2hx, yi,
it follows that x = y. Therefore, J(x) = {x}.
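For ℓp spaces with 1 < p < ∞ the normalized duality mapping is single-valued and has the standard explicit form J(x) = kxkp^{2−p} (|xi|^{p−1} sgn xi)_i, a formula stated here without proof. The following sketch (an illustration, not from these notes; p = 3 and the vector x are arbitrary choices) checks the two defining identities numerically.

```python
# Duality mapping in a finite-dimensional l^p space: check <x, Jx> = ||x||_p^2 and ||Jx||_q = ||x||_p.
import numpy as np

p = 3.0
q = p / (p - 1)
x = np.array([1.0, -2.0, 0.5, 0.0])

norm_p = np.sum(np.abs(x) ** p) ** (1 / p)
j = norm_p ** (2 - p) * np.abs(x) ** (p - 1) * np.sign(x)

print(np.isclose(x @ j, norm_p ** 2))                         # True
print(np.isclose(np.sum(np.abs(j) ** q) ** (1 / q), norm_p))  # True
```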
The following theorem presents some fundamental properties of duality mappings in Banach
spaces.
Proposition 4.3.1. Let X be a Banach space and J : X ⇒ X ∗ be a normalized duality
mapping. Then the following assertions hold:
(a) J(0) = {0}.
(b) For each x ∈ X, J(x) is nonempty closed convex and bounded subset of X ∗ .
(c) J(λx) = λJ(x) for all x ∈ X and real λ, that is, J is homogeneous.
(d) J is a monotone set-valued map, that is, hx − y, jx − jy i ≥ 0, for all x, y ∈ X,
jx ∈ J(x) and jy ∈ J(y).
(e) kxk2 − kyk2 ≥ 2hx − y, ji, for all x, y ∈ X and j ∈ J(y).
(f) If X ∗ is strictly convex, then J is single-valued.
(g) If X is strictly convex, then J is injective, that is, x 6= y ⇒ J(x) ∩ J(y) = ∅.
(h) If X is reflexive with strictly convex dual X∗, then J is demicontinuous, that is, xn → x in X implies J(xn) ⇀ J(x).
Proof. (a) It is obvious.
(b) Let x ∈ X. If x = 0, then it is done by Part (a). So, we assume that x 6= 0. Then,
by the Hahn-Banach Theorem, there exists f ∈ X ∗ such that hx, f i = kxk and kf k∗ = 1.
Set j := kxkf . Then hx, ji = kxkhx, f i = kxk2 and kjk∗ = kxk, and it follows that J(x) is
nonempty for each x 6= 0. So, we can assume that f1 , f2 ∈ J(x). Then, we have
hx, f1 i = kxkkf1 k∗ ,
kxk = kf1 k∗
hx, f2 i = kxkkf2 k∗ ,
kxk = kf2 k∗ ,
and
and therefore, for t ∈ (0, 1), we have
hx, tf1 + (1 − t)f2 i = kxk (tkf1 k∗ + (1 − t)kf2 k∗ ) = kxk2 .
Since
kxk2 = hx, tf1 + (1 − t)f2 i ≤ ktf1 + (1 − t)f2 k∗ kxk
≤ (tkf1 k∗ + (1 − t)kf2 k∗ ) kxk
= kxk2 ,
we have
which gives us
kxk2 ≤ kxkktf1 + (1 − t)f2 k∗ ≤ kxk2 ,
kxk2 = kxkktf1 + (1 − t)f2 k∗ ,
that is,
ktf1 + (1 − t)f2 k∗ = kxk.
Therefore,
hx, tf1 + (1 − t)f2 i = kxk ktf1 + (1 − t)f2 k∗ and kxk = ktf1 + (1 − t)f2 k∗ ,
and thus, tf1 + (1 − t)f2 ∈ J(x) for all t ∈ (0, 1), that is, J(x) is a convex set.
Similarly, we can show that J(x) is a closed and bounded set in X ∗ .
(c) For λ = 0, J(0 · x) = {0} = 0 · J(x). Assume that j ∈ J(λx) for λ ≠ 0. We first show that J(λx) ⊆ λJ(x). Since j ∈ J(λx), we have
hλx, ji = kλxk kjk∗ and kλxk = kjk∗,
and thus hλx, ji = kjk²∗. Hence
hx, λ⁻¹ji = λ⁻²hλx, ji = λ⁻²kjk²∗ = kλ⁻¹jk²∗ = λ⁻²kλxk² = kxk².
This shows that λ⁻¹j ∈ J(x), that is, j ∈ λJ(x). Therefore, J(λx) ⊆ λJ(x). Similarly, we can show that λJ(x) ⊆ J(λx). Thus, J(λx) = λJ(x).
(d) Let jx ∈ J(x) and jy ∈ J(y) for x, y ∈ X. Then we have
hx − y, jx − jyi = hx, jxi − hx, jyi − hy, jxi + hy, jyi
 ≥ kxk² + kyk² − kxk kjyk∗ − kyk kjxk∗
 = kxk² + kyk² − 2kxk kyk
 = (kxk − kyk)² ≥ 0.
(e) Let x, y ∈ X and j ∈ J(y). Then we have
kxk² − kyk² − 2hx − y, ji = kxk² − kyk² − 2hx, ji + 2hy, ji
 = kxk² − kyk² − 2hx, ji + 2kyk²
 = kxk² + kyk² − 2hx, ji
 ≥ kxk² + kyk² − 2kxk kyk = (kxk − kyk)² ≥ 0.
(f) Let j1 , j2 ∈ J(x) for x ∈ X. Then, we have
hx, j1 i = kj1 k2∗ = kxk2
and
hx, j2 i = kj2 k2∗ = kxk2 .
Adding the above identities, we obtain
hx, j1 + j2 i = 2kxk2 .
Since 2kxk2 = hx, j1 + j2 i ≤ kxkkj1 + j2 k∗ , we have
kj1 k∗ + kj2 k∗ = 2kxk ≤ kj1 + j2 k∗ .
It follows from the fact kj1 + j2 k∗ ≤ kj1 k∗ + kj2 k∗ that
kj1 + j2 k∗ = kj1 k∗ + kj2 k∗ .
(4.10)
Since X ∗ is strictly convex and kj1 + j2 k∗ = kj1 k∗ + kj2 k∗ , there exists λ ∈ R such that
j1 = λj2 . Since
hx, j2 i = hx, j1 i = hx, λj2 i = λhx, j2 i,
this implies that λ = 1, and hence, j1 = j2 . Therefore, J is single-valued.
(g) Suppose that j ∈ J(x) ∩ J(y) for x, y ∈ X. Since j ∈ J(x) and j ∈ J(y), it follows from
kjk2∗ = kxk2 = kyk2 = hx, ji = hy, ji that
kxk2 = h(x + y)/2, ji ≤ k(x + y)/2kkxk,
which gives that
kxk = kyk ≤ k(x + y)/2k ≤ kxk.
Hence kxk = kyk = k(x + y)/2k. Since X is strictly convex and kxk = kyk = k(x + y)/2k,
we have x = y. Therefore, J is one-one.
(h) It is sufficient to prove the demicontinuity of J on the unit sphere SX . For this, let {xn }
be a sequence in SX such that xn → z in X. Then kJ(xn )k∗ = kxn k = 1 for all n ∈ N,
that is, {J(xn )} is bounded. Since X is reflexive, so is X ∗ . Then, there exists a subsequence
{J(xnk )} of {J(xn )} in X ∗ such that {J(xnk )} converges weakly to some j in X ∗ . Since
xnk → z and J(xnk ) ⇀ j, we have
hz, ji = lim hxnk , J(xnk )i = lim kxnk k2 = 1.
k→∞
k→∞
Moreover, by the weak lower semicontinuity of the norm,
kjk∗ ≤ lim inf_{k→∞} kJxnkk∗ = lim_{k→∞} (kJxnkk∗ kxnkk) = lim_{k→∞} hxnk, Jxnki = hz, ji ≤ kjk∗ kzk = kjk∗,
so that hz, ji = kjk∗. This shows that
hz, ji = kjk∗ kzk and kjk∗ = kzk
(because z ∈ SX, so kzk = 1, and hz, ji = 1, so kjk∗ = 1). This implies that j = J(z). Thus every weakly convergent subsequence of {J(xn)} has the same weak limit J(z), and since {J(xn)} is bounded in the reflexive space X∗, it follows that J(xn) ⇀ J(z). Therefore, J is demicontinuous.
The following inequalities are very useful in many applications.
Corollary 4.3.1. Let X be a Banach space and J : X ⇒ X ∗ be the duality mapping.
Then the following statements hold:
(a) kx + yk2 ≥ kxk2 + 2hy, jx i, for all x, y ∈ X, where jx ∈ J(x).
(b) kx + yk2 ≤ kyk2 + 2hx, jx+y i, for all x, y ∈ X, where jx+y ∈ J(x + y).
Proof. (a) Replacing y by x + y in (4.11), we get the inequality.
(b) Replacing x by x + y in (4.11), we get the result.
Proposition 4.3.2. Let X be a Banach space and J : X ⇒ X ∗ be a normalized duality
mapping. For each x, y ∈ X, the following statements are equivalent:
(a) kxk ≤ kx + tyk, for all t > 0.
(b) There exists j ∈ J(x) such that hy, ji ≥ 0.
Proof. (a) ⇒ (b). If x = 0, take j = 0. So let x ≠ 0; by (a), kx + tyk ≥ kxk > 0 for all t > 0. For t > 0, let ft ∈ J(x + ty). Then hx + ty, fti = kx + tyk kftk∗ and kftk∗ = kx + tyk > 0. Define gt = ft/kftk∗. Then kgtk∗ = 1 and, since gt ∈ kftk∗⁻¹ J(x + ty), we have
kxk ≤ kx + tyk = kftk∗⁻¹ hx + ty, fti = hx + ty, gti = hx, gti + thy, gti ≤ kxk + thy, gti   (since kgtk∗ = 1).
By the Banach-Alaoglu Theorem 6.0.4 (which states that the unit ball in X∗ is weak* compact), the net {gt} has, as t → 0⁺, a weak* cluster point g ∈ X∗ such that
kgk∗ ≤ 1, hx, gi ≥ kxk and hy, gi ≥ 0.
Observe that
kxk ≤ hx, gi ≤ kxkkgk∗ = kxk,
which gives that
hx, gi = kxk and kgk∗ = 1.
Set j = gkxk, then j ∈ J(x) and hy, ji ≥ 0.
(b) ⇒ (a). Assume that for x, y ∈ X with x 6= 0, there exists j ∈ J(x) such that hy, ji ≥ 0.
Then for t > 0,
kxk2 = hx, ji ≤ hx, ji + hty, ji
= hx + ty, ji ≤ kx + tykkxk,
which implies that
kxk ≤ kx + tyk.
Proposition 4.3.3. Let X be a Banach space and ϕ : X → R be a function defined by
ϕ(x) = kxk2 /2. Then the subdifferential ∂ϕ of ϕ coincides with the normalized duality
mapping J : X ⇒ X ∗ defined by
J(x) = {j ∈ X ∗ : hx, ji = kxkkjk∗ , kjk∗ = kxk} ,
for x ∈ X.
Proof. We first show that J(x) ⊆ ∂ (kxk2 /2). Let x 6= 0 and j ∈ J(x). Then for y ∈ X, we have
kyk2 /2 − kxk2 /2 − hy − x, ji = kyk2 /2 − kxk2 /2 − hy, ji + hx, ji
≥ kyk2 /2 − kxk2 /2 − kyk kjk∗ + kxk kjk∗ (because hy, ji ≤ kyk kjk∗ and hx, ji = kxk kjk∗ )
= kyk2 /2 − kxk2 /2 − kyk kxk + kxk2 (because kjk∗ = kxk)
= kxk2 /2 + kyk2 /2 − kxk kyk
= (kxk − kyk)2 /2 ≥ 0.
It follows that
kxk2 /2 − kyk2 /2 ≤ hx − y, ji.
Hence j ∈ ∂ (kxk2 /2). Thus, J(x) ⊆ ∂ (kxk2 /2) for all x 6= 0.
We now prove ∂ (kxk2 /2) ⊆ J(x) for all x 6= 0. Suppose j ∈ ∂ (kxk2 /2) for 0 6= x ∈ X. Then,
kxk2 /2 − kyk2 /2 ≤ hx − y, ji, for all y ∈ X. (4.11)
Observe that
kxk kjk∗ = sup {hy, jikxk : kyk = 1} (since j is a continuous linear functional)
= sup {hy, ji : kyk = kxk}
≤ sup {hx, ji + kyk2 /2 − kxk2 /2 : kyk = kxk} (by using (4.11))
= hx, ji ≤ kxk kjk∗ .
Thus,
hx, ji = kxk kjk∗ . (4.12)
To see j ∈ J(x), we show that kjk∗ = kxk. For t > 1, we take y = tx ∈ X in (4.11); then we obtain
kxk2 /2 − t2 kxk2 /2 ≤ hx − tx, ji,
that is,
(1 − t2 )kxk2 /2 ≤ (1 − t)hx, ji,
which implies that
hx, ji ≤ (t + 1)kxk2 /2.
Letting t → 1, we get
hx, ji ≤ kxk2 . (4.13)
Further, for t > 0, we take y = (1 − t)x ∈ X in (4.11); then we obtain
kxk2 /2 − (1 − t)2 kxk2 /2 ≤ hx − (1 − t)x, ji = thx, ji,
that is,
(1 − (1 − t)2 )kxk2 /2 ≤ thx, ji.
It follows that
(2 − t)kxk2 /2 ≤ hx, ji.
Letting t → 0, we get
kxk2 ≤ hx, ji. (4.14)
From (4.12), (4.13) and (4.14), we obtain kjk∗ = kxk. Thus, ∂ (kxk2 /2) ⊆ J(x). Therefore,
J(x) = ∂ (kxk2 /2) for all x 6= 0. For x = 0 the identity also holds: j ∈ ∂ (k0k2 /2) means hy, ji ≤ kyk2 /2 for all y ∈ X, and replacing y by ty and letting t → 0+ gives hy, ji ≤ 0 for all y, so j = 0; hence ∂ (k0k2 /2) = {0} = J(0).
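In a Hilbert space the proposition is easy to visualise: ϕ(x) = kxk2 /2 is differentiable with gradient x, so ∂ϕ(x) = {x} = J(x). The snippet below (our own illustrative check in Euclidean R4, with an arbitrary sample size and tolerance) verifies the subgradient inequality (4.11) with j = x on random pairs:

```python
# Verify the subgradient inequality (4.11) for phi(x) = ||x||^2 / 2 in Euclidean R^4,
# where the (unique) subgradient at x is j = x.
import numpy as np

rng = np.random.default_rng(1)
phi = lambda v: 0.5 * np.dot(v, v)
for _ in range(1000):
    x, y = rng.normal(size=4), rng.normal(size=4)
    j = x                                   # element of J(x) = d(phi)(x) in a Hilbert space
    assert phi(x) - phi(y) <= np.dot(x - y, j) + 1e-12
print("inequality (4.11) holds with j = x on all sampled pairs")
```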
4.4 Smooth Banach Spaces and Modulus of Smoothness
Let C be a nonempty closed convex subset of a normed space X such that the origin belongs
to the interior of C. A linear functional j ∈ X ∗ is said to be a tangent to C at the point
x0 ∈ ∂C if j(x0 ) = sup{j(x) : x ∈ C}, where ∂C denotes the boundary of C. If H = {x ∈
X : j(x) = 0} is the hyperplane, then the set H + x0 is called a tangent hyperplane to C at
x0 .
Definition 4.4.1. A Banach space X is said to be smooth if for each x ∈ SX , there exists
a unique functional jx ∈ X ∗ such that hx, jx i = kxk and kjx k = 1.
In other words, X is smooth if for all x ∈ SX , there exists jx ∈ SX ∗ such that hx, jx i = 1.
Geometrically, the smoothness condition means that at each point x of the unit sphere, there
is exactly one supporting hyperplane {jx = 1} := {y ∈ X : hy, jx i = 1}. This means that
the hyperplane {jx = 1} is tangent at x to the unit ball and this unit ball is contained in
the half space {jx ≤ 1} := {y ∈ X : hy, jx i ≤ 1}.
Example 4.4.1. ℓp , Lp (1 < p < ∞) are smooth Banach spaces. However, c0 , ℓ1 , L1 , ℓ∞ ,
L∞ are not smooth.
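To make the failure of smoothness concrete, here is a small finite-dimensional illustration (our own example, not taken from the list above): in R2 with the ℓ1-norm, whose dual norm is the sup-norm, the unit vector x = (1, 0) admits more than one supporting functional of dual norm one.

```python
# Two distinct functionals j with ||j||_sup = 1 and <x, j> = ||x||_1 at x = (1, 0):
# (R^2, ||.||_1) is therefore not smooth at this point.
import numpy as np

x = np.array([1.0, 0.0])
for j in (np.array([1.0, 1.0]), np.array([1.0, -1.0])):
    print(np.dot(x, j), np.max(np.abs(j)))   # both lines print 1.0 1.0
```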
Theorem 4.4.1. Let X be a Banach space. Then the following assertions hold.
(a) If X ∗ is strictly convex, then X is smooth.
(b) If X ∗ is smooth, then X is strictly convex.
Proof. (a) Assume that X is not smooth. Then there exist x0 ∈ SX and j1 , j2 ∈ SX ∗
with j1 6= j2 such that hx0 , j1 i = hx0 , j2 i = 1. Since kj1 + j2 k ≤ kj1 k + kj2 k = 2, and
hx0 , j1 + j2 i = hx0 , j1 i + hx0 , j2 i = 2, we have (j1 + j2 )/2 ∈ SX ∗ . Hence X ∗ is not strictly
convex.
(b) Suppose that X is not strictly convex. Then there exist x, y ∈ SX with x 6= y such that kx + yk = 2. Since k(x + y)/2k = 1, by Corollary 6.0.1 we can take j ∈ SX ∗ with h(x + y)/2, ji = 1. Then, we have
1 = h(x + y)/2, ji = (1/2)hx, ji + (1/2)hy, ji ≤ 1/2 + 1/2 = 1,
and hence, hx, ji = hy, ji = kjk∗ = 1. Since x, y ∈ X ⊆ X ∗∗ , both x and y belong to J(j), where J now denotes the duality mapping of X ∗ . So, for x 6= y, X ∗ is not smooth.
It is well known that a reflexive Banach space X and its dual X ∗ can be equivalently renormed so that both become strictly convex, with the duality between them preserved. By using this fact, we have the following result.
Theorem 4.4.2. Let X be a reflexive Banach space. Then the following assertions hold.
(a) X is smooth if and only if X ∗ is strictly convex.
(b) X is strictly convex if and only if X ∗ is smooth.
We now establish a relation between smoothness and Gâteaux differentiability of a norm.
Theorem 4.4.3. A Banach space X is smooth if and only if the norm is Gâteaux
differentiable on X\{0}.
Proof. A proper convex continuous functional is Gâteaux differentiable at a point if and only if its subdifferential at that point is a singleton. Applying this to the norm at x 6= 0, we have:
the norm is Gâteaux differentiable at x
⇔ ∂kxk = {j ∈ X ∗ : hx, ji = kxk, kjk∗ = 1} is a singleton
⇔ there exists a unique j ∈ X ∗ such that hx, ji = kxk and kjk∗ = 1
⇔ X is smooth.
Corollary 4.4.1. Let X be a Banach space and J : X ⇒ X ∗ be a duality mapping.
Then the following statements are equivalent:
(a) X is smooth.
(b) J is single-valued.
(c) The norm of X is Gâteaux differentiable on X \ {0} with ∇kxk = J(x)/kxk.
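As a quick numerical illustration of part (c) (a sketch only, in Euclidean R3, where J(x) = x and hence ∇kxk = x/kxk; the test vectors and step size are our own choices):

```python
# Finite-difference Gateaux derivative of the Euclidean norm versus <v, x>/||x||.
import numpy as np

norm = np.linalg.norm
x = np.array([1.0, 2.0, -2.0])               # ||x|| = 3
v = np.array([0.5, -1.0, 0.25])
t = 1e-6
directional = (norm(x + t * v) - norm(x)) / t
print(directional, np.dot(v, x) / norm(x))    # both are approximately -0.6667
```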
We now study the continuity property of duality mappings.
Theorem 4.4.4. Let X be a smooth Banach space and J : X → X ∗ be a single-valued
duality mapping. Then J is norm to weak*-continuous.
Proof. We show that xn → x implies J(xn ) → J(x) in the weak* topology. Let xn → x and
set fn := J(xn ). Then
hxn , fn i = kxn kkfn k∗
and kxn k = kfn k∗ .
Since {xn } is bounded, {fn } is bounded in X ∗ . Then there exists a subsequence {fnk } of
{fn } such that fnk → f ∈ X ∗ in the weak* topology. Then we show that f = J(x). Since
the norm of X ∗ is lower semicontinuous in weak* topology, we have
kf k∗ ≤ lim inf k→∞ kfnk k∗ = lim inf k→∞ kxnk k = kxk.
Since hx, f − fnk i → 0 and hx − xnk , fnk i → 0, it follows from the fact
|hx, f i − kxnk k2 | = |hx, f i − hxnk , fnk i|
≤ |hx, f − fnk i| + |hx − xnk , fnk i| → 0
that
hx, f i = kxk2 .
As a result
kxk2 = hx, f i ≤ kf k∗ kxk.
Thus, we have hx, f i = kxk2 and kxk = kf k∗ , so that f = J(x). Since every weak*-convergent subsequence of the bounded sequence {fn } has the same limit J(x), we conclude that fn → J(x) in the weak* topology.
4.5 Metric Projection on Normed Spaces
Let C be a nonempty subset of a normed space X and x ∈ X. An element y0 ∈ C is said to
be a best approximation to x if
kx − y0 k = d(x, C),
where d(x, C) = infy∈C kx − yk. The number d(x, C) is called the distance from x to C.
The (possibly empty) set of all best approximations from x to C is denoted by
PC (x) = {y ∈ C : kx − yk = d(x, C)}.
This defines a mapping PC from X into 2C and it is called the metric projection onto C. The
metric projection mapping is also known as the nearest point projection mapping, proximity
mapping or best approximation operator.
The set C is said to be a proximinal (respectively, Chebyshev) set if each x ∈ X has at least (respectively, exactly) one best approximation in C.
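A concrete Chebyshev set is the closed Euclidean unit ball in R2, for which the metric projection has the explicit form PC (x) = x if kxk ≤ 1 and PC (x) = x/kxk otherwise. The sketch below is our own illustration (the helper name project_unit_ball is not from the notes):

```python
# Metric projection onto the closed Euclidean unit ball C in R^2.
import numpy as np

def project_unit_ball(x):
    n = np.linalg.norm(x)
    return x if n <= 1.0 else x / n

x = np.array([3.0, 4.0])                      # ||x|| = 5
p = project_unit_ball(x)
print(p, np.linalg.norm(x - p))               # [0.6 0.8] and d(x, C) = 4.0
```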
Remark 4.5.1.
(a) C is proximinal if PC (x) 6= ∅ for all x ∈ X.
(b) C is Chebyshev if PC (x) is singleton for each x ∈ X.
(c) The set of best approximations is convex if C is convex.
Proposition 4.5.1. If C is a proximinal subset of a Banach space X, then C is closed.
Proof. Suppose, on the contrary, that C is not closed. Then there exist a point x ∈ X not belonging to C and a sequence {xn } in C such that xn → x. It follows that
d(x, C) ≤ kxn − xk → 0,
so that d(x, C) = 0. Since x does not belong to C, we have
kx − yk > 0, for all y ∈ C,
so no point of C attains the distance d(x, C) = 0. This implies that PC (x) = ∅, which contradicts the proximinality of C.
Theorem 4.5.1 (The Existence of Best Approximations). Let C be a nonempty weakly
compact convex subset of a Banach space X and x ∈ X. Then x has a best approximation
in C, that is, PC (x) 6= ∅.
Proof. Define the function f : C → R+ by
f (y) = kx − yk,
for all y ∈ C.
Then f is convex and lower semicontinuous. Since C is weakly compact, by Theorem 6.0.10, there exists y0 ∈ C such that kx − y0 k = infy∈C kx − yk, that is, y0 ∈ PC (x).
Corollary 4.5.1. Let C be a nonempty closed convex subset of a reflexive Banach space
X. Then each element x ∈ X has a best approximation in C.
Theorem 4.5.2 (The Uniqueness of Best Approximations). Let C be a nonempty convex subset of a strictly convex Banach space X. Then each x ∈ X has at most one best approximation in C.
Proof. Assume, on the contrary, that y1 , y2 ∈ C are two distinct best approximations to x ∈ X. Since the set of best approximations is convex, (y1 + y2 )/2 is also a best approximation to x. Set r := d(x, C). Then
0 ≤ r = kx − y1 k = kx − y2 k = kx − (y1 + y2 )/2k,
and so,
k(x − y1 ) + (x − y2 )k = 2r = kx − y1 k + kx − y2 k.
If r = 0, then y1 = x = y2 , a contradiction. If r > 0, then by the strict convexity of X, we have
x − y1 = t(x − y2 ), for some t > 0.
Taking norms in this relation, we obtain r = tr, that is, t = 1, which gives y1 = y2 , again a contradiction.
The following example shows that the strict convexity cannot be dropped in Theorem 4.5.2.
Example 4.5.1. Let X = R2 with the norm kxk1 = |x1 | + |x2 | for all x = (x1 , x2 ) ∈ R2 . As we have seen, X is not strictly convex. Let
C = {(x1 , x2 ) ∈ R2 : k(x1 , x2 )k1 ≤ 1} = {(x1 , x2 ) ∈ R2 : |x1 | + |x2 | ≤ 1}.
Then C is a closed convex set. The distance from z = (−1, −1) to the set C is one, and this distance is attained at more than one point of C; for instance, at (−1, 0), at (0, −1), and at every point of the segment joining them.
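The non-uniqueness in Example 4.5.1 can be checked directly; the following snippet verifies a few candidate points (the chosen points are just examples):

```python
# Several points of the l1 unit ball C realise the same distance 1 to z = (-1, -1).
import numpy as np

z = np.array([-1.0, -1.0])
for y in (np.array([-1.0, 0.0]), np.array([0.0, -1.0]), np.array([-0.5, -0.5])):
    print(np.abs(y).sum() <= 1.0, np.abs(z - y).sum())   # True 1.0 for every candidate
```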
The following example shows that the uniqueness of best approximations in Theorem 4.5.2
need not be true for nonconvex sets.
Example 4.5.2. Let X = R2 with the norm kxk2 = (x21 + x22 )1/2 for all x = (x1 , x2 ) ∈ R2 . Let
C = SX = {(x1 , x2 ) ∈ R2 : x21 + x22 = 1}.
Then X is strictly convex and C is not convex. However, all points of C are best approximations to (0, 0) ∈ X.
Theorem 4.5.3. Let X be a Banach space. If every element of X possesses at most one best approximation with respect to every convex subset of X, then X is strictly convex.
Proof. Assume, on the contrary, that X is not strictly convex. Then there exist x, y ∈ X with x 6= y such that
kxk = kyk = k(x + y)/2k = 1.
Furthermore,
ktx + (1 − t)yk = 1, for all t ∈ [0, 1].
Set C := co({x, y}), the convex hull of the set {x, y}. Then k0 − zk = 1 = d(0, C) for all z ∈ C. It follows that every element of C is a best approximation to zero, which contradicts the assumption that zero has at most one best approximation in the convex set C.
From Theorems 4.5.1 and 4.5.2, we obtain the following result.
Theorem 4.5.4. Let C be a nonempty weakly compact convex subset of a strictly convex Banach space X. Then each x ∈ X has a unique best approximation in C, that is, the metric projection PC is a single-valued mapping from X onto C.
Corollary 4.5.2. Let C be a nonempty closed convex subset of a strictly convex reflexive
Banach space X and let x ∈ X. Then there exists a unique element x0 ∈ C such that
kx − x0 k = d(x, C).
5 Appendix: Basic Results from Analysis - I
Definition 5.0.1. A function f : Rn → R ∪ {±∞} is said to be
(a) positively homogeneous if for all x ∈ Rn and all r ≥ 0, f (rx) = rf (x);
(b) subadditive if
f (x + y) ≤ f (x) + f (y),
for all x, y ∈ Rn ;
(c) sublinear if it is positively homogeneous and subadditive;
(d) subodd if for all x ∈ Rn \ {0}, f (x) ≥ −f (−x).
Every real-valued odd function is subodd. It can be seen that the function f : R → R defined
by f (x) = x2 is subodd but it is neither odd nor subadditive.
Remark 5.0.1. (a) It can be easily seen that f is subodd if and only if f (x) + f (−x) ≥ 0,
for all x ∈ Rn \ {0}.
(b) If f is sublinear, is not identically equal to −∞, and satisfies f (0) ≥ 0, then f is subodd.
Definition 5.0.2. Let f : Rn → R ∪ {±∞} be an extended real-valued function.
(a) The effective domain of f is defined as
dom(f ) := {x ∈ Rn : f (x) < +∞}.
(b) The function f is called proper if f (x) < +∞ for at least one x ∈ Rn and f (x) > −∞
for all x ∈ Rn .
(c) The graph of f is defined as
graph(f ) := {(x, y) ∈ Rn × R : y = f (x)}.
(d) The epigraph of f is defined as
epi(f ) := {(x, α) ∈ Rn × R : f (x) ≤ α}.
(e) The hypograph of f is defined as
hyp(f ) := {(x, α) ∈ Rn × R : f (x) ≥ α}.
(f) The lower level set of f at level α ∈ R is defined as
L(f, α) := {x ∈ Rn : f (x) ≤ α}.
(g) The upper level set of f at level α ∈ R is defined as
U(f, α) := {x ∈ Rn : f (x) ≥ α}.
The epigraph (hypograph) is thus a subset of Rn+1 that consists of all the points of Rn+1
lying on or above (on or below) the graph of f . From the above definitions, we have
(x, α) ∈ epi(f ) if and only if x ∈ L(f, α),
and
(x, α) ∈ hyp(f ) if and only if x ∈ U(f, α).
Definition 5.0.3. A function f : Rn → R is said to be
(a) bounded above if there exists a real number M such that f (x) ≤ M, for all x ∈ Rn ;
(b) bounded below if there exists a real number m such that f (x) ≥ m, for all x ∈ Rn ;
(c) bounded if it is bounded above as well as bounded below.
For f : Rn → R ∪ {±∞}, we write
inf f := inf{f (x) : x ∈ Rn },
argminf := argmin{f (x) : x ∈ Rn } := {x ∈ Rn : f (x) = inf f }.
Definition 5.0.4. A function f : Rn → R ∪ {±∞} is said to be lower semicontinuous at a point x ∈ Rn if f (x) ≤ lim inf m→∞ f (xm ) whenever xm → x as m → ∞. f is said to be lower semicontinuous on Rn if it is lower semicontinuous at each point of Rn .
A function f : Rn → R ∪ {±∞} is said to be upper semicontinuous at a point x ∈ Rn if f (x) ≥ lim supm→∞ f (xm ) whenever xm → x as m → ∞. f is said to be upper semicontinuous on Rn if it is upper semicontinuous at each point of Rn .
Remark 5.0.2. A function f : Rn → R is lower (respectively, upper) semicontinuous on
Rn if and only if the lower level set L(f, α) (respectively, the upper level set U(f, α)) is
closed in Rn for all α ∈ R. Also, f is lower (respectively, upper) semicontinuous on Rn if
and only if the epi(f ) (respectively, hyp(f )) is closed. Equivalently, f is lower (respectively,
upper) semicontinuous on Rn if and only if the set {x ∈ Rn : f (x) > α} (respectively, the
set {x ∈ Rn : f (x) < α}) is open in Rn for all α ∈ R.
Definition 5.0.5. A function f : Rn → R is said to be differentiable at x ∈ Rn if there
exists a vector ∇f (x), called the gradient, and a function α : Rn → R such that
f (y) = f (x) + h∇f (x), y − xi + ky − xkα(y − x),
for all y ∈ Rn ,
where limy→x α(y − x) = 0.
If f is differentiable, then
f (x + λv) = f (x) + λh∇f (x), vi + o(λ), for all x + λv ∈ Rn ,
where limλ→0 o(λ)/λ = 0.
The gradient of f at x = (x1 , x2 , . . . , xn ) is the vector in Rn given by
∇f (x) = (∂f (x)/∂x1 , ∂f (x)/∂x2 , . . . , ∂f (x)/∂xn ).
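A standard way to check a gradient computation is by central finite differences; the function f, the evaluation point and the step size below are our own illustrative choices:

```python
# Compare a central-difference gradient with the analytic gradient of
# f(x) = x1^2 + 3*x1*x2, whose gradient is (2*x1 + 3*x2, 3*x1).
import numpy as np

def f(x):
    return x[0] ** 2 + 3.0 * x[0] * x[1]

def numerical_gradient(f, x, h=1e-6):
    g = np.zeros_like(x)
    for i in range(x.size):
        e = np.zeros_like(x)
        e[i] = h
        g[i] = (f(x + e) - f(x - e)) / (2.0 * h)   # central difference
    return g

x = np.array([1.0, 2.0])
print(numerical_gradient(f, x), np.array([2 * x[0] + 3 * x[1], 3 * x[0]]))   # both ~ [8, 3]
```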
Definition 5.0.6. An n × n symmetric matrix M of real numbers is said to be positive
semidefinite if hy, Myi ≥ 0 for all y ∈ Rn . It is called positive definite if hy, Myi > 0 for all
y 6= 0.
Definition 5.0.7. Let f = (f1 , . . . , fℓ ) : Rn → Rℓ be a vector-valued function such that the partial derivative ∂fi (x)/∂xj of fi with respect to xj exists for i = 1, 2, . . . , ℓ and j = 1, 2, . . . , n. Then the Jacobian matrix J(f )(x) is the ℓ × n matrix whose (i, j) entry is ∂fi (x)/∂xj , that is,
J(f )(x) = [∂fi (x)/∂xj ], i = 1, . . . , ℓ, j = 1, . . . , n,
where x = (x1 , x2 , . . . , xn ) ∈ Rn .
Definition 5.0.8. A function f : Rn → R is said to be twice differentiable at x ∈ Rn if there exist a vector ∇f (x) and an n × n symmetric matrix ∇2 f (x), called the Hessian matrix, and a function α : Rn → R such that
f (y) = f (x) + h∇f (x), y − xi + (1/2)hy − x, ∇2 f (x)(y − x)i + ky − xk2 α(y − x), for all y ∈ Rn ,
where limy→x α(y − x) = 0.
If f is twice differentiable, then
f (x + λv) = f (x) + λh∇f (x), vi + (λ2 /2)hv, ∇2 f (x)vi + o(λ2 ), for all x + λv ∈ Rn ,
where limλ→0 o(λ2 )/λ2 = 0.
The Hessian matrix of f at x = (x1 , x2 , . . . , xn ) is the n × n symmetric matrix whose (i, j) entry is ∂ 2 f (x)/∂xi ∂xj , that is,
∇2 f (x) ≡ H(x) = [∂ 2 f (x)/∂xi ∂xj ], i, j = 1, . . . , n.
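The Hessian can be approximated in the same spirit by second-order central differences; again, the test function and step size below are only illustrative assumptions:

```python
# Second-order central differences for the Hessian of f(x) = x1^2 * x2,
# whose Hessian at x is [[2*x2, 2*x1], [2*x1, 0]].
import numpy as np

def f(x):
    return x[0] ** 2 * x[1]

def numerical_hessian(f, x, h=1e-4):
    n = x.size
    H = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            ei = np.zeros(n); ei[i] = h
            ej = np.zeros(n); ej[j] = h
            H[i, j] = (f(x + ei + ej) - f(x + ei - ej)
                       - f(x - ei + ej) + f(x - ei - ej)) / (4.0 * h * h)
    return H

x = np.array([1.0, 2.0])
print(numerical_hessian(f, x))                 # approximately [[4, 2], [2, 0]]
```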
Definition 5.0.9. Let K be a nonempty convex subset of Rn . A function f : K → R is said
to be
(a) convex if for all x, y ∈ K and all λ ∈ [0, 1],
f (λx + (1 − λ)y) ≤ λf (x) + (1 − λ)f (y);
(b) strictly convex if for all x, y ∈ K, x 6= y and all λ ∈ ]0, 1[,
f (λx + (1 − λ)y) < λf (x) + (1 − λ)f (y).
A function f : K → R is said to be (strictly) concave if −f is (strictly) convex.
Geometrically speaking, a function f : K → R defined on a convex subset K of Rn is convex
if the line segment joining any two points on the graph of the function lies on or above the
portion of the graph between these points. Similarly, f is concave if the line segment joining
any two points on the graph of the function lies on or below the portion of the graph between
these points. Also, a function for which the line segment joining any two distinct points on the graph of the function lies strictly above the portion of the graph strictly between these points is referred to as a strictly convex function.
Some examples of convex functions defined on R are f (x) = ex , f (x) = x, f (x) = |x| and f (x) = max{0, x}. The functions f (x) = − log x and f (x) = xα for α < 0 or α > 1 are strictly convex on the interval ]0, ∞[. Clearly, every strictly convex function is convex, but the converse may not be true. For example, the function f (x) = x defined on R is convex but not strictly convex. The function f (x) = |x + x3 | is a nondifferentiable strictly convex function on R; a numerical check of the strict convexity inequality is sketched below.
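The following random sampling check (not a proof; the sample size, interval and tolerance are our own choices) finds no violation of the strict convexity inequality for f (x) = |x + x3 |:

```python
# Sample the strict convexity inequality for f(x) = |x + x^3| on [-3, 3].
import numpy as np

rng = np.random.default_rng(2)
f = lambda x: abs(x + x ** 3)
for _ in range(10000):
    x, y = rng.uniform(-3, 3, size=2)
    lam = rng.uniform(0.01, 0.99)
    if abs(x - y) > 1e-8:
        assert f(lam * x + (1 - lam) * y) < lam * f(x) + (1 - lam) * f(y) + 1e-12
print("no violation found on the sampled points")
```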
Proposition 5.0.1. A function f : K → R defined on a nonempty convex subset K of Rn is convex if and only if its epigraph is a convex set.
6 Appendix: Basic Results from Analysis - II
Theorem 6.0.1 (Finite Intersection Property). A topological space X is compact if and only if for every collection {Cα }α∈Λ of closed sets in X such that every finite subcollection has nonempty intersection, that is, Cα1 ∩ Cα2 ∩ · · · ∩ Cαn 6= ∅ for all α1 , . . . , αn ∈ Λ and n ∈ N, we have ∩α∈Λ Cα 6= ∅.
Theorem 6.0.2 (Hahn-Banach Theorem). Let C be a subspace of a linear space X, p
be a sublinear functional on X and f be a linear functional defined on C such that
f (x) ≤ p(x),
for all x ∈ C.
Then there exists a linear extension F of f such that F (x) ≤ p(x) for all x ∈ X.
The following corollary gives the existence of nontrivial bounded linear functionals on an
arbitrary normed space.
Corollary 6.0.1. Let x be a nonzero element of a normed space X. Then there exists
j ∈ X ∗ such that j(x) = kxk and kjk∗ = 1.
Definition 6.0.1. Let X be a normed space and X ∗ be its dual space. The duality pairing
between X and X ∗ is the functional h., .i : X × X ∗ → R defined by
hx, ji = j(x),
for all x ∈ X and j ∈ X ∗ .
Theorem 6.0.3 (James Theorem). A Banach space X is reflexive if and only if for
each j ∈ SX ∗ , there exists x ∈ SX such that j(x) = 1.
Let X be a Banach space with its dual X ∗ . We say that the sequence {xn } in X converges
to x if limn→∞ kxn − xk = 0. This kind of convergence is also called norm convergence or
strong convergence. This is related to the strong topology on X with neighborhood base
Br (0) = {x ∈ X : kxk < r}, r > 0 at the origin. There is also a weak topology on X
generated by the bounded linear functionals on X. Indeed, a set G ⊆ X is said to be open
in the weak topology if for every x ∈ G, there are bounded linear functionals f1 , f2 , . . . , fn
and positive real numbers ε1 , ε2 , . . . , εn such that
{y ∈ X : |fi (x) − fi (y)| < εi , i = 1, 2, . . . , n} ⊆ G.
Hence a base of neighborhoods of x̄ ∈ X for the weak topology on X is given by the sets
V (f1 , f2 , . . . , fn ; ε) = {x ∈ X : |hx − x̄, fi i| < ε, for all i = 1, 2, . . . , n} ,
where f1 , f2 , . . . , fn ∈ X ∗ , n ∈ N and ε > 0.
In particular, a sequence {xn } in X converges to x with respect to a weak topology σ(X, X ∗ )
if and only if hxn , f i → hx, f i for all f ∈ X ∗ .
Definition 6.0.2. A sequence {xn } in a normed space X is said to converge weakly to x ∈ X
if f (xn ) → f (x) for all f ∈ X ∗ . In this case, we write xn ⇀ x or weak-limn→∞ xn = x.
Definition 6.0.3. A subset C of a normed space X is said to be weakly closed if it is closed
in the weak topology.
Definition 6.0.4. A subset C of a normed space X is said to be weakly compact if it is
compact in the weak topology.
Remark 6.0.1. In finite dimensional spaces, weak convergence and strong convergence are equivalent.
Theorem 6.0.4 (Banach-Alaoglu Theorem). Let X be a normed space and X ∗ be its
dual. Then the unit ball in X ∗ is weak*-compact.
Proposition 6.0.1. Let C be a nonempty convex subset of a normed space X. Then C
is weakly closed if and only if it is closed.
Proposition 6.0.2. Every weakly compact subset of a Banach space is bounded.
Proposition 6.0.3. Every closed convex subset of a weakly compact set is weakly compact.
Theorem 6.0.5 (Kakutani’s Theorem). Let X be a Banach space. Then X is reflexive if and only if the closed unit ball BX := {x ∈ X : kxk ≤ 1} is weakly compact.
Theorem 6.0.6. Let X be a Banach space. Then X is reflexive if and only if every
closed convex bounded subset of X is weakly compact.
Theorem 6.0.7. Let C be a weakly closed subset of a reflexive Banach space X. Then C is weakly compact if and only if C is bounded.
Theorem 6.0.8. Let X be a Banach space. Then X is reflexive if and only if every norm-bounded sequence in X has a weakly convergent subsequence.
Theorem 6.0.9. Let X be a compact topological space and f : X → (−∞, ∞] be a lower
semicontinuous functional. Then there exists an element x̄ ∈ X such that
f (x̄) = infx∈X f (x).
Proof. For all α ∈ R, let Gα := {x ∈ X : f (x) > α}. Since f is lower semicontinuous, Gα is open, and X = ∪α∈R Gα . By compactness of X, there exists a finite subfamily {Gαi }ni=1 of {Gα }α∈R such that
X = Gα1 ∪ Gα2 ∪ · · · ∪ Gαn .
Suppose that α0 = min{α1 , α2 , . . . , αn }. Then f (x) > α0 for all x ∈ X, so f is bounded below and m := inf{f (x) : x ∈ X} is finite. For each β > m, set Fβ := {x ∈ X : f (x) ≤ β}. Then each Fβ is a nonempty closed subset of X, and every finite intersection Fβ1 ∩ · · · ∩ Fβk = Fmin{β1 ,...,βk } is nonempty. Hence, by the finite intersection property (Theorem 6.0.1), we have
∩β>m Fβ 6= ∅.
Therefore, for any point x̄ of this intersection, we have f (x̄) ≤ β for all β > m, and hence f (x̄) = m.
Theorem 6.0.10. Let C be a weakly compact convex subset of a Banach space X and
f : C → (−∞, ∞] be a proper lower semicontinuous convex functional. Then there exists
x̄ ∈ C such that f (x̄) = inf{f (x) : x ∈ C}.
Remark 6.0.2. If f is a strictly convex function in Theorem 6.0.10, then x̄ ∈ C is the
unique point such that f (x̄) = infx∈C f (x).
Recall that every closed convex bounded subset of a reflexive Banach space is weakly compact
(Theorem 6.0.6). Using this fact, we have the following result.
Theorem 6.0.11. Let C be a nonempty closed convex bounded subset of a reflexive Banach space X and f : X → (−∞, ∞] be a proper lower semicontinuous convex functional.
Then there exists x̄ ∈ C such that f (x̄) = infx∈C f (x).
In Theorem 6.0.11, the boundedness of C may be replaced by the following weaker assumption (called the coercivity condition):
limx∈C, kxk→∞ f (x) = ∞.
Theorem 6.0.12. Let C be a nonempty closed convex subset of a reflexive Banach space
X and f : C → (−∞, ∞] be a proper lower semicontinuous convex functional such that
f (xn ) → ∞ whenever {xn } is a sequence in C with kxn k → ∞. Then there exists x̄ ∈ C such that f (x̄) = infx∈C f (x).
Proof. Let m = inf{f (x) : x ∈ C}. Choose a minimizing sequence {xn } in C, that is, f (xn ) → m. If {xn } is not bounded, then there exists a subsequence {xni } of {xn } such that kxni k → ∞. From the coercivity hypothesis, we have f (xni ) → ∞, so that m = ∞, which contradicts the properness of f . Hence {xn } is bounded. Since X is reflexive, by Theorem 6.0.8, there exists a subsequence {xnj } of {xn } such that xnj ⇀ x̄. Since C is closed and convex, it is weakly closed (Proposition 6.0.1), and hence x̄ ∈ C. Since f is convex and lower semicontinuous, it is lower semicontinuous in the weak topology, and therefore
m ≤ f (x̄) ≤ lim inf j→∞ f (xnj ) = limn→∞ f (xn ) = m.
Therefore, f (x̄) = m.
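The following toy computation (entirely our own setup, intended only as an illustration of Theorem 6.0.12) minimizes the coercive convex function f (x) = kx − ak2 over the closed convex unbounded set C = {x ∈ R2 : x1 ≥ 0} by projected gradient steps; the minimizer is the projection of a onto C.

```python
# Projected gradient descent for f(x) = ||x - a||^2 over C = {x in R^2 : x_1 >= 0}.
import numpy as np

a = np.array([-2.0, 1.0])
f = lambda x: np.dot(x - a, x - a)
grad = lambda x: 2.0 * (x - a)
project_C = lambda x: np.array([max(x[0], 0.0), x[1]])   # metric projection onto C

x = np.array([5.0, 5.0])                     # arbitrary starting point in C
for _ in range(200):
    x = project_C(x - 0.1 * grad(x))         # projected gradient step
print(x, f(x))                                # approaches (0, 1) with value 4
```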