MA4G6 Lecture Notes
Introduction to the
Modern Calculus of Variations
Filip Rindler
Spring Term 2015
Filip Rindler
Mathematics Institute
University of Warwick
Coventry CV4 7AL
United Kingdom
F.Rindler@warwick.ac.uk
http://www.warwick.ac.uk/filiprindler
Copyright ©2015 Filip Rindler.
Version 1.1.
Preface
These lecture notes, written for the MA4G6 Calculus of Variations course at the University
of Warwick, are intended to give a modern introduction to the Calculus of Variations. I have tried
to cover different aspects of the field and to explain how they fit into the “big picture”.
This is not an encyclopedic work; many important results are omitted and sometimes I
only present a special case of a more general theorem. I have, however, tried to strike a
balance between a pure introduction and a text that can be used for later revision of forgotten
material.
The presentation is based around a few principles:
• The presentation is quite “modern” in that I use several techniques which are perhaps
not usually found in an introductory text or that have only recently been developed.
• For most results, I try to use “reasonable” assumptions, not necessarily minimal ones.
• When presented with a choice of how to prove a result, I have usually preferred the
(in my opinion) most conceptually clear approach over more “elementary” ones. For
example, I use Young measures in many instances, even though this comes at the
expense of a higher initial burden of abstract theory.
• Wherever possible, I first present an abstract result for general functionals defined on
Banach spaces to illustrate the general structure of a certain result. Only then do I
specialize to the case of integral functionals on Sobolev spaces.
• I do not attempt to trace every result to its original source as many results have evolved
over generations of mathematicians. Some pointers to further literature are given in
the last section of every chapter.
Most of what I present here is not original, I learned it from my teachers and colleagues, both through formal lectures and through informal discussions. Published works
that influenced this course were the lecture notes on microstructure by S. Müller [73], the
encyclopedic work on the Calculus of Variations by B. Dacorogna [25], the book on Young
measures by P. Pedregal [81], Giusti’s more regularity theory-focused introduction to the
Calculus of Variations [44], as well as lecture notes on several related courses by J. Ball,
J. Kristensen, A. Mielke. Further texts on the Calculus of Variations are the elementary
introductions by B. van Brunt [96] and B. Dacorogna [26], the more classical two-part treatise [39, 40] by M. Giaquinta & S. Hildebrandt, as well as the comprehensive treatment of
the geometric measure theory aspects of the Calculus of Variations by Giaquinta, Modica
& Souček [41, 42].
In particular, I would like to thank Alexander Mielke and Jan Kristensen, who are responsible for my choice of research area. Through their generosity and enthusiasm in sharing their knowledge they have provided me with the foundation for my study and research.
Furthermore, I am grateful to Richard Gratwick and the participants of the MA4G6 course
for many helpful comments, corrections, and remarks.
These lecture notes are a living document and I would appreciate comments, corrections
and remarks – however small. They can be sent back to me via e-mail at
F.Rindler@warwick.ac.uk
or anonymously via
http://tinyurl.com/MA4G6-2015
F. R.
January 2015
Contents

Preface

1 Introduction
  1.1 The brachistochrone problem
  1.2 Electrostatics
  1.3 Stationary states in quantum mechanics
  1.4 Hyperelasticity
  1.5 Optimal saving and consumption
  1.6 Sailing against the wind

2 Convexity
  2.1 The Direct Method
  2.2 Functionals with convex integrands
  2.3 Integrands with u-dependence
  2.4 The Euler–Lagrange equation
  2.5 Regularity of minimizers
  2.6 Lavrentiev gap phenomenon
  2.7 Invariances and the Noether theorem
  2.8 Side constraints
  2.9 Notes and bibliographical remarks

3 Quasiconvexity
  3.1 Quasiconvexity
  3.2 Null-Lagrangians
  3.3 Young measures
  3.4 Gradient Young measures
  3.5 Gradient Young measures and rigidity
  3.6 Lower semicontinuity
  3.7 Integrands with u-dependence
  3.8 Regularity of minimizers
  3.9 Notes and bibliographical remarks

4 Polyconvexity
  4.1 Polyconvexity
  4.2 Existence Theorems
  4.3 Global injectivity
  4.4 Notes and bibliographical remarks

5 Relaxation
  5.1 Quasiconvex envelopes
  5.2 Relaxation of integral functionals
  5.3 Young measure relaxation
  5.4 Characterization of gradient Young measures
  5.5 Notes and bibliographical remarks

A Prerequisites
  A.1 General facts and notation
  A.2 Measure theory
  A.3 Functional analysis
  A.4 Sobolev and other function spaces

Bibliography

Index
Chapter 1
Introduction
Physical modeling is a delicate business – trying to balance the validity of the resulting
model, that is, its agreement with nature and its predictive capabilities, with the feasibility
of its mathematical treatment is often highly non-straightforward. In this quest to formulate
a useful picture of an aspect of the world, it turns out on surprisingly many occasions that
the formulation of the model becomes clearer, more compact, or more descriptive, if one
introduces some form of a variational principle, that is, if one can define a quantity, such as
energy or entropy, which obeys a minimization, maximization or saddle-point law.
How “fundamental” such a scalar quantity is perceived depends on the situation: For
example, in classical mechanics, one calls forces conservative if they can be expressed as
the derivative of an “energy potential” along a curve. It turns out that many, if not most,
forces in physics are conservative, which seems to imply that the concept of energy is fundamental. Furthermore, Einstein's famous formula E = mc² expresses the equivalence of
mass and energy, positioning energy as the “most” fundamental quantity, with mass becoming “condensed” energy. On the other hand, the other ubiquitous scalar physical quantity,
the entropy, has a more “artificial” flavor as a measure of missing information in a model,
as is now well understood in Statistical Mechanics, where entropy becomes a derived quantity. We leave it to the field of Philosophy of Science to understand the nature of variational
concepts. In the words of William Thomson, the Lord Kelvin1:
The fact that mathematics does such a good job of describing the Universe is
a mystery that we don’t understand. And a debt that we will probably never be
able to repay.

1 Quotes prove nothing, however. Thomson also said “X-rays will prove to be a hoax”.
Our approach to variational quantities here is a pragmatic one: We see them as providing
structure to a problem and to enable different mathematical, so-called variational, methods.
For instance, in elasticity theory it is usually unrealistic that a system will tend to a global
minimizer by itself, but this does not mean that – as an approximation – such a principle
cannot be useful in practice. On a fundamental level, there probably is no minimization as
such in physics, only quantum-mechanical interactions (or whatever lies “below” quantum
mechanics). However, if we wait long enough, the inherent noise in a realistic system will
move our system around and it will settle with a high probability in a low-energy state.
In this course, we will focus solely on minimization problems for integral functionals
of the form (Ω ⊂ Rd open and bounded)

    ∫_Ω f(x, u(x), ∇u(x)) dx → min,    u : Ω → Rm,
possibly under conditions on the boundary values of u and further side constraints. These
problems form the original core of the Calculus of Variations and are as relevant today as
they have always been. The systematic understanding of these integral functionals starts
in Euler’s and Bernoulli’s times in the late 1600s and the early 1700s, and their study
was boosted into the modern age by Hilbert’s 19th, 20th, 23rd problems, formulated in
1900 [47]. The excitement has never stopped since and new and important discoveries are
made every year.
We start this course by looking at a parade of examples, which we treat at varying levels
of detail. All these problems will be investigated further along the course once we have
developed the necessary mathematical tools.
1.1 The brachistochrone problem
In June 1696 Johann Bernoulli published a problem in the journal Acta Eruditorum, which
can be seen as the time of birth of the Calculus of Variations (the name, however, is from
Leonhard Euler’s 1766 treatise Elementa calculi variationum). Additionally, Bernoulli sent
a letter containing the question to Gottfried Wilhelm Leibniz on 9 June 1696, who returned
his solution only a few days later on 16 June, and commented that the problem tempted him
“like the apple tempted Eve” because of its beauty. Isaac Newton published a solution (after
the problem had reached him) without giving his identity, but Bernoulli identified him “ex
ungue leonem” (from Latin, “by the lion’s claw”). The problem was formulated as follows2 :
Given two points A and B in a vertical [meaning “not horizontal”] plane, one
shall find a curve AMB for a movable point M, on which it travels from A to B
in the shortest time, only driven by its own weight.

2 The German original reads “Wenn in einer verticalen Ebene zwei Punkte A und B gegeben sind, soll
man dem beweglichen Punkte M eine Bahn AMB anweisen, auf welcher er von A ausgehend vermöge seiner
eigenen Schwere in kürzester Zeit nach B gelangt.”
The resulting curve is called the brachistochrone (from Ancient Greek, “shortest time”)
curve.
We formulate the problem as follows: We look for the curve y connecting the origin
(0, 0) to the point (x̄, ȳ), where x̄ > 0, ȳ < 0, such that a point mass m > 0 slides from rest
at (0, 0) to (x̄, ȳ) quickest among all such curves. See Figure 1.1 for several possible slide
paths. We parametrize a point (x, y) on the curve by the time t ≥ 0.

Figure 1.1: Several slide curves from the origin to (x̄, ȳ).

The point mass has kinetic and potential energies

    Ekin = (m/2)[(dx/dt)² + (dy/dt)²] = (m/2)(dx/dt)²[1 + (dy/dx)²],    Epot = mgy,
where g ≈ 9.81 m/s² is the (constant) gravitational acceleration on Earth. The total energy
Ekin + Epot is zero at the beginning and conserved along the path. Hence, for all y we have

    (m/2)(dx/dt)²[1 + (dy/dx)²] = −mgy.
We can solve this for dt/dx (where t = t(x) is the inverse of the x-parametrization) to get

    dt/dx = √[(1 + (y′)²) / (−2gy)]    (dt/dx ≥ 0),

where we wrote y′ = dy/dx. Integrating over the whole x-length along the curve from 0 to x̄,
we get for the total time duration T [y] that

    T [y] = (1/√(2g)) ∫_0^x̄ √[(1 + (y′)²) / (−y)] dx.
We may drop the constant in front of the integral, which does not influence the minimization
problem, and set x̄ = 1 by reparametrization, to arrive at the problem

    F [y] := ∫_0^1 √[(1 + (y′)²) / (−y)] dx → min,
    y(0) = 0, y(1) = ȳ < 0.
Clearly, the integrand is convex in y′ , which will be important for the solution theory. We
will come back to this problem in Example 2.34.
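As a quick numerical aside (not part of the original notes), one can compare the values of F [y] for a few admissible trial curves joining (0, 0) to (1, −1). The sketch below only uses the formula derived above, evaluated by a midpoint rule that copes with the integrable singularity at x = 0; the parabola, which dives more steeply at the start, gives a smaller value than the straight line, in line with the brachistochrone intuition.

    import numpy as np

    # Scaled travel-time functional from Section 1.1 (constant 1/sqrt(2g) dropped,
    # x normalized to [0, 1]); y must be negative on (0, 1] with y(0) = 0.
    def travel_time(y, dy, n=1_000_000):
        x = (np.arange(n) + 0.5) / n          # midpoint rule handles the x = 0 singularity
        integrand = np.sqrt((1.0 + dy(x) ** 2) / (-y(x)))
        return integrand.mean()               # midpoint approximation of the integral

    # Two admissible trial curves joining (0, 0) to (1, -1):
    straight = travel_time(lambda x: -x, lambda x: -np.ones_like(x))
    parabola = travel_time(lambda x: -(2 * x - x ** 2), lambda x: -(2 - 2 * x))

    print(f"straight line: {straight:.3f}")   # about 2.83 (= 2*sqrt(2))
    print(f"parabola:      {parabola:.3f}")   # about 2.64 -- dipping steeply first pays off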
A related problem concerns Fermat’s Principle expressing that light (in vacuum or in a
medium) always takes the fastest path between two points, from which many optical laws,
e.g. about reflection and refraction, can be derived.
1.2 Electrostatics
Consider an electric charge density ρ : R3 → R (in units of C/m3 ) in three-dimensional
vacuum. Let E : R3 → R3 (in V/m) and B : R3 → R3 (in T = Vs/m2 ) be the electric and
magnetic fields, respectively, which we assume to be constant in time (hence electrostatics).
Assuming that R3 is a linear, homogeneous, isotropic electric material, the Gauss law for
electricity reads
    ∇ · E = div E = ρ/ε0,
where ε0 ≈ 8.854 · 10^−12 C/(Vm). Moreover, we have the Faraday law of induction

    ∇ × E = curl E = −dB/dt = 0.
Thus, since E is curl-free, there exists an electric potential φ : R3 → R (in V) such that
E = −∇φ .
Combining this with the Gauss law, we arrive at the Poisson equation,

    ∆φ = ∇ · [∇φ] = −ρ/ε0.    (1.1)
We can also look at electrostatics in a variational way: We use the norming condition
φ (0) = 0. Then, the electric potential energy UE (x; q) of a point charge q > 0 (in C) at point
x ∈ R3 in the electric field E is given by the path integral (which does not depend on the
path chosen since E is a gradient)
    UE(x; q) = − ∫_0^x qE · ds = − ∫_0^1 qE(hx) · x dh = qφ(x).
Thus, the total electric energy of our charge distribution ρ is (the factor 1/2 is necessary to
count mutual reaction forces correctly)

    UE := (1/2) ∫_{R3} ρφ dx = (ε0/2) ∫_{R3} (∇ · E)φ dx,
which has units of CV = J. Using (∇ · E)φ = ∇ · (E φ ) − E · (∇φ ), the divergence theorem,
and the natural assumption that φ vanishes at infinity, we get further
    UE = (ε0/2) ∫_{R3} ∇ · (Eφ) − E · (∇φ) dx = −(ε0/2) ∫_{R3} E · (∇φ) dx = (ε0/2) ∫_{R3} |∇φ|² dx.
The integral ∫_Ω |∇φ|² dx is called the Dirichlet integral.

In Example 2.15 we will see that (1.1) is equivalent to the minimization problem

    UE + ∫_{R3} ρφ dx = ∫_{R3} (ε0/2)|∇φ|² + ρφ dx → min.

The second term is the energy stored in the electric field caused by its interaction with the
charge density ρ.
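As an added illustration (not from the original text) of the Dirichlet integral just introduced: in one dimension, among all functions with fixed boundary values the affine (i.e. harmonic) one has the smallest Dirichlet integral, and any zero-boundary perturbation only increases it. A minimal finite-difference check:

    import numpy as np

    # Among all phi on [0, 1] with phi(0) = 0 and phi(1) = 1, the affine function
    # minimizes the Dirichlet integral  int |phi'|^2 dx.
    n = 1000
    x = np.linspace(0.0, 1.0, n + 1)
    h = 1.0 / n

    def dirichlet(phi):
        return np.sum(np.diff(phi) ** 2) / h    # forward-difference approximation

    affine = x.copy()                            # the harmonic competitor
    print(dirichlet(affine))                     # 1.0

    rng = np.random.default_rng(0)
    for _ in range(5):
        # random zero-boundary perturbation built from a few sine modes
        bump = sum(rng.normal() * np.sin(k * np.pi * x) for k in range(1, 6))
        print(dirichlet(affine + 0.1 * bump))    # always >= 1.0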
1.3 Stationary states in quantum mechanics
The non-relativistic evolution of a quantum mechanical system with N degrees of freedom
in an electrical field is described completely through its wave function Ψ : RN × R → C
that satisfies the Schrödinger equation
    ih̄ (d/dt)Ψ(x,t) = [ −h̄²/(2µ) ∆ + V(x,t) ] Ψ(x,t),    x ∈ RN, t ∈ R,
where h̄ ≈ 1.05 · 10^−34 Js is the reduced Planck constant, µ > 0 is the reduced mass, and
V = V (x,t) ∈ R is the potential energy. The (self-adjoint) operator H := −(2µ)^−1 h̄²∆ + V is
called the Hamiltonian of the system.
The value of the wave function itself at a given point in spacetime has no physical interpretation, but according to the Copenhagen Interpretation of quantum mechanics, |Ψ(x)|2
is the probability density of finding a particle at the point x. In order for |Ψ|2 to be a probability density, we need to impose the side constraint
∥Ψ∥L2 = 1.
In particular, Ψ(x,t) has to decay as |x| → ∞.
Of particular interest are the so-called stationary states, that is, solutions of the stationary Schrödinger equation

    [ −h̄²/(2µ) ∆ + V(x) ] Ψ(x) = EΨ(x),    x ∈ RN,
where E > 0 is an energy level. We then solve this eigenvalue problem (in a weak sense) for
Ψ ∈ W1,2 (RN ; C) with ∥Ψ∥L2 = 1 (see Appendix A.4 for a collection of facts about Sobolev
spaces like W1,2 ).
If we are just interested in the lowest-energy state, the so-called ground state, we can
find minimizers of the energy functional
    E [Ψ] := ∫_{RN} h̄²/(2µ) |∇Ψ(x)|² + (1/2)V(x)|Ψ(x)|² dx,
again under the side constraint
∥Ψ∥L2 = 1.
The two parts of the integral above correspond to kinetic and potential energy, respectively.
We will continue this investigation in Example 2.39.
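To make the link between the constrained minimization and the eigenvalue problem concrete, here is a small finite-difference sketch (an added illustration, not from the notes). It assumes h̄ = µ = 1 and V(x) = x² on a truncated line, so that minimizing the discretized energy over the unit L²-sphere is exactly the lowest-eigenvalue problem for the harmonic-oscillator Hamiltonian −(1/2) d²/dx² + (1/2) x², whose ground-state energy is known to be 1/2.

    import numpy as np

    # Finite-difference sketch (hbar = mu = 1, V(x) = x^2): the ground state is the
    # minimizer of the discretized energy functional over the unit sphere, i.e. the
    # eigenvector of the discrete Hamiltonian with the smallest eigenvalue.
    L, n = 10.0, 800
    x = np.linspace(-L, L, n)
    h = x[1] - x[0]

    # H = -1/2 d^2/dx^2 + 1/2 V(x), central differences, Dirichlet boundary conditions
    lap = (np.diag(np.full(n, -2.0)) + np.diag(np.ones(n - 1), 1)
           + np.diag(np.ones(n - 1), -1)) / h ** 2
    H = -0.5 * lap + np.diag(0.5 * x ** 2)

    energies, states = np.linalg.eigh(H)         # symmetric eigenvalue problem
    psi0 = states[:, 0] / np.sqrt(h)             # normalize so that sum |psi0|^2 h = 1

    print(energies[0])                           # about 0.5, the exact ground-state energy
    print(np.sum(np.abs(psi0) ** 2) * h)         # 1.0 -- the L^2 side constraint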
1.4 Hyperelasticity
Elasticity theory is one of the most important theories of continuum mechanics, that is, the
study of the mechanics of (idealized) continuous media. We will not go into much detail
about elasticity modeling here and refer to the book [20] for a thorough introduction.
Consider a body of mass occupying a domain Ω ⊂ R3 , called the reference configuration. If we deform the body, any material point x ∈ Ω is mapped into a spatial point
y(x) ∈ R3 and we call y(Ω) the deformed configuration. For a suitable continuum mechanics theory, we also need to require that y : Ω → y(Ω) is a differentiable bijection and that it
is orientation-preserving, i.e. that
    det ∇y(x) > 0    for all x ∈ Ω.
For convenience one also introduces the displacement
u(x) := y(x) − x.
One can show (using the implicit function theorem and the mean value theorem) that the
orientation-preserving condition and the invertibility are satisfied if

    sup_{x∈Ω} |∇u(x)| < δ(Ω)    (1.2)
for a domain-dependent (small) constant δ (Ω) > 0.
Next, we need a measure of local “stretching”, called a strain tensor, which should serve
as the parameter to a local energy density. On physical grounds, rigid body motions, that is,
deformations u(x) = Rx+u0 with a rotation R ∈ R3×3 (RT = R−1 and det R = 1) and u0 ∈ R3 ,
should not cause strain. In this sense, strain measures the deviation of the displacement from
a rigid body motion. One popular choice is the Green–St. Venant strain tensor3
    E := (1/2)(∇u + ∇uT + ∇uT ∇u).    (1.3)
We first consider fully nonlinear (“finite strain”) elasticity. For our purposes we simply
postulate the existence of a stored-energy density W : R3×3 → [0, ∞] and an external body
force field b : Ω → R3 (e.g. gravity) such that
    F [y] := ∫_Ω W(∇y) − b · y dx

represents the total elastic energy stored in the system. If the elastic energy can be written
in this way as ∫_Ω W(∇y(x)) dx, we call the material hyperelastic. In applications, W is
often given as depending on the Green–St. Venant strain tensor E instead of ∇y, but for our
mathematical theory, the above form is more convenient. We require several properties of
W for arguments F ∈ R3×3 :
3 There is an array of choices for the strain tensor, but they are mostly equivalent within nonlinear
elasticity.
(i) The undeformed state costs no energy: W (I) = 0.
(ii) Invariance under change of frame4 : W (RF) = W (F) for all R ∈ SO(3).
(iii) Infinite compression costs infinite energy: W (F) → +∞ as det F ↓ 0.
(iv) Infinite stretching costs infinite energy: W (F) → +∞ as |F| → ∞.
The main problem of nonlinear hyperelasticity is to minimize F as above over all
y : Ω → R3 with given boundary values. Of course, it is not a-priori clear in which space
we should look for a solution. Indeed, this depends on the growth properties of W . For
example, for the prototypical choice
    W (F) := dist(F, SO(3))²,

which however does not satisfy (iii) from our list of requirements, we would look for square
integrable functions. More realistic in applications are W of the form
    W (F) := ∑_{i=1}^M ai tr[ (FT F)^(γi/2) ] + ∑_{j=1}^N bj tr[ cof(FT F)^(δj/2) ] + Γ(det F),    F ∈ R3×3,
where ai > 0, γi ≥ 1, b j > 0, δ j ≥ 1, and Γ : R → R ∪ {+∞} is a convex function with
Γ(d) → +∞ as d ↓ 0, Γ(d) = +∞ for d ≤ 0. These materials are called Ogden materials
and occur in a wide range of applications.
In the setting of linearized elasticity, we make the “small strain” assumption that the
quadratic term in (1.3) can be neglected and that (1.2) holds. In this case, we work with the
linearized strain tensor
    E u := (1/2)(∇u + ∇uT).
In this infinitesimal deformation setting, the displacements that do not create strain are
precisely the skew-affine maps5 u(x) = Qx + u0 with QT = −Q and u0 ∈ R3.
For linearized elasticity we consider an energy of the special “quadratic” form
    F [u] := ∫_Ω (1/2) E u : C E u − b · u dx,
where C(x) = Cijkl(x) is a symmetric, strictly positive definite (A : C(x)A > c|A|² for some
c > 0) fourth-order tensor, called the elasticity/stiffness tensor, and b : Ω → R3 is our
external body force. Thus we will always look for solutions in W1,2 (Ω; R3 ).
For homogeneous, isotropic media, C does not depend on x or the direction of strain,
which translates into the additional condition
    (AR) : C(x)(AR) = A : C(x)A    for all x ∈ Ω, A ∈ R3×3 and R ∈ SO(3).
4 Rotating or translating the (inertial) frame of reference must give the same result.
5 This becomes more meaningful when considering a bit more algebra: The Lie group SO(3) of rotations
has as its Lie algebra Lie(SO(3)) = so(3) the space of all skew-symmetric matrices, which then can be seen as
“infinitesimal rotations”.
In this case, it can be shown that F simplifies to

    F [u] = (1/2) ∫_Ω 2µ|E u|² + (κ − (2/3)µ)|tr E u|² − b · u dx
for µ > 0 the shear modulus and κ > 0 the bulk modulus, which are material constants. For
example, for cold-rolled steel µ ≈ 75 GPa and κ ≈ 160 GPa.
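As a quick sanity check on the two strain measures (a numerical aside, not in the original notes): a rigid rotation y(x) = Rx produces exactly zero Green–St. Venant strain, while the linearized strain of the same displacement is nonzero of order θ² for a rotation angle θ – which is precisely why the linearized theory is confined to the small-strain regime (1.2).

    import numpy as np

    # Deformation y(x) = R x (a rigid rotation), so grad u = R - I.
    theta = 0.4                                   # a not-so-small rotation angle (radians)
    R = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                  [np.sin(theta),  np.cos(theta), 0.0],
                  [0.0,            0.0,           1.0]])
    grad_u = R - np.eye(3)

    # Green-St. Venant strain (1.3): vanishes exactly for rigid body motions
    E_nonlin = 0.5 * (grad_u + grad_u.T + grad_u.T @ grad_u)
    # Linearized strain: only the symmetric part, quadratic term dropped
    E_lin = 0.5 * (grad_u + grad_u.T)

    print(np.linalg.norm(E_nonlin))               # ~1e-16, i.e. zero up to round-off
    print(np.linalg.norm(E_lin))                  # clearly nonzero, of order theta^2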
1.5 Optimal saving and consumption
Consider a capitalist worker earning a (constant) wage w per year, which he can either spend
on consumption or save. Denote by S(t) the savings at time t, where t ∈ [0, T ] is in years,
with t = 0 denoting the beginning of his work life and t = T his retirement. Further, let
C(t) be the consumption rate (consumption per time) at time t. On the saved capital, the
worker earns interest, say with gross-continuous rate ρ > 0, meaning that a capital amount
m > 0 grows as exp(ρ t)m. If we were given an APR ρ1 > 0 instead of ρ , we could compute
ρ = ln(1 + ρ1 ). We further assume that salary is paid continuously, not in intervals, for
simplicity. So, w is really the rate of pay, given in money per time. Then, the worker’s
savings evolve according to the differential equation
    Ṡ(t) = w + ρS(t) − C(t).    (1.4)
Now, assume that our worker is mathematically inclined and wants to optimize the
happiness due to consumption in his life by finding the optimal amount of consumption
at any given time. Being a pure capitalist, the worker’s happiness only depends on his
consumption rate C. So, if we denote by U(C) the utility function, that is, the “happiness”
due to the consumption rate C, our worker wants to find C : [0, T ] → R such that
    H [C] := ∫_0^T U(C(t)) dt
is maximized. The choice of U depends on our worker’s personality, but it is probably realistic to assume that there is a law of diminishing returns, i.e. for twice as much consumption,
our worker is happier, but not twice as happy. So, let us assume U ′ > 0 and U ′ (C) → 0 as
C → ∞. Also, we should have U(0) = −∞ (starvation). Moreover, it is realistic for U to
be concave, which implies that there are no local maxima. One function that satisfies all of
these requirements is the logarithm, and so we use
U(C) = ln(C),
C > 0.
Let us also assume that the worker starts with no savings, S(0) = 0, and at the end
of his work life wants to retire with savings S(T ) = ST ≥ 0. Rearranging (1.4) for C(t)
and plugging this into the formula for H , we therefore want to solve the optimal saving
problem

    F [S] := ∫_0^T − ln(w + ρS(t) − Ṡ(t)) dt → min,
    S(0) = 0, S(T) = ST ≥ 0.
This will be solved in Example 2.18.
Figure 1.2: Sailing against the wind in a channel.
1.6 Sailing against the wind
Every sailor knows how to sail against the wind by “beating”: One has to sail at an angle of
approximately 45◦ to the wind6 , then tack (turn the bow through the wind) and finally, after
the sail has caught the wind on the other side, continue again at approximately 45◦ to the
wind. Repeating this procedure makes the boat follow a zig-zag motion, which gives a net
movement directly against the wind, see Figure 1.2. A mathematically inclined sailor might
ask the question of “how often to tack”. In an idealized model we can assume that tacking
costs no time and that the forward sailing speed vs of the boat depends on the angle α to the
wind as follows (at least qualitatively):
vs (α ) = − cos(4α ),
which has maxima at α = ±45◦ (in real boats, the maximum might be at a lower angle, i.e.
“closer to the wind”). Assume furthermore that our sailor is sailing along a straight river
with the current. Now, the current is fastest in the middle of the river and goes down to zero
at the banks. In fact, a good approximation would be the formula of Poiseuille (channel)
flow, which can be derived from the flow equations of fluids: At distance r from the center
of the river the current’s flow speed is approximately
    vc(r) := vmax (1 − r²/R²),
where R > 0 is half the width of the river.
6 The optimal angle depends on the boat. Ask a sailor about this and you will get an evening-filling lecture.
If we denote by r(t) the distance of our boat from the middle of the channel at time
t ∈ [0, T ], then the total speed (called the “velocity made good” in sailing parlance) is
    v(t) := vs(arctan r′(t)) + vc(r(t)) = − cos(4 arctan r′(t)) + vmax (1 − r(t)²/R²).
The key to understanding this problem is the fact that a ↦ − cos(4 arctan a) has two maxima at a = ±1. We say that this function is a double-well potential.
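This double-well structure is easy to confirm numerically; the following throwaway check (added for illustration) samples a ↦ − cos(4 arctan a) and verifies that its maximal value 1 is attained at the slopes a = ±1, i.e. at the ±45° beating angles.

    import numpy as np

    # The forward speed as a function of the slope a = r'(t):
    # f(a) = -cos(4 arctan a) has exactly two maxima, at a = +1 and a = -1.
    a = np.linspace(-3.0, 3.0, 6001)
    f = -np.cos(4.0 * np.arctan(a))

    print(f.max())                      # 1.0: best achievable contribution of v_s
    print(a[f > 1.0 - 1e-9])            # [-1.  1.]: beat at 45 degrees to the wind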
The total forward distance traveled over the time interval [0, T ] is

    ∫_0^T v(t) dt = ∫_0^T − cos(4 arctan r′(t)) + vmax (1 − r(t)²/R²) dt.
If we also require the initial and terminal conditions r(0) = r(T ) = 0, we arrive at the
optimal beating problem:

    F [r] := ∫_0^T cos(4 arctan r′(t)) − vmax (1 − r(t)²/R²) dt → min,
    r(0) = r(T) = 0,    |r(t)| ≤ R.
Our intuition tells us that in this idealized model, where tacking costs no time, we should
be tacking “infinitely fast” in order to stay in the middle of the river. Later, once we have
advanced tools at our disposal, we will make this idea precise, see Example 5.6.
Chapter 2
Convexity
In the introduction we got a glimpse of the many applications of minimization problems in
physics, technology, and economics. In this chapter we start to develop the abstract theory.
Consider a minimization problem of the form

    F [u] := ∫_Ω f(x, u(x), ∇u(x)) dx → min,
    over all u ∈ W1,p (Ω; Rm) with u|∂Ω = g.
Here, and throughout the course (if not otherwise stated) we will make the standard assumption that Ω ⊂ Rd is a bounded Lipschitz domain, that is, Ω is open, bounded, and has a
Lipschitz boundary. Furthermore, we assume 1 < p < ∞ and that the integrand

    f : Ω × Rm × Rm×d → R

is measurable in the first and continuous in the second and third arguments (hence f is a
Carathéodory integrand). For the prescribed boundary values g we assume
g ∈ W1−1/p,p (∂ Ω; Rm ).
In this context recall that W1−1/p,p (∂ Ω; Rm ) is the space of all traces of functions u ∈
W1,p (Ω; Rm ), see Appendix A.4 for some background on Sobolev spaces. These conditions
will be assumed from now on unless stated otherwise. Furthermore, to make the above
integral finite, we assume the growth bound
    | f(x, v, A)| ≤ M(1 + |v|^p + |A|^p)    for some M > 0.
Below, we will investigate the solvability and regularity properties of this minimization
problem; in particular, we will take a close look at the way in which convexity properties
of f in its gradient (third) argument determine whether F is lower semicontinuous, which
is the decisive attribute in the fundamental Direct Method of the Calculus of Variations.
We will also consider the partial differential equation associated with the minimization
problem, the so-called Euler–Lagrange equation that is a necessary (but not sufficient) condition for minimizers. This leads naturally to the question of regularity of solutions. Finally,
we also treat invariances, leading to the famous Noether theorem before having a glimpse
at side constraints.
Most results here are already formulated for vector-valued u, as this has many applications. Some aspects, however, in particular regularity theory, are specific to scalar problems,
where u is assumed to be merely real-valued.
2.1 The Direct Method
Fundamental to all of the existence theorems in this text is the conceptually simple Direct
Method1 of the Calculus of Variations. Let X be a topological space2 (e.g. a Banach space
with the strong or weak topology) and let F : X → R ∪ {+∞} be our objective functional
satisfying the following two assumptions:
(H1) Coercivity: For all Λ ∈ R, the sublevel set { x ∈ X : F [x] ≤ Λ } is sequentially relatively compact, that is, if F [x j ] ≤ Λ for a sequence (x j ) ⊂ X and some Λ ∈ R, then
(x j ) has a convergent subsequence.
(H2) Lower semicontinuity: For all sequences (x j ) ⊂ X with x j → x it holds that
    F [x] ≤ lim inf_{j→∞} F [x j ].
Consider the abstract minimization problem
    F [x] → min    over all x ∈ X.    (2.1)
Then, the Direct Method is the following simple strategy:
Theorem 2.1 (Direct Method). Assume that F is both coercive and lower semicontinuous. Then, the abstract problem (2.1) has at least one solution, that is, there exists x∗ ∈ X
with F [x∗ ] = min { F [x] : x ∈ X }.
Proof. Let us assume that there exists at least one x ∈ X such that F [x] < +∞; otherwise,
any x ∈ X is a “solution” to the (degenerate) minimization problem.
To construct a minimizer we take a minimizing sequence (x j ) ⊂ X such that
    lim_{j→∞} F [x j ] = α := inf { F [x] : x ∈ X } < +∞.
Then, since convergent sequences are bounded, there exists Λ ∈ R such that F [x j ] ≤ Λ for
all j ∈ N. Hence, by the coercivity (H1), the sequence (x j ) is contained in a compact subset
1 “Direct” refers to the fact that we construct solutions without employing a differential equation.
2 In fact, we only need a space together with a notion of convergence as all our arguments will involve
sequences as opposed to more general topological tools such as nets. In all of this text we use “topology” and
“convergence” interchangeably.
of X. Now select a subsequence, which here as in the following we do not renumber, such
that
    x j → x∗ ∈ X.

By the lower semicontinuity (H2), we immediately conclude

    α ≤ F [x∗ ] ≤ lim inf_{j→∞} F [x j ] = α.
Thus, F [x∗ ] = α and x∗ is the sought minimizer.
Example 2.2. Using the Direct Method, one can easily see that
    h(t) := { 1 − t  if t < 0;  t  if t ≥ 0 },    t ∈ R (= X),

has minimizer t = 0, despite not even being continuous there.
Despite its nearly trivial proof, the Direct Method is very useful and flexible in applications. Of course, neither coercivity nor lower semicontinuity are necessary for the existence
of a minimizer. For instance, we really only needed coercivity and lower semicontinuity
along a minimizing sequence, but this weaker condition is hard to check in most practical
situations; however, such a refined approach can be useful in special cases.
The abstract principle of the Direct Method pushes the difficulty in proving existence
of a minimizer toward establishing coercivity and lower semicontinuity. This, however, is a
big advantage, since we have many tools at our disposal to establish these two hypotheses
separately. In particular, for integral functionals lower semicontinuity is tightly linked to
convexity properties of the integrand, as we will see throughout this course.
At this point it is crucial to observe how coercivity and lower semicontinuity interact
with the topology (or, more accurately, convergence) in X: If we choose a stronger topology,
i.e. one for which there are fewer convergent sequences, then it becomes easier for F to
be lower semicontinuous, but harder for F to be coercive. The opposite holds if choosing
a weaker topology. In the mathematical treatment of a problem we are most likely in a
situation where F and X are given by the concrete problem. We then need to find a suitable
topology in which we can establish both coercivity and lower semicontinuity.
In this text, X will always be an infinite-dimensional Banach space and we have a real
choice between using the strong or weak convergence. Usually, it turns out that coercivity with respect to the strong convergence is false since strongly compact sets in infinite-dimensional spaces are very restricted, whereas coercivity with respect to the weak convergence is true under reasonable assumptions. However, whereas strong lower semicontinuity
poses few challenges, lower semicontinuity with respect to weakly converging sequences is
a delicate matter and we will spend considerable time on this topic.
We will always use the Direct Method in the following version:
Theorem 2.3 (Direct Method for weak convergence). Let X be a Banach space and let
F : X → R ∪ {+∞}. Assume the following:
(WH1) Weak Coercivity: For all Λ ∈ R, the sublevel set { x ∈ X : F [x] ≤ Λ } is sequentially weakly relatively compact,
that is, if F [x j ] ≤ Λ for a sequence (x j ) ⊂ X and some Λ ∈ R, then (x j ) has a weakly
convergent subsequence.

(WH2) Weak lower semicontinuity: For all sequences (x j ) ⊂ X with x j ⇀ x (weak convergence in X) it holds that

    F [x] ≤ lim inf_{j→∞} F [x j ].
Then, the minimization problem

    F [x] → min    over all x ∈ X

has at least one solution.
Before we move on to the more concrete theory, let us record one further remark: While
it might appear as if “nature does not give us the topology” and it is up to mathematicians to
“invent” a suitable one, it is remarkable that the topology that turns out to be mathematically
convenient is also often physically relevant. This is for instance expressed in the following
observation: The only phenomena which are present in weakly but not strongly converging
sequences are oscillations and concentrations, as can be seen in the classical Vitali Convergence Theorem A.6. On the other hand, these phenomena very much represent physical
effects, for example in crystal microstructure. Thus, the fact that we work with the weak
convergence means that we have to consider microstructure (which might or might not be
exhibited), a fact that we will come back to many times throughout this course.
2.2 Functionals with convex integrands
Returning to the minimization problem at the beginning of this chapter, we first consider
the minimization problem for the simpler functional
    F [u] := ∫_Ω f(x, ∇u(x)) dx
over all u ∈ W1,p (Ω; Rm ) for some 1 < p < ∞ to be chosen later (depending on lower growth
properties of f ). The following lemma together with (2.3) allows us to conclude that F [u]
is indeed well-defined.
Lemma 2.4. Let f : Ω × RN → R be a Carathéodory integrand, that is,
(i) x ↦ f (x, A) is Lebesgue-measurable for all fixed A ∈ RN.
(ii) A ↦ f (x, A) is continuous for all fixed x ∈ Ω.
Then, for any Lebesgue-measurable function V : Ω → RN the composition x 7→ f (x,V (x))
is Lebesgue-measurable.
We will prove this lemma via the following “Lusin-type” theorem for Carathéodory
functions:
Theorem 2.5 (Scorza Dragoni). Let f : Ω × RN → R be Carathéodory. Then, there exists
an increasing sequence of compact sets Sk ⊂ Ω (k ∈ N) with |Ω \ Sk | ↓ 0 such that f |Sk ×RN
is continuous.
See Theorem 6.35 in [37] for a proof of this theorem, which is rather measure-theoretic
and similar to the proof of the Lusin theorem, cf. Theorem 2.24 in [85].
Proof of Lemma 2.4. Let Sk ⊂ Ω (k ∈ N) be as in the Scorza Dragoni Theorem. Set fk :=
f |Sk ×RN and
gk (x) := fk (x,V (x)),
x ∈ Sk .
Then, for any open set E ⊂ R, the pre-image of E under gk is the set of all x ∈ Sk such that
(x,V (x)) ∈ fk^−1(E). As fk is continuous, fk^−1(E) is open and from the Lebesgue measurability of the product function x ↦ (x,V (x)) we infer that gk^−1(E) is a Lebesgue-measurable
subset of Sk . We can extend the definition of gk to all of Ω by setting gk (x) := 0 whenever
x ∈ Ω \ Sk . This function is still Lebesgue-measurable as Sk is compact, hence measurable. The conclusion follows from the fact that gk (x) → f (x,V (x)) and pointwise limits of
Lebesgue-measurable functions are themselves Lebesgue-measurable.
We next investigate coercivity: Let 1 < p < ∞. The most basic assumption, and the
only one we want to consider here, to guarantee coercivity is the following lower growth
(coercivity) bound:

    µ|A|^p − µ^−1 ≤ f(x, A)    for all x ∈ Ω, A ∈ Rm×d and some µ > 0.    (2.2)
This coercivity also suggests the exponent p for the definition of the function spaces where
we look for solutions. We further assume the upper growth bound
    | f(x, A)| ≤ M(1 + |A|^p)    for all x ∈ Ω, A ∈ Rm×d and some M > 0,    (2.3)
which gives finiteness of F [u] for all u ∈ W1,p (Ω; Rm ).
Proposition 2.6. If f satisfies the lower growth bound (2.2), then F is weakly coercive on
the space W1,p_g (Ω; Rm) = { u ∈ W1,p (Ω; Rm) : u|∂Ω = g }.

Proof. We need to show that any sequence (u j ) ⊂ W1,p_g (Ω; Rm) with

    sup_j F [u j ] < ∞
is weakly relatively compact. From (2.2) we get
    µ · sup_j ∫_Ω |∇u j |^p dx − |Ω|/µ ≤ sup_j F [u j ] < ∞,
whereby sup j ∥∇u j ∥L p < ∞. By the Poincaré–Friedrichs inequality, Theorem A.16 (I), in
conjunction with the fact that we fix the boundary values, we therefore get sup j ∥u j ∥W1,p <
∞. We conclude by the fact that bounded sets in reflexive Banach spaces, like W1,p (Ω; Rm )
for 1 < p < ∞, are sequentially weakly relatively compact, see Theorem A.11.
Having settled the question of (weak) coercivity, we can now turn to investigate the
weak lower semicontinuity.
Theorem 2.7. If A ↦ f (x, A) is convex for all x ∈ Ω and f (x, A) ≥ κ for some κ ∈ R (which
follows in particular from (2.2)), then F is weakly lower semicontinuous on W1,p (Ω; Rm ).
Proof. Step 1. We first establish that F is strongly lower semicontinuous, so let u j → u
in W1,p and ∇u j → ∇u almost everywhere (which holds after selecting a non-renumbered
subsequence, see Appendix A.2). By assumption we have
    f(x, ∇u j (x)) − κ ≥ 0.

Thus, applying Fatou's Lemma,

    lim inf_{j→∞} ( F [u j ] − κ|Ω| ) ≥ F [u] − κ|Ω|.
Thus, F [u] ≤ lim inf j→∞ F [u j ]. Since this holds for all subsequences, it also follows for
our original sequence.
Step 2. To show weak lower semicontinuity take (u j ) ⊂ W1,p (Ω; Rm ) with u j ⇀ u. We
need to show that
    F [u] ≤ lim inf_{j→∞} F [u j ] =: α.    (2.4)
Taking a subsequence (not relabeled), we can in fact assume that F [u j ] converges to α .
By the Mazur Lemma A.14, we may find convex combinations
    v j = ∑_{n=j}^{N(j)} θ j,n u n ,    where θ j,n ∈ [0, 1] and ∑_{n=j}^{N(j)} θ j,n = 1,

such that v j → u strongly. Since f (x, ·) is convex,

    F [v j ] = ∫_Ω f ( x, ∑_{n=j}^{N(j)} θ j,n ∇u n (x) ) dx ≤ ∑_{n=j}^{N(j)} θ j,n F [u n ].
Now, F [un ] → α as n → ∞ and n ≥ j inside the sum. Thus,
    lim inf_{j→∞} F [v j ] ≤ α = lim inf_{j→∞} F [u j ].
On the other hand, from the first step and v j → u strongly, we have F [u] ≤ lim inf j→∞ F [v j ].
Thus, (2.4) follows and the proof is finished.
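The following small numerical illustration (an addition, not from the notes) shows why no more than lower semicontinuity can be expected along weakly converging sequences: for u_j(x) = sin(jπx)/(jπ) on (0, 1) we have u_j ⇀ 0 in W1,2 (the gradients merely oscillate), yet the Dirichlet integrals stay at 1/2 while the weak limit has Dirichlet integral 0, so the inequality in (H2)/(WH2) is strict here.

    import numpy as np

    # Oscillating sequence u_j(x) = sin(j*pi*x)/(j*pi) on (0, 1): it converges
    # weakly (not strongly) to u = 0 in W^{1,2}, yet the Dirichlet integrals
    # stay at 1/2, so F[u] <= liminf F[u_j] holds with strict inequality.
    n = 1_000_000
    x = (np.arange(n) + 0.5) / n                  # midpoint grid on (0, 1)

    def dirichlet(j):
        grad = np.cos(j * np.pi * x)              # = u_j'(x)
        return (grad ** 2).mean()                 # midpoint approximation of the integral

    for j in (1, 10, 100, 1000):
        print(j, dirichlet(j))                    # all (essentially) 0.5; F[limit] = 0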
We can summarize our findings in the following theorem:
Theorem 2.8. Let f : Ω × Rm×d → R be a Carathéodory integrand such that f satisfies the
lower growth bound (2.2), the upper growth bound (2.3) and is convex in its second argument. Then, the associated functional F has a minimizer over the space W1,p_g (Ω; Rm).
Proof. This follows immediately from the Direct Method in Banach spaces, Theorem 2.3
together with Proposition 2.6 and Theorem 2.7.
Example 2.9. The Dirichlet integral (functional) is
    F [u] := ∫_Ω |∇u(x)|² dx,    u ∈ W1,2 (Ω).
We already encountered this integral functional when considering electrostatics in Section 1.2. It is easy to see that this functional satisfies all requirements of Theorem 2.8
and so there exists a minimizer for any prescribed boundary values g ∈ W1/2,2 (∂ Ω).
Example 2.10. In the prototypical problem of linearized elasticity from Section 1.4, we
are tasked with the minimization problem

    F [u] := (1/2) ∫_Ω 2µ|E u|² + (κ − (2/3)µ)|tr E u|² − f · u dx → min,
    over all u ∈ W1,2 (Ω; R3) with u|∂Ω = g,
where µ, κ > 0, f ∈ L2 (Ω; R3), and g ∈ W1/2,2 (∂Ω; R3). It is clear that F has quadratic
growth. Let us first consider the coercivity: For u ∈ W1,2 (Ω; R3 ) with u|∂ Ω = g we have the
following version of the L2 -Korn inequality
    ∥u∥²_W1,2 ≤ CK ( ∥E u∥²_L2 + ∥g∥²_W1/2,2 ).
Here, CK = CK (Ω) > 0 is a constant. Let us for a simplified estimate assume that κ − (2/3)µ ≥ 0
(this is not necessary, but shortens the argument). Then, also using the Young inequality,
we get for any δ > 0,
    F [u] ≥ µ∥E u∥²_L2 − ∥ f ∥_L2 ∥u∥_L2
         ≥ µ( ∥E u∥²_L2 + ∥g∥²_W1/2,2 ) − (1/(2δ))∥ f ∥²_L2 − (δ/2)∥u∥²_L2 − µ∥g∥²_W1/2,2
         ≥ ( µ/CK − δ/2 ) ∥u∥²_W1,2 − (1/(2δ))∥ f ∥²_L2 − µ∥g∥²_W1/2,2.
Choosing δ = µ /CK , we get the coercivity estimate
    F [u] ≥ (µ/(2CK)) ∥u∥²_W1,2 − (CK/(2µ)) ∥ f ∥²_L2 − µ∥g∥²_W1/2,2.
Hence, F [u] controls ∥u∥W1,2 and our functional is weakly coercive. Moreover, it is clear
that our integrand is convex in the E u-argument (note that the trace function is linear).
Hence, Theorem 2.8 yields the existence of a solution u∗ ∈ W1,2 (Ω; R3 ) to our minimization
problem of linearized elasticity.
We finish this section with the following converse to Theorem 2.7:
Proposition 2.11. Let F be an integral functional with a Carathéodory integrand f : Ω ×
Rm×d → R satisfying the upper growth bound (2.3). Assume furthermore that F is weakly
lower semicontinuous on W1,p (Ω; Rm ). If either m = 1 or d = 1 (the scalar case), then
A ↦ f (x, A) is convex for almost every x ∈ Ω.
Proof. We will not prove the full result here, since Proposition 3.27 together with Corollary 3.4 in the next chapter will give a stronger version (including the corresponding result
for the vectorial case as well).
In the vectorial case, i.e. m ̸= 1 or d ̸= 1, the necessity of convexity is far from being
true and there is indeed another “canonical” condition ensuring weak lower semicontinuity,
which is weaker than (ordinary) convexity; we will explore this in the next chapter.
2.3 Integrands with u-dependence
If we try to extend the results in the previous section to more general functionals
    F [u] := ∫_Ω f(x, u(x), ∇u(x)) dx → min,
we discover that our proof strategy of using the Mazur Lemma runs into difficulties: We
cannot “pull out” the convex combination inside
    ∫_Ω f ( x, ∑_{n=j}^{N(j)} θ j,n u n (x), ∑_{n=j}^{N(j)} θ j,n ∇u n (x) ) dx
anymore. Nevertheless, a lower semicontinuity result analogous to the one for the scalar
case turns out to be true:
Theorem 2.12. Let f : Ω × Rm × Rm×d → R satisfy the growth bound
    | f(x, v, A)| ≤ M(1 + |v|^p + |A|^p)    for some M > 0, 1 < p < ∞,
and the convexity property
    A ↦ f(x, v, A) is convex for all (x, v) ∈ Ω × Rm.
Then, the functional
    F [u] := ∫_Ω f(x, u(x), ∇u(x)) dx,    u ∈ W1,p (Ω; Rm),
is weakly lower semicontinuous.
While it would be possible to give an elementary proof of this theorem here, we postpone the verification to the next chapter. There, using more advanced and elegant techniques
(blow-ups and Young measures), we will in fact establish a much more general result and
see that, when viewed from the right perspective, u-dependence does not pose any additional
complications.
2.4 The Euler–Lagrange equation
In analogy to the elementary fact that the derivative of a function at its minimizers is zero,
we will now derive a necessary condition for a function to be a minimizer. This condition
furnishes the connection between the Calculus of Variations and PDE theory and is of great
importance for computing particular solutions.
Theorem 2.13 (Euler–Lagrange equation (system)). Let f : Ω × Rm × Rm×d → R be
a Carathéodory integrand that is continuously differentiable in v and A and satisfies the
growth bounds
    |Dv f(x, v, A)|, |DA f(x, v, A)| ≤ M(1 + |v|^p + |A|^p),    (x, v, A) ∈ Ω × Rm × Rm×d,
for some M > 0, 1 ≤ p < ∞. If u∗ ∈ W1,p_g (Ω; Rm), where g ∈ W1−1/p,p (∂Ω; Rm), minimizes
the functional

    F [u] := ∫_Ω f(x, u(x), ∇u(x)) dx,    u ∈ W1,p_g (Ω; Rm),
then u∗ is a weak solution of the following system of PDEs, called the Euler–Lagrange
equation3 ,
    − div[ DA f(x, u, ∇u) ] + Dv f(x, u, ∇u) = 0  in Ω,
    u = g  on ∂Ω.    (2.5)
Here we used the common convention to omit the x-arguments whenever this does not
cause any confusion in order to curtail the proliferation of x’s, for example in f (x, u, ∇u) =
f (x, u(x), ∇u(x)).
Recall that u∗ is a weak solution of (2.5) if
    ∫_Ω DA f(x, u∗, ∇u∗) : ∇ψ + Dv f(x, u∗, ∇u∗) · ψ dx = 0

for all ψ ∈ C∞_c (Ω; Rm), where

    DA f(x, v, A) : B := lim_{h↓0} [ f(x, v, A + hB) − f(x, v, A) ] / h,    A, B ∈ Rm×d,
is the directional derivative of f (x, v, ·) at A in direction B, likewise for Dv f (x, v, A) · w.
The derivatives DA f (x, v, A) and Dv f (x, v, A) can also be represented as a matrix in Rm×d
and a vector in Rm , respectively, namely
    DA f(x, v, A) := ( ∂_{A^j_k} f(x, v, A) )^j_k ,    Dv f(x, v, A) := ( ∂_{v^j} f(x, v, A) )^j .
Hence we use the notation with the Frobenius matrix vector product “:” (see the appendix)
and the scalar product “·”. The boundary condition u = g on ∂ Ω is to be understood in the
sense of trace.
3 This is a system of PDEs, but we simply speak of it as an “equation”.
Proof. For all ψ ∈ C∞_c (Ω; Rm) and all h > 0 we have

    F [u∗] ≤ F [u∗ + hψ]

since u∗ + hψ ∈ W1,p_g (Ω; Rm) is admissible in the minimization. Thus,

    0 ≤ ∫_Ω [ f(x, u∗ + hψ, ∇u∗ + h∇ψ) − f(x, u∗, ∇u∗) ] / h dx
      = ∫_Ω ∫_0^1 (1/h) (d/dt)[ f(x, u∗ + thψ, ∇u∗ + th∇ψ) ] dt dx
      = ∫_Ω ∫_0^1 DA f(x, u∗ + thψ, ∇u∗ + th∇ψ) : ∇ψ + Dv f(x, u∗ + thψ, ∇u∗ + th∇ψ) · ψ dt dx.
By the growth bounds on the derivative, the integrand can be seen to have an h-uniform L1 majorant, namely C(1 + |u∗ | p + |ψ | p + |∇u∗ | p + |∇ψ | p ) and so we may apply the Lebesgue
dominated convergence theorem to let h ↓ 0 under the (double) integral. This yields
    0 ≤ ∫_Ω DA f(x, u∗, ∇u∗) : ∇ψ + Dv f(x, u∗, ∇u∗) · ψ dx
and we conclude by taking ψ and −ψ in this inequality.
Remark 2.14. If we want to allow ψ ∈ W1,p_0 (Ω; Rm) in the weak formulation of (2.5),
then we need to assume the stronger growth conditions

    |Dv f(x, v, A)|, |DA f(x, v, A)| ≤ M(1 + |v|^(p−1) + |A|^(p−1))    for some M > 0, 1 ≤ p < ∞.
Example 2.15. Coming back to the Dirichlet integral from Example 2.9, we see that the
associated Euler–Lagrange equation is the Laplace equation
    −∆u = 0,

where ∆ := ∂²_x1 + · · · + ∂²_xd is the Laplace operator. Solutions u are called harmonic functions. Since the Dirichlet integral is convex, all solutions of the Laplace equation are in fact
minimizers. In particular, it can be seen that solutions of the Laplace equation are unique
for given boundary values.
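Since computing (2.5) for a concrete integrand is purely mechanical differentiation, a computer algebra system can do the bookkeeping. The following sympy fragment is an added illustration (not part of the original notes): it implements the one-dimensional scalar case of the Euler–Lagrange operator and recovers the Laplace equation of Example 2.15 as well as, for the optimal-saving integrand of Section 1.5, the equation that is derived by hand in Example 2.18 below.

    import sympy as sp

    x = sp.symbols('x')
    u = sp.Function('u')

    def euler_lagrange(f, u, x):
        # Left-hand side of -d/dx[D_A f] + D_v f = 0 for a one-dimensional scalar
        # integrand f = f(x, v, A), evaluated at v = u(x), A = u'(x).
        v, A = sp.symbols('v A')
        fx = f(x, v, A)
        Dv = sp.diff(fx, v).subs({v: u(x), A: u(x).diff(x)})
        DA = sp.diff(fx, A).subs({v: u(x), A: u(x).diff(x)})
        return -sp.diff(DA, x) + Dv

    # Dirichlet integrand f = A**2/2: recovers the (1d) Laplace equation -u'' = 0
    print(euler_lagrange(lambda x, v, A: A**2 / 2, u, x))

    # Integrand of the optimal saving problem (w, rho constants); setting the
    # output to zero gives the equation derived by hand in Example 2.18.
    w, rho = sp.symbols('w rho', positive=True)
    print(sp.simplify(euler_lagrange(lambda x, v, A: -sp.log(w + rho*v - A), u, x)))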
Example 2.16. In the linearized elasticity problem from Example 2.10, we may compute
the Euler–Lagrange equation as

    − div[ 2µ E u + (κ − (2/3)µ)(tr E u) I ] = f  in Ω,
    u = g  on ∂Ω.
Sometimes, one also defines the first variation δ F [u] of F at u ∈ W1,p (Ω; Rm ) as the
linear map
    δ F [u] : W1,p_0 (Ω; Rm) → R

through (whenever this exists)

    δ F [u][ψ] := lim_{h↓0} [ F [u + hψ] − F [u] ] / h,    ψ ∈ W1,p_0 (Ω; Rm).
Then, the assertion of the previous theorem can equivalently be formulated as
    δ F [u∗] = 0    if u∗ minimizes F over W1,p_g (Ω; Rm).
Of course, this condition is only necessary for u to be a minimizer. Indeed, any solution of
the Euler–Lagrange equation is called a critical point of F , which could be a minimizer,
maximizer, or saddle point.
One crucial consequence of the results in this section is that we can use all available
PDE methods to study minimizers. Immediately, one can ask about the type of PDE we are
dealing with. In this respect we have the following result:
Proposition 2.17. Let f be twice continuously differentiable in v and A. Also let A ↦
f(x, v, A) be convex for all (x, v) ∈ Ω × Rm. Then, the Euler–Lagrange equation is an elliptic
PDE, that is,

    0 ≤ D²_A f(x, v, A)[B, B] := (d²/dt²) f(x, v, A + tB) |_{t=0}

for all x ∈ Ω, v ∈ Rm, and A, B ∈ Rm×d.
This proposition is not difficult to establish and just needs a bit of notation; we omit the
proof.
One main use of the Euler–Lagrange equation is to find concrete solutions of variational
problems:
Example 2.18. Recall the optimal saving problem from Section 1.5,

    F [S] := ∫_0^T − ln(w + ρS(t) − Ṡ(t)) dt → min,
    S(0) = 0, S(T) = ST ≥ 0.
The Euler–Lagrange equation is

    − (d/dt)[ 1/(w + ρS(t) − Ṡ(t)) ] − ρ/(w + ρS(t) − Ṡ(t)) = 0,

which resolves to

    (ρṠ(t) − S̈(t)) / (w + ρS(t) − Ṡ(t)) = ρ.
Figure 2.1: The solution to the optimal saving problem.
With the consumption rate C(t) := w + ρ S(t) − Ṡ(t), this is equivalent to
    Ċ(t)/C(t) = ρ,

and so, if C(0) = C0 (to be determined later),

    w + ρS(t) − Ṡ(t) = C(t) = e^{ρt} C0.
This differential equation for S(t) can be solved, for example via the Duhamel formula,
which yields
    S(t) = e^{ρt} · 0 + ∫_0^t e^{ρ(t−s)} (w − e^{ρs} C0) ds = [(e^{ρt} − 1)/ρ] w − t e^{ρt} C0,
and C0 can now be chosen to satisfy the terminal condition S(T) = ST , in fact, C0 = (1 −
e^{−ρT})w/(ρT) − e^{−ρT} ST /T.
Since C ↦ −ln C is convex, we know that S(t) as above is a minimizer of our problem.
In Figure 2.1 we see the optimal savings strategy for a worker earning a (constant) salary
of w = £30,000 per year and having a savings goal of ST = £100,000. The worker has to
save for approximately 27 years, reaching savings of just over £168,000, and then starts
withdrawing his savings (Ṡ(t) < 0) for the last 13 years. The worker's consumption C(t) =
e^{ρt} C0 goes up continuously during the whole working life, making him/her (materially)
happy.
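The closed-form solution is easy to check numerically. The sketch below (an added illustration) uses w = £30,000, ST = £100,000, T = 40 and an assumed interest rate ρ = 0.02 – the notes do not state the rate used for Figure 2.1, but this choice roughly reproduces the numbers quoted above – and verifies the boundary conditions, the location of the peak, and the Euler–Lagrange equation along the solution.

    import numpy as np

    # Closed-form solution from Example 2.18; rho = 0.02 is an assumed illustrative rate.
    w, S_T, T, rho = 30_000.0, 100_000.0, 40.0, 0.02
    C0 = (1 - np.exp(-rho * T)) * w / (rho * T) - np.exp(-rho * T) * S_T / T

    t = np.linspace(0.0, T, 400_001)
    S = (np.exp(rho * t) - 1) * w / rho - t * np.exp(rho * t) * C0

    print(S[0], S[-1])                       # boundary values: 0 and (up to round-off) 100000
    print(t[np.argmax(S)], S.max())          # peak after roughly 27 years, at roughly 1.7e5

    # sanity check of the Euler-Lagrange equation (rho*S' - S'')/(w + rho*S - S') = rho
    dS = np.gradient(S, t)
    ddS = np.gradient(dS, t)
    residual = (rho * dS - ddS) / (w + rho * S - dS) - rho
    print(np.max(np.abs(residual[10:-10])))  # small (finite-difference error only)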
Example 2.19. For functions u = u(t, x) : R × Rd → R, consider the action functional
    A [u] := (1/2) ∫_{R×Rd} −|∂t u|² + |∇x u|² d(t, x),
where ∇x u is the gradient of u with respect to x ∈ Rd . This functional should be interpreted
as the usual Dirichlet integral, see Example 2.9, where, however, we use the Lorentz metric.
Then, the Euler–Lagrange equation is the wave equation
    ∂²_t u − ∆u = 0.
Notice that A is not convex and consequently the wave equation is hyperbolic instead of
elliptic.
It is an important question whether a weak solution of the Euler–Lagrange equation (2.5)
is also a strong solution, that is, whether u ∈ W2,2 (Ω; Rm ) and
    − div[ DA f(x, u(x), ∇u(x)) ] + Dv f(x, u(x), ∇u(x)) = 0  for a.e. x ∈ Ω,
    u = g  on ∂Ω.    (2.6)
If u ∈ C2 (Ω; Rm ) ∩ C(Ω; Rm ) satisfies this PDE for every x ∈ Ω, then we call u a classical
solution.
Multiplying (2.6) by a test function ψ ∈ C∞_c (Ω; Rm), integrating over Ω, and using the
Gauss–Green Theorem, it follows that any solution of (2.6) also solves (2.5). The converse
is also true whenever u is sufficiently regular:
Proposition 2.20. Let the integrand f be twice continuously differentiable and let u ∈
W2,2 (Ω; Rm ) be a weak solution of the Euler–Lagrange equation (2.5). Then, u solves the
Euler–Lagrange equation (2.6) in the strong sense.
Proof. If u ∈ W2,2 (Ω; Rm ) is a weak solution, then
    ∫_Ω DA f(x, u, ∇u) : ∇ψ + Dv f(x, u, ∇u) · ψ dx = 0

for all ψ ∈ C∞_c (Ω; Rm). Integration by parts (more precisely, the Gauss–Green Theorem)
gives

    ∫_Ω { − div[ DA f(x, u, ∇u) ] + Dv f(x, u, ∇u) } · ψ dx = 0,
again for all ψ as before. We conclude using the following important lemma.
Lemma 2.21 (Fundamental lemma of the Calculus of Variations). Let Ω ⊂ Rd be open.
If g ∈ L1 (Ω) satisfies
    ∫_Ω gψ dx = 0    for all ψ ∈ C∞_c (Ω),
then g ≡ 0 almost everywhere.
Proof. We can assume that Ω is bounded by considering subdomains if necessary. Also, let
g be extended by zero to all of Rd . Fix ε > 0 and let (ρδ )δ ⊂ C∞_c (B(0, δ )) be a family of
mollifying kernels, see Appendix A.2 for details. Then, since ρδ ⋆ g → g in L1 , we may find
h ∈ (L1 ∩ C∞ )(Rd ) with

    ∥g − h∥L1 ≤ ε/4    and    ∥h∥∞ < ∞.
Set φ (x) := h(x)/|h(x)| for h(x) ̸= 0 and φ (x) := 0 for h(x) = 0, so that hφ = |h|. Then take
ψ = ρδ̄ ⋆ φ ∈ (L1 ∩ C∞ )(Rd ) with

    ∥φ − ψ∥L1 ≤ ε / (2∥h∥∞).
Since |ψ | ≤ 1 (this follows from the convolution formula),
    ∥g∥L1 ≤ ∥g − h∥L1 + ∫_Ω hφ dx
           ≤ ∥g − h∥L1 + ∫_Ω h(φ − ψ) + (h − g)ψ + gψ dx
           ≤ 2∥g − h∥L1 + ∥h∥∞ · ∥φ − ψ∥L1 + 0
           ≤ ε.
We conclude by letting ε ↓ 0.
2.5 Regularity of minimizers
As we saw at the end of the last section, whether a weak solution of the Euler–Lagrange
equation is also a strong or even classical solution, depends on its regularity, that is, on the
amount of differentiability. More generally, one would like to know how much regularity
we can expect from solutions of a variational problem. Such a question also was the content
of David Hilbert’s 19th problem [47]:
Does every Lagrangian partial differential equation of a regular variational
problem have the property of exclusively admitting analytic integrals?4
In modern language, Hilbert asked whether “regular” variational problems (defined below)
admit only analytic solutions, i.e. ones that have a local power series representation.
In this section, we will prove some basic regularity assertions, but we will only sketch
the solution of Hilbert’s 19th problem, as the techniques needed are quite involved. We remark from the outset that regularity results are very sensitive to the dimensions involved, in
particular the behavior of the scalar case (m = 1) and the vector case (m > 1) is fundamentally different. We also only consider the quadratic (p = 2) case for reasons of simplicity.
4 The German original asks “ob jede Lagrangesche partielle Differentialgleichung eines regulären Variationsproblems die Eigenschaft hat, daß sie nur analytische Integrale zuläßt.”
In the spirit of Hilbert’s 19th problem, call
    F [u] := ∫_Ω f(∇u(x)) dx,    u ∈ W1,2 (Ω; Rm),

a regular variational integral if f ∈ C2 (Rm×d ) and there are constants 0 < µ ≤ M with

    µ|B|² ≤ D²_A f(A)[B, B] ≤ M|B|²    for all A, B ∈ Rm×d.

In this context,

    D²_A f(A)[B1, B2] = (d/dt)(d/ds) f(A + sB1 + tB2) |_{s,t=0}    for all A, B1, B2 ∈ Rm×d,
which is a bilinear form. Clearly, regular variational problems are convex. In fact, integrands f that satisfy the above lower bound µ |B|2 ≤ D2A f (A)[B, B] are called strongly convex. For example, the Dirichlet integral from Example 2.9 is regular. Also, the following
Cauchy–Schwarz-type estimate can be shown:
    D²_A f(A)[B1, B2] ≤ M|B1||B2|.    (2.7)
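As a concrete instance of these bounds (an added aside, not in the notes): for the Dirichlet integrand f(A) = |A|² one has D²_A f(A)[B, B] = 2|B|² for every A, so the Dirichlet integral is regular with µ = M = 2. A short finite-difference check:

    import numpy as np

    # Regularity check for the Dirichlet integrand f(A) = |A|^2 (Frobenius norm):
    # the second derivative D^2 f(A)[B, B] equals 2|B|^2 for every A and B,
    # so mu = M = 2 in the definition of a regular variational integral.
    f = lambda A: np.sum(A ** 2)
    rng = np.random.default_rng(1)
    A, B = rng.normal(size=(3, 2)), rng.normal(size=(3, 2))

    t = 1e-4
    second_derivative = (f(A + t * B) - 2 * f(A) + f(A - t * B)) / t ** 2
    print(second_derivative, 2 * np.sum(B ** 2))      # the two numbers agree up to round-off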
The basic regularity theorem is the following:
Theorem 2.22 (W2,2_loc-regularity). Let F be a regular variational integral. Then, for
any minimizer u ∈ W1,2 (Ω; Rm) of F , it holds that u ∈ W2,2_loc (Ω; Rm). Moreover, for any
B(x0, 3r) ⊂ Ω (x0 ∈ Ω, r > 0) the Caccioppoli inequality

    ∫_{B(x0,r)} |∇²u(x)|² dx ≤ (2M/µ)² ∫_{B(x0,3r)} |∇u(x) − [∇u]_{B(x0,3r)}|² / r² dx    (2.8)

holds, where [∇u]_{B(x0,3r)} := ⨍_{B(x0,3r)} ∇u dx denotes the average of ∇u over B(x0, 3r). Consequently, the Euler–Lagrange equation holds strongly,

    − div D f(∇u) = 0,    a.e. in Ω.
For the proof we employ the difference quotient method, which is fundamental in regularity theory. Let u : Ω → Rm , x ∈ Ω, k ∈ {1, . . . , d}, and h ∈ R. Then, define the difference
quotients
    D^h_k u(x) := [ u(x + h e_k) − u(x) ] / h,    D^h u := (D^h_1 u, . . . , D^h_d u),
where {e1 , . . . , ed } is the standard basis of Rd . The key is the following characterization of
Sobolev spaces in terms of difference quotients:
Lemma 2.23. Let 1 < p < ∞, D ⊂⊂ Ω ⊂ Rd be open, and u ∈ L p (Ω; Rm ).
(I) If u ∈ W1,p (Ω; Rm ), then
    ∥D^h_k u∥_{L^p(D)} ≤ ∥∂k u∥_{L^p(Ω)}    for all k ∈ {1, . . . , d}, |h| < dist(D, ∂Ω).
(II) If for some 0 < δ < dist(D, ∂ Ω) it holds that
    ∥D^h_k u∥_{L^p(D)} ≤ C    for all k ∈ {1, . . . , d} and all |h| < δ,

then u ∈ W1,p_loc (Ω; Rm) and ∥∂k u∥_{L^p(D)} ≤ C for all k ∈ {1, . . . , d}.
Proof. For (I), assume first that u ∈ (L p ∩ C1 )(Ω; Rm ). In this case, by the Fundamental
Theorem of Calculus, at x ∈ Ω it holds that
Dhk u(x)
1
=
h
∫ 1
d
dt
0
∫ 1
u(x + thek ) dt =
0
∂k u(x + thek ) dt.
Thus,
∫
D
|Dhk u| p
dx ≤
∫
Ω
|∂k u| p dx,
from which the assertion follows. The general case follows from the density of (L p ∩
C1 )(Ω; Rm ) in L p (Ω; Rm ).
For (II), we observe that for fixed k ∈ {1, . . . , d} by assumption (Dhk u)0<h<δ is uniformly
p
L -bounded. Thus, for an arbitrary fixed sequence of h’s tending to zero, there exists a
subsequence h j ↓ 0 with
h
Dk j u ⇀ vk
in L p
m
for some vk ∈ L p (Ω; Rm ). Let ψ ∈ C∞
c (Ω; R ). Using an “integration-by-parts” rule for
difference quotients, which is elementary to check, we get
∫
D
vk · ψ dx = lim
∫
j→∞ D
h
Dk j u · ψ
dx = − lim
∫
j→∞ D
−h
u · Dk j ψ
dx = −
∫
D
u · ∂k ψ dx.
Thus, u ∈ W1,p (D; Rm ) and vk = ∂k u. The norm estimate follows from the lower semicontinuity of the norm under weak convergence.
Proof of Theorem 2.22. The idea is to emulate the a-priori non-existent second derivatives
using difference quotients and to derive estimates which allow one to conclude that these
difference quotients are in L2 . Then we can conclude by the preceding lemma.
Let u ∈ W1,2 (Ω; Rm ) be a minimizer of F . By Theorem 2.13,
∫
0=
Ω
D f (∇u) : ∇ψ dx
(2.9)
m
1,2
m
holds for all ψ ∈ C∞
c (Ω; R ), and then by density also for all ψ ∈ W (Ω; R ). Here we
have used the notation D f (A) : B for the directional derivative of f at A in direction B. Fix
a ball B(x0 , 3r) ⊂ Ω and take a Lipschitz cut-off function ρ ∈ W1,∞ (Ω) such that
Comment...
1B(x0 ,r) ≤ ρ ≤ 1B(x0 ,2r)
and
1
|∇ρ | ≤ .
r
2.5. REGULARITY OF MINIMIZERS
27
Then, for any k = 1, . . . , d and |h| < r, we let
]
[ 2 h
ψ := D−h
∈ W1,2 (Ω; Rm ),
k ρ Dk (u − a)
where a is an affine function to be chosen later. We may plug ψ into (2.9) to get
∫
[
]
0 = Dhk (D f (∇u)) : ρ 2 Dhk ∇u + Dhk (u − a) ⊗ ∇(ρ 2 ) dx.
Ω
(2.10)
Here, we used the “integration-by-parts” formula for difference quotients, which is easy to
check (also recall a ⊗ b := abT ).
Next, we estimate, using the assumptions on f ,
µ |Dhk ∇u|2 ≤
∫ 1
D2 f (∇u + thDhk ∇u)[Dhk ∇u, Dhk ∇u] dt
1
1
h
h
= D f (∇u + thDk ∇u) : Dk ∇u
h
t=0
0
= Dhk (D f (∇u)) : Dhk ∇u.
From the preceding lemma (both parts), (2.7), and the fact |a ⊗ b| = |a| · |b| for our norms,
we get
h
]
] [
[
D (D f (∇u)) : Dh (u − a) ⊗ ∇(ρ 2 ) ≤ D2 f (∇u) Dh (u − a) ⊗ ∇(ρ 2 ), ∂k ∇u k
k
k
≤ 2M · ρ · |Dhk (u − a)| · |∇ρ | · |Dhk ∇u|.
Using the last two estimates, (2.10), and the integration-by-parts formula for difference
quotients again,
µ
∫
Ω
|Dhk ∇u|2 ρ 2 dx ≤
∫
Ω
=−
Dhk (D f (∇u)) : [ρ 2 Dhk ∇u] dx
∫
Ω
≤ 2M
∫
[
]
Dhk (D f (∇u)) : Dhk (u − a) ⊗ ∇(ρ 2 ) dx
Ω
ρ · |Dhk ∇u| · |Dhk (u − a)| · |∇ρ | dx.
Now use the Young inequality to get
µ
∫
Ω
|Dhk ∇u|2 ρ 2 dx ≤
µ
2
∫
Ω
|Dhk ∇u|2 ρ 2 dx +
(2M)2
2µ
∫
Ω
|Dhk (u − a)|2 · |∇ρ |2 dx.
Absorbing the first term in the left hand side and using the properties of ρ ,
(
) ∫
∫
|Dhk (u − a)|2
2M 2
h
2
dx.
|Dk ∇u| dx ≤
µ
r2
B(x0 ,2r)
B(x0 ,r)
Now invoke the difference-quotient lemma, part (I), to arrive at
(
) ∫
∫
2M 2
|∂k (u − a)|2
h
2
|Dk ∇u| dx ≤
dx.
µ
r2
B(x0 ,r)
B(x0 ,3r)
Comment...
Applying the lemma again, this time part (II), we get u ∈ W2,2 (B(x0 , r); Rm ). The Caccioppoli inequality (2.8) follows once we take ∇a := [∇u]B(x0 ,3r) .
28
CHAPTER 2. CONVEXITY
With the W2,2
loc -regularity result at hand, we may use ψ = ∂k ψ̃ , k = 1, . . . , d, as test
function in the weak formulation of the Euler–Lagrange equation − div[D f (∇u)] = 0 to
conclude that
[
]
− div D2 f (∇u) : ∇(∂k u) = 0
(2.11)
holds in the weak sense. Indeed,
0=−
∫
Ω
div D f (∇u) · ∂k ψ dx =
∫
Ω
]
[
div D2 f (∇u) : ∇(∂k u) · ψ dx
m
2
for all ψ ∈ C∞
c (Ω; R ). In the special case that f is quadratic, D f is constant and the above
PDE has the same structure (with ∂k u as unknown function) as the original Euler–Lagrange
equation. Thus it is itself an Euler–Lagrange equation and we can bootstrap the regularity
m
∞
m
theorem to get u ∈ Wk,2
loc (Ω; R ) for all k ∈ N, hence k ∈ C (Ω; R ).
Example 2.24. For a minimizer u ∈ W1,2 (Ω) of the Dirichlet integral as in Example 2.9,
the theory presented here immediately gives u ∈ C∞ (Ω).
Example 2.25. Similarly, for minimizers u ∈ W1,2 (Ω; R3 ) to the problem of linearized
elasticity from Section 1.4 and Example 2.10, we also get u ∈ C∞ (Ω; R3 ). The reasoning
has to be slightly adjusted, however, to take care of the fact that we are dealing with E u
instead of ∇u.
In the general case, however, our Euler–Lagrange equation is nonlinear and the above
bootstrapping procedure fails. In particular, this includes the problem posed in Hilbert’s
19th problem, which was only solved, in the scalar case m = 1, by Ennio De Giorgi in
1957 [28] and, using different methods, by John Nash in 1958 [78]. After the proof was
improved by Jürgen Moser in 1960/1961 [71, 72], the results are now subsumed in the De
Giorgi–Nash–Moser theory.
The current standard solution (based on De Giorgi’s approach) is based on the De Giorgi
regularity theorem and the classical Schauder estimates, which establish Hölder regularity of weak solutions of a PDE.
Theorem 2.26 (De Giorgi regularity theorem). Let the function A : Ω → Rd×d be measurable, symmetric, i.e. A(x) = A(x)T for x ∈ Ω, and satisfy the ellipticity and boundedness
estimates
µ |v|2 ≤ vT A(x)v ≤ M|v|2 ,
x ∈ Ω, v ∈ Rd ,
(2.12)
for constants 0 < µ ≤ M. If u ∈ W1,2 (Ω) is a weak solution of
− div[A∇u] = 0,
(2.13)
Comment...
0,α0
(Ω), that is, u is α0 -Hölder continuous, for some α0 = α0 (d, M/µ ) ∈ (0, 1).
then u ∈ Cloc
2.5. REGULARITY OF MINIMIZERS
29
Theorem 2.27 (Schauder estimate). Let A : Ω → Rd×d be as above but in addition assumed to be of class Cs−1,α for some s ∈ N and α ∈ (0, 1). If u ∈ W1,2 (Ω) is a weak solution
of
− div[A∇u] = 0,
then u ∈ Cs,locα (Ω).
We remark that the Schauder estimates also hold for systems of PDEs (u : Ω → Rm ,
m > 1), but the De Giorgi regularity theorem does not. The proofs of these results are quite
involved and reserved for an advanced course on regularity theory, but we establish their
arguably most important consequence, namely the solution of Hilbert’s 19th problem in the
scalar case:
Theorem 2.28 (De Giorgi–Nash–Moser). Let F be a regular variational integral and
α
f ∈ Cs (Rd ), s ∈ {2, 3, . . .}. If u ∈ W1,2 (Ω) minimizes F , then it holds that u ∈ Cs−1,
loc (Ω)
∞
d
∞
for some α ∈ (0, 1). In particular, if f ∈ C (R ), then u ∈ C (Ω).
Proof. We saw in (2.11) that the partial derivative ∂k u, k = 1, . . . , d, of a minimizer u ∈
W1,2 (Ω) of F satisfy (2.13) for
A(x) := D2 f (∇u(x)),
x ∈ Ω.
From general properties of the Hessian we conclude that A(x) is symmetric and the upper
and lower estimates (2.12) on A follow from the respective properties of D2 f . However,
we cannot conclude (yet) any regularity of A beyond measurability. Nevertheless, we may
0,α0
apply the De Giorgi Regularity Theorem 2.26, whereby ∂k u ∈ Cloc
(Ω) for some α0 ∈ (0, 1)
1,α0
and all k = 1, . . . , d. Hence u ∈ Cloc (Ω). This is the assertion for s = 2.
If s = 3, our arguments so far in conjunction with the regularity assumptions on f imply
α0
2
D f (∇u(x)) ∈ C0,
loc (Ω). Hence, the Schauder estimates from Theorem 2.27 apply and yield
2,α0
s−1,α0
α0
(Ω). For higher s, this procedure can be
∂k u ∈ C1,
loc (Ω), whereby u ∈ Cloc (Ω) = Cloc
α0
iterated until we run out of f -derivatives and we have achieved u ∈ Cs−1,
(Ω).
loc
Comment...
The above argument also yields analyticity of u if f is analytic. Another refinement
shows that in the situation of the De Giorgi–Nash–Moser Theorem, we even have u ∈
s−1,α
(Ω) for all α ∈ (0, 1). Here one needs the additional Schauder–type result that weak
Cloc
[
]
0,α
solutions u to − div A∇(∂k u) = 0 for A continuous are of class Cloc
for all α ∈ (0, 1).
In the above proof we can apply this result in the case s = 2 since A(x) = D2 f (∇u(x)) is
1,α
continuous. Thus, u ∈ Cloc
(Ω) for any α ∈ (0, 1). The other parts of the proof are adapted
accordingly. See [86] for details and other refinements.
We close our look at regularity theory by considering the vectorial case m > 1. It was
shown again by De Giorgi in 1968 that if d = m > 2, then his regularity theorem does not
hold:
30
CHAPTER 2. CONVEXITY
Example 2.29 (De Giorgi). Let d = m > 2 and define
−γ
u(x) := |x| x,
(
)
d
1
γ :=
1− √
,
2
(2d − 2)2 + 1
x ∈ B(0, 1).
d
Note that 1 < γ < d2 and so u ∈ W1,2 (B(0, 1); Rd ) but u ∈
/ L∞
loc (B(0, 1); R ). It can be
checked, though, that u solves (2.13) for
) ]2
[(
x⊗x
v A(x)v := |v| + (d − 1)I + d ·
v ,
|x|2
T
2
which satisfies all assumptions of the De Giorgi Theorem. Moreover, u is a minimizer of
the quadratic variational integral
∫
B(0,1)
∇u(x)T A(x)∇u(x) dx.
However, this is not a regular variational integral because of the x-dependence of the
integrand is not smooth.
In principle, this still leaves the possibility that the vectorial analogue of Hilbert’s 19th
problem has a positive solution, just that its proof would have to proceed along a different
avenue than via the De Giorgi Theorem. However, after first results of Nečas in 1975 [79],
the current state of the art was established by Šverák and Yan in 2000–2002 [90, 91]. They
prove that there exist regular (and C∞ ) variational integrals such that minimizers must be:
• non-Lipschitz if d ≥ 3, m ≥ 5,
• unbounded if d ≥ 5, m ≥ 14.
Comment...
However, there are also positive results, in particular we have that for d = 2 minimizers to regular variational problems are always as regular as the data allows (as in
Theorem 2.28), this is the Morrey regularity theorem from 1938, see [68]. If we confine ourselves to Sobolev-regularity, then using the difference quotient technique, we can
δ
prove W2,2+
-regularity for minimizers of regular variational problems for some dimensionloc
dependent δ > 0, this result is originally due to Sergio Campanato. By an embedding
theorem this yields C0,α -regularity for some α ∈ (0, 1) when d ≤ 4. Only in 2008 were
δ
-regularity for a dimensionJan Kristensen and Christof Melcher able to establish W2,2+
loc
independent δ > 0 (in fact, δ = µ /(50M)), see [55]. We will learn about further regularity
results in Section 3.8. Regularity theory is still an area of active research and new results
are constantly discovered.
Finally, we mention that regularity of solutions might in general be extended up to and
including the boundary if the boundary is smooth enough, see Section 6.3.2 in [35] and
also [44] for results in this direction.
2.6. LAVRENTIEV GAP PHENOMENON
31
2.6 Lavrentiev gap phenomenon
So far we have chosen the function spaces in which we look for solutions of a minimization
problem according to coercivity and growth assumptions. However, at first sight, classically
differentiable functions may appear to be more appealing. So the question arises whether
the minimum (infimum) value is actually the same when considering different function
spaces. Formally, given two spaces X ⊂ Y such that X is dense in Y and a functional
F : Y → R ∪ {+∞}, is
inf F = inf F
Y
?
X
This is certainly true if we assume good growth conditions:
Theorem 2.30. Let f : Ω × Rm × Rm×d → R be a Caratheéodory function satisfying the
growth bound
| f (x, v, A)| ≤ M(1 + |v| p + |A| p ),
(x, v, A) ∈ Ω × Rm × Rm×d ,
for some M > 0, 1 < p < ∞. Then, the functional
F [u] :=
∫
Ω
f (x, u(x), ∇u(x)) dx,
u ∈ W1,p (Ω; Rm )
is strongly continuous. Consequently,
inf
W1,p (Ω;Rm )
F=
inf
C∞ (Ω;Rm )
F.
Proof. Let u j → u in W1,p , whereby u j → u and ∇u j → ∇u almost everywhere (which holds
after selecting a non-renumbered subsequence). Then, from the growth assumption we get
F [u j ] =
∫
Ω
f (x, u j , ∇u j ) dx ≤
∫
Ω
M(1 + |u j | p + |∇u j | p ) dx
and by the Pratt Theorem A.5,
F [u j ] → F [u].
This shows the continuity of F with respect to strong convergence in W1,p .
The assertion about the equality of the infima now follows readily since C∞ (Ω; Rm ) is
(strongly) dense in W1,p (Ω; Rm ).
Comment...
If we dispense with the (upper) growth, however, the infimum over different spaces
may indeed be different – this is the Lavrentiev gap phenomenon, discovered in 1926 by
Mikhail Lavrentiev. We here give an example between the spaces W1,1 and W1,∞ (with
boundary conditions) due to Basilio Manià from 1934.
32
CHAPTER 2. CONVEXITY
Example 2.31 (Manià). Consider the minimization problem

∫ 1
 F [u] :=
(u(t)3 − t)2 u̇(t)6 dt → min,
0

u(0) = 0, u(1) = 1
for u from either W1,1 (0, 1) or W1,∞ (0, 1) (Lipschitz functions). We claim that
inf
W1,1 (0,1)
F<
inf
W1,∞ (0,1)
F,
where here and in the following these infima are to be taken only over functions u with
boundary values u(0) = 0, u(1) = 1.
Clearly, F ≥ 0, and for u∗ (t) := t 1/3 ∈ (W1,1 \ W1,∞ )(0, 1) we have F [u∗ ] = 0. Thus,
inf
W1,1 (0,1)
F = 0.
On the other hand, for any u ∈ W1,∞ (0, 1) with u(0) = 0, u(1) = 1, the Lipschitz-continuity
implies that
u(t) ≤ h(t) :=
t 1/3
2
for all t ∈ [0, τ ] and some τ > 0,
and u(τ ) = h(τ ). Then, u(t)3 − t ≤ h(t)3 − t for t ∈ [0, τ ] and, since both of these terms are
negative,
(u(t)3 − t)2 ≥ (h(t)3 − t)2 =
72 2
t
82
for all t ∈ [0, τ ].
We then estimate
F [u] ≥
∫ τ
0
(u(t)3 − t)2 u̇(t)6 dt ≥
72
82
∫ τ
t 2 u̇(t)6 dt.
0
Further, by Hölder’s inequality,
∫ τ
∫ τ
t −1/3 · t 1/3 u̇(t) dt
(∫ τ
)5/6 (∫ τ
)1/6
−2/5
2
6
≤
t
dt
·
t u̇(t) dt
u̇(t) dt =
0
0
0
=
55/6 1/2
τ
35/6
(∫
0
τ
)1/6
t 2 u̇(t)6 dt
0
Since also
∫ τ
Comment...
0
u̇(t) dt = u(τ ) − u(0) = h(τ ) =
τ 1/3
,
2
.
2.7. INVARIANCES AND THE NOETHER THEOREM
33
we arrive at
72 35
72 35
≥
> 0.
82 55 26 τ
82 55 26
F [u] ≥
Thus,
inf
W1,∞ (0,1)
F >0=
inf
W1,1 (0,1)
F.
In a more recent example, Ball & Mizel [13] showed that the problem

∫ 1
 F [u] :=
(t 4 − u(t)6 )2 |u̇(t)|2m + ε u̇(t)2 dt
−1

u(−1) = α , u(1) = β ,
→
min,
also exhibits a Lavrentiev gap phenomenon between the spaces W1,2 and W1,∞ if m ∈ N
satisfies m > 13, ε > 0 is sufficiently small, and −1 ≤ α < 0 < β ≤ 1. This example is
significant because the Ball–Mizel functional is coercive on W1,2 (−1, 1).
Finally, we note that despite its theoretical significance, the Lavrentiev gap phenomenon
is a major obstacle for the numerical approximation of minimization problems. For instance, standard piecewise-affine finite elements are in W1,∞ and hence in the presence of
the Lavrentiev gap phenomenon we cannot approximate the true solution with such elements! Thus, one is forced to work with non-conforming elements and other advanced
schemes. This issue does not only affect “academic” examples such as the ones above, but
is also of great concern in applied problems, such as nonlinear elasticity theory.
2.7 Invariances and the Noether theorem
In physics and other applications of the Calculus of Variations, we are often highly interested in symmetries of minimizers or, more generally, critical points. These symmetries
manifest themselves in other differential or pointwise relations that are automatically satisfied for any critical point. They can be used to identify concrete solutions or are interesting
in their own right, for example as “gauge symmetries” in modern physics. In this section,
for simplicity we only consider “sufficiently smooth” (W2,2
loc ) critical points, which is the
natural level of regularity, see Theorem 2.22.
As a first concrete example of a symmetry, consider the Dirichlet integral
F [u] :=
∫
Ω
|∇u(x)|2 dx,
u ∈ W1,2 (Ω),
which was introduced in Example 2.9. First we notice that F is invariant under translations
in space: Let u ∈ (W1,2 ∩ W2,2
loc )(Ω), τ ∈ R, k ∈ {1, . . . , d}, and set
uτ (x) := u(x + τ ek ).
Comment...
xτ := x + τ ek ,
34
CHAPTER 2. CONVEXITY
Then, for any open D ⊂ Rd with Dτ := D + τ ek ⊂ Ω, we have the invariance
∫
D
|∇uτ (x)|2 dx =
∫
Dτ
|∇u(xτ )|2 dxτ .
(2.14)
The Dirichlet integral also exhibits invariance with respect to scaling: For λ > 0 set
xλ := λ x,
uλ (x) := λ (d−2)/2 u(λ x),
Dλ := λ D.
(2.15)
Then it is not hard to see that (2.14) again holds if we set λ = eτ (to allow τ ∈ R as before).
The main result of this section, the Noether theorem, roughly says that “differentiable
invariances of a functional give rise to conservation laws”. More concretely, the two invariances of the Dirichlet integral presented above will yield two additional PDEs that any
minimizer of the Dirichlet integral must satisfy.
To make these statement precise in the general case, we need a bit of notation: Let
m
u ∈ (W1,2 ∩ W2,2
loc )(Ω; R ) be a critical point of the functional
F [u] :=
∫
Ω
f (x, u(x), ∇u(x)) dx,
where f : Ω × Rm × Rm×d → R is twice continuously differentiable. Then, u satisfies the
strong Euler–Lagrange equation
[
]
− div DA f (x, u, ∇u) + Dv f (x, u, ∇u) = 0 a.e. in Ω.
see Proposition 2.20. We consider u to be extended to all of Rd (it will not matter below
how we extend u).
Our invariance is given through maps g : Rd × R → Rd and H : Rd × R → Rm , which
will depend on u above, such that
g(x, 0) = x
and
H(x, 0) = u(x),
x ∈ Rd .
We also require that g, H are continuously differentiable in their second argument τ for
almost every x ∈ Ω. Then set for x ∈ Rd , τ ∈ R, D ⊂⊂ Rd
xτ := g(x, τ ),
uτ (x) := H(x, τ ),
Dτ := g(D, τ ).
Therefore, g, H can be thought of as a form of homotopy.
We call F invariant under the transformation defined by (g, H) if
∫
D
f (x, uτ (x), ∇uτ (x)) dx =
∫
Dτ
f (x′ , u(x′ ), ∇u(x′ )) dx′ .
(2.16)
for all τ ∈ R sufficiently small and all open D ⊂ Rd such that Dτ ⊂⊂ Ω.
The main result of this section goes back to Emmy Noether5 and is considered one of
the most important mathematical theorems ever proved. Its pivotal idea of systematically
generating conservation laws from invariances has proved to be immensely influential in
modern physics.
5 The
Comment...
German mathematician Emmy Noether (23 March 1882 – 14 April 1935) has been called the most
important woman in the history of mathematics by Einstein and others.
2.7. INVARIANCES AND THE NOETHER THEOREM
35
Theorem 2.32 (Noether). Let f : Ω × Rm × Rm×d → R be twice continuously differentiable and satisfy the growth bounds
|Dv f (x, v, A)|, |DA f (x, v, A)| ≤ M(1 + |v| p + |A| p ),
(x, v, A) ∈ Ω × Rm × Rm×d ,
for some M > 0, 1 ≤ p < ∞, and let the associated functional F be invariant under
the transformation defined by (g, H) as above, for which we assume that there exist a pmajorant g ∈ L p (Ω) such that
|∂τ H(x, τ )|, |∂τ g(x, τ )| ≤ g(x)
for a.e. x ∈ Ω and all τ ∈ R.
(2.17)
m
(W1,2 ∩ W2,2
loc )(Ω; R )
Then, for any critical point u ∈
of F the conservation law
[
]
div µ (x) · DA f (x, u, ∇u) − ν (x) · f (x, u, ∇u) = 0
for a.e. x ∈ Ω.
(2.18)
holds, where
µ (x) := ∂τ H(x, 0) ∈ Rm
ν (x) := ∂τ g(x, 0) ∈ Rd
and
(x ∈ Ω)
are the Noether multipliers.
Corollary 2.33. If f = f (v, A) : R × Rd → R does not depend on x, then every critical
point u ∈ (W1,2 ∩ W2,2
loc )(Ω) of F satisfies
[
]
∂
∂
∂
δ
(
f
(u,
∇u)
−
u)
·
f
(u,
∇u)
=0
i
A
k
ik
∑
k
d
(2.19)
i=1
for all k = 1, . . . , d.
Proof. Differentiate (2.16) with respect to τ and then set τ = 0. To be able to differentiate
the left hand side under the integral, we use the growth assumptions on the derivatives
|Dv f (x, v, A)|, |DA f (x, v, A)| and (2.17) to get a uniform (in τ ) majorant, which allows to
move the differentiation under the intgral sign. For the right hand side, we need to employ
the formula for the differentiation of an integral with respect to a moving domain (this is a
special case of the Reynolds transport theorem),
d
dτ
∫
Dτ
f (x, u, ∇u) dx =
∫
∂ Dτ
f (x, u, ∇u)∂τ g(x, τ ) · n dσ ,
where n is the unit outward normal; this formula can be checked in an elementary way. Abbreviating for readability F := f (x, u(x), ∇u(x)), DA F := DA f (x, u(x), ∇u(x)), and Dv F :=
Dv f (x, u(x), ∇u(x)), we get
D
DA F : ∇µ + Dv F · µ dx =
∫
∂D
F ν · n dσ .
Applying the Gauss–Green theorem,
∫ [
∫
]
− div DA F + Dv F · µ dx =
∫∂ D
D
=
D
[
]
−µ · DA F + ν F · n dσ
[
]
div −µ · DA F + ν F dx
Comment...
∫
36
CHAPTER 2. CONVEXITY
Now, the Euler–Lagrange equation − div DA F + Dv F = 0 immediately implies
∫
[
]
div −µ · DA F + ν F dx = 0.
D
and varying D we conclude that (2.18) holds.
The corollary follows by considering the invariance xτ := x + τ ek , uτ (x) := u(x + τ ek ),
k = 1, . . . , d.
Example 2.34 (Brachistochrone problem). In the brachistochrone problem presented in
Section 1.1, we were tasked to minimize the functional
√
∫ 1
1 + (y′ )2
F [y] :=
dx
−y
0
√
over all y : [0, 1] → R with y(0) = 0, y(1) = ȳ < 0. The integrand f (v, a) = −(1 + a2 )/v
is independent of x, hence from (2.19) we get
1
y′ · Da f (y, y′ ) − f (y, y′ ) = const = − √
2r
for some r > 0 (positive constants lead to inadmissible y). So,
√
1 + (y′ )2
(y′ )2
1
√
− √
= −√ ,
√
−y
2r
1 + (y′ )2 · −y
which we transform into the differential equation
(y′ )2 = −
2r
− 1.
y
The solution of this differential equation is called an inverted cycloid with radius r > 0,
which is the curve traced out by a fixed point on a sphere of radius r (touching the y-axis at
the beginning) that rolls to the right on the bottom of the x-axis, see Figure 2.2. It can be
written in parametric form as
x(t) = r(t − sint),
y(t) = −r(1 − cost),
t ∈ R.
The radius r > 0 has to be chosen as to satisfy the boundary conditions on one cycloid
segment. This is the solution of the brachistochrone problem.
Comment...
Example 2.35. We return to our canonical example, the Dirichlet integral, see Example 2.9. We know from Example 2.24 that critical points u are smooth. We noticed at the
beginning of this section that the Dirichlet integral is invariant under the scaling transformation (2.15) with λ = eτ . By the Noether theorem, this yields the (non-obvious) conservation
law
[
]
div (2∇u(x) · x + (d − 2)u)∇u − |∇u|2 x = 0.
2.8. SIDE CONSTRAINTS
37
Figure 2.2: The brachistochrone curve (here, x̄ = 1).
Integrating this over B(0, r) ⊂ Ω, r > 0, and using the Gauss–Green theorem, we get
(
)
x 2
dσ
(d − 2)
|∇u(x)| dx = r
|∇u(x)| − 2 ∇u(x) ·
|x|
B(0,r)
∂ B(0,r)
∫
∫
2
2
and then, after some computations,
d
dr
(
1
)
∫
rd−2
|∇u(x)| dx =
2
B(0,r)
2
rd−2
)
(
x 2
dσ ≥ 0.
∇u(x) ·
|x|
∂ B(0,r)
∫
This monotonicity formula implies that
1
rd−2
∫
B(0,r)
|∇u(x)|2 dx
is increasing in r > 0.
Any harmonic function (−∆u = 0) that is defined on all of Rd satisfies this monotonicity
formula. For example, this allows us to draw the conclusion that if d ≥ 3, then u cannot
be compactly supported. The formula also shows that the growth around a singularity at
zero (or anywhere else by translation) has to behave “more smoothly” than |x|−2 (in fact,
we already know that solutions are C∞ ). While these are not particularly strong remarks,
they serve to illustrate how Noether’s theorem restricts the candidates for solutions.
The last example exhibited a conservation law that was not obvious from the Euler–
Lagrange equation. While in principle it could have been derived directly, the Noether
theorem gave us a systematic way to find this conservation law from an invariance.
2.8 Side constraints
Comment...
In many minimization problems, the class of functions over which we minimize our functional may be restricted to include one or several side constraints. To establish existence
of a minimizer also in these cases, we first need to extend the Direct Method:
38
CHAPTER 2. CONVEXITY
Theorem 2.36 (Direct Method with side constraint). Let X be a Banach space and let
F , G : X → R ∪ {+∞}. Assume the following:
(WH1) Weak Coercivity of F : For all Λ ∈ R, the sublevel set
{ x ∈ X : F [x] ≤ Λ }
is sequentially weakly compact,
that is, if F [x j ] ≤ Λ for a sequence (x j ) ⊂ X, then (x j ) has a weakly convergent
subsequence.
(WH2) Weak lower semicontinuity of F : For all sequences (x j ) ⊂ X with x j ⇀ x it holds
that
F [x] ≤ lim inf F [x j ].
j→∞
(WH3) Weak continuity of G : For all sequences (x j ) ⊂ X with x j ⇀ x it holds that
G [x j ] → G [x].
Assume also that there exists at least one x0 ∈ X with G [x0 ] = 0. Then, the minimization
problem
F [x] → min
over all x ∈ X with G [x] = α ∈ R,
has a solution.
Proof. The proof is almost exactly the same as the one for the standard Direct Method in
Theorem 2.3. However, we need to select the x j for an infimizing sequence with G [x j ] = α .
Then, by (WH3), this property also holds for any weak limit x∗ of a subsequence of the
x j ’s.
A large class of side constraints can be treated using the following simple result:
Lemma 2.37. Let g : Ω × Rm → R be a Carathéodory integrand such that there exists
M > 0 with
|g(x, v)| ≤ M(1 + |v| p ),
(x, v) ∈ Ω × Rm .
Then, G : W1,p (Ω; Rm ) → R defined through
G [u] :=
∫
Ω
g(x, u(x)) dx.
Comment...
is weakly continuous.
2.8. SIDE CONSTRAINTS
39
Proof. Let u j ⇀ u in W1,p , whereby after selecting a subsequence and employing the
Rellich–Kondrachov Theorem A.18 and Lemma A.3, u j → u in L p (strongly) and almost
everywhere. By assumption we have
±g(x, v) + M(1 + |v| p ) ≥ 0.
Thus, applying Fatou’s Lemma separately to these two integrands, we get
(
)
∫
∫
p
lim inf ±G [u j ] + M(1 + |u j | ) dx ≥ ±G [u] + M(1 + |u| p ) dx.
Ω
j→∞
Ω
Since ∥u j ∥L p → ∥u∥L p , we can combine these two assertions to get G [u j ] → G [u]. Since
this holds for all subsequences, it also follows for our original sequence.
We will later see more complicated side constraints, in particular constraints involving
the determinant of the gradient.
The following result is very important for the calculation of minimizers under side
constraints. It effectively says that at a minimizer u∗ of F subject to the side constraint
G [u∗ ] = 0, the derivatives of F and G have to be parallel, in analogy with the finitedimensional situation.
Theorem 2.38 (Lagrange multipliers). Let f : Ω × Rm × Rm×d → R, g : Ω × Rm → R be
Carathéodory integrands and satisfy the growth bounds
| f (x, v, A)| ≤ M(1 + |v| p + |A| p ),
|g(x, v)| ≤ M(1 + |v| p ),
(x, v, A) ∈ Ω × Rm × Rm×d ,
for some M > 0, 1 ≤ p < ∞, and let f , g be continuously differentiable in v and A. Suppose
that u∗ ∈ Wg1,p (Ω; Rm ), where g ∈ W1−1/p,p (∂ Ω; Rm ), minimizes the functional
F [u] :=
∫
Ω
f (x, u(x), ∇u(x)) dx,
m
u ∈ W1,p
g (Ω; R ),
under the side constraint
G [u] :=
∫
Ω
g(x, u(x)) dx = α ∈ R.
Assume furthermore the consistency condition
δ G [u∗ ][v] =
∫
Ω
Dv g(x, u∗ (x)) · v(x) dx ̸= 0
(2.20)
Comment...
m
for at least one v ∈ W1,p
0 (Ω; R ). Then, there exists a Lagrange multiplier λ ∈ R such that
u∗ is a weak solution of the system of PDEs
{
[
]
− div DA f (x, u, ∇u) + Dv f (x, u, ∇u) = λ Dv g(x, u) in Ω,
(2.21)
u=g
on ∂ Ω.
40
CHAPTER 2. CONVEXITY
Proof. The proof is structurally similar to its finite-dimensional analogue.
Let u∗ ∈ Wg1,p (Ω; Rm ) be a minimizer of F under the side constraint G [u] = 0. From
m
the consistency condition (2.20) we infer that there exists w ∈ W1,p
0 (Ω; R ) such that
δ G [u∗ ][w] =
∫
Ω
Dv g(x, u∗ ) · w dx = 1.
Now fix any variation v ∈ W01,p (Ω; Rm ) and define for s,t ∈ R,
H(s,t) := G [u∗ + sv + tw].
It is not difficult to see that H is continuously differentiable in both s and t (this uses strong
continuity of the integral, see Theorem 2.30) and
∂s H(s,t) = δ G [u∗ + sv + tw][v],
∂t H(s,t) = δ G [u∗ + sv + tw][w].
Thus, by definition of w,
H(0, 0) = α
∂t H(0, 0) = 1.
and
By the implicit function theorem, there exists a function τ : R → R such that τ (0) = 0 and
H(s, τ (s)) = α
for small |s|.
The chain rule yields for such s,
0 = ∂s [H(s, τ (s))] = ∂s H(s, τ (s)) + ∂t H(s, τ (s))τ ′ (s),
whereby
′
τ (0) = −∂s H(0, 0) = −
∫
Ω
Dv g(x, u∗ ) · v dx.
(2.22)
Now define for small |s| as above,
J(s) := F [u∗ + sv + τ (s)w].
We have
G [u∗ + sv + τ (s)w] = H(s, τ (s)) = α
and the continuously differentiable function J has a minimum at s = 0 by the minimization property of u∗ . Thus, with the shorthand notations DA f = DA f (x, u∗ , ∇u∗ ) and Dv f =
Dv f (x, u∗ , ∇u∗ ),
Comment...
0 = J ′ (0) =
∫
Ω
DA f : (∇v + τ ′ (0)∇w) + Dv f · (v + τ ′ (0)w) dx.
2.9. NOTES AND BIBLIOGRAPHICAL REMARKS
41
Rearranging and using (2.22), we get
∫
Ω
′
DA f : ∇v + Dv f · v dx = −τ (0)
=λ
∫
Ω
∫
Ω
DA f : ∇w + Dv f · w dx
Dv g(x, u∗ ) · v dx,
(2.23)
where we have defined
λ :=
∫
Ω
DA f : ∇w + Dv f · w dx.
Since (2.23) shows that u∗ is a weak solution of (2.21), the proof is finished.
Example 2.39 (Stationary Schrödinger equation). When looking for ground states in
quantum mechanics as in Section 1.3, we have to minimize
E [Ψ] :=
∫
RN
1
h̄2
|∇Ψ|2 + V (x)|Ψ|2 dx
2µ
2
over all Ψ ∈ W1,2 (RN ; C) under the side constraint
∥Ψ∥L2 = 1.
From Theorem 2.36 in conjunction with Lemma 2.37 (slightly extended to also apply to
the whole space Ω = RN ) we see that this problem always has at least one solution. Theorem 2.38 (likewise extended) yields that this Ψ satisfies
[ 2
]
−h̄
∆ +V (x) Ψ(x) = EΨ(x),
x ∈ Ω,
2µ
for some Lagrange multiplier E ∈ R, which is precisely the stationary Schrödinger equa2
tion. One can also show that E > 0 is the smallest eigenvalue of the operator Ψ 7→ [ −h̄
2µ ∆ +
V (x)]Ψ.
2.9 Notes and bibliographical remarks
Comment...
If a given problem has some convexity, this is invariable very useful and the analysis can
draw on a vast theory. We have not mentioned the general field of Convex Analysis, see for
example the classic texts [83] or [34]. In this area, however, the emphasis is more on extending the convex theory to “non-smooth” situations, see for example the monographs [84]
and [66, 67].
For the u-dependent variational integrals, the growth in the u-variable can be improved
up to q-growth, where q < p/(p − d) (the Sobolev embedding exponent for W1,p ). Moreover, we can work with the more general growth bounds | f (x, v, A)| ≤ M(g(x) + |v|q + |A| p ),
with g ∈ L1 (Ω; [0, ∞)) and q < p/(p − d). For reasons of simplicity, we have omitted these
generalization here.
42
CHAPTER 2. CONVEXITY
Comment...
The Fundamental Lemma of Calculus of Variations 2.21 is due to Du Bois-Raymond
and is sometimes named after him.
Difference quotients were already considered by Newton, their application to regularity theory is due to Nirenberg in the 1940s and 1950s. Many of the fundamental results
of regularity theory (albeit in a non-variational context) can be found in [43]. The book
by Giusti [44] contains theory relevant for variational questions. A nice framework for
Schauder estimates is [86]. De Giorgi’s 1968 counterexample is from [29].
Noether’s theorem has many ramifications and can be put into a very general form in
Hamiltonian systems and Lie-group theory. The idea is to study groups of symmetries and
their actions. For an introduction into this diverse field see [80].
Example 2.35 about the monotonicity formula is from [36].
Side-constraints can also take the form of inequalities, which is often the case in Optimization Theory or Mathematical Programming. This leads to Karush–Kuhn–Tucker conditions or obstacle problems.
Chapter 3
Quasiconvexity
We saw in the last chapter that convexity of the integrand implies lower semicontinuity for
integral functionals. Moreover, we saw in Proposition 2.11 that also the converse is true in
the scalar case d = 1 or m = 1. In the vectorial case (d, m > 1), however, it turns out that one
can find weakly lower semicontinuous integral functionals whose integrand is non-convex,
the following is the most prominent one: Let p ≥ d (Ω ⊂ Rd a bounded Lipschitz domain
as usual) and define
F [u] :=
∫
Ω
det ∇u(x) dx,
d
u ∈ W1,p
0 (Ω; R ).
Then, we can argue using Stokes’ Theorem (this can also be computed in a more elementary
way, see Lemma 3.8 below):
F [u] =
∫
Ω
det ∇u dx =
∫
Ω
du1 ∧ · · · ∧ dud =
∫
∂Ω
u1 ∧ du2 ∧ · · · ∧ dud = 0
d
because u ∈ W1,d
0 (Ω; R ) is zero on the boundary ∂ Ω. Thus, F is in fact constant on
d
W1,p
0 (Ω; R ), hence trivially weakly lower semicontinuous. However, the determinant function is far from being convex if d ≥ 2; for instance we can easily convex-combine a matrix
with positive determinant from two singular ones. But we can also find examples not involving singular matrices: For
(
)
(
)
(
)
1
1
−1 −2
1 −2
0 −2
A :=
,
B :=
,
A+ B =
,
2
1
2 −1
2 0
2
2
we have det A = det B = 3, but det(A/2 + B/2) = 4.
Furthermore, convexity of the integrand is not compatible with one of the most fundamental principles of continuum mechanics, the frame-indifference or objectivity. Assume
that our integrand f = f (A) is frame-indifferent, that is,
f (RA) = f (A)
for all A ∈ Rd×d , R ∈ SO(d),
Comment...
where SO(d) is the set of (d × d)-orthogonal matrices with determinant 1 (i.e. rotations).
Furthermore, assume that every purely compressive or purely expansive deformation costs
44
CHAPTER 3. QUASICONVEXITY
energy, i.e.
f (α I) > f (I)
for all α ̸= 1,
(3.1)
which is very reasonable in applications. Then, f cannot be convex: Let us for simplicity
assume d = 2. Set, for a fixed γ ∈ (0, 2π ),
(
)
cos γ − sin γ
R :=
∈ SO(2).
sin γ cos γ
Then, if f was convex, for M := (R + RT )/2 = (cos γ )I we would get
f ((cos γ )I) = f (M) ≤
)
1(
f (R) + f (RT ) = f (I),
2
contradicting (3.1). Sharper arguments are available, but the essential conclusion is the
same: convexity is not suitable for many variational problems originating from continuum
mechanics.
3.1 Quasiconvexity
We are now ready to replace convexity with a more general notion, which has become one
of the cornerstones of the modern Calculus of Variations. This is the notion of quasiconvexity, introduced by Charles B. Morrey in the 1950s: A locally bounded Borel-measurable
function h : Rm×d → R is called quasiconvex if
∫
h(A) ≤ −
B(0,1)
h(A + ∇ψ (z)) dz
(3.2)
∫
∫
m
−1 , where B(0, 1)
−
for all A ∈ Rm×d and all ψ ∈ W1,∞
0 (B(0, 1); R ). Here, B(0,1) := |B(0, 1)|
is the unit ball in Rd .
Let us give a physical interpretation to the notion of quasiconvexity: Let d = m = 3 and
assume that
F [y] :=
∫
h(∇y(x)) dx,
B(0,1)
y ∈ W1,∞ (B(0, 1); R3 ),
models the physical energy of an elastically deformed body, whose deformation from the
reference configuration is given through y : Ω = B(0, 1) → R3 (see Section 1.4 for more
details on this modeling). A special class of deformations are the affine ones, say a(x) =
y0 + Ax for some y0 ∈ R3 , A ∈ R3×3 . Then, quasiconvexity of f entails that
F [a] =
∫
B(0,1)
h(A) dx ≤
∫
B(0,1)
h(A + ∇ψ (z)) dz
Comment...
3
for all ψ ∈ W1,∞
0 (B(0, 1); R ). This means that the affine deformation a is always energetically favorable over the internally distorted deformation a(x) + ψ (x); this is very often a
3.1. QUASICONVEXITY
45
reasonable assumption for real materials. This argument also holds on any other domain by
Lemma 3.2 below.
To justify its name, we also need to convince ourselves that quasiconvexity
is indeed
∫
a notion of convexity: For A ∈ Rm×d and V ∈ L1 (B(0, 1); Rm×d ) with B(0,1) V (x) dx = 0
define the probability measure µ ∈ M1 (Rm×d ) via its action as an element of C0 (Rm×d )∗ as
follows (recall that M(Rm×d ) ∼
= C0 (Rm×d )∗ by the Riesz Representation Theorem A.10):
∫
⟨
⟩
h, µ := −
h(A +V (x)) dx
B(0,1)
for h ∈ C0 (Rm×d ).
This µ is easily seen to be an element of the dual space to C0 (Rm×d ), and in fact µ is a
probability measure: For the boundedness we observe |⟨h, µ ⟩| ≤ ∥h∥∞ , whereas the positivity ⟨h, µ ⟩ ≥ 0 for h ≥ 0 and the normalization ⟨1, µ ⟩ = 1 follow at once. The barycenter [µ ]
of µ is
∫
⟩
⟨
[µ ] = id, µ = A + −
V (x) dx = A.
B(0,1)
Therefore, if h is convex, we get from then Jensen inequality, Lemma A.9,
⟩ ∫
⟨
h(A) = h([µ ]) ≤ h, µ = −
h(A +V (x)) dx.
B(0,1)
m
In particular, (3.2) holds if we set V (x) := ∇ψ (x) for any ψ ∈ W1,∞
0 (B(0, 1); R ). Thus, we
have shown:
Proposition 3.1. All convex functions h : Rm×d → R are quasiconvex.
However, quasiconvexity is weaker than classical convexity, at least if d, m ≥ 2. The
determinant function and, more generally, minors are quasiconvex, as will be proved in the
next section, but these minors (except for the (1 × 1)-minor) are not convex. We will see
many further quasiconvex functions in the course of this and the next chapter. The relation
between (quasi)convexity and Jensen’s inequality plays a major role in the theory, as we
will see soon.
Other basic properties of quasiconvexity are the following:
Lemma 3.2. In the definition of quasiconvexity we can replace the domain B(0, 1) by
any bounded Lipschitz domain Ω′ ⊂ Rd . Furthermore, if h has p-growth, i.e. |h(A)| ≤
M(1 + |A| p ) for some 1 ≤ p < ∞, M > 0, then in the definition of quasiconvexity we can
1,p
replace testing with ψ of class W1,∞
0 by ψ of class W0 .
Proof. To see the first statement, we will prove the following: If ψ ∈ W1,p (Ω; Rm ), where
Ω ⊂ Rd is a bounded Lipschitz domain, satisfies ψ |∂ Ω = Ax for some A ∈ Rm×d , then there
exists ψ̃ ∈ W1,p (Ω′ ; Rm ) with ψ̃ |∂ Ω′ = Ax and, if one of the following integrals exists and
is finite,
∫
− h(∇ψ ) dx = − h(∇ψ̃ ) dy
Ω
Ω′
for all measurable h : Rm×d → R.
(3.3)
Comment...
∫
46
CHAPTER 3. QUASICONVEXITY
In particular, ∥∇ψ ∥L p = ∥∇ψ̃ ∥L p . Clearly, this implies that the definition of quasiconvexity
is independent of the domain.
To see (3.3), take a Vitali cover of Ω′ with re-scaled versions of Ω, see Theorem A.8,
N
⊎
Ω′ = Z ⊎
Ω(ak , rk ),
|Z| = 0,
k=1
with ak ∈ Ω, rk > 0, Ω(ak , rk ) := ak + rk Ω, where k = 1, . . . , N ∈ N. Then let
(
)

r ψ y − ak + Aa if y ∈ Ω(a , r ),
k
k
k k
rk
ψ̃ (y) :=
y ∈ Ω′ .

0
otherwise,
It holds that ψ̃ ∈ W1,∞ (Ω′ ; Rm ). We compute for any measurable h : Rm×d → R,
))
( (
∫
∫
y − ak
dy
h(∇ψ̃ ) dy = ∑
h ∇ψ
rk
Ω′
k Ω(ak ,rk )
= ∑ rkd
k
=
|Ω′ |
|Ω|
∫
∫
Ω
Ω
h(∇ψ ) dx
h(∇ψ ) dx
since ∑k rkd = |Ω′ |/|Ω|. This shows (3.3).
1,p
m
m
The second assertion follows since W1,∞
0 (B(0, 1); R ) is dense in W0 (B(0, 1); R ) and
under a p-growth assumption, the integral functional
∫
ψ 7→ − h(∇ψ ) dx,
Ω
m
ψ ∈ W1,p
0 (B(0, 1); R ),
is well-defined and continuous, the latter for example by Pratt’s Theorem A.5, see the proof
of Theorem 2.30 for a similar argument.
An even weaker notion of convexity is the following one: A locally bounded Borelmeasurable function h : Rm×d → R is called rank-one convex if it is convex along any
rank-one line, that is,
(
)
h θ A + (1 − θ )B ≤ θ h(A) + (1 − θ )h(B)
for all A, B ∈ Rm×d with rank(A − B) ≤ 1 and all θ ∈ (0, 1). In this context, recall that a
matrix M ∈ Rm×d has rank one if and only if M = a ⊗ b = abT for some a ∈ Rm , b ∈ Rd .
Proposition 3.3. If h : Rm×d → R is quasiconvex, then it is rank-one convex.
Proof. Let A, B ∈ Rm×d with B − A = a ⊗ n for a ∈ Rm and n ∈ Rd with |n| = 1. Denote
by Qn a unit cube (|Qn | = 1) centered at the origin and with one side normal to n. By
Lemma 3.2 we can require h to satisfy
∫
h(M) ≤ − h(∇ψ (z)) dz
Comment...
Qn
(3.4)
3.1. QUASICONVEXITY
47
Figure 3.1: The function φ0 .
for all M ∈ Rm×d and all ψ ∈ W1,∞ (Qn ; Rm ) with ψ (x) = Mx on ∂ Qn (in the sense of trace).
For fixed θ ∈ (0, 1), we now let M := θ A + (1 − θ )B and define the sequence of test
functions φ j ∈ W1,∞ (Qn ; Rm ) as follows:
)
1 (
φ j (x) := Mx + φ0 jx · n − ⌊ jx · n⌋ a,
j
x ∈ Qn ,
where ⌊s⌋ denotes the largest integer below or equal to s ∈ R, and
{
−(1 − θ )t if t ∈ [0, θ ],
φ0 (t) :=
t ∈ [0, 1).
θt − θ
if t ∈ (θ , 1],
Such φ j are called laminations in direction n. See Figures 3.1 and 3.2 for φ0 and φ j ,
respectively.
We calculate
{
M − (1 − θ )a ⊗ n = A if jx · n − ⌊ jx · n⌋ ∈ (0, θ ),
∇φ j (x) =
M +θa⊗n = B
if jx · n − ⌊ jx · n⌋ ∈ (θ , 1).
Thus,
∫
lim − h(∇φ j (x)) dx = θ h(A) + (1 − θ )h(B).
j→∞ Qn
∗
Also notice that φ j ⇁ a in W1,∞ for a(x) := Mx since φ0 is uniformly bounded. We will
show below that we may replace the sequence (φ j ) with a sequence (ψ j ) ⊂ W1,∞ (Qn ; Rm )
with the additional property ψ (x) = Mx on ∂ Qn , but such that still
∫
lim − h(∇ψ j (x)) dx = θ h(A) + (1 − θ )h(B).
Comment...
j→∞ Qn
48
CHAPTER 3. QUASICONVEXITY
Figure 3.2: The laminate φ j .
Combining this with (3.4) allows us to conclude
(
)
h θ A + (1 − θ )B ≤ θ h(A) + (1 − θ )h(B)
and h is indeed rank-one convex.
It remains to construct the sequence (ψ j ) from (φ j ), for which we employ a standard
cut-off construction:
Take a sequence
(ρ j ) ⊂ C∞
c (Qn ; [0, 1]) of cut-off functions such that
{
}
for G j := x ∈ Ω : ρ j (x) = 1 it holds that |Qn \ G j | → 0 as j → ∞. With a(x) = Mx set
ψ j,k := ρ j φk + (1 − ρ j )a,
which lies in W1,∞ (Qn ; Rm ) and satisfies ψ j,k (x) = Mx near ∂ Qn . Also,
∇ψ j,k = ρ j ∇φk + (1 − ρ j )M + (φk − a) ⊗ ∇ρ j .
Since W1,∞ (Qn ; Rm ) embeds compactly into L∞ (Qn ; Rm ) by the Rellich-Kondrachov theorem (or the classical Arzéla–Ascoli theorem), we have φk → a uniformly. Thus, in
∥∇ψ j,k ∥L∞ ≤ ∥∇φk ∥L∞ + |M| + ∥(φk − a) ⊗ ∇ρ j ∥L∞
the last term vanishes as k → ∞. As the ∇φk furthermore are uniformly L∞ -bounded, we
can for every j ∈ N choose k( j) ∈ N such that ∥∇ψ j,k( j) ∥L∞ is bounded by a constant that is
independent of j. Then, as h is assumed to be locally bounded, this implies that there exists
C > 0 (again independent of j) with
Comment...
∥h(∇φ j )∥L∞ + ∥h(∇ψ j,k( j) )∥L∞ ≤ C.
3.1. QUASICONVEXITY
49
Hence, for ψ j := ψ j,k( j) , we may estimate
∫
lim
j→∞ Qn
|h(∇φ j ) − h(∇ψ j )| dx = lim
∫
j→∞ Qn \G j
|h(∇φ j )| + |h(∇ψ j,k( j) )| dx
≤ lim C|Qn \ G j | = 0.
j→∞
This shows that above we may replace (φ j ) by (ψ j ).
Since for d = 1 or m = 1, rank-one convexity obviously is equivalent to convexity, we
have for the scalar case:
Corollary 3.4. If h : Rm×d → R is quasiconvex and d = 1 or m = 1, then h is convex.
The following is a standard example:
Example 3.5 (Alibert–Dacorogna–Marcellini). For d = m = 2 and γ ∈ R define
(
)
hγ (A) := |A|2 |A|2 − 2γ det A ,
A ∈ R2×2 .
For this function it is known that
√
2 2
≈ 0.94,
• hγ is convex if and only if |γ | ≤
3
2
• hγ is rank-one convex if and only if |γ | ≤ √ ≈ 1.15,
3
]
(
2
√
• hγ is quasiconvex if and only if |γ | ≤ γQC for some γQC ∈ 1,
.
3
It is currently unknown whether γQC = √23 . We do not prove these statements here, see
Section 5.3.8 in [25] for the (involved!) details.
We end this section with the following regularity theorem for rank-one convex (and
hence also for quasiconvex) functions:
Proposition 3.6. If h : Rm×d → R is rank-one convex (and locally bounded), then it is
locally Lipschitz continuous.
Proof. We will prove the stronger quantitative bound
lip(h; B(A0 , r)) ≤
√
min(d, m) ·
osc(h; B(A0 , 6r))
,
3r
(3.5)
where
|h(A) − h(B)|
|A − B|
A,B∈B(A0 ,r)
sup
Comment...
lip(h; B(A0 , r)) :=
50
CHAPTER 3. QUASICONVEXITY
is the Lipschitz constant of h on the ball B(A0 , r) ⊂ Rm×d , and
osc(h; B(A0 , r)) :=
sup
|h(A) − h(B)|
A,B∈B(A0 ,r)
is the oscillation of h on B(A0 , r). By the local boundedness, the oscillation is bounded on
every ball, and so, locally, the Lipschitz constant is finite.
To show (3.5), let A, B ∈ B(x0 , r) and assume first that rank(A − B) ≤ 1. Define the point
C ∈ Rm×d as the intersection of ∂ B(A0 , 2r) with the ray starting at B and going through A.
Then, because h is convex along this ray,
|h(A) − h(B)| |h(C) − h(B)| osc(h; B(A0 , 2r))
≤
≤
=: α (2r).
|A − B|
|C − B|
r
(3.6)
For general A, B ∈ B(A0 , r), use the (real) singular value decomposition to write
min(d,m)
∑
B−A =
σi P(ei ⊗ ei )QT ,
i=1
where σi ≥ 0 is the i’th singular value, and P ∈ Rm×m , Q ∈ Rd×d are orthogonal matrices.
Set
k−1
Ak := A + ∑ σi P(ei ⊗ ei )QT ,
k = 1, . . . , min(d, m) + 1,
i=1
for which we have A1 = A, Amin(d,m)+1 = B. We estimate (recall that we are employing the
√
√
Frobenius norm |M| := ∑ j,k (Mkj )2 = ∑i σi (M)2 )
√
k−1
∑ σi2 ≤ |A − A0 | + |B − A| < 3r
|Ak − A0 | ≤ |A − A0 | +
i=1
and
min(d,m)
∑
min(d,m)
∑
|Ak − Ak+1 |2 =
k=1
σi2 = |A − B|2 .
k=1
Applying (3.6) to Ak , Ak+1 ∈ B(A0 , 3r), k = 1, . . . , min(d, m), we get
min(d,m)
|h(A) − h(B)| ≤
∑
|h(Ak ) − h(Ak+1 )|
k=1
min(d,m)
≤ α (6r)
∑
|Ak − Ak+1 |
k=1
√
≤ α (6r) min(d, m) ·
[
∑
k=1
√
= α (6r) min(d, m) · |A − B|.
Comment...
This yields (3.5).
]1/2
min(d,m)
|Ak − Ak+1 |2
3.2. NULL-LAGRANGIANS
51
Remark 3.7. An improved argument (see Lemma 2.2 in [12]), where one orders the singular values in a particular way, allows to establish the better estimate
√
lip(h; B(A0 , r)) ≤
min(d, m) ·
osc(h; B(A0 , 2r))
.
r
3.2 Null-Lagrangians
The determinant is quasiconvex, but it is only one representative of a larger class of canonical examples of quasiconvex, but not convex, functions: In this section, we will investigate
the properties of minors (subdeterminants) as integrands. Let for r ∈ {1, 2, . . . , min(d, m)},
{
}
I ∈ P(m, r) := (i1 , i2 , . . . , ir ) ∈ {1, . . . , m}r : i1 < i2 < · · · < ir
and J ∈ P(d, r) be ordered multi-indices. Then, a (r × r)-minor M : Rm×d → R is a function of the form
( )
M(A) = MJI (A) := det AIJ ,
where AIJ is the (square) matrix consisting of the I-rows and J-columns of A. The number
r ∈ {1, . . . , min(d, m)} is called the rank of the minor M.
The first result shows that all minors
are null-Lagrangians, which by definition is the
∫
class of integrands h : Rm×d such that Ω h(∇u) dx only depends on the boundary values of
u.
Lemma 3.8. Let M : Rm×d → R be a (r × r)-minor, r ∈ {1, . . . , min(d, m)}. If u, v ∈
m
W1,p (Ω; Rm ), p ≥ r, with u − v ∈ W1,p
0 (Ω; R ), then
∫
Ω
∫
M(∇u) dx =
Ω
M(∇v) dx.
Proof. In all of the following, we will assume that u, v are smooth and supp(u − v) ⊂⊂ Ω,
which can be achieved by approximation (incorporating a cut-off procedure), see Theorem. A.19. We also need that taking the minor M of the gradient commutes with strong
convergence, i.e. the strong continuity of u 7→ M(∇u) in W1,p for p ≥ r; this follows by
Hadamard’s inequality |M(A)| ≤ |A|r and Pratt’s Theorem A.5. Finally, we assume (again
by approximation) that supp(u − v) ⊂⊂ Ω.
All first-order minors are just the entries of ∇u, ∇v and the result follows from
∫
Ω
∇u dx =
∫
∂Ω
u · n dσ =
∫
∂Ω
v · n dσ =
∫
Ω
∇v dx
m
since u − v ∈ W1,p
0 (Ω; R ).
For higher-order minors, the crucial observation is that minors of gradients can be written as divergences, which we will establish below. So, if M(∇u) = div G(u, ∇u) for some
G(u, ∇u) ∈ C∞ (Ω; Rd ), then, since supp(u − v) ⊂⊂ Ω,
Ω
∫
M(∇u) dx =
∂Ω
G(u, ∇u) · n dσ =
∫
∂Ω
G(v, ∇v) · n dσ =
∫
Ω
M(∇v) dx
Comment...
∫
52
CHAPTER 3. QUASICONVEXITY
and the result follows.
We first consider the physically most relevant cases d = m ∈ {2, 3}.
For d = m = 2 and u = (u1 , u2 )T , the only second-order minor is the Jacobian determinant and we easily see from the fact that second derivatives of smooth functions commute
that
det ∇u = ∂1 u1 ∂2 u2 − ∂2 u1 ∂1 u2
= ∂1 (u1 ∂2 u2 ) − ∂2 (u1 ∂1 u2 )
)
(
= div u1 ∂2 u2 , −u1 ∂1 u2 .
¬k (A), i.e. the determinant of A after
For d = m = 3, consider a second-order minor M¬l
deleting the k’th row and l’th column. Then, analogously to the situation in dimension two,
we get, using cyclic indices k, l ∈ {1, 2, 3},
]
[
¬k
M¬l
(∇u) = (−1)k+l ∂l+1 uk+1 ∂l+2 uk+2 − ∂l+2 uk+1 ∂l+1 uk+2
]
[
= (−1)k+l ∂l+1 (uk+1 ∂l+2 uk+2 ) − ∂l+2 (uk+1 ∂l+1 uk+2 ) .
(3.7)
For the three-dimensional Jacobian determinant, we will show
3
(
)
det ∇u = ∑ ∂l u1 (cof ∇u)1l ,
(3.8)
l=1
¬k (A). To see this, use the Cramer formula
where we recall that (cof A)kl = (−1)k+l M¬l
(det A)I = A(cof A)T , which holds for any matrix A ∈ Rn×n , to get
3
det ∇u = ∑ ∂l u1 · (cof ∇u)1l .
l=1
Then, our formula (3.8) follows from the Piola identity
div cof ∇u = 0,
(3.9)
¬k (∇u). Hence, (3.8) holds.
which can be verified directly from the expression (3.7) for M¬l
For general dimensions d, m we use the notation of differential geometry to tame the multilinear algebra involved in the proof. So, let M be a (r × r)-minor. Reordering x1 , . . . , xd
and u1 , . . . , um , we can assume without loss of generality that M is a principal minor, i.e.
M is the determinant of the top-left (r × r)-submatrix. Then, in differential geometry it is
well-known that
M(∇u) dx1 ∧ · · · ∧ dxd = du1 ∧ · · · ∧ dur ∧ dxr+1 ∧ · · · ∧ dxd
= d(u1 ∧ du2 ∧ · · · ∧ dur ∧ dxr+1 ∧ · · · ∧ dxd ).
Thus, Stokes Theorem from Differential Geometry gives
∫
Ω
M(∇u) dx1 ∧ · · · ∧ dxd =
=
Comment...
Therefore,
∫
Ω M(∇u) dx
∫
Ω
∫
d(u1 ∧ du2 ∧ · · · ∧ dur ∧ dxr+1 ∧ · · · ∧ dxd )
∂Ω
1 ∧ · · · ∧ dxd
u1 ∧ du2 ∧ · · · ∧ dur ∧ dxr+1 ∧ · · · ∧ dxd .
only depends on the values of u around ∂ Ω.
3.2. NULL-LAGRANGIANS
53
Consequently, we immediately have:
Corollary 3.9. All (r × r)-minors M : Rm×d → R are quasiaffine, that is, both M and −M
are quasiconvex.
We will later use this property to define a large and useful class of quasiconvex functions, namely the polyconvex ones; this is the topic of the next chapter.
Further, minors enjoy a surprising continuity property:
Lemma 3.10 (Weak continuity of minors). Let M : Rm×d → R be an (r × r)-minor, r ∈
{1, . . . , min(d, m)}, and (u j ) ⊂ W1,p (Ω; Rm ) where r < p ≤ ∞. If
u j ⇀ u in W1,p
∗
(or u j ⇁ u in W1,∞ ).
then
∗
M(∇u j ) ⇀ M(∇u) in L p/r
(or M(∇u j ) ⇁ M(∇u) in W1,∞ ).
Proof. For d = m ∈ {2, 3}, we use the special structure of minors as divergences, as ex¬k in three dimensions (that is,
hibited in the proof of Lemma 3.8. For a (2 × 2)-minor M¬l
deleting the k’th row and l’th column; in two dimensions there is only one (2 × 2)-minor,
the determinant, but we still use the same notation), we rely on (3.7) to observe that with
cyclic indices k, l ∈ {1, 2, 3},
∫
Ω
¬k
M¬l
(∇u j )ψ dx = −(−1)k+l
∫
Ω
k+1
k+2
k+2
(uk+1
j ∂l+2 u j )∂l+1 ψ − (u j ∂l+1 u j )∂l+2 ψ dx
p/r (Ω)∗ ∼ L p/(p−r) (Ω). Since
for all ψ ∈ C∞
=
c (Ω) and then by density also for all ψ ∈ L
1,p
p
u j ⇀ u in W , we have u j → u strongly in L . The above expression consists of products
of one L p -weakly and one L p -strongly convergent factor as well as a uniformly bounded
function under the integral. Hence the integral is weakly continuous and as j → ∞ converges
to
∫
Ω
¬k
M¬l
(∇u)ψ dx.
This finishes the proof for d = m = 2.
For d = m = 3, we additionally need to consider the determinant. However, as a consequence of the two-dimensional reasoning, cof ∇u j ⇀ cof ∇u in L p/2 . Then, (3.8) implies
∫
3
Ω
det ∇u j ψ dx = − ∑
∫ [
]
u1j (cof ∇u j )1l ∂l ψ dx
l=1 Ω
and by a similar reasoning as above, this converges to
3
−∑
∫ [
∫
]
1
1
u (cof ∇u)l ∂l ψ dx = det ∇u ψ dx.
l=1 Ω
Ω
Comment...
In the general case, one proceeds by induction. We omit the details.
54
CHAPTER 3. QUASICONVEXITY
It can also be shown that any quasiaffine function can be written as an affine function
of all the minors of the argument. This characterization of quasiaffine functions is due to
Ball [5], a different proof (also including further characterizing statements) can be found in
Theorem 5.20 of [25].
3.3 Young measures
We now introduce a very versatile tool for the study of minimization problems – the Young
measure, named after its inventor Laurence Chisholm Young. In this chapter, we will only
use it as a device to organize the proof of lower semicontinuity for quasiconvex integral
functions, see Theorems 3.25, 3.29, but later it will also be important as an object in its own
right.
Let us motivate this device by the following central issue: Assume that we have a weakly
convergent sequence v j ⇀ v in L2 (Ω), where Ω ⊂ Rd is a bounded Lipschitz domain, and
an integral functional
F [v] :=
∫
Ω
f (x, v(x)) dx
with f : Ω × R → R continuous and bounded, say. Then, we may want to know the limit of
F [v j ] as j → ∞, for instance if (v j ) is a minimizing sequence. In fact, this will be cornerstone of the proof of weak lower semicontinuity for integral functionals with quasiconvex
integrand.
Equivalently, we could ask for the L∞ -weak* limit of the sequence of compound functions g j (x) := f (x, v j (x)). It is easy to see that the weak* limit in general is not equal to
x 7→ f (x, v(x)). For example, in Ω = (0, 1), consider the sequence
{
a if jx − ⌊ jx⌋ ≤ θ ,
x ∈ Ω,
v j (x) :=
b otherwise,
where a, b ∈ R, θ ∈ (0, 1), and ⌊s⌋ is the largest integer below s ∈ R. Then, if f (x, a) = α ∈
∗
R and f (x, b) = β ∈ R (and smooth and bounded elsewhere), we see v j ⇁ θ a + (1 − θ )b
and
∗
f (x, v j (x)) ⇁ h(x) = θ α + (1 − θ )β .
The right-hand side, however, is not necessarily f (x, θ a + (1 − θ )b). Instead, we could
write
∫
h(x) =
f (x, v) dνx (v)
with a x-parametrized family of probability measures νx ∈ M1 (R), namely
νx = θ δα + (1 − θ )δβ ,
x ∈ (0, 1).
Comment...
This family (νx )x∈Ω is the “Young measure generated by the sequence (v j )”.
We start with the following fundamental result:
3.3. YOUNG MEASURES
55
Theorem 3.11 (Fundamental theorem of Young measure theory). Let (V j ) ⊂ L p (Ω; RN )
be a bounded sequence, where 1 ≤ p ≤ ∞. Then, there exists a subsequence (not relabeled)
and a family (νx )x∈Ω ⊂ M1 (RN ) of probability measures, called the p-Young measure generated by the sequence (V j ), such that the following assertions are true:
(I) The family (νx ) is weakly* measurable, that is, for all Carathéodory integrands
f : Ω × RN → R, the compound function
x 7→
∫
f (x, A) dνx (A),
x ∈ Ω,
is Lebesgue-measurable.
(II) If 1 ≤ p < ∞, it holds that
⟨⟨ p ⟩⟩ ∫ ∫
| q| , ν :=
|A| p dνx (A) dx < ∞
Ω
or, if p = ∞, there exists a compact set K ⊂ RN such that
supp νx ⊂ K
for a.e. x ∈ Ω.
(III) For all Carathéodory integrands f : Ω × RN → R such that the family ( f (x,V j (x))) j
is uniformly L1 -bounded and equiintegrable, it holds that
)
(
∫
in L1 (Ω).
(3.10)
f (x,V j (x)) ⇀ x 7→ f (x, A) dνx (A)
For parametrized measures ν = (νx )x∈Ω that satisfy (I) and (II) above, we write ν =
(νx ) ∈ Y p (Ω; RN ), the latter being called the set of p-Young measures. In symbols we
express the generation of a Young measure ν by as sequence (V j ) as
Y
Vj → ν .
Furthermore, the convergence (3.10) can equivalently be expressed as (absorb a test function
for weak convergence into f )
∫
Ω
f (x,V j (x)) dx →
∫ ∫
Ω
f (x, A) dνx (A) dx =:
⟨⟨
⟩⟩
f,ν .
In this context, recall from the Dunford–Pettis Theorem A.13 that ( f (x,V j (x))) j is equiintegrable if and only it is weakly relatively compact in L1 (Ω).
Comment...
Remark 3.12. Some assumptions in the fundamental theorem can be weakened: For the
existence of a Young measure, we
only need to require limK↑∞ sup j |{|V j | ≥ K}| = 0, which
∫
for example follows from sup j Ω |V j | p dx < ∞ for some p > 0. Of course, in this case (II)
needs to be suitably adapted.
56
CHAPTER 3. QUASICONVEXITY
Proof. We associate with each V j an elementary Young measure (δV j ) ∈ Y p (Ω; RN ) as
follows:
(δV j )x := δV j (x)
for a.e. x ∈ Ω.
(3.11)
Below we will show the following Young measure compactness principle, which is also
useful in other contexts. We only formulate the principle for 1 ≤ p < ∞ for reasons of
convenience, but it holds (suitably adapted) also for p = ∞.
Let (ν ( j) ) j ⊂ Y p (Ω; RN ) be a sequence of p-Young measures such that
∫ ∫
⟩⟩
⟨⟨
( j)
sup j | q| p , ν ( j) = sup j
|A| p dνx (A) dx < ∞.
Ω
(3.12)
Then, there exists a subsequence of the (ν j ) (not relabeled) and ν ∈ Y p (Ω; RN )
such that
⟨⟨
⟩⟩ ⟨⟨
⟩⟩
lim f , ν ( j) = f , ν
(3.13)
j→∞
for all Carathéodory f : Ω × RN → R for which the sequence of functions x 7→
( j)
⟨ f (x, q), νx ⟩ is uniformly L1 -bounded and the equiintegrability condition
⟨⟨
⟩⟩
sup j f (x, A)1{| f (x,A)|≥m} , ν ( j) → 0
as
m → ∞.
(3.14)
holds. Moreover,
⟨⟨ p ⟩⟩
| q| , ν ≤ lim inf ⟨⟨| q| p , ν ( j) ⟩⟩.
j→∞
In the theorem, our assumption of L p -boundedness of (V j ) corresponds to (3.12), so (I)–
(III) follow immediately from the compactness principle applied to our sequence (δV j ) j of
elementary Young measures from (3.11) above once we realize that for the sequence of
Young measures (δV j ) j the condition (3.14) corresponds precisely to the equiintegrability
of ( f (x,V j (x))) j .
It remains to show the compactness principle. To this end define the measures
( j)
µ ( j) := Lxd Ω ⊗ νx ,
which is just a shorthand notation for the Radon measures µ ( j) ∈ M(Ω × RN ) ∼
= C0 (Ω ×
RN )∗ defined through their action
⟨
⟩ ∫ ∫
( j)
f , µ ( j) =
f (x, A) dνx (A) dx
Ω
for all f ∈ C0 (Ω × RN ).
So, for the ν ( j) = δV j from (3.11), we recover the familiar form
⟨
⟩ ∫
f (x,V j (x)) dx
f , µ ( j) =
Comment...
Ω
for all f ∈ C0 (Ω × RN ),
3.3. YOUNG MEASURES
57
but in the compactness principle we might be in a more general situation.
Clearly, every so-defined µ ( j) is a positive measure and
⟨
⟩
f , µ ( j) ≤ |Ω| · ∥ f ∥∞ ,
whereby the (µ ( j) ) constitute a uniformly bounded sequence in the dual space C0 (Ω×RN )∗ .
Thus, by the Sequential Banach–Alaoglu Theorem A.12, we can select a (non-relabeled)
subsequence such that with a µ ∈ C0 (Ω × RN )∗ it holds that
⟩ ⟨
⟨
⟩
f , µ ( j) → f , µ
for all f ∈ C0 (Ω × RN ).
(3.15)
Next we will show that µ can be written as µ = Lxd Ω ⊗ νx for a weakly* measurable
parametrized family ν = (νx )x∈Ω ⊂ M1 (RN ) of probability measures.
Theorem 3.13 (Disintegration). Let µ ∈ M(Ω × RN ) be a finite Radon measure. Then,
there exists a weakly* measurable family (νx )x∈Ω ⊂ M1 (RN ) of probability measures such
that with κ (A) := µ (A × RN ),
µ = κ (dx) ⊗ νx ,
that is,
⟨
⟩
f,µ =
∫ ∫
Ω
f (x, A) dνx (A) dκ (x)
for all f ∈ C0 (Ω × RN ).
A proof of this measure-theoretic result can be found in Theorem 2.28 of [2].
In our situation, we have for f(x, v) := φ(x) with arbitrary φ ∈ C_0(Ω) that

    ⟨φ, µ⟩ = lim_{j→∞} ⟨φ, µ^(j)⟩ = ∫_Ω φ(x) dx,

and so κ = L^d⌞Ω. Hence, the Disintegration Theorem immediately yields the weak* measurability of (ν_x) and thus lim_{j→∞} ⟨⟨f, ν^(j)⟩⟩ = ⟨⟨f, ν⟩⟩ for f ∈ C_0(Ω × R^N), which is (3.13).
Next, we show (3.13) for the case of f merely Carathéodory and bounded and such
that there exists a compact set K ⊂ RN with supp f ⊂ Ω × K. We invoke the Scorza Dragoni
Theorem 2.5 to get an increasing sequence of compact sets Sk ⊂⊂ Ω (k ∈ N) with |Ω\Sk | ↓ 0
such that f |Sk ×RN is continuous. Let fk ∈ C(Ω × RN ) be an extension of f |Sk ×RN to all of
Ω × RN with the property that the fk are uniformly bounded. Indeed, since K is bounded
and | fk (x, A)| ≤ M(1 + |A| p ), f is bounded and hence we may require uniform boundedness
of the fk .
Now, since f_k ∈ C_0(Ω × R^N), (3.15) implies

    ⟨f_k(x,·), ν_x^(j)⟩ ⇀ ⟨f_k(x,·), ν_x⟩    in L^1    as j → ∞.

Thus, we also get

    ⟨f(x,·), ν_x^(j)⟩ ⇀ ⟨f(x,·), ν_x⟩    in L^1(S_k)    as j → ∞.
On the other hand,

    ∫_Ω | ⟨f(x,·), ν_x^(j)⟩ − 1_{S_k} ⟨f(x,·), ν_x^(j)⟩ | dx ≤ ∫_{Ω\S_k} | ⟨f(x,·), ν_x^(j)⟩ | dx

and this converges to zero as k → ∞, uniformly in j, by the uniform boundedness of the f_k; the same estimate holds with ν in place of ν^(j). Therefore, we may conclude
    ⟨f(x,·), ν_x^(j)⟩ ⇀ ⟨f(x,·), ν_x⟩    in L^1(Ω)    as j → ∞,
which directly implies (3.13) (for f with supp f ⊂ Ω × K).
Finally, to remove the restriction of compact support in A, we remark that it suffices to show the assertion under the additional constraint that f ≥ 0 by considering the positive and negative part separately. Then, in the case f ≥ 0, choose for any m ∈ N a function ρ_m ∈ C^∞_c(R^N; [0, 1]) with ρ_m ≡ 1 on B(0, m) and supp ρ_m ⊂ B(0, 2m). Set

    f_m(x, A) := ρ_m(|A|^{p/2}) ρ_m(f(x, A)) f(x, A).
Then,

    E_{j,m} := ∫_Ω ⟨f(x,·), ν_x^(j)⟩ − ⟨f_m(x,·), ν_x^(j)⟩ dx
            ≤ ∫_Ω ∫ [ 1 − ρ_m(|A|^{p/2}) ρ_m(f(x, A)) ] f(x, A) dν_x^(j)(A) dx
            ≤ ∫_Ω ∫_{{|A| ≥ m^{2/p} or f(x,A) ≥ m}} f(x, A) dν_x^(j)(A) dx
            ≤ M ∫_Ω ∫_{{|A| ≥ m^{2/p}}} dν_x^(j)(A) dx + ∫_Ω ∫_{{f(x,A) ≥ m}} f(x, A) dν_x^(j)(A) dx
            ≤ (M/m) sup_j ⟨⟨|A|^p, ν^(j)⟩⟩ + sup_j ⟨⟨f(x, A) 1_{{|f(x,A)| ≥ m}}, ν^(j)⟩⟩.

Both of these terms converge to zero as m → ∞ by the assumption from the compactness principle, in particular (3.12) and the equiintegrability condition (3.14).
As the Young measure convergence holds for f_m by the previous proof step, we get

    lim sup_{j→∞} ( ⟨⟨f, ν^(j)⟩⟩ − ⟨⟨f_m, ν⟩⟩ ) = lim sup_{j→∞} ( ⟨⟨f_m, ν^(j)⟩⟩ − ⟨⟨f_m, ν⟩⟩ + E_{j,m} ) ≤ sup_j E_{j,m}.

Then, since f_m ↑ f and sup_j E_{j,m} vanishes as m → ∞ and f ≥ 0, we get that indeed

    ⟨⟨f, ν^(j)⟩⟩ → ⟨⟨f, ν⟩⟩    as j → ∞.
The last assertion to be shown in the compactness principle is, for 1 ≤ p < ∞,

    ⟨⟨|·|^p, ν⟩⟩ ≤ lim inf_{j→∞} ⟨⟨|·|^p, ν^(j)⟩⟩.

For K ∈ N define |A|_K := min{|A|, K} ≤ |A|. Then, by the Young measure representation for equiintegrable integrands,

    lim inf_{j→∞} ⟨⟨|·|^p, ν^(j)⟩⟩ ≥ lim_{j→∞} ⟨⟨|·|_K^p, ν^(j)⟩⟩ = ⟨⟨|·|_K^p, ν⟩⟩    for all K ∈ N.

We conclude by letting K ↑ ∞ and using the monotone convergence lemma.
Remark 3.14. The Fundamental Theorem can also be proved in a more functional-analytic way as follows: Let L^∞_{w*}(Ω; M(R^N)) be the set of essentially bounded weakly* measurable functions defined on Ω with values in the Radon measures M(R^N). It turns out (see for example [7] or [30, 31]) that L^∞_{w*}(Ω; M(R^N)) is the dual space to L^1(Ω; C_0(R^N)). One can show that the maps ν^(j) = (x ↦ ν_x^(j)) form a uniformly bounded set in L^∞_{w*}(Ω; M(R^N)) and by the Sequential Banach–Alaoglu Theorem A.12 we can again conclude the existence of a weak* limit point ν of the ν^(j)'s. The extended representation of limits for Carathéodory f follows as before.
It is known that every weakly* measurable parametrized measure (ν_x)_{x∈Ω} ⊂ M^1(R^N) with (x ↦ ⟨|·|^p, ν_x⟩) ∈ L^1(Ω) (1 ≤ p < ∞) can be generated by a sequence (u_j) ⊂ L^p(Ω; R^N) (one starts first by approximating Dirac masses and then uses the approximation of a general measure by linear combinations of Dirac masses; a spatial gluing argument completes the reasoning). Thus, in the definition of Y^p(Ω; R^N) we did not need to include (III) from the Fundamental Theorem.
A Young measure ν = (νx ) ∈ Y p (Ω; RN ) is called homogeneous if νx = νy for a.e.
x, y ∈ Ω; by a mild abuse of notation we then simply write ν in place of νx .
Before we come to further properties of Young measures, let us consider a few examples:
Example 3.15. In Ω := (0, 1) define u := 1_{(0,1/2)} − 1_{(1/2,1)} and extend this function periodically to all of R. Then, the functions u_j(x) := u(jx) for j ∈ N (see Figure 3.3) generate the homogeneous Young measure ν ∈ Y^∞((0, 1); R) with

    ν_x = (1/2) δ_{−1} + (1/2) δ_{+1}    for a.e. x ∈ (0, 1).

Indeed, for φ ⊗ h ∈ C_0((0, 1) × R) we have that φ is uniformly continuous, say |φ(x) − φ(y)| ≤ ω(|x − y|) with a modulus of continuity ω : [0, ∞) → [0, ∞) (that is, ω is continuous, increasing, and ω(0) = 0). Then,

    lim_{j→∞} ∫_0^1 φ(x) h(u_j(x)) dx = lim_{j→∞} Σ_{k=0}^{j−1} ( ∫_{k/j}^{(k+1)/j} φ(k/j) h(u_j(x)) dx + ω(1/j) ∥h∥_∞ / j )
        = lim_{j→∞} Σ_{k=0}^{j−1} (1/j) φ(k/j) ∫_0^1 h(u(y)) dy
        = ∫_0^1 φ(x) dx · ( (1/2) h(−1) + (1/2) h(+1) )

since the Riemann sums converge to the integral of φ. The last line implies the assertion.
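The following short numerical sketch is not part of the original notes; it simply illustrates, under the stated choices of test functions, the Young-measure limit computed in Example 3.15 (the test functions φ and h below are arbitrary illustrative choices).

```python
# Numerical illustration of Example 3.15 (sketch, not part of the notes):
# for u_j(x) = u(jx) with u = 1_(0,1/2) - 1_(1/2,1) extended 1-periodically,
# \int_0^1 phi(x) h(u_j(x)) dx should approach (\int_0^1 phi dx) * (h(-1) + h(1)) / 2.
import numpy as np

def u(t):
    # 1-periodic square wave: +1 on [0, 1/2), -1 on [1/2, 1)
    return np.where((t % 1.0) < 0.5, 1.0, -1.0)

phi = lambda x: x * (1.0 - x)     # illustrative test function on (0,1)
h   = lambda y: np.exp(y)         # illustrative continuous function of the values

x = np.linspace(0.0, 1.0, 200_001)
for j in [1, 10, 100, 1000]:
    lhs = np.trapz(phi(x) * h(u(j * x)), x)
    rhs = np.trapz(phi(x), x) * 0.5 * (h(-1.0) + h(1.0))
    print(f"j={j:5d}  integral={lhs:.6f}  Young-measure limit={rhs:.6f}")
```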
Figure 3.3: An oscillating sequence.
Figure 3.4: Another oscillating sequence.
Example 3.16. Take Ω := (0, 1) again and let u_j(x) = sin(2πjx) for j ∈ N (see Figure 3.4). The sequence (u_j) generates the homogeneous Young measure ν ∈ Y^∞((0, 1); R) with

    ν_x = 1/(π √(1 − y²)) L¹_y ⌞ (−1, 1)    for a.e. x ∈ (0, 1),

as should be plausible from the oscillating sequence: the sine spends most of its time near its turning points ±1, where it is flat, so the mass of ν_x concentrates near y = ±1 rather than near zero.
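The following histogram check is not part of the original notes; it is a quick numerical illustration that the value distribution of sin(2πjx) is indeed the arcsine density from Example 3.16 (the choice j = 50 and the bin count are arbitrary).

```python
# Numerical illustration of Example 3.16 (sketch, not part of the notes):
# the value distribution of u_j(x) = sin(2*pi*j*x) on (0,1) follows the
# arcsine density 1/(pi*sqrt(1-y^2)), independently of j.
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, 1_000_000)        # uniform sample points in (0,1)
values = np.sin(2.0 * np.pi * 50 * x)       # j = 50; any j gives the same histogram

hist, edges = np.histogram(values, bins=20, range=(-1.0, 1.0), density=True)
centres = 0.5 * (edges[:-1] + edges[1:])
arcsine = 1.0 / (np.pi * np.sqrt(1.0 - centres**2))
for c, emp, th in zip(centres, hist, arcsine):
    print(f"y={c:+.2f}  empirical={emp:.3f}  1/(pi*sqrt(1-y^2))={th:.3f}")
```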
Example 3.17. Take a bounded Lipschitz domain Ω ⊂ R², let A, B ∈ R^{2×2} with B − A = a ⊗ n (this is equivalent to rank(A − B) ≤ 1), where a, n ∈ R², and for θ ∈ (0, 1) define

    u(x) := Ax + ( ∫_0^{x·n} χ(t) dt ) a,    x ∈ R²,

where χ := 1_{∪_{z∈Z} [z, z+1−θ)}. If we let u_j(x) := u(jx)/j, x ∈ Ω, then (∇u_j) generates the homogeneous Young measure ν ∈ Y^∞(Ω; R^{2×2}) with

    ν_x = θ δ_A + (1 − θ) δ_B    for a.e. x ∈ Ω.

This example can also be extended to include multiple scales, cf. [73].
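The next sketch is not part of the original notes; it numerically illustrates the laminate construction of Example 3.17 for one concrete (and freely chosen) rank-one connected pair A, B and volume fraction θ.

```python
# Numerical illustration of Example 3.17 (sketch, not part of the notes):
# the gradient of the laminate u takes the value A on a volume fraction theta
# of the domain and B = A + a⊗n on the remaining fraction 1 - theta.
import numpy as np

theta = 0.3
a = np.array([1.0, 2.0]); n = np.array([0.0, 1.0])     # rank-one connection B - A = a⊗n
A = np.eye(2); B = A + np.outer(a, n)

chi = lambda t: ((t % 1.0) < 1.0 - theta).astype(float)  # chi = 1 on [z, z + 1 - theta)

def grad_u(x, j):
    # ∇u_j(x) = ∇u(jx) = A + chi(j x·n) a⊗n ∈ {A, B}
    return A + chi(j * (x @ n)) * np.outer(a, n)

rng = np.random.default_rng(1)
xs = rng.uniform(0.0, 1.0, size=(20_000, 2))            # sample points in the unit square
frac_A = np.mean([np.allclose(grad_u(x, 20), A) for x in xs])
print(f"fraction of points with gradient A: {frac_A:.3f}   (theta = {theta})")
```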
The Fundamental Theorem has many important consequences, two of which are the
following:
Lemma 3.18. Let (V_j) ⊂ L^p(Ω; R^N), 1 < p ≤ ∞, be a bounded sequence generating the Young measure ν ∈ Y^p(Ω; R^N). Then,

    V_j ⇀ V  in L^p(Ω; R^N)    for    V(x) := [ν_x] = ⟨id, ν_x⟩.
Proof. Bounded sequences in L^p with p > 1 are weakly relatively compact and it suffices to identify the limit of any weakly convergent subsequence. From the Dunford–Pettis Theorem A.13 it follows that any such sequence is in fact equiintegrable (in L^1). Now simply apply assertion (III) of the Fundamental Theorem for the integrand f(x, A) := A (or, more pedantically, for the components f^i(A) := A^i, i = 1, ..., N).
The preceding lemma does not hold for p = 1. One example is V_j := j 1_{(0,1/j)}, which concentrates.
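The short sketch below is not part of the original notes; it is a numerical illustration of this concentration effect for V_j = j 1_{(0,1/j)}.

```python
# Illustration (sketch, not part of the notes): V_j = j * 1_(0,1/j) is bounded in L^1,
# generates the Young measure nu_x = delta_0 (so the barycenter is 0), yet the L^1
# norm stays equal to 1 while the support shrinks: the mass concentrates at x = 0,
# so V_j does not converge weakly to the barycenter in L^1.
import numpy as np

x = np.linspace(0.0, 1.0, 1_000_001)
for j in [10, 100, 1000]:
    Vj = np.where(x < 1.0 / j, float(j), 0.0)
    print(f"j={j:5d}  ||V_j||_L1 ≈ {np.trapz(Vj, x):.3f}  |support| = {1.0/j:.4f}")
```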
Proposition 3.19. Let (V_j) ⊂ L^p(Ω; R^N), 1 ≤ p ≤ ∞, be a bounded sequence generating the Young measure ν ∈ Y^p(Ω; R^N), and let f : Ω × R^N → [0, ∞) be any Carathéodory integrand (not necessarily satisfying the equiintegrability property in (III) of the Fundamental Theorem). Then, it holds that

    lim inf_{j→∞} ∫_Ω f(x, V_j(x)) dx ≥ ⟨⟨f, ν⟩⟩.
Proof. For M ∈ N define f_M(x, A) := min(f(x, A), M). Then, (III) from the Fundamental Theorem is applicable with f_M and we get

    ∫_Ω f_M(x, V_j(x)) dx → ⟨⟨f_M, ν⟩⟩ = ∫_Ω ∫ f_M(x, A) dν_x(A) dx.

Since f ≥ f_M, we have

    lim inf_{j→∞} ∫_Ω f(x, V_j(x)) dx ≥ ⟨⟨f_M, ν⟩⟩.

We conclude by letting M ↑ ∞ and using the monotone convergence lemma.
Another important feature of Young measures is that they allow us to read off whether
or not our sequence converges in measure:
Proposition 3.20. Let (ν_x) ⊂ M^1(R^N) be the Young measure generated by a bounded sequence (V_j) ⊂ L^p(Ω; R^N), 1 ≤ p ≤ ∞. Let furthermore K ⊂ R^N be compact. Then,

    dist(V_j(x), K) → 0 in measure    if and only if    supp ν_x ⊂ K for a.e. x ∈ Ω.

Moreover,

    V_j → V in measure    if and only if    ν_x = δ_{V(x)} for a.e. x ∈ Ω.
Proof. We only show the second assertion, the proof of the first one is similar. Take the bounded, positive Carathéodory integrand

    f(x, A) := |A − V(x)| / (1 + |A − V(x)|),    x ∈ Ω, A ∈ R^N.
Then, for all δ ∈ (0, 1), the Markov inequality and the Fundamental Theorem on Young measures yield

    lim sup_{j→∞} |{x ∈ Ω : f(x, V_j(x)) ≥ δ}| ≤ lim_{j→∞} (1/δ) ∫_Ω f(x, V_j(x)) dx = (1/δ) ∫_Ω ⟨f(x,·), ν_x⟩ dx.

On the other hand,

    ∫_Ω ⟨f(x,·), ν_x⟩ dx = lim_{j→∞} ∫_Ω f(x, V_j(x)) dx ≤ δ |Ω| + lim sup_{j→∞} |{x ∈ Ω : f(x, V_j(x)) ≥ δ}|.

The preceding estimates show that f(x, V_j(x)) converges to zero in measure if and only if ⟨f(x,·), ν_x⟩ = 0 for L^d-almost every x ∈ Ω, which is the case if and only if ν_x = δ_{V(x)} almost everywhere. We conclude by observing that f(x, V_j(x)) converges to zero in measure if and only if V_j converges to V in measure.
3.4 Gradient Young measures
The most important subclass of Young measures for our purposes is that of Young measures that can be generated by a sequence of gradients: Let ν ∈ Y^p(Ω; R^{m×d}) (use R^N = R^{m×d} in the theory of the last section). We say that ν is a p-gradient Young measure, in symbols ν ∈ GY^p(Ω; R^{m×d}), if there exists a bounded sequence (u_j) ⊂ W^{1,p}(Ω; R^m) such that ∇u_j →Y ν, i.e. the sequence (∇u_j) generates ν. Note that it is not necessary that every sequence that generates ν is a sequence of gradients (which, in fact, is never the case). We have already considered examples of gradient Young measures in the previous section, see in particular Example 3.17.
Is it the case that every Young measure is a gradient Young measure? The answer is no, as can be easily seen: Let V ∈ (L^p ∩ C^∞)(Ω; R^{m×d}) with curl V(x) ≠ 0 for some x ∈ Ω. Then, ν_x := δ_{V(x)} cannot be a gradient Young measure since for such measures it must hold that [ν] is a gradient itself: if there existed a sequence (u_j) ⊂ W^{1,p}(Ω; R^m) with ∇u_j →Y ν, then ∇u_j ⇀ V by Lemma 3.18, and it would follow that curl V = 0. Consequently, for every ν ∈ GY^p(Ω; R^{m×d}) there must exist u ∈ W^{1,p}(Ω; R^m) with [ν] = ∇u. In this case we call any such u an underlying deformation of ν.
We first show the following technical, but immensely useful, result about gradient
Young measures. Recall that we assume Ω to be open, bounded, and to have a Lipschitz
boundary (to make the trace well-defined).
Lemma 3.21. Let ν ∈ GY^p(Ω; R^{m×d}), 1 < p < ∞, and let u ∈ W^{1,p}(Ω; R^m) be an underlying deformation of ν, i.e. [ν] = ∇u. Then, there exists a sequence (u_j) ⊂ W^{1,p}(Ω; R^m) such that

    u_j|_{∂Ω} = u|_{∂Ω}    and    ∇u_j →Y ν.

Moreover, we can additionally ensure that (∇u_j) is p-equiintegrable.
Proof. Step 1. Since Ω is assumed to have a Lipschitz boundary, we can extend a generating
sequence (∇v j ) for ν to all of Rd , so we assume (v j ) ⊂ W1,p (Rd ; Rm ) with sup j ∥v j ∥W1,p ≤
C < ∞.
In the following, we need the maximal function M f : R^d → R ∪ {+∞} of f : R^d → R ∪ {+∞}, which is defined as

    M f(x_0) := sup_{r>0} −∫_{B(x_0,r)} |f(x)| dx,    x_0 ∈ R^d.
We quote the following two results about the maximal function whose proofs can be found in [87]:
(i) ∥M f∥_{L^p} ≤ C ∥f∥_{L^p}, where C = C(d, p) is a constant.
(ii) For every M > 0 and f ∈ W^{1,p}(R^d; R^m), the function f is Lipschitz continuous on the set {M(|f| + |∇f|) < M} and its Lipschitz constant is bounded by CM, where C = C(d, m, p) is a dimensional constant.
With this tool at hand, consider the sequence
V j := M (|v j | + |∇v j |),
j ∈ N.
By (i) above, (V j ) is uniformly bounded in L p (Ω) and we may select a (non-renumbered)
subsequence such that V j generates a Young measure µ ∈ Y p (Ω).
Next, define for h ∈ N the (nonlinear) truncation operator T_h,

    T_h v := v  if |v| ≤ h,    T_h v := h v/|v|  if |v| > h,    v ∈ R,

and let (φ_l)_l ⊂ C^∞(Ω) be a countable family that is dense in C(Ω).
Then, for fixed h ∈ N, the sequence (T_h V_j)_j is uniformly bounded in L^∞(Ω) and so, by Young measure representation,

    lim_{h→∞} lim_{j→∞} ∫_Ω φ_l(x) |T_h V_j(x)|^p dx = lim_{h→∞} ∫_Ω φ_l(x) ∫ |T_h v|^p dµ_x(v) dx = ∫_Ω φ_l(x) ⟨|·|^p, µ_x⟩ dx,

where in the last step we used the monotone convergence lemma. Thus, for some diagonal sequence W_k := T_k V_{j(k)}, we have that

    lim_{k→∞} ∫_Ω φ_l(x) |W_k(x)|^p dx = ∫_Ω φ_l(x) ⟨|·|^p, µ_x⟩ dx    for all l ∈ N.
Thus, |W_k|^p ⇀ ⟨|·|^p, µ_x⟩ in L^1 and {|W_k|^p}_k is an equiintegrable family by the Dunford–Pettis Theorem. We can also see the equiintegrability in a more elementary way as follows: For a Borel subset A ⊂ Ω take φ ∈ C^∞(Ω) with 1_A ≤ φ and estimate

    lim sup_{k→∞} ∫_A |W_k(x)|^p dx ≤ lim_{k→∞} ∫_Ω φ(x) |W_k(x)|^p dx = ∫_Ω φ(x) ⟨|·|^p, µ_x⟩ dx

by the L^1-convergence of |W_k|^p to x ↦ ⟨|·|^p, µ_x⟩. The last integral tends to ∫_A ⟨|·|^p, µ_x⟩ dx as φ ↓ 1_A, which vanishes if |A| → 0. Thus the family {W_k}_k is p-equiintegrable.
By (ii) above, v_k := v_{j(k)} is Ck-Lipschitz on the set E_k := {W_k < k} and by a well-known result for Lipschitz functions due to Kirszbraun, we may extend v_k from E_k to a function w_k : R^d → R^m that is globally Lipschitz continuous with the same Lipschitz constant Ck and w_k = v_k in E_k. Thus,

    |∇w_k| ≤ C W_k    in Ω.

Hence, {|∇w_k|}_k inherits the p-equiintegrability from {W_k}_k.
Moreover, by the Markov inequality,

    |Ω \ E_k| ≤ ∥W_k∥^p_{L^p} / k^p → 0    as k → ∞.
Therefore, for all φ ∈ C_0(Ω) and h ∈ C_0(R^{m×d}),

    ∫_Ω |φ h(∇w_k) − φ h(∇v_k)| dx ≤ 2 ∥φ∥_∞ ∥h∥_∞ |Ω \ E_k| → 0.

Thus, since all such φ, h determine Young measure convergence (see the proof of the Fundamental Theorem 3.11), we have shown that (∇w_k) generates the same Young measure ν as (∇v_j) and additionally the family {∇w_k}_k is p-equiintegrable.
Step 2. Since W^{1,p}(Ω; R^m) embeds compactly into the space L^p(Ω; R^m) by the Rellich–Kondrachov Theorem A.18, we have w_k → u in L^p. Let moreover (ρ_j) ⊂ C^∞_c(Ω; [0, 1]) be a sequence of cut-off functions such that for G_j := {x ∈ Ω : ρ_j(x) = 1} it holds that |Ω \ G_j| → 0 as j → ∞. For

    u_{j,k} := ρ_j w_k + (1 − ρ_j) u ∈ W^{1,p}_u(Ω; R^m)

we observe ∇u_{j,k} = ρ_j ∇w_k + (1 − ρ_j) ∇u + (w_k − u) ⊗ ∇ρ_j.
We only consider the case p < ∞ in the following; the proof for the case p = ∞ is in fact easier. For every Carathéodory function f : Ω × R^{m×d} → R with |f(x, A)| ≤ M(1 + |A|^p) we have

    lim sup_{k→∞} ∫_Ω |f(x, ∇w_k(x)) − f(x, ∇u_{j,k}(x))| dx
        ≤ lim sup_{k→∞} CM ∫_{Ω\G_j} 2 + 2|∇w_k|^p + |∇u|^p + |w_k − u|^p · |∇ρ_j|^p dx
        ≤ lim sup_{k→∞} CM ∫_{Ω\G_j} 2 + 2|∇w_k|^p + |∇u|^p dx,
where C > 0 is a constant. However, by assumption, the last integrand is equiintegrable and hence

    lim_{j→∞} lim sup_{k→∞} ∫_Ω |f(x, ∇w_k(x)) − f(x, ∇u_{j,k}(x))| dx = 0.

We can now select a diagonal sequence u_j := u_{j,k(j)} such that (∇u_j) generates ν and satisfies all requirements from the statement of the lemma. Note that the equiintegrability is not affected by the cut-off procedure.
The following result is now easy to prove:
Lemma 3.22 (Jensen-type inequality). Let ν ∈ GY^p(Ω; R^{m×d}) be a homogeneous Young measure, i.e. ν_x = ν ∈ M^1(R^{m×d}) for almost every x ∈ Ω, and let B := [ν] ∈ R^{m×d} be its barycenter (which is a constant). Then, for all quasiconvex functions h : R^{m×d} → R with |h(A)| ≤ M(1 + |A|^p) (M > 0, 1 < p < ∞) it holds that

    h(B) ≤ ∫ h dν.    (3.16)

Notice that the conclusion of this lemma is trivial (and also holds for general Young measures, not just the gradient ones) if h is convex. Thus, (3.16) again expresses the convexity inherent in the notion of quasiconvexity.
Proof. Let (u_j) ⊂ W^{1,p}(Ω; R^m) with ∇u_j →Y ν, u_j|_{∂Ω}(x) = Bx for x ∈ ∂Ω (in the sense of trace) and (∇u_j) p-equiintegrable, the latter two conditions being realizable by Lemma 3.21. Then, from the definition of quasiconvexity, we get

    h(B) ≤ −∫_Ω h(∇u_j) dx.

Passing to the Young measure limit as j → ∞ on the right-hand side, for which we note that h(∇u_j) is equiintegrable by the growth assumption on h, we arrive at

    h(B) ≤ −∫_Ω ∫ h dν dx = ∫ h dν,

which already is the sought inequality.
The last result will be of crucial importance in proving weak lower semicontinuity in the
next section. It is remarkable that the converse also holds, i.e. the validity of (3.16) for all
quasiconvex h with p-growth precisely characterizes the class of (homogeneous) gradient
Young measures in the class of all (homogeneous) Young measures; this is the content of
the Kinderlehrer–Pedregal Theorem 5.9, which we will meet in Chapter 5.
3.5 Gradient Young measures and rigidity
To round off the introduction to Young measures let us return to the question whether every Young measure is a gradient Young measure, but now with the additional requirement that the Young measure in question is homogeneous. Then, the barycenter is constant and hence always the gradient of an affine function, so the simple reasoning from the beginning of the previous section no longer applies. Still, there are homogeneous Young measures that are not gradient Young measures, but proving that no generating sequence of gradients can be found is not so trivial. One possible way would be to show that there is a quasiconvex function h : R^{m×d} → R such that the Jensen-type inequality of Lemma 3.22 fails; this is, however, not easy.
Let us demonstrate this assertion instead through the so-called (approximate) two-gradient problem: Let A, B ∈ R^{m×d}, θ ∈ (0, 1) and consider the homogeneous Young measure

    ν := θ δ_A + (1 − θ) δ_B  ∈  Y^∞(B(0,1); R^{m×d}).    (3.17)

We know from Example 3.17 that for rank(A − B) ≤ 1, ν is a gradient Young measure. In any case, supp ν = {A, B} and by Proposition 3.20 it follows that dist(V_j, {A, B}) → 0 in measure for any generating sequence V_j →Y ν.
The following is the first instance of so-called rigidity results, which play a prominent role in the field:

Theorem 3.23 (Ball–James). Let Ω ⊂ R^d be open, bounded, and convex. Also, let A, B ∈ R^{m×d}.
(I) Let u ∈ W^{1,∞}(Ω; R^m) satisfy the exact two-gradient inclusion

    ∇u ∈ {A, B}    a.e. in Ω.

  (a) If rank(A − B) ≥ 2, then ∇u = A a.e. or ∇u = B a.e.
  (b) If B − A = a ⊗ n (a ∈ R^m, n ∈ R^d \ {0}), then there exists a Lipschitz function h : R → R with h′ ∈ {0, 1} almost everywhere, and a constant v_0 ∈ R^m such that

    u(x) = v_0 + Ax + h(x · n) a.

(II) Let rank(A − B) ≥ 2 and assume that (u_j) ⊂ W^{1,∞}(Ω; R^m) is uniformly bounded and satisfies the approximate two-gradient inclusion

    dist(∇u_j, {A, B}) → 0    in measure.

Then,

    ∇u_j → A in measure    or    ∇u_j → B in measure.
Proof. Ad (I) (a). Assume after a translation that B = 0 and thus rank A ≥ 2. Then, ∇u = Ag for a scalar function g : Ω → R. Mollifying u (i.e. working with u_ε := ρ_ε ⋆ u instead of u and passing to the limit ε ↓ 0), we may assume that in fact g ∈ C^1(Ω).
The idea of the proof is that the curl of ∇u vanishes, expressed as follows: for all i, j = 1, ..., d and k = 1, ..., m it holds that

    ∂_i (∇u)^k_j = ∂_{ij} u^k = ∂_{ji} u^k = ∂_j (∇u)^k_i,

where M^k_j denotes the element in the k'th row and j'th column of the matrix M. For our special ∇u = Ag, this reads as

    A^k_j ∂_i g = A^k_i ∂_j g.    (3.18)

Under the assumptions of (I) (a), we claim that ∇g ≡ 0. If otherwise ξ(x) := ∇g(x) ≠ 0 for some x ∈ Ω, then with a^k(x) := A^k_j / ξ_j(x) (k = 1, ..., m) for any j such that ξ_j(x) ≠ 0, where a^k(x) is well-defined (i.e. independent of the choice of j) by the relation (3.18), we have

    A^k_j = a^k(x) ξ_j(x),    i.e.    A = a(x) ⊗ ξ(x).

This, however, is impossible if rank A ≥ 2. Hence, ∇g ≡ 0 and u is an affine function. This property is also stable under mollification.
Ad (I) (b). As in (I) (a), we assume ∇u = Ag (B = 0) and A = a ⊗ n. Pick any b ⊥ n. Then,

    (d/dt)|_{t=0} u(x + tb) = ∇u(x) b = [a n^T b] g(x) = 0.

This implies that u is constant in direction b. As b ⊥ n was arbitrary and Ω is assumed convex, u(x) can only depend on x · n. This implies the claim.
Ad (II). Assume once more that B = 0 and that there exists a (2×2)-minor M with M(A) ≠ 0. By assumption there are sets D_j ⊂ Ω with

    ∇u_j − A 1_{D_j} → 0    in measure.

Let us also assume that we have selected a subsequence such that

    u_j ⇀* u in W^{1,∞}    and    1_{D_j} ⇀* χ in L^∞.

In the following we use that under an assumption of uniform L^∞-boundedness, convergence in measure implies weak* convergence in L^∞. Indeed, for any ψ ∈ L^1(Ω) and any ε > 0 we have

    | ∫_Ω (∇u_j − A 1_{D_j}) ψ dx | ≤ |{x ∈ Ω : |∇u_j(x) − A 1_{D_j}(x)| > ε}| · ∥∇u_j − A 1_{D_j}∥_{L^∞} · ∥ψ∥_{L^1} + ε ∥ψ∥_{L^1}
        → 0 + ε ∥ψ∥_{L^1}.

As ε > 0 is arbitrary, we have indeed ∇u_j − A 1_{D_j} ⇀* 0 in L^∞.
Now,

    w*-lim_{j→∞} M(∇u_j) = w*-lim_{j→∞} M(A 1_{D_j}) = M(A) · w*-lim_{j→∞} 1_{D_j} = M(A) χ.

By a similar argument, ∇u_j ⇀* ∇u = Aχ, and so, by the weak* continuity of minors proved in Lemma 3.10,

    w*-lim_{j→∞} M(∇u_j) = M(∇u) = M(Aχ) = M(A) χ².

Thus, χ = χ² and we can conclude that there exists D ⊂ Ω such that χ = 1_D and ∇u = A 1_D. Furthermore, combining the above convergence assertions,

    ∇u_j → ∇u = A 1_D    in measure.

Finally, (I) (a) implies that ∇u = A a.e. or ∇u = 0 = B a.e. in Ω, finishing the proof.
With this result at hand it is easy to see that our example (3.17) cannot be a gradient Young measure if rank(A − B) ≥ 2: If there were a sequence of gradients ∇u_j →Y ν, then by the reasoning above we would have dist(∇u_j, {A, B}) → 0 in measure, but from (II) of the Ball–James rigidity theorem we would get ∇u_j → A or ∇u_j → B in measure, either one of which yields a contradiction by Proposition 3.20.
3.6 Lower semicontinuity
After our detour through Young measure theory, we now return to the central subject of this chapter, namely to minimization problems of the form

    F[u] := ∫_Ω f(x, ∇u(x)) dx  →  min,    over all u ∈ W^{1,p}(Ω; R^m) with u|_{∂Ω} = g,

where Ω ⊂ R^d is a bounded Lipschitz domain, 1 < p < ∞, the integrand f : Ω × R^{m×d} → R has p-growth,

    |f(x, A)| ≤ M(1 + |A|^p)    for some M > 0,

and g ∈ W^{1−1/p,p}(∂Ω; R^m) specifies the boundary values. In Chapter 2 we solved this problem via the Direct Method of the Calculus of Variations, a coercivity result, and, crucially, the Lower Semicontinuity Theorem 2.7. In this section, we recycle the Direct Method and the coercivity result, but extend lower semicontinuity to quasiconvex integrands; some motivation for this was given at the beginning of the chapter.
Let us first consider how we could approach the proof of lower semicontinuity (it should be clear that the proof via Mazur's Lemma that we used for the convex lower semicontinuity theorem does not extend). Assume we have a sequence (u_j) ⊂ W^{1,p}(Ω; R^m) such that u_j ⇀ u in W^{1,p} and we want to show lower semicontinuity of our functional F. If we assume that our (norm-bounded) sequence (∇u_j) also generates the gradient Young measure ν ∈ GY^p(Ω; R^{m×d}), which is true up to a subsequence, and that the sequence of integrands (f(x, ∇u_j(x)))_j is equiintegrable, then we already have a limit:

    F[u_j] → ∫_Ω ∫ f(x, A) dν_x(A) dx.

This is useful, because it now suffices to show the inequality

    ∫ f(x, ·) dν_x ≥ f(x, ∇u(x))

for almost every x ∈ Ω to conclude lower semicontinuity. Now, this inequality is close to the Jensen-type inequality (3.16) for gradient Young measures from Lemma 3.22, and we already explained how this is related to (quasi)convexity. The only difference in comparison to (3.16) is that here we need to “localize” in x ∈ Ω and make ν_x a gradient Young measure in its own right. This is accomplished via the fundamental blow-up technique:
Lemma 3.24 (Blow-up technique). Let ν = (νx ) ∈ GY p (Ω; Rm×d ) be a gradient Young
measure. Then, for almost every x0 ∈ Ω, νx0 ∈ GY p (B(0, 1); Rm×d ) is a homogeneous gradient Young measure in its own right.
Proof. We start with a general property of Young measures: Let {φ_k}_k and {h_l}_l be countable dense subsets of C_0(Ω) and C_0(R^N), respectively. Assume furthermore that for a uniformly norm-bounded sequence (V_j) ⊂ L^p(Ω; R^N) and ν ∈ Y^p(Ω; R^N) we have

    lim_{j→∞} ∫_Ω φ_k(x) h_l(V_j(x)) dx = ∫_Ω φ_k(x) ∫ h_l dν_x dx    for all k, l ∈ N.

Then we claim that V_j →Y ν. This is in fact clear once we realize that the set of linear combinations of the functions f_{k,l} := φ_k ⊗ h_l (that is, f_{k,l}(x, A) := φ_k(x) h_l(A)) is dense in C_0(Ω × R^N) and testing with such functions determines Young measure convergence, see the proof of the Fundamental Theorem 3.11.
Now let x_0 ∈ Ω be a Lebesgue point of all the functions x ↦ ⟨h_l, ν_x⟩ (l ∈ N), that is,

    lim_{r↓0} −∫_{B(0,1)} | ⟨h_l, ν_{x_0+ry}⟩ − ⟨h_l, ν_{x_0}⟩ | dy = 0.

By Theorem A.7, almost every x_0 ∈ Ω has this property. Then, at such a Lebesgue point x_0, set

    v_j^{(r)}(y) := ( u_j(x_0 + ry) − [u_j]_{B(x_0,r)} ) / r,    y ∈ B(0,1),

where [u]_{B(x_0,r)} := −∫_{B(x_0,r)} u dx.
For φ_k, h_l from the collections exhibited at the beginning of the proof, we get

    ∫_{B(0,1)} φ_k(y) h_l(∇v_j^{(r)}(y)) dy = ∫_{B(0,1)} φ_k(y) h_l(∇u_j(x_0 + ry)) dy = (1/r^d) ∫_{B(x_0,r)} φ_k((x − x_0)/r) h_l(∇u_j(x)) dx

after a change of variables. Letting j → ∞ and then r ↓ 0, we get
    lim_{r↓0} lim_{j→∞} ∫_{B(0,1)} φ_k(y) h_l(∇v_j^{(r)}(y)) dy = lim_{r↓0} (1/r^d) ∫_{B(x_0,r)} φ_k((x − x_0)/r) ⟨h_l, ν_x⟩ dx
        = lim_{r↓0} ∫_{B(0,1)} φ_k(y) ⟨h_l, ν_{x_0+ry}⟩ dy
        = ∫_{B(0,1)} φ_k(y) ⟨h_l, ν_{x_0}⟩ dy,

the last convergence following from the Lebesgue point property of x_0. Moreover,
    ∫_{B(0,1)} |∇v_j^{(r)}|^p dy = ∫_{B(0,1)} |∇u_j(x_0 + ry)|^p dy = (1/r^d) ∫_{B(x_0,r)} |∇u_j(x)|^p dx
and the last integral is uniformly bounded in j (for fixed r). Denote by λ ∈ M(Ω) the weak* limit of the measures |∇u_j|^p L^d⌞Ω, which exists after taking a subsequence. If we require of x_0 additionally that

    lim sup_{r↓0} λ(B(x_0, r)) / r^d < ∞,

which holds at L^d-almost every x_0 ∈ Ω, then

    lim sup_{r↓0} lim_{j→∞} ∫_{B(0,1)} |∇v_j^{(r)}|^p dy < ∞.
Since also [v_j^{(r)}]_{B(0,1)} = 0, the Poincaré inequality from Theorem A.16 yields that there exists a diagonal sequence w_j := v_{J(j)}^{(R(j))} that is uniformly bounded in W^{1,p}(B(0,1); R^m) and such that for all k, l ∈ N,

    lim_{j→∞} ∫_{B(0,1)} φ_k(y) h_l(∇w_j(y)) dy = ∫_{B(0,1)} φ_k(y) ⟨h_l, ν_{x_0}⟩ dy.

Therefore, ∇w_j →Y ν_{x_0}, where we now understand ν_{x_0} as a homogeneous (gradient) Young measure on B(0,1).
The proof of weak lower semicontinuity is then straightforward:
Theorem 3.25. Let f : Ω × R^{m×d} → R be a Carathéodory integrand with |f(x, A)| ≤ M(1 + |A|^p) for some M > 0, 1 < p < ∞. Assume furthermore that f(x, ·) is quasiconvex for every fixed x ∈ Ω. Then the corresponding functional F is weakly lower semicontinuous on W^{1,p}(Ω; R^m).
Proof. Let (u_j) ⊂ W^{1,p}(Ω; R^m) with u_j ⇀ u in W^{1,p}. Assume that (∇u_j) generates the gradient Young measure ν = (ν_x) ∈ GY^p(Ω; R^{m×d}), for which it holds that [ν] = ∇u. This is only true up to a (non-relabeled) subsequence, but if we can establish the lower semicontinuity for every such subsequence, it follows that the result also holds for the original sequence.
From Proposition 3.19 we get

    lim inf_{j→∞} ∫_Ω f(x, ∇u_j(x)) dx ≥ ⟨⟨f, ν⟩⟩ = ∫_Ω ∫ f(x, A) dν_x(A) dx.

Now, for almost every x ∈ Ω, we can consider ν_x as a homogeneous Young measure in GY^p(B(0,1); R^{m×d}) by the Blow-up Lemma 3.24. Thus, the Jensen-type inequality from Lemma 3.22 reads

    ∫ f(x, A) dν_x(A) ≥ f(x, ∇u(x))    for a.e. x ∈ Ω.

Combining, we arrive at

    lim inf_{j→∞} F[u_j] ≥ F[u],

which is what we wanted to show.
At this point it is worthwhile to reflect on the role of Young measures in the proof of the preceding result, namely that they allowed us to split the argument into two parts: First, we passed to the limit in the functional via the Young measure. Second, we established a Jensen-type inequality, which then yielded the lower semicontinuity inequality. It is remarkable that the Young measure preserves exactly the “right” amount of information to serve as an intermediate object.
We can now sum up and prove existence for our minimization problem:

Theorem 3.26. Let f : Ω × R^{m×d} → R be a Carathéodory integrand such that f satisfies the lower growth bound

    µ|A|^p − µ^{−1} ≤ f(x, A)    for some µ > 0,

the upper growth bound

    |f(x, A)| ≤ M(1 + |A|^p)    for some M > 0,

and is quasiconvex in its second argument. Then, the associated functional F has a minimizer over the space W^{1,p}_g(Ω; R^m), where g ∈ W^{1−1/p,p}(∂Ω; R^m).
Proof. This follows directly by combining the Direct Method from Theorem 2.3 with the
coercivity result in Proposition 2.6 and the Lower Semicontinuity Theorem 3.25.
The following result shows that quasiconvexity is also necessary for weak lower semicontinuity; we only state and show this for x-independent integrands, but we note that it also
holds for x-dependent ones by a localization technique.
Proposition 3.27. Let f : R^{m×d} → R satisfy the p-growth condition

    |f(A)| ≤ M(1 + |A|^p)    for some M > 0.

If the associated functional

    F[u] := ∫_Ω f(∇u(x)) dx,    u ∈ W^{1,p}(Ω; R^m),

is weakly lower semicontinuous (with or without fixed boundary values), then f is quasiconvex.
Proof. We may assume that B(0,1) ⊂⊂ Ω, otherwise we can translate and rescale the domain. Let A ∈ R^{m×d} and ψ ∈ W^{1,∞}_0(B(0,1); R^m). We need to show

    f(A) ≤ −∫_{B(0,1)} f(A + ∇ψ(z)) dz.

Take for every j ∈ N a Vitali cover of B(0,1), see Theorem A.8,

    B(0,1) = Z ⊎ ⨄_{k=1}^{N(j)} B(a_k^(j), r_k^(j)),    |Z| = 0,    a_k^(j) ∈ B(0,1),    r_k^(j) ≤ 1/j    (k = 1, ..., N(j)),

and fix a smooth function h : Ω \ B(0,1) → R^m with h(x) = Ax for x ∈ ∂B(0,1) (and h|_{∂Ω} equal to the prescribed boundary values if there are any). Define

    u_j(x) := Ax + r_k^(j) ψ( (x − a_k^(j)) / r_k^(j) )    if x ∈ B(a_k^(j), r_k^(j)),
and u_j := h on Ω \ B(0,1). Then, it is not hard to see that u_j ⇀ u in W^{1,p} for

    u(x) = Ax  if x ∈ B(0,1),    u(x) = h(x)  if x ∈ Ω \ B(0,1),    x ∈ Ω.
Thus, the lower semicontinuity yields, after eliminating the constant part of the functional on Ω \ B(0,1),

    ∫_{B(0,1)} f(A) dx ≤ lim inf_{j→∞} ∫_{B(0,1)} f(∇u_j(x)) dx
        = lim inf_{j→∞} Σ_k ∫_{B(a_k^(j), r_k^(j))} f( A + ∇ψ( (x − a_k^(j)) / r_k^(j) ) ) dx
        = lim inf_{j→∞} Σ_k (r_k^(j))^d ∫_{B(0,1)} f(A + ∇ψ(y)) dy
        = ∫_{B(0,1)} f(A + ∇ψ(y)) dy

since Σ_k (r_k^(j))^d = 1. This is nothing else than quasiconvexity.
We note that also for quasiconvex integrands, our results about the Euler–Lagrange
equations hold just the same (if we assume the same growth conditions).
3.7 Integrands with u-dependence
One very useful feature of our Young measure approach is that it allows us to derive a lower semicontinuity result for u-dependent integrands with minimal additional effort. So consider

    F[u] := ∫_Ω f(x, u(x), ∇u(x)) dx  →  min,    over all u ∈ W^{1,p}(Ω; R^m) with u|_{∂Ω} = g,

where Ω ⊂ R^d is a bounded Lipschitz domain, 1 < p < ∞, and the Carathéodory integrand f : Ω × R^m × R^{m×d} → R satisfies the p-growth bound

    |f(x, v, A)| ≤ M(1 + |v|^p + |A|^p)    for some M > 0,    (3.19)

and g ∈ W^{1−1/p,p}(∂Ω; R^m).
The main idea of our approach is to consider Young measures generated by the pairs (u_j, ∇u_j) ∈ R^{m+md}. We have the following result:

Lemma 3.28. Let (u_j) ⊂ L^p(Ω; R^M) and (V_j) ⊂ L^p(Ω; R^N) be norm-bounded sequences such that

    u_j → u pointwise a.e.    and    V_j →Y ν = (ν_x) ∈ Y^p(Ω; R^N).

Then, (u_j, V_j) →Y µ = (µ_x) ∈ Y^p(Ω; R^{M+N}) with

    µ_x = δ_{u(x)} ⊗ ν_x    for a.e. x ∈ Ω,

that is,

    ∫_Ω f(x, u_j(x), V_j(x)) dx → ∫_Ω ∫ f(x, u(x), A) dν_x(A) dx    (3.20)

for all Carathéodory f : Ω × R^M × R^N → R satisfying the p-growth bound (3.19).
Proof. By a similar density argument as in the proof of Lemma 3.24, it suffices to show the convergence (3.20) for f(x, v, A) = φ(x) ψ(v) h(A), where φ ∈ C_0(Ω), ψ ∈ C_0(R^M), h ∈ C_0(R^N). We already know by assumption

    h(V_j) ⇀* (x ↦ ⟨h, ν_x⟩)    in L^∞.

Furthermore, ψ(u_j) → ψ(u) almost everywhere and thus strongly in L^1 since ψ is bounded. Thus, since the product of an L^∞-weakly* converging sequence and an L^1-strongly converging sequence converges itself weakly* in the sense of measures,

    ∫_Ω φ(x) ψ(u_j(x)) h(V_j(x)) dx → ∫_Ω φ(x) ψ(u(x)) ⟨h, ν_x⟩ dx,

which is (3.20) for our special f. This already finishes the proof.
The trick of the preceding lemma is that in our situation, where V_j = ∇u_j →Y ν ∈ GY^p(Ω; R^{m×d}), it allows us to “freeze” u(x) in the integrand. Then, we can apply the Jensen-type inequality from Lemma 3.22 just as we did in Theorem 3.25 to get:
Theorem 3.29. Let f : Ω × R^m × R^{m×d} → R be a Carathéodory integrand with p-growth, i.e. |f(x, v, A)| ≤ M(1 + |v|^p + |A|^p) for some M > 0, 1 < p < ∞. Assume furthermore that f(x, v, ·) is quasiconvex for every fixed (x, v) ∈ Ω × R^m. Then, the corresponding functional F is weakly lower semicontinuous on W^{1,p}(Ω; R^m).
Remark 3.30. It is not too hard to extend the previous theorem to integrands f satisfying the more general growth condition

    |f(x, v, A)| ≤ M(1 + |v|^q + |A|^p)

for a 1 ≤ q < dp/(d − p). By the Sobolev Embedding Theorem A.17, u_j → u in L^q (strongly) for all such q and thus Lemma 3.28 and then Theorem 3.29 can be suitably generalized.
3.8 Regularity of minimizers
At the end of Section 2.5 we discussed the failure of regularity for vector-valued problems.
In particular, we mentioned various counterexamples to regularity. Upon closer inspection,
however, it turns out that in all counterexamples, the points where regularity fails form a
closed set. This is not a coincidence.
Another issue was that all regularity theorems discussed so far require (strong) convexity of the integrand, and our discussion so far has shown that convexity is not a good notion for vector-valued problems (however, much of the early regularity theory for vector-valued problems focused on convex problems; we will not elaborate on this aspect here). To remedy this, we need a new notion, which will take over from strong convexity in regularity theory: A locally bounded Borel-measurable function h : R^{m×d} → R is called strongly (2-)quasiconvex if there exists µ > 0 such that

    A ↦ h(A) − µ|A|²    is quasiconvex.

Equivalently, we may require that

    µ ∫_{B(0,1)} |∇ψ(z)|² dz ≤ ∫_{B(0,1)} h(A + ∇ψ(z)) − h(A) dz

for all A ∈ R^{m×d} and all ψ ∈ W^{1,∞}_0(B(0,1); R^m).
The most well-known regularity result in this situation is by Lawrence C. Evans from 1986 (there is significant overlap with work by Acerbi & Fusco in the same year):

Theorem 3.31 (Evans partial regularity theorem). Let f : R^{m×d} → R be of class C², strongly quasiconvex, and assume that there exists M > 0 such that

    D² f(A)[B, B] ≤ M|B|²    for all A, B ∈ R^{m×d}.

Let u ∈ W^{1,2}_g(Ω; R^m) be a minimizer of F over W^{1,2}_g(Ω; R^m), where g ∈ W^{1/2,2}(∂Ω; R^m). Then, there exists a relatively closed singular set Σ_u ⊂ Ω with |Σ_u| = 0 and it holds that u ∈ C^{1,α}_loc(Ω \ Σ_u) for all α ∈ (0, 1).
The proof is long and builds on earlier work of Almgren and De Giorgi in Geometric
Measure Theory.
The result is called a “partial regularity” result because the regularity does not hold
everywhere. It should be noted that while the scalar regularity theory was essentially a
theory for PDEs, and hence applies to all solutions of the Euler–Lagrange equations, this
is not the case for the present result: There is no regularity theory for critical points of the
Euler–Lagrange equation for a quasiconvex or even polyconvex (see next chapter) integral
functional. This was shown by Müller & Šverák in 2003 [75] (for the quasiconvex case) and Székelyhidi Jr. in 2004 [92] (for the polyconvex case). We finally remark that sometimes better estimates on the “smallness” of Σ_u than merely |Σ_u| = 0 are available. In fact, for strongly convex integrands it can be shown that the (Hausdorff) dimension of the singular set is at most d − 2; this is actually within reach of our discussion in Section 2.5, because we showed W^{2,2}_loc-regularity, and general properties of such functions imply the conclusion,
see Chapter 2 of [44]. In the strongly quasiconvex case, much less is known, but at least for
minimizers that happen to be W1,∞ it was established in 2007 by Kristensen & Mingione
that the dimension of the singular set is strictly less than d, see [56]. Many other questions
are open.
3.9 Notes and bibliographical remarks
The notion of quasiconvexity was first introduced in Morrey’s fundamental paper [69] and
later refined by Meyers in [65]. The results about null-Lagrangians, in particular Lemma 3.8
and Lemma 3.10, go back to Morrey [70], Reshetnyak [82] and Ball [5]. Our proof is based
on the idea that certain combinations of derivatives might have good convergence properties even if the individual derivatives do not. This is also pivotal for so-called compensated
compactness, which originated in work of Tartar [94, 95] and Murat [76, 77], in particular
in their celebrated Div–Curl Lemma, see for example the discussion in [35]. It is quite
remarkable that one can prove Brouwer’s fixed point theorem using the fact that minors
are null-Lagrangians, see Theorem 8.1.3 in [36]. Proposition 3.6 is originally due to Morrey [70], we follow the presentation in [12]. We also remark that if only r = p, then a weaker
version of Lemma 3.10 is true where we only have convergence of the minors in the sense
of distributions, see Theorem 8.20 in [25].
The convexity properties of the quadratic function A ↦ MA : A, where M is a fourth-order tensor (or, equivalently, a matrix in R^{md×md}), have received considerable attention because this case corresponds to a linear Euler–Lagrange equation. Here, quasiconvexity and rank-one convexity are the same, and polyconvexity (see next chapter) is also equivalent to quasiconvexity if d = 2 or m = 2, but this does not hold anymore for d, m ≥ 3. These results together with pointers to the vast literature can be found in Section 5.3.2 in [25].
A more general result on why convexity is inadmissible for realistic problems in nonlinear elasticity can be found in Section 4.8 of [20].
The result that rank-one convex functions are locally Lipschitz continuous is well known for convex functions, see for example Corollary 2.4 in [34]; an adapted version for rank-one convex (even separately convex) functions is in Theorem 2.31 in [25]. Our proof with a quantitative bound is from Lemma 2.2 in [12].
The Fundamental Theorem of Young Measure Theory 3.11 has seen considerable evolution since its original proof by Young in the 1930s and 1940s [97–99]. It was popularized in
the Calculus of Variations by Tartar [95] and Ball [7]. Further, much more general, versions
have been contributed by Balder [4], also see [19] for probabilistic applications.
The Ball–James Rigidity Theorem 3.23 is from [10], see [73] for further results in the
same spirit.
Lemma 3.21 is an expression of the well-known decomposition lemma from [38], another version is in [53].
Many results in this section that are formulated for Carathéodory integrands continue to
hold for f : Ω × RN → R that are Borel measurable and lower semicontinuous in the second
argument, so called normal integrands, see [16] and [37].
The linear growth case p = 1 is very special and requires more sophisticated tools, so-called generalized Young measures, see [1, 32, 57, 58].
One issue in the linear-growth case is demonstrated in the following counterpart of
Lemma 3.18 for p = 1:
Lemma 3.32. Let (V j ) ⊂ L1 (Ω; RN ) be a bounded sequence generating the Young measure
(νx )x∈Ω ⊂ M1 (RN ). Define v(x) := [νx ] = ⟨id, νx ⟩. Then there exists an increasing sequence
Ωk ⊂ Ω with |Ωk | ↑ |Ω| such that V j ⇀ v in L1 (Ωk ; RN ) for all k; this is called biting
convergence.
Once one knows the so-called Chacon Biting Lemma, which asserts that every L^1-bounded sequence has a subsequence converging in the biting sense, the preceding lemma
is easy to see using (III) from the Fundamental Theorem of Young Measure Theory 3.11.
More information on this topic can be found in [14] and some of the references cited therein.
It is possible to prove the Lower Semicontinuity Theorem 3.25 for quasiconvex integrands without the use of Young measures, see for instance Chapter 8 in [25] for such an
approach. However, many of the ideas are essentially the same, they are just carried out directly without the intermediate steps of Young measures. However, Young measures allow
one to very clearly organize and compartmentalize these ideas, which aids the understanding of the material. We will also use Young measures later for the relaxation of non-lower
semicontinuous functionals. More on lower semicontinuity and Young measures can be
found in the book [81]. The first use of Young measures to investigate questions of lower
semicontinuity for quasiconvex integral functionals seems to be [50].
The blow-up technique is related to the use of tangent measures in Geometric Measure
Theory, see for example [61].
Chapter 4
Polyconvexity
At the beginning of Chapter 3 we saw that convexity cannot hold concurrently with frame-indifference (and a mild positivity condition). Thus, we were led to consider quasiconvex integrands. However, while quasiconvexity is of huge importance in the theory of the Calculus of Variations, our Lower Semicontinuity Theorem 3.25 has one major drawback: we needed to require the p-growth bound

    |f(x, A)| ≤ M(1 + |A|^p)    for all x ∈ Ω, A ∈ R^{d×d} and some M > 0, 1 < p < ∞.

Unfortunately, this is not a realistic assumption for nonlinear elasticity because it ignores the fact that infinite compressions should cost infinite energy. Indeed, realistic integrands have the property that

    f(A) → +∞  as  det A ↓ 0    and    f(A) = +∞  if  det A ≤ 0.
For example, the family of matrices

    A_α := [ 1  0 ; 0  α ],    α > 0,

satisfies det A_α ↓ 0 as α ↓ 0, but |A_α| remains uniformly bounded, so the above p-growth bound cannot hold.
The question whether the Lower Semicontinuity Theorem 3.25 for quasiconvex integrands can be extended to integrands with the above growth is currently a major unsolved problem, see [8]. For the time being, we have to confine ourselves to a more restrictive notion of convexity if we want to allow for the above “elastic” growth. This type of convexity, called polyconvexity, was introduced by John Ball in [5], who proved the corresponding existence theorem for minimization problems and enabled a mature mathematical treatment of nonlinear elasticity theory, including the realistic Ogden materials, see Section 1.4 and Example 4.3.
We will only look at the three-dimensional theory because it is by far the most physically important case and because this restriction eases the notational burden; however, other dimensions can be treated as well, we refer to the comprehensive treatment in [25].
4.1 Polyconvexity
A continuous integrand h : R^{3×3} → R ∪ {+∞} (now we allow the value +∞) is called polyconvex if it can be written in the form

    h(A) = H(A, cof A, det A),    A ∈ R^{3×3},

where H : R^{3×3} × R^{3×3} × R → R ∪ {+∞} is convex. Here, cof A denotes the cofactor matrix as defined in Appendix A.1.
While convexity obviously implies polyconvexity, the converse is clearly wrong. However, we have:
Proposition 4.1. A polyconvex h : R^{3×3} → R (not taking the value +∞) is quasiconvex.

Proof. Let h be as stated and let ψ ∈ W^{1,∞}(B(0,1); R³) with ψ(x) = Ax on ∂B(0,1). Then, using Jensen's inequality,

    −∫_{B(0,1)} h(∇ψ) dx = −∫_{B(0,1)} H(∇ψ, cof ∇ψ, det ∇ψ) dx
        ≥ H( −∫_{B(0,1)} ∇ψ dx, −∫_{B(0,1)} cof ∇ψ dx, −∫_{B(0,1)} det ∇ψ dx )
        = H(A, cof A, det A) = h(A),

where for the last line we used the fact that minors are null-Lagrangians as proved in Lemma 3.8.
At this point we have seen all four major convexity conditions that play a role in the modern Calculus of Variations. They can be put into a linear order by implications:

    convexity ⇒ polyconvexity ⇒ quasiconvexity ⇒ rank-one convexity.

While in the scalar case (d = 1 or m = 1) all these notions are equivalent, in higher dimensions the reverse implications do not hold in general. Clearly, the determinant function is polyconvex but not convex. However, the fact that the notions of polyconvexity, quasiconvexity, and rank-one convexity are all different in higher dimensions is non-trivial. The fact that rank-one convexity does not imply quasiconvexity, at least for d ≥ 2, m ≥ 3, is the subject of Šverák's famous counterexample, see Example 5.10. The case d = m = 2 is still open and one of the major unsolved problems in the field. Here we discuss an example of a quasiconvex, but not polyconvex function:
Example 4.2 (Alibert–Dacorogna–Marcellini). Recall from Example 3.5 the Alibert–Dacorogna–Marcellini function

    h_γ(A) := |A|² ( |A|² − 2γ det A ),    A ∈ R^{2×2}.

It can be shown that h_γ is polyconvex if and only if |γ| ≤ 1. Since it is quasiconvex if and only if |γ| ≤ γ_QC, where γ_QC > 1, we see that there are quasiconvex functions that are not polyconvex. See again Section 5.3.8 in [25] for details.
Example 4.3 (Ogden materials). Functions W of the form

    W(F) := Σ_{i=1}^M a_i tr[ (F^T F)^{γ_i/2} ] + Σ_{j=1}^N b_j tr[ cof(F^T F)^{δ_j/2} ] + Γ(det F),    F ∈ R^{3×3},

where a_i > 0, γ_i ≥ 1, b_j > 0, δ_j ≥ 1, and Γ : R → R ∪ {+∞} is a convex function with Γ(d) → +∞ as d ↓ 0, Γ(d) = +∞ for d ≤ 0, and good enough growth from below, can be shown to be polyconvex. These stored-energy functionals correspond to so-called Ogden materials and occur in a wide range of elasticity applications, see [20] for details.
It can also be shown that in three dimensions convex functions of certain combinations of the singular values of a matrix (the singular values of A are the square roots of the eigenvalues of A^T A) are polyconvex, see Section 5.3.3 in [25] for details.
4.2 Existence Theorems
Let Ω ⊂ R³ be a connected bounded Lipschitz domain. In this section we will prove existence of a minimizer of the variational problem

    F[u] := ∫_Ω W(x, ∇u(x)) − b(x) · u(x) dx  →  min,
    over all u ∈ W^{1,p}(Ω; R³) with det ∇u > 0 a.e. and u|_{∂Ω} = g,    (4.1)

where W : Ω × R^{3×3} → R ∪ {+∞} is Carathéodory and W(x, ·) is polyconvex for almost every x ∈ Ω. Thus,

    W(x, A) = 𝕎(x, A, cof A, det A),    (x, A) ∈ Ω × R^{3×3},

for 𝕎 : Ω × R^{3×3} × R^{3×3} × R → R ∪ {+∞} with 𝕎(x, ·, ·, ·) jointly convex and continuous for almost every x ∈ Ω. As the only upper growth assumptions on W we impose

    W(x, A) → +∞  as det A ↓ 0,    and    W(x, A) = +∞  if det A ≤ 0.    (4.2)

Furthermore, we assume as usual that g ∈ W^{1−1/p,p}(∂Ω; R³) and that b ∈ L^q(Ω; R³) where 1/p + 1/q = 1. The exponent p > 1 will remain unspecified for now. Later, when we impose conditions on the coercivity of W, we will also specify p.
The first lower semicontinuity result is relatively straightforward:
Theorem 4.4. If in addition to the above assumptions it holds that

    µ|A|^p − µ^{−1} ≤ W(x, A)    for some µ > 0 and p > 3,    (4.3)

then the minimization problem (4.1) has at least one solution in the space

    A := { u ∈ W^{1,p}(Ω; R³) : det ∇u > 0 a.e. and u|_{∂Ω} = g },

whenever this set is non-empty.
Proof. We apply the usual Direct Method. For a minimizing sequence (u_j) ⊂ A with F[u_j] → inf_A F, we get from the coercivity estimate (4.3) in conjunction with the Poincaré–Friedrichs inequality (and the fixed boundary values) that for any δ > 0,

    ∫_Ω W(x, ∇u_j) − b(x) · u_j(x) dx ≥ µ ∥∇u_j∥^p_{L^p} − µ^{−1} |Ω| − ∥b∥_{L^q} · ∥u_j∥_{L^p}
        ≥ (µ/C) ∥u_j∥^p_{W^{1,p}} − (µ/C) ∥g∥^p_{W^{1−1/p,p}} − |Ω|/µ − (1/(q δ^q)) ∥b∥^q_{L^q} − (δ^p/p) ∥u_j∥^p_{W^{1,p}}.

Choosing δ such that δ^p = pµ/(2C), we arrive at

    sup_{j∈N} ∥u_j∥_{W^{1,p}} ≤ C ( sup_{j∈N} F[u_j] + 1 ).
Thus, we may select a subsequence (not renumbered) such that u j ⇀ u in W1,p (Ω; R3 ).
Now, by Lemma 3.10 and p > 3,

    det ∇u_j ⇀ det ∇u in L^{p/3}    and    cof ∇u_j ⇀ cof ∇u in L^{p/2}.
Thus, an argument entirely analogous to the proof of Theorem 2.7 (note that in that theorem we did not need any upper growth condition) yields that the hyperelastic part of F,

    u ↦ ∫_Ω W(x, ∇u) dx = ∫_Ω 𝕎(x, ∇u, cof ∇u, det ∇u) dx,

is weakly lower semicontinuous in W^{1,p}. The second part of F is weakly continuous by Lemma 2.37. Thus,

    F[u] ≤ lim inf_{j→∞} F[u_j] = inf_A F.

In particular, det ∇u > 0 almost everywhere and u|_{∂Ω} = g by the weak continuity of the trace. Hence u ∈ A and the proof is finished.
Unfortunately, the preceding theorem has a major drawback in that p > 3 has to be assumed. In applications in elasticity theory, however, a more realistic form of W is

    W(A) = (λ/2) (tr E)² + µ tr E² + O(|E|²),    E = (1/2)(A^T A − I),    (4.4)

where λ, µ > 0 are the Lamé constants. Except for the last term O(|E|²), which vanishes as |E| ↓ 0, this energy corresponds to a so-called St. Venant–Kirchhoff material. It was shown by Ciarlet & Geymonat [21] (also see Theorem 4.10-2 in [20]) that for any such Lamé constants, there exists a polyconvex function W of the form

    W(A) = a|A|² + b|cof A|² + γ(det A) + c,

where a, b, c > 0 and γ : R → [0, ∞] is convex, such that the above expansion (4.4) holds. Clearly, such W has 2-growth in |A|. Thus, we need an existence theorem for functions with these growth properties.
The core of the refined argument will be an improvement of Lemma 3.10:

Lemma 4.5. Let 1 ≤ p, q, r ≤ ∞ with

    p ≥ 2,    1/p + 1/q ≤ 1,    r ≥ 1,

and assume that the sequence (u_j) ⊂ W^{1,p}(Ω; R³) satisfies for some H ∈ L^q(Ω; R^{3×3}), d ∈ L^r(Ω) that

    u_j ⇀ u in W^{1,p},    cof ∇u_j ⇀ H in L^q,    det ∇u_j ⇀ d in L^r.

Then, H = cof ∇u and d = det ∇u.
Proof. The idea is to use distributional versions of the cofactor matrix and Jacobian determinant and to show that they agree with the usual definitions for sufficiently regular functions. To this aim we will use the representation of minors as divergences already employed
in Lemmas 3.8, 3.10.
Step 1. From (3.7) we get that for all φ = (φ¹, φ², φ³) ∈ C¹(Ω; R³),

    (cof ∇φ)^k_l = ∂_{l+1}(φ^{k+1} ∂_{l+2} φ^{k+2}) − ∂_{l+2}(φ^{k+1} ∂_{l+1} φ^{k+2}),

where k, l ∈ {1, 2, 3} are cyclic indices. Thus, for all ψ ∈ C^∞_c(Ω),

    ∫_Ω (cof ∇φ)^k_l ψ dx = − ∫_Ω (φ^{k+1} ∂_{l+2} φ^{k+2}) ∂_{l+1} ψ − (φ^{k+1} ∂_{l+1} φ^{k+2}) ∂_{l+2} ψ dx =: ⟨(Cof ∇φ)^k_l, ψ⟩    (4.5)

and we call the quantity “Cof ∇φ” the distributional cofactors.
To investigate the continuity properties of the distributional cofactors, we consider

    G(φ) := ∫_Ω (φ^k ∂_l φ^m) ∂_n ψ dx,    φ ∈ W^{1,p}(Ω; R³),

for some k, l, m, n ∈ {1, 2, 3} and fixed ψ ∈ C^∞_c(Ω). We can estimate this using the Hölder inequality as follows:

    |G(φ)| ≤ ∥φ∥_{L^s} ∥∇φ∥_{L^p} ∥∇ψ∥_∞    whenever    1/s + 1/p ≤ 1.    (4.6)
In particular this is true for s = p ≥ 2. Thus, since C¹(Ω; R³) is dense in W^{1,p}(Ω; R³), we have that the equality (4.5) also holds for φ ∈ W^{1,p}(Ω; R³) whenever p ≥ 2.
Moreover, the above arguments yield for a sequence (φ_j) ⊂ W^{1,p}(Ω; R³) that

    G(φ_j) → G(φ)    if    φ_j → φ in L^s and ∇φ_j ⇀ ∇φ in L^p.

The first convergence is automatically satisfied by the compact Rellich–Kondrachov Theorem A.18 if

    s < 3p/(3 − p)  if p < 3,    s < ∞  if p ≥ 3.    (4.7)

We can always choose s such that it simultaneously satisfies (4.6) and (4.7), as a quick calculation shows. Thus,

    ⟨Cof ∇φ_j, ψ⟩ → ⟨Cof ∇φ, ψ⟩    if    φ_j ⇀ φ in W^{1,p}, p ≥ 2.    (4.8)
Step 2. First, if φ = (φ¹, φ², φ³) ∈ C¹(Ω; R³), then we know from (3.8) that

    det ∇φ = Σ_{l=1}^3 ∂_l φ¹ (cof ∇φ)^1_l = Σ_{l=1}^3 ∂_l ( φ¹ (cof ∇φ)^1_l ).

Thus, in this case, we have for all ψ ∈ C^∞_c(Ω) that

    ∫_Ω (det ∇φ) ψ dx = Σ_{l=1}^3 ∫_Ω ∂_l φ¹ (cof ∇φ)^1_l ψ dx = − Σ_{l=1}^3 ∫_Ω ( φ¹ (cof ∇φ)^1_l ) ∂_l ψ dx.    (4.9)
The key idea now is that by Hölder's inequality this quantity makes sense as soon as φ ∈ L^p(Ω; R³) and cof ∇φ ∈ L^q(Ω; R^{3×3}) with 1/p + 1/q ≤ 1. Analogously to the argument for the cofactor matrix, this motivates us to define for

    φ ∈ L^p(Ω; R³) with cof ∇φ ∈ L^q(Ω; R^{3×3}), where 1/p + 1/q ≤ 1,

the distributional determinant

    ⟨Det ∇φ, ψ⟩ := − Σ_{l=1}^3 ∫_Ω ( φ¹ (cof ∇φ)^1_l ) ∂_l ψ dx,    ψ ∈ C^∞_c(Ω).

From (4.9) we see that if φ ∈ C¹(Ω; R³), then “det = Det”, i.e.

    ∫_Ω (det ∇φ) ψ dx = ⟨Det ∇φ, ψ⟩    (4.10)
for all ψ ∈ C^∞_c(Ω). We want to show that this equality remains true if merely φ ∈ W^{1,p}(Ω; R³) and cof ∇φ ∈ L^q(Ω; R^{3×3}) with p, q as in the statement of the lemma. For the moment fix ψ ∈ C^∞_c(Ω) and define for φ ∈ C¹(Ω; R³) and F ∈ C¹(Ω; R^{3×3}),

    Z(φ, F) := Σ_{l=1}^3 ∫_Ω ∂_l φ¹ F^1_l ψ + φ¹ F^1_l ∂_l ψ dx = Σ_{l=1}^3 ∫_Ω F^1_l ∂_l(φ¹ ψ) dx.
To establish (4.10), we need to show Z(φ, cof ∇φ) = 0.
Recall the Piola identity (3.9),

    div cof ∇v = 0    if v ∈ C¹(Ω; R³).

As a consequence,

    Z(φ, cof ∇v) = 0    for φ, v ∈ C¹(Ω; R³).

Furthermore, we have the estimate

    |Z(φ, F)| ≤ ∥∇φ∥_{L^p} ∥F∥_{L^q} ∥ψ∥_∞    whenever    1/p + 1/q ≤ 1.

Thus, we get, via two approximation procedures (first in F = cof ∇v, then in φ),

    Z(φ, cof ∇v) = 0    for φ ∈ W^{1,p}(Ω; R³), cof ∇v ∈ L^q(Ω; R^{3×3}).

In particular,

    Z(φ, cof ∇φ) = 0    for φ ∈ W^{1,p}(Ω; R³), cof ∇φ ∈ L^q(Ω; R^{3×3}).

Therefore, (4.10) holds also for φ = u as in the statement of the lemma.
Moreover, we see from the definition of the distributional determinant that for a sequence (φ_j) ⊂ W^{1,p}(Ω; R³) we have

    ⟨Det ∇φ_j, ψ⟩ → ⟨Det ∇φ, ψ⟩    if    φ_j → φ in L^s and cof ∇φ_j ⇀ cof ∇φ in L^q,
    whenever 1/s + 1/q ≤ 1.    (4.11)

The first convergence follows, as before, from the Rellich–Kondrachov Theorem A.18 if

    s < 3p/(3 − p)  if p < 3,    s < ∞  if p ≥ 3.    (4.12)

However, since we assumed 1/p + 1/q ≤ 1, we can always choose s such that it satisfies (4.11) and (4.12) simultaneously, as can be seen by some elementary algebra. Thus,

    ⟨Det ∇φ_j, ψ⟩ → ⟨Det ∇φ, ψ⟩    if    φ_j ⇀ φ in W^{1,p} and cof ∇φ_j ⇀ cof ∇φ in L^q,    (4.13)

where p, q satisfy the assumptions of the lemma.
Step 3. Assume that, as in the statement of the lemma, we are given a sequence (u_j) ⊂ W^{1,p}(Ω; R³) such that for H ∈ L^q(Ω; R^{3×3}), d ∈ L^r(Ω) it holds that

    u_j ⇀ u in W^{1,p},    cof ∇u_j ⇀ H in L^q,    det ∇u_j ⇀ d in L^r.

Then, (4.5), (4.10) imply that for all ψ ∈ C^∞_c(Ω),

    ⟨Cof ∇u_j, ψ⟩ → ⟨H, ψ⟩    and    ⟨Det ∇u_j, ψ⟩ → ⟨d, ψ⟩.

On the other hand, from (4.8), (4.13),

    ⟨Cof ∇u_j, ψ⟩ → ⟨Cof ∇u, ψ⟩    and    ⟨Det ∇u_j, ψ⟩ → ⟨Det ∇u, ψ⟩.

Thus,

    ∫_Ω (Cof ∇u − H) ψ dx = 0    and    ∫_Ω (Det ∇u − d) ψ dx = 0

for all ψ ∈ C^∞_c(Ω). The Fundamental Lemma of the Calculus of Variations 2.21 then gives immediately

    H = cof ∇u    and    d = det ∇u.

This finishes the proof.
With this tool at hand, we can now prove the improved existence result:

Theorem 4.6 (Ball). Let 1 ≤ p, q, r < ∞ with

    p ≥ 2,    1/p + 1/q ≤ 1,    r > 1,

such that in addition to the assumptions at the beginning of this section, it holds that

    W(x, A) ≥ µ ( |A|^p + |cof A|^q + |det A|^r ) − µ^{−1}    for some µ > 0.    (4.14)

Then, the minimization problem (4.1) has at least one solution in the space

    A := { u ∈ W^{1,p}(Ω; R³) : cof ∇u ∈ L^q(Ω; R^{3×3}), det ∇u ∈ L^r(Ω), det ∇u > 0 a.e. and u|_{∂Ω} = g },

whenever this set is non-empty.
Proof. This follows in a completely analogous way to the proof of Theorem 4.4, but now we select a subsequence of a minimizing sequence (u_j) ⊂ A such that for some H ∈ L^q(Ω; R^{3×3}), d ∈ L^r(Ω) we have

    u_j ⇀ u in W^{1,p},    cof ∇u_j ⇀ H in L^q,    det ∇u_j ⇀ d in L^r,

which is possible by the usual weak compactness results in conjunction with the coercivity assumption (4.14). Thus, Lemma 4.5 yields

    u_j ⇀ u in W^{1,p},    cof ∇u_j ⇀ cof ∇u in L^q,    det ∇u_j ⇀ det ∇u in L^r,

and we may argue as in Theorem 4.4 to conclude that u is indeed a minimizer. The proof that u ∈ A is the same as in Theorem 4.4.
4.3 Global injectivity
For reasons of physical admissibility, we need to additionally prove that we can find a minimizer that is globally injective, at least almost everywhere (since we work with Sobolev
functions). There are several approaches to this issue, for example via the topological degree. We here present a classical argument by Ciarlet & Nečas [22]. It is important to
notice that on physical grounds it is only realistic to expect this injectivity for p > d. For
lower exponents, complex effects such as cavitation and (microscopic) fracture have to be
considered. This is already expressed in the fact that Sobolev functions in W1,p for p ≤ d
are not necessarily continuous anymore.
We will prove the following theorem:
Theorem 4.7. In the situation of Theorem 4.4, in particular p > 3, the minimization problem (4.1) has at least one solution in the space

    A := { u ∈ W^{1,p}(Ω; R³) : det ∇u > 0 a.e., u is injective a.e., u|_{∂Ω} = g },

whenever this set is non-empty.
Proof. Let (u_j) ⊂ A be a minimizing sequence. The proof is completely analogous to the one of Theorem 4.4; we only need to show in addition that u is injective almost everywhere. For our choice p > 3, the space W^{1,p}(Ω; R³) embeds compactly into C(Ω; R³), so we have that

    u_j → u uniformly.
Let U ⊂ R3 be any relatively compact open set with u(Ω) ⊂⊂ U (note that u(Ω) is relatively
compact). Then, u j (Ω) ⊂ U for j sufficiently large by the uniform convergence. Hence, for
such j,
|u j (Ω)| ≤ |U|.
Furthermore, since det ∇u j converges weakly to det ∇u in L p/3 by Lemma 3.10 and all u j
are injective almost everywhere by definition of the space A ,
    ∫_Ω det ∇u dx = lim_{j→∞} ∫_Ω det ∇u_j dx = lim_{j→∞} |u_j(Ω)| ≤ |U|.
Letting |U| ↓ |u(Ω)|, we get the Ciarlet–Nečas non-interpenetration condition
    ∫_Ω det ∇u dx ≤ |u(Ω)|.    (4.15)
It is known (see for example [60] or [17]) that even if only u ∈ W1,p (Ω; R3 ) it holds that
    ∫_Ω |det ∇u| dx = ∫_{u(Ω)} H^0(u^{−1}(x′)) dx′,

where we denote by H^0 the counting measure. We estimate

    |u(Ω)| ≤ ∫_{u(Ω)} H^0(u^{−1}(x′)) dx′ = ∫_Ω det ∇u dx ≤ |u(Ω)|,
where we used that det ∇u > 0 almost everywhere and (4.15). Thus,
    H^0(u^{−1}(x′)) = 1    for a.e. x′ ∈ u(Ω),
which is nothing else than the almost everywhere injectivity of u.
4.4 Notes and bibliographical remarks
Theorem 4.6 is a refined version by Ball, Currie & Olver [9] of Ball’s original result in [5].
Also see [10, 11] for further reading.
Many questions about polyconvex integral functionals remain open to this day: In particular, the regularity of solutions and the validity of the Euler–Lagrange equations are
largely open in the general case. Note that the regularity theory from the previous chapter is
not in general applicable, at least if we do not assume the upper p-growth. These questions
are even open for the more restricted situation of nonlinear elasticity theory. See [8] for a
survey on the current state of the art and a collection of challenging open problems.
As for the almost injectivity (for p > 3), this is in fact sometimes automatic, as shown
by Ball [6], but the arguments do not apply to all situations. Tang [93] extended this to
p > 2, but since we then have to deal with discontinuous functions, it is not even clear
how to define u(Ω). For fracture, there is a huge literature, see [74] and the recent [46],
which also contains a large bibliography. A study on the surprising (mis)behavior of the
determinant for p < d can be found in [52].
Chapter 5
Relaxation
When trying to minimize an integral functional over a Sobolev space, it is a very real possibility that there is no minimizer if the integrand is not quasiconvex; of course, even functionals with non-quasiconvex integrands can have minimizers, but, generally speaking, we
should expect non-quasiconvexity to be correlated with the non-existence of minimizers.
For instance, consider the functional

    F[u] := ∫_0^1 |u(x)|² + (|u′(x)|² − 1)² dx,    u ∈ W^{1,4}_0(0, 1).
The gradient part of the integrand, a ↦ (a² − 1)², is called a double-well potential, see
Figure 5.1. Approximate minimizers of F try to satisfy u′ ∈ {−1, +1} as closely as possible, while at the same time staying close to zero because of the first term under the integral. These contradicting requirements lead to minimizing sequences that develop faster
and faster oscillations similar to the ones shown in Figure 3.2 on page 48. It should be
intuitively clear that no classical function can be a minimizer of F .
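To see concretely why the infimum is not attained, here is the standard sawtooth construction (a routine computation, not spelled out in the notes):

% Sawtooth minimizing sequence: 1/j-periodic, piecewise affine with slope +-1, uniformly small.
\[
  u_j(x) := \operatorname{dist}\Bigl(x, \tfrac{1}{j}\mathbb{Z}\Bigr) \in W^{1,4}_0(0,1),
  \qquad |u_j'| = 1 \text{ a.e.}, \quad 0 \le u_j \le \tfrac{1}{2j},
\]
% so F[u_j] = \int_0^1 |u_j|^2 dx <= 1/(4j^2) -> 0. Hence inf F = 0, but F[u] = 0 would force
% u = 0 and |u'| = 1 a.e. simultaneously, which is impossible: no classical minimizer exists.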
In this situation, we have essentially two options: First, if we only care about the infimal
value of F , we can compute the relaxation F∗ of F , which (by definition) will be lower
semicontinuous and then find its minimizer. This will give us the infimum of F over all
admissible functions, but the minimizer of F∗ says only very little about the minimizing
sequence of our original F since all oscillations (and concentrations in some cases) have
been “averaged out”.
Second, we can focus on the minimizing sequences themselves and try to find a generalized limit object to a minimizing sequence encapsulating “interesting” information about
the minimization. This limit object will be a Young measure and in fact this was the original
motivation for introducing them. Effectively, Young measure theory allows one to replace
a minimization problem over a Sobolev space by a generalized minimization problem over
(gradient) Young measures that always has a solution. In applications, the emerging oscillations correspond to microstructure, which is very important for example in material
science.
In this chapter we will consider both approaches in turn.
Figure 5.1: A double-well potential.
5.1 Quasiconvex envelopes
We have seen in Chapter 3 that integral functionals with quasiconvex integrands are weakly
lower semicontinuous in W1,p , where the exponent 1 < p < ∞ is determined by growth
properties of the integrand. If the integrand is not quasiconvex, then we would like to
compute the functional’s relaxation, i.e. the largest (weakly) lower semicontinuous function
below it. We can expect that when passing from the original functional to the relaxed one,
we should also pass from the original integrand to a related but quasiconvex integrand. For
this, we define the quasiconvex envelope Qh : Rm×d → R ∪ {−∞} of a locally bounded
Borel-function h : Rm×d → R as
    Qh(A) := inf { −∫_ω h(A + ∇ψ(z)) dz : ψ ∈ W^{1,∞}_0(ω; R^m) },    A ∈ R^{m×d},    (5.1)
where ω ⊂ R^d is a bounded Lipschitz domain. Clearly, Qh ≤ h. By a similar argument as in Lemma 3.2, one can see that this formula is independent of the choice of ω. If h has p-growth at infinity, we may replace the space W^{1,∞}_0(ω; R^m) in the above formula by W^{1,p}_0(ω; R^m). Finally, by a covering argument as for instance in Proposition 3.27, we may furthermore restrict the class of ψ in the above infimum to those satisfying ∥ψ∥_{L∞} ≤ ε for any ε > 0.
It is well-known that for continuous h the quasiconvex envelope Qh is equal to

    Qh = sup { g : g quasiconvex and g ≤ h },    (5.2)

see Section 6.3 in [25]. Note that the equality also holds if one (hence both) of the two expressions is −∞.
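For orientation, in the scalar case m = d = 1 quasiconvexity coincides with ordinary convexity, so (5.1), (5.2) simply produce the convex envelope; for the double-well potential from the introduction of this chapter this reads (an elementary computation, added here for illustration):

% Convex (= quasiconvex, since m = d = 1) envelope of the double-well a |-> (a^2 - 1)^2:
\[
  h(a) = (a^2 - 1)^2, \qquad
  Qh(a) = \begin{cases} 0, & |a| \le 1, \\ (a^2 - 1)^2, & |a| > 1, \end{cases}
\]
% i.e. the non-convex region between the two minima a = +-1 is replaced by the chord, which here is 0.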
We first need the following lemma:
Lemma 5.1. For continuous h: R^{m×d} → R with p-growth, |h(A)| ≤ M(1 + |A|^p), the quasiconvex envelope Qh as defined in (5.1) is either identically −∞ or real-valued with p-growth, and quasiconvex.
Proof. Step 1. If Qh takes the value −∞, say Qh(B₀) = −∞, then for all N ∈ N there exists ψ̂_N ∈ W^{1,∞}_0(B(0,1/2); R^m) such that

    ∫_{B(0,1/2)} h(B₀ + ∇ψ̂_N(z)) dz ≤ −N.

For any A₀ ∈ R^{m×d} pick ψ∗ ∈ W^{1,∞}_{A₀x}(B(0,1); R^m) that has trace B₀x on ∂B(0,1/2) and set

    ψ_N(z) := B₀z + ψ̂_N(z) if z ∈ B(0,1/2),   ψ_N(z) := ψ∗(z) otherwise,    z ∈ B(0,1).

Hence,

    Qh(A₀) ≤ −∫_{B(0,1)} h(∇ψ_N(z)) dz ≤ (−N + C) / |B(0,1)|

for some fixed C > 0 (depending on our choice of ψ∗). Letting N → ∞, we get Qh(A₀) = −∞. Thus, Qh is identically −∞.
Step 2. For any ψ ∈ W^{1,∞}_0(B(0,1); R^m) and any A₀ ∈ R^{m×d} we need to show

    −∫_{B(0,1)} Qh(A₀ + ∇ψ(z)) dz ≥ Qh(A₀).    (5.3)
It can be shown that also Qh has p-growth, see for instance Lemma 2.5 in [54]. Thus, we can use an approximation argument in conjunction with Theorem 2.30 to prove that it suffices to show the inequality (5.3) for piecewise affine ψ ∈ W^{1,∞}_0(B(0,1); R^m), say ψ(x) = v_k + A_k x (v_k ∈ R^m, A_k ∈ R^{m×d}) for x ∈ G_k from a disjoint collection of Lipschitz subdomains G_k ⊂ B(0,1) (k = 1, …, N) with |B(0,1) \ ∪_k G_k| = 0. Here we use the W^{1,p}-density of piecewise affine functions in W^{1,∞}_0, which can be proved by mollification and an elementary approximation of smooth functions by piecewise affine ones (e.g. employing finite elements).
Fix ε > 0. By the definition of Qh, for every k we can find φ_k ∈ W^{1,∞}_0(G_k; R^m) such that

    Qh(A₀ + A_k) ≥ −∫_{G_k} h(A₀ + A_k + ∇φ_k(z)) dz − ε.
Then,

    ∫_{B(0,1)} Qh(A₀ + ∇ψ(z)) dz = ∑_{k=1}^N |G_k| Qh(A₀ + A_k)
        ≥ ∑_{k=1}^N ( ∫_{G_k} h(A₀ + A_k + ∇φ_k(z)) dz − ε|G_k| )
        = ∫_{B(0,1)} h(A₀ + ∇φ(z)) dz − ε|B(0,1)|
        ≥ |B(0,1)| ( Qh(A₀) − ε ),
where φ ∈ W^{1,∞}_0(B(0,1); R^m) is defined as

    φ(x) := v_k + A_k x + φ_k(x) if x ∈ G_k,   φ(x) := 0 otherwise,    x ∈ B(0,1),

and the last step follows from the definition of Qh. Letting ε ↓ 0, we have shown (5.3).
We can also use the notion of the quasiconvex envelope to introduce a class of nontrivial quasiconvex functions with p-growth, 1 < p < ∞ (the linear growth case is also
possible using a refined technique).
Lemma 5.2. Let F ∈ R2×2 with rank F = 2 and let 1 < p < ∞. Define
    h(A) := dist(A, {−F, F})^p,    A ∈ R^{2×2}.
Then, Qh(0) > 0 and the quasiconvex envelope Qh is not convex (at zero).
Proof. We first show Qh(0) > 0. Assume to the contrary that Qh(0) = 0. Then, by (5.1), there exists a sequence (ψ_j) ⊂ W^{1,∞}_0(B(0,1); R²) with

    −∫_{B(0,1)} h(∇ψ_j) dz → 0.    (5.4)
Let P: R^{2×2} → R^{2×2} be the projection onto the orthogonal complement of span{F}. Some linear algebra shows |P(A)|^p ≤ h(A) for all A ∈ R^{2×2}. Therefore,

    P(∇ψ_j) → 0 in L^p(B(0,1); R^{2×2}).    (5.5)
With the definition

    ĝ(ξ) := ∫_{B(0,1)} g(x) e^{−2πi ξ·x} dx,    ξ ∈ R^d,

for the Fourier transform ĝ of a function g ∈ L¹(B(0,1); R^N), we get the rule (∇ψ_j)^∧(ξ) = (2πi) ψ̂_j(ξ) ⊗ ξ. The fact that a ⊗ ξ ∉ span{F} for all a, ξ ∈ R² \ {0} implies P(a ⊗ ξ) ≠ 0.
Some algebra implies that we may write

    (∇ψ_j)^∧(ξ) = M(ξ) P(ψ̂_j(ξ) ⊗ ξ),    ξ ∈ R^d,

where M(ξ): R^{2×2} → R^{2×2} is linear and furthermore depends smoothly and positively 0-homogeneously on ξ. We omit the details of this construction.
For p = 2, Parseval's formula ∥g∥_{L²} = ∥ĝ∥_{L²} together with (5.5) then implies

    ∥∇ψ_j∥_{L²} = ∥(∇ψ_j)^∧∥_{L²} = ∥M(ξ) P(ψ̂_j(ξ) ⊗ ξ)∥_{L²}
        ≤ ∥M∥_∞ ∥P(ψ̂_j(ξ) ⊗ ξ)∥_{L²} = ∥M∥_∞ ∥P(∇ψ_j)∥_{L²} → 0.

But then h(∇ψ_j) → |F|² in L¹, contradicting (5.4).
For 1 < p < ∞, we may apply the Mihlin Multiplier Theorem (see for example Theorem 5.2.7 in [45] and Theorem 6.1.6 in [15]) to get analogously to the case p = 2 that
    ∥∇ψ_j∥_{L^p} ≤ ∥M∥_{C^d} ∥P(∇ψ_j)∥_{L^p} → 0,
which is again at odds with (5.4).
Finally, if Qh were convex at zero, we would have

    Qh(0) ≤ ½ ( Qh(F) + Qh(−F) ) ≤ ½ ( h(F) + h(−F) ) = 0,

a contradiction.
5.2 Relaxation of integral functionals
We first consider the abstract principles of relaxation before moving on to more concrete
integral functionals. Suppose we have a functional F : X → R ∪ {+∞}, where X is a reflexive Banach space. Assume furthermore that our F is not weakly lower semicontinuous1 ,
i.e. there exists at least one sequence x j ⇀ x in X with F [x] > lim inf j→∞ F [x j ]. Then, the
relaxation F∗ : X → R ∪ {−∞, +∞} of F is defined as

    F∗[x] := inf { lim inf_{j→∞} F[x_j] : x_j ⇀ x in X }.
Theorem 5.3 (Abstract relaxation theorem). Let X be a separable, reflexive Banach
space and let F : X → R ∪ {+∞} be a functional. Assume furthermore:
(WH1) Weak Coercivity: For all Λ > 0, the sublevel set
{ x ∈ X : F [x] ≤ Λ }
is sequentially weakly relatively compact.
Then, the relaxation F∗ of F is weakly lower semicontinuous and min_X F∗ = inf_X F.
Proof. Since X∗ is also separable, we can find a countable set { f_l }_{l∈N} ⊂ X∗ such that

    x_j ⇀ x in X   ⟺   sup_j ∥x_j∥ < ∞ and ⟨x_j, f_l⟩ → ⟨x, f_l⟩ for all l ∈ N.
Step 1. We first show weak lower semicontinuity of F∗ : Let x j ⇀ x in X such that
lim inf j→∞ F∗ [x j ] < ∞ (if this is not satisfied, there is nothing to show). Selecting a (not
renumbered) subsequence if necessary, we may further assume that α := sup j F∗ [x j ] < ∞.
For every j ∈ N, there exists a sequence (x_{j,k})_k ⊂ X such that x_{j,k} ⇀ x_j in X as k → ∞ and

    lim inf_{k→∞} F[x_{j,k}] ≤ F∗[x_j] + 1/j.
1 In principle, if F nevertheless is weakly lower semicontinuous for a minimizing sequence, then no relaxation is necessary. It is, however, very difficult to establish such a property without establishing general weak
lower semicontinuity.
Now select y_j := x_{j,k(j)} such that

    |⟨y_j, f_l⟩ − ⟨x_j, f_l⟩| ≤ 1/j    for all l = 1, …, j,

and

    F[y_j] ≤ F∗[x_j] + 2/j ≤ α + 2/j.
Then, from the weak coercivity (WH1) we get sup j ∥y j ∥ < ∞. Thus, it follows that the
so-selected diagonal sequence (y j ) satisfies y j ⇀ x and
    F∗[x] ≤ lim inf_{j→∞} F[y_j] ≤ lim inf_{j→∞} F∗[x_j],

and F∗ is shown to be weakly lower semicontinuous.
Step 2. Next take a minimizing sequence (x j ) ⊂ X such that
    lim_{j→∞} F∗[x_j] = inf_X F∗.
Now, for every x j select a y j ∈ X such that
    F[y_j] ≤ F∗[x_j] + 1/j.
From the coercivity assumption (WH1) we get that we may select a weakly converging
subsequence (not renumbered) y j ⇀ y, where y ∈ X. Then, since F∗ is weakly lower semicontinuous and F∗ ≤ F ,
    F∗[y] ≤ lim inf_{j→∞} F∗[y_j] ≤ lim inf_{j→∞} F[y_j] ≤ lim_{j→∞} F∗[x_j] = inf_X F∗,
so F∗[y] = min_X F∗. Furthermore,

    inf_X F ≤ lim inf_{j→∞} F[y_j] ≤ min_X F∗ ≤ inf_X F.

Thus, min_X F∗ = inf_X F, completing the proof.
As usual, we here are most interested in the concrete case of integral functionals

    F[u] := ∫_Ω f(x, ∇u(x)) dx,    u ∈ W^{1,p}(Ω; R^m),

where, as usual, f: Ω × R^{m×d} → R is a Carathéodory integrand satisfying the p-growth and coercivity assumption

    μ|A|^p − μ^{-1} ≤ f(x, A) ≤ M(1 + |A|^p),    (x, A) ∈ Ω × R^{m×d},    (5.6)
for some µ , M > 0, 1 < p < ∞, and Ω ⊂ Rd a bounded Lipschitz domain. The main result
of this section is:
Theorem 5.4 (Relaxation). Let F be as above and assume furthermore that there exists a
modulus of continuity ω , that is, ω : [0, ∞) → [0, ∞) is continuous, increasing, and ω (0) =
0, such that
    |f(x, A) − f(y, A)| ≤ ω(|x − y|)(1 + |A|^p),    x, y ∈ Ω, A ∈ R^{m×d}.    (5.7)
Then, the relaxation F∗ of F is

    F∗[u] = ∫_Ω Qf(x, ∇u(x)) dx,    u ∈ W^{1,p}(Ω; R^m),

where Qf(x, ·) is the quasiconvex envelope of f(x, ·) for all x ∈ Ω.
Proof. We define

    G[u] := ∫_Ω Qf(x, ∇u(x)) dx,    u ∈ W^{1,p}(Ω; R^m).
As Qf(x, ·) is quasiconvex for all x ∈ Ω by Lemma 5.1, it is continuous by Proposition 3.6.
We will see below that Q f is continuous in its first argument, so Q f is Carathéodory and G
is thus well-defined.
We will show in the following that (I) G ≤ F∗ and (II) F∗ ≤ G .
To see (I), it suffices to observe that G is weakly lower semicontinuous by Theorem 3.25
and that G ≤ F . Thus, from the definition of F∗ we immediately get G ≤ F∗ .
We will show (II) in several steps.
Step 1. Fix A ∈ R^{m×d}. For x ∈ Ω let ψ_x ∈ W^{1,p}_0(B(0,1); R^m) be such that

    −∫_{B(0,1)} f(x, A + ∇ψ_x(z)) dz ≤ Qf(x, A) + 1.
Then we use (5.6) to observe

    μ −∫_{B(0,1)} |A + ∇ψ_x(z)|^p dz − μ^{-1} ≤ Qf(x, A) + 1 ≤ M(1 + |A|^p) + 1.
Let now x, y ∈ Ω and estimate using (5.7) and the definition of Qf(x, ·),

    Qf(y, A) − Qf(x, A) ≤ −∫_{B(0,1)} f(y, A + ∇ψ_x(z)) − f(x, A + ∇ψ_x(z)) dz + 1
        ≤ ω(|x − y|) −∫_{B(0,1)} 1 + |A + ∇ψ_x(z)|^p dz + 1
        ≤ Cω(|x − y|)(1 + |A|^p),
where C = C(μ, M) is a constant. Exchanging the roles of x and y, we see

    |Qf(x, A) − Qf(y, A)| ≤ Cω(|x − y|)(1 + |A|^p)    (5.8)

for all x, y ∈ Ω and A ∈ R^{m×d}. In particular, Qf is continuous in x.
Step 2. Next we show that it suffices to consider piecewise affine u. If u ∈ W1,p (Ω; Rm ),
then there exists a sequence (v j ) ⊂ W1,p (Ω; Rm ) of piecewise affine functions such that
v j → u in W1,p . Since Q f is Carathéodory and has p-growth (as Q f ≤ f ), Theorem 2.30
shows that G [v j ] → G [u] and, by lower semicontinuity, F∗ [u] ≤ lim inf j→∞ F∗ [v j ]. Thus,
if (II) holds for all piecewise affine u, it also holds for all u ∈ W1,p (Ω; Rm ).
Step 3. Fix ε > 0 and let u ∈ W^{1,p}(Ω; R^m) be piecewise affine, say u(x) = v_k + A_k x (where v_k ∈ R^m, A_k ∈ R^{m×d}) on the set G_k from a disjoint collection of open sets G_k ⊂ Ω (k = 1, …, N) such that |Ω \ ∪_{k=1}^N G_k| = 0. Fix such a k and cover G_k up to a nullset with countably many balls B_{k,l} := B(x_{k,l}, r_{k,l}) ⊂ G_k, where x_{k,l} ∈ G_k, 0 < r_{k,l} < ε (l ∈ N), such that

    | −∫_{B_{k,l}} Qf(x, A_k) dx − Qf(x_{k,l}, A_k) | ≤ Cω(ε)(1 + |A_k|^p).
This is possible by the Vitali Covering Theorem A.8 in conjunction with (5.8).
Now, from the definition of Qf and the independence of this definition from the domain, we can find a function ψ_{k,l} ∈ W^{1,∞}_0(B_{k,l}; R^m) with ∥ψ_{k,l}∥_{L∞} ≤ ε and

    | Qf(x_{k,l}, A_k) − −∫_{B_{k,l}} f(x_{k,l}, A_k + ∇ψ_{k,l}(z)) dz | ≤ ε.
Set

    v_ε(x) := u(x) + ψ_{k,l}(x) if x ∈ B_{k,l},   v_ε(x) := u(x) otherwise,    x ∈ Ω.
In a similar way to Step 1 we can show that

    μ −∫_{B_{k,l}} |A_k + ∇ψ_{k,l}(z)|^p dz − μ^{-1} ≤ M(1 + |A_k|^p) + ε.
Thus,

    ∫_Ω |∇v_ε|^p dx = ∑_{k,l} ∫_{B_{k,l}} |A_k + ∇ψ_{k,l}(x)|^p dx
        ≤ (M/μ) ∫_Ω 1 + |∇u(x)|^p dx + |Ω|/μ² + ε|Ω|/μ < ∞,

so, by the Poincaré–Friedrichs inequality, (v_ε)_{ε>0} ⊂ W^{1,p}(Ω; R^m) is uniformly bounded in W^{1,p}.
By the continuity assumption (5.7),

    | −∫_{B_{k,l}} f(x_{k,l}, ∇v_ε(x)) dx − −∫_{B_{k,l}} f(x, ∇v_ε(x)) dx | ≤ ω(ε) −∫_{B_{k,l}} 1 + |∇v_ε(x)|^p dx.
Then, we can combine all the above arguments to get

    | −∫_{B_{k,l}} Qf(x, ∇u(x)) dx − −∫_{B_{k,l}} f(x, ∇v_ε(x)) dx |
        ≤ Cω(ε) −∫_{B_{k,l}} 2 + |A_k|^p + |∇v_ε(x)|^p dx + ε.
Multiplying this by |B_{k,l}| and summing over all k, l, we get

    | G[u] − F[v_ε] | ≤ Cω(ε) ∫_Ω 2 + |∇u(x)|^p + |∇v_ε(x)|^p dx + ε|Ω|,

and this vanishes as ε ↓ 0. For u_j := v_{1/j} we can convince ourselves that u_j ⇀ u, since (u_j) is weakly relatively compact in W^{1,p} and ∥u_j − u∥_{L∞} ≤ 1/j → 0. Hence,

    G[u] = lim_{j→∞} F[u_j] ≥ F∗[u],

which is (II) for piecewise affine u.
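Applied to the double-well functional from the beginning of this chapter, this yields, at least formally (the u-dependent zeroth-order term is a lower-order perturbation not literally covered by Theorem 5.4, so this is only a sketch):

% Relaxation of the introductory double-well functional (a standard computation, under the
% assumption that the lower-order term |u|^2 passes to the relaxation unchanged):
\[
  \mathcal{F}_*[u] = \int_0^1 |u|^2 + QW(u') \,\mathrm{d}x, \qquad
  QW(a) = \begin{cases} 0, & |a| \le 1, \\ (a^2 - 1)^2, & |a| > 1. \end{cases}
\]
% min F_* = 0 is attained at u = 0 and equals inf F (cf. the sawtooth sequence above),
% even though F itself has no minimizer.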
5.3 Young measure relaxation
As discussed at the beginning of this chapter, the relaxation strategy as outlined in the preceding section has one serious drawback: While it allows us to find the infimal value, the
relaxed functional does not tell us anything about the structure (e.g. oscillations) of minimizing sequences. Often, however, in applications this structure of minimizing sequences is
decisive as it represents the development of microstructure. For example, in material science, this microstructure can often be observed and greatly influences material properties.
Therefore, in this section we implement the second strategy mentioned at the beginning
of the chapter, that is, we extend the minimization problem to a larger space and look for
solutions there.
Let us first, in an abstract fashion, collect a few properties that our extension should satisfy. Assume we are given a space X together with a notion of convergence "→" and a functional F: X → R ∪ {+∞}, which is not lower semicontinuous with respect to the convergence "→" in X. Then, we need to extend X to a larger space X̄ with a convergence "⇝" and F to a functional F̄: X̄ → R ∪ {+∞}, called the extension–relaxation, such that the following conditions are satisfied:
(i) Extension property: X ⊂ X̄ (in the sense of an embedding) and, using this embedding, F̄|_X = F.

(ii) Lower bound: If x_j → x in X, then, up to a subsequence, there exists x̄ ∈ X̄ such that x_j ⇝ x̄ in X̄ and

    lim inf_{j→∞} F[x_j] ≥ F̄[x̄].

(iii) Recovery sequence: For all x̄ ∈ X̄ there exists a recovery sequence (x_j) ⊂ X with x_j ⇝ x̄ in X̄ and such that

    lim_{j→∞} F[x_j] = F̄[x̄].
Intuitively, (ii) entails that we can solve our minimization problem for F by passing to the extended space; in particular, if (x_j) ⊂ X is minimizing and relatively compact (which, up to a subsequence, implies x_j → x in X), then (ii) tells us that x̄ should be considered the "minimizer". On the other hand, (iii) ensures that the minimization problems for F and for F̄ are sufficiently related. In particular, it is easy to see that the infima agree and indeed F̄ attains its minimum (for this, we need to assume some coercivity of F):

    min_{X̄} F̄ = inf_X F.
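Filling in the "easy to see" (an elementary argument under the stated coercivity assumption):

% Equality of infima from (i) and (iii), attainment from (ii):
\[
  \inf_{\bar X} \bar{\mathcal F} \;\le\; \inf_X \mathcal F \quad\text{by (i)},
  \qquad
  \bar{\mathcal F}[\bar x] = \lim_{j\to\infty} \mathcal F[x_j] \;\ge\; \inf_X \mathcal F
  \quad\text{for every } \bar x \in \bar X \text{ by (iii)},
\]
% so the two infima coincide; the minimum of \bar F is then attained by applying (ii) to a
% minimizing sequence of F, which is relatively compact by the assumed coercivity.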
If we consider integral functionals defined on X = W^{1,p}(Ω; R^m), we have already seen a very convenient extended space X̄, namely the space of gradient p-Young measures. In fact, these generalized variational problems are the original raison d'être for Young measures. In the following we will show how this fits into the above abstract framework. Thus, we return to our prototypical integral functional
    F[u] := ∫_Ω f(x, ∇u(x)) dx,    u ∈ W^{1,p}(Ω; R^m),

with f: Ω × R^{m×d} → R Carathéodory and

    μ|A|^p − μ^{-1} ≤ f(x, A) ≤ M(1 + |A|^p),    (x, A) ∈ Ω × R^{m×d},    (5.9)
for some µ , M > 0, 1 < p < ∞, and Ω ⊂ Rd a bounded Lipschitz domain.
Motivated by the considerations in Sections 3.3 and 3.4, we let, for a gradient Young measure ν = (ν_x) ∈ GY^p(Ω; R^{m×d}), the extension–relaxation F̄: GY^p(Ω; R^{m×d}) → R be defined as

    F̄[ν] := ⟨⟨ f, ν ⟩⟩ = ∫_Ω ∫ f(x, A) dν_x(A) dx.
Then, our relaxation result takes the following form:

Theorem 5.5 (Relaxation with Young measures). Let F, F̄ be as above.

(I) Extension property: For every u ∈ W^{1,p}(Ω; R^m) define the elementary Young measure (δ_{∇u}) ∈ GY^p(Ω; R^{m×d}) as

    (δ_{∇u})_x := δ_{∇u(x)}    for a.e. x ∈ Ω.

Then, F[u] = F̄[δ_{∇u}].
(II) Lower bound: If u_j ⇀ u in W^{1,p}(Ω; R^m), then, up to a subsequence, there exists a Young measure ν ∈ GY^p(Ω; R^{m×d}) such that ∇u_j →^Y ν (i.e. (∇u_j) generates ν) with [ν] = ∇u a.e., and

    lim inf_{j→∞} F[u_j] ≥ F̄[ν].    (5.10)

(III) Recovery sequence: For all ν ∈ GY^p(Ω; R^{m×d}) there exists a recovery generating sequence (u_j) ⊂ W^{1,p}(Ω; R^m) with ∇u_j →^Y ν and such that

    lim_{j→∞} F[u_j] = F̄[ν].    (5.11)
(IV) Equality of minima: F̄ attains its minimum and

    min_{GY^p(Ω;R^{m×d})} F̄ = inf_{W^{1,p}(Ω;R^m)} F.
Furthermore, all these statements remain true if we prescribe boundary values. For
a Young measure we here prescribe the boundary values for an underlying deformation
(which is only determined up to a translation, of course).
Proof. Ad (I). This follows directly from the definition of F̄.
Ad (II). The existence of ν follows from the Fundamental Theorem 3.11. The lower
bound (5.10) is a consequence of Proposition 3.19.
Ad (III). By virtue of Lemma 3.21 we may construct an (improved) sequence (u_j) ⊂ W^{1,p}(Ω; R^m) such that

(i) u_j|_{∂Ω} = u|_{∂Ω}, where u ∈ W^{1,p}(Ω; R^m) is an underlying deformation of ν, that is, ∇u = [ν],

(ii) the family {∇u_j}_j is p-equiintegrable, and

(iii) ∇u_j →^Y ν.
For this improved generating sequence we can now use the statement about representation of limits of integral functionals from the Fundamental Theorem 3.11 (here we need the p-equiintegrability) to get

    F[u_j] → ⟨⟨ f, ν ⟩⟩ = F̄[ν],

which is nothing else than (5.11).
Ad (IV). This is not hard to see using (II), (III) and the coercivity assumption, that is,
the lower bound in (5.9).
Figure 5.2: The double-well potential in the sailing example.
Example 5.6. In our sailing example from Section 1.6, we were tasked with solving

    F[r] := ∫_0^T cos(4 arctan r′(t)) − v_max (1 − r(t)²/R²) dt → min,
    r(0) = r(T) = 0,    |r(t)| ≤ R.
Here, because of the additional constraints we can work in any L^p-space that we want, even L^∞. Clearly, the integrand

    W(r, a) := cos(4 arctan a) − v_max (1 − r²/R²),    |r| ≤ R,
is not convex in a, see Figure 5.2. Technically, the preceding theorem is not applicable,
since W depends on r and a and p = ∞, but it is obvious that we can simply extend it to
consider Young measures (δr(x) ⊗ νx )x ∈ Y∞ (Ω; R × R) with ν ∈ GY∞ (Ω; R) and [ν ] = r′ as
in Section 3.7 (we omit the details for p = ∞). We collect all such product Young measures
in the set GY∞ (Ω; R × R).
Then, we consider the extended–relaxed variational problem

    F̄[δ_r ⊗ ν] := ⟨⟨ W, δ_r ⊗ ν ⟩⟩ = ∫_0^T ∫ W(r(x), a) dν_x(a) dx → min,
    δ_r ⊗ ν = (δ_{r(x)} ⊗ ν_x)_x ∈ GY^∞(Ω; R × R),
    r(0) = r(T) = 0,    |r(x)| ≤ R.
Let us also construct a sequence of solutions that generates the optimal Young measure solution. Some elementary analysis yields that g(a) := cos(4 arctan a) has two minima at a = ±1, where g(±1) = −1 (see Figure 5.2). Moreover, h(r) := −v_max(1 − r²/R²) attains its minimum −v_max for r = 0. Thus,

    W ≥ −(1 + v_max) =: W_min.
We let

    h(s) := s − 1 if s ∈ [0, 2],   h(s) := 3 − s if s ∈ (2, 4],    s ∈ [0, 4],

and consider h to be extended to all s ∈ R by periodicity. Then set

    r_j(t) := (T/(4j)) h( (4j/T) t + 1 ).
It is easy to see that r′j ∈ {−1, +1} and r j → 0 uniformly. Thus,
F [r j ] → T ·Wmin = inf F .
By the Fundamental Theorem 3.11 on Young measures, we know that we may select a subsequence of j's (not renumbered) such that (see Lemma 3.28)

    (r_j, r_j′) →^Y δ_r ⊗ ν = (δ_{r(x)} ⊗ ν_x)_x ∈ GY^∞(Ω; R × R).
From the construction of r_j we see

    δ_r ⊗ ν = δ_0 ⊗ ( ½ δ_{−1} + ½ δ_{+1} ).
Clearly, this δ_r ⊗ ν is minimizing for F̄.
The following lemma creates a connection to the other relaxation approach from the last
section:
Lemma 5.7. Let h: R^{m×d} → R be continuous and satisfy

    μ|A|^p − μ^{-1} ≤ h(A) ≤ M(1 + |A|^p),    A ∈ R^{m×d},

for some μ, M > 0. Then, for all A ∈ R^{m×d} there exists a homogeneous gradient Young measure ν_A ∈ GY^p(B(0,1); R^{m×d}) such that

    Qh(A) = ∫ h dν_A.
Proof. According to (5.1) and the remarks following it,

    Qh(A) = inf { −∫_{B(0,1)} h(A + ∇ψ(z)) dz : ψ ∈ W^{1,p}_0(B(0,1); R^m) }.

Let now (ψ_j) ⊂ W^{1,p}_0(B(0,1); R^m) be a minimizing sequence in the above formula. This minimization problem is admissible in Theorem 5.5, which yields a Young measure minimizer ν ∈ GY^p(B(0,1); R^{m×d}) such that

    Qh(A) = −∫_{B(0,1)} ∫ h dν_x dx.
It only remains to show that we can replace (νx ) by a homogeneous Young measure νA .
This follows from the following lemma.
Lemma 5.8 (Homogenization). Let ν = (ν_x) ∈ GY^p(Ω; R^{m×d}) be such that [ν] = ∇u for some u ∈ W^{1,p}(Ω; R^m) with linear boundary values. Then, for any Lipschitz domain D ⊂ R^d there exists a homogeneous ν̄ = (ν̄_y) ∈ GY^p(D; R^{m×d}) such that

    ∫ h dν̄ = −∫_Ω ∫ h dν_x dx    (5.12)

for all continuous h: R^{m×d} → R with |h(A)| ≤ M(1 + |A|^p). Furthermore, this result remains valid if Ω = Q^d = (−1/2, 1/2)^d, the d-dimensional unit cube, and u has periodic boundary values.
Proof. Let (u_j) ⊂ W^{1,p}(Ω; R^m) with u_j(x) = Mx on ∂Ω (in the sense of trace) for a fixed matrix M ∈ R^{m×d} such that ∇u_j →^Y ν; in particular sup_j ∥∇u_j∥_{L^p} < ∞. Now choose for every j ∈ N a Vitali cover of D, see Theorem A.8,

    D = Z ⊎ ⊎_{k=1}^{N(j)} Ω(a_k^{(j)}, r_k^{(j)}),    |Z| = 0,

with a_k^{(j)} ∈ D, r_k^{(j)} ≤ 1/j (k = 1, …, N(j)), and Ω(a, r) := a + rΩ. Then define

    v_j(y) := r_k^{(j)} u_j( (y − a_k^{(j)}) / r_k^{(j)} ) + M a_k^{(j)} if y ∈ Ω(a_k^{(j)}, r_k^{(j)}),   v_j(y) := 0 otherwise,    y ∈ D.
We have v_j ∈ W^{1,p}(D; R^m) (it is easy to see that there are no jumps over the gluing boundaries) and

    ∇v_j(y) = ∇u_j( (y − a_k^{(j)}) / r_k^{(j)} )    if y ∈ Ω(a_k^{(j)}, r_k^{(j)}).
We can then use a change of variables to compute, for all φ ∈ C_0(D) and h ∈ C(R^{m×d}) with p-growth,

    ∫_D φ(y) h(∇v_j(y)) dy = ∑_{k=1}^{N(j)} ∫_{Ω(a_k^{(j)}, r_k^{(j)})} φ(y) h( ∇u_j((y − a_k^{(j)})/r_k^{(j)}) ) dy
        = ∑_{k=1}^{N(j)} (r_k^{(j)})^d φ(a_k^{(j)}) ∫_Ω h(∇u_j(x)) dx + O(1/j),
where we also used that φ is uniformly continuous. Letting j → ∞ and using that (Riemann sums)

    lim_{j→∞} ∑_{k=1}^{N(j)} (r_k^{(j)})^d φ(a_k^{(j)}) = (1/|Ω|) ∫_D φ(x) dx,
we arrive at

    lim_{j→∞} ∫_D φ(y) h(∇v_j(y)) dy = ∫_D φ(x) dx · −∫_Ω ∫ h(A) dν_x(A) dx.    (5.13)
For φ = 1 and h(A) := |A|^p, the same change of variables gives

    sup_j ∥∇v_j∥^p_{L^p} = (|D|/|Ω|) sup_j ∥∇u_j∥^p_{L^p} < ∞.

Thus, there exists ν̄ ∈ GY^p(D; R^{m×d}) such that ∇v_j →^Y ν̄.
Using Lemma 3.21 we may moreover assume that (∇u_j), and hence also (∇v_j), is p-equiintegrable. Then, (5.13) implies

    ∫_D φ(y) ∫ h(A) dν̄_y(A) dy = ∫_D φ(x) dx · −∫_Ω ∫ h(A) dν_x(A) dx

for all φ, h as above. This implies in particular that (ν̄_y) = ν̄ is homogeneous. For φ = 1, we get (5.12).
The additional claim about Ω = Qd and an underlying deformation u with periodic
boundary values follows analogously, since also in this case we can “glue” generating functions on a (special) covering.
5.4 Characterization of gradient Young measures
In the previous section we replaced a minimization problem over a Sobolev space by its
extension–relaxation, defined on the space of gradient Young measures. Unfortunately, this
subset of the space of Young measures is so far defined only “extrinsically”, i.e. through
the existence of a generating sequence of gradients. The question arises whether there
is also an “intrinsic” characterization of gradient Young measures. Intuitively, trying to
understand gradient Young measures amounts to understanding the “structure” of sequences
of gradients, which is a useful endeavor.
Recall that in Lemma 3.22 we showed that a homogeneous gradient Young measure ν ∈ GY^p(B(0,1); R^{m×d}) with barycenter B := [ν] ∈ R^{m×d} satisfies the Jensen-type inequality

    h(B) ≤ ∫ h dν
for all quasiconvex functions h : Rm×d → R with |h(A)| ≤ M(1 + |A| p ) (M > 0). There, we
interpreted this as an expression of the (generalized) convexity of h, in analogy with the
classical Jensen-inequality.
However, we can also dualize the situation and see the validity of the above Jensen-type inequality for quasiconvex functions as a property of gradient Young measures. The
following (perhaps surprising) result shows that this dual point of view is indeed valid and
the Jensen-type inequality (essentially) characterizes gradient Young measures, a result due
to David Kinderlehrer and Pablo Pedregal from the early 1990s:
Theorem 5.9 (Kinderlehrer–Pedregal). Let ν ∈ Y^p(Ω; R^{m×d}) be a Young measure such that [ν] = ∇u for some u ∈ W^{1,p}(Ω; R^m) and such that for almost every x ∈ Ω the Jensen-type inequality

    h(∇u(x)) ≤ ∫ h dν_x

holds for all quasiconvex h: R^{m×d} → R with |h(A)| ≤ M(1 + |A|^p) (M > 0). Then ν ∈ GY^p(Ω; R^{m×d}), i.e. ν is a gradient p-Young measure.
The idea of the proof is to show that the set of gradient Young measures is convex and (weakly*) closed in the set of Young measures, and that the Jensen-type inequalities entail that ν cannot be separated from this set by a hyperplane (hyperplanes turn out to be in one-to-one correspondence with integrands). Then, the Hahn–Banach Theorem implies that ν
actually lies in this set and is hence a gradient Young measure. One first shows this for
homogeneous Young measures and then uses a gluing procedure to extend the result to
x-dependent ν = (νx ). The details, however, are quite technical, see [81] or the original
works [49, 51] for a full proof.
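As a quick illustration of how the Jensen-type test is used in practice (a standard example, not from the notes; it combines Lemma 5.2 with the necessity direction of the Jensen inequality):

% For F in R^{2x2} with rank F = 2, consider the homogeneous (non-gradient) Young measure
%   nu := (1/2) delta_{-F} + (1/2) delta_{+F},   so that [nu] = 0.
% The envelope Qh from Lemma 5.2 is quasiconvex with p-growth and satisfies Qh(+-F) = 0, hence
\[
  \int Qh \,\mathrm{d}\nu = \tfrac12 Qh(-F) + \tfrac12 Qh(F) = 0 < Qh(0) = Qh([\nu]),
\]
% so the Jensen-type inequality fails and nu cannot be a gradient p-Young measure: no sequence
% of gradients can oscillate only between the two (not rank-one connected) values -F and +F.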
This theorem is conceptually very important. It entails that if we could understand the
class of quasiconvex functions, then we also could understand gradient Young measures and
thus sequences of gradients. Unfortunately, our knowledge of quasiconvex functions (and
hence of gradient Young measures) is very limited at present. Most results are for specific
situations only and much of it has an “ad-hoc” flavor.
We close this chapter with a result showing that quasiconvex functions are not easy to
understand. In Proposition 3.3 we proved that quasiconvexity implies rank-one convexity
and Proposition 4.1 showed that polyconvexity implies quasiconvexity. Hence, in a sense,
rank-one convexity is a lower bound and polyconvexity an upper bound on quasiconvexity.
Moreover, the Alibert–Dacorogna–Marcellini function (see Examples 3.5, 4.2)

    h_γ(A) := |A|² (|A|² − 2γ det A),    A ∈ R^{2×2}, γ ∈ R,

has the following convexity properties:

• h_γ is convex if and only if |γ| ≤ (2√2)/3 ≈ 0.94,
• h_γ is rank-one convex if and only if |γ| ≤ 2/√3 ≈ 1.15,
• h_γ is quasiconvex if and only if |γ| ≤ γ_QC for some γ_QC ∈ (1, 2/√3],
• h_γ is polyconvex if and only if |γ| ≤ 1.

However, it is open whether γ_QC = 2/√3, and so it could be that quasiconvexity and rank-one convexity are actually equivalent. This was open for a long time until Vladimir Šverák's 1992 counterexample [89], which settled the question in the negative, at least for d ≥ 2,
m ≥ 3. The case d = m = 2 is still open and considered to be very important, also because
of its connection to other branches of mathematics.
Example 5.10 (Šverák). We will show the non-equivalence of quasiconvexity and rank-one convexity for m = 3, d = 2 only. Higher dimensions can be treated using an embedding of R^{3×2} into R^{m×d}. To accomplish this, we construct h: R^{3×2} → R that is rank-one convex, but not quasiconvex. Define a linear subspace L of R^{3×2} as follows:

    L := { \begin{pmatrix} x & 0 \\ 0 & y \\ z & z \end{pmatrix} : x, y, z ∈ R }.
We denote by P: R^{3×2} → L a linear projection onto L given as

    P(A) := \begin{pmatrix} a & 0 \\ 0 & d \\ (e+f)/2 & (e+f)/2 \end{pmatrix}    for    A = \begin{pmatrix} a & b \\ c & d \\ e & f \end{pmatrix} ∈ R^{3×2}.
Also define g: L → R to be

    g\begin{pmatrix} x & 0 \\ 0 & y \\ z & z \end{pmatrix} := −xyz.
For α, β > 0, we let h_{α,β}: R^{3×2} → R be given as

    h_{α,β}(A) := g(P(A)) + α(|A|² + |A|⁴) + β|A − P(A)|².
We show the following two properties of hα ,β :
(I) For every α > 0 sufficiently small and all β > 0, the function hα ,β is not quasiconvex.
(II) For every α > 0, there exists β = β (α ) > 0 such that hα ,β is rank-one convex.
Ad (I): For A = 0, D = (0,1)², and a periodic ψ ∈ W^{1,∞}(D; R³) given as

    ψ(x₁, x₂) := (1/(2π)) \begin{pmatrix} \sin 2πx₁ \\ \sin 2πx₂ \\ \sin 2π(x₁ + x₂) \end{pmatrix},
we have ∇ψ ∈ L and so P(∇ψ) = ∇ψ. Thus, we may compute (using the identity for the cosine of a sum and the fact that the sine is an odd function)

    ∫_D g(∇ψ) dx = −∫_0^1 ∫_0^1 (cos 2πx₁)² (cos 2πx₂)² dx₁ dx₂ = −1/4 < 0.
Thus, for α > 0 sufficiently small,

    ∫_D h_{α,β}(∇ψ) dx < 0 = h_{α,β}(0).
It turns out that in the definition of quasiconvexity we may alternatively test with functions
that have merely periodic boundary values on D (see Proposition 5.13 in [25]). Hence, (I)
follows.
Ad (II): Rank-one convexity is equivalent to the Legendre–Hadamard condition

    D²h_{α,β}(A)[B, B] := (d²/dt²)|_{t=0} h_{α,β}(A + tB) ≥ 0    for all A, B ∈ R^{3×2}, rank B ≤ 1.

Here, D²h_{α,β}(A)[B, B] is the second (Gâteaux) derivative of h_{α,β} at A in direction B.
The function g is a homogeneous polynomial of degree three, whereby we can find c > 0 such that for all A, B ∈ R^{3×2} with rank B ≤ 1,

    G(A, B) := D²(g ∘ P)(A)[B, B] := (d²/dt²)|_{t=0} g(P(A + tB)) ≥ −c|A||B|².
A computation then shows that

    D²h_{α,β}(A)[B, B] = G(A, B) + 2α|B|² + 4α|A|²|B|² + 8α(A : B)² + 2β|B − P(B)|²
        ≥ (−c + 4α|A|)|A||B|²,

where we used the Frobenius product A : B := ∑_{i,j} A^i_j B^i_j. Thus, for |A| ≥ c/(4α), the
Legendre–Hadamard condition (and hence rank-one convexity) holds.
We still need to prove the Legendre–Hadamard condition for |A| < c/(4α) and any fixed α > 0. Since D²h_{α,β}(A)[B, B] is homogeneous of degree two in B, we only need to consider A, B from the compact set

    K := { (A, B) ∈ R^{3×2} × R^{3×2} : |A| ≤ c/(4α), |B| = 1, rank B = 1 }.
From the estimate

    D²h_{α,β}(A)[B, B] ≥ G(A, B) + 2α|B|² + 2β|B − P(B)|² =: κ(A, B, β)

it suffices to show that there exists β = β(α) > 0 such that κ(A, B, β) ≥ 0 for all (A, B) ∈ K.
Assume that this is not the case. Then there exist β_j → ∞ and (A_j, B_j) ∈ K with

    0 > κ(A_j, B_j, β_j) = G(A_j, B_j) + 2α + 2β_j|B_j − P(B_j)|².
As K is compact, we may assume (A_j, B_j) → (A, B) ∈ K, for which it holds that

    G(A, B) + 2α ≤ 0,    P(B) = B,    and    rank B = 1.
However, since rank B = 1 and P(B) = B, the map t ↦ g(P(A + tB)) is affine, so that (d²/dt²) g(P(A + tB)) = 0 for all t ∈ R. This implies in particular G(A, B) = D²(g ∘ P)(A)[B, B] = 0, yielding 2α ≤ 0, which contradicts our assumption α > 0. Thus, we have shown that for a suitable choice of α, β > 0 it indeed holds that h_{α,β} is rank-one convex.
So, quasiconvexity and, dually, gradient Young measures, must remain somewhat mysterious concepts. This also shows that the theory of elliptic systems of PDEs, for which
traditionally the Legendre–Hadamard condition is the expression of “ellipticity”, is incomplete. In particular, non-variational systems (i.e. systems of PDEs which do not arise as the
Euler–Lagrange equations to a variational problem) are only poorly understood.
5.5 Notes and bibliographical remarks
Classically, one often defines the quasiconvex envelope through (5.2) and not as we have
done through (5.1). In this case, (5.2) is often called the Dacorogna formula, see Section 6.3 in [25] for further references.
The construction of Lemma 5.2 is based on a construction of Šverák [88] and some
ellipticity arguments similar to those in Lemma 2.7 of [73].
Laurence Chisholm Young originally introduced the objects that are now called Young
measures as “generalized curves/surfaces” in the late 1930s and early 1940s [97–99] to
treat problems in the Calculus of Variations and Optimal Control Theory that could not be
solved using classical methods. At the end of the 1960s he also wrote a book [100] (2nd
edition [101]) on the topic, which explained these objects and their applications in great
detail (including philosophical discussions on the semantics of the terminology “generalized
curve/surface”).
More recently, starting from the seminal work of Ball & James in 1987 [10], Young
measures have provided a convenient framework to describe fine phase mixtures in the
theory of microstructure, also see [33, 73] and the references cited therein. Further, and
more related to the present work, relaxation formulas involving Young measures for nonconvex integral functionals have been developed and can for example be found in Pedregal’s
book [81] and also in Chapter 11 of the recent text [3]; historically, Dacorogna’s lecture
notes [24] were also influential.
A second avenue of development – somewhat different from Young’s original intention
– is to use Young measures only as a technical tool (that is without them appearing in
the final statement of a result). This approach is in fact quite old and was probably first
adopted in a series of articles by McShane from the 1940s [62–64]. There, the author first
finds a generalized Young measure solution to a variational problem, then proves additional
properties of the obtained Young measures, and finally concludes that these properties entail
that the generalized solution is in fact classical. This idea exhibits some parallels to the
hugely successful approach of first proving existence of a weak solution to a PDE and then
showing that this weak solution has (in some cases) additional regularity.
The Kinderlehrer–Pedregal Theorem 5.9 also holds for p = ∞, which was in fact the
first case to be established. See [81] for an overview and proof.
A different proof of Lemma 5.2 can be found in Section 5.3.9 of [25].
The conditions (i)–(iii) at the beginning of Section 5.3 are modeled on the concept
of Γ-convergence (introduced by De Giorgi) and are also the basis of the usual treatment
of parameter-dependent integral functionals. See [18] for an introduction and [27] for a
thorough treatise.
Appendix A
Prerequisites
This appendix recalls some facts that are needed throughout the course.
A.1 General facts and notation
First, recall the following version of the fundamental Young inequality:

    ab ≤ (δ/2) a² + (1/(2δ)) b²,

which holds for all a, b ≥ 0 and δ > 0.
In this text (as in much of the literature), the matrix space R^{m×d} of (m × d)-matrices always comes equipped with the Frobenius matrix (inner) product

    A : B := ∑_{j,k} A^k_j B^k_j,    A, B ∈ R^{m×d},
which is just the Euclidean product if we identify such matrices with vectors in R^{md}. This inner product induces the Frobenius matrix norm

    |A| := √( ∑_{j,k} (A^k_j)² ),    A ∈ R^{m×d}.
Very often we will use the tensor product of vectors a ∈ R^m, b ∈ R^d,

    a ⊗ b := ab^T ∈ R^{m×d}.

The tensor product interacts well with the Frobenius norm,

    |a ⊗ b| = |a| · |b|,    a ∈ R^m, b ∈ R^d.
While all norms on the finite-dimensional space R^{m×d} are equivalent, some "finer" arguments require us to specify a matrix norm; if nothing else is stated, we always use the Frobenius norm.
It is proved in texts on Matrix Analysis, see for instance [48], that this Frobenius norm can also be expressed as

    |A| = √( ∑_i σ_i(A)² ),    A ∈ R^{m×d},
where σ_i(A) ≥ 0 is the i'th singular value of A, i = 1, …, min(d, m). Recall that every matrix A ∈ R^{m×d} has a (real) singular value decomposition

    A = PΣQ^T

for orthogonal P ∈ R^{m×m}, Q ∈ R^{d×d}, and a diagonal Σ = diag(σ₁, …, σ_{min(d,m)}) ∈ R^{m×d} with nonnegative entries

    σ₁ ≥ σ₂ ≥ … ≥ σ_r > 0 = σ_{r+1} = σ_{r+2} = ⋯ = σ_{min(d,m)},

called the singular values of A, where r is the rank of A.
For A ∈ R^{d×d} the cofactor matrix cof A ∈ R^{d×d} is the matrix whose (j, k)'th entry is (−1)^{j+k} M^{¬j}_{¬k}(A), with M^{¬j}_{¬k}(A) being the (j, k)-minor of A, i.e. the determinant of the matrix that originates from A by deleting the j'th row and k'th column. One important formula (and in fact another way to define the cofactor matrix) is

    ∂_A[det A] = cof A.

Furthermore, the Cramer rule entails that

    A (cof A)^T = det A · I    for all A ∈ R^{d×d},
where I denotes the identity matrix as usual. In particular, if A is invertible,

    A^{−1} = (cof A)^T / det A.

Sometimes, the matrix (cof A)^T is called the adjugate matrix to A in the literature (we will not use this terminology here, however).
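A quick 2×2 sanity check of these formulas (elementary, included for convenience):

% Cofactor matrix and Cramer rule for a 2x2 matrix:
\[
  A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}, \quad
  \operatorname{cof} A = \begin{pmatrix} d & -c \\ -b & a \end{pmatrix}, \quad
  A(\operatorname{cof} A)^T
    = \begin{pmatrix} ad - bc & 0 \\ 0 & ad - bc \end{pmatrix}
    = (\det A)\, I .
\]
% Also \partial_A[\det A] = \operatorname{cof} A here: e.g. \partial_a(ad - bc) = d = (\operatorname{cof} A)_{11}.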
A fundamental inequality involving the determinant is the Hadamard inequality: Let A ∈ R^{d×d} with columns A_j ∈ R^d (j = 1, …, d). Then,

    |det A| ≤ ∏_{j=1}^d |A_j| ≤ |A|^d.

Of course, an analogous formula holds with the rows of A.
A.2 Measure theory
We assume that the reader is familiar with the notion of Lebesgue- and Borel-measurability, nullsets, and L^p-spaces; a good introduction is [85]. For the d-dimensional Lebesgue measure we write L^d (or L^d_x if we want to stress the integration variable), but the Lebesgue measure of a Borel- or Lebesgue-measurable set A ⊂ R^d is simply |A|. The characteristic function of such an A is

    1_A(x) := 1 if x ∈ A,   1_A(x) := 0 otherwise,    x ∈ R^d.
In the following, we only recall some results that are needed elsewhere.
Lemma A.1 (Fatou). Let f j : RN → [0, +∞], j ∈ N, be a sequence of Lebesgue-measurable
functions. Then,
    lim inf_{j→∞} ∫ f_j(x) dx ≥ ∫ lim inf_{j→∞} f_j(x) dx.
Lemma A.2 (Monotone convergence). Let f_j: R^N → [0, +∞], j ∈ N, be a sequence of Lebesgue-measurable functions with f_j(x) ↑ f(x) for almost every x ∈ R^N. Then, f: R^N → [0, +∞] is measurable and

    lim_{j→∞} ∫ f_j(x) dx = ∫ f(x) dx.
Lemma A.3. If f_j → f strongly in L^p(R^d), 1 ≤ p ≤ ∞, that is,

    ∥f_j − f∥_{L^p} → 0    as j → ∞,

then there exists a subsequence (not renumbered¹) such that f_j → f almost everywhere.
Theorem A.4 (Lebesgue dominated convergence theorem). Let f j : RN → Rn , j ∈ N,
be a sequence of Lebesgue-measurable functions such that there exists an L p -integrable
majorant g ∈ L p (RN ), 1 ≤ p < ∞, that is,
    |f_j| ≤ g    for all j ∈ N.
If f j → f pointwise almost everywhere, then also f j → f strongly in L p .
The following strengthening of Lebesgue’s theorem is very convenient:
Theorem A.5 (Pratt). If f j → f pointwise almost everywhere and there exists a sequence
(g j ) ⊂ L1 (RN ) with g j → g in L1 such that | f j | ≤ g j , then f j → f in L1 .
The following convergence theorem is of fundamental significance:

¹ We generally do not choose different indices for subsequences if the original sequence is discarded.
Theorem A.6 (Vitali convergence theorem). Let Ω ⊂ R^d be bounded and let (f_j) ⊂ L^p(Ω; R^m), 1 ≤ p < ∞. Assume furthermore that

(i) No oscillations: f_j → f in measure, that is, for all ε > 0,

    |{ x ∈ Ω : |f_j − f| > ε }| → 0    as j → ∞.

(ii) No concentrations: {f_j}_j is p-equiintegrable, i.e.

    lim_{K↑∞} sup_j ∫_{{|f_j|>K}} |f_j|^p dx = 0.

Then, f_j → f strongly in L^p.
Lebesgue-integrable functions also have a very useful “continuity” property:
Theorem A.7. Let f ∈ L¹(R^d). Then, almost every x₀ ∈ R^d is a Lebesgue point of f, i.e.

    lim_{r↓0} −∫_{B(x₀,r)} |f(x) − f(x₀)| dx = 0,

where B(x₀, r) ⊂ R^d is the ball with center x₀ and radius r > 0 and

    −∫_{B(x₀,r)} := (1/|B(x₀,r)|) ∫_{B(x₀,r)}.
The following covering theorem is a handy tool for several constructions, see Theorems 2.19 and 5.51 in [2] for a proof:
Theorem A.8 (Vitali covering theorem). Let B ⊂ R^N be a bounded Borel set and let K ⊂ R^N be compact. Then, there exist a_k ∈ B, r_k > 0, where k ∈ N, such that

    B = Z ⊎ ⊎_{k=1}^∞ K(a_k, r_k),    K(a_k, r_k) = a_k + r_k K,

with Z ⊂ B a Lebesgue-nullset, |Z| = 0, and "⊎" denoting that the sets in the union are disjoint.
We also need other measures than Lebesgue measure on subsets of RN (this could be
either Rd or a matrix space Rm×d , identified with Rmd ). All of these abstract measures will
be Borel measures, that is, they are defined on the Borel σ -algebra B(RN ) of RN . In
this context, recall that the Borel-σ -algebra is the smallest σ -algebra that contains the open
sets. All Borel measures defined on RN that do not take the value +∞ are collected in the
set M(RN ) of (finite) Radon measures, its subclass of probability measures is M1 (RN ).
We also use M(Ω), M1 (Ω) for the subset of measures that only charge Ω ⊂ RN . We define
the duality pairing
    ⟨h, μ⟩ := ∫ h(A) dμ(A)
for h : RN → Rm and µ ∈ M(RN ), whenever this integral makes sense (we need at least h
µ -measurable). The following notation is convenient for the barycenter of a finite Borel
measure µ :
    [μ] := ⟨id, μ⟩ = ∫ A dμ(A).
Probability measures and convex functions interact well:
Lemma A.9 (Jensen inequality). For all probability measures μ ∈ M¹(R^N) and all convex h: R^N → R it holds that

    h([μ]) = h( ∫ A dμ(A) ) ≤ ∫ h(A) dμ(A) = ⟨h, μ⟩.
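A two-point example (elementary; it is also the situation appearing in Example 5.6):

% For mu = (1/2) delta_{-1} + (1/2) delta_{+1} on R we have [mu] = 0, and for any convex h:
\[
  h([\mu]) = h(0) \;\le\; \tfrac12 h(-1) + \tfrac12 h(+1) = \langle h, \mu \rangle ,
\]
% with strict inequality e.g. for h(a) = a^2. For the non-convex double-well h(a) = (a^2-1)^2
% the inequality fails: h(0) = 1 > 0 = <h, mu>; this is exactly why such measures can lower
% the energy in relaxation.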
There is an alternative, “dual”, view on finite measures, expressed in the following
important theorem:
Theorem A.10 (Riesz representation theorem). The space of Radon measures M(RN )
is isometrically isomorphic to the dual space C0 (RN )∗ via the duality pairing
    ⟨h, μ⟩ = ∫ h dμ.
As an easy, often convenient, consequence, we can define measures through their action
on C0 (RN ). Note, however, that we then need to check the boundedness |⟨h, µ ⟩| ≤ ∥h∥∞ .
If the element µ ∈ C0 (RN )∗ is additionally positive, that is, ⟨h, µ ⟩ ≥ 0 for h ≥ 0, and
normalized, that is, ⟨1, µ ⟩ = 1 (here, 1 ≡ 1 on the whole space), then µ from the Riesz
representation theorem is in fact a probability measure, µ ∈ M1 (RN ).
A.3 Functional analysis
We assume that the reader has a solid foundation in the basic notions of Functional Analysis
such as Banach spaces and their duals, weak/weak* convergence (and topology2 ), reflexivity, and weak/weak* compactness. The application of x∗ ∈ X ∗ to x ∈ X is often written
via the duality pairing ⟨x, x∗⟩ = ⟨x∗, x⟩ := x∗(x). We write x_j ⇀ x for weak convergence and x_j ⇀* x for weak* convergence. In this context recall that in Banach spaces topological weak compactness is equivalent to sequential weak compactness; this is the Eberlein–
Šmulian Theorem. Also recall that the norm in a Banach space is lower semicontinuous
with respect to weak convergence: If x j ⇀ x in X, then ∥x∥ ≤ lim inf j→∞ ∥x j ∥. One very
thorough reference for most of this material is [23]. The following are just reminders:
Theorem A.11 (Weak compactness). Let X be a reflexive Banach space. Then, norm-bounded sets in X are sequentially weakly relatively compact.
² However, we here really only need convergence of sequences.
Theorem A.12 (Sequential Banach–Alaoglu theorem). Let X be a separable Banach
space. Then, norm-bounded sets in the dual space X ∗ are weakly* sequentially relatively
compact.
Theorem A.13 (Dunford–Pettis). Let Ω ⊂ Rd be bounded. A bounded family { f j } ⊂
L1 (Ω) ( j ∈ N) is equiintegrable if and only if it is weakly relatively (sequentially) compact
in L1 (Ω).
Finally, weak convergence can be “improved” to strong convergence in the following
way:
Lemma A.14 (Mazur). Let x_j ⇀ x in a Banach space X. Then, there exists a sequence (y_j) ⊂ X of convex combinations,

    y_j = ∑_{n=j}^{N(j)} θ_{j,n} x_n,    θ_{j,n} ∈ [0, 1],    ∑_{n=j}^{N(j)} θ_{j,n} = 1,

such that y_j → x in X.
A.4 Sobolev and other function spaces
We give a brief introduction to Sobolev spaces, see [59] or [36] for more detailed accounts
and proofs.
Let Ω ⊂ R^d. As usual we denote by C(Ω) = C⁰(Ω), C^k(Ω), k = 1, 2, …, the spaces of continuous and k-times continuously differentiable functions. As norms in these spaces we have

    ∥u∥_{C^k} := ∑_{|α|≤k} ∥∂^α u∥_∞,    u ∈ C^k(Ω),    k = 0, 1, 2, …,

where ∥·∥_∞ is the supremum norm. Here, the sum is over all multi-indices α ∈ (N ∪ {0})^d with |α| := α₁ + ⋯ + α_d ≤ k, and

    ∂^α := ∂₁^{α₁} ∂₂^{α₂} ⋯ ∂_d^{α_d}

is the α-derivative operator. Similarly, we define the space C^∞(Ω) of infinitely often differentiable functions (but this is not a Banach space anymore). A subscript "c" indicates that all functions u in the respective function space (e.g. C_c^∞(Ω)) must have their support

    supp u := cl { x ∈ Ω : u(x) ≠ 0 }

compactly contained in Ω (so, supp u ⊂ Ω and supp u compact). For the compact containment of a set A in an open set B we write A ⊂⊂ B, which means that the closure of A is compact and contained in B. In this way, the previous condition could be written supp u ⊂⊂ Ω.
For k ∈ N a positive integer and 1 ≤ p ≤ ∞, the Sobolev space W^{k,p}(Ω) is defined to contain all functions u ∈ L^p(Ω) such that the weak derivative ∂^α u exists in L^p(Ω) for all multi-indices α ∈ (N ∪ {0})^d with |α| ≤ k. This means that for each such α, there is a (unique) function v_α ∈ L^p(Ω) satisfying

    ∫ v_α · ψ dx = (−1)^{|α|} ∫ u · ∂^α ψ dx    for all ψ ∈ C_c^∞(Ω),

and we write ∂^α u for this v_α. The uniqueness follows from the Fundamental Lemma of the Calculus of Variations 2.21. Clearly, if u ∈ C^k(Ω), then the weak derivative is the classical derivative. For u ∈ W^{1,p}(Ω) we further define the weak gradient and weak divergence,

    ∇u := (∂₁u, ∂₂u, …, ∂_d u),    div u := ∂₁u + ∂₂u + ⋯ + ∂_d u.
As norm in W^{k,p}(Ω), 1 ≤ p < ∞, we use

    ∥u∥_{W^{k,p}} := ( ∑_{|α|≤k} ∥∂^α u∥^p_{L^p} )^{1/p},    u ∈ W^{k,p}(Ω).

For p = ∞, we set

    ∥u∥_{W^{k,∞}} := max_{|α|≤k} ∥∂^α u∥_{L^∞},    u ∈ W^{k,∞}(Ω).

Under these norms, the W^{k,p}(Ω) become Banach spaces.
Concerning the boundary values of Sobolev functions, we have:
Theorem A.15 (Trace theorem). For every domain Ω ⊂ Rd with Lipschitz boundary and
every 1 ≤ p ≤ ∞, there exists a bounded linear trace operator trΩ : W1,p (Ω) → L p (∂ Ω)
such that
    tr_Ω φ = φ|_{∂Ω}    if φ ∈ C(Ω).
We write trΩ u simply as u|∂ Ω . Furthermore, trΩ is bounded and weakly continuous between
W1,p (Ω) and L p (∂ Ω).
For 1 < p < ∞ denote the image of W^{1,p}(Ω) under tr_Ω by W^{1−1/p,p}(∂Ω), called the trace space of W^{1,p}(Ω). As a makeshift norm on W^{1−1/p,p}(∂Ω) one can use the L^p(∂Ω)-norm, but W^{1−1/p,p}(∂Ω) is not closed under this norm. This can be remedied by introducing a more complicated norm involving "fractional" derivatives (hence the notation for W^{1−1/p,p}(∂Ω)), under which this set becomes a Banach space.

We write W^{1,p}_0(Ω) for the linear subspace of W^{1,p}(Ω) consisting of all W^{1,p}-functions with zero boundary values (in the sense of trace). More generally, we use W^{1,p}_g(Ω), with g ∈ W^{1−1/p,p}(∂Ω), for the affine subspace of all W^{1,p}-functions with boundary trace g.
The following are some properties of Sobolev spaces, stated for simplicity only for the
first-order space W1,p (Ω) with Ω ⊂ Rd a bounded Lipschitz domain.
Theorem A.16 (Poincaré–Friedrichs inequality). Let u ∈ W1,p (Ω).
(I) If u|_{∂Ω} = 0, then

    ∥u∥_{L^p} ≤ C∥∇u∥_{L^p},

where C = C(Ω) > 0 is a constant. If 1 < p < ∞ and u|_{∂Ω} = g ∈ W^{1−1/p,p}(∂Ω), then

    ∥u∥_{L^p} ≤ C( ∥∇u∥_{L^p} + ∥g∥_{W^{1−1/p,p}} ).

(II) Setting [u]_Ω := −∫_Ω u dx, it holds that

    ∥u − [u]_Ω∥_{L^p} ≤ C∥∇u∥_{L^p},

where C = C(Ω) > 0 is a constant.
Sobolev functions have higher integrability than originally assumed:
Theorem A.17 (Sobolev embedding). Let u ∈ W^{1,p}(Ω).

(I) If p < d, then u ∈ L^{p∗}(Ω), where

    p∗ = dp/(d − p),

and there is a constant C = C(Ω) > 0 such that ∥u∥_{L^{p∗}} ≤ C∥u∥_{W^{1,p}}.

(II) If p = d, then u ∈ L^q(Ω) for all 1 ≤ q < ∞ and ∥u∥_{L^q} ≤ C_q∥u∥_{W^{1,p}}.

(III) If p > d, then u ∈ C(Ω) and ∥u∥_∞ ≤ C∥u∥_{W^{1,p}}.
The last part can in fact be made more precise by considering embeddings into
Hölder spaces, see Section 5.6.3 in [36] for details.
Theorem A.18 (Rellich–Kondrachov). Let (u j ) ⊂ W1,p (Ω) with u j ⇀ u in W1,p .
(I) If p < d, then u_j → u in L^q(Ω) (strongly) for any q < p∗ = dp/(d − p).

(II) If p = d, then u_j → u in L^q(Ω) (strongly) for any q < ∞.
(III) If p > d, then u j → u uniformly.
The following is a consequence of the Stone–Weierstrass Theorem:
Theorem A.19 (Density). For every 1 ≤ p < ∞, u ∈ W^{1,p}(Ω), and all ε > 0 there exists v ∈ (W^{1,p} ∩ C^∞)(Ω) with u − v ∈ W^{1,p}_0(Ω) and ∥u − v∥_{W^{1,p}} < ε.
Finally, we note that for 0 < γ ≤ 1 a function u: Ω → R is a γ-Hölder-continuous function, in symbols u ∈ C^{0,γ}(Ω), if

    ∥u∥_{C^{0,γ}} := ∥u∥_∞ + sup_{x,y∈Ω, x≠y} |u(x) − u(y)| / |x − y|^γ < ∞.
Functions in C0,1 (Ω) are called Lipschitz-continuous. We construct higher-order spaces
Ck,γ (Ω) analogously.
Next, we define a family of mollifiers as follows: Let ρ ∈ C_c^∞(R^d) be radially symmetric and positive. Then, the family (ρ_δ)_δ, with ρ_δ ∈ C_c^∞(B(0,δ)), consists of the functions

    ρ_δ(x) := (1/δ^d) ρ(x/δ),    x ∈ R^d.

For u ∈ W^{k,p}(R^d), where k ∈ N ∪ {0} and 1 ≤ p < ∞, we define the mollification u_δ ∈ W^{k,p}(R^d) as the convolution between u and ρ_δ, i.e.

    u_δ(x) := (ρ_δ ⋆ u)(x) := ∫ ρ_δ(x − y) u(y) dy,    x ∈ R^d.
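A standard concrete choice for ρ (the usual bump function, not spelled out in the notes) is:

% The standard mollifier: smooth, radially symmetric, positive on B(0,1), compactly supported;
% the constant c is customarily chosen so that \int \rho \, dx = 1.
\[
  \rho(x) := \begin{cases}
     c \exp\!\bigl( -\tfrac{1}{1 - |x|^2} \bigr), & |x| < 1, \\
     0, & |x| \ge 1 .
  \end{cases}
\]
% Then \rho_\delta(x) = \delta^{-d}\rho(x/\delta) is supported in B(0,\delta) and, by Lemma A.20
% below, \rho_\delta \star u \to u in W^{k,p} as \delta \downarrow 0.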
Lemma A.20. If u ∈ Wk,p (Rd ), then uδ → u strongly in Wk,p as δ ↓ 0.
Analogous results also hold for continuously differentiable functions.
Finally, all the above notions and theorems continue to hold for vector-valued functions
u = (u1 , . . . , um )T : Ω → Rm and in this case we set
 1

∂1 u ∂2 u1 · · · ∂d u1
 ∂1 u2 ∂2 u2 · · · ∂d u2 


m×d
∇u :=  .
.
..
..  ∈ R
 ..

.
.
∂1 um ∂2 um · · · ∂d um
We use the spaces C(Ω; Rm ), Ck (Ω; Rm ), Wk,p (Ω; Rm ), Ck,γ (Ω; Rm ) with analogous meanings.
We also use local versions of all spaces, namely C_loc(Ω), C^k_loc(Ω), W^{k,p}_loc(Ω), C^{k,γ}_loc(Ω), where the defining norm is only finite on every compact subset of Ω.
Bibliography
J. J. Alibert and G. Bouchitté. Non-uniform integrability and generalized Young
measures. J. Convex Anal., 4:129–147, 1997.
[2]
L. Ambrosio, N. Fusco, and D. Pallara. Functions of Bounded Variation and
Free-Discontinuity Problems. Oxford Mathematical Monographs. Oxford University Press, 2000.
[3]
H. Attouch, G. Buttazzo, and G. Michaille. Variational Analysis in Sobolev and BV
Spaces – Applications to PDEs and Optimization. MPS-SIAM Series on Optimization. Society for Industrial and Applied Mathematics, 2006.
[4]
E. J. Balder. A general approach to lower semicontinuity and lower closure in optimal
control theory. SIAM J. Control Optim., 22:570–598, 1984.
[5]
J. M. Ball. Convexity conditions and existence theorems in nonlinear elasticity. Arch.
Ration. Mech. Anal., 63:337–403, 1976/77.
[6]
J. M. Ball. Global invertibility of Sobolev functions and the interpenetration of matter. Proc. Roy. Soc. Edinburgh Sect. A, 88:315–328, 1981.
[7]
J. M. Ball. A version of the fundamental theorem for Young measures. In PDEs and
continuum models of phase transitions (Nice, 1988), volume 344 of Lecture Notes in
Physics, pages 207–215. Springer, 1989.
[8]
J. M. Ball. Some open problems in elasticity. In Geometry, mechanics, and dynamics,
pages 3–59. Springer, 2002.
[9]
J. M. Ball, J. C. Currie, and P. J. Olver. Null Lagrangians, weak continuity, and
variational problems of arbitrary order. J. Funct. Anal., 41:135–174, 1981.
[10]
J. M. Ball and R. D. James. Fine phase mixtures as minimizers of energy. Arch.
Ration. Mech. Anal., 100:13–52, 1987.
[11]
J. M. Ball and R. D. James. Proposed experimental tests of a theory of fine microstructure and the two-well problem. Phil. Trans. R. Soc. Lond. A, 338:389–450,
1992.
Comment...
[1]
118
BIBLIOGRAPHY
J. M. Ball, B. Kirchheim, and J. Kristensen. Regularity of quasiconvex envelopes.
Calc. Var. Partial Differential Equations, 11:333–359, 2000.
[13]
J. M. Ball and V. J. Mizel. One-dimensional variational problems whose minimizers
do not satisfy the Euler-Lagrange equation. Arch. Rational Mech. Anal., 90:325–388,
1985.
[14]
J. M. Ball and F. Murat. Remarks on Chacon’s biting lemma. Proc. Amer. Math.
Soc., 107:655–663, 1989.
[15]
J. Bergh and J. Löfström. Interpolation Spaces, volume 223 of Grundlehren der
mathematischen Wissenschaften. Springer, 1976.
[16]
H. Berliocchi and J.-M. Lasry. Intégrandes normales et mesures paramétrées en calcul des variations. Bull. Soc. Math. France, 101:129–184, 1973.
[17]
B. Bojarski and T. Iwaniec. Analytical foundations of the theory of quasiconformal
mappings in Rn . Ann. Acad. Sci. Fenn. Ser. A I Math., 8:257–324, 1983.
[18]
A. Braides. Γ-convergence for Beginners, volume 22 of Oxford Lecture Series in
Mathematics and its Applications. Oxford University Press, 2002.
[19]
C. Castaing, P. R. de Fitte, and M. Valadier. Young Measures on Topological Spaces:
With Applications in Control Theory and Probability Theory. Mathematics and its
Applications. Kluwer, 2004.
[20]
Ph. G. Ciarlet. Mathematical Elasticity, volume 1: Three Dimensional Elasticity.
North-Holland, 1988.
[21]
Ph. G. Ciarlet and G. Geymonat. Sur les lois de comportement en élasticité non
linéaire compressible. C. R. Acad. Sci. Paris Sér. II Méc. Phys. Chim. Sci. Univers
Sci. Terre, 295:423–426, 1982.
[22]
Ph. G. Ciarlet and J. Nečas. Injectivity and self-contact in nonlinear elasticity. Arch.
Rational Mech. Anal., 97:171–188, 1987.
[23]
J. B. Conway. A Course in Functional Analysis, volume 96 of Graduate Texts in
Mathematics. Springer, 2nd edition, 1990.
[24]
B. Dacorogna. Weak Continuity and Weak Lower Semicontinuity for Nonlinear Functionals, volume 922 of Lecture Notes in Mathematics. Springer, 1982.
[25]
B. Dacorogna. Direct Methods in the Calculus of Variations, volume 78 of Applied
Mathematical Sciences. Springer, 2nd edition, 2008.
[26]
B. Dacorogna. Introduction to the calculus of variations. Imperial College Press,
2nd edition, 2009.
[27]
G. Dal Maso. An Introduction to Γ-Convergence, volume 8 of Progress in Nonlinear
Differential Equations and their Applications. Birkhäuser, 1993.
[28]
E. De Giorgi. Sulla differenziabilità e l’analiticità delle estremali degli integrali multipli regolari. Mem. Accad. Sci. Torino. Cl. Sci. Fis. Mat. Nat. (3), 3:25–43, 1957.
[29]
E. De Giorgi. Un esempio di estremali discontinue per un problema variazionale di
tipo ellittico. Boll. Un. Mat. Ital., 1:135–137, 1968.
[30]
J. Diestel and J. J. Uhl, Jr. Vector measures, volume 15 of Mathematical Surveys.
American Mathematical Society, 1977.
[31]
N. Dinculeanu. Vector measures, volume 95 of International Series of Monographs
in Pure and Applied Mathematics. Pergamon Press, 1967.
[32]
R. J. DiPerna and A. J. Majda. Oscillations and concentrations in weak solutions of
the incompressible fluid equations. Comm. Math. Phys., 108:667–689, 1987.
[33]
G. Dolzmann and S. Müller. Microstructures with finite surface energy: the two-well
problem. Arch. Ration. Mech. Anal., 132:101–141, 1995.
[34]
I. Ekeland and R. Temam. Convex analysis and variational problems. North-Holland,
1976.
[35]
L. C. Evans. Weak convergence methods for nonlinear partial differential equations,
volume 74 of CBMS Regional Conference Series in Mathematics. Conference Board
of the Mathematical Sciences, 1990.
[36]
L. C. Evans. Partial differential equations, volume 19 of Graduate Studies in Mathematics. American Mathematical Society, 2nd edition, 2010.
[37]
I. Fonseca and G. Leoni. Modern Methods in the Calculus of Variations: L^p Spaces.
Springer, 2007.
[38]
I. Fonseca, S. Müller, and P. Pedregal. Analysis of concentration and oscillation
effects generated by gradients. SIAM J. Math. Anal., 29:736–756, 1998.
[39]
M. Giaquinta and S. Hildebrandt. Calculus of variations. I – The Lagrangian formalism, volume 310 of Grundlehren der mathematischen Wissenschaften. Springer,
1996.
[40]
M. Giaquinta and S. Hildebrandt. Calculus of variations. II – The Hamiltonian formalism, volume 311 of Grundlehren der mathematischen Wissenschaften. Springer,
1996.
[41]
M. Giaquinta, G. Modica, and J. Souček. Cartesian currents in the calculus of variations. I, volume 37 of Ergebnisse der Mathematik und ihrer Grenzgebiete. Springer,
1998.
[42]
M. Giaquinta, G. Modica, and J. Souček. Cartesian currents in the calculus of variations. II, volume 38 of Ergebnisse der Mathematik und ihrer Grenzgebiete. Springer,
1998.
[43]
D. Gilbarg and N. S. Trudinger. Elliptic Partial Differential Equations of Second
Order, volume 224 of Grundlehren der mathematischen Wissenschaften. Springer,
1998.
[44]
E. Giusti. Direct methods in the calculus of variations. World Scientific, 2003.
[45]
L. Grafakos. Classical Fourier Analysis, volume 249 of Graduate Texts in Mathematics. Springer, 2nd edition, 2008.
[46]
D. Henao and C. Mora-Corral. Invertibility and weak continuity of the determinant
for the modelling of cavitation and fracture in nonlinear elasticity. Arch. Ration.
Mech. Anal., 197:619–655, 2010.
[47]
D. Hilbert. Mathematische Probleme – Vortrag, gehalten auf dem internationalen
Mathematiker-Kongreß zu Paris 1900. Nachrichten von der Königlichen Gesellschaft
der Wissenschaften zu Göttingen, Mathematisch-Physikalische Klasse, pages 253–
297, 1900.
[48]
R. A. Horn and C. R. Johnson. Matrix analysis. Cambridge University Press, 2nd
edition, 2013.
[49]
D. Kinderlehrer and P. Pedregal. Characterizations of Young measures generated by
gradients. Arch. Ration. Mech. Anal., 115:329–365, 1991.
[50]
D. Kinderlehrer and P. Pedregal. Weak convergence of integrands and the Young
measure representation. SIAM J. Math. Anal., 23:1–19, 1992.
[51]
D. Kinderlehrer and P. Pedregal. Gradient Young measures generated by sequences
in Sobolev spaces. J. Geom. Anal., 4:59–90, 1994.
[52]
K. Koumatos, F. Rindler, and E. Wiedemann. Differential inclusions and Young
measures involving prescribed Jacobians. SIAM J. Math. Anal., 2015. To appear.
[53]
J. Kristensen. Lower semicontinuity of quasi-convex integrals in BV. Calc. Var.
Partial Differential Equations, 7(3):249–261, 1998.
[54]
J. Kristensen. Lower semicontinuity in spaces of weakly differentiable functions.
Math. Ann., 313:653–710, 1999.
[55]
J. Kristensen and C. Melcher. Regularity in oscillatory nonlinear elliptic systems.
Math. Z., 260:813–847, 2008.
[56]
J. Kristensen and G. Mingione. The singular set of Lipschitzian minima of multiple
integrals. Arch. Ration. Mech. Anal., 184:341–369, 2007.
[57]
J. Kristensen and F. Rindler. Characterization of generalized gradient Young measures generated by sequences in W^{1,1} and BV. Arch. Ration. Mech. Anal., 197:539–
598, 2010. Erratum: Vol. 203 (2012), 693–700.
[58]
M. Kružík and T. Roubíček. On the measures of DiPerna and Majda. Math. Bohem.,
122:383–399, 1997.
[59]
G. Leoni. A first course in Sobolev spaces, volume 105 of Graduate Studies in Mathematics. American Mathematical Society, 2009.
[60]
M. Marcus and V. J. Mizel. Transformations by functions in Sobolev spaces and
lower semicontinuity for parametric variational problems. Bull. Amer. Math. Soc.,
79:790–795, 1973.
[61]
P. Mattila. Geometry of Sets and Measures in Euclidean Spaces, volume 44 of Cambridge Studies in Advanced Mathematics. Cambridge University Press, 1995.
[62]
E. J. McShane. Generalized curves. Duke Math. J., 6:513–536, 1940.
[63]
E. J. McShane. Necessary conditions in generalized-curve problems of the calculus
of variations. Duke Math. J., 7:1–27, 1940.
[64]
E. J. McShane. Sufficient conditions for a weak relative minimum in the problem of
Bolza. Trans. Amer. Math. Soc., 52:344–379, 1942.
[65]
N. G. Meyers. Quasi-convexity and lower semi-continuity of multiple variational
integrals of any order. Trans. Amer. Math. Soc., 119:125–149, 1965.
[66]
B. S. Mordukhovich. Variational analysis and generalized differentiation. I – Basic
theory, volume 330 of Grundlehren der mathematischen Wissenschaften. Springer,
2006.
[67]
B. S. Mordukhovich. Variational analysis and generalized differentiation. II – Applications, volume 331 of Grundlehren der mathematischen Wissenschaften. Springer,
2006.
[68]
C. B. Morrey, Jr. On the solutions of quasi-linear elliptic partial differential equations.
Trans. Amer. Math. Soc., 43:126–166, 1938.
[69]
C. B. Morrey, Jr. Quasiconvexity and the semicontinuity of multiple integrals. Pacific
J. Math., 2:25–53, 1952.
[70]
C. B. Morrey, Jr. Multiple Integrals in the Calculus of Variations, volume 130 of
Grundlehren der mathematischen Wissenschaften. Springer, 1966.
[71]
J. Moser. A new proof of De Giorgi’s theorem concerning the regularity problem for
elliptic differential equations. Comm. Pure Appl. Math., 13:457–468, 1960.
[72]
J. Moser. On Harnack’s theorem for elliptic differential equations. Comm. Pure Appl.
Math., 14:577–591, 1961.
[73]
S. Müller. Variational models for microstructure and phase transitions. In Calculus
of variations and geometric evolution problems (Cetraro, 1996), volume 1713 of
Lecture Notes in Mathematics, pages 85–210. Springer, 1999.
[74]
S. Müller and S. J. Spector. An existence theory for nonlinear elasticity that allows
for cavitation. Arch. Ration. Mech. Anal., 131:1–66, 1995.
[75]
S. Müller and V. Šverák. Convex integration for Lipschitz mappings and counterexamples to regularity. Ann. of Math., 157:715–742, 2003.
[76]
F. Murat. Compacité par compensation. Ann. Scuola Norm. Sup. Pisa Cl. Sci., 5:489–
507, 1978.
[77]
F. Murat. Compacité par compensation. II. In Proceedings of the International
Meeting on Recent Methods in Nonlinear Analysis (Rome, 1978), pages 245–256.
Pitagora, 1979.
[78]
J. Nash. Continuity of solutions of parabolic and elliptic equations. Amer. J. Math.,
80:931–954, 1958.
[79]
J. Nečas. On regularity of solutions to nonlinear variational inequalities for second-order elliptic systems. Rend. Mat., 8:481–498, 1975.
[80]
P. J. Olver. Applications of Lie groups to differential equations, volume 107 of Graduate Texts in Mathematics. Springer, 2nd edition, 1993.
[81]
P. Pedregal. Parametrized Measures and Variational Principles, volume 30 of
Progress in Nonlinear Differential Equations and their Applications. Birkhäuser,
1997.
[82]
Y. G. Reshetnyak. On the stability of conformal mappings in multidimensional
spaces. Siberian Math. J., 8:69–85, 1967.
[83]
R. T. Rockafellar. Convex Analysis, volume 28 of Princeton Mathematical Series.
Princeton University Press, 1970.
[84]
R. T. Rockafellar and R. J.-B. Wets. Variational analysis, volume 317 of Grundlehren
der mathematischen Wissenschaften. Springer, 1998.
[85]
W. Rudin. Real and Complex Analysis. McGraw–Hill, 3rd edition, 1986.
[86]
L. Simon. Schauder estimates by scaling. Calc. Var. Partial Differential Equations,
5:391–407, 1997.
[87]
E. M. Stein. Harmonic Analysis. Princeton University Press, 1993.
[88]
V. Šverák. Quasiconvex functions with subquadratic growth. Proc. Roy. Soc. London
Ser. A, 433:723–725, 1991.
[89]
V. Šverák. Rank-one convexity does not imply quasiconvexity. Proc. Roy. Soc.
Edinburgh Sect. A, 120:185–189, 1992.
[90]
V. Šverák and X. Yan. A singular minimizer of a smooth strongly convex functional
in three dimensions. Calc. Var. Partial Differential Equations, 10:213–221, 2000.
[91]
V. Šverák and X. Yan. Non-Lipschitz minimizers of smooth uniformly convex functionals. Proc. Natl. Acad. Sci. USA, 99:15269–15276, 2002.
[92]
L. Székelyhidi, Jr. The regularity of critical points of polyconvex functionals. Arch.
Ration. Mech. Anal., 172:133–152, 2004.
[93]
Q. Tang. Almost-everywhere injectivity in nonlinear elasticity. Proc. Roy. Soc. Edinburgh Sect. A, 109:79–95, 1988.
[94]
L. Tartar. Compensated compactness and applications to partial differential equations. In Nonlinear analysis and mechanics: Heriot-Watt Symposium, Vol. IV, volume 39 of Res. Notes in Math., pages 136–212. Pitman, 1979.
[95]
L. Tartar. The compensated compactness method applied to systems of conservation
laws. In Systems of nonlinear partial differential equations (Oxford, 1982), volume
111 of NATO Adv. Sci. Inst. Ser. C Math. Phys. Sci., pages 263–285. Reidel, 1983.
[96]
B. van Brunt. The calculus of variations. Universitext. Springer, 2004.
[97]
L. C. Young. Generalized curves and the existence of an attained absolute minimum
in the calculus of variations. C. R. Soc. Sci. Lett. Varsovie, Cl. III, 30:212–234, 1937.
[98]
L. C. Young. Generalized surfaces in the calculus of variations. Ann. of Math.,
43:84–103, 1942.
[99]
L. C. Young. Generalized surfaces in the calculus of variations. II. Ann. of Math.,
43:530–544, 1942.
[100] L. C. Young. Lectures on the calculus of variations and optimal control theory.
Saunders, 1969.
[101] L. C. Young. Lectures on the calculus of variations and optimal control theory.
Chelsea, 2nd edition, 1980. (reprinted by AMS Chelsea Publishing 2000).
Index
action of a measure, 111
adjugate matrix, 108
Ball existence theorem, 84
Ball–James rigidity theorem, 66
Banach–Alaoglu theorem, 112
barycenter, 111
biting convergence, 76
blow-up technique, 69
bootstrapping, 28
Borel σ -algebra, 110
Borel measure, 110
brachistochrone curve, 2
Caccioppoli inequality, 25
Carathéodory integrand, 14
Ciarlet–Nečas non-interpenetration condition, 86
coercivity, 12
weak, 14
cofactor matrix, 108
compactness principle for Young measures, 56
compensated compactness, 75
continuum mechanics, 6
convexity, 11
strong, 25
convolution, 115
Cramer rule, 108
critical point, 21
cut-off construction, 48
Dacorogna formula, 105
De Giorgi regularity theorem, 28
De Giorgi–Nash–Moser regularity theorem, 29
density theorem for Sobolev spaces, 114
difference quotient, 25
Direct Method, 12
for weak convergence, 13
with side constraint, 38
directional derivative, 19
Dirichlet integral, 5, 17, 33
disintegration theorem, 57
distributional cofactors, 81
distributional determinant, 82
double-well potential, 10, 87
duality pairing, 111
duality pairing, for measures, 110
Dunford–Pettis theorem, 112
elasticity tensor, 7
electric energy, 4
electric potential, 4
ellipticity, 21
Euler–Lagrange equation, 19
Evans partial regularity theorem, 74
extension–relaxation, 95, 96
Faraday law of induction, 4
Fatou lemma, 109
frame-indifference, 43
Frobenius matrix (inner) product, 107
Frobenius matrix norm, 107
Fundamental lemma of the Calculus of Variations, 23
Fundamental theorem of Young measure theory, 55
Gauss law, 4
Green–St. Venant strain tensor, 6
ground state in quantum mechanics, 5
growth bound
lower, 15
upper, 15
Hölder-continuity, 114
Hadamard inequality, 108
Hamiltonian, 5
harmonic function, 20
homogenization lemma, 100
hyperelasticity, 6
integrand, 11
invariance, 34
Jensen inequality, 111
Jensen-type inequality, 65
Kinderlehrer–Pedregal theorem, 102
Korn inequality, 17
Lagrange multiplier, 39
Lamé constants, 80
lamination, 47
Laplace equation, 20
Laplace operator, 20
Lavrentiev gap phenomenon, 31
Lebesgue dominated convergence theorem, 109
Lebesgue point, 110
Legendre–Hadamard condition, 104
linearized strain tensor, 7
Lipschitz domain, 11
Lipschitz-continuity, 115
lower semicontinuity, 12
maximal function, 63
Mazur lemma, 112
microstructure, 87, 95
minimizing sequence, 12
minor, 51
modulus of continuity, 93
mollification, 115
mollifier, 115
Monotone convergence lemma, 109
multi-index, 112
ordered, 51
Noether multipliers, 35
Noether theorem, 35
normal integrand, 76
null-Lagrangian, 51
o, 109
objective functional, 12
Ogden material, 7
optimal beating problem, 10
optimal saving problem, 8
Philosophy of Science, 1
Piola identity, 52
Poincaré–Friedrichs inequality, 113
Poisson equation, 4
polyconvexity, 78
Pratt theorem, 109
probability measure, 110
quasiaffine, 53
quasiconvex envelope, 88
quasiconvexity, 44
strong, 74
Radon measure, 110
rank of a minor, 51
rank-one convexity, 46
recovery sequence, 96
regular variational integral, 25
relaxation, 88, 91
relaxation theorem
abstract, 91
for integral functionals, 93
with Young measures, 96
Rellich–Kondrachov theorem, 114
Riesz representation theorem, 111
rigid body motion, 6
scalar problem, 18
Schauder estimate, 28, 29
Schrödinger equation, 5
stationary, 5, 41
Scorza Dragoni theorem, 15
side constraint, 37
singular set, 75
singular value, 108
singular value decomposition, 108
Sobolev embedding theorem, 114
Sobolev space, 112
solution
classical, 23
strong, 23
weak, 19
St. Venant–Kirchhoff material, 80
Statistical Mechanics, 1
stiffness tensor, 7
strong convergence
in L^p-spaces, 109
support, 112
symmetries of minimizers, 33
tensor product, 107
trace operator, 113
trace space, 113
trace theorem, 113
two-gradient inclusion
approximate, 66
exact, 66
two-gradient problem, 66
underlying deformation, 62
utility function, 8
variation
first, 21
variational principle, 1
vectorial problem, 18
Vitali convergence theorem, 110
Vitali covering theorem, 110
W^{2,2}_loc-regularity theorem, 25
wave equation, 23
wave function, 5
weak compactness in Banach spaces, 111
weak continuity, 38
weak continuity of minors, 53
weak derivative, 112
weak divergence, 113
weak gradient, 113
weak lower semicontinuity, 14
Young inequality, 107
Young measure, 55
elementary, 56
gradient, 62
homogeneous, 59
set of Young measures, 55