MA4G6 Lecture Notes Introduction to the Modern Calculus of Variations Filip Rindler Spring Term 2015 Filip Rindler Mathematics Institute University of Warwick Coventry CV4 7AL United Kingdom F.Rindler@warwick.ac.uk http://www.warwick.ac.uk/filiprindler Copyright ©2015 Filip Rindler. Version 1.1. Preface These lecture notes, written for the MA4G6 Calculus of Variations course at the University of Warwick, intend to give a modern introduction to the Calculus of Variations. I have tried to cover different aspects of the field and to explain how they fit into the “big picture”. This is not an encyclopedic work; many important results are omitted and sometimes I only present a special case of a more general theorem. I have, however, tried to strike a balance between a pure introduction and a text that can be used for later revision of forgotten material. The presentation is based around a few principles: • The presentation is quite “modern” in that I use several techniques which are perhaps not usually found in an introductory text or that have only recently been developed. • For most results, I try to use “reasonable” assumptions, not necessarily minimal ones. • When presented with a choice of how to prove a result, I have usually preferred the (in my opinion) most conceptually clear approach over more “elementary” ones. For example, I use Young measures in many instances, even though this comes at the expense of a higher initial burden of abstract theory. • Wherever possible, I first present an abstract result for general functionals defined on Banach spaces to illustrate the general structure of a certain result. Only then do I specialize to the case of integral functionals on Sobolev spaces. • I do not attempt to trace every result to its original source as many results have evolved over generations of mathematicians. Some pointers to further literature are given in the last section of every chapter. Comment... Most of what I present here is not original, I learned it from my teachers and colleagues, both through formal lectures and through informal discussions. Published works that influenced this course were the lecture notes on microstructure by S. Müller [73], the encyclopedic work on the Calculus of Variations by B. Dacorogna [25], the book on Young measures by P. Pedregal [81], Giusti’s more regularity theory-focused introduction to the Calculus of Variations [44], as well as lecture notes on several related courses by J. Ball, J. Kristensen, A. Mielke. Further texts on the Calculus of Variations are the elementary introductions by B. van Brunt [96] and B. Dacorogna [26], the more classical two-part treatise [39, 40] by M. Giaquinta & S. Hildebrandt, as well as the comprehensive treatment of ii PREFACE the geometric measure theory aspects of the Calculus of Variations by Giaquinta, Modica & Souček [41, 42]. In particular, I would like to thank Alexander Mielke and Jan Kristensen, who are responsible for my choice of research area. Through their generosity and enthusiasm in sharing their knowledge they have provided me with the foundation for my study and research. Furthermore, I am grateful to Richard Gratwick and the participants of the MA4G6 course for many helpful comments, corrections, and remarks. These lecture notes are a living document and I would appreciate comments, corrections and remarks – however small. 
They can be sent back to me via e-mail at F.Rindler@warwick.ac.uk or anonymously via http://tinyurl.com/MA4G6-2015. Moreover, every page contains its individual QR-code for immediate feedback; I encourage all readers to make use of them.

F. R.
January 2015

Contents

Preface
Contents
1 Introduction
1.1 The brachistochrone problem
1.2 Electrostatics
1.3 Stationary states in quantum mechanics
1.4 Hyperelasticity
1.5 Optimal saving and consumption
1.6 Sailing against the wind
2 Convexity
2.1 The Direct Method
2.2 Functionals with convex integrands
2.3 Integrands with u-dependence
2.4 The Euler–Lagrange equation
2.5 Regularity of minimizers
2.6 Lavrentiev gap phenomenon
2.7 Invariances and the Noether theorem
2.8 Side constraints
2.9 Notes and bibliographical remarks
3 Quasiconvexity
3.1 Quasiconvexity
3.2 Null-Lagrangians
3.3 Young measures
3.4 Gradient Young measures
3.5 Gradient Young measures and rigidity
3.6 Lower semicontinuity
3.7 Integrands with u-dependence
3.8 Regularity of minimizers
3.9 Notes and bibliographical remarks
4 Polyconvexity
4.1 Polyconvexity
4.2 Existence Theorems
4.3 Global injectivity
4.4 Notes and bibliographical remarks
5 Relaxation
5.1 Quasiconvex envelopes
5.2 Relaxation of integral functionals
5.3 Young measure relaxation
5.4 Characterization of gradient Young measures
5.5 Notes and bibliographical remarks
A Prerequisites
A.1 General facts and notation
A.2 Measure theory
A.3 Functional analysis
A.4 Sobolev and other function spaces
Bibliography
Index

Chapter 1

Introduction

Physical modeling is a delicate business: trying to balance the validity of the resulting model, that is, its agreement with nature and its predictive capabilities, with the feasibility of its mathematical treatment is often highly non-straightforward. In this quest to formulate a useful picture of an aspect of the world, it turns out on surprisingly many occasions that the formulation of the model becomes clearer, more compact, or more descriptive if one introduces some form of a variational principle, that is, if one can define a quantity, such as energy or entropy, which obeys a minimization, maximization or saddle-point law.

How "fundamental" such a scalar quantity is perceived depends on the situation: For example, in classical mechanics, one calls forces conservative if they can be expressed as the derivative of an "energy potential" along a curve. It turns out that many, if not most, forces in physics are conservative, which seems to imply that the concept of energy is fundamental. Furthermore, Einstein's famous formula E = mc² expresses the equivalence of mass and energy, positioning energy as the "most" fundamental quantity, with mass becoming "condensed" energy. On the other hand, the other ubiquitous scalar physical quantity, the entropy, has a more "artificial" flavor as a measure of missing information in a model, as is now well understood in Statistical Mechanics, where entropy becomes a derived quantity.

We leave it to the field of Philosophy of Science to understand the nature of variational concepts. In the words of William Thomson, the Lord Kelvin (quotes prove nothing, however; Thomson also said "X-rays will prove to be a hoax"):

The fact that mathematics does such a good job of describing the Universe is a mystery that we don't understand. And a debt that we will probably never be able to repay.

Our approach to variational quantities here is a pragmatic one: We see them as providing structure to a problem and as enabling different mathematical, so-called variational, methods. For instance, in elasticity theory it is usually unrealistic that a system will tend to a global minimizer by itself, but this does not mean that, as an approximation, such a principle cannot be useful in practice. On a fundamental level, there probably is no minimization as such in physics, only quantum-mechanical interactions (or whatever lies "below" quantum mechanics). However, if we wait long enough, the inherent noise in a realistic system will move our system around and it will settle with high probability in a low-energy state.

In this course, we will focus solely on minimization problems for integral functionals of the form (Ω ⊂ R^d open and bounded)

\[ \int_\Omega f(x, u(x), \nabla u(x)) \,dx \;\to\; \min, \qquad u \colon \Omega \to \mathbb{R}^m, \]

possibly under conditions on the boundary values of u and further side constraints. These problems form the original core of the Calculus of Variations and are as relevant today as they have always been. The systematic understanding of these integral functionals starts in Euler's and Bernoulli's times in the late 1600s and the early 1700s, and their study was boosted into the modern age by Hilbert's 19th, 20th, and 23rd problems, formulated in 1900 [47]. The excitement has never stopped since and new and important discoveries are made every year.
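Although everything in this course is analytical, it can help to see concretely what "minimizing an integral functional over functions with prescribed boundary values" means. The following sketch (the one-dimensional model integrand, the grid, and the choice of optimizer are illustrative assumptions, not taken from the text) discretizes such a functional and minimizes over the grid values of u; for this particular convex model problem the exact minimizer is known, so we can compare.

```python
import numpy as np
from scipy.optimize import minimize

# Illustrative model problem: minimize F[u] = \int_0^1 (1/2)|u'(x)|^2 + u(x) dx
# over functions u with the boundary values u(0) = 0, u(1) = 0.
N = 100
x = np.linspace(0.0, 1.0, N + 1)
h = x[1] - x[0]

def F(interior):
    u = np.concatenate(([0.0], interior, [0.0]))   # enforce the boundary values
    du = np.diff(u) / h                            # forward-difference "gradient"
    um = 0.5 * (u[:-1] + u[1:])                    # midpoint values of u
    return np.sum((0.5 * du**2 + um) * h)          # Riemann sum of the integrand

res = minimize(F, np.zeros(N - 1), method="L-BFGS-B")
u_opt = np.concatenate(([0.0], res.x, [0.0]))

# The exact minimizer of this convex model problem is u(x) = x(x-1)/2; the printed
# discrepancy should be small (discretization and optimizer tolerance only).
print(np.max(np.abs(u_opt - x * (x - 1) / 2)))
```

Of course, such an experiment is no substitute for the existence and lower semicontinuity theory developed in Chapter 2; it is only meant to make the objects we study tangible.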
We start this course by looking at a parade of examples, which we treat at varying levels of detail. All these problems will be investigated further along the course once we have developed the necessary mathematical tools.

1.1 The brachistochrone problem

In June 1696 Johann Bernoulli published a problem in the journal Acta Eruditorum, which can be seen as the time of birth of the Calculus of Variations (the name, however, is from Leonhard Euler's 1766 treatise Elementa calculi variationum). Additionally, Bernoulli sent a letter containing the question to Gottfried Wilhelm Leibniz on 9 June 1696, who returned his solution only a few days later on 16 June, and commented that the problem tempted him "like the apple tempted Eve" because of its beauty. Isaac Newton published a solution (after the problem had reached him) without giving his identity, but Bernoulli identified him "ex ungue leonem" (from Latin, "by the lion's claw"). The problem was formulated as follows:

Given two points A and B in a vertical [meaning "not horizontal"] plane, one shall find a curve AMB for a movable point M, on which it travels from A to B in the shortest time, only driven by its own weight.

(The German original reads "Wenn in einer verticalen Ebene zwei Punkte A und B gegeben sind, soll man dem beweglichen Punkte M eine Bahn AMB anweisen, auf welcher er von A ausgehend vermöge seiner eigenen Schwere in kürzester Zeit nach B gelangt.")

The resulting curve is called the brachistochrone (from Ancient Greek, "shortest time") curve.

We formulate the problem as follows: We look for the curve y connecting the origin (0, 0) to the point (x̄, ȳ), where x̄ > 0, ȳ < 0, such that a point mass m > 0 slides from rest at (0, 0) to (x̄, ȳ) quickest among all such curves. See Figure 1.1 for several possible slide paths.

Figure 1.1: Several slide curves from the origin to (x̄, ȳ).

We parametrize a point (x, y) on the curve by the time t ≥ 0. The point mass has kinetic and potential energies

\[ E_{\mathrm{kin}} = \frac{m}{2}\left\{ \left(\frac{dx}{dt}\right)^2 + \left(\frac{dy}{dt}\right)^2 \right\} = \frac{m}{2}\left(\frac{dx}{dt}\right)^2\left\{ 1 + \left(\frac{dy}{dx}\right)^2 \right\}, \qquad E_{\mathrm{pot}} = mgy, \]

where g ≈ 9.81 m/s² is the (constant) gravitational acceleration on Earth. The total energy E_kin + E_pot is zero at the beginning and conserved along the path. Hence, for all y we have

\[ \frac{m}{2}\left(\frac{dx}{dt}\right)^2\left\{ 1 + \left(\frac{dy}{dx}\right)^2 \right\} = -mgy. \]

We can solve this for dt/dx (where t = t(x) is the inverse of the x-parametrization) to get

\[ \frac{dt}{dx} = \sqrt{\frac{1 + (y')^2}{-2gy}} \qquad \left( \frac{dt}{dx} \ge 0 \right), \]

where we wrote y' = dy/dx. Integrating over the whole x-length along the curve from 0 to x̄, we get for the total time duration T[y] that

\[ T[y] = \frac{1}{\sqrt{2g}} \int_0^{\bar{x}} \sqrt{\frac{1 + (y')^2}{-y}} \,dx. \]

We may drop the constant in front of the integral, which does not influence the minimization problem, and set x̄ = 1 by reparametrization, to arrive at the problem

\[ \mathscr{F}[y] := \int_0^1 \sqrt{\frac{1 + (y')^2}{-y}} \,dx \;\to\; \min, \qquad y(0) = 0, \quad y(1) = \bar{y} < 0. \]

Clearly, the integrand is convex in y', which will be important for the solution theory. We will come back to this problem in Example 2.34.

A related problem concerns Fermat's Principle expressing that light (in vacuum or in a medium) always takes the fastest path between two points, from which many optical laws, e.g. about reflection and refraction, can be derived.

1.2 Electrostatics

Consider an electric charge density ρ : R³ → R (in units of C/m³) in three-dimensional vacuum.
Let E : R³ → R³ (in V/m) and B : R³ → R³ (in T = Vs/m²) be the electric and magnetic fields, respectively, which we assume to be constant in time (hence electrostatics). Assuming that R³ is a linear, homogeneous, isotropic electric material, the Gauss law for electricity reads

\[ \nabla \cdot E = \operatorname{div} E = \frac{\rho}{\varepsilon_0}, \]

where ε₀ ≈ 8.854 · 10⁻¹² C/(Vm). Moreover, we have the Faraday law of induction

\[ \nabla \times E = \operatorname{curl} E = -\frac{dB}{dt} = 0. \]

Thus, since E is curl-free, there exists an electric potential φ : R³ → R (in V) such that E = −∇φ. Combining this with the Gauss law, we arrive at the Poisson equation,

\[ \Delta\varphi = \nabla \cdot [\nabla\varphi] = -\frac{\rho}{\varepsilon_0}. \tag{1.1} \]

We can also look at electrostatics in a variational way: We use the norming condition φ(0) = 0. Then, the electric potential energy U_E(x; q) of a point charge q > 0 (in C) at the point x ∈ R³ in the electric field E is given by the path integral (which does not depend on the path chosen since E is a gradient)

\[ U_E(x; q) = -\int_0^x qE \cdot ds = -\int_0^1 qE(hx) \cdot x \,dh = q\varphi(x). \]

Thus, the total electric energy of our charge distribution ρ is (the factor 1/2 is necessary to count mutual reaction forces correctly)

\[ U_E := \frac{1}{2}\int_{\mathbb{R}^3} \rho\varphi \,dx = \frac{\varepsilon_0}{2}\int_{\mathbb{R}^3} (\nabla \cdot E)\,\varphi \,dx, \]

which has units of CV = J. Using (∇ · E)φ = ∇ · (Eφ) − E · (∇φ), the divergence theorem, and the natural assumption that φ vanishes at infinity, we get further

\[ U_E = \frac{\varepsilon_0}{2}\int_{\mathbb{R}^3} \nabla \cdot (E\varphi) - E \cdot (\nabla\varphi) \,dx = -\frac{\varepsilon_0}{2}\int_{\mathbb{R}^3} E \cdot (\nabla\varphi) \,dx = \frac{\varepsilon_0}{2}\int_{\mathbb{R}^3} |\nabla\varphi|^2 \,dx. \]

The integral ∫_Ω |∇φ|² dx is called the Dirichlet integral. In Example 2.15 we will see that (1.1) is equivalent to the minimization problem

\[ U_E - \int_{\mathbb{R}^3} \rho\varphi \,dx = \int_{\mathbb{R}^3} \frac{\varepsilon_0}{2}|\nabla\varphi|^2 - \rho\varphi \,dx \;\to\; \min. \]

The second term is the energy stored in the electric field caused by its interaction with the charge density ρ.

1.3 Stationary states in quantum mechanics

The non-relativistic evolution of a quantum mechanical system with N degrees of freedom in an electrical field is described completely through its wave function Ψ : R^N × R → C, which satisfies the Schrödinger equation

\[ i\hbar\,\frac{d}{dt}\Psi(x,t) = \left[ \frac{-\hbar^2}{2\mu}\Delta + V(x,t) \right]\Psi(x,t), \qquad x \in \mathbb{R}^N,\ t \in \mathbb{R}, \]

where ħ ≈ 1.05 · 10⁻³⁴ Js is the reduced Planck constant, µ > 0 is the reduced mass, and V = V(x,t) ∈ R is the potential energy. The (self-adjoint) operator H := −(2µ)⁻¹ħ²Δ + V is called the Hamiltonian of the system.

The value of the wave function itself at a given point in spacetime has no physical interpretation, but according to the Copenhagen Interpretation of quantum mechanics, |Ψ(x)|² is the probability density of finding a particle at the point x. In order for |Ψ|² to be a probability density, we need to impose the side constraint ∥Ψ∥_{L²} = 1. In particular, Ψ(x,t) has to decay as |x| → ∞.

Of particular interest are the so-called stationary states, that is, solutions of the stationary Schrödinger equation

\[ \left[ \frac{-\hbar^2}{2\mu}\Delta + V(x) \right]\Psi(x) = E\Psi(x), \qquad x \in \mathbb{R}^N, \]

where E > 0 is an energy level. We then solve this eigenvalue problem (in a weak sense) for Ψ ∈ W^{1,2}(R^N; C) with ∥Ψ∥_{L²} = 1 (see Appendix A.4 for a collection of facts about Sobolev spaces like W^{1,2}). If we are just interested in the lowest-energy state, the so-called ground state, we can find minimizers of the energy functional

\[ \mathscr{E}[\Psi] := \int_{\mathbb{R}^N} \frac{\hbar^2}{2\mu}|\nabla\Psi(x)|^2 + \frac{1}{2}V(x)|\Psi(x)|^2 \,dx, \]

again under the side constraint ∥Ψ∥_{L²} = 1. The two parts of the integral above correspond to kinetic and potential energy, respectively. We will continue this investigation in Example 2.39.
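To make the ground-state problem concrete, note that minimizing ⟨Ψ, HΨ⟩ under the constraint ∥Ψ∥_{L²} = 1 becomes, after discretization, a symmetric eigenvalue problem: the minimum is the smallest eigenvalue of the discretized Hamiltonian, attained at the corresponding eigenvector. A minimal sketch, under the illustrative assumptions ħ = µ = 1, V(x) = x²/2, and a truncated one-dimensional domain (none of these choices come from the text):

```python
import numpy as np

# Ground state of the 1D harmonic oscillator: discretize H = -(1/2) d^2/dx^2 + V
# on a truncated domain and take the smallest eigenvalue/eigenvector.
N = 1000
x = np.linspace(-8.0, 8.0, N)
h = x[1] - x[0]
V = 0.5 * x**2

main = 1.0 / h**2 + V                    # diagonal of -(1/2)*Laplacian + V
off = -0.5 / h**2 * np.ones(N - 1)       # off-diagonal entries of the Laplacian
H = np.diag(main) + np.diag(off, 1) + np.diag(off, -1)

E, Psi = np.linalg.eigh(H)               # symmetric eigenvalue problem
psi0 = Psi[:, 0] / np.sqrt(h)            # normalize so that sum(|psi0|^2) * h = 1
print(E[0])                              # ground-state energy
```

For this potential the printed ground-state energy should be close to the exact value 1/2, up to the discretization error of the truncated grid.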
1.4 Hyperelasticity

Elasticity theory is one of the most important theories of continuum mechanics, that is, the study of the mechanics of (idealized) continuous media. We will not go into much detail about elasticity modeling here and refer to the book [20] for a thorough introduction.

Consider a body of mass occupying a domain Ω ⊂ R³, called the reference configuration. If we deform the body, any material point x ∈ Ω is mapped into a spatial point y(x) ∈ R³ and we call y(Ω) the deformed configuration. For a suitable continuum mechanics theory, we also need to require that y : Ω → y(Ω) is a differentiable bijection and that it is orientation-preserving, i.e. that

det ∇y(x) > 0 for all x ∈ Ω.

For convenience one also introduces the displacement u(x) := y(x) − x. One can show (using the implicit function theorem and the mean value theorem) that the orientation-preserving condition and the invertibility are satisfied if

\[ \sup_{x \in \Omega} |\nabla u(x)| < \delta(\Omega) \tag{1.2} \]

for a domain-dependent (small) constant δ(Ω) > 0.

Next, we need a measure of local "stretching", called a strain tensor, which should serve as the parameter to a local energy density. On physical grounds, rigid body motions, that is, deformations y(x) = Rx + u₀ with a rotation R ∈ R³ˣ³ (Rᵀ = R⁻¹ and det R = 1) and u₀ ∈ R³, should not cause strain. In this sense, strain measures the deviation of the displacement from a rigid body motion. One popular choice is the Green–St. Venant strain tensor

\[ E := \frac{1}{2}\bigl( \nabla u + \nabla u^T + \nabla u^T \nabla u \bigr). \tag{1.3} \]

(There is an array of choices for the strain tensor, but they are mostly equivalent within nonlinear elasticity.)

We first consider fully nonlinear ("finite strain") elasticity. For our purposes we simply postulate the existence of a stored-energy density W : R³ˣ³ → [0, ∞] and an external body force field b : Ω → R³ (e.g. gravity) such that

\[ \mathscr{F}[y] := \int_\Omega W(\nabla y) - b \cdot y \,dx \]

represents the total elastic energy stored in the system. If the elastic energy can be written in this way as ∫_Ω W(∇y(x)) dx, we call the material hyperelastic. In applications, W is often given as depending on the Green–St. Venant strain tensor E instead of ∇y, but for our mathematical theory, the above form is more convenient. We require several properties of W for arguments F ∈ R³ˣ³:

(i) The undeformed state costs no energy: W(I) = 0.

(ii) Invariance under change of frame (rotating or translating the (inertial) frame of reference must give the same result): W(RF) = W(F) for all R ∈ SO(3).

(iii) Infinite compression costs infinite energy: W(F) → +∞ as det F ↓ 0.

(iv) Infinite stretching costs infinite energy: W(F) → +∞ as |F| → ∞.

The main problem of nonlinear hyperelasticity is to minimize F as above over all y : Ω → R³ with given boundary values. Of course, it is not a-priori clear in which space we should look for a solution. Indeed, this depends on the growth properties of W. For example, for the prototypical choice W(F) := dist(F, SO(3))², which however does not satisfy (iii) from our list of requirements, we would look for square-integrable functions. More realistic in applications are W of the form

\[ W(F) := \sum_{i=1}^{M} a_i \operatorname{tr}\bigl[ (F^T F)^{\gamma_i/2} \bigr] + \sum_{j=1}^{N} b_j \operatorname{tr}\bigl[ \operatorname{cof}(F^T F)^{\delta_j/2} \bigr] + \Gamma(\det F), \qquad F \in \mathbb{R}^{3 \times 3}, \]

where a_i > 0, γ_i ≥ 1, b_j > 0, δ_j ≥ 1, and Γ : R → R ∪ {+∞} is a convex function with Γ(d) → +∞ as d ↓ 0 and Γ(d) = +∞ for d ≤ 0. These materials are called Ogden materials and occur in a wide range of applications.

In the setting of linearized elasticity, we make the "small strain" assumption that the quadratic term in (1.3) can be neglected and that (1.2) holds.
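It is instructive to check numerically that the Green–St. Venant tensor (1.3) indeed vanishes for rigid body motions, while its linearization does not: the discrepancy is of second order in the rotation angle, which is exactly the quadratic term that the small-strain assumption discards. A short sketch (the rotation angle is an arbitrary illustrative choice):

```python
import numpy as np

def green_st_venant(G):
    # E = (1/2)(grad u + grad u^T + grad u^T grad u), with G = grad u
    return 0.5 * (G + G.T + G.T @ G)

def linearized(G):
    # linearized strain: (1/2)(grad u + grad u^T)
    return 0.5 * (G + G.T)

theta = 0.3                                   # illustrative (not small) rotation angle
R = np.array([[np.cos(theta), -np.sin(theta), 0.0],
              [np.sin(theta),  np.cos(theta), 0.0],
              [0.0,            0.0,           1.0]])
G = R - np.eye(3)       # displacement gradient of the rigid motion y(x) = Rx + u0

print(np.linalg.norm(green_st_venant(G)))    # zero up to rounding: no strain
print(np.linalg.norm(linearized(G)))         # of order theta^2: spurious strain
```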
Under this small-strain assumption we work with the linearized strain tensor

\[ \mathscr{E}u := \frac{1}{2}\bigl( \nabla u + \nabla u^T \bigr). \]

In this infinitesimal deformation setting, the displacements that do not create strain are precisely the skew-affine maps u(x) = Qx + u₀ with Qᵀ = −Q and u₀ ∈ R³. (This becomes more meaningful when considering a bit more algebra: The Lie group SO(3) of rotations has as its Lie algebra Lie(SO(3)) = so(3) the space of all skew-symmetric matrices, which then can be seen as "infinitesimal rotations".) For linearized elasticity we consider an energy of the special "quadratic" form

\[ \mathscr{F}[u] := \int_\Omega \frac{1}{2}\,\mathscr{E}u : \mathbf{C}\,\mathscr{E}u - b \cdot u \,dx, \]

where C(x) = C_{ijkl}(x) is a symmetric, strictly positive definite (A : C(x)A > c|A|² for some c > 0) fourth-order tensor, called the elasticity/stiffness tensor, and b : Ω → R³ is our external body force. Thus we will always look for solutions in W^{1,2}(Ω; R³). For homogeneous, isotropic media, C does not depend on x or the direction of strain, which translates into the additional condition

(AR) : C(x)(AR) = A : C(x)A for all x ∈ Ω, A ∈ R³ˣ³ and R ∈ SO(3).

In this case, it can be shown that F simplifies to

\[ \mathscr{F}[u] = \int_\Omega \mu|\mathscr{E}u|^2 + \frac{1}{2}\Bigl(\kappa - \frac{2}{3}\mu\Bigr)|\operatorname{tr}\mathscr{E}u|^2 - b \cdot u \,dx \]

for µ > 0 the shear modulus and κ > 0 the bulk modulus, which are material constants. For example, for cold-rolled steel, µ ≈ 75 GPa and κ ≈ 160 GPa.

1.5 Optimal saving and consumption

Consider a capitalist worker earning a (constant) wage w per year, which he can either spend on consumption or save. Denote by S(t) the savings at time t, where t ∈ [0, T] is in years, with t = 0 denoting the beginning of his work life and t = T his retirement. Further, let C(t) be the consumption rate (consumption per time) at time t. On the saved capital, the worker earns interest, say with gross-continuous rate ρ > 0, meaning that a capital amount m > 0 grows as exp(ρt)m. If we were given an APR ρ₁ > 0 instead of ρ, we could compute ρ = ln(1 + ρ₁). We further assume for simplicity that the salary is paid continuously, not in intervals; so w is really the rate of pay, given in money per time. Then, the worker's savings evolve according to the differential equation

\[ \dot{S}(t) = w + \rho S(t) - C(t). \tag{1.4} \]

Now, assume that our worker is mathematically inclined and wants to optimize the happiness due to consumption in his life by finding the optimal amount of consumption at any given time. Being a pure capitalist, the worker's happiness only depends on his consumption rate C. So, if we denote by U(C) the utility function, that is, the "happiness" due to the consumption rate C, our worker wants to find C : [0, T] → R such that

\[ \mathscr{H}[C] := \int_0^T U(C(t)) \,dt \]

is maximized. The choice of U depends on our worker's personality, but it is probably realistic to assume that there is a law of diminishing returns, i.e. for twice as much consumption, our worker is happier, but not twice as happy. So, let us assume U' > 0 and U'(C) → 0 as C → ∞. Also, we should have U(0) = −∞ (starvation). Moreover, it is realistic for U to be concave, which implies that there are no local maxima. One function that satisfies all of these requirements is the logarithm, and so we use

U(C) = ln(C), C > 0.

Let us also assume that the worker starts with no savings, S(0) = 0, and at the end of his work life wants to retire with savings S(T) = S_T ≥ 0.
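Before deriving the optimality condition, it may help to see the savings dynamics (1.4) in action for a few fixed consumption plans; the variational problem below then asks for the best such plan. A minimal simulation sketch (all parameter values are illustrative assumptions, not taken from the text):

```python
import numpy as np

# Forward Euler for the savings dynamics (1.4): S' = w + rho*S - C, with a *given*
# consumption plan C(t); the accumulated utility is H[C] = \int_0^T ln(C(t)) dt.
w, rho, T, S0 = 30_000.0, 0.03, 40.0, 0.0
n = 4000
dt = T / n

def simulate(C):
    S, utility = S0, 0.0
    for k in range(n):
        t = k * dt
        utility += np.log(C(t)) * dt         # accumulate utility
        S += (w + rho * S - C(t)) * dt       # Euler step of (1.4)
    return S, utility

print(simulate(lambda t: w))          # consume the wage as it arrives: S stays at 0
print(simulate(lambda t: 0.8 * w))    # consume less: positive terminal savings, lower utility
```

Consuming exactly the wage keeps the savings at zero, while consuming less builds up savings at the cost of total utility; Example 2.18 will identify the optimal trade-off in closed form.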
Rearranging (1.4) for C(t) and plugging this into the formula for H , we therefore want to solve the optimal saving problem ∫ T F [S] := − ln(w + ρ S(t) − Ṡ(t)) dt → min, 0 S(0) = 0, S(T ) = ST ≥ 0. Comment... This will be solved in Example 2.18. 1.6. SAILING AGAINST THE WIND 9 Figure 1.2: Sailing against the wind in a channel. 1.6 Sailing against the wind Every sailor knows how to sail against the wind by “beating”: One has to sail at an angle of approximately 45◦ to the wind6 , then tack (turn the bow through the wind) and finally, after the sail has caught the wind on the other side, continue again at approximately 45◦ to the wind. Repeating this procedure makes the boat follow a zig-zag motion, which gives a net movement directly against the wind, see Figure 1.2. A mathematically inclined sailor might ask the question of “how often to tack”. In an idealized model we can assume that tacking costs no time and that the forward sailing speed vs of the boat depends on the angle α to the wind as follows (at least qualitatively): vs (α ) = − cos(4α ), which has maxima at α = ±45◦ (in real boats, the maximum might be at a lower angle, i.e. “closer to the wind”). Assume furthermore that our sailor is sailing along a straight river with the current. Now, the current is fastest in the middle of the river and goes down to zero at the banks. In fact, a good approximation would be the formula of Poiseuille (channel) flow, which can be derived from the flow equations of fluids: At distance r from the center of the river the current’s flow speed is approximately ( ) r2 vc (r) := vmax 1 − 2 , R where R > 0 is half the width of the river. optimal angle depends on the boat. Ask a sailor about this and you will get an evening-filling lecture. Comment... 6 The 10 CHAPTER 1. INTRODUCTION If we denote by r(t) the distance of our boat from the middle of the channel at time t ∈ [0, T ], then the total speed (called the “velocity made good” in sailing parlance) is v(t) := vs (arctan r′ (t)) + vc (r(t)) ( ) r(t)2 = − cos(4 arctan r (t)) + vmax 1 − 2 . R ′ The key to understand this problem is now the fact that a 7→ − cos(4 arctan a) has two maxima at a = ±1. We say that this function is a double-well potential. The total forward distance traveled over the time interval [0, T ] is ( ) ∫ T ∫ T r(t)2 ′ v(t) dt = − cos(4 arctan r (t)) + vmax 1 − 2 dt. R 0 0 If we also require the initial and terminal conditions r(0) = r(T ) = 0, we arrive at the optimal beating problem: ( ) ∫ T r(t)2 ′ F [r] := cos(4 arctan r (t)) − vmax 1 − 2 dt → min, R 0 r(0) = r(T ) = 0, |r(t)| ≤ R. Comment... Our intuition tells us that in this idealized model, where tacking costs no time, we should be tacking as “infinitely fast” in order to stay in the middle of the river. Later, once we have advanced tools at our disposal, we will make this idea precise, see Example 5.6. Chapter 2 Convexity In the introduction we got a glimpse of the many applications of minimization problems in physics, technology, and economics. In this chapter we start to develop the abstract theory. Consider a minimization problem of the form ∫ F [u] := f (x, u(x), ∇u(x)) dx → min, Ω over all u ∈ W1,p (Ω; Rm ) with u|∂ Ω = g. Here, and throughout the course (if not otherwise stated) we will make the standard assumption that Ω ⊂ Rd is a bounded Lipschitz domain, that is, Ω is open, bounded, and has a Lipschitz boundary. 
Furthermore, we assume 1 < p < ∞ and that the integrand

f : Ω × R^m × R^{m×d} → R

is measurable in the first and continuous in the second and third arguments (hence f is a Carathéodory integrand). For the prescribed boundary values g we assume g ∈ W^{1−1/p,p}(∂Ω; R^m). In this context recall that W^{1−1/p,p}(∂Ω; R^m) is the space of all traces of functions u ∈ W^{1,p}(Ω; R^m); see Appendix A.4 for some background on Sobolev spaces. These conditions will be assumed from now on unless stated otherwise. Furthermore, to make the above integral finite, we assume the growth bound

| f(x, v, A)| ≤ M(1 + |v|^p + |A|^p) for some M > 0.

Below, we will investigate the solvability and regularity properties of this minimization problem; in particular, we will take a close look at the way in which convexity properties of f in its gradient (third) argument determine whether F is lower semicontinuous, which is the decisive attribute in the fundamental Direct Method of the Calculus of Variations. We will also consider the partial differential equation associated with the minimization problem, the so-called Euler–Lagrange equation, which is a necessary (but not sufficient) condition for minimizers. This leads naturally to the question of regularity of solutions. Finally, we also treat invariances, leading to the famous Noether theorem, before having a glimpse at side constraints.

Most results here are already formulated for vector-valued u, as this has many applications. Some aspects, however, in particular regularity theory, are specific to scalar problems, where u is assumed to be merely real-valued.

2.1 The Direct Method

Fundamental to all of the existence theorems in this text is the conceptually simple Direct Method of the Calculus of Variations ("direct" refers to the fact that we construct solutions without employing a differential equation). Let X be a topological space (e.g. a Banach space with the strong or weak topology; in fact, we only need a space together with a notion of convergence, as all our arguments will involve sequences as opposed to more general topological tools such as nets, and in all of this text we use "topology" and "convergence" interchangeably) and let F : X → R ∪ {+∞} be our objective functional satisfying the following two assumptions:

(H1) Coercivity: For all Λ ∈ R, the sublevel set { x ∈ X : F[x] ≤ Λ } is sequentially relatively compact, that is, if F[x_j] ≤ Λ for a sequence (x_j) ⊂ X and some Λ ∈ R, then (x_j) has a convergent subsequence.

(H2) Lower semicontinuity: For all sequences (x_j) ⊂ X with x_j → x it holds that

F[x] ≤ liminf_{j→∞} F[x_j].

Consider the abstract minimization problem

F[x] → min over all x ∈ X. (2.1)

Then, the Direct Method is the following simple strategy:

Theorem 2.1 (Direct Method). Assume that F is both coercive and lower semicontinuous. Then, the abstract problem (2.1) has at least one solution, that is, there exists x* ∈ X with F[x*] = min { F[x] : x ∈ X }.

Proof. Let us assume that there exists at least one x ∈ X such that F[x] < +∞; otherwise, any x ∈ X is a "solution" to the (degenerate) minimization problem. To construct a minimizer we take a minimizing sequence (x_j) ⊂ X such that

\[ \lim_{j\to\infty} \mathscr{F}[x_j] = \alpha := \inf\bigl\{\, \mathscr{F}[x] : x \in X \,\bigr\} < +\infty. \]

Then, since convergent sequences are bounded, there exists Λ ∈ R such that F[x_j] ≤ Λ for all j ∈ N. Hence, by the coercivity (H1), the sequence (x_j) is contained in a compact subset of X.
Now select a subsequence, which here as in the following we do not renumber, such that x j → x∗ ∈ X. By the lower semicontinuity (H2), we immediately conclude α ≤ F [x∗ ] ≤ lim inf F [x j ] = α . j→∞ Thus, F [x∗ ] = α and x∗ is the sought minimizer. Example 2.2. Using the Direct Method, one can easily see that { 1 − t if t < 0, h(t) := t ∈ R (= X), t if t ≥ 0, has minimizer t = 0, despite not even being continuous there. Despite its nearly trivial proof, the Direct Method is very useful and flexible in applications. Of course, neither coercivity nor lower semicontinuity are necessary for the existence of a minimizer. For instance, we really only needed coercivity and lower semicontinuity along a minimizing sequence, but this weaker condition is hard to check in most practical situations; however, such a refined approach can be useful in special cases. The abstract principle of the Direct Method pushes the difficulty in proving existence of a minimizer toward establishing coercivity and lower semicontinuity. This, however, is a big advantage, since we have many tools at our disposal to establish these two hypotheses separately. In particular, for integral functionals lower semicontinuity is tightly linked to convexity properties of the integrand, as we will see throughout this course. At this point it is crucial to observe how coercivity and lower semicontinuity interact with the topology (or, more accurately, convergence) in X: If we choose a stronger topology, i.e. one for which there are fewer convergent sequences, then it becomes easier for F to be lower semicontinuous, but harder for F to be coercive. The opposite holds if choosing a weaker topology. In the mathematical treatment of a problem we are most likely in a situation where F and X are given by the concrete problem. We then need to find a suitable topology in which we can establish both coercivity and lower semicontinuity. In this text, X will always be an infinite-dimensional Banach space and we have a real choice between using the strong or weak convergence. Usually, it turns out that coercivity with respect to the strong convergence is false since strongly compact sets in infinitedimensional spaces are very restricted, whereas coercivity with respect to the weak convergence is true under reasonable assumptions. However, whereas strong lower semicontinuity poses few challenges, lower semicontinuity with respect to weakly converging sequences is a delicate matter and we will spend considerable time on this topic. We will always use the Direct Method in the following version: Comment... Theorem 2.3 (Direct Method for weak convergence). Let X be a Banach space and let F : X → R ∪ {+∞}. Assume the following: 14 CHAPTER 2. CONVEXITY (WH1) Weak Coercivity: For all Λ ∈ R, the sublevel set { x ∈ X : F [x] ≤ Λ } is sequentially weakly relatively compact, that is, if F [x j ] ≤ Λ for a sequence (x j ) ⊂ X and some Λ ∈ R, then (x j ) has a weakly convergent subsequence. (WH2) Weak lower semicontinuity: For all sequences (x j ) ⊂ X with x j ⇀ x (weak convergence X) it holds that F [x] ≤ lim inf F [x j ]. j→∞ Then, the minimization problem F [x] → min over all x ∈ X, has at least one solution. Before we move on to the more concrete theory, let us record one further remark: While it might appear as if “nature does not give us the topology” and it is up to mathematicians to “invent” a suitable one, it is remarkable that the topology that turns out to be mathematically convenient is also often physically relevant. 
This is for instance expressed in the following observation: The only phenomena which are present in weakly but not strongly converging sequences are oscillations and concentrations, as can be seen in the classical Vitali Convergence Theorem A.6. On the other hand, these phenomena very much represent physical effects, for example in crystal microstructure. Thus, the fact that we work with the weak convergence means that we have to consider microstructure (which might or might not be exhibited), a fact that we will come back to many times throughout this course. 2.2 Functionals with convex integrands Returning to the minimization problem at the beginning of this chapter, we first consider the minimization problem for the simpler functional F [u] := ∫ Ω f (x, ∇u(x)) dx over all u ∈ W1,p (Ω; Rm ) for some 1 < p < ∞ to be chosen later (depending on lower growth properties of f ). The following lemma together with (2.3) allows us to conclude that F [u] is indeed well-defined. Lemma 2.4. Let f : Ω × RN → R be a Carathéodory integrand, that is, (i) x 7→ f (x, A) is Lebesgue-measurable for all fixed A ∈ RN . Comment... (ii) A 7→ f (x, A) is continuous for all fixed x ∈ Ω. 2.2. FUNCTIONALS WITH CONVEX INTEGRANDS 15 Then, for any Lebesgue-measurable function V : Ω → RN the composition x 7→ f (x,V (x)) is Lebesgue-measurable. We will prove this lemma via the following “Lusin-type” theorem for Carathéodory functions: Theorem 2.5 (Scorza Dragoni). Let f : Ω × RN → R be Carathéodory. Then, there exists an increasing sequence of compact sets Sk ⊂ Ω (k ∈ N) with |Ω \ Sk | ↓ 0 such that f |Sk ×RN is continuous. See Theorem 6.35 in [37] for a proof of this theorem, which is rather measure-theoretic and similar to the proof of the Lusin theorem, cf. Theorem 2.24 in [85]. Proof of Lemma 2.4. Let Sk ⊂ Ω (k ∈ N) be as in the Scorza Dragoni Theorem. Set fk := f |Sk ×RN and gk (x) := fk (x,V (x)), x ∈ Sk . Then, for any open set E ⊂ R, the pre-image of E under gk is the set of all x ∈ Sk such that (x,V (x)) ∈ fk−1 (E). As fk is continuous, fk−1 (E) is open and from the Lebesgue measurability of the product function x 7→ (x,V (x)) we infer that g−1 k (E) is a Lebesgue-measurable subset of Sk . We can extend the definition of gk to all of Ω by setting gk (x) := 0 whenever x ∈ Ω \ Sk . This function is still Lebesgue-measurable as Sk is compact, hence measurable. The conclusion follows from the fact that gk (x) → f (x,V (x)) and pointwise limits of Lebesgue-measurable functions are themselves Lebesgue-measurable. We next investigate coercivity: Let 1 < p < ∞ The most basic assumption, and the only one we want to consider here, to guarantee coercivity is the following lower growth (coercivity) bound: µ |A| p − µ −1 ≤ f (x, A) for all x ∈ Ω, A ∈ Rm×d and some µ > 0. (2.2) This coercivity also suggests the exponent p for the definition of the function spaces where we look for solutions. We further assume the upper growth bound | f (x, A)| ≤ M(1 + |A| p ) for all x ∈ Ω, A ∈ Rm×d and some M > 0, (2.3) which gives finiteness of F [u] for all u ∈ W1,p (Ω; Rm ). Proposition 2.6. If f satisfies the lower growth bound (2.2), { } then F is weakly coercive on m 1,p m the space W1,p g (Ω; R ) = u ∈ W (Ω; R ) : u|∂ Ω = g . m Proof. We need to show that any sequence (u j ) ⊂ W1,p g (Ω; R ) with Comment... sup j F [u j ] < ∞ 16 CHAPTER 2. CONVEXITY is weakly relatively compact. From (2.2) we get µ · sup j ∫ Ω |∇u j | p dx − |Ω| ≤ sup j F [u j ] < ∞, µ whereby sup j ∥∇u j ∥L p < ∞. 
By the Poincaré–Friedrichs inequality, Theorem A.16 (I), in conjunction with the fact that we fix the boundary values, we therefore get sup j ∥u j ∥W1,p < ∞. We conclude by the fact that bounded sets in reflexive Banach spaces, like W1,p (Ω; Rm ) for 1 < p < ∞, are sequentially weakly relatively compact, see Theorem A.11. Having settled the question of (weak) coercivity, we can now turn to investigate the weak lower semicontinuity. Theorem 2.7. If A 7→ f (x, A) is convex for all x ∈ Ω and f (x, A) ≥ κ for some κ ∈ R (which follows in particular from (2.2)), then F is weakly lower semicontinuous on W1,p (Ω; Rm ). Proof. Step 1. We first establish that F is strongly lower semicontinuous, so let u j → u in W1,p and ∇u j → ∇u almost everywhere (which holds after selecting a non-renumbered subsequence, see Appendix A.2). By assumption we have f (x, ∇u j (x)) − κ ≥ 0. Thus, applying Fatou’s Lemma, ( ) lim inf F [u j ] − κ |Ω| ≥ F [u] − κ |Ω|. j→∞ Thus, F [u] ≤ lim inf j→∞ F [u j ]. Since this holds for all subsequences, it also follows for our original sequence. Step 2. To show weak lower semicontinuity take (u j ) ⊂ W1,p (Ω; Rm ) with u j ⇀ u. We need to show that F [u] ≤ lim inf F [u j ] =: α . (2.4) j→∞ Taking a subsequence (not relabeled), we can in fact assume that F [u j ] converges to α . By the Mazur Lemma A.14, we may find convex combinations N( j) vj = ∑ N( j) θ j,n un , where θ j,n ∈ [0, 1] and n= j ∑ θ j,n = 1, n= j such that v j → u strongly. Since f (x, q) is convex, ) ( ∫ N( j) F [v j ] = Ω f x, ∑ N( j) θ j,n ∇un (x) n= j dx ≤ ∑ θ j,n F [un ]. n= j Now, F [un ] → α as n → ∞ and n ≥ j inside the sum. Thus, lim inf F [v j ] ≤ α = lim inf F [u j ]. j→∞ j→∞ Comment... On the other hand, from the first step and v j → u strongly, we have F [u] ≤ lim inf j→∞ F [v j ]. Thus, (2.4) follows and the proof is finished. 2.2. FUNCTIONALS WITH CONVEX INTEGRANDS 17 We can summarize our findings in the following theorem: Theorem 2.8. Let f : Ω × Rm×d be a Carathéodory integrand such that f satisfies the lower growth bound (2.2), the upper growth bound (2.3) and is convex in its second argum ment. Then, the associated functional F has a minimizer over the space W1,p g (Ω; R ). Proof. This follows immediately from the Direct Method in Banach spaces, Theorem 2.3 together with Proposition 2.6 and Theorem 2.7. Example 2.9. The Dirichlet integral (functional) is F [u] := ∫ Ω |∇u(x)|2 dx, u ∈ W1,2 (Ω). We already encountered this integral functional when considering electrostatics in Section 1.2. It is easy to see that this functional satisfies all requirements of Theorem 2.8 and so there exists a minimizer for any prescribed boundary values g ∈ W1/2,2 (∂ Ω). Example 2.10. In the prototypical problem of linearized elasticity from Section 1.4, we are tasked with the minimization problem ∫ ) ( F [u] := 1 2µ |E u|2 + κ − 2 µ | tr E u|2 − f u dx → min, 2 Ω 3 over all u ∈ W1,2 (Ω; R3 ) with u|∂ Ω = g, where µ , κ > 0, f ∈ L2 (Ω; R3 ), and g ∈ W1/2,2 (∂ Ω; Rm ). It is clear that F has quadratic growth. Let us first consider the coercivity: For u ∈ W1,2 (Ω; R3 ) with u|∂ Ω = g we have the following version of the L2 -Korn inequality ( ) ∥u∥2W1,2 ≤ CK ∥E u∥2L2 + ∥g∥2W1/2,2 . Here, CK = CK (Ω) > 0 is a constant. Let us for a simplified estimate assume that κ − 23 µ ≥ 0 (this is not necessary, but shortens the argument). 
Then, also using the Young inequality, we get for any δ > 0,

\[
\begin{aligned}
\mathscr{F}[u] &\ge \mu \|\mathscr{E}u\|_{L^2}^2 - \|f\|_{L^2}\|u\|_{L^2} \\
&\ge \mu\bigl( \|\mathscr{E}u\|_{L^2}^2 + \|g\|_{W^{1/2,2}}^2 \bigr) - \frac{1}{2\delta}\|f\|_{L^2}^2 - \frac{\delta}{2}\|u\|_{L^2}^2 - \mu\|g\|_{W^{1/2,2}}^2 \\
&\ge \Bigl( \frac{\mu}{C_K} - \frac{\delta}{2} \Bigr)\|u\|_{W^{1,2}}^2 - \frac{1}{2\delta}\|f\|_{L^2}^2 - \mu\|g\|_{W^{1/2,2}}^2.
\end{aligned}
\]

Choosing δ = µ/C_K, we get the coercivity estimate

\[ \mathscr{F}[u] \ge \frac{\mu}{2C_K}\|u\|_{W^{1,2}}^2 - \frac{C_K}{2\mu}\|f\|_{L^2}^2 - \mu\|g\|_{W^{1/2,2}}^2. \]

Hence, F[u] controls ∥u∥_{W^{1,2}} and our functional is weakly coercive. Moreover, it is clear that our integrand is convex in the Eu-argument (note that the trace function is linear). Hence, Theorem 2.8 yields the existence of a solution u* ∈ W^{1,2}(Ω; R³) to our minimization problem of linearized elasticity.

We finish this section with the following converse to Theorem 2.7:

Proposition 2.11. Let F be an integral functional with a Carathéodory integrand f : Ω × R^{m×d} → R satisfying the upper growth bound (2.3). Assume furthermore that F is weakly lower semicontinuous on W^{1,p}(Ω; R^m). If either m = 1 or d = 1 (the scalar case), then A ↦ f(x, A) is convex for almost every x ∈ Ω.

Proof. We will not prove the full result here, since Proposition 3.27 together with Corollary 3.4 in the next chapter will give a stronger version (including the corresponding result for the vectorial case as well).

In the vectorial case, i.e. m ≠ 1 and d ≠ 1, the necessity of convexity is far from being true and there is indeed another "canonical" condition ensuring weak lower semicontinuity, which is weaker than (ordinary) convexity; we will explore this in the next chapter.

2.3 Integrands with u-dependence

If we try to extend the results in the previous section to more general functionals

\[ \mathscr{F}[u] := \int_\Omega f(x, u(x), \nabla u(x)) \,dx \;\to\; \min, \]

we discover that our proof strategy of using the Mazur Lemma runs into difficulties: We cannot "pull out" the convex combination inside

\[ \int_\Omega f\biggl( x,\; \sum_{n=j}^{N(j)} \theta_{j,n} u_n(x),\; \sum_{n=j}^{N(j)} \theta_{j,n} \nabla u_n(x) \biggr) dx \]

anymore. Nevertheless, a lower semicontinuity result analogous to the one for the scalar case turns out to be true:

Theorem 2.12. Let f : Ω × R^m × R^{m×d} → R satisfy the growth bound

| f(x, v, A)| ≤ M(1 + |v|^p + |A|^p) for some M > 0, 1 < p < ∞,

and the convexity property

A ↦ f(x, v, A) is convex for all (x, v) ∈ Ω × R^m.

Then, the functional

\[ \mathscr{F}[u] := \int_\Omega f(x, u(x), \nabla u(x)) \,dx, \qquad u \in W^{1,p}(\Omega; \mathbb{R}^m), \]

is weakly lower semicontinuous.

While it would be possible to give an elementary proof of this theorem here, we postpone the verification to the next chapter. There, using more advanced and elegant techniques (blow-ups and Young measures), we will in fact establish a much more general result and see that, when viewed from the right perspective, u-dependence does not pose any additional complications.

2.4 The Euler–Lagrange equation

In analogy to the elementary fact that the derivative of a function at its minimizers is zero, we will now derive a necessary condition for a function to be a minimizer. This condition furnishes the connection between the Calculus of Variations and PDE theory and is of great importance for computing particular solutions.

Theorem 2.13 (Euler–Lagrange equation (system)). Let f : Ω × R^m × R^{m×d} → R be a Carathéodory integrand that is continuously differentiable in v and A and satisfies the growth bounds

|D_v f(x, v, A)|, |D_A f(x, v, A)| ≤ M(1 + |v|^p + |A|^p), (x, v, A) ∈ Ω × R^m × R^{m×d},

for some M > 0, 1 ≤ p < ∞. If u* ∈ W^{1,p}_g(Ω; R^m), where g ∈ W^{1−1/p,p}(∂Ω; R^m), minimizes the functional

\[ \mathscr{F}[u] := \int_\Omega f(x, u(x), \nabla u(x)) \,dx, \qquad u \in W^{1,p}_g(\Omega; \mathbb{R}^m), \]

then u* is a weak solution of the following system of PDEs, called the Euler–Lagrange equation (strictly speaking it is a system of PDEs, but we simply speak of it as an "equation"),

\[ \begin{cases} -\operatorname{div}\bigl[ D_A f(x, u, \nabla u) \bigr] + D_v f(x, u, \nabla u) = 0 & \text{in } \Omega, \\ u = g & \text{on } \partial\Omega. \end{cases} \tag{2.5} \]

Here we used the common convention to omit the x-arguments whenever this does not cause any confusion, in order to curtail the proliferation of x's, for example in f(x, u, ∇u) = f(x, u(x), ∇u(x)). Recall that u* is a weak solution of (2.5) if

\[ \int_\Omega D_A f(x, u_*, \nabla u_*) : \nabla\psi + D_v f(x, u_*, \nabla u_*) \cdot \psi \,dx = 0 \]

for all ψ ∈ C_c^∞(Ω; R^m), where

\[ D_A f(x, v, A) : B := \lim_{h \downarrow 0} \frac{f(x, v, A + hB) - f(x, v, A)}{h}, \qquad A, B \in \mathbb{R}^{m \times d}, \]

is the directional derivative of f(x, v, ·) at A in direction B, and likewise for D_v f(x, v, A) · w. The derivatives D_A f(x, v, A) and D_v f(x, v, A) can also be represented as a matrix in R^{m×d} and a vector in R^m, respectively, namely

\[ \bigl( D_A f(x, v, A) \bigr)^j_k := \partial_{A^j_k} f(x, v, A), \qquad \bigl( D_v f(x, v, A) \bigr)^j := \partial_{v^j} f(x, v, A). \]

Hence we use the notation with the Frobenius matrix product ":" (see the appendix) and the scalar product "·". The boundary condition u = g on ∂Ω is to be understood in the sense of trace.

Proof. For all ψ ∈ C_c^∞(Ω; R^m) and all h > 0 we have

F[u*] ≤ F[u* + hψ]

since u* + hψ ∈ W^{1,p}_g(Ω; R^m) is admissible in the minimization. Thus,

\[
\begin{aligned}
0 &\le \int_\Omega \frac{f(x, u_* + h\psi, \nabla u_* + h\nabla\psi) - f(x, u_*, \nabla u_*)}{h} \,dx \\
&= \int_\Omega \int_0^1 \frac{1}{h}\frac{d}{dt}\bigl[ f(x, u_* + th\psi, \nabla u_* + th\nabla\psi) \bigr] \,dt \,dx \\
&= \int_\Omega \int_0^1 D_A f(x, u_* + th\psi, \nabla u_* + th\nabla\psi) : \nabla\psi + D_v f(x, u_* + th\psi, \nabla u_* + th\nabla\psi) \cdot \psi \,dt \,dx.
\end{aligned}
\]

By the growth bounds on the derivatives, the integrand can be seen to have an h-uniform L¹-majorant, namely C(1 + |u*|^p + |ψ|^p + |∇u*|^p + |∇ψ|^p), and so we may apply the Lebesgue dominated convergence theorem to let h ↓ 0 under the (double) integral. This yields

\[ 0 \le \int_\Omega D_A f(x, u_*, \nabla u_*) : \nabla\psi + D_v f(x, u_*, \nabla u_*) \cdot \psi \,dx \]

and we conclude by taking ψ and −ψ in this inequality.

Remark 2.14. If we want to allow ψ ∈ W^{1,p}_0(Ω; R^m) in the weak formulation of (2.5), then we need to assume the stronger growth conditions

|D_v f(x, v, A)|, |D_A f(x, v, A)| ≤ M(1 + |v|^{p−1} + |A|^{p−1})

for some M > 0, 1 ≤ p < ∞.

Example 2.15. Coming back to the Dirichlet integral from Example 2.9, we see that the associated Euler–Lagrange equation is the Laplace equation

−Δu = 0,

where Δ := ∂²_{x₁} + ··· + ∂²_{x_d} is the Laplace operator. Solutions u are called harmonic functions. Since the Dirichlet integral is convex, all solutions of the Laplace equation are in fact minimizers. In particular, it can be seen that solutions of the Laplace equation are unique for given boundary values.

Example 2.16. In the linearized elasticity problem from Example 2.10, we may compute the Euler–Lagrange equation as

\[ \begin{cases} -\operatorname{div}\Bigl[ 2\mu\,\mathscr{E}u + \bigl(\kappa - \tfrac{2}{3}\mu\bigr)(\operatorname{tr}\mathscr{E}u)\,I \Bigr] = f & \text{in } \Omega, \\ u = g & \text{on } \partial\Omega. \end{cases} \]

Sometimes, one also defines the first variation δF[u] of F at u ∈ W^{1,p}(Ω; R^m) as the linear map

δF[u] : W^{1,p}_0(Ω; R^m) → R

through (whenever this exists)

\[ \delta\mathscr{F}[u][\psi] := \lim_{h \downarrow 0} \frac{\mathscr{F}[u + h\psi] - \mathscr{F}[u]}{h}, \qquad \psi \in W^{1,p}_0(\Omega; \mathbb{R}^m). \]

Then, the assertion of the previous theorem can equivalently be formulated as

δF[u*] = 0 if u* minimizes F over W^{1,p}_g(Ω; R^m).

Of course, this condition is only necessary for u to be a minimizer.
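The first variation can also be probed numerically, which is a useful sanity check when working with discretized problems. The following sketch (the specific one-dimensional Dirichlet functional, grid, and test function are illustrative choices, not from the text) approximates δF[u][ψ] by the difference quotient from the definition above and confirms that it essentially vanishes at a minimizer but not elsewhere:

```python
import numpy as np

# 1D Dirichlet integral F[u] = (1/2) \int_0^1 |u'|^2 dx with u(0) = 0, u(1) = 1;
# the minimizer is the harmonic (here: linear) function u(x) = x.
N = 200
x = np.linspace(0.0, 1.0, N + 1)
dx = x[1] - x[0]

def F(interior):
    u = np.concatenate(([0.0], interior, [1.0]))   # enforce the boundary values
    return 0.5 * np.sum(np.diff(u)**2 / dx)        # Riemann sum of (1/2)|u'|^2

def first_variation(u, psi, h=1e-6):
    return (F(u + h * psi) - F(u)) / h             # difference quotient in h

psi = np.sin(np.pi * x[1:-1])      # a test function vanishing on the boundary
u_star = x[1:-1]                   # the minimizer u(x) = x, restricted to interior nodes

print(first_variation(u_star, psi))          # approximately 0
print(first_variation(0.2 * u_star, psi))    # clearly nonzero at a non-minimizer
```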
Any solution of the Euler–Lagrange equation is called a critical point of F; it could be a minimizer, a maximizer, or a saddle point.

One crucial consequence of the results in this section is that we can use all available PDE methods to study minimizers. Immediately, one can ask about the type of PDE we are dealing with. In this respect we have the following result:

Proposition 2.17. Let f be twice continuously differentiable in v and A. Also let A ↦ f(x, v, A) be convex for all (x, v) ∈ Ω × R^m. Then, the Euler–Lagrange equation is an elliptic PDE, that is,

\[ 0 \le D_A^2 f(x, v, A)[B, B] := \frac{d^2}{dt^2}\bigg|_{t=0} f(x, v, A + tB) \]

for all x ∈ Ω, v ∈ R^m, and A, B ∈ R^{m×d}.

This proposition is not difficult to establish and just needs a bit of notation; we omit the proof.

One main use of the Euler–Lagrange equation is to find concrete solutions of variational problems:

Example 2.18. Recall the optimal saving problem from Section 1.5,

\[ \mathscr{F}[S] := \int_0^T -\ln\bigl( w + \rho S(t) - \dot{S}(t) \bigr) \,dt \;\to\; \min, \qquad S(0) = 0, \quad S(T) = S_T \ge 0. \]

The Euler–Lagrange equation is

\[ \frac{d}{dt}\left[ \frac{1}{w + \rho S(t) - \dot{S}(t)} \right] = -\frac{\rho}{w + \rho S(t) - \dot{S}(t)}, \]

which resolves to

\[ \frac{\rho\dot{S}(t) - \ddot{S}(t)}{w + \rho S(t) - \dot{S}(t)} = \rho. \]

With the consumption rate C(t) := w + ρS(t) − Ṡ(t), this is equivalent to

\[ \frac{\dot{C}(t)}{C(t)} = \rho, \]

and so, if C(0) = C₀ (to be determined later),

w + ρS(t) − Ṡ(t) = C(t) = e^{ρt}C₀.

This differential equation for S(t) can be solved, for example via the Duhamel formula, which yields

\[ S(t) = e^{\rho t} \cdot 0 + \int_0^t e^{\rho(t-s)}\bigl( w - e^{\rho s}C_0 \bigr) \,ds = \frac{e^{\rho t} - 1}{\rho}\,w - t e^{\rho t} C_0, \]

and C₀ can now be chosen to satisfy the terminal condition S(T) = S_T; in fact, C₀ = (1 − e^{−ρT})w/(ρT) − e^{−ρT}S_T/T. Since C ↦ −ln C is convex, we know that S(t) as above is a minimizer of our problem.

Figure 2.1: The solution to the optimal saving problem.

In Figure 2.1 we see the optimal savings strategy for a worker earning a (constant) salary of w = £30,000 per year and having a savings goal of S_T = £100,000. The worker has to save for approximately 27 years, reaching savings of just over £168,000, and then starts withdrawing his savings (Ṡ(t) < 0) for the last 13 years. The worker's consumption C(t) = e^{ρt}C₀ goes up continuously during the whole working life, making him/her (materially) happy.

Example 2.19. For functions u = u(t, x) : R × R^d → R, consider the action functional

\[ \mathscr{A}[u] := \frac{1}{2}\int_{\mathbb{R}\times\mathbb{R}^d} -|\partial_t u|^2 + |\nabla_x u|^2 \,d(t, x), \]

where ∇_x u is the gradient of u with respect to x ∈ R^d. This functional should be interpreted as the usual Dirichlet integral, see Example 2.9, where, however, we use the Lorentz metric. Then, the Euler–Lagrange equation is the wave equation

∂²_t u − Δu = 0.

Notice that A is not convex and consequently the wave equation is hyperbolic instead of elliptic.

It is an important question whether a weak solution of the Euler–Lagrange equation (2.5) is also a strong solution, that is, whether u ∈ W^{2,2}(Ω; R^m) and

\[ \begin{cases} -\operatorname{div}\bigl[ D_A f(x, u(x), \nabla u(x)) \bigr] + D_v f(x, u(x), \nabla u(x)) = 0 & \text{for a.e. } x \in \Omega, \\ u = g & \text{on } \partial\Omega. \end{cases} \tag{2.6} \]

If u ∈ C²(Ω; R^m) ∩ C(Ω̄; R^m) satisfies this PDE for every x ∈ Ω, then we call u a classical solution.

Multiplying (2.6) by a test function ψ ∈ C_c^∞(Ω; R^m), integrating over Ω, and using the Gauss–Green Theorem, it follows that any solution of (2.6) also solves (2.5). The converse is also true whenever u is sufficiently regular:

Proposition 2.20. Let the integrand f be twice continuously differentiable and let u ∈ W^{2,2}(Ω; R^m) be a weak solution of the Euler–Lagrange equation (2.5).
Then, u solves the Euler–Lagrange equation (2.6) in the strong sense. Proof. If u ∈ W2,2 (Ω; Rm ) is a weak solution, then ∫ Ω DA f (x, u, ∇u) : ∇ψ + Dv f (x, u, ∇u) · ψ dx = 0 m for all ψ ∈ C∞ c (Ω; R ). Integration by parts (more precisely, the Gauss–Green Theorem) gives ∫ { Ω [ ] } − div DA f (x, u, ∇u) + Dv f (x, u, ∇u) · ψ dx = 0, again for all ψ as before. We conclude using the following important lemma. Lemma 2.21 (Fundamental lemma of the Calculus of Variations). Let Ω ⊂ Rd be open. If g ∈ L1 (Ω) satisfies ∫ Ω gψ dx = 0 for all ψ ∈ C∞ c (Ω), Comment... then g ≡ 0 almost everywhere. 24 CHAPTER 2. CONVEXITY Proof. We can assume that Ω is bounded by considering subdomains if necessary. Also, let g be extended by zero to all of Rd . Fix ε > 0 and let (ρδ )δ ⊂ C∞ c (B(0, δ )) be a family of mollifying kernels, see Appendix A.2 for details. Then, since ρδ ⋆ g → g in L1 , we may find h ∈ (L1 ∩ C∞ )(Rd ) with ∥g − h∥L1 ≤ ε 4 and ∥h∥∞ < ∞. Set φ (x) := h(x)/|h(x)| for h(x) ̸= 0 and φ (x) := 0 for h(x) = 0, so that hφ = |h|. Then take ψ = ρδ̄ ⋆ φ ∈ (L1 ∩ C∞ )(Rd ) with ∥φ − ψ ∥L1 ≤ ε . 2∥h∥∞ Since |ψ | ≤ 1 (this follows from the convolution formula), ∥g∥L1 ≤ ∥g − h∥L1 + ≤ ∥g − h∥L1 + ∫ ∫Ω Ω hφ dx h(φ − ψ ) + (h − g)ψ + gψ dx ≤ 2∥g − h∥L1 + ∥h∥∞ · ∥φ − ψ ∥L1 + 0 ≤ ε. We conclude by letting ε ↓ 0. 2.5 Regularity of minimizers As we saw at the end of the last section, whether a weak solution of the Euler–Lagrange equation is also a strong or even classical solution, depends on its regularity, that is, on the amount of differentiability. More generally, one would like to know how much regularity we can expect from solutions of a variational problem. Such a question also was the content of David Hilbert’s 19th problem [47]: Does every Lagrangian partial differential equation of a regular variational problem have the property of exclusively admitting analytic integrals?4 In modern language, Hilbert asked whether “regular” variational problems (defined below) admit only analytic solutions, i.e. ones that have a local power series representation. In this section, we will prove some basic regularity assertions, but we will only sketch the solution of Hilbert’s 19th problem, as the techniques needed are quite involved. We remark from the outset that regularity results are very sensitive to the dimensions involved, in particular the behavior of the scalar case (m = 1) and the vector case (m > 1) is fundamentally different. We also only consider the quadratic (p = 2) case for reasons of simplicity. 4 The Comment... German original asks “ob jede Lagrangesche partielle Differentialgleichung eines regulären Variationsproblems die Eigenschaft hat, daß sie nur analytische Integrale zuläßt.” 2.5. REGULARITY OF MINIMIZERS 25 In the spirit of Hilbert’s 19th problem, call F [u] := ∫ Ω f (∇u(x)) dx, u ∈ W1,2 (Ω; Rm ), a regular variational integral if f ∈ C2 (Rm×d ) and there are constants 0 < µ ≤ M with µ |B|2 ≤ D2A f (A)[B, B] ≤ M|B|2 In this context, D2A f (A)[B1 , B2 ] for all A, B ∈ Rm×d . d d = f (A + sB1 + tB2 ) dt ds s,t=0 for all A, B1 , B2 ∈ Rm×d , which is a bilinear form. Clearly, regular variational problems are convex. In fact, integrands f that satisfy the above lower bound µ |B|2 ≤ D2A f (A)[B, B] are called strongly convex. For example, the Dirichlet integral from Example 2.9 is regular. Also, the following Cauchy–Schwarz-type estimate can be shown: 2 DA f (A)[B1 , B2 ] ≤ M|B1 ||B2 |. (2.7) The basic regularity theorem is the following: Theorem 2.22 (W2,2 loc -regularity). 
Let F be a regular variational integral. Then, for any minimizer u ∈ W^{1,2}(Ω; R^m) of F, it holds that u ∈ W^{2,2}_loc(Ω; R^m). Moreover, for any ball B(x₀, 3r) ⊂ Ω (x₀ ∈ Ω, r > 0) the Caccioppoli inequality

\[ \int_{B(x_0,r)} |\nabla^2 u(x)|^2 \,dx \le \left( \frac{2M}{\mu} \right)^2 \int_{B(x_0,3r)} \frac{|\nabla u(x) - [\nabla u]_{B(x_0,3r)}|^2}{r^2} \,dx \tag{2.8} \]

holds, where [∇u]_{B(x₀,3r)} denotes the average of ∇u over B(x₀, 3r). Consequently, the Euler–Lagrange equation holds strongly,

−div Df(∇u) = 0 a.e. in Ω.

For the proof we employ the difference quotient method, which is fundamental in regularity theory. Let u : Ω → R^m, x ∈ Ω, k ∈ {1, ..., d}, and h ∈ R. Then, define the difference quotients

\[ D_k^h u(x) := \frac{u(x + he_k) - u(x)}{h}, \qquad D^h u := (D_1^h u, \ldots, D_d^h u), \]

where {e₁, ..., e_d} is the standard basis of R^d. The key is the following characterization of Sobolev spaces in terms of difference quotients:

Lemma 2.23. Let 1 < p < ∞, let D ⊂⊂ Ω ⊂ R^d be open, and let u ∈ L^p(Ω; R^m).

(I) If u ∈ W^{1,p}(Ω; R^m), then

∥D_k^h u∥_{L^p(D)} ≤ ∥∂_k u∥_{L^p(Ω)} for all k ∈ {1, ..., d}, |h| < dist(D, ∂Ω).

(II) If for some 0 < δ < dist(D, ∂Ω) it holds that

∥D_k^h u∥_{L^p(D)} ≤ C for all k ∈ {1, ..., d} and all |h| < δ,

then u ∈ W^{1,p}_loc(Ω; R^m) and ∥∂_k u∥_{L^p(D)} ≤ C for all k ∈ {1, ..., d}.

Proof. For (I), assume first that u ∈ (L^p ∩ C¹)(Ω; R^m). In this case, by the Fundamental Theorem of Calculus, at x ∈ Ω it holds that

\[ D_k^h u(x) = \frac{1}{h}\int_0^1 \frac{d}{dt}\, u(x + the_k) \,dt = \int_0^1 \partial_k u(x + the_k) \,dt. \]

Thus,

\[ \int_D |D_k^h u|^p \,dx \le \int_\Omega |\partial_k u|^p \,dx, \]

from which the assertion follows. The general case follows from the density of (L^p ∩ C¹)(Ω; R^m) in L^p(Ω; R^m).

For (II), we observe that for fixed k ∈ {1, ..., d} by assumption (D_k^h u)_{0<h<δ} is uniformly L^p-bounded. Thus, for an arbitrary fixed sequence of h's tending to zero, there exists a subsequence h_j ↓ 0 with

D_k^{h_j} u ⇀ v_k in L^p

for some v_k ∈ L^p(Ω; R^m). Let ψ ∈ C_c^∞(Ω; R^m). Using an "integration-by-parts" rule for difference quotients, which is elementary to check, we get

\[ \int_D v_k \cdot \psi \,dx = \lim_{j\to\infty} \int_D D_k^{h_j} u \cdot \psi \,dx = -\lim_{j\to\infty} \int_D u \cdot D_k^{-h_j}\psi \,dx = -\int_D u \cdot \partial_k\psi \,dx. \]

Thus, u ∈ W^{1,p}(D; R^m) and v_k = ∂_k u. The norm estimate follows from the lower semicontinuity of the norm under weak convergence.

Proof of Theorem 2.22. The idea is to emulate the a-priori non-existent second derivatives using difference quotients and to derive estimates which allow one to conclude that these difference quotients are in L². Then we can conclude by the preceding lemma.

Let u ∈ W^{1,2}(Ω; R^m) be a minimizer of F. By Theorem 2.13,

\[ 0 = \int_\Omega Df(\nabla u) : \nabla\psi \,dx \tag{2.9} \]

holds for all ψ ∈ C_c^∞(Ω; R^m), and then by density also for all ψ ∈ W^{1,2}(Ω; R^m) with compact support in Ω. Here we have used the notation Df(A) : B for the directional derivative of f at A in direction B. Fix a ball B(x₀, 3r) ⊂ Ω and take a Lipschitz cut-off function ρ ∈ W^{1,∞}(Ω) such that

1_{B(x₀,r)} ≤ ρ ≤ 1_{B(x₀,2r)} and |∇ρ| ≤ 1/r.

Then, for any k = 1, ..., d and |h| < r, we let

\[ \psi := D_k^{-h}\bigl[ \rho^2 D_k^h(u - a) \bigr] \in W^{1,2}(\Omega; \mathbb{R}^m), \]

where a is an affine function to be chosen later. We may plug ψ into (2.9) to get

\[ 0 = \int_\Omega D_k^h\bigl( Df(\nabla u) \bigr) : \Bigl[ \rho^2 D_k^h\nabla u + D_k^h(u - a) \otimes \nabla(\rho^2) \Bigr] \,dx. \tag{2.10} \]

Here, we used the "integration-by-parts" formula for difference quotients, which is easy to check (also recall a ⊗ b := abᵀ). Next, we estimate, using the assumptions on f,

\[ \mu |D_k^h\nabla u|^2 \le \int_0^1 D^2 f\bigl( \nabla u + thD_k^h\nabla u \bigr)\bigl[ D_k^h\nabla u,\, D_k^h\nabla u \bigr] \,dt = \frac{1}{h}\Bigl[ Df\bigl( \nabla u + thD_k^h\nabla u \bigr) : D_k^h\nabla u \Bigr]_{t=0}^{t=1} = D_k^h\bigl( Df(\nabla u) \bigr) : D_k^h\nabla u. \]

Similarly, using the representation D_k^h(Df(∇u)) = ∫₀¹ D²f(∇u + thD_k^h∇u)[D_k^h∇u, ·] dt, the bound (2.7), and the fact that |a ⊗ b| = |a| · |b| for our norms, we get

\[ \Bigl| D_k^h\bigl( Df(\nabla u) \bigr) : \bigl[ D_k^h(u - a) \otimes \nabla(\rho^2) \bigr] \Bigr| \le M\,|D_k^h\nabla u|\,\bigl| D_k^h(u - a) \otimes \nabla(\rho^2) \bigr| \le 2M \cdot \rho \cdot |D_k^h(u - a)| \cdot |\nabla\rho| \cdot |D_k^h\nabla u|. \]

Using the last two estimates, (2.10), and the integration-by-parts formula for difference quotients again,

\[
\begin{aligned}
\mu \int_\Omega |D_k^h\nabla u|^2 \rho^2 \,dx &\le \int_\Omega D_k^h\bigl( Df(\nabla u) \bigr) : \bigl[ \rho^2 D_k^h\nabla u \bigr] \,dx \\
&= -\int_\Omega D_k^h\bigl( Df(\nabla u) \bigr) : \bigl[ D_k^h(u - a) \otimes \nabla(\rho^2) \bigr] \,dx \\
&\le 2M \int_\Omega \rho \cdot |D_k^h\nabla u| \cdot |D_k^h(u - a)| \cdot |\nabla\rho| \,dx.
\end{aligned}
\]

Now use the Young inequality to get

\[ \mu \int_\Omega |D_k^h\nabla u|^2 \rho^2 \,dx \le \frac{\mu}{2}\int_\Omega |D_k^h\nabla u|^2 \rho^2 \,dx + \frac{(2M)^2}{2\mu}\int_\Omega |D_k^h(u - a)|^2 \cdot |\nabla\rho|^2 \,dx. \]

Absorbing the first term into the left-hand side and using the properties of ρ,

\[ \int_{B(x_0,r)} |D_k^h\nabla u|^2 \,dx \le \left( \frac{2M}{\mu} \right)^2 \int_{B(x_0,2r)} \frac{|D_k^h(u - a)|^2}{r^2} \,dx. \]

Now invoke the difference-quotient lemma, part (I), to arrive at

\[ \int_{B(x_0,r)} |D_k^h\nabla u|^2 \,dx \le \left( \frac{2M}{\mu} \right)^2 \int_{B(x_0,3r)} \frac{|\partial_k(u - a)|^2}{r^2} \,dx. \]

Applying the lemma again, this time part (II), we get u ∈ W^{2,2}(B(x₀, r); R^m). The Caccioppoli inequality (2.8) follows once we take ∇a := [∇u]_{B(x₀,3r)}.

With the W^{2,2}_loc-regularity result at hand, we may use ψ = ∂_k ψ̃, k = 1, ..., d, as test functions in the weak formulation of the Euler–Lagrange equation −div[Df(∇u)] = 0 to conclude that

\[ -\operatorname{div}\bigl[ D^2 f(\nabla u) : \nabla(\partial_k u) \bigr] = 0 \tag{2.11} \]

holds in the weak sense. Indeed,

\[ 0 = -\int_\Omega \operatorname{div} Df(\nabla u) \cdot \partial_k\psi \,dx = \int_\Omega \operatorname{div}\bigl[ D^2 f(\nabla u) : \nabla(\partial_k u) \bigr] \cdot \psi \,dx \]

for all ψ ∈ C_c^∞(Ω; R^m). In the special case that f is quadratic, D²f is constant and the above PDE has the same structure (with ∂_k u as unknown function) as the original Euler–Lagrange equation. Thus it is itself an Euler–Lagrange equation and we can bootstrap the regularity theorem to get u ∈ W^{k,2}_loc(Ω; R^m) for all k ∈ N, hence u ∈ C^∞(Ω; R^m).

Example 2.24. For a minimizer u ∈ W^{1,2}(Ω) of the Dirichlet integral as in Example 2.9, the theory presented here immediately gives u ∈ C^∞(Ω).

Example 2.25. Similarly, for minimizers u ∈ W^{1,2}(Ω; R³) of the problem of linearized elasticity from Section 1.4 and Example 2.10, we also get u ∈ C^∞(Ω; R³). The reasoning has to be slightly adjusted, however, to take care of the fact that we are dealing with Eu instead of ∇u.

In the general case, however, our Euler–Lagrange equation is nonlinear and the above bootstrapping procedure fails. In particular, this includes the problem posed in Hilbert's 19th problem, which was only solved, in the scalar case m = 1, by Ennio De Giorgi in 1957 [28] and, using different methods, by John Nash in 1958 [78]. After the proof was improved by Jürgen Moser in 1960/1961 [71, 72], the results are now subsumed in the De Giorgi–Nash–Moser theory. The current standard solution (based on De Giorgi's approach) rests on the De Giorgi regularity theorem and the classical Schauder estimates, which establish Hölder regularity of weak solutions of a PDE.

Theorem 2.26 (De Giorgi regularity theorem). Let the function A : Ω → R^{d×d} be measurable, symmetric, i.e. A(x) = A(x)ᵀ for x ∈ Ω, and satisfy the ellipticity and boundedness estimates

\[ \mu|v|^2 \le v^T A(x)v \le M|v|^2, \qquad x \in \Omega,\ v \in \mathbb{R}^d, \tag{2.12} \]

for constants 0 < µ ≤ M. If u ∈ W^{1,2}(Ω) is a weak solution of

\[ -\operatorname{div}[A\nabla u] = 0, \tag{2.13} \]

then u ∈ C^{0,α₀}_loc(Ω), that is, u is α₀-Hölder continuous, for some α₀ = α₀(d, M/µ) ∈ (0, 1).

Theorem 2.27 (Schauder estimate). Let A : Ω → R^{d×d} be as above, but in addition assumed to be of class C^{s−1,α} for some s ∈ N and α ∈ (0, 1).
If u ∈ W1,2 (Ω) is a weak solution of − div[A∇u] = 0, then u ∈ Cs,locα (Ω). We remark that the Schauder estimates also hold for systems of PDEs (u : Ω → Rm , m > 1), but the De Giorgi regularity theorem does not. The proofs of these results are quite involved and reserved for an advanced course on regularity theory, but we establish their arguably most important consequence, namely the solution of Hilbert’s 19th problem in the scalar case: Theorem 2.28 (De Giorgi–Nash–Moser). Let F be a regular variational integral and α f ∈ Cs (Rd ), s ∈ {2, 3, . . .}. If u ∈ W1,2 (Ω) minimizes F , then it holds that u ∈ Cs−1, loc (Ω) ∞ d ∞ for some α ∈ (0, 1). In particular, if f ∈ C (R ), then u ∈ C (Ω). Proof. We saw in (2.11) that the partial derivative ∂k u, k = 1, . . . , d, of a minimizer u ∈ W1,2 (Ω) of F satisfy (2.13) for A(x) := D2 f (∇u(x)), x ∈ Ω. From general properties of the Hessian we conclude that A(x) is symmetric and the upper and lower estimates (2.12) on A follow from the respective properties of D2 f . However, we cannot conclude (yet) any regularity of A beyond measurability. Nevertheless, we may 0,α0 apply the De Giorgi Regularity Theorem 2.26, whereby ∂k u ∈ Cloc (Ω) for some α0 ∈ (0, 1) 1,α0 and all k = 1, . . . , d. Hence u ∈ Cloc (Ω). This is the assertion for s = 2. If s = 3, our arguments so far in conjunction with the regularity assumptions on f imply α0 2 D f (∇u(x)) ∈ C0, loc (Ω). Hence, the Schauder estimates from Theorem 2.27 apply and yield 2,α0 s−1,α0 α0 (Ω). For higher s, this procedure can be ∂k u ∈ C1, loc (Ω), whereby u ∈ Cloc (Ω) = Cloc α0 iterated until we run out of f -derivatives and we have achieved u ∈ Cs−1, (Ω). loc Comment... The above argument also yields analyticity of u if f is analytic. Another refinement shows that in the situation of the De Giorgi–Nash–Moser Theorem, we even have u ∈ s−1,α (Ω) for all α ∈ (0, 1). Here one needs the additional Schauder–type result that weak Cloc [ ] 0,α solutions u to − div A∇(∂k u) = 0 for A continuous are of class Cloc for all α ∈ (0, 1). In the above proof we can apply this result in the case s = 2 since A(x) = D2 f (∇u(x)) is 1,α continuous. Thus, u ∈ Cloc (Ω) for any α ∈ (0, 1). The other parts of the proof are adapted accordingly. See [86] for details and other refinements. We close our look at regularity theory by considering the vectorial case m > 1. It was shown again by De Giorgi in 1968 that if d = m > 2, then his regularity theorem does not hold: 30 CHAPTER 2. CONVEXITY Example 2.29 (De Giorgi). Let d = m > 2 and define −γ u(x) := |x| x, ( ) d 1 γ := 1− √ , 2 (2d − 2)2 + 1 x ∈ B(0, 1). d Note that 1 < γ < d2 and so u ∈ W1,2 (B(0, 1); Rd ) but u ∈ / L∞ loc (B(0, 1); R ). It can be checked, though, that u solves (2.13) for ) ]2 [( x⊗x v A(x)v := |v| + (d − 1)I + d · v , |x|2 T 2 which satisfies all assumptions of the De Giorgi Theorem. Moreover, u is a minimizer of the quadratic variational integral ∫ B(0,1) ∇u(x)T A(x)∇u(x) dx. However, this is not a regular variational integral because of the x-dependence of the integrand is not smooth. In principle, this still leaves the possibility that the vectorial analogue of Hilbert’s 19th problem has a positive solution, just that its proof would have to proceed along a different avenue than via the De Giorgi Theorem. However, after first results of Nečas in 1975 [79], the current state of the art was established by Šverák and Yan in 2000–2002 [90, 91]. 
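The exponent arithmetic behind Example 2.29 above can be checked in a few lines. The sketch below (for d = 3) uses the radial formula |∇u(x)|² = (d − 2γ + γ²)|x|^(−2γ), a short computation not spelled out in the example; the dimension and the sample radii are illustrative choices.

```python
import math

d = 3                                        # dimension; Example 2.29 needs d = m > 2
gam = d / 2 * (1 - 1 / math.sqrt((2 * d - 2) ** 2 + 1))
print(f"gamma = {gam:.4f};  1 < gamma < d/2 ? {1 < gam < d / 2}")

# |u(x)| = |x|^(1 - gamma) blows up at the origin because gamma > 1:
for r in (1e-2, 1e-6, 1e-12):
    print(f"  |u| on the sphere of radius {r:.0e}: {r ** (1 - gam):.2f}")

# A short computation gives |grad u(x)|^2 = (d - 2*gam + gam**2) * |x|^(-2*gam), so in
# polar coordinates the Dirichlet energy over B(0,1) equals
#   (d - 2*gam + gam**2) * |S^{d-1}| * int_0^1 r^(d-1-2*gam) dr,
# and the radial integral is 1/(d - 2*gam), which is finite precisely because 2*gam < d.
surface = 2 * math.pi ** (d / 2) / math.gamma(d / 2)   # |S^{d-1}|, equal to 4*pi for d = 3
energy = (d - 2 * gam + gam ** 2) * surface / (d - 2 * gam)
print(f"Dirichlet energy of u over B(0,1): {energy:.2f}  (finite, so u lies in W^(1,2))")
```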
Šverák and Yan prove that there exist regular (and $C^\infty$) variational integrals such that minimizers must be:

• non-Lipschitz if $d \geq 3$, $m \geq 5$,

• unbounded if $d \geq 5$, $m \geq 14$.

However, there are also positive results. In particular, for $d = 2$ minimizers of regular variational problems are always as regular as the data allows (as in Theorem 2.28); this is the Morrey regularity theorem from 1938, see [68]. If we confine ourselves to Sobolev regularity, then using the difference quotient technique we can prove $W^{2,2+\delta}_{\mathrm{loc}}$-regularity for minimizers of regular variational problems for some dimension-dependent $\delta > 0$; this result is originally due to Sergio Campanato. By an embedding theorem this yields $C^{0,\alpha}$-regularity for some $\alpha \in (0,1)$ when $d \leq 4$. Only in 2008 were Jan Kristensen and Christof Melcher able to establish $W^{2,2+\delta}_{\mathrm{loc}}$-regularity for a dimension-independent $\delta > 0$ (in fact, $\delta = \mu/(50M)$), see [55]. We will learn about further regularity results in Section 3.8. Regularity theory is still an area of active research and new results are constantly discovered.

Finally, we mention that regularity of solutions might in general be extended up to and including the boundary if the boundary is smooth enough, see Section 6.3.2 in [35] and also [44] for results in this direction.

2.6 Lavrentiev gap phenomenon

So far we have chosen the function spaces in which we look for solutions of a minimization problem according to coercivity and growth assumptions. However, at first sight, classically differentiable functions may appear to be more appealing. So the question arises whether the minimum (infimum) value is actually the same when considering different function spaces. Formally, given two spaces $X \subset Y$ such that $X$ is dense in $Y$ and a functional $\mathcal{F} \colon Y \to \mathbb{R} \cup \{+\infty\}$, is
\[ \inf_Y \mathcal{F} = \inf_X \mathcal{F} \,? \]
This is certainly true if we assume good growth conditions:

Theorem 2.30. Let $f \colon \Omega \times \mathbb{R}^m \times \mathbb{R}^{m \times d} \to \mathbb{R}$ be a Carathéodory function satisfying the growth bound
\[ |f(x,v,A)| \leq M(1 + |v|^p + |A|^p), \qquad (x,v,A) \in \Omega \times \mathbb{R}^m \times \mathbb{R}^{m \times d}, \]
for some $M > 0$, $1 < p < \infty$. Then, the functional
\[ \mathcal{F}[u] := \int_\Omega f(x, u(x), \nabla u(x)) \, dx, \qquad u \in W^{1,p}(\Omega;\mathbb{R}^m), \]
is strongly continuous. Consequently,
\[ \inf_{W^{1,p}(\Omega;\mathbb{R}^m)} \mathcal{F} = \inf_{C^\infty(\Omega;\mathbb{R}^m)} \mathcal{F}. \]

Proof. Let $u_j \to u$ in $W^{1,p}$, whereby $u_j \to u$ and $\nabla u_j \to \nabla u$ almost everywhere (which holds after selecting a subsequence that we do not relabel). Then, from the growth assumption we get
\[ \mathcal{F}[u_j] = \int_\Omega f(x, u_j, \nabla u_j) \, dx \leq \int_\Omega M(1 + |u_j|^p + |\nabla u_j|^p) \, dx, \]
and by the Pratt Theorem A.5, $\mathcal{F}[u_j] \to \mathcal{F}[u]$. This shows the continuity of $\mathcal{F}$ with respect to strong convergence in $W^{1,p}$. The assertion about the equality of the infima now follows readily since $C^\infty(\Omega;\mathbb{R}^m)$ is (strongly) dense in $W^{1,p}(\Omega;\mathbb{R}^m)$.

If we dispense with the (upper) growth bound, however, the infimum over different spaces may indeed be different – this is the Lavrentiev gap phenomenon, discovered in 1926 by Mikhail Lavrentiev. We here give an example between the spaces $W^{1,1}$ and $W^{1,\infty}$ (with boundary conditions) due to Basilio Manià from 1934.

Example 2.31 (Manià). Consider the minimization problem
\[ \mathcal{F}[u] := \int_0^1 (u(t)^3 - t)^2 \, \dot{u}(t)^6 \, dt \to \min, \qquad u(0) = 0, \quad u(1) = 1, \]
for $u$ from either $W^{1,1}(0,1)$ or $W^{1,\infty}(0,1)$ (Lipschitz functions). We claim that
\[ \inf_{W^{1,1}(0,1)} \mathcal{F} < \inf_{W^{1,\infty}(0,1)} \mathcal{F}, \]
where here and in the following these infima are to be taken only over functions $u$ with boundary values $u(0) = 0$, $u(1) = 1$. Clearly, $\mathcal{F} \geq 0$, and for $u_*(t) := t^{1/3} \in (W^{1,1} \setminus W^{1,\infty})(0,1)$ we have $\mathcal{F}[u_*] = 0$.
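Before completing the lower bound, the gap can also be seen numerically. The sketch below evaluates F on the Lipschitz competitors u_δ that follow the linear ramp t ↦ δ^(−2/3) t up to t = δ and then switch to t^(1/3); this particular family and the quadrature are illustrative choices, not part of the argument, and the constant they are compared against is the bound derived in the continuation of the example.

```python
from scipy.integrate import quad

# Mania's functional F[u] = int_0^1 (u(t)^3 - t)^2 u'(t)^6 dt from Example 2.31.
# Lipschitz competitors u_delta: a linear ramp on [0, delta], then u_delta(t) = t^(1/3);
# they satisfy u_delta(0) = 0, u_delta(1) = 1 and have Lipschitz constant delta^(-2/3).
def F_lip(delta):
    # On [delta, 1] the integrand vanishes because u_delta(t)^3 - t = 0 there, so only
    # the ramp part contributes; its integrand is (t^3/delta^2 - t)^2 / delta^4.
    ramp = lambda t: (t ** 3 / delta ** 2 - t) ** 2 / delta ** 4
    value, _ = quad(ramp, 0.0, delta)
    return value                             # exact value of this integral: 8 / (105 * delta)

gap_bound = 7 ** 2 * 3 ** 5 / (8 ** 2 * 5 ** 5 * 2 ** 6)
print(f"lower bound for every Lipschitz u (derived below): {gap_bound:.2e}")
for delta in (1.0, 0.5, 0.1, 0.01):
    print(f"delta = {delta:5.2f}:  F[u_delta] = {F_lip(delta):.4f}")
# Every value stays far above 0 (and grows like 1/delta as the ramp steepens), whereas
# F[t^(1/3)] = 0 since that integrand vanishes identically: the two infima differ.
```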
Thus, $\inf_{W^{1,1}(0,1)} \mathcal{F} = 0$.

On the other hand, for any $u \in W^{1,\infty}(0,1)$ with $u(0) = 0$, $u(1) = 1$, the Lipschitz continuity implies that
\[ u(t) \leq h(t) := \frac{t^{1/3}}{2} \qquad \text{for all } t \in [0,\tau] \text{ and some } \tau > 0, \quad \text{and} \quad u(\tau) = h(\tau). \]
Then, $u(t)^3 - t \leq h(t)^3 - t$ for $t \in [0,\tau]$ and, since both of these terms are negative,
\[ (u(t)^3 - t)^2 \geq (h(t)^3 - t)^2 = \frac{7^2}{8^2}\, t^2 \qquad \text{for all } t \in [0,\tau]. \]
We then estimate
\[ \mathcal{F}[u] \geq \int_0^\tau (u(t)^3 - t)^2 \, \dot{u}(t)^6 \, dt \geq \frac{7^2}{8^2} \int_0^\tau t^2 \dot{u}(t)^6 \, dt. \]
Further, by Hölder's inequality,
\[ \int_0^\tau \dot{u}(t) \, dt = \int_0^\tau t^{-1/3} \cdot t^{1/3} \dot{u}(t) \, dt \leq \Big( \int_0^\tau t^{-2/5} \, dt \Big)^{5/6} \cdot \Big( \int_0^\tau t^2 \dot{u}(t)^6 \, dt \Big)^{1/6} = \frac{5^{5/6}}{3^{5/6}}\, \tau^{1/2} \Big( \int_0^\tau t^2 \dot{u}(t)^6 \, dt \Big)^{1/6}. \]
Since also
\[ \int_0^\tau \dot{u}(t) \, dt = u(\tau) - u(0) = h(\tau) = \frac{\tau^{1/3}}{2}, \]
we arrive at
\[ \mathcal{F}[u] \geq \frac{7^2 \, 3^5}{8^2 \, 5^5 \, 2^6 \, \tau} \geq \frac{7^2 \, 3^5}{8^2 \, 5^5 \, 2^6} > 0. \]
Thus,
\[ \inf_{W^{1,\infty}(0,1)} \mathcal{F} > 0 = \inf_{W^{1,1}(0,1)} \mathcal{F}. \]

In a more recent example, Ball & Mizel [13] showed that the problem
\[ \mathcal{F}[u] := \int_{-1}^1 (t^4 - u(t)^6)^2 |\dot{u}(t)|^{2m} + \varepsilon \dot{u}(t)^2 \, dt \to \min, \qquad u(-1) = \alpha, \quad u(1) = \beta, \]
also exhibits a Lavrentiev gap phenomenon between the spaces $W^{1,2}$ and $W^{1,\infty}$ if $m \in \mathbb{N}$ satisfies $m > 13$, $\varepsilon > 0$ is sufficiently small, and $-1 \leq \alpha < 0 < \beta \leq 1$. This example is significant because the Ball–Mizel functional is coercive on $W^{1,2}(-1,1)$.

Finally, we note that besides its theoretical significance, the Lavrentiev gap phenomenon is also a major obstacle for the numerical approximation of minimization problems. For instance, standard piecewise-affine finite elements are in $W^{1,\infty}$ and hence in the presence of the Lavrentiev gap phenomenon we cannot approximate the true solution with such elements! Thus, one is forced to work with non-conforming elements and other advanced schemes. This issue does not only affect "academic" examples such as the ones above, but is also of great concern in applied problems, such as nonlinear elasticity theory.

2.7 Invariances and the Noether theorem

In physics and other applications of the Calculus of Variations, we are often highly interested in symmetries of minimizers or, more generally, critical points. These symmetries manifest themselves in other differential or pointwise relations that are automatically satisfied for any critical point. They can be used to identify concrete solutions or are interesting in their own right, for example as "gauge symmetries" in modern physics. In this section, for simplicity we only consider "sufficiently smooth" ($W^{2,2}_{\mathrm{loc}}$) critical points, which is the natural level of regularity, see Theorem 2.22.

As a first concrete example of a symmetry, consider the Dirichlet integral
\[ \mathcal{F}[u] := \int_\Omega |\nabla u(x)|^2 \, dx, \qquad u \in W^{1,2}(\Omega), \]
which was introduced in Example 2.9. First we notice that $\mathcal{F}$ is invariant under translations in space: Let $u \in (W^{1,2} \cap W^{2,2}_{\mathrm{loc}})(\Omega)$, $\tau \in \mathbb{R}$, $k \in \{1,\ldots,d\}$, and set
\[ x_\tau := x + \tau e_k, \qquad u_\tau(x) := u(x + \tau e_k). \]
Then, for any open $D \subset \mathbb{R}^d$ with $D_\tau := D + \tau e_k \subset \Omega$, we have the invariance
\[ \int_D |\nabla u_\tau(x)|^2 \, dx = \int_{D_\tau} |\nabla u(x_\tau)|^2 \, dx_\tau. \qquad (2.14) \]
The Dirichlet integral also exhibits invariance with respect to scaling: For $\lambda > 0$ set
\[ x_\lambda := \lambda x, \qquad u_\lambda(x) := \lambda^{(d-2)/2} u(\lambda x), \qquad D_\lambda := \lambda D. \qquad (2.15) \]
Then it is not hard to see that (2.14) again holds if we set $\lambda = e^\tau$ (to allow $\tau \in \mathbb{R}$ as before).

The main result of this section, the Noether theorem, roughly says that "differentiable invariances of a functional give rise to conservation laws". More concretely, the two invariances of the Dirichlet integral presented above will yield two additional PDEs that any minimizer of the Dirichlet integral must satisfy.
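The scaling invariance (2.15) is easy to confirm numerically before we formalize matters. In the sketch below the dimension d = 3, the cube (0,1)³ in place of a ball for D, the sample function u and the midpoint-rule quadrature are all illustrative choices; the point is only that the two sides of (2.14) agree for u_λ(x) = λ^((d−2)/2) u(λx).

```python
import numpy as np

d, lam = 3, 1.7                              # dimension and scaling factor lambda
du = lambda x, y, z: np.stack([np.cos(x) * np.cos(y) * z,    # gradient of the sample
                               -np.sin(x) * np.sin(y) * z,   # function u = sin(x) cos(y) z
                               np.sin(x) * np.cos(y)])

def dirichlet(grad, a, n=120):
    """Midpoint-rule approximation of the Dirichlet integral over the cube (0, a)^3."""
    t = (np.arange(n) + 0.5) * a / n
    X, Y, Z = np.meshgrid(t, t, t, indexing="ij")
    return np.sum(grad(X, Y, Z) ** 2) * (a / n) ** 3

# u_lam(x) := lam^((d-2)/2) u(lam x), hence grad u_lam(x) = lam^(d/2) (grad u)(lam x).
du_lam = lambda x, y, z: lam ** (d / 2) * du(lam * x, lam * y, lam * z)

lhs = dirichlet(du_lam, 1.0)                 # integral of |grad u_lam|^2 over D = (0,1)^3
rhs = dirichlet(du, lam)                     # integral of |grad u|^2 over D_lam = (0,lam)^3
print(f"{lhs:.4f}  vs  {rhs:.4f}")           # the two values agree up to quadrature error
```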
To make these statement precise in the general case, we need a bit of notation: Let m u ∈ (W1,2 ∩ W2,2 loc )(Ω; R ) be a critical point of the functional F [u] := ∫ Ω f (x, u(x), ∇u(x)) dx, where f : Ω × Rm × Rm×d → R is twice continuously differentiable. Then, u satisfies the strong Euler–Lagrange equation [ ] − div DA f (x, u, ∇u) + Dv f (x, u, ∇u) = 0 a.e. in Ω. see Proposition 2.20. We consider u to be extended to all of Rd (it will not matter below how we extend u). Our invariance is given through maps g : Rd × R → Rd and H : Rd × R → Rm , which will depend on u above, such that g(x, 0) = x and H(x, 0) = u(x), x ∈ Rd . We also require that g, H are continuously differentiable in their second argument τ for almost every x ∈ Ω. Then set for x ∈ Rd , τ ∈ R, D ⊂⊂ Rd xτ := g(x, τ ), uτ (x) := H(x, τ ), Dτ := g(D, τ ). Therefore, g, H can be thought of as a form of homotopy. We call F invariant under the transformation defined by (g, H) if ∫ D f (x, uτ (x), ∇uτ (x)) dx = ∫ Dτ f (x′ , u(x′ ), ∇u(x′ )) dx′ . (2.16) for all τ ∈ R sufficiently small and all open D ⊂ Rd such that Dτ ⊂⊂ Ω. The main result of this section goes back to Emmy Noether5 and is considered one of the most important mathematical theorems ever proved. Its pivotal idea of systematically generating conservation laws from invariances has proved to be immensely influential in modern physics. 5 The Comment... German mathematician Emmy Noether (23 March 1882 – 14 April 1935) has been called the most important woman in the history of mathematics by Einstein and others. 2.7. INVARIANCES AND THE NOETHER THEOREM 35 Theorem 2.32 (Noether). Let f : Ω × Rm × Rm×d → R be twice continuously differentiable and satisfy the growth bounds |Dv f (x, v, A)|, |DA f (x, v, A)| ≤ M(1 + |v| p + |A| p ), (x, v, A) ∈ Ω × Rm × Rm×d , for some M > 0, 1 ≤ p < ∞, and let the associated functional F be invariant under the transformation defined by (g, H) as above, for which we assume that there exist a pmajorant g ∈ L p (Ω) such that |∂τ H(x, τ )|, |∂τ g(x, τ )| ≤ g(x) for a.e. x ∈ Ω and all τ ∈ R. (2.17) m (W1,2 ∩ W2,2 loc )(Ω; R ) Then, for any critical point u ∈ of F the conservation law [ ] div µ (x) · DA f (x, u, ∇u) − ν (x) · f (x, u, ∇u) = 0 for a.e. x ∈ Ω. (2.18) holds, where µ (x) := ∂τ H(x, 0) ∈ Rm ν (x) := ∂τ g(x, 0) ∈ Rd and (x ∈ Ω) are the Noether multipliers. Corollary 2.33. If f = f (v, A) : R × Rd → R does not depend on x, then every critical point u ∈ (W1,2 ∩ W2,2 loc )(Ω) of F satisfies [ ] ∂ ∂ ∂ δ ( f (u, ∇u) − u) · f (u, ∇u) =0 i A k ik ∑ k d (2.19) i=1 for all k = 1, . . . , d. Proof. Differentiate (2.16) with respect to τ and then set τ = 0. To be able to differentiate the left hand side under the integral, we use the growth assumptions on the derivatives |Dv f (x, v, A)|, |DA f (x, v, A)| and (2.17) to get a uniform (in τ ) majorant, which allows to move the differentiation under the intgral sign. For the right hand side, we need to employ the formula for the differentiation of an integral with respect to a moving domain (this is a special case of the Reynolds transport theorem), d dτ ∫ Dτ f (x, u, ∇u) dx = ∫ ∂ Dτ f (x, u, ∇u)∂τ g(x, τ ) · n dσ , where n is the unit outward normal; this formula can be checked in an elementary way. Abbreviating for readability F := f (x, u(x), ∇u(x)), DA F := DA f (x, u(x), ∇u(x)), and Dv F := Dv f (x, u(x), ∇u(x)), we get D DA F : ∇µ + Dv F · µ dx = ∫ ∂D F ν · n dσ . 
Applying the Gauss–Green theorem, ∫ [ ∫ ] − div DA F + Dv F · µ dx = ∫∂ D D = D [ ] −µ · DA F + ν F · n dσ [ ] div −µ · DA F + ν F dx Comment... ∫ 36 CHAPTER 2. CONVEXITY Now, the Euler–Lagrange equation − div DA F + Dv F = 0 immediately implies ∫ [ ] div −µ · DA F + ν F dx = 0. D and varying D we conclude that (2.18) holds. The corollary follows by considering the invariance xτ := x + τ ek , uτ (x) := u(x + τ ek ), k = 1, . . . , d. Example 2.34 (Brachistochrone problem). In the brachistochrone problem presented in Section 1.1, we were tasked to minimize the functional √ ∫ 1 1 + (y′ )2 F [y] := dx −y 0 √ over all y : [0, 1] → R with y(0) = 0, y(1) = ȳ < 0. The integrand f (v, a) = −(1 + a2 )/v is independent of x, hence from (2.19) we get 1 y′ · Da f (y, y′ ) − f (y, y′ ) = const = − √ 2r for some r > 0 (positive constants lead to inadmissible y). So, √ 1 + (y′ )2 (y′ )2 1 √ − √ = −√ , √ −y 2r 1 + (y′ )2 · −y which we transform into the differential equation (y′ )2 = − 2r − 1. y The solution of this differential equation is called an inverted cycloid with radius r > 0, which is the curve traced out by a fixed point on a sphere of radius r (touching the y-axis at the beginning) that rolls to the right on the bottom of the x-axis, see Figure 2.2. It can be written in parametric form as x(t) = r(t − sint), y(t) = −r(1 − cost), t ∈ R. The radius r > 0 has to be chosen as to satisfy the boundary conditions on one cycloid segment. This is the solution of the brachistochrone problem. Comment... Example 2.35. We return to our canonical example, the Dirichlet integral, see Example 2.9. We know from Example 2.24 that critical points u are smooth. We noticed at the beginning of this section that the Dirichlet integral is invariant under the scaling transformation (2.15) with λ = eτ . By the Noether theorem, this yields the (non-obvious) conservation law [ ] div (2∇u(x) · x + (d − 2)u)∇u − |∇u|2 x = 0. 2.8. SIDE CONSTRAINTS 37 Figure 2.2: The brachistochrone curve (here, x̄ = 1). Integrating this over B(0, r) ⊂ Ω, r > 0, and using the Gauss–Green theorem, we get ( ) x 2 dσ (d − 2) |∇u(x)| dx = r |∇u(x)| − 2 ∇u(x) · |x| B(0,r) ∂ B(0,r) ∫ ∫ 2 2 and then, after some computations, d dr ( 1 ) ∫ rd−2 |∇u(x)| dx = 2 B(0,r) 2 rd−2 ) ( x 2 dσ ≥ 0. ∇u(x) · |x| ∂ B(0,r) ∫ This monotonicity formula implies that 1 rd−2 ∫ B(0,r) |∇u(x)|2 dx is increasing in r > 0. Any harmonic function (−∆u = 0) that is defined on all of Rd satisfies this monotonicity formula. For example, this allows us to draw the conclusion that if d ≥ 3, then u cannot be compactly supported. The formula also shows that the growth around a singularity at zero (or anywhere else by translation) has to behave “more smoothly” than |x|−2 (in fact, we already know that solutions are C∞ ). While these are not particularly strong remarks, they serve to illustrate how Noether’s theorem restricts the candidates for solutions. The last example exhibited a conservation law that was not obvious from the Euler– Lagrange equation. While in principle it could have been derived directly, the Noether theorem gave us a systematic way to find this conservation law from an invariance. 2.8 Side constraints Comment... In many minimization problems, the class of functions over which we minimize our functional may be restricted to include one or several side constraints. To establish existence of a minimizer also in these cases, we first need to extend the Direct Method: 38 CHAPTER 2. CONVEXITY Theorem 2.36 (Direct Method with side constraint). 
Let X be a Banach space and let F , G : X → R ∪ {+∞}. Assume the following: (WH1) Weak Coercivity of F : For all Λ ∈ R, the sublevel set { x ∈ X : F [x] ≤ Λ } is sequentially weakly compact, that is, if F [x j ] ≤ Λ for a sequence (x j ) ⊂ X, then (x j ) has a weakly convergent subsequence. (WH2) Weak lower semicontinuity of F : For all sequences (x j ) ⊂ X with x j ⇀ x it holds that F [x] ≤ lim inf F [x j ]. j→∞ (WH3) Weak continuity of G : For all sequences (x j ) ⊂ X with x j ⇀ x it holds that G [x j ] → G [x]. Assume also that there exists at least one x0 ∈ X with G [x0 ] = 0. Then, the minimization problem F [x] → min over all x ∈ X with G [x] = α ∈ R, has a solution. Proof. The proof is almost exactly the same as the one for the standard Direct Method in Theorem 2.3. However, we need to select the x j for an infimizing sequence with G [x j ] = α . Then, by (WH3), this property also holds for any weak limit x∗ of a subsequence of the x j ’s. A large class of side constraints can be treated using the following simple result: Lemma 2.37. Let g : Ω × Rm → R be a Carathéodory integrand such that there exists M > 0 with |g(x, v)| ≤ M(1 + |v| p ), (x, v) ∈ Ω × Rm . Then, G : W1,p (Ω; Rm ) → R defined through G [u] := ∫ Ω g(x, u(x)) dx. Comment... is weakly continuous. 2.8. SIDE CONSTRAINTS 39 Proof. Let u j ⇀ u in W1,p , whereby after selecting a subsequence and employing the Rellich–Kondrachov Theorem A.18 and Lemma A.3, u j → u in L p (strongly) and almost everywhere. By assumption we have ±g(x, v) + M(1 + |v| p ) ≥ 0. Thus, applying Fatou’s Lemma separately to these two integrands, we get ( ) ∫ ∫ p lim inf ±G [u j ] + M(1 + |u j | ) dx ≥ ±G [u] + M(1 + |u| p ) dx. Ω j→∞ Ω Since ∥u j ∥L p → ∥u∥L p , we can combine these two assertions to get G [u j ] → G [u]. Since this holds for all subsequences, it also follows for our original sequence. We will later see more complicated side constraints, in particular constraints involving the determinant of the gradient. The following result is very important for the calculation of minimizers under side constraints. It effectively says that at a minimizer u∗ of F subject to the side constraint G [u∗ ] = 0, the derivatives of F and G have to be parallel, in analogy with the finitedimensional situation. Theorem 2.38 (Lagrange multipliers). Let f : Ω × Rm × Rm×d → R, g : Ω × Rm → R be Carathéodory integrands and satisfy the growth bounds | f (x, v, A)| ≤ M(1 + |v| p + |A| p ), |g(x, v)| ≤ M(1 + |v| p ), (x, v, A) ∈ Ω × Rm × Rm×d , for some M > 0, 1 ≤ p < ∞, and let f , g be continuously differentiable in v and A. Suppose that u∗ ∈ Wg1,p (Ω; Rm ), where g ∈ W1−1/p,p (∂ Ω; Rm ), minimizes the functional F [u] := ∫ Ω f (x, u(x), ∇u(x)) dx, m u ∈ W1,p g (Ω; R ), under the side constraint G [u] := ∫ Ω g(x, u(x)) dx = α ∈ R. Assume furthermore the consistency condition δ G [u∗ ][v] = ∫ Ω Dv g(x, u∗ (x)) · v(x) dx ̸= 0 (2.20) Comment... m for at least one v ∈ W1,p 0 (Ω; R ). Then, there exists a Lagrange multiplier λ ∈ R such that u∗ is a weak solution of the system of PDEs { [ ] − div DA f (x, u, ∇u) + Dv f (x, u, ∇u) = λ Dv g(x, u) in Ω, (2.21) u=g on ∂ Ω. 40 CHAPTER 2. CONVEXITY Proof. The proof is structurally similar to its finite-dimensional analogue. Let u∗ ∈ Wg1,p (Ω; Rm ) be a minimizer of F under the side constraint G [u] = 0. From m the consistency condition (2.20) we infer that there exists w ∈ W1,p 0 (Ω; R ) such that δ G [u∗ ][w] = ∫ Ω Dv g(x, u∗ ) · w dx = 1. 
Now fix any variation v ∈ W01,p (Ω; Rm ) and define for s,t ∈ R, H(s,t) := G [u∗ + sv + tw]. It is not difficult to see that H is continuously differentiable in both s and t (this uses strong continuity of the integral, see Theorem 2.30) and ∂s H(s,t) = δ G [u∗ + sv + tw][v], ∂t H(s,t) = δ G [u∗ + sv + tw][w]. Thus, by definition of w, H(0, 0) = α ∂t H(0, 0) = 1. and By the implicit function theorem, there exists a function τ : R → R such that τ (0) = 0 and H(s, τ (s)) = α for small |s|. The chain rule yields for such s, 0 = ∂s [H(s, τ (s))] = ∂s H(s, τ (s)) + ∂t H(s, τ (s))τ ′ (s), whereby ′ τ (0) = −∂s H(0, 0) = − ∫ Ω Dv g(x, u∗ ) · v dx. (2.22) Now define for small |s| as above, J(s) := F [u∗ + sv + τ (s)w]. We have G [u∗ + sv + τ (s)w] = H(s, τ (s)) = α and the continuously differentiable function J has a minimum at s = 0 by the minimization property of u∗ . Thus, with the shorthand notations DA f = DA f (x, u∗ , ∇u∗ ) and Dv f = Dv f (x, u∗ , ∇u∗ ), Comment... 0 = J ′ (0) = ∫ Ω DA f : (∇v + τ ′ (0)∇w) + Dv f · (v + τ ′ (0)w) dx. 2.9. NOTES AND BIBLIOGRAPHICAL REMARKS 41 Rearranging and using (2.22), we get ∫ Ω ′ DA f : ∇v + Dv f · v dx = −τ (0) =λ ∫ Ω ∫ Ω DA f : ∇w + Dv f · w dx Dv g(x, u∗ ) · v dx, (2.23) where we have defined λ := ∫ Ω DA f : ∇w + Dv f · w dx. Since (2.23) shows that u∗ is a weak solution of (2.21), the proof is finished. Example 2.39 (Stationary Schrödinger equation). When looking for ground states in quantum mechanics as in Section 1.3, we have to minimize E [Ψ] := ∫ RN 1 h̄2 |∇Ψ|2 + V (x)|Ψ|2 dx 2µ 2 over all Ψ ∈ W1,2 (RN ; C) under the side constraint ∥Ψ∥L2 = 1. From Theorem 2.36 in conjunction with Lemma 2.37 (slightly extended to also apply to the whole space Ω = RN ) we see that this problem always has at least one solution. Theorem 2.38 (likewise extended) yields that this Ψ satisfies [ 2 ] −h̄ ∆ +V (x) Ψ(x) = EΨ(x), x ∈ Ω, 2µ for some Lagrange multiplier E ∈ R, which is precisely the stationary Schrödinger equa2 tion. One can also show that E > 0 is the smallest eigenvalue of the operator Ψ 7→ [ −h̄ 2µ ∆ + V (x)]Ψ. 2.9 Notes and bibliographical remarks Comment... If a given problem has some convexity, this is invariable very useful and the analysis can draw on a vast theory. We have not mentioned the general field of Convex Analysis, see for example the classic texts [83] or [34]. In this area, however, the emphasis is more on extending the convex theory to “non-smooth” situations, see for example the monographs [84] and [66, 67]. For the u-dependent variational integrals, the growth in the u-variable can be improved up to q-growth, where q < p/(p − d) (the Sobolev embedding exponent for W1,p ). Moreover, we can work with the more general growth bounds | f (x, v, A)| ≤ M(g(x) + |v|q + |A| p ), with g ∈ L1 (Ω; [0, ∞)) and q < p/(p − d). For reasons of simplicity, we have omitted these generalization here. 42 CHAPTER 2. CONVEXITY Comment... The Fundamental Lemma of Calculus of Variations 2.21 is due to Du Bois-Raymond and is sometimes named after him. Difference quotients were already considered by Newton, their application to regularity theory is due to Nirenberg in the 1940s and 1950s. Many of the fundamental results of regularity theory (albeit in a non-variational context) can be found in [43]. The book by Giusti [44] contains theory relevant for variational questions. A nice framework for Schauder estimates is [86]. De Giorgi’s 1968 counterexample is from [29]. 
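Returning briefly to Example 2.39: in one space dimension the constrained minimization and its Lagrange multiplier can be made concrete by discretization. In the sketch below, ħ = μ = 1, the harmonic potential V(x) = x²/2, the truncated interval and the finite-difference grid are all illustrative choices; it confirms that the smallest eigenvalue of the discretized operator is both the minimal energy among normalized trial states and the multiplier E in the stationary Schrödinger equation.

```python
import numpy as np

# Discretize E[psi] = int |psi'|^2 / 2 + V(x) |psi|^2 dx on [-L, L] with Dirichlet
# boundary values; here hbar = mu = 1 and V(x) = x^2 / 2 (harmonic oscillator).
L, n = 8.0, 800
x = np.linspace(-L, L, n + 2)[1:-1]                    # interior grid points
h = x[1] - x[0]
H = (np.diag(1.0 / h**2 + 0.5 * x**2)                  # H = -(1/2) d^2/dx^2 + V
     - np.diag(np.full(n - 1, 0.5 / h**2), 1)
     - np.diag(np.full(n - 1, 0.5 / h**2), -1))

rayleigh = lambda v: (v @ H @ v) / (v @ v)             # discrete E[v] / ||v||_{L^2}^2

# Gaussian trial states of various widths give upper bounds for the constrained minimum:
print({s: round(rayleigh(np.exp(-x**2 / (2 * s**2))), 4) for s in (0.5, 1.0, 2.0)})

# The constrained minimum and its Lagrange multiplier: the smallest eigenvalue of H.
E, psi = np.linalg.eigh(H)
E0, psi0 = E[0], psi[:, 0] / np.sqrt(h)                # ground state with ||psi0||_{L^2} = 1
print(f"E0 = {E0:.5f} (exact value 1/2);  ||H psi0 - E0 psi0|| = "
      f"{np.linalg.norm(H @ psi0 - E0 * psi0):.1e}")
```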
Noether's theorem has many ramifications and can be put into a very general form in Hamiltonian systems and Lie-group theory. The idea is to study groups of symmetries and their actions. For an introduction into this diverse field see [80]. Example 2.35 about the monotonicity formula is from [36].

Side constraints can also take the form of inequalities, which is often the case in Optimization Theory or Mathematical Programming. This leads to Karush–Kuhn–Tucker conditions or obstacle problems.

Chapter 3  Quasiconvexity

We saw in the last chapter that convexity of the integrand implies lower semicontinuity for integral functionals. Moreover, we saw in Proposition 2.11 that also the converse is true in the scalar case $d = 1$ or $m = 1$. In the vectorial case ($d, m > 1$), however, it turns out that one can find weakly lower semicontinuous integral functionals whose integrand is non-convex; the following is the most prominent one: Let $p \geq d$ ($\Omega \subset \mathbb{R}^d$ a bounded Lipschitz domain as usual) and define
\[ \mathcal{F}[u] := \int_\Omega \det \nabla u(x) \, dx, \qquad u \in W^{1,p}_0(\Omega;\mathbb{R}^d). \]
Then, we can argue using Stokes' Theorem (this can also be computed in a more elementary way, see Lemma 3.8 below):
\[ \mathcal{F}[u] = \int_\Omega \det \nabla u \, dx = \int_\Omega du^1 \wedge \cdots \wedge du^d = \int_{\partial\Omega} u^1 \wedge du^2 \wedge \cdots \wedge du^d = 0 \]
because $u \in W^{1,d}_0(\Omega;\mathbb{R}^d)$ is zero on the boundary $\partial\Omega$. Thus, $\mathcal{F}$ is in fact constant on $W^{1,p}_0(\Omega;\mathbb{R}^d)$, hence trivially weakly lower semicontinuous. However, the determinant function is far from being convex if $d \geq 2$; for instance we can easily convex-combine a matrix with positive determinant from two singular ones. But we can also find examples not involving singular matrices: For
\[ A := \begin{pmatrix} -1 & -2 \\ 2 & 1 \end{pmatrix}, \qquad B := \begin{pmatrix} 1 & -2 \\ 2 & -1 \end{pmatrix}, \qquad \frac12 A + \frac12 B = \begin{pmatrix} 0 & -2 \\ 2 & 0 \end{pmatrix}, \]
we have $\det A = \det B = 3$, but $\det(A/2 + B/2) = 4$.

Furthermore, convexity of the integrand is not compatible with one of the most fundamental principles of continuum mechanics, frame-indifference or objectivity. Assume that our integrand $f = f(A)$ is frame-indifferent, that is,
\[ f(RA) = f(A) \qquad \text{for all } A \in \mathbb{R}^{d \times d}, \ R \in \mathrm{SO}(d), \]
where $\mathrm{SO}(d)$ is the set of $(d \times d)$-orthogonal matrices with determinant $1$ (i.e. rotations). Furthermore, assume that every purely compressive or purely expansive deformation costs energy, i.e.
\[ f(\alpha I) > f(I) \qquad \text{for all } \alpha \neq 1, \qquad (3.1) \]
which is very reasonable in applications. Then, $f$ cannot be convex: Let us for simplicity assume $d = 2$. Set, for a fixed $\gamma \in (0, 2\pi)$,
\[ R := \begin{pmatrix} \cos\gamma & -\sin\gamma \\ \sin\gamma & \cos\gamma \end{pmatrix} \in \mathrm{SO}(2). \]
Then, if $f$ was convex, for $M := (R + R^T)/2 = (\cos\gamma) I$ we would get
\[ f((\cos\gamma) I) = f(M) \leq \frac12 \big( f(R) + f(R^T) \big) = f(I), \]
contradicting (3.1). Sharper arguments are available, but the essential conclusion is the same: convexity is not suitable for many variational problems originating from continuum mechanics.

3.1 Quasiconvexity

We are now ready to replace convexity with a more general notion, which has become one of the cornerstones of the modern Calculus of Variations. This is the notion of quasiconvexity, introduced by Charles B. Morrey in the 1950s: A locally bounded Borel-measurable function $h \colon \mathbb{R}^{m \times d} \to \mathbb{R}$ is called quasiconvex if
\[ h(A) \leq \fint_{B(0,1)} h(A + \nabla\psi(z)) \, dz \qquad (3.2) \]
for all $A \in \mathbb{R}^{m \times d}$ and all $\psi \in W^{1,\infty}_0(B(0,1);\mathbb{R}^m)$. Here, $\fint_{B(0,1)} := |B(0,1)|^{-1} \int_{B(0,1)}$, where $B(0,1)$ is the unit ball in $\mathbb{R}^d$.
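A quick numerical illustration of the two points just made: the determinant is not convex (witnessed by the matrices A and B from the beginning of this chapter), yet it satisfies the quasiconvexity inequality (3.2), in fact with equality. In the sketch below the unit square replaces the unit ball, which is legitimate by Lemma 3.2 below, and the test map ψ, its amplitudes and the quadrature grid are illustrative choices.

```python
import numpy as np

# Failure of convexity along a (non rank-one) segment:
A = np.array([[-1.0, -2.0], [2.0, 1.0]])
B = np.array([[1.0, -2.0], [2.0, -1.0]])
print(f"det A = {np.linalg.det(A):.0f}, det B = {np.linalg.det(B):.0f}, "
      f"det(A/2 + B/2) = {np.linalg.det((A + B) / 2):.0f}")

# Quasiconvexity test for h = det: average det(A + grad psi) over the domain, where
# psi1 = a sin(pi x) sin(pi y), psi2 = b sin(2 pi x) sin(pi y) vanishes on the boundary.
n = 400
t = (np.arange(n) + 0.5) / n
X, Y = np.meshgrid(t, t, indexing="ij")
a, b = 0.3, 0.7                                        # arbitrary amplitudes
grad_psi = np.array([
    [a * np.pi * np.cos(np.pi * X) * np.sin(np.pi * Y),          # d psi1 / dx
     a * np.pi * np.sin(np.pi * X) * np.cos(np.pi * Y)],         # d psi1 / dy
    [2 * b * np.pi * np.cos(2 * np.pi * X) * np.sin(np.pi * Y),  # d psi2 / dx
     b * np.pi * np.sin(2 * np.pi * X) * np.cos(np.pi * Y)],     # d psi2 / dy
])
F = A[:, :, None, None] + grad_psi                     # the field A + grad psi(z)
dets = F[0, 0] * F[1, 1] - F[0, 1] * F[1, 0]
print(f"average of det(A + grad psi): {dets.mean():.4f}   (compare det A = 3)")
```

That the average equals det A exactly, and not merely bounds it from below, reflects that the determinant is even quasiaffine, see Corollary 3.9 below.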
Let us give a physical interpretation to the notion of quasiconvexity: Let d = m = 3 and assume that F [y] := ∫ h(∇y(x)) dx, B(0,1) y ∈ W1,∞ (B(0, 1); R3 ), models the physical energy of an elastically deformed body, whose deformation from the reference configuration is given through y : Ω = B(0, 1) → R3 (see Section 1.4 for more details on this modeling). A special class of deformations are the affine ones, say a(x) = y0 + Ax for some y0 ∈ R3 , A ∈ R3×3 . Then, quasiconvexity of f entails that F [a] = ∫ B(0,1) h(A) dx ≤ ∫ B(0,1) h(A + ∇ψ (z)) dz Comment... 3 for all ψ ∈ W1,∞ 0 (B(0, 1); R ). This means that the affine deformation a is always energetically favorable over the internally distorted deformation a(x) + ψ (x); this is very often a 3.1. QUASICONVEXITY 45 reasonable assumption for real materials. This argument also holds on any other domain by Lemma 3.2 below. To justify its name, we also need to convince ourselves that quasiconvexity is indeed ∫ a notion of convexity: For A ∈ Rm×d and V ∈ L1 (B(0, 1); Rm×d ) with B(0,1) V (x) dx = 0 define the probability measure µ ∈ M1 (Rm×d ) via its action as an element of C0 (Rm×d )∗ as follows (recall that M(Rm×d ) ∼ = C0 (Rm×d )∗ by the Riesz Representation Theorem A.10): ∫ ⟨ ⟩ h, µ := − h(A +V (x)) dx B(0,1) for h ∈ C0 (Rm×d ). This µ is easily seen to be an element of the dual space to C0 (Rm×d ), and in fact µ is a probability measure: For the boundedness we observe |⟨h, µ ⟩| ≤ ∥h∥∞ , whereas the positivity ⟨h, µ ⟩ ≥ 0 for h ≥ 0 and the normalization ⟨1, µ ⟩ = 1 follow at once. The barycenter [µ ] of µ is ∫ ⟩ ⟨ [µ ] = id, µ = A + − V (x) dx = A. B(0,1) Therefore, if h is convex, we get from then Jensen inequality, Lemma A.9, ⟩ ∫ ⟨ h(A) = h([µ ]) ≤ h, µ = − h(A +V (x)) dx. B(0,1) m In particular, (3.2) holds if we set V (x) := ∇ψ (x) for any ψ ∈ W1,∞ 0 (B(0, 1); R ). Thus, we have shown: Proposition 3.1. All convex functions h : Rm×d → R are quasiconvex. However, quasiconvexity is weaker than classical convexity, at least if d, m ≥ 2. The determinant function and, more generally, minors are quasiconvex, as will be proved in the next section, but these minors (except for the (1 × 1)-minor) are not convex. We will see many further quasiconvex functions in the course of this and the next chapter. The relation between (quasi)convexity and Jensen’s inequality plays a major role in the theory, as we will see soon. Other basic properties of quasiconvexity are the following: Lemma 3.2. In the definition of quasiconvexity we can replace the domain B(0, 1) by any bounded Lipschitz domain Ω′ ⊂ Rd . Furthermore, if h has p-growth, i.e. |h(A)| ≤ M(1 + |A| p ) for some 1 ≤ p < ∞, M > 0, then in the definition of quasiconvexity we can 1,p replace testing with ψ of class W1,∞ 0 by ψ of class W0 . Proof. To see the first statement, we will prove the following: If ψ ∈ W1,p (Ω; Rm ), where Ω ⊂ Rd is a bounded Lipschitz domain, satisfies ψ |∂ Ω = Ax for some A ∈ Rm×d , then there exists ψ̃ ∈ W1,p (Ω′ ; Rm ) with ψ̃ |∂ Ω′ = Ax and, if one of the following integrals exists and is finite, ∫ − h(∇ψ ) dx = − h(∇ψ̃ ) dy Ω Ω′ for all measurable h : Rm×d → R. (3.3) Comment... ∫ 46 CHAPTER 3. QUASICONVEXITY In particular, ∥∇ψ ∥L p = ∥∇ψ̃ ∥L p . Clearly, this implies that the definition of quasiconvexity is independent of the domain. To see (3.3), take a Vitali cover of Ω′ with re-scaled versions of Ω, see Theorem A.8, N ⊎ Ω′ = Z ⊎ Ω(ak , rk ), |Z| = 0, k=1 with ak ∈ Ω, rk > 0, Ω(ak , rk ) := ak + rk Ω, where k = 1, . . . , N ∈ N. 
Then let ( ) r ψ y − ak + Aa if y ∈ Ω(a , r ), k k k k rk ψ̃ (y) := y ∈ Ω′ . 0 otherwise, It holds that ψ̃ ∈ W1,∞ (Ω′ ; Rm ). We compute for any measurable h : Rm×d → R, )) ( ( ∫ ∫ y − ak dy h(∇ψ̃ ) dy = ∑ h ∇ψ rk Ω′ k Ω(ak ,rk ) = ∑ rkd k = |Ω′ | |Ω| ∫ ∫ Ω Ω h(∇ψ ) dx h(∇ψ ) dx since ∑k rkd = |Ω′ |/|Ω|. This shows (3.3). 1,p m m The second assertion follows since W1,∞ 0 (B(0, 1); R ) is dense in W0 (B(0, 1); R ) and under a p-growth assumption, the integral functional ∫ ψ 7→ − h(∇ψ ) dx, Ω m ψ ∈ W1,p 0 (B(0, 1); R ), is well-defined and continuous, the latter for example by Pratt’s Theorem A.5, see the proof of Theorem 2.30 for a similar argument. An even weaker notion of convexity is the following one: A locally bounded Borelmeasurable function h : Rm×d → R is called rank-one convex if it is convex along any rank-one line, that is, ( ) h θ A + (1 − θ )B ≤ θ h(A) + (1 − θ )h(B) for all A, B ∈ Rm×d with rank(A − B) ≤ 1 and all θ ∈ (0, 1). In this context, recall that a matrix M ∈ Rm×d has rank one if and only if M = a ⊗ b = abT for some a ∈ Rm , b ∈ Rd . Proposition 3.3. If h : Rm×d → R is quasiconvex, then it is rank-one convex. Proof. Let A, B ∈ Rm×d with B − A = a ⊗ n for a ∈ Rm and n ∈ Rd with |n| = 1. Denote by Qn a unit cube (|Qn | = 1) centered at the origin and with one side normal to n. By Lemma 3.2 we can require h to satisfy ∫ h(M) ≤ − h(∇ψ (z)) dz Comment... Qn (3.4) 3.1. QUASICONVEXITY 47 Figure 3.1: The function φ0 . for all M ∈ Rm×d and all ψ ∈ W1,∞ (Qn ; Rm ) with ψ (x) = Mx on ∂ Qn (in the sense of trace). For fixed θ ∈ (0, 1), we now let M := θ A + (1 − θ )B and define the sequence of test functions φ j ∈ W1,∞ (Qn ; Rm ) as follows: ) 1 ( φ j (x) := Mx + φ0 jx · n − ⌊ jx · n⌋ a, j x ∈ Qn , where ⌊s⌋ denotes the largest integer below or equal to s ∈ R, and { −(1 − θ )t if t ∈ [0, θ ], φ0 (t) := t ∈ [0, 1). θt − θ if t ∈ (θ , 1], Such φ j are called laminations in direction n. See Figures 3.1 and 3.2 for φ0 and φ j , respectively. We calculate { M − (1 − θ )a ⊗ n = A if jx · n − ⌊ jx · n⌋ ∈ (0, θ ), ∇φ j (x) = M +θa⊗n = B if jx · n − ⌊ jx · n⌋ ∈ (θ , 1). Thus, ∫ lim − h(∇φ j (x)) dx = θ h(A) + (1 − θ )h(B). j→∞ Qn ∗ Also notice that φ j ⇁ a in W1,∞ for a(x) := Mx since φ0 is uniformly bounded. We will show below that we may replace the sequence (φ j ) with a sequence (ψ j ) ⊂ W1,∞ (Qn ; Rm ) with the additional property ψ (x) = Mx on ∂ Qn , but such that still ∫ lim − h(∇ψ j (x)) dx = θ h(A) + (1 − θ )h(B). Comment... j→∞ Qn 48 CHAPTER 3. QUASICONVEXITY Figure 3.2: The laminate φ j . Combining this with (3.4) allows us to conclude ( ) h θ A + (1 − θ )B ≤ θ h(A) + (1 − θ )h(B) and h is indeed rank-one convex. It remains to construct the sequence (ψ j ) from (φ j ), for which we employ a standard cut-off construction: Take a sequence (ρ j ) ⊂ C∞ c (Qn ; [0, 1]) of cut-off functions such that { } for G j := x ∈ Ω : ρ j (x) = 1 it holds that |Qn \ G j | → 0 as j → ∞. With a(x) = Mx set ψ j,k := ρ j φk + (1 − ρ j )a, which lies in W1,∞ (Qn ; Rm ) and satisfies ψ j,k (x) = Mx near ∂ Qn . Also, ∇ψ j,k = ρ j ∇φk + (1 − ρ j )M + (φk − a) ⊗ ∇ρ j . Since W1,∞ (Qn ; Rm ) embeds compactly into L∞ (Qn ; Rm ) by the Rellich-Kondrachov theorem (or the classical Arzéla–Ascoli theorem), we have φk → a uniformly. Thus, in ∥∇ψ j,k ∥L∞ ≤ ∥∇φk ∥L∞ + |M| + ∥(φk − a) ⊗ ∇ρ j ∥L∞ the last term vanishes as k → ∞. As the ∇φk furthermore are uniformly L∞ -bounded, we can for every j ∈ N choose k( j) ∈ N such that ∥∇ψ j,k( j) ∥L∞ is bounded by a constant that is independent of j. 
Then, as h is assumed to be locally bounded, this implies that there exists C > 0 (again independent of j) with Comment... ∥h(∇φ j )∥L∞ + ∥h(∇ψ j,k( j) )∥L∞ ≤ C. 3.1. QUASICONVEXITY 49 Hence, for ψ j := ψ j,k( j) , we may estimate ∫ lim j→∞ Qn |h(∇φ j ) − h(∇ψ j )| dx = lim ∫ j→∞ Qn \G j |h(∇φ j )| + |h(∇ψ j,k( j) )| dx ≤ lim C|Qn \ G j | = 0. j→∞ This shows that above we may replace (φ j ) by (ψ j ). Since for d = 1 or m = 1, rank-one convexity obviously is equivalent to convexity, we have for the scalar case: Corollary 3.4. If h : Rm×d → R is quasiconvex and d = 1 or m = 1, then h is convex. The following is a standard example: Example 3.5 (Alibert–Dacorogna–Marcellini). For d = m = 2 and γ ∈ R define ( ) hγ (A) := |A|2 |A|2 − 2γ det A , A ∈ R2×2 . For this function it is known that √ 2 2 ≈ 0.94, • hγ is convex if and only if |γ | ≤ 3 2 • hγ is rank-one convex if and only if |γ | ≤ √ ≈ 1.15, 3 ] ( 2 √ • hγ is quasiconvex if and only if |γ | ≤ γQC for some γQC ∈ 1, . 3 It is currently unknown whether γQC = √23 . We do not prove these statements here, see Section 5.3.8 in [25] for the (involved!) details. We end this section with the following regularity theorem for rank-one convex (and hence also for quasiconvex) functions: Proposition 3.6. If h : Rm×d → R is rank-one convex (and locally bounded), then it is locally Lipschitz continuous. Proof. We will prove the stronger quantitative bound lip(h; B(A0 , r)) ≤ √ min(d, m) · osc(h; B(A0 , 6r)) , 3r (3.5) where |h(A) − h(B)| |A − B| A,B∈B(A0 ,r) sup Comment... lip(h; B(A0 , r)) := 50 CHAPTER 3. QUASICONVEXITY is the Lipschitz constant of h on the ball B(A0 , r) ⊂ Rm×d , and osc(h; B(A0 , r)) := sup |h(A) − h(B)| A,B∈B(A0 ,r) is the oscillation of h on B(A0 , r). By the local boundedness, the oscillation is bounded on every ball, and so, locally, the Lipschitz constant is finite. To show (3.5), let A, B ∈ B(x0 , r) and assume first that rank(A − B) ≤ 1. Define the point C ∈ Rm×d as the intersection of ∂ B(A0 , 2r) with the ray starting at B and going through A. Then, because h is convex along this ray, |h(A) − h(B)| |h(C) − h(B)| osc(h; B(A0 , 2r)) ≤ ≤ =: α (2r). |A − B| |C − B| r (3.6) For general A, B ∈ B(A0 , r), use the (real) singular value decomposition to write min(d,m) ∑ B−A = σi P(ei ⊗ ei )QT , i=1 where σi ≥ 0 is the i’th singular value, and P ∈ Rm×m , Q ∈ Rd×d are orthogonal matrices. Set k−1 Ak := A + ∑ σi P(ei ⊗ ei )QT , k = 1, . . . , min(d, m) + 1, i=1 for which we have A1 = A, Amin(d,m)+1 = B. We estimate (recall that we are employing the √ √ Frobenius norm |M| := ∑ j,k (Mkj )2 = ∑i σi (M)2 ) √ k−1 ∑ σi2 ≤ |A − A0 | + |B − A| < 3r |Ak − A0 | ≤ |A − A0 | + i=1 and min(d,m) ∑ min(d,m) ∑ |Ak − Ak+1 |2 = k=1 σi2 = |A − B|2 . k=1 Applying (3.6) to Ak , Ak+1 ∈ B(A0 , 3r), k = 1, . . . , min(d, m), we get min(d,m) |h(A) − h(B)| ≤ ∑ |h(Ak ) − h(Ak+1 )| k=1 min(d,m) ≤ α (6r) ∑ |Ak − Ak+1 | k=1 √ ≤ α (6r) min(d, m) · [ ∑ k=1 √ = α (6r) min(d, m) · |A − B|. Comment... This yields (3.5). ]1/2 min(d,m) |Ak − Ak+1 |2 3.2. NULL-LAGRANGIANS 51 Remark 3.7. An improved argument (see Lemma 2.2 in [12]), where one orders the singular values in a particular way, allows to establish the better estimate √ lip(h; B(A0 , r)) ≤ min(d, m) · osc(h; B(A0 , 2r)) . r 3.2 Null-Lagrangians The determinant is quasiconvex, but it is only one representative of a larger class of canonical examples of quasiconvex, but not convex, functions: In this section, we will investigate the properties of minors (subdeterminants) as integrands. Let for r ∈ {1, 2, . . . 
, min(d, m)}, { } I ∈ P(m, r) := (i1 , i2 , . . . , ir ) ∈ {1, . . . , m}r : i1 < i2 < · · · < ir and J ∈ P(d, r) be ordered multi-indices. Then, a (r × r)-minor M : Rm×d → R is a function of the form ( ) M(A) = MJI (A) := det AIJ , where AIJ is the (square) matrix consisting of the I-rows and J-columns of A. The number r ∈ {1, . . . , min(d, m)} is called the rank of the minor M. The first result shows that all minors are null-Lagrangians, which by definition is the ∫ class of integrands h : Rm×d such that Ω h(∇u) dx only depends on the boundary values of u. Lemma 3.8. Let M : Rm×d → R be a (r × r)-minor, r ∈ {1, . . . , min(d, m)}. If u, v ∈ m W1,p (Ω; Rm ), p ≥ r, with u − v ∈ W1,p 0 (Ω; R ), then ∫ Ω ∫ M(∇u) dx = Ω M(∇v) dx. Proof. In all of the following, we will assume that u, v are smooth and supp(u − v) ⊂⊂ Ω, which can be achieved by approximation (incorporating a cut-off procedure), see Theorem. A.19. We also need that taking the minor M of the gradient commutes with strong convergence, i.e. the strong continuity of u 7→ M(∇u) in W1,p for p ≥ r; this follows by Hadamard’s inequality |M(A)| ≤ |A|r and Pratt’s Theorem A.5. Finally, we assume (again by approximation) that supp(u − v) ⊂⊂ Ω. All first-order minors are just the entries of ∇u, ∇v and the result follows from ∫ Ω ∇u dx = ∫ ∂Ω u · n dσ = ∫ ∂Ω v · n dσ = ∫ Ω ∇v dx m since u − v ∈ W1,p 0 (Ω; R ). For higher-order minors, the crucial observation is that minors of gradients can be written as divergences, which we will establish below. So, if M(∇u) = div G(u, ∇u) for some G(u, ∇u) ∈ C∞ (Ω; Rd ), then, since supp(u − v) ⊂⊂ Ω, Ω ∫ M(∇u) dx = ∂Ω G(u, ∇u) · n dσ = ∫ ∂Ω G(v, ∇v) · n dσ = ∫ Ω M(∇v) dx Comment... ∫ 52 CHAPTER 3. QUASICONVEXITY and the result follows. We first consider the physically most relevant cases d = m ∈ {2, 3}. For d = m = 2 and u = (u1 , u2 )T , the only second-order minor is the Jacobian determinant and we easily see from the fact that second derivatives of smooth functions commute that det ∇u = ∂1 u1 ∂2 u2 − ∂2 u1 ∂1 u2 = ∂1 (u1 ∂2 u2 ) − ∂2 (u1 ∂1 u2 ) ) ( = div u1 ∂2 u2 , −u1 ∂1 u2 . ¬k (A), i.e. the determinant of A after For d = m = 3, consider a second-order minor M¬l deleting the k’th row and l’th column. Then, analogously to the situation in dimension two, we get, using cyclic indices k, l ∈ {1, 2, 3}, ] [ ¬k M¬l (∇u) = (−1)k+l ∂l+1 uk+1 ∂l+2 uk+2 − ∂l+2 uk+1 ∂l+1 uk+2 ] [ = (−1)k+l ∂l+1 (uk+1 ∂l+2 uk+2 ) − ∂l+2 (uk+1 ∂l+1 uk+2 ) . (3.7) For the three-dimensional Jacobian determinant, we will show 3 ( ) det ∇u = ∑ ∂l u1 (cof ∇u)1l , (3.8) l=1 ¬k (A). To see this, use the Cramer formula where we recall that (cof A)kl = (−1)k+l M¬l (det A)I = A(cof A)T , which holds for any matrix A ∈ Rn×n , to get 3 det ∇u = ∑ ∂l u1 · (cof ∇u)1l . l=1 Then, our formula (3.8) follows from the Piola identity div cof ∇u = 0, (3.9) ¬k (∇u). Hence, (3.8) holds. which can be verified directly from the expression (3.7) for M¬l For general dimensions d, m we use the notation of differential geometry to tame the multilinear algebra involved in the proof. So, let M be a (r × r)-minor. Reordering x1 , . . . , xd and u1 , . . . , um , we can assume without loss of generality that M is a principal minor, i.e. M is the determinant of the top-left (r × r)-submatrix. Then, in differential geometry it is well-known that M(∇u) dx1 ∧ · · · ∧ dxd = du1 ∧ · · · ∧ dur ∧ dxr+1 ∧ · · · ∧ dxd = d(u1 ∧ du2 ∧ · · · ∧ dur ∧ dxr+1 ∧ · · · ∧ dxd ). Thus, Stokes Theorem from Differential Geometry gives ∫ Ω M(∇u) dx1 ∧ · · · ∧ dxd = = Comment... 
Therefore, ∫ Ω M(∇u) dx ∫ Ω ∫ d(u1 ∧ du2 ∧ · · · ∧ dur ∧ dxr+1 ∧ · · · ∧ dxd ) ∂Ω 1 ∧ · · · ∧ dxd u1 ∧ du2 ∧ · · · ∧ dur ∧ dxr+1 ∧ · · · ∧ dxd . only depends on the values of u around ∂ Ω. 3.2. NULL-LAGRANGIANS 53 Consequently, we immediately have: Corollary 3.9. All (r × r)-minors M : Rm×d → R are quasiaffine, that is, both M and −M are quasiconvex. We will later use this property to define a large and useful class of quasiconvex functions, namely the polyconvex ones; this is the topic of the next chapter. Further, minors enjoy a surprising continuity property: Lemma 3.10 (Weak continuity of minors). Let M : Rm×d → R be an (r × r)-minor, r ∈ {1, . . . , min(d, m)}, and (u j ) ⊂ W1,p (Ω; Rm ) where r < p ≤ ∞. If u j ⇀ u in W1,p ∗ (or u j ⇁ u in W1,∞ ). then ∗ M(∇u j ) ⇀ M(∇u) in L p/r (or M(∇u j ) ⇁ M(∇u) in W1,∞ ). Proof. For d = m ∈ {2, 3}, we use the special structure of minors as divergences, as ex¬k in three dimensions (that is, hibited in the proof of Lemma 3.8. For a (2 × 2)-minor M¬l deleting the k’th row and l’th column; in two dimensions there is only one (2 × 2)-minor, the determinant, but we still use the same notation), we rely on (3.7) to observe that with cyclic indices k, l ∈ {1, 2, 3}, ∫ Ω ¬k M¬l (∇u j )ψ dx = −(−1)k+l ∫ Ω k+1 k+2 k+2 (uk+1 j ∂l+2 u j )∂l+1 ψ − (u j ∂l+1 u j )∂l+2 ψ dx p/r (Ω)∗ ∼ L p/(p−r) (Ω). Since for all ψ ∈ C∞ = c (Ω) and then by density also for all ψ ∈ L 1,p p u j ⇀ u in W , we have u j → u strongly in L . The above expression consists of products of one L p -weakly and one L p -strongly convergent factor as well as a uniformly bounded function under the integral. Hence the integral is weakly continuous and as j → ∞ converges to ∫ Ω ¬k M¬l (∇u)ψ dx. This finishes the proof for d = m = 2. For d = m = 3, we additionally need to consider the determinant. However, as a consequence of the two-dimensional reasoning, cof ∇u j ⇀ cof ∇u in L p/2 . Then, (3.8) implies ∫ 3 Ω det ∇u j ψ dx = − ∑ ∫ [ ] u1j (cof ∇u j )1l ∂l ψ dx l=1 Ω and by a similar reasoning as above, this converges to 3 −∑ ∫ [ ∫ ] 1 1 u (cof ∇u)l ∂l ψ dx = det ∇u ψ dx. l=1 Ω Ω Comment... In the general case, one proceeds by induction. We omit the details. 54 CHAPTER 3. QUASICONVEXITY It can also be shown that any quasiaffine function can be written as an affine function of all the minors of the argument. This characterization of quasiaffine functions is due to Ball [5], a different proof (also including further characterizing statements) can be found in Theorem 5.20 of [25]. 3.3 Young measures We now introduce a very versatile tool for the study of minimization problems – the Young measure, named after its inventor Laurence Chisholm Young. In this chapter, we will only use it as a device to organize the proof of lower semicontinuity for quasiconvex integral functions, see Theorems 3.25, 3.29, but later it will also be important as an object in its own right. Let us motivate this device by the following central issue: Assume that we have a weakly convergent sequence v j ⇀ v in L2 (Ω), where Ω ⊂ Rd is a bounded Lipschitz domain, and an integral functional F [v] := ∫ Ω f (x, v(x)) dx with f : Ω × R → R continuous and bounded, say. Then, we may want to know the limit of F [v j ] as j → ∞, for instance if (v j ) is a minimizing sequence. In fact, this will be cornerstone of the proof of weak lower semicontinuity for integral functionals with quasiconvex integrand. Equivalently, we could ask for the L∞ -weak* limit of the sequence of compound functions g j (x) := f (x, v j (x)). 
It is easy to see that the weak* limit in general is not equal to x 7→ f (x, v(x)). For example, in Ω = (0, 1), consider the sequence { a if jx − ⌊ jx⌋ ≤ θ , x ∈ Ω, v j (x) := b otherwise, where a, b ∈ R, θ ∈ (0, 1), and ⌊s⌋ is the largest integer below s ∈ R. Then, if f (x, a) = α ∈ ∗ R and f (x, b) = β ∈ R (and smooth and bounded elsewhere), we see v j ⇁ θ a + (1 − θ )b and ∗ f (x, v j (x)) ⇁ h(x) = θ α + (1 − θ )β . The right-hand side, however, is not necessarily f (x, θ a + (1 − θ )b). Instead, we could write ∫ h(x) = f (x, v) dνx (v) with a x-parametrized family of probability measures νx ∈ M1 (R), namely νx = θ δα + (1 − θ )δβ , x ∈ (0, 1). Comment... This family (νx )x∈Ω is the “Young measure generated by the sequence (v j )”. We start with the following fundamental result: 3.3. YOUNG MEASURES 55 Theorem 3.11 (Fundamental theorem of Young measure theory). Let (V j ) ⊂ L p (Ω; RN ) be a bounded sequence, where 1 ≤ p ≤ ∞. Then, there exists a subsequence (not relabeled) and a family (νx )x∈Ω ⊂ M1 (RN ) of probability measures, called the p-Young measure generated by the sequence (V j ), such that the following assertions are true: (I) The family (νx ) is weakly* measurable, that is, for all Carathéodory integrands f : Ω × RN → R, the compound function x 7→ ∫ f (x, A) dνx (A), x ∈ Ω, is Lebesgue-measurable. (II) If 1 ≤ p < ∞, it holds that ⟨⟨ p ⟩⟩ ∫ ∫ | q| , ν := |A| p dνx (A) dx < ∞ Ω or, if p = ∞, there exists a compact set K ⊂ RN such that supp νx ⊂ K for a.e. x ∈ Ω. (III) For all Carathéodory integrands f : Ω × RN → R such that the family ( f (x,V j (x))) j is uniformly L1 -bounded and equiintegrable, it holds that ) ( ∫ in L1 (Ω). (3.10) f (x,V j (x)) ⇀ x 7→ f (x, A) dνx (A) For parametrized measures ν = (νx )x∈Ω that satisfy (I) and (II) above, we write ν = (νx ) ∈ Y p (Ω; RN ), the latter being called the set of p-Young measures. In symbols we express the generation of a Young measure ν by as sequence (V j ) as Y Vj → ν . Furthermore, the convergence (3.10) can equivalently be expressed as (absorb a test function for weak convergence into f ) ∫ Ω f (x,V j (x)) dx → ∫ ∫ Ω f (x, A) dνx (A) dx =: ⟨⟨ ⟩⟩ f,ν . In this context, recall from the Dunford–Pettis Theorem A.13 that ( f (x,V j (x))) j is equiintegrable if and only it is weakly relatively compact in L1 (Ω). Comment... Remark 3.12. Some assumptions in the fundamental theorem can be weakened: For the existence of a Young measure, we only need to require limK↑∞ sup j |{|V j | ≥ K}| = 0, which ∫ for example follows from sup j Ω |V j | p dx < ∞ for some p > 0. Of course, in this case (II) needs to be suitably adapted. 56 CHAPTER 3. QUASICONVEXITY Proof. We associate with each V j an elementary Young measure (δV j ) ∈ Y p (Ω; RN ) as follows: (δV j )x := δV j (x) for a.e. x ∈ Ω. (3.11) Below we will show the following Young measure compactness principle, which is also useful in other contexts. We only formulate the principle for 1 ≤ p < ∞ for reasons of convenience, but it holds (suitably adapted) also for p = ∞. Let (ν ( j) ) j ⊂ Y p (Ω; RN ) be a sequence of p-Young measures such that ∫ ∫ ⟩⟩ ⟨⟨ ( j) sup j | q| p , ν ( j) = sup j |A| p dνx (A) dx < ∞. Ω (3.12) Then, there exists a subsequence of the (ν j ) (not relabeled) and ν ∈ Y p (Ω; RN ) such that ⟨⟨ ⟩⟩ ⟨⟨ ⟩⟩ lim f , ν ( j) = f , ν (3.13) j→∞ for all Carathéodory f : Ω × RN → R for which the sequence of functions x 7→ ( j) ⟨ f (x, q), νx ⟩ is uniformly L1 -bounded and the equiintegrability condition ⟨⟨ ⟩⟩ sup j f (x, A)1{| f (x,A)|≥m} , ν ( j) → 0 as m → ∞. (3.14) holds. 
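Before turning to the proof of Theorem 3.11, the oscillation example from the beginning of this section can be reproduced numerically in a few lines; the parameters a, b, θ, the integrand f and the test function φ below are arbitrary illustrative choices.

```python
import numpy as np

a, b, theta = 2.0, -1.0, 0.3
f = lambda v: v ** 2                          # a nonlinear integrand
phi = lambda x: np.cos(x)                     # a test function for the weak* limit

x = (np.arange(200_000) + 0.5) / 200_000      # midpoint rule on (0, 1)
int_phi = np.mean(phi(x))

for j in (1, 10, 100, 1000):
    v_j = np.where(j * x - np.floor(j * x) <= theta, a, b)
    print(f"j = {j:4d}:  integral of phi(x) f(v_j(x)) = {np.mean(phi(x) * f(v_j)):.5f}")

print("Young measure prediction, (theta f(a) + (1 - theta) f(b)) * integral of phi:",
      round((theta * f(a) + (1 - theta) * f(b)) * int_phi, 5))
print("composing f with the weak limit instead would give:",
      round(f(theta * a + (1 - theta) * b) * int_phi, 5))
```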
Moreover, ⟨⟨ p ⟩⟩ | q| , ν ≤ lim inf ⟨⟨| q| p , ν ( j) ⟩⟩. j→∞ In the theorem, our assumption of L p -boundedness of (V j ) corresponds to (3.12), so (I)– (III) follow immediately from the compactness principle applied to our sequence (δV j ) j of elementary Young measures from (3.11) above once we realize that for the sequence of Young measures (δV j ) j the condition (3.14) corresponds precisely to the equiintegrability of ( f (x,V j (x))) j . It remains to show the compactness principle. To this end define the measures ( j) µ ( j) := Lxd Ω ⊗ νx , which is just a shorthand notation for the Radon measures µ ( j) ∈ M(Ω × RN ) ∼ = C0 (Ω × RN )∗ defined through their action ⟨ ⟩ ∫ ∫ ( j) f , µ ( j) = f (x, A) dνx (A) dx Ω for all f ∈ C0 (Ω × RN ). So, for the ν ( j) = δV j from (3.11), we recover the familiar form ⟨ ⟩ ∫ f (x,V j (x)) dx f , µ ( j) = Comment... Ω for all f ∈ C0 (Ω × RN ), 3.3. YOUNG MEASURES 57 but in the compactness principle we might be in a more general situation. Clearly, every so-defined µ ( j) is a positive measure and ⟨ ⟩ f , µ ( j) ≤ |Ω| · ∥ f ∥∞ , whereby the (µ ( j) ) constitute a uniformly bounded sequence in the dual space C0 (Ω×RN )∗ . Thus, by the Sequential Banach–Alaoglu Theorem A.12, we can select a (non-relabeled) subsequence such that with a µ ∈ C0 (Ω × RN )∗ it holds that ⟩ ⟨ ⟨ ⟩ f , µ ( j) → f , µ for all f ∈ C0 (Ω × RN ). (3.15) Next we will show that µ can be written as µ = Lxd Ω ⊗ νx for a weakly* measurable parametrized family ν = (νx )x∈Ω ⊂ M1 (RN ) of probability measures. Theorem 3.13 (Disintegration). Let µ ∈ M(Ω × RN ) be a finite Radon measure. Then, there exists a weakly* measurable family (νx )x∈Ω ⊂ M1 (RN ) of probability measures such that with κ (A) := µ (A × RN ), µ = κ (dx) ⊗ νx , that is, ⟨ ⟩ f,µ = ∫ ∫ Ω f (x, A) dνx (A) dκ (x) for all f ∈ C0 (Ω × RN ). A proof of this measure-theoretic result can be found in Theorem 2.28 of [2]. In our situation, we have for f (x, v) := φ (x) with arbitrary φ ∈ C0 (Ω) that ⟩ ⟨ ⟨ ⟩ ∫ φ , µ = lim φ , µ ( j) = φ (x) dx, j→∞ Ω and so κ = L d Ω. Hence, the Disintegration Theorem ⟩⟩immediately yields the weak* ⟨⟨ measurability of (νx ) and thus lim j→∞ ⟨⟨ f , ν ( j) ⟩⟩ = f , ν for f ∈ C0 (Ω × RN ), which is (3.13). Next, we show (3.13) for the case of f merely Carathéodory and bounded and such that there exists a compact set K ⊂ RN with supp f ⊂ Ω × K. We invoke the Scorza Dragoni Theorem 2.5 to get an increasing sequence of compact sets Sk ⊂⊂ Ω (k ∈ N) with |Ω\Sk | ↓ 0 such that f |Sk ×RN is continuous. Let fk ∈ C(Ω × RN ) be an extension of f |Sk ×RN to all of Ω × RN with the property that the fk are uniformly bounded. Indeed, since K is bounded and | fk (x, A)| ≤ M(1 + |A| p ), f is bounded and hence we may require uniform boundedness of the fk . Now, since fk ∈ C0 (Ω × RN ), (3.15) implies ⟨ ⟨ ⟩ ( j) ⟩ fk (x, q), νx in L1 as j → ∞. ⇀ fk (x, q), νx as j → ∞. Comment... Thus, we also get ⟨ ⟩ ⟨ ( j) ⟩ f (x, q), νx in L1 (Sk ) ⇀ f (x, q), νx 58 CHAPTER 3. QUASICONVEXITY On the other hand, ∫ ⟩ ⟨ ⟩ f (x, q), νx( j) − 1S f (x, q), νx( j) dx ≤ k ∫ ⟨ Ω Ω\Sk ⟨ ⟩ f (x, q), νx( j) dx and this converges to zero as k → ∞, uniformly in j, by the uniform boundedness of fk ; the same estimate holds with ν in place of ν ( j) . Therefore, we may conclude ⟨ ( j) ⟩ f (x, q), νx ⇀ ⟨ f (x, q), νx ⟩ in L1 (Ω) as j → ∞, which directly implies (3.13) (for f with supp f ⊂ Ω × K). 
Finally, to remove the restriction of compact support in A, we remark that it suffices to show the assertion under the additional constraint that f ≥ 0 by considering the positive and negative part separately. Then, in case f positive, choose for any m ∈ N a function N ρm ∈ C∞ c (R ; [0, 1]) with ρm ≡ 1 on B(0, m) and supp ρm ⊂ B(0, 2m). Set f m (x, A) := ρm (|A| p/2 )ρm ( f (x, A)) f (x, A). Then, E j,m := ≤ ≤ ∫ ⟨ ( j) ⟩ f (x, q), νx ∫ Ω∫ [ ] ( j) 1 − ρm (|A| p/2 )ρm ( f (x, A)) f (x, A) dνx (A) dx Ω ∫∫ ≤M ⟨ ( j) ⟩ − f m (x, q), νx dx ( j) {|A| ≥ m2/p ∫ ∫ Ω or f (x, A) ≥ m} {|A|≥m2/p } m f (x, A) dνx (A) dx ( j) dνx (A) ∫ dx + { f (x,A)≥m} ( j) f (x, A) dνx (A) dx ⟩⟩ ⟨⟨ ⟩⟩ ⟨⟨ M ≤ sup j |A| p , ν ( j) + sup j f (x, A)1{| f (x,A)|≥m} , ν ( j) . m Both of these terms converge to zero as m → ∞ by the assumption from the compactness principle, in particular (3.12) and the equiintegrability condition (3.14). As the Young measure convergence holds for f m by the previous proof step, we get ⟩⟩ ⟨⟨ ⟩⟩ ⟨⟨ ⟩⟩) (⟨⟨ ⟩⟩) ) (⟨⟨ lim sup f , ν ( j) − f m , ν = lim sup f m , ν ( j) − f m , ν + E j,m j→∞ j→∞ ≤ sup j E j,m . Then, since f m ↑ f and sup j E j,m vanishes as m → ∞ and f ≥ 0, we get that indeed ⟨⟨ ⟩⟩ ⟨⟨ ⟩⟩ f , ν ( j) → f , ν as j → ∞. The last assertion to be shown in the compactness principle is, for 1 ≤ p < ∞, ⟨⟨ p ⟩⟩ ⟨⟨ ⟩⟩ | q| , ν ≤ lim inf | q| p , ν ( j) . Comment... j→∞ 3.3. YOUNG MEASURES 59 For K ∈ N define |A|K := min{|A|, K} ≤ |A|. Then, by the Young measure representation for equiintegrable integrands, ⟨⟨ ⟩⟩ ⟨⟨ ⟩⟩ ⟨⟨ ⟩⟩ for all K ∈ N. lim inf | q| p , ν ( j) ≥ lim | q|Kp , ν ( j) = | q|Kp , ν j→∞ j→∞ We conclude by letting K ↑ ∞ and using the monotone convergence lemma. Remark 3.14. The Fundamental Theorem can also be proved in a more functional analytic N way as follows: Let L∞ w∗ (Ω; M(R )) be the set of essentially bounded weakly* measurable functions defined on Ω with values in the Radon measures M(RN ). It turns out (see for N 1 N example [7] or [30, 31]) that L∞ w∗ (Ω; M(R )) is the dual space to L (Ω; C0 (R )). One can ( j) N show that the maps ν ( j) = (x 7→ νx ) form a uniformly bounded set in L∞ w∗ (Ω; M(R )) and by the Sequential Banach–Alaoglu Theorem A.12 we can again conclude the existence of a weak* limit point ν of the ν ( j) ’s. The extended representation of limits for Carathéodory f follows as before. It is known that every weakly* measurable parametrized measure (νx )x∈Ω ⊂ M1 (RN ) with (x 7→ ⟨| q| p , νx ⟩) ∈ L1 (Ω) (1 ≤ p < ∞) can be generated by a sequence (u j ) ⊂ L p (Ω; RN ) (one starts first by approximating Dirac masses and then uses the approximation of a general measure by linear combinations of Dirac masses, a spatial gluing argument completes the reasoning). Thus, in the definition of Y p (Ω; RN ) we did not need to include (III) from the Fundamental Theorem. A Young measure ν = (νx ) ∈ Y p (Ω; RN ) is called homogeneous if νx = νy for a.e. x, y ∈ Ω; by a mild abuse of notation we then simply write ν in place of νx . Before we come to further properties of Young measures, let us consider a few examples: Example 3.15. In Ω := (0, 1) define u := 1(0,1/2) − 1(1/2,1) and extend this function periodically to all of R. Then, the functions u j (x) := u( jx) for j ∈ N (see Figure 3.3) generate the homogeneous Young measure ν ∈ Y∞ ((0, 1); R) with 1 1 νx = δ−1 + δ+1 for a.e. x ∈ (0, 1). 
2 2 Indeed, for φ ⊗ h ∈ C0 ((0, 1) × R) we have that φ is uniformly continuous, say |φ (x) − φ (y)| ≤ ω (|x − y|) with a modulus of continuity ω : [0, ∞) → [0, ∞) (that is, ω is continuous, increasing, and ω (0) = 0). Then, ) ∫ 1 j−1(∫ (k+1)/ j ω (1/ j)∥h∥∞ lim φ (x)h(u j (x)) dx = lim ∑ φ (k/ j)h(u j (x)) dx + j→∞ j→∞ 0 j k/ j k=0 j−1 ∫ 1 1 φ (k/ j) h(u(y)) dy ∑ j→∞ 0 k=0 j ( ) ∫ 1 1 1 = φ (x) dx · h(−1) + h(+1) 2 2 0 = lim Comment... since the Riemann sums converge to the integral of φ . The last line implies the assertion. 60 CHAPTER 3. QUASICONVEXITY Figure 3.3: An oscillating sequence. Figure 3.4: Another oscillating sequence. Example 3.16. Take Ω := (0, 1) again and let u j (x) = sin(2π jx) for j ∈ N (see Figure 3.4). The sequence (u j ) generates the homogeneous Young measure ν ∈ Y∞ ((0, 1); R) with 1 νx = √ Ly1 (−1, 1) 2 π 1−y for a.e. x ∈ (0, 1), as should be plausible from the oscillating sequence (there is more mass close to the horizontal axis than farther away). Example 3.17. Take a bounded Lipschitz domain Ω ⊂ R2 , let A, B ∈ R2×2 with B − A = a ⊗ n (this is equivalent to rank(A − B) ≤ 1), where a, n ∈ R2 , and for θ ∈ (0, 1) define ( ∫ x·n ) u(x) := Ax + χ (t) dt a, x ∈ R2 , 0 where χ := 1∪z∈Z [z,z+1−θ ) . If we let u j (x) := u( jx)/ j, x ∈ Ω, then (∇u j ) generates the homogeneous Young measure ν ∈ Y∞ (Ω; R2 ) with νx = θ δA + (1 − θ )δB for a.e. x ∈ Ω. This example can also be extended to include multiple scales, cf. [73]. Comment... The Fundamental Theorem has many important consequences, two of which are the following: 3.3. YOUNG MEASURES 61 Lemma 3.18. Let (V j ) ⊂ L p (Ω; RN ), 1 < p ≤ ∞, be a bounded sequence generating the Young measure ν ∈ Y p (Ω; RN ). Then, in L p (Ω; RN ) Vj ⇀ V ⟨ ⟩ V (x) := [νx ] = id, νx . for Proof. Bounded sequences in L p with p > 1 are weakly relatively compact and it suffices to identify the limit of any weakly convergent subsequence. From the Dunford–Pettis Theorem A.13 it follows that any such sequence is in fact equiintegrable (in L1 ). Now simply apply assertion (III) of the Fundamental Theorem for the integrand f (x, A) := A (or, more pedantically, f ji (A) := Aij for i = 1, . . . , d, m = 1, . . . , m). The preceding lemma does not hold for p = 1. One example is V j := j1(0,1/ j) , which concentrates. Proposition 3.19. Let (V j ) ⊂ L p (Ω; RN ), 1 ≤ p ≤ ∞, be a bounded sequence generating the Young measure ν ∈ Y p (Ω; RN ), and let f : Ω × RN → [0, ∞) be any Carathéodory integrand (not necessarily satisfying the equiintegrability property in (III) of the Fundamental Theorem). Then, it holds that ∫ lim inf j→∞ Ω f (x,V j (x)) dx ≥ ⟨⟨ ⟩⟩ f,ν . Proof. For M ∈ N define fM (x, A) := min( f (x, A), M). Then, (III) from the Fundamental Theorem is applicable with fM and we get ∫ Ω fM (x,V j (x)) dx → ⟨⟨ fM , ν ⟩⟩ ∫ ∫ = Ω fM (x, A) dνx (A) dx. Since f ≥ fM , we have ∫ lim inf j→∞ Ω f (x,V j (x)) dx ≥ ⟨⟨ ⟩⟩ fM , ν . We conclude by letting M ↑ ∞ and using the monotone convergence lemma. Another important feature of Young measures is that they allow us to read off whether or not our sequence converges in measure: Proposition 3.20. Let (νx ) ⊂ M1 (RN ) be the Young measure generated by a bounded sequence (V j ) ⊂ L p (Ω; RN ), 1 ≤ p ≤ ∞. Let furthermore K ⊂ RN be compact. Then, dist(V j (x), K) → 0 in measure if and only if supp νx ⊂ K for a.e. x ∈ Ω. Moreover, if and only if νx = δV (x) for a.e. x ∈ Ω. Comment... V j → V in measure 62 CHAPTER 3. QUASICONVEXITY Proof. We only show the second assertion, the proof of the first one is similar. 
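The statements of Examples 3.15 and 3.16 are easy to test numerically. The following small Python sketch (the test functions φ and h and the frequency j are ad-hoc choices, not taken from the examples) approximates ∫₀¹ φ(x) h(u_j(x)) dx for a large j and compares it with ∫₀¹ φ dx · ⟨h, ν⟩ for the two limit Young measures; it is only an illustration and not part of any proof.

```python
import numpy as np

# Numerical sanity check of Examples 3.15 and 3.16; phi, h and j are ad-hoc choices.
n = 2_000_000
x = (np.arange(n) + 0.5) / n                       # midpoint grid on (0,1)
phi = lambda t: t**2                               # spatial test function
h = lambda v: np.cos(v) + v**3                     # test function in the oscillating variable
j = 500                                            # oscillation frequency

# Example 3.15: u = 1 on (0,1/2), -1 on (1/2,1), extended 1-periodically; u_j(x) = u(j x).
u_j = np.where((j * x) % 1.0 < 0.5, 1.0, -1.0)
lhs1 = np.mean(phi(x) * h(u_j))                    # ~ int_0^1 phi(x) h(u_j(x)) dx
rhs1 = np.mean(phi(x)) * 0.5 * (h(-1.0) + h(1.0))  # int phi dx * <h, nu>, nu = (delta_{-1}+delta_{+1})/2

# Example 3.16: u_j(x) = sin(2 pi j x); limit measure has density 1/(pi sqrt(1-y^2)) on (-1,1).
v_j = np.sin(2 * np.pi * j * x)
lhs2 = np.mean(phi(x) * h(v_j))
theta = 2 * np.pi * x                              # substitution y = sin(theta), theta uniform on (0, 2 pi)
rhs2 = np.mean(phi(x)) * np.mean(h(np.sin(theta)))

print(lhs1, rhs1)                                  # the two values agree up to O(1/j) + grid error
print(lhs2, rhs2)
```

The discrepancy shrinks as j grows, in line with the ω(1/j)∥h∥∞ error term appearing in the Riemann-sum computation for Example 3.15 above.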
Take the bounded, positive Carathéodory integrand f (x, A) := |A −V (x)| , 1 + |A −V (x)| x ∈ Ω, A ∈ RN . Then, for all δ ∈ (0, 1), the Markov inequality and the Fundamental Theorem on Young measures yield ∫ { } 1 lim sup x ∈ Ω : f (x,V j (x)) ≥ δ ≤ lim f (x,V j (x)) dx j→∞ δ Ω j→∞ ∫ ⟩ 1 ⟨ = f (x, q), νx dx. δ Ω On the other hand, ∫ ( ∫ ⟨ ⟩ ) q f x,V j (x) dx f (x, ), νx dx = lim j→∞ Ω Ω { } ≤ δ |Ω| + lim sup x ∈ Ω : f (x,V j (x)) ≥ δ . j→∞ The preceding estimates show that f (x,V j (x)) converges to zero in measure if and only if ⟨ f (x, q), νx ⟩ = 0 for L d -almost every x ∈ Ω, which is the case if and only if νx = δV (x) almost everywhere. We conclude by observing that f (x,V j (x)) converges to zero in measure if and only if V j converges to V in measure. 3.4 Gradient Young measures The most important subclass of Young measures for our purposes is that of Young measures that can be generated by a sequence of gradients: Let ν ∈ Y p (Ω; Rm×d ) (use RN = Rm×d in the theory of the last section). We say that ν is a p-gradient Young measure, in symbols ν ∈ GY p (Ω; Rm×d ), if there exists a bounded sequence (u j ) ⊂ W1,p (Ω; Rm ) such that Y ∇u j → ν , i.e. the sequence (∇u j ) generates ν . Note that it is not necessary that every sequence that generates ν is a sequence of gradients (which, in fact, is never the case). We have already considered examples of gradient Young measures in the previous section, see in particular Example 3.17. Is it the case that every Young measure is a gradient Young measure? The answer is no, as can be easily seen: Let V ∈ (L p ∩ C∞ )(Ω; Rm×d ) with curlV (x) ̸= 0 for some x ∈ Ω. Then, νx := δV (x) cannot be a gradient Young measure since for such measures it must hold Y Comment... that [ν ] is a gradient itself: if there existed a sequence (u j ) ⊂ W1,p (Ω; Rm ) with ∇u j → ν , then ∇u j ⇀ V by Lemma 3.18, and it would follow that curlV = 0. Consequently, for every ν ∈ GY p (Ω; Rm×d ) there must exist u ∈ W1,p (Ω; Rm ) with [ν ] = ∇u. In this case we call any such u an underlying deformation of ν . We first show the following technical, but immensely useful, result about gradient Young measures. Recall that we assume Ω to be open, bounded, and to have a Lipschitz boundary (to make the trace well-defined). 3.4. GRADIENT YOUNG MEASURES 63 Lemma 3.21. Let ν ∈ GY p (Ω; Rm×d ), 1 < p < ∞, and let u ∈ W1,p (Ω; Rm ) be an underlying deformation of ν , i.e. [ν ] = ∇u. Then, there exists a sequence (u j ) ⊂ W1,p (Ω; Rm ) such that u j |∂ Ω = u|∂ Ω Y ∇u j → ν . and Furthermore, in addition we can ensure that (∇u j ) is p-equiintegrable. Proof. Step 1. Since Ω is assumed to have a Lipschitz boundary, we can extend a generating sequence (∇v j ) for ν to all of Rd , so we assume (v j ) ⊂ W1,p (Rd ; Rm ) with sup j ∥v j ∥W1,p ≤ C < ∞. In the following, we need the maximal function M f : Rd → R ∪ {+∞} of f : Rd → R ∪ {+∞}, which is defined as ∫ M f (x0 ) := sup − r>0 B(x0 ,r) | f (x)| dx, x0 ∈ Rd . We quote the following two results about the maximal function whose proofs can be found in [87]: (i) ∥M f ∥L p ≤ C∥M f ∥L p , where C = C(d, p) is a constant. (ii) For every M > 0 and f ∈ W1,p (Rd ; Rm ), the maximal function M f is Lipschitzcontinuous on the set {M (| f | + |∇ f |) < M} and its Lipschitz-constant is bounded by CM, where C = C(d, m, p) is a dimensional constant. With this tool at hand, consider the sequence V j := M (|v j | + |∇v j |), j ∈ N. 
By (i) above, (V j ) is uniformly bounded in L p (Ω) and we may select a (non-renumbered) subsequence such that V j generates a Young measure µ ∈ Y p (Ω). Next, define for h ∈ N the (nonlinear) truncation operator Th , { v if |v| ≤ h, Th v := v ∈ R, v h |v| if |v| > h, and let (φl )l ⊂ C∞ (Ω) be a countable family that is dense in C(Ω). Then, for fixed h ∈ N, the sequence (ThV j ) j is uniformly bounded in L∞ (Ω) and so, by Young measure representation, ∫ lim lim φl (x)|ThV j (x)| dx = lim ∫ p h→∞ j→∞ Ω h→∞ Ω φl (x) ∫ |Th v| dµx (v) dx = ∫ p Ω ⟨ ⟩ φl (x) | q| p , µx dx, where in the last step we used the monotone convergence lemma. Thus, for some diagonal sequence Wk := TkV j(k) , we have that k→∞ Ω φl (x) |Wk (x)| p dx = ∫ Ω ⟨ ⟩ φl (x) | q| p , µx dx for all l ∈ N. Comment... ∫ lim 64 CHAPTER 3. QUASICONVEXITY Thus, |Wk | p ⇀ ⟨| q| p , µx ⟩ in L1 and {|Wk | p }k is an equiintegrable family by the Dunford– Pettis Theorem. We can see also the equiintegrability in a more elementary way as follows: For a Borel subset A ⊂ Ω take φ ∈ C∞ (Ω) with 1A ≤ φ and estimate ∫ lim sup A k→∞ |Wk (x)| p dx ≤ lim ∫ k→∞ Ω φ (x) |Wk (x)| p dx = ∫ Ω ⟨ ⟩ φ (x) | q| p , µx dx ∫ by the L1 -convergence of |Wk | p to x 7→ ⟨| q| p , µx ⟩. The last integral tends to A ⟨| q| p , µx ⟩ dx as φ ↓ 1A , which vanishes if |A| → 0. Thus the family {Wk }k is p-equiintegrable. By (ii) above, vk = v j(k) is Ck-Lipschitz on the set Ek := {Wk ≤ k} and by a wellknown result for Lipschitz functions due to Kirszbraun, we may extend vk to a function wk : Rd → Rm that is globally Lipschitz continuous with the same Lipschitz constant Ck and wk = vk in Ek . Thus, |∇wk | ≤ CWk in Ω. Thus, {|∇wk |}k inherits the p-equiintegrable from {Wk }k . Moreover, by the Markov inequality, ∥Wk ∥Lp p |Ω \ Ek | ≤ →0 kp as k → ∞. Therefore, for all φ ∈ C0 (Ω) and h ∈ C0 (Rm ), ∫ Ω |φ h(∇wk ) − φ h(∇vk )| dx ≤ ∥φ h∥∞ |Ω \ Ek | → 0. Thus, since all such φ , h determine Young measure convergence (see the proof of the Fundamental Theorem 3.11), we have shown that (∇wk ) generates the same Young measure ν as (∇v j ) and additionally the family {∇wk }k is p-equiintegrable. Step 2. Since W1,p (Ω; Rm ) embeds compactly into the space L p (Ω; Rm ) by the RellichLet moreover (ρ j ) ⊂} C∞ Kondrachov Theorem A.18, we have wk → u in L p . { c (Ω; [0, 1]) be a sequence of cut-off functions such that for G j := x ∈ Ω : ρ j (x) = 1 it holds that |Ω \ G j | → 0 as j → ∞. For u j,k := ρ j wk + (1 − ρ j )u m ∈ W1,p u (Ω; R ) we observe ∇u j,k = ρ j ∇wk + (1 − ρ j )∇u + (wk − u) ⊗ ∇ρ j . We only consider the case p < ∞ in the following; the proof for the case p = ∞ is in fact easier. For every Carathéodory function f : Ω × Rm×d → R with | f (x, A)| ≤ M(1 + |A| p ) we have ∫ lim sup f (x, ∇wk (x)) − f (x, ∇u j,k (x)) dx k→∞ Ω ≤ lim sup CM k→∞ ≤ lim sup CM Comment... k→∞ ∫ Ω\G j ∫ Ω\G j 2 + 2|∇wk | p + |∇u| p + |wk − u| p · |∇ρ j | p dx 2 + 2|∇wk | p + |∇u| p dx, 3.4. GRADIENT YOUNG MEASURES 65 where C > 0 is a constant. However, by assumption, the last integrand is equiintegrable and hence lim lim sup j→∞ ∫ k→∞ Ω f (x, ∇wk (x)) − f (x, ∇u j,k (x)) dx = 0. We can now select a diagonal sequence u j = u j,k( j) such that (∇u j ) generates ν and satisfies all requirements from the statement of the lemma. Note that the equiintegrability is not affected by the cut-off procedure. The following result is now easy to prove: Lemma 3.22 (Jensen-type inequality). Let ν ∈ GY p (Ω; Rm×d ) be a homogeneous Young measure, i.e. 
νx = ν ∈ M1 (Rm×d ) for almost every x ∈ Ω, and let B := [ν ] ∈ Rm×d be its barycenter (which is a constant). Then, for all quasiconvex functions h : Rm×d → R with |h(A)| ≤ M(1 + |A| p ) (M > 0, 1 < p < ∞) it holds that h(B) ≤ ∫ h dν . (3.16) Notice that the conclusion of this lemma is trivial (and also holds for general Young measures, not just the gradient ones) if h is convex. Thus, also (3.16) expresses the convexity in the notion of quasiconvexity. Y Proof. Let (u j ) ⊂ W1,p (Ω; Rm ) with ∇u j → ν , u j |∂ Ω (x) = Bx for x ∈ ∂ Ω (in the sense of trace) and (∇u j ) p-equiintegrable, the latter two conditions being realizable by Lemma 3.21. Then, from the definition of quasiconvexity, we get ∫ h(B) ≤ − h(∇u j ) dx. Ω Passing to the Young measure limit as j → ∞ on the right-hand side, for which we note that h(∇u j ) is equiintegrable by the growth assumption on h, we arrive at ∫ ∫ h(B) ≤ − Ω h dν dx = ∫ h dν , which already is the sought inequality. Comment... The last result will be of crucial importance in proving weak lower semicontinuity in the next section. It is remarkable that the converse also holds, i.e. the validity of (3.16) for all quasiconvex h with p-growth precisely characterizes the class of (homogeneous) gradient Young measures in the class of all (homogeneous) Young measures; this is the content of the Kinderlehrer–Pedregal Theorem 5.9, which we will meet in Chapter 5. 66 CHAPTER 3. QUASICONVEXITY 3.5 Gradient Young measures and rigidity To round off the introduction to Young measures let us return to the question if every Young measure is a gradient Young measure, but now with the additional requirement that the Young measure in question is homogeneous. Then, the barycenter is constant and hence always the gradient of an affine function, so the simple reasoning from the beginning of the section no longer applies. Still, there are homogeneous Young measures that are not gradients, but proving that no generating sequence of gradients can be found, is not so trivial. One possible way would be to show that there is a quasiconvex function h : Rm×d → R such that the Jensen-type inequality of Lemma 3.22 fails; this is, however, not easy. Let us demonstrate this assertion instead through the so-called (approximate) twogradient problem: Let A, B ∈ Rm×d , θ ∈ (0, 1) and consider the homogeneous Young measure ν := θ δA + (1 − θ )δB ∈ Y∞ (B(0, 1); Rm×d ). (3.17) We know from Example 3.17 that for rank(A − B) ≤ 1, ν is a gradient Young measure. In any case, supp ν = {A, B} and by Proposition 3.20 it follows that dist(V j , {A, B}) → 0 in Y measure for any generating sequence V j → ν . The following is the first instance of so-called rigidity results, which play a prominent role in the field: Theorem 3.23 (Ball–James). Let Ω ⊂ Rd be open, bounded, and convex. Also, let A, B ∈ Rm×d . (I) Let u ∈ W1,∞ (Ω; Rm ) satisfy the exact two-gradient inclusion ∇u ∈ {A, B} a.e. in Ω. (a) If rank(A − B) ≥ 2, then ∇u = A a.e. or ∇u = B a.e. (b) If B − A = a ⊗ n (a ∈ Rm , n ∈ Rd \ {0}), then there exists a Lipschitz function h : R → R with h′ ∈ {0, 1} almost everywhere, and a constant v0 ∈ Rm such that u(x) = v0 + Ax + h(x · n)a. (II) Let rank(A − B) ≥ 2 and assume that u j ⊂ W1,∞ (Ω; Rm ) is uniformly bounded and satisfies the approximate two-gradient inclusion ( ) dist ∇u j , {A, B} → 0 in measure. Then, Comment... ∇u j → A in measure or ∇u j → B in measure. 3.5. GRADIENT YOUNG MEASURES AND RIGIDITY 67 Proof. Ad (I) (a). Assume after a translation that B = 0 and thus rank A ≥ 2. 
Then, ∇u = Ag for a scalar function g : Ω → R. Mollifying u (i.e. working with uε := ρε ⋆ u instead of u and passing to the limit ε ↓ 0), we may assume that in fact g ∈ C1 (Ω). The idea of the proof is that the curl of ∇u vanishes, expressed as follows: for all i, j = 1, . . . , d and k = 1, . . . , m, it holds that ∂i (∇u)kj = ∂i j uk = ∂ ji uk = ∂ j (∇u)ki where M kj denotes the element in the k’th row and j’th column of the matrix M. For our special ∇u = Ag, this reads as Akj ∂i g = Aki ∂ j g. (3.18) Under the assumptions of (I) (a), we claim that ∇g ≡ 0. If otherwise ξ (x) := ∇g(x) ̸= 0 for some x ∈ Ω, then with ak (x) := Akj /ξ j (x) (k = 1, . . . , m) for any j such that ξ j (x) ̸= 0, where ak (x) is well-defined by the relation (3.18), we have Akj = ak (x)ξ j (x), i.e. A = a(x) ⊗ ξ (x). This, however, is impossible if rank A ≥ 2. Hence, ∇g ≡ 0 and u is an affine function. This property is also stable under mollification. Ad (I) (b). As in (I) (a), we assume ∇u = Ag (B = 0) and A = a ⊗ n. Pick any b ⊥ n. Then, d u(x + tb) = ∇u(x)b = [anT b]g(x) = 0. dt t=0 This implies that u is constant in direction b. As b ⊥ n was arbitrary and Ω is assumed convex, u(x) can only depend on x · n. This implies the claim. Ad (II). Assume once more that B = 0 and that there exists a (2 × 2)-minor M with M(A) ̸= 0. By assumption there are sets D j ⊂ Ω with ∇u j − A1D j → 0 in measure. Let us also assume that we have selected a subsequence such that ∗ u j ⇁ u in W1,∞ and ∗ 1D j ⇁ χ in L∞ . In the following we use that under an assumption of uniform L∞ -boundedness convergence in measure implies weak* convergence in L∞ . Indeed, for any ψ ∈ L1 (Ω) and any ε > 0 we have Ω (∇u j − A1D j )ψ dx { } ≤ x ∈ Ω : |∇u j (x) − A1D j (x)| > ε · ∥∇u j − A1D j ∥L∞ · ∥ψ ∥L1 + ε ∥ψ ∥L1 → 0 + ε ∥ψ ∥L1 Comment... ∫ 68 CHAPTER 3. QUASICONVEXITY ∗ As ε > 0 is arbitrary, we have indeed ∇u j − A1D j ⇁ 0 in L∞ . Now, w*-lim M(∇u j ) = w*-lim M(A1D j ) = M(A) · w*-lim 1D j = M(A)χ . j→∞ j→∞ j→∞ ∗ By a similar argument, ∇u j ⇁ ∇u = Aχ , and so, by the weak* continuity of minors proved in Lemma 3.10, w*-lim M(∇u j ) = M(∇u) = M(Aχ ) = M(A)χ 2 . j→∞ Thus, χ = χ 2 and we can conclude that there exists D ⊂ Ω such that χ = 1D and ∇u = A1D . Furthermore, combining the above convergence assertions, ∇u j → ∇u = A1D in measure. Finally, (I) (a) implies that ∇u = A or ∇u = 0 = B a.e. in Ω, finishing the proof. With this result at hand it is easy to see that our example (3.17) cannot be a gradient Y Young measure if rank(A − B) ≥ 2: If there was a sequence of gradients ∇u j → ν , then by the reasoning above we would have dist(∇u j , {A, B}) → 0, but from (II) of the Ball–James rigidity theorem, we would get ∇u j → A or ∇u j → B in measure, either one of which yields a contradiction by Proposition 3.20. 3.6 Lower semicontinuity After our detour through Young measure theory, we now return to the central subject of this chapter, namely to minimization problems of the form ∫ F [u] := f (x, ∇u(x)) dx → min, Ω over all u ∈ W1,p (Ω; Rm ) with u|∂ Ω = g, where Ω ⊂ Rd is a bounded Lipschitz domain, 1 < p < ∞, the integrand f : Ω × Rm × Rm×d → R has p-growth, | f (x, A)| ≤ M(1 + |A| p ) for some M > 0, Comment... and g ∈ W1−1/p,p (∂ Ω; Rm ) specifies the boundary values. In Chapter 2 we solved this problem via the Direct Method of the Calculus of Variations, a coercivity result, and, crucially, the Lower Semicontinuity Theorem 2.7. 
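Since the main new ingredient, compared with Chapter 2, will be the Jensen-type inequality (3.16) from Lemma 3.22, it may help to see it once in the simplest concrete case, namely the laminate ν = θδ_A + (1−θ)δ_B from Example 3.17 with B − A = a ⊗ n of rank one. In the Python sketch below (the matrices, the vectors a, n, and θ are ad-hoc choices) the inequality holds with equality for det, since both det and −det are quasiconvex as null-Lagrangians by Lemma 3.8, and strictly for the convex function |·|².

```python
import numpy as np

# Jensen-type inequality (3.16) for the laminate nu = theta*delta_A + (1-theta)*delta_B,
# where B - A = a (x) n has rank one (Example 3.17). All data below are ad-hoc choices.
A = np.array([[1.0, 2.0],
              [0.0, 1.0]])
a, nvec = np.array([1.0, 2.0]), np.array([1.0, 0.0])
B = A + np.outer(a, nvec)                        # rank(B - A) = 1
theta = 0.3
bary = theta * A + (1.0 - theta) * B             # barycenter [nu]

for name, h in [("det", np.linalg.det), ("|.|^2", lambda M: float(np.sum(M * M)))]:
    lhs = h(bary)                                # h([nu])
    rhs = theta * h(A) + (1.0 - theta) * h(B)    # integral of h with respect to nu
    print(name, lhs, rhs)                        # det: equality;  |.|^2: lhs < rhs
```

For null-Lagrangians the equality is forced, because (3.16) can be applied to h and to −h at the same time.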
In this section, we recycle the Direct Method and the coercivity result, but extend lower semicontinuity to quasiconvex integrands; some motivation for this was given at the beginning of the chapter. Let us first consider how we could approach the proof of lower semicontinuity (it should be clear that the proof via Mazur’s Lemma that we used for the convex lower semicontinuity 3.6. LOWER SEMICONTINUITY 69 theorem, does not extend). Assume we have a sequence (u j ) ⊂ W1,p (Ω; Rm ) such that u j ⇀ u in W1,p and we want to show lower semicontinuity of our functional F . If we assume that our (norm-bounded) sequence ∇u j also generates the gradient Young measure ν ∈ GY p (Ω; Rm×d ), which is true up to a subsequence, and that the sequence of integrands ( f (x, ∇u j (x))) j is equiintegrable, then we already have a limit: F [u j ] → ∫ ∫ Ω f (x, A) dνx (A) dx. This is useful, because it now suffices to show the inequality ∫ f (x, q) dνx ≥ f (x, ∇u(x)) dx for almost every x ∈ Ω, to conclude lower semicontinuity. Now, this inequality is close to the Jensen-type inequality (3.16) for gradient Young measures from Lemma 3.22, and we already explained how this is related to (quasi)convexity. The only difference in comparison to (3.16) is that here we need to “localize” in x ∈ Ω and make νx a gradient Young measure in its own right. This is accomplished via the fundamental blow-up technique: Lemma 3.24 (Blow-up technique). Let ν = (νx ) ∈ GY p (Ω; Rm×d ) be a gradient Young measure. Then, for almost every x0 ∈ Ω, νx0 ∈ GY p (B(0, 1); Rm×d ) is a homogeneous gradient Young measure in its own right. Proof. We start with a general property of Young measures: Let {φk }k and {hl }l be countable dense subsets of C0 (Ω) and C0 (RN ), respectively. Assume furthermore that for a uniformly norm-bounded sequence (V j ) ⊂ L p (Ω; RN ) and ν ∈ Y p (Ω; RN ) we have ∫ lim j→∞ Ω φk (x)hl (V j (x)) dx = ∫ Ω φk (x) ∫ hl dνx dx for all k, l ∈ N. Y Then we claim that V j → ν . This is in fact clear once we realize that the set of linear combinations of the functions fk,l := φk ⊗ hl (that is, fk,l (x, A) := φk (x)hl (A)) is dense in C0 (Ω × RN ) and testing with such functions determines Young measure convergence, see the proof of the Fundamental Theorem 3.11. ⟨ ⟩ Now let x0 ∈ Ω be a Lebesgue point of all the functions x 7→ hl , νx (l ∈ N), that is, ∫ lim r↓0 B(0,1) ⟨ hl , νx ⟩ 0 +ry ⟨ ⟩ − hl , νx0 dy = 0. By Theorem A.7, almost every x ∈ Ω has this property. Then, at such a Lebesgue point x0 , set u j (x0 + ry) − [u j ]B(x0 ,r) , r where [u]B(x0 ,r) := ∫ −B(x ,r) u 0 y ∈ B(0, 1), dx. Comment... (r) v j (y) := 70 CHAPTER 3. QUASICONVEXITY For φk , hl from the collections exhibited at the beginning of the proof, we get ∫ (r) B(0,1) φk (y)hl (∇v j (y)) dy = ∫ B(0,1) 1 = d r φk (y)hl (∇u j (x0 + ry)) dy ∫ B(x0 ,r) φk (x−x ) 0 hl (∇u j (x)) dx r after a change of variables. Letting j → ∞ and then r ↓ 0, we get ∫ ∫ ( x − x )⟨ ⟩ 1 (r) 0 φk (y)hl (∇v j (y)) dy = lim d φk lim lim hl , νx dx r↓0 r r↓0 j→∞ B(0,1) r B(x0 ,r) ∫ ⟨ ⟩ = lim φk (y) hl , νx0 +ry dx r↓0 ∫ B(0,1) = B(0,1) ⟨ ⟩ φk (y) hl , νx0 dy, the last convergence following from the Lebesgue point property of x0 . Moreover, ∫ (r) B(0,1) |∇v j | p dy = ∫ B(0,1) |∇u j (x0 + ry)| p dy = 1 rd ∫ B(x0 ,r) |∇u j (x)| p dx and the last integral is uniformly bounded in j (for fixed r). Denote by λ ∈ M(Ω; Rm ) the weak* limit of the measures |∇u j | p L d Ω, which exists after taking a subsequence. 
If we require of x0 additionally that lim sup r↓0 λ (B(x0 , r)) < ∞, rd which hold at L d -almost every x0 ∈ Ω, then ∫ lim sup lim r↓0 j→∞ B(0,1) (r) |∇v j | p dy < ∞. (r) Since also [v j ]B(0,1) = 0, the Poincaré inequality from Theorem A.16 yields that there exists R( j) a diagonal sequence w j := vJ( j) that is uniformly bounded in W1,p (B(0, 1); Rm ) and such that for all k, l ∈ N, ∫ ∫ ⟨ ⟩ lim φk (y)hl (∇w j (y)) dy = φk (y) hl , νx0 dy. j→∞ B(0,1) B(0,1) Y Therefore, ∇w j → νx0 , where we now understand νx0 as a homogeneous (gradient) Young measure on B(0, 1). The proof of weak lower semicontinuity is then straightforward: Comment... Theorem 3.25. Let f : Ω × Rm×d → R be a Carathéodory integrand with | f (x, A)| ≤ M(1 + |A| p ) for some M > 0, 1 < p < ∞. Assume furthermore that f (x, q) is quasiconvex for every fixed x ∈ Ω. Then the corresponding functional F is weakly lower semicontinuous on W1,p (Ω; Rm ). 3.6. LOWER SEMICONTINUITY 71 Proof. Let (∇u j ) ⊂ W1,p (Ω; Rm ) with u j ⇀ u in W1,p . Assume that (∇u j ) generates the gradient Young measure ν = (νx ) ∈ GY p (Ω; Rm×d ), for which it holds that [ν ] = ∇u. This is only true up to a (non-relabeled) subsequence, but if we can establish the lower semicontinuity for every such subsequence it follows that the result also holds for the original sequence. From Proposition 3.19 we get ∫ lim inf j→∞ Ω f (x, ∇u j (x)) dx ≥ ⟨⟨ f,ν ⟩⟩ ∫ ∫ = Ω f (x, A) dνx (A) dx. Now, for almost every x ∈ Ω, we can consider νx as a homogeneous Young measure in GY p (B(0, 1); Rm×d ) by the Blow-up Lemma 3.24. Thus, the Jensen-type inequality from Lemma 3.22 reads ∫ f (x, A) dνx (A) ≥ f (x, ∇u(x)) for a.e. x ∈ Ω. Combining, we arrive at lim inf F [u j ] ≥ F [u], j→∞ which is what we wanted to show. At this point is is worthwhile to reflect on the role of Young measures in the proof of the preceding result, namely that they allowed us to split the argument into two parts: First, we passed to the limit in the functional via the Young measure. Second, we established a Jensen-type inequality, which then yielded the lower semicontinuity inequality. It is remarkable that the Young measure preserves exactly the “right” amount of information to serve as an intermediate object. We can now sum up and prove existence to our minimization problem: Theorem 3.26. Let f : Ω × Rm×d → R be a Carathéodory integrand such that f satisfies the lower growth bound µ |A| p − µ −1 ≤ f (x, A) for some µ > 0, the upper growth bound | f (x, A)| ≤ M(1 + |A| p ) for some M > 0, and is quasiconvex in its second argument. Then, the associated functional F has a minim 1−1/p,p (∂ Ω; Rm ). mizer over the space W1,p g (Ω; R ), where g ∈ W Proof. This follows directly by combining the Direct Method from Theorem 2.3 with the coercivity result in Proposition 2.6 and the Lower Semicontinuity Theorem 3.25. Comment... The following result shows that quasiconvexity is also necessary for weak lower semicontinuity; we only state and show this for x-independent integrands, but we note that it also holds for x-dependent ones by a localization technique. 72 CHAPTER 3. QUASICONVEXITY Proposition 3.27. Let f : Rm×d → R satisfy the p-growth condition. | f (x, A)| ≤ M(1 + |A| p ) for some M > 0. If the associated functional F [u] := ∫ Ω u ∈ W1,p (Ω; Rm ), f (∇u(x)) dx, is weakly lower semicontinuous (with or without fixed boundary values), then f is quasiconvex. Proof. We may assume that B(0, 1) ⊂⊂ Ω, otherwise we can translate and rescale the dom main. Let A ∈ Rm×d and ψ ∈ W1,∞ 0 (B(0, 1); R ). 
We need to show ∫ h(A) ≤ − B(0,1) h(A + ∇ψ (z)) dz. Take for every j ∈ N a Vitali cover of B(0, 1), see Theorem A.8, N( j) B(0, 1) = Z ⊎ ⊎ ( j) ( j) |Z| = 0, B(ak , rk ), k=1 ( j) rk ( j) ak ∈ B(0, 1), ≤ 1/ j (k = 1, . . . , N( j)), and fix a smooth function h : Ω \ B(0, 1) → with m R with h(x) = Ax for x ∈ ∂ B(0, 1) (and h|∂ Ω equal to the prescribed boundary values if there are any). Define ( ( j) ) x − ak ( j) ( j) ( j) if x ∈ B(ak , rk ), u j (x) := Ax + rk ψ ( j) rk and u j := h on Ω \ B(0, 1). Then, it is not hard to see that u j ⇀ u in W1,p for { Ax if x ∈ B(0, 1), x ∈ Ω. u(x) = h(x) if x ∈ Ω \ B(0, 1), Thus, the lower semicontinuity yields, after eliminating the constant part of the functional on Ω \ B(0, 1), ∫ B(0,1) f (A) dx ≤ lim inf j→∞ = lim inf j→∞ = lim inf ∫ j→∞ = B(0,1) ∫ B(0,1) ∑ k ∫ ( j) ( j) B(ak ,rk ) ∑ rk ( j) k f (∇u j (x)) dx ∫ B(0,1) ( ( ( j) )) x − ak f A + ∇ψ dx ( j) rk f (A + ∇ψ (y)) dy f (A + ∇ψ (y)) dy ( j) since ∑k rk = 1. This is nothing else than quasiconvexity. Comment... We note that also for quasiconvex integrands, our results about the Euler–Lagrange equations hold just the same (if we assume the same growth conditions). 3.7. INTEGRANDS WITH U-DEPENDENCE 73 3.7 Integrands with u-dependence One very useful feature of our Young measure approach is that it allows to derive a lower semicontinuity result for u-dependent integrands with minimal additional effort. So consider ∫ F [u] := f (x, u(x), ∇u(x)) dx → min, Ω over all u ∈ W1,p (Ω; Rm ) with u|∂ Ω = g, where Ω ⊂ Rd is a bounded Lipschitz domain, 1 < p < ∞, and the Carathéodory integrand f : Ω × Rm × Rm×d → R satisfies the p-growth bound | f (x, v, A)| ≤ M(1 + |v| p + |A| p ) for some M > 0, (3.19) and g ∈ W1−1/p,p (∂ Ω; Rm ). The main idea of our approach is to consider Young measures generated by the pairs (u j , ∇u j ) ∈ Rm+md . We have the following result: Lemma 3.28. Let (u j ) ⊂ L p (Ω; RM ) and (V j ) ⊂ L p (Ω; RN ) be norm-bounded sequences such that u j → u pointwise a.e. Y V j → ν = (νx ) ∈ Y p (Ω; RN ). and Y Then, (u j ,V j ) → µ = (µx ) ∈ Y p (Ω; RM+N ) with µx = δu(x) ⊗ νx for a.e. x ∈ Ω, that is, ∫ Ω f (x, u j (x),V j (x)) dx → ∫ ∫ Ω f (x, u(x), A) dνx (A) dx (3.20) for all Carathéodory f : Ω × RM × RN → R satisfying the p-growth bound (3.19). Proof. By a similar density argument as in the proof of Lemma 3.24, it suffices to show the convergence (3.20) for f (x, v, A) = φ (x)ψ (v)h(A), where φ ∈ C0 (Ω), ψ ∈ C0 (RM ), h ∈ C0 (RN ). We already know by assumption ⟩) ⟨ ∗ ( h(V j ) ⇁ x 7→ h, νx in L∞ . Furthermore, ψ (u j ) → ψ (u) almost everywhere and thus strongly in L1 since ψ is bounded. Thus, since the product of a L∞ -weakly* converging sequence and a L1 -strongly converging sequence converges itself weakly* in the sense of measures, ∫ Ω φ (x)ψ (u j (x))h(V j (x)) dx → ∫ Ω ⟩ ⟨ φ (x)ψ (u(x)) h, νx dx, Comment... which is (3.20) for our special f . This already finishes the proof. 74 CHAPTER 3. QUASICONVEXITY Y The trick of the preceding lemma is that in our situation, where V j = ∇u j → ν ∈ GY(Ω; Rm×d ), it allows us to “freeze” u(x) in the integrand. Then, we can apply the Jensentype inequality from Lemma 3.22 just as we did in Theorem 3.25 to get: Theorem 3.29. Let f : Ω × Rm × Rm×d → R be a Carathéodory integrand with p-growth, i.e. | f (x, v, A)| ≤ M(1 + |v| p + |A| p ) for some M > 0, 1 < p < ∞. Assume furthermore that f (x, v, q) is quasiconvex for every fixed (x, v) ∈ Ω × Rm . Then, the corresponding functional F is weakly lower semicontinuous on W1,p (Ω; Rm ). Remark 3.30. 
It is not too hard to extend the previous theorem to integrands f satisfying the more general growth condition | f (x, v, A)| ≤ M(1 + |v|q + |A| p ) for a 1 ≤ q < p/(d − p). By the Sobolev Embedding Theorem A.17, u j → u in Lq (strongly) for all such q and thus Lemma 3.28 and then Theorem 3.29 can be suitably generalized. 3.8 Regularity of minimizers At the end of Section 2.5 we discussed the failure of regularity for vector-valued problems. In particular, we mentioned various counterexamples to regularity. Upon closer inspection, however, it turns out that in all counterexamples, the points where regularity fails form a closed set. This is not a coincidence. Another issue was that all regularity theorems discussed so far require (strong) convexity of the integrand and our discussion so far has shown that convexity is not a good notion for vector-valued problems (however, much of the early regularity theory for vectorvalued problems focused on convex problems, we will not elaborate on this aspect here). To remedy this, we need a new notion, which will take over from strong convexity in regularity theory: A locally bounded Borel-measurable function h : Rm×d → R is called strongly (2-)quasiconvex if there exists µ > 0 such that A 7→ h(A) − µ |A|2 is quasiconvex. Equivalently, we may require that µ ∫ B(0,1) |∇ψ (z)|2 dz ≤ ∫ B(0,1) h(A + ∇ψ (z)) − h(A) dz m for all A ∈ Rm×d and all ψ ∈ W1,∞ 0 (B(0, 1); R ). The most well-known regularity result in this situation is by Lawrence C. Evans from 1986 (there is significant overlap with work by Acerbi & Fusco in the same year): Theorem 3.31 (Evans partial regularity theorem). Let f : Rm×d → R be of class C2 , strongly quasiconvex, and assume that there exists M > 0 such that Comment... D2A f (A)[B, B] ≤ M|B|2 for all A, B ∈ Rm×d . 3.9. NOTES AND BIBLIOGRAPHICAL REMARKS 75 1,2 m m 1/2,2 (∂ Ω; Rm ). Let u ∈ W1,2 g (Ω; R ) be a minimizer of F over Wg (Ω; R ) where g ∈ W Then, there exists a relatively closed singular set Σu ⊂ Ω with |Σu | = 0 and it holds that α u ∈ C1, loc (Ω \ Σu ) for all α ∈ (0, 1). The proof is long and builds on earlier work of Almgren and De Giorgi in Geometric Measure Theory. The result is called a “partial regularity” result because the regularity does not hold everywhere. It should be noted that while the scalar regularity theory was essentially a theory for PDEs, and hence applies to all solutions of the Euler–Lagrange equations, this is not the case for the present result: There is no regularity theory for critical points of the Euler–Lagrange equation for a quasiconvex or even polyconvex (see next chapter) integral functional. This was shown by Müller & Švérak in 2003 [75] (for the quasiconvex case) and Székelyhidi Jr. in 2004 [92] (for the polyconvex case). We finally remark that sometimes better estimates on the “smallness” of Σu than merely |Σu | = 0 are available. In fact, for strongly convex integrands it can be shown that the (Hausdorff-)dimension of the singular set is at most d − 2, this is actually within reach of our discussion in Section 2.5, because we showed W2,2 loc -regularity, and general properties of such functions imply the conclusion, see Chapter 2 of [44]. In the strongly quasiconvex case, much less is known, but at least for minimizers that happen to be W1,∞ it was established in 2007 by Kristensen & Mingione that the dimension of the singular set is strictly less than d, see [56]. Many other questions are open. 3.9 Notes and bibliographical remarks Comment... 
The notion of quasiconvexity was first introduced in Morrey’s fundamental paper [69] and later refined by Meyers in [65]. The results about null-Lagrangians, in particular Lemma 3.8 and Lemma 3.10, go back to Morrey [70], Reshetnyak [82] and Ball [5]. Our proof is based on the idea that certain combinations of derivatives might have good convergence properties even if the individual derivatives do not. This is also pivotal for so-called compensated compactness, which originated in work of Tartar [94, 95] and Murat [76, 77], in particular in their celebrated Div–Curl Lemma, see for example the discussion in [35]. It is quite remarkable that one can prove Brouwer’s fixed point theorem using the fact that minors are null-Lagrangians, see Theorem 8.1.3 in [36]. Proposition 3.6 is originally due to Morrey [70], we follow the presentation in [12]. We also remark that if only r = p, then a weaker version of Lemma 3.10 is true where we only have convergence of the minors in the sense of distributions, see Theorem 8.20 in [25]. The convexity properties of the quadratic function A 7→ MA : A, where M is a fourthorder tensor (or, equivalently, a matrix in Rmd×md ) has received considerable attention because it corresponds to a linear Euler–Lagrange equation. In this case, quasi-convexity and rank-one convexity are the same, and polyconvexity (see next chapter) is also equivalent to quasiconvexity if d = 2 or m = 2, but this does not hold anymore for d, m ≥ 3. These results together with pointers to the vast literature can be found in Section 5.3.2 in [25]. 76 CHAPTER 3. QUASICONVEXITY A more general result on why convexity is inadmissible for realistic problems in nonlinear elasticity can be found in Section 4.8 of [20]. The result that rank-one convex functions are locally Lipschitz continuous is wellknown for convex functions, see for example Corollary 2.4 in [34] and an adapted version for rank-one convex (even separately convex) functions is in Theorem 2.31 in [25]. Our proof with a quantitative bound is from Lemma 2.2 in [12]. The Fundamental Theorem of Young Measure Theory 3.11 has seen considerable evolution since its original proof by Young in the 1930s and 1940s [97–99]. It was popularized in the Calculus of Variations by Tartar [95] and Ball [7]. Further, much more general, versions have been contributed by Balder [4], also see [19] for probabilistic applications. The Ball–James Rigidity Theorem 3.23 is from [10], see [73] for further results in the same spirit. Lemma 3.21 is an expression of the well-known decomposition lemma from [38], another version is in [53]. Many results in this section that are formulated for Carathéodory integrands continue to hold for f : Ω × RN → R that are Borel measurable and lower semicontinuous in the second argument, so called normal integrands, see [16] and [37]. The linear growth case p = 1 is very special and requires more sophisticated tools, socalled generalized Young measures, see [1, 32, 57, 58]. One issue in the linear-growth case is demonstrated in the following counterpart of Lemma 3.18 for p = 1: Lemma 3.32. Let (V j ) ⊂ L1 (Ω; RN ) be a bounded sequence generating the Young measure (νx )x∈Ω ⊂ M1 (RN ). Define v(x) := [νx ] = ⟨id, νx ⟩. Then there exists an increasing sequence Ωk ⊂ Ω with |Ωk | ↑ |Ω| such that V j ⇀ v in L1 (Ωk ; RN ) for all k; this is called biting convergence. Comment... 
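The phenomenon behind Lemma 3.32 can be seen in the concentrating sequence V_j = j·1_{(0,1/j)} from the example given after Lemma 3.18: it generates the Young measure ν_x = δ_0, so v = 0, but V_j does not converge weakly to 0 in L¹(0,1) since its unit mass escapes towards the origin; on every bite Ω_k = (1/k, 1) it trivially does. A small numerical sketch (φ is an arbitrary test function):

```python
import numpy as np

# The concentrating sequence V_j = j * 1_{(0,1/j)}: Young measure nu_x = delta_0, barycenter v = 0,
# yet no weak L^1 convergence to 0 on all of (0,1); on the bite (1/k, 1) the limit is 0. (Illustration only.)
n = 1_000_000
x = (np.arange(n) + 0.5) / n
phi = lambda t: np.cos(3 * t) + 1.0                         # arbitrary test function

for j in [10, 100, 1000]:
    V_j = np.where(x < 1.0 / j, float(j), 0.0)
    full = np.mean(V_j * phi(x))                            # ~ int_0^1 V_j phi dx
    bite = np.mean(np.where(x > 0.1, V_j, 0.0) * phi(x))    # ~ integral over the bite (1/k, 1), 1/k = 0.1
    print(j, full, bite)
```

The integral over all of (0,1) stabilizes near φ(0) instead of 0, which is exactly the failure of weak L¹ convergence, while the integral over the bite tends to 0.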
Once one knows the so-called Chacon Biting Lemma, which asserts that every L1 bounded sequence has a subsequence converging in the biting sense, the preceding lemma is easy to see using (III) from the Fundamental Theorem of Young Measure Theory 3.11. More information on this topic can be found in [14] and some of the references cited therein. It is possible to prove the Lower Semicontinuity Theorem 3.25 for quasiconvex integrands without the use of Young measures, see for instance Chapter 8 in [25] for such an approach. However, many of the ideas are essentially the same, they are just carried out directly without the intermediate steps of Young measures. However, Young measures allow one to very clearly organize and compartmentalize these ideas, which aids the understanding of the material. We will also use Young measures later for the relaxation of non-lower semicontinuous functionals. More on lower semicontinuity and Young measures can be found in the book [81]. The first use of Young measures to investigate questions of lower semicontinuity for quasiconvex integral functionals seems to be [50]. The blow-up technique is related to the use of tangent measures in Geometric Measure Theory, see for example [61]. Chapter 4 Polyconvexity At the beginning of Chapter 3 we saw that convexity cannot hold concurrently with frameindifference (and a mild positivity condition). Thus, we were lead to consider quasiconvex integrands. However, while quasiconvexity is of huge importance in the theory of the Calculus of Variations, our Lower Semicontinuity Theorem 3.25 has one major drawback: we needed to require the p-growth bound | f (x, A)| ≤ M(1 + |A| p ) for all x ∈ Ω, A ∈ Rd×d and some M > 0, 1 < p < ∞. Unfortunately, this is not a realistic assumption for nonlinear elasticity because it ignores the fact that infinite compressions should cost infinite energy. Indeed, realistic integrands have the property that f (A) → +∞ as f (A) = +∞ if det A ↓ 0 and det A ≤ 0. For example, the family of matrices ( ) 1 0 Aα := α > 0, , 0 α Comment... satisfies det Aα ↓ 0 as α ↓ 0, but |Aα | remains uniformly bounded, so the above p-growth bound cannot hold. The question whether the Lower Semicontinuity Theorem 3.25 for quasiconvex integrands can be extended to integrands with the above growth is currently a major unsolved problem, see [8]. For the time being, we have to confine ourself to a more restrictive notion of convexity if we want to allow for the above “elastic” growth. This type of convexity is called polyconvexity by John Ball in [5], who proved the corresponding existence theorem for minimization problems and enabled a mature mathematical treatment of nonlinear elasticity theory, including the realistic Ogden materials, see Section 1.4 and Example 4.3. We will only look at the three-dimensional theory because it is by far the most physically important case and because this restriction eases the notational burden; however, other dimensions can be treated as well, we refer to the comprehensive treatment in [25]. 78 CHAPTER 4. POLYCONVEXITY 4.1 Polyconvexity A continuous integrand h : R3×3 → R ∪ {+∞} (now we allow the value +∞) is called polyconvex if it can be written in the form h(A) = H(A, cof A, det A), A ∈ R3×3 , where H : R3×3 × R3×3 × R → R ∪ {+∞} is convex. Here, cof A denotes the cofactor matrix as defined in Appendix A.1. While convexity obviously implies polyconvexity, the converse is clearly wrong. However, we have: Proposition 4.1. 
A polyconvex h : R3×3 → R (not taking the value +∞) is quasiconvex. Proof. Let h be as stated and let ψ ∈ W1,∞ (B(0, 1); R3 ) with ψ (x) = Ax on ∂ B(0, 1). Then, using Jensen’s inequality ∫ − B(0,1) ∫ h(∇ψ ) dx = − H(∇ψ , cof ∇ψ , det ∇ψ ) dx B(0,1) (∫ ∫ ∫ ≥H − ∇ψ dx, − cof ∇ψ dx, − B(0,1) B(0,1) B(0,1) ) det ∇ψ dx = H(A, cof A, det A) = h(A), where for the last line we used the fact that minors are null-Lagrangians as proved in Lemma 3.8. At this point we have seen all four major convexity conditions that play a role in the modern Calculus of Variations. They can be put into a linear order by implications: convexity ⇒ polyconvexity ⇒ quasiconvexity ⇒ rank-one convexity While in the scalar case (d = 1 or m = 1) all these notions are equivalent, in higher dimensions the reverse implications do not hold in general. Clearly, the determinant function is polyconvex but not convex. However, the fact that the notions of polyconvexity, quasiconvexity, and rank-one convexity are all different in higher dimensions, are non-trivial. The fact that rank-one convexity does not imply quasiconvexity, at least for d ≥ 2, m ≥ 3, is the subject of Švérak’s famous counterexample, see Example 5.10. The case d = m = 2 is still open and one of the major unsolved problems in the field. Here we discuss an example of a quasiconvex, but not polyconvex function: Example 4.2 (Alibert–Dacorogna–Marcellini). Recall from Example 3.5 the Alibert– Dacorogna–Marcellini function ( ) hγ (A) := |A|2 |A|2 − 2γ det A , A ∈ R2×2 . Comment... It can be shown that hγ is polyconvex if and only if |γ | ≤ 1. Since it is quasiconvex if and only if |γ | ≤ γQC , where γQC > 1, we see that there are quasiconvex functions that are not polyconvex. See again Section 5.3.8 in [25] for details. 4.2. EXISTENCE THEOREMS 79 Example 4.3 (Ogden materials). Functions W of the form M [ ] N [ ] W (F) := ∑ ai tr (F T F)γi /2 + ∑ b j tr cof (F T F)δ j /2 + Γ(det F), i=1 F ∈ R3×3 , j=1 where ai > 0, γi ≥ 1, b j > 0, δ j ≥ 1, and Γ : R → R ∪ {+∞} is a convex function with Γ(d) → +∞ as d ↓ 0, Γ(d) = +∞ for s ≤ 0 and good enough growth from below, can be shown to be polyconvex. These stored energy functionals correspond to so-called Ogden materials and occur in a wide range of elasticity applications, see [20] for details. It can also be shown that in three dimensions convex functions of certain combinations of the singular values of a matrix (the singular values of A are the square roots of the eigenvalues of AT A) are polyconvex, see Section 5.3.3 in [25] for details. 4.2 Existence Theorems Let Ω ⊂ R3 be a connected bounded Lipschitz domain. In this section we will prove existence of a minimizer of the variational problem ∫ F [u] := W (x, ∇u(x)) − b(x) · u(x) dx → min, Ω (4.1) 1,p 3 over all u ∈ W (Ω; R ) with det ∇u > 0 a.e. and u|∂ Ω = g, where W : Ω × R3×3 → R ∪ {+∞} is Carathéodory and W (x, q) is polyconvex for almost every x ∈ Ω. Thus, W (x, A) = W(x, A, cof A, det A), (x, A) ∈ Ω × R3×3 , for W : Ω × R3×3 × R3×3 × R → R ∪ {+∞} with W(x, q, q, q) jointly convex and continuous for almost every x ∈ Ω. As the only upper growth assumptions on W we impose { W (x, A) → +∞ as det A ↓ 0, (4.2) W (x, A) = +∞ if det A ≤ 0. Furthermore, we assume as usual that g ∈ W1−1/p,p (∂ Ω; R3 ) and that b ∈ Lq (Ω; R3 ) where 1/p + 1/q = 1. The exponent p > 1 will remain unspecified for now. Later, when we impose conditions on the coercivity of W , we will also specify p. The first lower semicontinuity result is relatively straightforward: Theorem 4.4. 
If in addition to the above assumptions it holds that µ |A| p − µ −1 ≤ W (x, A) for some µ > 0 and p > 3, (4.3) then the minimization problem (4.1) has at least one solution in the space { } A := u ∈ W1,p (Ω; R3 ) : det ∇u > 0 a.e. and u|∂ Ω = g , Comment... whenever this set is non-empty. 80 CHAPTER 4. POLYCONVEXITY Proof. We apply the usual Direct Method. For a minimizing sequence (u j ) ⊂ A with F [u j ] → min, we get from the coercivity estimate (4.3) in conjunction with the Poincaré– Friedrichs inequality (and the fixed boundary values) that for any δ > 0, ∫ Ω W (x, ∇u j ) − b(x) · u j (x) dx ≥ µ ∥∇u j ∥Lp p − µ −1 |Ω| − ∥b∥Lq · ∥u j ∥L p µ µ |Ω| δ 1 p p p ≥ ∥u j ∥W ∥g∥W − ∥b∥qLq − ∥u j ∥W 1,p − 1,p . 1−1/p,p − C C µ δq p Choosing δ = pµ /(2C), we arrive at ( ) sup ∥u j ∥W1,p ≤ C sup F [u j ] + 1 . j∈N j∈N Thus, we may select a subsequence (not renumbered) such that u j ⇀ u in W1,p (Ω; R3 ). Now, by Lemma 3.10 and p > 3, { det ∇u j ⇀ det ∇u in L p/3 and cof ∇u j ⇀ cof ∇u in L p/2 . Thus, an argument entirely analogous to the proof of Theorem 2.7 (note that in that theorem we did not need any upper growth condition) yields that the hyperelastic part of F , u 7→ ∫ Ω W (x, ∇u j ) dx = ∫ Ω W(x, ∇u j , cof ∇u j , det ∇u j ) dx is weakly lower semicontinuous in W1,p . The second part of F is weakly continuous by Lemma 2.37. Thus, F [u] ≤ lim inf F [u j ] = inf F . j→∞ A In particular, det ∇u > 0 almost everywhere and u|∂ Ω = g by the weak continuity of the trace. Hence u ∈ A and the proof is finished. Unfortunately, the preceding theorem has a major drawback in that p > 3 has to be assumed. In applications in elasticity theory, however, a more realistic form of W is W (A) = λ (tr E)2 + µ tr E 2 + O(|E|2 ), 2 1 E = (AT A − I), 2 (4.4) where λ , µ > 0 are the Lamé constants. Except for the last term O(|E|2 ), which vanishes as |E| ↓ 0, this energy corresponds to a so-called St. Venant–Kirchhoff material. It was shown by Ciarlet & Geymonat [21] (also see Theorem 4.10-2 in [20]) that for any such Lamé constants, there exists a polyconvex function W of the form Comment... W (A) = a|A|2 + b| cof A|2 + γ (det A) + c, 4.2. EXISTENCE THEOREMS 81 where a, b, c > 0 and γ : R → [0, ∞] is convex, such that the above expansion (4.4) holds. Clearly, such W has 2-growth in |A|. Thus, we need an existence theorem for functions with these growth properties. The core of the refined argument will be an improvement of Lemma 3.10: Lemma 4.5. Let 1 ≤ p, q, r ≤ ∞ with 1 1 + ≤ 1, p q p ≥ 2, r ≥ 1, and assume that the sequence (u j ) ⊂ W1,p (Ω; R3 ) satisfies for some H ∈ Lq (Ω; R3×3 ), d ∈ Lr (Ω) that u j ⇀ u in W1,p , cof ∇u j ⇀ H in Lq , det ∇u j ⇀ d in Lr . Then, H = cof ∇u and d = det ∇u. Proof. The idea is to use distributional versions of the cofactor matrix and Jacobian determinant and to show that they agree with the usual definitions for sufficiently regular functions. To this aim we will use the representation of minors as divergences already employed in Lemmas 3.8, 3.10. Step 1. From (3.7) we get that for all φ = (φ 1 , φ 2 , φ 3 ) ∈ C1 (Ω; R3 ), (cof ∇φ )kl = ∂l+1 (φ k+1 ∂l+2 φ k+2 ) − ∂l+2 (φ k+1 ∂l+1 φ k+2 ), where k, l ∈ {1, 2, 3} are cyclic indices. Thus, for all ψ ∈ C∞ c (Ω), ∫ ∫ (cof ∇φ )kl ψ dx = − (φ k+1 ∂l+2 φ k+2 )∂l+1 ψ − (φ k+1 ∂l+1 φ k+2 )∂l+2 ψ dx Ω ⟨ Ω ⟩ =: (Cof ∇φ )kl , ψ (4.5) and we call the quantity “Cof ∇φ ” the distributional cofactors. 
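The Piola identity div cof ∇v = 0, recalled as (3.9) in Step 2 below, underlies these divergence structures and is easy to verify numerically for smooth maps. The following Python sketch (the deformation φ is an arbitrary smooth choice; the check is purely illustrative) computes cof ∇φ on a grid via the cyclic formula (cof ∇φ)^k_l = ∂_{l+1}φ^{k+1} ∂_{l+2}φ^{k+2} − ∂_{l+2}φ^{k+1} ∂_{l+1}φ^{k+2} and confirms that its row-wise divergence vanishes up to discretization error.

```python
import numpy as np

# Numerical check of the Piola identity  div(cof grad(phi)) = 0  for a smooth phi: R^3 -> R^3.
# The deformation phi below is an arbitrary smooth choice.
n, h = 48, 1.0 / 48
ax = np.arange(n) * h
X = np.stack(np.meshgrid(ax, ax, ax, indexing="ij"))                  # shape (3, n, n, n)
phi = np.stack([X[0] + 0.1 * X[1] ** 2,
                X[1] + 0.2 * X[0] * X[2],
                X[2] + 0.1 * np.sin(X[0])])

# F[k, l] = d_l phi^k via central differences
F = np.stack([np.stack([np.gradient(phi[k], h, axis=l) for l in range(3)]) for k in range(3)])

# cofactor via the cyclic formula (indices mod 3)
cof = np.empty_like(F)
for k in range(3):
    for l in range(3):
        k1, k2, l1, l2 = (k + 1) % 3, (k + 2) % 3, (l + 1) % 3, (l + 2) % 3
        cof[k, l] = F[k1, l1] * F[k2, l2] - F[k1, l2] * F[k2, l1]

# row-wise divergence (div cof)^k = sum_l d_l (cof)^k_l, evaluated away from the boundary
div_cof = np.stack([sum(np.gradient(cof[k, l], h, axis=l) for l in range(3)) for k in range(3)])
inner = (slice(None),) + (slice(2, -2),) * 3
print(float(np.max(np.abs(div_cof[inner]))))                          # small: pure discretization error
```

Refining the grid lets the maximal residual decay like h², consistent with the identity holding exactly in the continuum.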
To investigate the continuity properties of the distributional cofactors, we consider G(φ ) := ∫ Ω (φ k ∂l φ m )∂n ψ dx, φ ∈ W1,p (Ω; R3 ), for some k, l, m, n ∈ {1, 2, 3} and fixed ψ ∈ C∞ c (Ω). We can estimate this using the Hölder inequality as follows: |G(φ )| ≤ ∥φ ∥Ls ∥∇φ ∥L p ∥∇ψ ∥∞ whenever 1 1 + ≤ 1. s p Comment... (4.6) 82 CHAPTER 4. POLYCONVEXITY In particular this is true for s = p ≥ 2. Thus, since C1 (Ω; R3 ) is dense in W1,p (Ω; R3 ), we have that the equality (4.5) also holds for φ ∈ W1,p (Ω; R3 ) whenever p ≥ 2. Moreover, the above arguments yield for a sequence (φ j ) ⊂ W1,p (Ω; R3 ) that { φj → φ in Ls , if G(φ j ) → G(φ ) ∇φ j ⇀ ∇φ in L p . The first convergence is automatically satisfied by the compact Rellich–Kondrachov Theorem A.18 if { 3p if p < 3, s < 3−p (4.7) ∞ if p ≥ 3. We can always choose s such that it simultaneously satisfies (4.6) and (4.7), as a quick calculation shows. Thus, ⟨ ⟩ ⟨ ⟩ if Cof ∇φ j , ψ → Cof ∇φ , ψ φ j ⇀ φ in W1,p , p ≥ 2. (4.8) Step 2. First, if φ = (φ 1 , φ 2 , φ 3 ) ∈ C1 (Ω; R3 ), then we know from (3.8) that 3 3 ( ) det ∇φ = ∑ ∂l φ 1 (cof ∇φ )1l = ∑ ∂l φ 1 (cof ∇φ )1l . l=1 l=1 Thus, in this case, we have for all ψ ∈ C∞ c (Ω) that ∫ 3 Ω (det ∇φ )ψ dx = ∑ ∫ 3 l=1 Ω ∂l φ 1 (cof ∇φ )1l ψ dx = − ∑ ∫ ( ) φ 1 (cof ∇φ )1l ∂l ψ dx. (4.9) l=1 Ω The key idea now is that by Hölder’s inequality this quantity makes sense if only φ ∈ L p (Ω; R3 ) and cof ∇φ ∈ Lq (Ω; R3×3 ) with 1p + 1q ≤ 1. Analogously to the argument for the cofactor matrix, this motivates us to define for φ ∈ L p (Ω; R3 ) with cof ∇φ ∈ Lq (Ω; R3 ), where 1 1 + ≤ 1, p q the distributional determinant 3 ∫ ( ⟨ ⟩ ) Det ∇φ , ψ := − ∑ φ 1 (cof ∇φ )1l ∂l ψ dx, ψ ∈ C∞ c (Ω). l=1 Ω From (4.9) we see that if φ ∈ C1 (Ω; R3 ), then “det = Det”, i.e. ∫ ⟨ ⟩ (det ∇φ )ψ dx = Det ∇φ , ψ Ω (4.10) for all ψ ∈ C∞ c (Ω). We want to show that this equality remains to hold if merely φ ∈ 1,p 3 W (Ω; R ) and cof ∇φ ∈ Lq (Ω; R3×3 ) with p, q as in the statement of the lemma. For the 1 3 1 3×3 ), moment fix ψ ∈ C∞ c (Ω) and define for φ ∈ C (Ω; R ) and F ∈ C (Ω; R 3 Z(φ , F) := ∑ ∫ Comment... l=1 Ω 3 ∂l φ 1 Fl1 ψ + φ 1 Fl1 ∂l ψ dx = ∑ ∫ l=1 Ω Fl1 ∂l (φ 1 ψ ) dx. 4.2. EXISTENCE THEOREMS 83 To establish (4.10), we need to show Z(φ , cof ∇φ ) = 0. Recall the Piola identity (3.9), div cof ∇v = 0 if v ∈ C1 (Ω; R3 ). As a consequence, Z(φ , cof ∇v) = 0 for φ , v ∈ C1 (Ω; R3 ). Furthermore, we have the estimate |Z(φ , F)| ≤ ∥∇φ ∥L p ∥F∥Lq ∥ψ ∥∞ whenever 1 1 + ≤ 1. p q Thus, we get, via two approximation procedures (first in F = cof ∇v, then in φ ) Z(φ , cof ∇v) = 0 for φ ∈ W1,p (Ω; R3 ), cof ∇v ∈ Lq (Ω; R3×3 ). In particular, Z(φ , cof ∇φ ) = 0 for φ ∈ W1,p (Ω; R3 ), cof ∇φ ∈ Lq (Ω; R3×3 ). Therefore, (4.10) holds also for φ = u as in the statement of the lemma. Moreover, we see from the definition of the distributional determinant that for a sequence (φ j ) ⊂ W1,p (Ω; R3 ) we have { ⟨ ⟩ ⟨ ⟩ φj → φ in Ls , Det ∇φ j , ψ → Det ∇φ , ψ if cof ∇φ j ⇀ cof ∇φ in Lq . whenever 1 1 + ≤ 1. s q (4.11) The first convergence follows, as before, from the Rellich–Kondrachov Theorem A.18 if { 3p if p < 3, 3−p s< (4.12) ∞ if p ≥ 3. However, since we assumed 1 1 + ≤ 1, p q Comment... we can always choose s such that it satisfies (4.11) and (4.12) simultaneously, as can be seen by some elementary algebra. Thus, { ⟨ ⟩ ⟨ ⟩ φj ⇀ φ in W1,p , Det ∇φ j , ψ → Det ∇φ , ψ if (4.13) cof ∇φ j ⇀ cof ∇φ in Lq , 84 CHAPTER 4. POLYCONVEXITY where p, q satisfy the assumptions of the lemma. Step 3. 
Assume that, as in the statement of the lemma, we are given a sequence (u j ) ⊂ W1,p (Ω; R3 ) such that for H ∈ Lq (Ω; R3×3 ), d ∈ Lr (Ω) it holds that in W1,p , uj ⇀ u cof ∇u j ⇀ H in Lq , det ∇u j ⇀ d in Lr . Then, (4.5), (4.10) imply that for all ψ ∈ C∞ c (Ω), ⟨ ⟩ ⟨ ⟩ Cof ∇u j , ψ → H, ψ ⟨ ⟩ ⟨ ⟩ Det ∇u j , ψ → d, ψ . and On the other hand, from (4.8), (4.13), ⟨ ⟩ ⟨ ⟩ Cof ∇u j , ψ → Cof ∇u, ψ ⟨ ⟩ ⟨ ⟩ Det ∇u j , ψ → Det ∇u, ψ . and Thus, ∫ Ω (Cof ∇u − H)ψ dx = 0 ∫ and Ω (Det ∇u − d)ψ dx = 0 for all ψ ∈ C∞ c (Ω). The Fundamental Lemma of the Calculus of Variations 2.21 then gives immediately H = cof ∇u and d = det ∇u. This finishes the proof. With this tool at hand, we can now prove the improved existence result: Theorem 4.6 (Ball). Let 1 ≤ p, q, r < ∞ p ≥ 2, 1 1 + ≤ 1, p q r > 1, such that in addition to the assumptions at the beginning of this section, it holds that ( ) W (x, A) ≥ µ |A| p + | cof A|q + | det A|r − µ −1 for some µ > 0. Then, the minimization problem (4.1) has at least one solution in the space { A := u ∈ W1,p (Ω; R3 ) : cof ∇u ∈ Lq (Ω; R3×3 ), det ∇u ∈ Lr (Ω), } det ∇u > 0 a.e. and u|∂ Ω = g , Comment... whenever this set is non-empty. (4.14) 4.3. GLOBAL INJECTIVITY 85 Proof. This follows in a completely analogous way to the proof of Theorem 4.4, but now we select a subsequence of a minimizing sequence (u j ) ⊂ A such that for some H ∈ Lq (Ω; R3×3 ), d ∈ Lr (Ω) we have uj ⇀ u in W1,p , cof ∇u j ⇀ H in Lq , det ∇u j ⇀ d in Lr , which is possible by the usual weak compactness results in conjunction with the coercivity assumption (4.14). Thus, Lemma 4.5 yields uj ⇀ u in W1,p , cof ∇u j ⇀ cof ∇u in Lq , det ∇u j ⇀ det ∇u in Lr , and we may argue as in Theorem 4.4 to conclude that u is indeed a minimizer. The proof that u ∈ A is the same as in Theorem 4.4. 4.3 Global injectivity For reasons of physical admissibility, we need to additionally prove that we can find a minimizer that is globally injective, at least almost everywhere (since we work with Sobolev functions). There are several approaches to this issue, for example via the topological degree. We here present a classical argument by Ciarlet & Nečas [22]. It is important to notice that on physical grounds it is only realistic to expect this injectivity for p > d. For lower exponents, complex effects such as cavitation and (microscopic) fracture have to be considered. This is already expressed in the fact that Sobolev functions in W1,p for p ≤ d are not necessarily continuous anymore. We will prove the following theorem: Theorem 4.7. In the situation of Theorem 4.4, in particular p > 3, the minimization problem (4.1) has at least one solution in the space { } A := u ∈ W1,p (Ω; Rd ) : det ∇u > 0 a.e., u is injective a.e., u|∂ Ω = g , whenever this set is non-empty. Proof. Let (u j ) ⊂ A be a minimizing sequence. The proof is completely analogous to the one of Theorem 4.4, we only need to show in addition that u is injective almost everywhere. For our choice p > 3, the space W1,p (Ω; R3 ) embeds continuously into C(Ω; R3 ), so we have that Comment... u j → u uniformly. 86 CHAPTER 4. POLYCONVEXITY Let U ⊂ R3 be any relatively compact open set with u(Ω) ⊂⊂ U (note that u(Ω) is relatively compact). Then, u j (Ω) ⊂ U for j sufficiently large by the uniform convergence. Hence, for such j, |u j (Ω)| ≤ |U|. Furthermore, since det ∇u j converges weakly to det ∇u in L p/3 by Lemma 3.10 and all u j are injective almost everywhere by definition of the space A , ∫ Ω det ∇u dx = lim ∫ det ∇u j dx = lim |u j (Ω)| ≤ |U|. 
j→∞ Ω j→∞ Letting |U| ↓ |u(Ω)|, we get the Ciarlet–Nečas non-interpenetration condition ∫ Ω det ∇u dx ≤ |u(Ω)|. (4.15) It is known (see for example [60] or [17]) that even if only u ∈ W1,p (Ω; R3 ) it holds that ∫ Ω | det ∇u| dx = ∫ u(Ω) H 0 (u−1 (x′ )) dx′ , where we denote by H 0 the counting measure. We estimate |u(Ω)| ≤ ∫ u(Ω) H 0 (u−1 (x′ )) dx′ = ∫ Ω det ∇u dx ≤ |u(Ω)|, where we used that det ∇u > 0 almost everywhere and (4.15). Thus, H 0 (u−1 (x′ )) = 1 for a.e. x′ ∈ u(Ω), which is nothing else than the almost everywhere injectivity of u. 4.4 Notes and bibliographical remarks Comment... Theorem 4.6 is a refined version by Ball, Currie & Olver [9] of Ball’s original result in [5]. Also see [10, 11] for further reading. Many questions about polyconvex integral functionals remain open to this day: In particular, the regularity of solutions and the validity of the Euler–Lagrange equations are largely open in the general case. Note that the regularity theory from the previous chapter is not in general applicable, at least if we do not assume the upper p-growth. These questions are even open for the more restricted situation of nonlinear elasticity theory. See [8] for a survey on the current state of the art and a collection of challenging open problems. As for the almost injectivity (for p > 3), this is in fact sometimes automatic, as shown by Ball [6], but the arguments do not apply to all situations. Tang [93] extended this to p > 2, but since then we have to deal with non-continuous functions, it is not even clear how to define u(Ω). For fracture, there is a huge literature, see [74] and the recent [46], which also contains a large bibliography. A study on the surprising (mis)behavior of the determinant for p < d can be found in [52]. Chapter 5 Relaxation When trying to minimize an integral functional over a Sobolev space, it it a very real possibility that there is no minimizer if the integrand is not quasiconvex; of course, even functionals with non-quasiconvex integrands can have minimizers, but, generally speaking, we should expect non-quasiconvexity to be correlated with the non-existence of minimizers. For instance, consider the functional F [u] := ∫ 1 0 ( )2 |u(x)|2 + |u′ (x)|2 − 1 dx, u ∈ W1,4 0 (0, 1). Comment... The gradient part of the integrand, a 7→ (a2 − 1)2 , is called a double-well potential, see Figure 5.1. Approximate minimizers of F try to satisfy u′ ∈ {−1, +1} as closely as possible, while at the same time staying close to zero because of the first term under the integral. These contradicting requirements lead to minimizing sequences that develop faster and faster oscillations similar to the ones shown in Figure 3.2 on page 48. It should be intuitively clear that no classical function can be a minimizer of F . In this situation, we have essentially two options: First, if we only care about the infimal value of F , we can compute the relaxation F∗ of F , which (by definition) will be lower semicontinuous and then find its minimizer. This will give us the infimum of F over all admissible functions, but the minimizer of F∗ says only very little about the minimizing sequence of our original F since all oscillations (and concentrations in some cases) have been “averaged out”. Second, we can focus on the minimizing sequences themselves and try to find a generalized limit object to a minimizing sequence encapsulating “interesting” information about the minimization. 
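To see concretely that inf F = 0 here and that the infimum cannot be attained, one can evaluate F along the sawtooth functions u_j(x) := s(jx)/j, where s is the 1-periodic sawtooth with slope +1 on (0,1/2) and −1 on (1/2,1). Then u_j ∈ W^{1,4}_0(0,1), u_j' ∈ {−1,+1} almost everywhere and |u_j| ≤ 1/(2j), so that F[u_j] = 1/(12 j²) → 0, whereas F[u] = 0 would require u ≡ 0 and |u'| = 1 a.e. at the same time. A short numerical sketch of this computation (purely illustrative):

```python
import numpy as np

# F[u] = int_0^1 u^2 + (u'^2 - 1)^2 dx evaluated along the sawtooth sequence u_j(x) = s(jx)/j.
n = 1_000_000
x = (np.arange(n) + 0.5) / n                     # midpoint grid on (0,1)

for j in [1, 2, 4, 8, 16, 32]:
    t = (j * x) % 1.0
    u = np.where(t < 0.5, t, 1.0 - t) / j        # |u_j| <= 1/(2j), u_j(0) = u_j(1) = 0
    du = np.where(t < 0.5, 1.0, -1.0)            # u_j' in {-1, +1}, so the double-well term vanishes
    F = np.mean(u ** 2 + (du ** 2 - 1.0) ** 2)   # Riemann-sum approximation of the functional
    print(j, F, 1.0 / (12 * j ** 2))             # F[u_j] = 1/(12 j^2) -> 0
```

The sequence (u_j) converges weakly, but not strongly, to 0 in W^{1,4}; what survives in the limit is only the statistics of the oscillating derivatives, and it is exactly this information that the generalized limit object has to retain.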
This limit object will be a Young measure and in fact this was the original motivation for introducing them. Effectively, Young measure theory allows one to replace a minimization problem over a Sobolev space by a generalized minimization problem over (gradient) Young measures that always has a solution. In applications, the emerging oscillations correspond to microstructure, which is very important for example in material science. In this chapter we will consider both approaches in turn. 88 CHAPTER 5. RELAXATION Figure 5.1: A double-well potential. 5.1 Quasiconvex envelopes We have seen in Chapter 3 that integral functionals with quasiconvex integrands are weakly lower semicontinuous in W1,p , where the exponent 1 < p < ∞ is determined by growth properties of the integrand. If the integrand is not quasiconvex, then we would like to compute the functional’s relaxation, i.e. the largest (weakly) lower semicontinuous function below it. We can expect that when passing from the original functional to the relaxed one, we should also pass from the original integrand to a related but quasiconvex integrand. For this, we define the quasiconvex envelope Qh : Rm×d → R ∪ {−∞} of a locally bounded Borel-function h : Rm×d → R as } {∫ 1,∞ m Qh(A) := inf − h(A + ∇ψ (z)) dz : ψ ∈ W0 (ω ; R ) , A ∈ Rm×d , (5.1) ω where ω ⊂ Rd is a bounded Lipschitz domain. Clearly, Qh ≤ h. By a similar argument as in Lemma 3.2, one can see that this formula is independent of the choice of ω . If h m has p-growth at infinity, we may replace the space W1,∞ 0 (ω ; R ) in the above formula by 1,p W0 (ω ; Rm ). Finally, by a covering argument as for instance in Proposition 3.27, we may furthermore restrict the class of ψ in the above infimum to those satisfying ∥ψ ∥L∞ ≤ ε for any ε > 0. It is well-known that for continuous h the quasiconvex envelope Qh is equal to { } Qh = sup g : g quasiconvex and g ≤ h , (5.2) Comment... see Section 6.3 in [25]. Note that the equality also holds if one (hence both) of the two expressions is −∞. We first need the following lemma: 5.1. QUASICONVEX ENVELOPES 89 Lemma 5.1. For continuous h : Rm×d → R with p-growth, |h(A)| ≤ M(1 + |A| p ), the quasiconvex envelope Qh as defined in (5.1) is either identically −∞ or real-valued with pgrowth, and quasiconvex. Proof. Step 1. If Qh takes the value −∞, say Qh(B0 ) = −∞, then for all N ∈ N there exists m ψ̂N ∈ W1,∞ 0 (B(0, 1/2); R ) such that ∫ B(0,1/2) h(A0 + ∇ψ̂N (z)) dz ≤ −N. m For any A0 ∈ Rm×d pick ψ∗ ∈ W1,∞ A0 x (B(0, 1); R ) that has trace B0 x on ∂ B(0, 1/2) and set { ψ̂N (z) if z ∈ B(0, 1/2), z ∈ B(0, 1). ψN (z) := otherwise, ψ∗ (z Hence, ∫ h(∇ψN (z)) dz ≤ Qh(A0 ) ≤ − B(0,1) −N +C B(0, 1) for some fixed C > 0 (depending on our choice of ψ∗ ). Letting N → ∞, we get Qh(A0 ) = −∞. Thus, Qh is identically −∞. m m×d we need to show Step 2. For any ψ ∈ W1,∞ 0 (B(0, 1); R ) and any A0 ∈ R ∫ − B(0,1) Qh(A0 + ∇ψ (z)) dz ≥ Qh(A0 ). (5.3) It can be shown that also Qh has p-growth, see for instance Lemma 2.5 in [54]. Thus, we can use an approximation argument in conjunction with Theorem 2.30 to prove that it suffices to m show the inequality (5.3) for piecewise affine ψ ∈ W1,∞ 0 (B(0, 1); R ), say ψ (x) = vk + Ak x (vk ∈ Rm , Ak ∈ Rm×d ) for x ∈ Gk from a disjoint collection of Lipschitz subdomains Gk ⊂ Ω ∪ (k = 1, . . . , N) with |Ω \ k Gk | = 0. Here we use the W1,p -density of piecewise affine functions in W1,∞ 0 , which can be proved by mollification and an elementary approximation of smooth functions by piecewise affine ones (e.g. 
employing finite elements). m Fix ε > 0. By the definition of Qh, for every k we can find φk ∈ W1,∞ 0 (Gk ; R ) such that ∫ Qh(A0 + Ak ) ≥ − f (A0 + Ak + ∇φk (z)) dz − ε . Gk Then, B(0,1) Qh(A0 + ∇ψ (z)) dz = ≥ N ∑ |Gk |Qh(A0 + Ak ) k=1 N ∫ ∑ k=1 Gk h(A0 + Ak + ∇φk (z)) dz − ε |Gk | ∫ h(A0 + ∇φ (z)) dz − ε |B(0, 1)| ( ) ≥ |B(0, 1)| Qh(A0 ) − ε , = B(0,1) Comment... ∫ 90 CHAPTER 5. RELAXATION m where φ ∈ W1,∞ 0 (B(0, 1); R ) is defined as { vk + Ak x + φk (x) if x ∈ Gk , φ (x) := 0 otherwise, x ∈ B(0, 1), and the last step follows from the definition of Qh. Letting ε ↓ 0, we have shown (5.3). We can also use the notion of the quasiconvex envelope to introduce a class of nontrivial quasiconvex functions with p-growth, 1 < p < ∞ (the linear growth case is also possible using a refined technique). Lemma 5.2. Let F ∈ R2×2 with rank F = 2 and let 1 < p < ∞. Define h(A) := dist(A, {−F, F}) p , A ∈ R2×2 . Then, Qh(0) > 0 and the quasiconvex envelope Qh is not convex (at zero). Proof. We first show Qh(0) > 0. Assume to the contrary that Qh(0) = 0. Then, by (5.1) 2 there exists a sequence (ψ j ) ⊂ W1,∞ 0 (B(0, 1); R ) with ∫ − B(0,1) h(∇ψ j ) dz → 0. (5.4) Let P : R2×2 → R2×2 be the projection onto the orthogonal complement of span{F}. Some linear algebra shows |P(A)| p ≤ h(A) for all A ∈ R2×2 . Therefore, P(∇ψ j ) → 0 in L p (B(0, 1); R2×2 ). (5.5) With the definition ĝ(ξ ) := ∫ g(x)e−2π iξ ·x dx, B(0,1) ξ ∈ Rd , d for the Fourier transform ĝ of a function g ∈ L1 (B(0, 1); RN ), we get the rule ∇ ψ j (ξ ) = 2 / span{F} for all a, ξ ∈ R \ {0} implies P(a ⊗ ξ ) ̸= 0. (2π i) ψ̂ j (ξ ) ⊗ ξ . The fact that a ⊗ ξ ∈ Some algebra implies that we may write d ψ j (ξ ) = M(ξ )P(ψ̂ j (ξ ) ⊗ ξ ), ∇ ξ ∈ Rd , where M(ξ ) : R2×2 → R2×2 is linear and furthermore depends smoothly and positively 0-homogeneously on ξ . We omit the details of this construction. For p = 2, Parseval’s formula ∥g∥L2 = ∥ĝ∥L2 together with (5.5) then implies d ∥∇ψ j ∥L2 = ∥∇ ψ j ∥L2 = ∥M(ξ )P(ψ̂ j (ξ ) ⊗ ξ )∥L2 ≤ ∥M∥∞ ∥P(ψ̂ j (ξ ) ⊗ ξ )∥L2 = ∥M∥∞ ∥P(∇ψ j )∥L2 → 0. Comment... But then h(∇ψ j ) → |F| in L1 , contradicting (5.4). 5.2. RELAXATION OF INTEGRAL FUNCTIONALS 91 For 1 < p < ∞, we may apply the Mihlin Multiplier Theorem (see for example Theorem 5.2.7 in [45] and Theorem 6.1.6 in [15]) to get analogously to the case p = 2 that ∥∇ψ j ∥L p ≤ ∥M∥Cd ∥P(∇ψ j )∥L p → 0, which is again at odds with (5.4). Finally, if Qh was convex at zero, we would have ) 1( ) 1( Qh(F) + Qh(−F) ≤ h(F) + h(−F) = 0, 2 2 Qh(0) ≤ a contradiction. 5.2 Relaxation of integral functionals We first consider the abstract principles of relaxation before moving on to more concrete integral functionals. Suppose we have a functional F : X → R ∪ {+∞}, where X is a reflexive Banach space. Assume furthermore that our F is not weakly lower semicontinuous1 , i.e. there exists at least one sequence x j ⇀ x in X with F [x] > lim inf j→∞ F [x j ]. Then, the relaxation F∗ : X → R ∪ {−∞, +∞} of F is defined as { } F∗ [x] := inf lim inf F [x j ] : x j ⇀ x in X . j→∞ Theorem 5.3 (Abstract relaxation theorem). Let X be a separable, reflexive Banach space and let F : X → R ∪ {+∞} be a functional. Assume furthermore: (WH1) Weak Coercivity: For all Λ > 0, the sublevel set { x ∈ X : F [x] ≤ Λ } is sequentially weakly relatively compact. Then, the relaxation F∗ of F is weakly lower semicontinuous and minX F∗ = infX F . Proof. Since X ∗ is also separable, we can find a countable set { fl }l∈N ⊂ X ∗ such that xj ⇀ x in X ⇐⇒ sup j ∥x j ∥ < ∞ and ⟨x j , fl ⟩ → ⟨x, fl ⟩ for all l ∈ N. Step 1. 
We first show weak lower semicontinuity of F∗ : Let x j ⇀ x in X such that lim inf j→∞ F∗ [x j ] < ∞ (if this is not satisfied, there is nothing to show). Selecting a (not renumbered) subsequence if necessary, we may further assume that α := sup j F∗ [x j ] < ∞. For every j ∈ N, there exists a sequence (x j,k )k ⊂ X such that x j,k ⇀ x j in X as k → ∞ and 1 lim inf F [x j,k ] ≤ F∗ [x j ] + . k→∞ j Comment... 1 In principle, if F nevertheless is weakly lower semicontinuous for a minimizing sequence, then no relaxation is necessary. It is, however, very difficult to establish such a property without establishing general weak lower semicontinuity. 92 CHAPTER 5. RELAXATION Now select a y j := x j,k( j) such that ⟨y j , fl ⟩ − ⟨x j , fl ⟩ ≤ 1 j for all l = 1, . . . , j, and F [y j ] ≤ F∗ [x j ] + 2 2 ≤α+ . j j Then, from the weak coercivity (WH1) we get sup j ∥y j ∥ < ∞. Thus, it follows that the so-selected diagonal sequence (y j ) satisfies y j ⇀ x and F∗ [x] ≤ lim inf F [y j ] ≤ lim inf F∗ [x j ] j→∞ j→∞ and F∗ is shown to be weakly lower semicontinuous. Step 2. Next take a minimizing sequence (x j ) ⊂ X such that lim F∗ [x j ] = inf F∗ . j→∞ X Now, for every x j select a y j ∈ X such that 1 F [y j ] ≤ F∗ [x j ] + . j From the coercivity assumption (WH1) we get that we may select a weakly converging subsequence (not renumbered) y j ⇀ y, where y ∈ X. Then, since F∗ is weakly lower semicontinuous and F∗ ≤ F , F∗ [y] ≤ lim inf F∗ [y j ] ≤ lim inf F [y j ] ≤ lim F∗ [x j ] = inf F∗ , j→∞ j→∞ j→∞ X so F∗ [y] = minX F∗ . Furthermore, inf F ≤ lim inf F [y j ] ≤ min F∗ ≤ inf F . X j→∞ X X Thus, minX F∗ = infX F , completing the proof. As usual, we here are most interested in the concrete case of integral functionals F [u] := ∫ Ω f (x, ∇u(x)) dx, u ∈ W1,p (Ω; Rm ), where, as usual, f : Ω × Rm×d → R is a Carathéodory integrand satisfying the p-growth and coercivity assumption µ |A| p − µ −1 ≤ f (x, A) ≤ M(1 + |A| p ), (x, A) ∈ Ω × Rm×d , (5.6) Comment... for some µ , M > 0, 1 < p < ∞, and Ω ⊂ Rd a bounded Lipschitz domain. The main result of this section is: 5.2. RELAXATION OF INTEGRAL FUNCTIONALS 93 Theorem 5.4 (Relaxation). Let F be as above and assume furthermore that there exists a modulus of continuity ω , that is, ω : [0, ∞) → [0, ∞) is continuous, increasing, and ω (0) = 0, such that | f (x, A) − f (y, A)| ≤ ω (|x − y|)(1 + |A| p ), x, y ∈ Ω, A ∈ Rm×d . (5.7) Then, the relaxation F∗ of F is F∗ [u] = ∫ Ω Q f (x, ∇u(x)) dx, u ∈ W1,p (Ω; Rm ), where Q f (x, q) is the quasiconvex envelope of f (x, q) for all x ∈ Ω. Proof. We define G [u] := ∫ Ω Q f (x, ∇u(x)) dx, u ∈ W1,p (Ω; Rm ). As Q f (x, q) is quasiconvex for all x ∈ Ω by Lemma 5.1, it is continuous by Proposition 3.6. We will see below that Q f is continuous in its first argument, so Q f is Carathéodory and G is thus well-defined. We will show in the following that (I) G ≤ F∗ and (II) F∗ ≤ G . To see (I), it suffices to observe that G is weakly lower semicontinuous by Theorem 3.25 and that G ≤ F . Thus, from the definition of F∗ we immediately get G ≤ F∗ . We will show (II) in several steps. m Step 1. Fix A ∈ Rm×d . For x ∈ Ω let ψx ∈ W1,p 0 (B(0, 1); R ) be such that ∫ − B(0,1) f (x, A + ∇ψx (z)) dz ≤ Q f (x, A) + 1. Then we use (5.6) to observe ∫ µ− B(0,1) |A + ∇ψx (z)| p dz − µ −1 ≤ Q f (x, A) + 1 ≤ M(1 + |A| p ) + 1. 
Let now x, y ∈ Ω and estimate using (5.7) and the definition of Q f (x, q), ∫ Q f (y, A) − Q f (x, A) ≤ − B(0,1) f (y, A + ∇ψx (z)) − f (x, A + ∇ψx (z)) dz + 1 ∫ ≤ ω (|x − y|) − B(0,1) 1 + |A + ∇ψx (z)| p dz + 1 ≤ Cω (|x − y|)(1 + |A| p ), where C = C(µ , M) is a constant. Exchanging the roles of x and y, we see Q f (x, A) − Q f (y, A) ≤ Cω (|x − y|)(1 + |A| p ) (5.8) Comment... for all x, y ∈ Ω and A ∈ Rm×d . In particular, Q f is continuous in x. 94 CHAPTER 5. RELAXATION Step 2. Next we show that it suffices to consider piecewise affine u. If u ∈ W1,p (Ω; Rm ), then there exists a sequence (v j ) ⊂ W1,p (Ω; Rm ) of piecewise affine functions such that v j → u in W1,p . Since Q f is Carathéodory and has p-growth (as Q f ≤ f ), Theorem 2.30 shows that G [v j ] → G [u] and, by lower semicontinuity, F∗ [u] ≤ lim inf j→∞ F∗ [v j ]. Thus, if (II) holds for all piecewise affine u, it also holds for all u ∈ W1,p (Ω; Rm ). Step 3. Fix ε > 0 and let u ∈ W1,p (Ω; Rm ) be piecewise affine, say u(x) = vk + Ak x (where vk ∈ Rm , Ak ∈ Rm×d ) on the set Gk from a disjoint collection of open sets Gk ⊂ Ω ∪ (k = 1, . . . , N) such that |Ω \ Nk=1 Gk | = 0. Fix such k and cover Gk up to a nullset with countably many balls Bk,l := B(xk,l , rk,l ) ⊂ Gk , where xk,l ∈ Gk , 0 < rk,l < ε (l ∈ N) such that ∫ − Q f (x, Ak ) dx − Q f (xk,l , Ak ) ≤ Cω (ε )(1 + |Ak | p ). Bk,l This is possible by the Vitali Covering Theorem A.8 in conjunction with (5.8). Now, from the definition of Q f and the independence of this definition from the domain, m ∞ we can find a function ψk,l ∈ W1,∞ 0 (Bk,l ; R ) with ∥ψk,l ∥L ≤ ε and ∫ Q f (xk,l , Ak ) − − ) f xk,l , Ak + ∇ψk,l (z) dz ≤ ε . ( Bk,l Set { ψk,l (x) if x ∈ Bk,l , vε (x) := u(x) + 0 otherwise, x ∈ Ω. In a similar way to Step 1 we can show that ∫ µ − |Ak + ∇ψk,l (z)| p dz − µ −1 ≤ M(1 + |Ak | p ) + ε . Bk,l Thus, ∫ |∇vε | dx = ∑ ∫ p Ω k,l M ≤ µ Bk,l ∫ Ω |Ak + ∇ψk,l (x)| p dx 1 + |∇u(x)| p dx + |Ω| ε |Ω| + < ∞, µ2 µ so, by the Poincaré–Friedrichs inequality, (vε )ε >0 ∈ W1,p (Ω; Rm ) is uniformly bounded in W1,p . By the continuity assumption (5.7), ∫ ∫ ∫ − f (xk,l , ∇vε (x)) dx − − f (x, ∇vε (x)) dx ≤ ω (ε ) − 1 + |∇vε (x)| p dx. B B B Comment... k,l k,l k,l 5.3. YOUNG MEASURE RELAXATION 95 Then, we can combine all the above arguments to get ∫ ∫ − Q f (x, ∇u(x)) dx − − f (x, ∇vε (x)) dx Bk,l ∫ Bk,l ≤ Cω (ε ) − 2 + |Ak | p + |∇vε (x)| p dx + ε . Bk,l Multiplying this by |Bk,l | and summing over all k, l, we get ∫ G [u] − F [vε ] ≤ Cω (ε ) 2 + |∇u(x)| p + |∇vε (x)| p dx + ε |Ω| Ω and this vanishes as ε ↓ 0. For u j := v1/ j we can convince ourselves that u j ⇀ u since (u j ) is weakly relatively compact in W1,p and ∥u j − u∥L∞ ≤ 1/ j → 0. Hence, G [u] = lim F [u j ] ≥ F∗ [u], j→∞ which is (II) for piecewise affine u. 5.3 Young measure relaxation As discussed at the beginning of this chapter, the relaxation strategy as outlined in the preceding section has one serious drawback: While it allows to find the infimal value, the relaxed functional does not tell us anything about the structure (e.g. oscillations) of minimizing sequences. Often, however, in applications this structure of minimizing sequences is decisive as it represents the development of microstructure. For example, in material science, this microstructure can often be observed and greatly influences material properties. Therefore, in this section we implement the second strategy mentioned at the beginning of the chapter, that is, we extend the minimization problem to a larger space and look for solutions there. 
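Before setting up the abstract framework, it may help to preview, in a hedged sketch, what it delivers for the double-well example from the beginning of this chapter (the u-dependence of that integrand is handled as in Section 3.7 and is not an obstacle here). The sawtooth minimizing sequence (u_j) constructed there has gradients u_j′ oscillating between −1 and +1 in equal proportion, and one can check, as in Example 5.6 below (via Lemma 3.28), that these gradients generate the homogeneous Young measure

ν_x = ½ δ_{−1} + ½ δ_{+1},   x ∈ (0, 1),

whose barycenter [ν_x] = 0 coincides with the weak limit of the u_j′. Evaluating the integrand on this measure, with the limit deformation u ≡ 0, gives

∫_0^1 ∫ (a^2 − 1)^2 dν_x(a) dx = ½·0 + ½·0 = 0 = inf F,

so the generalized problem attains the value that the original problem only approaches, and the measure ν records exactly how the approximating gradients oscillate. The following abstract framework makes this extension procedure precise.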
Let us first, in an abstract fashion, collect a few properties that our extension should satisfy. Assume we are given a space X together with a notion of convergence "→" and a functional F : X → R ∪ {+∞} which is not lower semicontinuous with respect to the convergence "→" in X. Then, we need to extend X to a larger space X̄ with a convergence "⇝" and F to a functional F̄ : X̄ → R ∪ {+∞}, called the extension–relaxation, such that the following conditions are satisfied:

(i) Extension property: X ⊂ X̄ (in the sense of an embedding) and, using this embedding, F̄|_X = F.

(ii) Lower bound: If x_j → x in X, then, up to a subsequence, there exists x̄ ∈ X̄ such that x_j ⇝ x̄ in X̄ and

lim inf_{j→∞} F[x_j] ≥ F̄[x̄].

(iii) Recovery sequence: For all x̄ ∈ X̄ there exists a recovery sequence (x_j) ⊂ X with x_j ⇝ x̄ in X̄ and such that

lim_{j→∞} F[x_j] = F̄[x̄].

Intuitively, (ii) entails that we can solve our minimization problem for F by passing to the extended space; in particular, if (x_j) ⊂ X is minimizing and relatively compact (which, up to a subsequence, implies x_j → x in X), then (ii) tells us that the corresponding x̄ should be considered the "minimizer". On the other hand, (iii) ensures that the minimization problems for F and for F̄ are sufficiently related. In particular, it is easy to see that the infima agree and that F̄ indeed attains its minimum (for this, we need to assume some coercivity of F):

min_{X̄} F̄ = inf_X F.

If we consider integral functionals defined on X = W^{1,p}(Ω; R^m), we have already seen a very convenient extended space X̄, namely the space of gradient p-Young measures. In fact, these generalized variational problems are the original raison d'être for Young measures. In the following we will show how this fits into the above abstract framework. Thus, we return to our prototypical integral functional

F[u] := ∫_Ω f(x, ∇u(x)) dx,   u ∈ W^{1,p}(Ω; R^m),

with f : Ω × R^{m×d} → R Carathéodory and

µ|A|^p − µ^{−1} ≤ f(x, A) ≤ M(1 + |A|^p),   (x, A) ∈ Ω × R^{m×d},   (5.9)

for some µ, M > 0, 1 < p < ∞, and Ω ⊂ R^d a bounded Lipschitz domain. Motivated by the considerations in Sections 3.3 and 3.4, for a gradient Young measure ν = (ν_x) ∈ GY^p(Ω; R^{m×d}) we define the extension–relaxation F̄ : GY^p(Ω; R^{m×d}) → R as

F̄[ν] := ⟨⟨f, ν⟩⟩ = ∫_Ω ∫ f(x, A) dν_x(A) dx.

Then, our relaxation result takes the following form:

Theorem 5.5 (Relaxation with Young measures). Let F, F̄ be as above.

(I) Extension property: For every u ∈ W^{1,p}(Ω; R^m) define the elementary Young measure (δ_{∇u}) ∈ GY^p(Ω; R^{m×d}) as

(δ_{∇u})_x := δ_{∇u(x)}   for a.e. x ∈ Ω.

Then, F[u] = F̄[δ_{∇u}].

(II) Lower bound: If u_j ⇀ u in W^{1,p}(Ω; R^m), then, up to a subsequence, there exists a Young measure ν ∈ GY^p(Ω; R^{m×d}) such that ∇u_j →^Y ν (i.e. (∇u_j) generates ν) with [ν] = ∇u a.e., and

lim inf_{j→∞} F[u_j] ≥ F̄[ν].   (5.10)

(III) Recovery sequence: For all ν ∈ GY^p(Ω; R^{m×d}) there exists a recovery (generating) sequence (u_j) ⊂ W^{1,p}(Ω; R^m) with ∇u_j →^Y ν and such that

lim_{j→∞} F[u_j] = F̄[ν].   (5.11)

(IV) Equality of minima: F̄ attains its minimum and

min_{GY^p(Ω;R^{m×d})} F̄ = inf_{W^{1,p}(Ω;R^m)} F.

Furthermore, all these statements remain true if we prescribe boundary values. For a Young measure we here prescribe the boundary values of an underlying deformation (which is only determined up to a translation, of course).

Proof. Ad (I). This follows directly from the definition of F̄.

Ad (II). The existence of ν follows from the Fundamental Theorem 3.11.
The lower bound (5.10) is a consequence of Proposition 3.19. Ad (III). By virtue of Lemma 3.21 we may construct an (improved) sequence (u j ) ⊂ W1,p (Ω; Rm ) such that (i) u j |∂ Ω = u, where u ∈ W1,p (Ω; Rm ) is an underlying deformation of ν , that is, ∇u = [ν ], (ii) the family {∇u j } j is p-equiintegrable, and Y (iii) ∇u j → ν . For this improved generating sequence we can now use the statement about representation of limits of integral functionals from the Fundamental Theorem 3.11 (here we need the p-equiintegrability) to get F [u j ] → ⟨⟨ f,ν ⟩⟩ = F [ν ], Comment... which is nothing else than (5.11). Ad (IV). This is not hard to see using (II), (III) and the coercivity assumption, that is, the lower bound in (5.9). 98 CHAPTER 5. RELAXATION Figure 5.2: The double-well potential in the sailing example. Example 5.6. In our sailing example from Section 1.6, we were tasked with solving ( ) ∫ T r2 ′ F [r] := cos(4 arctan r (t)) − vmax 1 − 2 dt → min, R 0 r(0) = r(T ) = 0, |r(t)| ≤ R. Here, because of the additional constraints we can work in any L p -space that we want, even L∞ . Clearly, the integrand ( ) r2 W (r, a) := cos(4 arctan a) − vmax 1 − 2 , |r(t)| ≤ R, R is not convex in a, see Figure 5.2. Technically, the preceding theorem is not applicable, since W depends on r and a and p = ∞, but it is obvious that we can simply extend it to consider Young measures (δr(x) ⊗ νx )x ∈ Y∞ (Ω; R × R) with ν ∈ GY∞ (Ω; R) and [ν ] = r′ as in Section 3.7 (we omit the details for p = ∞). We collect all such product Young measures in the set GY∞ (Ω; R × R). Then, we consider the extended–relaxed variational problem ⟨⟨ ⟩⟩ ∫ T F [ δ ν ] := W, δ ν = W (r(t), a) dνx (a) dx → min, ⊗ ⊗ r r ∞ 0 δr ⊗ ν = (δr(x) ⊗ νx )x ∈ GY (Ω; R × R), r(0) = r(T ) = 0, |r(t)| ≤ R. Comment... Let us also construct a sequence of solutions that generate the optimal Young measure solution. Some elementary analysis yields that g(a) := cos(4 arctan a) has two minima in a = ±1, where g(a) = −1 (see Figure 5.2). Moreover, h(r) := −vmax (1 − r2 /R2 ) attains its minimum −vmax for r = 0. Thus, ( ) W ≥ − 1 + vmax =: Wmin . 5.3. YOUNG MEASURE RELAXATION 99 We let { s − 1 if s ∈ [0, 2], h(s) := 3 − s if s ∈ (2, 4], s ∈ [0, 4], and consider h to be extended to all s ∈ R by periodicity. Then set ( ) T 4j r j (t) := h t +1 . 4j T It is easy to see that r′j ∈ {−1, +1} and r j → 0 uniformly. Thus, F [r j ] → T ·Wmin = inf F . By the Fundamental Theorem 3.11 on Young measures, we know that we may select a subsequence of j’s (not renumbered) such that (see Lemma 3.28) (r j , r′j ) → δr ⊗ ν = (δr(x) ⊗ νx )x ∈ GY∞ (Ω; R × R). Y From the construction of r j we see ( ) 1 1 δr ⊗ ν = δ0 ⊗ δ−1 + δ+1 . 2 2 Clearly, this δr ⊗ ν is minimizing for F . The following lemma creates a connection to the other relaxation approach from the last section: Lemma 5.7. Let h : Rm×d → R be continuous and satisfy µ |A| p − µ −1 ≤ h(A) ≤ M(1 + |A| p ) A ∈ Rm×d , for some µ , M > 0. Then, for all A ∈ Rm×d there exists a homogeneous gradient Young measure νA ∈ GY p (B(0, 1); Rm×d ) such that ∫ h dνA . Qh(A) = Proof. According to (5.1) and the remarks following it, {∫ } 1,p m Qh(A) := inf − h(A + ∇ψ (z)) dz : ψ ∈ W0 (B(0, 1); R ) . B(0,1) Let now (ψ j ) ⊂ W1,p (B(0, 1); Rm ) be a minimizing sequence in the above formula. This minimization problem is admissible in Theorem 5.5, which yields a Young measure minimizer ν ∈ GY p (B(0, 1); Rm ) such that ∫ Qh(A) = − B(0,1) ∫ h dνx dx. Comment... 
It only remains to show that we can replace (νx ) by a homogeneous Young measure νA . This follows from the following lemma. 100 CHAPTER 5. RELAXATION Lemma 5.8 (Homogenization). Let ν = (νx ) ∈ GY p (Ω; Rm ) such that [ν ] = ∇u for some u ∈ W1,p (Ω; Rm ) with linear boundary values. Then, for any Lipschitz domain D ⊂ Rd there exists a homogeneous ν = (ν y ) ∈ GY p (D; Rm ) such that ∫ h dν = ∫ ∫ 1 |D| Ω h dνx dx (5.12) for all continuous h : Rm×d → R with |h(A)| ≤ M(1 + |A| p ). Furthermore, this result remains valid if Ω = Qd = (−1/2, 1/2)d , the d-dimensional unit cube and u has periodic boundary values. Proof. Let (u j ) ⊂ W1,p (Ω; Rm ) with u j (x) = Mx on ∂ Ω (in the sense of trace) for a fixed Y matrix M ∈ Rm×d such that ∇u j → ν , in particular sup j ∥∇u j ∥L p < ∞. Now choose for every j ∈ N a Vitali cover of D, see Theorem A.8, N( j) D=Z ⊎ ⊎ ( j) ( j) Ω(ak , rk ), |Z| = 0, k=1 ( j) ( j) with ak ∈ D, rk ≤ 1/ j (k = 1, . . . , N( j)), and Ω(a, r) := a + rΩ. Then define ( ( j) ) ( j) r( j) u y − ak + Mak j k ( j) v j (y) := rk 0 ( j) ( j) if y ∈ Ω(ak , rk ), y ∈ D. otherwise, We have v j ∈ W1,p (D; Rm ) (it is easy to see that there are no jumps over the gluing boundaries) and ( ∇v j (y) = ∇u j ( j) ) y − ak ( j) ( j) if y ∈ Ω(ak , rk ). ( j) rk We can then use a change of variables to compute for all φ ∈ C0 (D) and h ∈ C(Rm×d ) with p-growth, ( ( ( j) )) y − ak φ (y)h(∇v j (y)) dy = ∑ φ (y) h ∇u j dy ( j) ( j) ( j) D rk k=1 Ω(ak ,rk ) ( ) ∫ N( j) 1 ( j) ( j) d = ∑ (rk ) φ (ak ) h(∇u j (x)) dx + O |D| j Ω k=1 ∫ N( j) ∫ where we also used that φ is uniformly continuous. Letting j → ∞ and using that (Riemann sums) N( j) lim Comment... j→∞ ∫ ∑ (rk )d φ (ak ) = − φ (x) dx, k=1 ( j) ( j) D 5.4. CHARACTERIZATION OF GRADIENT YOUNG MEASURES 101 we arrive at ∫ lim j→∞ D ∫ φ (y)h(∇v j (y)) dy = − φ (x) dx · D ∫ ∫ Ω h(A) dνx dx. (5.13) For φ = 1 and h(A) := |A| p , this gives sup j ∥∇v j ∥Lp p = sup j ∥∇u j ∥Lp p < ∞. Y Thus, there exists ν ∈ GY p (D; Rm ) such that ∇v j → ν . Using Lemma 3.21 we may moreover assume that (∇u j ), and hence also (∇v j ), is pequiintegrable. Then, (5.13) implies ∫ ∫ D ∫ φ (y)h(A) dν y dy = − φ (x) dx · D ∫ ∫ Ω h(A) dνx dx for all φ , h as above. This implies in particular that (ν y ) = ν is homogeneous. For φ = 1, we get (5.12). The additional claim about Ω = Qd and an underlying deformation u with periodic boundary values follows analogously, since also in this case we can “glue” generating functions on a (special) covering. 5.4 Characterization of gradient Young measures In the previous section we replaced a minimization problem over a Sobolev space by its extension–relaxation, defined on the space of gradient Young measures. Unfortunately, this subset of the space of Young measures is so far defined only “extrinsically”, i.e. through the existence of a generating sequence of gradients. The question arises whether there is also an “intrinsic” characterization of gradient Young measures. Intuitively, trying to understand gradient Young measures amounts to understanding the “structure” of sequences of gradients, which is a useful endeavor. Recall that in Lemma 3.22 we showed that a homogeneous gradient Young measure ν ∈ GY p (B(0, 1); Rm×d ) with barycenter B := [ν ] ∈ Rm×d satisfies the Jensen-type inequality h(B) ≤ ∫ h dν Comment... for all quasiconvex functions h : Rm×d → R with |h(A)| ≤ M(1 + |A| p ) (M > 0). There, we interpreted this as an expression of the (generalized) convexity of h, in analogy with the classical Jensen-inequality. 
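As a simple illustration of this Jensen-type inequality (in the scalar case m = d = 1, where quasiconvexity reduces to ordinary convexity), consider the homogeneous gradient Young measure ν = ½ δ_{−1} + ½ δ_{+1} with barycenter [ν] = 0, generated for instance by the sawtooth gradients from the beginning of this chapter. For convex h the inequality is just midpoint convexity,

h(0) ≤ ½ h(−1) + ½ h(+1),

e.g. h(a) = |a|^p gives 0 ≤ 1. For the double-well h(a) = (a^2 − 1)^2, however, the inequality fails, since h(0) = 1 > 0 = ½ h(−1) + ½ h(+1); this is no contradiction, as this h is not convex (hence not quasiconvex), and the inequality is only asserted for quasiconvex integrands.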
However, we can also dualize the situation and see the validity of the above Jensentype inequality for quasiconvex functions as a property of gradient Young measures. The following (perhaps surprising) result shows that this dual point of view is indeed valid and the Jensen-type inequality (essentially) characterizes gradient Young measures, a result due to David Kinderlehrer and Pablo Pedregal from the early 1990s: 102 CHAPTER 5. RELAXATION Theorem 5.9 (Kinderlehrer–Pedregal). Let ν ∈ Y p (Ω; Rm×d ) be a Young measure such that [ν ] = ∇u for some u ∈ W1,p (Ω; Rm ) and such that for almost every x ∈ Ω the Jensentype inequality h(∇u(x)) ≤ ∫ h dνx holds for all quasiconvex h : Rm×d → R with |h(A)| ≤ M(1 + |A| p ) (M > 0). Then ν ∈ GY p (Ω; Rm×d ), i.e. ν is a gradient p-Young measure. The idea of the proof is to show that the set of gradient Young measures is convex and (weakly*) closed in the set of Young measures and the Jensen-type inequalities entail that ν cannot be separated from this set by a hyperplane (which turn out to be in oneto-one correspondence with integrands). Then, the Hahn–Banach Theorem implies that ν actually lies in this set and is hence a gradient Young measure. One first shows this for homogeneous Young measures and then uses a gluing procedure to extend the result to x-dependent ν = (νx ). The details, however, are quite technical, see [81] or the original works [49, 51] for a full proof. This theorem is conceptually very important. It entails that if we could understand the class of quasiconvex functions, then we also could understand gradient Young measures and thus sequences of gradients. Unfortunately, our knowledge of quasiconvex functions (and hence of gradient Young measures) is very limited at present. Most results are for specific situations only and much of it has an “ad-hoc” flavor. We close this chapter with a result showing that quasiconvex functions are not easy to understand. In Proposition 3.3 we proved that quasiconvexity implies rank-one convexity and Proposition 4.1 showed that polyconvexity implies quasiconvexity. Hence, in a sense, rank-one convexity is a lower bound and polyconvexity an upper bound on quasiconvexity. Moreover, the Alibert–Dacorogna–Marcellini function (see Examples 3.5, 4.2) ( ) hγ (A) := |A|2 |A|2 − 2γ det A , A ∈ R2×2 , γ ∈ R. has the following convexity properties: √ 2 2 • hγ is convex if and only if |γ | ≤ ≈ 0.94, 3 2 • hγ is rank-one convex if and only if |γ | ≤ √ ≈ 1.15, 3 ] ( 2 • hγ is quasiconvex if and only if |γ | ≤ γQC for some γQC ∈ 1, √ , 3 • hγ is polyconvex if and only if |γ | ≤ 1. However, it is open whether γQC = √2 3 and so it could be that quasiconvexity and rank-one Comment... convexity are actually equivalent. This was open for a long time until Vladimir Švérak’s 1992 counterexample [89], which settled the question in the negative, at least for d ≥ 2, 5.4. CHARACTERIZATION OF GRADIENT YOUNG MEASURES 103 m ≥ 3. The case d = m = 2 is still open and considered to be very important, also because of its connection to other branches of mathematics. Example 5.10 (Švérak). We will show the non-equivalence of quasiconvexity and rankone convexity for m = 3, d = 2 only. Higher dimensions can be treated using an embedding of R3×2 into Rm×d . To accomplish this, we construct h : R3×2 → R that is quasiconvex, but not rank-one convex. Define a linear subspace L of R3×2 as follows: x 0 L := 0 y : x, y, z ∈ R . z z We denote by P : R3×2 → L a linear projection onto L given as a 0 a b 0 d P(A) := for A = c d ∈ R3×2 . 
(e + f )/2 (e + f )/2 e f Also define g : L → R to be x 0 g 0 y := −xyz. z z For α , β > 0, we let hα ,β : R3×2 → R be given as ( ) hα ,β (A) := g(P(A)) + α |A|2 + |A|4 + β |A − P(A)|2 . We show the following two properties of hα ,β : (I) For every α > 0 sufficiently small and all β > 0, the function hα ,β is not quasiconvex. (II) For every α > 0, there exists β = β (α ) > 0 such that hα ,β is rank-one convex. Ad (I): For A = 0, D = (0, 1)2 , and a periodic ψ ∈ W1,∞ (D; R3 ) given as sin 2π x1 1 sin 2π x2 , ψ (x1 , x2 ) := 2π sin 2π (x1 + x2 ) we have ∇ψ ∈ L and so P(∇ψ ) = ∇ψ . Thus, we may compute (using the identity for the cosine of a sum and the fact that the sine is an odd function) ∫ D g(∇ψ ) dx = − ∫ 1∫ 1 0 0 1 (cos 2π x1 )2 (cos 2π x2 )2 dx1 dx2 = − < 0. 4 Thus, for α > 0 sufficiently small, D hα ,β (∇ψ ) dx < 0 = hα ,β (0). Comment... ∫ 104 CHAPTER 5. RELAXATION It turns out that in the definition of quasiconvexity we may alternatively test with functions that have merely periodic boundary values on D (see Proposition 5.13 in [25]). Hence, (I) follows. Ad (II): Rank-one convexity is equivalent to the Legendre–Hadamard condition d2 2 D hα ,β (A)[B, B] := 2 hα ,β (A + tB) ≥ 0 for all A, B ∈ R3×2 , rank B ≤ 1. dt t=0 D2 hα ,β (A)[B, B] Here, is the second (Gâteaux-)derivative of hα ,β at A in direction B. The function g is a homogeneous polynomial of degree three, whereby we can find c > 0 such that for all A, B ∈ R3×2 with rank B ≤ 1, d2 2 G(A, B) := D (g ◦ P)(A)[B, B] := 2 g(P(A + tB)) ≥ −c|A||B|2 . dt t=0 A computation then shows that D2 hα ,β (A)[B, B] = G(A, B) + 2α |B|2 + 4α |A|2 |B|2 + 8α (A : B)2 + 2β |B − P(B)|2 ≥ (−c + 4α |A|)|A||B|2 . where we used the Frobenius product A : B := ∑i, j Aij Bij . Thus, for |A| ≥ c/(4α ), the Legendre–Hadamard condition (and hence rank-one convexity) holds. We still need to prove the Legendre–Hadamard condition for |A| < c/(4α ) and any fixed α > 0. Since D2 hα ,β (A)[B, B] is homogeneous of degree two in B, we only need to consider A, B from the compact set } { c 3×2 3×2 , |B| = 1, rank B = 1 . K := (A, B) ∈ R × R : |A| ≤ 4α From the estimate D2 hα ,β (A)[B, B] ≥ G(A, B) + 2α |B|2 + 2β |B − P(B)|2 =: κ (A, B, β ) it suffices to show that there exists β = β (α ) > 0 such that κ (A, B, β ) ≥ 0 for all (A, B) ∈ K. Assume that this is not the case. Then there exists β j → ∞ and (A j , B j ) ∈ K with 0 > κ (A j , B j , β j ) = G(A j , B j ) + 2α + 2β |B j − P(B j )|2 . As K is compact, we may assume (A j , B j ) → (A, B) ∈ K, for which it holds that G(A, B) + 2α ≤ 0, P(B) = B, and rank B = 1. However, since rank B = 1, we have g(P(A+tB)) = 0 for all t ∈ R. This implies in particular G(A, B) = D2 (g◦P)(A)[B, B] = 0, yielding 2α ≤ 0, which contradicts our assumption α > 0. Thus, we have shown that for a suitable choice of α , β > 0 it indeed holds that hα ,β is rankone convex. Comment... So, quasiconvexity and, dually, gradient Young measures, must remain somewhat mysterious concepts. This also shows that the theory of elliptic systems of PDEs, for which traditionally the Legendre–Hadamard condition is the expression of “ellipticity”, is incomplete. In particular, non-variational systems (i.e. systems of PDEs which do not arise as the Euler–Lagrange equations to a variational problem) are only poorly understood. 5.5. NOTES AND BIBLIOGRAPHICAL REMARKS 105 5.5 Notes and bibliographical remarks Comment... Classically, one often defines the quasiconvex envelope through (5.2) and not as we have done through (5.1). 
In this case, (5.2) is often called the Dacorogna formula, see Section 6.3 in [25] for further references. The construction of Lemma 5.2 is based on a construction of Šverák [88] and some ellipticity arguments similar to those in Lemma 2.7 of [73]. Laurence Chisholm Young originally introduced the objects that are now called Young measures as “generalized curves/surfaces” in the late 1930s and early 1940s [97–99] to treat problems in the Calculus of Variations and Optimal Control Theory that could not be solved using classical methods. At the end of the 1960s he also wrote a book [100] (2nd edition [101]) on the topic, which explained these objects and their applications in great detail (including philosophical discussions on the semantics of the terminology “generalized curve/surface”). More recently, starting from the seminal work of Ball & James in 1987 [10], Young measures have provided a convenient framework to describe fine phase mixtures in the theory of microstructure, also see [33, 73] and the references cited therein. Further, and more related to the present work, relaxation formulas involving Young measures for nonconvex integral functionals have been developed and can for example be found in Pedregal’s book [81] and also in Chapter 11 of the recent text [3]; historically, Dacorogna’s lecture notes [24] were also influential. A second avenue of development – somewhat different from Young’s original intention – is to use Young measures only as a technical tool (that is without them appearing in the final statement of a result). This approach is in fact quite old and was probably first adopted in a series of articles by McShane from the 1940s [62–64]. There, the author first finds a generalized Young measure solution to a variational problem, then proves additional properties of the obtained Young measures, and finally concludes that these properties entail that the generalized solution is in fact classical. This idea exhibits some parallels to the hugely successful approach of first proving existence of a weak solution to a PDE and then showing that this weak solution has (in some cases) additional regularity. The Kinderlehrer–Pedregal Theorem 5.9 also holds for p = ∞, which was in fact the first case to be established. See [81] for an overview and proof. A different proof of Proposition 5.2 can be found in Section 5.3.9 of [25]. The conditions (i)–(iii) at the beginning of Section 5.3 are modeled on the concept of Γ-convergence (introduced by De Giorgi) and are also the basis of the usual treatment of parameter-dependent integral functionals. See [18] for an introduction and [27] for a thorough treatise. Appendix A Prerequisites This appendix recalls some facts that are needed throughout the course. A.1 General facts and notation First, recall the following version of the fundamental Young inequality ab ≤ δ 2 1 2 a + b , 2 2δ which holds for all a, b ≥ 0 and δ > 0. In this text (as in much of the literature), the matrix space Rm×d of (m × d)-matrices always comes equipped with the the Frobenius matrix (inner) product A : B := ∑ Akj Bkj , A, B ∈ Rm×d , j,k which is just the Euclidean product if we identify such matrices with vectors in Rmd . This inner product induces the Frobenius matrix norm √ |A| := ∑(Akj )2 , A ∈ Rm×d . j,k Very often we will use the tensor product of vectors a ∈ Rm , b ∈ Rd , a ⊗ b := abT ∈ Rm×d . The tensor product interacts well with the Frobenius norm, |a ⊗ b| = |a| · |b|, a ∈ Rm , b ∈ Rd . Comment... 
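For readers who like to check such identities on a concrete example, the following small script (an illustrative sketch, assuming the numpy library; any linear-algebra package would do) verifies the Frobenius product, the Frobenius norm, and the norm identity for the tensor product.

import numpy as np

a = np.array([1.0, 2.0, 3.0])            # a in R^3
b = np.array([4.0, -1.0])                # b in R^2

T = np.outer(a, b)                       # tensor product a (x) b = a b^T, a 3x2 matrix

# |a (x) b| = |a| * |b| (Frobenius norm on the left, Euclidean norms on the right)
print(np.linalg.norm(T))                          # ~ 15.4272
print(np.linalg.norm(a) * np.linalg.norm(b))      # ~ 15.4272

# Frobenius inner product A : B = sum_{j,k} A^j_k B^j_k and the induced norm
A = np.array([[1.0, 2.0], [0.0, -1.0], [3.0, 1.0]])
B = np.array([[2.0, 0.0], [1.0, 1.0], [-1.0, 4.0]])
print(np.sum(A * B))                     # A : B = 2.0
print(np.linalg.norm(A))                 # |A| = sqrt(A : A) = 4.0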
While all norms on the finite-dimensional space R^{m×d} are equivalent, some "finer" arguments require us to fix a specific matrix norm; if nothing else is stated, we always use the Frobenius norm. It is proved in texts on Matrix Analysis, see for instance [48], that the Frobenius norm can also be expressed as

|A| = ( ∑_i σ_i(A)^2 )^{1/2},   A ∈ R^{m×d},

where σ_i(A) ≥ 0 is the i'th singular value of A, i = 1, ..., min(d, m). Recall that every matrix A ∈ R^{m×d} has a (real) singular value decomposition A = PΣQ^T with orthogonal P ∈ R^{m×m}, Q ∈ R^{d×d}, and a diagonal Σ = diag(σ_1, ..., σ_{min(d,m)}) ∈ R^{m×d} with nonnegative entries

σ_1 ≥ σ_2 ≥ ... ≥ σ_r > 0 = σ_{r+1} = σ_{r+2} = ... = σ_{min(d,m)},

called the singular values of A, where r is the rank of A.

For A ∈ R^{d×d} the cofactor matrix cof A ∈ R^{d×d} is the matrix whose (j, k)'th entry is (−1)^{j+k} M^{¬j}_{¬k}(A), with M^{¬j}_{¬k}(A) the (j, k)-minor of A, i.e. the determinant of the matrix that originates from A by deleting the j'th row and k'th column. One important formula (and in fact another way to define the cofactor matrix) is

∂_A [det A] = cof A.

Furthermore, the Cramer rule entails that

A (cof A)^T = det A · I   for all A ∈ R^{d×d},

where I denotes the identity matrix as usual. In particular, if A is invertible,

A^{−1} = (cof A)^T / det A.

Sometimes, the matrix (cof A)^T is called the adjugate matrix of A in the literature (we will not use this terminology here, however).

A fundamental inequality involving the determinant is the Hadamard inequality: Let A ∈ R^{d×d} with columns A_j ∈ R^d (j = 1, ..., d). Then,

|det A| ≤ ∏_{j=1}^{d} |A_j| ≤ |A|^d.

Of course, an analogous formula holds with the rows of A.

A.2 Measure theory

We assume that the reader is familiar with the notion of Lebesgue- and Borel-measurability, nullsets, and L^p-spaces; a good introduction is [85]. For the d-dimensional Lebesgue measure we write L^d (or L^d_x if we want to stress the integration variable), but the Lebesgue measure of a Borel- or Lebesgue-measurable set A ⊂ R^d is simply written |A|. The characteristic function of such an A is

1_A(x) := 1 if x ∈ A, 0 otherwise,   x ∈ R^d.

In the following, we only recall some results that are needed elsewhere.

Lemma A.1 (Fatou). Let f_j : R^N → [0, +∞], j ∈ N, be a sequence of Lebesgue-measurable functions. Then,

lim inf_{j→∞} ∫ f_j(x) dx ≥ ∫ lim inf_{j→∞} f_j(x) dx.

Lemma A.2 (Monotone convergence). Let f_j : R^N → [0, +∞], j ∈ N, be a sequence of Lebesgue-measurable functions with f_j(x) ↑ f(x) for almost every x. Then, f : R^N → [0, +∞] is measurable and

lim_{j→∞} ∫ f_j(x) dx = ∫ f(x) dx.

Lemma A.3. If f_j → f strongly in L^p(R^d), 1 ≤ p ≤ ∞, that is, ∥f_j − f∥_{L^p} → 0 as j → ∞, then there exists a subsequence (not renumbered¹) such that f_j → f almost everywhere.

¹ We generally do not choose different indices for subsequences if the original sequence is discarded.

Theorem A.4 (Lebesgue dominated convergence theorem). Let f_j : R^N → R^n, j ∈ N, be a sequence of Lebesgue-measurable functions such that there exists an L^p-integrable majorant g ∈ L^p(R^N), 1 ≤ p < ∞, that is,

|f_j| ≤ g   for all j ∈ N.

If f_j → f pointwise almost everywhere, then also f_j → f strongly in L^p.

The following strengthening of Lebesgue's theorem is very convenient:

Theorem A.5 (Pratt). If f_j → f pointwise almost everywhere and there exists a sequence (g_j) ⊂ L^1(R^N) with g_j → g in L^1 such that |f_j| ≤ g_j, then f_j → f in L^1.

The following convergence theorem is of fundamental significance:

Theorem A.6 (Vitali convergence theorem). Let Ω ⊂ R^d be bounded and let (f_j) ⊂ L^p(Ω; R^m), 1 ≤ p < ∞. Assume furthermore that

(i) No oscillations: f_j → f in measure, that is, for all ε > 0,

|{ x ∈ Ω : |f_j − f| > ε }| → 0   as j → ∞.

(ii) No concentrations: {f_j}_j is p-equiintegrable, i.e.

lim_{K↑∞} sup_j ∫_{{|f_j| > K}} |f_j|^p dx = 0.

Then, f_j → f strongly in L^p.

Lebesgue-integrable functions also have a very useful "continuity" property:

Theorem A.7. Let f ∈ L^1(R^d). Then, almost every x_0 ∈ R^d is a Lebesgue point of f, i.e.

lim_{r↓0} −∫_{B(x_0,r)} |f(x) − f(x_0)| dx = 0,

where B(x_0, r) ⊂ R^d is the ball with center x_0 and radius r > 0 and −∫_{B(x_0,r)} := (1/|B(x_0,r)|) ∫_{B(x_0,r)}.

The following covering theorem is a handy tool for several constructions, see Theorems 2.19 and 5.51 in [2] for a proof:

Theorem A.8 (Vitali covering theorem). Let B ⊂ R^N be a bounded Borel set and let K ⊂ R^N be compact. Then, there exist a_k ∈ B, r_k > 0, where k ∈ N, such that

B = Z ⊎ ⊎_{k∈N} K(a_k, r_k),   K(a_k, r_k) = a_k + r_k K,

with Z ⊂ B a Lebesgue-nullset, |Z| = 0, and "⊎" denoting that the sets in the union are disjoint.

We also need measures other than the Lebesgue measure on subsets of R^N (this could be either R^d or a matrix space R^{m×d}, identified with R^{md}). All of these abstract measures will be Borel measures, that is, they are defined on the Borel σ-algebra B(R^N) of R^N. In this context, recall that the Borel σ-algebra is the smallest σ-algebra that contains the open sets. All Borel measures defined on R^N that do not take the value +∞ are collected in the set M(R^N) of (finite) Radon measures; its subclass of probability measures is M^1(R^N). We also use M(Ω), M^1(Ω) for the subset of measures that only charge Ω ⊂ R^N. We define the duality pairing

⟨h, µ⟩ := ∫ h(A) dµ(A)

for h : R^N → R^m and µ ∈ M(R^N), whenever this integral makes sense (we need at least that h is µ-measurable). The following notation is convenient for the barycenter of a finite Borel measure µ:

[µ] := ⟨id, µ⟩ = ∫ A dµ(A).

Probability measures and convex functions interact well:

Lemma A.9 (Jensen inequality). For all probability measures µ ∈ M^1(R^N) and all convex h : R^N → R it holds that

h([µ]) = h( ∫ A dµ(A) ) ≤ ∫ h(A) dµ(A) = ⟨h, µ⟩.

There is an alternative, "dual", view on finite measures, expressed in the following important theorem:

Theorem A.10 (Riesz representation theorem). The space of Radon measures M(R^N) is isometrically isomorphic to the dual space C_0(R^N)^* via the duality pairing

⟨h, µ⟩ = ∫ h dµ.

As an easy, often convenient, consequence, we can define measures through their action on C_0(R^N). Note, however, that we then need to check the boundedness |⟨h, µ⟩| ≤ C∥h∥_∞. If the element µ ∈ C_0(R^N)^* is additionally positive, that is, ⟨h, µ⟩ ≥ 0 for h ≥ 0, and normalized, that is, ⟨1, µ⟩ = 1 (here, 1 ≡ 1 on the whole space), then the µ from the Riesz representation theorem is in fact a probability measure, µ ∈ M^1(R^N).

A.3 Functional analysis

We assume that the reader has a solid foundation in the basic notions of Functional Analysis, such as Banach spaces and their duals, weak/weak* convergence (and topology²), reflexivity, and weak/weak* compactness. The application of x^* ∈ X^* to x ∈ X is often written via the duality pairing ⟨x, x^*⟩ = ⟨x^*, x⟩ := x^*(x). We write x_j ⇀ x for weak convergence and x_j ⇀* x for weak* convergence.
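As a small numerical aside (a sketch, assuming the numpy library; the analytic facts used are the Riemann–Lebesgue lemma and ∫_0^{2π} sin^2(jx) dx = π), the prototypical example of a weakly but not strongly converging sequence can be inspected as follows:

import numpy as np

# u_j(x) = sin(j x) converges weakly, but not strongly, to 0 in L^2(0, 2*pi):
# the pairings <u_j, phi> tend to 0 (Riemann-Lebesgue), while ||u_j||_{L^2} = sqrt(pi).
x, dx = np.linspace(0.0, 2.0 * np.pi, 400000, endpoint=False, retstep=True)
phi = np.exp(-x)                              # an arbitrary test function

for j in [1, 10, 100, 1000]:
    u = np.sin(j * x)
    pairing = np.sum(u * phi) * dx            # Riemann sum for the integral of u_j * phi
    norm = np.sqrt(np.sum(u ** 2) * dx)       # Riemann sum for ||u_j||_{L^2}
    print(j, round(pairing, 6), round(norm, 6))

# The pairings decay like 1/j, whereas the norms stay at sqrt(pi), roughly 1.7725.
# The weak limit u = 0 thus has strictly smaller norm than lim ||u_j||, in accordance
# with the lower semicontinuity of the norm recalled below.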
In this context recall that in Banach spaces topological weak compactness is equivalent to sequential weak compactness, this is the Eberlein– Šmulian Theorem. Also recall that the norm in a Banach space is lower semicontinuous with respect to weak convergence: If x j ⇀ x in X, then ∥x∥ ≤ lim inf j→∞ ∥x j ∥. One very thorough reference for most of this material is [23]. The following are just reminders: Theorem A.11 (Weak compactness). Let X be a reflexive Banach space. Then, normbounded sets in X are sequentially weakly relatively compact. we here really only need convergence of sequences. Comment... 2 However, 112 APPENDIX A. PREREQUISITES Theorem A.12 (Sequential Banach–Alaoglu theorem). Let X be a separable Banach space. Then, norm-bounded sets in the dual space X ∗ are weakly* sequentially relatively compact. Theorem A.13 (Dunford–Pettis). Let Ω ⊂ Rd be bounded. A bounded family { f j } ⊂ L1 (Ω) ( j ∈ N) is equiintegrable if and only if it is weakly relatively (sequentially) compact in L1 (Ω). Finally, weak convergence can be “improved” to strong convergence in the following way: Lemma A.14 (Mazur). Let x j ⇀ x in a Banach space X. Then, there exists a sequence (y j ) ⊂ X of convex combinations, N( j) yj = ∑ N( j) θ j,n xn , θ j,n ∈ [0, 1], n= j ∑ θ j,n = 1. n= j such that y j → x in X. A.4 Sobolev and other function spaces spaces We give a brief introduction to Sobolev spaces, see [59] or [36] for more detailed accounts and proofs. Let Ω ⊂ Rd . As usual we denote by C(Ω) = C0 (Ω), Ck (Ω), k = 1, 2, . . ., the spaces of continuous and k-times continuously differentiable functions. As norms in these spaces we have ∥u∥Ck := ∑ |α |≤k ∥∂ α u∥∞ , u ∈ Ck (Ω), k = 0, 1, 2, . . . , where ∥ q∥∞ is the supremum norm. Here, the sum is over all multi-indices α ∈ (N ∪ {0})d with |α | := α1 + · · · + αd ≤ k and ∂ α := ∂1α1 ∂2α2 · · · ∂dαd is the α -derivative operator. Similarly, we define C∞ (Ω) of infinitely-often differentiable functions (but this is not a Banach space anymore). A subscript “c ” indicates that all functions u in the respective function space (e.g. C∞ c (Ω)) must have their support supp u := { x ∈ Ω : u(x) ̸= 0 } Comment... compactly contained in Ω (so, supp u ⊂ Ω and supp u compact). For the compact containment of a A in an open set B we write A ⊂⊂ B, which means that A ⊂ B. In this way, the previous condition could be written supp u ⊂⊂ Ω. For k ∈ N a positive integer and 1 ≤ p ≤ ∞, the Sobolev space Wk,p (Ω) is defined to contain all functions u ∈ L p (Ω) such that the weak derivative ∂ α u exists in L p (Ω) for all A.4. SOBOLEV AND OTHER FUNCTION SPACES SPACES 113 multi-indices α ∈ (N ∪ {0})d with |α | ≤ k. This means that for each such α , there is a (unique) function vα ∈ L p (Ω) satisfying ∫ |α | vα · ψ dx = (−1) ∫ u · ∂ α ψ dx for all ψ ∈ C∞ c (Ω), and we write ∂ α u for this vα . The uniqueness follows from the Fundamental Lemma of the Calculus of Variations 2.21. Clearly, if u ∈ Ck (Ω), then the weak derivative is the classical derivative. For u ∈ W1,p (Ω) we further define the weak gradient and weak divergence, ∇u := (∂1 u, ∂2 u, . . . , ∂d u), div u := ∂1 u + ∂2 u + · · · + ∂d u. As norm in Wk,p (Ω), 1 ≤ p < ∞, we use ( )1/p ∑ ∥u∥Wk,p := |α |≤k ∥∂ α u∥Lp p , u ∈ Wk,p (Ω). For p = ∞, we set ∥u∥Wk,∞ := max ∥∂ α u∥L∞ , |α |≤k u ∈ Wk,∞ (Ω). Under these norms, the Wk,p (Ω) become Banach spaces. Concerning the boundary values of Sobolev functions, we have: Theorem A.15 (Trace theorem). 
For every domain Ω ⊂ Rd with Lipschitz boundary and every 1 ≤ p ≤ ∞, there exists a bounded linear trace operator trΩ : W1,p (Ω) → L p (∂ Ω) such that tr φ = φ |∂ Ω if φ ∈ C(Ω). We write trΩ u simply as u|∂ Ω . Furthermore, trΩ is bounded and weakly continuous between W1,p (Ω) and L p (∂ Ω). For 1 < p < ∞ denote the image of W1,p (Ω) under trΩ by W1−1/p,p (∂ Ω), called the trace space of W1,p (Ω). As a makeshift norm on W1−1/p,p (∂ Ω) one can use the L p (∂ Ω)norm, but W1−1/p,p (∂ Ω) is not closed under this norm. This can be remedied by introducing a more complicated norm involving “fractional” derivatives (hence the notation for W1−1/p,p (∂ Ω)), under which this set becomes a Banach space. 1,p 1,p We write W1,p 0 (Ω) for the linear subspace of W (Ω) consisting of all W -functions with zero boundary values (in the sense of trace). More generally, we use W1,p g (Ω) with g ∈ W1−1/p,p (∂ Ω), for the affine subspace of all W1,p -functions with boundary trace g. The following are some properties of Sobolev spaces, stated for simplicity only for the first-order space W1,p (Ω) with Ω ⊂ Rd a bounded Lipschitz domain. Comment... Theorem A.16 (Poincaré–Friedrichs inequality). Let u ∈ W1,p (Ω). 114 APPENDIX A. PREREQUISITES (I) If u|∂ Ω = 0, then ∥u∥L p ≤ C∥∇u∥L p , where C = C(Ω) > 0 is a constant. If 1 < p < ∞ and g ∈ W1−1/p,p (∂ Ω), then ( ) ∥u∥L p ≤ C ∥∇u∥L p + ∥g∥W1−1/p,p . (II) Setting [u]Ω := ∫ −Ω u dx, it holds that ∥u − [u]Ω ∥L p ≤ C∥∇u∥L p , where C = C(Ω) > 0 is a constant. Sobolev functions have higher integrability than originally assumed: Theorem A.17 (Sobolev embedding). Let u ∈ W1,p (Ω). ∗ (I) If p < d, then u ∈ L p (Ω), where p∗ = dp d−p and there is a constant C = C(Ω) > 0 such that ∥u∥L p∗ ≤ C∥u∥W1,p . (II) If p = d, then u ∈ Lq (Ω) for all 1 ≤ q < ∞ and ∥u∥Lq ≤ Cq ∥u∥W1,p . (III) If p > d, then u ∈ C(Ω) and ∥u∥∞ ≤ C∥u∥W1,p . The second part can in fact be made more precise by considering embeddings into Hölder spaces, see Section 5.6.3 in [36] for details. Theorem A.18 (Rellich–Kondrachov). Let (u j ) ⊂ W1,p (Ω) with u j ⇀ u in W1,p . (I) If p < d, then u j → u in Lq (Ω; RN ) (strongly) for any q < p∗ = d p/(d − p). (II) If p = d, then u j → u in Lq (Ω; RN ) (strongly) for any q < ∞. (III) If p > d, then u j → u uniformly. The following is a consequence of the Stone–Weierstrass Theorem: Theorem A.19 (Density). For every 1 ≤ p < ∞, u ∈ W1,p (Ω), and all ε > 0 there exists v ∈ (W1,p ∩ C∞ )(Ω) with u − v ∈ W1,p 0 (Ω) and ∥u − v∥W1,p < ε . Finally, we note that for 0 < γ ≤ 1 a function u : Ω → R is γ -Hölder-continuous function, in symbols u ∈ C0,γ (Ω), if |u(x) − u(y)| < ∞. |x − y|γ x,y∈Ω ∥u∥C0,γ := ∥u∥∞ + sup Comment... x̸=y A.4. SOBOLEV AND OTHER FUNCTION SPACES SPACES 115 Functions in C0,1 (Ω) are called Lipschitz-continuous. We construct higher-order spaces Ck,γ (Ω) analogously. d Next, we define a family of mollifiers as follows: Let ρ ∈ C∞ c (R ) be radially symmetric ∞ d and positive. Then, the family (ρδ )δ ⊂ Cc (B(0, δ ))(R ) consists of the functions ( ) 1 x ρδ (x) := d ρ , x ∈ Rd . δ δd For u ∈ Wk,p (Rd ), where k ∈ N ∪ {0} and 1 ≤ p < ∞, we define the mollification uδ ∈ Wk,p (Rd ) as the convolution between u and ρδ , i.e. uδ (x) := (ρδ ⋆ u)(x) := ∫ ρδ (x − y)u(y) dy, x ∈ Rd . Lemma A.20. If u ∈ Wk,p (Rd ), then uδ → u strongly in Wk,p as δ ↓ 0. Analogous results also hold for continuously differentiable functions. Finally, all the above notions and theorems continue to hold for vector-valued functions u = (u1 , . . . 
, um )T : Ω → Rm and in this case we set 1 ∂1 u ∂2 u1 · · · ∂d u1 ∂1 u2 ∂2 u2 · · · ∂d u2 m×d ∇u := . . .. .. ∈ R .. . . ∂1 um ∂2 um · · · ∂d um Comment... We use the spaces C(Ω; Rm ), Ck (Ω; Rm ), Wk,p (Ω; Rm ), Ck,γ (Ω; Rm ) with analogous meanings. k,γ We also use local versions of all spaces, namely Cloc (Ω), Ckloc (Ω), Wk,p loc (Ω), Cloc (Ω), where the defining norm is only finite on every compact subset of Ω. Bibliography J. J. Alibert and G. Bouchitté. Non-uniform integrability and generalized Young measures. J. Convex Anal., 4:129–147, 1997. [2] L. Ambrosio, N. Fusco, and D. Pallara. Functions of Bounded Variation and Free-Discontinuity Problems. Oxford Mathematical Monographs. Oxford University Press, 2000. [3] H. Attouch, G. Buttazzo, and G. Michaille. Variational Analysis in Sobolev and BV Spaces – Applications to PDEs and Optimization. MPS-SIAM Series on Optimization. Society for Industrial and Applied Mathematics, 2006. [4] E. J. Balder. A general approach to lower semicontinuity and lower closure in optimal control theory. SIAM J. Control Optim., 22:570–598, 1984. [5] J. M. Ball. Convexity conditions and existence theorems in nonlinear elasticity. Arch. Ration. Mech. Anal., 63:337–403, 1976/77. [6] J. M. Ball. Global invertibility of Sobolev functions and the interpenetration of matter. Proc. Roy. Soc. Edinburgh Sect. A, 88:315–328, 1981. [7] J. M. Ball. A version of the fundamental theorem for Young measures. In PDEs and continuum models of phase transitions (Nice, 1988), volume 344 of Lecture Notes in Physics, pages 207–215. Springer, 1989. [8] J. M. Ball. Some open problems in elasticity. In Geometry, mechanics, and dynamics, pages 3–59. Springer, 2002. [9] J. M. Ball, J. C. Currie, and P. J. Olver. Null Lagrangians, weak continuity, and variational problems of arbitrary order. J. Funct. Anal., 41:135–174, 1981. [10] J. M. Ball and R. D. James. Fine phase mixtures as minimizers of energy. Arch. Ration. Mech. Anal., 100:13–52, 1987. [11] J. M. Ball and R. D. James. Proposed experimental tests of a theory of fine microstructure and the two-well problem. Phil. Trans. R. Soc. Lond. A, 338:389–450, 1992. Comment... [1] 118 BIBLIOGRAPHY J. M. Ball, B. Kirchheim, and J. Kristensen. Regularity of quasiconvex envelopes. Calc. Var. Partial Differential Equations, 11:333–359, 2000. [13] J. M. Ball and V. J. Mizel. One-dimensional variational problems whose minimizers do not satisfy the Euler-Lagrange equation. Arch. Rational Mech. Anal., 90:325–388, 1985. [14] J. M. Ball and F. Murat. Remarks on Chacon’s biting lemma. Proc. Amer. Math. Soc., 107:655–663, 1989. [15] J. Bergh and J. Löfström. Interpolation Spaces, volume 223 of Grundlehren der mathematischen Wissenschaften. Springer, 1976. [16] H. Berliocchi and J.-M. Lasry. Intégrandes normales et mesures paramétrées en calcul des variations. Bull. Soc. Math. France, 101:129–184, 1973. [17] B. Bojarski and T. Iwaniec. Analytical foundations of the theory of quasiconformal mappings in Rn . Ann. Acad. Sci. Fenn. Ser. A I Math., 8:257–324, 1983. [18] A. Braides. Γ-convergence for Beginners, volume 22 of Oxford Lecture Series in Mathematics and its Applications. Oxford University Press, 2002. [19] C. Castaing, P. R. de Fitte, and M. Valadier. Young Measures on Topological Spaces: With Applications in Control Theory and Probability Theory. Mathematics and its Applications. Kluwer, 2004. [20] Ph. G. Ciarlet. Mathematical Elasticity, volume 1: Three Dimensional Elasticity. North-Holland, 1988. [21] Ph. G. Ciarlet and G. Geymonat. 
Sur les lois de comportement en élasticité non linéaire compressible. C. R. Acad. Sci. Paris Sér. II Méc. Phys. Chim. Sci. Univers Sci. Terre, 295:423–426, 1982. [22] Ph. G. Ciarlet and J. Nečas. Injectivity and self-contact in nonlinear elasticity. Arch. Rational Mech. Anal., 97:171–188, 1987. [23] J. B. Conway. A Course in Functional Analysis, volume 96 of Graduate Texts in Mathematics. Springer, 2nd edition, 1990. [24] B. Dacorogna. Weak Continuity and Weak Lower Semicontinuity for Nonlinear Functionals, volume 922 of Lecture Notes in Mathematics. Springer, 1982. [25] B. Dacorogna. Direct Methods in the Calculus of Variations, volume 78 of Applied Mathematical Sciences. Springer, 2nd edition, 2008. [26] B. Dacorogna. Introduction to the calculus of variations. Imperial College Press, 2nd edition, 2009. Comment... [12] BIBLIOGRAPHY 119 G. Dal Maso. An Introduction to Γ-Convergence, volume 8 of Progress in Nonlinear Differential Equations and their Applications. Birkhäuser, 1993. [28] E. De Giorgi. Sulla differenziabilità e l’analiticità delle estremali degli integrali multipli regolari. Mem. Accad. Sci. Torino. Cl. Sci. Fis. Mat. Nat. (3), 3:25–43, 1957. [29] E. De Giorgi. Un esempio di estremali discontinue per un problema variazionale di tipo ellittico. Boll. Un. Mat. Ital., 1:135–137, 1968. [30] J. Diestel and J. J. Uhl, Jr. Vector measures, volume 15 of Mathematical Surveys. American Mathematical Society, 1977. [31] N. Dinculeanu. Vector measures, volume 95 of International Series of Monographs in Pure and Applied Mathematics. Pergamon Press, 1967. [32] R. J. DiPerna and A. J. Majda. Oscillations and concentrations in weak solutions of the incompressible fluid equations. Comm. Math. Phys., 108:667–689, 1987. [33] G. Dolzmann and S. Müller. Microstructures with finite surface energy: the two-well problem. Arch. Ration. Mech. Anal., 132:101–141, 1995. [34] I. Ekeland and R. Temam. Convex analysis and variational problems. North-Holland, 1976. [35] L. C. Evans. Weak convergence methods for nonlinear partial differential equations, volume 74 of CBMS Regional Conference Series in Mathematics. Conference Board of the Mathematical Sciences, 1990. [36] L. C. Evans. Partial differential equations, volume 19 of Graduate Studies in Mathematics. American Mathematical Society, 2nd edition, 2010. [37] I. Fonseca and G. Leoni. Modern Methods in the Calculus of Variations: L p Spaces. Springer, 2007. [38] I. Fonseca, S. Müller, and P. Pedregal. Analysis of concentration and oscillation effects generated by gradients. SIAM J. Math. Anal., 29:736–756, 1998. [39] M. Giaquinta and S. Hildebrandt. Calculus of variations. I – The Lagrangian formalism, volume 310 of Grundlehren der mathematischen Wissenschaften. Springer, 1996. [40] M. Giaquinta and S. Hildebrandt. Calculus of variations. II – The Hamiltonian formalism, volume 311 of Grundlehren der mathematischen Wissenschaften. Springer, 1996. [41] M. Giaquinta, G. Modica, and J. Souček. Cartesian currents in the calculus of variations. I, volume 37 of Ergebnisse der Mathematik und ihrer Grenzgebiete. Springer, 1998. Comment... [27] 120 BIBLIOGRAPHY M. Giaquinta, G. Modica, and J. Souček. Cartesian currents in the calculus of variations. II, volume 38 of Ergebnisse der Mathematik und ihrer Grenzgebiete. Springer, 1998. [43] D. Gilbarg and N. S. Trudinger. Elliptic Partial Differential Equations of Second Order, volume 224 of Grundlehren der mathematischen Wissenschaften. Springer, 1998. [44] E. Giusti. Direct methods in the calculus of variations. 
World Scientific, 2003.
[45] L. Grafakos. Classical Fourier Analysis, volume 249 of Graduate Texts in Mathematics. Springer, 2nd edition, 2008.
[46] D. Henao and C. Mora-Corral. Invertibility and weak continuity of the determinant for the modelling of cavitation and fracture in nonlinear elasticity. Arch. Ration. Mech. Anal., 197:619–655, 2010.
[47] D. Hilbert. Mathematische Probleme – Vortrag, gehalten auf dem internationalen Mathematiker-Kongreß zu Paris 1900. Nachrichten von der Königlichen Gesellschaft der Wissenschaften zu Göttingen, Mathematisch-Physikalische Klasse, pages 253–297, 1900.
[48] R. A. Horn and C. R. Johnson. Matrix analysis. Cambridge University Press, 2nd edition, 2013.
[49] D. Kinderlehrer and P. Pedregal. Characterizations of Young measures generated by gradients. Arch. Ration. Mech. Anal., 115:329–365, 1991.
[50] D. Kinderlehrer and P. Pedregal. Weak convergence of integrands and the Young measure representation. SIAM J. Math. Anal., 23:1–19, 1992.
[51] D. Kinderlehrer and P. Pedregal. Gradient Young measures generated by sequences in Sobolev spaces. J. Geom. Anal., 4:59–90, 1994.
[52] K. Koumatos, F. Rindler, and E. Wiedemann. Differential inclusions and Young measures involving prescribed Jacobians. SIAM J. Math. Anal., 2015. To appear.
[53] J. Kristensen. Lower semicontinuity of quasi-convex integrals in BV. Calc. Var. Partial Differential Equations, 7(3):249–261, 1998.
[54] J. Kristensen. Lower semicontinuity in spaces of weakly differentiable functions. Math. Ann., 313:653–710, 1999.
[55] J. Kristensen and C. Melcher. Regularity in oscillatory nonlinear elliptic systems. Math. Z., 260:813–847, 2008.
[56] J. Kristensen and G. Mingione. The singular set of Lipschitzian minima of multiple integrals. Arch. Ration. Mech. Anal., 184:341–369, 2007.
[57] J. Kristensen and F. Rindler. Characterization of generalized gradient Young measures generated by sequences in W^{1,1} and BV. Arch. Ration. Mech. Anal., 197:539–598, 2010. Erratum: Vol. 203 (2012), 693–700.
[58] M. Kružík and T. Roubíček. On the measures of DiPerna and Majda. Math. Bohem., 122:383–399, 1997.
[59] G. Leoni. A first course in Sobolev spaces, volume 105 of Graduate Studies in Mathematics. American Mathematical Society, 2009.
[60] M. Marcus and V. J. Mizel. Transformations by functions in Sobolev spaces and lower semicontinuity for parametric variational problems. Bull. Amer. Math. Soc., 79:790–795, 1973.
[61] P. Mattila. Geometry of Sets and Measures in Euclidean Spaces, volume 44 of Cambridge Studies in Advanced Mathematics. Cambridge University Press, 1995.
[62] E. J. McShane. Generalized curves. Duke Math. J., 6:513–536, 1940.
[63] E. J. McShane. Necessary conditions in generalized-curve problems of the calculus of variations. Duke Math. J., 7:1–27, 1940.
[64] E. J. McShane. Sufficient conditions for a weak relative minimum in the problem of Bolza. Trans. Amer. Math. Soc., 52:344–379, 1942.
[65] N. G. Meyers. Quasi-convexity and lower semi-continuity of multiple variational integrals of any order. Trans. Amer. Math. Soc., 119:125–149, 1965.
[66] B. S. Mordukhovich. Variational analysis and generalized differentiation. I – Basic theory, volume 330 of Grundlehren der mathematischen Wissenschaften. Springer, 2006.
[67] B. S. Mordukhovich. Variational analysis and generalized differentiation. II – Applications, volume 331 of Grundlehren der mathematischen Wissenschaften. Springer, 2006.
[68] C. B. Morrey, Jr. On the solutions of quasi-linear elliptic partial differential equations. Trans. Amer. Math. Soc., 43:126–166, 1938.
[69] C. B. Morrey, Jr. Quasiconvexity and the semicontinuity of multiple integrals. Pacific J. Math., 2:25–53, 1952.
[70] C. B. Morrey, Jr. Multiple Integrals in the Calculus of Variations, volume 130 of Grundlehren der mathematischen Wissenschaften. Springer, 1966.
[71] J. Moser. A new proof of De Giorgi's theorem concerning the regularity problem for elliptic differential equations. Comm. Pure Appl. Math., 13:457–468, 1960.
[72] J. Moser. On Harnack's theorem for elliptic differential equations. Comm. Pure Appl. Math., 14:577–591, 1961.
[73] S. Müller. Variational models for microstructure and phase transitions. In Calculus of variations and geometric evolution problems (Cetraro, 1996), volume 1713 of Lecture Notes in Mathematics, pages 85–210. Springer, 1999.
[74] S. Müller and S. J. Spector. An existence theory for nonlinear elasticity that allows for cavitation. Arch. Rational Mech. Anal., 131:1–66, 1995.
[75] S. Müller and V. Šverák. Convex integration for Lipschitz mappings and counterexamples to regularity. Ann. of Math., 157:715–742, 2003.
[76] F. Murat. Compacité par compensation. Ann. Scuola Norm. Sup. Pisa Cl. Sci., 5:489–507, 1978.
[77] F. Murat. Compacité par compensation. II. In Proceedings of the International Meeting on Recent Methods in Nonlinear Analysis (Rome, 1978), pages 245–256. Pitagora, 1979.
[78] J. Nash. Continuity of solutions of parabolic and elliptic equations. Amer. J. Math., 80:931–954, 1958.
[79] J. Nečas. On regularity of solutions to nonlinear variational inequalities for second-order elliptic systems. Rend. Mat., 8:481–498, 1975.
[80] P. J. Olver. Applications of Lie groups to differential equations, volume 107 of Graduate Texts in Mathematics. Springer, 2nd edition, 1993.
[81] P. Pedregal. Parametrized Measures and Variational Principles, volume 30 of Progress in Nonlinear Differential Equations and their Applications. Birkhäuser, 1997.
[82] Y. G. Reshetnyak. On the stability of conformal mappings in multidimensional spaces. Siberian Math. J., 8:69–85, 1967.
[83] R. T. Rockafellar. Convex Analysis, volume 28 of Princeton Mathematical Series. Princeton University Press, 1970.
[84] R. T. Rockafellar and R. J.-B. Wets. Variational analysis, volume 317 of Grundlehren der mathematischen Wissenschaften. Springer, 1998.
[85] W. Rudin. Real and Complex Analysis. McGraw-Hill, 3rd edition, 1986.
[86] L. Simon. Schauder estimates by scaling. Calc. Var. Partial Differential Equations, 5:391–407, 1997.
[87] E. M. Stein. Harmonic Analysis. Princeton University Press, 1993.
[88] V. Šverák. Quasiconvex functions with subquadratic growth. Proc. Roy. Soc. London Ser. A, 433:723–725, 1991.
[89] V. Šverák. Rank-one convexity does not imply quasiconvexity. Proc. Roy. Soc. Edinburgh Sect. A, 120:185–189, 1992.
[90] V. Šverák and X. Yan. A singular minimizer of a smooth strongly convex functional in three dimensions. Calc. Var. Partial Differential Equations, 10:213–221, 2000.
[91] V. Šverák and X. Yan. Non-Lipschitz minimizers of smooth uniformly convex functionals. Proc. Natl. Acad. Sci. USA, 99:15269–15276, 2002.
[92] L. Székelyhidi, Jr. The regularity of critical points of polyconvex functionals. Arch. Ration. Mech. Anal., 172:133–152, 2004.
[93] Q. Tang. Almost-everywhere injectivity in nonlinear elasticity. Proc. Roy. Soc. Edinburgh Sect. A, 109:79–95, 1988.
[94] L. Tartar. Compensated compactness and applications to partial differential equations. In Nonlinear analysis and mechanics: Heriot-Watt Symposium, Vol. IV, volume 39 of Res. Notes in Math., pages 136–212. Pitman, 1979.
[95] L. Tartar. The compensated compactness method applied to systems of conservation laws. In Systems of nonlinear partial differential equations (Oxford, 1982), volume 111 of NATO Adv. Sci. Inst. Ser. C Math. Phys. Sci., pages 263–285. Reidel, 1983.
[96] B. van Brunt. The calculus of variations. Universitext. Springer, 2004.
[97] L. C. Young. Generalized curves and the existence of an attained absolute minimum in the calculus of variations. C. R. Soc. Sci. Lett. Varsovie, Cl. III, 30:212–234, 1937.
[98] L. C. Young. Generalized surfaces in the calculus of variations. Ann. of Math., 43:84–103, 1942.
[99] L. C. Young. Generalized surfaces in the calculus of variations. II. Ann. of Math., 43:530–544, 1942.
[100] L. C. Young. Lectures on the calculus of variations and optimal control theory. Saunders, 1969.
[101] L. C. Young. Lectures on the calculus of variations and optimal control theory. Chelsea, 2nd edition, 1980. Reprinted by AMS Chelsea Publishing, 2000.

Index

action of a measure, 111
adjugate matrix, 108
Ball existence theorem, 84
Ball–James rigidity theorem, 66
Banach–Alaoglu theorem, 112
barycenter, 111
biting convergence, 76
blow-up technique, 69
bootstrapping, 28
Borel σ-algebra, 110
Borel measure, 110
brachistochrone curve, 2
Caccioppoli inequality, 25
Carathéodory integrand, 14
Ciarlet–Nečas non-interpenetration condition, 86
coercivity, 12
  weak, 14
cofactor matrix, 108
compactness principle for Young measures, 56
compensated compactness, 75
continuum mechanics, 6
convexity, 11
  strong, 25
convolution, 115
Cramer rule, 108
critical point, 21
cut-off construction, 48
Dacorogna formula, 105
De Giorgi regularity theorem, 28
De Giorgi–Nash–Moser regularity theorem, 29
density theorem for Sobolev spaces, 114
difference quotient, 25
Direct Method, 12
  for weak convergence, 13
  with side constraint, 38
directional derivative, 19
Dirichlet integral, 5, 17, 33
disintegration theorem, 57
distributional cofactors, 81
distributional determinant, 82
double-well potential, 10, 87
duality pairing, 111
duality pairing, for measures, 110
Dunford–Pettis theorem, 112
elasticity tensor, 7
electric energy, 4
electric potential, 4
ellipticity, 21
Euler–Lagrange equation, 19
Evans partial regularity theorem, 74
extension–relaxation, 95, 96
Faraday law of induction, 4
Fatou lemma, 109
frame-indifference, 43
Frobenius matrix (inner) product, 107
Frobenius matrix norm, 107
Fundamental lemma of the Calculus of Variations, 23
Fundamental theorem of Young measure theory, 55
Gauss law, 4
Green–St. Venant strain tensor, 6
ground state in quantum mechanics, 5
growth bound
  lower, 15
  upper, 15
Hölder-continuity, 114
Hadamard inequality, 108
Hamiltonian, 5
harmonic function, 20
homogenization lemma, 100
hyperelasticity, 6
integrand, 11
invariance, 34
Jensen inequality, 111
Jensen-type inequality, 65
Kinderlehrer–Pedregal theorem, 102
Korn inequality, 17
Lagrange multiplier, 39
Lamé constants, 80
lamination, 47
Laplace equation, 20
Laplace operator, 20
Lavrentiev gap phenomenon, 31
Lebesgue dominated convergence theorem, 109
Lebesgue point, 110
Legendre–Hadamard condition, 104
linearized strain tensor, 7
Lipschitz domain, 11
Lipschitz-continuity, 115
lower semicontinuity, 12
maximal function, 63
Mazur lemma, 112
microstructure, 87, 95
minimizing sequence, 12
minor, 51
modulus of continuity, 93
mollification, 115
mollifier, 115
Monotone convergence lemma, 109
multi-index, 112
  ordered, 51
Noether multipliers, 35
Noether theorem, 35
normal integrand, 76
null-Lagrangian, 51
o, 109
objective functional, 12
Ogden material, 7
optimal beating problem, 10
optimal saving problem, 8
Philosophy of Science, 1
Piola identity, 52
Poincaré–Friedrichs inequality, 113
Poisson equation, 4
polyconvexity, 78
Pratt theorem, 109
probability measure, 110
quasiaffine, 53
quasiconvex envelope, 88
quasiconvexity, 44
  strong, 74
Radon measure, 110
rank of a minor, 51
rank-one convexity, 46
recovery sequence, 96
regular variational integral, 25
relaxation, 88, 91
relaxation theorem
  abstract, 91
  for integral functionals, 93
  with Young measures, 96
Rellich–Kondrachov theorem, 114
Riesz representation theorem, 111
rigid body motion, 6
scalar problem, 18
Schauder estimate, 28, 29
Schrödinger equation, 5
  stationary, 5, 41
Scorza Dragoni theorem, 15
side constraint, 37
singular set, 75
singular value, 108
singular value decomposition, 108
Sobolev embedding theorem, 114
Sobolev space, 112
solution
  classical, 23
  strong, 23
  weak, 19
St. Venant–Kirchhoff material, 80
Statistical Mechanics, 1
stiffness tensor, 7
strong convergence in L^p-spaces, 109
support, 112
symmetries of minimizers, 33
tensor product, 107
trace operator, 113
trace space, 113
trace theorem, 113
two-gradient inclusion
  approximate, 66
  exact, 66
two-gradient problem, 66
underlying deformation, 62
utility function, 8
variation
  first, 21
variational principle, 1
vectorial problem, 18
Vitali convergence theorem, 110
Vitali covering theorem, 110
W^{2,2}_loc-regularity theorem, 25
wave equation, 23
wave function, 5
weak compactness in Banach spaces, 111
weak continuity, 38
weak continuity of minors, 53
weak derivative, 112
weak divergence, 113
weak gradient, 113
weak lower semicontinuity, 14
Young inequality, 107
Young measure, 55
  elementary, 56
  gradient, 62
  homogeneous, 59
  set of Young measures, 55