Functional Inequalities for Gaussian and Log-Concave Probability Measures

Ewain Gwynne
Adviser: Professor Elton Hsu
Northwestern University
A thesis submitted for a Bachelor’s degree in
Mathematics (with honors)
and
Mathematical Methods in the Social Sciences
Abstract
We give three proofs of a functional inequality for the standard Gaussian measure originally due to
William Beckner. The first uses the central limit theorem and a tensorial property of the inequality.
The second uses the Ornstein-Uhlenbeck semigroup, and the third uses the heat semigroup. These latter
two proofs yield a more general inequality than the one Beckner originally proved. We then generalize
our new inequality to log-concave probability measures, study the relationship between this inequality
and a generalized logarithmic Sobolev inequality, and prove several other inequalities for log-concave
probability measures, including Brascamp and Lieb’s sharpened Poincaré inequality and Bobkov and
Ledoux’s sharpened logarithmic Sobolev inequality of the same form. We discuss some of the potential
applications of our work in economics.
Contents

1 Introduction
2 Proof of Beckner's Inequality via the Central Limit Theorem
   2.1 Tensorial property
   2.2 Two-point inequality
   2.3 First proof of Beckner's inequality
3 Extended Beckner Inequality via Semigroup Methods
   3.1 The Ornstein-Uhlenbeck operator
   3.2 The Ornstein-Uhlenbeck semigroup
   3.3 Proof of extended Beckner inequality
   3.4 Classical heat semigroup
4 Beckner Inequality for Log-Concave Probability Measures
   4.1 Generalization of the Ornstein-Uhlenbeck operator and semigroup
   4.2 Commutation with the gradient
   4.3 Proof of Beckner's inequality for log-concave probability measures
5 Other Inequalities for Log-Concave Probability Measures
   5.1 Generalized logarithmic Sobolev inequality
   5.2 Inequality for the semigroup
   5.3 Brascamp-Lieb inequality
   5.4 Sharpened logarithmic Sobolev inequality
6 Appendices
   6.1 Sobolev Spaces for Log-Concave Measures
   6.2 Existence of Semigroup
   6.3 Applications to Economics
0.1 Preface
I completed this thesis in 2013, shortly before my graduation from Northwestern University. It is intended
to fulfill the requirements for graduation with honors in Mathematics as well as the requirements for a major
in Mathematical Methods in the Social Sciences (MMSS).
This thesis consists of a mixture of results which I proved in collaboration with my adviser, and my own
exposition of material from research papers. The various sections herein were written over the course of
nearly a year, beginning in late Spring of 2012 (when I was a Junior in college) and continuing throughout
Summer 2012 and the following school year. My work on this thesis has exposed me to a broad range of
techniques and concepts in analysis and probability theory, which will be of use to me in my intended future
career as a mathematician. This work has also improved my intuition for problem solving and my ability to
read research papers in mathematics.
I am indebted to several organizations and individuals for the successful completion of this thesis.
I would like to thank my adviser, Professor Elton Hsu, for suggesting this project and for his guidance
throughout my work. He struck a perfect balance between providing enough guidance to keep me from
following dead ends and to make sure I had sufficient mathematical background to tackle problems that
arose, and allowing me to explore my own ideas and learn from my mistakes. Moreover, his explanations
have been a great help in building my mathematical intuition, and his suggestions for improvements in my
writing have not only strengthened the exposition in this paper, but have also made me a stronger writer in
general.
I would like to thank Professor Valentino Tosatti for serving as my second reader. His review of this
document and careful, insightful comments on it have been a major help in the writing process.
I would like to thank Professors Joseph Ferrie and William Rogerson of the MMSS program and Professor
Mike Stein of the Math department for their flexibility in allowing me to do a thesis which would work for
both of my majors.
I would like to thank Northwestern University for funding part of my work on this project via an undergraduate research grant in the summer of 2012. The financial independence provided by this grant enabled
me to devote my full attention to research during that summer, and thereby to discover more mathematics
than would otherwise have been possible.
1 Introduction

The standard Gaussian measure on ℝ^n is the measure

    γ^n = (2π)^{−n/2} e^{−|x|^2/2} dx.    (1.1)
In the case n = 1, we write γ^1 = γ. Two of the most fascinating and important properties of this measure are the Poincaré inequality

    ‖f‖_2^2 − (∫_{ℝ^n} f dγ^n)^2 ≤ ‖∇f‖_2^2    (1.2)

and Gross's [16] logarithmic Sobolev inequality

    ∫_{ℝ^n} f^2 log|f| dγ^n − ‖f‖_2^2 log ‖f‖_2 ≤ ‖∇f‖_2^2,    (1.3)
both valid for functions f in the Sobolev space W^{2,1}(γ^n) (see Appendix 6.1 for the definition of this space
and its basic properties). The Poincaré and log-Sobolev inequalities are used throughout pure and applied
mathematics, in fields as diverse as quantum mechanics, mathematical finance, infinite dimensional analysis,
mathematical statistics, stochastic analysis, random matrix theory, and partial differential equations. For
example, the logarithmic Sobolev inequality can be viewed as a sharpened form of Heisenberg’s uncertainty
principle. It is also used to obtain bounds for the solutions of partial differential equations, improve models
of fluctuations in stock prices, and characterize the behavior of Brownian motion on manifolds.
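Neither inequality is proved until later in the text, but both can be sanity-checked numerically. The following sketch (an illustration, not part of the thesis) verifies (1.2) and (1.3) for the one-dimensional standard Gaussian measure by Gauss-Hermite quadrature; the test function f(x) = exp(0.3 sin x) is an arbitrary smooth choice.

```python
import numpy as np

# Numerical sanity check of the Poincare inequality (1.2) and the
# log-Sobolev inequality (1.3) for the 1-D standard Gaussian measure,
# via Gauss-Hermite quadrature.  The test function is arbitrary.
nodes, weights = np.polynomial.hermite.hermgauss(80)
x = np.sqrt(2.0) * nodes        # nodes adapted to dgamma = (2*pi)^(-1/2) e^{-x^2/2} dx
w = weights / np.sqrt(np.pi)    # weights normalized so the total mass is 1

f  = np.exp(0.3 * np.sin(x))    # arbitrary smooth positive test function
fp = 0.3 * np.cos(x) * f        # its derivative

mean_f   = float(np.sum(w * f))
norm2_sq = float(np.sum(w * f**2))
grad_sq  = float(np.sum(w * fp**2))

# Poincare gap: ||f'||_2^2 - ( ||f||_2^2 - (int f dgamma)^2 )
poincare_gap = grad_sq - (norm2_sq - mean_f**2)

# log-Sobolev gap: ||f'||_2^2 - ( int f^2 log|f| dgamma - ||f||_2^2 log||f||_2 )
entropy = float(np.sum(w * f**2 * np.log(f))) - norm2_sq * np.log(np.sqrt(norm2_sq))
logsob_gap = grad_sq - entropy

print(poincare_gap > 0, logsob_gap > 0)
```

Both gaps come out strictly positive, consistent with the fact that equality in (1.2) forces f to be affine, and equality in (1.3) forces f to be an exponential.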
Recall that for 1 ≤ p < ∞, the L^p norm of a measurable function f on ℝ^n with respect to a measure µ on ℝ^n is defined by

    ‖f‖_p = ( ∫_{ℝ^n} |f|^p dµ )^{1/p}.

Here the measure µ will always be clear from the context. Beckner [5] has proven a functional inequality for the standard Gaussian measure which involves the L^p norms for 1 ≤ p ≤ 2:

    ‖f‖_2^2 − ‖f‖_p^2 ≤ (2 − p)‖∇f‖_2^2.    (1.4)
For p = 1, inequality (1.4) is equivalent to the Poincaré inequality, as can be seen for bounded f by adding
a sufficiently large constant C so that f + C is non-negative, and for the general f by approximation by
bounded functions. Furthermore, if we divide both sides of (1.4) by 2 − p and let p → 2, the left side tends
to the left side of (1.3). Thus Beckner’s inequality interpolates between the Poincaré inequality and the
logarithmic Sobolev inequality.
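The interpolation just described can be observed numerically. The sketch below (an illustration, not a proof) checks (1.4) on a grid of p ∈ [1, 2] and confirms that the quotient (‖f‖_2^2 − ‖f‖_p^2)/(2 − p) approaches the entropy term of (1.3) as p → 2; the test function is an arbitrary smooth positive choice.

```python
import numpy as np

# Check of Beckner's inequality (1.4) and of its p -> 2 limit, for the
# 1-D Gaussian measure, via Gauss-Hermite quadrature.
nodes, weights = np.polynomial.hermite.hermgauss(80)
x = np.sqrt(2.0) * nodes
w = weights / np.sqrt(np.pi)

f  = np.exp(0.3 * np.sin(x))   # arbitrary smooth positive test function
fp = 0.3 * np.cos(x) * f       # its derivative

norm2_sq = float(np.sum(w * f**2))
grad_sq  = float(np.sum(w * fp**2))

def lhs(p):
    # left side of (1.4): ||f||_2^2 - ||f||_p^2
    return norm2_sq - float(np.sum(w * f**p)) ** (2.0 / p)

beckner_holds = all(lhs(p) <= (2.0 - p) * grad_sq for p in np.linspace(1.0, 2.0, 21))

# dividing (1.4) by 2 - p and letting p -> 2 recovers the left side of (1.3)
entropy = float(np.sum(w * f**2 * np.log(f))) - norm2_sq * np.log(np.sqrt(norm2_sq))
limit_quotient = lhs(1.9999) / (2.0 - 1.9999)
print(beckner_holds, abs(limit_quotient - entropy) < 1e-3)
```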
Beckner's original proof of inequality (1.4) is based on the explicit spectral decomposition of the Ornstein-Uhlenbeck operator in terms of Hermite polynomials and Nelson's [26] hypercontractivity inequality for the Ornstein-Uhlenbeck semigroup. This latter inequality is a significant result in its own right, and is most
easily proven using the logarithmic Sobolev inequality (see, for example, [9]).
Apparently unaware of Beckner's work, Latala and Oleszkiewicz [21] proved an extension of Beckner's inequality for measures c e^{−|x_1|^r − ... − |x_n|^r} dx with 1 ≤ r ≤ 2. However, in the Gaussian case r = 2, inequality (1.4) was derived from the logarithmic Sobolev inequality and hypercontractivity of the Ornstein-Uhlenbeck semigroup, via an argument similar to that in [5]. In Section 3.1 of [22], Ledoux used non-linear PDE
to prove a version of (1.4) for the invariant probability measures of Markov semigroups whose generators
satisfy a curvature-dimension inequality; in the Gaussian case, this inequality reduces to a sharpened form
of (1.4), with the right side multiplied by (n − 1)/n and the parameter p allowed to increase to 2n/(n − 1).
Several other authors have also studied generalizations of Beckner’s inequality in various directions; see
for example [1], [2], [4], [11], [22], and [30]. However, the arguments in these papers also require results
comparable in difficulty to hypercontractivity or the logarithmic Sobolev inequality. So, it is instructive to
find a direct proof of Beckner’s inequality.
We shall give three such proofs. For our first proof of Beckner’s inequality, in Chapter 2, we shall
deduce (1.4) via the central limit theorem (Theorem 2.8) and an approximation argument, beginning with
an analogous inequality for a probability measure on a two-point set. This method of proof was suggested as
an alternative approach in [5], and is similar to Gross’ original proof of the logarithmic Sobolev inequality.
In fact, approximation arguments of this sort are pervasive in probability theory. So, this proof illustrates
an important technique. In the course of the proof, we prove an important tensorial property of inequality
(1.4), which allows one to deduce the n-dimensional inequality from the 1-dimensional one, and implies that
the inequality can be extended to infinite dimensional Gaussian measures.
Our second proof, in Chapter 3, uses the elementary properties of the Ornstein-Uhlenbeck operator and
its associated semigroup, introduced in Sections 3.1 and 3.2. The Ornstein-Uhlenbeck operator satisfies a
special integration by parts formula (Proposition 3.3), and its semigroup preserves Gaussian integrals. These
two properties make the Ornstein-Uhlenbeck operator a natural tool for proving inequalities of the sort we
study here. Our method actually yields a new, more general version of inequality (1.4), valid with the exponent 2 replaced by any q > 2:

    ‖f‖_q^2 − ‖f‖_p^2 ≤ (q − p)‖∇f‖_q^2,    f ∈ W^{q,1}(µ),  q ≥ 2,  1 ≤ p ≤ 2.    (1.5)
Our third proof, given in Section 3.4, replaces the Ornstein-Uhlenbeck semigroup with the better known
classical heat semigroup, and also yields the extended inequality (1.5).
Beginning in Chapter 4 we shall concern ourselves with log-concave probability measures, a natural
generalization of Gaussian measures. We will study analogues of the Ornstein-Uhlenbeck operator and
semigroup in this more general setting. We will then use them to extend our proof of (1.5) to prove an
analogue of this inequality for general log-concave probability measures on Rn , with a multiplicative constant
depending on the measure appearing on the right side.
In Chapter 5, we will study several other inequalities for log-concave probability measures, often using the
semigroup of Subsection 4.1. In Section 5.1, we will explore the implication relationships between inequality (1.5) and a generalized logarithmic Sobolev inequality:

    ‖f‖_q^{2−q} ∫_{ℝ^n} |f|^q log|f| dµ − log(‖f‖_q)‖f‖_q^2 ≤ C‖∇f‖_q^2,    f ∈ W^{q,1}(µ),  q ≥ 2,

in the context of a general log-concave probability measure µ on ℝ^n.
In Section 5.2, we will prove an inequality for the semigroup associated to a log-concave probability
measure, which extends an inequality which Beckner derived along with (1.4) in [5].
In Section 5.3 we will use semigroup methods to prove a sharpened Poincaré inequality for log-concave probability measures due to Brascamp and Lieb [10]:

    ‖f‖_2^2 − (∫_{ℝ^n} f dµ)^2 ≤ ∫_{ℝ^n} ⟨(D^2 v)^{−1} ∇f, ∇f⟩ dµ.
In the course of the proof, we also obtain invertibility of our generalization of the Ornstein-Uhlenbeck
operator on the space of functions in L^2(µ) with vanishing mean (Proposition 5.14).
In Section 5.4, we shall prove an analogous sharpened logarithmic Sobolev inequality due to Bobkov and
Ledoux [7], under a stronger hypothesis on the measure µ; and use the Herbst argument (Proposition 5.21)
to give counterexamples which show that this inequality cannot hold in general.
The inequalities we study here have potential uses in many different fields. To illustrate their applicability,
we shall discuss some of their potential applications in economics in Appendix 6.3.
2 Proof of Beckner's Inequality via the Central Limit Theorem

The main goal of this chapter is to prove the following theorem.

Theorem 2.1 (Beckner's Inequality). If f ∈ W^{2,1}(γ^n), then for each p ∈ (1, 2),

    ‖f‖_2^2 − ‖f‖_p^2 ≤ (2 − p)‖∇f‖_2^2.    (2.1)
We shall first establish a tensorial property of the inequality (2.1), then proceed by way of an approximation argument using the central limit theorem and an analogous inequality for a probability measure on a two-point set. This method of proof is inspired by Gross' [16] original proof of the logarithmic Sobolev inequality, and was suggested in [5].
2.1 Tensorial property
In this section, we shall prove a tensorial property of Beckner’s inequality, in the setting of an arbitrary
probability measure. This property will allow us to pass from an inequality for a measure on a two-point set
to the n-fold convolution of such a measure in order to apply the central limit theorem in the next section.
It will also allow us to deduce the n-dimensional case of Beckner’s inequality from the 1-dimensional case:
Theorem 2.2 (Tensorial property). Let µ be a probability measure on a set Ω. Let 1 ≤ p ≤ 2. Suppose that F is a subspace of L^2(µ), B is a bilinear form on F, and µ satisfies an inequality

    ‖f‖_2^2 − ‖f‖_p^2 ≤ C(2 − p)B(f, f),    f ∈ F    (2.2)

for some constant C > 0. Then the n-fold Cartesian product measure µ^n satisfies the inequality

    ‖f‖_2^2 − ‖f‖_p^2 ≤ C(2 − p)B̃_n(f, f),    f ∈ F̃^n,

where F̃^n is the space of functions f on Ω^n such that x_i ↦ f(x_1, ..., x_n) ∈ F for each fixed x_1, ..., x_{i−1}, x_{i+1}, ..., x_n ∈ Ω, and

    B̃_n(f, g) := ∫_{Ω^n} Σ_{i=1}^n B[f(x_1, ..., x_{i−1}, ·, x_{i+1}, ..., x_n), g(x_1, ..., x_{i−1}, ·, x_{i+1}, ..., x_n)] dµ^n(x).    (2.3)
In words, the bilinear form B̃_n is defined as follows: for each index i, we apply the form B to the function on Ω given by x_i ↦ f(x_1, ..., x_{i−1}, x_i, x_{i+1}, ..., x_n) with the variables other than the ith held fixed. Then, we sum over all i. Allowing the coordinates we had previously held fixed to vary, this gives a function on Ω^n. We then integrate this function over Ω^n.
To see why we should care about this particular bilinear form, consider the most important case of Theorem 2.2, namely where Ω = ℝ, F = C_c^∞(ℝ), and B(f, g) = ∫_ℝ f′g′ dµ. Here we have

    B[f(x_1, ..., x_{i−1}, ·, x_{i+1}, ..., x_n), g(x_1, ..., x_{i−1}, ·, x_{i+1}, ..., x_n)] = ∫_ℝ ∂_i f(x_1, ..., x_n) ∂_i g(x_1, ..., x_n) dµ(x_i)

and so

    B̃_n(f, g) = ∫_{ℝ^n} ⟨∇f, ∇g⟩ dµ^n.    (2.4)

Thus, in the case of a probability measure on ℝ, Theorem 2.2 tells us that Beckner's inequality (2.1) in dimension one implies Beckner's inequality in dimension n for f ∈ C_c^∞(ℝ^n), and hence, by density (see Appendix 6.1), also for f ∈ W^{2,1}(µ). In particular, it will suffice to prove Theorem 2.1 only in the one-dimensional case.
Our proof of Theorem 2.2 is based on an argument by Latala and Oleszkiewicz [21]. We first need a result which characterizes the expression on the left side of (2.2).

Lemma 2.3. Let µ be a probability measure on a set Ω. Let q ∈ [1, 2]. For f ∈ L^2(µ), f ≥ 0, set

    Φ(f) = ∫_Ω f^q dµ − (∫_Ω f dµ)^q.

Then Φ is a convex functional on the non-negative functions in L^2(µ), i.e. for any t ∈ [0, 1], if f, g ≥ 0, then

    Φ(tf + (1 − t)g) ≤ tΦ(f) + (1 − t)Φ(g).    (2.5)
Proof. First suppose that f and g satisfy A ≥ f, g ≥ a for constants a, A > 0. We need to show that

    α(t) := Φ(tf + (1 − t)g) − tΦ(f) − (1 − t)Φ(g) ≤ 0

for each t ∈ [0, 1]. By our hypotheses on f and g, α is a twice-differentiable function of t. We have α(0) = α(1) = 0. Therefore, if α(t) is positive for some t ∈ [0, 1], then α attains a positive maximum in [0, 1]. So, it suffices to show that there can be no such maximum. For this, it is enough to show that α″(t) ≥ 0. Using dominated convergence to differentiate under the integral sign, we compute

    α″(t) = (d^2/dt^2) Φ(tf + (1 − t)g)
          = q(q − 1) ∫_Ω (tf + (1 − t)g)^{q−2}(f − g)^2 dµ − q(q − 1) (∫_Ω (tf + (1 − t)g) dµ)^{q−2} (∫_Ω (f − g) dµ)^2.

Thus, we need to show that

    (∫_Ω (tf + (1 − t)g)^{q−2}(f − g)^2 dµ) (∫_Ω (tf + (1 − t)g) dµ)^{2−q} ≥ (∫_Ω (f − g) dµ)^2.    (2.6)

Fix t and set h = (tf + (1 − t)g)^{2−q}. By Hölder's inequality,

    (∫_Ω (f − g) dµ)^2 = (∫_Ω ((f − g)/√h) √h dµ)^2 ≤ (∫_Ω ((f − g)^2/h) dµ)(∫_Ω h dµ)
        = (∫_Ω (tf + (1 − t)g)^{q−2}(f − g)^2 dµ)(∫_Ω (tf + (1 − t)g)^{2−q} dµ).

But, since x ↦ x^{2−q} is concave on [0, ∞), we have

    ∫_Ω (tf + (1 − t)g)^{2−q} dµ ≤ (∫_Ω (tf + (1 − t)g) dµ)^{2−q}.

Plugging this into the last line proves (2.6).

For general non-negative f and g in L^2, one can find sequences (f_j) and (g_j) which are bounded above and bounded away from 0 and which converge to f and g, respectively, in the L^2 norm. Taking the limit in (2.5) for f_j and g_j gives the result for f and g.
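Since Lemma 2.3 concerns an arbitrary probability space, it is easy to stress-test on a finite one. The following Monte-Carlo sketch (not part of the thesis; the five-point uniform measure is an arbitrary choice) checks the convexity inequality (2.5) for randomly sampled f, g, q, and t.

```python
import random

# Monte-Carlo check of Lemma 2.3 on a finite probability space:
# Omega = {0,...,4} with the uniform measure.
random.seed(0)
mu = [0.2] * 5

def Phi(h, q):
    # Phi(h) = int h^q dmu - (int h dmu)^q, for h >= 0
    return sum(m * v**q for m, v in zip(mu, h)) - sum(m * v for m, v in zip(mu, h)) ** q

convex_ok = True
for _ in range(2000):
    q = random.uniform(1.0, 2.0)
    t = random.random()
    f = [random.uniform(0.1, 5.0) for _ in range(5)]
    g = [random.uniform(0.1, 5.0) for _ in range(5)]
    mix = [t * a + (1 - t) * b for a, b in zip(f, g)]
    # convexity (2.5), up to floating-point tolerance
    convex_ok = convex_ok and Phi(mix, q) <= t * Phi(f, q) + (1 - t) * Phi(g, q) + 1e-12
print(convex_ok)
```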
Lemma 2.4. Let µ be a probability measure on a set Ω. Then if q ∈ (1, 2) and f ∈ L^2(µ^n) is non-negative,

    ∫_{Ω^n} f^q dµ^n − (∫_{Ω^n} f dµ^n)^q ≤ ∫_{Ω^n} Σ_{i=1}^n [ ∫_Ω f^q dµ_i − (∫_Ω f dµ_i)^q ] dµ^n,

where

    ∫_Ω f dµ_i := ∫_Ω f(x_1, ..., x_{i−1}, x_i, x_{i+1}, ..., x_n) dµ(x_i).
Proof. First observe that the formula concerns only the absolute value of f, so we may assume that f ≥ 0. Suppose first that n = 2, in which case the desired formula is

    ∫_{Ω×Ω} f^q dµ_1 dµ_2 − (∫_Ω ∫_Ω f dµ_1 dµ_2)^q
        ≤ ∫_Ω ∫_Ω [ ∫_Ω f^q dµ_1 − (∫_Ω f dµ_1)^q + ∫_Ω f^q dµ_2 − (∫_Ω f dµ_2)^q ] dµ_1 dµ_2.    (2.7)

Note that since µ_1 and µ_2 are probability measures, integrating a second time with respect to the same measure has no effect. So, we can rearrange terms to obtain the equivalent inequality

    ∫_Ω (∫_Ω f dµ_2)^q dµ_1 − (∫_Ω ∫_Ω f dµ_1 dµ_2)^q ≤ ∫_Ω ∫_Ω f^q dµ_1 dµ_2 − ∫_Ω (∫_Ω f dµ_1)^q dµ_2.    (2.8)

Define Φ as in Lemma 2.3. We apply the general form of Jensen's inequality, with the L^1(Ω)-valued random variable y ↦ f(·, y) and the convex function Φ on L^1(Ω), to obtain

    Φ(∫_Ω f(·, y) dµ_2(y)) ≤ ∫_Ω Φ(f(·, y)) dµ_2(y).

Expanding both sides of this inequality, we obtain (2.8). This proves the result in the n = 2 case. The general case follows by a simple induction argument.
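The n = 2 case of Lemma 2.4, which drives the induction, can be checked by brute force on a two-point space. The sketch below (not from the text) samples random positive functions on Ω^2 with Ω = {−1, 1} and the uniform measure, and compares the two sides of the displayed inequality.

```python
import random

# Monte-Carlo check of the n = 2 case of Lemma 2.4 on the two-point
# space Omega = {-1, 1} with the uniform measure mu({-1}) = mu({1}) = 1/2.
random.seed(1)
PTS = [(a, b) for a in (-1, 1) for b in (-1, 1)]

def check_once():
    q = random.uniform(1.0, 2.0)
    f = {p: random.uniform(0.1, 3.0) for p in PTS}
    # left side: int f^q dmu^2 - (int f dmu^2)^q
    lhs = sum(f[p]**q for p in PTS) / 4 - (sum(f[p] for p in PTS) / 4) ** q
    # right side: integrate both one-variable deficits over mu^2
    rhs = 0.0
    for a, b in PTS:
        d1 = (f[(-1, b)]**q + f[(1, b)]**q) / 2 - ((f[(-1, b)] + f[(1, b)]) / 2) ** q
        d2 = (f[(a, -1)]**q + f[(a, 1)]**q) / 2 - ((f[(a, -1)] + f[(a, 1)]) / 2) ** q
        rhs += (d1 + d2) / 4
    return lhs <= rhs + 1e-12

all_ok = all(check_once() for _ in range(2000))
print(all_ok)
```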
Proof of Theorem 2.2. Let f ∈ F̃^n. Apply Lemma 2.4 with q = 2/p to the function |f|^p to find

    ‖f‖_2^2 − ‖f‖_p^2 ≤ ∫_{Ω^n} Σ_{i=1}^n [ ∫_Ω f^2 dµ_i − (∫_Ω |f|^p dµ_i)^{2/p} ] dµ^n.    (2.9)

By hypothesis, for fixed x_1, ..., x_{i−1}, x_{i+1}, ..., x_n ∈ Ω, each term in the sum on the right side of (2.9) is bounded above by

    C(2 − p)B[f(x_1, ..., x_{i−1}, ·, x_{i+1}, ..., x_n), f(x_1, ..., x_{i−1}, ·, x_{i+1}, ..., x_n)].

Summing over all i and integrating, we get that the right side of (2.9) is bounded above by

    C(2 − p) ∫_{Ω^n} Σ_{i=1}^n B[f(x_1, ..., x_{i−1}, ·, x_{i+1}, ..., x_n), f(x_1, ..., x_{i−1}, ·, x_{i+1}, ..., x_n)] dµ^n(x) = C(2 − p)B̃_n(f, f).
Remark 2.5. The tensorial property proven in Lemma 2.4 also allows one to extend Beckner’s inequality
to infinite dimensional Gaussian measures. See [9] for a discussion of these measures.
Remark 2.6. Later, we shall prove an inequality for log-concave probability measures of the form

    ‖f‖_q^2 − ‖f‖_p^2 ≤ C(q − p)‖∇f‖_q^2,    q ≥ 2,  1 ≤ p ≤ q,  f ∈ W^{q,1}(µ).
In the case q > 2, this inequality does not, to our knowledge, possess the tensorial property of this section.
We must therefore prove this inequality in all dimensions, rather than just deducing the n-dimensional case
from the one-dimensional case as we do for Beckner’s inequality below.
2.2 Two-point inequality

Let X^n = {−1, 1}^n, with the uniform probability measure m^n which assigns measure 1/2^n to each point. In this subsection, we shall prove an inequality for X = X^1 which is analogous to that in Theorem 2.1.
This inequality will eventually enable us to deduce Theorem 2.1 in the one-dimensional case via a limiting
argument based on the central limit theorem (Theorem 2.8).
Proposition 2.7. Define a bilinear form on the functions on X by

    B(f, g) = (1/4)(f(1) − f(−1))(g(1) − g(−1)).    (2.10)

Then for any function f on X,

    ‖f‖_2^2 − ‖f‖_p^2 ≤ (2 − p)B(f, f),    (2.11)

where here the norms are with respect to m.
Proof. We begin with some reductions. Every function on X is of the form f(x) = ax + b for constants a and b. If b = 0, then |f| is constant, so the left side of (2.11) vanishes and the result is obvious. So, we may assume that b ≠ 0. Then since the desired inequality is invariant under rescaling, we may take f(x) = f_a(x) = ax + 1 for some a. The measure m is symmetric about the origin, so it suffices to assume that a ≥ 0. Move all of the terms in (2.11) to one side and treat the difference as a function of a:

    φ(a) = (2 − p)B(f_a, f_a) − (‖f_a‖_2^2 − ‖f_a‖_p^2).

We shall show that φ ≥ 0, whence the desired inequality. The proof is essentially a direct calculation. We treat two cases.
Case I: 0 ≤ a ≤ 1. Then

    φ(a) = −(p − 1)a^2 − 1 + ( ((1 + a)^p + (1 − a)^p)/2 )^{2/p}.

Observe that φ(0) = 0. We shall show that φ′(0) = 0 and that φ is convex on [0, 1], so has a unique minimum value of 0 at the origin. We have

    φ′(a) = −2(p − 1)a + ‖f_a‖_p^{2−p} ((1 + a)^{p−1} − (1 − a)^{p−1})    (2.12)

and φ′(0) = 0. Moreover,

    φ″(a) = −2(p − 1) + ((2 − p)/2)‖f_a‖_p^{2−2p} ((1 + a)^{p−1} − (1 − a)^{p−1})^2 + (p − 1)‖f_a‖_p^{2−p} ((1 + a)^{p−2} + (1 − a)^{p−2}).    (2.13)

The second term here is always non-negative. By Jensen's inequality,

    ‖f_a‖_p^{2−p} ≥ ((1 + a + 1 − a)/2)^{2−p} = 1.

By convexity of the function x ↦ x^{p−2} for x > 0,

    (1 + a)^{p−2} + (1 − a)^{p−2} ≥ 2((1 + a + 1 − a)/2)^{p−2} = 2.

Thus the third term in (2.13) satisfies

    (p − 1)‖f_a‖_p^{2−p} ((1 + a)^{p−2} + (1 − a)^{p−2}) ≥ 2(p − 1).

Therefore φ″ ≥ 0 on [0, 1], so φ is convex on [0, 1], as desired.
Case II: a ≥ 1. Then

    φ(a) = −(p − 1)a^2 − 1 + ( ((1 + a)^p + (a − 1)^p)/2 )^{2/p}.

We know from the previous case that φ(1) ≥ 0, so to show that φ is non-negative it suffices to show that φ′(a) ≥ 0 for a ≥ 1. We compute

    φ′(a) = −2(p − 1)a + ‖f_a‖_p^{2−p} ((1 + a)^{p−1} + (a − 1)^{p−1}).    (2.14)

For the first factor of the second term in (2.14), we have by Jensen's inequality that

    ‖f_a‖_p^{2−p} ≥ ((1 + a + a − 1)/2)^{2−p} = a^{2−p}.    (2.15)

We claim that the second factor in (2.14) satisfies

    (1 + a)^{p−1} + (a − 1)^{p−1} ≥ 2(p − 1)a^{p−1}.    (2.16)

If we can demonstrate this, then (2.14), (2.15), and (2.16) will imply

    φ′(a) ≥ −2(p − 1)a + a^{2−p} · 2(p − 1)a^{p−1} = 0,

which is what we need to show.

To prove the claim, set

    ψ(a) = (1 + a)^{p−1} + (a − 1)^{p−1} − 2(p − 1)a^{p−1}.    (2.17)
We must show that ψ is non-negative on [1, ∞). It suffices to show that ψ(1) ≥ 0 and that ψ is increasing on [1, ∞). We have ψ(1) = 2^{p−1} − 2(p − 1) ≥ 0. Furthermore,

    ψ′(a) = (p − 1)((1 + a)^{p−2} + (a − 1)^{p−2} − 2(p − 1)a^{p−2})
          ≥ (p − 1)((a + 1)^{p−2} + (a − 1)^{p−2} − 2a^{p−2}).

The function x ↦ x^{p−2} is convex on (0, ∞), so

    (a + 1)^{p−2} + (a − 1)^{p−2} ≥ 2((a + 1)/2 + (a − 1)/2)^{p−2} = 2a^{p−2},

and hence ψ′(a) ≥ 0 and ψ is increasing in a. It follows that ψ(a) ≥ ψ(1) ≥ 0 for a ≥ 1. As we remarked above, this proves (2.16), and completes the proof of (2.11).
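Because every function on X is affine, inequality (2.11) reduces to a two-parameter family of scalar inequalities, which can be checked on a grid. The following sketch (an illustration, not part of the proof) verifies (2.11) for f(x) = ax + 1 over a range of a ≥ 0 and p ∈ [1, 2].

```python
import numpy as np

# Grid check of the two-point inequality (2.11): for f(x) = a*x + 1 on
# X = {-1, 1} with the uniform measure,
#   ||f||_2^2 - ||f||_p^2 <= (2 - p) * (f(1) - f(-1))^2 / 4.
ok = True
for p in np.linspace(1.0, 2.0, 51):
    for a in np.linspace(0.0, 10.0, 201):
        norm2_sq  = ((1 + a)**2 + (1 - a)**2) / 2                 # = 1 + a^2
        norm_p_sq = ((abs(1 + a)**p + abs(1 - a)**p) / 2) ** (2 / p)
        B = (2 * a)**2 / 4                                        # B(f_a, f_a) = a^2
        ok = ok and (norm2_sq - norm_p_sq <= (2 - p) * B + 1e-10)
print(ok)
```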
2.3 First proof of Beckner's inequality
Theorem 2.8 (Central Limit Theorem). Let µ be a probability measure on ℝ, normalized so that ∫_ℝ x dµ(x) = 0 and ∫_ℝ x^2 dµ(x) = 1. Define a function φ_n on ℝ^n by

    φ_n(x_1, ..., x_n) = (1/√n) Σ_{j=1}^n x_j.

Let µ^n be the product of n copies of µ on ℝ^n, and let µ^{n∗} be the law of φ_n, i.e. the probability measure on ℝ such that

    µ^{n∗}(a, b] = µ^n{x ∈ ℝ^n : a < φ_n(x) ≤ b}

for each interval (a, b]. Then as n → ∞,

    µ^{n∗}(a, b] → γ(a, b]

for each interval (a, b], and

    ∫_ℝ g dµ^{n∗} → ∫_ℝ g dγ

for any bounded continuous function g. In other words, µ^{n∗} → γ in the weak-* topology on the set of finite measures on ℝ.
A proof of the central limit theorem can be found in Ch. 3 of [13]. In probabilistic terms, the central
limit theorem is the statement that the normalized means φn of a sequence of independent identically
distributed random variables converge in distribution to a variable with a Gaussian distribution. In practical
applications, the central limit theorem is used to estimate the probability that the mean of large sample lies
in a given range, using known probabilities for the Gaussian measure. It is the fundamental tool behind
most elementary hypothesis tests and confidence intervals.
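For the measure m of Section 2.2, Theorem 2.8 can be illustrated exactly, since the law of φ_n is a centered and rescaled binomial distribution. The sketch below (not from the text) compares m^{n∗}(a, b] with γ(a, b] for one arbitrarily chosen interval; by the Berry-Esseen theorem, the error at n = 4000 is below the stated tolerance.

```python
import math

# Numerical illustration of Theorem 2.8: the law m^{n*} of
# (x_1 + ... + x_n)/sqrt(n), with x_j = +-1 uniform and independent,
# approaches the standard Gaussian measure gamma.
def m_nstar(n, a, b):
    # m^{n*}((a, b]) computed exactly from the binomial distribution
    total = 0.0
    for k in range(n + 1):
        s = (2 * k - n) / math.sqrt(n)   # value of phi_n when k coordinates equal +1
        if a < s <= b:
            total += math.comb(n, k) / 2**n
    return total

def gamma_interval(a, b):
    Phi = lambda t: 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))
    return Phi(b) - Phi(a)

a, b = -1.0, 0.5                         # arbitrary interval
err = abs(m_nstar(4000, a, b) - gamma_interval(a, b))
print(err < 0.02)
```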
We shall apply the central limit theorem with the measure m from Section 2.2. Note that we can and do view m as a measure on ℝ, which assigns zero measure to sets which do not contain 1 or −1. First we apply Theorem 2.2 in this setting. For B̃_n as in (2.3), we have

    B̃_n(f, f) = ∫_{X^n} Σ_{i=1}^n B[f(x_1, ..., x_{i−1}, ·, x_{i+1}, ..., x_n), f(x_1, ..., x_{i−1}, ·, x_{i+1}, ..., x_n)] dm^n
               = (1/4) ∫_{X^n} Σ_{i=1}^n (f(x_1, ..., x_{i−1}, 1, x_{i+1}, ..., x_n) − f(x_1, ..., x_{i−1}, −1, x_{i+1}, ..., x_n))^2 dm^n.    (2.18)

By Proposition 2.7 and Theorem 2.2,

    ‖f‖_{L^2(m^n)}^2 − ‖f‖_{L^p(m^n)}^2 ≤ (2 − p)B̃_n(f, f),    1 ≤ p ≤ 2    (2.19)

for each function f on X^n.
Now suppose that f is a function on ℝ. For each integer n, define a function f_n on X^n by

    f_n(x_1, ..., x_n) = f((1/√n) Σ_{j=1}^n x_j).    (2.20)

In the notation of Theorem 2.8, f_n = f ∘ φ_n, so the theorem suggests that f_n will converge in some sense to f. We seek a simple expression for B̃_n(f_n, f_n). Fix an integer i = 1, ..., n, and define X^n_{−i} to be the product of the n − 1 copies of X in X^n other than the ith. Write Σy for the sum of the n − 1 coordinates of y ∈ X^n_{−i}. Then the integral of the ith summand in (2.18) is

    ∫_{X^n} (f_n(x_1, ..., x_{i−1}, 1, x_{i+1}, ..., x_n) − f_n(x_1, ..., x_{i−1}, −1, x_{i+1}, ..., x_n))^2 dm^n
        = (1/2^{n−1}) Σ_{y ∈ X^n_{−i}} ( f((Σy + 1)/√n) − f((Σy − 1)/√n) )^2.    (2.21)

If the tuple y has k ones and n − 1 − k negative ones, then Σy = 2k − n + 1. For each k = 0, ..., n − 1, there are (n − 1 choose k) tuples y ∈ X^n_{−i} satisfying this condition. Thus the integral in (2.21) equals

    (1/2^{n−1}) Σ_{k=0}^{n−1} (n − 1 choose k) ( f((2k − n)/√n + 2/√n) − f((2k − n)/√n) )^2.

Observe that this is independent of i. We therefore have

    B̃_n(f_n, f_n) = (n/2^{n+1}) Σ_{k=0}^{n−1} (n − 1 choose k) ( f((2k − n)/√n + 2/√n) − f((2k − n)/√n) )^2
                  = (1/2^{n−1}) Σ_{k=0}^{n−1} (n − 1 choose k) ( ( f((2k − n)/√n + 2/√n) − f((2k − n)/√n) ) / (2/√n) )^2.    (2.22)
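Formula (2.22) expresses B̃_n(f_n, f_n) as a finite sum, so its convergence to ∫(f′)^2 dγ (established rigorously in the proof of Theorem 2.1 below) can be observed directly. The sketch below (not from the text) evaluates (2.22) for the test function f = sin, for which ∫(f′)^2 dγ = ∫cos^2 dγ = (1 + e^{−2})/2.

```python
import math

# Direct evaluation of formula (2.22), compared with int (f')^2 dgamma
# for f = sin, where int cos^2 dgamma = (1 + e^{-2})/2.
def B_tilde(n, f):
    # right side of (2.22)
    s = 0.0
    for k in range(n):
        t = (2 * k - n) / math.sqrt(n)
        s += math.comb(n - 1, k) * (f(t + 2 / math.sqrt(n)) - f(t)) ** 2
    return n * s / 2 ** (n + 1)

target = (1 + math.exp(-2)) / 2
approx = B_tilde(400, math.sin)
print(abs(approx - target) < 0.02)
```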
We are now ready to prove the one-dimensional version of Theorem 2.1.
Proof of Theorem 2.1. By Theorem 2.2 and the remarks following it, it will suffice to prove Theorem 2.1 in the case n = 1. Furthermore, since C_c^∞(ℝ^n) is dense in W^{2,1}(µ) (c.f. Appendix 6.1), it suffices to prove the result for f ∈ C_c^∞(ℝ). Define f_n as in (2.20). We shall deduce (2.1) as the limit as n → ∞ of (2.11) for f_n. The functions f^2 and |f|^p are continuous with compact support, so by Theorem 2.8, as n → ∞,

    ‖f_n‖_{L^2(m^n)} = ‖f‖_{L^2(m^{n∗})} → ‖f‖_{L^2(γ)},    ‖f_n‖_{L^p(m^n)} = ‖f‖_{L^p(m^{n∗})} → ‖f‖_{L^p(γ)}.    (2.23)

This proves convergence of the left side of (2.11).

It remains to treat the right side. Let g_n be the function on ℝ which equals

    ( ( f((2k − n)/√n + 2/√n) − f((2k − n)/√n) ) / (2/√n) )^2

on [(2k − n)/√n, (2(k + 1) − n)/√n) for each k = 0, ..., n − 1, and which equals zero elsewhere.

Let m^{n∗} be the law of φ_n, as in Theorem 2.8. By elementary probability theory, the measure m^{(n−1)∗} is, up to centering and rescaling, a binomial distribution with parameters n − 1 and 1/2. Therefore

    ∫_ℝ g_n dm^{(n−1)∗} = Σ_{k=0}^{n−1} (1/2^{n−1}) (n − 1 choose k) ( ( f((2k − n)/√n + 2/√n) − f((2k − n)/√n) ) / (2/√n) )^2 = B̃_n(f_n, f_n).

On the other hand, g_n is the square of a difference quotient of f on each interval [(2k − n)/√n, (2(k + 1) − n)/√n). Since f ∈ C_c^∞(ℝ), it is readily verified that g_n → (f′)^2 uniformly as n → ∞. Thus, given ε > 0, we may choose N so large that |g_n − (f′)^2| < ε uniformly over ℝ whenever n ≥ N. By the central limit theorem, we may choose N′ ≥ N such that

    | ∫_ℝ (f′)^2 dm^{(n−1)∗} − ∫_ℝ (f′)^2 dγ | < ε

whenever n ≥ N′. Hence for each such n,

    | B̃_n(f_n, f_n) − ∫_ℝ (f′)^2 dγ | ≤ ∫_ℝ |g_n − (f′)^2| dm^{(n−1)∗} + | ∫_ℝ (f′)^2 dm^{(n−1)∗} − ∫_ℝ (f′)^2 dγ | < 2ε,

whence B̃_n(f_n, f_n) → ∫_ℝ (f′)^2 dγ. Thus, if we take the limit in (2.11) for f_n, we get (2.1).

3 Extended Beckner Inequality via Semigroup Methods
In this chapter, we introduce the Ornstein-Uhlenbeck operator and its associated semigroup, and later the
classical heat semigroup, and use them to prove an extended version of Beckner’s inequality. The following
bit of notation will be useful here and in the remainder of the paper. Let q ≥ 1. If 1 ≤ p ≤ q, we say that a
probability measure µ on ℝ^n satisfies inequality Bec(q, p) with constant C if for each f ∈ W^{q,1}(µ),

    ‖f‖_q^2 − ‖f‖_p^2 ≤ C(q − p)‖∇f‖_q^2.    (3.1)

Notice that Bec(2, p) for 1 ≤ p ≤ 2 is Beckner's original inequality. Our goal is to prove the following.

Theorem 3.1. The measure γ^n satisfies Bec(q, p) with constant 1 whenever q ≥ 2 and 1 ≤ p ≤ q. Furthermore, equality holds if and only if f is constant a.e.
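Theorem 3.1 is proved later in this chapter, but Bec(q, p) with constant 1 can already be sanity-checked by quadrature. The sketch below (an illustration, not a proof) tests (3.1) in one dimension for several pairs (q, p); the test function is an arbitrary smooth choice.

```python
import numpy as np

# Quadrature check of Bec(q, p) with constant 1 for the 1-D standard
# Gaussian measure: ||f||_q^2 - ||f||_p^2 <= (q - p) ||f'||_q^2.
nodes, weights = np.polynomial.hermite.hermgauss(80)
x = np.sqrt(2.0) * nodes
w = weights / np.sqrt(np.pi)

f  = np.exp(0.4 * np.sin(x))   # arbitrary smooth test function
fp = 0.4 * np.cos(x) * f       # its derivative

def norm_sq(g, r):
    # squared L^r(gamma) norm ||g||_r^2
    return float(np.sum(w * np.abs(g)**r)) ** (2.0 / r)

bec_ok = all(
    norm_sq(f, q) - norm_sq(f, p) <= (q - p) * norm_sq(fp, q) + 1e-9
    for q in [2.0, 2.5, 3.0, 4.0]
    for p in np.linspace(1.0, q, 9)
)
print(bec_ok)
```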
3.1 The Ornstein-Uhlenbeck operator
In this section, we will introduce the Ornstein-Uhlenbeck operator and discuss the basic theory of semigroups.
A more detailed discussion of the Ornstein-Uhlenbeck operator can be found in [9]. A more detailed discussion
of semigroups can be found in [8].
Definition 3.2. The Ornstein-Uhlenbeck operator is the densely defined unbounded operator A : L^2(γ^n) → L^2(γ^n) defined by

    Af(x) = ∆f(x) − ⟨x, ∇f(x)⟩    (3.2)

with domain consisting of those f ∈ L^2(γ^n) such that this formula defines an L^2 function. Here ∆ = Σ_{j=1}^n ∂_j^2 denotes the Laplacian.
The Ornstein-Uhlenbeck operator has a number of interesting properties, as well as applications in partial
differential equations and probability theory. For our purposes, the importance of this operator lies in the
following integration by parts formula, which will play an essential role in what follows.
Proposition 3.3. If Af exists and g ∈ W^{2,1}(γ^n), then

    ∫_{ℝ^n} ⟨∇f, ∇g⟩ dγ^n = − ∫_{ℝ^n} gAf dγ^n.    (3.3)
Proof. By converting to an ordinary Lebesgue integral and integrating by parts, we obtain the elementary formula

    ∫_{ℝ^n} ∂_j f(x) dγ^n(x) = ∫_{ℝ^n} x_j f(x) dγ^n(x),    f ∈ C_c^∞(ℝ^n).    (3.4)

By approximation, the same holds for f ∈ W^{2,1}(γ^n). We now apply this and the product rule to the term ⟨x, g(x)∇f(x)⟩ in each coordinate to get

    ∫_{ℝ^n} gAf dγ^n = ∫_{ℝ^n} ( −⟨x, g(x)∇f(x)⟩ + g(x)∆f(x) ) dγ^n(x)
                     = ∫_{ℝ^n} ( −⟨∇f(x), ∇g(x)⟩ − g(x)∆f(x) + g(x)∆f(x) ) dγ^n(x)
                     = − ∫_{ℝ^n} ⟨∇f, ∇g⟩ dγ^n.
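Formula (3.3) can be confirmed numerically. With polynomial test functions, Gauss-Hermite quadrature is exact, so the two sides agree to rounding error; the choices f(x) = x^3 + x^2 and g(x) = x^3 below are arbitrary.

```python
import numpy as np

# Quadrature check of the integration by parts formula (3.3) in one
# dimension, with Af = f'' - x f'.  Polynomial test functions make the
# quadrature exact; here both sides equal 27 = 9 E[x^4].
nodes, weights = np.polynomial.hermite.hermgauss(40)
x = np.sqrt(2.0) * nodes
w = weights / np.sqrt(np.pi)

f, fp, fpp = x**3 + x**2, 3 * x**2 + 2 * x, 6 * x + 2
g, gp = x**3, 3 * x**2

lhs = float(np.sum(w * fp * gp))              # int f' g' dgamma
rhs = float(-np.sum(w * g * (fpp - x * fp)))  # -int g (Af) dgamma
print(abs(lhs - rhs) < 1e-8)
```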
We are interested in solving the analogue of the heat equation for the operator A, namely

    ∂_t u(t, x) = Au(t, x),    u(0, x) = f(x),    (3.5)

for sufficiently regular functions f ∈ L^2(γ^n). In order to do so, we will need the following notion from functional analysis.
Definition 3.4. A contraction semigroup is a family of operators {T_t : t ≥ 0} on a Banach space X satisfying

1. T_0 is the identity on X;
2. T_t ∘ T_s = T_{t+s};
3. t ↦ T_t is strongly continuous in t: if t_j → t, then for each x ∈ X, T_{t_j} x → T_t x in the norm of X;
4. ‖T_t x‖ ≤ ‖x‖ for all x ∈ X.
Perhaps the simplest nontrivial example of a contraction semigroup is the family of maps on ℝ defined by T_t(x) = e^{−t}x.
Definition 3.5. The infinitesimal generator of {T_t} is the operator S defined by

    S(x) = lim_{t→0} (T_t(x) − x)/t,
with domain D(S) consisting of those x ∈ X such that this limit exists.
In our above example, the infinitesimal generator of x ↦ e^{−t}x is x ↦ −x. In general, abstract theory
(see, for example, Section 7.4 of [14]) implies that the infinitesimal generator of a contraction semigroup
on a Banach space X always exists on a dense subspace of X. However, it is difficult in general to find its
precise domain. For many purposes (including our own), one can settle for a semigroup whose generator is
an extension of a given operator, i.e. one which agrees with the given operator on its domain, but may have
a larger domain.
The relation between the infinitesimal generator and the semigroup is made clear by the following.
Lemma 3.6. If x ∈ D(S), then for each t ≥ 0,

    (d/dt) T_t x = T_t Sx = ST_t x.    (3.6)
Proof. Since t ↦ T_t x is continuous, we have

    (d/dt) T_t x = lim_{s→0} (T_{s+t} x − T_t x)/s = T_t lim_{s→0} (T_s x − x)/s = T_t Sx

and

    (d/dt) T_t x = lim_{s→0} (T_s T_t x − T_t x)/s = ST_t x.
3.2  The Ornstein-Uhlenbeck semigroup
By Lemma 3.6, if we can find a semigroup {T_t} whose infinitesimal generator is an extension of the Ornstein-Uhlenbeck operator A, then u(t, x) = T_t f(x) will be a solution to (3.5). We could simply state the formula
(called the Mehler Formula) for Tt and prove that it has the necessary properties, but for pedagogical reasons
we first make the following informal derivation.
We assume all functions involved are sufficiently regular that our calculations make sense. Taking the
Fourier transform, defined by

ĝ(ξ) = ∫_{ℝⁿ} e^{−i⟨x,ξ⟩} g(x) dx,    g ∈ L¹(ℝⁿ),
in (3.5) gives
∂_t û(t, ξ) = −|ξ|² û(t, ξ) + i Σ_{j=1}^n ∂_{ξ_j}( −i ξ_j û(t, ξ) ) = −|ξ|² û(t, ξ) + Σ_{j=1}^n ξ_j ∂_{ξ_j} û(t, ξ) + n û(t, ξ),

with initial condition û(0, ξ) = f̂(ξ). Equivalently,

∂_t û(t, ξ) − Σ_{j=1}^n ξ_j ∂_{ξ_j} û(t, ξ) + ( |ξ|² − n ) û(t, ξ) = 0,    û(0, ξ) = f̂(ξ).
For fixed ξ₀ ∈ ℝⁿ, the characteristic ODEs for this PDE are

ṫ(s) = 1,    ξ̇(s) = −ξ,    ż(s) = ( n − |ξ|² ) z(s),
t(0) = 0,    ξ(0) = ξ₀,    z(0) = f̂(ξ₀),
where z is the variable we substitute for û. Solving the first two equations and substituting into the third, we obtain

t = s,    ξ(s) = e^{−s} ξ₀,    ż(s) = ( n − |ξ₀|² e^{−2s} ) z(s),    z(0) = f̂(ξ₀).

The last two equations yield the solution

z(s) = e^{ns} e^{−(1/2)|ξ₀|²(1 − e^{−2s})} f̂(ξ₀).
Given (t, ξ) ∈ [0, ∞) × ℝⁿ, we select s ∈ [0, ∞) and ξ₀ ∈ ℝⁿ such that (t, ξ) = (t(s), ξ(s)) = (s, e^{−s} ξ₀), i.e. s = t, ξ₀ = e^t ξ. Then

û(t, ξ) = z(s) = e^{nt} e^{−(1/2)|ξ|² e^{2t}(1 − e^{−2t})} f̂(e^t ξ).

The inverse Fourier transform u(t, x) of this function is e^{nt} times the convolution of the inverse Fourier transform of f̂(e^t ξ) and the inverse Fourier transform of exp(−(1/2)|ξ|² e^{2t}(1 − e^{−2t})). From the elementary properties of the Fourier transform, the inverse Fourier transform of f̂(e^t ξ) is e^{−nt} f(e^{−t} x), and the inverse Fourier transform of exp(−(1/2)|ξ|² e^{2t}(1 − e^{−2t})) is

(2π)^{−n/2} ( e^t √(1 − e^{−2t}) )^{−n} exp( −(1/2)|x|² e^{−2t}(1 − e^{−2t})^{−1} ).

Thus, upon taking the inverse Fourier transform and cancelling the factors e^{nt} and e^{−nt}, we get

u(t, x) = (2π)^{−n/2} ( e^t √(1 − e^{−2t}) )^{−n} ∫_{ℝⁿ} exp( −(1/2)|z|² e^{−2t}(1 − e^{−2t})^{−1} ) f( e^{−t} x − e^{−t} z ) dz.

Substitute z = e^t √(1 − e^{−2t}) y and use the symmetry of the Gaussian to find

u(t, x) = ∫_{ℝⁿ} f( e^{−t} x + √(1 − e^{−2t}) y ) dγⁿ(y).

We are therefore led to the following definition.
Definition 3.7. The Ornstein-Uhlenbeck semigroup is the family of operators {T_t} defined for t ≥ 0 by

T_t f(x) = ∫_{ℝⁿ} f( e^{−t} x + √(1 − e^{−2t}) y ) dγⁿ(y),

for all functions f on ℝⁿ for which this integral exists for a.e. x ∈ ℝⁿ.
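As a numerical sanity check of this definition (a supplementary sketch, not part of the thesis), one can evaluate the one-dimensional Mehler integral with probabilists' Gauss–Hermite quadrature and test the semigroup identity proved below; the helper names here are illustrative.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss

# Probabilists' Gauss-Hermite rule: integrates accurately against N(0, 1)
nodes, weights = hermegauss(60)
weights = weights / weights.sum()          # normalize total mass to 1

def T(t, f):
    """One-dimensional Mehler formula: T_t f(x) = E[f(e^{-t} x + sqrt(1-e^{-2t}) Y)]."""
    c, s = np.exp(-t), np.sqrt(1.0 - np.exp(-2.0 * t))
    def Tf(x):
        x = np.atleast_1d(np.asarray(x, dtype=float))
        vals = np.array([np.dot(weights, f(c * xi + s * nodes)) for xi in x])
        return vals if vals.size > 1 else vals[0]
    return Tf

f = lambda y: np.cos(y) + y**3
assert abs(T(0.3, T(0.5, f))(0.7) - T(0.8, f)(0.7)) < 1e-8     # T_t T_s = T_{t+s}
assert abs(T(1.2, lambda y: y)(0.7) - np.exp(-1.2) * 0.7) < 1e-10  # T_t maps y to e^{-t} y
```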
We claim that {Tt } is a contraction semigroup on Lp , 1 ≤ p ≤ ∞, with infinitesimal generator A. To
prove our claim, we shall require the following elementary lemma.
Lemma 3.8 (Change of Variables Formula). If a² + b² = 1, then for any f ∈ L¹(γⁿ),

∫_{ℝⁿ} f(u) dγⁿ(u) = ∫_{ℝⁿ} ∫_{ℝⁿ} f(ax + by) dγⁿ(x) dγⁿ(y).
Proof. Since γⁿ is a probability measure, we may insert a second integral to get

∫_{ℝⁿ} f(u) dγⁿ(u) = ∫_{ℝⁿ} ∫_{ℝⁿ} f(u) dγⁿ(u) dγⁿ(v) = (2π)^{−n} ∫_{ℝⁿ} ∫_{ℝⁿ} f(u) e^{−|u|²/2 − |v|²/2} du dv.

Make the change of variables u = ax + by, v = bx − ay. Since a² + b² = 1, the Jacobian matrix for this change of variables has determinant of absolute value one. So, by the change of variables formula from elementary calculus, our integral becomes

(2π)^{−n} ∫_{ℝⁿ} ∫_{ℝⁿ} f(ax + by) e^{−|ax+by|²/2 − |bx−ay|²/2} dx dy = (2π)^{−n} ∫_{ℝⁿ} ∫_{ℝⁿ} f(ax + by) e^{−|x|²/2 − |y|²/2} dx dy
    = ∫_{ℝⁿ} ∫_{ℝⁿ} f(ax + by) dγⁿ(x) dγⁿ(y).
From this lemma, we can deduce some basic properties of the Ornstein-Uhlenbeck semigroup.
Proposition 3.9. Let s, t ≥ 0. Then:

1. T_t preserves integrals: if f ∈ L¹(γⁿ), then

∫_{ℝⁿ} T_t f dγⁿ = ∫_{ℝⁿ} f dγⁿ.

2. ‖T_t f‖_p ≤ ‖f‖_p for all p ≥ 1; in particular, T_t is a bounded (equivalently, continuous) map from L^p(γⁿ) to itself.

3. T_t ∘ T_s = T_{t+s}.

4. If f ∈ W^{2,1}(γⁿ), then so is T_t f, and for each k = 1, ..., n, ∂_k(T_t f) = e^{−t} T_t(∂_k f).

5. t ↦ T_t is a continuous function of t. That is, if t_j → t, then for each f ∈ L^p(γⁿ), T_{t_j} f → T_t f in L^p.

6. If f ∈ L²(γⁿ), then lim_{t→∞} T_t f = ∫_{ℝⁿ} f dγⁿ in L^p for any 1 ≤ p < ∞.

Clearly, T₀ is the identity operator, so items 2, 3, and 5 mean that {T_t} is a contraction semigroup on L^p(γⁿ). Item 1 implies that γⁿ is the invariant measure for {T_t}.
Proof. 1. If we take a = e^{−t} and b = √(1 − e^{−2t}), then Lemma 3.8 gives

∫_{ℝⁿ} T_t f dγⁿ = ∫_{ℝⁿ} ∫_{ℝⁿ} f( e^{−t} x + √(1 − e^{−2t}) y ) dγⁿ(y) dγⁿ(x) = ∫_{ℝⁿ} f dγⁿ.    (3.7)
2. From item 1, together with Jensen's inequality, we get for f ∈ L^p(γⁿ) that

‖T_t f‖_p^p = ∫_{ℝⁿ} | ∫_{ℝⁿ} f( e^{−t} x + √(1 − e^{−2t}) y ) dγⁿ(y) |^p dγⁿ(x)
    ≤ ∫_{ℝⁿ} ∫_{ℝⁿ} | f( e^{−t} x + √(1 − e^{−2t}) y ) |^p dγⁿ(y) dγⁿ(x) = ‖f‖_p^p,

which proves item 2.
3. Observe that

T_{t+s} f(x) = ∫_{ℝⁿ} f( e^{−t−s} x + √(1 − e^{−2t−2s}) y ) dγⁿ(y).

Apply Lemma 3.8 in the y variable, with

a = e^{−s} √(1 − e^{−2t}) / √(1 − e^{−2t−2s}),    b = √(1 − e^{−2s}) / √(1 − e^{−2t−2s}).

This gives us

T_{t+s} f(x) = ∫_{ℝⁿ} ∫_{ℝⁿ} f( e^{−t} e^{−s} x + e^{−s} √(1 − e^{−2t}) w + √(1 − e^{−2s}) z ) dγⁿ(w) dγⁿ(z) = T_t T_s f(x).
4. This can be obtained for f ∈ C ∞ (Rn ) by differentiating under the integral sign, and follows for the
general f ∈ W 2,1 (γ n ) by the usual density argument.
5. Suppose tj → t. First consider f ∈ Cc (Rn ). Then Ttj f → Tt f a.e., and it follows from dominated
convergence that Ttj f → Tt f in Lp (γ n ). For the general f ∈ Lp (γ n ), one can find a sequence (fi ) ∈
Cc (Rn ) converging to f in Lp (γ n ). Then
kTtj f − Tt f kp ≤ kTtj f − Ttj fi kp + kTtj fi − Tt fi kp + kTt fi − Tt f kp .
Taking the limit, first in i then in j, proves item 5.
6. If f is bounded, the result is immediate from the dominated convergence theorem. In the general case, let ε > 0 and choose a bounded measurable g such that ‖f − g‖_p < ε. Then

lim sup_{t→∞} ‖ T_t f − ∫_{ℝⁿ} f dγⁿ ‖_p ≤ lim sup_{t→∞} ( ‖T_t f − T_t g‖_p + ‖ T_t g − ∫_{ℝⁿ} g dγⁿ ‖_p + ‖ ∫_{ℝⁿ} g dγⁿ − ∫_{ℝⁿ} f dγⁿ ‖_p ) < 2ε,

which proves the result in general.
Finally, we can relate the semigroup {Tt } to the Ornstein-Uhlenbeck operator A.
Proposition 3.10. Let f ∈ D(A). Then t ↦ T_t f is a differentiable function of t and

(d/dt) T_t f = A T_t f = T_t A f.

In particular, taking t = 0, the infinitesimal generator of {T_t} is an extension of A.
Proof. We have

(d/dt) T_t f(x) = −e^{−t} ∫_{ℝⁿ} ⟨ x, ∇f( e^{−t} x + √(1 − e^{−2t}) y ) ⟩ dγⁿ(y)
    + ( e^{−2t} / √(1 − e^{−2t}) ) ∫_{ℝⁿ} ⟨ y, ∇f( e^{−t} x + √(1 − e^{−2t}) y ) ⟩ dγⁿ(y).    (3.8)
By item 4 of Proposition 3.9, the first term on the right side of (3.8) equals
−hx, e−t Tt ∇f (x)i = −hx, ∇Tt f (x)i,
where the action of Tt on vector valued functions is componentwise. By (3.4) applied in each coordinate of
y, the second term in (3.8) equals
e^{−2t} ∫_{ℝⁿ} Δf( e^{−t} x + √(1 − e^{−2t}) y ) dγⁿ(y) = e^{−2t} T_t Δf(x) = Δ T_t f(x).
Substituting these two relations into (3.8) proves the first desired equality, and taking t = 0 shows that the
infinitesimal generator of {Tt } is an extension of A (its domain may be larger than the domain of A). The
second equality follows from Lemma 3.6.
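As a concrete illustration of Proposition 3.10 (a supplementary check, not part of the thesis), in one dimension the Mehler formula gives the closed form T_t cos(x) = e^{−(1−e^{−2t})/2} cos(e^{−t}x), and one can verify symbolically that it solves ∂_t u = Au:

```python
import sympy as sp

x, t = sp.symbols('x t', real=True)
# Closed form of T_t cos in one dimension, obtained from the Mehler formula
u = sp.exp(-(1 - sp.exp(-2*t)) / 2) * sp.cos(sp.exp(-t) * x)
Au = sp.diff(u, x, 2) - x * sp.diff(u, x)      # Ornstein-Uhlenbeck operator A
assert sp.simplify(sp.diff(u, t) - Au) == 0    # u solves the evolution equation (3.5)
```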
3.3  Proof of extended Beckner inequality
The integration by parts formula 3.3 makes the operator A and its associated semigroup {Tt } a natural tool
for the proof of Theorem 3.1. Here we give the proof, and also prove some sharpness results.
Proof of Theorem 3.1. By density, we may take f ∈ C²(ℝⁿ) with f bounded. Then |f| ∈ W^{q,1}(γⁿ) with |∇|f|| = |∇f| a.e. (see Appendix 6.1). So, it suffices to suppose that f is non-negative. Furthermore, by replacing f with f + ε and letting ε → 0, we may take f ≥ c > 0 for some constant c. Let

φ(t) = ( ∫_{ℝⁿ} [T_t f^p]^{q/p} dγⁿ )^{2/q}.
Since T₀ is the identity and lim_{t→∞} T_t f^p = ∫_{ℝⁿ} f^p dγⁿ, we have φ(0) = ‖f‖_q² and lim_{t→∞} φ(t) = ‖f‖_p², so the left side of Bec(q, p) is given by

‖f‖_q² − ‖f‖_p² = − ∫₀^∞ φ′(t) dt.
We shall estimate φ′. To simplify notation in what follows, put

α(t) = ( ∫_{ℝⁿ} [T_t f^p]^{q/p} dγⁿ )^{2/q − 1}.    (3.9)
Using the relation ∂_t T_t f = A T_t f, we compute

φ′(t) = (2/p) α(t) ∫_{ℝⁿ} [T_t f^p]^{q/p−1} A T_t f^p dγⁿ.
By the integration by parts formula for A, this equals

−(2/p) α(t) ∫_{ℝⁿ} ⟨ ∇[T_t f^p]^{q/p−1}, ∇(T_t f^p) ⟩ dγⁿ = −(2/p)(q/p − 1) α(t) ∫_{ℝⁿ} [T_t f^p]^{q/p−2} |∇T_t f^p|² dγⁿ.    (3.10)
We have

∇T_t f^p = e^{−t} T_t( ∇(f^p) ) = e^{−t} p T_t( f^{p−1} ∇f ).    (3.11)
Therefore, (3.10) equals

−2e^{−2t}(q − p) α(t) ∫_{ℝⁿ} [T_t f^p]^{q/p−2} |T_t( f^{p−1} ∇f )|² dγⁿ.    (3.12)
By Hölder's inequality applied inside the definition of T_t,

|T_t( f^{p−1} ∇f )|² ≤ [ T_t( f^{p−1} |∇f| ) ]² ≤ ( T_t f^p )^{2−2/p} ( T_t |∇f|^p )^{2/p}.    (3.13)
Plugging this into (3.12) yields

−φ′(t) ≤ 2e^{−2t}(q − p) α(t) ∫_{ℝⁿ} ( T_t f^p )^{q/p−2/p} ( T_t |∇f|^p )^{2/p} dγⁿ.    (3.14)
The q = 2 case can be handled by trivial modifications of what follows, so we henceforth assume q > 2.
Apply Hölder’s inequality a second time, this time with the exponents q/(q − 2) and q/2, to get
∫_{ℝⁿ} ( T_t f^p )^{q/p−2/p} ( T_t |∇f|^p )^{2/p} dγⁿ ≤ ( ∫_{ℝⁿ} ( T_t f^p )^{q/p} dγⁿ )^{1−2/q} ( ∫_{ℝⁿ} ( T_t |∇f|^p )^{q/p} dγⁿ )^{2/q}.
The first factor on the right side is precisely α(t)^{−1}, so upon plugging this into (3.14), we obtain

−φ′(t) ≤ 2e^{−2t}(q − p) ( ∫_{ℝⁿ} ( T_t |∇f|^p )^{q/p} dγⁿ )^{2/q}.
Since 1 ≤ p ≤ q, we have

( T_t |∇f|^p )^{q/p} ≤ T_t |∇f|^q.    (3.15)
So, since T_t preserves integrals,

−φ′(t) ≤ 2e^{−2t}(q − p) ( ∫_{ℝⁿ} |∇f|^q dγⁿ )^{2/q}.    (3.16)
Integrating this from 0 to ∞ proves Bec(q, p).
If equality holds in Bec(q, p), then it must hold for almost every t in (3.16). Therefore it must hold for
almost every t in (3.13). By the conditions for equality in Hölder's inequality, this means that f^p = c|∇f|^p for some constant c. On the other hand, the condition for equality in Jensen's inequality, together with (3.15), implies that |∇f|^p is constant. Therefore if f satisfies the hypotheses we imposed at the beginning of
the proof, then equality in Bec(q, p) implies that f is constant. The L2 limit of constant functions is constant
a.e., so upon passing to the limit we get the result for the general non-negative f . The result for the general
f ∈ W 2,1 (µ) is obtained by replacing f with |f | and applying the positive case, together with Proposition
6.3 of Appendix 6.1. So, by the non-negative case, equality for f implies that |f | is constant a.e. But, since
f ∈ W 2,1 (µ), this means that f itself is constant a.e.
Remark 3.11. If we apply Hölder’s inequality in each coordinate of ∇f separately after obtaining (3.12),
essentially the same proof shows that γ n satisfies Bec(q, p) with the right side replaced by
(q − p) Σ_{k=1}^n ‖∂_k f‖_q².
This alternative inequality does not appear to be either weaker or stronger in general than the original
inequality Bec(q, p).
The condition q ≥ 2 in Theorem 3.1 is essential. To see this, we need a lemma.
Lemma 3.12. If µ satisfies inequality Bec(q, p) for some 1 ≤ p < q and some constant C, then µ satisfies

‖f‖₂² − ( ∫_{ℝⁿ} f dµ )² ≤ C ‖∇f‖_q²    (3.17)

for all f ∈ W^{q,1}(µ).
Proof. It suffices to prove the implication for bounded functions f ∈ W^{q,1}(µ). Replace f with 1 + εf in Bec(q, p), and divide by ε²:

( ‖1 + εf‖_q² − ‖1 + εf‖_p² ) / ε² ≤ (q − p) C ‖∇f‖_q².    (3.18)

Apply L'Hôpital's rule twice to get that as ε → 0 the left side tends to

(1/2) (d²/dε²) [ ‖1 + εf‖_q² − ‖1 + εf‖_p² ] |_{ε=0}.
If ε is so small that 1 + εf > 0, we have

(d²/dε²) ‖1 + εf‖_q² = 2(q − 1) ‖1 + εf‖_q^{2−q} ∫_{ℝⁿ} f² (1 + εf)^{q−2} dµ + 2(2 − q) ‖1 + εf‖_q^{2−2q} ( ∫_{ℝⁿ} f (1 + εf)^{q−1} dµ )².
Replacing q with p, then evaluating both expressions at ε = 0, we see that the limit of the left side of (3.18) is

(q − 1) ∫_{ℝⁿ} f² dµ + (2 − q) ( ∫_{ℝⁿ} f dµ )² − (p − 1) ∫_{ℝⁿ} f² dµ − (2 − p) ( ∫_{ℝⁿ} f dµ )²
    = (q − p) ‖f‖₂² − (q − p) ( ∫_{ℝⁿ} f dµ )².

Dividing by q − p then proves (3.17).
Proposition 3.13. The standard Gaussian measure γ n does not satisfy inequality Bec(q, p) for any 1 ≤ p <
q < 2 and any constant C.
Proof. By the Lemma, it will suffice to show that γⁿ does not satisfy inequality (3.17). Take f_t(x₁, ..., x_n) = e^{tx₁}, for t > 0. One has the formula

∫_{ℝⁿ} e^{ax₁} dγⁿ(x) = e^{a²/2},    a ∈ ℝ.

From this, one sees that (3.17) for f_t with constant C is equivalent to

e^{2t²} − e^{t²} ≤ t² C e^{qt²}    ⇔    ( e^{(2−q)t²} − e^{(1−q)t²} ) / t² ≤ C.

But the left side of this last inequality tends to ∞ as t → ∞, since q < 2. Thus γⁿ cannot satisfy (3.17) for any constant C, so by Lemma 3.12 it cannot satisfy Bec(q, p) for any q < 2, any p < q, and any constant C.
By a similar argument, we can also get that our result in Theorem 3.1 is sharp.
Proposition 3.14. Let q ≥ 2 and 1 ≤ p ≤ 2. The standard Gaussian measure γ n does not satisfy inequality
Bec(q, p) for constant C < 1.
Proof. As in the proof of Proposition 3.13, set f_t(x₁, ..., x_n) = e^{tx₁} for t > 0. Then Bec(q, p) with constant C for f_t is equivalent to

e^{qt²} − e^{pt²} ≤ t² C (q − p) e^{qt²}    ⇔    ( 1 − e^{(p−q)t²} ) / t² ≤ C (q − p).

As t → 0, the left side of this last inequality tends to q − p, so the inequality cannot hold with C < 1.
3.4  Classical Heat semigroup
Inequality Bec(q, p) for the Gaussian measure can also be proven by means of the classical heat semigroup,
rather than the Ornstein-Uhlenbeck semigroup. This is the method used by E. Hsu and the author in [17].
The proof itself is slightly longer, but the classical heat semigroup is better known than the Ornstein-Uhlenbeck semigroup, and less work is required to establish its basic properties. We give this alternative proof here.
Definition 3.15. The classical heat semigroup {P_s : s > 0} is defined by

P_s f(x) = (2πs)^{−n/2} ∫_{ℝⁿ} f(y) e^{−|x−y|²/2s} dy    (3.19)

for bounded continuous functions f on ℝⁿ.
We shall require the following elementary properties of {P_s}.

Proposition 3.16. Suppose f : ℝⁿ → ℝ is bounded and continuous. Then

1. P_s f → f as s → 0.
2. P₁ f(0) = ∫_{ℝⁿ} f dγⁿ.
3. For each s, t ≥ 0, P_s ∘ P_t = P_{s+t}.
4. P_s f solves the heat equation: ∂_s P_s f = (1/2) Δ P_s f.
5. If ∇f is bounded and continuous, then ∇P_s f = P_s ∇f, where the action of P_s on ∇f is componentwise.

Note that items 1 and 3 imply that {P_s} is a semigroup. Item 4 implies that its infinitesimal generator is an extension of the half-Laplacian (1/2)Δ. The heat semigroup is not, however, a contraction semigroup. Item 2 provides the motivation for using the heat semigroup in the context of inequalities for γⁿ.
Proof. 1. Substitute y = x − √s z in (3.19) to obtain the alternative formula

P_s f(x) = (2π)^{−n/2} ∫_{ℝⁿ} f(x − √s z) e^{−|z|²/2} dz = ∫_{ℝⁿ} f(x − √s z) dγⁿ(z).    (3.20)

Since f is bounded, we can use dominated convergence to find that this tends to ∫_{ℝⁿ} f(x) dγⁿ(z) = f(x) as s → 0.
2. This is immediate from (3.19) and (1.1).
3. Using the alternative formula (3.20) for P_s, we have

P_s P_t f(x) = ∫_{ℝⁿ} ∫_{ℝⁿ} f(x − √s y − √t z) dγⁿ(z) dγⁿ(y).

By Lemma 3.8 with a = √s / √(s+t) and b = √t / √(s+t), this equals

∫_{ℝⁿ} f(x − √(s+t) u) dγⁿ(u) = P_{s+t} f(x).
4. Let

ρ(x, y, s) = (2πs)^{−n/2} e^{−|x−y|²/2s}

be the heat kernel. For j = 1, ..., n, we have

∂_j² ρ(x, y, s) = ( (x_j − y_j)²/s² − 1/s ) ρ(x, y, s),
and so

Δρ(x, y, s) = ( |x − y|²/s² − n/s ) ρ(x, y, s) = 2 ∂_s ρ(x, y, s).

Differentiating under the integral sign is justified since Δρ ∈ L¹(ℝⁿ) and f is bounded, so we obtain

Δ P_s f(x) = ∫_{ℝⁿ} f(y) Δρ(x, y, s) dy = 2 ∫_{ℝⁿ} f(y) ∂_s ρ(x, y, s) dy = 2 ∂_s P_s f(x).
5. This follows from differentiation under the integral sign in (3.20).
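Items 2 and 3 lend themselves to a quick numerical sanity check (a supplementary sketch, not part of the thesis), again using one-dimensional Gauss–Hermite quadrature for integrals against the Gaussian measure:

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss

nodes, w = hermegauss(80)
w = w / w.sum()                              # normalized N(0,1) quadrature weights

def P(s, f):
    """1-D heat semigroup via (3.20): P_s f(x) = E[f(x - sqrt(s) Z)], Z ~ N(0,1)."""
    def Pf(x):
        x = np.atleast_1d(np.asarray(x, dtype=float))
        out = np.array([np.dot(w, f(xi - np.sqrt(s) * nodes)) for xi in x])
        return out if out.size > 1 else out[0]
    return Pf

f = lambda y: np.cos(y) + 0.1 * y**2
gauss_mean = np.dot(w, f(nodes))             # the integral of f against gamma^1
assert abs(P(1.0, f)(0.0) - gauss_mean) < 1e-10              # item 2: P_1 f(0) = int f dgamma
assert abs(P(0.4, P(0.6, f))(0.3) - P(1.0, f)(0.3)) < 1e-8   # item 3: P_s P_t = P_{s+t}
```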
We now employ these elementary properties of the classical heat semigroup {Ps } to give an alternative
proof of inequality Bec(q, p).
Proof of Theorem 3.1. By a standard approximation argument, it is enough to show the inequality (3.1) for a smooth function f such that 0 < c ≤ f ≤ C and ∇f is bounded. For 0 ≤ s ≤ 1, consider the function

φ_s(x) = [ P_s ( (P_{1−s} f^p)^{q/p} ) (x) ]^{2/q}.    (3.21)
The idea of considering such a function in the context of functional inequalities can be traced back to
Neveu [27]. We can write the left side of (3.1) as
‖f‖_q² − ‖f‖_p² = φ₁(0) − φ₀(0) = ∫₀¹ ∂_s φ_s(0) ds.
The technical part of our proof is the computation of the derivative of (3.21) with respect to s.
From the definition (3.21) of φs we have
∂_s φ_s = ∂_s [ P_s g_s^{q/p} ]^{2/q} = (2/q) a_s ∂_s ( P_s g_s^{q/p} ),

where, to simplify the notation here and later, we have introduced the functions

g_s = P_{1−s} f^p    and    a_s = ( P_s g_s^{q/p} )^{2/q−1}.
We compute

∂_s φ_s = (2/q) a_s (∂_s P_s) g_s^{q/p} + (2/p) a_s P_s ( g_s^{q/p−1} ∂_s g_s ).    (3.22)
Using the relation ∂_s P_s = (1/2) P_s Δ, we may rewrite the first term on the right side as (1/q) a_s P_s Δ( g_s^{q/p} ), which equals

(1/p)(q/p − 1) a_s P_s ( g_s^{q/p−2} |∇g_s|² ) + (1/p) a_s P_s ( g_s^{q/p−1} Δg_s )    (3.23)

by the identity

Δ( h^{q/p} ) = (q/p)(q/p − 1) h^{q/p−2} |∇h|² + (q/p) h^{q/p−1} Δh

applied with h = g_s. From ∂_s P_{1−s} = −(1/2) Δ P_{1−s} we have ∂_s g_s = −(1/2) Δ g_s, so the second term in the sum (3.23) exactly cancels the second term in (3.22). In the remaining term, we use the fact that P_{1−s} commutes with ∇ and write ∇g_s = p P_{1−s}( f^{p−1} ∇f ), which gives
∂_s φ_s = (q − p) a_s P_s ( g_s^{q/p−2} |P_{1−s}( f^{p−1} ∇f )|² ).    (3.24)
Note that P1−s is an integral with respect to a (probability) measure, so we can use Hölder’s inequality with
the exponents p/(p − 1) and p to get
|P_{1−s}( f^{p−1} ∇f )| ≤ P_{1−s}( f^{p−1} |∇f| ) ≤ ( P_{1−s} f^p )^{(p−1)/p} ( P_{1−s} |∇f|^p )^{1/p}.
Thus, by (3.24),

∂_s φ_s ≤ (q − p) a_s P_s ( g_s^{q/p−2/p} ( P_{1−s} |∇f|^p )^{2/p} ).    (3.25)
The case q = 2 is covered by trivial modifications to what follows, so in the remainder of the proof we assume
q > 2. Hölder’s inequality with the exponents q/(q − 2) and q/2 yields
P_s ( g_s^{q/p−2/p} ( P_{1−s} |∇f|^p )^{2/p} ) ≤ ( P_s g_s^{q/p} )^{1−2/q} ( P_s ( P_{1−s} |∇f|^p )^{q/p} )^{2/q}.

The first factor on the right side is exactly a_s^{−1}, which cancels the factor a_s in (3.25). We now have
∂_s φ_s ≤ (q − p) ( P_s ( P_{1−s} |∇f|^p )^{q/p} )^{2/q}.
From 1 ≤ p ≤ q we have P_{1−s} |∇f|^p ≤ ( P_{1−s} |∇f|^q )^{p/q}. This together with the semigroup property P_s P_{1−s} = P_1 gives

( P_s ( P_{1−s} |∇f|^p )^{q/p} )^{2/q} ≤ ( P_s P_{1−s} |∇f|^q )^{2/q} = ( ∫_{ℝⁿ} |∇f|^q dγⁿ )^{2/q}.
The last equality holds at x = 0. It follows that
∂_s φ_s(0) ≤ (q − p) ( ∫_{ℝⁿ} |∇f|^q dγⁿ )^{2/q}.
Integrating from 0 to 1 yields (3.1).
4  Beckner Inequality for Log-Concave Probability Measures
We now turn our attention to a general log-concave probability measure µ on Rn .
Definition 4.1. A probability measure µ on ℝⁿ is called log-concave with concavity b > 0 if

dµ = e^{−v(x)} dx,

where v ∈ C²(ℝⁿ) and the matrix D²v of second order partial derivatives of v satisfies ⟨D²v(x)ξ, ξ⟩ ≥ b|ξ|² for each x, ξ ∈ ℝⁿ.
The most important log-concave probability measure is the standard Gaussian measure γⁿ, which corresponds to the case where v(x) = |x|²/2 + (n/2) log(2π). Log-concave probability measures satisfy many of the
same properties as Gaussian measures do, so it is natural to ask to what extent the inequalities we study for
the Gaussian measure can be generalized to this setting. The goal of this chapter is to prove the following.
Theorem 4.2. Let µ be a log-concave probability measure on Rn with concavity b > 0. Then µ satisfies
inequality Bec(q, p) with constant 1/b for all q ≥ 2 and all 1 ≤ p ≤ q. Furthermore, equality holds if and
only if f is constant a.e.
To prove Theorem 4.2, we proceed as in Chapter 3. We first define analogues of the Ornstein-Uhlenbeck
operator and the Ornstein-Uhlenbeck semigroup, then use these operators to prove the result via an argument
similar to that in Section 3.3.
4.1  Generalization of the Ornstein-Uhlenbeck operator and semigroup
Let dµ = e^{−v(x)} dx be a log-concave probability measure on ℝⁿ with concavity b > 0. Define an operator A on L²(µ) by
A(f) := Δf − ⟨∇f, ∇v⟩,    (4.1)
with domain D(A) consisting of all functions f such that ∆f − h∇f, ∇vi defines a function in L2 (µ).
This operator A is a generalization of the Ornstein-Uhlenbeck operator, as suggested by the following
generalization of Proposition 3.3.
Lemma 4.3. If g ∈ W^{2,1}(µ) and f ∈ D(A), then

∫_{ℝⁿ} g A(f) dµ = − ∫_{ℝⁿ} ⟨∇f, ∇g⟩ dµ.    (4.2)
Proof. By converting to an integral with respect to Lebesgue measure and integrating by parts, we obtain

∫_{ℝⁿ} f ∂_k v dµ = ∫_{ℝⁿ} ∂_k f dµ,    f ∈ C_c^∞(ℝⁿ),    k = 1, ..., n.    (4.3)
By approximation, this also holds for f ∈ W^{2,1}(µ). Using (4.3), we then calculate

∫_{ℝⁿ} g A(f) dµ = Σ_{k=1}^n ∫_{ℝⁿ} ( g ∂_k² f − g ∂_k v ∂_k f ) dµ = Σ_{k=1}^n ∫_{ℝⁿ} ( g ∂_k² f − ∂_k g ∂_k f − g ∂_k² f ) dµ = − ∫_{ℝⁿ} ⟨∇f, ∇g⟩ dµ.
In analogy with Chapter 3, we now seek a contraction semigroup whose infinitesimal generator is an
extension of the operator A. To formalize some of the basic properties we need this semigroup to possess,
we make the following definition.
Definition 4.4. Suppose {Tt } is a contraction semigroup on a Banach space X of functions on a probability
space Ω. Then {Tt } is said to be a Markov semigroup if Tt 1 = 1 (where 1 denotes the constant function
ω 7→ 1); and Tt preserves positivity: if f ≥ 0 a.e., then Tt f ≥ 0 a.e.
From the Mehler formula, it is clear that the Ornstein-Uhlenbeck semigroup of Chapter 3 is a Markov
semigroup.
In Appendix 6.2, it is shown, using abstract functional analytic methods, that there exists a Markov
semigroup {T_t}, consisting of symmetric operators, whose infinitesimal generator is an extension of A. Unlike
in Chapter 3, we do not have an explicit formula for {Tt }. Nevertheless, it turns out that this semigroup
satisfies many of the same properties as the Ornstein-Uhlenbeck semigroup.
Proposition 4.5. Let {Tt } be the symmetric Markov semigroup with generator a self-adjoint extension of
A. Then
1. Each T_t preserves integrals: if f ∈ L¹(µ), then

∫_{ℝⁿ} T_t f dµ = ∫_{ℝⁿ} f dµ.
2. Suppose c ≤ f ≤ C for some constants c and C. Then for each x ∈ Rn , c ≤ Tt f (x) ≤ C.
3. Tt f (x) is given by integration against a Borel probability measure νt,x for a.e. x ∈ Rn .
4. Each Tt defines a norm-decreasing operator Lp (µ) → Lp (µ), for each 1 ≤ p ≤ ∞.
5. If f ∈ C 2 (Rn ) ∩ L2 (µ), then so is Tt f .
Proof. 1. Apply (4.2) to get

(d/dt) ∫_{ℝⁿ} T_t f dµ = ∫_{ℝⁿ} A(T_t f) dµ = − ∫_{ℝⁿ} ⟨∇T_t f, ∇1⟩ dµ = 0.

Thus the integral of T_t f is independent of t, and in particular it equals the integral of T₀ f = f.
2. Since Tt fixes the constant functions and preserves positivity, the fact that C − f ≥ 0 implies that
C − Tt f ≥ 0, and similarly for c.
3. By item 2, f 7→ Tt f (x) defines a positive linear functional on C0 (Rn ). So, the Riesz representation
theorem implies that Tt f (x) is given by integrating against a Borel measure νt,x for each f ∈ C0 (Rn ).
For a non-negative f ∈ L2 (µ), one can find a sequence (fj ) ∈ C0 (Rn ) which increases to f a.e., so by
dominated convergence Tt f (x) is also given by integration against νt,x . For the general f ∈ L2 (µ), the
result follows by considering positive and negative parts separately. Finally, from Tt 1 = 1, we get that
νt,x is a probability measure.
4. If f is bounded, then |f| and |f|^p are each in L²(µ), and by item 3, we can apply Jensen's inequality to get

|T_t f|^p ≤ T_t |f|^p.

Therefore item 1 gives

∫_{ℝⁿ} |T_t f|^p dµ ≤ ∫_{ℝⁿ} T_t |f|^p dµ = ∫_{ℝⁿ} |f|^p dµ.

The set of bounded f is dense in L^p(µ) for each 1 ≤ p ≤ ∞, so we get a unique extension of T_t to each L^p.
5. We have that u(t, x) := Tt f (x) solves the differential equation
∂_t u = Δu − ⟨∇v, ∇u⟩    (4.4)
with initial condition u(0, x) = f (x). The coefficient vector ∇v in this PDE is C 1 , so by standard
regularity results for parabolic equations (see, for example, Ch. 7 of [14]) applied on each bounded
subset of Rn , it follows that Tt f lies in C 2 (Rn ) as well.
4.2  Commutation with the gradient
The only missing ingredient in our proof of Bec(q, p) is an inequality relating |∇Tt f |2 and Tt |∇f |2 . To obtain
such an estimate, we first make the following definition, which we take from [22].
Definition 4.6. We define the carré du champ of A to be the unbounded bilinear form Γ : L²(µ) × L²(µ) → L²(µ) given by

2Γ(f, g) := A(fg) − f A(g) − g A(f).

We define the curvature operator of A to be the unbounded bilinear form Γ₂ : L²(µ) × L²(µ) → L²(µ) given by

2Γ₂(f, g) := AΓ(f, g) − Γ(f, Ag) − Γ(g, Af).    (4.5)

The domains of Γ and Γ₂ are the subsets of L²(µ) × L²(µ) on which the above formulas produce functions in L²(µ). We put Γ(f) = Γ(f, f) and similarly for Γ₂(f).
Proposition 4.7. The carré du champ of A is given by

Γ(f, g) = ⟨∇f, ∇g⟩.

Proof. Due to bilinearity, it suffices to prove the result in the case f = g, i.e. we must show Γ(f) = |∇f|². We have

A(f²) = Δ(f²) − ∇(f²)·∇v = 2|∇f|² + 2fΔf − 2f∇f·∇v

and

2f A(f) = 2fΔf − 2f∇f·∇v.

Subtracting and dividing by two, we get Γ(f) = |∇f|², which is the desired formula.
Proposition 4.8. The curvature operator of A satisfies
Γ2 (f ) = |D2 f |2 + h(D2 v)∇f, ∇f i.
Proof. In the case f = g, formula (4.5) reads

2Γ₂(f) = AΓ(f) − 2Γ(f, Af).    (4.6)

By Proposition 4.7 the first term in (4.6) is given by

AΓ(f) = Δ|∇f|² − ∇v·∇|∇f|².

Expanding out the second term above, we get

∇v·∇|∇f|² = 2 Σ_{j,k=1}^n ∂_k f ∂_{jk} f ∂_j v.

For the second term in (4.6), we have

Γ(f, Af) = ∇f·∇( Δf − ∇v·∇f ).

We compute

∇f·∇( ∇v·∇f ) = Σ_{j,k=1}^n ∂_k f ∂_{jk} v ∂_j f + Σ_{j,k=1}^n ∂_k f ∂_{jk} f ∂_j v = ⟨(D²v)∇f, ∇f⟩ + Σ_{j,k=1}^n ∂_k f ∂_{jk} f ∂_j v.

Plugging our calculations into (4.6) and cancelling the terms 2 Σ_{j,k=1}^n ∂_k f ∂_{jk} f ∂_j v, we find

2Γ₂(f) = Δ|∇f|² − 2∇f·∇Δf + 2⟨(D²v)∇f, ∇f⟩.    (4.7)

We have

Δ|∇f|² = 2 Σ_{j,k=1}^n ∂_k f ∂_{kjj} f + 2 Σ_{j,k=1}^n (∂_{jk} f)² = 2 Σ_{j,k=1}^n ∂_k f ∂_{kjj} f + 2|D²f|²

and

∇f·∇Δf = Σ_{j,k=1}^n ∂_k f ∂_{kjj} f.

Plugging these two expressions into (4.7), cancelling a pair of terms, and dividing by 2, we get

Γ₂(f) = |D²f|² + ⟨(D²v)∇f, ∇f⟩.
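Propositions 4.7 and 4.8 are exact algebraic identities, so they can be verified symbolically for sample choices of f and v (a supplementary check, not part of the proof; the particular polynomials below are arbitrary):

```python
import sympy as sp

x, y = sp.symbols('x y', real=True)
v = x**2 + x*y + y**4                  # a smooth potential (the identities need no convexity)
f = x**3 + x*y**2                      # a smooth test function
X = (x, y)

grad = lambda h: sp.Matrix([sp.diff(h, s) for s in X])
lap  = lambda h: sum(sp.diff(h, s, 2) for s in X)
A    = lambda h: lap(h) - (grad(h).T * grad(v))[0]
Gam  = lambda h, g: sp.expand((A(h*g) - h*A(g) - g*A(h)) / 2)

# Proposition 4.7: the carre du champ is the squared gradient
assert sp.expand(Gam(f, f) - (grad(f).T * grad(f))[0]) == 0

# Proposition 4.8: the curvature operator formula
G2 = sp.expand((A(Gam(f, f)) - 2 * Gam(f, A(f))) / 2)
H, Hv = sp.hessian(f, X), sp.hessian(v, X)
rhs = sum(H[i, j]**2 for i in range(2) for j in range(2)) + (grad(f).T * Hv * grad(f))[0]
assert sp.expand(G2 - rhs) == 0
```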
From the preceding proposition and our hypothesis on D²v, one has the inequality

Γ₂(f) ≥ ⟨(D²v)∇f, ∇f⟩ ≥ b|∇f|².    (4.8)
In the terminology of Markov semigroups, the operator A has positive curvature (see [22] for further discussion). From this bound, we can deduce our desired inequality for the gradient of Tt f . Our proof follows that
in [22].
Proposition 4.9. For f ∈ C^∞(ℝⁿ) ∩ L²(µ) and t ≥ 0,

|∇T_t f|² ≤ e^{−2bt} T_t |∇f|².
Proof. Fix t > 0 and for 0 ≤ s ≤ t define

φ(s) = e^{−2bs} T_s |∇T_{t−s} f|².

Using the relation ∂_s T_s = T_s A, we compute

φ′(s) = −2b e^{−2bs} T_s |∇T_{t−s} f|² + e^{−2bs} T_s A |∇T_{t−s} f|² + e^{−2bs} T_s ∂_s |∇T_{t−s} f|².    (4.9)

By Proposition 4.7 one has

∂_s |∇T_{t−s} f|² = −2 ⟨∇T_{t−s} f, ∇A T_{t−s} f⟩ = −2 Γ(T_{t−s} f, A T_{t−s} f),

and so from (4.9) and the definition of Γ₂,

φ′(s) = 2 e^{−2bs} T_s [ −b Γ(T_{t−s} f) + Γ₂(T_{t−s} f) ].

By (4.8), the argument of T_s is non-negative, so since T_s preserves positivity, φ′ is non-negative. Thus φ is increasing. In particular,

|∇T_t f|² = φ(0) ≤ φ(t) = e^{−2bt} T_t |∇f|²,

which proves the result.
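For the Gaussian special case µ = γ¹ (where v(x) = x²/2 up to a constant, so b = 1, and T_t has the explicit Mehler form), Proposition 4.9 can be spot-checked numerically; this is a supplementary illustration, and the quadrature helper below is not from the thesis:

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss

nodes, w = hermegauss(60)
w = w / w.sum()

t, b = 0.7, 1.0                         # for gamma^1 the concavity constant is b = 1
c, s = np.exp(-t), np.sqrt(1 - np.exp(-2 * t))
T = lambda h, x: np.dot(w, h(c * x + s * nodes))   # pointwise Mehler formula

f_prime = lambda y: -np.sin(y)          # gradient of f(y) = cos(y)
for x0 in (-2.0, -0.3, 0.0, 1.1, 2.5):
    grad_Ttf = c * T(f_prime, x0)       # d/dx T_t f = e^{-t} T_t f'  (Proposition 3.9)
    assert grad_Ttf**2 <= np.exp(-2 * b * t) * T(lambda y: f_prime(y)**2, x0) + 1e-12
```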
Corollary 4.10. If f ∈ L¹(µ), then

lim_{t→∞} T_t f = ∫_{ℝⁿ} f dµ

in L¹.
Proof. If f ∈ C_c¹(ℝⁿ), we infer from Proposition 4.9 that |∇T_t f|² → 0 uniformly as t → ∞, so T_t f must converge to a constant function. Since T_t preserves integrals, this constant must be ∫_{ℝⁿ} f dµ. For the general f ∈ L¹(µ), given ε > 0 we can find g ∈ C_c¹(ℝⁿ) with ‖f − g‖₁ < ε and ∫_{ℝⁿ} g dµ = ∫_{ℝⁿ} f dµ. Then

lim sup_{t→∞} ‖ T_t f − ∫_{ℝⁿ} f dµ ‖₁ ≤ lim sup_{t→∞} ( ‖T_t f − T_t g‖₁ + ‖ T_t g − ∫_{ℝⁿ} g dµ ‖₁ ) < ε,

which proves the result for f.
4.3  Proof of Beckner's inequality for log-concave probability measures
We are now ready to prove Theorem 4.2. The proof is essentially the same as that in Section 3.3.
Proof of Theorem 4.2. Define A as in (4.1) and let {Tt } be the semigroup of Proposition 4.5. By a standard
approximation argument, we may take f ∈ C 2 (Rn ) with f ≥ a > 0 for some constant a. Then by Proposition
4.5, Tt f has the same properties for each t ≥ 0. Set
φ(t) = ( ∫_{ℝⁿ} [T_t f^p]^{q/p} dµ )^{2/q}.
Since T₀ is the identity and lim_{t→∞} T_t f^p = ∫_{ℝⁿ} f^p dµ, we have φ(0) = ‖f‖_q² and lim_{t→∞} φ(t) = ‖f‖_p², so the left side of Bec(q, p) is given by

‖f‖_q² − ‖f‖_p² = − ∫₀^∞ φ′(t) dt.
We shall estimate φ′. To simplify notation in what follows, put

α(t) = ( ∫_{ℝⁿ} [T_t f^p]^{q/p} dµ )^{2/q − 1}.    (4.10)
Using the relation ∂_t T_t = A T_t, we compute

φ′(t) = (2/p) α(t) ∫_{ℝⁿ} [T_t f^p]^{q/p−1} A T_t f^p dµ.
By the integration by parts formula for A, this equals

−(2/p) α(t) ∫_{ℝⁿ} ⟨ ∇[T_t f^p]^{q/p−1}, ∇(T_t f^p) ⟩ dµ = −(2/p)(q/p − 1) α(t) ∫_{ℝⁿ} [T_t f^p]^{q/p−2} |∇T_t f^p|² dµ.    (4.11)
By Proposition 4.9 applied with f replaced by f^p,

|∇T_t f^p|² ≤ e^{−2bt} [ T_t |∇(f^p)| ]² = e^{−2bt} p² [ T_t( f^{p−1} |∇f| ) ]².

Therefore,

−φ′(t) ≤ 2e^{−2bt}(q − p) α(t) ∫_{ℝⁿ} [T_t f^p]^{q/p−2} [ T_t( f^{p−1} |∇f| ) ]² dµ.    (4.12)
Now apply Hölder's inequality to get

[ T_t( f^{p−1} |∇f| ) ]² ≤ [T_t f^p]^{2−2/p} ( T_t |∇f|^p )^{2/p}.
Thus,

−φ′(t) ≤ 2e^{−2bt}(q − p) α(t) ∫_{ℝⁿ} [T_t f^p]^{q/p−2/p} ( T_t |∇f|^p )^{2/p} dµ.    (4.13)
The case q = 2 is covered by trivial modifications to what follows, so in the remainder of the proof we assume
q > 2. Apply Hölder’s inequality a second time, this time with the exponents q/(q − 2) and q/2, to get
∫_{ℝⁿ} [T_t f^p]^{q/p−2/p} ( T_t |∇f|^p )^{2/p} dµ ≤ ( ∫_{ℝⁿ} [T_t f^p]^{q/p} dµ )^{1−2/q} ( ∫_{ℝⁿ} ( T_t |∇f|^p )^{q/p} dµ )^{2/q}.
The first factor here is precisely α(t)^{−1}, so upon plugging this into (4.13), we obtain

−φ′(t) ≤ 2e^{−2bt}(q − p) ( ∫_{ℝⁿ} ( T_t |∇f|^p )^{q/p} dµ )^{2/q}.
Since 1 ≤ p ≤ q,

( T_t |∇f|^p )^{q/p} ≤ T_t |∇f|^q.
So, since T_t preserves integrals, we get

−φ′(t) ≤ 2e^{−2bt}(q − p) ( ∫_{ℝⁿ} |∇f|^q dµ )^{2/q}.
Integrating this from 0 to ∞ proves Bec(q, p) with constant 1/b.
The condition for equality follows just as in the proof of Theorem 3.1.
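As an elementary numerical illustration of Theorem 4.2 (a supplementary check, not part of the proof), take dµ proportional to e^{−bx²/2} on ℝ, which is log-concave with concavity b, and f(x) = e^{tx}; the Gaussian moment formula ∫ e^{atx} dµ = e^{a²t²/(2b)} reduces Bec(q, p) with constant 1/b to an explicit scalar inequality:

```python
import math

b, q, p = 2.0, 3.0, 1.5       # concavity b; exponents with q >= 2 and 1 <= p <= q
for t in (0.1, 0.5, 1.0, 2.0):
    # ||f||_q^2 = e^{q t^2 / b} and ||grad f||_q^2 = t^2 e^{q t^2 / b} for f(x) = e^{tx}
    lhs = math.exp(q * t * t / b) - math.exp(p * t * t / b)
    rhs = (1.0 / b) * (q - p) * t * t * math.exp(q * t * t / b)
    assert lhs <= rhs          # Bec(q, p) with constant 1/b holds for this family
```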
5  Other Inequalities for Log-Concave Probability Measures
In the remainder of the paper, we shall study several inequalities for log-concave probability measures which
are related to inequality Bec(q, p).
5.1  Generalized logarithmic Sobolev inequality
Let q ≥ 1. We say that µ satisfies inequality LSI(q) with constant C if whenever f ∈ W^{q,1}(µ),

(2/q) ‖f‖_q^{2−q} ∫_{ℝⁿ} |f|^q log|f| dµ − (2/q) log(‖f‖_q) ‖f‖_q² ≤ C ‖∇f‖_q².    (5.1)
Note that LSI(2) is the ordinary logarithmic Sobolev inequality (1.3), with a constant C on the right. We
shall explore the relationship between inequality LSI(q) and the inequality Bec(q, p) from Chapters 3 and 4.
First we have an implication relation within inequality LSI(q):
Proposition 5.1. If µ satisfies LSI(q) with constant C, then for each r > q, µ satisfies LSI(r) with constant
C.
Proof. If f ∈ W r,1 (µ), then |f | ∈ W r,1 (µ) with |∇|f || = |∇f | a.e., so it suffices to consider f ≥ 0. Apply
LSI(q) to the function f r/q :
(2r/q²) ‖f‖_r^{2r/q−r} ∫_{ℝⁿ} f^r log f dµ − (2r/q²) log(‖f‖_r) ‖f‖_r^{2r/q} ≤ C (r²/q²) ( ∫_{ℝⁿ} f^{r−q} |∇f|^q dµ )^{2/q}.    (5.2)

Apply Hölder's inequality on the right side, with exponents r/q and r/(r − q), to get

( ∫_{ℝⁿ} f^{r−q} |∇f|^q dµ )^{2/q} ≤ ( ‖f‖_r^{r−q} ‖∇f‖_r^q )^{2/q} = ‖f‖_r^{2r/q−2} ‖∇f‖_r².

Plugging this into (5.2) and dividing both sides by (r/q)² ‖f‖_r^{2r/q−2} yields LSI(r) with constant C.
Proposition 5.2. If µ satisfies Bec(q, p) with some constant Cp for each p ∈ [1, q), then µ also satisfies
LSI(q) with constant C := lim supp→q Cp .
Proof. By the usual approximation arguments, it suffices to prove the inequality for f ∈ C^∞(ℝⁿ) ∩ W^{2,1}(µ) with f ≥ c > 0 for some constant c. Divide both sides of Bec(q, p) by q − p to get

( ‖f‖_q² − ‖f‖_p² ) / (q − p) ≤ C_p ‖∇f‖_q².    (5.3)

Our hypotheses on f imply that p ↦ ‖f‖_p² is a differentiable function of p, so as p → q⁻, the left side of (5.3) tends to

(d/dp) ‖f‖_p² |_{p=q} = (2/q) ‖f‖_q^{2−q} ∫_{ℝⁿ} f^q log(f) dµ − (2/q) log(‖f‖_q) ‖f‖_q²,

which is precisely the left side of LSI(q). The right side of (5.3) is bounded above by C ‖∇f‖_q² as p → q⁻, which proves the desired inequality.
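The differentiability claim and the derivative formula used here are easy to confirm numerically on a discrete probability measure (a supplementary check; the sample data below are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)
w = rng.random(40); w /= w.sum()        # weights of a discrete probability measure
f = rng.random(40) + 0.5                # a positive "function" on 40 points
q = 3.0

norm2 = lambda p: (w @ f**p) ** (2.0 / p)       # ||f||_p^2 with respect to w

# closed form for (d/dp)||f||_p^2 at p = q, as in the proof above
nq = (w @ f**q) ** (1.0 / q)
exact = (2/q) * nq**(2 - q) * (w @ (f**q * np.log(f))) - (2/q) * np.log(nq) * nq**2

h = 1e-6
numeric = (norm2(q + h) - norm2(q - h)) / (2 * h)   # central finite difference
assert abs(exact - numeric) < 1e-6
```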
By Theorem 4.2, we immediately get

Corollary 5.3. For q ≥ 2, the measure µ satisfies inequality LSI(q) with constant 1/b.
There is a partial converse to Proposition 5.2. In order to prove it, we will need to consider how the
quotient
( ‖f‖_q² − ‖f‖_p² ) / (q − p)
changes as q and p vary. It turns out that a slight variant of this quantity is better behaved. Namely,

θ(q, p) := ( ‖f‖_q² − ‖f‖_p² ) / ( 1/p − 1/q ) = qp ( ‖f‖_q² − ‖f‖_p² ) / ( q − p ),    1 ≤ p < q.    (5.4)
Of course, θ depends on f , but the function will always be clear from the context, so is not indicated in the
notation. The key feature of θ is the following.
Lemma 5.4. The function θ is increasing in q and p.
This lemma and its proof are based on a result in [21].
Proof. The function θ is the negative of a difference quotient:

θ(q, p) = − ( β(1/p) − β(1/q) ) / ( 1/p − 1/q ),

where

β(t) := ‖f‖_{1/t}²    (5.5)

for t ∈ (0, 1]. We claim that β is convex. Given the claim, we have that the difference quotients of β are increasing in both arguments. Hence

( β(1/p) − β(1/q) ) / ( 1/p − 1/q )

is decreasing in both q and p, so its negative θ(q, p) is increasing in both q and p, which proves the lemma.
It remains to prove that β is convex. We first claim that
Z
1
|f |1/t dµ .
α(t) := log(β(t)) = log(kf k1/t ) = t log
2
Rn
is a convex function of t. Indeed, if t, s ∈ (0, 1], we can apply Hölder's inequality with conjugate exponents (t + s)/t and (t + s)/s to get
\[
\alpha\!\left(\frac{t+s}{2}\right) = \frac{t+s}{2}\,\log\int_{\mathbb{R}^n} f^{2/(t+s)}\, d\mu
\le \frac{t+s}{2}\,\log\left( \|f^{1/(t+s)}\|_{(t+s)/t}\, \|f^{1/(t+s)}\|_{(t+s)/s} \right)
\]
\[
= \frac{t+s}{2}\,\log \|f\|_{1/t}^{1/(t+s)} + \frac{t+s}{2}\,\log \|f\|_{1/s}^{1/(t+s)}
= \frac{1}{2}\,\alpha(t) + \frac{1}{2}\,\alpha(s).
\]
Now, the convexity of α and the fact that the exponential function is increasing and convex gives us
\[
\beta\!\left(\frac{t+s}{2}\right) \le e^{2(\alpha(t)/2 + \alpha(s)/2)} \le \frac{1}{2}\,e^{2\alpha(t)} + \frac{1}{2}\,e^{2\alpha(s)} = \frac{1}{2}\,\beta(t) + \frac{1}{2}\,\beta(s),
\]
which proves that β is convex.
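The monotonicity asserted in Lemma 5.4 is easy to sanity-check numerically. The sketch below (an illustration, not part of the thesis) evaluates θ(q, p) for the one-dimensional standard Gaussian measure, with an arbitrarily chosen positive test function, via Gauss-Hermite quadrature:

```python
import numpy as np

# Quadrature for the standard Gaussian; any probability measure would do here.
nodes, weights = np.polynomial.hermite.hermgauss(80)
x = np.sqrt(2.0) * nodes
w = weights / np.sqrt(np.pi)              # normalized so that w sums to 1

f = 1.0 + 0.5 * np.cos(x)                 # arbitrary positive test function

def norm_sq(p):
    # ||f||_p^2 with respect to the Gaussian measure
    return float(np.sum(w * f ** p)) ** (2.0 / p)

def theta(q, p):
    # theta(q, p) = (||f||_q^2 - ||f||_p^2) / (1/p - 1/q), as in (5.4)
    return (norm_sq(q) - norm_sq(p)) / (1.0 / p - 1.0 / q)

qs = [1.5, 2.0, 3.0, 4.0]
for q1, q2 in zip(qs, qs[1:]):            # increasing in q, p fixed
    assert theta(q2, 1.2) >= theta(q1, 1.2) - 1e-12
ps = [1.0, 1.2, 1.5, 1.9]
for p1, p2 in zip(ps, ps[1:]):            # increasing in p, q fixed
    assert theta(2.5, p2) >= theta(2.5, p1) - 1e-12
```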
We can now prove a partial converse of Proposition 5.2.
Proposition 5.5. Suppose that µ satisfies LSI(q) with constant C. Then for all p ∈ [1, q), µ satisfies
Bec(q, p) with constant (q/p)C.
Proof. As before, it suffices to prove the implication for f ∈ C ∞ (Rn ) with f ≥ c > 0 for some constant c.
Since θ(q, p) is increasing in p ≤ q, we have
\[
\frac{\|f\|_q^2 - \|f\|_p^2}{q-p} = \frac{\theta(q,p)}{qp} \le \frac{1}{qp}\,\lim_{r\to q^-}\theta(q,r) = \frac{q}{p}\,\lim_{r\to q^-}\frac{\theta(q,r)}{qr} = \frac{q}{p}\,\lim_{r\to q^-}\frac{\|f\|_q^2 - \|f\|_r^2}{q-r}.
\]
This last limit is precisely
\[
\frac{d}{dp}\Big|_{p=q}\|f\|_p^2 = \frac{2}{q}\,\|f\|_q^{2-q}\int_{\mathbb{R}^n} f^q \log(f)\, d\mu - \frac{2}{q}\,\log(\|f\|_q)\,\|f\|_q^2.
\]
By LSI(q), this is less than or equal to C‖∇f‖_q², so
\[
\frac{\|f\|_q^2 - \|f\|_p^2}{q-p} \le \frac{q}{p}\, C\,\|\nabla f\|_q^2,
\]
which is Bec(q, p) with the claimed constant.
Remark 5.6. Propositions 5.2 and 5.5, together with the results of Chapter 3, tell us that the standard
Gaussian measure satisfies LSI(q) with constant 1 for q ≥ 2, and does not satisfy LSI(q) with any constant
for q < 2. We also see that for q ≥ 2 and 1 ≤ p ≤ q, Bec(q, p) with constant q/p can be deduced from the
logarithmic Sobolev inequality via the “soft” argument
LSI(2) (constant 1) ⇒ LSI(q) (constant 1) ⇒ Bec(q, p) (constant q/p)
where the first implication is by Proposition 5.1 and the second by Proposition 5.5. However, the sharp
constant 1 we obtained in Chapter 3 cannot be deduced in this indirect manner.
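As a quick numerical illustration (not part of the original argument), Bec(2, p) with the sharp constant 1 for the one-dimensional standard Gaussian can be checked by Gauss-Hermite quadrature; the test function below is an arbitrary smooth choice:

```python
import numpy as np

# Gauss-Hermite quadrature for the 1D standard Gaussian gamma:
# sum(w * F(x)) approximates the integral of F d(gamma).
nodes, weights = np.polynomial.hermite.hermgauss(80)
x = np.sqrt(2.0) * nodes
w = weights / np.sqrt(np.pi)              # normalized so that w sums to 1

def mean(vals):
    return float(np.sum(w * vals))

def norm_p(vals, p):
    return mean(np.abs(vals) ** p) ** (1.0 / p)

f = np.exp(0.3 * np.sin(x))               # arbitrary smooth positive test function
df = 0.3 * np.cos(x) * f                  # its derivative

grad_sq = mean(df ** 2)                   # ||grad f||_2^2
for p in [1.0, 1.25, 1.5, 1.75, 1.99]:
    lhs = norm_p(f, 2) ** 2 - norm_p(f, p) ** 2
    rhs = (2.0 - p) * grad_sq             # Bec(2, p) with constant 1
    assert lhs <= rhs + 1e-12, (p, lhs, rhs)
```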
Lemma 5.4 also tells us something about the relationship between inequality Bec(q, p) for different values
of the parameter q.
Corollary 5.7. Suppose 1 ≤ p ≤ q ≤ r. If µ satisfies Bec(r, p) with constant C, one has
\[
\|f\|_q^2 - \|f\|_p^2 \le \frac{r}{q}\, C(q-p)\,\|\nabla f\|_r^2.
\]
Proof. Since θ is increasing in q we have that for any p ≤ q,
\[
\frac{\|f\|_q^2 - \|f\|_p^2}{q-p} = \frac{1}{qp}\,\theta(q,p) \le \frac{1}{qp}\,\theta(r,p) = \frac{r}{q}\,\frac{\|f\|_r^2 - \|f\|_p^2}{r-p}.
\]
By hypothesis, this is less than or equal to Ck∇f k2r , which proves our result.
Taking µ = γⁿ, r = 2, and q ≤ 2, we see that Beckner's original inequality Bec(2, p) for γⁿ yields the estimate
\[
\|f\|_q^2 - \|f\|_p^2 \le \frac{2}{q}\,(q-p)\,\|\nabla f\|_2^2, \qquad 1 \le p \le q \le 2.
\]
From what we proved above, whenever 1 ≤ q ≤ r we have the implications
Bec(q, p) (all p < q, constant C) ⇒ LSI(q) (constant C)
⇒ LSI(r) (constant C) ⇒ Bec(r, p) (all p < r, constant (r/p)C)
There is another implication within inequality Bec(q, p).
Proposition 5.8. Suppose q ≥ 1 and µ satisfies Bec(q, p) with constant C for all p ∈ [1, q). Then for each r > q and each p ∈ [r/q, r), µ satisfies Bec(r, p) with constant (r/q)C.
Proof. If f ∈ W r,1 (µ), then |f | ∈ W r,1 (µ) with |∇|f || = |∇f | a.e., so it suffices to consider f ≥ 0. Apply
Bec(q, p) to the function f^{r/q}:
\[
\|f\|_r^{2r/q} - \|f\|_{rp/q}^{2r/q} \le C(q-p)\,\frac{r^2}{q^2}\left(\int_{\mathbb{R}^n} f^{r-q}\,|\nabla f|^q\, d\mu\right)^{2/q}. \tag{5.6}
\]
Apply Hölder’s inequality on the right side, with exponents r/q and r/(r − q), to get
\[
\left(\int_{\mathbb{R}^n} f^{r-q}\,|\nabla f|^q\, d\mu\right)^{2/q} \le \left(\|f\|_r^{r-q}\,\|\nabla f\|_r^{q}\right)^{2/q} = \|f\|_r^{2r/q-2}\,\|\nabla f\|_r^2.
\]
Plugging this into (5.6), dividing both sides by ‖f‖_r^{2r/q−2}, and rewriting the constant on the right yields
\[
\|f\|_r^2 - \|f\|_r^{2-2r/q}\,\|f\|_{rp/q}^{2r/q} \le C\left(r - \frac{rp}{q}\right)\frac{r}{q}\,\|\nabla f\|_r^2. \tag{5.7}
\]
Since rp/q ≤ r, we have
\[
\|f\|_r^{2r/q-2} \ge \|f\|_{rp/q}^{2r/q-2} \quad\Longrightarrow\quad \|f\|_r^{2-2r/q} \le \|f\|_{rp/q}^{2-2r/q}.
\]
Therefore, the left side of (5.7) is greater than or equal to ‖f‖_r² − ‖f‖_{rp/q}². As p ranges from 1 to q, rp/q ranges from r/q to r. Thus, we see that (5.7) implies Bec(r, p) with the asserted constant and range for the auxiliary parameter.
5.2 Inequality for the semigroup

In this section, we prove an inequality for the semigroup T_t whose generator is an extension of A, with A defined as in (4.1). This inequality is closely related to Beckner's p-norm inequality Bec(2, p), and generalizes Beckner's inequality for the Ornstein-Uhlenbeck semigroup in [5], this time for q ≤ 2.
Theorem 5.9 (Inequality for the Semigroup). Let µ = e−v(x) dx be a log-concave probability measure on
Rn with concavity b > 0. Let A be as in (4.1). Let f ∈ W 2,1 (µ). For p ∈ (1, 2], let t(p) be such that
e−2bt(p) = p − 1. Then whenever 1 ≤ p ≤ q ≤ 2,
\[
\|T_{t(q)} f\|_2^2 - \|T_{t(p)} f\|_2^2 \le (q - p)\,\frac{1}{b}\,\|\nabla f\|_2^2. \tag{5.8}
\]
Remark 5.10. If we take q = 2, then t(q) = 0 and T_0 f = f, so inequality (5.8) yields the relation
\[
\|f\|_2^2 - \|T_{t(p)} f\|_2^2 \le (2 - p)\,\frac{1}{b}\,\|\nabla f\|_2^2. \tag{5.9}
\]
In the Gaussian case (for which b = 1), this is the inequality for the Ornstein-Uhlenbeck semigroup which
Beckner proved in [5]. Our result here generalizes this inequality for other exponents and other measures.
Proof. By a standard approximation argument, we can assume that f ∈ C0∞ (Rn ) with C > f > c for some
C, c > 0. Define
\[
\phi(p) = \|T_{t(p)} f\|_2^2.
\]
Then (5.8) is equivalent to the relation
\[
\frac{\phi(q) - \phi(p)}{q - p} \le \frac{1}{b}\,\|\nabla f\|_2^2.
\]
The left side is a difference quotient of φ. Furthermore, the hypotheses on f imply that φ is differentiable.
So, by the mean value theorem, to prove our inequality it suffices to show that
\[
\phi'(p) \le \frac{1}{b}\,\|\nabla f\|_2^2
\]
for each p ∈ [1, 2]. This is the inequality we shall prove.
If we differentiate φ(p) in p and integrate by parts, we get
\[
\phi'(p) = t'(p)\,\frac{d}{dt}\int_{\mathbb{R}^n} (T_{t(p)} f)^2\, d\mu = -\frac{1}{b(p-1)}\int_{\mathbb{R}^n} A(T_{t(p)} f)\, T_{t(p)} f\, d\mu \tag{5.10}
\]
\[
= \frac{1}{b(p-1)}\int_{\mathbb{R}^n} |\nabla T_{t(p)} f|^2\, d\mu. \tag{5.11}
\]
By Proposition 4.9, this is less than or equal to
\[
\frac{e^{-2bt(p)}}{b(p-1)}\int_{\mathbb{R}^n} T_{t(p)} |\nabla f|^2\, d\mu = \frac{1}{b}\,\|\nabla f\|_2^2,
\]
as required.
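In the Gaussian case b = 1 the semigroup is the Ornstein-Uhlenbeck semigroup, which acts diagonally on the probabilists' Hermite polynomials (T_t He_k = e^{−kt} He_k, with ‖He_k‖₂² = k! and ‖He_k′‖₂² = k·k!), so inequality (5.8) can be checked exactly on finite Hermite expansions. The following sketch, with randomly chosen coefficients, is an illustration only:

```python
import math
import numpy as np

# Gaussian case b = 1: T_{t(p)} He_k = (p-1)^{k/2} He_k since e^{-2t(p)} = p - 1.
rng = np.random.default_rng(0)
K = 8
a = rng.normal(size=K + 1)                     # Hermite coefficients of f
k = np.arange(K + 1)
fact = np.array([math.factorial(i) for i in k], dtype=float)

def semigroup_norm_sq(p):
    # ||T_{t(p)} f||_2^2 = sum a_k^2 (p-1)^k k!
    return float(np.sum(a ** 2 * (p - 1.0) ** k * fact))

grad_sq = float(np.sum(a ** 2 * k * fact))     # ||grad f||_2^2

for p in np.linspace(1.0, 2.0, 11):
    for q in np.linspace(p, 2.0, 5):
        lhs = semigroup_norm_sq(q) - semigroup_norm_sq(p)
        assert lhs <= (q - p) * grad_sq + 1e-9  # inequality (5.8) with b = 1
```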
Remark 5.11. Equation (5.11) shows that φ(p) = ‖T_{t(p)} f‖₂² is an increasing function of p. In fact, if we apply the same argument n times, with ∂_j f in place of f, we get
\[
\phi''(p) = \frac{1}{b^2(p-1)^2}\sum_{j,k=1}^n \int_{\mathbb{R}^n} |\partial_{jk} T_{t(p)} f|^2\, d\mu \ge 0,
\]
whence φ(p) is a convex function of p. It follows that
\[
\frac{\phi(q) - \phi(p)}{q - p}
\]
is increasing in q and p, so the inequality gets sharper as q and p increase. In particular, Beckner's original inequality (5.9) with q = 2 is stronger than the inequality for smaller q.
Remark 5.12. In the Gaussian case, Nelson’s hypercontractivity inequality is essentially the statement that
\[
\|T_{t(p)} f\|_2 \le \|f\|_p.
\]
Thus, together with hypercontractivity, (5.8) immediately implies the inequality
\[
\|T_{t(q)} f\|_2^2 - \|f\|_p^2 \le (2 - p)\,\|\nabla f\|_2^2.
\]
In particular, if we take q = 2, we get Beckner’s p-norm inequality Bec(2, p). This is how Beckner originally
obtained this inequality. Since Bec(2, p) implies the logarithmic Sobolev inequality LSI(2) and LSI(2)
can be used to prove hypercontractivity [9], we see that the inequalities of this section, Beckner’s p-norm
inequalities, LSI(2), and hypercontractivity are all logically equivalent in the Gaussian case.
5.3 Brascamp-Lieb inequality
In this section, we prove a sharpened form of the Poincaré inequality for log-concave probability measures
on Rn , originally due to Brascamp and Lieb [10]:
Theorem 5.13. Let µ = e−v(x) dx be a log-concave probability measure on Rn with concavity b > 0. Let
f ∈ W 2,1 (µ). Then
\[
\|f\|_2^2 - \left(\int_{\mathbb{R}^n} f\, d\mu\right)^{2} \le \int_{\mathbb{R}^n} \langle (D^2 v)^{-1}\nabla f, \nabla f\rangle\, d\mu. \tag{5.12}
\]
Observe that if x ∈ ℝⁿ and y = (D²v)^{−1/2} x, then
\[
\langle (D^2 v)^{-1} x, x\rangle = \langle y, y\rangle \le \frac{1}{b}\,\langle (D^2 v)\, y, y\rangle = \frac{1}{b}\,|x|^2.
\]
Therefore inequality (5.12) is sharper than the inequality Bec(2, 1) with constant 1/b, which we proved in
Chapter 4.
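A one-dimensional numerical sketch makes this comparison concrete (illustrative only; the weight v(x) = x²/2 + 0.1x⁴ is an arbitrary choice with v″ ≥ b = 1, and f(x) = x is the near-extremal direction for the Poincaré inequality):

```python
import numpy as np

# 1D sketch: v(x) = x^2/2 + 0.1 x^4, so D^2 v = v'' = 1 + 1.2 x^2 >= b = 1.
xs = np.linspace(-10.0, 10.0, 40001)
dx = xs[1] - xs[0]
weight = np.exp(-(xs ** 2 / 2 + 0.1 * xs ** 4))
weight /= weight.sum() * dx                    # density of mu = e^{-v} dx / Z

def mean(vals):
    return float(np.sum(vals * weight) * dx)

f, df = xs, np.ones_like(xs)                   # f(x) = x
vpp = 1.0 + 1.2 * xs ** 2

variance = mean(f ** 2) - mean(f) ** 2         # left side of (5.12)
bl_bound = mean(df ** 2 / vpp)                 # right side of (5.12)
poincare = mean(df ** 2)                       # (1/b)||f'||_2^2 with b = 1

assert variance <= bl_bound + 1e-8             # Brascamp-Lieb (5.12)
assert bl_bound <= poincare + 1e-12            # sharper than the 1/b bound
```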
The original proof in [10] is by a direct, albeit lengthy, calculation and an induction argument. We take an
alternative, functional-analytic approach which yields several intermediate results which are of independent
interest. The first of these is:
Proposition 5.14. Define A := ∆ − ⟨∇v, ∇⟩ as in (4.1). Let Â be the self-adjoint extension of A which is the infinitesimal generator of the semigroup {T_t} of Chapter 4 (c.f. Appendix 6.2). Then Â is invertible from the set of f ∈ D(Â) with ∫_{ℝⁿ} f dµ = 0 to the set of g ∈ L²(µ) with ∫_{ℝⁿ} g dµ = 0. Furthermore, the inverse of Â is continuous with respect to the L² norm, and we have the inequality
\[
\|f\|_2 \le \frac{1}{b}\,\|\hat{A} f\|_2. \tag{5.13}
\]
To prove this, we need two lemmas.
Lemma 5.15. Let {Tt } be a contraction semigroup on a Hilbert space X with generator S. Let x ∈ X. Let
Y be a dense linear subspace of X, and suppose that for each y ∈ Y ,
\[
\lim_{s\to 0} \frac{1}{s}\,\langle T_s x - x, y\rangle = \langle z, y\rangle \tag{5.14}
\]
for some z ∈ X. Then Sx exists and equals z.
Proof. Since Y is dense, (5.14) holds for each y ∈ X, not just for y ∈ Y . For each t ≥ 0, Tt is symmetric
and so
\[
\lim_{s\to 0}\frac{1}{s}\,\langle T_{t+s} x - T_t x, y\rangle = \lim_{s\to 0}\frac{1}{s}\,\langle T_s x - x, T_t y\rangle = \langle z, T_t y\rangle = \langle T_t z, y\rangle.
\]
Thus
\[
\frac{d}{dt}\,\langle T_t x, y\rangle = \langle T_t z, y\rangle.
\]
Integrating, we get
\[
\langle T_t x - x, y\rangle = \int_0^t \langle T_s z, y\rangle\, ds
\]
for each y ∈ X. Therefore T_t x − x = ∫₀ᵗ T_s z ds. As t → 0,
\[
\frac{1}{t}\int_0^t T_s z\, ds \to z
\]
strongly, so t⁻¹(T_t x − x) → z strongly, i.e. Sx = z.
Now let {Tt } be the semigroup of Proposition 5.14.
Lemma 5.16. Let f ∈ L²(µ) with ∫_{ℝⁿ} f dµ = 0. Then
\[
\|T_t f\|_2 \le e^{-bt}\,\|f\|_2.
\]
Proof. By density it suffices to prove the formula for f ∈ D(A). Fix t > 0 and for t ≥ s ≥ 0 define
\[
\phi(s) = e^{-2bs}\int_{\mathbb{R}^n} (T_{t-s} f)^2\, d\mu.
\]
We have φ(0) = ‖T_t f‖₂², φ(t) = e^{−2bt}‖f‖₂², and
\[
\phi'(s) = -2b\, e^{-2bs}\int_{\mathbb{R}^n} (T_{t-s} f)^2\, d\mu + e^{-2bs}\int_{\mathbb{R}^n} \partial_s (T_{t-s} f)^2\, d\mu. \tag{5.15}
\]
We compute
\[
\int_{\mathbb{R}^n} \partial_s (T_{t-s} f)^2\, d\mu = -\int_{\mathbb{R}^n} 2\,(\hat{A} T_{t-s} f)(T_{t-s} f)\, d\mu = 2\int_{\mathbb{R}^n} |\nabla T_{t-s} f|^2\, d\mu.
\]
Plugging this into (5.15), we find
\[
\phi'(s) = 2\, e^{-2bs}\left(\|\nabla T_{t-s} f\|_2^2 - b\,\|T_{t-s} f\|_2^2\right).
\]
By the Poincaré inequality (note that T_{t−s} f has mean zero), this is non-negative. Thus φ is increasing, so in particular
\[
\|T_t f\|_2^2 = \phi(0) \le \phi(t) = e^{-2bt}\,\|f\|_2^2.
\]
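In the Gaussian case b = 1, Lemma 5.16 can be checked exactly on Hermite expansions, since T_t He_k = e^{−kt} He_k and a mean-zero f has no He₀ component; the sketch below (illustration only, with random coefficients) does so:

```python
import math
import numpy as np

# Gaussian case b = 1: decay e^{-bt} comes from the slowest mode k = 1.
rng = np.random.default_rng(1)
K = 6
a = rng.normal(size=K + 1)
a[0] = 0.0                                   # enforce mean zero
k = np.arange(K + 1)
fact = np.array([math.factorial(i) for i in k], dtype=float)

# ||T_t f||_2^2 = sum a_k^2 e^{-2kt} k!
norm_sq = lambda t: float(np.sum(a ** 2 * np.exp(-2.0 * k * t) * fact))

for t in [0.0, 0.3, 1.0, 3.0]:
    assert norm_sq(t) <= np.exp(-2.0 * t) * norm_sq(0.0) + 1e-12
```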
Proof of Proposition 5.14. For g ∈ L²(µ) with ∫_{ℝⁿ} g dµ = 0, define
\[
Bg = -\int_0^\infty T_t g\, dt. \tag{5.16}
\]
We claim that B = Â−1 . First we need to check that B is well defined and continuous. Let s > 0. By
Minkowski’s inequality for integrals,
\[
\left(\int_{\mathbb{R}^n}\left(\int_0^s T_t g\, dt\right)^{2} d\mu\right)^{1/2} \le \int_0^s \|T_t g\|_2\, dt. \tag{5.17}
\]
By Lemma 5.16, this is less than or equal to
\[
\int_0^s e^{-bt}\,\|g\|_2\, dt = \frac{1 - e^{-bs}}{b}\,\|g\|_2.
\]
Letting s → ∞, we find
\[
\|Bg\|_2 \le \frac{1}{b}\,\|g\|_2. \tag{5.18}
\]
Thus B : L²(µ) → L²(µ) continuously.
Now, if g = Âf, then ∫_{ℝⁿ} g dµ = ∫_{ℝⁿ} f Â(1) dµ = 0, and
\[
Bg = -\int_0^\infty T_t(\hat{A} f)\, dt = -\int_0^\infty \frac{d}{dt}\, T_t f\, dt = f.
\]
Thus BÂ = Id.
On the other hand, if g ∈ L²(µ) with ∫_{ℝⁿ} g dµ = 0, then
\[
\frac{1}{s}\,(T_s Bg - Bg) = \frac{1}{s}\int_0^\infty (T_t g - T_{t+s} g)\, dt,
\]
so if φ ∈ C_c^∞(ℝⁿ), we get from Fubini's theorem and symmetry of T_t that
\[
\int_{\mathbb{R}^n} \frac{1}{s}\,(T_s Bg - Bg)\,\varphi\, d\mu = \int_0^\infty\!\!\int_{\mathbb{R}^n} \frac{1}{s}\,(T_t g - T_{t+s} g)\,\varphi\, d\mu\, dt = \int_0^\infty\!\!\int_{\mathbb{R}^n} \frac{1}{s}\,(T_t \varphi - T_{t+s}\varphi)\, g\, d\mu\, dt.
\]
Note that Lemma 5.16 and Hölder’s inequality show that Tt gφ is jointly integrable in t and over Rn , so that
the application of Fubini's theorem is justified. As s → 0, this tends to
\[
-\int_0^\infty\!\!\int_{\mathbb{R}^n} (\hat{A} T_t \varphi)\, g\, d\mu\, dt = \int_{\mathbb{R}^n} \hat{A}\left(-\int_0^\infty T_t \varphi\, dt\right) g\, d\mu = \int_{\mathbb{R}^n} \varphi\, g\, d\mu.
\]
Thus by Lemma 5.15, ÂBg exists and equals g, as required. Thus B = Â−1 and estimate (5.13) is immediate
from (5.18).
Now we proceed to prove Theorem 5.13. Our proof is based on that of B. Helffer [18]. Like the proof of
Theorem 4.2, this proof relies on a commutation relation between the operator  defined in (4.1) and the
gradient operator.
Let L²(µ)ⁿ be the space of n-component vector-valued functions with components in L²(µ), with inner product ⟨F, G⟩_{L²(µ)ⁿ} = ∫_{ℝⁿ} ⟨F, G⟩ dµ. Let L : (L²(µ))ⁿ → (L²(µ))ⁿ be the unbounded operator defined by
\[
LF = (D^2 v)F - AF, \tag{5.19}
\]
where A acts componentwise on vector-valued functions. Then if f ∈ C_c^∞(ℝⁿ),
\[
\nabla A f = \nabla \Delta f - (D^2 f)\nabla v - (D^2 v)\nabla f = -L\nabla f. \tag{5.20}
\]
Proof of Theorem 5.13. By approximation, it suffices to prove the inequality for f ∈ C^∞(ℝⁿ), with f constant outside a compact set. Since the inequality is invariant under adding a constant to f, we may also assume that ∫_{ℝⁿ} f dµ = 0. Then by Proposition 5.14, g := Â⁻¹f is well defined. By the integration by parts formula for A, one has
\[
\|f\|_2^2 = \int_{\mathbb{R}^n} (\hat{A} g)^2\, d\mu = -\int_{\mathbb{R}^n} \langle \nabla \hat{A} g, \nabla g\rangle\, d\mu = \int_{\mathbb{R}^n} \langle L\nabla g, \nabla g\rangle\, d\mu. \tag{5.21}
\]
We have
\[
\nabla f = \nabla \hat{A} g = -L\nabla g. \tag{5.22}
\]
For any F in the domain of L, we have
\[
\langle LF, F\rangle_{L^2(\mu)^n} = \langle (D^2 v)F, F\rangle_{L^2(\mu)^n} - \langle \hat{A} F, F\rangle_{L^2(\mu)^n} = \langle (D^2 v)F, F\rangle_{L^2(\mu)^n} + \|\nabla F\|_2^2. \tag{5.23}
\]
The second term on the right in (5.23) is non-negative, so
\[
\langle LF, F\rangle_{L^2(\mu)^n} \ge \langle (D^2 v)F, F\rangle_{L^2(\mu)^n} \ge b\,\|F\|_2^2. \tag{5.24}
\]
Therefore the operator L̃ := bI − L is non-positive on L²(µ)ⁿ. Clearly this operator is symmetric, and its domain contains C_c^∞(ℝⁿ)ⁿ, so is dense in L²(µ)ⁿ. So, exactly as we do for A in Appendix 6.2, we can apply the Friedrichs extension theorem to extend L̃ to a self-adjoint operator on L²(µ)ⁿ, still denoted by L̃, then apply the spectral theorem to get a contraction semigroup with generator L̃. It then follows from the Hille-Yosida theorem that the extension L̂ := bI − L̃ of L is invertible with inverse satisfying ‖L̂⁻¹F‖ ≤ (1/b)‖F‖_{L²(µ)ⁿ} for each F ∈ L²(µ)ⁿ.
Since F ↦ ⟨L̂F, F⟩_{L²(µ)ⁿ} defines an inner product on the domain of L̂, we have by the Cauchy-Schwarz inequality that whenever F ∈ L²(µ)ⁿ and G is in the domain of L̂,
\[
|\langle F, G\rangle_{L^2(\mu)^n}|^2 = |\langle \hat{L}\hat{L}^{-1} F, G\rangle_{L^2(\mu)^n}|^2 \le \langle \hat{L}^{-1} F, F\rangle_{L^2(\mu)^n}\,\langle \hat{L} G, G\rangle_{L^2(\mu)^n}
\]
with equality iff G = L̂⁻¹F. Therefore
\[
\langle \hat{L}^{-1} F, F\rangle_{L^2(\mu)^n} = \sup\left\{ \frac{|\langle F, G\rangle_{L^2(\mu)^n}|^2}{\langle \hat{L} G, G\rangle_{L^2(\mu)^n}} : G \in D(\hat{L}) \right\}.
\]
By (5.24), this is less than or equal to
\[
\sup\left\{ \frac{|\langle F, G\rangle_{L^2(\mu)^n}|^2}{\langle (D^2 v) G, G\rangle_{L^2(\mu)^n}} : G \in D(\hat{L}) \right\}.
\]
By the same argument above with L̂ replaced by D²v (and the fact that D(L̂) is dense in L²(µ)ⁿ), this equals ⟨(D²v)⁻¹F, F⟩_{L²(µ)ⁿ}, and hence
\[
\langle \hat{L}^{-1} F, F\rangle_{L^2(\mu)^n} \le \langle (D^2 v)^{-1} F, F\rangle_{L^2(\mu)^n}. \tag{5.25}
\]
From (5.22) one has
\[
\nabla g = -\hat{L}^{-1}\nabla f. \tag{5.26}
\]
Then applying this and (5.25) with F = ∇f, we get
\[
\langle \hat{L}\nabla g, \nabla g\rangle_{L^2(\mu)^n} = \langle \hat{L}^{-1}\nabla f, \nabla f\rangle_{L^2(\mu)^n} \le \langle (D^2 v)^{-1}\nabla f, \nabla f\rangle_{L^2(\mu)^n},
\]
and plugging this into (5.21) gives our desired inequality.
Remark 5.17. Inequality (5.12) suggests that one might look for an analogous version of inequality Bec(2, p) or LSI(2) for log-concave probability measures, i.e. an inequality of the form
\[
\frac{2}{q}\,\|f\|_q^{2-q}\int_{\mathbb{R}^n} |f|^q \log|f|\, d\mu - \frac{2}{q}\,\log(\|f\|_q)\,\|f\|_q^2 \le C\left(\int_{\mathbb{R}^n} \langle (D^2 v)^{-1}\nabla f, \nabla f\rangle^{q/2}\, d\mu\right)^{2/q} \tag{5.27}
\]
or
\[
\|f\|_q^2 - \|f\|_p^2 \le \frac{q}{p}\, C(q-p)\left(\int_{\mathbb{R}^n} \langle (D^2 v)^{-1}\nabla f, \nabla f\rangle^{q/2}\, d\mu\right)^{2/q}. \tag{5.28}
\]
Bobkov and Ledoux [7] have demonstrated that this is not possible in general, although (5.27) holds for q = 2 under additional regularity hypotheses on the measure. In the next section, we give their proof, and deduce (5.27) and (5.28) for general q, with a suitable constant, as a corollary.
5.4 Sharpened logarithmic Sobolev inequality
In this section, we shall prove a sharpened logarithmic Sobolev inequality for a restricted class of log-concave
probability measures due to Bobkov and Ledoux [7], which is analogous to the sharpened Poincaré inequality
of Section 5.3. The key tool is the following deep convexity inequality:
Theorem 5.18 (Prékopa-Leindler). Let f, g, h be non-negative, measurable functions on Rn . Suppose that
0 < t < 1 and for each x, y ∈ Rn ,
\[
f((1-t)x + ty) \ge g(x)^{1-t}\, h(y)^{t}.
\]
Then
\[
\int_{\mathbb{R}^n} f(x)\, dx \ge \left(\int_{\mathbb{R}^n} g(x)\, dx\right)^{1-t}\left(\int_{\mathbb{R}^n} h(x)\, dx\right)^{t}.
\]
For the proof, see [24]. We remark that Bobkov and Ledoux [7] have also used the Prékopa-Leindler
theorem to prove the Brascamp-Lieb inequality of Section 5.3, via an argument similar to the one below.
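A one-dimensional illustration (not part of the original text): for the Gaussian bumps g and h below, f(z) = exp(−(z − t)²) satisfies the pointwise hypothesis along z = (1−t)x + ty (it is in fact the smallest such f), so both the hypothesis and the integral conclusion of the theorem can be verified numerically:

```python
import numpy as np

t = 0.4
g = lambda x: np.exp(-x ** 2)
h = lambda y: np.exp(-(y - 1.0) ** 2)
f = lambda z: np.exp(-(z - t) ** 2)    # sup-convolution of g^(1-t), h^t

# Pointwise hypothesis f((1-t)x + ty) >= g(x)^(1-t) h(y)^t on a grid.
grid = np.linspace(-5.0, 6.0, 301)
X, Y = np.meshgrid(grid, grid)
assert np.all(f((1 - t) * X + t * Y) >= g(X) ** (1 - t) * h(Y) ** t - 1e-12)

# Integral conclusion (here with equality, since f is minimal).
z = np.linspace(-20.0, 21.0, 200001)
dz = z[1] - z[0]
If, Ig, Ih = (fn(z).sum() * dz for fn in (f, g, h))
assert If >= Ig ** (1 - t) * Ih ** t - 1e-6
```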
Theorem 5.19 (Bobkov-Ledoux). Let µ = e−v(x) dx be a log-concave probability measure on an open convex
set Ω ⊂ Rn .1 Let f ∈ W 2,1 (µ). Suppose that for any h ∈ Rn , the function x 7→ hD2 v(x)h, hi is concave on
Ω. Then one has the sharpened logarithmic Sobolev inequality
\[
\int_\Omega f^2 \log|f|\, d\mu - \|f\|_2^2\,\log(\|f\|_2) \le C\int_\Omega \langle (D^2 v)^{-1}\nabla f, \nabla f\rangle\, d\mu \tag{5.29}
\]
with constant C = 3/2.
Note that, unlike our previous results, this inequality does not hold for general log-concave probability
measures, even with a different constant. A counterexample is given below. The minimal hypotheses on µ
required to obtain inequality (5.29) are, to our knowledge, not known.
Furthermore, the constant 3/2 in inequality (5.29) is not sharp in all cases. For example, if µ = γ n is the
standard Gaussian measure, then (5.29) is just the ordinary logarithmic Sobolev inequality with constant C,
and holds with C = 1. Below, we give a counterexample to the effect that the constant cannot be improved
to C = 1 in all cases. The sharpest possible constants in (5.29) are not known in general.
Proof. By the usual approximation arguments, we can take f to be smooth with 0 < c ≤ f ≤ C < ∞. In
fact, by approximating and rescaling, we can arrange that, in addition, f ≡ 1 outside of a compact set. Then
we can write f 2 = eg , where g ∈ Cc∞ (Ω).
Let t, s > 0 with t + s = 1. Set
gt (z) = sup{g(x) − [tv(x) + sv(y) − v(z)] : x, y ∈ Ω, z = tx + sy}.
We shall apply the Prékopa-Leindler theorem to the functions
\[
e^{g_t - v}\chi_\Omega, \qquad e^{g/t - v}\chi_\Omega, \qquad e^{-v}\chi_\Omega,
\]
where χ denotes an indicator function. By definition of gt , we have
\[
e^{g_t(tx+sy) - v(tx+sy)} \ge e^{g(x) - [t v(x) + s v(y)]} = \left(e^{g(x)/t - v(x)}\right)^{t}\left(e^{-v(y)}\right)^{s},
\]
so the hypotheses of the theorem are satisfied and we get
\[
\int_\Omega e^{g_t}\, d\mu \ge \left(\int_\Omega e^{g/t}\, d\mu\right)^{t}. \tag{5.30}
\]
We shall take the limit as t → 1, s → 0 to obtain our desired inequality. To simplify notation in what follows,
we denote the left side of (5.29) by Ent(f ) (for “entropy”).
First we shall see how Ent(f) arises from the right side of (5.30). By logarithmic differentiation, one has
\[
\frac{d}{dt}\,\|f\|_{1/t} = -\frac{1}{t}\,\|f\|_{1/t}^{1-1/t}\int_\Omega f^{1/t}\log(f)\, d\mu + \frac{1}{t}\,\log(\|f\|_{1/t})\,\|f\|_{1/t}.
\]
1 A log-concave probability measure on Ω is defined in an analogous manner to a log-concave probability measure on Rn .
Sobolev spaces for these measures are defined as in Appendix 6.1, but with Cc∞ (Ω) replaced by the set of smooth functions f
on Ω whose derivatives up to order m are in L2 (µ).
If we replace f by f² and evaluate at t = 1, we get
\[
\frac{d}{dt}\Big|_{t=1}\|f^2\|_{1/t} = -2\int_\Omega f^2\log(f)\, d\mu + 2\,\log(\|f\|_2)\,\|f\|_2^2 = -2\,\mathrm{Ent}(f).
\]
Thus, recalling that e^g = f², we get by Taylor expansion at t = 1 that
\[
\left(\int_\Omega e^{g/t}\, d\mu\right)^{t} = \int_\Omega e^{g}\, d\mu + 2s\,\mathrm{Ent}(f) + O(s^2). \tag{5.31}
\]
We now need a suitable estimate on gt . Let
L(s) := tv(x) + sv(y) − v(z)
be the quantity subtracted from g(x) in the formula defining gt , where z = tx + sy. Put k = x − y. One has
\[
\frac{d}{dr}\,\langle \nabla v(rz + (1-r)x), k\rangle = -s\,\langle D^2 v(rz + (1-r)x)\, k, k\rangle
\]
and
\[
-\frac{1}{s}\,\frac{d}{dr}\, v(rz + (1-r)x) = \langle \nabla v(rz + (1-r)x), k\rangle.
\]
So, integrating by parts, we find that
\[
\int_0^1 rs\,\langle D^2 v(rz + (1-r)x)\, k, k\rangle\, dr = -\langle \nabla v(z), k\rangle + \int_0^1 \langle \nabla v(rz + (1-r)x), k\rangle\, dr
= -\langle \nabla v(z), k\rangle - \frac{1}{s}\,v(z) + \frac{1}{s}\,v(x).
\]
Similarly,
\[
\int_0^1 rt\,\langle D^2 v(rz + (1-r)y)\, k, k\rangle\, dr = \langle \nabla v(z), k\rangle - \frac{1}{t}\,v(z) + \frac{1}{t}\,v(y).
\]
Thus
\[
L(s) = ts\int_0^1 \left[ rs\,\langle D^2 v(rz + (1-r)x)\, k, k\rangle + rt\,\langle D^2 v(rz + (1-r)y)\, k, k\rangle \right] dr. \tag{5.32}
\]
By our hypothesis on v,
\[
\langle D^2 v(rz + (1-r)x)\, k, k\rangle \ge r\,\langle D^2 v(z)\, k, k\rangle + (1-r)\,\langle D^2 v(x)\, k, k\rangle \ge r\,\langle D^2 v(z)\, k, k\rangle,
\]
and similarly with x replaced by y. Thus we find
\[
L(s) \ge ts\int_0^1 r^2\,\langle D^2 v(z)\, k, k\rangle\, dr = \frac{ts}{3}\,\langle D^2 v(z)\, k, k\rangle.
\]
Hence,
\[
g_t(z) \le \sup\left\{ g(x) - \frac{ts}{3}\,\langle D^2 v(z)\, k, k\rangle : x, y \in \Omega,\ z = tx + sy \right\}.
\]
We have x = z + sk, so by Taylor expansion about s = 0 we obtain
\[
g_t(z) \le g(z) + s\,\sup_{k}\left\{ \langle \nabla g(z), k\rangle - \frac{1}{3}\,\langle D^2 v(z)\, k, k\rangle \right\} + O(s^2).
\]
Now fix z and make the evaluation at this z implicit. Differentiating in k, we find that the quantity inside
the supremum is maximized by choosing k such that
\[
\nabla g - \frac{2}{3}\,(D^2 v)\, k = 0.
\]
This is equivalent to k = (3/2)(D²v)⁻¹∇g, and hence
\[
g_t \le g + \frac{3s}{2}\,\langle (D^2 v)^{-1}\nabla g, \nabla g\rangle - \frac{3s}{4}\,\langle (D^2 v)^{-1}\nabla g, \nabla g\rangle + O(s^2)
= g + \frac{3s}{4}\,\langle (D^2 v)^{-1}\nabla g, \nabla g\rangle + O(s^2).
\]
From the Taylor expansion of the exponential function, we then get
\[
e^{g_t} \le e^{g}\, e^{\frac{3s}{4}\langle (D^2 v)^{-1}\nabla g, \nabla g\rangle + O(s^2)} = e^{g} + \frac{3s}{4}\, e^{g}\,\langle (D^2 v)^{-1}\nabla g, \nabla g\rangle + O(s^2).
\]
Substitute this inequality and (5.31) into (5.30) to obtain
\[
\int_\Omega e^{g}\, d\mu + 2s\,\mathrm{Ent}(f) + O(s^2) \le \int_\Omega e^{g}\, d\mu + \frac{3s}{4}\int_\Omega e^{g}\,\langle (D^2 v)^{-1}\nabla g, \nabla g\rangle\, d\mu + O(s^2).
\]
Cancelling ∫_Ω e^g dµ on both sides, dividing by s, and then letting s → 0, we find
\[
2\,\mathrm{Ent}(f) \le \frac{3}{4}\int_\Omega e^{g}\,\langle (D^2 v)^{-1}\nabla g, \nabla g\rangle\, d\mu = 3\int_\Omega \langle (D^2 v)^{-1}\nabla f, \nabla f\rangle\, d\mu.
\]
This completes the proof.
Recall that in Section 5.1, we established the chain of implications:
LSI(2) (constant C) ⇒ LSI(q) (constant C) ⇒ Bec(q, p) (constant (q/p)C)
for q ≥ 2, 1 ≤ p < q. (The first implication is Proposition 5.1 and the second is Proposition 5.5.) Exactly the same arguments used there, with ℝⁿ replaced by Ω and ‖∇f‖_q² replaced by (∫_Ω ⟨(D²v)⁻¹∇f, ∇f⟩^{q/2} dµ)^{2/q}, yield
Corollary 5.20. Let µ satisfy the hypotheses of Theorem 5.19, q ≥ 2, 1 ≤ p < q, f ∈ W q,1 (µ). Then
\[
\frac{2}{q}\,\|f\|_q^{2-q}\int_\Omega |f|^q\log|f|\, d\mu - \frac{2}{q}\,\log(\|f\|_q)\,\|f\|_q^2 \le C\left(\int_\Omega \langle (D^2 v)^{-1}\nabla f, \nabla f\rangle^{q/2}\, d\mu\right)^{2/q} \tag{5.33}
\]
and
\[
\|f\|_q^2 - \|f\|_p^2 \le \frac{q}{p}\, C(q-p)\left(\int_\Omega \langle (D^2 v)^{-1}\nabla f, \nabla f\rangle^{q/2}\, d\mu\right)^{2/q} \tag{5.34}
\]
with constant C = 3/2.
To see that Theorem 5.19 cannot hold without the additional concavity hypothesis on v, and that the
constant in this theorem cannot be sharpened to C = 1, we proceed by way of the following two results,
which appear in [23].
Proposition 5.21 (Herbst Argument). Let µ be a log-concave probability measure which satisfies (5.29). Then for any function f ∈ W^{2,1}(µ) with ⟨(D²v)⁻¹∇f, ∇f⟩ ≤ 1 a.e. and ∫_Ω f dµ =: α < ∞, and for any t > 0,
\[
\int_\Omega e^{2tf}\, d\mu \le e^{2Ct^2 + \alpha t}. \tag{5.35}
\]
Proof. By approximating the general f satisfying the conditions of the proposition with continuously differentiable functions in the C^{0,1} norm, we may assume that f is continuously differentiable. Let φ(t) = ‖e^{tf}‖₂² and g = e^{tf}/φ(t)^{1/2}. By (5.29),
\[
\int_\Omega g^2\log g\, d\mu \le C\int_\Omega \langle (D^2 v)^{-1}\nabla g, \nabla g\rangle\, d\mu.
\]
Note that the second term on the left vanishes since ‖g‖₂ = 1. In terms of f and φ, this reads
\[
\frac{t\,\phi'(t)}{2\phi(t)} - \frac{1}{2}\,\log\phi(t) \le \frac{C}{\phi(t)}\int_\Omega t^2\,\langle (D^2 v)^{-1}\nabla f, \nabla f\rangle\, e^{2tf}\, d\mu.
\]
After dividing by t², the left side of this inequality equals (1/2) d/dt [log φ(t)/t]. On the right we have ⟨(D²v)⁻¹∇f, ∇f⟩ ≤ 1, so
\[
\frac{1}{2}\,\frac{d}{dt}\,\frac{\log\phi(t)}{t} \le C.
\]
We have lim_{t→0} log φ(t)/t = ∫_Ω f dµ = α, so if we integrate both sides and use the fundamental theorem of calculus we get
\[
\frac{\log\phi(t)}{2t} \le Ct + \alpha/2.
\]
Rearranging terms gives (5.35).
Corollary 5.22. If f satisfies the hypotheses of Proposition 5.21 and 0 < λ < 1/(2C), then
\[
\int_\Omega e^{\lambda f^2}\, d\mu < \infty.
\]
Proof. By Chebyshev's inequality and (5.35), for any t, r > 0,
\[
\mu\{2f(x) \ge \alpha + r\} \le \mu\{e^{2tf(x)} \ge e^{\alpha t + rt}\} \le e^{2Ct^2 + \alpha t - \alpha t - rt} = e^{2Ct^2 - rt}.
\]
The right side is minimized when t = r/(4C), in which case we have
\[
\mu\{2f(x) \ge \alpha + r\} \le e^{-r^2/8C}.
\]
By Fubini's theorem, one then has
\[
\int_\Omega e^{\lambda f^2}\, d\mu = \int_\Omega e^{(\lambda/4)(2f)^2}\, d\mu = 1 + \frac{\lambda}{2}\int_0^\infty r\, e^{(\lambda/4)r^2}\,\mu\{2f(x) \ge r\}\, dr \le 1 + \frac{\lambda}{2}\int_0^\infty r\, e^{(\lambda/4)r^2 - r^2/8C}\, dr,
\]
which is finite provided λ < 1/(2C).
For example, consider v(x) = −log(2x) on (0, 1). The function v is strictly convex on (0, 1) and the integral of e^{−v(x)} over (0, 1) is 1, so µ = e^{−v(x)} dx = 2x dx defines a log-concave probability measure on (0, 1). This measure is also introduced as a counterexample in [7]. Let f(x) = log x on (0, 1). We have f′²/v″ ≡ 1 and
\[
\int_0^1 f\, d\mu = 2\int_0^1 x\log x\, dx = -1/2.
\]
So, if µ satisfies (5.29) for some constant C, then Corollary 5.22 implies that we must in particular have ∫₀¹ e^{λf²} dµ < ∞ for any 0 < λ < 1/(2C). But, for any such λ,
\[
\int_0^1 e^{\lambda f^2}\, d\mu = 2\int_0^1 x\, e^{\lambda(\log x)^2}\, dx = \infty.
\]
Thus µ cannot satisfy (5.29) for any constant C. Note further that (5.34) with q = 2 implies (5.29) with the
same constant, so µ cannot satisfy (5.34) with q = 2 and any constant either.
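The computations in this counterexample are easy to check numerically. The sketch below (illustration only) verifies f′²/v″ ≡ 1 and ∫f dµ = −1/2, and watches the partial integrals of x·exp(λ(log x)²) blow up as the lower cutoff shrinks:

```python
import numpy as np

# mu = 2x dx on (0,1), i.e. v(x) = -log(2x); f(x) = log x.
x = np.linspace(1e-8, 1.0, 200_001)
dx = x[1] - x[0]
fp2_over_vpp = (1.0 / x) ** 2 / (1.0 / x ** 2)   # f'^2 / v''
assert np.allclose(fp2_over_vpp, 1.0)

mean_f = np.sum(2.0 * x * np.log(x)) * dx        # integral of f d(mu)
assert abs(mean_f - (-0.5)) < 1e-3

# Substituting u = -log x, the integral over (eps, 1) becomes the integral
# of exp(-2u + lam*u^2) over (0, -log eps), which blows up as eps -> 0.
lam = 0.5
def tail(eps):
    u = np.linspace(0.0, -np.log(eps), 200_001)
    return np.sum(np.exp(-2.0 * u + lam * u ** 2)) * (u[1] - u[0])

vals = [tail(e) for e in (1e-2, 1e-4, 1e-6)]
assert vals[0] < vals[1] < vals[2] and vals[2] > 1e20
```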
As another example, take v(x) = xᵖ on (1, ∞) for 2 ≤ p < 3. Then v is strictly convex, so µ = Ce^{−v(x)} dx, where C = (∫₁^∞ e^{−v(x)} dx)⁻¹, defines a log-concave probability measure on (1, ∞). Furthermore v″ is concave, so µ satisfies the hypotheses of Theorem 5.19. Let
\[
f(x) = \frac{2\sqrt{p(p-1)}}{p}\, x^{p/2}.
\]
Then f′²/v″ ≡ 1 and ∫₁^∞ f dµ < ∞, so the hypotheses of Proposition 5.21 are satisfied. For λ > 0,
\[
\int_1^\infty e^{\lambda f^2}\, d\mu = C\int_1^\infty \exp\left(\frac{4(p-1)}{p}\,\lambda x^p - x^p\right) dx.
\]
This is finite if and only if λ < p/(4(p−1)). Therefore Corollary 5.22 implies that µ cannot satisfy (5.29) for any constant C < 2(p−1)/p. In particular, µ cannot satisfy (5.29) with constant 1 unless p = 2.
Remark 5.23. To our knowledge it is not known what hypotheses on the measure are required to obtain
(5.33) and (5.34) in general, nor what the sharpest possible constants in these inequalities are.
6 Appendices

6.1 Sobolev Spaces for Log-Concave Measures
In this appendix, we shall define the measures and function spaces which are the settings for our inequalities, and prove some of their basic properties. Recall that a multi-index is an element α = (α₁, ..., αₙ) of ℕⁿ. We write
\[
|\alpha| = \sum_{k=1}^n \alpha_k, \qquad \partial_\alpha = \prod_{k=1}^n \partial_k^{\alpha_k}.
\]
Given a log-concave probability measure µ on Rn , p ∈ [1, ∞), and m ∈ N, define a norm on the space
∞
Cc (Rn ) of smooth functions with compact support by
\[
\|f\|_{W^{p,m}(\mu)} := \sum_{|\alpha| \le m} \|\partial_\alpha f\|_p.
\]
The completion of Cc∞ (Rn ) under this norm is called the mth Sobolev space with exponent p, and is denoted
by W p,m (µ).
A sequence (f_j) in C_c^∞(ℝⁿ) converges to f ∈ C^∞(ℝⁿ) in W^{p,m}(µ) if and only if ∂_α f_j → ∂_α f in Lᵖ for each α with |α| ≤ m. Likewise, if (f_j) is a Cauchy sequence in C_c^∞(ℝⁿ), then whenever |α| ≤ m, (∂_α f_j) is a Cauchy sequence in Lᵖ. Since Lᵖ is complete, (f_j) converges to a function f ∈ Lᵖ(µ) in the Lᵖ norm, and each (∂_α f_j) converges as well. We want to define the Sobolev partial derivatives ∂_α f of f as the Lᵖ limits of the sequences (∂_α f_j). However, we first need to check that these limits do not depend on the choice of sequence (f_j) in C_c^∞(ℝⁿ). By induction on m, it will suffice to establish the following:
Proposition 6.1. Suppose that two sequences (fj ) and (gj ) in Cc∞ (Rn ) converge to the same function f in
Lp (µ), and for each k = 1, ..., n, (∂k fj ) and (∂k gj ) converge in Lp (µ). Then for each k,
\[
\lim_{j\to\infty} \partial_k f_j = \lim_{j\to\infty} \partial_k g_j.
\]
Proof. By replacing fj by fj − gj , it suffices to assume that fj → 0 in Lp and to show that hk :=
limj→∞ ∂k fj = 0 for each k = 1, ..., n. Let φ ∈ Cc∞ (Rn ). Then by the integration by parts formula
(4.3) and the product rule,
\[
\int_{\mathbb{R}^n} h_k\,\varphi\, d\mu = \lim_{j\to\infty}\int_{\mathbb{R}^n} (\partial_k f_j)\,\varphi\, d\mu = \lim_{j\to\infty}\int_{\mathbb{R}^n} f_j\,(\varphi\,\partial_k v - \partial_k \varphi)\, d\mu.
\]
But, since f_j → 0 in Lᵖ(µ), this limit is zero. Therefore ∫_{ℝⁿ} h_k φ dµ = 0 for each φ ∈ C_c^∞(ℝⁿ), which implies that h_k = 0 a.e.
In particular, Proposition 6.1 implies that the Sobolev partial derivatives of a function in Cc∞ (Rn ) agree
with its ordinary partial derivatives, so there is no ambiguity in using the same notation for both. We remark
that, by definition, Cc∞ (Rn ) is dense in W p,m (µ), with respect to both its own norm and the Lp norm. As
such, by approximating more general functions by smooth, compactly supported ones, it often suffices to
prove results only on Cc∞ (Rn ).
Next we establish that the spaces W p,m (µ) are sufficiently large to be of interest. Recall that C m (Rn )
denotes the set of m-times continuously differentiable functions on Rn .
Proposition 6.2. If f ∈ C m (Rn ) and f and each of its partial derivatives up to order m are in Lp (µ), then
f ∈ W p,m (µ).
Proof. First we show that the set Ccm (Rn ) of m-times continuously differentiable functions with compact
support is contained in W p,m (µ). Given f ∈ Ccm (Rn ), we need to approximate f in the W p,m norm
by functions in C_c^∞(ℝⁿ). For this, let g ∈ C_c^∞(ℝⁿ) be non-negative with Lebesgue integral 1. Set g_t(x) = t⁻ⁿ g(x/t), and let
\[
f * g_t(x) = \int_{\mathbb{R}^n} g_t(x - y)\, f(y)\, dy
\]
be the convolution of f and gt . Since f and gt are compactly supported, so is f ∗ gt . By the mean value
theorem and the dominated convergence theorem, we can differentiate under the integral sign to obtain that
f ∗ gt is smooth, and that for each multi-index α with |α| ≤ m,
∂α (f ∗ gt ) = f ∗ (∂α gt ) = (∂α f ) ∗ gt .
It is a standard theorem in the theory of Lp spaces that the functions gt are an approximate identity, in the
sense that h ∗ gt → h in Lp (dx) (and hence also in Lp (µ)) for any h ∈ Lp (dx). In particular,
(∂α f ) ∗ gt → ∂α f
in Lp (µ) as t → 0.
So, f ∗ gt → f in W p,m (µ), which implies f ∈ W p,m (µ).
Now suppose f satisfies the hypotheses of the theorem, but may not be compactly supported. Since
W p,m is complete, in light of the above it will suffice to approximate f in the W p,m norm by functions in
C_c^m(ℝⁿ). To do so, set
Bj = {x ∈ Rn : |x| ≤ j}.
Take a sequence of smooth bump functions β_j, equal to 1 on B_j, 0 on ℝⁿ \ B_{j+1}, and with uniformly bounded partial derivatives up to order m. Then if f_j := β_j f, we have ∂_α f_j → ∂_α f pointwise for each |α| ≤ m. The
product rule allows us to bound the integrals of |∂α fj |p over Bj+1 \ Bj in terms of the integrals of the partial
derivatives of f over this region. Since each ∂α f ∈ Lp , said integrals tend to zero, and we get fj → f in
W p,m .
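The mollification step in the proof above can be illustrated numerically. The sketch below uses a Gaussian kernel in place of a compactly supported bump (an inessential simplification for the illustration) and shows the discrete L² error of f * g_t shrinking as t → 0 for the hat function:

```python
import numpy as np

xs = np.linspace(-4.0, 4.0, 8001)
dx = xs[1] - xs[0]
f = np.maximum(0.0, 1.0 - np.abs(xs))      # hat function, merely C^0

def mollify(t):
    # discrete approximate identity of width t (Gaussian for simplicity)
    s = np.arange(-4 * t, 4 * t + dx, dx)
    g = np.exp(-s ** 2 / (2 * t ** 2))
    g /= g.sum()                           # unit mass
    return np.convolve(f, g, mode="same")

errors = []
for t in (0.5, 0.1, 0.02):
    err = (np.sum(np.abs(mollify(t) - f) ** 2) * dx) ** 0.5   # L^2 error
    errors.append(err)

assert errors[0] > errors[1] > errors[2]   # error decreases as t -> 0
assert errors[-1] < 1e-2
```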
The following proposition will often allow us to reduce proofs to the case of non-negative functions.
Proposition 6.3. Let f ∈ C 1 (Rn ) ∩ W p,1 (µ). Then |f | ∈ W p,1 (µ) and |∇f | = |∇|f || a.e.
Proof. We first define a family of functions which will act as a C 1 approximation of the absolute value
function. For y ∈ R, t > 0, set
Ft (y) = (y 2 + t2 )1/2 − t.
We claim that F_t ∘ f → |f| in W^{p,1}(µ) as t → 0. Indeed, we have F_t ∘ f → |f| pointwise, and ‖F_t ∘ f‖_∞ ≤ t + ‖f‖_∞.
By dominated convergence, Ft ◦ f → |f | in Lp . Furthermore, for each index k = 1, ..., n,
\[
\partial_k (F_t \circ f) = \frac{f\,\partial_k f}{(f^2 + t^2)^{1/2}}.
\]
Now, if either ∂_k f(x) = 0 or f(x) ≠ 0, then as t → 0, this converges to sgn(f(x))∂_k f(x). So, the only points where convergence can fail to occur are those in
\[
D = \{x \in \mathbb{R}^n : f(x) = 0,\ \nabla f(x) \ne 0\}.
\]
We claim that D has measure zero with respect to Lebesgue measure, and hence also with respect to µ. In
dimension 1, D consists of countably many endpoints of the open intervals where f ≠ 0, so has measure
zero. In higher dimensions, if x ∈ D, the implicit function theorem allows us to solve f (y) = 0 locally for
one of the coordinates of y as a C 1 function of the others, and thereby express D ∩ Ux as the image of a C 1
function from Rn−1 to Rn for some neighborhood Ux of x. The image of such a function has measure zero,
and countably many such Ux cover D. It follows that D has measure zero.
Thus ∂_k(F_t ∘ f) → sgn(f)∂_k f a.e., and we have |∂_k(F_t ∘ f)| ≤ ‖∂_k f‖_∞ a.e., so a second application of dominated convergence yields ∂_k(F_t ∘ f) → sgn(f)∂_k f in Lᵖ. It follows that |f| ∈ W^{p,1}(µ) with |∂_k |f|| = |∂_k f| (in the Sobolev sense).
6.2 Existence of Semigroup
In this appendix, we prove the existence of a symmetric Markov semigroup {Tt } whose infinitesimal generator
is an extension of the operator A defined in (4.1). We shall require the following theorems from functional
analysis:
Theorem 6.4 (Friedrichs Extension). Let S be a densely defined symmetric operator on a Hilbert space X.
If S is non-negative (or non-positive), in the sense that hSx, xi ≥ 0 (resp. hSx, xi ≤ 0) for x in the domain
of S, then there is a unique non-negative (resp. non-positive), self-adjoint extension of S.
Theorem 6.5 (Spectral Theorem). Let S be a bounded self-adjoint operator on a Hilbert space X. Then
there is a measure space (Ω, ν) and a linear isometry U : X → L2 (ν) such that U SU −1 is multiplication Mλ
by some measurable function λ on Ω.
For f, g ∈ L²(µ), write
\[
\langle f, g\rangle_{L^2(\mu)} = \int_{\mathbb{R}^n} f g\, d\mu. \tag{6.1}
\]
The operator A in (4.1) is densely defined on L2 , for its domain clearly contains Cc∞ (Rn ). Symmetry of A
follows by applying Lemma 4.3 twice. Furthermore, A is non-positive, for if f ∈ D(A), then Lemma 4.3
shows that
\[
\langle f, Af\rangle_{L^2(\mu)} = -\|\nabla f\|_2^2 \le 0.
\]
Thus the Friedrichs extension theorem implies that A has a non-positive self-adjoint extension Â.
The spectral theorem implies that there is a measure ν on a set Ω and a linear isometry U : L2 (µ) → L2 (ν)
such that U ÂU −1 is multiplication by some measurable function λ on Ω:
\[
U \hat{A} U^{-1} = M_\lambda.
\]
For t ≥ 0, put
\[
T_t = U^{-1} M_{e^{t\lambda}}\, U : D(\hat{A}) \to L^2(\mu). \tag{6.2}
\]
Proposition 6.6. {Tt } extends to a symmetric contraction semigroup on L2 (µ) with infinitesimal generator
Â.
Proof. From the definition (6.2), it is immediate that T_0 is the identity operator and T_t ∘ T_s = T_{t+s} on D(Â). Furthermore, the mapping t ↦ M_{e^{λt}} g is continuous for fixed g ∈ L²(ν), so since U and its inverse are continuous, so is t ↦ T_t f for each f ∈ L²(µ). Taking the transpose of both sides of (6.2) shows that each T_t is symmetric.
Now we show that each Tt does not increase L2 -norms on D(Â), so extends to a norm-decreasing operator
on all of L2 (µ). Since  is non-positive, we have for each f ∈ D(Â) that
\[
0 \ge \langle \hat{A} f, f\rangle_{L^2(\mu)} = \langle U^{-1} M_\lambda U f, f\rangle_{L^2(\mu)} = \langle \lambda\, U f, U f\rangle_{L^2(\nu)}.
\]
Since D(Â) is dense in L²(µ), U D(Â) is dense in L²(ν). Therefore this inequality implies that λ ≤ 0 a.e. Hence e^{tλ} ≤ 1 a.e., and it follows that
\[
\langle T_t f, T_t f\rangle_{L^2(\mu)} = \langle M_{e^{t\lambda}} U f, M_{e^{t\lambda}} U f\rangle_{L^2(\nu)} \le \langle U f, U f\rangle_{L^2(\nu)} = \langle f, f\rangle_{L^2(\mu)}.
\]
Therefore Tt is norm-decreasing on D(Â). Since D(Â) is dense in L2 (µ), Tt extends to a norm-decreasing
operator on L2 (µ), and by density of D(Â) we still have T0 = Id and Tt ◦ Ts = Tt+s . Thus {Tt } is a
contraction semigroup on L2 (µ).
It remains to check that the infinitesimal generator of this semigroup is Â. Since U is linear and continuous, it commutes with differentiation in t. That is, if f ∈ D(Â), then in the L2 sense,
(d/dt) Tt f = (d/dt) U−1 Metλ U f = U−1 Mλ Metλ U f = U−1 Mλ U U−1 Metλ U f = Â Tt f.
Evaluating at t = 0 shows that (d/dt)|t=0 Tt f = Âf whenever f ∈ D(Â). Conversely, suppose f ∈ L2(µ) and (d/dt)|t=0 Tt f exists in L2(µ). Then for any g ∈ D(Â),
⟨(d/dt)|t=0 Tt f, g⟩L2(µ) = lim t→0 (1/t) ⟨Tt f − f, g⟩L2(µ).
From the formula (6.2), it is clear that Tt is symmetric for each t, so this equals
lim t→0 (1/t) ⟨f, Tt g − g⟩L2(µ) = ⟨f, Âg⟩L2(µ).
Since Â is self-adjoint, it follows that f ∈ D(Â) with Âf = (d/dt)|t=0 Tt f. Therefore the infinitesimal generator of {Tt} is Â.
Of course, in the Gaussian case, our semigroup {Tt } is just the Ornstein-Uhlenbeck semigroup.
To show that {Tt } is a Markov semigroup, we need the following result, which characterizes contraction
and Markov semigroups in terms of their generators:
Theorem 6.7 (Hille-Yosida Theorem for Markov Semigroups). Let S be a closed linear operator defined on
a domain D(S) of a Banach space X. Then S generates a contraction semigroup if and only if
1. D(S) is dense in X;
2. For every λ > 0, λI−S is invertible and the resolvent (λI−S)−1 exists and satisfies k(λI−S)−1 k ≤ 1/λ.
This semigroup is Markov if and only if S(1) = 0 and (λI − S)−1 preserves positivity for all λ > 0.
For a proof, see Ch. 8 of [8].
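The resolvent bound in condition 2 can be seen concretely in finite dimensions, where a symmetric matrix with non-positive spectrum stands in for the generator. The following check is our own illustration of the statement, not part of any proof here.

```python
import numpy as np

# Sanity check (illustrative) of the Hille-Yosida resolvent bound
# ||(lam I - S)^{-1}|| <= 1/lam for symmetric S with non-positive spectrum.
rng = np.random.default_rng(1)
B = rng.standard_normal((6, 6))
S = -(B @ B.T)                              # spectrum in (-inf, 0]

for lam in (0.1, 1.0, 10.0):
    R = np.linalg.inv(lam * np.eye(6) - S)  # resolvent (lam I - S)^{-1}
    # Eigenvalues of R are 1/(lam - mu) with mu <= 0, each at most 1/lam.
    assert np.linalg.norm(R, 2) <= 1.0 / lam + 1e-12
```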
In order to apply the Hille-Yosida theorem to Â, we need to check that Â is closed. Indeed, if (fj ) is a sequence in D(Â), fj → f ∈ L2 (µ), and Âfj → g ∈ L2 (µ), then for any φ ∈ D(Â),
hg, φiL2 (µ) = lim hÂfj , φiL2 (µ) = lim hfj , ÂφiL2 (µ) = hf, ÂφiL2 (µ) .
j→∞
j→∞
Therefore f belongs to the domain of the adjoint of Â. But, Â is self-adjoint, so f ∈ D(Â), and by symmetry
hg, φiL2 (µ) = hÂf, φiL2 (µ) for each φ ∈ D(Â). Since D(Â) is dense in L2 (µ), it follows that Âf = g, so that
 is closed.
From Proposition 6.6, Â generates a contraction semigroup, so by the Hille-Yosida theorem, for each
λ > 0, (λI − Â)−1 exists as a continuous operator on L2 (µ). To show that our semigroup {Tt } is Markov, it
therefore remains to check the last two conditions in Theorem 6.7.
Proposition 6.8. The semigroup {Tt } of Proposition 6.6 is Markov.
Proof. Clearly, Â(1) = 0. First we check that (λI − Â)−1 is positivity preserving on (λI − Â)Cc2 (Rn ). We
must show that if f ∈ Cc2 (Rn ) and
g = (λI − Â)f ≥ 0,
then f ≥ 0. In this case, f attains a minimum at some point x0 ∈ Rn . At this point, ∇f = 0 and ∆f ≥ 0.
We have
0 ≤ (λI − Â)f (x0 ) = (λI − A)f (x0 ) = λf (x0 ) − ∆f (x0 ),
so λf (x0 ) ≥ ∆f (x0 ) ≥ 0, whence f ≥ 0.
In the general case, let g ∈ L2 (µ) be non-negative. We must show that (λI − Â)−1 g ≥ 0. We claim that
Cc2 (Rn ) ⊂ (λI − Â)Cc2 (Rn ).
(6.3)
Given the claim, if g ∈ L2 (µ) is non-negative a.e., then we can select a non-negative sequence (gj ) in Cc2 (Rn ) which converges to g in L2 . Then by the first case and (6.3), (λI − Â)−1 gj is non-negative a.e., and since (λI − Â)−1 is continuous, these functions converge to (λI − Â)−1 g in L2 . Hence (λI − Â)−1 g is non-negative
a.e. Thus, by the Hille-Yosida theorem, the semigroup generated by  is Markov.
It remains to prove (6.3). Consider a fixed compact set K ⊂ Rn which equals the closure of its interior,
and put
L2K (µ) = {f ∈ L2 (µ) : supp(f ) ⊂ K}.
The space L2K (µ) is a closed subspace of L2 (µ), hence is itself a Hilbert space. Furthermore  maps L2K (µ)
to L2K (µ) and the restriction of  to D(Â) ∩ L2K (µ) is symmetric and non-positive. Since D(Â) ∩ L2K (µ)
contains Cc2 (K o ) (where K o is the interior of K), D(Â) ∩ L2K (µ) is dense in L2K (µ). Therefore we can
apply the Friedrichs extension theorem, the spectral theorem, and the Hille-Yosida theorem just as we
did on L2 (µ) to find that (λI − Â)−1 exists as an operator from L2K (µ) to L2K (µ). This resolvent must
necessarily agree with the ordinary resolvent on Cc2 (K o ), so we find that (λI − Â)−1 maps Cc2 (K o ) to L2K (µ)
for each compact K ⊂ Rn . That is, if f ∈ Cc2 (Rn ), then (λI − Â)−1 f is compactly supported. On the
other hand, elliptic regularity (see [14], Ch. 6) implies, in particular, that (λI − Â)−1 f ∈ Cc2 (Rn ). Thus
(λI − Â)−1 Cc2 (Rn ) ⊂ Cc2 (Rn ), and so Cc2 (Rn ) ⊂ (λI − Â)Cc2 (Rn ).
6.3 Applications to Economics
Inequalities like those we study here have applications in a vast array of different fields. One such field is
economics. Many economic phenomena, including fluctuations in stock prices, machine failure rates, and shifts
in unemployment can be modelled using log-concave probability measures. Gaussian measures are ubiquitous
in mathematical modelling of all forms. Other log-concave measures allow for more flexible models which
can better match the data, or more precise calculations with certain statistics. Among these, some of the
most commonly used are the Gamma and Weibull distributions.
Such measures are especially important in finance. In a 1959 paper, M.F.M. Osborne showed that the
logarithms of many stock prices follow a Brownian motion, a stochastic process with independent, Gaussian
increments [28]. Osborne’s discovery sparked widespread interest in Gaussian and other log-concave measures
in finance. The ability to model stock price fluctuations mathematically enables researchers and investors
to quantify the risk associated with investing in the market, and thereby to model investment decisions in a
formal manner. One of the best known applications of this idea is the Black-Scholes equation for the price
of an option. This is a partial differential equation used to optimize pricing and portfolio allocation [6].
Another application of log-concave measures is to reliability functions in industrial engineering. Consider
a machine which has some positive probability of breaking down. A reliability function is a measure µ on
[0, ∞), with the interpretation that the measure of a set E is the probability that the machine breaks down
at a time t ∈ E. Log-concave measures arise naturally in this context. For example, if F (t) = µ[0, t] is the
cumulative distribution function of µ, and f is its density, then the quantity

MRL(x) = ∫x^∞ t f(t) dt / (1 − F(x)) − x
is called the mean residual lifetime function of the machine, and represents the expected time before a machine will break down, given that it has survived to time x. Naturally, one would want MRL(x) to be
decreasing in x, and it turns out that this is the case if and only if the measure µ is log-concave. Many
similar desirable properties of a reliability function are also equivalent to log-concavity [3].
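As a concrete illustration (the distribution and parameters are our own choices, not taken from [3]), the exponential distribution is log-concave, and by memorylessness its mean residual lifetime is constant, hence weakly decreasing, exactly as the equivalence above predicts:

```python
import numpy as np

# For the Exp(r) distribution, a log-concave reliability measure,
# MRL(x) = \int_x^inf t f(t) dt / (1 - F(x)) - x equals 1/r for every x.
r = 2.0

def mrl(x, t_max=50.0, n=200_000):
    """Approximate MRL(x) by trapezoidal quadrature on [x, t_max]."""
    t = np.linspace(x, t_max, n)
    integrand = t * r * np.exp(-r * t)   # t f(t) for the Exp(r) density
    integral = np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(t))
    survival = np.exp(-r * x)            # 1 - F(x)
    return integral / survival - x

values = [mrl(x) for x in (0.0, 0.5, 1.0, 2.0)]
assert all(abs(v - 1.0 / r) < 1e-3 for v in values)   # MRL is constant = 1/r
```

For a strictly decreasing MRL one could repeat the computation with, say, a Gamma density with shape parameter greater than 1.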
In order to use models like these, one must understand the behavior of the measures on which they rely.
Of particular interest are the variance of a random variable—the average squared distance from its mean—and its
entropy—a measure of the uncertainty in its value. If f is a function on Rn representing a random variable,
then its variance with respect to a measure µ is given by

∫Rn f 2 dµ − (∫Rn f dµ)2
and its entropy is given by

∫Rn f 2 log |f| dµ − ∫Rn f 2 dµ log (∫Rn f 2 dµ)1/2.

The Poincaré and logarithmic Sobolev inequalities bound these two quantities, respectively, in terms of ∫Rn |∇f|2 dµ, the square of the L2 norm of the gradient of f.
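Both bounds are easy to witness numerically. The sketch below is our own Monte Carlo illustration for the one-dimensional standard Gaussian measure, using the Gaussian constants (1 for Poincaré, 2 for the logarithmic Sobolev inequality) and a test function of our choosing:

```python
import numpy as np

# Monte Carlo check of Var(f) <= int |f'|^2 dmu (Poincare) and
# Ent(f^2) <= 2 int |f'|^2 dmu (log-Sobolev) for standard Gaussian mu on R.
rng = np.random.default_rng(2)
x = rng.standard_normal(1_000_000)     # samples from mu

f = np.sin(x)                          # test function
df = np.cos(x)                         # its derivative

energy = np.mean(df**2)                          # int |f'|^2 dmu
var_f = np.mean(f**2) - np.mean(f)**2            # variance of f
f2 = f**2
ent_f2 = np.mean(f2 * np.log(f2 + 1e-300)) - np.mean(f2) * np.log(np.mean(f2))

assert var_f <= energy                 # Poincare inequality
assert ent_f2 <= 2.0 * energy          # logarithmic Sobolev inequality
```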
These two inequalities are of much use in economic models which use Gaussian or other log-concave
measures. For example, they can be used to estimate the total long-term variance of a stock price about
its mean, or the expected amount of time before the process modelled by a log-concave reliability function
breaks down. These inequalities are also used in quantitative estimates for the variance and entropy of prices,
cash flows, inflation and other processes which follow a log-concave distribution [12].
Furthermore, the logarithmic Sobolev inequality can be used to prove concentration of measure inequalities [22], which bound the probability that a 1-Lipschitz random variable f deviates from its mean by at
least t > 0:
µ{x ∈ Rn : |f(x) − ∫Rn f dµ| ≥ t} ≤ φ(t)
for some rapidly decaying function φ. Given a concentration of measure inequality for µ, one can often obtain
a similar or even sharper inequality for the n-fold product measure µn . As such, concentration of measure
inequalities are indispensable in studying measures on high- or infinite-dimensional spaces. In economics, as
in many other fields, large data sets are often represented as vectors in which each observation corresponds
to a coordinate. A statistic of interest, e.g. the sample mean, is a function on the many-dimensional set
of possible vectors of observations. Concentration of measure inequalities estimate how close the sample
statistic is likely to be to its population counterpart (typically equal to its expected value). Moreover,
concentration of measure inequalities can be used to deduce generalized central limit theorems (as in [20]),
which are also useful in economics and statistics for estimating large sample probabilities and proving the
convergence of estimators.
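A concentration bound of this type is straightforward to observe empirically. The following is our own Monte Carlo illustration with the classical Gaussian choice φ(t) = 2 exp(−t²/2), which follows from the logarithmic Sobolev inequality, applied to a particular 1-Lipschitz function:

```python
import numpy as np

# Monte Carlo check of Gaussian concentration for the 1-Lipschitz function
# f(x) = |x| (Euclidean norm) on R^5, against phi(t) = 2 exp(-t^2 / 2).
rng = np.random.default_rng(3)
X = rng.standard_normal((1_000_000, 5))
f = np.linalg.norm(X, axis=1)          # 1-Lipschitz function of x
mean_f = f.mean()

for t in (0.5, 1.0, 2.0):
    empirical = np.mean(np.abs(f - mean_f) >= t)
    assert empirical <= 2.0 * np.exp(-t**2 / 2.0)
```

Note that the bound is dimension-free: the same φ works in R^5 as in R, which is exactly the high-dimensional usefulness described above.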
The intermediate Beckner inequality and its generalization for q > 2, which we prove in Sections 3 and 4, have indirect applications in that they can be used to prove versions of the Poincaré and logarithmic Sobolev inequalities, as we do in Subsection 5.1. They also have more direct uses. The quantity
∫Rn f 2 dµ − (∫Rn |f|p dµ)2/p
appearing on the left side of Beckner’s inequality is called the p-variance of f , and represents the dispersion
of f about its Lp norm in much the same sense that the usual variance represents the dispersion of f about
its mean. Beckner’s inequality might be used to estimate the rate of inflation or value of an investment
in terms of its Lp norms, instead of its mean. Such estimates might be useful, for example, if one seeks
an intermediate measure of the “average” of such a quantity, between the more commonly used mean and
L2 norm. Indeed, Lp norms for general p ≠ 2 arise naturally in many economic applications, such as the
measurement of economic welfare [25]. As we show in Section 2, these inequalities possess a tensorisation
property similar to concentration of measure inequalities, which makes them useful for studying large data
sets. Furthermore, as with the logarithmic Sobolev inequality, Beckner’s inequality has been applied to prove
concentration of measure inequalities, e.g. in [21]. Generalizations of this inequality like the ones we obtain
here are likely to be used for this purpose in the future.
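The p-variance is also simple to estimate numerically. The sketch below is our own Monte Carlo illustration of the Gaussian Beckner inequality, with the constant 2 − p of the classical statement and a positive test function of our choosing:

```python
import numpy as np

# Monte Carlo check of Beckner's inequality for standard Gaussian mu on R:
# int f^2 dmu - (int |f|^p dmu)^{2/p} <= (2 - p) int |f'|^2 dmu, 1 <= p <= 2.
rng = np.random.default_rng(4)
x = rng.standard_normal(2_000_000)

f = 1.0 + 0.5 * np.sin(x)              # positive test function
df = 0.5 * np.cos(x)                   # its derivative
energy = np.mean(df**2)                # int |f'|^2 dmu

for p in (1.0, 1.4, 1.8):
    p_var = np.mean(f**2) - np.mean(np.abs(f)**p) ** (2.0 / p)
    assert p_var <= (2.0 - p) * energy + 1e-3   # tolerance for sampling noise
```

At p = 1 (for positive f) the assertion reduces to the Poincaré inequality, and as p → 2 both sides tend to zero, in keeping with the interpolation described in the text.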
Sharper and more general inequalities, as well as new proofs of existing inequalities, lead to a better
understanding of the measures in question. This in turn leads to more precise estimates and thereby more
accurate models and better statistical tests. So, all of our results here have potential uses in the economic
scenarios discussed above. For example, our extended Beckner inequalities might someday be used to prove
new concentration of measure inequalities (and thereby lead to better stochastic models) or to new methods
of estimating of the rates of convergence of stock prices or the lifetimes of industrial processes. Likewise, the
sharpened Beckner and Poincaré inequalities for log-concave measures of Section 5 could be used to obtain
sharper estimates for variance and entropy of such processes.
Discrete analogues of Beckner, Poincaré, and log-Sobolev inequalities are also of economic interest. These
inequalities play a similar role to their continuous counterparts in models which rely on discrete random
variables, such as Bernoulli trials or discrete random walks. Such models include those for market entry,
demand for relatively small quantities of goods, and a plethora of scenarios in game theory [12]. Consequently,
the inequality for Bernoulli trials we derive in Subsection 2.2 and the tensorial property of Subsection 2.1,
which allows it to be extended to higher dimensions, may have economic applications in their own right as
well.
Furthermore, the Ornstein-Uhlenbeck operator and its associated semigroup, which we study in Subsections 3.1 and 3.2, are often used in stochastic models for interest rates and commodity prices. For example, S. Rampertshammer [29] has developed a model for assessing the value of pairs trading based on the Ornstein-Uhlenbeck operator. Pairs trading is an investment strategy whereby one simultaneously purchases shares in an asset which is below its normal historical price and short-sells a second asset which is above its normal
historical price. The idea is that the prices are likely to return to their historical mean values, and, even if
they don’t, the investor will not lose money if the market as a whole improves or worsens. Rampertshammer
uses the Ornstein-Uhlenbeck process (the stochastic process generated by the Ornstein-Uhlenbeck operator)
to model the likelihood that two assets will rise or fall in price at the same time, and thereby to determine
the optimal portfolio allocation in a pair trade. Plausibly, the generalization of the Ornstein-Uhlenbeck operator and semigroup to general log-concave probability measures which we study in Sections 4 and 5 could
be used in similar models with different log-concave probability measures in place of the Gaussian measure.
Inequalities for the Ornstein-Uhlenbeck semigroup and its generalization to log-concave probability measures
like the ones we derive here could be used to bound the variance, p-variance, and entropy of the prices and
estimators involved in these models.
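The mean-reverting behavior that pairs-trading models exploit is easy to simulate. The sketch below is our own illustration of the Ornstein-Uhlenbeck process; the parameter values are arbitrary and not taken from [29]:

```python
import numpy as np

# Simulate the Ornstein-Uhlenbeck process dX_t = -theta X_t dt + sigma dW_t
# using its exact one-step transition (a Gaussian AR(1) recursion).
rng = np.random.default_rng(5)
theta, sigma, dt, n_steps, n_paths = 1.0, 0.5, 0.01, 2000, 20_000

x = np.full(n_paths, 3.0)                          # start far from the mean 0
a = np.exp(-theta * dt)                            # exact one-step decay
s = sigma * np.sqrt((1.0 - a**2) / (2.0 * theta))  # exact one-step noise scale
for _ in range(n_steps):
    x = a * x + s * rng.standard_normal(n_paths)

# After t = 20 mean-reversion times, the law is essentially the stationary
# Gaussian N(0, sigma^2 / (2 theta)); here the stationary variance is 0.125.
assert abs(x.mean()) < 0.02
assert abs(x.var() - sigma**2 / (2.0 * theta)) < 0.01
```

The stationary law N(0, σ²/(2θ)) is precisely the Gaussian measure for which the semigroup inequalities of Section 3 hold, which is why those inequalities control the long-run variance and entropy in such models.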
References
[1] A. Arnold, J. Bartier, and J. Dolbeault. Interpolation between logarithmic Sobolev and Poincaré inequalities. Commun. Math. Sci., 5, No. 4 (2007), 971-979.
[2] A. Arnold, P. Markowich, G. Toscani, and A. Unterreiter. On logarithmic Sobolev inequalities and
the rate of convergence to equilibrium for Fokker-Planck type equations. Comm. Partial Differential
Equations, 26 No. 1-2 (2001), 43-100.
[3] M. Bagnoli and T.C. Bergstrom. Log-concave Probability and Its Applications. UC Santa Barbara Postprints, 2005.
[4] F. Barthe and C. Roberto. Sobolev inequalities for probability measures on the real line. Studia Math.
159 (2003), 481-497.
[5] W. Beckner. A Generalized Poincaré Inequality for Gaussian Measures. Proceedings of the American
Mathematical Society 105, No. 2 (1989), 397-400.
[6] F. Black and M. Scholes. The Pricing of Options and Corporate Liabilities. Journal of Political Economy,
18 No. 3 (1974), 637-654.
[7] S.G. Bobkov and M. Ledoux. From Brunn-Minkowski to Brascamp-Lieb and to logarithmic Sobolev
inequalities. Geometric And Functional Analysis, 10 (2000), 1028-1052.
[8] A. Bobrowski. Functional Analysis for Probability and Stochastic Processes: An Introduction. Cambridge
University Press (2005).
46
[9] V. I. Bogachev. Gaussian Measures. American Mathematical Society (1998).
[10] H.J. Brascamp and E.H. Lieb. On Extensions of the Brunn-Minkowski and Prékopa-Leindler Theorems,
Including Inequalities for Log Concave Functions, with an Application to the Diffusion Equation. Journal
of Functional Analysis 22 (1976), 366-389.
[11] D. Chafaï. Entropies, convexity, and functional inequalities: On Phi-entropies and Phi-Sobolev inequalities. J. Math. Kyoto Univ., 44 No. 2 (2004), 325-363.
[12] J. Dupačová, J. Hurt, and J. Štěpán. Stochastic Modeling in Economics and Finance. Springer, 2002.
[13] R. Durrett. Probability: Theory and Examples. 4th ed., Cambridge University Press, 2010.
[14] L.C. Evans. Partial Differential Equations. American Mathematical Society, 1998.
[15] B. Franchi, S. Gallot and R. Wheeden. Sobolev and isoperimetric inequalities for degenerate metrics.
Math. Ann. 300 (1994), 557-571.
[16] L. Gross. Logarithmic Sobolev inequalities. Amer. J. Math. (1975), 1061-1083.
[17] E. Gwynne and E. Hsu. On Beckner’s Inequality for Gaussian Measures. Elemente der Mathematik.
Submitted.
[18] B. Helffer. Remarks on Decay of Correlations and Witten Laplacians: Brascamp-Lieb Inequalities and Semiclassical Limit. Journal of Functional Analysis 155 (1998), 571-586.
[19] E.P. Hsu and S.R.S. Varadhan. Probability Theory and its Applications. American Mathematical Society,
1999.
[20] B. Klartag. A central limit theorem for convex sets. Inventiones Mathematicae 168 no. 1 (2007), 91-131.
[21] R. Latala and K. Oleszkiewicz. Between Sobolev and Poincaré. Geometric Aspects of Functional Analysis, Lecture Notes in Math. no. 1745 (2000), Springer, 147-168.
[22] M. Ledoux. The Geometry of Markov Diffusion Generators. Ann. Fac. Sci. Toulouse, IX (2000), 305-366.
[23] M. Ledoux. Concentration of measure and logarithmic Sobolev inequalities. Séminaire de Probabilités
XXXIII, Springer Lecture Notes in Math. (1997).
[24] S. Levy. Flavors of Geometry. Cambridge University Press, 1997.
[25] T. Mitra and E.A. Ok. Majorization by Lp -Norms. New York University preprint, 2001. Available at
https://files.nyu.edu/eo1/public/Papers-PDF/Major.pdf.
[26] E. Nelson. The free Markoff field. J. Funct. Anal. 12 (1973), 211-227.
[27] J. Neveu. Sur l’espérance conditionnelle par rapport à un mouvement brownien. Ann. Inst. Henri Poincaré, B, 12 (1976), 105-110.
[28] M.F.M. Osborne. Brownian Motion in the Stock Market. U.S. Naval Research Laboratory, 1959.
[29] S. Rampertshammer. An Ornstein-Uhlenbeck Framework for Pairs Trading. Preprint. Available at
http://www.ms.unimelb.edu.au/publications/RampertshammerStefan.pdf.
[30] F.Y. Wang. A Generalization of Poincaré and Log-Sobolev Inequalities. Potential Analysis, 22 No. 1
(2005), 1-15.
[31] M.H. Ye. Applications of Brownian motion to economic models of optimal stopping. University of
Wisconsin–Madison, 1984.