Functional Inequalities for Gaussian and Log-Concave Probability Measures

Ewain Gwynne
Adviser: Professor Elton Hsu
Northwestern University
A thesis submitted for a Bachelor’s degree in
Mathematics (with honors)
and
Mathematical Methods in the Social Sciences
Abstract
We give three proofs of a functional inequality for the standard Gaussian measure originally due to
William Beckner. The first uses the central limit theorem and a tensorial property of the inequality.
The second uses the Ornstein-Uhlenbeck semigroup, and the third uses the heat semigroup. These latter
two proofs yield a more general inequality than the one Beckner originally proved. We then generalize
our new inequality to log-concave probability measures, study the relationship between this inequality
and a generalized logarithmic Sobolev inequality, and prove several other inequalities for log-concave
probability measures, including Brascamp and Lieb’s sharpened Poincaré inequality and Bobkov and
Ledoux’s sharpened logarithmic Sobolev inequality of the same form. We discuss some of the potential
applications of our work in economics.
Contents

1 Introduction
2 Proof of Beckner's Inequality via the Central Limit Theorem
   2.1 Tensorial property
   2.2 Two-point inequality
   2.3 First proof of Beckner's inequality
3 Extended Beckner Inequality via Semigroup Methods
   3.1 The Ornstein-Uhlenbeck operator
   3.2 The Ornstein-Uhlenbeck semigroup
   3.3 Proof of extended Beckner inequality
   3.4 Classical heat semigroup
4 Beckner Inequality for Log-Concave Probability Measures
   4.1 Generalization of the Ornstein-Uhlenbeck operator and semigroup
   4.2 Commutation with the gradient
   4.3 Proof of Beckner's inequality for log-concave probability measures
5 Other Inequalities for Log-Concave Probability Measures
   5.1 Generalized logarithmic Sobolev inequality
   5.2 Inequality for the semigroup
   5.3 Brascamp-Lieb inequality
   5.4 Sharpened logarithmic Sobolev inequality
6 Appendices
   6.1 Sobolev Spaces for Log-Concave Measures
   6.2 Existence of Semigroup
   6.3 Applications to Economics
0.1 Preface
I completed this thesis in 2013, shortly before my graduation from Northwestern University. It is intended
to fulfill the requirements for graduation with honors in Mathematics as well as the requirements for a major
in Mathematical Methods in the Social Sciences (MMSS).
This thesis consists of a mixture of results which I proved in collaboration with my adviser, and my own
exposition of material from research papers. The various sections herein were written over the course of
nearly a year, beginning in late Spring of 2012 (when I was a Junior in college) and continuing throughout
Summer 2012 and the following school year. My work on this thesis has exposed me to a broad range of
techniques and concepts in analysis and probability theory, which will be of use to me in my intended future
career as a mathematician. This work has also improved my intuition for problem solving and my ability to
read research papers in mathematics.
I am indebted to several organizations and individuals for the successful completion of this thesis.
I would like to thank my adviser, Professor Elton Hsu, for suggesting this project and for his guidance
throughout my work. He struck a perfect balance between providing enough guidance to keep me from
following dead ends and to make sure I had sufficient mathematical background to tackle problems that
arose, and allowing me to explore my own ideas and learn from my mistakes. Moreover, his explanations
have been a great help in building my mathematical intuition, and his suggestions for improvements in my
writing have not only strengthened the exposition in this paper, but have also made me a stronger writer in
general.
I would like to thank Professor Valentino Tosatti for serving as my second reader. His review of this
document and careful, insightful comments on it have been a major help in the writing process.
I would like to thank Professors Joseph Ferrie and William Rogerson of the MMSS program and Professor
Mike Stein of the Math department for their flexibility in allowing me to do a thesis which would work for
both of my majors.
I would like to thank Northwestern University for funding part of my work on this project via an undergraduate research grant in the summer of 2012. The financial independence provided by this grant enabled
me to devote my full attention to research during that summer, and thereby to discover more mathematics
than would otherwise have been possible.
1 Introduction

The standard Gaussian measure on ℝ^n is the measure

    γ^n = (2π)^{−n/2} e^{−|x|^2/2} dx.    (1.1)
In the case n = 1, we write γ^1 = γ. Two of the most fascinating and important properties of this measure are the Poincaré inequality

    ‖f‖_2^2 − (∫_{ℝ^n} f dγ^n)^2 ≤ ‖∇f‖_2^2    (1.2)

and Gross's [16] logarithmic Sobolev inequality

    ∫_{ℝ^n} f^2 log|f| dγ^n − ‖f‖_2^2 log ‖f‖_2 ≤ ‖∇f‖_2^2,    (1.3)
both valid for functions f in the Sobolev space W^{2,1}(γ^n) (see Appendix 6.1 for the definition of this space
and its basic properties). The Poincaré and log-Sobolev inequalities are used throughout pure and applied
mathematics, in fields as diverse as quantum mechanics, mathematical finance, infinite dimensional analysis,
mathematical statistics, stochastic analysis, random matrix theory, and partial differential equations. For
example, the logarithmic Sobolev inequality can be viewed as a sharpened form of Heisenberg’s uncertainty
principle. It is also used to obtain bounds for the solutions of partial differential equations, improve models
of fluctuations in stock prices, and characterize the behavior of Brownian motion on manifolds.
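Neither inequality is proved until later in the text, but both can be sanity-checked numerically. The following sketch (an illustration, not part of the thesis) verifies (1.2) and (1.3) for the one-dimensional standard Gaussian measure by Gauss-Hermite quadrature; the test function f(x) = exp(0.3 sin x) is an arbitrary smooth choice.

```python
import numpy as np

# Numerical sanity check of the Poincare inequality (1.2) and the
# log-Sobolev inequality (1.3) for the 1-D standard Gaussian measure,
# via Gauss-Hermite quadrature.  The test function is arbitrary.
nodes, weights = np.polynomial.hermite.hermgauss(80)
x = np.sqrt(2.0) * nodes        # nodes adapted to dgamma = (2*pi)^(-1/2) e^{-x^2/2} dx
w = weights / np.sqrt(np.pi)    # weights normalized so the total mass is 1

f  = np.exp(0.3 * np.sin(x))    # arbitrary smooth positive test function
fp = 0.3 * np.cos(x) * f        # its derivative

mean_f   = float(np.sum(w * f))
norm2_sq = float(np.sum(w * f**2))
grad_sq  = float(np.sum(w * fp**2))

# Poincare gap: ||f'||_2^2 - ( ||f||_2^2 - (int f dgamma)^2 )
poincare_gap = grad_sq - (norm2_sq - mean_f**2)

# log-Sobolev gap: ||f'||_2^2 - ( int f^2 log|f| dgamma - ||f||_2^2 log||f||_2 )
entropy = float(np.sum(w * f**2 * np.log(f))) - norm2_sq * np.log(np.sqrt(norm2_sq))
logsob_gap = grad_sq - entropy

print(poincare_gap > 0, logsob_gap > 0)
```

Both gaps come out strictly positive, consistent with the fact that equality in (1.2) forces f to be affine, and equality in (1.3) forces f to be an exponential.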
Recall that for 1 ≤ p < ∞, the L^p norm of a measurable function f on ℝ^n with respect to a measure µ on ℝ^n is defined by

    ‖f‖_p = ( ∫_{ℝ^n} |f|^p dµ )^{1/p}.

Here the measure µ will always be clear from the context. Beckner [5] has proven a functional inequality for the standard Gaussian measure which involves the L^p norms for 1 ≤ p ≤ 2:

    ‖f‖_2^2 − ‖f‖_p^2 ≤ (2 − p)‖∇f‖_2^2.    (1.4)
For p = 1, inequality (1.4) is equivalent to the Poincaré inequality, as can be seen for bounded f by adding
a sufficiently large constant C so that f + C is non-negative, and for the general f by approximation by
bounded functions. Furthermore, if we divide both sides of (1.4) by 2 − p and let p → 2, the left side tends
to the left side of (1.3). Thus Beckner’s inequality interpolates between the Poincaré inequality and the
logarithmic Sobolev inequality.
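The interpolation just described can be observed numerically. The sketch below (an illustration, not a proof) checks (1.4) on a grid of p ∈ [1, 2] and confirms that the quotient (‖f‖_2^2 − ‖f‖_p^2)/(2 − p) approaches the entropy term of (1.3) as p → 2; the test function is an arbitrary smooth positive choice.

```python
import numpy as np

# Check of Beckner's inequality (1.4) and of its p -> 2 limit, for the
# 1-D Gaussian measure, via Gauss-Hermite quadrature.
nodes, weights = np.polynomial.hermite.hermgauss(80)
x = np.sqrt(2.0) * nodes
w = weights / np.sqrt(np.pi)

f  = np.exp(0.3 * np.sin(x))   # arbitrary smooth positive test function
fp = 0.3 * np.cos(x) * f       # its derivative

norm2_sq = float(np.sum(w * f**2))
grad_sq  = float(np.sum(w * fp**2))

def lhs(p):
    # left side of (1.4): ||f||_2^2 - ||f||_p^2
    return norm2_sq - float(np.sum(w * f**p)) ** (2.0 / p)

beckner_holds = all(lhs(p) <= (2.0 - p) * grad_sq for p in np.linspace(1.0, 2.0, 21))

# dividing (1.4) by 2 - p and letting p -> 2 recovers the left side of (1.3)
entropy = float(np.sum(w * f**2 * np.log(f))) - norm2_sq * np.log(np.sqrt(norm2_sq))
limit_quotient = lhs(1.9999) / (2.0 - 1.9999)
print(beckner_holds, abs(limit_quotient - entropy) < 1e-3)
```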
Beckner's original proof of inequality (1.4) is based on the explicit spectral decomposition of the Ornstein-Uhlenbeck operator in terms of Hermite polynomials and Nelson's [26] hypercontractivity inequality for the Ornstein-Uhlenbeck semigroup. This latter inequality is a significant result in its own right, and is most
easily proven using the logarithmic Sobolev inequality (see, for example, [9]).
Apparently unaware of Beckner's work, Latala and Oleszkiewicz [21] proved an extension of Beckner's inequality for measures c e^{−|x_1|^r − ... − |x_n|^r} dx with 1 ≤ r ≤ 2. However, in the Gaussian case r = 2, inequality (1.4) was derived from the logarithmic Sobolev inequality and hypercontractivity of the Ornstein-Uhlenbeck semigroup, via an argument similar to that in [5]. In Section 3.1 of [22], Ledoux used non-linear PDE
to prove a version of (1.4) for the invariant probability measures of Markov semigroups whose generators
satisfy a curvature-dimension inequality; in the Gaussian case, this inequality reduces to a sharpened form
of (1.4), with the right side multiplied by (n − 1)/n and the parameter p allowed to increase to 2n/(n − 1).
Several other authors have also studied generalizations of Beckner’s inequality in various directions; see
for example [1], [2], [4], [11], [22], and [30]. However, the arguments in these papers also require results
comparable in difficulty to hypercontractivity or the logarithmic Sobolev inequality. So, it is instructive to
find a direct proof of Beckner’s inequality.
We shall give three such proofs. For our first proof of Beckner’s inequality, in Chapter 2, we shall
deduce (1.4) via the central limit theorem (Theorem 2.8) and an approximation argument, beginning with
an analogous inequality for a probability measure on a two-point set. This method of proof was suggested as
an alternative approach in [5], and is similar to Gross’ original proof of the logarithmic Sobolev inequality.
In fact, approximation arguments of this sort are pervasive in probability theory. So, this proof illustrates
an important technique. In the course of the proof, we prove an important tensorial property of inequality
(1.4), which allows one to deduce the n-dimensional inequality from the 1-dimensional one, and implies that
the inequality can be extended to infinite dimensional Gaussian measures.
Our second proof, in Chapter 3, uses the elementary properties of the Ornstein-Uhlenbeck operator and
its associated semigroup, introduced in Sections 3.1 and 3.2. The Ornstein-Uhlenbeck operator satisfies a
special integration by parts formula (Proposition 3.3), and its semigroup preserves Gaussian integrals. These
two properties make the Ornstein-Uhlenbeck operator a natural tool for proving inequalities of the sort we
study here. Our method actually yields a new, more general version of inequality (1.4), valid with the exponent 2 replaced by any q > 2:

    ‖f‖_q^2 − ‖f‖_p^2 ≤ (q − p)‖∇f‖_q^2,    f ∈ W^{q,1}(µ),  q ≥ 2,  1 ≤ p ≤ 2.    (1.5)
Our third proof, given in Section 3.4, replaces the Ornstein-Uhlenbeck semigroup with the better known
classical heat semigroup, and also yields the extended inequality (1.5).
Beginning in Chapter 4 we shall concern ourselves with log-concave probability measures, a natural
generalization of Gaussian measures. We will study analogues of the Ornstein-Uhlenbeck operator and
semigroup in this more general setting. We will then use them to extend our proof of (1.5) to prove an
analogue of this inequality for general log-concave probability measures on Rn , with a multiplicative constant
depending on the measure appearing on the right side.
In Chapter 5, we will study several other inequalities for log-concave probability measures, often using the
semigroup of Subsection 4.1. In Section 5.1, we will explore the implication relationships between inequality (1.5) and a generalized logarithmic Sobolev inequality:

    ‖f‖_q^{2−q} ∫_{ℝ^n} |f|^q log|f| dµ − log(‖f‖_q)‖f‖_q^2 ≤ C‖∇f‖_q^2,    f ∈ W^{q,1}(µ),  q ≥ 2,

in the context of a general log-concave probability measure µ on ℝ^n.
In Section 5.2, we will prove an inequality for the semigroup associated to a log-concave probability
measure, which extends an inequality which Beckner derived along with (1.4) in [5].
In Section 5.3 we will use semigroup methods to prove a sharpened Poincaré inequality for log-concave probability measures due to Brascamp and Lieb [10]:

    ‖f‖_2^2 − (∫_{ℝ^n} f dµ)^2 ≤ ∫_{ℝ^n} ⟨(D^2 v)^{−1} ∇f, ∇f⟩ dµ.
In the course of the proof, we also obtain invertibility of our generalization of the Ornstein-Uhlenbeck
operator on the space of functions in L^2(µ) with vanishing mean (Proposition 5.14).
In Section 5.4, we shall prove an analogous sharpened logarithmic Sobolev inequality due to Bobkov and
Ledoux [7], under a stronger hypothesis on the measure µ; and use the Herbst argument (Proposition 5.21)
to give counterexamples which show that this inequality cannot hold in general.
The inequalities we study here have potential uses in many different fields. To illustrate their applicability,
we shall discuss some of their potential applications in economics in Appendix 6.3.
2 Proof of Beckner's Inequality via the Central Limit Theorem

The main goal of this chapter is to prove the following theorem.

Theorem 2.1 (Beckner's Inequality). If f ∈ W^{2,1}(γ^n), then for each p ∈ (1, 2),

    ‖f‖_2^2 − ‖f‖_p^2 ≤ (2 − p)‖∇f‖_2^2.    (2.1)
We shall first establish a tensorial property of the inequality (2.1), then proceed by way of an approximation argument using the central limit theorem and an analogous inequality for a probability measure on a two-point set. This method of proof is inspired by Gross' [16] original proof of the logarithmic Sobolev inequality, and was suggested in [5].
2.1 Tensorial property
In this section, we shall prove a tensorial property of Beckner’s inequality, in the setting of an arbitrary
probability measure. This property will allow us to pass from an inequality for a measure on a two-point set
to the n-fold convolution of such a measure in order to apply the central limit theorem in the next section.
It will also allow us to deduce the n-dimensional case of Beckner’s inequality from the 1-dimensional case:
Theorem 2.2 (Tensorial property). Let µ be a probability measure on a set Ω. Let 1 ≤ p ≤ 2. Suppose that F is a subspace of L^2(µ), B is a bilinear form on F, and µ satisfies an inequality

    ‖f‖_2^2 − ‖f‖_p^2 ≤ C(2 − p)B(f, f),    f ∈ F    (2.2)

for some constant C > 0. Then the n-fold Cartesian product measure µ^n satisfies the inequality

    ‖f‖_2^2 − ‖f‖_p^2 ≤ C(2 − p)B̃_n(f, f),    f ∈ F̃^n,

where F̃^n is the space of functions f on Ω^n such that x_i ↦ f(x_1, ..., x_n) ∈ F for each fixed x_1, ..., x_{i−1}, x_{i+1}, ..., x_n ∈ Ω, and

    B̃_n(f, g) := ∫_{Ω^n} Σ_{i=1}^n B[f(x_1, ..., x_{i−1}, ·, x_{i+1}, ..., x_n), g(x_1, ..., x_{i−1}, ·, x_{i+1}, ..., x_n)] dµ^n(x).    (2.3)
In words, the bilinear form B̃_n is defined as follows: for each index i, we apply the form B to the function on Ω given by x_i ↦ f(x_1, ..., x_{i−1}, x_i, x_{i+1}, ..., x_n) with the variables other than the ith held fixed. Then, we sum over all i. Allowing the coordinates we had previously held fixed to vary, this gives a function on Ω^n. We then integrate this function over Ω^n.
To see why we should care about this particular bilinear form, consider the most important case of Theorem 2.2, namely where Ω = ℝ, F = C_c^∞(ℝ), and B(f, g) = ∫_ℝ f′g′ dµ. Here we have

    B[f(x_1, ..., x_{i−1}, ·, x_{i+1}, ..., x_n), g(x_1, ..., x_{i−1}, ·, x_{i+1}, ..., x_n)] = ∫_ℝ ∂_i f(x_1, ..., x_n) ∂_i g(x_1, ..., x_n) dµ(x_i)

and so

    B̃_n(f, g) = ∫_{ℝ^n} ⟨∇f, ∇g⟩ dµ^n.    (2.4)

Thus, in the case of a probability measure on ℝ, Theorem 2.2 tells us that Beckner's inequality (2.1) in dimension one implies Beckner's inequality in dimension n for f ∈ C_c^∞(ℝ^n), and hence, by density (see Appendix 6.1), also for f ∈ W^{2,1}(µ). In particular, it will suffice to prove Theorem 2.1 only in the one-dimensional case.
Our proof of Theorem 2.2 is based on an argument by Latala and Oleszkiewicz [21]. We first need a result which characterizes the expression on the left side of (2.2).

Lemma 2.3. Let µ be a probability measure on a set Ω. Let q ∈ [1, 2]. For f ∈ L^2(µ), f ≥ 0, set

    Φ(f) = ∫_Ω f^q dµ − (∫_Ω f dµ)^q.

Then Φ is a convex functional on the non-negative functions in L^2(µ), i.e. for any t ∈ [0, 1], if f, g ≥ 0, then

    Φ(tf + (1 − t)g) ≤ tΦ(f) + (1 − t)Φ(g).    (2.5)
Proof. First suppose that f and g satisfy A ≥ f, g ≥ a for constants a, A > 0. We need to show that

    α(t) := Φ(tf + (1 − t)g) − tΦ(f) − (1 − t)Φ(g) ≤ 0

for each t ∈ [0, 1]. By our hypotheses on f and g, α is a twice-differentiable function of t. We have α(0) = α(1) = 0. Therefore, if α(t) is positive for some t ∈ [0, 1], then α attains a positive maximum in [0, 1]. So, it suffices to show that there can be no such maximum. For this, it is enough to show that α″(t) ≥ 0. Using dominated convergence to differentiate under the integral sign, we compute

    α″(t) = (d^2/dt^2) Φ(tf + (1 − t)g)
          = q(q − 1) ∫_Ω (tf + (1 − t)g)^{q−2}(f − g)^2 dµ − q(q − 1) (∫_Ω (tf + (1 − t)g) dµ)^{q−2} (∫_Ω (f − g) dµ)^2.

Thus, we need to show that

    (∫_Ω (tf + (1 − t)g)^{q−2}(f − g)^2 dµ) (∫_Ω (tf + (1 − t)g) dµ)^{2−q} ≥ (∫_Ω (f − g) dµ)^2.    (2.6)

Fix t and set h = (tf + (1 − t)g)^{2−q}. By Hölder's inequality,

    (∫_Ω (f − g) dµ)^2 = (∫_Ω ((f − g)/√h) √h dµ)^2 ≤ (∫_Ω ((f − g)^2/h) dµ)(∫_Ω h dµ)
        = (∫_Ω (tf + (1 − t)g)^{q−2}(f − g)^2 dµ)(∫_Ω (tf + (1 − t)g)^{2−q} dµ).

But, since x ↦ x^{2−q} is concave on [0, ∞), we have

    ∫_Ω (tf + (1 − t)g)^{2−q} dµ ≤ (∫_Ω (tf + (1 − t)g) dµ)^{2−q}.

Plugging this into the last line proves (2.6).

For general non-negative f and g in L^2, one can find sequences (f_j) and (g_j) which are bounded above and bounded away from 0 and which converge to f and g, respectively, in the L^2 norm. Taking the limit in (2.5) for f_j and g_j gives the result for f and g.
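Since Lemma 2.3 concerns an arbitrary probability space, it is easy to stress-test on a finite one. The following Monte-Carlo sketch (not part of the thesis; the five-point uniform measure is an arbitrary choice) checks the convexity inequality (2.5) for randomly sampled f, g, q, and t.

```python
import random

# Monte-Carlo check of Lemma 2.3 on a finite probability space:
# Omega = {0,...,4} with the uniform measure.
random.seed(0)
mu = [0.2] * 5

def Phi(h, q):
    # Phi(h) = int h^q dmu - (int h dmu)^q, for h >= 0
    return sum(m * v**q for m, v in zip(mu, h)) - sum(m * v for m, v in zip(mu, h)) ** q

convex_ok = True
for _ in range(2000):
    q = random.uniform(1.0, 2.0)
    t = random.random()
    f = [random.uniform(0.1, 5.0) for _ in range(5)]
    g = [random.uniform(0.1, 5.0) for _ in range(5)]
    mix = [t * a + (1 - t) * b for a, b in zip(f, g)]
    # convexity (2.5), up to floating-point tolerance
    convex_ok = convex_ok and Phi(mix, q) <= t * Phi(f, q) + (1 - t) * Phi(g, q) + 1e-12
print(convex_ok)
```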
Lemma 2.4. Let µ be a probability measure on a set Ω. Then if q ∈ (1, 2) and f ∈ L^2(µ^n) is non-negative,

    ∫_{Ω^n} f^q dµ^n − (∫_{Ω^n} f dµ^n)^q ≤ ∫_{Ω^n} Σ_{i=1}^n [ ∫_Ω f^q dµ_i − (∫_Ω f dµ_i)^q ] dµ^n,

where

    ∫_Ω f dµ_i := ∫_Ω f(x_1, ..., x_{i−1}, x_i, x_{i+1}, ..., x_n) dµ(x_i).
Proof. First observe that the formula concerns only the absolute value of f, so we may assume that f ≥ 0. Suppose first that n = 2, in which case the desired formula is

    ∫_{Ω×Ω} f^q dµ_1 dµ_2 − (∫_Ω ∫_Ω f dµ_1 dµ_2)^q
        ≤ ∫_Ω ∫_Ω [ ∫_Ω f^q dµ_1 − (∫_Ω f dµ_1)^q + ∫_Ω f^q dµ_2 − (∫_Ω f dµ_2)^q ] dµ_1 dµ_2.    (2.7)

Note that since µ_1 and µ_2 are probability measures, integrating a second time with respect to the same measure has no effect. So, we can rearrange terms to obtain the equivalent inequality

    ∫_Ω (∫_Ω f dµ_2)^q dµ_1 − (∫_Ω ∫_Ω f dµ_1 dµ_2)^q ≤ ∫_Ω ∫_Ω f^q dµ_1 dµ_2 − ∫_Ω (∫_Ω f dµ_1)^q dµ_2.    (2.8)

Define Φ as in Lemma 2.3. We apply the general form of Jensen's inequality, with the L^1(Ω)-valued random variable y ↦ f(·, y) and the convex function Φ on L^1(Ω), to obtain

    Φ(∫_Ω f(·, y) dµ_2(y)) ≤ ∫_Ω Φ(f(·, y)) dµ_2(y).

Expanding both sides of this inequality, we obtain (2.8). This proves the result in the n = 2 case. The general case follows by a simple induction argument.
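The n = 2 case of Lemma 2.4, which drives the induction, can be checked by brute force on a two-point space. The sketch below (not from the text) samples random positive functions on Ω^2 with Ω = {−1, 1} and the uniform measure, and compares the two sides of the displayed inequality.

```python
import random

# Monte-Carlo check of the n = 2 case of Lemma 2.4 on the two-point
# space Omega = {-1, 1} with the uniform measure mu({-1}) = mu({1}) = 1/2.
random.seed(1)
PTS = [(a, b) for a in (-1, 1) for b in (-1, 1)]

def check_once():
    q = random.uniform(1.0, 2.0)
    f = {p: random.uniform(0.1, 3.0) for p in PTS}
    # left side: int f^q dmu^2 - (int f dmu^2)^q
    lhs = sum(f[p]**q for p in PTS) / 4 - (sum(f[p] for p in PTS) / 4) ** q
    # right side: integrate both one-variable deficits over mu^2
    rhs = 0.0
    for a, b in PTS:
        d1 = (f[(-1, b)]**q + f[(1, b)]**q) / 2 - ((f[(-1, b)] + f[(1, b)]) / 2) ** q
        d2 = (f[(a, -1)]**q + f[(a, 1)]**q) / 2 - ((f[(a, -1)] + f[(a, 1)]) / 2) ** q
        rhs += (d1 + d2) / 4
    return lhs <= rhs + 1e-12

all_ok = all(check_once() for _ in range(2000))
print(all_ok)
```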
Proof of Theorem 2.2. Let f ∈ F̃^n. Apply Lemma 2.4 with q = 2/p to the function |f|^p to find

    ‖f‖_2^2 − ‖f‖_p^2 ≤ ∫_{Ω^n} Σ_{i=1}^n [ ∫_Ω f^2 dµ_i − (∫_Ω |f|^p dµ_i)^{2/p} ] dµ^n.    (2.9)

By hypothesis, for fixed x_1, ..., x_{i−1}, x_{i+1}, ..., x_n ∈ Ω, each term in the sum on the right side of (2.9) is bounded above by

    C(2 − p)B[f(x_1, ..., x_{i−1}, ·, x_{i+1}, ..., x_n), f(x_1, ..., x_{i−1}, ·, x_{i+1}, ..., x_n)].

Summing over all i and integrating, we get that the right side of (2.9) is bounded above by

    C(2 − p) ∫_{Ω^n} Σ_{i=1}^n B[f(x_1, ..., x_{i−1}, ·, x_{i+1}, ..., x_n), f(x_1, ..., x_{i−1}, ·, x_{i+1}, ..., x_n)] dµ^n(x) = C(2 − p)B̃_n(f, f).
Remark 2.5. The tensorial property proven in Lemma 2.4 also allows one to extend Beckner’s inequality
to infinite dimensional Gaussian measures. See [9] for a discussion of these measures.
Remark 2.6. Later, we shall prove an inequality for log-concave probability measures of the form

    ‖f‖_q^2 − ‖f‖_p^2 ≤ C(q − p)‖∇f‖_q^2,    q ≥ 2,  1 ≤ p ≤ q,  f ∈ W^{q,1}(µ).
In the case q > 2, this inequality does not, to our knowledge, possess the tensorial property of this section.
We must therefore prove this inequality in all dimensions, rather than just deducing the n-dimensional case
from the one-dimensional case as we do for Beckner’s inequality below.
2.2 Two-point inequality

Let X^n = {−1, 1}^n, with the uniform probability measure m^n which assigns measure 1/2^n to each point. In this subsection, we shall prove an inequality for X = X^1 which is analogous to that in Theorem 2.1.
This inequality will eventually enable us to deduce Theorem 2.1 in the one-dimensional case via a limiting
argument based on the central limit theorem (Theorem 2.8).
Proposition 2.7. Define a bilinear form on the functions on X by

    B(f, g) = (1/4)(f(1) − f(−1))(g(1) − g(−1)).    (2.10)

Then for any function f on X,

    ‖f‖_2^2 − ‖f‖_p^2 ≤ (2 − p)B(f, f),    (2.11)

where here the norms are with respect to m.
Proof. We begin with some reductions. Every function on X is of the form f(x) = ax + b for constants a and b. If b = 0, then |f| is constant, so the left side of (2.11) vanishes and the result is obvious. So, we may assume that b ≠ 0. Then since the desired inequality is invariant under rescaling, we may take f(x) = f_a(x) = ax + 1 for some a. The measure m is symmetric about the origin, so it suffices to assume that a ≥ 0. Move all of the terms in (2.11) to one side and treat the difference as a function of a:

    φ(a) = (2 − p)B(f_a, f_a) − (‖f_a‖_2^2 − ‖f_a‖_p^2).

We shall show that φ ≥ 0, whence the desired inequality. The proof is essentially a direct calculation. We treat two cases.
Case I: 0 ≤ a ≤ 1. Then

    φ(a) = −(p − 1)a^2 − 1 + ( ((1 + a)^p + (1 − a)^p)/2 )^{2/p}.

Observe that φ(0) = 0. We shall show that φ′(0) = 0 and that φ is convex on [0, 1], so has a unique minimum value of 0 at the origin. We have

    φ′(a) = −2(p − 1)a + ‖f_a‖_p^{2−p} ((1 + a)^{p−1} − (1 − a)^{p−1})    (2.12)

and φ′(0) = 0. Moreover,

    φ″(a) = −2(p − 1) + ((2 − p)/2)‖f_a‖_p^{2−2p} ((1 + a)^{p−1} − (1 − a)^{p−1})^2 + (p − 1)‖f_a‖_p^{2−p} ((1 + a)^{p−2} + (1 − a)^{p−2}).    (2.13)

The second term here is always non-negative. By Jensen's inequality,

    ‖f_a‖_p^{2−p} ≥ ((1 + a + 1 − a)/2)^{2−p} = 1.

By convexity of the function x ↦ x^{p−2} for x > 0,

    (1 + a)^{p−2} + (1 − a)^{p−2} ≥ 2((1 + a + 1 − a)/2)^{p−2} = 2.

Thus the third term in (2.13) satisfies

    (p − 1)‖f_a‖_p^{2−p} ((1 + a)^{p−2} + (1 − a)^{p−2}) ≥ 2(p − 1).

Therefore φ″ ≥ 0 on [0, 1], so φ is convex on [0, 1], as desired.
Case II: a ≥ 1. Then

    φ(a) = −(p − 1)a^2 − 1 + ( ((1 + a)^p + (a − 1)^p)/2 )^{2/p}.

We know from the previous case that φ(1) ≥ 0, so to show that φ is non-negative it suffices to show that φ′(a) ≥ 0 for a ≥ 1. We compute

    φ′(a) = −2(p − 1)a + ‖f_a‖_p^{2−p} ((1 + a)^{p−1} + (a − 1)^{p−1}).    (2.14)

For the first factor of the second term in (2.14), we have by Jensen's inequality that

    ‖f_a‖_p^{2−p} ≥ ((1 + a + a − 1)/2)^{2−p} = a^{2−p}.    (2.15)

We claim that the second factor in (2.14) satisfies

    (1 + a)^{p−1} + (a − 1)^{p−1} ≥ 2(p − 1)a^{p−1}.    (2.16)

If we can demonstrate this, then (2.14), (2.15), and (2.16) will imply

    φ′(a) ≥ −2(p − 1)a + a^{2−p} · 2(p − 1)a^{p−1} = 0,

which is what we need to show.

To prove the claim, set

    ψ(a) = (1 + a)^{p−1} + (a − 1)^{p−1} − 2(p − 1)a^{p−1}.    (2.17)
We must show that ψ is non-negative on [1, ∞). It suffices to show that ψ(1) ≥ 0 and that ψ is increasing on [1, ∞). We have ψ(1) = 2^{p−1} − 2(p − 1) ≥ 0. Furthermore,

    ψ′(a) = (p − 1)((1 + a)^{p−2} + (a − 1)^{p−2} − 2(p − 1)a^{p−2})
          ≥ (p − 1)((a + 1)^{p−2} + (a − 1)^{p−2} − 2a^{p−2}).

The function x ↦ x^{p−2} is convex on (0, ∞), so

    (a + 1)^{p−2} + (a − 1)^{p−2} ≥ 2((a + 1)/2 + (a − 1)/2)^{p−2} = 2a^{p−2},

and hence ψ′(a) ≥ 0 and ψ is increasing in a. It follows that ψ(a) ≥ ψ(1) ≥ 0 for a ≥ 1. As we remarked above, this proves (2.16), and completes the proof of (2.11).
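Because every function on X is affine, inequality (2.11) reduces to a two-parameter family of scalar inequalities, which can be checked on a grid. The following sketch (an illustration, not part of the proof) verifies (2.11) for f(x) = ax + 1 over a range of a ≥ 0 and p ∈ [1, 2].

```python
import numpy as np

# Grid check of the two-point inequality (2.11): for f(x) = a*x + 1 on
# X = {-1, 1} with the uniform measure,
#   ||f||_2^2 - ||f||_p^2 <= (2 - p) * (f(1) - f(-1))^2 / 4.
ok = True
for p in np.linspace(1.0, 2.0, 51):
    for a in np.linspace(0.0, 10.0, 201):
        norm2_sq  = ((1 + a)**2 + (1 - a)**2) / 2                 # = 1 + a^2
        norm_p_sq = ((abs(1 + a)**p + abs(1 - a)**p) / 2) ** (2 / p)
        B = (2 * a)**2 / 4                                        # B(f_a, f_a) = a^2
        ok = ok and (norm2_sq - norm_p_sq <= (2 - p) * B + 1e-10)
print(ok)
```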
2.3 First proof of Beckner's inequality
Theorem 2.8 (Central Limit Theorem). Let µ be a probability measure on ℝ, normalized so that ∫_ℝ x dµ(x) = 0 and ∫_ℝ x^2 dµ(x) = 1. Define a function φ_n on ℝ^n by

    φ_n(x_1, ..., x_n) = (1/√n) Σ_{j=1}^n x_j.

Let µ^n be the product of n copies of µ on ℝ^n, and let µ^{n∗} be the law of φ_n, i.e. the probability measure on ℝ such that

    µ^{n∗}(a, b] = µ^n{x ∈ ℝ^n : a < φ_n(x) ≤ b}

for each interval (a, b]. Then as n → ∞,

    µ^{n∗}(a, b] → γ(a, b]

for each interval (a, b], and

    ∫_ℝ g dµ^{n∗} → ∫_ℝ g dγ

for any bounded continuous function g. In other words, µ^{n∗} → γ in the weak-* topology on the set of finite measures on ℝ.
A proof of the central limit theorem can be found in Ch. 3 of [13]. In probabilistic terms, the central
limit theorem is the statement that the normalized means φn of a sequence of independent identically
distributed random variables converge in distribution to a variable with a Gaussian distribution. In practical
applications, the central limit theorem is used to estimate the probability that the mean of large sample lies
in a given range, using known probabilities for the Gaussian measure. It is the fundamental tool behind
most elementary hypothesis tests and confidence intervals.
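For the measure m of Section 2.2, Theorem 2.8 can be illustrated exactly, since the law of φ_n is a centered and rescaled binomial distribution. The sketch below (not from the text) compares m^{n∗}(a, b] with γ(a, b] for one arbitrarily chosen interval; by the Berry-Esseen theorem, the error at n = 4000 is below the stated tolerance.

```python
import math

# Numerical illustration of Theorem 2.8: the law m^{n*} of
# (x_1 + ... + x_n)/sqrt(n), with x_j = +-1 uniform and independent,
# approaches the standard Gaussian measure gamma.
def m_nstar(n, a, b):
    # m^{n*}((a, b]) computed exactly from the binomial distribution
    total = 0.0
    for k in range(n + 1):
        s = (2 * k - n) / math.sqrt(n)   # value of phi_n when k coordinates equal +1
        if a < s <= b:
            total += math.comb(n, k) / 2**n
    return total

def gamma_interval(a, b):
    Phi = lambda t: 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))
    return Phi(b) - Phi(a)

a, b = -1.0, 0.5                         # arbitrary interval
err = abs(m_nstar(4000, a, b) - gamma_interval(a, b))
print(err < 0.02)
```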
We shall apply the central limit theorem with the measure m from Section 2.2. Note that we can and do view m as a measure on ℝ, which assigns zero measure to sets which do not contain 1 or −1. First we apply Theorem 2.2 in this setting. For B̃_n as in (2.3), we have

    B̃_n(f, f) = ∫_{X^n} Σ_{i=1}^n B[f(x_1, ..., x_{i−1}, ·, x_{i+1}, ..., x_n), f(x_1, ..., x_{i−1}, ·, x_{i+1}, ..., x_n)] dm^n
               = (1/4) ∫_{X^n} Σ_{i=1}^n (f(x_1, ..., x_{i−1}, 1, x_{i+1}, ..., x_n) − f(x_1, ..., x_{i−1}, −1, x_{i+1}, ..., x_n))^2 dm^n.    (2.18)

By Proposition 2.7 and Theorem 2.2,

    ‖f‖_{L^2(m^n)}^2 − ‖f‖_{L^p(m^n)}^2 ≤ (2 − p)B̃_n(f, f),    1 ≤ p ≤ 2    (2.19)

for each function f on X^n.
Now suppose that f is a function on ℝ. For each integer n, define a function f_n on X^n by

    f_n(x_1, ..., x_n) = f((1/√n) Σ_{j=1}^n x_j).    (2.20)

In the notation of Theorem 2.8, f_n = f ∘ φ_n, so the theorem suggests that f_n will converge in some sense to f. We seek a simple expression for B̃_n(f_n, f_n). Fix an integer i = 1, ..., n, and define X^n_{−i} to be the product of the n − 1 copies of X in X^n other than the ith. Write Σy for the sum of the n − 1 coordinates of y ∈ X^n_{−i}. Then the integral of the ith summand in (2.18) is

    ∫_{X^n} (f_n(x_1, ..., x_{i−1}, 1, x_{i+1}, ..., x_n) − f_n(x_1, ..., x_{i−1}, −1, x_{i+1}, ..., x_n))^2 dm^n
        = (1/2^{n−1}) Σ_{y ∈ X^n_{−i}} ( f((Σy + 1)/√n) − f((Σy − 1)/√n) )^2.    (2.21)

If the tuple y has k ones and n − 1 − k negative ones, then Σy = 2k − n + 1. For each k = 0, ..., n − 1, there are (n − 1 choose k) tuples y ∈ X^n_{−i} satisfying this condition. Thus the integral in (2.21) equals

    (1/2^{n−1}) Σ_{k=0}^{n−1} (n − 1 choose k) ( f((2k − n)/√n + 2/√n) − f((2k − n)/√n) )^2.

Observe that this is independent of i. We therefore have

    B̃_n(f_n, f_n) = (n/2^{n+1}) Σ_{k=0}^{n−1} (n − 1 choose k) ( f((2k − n)/√n + 2/√n) − f((2k − n)/√n) )^2
                  = (1/2^{n−1}) Σ_{k=0}^{n−1} (n − 1 choose k) ( ( f((2k − n)/√n + 2/√n) − f((2k − n)/√n) ) / (2/√n) )^2.    (2.22)
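Formula (2.22) expresses B̃_n(f_n, f_n) as a finite sum, so its convergence to ∫(f′)^2 dγ (established rigorously in the proof of Theorem 2.1 below) can be observed directly. The sketch below (not from the text) evaluates (2.22) for the test function f = sin, for which ∫(f′)^2 dγ = ∫cos^2 dγ = (1 + e^{−2})/2.

```python
import math

# Direct evaluation of formula (2.22), compared with int (f')^2 dgamma
# for f = sin, where int cos^2 dgamma = (1 + e^{-2})/2.
def B_tilde(n, f):
    # right side of (2.22)
    s = 0.0
    for k in range(n):
        t = (2 * k - n) / math.sqrt(n)
        s += math.comb(n - 1, k) * (f(t + 2 / math.sqrt(n)) - f(t)) ** 2
    return n * s / 2 ** (n + 1)

target = (1 + math.exp(-2)) / 2
approx = B_tilde(400, math.sin)
print(abs(approx - target) < 0.02)
```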
We are now ready to prove the one-dimensional version of Theorem 2.1.
Proof of Theorem 2.1. By Theorem 2.2 and the remarks following it, it will suffice to prove Theorem 2.1 in the case n = 1. Furthermore, since C_c^∞(ℝ^n) is dense in W^{2,1}(µ) (c.f. Appendix 6.1), it suffices to prove the result for f ∈ C_c^∞(ℝ). Define f_n as in (2.20). We shall deduce (2.1) as the limit as n → ∞ of (2.11) for f_n. The functions f^2 and |f|^p are continuous with compact support, so by Theorem 2.8, as n → ∞,

    ‖f_n‖_{L^2(m^n)} = ‖f‖_{L^2(m^{n∗})} → ‖f‖_{L^2(γ)},    ‖f_n‖_{L^p(m^n)} = ‖f‖_{L^p(m^{n∗})} → ‖f‖_{L^p(γ)}.    (2.23)

This proves convergence of the left side of (2.11).

It remains to treat the right side. Let g_n be the function on ℝ which equals

    ( ( f((2k − n)/√n + 2/√n) − f((2k − n)/√n) ) / (2/√n) )^2

on [(2k − n)/√n, (2(k + 1) − n)/√n) for each k = 0, ..., n − 1, and which equals zero elsewhere.

Let m^{n∗} be the law of φ_n, as in Theorem 2.8. By elementary probability theory, the measure m^{(n−1)∗} is, up to centering and rescaling, a binomial distribution with parameters n − 1 and 1/2. Therefore

    ∫_ℝ g_n dm^{(n−1)∗} = Σ_{k=0}^{n−1} (1/2^{n−1}) (n − 1 choose k) ( ( f((2k − n)/√n + 2/√n) − f((2k − n)/√n) ) / (2/√n) )^2 = B̃_n(f_n, f_n).

On the other hand, g_n is the square of a difference quotient of f on each interval [(2k − n)/√n, (2(k + 1) − n)/√n). Since f ∈ C_c^∞(ℝ), it is readily verified that g_n → (f′)^2 uniformly as n → ∞. Thus, given ε > 0, we may choose N so large that |g_n − (f′)^2| < ε uniformly over ℝ whenever n ≥ N. By the central limit theorem, we may choose N′ ≥ N such that

    | ∫_ℝ (f′)^2 dm^{(n−1)∗} − ∫_ℝ (f′)^2 dγ | < ε

whenever n ≥ N′. Hence for each such n,

    | B̃_n(f_n, f_n) − ∫_ℝ (f′)^2 dγ | ≤ ∫_ℝ |g_n − (f′)^2| dm^{(n−1)∗} + | ∫_ℝ (f′)^2 dm^{(n−1)∗} − ∫_ℝ (f′)^2 dγ | < 2ε,

whence B̃_n(f_n, f_n) → ∫_ℝ (f′)^2 dγ. Thus, if we take the limit in (2.11) for f_n, we get (2.1).

3 Extended Beckner Inequality via Semigroup Methods
In this chapter, we introduce the Ornstein-Uhlenbeck operator and its associated semigroup, and later the
classical heat semigroup, and use them to prove an extended version of Beckner’s inequality. The following
bit of notation will be useful here and in the remainder of the paper. Let q ≥ 1. If 1 ≤ p ≤ q, we say that a
probability measure µ on ℝ^n satisfies inequality Bec(q, p) with constant C if for each f ∈ W^{q,1}(µ),

    ‖f‖_q^2 − ‖f‖_p^2 ≤ C(q − p)‖∇f‖_q^2.    (3.1)

Notice that Bec(2, p) for 1 ≤ p ≤ 2 is Beckner's original inequality. Our goal is to prove the following.

Theorem 3.1. The measure γ^n satisfies Bec(q, p) with constant 1 whenever q ≥ 2 and 1 ≤ p ≤ q. Furthermore, equality holds if and only if f is constant a.e.
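Theorem 3.1 is proved later in this chapter, but Bec(q, p) with constant 1 can already be sanity-checked by quadrature. The sketch below (an illustration, not a proof) tests (3.1) in one dimension for several pairs (q, p); the test function is an arbitrary smooth choice.

```python
import numpy as np

# Quadrature check of Bec(q, p) with constant 1 for the 1-D standard
# Gaussian measure: ||f||_q^2 - ||f||_p^2 <= (q - p) ||f'||_q^2.
nodes, weights = np.polynomial.hermite.hermgauss(80)
x = np.sqrt(2.0) * nodes
w = weights / np.sqrt(np.pi)

f  = np.exp(0.4 * np.sin(x))   # arbitrary smooth test function
fp = 0.4 * np.cos(x) * f       # its derivative

def norm_sq(g, r):
    # squared L^r(gamma) norm ||g||_r^2
    return float(np.sum(w * np.abs(g)**r)) ** (2.0 / r)

bec_ok = all(
    norm_sq(f, q) - norm_sq(f, p) <= (q - p) * norm_sq(fp, q) + 1e-9
    for q in [2.0, 2.5, 3.0, 4.0]
    for p in np.linspace(1.0, q, 9)
)
print(bec_ok)
```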
3.1 The Ornstein-Uhlenbeck operator
In this section, we will introduce the Ornstein-Uhlenbeck operator and discuss the basic theory of semigroups.
A more detailed discussion of the Ornstein-Uhlenbeck operator can be found in [9]. A more detailed discussion
of semigroups can be found in [8].
Definition 3.2. The Ornstein-Uhlenbeck operator is the densely defined unbounded operator A : L^2(γ^n) → L^2(γ^n) defined by

    Af(x) = ∆f(x) − ⟨x, ∇f(x)⟩    (3.2)

with domain consisting of those f ∈ L^2(γ^n) such that this formula defines an L^2 function. Here ∆ = Σ_{j=1}^n ∂_j^2 denotes the Laplacian.
The Ornstein-Uhlenbeck operator has a number of interesting properties, as well as applications in partial
differential equations and probability theory. For our purposes, the importance of this operator lies in the
following integration by parts formula, which will play an essential role in what follows.
Proposition 3.3. If Af exists and g ∈ W^{2,1}(γ^n), then

    ∫_{ℝ^n} ⟨∇f, ∇g⟩ dγ^n = − ∫_{ℝ^n} gAf dγ^n.    (3.3)
Proof. By converting to an ordinary Lebesgue integral and integrating by parts, we obtain the elementary formula

    ∫_{ℝ^n} ∂_j f(x) dγ^n(x) = ∫_{ℝ^n} x_j f(x) dγ^n(x),    f ∈ C_c^∞(ℝ^n).    (3.4)

By approximation, the same holds for f ∈ W^{2,1}(γ^n). We now apply this and the product rule to the term ⟨x, g(x)∇f(x)⟩ in each coordinate to get

    ∫_{ℝ^n} gAf dγ^n = ∫_{ℝ^n} ( −⟨x, g(x)∇f(x)⟩ + g(x)∆f(x) ) dγ^n(x)
                     = ∫_{ℝ^n} ( −⟨∇f(x), ∇g(x)⟩ − g(x)∆f(x) + g(x)∆f(x) ) dγ^n(x)
                     = − ∫_{ℝ^n} ⟨∇f, ∇g⟩ dγ^n.
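Formula (3.3) can be confirmed numerically. With polynomial test functions, Gauss-Hermite quadrature is exact, so the two sides agree to rounding error; the choices f(x) = x^3 + x^2 and g(x) = x^3 below are arbitrary.

```python
import numpy as np

# Quadrature check of the integration by parts formula (3.3) in one
# dimension, with Af = f'' - x f'.  Polynomial test functions make the
# quadrature exact; here both sides equal 27 = 9 E[x^4].
nodes, weights = np.polynomial.hermite.hermgauss(40)
x = np.sqrt(2.0) * nodes
w = weights / np.sqrt(np.pi)

f, fp, fpp = x**3 + x**2, 3 * x**2 + 2 * x, 6 * x + 2
g, gp = x**3, 3 * x**2

lhs = float(np.sum(w * fp * gp))              # int f' g' dgamma
rhs = float(-np.sum(w * g * (fpp - x * fp)))  # -int g (Af) dgamma
print(abs(lhs - rhs) < 1e-8)
```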
We are interested in solving the analogue of the heat equation for the operator A, namely

    ∂_t u(t, x) = Au(t, x),    u(0, x) = f(x),    (3.5)

for sufficiently regular functions f ∈ L^2(γ^n). In order to do so, we will need the following notion from functional analysis.
Definition 3.4. A contraction semigroup is a family of operators {T_t : t ≥ 0} on a Banach space X satisfying

1. T_0 is the identity on X;
2. T_t ∘ T_s = T_{t+s};
3. t ↦ T_t is strongly continuous in t: if t_j → t, then for each x ∈ X, T_{t_j} x → T_t x in the norm of X;
4. ‖T_t x‖ ≤ ‖x‖ for all x ∈ X.
Perhaps the simplest nontrivial example of a contraction semigroup is the family of maps on ℝ defined by T_t(x) = e^{−t}x.
Definition 3.5. The infinitesimal generator of {T_t} is the operator S defined by

    S(x) = lim_{t→0} (T_t(x) − x)/t,
with domain D(S) consisting of those x ∈ X such that this limit exists.
In our above example, the infinitesimal generator of x ↦ e^{−t}x is x ↦ −x. In general, abstract theory
(see, for example, Section 7.4 of [14]) implies that the infinitesimal generator of a contraction semigroup
on a Banach space X always exists on a dense subspace of X. However, it is difficult in general to find its
precise domain. For many purposes (including our own), one can settle for a semigroup whose generator is
an extension of a given operator, i.e. one which agrees with the given operator on its domain, but may have
a larger domain.
The relation between the infinitesimal generator and the semigroup is made clear by the following.
Lemma 3.6. If x ∈ D(S), then for each t ≥ 0,

    (d/dt) T_t x = T_t Sx = ST_t x.    (3.6)
Proof. Since t ↦ T_t x is continuous, we have

    (d/dt) T_t x = lim_{s→0} (T_{s+t} x − T_t x)/s = T_t lim_{s→0} (T_s x − x)/s = T_t Sx

and

    (d/dt) T_t x = lim_{s→0} (T_s T_t x − T_t x)/s = ST_t x.
3.2  The Ornstein-Uhlenbeck semigroup
By Lemma 3.6, if we can find a semigroup {T_t} whose infinitesimal generator is an extension of the Ornstein-Uhlenbeck operator A, then u(t, x) = T_t f(x) will be a solution to (3.5). We could simply state the formula
(called the Mehler Formula) for Tt and prove that it has the necessary properties, but for pedagogical reasons
we first make the following informal derivation.
We assume all functions involved are sufficiently regular that our calculations make sense. Taking the
Fourier transform, defined by

ĝ(ξ) = ∫_{ℝⁿ} e^{−i⟨x,ξ⟩} g(x) dx,    g ∈ L¹(ℝⁿ),
in (3.5) gives
∂_t û(t, ξ) = −|ξ|² û(t, ξ) + i Σ_{j=1}^n ∂_{ξ_j}( −i ξ_j û(t, ξ) ) = −|ξ|² û(t, ξ) + Σ_{j=1}^n ξ_j ∂_{ξ_j} û(t, ξ) + n û(t, ξ),

with initial condition û(0, ξ) = f̂(ξ). Equivalently,

∂_t û(t, ξ) − Σ_{j=1}^n ξ_j ∂_{ξ_j} û(t, ξ) + ( |ξ|² − n ) û(t, ξ) = 0,    û(0, ξ) = f̂(ξ).
For fixed ξ₀ ∈ ℝⁿ, the characteristic ODEs for this PDE are

ṫ(s) = 1,    ξ̇(s) = −ξ,    ż(s) = ( n − |ξ|² ) z(s),
t(0) = 0,    ξ(0) = ξ₀,    z(0) = f̂(ξ₀),
where z is the variable we substitute for û. Solving the first two equations and substituting into the third, we obtain

t = s,    ξ(s) = e^{−s} ξ₀,    ż(s) = ( n − |ξ₀|² e^{−2s} ) z(s),    z(0) = f̂(ξ₀).

The last two equations yield the solution

z(s) = e^{ns} e^{−(1/2)|ξ₀|²(1 − e^{−2s})} f̂(ξ₀).
Given (t, ξ) ∈ [0, ∞) × ℝⁿ, we select s ∈ [0, ∞) and ξ₀ ∈ ℝⁿ such that (t, ξ) = (t(s), ξ(s)) = (s, e^{−s} ξ₀), i.e. s = t, ξ₀ = e^t ξ. Then

û(t, ξ) = z(s) = e^{nt} e^{−(1/2)|ξ|² e^{2t}(1 − e^{−2t})} f̂(e^t ξ).

The inverse Fourier transform u(t, x) of this function is e^{nt} times the convolution of the inverse Fourier transform of f̂(e^t ξ) and the inverse Fourier transform of exp(−(1/2)|ξ|² e^{2t}(1 − e^{−2t})). From the elementary properties of the Fourier transform, the inverse Fourier transform of f̂(e^t ξ) is e^{−nt} f(e^{−t} x), and the inverse Fourier transform of exp(−(1/2)|ξ|² e^{2t}(1 − e^{−2t})) is

(2π)^{−n/2} ( e^t √(1 − e^{−2t}) )^{−n} exp( −(1/2)|x|² e^{−2t}(1 − e^{−2t})^{−1} ).

Thus, upon taking the inverse Fourier transform and cancelling the factors e^{nt} and e^{−nt}, we get

u(t, x) = (2π)^{−n/2} ( e^t √(1 − e^{−2t}) )^{−n} ∫_{ℝⁿ} exp( −(1/2)|z|² e^{−2t}(1 − e^{−2t})^{−1} ) f( e^{−t} x − e^{−t} z ) dz.

Substitute z = e^t √(1 − e^{−2t}) y and use the symmetry of the Gaussian to find

u(t, x) = ∫_{ℝⁿ} f( e^{−t} x + √(1 − e^{−2t}) y ) dγⁿ(y).

We are therefore led to the following definition.
Definition 3.7. The Ornstein-Uhlenbeck semigroup is the family of operators {T_t} defined for t ≥ 0 by

T_t f(x) = ∫_{ℝⁿ} f( e^{−t} x + √(1 − e^{−2t}) y ) dγⁿ(y),

for all functions f on ℝⁿ for which this integral exists for a.e. x ∈ ℝⁿ.
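As a numerical sanity check of this definition (a supplementary sketch, not part of the thesis), one can evaluate the one-dimensional Mehler integral with probabilists' Gauss–Hermite quadrature and test the semigroup identity proved below; the helper names here are illustrative.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss

# Probabilists' Gauss-Hermite rule: integrates accurately against N(0, 1)
nodes, weights = hermegauss(60)
weights = weights / weights.sum()          # normalize total mass to 1

def T(t, f):
    """One-dimensional Mehler formula: T_t f(x) = E[f(e^{-t} x + sqrt(1-e^{-2t}) Y)]."""
    c, s = np.exp(-t), np.sqrt(1.0 - np.exp(-2.0 * t))
    def Tf(x):
        x = np.atleast_1d(np.asarray(x, dtype=float))
        vals = np.array([np.dot(weights, f(c * xi + s * nodes)) for xi in x])
        return vals if vals.size > 1 else vals[0]
    return Tf

f = lambda y: np.cos(y) + y**3
assert abs(T(0.3, T(0.5, f))(0.7) - T(0.8, f)(0.7)) < 1e-8     # T_t T_s = T_{t+s}
assert abs(T(1.2, lambda y: y)(0.7) - np.exp(-1.2) * 0.7) < 1e-10  # T_t maps y to e^{-t} y
```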
We claim that {Tt } is a contraction semigroup on Lp , 1 ≤ p ≤ ∞, with infinitesimal generator A. To
prove our claim, we shall require the following elementary lemma.
Lemma 3.8 (Change of Variables Formula). If a² + b² = 1, then for any f ∈ L¹(γⁿ),

∫_{ℝⁿ} f(u) dγⁿ(u) = ∫_{ℝⁿ} ∫_{ℝⁿ} f(ax + by) dγⁿ(x) dγⁿ(y).
Proof. Since γⁿ is a probability measure, we may insert a second integral to get

∫_{ℝⁿ} f(u) dγⁿ(u) = ∫_{ℝⁿ} ∫_{ℝⁿ} f(u) dγⁿ(u) dγⁿ(v) = (2π)^{−n} ∫_{ℝⁿ} ∫_{ℝⁿ} f(u) e^{−|u|²/2 − |v|²/2} du dv.

Make the change of variables u = ax + by, v = bx − ay. Since a² + b² = 1, the Jacobian matrix for this change of variables has determinant of absolute value one. So, by the change of variables formula from elementary calculus, our integral becomes

(2π)^{−n} ∫_{ℝⁿ} ∫_{ℝⁿ} f(ax + by) e^{−|ax+by|²/2 − |bx−ay|²/2} dx dy = (2π)^{−n} ∫_{ℝⁿ} ∫_{ℝⁿ} f(ax + by) e^{−|x|²/2 − |y|²/2} dx dy
    = ∫_{ℝⁿ} ∫_{ℝⁿ} f(ax + by) dγⁿ(x) dγⁿ(y).
From this lemma, we can deduce some basic properties of the Ornstein-Uhlenbeck semigroup.
Proposition 3.9. Let s, t ≥ 0. Then:

1. T_t preserves integrals: if f ∈ L¹(γⁿ), then

∫_{ℝⁿ} T_t f dγⁿ = ∫_{ℝⁿ} f dγⁿ.

2. ‖T_t f‖_p ≤ ‖f‖_p for all p ≥ 1; in particular, T_t is a bounded (equivalently, continuous) map from L^p(γⁿ) to itself.

3. T_t ∘ T_s = T_{t+s}.

4. If f ∈ W^{2,1}(γⁿ), then so is T_t f, and for each k = 1, ..., n, ∂_k(T_t f) = e^{−t} T_t(∂_k f).

5. t ↦ T_t is a continuous function of t. That is, if t_j → t, then for each f ∈ L^p(γⁿ), T_{t_j} f → T_t f in L^p.

6. If f ∈ L²(γⁿ), then lim_{t→∞} T_t f = ∫_{ℝⁿ} f dγⁿ in L^p for any 1 ≤ p < ∞.

Clearly, T₀ is the identity operator, so items 2, 3, and 5 mean that {T_t} is a contraction semigroup on L^p(γⁿ). Item 1 implies that γⁿ is the invariant measure for {T_t}.
Proof. 1. If we take a = e^{−t} and b = √(1 − e^{−2t}), then Lemma 3.8 gives

∫_{ℝⁿ} T_t f dγⁿ = ∫_{ℝⁿ} ∫_{ℝⁿ} f( e^{−t} x + √(1 − e^{−2t}) y ) dγⁿ(y) dγⁿ(x) = ∫_{ℝⁿ} f dγⁿ.    (3.7)
2. From item 1, together with Jensen's inequality, we get for f ∈ L^p(γⁿ) that

‖T_t f‖_p^p = ∫_{ℝⁿ} | ∫_{ℝⁿ} f( e^{−t} x + √(1 − e^{−2t}) y ) dγⁿ(y) |^p dγⁿ(x)
    ≤ ∫_{ℝⁿ} ∫_{ℝⁿ} | f( e^{−t} x + √(1 − e^{−2t}) y ) |^p dγⁿ(y) dγⁿ(x) = ‖f‖_p^p,

which proves item 2.
3. Observe that

T_{t+s} f(x) = ∫_{ℝⁿ} f( e^{−t−s} x + √(1 − e^{−2t−2s}) y ) dγⁿ(y).

Apply Lemma 3.8 in the y variable, with

a = e^{−s} √(1 − e^{−2t}) / √(1 − e^{−2t−2s}),    b = √(1 − e^{−2s}) / √(1 − e^{−2t−2s}).

This gives us

T_{t+s} f(x) = ∫_{ℝⁿ} ∫_{ℝⁿ} f( e^{−t} e^{−s} x + e^{−s} √(1 − e^{−2t}) w + √(1 − e^{−2s}) z ) dγⁿ(w) dγⁿ(z) = T_t T_s f(x).
4. This can be obtained for f ∈ C ∞ (Rn ) by differentiating under the integral sign, and follows for the
general f ∈ W 2,1 (γ n ) by the usual density argument.
5. Suppose tj → t. First consider f ∈ Cc (Rn ). Then Ttj f → Tt f a.e., and it follows from dominated
convergence that Ttj f → Tt f in Lp (γ n ). For the general f ∈ Lp (γ n ), one can find a sequence (fi ) ∈
Cc (Rn ) converging to f in Lp (γ n ). Then
kTtj f − Tt f kp ≤ kTtj f − Ttj fi kp + kTtj fi − Tt fi kp + kTt fi − Tt f kp .
Taking the limit, first in i then in j, proves item 5.
6. If f is bounded, the result is immediate from the dominated convergence theorem. In the general case, let ε > 0 and choose a bounded measurable g such that ‖f − g‖_p < ε. Then

lim sup_{t→∞} ‖ T_t f − ∫_{ℝⁿ} f dγⁿ ‖_p ≤ lim sup_{t→∞} ( ‖T_t f − T_t g‖_p + ‖ T_t g − ∫_{ℝⁿ} g dγⁿ ‖_p + ‖ ∫_{ℝⁿ} g dγⁿ − ∫_{ℝⁿ} f dγⁿ ‖_p ) < 2ε,

which proves the result in general.
Finally, we can relate the semigroup {Tt } to the Ornstein-Uhlenbeck operator A.
Proposition 3.10. Let f ∈ D(A). Then t ↦ T_t f is a differentiable function of t and

(d/dt) T_t f = A T_t f = T_t A f.

In particular, taking t = 0, the infinitesimal generator of {T_t} is an extension of A.
Proof. We have

(d/dt) T_t f(x) = −e^{−t} ∫_{ℝⁿ} ⟨ x, ∇f( e^{−t} x + √(1 − e^{−2t}) y ) ⟩ dγⁿ(y)
    + ( e^{−2t} / √(1 − e^{−2t}) ) ∫_{ℝⁿ} ⟨ y, ∇f( e^{−t} x + √(1 − e^{−2t}) y ) ⟩ dγⁿ(y).    (3.8)
By item 4 of Proposition 3.9, the first term on the right side of (3.8) equals
−hx, e−t Tt ∇f (x)i = −hx, ∇Tt f (x)i,
where the action of Tt on vector valued functions is componentwise. By (3.4) applied in each coordinate of
y, the second term in (3.8) equals
e^{−2t} ∫_{ℝⁿ} Δf( e^{−t} x + √(1 − e^{−2t}) y ) dγⁿ(y) = e^{−2t} T_t Δf(x) = Δ T_t f(x).
Substituting these two relations into (3.8) proves the first desired equality, and taking t = 0 shows that the
infinitesimal generator of {Tt } is an extension of A (its domain may be larger than the domain of A). The
second equality follows from Lemma 3.6.
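As a concrete illustration of Proposition 3.10 (a supplementary check, not part of the thesis), in one dimension the Mehler formula gives the closed form T_t cos(x) = e^{−(1−e^{−2t})/2} cos(e^{−t}x), and one can verify symbolically that it solves ∂_t u = Au:

```python
import sympy as sp

x, t = sp.symbols('x t', real=True)
# Closed form of T_t cos in one dimension, obtained from the Mehler formula
u = sp.exp(-(1 - sp.exp(-2*t)) / 2) * sp.cos(sp.exp(-t) * x)
Au = sp.diff(u, x, 2) - x * sp.diff(u, x)      # Ornstein-Uhlenbeck operator A
assert sp.simplify(sp.diff(u, t) - Au) == 0    # u solves the evolution equation (3.5)
```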
3.3  Proof of extended Beckner inequality
The integration by parts formula 3.3 makes the operator A and its associated semigroup {Tt } a natural tool
for the proof of Theorem 3.1. Here we give the proof, and also prove some sharpness results.
Proof of Theorem 3.1. By density, we may take f ∈ C²(ℝⁿ) with f bounded. Then |f| ∈ W^{q,1}(γⁿ) with |∇|f|| = |∇f| a.e. (see Appendix 6.1). So, it suffices to suppose that f is non-negative. Furthermore, by replacing f with f + ε and letting ε → 0, we may take f ≥ c > 0 for some constant c. Let

φ(t) = ( ∫_{ℝⁿ} [T_t f^p]^{q/p} dγⁿ )^{2/q}.
Since T₀ is the identity and lim_{t→∞} T_t f^p = ∫_{ℝⁿ} f^p dγⁿ, we have φ(0) = ‖f‖_q² and lim_{t→∞} φ(t) = ‖f‖_p², so the left side of Bec(q, p) is given by

‖f‖_q² − ‖f‖_p² = − ∫₀^∞ φ′(t) dt.
We shall estimate φ′. To simplify notation in what follows, put

α(t) = ( ∫_{ℝⁿ} [T_t f^p]^{q/p} dγⁿ )^{2/q − 1}.    (3.9)
Using the relation ∂_t T_t f = A T_t f, we compute

φ′(t) = (2/p) α(t) ∫_{ℝⁿ} [T_t f^p]^{q/p−1} A T_t f^p dγⁿ.
By the integration by parts formula for A, this equals

−(2/p) α(t) ∫_{ℝⁿ} ⟨ ∇[T_t f^p]^{q/p−1}, ∇(T_t f^p) ⟩ dγⁿ = −(2/p)(q/p − 1) α(t) ∫_{ℝⁿ} [T_t f^p]^{q/p−2} |∇T_t f^p|² dγⁿ.    (3.10)
We have

∇T_t f^p = e^{−t} T_t( ∇(f^p) ) = e^{−t} p T_t( f^{p−1} ∇f ).    (3.11)
Therefore, (3.10) equals

−2e^{−2t}(q − p) α(t) ∫_{ℝⁿ} [T_t f^p]^{q/p−2} |T_t( f^{p−1} ∇f )|² dγⁿ.    (3.12)
By Hölder's inequality applied inside the definition of T_t,

|T_t( f^{p−1} ∇f )|² ≤ [ T_t( f^{p−1} |∇f| ) ]² ≤ ( T_t f^p )^{2−2/p} ( T_t |∇f|^p )^{2/p}.    (3.13)
Plugging this into (3.12) yields

−φ′(t) ≤ 2e^{−2t}(q − p) α(t) ∫_{ℝⁿ} ( T_t f^p )^{q/p−2/p} ( T_t |∇f|^p )^{2/p} dγⁿ.    (3.14)
The q = 2 case can be handled by trivial modifications of what follows, so we henceforth assume q > 2.
Apply Hölder’s inequality a second time, this time with the exponents q/(q − 2) and q/2, to get
∫_{ℝⁿ} ( T_t f^p )^{q/p−2/p} ( T_t |∇f|^p )^{2/p} dγⁿ ≤ ( ∫_{ℝⁿ} ( T_t f^p )^{q/p} dγⁿ )^{1−2/q} ( ∫_{ℝⁿ} ( T_t |∇f|^p )^{q/p} dγⁿ )^{2/q}.
The first factor on the right side is precisely α(t)^{−1}, so upon plugging this into (3.14), we obtain

−φ′(t) ≤ 2e^{−2t}(q − p) ( ∫_{ℝⁿ} ( T_t |∇f|^p )^{q/p} dγⁿ )^{2/q}.
Since 1 ≤ p ≤ q, we have

( T_t |∇f|^p )^{q/p} ≤ T_t |∇f|^q.    (3.15)
So, since T_t preserves integrals,

−φ′(t) ≤ 2e^{−2t}(q − p) ( ∫_{ℝⁿ} |∇f|^q dγⁿ )^{2/q}.    (3.16)
Integrating this from 0 to ∞ proves Bec(q, p).
If equality holds in Bec(q, p), then it must hold for almost every t in (3.16). Therefore it must hold for
almost every t in (3.13). By the conditions for equality in Hölder's inequality, this means that f^p = c|∇f|^p for some constant c. On the other hand, the condition for equality in Jensen's inequality, together with (3.15), implies that |∇f|^p is constant. Therefore if f satisfies the hypotheses we imposed at the beginning of
the proof, then equality in Bec(q, p) implies that f is constant. The L2 limit of constant functions is constant
a.e., so upon passing to the limit we get the result for the general non-negative f . The result for the general
f ∈ W 2,1 (µ) is obtained by replacing f with |f | and applying the positive case, together with Proposition
6.3 of Appendix 6.1. So, by the non-negative case, equality for f implies that |f | is constant a.e. But, since
f ∈ W 2,1 (µ), this means that f itself is constant a.e.
Remark 3.11. If we apply Hölder’s inequality in each coordinate of ∇f separately after obtaining (3.12),
essentially the same proof shows that γ n satisfies Bec(q, p) with the right side replaced by
(q − p) Σ_{k=1}^n ‖∂_k f‖_q².
This alternative inequality does not appear to be either weaker or stronger in general than the original
inequality Bec(q, p).
The condition q ≥ 2 in Theorem 3.1 is essential. To see this, we need a lemma.
Lemma 3.12. If µ satisfies inequality Bec(q, p) for some 1 ≤ p < q and some constant C, then µ satisfies

‖f‖₂² − ( ∫_{ℝⁿ} f dµ )² ≤ C ‖∇f‖_q²    (3.17)

for all f ∈ W^{q,1}(µ).
Proof. It suffices to prove the implication for bounded functions f ∈ W^{q,1}(µ). Replace f with 1 + εf in Bec(q, p), and divide by ε²:

( ‖1 + εf‖_q² − ‖1 + εf‖_p² ) / ε² ≤ (q − p) C ‖∇f‖_q².    (3.18)

Apply L'Hôpital's rule twice to get that as ε → 0 the left side tends to

(1/2) (d²/dε²) [ ‖1 + εf‖_q² − ‖1 + εf‖_p² ] |_{ε=0}.
If ε is so small that 1 + εf > 0, we have

(d²/dε²) ‖1 + εf‖_q² = 2(q − 1) ‖1 + εf‖_q^{2−q} ∫_{ℝⁿ} f² (1 + εf)^{q−2} dµ + 2(2 − q) ‖1 + εf‖_q^{2−2q} ( ∫_{ℝⁿ} f (1 + εf)^{q−1} dµ )².
Replacing q with p, then evaluating both expressions at ε = 0, we see that the limit of the left side of (3.18) is

(q − 1) ∫_{ℝⁿ} f² dµ + (2 − q) ( ∫_{ℝⁿ} f dµ )² − (p − 1) ∫_{ℝⁿ} f² dµ − (2 − p) ( ∫_{ℝⁿ} f dµ )²
    = (q − p) ‖f‖₂² − (q − p) ( ∫_{ℝⁿ} f dµ )².

Dividing by q − p then proves (3.17).
Proposition 3.13. The standard Gaussian measure γ n does not satisfy inequality Bec(q, p) for any 1 ≤ p <
q < 2 and any constant C.
Proof. By the Lemma, it will suffice to show that γⁿ does not satisfy inequality (3.17). Take f_t(x₁, ..., x_n) = e^{tx₁}, for t > 0. One has the formula

∫_{ℝⁿ} e^{ax₁} dγⁿ(x) = e^{a²/2},    a ∈ ℝ.

From this, one sees that (3.17) for f_t with constant C is equivalent to

e^{2t²} − e^{t²} ≤ t² C e^{qt²}    ⇔    ( e^{(2−q)t²} − e^{(1−q)t²} ) / t² ≤ C.

But the left side of this last inequality tends to ∞ as t → ∞, since q < 2. Thus γⁿ cannot satisfy (3.17) for any constant C, so by Lemma 3.12 it cannot satisfy Bec(q, p) for any q < 2, any p < q, and any constant C.
By a similar argument, we can also get that our result in Theorem 3.1 is sharp.
Proposition 3.14. Let q ≥ 2 and 1 ≤ p ≤ 2. The standard Gaussian measure γ n does not satisfy inequality
Bec(q, p) for constant C < 1.
Proof. As in the proof of Proposition 3.13, set f_t(x₁, ..., x_n) = e^{tx₁} for t > 0. Then Bec(q, p) with constant C for f_t is equivalent to

e^{qt²} − e^{pt²} ≤ t² C (q − p) e^{qt²}    ⇔    ( 1 − e^{(p−q)t²} ) / t² ≤ C (q − p).

As t → 0, the left side of this last inequality tends to q − p, so the inequality cannot hold with C < 1.
3.4  Classical Heat semigroup
Inequality Bec(q, p) for the Gaussian measure can also be proven by means of the classical heat semigroup,
rather than the Ornstein-Uhlenbeck semigroup. This is the method used by E. Hsu and the author in [17].
The proof itself is slightly longer, but the classical heat semigroup is better known than the Ornstein-Uhlenbeck semigroup, and less work is required to establish its basic properties. We give this alternative proof here.
Definition 3.15. The classical heat semigroup {P_s : s > 0} is defined by

P_s f(x) = (2πs)^{−n/2} ∫_{ℝⁿ} f(y) e^{−|x−y|²/2s} dy    (3.19)

for bounded continuous functions f on ℝⁿ.
We shall require the following elementary properties of {P_s}.

Proposition 3.16. Suppose f : ℝⁿ → ℝ is bounded and continuous. Then

1. P_s f → f as s → 0.
2. P₁ f(0) = ∫_{ℝⁿ} f dγⁿ.
3. For each s, t ≥ 0, P_s ∘ P_t = P_{s+t}.
4. P_s f solves the heat equation: ∂_s P_s f = (1/2) Δ P_s f.
5. If ∇f is bounded and continuous, then ∇P_s f = P_s ∇f, where the action of P_s on ∇f is componentwise.

Note that items 1 and 3 imply that {P_s} is a semigroup. Item 4 implies that its infinitesimal generator is an extension of the half-Laplacian (1/2)Δ. The heat semigroup is not, however, a contraction semigroup. Item 2 provides the motivation for using the heat semigroup in the context of inequalities for γⁿ.
Proof. 1. Substitute y = x − √s z in (3.19) to obtain the alternative formula

P_s f(x) = (2π)^{−n/2} ∫_{ℝⁿ} f(x − √s z) e^{−|z|²/2} dz = ∫_{ℝⁿ} f(x − √s z) dγⁿ(z).    (3.20)

Since f is bounded, we can use dominated convergence to find that this tends to ∫_{ℝⁿ} f(x) dγⁿ(z) = f(x) as s → 0.
2. This is immediate from (3.19) and (1.1).
3. Using the alternative formula (3.20) for P_s, we have

P_s P_t f(x) = ∫_{ℝⁿ} ∫_{ℝⁿ} f(x − √s y − √t z) dγⁿ(z) dγⁿ(y).

By Lemma 3.8 with a = √s / √(s+t) and b = √t / √(s+t), this equals

∫_{ℝⁿ} f(x − √(s+t) u) dγⁿ(u) = P_{s+t} f(x).
4. Let

ρ(x, y, s) = (2πs)^{−n/2} e^{−|x−y|²/2s}

be the heat kernel. For j = 1, ..., n, we have

∂_j² ρ(x, y, s) = ( (x_j − y_j)²/s² − 1/s ) ρ(x, y, s),
and so

Δρ(x, y, s) = ( |x − y|²/s² − n/s ) ρ(x, y, s) = 2 ∂_s ρ(x, y, s).

Differentiating under the integral sign is justified since Δρ ∈ L¹(ℝⁿ) and f is bounded, so we obtain

Δ P_s f(x) = ∫_{ℝⁿ} f(y) Δρ(x, y, s) dy = 2 ∫_{ℝⁿ} f(y) ∂_s ρ(x, y, s) dy = 2 ∂_s P_s f(x).
5. This follows from differentiation under the integral sign in (3.20).
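Items 2 and 3 lend themselves to a quick numerical sanity check (a supplementary sketch, not part of the thesis), again using one-dimensional Gauss–Hermite quadrature for integrals against the Gaussian measure:

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss

nodes, w = hermegauss(80)
w = w / w.sum()                              # normalized N(0,1) quadrature weights

def P(s, f):
    """1-D heat semigroup via (3.20): P_s f(x) = E[f(x - sqrt(s) Z)], Z ~ N(0,1)."""
    def Pf(x):
        x = np.atleast_1d(np.asarray(x, dtype=float))
        out = np.array([np.dot(w, f(xi - np.sqrt(s) * nodes)) for xi in x])
        return out if out.size > 1 else out[0]
    return Pf

f = lambda y: np.cos(y) + 0.1 * y**2
gauss_mean = np.dot(w, f(nodes))             # the integral of f against gamma^1
assert abs(P(1.0, f)(0.0) - gauss_mean) < 1e-10              # item 2: P_1 f(0) = int f dgamma
assert abs(P(0.4, P(0.6, f))(0.3) - P(1.0, f)(0.3)) < 1e-8   # item 3: P_s P_t = P_{s+t}
```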
We now employ these elementary properties of the classical heat semigroup {Ps } to give an alternative
proof of inequality Bec(q, p).
Proof of Theorem 3.1. By a standard approximation argument, it is enough to show the inequality (3.1) for a smooth function f such that 0 < c ≤ f ≤ C and ∇f is bounded. For 0 ≤ s ≤ 1, consider the function

φ_s(x) = [ P_s ( (P_{1−s} f^p)^{q/p} ) (x) ]^{2/q}.    (3.21)
The idea of considering such a function in the context of functional inequalities can be traced back to
Neveu [27]. We can write the left side of (3.1) as
‖f‖_q² − ‖f‖_p² = φ₁(0) − φ₀(0) = ∫₀¹ ∂_s φ_s(0) ds.
The technical part of our proof is the computation of the derivative of (3.21) with respect to s.
From the definition (3.21) of φs we have
∂_s φ_s = ∂_s [ P_s g_s^{q/p} ]^{2/q} = (2/q) a_s ∂_s ( P_s g_s^{q/p} ),

where, to simplify the notation here and later, we have introduced the functions

g_s = P_{1−s} f^p    and    a_s = ( P_s g_s^{q/p} )^{2/q−1}.
We compute

∂_s φ_s = (2/q) a_s (∂_s P_s) g_s^{q/p} + (2/p) a_s P_s ( g_s^{q/p−1} ∂_s g_s ).    (3.22)
Using the relation ∂_s P_s = (1/2) P_s Δ, we may rewrite the first term on the right side as (1/q) a_s P_s Δ( g_s^{q/p} ), which equals

(1/p)(q/p − 1) a_s P_s ( g_s^{q/p−2} |∇g_s|² ) + (1/p) a_s P_s ( g_s^{q/p−1} Δg_s )    (3.23)

by the identity

Δ( h^{q/p} ) = (q/p)(q/p − 1) h^{q/p−2} |∇h|² + (q/p) h^{q/p−1} Δh

applied with h = g_s. From ∂_s P_{1−s} = −(1/2) Δ P_{1−s} we have ∂_s g_s = −(1/2) Δ g_s, so the second term in the sum (3.23) exactly cancels the second term in (3.22). In the remaining term, we use the fact that P_{1−s} commutes with ∇ and write ∇g_s = p P_{1−s}( f^{p−1} ∇f ), which gives
∂_s φ_s = (q − p) a_s P_s ( g_s^{q/p−2} |P_{1−s}( f^{p−1} ∇f )|² ).    (3.24)
Note that P1−s is an integral with respect to a (probability) measure, so we can use Hölder’s inequality with
the exponents p/(p − 1) and p to get
|P_{1−s}( f^{p−1} ∇f )| ≤ P_{1−s}( f^{p−1} |∇f| ) ≤ ( P_{1−s} f^p )^{(p−1)/p} ( P_{1−s} |∇f|^p )^{1/p}.
Thus, by (3.24),

∂_s φ_s ≤ (q − p) a_s P_s ( g_s^{q/p−2/p} ( P_{1−s} |∇f|^p )^{2/p} ).    (3.25)
The case q = 2 is covered by trivial modifications to what follows, so in the remainder of the proof we assume
q > 2. Hölder’s inequality with the exponents q/(q − 2) and q/2 yields
P_s ( g_s^{q/p−2/p} ( P_{1−s} |∇f|^p )^{2/p} ) ≤ ( P_s g_s^{q/p} )^{1−2/q} ( P_s ( P_{1−s} |∇f|^p )^{q/p} )^{2/q}.

The first factor on the right side is exactly a_s^{−1}, which cancels the factor a_s in (3.25). We now have
∂_s φ_s ≤ (q − p) ( P_s ( P_{1−s} |∇f|^p )^{q/p} )^{2/q}.
From 1 ≤ p ≤ q we have P_{1−s} |∇f|^p ≤ ( P_{1−s} |∇f|^q )^{p/q}. This together with the semigroup property P_s P_{1−s} = P_1 gives

( P_s ( P_{1−s} |∇f|^p )^{q/p} )^{2/q} ≤ ( P_s P_{1−s} |∇f|^q )^{2/q} = ( ∫_{ℝⁿ} |∇f|^q dγⁿ )^{2/q}.
The last equality holds at x = 0. It follows that
∂_s φ_s(0) ≤ (q − p) ( ∫_{ℝⁿ} |∇f|^q dγⁿ )^{2/q}.
Integrating from 0 to 1 yields (3.1).
4  Beckner Inequality for Log-Concave Probability Measures
We now turn our attention to a general log-concave probability measure µ on Rn .
Definition 4.1. A probability measure µ on ℝⁿ is called log-concave with concavity b > 0 if

dµ = e^{−v(x)} dx,

where v ∈ C²(ℝⁿ) and the matrix D²v of second order partial derivatives of v satisfies ⟨D²v(x)ξ, ξ⟩ ≥ b|ξ|² for each x, ξ ∈ ℝⁿ.
The most important log-concave probability measure is the standard Gaussian measure γⁿ, which corresponds to the case where v(x) = |x|²/2 + (n/2) log(2π). Log-concave probability measures satisfy many of the
same properties as Gaussian measures do, so it is natural to ask to what extent the inequalities we study for
the Gaussian measure can be generalized to this setting. The goal of this chapter is to prove the following.
Theorem 4.2. Let µ be a log-concave probability measure on Rn with concavity b > 0. Then µ satisfies
inequality Bec(q, p) with constant 1/b for all q ≥ 2 and all 1 ≤ p ≤ q. Furthermore, equality holds if and
only if f is constant a.e.
To prove Theorem 4.2, we proceed as in Chapter 3. We first define analogues of the Ornstein-Uhlenbeck
operator and the Ornstein-Uhlenbeck semigroup, then use these operators to prove the result via an argument
similar to that in Section 3.3.
4.1  Generalization of the Ornstein-Uhlenbeck operator and semigroup
Let dµ = e^{−v(x)} dx be a log-concave probability measure on ℝⁿ with concavity b > 0. Define an operator A on L²(µ) by
A(f) := Δf − ⟨∇f, ∇v⟩,    (4.1)
with domain D(A) consisting of all functions f such that ∆f − h∇f, ∇vi defines a function in L2 (µ).
This operator A is a generalization of the Ornstein-Uhlenbeck operator, as suggested by the following
generalization of Proposition 3.3.
Lemma 4.3. If g ∈ W^{2,1}(µ) and f ∈ D(A), then

∫_{ℝⁿ} g A(f) dµ = − ∫_{ℝⁿ} ⟨∇f, ∇g⟩ dµ.    (4.2)
Proof. By converting to an integral with respect to Lebesgue measure and integrating by parts, we obtain

∫_{ℝⁿ} f ∂_k v dµ = ∫_{ℝⁿ} ∂_k f dµ,    f ∈ C_c^∞(ℝⁿ),    k = 1, ..., n.    (4.3)
By approximation, this also holds for f ∈ W^{2,1}(µ). Using (4.3), we then calculate

∫_{ℝⁿ} g A(f) dµ = Σ_{k=1}^n ∫_{ℝⁿ} ( g ∂_k² f − g ∂_k v ∂_k f ) dµ = Σ_{k=1}^n ∫_{ℝⁿ} ( g ∂_k² f − ∂_k g ∂_k f − g ∂_k² f ) dµ = − ∫_{ℝⁿ} ⟨∇f, ∇g⟩ dµ.
In analogy with Chapter 3, we now seek a contraction semigroup whose infinitesimal generator is an
extension of the operator A. To formalize some of the basic properties we need this semigroup to possess,
we make the following definition.
Definition 4.4. Suppose {Tt } is a contraction semigroup on a Banach space X of functions on a probability
space Ω. Then {Tt } is said to be a Markov semigroup if Tt 1 = 1 (where 1 denotes the constant function
ω 7→ 1); and Tt preserves positivity: if f ≥ 0 a.e., then Tt f ≥ 0 a.e.
From the Mehler formula, it is clear that the Ornstein-Uhlenbeck semigroup of Chapter 3 is a Markov
semigroup.
In Appendix 6.2, it is shown, using abstract functional analytic methods, that there exists a Markov
semigroup {T_t}, consisting of symmetric operators, whose infinitesimal generator is an extension of A. Unlike
in Chapter 3, we do not have an explicit formula for {Tt }. Nevertheless, it turns out that this semigroup
satisfies many of the same properties as the Ornstein-Uhlenbeck semigroup.
Proposition 4.5. Let {Tt } be the symmetric Markov semigroup with generator a self-adjoint extension of
A. Then
1. Each T_t preserves integrals: if f ∈ L¹(µ), then

∫_{ℝⁿ} T_t f dµ = ∫_{ℝⁿ} f dµ.
2. Suppose c ≤ f ≤ C for some constants c and C. Then for each x ∈ Rn , c ≤ Tt f (x) ≤ C.
3. Tt f (x) is given by integration against a Borel probability measure νt,x for a.e. x ∈ Rn .
4. Each Tt defines a norm-decreasing operator Lp (µ) → Lp (µ), for each 1 ≤ p ≤ ∞.
5. If f ∈ C 2 (Rn ) ∩ L2 (µ), then so is Tt f .
Proof. 1. Apply (4.2) to get

(d/dt) ∫_{ℝⁿ} T_t f dµ = ∫_{ℝⁿ} A(T_t f) dµ = − ∫_{ℝⁿ} ⟨∇T_t f, ∇1⟩ dµ = 0.

Thus the integral of T_t f is independent of t, and in particular it equals the integral of T₀ f = f.
2. Since Tt fixes the constant functions and preserves positivity, the fact that C − f ≥ 0 implies that
C − Tt f ≥ 0, and similarly for c.
3. By item 2, f 7→ Tt f (x) defines a positive linear functional on C0 (Rn ). So, the Riesz representation
theorem implies that Tt f (x) is given by integrating against a Borel measure νt,x for each f ∈ C0 (Rn ).
For a non-negative f ∈ L2 (µ), one can find a sequence (fj ) ∈ C0 (Rn ) which increases to f a.e., so by
dominated convergence Tt f (x) is also given by integration against νt,x . For the general f ∈ L2 (µ), the
result follows by considering positive and negative parts separately. Finally, from Tt 1 = 1, we get that
νt,x is a probability measure.
4. If f is bounded, then |f| and |f|^p are each in L²(µ), and by item 3, we can apply Jensen's inequality to get

|T_t f|^p ≤ T_t |f|^p.

Therefore item 1 gives

∫_{ℝⁿ} |T_t f|^p dµ ≤ ∫_{ℝⁿ} T_t |f|^p dµ = ∫_{ℝⁿ} |f|^p dµ.

The set of bounded f is dense in L^p(µ) for each 1 ≤ p ≤ ∞, so we get a unique extension of T_t to each L^p.
5. We have that u(t, x) := Tt f (x) solves the differential equation
∂_t u = Δu − ⟨∇v, ∇u⟩    (4.4)
with initial condition u(0, x) = f (x). The coefficient vector ∇v in this PDE is C 1 , so by standard
regularity results for parabolic equations (see, for example, Ch. 7 of [14]) applied on each bounded
subset of Rn , it follows that Tt f lies in C 2 (Rn ) as well.
4.2  Commutation with the gradient
The only missing ingredient in our proof of Bec(q, p) is an inequality relating |∇Tt f |2 and Tt |∇f |2 . To obtain
such an estimate, we first make the following definition, which we take from [22].
Definition 4.6. We define the carré du champ of A to be the unbounded bilinear form Γ : L²(µ) × L²(µ) → L²(µ) given by

2Γ(f, g) := A(fg) − f A(g) − g A(f).

We define the curvature operator of A to be the unbounded bilinear form Γ₂ : L²(µ) × L²(µ) → L²(µ) given by

2Γ₂(f, g) := AΓ(f, g) − Γ(f, Ag) − Γ(g, Af).    (4.5)

The domains of Γ and Γ₂ are the subsets of L²(µ) × L²(µ) on which the above formulas produce functions in L²(µ). We put Γ(f) = Γ(f, f) and similarly for Γ₂(f).
Proposition 4.7. The carré du champ of A is given by

Γ(f, g) = ⟨∇f, ∇g⟩.

Proof. Due to bilinearity, it suffices to prove the result in the case f = g, i.e. we must show Γ(f) = |∇f|². We have

A(f²) = Δ(f²) − ∇(f²)·∇v = 2|∇f|² + 2fΔf − 2f∇f·∇v

and

2f A(f) = 2fΔf − 2f∇f·∇v.

Subtracting and dividing by two, we get Γ(f) = |∇f|², which is the desired formula.
Proposition 4.8. The curvature operator of A satisfies
Γ2 (f ) = |D2 f |2 + h(D2 v)∇f, ∇f i.
Proof. In the case f = g, formula (4.5) reads

2Γ₂(f) = AΓ(f) − 2Γ(f, Af).    (4.6)

By Proposition 4.7 the first term in (4.6) is given by

AΓ(f) = Δ|∇f|² − ∇v·∇|∇f|².

Expanding out the second term above, we get

∇v·∇|∇f|² = 2 Σ_{j,k=1}^n ∂_k f ∂_{jk} f ∂_j v.

For the second term in (4.6), we have

Γ(f, Af) = ∇f·∇( Δf − ∇v·∇f ).

We compute

∇f·∇( ∇v·∇f ) = Σ_{j,k=1}^n ∂_k f ∂_{jk} v ∂_j f + Σ_{j,k=1}^n ∂_k f ∂_{jk} f ∂_j v = ⟨(D²v)∇f, ∇f⟩ + Σ_{j,k=1}^n ∂_k f ∂_{jk} f ∂_j v.

Plugging our calculations into (4.6) and cancelling the terms 2 Σ_{j,k=1}^n ∂_k f ∂_{jk} f ∂_j v, we find

2Γ₂(f) = Δ|∇f|² − 2∇f·∇Δf + 2⟨(D²v)∇f, ∇f⟩.    (4.7)

We have

Δ|∇f|² = 2 Σ_{j,k=1}^n ∂_k f ∂_{kjj} f + 2 Σ_{j,k=1}^n (∂_{jk} f)² = 2 Σ_{j,k=1}^n ∂_k f ∂_{kjj} f + 2|D²f|²

and

∇f·∇Δf = Σ_{j,k=1}^n ∂_k f ∂_{kjj} f.

Plugging these two expressions into (4.7), cancelling a pair of terms, and dividing by 2, we get

Γ₂(f) = |D²f|² + ⟨(D²v)∇f, ∇f⟩.
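Propositions 4.7 and 4.8 are exact algebraic identities, so they can be verified symbolically for sample choices of f and v (a supplementary check, not part of the proof; the particular polynomials below are arbitrary):

```python
import sympy as sp

x, y = sp.symbols('x y', real=True)
v = x**2 + x*y + y**4                  # a smooth potential (the identities need no convexity)
f = x**3 + x*y**2                      # a smooth test function
X = (x, y)

grad = lambda h: sp.Matrix([sp.diff(h, s) for s in X])
lap  = lambda h: sum(sp.diff(h, s, 2) for s in X)
A    = lambda h: lap(h) - (grad(h).T * grad(v))[0]
Gam  = lambda h, g: sp.expand((A(h*g) - h*A(g) - g*A(h)) / 2)

# Proposition 4.7: the carre du champ is the squared gradient
assert sp.expand(Gam(f, f) - (grad(f).T * grad(f))[0]) == 0

# Proposition 4.8: the curvature operator formula
G2 = sp.expand((A(Gam(f, f)) - 2 * Gam(f, A(f))) / 2)
H, Hv = sp.hessian(f, X), sp.hessian(v, X)
rhs = sum(H[i, j]**2 for i in range(2) for j in range(2)) + (grad(f).T * Hv * grad(f))[0]
assert sp.expand(G2 - rhs) == 0
```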
From the preceding proposition and our hypothesis on D²v, one has the inequality

Γ₂(f) ≥ ⟨(D²v)∇f, ∇f⟩ ≥ b|∇f|².    (4.8)
In the terminology of Markov semigroups, the operator A has positive curvature (see [22] for further discussion). From this bound, we can deduce our desired inequality for the gradient of Tt f . Our proof follows that
in [22].
Proposition 4.9. For f ∈ C^∞(ℝⁿ) ∩ L²(µ) and t ≥ 0,

|∇T_t f|² ≤ e^{−2bt} T_t |∇f|².
Proof. Fix t > 0 and for 0 ≤ s ≤ t define

φ(s) = e^{−2bs} T_s |∇T_{t−s} f|².

Using the relation ∂_s T_s = T_s A, we compute

φ′(s) = −2b e^{−2bs} T_s |∇T_{t−s} f|² + e^{−2bs} T_s A |∇T_{t−s} f|² + e^{−2bs} T_s ∂_s |∇T_{t−s} f|².    (4.9)

By Proposition 4.7 one has

∂_s |∇T_{t−s} f|² = −2 ⟨∇T_{t−s} f, ∇A T_{t−s} f⟩ = −2 Γ(T_{t−s} f, A T_{t−s} f),

and so from (4.9) and the definition of Γ₂,

φ′(s) = 2 e^{−2bs} T_s [ −b Γ(T_{t−s} f) + Γ₂(T_{t−s} f) ].

By (4.8), the argument of T_s is non-negative, so since T_s preserves positivity, φ′ is non-negative. Thus φ is increasing. In particular,

|∇T_t f|² = φ(0) ≤ φ(t) = e^{−2bt} T_t |∇f|²,

which proves the result.
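For the Gaussian special case µ = γ¹ (where v(x) = x²/2 up to a constant, so b = 1, and T_t has the explicit Mehler form), Proposition 4.9 can be spot-checked numerically; this is a supplementary illustration, and the quadrature helper below is not from the thesis:

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss

nodes, w = hermegauss(60)
w = w / w.sum()

t, b = 0.7, 1.0                         # for gamma^1 the concavity constant is b = 1
c, s = np.exp(-t), np.sqrt(1 - np.exp(-2 * t))
T = lambda h, x: np.dot(w, h(c * x + s * nodes))   # pointwise Mehler formula

f_prime = lambda y: -np.sin(y)          # gradient of f(y) = cos(y)
for x0 in (-2.0, -0.3, 0.0, 1.1, 2.5):
    grad_Ttf = c * T(f_prime, x0)       # d/dx T_t f = e^{-t} T_t f'  (Proposition 3.9)
    assert grad_Ttf**2 <= np.exp(-2 * b * t) * T(lambda y: f_prime(y)**2, x0) + 1e-12
```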
Corollary 4.10. If f ∈ L¹(µ), then

lim_{t→∞} T_t f = ∫_{ℝⁿ} f dµ

in L¹.
Proof. If f ∈ C_c¹(ℝⁿ), we infer from Proposition 4.9 that |∇T_t f|² → 0 uniformly as t → ∞, so T_t f must converge to a constant function. Since T_t preserves integrals, this constant must be ∫_{ℝⁿ} f dµ. For the general f ∈ L¹(µ), given ε > 0 we can find g ∈ C_c¹(ℝⁿ) with ‖f − g‖₁ < ε and ∫_{ℝⁿ} g dµ = ∫_{ℝⁿ} f dµ. Then

lim sup_{t→∞} ‖ T_t f − ∫_{ℝⁿ} f dµ ‖₁ ≤ lim sup_{t→∞} ( ‖T_t f − T_t g‖₁ + ‖ T_t g − ∫_{ℝⁿ} g dµ ‖₁ ) < ε,

which proves the result for f.
4.3  Proof of Beckner's inequality for log-concave probability measures
We are now ready to prove Theorem 4.2. The proof is essentially the same as that in Section 3.3.
Proof of Theorem 4.2. Define A as in (4.1) and let {Tt } be the semigroup of Proposition 4.5. By a standard
approximation argument, we may take f ∈ C 2 (Rn ) with f ≥ a > 0 for some constant a. Then by Proposition
4.5, Tt f has the same properties for each t ≥ 0. Set
φ(t) = ( ∫_{ℝⁿ} [T_t f^p]^{q/p} dµ )^{2/q}.
Since T₀ is the identity and lim_{t→∞} T_t f^p = ∫_{ℝⁿ} f^p dµ, we have φ(0) = ‖f‖_q² and lim_{t→∞} φ(t) = ‖f‖_p², so the left side of Bec(q, p) is given by

‖f‖_q² − ‖f‖_p² = − ∫₀^∞ φ′(t) dt.
We shall estimate φ′. To simplify notation in what follows, put

α(t) = ( ∫_{ℝⁿ} [T_t f^p]^{q/p} dµ )^{2/q − 1}.    (4.10)
Using the relation ∂_t T_t = A T_t, we compute

φ′(t) = (2/p) α(t) ∫_{ℝⁿ} [T_t f^p]^{q/p−1} A T_t f^p dµ.
By the integration by parts formula for A, this equals

−(2/p) α(t) ∫_{ℝⁿ} ⟨ ∇[T_t f^p]^{q/p−1}, ∇(T_t f^p) ⟩ dµ = −(2/p)(q/p − 1) α(t) ∫_{ℝⁿ} [T_t f^p]^{q/p−2} |∇T_t f^p|² dµ.    (4.11)
By Proposition 4.9 applied with f replaced by f^p,

|∇T_t f^p|² ≤ e^{−2bt} [ T_t |∇(f^p)| ]² = e^{−2bt} p² [ T_t( f^{p−1} |∇f| ) ]².

Therefore,

−φ′(t) ≤ 2e^{−2bt}(q − p) α(t) ∫_{ℝⁿ} [T_t f^p]^{q/p−2} [ T_t( f^{p−1} |∇f| ) ]² dµ.    (4.12)
Now apply Hölder's inequality to get

[ T_t( f^{p−1} |∇f| ) ]² ≤ [T_t f^p]^{2−2/p} ( T_t |∇f|^p )^{2/p}.
Thus,

−φ′(t) ≤ 2e^{−2bt}(q − p) α(t) ∫_{ℝⁿ} [T_t f^p]^{q/p−2/p} ( T_t |∇f|^p )^{2/p} dµ.    (4.13)
The case q = 2 is covered by trivial modifications to what follows, so in the remainder of the proof we assume
q > 2. Apply Hölder’s inequality a second time, this time with the exponents q/(q − 2) and q/2, to get
∫_{ℝⁿ} [T_t f^p]^{q/p−2/p} ( T_t |∇f|^p )^{2/p} dµ ≤ ( ∫_{ℝⁿ} [T_t f^p]^{q/p} dµ )^{1−2/q} ( ∫_{ℝⁿ} ( T_t |∇f|^p )^{q/p} dµ )^{2/q}.
The first factor here is precisely α(t)^{−1}, so upon plugging this into (4.13), we obtain

−φ′(t) ≤ 2e^{−2bt}(q − p) ( ∫_{ℝⁿ} ( T_t |∇f|^p )^{q/p} dµ )^{2/q}.
Since 1 ≤ p ≤ q,

( T_t |∇f|^p )^{q/p} ≤ T_t |∇f|^q.
So, since T_t preserves integrals, we get

−φ′(t) ≤ 2e^{−2bt}(q − p) ( ∫_{ℝⁿ} |∇f|^q dµ )^{2/q}.
Integrating this from 0 to ∞ proves Bec(q, p) with constant 1/b.
The condition for equality follows just as in the proof of Theorem 3.1.
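As an elementary numerical illustration of Theorem 4.2 (a supplementary check, not part of the proof), take dµ proportional to e^{−bx²/2} on ℝ, which is log-concave with concavity b, and f(x) = e^{tx}; the Gaussian moment formula ∫ e^{atx} dµ = e^{a²t²/(2b)} reduces Bec(q, p) with constant 1/b to an explicit scalar inequality:

```python
import math

b, q, p = 2.0, 3.0, 1.5       # concavity b; exponents with q >= 2 and 1 <= p <= q
for t in (0.1, 0.5, 1.0, 2.0):
    # ||f||_q^2 = e^{q t^2 / b} and ||grad f||_q^2 = t^2 e^{q t^2 / b} for f(x) = e^{tx}
    lhs = math.exp(q * t * t / b) - math.exp(p * t * t / b)
    rhs = (1.0 / b) * (q - p) * t * t * math.exp(q * t * t / b)
    assert lhs <= rhs          # Bec(q, p) with constant 1/b holds for this family
```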
5  Other Inequalities for Log-Concave Probability Measures
In the remainder of the paper, we shall study several inequalities for log-concave probability measures which
are related to inequality Bec(q, p).
5.1  Generalized logarithmic Sobolev inequality
Let q ≥ 1. We say that µ satisfies inequality LSI(q) with constant C if whenever f ∈ W^{q,1}(µ),

(2/q) ‖f‖_q^{2−q} ∫_{ℝⁿ} |f|^q log|f| dµ − (2/q) log(‖f‖_q) ‖f‖_q² ≤ C ‖∇f‖_q².    (5.1)
Note that LSI(2) is the ordinary logarithmic Sobolev inequality (1.3), with a constant C on the right. We
shall explore the relationship between inequality LSI(q) and the inequality Bec(q, p) from Chapters 3 and 4.
First we have an implication relation within inequality LSI(q):
Proposition 5.1. If µ satisfies LSI(q) with constant C, then for each r > q, µ satisfies LSI(r) with constant
C.
Proof. If f ∈ W r,1 (µ), then |f | ∈ W r,1 (µ) with |∇|f || = |∇f | a.e., so it suffices to consider f ≥ 0. Apply
LSI(q) to the function f r/q :
(2r/q²) ‖f‖_r^{2r/q−r} ∫_{ℝⁿ} f^r log f dµ − (2r/q²) log(‖f‖_r) ‖f‖_r^{2r/q} ≤ C (r²/q²) ( ∫_{ℝⁿ} f^{r−q} |∇f|^q dµ )^{2/q}.    (5.2)

Apply Hölder's inequality on the right side, with exponents r/q and r/(r − q), to get

( ∫_{ℝⁿ} f^{r−q} |∇f|^q dµ )^{2/q} ≤ ( ‖f‖_r^{r−q} ‖∇f‖_r^q )^{2/q} = ‖f‖_r^{2r/q−2} ‖∇f‖_r².

Plugging this into (5.2) and dividing both sides by (r/q)² ‖f‖_r^{2r/q−2} yields LSI(r) with constant C.
Proposition 5.2. If µ satisfies Bec(q, p) with some constant Cp for each p ∈ [1, q), then µ also satisfies
LSI(q) with constant C := lim supp→q Cp .
Proof. By the usual approximation arguments, it suffices to prove the inequality for f ∈ C^∞(ℝⁿ) ∩ W^{2,1}(µ) with f ≥ c > 0 for some constant c. Divide both sides of Bec(q, p) by q − p to get

( ‖f‖_q² − ‖f‖_p² ) / (q − p) ≤ C_p ‖∇f‖_q².    (5.3)

Our hypotheses on f imply that p ↦ ‖f‖_p² is a differentiable function of p, so as p → q⁻, the left side of (5.3) tends to

(d/dp) ‖f‖_p² |_{p=q} = (2/q) ‖f‖_q^{2−q} ∫_{ℝⁿ} f^q log(f) dµ − (2/q) log(‖f‖_q) ‖f‖_q²,

which is precisely the left side of LSI(q). The right side of (5.3) is bounded above by C ‖∇f‖_q² as p → q⁻, which proves the desired inequality.
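The differentiability claim and the derivative formula used here are easy to confirm numerically on a discrete probability measure (a supplementary check; the sample data below are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)
w = rng.random(40); w /= w.sum()        # weights of a discrete probability measure
f = rng.random(40) + 0.5                # a positive "function" on 40 points
q = 3.0

norm2 = lambda p: (w @ f**p) ** (2.0 / p)       # ||f||_p^2 with respect to w

# closed form for (d/dp)||f||_p^2 at p = q, as in the proof above
nq = (w @ f**q) ** (1.0 / q)
exact = (2/q) * nq**(2 - q) * (w @ (f**q * np.log(f))) - (2/q) * np.log(nq) * nq**2

h = 1e-6
numeric = (norm2(q + h) - norm2(q - h)) / (2 * h)   # central finite difference
assert abs(exact - numeric) < 1e-6
```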
By Theorem 4.2, we immediately get

Corollary 5.3. For q ≥ 2, the measure µ satisfies inequality LSI(q) with constant 1/b.
There is a partial converse to Proposition 5.2. In order to prove it, we will need to consider how the
quotient
( ‖f‖_q² − ‖f‖_p² ) / (q − p)
changes as q and p vary. It turns out that a slight variant of this quantity is better behaved. Namely,

θ(q, p) := ( ‖f‖_q² − ‖f‖_p² ) / ( 1/p − 1/q ) = qp ( ‖f‖_q² − ‖f‖_p² ) / ( q − p ),    1 ≤ p < q.    (5.4)
Of course, θ depends on f , but the function will always be clear from the context, so is not indicated in the
notation. The key feature of θ is the following.
Lemma 5.4. The function θ is increasing in q and p.
This lemma and its proof are based on a result in [21].
Proof. The function θ is the negative of a difference quotient:

θ(q, p) = − ( β(1/p) − β(1/q) ) / ( 1/p − 1/q ),

where

β(t) := ‖f‖_{1/t}²    (5.5)

for t ∈ (0, 1]. We claim that β is convex. Given the claim, we have that the difference quotients of β are increasing in both arguments. Hence

( β(1/p) − β(1/q) ) / ( 1/p − 1/q )

is decreasing in both q and p, so its negative θ(q, p) is increasing in both q and p, which proves the lemma.
It remains to prove that β is convex. We first claim that
Z
1
|f |1/t dµ .
α(t) := log(β(t)) = log(kf k1/t ) = t log
2
Rn
is a convex function of t. Indeed, if t, s ∈ (0, 1], we can apply Hölder's inequality with conjugate exponents (t + s)/t and (t + s)/s to get
\[
\alpha\!\left(\frac{t+s}{2}\right) = \frac{t+s}{2}\,\log\int_{\mathbb{R}^n} f^{2/(t+s)}\, d\mu
\le \frac{t+s}{2}\,\log\left( \|f^{1/(t+s)}\|_{(t+s)/t}\, \|f^{1/(t+s)}\|_{(t+s)/s} \right)
\]
\[
= \frac{t+s}{2}\,\log \|f\|_{1/t}^{1/(t+s)} + \frac{t+s}{2}\,\log \|f\|_{1/s}^{1/(t+s)}
= \frac{1}{2}\,\alpha(t) + \frac{1}{2}\,\alpha(s).
\]
Now, the convexity of α and the fact that the exponential function is increasing and convex gives us
\[
\beta\!\left(\frac{t+s}{2}\right) \le e^{2(\alpha(t)/2 + \alpha(s)/2)} \le \frac{1}{2}\,e^{2\alpha(t)} + \frac{1}{2}\,e^{2\alpha(s)} = \frac{1}{2}\,\beta(t) + \frac{1}{2}\,\beta(s),
\]
which proves that β is convex.
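The monotonicity asserted in Lemma 5.4 is easy to sanity-check numerically. The sketch below (an illustration, not part of the thesis) evaluates θ(q, p) for the one-dimensional standard Gaussian measure, with an arbitrarily chosen positive test function, via Gauss-Hermite quadrature:

```python
import numpy as np

# Quadrature for the standard Gaussian; any probability measure would do here.
nodes, weights = np.polynomial.hermite.hermgauss(80)
x = np.sqrt(2.0) * nodes
w = weights / np.sqrt(np.pi)              # normalized so that w sums to 1

f = 1.0 + 0.5 * np.cos(x)                 # arbitrary positive test function

def norm_sq(p):
    # ||f||_p^2 with respect to the Gaussian measure
    return float(np.sum(w * f ** p)) ** (2.0 / p)

def theta(q, p):
    # theta(q, p) = (||f||_q^2 - ||f||_p^2) / (1/p - 1/q), as in (5.4)
    return (norm_sq(q) - norm_sq(p)) / (1.0 / p - 1.0 / q)

qs = [1.5, 2.0, 3.0, 4.0]
for q1, q2 in zip(qs, qs[1:]):            # increasing in q, p fixed
    assert theta(q2, 1.2) >= theta(q1, 1.2) - 1e-12
ps = [1.0, 1.2, 1.5, 1.9]
for p1, p2 in zip(ps, ps[1:]):            # increasing in p, q fixed
    assert theta(2.5, p2) >= theta(2.5, p1) - 1e-12
```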
We can now prove a partial converse of Proposition 5.2.
Proposition 5.5. Suppose that µ satisfies LSI(q) with constant C. Then for all p ∈ [1, q), µ satisfies
Bec(q, p) with constant (q/p)C.
Proof. As before, it suffices to prove the implication for f ∈ C ∞ (Rn ) with f ≥ c > 0 for some constant c.
Since θ(q, p) is increasing in p ≤ q, we have
\[
\frac{\|f\|_q^2 - \|f\|_p^2}{q-p} = \frac{\theta(q,p)}{qp} \le \frac{1}{qp}\,\lim_{r\to q^-}\theta(q,r) = \frac{q}{p}\,\lim_{r\to q^-}\frac{\theta(q,r)}{qr} = \frac{q}{p}\,\lim_{r\to q^-}\frac{\|f\|_q^2 - \|f\|_r^2}{q-r}.
\]
This last limit is precisely
\[
\frac{d}{dp}\Big|_{p=q}\|f\|_p^2 = \frac{2}{q}\,\|f\|_q^{2-q}\int_{\mathbb{R}^n} f^q \log(f)\, d\mu - \frac{2}{q}\,\log(\|f\|_q)\,\|f\|_q^2.
\]
By LSI(q), this is less than or equal to C‖∇f‖_q², so
\[
\frac{\|f\|_q^2 - \|f\|_p^2}{q-p} \le \frac{q}{p}\, C\,\|\nabla f\|_q^2,
\]
which is Bec(q, p) with the claimed constant.
Remark 5.6. Propositions 5.2 and 5.5, together with the results of Chapter 3, tell us that the standard
Gaussian measure satisfies LSI(q) with constant 1 for q ≥ 2, and does not satisfy LSI(q) with any constant
for q < 2. We also see that for q ≥ 2 and 1 ≤ p ≤ q, Bec(q, p) with constant q/p can be deduced from the
logarithmic Sobolev inequality via the “soft” argument
LSI(2) (constant 1) ⇒ LSI(q) (constant 1) ⇒ Bec(q, p) (constant q/p)
where the first implication is by Proposition 5.1 and the second by Proposition 5.5. However, the sharp
constant 1 we obtained in Chapter 3 cannot be deduced in this indirect manner.
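As a quick numerical illustration (not part of the original argument), Bec(2, p) with the sharp constant 1 for the one-dimensional standard Gaussian can be checked by Gauss-Hermite quadrature; the test function below is an arbitrary smooth choice:

```python
import numpy as np

# Gauss-Hermite quadrature for the 1D standard Gaussian gamma:
# sum(w * F(x)) approximates the integral of F d(gamma).
nodes, weights = np.polynomial.hermite.hermgauss(80)
x = np.sqrt(2.0) * nodes
w = weights / np.sqrt(np.pi)              # normalized so that w sums to 1

def mean(vals):
    return float(np.sum(w * vals))

def norm_p(vals, p):
    return mean(np.abs(vals) ** p) ** (1.0 / p)

f = np.exp(0.3 * np.sin(x))               # arbitrary smooth positive test function
df = 0.3 * np.cos(x) * f                  # its derivative

grad_sq = mean(df ** 2)                   # ||grad f||_2^2
for p in [1.0, 1.25, 1.5, 1.75, 1.99]:
    lhs = norm_p(f, 2) ** 2 - norm_p(f, p) ** 2
    rhs = (2.0 - p) * grad_sq             # Bec(2, p) with constant 1
    assert lhs <= rhs + 1e-12, (p, lhs, rhs)
```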
Lemma 5.4 also tells us something about the relationship between inequality Bec(q, p) for different values
of the parameter q.
Corollary 5.7. Suppose 1 ≤ p ≤ q ≤ r. If µ satisfies Bec(r, p) with constant C, one has
\[
\|f\|_q^2 - \|f\|_p^2 \le \frac{r}{q}\, C(q-p)\,\|\nabla f\|_r^2.
\]
Proof. Since θ is increasing in q we have that for any p ≤ q,
\[
\frac{\|f\|_q^2 - \|f\|_p^2}{q-p} = \frac{1}{qp}\,\theta(q,p) \le \frac{1}{qp}\,\theta(r,p) = \frac{r}{q}\,\frac{\|f\|_r^2 - \|f\|_p^2}{r-p}.
\]
By hypothesis, this is less than or equal to Ck∇f k2r , which proves our result.
Taking µ = γⁿ, r = 2, and q ≤ 2, we see that Beckner's original inequality Bec(2, p) for γⁿ yields the estimate
\[
\|f\|_q^2 - \|f\|_p^2 \le \frac{2}{q}\,(q-p)\,\|\nabla f\|_2^2, \qquad 1 \le p \le q \le 2.
\]
From what we proved above, whenever 1 ≤ q ≤ r we have the implications
Bec(q, p) (all p < q, constant C) ⇒ LSI(q) (constant C)
⇒ LSI(r) (constant C) ⇒ Bec(r, p) (all p < r, constant (r/p)C)
There is another implication within inequality Bec(q, p).
Proposition 5.8. Suppose q ≥ 1 and µ satisfies Bec(q, p) with constant C for all p ∈ [1, q). Then for each r > q and each p ∈ [r/q, r), µ satisfies Bec(r, p) with constant (r/q)C.
Proof. If f ∈ W r,1 (µ), then |f | ∈ W r,1 (µ) with |∇|f || = |∇f | a.e., so it suffices to consider f ≥ 0. Apply
Bec(q, p) to the function f^{r/q}:
\[
\|f\|_r^{2r/q} - \|f\|_{rp/q}^{2r/q} \le C(q-p)\,\frac{r^2}{q^2}\left(\int_{\mathbb{R}^n} f^{r-q}\,|\nabla f|^q\, d\mu\right)^{2/q}. \tag{5.6}
\]
Apply Hölder’s inequality on the right side, with exponents r/q and r/(r − q), to get
\[
\left(\int_{\mathbb{R}^n} f^{r-q}\,|\nabla f|^q\, d\mu\right)^{2/q} \le \left(\|f\|_r^{r-q}\,\|\nabla f\|_r^{q}\right)^{2/q} = \|f\|_r^{2r/q-2}\,\|\nabla f\|_r^2.
\]
Plugging this into (5.6), dividing both sides by ‖f‖_r^{2r/q−2}, and rewriting the constant on the right yields
\[
\|f\|_r^2 - \|f\|_r^{2-2r/q}\,\|f\|_{rp/q}^{2r/q} \le C\left(r - \frac{rp}{q}\right)\frac{r}{q}\,\|\nabla f\|_r^2. \tag{5.7}
\]
Since rp/q ≤ r, we have
\[
\|f\|_r^{2r/q-2} \ge \|f\|_{rp/q}^{2r/q-2} \quad\Longrightarrow\quad \|f\|_r^{2-2r/q} \le \|f\|_{rp/q}^{2-2r/q}.
\]
Therefore, the left side of (5.7) is greater than or equal to ‖f‖_r² − ‖f‖_{rp/q}². As p ranges from 1 to q, rp/q ranges from r/q to r. Thus, we see that (5.7) implies Bec(r, p) with the asserted constant and range for the auxiliary parameter.
5.2 Inequality for the semigroup

In this section, we prove an inequality for the semigroup T_t whose generator is an extension of A, with A defined as in (4.1). This inequality is closely related to Beckner's p-norm inequality Bec(2, p), and generalizes Beckner's inequality for the Ornstein-Uhlenbeck semigroup in [5], this time for q ≤ 2.
Theorem 5.9 (Inequality for the Semigroup). Let µ = e−v(x) dx be a log-concave probability measure on
Rn with concavity b > 0. Let A be as in (4.1). Let f ∈ W 2,1 (µ). For p ∈ (1, 2], let t(p) be such that
e−2bt(p) = p − 1. Then whenever 1 ≤ p ≤ q ≤ 2,
\[
\|T_{t(q)} f\|_2^2 - \|T_{t(p)} f\|_2^2 \le (q - p)\,\frac{1}{b}\,\|\nabla f\|_2^2. \tag{5.8}
\]
Remark 5.10. If we take q = 2, then t(q) = 0 and T_0 f = f, so inequality (5.8) yields the relation
\[
\|f\|_2^2 - \|T_{t(p)} f\|_2^2 \le (2 - p)\,\frac{1}{b}\,\|\nabla f\|_2^2. \tag{5.9}
\]
In the Gaussian case (for which b = 1), this is the inequality for the Ornstein-Uhlenbeck semigroup which
Beckner proved in [5]. Our result here generalizes this inequality for other exponents and other measures.
Proof. By a standard approximation argument, we can assume that f ∈ C0∞ (Rn ) with C > f > c for some
C, c > 0. Define
\[
\phi(p) = \|T_{t(p)} f\|_2^2.
\]
Then (5.8) is equivalent to the relation
\[
\frac{\phi(q) - \phi(p)}{q - p} \le \frac{1}{b}\,\|\nabla f\|_2^2.
\]
The left side is a difference quotient of φ. Furthermore, the hypotheses on f imply that φ is differentiable.
So, by the mean value theorem, to prove our inequality it suffices to show that
\[
\phi'(p) \le \frac{1}{b}\,\|\nabla f\|_2^2
\]
for each p ∈ [1, 2]. This is the inequality we shall prove.
If we differentiate φ(p) in p and integrate by parts, we get
\[
\phi'(p) = t'(p)\,\frac{d}{dt}\int_{\mathbb{R}^n} (T_{t(p)} f)^2\, d\mu = -\frac{1}{b(p-1)}\int_{\mathbb{R}^n} A(T_{t(p)} f)\, T_{t(p)} f\, d\mu \tag{5.10}
\]
\[
= \frac{1}{b(p-1)}\int_{\mathbb{R}^n} |\nabla T_{t(p)} f|^2\, d\mu. \tag{5.11}
\]
By Proposition 4.9, this is less than or equal to
\[
\frac{e^{-2bt(p)}}{b(p-1)}\int_{\mathbb{R}^n} T_{t(p)} |\nabla f|^2\, d\mu = \frac{1}{b}\,\|\nabla f\|_2^2,
\]
as required.
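In the Gaussian case b = 1 the semigroup is the Ornstein-Uhlenbeck semigroup, which acts diagonally on the probabilists' Hermite polynomials (T_t He_k = e^{−kt} He_k, with ‖He_k‖₂² = k! and ‖He_k′‖₂² = k·k!), so inequality (5.8) can be checked exactly on finite Hermite expansions. The following sketch, with randomly chosen coefficients, is an illustration only:

```python
import math
import numpy as np

# Gaussian case b = 1: T_{t(p)} He_k = (p-1)^{k/2} He_k since e^{-2t(p)} = p - 1.
rng = np.random.default_rng(0)
K = 8
a = rng.normal(size=K + 1)                     # Hermite coefficients of f
k = np.arange(K + 1)
fact = np.array([math.factorial(i) for i in k], dtype=float)

def semigroup_norm_sq(p):
    # ||T_{t(p)} f||_2^2 = sum a_k^2 (p-1)^k k!
    return float(np.sum(a ** 2 * (p - 1.0) ** k * fact))

grad_sq = float(np.sum(a ** 2 * k * fact))     # ||grad f||_2^2

for p in np.linspace(1.0, 2.0, 11):
    for q in np.linspace(p, 2.0, 5):
        lhs = semigroup_norm_sq(q) - semigroup_norm_sq(p)
        assert lhs <= (q - p) * grad_sq + 1e-9  # inequality (5.8) with b = 1
```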
Remark 5.11. Equation (5.11) shows that φ(p) = ‖T_{t(p)} f‖₂² is an increasing function of p. In fact, if we apply the same argument n times, with ∂_j f in place of f, we get
\[
\phi''(p) = \frac{1}{b^2(p-1)^2}\sum_{j,k=1}^n \int_{\mathbb{R}^n} |\partial_{jk} T_{t(p)} f|^2\, d\mu \ge 0,
\]
whence φ(p) is a convex function of p. It follows that
\[
\frac{\phi(q) - \phi(p)}{q - p}
\]
is increasing in q and p, so the inequality gets sharper as q and p increase. In particular, Beckner's original inequality (5.9) with q = 2 is stronger than the inequality for smaller q.
Remark 5.12. In the Gaussian case, Nelson’s hypercontractivity inequality is essentially the statement that
\[
\|T_{t(p)} f\|_2 \le \|f\|_p.
\]
Thus, together with hypercontractivity, (5.8) immediately implies the inequality
\[
\|T_{t(q)} f\|_2^2 - \|f\|_p^2 \le (2 - p)\,\|\nabla f\|_2^2.
\]
In particular, if we take q = 2, we get Beckner’s p-norm inequality Bec(2, p). This is how Beckner originally
obtained this inequality. Since Bec(2, p) implies the logarithmic Sobolev inequality LSI(2) and LSI(2)
can be used to prove hypercontractivity [9], we see that the inequalities of this section, Beckner’s p-norm
inequalities, LSI(2), and hypercontractivity are all logically equivalent in the Gaussian case.
5.3 Brascamp-Lieb inequality
In this section, we prove a sharpened form of the Poincaré inequality for log-concave probability measures
on Rn , originally due to Brascamp and Lieb [10]:
Theorem 5.13. Let µ = e−v(x) dx be a log-concave probability measure on Rn with concavity b > 0. Let
f ∈ W 2,1 (µ). Then
\[
\|f\|_2^2 - \left(\int_{\mathbb{R}^n} f\, d\mu\right)^{2} \le \int_{\mathbb{R}^n} \langle (D^2 v)^{-1}\nabla f, \nabla f\rangle\, d\mu. \tag{5.12}
\]
Observe that if x ∈ ℝⁿ and y = (D²v)^{−1/2} x, then
\[
\langle (D^2 v)^{-1} x, x\rangle = \langle y, y\rangle \le \frac{1}{b}\,\langle (D^2 v)\, y, y\rangle = \frac{1}{b}\,|x|^2.
\]
Therefore inequality (5.12) is sharper than the inequality Bec(2, 1) with constant 1/b, which we proved in
Chapter 4.
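A one-dimensional numerical sketch makes this comparison concrete (illustrative only; the weight v(x) = x²/2 + 0.1x⁴ is an arbitrary choice with v″ ≥ b = 1, and f(x) = x is the near-extremal direction for the Poincaré inequality):

```python
import numpy as np

# 1D sketch: v(x) = x^2/2 + 0.1 x^4, so D^2 v = v'' = 1 + 1.2 x^2 >= b = 1.
xs = np.linspace(-10.0, 10.0, 40001)
dx = xs[1] - xs[0]
weight = np.exp(-(xs ** 2 / 2 + 0.1 * xs ** 4))
weight /= weight.sum() * dx                    # density of mu = e^{-v} dx / Z

def mean(vals):
    return float(np.sum(vals * weight) * dx)

f, df = xs, np.ones_like(xs)                   # f(x) = x
vpp = 1.0 + 1.2 * xs ** 2

variance = mean(f ** 2) - mean(f) ** 2         # left side of (5.12)
bl_bound = mean(df ** 2 / vpp)                 # right side of (5.12)
poincare = mean(df ** 2)                       # (1/b)||f'||_2^2 with b = 1

assert variance <= bl_bound + 1e-8             # Brascamp-Lieb (5.12)
assert bl_bound <= poincare + 1e-12            # sharper than the 1/b bound
```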
The original proof in [10] is by a direct, albeit lengthy, calculation and an induction argument. We take an
alternative, functional-analytic approach which yields several intermediate results which are of independent
interest. The first of these is:
Proposition 5.14. Define A := ∆ − ⟨∇v, ∇⟩ as in (4.1). Let Â be the self-adjoint extension of A which is the infinitesimal generator of the semigroup {T_t} of Chapter 4 (c.f. Appendix 6.2). Then Â is invertible from the set of f ∈ D(Â) with ∫_{ℝⁿ} f dµ = 0 to the set of g ∈ L²(µ) with ∫_{ℝⁿ} g dµ = 0. Furthermore, the inverse of Â is continuous with respect to the L² norm, and we have the inequality
\[
\|f\|_2 \le \frac{1}{b}\,\|\hat{A} f\|_2. \tag{5.13}
\]
To prove this, we need two lemmas.
Lemma 5.15. Let {Tt } be a contraction semigroup on a Hilbert space X with generator S. Let x ∈ X. Let
Y be a dense linear subspace of X, and suppose that for each y ∈ Y ,
\[
\lim_{s\to 0} \frac{1}{s}\,\langle T_s x - x, y\rangle = \langle z, y\rangle \tag{5.14}
\]
for some z ∈ X. Then Sx exists and equals z.
Proof. Since Y is dense, (5.14) holds for each y ∈ X, not just for y ∈ Y . For each t ≥ 0, Tt is symmetric
and so
\[
\lim_{s\to 0}\frac{1}{s}\,\langle T_{t+s} x - T_t x, y\rangle = \lim_{s\to 0}\frac{1}{s}\,\langle T_s x - x, T_t y\rangle = \langle z, T_t y\rangle = \langle T_t z, y\rangle.
\]
Thus
\[
\frac{d}{dt}\,\langle T_t x, y\rangle = \langle T_t z, y\rangle.
\]
Integrating, we get
\[
\langle T_t x - x, y\rangle = \int_0^t \langle T_s z, y\rangle\, ds
\]
for each y ∈ X. Therefore T_t x − x = ∫₀ᵗ T_s z ds. As t → 0,
\[
\frac{1}{t}\int_0^t T_s z\, ds \to z
\]
strongly, so t⁻¹(T_t x − x) → z strongly, i.e. Sx = z.
Now let {Tt } be the semigroup of Proposition 5.14.
Lemma 5.16. Let f ∈ L²(µ) with ∫_{ℝⁿ} f dµ = 0. Then
\[
\|T_t f\|_2 \le e^{-bt}\,\|f\|_2.
\]
Proof. By density it suffices to prove the formula for f ∈ D(A). Fix t > 0 and for t ≥ s ≥ 0 define
\[
\phi(s) = e^{-2bs}\int_{\mathbb{R}^n} (T_{t-s} f)^2\, d\mu.
\]
We have φ(0) = ‖T_t f‖₂², φ(t) = e^{−2bt}‖f‖₂², and
\[
\phi'(s) = -2b\, e^{-2bs}\int_{\mathbb{R}^n} (T_{t-s} f)^2\, d\mu + e^{-2bs}\int_{\mathbb{R}^n} \partial_s (T_{t-s} f)^2\, d\mu. \tag{5.15}
\]
We compute
\[
\int_{\mathbb{R}^n} \partial_s (T_{t-s} f)^2\, d\mu = -\int_{\mathbb{R}^n} 2\,(\hat{A} T_{t-s} f)(T_{t-s} f)\, d\mu = 2\int_{\mathbb{R}^n} |\nabla T_{t-s} f|^2\, d\mu.
\]
Plugging this into (5.15), we find
\[
\phi'(s) = 2\, e^{-2bs}\left(\|\nabla T_{t-s} f\|_2^2 - b\,\|T_{t-s} f\|_2^2\right).
\]
By the Poincaré inequality (note that T_{t−s} f has mean zero), this is non-negative. Thus φ is increasing, so in particular
\[
\|T_t f\|_2^2 = \phi(0) \le \phi(t) = e^{-2bt}\,\|f\|_2^2.
\]
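In the Gaussian case b = 1, Lemma 5.16 can be checked exactly on Hermite expansions, since T_t He_k = e^{−kt} He_k and a mean-zero f has no He₀ component; the sketch below (illustration only, with random coefficients) does so:

```python
import math
import numpy as np

# Gaussian case b = 1: decay e^{-bt} comes from the slowest mode k = 1.
rng = np.random.default_rng(1)
K = 6
a = rng.normal(size=K + 1)
a[0] = 0.0                                   # enforce mean zero
k = np.arange(K + 1)
fact = np.array([math.factorial(i) for i in k], dtype=float)

# ||T_t f||_2^2 = sum a_k^2 e^{-2kt} k!
norm_sq = lambda t: float(np.sum(a ** 2 * np.exp(-2.0 * k * t) * fact))

for t in [0.0, 0.3, 1.0, 3.0]:
    assert norm_sq(t) <= np.exp(-2.0 * t) * norm_sq(0.0) + 1e-12
```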
Proof of Proposition 5.14. For g ∈ L²(µ) with ∫_{ℝⁿ} g dµ = 0, define
\[
Bg = -\int_0^\infty T_t g\, dt. \tag{5.16}
\]
We claim that B = Â−1 . First we need to check that B is well defined and continuous. Let s > 0. By
Minkowski’s inequality for integrals,
\[
\left(\int_{\mathbb{R}^n}\left(\int_0^s T_t g\, dt\right)^{2} d\mu\right)^{1/2} \le \int_0^s \|T_t g\|_2\, dt. \tag{5.17}
\]
By Lemma 5.16, this is less than or equal to
\[
\int_0^s e^{-bt}\,\|g\|_2\, dt = \frac{1 - e^{-bs}}{b}\,\|g\|_2.
\]
Letting s → ∞, we find
\[
\|Bg\|_2 \le \frac{1}{b}\,\|g\|_2. \tag{5.18}
\]
Thus B : L²(µ) → L²(µ) continuously.
Now, if g = Âf, then ∫_{ℝⁿ} g dµ = ∫_{ℝⁿ} f Â(1) dµ = 0, and
\[
Bg = -\int_0^\infty T_t(\hat{A} f)\, dt = -\int_0^\infty \frac{d}{dt}\, T_t f\, dt = f.
\]
Thus BÂ = Id.
On the other hand, if g ∈ L²(µ) with ∫_{ℝⁿ} g dµ = 0, then
\[
\frac{1}{s}\,(T_s Bg - Bg) = \frac{1}{s}\int_0^\infty (T_t g - T_{t+s} g)\, dt,
\]
so if φ ∈ C_c^∞(ℝⁿ), we get from Fubini's theorem and symmetry of T_t that
\[
\int_{\mathbb{R}^n} \frac{1}{s}\,(T_s Bg - Bg)\,\varphi\, d\mu = \int_0^\infty\!\!\int_{\mathbb{R}^n} \frac{1}{s}\,(T_t g - T_{t+s} g)\,\varphi\, d\mu\, dt = \int_0^\infty\!\!\int_{\mathbb{R}^n} \frac{1}{s}\,(T_t \varphi - T_{t+s}\varphi)\, g\, d\mu\, dt.
\]
Note that Lemma 5.16 and Hölder’s inequality show that Tt gφ is jointly integrable in t and over Rn , so that
the application of Fubini's theorem is justified. As s → 0, this tends to
\[
-\int_0^\infty\!\!\int_{\mathbb{R}^n} (\hat{A} T_t \varphi)\, g\, d\mu\, dt = \int_{\mathbb{R}^n} \hat{A}\left(-\int_0^\infty T_t \varphi\, dt\right) g\, d\mu = \int_{\mathbb{R}^n} \varphi\, g\, d\mu.
\]
Thus by Lemma 5.15, ÂBg exists and equals g, as required. Thus B = Â−1 and estimate (5.13) is immediate
from (5.18).
Now we proceed to prove Theorem 5.13. Our proof is based on that of B. Helffer [18]. Like the proof of
Theorem 4.2, this proof relies on a commutation relation between the operator  defined in (4.1) and the
gradient operator.
Let L²(µ)ⁿ be the space of n-component vector-valued functions with components in L²(µ), with inner product ⟨F, G⟩_{L²(µ)ⁿ} = ∫_{ℝⁿ} ⟨F, G⟩ dµ. Let L : (L²(µ))ⁿ → (L²(µ))ⁿ be the unbounded operator defined by
\[
LF = (D^2 v)F - AF, \tag{5.19}
\]
where A acts componentwise on vector-valued functions. Then if f ∈ C_c^∞(ℝⁿ),
\[
\nabla A f = \nabla \Delta f - (D^2 f)\nabla v - (D^2 v)\nabla f = -L\nabla f. \tag{5.20}
\]
Proof of Theorem 5.13. By approximation, it suffices to prove the inequality for f ∈ C^∞(ℝⁿ), with f constant outside a compact set. Since the inequality is invariant under adding a constant to f, we may also assume that ∫_{ℝⁿ} f dµ = 0. Then by Proposition 5.14, g := Â⁻¹f is well defined. By the integration by parts formula for A, one has
\[
\|f\|_2^2 = \int_{\mathbb{R}^n} (\hat{A} g)^2\, d\mu = -\int_{\mathbb{R}^n} \langle \nabla \hat{A} g, \nabla g\rangle\, d\mu = \int_{\mathbb{R}^n} \langle L\nabla g, \nabla g\rangle\, d\mu. \tag{5.21}
\]
We have
\[
\nabla f = \nabla \hat{A} g = -L\nabla g. \tag{5.22}
\]
For any F in the domain of L, we have
\[
\langle LF, F\rangle_{L^2(\mu)^n} = \langle (D^2 v)F, F\rangle_{L^2(\mu)^n} - \langle \hat{A} F, F\rangle_{L^2(\mu)^n} = \langle (D^2 v)F, F\rangle_{L^2(\mu)^n} + \|\nabla F\|_2^2. \tag{5.23}
\]
The second term on the right in (5.23) is non-negative, so
\[
\langle LF, F\rangle_{L^2(\mu)^n} \ge \langle (D^2 v)F, F\rangle_{L^2(\mu)^n} \ge b\,\|F\|_2^2. \tag{5.24}
\]
Therefore the operator L̃ := bI − L is non-positive on L²(µ)ⁿ. Clearly this operator is symmetric, and its domain contains C_c^∞(ℝⁿ)ⁿ, so is dense in L²(µ)ⁿ. So, exactly as we do for A in Appendix 6.2, we can apply the Friedrichs extension theorem to extend L̃ to a self-adjoint operator on L²(µ)ⁿ, still denoted by L̃, then apply the spectral theorem to get a contraction semigroup with generator L̃. It then follows from the Hille-Yosida theorem that the extension L̂ := bI − L̃ of L is invertible with inverse satisfying ‖L̂⁻¹F‖ ≤ (1/b)‖F‖_{L²(µ)ⁿ} for each F ∈ L²(µ)ⁿ.
Since F ↦ ⟨L̂F, F⟩_{L²(µ)ⁿ} defines an inner product on the domain of L̂, we have by the Cauchy-Schwarz inequality that whenever F ∈ L²(µ)ⁿ and G is in the domain of L̂,
\[
|\langle F, G\rangle_{L^2(\mu)^n}|^2 = |\langle \hat{L}\hat{L}^{-1} F, G\rangle_{L^2(\mu)^n}|^2 \le \langle \hat{L}^{-1} F, F\rangle_{L^2(\mu)^n}\,\langle \hat{L} G, G\rangle_{L^2(\mu)^n}
\]
with equality iff G = L̂⁻¹F. Therefore
\[
\langle \hat{L}^{-1} F, F\rangle_{L^2(\mu)^n} = \sup\left\{ \frac{|\langle F, G\rangle_{L^2(\mu)^n}|^2}{\langle \hat{L} G, G\rangle_{L^2(\mu)^n}} : G \in D(\hat{L}) \right\}.
\]
By (5.24), this is less than or equal to
\[
\sup\left\{ \frac{|\langle F, G\rangle_{L^2(\mu)^n}|^2}{\langle (D^2 v) G, G\rangle_{L^2(\mu)^n}} : G \in D(\hat{L}) \right\}.
\]
By the same argument above with L̂ replaced by D²v (and the fact that D(L̂) is dense in L²(µ)ⁿ), this equals ⟨(D²v)⁻¹F, F⟩_{L²(µ)ⁿ}, and hence
\[
\langle \hat{L}^{-1} F, F\rangle_{L^2(\mu)^n} \le \langle (D^2 v)^{-1} F, F\rangle_{L^2(\mu)^n}. \tag{5.25}
\]
From (5.22) one has
\[
\nabla g = -\hat{L}^{-1}\nabla f. \tag{5.26}
\]
Then applying this and (5.25) with F = ∇f, we get
\[
\langle \hat{L}\nabla g, \nabla g\rangle_{L^2(\mu)^n} = \langle \hat{L}^{-1}\nabla f, \nabla f\rangle_{L^2(\mu)^n} \le \langle (D^2 v)^{-1}\nabla f, \nabla f\rangle_{L^2(\mu)^n},
\]
and plugging this into (5.21) gives our desired inequality.
Remark 5.17. Inequality (5.12) suggests that one might look for an analogous version of inequality Bec(2, p) or LSI(2) for log-concave probability measures, i.e. an inequality of the form
\[
\frac{2}{q}\,\|f\|_q^{2-q}\int_{\mathbb{R}^n} |f|^q \log|f|\, d\mu - \frac{2}{q}\,\log(\|f\|_q)\,\|f\|_q^2 \le C\left(\int_{\mathbb{R}^n} \langle (D^2 v)^{-1}\nabla f, \nabla f\rangle^{q/2}\, d\mu\right)^{2/q} \tag{5.27}
\]
or
\[
\|f\|_q^2 - \|f\|_p^2 \le \frac{q}{p}\, C(q-p)\left(\int_{\mathbb{R}^n} \langle (D^2 v)^{-1}\nabla f, \nabla f\rangle^{q/2}\, d\mu\right)^{2/q}. \tag{5.28}
\]
Bobkov and Ledoux [7] have demonstrated that this is not possible in general, although (5.27) holds for q = 2 under additional regularity hypotheses on the measure. In the next section, we give their proof, and deduce (5.27) and (5.28) for general q, with a suitable constant, as a corollary.
5.4 Sharpened logarithmic Sobolev inequality
In this section, we shall prove a sharpened logarithmic Sobolev inequality for a restricted class of log-concave
probability measures due to Bobkov and Ledoux [7], which is analogous to the sharpened Poincaré inequality
of Section 5.3. The key tool is the following deep convexity inequality:
Theorem 5.18 (Prékopa-Leindler). Let f, g, h be non-negative, measurable functions on Rn . Suppose that
0 < t < 1 and for each x, y ∈ Rn ,
\[
f((1-t)x + ty) \ge g(x)^{1-t}\, h(y)^{t}.
\]
Then
\[
\int_{\mathbb{R}^n} f(x)\, dx \ge \left(\int_{\mathbb{R}^n} g(x)\, dx\right)^{1-t}\left(\int_{\mathbb{R}^n} h(x)\, dx\right)^{t}.
\]
For the proof, see [24]. We remark that Bobkov and Ledoux [7] have also used the Prékopa-Leindler
theorem to prove the Brascamp-Lieb inequality of Section 5.3, via an argument similar to the one below.
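A one-dimensional illustration (not part of the original text): for the Gaussian bumps g and h below, f(z) = exp(−(z − t)²) satisfies the pointwise hypothesis along z = (1−t)x + ty (it is in fact the smallest such f), so both the hypothesis and the integral conclusion of the theorem can be verified numerically:

```python
import numpy as np

t = 0.4
g = lambda x: np.exp(-x ** 2)
h = lambda y: np.exp(-(y - 1.0) ** 2)
f = lambda z: np.exp(-(z - t) ** 2)    # sup-convolution of g^(1-t), h^t

# Pointwise hypothesis f((1-t)x + ty) >= g(x)^(1-t) h(y)^t on a grid.
grid = np.linspace(-5.0, 6.0, 301)
X, Y = np.meshgrid(grid, grid)
assert np.all(f((1 - t) * X + t * Y) >= g(X) ** (1 - t) * h(Y) ** t - 1e-12)

# Integral conclusion (here with equality, since f is minimal).
z = np.linspace(-20.0, 21.0, 200001)
dz = z[1] - z[0]
If, Ig, Ih = (fn(z).sum() * dz for fn in (f, g, h))
assert If >= Ig ** (1 - t) * Ih ** t - 1e-6
```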
Theorem 5.19 (Bobkov-Ledoux). Let µ = e−v(x) dx be a log-concave probability measure on an open convex
set Ω ⊂ Rn .1 Let f ∈ W 2,1 (µ). Suppose that for any h ∈ Rn , the function x 7→ hD2 v(x)h, hi is concave on
Ω. Then one has the sharpened logarithmic Sobolev inequality
\[
\int_\Omega f^2 \log|f|\, d\mu - \|f\|_2^2\,\log(\|f\|_2) \le C\int_\Omega \langle (D^2 v)^{-1}\nabla f, \nabla f\rangle\, d\mu \tag{5.29}
\]
with constant C = 3/2.
Note that, unlike our previous results, this inequality does not hold for general log-concave probability
measures, even with a different constant. A counterexample is given below. The minimal hypotheses on µ
required to obtain inequality (5.29) are, to our knowledge, not known.
Furthermore, the constant 3/2 in inequality (5.29) is not sharp in all cases. For example, if µ = γ n is the
standard Gaussian measure, then (5.29) is just the ordinary logarithmic Sobolev inequality with constant C,
and holds with C = 1. Below, we give a counterexample to the effect that the constant cannot be improved
to C = 1 in all cases. The sharpest possible constants in (5.29) are not known in general.
Proof. By the usual approximation arguments, we can take f to be smooth with 0 < c ≤ f ≤ C < ∞. In
fact, by approximating and rescaling, we can arrange that, in addition, f ≡ 1 outside of a compact set. Then
we can write f 2 = eg , where g ∈ Cc∞ (Ω).
Let t, s > 0 with t + s = 1. Set
gt (z) = sup{g(x) − [tv(x) + sv(y) − v(z)] : x, y ∈ Ω, z = tx + sy}.
We shall apply the Prékopa-Leindler theorem to the functions
\[
e^{g_t - v}\chi_\Omega, \qquad e^{g/t - v}\chi_\Omega, \qquad e^{-v}\chi_\Omega,
\]
where χ denotes an indicator function. By definition of gt , we have
\[
e^{g_t(tx+sy) - v(tx+sy)} \ge e^{g(x) - [t v(x) + s v(y)]} = \left(e^{g(x)/t - v(x)}\right)^{t}\left(e^{-v(y)}\right)^{s},
\]
so the hypotheses of the theorem are satisfied and we get
\[
\int_\Omega e^{g_t}\, d\mu \ge \left(\int_\Omega e^{g/t}\, d\mu\right)^{t}. \tag{5.30}
\]
We shall take the limit as t → 1, s → 0 to obtain our desired inequality. To simplify notation in what follows,
we denote the left side of (5.29) by Ent(f ) (for “entropy”).
First we shall see how Ent(f) arises from the right side of (5.30). By logarithmic differentiation, one has
\[
\frac{d}{dt}\,\|f\|_{1/t} = -\frac{1}{t}\,\|f\|_{1/t}^{1-1/t}\int_\Omega f^{1/t}\log(f)\, d\mu + \frac{1}{t}\,\log(\|f\|_{1/t})\,\|f\|_{1/t}.
\]
1 A log-concave probability measure on Ω is defined in an analogous manner to a log-concave probability measure on Rn .
Sobolev spaces for these measures are defined as in Appendix 6.1, but with Cc∞ (Ω) replaced by the set of smooth functions f
on Ω whose derivatives up to order m are in L2 (µ).
If we replace f by f² and evaluate at t = 1, we get
\[
\frac{d}{dt}\Big|_{t=1}\|f^2\|_{1/t} = -2\int_\Omega f^2\log(f)\, d\mu + 2\,\log(\|f\|_2)\,\|f\|_2^2 = -2\,\mathrm{Ent}(f).
\]
Thus, recalling that e^g = f², we get by Taylor expansion at t = 1 that
\[
\left(\int_\Omega e^{g/t}\, d\mu\right)^{t} = \int_\Omega e^{g}\, d\mu + 2s\,\mathrm{Ent}(f) + O(s^2). \tag{5.31}
\]
We now need a suitable estimate on gt . Let
L(s) := tv(x) + sv(y) − v(z)
be the quantity subtracted from g(x) in the formula defining gt , where z = tx + sy. Put k = x − y. One has
\[
\frac{d}{dr}\,\langle \nabla v(rz + (1-r)x), k\rangle = -s\,\langle D^2 v(rz + (1-r)x)\, k, k\rangle
\]
and
\[
-\frac{1}{s}\,\frac{d}{dr}\, v(rz + (1-r)x) = \langle \nabla v(rz + (1-r)x), k\rangle.
\]
So, integrating by parts, we find that
\[
\int_0^1 rs\,\langle D^2 v(rz + (1-r)x)\, k, k\rangle\, dr = -\langle \nabla v(z), k\rangle + \int_0^1 \langle \nabla v(rz + (1-r)x), k\rangle\, dr
= -\langle \nabla v(z), k\rangle - \frac{1}{s}\,v(z) + \frac{1}{s}\,v(x).
\]
Similarly,
\[
\int_0^1 rt\,\langle D^2 v(rz + (1-r)y)\, k, k\rangle\, dr = \langle \nabla v(z), k\rangle - \frac{1}{t}\,v(z) + \frac{1}{t}\,v(y).
\]
Thus
\[
L(s) = ts\int_0^1 \left[ rs\,\langle D^2 v(rz + (1-r)x)\, k, k\rangle + rt\,\langle D^2 v(rz + (1-r)y)\, k, k\rangle \right] dr. \tag{5.32}
\]
By our hypothesis on v,
\[
\langle D^2 v(rz + (1-r)x)\, k, k\rangle \ge r\,\langle D^2 v(z)\, k, k\rangle + (1-r)\,\langle D^2 v(x)\, k, k\rangle \ge r\,\langle D^2 v(z)\, k, k\rangle,
\]
and similarly with x replaced by y. Thus we find
\[
L(s) \ge ts\int_0^1 r^2\,\langle D^2 v(z)\, k, k\rangle\, dr = \frac{ts}{3}\,\langle D^2 v(z)\, k, k\rangle.
\]
Hence,
\[
g_t(z) \le \sup\left\{ g(x) - \frac{ts}{3}\,\langle D^2 v(z)\, k, k\rangle : x, y \in \Omega,\ z = tx + sy \right\}.
\]
We have x = z + sk, so by Taylor expansion about s = 0 we obtain
\[
g_t(z) \le g(z) + s\,\sup_{k}\left\{ \langle \nabla g(z), k\rangle - \frac{1}{3}\,\langle D^2 v(z)\, k, k\rangle \right\} + O(s^2).
\]
Now fix z and make the evaluation at this z implicit. Differentiating in k, we find that the quantity inside
the supremum is maximized by choosing k such that
\[
\nabla g - \frac{2}{3}\,(D^2 v)\, k = 0.
\]
This is equivalent to k = (3/2)(D²v)⁻¹∇g, and hence
\[
g_t \le g + \frac{3s}{2}\,\langle (D^2 v)^{-1}\nabla g, \nabla g\rangle - \frac{3s}{4}\,\langle (D^2 v)^{-1}\nabla g, \nabla g\rangle + O(s^2)
= g + \frac{3s}{4}\,\langle (D^2 v)^{-1}\nabla g, \nabla g\rangle + O(s^2).
\]
From the Taylor expansion of the exponential function, we then get
\[
e^{g_t} \le e^{g}\, e^{\frac{3s}{4}\langle (D^2 v)^{-1}\nabla g, \nabla g\rangle + O(s^2)} = e^{g} + \frac{3s}{4}\, e^{g}\,\langle (D^2 v)^{-1}\nabla g, \nabla g\rangle + O(s^2).
\]
Substitute this inequality and (5.31) into (5.30) to obtain
\[
\int_\Omega e^{g}\, d\mu + 2s\,\mathrm{Ent}(f) + O(s^2) \le \int_\Omega e^{g}\, d\mu + \frac{3s}{4}\int_\Omega e^{g}\,\langle (D^2 v)^{-1}\nabla g, \nabla g\rangle\, d\mu + O(s^2).
\]
Cancelling ∫_Ω e^g dµ on both sides, dividing by s, and then letting s → 0, we find
\[
2\,\mathrm{Ent}(f) \le \frac{3}{4}\int_\Omega e^{g}\,\langle (D^2 v)^{-1}\nabla g, \nabla g\rangle\, d\mu = 3\int_\Omega \langle (D^2 v)^{-1}\nabla f, \nabla f\rangle\, d\mu.
\]
This completes the proof.
Recall that in Section 5.1, we established the chain of implications:
LSI(2) (constant C) ⇒ LSI(q) (constant C) ⇒ Bec(q, p) (constant (q/p)C)
for q ≥ 2, 1 ≤ p < q. (The first implication is Proposition 5.1 and the second is Proposition 5.5.) Exactly the same arguments used there, with ℝⁿ replaced by Ω and ‖∇f‖_q² replaced by (∫_Ω ⟨(D²v)⁻¹∇f, ∇f⟩^{q/2} dµ)^{2/q}, yield
Corollary 5.20. Let µ satisfy the hypotheses of Theorem 5.19, q ≥ 2, 1 ≤ p < q, f ∈ W q,1 (µ). Then
\[
\frac{2}{q}\,\|f\|_q^{2-q}\int_\Omega |f|^q\log|f|\, d\mu - \frac{2}{q}\,\log(\|f\|_q)\,\|f\|_q^2 \le C\left(\int_\Omega \langle (D^2 v)^{-1}\nabla f, \nabla f\rangle^{q/2}\, d\mu\right)^{2/q} \tag{5.33}
\]
and
\[
\|f\|_q^2 - \|f\|_p^2 \le \frac{q}{p}\, C(q-p)\left(\int_\Omega \langle (D^2 v)^{-1}\nabla f, \nabla f\rangle^{q/2}\, d\mu\right)^{2/q} \tag{5.34}
\]
with constant C = 3/2.
To see that Theorem 5.19 cannot hold without the additional concavity hypothesis on v, and that the
constant in this theorem cannot be sharpened to C = 1, we proceed by way of the following two results,
which appear in [23].
Proposition 5.21 (Herbst Argument). Let µ be a log-concave probability measure which satisfies (5.29). Then for any function f ∈ W^{2,1}(µ) with ⟨(D²v)⁻¹∇f, ∇f⟩ ≤ 1 a.e. and ∫_Ω f dµ =: α < ∞, and for any t > 0,
\[
\int_\Omega e^{2tf}\, d\mu \le e^{2Ct^2 + \alpha t}. \tag{5.35}
\]
Proof. By approximating the general f satisfying the conditions of the proposition with continuously differentiable functions in the C^{0,1} norm, we may assume that f is continuously differentiable. Let φ(t) = ‖e^{tf}‖₂² and g = e^{tf}/φ(t)^{1/2}. By (5.29),
\[
\int_\Omega g^2\log g\, d\mu \le C\int_\Omega \langle (D^2 v)^{-1}\nabla g, \nabla g\rangle\, d\mu.
\]
Note that the second term on the left vanishes since ‖g‖₂ = 1. In terms of f and φ, this reads
\[
\frac{t\,\phi'(t)}{2\phi(t)} - \frac{1}{2}\,\log\phi(t) \le \frac{C}{\phi(t)}\int_\Omega t^2\,\langle (D^2 v)^{-1}\nabla f, \nabla f\rangle\, e^{2tf}\, d\mu.
\]
After dividing by t², the left side of this inequality equals (1/2) d/dt [log φ(t)/t]. On the right we have ⟨(D²v)⁻¹∇f, ∇f⟩ ≤ 1, so
\[
\frac{1}{2}\,\frac{d}{dt}\,\frac{\log\phi(t)}{t} \le C.
\]
We have lim_{t→0} log φ(t)/t = ∫_Ω f dµ = α, so if we integrate both sides and use the fundamental theorem of calculus we get
\[
\frac{\log\phi(t)}{2t} \le Ct + \alpha/2.
\]
Rearranging terms gives (5.35).
Corollary 5.22. If f satisfies the hypotheses of Proposition 5.21 and 0 < λ < 1/(2C), then
\[
\int_\Omega e^{\lambda f^2}\, d\mu < \infty.
\]
Proof. By Chebyshev's inequality and (5.35), for any t, r > 0,
\[
\mu\{2f(x) \ge \alpha + r\} \le \mu\{e^{2tf(x)} \ge e^{\alpha t + rt}\} \le e^{2Ct^2 + \alpha t - \alpha t - rt} = e^{2Ct^2 - rt}.
\]
The right side is minimized when t = r/(4C), in which case we have
\[
\mu\{2f(x) \ge \alpha + r\} \le e^{-r^2/8C}.
\]
By Fubini's theorem, one then has
\[
\int_\Omega e^{\lambda f^2}\, d\mu = \int_\Omega e^{(\lambda/4)(2f)^2}\, d\mu = 1 + \frac{\lambda}{2}\int_0^\infty r\, e^{(\lambda/4)r^2}\,\mu\{2f(x) \ge r\}\, dr \le 1 + \frac{\lambda}{2}\int_0^\infty r\, e^{(\lambda/4)r^2 - r^2/8C}\, dr,
\]
which is finite provided λ < 1/(2C).
For example, consider v(x) = −log(2x) on (0, 1). The function v is strictly convex on (0, 1) and the integral of e^{−v(x)} over (0, 1) is 1, so µ = e^{−v(x)} dx = 2x dx defines a log-concave probability measure on (0, 1). This measure is also introduced as a counterexample in [7]. Let f(x) = log x on (0, 1). We have f′²/v″ ≡ 1 and
\[
\int_0^1 f\, d\mu = 2\int_0^1 x\log x\, dx = -1/2.
\]
So, if µ satisfies (5.29) for some constant C, then Corollary 5.22 implies that we must in particular have ∫₀¹ e^{λf²} dµ < ∞ for any 0 < λ < 1/(2C). But, for any such λ,
\[
\int_0^1 e^{\lambda f^2}\, d\mu = 2\int_0^1 x\, e^{\lambda(\log x)^2}\, dx = \infty.
\]
Thus µ cannot satisfy (5.29) for any constant C. Note further that (5.34) with q = 2 implies (5.29) with the
same constant, so µ cannot satisfy (5.34) with q = 2 and any constant either.
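The computations in this counterexample are easy to check numerically. The sketch below (illustration only) verifies f′²/v″ ≡ 1 and ∫f dµ = −1/2, and watches the partial integrals of x·exp(λ(log x)²) blow up as the lower cutoff shrinks:

```python
import numpy as np

# mu = 2x dx on (0,1), i.e. v(x) = -log(2x); f(x) = log x.
x = np.linspace(1e-8, 1.0, 200_001)
dx = x[1] - x[0]
fp2_over_vpp = (1.0 / x) ** 2 / (1.0 / x ** 2)   # f'^2 / v''
assert np.allclose(fp2_over_vpp, 1.0)

mean_f = np.sum(2.0 * x * np.log(x)) * dx        # integral of f d(mu)
assert abs(mean_f - (-0.5)) < 1e-3

# Substituting u = -log x, the integral over (eps, 1) becomes the integral
# of exp(-2u + lam*u^2) over (0, -log eps), which blows up as eps -> 0.
lam = 0.5
def tail(eps):
    u = np.linspace(0.0, -np.log(eps), 200_001)
    return np.sum(np.exp(-2.0 * u + lam * u ** 2)) * (u[1] - u[0])

vals = [tail(e) for e in (1e-2, 1e-4, 1e-6)]
assert vals[0] < vals[1] < vals[2] and vals[2] > 1e20
```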
As another example, take v(x) = xᵖ on (1, ∞) for 2 ≤ p < 3. Then v is strictly convex, so µ = Ce^{−v(x)} dx, where C = (∫₁^∞ e^{−v(x)} dx)⁻¹, defines a log-concave probability measure on (1, ∞). Furthermore v″ is concave, so µ satisfies the hypotheses of Theorem 5.19. Let
\[
f(x) = \frac{2\sqrt{p(p-1)}}{p}\, x^{p/2}.
\]
Then f′²/v″ ≡ 1 and ∫₁^∞ f dµ < ∞, so the hypotheses of Proposition 5.21 are satisfied. For λ > 0,
\[
\int_1^\infty e^{\lambda f^2}\, d\mu = C\int_1^\infty \exp\left(\frac{4(p-1)}{p}\,\lambda x^p - x^p\right) dx.
\]
This is finite if and only if λ < p/(4(p−1)). Therefore Corollary 5.22 implies that µ cannot satisfy (5.29) for any constant C < 2(p−1)/p. In particular, µ cannot satisfy (5.29) with constant 1 unless p = 2.
Remark 5.23. To our knowledge it is not known what hypotheses on the measure are required to obtain
(5.33) and (5.34) in general, nor what the sharpest possible constants in these inequalities are.
6 Appendices

6.1 Sobolev Spaces for Log-Concave Measures
In this appendix, we shall define the measures and function spaces which are the settings for our inequalities, and prove some of their basic properties. Recall that a multi-index is an element α = (α₁, ..., αₙ) of ℕⁿ. We write
\[
|\alpha| = \sum_{k=1}^n \alpha_k, \qquad \partial_\alpha = \prod_{k=1}^n \partial_k^{\alpha_k}.
\]
Given a log-concave probability measure µ on Rn , p ∈ [1, ∞), and m ∈ N, define a norm on the space
∞
Cc (Rn ) of smooth functions with compact support by
\[
\|f\|_{W^{p,m}(\mu)} := \sum_{|\alpha| \le m} \|\partial_\alpha f\|_p.
\]
The completion of Cc∞ (Rn ) under this norm is called the mth Sobolev space with exponent p, and is denoted
by W p,m (µ).
A sequence (f_j) in C_c^∞(ℝⁿ) converges to f ∈ C^∞(ℝⁿ) in W^{p,m}(µ) if and only if ∂_α f_j → ∂_α f in Lᵖ for each α with |α| ≤ m. Likewise, if (f_j) is a Cauchy sequence in C_c^∞(ℝⁿ), then whenever |α| ≤ m, (∂_α f_j) is a Cauchy sequence in Lᵖ. Since Lᵖ is complete, (f_j) converges to a function f ∈ Lᵖ(µ) in the Lᵖ norm, and each (∂_α f_j) converges as well. We want to define the Sobolev partial derivatives ∂_α f of f as the Lᵖ limits of the sequences (∂_α f_j). However, we first need to check that these limits do not depend on the choice of sequence (f_j) in C_c^∞(ℝⁿ). By induction on m, it will suffice to establish the following:
Proposition 6.1. Suppose that two sequences (fj ) and (gj ) in Cc∞ (Rn ) converge to the same function f in
Lp (µ), and for each k = 1, ..., n, (∂k fj ) and (∂k gj ) converge in Lp (µ). Then for each k,
\[
\lim_{j\to\infty} \partial_k f_j = \lim_{j\to\infty} \partial_k g_j.
\]
Proof. By replacing fj by fj − gj , it suffices to assume that fj → 0 in Lp and to show that hk :=
limj→∞ ∂k fj = 0 for each k = 1, ..., n. Let φ ∈ Cc∞ (Rn ). Then by the integration by parts formula
(4.3) and the product rule,
\[
\int_{\mathbb{R}^n} h_k\,\varphi\, d\mu = \lim_{j\to\infty}\int_{\mathbb{R}^n} (\partial_k f_j)\,\varphi\, d\mu = \lim_{j\to\infty}\int_{\mathbb{R}^n} f_j\,(\varphi\,\partial_k v - \partial_k \varphi)\, d\mu.
\]
But, since f_j → 0 in Lᵖ(µ), this limit is zero. Therefore ∫_{ℝⁿ} h_k φ dµ = 0 for each φ ∈ C_c^∞(ℝⁿ), which implies that h_k = 0 a.e.
In particular, Proposition 6.1 implies that the Sobolev partial derivatives of a function in Cc∞ (Rn ) agree
with its ordinary partial derivatives, so there is no ambiguity in using the same notation for both. We remark
that, by definition, Cc∞ (Rn ) is dense in W p,m (µ), with respect to both its own norm and the Lp norm. As
such, by approximating more general functions by smooth, compactly supported ones, it often suffices to
prove results only on Cc∞ (Rn ).
Next we establish that the spaces W p,m (µ) are sufficiently large to be of interest. Recall that C m (Rn )
denotes the set of m-times continuously differentiable functions on Rn .
Proposition 6.2. If f ∈ C m (Rn ) and f and each of its partial derivatives up to order m are in Lp (µ), then
f ∈ W p,m (µ).
Proof. First we show that the set Ccm (Rn ) of m-times continuously differentiable functions with compact
support is contained in W p,m (µ). Given f ∈ Ccm (Rn ), we need to approximate f in the W p,m norm
by functions in C_c^∞(ℝⁿ). For this, let g ∈ C_c^∞(ℝⁿ) be non-negative with Lebesgue integral 1. Set g_t(x) = t⁻ⁿ g(x/t), and let
\[
f * g_t(x) = \int_{\mathbb{R}^n} g_t(x - y)\, f(y)\, dy
\]
be the convolution of f and gt . Since f and gt are compactly supported, so is f ∗ gt . By the mean value
theorem and the dominated convergence theorem, we can differentiate under the integral sign to obtain that
f ∗ gt is smooth, and that for each multi-index α with |α| ≤ m,
∂α (f ∗ gt ) = f ∗ (∂α gt ) = (∂α f ) ∗ gt .
It is a standard theorem in the theory of Lp spaces that the functions gt are an approximate identity, in the
sense that h ∗ gt → h in Lp (dx) (and hence also in Lp (µ)) for any h ∈ Lp (dx). In particular,
(∂α f ) ∗ gt → ∂α f
in Lp (µ) as t → 0.
So, f ∗ gt → f in W p,m (µ), which implies f ∈ W p,m (µ).
Now suppose f satisfies the hypotheses of the theorem, but may not be compactly supported. Since
W p,m is complete, in light of the above it will suffice to approximate f in the W p,m norm by functions in
C_c^m(ℝⁿ). To do so, set
Bj = {x ∈ Rn : |x| ≤ j}.
Take a sequence of smooth bump functions β_j, equal to 1 on B_j, 0 on ℝⁿ \ B_{j+1}, and with uniformly bounded partial derivatives up to order m. Then if f_j := β_j f, we have ∂_α f_j → ∂_α f pointwise for each |α| ≤ m. The
product rule allows us to bound the integrals of |∂α fj |p over Bj+1 \ Bj in terms of the integrals of the partial
derivatives of f over this region. Since each ∂α f ∈ Lp , said integrals tend to zero, and we get fj → f in
W p,m .
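The mollification step in the proof above can be illustrated numerically. The sketch below uses a Gaussian kernel in place of a compactly supported bump (an inessential simplification for the illustration) and shows the discrete L² error of f * g_t shrinking as t → 0 for the hat function:

```python
import numpy as np

xs = np.linspace(-4.0, 4.0, 8001)
dx = xs[1] - xs[0]
f = np.maximum(0.0, 1.0 - np.abs(xs))      # hat function, merely C^0

def mollify(t):
    # discrete approximate identity of width t (Gaussian for simplicity)
    s = np.arange(-4 * t, 4 * t + dx, dx)
    g = np.exp(-s ** 2 / (2 * t ** 2))
    g /= g.sum()                           # unit mass
    return np.convolve(f, g, mode="same")

errors = []
for t in (0.5, 0.1, 0.02):
    err = (np.sum(np.abs(mollify(t) - f) ** 2) * dx) ** 0.5   # L^2 error
    errors.append(err)

assert errors[0] > errors[1] > errors[2]   # error decreases as t -> 0
assert errors[-1] < 1e-2
```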
The following proposition will often allow us to reduce proofs to the case of non-negative functions.
Proposition 6.3. Let f ∈ C 1 (Rn ) ∩ W p,1 (µ). Then |f | ∈ W p,1 (µ) and |∇f | = |∇|f || a.e.
Proof. We first define a family of functions which will act as a C 1 approximation of the absolute value
function. For y ∈ R, t > 0, set
Ft (y) = (y 2 + t2 )1/2 − t.
We claim that F_t ∘ f → |f| in W^{p,1}(µ) as t → 0. Indeed, we have F_t ∘ f → |f| pointwise, and ‖F_t ∘ f‖_∞ ≤ t + ‖f‖_∞.
By dominated convergence, Ft ◦ f → |f | in Lp . Furthermore, for each index k = 1, ..., n,
\[
\partial_k (F_t \circ f) = \frac{f\,\partial_k f}{(f^2 + t^2)^{1/2}}.
\]
Now, if either ∂_k f(x) = 0 or f(x) ≠ 0, then as t → 0, this converges to sgn(f(x))∂_k f(x). So, the only points where convergence can fail to occur are those in
\[
D = \{x \in \mathbb{R}^n : f(x) = 0,\ \nabla f(x) \ne 0\}.
\]
We claim that D has measure zero with respect to Lebesgue measure, and hence also with respect to µ. In
dimension 1, D consists of countably many endpoints of the open intervals where f ≠ 0, so has measure
zero. In higher dimensions, if x ∈ D, the implicit function theorem allows us to solve f (y) = 0 locally for
one of the coordinates of y as a C 1 function of the others, and thereby express D ∩ Ux as the image of a C 1
function from Rn−1 to Rn for some neighborhood Ux of x. The image of such a function has measure zero,
and countably many such Ux cover D. It follows that D has measure zero.
Thus ∂_k(F_t ∘ f) → sgn(f)∂_k f a.e., and we have |∂_k(F_t ∘ f)| ≤ ‖∂_k f‖_∞ a.e., so a second application of dominated convergence yields ∂_k(F_t ∘ f) → sgn(f)∂_k f in Lᵖ. It follows that |f| ∈ W^{p,1}(µ) with |∂_k |f|| = |∂_k f| (in the Sobolev sense).
6.2 Existence of Semigroup
In this appendix, we prove the existence of a symmetric Markov semigroup {Tt } whose infinitesimal generator
is an extension of the operator A defined in (4.1). We shall require the following theorems from functional
analysis:
Theorem 6.4 (Friedrichs Extension). Let S be a densely defined symmetric operator on a Hilbert space X.
If S is non-negative (or non-positive), in the sense that hSx, xi ≥ 0 (resp. hSx, xi ≤ 0) for x in the domain
of S, then there is a unique non-negative (resp. non-positive), self-adjoint extension of S.
Theorem 6.5 (Spectral Theorem). Let S be a bounded self-adjoint operator on a Hilbert space X. Then
there is a measure space (Ω, ν) and a linear isometry U : X → L2 (ν) such that U SU −1 is multiplication Mλ
by some measurable function λ on Ω.
For f, g ∈ L²(µ), write
\[
\langle f, g\rangle_{L^2(\mu)} = \int_{\mathbb{R}^n} f g\, d\mu. \tag{6.1}
\]
The operator A in (4.1) is densely defined on L2 , for its domain clearly contains Cc∞ (Rn ). Symmetry of A
follows by applying Lemma 4.3 twice. Furthermore, A is non-positive, for if f ∈ D(A), then Lemma 4.3
shows that
\[
\langle f, Af\rangle_{L^2(\mu)} = -\|\nabla f\|_2^2 \le 0.
\]
Thus the Friedrichs extension theorem implies that A has a non-positive self-adjoint extension Â.
The spectral theorem implies that there is a measure ν on a set Ω and a linear isometry U : L2 (µ) → L2 (ν)
such that U ÂU −1 is multiplication by some measurable function λ on Ω:
\[
U \hat{A} U^{-1} = M_\lambda.
\]
For t ≥ 0, put
\[
T_t = U^{-1} M_{e^{t\lambda}}\, U : D(\hat{A}) \to L^2(\mu). \tag{6.2}
\]
Proposition 6.6. {Tt } extends to a symmetric contraction semigroup on L2 (µ) with infinitesimal generator
Â.
Proof. From the definition (6.2), it is immediate that T_0 is the identity operator and T_t ∘ T_s = T_{t+s} on D(Â). Furthermore, the mapping t ↦ M_{e^{λt}} g is continuous for fixed g ∈ L²(ν), so since U and its inverse are continuous, so is t ↦ T_t f for each f ∈ L²(µ). Taking the transpose of both sides of (6.2) shows that each T_t is symmetric.
Now we show that each Tt does not increase L2 -norms on D(Â), so extends to a norm-decreasing operator
on all of L2 (µ). Since  is non-positive, we have for each f ∈ D(Â) that
\[
0 \ge \langle \hat{A} f, f\rangle_{L^2(\mu)} = \langle U^{-1} M_\lambda U f, f\rangle_{L^2(\mu)} = \langle \lambda\, U f, U f\rangle_{L^2(\nu)}.
\]
Since D(Â) is dense in L²(µ), U D(Â) is dense in L²(ν). Therefore this inequality implies that λ ≤ 0 a.e. Hence e^{tλ} ≤ 1 a.e., and it follows that
\[
\langle T_t f, T_t f\rangle_{L^2(\mu)} = \langle M_{e^{t\lambda}} U f, M_{e^{t\lambda}} U f\rangle_{L^2(\nu)} \le \langle U f, U f\rangle_{L^2(\nu)} = \langle f, f\rangle_{L^2(\mu)}.
\]
Therefore Tt is norm-decreasing on D(Â). Since D(Â) is dense in L2 (µ), Tt extends to a norm-decreasing
operator on L2 (µ), and by density of D(Â) we still have T0 = Id and Tt ◦ Ts = Tt+s . Thus {Tt } is a
contraction semigroup on L2 (µ).
It remains to check that the infinitesimal generator of this semigroup is Â. Since U is linear and continuous, it commutes with differentiation in t. That is, if f ∈ D(Â), then in the L2 sense,
(d/dt) Tt f = (d/dt) U−1 Metλ U f = U−1 Mλ Metλ U f = U−1 Mλ U U−1 Metλ U f = Â Tt f.
Evaluating at t = 0 shows that (d/dt)|t=0 Tt f = Âf whenever f ∈ D(Â). Conversely, suppose f ∈ L2(µ) and (d/dt)|t=0 Tt f exists in L2(µ). Then for any g ∈ D(Â),
⟨(d/dt)|t=0 Tt f, g⟩L2(µ) = lim t→0 (1/t) ⟨Tt f − f, g⟩L2(µ).
From the formula (6.2), it is clear that Tt is symmetric for each t, so this equals
lim t→0 (1/t) ⟨f, Tt g − g⟩L2(µ) = ⟨f, Âg⟩L2(µ).
Since Â is self-adjoint, it follows that f ∈ D(Â) with Âf = (d/dt)|t=0 Tt f. Therefore the infinitesimal generator of {Tt} is Â.
Of course, in the Gaussian case, our semigroup {Tt } is just the Ornstein-Uhlenbeck semigroup.
To show that {Tt } is a Markov semigroup, we need the following result, which characterizes contraction
and Markov semigroups in terms of their generators:
Theorem 6.7 (Hille-Yosida Theorem for Markov Semigroups). Let S be a closed linear operator defined on
a domain D(S) of a Banach space X. Then S generates a contraction semigroup if and only if
1. D(S) is dense in X;
2. For every λ > 0, λI−S is invertible and the resolvent (λI−S)−1 exists and satisfies k(λI−S)−1 k ≤ 1/λ.
This semigroup is Markov if and only if S(1) = 0 and (λI − S)−1 preserves positivity for all λ > 0.
For a proof, see Ch. 8 of [8].
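The resolvent bound in condition 2 can be seen concretely in finite dimensions, where a symmetric matrix with non-positive spectrum stands in for the generator. The following check is our own illustration of the statement, not part of any proof here.

```python
import numpy as np

# Sanity check (illustrative) of the Hille-Yosida resolvent bound
# ||(lam I - S)^{-1}|| <= 1/lam for symmetric S with non-positive spectrum.
rng = np.random.default_rng(1)
B = rng.standard_normal((6, 6))
S = -(B @ B.T)                              # spectrum in (-inf, 0]

for lam in (0.1, 1.0, 10.0):
    R = np.linalg.inv(lam * np.eye(6) - S)  # resolvent (lam I - S)^{-1}
    # Eigenvalues of R are 1/(lam - mu) with mu <= 0, each at most 1/lam.
    assert np.linalg.norm(R, 2) <= 1.0 / lam + 1e-12
```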
In order to apply the Hille-Yosida theorem to Â, we need to check that Â is closed. Indeed, if (fj ) is a sequence in D(Â), fj → f ∈ L2 (µ), and Âfj → g ∈ L2 (µ), then for any φ ∈ D(Â),
hg, φiL2 (µ) = lim hÂfj , φiL2 (µ) = lim hfj , ÂφiL2 (µ) = hf, ÂφiL2 (µ) .
j→∞
j→∞
Therefore f belongs to the domain of the adjoint of Â. But, Â is self-adjoint, so f ∈ D(Â), and by symmetry
hg, φiL2 (µ) = hÂf, φiL2 (µ) for each φ ∈ D(Â). Since D(Â) is dense in L2 (µ), it follows that Âf = g, so that
 is closed.
From Proposition 6.6, Â generates a contraction semigroup, so by the Hille-Yosida theorem, for each
λ > 0, (λI − Â)−1 exists as a continuous operator on L2 (µ). To show that our semigroup {Tt } is Markov, it
therefore remains to check the last two conditions in Theorem 6.7.
Proposition 6.8. The semigroup {Tt } of Proposition 6.6 is Markov.
Proof. Clearly, Â(1) = 0. First we check that (λI − Â)−1 is positivity preserving on (λI − Â)Cc2 (Rn ). We
must show that if f ∈ Cc2 (Rn ) and
g = (λI − Â)f ≥ 0,
then f ≥ 0. In this case, f attains a minimum at some point x0 ∈ Rn . At this point, ∇f = 0 and ∆f ≥ 0.
We have
0 ≤ (λI − Â)f (x0 ) = (λI − A)f (x0 ) = λf (x0 ) − ∆f (x0 ),
so λf (x0 ) ≥ ∆f (x0 ) ≥ 0, whence f ≥ 0.
In the general case, let g ∈ L2 (µ) be non-negative. We must show that (λI − Â)−1 g ≥ 0. We claim that
Cc2 (Rn ) ⊂ (λI − Â)Cc2 (Rn ).
(6.3)
Given the claim, if g ∈ L2 (µ) is non-negative a.e., then we can select a non-negative sequence (gj ) in Cc2 (Rn ) which converges to g in L2 . Then by the first case and (6.3), (λI − Â)−1 gj is non-negative a.e., and since (λI − Â)−1 is continuous, these functions converge to (λI − Â)−1 g in L2 . Hence (λI − Â)−1 g is non-negative
a.e. Thus, by the Hille-Yosida theorem, the semigroup generated by  is Markov.
It remains to prove (6.3). Consider a fixed compact set K ⊂ Rn which equals the closure of its interior,
and put
L2K (µ) = {f ∈ L2 (µ) : supp(f ) ⊂ K}.
The space L2K (µ) is a closed subspace of L2 (µ), hence is itself a Hilbert space. Furthermore  maps L2K (µ)
to L2K (µ) and the restriction of  to D(Â) ∩ L2K (µ) is symmetric and non-positive. Since D(Â) ∩ L2K (µ)
contains Cc2 (K o ) (where K o is the interior of K), D(Â) ∩ L2K (µ) is dense in L2K (µ). Therefore we can
apply the Friedrichs extension theorem, the spectral theorem, and the Hille-Yosida theorem just as we
did on L2 (µ) to find that (λI − Â)−1 exists as an operator from L2K (µ) to L2K (µ). This resolvent must
necessarily agree with the ordinary resolvent on Cc2 (K o ), so we find that (λI − Â)−1 maps Cc2 (K o ) to L2K (µ)
for each compact K ⊂ Rn . That is, if f ∈ Cc2 (Rn ), then (λI − Â)−1 f is compactly supported. On the
other hand, elliptic regularity (see [14], Ch. 6) implies, in particular, that (λI − Â)−1 f ∈ Cc2 (Rn ). Thus
(λI − Â)−1 Cc2 (Rn ) ⊂ Cc2 (Rn ), and so Cc2 (Rn ) ⊂ (λI − Â)Cc2 (Rn ).
6.3 Applications to Economics
Inequalities like those we study here have applications in a vast array of different fields. One such field is
economics. Many economic phenomena, including fluctuations in stock prices, machine failure rates, and shifts
in unemployment can be modelled using log-concave probability measures. Gaussian measures are ubiquitous
in mathematical modelling of all forms. Other log-concave measures allow for more flexible models which
can better match the data, or more precise calculations with certain statistics. Among these, some of the
most commonly used are the Gamma and Weibull distributions.
Such measures are especially important in finance. In a 1959 paper, M.F.M. Osborne showed that the
logarithms of many stock prices follow a Brownian motion, a stochastic process with independent, Gaussian
increments [28]. Osborne’s discovery sparked widespread interest in Gaussian and other log-concave measures
in finance. The ability to model stock price fluctuations mathematically enables researchers and investors
to quantify the risk associated with investing in the market, and thereby to model investment decisions in a
formal manner. One of the best known applications of this idea is the Black-Scholes equation for the price
of an option. This is a partial differential equation used to optimize pricing and portfolio allocation [6].
Another application of log-concave measures is to reliability functions in industrial engineering. Consider
a machine which has some positive probability of breaking down. A reliability function is a measure µ on
[0, ∞), with the interpretation that the measure of a set E is the probability that the machine breaks down
at a time t ∈ E. Log-concave measures arise naturally in this context. For example, if F (t) = µ[0, t] is the
cumulative distribution function of µ, and f is its density, then the quantity

MRL(x) = ∫x^∞ t f(t) dt / (1 − F(x)) − x
is called the mean residual lifetime function of the machine, and represents the expected time before a machine will break down, given that it has survived to time x. Naturally, one would want MRL(x) to be
decreasing in x, and it turns out that this is the case if and only if the measure µ is log-concave. Many
similar desirable properties of a reliability function are also equivalent to log-concavity [3].
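As a concrete illustration (the distribution and parameters are our own choices, not taken from [3]), the exponential distribution is log-concave, and by memorylessness its mean residual lifetime is constant, hence weakly decreasing, exactly as the equivalence above predicts:

```python
import numpy as np

# For the Exp(r) distribution, a log-concave reliability measure,
# MRL(x) = \int_x^inf t f(t) dt / (1 - F(x)) - x equals 1/r for every x.
r = 2.0

def mrl(x, t_max=50.0, n=200_000):
    """Approximate MRL(x) by trapezoidal quadrature on [x, t_max]."""
    t = np.linspace(x, t_max, n)
    integrand = t * r * np.exp(-r * t)   # t f(t) for the Exp(r) density
    integral = np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(t))
    survival = np.exp(-r * x)            # 1 - F(x)
    return integral / survival - x

values = [mrl(x) for x in (0.0, 0.5, 1.0, 2.0)]
assert all(abs(v - 1.0 / r) < 1e-3 for v in values)   # MRL is constant = 1/r
```

For a strictly decreasing MRL one could repeat the computation with, say, a Gamma density with shape parameter greater than 1.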
In order to use models like these, one must understand the behavior of the measures on which they rely.
Of particular interest are the variance of a random variable—the average squared distance from its mean—and its
entropy—a measure of the uncertainty in its value. If f is a function on Rn representing a random variable,
then its variance with respect to a measure µ is given by

∫Rn f 2 dµ − (∫Rn f dµ)2
and its entropy is given by

∫Rn f 2 log |f| dµ − ∫Rn f 2 dµ log (∫Rn f 2 dµ)1/2.

The Poincaré and logarithmic Sobolev inequalities bound these two quantities, respectively, in terms of ∫Rn |∇f|2 dµ, the square of the L2 norm of the gradient of f.
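Both bounds are easy to witness numerically. The sketch below is our own Monte Carlo illustration for the one-dimensional standard Gaussian measure, using the Gaussian constants (1 for Poincaré, 2 for the logarithmic Sobolev inequality) and a test function of our choosing:

```python
import numpy as np

# Monte Carlo check of Var(f) <= int |f'|^2 dmu (Poincare) and
# Ent(f^2) <= 2 int |f'|^2 dmu (log-Sobolev) for standard Gaussian mu on R.
rng = np.random.default_rng(2)
x = rng.standard_normal(1_000_000)     # samples from mu

f = np.sin(x)                          # test function
df = np.cos(x)                         # its derivative

energy = np.mean(df**2)                          # int |f'|^2 dmu
var_f = np.mean(f**2) - np.mean(f)**2            # variance of f
f2 = f**2
ent_f2 = np.mean(f2 * np.log(f2 + 1e-300)) - np.mean(f2) * np.log(np.mean(f2))

assert var_f <= energy                 # Poincare inequality
assert ent_f2 <= 2.0 * energy          # logarithmic Sobolev inequality
```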
These two inequalities are of much use in economic models which use Gaussian or other log-concave
measures. For example, they can be used to estimate the total long-term variance of a stock price about
its mean, or the expected amount of time before the process modelled by a log-concave reliability function
breaks down. These inequalities are also used in quantitative estimates for the variance and entropy of prices,
cash flows, inflation and other processes which follow a log-concave distribution [12].
Furthermore, the logarithmic Sobolev inequality can be used to prove concentration of measure inequalities [22], which bound the probability that a 1-Lipschitz random variable f deviates from its mean by at
least t > 0:
µ{x ∈ Rn : |f(x) − ∫Rn f dµ| ≥ t} ≤ φ(t)
for some rapidly decaying function φ. Given a concentration of measure inequality for µ, one can often obtain
a similar or even sharper inequality for the n-fold product measure µn . As such, concentration of measure
inequalities are indispensable in studying measures on high- or infinite-dimensional spaces. In economics, as
in many other fields, large data sets are often represented as vectors in which each observation corresponds
to a coordinate. A statistic of interest, e.g. the sample mean, is a function on the many-dimensional set
of possible vectors of observations. Concentration of measure inequalities estimate how close the sample
statistic is likely to be to its population counterpart (typically equal to its expected value). Moreover,
concentration of measure inequalities can be used to deduce generalized central limit theorems (as in [20]),
which are also useful in economics and statistics for estimating large sample probabilities and proving the
convergence of estimators.
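A concentration bound of this type is straightforward to observe empirically. The following is our own Monte Carlo illustration with the classical Gaussian choice φ(t) = 2 exp(−t²/2), which follows from the logarithmic Sobolev inequality, applied to a particular 1-Lipschitz function:

```python
import numpy as np

# Monte Carlo check of Gaussian concentration for the 1-Lipschitz function
# f(x) = |x| (Euclidean norm) on R^5, against phi(t) = 2 exp(-t^2 / 2).
rng = np.random.default_rng(3)
X = rng.standard_normal((1_000_000, 5))
f = np.linalg.norm(X, axis=1)          # 1-Lipschitz function of x
mean_f = f.mean()

for t in (0.5, 1.0, 2.0):
    empirical = np.mean(np.abs(f - mean_f) >= t)
    assert empirical <= 2.0 * np.exp(-t**2 / 2.0)
```

Note that the bound is dimension-free: the same φ works in R^5 as in R, which is exactly the high-dimensional usefulness described above.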
The intermediate Beckner inequality and its generalization for q > 2, which we prove in Sections 3 and 4, have indirect applications in that they can be used to prove versions of the Poincaré and logarithmic Sobolev inequalities, as we do in Subsection 5.1. They also have more direct uses. The quantity
∫Rn f 2 dµ − (∫Rn |f|p dµ)2/p
appearing on the left side of Beckner’s inequality is called the p-variance of f , and represents the dispersion
of f about its Lp norm in much the same sense that the usual variance represents the dispersion of f about
its mean. Beckner’s inequality might be used to estimate the rate of inflation or value of an investment
in terms of its Lp norms, instead of its mean. Such estimates might be useful, for example, if one seeks
an intermediate measure of the “average” of such a quantity, between the more commonly used mean and
L2 norm. Indeed, Lp norms for general p ≠ 2 arise naturally in many economic applications, such as the
measurement of economic welfare [25]. As we show in Section 2, these inequalities possess a tensorisation
property similar to concentration of measure inequalities, which makes them useful for studying large data
sets. Furthermore, as with the logarithmic Sobolev inequality, Beckner’s inequality has been applied to prove
concentration of measure inequalities, e.g. in [21]. Generalizations of this inequality like the ones we obtain
here are likely to be used for this purpose in the future.
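The p-variance is also simple to estimate numerically. The sketch below is our own Monte Carlo illustration of the Gaussian Beckner inequality, with the constant 2 − p of the classical statement and a positive test function of our choosing:

```python
import numpy as np

# Monte Carlo check of Beckner's inequality for standard Gaussian mu on R:
# int f^2 dmu - (int |f|^p dmu)^{2/p} <= (2 - p) int |f'|^2 dmu, 1 <= p <= 2.
rng = np.random.default_rng(4)
x = rng.standard_normal(2_000_000)

f = 1.0 + 0.5 * np.sin(x)              # positive test function
df = 0.5 * np.cos(x)                   # its derivative
energy = np.mean(df**2)                # int |f'|^2 dmu

for p in (1.0, 1.4, 1.8):
    p_var = np.mean(f**2) - np.mean(np.abs(f)**p) ** (2.0 / p)
    assert p_var <= (2.0 - p) * energy + 1e-3   # tolerance for sampling noise
```

At p = 1 (for positive f) the assertion reduces to the Poincaré inequality, and as p → 2 both sides tend to zero, in keeping with the interpolation described in the text.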
Sharper and more general inequalities, as well as new proofs of existing inequalities, lead to a better
understanding of the measures in question. This in turn leads to more precise estimates and thereby more
accurate models and better statistical tests. So, all of our results here have potential uses in the economic
scenarios discussed above. For example, our extended Beckner inequalities might someday be used to prove
new concentration of measure inequalities (and thereby lead to better stochastic models) or to new methods
of estimating of the rates of convergence of stock prices or the lifetimes of industrial processes. Likewise, the
sharpened Beckner and Poincaré inequalities for log-concave measures of Section 5 could be used to obtain
sharper estimates for variance and entropy of such processes.
Discrete analogues of Beckner, Poincaré, and log-Sobolev inequalities are also of economic interest. These
inequalities play a similar role to their continuous counterparts in models which rely on discrete random
variables, such as Bernoulli trials or discrete random walks. Such models include those for market entry,
demand for relatively small quantities of goods, and a plethora of scenarios in game theory [12]. Consequently,
the inequality for Bernoulli trials we derive in Subsection 2.2 and the tensorial property of Subsection 2.1,
which allows it to be extended to higher dimensions, may have economic applications in their own right as
well.
Furthermore, the Ornstein-Uhlenbeck operator and its associated semigroup, which we study in Subsections 3.1 and 3.2, are often used in stochastic models for interest rates and commodity prices. For example, S. Rampertshammer [29] has developed a model for assessing the value of pairs trading based on the Ornstein-Uhlenbeck operator. Pairs trading is an investment strategy whereby one simultaneously purchases shares in an asset which is below its normal historical price and short-sells a second asset which is above its normal
historical price. The idea is that the prices are likely to return to their historical mean values, and, even if
they don’t, the investor will not lose money if the market as a whole improves or worsens. Rampertshammer
uses the Ornstein-Uhlenbeck process (the stochastic process generated by the Ornstein-Uhlenbeck operator)
to model the likelihood that two assets will rise or fall in price at the same time, and thereby to determine
the optimal portfolio allocation in a pair trade. Plausibly, the generalization of the Ornstein-Uhlenbeck operator and semigroup to general log-concave probability measures which we study in Sections 4 and 5 could
be used in similar models with different log-concave probability measures in place of the Gaussian measure.
Inequalities for the Ornstein-Uhlenbeck semigroup and its generalization to log-concave probability measures
like the ones we derive here could be used to bound the variance, p-variance, and entropy of the prices and
estimators involved in these models.
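The mean-reverting behavior that pairs-trading models exploit is easy to simulate. The sketch below is our own illustration of the Ornstein-Uhlenbeck process; the parameter values are arbitrary and not taken from [29]:

```python
import numpy as np

# Simulate the Ornstein-Uhlenbeck process dX_t = -theta X_t dt + sigma dW_t
# using its exact one-step transition (a Gaussian AR(1) recursion).
rng = np.random.default_rng(5)
theta, sigma, dt, n_steps, n_paths = 1.0, 0.5, 0.01, 2000, 20_000

x = np.full(n_paths, 3.0)                          # start far from the mean 0
a = np.exp(-theta * dt)                            # exact one-step decay
s = sigma * np.sqrt((1.0 - a**2) / (2.0 * theta))  # exact one-step noise scale
for _ in range(n_steps):
    x = a * x + s * rng.standard_normal(n_paths)

# After t = 20 mean-reversion times, the law is essentially the stationary
# Gaussian N(0, sigma^2 / (2 theta)); here the stationary variance is 0.125.
assert abs(x.mean()) < 0.02
assert abs(x.var() - sigma**2 / (2.0 * theta)) < 0.01
```

The stationary law N(0, σ²/(2θ)) is precisely the Gaussian measure for which the semigroup inequalities of Section 3 hold, which is why those inequalities control the long-run variance and entropy in such models.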
References
[1] A. Arnold, J. Bartier, and J. Dolbeault. Interpolation between logarithmic Sobolev and Poincaré inequalities. Commun. Math. Sci., 5, No. 4 (2007), 971-979.
[2] A. Arnold, P. Markowich, G. Toscani, and A. Unterreiter. On logarithmic Sobolev inequalities and
the rate of convergence to equilibrium for Fokker-Planck type equations. Comm. Partial Differential
Equations, 26 No. 1-2 (2001), 43-100.
[3] M. Bagnoli and T.C. Bergstrom. Log-concave Probability and Its Applications. UC Santa Barbara Postprints, 2005.
[4] F. Barthe and C. Roberto. Sobolev inequalities for probability measures on the real line. Studia Math.
159 (2003), 481-497.
[5] W. Beckner. A Generalized Poincaré Inequality for Gaussian Measures. Proceedings of the American
Mathematical Society 105, No. 2 (1989), 397-400.
[6] F. Black and M. Scholes. The Pricing of Options and Corporate Liabilities. Journal of Political Economy,
18 No. 3 (1974), 637-654.
[7] S.G. Bobkov and M. Ledoux. From Brunn-Minkowski to Brascamp-Lieb and to logarithmic Sobolev
inequalities. Geometric And Functional Analysis, 10 (2000), 1028-1052.
[8] A. Bobrowski. Functional Analysis for Probability and Stochastic Processes: An Introduction. Cambridge
University Press (2005).
46
[9] V. I. Bogachev. Gaussian Measures. American Mathematical Society (1998).
[10] H.J. Brascamp and E.H. Lieb. On Extensions of the Brunn-Minkowski and Prékopa-Leindler Theorems,
Including Inequalities for Log Concave Functions, with an Application to the Diffusion Equation. Journal
of Functional Analysis 22 (1976), 366-389.
[11] D. Chafaï. Entropies, convexity, and functional inequalities: On Phi-entropies and Phi-Sobolev inequalities. J. Math. Kyoto Univ., 44 No. 2 (2004), 325-363.
[12] J. Dupačová, J. Hurt, and J. Štěpán. Stochastic Modeling in Economics and Finance. Springer, 2002.
[13] R. Durrett. Probability: Theory and Examples. 4th ed., Cambridge University Press, 2010.
[14] L.C. Evans. Partial Differential Equations. American Mathematical Society, 1998.
[15] B. Franchi, S. Gallot and R. Wheeden. Sobolev and isoperimetric inequalities for degenerate metrics.
Math. Ann. 300 (1994), 557-571.
[16] L. Gross. Logarithmic Sobolev inequalities. Amer. J. Math. (1975), 1061-1083.
[17] E. Gwynne and E. Hsu. On Beckner’s Inequality for Gaussian Measures. Elemente der Mathematik.
Submitted.
[18] B. Helffer. Remarks on Decay of Correlations and Witten Laplacians: Brascamp-Lieb Inequalities and Semiclassical Limit. Journal of Functional Analysis 155 (1998), 571-586.
[19] E.P. Hsu and S.R.S. Varadhan. Probability Theory and its Applications. American Mathematical Society,
1999.
[20] B. Klartag. A central limit theorem for convex sets. Inventiones Mathematicae 168 no. 1 (2007), 91-131.
[21] R. Latala and K. Oleszkiewicz. Between Sobolev and Poincaré. Geometric Aspects of Functional Analysis, Lecture Notes in Math. no. 1745 (2000), Springer, 147-168.
[22] M. Ledoux. The Geometry of Markov Diffusion Generators. Ann. Fac. Sci. Toulouse, IX (2000), 305-366.
[23] M. Ledoux. Concentration of measure and logarithmic Sobolev inequalities. Séminaire de Probabilités
XXXIII, Springer Lecture Notes in Math. (1997).
[24] S. Levy. Flavors of Geometry. Cambridge University Press, 1997.
[25] T. Mitra and E.A. Ok. Majorization by Lp -Norms. New York University preprint, 2001. Available at
https://files.nyu.edu/eo1/public/Papers-PDF/Major.pdf.
[26] E. Nelson. The free Markoff field. J. Funct. Anal. 12 (1973), 211-227.
[27] J. Neveu. Sur l’espérance conditionnelle par rapport à un mouvement brownien. Ann. Inst. Henri Poincaré, B, 12 (1976), 105-110.
[28] M.F.M. Osborne. Brownian Motion in the Stock Market. U.S. Naval Research Laboratory, 1959.
[29] S. Rampertshammer. An Ornstein-Uhlenbeck Framework for Pairs Trading. Preprint. Available at
http://www.ms.unimelb.edu.au/publications/RampertshammerStefan.pdf.
[30] F.Y. Wang. A Generalization of Poincaré and Log-Sobolev Inequalities. Potential Analysis, 22 No. 1
(2005), 1-15.
[31] M.H. Ye. Applications of Brownian motion to economic models of optimal stopping. University of
Wisconsin–Madison, 1984.