
Journal of Inequalities in Pure and
Applied Mathematics
http://jipam.vu.edu.au/
Volume 2, Issue 1, Article 11, 2001
ON SOME APPLICATIONS OF THE AG INEQUALITY IN INFORMATION
THEORY
B. MOND AND J. PEČARIĆ
SCHOOL OF MATHEMATICS, LA TROBE UNIVERSITY, BUNDOORA 3086, VICTORIA, AUSTRALIA.
b.mond@latrobe.edu.au
FACULTY OF TEXTILE TECHNOLOGY, UNIVERSITY OF ZAGREB, ZAGREB, CROATIA
pecaric@hazu.hr
Received 3 November, 2000; accepted 11 January, 2001
Communicated by A. Lupas
ABSTRACT. Recently, S.S. Dragomir used the concavity property of the log mapping and the weighted arithmetic mean-geometric mean inequality to develop new inequalities that were then applied to Information Theory. Here we extend these inequalities and their applications.
Key words and phrases: Arithmetic-Geometric Mean, Kullback-Leibler Distances, Shannon’s Entropy.
2000 Mathematics Subject Classification. 26D15.
1. INTRODUCTION
One of the most important inequalities is the arithmetic-geometric means inequality:

Let $a_i, p_i$, $i = 1, \dots, n$, be positive numbers and $P_n = \sum_{i=1}^{n} p_i$. Then

(1.1)    \prod_{i=1}^{n} a_i^{p_i/P_n} \le \frac{1}{P_n} \sum_{i=1}^{n} p_i a_i,

with equality iff $a_1 = \cdots = a_n$.
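A quick numerical sanity check of (1.1) is easy to run. The following Python sketch is our own illustration and not part of the original argument; the function name, sampling ranges and tolerance are arbitrary choices.

import random

def ag_check(a, p, tol=1e-12):
    # Weighted AG inequality (1.1): prod a_i^(p_i/P_n) <= (1/P_n) * sum p_i a_i
    P = sum(p)
    geometric = 1.0
    for ai, pi in zip(a, p):
        geometric *= ai ** (pi / P)
    arithmetic = sum(pi * ai for ai, pi in zip(a, p)) / P
    return geometric <= arithmetic + tol

random.seed(0)
for _ in range(1000):
    n = random.randint(2, 10)
    a = [random.uniform(0.1, 10.0) for _ in range(n)]
    p = [random.uniform(0.1, 5.0) for _ in range(n)]
    assert ag_check(a, p)
print("AG inequality (1.1) holds on all sampled cases")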
It is well known that using (1.1) we can prove the following generalization of another well-known inequality, namely Hölder's inequality:

Let $p_{ij}, q_i$ ($i = 1, \dots, m$; $j = 1, \dots, n$) be positive numbers with $Q_m = \sum_{i=1}^{m} q_i$. Then

(1.2)    \sum_{j=1}^{n} \prod_{i=1}^{m} p_{ij}^{q_i/Q_m} \le \prod_{i=1}^{m} \left( \sum_{j=1}^{n} p_{ij} \right)^{q_i/Q_m}.
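Inequality (1.2) can be checked numerically in the same spirit. The sketch below is again our own illustration (names and ranges are arbitrary); it samples random positive matrices $p_{ij}$ and weights $q_i$.

import random

def holder_check(P, q, tol=1e-12):
    # P: m x n matrix of positive numbers p_ij, q: list of m positive weights q_i
    m, n = len(P), len(P[0])
    Qm = sum(q)
    lhs = 0.0
    for j in range(n):
        term = 1.0
        for i in range(m):
            term *= P[i][j] ** (q[i] / Qm)
        lhs += term
    rhs = 1.0
    for i in range(m):
        rhs *= sum(P[i]) ** (q[i] / Qm)
    return lhs <= rhs + tol

random.seed(1)
for _ in range(500):
    m, n = random.randint(2, 5), random.randint(2, 5)
    P = [[random.uniform(0.1, 10.0) for _ in range(n)] for _ in range(m)]
    q = [random.uniform(0.1, 5.0) for _ in range(m)]
    assert holder_check(P, q)
print("Generalized Holder inequality (1.2) holds on all sampled cases")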
In this note, we show that using (1.1) we can improve some recent results which have applications in information theory.
ISSN (electronic): 1443-5756
© 2001 Victoria University. All rights reserved.
2. AN INEQUALITY OF I.A. ABOU-TAIR AND W.T. SULAIMAN
The main result from [1] is:
Let $p_{ij}, q_i$ ($i = 1, \dots, m$; $j = 1, \dots, n$) be positive numbers. Then

(2.1)    \sum_{j=1}^{n} \prod_{i=1}^{m} p_{ij}^{q_i/Q_m} \le \frac{1}{Q_m} \sum_{i=1}^{m} \sum_{j=1}^{n} p_{ij} q_i.
Moreover, set in (1.1) $n = m$, $p_i = q_i$, $a_i = \sum_{j=1}^{n} p_{ij}$. We now have

(2.2)    \prod_{i=1}^{m} \left( \sum_{j=1}^{n} p_{ij} \right)^{q_i/Q_m} \le \frac{1}{Q_m} \sum_{i=1}^{m} \sum_{j=1}^{n} p_{ij} q_i.
Now (1.2) and (2.2) give

(2.3)    \sum_{j=1}^{n} \prod_{i=1}^{m} p_{ij}^{q_i/Q_m} \le \prod_{i=1}^{m} \left( \sum_{j=1}^{n} p_{ij} \right)^{q_i/Q_m} \le \frac{1}{Q_m} \sum_{i=1}^{m} \sum_{j=1}^{n} p_{ij} q_i,
which is an interpolation of (2.1). Moreover, the generalized Hölder inequality was obtained in
[1] as a consequence of (2.1). This is not surprising since (2.1), for n = 1, becomes
\prod_{i=1}^{m} p_{i1}^{q_i/Q_m} \le \frac{1}{Q_m} \sum_{i=1}^{m} p_{i1} q_i,

which is, in fact, the A-G inequality (1.1) (set $m = n$, $p_{i1} = a_i$ and $q_i = p_i$). Theorem 3.1 in [1] is the well-known Shannon inequality:

Given $\sum_{i=1}^{n} a_i = a$ and $\sum_{i=1}^{n} b_i = b$, then

a \ln \frac{a}{b} \le \sum_{i=1}^{n} a_i \ln \frac{a_i}{b_i}; \qquad a_i, b_i > 0.
It was obtained from (2.1) through the special case

(2.4)    \prod_{i=1}^{n} \left( \frac{b_i}{a_i} \right)^{a_i/a} \le \frac{b}{a}.
Let us note that (2.4) is again a direct consequence of the A-G inequality. Indeed, setting $a_i \to b_i/a_i$, $p_i \to a_i$, $i = 1, \dots, n$, in (1.1), we have (2.4). Theorem 3.2 from [1] is Rényi's inequality:

Given $\sum_{i=1}^{m} a_i = a$ and $\sum_{i=1}^{m} b_i = b$, then for $\alpha > 0$, $\alpha \ne 1$,

\frac{1}{\alpha - 1} \left( a^{\alpha} b^{1-\alpha} - a \right) \le \frac{1}{\alpha - 1} \sum_{i=1}^{m} \left( a_i^{\alpha} b_i^{1-\alpha} - a_i \right); \qquad a_i, b_i \ge 0.
In fact, in the proof given in [1], it was proved that Hölder’s inequality is a consequence of (2.1).
As we have noted, Hölder’s inequality is also a consequence of the A-G inequality.
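As a numerical illustration of the Shannon inequality recalled above (our own addition, not from [1] or [2]; names and sampling ranges are arbitrary), the following Python sketch verifies $a \ln(a/b) \le \sum a_i \ln(a_i/b_i)$ on random positive vectors.

import math, random

def shannon_ineq_check(a, b, tol=1e-12):
    # a*ln(a/b) <= sum a_i*ln(a_i/b_i) with a = sum(a_i), b = sum(b_i)
    A, B = sum(a), sum(b)
    lhs = A * math.log(A / B)
    rhs = sum(ai * math.log(ai / bi) for ai, bi in zip(a, b))
    return lhs <= rhs + tol

random.seed(2)
for _ in range(1000):
    n = random.randint(2, 8)
    a = [random.uniform(0.1, 5.0) for _ in range(n)]
    b = [random.uniform(0.1, 5.0) for _ in range(n)]
    assert shannon_ineq_check(a, b)
print("Shannon inequality holds on all sampled cases")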
3. ON SOME INEQUALITIES OF S.S. DRAGOMIR
The following theorems were proved in [2]:
Theorem 3.1. Let $a_i \in (0,1)$ and $b_i > 0$ ($i = 1, \dots, n$). If $p_i > 0$ ($i = 1, \dots, n$) is such that $\sum_{i=1}^{n} p_i = 1$, then

(3.1)    \exp\left[ \sum_{i=1}^{n} p_i \frac{a_i^2}{b_i} - \sum_{i=1}^{n} p_i a_i \right]
         \ge \exp\left[ \sum_{i=1}^{n} p_i \left( \frac{a_i}{b_i} \right)^{a_i} - 1 \right]
         \ge \prod_{i=1}^{n} \left( \frac{a_i}{b_i} \right)^{a_i p_i}
         \ge \exp\left[ 1 - \sum_{i=1}^{n} p_i \left( \frac{b_i}{a_i} \right)^{a_i} \right]
         \ge \exp\left[ \sum_{i=1}^{n} p_i a_i - \sum_{i=1}^{n} p_i b_i \right],

with equality iff $a_i = b_i$ for all $i \in \{1, \dots, n\}$.
Theorem 3.2. Let $a_i \in (0,1)$ ($i = 1, \dots, n$) and $b_j > 0$ ($j = 1, \dots, m$). If $p_i > 0$ ($i = 1, \dots, n$) is such that $\sum_{i=1}^{n} p_i = 1$ and $q_j > 0$ ($j = 1, \dots, m$) is such that $\sum_{j=1}^{m} q_j = 1$, then we have the inequality

(3.2)    \exp\left[ \sum_{i=1}^{n} p_i a_i^2 \sum_{j=1}^{m} \frac{q_j}{b_j} - \sum_{i=1}^{n} p_i a_i \right]
         \ge \exp\left[ \sum_{i=1}^{n} \sum_{j=1}^{m} p_i q_j \left( \frac{a_i}{b_j} \right)^{a_i} - 1 \right]
         \ge \frac{\prod_{i=1}^{n} a_i^{a_i p_i}}{\left( \prod_{j=1}^{m} b_j^{q_j} \right)^{\sum_{i=1}^{n} p_i a_i}}
         \ge \exp\left[ 1 - \sum_{i=1}^{n} \sum_{j=1}^{m} p_i q_j \left( \frac{b_j}{a_i} \right)^{a_i} \right]
         \ge \exp\left[ \sum_{i=1}^{n} p_i a_i - \sum_{j=1}^{m} q_j b_j \right].

The equality holds in (3.2) iff $a_1 = \cdots = a_n = b_1 = \cdots = b_m$.
First we give an improvement of the second and third inequality in (3.1).
Theorem 3.3. Let $a_i, b_i$ and $p_i$ ($i = 1, \dots, n$) be positive real numbers with $\sum_{i=1}^{n} p_i = 1$. Then

(3.3)    \exp\left[ \sum_{i=1}^{n} p_i \left( \frac{a_i}{b_i} \right)^{a_i} - 1 \right]
         \ge \sum_{i=1}^{n} p_i \left( \frac{a_i}{b_i} \right)^{a_i}
         \ge \prod_{i=1}^{n} \left( \frac{a_i}{b_i} \right)^{p_i a_i}
         \ge \left[ \sum_{i=1}^{n} p_i \left( \frac{b_i}{a_i} \right)^{a_i} \right]^{-1}
         \ge \exp\left[ 1 - \sum_{i=1}^{n} p_i \left( \frac{b_i}{a_i} \right)^{a_i} \right],

with equality iff $a_i = b_i$, $i = 1, \dots, n$.
Proof. The first inequality in (3.3) is a simple consequence of the following well-known elementary inequality:

(3.4)    e^{x-1} \ge x \quad \text{for all } x \in \mathbb{R},

with equality iff $x = 1$. The second inequality is a simple consequence of the A-G inequality; that is, in (1.1), set $a_i \to (a_i/b_i)^{a_i}$, $i = 1, \dots, n$. The third inequality is again a consequence of (1.1). Namely, for $a_i \to (b_i/a_i)^{a_i}$, $i = 1, \dots, n$, (1.1) becomes

\prod_{i=1}^{n} \left( \frac{b_i}{a_i} \right)^{a_i p_i} \le \sum_{i=1}^{n} p_i \left( \frac{b_i}{a_i} \right)^{a_i},

which is equivalent to the third inequality. The last inequality is again a consequence of (3.4).
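The five members of the chain (3.3) can also be compared numerically. The Python sketch below is our own illustration (function name, sampling ranges and tolerance are arbitrary); it evaluates the terms left to right on random positive data and checks that they decrease.

import math, random

def chain_33(a, b, p):
    # Members of (3.3), left to right, for positive a_i, b_i and weights p_i summing to 1
    s_ab = sum(pi * (ai / bi) ** ai for ai, bi, pi in zip(a, b, p))
    s_ba = sum(pi * (bi / ai) ** ai for ai, bi, pi in zip(a, b, p))
    prod = 1.0
    for ai, bi, pi in zip(a, b, p):
        prod *= (ai / bi) ** (pi * ai)
    return [math.exp(s_ab - 1), s_ab, prod, 1.0 / s_ba, math.exp(1 - s_ba)]

random.seed(3)
for _ in range(1000):
    n = random.randint(2, 6)
    a = [random.uniform(0.2, 2.0) for _ in range(n)]
    b = [random.uniform(0.5, 2.5) for _ in range(n)]
    w = [random.uniform(0.1, 1.0) for _ in range(n)]
    total = sum(w)
    p = [wi / total for wi in w]
    t = chain_33(a, b, p)
    assert all(t[k] + 1e-9 >= t[k + 1] for k in range(len(t) - 1))
print("Chain (3.3) holds on all sampled cases")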
Theorem 3.4. Let $a_i \in (0,1)$ and $b_i > 0$ ($i = 1, \dots, n$). If $p_i > 0$, $i = 1, \dots, n$, is such that $\sum_{i=1}^{n} p_i = 1$, then

(3.5)    \exp\left[ \sum_{i=1}^{n} p_i \frac{a_i^2}{b_i} - \sum_{i=1}^{n} p_i a_i \right]
         \ge \exp\left[ \sum_{i=1}^{n} p_i \left( \frac{a_i}{b_i} \right)^{a_i} - 1 \right]
         \ge \sum_{i=1}^{n} p_i \left( \frac{a_i}{b_i} \right)^{a_i}
         \ge \prod_{i=1}^{n} \left( \frac{a_i}{b_i} \right)^{p_i a_i}
         \ge \left[ \sum_{i=1}^{n} p_i \left( \frac{b_i}{a_i} \right)^{a_i} \right]^{-1}
         \ge \exp\left[ 1 - \sum_{i=1}^{n} p_i \left( \frac{b_i}{a_i} \right)^{a_i} \right]
         \ge \exp\left[ \sum_{i=1}^{n} p_i a_i - \sum_{i=1}^{n} p_i b_i \right],

with equality iff $a_i = b_i$ for all $i = 1, \dots, n$.
Proof. The theorem follows from Theorems 3.1 and 3.3.
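The seven-member chain (3.5) can be verified numerically in the same way. In the sketch below (our own illustration; names and ranges are arbitrary) the $a_i$ are sampled from $(0,1)$ as the theorem requires.

import math, random

def chain_35(a, b, p):
    # Members of (3.5), left to right, for a_i in (0,1), b_i > 0, p_i a probability vector
    s0 = sum(pi * (ai * ai / bi - ai) for ai, bi, pi in zip(a, b, p))
    s_ab = sum(pi * (ai / bi) ** ai for ai, bi, pi in zip(a, b, p))
    s_ba = sum(pi * (bi / ai) ** ai for ai, bi, pi in zip(a, b, p))
    prod = 1.0
    for ai, bi, pi in zip(a, b, p):
        prod *= (ai / bi) ** (pi * ai)
    s_end = sum(pi * (ai - bi) for ai, bi, pi in zip(a, b, p))
    return [math.exp(s0), math.exp(s_ab - 1), s_ab, prod,
            1.0 / s_ba, math.exp(1 - s_ba), math.exp(s_end)]

random.seed(4)
for _ in range(1000):
    n = random.randint(2, 6)
    a = [random.uniform(0.05, 0.95) for _ in range(n)]
    b = [random.uniform(0.1, 3.0) for _ in range(n)]
    w = [random.uniform(0.1, 1.0) for _ in range(n)]
    total = sum(w)
    p = [wi / total for wi in w]
    t = chain_35(a, b, p)
    assert all(t[k] + 1e-9 >= t[k + 1] for k in range(len(t) - 1))
print("Chain (3.5) holds on all sampled cases")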
Theorem 3.5. Let $a_i, p_i$ ($i = 1, \dots, n$) and $b_j, q_j$ ($j = 1, \dots, m$) be positive numbers with $\sum_{i=1}^{n} p_i = \sum_{j=1}^{m} q_j = 1$. Then

(3.6)    \exp\left[ \sum_{i=1}^{n} \sum_{j=1}^{m} p_i q_j \left( \frac{a_i}{b_j} \right)^{a_i} - 1 \right]
         \ge \sum_{i=1}^{n} \sum_{j=1}^{m} p_i q_j \left( \frac{a_i}{b_j} \right)^{a_i}
         \ge \frac{\prod_{i=1}^{n} a_i^{a_i p_i}}{\left( \prod_{j=1}^{m} b_j^{q_j} \right)^{\sum_{i=1}^{n} p_i a_i}}
         \ge \left[ \sum_{i=1}^{n} \sum_{j=1}^{m} p_i q_j \left( \frac{b_j}{a_i} \right)^{a_i} \right]^{-1}
         \ge \exp\left[ 1 - \sum_{i=1}^{n} \sum_{j=1}^{m} p_i q_j \left( \frac{b_j}{a_i} \right)^{a_i} \right].

Equality in (3.6) holds iff $a_1 = \cdots = a_n = b_1 = \cdots = b_m$.
Proof. The first and the last inequalities are simple consequences of (3.4). The second is also a simple consequence of the A-G inequality. Namely, we have

\frac{\prod_{i=1}^{n} a_i^{a_i p_i}}{\left( \prod_{j=1}^{m} b_j^{q_j} \right)^{\sum_{i=1}^{n} p_i a_i}} = \prod_{i=1}^{n} \prod_{j=1}^{m} \left( \frac{a_i}{b_j} \right)^{a_i p_i q_j} \le \sum_{i=1}^{n} \sum_{j=1}^{m} p_i q_j \left( \frac{a_i}{b_j} \right)^{a_i},

which is the second inequality in (3.6). By the A-G inequality, we have

\prod_{i=1}^{n} \prod_{j=1}^{m} \left( \frac{b_j}{a_i} \right)^{a_i p_i q_j} \le \sum_{i=1}^{n} \sum_{j=1}^{m} p_i q_j \left( \frac{b_j}{a_i} \right)^{a_i},

which gives the third inequality in (3.6).
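A numerical check of the two-index chain (3.6) follows the same pattern; the sketch below is our own illustration with arbitrarily chosen sampling ranges.

import math, random

def chain_36(a, b, p, q):
    # Members of (3.6), left to right
    s_ab = sum(pi * qj * (ai / bj) ** ai for ai, pi in zip(a, p) for bj, qj in zip(b, q))
    s_ba = sum(pi * qj * (bj / ai) ** ai for ai, pi in zip(a, p) for bj, qj in zip(b, q))
    num = 1.0
    for ai, pi in zip(a, p):
        num *= ai ** (ai * pi)
    den = 1.0
    for bj, qj in zip(b, q):
        den *= bj ** qj
    middle = num / den ** sum(pi * ai for ai, pi in zip(a, p))
    return [math.exp(s_ab - 1), s_ab, middle, 1.0 / s_ba, math.exp(1 - s_ba)]

random.seed(5)
for _ in range(500):
    n, m = random.randint(2, 5), random.randint(2, 5)
    a = [random.uniform(0.1, 1.5) for _ in range(n)]
    b = [random.uniform(0.3, 3.0) for _ in range(m)]
    wp = [random.uniform(0.1, 1.0) for _ in range(n)]
    wq = [random.uniform(0.1, 1.0) for _ in range(m)]
    p = [w / sum(wp) for w in wp]
    q = [w / sum(wq) for w in wq]
    t = chain_36(a, b, p, q)
    assert all(t[k] + 1e-9 >= t[k + 1] for k in range(len(t) - 1))
print("Chain (3.6) holds on all sampled cases")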
Theorem 3.6. Let the assumptions of Theorem 3.2 be satisfied. Then

(3.7)    \exp\left[ \sum_{i=1}^{n} p_i a_i^2 \sum_{j=1}^{m} \frac{q_j}{b_j} - \sum_{i=1}^{n} p_i a_i \right]
         \ge \exp\left[ \sum_{i=1}^{n} \sum_{j=1}^{m} p_i q_j \left( \frac{a_i}{b_j} \right)^{a_i} - 1 \right]
         \ge \sum_{i=1}^{n} \sum_{j=1}^{m} p_i q_j \left( \frac{a_i}{b_j} \right)^{a_i}
         \ge \frac{\prod_{i=1}^{n} a_i^{a_i p_i}}{\left( \prod_{j=1}^{m} b_j^{q_j} \right)^{\sum_{i=1}^{n} p_i a_i}}
         \ge \left[ \sum_{i=1}^{n} \sum_{j=1}^{m} p_i q_j \left( \frac{b_j}{a_i} \right)^{a_i} \right]^{-1}
         \ge \exp\left[ 1 - \sum_{i=1}^{n} \sum_{j=1}^{m} p_i q_j \left( \frac{b_j}{a_i} \right)^{a_i} \right]
         \ge \exp\left[ \sum_{i=1}^{n} p_i a_i - \sum_{j=1}^{m} q_j b_j \right].

Equality holds in (3.7) iff $a_1 = \cdots = a_n = b_1 = \cdots = b_m$.
Proof. The theorem is a simple consequence of Theorems 3.2 and 3.5.
4. SOME INEQUALITIES FOR DISTANCE FUNCTIONS
In 1951, Kullback and Leibler [3] introduced the following distance function in Information Theory (see also [4] or [5]):

(4.1)    KL(p, q) := \sum_{i=1}^{n} p_i \log \frac{p_i}{q_i},

provided that $p, q \in \mathbb{R}^n_{++} := \{ x = (x_1, \dots, x_n) \in \mathbb{R}^n : x_i > 0, \; i = 1, \dots, n \}$. Another useful distance function is the $\chi^2$-distance given by (see [5])

(4.2)    D_{\chi^2}(p, q) := \sum_{i=1}^{n} \frac{p_i^2 - q_i^2}{q_i},
where $p, q \in \mathbb{R}^n_{++}$. S.S. Dragomir [2] introduced the following two new distance functions:

(4.3)    P_2(p, q) := \sum_{i=1}^{n} \left[ \left( \frac{p_i}{q_i} \right)^{p_i} - 1 \right]

and

(4.4)    P_1(p, q) := \sum_{i=1}^{n} \left[ 1 - \left( \frac{q_i}{p_i} \right)^{p_i} \right],

provided $p, q \in \mathbb{R}^n_{++}$. The following inequality connecting all the above four distance functions holds.
Theorem 4.1. Let $p, q \in \mathbb{R}^n_{++}$ with $p_i \in (0,1)$. Then we have the inequality

(4.5)    D_{\chi^2}(p, q) + Q_n - P_n \ge P_2(p, q)
         \ge n \ln\left[ \frac{1}{n} P_2(p, q) + 1 \right]
         \ge KL(p, q)
         \ge -n \ln\left[ 1 - \frac{1}{n} P_1(p, q) \right]
         \ge P_1(p, q)
         \ge P_n - Q_n,

where $P_n = \sum_{i=1}^{n} p_i$ and $Q_n = \sum_{i=1}^{n} q_i$. Equality holds in (4.5) iff $p_i = q_i$ ($i = 1, \dots, n$).

Proof. Set in (3.5) $p_i = 1/n$, $a_i = p_i$, $b_i = q_i$ ($i = 1, \dots, n$) and take logarithms. After multiplication by $n$, we get (4.5).
Corollary 4.2. Let $p, q$ be probability distributions. Then we have

(4.6)    D_{\chi^2}(p, q) \ge P_2(p, q)
         \ge n \ln\left[ \frac{1}{n} P_2(p, q) + 1 \right]
         \ge KL(p, q)
         \ge -n \ln\left[ 1 - \frac{1}{n} P_1(p, q) \right]
         \ge P_1(p, q) \ge 0.

Equality holds in (4.6) iff $p = q$.
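Corollary 4.2 can be tested numerically on random probability distributions. The following Python sketch is our own illustration (the helper names are ours); it evaluates the members of (4.6) and checks that they decrease from left to right.

import math, random

def distance_chain(p, q):
    # Members of (4.6), left to right, for probability distributions p, q with positive entries
    n = len(p)
    d_chi2 = sum((pi * pi - qi * qi) / qi for pi, qi in zip(p, q))
    P2 = sum((pi / qi) ** pi - 1 for pi, qi in zip(p, q))
    P1 = sum(1 - (qi / pi) ** pi for pi, qi in zip(p, q))
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))
    return [d_chi2, P2, n * math.log(P2 / n + 1), kl,
            -n * math.log(1 - P1 / n), P1, 0.0]

def rand_dist(n):
    w = [random.uniform(0.05, 1.0) for _ in range(n)]
    total = sum(w)
    return [x / total for x in w]

random.seed(6)
for _ in range(1000):
    n = random.randint(2, 8)
    p, q = rand_dist(n), rand_dist(n)
    t = distance_chain(p, q)
    assert all(t[k] + 1e-9 >= t[k + 1] for k in range(len(t) - 1))
print("Inequality chain (4.6) holds on all sampled cases")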
Remark 4.3. Inequalities (4.5) and (4.6) are improvements of related results in [2].
5. APPLICATIONS FOR SHANNON'S ENTROPY
The entropy of a random variable is a measure of the uncertainty of the random variable; it is a measure of the amount of information required on average to describe the random variable. Let $p(x)$, $x \in \chi$, be a probability mass function. Define the Shannon entropy $H(X)$ of a random variable X having the probability distribution p by

(5.1)    H(X) := \sum_{x \in \chi} p(x) \log \frac{1}{p(x)}.

In the above definition we use the convention (based on continuity arguments) that $0 \log \frac{0}{q} = 0$ and $p \log \frac{p}{0} = \infty$. Now assume that $|\chi|$ ($\operatorname{card}(\chi) = |\chi|$) is finite and let $u(x) = \frac{1}{|\chi|}$ be the uniform probability mass function on $\chi$. It is well known that [6, p. 27]
(5.2)    KL(p, u) = \sum_{x \in \chi} p(x) \log \frac{p(x)}{u(x)} = \log |\chi| - H(X).
The following result is important in Information Theory [6, p. 27]:

Theorem 5.1. Let X, p and $\chi$ be as above. Then

(5.3)    H(X) \le \log |\chi|,

with equality if and only if X has a uniform distribution over $\chi$.
In what follows, by the use of Corollary 4.2, we are able to point out the following estimate
for the difference log |χ| − H(X), that is, we shall give the following improvement of Theorem
9 from [2]:
Theorem 5.2. Let X, p and $\chi$ be as above. Then

(5.4)    |\chi| E(X) - 1 \ge \sum_{x \in \chi} \left( |\chi|^{p(x)} [p(x)]^{p(x)} - 1 \right)
         \ge |\chi| \ln\left\{ \frac{1}{|\chi|} \sum_{x \in \chi} |\chi|^{p(x)} [p(x)]^{p(x)} \right\}
         \ge \ln |\chi| - H(X)
         \ge -|\chi| \ln\left\{ \frac{1}{|\chi|} \sum_{x \in \chi} |\chi|^{-p(x)} [p(x)]^{-p(x)} \right\}
         \ge \sum_{x \in \chi} \left( 1 - |\chi|^{-p(x)} [p(x)]^{-p(x)} \right) \ge 0,

where E(X) is the informational energy of X, i.e., $E(X) := \sum_{x \in \chi} p^2(x)$. The equality holds in (5.4) iff $p(x) = \frac{1}{|\chi|}$ for all $x \in \chi$.

Proof. The proof is obvious by Corollary 4.2 on choosing $u(x) = \frac{1}{|\chi|}$.
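The bound (5.4) on $\log|\chi| - H(X)$ is also easy to check numerically; the sketch below is our own illustration (names and ranges are arbitrary) and uses natural logarithms throughout.

import math, random

def entropy_chain(p):
    # Members of (5.4), left to right, for a probability mass function p on a finite set
    K = len(p)                        # K = |chi|
    E = sum(px * px for px in p)      # informational energy
    H = -sum(px * math.log(px) for px in p)
    s_plus = sum((K * px) ** px for px in p)
    s_minus = sum((K * px) ** (-px) for px in p)
    return [K * E - 1, s_plus - K, K * math.log(s_plus / K),
            math.log(K) - H, -K * math.log(s_minus / K), K - s_minus, 0.0]

random.seed(7)
for _ in range(1000):
    K = random.randint(2, 10)
    w = [random.uniform(0.05, 1.0) for _ in range(K)]
    total = sum(w)
    p = [x / total for x in w]
    t = entropy_chain(p)
    assert all(t[k] + 1e-9 >= t[k + 1] for k in range(len(t) - 1))
print("Entropy bound chain (5.4) holds on all sampled cases")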
6. APPLICATIONS FOR MUTUAL INFORMATION
We consider mutual information, which is a measure of the amount of information that one
random variable contains about another random variable. It is the reduction of uncertainty of
one random variable due to the knowledge of the other [6, p. 18].
To be more precise, consider two random variables X and Y with a joint probability mass
function r(x, y) and marginal probability mass functions p(x) and q(y), x ∈ X , y ∈ Y.
The mutual information is the relative entropy between the joint distribution and the product distribution, that is,

I(X; Y) = \sum_{x \in \chi, \, y \in \mathcal{Y}} r(x, y) \log \frac{r(x, y)}{p(x) q(y)} = KL(r, pq).
The following result is well known [6, p. 27].
Theorem 6.1 (Non-negativity of mutual information). For any two random variables X, Y,

(6.1)    I(X; Y) \ge 0,

with equality iff X and Y are independent.
In what follows, by the use of Corollary 4.2, we are able to point out the following estimate
for the mutual information, that is, the following improvement of Theorem 11 of [2]:
Theorem 6.2. Let X and Y be as above. Then we have the inequality

\sum_{x \in \chi} \sum_{y \in \mathcal{Y}} \frac{r^2(x, y)}{p(x) q(y)} - 1
         \ge \sum_{x \in \chi} \sum_{y \in \mathcal{Y}} \left[ \left( \frac{r(x, y)}{p(x) q(y)} \right)^{r(x, y)} - 1 \right]
         \ge |\chi| |\mathcal{Y}| \ln\left[ \frac{1}{|\chi| |\mathcal{Y}|} \sum_{x \in \chi} \sum_{y \in \mathcal{Y}} \left( \frac{r(x, y)}{p(x) q(y)} \right)^{r(x, y)} \right]
         \ge I(X; Y)
         \ge -|\chi| |\mathcal{Y}| \ln\left[ \frac{1}{|\chi| |\mathcal{Y}|} \sum_{x \in \chi} \sum_{y \in \mathcal{Y}} \left( \frac{p(x) q(y)}{r(x, y)} \right)^{r(x, y)} \right]
         \ge \sum_{x \in \chi} \sum_{y \in \mathcal{Y}} \left[ 1 - \left( \frac{p(x) q(y)}{r(x, y)} \right)^{r(x, y)} \right] \ge 0.

The equality holds in all inequalities iff X and Y are independent.
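Theorem 6.2 can likewise be checked numerically. In the following sketch (our own illustration; representing the joint distribution r as a nested list is an arbitrary choice) the members of the chain are evaluated for random joint distributions with strictly positive entries.

import math, random

def mutual_info_chain(r):
    # r: joint pmf as a nested list r[x][y], all entries positive and summing to 1
    X, Y = len(r), len(r[0])
    N = X * Y
    p = [sum(r[x]) for x in range(X)]                        # marginal of X
    q = [sum(r[x][y] for x in range(X)) for y in range(Y)]   # marginal of Y
    ratio = [[r[x][y] / (p[x] * q[y]) for y in range(Y)] for x in range(X)]
    w1 = sum(r[x][y] * ratio[x][y] for x in range(X) for y in range(Y)) - 1
    s_plus = sum(ratio[x][y] ** r[x][y] for x in range(X) for y in range(Y))
    s_minus = sum(ratio[x][y] ** (-r[x][y]) for x in range(X) for y in range(Y))
    mi = sum(r[x][y] * math.log(ratio[x][y]) for x in range(X) for y in range(Y))
    return [w1, s_plus - N, N * math.log(s_plus / N), mi,
            -N * math.log(s_minus / N), N - s_minus, 0.0]

random.seed(8)
for _ in range(500):
    X, Y = random.randint(2, 4), random.randint(2, 4)
    w = [[random.uniform(0.05, 1.0) for _ in range(Y)] for _ in range(X)]
    total = sum(sum(row) for row in w)
    r = [[v / total for v in row] for row in w]
    t = mutual_info_chain(r)
    assert all(t[k] + 1e-9 >= t[k + 1] for k in range(len(t) - 1))
print("Mutual information chain holds on all sampled cases")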
REFERENCES
[1] I.A. ABOU-TAIR AND W.T. SULAIMAN, Inequalities via convex functions, Internat. J. Math. and
Math. Sci., 22 (1999), 543–546.
[2] S.S. DRAGOMIR, An inequality for logarithmic mapping and applications for the relative entropy,
RGMIA Res. Rep. Coll., 3(2) (2000), Article 1, http://rgmia.vu.edu.au/v3n2.html
[3] S. KULLBACK AND R.A. LEIBLER, On information and sufficiency, Ann. Math. Statist., 22 (1951), 79–86.
[4] S. KULLBACK, Information Theory and Statistics, J. Wiley, New York, 1959.
[5] A. BEN-TAL, A. BEN-ISRAEL AND M. TEBOULLE, Certainty equivalents and information measures: duality and extremal principles, J. Math. Anal. Appl., 157 (1991), 211–236.
[6] T.M. COVER AND J.A. THOMAS, Elements of Information Theory, John Wiley and Sons, Inc., 1991.