Journal of Inequalities in Pure and Applied Mathematics
Volume 2, Issue 1, Article 11, 2001
ISSN (electronic): 1443-5756

ON SOME APPLICATIONS OF THE AG INEQUALITY IN INFORMATION THEORY

BERTRAM MOND AND JOSIP E. PEČARIĆ

School of Mathematics and Statistics, La Trobe University, Bundoora, Victoria 3083, Australia.
EMail: B.Mond@latrobe.edu.au
URL: http://www.latrobe.edu.au/www/mathstats/Staff/mond.html

Department of Mathematics, Faculty of Textile Technology, University of Zagreb, Croatia.
EMail: pecaric@mahazu.hazu.hr
URL: http://mahazu.hazu.hr/DepMPCS/indexJP.html

Received 3 November, 2000; accepted 11 January, 2001.
Communicated by: A. Lupaş

Abstract

Recently, S.S. Dragomir used the concavity property of the log mapping and the weighted arithmetic mean-geometric mean inequality to develop new inequalities that were then applied to Information Theory. Here we extend these inequalities and their applications.

2000 Mathematics Subject Classification: 26D15.
Key words: Arithmetic-Geometric Mean, Kullback-Leibler Distance, Shannon's Entropy.

Contents

1. Introduction
2. An Inequality of I.A. Abou-Tair and W.T. Sulaiman
3. On Some Inequalities of S.S. Dragomir
4. Some Inequalities for Distance Functions
5. Applications for Shannon's Entropy
6. Applications for Mutual Information
References

1. Introduction

One of the most important inequalities is the arithmetic-geometric means inequality:

Let $a_i, p_i$, $i = 1, \ldots, n$, be positive numbers and let $P_n = \sum_{i=1}^n p_i$. Then

(1.1)   $\prod_{i=1}^n a_i^{p_i/P_n} \le \frac{1}{P_n} \sum_{i=1}^n p_i a_i,$

with equality iff $a_1 = \cdots = a_n$.

It is well known that (1.1) yields the following generalization of another well-known inequality, namely Hölder's inequality:

Let $p_{ij}, q_i$ ($i = 1, \ldots, m$; $j = 1, \ldots, n$) be positive numbers and let $Q_m = \sum_{i=1}^m q_i$. Then

(1.2)   $\sum_{j=1}^n \prod_{i=1}^m (p_{ij})^{q_i/Q_m} \le \prod_{i=1}^m \Big( \sum_{j=1}^n p_{ij} \Big)^{q_i/Q_m}.$

In this note, we show that (1.1) can be used to improve some recent results which have applications in information theory.

2. An Inequality of I.A. Abou-Tair and W.T. Sulaiman

The main result from [1] is: Let $p_{ij}, q_i$ ($i = 1, \ldots, m$; $j = 1, \ldots, n$) be positive numbers. Then

(2.1)   $\sum_{j=1}^n \prod_{i=1}^m (p_{ij})^{q_i/Q_m} \le \frac{1}{Q_m} \sum_{i=1}^m \sum_{j=1}^n p_{ij} q_i.$

Moreover, setting $n = m$, $p_i = q_i$ and $a_i = \sum_{j=1}^n p_{ij}$ in (1.1), we have

(2.2)   $\prod_{i=1}^m \Big( \sum_{j=1}^n p_{ij} \Big)^{q_i/Q_m} \le \frac{1}{Q_m} \sum_{i=1}^m \Big( \sum_{j=1}^n p_{ij} \Big) q_i.$

Now (1.2) and (2.2) give

(2.3)   $\sum_{j=1}^n \prod_{i=1}^m (p_{ij})^{q_i/Q_m} \le \prod_{i=1}^m \Big( \sum_{j=1}^n p_{ij} \Big)^{q_i/Q_m} \le \frac{1}{Q_m} \sum_{i=1}^m \sum_{j=1}^n p_{ij} q_i,$

which is an interpolation of (2.1).
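Since (1.1) and (2.3) are fully explicit and finite-dimensional, they are easy to sanity-check numerically. The following NumPy sketch is ours, not part of the original paper; the array sizes, sampling ranges, seed and tolerance are arbitrary choices:

```python
import numpy as np

rng = np.random.default_rng(0)

# Weighted A-G inequality (1.1): prod a_i^(p_i/P_n) <= (1/P_n) sum p_i a_i.
a = rng.uniform(0.1, 5.0, size=6)
p = rng.uniform(0.1, 2.0, size=6)
P_n = p.sum()
assert np.prod(a ** (p / P_n)) <= (p * a).sum() / P_n + 1e-12

# Interpolation (2.3):
#   sum_j prod_i p_ij^(q_i/Q_m)          (Hölder-type left side, from (1.2))
#   <= prod_i (sum_j p_ij)^(q_i/Q_m)     (middle term, from (2.2))
#   <= (1/Q_m) sum_i sum_j p_ij q_i      (right side of (2.1))
m, n = 4, 5
pij = rng.uniform(0.1, 3.0, size=(m, n))
q = rng.uniform(0.1, 2.0, size=m)
Q_m = q.sum()
left = (pij ** (q[:, None] / Q_m)).prod(axis=0).sum()
middle = np.prod(pij.sum(axis=1) ** (q / Q_m))
right = (pij * q[:, None]).sum() / Q_m
assert left <= middle + 1e-12 and middle <= right + 1e-12
```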
Moreover, the generalized Hölder inequality was obtained in [1] as a consequence of (2.1). This is not surprising since (2.1), for $n = 1$, becomes

$\prod_{i=1}^m (p_{i1})^{q_i/Q_m} \le \frac{1}{Q_m} \sum_{i=1}^m p_{i1} q_i,$

which is, in fact, the A-G inequality (1.1) (set $m = n$, $p_{i1} = a_i$ and $q_i = p_i$).

Theorem 3.1 in [1] is the well-known Shannon inequality: Given $\sum_{i=1}^n a_i = a$ and $\sum_{i=1}^n b_i = b$, we have

$a \ln \frac{a}{b} \le \sum_{i=1}^n a_i \ln \frac{a_i}{b_i}, \qquad a_i, b_i > 0.$

It was obtained from (2.1) through the special case

(2.4)   $\prod_{i=1}^n \Big( \frac{b_i}{a_i} \Big)^{a_i/a} \le \frac{b}{a}.$

Let us note that (2.4) is again a direct consequence of the A-G inequality. Indeed, setting $a_i \to b_i/a_i$ and $p_i \to a_i$, $i = 1, \ldots, n$, in (1.1), we obtain (2.4).

Theorem 3.2 from [1] is Rényi's inequality: Given $\sum_{i=1}^m a_i = a$ and $\sum_{i=1}^m b_i = b$, then for $\alpha > 0$, $\alpha \ne 1$,

$\frac{1}{\alpha - 1} \big( a^\alpha b^{1-\alpha} - a \big) \le \frac{1}{\alpha - 1} \sum_{i=1}^m \big( a_i^\alpha b_i^{1-\alpha} - a_i \big), \qquad a_i, b_i \ge 0.$

In fact, the proof given in [1] shows that Hölder's inequality is a consequence of (2.1). As we have noted, Hölder's inequality is also a consequence of the A-G inequality.

3. On Some Inequalities of S.S. Dragomir

The following theorems were proved in [2]:

Theorem 3.1. Let $a_i \in (0,1)$ and $b_i > 0$ ($i = 1, \ldots, n$). If $p_i > 0$ ($i = 1, \ldots, n$) is such that $\sum_{i=1}^n p_i = 1$, then

(3.1)   $\exp\Big[ \sum_{i=1}^n p_i \frac{a_i^2}{b_i} - \sum_{i=1}^n p_i a_i \Big]
\ge \exp\Big[ \sum_{i=1}^n p_i \Big( \Big( \frac{a_i}{b_i} \Big)^{a_i} - 1 \Big) \Big]
\ge \prod_{i=1}^n \Big( \frac{a_i}{b_i} \Big)^{p_i a_i}
\ge \exp\Big[ 1 - \sum_{i=1}^n p_i \Big( \frac{b_i}{a_i} \Big)^{a_i} \Big]
\ge \exp\Big[ \sum_{i=1}^n p_i a_i - \sum_{i=1}^n p_i b_i \Big],$

with equality iff $a_i = b_i$ for all $i \in \{1, \ldots, n\}$.

Theorem 3.2. Let $a_i \in (0,1)$ ($i = 1, \ldots, n$) and $b_j > 0$ ($j = 1, \ldots, m$). If $p_i > 0$ ($i = 1, \ldots, n$) is such that $\sum_{i=1}^n p_i = 1$ and $q_j > 0$ ($j = 1, \ldots, m$) is such that $\sum_{j=1}^m q_j = 1$, then we have the inequality

(3.2)   $\exp\Big[ \sum_{i=1}^n p_i a_i^2 \sum_{j=1}^m \frac{q_j}{b_j} - \sum_{i=1}^n p_i a_i \Big]
\ge \exp\Big[ \sum_{i=1}^n \sum_{j=1}^m p_i q_j \Big( \Big( \frac{a_i}{b_j} \Big)^{a_i} - 1 \Big) \Big]
\ge \frac{\prod_{i=1}^n a_i^{a_i p_i}}{\prod_{j=1}^m b_j^{\,q_j \sum_{i=1}^n p_i a_i}}
\ge \exp\Big[ 1 - \sum_{i=1}^n \sum_{j=1}^m p_i q_j \Big( \frac{b_j}{a_i} \Big)^{a_i} \Big]
\ge \exp\Big[ \sum_{i=1}^n p_i a_i - \sum_{j=1}^m q_j b_j \Big].$

The equality holds in (3.2) iff $a_1 = \cdots = a_n = b_1 = \cdots = b_m$.

First we give an improvement of the second and third inequality in (3.1).

Theorem 3.3. Let $a_i, b_i$ and $p_i$ ($i = 1, \ldots, n$) be positive real numbers with $\sum_{i=1}^n p_i = 1$. Then

(3.3)   $\exp\Big[ \sum_{i=1}^n p_i \Big( \Big( \frac{a_i}{b_i} \Big)^{a_i} - 1 \Big) \Big]
\ge \sum_{i=1}^n p_i \Big( \frac{a_i}{b_i} \Big)^{a_i}
\ge \prod_{i=1}^n \Big( \frac{a_i}{b_i} \Big)^{p_i a_i}
\ge \Big[ \sum_{i=1}^n p_i \Big( \frac{b_i}{a_i} \Big)^{a_i} \Big]^{-1}
\ge \exp\Big[ \sum_{i=1}^n p_i \Big( 1 - \Big( \frac{b_i}{a_i} \Big)^{a_i} \Big) \Big],$

with equality iff $a_i = b_i$, $i = 1, \ldots, n$.

Proof. The first inequality in (3.3) is a simple consequence of the following well-known elementary inequality:

(3.4)   $e^{x-1} \ge x$ for all $x \in \mathbb{R}$, with equality iff $x = 1$.

Indeed, applying (3.4) with $x = \sum_{i=1}^n p_i (a_i/b_i)^{a_i}$ and using $\sum_{i=1}^n p_i = 1$ gives the first inequality. The second inequality is a simple consequence of the A-G inequality: in (1.1), set $a_i \to (a_i/b_i)^{a_i}$, $i = 1, \ldots, n$. The third inequality is again a consequence of (1.1). Namely, for $a_i \to (b_i/a_i)^{a_i}$, $i = 1, \ldots, n$, (1.1) becomes

$\prod_{i=1}^n \Big( \frac{b_i}{a_i} \Big)^{a_i p_i} \le \sum_{i=1}^n p_i \Big( \frac{b_i}{a_i} \Big)^{a_i},$

which is equivalent to the third inequality. The last inequality is again a consequence of (3.4). □
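As a numerical illustration (ours, not part of the original paper; the ranges and seed are arbitrary), the whole chain (3.3) can be evaluated term by term and checked to be decreasing:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5
a = rng.uniform(0.1, 2.0, size=n)    # a_i > 0
b = rng.uniform(0.1, 2.0, size=n)    # b_i > 0
p = rng.dirichlet(np.ones(n))        # p_i > 0 with sum p_i = 1

r = (a / b) ** a                     # r_i = (a_i/b_i)^{a_i}; note 1/r_i = (b_i/a_i)^{a_i}
chain = [
    np.exp((p * (r - 1)).sum()),     # exp[ sum p_i (r_i - 1) ]
    (p * r).sum(),                   # sum p_i r_i
    np.prod(r ** p),                 # prod (a_i/b_i)^{p_i a_i}
    1.0 / (p / r).sum(),             # [ sum p_i (b_i/a_i)^{a_i} ]^{-1}
    np.exp((p * (1 - 1 / r)).sum()), # exp[ sum p_i (1 - (b_i/a_i)^{a_i}) ]
]
# Each term dominates the next, with equality only when a = b.
assert all(x >= y - 1e-12 for x, y in zip(chain, chain[1:]))
```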
Theorem 3.4. Let $a_i \in (0,1)$ and $b_i > 0$ ($i = 1, \ldots, n$). If $p_i > 0$, $i = 1, \ldots, n$, is such that $\sum_{i=1}^n p_i = 1$, then

(3.5)   $\exp\Big[ \sum_{i=1}^n p_i \frac{a_i^2}{b_i} - \sum_{i=1}^n p_i a_i \Big]
\ge \exp\Big[ \sum_{i=1}^n p_i \Big( \Big( \frac{a_i}{b_i} \Big)^{a_i} - 1 \Big) \Big]
\ge \sum_{i=1}^n p_i \Big( \frac{a_i}{b_i} \Big)^{a_i}
\ge \prod_{i=1}^n \Big( \frac{a_i}{b_i} \Big)^{p_i a_i}
\ge \Big[ \sum_{i=1}^n p_i \Big( \frac{b_i}{a_i} \Big)^{a_i} \Big]^{-1}
\ge \exp\Big[ \sum_{i=1}^n p_i \Big( 1 - \Big( \frac{b_i}{a_i} \Big)^{a_i} \Big) \Big]
\ge \exp\Big[ \sum_{i=1}^n p_i a_i - \sum_{i=1}^n p_i b_i \Big],$

with equality iff $a_i = b_i$ for all $i = 1, \ldots, n$.

Proof. The theorem follows from Theorems 3.1 and 3.3. □

Theorem 3.5. Let $a_i, p_i$ ($i = 1, \ldots, n$) and $b_j, q_j$ ($j = 1, \ldots, m$) be positive numbers with $\sum_{i=1}^n p_i = \sum_{j=1}^m q_j = 1$. Then

(3.6)   $\exp\Big[ \sum_{i=1}^n \sum_{j=1}^m p_i q_j \Big( \Big( \frac{a_i}{b_j} \Big)^{a_i} - 1 \Big) \Big]
\ge \sum_{i=1}^n \sum_{j=1}^m p_i q_j \Big( \frac{a_i}{b_j} \Big)^{a_i}
\ge \frac{\prod_{i=1}^n a_i^{a_i p_i}}{\prod_{j=1}^m b_j^{\,q_j \sum_{i=1}^n p_i a_i}}
\ge \Big[ \sum_{i=1}^n \sum_{j=1}^m p_i q_j \Big( \frac{b_j}{a_i} \Big)^{a_i} \Big]^{-1}
\ge \exp\Big[ 1 - \sum_{i=1}^n \sum_{j=1}^m p_i q_j \Big( \frac{b_j}{a_i} \Big)^{a_i} \Big].$

Equality in (3.6) holds iff $a_1 = \cdots = a_n = b_1 = \cdots = b_m$.

Proof. The first and the last inequalities are simple consequences of (3.4). The second is also a simple consequence of the A-G inequality. Namely, we have

$\frac{\prod_{i=1}^n a_i^{a_i p_i}}{\prod_{j=1}^m b_j^{\,q_j \sum_{i=1}^n p_i a_i}} = \prod_{i=1}^n \prod_{j=1}^m \Big( \frac{a_i}{b_j} \Big)^{a_i p_i q_j} \le \sum_{i=1}^n \sum_{j=1}^m p_i q_j \Big( \frac{a_i}{b_j} \Big)^{a_i},$

which is the second inequality in (3.6). By the A-G inequality, we have

$\prod_{i=1}^n \prod_{j=1}^m \Big( \frac{b_j}{a_i} \Big)^{a_i p_i q_j} \le \sum_{i=1}^n \sum_{j=1}^m p_i q_j \Big( \frac{b_j}{a_i} \Big)^{a_i},$

which gives the third inequality in (3.6). □

Theorem 3.6. Let the assumptions of Theorem 3.2 be satisfied. Then

(3.7)   $\exp\Big[ \sum_{i=1}^n p_i a_i^2 \sum_{j=1}^m \frac{q_j}{b_j} - \sum_{i=1}^n p_i a_i \Big]
\ge \exp\Big[ \sum_{i=1}^n \sum_{j=1}^m p_i q_j \Big( \Big( \frac{a_i}{b_j} \Big)^{a_i} - 1 \Big) \Big]
\ge \sum_{i=1}^n \sum_{j=1}^m p_i q_j \Big( \frac{a_i}{b_j} \Big)^{a_i}
\ge \frac{\prod_{i=1}^n a_i^{a_i p_i}}{\prod_{j=1}^m b_j^{\,q_j \sum_{i=1}^n p_i a_i}}
\ge \Big[ \sum_{i=1}^n \sum_{j=1}^m p_i q_j \Big( \frac{b_j}{a_i} \Big)^{a_i} \Big]^{-1}
\ge \exp\Big[ 1 - \sum_{i=1}^n \sum_{j=1}^m p_i q_j \Big( \frac{b_j}{a_i} \Big)^{a_i} \Big]
\ge \exp\Big[ \sum_{i=1}^n p_i a_i - \sum_{j=1}^m q_j b_j \Big].$

Equality holds in (3.7) iff $a_1 = \cdots = a_n = b_1 = \cdots = b_m$.

Proof. The theorem is a simple consequence of Theorems 3.2 and 3.5. □
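The double-index chains can be checked the same way. A small NumPy sketch (ours; it evaluates the seven terms of (3.7) in order, under Theorem 3.2's assumption $a_i \in (0,1)$, with arbitrary sizes and seed):

```python
import numpy as np

rng = np.random.default_rng(2)
n, m = 4, 3
a = rng.uniform(0.05, 0.95, size=n)  # a_i in (0, 1), as Theorem 3.2 requires
b = rng.uniform(0.1, 2.0, size=m)    # b_j > 0
p = rng.dirichlet(np.ones(n))        # sum p_i = 1
q = rng.dirichlet(np.ones(m))        # sum q_j = 1

R = (a[:, None] / b[None, :]) ** a[:, None]  # R[i, j] = (a_i/b_j)^{a_i}
W = p[:, None] * q[None, :]                  # W[i, j] = p_i q_j

chain = [
    np.exp((p * a**2).sum() * (q / b).sum() - (p * a).sum()),  # 1st term
    np.exp((W * (R - 1)).sum()),                               # 2nd term
    (W * R).sum(),                                             # 3rd term
    np.prod(a ** (p * a)) / np.prod(b ** q) ** (p * a).sum(),  # 4th term
    1.0 / (W / R).sum(),                                       # 5th term
    np.exp(1.0 - (W / R).sum()),                               # 6th term
    np.exp((p * a).sum() - (q * b).sum()),                     # 7th term
]
assert all(x >= y - 1e-12 for x, y in zip(chain, chain[1:]))
```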
4. Some Inequalities for Distance Functions

In 1951, Kullback and Leibler introduced the following distance function in Information Theory (see [3] or [4]):

(4.1)   $KL(p, q) := \sum_{i=1}^n p_i \log \frac{p_i}{q_i},$

provided that $p, q \in \mathbb{R}^n_{++} := \{ x = (x_1, \ldots, x_n) \in \mathbb{R}^n : x_i > 0, \ i = 1, \ldots, n \}$.

Another useful distance function is the $\chi^2$-distance given by (see [5])

(4.2)   $D_{\chi^2}(p, q) := \sum_{i=1}^n \frac{p_i^2 - q_i^2}{q_i},$

where $p, q \in \mathbb{R}^n_{++}$.

S.S. Dragomir [2] introduced the following two new distance functions

(4.3)   $P_2(p, q) := \sum_{i=1}^n \Big[ \Big( \frac{p_i}{q_i} \Big)^{p_i} - 1 \Big]$

and

(4.4)   $P_1(p, q) := \sum_{i=1}^n \Big[ 1 - \Big( \frac{q_i}{p_i} \Big)^{p_i} \Big],$

provided $p, q \in \mathbb{R}^n_{++}$. The following inequality connecting all four of the above distance functions holds.

Theorem 4.1. Let $p, q \in \mathbb{R}^n_{++}$ with $p_i \in (0,1)$. Then we have the inequality

(4.5)   $D_{\chi^2}(p, q) + Q_n - P_n
\ge P_2(p, q)
\ge n \ln\Big[ \frac{1}{n} P_2(p, q) + 1 \Big]
\ge KL(p, q)
\ge -n \ln\Big[ 1 - \frac{1}{n} P_1(p, q) \Big]
\ge P_1(p, q)
\ge P_n - Q_n,$

where $P_n = \sum_{i=1}^n p_i$ and $Q_n = \sum_{i=1}^n q_i$. Equality holds in (4.5) iff $p_i = q_i$ ($i = 1, \ldots, n$).

Proof. In (3.5), take the weights equal to $1/n$ and set $a_i = p_i$, $b_i = q_i$ ($i = 1, \ldots, n$); then take logarithms. After multiplication by $n$, we get (4.5). □

Corollary 4.2. Let $p, q$ be probability distributions. Then we have

(4.6)   $D_{\chi^2}(p, q)
\ge P_2(p, q)
\ge n \ln\Big[ \frac{1}{n} P_2(p, q) + 1 \Big]
\ge KL(p, q)
\ge -n \ln\Big[ 1 - \frac{1}{n} P_1(p, q) \Big]
\ge P_1(p, q)
\ge 0.$

Equality holds in (4.6) iff $p = q$.

Remark 4.1. Inequalities (4.5) and (4.6) are improvements of related results in [2].

5. Applications for Shannon's Entropy

The entropy of a random variable is a measure of its uncertainty; it is a measure of the amount of information required on the average to describe the random variable. Let $p(x)$, $x \in \chi$, be a probability mass function. Define the Shannon entropy of a random variable $X$ having the probability distribution $p$ by

(5.1)   $H(X) := \sum_{x \in \chi} p(x) \log \frac{1}{p(x)}.$

In the above definition we use the convention (based on continuity arguments) that $0 \log \frac{0}{q} = 0$ and $p \log \frac{p}{0} = \infty$. Now assume that $|\chi|$ ($\mathrm{card}(\chi) = |\chi|$) is finite and let $u(x) = \frac{1}{|\chi|}$ be the uniform probability mass function on $\chi$. It is well known that [6, p. 27]

(5.2)   $KL(p, u) = \sum_{x \in \chi} p(x) \log \frac{p(x)}{u(x)} = \log |\chi| - H(X).$

The following result is important in Information Theory [6, p. 27]:

Theorem 5.1. Let $X$, $p$ and $\chi$ be as above. Then

(5.3)   $H(X) \le \log |\chi|,$

with equality if and only if $X$ has a uniform distribution over $\chi$.

In what follows, by the use of Corollary 4.2, we point out the following estimate for the difference $\log |\chi| - H(X)$, that is, the following improvement of Theorem 9 from [2]:

Theorem 5.2. Let $X$, $p$ and $\chi$ be as above. Then

(5.4)   $|\chi| E(X) - 1
\ge \sum_{x \in \chi} \big[ |\chi|^{p(x)} [p(x)]^{p(x)} - 1 \big]
\ge |\chi| \ln\Big[ \frac{1}{|\chi|} \sum_{x \in \chi} |\chi|^{p(x)} [p(x)]^{p(x)} \Big]
\ge \ln |\chi| - H(X)
\ge -|\chi| \ln\Big[ \frac{1}{|\chi|} \sum_{x \in \chi} |\chi|^{-p(x)} [p(x)]^{-p(x)} \Big]
\ge \sum_{x \in \chi} \big[ 1 - |\chi|^{-p(x)} [p(x)]^{-p(x)} \big]
\ge 0,$

where $E(X)$ is the informational energy of $X$, i.e., $E(X) := \sum_{x \in \chi} p^2(x)$. Equality holds in (5.4) iff $p(x) = \frac{1}{|\chi|}$ for all $x \in \chi$.

Proof. The proof follows from Corollary 4.2 by choosing $q = u$, $u(x) = \frac{1}{|\chi|}$. □
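The chain (5.4) is just Corollary 4.2 evaluated at $q = u$, so it is directly computable. A short NumPy sketch (ours, not from the paper; the alphabet size and seed are arbitrary) computing the bounds around $\ln|\chi| - H(X)$ for a random distribution:

```python
import numpy as np

rng = np.random.default_rng(3)
k = 6                                 # |chi|, the alphabet size
p = rng.dirichlet(np.ones(k))         # a probability distribution on chi
u = np.full(k, 1.0 / k)               # uniform distribution u(x) = 1/|chi|

gap = np.log(k) + (p * np.log(p)).sum()  # ln|chi| - H(X) = KL(p, u)
E = (p**2).sum()                         # informational energy E(X)
t = (p / u) ** p                         # terms (|chi| p(x))^{p(x)}

chain = [
    k * E - 1,                        # D_chi2(p, u)
    (t - 1).sum(),                    # P_2(p, u)
    k * np.log(t.sum() / k),          # k * ln[(1/k) sum t]
    gap,                              # ln|chi| - H(X)
    -k * np.log((1 / t).sum() / k),   # -k * ln[(1/k) sum 1/t]
    (1 - 1 / t).sum(),                # P_1(p, u)
    0.0,
]
assert all(x >= y - 1e-12 for x, y in zip(chain, chain[1:]))
```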
6. Applications for Mutual Information

We consider mutual information, which is a measure of the amount of information that one random variable contains about another random variable. It is the reduction in uncertainty of one random variable due to the knowledge of the other [6, p. 18]. To be more precise, consider two random variables $X$ and $Y$ with a joint probability mass function $r(x, y)$ and marginal probability mass functions $p(x)$ and $q(y)$, $x \in \chi$, $y \in \mathcal{Y}$. The mutual information is the relative entropy between the joint distribution and the product distribution, that is,

$I(X; Y) = \sum_{x \in \chi} \sum_{y \in \mathcal{Y}} r(x, y) \log \frac{r(x, y)}{p(x) q(y)} = KL(r, pq).$

The following result is well known [6, p. 27]:

Theorem 6.1 (Non-negativity of mutual information). For any two random variables $X$, $Y$,

(6.1)   $I(X; Y) \ge 0,$

with equality iff $X$ and $Y$ are independent.

In what follows, by the use of Corollary 4.2, we point out the following estimate for the mutual information, that is, the following improvement of Theorem 11 of [2]:

Theorem 6.2. Let $X$ and $Y$ be as above. Then we have the inequality

$\sum_{x \in \chi} \sum_{y \in \mathcal{Y}} \frac{r^2(x, y)}{p(x) q(y)} - 1
\ge \sum_{x \in \chi} \sum_{y \in \mathcal{Y}} \Big[ \Big( \frac{r(x, y)}{p(x) q(y)} \Big)^{r(x, y)} - 1 \Big]
\ge |\chi| \, |\mathcal{Y}| \ln\Big[ \frac{1}{|\chi| \, |\mathcal{Y}|} \sum_{x \in \chi} \sum_{y \in \mathcal{Y}} \Big( \frac{r(x, y)}{p(x) q(y)} \Big)^{r(x, y)} \Big]
\ge I(X; Y)
\ge -|\chi| \, |\mathcal{Y}| \ln\Big[ \frac{1}{|\chi| \, |\mathcal{Y}|} \sum_{x \in \chi} \sum_{y \in \mathcal{Y}} \Big( \frac{p(x) q(y)}{r(x, y)} \Big)^{r(x, y)} \Big]
\ge \sum_{x \in \chi} \sum_{y \in \mathcal{Y}} \Big[ 1 - \Big( \frac{p(x) q(y)}{r(x, y)} \Big)^{r(x, y)} \Big]
\ge 0.$

The equality holds in all inequalities iff $X$ and $Y$ are independent.

References

[1] I.A. ABOU-TAIR AND W.T. SULAIMAN, Inequalities via convex functions, Internat. J. Math. and Math. Sci., 22 (1999), 543–546.

[2] S.S. DRAGOMIR, An inequality for logarithmic mapping and applications for the relative entropy, RGMIA Res. Rep. Coll., 3(2) (2000), Article 1. [ONLINE] http://rgmia.vu.edu.au/v3n2.html

[3] S. KULLBACK AND R.A. LEIBLER, On information and sufficiency, Ann. Math. Statist., 22 (1951), 79–86.

[4] S. KULLBACK, Information Theory and Statistics, J. Wiley, New York, 1959.

[5] A. BEN-TAL, A. BEN-ISRAEL AND M. TEBOULLE, Certainty equivalents and information measures: duality and extremal principles, J. Math. Anal. Appl., 157 (1991), 211–236.

[6] T.M. COVER AND J.A. THOMAS, Elements of Information Theory, John Wiley and Sons, Inc., 1991.