May 30, 2005 Means of positive numbers and matrices 1 Dedicated to Professor Pál Rózsa on the occasion of his 80th birthday Dénes Petz 2 and Róbert Temesi 3 Department for Mathematical Analysis Budapest University of Technology and Economics H-1521 Budapest XI., Hungary It is shown that any two-variable-mean of positive matrices (studied by Kubo and Ando) can be extended to more variables. The n-variable-mean Mn (A1 , A2 , . . . , An ) is defined by a symmetrization procedure when the ntuple (A1 , A2 , . . . , An ) is ordered, it is monotone to each variables and satisfies the transformer inequality. This approach is motivated by the paper of Ando, Li and Mathias on geometric means. Special attention is paid to the logarithmic mean. It is conjectured that for matrix means the symmetrization procedure converges for all triplets. 1 Introduction A function M : R+ × R+ → R+ may be called a mean of positive numbers if (i) M (x, x) = x for every x ∈ R+ . (ii) M (x, y) = M (y, x) for every x, y ∈ R+ . (iii) If x < y, then x < M (x, y) < y. (iv) If x < x0 and y < y 0 , then M (x, y) < M (x0 , y 0 ). 1 The work was partially supported by the Hungarian grant OTKA T032662 and the paper is to be published in SIAM Matrix Anal. Appl . 2 E-mail: petz@math.bme.hu. 3 E-mail: temesi@math.bme.hu. 1 (v) M (x, y) is continuous. √ The geometric mean xy, the arithmetic mean (x + y)/2 and the harmonic mean 2/(x−1 +y −1 ) are the most known examples, sometimes they are called Pythagorean means. There are many other means as well, such as 1 Aα (x, y) := (xα y 1−α + x1−α y α ) 2 and (0 ≤ α ≤ 1) (1) xα + y α 1/α (−∞ ≤ α ≤ ∞). (2) 2 In order to have a more treatable class of means, we impose the extra condition of homogeneity Bα (x, y) := (vi) M (tx, ty) = tM (x, y) (t, x, y ∈ R+ ). Note that all the above examples are homogeneous means. A two-variable function M (x, y) satisfying condition (vi) can be reduced to a onevariable function f (x) := M (1, x). Namely, M (x, y) is recovered from f as y M (x, y) = xf . (3) x What are the properties of f which imply conditions (i)–(v)? They are as follows. (i)0 f (1) = 1 (ii)0 tf (t−1 ) = f (t) (iii)0 f (t) > 1 if t > 1 and f (t) < 1 if 0 < t < 1. (iv)0 f is monotone increasing. (v)0 f is continuous. Then a homogeneous and continuous mean is uniquely described by a function f satisfying the properties (i)0 –(v)0 . A function f determines a mean for positive matrices (or operators) when f is matrix monotone. This property is a strengthening of (iv)0 . Then the formula M (A, B) = A1/2 f (A−1/2 BA−1/2 )A1/2 (4) replaces (3). In this paper the two-variable matrix means of Kubo and Ando are extended to n-tuples (A1 , A2 , . . . , An ) of positive matrices such that A1 ≤ A2 ≤ . . . ≤ An . The latter restriction is not required for every mean, the Pythagorean means work very well without the ordering. The geometric mean was discussed by Ando, Li and Mathias [1] and we present an elementary proof of their result. We conjecture that there are other means such that the more-variable-extension works without the restrictive ordering condition. For example, the logarithmic mean seems to be in this class. Although we discuss briefly the means for positive real variables, the emphasis is on the matrix case. 2 2 Means for three variables The geometric mean, the arithmetic mean and the harmonic mean are extended to three (and more) variables as √ xyz, (x + y + z)/3, 3/(x−1 + y −1 + z −1 ). Our target is to extend a general mean M2 (x, y) of two variables to three variables. In order to do this assume that 0 < x ≤ y ≤ z and define a recursion x1 = x, xn+1 = M2 (xn , yn ), y1 = y, z1 = z yn+1 = M2 (xn , zn ), zn+1 = M2 (yn , zn ). (5) (6) Below we refer to this recursion as symmetrization procedure. One can show by induction that xn ≤ yn ≤ zn , moreover the sequence (xn ) is increasing and (zn ) is decreasing. Therefore, the limits L := lim xn n→∞ and U = lim zn n→∞ exist. Our next goal is to show that L = U . Assume that L < U . By the continuity of the mean, yn → M := M2 (L, U ), and L < M < U . Taking the limit of the relation M2 (yn , zn ) = zn+1 , we obtain M2 (M, U ) = U. This contradicts (iii) and L = U is proven. Given a mean M2 (x, y) of two variables, we define M3 (x, y, z) as the limit limn xn = limn yn = limn zn in the above recursion (5) and (6) for any x, y, z ∈ R+ . Note that the existence of the limit does not require the symmetry of the M2 (x, y). Example 1 For the arithmetic, geometric and harmonic means the above symmetrization yields the customary three-variable-mean from the corresponding two-variable-ones. Example 2 For the logarithmic mean, different extensions have been proposed in the literature. An extension can be given by the formula [9]: L1 (x1 , x2 , . . . , xn ) = (n − 1) L1 (x2 , x3 , . . . , xn ) − L1 (x1 , x2 , . . . , xn−1 ) log xn − log x1 This gives L1 (x, y, z) := x y + (log x − log y)(log x − log z) (log y − log x)(log y − log z) z + . (log z − log x)(log z − log y) 3 Note that this logarithmic mean appears in the scalar curvature of a certain Riemannian metric on matrices [7]. Another possibility is L2 (x, y, z) := (x − y)(x − z)(y − z) 1 , 2 xy(log x − log y) + xz(log z − log x) + yz(log y − log z) see [12]. Since Li (x, y, z) = Li L(x, y), L(x, z), L(y, z) (i = 1, 2) does not hold, our above symmetrization does not give Li (x, y, z) from L(x, y) (i = 1, 2). Given a mean M2 of two variables, the corresponding mean M3 of three variables (constructed above) satisfies the equation M3 (x, y, z) = M3 M2 (x, y), M2 (x, z), M2 (y, z) (7) and has the property M3 (u, u, u) = u. Conversely, when M3 is continuous and has the above two properties, then it must be the mean obtained from M2 by the symmetrization procedure. Indeed, (7) implies M3 (x, y, z) = M3 (xn , yn , zn ). Since the sequences (xn ), (yn ) and (zn ) converge to some u, the right-hand-side converges to M3 (u, u, u) = u. Therefore, M3 is the mean provided by the symmetrization procedure. Example 3 Let f (x) + f (y) Mf (x, y) = f 2 be a quasi-arithmetic mean defined by means of a strictly monotone function f . For these means f (x) + f (y) + f (z) −1 M3 (x, y, z) = f 3 −1 as it is easy to check (7). Remember that arithmetic, geometric and harmonic means belong to this class. 3 Means of two positive matrices A theory of means of positive operators was developed by Kubo and Ando [6]. The theory has many interesting applications. Here we do not go into the details concerning operator means but we confine ourselves to the essentials. Operator means are binary operations on positive operators acting on a Hilbert space and they satisfy conditions similar to the above (i)-(v), where the notation A < B means B − A is positive semidefinite and A 6= B: 4 (a) M (A, A) = A for every A, (b) M (A, B) = M (B, A) for every A and B, (c) if A < B, then A < M (A, B) < B, (d) if A < A0 and B < B 0 , then M (A, B) < M (A0 , B 0 ), (e) M is continuous. An important further requirement is the transformer inequality: (f) C M (A, B) C ∗ ≤ M (CAC ∗ , CBC ∗ ) for all operators C. This is a stronger version of condition (vi), since for invertible C, there is equality in (f). The key result in the theory is that operator means are in a 1-to-1 correspondence with operator monotone functions satisfying conditions (i)0 –(v)0 . The correspondence is given by M (A, B) = A1/2 f (A−1/2 BA−1/2 )A1/2 (8) when A is invertible. Note that an operator monotone function is automatically continuous and even smooth. See [2] for the theory of operator monotone functions. For the sake of simplicity we restrict our discussion to invertible positive matrices. (Generalization to positive operators is immediate.) An important example is the geometric mean: A#B = A1/2 (A−1/2 BA−1/2 )1/2 A1/2 (9) which has the special property (λA)#(µB) = for positive numbers λ and µ. p λµ(A#B) (10) Example 4 Consider fα (x) = 2xα+1/2 1 + x2α (0 ≤ α ≤ 1/2). Then 1 1 fα (x)−1 = x−α−1/2 + xα−1/2 2 2 is clearly operator monotone decreasing, so fα is operator monotone. The operator monotone function f (x) := gives the logarithmic mean. x−1 log x (11) 5 Example 5 From the arithmetic and geometric means, one can arrive at the logarithmic mean also in the matrix case, cf. [4]. We define an operation W on the pairs of positive matrices as W (A, B) := ((A + B)/2, ((A + B)/2)#B). (12) By induction it can be shown that the coordinates of W Wn,1 and Wn,2 satisfy conditions (a) − (f) except (b), so these are not symmetric. Even in this case Wn,1 (A, B) can be written in the form of (8) Wn,1 (A, B) = A1/2 fn (A−1/2 BA−1/2 )A1/2 . fn (x) = Wn,1 (x, 1) which converges to (11) according to [4]. This fact implies that the iteration works also for matrices: the sequence A1/2 fn (A−1/2 BA−1/2 )A1/2 converges to the logarithmic mean of A and B. The recursive way of mixing two means as in the previous example can produce new means, or new operator monotone functions from arbitrary initial operator monotone data g1 and h1 . Recursively gn := g1 (gn−1 , hn−1 ) and hn := h1 (gn−1 , hn−1 ) Example 6 Mixing of the operator monotone functions g1 (x) = √ x, h1 (x) = 2x x+1 (or geometric and harmonic means) yields the (joint) limit √ x−1 2 x . log x 1 + x Carrying out the recursion with x−1 g1 (x) = log x √ x−1 2 x and h1 (x) = log x 1 + x we arrive at the operator monotone function x − 1 2 log x 2 . 1+x The operator mean corresponding to the function (13) looks a bit unusual. (13) In the paper [11] a one-to-one correspondence was established between means and certain Riemannian metrics on a space of matrices. 6 4 Means for three matrices The mean of three positive matrices can be obtained by a symmetrization procedure from the two-variable-mean. Theorem 1 Let A, B, C ∈ Mn (C) be positive definite matrices and let M2 be an operator mean. Assume that A ≤ B ≤ C. Set a recursion as A1 = A, An+1 = M2 (An , Bn ), B1 = B, C1 = C, Bn+1 = M2 (An , Cn ), Cn+1 = M2 (Bn , Cn ). (14) (15) Then the limits M3 (A, B, C) := lim An = lim Bn = lim Cn n n n (16) exist. Proof: By the monotonicity of M2 and mathematical induction, we see that An ≤ Bn ≤ Cn . It follows that the sequence (An ) is increasing and (Cn ) is decreasing. Therefore, the limits L := lim An and U = lim Cn n→∞ n→∞ exist. We claim that L = U . Assume that L 6= U . By continuity, Bn → M2 (L, U ) =: M , where L < M < U . Since M2 (Bn , Cn ) = Cn+1 , the limit n → ∞ gives M2 (M, U ) = U , which contradicts M < U . By the symmetrization procedure any operator mean of two variables has an extension to those triplets (A, B, C) which can be ordered. In a few cases, the latter restriction can be skipped. Example 7 If M2 (A, B) = (A+B)/2 is the arithmetic mean, then the above recursion can be written explicitly and the convergence is easily verified without the conditions A ≤ B ≤ C. The limit gives 1 M3 (A, B, C) = (A + B + C). 3 Example 8 The case of the harmonic mean M2 (A, B) = 2/(A−1 + B −1 ) can be reduced to the arithmetic mean by means of the inverse, and the recursion gives M3 (A, B, C) = A−1 3 . + B −1 + C −1 7 The following theorem shows this for the geometric mean. Theorem 2 Let A, B, C ∈ Mn (C) be positive definite matrices. Set a recursion as A1 = A, An+1 = An #Bn , B1 = B, C1 = C, Bn+1 = An #Cn , Cn+1 = Bn #Cn . (17) (18) Then the limit G(A, B, C) := lim An = lim Bn = lim Cn n n n (19) exists. Proof: Choose positive numbers λ and µ such that A0 := A < B 0 := λB < C 0 := µC. Start the recursion with these matrices. By Theorem 1 the limits G(A0 , B 0 , C 0 ) := lim A0n = lim Bn0 = lim Cn0 n n n exist. For the numbers a := 1, b := λ and c := µ the recursion provides a convergent sequence (an , bn , cn ) of triplets. (λµ)1/3 = lim an = lim bn = lim cn . n n n Since An = A0n /an , Bn = Bn0 /bn and Cn = Cn0 /cn due to property (10) of the geometric mean, the limits stated in the theorem must exist and equal G(A0 , B 0 , C 0 )/(λµ)1/3 . The result of Theorem 2 was obtained in [1] but our proof is different and completely elementary. The path γ(t) = A1/2 (A−1/2 BA−1/2 )t A1/2 (0 ≤ t ≤ 1) (20) connects A with B. In [3, 8], a Riemannian geometry is defined on the set of positive definite matrices such that the path (20) is geodesic. In this approach the geometric mean of two (or three) matrices has a nice geometric interpretation (similarly to the arithmetic mean in the flat space.) Theorem 3 The mean M3 (A, B, C) defined in Theorem 1 under the condition A ≤ B ≤ C is a symmetric continuous function of all variables, monotone in all variables and satisfies the transformer inequality XM3 (A, B, C)X ∗ ≤ M3 (XAX ∗ , XBX ∗ , XCX ∗ ) for any matrix X. 8 Proof: Symmetry and monotonicity are obvious; we prove the transformer inequality. Let An , Bn and Cn be the matrices recursively defined in the construction of M3 (A, B, C) and A0n , Bn0 , Cn0 those defined in the construction of M3 (XAX ∗ , XBX ∗ , XCX ∗ ). By repeated use of the transformer inequality (f), we see that XAn X ∗ ≤ A0n and the limit n → ∞ gives our statement. To prove the continuity it is essential to check that lim M3 (A − εI, B − εI, C − εI) = M3 (A, B, C) = lim M3 (A + εI, B + εI, C + εI). ε→+0 ε→+0 Both limits exists by monotonicity and M3 (A, B, C) must be in between. The different limits would lead to a contradiction. If kA − A0 k, kB − B 0 k, kC − C 0 k ≤ ε, then A − εI ≤ A0 ≤ A + εI and similarly B − εI ≤ B 0 ≤ B + εI, C − εI ≤ C 0 ≤ C + εI. By monotonicity M3 (A − εI, B − εI, C − εI) ≤ M3 (A0 , B 0 , C 0 ) ≤ M3 (A + εI, B + εI, C + εI). Taking limit ε → 0 continuity follows. In case of the geometric mean, the transformer inequality holds without the restriction on ordering. This mean has several other properties as shown in [1]. The next theorem is straightforward from the construction of three-variable-means. Theorem 4 An inequality between the two-variable-means M2 and M20 is inherited by the corresponding three-variable-means M3 and M30 . In particular, this theorem applies to the inequality between the arithmetic, logarithmic and geometric means. 5 Discussion Given a positive operator mean M2 , formula (7) defines a corresponding three-variablemean which certainly makes sense for ordered triplets. The approach extends to more variables in an obvious way. In this case we start with an ordered n-tuple, we form the (n − 1)-variable-mean of all possible subsets, we obtain an ordered n-tuple and continue the procedure. The limit gives the n-variable-mean (provided that the (n − 1)-variable-mean is already at our disposal). The transformer inequality can be proved as in Theorem 3. Restricting the discussion to three variables, we observed that ordering the triplet of the matrices is not necessary for the construction of some means. Computer simulations (programmed in “Maple”) show that recursion for the logarithmic mean seems to be convergent without the ordering condition on the three input matrices. We do not have a proof for this. 9 12 10 8 6 4 2 0 5 10 15 20 Iteration for the logarithmic mean The 3 × 3 matrices A1 , B1 and C1 are chosen randomly and the symmetrization procedure is carried out in 20 steps. The maximum of the diameter D(An , Bn , Cn ) of 100 tests is shown vertically and the number of iterations, from 1 to 20, is on the horizontal axis. It seems that there are more 3-variable-means without the ordering constraint than the known arithmetic, geometric and harmonic means. (Computer simulation has been carried out for some other means as well.) An operator monotone function f associated with a matrix mean has the property f (1) = 1 and f 0 (1) = 1/2. The latter formula follows from (ii)0 . From the power series expansion around 1, one can deduce that there are some positive numbers ε and δ such that kA1/2 f (A−1/2 B1 A−1/2 )A1/2 − A1/2 f (A−1/2 B2 A−1/2 )A1/2 k ≤ (1 − δ)kB1 − B2 k whenever I − ε ≤ A, B1 , B2 ≤ I + ε. This estimate equivalently means kMf (A, B1 ) − Mf (A, B2 )k ≤ (1 − δ)kB1 − B2 k , where k · k denotes the operator norm. If the diameter of a triplet (A, B, C) is defined as D(A, B, C) := max{kA − Bk, kB − Ck, kA − Ck} , then we have D(An+1 , Bn+1 , Cn+1 ) ≤ (1 − δ)D(An , Bn , Cn ), 10 (21) provided that the condition I − ε ≤ A1 , B1 , C1 ≤ I + ε holds. Therefore in a small neighborhood of the identity the symmetrization procedure converges exponentially fast for any triplet of matrices and for any matrix mean. Acknowledgement. The authors thank T. Ando, M. Pálfia, J. Réffy, and R. Szentpéteri for useful conversations on the subject and for some remarks, and to the referee for information on [3, 8]. References [1] T. Ando, C-K. Li and R. Mathias, Geometric means, Linear Algebra Appl. 385(2004), 305–334. [2] R. Bhatia, Matrix Analysis, Springer-Verlag, New York, 1996. [3] R. Bhatia and J. Holbrook, Geometry and means, preprint, 2005. [4] B. C. Carlson, The logarithmic mean, Amer. Math. Monthly 79, 615-618 (1972). [5] F. Hiai and H. Kosaki, Means of Hilbert space operators, Lecture Notes In Maths. 1820, Springer, 2003. [6] F. Kubo, T. Ando, Means of positive linear operators, Math. Ann. 246(1980), 205– 224. [7] P.W. Michor, D. Petz and A. Andai, On the curvature of a certain Riemannian space of matrices, Infin. Dimens. Anal. Quantum Probab. Relat. Top. 3(2000), 199–212. [8] M. Moakher, A differential geometric approach to the geometric mean of symmetric positive definite matrices, SIAM J. Matrix Anal. Appl. 26(2005), 735–747. [9] S. Mustonen, Logarithmic mean for several arguments, http://www.survo.fi/ papers/logmean.pdf [10] E. Neuman, The weighted logarithmic mean, J. Math. Anal. Appl. 188(1994), 885– 900. [11] D. Petz, Monotone metrics on matrix spaces, Linear Algebra Appl. 244(1996), 81– 96. [12] A.O. Pittenger, The logarithmic mean in n variables, Amer. Math. Monthly 92(1985), 99–104. [13] M. K. Vamanamurthy and M. Vuorinen, Inequalities for means, J. Math. Anal. Appl., 183(1994), 155–166. 11