Means of positive numbers and matrices

advertisement
May 30, 2005
Means of positive numbers and matrices
1
Dedicated to Professor Pál Rózsa
on the occasion of his 80th birthday
Dénes Petz
2
and Róbert Temesi
3
Department for Mathematical Analysis
Budapest University of Technology and Economics
H-1521 Budapest XI., Hungary
It is shown that any two-variable-mean of positive matrices (studied by
Kubo and Ando) can be extended to more variables. The n-variable-mean
Mn (A1 , A2 , . . . , An ) is defined by a symmetrization procedure when the ntuple (A1 , A2 , . . . , An ) is ordered, it is monotone to each variables and satisfies the transformer inequality. This approach is motivated by the paper of
Ando, Li and Mathias on geometric means. Special attention is paid to the
logarithmic mean. It is conjectured that for matrix means the symmetrization procedure converges for all triplets.
1
Introduction
A function M : R+ × R+ → R+ may be called a mean of positive numbers if
(i) M (x, x) = x for every x ∈ R+ .
(ii) M (x, y) = M (y, x) for every x, y ∈ R+ .
(iii) If x < y, then x < M (x, y) < y.
(iv) If x < x0 and y < y 0 , then M (x, y) < M (x0 , y 0 ).
1
The work was partially supported by the Hungarian grant OTKA T032662 and the paper is to be
published in SIAM Matrix Anal. Appl .
2
E-mail: petz@math.bme.hu.
3
E-mail: temesi@math.bme.hu.
1
(v) M (x, y) is continuous.
√
The geometric mean xy, the arithmetic mean (x + y)/2 and the harmonic
mean 2/(x−1 +y −1 ) are the most known examples, sometimes they are called Pythagorean
means. There are many other means as well, such as
1
Aα (x, y) := (xα y 1−α + x1−α y α )
2
and
(0 ≤ α ≤ 1)
(1)
xα + y α 1/α
(−∞ ≤ α ≤ ∞).
(2)
2
In order to have a more treatable class of means, we impose the extra condition of
homogeneity
Bα (x, y) :=
(vi) M (tx, ty) = tM (x, y) (t, x, y ∈ R+ ).
Note that all the above examples are homogeneous means.
A two-variable function M (x, y) satisfying condition (vi) can be reduced to a onevariable function f (x) := M (1, x). Namely, M (x, y) is recovered from f as
y M (x, y) = xf
.
(3)
x
What are the properties of f which imply conditions (i)–(v)? They are as follows.
(i)0 f (1) = 1
(ii)0 tf (t−1 ) = f (t)
(iii)0 f (t) > 1 if t > 1 and f (t) < 1 if 0 < t < 1.
(iv)0 f is monotone increasing.
(v)0 f is continuous.
Then a homogeneous and continuous mean is uniquely described by a function f satisfying the properties (i)0 –(v)0 .
A function f determines a mean for positive matrices (or operators) when f is matrix
monotone. This property is a strengthening of (iv)0 . Then the formula
M (A, B) = A1/2 f (A−1/2 BA−1/2 )A1/2
(4)
replaces (3). In this paper the two-variable matrix means of Kubo and Ando are extended
to n-tuples (A1 , A2 , . . . , An ) of positive matrices such that A1 ≤ A2 ≤ . . . ≤ An . The
latter restriction is not required for every mean, the Pythagorean means work very well
without the ordering. The geometric mean was discussed by Ando, Li and Mathias
[1] and we present an elementary proof of their result. We conjecture that there are
other means such that the more-variable-extension works without the restrictive ordering
condition. For example, the logarithmic mean seems to be in this class. Although we
discuss briefly the means for positive real variables, the emphasis is on the matrix case.
2
2
Means for three variables
The geometric mean, the arithmetic mean and the harmonic mean are extended to three
(and more) variables as
√
xyz,
(x + y + z)/3,
3/(x−1 + y −1 + z −1 ).
Our target is to extend a general mean M2 (x, y) of two variables to three variables.
In order to do this assume that 0 < x ≤ y ≤ z and define a recursion
x1 = x,
xn+1 = M2 (xn , yn ),
y1 = y,
z1 = z
yn+1 = M2 (xn , zn ),
zn+1 = M2 (yn , zn ).
(5)
(6)
Below we refer to this recursion as symmetrization procedure.
One can show by induction that xn ≤ yn ≤ zn , moreover the sequence (xn ) is increasing and (zn ) is decreasing. Therefore, the limits
L := lim xn
n→∞
and
U = lim zn
n→∞
exist. Our next goal is to show that L = U .
Assume that L < U . By the continuity of the mean, yn → M := M2 (L, U ), and
L < M < U . Taking the limit of the relation M2 (yn , zn ) = zn+1 , we obtain
M2 (M, U ) = U.
This contradicts (iii) and L = U is proven.
Given a mean M2 (x, y) of two variables, we define M3 (x, y, z) as the limit limn xn =
limn yn = limn zn in the above recursion (5) and (6) for any x, y, z ∈ R+ . Note that the
existence of the limit does not require the symmetry of the M2 (x, y).
Example 1 For the arithmetic, geometric and harmonic means the above symmetrization yields the customary three-variable-mean from the corresponding two-variable-ones.
Example 2 For the logarithmic mean, different extensions have been proposed in the
literature. An extension can be given by the formula [9]:
L1 (x1 , x2 , . . . , xn ) = (n − 1)
L1 (x2 , x3 , . . . , xn ) − L1 (x1 , x2 , . . . , xn−1 )
log xn − log x1
This gives
L1 (x, y, z) :=
x
y
+
(log x − log y)(log x − log z) (log y − log x)(log y − log z)
z
+
.
(log z − log x)(log z − log y)
3
Note that this logarithmic mean appears in the scalar curvature of a certain Riemannian
metric on matrices [7].
Another possibility is
L2 (x, y, z) :=
(x − y)(x − z)(y − z)
1
,
2 xy(log x − log y) + xz(log z − log x) + yz(log y − log z)
see [12]. Since
Li (x, y, z) = Li L(x, y), L(x, z), L(y, z)
(i = 1, 2)
does not hold, our above symmetrization does not give Li (x, y, z) from L(x, y) (i = 1, 2).
Given a mean M2 of two variables, the corresponding mean M3 of three variables
(constructed above) satisfies the equation
M3 (x, y, z) = M3 M2 (x, y), M2 (x, z), M2 (y, z)
(7)
and has the property M3 (u, u, u) = u. Conversely, when M3 is continuous and has the
above two properties, then it must be the mean obtained from M2 by the symmetrization
procedure. Indeed, (7) implies
M3 (x, y, z) = M3 (xn , yn , zn ).
Since the sequences (xn ), (yn ) and (zn ) converge to some u, the right-hand-side converges to M3 (u, u, u) = u. Therefore, M3 is the mean provided by the symmetrization
procedure.
Example 3 Let
f (x) + f (y)
Mf (x, y) = f
2
be a quasi-arithmetic mean defined by means of a strictly monotone function f . For
these means
f (x) + f (y) + f (z)
−1
M3 (x, y, z) = f
3
−1
as it is easy to check (7). Remember that arithmetic, geometric and harmonic means
belong to this class.
3
Means of two positive matrices
A theory of means of positive operators was developed by Kubo and Ando [6]. The
theory has many interesting applications. Here we do not go into the details concerning
operator means but we confine ourselves to the essentials. Operator means are binary
operations on positive operators acting on a Hilbert space and they satisfy conditions
similar to the above (i)-(v), where the notation A < B means B − A is positive semidefinite and A 6= B:
4
(a) M (A, A) = A for every A,
(b) M (A, B) = M (B, A) for every A and B,
(c) if A < B, then A < M (A, B) < B,
(d) if A < A0 and B < B 0 , then M (A, B) < M (A0 , B 0 ),
(e) M is continuous.
An important further requirement is the transformer inequality:
(f) C M (A, B) C ∗ ≤ M (CAC ∗ , CBC ∗ )
for all operators C. This is a stronger version of condition (vi), since for invertible C,
there is equality in (f).
The key result in the theory is that operator means are in a 1-to-1 correspondence with
operator monotone functions satisfying conditions (i)0 –(v)0 . The correspondence is
given by
M (A, B) = A1/2 f (A−1/2 BA−1/2 )A1/2
(8)
when A is invertible. Note that an operator monotone function is automatically continuous and even smooth. See [2] for the theory of operator monotone functions. For the sake
of simplicity we restrict our discussion to invertible positive matrices. (Generalization
to positive operators is immediate.)
An important example is the geometric mean:
A#B = A1/2 (A−1/2 BA−1/2 )1/2 A1/2
(9)
which has the special property
(λA)#(µB) =
for positive numbers λ and µ.
p
λµ(A#B)
(10)
Example 4 Consider
fα (x) =
2xα+1/2
1 + x2α
(0 ≤ α ≤ 1/2).
Then
1
1
fα (x)−1 = x−α−1/2 + xα−1/2
2
2
is clearly operator monotone decreasing, so fα is operator monotone.
The operator monotone function
f (x) :=
gives the logarithmic mean.
x−1
log x
(11)
5
Example 5 From the arithmetic and geometric means, one can arrive at the logarithmic mean also in the matrix case, cf. [4]. We define an operation W on the pairs of
positive matrices as
W (A, B) := ((A + B)/2, ((A + B)/2)#B).
(12)
By induction it can be shown that the coordinates of W Wn,1 and Wn,2 satisfy conditions
(a) − (f) except (b), so these are not symmetric. Even in this case Wn,1 (A, B) can be
written in the form of (8)
Wn,1 (A, B) = A1/2 fn (A−1/2 BA−1/2 )A1/2 .
fn (x) = Wn,1 (x, 1) which converges to (11) according to [4].
This fact implies that the iteration works also for matrices: the sequence
A1/2 fn (A−1/2 BA−1/2 )A1/2
converges to the logarithmic mean of A and B.
The recursive way of mixing two means as in the previous example can produce new
means, or new operator monotone functions from arbitrary initial operator monotone
data g1 and h1 . Recursively
gn := g1 (gn−1 , hn−1 )
and
hn := h1 (gn−1 , hn−1 )
Example 6 Mixing of the operator monotone functions
g1 (x) =
√
x,
h1 (x) =
2x
x+1
(or geometric and harmonic means) yields the (joint) limit
√
x−1 2 x
.
log x 1 + x
Carrying out the recursion with
x−1
g1 (x) =
log x
√
x−1 2 x
and h1 (x) =
log x 1 + x
we arrive at the operator monotone function
x − 1 2
log x
2
.
1+x
The operator mean corresponding to the function (13) looks a bit unusual.
(13)
In the paper [11] a one-to-one correspondence was established between means and
certain Riemannian metrics on a space of matrices.
6
4
Means for three matrices
The mean of three positive matrices can be obtained by a symmetrization procedure
from the two-variable-mean.
Theorem 1 Let A, B, C ∈ Mn (C) be positive definite matrices and let M2 be an operator
mean. Assume that A ≤ B ≤ C. Set a recursion as
A1 = A,
An+1 = M2 (An , Bn ),
B1 = B,
C1 = C,
Bn+1 = M2 (An , Cn ),
Cn+1 = M2 (Bn , Cn ).
(14)
(15)
Then the limits
M3 (A, B, C) := lim An = lim Bn = lim Cn
n
n
n
(16)
exist.
Proof: By the monotonicity of M2 and mathematical induction, we see that An ≤
Bn ≤ Cn . It follows that the sequence (An ) is increasing and (Cn ) is decreasing. Therefore, the limits
L := lim An
and
U = lim Cn
n→∞
n→∞
exist. We claim that L = U .
Assume that L 6= U . By continuity, Bn → M2 (L, U ) =: M , where L < M < U . Since
M2 (Bn , Cn ) = Cn+1 ,
the limit n → ∞ gives M2 (M, U ) = U , which contradicts M < U .
By the symmetrization procedure any operator mean of two variables has an extension
to those triplets (A, B, C) which can be ordered. In a few cases, the latter restriction
can be skipped.
Example 7 If M2 (A, B) = (A+B)/2 is the arithmetic mean, then the above recursion
can be written explicitly and the convergence is easily verified without the conditions
A ≤ B ≤ C. The limit gives
1
M3 (A, B, C) = (A + B + C).
3
Example 8 The case of the harmonic mean M2 (A, B) = 2/(A−1 + B −1 ) can be
reduced to the arithmetic mean by means of the inverse, and the recursion gives
M3 (A, B, C) =
A−1
3
.
+ B −1 + C −1
7
The following theorem shows this for the geometric mean.
Theorem 2 Let A, B, C ∈ Mn (C) be positive definite matrices. Set a recursion as
A1 = A,
An+1 = An #Bn ,
B1 = B,
C1 = C,
Bn+1 = An #Cn ,
Cn+1 = Bn #Cn .
(17)
(18)
Then the limit
G(A, B, C) := lim An = lim Bn = lim Cn
n
n
n
(19)
exists.
Proof: Choose positive numbers λ and µ such that
A0 := A < B 0 := λB < C 0 := µC.
Start the recursion with these matrices. By Theorem 1 the limits
G(A0 , B 0 , C 0 ) := lim A0n = lim Bn0 = lim Cn0
n
n
n
exist. For the numbers
a := 1,
b := λ
and
c := µ
the recursion provides a convergent sequence (an , bn , cn ) of triplets.
(λµ)1/3 = lim an = lim bn = lim cn .
n
n
n
Since
An = A0n /an , Bn = Bn0 /bn
and
Cn = Cn0 /cn
due to property (10) of the geometric mean, the limits stated in the theorem must exist
and equal G(A0 , B 0 , C 0 )/(λµ)1/3 .
The result of Theorem 2 was obtained in [1] but our proof is different and completely
elementary. The path
γ(t) = A1/2 (A−1/2 BA−1/2 )t A1/2
(0 ≤ t ≤ 1)
(20)
connects A with B. In [3, 8], a Riemannian geometry is defined on the set of positive
definite matrices such that the path (20) is geodesic. In this approach the geometric
mean of two (or three) matrices has a nice geometric interpretation (similarly to the
arithmetic mean in the flat space.)
Theorem 3 The mean M3 (A, B, C) defined in Theorem 1 under the condition A ≤
B ≤ C is a symmetric continuous function of all variables, monotone in all variables
and satisfies the transformer inequality
XM3 (A, B, C)X ∗ ≤ M3 (XAX ∗ , XBX ∗ , XCX ∗ )
for any matrix X.
8
Proof: Symmetry and monotonicity are obvious; we prove the transformer inequality.
Let An , Bn and Cn be the matrices recursively defined in the construction of M3 (A, B, C)
and A0n , Bn0 , Cn0 those defined in the construction of M3 (XAX ∗ , XBX ∗ , XCX ∗ ). By
repeated use of the transformer inequality (f), we see that
XAn X ∗ ≤ A0n
and the limit n → ∞ gives our statement.
To prove the continuity it is essential to check that
lim M3 (A − εI, B − εI, C − εI) = M3 (A, B, C) = lim M3 (A + εI, B + εI, C + εI).
ε→+0
ε→+0
Both limits exists by monotonicity and M3 (A, B, C) must be in between. The different
limits would lead to a contradiction.
If kA − A0 k, kB − B 0 k, kC − C 0 k ≤ ε, then A − εI ≤ A0 ≤ A + εI and similarly
B − εI ≤ B 0 ≤ B + εI, C − εI ≤ C 0 ≤ C + εI. By monotonicity
M3 (A − εI, B − εI, C − εI) ≤ M3 (A0 , B 0 , C 0 ) ≤ M3 (A + εI, B + εI, C + εI).
Taking limit ε → 0 continuity follows.
In case of the geometric mean, the transformer inequality holds without the restriction
on ordering. This mean has several other properties as shown in [1].
The next theorem is straightforward from the construction of three-variable-means.
Theorem 4 An inequality between the two-variable-means M2 and M20 is inherited by
the corresponding three-variable-means M3 and M30 .
In particular, this theorem applies to the inequality between the arithmetic, logarithmic and geometric means.
5
Discussion
Given a positive operator mean M2 , formula (7) defines a corresponding three-variablemean which certainly makes sense for ordered triplets.
The approach extends to more variables in an obvious way. In this case we start with
an ordered n-tuple, we form the (n − 1)-variable-mean of all possible subsets, we obtain
an ordered n-tuple and continue the procedure. The limit gives the n-variable-mean
(provided that the (n − 1)-variable-mean is already at our disposal). The transformer
inequality can be proved as in Theorem 3.
Restricting the discussion to three variables, we observed that ordering the triplet of
the matrices is not necessary for the construction of some means. Computer simulations
(programmed in “Maple”) show that recursion for the logarithmic mean seems to be
convergent without the ordering condition on the three input matrices. We do not have
a proof for this.
9
12
10
8
6
4
2
0
5
10
15
20
Iteration for the logarithmic mean
The 3 × 3 matrices A1 , B1 and C1 are chosen randomly and the symmetrization procedure is carried out in 20 steps. The maximum of the diameter
D(An , Bn , Cn ) of 100 tests is shown vertically and the number of iterations,
from 1 to 20, is on the horizontal axis.
It seems that there are more 3-variable-means without the ordering constraint than
the known arithmetic, geometric and harmonic means. (Computer simulation has been
carried out for some other means as well.)
An operator monotone function f associated with a matrix mean has the property
f (1) = 1 and f 0 (1) = 1/2. The latter formula follows from (ii)0 . From the power series
expansion around 1, one can deduce that there are some positive numbers ε and δ such
that
kA1/2 f (A−1/2 B1 A−1/2 )A1/2 − A1/2 f (A−1/2 B2 A−1/2 )A1/2 k ≤ (1 − δ)kB1 − B2 k
whenever I − ε ≤ A, B1 , B2 ≤ I + ε. This estimate equivalently means
kMf (A, B1 ) − Mf (A, B2 )k ≤ (1 − δ)kB1 − B2 k ,
where k · k denotes the operator norm.
If the diameter of a triplet (A, B, C) is defined as
D(A, B, C) := max{kA − Bk, kB − Ck, kA − Ck} ,
then we have
D(An+1 , Bn+1 , Cn+1 ) ≤ (1 − δ)D(An , Bn , Cn ),
10
(21)
provided that the condition I − ε ≤ A1 , B1 , C1 ≤ I + ε holds. Therefore in a small
neighborhood of the identity the symmetrization procedure converges exponentially fast
for any triplet of matrices and for any matrix mean.
Acknowledgement. The authors thank T. Ando, M. Pálfia, J. Réffy, and R.
Szentpéteri for useful conversations on the subject and for some remarks, and to the
referee for information on [3, 8].
References
[1] T. Ando, C-K. Li and R. Mathias, Geometric means, Linear Algebra Appl.
385(2004), 305–334.
[2] R. Bhatia, Matrix Analysis, Springer-Verlag, New York, 1996.
[3] R. Bhatia and J. Holbrook, Geometry and means, preprint, 2005.
[4] B. C. Carlson, The logarithmic mean, Amer. Math. Monthly 79, 615-618 (1972).
[5] F. Hiai and H. Kosaki, Means of Hilbert space operators, Lecture Notes In Maths.
1820, Springer, 2003.
[6] F. Kubo, T. Ando, Means of positive linear operators, Math. Ann. 246(1980), 205–
224.
[7] P.W. Michor, D. Petz and A. Andai, On the curvature of a certain Riemannian space
of matrices, Infin. Dimens. Anal. Quantum Probab. Relat. Top. 3(2000), 199–212.
[8] M. Moakher, A differential geometric approach to the geometric mean of symmetric
positive definite matrices, SIAM J. Matrix Anal. Appl. 26(2005), 735–747.
[9] S. Mustonen, Logarithmic mean for several arguments, http://www.survo.fi/ papers/logmean.pdf
[10] E. Neuman, The weighted logarithmic mean, J. Math. Anal. Appl. 188(1994), 885–
900.
[11] D. Petz, Monotone metrics on matrix spaces, Linear Algebra Appl. 244(1996), 81–
96.
[12] A.O. Pittenger, The logarithmic mean in n variables, Amer. Math. Monthly
92(1985), 99–104.
[13] M. K. Vamanamurthy and M. Vuorinen, Inequalities for means, J. Math. Anal.
Appl., 183(1994), 155–166.
11
Download