MINIMAX THEOREMS ON HERMITIAN MATRICES

MINH KHA
There are some interesting inequalities involving eigenvalues of Hermitian matrices. Most of these I simply summarize (with a few small comments of my own) from the book of Terence Tao, "Topics in random matrix theory" [1].
Minimax theorems
A beautiful theorem of Courant and Fischer gives an alternative characterization of the eigenvalues of a Hermitian matrix:
Theorem 0.1. (Courant-Fischer theorem)
Let $A$ be a Hermitian $n \times n$ matrix, and let $\lambda_1(A) \ge \dots \ge \lambda_n(A)$ be its eigenvalues. Then
$$\lambda_i(A) = \sup_{\dim V = i} \; \inf_{v \in V, \|v\| = 1} v^* A v = \inf_{\dim V = n+1-i} \; \sup_{v \in V, \|v\| = 1} v^* A v.$$
The proof is a short and pleasant application of the spectral theorem for Hermitian matrices; for instance, the case $i = 1$ recovers the familiar formula $\lambda_1(A) = \sup_{\|v\| = 1} v^* A v$.
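To see the formula in action, here is a minimal numerical sketch in Python with NumPy (this illustration is mine, not from [1]). It checks that for $V = \mathrm{span}\{e_1, \dots, e_i\}$, the span of the top $i$ eigenvectors, the inner infimum $\inf_{v \in V, \|v\| = 1} v^* A v$ equals $\lambda_i(A)$: restricted to $V$, the Rayleigh quotient is minimized at the smallest eigenvalue of the compression $Q^* A Q$, where the columns of $Q$ form an orthonormal basis of $V$.

import numpy as np

# Sketch: the subspace spanned by the top i eigenvectors attains the
# supremum in the Courant-Fischer formula; the inner infimum over unit
# vectors in that subspace is lambda_i(A).
rng = np.random.default_rng(0)
n = 6
B = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
A = (B + B.conj().T) / 2                      # a random Hermitian matrix

lam, E = np.linalg.eigh(A)                    # eigh sorts eigenvalues ascending
lam, E = lam[::-1], E[:, ::-1]                # reorder so lam[0] >= ... >= lam[n-1]

for i in range(1, n + 1):
    Q = E[:, :i]                              # orthonormal basis of V, dim V = i
    inner_inf = np.linalg.eigvalsh(Q.conj().T @ A @ Q).min()
    assert np.isclose(inner_inf, lam[i - 1])  # equals lambda_i(A)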
There is a generalization, called the Wielandt minimax formula:
Theorem 0.2. For each collection $1 \le i_1 < i_2 < \dots < i_k \le n$, we define a partial flag to be a nested collection of subspaces $V_1 \subset V_2 \subset \dots \subset V_k \subset \mathbb{C}^n$ such that $\dim V_j = i_j$. Define the associated Schubert variety $X(V_1, \dots, V_k)$ to be the collection of all $k$-dimensional subspaces $W$ such that $\dim(W \cap V_j) \ge j$. Then
$$\lambda_{i_1}(A) + \lambda_{i_2}(A) + \dots + \lambda_{i_k}(A) = \sup_{V_1, \dots, V_k} \; \inf_{W \in X(V_1, \dots, V_k)} \mathrm{tr}(A|_W)$$
for every $n \times n$ Hermitian matrix $A$. By duality, we have
$$\lambda_{i_1}(A) + \lambda_{i_2}(A) + \dots + \lambda_{i_k}(A) = \inf_{V_1, \dots, V_k} \; \sup_{W \in X(V_1, \dots, V_k)} \mathrm{tr}(A|_W)$$
if instead $\dim V_j = n - i_j + 1$ and $V_k \subset \dots \subset V_1$.
Here the partial trace is defined by $\mathrm{tr}(A|_W) = \sum_{i=1}^m v_i^* A v_i$, where $v_1, \dots, v_m$ is an orthonormal basis of the $m$-dimensional subspace $W \subset \mathbb{C}^n$. It is not difficult to see that this definition is independent of the choice of orthonormal basis of $W$. Note that this trace is defined similarly for trace-class operators on a separable Hilbert space: the Schatten class $S^1$ (or noncommutative $L^1$ space).
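Here is a quick numerical check of this basis-independence (again a Python/NumPy sketch of mine): if the columns of $Q$ form an orthonormal basis of $W$, then $\mathrm{tr}(A|_W) = \mathrm{tr}(Q^* A Q)$, and replacing $Q$ by $QU$ for a unitary $U$ leaves this unchanged by cyclicity of the trace.

import numpy as np

# Sketch: tr(A|_W) does not depend on the orthonormal basis chosen for W.
rng = np.random.default_rng(1)
n, k = 7, 3
B = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
A = (B + B.conj().T) / 2                       # a random Hermitian matrix

M = rng.standard_normal((n, k)) + 1j * rng.standard_normal((n, k))
Q, _ = np.linalg.qr(M)                         # orthonormal basis of a random W
U, _ = np.linalg.qr(rng.standard_normal((k, k))
                    + 1j * rng.standard_normal((k, k)))  # a random k x k unitary
Q2 = Q @ U                                     # another orthonormal basis of W

t1 = np.trace(Q.conj().T @ A @ Q).real         # partial trace in basis Q
t2 = np.trace(Q2.conj().T @ A @ Q2).real       # partial trace in basis Q2
assert np.isclose(t1, t2)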
The proof of this one involves some combinatorial and dimension-counting arguments; we sketch a few of the ideas used in it.
Sketch: The key in this proof is to construct a suitable vector space $W$ from Claim 2 below. Note that since the sequence $\lambda_i(A)$ is decreasing, the sum $\lambda_{i_1}(A) + \dots + \lambda_{i_k}(A)$ can be approximated from below (with "$\ge$") by a kind of "Riemann sum"
$$\sum_{j=1}^k \Big( \sum_{i=i_j}^n \lambda_i(A) \, \gamma_{i,j} \Big) \sim \sum_{j=1}^k \gamma_j^* A \gamma_j$$
for suitable weight vectors $\gamma_j = (\gamma_{i_j, j}, \dots, \gamma_{n, j})$. This suggests looking for a subspace $W$ generated by $k$ orthonormal vectors taken from the subspaces
spanned by $\{e_n, \dots, e_{i_j}\}$. Since we would like the dimensions of $W \cap V_j$ to be large enough ($\ge j$) as $j$ increases, $W$ should be chosen so that it contains "enough" linearly independent vectors $v_i \in V_i$. This is the main idea of the following two claims.
Proof. Here are some useful claims which are of independent interest:
Claim 1: Given a nest of subspaces $V_k \subset \dots \subset V_1$ such that $\dim V_i \ge k - i + 1$ for every $i = 1, \dots, k$, suppose that there is an orthonormal family of vectors $w_i \in V_i$ for $i = 1, \dots, k - 1$, and denote by $U$ the subspace spanned by these $w_i$. Then we can always extend and redefine this family in such a way that there exists a vector $u \in V_1 \ominus U$ (the orthogonal complement of $U$ in $V_1$) so that $U \oplus \mathbb{C}u$ is spanned by an orthonormal basis $h_1, \dots, h_k$ where each $h_i \in V_i$.
Proof of Claim 1: We induct on $k$. The base case $k = 2$ is clear. Suppose now that the claim holds for $k - 1$.
Let $S = \mathrm{span}\{w_2, \dots, w_{k-1}\}$. To "find $u$", we apply the induction hypothesis to the nest of $k - 1$ subspaces $V_k \subset \dots \subset V_2$; clearly, the restriction on the dimensions of these $V_i$ is still satisfied in the case $k - 1$. Thus we can find a normalized vector $v \in V_2 \ominus S$ such that $S \oplus \mathbb{C}v$ is spanned by an orthonormal basis $h_2, \dots, h_k$ where each $h_i \in V_i$ for $i = 2, \dots, k$. Now $v$ may or may not lie in $U$.
In the first case, if $v \in U$, then since $\dim U = \dim S + 1$ we must have $U = S \oplus \mathbb{C}v$, or equivalently $U = \mathrm{span}\{h_2, \dots, h_k\}$. Thus it suffices to choose any normalized vector $u \in V_1 \ominus U$, and our claim is proved in this case; such a $u$ exists since $\dim V_1 \ge k = \dim U + 1$.
In the second case, we redefine $v$ to be (the normalization of) its projection onto the orthogonal complement $U^\perp$ of $U$ in $V_1$. One checks that $w_1$ and all the $h_i$ are then orthogonal to each other, so we can take $u = v$, and $U \oplus \mathbb{C}v = \mathbb{C}w_1 \oplus \mathrm{span}\{h_2, \dots, h_k\}$ is the desired space.
Claim 2: Given a partial flag $V_1 \subset \dots \subset V_k$ with $\dim V_j = i_j$, and a nest of vector subspaces $W_k \subset \dots \subset W_1$ with $\dim W_j \ge n - i_j + 1$, we can select two orthonormal families of vectors $v_i \in V_i$ and $w_i \in W_i$ such that they span the same vector subspace $W = \mathrm{span}\{v_1, \dots, v_k\} = \mathrm{span}\{w_1, \dots, w_k\}$.
Proof of Claim 2: We again use induction on $k$. The case $k = 1$ is immediate from the estimate $\dim(A \cap B) \ge \dim A + \dim B - n$, valid for any two subspaces $A, B \subset \mathbb{C}^n$.
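As a small numerical illustration of this dimension estimate (my own sketch, not part of the argument in [1]), one can compute $\dim(A + B)$ as the rank of a matrix whose columns span both subspaces and use $\dim(A \cap B) = \dim A + \dim B - \dim(A + B)$:

import numpy as np

# Sketch: dim(A ∩ B) >= dim A + dim B - n, since
# dim(A ∩ B) = dim A + dim B - dim(A + B) and dim(A + B) <= n.
rng = np.random.default_rng(2)
n, a, b = 8, 5, 6
SA = rng.standard_normal((n, a))               # columns span a 5-dim subspace
SB = rng.standard_normal((n, b))               # columns span a 6-dim subspace

dim_sum = np.linalg.matrix_rank(np.hstack([SA, SB]))  # dim(A + B)
dim_int = a + b - dim_sum                             # dim(A ∩ B)
assert dim_int >= a + b - n                           # here 5 + 6 - 8 = 3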
Assume the claim holds for $k - 1$; we prove it for the case $k$. By the induction hypothesis, we choose two orthonormal families of vectors $v_i \in V_i$ and $w_i \in W_i$ for $i = 1, \dots, k - 1$, and denote their common span by $U$. Define $S_j = W_j \cap V_k$ for $j = 1, \dots, k - 1$, and note that each $w_j \in S_j$.
Then $\dim S_j \ge \dim W_j + \dim V_k - n \ge i_k - i_j + 1 \ge k - j + 1$, and $S_{k-1} \subset \dots \subset S_1$. Applying the first claim, we can find a normalized vector $u \in S_1 \ominus U$ so that $U \oplus \mathbb{C}u$ is spanned by an orthonormal basis $h_1, \dots, h_k$, where $h_i \in S_i \subset W_i$; letting $v_k = u$, the space $W = U \oplus \mathbb{C}v_k$ satisfies our requirements.
Proof of the theorem: Given a Hermitian matrix $A$, let $e_1, \dots, e_n$ be an orthonormal basis of eigenvectors of $A$ such that $Ae_i = \lambda_i(A) e_i$.
Let $W_j = \mathrm{span}\{e_n, \dots, e_{i_j}\}$; then it is clear that the nest $W_j$ satisfies the assumption of Claim 2, since $\dim W_j = n - i_j + 1$. We choose two orthonormal families of vectors $v_i \in V_i$ and $w_i \in W_i$ such that they span the same vector subspace $W = \mathrm{span}\{v_1, \dots, v_k\} = \mathrm{span}\{w_1, \dots, w_k\}$. Then $W \in X(V_1, \dots, V_k)$ (since $v_1, \dots, v_j \in V_j$ for each $j$) and:
$$\mathrm{tr}(A|_W) = \sum_{i=1}^k v_i^* A v_i = \sum_{i=1}^k w_i^* A w_i = \sum_{j=1}^k \sum_{l=i_j}^n |e_l^* w_j|^2 \, e_l^* A e_l \le \sum_{j=1}^k \sum_{l=i_j}^n |e_l^* w_j|^2 \, \lambda_{i_j}(A) = \lambda_{i_1}(A) + \dots + \lambda_{i_k}(A),$$
where the inequality uses $e_l^* A e_l = \lambda_l(A) \le \lambda_{i_j}(A)$ for $l \ge i_j$, and the last equality uses $\sum_{l=i_j}^n |e_l^* w_j|^2 = \|w_j\|^2 = 1$. This proves $\lambda_{i_1}(A) + \lambda_{i_2}(A) + \dots + \lambda_{i_k}(A) \ge \sup_{V_1, \dots, V_k} \inf_{W \in X(V_1, \dots, V_k)} \mathrm{tr}(A|_W)$.
The other direction is much easier: if we take $V_j = \mathrm{span}\{e_1, \dots, e_{i_j}\}$, then for any $W \in X(V_1, \dots, V_k)$ we can recursively choose an orthonormal basis $w_1, \dots, w_k$ of $W$ such that $w_j \in W \cap V_j$ for every $j = 1, \dots, k$. Thus
$$\mathrm{tr}(A|_W) = \sum_{j=1}^k \sum_{l=1}^{i_j} |e_l^* w_j|^2 \, \lambda_l(A) \ge \lambda_{i_1}(A) + \dots + \lambda_{i_k}(A),$$
since $\lambda_l(A) \ge \lambda_{i_j}(A)$ for $l \le i_j$. This completes the proof of our theorem.
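As a final sanity check (another Python/NumPy sketch of mine, not from [1]), one can verify the identity numerically at the extremizer: for $W = \mathrm{span}\{e_{i_1}, \dots, e_{i_k}\}$, which lies in $X(V_1, \dots, V_k)$ for $V_j = \mathrm{span}\{e_1, \dots, e_{i_j}\}$, the partial trace is exactly $\lambda_{i_1}(A) + \dots + \lambda_{i_k}(A)$.

import numpy as np

# Sketch: at the extremizing subspace W = span{e_{i_1}, ..., e_{i_k}},
# the partial trace tr(A|_W) equals lambda_{i_1}(A) + ... + lambda_{i_k}(A).
rng = np.random.default_rng(3)
n = 8
B = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
A = (B + B.conj().T) / 2                       # a random Hermitian matrix

lam, E = np.linalg.eigh(A)
lam, E = lam[::-1], E[:, ::-1]                 # descending eigenvalues

idx = np.array([0, 2, 5])                      # i_1 < i_2 < i_3 (0-based here)
Q = E[:, idx]                                  # orthonormal basis of W
partial_trace = np.trace(Q.conj().T @ A @ Q).real
assert np.isclose(partial_trace, lam[idx].sum())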
References
[1] T. Tao, Topics in random matrix theory, Graduate Studies in Mathematics, vol. 132, American
Mathematical Society, Providence, RI, 2012.
M.K., Department of Mathematics, Texas A&M University, College Station, TX
77843-3368, USA
E-mail address: kha@math.tamu.edu