Relative Perturbation Theory:
(II) Eigenspace and Singular Subspace Variations
Ren-Cang Li
Mathematical Science Section
Oak Ridge National Laboratory
P.O. Box 2008, Bldg 6012
Oak Ridge, TN 37831-6367
(li@msr.epm.ornl.gov)
LAPACK Working Note #85 (first published July, 1994; revised January, 1996)
Abstract

The classical perturbation theory for Hermitian matrix eigenvalue and singular value problems provides bounds on invariant subspace variations that are proportional to the reciprocals of absolute gaps between subsets of spectra or subsets of singular values. These bounds may be bad news for invariant subspaces corresponding to clustered eigenvalues or clustered singular values of much smaller magnitude than the norms of the matrices under consideration, even when those clusters are perfectly well separated from the rest in a relative sense. In this paper, we consider how the eigenspaces of a Hermitian matrix $A$ change when it is perturbed to $\widetilde A = D^*AD$, and how the singular subspaces of a (nonsquare) matrix $B$ change when it is perturbed to $\widetilde B = D_1^*BD_2$, where $D$, $D_1$ and $D_2$ are assumed to be close to identity matrices of suitable dimensions, or either $D_1$ or $D_2$ is close to some unitary matrix. It is proved that under perturbations of these kinds, the changes of invariant subspaces are proportional to the reciprocals of relative gaps between subsets of spectra or subsets of singular values. We have thus extended the well-known Davis-Kahan $\sin\theta$ theorems and Wedin $\sin\theta$ theorems. As applications, we obtain bounds for perturbations of graded matrices.
This material is based in part upon work supported, during January 1992 to August 1995, by Argonne National Laboratory under grant No. 20552402 and by the University of Tennessee through the Advanced Research Projects Agency under contract No. DAAL03-91-C-0047, by the National Science Foundation under grant No. ASC-9005933, and by the National Science Infrastructure grants No. CDA-8722788 and CDA-9401156, and supported, since August 1995, by a Householder Fellowship in Scientific Computing at Oak Ridge National Laboratory, supported by the Applied Mathematical Sciences Research Program, Office of Energy Research, United States Department of Energy contract DE-AC05-96OR22464 with Lockheed Martin Energy Research Corp. Part of this work was done during the summer of 1994 while the author was at the Department of Mathematics, University of California at Berkeley.
1 Introduction
Let $A$ and $\widetilde A$ be two $n\times n$ Hermitian matrices with eigendecompositions

$$A = U\Lambda U^* = (U_1,U_2)\begin{pmatrix}\Lambda_1&\\&\Lambda_2\end{pmatrix}\begin{pmatrix}U_1^*\\U_2^*\end{pmatrix}, \qquad (1.1)$$

$$\widetilde A = \widetilde U\widetilde\Lambda\widetilde U^* = (\widetilde U_1,\widetilde U_2)\begin{pmatrix}\widetilde\Lambda_1&\\&\widetilde\Lambda_2\end{pmatrix}\begin{pmatrix}\widetilde U_1^*\\\widetilde U_2^*\end{pmatrix}, \qquad (1.2)$$

where $U,\widetilde U\in\mathbb{U}_n$, $U_1,\widetilde U_1\in\mathbb{C}^{n\times k}$ ($1\le k<n$), and

$$\Lambda_1 = \mathrm{diag}(\lambda_1,\ldots,\lambda_k), \qquad \Lambda_2 = \mathrm{diag}(\lambda_{k+1},\ldots,\lambda_n), \qquad (1.3)$$

$$\widetilde\Lambda_1 = \mathrm{diag}(\widetilde\lambda_1,\ldots,\widetilde\lambda_k), \qquad \widetilde\Lambda_2 = \mathrm{diag}(\widetilde\lambda_{k+1},\ldots,\widetilde\lambda_n). \qquad (1.4)$$
Suppose now that $A$ and $\widetilde A$ are close. The question is: how close are the eigenspaces spanned by $U_i$ and $\widetilde U_i$? This question has been well answered by four celebrated theorems, the so-called $\sin\theta$, $\tan\theta$, $\sin2\theta$ and $\tan2\theta$ theorems, due to Davis and Kahan [2, 1970] for arbitrary additive perturbations, in the sense that the perturbation to $A$ can be arbitrary as long as $\widetilde A - A$ is kept small. It is proved there that the changes of invariant subspaces are proportional to the reciprocals of absolute gaps between subsets of spectra. This paper, however, addresses the same question under multiplicative perturbations: how close are the eigenspaces spanned by $U_i$ and $\widetilde U_i$ under the assumption that $\widetilde A = D^*AD$ for some $D$ close to $I$? Our bounds show that the changes of invariant subspaces are then proportional to the reciprocals of relative gaps between subsets of spectra. A similar question for singular value decompositions will be answered also. To be specific, we will deal with perturbations of the following kinds:
Eigenvalue problems:

1. $A$ and $\widetilde A = D^*AD$ for the Hermitian case, where $D$ is nonsingular and close to the identity matrix.

2. $A = S^*HS$ and $\widetilde A = S^*\widetilde HS$ for the graded nonnegative Hermitian case, where it is assumed that $H$ and $\widetilde H$ are nonsingular and often that $S$ is a highly graded diagonal matrix (this assumption is not necessary to our theorems).

Singular value problems:

1. $B$ and $\widetilde B = D_1^*BD_2$, where $D_1$ and $D_2$ are nonsingular and close to identity matrices, or one of them is close to an identity matrix and the other to some unitary matrix.

2. $B = GS$ and $\widetilde B = \widetilde GS$ for the graded case, where it is assumed that $G$ and $\widetilde G$ are nonsingular and often that $S$ is a highly graded diagonal matrix (this assumption is not necessary to our theorems).
These perturbations cover component-wise relative perturbations to entries of symmetric tridiagonal matrices with zero diagonal [4, 9], to entries of bidiagonal and biacyclic matrices [1, 3, 4], and perturbations in graded nonnegative Hermitian matrices [5, 12], in graded matrices of singular value problems [5, 12], and more [6]. Recently, Eisenstat and Ipsen [7, 1994] attacked the above-mentioned perturbations, except the graded cases. We will give a brief comparison between their results and ours.
This paper is organized as follows. We briefly review the Davis-Kahan $\sin\theta$ theorems for Hermitian matrices and their generalizations, the Wedin $\sin\theta$ theorems for singular value decompositions, in §3. We present in §4.1 our $\sin\theta$ theorems for eigenvalue problems for $A$ and $\widetilde A = D^*AD$ and for graded nonnegative Hermitian matrices. Theorems for singular value problems for $B$ and $\widetilde B = D_1^*BD_2$ and for graded matrices are given in §4.2. We discuss how to bound relative gaps from below, for example the gap between $\Lambda_1$ and $\widetilde\Lambda_2$ by the relative gap between $\Lambda_1$ and $\Lambda_2$, in §5. A word is said in §6 regarding Eisenstat-Ipsen's theorems in comparison with ours. Detailed proofs are postponed to §§7, 8, 9, and 10. Finally, in §11, we present our conclusions and outline further possible extensions to diagonalizable matrices.
2 Preliminaries
Throughout this paper, we follow the notation used in the first part of this series, Li [11]. Most frequently used are the two kinds of relative distances $\varrho_p$ and $\chi$, defined for $\alpha,\widetilde\alpha\in\mathbb{C}$ by

$$\varrho_p(\alpha,\widetilde\alpha) = \frac{|\alpha-\widetilde\alpha|}{\sqrt[p]{|\alpha|^p+|\widetilde\alpha|^p}} \quad\text{for } 1\le p\le\infty, \qquad \chi(\alpha,\widetilde\alpha) = \frac{|\alpha-\widetilde\alpha|}{\sqrt{|\alpha|}\,\sqrt{|\widetilde\alpha|}},$$

with the convention $0/0=0$ for convenience.

Lemma 2.1 (Li)
1. For $\alpha,\beta\in\mathbb{C}$, $\varrho_p(\alpha,\beta) \le 2^{-1/p}\chi(\alpha,\beta)$.
2. For $\alpha,\beta\in\mathbb{R}$ and $\alpha\beta\ge 0$, $\varrho_p(\alpha,\beta)\le\varrho_p(\alpha^2,\beta^2)$ and $2\chi(\alpha,\beta)\le\chi(\alpha^2,\beta^2)$.
3. $\varrho_p$ is a metric on $\mathbb{R}$.
4. For $\alpha,\beta,\omega\ge 0$, $\chi(\alpha,\beta)\le\chi(\alpha,\omega)+\chi(\omega,\beta)+\frac18\,\chi(\alpha,\beta)\,\chi(\alpha,\omega)\,\chi(\omega,\beta)$.
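As a concrete illustration (ours, not part of the original paper), both relative distances are straightforward to compute; the numpy sketch below also spot-checks item 1 of Lemma 2.1 on one pair of numbers:

```python
import numpy as np

def rho_p(a, b, p=2):
    # rho_p(a, b) = |a - b| / (|a|^p + |b|^p)^(1/p), with the convention 0/0 = 0
    num = abs(a - b)
    if num == 0.0:
        return 0.0
    return num / (abs(a)**p + abs(b)**p)**(1.0 / p)

def chi(a, b):
    # chi(a, b) = |a - b| / sqrt(|a| |b|), with the convention 0/0 = 0
    num = abs(a - b)
    if num == 0.0:
        return 0.0
    den = np.sqrt(abs(a)) * np.sqrt(abs(b))
    return num / den if den > 0 else np.inf

# relative distances see a pair of tiny eigenvalues as well separated,
# even though their absolute distance is negligible:
print(rho_p(1e-10, 2e-10))   # about 0.447, although |a - b| = 1e-10
print(chi(1e-10, 2e-10))     # about 0.707
```

The printed values illustrate why relative gaps can be of order one for clusters whose absolute gap is far below the matrix norm.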
Since this paper is concerned with the variation of subspaces, we need metrics to measure the difference between two subspaces. In this we follow Davis and Kahan [2, 1970] and Stewart and Sun [14]. Let $X,\widetilde X\in\mathbb{C}^{n\times k}$ ($n>k$) have full column rank $k$, and define the angle matrix $\Theta(X,\widetilde X)$ between $X$ and $\widetilde X$ as

$$\Theta(X,\widetilde X) \stackrel{\rm def}{=} \arccos\left[\left((X^*X)^{-1/2}\,X^*\widetilde X\,(\widetilde X^*\widetilde X)^{-1}\,\widetilde X^*X\,(X^*X)^{-1/2}\right)^{1/2}\right].$$

The canonical angles between the subspaces $\mathcal X = \mathcal R(X)$ and $\widetilde{\mathcal X} = \mathcal R(\widetilde X)$ are defined to be the singular values of the Hermitian matrix $\Theta(X,\widetilde X)$, where $\mathcal R(X)$ denotes the subspace spanned by the column vectors of $X$. The following lemma is well known. For a proof, the reader is referred to, e.g., Li [10, Lemma 2.1].
Lemma 2.2 Suppose that $(\widetilde X,\widetilde X_1)\in\mathbb{C}^{n\times n}$ is a nonsingular matrix, and

$$(\widetilde X,\widetilde X_1)^{-1} = \begin{pmatrix}\widetilde Y^*\\ \widetilde Y_1^*\end{pmatrix}, \qquad \widetilde Y\in\mathbb{C}^{n\times k}.$$

Then for any unitarily invariant norm $|||\cdot|||$,

$$|||\sin\Theta(X,\widetilde X)||| = |||(\widetilde Y_1^*\widetilde Y_1)^{-1/2}\widetilde Y_1^*X(X^*X)^{-1/2}|||.$$
In this lemma, as well as in many other places in the rest of this paper, we talk about the "same" unitarily invariant norm $|||\cdot|||$ applying to matrices of different dimensions at the same time. Such applications of a unitarily invariant norm are understood in the following sense: first, there is a unitarily invariant norm $|||\cdot|||$ on $\mathbb{C}^{M\times N}$ for sufficiently large integers $M$ and $N$; then for a matrix $X\in\mathbb{C}^{m\times n}$ ($m\le M$ and $n\le N$), $|||X|||$ is defined by appending zero blocks to $X$ to make it $M\times N$ and then taking the unitarily invariant norm of the enlarged matrix.
Taking $X = U_1$ and $\widetilde X = \widetilde U_1$ (see (1.1) and (1.2)), with Lemma 2.2 one has

$$\Theta(U_1,\widetilde U_1) = \arccos\left(U_1^*\widetilde U_1\widetilde U_1^*U_1\right)^{1/2} \quad\text{and}\quad |||\sin\Theta(U_1,\widetilde U_1)||| = |||\widetilde U_2^*U_1|||. \qquad (2.5)$$

For more discussion of angles between subspaces, the reader is referred to Davis and Kahan [2] and Stewart and Sun [14, Chapters I and II].
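Numerically, both sides of (2.5) can be formed directly from matrix factorizations. The following small numpy sketch (ours, for illustration only) checks that the singular values of $\widetilde U_2^*U_1$ are indeed the sines of the canonical angles between the two subspaces:

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 6, 2

# two nearby orthonormal bases, obtained from QR factorizations
U = np.linalg.qr(rng.standard_normal((n, n)))[0]
Ut = np.linalg.qr(U + 1e-3 * rng.standard_normal((n, n)))[0]
U1, Ut1, Ut2 = U[:, :k], Ut[:, :k], Ut[:, k:]

# cosines of the canonical angles: singular values of Ut1^* U1
cosines = np.linalg.svd(Ut1.T @ U1, compute_uv=False)
sines_via_angles = np.sin(np.arccos(np.clip(cosines, -1.0, 1.0)))

# right-hand side of (2.5): singular values of Ut2^* U1
sines_direct = np.linalg.svd(Ut2.T @ U1, compute_uv=False)

print(np.sort(sines_via_angles), np.sort(sines_direct))  # the two lists agree
```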
3 Davis-Kahan $\sin\theta$ Theorems and Wedin $\sin\theta$ Theorems

3.1 Eigenspace Variations
Let $A$ and $\widetilde A$ be two Hermitian matrices whose eigendecompositions are given by (1.1) and (1.2):

$$A = U\Lambda U^* = (U_1,U_2)\begin{pmatrix}\Lambda_1&\\&\Lambda_2\end{pmatrix}\begin{pmatrix}U_1^*\\U_2^*\end{pmatrix}, \qquad (1.1)$$

$$\widetilde A = \widetilde U\widetilde\Lambda\widetilde U^* = (\widetilde U_1,\widetilde U_2)\begin{pmatrix}\widetilde\Lambda_1&\\&\widetilde\Lambda_2\end{pmatrix}\begin{pmatrix}\widetilde U_1^*\\\widetilde U_2^*\end{pmatrix}, \qquad (1.2)$$

where $U,\widetilde U\in\mathbb{U}_n$, $U_1,\widetilde U_1\in\mathbb{C}^{n\times k}$ ($1\le k<n$), and the $\Lambda_i$'s and $\widetilde\Lambda_j$'s are defined as in (1.3) and (1.4). Define

$$R \stackrel{\rm def}{=} \widetilde AU_1 - U_1\Lambda_1 = (\widetilde A - A)U_1. \qquad (3.1)$$
The following two theorems are the matrix versions of two $\sin\theta$ theorems due to Davis and Kahan [2, 1970].

Theorem 3.1 (Davis-Kahan) Let $A$ and $\widetilde A$ be two Hermitian matrices with eigendecompositions (1.1), (1.2), (1.3), and (1.4). If

$$\delta \stackrel{\rm def}{=} \min_{1\le i\le k,\,1\le j\le n-k}|\lambda_i-\widetilde\lambda_{k+j}| > 0,$$

then

$$\|\sin\Theta(U_1,\widetilde U_1)\|_F \le \frac{\|R\|_F}{\delta} = \frac{\|(\widetilde A-A)U_1\|_F}{\delta}. \qquad (3.2)$$
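Theorem 3.1 is easy to verify numerically. Below is a small numpy sketch (ours, not from the paper; the test matrix is hypothetical) that builds a Hermitian $A$, adds a small additive perturbation, and checks (3.2):

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 5, 2

A = np.diag([1.0, 2.0, 10.0, 11.0, 12.0])
E = 1e-3 * rng.standard_normal((n, n))
At = A + (E + E.T) / 2              # Hermitian additive perturbation

lam, U = np.linalg.eigh(A)
lamt, Ut = np.linalg.eigh(At)
U1, Ut2 = U[:, :k], Ut[:, k:]

# absolute gap between the spectrum of Lambda_1 and that of Lambda~_2
delta = np.abs(lam[:k, None] - lamt[None, k:]).min()

lhs = np.linalg.norm(Ut2.T @ U1)               # ||sin Theta(U1, U1~)||_F
rhs = np.linalg.norm((At - A) @ U1) / delta    # ||R||_F / delta
print(lhs, rhs)
```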
In this theorem, the spectra of $\Lambda_1$ and $\widetilde\Lambda_2$ are only required to be disjoint. In the next theorem, they are required, more strongly, to be well separated by intervals.

Theorem 3.2 (Davis-Kahan) Let $A$ and $\widetilde A$ be two Hermitian matrices with eigendecompositions (1.1), (1.2), (1.3), and (1.4). Assume there are an interval $[\alpha,\beta]$ and a $\delta>0$ such that the spectrum of $\Lambda_1$ lies entirely in $[\alpha,\beta]$ while that of $\widetilde\Lambda_2$ lies entirely outside of $(\alpha-\delta,\beta+\delta)$ (or such that the spectrum of $\widetilde\Lambda_2$ lies entirely in $[\alpha,\beta]$ while that of $\Lambda_1$ lies entirely outside of $(\alpha-\delta,\beta+\delta)$). Then for any unitarily invariant norm $|||\cdot|||$,

$$|||\sin\Theta(U_1,\widetilde U_1)||| \le \frac{|||R|||}{\delta} = \frac{|||(\widetilde A-A)U_1|||}{\delta}. \qquad (3.3)$$
3.2 Singular Space Variations

Now we turn to perturbations of singular value decompositions (SVDs). Let $B$ and $\widetilde B$ be two $m\times n$ ($m\ge n$) complex matrices with SVDs

$$B = U\Sigma V^* = (U_1,U_2)\begin{pmatrix}\Sigma_1&0\\0&\Sigma_2\\0&0\end{pmatrix}\begin{pmatrix}V_1^*\\V_2^*\end{pmatrix}, \qquad (3.4)$$

$$\widetilde B = \widetilde U\widetilde\Sigma\widetilde V^* = (\widetilde U_1,\widetilde U_2)\begin{pmatrix}\widetilde\Sigma_1&0\\0&\widetilde\Sigma_2\\0&0\end{pmatrix}\begin{pmatrix}\widetilde V_1^*\\\widetilde V_2^*\end{pmatrix}, \qquad (3.5)$$

where $U,\widetilde U\in\mathbb{U}_m$, $V,\widetilde V\in\mathbb{U}_n$, $U_1,\widetilde U_1\in\mathbb{C}^{m\times k}$, $V_1,\widetilde V_1\in\mathbb{C}^{n\times k}$ ($1\le k<n$), and

$$\Sigma_1 = \mathrm{diag}(\sigma_1,\ldots,\sigma_k), \qquad \Sigma_2 = \mathrm{diag}(\sigma_{k+1},\ldots,\sigma_n), \qquad (3.6)$$

$$\widetilde\Sigma_1 = \mathrm{diag}(\widetilde\sigma_1,\ldots,\widetilde\sigma_k), \qquad \widetilde\Sigma_2 = \mathrm{diag}(\widetilde\sigma_{k+1},\ldots,\widetilde\sigma_n). \qquad (3.7)$$

Define residuals

$$R_R \stackrel{\rm def}{=} \widetilde BV_1 - U_1\Sigma_1 = (\widetilde B-B)V_1 \quad\text{and}\quad R_L \stackrel{\rm def}{=} \widetilde B^*U_1 - V_1\Sigma_1 = (\widetilde B-B)^*U_1. \qquad (3.8)$$
The following two theorems are due to Wedin [15, 1972].

Theorem 3.3 (Wedin) Let $B$ and $\widetilde B$ be two $m\times n$ ($m\ge n$) complex matrices with SVDs (3.4), (3.5), (3.6), and (3.7). If

$$\delta \stackrel{\rm def}{=} \min\left\{\min_{1\le i\le k,\,1\le j\le n-k}|\sigma_i-\widetilde\sigma_{k+j}|,\ \min_{1\le i\le k}\sigma_i\right\} > 0,$$

then

$$\sqrt{\|\sin\Theta(U_1,\widetilde U_1)\|_F^2+\|\sin\Theta(V_1,\widetilde V_1)\|_F^2} \le \frac{\sqrt{\|R_R\|_F^2+\|R_L\|_F^2}}{\delta} = \frac{\sqrt{\|(\widetilde B-B)V_1\|_F^2+\|(\widetilde B-B)^*U_1\|_F^2}}{\delta}. \qquad (3.9)$$

Theorem 3.4 (Wedin) Let $B$ and $\widetilde B$ be two $m\times n$ ($m\ge n$) complex matrices with SVDs (3.4), (3.5), (3.6), and (3.7). If there exist $\alpha>0$ and $\delta>0$ such that

$$\min_{1\le i\le k}\sigma_i \ge \alpha+\delta \quad\text{and}\quad \max_{1\le j\le n-k}\widetilde\sigma_{k+j} \le \alpha,$$

then for any unitarily invariant norm $|||\cdot|||$,

$$\max\left\{|||\sin\Theta(U_1,\widetilde U_1)|||,\ |||\sin\Theta(V_1,\widetilde V_1)|||\right\} \le \frac{\max\{|||R_R|||,\ |||R_L|||\}}{\delta} = \frac{\max\{|||(\widetilde B-B)V_1|||,\ |||(\widetilde B-B)^*U_1|||\}}{\delta}. \qquad (3.10)$$
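A quick numerical sanity check of Theorem 3.3 (our own sketch, assuming numpy; the test matrix is hypothetical):

```python
import numpy as np

rng = np.random.default_rng(2)
m, n, k = 6, 4, 2

B = np.zeros((m, n))
B[:n, :n] = np.diag([9.0, 8.0, 1.0, 0.5])
Bt = B + 1e-3 * rng.standard_normal((m, n))   # additive perturbation

U, s, Vh = np.linalg.svd(B)
Ut, st, Vth = np.linalg.svd(Bt)
U1, V1 = U[:, :k], Vh.T[:, :k]
Ut2, Vt2 = Ut[:, k:], Vth.T[:, k:]

# delta = min{ min |sigma_i - sigma~_{k+j}|, min sigma_i },  1 <= i <= k
delta = min(np.abs(s[:k, None] - st[None, k:n]).min(), s[:k].min())

lhs = np.sqrt(np.linalg.norm(Ut2.T @ U1)**2 + np.linalg.norm(Vt2.T @ V1)**2)
rhs = np.sqrt(np.linalg.norm((Bt - B) @ V1)**2 +
              np.linalg.norm((Bt - B).T @ U1)**2) / delta
print(lhs, rhs)
```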
4 Relative Perturbation Theorems

4.1 Eigenspace Variations

The following theorem is an extension of Theorem 3.1.

Theorem 4.1 Let $A$ and $\widetilde A = D^*AD$ be two $n\times n$ Hermitian matrices with eigendecompositions (1.1), (1.2), (1.3), and (1.4), where $D$ is nonsingular and close to $I$. If

$$\eta_2 \stackrel{\rm def}{=} \min_{1\le i\le k,\,1\le j\le n-k}\varrho_2(\lambda_i,\widetilde\lambda_{k+j}) > 0,$$

then

$$\|\sin\Theta(U_1,\widetilde U_1)\|_F \le \frac{\sqrt{\|(I-D^{-1})U_1\|_F^2+\|(I-D^*)U_1\|_F^2}}{\eta_2}. \qquad (4.1)$$
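To see the relative bound in action, here is a small numpy sketch (ours, not from the paper; the matrix is hypothetical) with two tiny clustered eigenvalues that are nevertheless well separated from the rest in the relative sense, so that the relative gap in (4.1) is of order one:

```python
import numpy as np

rng = np.random.default_rng(3)
n, k = 5, 2

A = np.diag([1e-8, 2e-8, 1.0, 2.0, 3.0])       # graded: two tiny eigenvalues
D = np.eye(n) + 1e-4 * rng.standard_normal((n, n))
At = D.T @ A @ D                                # At = D^* A D (real case)

lam, U = np.linalg.eigh(A)
lamt, Ut = np.linalg.eigh(At)
U1, Ut2 = U[:, :k], Ut[:, k:]

rho2 = lambda a, b: abs(a - b) / np.hypot(a, b)
eta2 = min(rho2(a, b) for a in lam[:k] for b in lamt[k:])

I = np.eye(n)
lhs = np.linalg.norm(Ut2.T @ U1)
rhs = np.sqrt(np.linalg.norm((I - np.linalg.inv(D)) @ U1)**2 +
              np.linalg.norm((I - D.T) @ U1)**2) / eta2
print(lhs, rhs, eta2)
```

Note that the relative gap stays close to 1 even though the absolute gap inside the tiny cluster is around $10^{-8}$.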
By imposing a stronger condition on the separation between the spectra of $\widetilde\Lambda_2$ and $\Lambda_1$, we have the following bound on any unitarily invariant norm of $\sin\Theta(U_1,\widetilde U_1)$.

Theorem 4.2 Let $A$ and $\widetilde A = D^*AD$ be two $n\times n$ Hermitian matrices with eigendecompositions (1.1), (1.2), (1.3), and (1.4), where $D$ is nonsingular and close to $I$. Assume that there exist $\alpha>0$ and $\delta>0$ such that the spectrum of $\Lambda_1$ lies entirely in $[-\alpha,\alpha]$ while that of $\widetilde\Lambda_2$ lies entirely outside $(-\alpha-\delta,\alpha+\delta)$ (or such that the spectrum of $\Lambda_1$ lies entirely outside $(-\alpha-\delta,\alpha+\delta)$ while that of $\widetilde\Lambda_2$ lies entirely in $[-\alpha,\alpha]$). Then for any unitarily invariant norm $|||\cdot|||$,

$$|||\sin\Theta(U_1,\widetilde U_1)||| \le \frac{\sqrt[q]{|||(I-D^{-1})U_1|||^q+|||(I-D^*)U_1|||^q}}{\eta_p}, \qquad (4.2)$$

where $\eta_p \stackrel{\rm def}{=} \varrho_p(\alpha,\alpha+\delta)$ and $1/p+1/q=1$.
Now we consider eigenspace variations for a graded Hermitian matrix $A = S^*HS\in\mathbb{C}^{n\times n}$ perturbed to $\widetilde A = S^*\widetilde HS$. Set $\Delta H \stackrel{\rm def}{=} \widetilde H - H$.

Theorem 4.3 Let $A = S^*HS$ and $\widetilde A = S^*\widetilde HS$ be two $n\times n$ Hermitian matrices with eigendecompositions (1.1), (1.2), (1.3), and (1.4), where $H$ is positive definite and $\|H^{-1}\|_2\|\Delta H\|_2 < 1$. If

$$\eta \stackrel{\rm def}{=} \min_{1\le i\le k,\,1\le j\le n-k}\chi(\lambda_i,\widetilde\lambda_{k+j}) > 0,$$

then

$$\|\sin\Theta(U_1,\widetilde U_1)\|_F \le \frac{\|D-D^{-1}\|_F}{\eta} \le \frac{1}{\eta}\cdot\frac{\|H^{-1}\|_2}{\sqrt{1-\|H^{-1}\|_2\|\Delta H\|_2}}\,\|\Delta H\|_F, \qquad (4.3)$$

where $D = \left(I+H^{-1/2}(\Delta H)H^{-1/2}\right)^{1/2} = D^*$.
Theorem 4.4 Let $A = S^*HS$ and $\widetilde A = S^*\widetilde HS$ be two $n\times n$ Hermitian matrices with eigendecompositions (1.1), (1.2), (1.3), and (1.4), where $H$ is positive definite and $\|H^{-1}\|_2\|\Delta H\|_2 < 1$. Assume that there exist $\alpha>0$ and $\delta>0$ such that

$$\max_{1\le i\le k}\lambda_i \le \alpha \ \text{ and } \ \min_{1\le j\le n-k}\widetilde\lambda_{k+j} \ge \alpha+\delta, \qquad\text{or}\qquad \min_{1\le i\le k}\lambda_i \ge \alpha+\delta \ \text{ and } \ \max_{1\le j\le n-k}\widetilde\lambda_{k+j} \le \alpha.$$

Then for any unitarily invariant norm $|||\cdot|||$,

$$|||\sin\Theta(U_1,\widetilde U_1)||| \le \frac{|||D-D^{-1}|||}{\eta} \le \frac{1}{\eta}\cdot\frac{\|H^{-1}\|_2}{\sqrt{1-\|H^{-1}\|_2\|\Delta H\|_2}}\,|||\Delta H|||, \qquad (4.4)$$

where $\eta \stackrel{\rm def}{=} \chi(\alpha,\alpha+\delta)$ and $D = \left(I+H^{-1/2}(\Delta H)H^{-1/2}\right)^{1/2} = D^*$.
4.2 Singular Space Variations

The following two theorems concern singular subspace perturbations.

Theorem 4.5 Let $B$ and $\widetilde B = D_1^*BD_2$ be two $m\times n$ ($m\ge n$) (complex) matrices with SVDs (3.4), (3.5), (3.6), and (3.7), where $D_1$ and $D_2$ are nonsingular and close to identities. Let

$$\eta_2 \stackrel{\rm def}{=}\begin{cases}\min\left\{\min\limits_{1\le i\le k,\,1\le j\le n-k}\varrho_2(\sigma_i,\widetilde\sigma_{k+j}),\ \min\limits_{1\le i\le k}\varrho_2(\sigma_i,0)\right\}, & \text{if } m>n;\\[6pt] \min\limits_{1\le i\le k,\,1\le j\le n-k}\varrho_2(\sigma_i,\widetilde\sigma_{k+j}), & \text{otherwise.}\end{cases} \qquad (4.5)$$

If $\eta_2>0$, then

$$\sqrt{\|\sin\Theta(U_1,\widetilde U_1)\|_F^2+\|\sin\Theta(V_1,\widetilde V_1)\|_F^2} \le \frac{\sqrt{\|(I-D_1^*)U_1\|_F^2+\|(I-D_1^{-1})U_1\|_F^2+\|(I-D_2^*)V_1\|_F^2+\|(I-D_2^{-1})V_1\|_F^2}}{\eta_2}. \qquad (4.6)$$
Theorem 4.6 Let $B$ and $\widetilde B = D_1^*BD_2$ be two $m\times n$ ($m\ge n$) (complex) matrices with SVDs (3.4), (3.5), (3.6), and (3.7), where $D_1$ and $D_2$ are nonsingular and close to identities. If there exist $\alpha>0$ and $\delta>0$ such that

$$\min_{1\le i\le k}\sigma_i \ge \alpha+\delta \quad\text{and}\quad \max_{1\le j\le n-k}\widetilde\sigma_{k+j} \le \alpha,$$

then for any unitarily invariant norm $|||\cdot|||$,

$$\max\left\{|||\sin\Theta(U_1,\widetilde U_1)|||,\ |||\sin\Theta(V_1,\widetilde V_1)|||\right\} \le \frac{1}{\eta_p}\max\left\{\sqrt[q]{|||(I-D_2^{-1})V_1|||^q+|||(D_1^*-I)U_1|||^q},\ \sqrt[q]{|||(I-D_1^{-1})U_1|||^q+|||(D_2^*-I)V_1|||^q}\right\}, \qquad (4.7)$$

$$\left|\left|\left|\begin{pmatrix}\sin\Theta(U_1,\widetilde U_1)&\\&\sin\Theta(V_1,\widetilde V_1)\end{pmatrix}\right|\right|\right| \le \frac{1}{\eta_p}\sqrt[q]{\left|\left|\left|\begin{pmatrix}(I-D_1^{-1})U_1&\\&(I-D_2^{-1})V_1\end{pmatrix}\right|\right|\right|^q+\left|\left|\left|\begin{pmatrix}(D_1^*-I)U_1&\\&(D_2^*-I)V_1\end{pmatrix}\right|\right|\right|^q}, \qquad (4.8)$$

where $\eta_p \stackrel{\rm def}{=} \varrho_p(\alpha,\alpha+\delta)$ and $1/p+1/q=1$.
In Theorems 4.5 and 4.6, we assumed that both $D_1$ and $D_2$ are close to identity matrices. Intuitively, however, $D_2$ should not affect $U_1$ much as long as it is close to a unitary matrix; in fact, if $D_2\in\mathbb{U}_n$, it does not affect $U_1$ at all. The following theorems indeed confirm this observation.
Theorem 4.7 Let $B$ and $\widetilde B = D_1^*BD_2$ be two $m\times n$ ($m\ge n$) (complex) matrices with SVDs (3.4), (3.5), (3.6), and (3.7), where $D_1$ and $D_2$ are close to some unitary matrices. Let $\eta_2$ be defined by (4.5), set

$$\eta \stackrel{\rm def}{=}\begin{cases}\min\left\{\min\limits_{1\le i\le k,\,1\le j\le n-k}\chi(\sigma_i,\widetilde\sigma_{k+j}),\ \min\limits_{1\le i\le k}\chi(\sigma_i,0)\right\}, & \text{if } m>n;\\[6pt] \min\limits_{1\le i\le k,\,1\le j\le n-k}\chi(\sigma_i,\widetilde\sigma_{k+j}), & \text{otherwise,}\end{cases} \qquad (4.9)$$

and let $\gamma_1 = \|D_1-D_1^{-*}\|_2$ and $\gamma_2 = \|D_2-D_2^{-*}\|_2$. Assume¹

$$\eta_2 > \frac{1}{2\sqrt2}\max\{\gamma_1,\gamma_2\}. \qquad (4.10)$$

If $D_1$ is close to the identity, then

$$\|\sin\Theta(U_1,\widetilde U_1)\|_F \le \left(1+\frac{\gamma_2^2}{16}\right)\frac{\sqrt{\|(I-D_1^{-1})U_1\|_F^2+\|(I-D_1^*)U_1\|_F^2}}{\eta_2-2^{-3/2}\gamma_2} + \frac12\cdot\frac{\|D_2-D_2^{-*}\|_F}{\eta}, \qquad (4.11)$$

$$\|\sin\Theta(U_1,\widetilde U_1)\|_F \le \frac{\sqrt{\|(I-D_1^{-1})U_1\|_F^2+\|(I-D_1^*)U_1\|_F^2}}{\eta_2-2^{-3/2}\gamma_2} + \frac{\|D_2-D_2^{-*}\|_F}{2^{3/2}\eta_2-\gamma_2}. \qquad (4.12)$$

If $D_2$ is close to the identity, then

$$\|\sin\Theta(V_1,\widetilde V_1)\|_F \le \left(1+\frac{\gamma_1^2}{16}\right)\frac{\sqrt{\|(I-D_2^{-1})V_1\|_F^2+\|(I-D_2^*)V_1\|_F^2}}{\eta_2-2^{-3/2}\gamma_1} + \frac12\cdot\frac{\|D_1-D_1^{-*}\|_F}{\eta}, \qquad (4.13)$$

$$\|\sin\Theta(V_1,\widetilde V_1)\|_F \le \frac{\sqrt{\|(I-D_2^{-1})V_1\|_F^2+\|(I-D_2^*)V_1\|_F^2}}{\eta_2-2^{-3/2}\gamma_1} + \frac{\|D_1-D_1^{-*}\|_F}{2^{3/2}\eta_2-\gamma_1}. \qquad (4.14)$$

¹This implies, by Lemma 2.1, $\eta > \frac12\max\{\|D_1-D_1^{-*}\|_2,\ \|D_2-D_2^{-*}\|_2\}$.

Inequalities (4.11) and (4.12), which differ slightly in their last terms, clearly say that $D_2$ contributes to $\sin\Theta(U_1,\widetilde U_1)$ only through its departure from some unitary matrix, and similarly for (4.13) and (4.14).

Remark. When one of $D_1$ and $D_2$ is $I$, assumption (4.10) can actually be weakened to $\eta_2>0$, as shall be seen from our proofs.
Theorem 4.8 Let $B$ and $\widetilde B = D_1^*BD_2$ be two $m\times n$ ($m\ge n$) (complex) matrices with SVDs (3.4), (3.5), (3.6), and (3.7), where $D_1$ and $D_2$ are close to some unitary matrices, and set $\gamma_1 = \|D_1-D_1^{-*}\|_2$ and $\gamma_2 = \|D_2-D_2^{-*}\|_2$. Suppose that there exist $\alpha>0$ and $\delta>0$ such that

$$\min_{1\le i\le k}\sigma_i \ge \alpha+\delta \quad\text{and}\quad \max_{1\le j\le n-k}\widetilde\sigma_{k+j} \le \alpha.$$

Assume²

$$\varrho_p(\alpha,\alpha+\delta) > \frac{1}{2^{1+1/p}}\max\{\gamma_1,\gamma_2\}. \qquad (4.15)$$

If $D_1$ is close to the identity, then for any unitarily invariant norm $|||\cdot|||$,

$$|||\sin\Theta(U_1,\widetilde U_1)||| \le \left(1+\frac{\gamma_2^2}{16}\right)\frac{\sqrt[q]{|||(I-D_1^{-1})U_1|||^q+|||(I-D_1^*)U_1|||^q}}{\eta_p-2^{-1-1/p}\gamma_2} + \frac12\cdot\frac{|||D_2-D_2^{-*}|||}{\eta}, \qquad (4.16)$$

$$|||\sin\Theta(U_1,\widetilde U_1)||| \le \frac{\sqrt[q]{|||(I-D_1^{-1})U_1|||^q+|||(I-D_1^*)U_1|||^q}}{\eta_p-2^{-1-1/p}\gamma_2} + \frac{|||D_2-D_2^{-*}|||}{2^{1+1/p}\eta_p-\gamma_2}. \qquad (4.17)$$

If $D_2$ is close to the identity, then

$$|||\sin\Theta(V_1,\widetilde V_1)||| \le \left(1+\frac{\gamma_1^2}{16}\right)\frac{\sqrt[q]{|||(I-D_2^{-1})V_1|||^q+|||(I-D_2^*)V_1|||^q}}{\eta_p-2^{-1-1/p}\gamma_1} + \frac12\cdot\frac{|||D_1-D_1^{-*}|||}{\eta}, \qquad (4.18)$$

$$|||\sin\Theta(V_1,\widetilde V_1)||| \le \frac{\sqrt[q]{|||(I-D_2^{-1})V_1|||^q+|||(I-D_2^*)V_1|||^q}}{\eta_p-2^{-1-1/p}\gamma_1} + \frac{|||D_1-D_1^{-*}|||}{2^{1+1/p}\eta_p-\gamma_1}, \qquad (4.19)$$

where $\eta_p \stackrel{\rm def}{=} \varrho_p(\alpha,\alpha+\delta)$, $\eta \stackrel{\rm def}{=} \chi(\alpha,\alpha+\delta)$, and $1/p+1/q=1$.

²This implies, by Lemma 2.1, $\eta > \frac12\max\{\|D_1-D_1^{-*}\|_2,\ \|D_2-D_2^{-*}\|_2\}$.
Remark. If either $D_1$ or $D_2$ is $I$, (4.15) can actually be removed.

Now let us briefly mention a possible application of Theorems 4.5, 4.6, 4.7 and 4.8. It has to do with deflation in computing the singular value systems of a bidiagonal matrix. Taking into account the remarks we have made, we get
Corollary 4.1 Assume $D_1 = I$ and $D_2$ takes the form

$$D_2 = \begin{pmatrix}I & X\\ & I\end{pmatrix},$$

where $X$ is a matrix of suitable dimensions. Let $\eta_2$ be defined by (4.5), and $\eta$ by (4.9). If $\eta_2>0$, then

$$\|\sin\Theta(U_1,\widetilde U_1)\|_F \le \frac{\|X\|_F}{\sqrt2\,\eta}, \qquad (4.20)$$

$$\sqrt{\|\sin\Theta(U_1,\widetilde U_1)\|_F^2+\|\sin\Theta(V_1,\widetilde V_1)\|_F^2} \le \frac{\sqrt2\,\|X\|_F}{\eta_2}. \qquad (4.21)$$

Proof: Inequality (4.21) follows from (4.6), and inequality (4.20) follows from (4.11) and

$$D_2-D_2^{-*} = \begin{pmatrix}&X\\X^*&\end{pmatrix} \ \Longrightarrow\ \|D_2-D_2^{-*}\|_F = \sqrt2\,\|X\|_F.$$
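As an illustration of this deflation bound, here is our own numerical sketch (assuming numpy; the graded test matrix and coupling block are hypothetical), which checks (4.20) for a real bidiagonal-like example:

```python
import numpy as np

rng = np.random.default_rng(4)
n, k = 4, 2

B = np.diag([3.0, 2.0, 1e-6, 1e-7])     # graded: two tiny trailing singular values
X = 1e-4 * rng.standard_normal((k, n - k))
D2 = np.eye(n)
D2[:k, k:] = X
Bt = B @ D2                              # deflation-type perturbation, D1 = I

U, s, _ = np.linalg.svd(B)
Ut, st, _ = np.linalg.svd(Bt)
U1, Ut2 = U[:, :k], Ut[:, k:]

chi = lambda a, b: abs(a - b) / np.sqrt(a * b)
eta = min(chi(a, b) for a in s[:k] for b in st[k:])

lhs = np.linalg.norm(Ut2.T @ U1)                  # ||sin Theta(U1, U1~)||_F
rhs = np.linalg.norm(X) / (np.sqrt(2.0) * eta)    # bound (4.20)
print(lhs, rhs)
```

The left singular subspace barely moves, even though the trailing singular values are tiny, because the relative gap here is huge.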
Corollary 4.2 Assume $D_2 = I$ and $D_1$ takes the form

$$D_1 = \begin{pmatrix}I & X\\ & I\end{pmatrix},$$

where $X$ is a matrix of suitable dimensions. Let $\eta_2$ be defined by (4.5), and $\eta$ by (4.9). If $\eta_2>0$, then

$$\|\sin\Theta(V_1,\widetilde V_1)\|_F \le \frac{\|X\|_F}{\sqrt2\,\eta}, \qquad \sqrt{\|\sin\Theta(U_1,\widetilde U_1)\|_F^2+\|\sin\Theta(V_1,\widetilde V_1)\|_F^2} \le \frac{\sqrt2\,\|X\|_F}{\eta_2}.$$
Corollary 4.3 Assume $D_1 = I$ and $D_2$ takes the form

$$D_2 = \begin{pmatrix}I & X\\ & I\end{pmatrix},$$

where $X$ is a matrix of suitable dimensions. Suppose that there exist $\alpha>0$ and $\delta>0$ such that

$$\min_{1\le i\le k}\sigma_i \ge \alpha+\delta \quad\text{and}\quad \max_{1\le j\le n-k}\widetilde\sigma_{k+j} \le \alpha.$$

Then

$$|||\sin\Theta(U_1,\widetilde U_1)||| \le \frac{1}{2\eta}\left|\left|\left|\begin{pmatrix}&X\\X^*&\end{pmatrix}\right|\right|\right|, \qquad |||\sin\Theta(V_1,\widetilde V_1)||| \le \frac{|||X|||}{\eta_p}, \qquad (4.22)$$

where $\eta_p \stackrel{\rm def}{=} \varrho_p(\alpha,\alpha+\delta)$ and $\eta \stackrel{\rm def}{=} \chi(\alpha,\alpha+\delta)$.

Proof: The first inequality in (4.22) follows from (4.16), and the second from (4.7).
Corollary 4.4 Assume $D_2 = I$ and $D_1$ takes the form

$$D_1 = \begin{pmatrix}I & X\\ & I\end{pmatrix},$$

where $X$ is a matrix of suitable dimensions. Suppose that there exist $\alpha>0$ and $\delta>0$ such that

$$\min_{1\le i\le k}\sigma_i \ge \alpha+\delta \quad\text{and}\quad \max_{1\le j\le n-k}\widetilde\sigma_{k+j} \le \alpha.$$

Then

$$|||\sin\Theta(U_1,\widetilde U_1)||| \le \frac{|||X|||}{\eta_p}, \qquad |||\sin\Theta(V_1,\widetilde V_1)||| \le \frac{1}{2\eta}\left|\left|\left|\begin{pmatrix}&X\\X^*&\end{pmatrix}\right|\right|\right|,$$

where $\eta_p \stackrel{\rm def}{=} \varrho_p(\alpha,\alpha+\delta)$ and $\eta \stackrel{\rm def}{=} \chi(\alpha,\alpha+\delta)$.
Now we consider singular subspace variations for a graded matrix $B = GS\in\mathbb{C}^{n\times n}$ perturbed to $\widetilde B = \widetilde GS\in\mathbb{C}^{n\times n}$, where $G$ is nonsingular. Set $\Delta G \stackrel{\rm def}{=} \widetilde G - G$. If $\|(\Delta G)G^{-1}\|_2 < 1$, then $\widetilde G = G+\Delta G = [I+(\Delta G)G^{-1}]G$ is nonsingular also.
Theorem 4.9 Let $B = GS\in\mathbb{C}^{n\times n}$ and $\widetilde B = \widetilde GS\in\mathbb{C}^{n\times n}$ with SVDs (3.4), (3.5), (3.6), and (3.7), where $G$ is nonsingular. Assume $\|(\Delta G)G^{-1}\|_2<1$. If

$$\eta_2 \stackrel{\rm def}{=} \min_{1\le i\le k,\,1\le j\le n-k}\varrho_2(\sigma_i,\widetilde\sigma_{k+j}) > 0,$$

then

$$\sqrt{\|\sin\Theta(U_1,\widetilde U_1)\|_F^2+\|\sin\Theta(V_1,\widetilde V_1)\|_F^2} \le \frac{\sqrt{\|(\Delta G)G^{-1}U_1\|_F^2+\|[I+G^{-*}(\Delta G)^*]^{-1}G^{-*}(\Delta G)^*U_1\|_F^2}}{\eta_2} \le \frac{\|G^{-1}\|_2}{\eta_2}\sqrt{1+\frac{1}{(1-\|G^{-1}\|_2\|\Delta G\|_2)^2}}\ \|\Delta G\|_F, \qquad (4.23)$$

and

$$\|\sin\Theta(V_1,\widetilde V_1)\|_F \le \frac{\left\|[I+(\Delta G)G^{-1}]-[I+(\Delta G)G^{-1}]^{-*}\right\|_F}{2\eta} \le \frac{1}{2\eta}\left(\|(\Delta G)G^{-1}+G^{-*}(\Delta G)^*\|_F + \frac{\|(\Delta G)G^{-1}\|_2}{1-\|(\Delta G)G^{-1}\|_2}\,\|(\Delta G)G^{-1}\|_F\right) \le \frac{1}{2\eta}\left(\|(\Delta G)G^{-1}+G^{-*}(\Delta G)^*\|_F + \frac{\|G^{-1}\|_2^2\,\|\Delta G\|_2\,\|\Delta G\|_F}{1-\|G^{-1}\|_2\|\Delta G\|_2}\right), \qquad (4.24)$$

where $\eta \stackrel{\rm def}{=} \min_{1\le i\le k,\,1\le j\le n-k}\chi(\sigma_i,\widetilde\sigma_{k+j})$.

Proof: Write $\widetilde B = \widetilde GS = [I+(\Delta G)G^{-1}]GS = D_1^*BD_2$, where $D_1^* = I+(\Delta G)G^{-1}$ and $D_2 = I$. Then apply Theorem 4.5 to get (4.23), and apply Theorem 4.7 to get (4.24).
Theorem 4.10 Let $B = GS\in\mathbb{C}^{n\times n}$ and $\widetilde B = \widetilde GS\in\mathbb{C}^{n\times n}$ with SVDs (3.4), (3.5), (3.6), and (3.7), where $G$ is nonsingular. Assume $\|(\Delta G)G^{-1}\|_2<1$. If there exist $\alpha>0$ and $\delta>0$ such that

$$\min_{1\le i\le k}\sigma_i \ge \alpha+\delta \ \text{ and } \ \max_{1\le j\le n-k}\widetilde\sigma_{k+j} \le \alpha, \qquad\text{or, the other way around,}\qquad \max_{1\le i\le k}\sigma_i \le \alpha \ \text{ and } \ \min_{1\le j\le n-k}\widetilde\sigma_{k+j} \ge \alpha+\delta,$$

then for any unitarily invariant norm $|||\cdot|||$,

$$\max\left\{|||\sin\Theta(U_1,\widetilde U_1)|||,\ |||\sin\Theta(V_1,\widetilde V_1)|||\right\} \le \frac{\max\left\{|||(\Delta G)G^{-1}U_1|||,\ |||[I+G^{-*}(\Delta G)^*]^{-1}G^{-*}(\Delta G)^*U_1|||\right\}}{\eta_p} \le \frac{1}{\eta_p}\cdot\frac{\|G^{-1}\|_2}{1-\|G^{-1}\|_2\|\Delta G\|_2}\,|||\Delta G|||, \qquad (4.25)$$

$$\left|\left|\left|\begin{pmatrix}\sin\Theta(U_1,\widetilde U_1)&\\&\sin\Theta(V_1,\widetilde V_1)\end{pmatrix}\right|\right|\right| \le \frac{\sqrt[q]{|||(\Delta G)G^{-1}U_1|||^q+|||[I+G^{-*}(\Delta G)^*]^{-1}G^{-*}(\Delta G)^*U_1|||^q}}{\eta_p} \le \frac{\|G^{-1}\|_2}{\eta_p}\sqrt[q]{1+\frac{1}{(1-\|G^{-1}\|_2\|\Delta G\|_2)^q}}\ |||\Delta G|||, \qquad (4.26)$$

and

$$|||\sin\Theta(V_1,\widetilde V_1)||| \le \frac{\left|\left|\left|[I+(\Delta G)G^{-1}]-[I+(\Delta G)G^{-1}]^{-*}\right|\right|\right|}{2\eta} \le \frac{1}{2\eta}\left(|||(\Delta G)G^{-1}+G^{-*}(\Delta G)^*||| + \frac{\|(\Delta G)G^{-1}\|_2}{1-\|(\Delta G)G^{-1}\|_2}\,|||(\Delta G)G^{-1}|||\right) \le \frac{1}{2\eta}\left(|||(\Delta G)G^{-1}+G^{-*}(\Delta G)^*||| + \frac{\|G^{-1}\|_2^2\,\|\Delta G\|_2\,|||\Delta G|||}{1-\|G^{-1}\|_2\|\Delta G\|_2}\right), \qquad (4.27)$$

where $\eta_p \stackrel{\rm def}{=} \varrho_p(\alpha,\alpha+\delta)$ and $\eta \stackrel{\rm def}{=} \chi(\alpha,\alpha+\delta)$.

Proof: Again write $\widetilde B = \widetilde GS = [I+(\Delta G)G^{-1}]GS = D_1^*BD_2$, where $D_1^* = I+(\Delta G)G^{-1}$ and $D_2 = I$. Then apply Theorem 4.6 to get (4.25) and (4.26), and apply Theorem 4.8 to get (4.27).

Remark. Inequalities (4.24) and (4.27) may provide much tighter bounds than (4.23) and (4.25), especially when $(\Delta G)G^{-1}$ is very close to a skew-Hermitian matrix.
5 More on Relative Gaps
In the theorems of §4, various relative gaps play an indispensable role. Those gaps are imposed between $\Lambda_1$ and $\widetilde\Lambda_2$ or between $\Sigma_1$ and $\widetilde\Sigma_2$. In some applications, it may be more convenient to have theorems in which only positive relative gaps between $\Lambda_1$ and $\Lambda_2$ or between $\Sigma_1$ and $\Sigma_2$ are assumed. Based on results of Ostrowski [13, 1959] (see also [8, pp. 224-225]), Barlow and Demmel [1, 1990], Demmel and Veselic [5, 1992], Eisenstat and Ipsen [6, 1993], Mathias [12, 1994], and Li [11, 1994], the theorems in §4 can be modified to accommodate this need. In what follows, we list, for each theorem, inequalities bounding the relative gaps between $\Lambda_1$ and $\widetilde\Lambda_2$ or between $\Sigma_1$ and $\widetilde\Sigma_2$ from below by relative gaps between $\Lambda_1$ and $\Lambda_2$ or between $\Sigma_1$ and $\Sigma_2$. The derivations of these inequalities depend on the facts listed in Lemma 2.1.
For Theorem 4.1: $\eta_2 \ge \min\limits_{1\le i\le k,\,1\le j\le n-k}\varrho_2(\lambda_i,\lambda_{k+j}) - \|I-D^*D\|_2$.

For Theorem 4.2: Assume that there exist $\hat\alpha>0$ and $\hat\delta>0$ such that the spectrum of $\Lambda_1$ lies entirely in $[-\hat\alpha,\hat\alpha]$ while that of $\Lambda_2$ lies entirely outside $(-\hat\alpha-\hat\delta,\hat\alpha+\hat\delta)$ (or such that the spectrum of $\Lambda_1$ lies entirely outside $(-\hat\alpha-\hat\delta,\hat\alpha+\hat\delta)$ while that of $\Lambda_2$ lies entirely in $[-\hat\alpha,\hat\alpha]$). If $\varrho_p(\hat\alpha,\hat\alpha+\hat\delta) > \|I-D^*D\|_2$, then there are $\alpha>0$ and $\delta>0$ as the theorem requires, and

$$\eta_p \ge \varrho_p(\hat\alpha,\hat\alpha+\hat\delta) - \|I-D^*D\|_2.$$

For Theorem 4.3: $\eta \ge \min\limits_{1\le i\le k,\,1\le j\le n-k}\chi(\lambda_i,\lambda_{k+j}) - \frac12\|D-D^{-1}\|_2$.

For Theorem 4.4: Assume that there exist $\hat\alpha>0$ and $\hat\delta>0$ such that

$$\max_{1\le i\le k}\lambda_i \le \hat\alpha \ \text{ and } \ \min_{1\le j\le n-k}\lambda_{k+j} \ge \hat\alpha+\hat\delta, \qquad\text{or}\qquad \min_{1\le i\le k}\lambda_i \ge \hat\alpha+\hat\delta \ \text{ and } \ \max_{1\le j\le n-k}\lambda_{k+j} \le \hat\alpha.$$

If $\chi(\hat\alpha,\hat\alpha+\hat\delta) > \frac12\|D-D^{-1}\|_2$, then there are $\alpha>0$ and $\delta>0$ as the theorem requires, and

$$\eta \ge \frac{\chi(\hat\alpha,\hat\alpha+\hat\delta)-\frac12\|D-D^{-1}\|_2}{1+\frac1{16}\chi(\hat\alpha,\hat\alpha+\hat\delta)\,\|D-D^{-1}\|_2}.$$

For Theorem 4.5: $\eta_2 \ge \hat\eta_2 - \varepsilon$, where $\varepsilon = \frac{1}{2\sqrt2}\left(\|D_1-D_1^{-*}\|_2+\|D_2-D_2^{-*}\|_2\right)$ (or $\varepsilon = \max\{|1-\sigma_{\min}(D_1)\sigma_{\min}(D_2)|,\ |1-\sigma_{\max}(D_1)\sigma_{\max}(D_2)|\}$) and

$$\hat\eta_2 \stackrel{\rm def}{=}\begin{cases}\min\left\{\min\limits_{1\le i\le k,\,1\le j\le n-k}\varrho_2(\sigma_i,\sigma_{k+j}),\ \min\limits_{1\le i\le k}\varrho_2(\sigma_i,0)\right\}, & \text{if } m>n;\\[6pt] \min\limits_{1\le i\le k,\,1\le j\le n-k}\varrho_2(\sigma_i,\sigma_{k+j}), & \text{otherwise.}\end{cases} \qquad (5.1)$$

For Theorem 4.6: Assume there exist $\hat\alpha>0$ and $\hat\delta>0$ such that

$$\min_{1\le i\le k}\sigma_i \ge \hat\alpha+\hat\delta \quad\text{and}\quad \max_{1\le j\le n-k}\sigma_{k+j} \le \hat\alpha.$$

If $\varrho_p(\hat\alpha,\hat\alpha+\hat\delta) > \varepsilon$, then there are $\alpha>0$ and $\delta>0$ as the theorem requires, and

$$\eta_p \ge \varrho_p(\hat\alpha,\hat\alpha+\hat\delta) - \varepsilon,$$

where $\varepsilon = \frac{1}{2^{1+1/p}}\left(\|D_1-D_1^{-*}\|_2+\|D_2-D_2^{-*}\|_2\right)$ (or $\varepsilon = \max\{|1-\sigma_{\min}(D_1)\sigma_{\min}(D_2)|,\ |1-\sigma_{\max}(D_1)\sigma_{\max}(D_2)|\}$).

For Theorem 4.7: Let $\hat\eta_2$ be defined by (5.1), and set

$$\hat\eta \stackrel{\rm def}{=}\begin{cases}\min\left\{\min\limits_{1\le i\le k,\,1\le j\le n-k}\chi(\sigma_i,\sigma_{k+j}),\ \min\limits_{1\le i\le k}\chi(\sigma_i,0)\right\}, & \text{if } m>n;\\[6pt] \min\limits_{1\le i\le k,\,1\le j\le n-k}\chi(\sigma_i,\sigma_{k+j}), & \text{otherwise.}\end{cases} \qquad (5.2)$$

If $\hat\eta_2 > \frac{1}{2\sqrt2}\left(\gamma_1+\gamma_2+\max\{\gamma_1,\gamma_2\}\right)$, then

$$\eta_2 \ge \hat\eta_2 - \frac{1}{2\sqrt2}(\gamma_1+\gamma_2) \quad\text{and}\quad \eta \ge \frac{\hat\eta - \frac12(\gamma_1+\gamma_2)\big/\left(1+\frac1{32}\gamma_1\gamma_2\right)}{1+\frac1{16}\,\hat\eta\,(\gamma_1+\gamma_2)\big/\left(1+\frac1{32}\gamma_1\gamma_2\right)}.$$

For Theorem 4.8: Suppose that there exist $\hat\alpha>0$ and $\hat\delta>0$ such that

$$\min_{1\le i\le k}\sigma_i \ge \hat\alpha+\hat\delta \quad\text{and}\quad \max_{1\le j\le n-k}\sigma_{k+j} \le \hat\alpha.$$

If $\varrho_p(\hat\alpha,\hat\alpha+\hat\delta) > \frac{1}{2^{1+1/p}}\left(\gamma_1+\gamma_2+\max\{\gamma_1,\gamma_2\}\right)$, then there are $\alpha>0$ and $\delta>0$ as the theorem requires, and

$$\eta_p \ge \varrho_p(\hat\alpha,\hat\alpha+\hat\delta) - \frac{1}{2^{1+1/p}}(\gamma_1+\gamma_2) \quad\text{and}\quad \eta \ge \frac{\chi(\hat\alpha,\hat\alpha+\hat\delta) - \frac12(\gamma_1+\gamma_2)\big/\left(1+\frac1{32}\gamma_1\gamma_2\right)}{1+\frac1{16}\,\chi(\hat\alpha,\hat\alpha+\hat\delta)\,(\gamma_1+\gamma_2)\big/\left(1+\frac1{32}\gamma_1\gamma_2\right)}.$$

For Theorem 4.9:

$$\eta_2 \ge \hat\eta_2 - \frac{\varepsilon}{2\sqrt2} \quad\text{and}\quad \eta \ge \frac{\hat\eta-\varepsilon/2}{1+\frac1{16}\hat\eta\varepsilon},$$

where $\hat\eta_2 \stackrel{\rm def}{=} \min\limits_{1\le i\le k,\,1\le j\le n-k}\varrho_2(\sigma_i,\sigma_{k+j})$, $\hat\eta \stackrel{\rm def}{=} \min\limits_{1\le i\le k,\,1\le j\le n-k}\chi(\sigma_i,\sigma_{k+j})$, $\varepsilon = \|D-D^{-*}\|_2$, and $D = I+(\Delta G)G^{-1}$.

For Theorem 4.10: Suppose there exist $\hat\alpha>0$ and $\hat\delta>0$ such that

$$\min_{1\le i\le k}\sigma_i \ge \hat\alpha+\hat\delta \ \text{ and } \ \max_{1\le j\le n-k}\sigma_{k+j} \le \hat\alpha, \qquad\text{or, the other way around,}\qquad \max_{1\le i\le k}\sigma_i \le \hat\alpha \ \text{ and } \ \min_{1\le j\le n-k}\sigma_{k+j} \ge \hat\alpha+\hat\delta.$$

If $\varrho_p(\hat\alpha,\hat\alpha+\hat\delta) > \frac{\varepsilon}{2^{1+1/p}}$, then there are $\alpha>0$ and $\delta>0$ as the theorem requires, and

$$\eta_p \ge \varrho_p(\hat\alpha,\hat\alpha+\hat\delta) - \frac{\varepsilon}{2^{1+1/p}} \quad\text{and}\quad \eta \ge \frac{\chi(\hat\alpha,\hat\alpha+\hat\delta)-\varepsilon/2}{1+\frac1{16}\chi(\hat\alpha,\hat\alpha+\hat\delta)\,\varepsilon},$$

where $\varepsilon = \|D-D^{-*}\|_2$ and $D = I+(\Delta G)G^{-1}$.
6 A Word on Eisenstat-Ipsen's Theorems
Eisenstat and Ipsen [7, 1994] have obtained a few bounds on eigenspace variations for $A$ and $\widetilde A = D^*AD$ and on singular subspace variations for $B$ and $\widetilde B = D_1^*BD_2$. Their bounds for subspaces of dimension $k>1$ contain a factor $\sqrt k$ which makes their results less competitive than ours. For this reason we will not compare their bounds for subspaces of dimension $k>1$ with ours.

For the problem studied in Theorem 4.1, Eisenstat and Ipsen [7, 1994] bound the angle $\theta_j$ between $\widetilde u_j$ and $\mathcal R(U_1)$, where $\widetilde u_1,\widetilde u_2,\ldots,\widetilde u_k$ are the columns of $\widetilde U_1$. They showed

$$\sin\theta_j \le \frac{\|I-D^{-*}D^{-1}\|_2}{\delta_j} + \|I-D\|_2, \qquad (6.1)$$

where

$$\delta_j \stackrel{\rm def}{=}\begin{cases}\min\limits_{k+1\le i\le n}\dfrac{|\lambda_i-\widetilde\lambda_j|}{|\widetilde\lambda_j|}, & \text{if }\widetilde\lambda_j\ne0,\\[6pt] \infty, & \text{otherwise.}\end{cases}$$

Inequality (6.1) does provide a nice bound. Unfortunately, it does not bound $\|\sin\Theta(U_1,\widetilde U_1)\|_2$ straightforwardly; in general,

$$\sin\theta_j \le \|\sin\Theta(U_1,\widetilde U_1)\|_2 \quad\text{for } j=1,2,\ldots,k,$$

and all of these inequalities may be strict. In the worst case, $\|\sin\Theta(U_1,\widetilde U_1)\|_2$ could be as large as $\sqrt k\max_{1\le j\le k}\sin\theta_j$. To make a fair comparison to our inequality (4.1), we consider the case $k=1$. One infers from inequality (4.1) that³

$$\sin\theta_1 \le \frac{\sqrt{\|I-D^{-1}\|_2^2+\|I-D\|_2^2}}{\min_{2\le i\le n}\varrho_2(\lambda_i,\widetilde\lambda_1)}. \qquad (6.2)$$
It may look as though (6.1) is potentially sharper, because $\delta_1$ may be much larger than $\min_{2\le i\le n}\varrho_2(\lambda_i,\widetilde\lambda_1)$. But this is not quite true, because of the extra term $\|I-D\|_2$ in (6.1), which stays no matter how large $\delta_1$ is. We present our arguments as follows.

1. These relative perturbation bounds are most likely to be used when the closest eigenvalue $\lambda_\ell$ among all $\{\lambda_i\}_{i=2}^n$ to $\widetilde\lambda_1$ has about the same magnitude as $\widetilde\lambda_1$, i.e., when $|\lambda_\ell|\approx|\widetilde\lambda_1|$ and

$$\delta_1 \approx |\lambda_\ell-\widetilde\lambda_1|/|\widetilde\lambda_1| \approx \sqrt2\,\varrho_2(\lambda_\ell,\widetilde\lambda_1) \ge \sqrt2\min_{2\le i\le n}\varrho_2(\lambda_i,\widetilde\lambda_1).$$

In such a situation, if $I-D$ is very tiny, then

$$\sqrt{\|I-D^{-1}\|_2^2+\|I-D\|_2^2} \le \sqrt2\,\|I-D\|_2 + O(\|I-D\|_2^2), \qquad \|I-D^{-*}D^{-1}\|_2 \le 2\|I-D\|_2 + O(\|I-D\|_2^2).$$

Hence, asymptotically, inequalities (6.2) and (6.1) read, respectively,

$$\sin\theta_1 \lesssim \frac{\sqrt2\,\|I-D\|_2}{\varrho_2(\lambda_\ell,\widetilde\lambda_1)} + O(\|I-D\|_2^2), \qquad \sin\theta_1 \lesssim \frac{\sqrt2\,\|I-D\|_2}{\varrho_2(\lambda_\ell,\widetilde\lambda_1)} + \|I-D\|_2 + O(\|I-D\|_2^2).$$

So our bound (6.2) is asymptotically sharper by the amount $\|I-D\|_2$.

2. On the other hand, if the closest eigenvalue $\lambda_\ell$ among all $\{\lambda_i\}_{i=2}^n$ to $\widetilde\lambda_1$ has much bigger magnitude than $|\widetilde\lambda_1|$, i.e., $|\lambda_\ell|\gg|\widetilde\lambda_1|$, then

$$\min_{2\le i\le n}\varrho_2(\lambda_i,\widetilde\lambda_1) \approx 1.$$

Thus, asymptotically, inequality (6.2) reads

$$\sin\theta_1 \lesssim \sqrt2\,\|I-D\|_2 + O(\|I-D\|_2^2),$$

which cannot be much worse than (6.1).

³By treating $A$ and $\widetilde A$ symmetrically, one can see that Theorem 4.1 remains valid with the relative gap $\eta_2$ between $\Lambda_1$ and $\widetilde\Lambda_2$ replaced by the relative gap $\min_{1\le i\le k,\,1\le j\le n-k}\varrho_2(\widetilde\lambda_i,\lambda_{k+j})$ between $\widetilde\Lambda_1$ and $\Lambda_2$.
In short, we have shown that it is possible for our inequality (6.2) to be less sharp than (6.1) by a constant factor, while in the case when these bounds are most likely to be used in estimating errors, (6.2) is sharper.

Another point we would like to make is that we gave up some sharpness for elegance in the derivation of (6.2). Recall that we have, besides (7.1), also

$$\widetilde\Lambda_1\widetilde U_1^*U_2 - \widetilde U_1^*U_2\Lambda_2 = \widetilde\Lambda_1\widetilde U_1^*(I-D^{-1})U_2 + \widetilde U_1^*(D^*-I)U_2\Lambda_2. \qquad (6.3)$$

When $\widetilde\Lambda_1 = 0$, this equation reduces to $-\widetilde U_1^*U_2\Lambda_2 = \widetilde U_1^*(D^*-I)U_2\Lambda_2$, which implies

$$\|\sin\Theta(U_1,\widetilde U_1)\|_2 \le \|D^*-I\|_2,$$

a better bound than (6.2), provided $\lambda_j\ne0$ for $j=k+1,\ldots,n$. Equation (6.3) implies

$$\widetilde u_1^*U_2 = \widetilde u_1^*(I-D^{-1})U_2\,\widetilde\lambda_1(\widetilde\lambda_1 I-\Lambda_2)^{-1} + \widetilde u_1^*(D^*-I)U_2\,\Lambda_2(\widetilde\lambda_1 I-\Lambda_2)^{-1},$$

which can be used to obtain an identity for $\sin\theta_1 = \|\widetilde u_1^*U_2\|_2$! Generally, (6.3) produces

$$\|\sin\Theta(U_1,\widetilde U_1)\|_2 \le \max_{1\le i\le k,\,1\le j\le n-k}\frac{|\widetilde\lambda_i|}{|\widetilde\lambda_i-\lambda_{k+j}|}\,\|I-D^{-1}\|_2 + \max_{1\le i\le k,\,1\le j\le n-k}\frac{|\lambda_{k+j}|}{|\widetilde\lambda_i-\lambda_{k+j}|}\,\|D^*-I\|_2,$$

which would be a better bound than (4.1) in the case when

$$\text{either}\quad \min_{1\le i\le k}|\widetilde\lambda_i| \gg \max_{1\le j\le n-k}|\lambda_{k+j}| \quad\text{or}\quad \max_{1\le i\le k}|\widetilde\lambda_i| \ll \min_{1\le j\le n-k}|\lambda_{k+j}|.$$

Eisenstat and Ipsen [7, 1994] treated singular value problems in a very similar way to the way they treated eigenspaces, so our arguments above apply to our bounds and their bounds for singular value problems as well.

Eisenstat and Ipsen [7, 1994] did not study bounds in other matrix norms.
7 Proofs of Theorems 4.1 and 4.2
Let $R = \widetilde AU_1 - U_1\Lambda_1 = (\widetilde A-A)U_1$ as defined in (3.1). Notice that

$$\widetilde U_2^*R = \widetilde U_2^*\widetilde AU_1 - \widetilde U_2^*U_1\Lambda_1 = \widetilde\Lambda_2\widetilde U_2^*U_1 - \widetilde U_2^*U_1\Lambda_1,$$

and

$$\widetilde U_2^*R = \widetilde U_2^*(\widetilde A-A)U_1 = \widetilde U_2^*\left[D^*AD - D^*A + D^*A - A\right]U_1 = \widetilde U_2^*\left[D^*AD(I-D^{-1}) + (D^*-I)A\right]U_1 = \widetilde\Lambda_2\widetilde U_2^*(I-D^{-1})U_1 + \widetilde U_2^*(D^*-I)U_1\Lambda_1.$$

Thus, we have

$$\widetilde\Lambda_2\widetilde U_2^*U_1 - \widetilde U_2^*U_1\Lambda_1 = \widetilde\Lambda_2\widetilde U_2^*(I-D^{-1})U_1 + \widetilde U_2^*(D^*-I)U_1\Lambda_1. \qquad (7.1)$$
Lemma 7.1 Let $\Omega\in\mathbb{C}^{s\times s}$ and $\Gamma\in\mathbb{C}^{t\times t}$ be two Hermitian matrices, and let $E,F\in\mathbb{C}^{s\times t}$. If $\lambda(\Omega)\cap\lambda(\Gamma) = \emptyset$, then the matrix equation

$$\Omega X - X\Gamma = \Omega E + F\Gamma$$

has a unique solution, and moreover

$$\|X\|_F \le \frac{\sqrt{\|E\|_F^2+\|F\|_F^2}}{\eta_2}, \quad\text{where}\quad \eta_2 = \min_{\omega\in\lambda(\Omega),\,\gamma\in\lambda(\Gamma)}\varrho_2(\omega,\gamma).$$
Proof: For any unitary matrices $P\in\mathbb{U}_s$ and $Q\in\mathbb{U}_t$, the substitutions

$$\Omega \leftarrow P^*\Omega P,\quad \Gamma \leftarrow Q^*\Gamma Q,\quad X \leftarrow P^*XQ,\quad E \leftarrow P^*EQ,\quad\text{and}\quad F \leftarrow P^*FQ$$

leave the lemma unchanged, so we may assume without loss of generality that $\Omega = \mathrm{diag}(\omega_1,\omega_2,\ldots,\omega_s)$ and $\Gamma = \mathrm{diag}(\gamma_1,\gamma_2,\ldots,\gamma_t)$. Write $X = (x_{ij})$, $E = (e_{ij})$, and $F = (f_{ij})$. Entrywise, the equation $\Omega X - X\Gamma = \Omega E + F\Gamma$ reads $\omega_ix_{ij} - x_{ij}\gamma_j = \omega_ie_{ij} + f_{ij}\gamma_j$. Thus $x_{ij}$ exists uniquely provided $\omega_i\ne\gamma_j$, which is guaranteed by the assumption $\lambda(\Omega)\cap\lambda(\Gamma)=\emptyset$; moreover,

$$|(\omega_i-\gamma_j)x_{ij}|^2 = |\omega_ie_{ij}+f_{ij}\gamma_j|^2 \le (|\omega_i|^2+|\gamma_j|^2)(|e_{ij}|^2+|f_{ij}|^2)$$

by the Cauchy-Schwarz inequality. This implies

$$|x_{ij}|^2 \le \frac{|e_{ij}|^2+|f_{ij}|^2}{[\varrho_2(\omega_i,\gamma_j)]^2} \le \frac{|e_{ij}|^2+|f_{ij}|^2}{\eta_2^2} \ \Longrightarrow\ \|X\|_F^2 = \sum_{i,j}|x_{ij}|^2 \le \frac{\|E\|_F^2+\|F\|_F^2}{\eta_2^2},$$

as was to be shown.
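The entrywise solution formula in this proof translates directly into a few lines of numpy. The sketch below (ours, diagonal case only, with hypothetical spectra) solves the equation and checks both the residual and the Frobenius-norm bound of Lemma 7.1:

```python
import numpy as np

rng = np.random.default_rng(5)

omega = np.array([5.0, 6.0, 7.0])        # eigenvalues of Omega (diagonal case)
gamma = np.array([1.0, 2.0, 3.0, 3.5])   # eigenvalues of Gamma, disjoint from omega
E = rng.standard_normal((3, 4))
F = rng.standard_normal((3, 4))

# entrywise solution of  Omega X - X Gamma = Omega E + F Gamma:
#   x_ij = (omega_i e_ij + f_ij gamma_j) / (omega_i - gamma_j)
X = (omega[:, None] * E + F * gamma[None, :]) / (omega[:, None] - gamma[None, :])

# residual of the matrix equation
R = np.diag(omega) @ X - X @ np.diag(gamma) - (np.diag(omega) @ E + F @ np.diag(gamma))
print(np.linalg.norm(R))   # close to machine precision

rho2 = lambda a, b: abs(a - b) / np.hypot(a, b)
eta2 = min(rho2(a, b) for a in omega for b in gamma)
print(np.linalg.norm(X), np.sqrt(np.linalg.norm(E)**2 + np.linalg.norm(F)**2) / eta2)
```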
Proof of Theorem 4.1: By Lemma 7.1 and equation (7.1), we have

$$\|\widetilde U_2^*U_1\|_F^2 \le \frac{\|\widetilde U_2^*(I-D^{-1})U_1\|_F^2+\|\widetilde U_2^*(D^*-I)U_1\|_F^2}{\eta_2^2} \le \frac{\|(I-D^{-1})U_1\|_F^2+\|(D^*-I)U_1\|_F^2}{\eta_2^2}.$$

This completes the proof of Theorem 4.1, since $\|\sin\Theta(U_1,\widetilde U_1)\|_F = \|\widetilde U_2^*U_1\|_F$.
Lemma 7.2 Let $\Omega\in\mathbb{C}^{s\times s}$ and $\Gamma\in\mathbb{C}^{t\times t}$ be two Hermitian matrices, and let $E,F\in\mathbb{C}^{s\times t}$. If there exist $\alpha>0$ and $\delta>0$ such that

$$\|\Omega\|_2 \le \alpha \ \text{ and } \ \|\Gamma^{-1}\|_2^{-1} \ge \alpha+\delta, \qquad\text{or}\qquad \|\Omega^{-1}\|_2^{-1} \ge \alpha+\delta \ \text{ and } \ \|\Gamma\|_2 \le \alpha,$$

then the matrix equation $\Omega X - X\Gamma = \Omega E + F\Gamma$ has a unique solution, and moreover, for any unitarily invariant norm $|||\cdot|||$,

$$|||X||| \le \frac{\sqrt[q]{|||E|||^q+|||F|||^q}}{\eta_p}, \quad\text{where}\quad \eta_p \stackrel{\rm def}{=} \varrho_p(\alpha,\alpha+\delta) \ \text{ and }\ 1/p+1/q=1.$$
Proof: First of all, the conditions of this lemma imply $\lambda(\Omega)\cap\lambda(\Gamma)=\emptyset$; thus $X$ exists uniquely by Lemma 7.1. In what follows, we present a proof of the bound on $|||X|||$ for the case $\|\Omega\|_2\le\alpha$ and $\|\Gamma^{-1}\|_2^{-1}\ge\alpha+\delta$; a proof for the other case is analogous. Post-multiply the equation $\Omega X-X\Gamma = \Omega E+F\Gamma$ by $\Gamma^{-1}$ to get

$$\Omega X\Gamma^{-1} - X = \Omega E\Gamma^{-1} + F. \qquad (7.2)$$

Under the assumptions $\|\Omega\|_2\le\alpha$ and $\|\Gamma^{-1}\|_2^{-1}\ge\alpha+\delta$ (so that $\|\Gamma^{-1}\|_2\le\frac{1}{\alpha+\delta}$), we have

$$|||\Omega X\Gamma^{-1}-X||| \ge |||X|||-|||\Omega X\Gamma^{-1}||| \ge |||X|||-\|\Omega\|_2\,|||X|||\,\|\Gamma^{-1}\|_2 \ge \left(1-\frac{\alpha}{\alpha+\delta}\right)|||X|||,$$

and, using Hölder's inequality in the last step,

$$|||\Omega E\Gamma^{-1}+F||| \le \|\Omega\|_2\,|||E|||\,\|\Gamma^{-1}\|_2+|||F||| \le \frac{\alpha}{\alpha+\delta}\,|||E|||+|||F||| \le \sqrt[p]{1+\left(\frac{\alpha}{\alpha+\delta}\right)^p}\ \sqrt[q]{|||E|||^q+|||F|||^q}.$$

By equation (7.2), we deduce that

$$\left(1-\frac{\alpha}{\alpha+\delta}\right)|||X||| \le \sqrt[p]{1+\left(\frac{\alpha}{\alpha+\delta}\right)^p}\ \sqrt[q]{|||E|||^q+|||F|||^q},$$

from which the desired inequality follows.
Proof of Theorem 4.2: By Lemma 7.2 and equation (7.1), we have

$$|||\widetilde U_2^*U_1||| \le \frac{\sqrt[q]{|||\widetilde U_2^*(I-D^{-1})U_1|||^q+|||\widetilde U_2^*(D^*-I)U_1|||^q}}{\eta_p} \le \frac{\sqrt[q]{|||(I-D^{-1})U_1|||^q+|||(D^*-I)U_1|||^q}}{\eta_p},$$

as required, since $|||\sin\Theta(U_1,\widetilde U_1)||| = |||\widetilde U_2^*U_1|||$.
8 Proofs of Theorems 4.3 and 4.4
Notice that
$$A = S^*HS = (H^{1/2}S)^*H^{1/2}S,$$

$$\widetilde A = S^*H^{1/2}\left(I+H^{-1/2}(\Delta H)H^{-1/2}\right)H^{1/2}S = \left[\left(I+H^{-1/2}(\Delta H)H^{-1/2}\right)^{1/2}H^{1/2}S\right]^*\left(I+H^{-1/2}(\Delta H)H^{-1/2}\right)^{1/2}H^{1/2}S.$$

Set $B = S^*H^{1/2}$ and $\widetilde B = S^*H^{1/2}\left(I+H^{-1/2}(\Delta H)H^{-1/2}\right)^{1/2} \stackrel{\rm def}{=} BD$, where $D = \left(I+H^{-1/2}(\Delta H)H^{-1/2}\right)^{1/2}$, so that $A = BB^*$ and $\widetilde A = \widetilde B\widetilde B^*$. Given the eigendecompositions of $A$ and $\widetilde A$ as in (1.1), (1.2), (1.3), and (1.4), one easily sees that $B$ and $\widetilde B$ admit the following SVDs:
$$B = U\Lambda^{1/2}V^* = (U_1,U_2)\begin{pmatrix}\Lambda_1^{1/2}&\\&\Lambda_2^{1/2}\end{pmatrix}\begin{pmatrix}V_1^*\\V_2^*\end{pmatrix}, \qquad \widetilde B = \widetilde U\widetilde\Lambda^{1/2}\widetilde V^* = (\widetilde U_1,\widetilde U_2)\begin{pmatrix}\widetilde\Lambda_1^{1/2}&\\&\widetilde\Lambda_2^{1/2}\end{pmatrix}\begin{pmatrix}\widetilde V_1^*\\\widetilde V_2^*\end{pmatrix},$$

where $U,\widetilde U$ are the same as in (1.1) and (1.2) and $V_1,\widetilde V_1\in\mathbb{C}^{n\times k}$. Notice that

$$\widetilde A - A = \widetilde B\widetilde B^* - BB^* = \widetilde BDB^* - \widetilde BD^{-1}B^* = \widetilde B(D-D^{-1})B^*.$$

Pre- and post-multiply this equation by $\widetilde U^*$ and $U$, respectively, to get $\widetilde\Lambda\widetilde U^*U - \widetilde U^*U\Lambda = \widetilde\Lambda^{1/2}\widetilde V^*(D-D^{-1})V\Lambda^{1/2}$, which yields

$$\widetilde\Lambda_2\widetilde U_2^*U_1 - \widetilde U_2^*U_1\Lambda_1 = \widetilde\Lambda_2^{1/2}\widetilde V_2^*(D-D^{-1})V_1\Lambda_1^{1/2}. \qquad (8.1)$$
The following inequality will be very useful in the rest of our proofs (the middle step uses $D - D^{-1} = D^{-1}(D^2 - I)$):
$$|||\widetilde V_2^*(D-D^{-1})V_1||| \le |||D - D^{-1}||| = \big|\big|\big|\big(I + H^{-1/2}(\Delta H)H^{-1/2}\big)^{1/2} - \big(I + H^{-1/2}(\Delta H)H^{-1/2}\big)^{-1/2}\big|\big|\big|$$
$$\le \big\|\big(I + H^{-1/2}(\Delta H)H^{-1/2}\big)^{-1/2}\big\|_2\,|||H^{-1/2}(\Delta H)H^{-1/2}||| \le \frac{|||H^{-1/2}(\Delta H)H^{-1/2}|||}{\sqrt{1 - \|H^{-1/2}(\Delta H)H^{-1/2}\|_2}}.$$
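As a quick numerical illustration of this inequality (a sketch under the simplifying assumption that $E = H^{-1/2}(\Delta H)H^{-1/2}$ is a given Hermitian matrix with $\|E\|_2 < 1$; the sizes and the value $0.3$ are arbitrary test choices), one can check $|||D - D^{-1}||| \le |||E|||/\sqrt{1-\|E\|_2}$ for $D = (I+E)^{1/2}$ in both the spectral and Frobenius norms:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 5
M = rng.standard_normal((n, n))
E = 0.3 * (M + M.T) / np.linalg.norm(M + M.T, 2)    # Hermitian, ||E||_2 = 0.3

w, Q = np.linalg.eigh(np.eye(n) + E)                # I + E is positive definite
D = Q @ np.diag(np.sqrt(w)) @ Q.T                   # D = (I + E)^{1/2}
Dinv = Q @ np.diag(1 / np.sqrt(w)) @ Q.T            # D^{-1}

normE2 = np.linalg.norm(E, 2)
for ord_ in (2, 'fro'):
    lhs = np.linalg.norm(D - Dinv, ord_)
    rhs = np.linalg.norm(E, ord_) / np.sqrt(1 - normE2)
    assert lhs <= rhs + 1e-12
```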
Lemma 8.1 Let $\Omega\in\mathbb C^{s\times s}$ and $\Gamma\in\mathbb C^{t\times t}$ be two nonnegative definite Hermitian matrices, and let $E\in\mathbb C^{s\times t}$. If $\lambda(\Omega)\cap\lambda(\Gamma)=\emptyset$, then the matrix equation $\Omega X - X\Gamma = \Omega^{1/2}E\Gamma^{1/2}$ has a unique solution $X\in\mathbb C^{s\times t}$, and moreover $\|X\|_F \le \|E\|_F/\eta$, where
$$\eta \stackrel{\rm def}{=} \min_{\omega\in\lambda(\Omega),\ \gamma\in\lambda(\Gamma)} \zeta(\omega,\gamma).$$
Proof: For any unitary matrices $P\in\mathcal U_s$ and $Q\in\mathcal U_t$, the substitutions
$$\Omega\leftarrow P^*\Omega P,\quad \Omega^{1/2}\leftarrow (P^*\Omega P)^{1/2},\quad \Gamma\leftarrow Q^*\Gamma Q,\quad \Gamma^{1/2}\leftarrow (Q^*\Gamma Q)^{1/2},\quad X\leftarrow P^* X Q,\quad E\leftarrow P^* E Q$$
leave the lemma unchanged, so we may assume without loss of generality that
$$\Omega = {\rm diag}(\omega_1,\omega_2,\dots,\omega_s) \quad\hbox{and}\quad \Gamma = {\rm diag}(\gamma_1,\gamma_2,\dots,\gamma_t).$$
Write $X = (x_{ij})$, $E = (e_{ij})$. Entrywise, the equation $\Omega X - X\Gamma = \Omega^{1/2}E\Gamma^{1/2}$ reads $\omega_i x_{ij} - x_{ij}\gamma_j = \sqrt{\omega_i}\,e_{ij}\sqrt{\gamma_j}$. As long as $\omega_i\ne\gamma_j$, $x_{ij}$ exists uniquely, and
$$|x_{ij}|^2 = |e_{ij}|^2/\zeta^2(\omega_i,\gamma_j) \le |e_{ij}|^2/\eta^2,$$
summing which over $1\le i\le s$ and $1\le j\le t$ leads to the desired inequality.
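The entrywise solution formula in this proof translates directly into a numerical check of Lemma 8.1, for diagonal $\Omega$ and $\Gamma$ (which the proof shows is no loss of generality); the sizes and spectra below are arbitrary test choices:

```python
import numpy as np

rng = np.random.default_rng(1)
s, t = 4, 3
omega = rng.uniform(2.0, 3.0, s)     # lambda(Omega), disjoint from lambda(Gamma)
gamma = rng.uniform(0.1, 1.0, t)
E = rng.standard_normal((s, t))

# Entrywise solution of  Omega X - X Gamma = Omega^{1/2} E Gamma^{1/2}.
X = np.sqrt(np.outer(omega, gamma)) * E / (omega[:, None] - gamma[None, :])

# The equation is satisfied exactly ...
res = np.diag(omega) @ X - X @ np.diag(gamma) \
      - np.diag(np.sqrt(omega)) @ E @ np.diag(np.sqrt(gamma))
assert np.allclose(res, 0)

# ... and ||X||_F <= ||E||_F / eta with eta = min zeta(omega_i, gamma_j),
# where zeta(a, b) = |a - b| / sqrt(a b) is the relative distance.
zeta = lambda a, b: abs(a - b) / np.sqrt(a * b)
eta = min(zeta(w, g) for w in omega for g in gamma)
assert np.linalg.norm(X) <= np.linalg.norm(E) / eta + 1e-12
```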
Proof of Theorem 4.3: Equation (8.1) and Lemma 8.1 imply
$$\|\sin\Theta(U_1,\widetilde U_1)\|_F = \|\widetilde U_2^* U_1\|_F \le \frac{\|\widetilde V_2^*(D-D^{-1})V_1\|_F}{\eta} \le \frac{\|D - D^{-1}\|_F}{\eta},$$
as required.
Lemma 8.2 Let $\Omega\in\mathbb C^{s\times s}$ and $\Gamma\in\mathbb C^{t\times t}$ be two nonnegative definite Hermitian matrices, and let $E\in\mathbb C^{s\times t}$. If there exist $\delta>0$ and $\eta>0$ such that
$$\|\Omega\|_2\le\delta \ \hbox{ and }\ \|\Gamma^{-1}\|_2^{-1}\ge\delta+\eta, \qquad\hbox{or}\qquad \|\Omega^{-1}\|_2^{-1}\ge\delta+\eta \ \hbox{ and }\ \|\Gamma\|_2\le\delta,$$
then the matrix equation $\Omega X - X\Gamma = \Omega^{1/2}E\Gamma^{1/2}$ has a unique solution $X\in\mathbb C^{s\times t}$, and moreover $|||X|||\le|||E|||/\zeta(\delta,\delta+\eta)$.
Proof: The existence and uniqueness of $X$ are easy to see because the conditions of this lemma imply $\lambda(\Omega)\cap\lambda(\Gamma)=\emptyset$. To bound $|||X|||$, we present a proof for the case $\|\Omega\|_2\le\delta$ and $\|\Gamma^{-1}\|_2^{-1}\ge\delta+\eta$. A proof for the other case is analogous. Post-multiply the equation $\Omega X - X\Gamma = \Omega^{1/2}E\Gamma^{1/2}$ by $\Gamma^{-1}$ to get
$$\Omega X\Gamma^{-1} - X = \Omega^{1/2}E\Gamma^{-1/2}. \eqno(8.2)$$
Under the assumptions $\|\Omega\|_2\le\delta$ and $\|\Gamma^{-1}\|_2^{-1}\ge\delta+\eta$ (hence $\|\Gamma^{-1}\|_2\le\frac1{\delta+\eta}$), we have
$$|||\Omega X\Gamma^{-1} - X||| \ge \frac{\eta}{\delta+\eta}\,|||X|||$$
as in the proof of Lemma 7.2, and
$$|||\Omega^{1/2}E\Gamma^{-1/2}||| \le \|\Omega^{1/2}\|_2\,|||E|||\,\|\Gamma^{-1/2}\|_2 \le \sqrt{\delta}\,|||E|||\,\frac1{\sqrt{\delta+\eta}}.$$
By equation (8.2), we deduce that
$$\frac{\eta}{\delta+\eta}\,|||X||| \le \sqrt{\frac{\delta}{\delta+\eta}}\,|||E|||,$$
from which the desired inequality follows, since $\eta\big/\sqrt{\delta(\delta+\eta)} = \zeta(\delta,\delta+\eta)$.
Proof of Theorem 4.4: Equation (8.1) and Lemma 8.2 imply
$$|||\sin\Theta(U_1,\widetilde U_1)||| = |||\widetilde U_2^* U_1||| \le \frac{|||\widetilde V_2^*(D-D^{-1})V_1|||}{\zeta(\delta,\delta+\eta)} \le \frac{|||D-D^{-1}|||}{\zeta(\delta,\delta+\eta)},$$
as required.
9 Proofs of Theorems 4.5 and 4.6
Let $R_R = \widetilde B V_1 - U_1\Sigma_1 = (\widetilde B - B)V_1$ and $R_L = \widetilde B^* U_1 - V_1\Sigma_1 = (\widetilde B - B)^* U_1$, as defined in (3.8).

9.1 The Square Case: $m=n$

When $m=n$, the SVDs (3.4) and (3.5) read
$$B = U\Sigma V^* = (U_1,U_2)\begin{pmatrix}\Sigma_1 & \\ & \Sigma_2\end{pmatrix}\begin{pmatrix}V_1^*\\ V_2^*\end{pmatrix}, \qquad \widetilde B = \widetilde U\widetilde\Sigma\widetilde V^* = (\widetilde U_1,\widetilde U_2)\begin{pmatrix}\widetilde\Sigma_1 & \\ & \widetilde\Sigma_2\end{pmatrix}\begin{pmatrix}\widetilde V_1^*\\ \widetilde V_2^*\end{pmatrix}.$$
Notice that
$$\widetilde U_2^* R_R = \widetilde U_2^*\widetilde B V_1 - \widetilde U_2^* U_1\Sigma_1 = \widetilde\Sigma_2\widetilde V_2^* V_1 - \widetilde U_2^* U_1\Sigma_1,$$
$$\widetilde U_2^* R_R = \widetilde U_2^*(\widetilde B - B)V_1 = \widetilde U_2^*(D_1^* B D_2 - D_1^* B + D_1^* B - B)V_1 = \widetilde U_2^*\big[\widetilde B(I - D_2^{-1}) + (D_1^* - I)B\big]V_1 = \widetilde\Sigma_2\widetilde V_2^*(I - D_2^{-1})V_1 + \widetilde U_2^*(D_1^* - I)U_1\Sigma_1,$$
to get
$$\widetilde\Sigma_2\widetilde V_2^* V_1 - \widetilde U_2^* U_1\Sigma_1 = \widetilde\Sigma_2\widetilde V_2^*(I - D_2^{-1})V_1 + \widetilde U_2^*(D_1^* - I)U_1\Sigma_1. \eqno(9.1)$$
On the other hand,
$$\widetilde V_2^* R_L = \widetilde V_2^*\widetilde B^* U_1 - \widetilde V_2^* V_1\Sigma_1 = \widetilde\Sigma_2\widetilde U_2^* U_1 - \widetilde V_2^* V_1\Sigma_1,$$
$$\widetilde V_2^* R_L = \widetilde V_2^*(\widetilde B - B)^* U_1 = \widetilde V_2^*(D_2^* B^* D_1 - D_2^* B^* + D_2^* B^* - B^*)U_1 = \widetilde V_2^*\big[\widetilde B^*(I - D_1^{-1}) + (D_2^* - I)B^*\big]U_1 = \widetilde\Sigma_2\widetilde U_2^*(I - D_1^{-1})U_1 + \widetilde V_2^*(D_2^* - I)V_1\Sigma_1,$$
which produces
$$\widetilde\Sigma_2\widetilde U_2^* U_1 - \widetilde V_2^* V_1\Sigma_1 = \widetilde\Sigma_2\widetilde U_2^*(I - D_1^{-1})U_1 + \widetilde V_2^*(D_2^* - I)V_1\Sigma_1. \eqno(9.2)$$
Equations (9.1) and (9.2) take an equivalent form as a single matrix equation with dimensions doubled:
$$\begin{pmatrix} & \widetilde\Sigma_2\\ \widetilde\Sigma_2 & \end{pmatrix}\begin{pmatrix}\widetilde U_2^* U_1\\ \widetilde V_2^* V_1\end{pmatrix} - \begin{pmatrix}\widetilde U_2^* U_1\\ \widetilde V_2^* V_1\end{pmatrix}\Sigma_1 = \begin{pmatrix} & \widetilde\Sigma_2\\ \widetilde\Sigma_2 & \end{pmatrix}\begin{pmatrix}\widetilde U_2^*(I - D_1^{-1})U_1\\ \widetilde V_2^*(I - D_2^{-1})V_1\end{pmatrix} + \begin{pmatrix}\widetilde U_2^*(D_1^* - I)U_1\\ \widetilde V_2^*(D_2^* - I)V_1\end{pmatrix}\Sigma_1. \eqno(9.3)$$
Proof of Theorem 4.5: Notice that the eigenvalues of $\begin{pmatrix} & \widetilde\Sigma_2\\ \widetilde\Sigma_2 & \end{pmatrix}$ are $\pm\widetilde\sigma_{k+j}$ and those of $\Sigma_1$ are $\sigma_i$, and that
$$\rho_2(\sigma_i, -\widetilde\sigma_{k+j}) \ge \rho_2(\sigma_i,\widetilde\sigma_{k+j}) \quad\hbox{and}\quad \rho_2(-\sigma_i,\widetilde\sigma_{k+j}) \ge \rho_2(\sigma_i,\widetilde\sigma_{k+j}).$$
By Lemma 7.1 and equation (9.3), we have
$$\|\widetilde U_2^* U_1\|_F^2 + \|\widetilde V_2^* V_1\|_F^2 \le \frac1{\eta_2^2}\Big[\|\widetilde U_2^*(I - D_1^{-1})U_1\|_F^2 + \|\widetilde U_2^*(D_1^* - I)U_1\|_F^2 + \|\widetilde V_2^*(I - D_2^{-1})V_1\|_F^2 + \|\widetilde V_2^*(D_2^* - I)V_1\|_F^2\Big]$$
$$\le \frac1{\eta_2^2}\Big[\|(I - D_1^{-1})U_1\|_F^2 + \|(D_1^* - I)U_1\|_F^2 + \|(I - D_2^{-1})V_1\|_F^2 + \|(D_2^* - I)V_1\|_F^2\Big],$$
which completes the proof.
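The Frobenius-norm bound of Theorem 4.5 can be exercised numerically; the following sketch is an illustration with arbitrary test data (a square $B$ with a large and a tiny singular-value cluster, $D_1, D_2$ near the identity), taking $\eta_2$ to be the minimal relative gap $\rho_2(\sigma_i,\widetilde\sigma_{k+j})$:

```python
import numpy as np

rng = np.random.default_rng(4)
n, k = 5, 2

# B with two large singular values and three tiny ones.
P, _ = np.linalg.qr(rng.standard_normal((n, n)))
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
B = P @ np.diag([3.0, 2.5, 1e-3, 5e-4, 1e-4]) @ Q.T

I = np.eye(n)
D1 = I + 1e-4 * rng.standard_normal((n, n))
D2 = I + 1e-4 * rng.standard_normal((n, n))
Bt = D1.T @ B @ D2                                  # B~ = D1* B D2

U, s, Vh = np.linalg.svd(B)                         # descending order
Ut, st, Vth = np.linalg.svd(Bt)
U1, V1 = U[:, :k], Vh[:k].T
Ut2, Vt2 = Ut[:, k:], Vth[k:].T

# sqrt(||sin Theta(U1,U1~)||_F^2 + ||sin Theta(V1,V1~)||_F^2)
lhs = np.hypot(np.linalg.norm(Ut2.T @ U1), np.linalg.norm(Vt2.T @ V1))

rho2 = lambda a, b: abs(a - b) / np.hypot(a, b)
eta2 = min(rho2(s[i], st[j]) for i in range(k) for j in range(k, n))

rhs = np.sqrt(np.linalg.norm((I - np.linalg.inv(D1)) @ U1)**2
              + np.linalg.norm((D1.T - I) @ U1)**2
              + np.linalg.norm((I - np.linalg.inv(D2)) @ V1)**2
              + np.linalg.norm((D2.T - I) @ V1)**2) / eta2
assert lhs <= rhs
```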
Lemma 9.1 Let $\Omega\in\mathbb C^{s\times s}$ and $\Gamma\in\mathbb C^{t\times t}$ be two Hermitian matrices, and let $X, Y, E, \widetilde E, F, \widetilde F\in\mathbb C^{s\times t}$ satisfy
$$\Omega X - Y\Gamma = \Omega E + F\Gamma \quad\hbox{and}\quad \Omega Y - X\Gamma = \Omega\widetilde E + \widetilde F\Gamma.$$
If there exist $\delta>0$ and $\eta>0$ such that
$$\|\Omega\|_2\le\delta \ \hbox{ and }\ \|\Gamma^{-1}\|_2^{-1}\ge\delta+\eta, \qquad\hbox{or}\qquad \|\Omega^{-1}\|_2^{-1}\ge\delta+\eta \ \hbox{ and }\ \|\Gamma\|_2\le\delta,$$
then for any unitarily invariant norm $|||\cdot|||$,
$$\max\{|||X|||, |||Y|||\} \le \frac1{\eta_p}\max\Big\{\sqrt[q]{|||E|||^q + |||F|||^q},\ \sqrt[q]{|||\widetilde E|||^q + |||\widetilde F|||^q}\Big\}, \eqno(9.4)$$
where $\eta_p \stackrel{\rm def}{=} \rho_p(\delta,\delta+\eta)$.
Proof: We present a proof for the case $\|\Omega\|_2\le\delta$ and $\|\Gamma^{-1}\|_2^{-1}\ge\delta+\eta$. A proof for the other case is analogous. Consider first the subcase $|||X|||\ge|||Y|||$. Post-multiply the equation $\Omega Y - X\Gamma = \Omega\widetilde E + \widetilde F\Gamma$ by $\Gamma^{-1}$ to get
$$\Omega Y\Gamma^{-1} - X = \Omega\widetilde E\Gamma^{-1} + \widetilde F. \eqno(9.5)$$
Then we have, by $\|\Omega\|_2\le\delta$ and $\|\Gamma^{-1}\|_2^{-1}\ge\delta+\eta$ (hence $\|\Gamma^{-1}\|_2\le\frac1{\delta+\eta}$), that
$$|||\Omega Y\Gamma^{-1} - X||| \ge |||X||| - \|\Omega\|_2\,|||Y|||\,\|\Gamma^{-1}\|_2 \ge |||X||| - \frac{\delta}{\delta+\eta}\,|||Y||| \ge \Big(1 - \frac{\delta}{\delta+\eta}\Big)|||X||| = \frac{\eta}{\delta+\eta}\,|||X|||,$$
and
$$|||\Omega\widetilde E\Gamma^{-1} + \widetilde F||| \le \|\Omega\|_2\,|||\widetilde E|||\,\|\Gamma^{-1}\|_2 + |||\widetilde F||| \le \frac{\delta}{\delta+\eta}\,|||\widetilde E||| + |||\widetilde F||| \le \sqrt[p]{1 + \frac{\delta^p}{(\delta+\eta)^p}}\ \sqrt[q]{|||\widetilde E|||^q + |||\widetilde F|||^q}.$$
By equation (9.5), we deduce that
$$\frac{\eta}{\delta+\eta}\,|||X||| \le \sqrt[p]{1 + \frac{\delta^p}{(\delta+\eta)^p}}\ \sqrt[q]{|||\widetilde E|||^q + |||\widetilde F|||^q},$$
which produces, in the subcase $|||X|||\ge|||Y|||$,
$$\max\{|||X|||,|||Y|||\} = |||X||| \le \frac1{\eta_p}\sqrt[q]{|||\widetilde E|||^q + |||\widetilde F|||^q}.$$
Similarly, if $|||X||| < |||Y|||$, from $\Omega X - Y\Gamma = \Omega E + F\Gamma$ we can obtain $|||Y||| \le \frac1{\eta_p}\sqrt[q]{|||E|||^q + |||F|||^q}$. Inequality (9.4) now follows.
Proof of Theorem 4.6: By equations (9.1) and (9.2) and Lemma 9.1, we have
$$\max\big\{|||\widetilde U_2^* U_1|||, |||\widetilde V_2^* V_1|||\big\} \le \frac1{\eta_p}\max\Big\{\sqrt[q]{|||\widetilde V_2^*(I - D_2^{-1})V_1|||^q + |||\widetilde U_2^*(D_1^* - I)U_1|||^q},\ \sqrt[q]{|||\widetilde U_2^*(I - D_1^{-1})U_1|||^q + |||\widetilde V_2^*(D_2^* - I)V_1|||^q}\Big\}$$
$$\le \frac1{\eta_p}\max\Big\{\sqrt[q]{|||(I - D_2^{-1})V_1|||^q + |||(D_1^* - I)U_1|||^q},\ \sqrt[q]{|||(I - D_1^{-1})U_1|||^q + |||(D_2^* - I)V_1|||^q}\Big\},$$
as required. Turning to inequality (4.8), we have by equation (9.3) and Lemma 7.2 that
$$\left|\left|\left|\begin{pmatrix}\widetilde U_2^* U_1\\ \widetilde V_2^* V_1\end{pmatrix}\right|\right|\right| \le \frac1{\eta_p}\left(\left|\left|\left|\begin{pmatrix}\widetilde U_2^*(I - D_1^{-1})U_1\\ \widetilde V_2^*(I - D_2^{-1})V_1\end{pmatrix}\right|\right|\right|^q + \left|\left|\left|\begin{pmatrix}\widetilde U_2^*(D_1^* - I)U_1\\ \widetilde V_2^*(D_2^* - I)V_1\end{pmatrix}\right|\right|\right|^q\right)^{1/q}, \eqno(9.6)$$
since the conditions of Theorem 4.6 imply
$$\left\|\begin{pmatrix} & \widetilde\Sigma_2\\ \widetilde\Sigma_2 & \end{pmatrix}\right\|_2 \le \delta \quad\hbox{and}\quad \big\|\Sigma_1^{-1}\big\|_2^{-1} \ge \delta+\eta.$$
2
Since Ue2 U1 and sin (U1 ; Ue1 ) have the same nonzero singular values and so do Ve2V1 and
sin (V1; Ve1),
e U2 U1
Ve2 V1
! !
= sin (U1; Ue1)
sin (V1; Ve1) :
(9.7)
Note also that
$$\begin{pmatrix}\widetilde U_2^*(I - D_1^{-1})U_1\\ \widetilde V_2^*(I - D_2^{-1})V_1\end{pmatrix} = \begin{pmatrix}\widetilde U_2^* & \\ & \widetilde V_2^*\end{pmatrix}\begin{pmatrix}(I - D_1^{-1})U_1\\ (I - D_2^{-1})V_1\end{pmatrix}, \eqno(9.8)$$
$$\begin{pmatrix}\widetilde U_2^*(D_1^* - I)U_1\\ \widetilde V_2^*(D_2^* - I)V_1\end{pmatrix} = \begin{pmatrix}\widetilde U_2^* & \\ & \widetilde V_2^*\end{pmatrix}\begin{pmatrix}(D_1^* - I)U_1\\ (D_2^* - I)V_1\end{pmatrix}. \eqno(9.9)$$
Inequality (4.8) is a consequence of (9.6), (9.7), (9.8) and (9.9).
9.2 The Non-Square Case: $m>n$

Augment $B$ and $\widetilde B$ by a zero block $0_{m,m-n}$ to $B_a = (B, 0_{m,m-n})$ and $\widetilde B_a = (\widetilde B, 0_{m,m-n})$. From $\widetilde B = D_1^* B D_2$, we get
$$\widetilde B_a = D_1^* B_a\begin{pmatrix}D_2 & \\ & I_{m-n}\end{pmatrix} \stackrel{\rm def}{=} D_1^* B_a D_{a2}.$$
From the SVDs (3.4) and (3.5) of $B$ and $\widetilde B$, one can calculate the SVDs of $B_a$ and $\widetilde B_a$:
$$B_a = U\Sigma_a V_a^* = (U_1, U_2)\begin{pmatrix}\Sigma_1 & \\ & \Sigma_{a2}\end{pmatrix}\begin{pmatrix}V_{a1}^*\\ V_{a2}^*\end{pmatrix}, \eqno(9.10)$$
$$\widetilde B_a = \widetilde U\widetilde\Sigma_a\widetilde V_a^* = (\widetilde U_1,\widetilde U_2)\begin{pmatrix}\widetilde\Sigma_1 & \\ & \widetilde\Sigma_{a2}\end{pmatrix}\begin{pmatrix}\widetilde V_{a1}^*\\ \widetilde V_{a2}^*\end{pmatrix}, \eqno(9.11)$$
where
$$\Sigma_{a2} = \begin{pmatrix}\Sigma_2 & \\ & 0_{m-n,m-n}\end{pmatrix}, \qquad V_{a1} = \begin{pmatrix}V_1\\ 0_{m-n,k}\end{pmatrix}, \qquad V_{a2} = \begin{pmatrix}V_2 & \\ & I_{m-n}\end{pmatrix},$$
and similarly for $\widetilde\Sigma_{a2}$, $\widetilde V_{a1}$ and $\widetilde V_{a2}$. The following fact is easy to establish:
$$|||\sin\Theta(V_{a1},\widetilde V_{a1})||| = |||\sin\Theta(V_1,\widetilde V_1)|||.$$
Applying the square case of Theorems 4.5 and 4.6 to the $m\times m$ matrices $B_a$ and $\widetilde B_a$ just defined will complete the proofs.
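The reduction can be exercised numerically: for an $m\times n$ example with $m>n$, the augmented singular values acquire $m-n$ zeros from the zero block, and the square-case Frobenius bound continues to hold with the relative gap taken over the augmented set. The sketch below uses arbitrary test data; note that the zeros enter harmlessly through $\rho_2(\sigma_i, 0) = 1$.

```python
import numpy as np

rng = np.random.default_rng(5)
m, n, k = 7, 4, 2

P, _ = np.linalg.qr(rng.standard_normal((m, m)))
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
S = np.zeros((m, n)); S[:n, :n] = np.diag([2.0, 1.5, 1e-3, 5e-4])
B = P @ S @ Q.T

D1 = np.eye(m) + 1e-4 * rng.standard_normal((m, m))
D2 = np.eye(n) + 1e-4 * rng.standard_normal((n, n))
Bt = D1.T @ B @ D2

U, s, Vh = np.linalg.svd(B)          # U is m x m; s has n entries
Ut, st, Vth = np.linalg.svd(Bt)
U1, V1 = U[:, :k], Vh[:k].T
Ut2, Vt2 = Ut[:, k:], Vth[k:].T

lhs = np.hypot(np.linalg.norm(Ut2.T @ U1), np.linalg.norm(Vt2.T @ V1))

# Augmented tail: sigma~_{k+1..n} plus m-n zeros from the 0_{m,m-n} block.
st_aug = np.concatenate([st[k:], np.zeros(m - n)])
rho2 = lambda a, b: abs(a - b) / np.hypot(a, b)
eta2 = min(rho2(s[i], sj) for i in range(k) for sj in st_aug)

rhs = np.sqrt(np.linalg.norm((np.eye(m) - np.linalg.inv(D1)) @ U1)**2
              + np.linalg.norm((D1.T - np.eye(m)) @ U1)**2
              + np.linalg.norm((np.eye(n) - np.linalg.inv(D2)) @ V1)**2
              + np.linalg.norm((D2.T - np.eye(n)) @ V1)**2) / eta2
assert lhs <= rhs
```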
10 Proofs of Theorems 4.7 and 4.8
We have seen in §9 how to deal with the nonsquare case by transforming it to the square case, so here we only give proofs for the square case $m=n$. Let $\widehat B = D_1^* B$ and $\bar B = B D_2$, and let their SVDs be
$$\widehat B = \widehat U\widehat\Sigma\widehat V^* = (\widehat U_1,\widehat U_2)\begin{pmatrix}\widehat\Sigma_1 & \\ & \widehat\Sigma_2\end{pmatrix}\begin{pmatrix}\widehat V_1^*\\ \widehat V_2^*\end{pmatrix}, \eqno(10.1)$$
$$\bar B = \bar U\bar\Sigma\bar V^* = (\bar U_1,\bar U_2)\begin{pmatrix}\bar\Sigma_1 & \\ & \bar\Sigma_2\end{pmatrix}\begin{pmatrix}\bar V_1^*\\ \bar V_2^*\end{pmatrix}, \eqno(10.2)$$
where $\widehat U,\bar U\in\mathcal U_n$, $\widehat V,\bar V\in\mathcal U_n$, $\widehat U_1,\bar U_1\in\mathbb C^{n\times k}$, $\widehat V_1,\bar V_1\in\mathbb C^{n\times k}$, and
$$\widehat\Sigma_1 = {\rm diag}(\widehat\sigma_1,\dots,\widehat\sigma_k), \quad \widehat\Sigma_2 = {\rm diag}(\widehat\sigma_{k+1},\dots,\widehat\sigma_n), \quad \bar\Sigma_1 = {\rm diag}(\bar\sigma_1,\dots,\bar\sigma_k), \quad \bar\Sigma_2 = {\rm diag}(\bar\sigma_{k+1},\dots,\bar\sigma_n).$$
The partitionings in (10.1) and (10.2) shall be done in such a way that
$$\max_{1\le i\le n}\zeta(\sigma_i,\widehat\sigma_i) \le \tfrac12\|D_1^* - D_1^{-1}\|_2, \qquad \max_{1\le i\le n}\zeta(\sigma_i,\bar\sigma_i) \le \tfrac12\|D_2^* - D_2^{-1}\|_2,$$
$$\max_{1\le i\le n}\zeta(\widehat\sigma_i,\widetilde\sigma_i) \le \tfrac12\|D_2^* - D_2^{-1}\|_2, \qquad \max_{1\le i\le n}\zeta(\bar\sigma_i,\widetilde\sigma_i) \le \tfrac12\|D_1^* - D_1^{-1}\|_2. \eqno(10.3)$$
Such partitionings are possible because of the relative perturbation theorems proved in Li [11]. By the fact $\rho_p(\alpha,\beta) \le 2^{-1/p}\zeta(\alpha,\beta)$ (see Lemma 2.1), these inequalities imply
$$\max_{1\le i\le n}\rho_p(\sigma_i,\widehat\sigma_i) \le \tfrac1{2^{1+1/p}}\|D_1^* - D_1^{-1}\|_2, \qquad \max_{1\le i\le n}\rho_p(\sigma_i,\bar\sigma_i) \le \tfrac1{2^{1+1/p}}\|D_2^* - D_2^{-1}\|_2,$$
$$\max_{1\le i\le n}\rho_p(\widehat\sigma_i,\widetilde\sigma_i) \le \tfrac1{2^{1+1/p}}\|D_2^* - D_2^{-1}\|_2, \qquad \max_{1\le i\le n}\rho_p(\bar\sigma_i,\widetilde\sigma_i) \le \tfrac1{2^{1+1/p}}\|D_1^* - D_1^{-1}\|_2. \eqno(10.4)$$
Consider $\widehat B$ and $\widetilde B = \widehat B D_2$. We have
$$\widehat B\widehat B^* = \widehat U\widehat\Sigma\widehat\Sigma^*\widehat U^* = (\widehat U_1,\widehat U_2)\begin{pmatrix}\widehat\Sigma_1^2 & \\ & \widehat\Sigma_2^2\end{pmatrix}\begin{pmatrix}\widehat U_1^*\\ \widehat U_2^*\end{pmatrix}, \eqno(10.5)$$
$$\widetilde B\widetilde B^* = \widetilde U\widetilde\Sigma\widetilde\Sigma^*\widetilde U^* = (\widetilde U_1,\widetilde U_2)\begin{pmatrix}\widetilde\Sigma_1^2 & \\ & \widetilde\Sigma_2^2\end{pmatrix}\begin{pmatrix}\widetilde U_1^*\\ \widetilde U_2^*\end{pmatrix}. \eqno(10.6)$$
Notice that
$$\widetilde B\widetilde B^* - \widehat B\widehat B^* = \widetilde B D_2^*\widehat B^* - \widetilde B D_2^{-1}\widehat B^* = \widetilde B(D_2^* - D_2^{-1})\widehat B^*.$$
Pre- and post-multiply this equation by $\widetilde U^*$ and $\widehat U$, respectively, to get $\widetilde\Sigma^2\widetilde U^*\widehat U - \widetilde U^*\widehat U\widehat\Sigma^2 = \widetilde\Sigma\widetilde V^*(D_2^* - D_2^{-1})\widehat V\widehat\Sigma$, which gives
$$\widetilde\Sigma_2^2\widetilde U_2^*\widehat U_1 - \widetilde U_2^*\widehat U_1\widehat\Sigma_1^2 = \widetilde\Sigma_2\widetilde V_2^*(D_2^* - D_2^{-1})\widehat V_1\widehat\Sigma_1. \eqno(10.7)$$
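Equation (10.7) is an exact identity, which makes it easy to check in floating point; the sketch below (arbitrary test data, real matrices so that $D_2^* = D_2^T$) verifies it for a random square $B$ and $D_1, D_2$ near the identity:

```python
import numpy as np

rng = np.random.default_rng(6)
n, k = 5, 2
B = rng.standard_normal((n, n))
D1 = np.eye(n) + 0.05 * rng.standard_normal((n, n))
D2 = np.eye(n) + 0.05 * rng.standard_normal((n, n))

Bh = D1.T @ B                      # B^ = D1* B
Bt = Bh @ D2                       # B~ = B^ D2

Uh, sh, Vhh = np.linalg.svd(Bh)
Ut, st, Vth = np.linalg.svd(Bt)
Uh1, Vh1, Sh1 = Uh[:, :k], Vhh[:k].T, np.diag(sh[:k])
Ut2, Vt2, St2 = Ut[:, k:], Vth[k:].T, np.diag(st[k:])

# (10.7): St2^2 (Ut2* Uh1) - (Ut2* Uh1) Sh1^2 = St2 Vt2* (D2* - D2^{-1}) Vh1 Sh1
lhs = St2**2 @ (Ut2.T @ Uh1) - (Ut2.T @ Uh1) @ Sh1**2
rhs = St2 @ Vt2.T @ (D2.T - np.linalg.inv(D2)) @ Vh1 @ Sh1
assert np.allclose(lhs, rhs)
```

(Squaring the diagonal matrices elementwise with `**2` coincides with the matrix square here.)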
Consider now $\bar B$ and $\widetilde B = D_1^*\bar B$. We have
$$\bar B^*\bar B = \bar V\bar\Sigma^*\bar\Sigma\bar V^* = (\bar V_1,\bar V_2)\begin{pmatrix}\bar\Sigma_1^2 & \\ & \bar\Sigma_2^2\end{pmatrix}\begin{pmatrix}\bar V_1^*\\ \bar V_2^*\end{pmatrix}, \eqno(10.8)$$
$$\widetilde B^*\widetilde B = \widetilde V\widetilde\Sigma^*\widetilde\Sigma\widetilde V^* = (\widetilde V_1,\widetilde V_2)\begin{pmatrix}\widetilde\Sigma_1^2 & \\ & \widetilde\Sigma_2^2\end{pmatrix}\begin{pmatrix}\widetilde V_1^*\\ \widetilde V_2^*\end{pmatrix}. \eqno(10.9)$$
Notice that
$$\widetilde B^*\widetilde B - \bar B^*\bar B = \widetilde B^* D_1^*\bar B - \widetilde B^* D_1^{-1}\bar B = \widetilde B^*(D_1^* - D_1^{-1})\bar B.$$
Pre- and post-multiply this equation by $\widetilde V^*$ and $\bar V$, respectively, to get $\widetilde\Sigma^2\widetilde V^*\bar V - \widetilde V^*\bar V\bar\Sigma^2 = \widetilde\Sigma\widetilde U^*(D_1^* - D_1^{-1})\bar U\bar\Sigma$, which gives
$$\widetilde\Sigma_2^2\widetilde V_2^*\bar V_1 - \widetilde V_2^*\bar V_1\bar\Sigma_1^2 = \widetilde\Sigma_2\widetilde U_2^*(D_1^* - D_1^{-1})\bar U_1\bar\Sigma_1. \eqno(10.10)$$
Two other eigendecompositions that will be used later in the proofs are
$$BB^* = U\Sigma\Sigma^* U^* = (U_1, U_2)\begin{pmatrix}\Sigma_1^2 & \\ & \Sigma_2^2\end{pmatrix}\begin{pmatrix}U_1^*\\ U_2^*\end{pmatrix}, \eqno(10.11)$$
$$B^*B = V\Sigma^*\Sigma V^* = (V_1, V_2)\begin{pmatrix}\Sigma_1^2 & \\ & \Sigma_2^2\end{pmatrix}\begin{pmatrix}V_1^*\\ V_2^*\end{pmatrix}. \eqno(10.12)$$
Proof of Theorem 4.7: Equations (10.7) and (10.10) and Lemma 8.1 produce
$$\|\sin\Theta(\widehat U_1,\widetilde U_1)\|_F = \|\widetilde U_2^*\widehat U_1\|_F \le \frac{\|\widetilde V_2^*(D_2^* - D_2^{-1})\widehat V_1\|_F}{\zeta(\widehat\Sigma_1^2,\widetilde\Sigma_2^2)},$$
$$\|\sin\Theta(\bar V_1,\widetilde V_1)\|_F = \|\widetilde V_2^*\bar V_1\|_F \le \frac{\|\widetilde U_2^*(D_1^* - D_1^{-1})\bar U_1\|_F}{\zeta(\bar\Sigma_1^2,\widetilde\Sigma_2^2)},$$
where$^4$
$$\zeta(\widehat\Sigma_1^2,\widetilde\Sigma_2^2) \stackrel{\rm def}{=} \min_{1\le i\le k,\ 1\le j\le n-k}\zeta(\widehat\sigma_i^2,\widetilde\sigma_{k+j}^2) \quad\hbox{and}\quad \zeta(\bar\Sigma_1^2,\widetilde\Sigma_2^2) \stackrel{\rm def}{=} \min_{1\le i\le k,\ 1\le j\le n-k}\zeta(\bar\sigma_i^2,\widetilde\sigma_{k+j}^2).$$
($^4$We abuse notation here for convenience. As we recall, $\zeta$ has its own assignment in the statement of Theorem 4.7; it is re-defined as a function in this proof. Hopefully this will not cause any confusion.)

On the other hand, applying Theorem 4.1 to $BB^*$ and $\widehat B\widehat B^* = D_1^* BB^* D_1$ leads to (see (10.5) and (10.11))
$$\|\sin\Theta(U_1,\widehat U_1)\|_F \le \frac{\sqrt{\|(I - D_1^{-1})U_1\|_F^2 + \|(D_1^* - I)U_1\|_F^2}}{\rho_2(\Sigma_1^2,\widehat\Sigma_2^2)},$$
where $\rho_2(\Sigma_1^2,\widehat\Sigma_2^2) \stackrel{\rm def}{=} \min_{1\le i\le k,\ 1\le j\le n-k}\rho_2(\sigma_i^2,\widehat\sigma_{k+j}^2)$. Applying Theorem 4.1 to $B^*B$ and $\bar B^*\bar B = D_2^* B^*B D_2$ leads to (see (10.8) and (10.12))
$$\|\sin\Theta(V_1,\bar V_1)\|_F \le \frac{\sqrt{\|(I - D_2^{-1})V_1\|_F^2 + \|(D_2^* - I)V_1\|_F^2}}{\rho_2(\Sigma_1^2,\bar\Sigma_2^2)},$$
where $\rho_2(\Sigma_1^2,\bar\Sigma_2^2) \stackrel{\rm def}{=} \min_{1\le i\le k,\ 1\le j\le n-k}\rho_2(\sigma_i^2,\bar\sigma_{k+j}^2)$. Let $\epsilon_1 = \|D_1^* - D_1^{-1}\|_2$ and $\epsilon_2 = \|D_2^* - D_2^{-1}\|_2$. We claim
$$\zeta(\widehat\Sigma_1^2,\widetilde\Sigma_2^2) \ge 2\,\zeta(\widehat\Sigma_1,\widetilde\Sigma_2) \ge \begin{cases}\dfrac{2\,\zeta(\Sigma_1,\widetilde\Sigma_2) - \epsilon_1}{1 + \frac{\epsilon_1}{16}\,\zeta(\Sigma_1,\widetilde\Sigma_2)},\\[2ex] 2^{3/2}\rho_2(\Sigma_1,\widetilde\Sigma_2) - \epsilon_1,\end{cases} \eqno(10.13)$$
with the analogous minima over $1\le i\le k$, $1\le j\le n-k$ understood. This is because
$$\zeta(\widehat\sigma_i^2,\widetilde\sigma_{k+j}^2) \ge 2\,\zeta(\widehat\sigma_i,\widetilde\sigma_{k+j}) \ge 2^{1+1/2}\rho_2(\widehat\sigma_i,\widetilde\sigma_{k+j}) \qquad\hbox{(by Lemma 2.1)}$$
$$\ge 2^{3/2}\big[\rho_2(\sigma_i,\widetilde\sigma_{k+j}) - \rho_2(\widehat\sigma_i,\sigma_i)\big] \qquad\hbox{($\rho_2$ is a metric on $\mathbb R$)}$$
$$\ge 2^{3/2}\rho_2(\sigma_i,\widetilde\sigma_{k+j}) - \epsilon_1 \qquad\hbox{(by (10.4))}$$
and because
$$\zeta(\sigma_i,\widetilde\sigma_{k+j}) \le \zeta(\sigma_i,\widehat\sigma_i) + \zeta(\widehat\sigma_i,\widetilde\sigma_{k+j}) + \tfrac18\,\zeta(\sigma_i,\widehat\sigma_i)\,\zeta(\widehat\sigma_i,\widetilde\sigma_{k+j})\,\zeta(\sigma_i,\widetilde\sigma_{k+j})$$
$$\Longrightarrow\quad \zeta(\widehat\sigma_i,\widetilde\sigma_{k+j}) \ge \frac{\zeta(\sigma_i,\widetilde\sigma_{k+j}) - \zeta(\sigma_i,\widehat\sigma_i)}{1 + \frac18\,\zeta(\sigma_i,\widehat\sigma_i)\,\zeta(\sigma_i,\widetilde\sigma_{k+j})} \ge \frac{\zeta(\sigma_i,\widetilde\sigma_{k+j}) - \epsilon_1/2}{1 + \frac{\epsilon_1}{16}\,\zeta(\sigma_i,\widetilde\sigma_{k+j})} \qquad\hbox{(by (10.3)).}$$
Similarly, we have
$$\zeta(\bar\Sigma_1^2,\widetilde\Sigma_2^2) \ge 2\,\zeta(\bar\Sigma_1,\widetilde\Sigma_2) \ge \begin{cases}\dfrac{2\,\zeta(\Sigma_1,\widetilde\Sigma_2) - \epsilon_2}{1 + \frac{\epsilon_2}{16}\,\zeta(\Sigma_1,\widetilde\Sigma_2)},\\[2ex] 2^{3/2}\rho_2(\Sigma_1,\widetilde\Sigma_2) - \epsilon_2,\end{cases}$$
$$\rho_2(\Sigma_1^2,\widehat\Sigma_2^2) \ge \rho_2(\Sigma_1,\widehat\Sigma_2) \ge \rho_2(\Sigma_1,\widetilde\Sigma_2) - 2^{-3/2}\epsilon_2,$$
$$\rho_2(\Sigma_1^2,\bar\Sigma_2^2) \ge \rho_2(\Sigma_1,\bar\Sigma_2) \ge \rho_2(\Sigma_1,\widetilde\Sigma_2) - 2^{-3/2}\epsilon_1.$$
The proof will be completed by employing
$$\|\sin\Theta(U_1,\widetilde U_1)\|_F \le \|\sin\Theta(U_1,\widehat U_1)\|_F + \|\sin\Theta(\widehat U_1,\widetilde U_1)\|_F,$$
$$\|\sin\Theta(V_1,\widetilde V_1)\|_F \le \|\sin\Theta(V_1,\bar V_1)\|_F + \|\sin\Theta(\bar V_1,\widetilde V_1)\|_F,$$
since $\|\sin\Theta(\cdot,\cdot)\|_F$ is a metric on the space of $k$-dimensional subspaces [14].
Proof of Theorem 4.8: Denote $\alpha = \delta + \eta$. Let $\widehat\delta$ and $\bar\delta$ be the largest positive numbers such that
$$\zeta(\delta,\widehat\delta) \le \tfrac12\|D_2^* - D_2^{-1}\|_2 \quad\hbox{and}\quad \zeta(\delta,\bar\delta) \le \tfrac12\|D_1^* - D_1^{-1}\|_2,$$
which guarantee that $\|\widehat\Sigma_2\|_2\le\widehat\delta$, $\|\bar\Sigma_2\|_2\le\bar\delta$, and
$$\rho_p(\delta,\widehat\delta) \le \tfrac1{2^{1+1/p}}\|D_2^* - D_2^{-1}\|_2 \quad\hbox{and}\quad \rho_p(\delta,\bar\delta) \le \tfrac1{2^{1+1/p}}\|D_1^* - D_1^{-1}\|_2;$$
and let $\widehat\alpha$ and $\bar\alpha$ be the smallest numbers such that
$$\zeta(\alpha,\widehat\alpha) \le \tfrac12\|D_1^* - D_1^{-1}\|_2 \quad\hbox{and}\quad \zeta(\alpha,\bar\alpha) \le \tfrac12\|D_2^* - D_2^{-1}\|_2,$$
which guarantee that $\|\widehat\Sigma_1^{-1}\|_2^{-1}\ge\widehat\alpha$, $\|\bar\Sigma_1^{-1}\|_2^{-1}\ge\bar\alpha$, and
$$\rho_p(\alpha,\widehat\alpha) \le \tfrac1{2^{1+1/p}}\|D_1^* - D_1^{-1}\|_2 \quad\hbox{and}\quad \rho_p(\alpha,\bar\alpha) \le \tfrac1{2^{1+1/p}}\|D_2^* - D_2^{-1}\|_2.$$
Condition (4.15) implies $\min\{\widehat\alpha,\bar\alpha\} > \delta$ and $\alpha > \max\{\widehat\delta,\bar\delta\}$.

Equations (10.7) and (10.10) and Lemma 8.2 produce
$$|||\sin\Theta(\widehat U_1,\widetilde U_1)||| = |||\widetilde U_2^*\widehat U_1||| \le \frac{|||\widetilde V_2^*(D_2^* - D_2^{-1})\widehat V_1|||}{\zeta(\delta^2,\widehat\alpha^2)},$$
$$|||\sin\Theta(\bar V_1,\widetilde V_1)||| = |||\widetilde V_2^*\bar V_1||| \le \frac{|||\widetilde U_2^*(D_1^* - D_1^{-1})\bar U_1|||}{\zeta(\delta^2,\bar\alpha^2)}.$$
On the other hand, applying Theorem 4.2 to $BB^*$ and $\widehat B\widehat B^* = D_1^* BB^* D_1$ leads to (see (10.5) and (10.11))
$$|||\sin\Theta(U_1,\widehat U_1)||| \le \frac{\sqrt[q]{|||(I - D_1^{-1})U_1|||^q + |||(D_1^* - I)U_1|||^q}}{\rho_p(\widehat\delta^2,\alpha^2)},$$
and applying Theorem 4.2 to $B^*B$ and $\bar B^*\bar B = D_2^* B^*B D_2$ leads to (see (10.8) and (10.12))
$$|||\sin\Theta(V_1,\bar V_1)||| \le \frac{\sqrt[q]{|||(I - D_2^{-1})V_1|||^q + |||(D_2^* - I)V_1|||^q}}{\rho_p(\bar\delta^2,\alpha^2)}.$$
Notice that
$$\zeta(\delta^2,\widehat\alpha^2) \ge 2\,\zeta(\delta,\widehat\alpha) \ge \begin{cases}\dfrac{2\,\zeta(\delta,\alpha) - \epsilon_1}{1 + \frac{\epsilon_1}{16}\,\zeta(\delta,\alpha)},\\[2ex] 2^{1+1/p}\rho_p(\delta,\alpha) - \epsilon_1,\end{cases} \qquad \zeta(\delta^2,\bar\alpha^2) \ge 2\,\zeta(\delta,\bar\alpha) \ge \begin{cases}\dfrac{2\,\zeta(\delta,\alpha) - \epsilon_2}{1 + \frac{\epsilon_2}{16}\,\zeta(\delta,\alpha)},\\[2ex] 2^{1+1/p}\rho_p(\delta,\alpha) - \epsilon_2,\end{cases}$$
$$\rho_p(\widehat\delta^2,\alpha^2) \ge \rho_p(\widehat\delta,\alpha) \ge \rho_p(\delta,\alpha) - \tfrac1{2^{1+1/p}}\epsilon_2, \qquad \rho_p(\bar\delta^2,\alpha^2) \ge \rho_p(\bar\delta,\alpha) \ge \rho_p(\delta,\alpha) - \tfrac1{2^{1+1/p}}\epsilon_1,$$
where $\epsilon_1 = \|D_1^* - D_1^{-1}\|_2$ and $\epsilon_2 = \|D_2^* - D_2^{-1}\|_2$. The proof will be completed by employing
$$|||\sin\Theta(U_1,\widetilde U_1)||| \le |||\sin\Theta(U_1,\widehat U_1)||| + |||\sin\Theta(\widehat U_1,\widetilde U_1)|||,$$
$$|||\sin\Theta(V_1,\widetilde V_1)||| \le |||\sin\Theta(V_1,\bar V_1)||| + |||\sin\Theta(\bar V_1,\widetilde V_1)|||,$$
since $|||\sin\Theta(\cdot,\cdot)|||$ is a metric on the space of $k$-dimensional subspaces [14].
11 Conclusions and Further Extensions to Diagonalizable Matrices
We have developed a relative perturbation theory for eigenspace and singular subspace variations under multiplicative perturbations. In the theory, extensions of the Davis-Kahan $\sin\theta$ theorems and the Wedin $\sin\theta$ theorems from the classical perturbation theory are made. Our unifying treatment covers almost all previously studied cases of the last six years or so.
Using techniques similar to those of this paper, one can also develop a relative perturbation theory for eigenspaces of diagonalizable matrices: $A$ and $\widetilde A = D_1 A D_2$ are diagonalizable, where $D_1$ and $D_2$ are close to the identity matrix. We outline a way of doing this. Let the eigendecompositions of $A$ and $\widetilde A$ be
$$A X = X\begin{pmatrix}\Lambda_1 & \\ & \Lambda_2\end{pmatrix} \quad\hbox{and}\quad \widetilde A\widetilde X = \widetilde X\begin{pmatrix}\widetilde\Lambda_1 & \\ & \widetilde\Lambda_2\end{pmatrix},$$
where $X = (X_1, X_2)$ and $\widetilde X = (\widetilde X_1,\widetilde X_2)\in\mathbb C^{n\times n}$ are nonsingular, $X_1,\widetilde X_1\in\mathbb C^{n\times k}$ ($1\le k<n$), and
$$\Lambda_1 = {\rm diag}(\lambda_1,\dots,\lambda_k), \quad \Lambda_2 = {\rm diag}(\lambda_{k+1},\dots,\lambda_n), \quad \widetilde\Lambda_1 = {\rm diag}(\widetilde\lambda_1,\dots,\widetilde\lambda_k), \quad \widetilde\Lambda_2 = {\rm diag}(\widetilde\lambda_{k+1},\dots,\widetilde\lambda_n);$$
the $\lambda_i$'s and $\widetilde\lambda_j$'s may be complex. Partition
$$X^{-1} = \begin{pmatrix}Y_1^*\\ Y_2^*\end{pmatrix} \quad\hbox{and}\quad \widetilde X^{-1} = \begin{pmatrix}\widetilde Y_1^*\\ \widetilde Y_2^*\end{pmatrix},$$
where $Y_1,\widetilde Y_1\in\mathbb C^{n\times k}$. Define $R \stackrel{\rm def}{=} \widetilde A X_1 - X_1\Lambda_1 = (\widetilde A - A)X_1$. We have
$$\widetilde Y_2^* R = \widetilde Y_2^*\widetilde A X_1 - \widetilde Y_2^* X_1\Lambda_1 = \widetilde\Lambda_2\widetilde Y_2^* X_1 - \widetilde Y_2^* X_1\Lambda_1,$$
$$\widetilde Y_2^* R = \widetilde Y_2^*(\widetilde A - A)X_1 = \widetilde Y_2^*(D_1 A D_2 - D_1 A + D_1 A - A)X_1 = \widetilde Y_2^*\big[\widetilde A(I - D_2^{-1}) + (D_1 - I)A\big]X_1 = \widetilde\Lambda_2\widetilde Y_2^*(I - D_2^{-1})X_1 + \widetilde Y_2^*(D_1 - I)X_1\Lambda_1.$$
Thus we have the following perturbation equation:
$$\widetilde\Lambda_2\widetilde Y_2^* X_1 - \widetilde Y_2^* X_1\Lambda_1 = \widetilde\Lambda_2\widetilde Y_2^*(I - D_2^{-1})X_1 + \widetilde Y_2^*(D_1 - I)X_1\Lambda_1,$$
from which various bounds on $\sin\Theta(X_1,\widetilde X_1)$ can be derived under certain conditions. For example, let $\eta_2 \stackrel{\rm def}{=} \min_{1\le i\le k,\ 1\le j\le n-k}\rho_2(\lambda_i,\widetilde\lambda_{k+j})$. If $\eta_2 > 0$, then by Lemma 7.1 we have
$$\|\widetilde Y_2^* X_1\|_F \le \frac1{\eta_2}\sqrt{\|\widetilde Y_2^*(I - D_2^{-1})X_1\|_F^2 + \|\widetilde Y_2^*(D_1 - I)X_1\|_F^2} \le \frac1{\eta_2}\,\|\widetilde Y_2\|_2\,\|X_1\|_2\sqrt{\|I - D_2^{-1}\|_F^2 + \|D_1 - I\|_F^2}.$$
Notice that by Lemma 2.2,
$$\|\sin\Theta(X_1,\widetilde X_1)\|_F = \|(\widetilde Y_2^*\widetilde Y_2)^{-1/2}\widetilde Y_2^* X_1(X_1^* X_1)^{-1/2}\|_F \le \|(\widetilde Y_2^*\widetilde Y_2)^{-1/2}\|_2\,\|\widetilde Y_2^* X_1\|_F\,\|(X_1^* X_1)^{-1/2}\|_2.$$
Then a bound on $\|\sin\Theta(X_1,\widetilde X_1)\|_F$ is immediately available.
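The outlined bound can be exercised numerically. The sketch below uses arbitrary test data (a moderately conditioned eigenvector matrix, so the eigenvalue orderings of $A$ and $\widetilde A$ match) and checks $\|\widetilde Y_2^* X_1\|_F$ against the first inequality above:

```python
import numpy as np

rng = np.random.default_rng(7)
n, k = 5, 2

# Diagonalizable A = X diag(lam) X^{-1} with a tiny eigenvalue cluster.
X = np.linalg.qr(rng.standard_normal((n, n)))[0] \
    + 0.1 * rng.standard_normal((n, n))
lam = np.array([1e-3, 2e-3, 1.0, 1.5, 2.0])
A = X @ np.diag(lam) @ np.linalg.inv(X)

I = np.eye(n)
D1 = I + 1e-4 * rng.standard_normal((n, n))
D2 = I + 1e-4 * rng.standard_normal((n, n))
At = D1 @ A @ D2                                  # A~ = D1 A D2

lamt, Xt = np.linalg.eig(At)
order = np.argsort(lamt.real)
lamt, Xt = lamt[order], Xt[:, order]
X1 = X[:, :k]
Y2t = np.linalg.inv(Xt)[k:]                       # Y2~* : last n-k rows of Xt^{-1}

lhs = np.linalg.norm(Y2t @ X1)                    # ||Y2~* X1||_F

rho2 = lambda a, b: abs(a - b) / np.hypot(abs(a), abs(b))
eta2 = min(rho2(lam[i], lamt[j]) for i in range(k) for j in range(k, n))

rhs = np.hypot(np.linalg.norm(Y2t @ (I - np.linalg.inv(D2)) @ X1),
               np.linalg.norm(Y2t @ (D1 - I) @ X1)) / eta2
assert lhs <= rhs + 1e-12
```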
Acknowledgement: I thank Professor W. Kahan for his consistent encouragement and support, and Professors J. Demmel and B. N. Parlett for helpful discussions.
References
[1] J. Barlow and J. Demmel. Computing accurate eigensystems of scaled diagonally dominant matrices. SIAM Journal on Numerical Analysis, 27:762-791, 1990.
[2] C. Davis and W. Kahan. The rotation of eigenvectors by a perturbation. III. SIAM Journal on Numerical Analysis, 7:1-46, 1970.
[3] J. Demmel and W. Gragg. On computing accurate singular values and eigenvalues of matrices with acyclic graphs. Linear Algebra and its Applications, 185:203-217, 1993.
[4] J. Demmel and W. Kahan. Accurate singular values of bidiagonal matrices. SIAM Journal on Scientific and Statistical Computing, 11:873-912, 1990.
[5] J. Demmel and K. Veselić. Jacobi's method is more accurate than QR. SIAM Journal on Matrix Analysis and Applications, 13(4):1204-1245, 1992.
[6] S. C. Eisenstat and I. C. F. Ipsen. Relative perturbation techniques for singular value problems. Research Report YALEU/DCS/RR-942, Department of Computer Science, Yale University, 1993.
[7] S. C. Eisenstat and I. C. F. Ipsen. Relative perturbation bounds for eigenspaces and singular vector subspaces. In J. G. Lewis, editor, Proceedings of the Fifth SIAM Conference on Applied Linear Algebra, pages 62-66, Philadelphia, 1994. SIAM Publications.
[8] R. A. Horn and C. R. Johnson. Matrix Analysis. Cambridge University Press, Cambridge, 1985.
[9] W. Kahan. Accurate eigenvalues of a symmetric tridiagonal matrix. Technical Report CS41, Computer Science Department, Stanford University, Stanford, CA, 1966 (revised June 1968).
[10] R.-C. Li. On perturbations of matrix pencils with real spectra. Mathematics of Computation, 62:231-265, 1994.
[11] R.-C. Li. Relative perturbation theory: (I) eigenvalue and singular value variations. Technical Report UCB//CSD-94-855, Computer Science Division, Department of EECS, University of California at Berkeley, 1994 (revised January 1996).
[12] R. Mathias. Spectral perturbation bounds for graded positive definite matrices. Manuscript, Department of Mathematics, College of William & Mary, 1994.
[13] A. M. Ostrowski. A quantitative formulation of Sylvester's law of inertia. Proc. National Acad. Sciences (USA), 45:740-744, 1959.
[14] G. W. Stewart and J.-G. Sun. Matrix Perturbation Theory. Academic Press, Boston, 1990.
[15] P.-Å. Wedin. Perturbation bounds in connection with singular value decomposition. BIT, 12:99-111, 1972.