Relative Perturbation Theory: (II) Eigenspace and Singular Subspace Variations

Ren-Cang Li
Mathematical Science Section, Oak Ridge National Laboratory
P.O. Box 2008, Bldg 6012, Oak Ridge, TN 37831-6367 (li@msr.epm.ornl.gov)

LAPACK Working Note #85 (first published July, 1994; revised January, 1996)

Abstract

The classical perturbation theory for Hermitian matrix eigenvalue and singular value problems provides bounds on invariant subspace variations that are proportional to the reciprocals of absolute gaps between subsets of spectra or subsets of singular values. These bounds may be bad news for invariant subspaces corresponding to clustered eigenvalues or clustered singular values whose magnitudes are much smaller than the norms of the matrices under consideration, when some of these clustered eigenvalues or clustered singular values are perfectly relatively distinguishable from the rest. In this paper, we consider how eigenspaces of a Hermitian matrix $A$ change when it is perturbed to $\tilde A = D^* A D$, and how singular subspaces of a (nonsquare) matrix $B$ change when it is perturbed to $\tilde B = D_1^* B D_2$, where $D$, $D_1$ and $D_2$ are assumed to be close to identity matrices of suitable dimensions, or either $D_1$ or $D_2$ is close to some unitary matrix. It is proved that under perturbations of this kind, the changes of invariant subspaces are proportional to the reciprocals of relative gaps between subsets of spectra or subsets of singular values. We have been able to extend the well-known Davis-Kahan $\sin\theta$ theorems and Wedin $\sin\theta$ theorems. As applications, we obtain bounds for perturbations of graded matrices.

This material is based in part upon work supported, during January, 1992 to August, 1995, by Argonne National Laboratory under grant No. 20552402 and by the University of Tennessee through the Advanced Research Projects Agency under contract No. DAAL03-91-C-0047, by the National Science Foundation under grant No. ASC-9005933, and by the National Science Infrastructure grants No.
CDA-8722788 and CDA-9401156, and supported, since August, 1995, by a Householder Fellowship in Scientific Computing at Oak Ridge National Laboratory, supported by the Applied Mathematical Sciences Research Program, Office of Energy Research, United States Department of Energy contract DE-AC05-96OR22464 with Lockheed Martin Energy Research Corp. Part of this work was done during the summer of 1994 while the author was at the Department of Mathematics, University of California at Berkeley.

1 Introduction

Let $A$ and $\tilde A$ be two $n \times n$ Hermitian matrices with eigendecompositions

  $A = U \Lambda U^* = (U_1, U_2) \begin{pmatrix} \Lambda_1 & \\ & \Lambda_2 \end{pmatrix} \begin{pmatrix} U_1^* \\ U_2^* \end{pmatrix}$,   (1.1)

  $\tilde A = \tilde U \tilde\Lambda \tilde U^* = (\tilde U_1, \tilde U_2) \begin{pmatrix} \tilde\Lambda_1 & \\ & \tilde\Lambda_2 \end{pmatrix} \begin{pmatrix} \tilde U_1^* \\ \tilde U_2^* \end{pmatrix}$,   (1.2)

where $U, \tilde U \in \mathcal{U}_n$ (the set of $n \times n$ unitary matrices), $U_1, \tilde U_1 \in \mathbb{C}^{n\times k}$ ($1 \le k < n$) and

  $\Lambda_1 = {\rm diag}(\lambda_1, \dots, \lambda_k)$,  $\Lambda_2 = {\rm diag}(\lambda_{k+1}, \dots, \lambda_n)$,   (1.3)
  $\tilde\Lambda_1 = {\rm diag}(\tilde\lambda_1, \dots, \tilde\lambda_k)$,  $\tilde\Lambda_2 = {\rm diag}(\tilde\lambda_{k+1}, \dots, \tilde\lambda_n)$.   (1.4)

Suppose now that $A$ and $\tilde A$ are close. The question is: How close are the eigenspaces spanned by $U_i$ and $\tilde U_i$? This question has been well answered by four celebrated theorems, the so-called $\sin\theta$, $\tan\theta$, $\sin 2\theta$ and $\tan 2\theta$ theorems due to Davis and Kahan [2, 1970], for arbitrary additive perturbations, in the sense that the perturbation to $A$ can be arbitrary as long as $\tilde A - A$ is kept small. It is proved there that the changes of invariant subspaces are proportional to the reciprocals of absolute gaps between subsets of spectra. This paper, however, addresses the same question under multiplicative perturbations: How close are the eigenspaces spanned by $U_i$ and $\tilde U_i$ under the assumption that $\tilde A = D^* A D$ for some $D$ close to $I$? Our bounds show that the changes of invariant subspaces are proportional to the reciprocals of relative gaps between subsets of spectra. A similar question for singular value decompositions will be answered as well. To be specific, we will deal with perturbations of the following kinds:

Eigenvalue problems:

1. $A$ and $\tilde A = D^* A D$ for the Hermitian case, where $D$ is nonsingular and close to the identity matrix.
2. $A = S^* H S$ and $\tilde A = S^* \tilde H S$ for the graded nonnegative Hermitian case, where it is assumed that $H$ and $\tilde H$ are nonsingular and often that $S$ is a highly graded diagonal matrix (this assumption is not necessary to our theorems).

Singular value problems:

1. $B$ and $\tilde B = D_1^* B D_2$, where $D_1$ and $D_2$ are nonsingular and close to identity matrices, or one of them is close to an identity matrix and the other to some unitary matrix.

2. $B = GS$ and $\tilde B = \tilde G S$ for the graded case, where it is assumed that $G$ and $\tilde G$ are nonsingular and often that $S$ is a highly graded diagonal matrix (this assumption is not necessary to our theorems).

These perturbations cover component-wise relative perturbations to entries of symmetric tridiagonal matrices with zero diagonal [4, 9], entries of bidiagonal and biacyclic matrices [1, 3, 4], and perturbations in graded nonnegative Hermitian matrices [5, 12], in graded matrices of singular value problems [5, 12], and more [6]. Recently, Eisenstat and Ipsen [7, 1994] launched an attack on the above-mentioned perturbations, except the graded cases. We will give a brief comparison between their results and ours.

This paper is organized as follows. We briefly review the Davis-Kahan $\sin\theta$ theorems for Hermitian matrices and their generalizations, the Wedin $\sin\theta$ theorems for singular value decompositions, in §3. We present in §4.1 our $\sin\theta$ theorems for eigenvalue problems for $A$ and $\tilde A = D^* A D$ and for graded nonnegative Hermitian matrices. Theorems for singular value problems for $B$ and $\tilde B = D_1^* B D_2$ and for graded matrices are given in §4.2. We discuss how to bound from below relative gaps, for example, between $\Lambda_1$ and $\tilde\Lambda_2$ by relative gaps between $\Lambda_1$ and $\Lambda_2$ in §5. A word is said in §6 regarding Eisenstat-Ipsen's theorems in comparison with ours. Detailed proofs are postponed to §§7, 8, 9, and 10. Finally, in §11, we present our conclusions and outline further possible extensions to diagonalizable matrices.
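The distinction between absolute and relative gaps that drives this paper can be seen numerically. The following sketch (ours, not from the paper; the eigenvalues are made up for illustration) compares the two gap notions for a graded example whose smallest eigenvalue is absolutely close to, yet relatively well separated from, the rest of the spectrum.

```python
import numpy as np

# Eigenvalues of a graded Hermitian matrix: many orders of magnitude apart.
lam = np.array([1e-8, 2e-8, 1e-4, 1.0])

# Absolute gap between {lam[0]} and the rest: tiny, so classical
# Davis-Kahan-type bounds (residual / absolute gap) are weak here.
abs_gap = min(abs(lam[0] - mu) for mu in lam[1:])

# Relative gap rho_2(a, b) = |a - b| / sqrt(a^2 + b^2): order one here,
# so relative bounds (residual / relative gap) remain strong.
rel_gap = min(abs(lam[0] - mu) / np.hypot(lam[0], mu) for mu in lam[1:])

assert abs(abs_gap - 1e-8) < 1e-20        # absolutely clustered
assert abs(rel_gap - 1 / 5**0.5) < 1e-9   # relatively well separated (~0.447)
```

The eigenvalue $10^{-8}$ is only $10^{-8}$ away from its neighbor in absolute terms, but the relative distance between them is about $0.447$, which is exactly the situation the abstract describes.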
2 Preliminaries

Throughout this paper, we follow the notation of the first part of this series, Li [11]. Most frequently used are the two kinds of relative distances $\rho_p$ and $\zeta$, defined for $\alpha, \tilde\alpha \in \mathbb{C}$ by

  $\rho_p(\alpha, \tilde\alpha) = \dfrac{|\alpha - \tilde\alpha|}{(|\alpha|^p + |\tilde\alpha|^p)^{1/p}}$  for $1 \le p \le \infty$,  and  $\zeta(\alpha, \tilde\alpha) = \dfrac{|\alpha - \tilde\alpha|}{\sqrt{|\alpha|\,|\tilde\alpha|}}$,

with the convention $0/0 = 0$ for convenience.

Lemma 2.1 (Li)
1. For $\alpha, \beta \in \mathbb{C}$, $\rho_p(\alpha, \beta) \le 2^{-1/p}\,\zeta(\alpha, \beta)$.
2. For $\alpha, \beta \in \mathbb{R}$ and $\alpha\beta \ge 0$, $\rho_p(\alpha, \beta) \le \rho_p(\alpha^2, \beta^2)$ and $2\zeta(\alpha, \beta) \le \zeta(\alpha^2, \beta^2)$.
3. $\rho_p$ is a metric on $\mathbb{R}$.
4. For $\alpha, \beta, \omega \ge 0$, $\zeta(\alpha, \beta) \le \zeta(\alpha, \omega) + \zeta(\omega, \beta) + \frac{1}{8}\zeta(\alpha, \beta)\zeta(\alpha, \omega)\zeta(\omega, \beta)$.

Since this paper concerns the variations of subspaces, we need metrics to measure the difference between two subspaces. In this we follow Davis and Kahan [2, 1970] and Stewart and Sun [14]. Let $X, \tilde X \in \mathbb{C}^{n\times k}$ ($n > k$) have full column rank $k$, and define the angle matrix $\Theta(X, \tilde X)$ between $X$ and $\tilde X$ as

  $\Theta(X, \tilde X) \stackrel{\rm def}{=} \arccos\big( (X^* X)^{-1/2}\, X^* \tilde X\, (\tilde X^* \tilde X)^{-1}\, \tilde X^* X\, (X^* X)^{-1/2} \big)^{1/2}$.

The canonical angles between the subspaces $\mathcal{X} = \mathcal{R}(X)$ and $\tilde{\mathcal{X}} = \mathcal{R}(\tilde X)$ are defined to be the singular values of the Hermitian matrix $\Theta(X, \tilde X)$, where $\mathcal{R}(X)$ denotes the subspace spanned by the column vectors of $X$. The following lemma is well known; for a proof, the reader is referred to, e.g., Li [10, Lemma 2.1].

Lemma 2.2 Suppose that $(\tilde X, \tilde X_1) \in \mathbb{C}^{n\times n}$ is a nonsingular matrix, and

  $(\tilde X, \tilde X_1)^{-1} = \begin{pmatrix} \tilde Y^* \\ \tilde Y_1^* \end{pmatrix}$,  $\tilde Y \in \mathbb{C}^{n\times k}$.

Then for any unitarily invariant norm $\|\cdot\|$,

  $\big\| \sin\Theta(X, \tilde X) \big\| = \big\| (\tilde Y_1^* \tilde Y_1)^{-1/2}\, \tilde Y_1^* X\, (X^* X)^{-1/2} \big\|$.

In this lemma, as well as in many other places in the rest of this paper, we speak of the "same" unitarily invariant norm $\|\cdot\|$ applying to matrices of different dimensions at the same time. Such applications of a unitarily invariant norm are understood in the following sense: first, there is a unitarily invariant norm $\|\cdot\|$ on $\mathbb{C}^{M\times N}$ for sufficiently large integers $M$ and $N$; then for a matrix $X \in \mathbb{C}^{m\times n}$ ($m \le M$ and $n \le N$), $\|X\|$ is defined by appending zero blocks to $X$ to make it $M \times N$ and taking the unitarily invariant norm of the enlarged matrix.
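A small executable sketch of the two relative distances may help fix the notation (our code, assuming the definitions of $\rho_p$ and $\zeta$ reconstructed above); it also spot-checks item 1 of Lemma 2.1 on a few pairs.

```python
import numpy as np

def rho_p(a, b, p=2):
    # rho_p(a, b) = |a - b| / (|a|^p + |b|^p)^(1/p), with the convention 0/0 = 0
    num = abs(a - b)
    if num == 0:
        return 0.0
    return num / (abs(a)**p + abs(b)**p)**(1.0 / p)

def zeta(a, b):
    # zeta(a, b) = |a - b| / sqrt(|a| |b|), with the convention 0/0 = 0
    num = abs(a - b)
    if num == 0:
        return 0.0
    return num / np.sqrt(abs(a) * abs(b))

# Spot-check Lemma 2.1, item 1: rho_p(a, b) <= 2**(-1/p) * zeta(a, b)
for a, b in [(3.0, 5.0), (1e-8, 2e-8), (-2.0, 7.0)]:
    for p in (1, 2, 4):
        assert rho_p(a, b, p) <= 2**(-1.0 / p) * zeta(a, b) + 1e-15
```

Note that $\zeta$ can be infinite when exactly one argument is zero, which is why the theorems using $\zeta$-gaps pair nonzero quantities.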
Taking $X = U_1$ and $\tilde X = \tilde U_1$ (see (1.1) and (1.2)), with Lemma 2.2 one has

  $\Theta(U_1, \tilde U_1) = \arccos (U_1^* \tilde U_1 \tilde U_1^* U_1)^{1/2}$  and  $\|\sin\Theta(U_1, \tilde U_1)\| = \|\tilde U_2^* U_1\|$.   (2.5)

For more discussion of angles between subspaces, the reader is referred to Davis and Kahan [2] and Stewart and Sun [14, Chapters I and II].

3 Davis-Kahan sin θ Theorems and Wedin sin θ Theorems

3.1 Eigenspace Variations

Let $A$ and $\tilde A$ be two Hermitian matrices whose eigendecompositions are given by (1.1) and (1.2), where $U, \tilde U \in \mathcal{U}_n$, $U_1, \tilde U_1 \in \mathbb{C}^{n\times k}$ ($1 \le k < n$), and the $\Lambda_i$'s and $\tilde\Lambda_j$'s are defined as in (1.3) and (1.4). Define the residual

  $R \stackrel{\rm def}{=} \tilde A U_1 - U_1 \Lambda_1 = (\tilde A - A) U_1$.   (3.1)

The following two theorems are the matrix versions of two $\sin\theta$ theorems due to Davis and Kahan [2, 1970].

Theorem 3.1 (Davis-Kahan) Let $A$ and $\tilde A$ be two Hermitian matrices with eigendecompositions (1.1), (1.2), (1.3), and (1.4). If

  $\delta \stackrel{\rm def}{=} \min_{1\le i\le k,\,1\le j\le n-k} |\lambda_i - \tilde\lambda_{k+j}| > 0$,

then

  $\|\sin\Theta(U_1, \tilde U_1)\|_F \le \dfrac{\|R\|_F}{\delta} = \dfrac{\|(\tilde A - A) U_1\|_F}{\delta}$.   (3.2)

In this theorem, the spectrum of $\Lambda_1$ and that of $\tilde\Lambda_2$ are only required to be disjoint. In the next theorem they are required, more strongly, to be well separated by intervals.

Theorem 3.2 (Davis-Kahan) Let $A$ and $\tilde A$ be two Hermitian matrices with eigendecompositions (1.1), (1.2), (1.3), and (1.4). Assume there is an interval $[\alpha, \beta]$ and a $\delta > 0$ such that the spectrum of $\Lambda_1$ lies entirely in $[\alpha, \beta]$ while that of $\tilde\Lambda_2$ lies entirely outside of $(\alpha - \delta, \beta + \delta)$ (or such that the spectrum of $\tilde\Lambda_2$ lies entirely in $[\alpha, \beta]$ while that of $\Lambda_1$ lies entirely outside of $(\alpha - \delta, \beta + \delta)$). Then for any unitarily invariant norm $\|\cdot\|$,

  $\|\sin\Theta(U_1, \tilde U_1)\| \le \dfrac{\|R\|}{\delta} = \dfrac{\|(\tilde A - A) U_1\|}{\delta}$.   (3.3)

3.2 Singular Space Variations

Now we turn to perturbations of singular value decompositions (SVDs). Let $B$ and $\tilde B$ be two $m\times n$ ($m \ge n$) complex matrices with SVDs

  $B = U \Sigma V^* = (U_1, U_2) \begin{pmatrix} \Sigma_1 & 0 \\ 0 & \Sigma_2 \\ 0 & 0 \end{pmatrix} \begin{pmatrix} V_1^* \\ V_2^* \end{pmatrix}$,   (3.4)

  $\tilde B = \tilde U \tilde\Sigma \tilde V^* = (\tilde U_1, \tilde U_2) \begin{pmatrix} \tilde\Sigma_1 & 0 \\ 0 & \tilde\Sigma_2 \\ 0 & 0 \end{pmatrix} \begin{pmatrix} \tilde V_1^* \\ \tilde V_2^* \end{pmatrix}$,   (3.5)
where $U, \tilde U \in \mathcal{U}_m$, $V, \tilde V \in \mathcal{U}_n$, $U_1, \tilde U_1 \in \mathbb{C}^{m\times k}$, $V_1, \tilde V_1 \in \mathbb{C}^{n\times k}$ ($1 \le k < n$) and

  $\Sigma_1 = {\rm diag}(\sigma_1, \dots, \sigma_k)$,  $\Sigma_2 = {\rm diag}(\sigma_{k+1}, \dots, \sigma_n)$,   (3.6)
  $\tilde\Sigma_1 = {\rm diag}(\tilde\sigma_1, \dots, \tilde\sigma_k)$,  $\tilde\Sigma_2 = {\rm diag}(\tilde\sigma_{k+1}, \dots, \tilde\sigma_n)$.   (3.7)

Define the residuals

  $R_R \stackrel{\rm def}{=} \tilde B V_1 - U_1 \Sigma_1 = (\tilde B - B) V_1$  and  $R_L \stackrel{\rm def}{=} \tilde B^* U_1 - V_1 \Sigma_1 = (\tilde B - B)^* U_1$.   (3.8)

The following two theorems are due to Wedin [15, 1972].

Theorem 3.3 (Wedin) Let $B$ and $\tilde B$ be two $m\times n$ ($m \ge n$) complex matrices with SVDs (3.4), (3.5), (3.6), and (3.7). If

  $\delta \stackrel{\rm def}{=} \min\Big\{ \min_{1\le i\le k,\,1\le j\le n-k} |\sigma_i - \tilde\sigma_{k+j}|,\ \min_{1\le i\le k} \sigma_i \Big\} > 0$,

then

  $\sqrt{\|\sin\Theta(U_1,\tilde U_1)\|_F^2 + \|\sin\Theta(V_1,\tilde V_1)\|_F^2} \le \dfrac{\sqrt{\|R_R\|_F^2 + \|R_L\|_F^2}}{\delta} = \dfrac{\sqrt{\|(\tilde B - B)V_1\|_F^2 + \|(\tilde B - B)^* U_1\|_F^2}}{\delta}$.   (3.9)

Theorem 3.4 (Wedin) Let $B$ and $\tilde B$ be two $m\times n$ ($m \ge n$) complex matrices with SVDs (3.4), (3.5), (3.6), and (3.7). If there exist $\alpha > 0$ and $\delta > 0$ such that

  $\min_{1\le i\le k} \sigma_i \ge \alpha + \delta$  and  $\max_{1\le j\le n-k} \tilde\sigma_{k+j} \le \alpha$,

then for any unitarily invariant norm $\|\cdot\|$,

  $\max\big\{ \|\sin\Theta(U_1,\tilde U_1)\|, \|\sin\Theta(V_1,\tilde V_1)\| \big\} \le \dfrac{\max\{\|R_R\|, \|R_L\|\}}{\delta} = \dfrac{\max\big\{ \|(\tilde B - B)V_1\|, \|(\tilde B - B)^* U_1\| \big\}}{\delta}$.   (3.10)

4 Relative Perturbation Theorems

4.1 Eigenspace Variations

The following theorem is an extension of Theorem 3.1.

Theorem 4.1 Let $A$ and $\tilde A = D^* A D$ be two $n\times n$ Hermitian matrices with eigendecompositions (1.1), (1.2), (1.3), and (1.4), where $D$ is nonsingular and close to $I$. If

  $\eta_2 \stackrel{\rm def}{=} \min_{1\le i\le k,\,1\le j\le n-k} \rho_2(\lambda_i, \tilde\lambda_{k+j}) > 0$,

then

  $\|\sin\Theta(U_1, \tilde U_1)\|_F \le \dfrac{\sqrt{\|(I - D^{-1}) U_1\|_F^2 + \|(I - D^*) U_1\|_F^2}}{\eta_2}$.   (4.1)

By imposing a stronger condition on the separation between the spectra of $\tilde\Lambda_2$ and $\Lambda_1$, we obtain the following bound on any unitarily invariant norm of $\sin\Theta(U_1, \tilde U_1)$.

Theorem 4.2 Let $A$ and $\tilde A = D^* A D$ be two $n\times n$ Hermitian matrices with eigendecompositions (1.1), (1.2), (1.3), and (1.4), where $D$ is nonsingular and close to $I$. Assume that there exist $\alpha > 0$ and $\delta > 0$ such that the spectrum of $\Lambda_1$ lies entirely in $[-\alpha, \alpha]$ while that of $\tilde\Lambda_2$ lies entirely outside $(-\alpha - \delta, \alpha + \delta)$ (or such that the spectrum of $\Lambda_1$ lies entirely outside $(-\alpha - \delta, \alpha + \delta)$ while that of $\tilde\Lambda_2$ lies entirely in $[-\alpha, \alpha]$).
Then for any unitarily invariant norm $\|\cdot\|$,

  $\|\sin\Theta(U_1,\tilde U_1)\| \le \dfrac{\sqrt[q]{\|(I-D^{-1})U_1\|^q + \|(I-D^*)U_1\|^q}}{\eta_p}$,   (4.2)

where $\eta_p \stackrel{\rm def}{=} \rho_p(\alpha, \alpha+\delta)$ and $1/p + 1/q = 1$.

Now we consider eigenspace variations for a graded Hermitian matrix $A = S^* H S \in \mathbb{C}^{n\times n}$ perturbed to $\tilde A = S^* \tilde H S$. Set $\Delta H \stackrel{\rm def}{=} \tilde H - H$.

Theorem 4.3 Let $A = S^* H S$ and $\tilde A = S^* \tilde H S$ be two $n\times n$ Hermitian matrices with eigendecompositions (1.1), (1.2), (1.3), and (1.4), where $H$ is positive definite and $\|H^{-1}\|_2 \|\Delta H\|_2 < 1$. If

  $\eta \stackrel{\rm def}{=} \min_{1\le i\le k,\,1\le j\le n-k} \zeta(\lambda_i, \tilde\lambda_{k+j}) > 0$,

then

  $\|\sin\Theta(U_1,\tilde U_1)\|_F \le \dfrac{\|D - D^{-1}\|_F}{\eta} \le \dfrac{1}{\eta} \cdot \dfrac{\|H^{-1}\|_2 \|\Delta H\|_F}{\sqrt{1 - \|H^{-1}\|_2 \|\Delta H\|_2}}$,   (4.3)

where $D = (I + H^{-1/2}(\Delta H) H^{-1/2})^{1/2} = D^*$.

Theorem 4.4 Let $A = S^* H S$ and $\tilde A = S^* \tilde H S$ be two $n\times n$ Hermitian matrices with eigendecompositions (1.1), (1.2), (1.3), and (1.4), where $H$ is positive definite and $\|H^{-1}\|_2 \|\Delta H\|_2 < 1$. Assume that there exist $\alpha > 0$ and $\delta > 0$ such that

  $\max_{1\le i\le k} \lambda_i \le \alpha$ and $\min_{1\le j\le n-k} \tilde\lambda_{k+j} \ge \alpha + \delta$,  or  $\min_{1\le i\le k} \lambda_i \ge \alpha + \delta$ and $\max_{1\le j\le n-k} \tilde\lambda_{k+j} \le \alpha$.

Then for any unitarily invariant norm $\|\cdot\|$,

  $\|\sin\Theta(U_1,\tilde U_1)\| \le \dfrac{\|D - D^{-1}\|}{\eta} \le \dfrac{1}{\eta} \cdot \dfrac{\|H^{-1}\|_2 \|\Delta H\|}{\sqrt{1 - \|H^{-1}\|_2 \|\Delta H\|_2}}$,   (4.4)

where $\eta \stackrel{\rm def}{=} \zeta(\alpha, \alpha+\delta)$ and $D = (I + H^{-1/2}(\Delta H) H^{-1/2})^{1/2} = D^*$.

4.2 Singular Space Variations

The following two theorems concern singular subspace perturbations.

Theorem 4.5 Let $B$ and $\tilde B = D_1^* B D_2$ be two $m\times n$ ($m \ge n$) (complex) matrices with SVDs (3.4), (3.5), (3.6), and (3.7), where $D_1$ and $D_2$ are nonsingular and close to identities. Let

  $\eta_2 \stackrel{\rm def}{=} \begin{cases} \min\Big\{ \min_{1\le i\le k,\,1\le j\le n-k} \rho_2(\sigma_i, \tilde\sigma_{k+j}),\ \min_{1\le i\le k} \rho_2(\sigma_i, 0) \Big\}, & \text{if } m > n; \\ \min_{1\le i\le k,\,1\le j\le n-k} \rho_2(\sigma_i, \tilde\sigma_{k+j}), & \text{otherwise}. \end{cases}$   (4.5)

If $\eta_2 > 0$, then

  $\sqrt{\|\sin\Theta(U_1,\tilde U_1)\|_F^2 + \|\sin\Theta(V_1,\tilde V_1)\|_F^2}$   (4.6)
  $\le \dfrac{\sqrt{\|(I-D_1^{-1})U_1\|_F^2 + \|(I-D_1^*)U_1\|_F^2 + \|(I-D_2^{-1})V_1\|_F^2 + \|(I-D_2^*)V_1\|_F^2}}{\eta_2}$.

Theorem 4.6 Let $B$ and $\tilde B = D_1^* B D_2$ be two $m\times n$ ($m \ge n$) (complex) matrices with SVDs (3.4), (3.5), (3.6), and (3.7), where $D_1$ and $D_2$ are nonsingular and close to identities.
If there exist $\alpha > 0$ and $\delta > 0$ such that

  $\min_{1\le i\le k}\sigma_i \ge \alpha+\delta$  and  $\max_{1\le j\le n-k}\tilde\sigma_{k+j} \le \alpha$,

then for any unitarily invariant norm $\|\cdot\|$,

  $\max\big\{ \|\sin\Theta(U_1,\tilde U_1)\|, \|\sin\Theta(V_1,\tilde V_1)\| \big\}$   (4.7)
  $\le \dfrac{1}{\eta_p} \max\Big\{ \sqrt[q]{\|(I-D_2^{-1})V_1\|^q + \|(D_1^*-I)U_1\|^q},\ \sqrt[q]{\|(I-D_1^{-1})U_1\|^q + \|(D_2^*-I)V_1\|^q} \Big\}$,

and

  $\left\| \begin{pmatrix} \sin\Theta(U_1,\tilde U_1) \\ \sin\Theta(V_1,\tilde V_1) \end{pmatrix} \right\| \le \dfrac{1}{\eta_p} \sqrt[q]{\left\| \begin{pmatrix} (I-D_1^{-1})U_1 \\ (I-D_2^{-1})V_1 \end{pmatrix} \right\|^q + \left\| \begin{pmatrix} (D_1^*-I)U_1 \\ (D_2^*-I)V_1 \end{pmatrix} \right\|^q}$,   (4.8)

where $\eta_p \stackrel{\rm def}{=} \rho_p(\alpha, \alpha+\delta)$ and $1/p + 1/q = 1$.

In Theorems 4.5 and 4.6 we assumed that both $D_1$ and $D_2$ are close to identity matrices. Intuitively, however, $D_2$ should not affect $U_1$ much as long as it is close to a unitary matrix; in fact, if $D_2 \in \mathcal{U}_n$, it does not affect $U_1$ at all. The following theorems confirm this observation.

Theorem 4.7 Let $B$ and $\tilde B = D_1^* B D_2$ be two $m\times n$ ($m \ge n$) (complex) matrices with SVDs (3.4), (3.5), (3.6), and (3.7), where $D_1$ and $D_2$ are close to some unitary matrices. Let $\eta_2$ be defined by (4.5) and set

  $\eta \stackrel{\rm def}{=} \begin{cases} \min\Big\{ \min_{1\le i\le k,\,1\le j\le n-k} \zeta(\sigma_i, \tilde\sigma_{k+j}),\ \min_{1\le i\le k} \zeta(\sigma_i, 0) \Big\}, & \text{if } m > n; \\ \min_{1\le i\le k,\,1\le j\le n-k} \zeta(\sigma_i, \tilde\sigma_{k+j}), & \text{otherwise}. \end{cases}$   (4.9)

Assume(1)

  $\eta_2 > \dfrac{1}{2\sqrt 2} \max\big\{ \|D_1 - D_1^{-*}\|_2, \|D_2 - D_2^{-*}\|_2 \big\}$.   (4.10)

If $D_1$ is close to identity, then

  $\|\sin\Theta(U_1,\tilde U_1)\|_F \le \dfrac{\sqrt{\|(I-D_1^{-1})U_1\|_F^2 + \|(I-D_1^*)U_1\|_F^2}}{\eta_2 - 2^{-3/2}\varepsilon_2} + \Big(1+\dfrac{\varepsilon_1^2}{16}\Big)\dfrac{\|D_2-D_2^{-*}\|_F}{2\eta-\varepsilon_1}$,   (4.11)

  $\|\sin\Theta(U_1,\tilde U_1)\|_F \le \dfrac{\sqrt{\|(I-D_1^{-1})U_1\|_F^2 + \|(I-D_1^*)U_1\|_F^2}}{\eta_2 - 2^{-3/2}\varepsilon_2} + \dfrac{\|D_2-D_2^{-*}\|_F}{2^{3/2}\eta_2-\varepsilon_2}$.   (4.12)

If $D_2$ is close to identity, then

  $\|\sin\Theta(V_1,\tilde V_1)\|_F \le \dfrac{\sqrt{\|(I-D_2^{-1})V_1\|_F^2 + \|(I-D_2^*)V_1\|_F^2}}{\eta_2 - 2^{-3/2}\varepsilon_1} + \Big(1+\dfrac{\varepsilon_2^2}{16}\Big)\dfrac{\|D_1-D_1^{-*}\|_F}{2\eta-\varepsilon_2}$,   (4.13)

  $\|\sin\Theta(V_1,\tilde V_1)\|_F \le \dfrac{\sqrt{\|(I-D_2^{-1})V_1\|_F^2 + \|(I-D_2^*)V_1\|_F^2}}{\eta_2 - 2^{-3/2}\varepsilon_1} + \dfrac{\|D_1-D_1^{-*}\|_F}{2^{3/2}\eta_2-\varepsilon_1}$,   (4.14)

where $\varepsilon_1 = \|D_1 - D_1^{-*}\|_2$ and $\varepsilon_2 = \|D_2 - D_2^{-*}\|_2$.

(1) This implies, by Lemma 2.1, $\eta > \frac{1}{2}\max\{\|D_1 - D_1^{-*}\|_2, \|D_2 - D_2^{-*}\|_2\}$.

Inequalities (4.11) and (4.12), which differ slightly in their last terms, clearly say that $D_2$ contributes to $\sin\Theta(U_1,\tilde U_1)$ only through its departure from some unitary matrix, and similarly for (4.13) and (4.14).

Remark.
When one of $D_1$ and $D_2$ is $I$, assumption (4.10) can actually be weakened to $\eta_2 > 0$, as shall be seen from our proofs.

Theorem 4.8 Let $B$ and $\tilde B = D_1^* B D_2$ be two $m\times n$ ($m \ge n$) (complex) matrices with SVDs (3.4), (3.5), (3.6), and (3.7), where $D_1$ and $D_2$ are close to some unitary matrices. Suppose that there exist $\alpha > 0$ and $\delta > 0$ such that

  $\min_{1\le i\le k}\sigma_i \ge \alpha+\delta$  and  $\max_{1\le j\le n-k}\tilde\sigma_{k+j} \le \alpha$.

Assume(2)

  $\rho_p(\alpha, \alpha+\delta) > \dfrac{1}{2^{1+1/p}} \max\big\{ \|D_1 - D_1^{-*}\|_2, \|D_2 - D_2^{-*}\|_2 \big\}$.   (4.15)

If $D_1$ is close to identity, then for any unitarily invariant norm $\|\cdot\|$,

  $\|\sin\Theta(U_1,\tilde U_1)\| \le \dfrac{\sqrt[q]{\|(I-D_1^{-1})U_1\|^q + \|(I-D_1^*)U_1\|^q}}{\eta_p - 2^{-1-1/p}\varepsilon_2} + \Big(1+\dfrac{\varepsilon_1^2}{16}\Big)\dfrac{\|D_2-D_2^{-*}\|}{2\eta-\varepsilon_1}$,   (4.16)

  $\|\sin\Theta(U_1,\tilde U_1)\| \le \dfrac{\sqrt[q]{\|(I-D_1^{-1})U_1\|^q + \|(I-D_1^*)U_1\|^q}}{\eta_p - 2^{-1-1/p}\varepsilon_2} + \dfrac{\|D_2-D_2^{-*}\|}{2^{1+1/p}\eta_p-\varepsilon_2}$.   (4.17)

If $D_2$ is close to identity, then

  $\|\sin\Theta(V_1,\tilde V_1)\| \le \dfrac{\sqrt[q]{\|(I-D_2^{-1})V_1\|^q + \|(I-D_2^*)V_1\|^q}}{\eta_p - 2^{-1-1/p}\varepsilon_1} + \Big(1+\dfrac{\varepsilon_2^2}{16}\Big)\dfrac{\|D_1-D_1^{-*}\|}{2\eta-\varepsilon_2}$,   (4.18)

  $\|\sin\Theta(V_1,\tilde V_1)\| \le \dfrac{\sqrt[q]{\|(I-D_2^{-1})V_1\|^q + \|(I-D_2^*)V_1\|^q}}{\eta_p - 2^{-1-1/p}\varepsilon_1} + \dfrac{\|D_1-D_1^{-*}\|}{2^{1+1/p}\eta_p-\varepsilon_1}$,   (4.19)

where $\eta_p \stackrel{\rm def}{=} \rho_p(\alpha, \alpha+\delta)$, $\eta \stackrel{\rm def}{=} \zeta(\alpha, \alpha+\delta)$, $1/p+1/q=1$, and $\varepsilon_1 = \|D_1 - D_1^{-*}\|_2$ and $\varepsilon_2 = \|D_2 - D_2^{-*}\|_2$.

(2) This implies, by Lemma 2.1, $\eta > \frac{1}{2}\max\{\|D_1 - D_1^{-*}\|_2, \|D_2 - D_2^{-*}\|_2\}$.

Remark. If either $D_1$ or $D_2$ is $I$, assumption (4.15) can actually be removed.

Now let us briefly mention a possible application of Theorems 4.5, 4.6, 4.7 and 4.8. It has to do with deflation in computing the singular value system of a bidiagonal matrix. Taking account of the remarks we have made, we get

Corollary 4.1 Assume $D_1 = I$ and $D_2$ takes the form

  $D_2 = \begin{pmatrix} I & X \\ & I \end{pmatrix}$,

where $X$ is a matrix of suitable dimensions. Let $\eta_2$ be defined by (4.5), and $\eta$ by (4.9). If $\eta_2 > 0$, then

  $\|\sin\Theta(U_1,\tilde U_1)\|_F \le \dfrac{\|X\|_F}{\sqrt 2\,\eta}$,   (4.20)

  $\sqrt{\|\sin\Theta(U_1,\tilde U_1)\|_F^2 + \|\sin\Theta(V_1,\tilde V_1)\|_F^2} \le \dfrac{\sqrt 2\,\|X\|_F}{\eta_2}$.   (4.21)

Proof: Inequality (4.21) follows from (4.6). Inequality (4.20) follows from (4.11) and

  $D_2 - D_2^{-*} = \begin{pmatrix} & X \\ X^* & \end{pmatrix} \ \Longrightarrow\ \|D_2 - D_2^{-*}\|_F = \sqrt 2\,\|X\|_F$.

Corollary 4.2 Assume $D_2 = I$ and $D_1$ takes the form

  $D_1 = \begin{pmatrix} I & X \\ & I \end{pmatrix}$,

where $X$ is a matrix of suitable dimensions.
Let $\eta_2$ be defined by (4.5), and $\eta$ by (4.9). If $\eta_2 > 0$, then

  $\|\sin\Theta(V_1,\tilde V_1)\|_F \le \dfrac{\|X\|_F}{\sqrt 2\,\eta}$,  and  $\sqrt{\|\sin\Theta(U_1,\tilde U_1)\|_F^2 + \|\sin\Theta(V_1,\tilde V_1)\|_F^2} \le \dfrac{\sqrt 2\,\|X\|_F}{\eta_2}$.

Corollary 4.3 Assume $D_1 = I$ and $D_2$ takes the form

  $D_2 = \begin{pmatrix} I & X \\ & I \end{pmatrix}$,

where $X$ is a matrix of suitable dimensions. Suppose that there exist $\alpha > 0$ and $\delta > 0$ such that

  $\min_{1\le i\le k}\sigma_i \ge \alpha+\delta$  and  $\max_{1\le j\le n-k}\tilde\sigma_{k+j} \le \alpha$.

Then

  $\|\sin\Theta(U_1,\tilde U_1)\| \le \dfrac{1}{2\eta}\left\| \begin{pmatrix} & X \\ X^* & \end{pmatrix} \right\|$,  $\|\sin\Theta(V_1,\tilde V_1)\| \le \dfrac{\|X\|}{\eta_p}$,   (4.22)

where $\eta_p \stackrel{\rm def}{=} \rho_p(\alpha,\alpha+\delta)$ and $\eta \stackrel{\rm def}{=} \zeta(\alpha,\alpha+\delta)$.

Proof: The first inequality in (4.22) follows from (4.16), and the second from (4.7).

Corollary 4.4 Assume $D_2 = I$ and $D_1$ takes the form

  $D_1 = \begin{pmatrix} I & X \\ & I \end{pmatrix}$,

where $X$ is a matrix of suitable dimensions. Suppose that there exist $\alpha > 0$ and $\delta > 0$ such that

  $\min_{1\le i\le k}\sigma_i \ge \alpha+\delta$  and  $\max_{1\le j\le n-k}\tilde\sigma_{k+j} \le \alpha$.

Then

  $\|\sin\Theta(U_1,\tilde U_1)\| \le \dfrac{\|X\|}{\eta_p}$,  $\|\sin\Theta(V_1,\tilde V_1)\| \le \dfrac{1}{2\eta}\left\| \begin{pmatrix} & X \\ X^* & \end{pmatrix} \right\|$,

where $\eta_p \stackrel{\rm def}{=} \rho_p(\alpha,\alpha+\delta)$ and $\eta \stackrel{\rm def}{=} \zeta(\alpha,\alpha+\delta)$.

Now we consider singular space variations for a graded matrix $B = GS \in \mathbb{C}^{n\times n}$ perturbed to $\tilde B = \tilde G S \in \mathbb{C}^{n\times n}$, where $G$ is nonsingular. Set $\Delta G \stackrel{\rm def}{=} \tilde G - G$. If $\|(\Delta G)G^{-1}\|_2 < 1$, then $\tilde G = G + \Delta G = [I + (\Delta G)G^{-1}]G$ is nonsingular also.

Theorem 4.9 Let $B = GS \in \mathbb{C}^{n\times n}$ and $\tilde B = \tilde G S \in \mathbb{C}^{n\times n}$ with SVDs (3.4), (3.5), (3.6), and (3.7), where $G$ is nonsingular. Assume $\|(\Delta G)G^{-1}\|_2 < 1$. If

  $\eta_2 \stackrel{\rm def}{=} \min_{1\le i\le k,\,1\le j\le n-k} \rho_2(\sigma_i, \tilde\sigma_{k+j}) > 0$,

then

  $\sqrt{\|\sin\Theta(U_1,\tilde U_1)\|_F^2 + \|\sin\Theta(V_1,\tilde V_1)\|_F^2}$   (4.23)
  $\le \dfrac{\sqrt{\|(\Delta G)G^{-1}U_1\|_F^2 + \|[I + G^{-*}(\Delta G)^*]^{-1}G^{-*}(\Delta G)^*U_1\|_F^2}}{\eta_2}$
  $\le \dfrac{\|G^{-1}\|_2}{\eta_2} \sqrt{1 + \dfrac{1}{(1 - \|G^{-1}\|_2\|\Delta G\|_2)^2}}\ \|\Delta G\|_F$,

and

  $\|\sin\Theta(V_1,\tilde V_1)\|_F \le \dfrac{\big\|[I + (\Delta G)G^{-1}] - [I + (\Delta G)G^{-1}]^{-*}\big\|_F}{2\eta}$   (4.24)
  $\le \dfrac{1}{2\eta}\left( \|(\Delta G)G^{-1} + G^{-*}(\Delta G)^*\|_F + \|(\Delta G)G^{-1}\|_F \dfrac{\|(\Delta G)G^{-1}\|_2}{1 - \|(\Delta G)G^{-1}\|_2} \right)$
  $\le \dfrac{1}{2\eta}\left( 1 + \dfrac{1}{1 - \|G^{-1}\|_2\|\Delta G\|_2} \right) \|G^{-1}\|_2 \|\Delta G\|_F$,

where $\eta \stackrel{\rm def}{=} \min_{1\le i\le k,\,1\le j\le n-k} \zeta(\sigma_i, \tilde\sigma_{k+j})$.

Proof: Write $\tilde B = \tilde G S = [I + (\Delta G)G^{-1}]GS = D_1^* B D_2$, where $D_1^* = I + (\Delta G)G^{-1}$ and $D_2 = I$. Then apply Theorem 4.5 to get (4.23), and apply Theorem 4.7 to get (4.24).

Theorem 4.10 Let $B = GS \in \mathbb{C}^{n\times n}$ and $\tilde B = \tilde G S \in \mathbb{C}^{n\times n}$ with SVDs (3.4), (3.5), (3.6), and (3.7), where $G$ is nonsingular. Assume $\|(\Delta G)G^{-1}\|_2 < 1$.
If there exist $\alpha > 0$ and $\delta > 0$ such that

  $\min_{1\le i\le k}\sigma_i \ge \alpha+\delta$ and $\max_{1\le j\le n-k}\tilde\sigma_{k+j} \le \alpha$,

or, the other way around, i.e.,

  $\max_{1\le i\le k}\sigma_i \le \alpha$ and $\min_{1\le j\le n-k}\tilde\sigma_{k+j} \ge \alpha+\delta$,

then for any unitarily invariant norm $\|\cdot\|$,

  $\max\big\{ \|\sin\Theta(U_1,\tilde U_1)\|, \|\sin\Theta(V_1,\tilde V_1)\| \big\}$   (4.25)
  $\le \dfrac{1}{\eta_p}\max\Big\{ \|(\Delta G)G^{-1}U_1\|,\ \|[I + G^{-*}(\Delta G)^*]^{-1}G^{-*}(\Delta G)^*U_1\| \Big\} \le \dfrac{\|G^{-1}\|_2\,\|\Delta G\|}{(1 - \|G^{-1}\|_2\|\Delta G\|_2)\,\eta_p}$,

  $\left\| \begin{pmatrix} \sin\Theta(U_1,\tilde U_1) \\ \sin\Theta(V_1,\tilde V_1) \end{pmatrix} \right\| \le \dfrac{\sqrt[q]{\|(\Delta G)G^{-1}U_1\|^q + \|[I + G^{-*}(\Delta G)^*]^{-1}G^{-*}(\Delta G)^*U_1\|^q}}{\eta_p}$   (4.26)
  $\le \dfrac{\|G^{-1}\|_2}{\eta_p}\sqrt[q]{1 + \dfrac{1}{(1 - \|G^{-1}\|_2\|\Delta G\|_2)^q}}\ \|\Delta G\|$,

and

  $\|\sin\Theta(V_1,\tilde V_1)\| \le \dfrac{\big\|[I + (\Delta G)G^{-1}] - [I + (\Delta G)G^{-1}]^{-*}\big\|}{2\eta}$   (4.27)
  $\le \dfrac{1}{2\eta}\left( \|(\Delta G)G^{-1} + G^{-*}(\Delta G)^*\| + \|(\Delta G)G^{-1}\| \dfrac{\|(\Delta G)G^{-1}\|_2}{1 - \|(\Delta G)G^{-1}\|_2} \right)$
  $\le \dfrac{1}{2\eta}\left( 1 + \dfrac{1}{1 - \|G^{-1}\|_2\|\Delta G\|_2} \right) \|G^{-1}\|_2 \|\Delta G\|$,

where $\eta_p \stackrel{\rm def}{=} \rho_p(\alpha, \alpha+\delta)$ and $\eta \stackrel{\rm def}{=} \zeta(\alpha, \alpha+\delta)$.

Proof: Again write $\tilde B = \tilde G S = [I + (\Delta G)G^{-1}]GS = D_1^* B D_2$, where $D_1^* = I + (\Delta G)G^{-1}$ and $D_2 = I$. Then apply Theorem 4.6 to get (4.25) and (4.26), and apply Theorem 4.8 to get (4.27).

Remark. Inequalities (4.24) and (4.27) may provide much tighter bounds than (4.23) and (4.25), especially when $(\Delta G)G^{-1}$ is very close to a skew-Hermitian matrix.

5 More on Relative Gaps

In the theorems of §4, various relative gaps play an indispensable role. Those gaps are imposed either between $\Lambda_1$ and $\tilde\Lambda_2$ or between $\Sigma_1$ and $\tilde\Sigma_2$. In some applications, it may be more convenient to have theorems where only positive relative gaps between $\Lambda_1$ and $\Lambda_2$ or between $\Sigma_1$ and $\Sigma_2$ are assumed. Based on results of Ostrowski [13, 1959] (see also [8, pp. 224-225]), Barlow and Demmel [1, 1990], Demmel and Veselic [5, 1992], Eisenstat and Ipsen [6, 1993], Mathias [12, 1994], and Li [11, 1994], the theorems in §4 can be modified to accommodate this need. In what follows, we list inequalities bounding from below the relative gaps between $\Lambda_1$ and $\tilde\Lambda_2$, or between $\Sigma_1$ and $\tilde\Sigma_2$, for each theorem, by relative gaps between $\Lambda_1$ and $\Lambda_2$ or between $\Sigma_1$ and $\Sigma_2$. The derivations of these inequalities depend on the facts listed in Lemma 2.1.
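Such gap replacements are easy to check numerically. The sketch below (our code, not the paper's) does so for the eigenvalue case of Theorem 4.1, under our reading of the corresponding inequality, $\eta_2 \ge \min_{i,j}\rho_2(\lambda_i, \lambda_{k+j}) - \|I - D^*D\|_2$: the relative gap the theorem needs, between $\Lambda_1$ and $\tilde\Lambda_2$, is bounded below using only $A$'s own spectrum and $D$.

```python
import numpy as np

rng = np.random.default_rng(2)
n, k = 8, 3
A = rng.standard_normal((n, n)); A = (A + A.T) / 2   # Hermitian A
D = np.eye(n) + 1e-4 * rng.standard_normal((n, n))   # D close to I
At = D.T @ A @ D                                     # multiplicative perturbation
lam = np.linalg.eigvalsh(A)
lamt = np.linalg.eigvalsh(At)

def rho2(a, b):
    return abs(a - b) / np.hypot(a, b)

# Gap between Lambda_1 and Lambda_2~, the quantity the theorems need ...
eta2 = min(rho2(lam[i], lamt[k + j]) for i in range(k) for j in range(n - k))
# ... bounded from below using only A's spectrum and ||I - D* D||_2:
eta2_hat = min(rho2(lam[i], lam[k + j]) for i in range(k) for j in range(n - k))
lower = eta2_hat - np.linalg.norm(np.eye(n) - D.T @ D, 2)
assert eta2 >= lower - 1e-12
```

The assertion is guaranteed by Ostrowski's theorem (each $\tilde\lambda_j = \theta_j\lambda_j$ with $\theta_j$ between the extreme eigenvalues of $D^*D$) together with the metric property of $\rho_2$ from Lemma 2.1.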
For Theorem 4.1:

  $\eta_2 \ge \min_{1\le i\le k,\,1\le j\le n-k} \rho_2(\lambda_i, \lambda_{k+j}) - \|I - D^* D\|_2$.

For Theorem 4.2: Assume that there exist $\hat\alpha > 0$ and $\hat\delta > 0$ such that the spectrum of $\Lambda_1$ lies entirely in $[-\hat\alpha, \hat\alpha]$ while that of $\Lambda_2$ lies entirely outside $(-\hat\alpha - \hat\delta, \hat\alpha + \hat\delta)$ (or such that the spectrum of $\Lambda_1$ lies entirely outside $(-\hat\alpha - \hat\delta, \hat\alpha + \hat\delta)$ while that of $\Lambda_2$ lies entirely in $[-\hat\alpha, \hat\alpha]$). If $\rho_p(\hat\alpha, \hat\alpha + \hat\delta) > \|I - D^* D\|_2$, then there are $\alpha > 0$ and $\delta > 0$ as the theorem requires, and

  $\eta_p \ge \rho_p(\hat\alpha, \hat\alpha + \hat\delta) - \|I - D^* D\|_2$.

For Theorem 4.3:

  $\eta \ge \min_{1\le i\le k,\,1\le j\le n-k} \zeta(\lambda_i, \lambda_{k+j}) - \tfrac{1}{2}\|D - D^{-1}\|_2$.

For Theorem 4.4: Assume that there exist $\hat\alpha > 0$ and $\hat\delta > 0$ such that

  $\max_{1\le i\le k}\lambda_i \le \hat\alpha$ and $\min_{1\le j\le n-k}\lambda_{k+j} \ge \hat\alpha + \hat\delta$,  or  $\min_{1\le i\le k}\lambda_i \ge \hat\alpha + \hat\delta$ and $\max_{1\le j\le n-k}\lambda_{k+j} \le \hat\alpha$.

If $\zeta(\hat\alpha, \hat\alpha + \hat\delta) > \frac{1}{2}\|D - D^{-1}\|_2$, then there are $\alpha > 0$ and $\delta > 0$ as the theorem requires, and

  $\eta \ge \dfrac{\zeta(\hat\alpha, \hat\alpha + \hat\delta) - \frac{1}{2}\|D - D^{-1}\|_2}{1 + \frac{1}{16}\|D - D^{-1}\|_2\, \zeta(\hat\alpha, \hat\alpha + \hat\delta)}$.

For Theorem 4.5: $\eta_2 \ge \hat\eta_2 - \gamma$, where $\gamma = \frac{1}{2\sqrt 2}\big(\|D_1 - D_1^{-*}\|_2 + \|D_2 - D_2^{-*}\|_2\big)$ (or $\gamma = \max\{|1 - \sigma_{\min}(D_1)\sigma_{\min}(D_2)|,\ |1 - \sigma_{\max}(D_1)\sigma_{\max}(D_2)|\}$) and

  $\hat\eta_2 \stackrel{\rm def}{=} \begin{cases} \min\Big\{ \min_{1\le i\le k,\,1\le j\le n-k} \rho_2(\sigma_i, \sigma_{k+j}),\ \min_{1\le i\le k} \rho_2(\sigma_i, 0) \Big\}, & \text{if } m > n; \\ \min_{1\le i\le k,\,1\le j\le n-k} \rho_2(\sigma_i, \sigma_{k+j}), & \text{otherwise}. \end{cases}$   (5.1)

For Theorem 4.6: Assume there exist $\hat\alpha > 0$ and $\hat\delta > 0$ such that

  $\min_{1\le i\le k}\sigma_i \ge \hat\alpha + \hat\delta$  and  $\max_{1\le j\le n-k}\sigma_{k+j} \le \hat\alpha$.

If $\rho_p(\hat\alpha, \hat\alpha + \hat\delta) > \gamma$, then there are $\alpha > 0$ and $\delta > 0$ as the theorem requires, and

  $\eta_p \ge \rho_p(\hat\alpha, \hat\alpha + \hat\delta) - \gamma$,

where $\gamma = \frac{1}{2^{1+1/p}}\big(\|D_1 - D_1^{-*}\|_2 + \|D_2 - D_2^{-*}\|_2\big)$ (or $\gamma = \max\{|1 - \sigma_{\min}(D_1)\sigma_{\min}(D_2)|,\ |1 - \sigma_{\max}(D_1)\sigma_{\max}(D_2)|\}$).
For Theorem 4.7: Let $\hat\eta_2$ be defined by (5.1) and set

  $\hat\eta \stackrel{\rm def}{=} \begin{cases} \min\Big\{ \min_{1\le i\le k,\,1\le j\le n-k} \zeta(\sigma_i, \sigma_{k+j}),\ \min_{1\le i\le k} \zeta(\sigma_i, 0) \Big\}, & \text{if } m > n; \\ \min_{1\le i\le k,\,1\le j\le n-k} \zeta(\sigma_i, \sigma_{k+j}), & \text{otherwise}. \end{cases}$   (5.2)

If $\hat\eta_2 > \frac{1}{2\sqrt 2}(\varepsilon_1 + \varepsilon_2 + \max\{\varepsilon_1, \varepsilon_2\})$, then

  $\eta_2 \ge \hat\eta_2 - \dfrac{1}{2\sqrt 2}(\varepsilon_1 + \varepsilon_2)$  and  $\eta \ge \dfrac{\hat\eta - \frac{1}{2}(\varepsilon_1 + \varepsilon_2)}{1 + \frac{1}{16}(\varepsilon_1 + \varepsilon_2)\hat\eta + \frac{1}{32}\varepsilon_1\varepsilon_2\hat\eta^2}$.

For Theorem 4.8: Suppose that there exist $\hat\alpha > 0$ and $\hat\delta > 0$ such that

  $\min_{1\le i\le k}\sigma_i \ge \hat\alpha + \hat\delta$  and  $\max_{1\le j\le n-k}\sigma_{k+j} \le \hat\alpha$.

If $\rho_p(\hat\alpha, \hat\alpha + \hat\delta) > \frac{1}{2^{1+1/p}}(\varepsilon_1 + \varepsilon_2 + \max\{\varepsilon_1, \varepsilon_2\})$, then there are $\alpha > 0$ and $\delta > 0$ as the theorem requires, and

  $\eta_p \ge \rho_p(\hat\alpha, \hat\alpha + \hat\delta) - \dfrac{1}{2^{1+1/p}}(\varepsilon_1 + \varepsilon_2)$  and  $\eta \ge \dfrac{\zeta(\hat\alpha, \hat\alpha + \hat\delta) - \frac{1}{2}(\varepsilon_1 + \varepsilon_2)}{1 + \frac{1}{16}(\varepsilon_1 + \varepsilon_2)\zeta(\hat\alpha, \hat\alpha + \hat\delta) + \frac{1}{32}\varepsilon_1\varepsilon_2\,\zeta(\hat\alpha, \hat\alpha + \hat\delta)^2}$.

For Theorem 4.9:

  $\eta_2 \ge \hat\eta_2 - \dfrac{\varepsilon}{2\sqrt 2}$  and  $\eta \ge \dfrac{\hat\eta - \varepsilon/2}{1 + \frac{1}{16}\varepsilon\hat\eta}$,

where $\hat\eta_2 \stackrel{\rm def}{=} \min_{1\le i\le k,\,1\le j\le n-k}\rho_2(\sigma_i, \sigma_{k+j})$, $\hat\eta \stackrel{\rm def}{=} \min_{1\le i\le k,\,1\le j\le n-k}\zeta(\sigma_i, \sigma_{k+j})$, $\varepsilon = \|D - D^{-*}\|_2$, and $D = I + (\Delta G)G^{-1}$.

For Theorem 4.10: Suppose there exist $\hat\alpha > 0$ and $\hat\delta > 0$ such that

  $\min_{1\le i\le k}\sigma_i \ge \hat\alpha + \hat\delta$ and $\max_{1\le j\le n-k}\sigma_{k+j} \le \hat\alpha$,

or, the other way around, i.e.,

  $\max_{1\le i\le k}\sigma_i \le \hat\alpha$ and $\min_{1\le j\le n-k}\sigma_{k+j} \ge \hat\alpha + \hat\delta$.

If $\rho_p(\hat\alpha, \hat\alpha + \hat\delta) > \frac{\varepsilon}{2^{1+1/p}}$, then there are $\alpha > 0$ and $\delta > 0$ as the theorem requires, and

  $\eta_p \ge \rho_p(\hat\alpha, \hat\alpha + \hat\delta) - \dfrac{\varepsilon}{2^{1+1/p}}$  and  $\eta \ge \dfrac{\zeta(\hat\alpha, \hat\alpha + \hat\delta) - \varepsilon/2}{1 + \frac{1}{16}\varepsilon\,\zeta(\hat\alpha, \hat\alpha + \hat\delta)}$,

where $\varepsilon = \|D - D^{-*}\|_2$ and $D = I + (\Delta G)G^{-1}$.

6 A Word on Eisenstat-Ipsen's Theorems

Eisenstat and Ipsen [7, 1994] have obtained a few bounds on eigenspace variations for $A$ and $\tilde A = D^* A D$ and on singular space variations for $B$ and $\tilde B = D_1^* B D_2$. Their bounds for subspaces of dimension $k > 1$ contain a factor $\sqrt k$ which makes their results less competitive than ours. For this reason we will not compare their bounds for subspaces of dimension $k > 1$ with ours. For the problem studied in Theorem 4.1, Eisenstat and Ipsen [7, 1994] bound the angle $\theta_j$ between $\tilde u_j$ and $\mathcal{R}(U_1)$, where $\tilde u_1, \tilde u_2, \dots, \tilde u_k$ are the columns of $\tilde U_1$. They showed

  $\sin\theta_j \le \dfrac{\|I - D^{-*}D^{-1}\|_2}{\gamma_j} + \|I - D\|_2$,   (6.1)

where

  $\gamma_j \stackrel{\rm def}{=} \begin{cases} \min_{k+1\le i\le n} \dfrac{|\lambda_i - \tilde\lambda_j|}{|\tilde\lambda_j|}, & \text{if } \tilde\lambda_j \ne 0, \\ 1, & \text{otherwise}. \end{cases}$

Inequality (6.1) does provide a nice bound.
Unfortunately it does not straightforwardly bound $\|\sin\Theta(U_1, \tilde U_1)\|_2$, and generally

  $\sin\theta_j \le \|\sin\Theta(U_1, \tilde U_1)\|_2$  for $j = 1, 2, \dots, k$,

and all of these inequalities may be strict. In the worst case, $\|\sin\Theta(U_1, \tilde U_1)\|_2$ could be as large as $\sqrt k \max_{1\le j\le k}\sin\theta_j$. To make a fair comparison with our inequality (4.1), we consider the case $k = 1$. One infers from inequality (4.1) that(3)

  $\sin\theta_1 \le \dfrac{\sqrt{\|I - D^{-1}\|_2^2 + \|I - D^*\|_2^2}}{\min_{2\le i\le n}\rho_2(\lambda_i, \tilde\lambda_1)}$.   (6.2)

It appears that (6.1) may potentially be sharper, because $\gamma_1$ may be much larger than $\min_{2\le i\le n}\rho_2(\lambda_i, \tilde\lambda_1)$. But this is not quite true, because of the extra term $\|I - D\|_2$ in (6.1), which stays no matter how large $\gamma_1$ is. We present our arguments as follows.

1. These relative perturbation bounds are most likely to be used when the closest eigenvalue $\lambda_\ell$ among all $\{\lambda_i\}_{i=2}^n$ to $\tilde\lambda_1$ has about the same magnitude as $\tilde\lambda_1$, i.e., when $|\lambda_\ell| \approx |\tilde\lambda_1|$ and

  $\gamma_1 \approx |\lambda_\ell - \tilde\lambda_1|/|\tilde\lambda_1| \approx \sqrt 2\,\rho_2(\lambda_\ell, \tilde\lambda_1) \approx \sqrt 2 \min_{2\le i\le n}\rho_2(\lambda_i, \tilde\lambda_1)$.

In such a situation, if $I - D$ is very tiny, then

  $\sqrt{\|I - D^{-1}\|_2^2 + \|I - D\|_2^2} \approx \sqrt 2\,\|I - D\|_2 + O(\|I - D\|_2^2)$,
  $\|I - D^{-*}D^{-1}\|_2 \approx 2\|I - D\|_2 + O(\|I - D\|_2^2)$.

Hence asymptotically, inequalities (6.2) and (6.1) read, respectively,

  $\sin\theta_1 \lesssim \dfrac{\sqrt 2\,\|I - D\|_2}{\rho_2(\lambda_\ell, \tilde\lambda_1)} + O(\|I - D\|_2^2)$,
  $\sin\theta_1 \lesssim \dfrac{\sqrt 2\,\|I - D\|_2}{\rho_2(\lambda_\ell, \tilde\lambda_1)} + \|I - D\|_2 + O(\|I - D\|_2^2)$.

So our bound (6.2) is asymptotically sharper by the amount $\|I - D\|_2$.

2. On the other hand, if the closest eigenvalue $\lambda_\ell$ among all $\{\lambda_i\}_{i=2}^n$ to $\tilde\lambda_1$ has much bigger magnitude than $|\tilde\lambda_1|$, i.e., $|\lambda_\ell| \gg |\tilde\lambda_1|$, then

  $\min_{2\le i\le n}\rho_2(\lambda_i, \tilde\lambda_1) \approx 1$.

Thus asymptotically, inequality (6.2) reads

  $\sin\theta_1 \lesssim \sqrt 2\,\|I - D\|_2 + O(\|I - D\|_2^2)$,

which cannot be much worse than (6.1).

(3) By treating $A$ and $\tilde A$ symmetrically, one can see that Theorem 4.1 remains valid with the relative gap $\eta_2$ between $\Lambda_1$ and $\tilde\Lambda_2$ replaced by the relative gap $\min_{1\le i\le k,\,1\le j\le n-k}\rho_2(\tilde\lambda_i, \lambda_{k+j})$ between $\tilde\Lambda_1$ and $\Lambda_2$.
In short, we have shown that it is possible for our inequality (6.2) to be less sharp than (6.1) by a constant factor, while in the case when these bounds are most likely to be used for estimating errors, (6.2) is sharper. Another point we would like to make is that we gave up some sharpness for elegance in the derivation of (6.2). Recall that we have, besides (7.1), also

  $\tilde\Lambda_1 \tilde U_1^* U_2 - \tilde U_1^* U_2 \Lambda_2 = \tilde\Lambda_1 \tilde U_1^* (I - D^{-1}) U_2 + \tilde U_1^* (D^* - I) U_2 \Lambda_2$.   (6.3)

When $\tilde\Lambda_1 = 0$, this equation reduces to $-\tilde U_1^* U_2 \Lambda_2 = \tilde U_1^* (D^* - I) U_2 \Lambda_2$, which implies, provided $\lambda_j \ne 0$ for $j = k+1, \dots, n$,

  $\|\sin\Theta(U_1, \tilde U_1)\|_2 \le \|D^* - I\|_2$,

a better bound than (6.2). Equation (6.3) also implies

  $\tilde u_1^* U_2 = \tilde\lambda_1 \tilde u_1^* (I - D^{-1}) U_2 (\tilde\lambda_1 I - \Lambda_2)^{-1} + \tilde u_1^* (D^* - I) U_2 \Lambda_2 (\tilde\lambda_1 I - \Lambda_2)^{-1}$,

which can be used to obtain an identity for $\sin\theta_1 = \|\tilde u_1^* U_2\|_2$! Generally, (6.3) produces

  $\|\sin\Theta(U_1, \tilde U_1)\|_2 \le \max_{1\le i\le k,\,1\le j\le n-k} \dfrac{|\tilde\lambda_i|}{|\tilde\lambda_i - \lambda_{k+j}|} \|I - D^{-1}\|_2 + \max_{1\le i\le k,\,1\le j\le n-k} \dfrac{|\lambda_{k+j}|}{|\tilde\lambda_i - \lambda_{k+j}|} \|D^* - I\|_2$,

which would be a better bound than (4.1) in the case when either $\min_{1\le i\le k}|\tilde\lambda_i| \gg \max_{1\le j\le n-k}|\lambda_{k+j}|$ or $\max_{1\le i\le k}|\tilde\lambda_i| \ll \min_{1\le j\le n-k}|\lambda_{k+j}|$.

Eisenstat and Ipsen [7, 1994] treated singular value problems in a very similar way to eigenvalue problems, so our arguments above apply to the comparison of our bounds and theirs for singular value problems as well. Eisenstat and Ipsen [7, 1994] did not study bounds in other matrix norms.

7 Proofs of Theorems 4.1 and 4.2

Let $R = \tilde A U_1 - U_1 \Lambda_1 = (\tilde A - A) U_1$ as defined in (3.1). Notice that

  $\tilde U_2^* R = \tilde U_2^* \tilde A U_1 - \tilde U_2^* U_1 \Lambda_1 = \tilde\Lambda_2 \tilde U_2^* U_1 - \tilde U_2^* U_1 \Lambda_1$,

and

  $\tilde U_2^* R = \tilde U_2^* (\tilde A - A) U_1 = \tilde U_2^* \big[ D^* A D - D^* A + D^* A - A \big] U_1 = \tilde U_2^* \big[ D^* A D (I - D^{-1}) + (D^* - I) A \big] U_1 = \tilde\Lambda_2 \tilde U_2^* (I - D^{-1}) U_1 + \tilde U_2^* (D^* - I) U_1 \Lambda_1$.

Thus, we have

  $\tilde\Lambda_2 \tilde U_2^* U_1 - \tilde U_2^* U_1 \Lambda_1 = \tilde\Lambda_2 \tilde U_2^* (I - D^{-1}) U_1 + \tilde U_2^* (D^* - I) U_1 \Lambda_1$.   (7.1)

Lemma 7.1 Let $\Omega \in \mathbb{C}^{s\times s}$ and $\Gamma \in \mathbb{C}^{t\times t}$ be two Hermitian matrices, and let $E, F \in \mathbb{C}^{s\times t}$. If $\lambda(\Omega) \cap \lambda(\Gamma) = \emptyset$, then the matrix equation
$\Omega X - X \Gamma = \Omega E + F \Gamma$ has a unique solution $X \in \mathbb{C}^{s\times t}$, and moreover

  $\|X\|_F \le \dfrac{\sqrt{\|E\|_F^2 + \|F\|_F^2}}{\eta_2}$,  where  $\eta_2 = \min_{\omega\in\lambda(\Omega),\,\mu\in\lambda(\Gamma)} \rho_2(\omega, \mu)$.

Proof: For any unitary matrices $P \in \mathcal{U}_s$ and $Q \in \mathcal{U}_t$, the substitutions

  $\Omega \leftarrow P^* \Omega P$, $\Gamma \leftarrow Q^* \Gamma Q$, $X \leftarrow P^* X Q$, $E \leftarrow P^* E Q$, and $F \leftarrow P^* F Q$

leave the lemma unchanged, so we may assume without loss of generality that $\Omega = {\rm diag}(\omega_1, \omega_2, \dots, \omega_s)$ and $\Gamma = {\rm diag}(\mu_1, \mu_2, \dots, \mu_t)$. Write $X = (x_{ij})$, $E = (e_{ij})$, and $F = (f_{ij})$. Entrywise, the equation $\Omega X - X \Gamma = \Omega E + F \Gamma$ reads $\omega_i x_{ij} - x_{ij}\mu_j = \omega_i e_{ij} + f_{ij}\mu_j$. Thus $x_{ij}$ exists uniquely provided $\omega_i \ne \mu_j$, which is guaranteed by the assumption $\lambda(\Omega) \cap \lambda(\Gamma) = \emptyset$; moreover,

  $|(\omega_i - \mu_j) x_{ij}|^2 = |\omega_i e_{ij} + f_{ij}\mu_j|^2 \le (|\omega_i|^2 + |\mu_j|^2)(|e_{ij}|^2 + |f_{ij}|^2)$

by the Cauchy-Schwarz inequality. This implies

  $|x_{ij}|^2 \le \dfrac{|e_{ij}|^2 + |f_{ij}|^2}{[\rho_2(\omega_i, \mu_j)]^2} \le \dfrac{|e_{ij}|^2 + |f_{ij}|^2}{\eta_2^2}$,  hence  $\|X\|_F^2 = \sum_{i,j} |x_{ij}|^2 \le \dfrac{\|E\|_F^2 + \|F\|_F^2}{\eta_2^2}$,

as was to be shown.

Proof of Theorem 4.1: By Lemma 7.1 and equation (7.1), we have

  $\|\tilde U_2^* U_1\|_F^2 \le \dfrac{\|\tilde U_2^*(I - D^{-1})U_1\|_F^2 + \|\tilde U_2^*(D^* - I)U_1\|_F^2}{\eta_2^2} \le \dfrac{\|(I - D^{-1})U_1\|_F^2 + \|(D^* - I)U_1\|_F^2}{\eta_2^2}$.

This completes the proof of Theorem 4.1, since $\|\sin\Theta(U_1, \tilde U_1)\|_F = \|\tilde U_2^* U_1\|_F$.

Lemma 7.2 Let $\Omega \in \mathbb{C}^{s\times s}$ and $\Gamma \in \mathbb{C}^{t\times t}$ be two Hermitian matrices, and let $E, F \in \mathbb{C}^{s\times t}$. If there exist $\alpha > 0$ and $\delta > 0$ such that

  $\|\Omega\|_2 \le \alpha$ and $\|\Gamma^{-1}\|_2^{-1} \ge \alpha + \delta$,  or  $\|\Omega^{-1}\|_2^{-1} \ge \alpha + \delta$ and $\|\Gamma\|_2 \le \alpha$,

then the matrix equation $\Omega X - X \Gamma = \Omega E + F \Gamma$ has a unique solution, and moreover for any unitarily invariant norm $\|\cdot\|$,

  $\|X\| \le \dfrac{\sqrt[q]{\|E\|^q + \|F\|^q}}{\eta_p}$,  where $\eta_p \stackrel{\rm def}{=} \rho_p(\alpha, \alpha + \delta)$ and $1/p + 1/q = 1$.

Proof: First of all, the conditions of this lemma imply $\lambda(\Omega) \cap \lambda(\Gamma) = \emptyset$, thus $X$ exists uniquely by Lemma 7.1. In what follows, we present a proof of the bound for $\|X\|$ for the case $\|\Omega\|_2 \le \alpha$ and $\|\Gamma^{-1}\|_2^{-1} \ge \alpha + \delta$. A proof for the other case is analogous.
Post-multiply the equation $\Omega X - X \Gamma = \Omega E + F \Gamma$ by $\Gamma^{-1}$ to get

  $\Omega X \Gamma^{-1} - X = \Omega E \Gamma^{-1} + F$.   (7.2)

Under the assumptions $\|\Omega\|_2 \le \alpha$ and $\|\Gamma^{-1}\|_2^{-1} \ge \alpha + \delta$ (so that $\|\Gamma^{-1}\|_2 \le \frac{1}{\alpha+\delta}$), we have

  $\|\Omega X \Gamma^{-1} - X\| \ge \|X\| - \|\Omega X \Gamma^{-1}\| \ge \|X\| - \|\Omega\|_2 \|X\| \|\Gamma^{-1}\|_2 \ge \Big(1 - \dfrac{\alpha}{\alpha+\delta}\Big)\|X\| = \dfrac{\delta}{\alpha+\delta}\|X\|$,

and, using Holder's inequality in the last step,

  $\|\Omega E \Gamma^{-1} + F\| \le \|\Omega\|_2 \|E\| \|\Gamma^{-1}\|_2 + \|F\| \le \dfrac{\alpha}{\alpha+\delta}\|E\| + \|F\| \le \sqrt[p]{1 + \Big(\dfrac{\alpha}{\alpha+\delta}\Big)^p}\ \sqrt[q]{\|E\|^q + \|F\|^q}$.

By equation (7.2), we deduce that

  $\dfrac{\delta}{\alpha+\delta}\|X\| \le \sqrt[p]{1 + \Big(\dfrac{\alpha}{\alpha+\delta}\Big)^p}\ \sqrt[q]{\|E\|^q + \|F\|^q}$,

from which the desired inequality follows, since $\sqrt[p]{1 + (\frac{\alpha}{\alpha+\delta})^p}\cdot\frac{\alpha+\delta}{\delta} = \frac{\sqrt[p]{(\alpha+\delta)^p + \alpha^p}}{\delta} = \frac{1}{\rho_p(\alpha, \alpha+\delta)}$.

Proof of Theorem 4.2: By Lemma 7.2 and equation (7.1), we have

  $\|\tilde U_2^* U_1\| \le \dfrac{\sqrt[q]{\|\tilde U_2^*(I - D^{-1})U_1\|^q + \|\tilde U_2^*(D^* - I)U_1\|^q}}{\eta_p} \le \dfrac{\sqrt[q]{\|(I - D^{-1})U_1\|^q + \|(D^* - I)U_1\|^q}}{\eta_p}$,

as required, since $\|\sin\Theta(U_1, \tilde U_1)\| = \|\tilde U_2^* U_1\|$.

8 Proofs of Theorems 4.3 and 4.4

Notice that

  $A = S^* H S = (H^{1/2} S)^* H^{1/2} S$,
  $\tilde A = S^* H^{1/2}\big(I + H^{-1/2}(\Delta H)H^{-1/2}\big)H^{1/2} S$.

Set $B = S^* H^{1/2}$ and $\tilde B = S^* H^{1/2}\big(I + H^{-1/2}(\Delta H)H^{-1/2}\big)^{1/2} \stackrel{\rm def}{=} B D$, where $D = \big(I + H^{-1/2}(\Delta H)H^{-1/2}\big)^{1/2}$, so that $A = B B^*$ and $\tilde A = \tilde B \tilde B^*$. Given the eigendecompositions of $A$ and $\tilde A$ as in (1.1), (1.2), (1.3), and (1.4), one easily sees that $B$ and $\tilde B$ admit the following SVDs:

  $B = U \Lambda^{1/2} V^* = (U_1, U_2) \begin{pmatrix} \Lambda_1^{1/2} & \\ & \Lambda_2^{1/2} \end{pmatrix} \begin{pmatrix} V_1^* \\ V_2^* \end{pmatrix}$,
  $\tilde B = \tilde U \tilde\Lambda^{1/2} \tilde V^* = (\tilde U_1, \tilde U_2) \begin{pmatrix} \tilde\Lambda_1^{1/2} & \\ & \tilde\Lambda_2^{1/2} \end{pmatrix} \begin{pmatrix} \tilde V_1^* \\ \tilde V_2^* \end{pmatrix}$,

where $U, \tilde U$ are the same as in (1.1) and (1.2), and $V_1, \tilde V_1 \in \mathbb{C}^{n\times k}$. Notice that

  $\tilde A - A = \tilde B \tilde B^* - B B^* = \tilde B D B^* - \tilde B D^{-1} B^* = \tilde B (D - D^{-1}) B^*$.

Pre- and post-multiply this equation by $\tilde U^*$ and $U$, respectively, to get

  $\tilde\Lambda \tilde U^* U - \tilde U^* U \Lambda = \tilde\Lambda^{1/2} \tilde V^* (D - D^{-1}) V \Lambda^{1/2}$,

which yields

  $\tilde\Lambda_2 \tilde U_2^* U_1 - \tilde U_2^* U_1 \Lambda_1 = \tilde\Lambda_2^{1/2} \tilde V_2^* (D - D^{-1}) V_1 \Lambda_1^{1/2}$.   (8.1)

The following inequality will be very useful in the rest of our proofs:

  $\|\tilde V_2^* (D - D^{-1}) V_1\| \le \|D - D^{-1}\| = \big\| \big(I + H^{-1/2}(\Delta H)H^{-1/2}\big)^{1/2} - \big(I + H^{-1/2}(\Delta H)H^{-1/2}\big)^{-1/2} \big\|$
  $\le \big\| \big(I + H^{-1/2}(\Delta H)H^{-1/2}\big)^{-1/2} \big\|_2\, \big\| H^{-1/2}(\Delta H)H^{-1/2} \big\| \le \dfrac{\|H^{-1}\|_2 \|\Delta H\|}{\sqrt{1 - \|H^{-1}\|_2 \|\Delta H\|_2}}$.

Lemma 8.1 Let $\Omega \in \mathbb{C}^{s\times s}$ and $\Gamma \in \mathbb{C}^{t\times t}$ be two nonnegative definite Hermitian matrices, and let $E \in \mathbb{C}^{s\times t}$.
If $\lambda(\Omega)\cap\lambda(\Gamma) = \emptyset$, then the matrix equation $\Omega X - X\Gamma = \Omega^{1/2}E\,\Gamma^{1/2}$ has a unique solution $X\in\mathbf C^{s\times t}$, and moreover
$$\|X\|_F \le \|E\|_F/\eta, \qquad\text{where } \eta \overset{\rm def}{=} \min_{\omega\in\lambda(\Omega),\ \gamma\in\lambda(\Gamma)}\rho(\omega,\gamma).$$

Proof: For any unitary matrices $P\in\mathcal U_s$ and $Q\in\mathcal U_t$, the substitutions
$$\Omega \to P^*\Omega P,\quad \Omega^{1/2}\to(P^*\Omega P)^{1/2},\quad \Gamma \to Q^*\Gamma Q,\quad \Gamma^{1/2}\to(Q^*\Gamma Q)^{1/2},\quad X\to P^*XQ,\quad E\to P^*EQ$$
leave the lemma unchanged, so we may assume without loss of generality that $\Omega = \mathrm{diag}(\omega_1,\omega_2,\ldots,\omega_s)$ and $\Gamma = \mathrm{diag}(\gamma_1,\gamma_2,\ldots,\gamma_t)$. Write $X=(x_{ij})$ and $E=(e_{ij})$. Entrywise, the equation $\Omega X - X\Gamma = \Omega^{1/2}E\,\Gamma^{1/2}$ reads $\omega_i x_{ij} - x_{ij}\gamma_j = \sqrt{\omega_i}\,e_{ij}\sqrt{\gamma_j}$. As long as $\omega_i\ne\gamma_j$, $x_{ij}$ exists uniquely, and
$$|x_{ij}|^2 = |e_{ij}|^2/[\rho(\omega_i,\gamma_j)]^2 \le |e_{ij}|^2/\eta^2,$$
summing which over $1\le i\le s$ and $1\le j\le t$ leads to the desired inequality.

Proof of Theorem 4.3: Equation (8.1) and Lemma 8.1 imply
$$\|\sin\Theta(U_1,\widetilde U_1)\|_F = \|\widetilde U_2^* U_1\|_F \le \frac{\|\widetilde V_2^*(D-D^{-1})V_1\|_F}{\eta} \le \frac{\|D-D^{-1}\|_F}{\eta},$$
as required.

Lemma 8.2 Let $\Omega\in\mathbf C^{s\times s}$ and $\Gamma\in\mathbf C^{t\times t}$ be two nonnegative definite Hermitian matrices, and let $E\in\mathbf C^{s\times t}$. If there exist $\alpha\ge0$ and $\delta>0$ such that
$$\|\Omega\|_2\le\alpha \ \text{ and } \ \|\Gamma^{-1}\|_2^{-1}\ge\alpha+\delta
\qquad\text{or}\qquad
\|\Omega^{-1}\|_2^{-1}\ge\alpha+\delta \ \text{ and } \ \|\Gamma\|_2\le\alpha,$$
then the matrix equation $\Omega X - X\Gamma = \Omega^{1/2}E\,\Gamma^{1/2}$ has a unique solution $X\in\mathbf C^{s\times t}$, and moreover $|||X||| \le |||E|||/\eta$, where $\eta \overset{\rm def}{=} \rho(\alpha,\alpha+\delta)$.

Proof: The existence and uniqueness of $X$ are easy to see because the conditions of this lemma imply $\lambda(\Omega)\cap\lambda(\Gamma)=\emptyset$. To bound $|||X|||$, we present a proof for the case $\|\Omega\|_2\le\alpha$ and $\|\Gamma^{-1}\|_2^{-1}\ge\alpha+\delta$; a proof for the other case is analogous. Post-multiply the equation $\Omega X - X\Gamma = \Omega^{1/2}E\,\Gamma^{1/2}$ by $\Gamma^{-1}$ to get
$$\Omega X\Gamma^{-1} - X = \Omega^{1/2}E\,\Gamma^{-1/2}. \qquad (8.2)$$
Under the assumptions $\|\Omega\|_2\le\alpha$ and $\|\Gamma^{-1}\|_2^{-1}\ge\alpha+\delta$, we have $|||\Omega X\Gamma^{-1}-X||| \ge \frac{\delta}{\alpha+\delta}|||X|||$ as in the proof of Lemma 7.2, and
$$|||\Omega^{1/2}E\,\Gamma^{-1/2}||| \le \|\Omega^{1/2}\|_2\,|||E|||\,\|\Gamma^{-1/2}\|_2 \le \frac{\sqrt\alpha}{\sqrt{\alpha+\delta}}\,|||E|||.$$
By equation (8.2), we deduce that
$$\frac{\delta}{\alpha+\delta}|||X||| \le \sqrt{\frac{\alpha}{\alpha+\delta}}\,|||E|||,
\qquad\text{i.e.}\qquad
|||X||| \le \frac{\sqrt{\alpha(\alpha+\delta)}}{\delta}\,|||E||| = \frac{|||E|||}{\rho(\alpha,\alpha+\delta)},$$
from which the desired inequality follows.

Proof of Theorem 4.4: Equation (8.1) and Lemma 8.2 imply
$$|||\sin\Theta(U_1,\widetilde U_1)||| = |||\widetilde U_2^* U_1||| \le \frac{|||\widetilde V_2^*(D-D^{-1})V_1|||}{\eta} \le \frac{|||D-D^{-1}|||}{\eta},$$
as required.
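The key inequality of this section, $|||D-D^{-1}||| \le |||M|||/\sqrt{1-\|M\|_2}$ for $D=(I+M)^{1/2}$ with $M = H^{-1/2}(\Delta H)H^{-1/2}$, is easy to sanity-check numerically. The following sketch (sizes and perturbation level are illustrative, not from the paper; the Frobenius norm is used as the unitarily invariant norm) builds a random Hermitian $M$ with $\|M\|_2<1$ and compares both sides:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 6

# Hermitian M playing the role of H^{-1/2} (Delta H) H^{-1/2}, with ||M||_2 = 0.3 < 1
S = rng.standard_normal((n, n))
M = 0.3 * (S + S.T) / np.linalg.norm(S + S.T, 2)

# D = (I + M)^{1/2} and D^{-1} = (I + M)^{-1/2} via the eigendecomposition of I + M
w, Q = np.linalg.eigh(np.eye(n) + M)
D = Q @ np.diag(np.sqrt(w)) @ Q.T
D_inv = Q @ np.diag(1.0 / np.sqrt(w)) @ Q.T

# lhs: ||D - D^{-1}||_F;  rhs: ||M||_F / sqrt(1 - ||M||_2)
lhs = np.linalg.norm(D - D_inv, "fro")
rhs = np.linalg.norm(M, "fro") / np.sqrt(1.0 - np.linalg.norm(M, 2))
```

The inequality follows from the identity $D - D^{-1} = (I+M)^{-1/2}M$, which the code implicitly exercises.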
9 Proofs of Theorems 4.5 and 4.6

Let $R_R = \widetilde B V_1 - U_1\Sigma_1 = (\widetilde B - B)V_1$ and $R_L = \widetilde B^* U_1 - V_1\Sigma_1 = (\widetilde B - B)^* U_1$, as defined in (3.8).

9.1 The Square Case: m = n

When $m = n$, the SVDs (3.4) and (3.5) read
$$B = U\Sigma V^* = (U_1,U_2)\begin{pmatrix}\Sigma_1&\\&\Sigma_2\end{pmatrix}\begin{pmatrix}V_1^*\\V_2^*\end{pmatrix},
\qquad
\widetilde B = \widetilde U\widetilde\Sigma\widetilde V^* = (\widetilde U_1,\widetilde U_2)\begin{pmatrix}\widetilde\Sigma_1&\\&\widetilde\Sigma_2\end{pmatrix}\begin{pmatrix}\widetilde V_1^*\\\widetilde V_2^*\end{pmatrix}.$$
Notice that
$$\widetilde U_2^* R_R = \widetilde U_2^*\widetilde B V_1 - \widetilde U_2^* U_1\Sigma_1 = \widetilde\Sigma_2\widetilde V_2^* V_1 - \widetilde U_2^* U_1\Sigma_1,$$
$$\widetilde U_2^* R_R = \widetilde U_2^*(\widetilde B - B)V_1 = \widetilde U_2^*(D_1^* B D_2 - D_1^* B + D_1^* B - B)V_1
= \widetilde U_2^*\bigl[\widetilde B(I - D_2^{-1}) + (D_1^* - I)B\bigr]V_1
= \widetilde\Sigma_2\widetilde V_2^*(I - D_2^{-1})V_1 + \widetilde U_2^*(D_1^* - I)U_1\Sigma_1,$$
to get
$$\widetilde\Sigma_2\,\widetilde V_2^* V_1 - \widetilde U_2^* U_1\,\Sigma_1 = \widetilde\Sigma_2\widetilde V_2^*(I - D_2^{-1})V_1 + \widetilde U_2^*(D_1^* - I)U_1\Sigma_1. \qquad (9.1)$$
On the other hand,
$$\widetilde V_2^* R_L = \widetilde V_2^*\widetilde B^* U_1 - \widetilde V_2^* V_1\Sigma_1 = \widetilde\Sigma_2\widetilde U_2^* U_1 - \widetilde V_2^* V_1\Sigma_1,$$
$$\widetilde V_2^* R_L = \widetilde V_2^*(\widetilde B - B)^* U_1 = \widetilde V_2^*(D_2^* B^* D_1 - D_2^* B^* + D_2^* B^* - B^*)U_1
= \widetilde V_2^*\bigl[\widetilde B^*(I - D_1^{-1}) + (D_2^* - I)B^*\bigr]U_1
= \widetilde\Sigma_2\widetilde U_2^*(I - D_1^{-1})U_1 + \widetilde V_2^*(D_2^* - I)V_1\Sigma_1,$$
which produce
$$\widetilde\Sigma_2\,\widetilde U_2^* U_1 - \widetilde V_2^* V_1\,\Sigma_1 = \widetilde\Sigma_2\widetilde U_2^*(I - D_1^{-1})U_1 + \widetilde V_2^*(D_2^* - I)V_1\Sigma_1. \qquad (9.2)$$
Equations (9.1) and (9.2) take an equivalent form as a single matrix equation with dimensions doubled:
$$\begin{pmatrix}&\widetilde\Sigma_2\\\widetilde\Sigma_2&\end{pmatrix}
\begin{pmatrix}\widetilde U_2^*U_1&\\&\widetilde V_2^*V_1\end{pmatrix}
- \begin{pmatrix}\widetilde U_2^*U_1&\\&\widetilde V_2^*V_1\end{pmatrix}
\begin{pmatrix}&\Sigma_1\\\Sigma_1&\end{pmatrix}$$
$$= \begin{pmatrix}&\widetilde\Sigma_2\\\widetilde\Sigma_2&\end{pmatrix}
\begin{pmatrix}\widetilde U_2^*(I-D_1^{-1})U_1&\\&\widetilde V_2^*(I-D_2^{-1})V_1\end{pmatrix}
+ \begin{pmatrix}\widetilde U_2^*(D_1^*-I)U_1&\\&\widetilde V_2^*(D_2^*-I)V_1\end{pmatrix}
\begin{pmatrix}&\Sigma_1\\\Sigma_1&\end{pmatrix}. \qquad (9.3)$$

Proof of Theorem 4.5: Notice that the eigenvalues of $\begin{pmatrix}&\widetilde\Sigma_2\\\widetilde\Sigma_2&\end{pmatrix}$ are $\pm\widetilde\sigma_{k+j}$ and those of $\begin{pmatrix}&\Sigma_1\\\Sigma_1&\end{pmatrix}$ are $\pm\sigma_i$, and that
$$\rho_2(\sigma_i,-\widetilde\sigma_{k+j}) \ge \rho_2(\sigma_i,\widetilde\sigma_{k+j})
\quad\text{and}\quad
\rho_2(-\sigma_i,\widetilde\sigma_{k+j}) \ge \rho_2(\sigma_i,\widetilde\sigma_{k+j}),$$
so the minimum of $\rho_2$ over the eigenvalue pairs is at least $\eta_2 \overset{\rm def}{=} \min_{1\le i\le k,\,1\le j\le n-k}\rho_2(\sigma_i,\widetilde\sigma_{k+j})$. By Lemma 7.1 and equation (9.3), we have
$$\|\widetilde U_2^*U_1\|_F^2 + \|\widetilde V_2^*V_1\|_F^2
\le \frac1{\eta_2^2}\Bigl[\|\widetilde U_2^*(I-D_1^{-1})U_1\|_F^2 + \|\widetilde U_2^*(D_1^*-I)U_1\|_F^2 + \|\widetilde V_2^*(I-D_2^{-1})V_1\|_F^2 + \|\widetilde V_2^*(D_2^*-I)V_1\|_F^2\Bigr]$$
$$\le \frac1{\eta_2^2}\Bigl[\|(I-D_1^{-1})U_1\|_F^2 + \|(D_1^*-I)U_1\|_F^2 + \|(I-D_2^{-1})V_1\|_F^2 + \|(D_2^*-I)V_1\|_F^2\Bigr],$$
which completes the proof.
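The Frobenius-norm bound just proved can be checked numerically end-to-end. The sketch below (real case, so $*$ becomes transpose; the matrix sizes, the singular value profile, and the near-identity perturbations $D_1$, $D_2$ are all illustrative assumptions) computes both sides of the inequality for $\widetilde B = D_1^* B D_2$:

```python
import numpy as np

rng = np.random.default_rng(2)
n, k = 6, 2

# B with well-separated singular value groups: {10, 9} vs {1, 0.9, 0.8, 0.7}
U0, _ = np.linalg.qr(rng.standard_normal((n, n)))
V0, _ = np.linalg.qr(rng.standard_normal((n, n)))
B = U0 @ np.diag([10, 9, 1, 0.9, 0.8, 0.7]) @ V0.T

# multiplicative perturbations close to the identity
D1 = np.eye(n) + 1e-3 * rng.standard_normal((n, n))
D2 = np.eye(n) + 1e-3 * rng.standard_normal((n, n))
Bt = D1.T @ B @ D2

U, s, Vh = np.linalg.svd(B)
Ut, st, Vth = np.linalg.svd(Bt)
U1, V1 = U[:, :k], Vh.T[:, :k]          # subspaces for sigma_1..sigma_k
Ut2, Vt2 = Ut[:, k:], Vth.T[:, k:]      # complementary perturbed subspaces

# left-hand side: sqrt(||Ut2^T U1||_F^2 + ||Vt2^T V1||_F^2)
lhs = np.sqrt(np.linalg.norm(Ut2.T @ U1, "fro")**2
              + np.linalg.norm(Vt2.T @ V1, "fro")**2)

# relative gap eta_2 = min rho_2(sigma_i, sigma~_{k+j})
rho2 = np.abs(s[:k, None] - st[None, k:]) / np.sqrt(s[:k, None]**2 + st[None, k:]**2)
eta2 = rho2.min()

I = np.eye(n)
terms = [(I - np.linalg.inv(D1)) @ U1, (D1.T - I) @ U1,
         (I - np.linalg.inv(D2)) @ V1, (D2.T - I) @ V1]
rhs = np.sqrt(sum(np.linalg.norm(T, "fro")**2 for T in terms)) / eta2
```

With perturbations of size $10^{-3}$ the right-hand side stays tiny because it involves no absolute gap, only the relative gap between the two singular value groups.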
Lemma 9.1 Let $\Omega\in\mathbf C^{s\times s}$ and $\Gamma\in\mathbf C^{t\times t}$ be two Hermitian matrices, and let $X,Y,E,\widetilde E,F,\widetilde F\in\mathbf C^{s\times t}$ satisfy
$$\Omega X - Y\Gamma = \Omega E + F\Gamma \qquad\text{and}\qquad \Omega Y - X\Gamma = \Omega\widetilde E + \widetilde F\Gamma.$$
If there exist $\alpha\ge0$ and $\delta>0$ such that
$$\|\Omega\|_2\le\alpha \ \text{ and } \ \|\Gamma^{-1}\|_2^{-1}\ge\alpha+\delta
\qquad\text{or}\qquad
\|\Omega^{-1}\|_2^{-1}\ge\alpha+\delta \ \text{ and } \ \|\Gamma\|_2\le\alpha,$$
then for any unitarily invariant norm $|||\cdot|||$,
$$\max\{|||X|||,\,|||Y|||\} \le \frac1{\eta_p}\max\Bigl\{\sqrt[q]{|||E|||^q+|||F|||^q},\ \sqrt[q]{|||\widetilde E|||^q+|||\widetilde F|||^q}\Bigr\}, \qquad (9.4)$$
where $\eta_p \overset{\rm def}{=} \rho_p(\alpha,\alpha+\delta)$.

Proof: We present a proof for the case $\|\Omega\|_2\le\alpha$ and $\|\Gamma^{-1}\|_2^{-1}\ge\alpha+\delta$; a proof for the other case is analogous. Consider first the subcase $|||X|||\ge|||Y|||$. Post-multiply the equation $\Omega Y - X\Gamma = \Omega\widetilde E+\widetilde F\Gamma$ by $\Gamma^{-1}$ to get
$$\Omega Y\Gamma^{-1} - X = \Omega\widetilde E\,\Gamma^{-1} + \widetilde F. \qquad (9.5)$$
Then we have, by $\|\Omega\|_2\le\alpha$ and $\|\Gamma^{-1}\|_2\le\frac1{\alpha+\delta}$, that
$$|||\Omega Y\Gamma^{-1}-X||| \ge |||X||| - \|\Omega\|_2\,|||Y|||\,\|\Gamma^{-1}\|_2
\ge |||X||| - \frac{\alpha}{\alpha+\delta}|||Y|||
\ge |||X||| - \frac{\alpha}{\alpha+\delta}|||X||| = \frac{\delta}{\alpha+\delta}|||X|||,$$
and
$$|||\Omega\widetilde E\,\Gamma^{-1}+\widetilde F||| \le \frac{\alpha}{\alpha+\delta}|||\widetilde E||| + |||\widetilde F|||
\le \sqrt[p]{1+\Bigl(\frac{\alpha}{\alpha+\delta}\Bigr)^p}\ \sqrt[q]{|||\widetilde E|||^q+|||\widetilde F|||^q}.$$
By equation (9.5), we deduce that
$$\frac{\delta}{\alpha+\delta}|||X||| \le \sqrt[p]{1+\Bigl(\frac{\alpha}{\alpha+\delta}\Bigr)^p}\ \sqrt[q]{|||\widetilde E|||^q+|||\widetilde F|||^q},$$
which produces $|||X||| \le \sqrt[q]{|||\widetilde E|||^q+|||\widetilde F|||^q}\big/\eta_p$ whenever $|||X|||\ge|||Y|||$. Similarly, if $|||X|||<|||Y|||$, from $\Omega X - Y\Gamma = \Omega E+F\Gamma$ we can obtain $|||Y||| \le \sqrt[q]{|||E|||^q+|||F|||^q}\big/\eta_p$. Inequality (9.4) now follows.

Proof of Theorem 4.6: By equations (9.1) and (9.2) and Lemma 9.1, we have
$$\max\bigl\{|||\widetilde U_2^*U_1|||,\ |||\widetilde V_2^*V_1|||\bigr\}
\le \frac1{\eta_p}\max\Bigl\{\sqrt[q]{|||\widetilde V_2^*(I-D_2^{-1})V_1|||^q + |||\widetilde U_2^*(D_1^*-I)U_1|||^q},\ \sqrt[q]{|||\widetilde U_2^*(I-D_1^{-1})U_1|||^q + |||\widetilde V_2^*(D_2^*-I)V_1|||^q}\Bigr\}$$
$$\le \frac1{\eta_p}\max\Bigl\{\sqrt[q]{|||(I-D_2^{-1})V_1|||^q + |||(D_1^*-I)U_1|||^q},\ \sqrt[q]{|||(I-D_1^{-1})U_1|||^q + |||(D_2^*-I)V_1|||^q}\Bigr\},$$
as required. Turning to inequality (4.8), we have by equation (9.3) and Lemma 7.2 that
$$\left|\left|\left|\begin{pmatrix}\widetilde U_2^*U_1&\\&\widetilde V_2^*V_1\end{pmatrix}\right|\right|\right|
\le \frac1{\eta_p}\sqrt[q]{\left|\left|\left|\begin{pmatrix}\widetilde U_2^*(I-D_1^{-1})U_1&\\&\widetilde V_2^*(I-D_2^{-1})V_1\end{pmatrix}\right|\right|\right|^q
+ \left|\left|\left|\begin{pmatrix}\widetilde U_2^*(D_1^*-I)U_1&\\&\widetilde V_2^*(D_2^*-I)V_1\end{pmatrix}\right|\right|\right|^q}, \qquad (9.6)$$
since the conditions of Theorem 4.6 imply
$$\left\|\begin{pmatrix}&\widetilde\Sigma_2\\\widetilde\Sigma_2&\end{pmatrix}\right\|_2 \le \alpha,
\qquad
\left\|\begin{pmatrix}&\Sigma_1\\\Sigma_1&\end{pmatrix}^{-1}\right\|_2^{-1} \ge \alpha+\delta.$$
Since $\widetilde U_2^*U_1$ and $\sin\Theta(U_1,\widetilde U_1)$ have the same nonzero singular values, and so do $\widetilde V_2^*V_1$ and $\sin\Theta(V_1,\widetilde V_1)$,
$$\left|\left|\left|\begin{pmatrix}\widetilde U_2^*U_1&\\&\widetilde V_2^*V_1\end{pmatrix}\right|\right|\right|
= \left|\left|\left|\begin{pmatrix}\sin\Theta(U_1,\widetilde U_1)&\\&\sin\Theta(V_1,\widetilde V_1)\end{pmatrix}\right|\right|\right|. \qquad (9.7)$$
Note also that
$$\begin{pmatrix}\widetilde U_2^*(I-D_1^{-1})U_1&\\&\widetilde V_2^*(I-D_2^{-1})V_1\end{pmatrix}
= \begin{pmatrix}\widetilde U_2^*&\\&\widetilde V_2^*\end{pmatrix}
\begin{pmatrix}(I-D_1^{-1})U_1&\\&(I-D_2^{-1})V_1\end{pmatrix}, \qquad (9.8)$$
$$\begin{pmatrix}\widetilde U_2^*(D_1^*-I)U_1&\\&\widetilde V_2^*(D_2^*-I)V_1\end{pmatrix}
= \begin{pmatrix}\widetilde U_2^*&\\&\widetilde V_2^*\end{pmatrix}
\begin{pmatrix}(D_1^*-I)U_1&\\&(D_2^*-I)V_1\end{pmatrix}. \qquad (9.9)$$
Inequality (4.8) is a consequence of (9.6), (9.7), (9.8) and (9.9).

9.2 The Non-Square Case: m > n

Augment $B$ and $\widetilde B$ by a zero block $0_{m,m-n}$ to $B_a = (B,\ 0_{m,m-n})$ and $\widetilde B_a = (\widetilde B,\ 0_{m,m-n})$. From $\widetilde B = D_1^* B D_2$, we get
$$\widetilde B_a = D_1^* B_a\begin{pmatrix}D_2&\\&I_{m-n}\end{pmatrix} \overset{\rm def}{=} D_1^* B_a D_{a2}.$$
From the SVDs (3.4) and (3.5) of $B$ and $\widetilde B$, one can calculate the SVDs of $B_a$ and $\widetilde B_a$:
$$B_a = U\Sigma_a V_a^* = (U_1,U_2)\begin{pmatrix}\Sigma_1&\\&\Sigma_{a2}\end{pmatrix}\begin{pmatrix}V_{a1}^*\\V_{a2}^*\end{pmatrix}, \qquad (9.10)$$
$$\widetilde B_a = \widetilde U\widetilde\Sigma_a\widetilde V_a^* = (\widetilde U_1,\widetilde U_2)\begin{pmatrix}\widetilde\Sigma_1&\\&\widetilde\Sigma_{a2}\end{pmatrix}\begin{pmatrix}\widetilde V_{a1}^*\\\widetilde V_{a2}^*\end{pmatrix}, \qquad (9.11)$$
where
$$\Sigma_{a2} = \begin{pmatrix}\Sigma_2&\\&0_{m-n,m-n}\end{pmatrix},\qquad
V_{a1} = \begin{pmatrix}V_1\\0_{m-n,k}\end{pmatrix},\qquad
V_{a2} = \begin{pmatrix}V_2&\\&I_{m-n}\end{pmatrix},$$
and similarly for $\widetilde\Sigma_{a2}$, $\widetilde V_{a1}$ and $\widetilde V_{a2}$. The following fact is easy to establish:
$$\sin\Theta(V_{a1},\widetilde V_{a1}) = \sin\Theta(V_1,\widetilde V_1).$$
Applying the square case of Theorems 4.5 and 4.6 to the $m\times m$ matrices $B_a$ and $\widetilde B_a$ just defined completes the proofs.

10 Proofs of Theorems 4.7 and 4.8

We have seen in §9 how to deal with the nonsquare case by transforming it to the square case, so here we give proofs only for the square case $m=n$. Let $\widehat B = D_1^* B$ and $\overline B = BD_2$, and let their SVDs be
$$\widehat B = \widehat U\widehat\Sigma\widehat V^* = (\widehat U_1,\widehat U_2)\begin{pmatrix}\widehat\Sigma_1&\\&\widehat\Sigma_2\end{pmatrix}\begin{pmatrix}\widehat V_1^*\\\widehat V_2^*\end{pmatrix}, \qquad (10.1)$$
$$\overline B = \overline U\,\overline\Sigma\,\overline V^* = (\overline U_1,\overline U_2)\begin{pmatrix}\overline\Sigma_1&\\&\overline\Sigma_2\end{pmatrix}\begin{pmatrix}\overline V_1^*\\\overline V_2^*\end{pmatrix}, \qquad (10.2)$$
where $\widehat U,\overline U,\widehat V,\overline V\in\mathcal U_n$, $\widehat U_1,\overline U_1,\widehat V_1,\overline V_1\in\mathbf C^{n\times k}$, and
$$\widehat\Sigma_1 = \mathrm{diag}(\widehat\sigma_1,\ldots,\widehat\sigma_k),\quad \widehat\Sigma_2 = \mathrm{diag}(\widehat\sigma_{k+1},\ldots,\widehat\sigma_n),\quad
\overline\Sigma_1 = \mathrm{diag}(\overline\sigma_1,\ldots,\overline\sigma_k),\quad \overline\Sigma_2 = \mathrm{diag}(\overline\sigma_{k+1},\ldots,\overline\sigma_n).$$
The partitionings in (10.1) and (10.2) shall be done in such a way that
$$\max_{1\le i\le n}\rho(\sigma_i,\widehat\sigma_i) \le \tfrac12\|D_1^*-D_1^{-1}\|_2,\qquad
\max_{1\le i\le n}\rho(\sigma_i,\overline\sigma_i) \le \tfrac12\|D_2^*-D_2^{-1}\|_2,$$
$$\max_{1\le i\le n}\rho(\widehat\sigma_i,\widetilde\sigma_i) \le \tfrac12\|D_2^*-D_2^{-1}\|_2,\qquad
\max_{1\le i\le n}\rho(\overline\sigma_i,\widetilde\sigma_i) \le \tfrac12\|D_1^*-D_1^{-1}\|_2. \qquad (10.3)$$
Such partitionings are possible because of the relative perturbation theorems proved in Li [11].
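The first inequality in (10.3), a relative perturbation bound for the singular values of $\widehat B = D_1^* B$ of the kind proved in Li [11], can be checked numerically. A minimal sketch (real case, random test matrices; the sizes and the perturbation level are assumptions for illustration only):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 6
B = rng.standard_normal((n, n))
D1 = np.eye(n) + 0.05 * rng.standard_normal((n, n))   # nonsingular, close to I

sigma = np.linalg.svd(B, compute_uv=False)            # sigma_i, descending
sigma_hat = np.linalg.svd(D1.T @ B, compute_uv=False) # singular values of D1^* B

# relative distance rho(a, b) = |a - b| / sqrt(a b), applied pairwise
rho = np.abs(sigma - sigma_hat) / np.sqrt(sigma * sigma_hat)
bound = 0.5 * np.linalg.norm(D1.T - np.linalg.inv(D1), 2)
```

Every pairwise relative distance stays below $\frac12\|D_1^*-D_1^{-1}\|_2$, no matter how small the individual singular values are.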
By the fact $\rho_p(\alpha,\beta) \le 2^{-1/p}\rho(\alpha,\beta)$ (see Lemma 2.1), these inequalities imply
$$\max_{1\le i\le n}\rho_p(\sigma_i,\widehat\sigma_i) \le \frac{\|D_1^*-D_1^{-1}\|_2}{2^{1+1/p}},\qquad
\max_{1\le i\le n}\rho_p(\widehat\sigma_i,\widetilde\sigma_i) \le \frac{\|D_2^*-D_2^{-1}\|_2}{2^{1+1/p}},$$
$$\max_{1\le i\le n}\rho_p(\sigma_i,\overline\sigma_i) \le \frac{\|D_2^*-D_2^{-1}\|_2}{2^{1+1/p}},\qquad
\max_{1\le i\le n}\rho_p(\overline\sigma_i,\widetilde\sigma_i) \le \frac{\|D_1^*-D_1^{-1}\|_2}{2^{1+1/p}}. \qquad (10.4)$$
Consider $\widehat B$ and $\widetilde B = \widehat B D_2$. We have
$$\widehat B\widehat B^* = \widehat U\widehat\Sigma^2\widehat U^* = (\widehat U_1,\widehat U_2)\begin{pmatrix}\widehat\Sigma_1^2&\\&\widehat\Sigma_2^2\end{pmatrix}\begin{pmatrix}\widehat U_1^*\\\widehat U_2^*\end{pmatrix}, \qquad (10.5)$$
$$\widetilde B\widetilde B^* = \widetilde U\widetilde\Sigma^2\widetilde U^* = (\widetilde U_1,\widetilde U_2)\begin{pmatrix}\widetilde\Sigma_1^2&\\&\widetilde\Sigma_2^2\end{pmatrix}\begin{pmatrix}\widetilde U_1^*\\\widetilde U_2^*\end{pmatrix}. \qquad (10.6)$$
Notice that
$$\widetilde B\widetilde B^* - \widehat B\widehat B^* = \widetilde B D_2^*\widehat B^* - \widetilde B D_2^{-1}\widehat B^* = \widetilde B(D_2^* - D_2^{-1})\widehat B^*.$$
Pre- and post-multiply the equation by $\widetilde U^*$ and $\widehat U$, respectively, to get
$$\widetilde\Sigma^2\,\widetilde U^*\widehat U - \widetilde U^*\widehat U\,\widehat\Sigma^2 = \widetilde\Sigma\widetilde V^*(D_2^*-D_2^{-1})\widehat V\widehat\Sigma,$$
which gives
$$\widetilde\Sigma_2^2\,\widetilde U_2^*\widehat U_1 - \widetilde U_2^*\widehat U_1\,\widehat\Sigma_1^2 = \widetilde\Sigma_2\widetilde V_2^*(D_2^*-D_2^{-1})\widehat V_1\widehat\Sigma_1. \qquad (10.7)$$
Consider now $\overline B$ and $\widetilde B = D_1^*\overline B$. We have
$$\overline B^*\overline B = \overline V\,\overline\Sigma^2\,\overline V^* = (\overline V_1,\overline V_2)\begin{pmatrix}\overline\Sigma_1^2&\\&\overline\Sigma_2^2\end{pmatrix}\begin{pmatrix}\overline V_1^*\\\overline V_2^*\end{pmatrix}, \qquad (10.8)$$
$$\widetilde B^*\widetilde B = \widetilde V\widetilde\Sigma^2\widetilde V^* = (\widetilde V_1,\widetilde V_2)\begin{pmatrix}\widetilde\Sigma_1^2&\\&\widetilde\Sigma_2^2\end{pmatrix}\begin{pmatrix}\widetilde V_1^*\\\widetilde V_2^*\end{pmatrix}. \qquad (10.9)$$
Notice that
$$\widetilde B^*\widetilde B - \overline B^*\overline B = \widetilde B^* D_1^*\overline B - \widetilde B^* D_1^{-1}\overline B = \widetilde B^*(D_1^* - D_1^{-1})\overline B.$$
Pre- and post-multiply the equation by $\widetilde V^*$ and $\overline V$, respectively, to get
$$\widetilde\Sigma^2\,\widetilde V^*\overline V - \widetilde V^*\overline V\,\overline\Sigma^2 = \widetilde\Sigma\widetilde U^*(D_1^*-D_1^{-1})\overline U\,\overline\Sigma,$$
which gives
$$\widetilde\Sigma_2^2\,\widetilde V_2^*\overline V_1 - \widetilde V_2^*\overline V_1\,\overline\Sigma_1^2 = \widetilde\Sigma_2\widetilde U_2^*(D_1^*-D_1^{-1})\overline U_1\,\overline\Sigma_1. \qquad (10.10)$$
Two other eigendecompositions that will be used later in the proofs are
$$BB^* = U\Sigma^2 U^* = (U_1,U_2)\begin{pmatrix}\Sigma_1^2&\\&\Sigma_2^2\end{pmatrix}\begin{pmatrix}U_1^*\\U_2^*\end{pmatrix}, \qquad (10.11)$$
$$B^*B = V\Sigma^2 V^* = (V_1,V_2)\begin{pmatrix}\Sigma_1^2&\\&\Sigma_2^2\end{pmatrix}\begin{pmatrix}V_1^*\\V_2^*\end{pmatrix}. \qquad (10.12)$$

Proof of Theorem 4.7: Equations (10.7) and (10.10) and Lemma 8.1 produce
$$\|\sin\Theta(\widehat U_1,\widetilde U_1)\|_F = \|\widetilde U_2^*\widehat U_1\|_F \le \frac{\|\widetilde V_2^*(D_2^*-D_2^{-1})\widehat V_1\|_F}{\eta(\widehat\Sigma_1^2,\widetilde\Sigma_2^2)},
\qquad
\|\sin\Theta(\overline V_1,\widetilde V_1)\|_F = \|\widetilde V_2^*\overline V_1\|_F \le \frac{\|\widetilde U_2^*(D_1^*-D_1^{-1})\overline U_1\|_F}{\eta(\overline\Sigma_1^2,\widetilde\Sigma_2^2)},$$
where
$$\eta(\widehat\Sigma_1^2,\widetilde\Sigma_2^2) \overset{\rm def}{=} \min_{1\le i\le k,\,1\le j\le n-k}\rho(\widehat\sigma_i^2,\widetilde\sigma_{k+j}^2)
\quad\text{and}\quad
\eta(\overline\Sigma_1^2,\widetilde\Sigma_2^2) \overset{\rm def}{=} \min_{1\le i\le k,\,1\le j\le n-k}\rho(\overline\sigma_i^2,\widetilde\sigma_{k+j}^2).$$
(We abuse the notation $\eta$ here for convenience: $\eta$ has its own assignment in the statement of Theorem 4.7 but is re-defined as a function in this proof. Hopefully this causes no confusion.) On the other hand, applying Theorem 4.1 to $BB^*$ and $\widehat B\widehat B^* = D_1^*\,BB^*\,D_1$ leads to (see (10.5) and (10.11))
$$\|\sin\Theta(U_1,\widehat U_1)\|_F \le \frac{\sqrt{\|(I-D_1^{-1})U_1\|_F^2 + \|(D_1^*-I)U_1\|_F^2}}{\eta_2(\Sigma_1^2,\widehat\Sigma_2^2)},
\qquad \eta_2(\Sigma_1^2,\widehat\Sigma_2^2) \overset{\rm def}{=} \min_{1\le i\le k,\,1\le j\le n-k}\rho_2(\sigma_i^2,\widehat\sigma_{k+j}^2),$$
and applying Theorem 4.1 to $B^*B$ and $\overline B^*\overline B = D_2^*\,B^*B\,D_2$ leads to (see (10.8) and (10.12))
$$\|\sin\Theta(V_1,\overline V_1)\|_F \le \frac{\sqrt{\|(I-D_2^{-1})V_1\|_F^2 + \|(D_2^*-I)V_1\|_F^2}}{\eta_2(\Sigma_1^2,\overline\Sigma_2^2)},
\qquad \eta_2(\Sigma_1^2,\overline\Sigma_2^2) \overset{\rm def}{=} \min_{1\le i\le k,\,1\le j\le n-k}\rho_2(\sigma_i^2,\overline\sigma_{k+j}^2).$$
Let $\epsilon_1 = \|D_1^*-D_1^{-1}\|_2$ and $\epsilon_2 = \|D_2^*-D_2^{-1}\|_2$. We claim
$$\eta(\widehat\Sigma_1^2,\widetilde\Sigma_2^2) \ge 2\,\eta(\widehat\Sigma_1,\widetilde\Sigma_2)
\ge \begin{cases}\dfrac{2\eta(\Sigma_1,\widetilde\Sigma_2)-\epsilon_1}{1+\frac1{16}\epsilon_1\eta(\Sigma_1,\widetilde\Sigma_2)},\\[2mm]
2^{3/2}\eta_2(\Sigma_1,\widetilde\Sigma_2)-\epsilon_1.\end{cases} \qquad (10.13)$$
This is because, on one hand,
$$\rho(\widehat\sigma_i^2,\widetilde\sigma_{k+j}^2) \ge 2\rho(\widehat\sigma_i,\widetilde\sigma_{k+j})
\ge 2^{3/2}\rho_2(\widehat\sigma_i,\widetilde\sigma_{k+j}) \quad\text{(by Lemma 2.1)}$$
$$\ge 2^{3/2}\bigl[\rho_2(\sigma_i,\widetilde\sigma_{k+j}) - \rho_2(\sigma_i,\widehat\sigma_i)\bigr] \quad(\rho_2\text{ is a metric on }\mathbf R)
\ \ge\ 2^{3/2}\rho_2(\sigma_i,\widetilde\sigma_{k+j}) - \epsilon_1 \quad\text{(by (10.4))},$$
and, on the other hand,
$$\rho(\sigma_i,\widetilde\sigma_{k+j}) \le \rho(\sigma_i,\widehat\sigma_i) + \rho(\widehat\sigma_i,\widetilde\sigma_{k+j}) + \tfrac18\rho(\sigma_i,\widehat\sigma_i)\,\rho(\widehat\sigma_i,\widetilde\sigma_{k+j})\,\rho(\sigma_i,\widetilde\sigma_{k+j})$$
implies, by (10.3),
$$\rho(\widehat\sigma_i,\widetilde\sigma_{k+j}) \ge \frac{\rho(\sigma_i,\widetilde\sigma_{k+j}) - \rho(\sigma_i,\widehat\sigma_i)}{1+\frac18\rho(\sigma_i,\widehat\sigma_i)\rho(\sigma_i,\widetilde\sigma_{k+j})}
\ge \frac{\rho(\sigma_i,\widetilde\sigma_{k+j}) - \epsilon_1/2}{1+\frac1{16}\epsilon_1\rho(\sigma_i,\widetilde\sigma_{k+j})}.$$
Similarly, we have
$$\eta(\overline\Sigma_1^2,\widetilde\Sigma_2^2) \ge 2\,\eta(\overline\Sigma_1,\widetilde\Sigma_2)
\ge \begin{cases}\dfrac{2\eta(\Sigma_1,\widetilde\Sigma_2)-\epsilon_2}{1+\frac1{16}\epsilon_2\eta(\Sigma_1,\widetilde\Sigma_2)},\\[2mm]
2^{3/2}\eta_2(\Sigma_1,\widetilde\Sigma_2)-\epsilon_2,\end{cases}$$
and
$$\eta_2(\Sigma_1^2,\widehat\Sigma_2^2) \ge \eta_2(\Sigma_1,\widehat\Sigma_2) \ge \eta_2(\Sigma_1,\widetilde\Sigma_2) - 2^{-3/2}\epsilon_2,
\qquad
\eta_2(\Sigma_1^2,\overline\Sigma_2^2) \ge \eta_2(\Sigma_1,\overline\Sigma_2) \ge \eta_2(\Sigma_1,\widetilde\Sigma_2) - 2^{-3/2}\epsilon_1.$$
The proof will be completed by employing
$$\|\sin\Theta(U_1,\widetilde U_1)\|_F \le \|\sin\Theta(U_1,\widehat U_1)\|_F + \|\sin\Theta(\widehat U_1,\widetilde U_1)\|_F,$$
$$\|\sin\Theta(V_1,\widetilde V_1)\|_F \le \|\sin\Theta(V_1,\overline V_1)\|_F + \|\sin\Theta(\overline V_1,\widetilde V_1)\|_F,$$
since $\|\sin\Theta(\cdot,\cdot)\|_F$ is a metric on the space of $k$-dimensional subspaces [14].
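The final step relies on $\|\sin\Theta(\cdot,\cdot)\|_F$ being a metric on the set of $k$-dimensional subspaces [14]. A quick numerical sanity check of the triangle inequality, on three random subspaces (the sizes are illustrative assumptions):

```python
import numpy as np

def sin_theta_F(X, Y):
    """Frobenius norm of sin(Theta) between ran(X) and ran(Y),
    where X and Y have orthonormal columns (n x k)."""
    c = np.clip(np.linalg.svd(X.T @ Y, compute_uv=False), 0.0, 1.0)  # canonical cosines
    return np.sqrt(np.sum(1.0 - c**2))                               # ||sin Theta||_F

rng = np.random.default_rng(4)
n, k = 7, 3
X, Y, Z = (np.linalg.qr(rng.standard_normal((n, k)))[0] for _ in range(3))

d_xz = sin_theta_F(X, Z)
d_xy = sin_theta_F(X, Y)
d_yz = sin_theta_F(Y, Z)
```

Here `d_xz <= d_xy + d_yz` is exactly the triangle inequality invoked at the end of the proofs of Theorems 4.7 and 4.8.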
Proof of Theorem 4.8: Denote $\beta = \alpha+\delta$. Let $\widehat\alpha$ and $\overline\alpha$ be the largest positive numbers such that
$$\rho(\alpha,\widehat\alpha) \le \tfrac12\|D_2^*-D_2^{-1}\|_2
\quad\text{and}\quad
\rho(\alpha,\overline\alpha) \le \tfrac12\|D_1^*-D_1^{-1}\|_2,$$
which guarantee that $\|\widehat\Sigma_2\|_2 \le \widehat\alpha$, $\|\overline\Sigma_2\|_2 \le \overline\alpha$, and
$$\rho_p(\alpha,\widehat\alpha) \le \frac{\|D_2^*-D_2^{-1}\|_2}{2^{1+1/p}}
\quad\text{and}\quad
\rho_p(\alpha,\overline\alpha) \le \frac{\|D_1^*-D_1^{-1}\|_2}{2^{1+1/p}};$$
and let $\widehat\beta$ and $\overline\beta$ be the smallest positive numbers such that
$$\rho(\beta,\widehat\beta) \le \tfrac12\|D_1^*-D_1^{-1}\|_2
\quad\text{and}\quad
\rho(\beta,\overline\beta) \le \tfrac12\|D_2^*-D_2^{-1}\|_2,$$
which guarantee that $\|\widehat\Sigma_1^{-1}\|_2^{-1} \ge \widehat\beta$, $\|\overline\Sigma_1^{-1}\|_2^{-1} \ge \overline\beta$, and
$$\rho_p(\beta,\widehat\beta) \le \frac{\|D_1^*-D_1^{-1}\|_2}{2^{1+1/p}}
\quad\text{and}\quad
\rho_p(\beta,\overline\beta) \le \frac{\|D_2^*-D_2^{-1}\|_2}{2^{1+1/p}}.$$
Condition (4.15) implies $\min\{\widehat\beta,\overline\beta\} > \alpha$ and $\beta > \max\{\widehat\alpha,\overline\alpha\}$. Equations (10.7) and (10.10) and Lemma 8.2 produce
$$|||\sin\Theta(\widehat U_1,\widetilde U_1)||| = |||\widetilde U_2^*\widehat U_1||| \le \frac{|||\widetilde V_2^*(D_2^*-D_2^{-1})\widehat V_1|||}{\rho(\alpha^2,\widehat\beta^2)},
\qquad
|||\sin\Theta(\overline V_1,\widetilde V_1)||| = |||\widetilde V_2^*\overline V_1||| \le \frac{|||\widetilde U_2^*(D_1^*-D_1^{-1})\overline U_1|||}{\rho(\alpha^2,\overline\beta^2)}.$$
On the other hand, applying Theorem 4.2 to $BB^*$ and $\widehat B\widehat B^* = D_1^*\,BB^*\,D_1$ leads to (see (10.5) and (10.11))
$$|||\sin\Theta(U_1,\widehat U_1)||| \le \frac{\sqrt[q]{|||(I-D_1^{-1})U_1|||^q + |||(D_1^*-I)U_1|||^q}}{\rho_p(\widehat\alpha^2,\beta^2)},$$
and applying Theorem 4.2 to $B^*B$ and $\overline B^*\overline B = D_2^*\,B^*B\,D_2$ leads to (see (10.8) and (10.12))
$$|||\sin\Theta(V_1,\overline V_1)||| \le \frac{\sqrt[q]{|||(I-D_2^{-1})V_1|||^q + |||(D_2^*-I)V_1|||^q}}{\rho_p(\overline\alpha^2,\beta^2)}.$$
Notice that, with $\epsilon_1 = \|D_1^*-D_1^{-1}\|_2$ and $\epsilon_2 = \|D_2^*-D_2^{-1}\|_2$,
$$\rho(\alpha^2,\widehat\beta^2) \ge 2\rho(\alpha,\widehat\beta)
\ge \begin{cases}\dfrac{2\rho(\alpha,\beta)-\epsilon_1}{1+\frac1{16}\epsilon_1\rho(\alpha,\beta)},\\[2mm]
2^{1+1/p}\rho_p(\alpha,\beta)-\epsilon_1,\end{cases}
\qquad
\rho(\alpha^2,\overline\beta^2) \ge 2\rho(\alpha,\overline\beta)
\ge \begin{cases}\dfrac{2\rho(\alpha,\beta)-\epsilon_2}{1+\frac1{16}\epsilon_2\rho(\alpha,\beta)},\\[2mm]
2^{1+1/p}\rho_p(\alpha,\beta)-\epsilon_2,\end{cases}$$
and
$$\rho_p(\widehat\alpha^2,\beta^2) \ge \rho_p(\widehat\alpha,\beta) \ge \rho_p(\alpha,\beta) - \frac{\epsilon_2}{2^{1+1/p}},
\qquad
\rho_p(\overline\alpha^2,\beta^2) \ge \rho_p(\overline\alpha,\beta) \ge \rho_p(\alpha,\beta) - \frac{\epsilon_1}{2^{1+1/p}}.$$
The proof will be completed by employing
$$|||\sin\Theta(U_1,\widetilde U_1)||| \le |||\sin\Theta(U_1,\widehat U_1)||| + |||\sin\Theta(\widehat U_1,\widetilde U_1)|||,$$
$$|||\sin\Theta(V_1,\widetilde V_1)||| \le |||\sin\Theta(V_1,\overline V_1)||| + |||\sin\Theta(\overline V_1,\widetilde V_1)|||,$$
since $|||\sin\Theta(\cdot,\cdot)|||$ is a metric on the space of $k$-dimensional subspaces [14].

11 Conclusions and Further Extensions to Diagonalizable Matrices

We have developed a relative perturbation theory for eigenspace and singular subspace variations under multiplicative perturbations. The theory extends the Davis--Kahan $\sin\theta$ theorems and the Wedin $\sin\theta$ theorems of classical perturbation theory. Our unifying treatment covers almost all previously studied cases over the last six years or so.
Using techniques similar to those of this paper, one can also develop a relative perturbation theory for eigenspaces of diagonalizable matrices: $A$ and $\widetilde A = D_1^* A D_2$ both diagonalizable, where $D_1$ and $D_2$ are close to the identity matrix. We outline one way of doing this. Let the eigendecompositions of $A$ and $\widetilde A$ be
$$AX = X\Lambda = (X_1,X_2)\begin{pmatrix}\Lambda_1&\\&\Lambda_2\end{pmatrix}
\qquad\text{and}\qquad
\widetilde A\widetilde X = \widetilde X\widetilde\Lambda = (\widetilde X_1,\widetilde X_2)\begin{pmatrix}\widetilde\Lambda_1&\\&\widetilde\Lambda_2\end{pmatrix},$$
where $X,\widetilde X\in\mathbf C^{n\times n}$ are nonsingular, $X_1,\widetilde X_1\in\mathbf C^{n\times k}$ ($1\le k<n$), and
$$\Lambda_1 = \mathrm{diag}(\lambda_1,\ldots,\lambda_k),\quad \Lambda_2 = \mathrm{diag}(\lambda_{k+1},\ldots,\lambda_n),\quad
\widetilde\Lambda_1 = \mathrm{diag}(\widetilde\lambda_1,\ldots,\widetilde\lambda_k),\quad \widetilde\Lambda_2 = \mathrm{diag}(\widetilde\lambda_{k+1},\ldots,\widetilde\lambda_n);$$
the $\lambda_i$'s and $\widetilde\lambda_j$'s may be complex. Partition
$$X^{-1} = \begin{pmatrix}Y_1\\Y_2\end{pmatrix}
\qquad\text{and}\qquad
\widetilde X^{-1} = \begin{pmatrix}\widetilde Y_1\\\widetilde Y_2\end{pmatrix},$$
where $Y_1,\widetilde Y_1\in\mathbf C^{k\times n}$. Define $R \overset{\rm def}{=} \widetilde A X_1 - X_1\Lambda_1 = (\widetilde A - A)X_1$. We have
$$\widetilde Y_2 R = \widetilde Y_2\widetilde A X_1 - \widetilde Y_2 X_1\Lambda_1 = \widetilde\Lambda_2\widetilde Y_2 X_1 - \widetilde Y_2 X_1\Lambda_1,$$
$$\widetilde Y_2 R = \widetilde Y_2(\widetilde A - A)X_1 = \widetilde Y_2(D_1^* A D_2 - D_1^* A + D_1^* A - A)X_1
= \widetilde Y_2\bigl[\widetilde A(I-D_2^{-1}) + (D_1^*-I)A\bigr]X_1
= \widetilde\Lambda_2\widetilde Y_2(I-D_2^{-1})X_1 + \widetilde Y_2(D_1^*-I)X_1\Lambda_1.$$
Thus we have the following perturbation equation
$$\widetilde\Lambda_2\,\widetilde Y_2 X_1 - \widetilde Y_2 X_1\,\Lambda_1 = \widetilde\Lambda_2\widetilde Y_2(I-D_2^{-1})X_1 + \widetilde Y_2(D_1^*-I)X_1\Lambda_1,$$
from which various bounds on $\sin\Theta(X_1,\widetilde X_1)$ can be derived under certain conditions. For example, let $\eta_2 \overset{\rm def}{=} \min_{1\le i\le k,\,1\le j\le n-k}\rho_2(\lambda_i,\widetilde\lambda_{k+j})$. If $\eta_2 > 0$, then by Lemma 7.1 we have
$$\|\widetilde Y_2 X_1\|_F \le \frac1{\eta_2}\sqrt{\|\widetilde Y_2(I-D_2^{-1})X_1\|_F^2 + \|\widetilde Y_2(D_1^*-I)X_1\|_F^2}
\le \frac1{\eta_2}\|\widetilde Y_2\|_2\,\|X_1\|_2\sqrt{\|I-D_2^{-1}\|_F^2 + \|D_1^*-I\|_F^2}.$$
Notice that by Lemma 2.2,
$$\|\sin\Theta(X_1,\widetilde X_1)\|_F = \|(\widetilde Y_2\widetilde Y_2^*)^{-1/2}\widetilde Y_2 X_1(X_1^*X_1)^{-1/2}\|_F
\le \|(\widetilde Y_2\widetilde Y_2^*)^{-1/2}\|_2\,\|\widetilde Y_2 X_1\|_F\,\|(X_1^*X_1)^{-1/2}\|_2.$$
Then a bound on $\|\sin\Theta(X_1,\widetilde X_1)\|_F$ is immediately available.

Acknowledgement: I thank Professor W. Kahan for his consistent encouragement and support, and Professors J. Demmel and B. N. Parlett for helpful discussions.

References

[1] J. Barlow and J. Demmel. Computing accurate eigensystems of scaled diagonally dominant matrices. SIAM Journal on Numerical Analysis, 27:762-791, 1990.

[2] C. Davis and W. Kahan. The rotation of eigenvectors by a perturbation. III. SIAM Journal on Numerical Analysis, 7:1-46, 1970.

[3] J. Demmel and W. Gragg.
On computing accurate singular values and eigenvalues of matrices with acyclic graphs. Linear Algebra and its Applications, 185:203-217, 1993.

[4] J. Demmel and W. Kahan. Accurate singular values of bidiagonal matrices. SIAM Journal on Scientific and Statistical Computing, 11:873-912, 1990.

[5] J. Demmel and K. Veselić. Jacobi's method is more accurate than QR. SIAM Journal on Matrix Analysis and Applications, 13(4):1204-1245, 1992.

[6] S. C. Eisenstat and I. C. F. Ipsen. Relative perturbation techniques for singular value problems. Research Report YALEU/DCS/RR-942, Department of Computer Science, Yale University, 1993.

[7] S. C. Eisenstat and I. C. F. Ipsen. Relative perturbation bounds for eigenspaces and singular vector subspaces. In J. G. Lewis, editor, Proceedings of the Fifth SIAM Conference on Applied Linear Algebra, pages 62-66, Philadelphia, 1994. SIAM Publications.

[8] R. A. Horn and C. R. Johnson. Matrix Analysis. Cambridge University Press, Cambridge, 1985.

[9] W. Kahan. Accurate eigenvalues of a symmetric tridiagonal matrix. Technical Report CS41, Computer Science Department, Stanford University, Stanford, CA, 1966 (revised June 1968).

[10] R.-C. Li. On perturbations of matrix pencils with real spectra. Mathematics of Computation, 62:231-265, 1994.

[11] R.-C. Li. Relative perturbation theory: (I) eigenvalue and singular value variations. Technical Report UCB//CSD-94-855, Computer Science Division, Department of EECS, University of California at Berkeley, 1994 (revised January 1996).

[12] R. Mathias. Spectral perturbation bounds for graded positive definite matrices. Manuscript, Department of Mathematics, College of William & Mary, 1994.

[13] A. M. Ostrowski. A quantitative formulation of Sylvester's law of inertia. Proceedings of the National Academy of Sciences (USA), 45:740-744, 1959.

[14] G. W. Stewart and J.-G. Sun. Matrix Perturbation Theory. Academic Press, Boston, 1990.

[15] P.-Å. Wedin. Perturbation bounds in connection with singular value decomposition.
BIT, 12:99-111, 1972.