Numerical Methods I: Eigenvalues and eigenvectors
Georg Stadler, Courant Institute, NYU
stadler@cims.nyu.edu
Oct 29, 2015

Sources/reading
Main reference: Section 5 in Deuflhard/Hohmann.

Overview: Conditioning

Eigenvalues and eigenvectors
How hard are they to find? For a matrix A ∈ R^{n×n} (possibly symmetric and/or positive definite; A can also be complex), we want to find λ and x such that

  Ax = λx.

- This is a nonlinear problem.
- How difficult is this? Eigenvalues are the roots of the characteristic polynomial, and conversely any polynomial is the characteristic polynomial of some matrix. Since the roots of polynomials of degree 5 or higher cannot, in general, be expressed in closed form (Abel–Ruffini), the eigenvalues of matrices larger than 4 × 4 cannot be computed analytically.
- We must therefore use an iterative algorithm.

Conditioning
Consider an algebraically simple eigenvalue λ_0 of A: Ax_0 = λ_0 x_0. Then there exists a continuously differentiable map λ(·): A ↦ λ(A) in a neighborhood of A such that λ(A) = λ_0. Its derivative is

  λ′(A)C = (Cx_0, y_0) / (x_0, y_0),

where y_0 is an eigenvector of A^T for the eigenvalue λ_0.

Compute the norm of λ′(A) as the linear map C ↦ (Cx_0, y_0)/(x_0, y_0). Using for C the norm induced by the Euclidean norm,

  ‖λ′(A)‖ = ‖x_0‖ ‖y_0‖ / |(x_0, y_0)|,

i.e., the reciprocal of the cosine of the angle between x_0 and y_0.

Theorem: The absolute and relative condition numbers for computing a simple eigenvalue λ_0 are

  κ_abs = ‖λ′(A)‖ = 1 / |cos ∠(x_0, y_0)|   and   κ_rel = ‖A‖ / (|λ_0| |cos ∠(x_0, y_0)|).

In particular, for normal matrices κ_abs = 1. Note that finding non-simple eigenvalues is an ill-posed problem (but can still be done).

Overview: Power method and variants

The power method
Choose a starting vector x^0 and iterate

  x̃^{k+1} := A x^k,   x^{k+1} := x̃^{k+1} / ‖x̃^{k+1}‖.

Idea: the eigenvector corresponding to the largest (in absolute value) eigenvalue starts to dominate, i.e., x^k converges to the eigenvector x for the largest eigenvalue.
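The iteration above can be sketched in NumPy as follows; the 2 × 2 example matrix, the seeded random starting vector, and the fixed iteration count are illustrative choices, not from the slides:

```python
import numpy as np

def power_method(A, num_iters=200):
    """Power iteration: approximate the dominant eigenpair of A."""
    n = A.shape[0]
    rng = np.random.default_rng(0)
    x = rng.standard_normal(n)       # random start: generically has a
    x /= np.linalg.norm(x)           # component in the dominant eigendirection
    for _ in range(num_iters):
        x = A @ x                    # x^{k+1} := A x^k
        x /= np.linalg.norm(x)       # normalize
    lam = x @ A @ x                  # Rayleigh quotient x^T A x (with ||x|| = 1)
    return lam, x

# Example: the eigenvalues of [[2, 1], [1, 2]] are 3 and 1
A = np.array([[2.0, 1.0], [1.0, 2.0]])
lam, x = power_method(A)
print(lam)  # ≈ 3.0
```

The error contracts by a factor |λ_2/λ_1| per step (here 1/3), which illustrates the dependence of the convergence speed on the eigenvalues.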
- Convergence speed depends on the eigenvalues (the gap between the largest and second-largest in absolute value).
- Only finds the largest eigenvalue; upon convergence, λ_max = x^T A x.

The power method—variants
Inverse power method: Given an estimate λ̄ for an eigenvalue λ_i, consider the matrix (A − λ̄I)^{-1}, which has the eigenvalues (λ_i − λ̄)^{-1}, i = 1, ..., n. This leads to the inverse power iteration

  (A − λ̄I) x̃^{k+1} = x^k,   x^{k+1} := x̃^{k+1} / ‖x̃^{k+1}‖.

- Requires a linear solve in every iteration.
- Convergence speed depends on how close λ̄ is to λ_i.

The power method—variants
Rayleigh quotient iteration: an accelerated version of the inverse power method using changing shifts:
- Choose a starting vector x^0 with ‖x^0‖ = 1 and compute λ^0 = (x^0)^T A x^0.
- For k = 0, 1, ... do: solve (A − λ^k I) x̃^{k+1} = x^k, set x^{k+1} := x̃^{k+1}/‖x̃^{k+1}‖, compute λ^{k+1} = (x^{k+1})^T A x^{k+1}, and go back.

Overview: The QR algorithm

The QR algorithm
The QR algorithm for finding eigenvalues is as follows: set A_0 := A and, for k = 0, 1, ...:
- Compute the QR decomposition of A_k, i.e., A_k = QR.
- Set A_{k+1} := RQ, k := k + 1, and go back.
Why should that converge to something useful?
- Similarity transformations do not change the eigenvalues, i.e., if P is invertible, then A and P^{-1}AP have the same eigenvalues.
- Q A_{k+1} Q^T = Q R Q Q^T = Q R = A_k, i.e., the iterates A_k of the QR algorithm all have the same eigenvalues.
- The algorithm is closely related to the Rayleigh quotient iteration.
- The algorithm is expensive (each QR decomposition costs O(n^3) flops).
- Convergence can be slow.

QR algorithm and Hessenberg matrices
Idea: find a matrix format that is preserved by the QR algorithm. Hessenberg matrices H are matrices for which H_{ij} = 0 whenever i > j + 1.
- Hessenberg matrices remain Hessenberg in the QR algorithm.
- One iteration of the QR algorithm with a Hessenberg matrix requires O(n^2) flops.
Algorithm:
1. Use Givens rotations to transform A into Hessenberg form, applying the transposed rotations from the right as well (a similarity transformation).
2. Apply the QR algorithm to the Hessenberg matrix.

Overview: The QR algorithm for symmetric matrices

QR algorithm for symmetric matrices
Let us consider symmetric matrices A. Then the Hessenberg form (after applying Givens rotations from both sides) is tridiagonal.

Theorem: For a symmetric matrix with distinct eigenvalues |λ_1| > ... > |λ_n| > 0 and Λ = diag(λ_1, ..., λ_n), the following hold:
1. lim_{k→∞} Q_k = I,
2. lim_{k→∞} R_k = Λ,
3. a_{ij}^{(k)} = O((λ_i / λ_j)^k) for i > j.

- The method also converges in the presence of multiple eigenvalues. Only when λ_i = −λ_{i+1} does the corresponding block fail to become diagonal.
- One can speed up the method by introducing a shift σ_k:

    A_k − σ_k I = QR,   A_{k+1} = RQ + σ_k I.

  Again, the shift can be updated during the iteration.

QR algorithm for symmetric matrices—summary
Complexity: converting to Hessenberg (here: tridiagonal) form using Givens rotations costs 4/3 n^3 flops; each QR iteration then costs O(n^2) flops. Overall, the cost is dominated by the reduction to tridiagonal form. This method finds all eigenvalues of a symmetric matrix, and the corresponding eigenvectors can be recovered from the algorithm as well.
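The shifted QR iteration above can be sketched in NumPy as follows. The simple Rayleigh shift σ_k = (A_k)_{nn} and the 3 × 3 tridiagonal example matrix are illustrative choices; practical implementations combine the tridiagonal reduction with a Wilkinson shift and deflation:

```python
import numpy as np

def qr_eigenvalues(A, num_iters=100):
    """Shifted QR iteration: A_k - sigma_k I = QR, A_{k+1} = RQ + sigma_k I.
    Each step is a similarity transform (A_{k+1} = Q^T A_k Q), so the
    eigenvalues are preserved; for symmetric A the iterates approach a
    diagonal matrix, whose entries are the eigenvalues."""
    Ak = np.array(A, dtype=float)
    I = np.eye(Ak.shape[0])
    for _ in range(num_iters):
        sigma = Ak[-1, -1]                 # Rayleigh shift: last diagonal entry
        Q, R = np.linalg.qr(Ak - sigma * I)
        Ak = R @ Q + sigma * I             # A_{k+1} := RQ + sigma I
    return np.sort(np.diag(Ak))

# Symmetric tridiagonal example; exact eigenvalues are 3 - sqrt(3), 3, 3 + sqrt(3)
A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 4.0]])
print(qr_eigenvalues(A))
```

Since the input is already tridiagonal, each `np.linalg.qr` call here does the O(n^2) work the slides describe; for a general symmetric matrix one would first reduce to this form with Givens rotations.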