Iterative Methods and QR Factorization
Lecture 5
Alessandra Nardi
Thanks to Prof. Jacob White, Suvranu De, Deepak Ramaswamy, Michal Rewienski, and Karen Veroy

Last lecture review
• Solution of systems of linear equations Mx = b
• Gaussian Elimination basics
 – LU factorization (M = LU)
 – Pivoting for accuracy enhancement
 – Error mechanisms (round-off)
   • Ill-conditioning
   • Numerical stability
 – Complexity: O(N^3)
• Gaussian Elimination for sparse matrices
 – Improved computational cost: factor in O(N^{1.5})
 – Data structures
 – Pivoting for sparsity (Markowitz reordering)
 – Graph-based approach

Solving Linear Systems
• Direct methods: find the exact solution in a finite number of steps
 – Gaussian Elimination
• Iterative methods: produce a sequence of approximate solutions that hopefully converges to the exact solution
 – Stationary
   • Jacobi
   • Gauss-Seidel
   • SOR (Successive Overrelaxation Method)
 – Non-stationary
   • GCR, CG, GMRES, ...

Iterative Methods
Iterative methods can be expressed in the general form
  x^{(k)} = F(x^{(k-1)})
A point s such that F(s) = s is called a fixed point. Hopefully x^{(k)} converges to s, the solution of our problem.
• Will it converge? How rapidly?

Iterative Methods
Stationary:
  x^{(k+1)} = G x^{(k)} + c
where G and c do not depend on the iteration count k.
Non-stationary:
  x^{(k+1)} = x^{(k)} + a_k p^{(k)}
where the computation involves information that changes at each iteration.

Iterative – Stationary: Jacobi
In the i-th equation, solve for the value of x_i while assuming the other entries of x remain fixed:
  \sum_{j=1}^{N} m_{ij} x_j = b_i \quad \Rightarrow \quad x_i = \frac{b_i - \sum_{j \ne i} m_{ij} x_j}{m_{ii}}
which gives the iteration
  x_i^{(k)} = \frac{b_i - \sum_{j \ne i} m_{ij} x_j^{(k-1)}}{m_{ii}}
In matrix terms the method becomes
  x^{(k)} = D^{-1} (L + U) x^{(k-1)} + D^{-1} b
where D, -L, and -U represent the diagonal, the strictly lower-triangular, and the strictly upper-triangular parts of M.

Iterative – Stationary: Gauss-Seidel
Like Jacobi, but now previously computed results are used as soon as they are available:
  x_i^{(k)} = \frac{b_i - \sum_{j < i} m_{ij} x_j^{(k)} - \sum_{j > i} m_{ij} x_j^{(k-1)}}{m_{ii}}
In matrix terms the method becomes
  x^{(k)} = (D - L)^{-1} (U x^{(k-1)} + b)
where D, -L, and -U represent the diagonal, the strictly lower-triangular, and the strictly upper-triangular parts of M.

Iterative – Stationary: Successive Overrelaxation (SOR)
Devised by applying extrapolation to Gauss-Seidel in the form of a weighted average between the Gauss-Seidel update \bar{x}_i^{(k)} and the previous iterate:
  x_i^{(k)} = w \bar{x}_i^{(k)} + (1 - w) x_i^{(k-1)}
In matrix terms the method becomes
  x^{(k)} = (D - wL)^{-1} (wU + (1 - w)D) x^{(k-1)} + w (D - wL)^{-1} b
where D, -L, and -U are as above; w is chosen to accelerate convergence.

Iterative – Non Stationary
The iterates x^{(k)} are updated at each iteration by a multiple a_k of the search direction vector p^{(k)}:
  x^{(k+1)} = x^{(k)} + a_k p^{(k)}
Convergence depends on the spectral properties of the matrix M.
• Where does all this come from? What are the search directions? How do I choose a_k? We will explore this in detail in the next lectures.
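The three stationary schemes above differ only in how quickly new information is reused within a sweep. Below is a minimal numpy sketch of Jacobi, Gauss-Seidel, and SOR; the test matrix, right-hand side, iteration count, and function names are my own illustrative choices, not from the lecture.

```python
import numpy as np

def jacobi(M, b, x, iters=50):
    """Jacobi: entry i of sweep k uses only values from sweep k-1."""
    D = np.diag(M)                     # diagonal of M
    R = M - np.diagflat(D)             # off-diagonal part, equal to -(L + U)
    for _ in range(iters):
        x = (b - R @ x) / D            # x^(k) = D^{-1}((L + U) x^(k-1) + b)
    return x

def gauss_seidel(M, b, x, iters=50, w=1.0):
    """Gauss-Seidel (w = 1) or SOR (1 < w < 2): reuse updated entries at once."""
    N = len(b)
    for _ in range(iters):
        for i in range(N):
            s = b[i] - M[i, :i] @ x[:i] - M[i, i+1:] @ x[i+1:]
            x[i] = w * s / M[i, i] + (1.0 - w) * x[i]   # weighted average
    return x

# Diagonally dominant example, so both iterations converge.
M = np.array([[4.0, -1.0, 0.0], [-1.0, 4.0, -1.0], [0.0, -1.0, 4.0]])
b = np.array([2.0, 4.0, 10.0])
print(jacobi(M, b, np.zeros(3)))                # -> approx. [1, 2, 3]
print(gauss_seidel(M, b, np.zeros(3)))          # -> approx. [1, 2, 3]
print(gauss_seidel(M, b, np.zeros(3), w=1.1))   # SOR variant
```

Because the example is diagonally dominant, all three iterations converge; Gauss-Seidel typically needs fewer sweeps than Jacobi because each sweep reuses the freshest values.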
Outline
• QR factorization
 – A direct method to solve linear systems
   • Problems that generate singular matrices
 – Modified Gram-Schmidt algorithm
 – QR pivoting
   • Matrix must be singular: move the zero column to the end.
 – Minimization viewpoint
   • Link to iterative non-stationary methods (Krylov subspace)

LU Factorization fails – Singular Example
Nodal analysis of the circuit with node voltages v_1, ..., v_4 gives
  \begin{bmatrix} 1 & -1 & 0 & 0 \\ -1 & 1 & 0 & 0 \\ 0 & 0 & 1 & -1 \\ 0 & 0 & -1 & 2 \end{bmatrix}
  \begin{bmatrix} v_1 \\ v_2 \\ v_3 \\ v_4 \end{bmatrix} =
  \begin{bmatrix} -1 \\ 1 \\ -1 \\ 0 \end{bmatrix}
The resulting nodal matrix is SINGULAR, but a solution exists!

LU Factorization fails – Singular Example
One step of Gaussian Elimination produces a zero pivot:
  \begin{bmatrix} 1 & -1 & 0 & 0 \\ -1 & 1 & 0 & 0 \\ 0 & 0 & 1 & -1 \\ 0 & 0 & -1 & 2 \end{bmatrix}
  \quad \longrightarrow \quad
  \begin{bmatrix} 1 & -1 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & -1 \\ 0 & 0 & -1 & 2 \end{bmatrix}
The resulting nodal matrix is SINGULAR, but a solution exists!
Solution (from the picture):
  v_4 = -1
  v_3 = -2
  v_2 = anything you want (infinitely many solutions)
  v_1 = v_2 - 1

QR Factorization – Singular Example
Recall the weighted-sum-of-columns view of systems of equations:
  [M_1 \; M_2 \; \cdots \; M_N] \, x = b \quad \Leftrightarrow \quad x_1 M_1 + x_2 M_2 + \cdots + x_N M_N = b
M is singular, but b is in the span of the columns of M.

QR Factorization – Key idea
If M has orthogonal columns, then M_i^T M_j = 0 for i \ne j.
Multiplying the weighted-columns equation by the i-th column:
  M_i^T (x_1 M_1 + x_2 M_2 + \cdots + x_N M_N) = M_i^T b
Simplifying using orthogonality:
  x_i \, M_i^T M_i = M_i^T b \quad \Rightarrow \quad x_i = \frac{M_i^T b}{M_i^T M_i}

QR Factorization – M orthonormal
(Figure: picture of the two-dimensional case, contrasting the decomposition of b along non-orthogonal columns M_1, M_2 with weights x_1, x_2 against the orthogonal case.)
M is orthonormal if M_i^T M_j = 0 for i \ne j and M_i^T M_i = 1.

QR Factorization – Key idea
Original matrix:
  [M_1 \; M_2 \; \cdots \; M_N] \, x = b
Matrix with orthonormal columns:
  [Q_1 \; Q_2 \; \cdots \; Q_N] \, y = b, \quad \text{i.e.} \quad Qy = b \Rightarrow y = Q^T b
How do we perform the conversion?

QR Factorization – Projection formula
Given M_1 and M_2, find Q_2 = M_2 - r_{12} M_1 so that M_1^T Q_2 = 0:
  M_1^T (M_2 - r_{12} M_1) = 0 \quad \Rightarrow \quad r_{12} = \frac{M_1^T M_2}{M_1^T M_1}

QR Factorization – Normalization
The formulas simplify if we normalize:
  Q_1 = \frac{1}{r_{11}} M_1, \quad r_{11} = \sqrt{M_1^T M_1}, \quad \text{so that } Q_1^T Q_1 = 1
Now find \tilde{Q}_2 = M_2 - r_{12} Q_1 so that Q_1^T \tilde{Q}_2 = 0, which gives r_{12} = Q_1^T M_2.
Finally normalize: Q_2 = \frac{1}{r_{22}} \tilde{Q}_2, with r_{22} = \sqrt{\tilde{Q}_2^T \tilde{Q}_2}.

QR Factorization – 2x2 case
Mx = b and Qy = b imply Mx = Qy:
  x_1 M_1 + x_2 M_2 = y_1 Q_1 + y_2 Q_2
Substituting M_1 = r_{11} Q_1 and M_2 = r_{12} Q_1 + r_{22} Q_2:
  \begin{bmatrix} r_{11} & r_{12} \\ 0 & r_{22} \end{bmatrix}
  \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} =
  \begin{bmatrix} y_1 \\ y_2 \end{bmatrix}

QR Factorization – 2x2 case
  [M_1 \; M_2] \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}
  = [Q_1 \; Q_2] \begin{bmatrix} r_{11} & r_{12} \\ 0 & r_{22} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}
  = \begin{bmatrix} b_1 \\ b_2 \end{bmatrix}
with [Q_1 Q_2] orthonormal and the r-matrix upper triangular.
Two-step solve, given the QR factors:
  Step 1) QRx = b \Rightarrow Rx = Q^T b = \tilde{b}
  Step 2) Backsolve Rx = \tilde{b}

QR Factorization – General case
  [M_1 \;\; M_2 \;\; M_3] \longrightarrow [M_1 \;\; M_2 - r_{12} M_1 \;\; M_3 - r_{13} M_1 - r_{23} M_2]
To ensure the third column is orthogonal to the first two:
  M_1^T (M_3 - r_{13} M_1 - r_{23} M_2) = 0
  M_2^T (M_3 - r_{13} M_1 - r_{23} M_2) = 0

QR Factorization – General case
These orthogonality conditions form a linear system for the coefficients:
  \begin{bmatrix} M_1^T M_1 & M_1^T M_2 \\ M_2^T M_1 & M_2^T M_2 \end{bmatrix}
  \begin{bmatrix} r_{13} \\ r_{23} \end{bmatrix} =
  \begin{bmatrix} M_1^T M_3 \\ M_2^T M_3 \end{bmatrix}
In general, one must solve an NxN dense linear system for the coefficients.

QR Factorization – General case
To orthogonalize the Nth vector:
  \begin{bmatrix} M_1^T M_1 & \cdots & M_1^T M_{N-1} \\ \vdots & \ddots & \vdots \\ M_{N-1}^T M_1 & \cdots & M_{N-1}^T M_{N-1} \end{bmatrix}
  \begin{bmatrix} r_{1,N} \\ \vdots \\ r_{N-1,N} \end{bmatrix} =
  \begin{bmatrix} M_1^T M_N \\ \vdots \\ M_{N-1}^T M_N \end{bmatrix}
That is roughly N^2 inner products, i.e. O(N^3) work, for this one column.

QR Factorization – General case: Modified Gram-Schmidt Algorithm
  [M_1 \;\; M_2 \;\; M_3] \longrightarrow [M_1 \;\; M_2 - r_{12} Q_1 \;\; M_3 - r_{13} Q_1 - r_{23} Q_2]
To ensure the third column is orthogonal, orthogonalize against the already orthonormal Q_1 and Q_2:
  Q_1^T (M_3 - r_{13} Q_1 - r_{23} Q_2) = 0 \quad \Rightarrow \quad r_{13} = Q_1^T M_3
  Q_2^T (M_3 - r_{13} Q_1 - r_{23} Q_2) = 0 \quad \Rightarrow \quad r_{23} = Q_2^T M_3
The coefficients decouple: no system has to be solved.

QR Factorization: Modified Gram-Schmidt Algorithm (source-column oriented approach)
  For i = 1 to N                          "for each source column"
    r_{ii} = \sqrt{M_i^T M_i}
    Q_i = (1/r_{ii}) M_i                   (normalize)
    For j = i+1 to N                      "for each target column right of the source"
      r_{ij} = Q_i^T M_j
      M_j = M_j - r_{ij} Q_i
    end
  end
Cost: normalization takes \sum_{i=1}^{N} 2N = 2N^2 operations; the inner loop takes \sum_{i=1}^{N} (N - i) \, 2N \approx N^3 operations.

QR Factorization – By picture
  [M_1 \; M_2 \; M_3 \; M_4] = [Q_1 \; Q_2 \; Q_3 \; Q_4]
  \begin{bmatrix} r_{11} & r_{12} & r_{13} & r_{14} \\ & r_{22} & r_{23} & r_{24} \\ & & r_{33} & r_{34} \\ & & & r_{44} \end{bmatrix}

QR Factorization – Matrix-Vector Product View
Suppose only matrix-vector products were available. Writing
  x = x_1 e_1 + x_2 e_2 + \cdots + x_N e_N
gives
  Mx = x_1 M e_1 + x_2 M e_2 + \cdots + x_N M e_N = x_1 M_1 + x_2 M_2 + \cdots + x_N M_N
so it is more convenient to use another approach.

QR Factorization: Modified Gram-Schmidt Algorithm (target-column oriented approach)
  For i = 1 to N                          "for each target column"
    M_i = M e_i                            (matrix-vector product)
    For j = 1 to i-1                      "for each source column left of the target"
      r_{ji} = Q_j^T M_i
      M_i = M_i - r_{ji} Q_j
    end
    r_{ii} = \sqrt{M_i^T M_i}
    Q_i = (1/r_{ii}) M_i                   (normalize)
  end
Cost: as before, the inner loop takes \sum_{i=1}^{N} (i - 1) \, 2N \approx N^3 operations and normalization takes 2N^2 operations.

QR Factorization
(Figure: the source-column and target-column orderings produce the same factorization [M_1 \; M_2 \; M_3 \; M_4] = [Q_1 \; Q_2 \; Q_3 \; Q_4] R with the same upper-triangular R.)

QR Factorization – Zero Column
What if a column becomes zero? Mid-factorization the columns may look like
  [Q_1 \;\; 0 \;\; M_3 \; \cdots \; M_N]
The matrix MUST be singular! Then:
1) Do not try to normalize the column.
2) Do not use the column as a source for orthogonalization.
3) Perform backward substitution as well as possible.
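A minimal numpy sketch of the source-column oriented Modified Gram-Schmidt above, including the zero-column rules just stated; the function names `mgs_qr` and `solve_qr` and the tolerance `tol` are my own illustrative choices.

```python
import numpy as np

def mgs_qr(M, tol=1e-12):
    """Modified Gram-Schmidt, source-column oriented; columns of A are reduced in place."""
    A = np.array(M, dtype=float)
    N = A.shape[1]
    Q = np.zeros_like(A)
    R = np.zeros((N, N))
    for i in range(N):                      # for each source column
        R[i, i] = np.linalg.norm(A[:, i])
        if R[i, i] < tol:                   # zero column: matrix must be singular
            R[i, i] = 0.0
            continue                        # rules 1 and 2: skip it entirely
        Q[:, i] = A[:, i] / R[i, i]         # normalize
        for j in range(i + 1, N):           # for each target column right of source
            R[i, j] = Q[:, i] @ A[:, j]
            A[:, j] -= R[i, j] * Q[:, i]
    return Q, R

def solve_qr(Q, R, b, tol=1e-12):
    """Two-step solve: Rx = Q^T b, then backsolve as well as possible."""
    bt = Q.T @ b
    N = len(bt)
    x = np.zeros(N)
    for i in range(N - 1, -1, -1):
        if abs(R[i, i]) < tol:              # zero pivot: free variable, pick x_i = 0
            continue
        x[i] = (bt[i] - R[i, i+1:] @ x[i+1:]) / R[i, i]
    return x

# The singular nodal matrix from the example above:
M = np.array([[1., -1, 0, 0], [-1, 1, 0, 0], [0, 0, 1, -1], [0, 0, -1, 2]])
b = np.array([-1., 1, -1, 0])
Q, R = mgs_qr(M)
x = solve_qr(Q, R, b)
print(x, np.allclose(M @ x, b))             # one valid solution; residual ~ 0
```

Run on the singular nodal system, the second column collapses to zero during orthogonalization, the corresponding zero pivot in R is skipped, and the free variable v_2 is set to 0, returning the particular solution (-1, 0, -2, -1), consistent with v_1 = v_2 - 1.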
QR Factorization – Zero Column
The resulting QR factorization has a zero column in Q and a zero row in R:
  [Q_1 \;\; 0 \;\; Q_3 \; \cdots \; Q_N]
  \begin{bmatrix} r_{11} & r_{12} & r_{13} & \cdots & r_{1N} \\ 0 & 0 & 0 & \cdots & 0 \\ 0 & 0 & r_{33} & \cdots & r_{3N} \\ & & & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & r_{NN} \end{bmatrix}

QR Factorization – Zero Column
Recall the weighted-sum-of-columns view of systems of equations:
  x_1 M_1 + x_2 M_2 + \cdots + x_N M_N = b
M is singular, but b is in the span of the columns of M, so a solution can still be recovered.

Reasons for QR Factorization
• QR factorization to solve Mx = b
 – Mx = b becomes QRx = b, i.e. Rx = Q^T b, where Q is orthogonal and R is upper triangular
 – O(N^3), the same as Gaussian Elimination
 – Nice for singular matrices
 – Least-squares problems Mx = b, where M is m x n with m > n
• Pointer to Krylov-subspace methods, through the minimization point of view

QR Factorization – Minimization View
Definition of the residual R (here R denotes the residual vector, not the triangular factor):
  R(x) = b - Mx
Finding x which satisfies Mx = b is equivalent to minimizing over all x:
  R(x)^T R(x) = \sum_{i=1}^{N} R_i(x)^2
The two are equivalent when b \in \text{span}\{\text{cols } M\}: then Mx = b and \min_x R(x)^T R(x) = 0 have the same solution.
Minimization is more general!

QR Factorization – Minimization View: One-Dimensional Minimization
Suppose x = x_1 e_1, and therefore Mx = x_1 M e_1 = x_1 M_1.
One-dimensional minimization:
  R(x)^T R(x) = (b - x_1 M e_1)^T (b - x_1 M e_1) = b^T b - 2 x_1 b^T M e_1 + x_1^2 (M e_1)^T (M e_1)
  \frac{d}{dx_1} R(x)^T R(x) = -2 b^T M e_1 + 2 x_1 (M e_1)^T (M e_1) = 0
  \Rightarrow \quad x_1 = \frac{b^T M e_1}{e_1^T M^T M e_1}
where the denominator acts as a normalization.

QR Factorization – Minimization View: One-Dimensional Minimization, Picture
(Figure: b and the column M_1 = M e_1, with x_1 marking the projection of b onto M e_1.)
One-dimensional minimization yields the same result as projection on the column!

QR Factorization – Minimization View: Two-Dimensional Minimization
Now x = x_1 e_1 + x_2 e_2 and Mx = x_1 M e_1 + x_2 M e_2.
Residual minimization:
  R(x)^T R(x) = (b - x_1 M e_1 - x_2 M e_2)^T (b - x_1 M e_1 - x_2 M e_2)
              = b^T b - 2 x_1 b^T M e_1 + x_1^2 (M e_1)^T (M e_1)
                      - 2 x_2 b^T M e_2 + x_2^2 (M e_2)^T (M e_2)
                      + 2 x_1 x_2 (M e_1)^T (M e_2)    <- coupling term

QR Factorization – Minimization View: Two-Dimensional Minimization
Setting the partial derivatives to zero:
  \frac{d}{dx_1} R(x)^T R(x) = 0: \quad -2 b^T M e_1 + 2 x_1 (M e_1)^T (M e_1) + (\text{coupling term}) = 0
  \frac{d}{dx_2} R(x)^T R(x) = 0: \quad -2 b^T M e_2 + 2 x_2 (M e_2)^T (M e_2) + (\text{coupling term}) = 0
To eliminate the coupling term: we change search directions!!!

QR Factorization – Minimization View: Two-Dimensional Minimization, More General Search Directions
  x = v_1 p_1 + v_2 p_2 \quad \text{and} \quad Mx = v_1 M p_1 + v_2 M p_2, \quad \text{with span}\{p_1, p_2\} = \text{span}\{e_1, e_2\}
  R(x)^T R(x) = b^T b - 2 v_1 b^T M p_1 + v_1^2 (M p_1)^T (M p_1)
                      - 2 v_2 b^T M p_2 + v_2^2 (M p_2)^T (M p_2)
                      + 2 v_1 v_2 (M p_1)^T (M p_2)    <- coupling term
If p_1^T M^T M p_2 = 0, the minimizations decouple!!

QR Factorization – Minimization View: Two-Dimensional Minimization, More General Search Directions
Goal: find a set of search directions such that
  p_i^T M^T M p_j = 0 \quad \text{when } j \ne i
In this case the minimization decouples!!! Such p_i and p_j are called M^T M orthogonal.
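A small numeric check of the decoupling claim; the 2x2 test matrix and right-hand side are my own illustrative choices. The second direction is M^T M-orthogonalized against the first, after which two independent one-dimensional minimizations recover the exact solution.

```python
import numpy as np

M = np.array([[3., 1.], [1., 2.]])   # illustrative test matrix
b = np.array([5., 4.])

# Start from the unit vectors and make p2 M^T M-orthogonal to p1.
p1 = np.array([1., 0.])
e2 = np.array([0., 1.])
A = M.T @ M                                   # the M^T M inner-product matrix
r12 = (p1 @ A @ e2) / (p1 @ A @ p1)           # projection coefficient
p2 = e2 - r12 * p1
print(p1 @ A @ p2)                            # ~0: the coupling term vanishes

# Independent 1-D minimizations: v_i = (b^T M p_i) / ((M p_i)^T (M p_i))
v1 = (b @ (M @ p1)) / ((M @ p1) @ (M @ p1))
v2 = (b @ (M @ p2)) / ((M @ p2) @ (M @ p2))
x = v1 * p1 + v2 * p2
print(x, np.linalg.solve(M, b))               # both give [1.2, 1.4]
```

Had we minimized along e_1 and e_2 directly, the coupling term 2 x_1 x_2 (M e_1)^T (M e_2) would force the two conditions to be solved simultaneously; with M^T M-orthogonal directions each v_i is found independently.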
QR Factorization – Minimization View: Forming M^T M orthogonal Minimization Directions
The i-th search direction equals the M^T M-orthogonalized i-th unit vector:
  p_i = e_i - \sum_{j=1}^{i-1} r_{ji} p_j, \quad r_{ji} = \frac{(M p_j)^T (M e_i)}{(M p_j)^T (M p_j)}
so that p_i^T M^T M p_j = 0; the previously orthogonalized search directions are reused.

QR Factorization – Minimization View: Minimizing in the Search Direction
When the search directions p_j are M^T M orthogonal, residual minimization becomes:
  Minimize: \; v_i^2 (M p_i)^T (M p_i) - 2 v_i b^T M p_i
Differentiating:
  2 v_i (M p_i)^T (M p_i) - 2 b^T M p_i = 0 \quad \Rightarrow \quad v_i = \frac{b^T M p_i}{(M p_i)^T (M p_i)}

QR Factorization – Minimization View: Minimization Algorithm
  For i = 1 to N                          "for each target column"
    p_i = e_i
    For j = 1 to i-1                      "for each source column left of the target"
      r_{ij} = p_j^T M^T M p_i
      p_i = p_i - r_{ij} p_j               (orthogonalize search direction)
    end
    r_{ii} = \sqrt{(M p_i)^T (M p_i)}
    p_i = (1/r_{ii}) p_i                   (normalize, so (M p_i)^T (M p_i) = 1)
    x = x + v_i p_i, \quad \text{with } v_i = b^T M p_i
  end

Intuitive summary
• QR factorization (direct) and the minimization view (iterative) are two sides of the same construction.
• Compose the vector x along search directions:
 – Direct: composition along the Q_i (the orthonormalized columns of M); M must be factorized first
 – Iterative: composition along certain search directions; you can stop halfway
• About the search directions:
 – Chosen so that the minimization is easy to do (decoupling): the p_j are M^T M orthogonal
 – Each step tries to minimize the residual

Compare Minimization and QR
QR orthonormalizes the columns of M:
  Q_1, Q_2, \ldots, Q_N \quad \text{are orthonormal}
Minimization orthonormalizes the unit vectors in the M^T M inner product:
  p_1 = \frac{1}{r_{11}} e_1, \quad p_2 = \frac{1}{r_{22}} (e_2 - r_{12} p_1), \quad \ldots, \quad p_N = \frac{1}{r_{NN}} \Big( e_N - \sum_{i<N} r_{iN} p_i \Big)
  p_1, p_2, \ldots, p_N \quad \text{are } M^T M \text{ orthonormal}

Summary
• Iterative Methods Overview
 – Stationary
 – Non-stationary
• QR factorization to solve Mx = b
 – Modified Gram-Schmidt algorithm
 – QR pivoting
 – Minimization view of QR
   • Basic minimization approach
   • Orthogonalized search directions
   • Pointer to Krylov subspace methods
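To close, a minimal numpy sketch of the minimization algorithm above; the function name and the 2x2 test system are my own illustrative choices. After the normalization step, (M p_i)^T (M p_i) = 1, so the step length reduces to v_i = b^T M p_i, as in the slide.

```python
import numpy as np

def minimization_solve(M, b):
    """Solve Mx = b by minimizing ||b - Mx||^2 along M^T M-orthogonal directions."""
    N = len(b)
    A = M.T @ M                      # the M^T M inner product
    P = []                           # previously orthogonalized search directions
    x = np.zeros(N)
    for i in range(N):               # for each target column
        p = np.zeros(N); p[i] = 1.0  # start from the unit vector e_i
        for pj in P:                 # orthogonalize against earlier directions
            p -= (pj @ A @ p) * pj   # r_ij = p_j^T M^T M p_i; (M p_j)^T (M p_j) = 1
        p /= np.linalg.norm(M @ p)   # normalize so that (M p_i)^T (M p_i) = 1
        P.append(p)
        x += (b @ (M @ p)) * p       # v_i = b^T M p_i; minimize along p_i
    return x

M = np.array([[3., 1.], [1., 2.]])
b = np.array([5., 4.])
print(minimization_solve(M, b), np.linalg.solve(M, b))  # both: [1.2, 1.4]
```

Run to completion this reproduces the direct solution; stopped after fewer than N directions it returns the residual minimizer over the subspace spanned so far, which is exactly the viewpoint the Krylov-subspace methods of the next lectures build on.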