Case Study in Computational Science & Engineering - Lecture 5

Solution of Sparse Linear Systems
• Direct methods
  – Systematically transform the system of equations into equivalent systems until the unknowns are easy to solve for.
• Iterative methods
  – Start with an initial “guess” for the unknown vector and successively “improve” it until it is “sufficiently” close to the solution.

Direct Solution of Linear Systems
Gaussian Elimination

Original system:
\[
\begin{aligned}
2x_1 + 3x_2 + x_3 &= 13 \\
x_1 + x_2 + x_3 &= 6 \\
3x_1 + 2x_2 + 2x_3 &= 15
\end{aligned}
\]
Divide row 1 by 2:
\[
\begin{aligned}
x_1 + 1.5x_2 + 0.5x_3 &= 6.5 \\
x_1 + x_2 + x_3 &= 6 \\
3x_1 + 2x_2 + 2x_3 &= 15
\end{aligned}
\]
Add (-1) times row 1 to row 2 and (-3) times row 1 to row 3:
\[
\begin{aligned}
x_1 + 1.5x_2 + 0.5x_3 &= 6.5 \\
-0.5x_2 + 0.5x_3 &= -0.5 \\
-2.5x_2 + 0.5x_3 &= -4.5
\end{aligned}
\]
Divide row 2 by -0.5:
\[
\begin{aligned}
x_1 + 1.5x_2 + 0.5x_3 &= 6.5 \\
x_2 - x_3 &= 1 \\
-2.5x_2 + 0.5x_3 &= -4.5
\end{aligned}
\]
Add 2.5 times row 2 to row 3:
\[
\begin{aligned}
x_1 + 1.5x_2 + 0.5x_3 &= 6.5 \\
x_2 - x_3 &= 1 \\
-2x_3 &= -2
\end{aligned}
\]
Divide row 3 by -2:
\[
\begin{aligned}
x_1 + 1.5x_2 + 0.5x_3 &= 6.5 \\
x_2 - x_3 &= 1 \\
x_3 &= 1
\end{aligned}
\]
• The unknowns are solved by back-substitution after Gaussian elimination: x_3 = 1, then x_2 = 1 + x_3 = 2, then x_1 = 6.5 - 1.5(2) - 0.5(1) = 3.

LU Decomposition
• More efficient than Gaussian elimination when solving many systems with the same coefficient matrix.
• First, A is decomposed into the product A = LU:
\[
\begin{pmatrix} A_{11} & A_{12} & A_{13} \\ A_{21} & A_{22} & A_{23} \\ A_{31} & A_{32} & A_{33} \end{pmatrix}
=
\begin{pmatrix} L_{11} & 0 & 0 \\ L_{21} & L_{22} & 0 \\ L_{31} & L_{32} & L_{33} \end{pmatrix}
\begin{pmatrix} 1 & U_{12} & U_{13} \\ 0 & 1 & U_{23} \\ 0 & 0 & 1 \end{pmatrix}
\]
• To solve the linear system Ax = b, we need to solve (LU)x = b.
• Let z = Ux; then L(Ux) = b, or Lz = b. This can be solved for z by forward-substitution.
• Since Ux = z, and z is now known, we can solve for x by back-substitution.

Cholesky Factorization
• If A is symmetric and positive definite (x^T A x > 0 for all x ≠ 0), it can be factored in the form A = LL^T:
\[
\begin{pmatrix} A_{11} & A_{12} & A_{13} \\ A_{21} & A_{22} & A_{23} \\ A_{31} & A_{32} & A_{33} \end{pmatrix}
=
\begin{pmatrix} L_{11} & 0 & 0 \\ L_{21} & L_{22} & 0 \\ L_{31} & L_{32} & L_{33} \end{pmatrix}
\begin{pmatrix} L_{11} & L_{21} & L_{31} \\ 0 & L_{22} & L_{32} \\ 0 & 0 & L_{33} \end{pmatrix}
\]
• Cholesky factorization requires only around half as many arithmetic operations as LU decomposition.
• The forward- and back-substitution process is the same as with LU decomposition.
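The LU factor-and-solve procedure described above can be sketched in C as follows. This is a minimal illustration, not the lecture's own code: the function names `lu_crout` and `lu_solve`, the fixed size `N = 3`, and the choice to omit pivoting are all assumptions made for brevity. The factorization uses the form shown on the slide (general lower-triangular L, unit-diagonal U).

```c
#include <assert.h>
#include <math.h>

#define N 3   /* illustrative fixed size (assumption) */

/* Factor A = L*U with L lower triangular and U upper triangular with a
   unit diagonal. No pivoting is done, so a (near-)zero pivot is reported
   as failure. Returns 0 on success, -1 otherwise. */
int lu_crout(double A[N][N], double L[N][N], double U[N][N])
{
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++) {
            L[i][j] = 0.0;
            U[i][j] = (i == j) ? 1.0 : 0.0;
        }

    for (int j = 0; j < N; j++) {
        /* Column j of L, from the diagonal downwards. */
        for (int i = j; i < N; i++) {
            double s = A[i][j];
            for (int k = 0; k < j; k++)
                s -= L[i][k] * U[k][j];
            L[i][j] = s;
        }
        if (fabs(L[j][j]) < 1e-14)
            return -1;
        /* Row j of U, to the right of the diagonal. */
        for (int i = j + 1; i < N; i++) {
            double s = A[j][i];
            for (int k = 0; k < j; k++)
                s -= L[j][k] * U[k][i];
            U[j][i] = s / L[j][j];
        }
    }
    return 0;
}

/* Solve L*z = b by forward-substitution, then U*x = z by
   back-substitution. */
void lu_solve(double L[N][N], double U[N][N], const double b[N], double x[N])
{
    double z[N];
    assert(N > 0);
    for (int i = 0; i < N; i++) {
        double s = b[i];
        for (int k = 0; k < i; k++)
            s -= L[i][k] * z[k];
        z[i] = s / L[i][i];
    }
    for (int i = N - 1; i >= 0; i--) {
        double s = z[i];
        for (int k = i + 1; k < N; k++)
            s -= U[i][k] * x[k];
        x[i] = s;               /* U has ones on its diagonal */
    }
}
```

Applied to the Gaussian elimination example (rows 2 3 1, 1 1 1, 3 2 2 with right-hand side 13, 6, 15), the intermediate vector z comes out as (6.5, 1, 1), which is the right-hand side of the reduced system, and x as (3, 2, 1). Once L and U are stored, each additional right-hand side costs only one call to `lu_solve`.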

Sparse Linear Systems
• A significant fraction of the matrix elements are known to be zero, e.g. the matrix arising from a finite-difference discretization of a PDE on a 2 x 3 grid:

    Grid points:
        1  2  3
        4  5  6

    Matrix:
             1   2   3   4   5   6
        1    4  -1   0  -1   0   0
        2   -1   4  -1   0  -1   0
        3    0  -1   4   0   0  -1
        4   -1   0   0   4  -1   0
        5    0  -1   0  -1   4  -1
        6    0   0  -1   0  -1   4

• There are at most 5 non-zero elements in any row of the matrix, irrespective of the size of the matrix (number of grid points).
• A sparse matrix is stored in a compact form that keeps information only about the non-zero elements.

Sparse Linear Systems
• For a 100 by 100 grid (n = 100) discretized with a 5-point finite-difference stencil, the matrix is n² x n² = 10⁴ x 10⁴ and has at most 5n² non-zeros, so fewer than 0.05% of its elements are non-zero.

[Figure: physical n x n grid (n² points) and the resulting n² x n² sparse matrix]

Compressed Sparse Row Format
• A commonly used representation for sparse matrices. Example (rows and columns numbered from 0):

             0   1   2   3   4   5
        0    4  -1   0  -1   0   0
        1   -1   4  -1   0  -1   0
        2    0  -1   4   0   0  -1
        3   -1   0   0   4  -1   0
        4    0  -1   0  -1   4  -1
        5    0   0  -1   0  -1   4

    rb:     0  3  7 10 13 17 20

    index:  0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19
    a:      4 -1 -1 -1  4 -1 -1 -1  4 -1 -1  4 -1 -1 -1  4 -1 -1 -1  4
    col:    0  1  3  0  1  2  4  1  2  5  0  3  4  1  3  4  5  2  4  5

• a holds the non-zero values row by row, col holds the column index of each value, and rb[i] is the position in a and col where row i begins (rb[n] is the total number of non-zeros).

    /* Dense MV multiply (y must be zero on entry) */
    for (i = 0; i < n; i++)
        for (j = 0; j < n; j++)
            y[i] += a[i][j] * x[j];

    /* Sparse MV multiply (y must be zero on entry) */
    for (i = 0; i < n; i++)
        for (j = rb[i]; j < rb[i+1]; j++)
            y[i] += a[j] * x[col[j]];

Fill-in Non-Zeros
• During the solution of a sparse linear system (by GE, LU or Cholesky), row updates often create non-zero entries in positions that were originally zero.

    Before:          After updates from row 1:
    X  X  0  X       X  X  0  X
    X  X  0  0       X  X  0  F
    0  0  X  X       0  0  X  X
    X  0  X  X       X  F  X  X

• Row updates using row 1 result in fill-in non-zeros (F).
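Returning to the compressed sparse row format: the sketch below shows one way the rb, col and a arrays could be built from a dense matrix, together with a matrix-vector product that overwrites y instead of accumulating into it. The function names `dense_to_csr` and `csr_matvec`, the row-major dense layout, and the `cap` capacity parameter are assumptions for illustration, not part of the lecture.

```c
#include <assert.h>

/* Build CSR arrays from a dense n x n matrix stored row-major.
   rb needs n+1 entries; col and a need room for `cap` entries.
   Returns the number of non-zeros stored. */
int dense_to_csr(int n, const double *dense, int cap,
                 int *rb, int *col, double *a)
{
    int nnz = 0;
    for (int i = 0; i < n; i++) {
        rb[i] = nnz;                 /* row i starts here */
        for (int j = 0; j < n; j++) {
            double v = dense[i * n + j];
            if (v != 0.0) {
                assert(nnz < cap);   /* caller must supply enough space */
                col[nnz] = j;
                a[nnz] = v;
                nnz++;
            }
        }
    }
    rb[n] = nnz;                     /* one past the last row */
    return nnz;
}

/* y = A*x for a CSR matrix. y is overwritten, so it need not be zeroed. */
void csr_matvec(int n, const int *rb, const int *col, const double *a,
                const double *x, double *y)
{
    for (int i = 0; i < n; i++) {
        double s = 0.0;
        for (int j = rb[i]; j < rb[i + 1]; j++)
            s += a[j] * x[col[j]];
        y[i] = s;
    }
}
```

For the 6 x 6 example matrix this produces 20 stored values and rb = 0 3 7 10 13 17 20, matching the arrays listed on the slide. The inner loop of `csr_matvec` touches only the stored non-zeros, so the work per row is bounded by the stencil size rather than by n.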

Effect of reordering on fill-in
• Re-ordering the equations (rows) or unknowns (columns) can significantly change the number of fill-in non-zeros, and hence the time for matrix factorization.

    Original:        Fill-in with GE:
    X  X  X  X       X  X  X  X
    X  X  0  0       X  X  F  F
    X  0  X  0       X  F  X  F
    X  0  0  X       X  F  F  X

    After reordering rows/cols:     No fill-in with GE:
    X  0  0  X                      X  0  0  X
    0  X  0  X                      0  X  0  X
    0  0  X  X                      0  0  X  X
    X  X  X  X                      X  X  X  X

Associated graph of matrix
• A graph-based view of a matrix’s sparsity structure is extremely useful in generating low-fill re-orderings.

            1  2  3  4  5  6
        1   X  X  X  X
        2   X  X  X        X
        3   X  X  X     X
        4   X        X
        5         X     X
        6      X           X

    [Figure: associated graph on vertices 1 to 6 with edges 1-2, 1-3, 1-4, 2-3, 2-6, 3-5]

• The associated graph of a symmetric sparse matrix has a vertex corresponding to each row/column of the matrix, and an edge corresponding to each off-diagonal non-zero matrix entry.

Fill-in and graph transformation
• Row i updates row j (j > i) iff A_ji is non-zero; in the associated graph, a matrix non-zero corresponds to an edge.
• The row update i -> j can create a fill-in non-zero A_jk for every non-zero A_ik (k > i) where A_jk was zero.

            i  j  k  l
        i   X  X  X  X
        j   X  X  F  F
        k   X     X
        l   X        X

    [Figure: associated graph before and after the update; vertex i is adjacent to j, k and l, and the update i -> j adds edges j-k and j-l]

• After all updates from row i, all neighbors of vertex i in the associated graph form a clique.

Fill-in and graph transformation
• Each row’s effect on fill-in generation is captured by the “clique” transformation on the associated graph.
• The graph view is valuable in suggesting matrix reordering approaches.
[Figure: the 6 x 6 example matrix and its associated graph after eliminating each row in turn in the natural order; fill-in entries are marked F]

Matrix re-ordering: Minimum Degree
• Graph-based algorithm for generating a low-fill re-ordering.
• Matrix permutation is viewed as a node-numbering problem in the associated graph.
• Low-degree nodes are numbered early, so that they are removed without adding many fill-in edges.

[Figure: the example graph at successive steps, each remaining node labelled with its current degree d; the lowest-degree node is numbered next]

• For this example, minimum degree finds a no-fill ordering.

Re-ordered matrix

    Original ordering, after factorization (F = fill-in):
            1  2  3  4  5  6
        1   X  X  X  X
        2   X  X  X  F     X
        3   X  X  X  F  X  F
        4   X  F  F  X  F  F
        5         X  F  X  F
        6      X  F  F  F  X

    Renumbering:
        Old #   1  2  3  4  5  6
        New #   4  5  6  1  3  2

    Re-ordered matrix (no fill-in):
            1  2  3  4  5  6
        1   X        X
        2      X        X
        3         X        X
        4   X        X  X  X
        5      X     X  X  X
        6         X  X  X  X

Matrix re-ordering: Nested Dissection
• Find a minimal vertex separator that bisects the associated graph; number the separator nodes last; recursively apply the procedure to both halves.
• Property: given a numbering of the nodes, fill-in A_ij (j > i) exists iff there is a path from i to j in the graph whose intermediate vertices are all numbered lower than i.

[Figure: grid bisected by a separator; one half is numbered 1-21, the other half 22-42, and the separator nodes 43-49]

• There are no fill-in edges between one half of the partition and the other.
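The clique transformation and the minimum-degree idea from the preceding slides can be sketched in C as below. This is an illustrative sketch under stated assumptions: the names `count_fill` and `min_degree_order`, the dense adjacency-matrix representation, the `MAXN` bound, and the lowest-index tie-breaking rule are all choices made here, not taken from the lecture. Production orderings use far more efficient data structures.

```c
#include <assert.h>
#include <string.h>

#define MAXN 16   /* illustrative upper bound on graph size (assumption) */

/* Eliminate vertices in the given order on a copy of the graph and
   return the number of fill-in edges created. adj is a symmetric 0/1
   adjacency matrix with a zero diagonal. */
int count_fill(int n, int adj[MAXN][MAXN], const int *order)
{
    int g[MAXN][MAXN], gone[MAXN] = {0}, fill = 0;
    assert(n <= MAXN);
    memcpy(g, adj, sizeof g);
    for (int s = 0; s < n; s++) {
        int v = order[s];
        /* Make the remaining neighbours of v pairwise adjacent. */
        for (int a = 0; a < n; a++) {
            if (gone[a] || !g[v][a]) continue;
            for (int b = a + 1; b < n; b++) {
                if (gone[b] || !g[v][b] || g[a][b]) continue;
                g[a][b] = g[b][a] = 1;
                fill++;
            }
        }
        gone[v] = 1;
    }
    return fill;
}

/* Greedy minimum-degree ordering: repeatedly pick the remaining vertex
   with the fewest remaining neighbours (ties go to the lowest index),
   apply the clique update, and remove it. */
void min_degree_order(int n, int adj[MAXN][MAXN], int *order)
{
    int g[MAXN][MAXN], gone[MAXN] = {0};
    assert(n <= MAXN);
    memcpy(g, adj, sizeof g);
    for (int s = 0; s < n; s++) {
        int best = -1, best_deg = n + 1;
        for (int v = 0; v < n; v++) {
            if (gone[v]) continue;
            int deg = 0;
            for (int u = 0; u < n; u++)
                if (!gone[u] && g[v][u]) deg++;
            if (deg < best_deg) { best = v; best_deg = deg; }
        }
        for (int a = 0; a < n; a++)
            for (int b = a + 1; b < n; b++)
                if (!gone[a] && !gone[b] && g[best][a] && g[best][b])
                    g[a][b] = g[b][a] = 1;
        gone[best] = 1;
        order[s] = best;
    }
}
```

Taking the six-vertex example graph to have edges 1-2, 1-3, 1-4, 2-3, 2-6 and 3-5 (a reading of the slide figure, stored 0-based in code), the natural order creates 6 fill-in edges, while both the slide's renumbering (old vertices 4, 6, 5, 1, 2, 3) and the order returned by `min_degree_order` create none. The greedy routine may break ties differently from the slide, so its order can differ while still being fill-free.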

Comparison of Ordering Schemes

Number of non-zeros after fill-in

    Grid       Nest Dis    Min Deg     BW Min    Natural
    4x4             120        100        108        118
    8x8             768        654        792        974
    16x16          4500       4020       5936       7966
    32x32         25072      23172      45664      64574
    64x64        131904     133278     357568     520318
    128x128      659880     771088    2828670        ***
    256x256     3180260    4438460        ***        ***

Sparse matrix factorization time

    Grid       Nest Dis    Min Deg     BW Min    Natural
    4x4            77.3       74.4       76.9       75.5
    8x8            77.7       76.1       78.4       80.0
    16x16          84.8       89.5       91.3       96.3
    32x32         134.6      154.7      160.3      226.1
    64x64         504.8      674.8     1006.0     1844.8
    128x128      3063.1     5076.8    16604.6        ***
    256x256     22807.8    48664.7        ***        ***