Solving Scalar Linear Systems Iterative approach Lecture 15 MA/CS 471 Fall 2003 Some matlab scripts to construct various types of random circuit loop matrices are available at the class website: The Sparsity Pattern of a Loop Circuit Matrix for a Random Circuit (with 1000 closed loops) b gridcircuit.m An array of current loops with random resistors (resistors on circuit boundary not shown) Matrix Due To Random Grid Circuit b Note the large amount of structure in the loop circuit matrix The Limits of Factorization • In the last class/lab we began to see that there are limits to the size of linear system solvable with matrix factorization based methods. • The storage cost for the loop current matrix built on a Cartesian circuit stored as a sparse NxN matrix is ~ 5*N • However, using LU (or Cholesky) and symmetric RCM the storage requirement is b*N which is typically at least an order of magnitude larger than the storage required for the loop matrix itself. • Also – memory spent on storing the matrices is memory we could have used for extra cells… Alternative Approach • We are going to pursue iterative methods which will satisfy the equation Ax b in an approximate way without an excessive amount of extra storage. • There are a number of different classes of iterative methods, today we will discuss an example from the class of stationary methods. http://www.netlib.org/linalg/html_templates/node9.html Jacobi Iteration • Example system: • Initial guess: • Algorithm: 5 x 1y 3z 2 3x 6 y 2 z 3 2x 3y 6z 1 x0 y 0 z 0 0 5 x n1 1 y n 3 z n 2 3 x n 6 y n1 2 z n 3 2 x n 3 y n 6 z n1 1 • i.e. for the I’th equation compute the I’th degree of freedom using the values computed from the previous iteration. Cleaning The Scheme Up 1 x 2 1y n 3z n 5 1 n 1 n n y 3 3x 2 z 6 1 n 1 n n z 1 2 x 3 y 6 n 1 5 x n1 1 y n 3 z n 2 3x 6 y n n 1 2z 3 2x 3y 6z n n n n 1 1 Couple of Iterations 1 2 0 0 x 2 1y 3z 5 5 1 1 1 0 0 y 3 3x 2 z 6 2 1 1 1 0 0 z 1 2 x 3 y 6 6 1 1st iteration: 2nd iteration: 1 1 1 1 1 1 1 x 2 1 y 3 z 2 1. 3. 5 5 2 6 5 2 1 1 2 1 11 1 1 y 3 3 x 2 z 3 3. 2. 6 6 5 6 45 2 1 1 2 1 13 1 1 z 1 2 x 3 y 1 2. 3. 6 6 5 2 60 2 Pseudo-Code For Jacobi Method 1) Build A, b 2) Build modified A with diagonal zero Q 3) Set initial guess x=0 4) do{ jN 1 xi bi Qij x j a) compute: A j 1 max xi xi ii b) compute error: err i 1,.., N max bi i 1,.., N c) update x: }while err>tol xi xi Matlab To The Rescue Running The Jacobi Iteration Script • I ran the script with the stopping tolerance (tol) set to 1e-8: • Note that the error is of the order of 1e-8 !!! • i.e. the Jacobi iterative method computes an approximate solution to the system of equations!. Try The Larger Systems • First I made the script into a function which takes the rhs vector, system matrix and stopping tolerance as input. • It returns the approximate solution and the residual history. Run Time! • First I built a circuit matrix using the gridcircuit.m script (with N=100) • Then I built a random source vector for the right hand side. • Then I called the jacobisolve routine with: • [x,residuals] = jacobisolve(mat,b,1e-4); Convergence History Stopping criterion.. Increasing System Size Notice how the number of iterations required grew with N Ok – so I kind of broke the rules • We set up the Jacobi iteration but did not ask the question “when will the Jacobi iteration converge and how fast?” Definition: A matrix A is diagonally dominant if N j 1, j i A ij A ii Theorem: If the matrix A is diagonally dominant then Ax=b has a unique solution x and the Jacobi iteration produces a 0 n x sequence which converges to x for any initial guess x Informally: The “more diagonally dominant” a matrix is the faster it will converge… this holds some of the time. Going Back To The Circuit System • For the gridcircuit code I set up the resistors so that all the internal circuit loops shared all their resistors with other loops. • the current balance law => for internal cells the total cell resistance = sum of of resistances shared with neighboring cells… • i.e. the row sums of the internal cell loop currents N is zero A A j 1, j i ij ii • i.e. the matrix is weakly diagonally dominant – and does not exactly satisfy the convergence criterion for the Jacobi iterative scheme. Slight Modification of the Circuit • I added an additional random resistor to each cell (i.e. increased the diagonal entry and did not change the off-diagonal entries). • This modification ensures that the matrix is now strictly diagonally dominant. Convergence history for diagonally dominant system – notice dramatic reduction in iteration count Gauss-Seidel • Example system: • Initial guess: • Algorithm: 5 x 1y 3z 2 3x 6 y 2 z 3 2x 3y 6z 1 x0 y 0 z 0 0 5 x n1 1 y n 3 z n 2 3 x n1 6 y n1 2 z n 3 2 x n1 3 y n1 6 z n1 1 • i.e. for the I’th equation compute the I’th degree of freedom using the values computed from the previous iteration and the new values just computed Cleaning The Scheme Up 1 x 2 1y n 3z n 5 1 n 1 n 1 n y 3 3x 2 z 6 1 n 1 n 1 n 1 z 1 2 x 3 y 6 n 1 5 x n1 1 y n 3 z n 2 3x 2x n 1 n 1 6y 3y n 1 n 1 2z 3 6z n n 1 1 First Iteration 1st iteration: 1 0 0 x 2 1y 3z 5 1 1 y 3 3 x1 2 z 0 6 1 1 z 1 2 x1 3 y1 6 1 As soon as the 1 level values are computed, we use them in the next equations.. Theorem First This Time!. • So we should first ask the questions 1) When will the Gauss-Seidel iteration converge? 2) How fast will it converge? Definition: A matrix is said to be positive definite if xt Ax 0 for all x N Theorem: If A is symmetric and positive definite, then the Gauss-Seidel iteration converges for any initial guess for x Unoficially: Gauss-Seidel will converge twice as fast in some cases as Jacobi. Gauss-Seidel Algorithm • We iterate: xi0 0 n 1 i x j i 1 jN 1 n 1 n bi A ij x j A ij x j A ii j 1 j i 1 Comparing Jacobi and Gauss-Seidel Same problem – Gauss-Seidel takes almost half the work The Catch • Ok – so it looks like one would always use Gauss-Seidel rather than Jacobi iteration. • However, let us consider the parallel implementation. n 1 i J: x n 1 i GS : x j i 1 jN 1 n n bi Aij x j Aij x j Aii j 1 j i 1 j i 1 jN 1 n 1 n bi Aij x j Aij x j A ii j 1 j i 1 Volunteer To Design A Parallel Version 1) Decide which cells go where 2) Decide how much information each process needs to keep locally 3) Decide what information needs to be communicated among processes. 4) Are there any intrinsic bottlenecks in Gauss-Seidel or Jacobi?. 5) Can we devise a hybrid version of GS which avoids the bottlenecks?. Project 3 (serial part) In C: 1) Build a sparse matrix based on one of the random circuits (design the storage for it yourself or use someone else’s sparse storage structure or class) 2) Write a sparse matrix times vector routine. 3) Implement the Jacobi iterative scheme