CM0368 Scientific Computing
Spring 2009
Professor David Walker
david@cs.cf.ac.uk

Schedule
• Weeks 1 & 2 (6): Numerical linear algebra (DWW)
  – Solving Ax = b by Gaussian elimination and Gauss-Seidel.
  – Algebraic eigenvalue problem. Power method and QR method.
• Weeks 3 & 4 (6): Numerical solution of differential equations (BMB)
  – Ordinary differential equations (finite difference and Runge-Kutta methods).
  – Partial differential equations (finite difference method).
• Weeks 5 & 6 (5): Applications in Physics (BMB)
  – Laplace and Helmholtz equations, Schrödinger problem for the hydrogen atom, etc.
• Weeks 6 & 7 (4): Numerical optimisation (DWW)
• Weeks 2 – 11: Tutorials on Wednesdays at 1pm in T2.07. These will be given by Kieran Robert.

Text Book
• "Numerical Computing with MATLAB" by Cleve B. Moler, SIAM Press, 2004. ISBN 0898715601.
• http://www.readinglists.co.uk/rsl/student/sviewlist.dfp?id=20248
• Web edition at http://www.mathworks.com/moler/

Web Site
• Lecture notes and other module material can be found at:
  http://users.cs.cf.ac.uk/David.W.Walker/CM0368

Numerical Linear Algebra
• A system of n simultaneous linear equations can be represented in matrix notation as:
      Ax = b
  where A is an n×n matrix, and x and b are vectors of length n.
• We can write the solution as x = A⁻¹b, where A⁻¹ is the inverse of A.
• A square matrix is said to be non-singular if its inverse exists. If the inverse does not exist then the matrix is singular.

Geometrical Interpretation
• If A is a 2×2 matrix, for example
      [ 2  -1 ] [x1]   [  3 ]
      [ 3   4 ] [x2] = [ -1 ]
  then 2x1 - x2 = 3 and 3x1 + 4x2 = -1. Each equation represents a straight line, and the solution of the system is given by their intersection.
• If A is a 3×3 matrix, each of the three equations represents a plane, and the solution is the point lying at the intersection of the three planes.

MATLAB Solution
• In MATLAB we can find the solution to Ax = b by writing: x = A\b
• Write: A = [10 -7 0;-3 2 6;5 -1 5]
• Then: b = [7;4;6]
• Then: x = A\b

Gaussian Elimination
• Eliminate x1 from all the equations after the first.
• Then eliminate x2 from all the equations after the second.
• Then eliminate x3 from all the equations after the third.
• And so on, until after n-1 steps we have eliminated xj from all the equations after the jth, for j = 1, 2, …, n-1.
• These steps are referred to as the forward elimination stage of Gaussian elimination.

Example
      [ 10  -7   0 ] [x1]   [ 7 ]
      [ -3   2   6 ] [x2] = [ 4 ]
      [  5  -1   5 ] [x3]   [ 6 ]
Subtract -3/10 times equation 1 from equation 2, and 5/10 times equation 1 from equation 3:
      [ 10  -7    0 ] [x1]   [ 7   ]
      [  0  -0.1  6 ] [x2] = [ 6.1 ]
      [  0   2.5  5 ] [x3]   [ 2.5 ]
Next we swap equations 2 and 3. This is called partial pivoting. It is done to bring the entry of largest absolute value on or below the diagonal in column 2 onto the diagonal. This makes the algorithm more stable with respect to round-off errors (see later).

Example (continued)
      [ 10  -7    0 ] [x1]   [ 7   ]
      [  0   2.5  5 ] [x2] = [ 2.5 ]
      [  0  -0.1  6 ] [x3]   [ 6.1 ]
Now subtract -0.1/2.5 times equation 2 from equation 3:
      [ 10  -7    0   ] [x1]   [ 7   ]
      [  0   2.5  5   ] [x2] = [ 2.5 ]
      [  0   0    6.2 ] [x3]   [ 6.2 ]
This completes the forward elimination stage of the Gaussian elimination algorithm.

Pseudocode for Forward Elimination
  make b column n+1 of A
  for k=1 to n-1
    find the pivot A(m,k), the entry of largest absolute value in column k with row m >= k
    swap rows k and m
    for i=k+1 to n
      mult = A(i,k)/A(k,k)
      for j=k to n+1
        A(i,j) = A(i,j) - mult*A(k,j)
      end
    end
  end
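Forward Elimination in MATLAB (a sketch)
• A possible MATLAB version of the forward elimination pseudocode above. The function name forward_elim is just illustrative, and this is a sketch rather than production code:

  function A = forward_elim(A, b)
  % Forward elimination with partial pivoting.
  % A is n-by-n and b is a column vector of length n; b is appended
  % as column n+1 of A, and the reduced augmented matrix is returned.
  n = size(A,1);
  A = [A b];                            % make b column n+1 of A
  for k = 1:n-1
      [pivmax, m] = max(abs(A(k:n,k))); % largest entry in column k, rows k..n
      m = m + k - 1;                    % convert offset into a row number
      A([k m],:) = A([m k],:);          % swap rows k and m
      for i = k+1:n
          mult = A(i,k)/A(k,k);
          A(i,k:n+1) = A(i,k:n+1) - mult*A(k,k:n+1);
      end
  end

• For the worked example, forward_elim([10 -7 0; -3 2 6; 5 -1 5], [7;4;6]) returns the augmented matrix [10 -7 0 7; 0 2.5 5 2.5; 0 0 6.2 6.2], matching the result above.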
Back Substitution
• After the forward elimination phase, the matrix has been transformed into upper triangular form.
• Equation n just involves xn and so can now be solved immediately.
• Equation n-1 just involves xn-1 and xn, and since we already know xn we can find xn-1.
• Working our way backwards through the equations we can find xn, xn-1, …, x1.
• This is called the back substitution phase of the Gaussian elimination algorithm.

The Example Again
      [ 10  -7    0   ] [x1]   [ 7   ]
      [  0   2.5  5   ] [x2] = [ 2.5 ]
      [  0   0    6.2 ] [x3]   [ 6.2 ]
Equation 3 is 6.2 x3 = 6.2, so x3 = 1. This value is substituted into equation 2:
      2.5 x2 + (5)(1) = 2.5, so x2 = -1.
Substituting for x2 and x3 in equation 1:
      10 x1 + (-7)(-1) = 7, so x1 = 0.

Pseudocode for Back Substitution
This solves Ux = b:
  x(n) = b(n)/U(n,n)
  for k=n-1 down to 1
    sum = 0
    for j=k+1 to n
      sum = sum + U(k,j)*x(j)
    end
    x(k) = (b(k) - sum)/U(k,k)
  end

LU Factorisation
• The Gaussian elimination process can be expressed in terms of three matrices.
• The first matrix has 1's on the main diagonal and the multipliers used in the forward elimination below the diagonal. This is a lower triangular matrix with unit diagonal, and is usually denoted by L.
• The second matrix, denoted by U, is the upper triangular matrix obtained at the end of the forward elimination.
• The third matrix, denoted by P, is a permutation matrix representing the row interchanges performed in pivoting.

L, U, and P
      [  1     0     0 ]        [ 10  -7    0   ]        [ 1  0  0 ]
  L = [  0.5   1     0 ]    U = [  0   2.5  5   ]    P = [ 0  0  1 ]
      [ -0.3  -0.04  1 ]        [  0   0    6.2 ]        [ 0  1  0 ]
The original matrix can be expressed as: LU = PA.
The permutation matrix is the identity matrix with its rows reordered. If Pij = 1 then row i of the permuted matrix is row j of the original matrix. The same information can be represented in a vector p, the ith entry of which gives the number of the column containing the 1 in row i:
      p = [ 1 ]
          [ 3 ]
          [ 2 ]

Some MATLAB Code
• L, U, and P can be found in MATLAB as follows: [L,U,P] = lu(A)
• Solution of the system Ax=b is equivalent to solving the two triangular systems Ly = Pb and Ux = y.
• Once you have L, U, and P it is simple to solve the original system of equations: y = L\(P*b) and x = U\y, or just x = U\(L\(P*b))
• This should give the same answer as: x = A\b

LDU Factorisation
• It is sometimes useful to explicitly separate out the diagonal of U, which contains the pivots.
• We write U = DU', where U' is the same as U except that it has 1's on the diagonal, and D is a diagonal matrix containing the diagonal elements of U.
• With this form of the factorisation we have LDU' = PA.
      [  1     0     0 ]        [ 10  0    0   ]         [ 1  -0.7  0 ]
  L = [  0.5   1     0 ]    D = [  0  2.5  0   ]    U' = [ 0   1    2 ]
      [ -0.3  -0.04  1 ]        [  0  0    6.2 ]         [ 0   0    1 ]

Pseudocode for LU Factorisation
  make b column n+1 of A
  for k=1 to n-1
    find the pivot A(m,k), the entry of largest absolute value in column k with row m >= k
    swap rows k and m
    for i=k+1 to n
      A(i,k) = A(i,k)/A(k,k)
      for j=k+1 to n+1
        A(i,j) = A(i,j) - A(i,k)*A(k,j)
      end
    end
  end

Explanation of LU
• At stage i of forward elimination we do pivoting to find the largest absolute value in column i on or below the diagonal, and then exchange rows to bring it onto the diagonal.
• Then we subtract multiples of row i from rows i+1, …, n.
• Each of these operations can be represented by a matrix multiplication.

Elementary Matrices
• An elementary matrix, M, is one that has 1's along the main diagonal and 0's everywhere else, except for one non-zero value (-m, say) in row i and column j.
• Multiplying A by M has the effect of subtracting m times row j of matrix A from row i.
• Ignoring pivoting, the GE algorithm applies a series of elementary matrices to A to get U:
      U = Mn-1 … M2 M1 A
• If Li⁻¹ = Mi then L1 L2 … Ln-1 U = A, so taking L = L1 L2 … Ln-1 we have LU = A.
• Li is the same as Mi except the sign of the non-zero value is changed.
• With pivoting, U = Mn-1 Pn-1 … M2 P2 M1 P1 A, and it can be shown in a similar way that LU = PA, where P = Pn-1 … P2 P1.
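Checking LU = PA in MATLAB
• A quick check (a sketch) of the factorisation for the worked example, using MATLAB's built-in lu:

  A = [10 -7 0; -3 2 6; 5 -1 5];
  b = [7; 4; 6];
  [L, U, P] = lu(A);       % LU factorisation with partial pivoting
  norm(L*U - P*A)          % zero (up to round-off), confirming LU = PA
  y = L \ (P*b);           % forward substitution: solve Ly = Pb
  x = U \ y;               % back substitution:    solve Ux = y
  norm(x - A\b)            % agrees with the direct solve x = A\b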
The Need for Pivoting
• Suppose we change our problem slightly to:
      [ 10  -7      0 ] [x1]   [ 7     ]
      [ -3   2.099  6 ] [x2] = [ 3.901 ]
      [  5  -1      5 ] [x3]   [ 6     ]
  where the solution is still (0, -1, 1)ᵀ.
• Now suppose we solve the problem on a computer that does arithmetic to five significant digits.

Pivoting (continued)
• The first step of the elimination gives:
      [ 10  -7      0 ] [x1]   [ 7     ]
      [  0  -0.001  6 ] [x2] = [ 6.001 ]
      [  0   2.5    5 ] [x3]   [ 2.5   ]
• The (2,2) element is quite small. Now we continue without pivoting.
• The next step is to subtract -2.5×10³ times equation 2 from equation 3:
      (5 - (-2.5×10³)(6)) x3 = 2.5 - (-2.5×10³)(6.001)
• The right-hand side is 2.5 + 1.50025×10⁴. The second term is rounded to 1.5002×10⁴. When we add the 2.5 the result is rounded to 1.5004×10⁴.

Pivoting (continued)
• So the equation for x3 becomes:
      1.5005×10⁴ x3 = 1.5004×10⁴
  which gives x3 = 0.99993 (instead of 1).
• Then x2 is found from:
      -0.001 x2 + (6)(0.99993) = 6.001
  which gives:
      -0.001 x2 = 1.5×10⁻³, so x2 = -1.5 (instead of -1).
• Finally, x1 is found from:
      10 x1 + (-7)(-1.5) = 7
  which gives x1 = -0.35 (instead of 0).
• The problem arises from using a small pivot, which leads to a large multiplier.
• Partial pivoting ensures that the multipliers are always less than or equal to 1 in magnitude, and results in a satisfactory solution.

Measuring Errors
• The discrepancy due to rounding error in a solution can be measured by the error:
      e = x - x*
  and by the residual:
      r = b - Ax*
  where x is the exact solution and x* is the computed solution.
• e = 0 if and only if r = 0; however, e and r are not necessarily both small.

Error Example
• Consider an example in which the matrix is almost singular:
      [ 0.780  0.563 ] [x1]   [ 0.217 ]
      [ 0.913  0.659 ] [x2] = [ 0.254 ]
• If GE is used to compute the solution with low precision we get a large error but a small residual.
• Geometrically, the lines represented by the equations are almost parallel.
• GE with partial pivoting will always produce small residuals, but not necessarily small errors.
[Figure: the two nearly parallel lines, with the computed and exact solutions marked.]

Sensitivity
• We want to know how sensitive the solution of Ax=b is to perturbations in A and b.
• To do this we first have to come up with some measure of how close to singular a matrix is.
• If A is singular, a solution does not exist for some b's (inconsistency), while for other b's it is not unique.
• So if A is nearly singular we would expect small changes in A and b to cause large changes in x.
• If A is the identity matrix then x=b, so if A is close to the identity matrix we expect small changes in A and b to result in small changes in x.

Singularity and Pivots
• In GE all the pivots are non-zero if and only if A is non-singular, provided exact arithmetic is used.
• So if the pivots are small we expect the matrix to be close to singular.
• However, with finite precision arithmetic, a matrix might still be close to singular even if none of the pivots are small.

Norms
• The size of the pivots is not adequate to decide how close a matrix is to being singular.
• Define the lp family of norms (1 ≤ p ≤ ∞):
      ||x||p = ( |x1|^p + |x2|^p + … + |xn|^p )^(1/p)
• The most common norms use p = 1, 2, and ∞:
      ||x||1 = |x1| + |x2| + … + |xn|
      ||x||2 = ( |x1|² + |x2|² + … + |xn|² )^(1/2)
      ||x||∞ = max_i |xi|

Properties of Norms
• All these norms have the following properties:
      ||x|| > 0 if x ≠ 0
      ||0|| = 0
      ||cx|| = |c| ||x|| for all scalars c
      ||x + y|| ≤ ||x|| + ||y||   (the triangle inequality)
• In MATLAB use norm(x,p) to find a norm:
      norm1 = norm(x,1)
      norm2 = norm(x)
      norminf = norm(x,inf)

Condition Number
• Ax may have a very different norm from x, and this change in norm is related to the sensitivity of the solution to changes in A. Denote:
      M = max ||Ax|| / ||x||    and    m = min ||Ax|| / ||x||
  where the max and min are taken over all non-zero vectors x.
• Note that if A is singular, m = 0.
• The condition number, κ(A), of A is defined as the ratio M/m. Usually we are interested in the order of magnitude of κ(A), so it doesn't matter which norm is used.
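Error and Residual in MATLAB
• Now that the condition number has been defined, the near-singular system from the Error Example slide can be explored in MATLAB. This is only a sketch: the approximate solution used below is an illustrative poor guess (not taken from the lecture), chosen so that its residual is tiny even though its error is large:

  A = [0.780 0.563; 0.913 0.659];
  b = [0.217; 0.254];
  x_exact = [1; -1];              % exact solution of this system
  x_poor  = [0.341; -0.087];      % illustrative approximate "solution"
  r = b - A*x_poor;               % residual: about 1e-6 in size
  e = x_exact - x_poor;           % error: of order 1
  cond(A)                         % very large, so A is close to singular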
Relative Errors
• Suppose we have an error δb in b, which results in an error δx in x. Then Ax=b becomes:
      A(x + δx) = b + δb,  so  A δx = δb.
• From the definitions of M and m we have:
      ||b|| ≤ M ||x||    and    ||δb|| ≥ m ||δx||
• Then if m ≠ 0 we have the following relationship between the relative error in b and the relative error in x:
      ||δx|| / ||x||  ≤  κ(A) ||δb|| / ||b||
  κ(A) measures the amplification of the relative error!

Uses of Condition Number
1. As a measure of the amplification of the relative error due to changes in the right-hand side b.
2. As a measure of the amplification of the relative error due to changes in the matrix A.
3. As a measure of how close a matrix is to being singular (hard maths omitted). If κ(A) is large then A is close to singular.

Some Properties of κ(A)
• κ(A) ≥ 1, since M ≥ m.
• κ(P) = 1, if P is a permutation matrix.
• κ(cA) = κ(A) for any non-zero scalar c.
• For D a diagonal matrix, κ(D) is the ratio of the largest diagonal value (in magnitude) to the smallest.

Relative Error Example
• In this example we use the l1-norm:
      A = [ 4.1  2.8 ]      b = [ 4.1 ]
          [ 9.7  6.6 ]          [ 9.7 ]
• The solution is x = (1, 0)ᵀ, and ||b|| = 13.8 and ||x|| = 1.
• Change b slightly to:
      b~ = [ 4.11 ]    then x becomes    x~ = [ 0.34 ]
           [ 9.70 ]                           [ 0.97 ]
  A small change in b gives a big change in x.
• The errors are ||δb|| = 0.01 and ||δx|| = 1.63, so the relative errors are:
      ||δb|| / ||b|| = 0.0007246        ||δx|| / ||x|| = 1.63
  κ(A) is large: 1.63 / 0.0007246 = 2249.4.

Relative Error and Residual
• The condition number is also important for the round-off error in GE. It can be shown that:
      ||b - Ax*|| / ( ||A|| ||x*|| )  ≤  ρ ε
      ||x - x*|| / ||x*||             ≤  ρ κ(A) ε
  where x* is the numerical solution of Ax=b, ρ is a constant smaller than about 10, and ε is the machine precision.
• The first inequality says that the relative residual will be about the same size as the round-off error, no matter how ill-conditioned A is.
• The second inequality says that the relative error is small if κ(A) is small, but might be large if κ(A) is large.

Matrix Norms
• The norm of matrix A is ||A|| = M, i.e.,
      ||A|| = max ||Ax|| / ||x||   (over all non-zero x)
• Since ||A⁻¹|| = 1/m, it follows that the condition number can also be defined as:
      κ(A) = ||A|| ||A⁻¹||
• In MATLAB, cond(A,p) computes the condition number of A relative to the lp-norm.

Iterative methods for Ax=b
• Gaussian elimination is a direct method that computes the solution of Ax=b in about n³/3 operations.
• If n is large we might want a faster, less accurate, method.
• With an iterative method we can stop the iteration whenever we think we have a sufficiently accurate solution.

Iterative methods
• Suppose we split the matrix as A = S - T; then Sx = Tx + b.
• We can turn this into an iteration:
      S x_k+1 = T x_k + b    or    x_k+1 = S⁻¹T x_k + S⁻¹b
• So if this sequence converges, we can start the iteration with a guess at the solution x_0 and get an approximate solution.
• We need S to be easily invertible.

Common Iterative Methods
1. S = diagonal part of A (Jacobi's method)
2. S = lower triangular part of A (Gauss-Seidel method)
3. S = a combination of 1 and 2 (successive over-relaxation, or SOR)
S is called a pre-conditioner. The choice of S affects the convergence properties of the solution.

Jacobi Method
• S = diag(A), so the formula for iteration k+1 becomes:
      aii (xi)_k+1 = bi - Σj<i aij (xj)_k - Σj>i aij (xj)_k
• Expressed in matrices:
      D x_k+1 = -(L+U) x_k + b
  where D, L, and U are the diagonal, strictly lower triangular, and strictly upper triangular parts of A, respectively. Note: these are not the D, L, and U of the LDU factorisation.
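Jacobi Iteration in MATLAB (a sketch)
• A minimal MATLAB version of the Jacobi iteration above. The function name jacobi, the tolerance test, and the iteration limit are illustrative choices, not part of the lecture:

  function x = jacobi(A, b, x0, tol, maxit)
  % Jacobi iteration for Ax = b:  D*x_new = -(L+U)*x_old + b,
  % where D is the diagonal of A and L+U is the rest of A.
  n = length(b);
  x = x0;
  for k = 1:maxit
      xnew = zeros(n,1);
      for i = 1:n
          j = [1:i-1, i+1:n];              % all columns except i
          xnew(i) = (b(i) - A(i,j)*x(j)) / A(i,i);
      end
      if norm(xnew - x, inf) < tol         % simple convergence test
          x = xnew;
          return
      end
      x = xnew;
  end

• For example, jacobi([2 -1; -1 2], [1; 1], [0; 0], 1e-10, 100) converges to (1, 1)ᵀ, the solution of the 2×2 matrix used in the examples that follow with right-hand side b = (1, 1)ᵀ.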
Jacobi Example
• Example:
      A = [  2  -1 ]    S = [ 2  0 ]    T = [ 0  1 ]    S⁻¹T = [ 0    1/2 ]
          [ -1   2 ]        [ 0  2 ]        [ 1  0 ]           [ 1/2  0   ]
  Then:
      [x1]         [ 0    1/2 ] [x1]     [ b1/2 ]
      [x2]_k+1  =  [ 1/2  0   ] [x2]_k + [ b2/2 ]

Pseudocode for Jacobi Method
  choose an initial guess x_0
  for k=0,1,2,…
    for i=1 to n
      sum = 0.0
      for j=1 to i-1 and j=i+1 to n
        sum = sum + A(i,j)*x_k(j)
      end
      x_k+1(i) = (b(i) - sum)/A(i,i)
    end
    check convergence and continue if needed
  end

Gauss-Seidel Method
• A problem with the Jacobi method is that we have to store all of x_k until we have finished computing x_k+1.
• In the Gauss-Seidel method the components of x_k+1 are used as soon as they are computed, and replace the corresponding components of x_k on the right-hand side:
      aii (xi)_k+1 = bi - Σj<i aij (xj)_k+1 - Σj>i aij (xj)_k
• This uses half the storage of the Jacobi method.

Gauss-Seidel in Matrix Notation
• Expressed in matrices:
      (D+L) x_k+1 = -U x_k + b
  where D+L is the lower triangular part of A, and U is the strictly upper triangular part of A.

Gauss-Seidel Example
• Example:
      A = [  2  -1 ]    S = [  2  0 ]    T = [ 0  1 ]    S⁻¹T = [ 0  1/2 ]
          [ -1   2 ]        [ -1  2 ]        [ 0  0 ]           [ 0  1/4 ]
  Then:
      [  2  0 ] [x1]         [ 0  1 ] [x1]     [ b1 ]
      [ -1  2 ] [x2]_k+1  =  [ 0  0 ] [x2]_k + [ b2 ]
• Gauss-Seidel is better than Jacobi because it uses half the storage and often converges faster.

Pseudocode for Gauss-Seidel
  choose an initial guess x_0
  for k=0,1,2,…
    for i=1 to n
      sum = 0.0
      for j=1 to i-1
        sum = sum + A(i,j)*x_k+1(j)
      end
      for j=i+1 to n
        sum = sum + A(i,j)*x_k(j)
      end
      x_k+1(i) = (b(i) - sum)/A(i,i)
    end
    check convergence and continue if needed
  end
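Gauss-Seidel in MATLAB (a sketch)
• A matching MATLAB version of the Gauss-Seidel pseudocode above (again, the function name gauss_seidel and the convergence test are illustrative choices). Because new components overwrite old ones in place, only a single copy of x is needed for the iteration itself:

  function x = gauss_seidel(A, b, x0, tol, maxit)
  % Gauss-Seidel iteration for Ax = b: each new component x(i) is used
  % as soon as it has been computed (the update is made in place).
  n = length(b);
  x = x0;
  for k = 1:maxit
      xold = x;                            % kept only for the convergence test
      for i = 1:n
          s = A(i,1:i-1)*x(1:i-1) + A(i,i+1:n)*x(i+1:n);
          x(i) = (b(i) - s) / A(i,i);
      end
      if norm(x - xold, inf) < tol         % simple convergence test
          return
      end
  end

• For the same 2×2 system, gauss_seidel([2 -1; -1 2], [1; 1], [0; 0], 1e-10, 100) also converges to (1, 1)ᵀ, and in fewer iterations than Jacobi.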