MAE 376: Applied Mathematics for MAEs

Dr. Paul T. Bauman, Dr. Ehsan Esfahani, Dr. Abani K. Patra
with contributions from Dr. Souma Chowdhury

Fall Semester 2016

© 2015 Paul T. Bauman, Ehsan Esfahani, Abani K. Patra. All rights reserved. This work is intended for students of MAE 376 at the University at Buffalo and should not be publicly distributed.

Contents

Introduction
  0.1 Overview of Course
  0.2 Historical Perspective

Part I: Mathematical Background

1 Linear Systems
  1.1 Matrix Algebra
  1.2 Representing Linear Systems of Equations
  1.3 More Than Just an Array of Numbers...
  1.4 Solving Linear Systems (by hand)

2 Eigenvalues and Eigenvectors
  2.1 Core Idea
  2.2 Applications
  2.3 Extending the Idea
    2.3.1 Representation
    2.3.2 Special Properties

3 Differential Equations
  3.1 Introduction
    3.1.1 Classification of Differential Equations
    3.1.2 Notations
    3.1.3 Operators
  3.2 Solutions of Linear ODEs
    3.2.1 2nd Order ODE with Constant Coefficients
  3.3 Solutions of Partial Differential Equations
    3.3.1 Wave Equation
    3.3.2 Heat Equation
    3.3.3 Different Boundary Conditions

Part II: Numerical Methods

4 Numerical Solution of Linear Systems
  4.1 Automating Gaussian Elimination
  4.2 Computational Work of Gaussian Elimination
  4.3 Partial Pivoting
  4.4 LU Decomposition
  4.5 Cholesky Decomposition
  4.6 Computing the Inverse of a Matrix
  4.7 Practice Problems

5 Numerical Error and Conditioning
  5.1 Error Considerations
  5.2 Floating Point Representation
  5.3 Review of Vector and Matrix Norms
  5.4 Conditioning of Linear Systems
  5.5 Practice Problems

6 Numerical Differentiation
  6.1 Approximating Derivatives
    6.1.1 What Are Finite Differences?
    6.1.2 Taylor Series and Approximate Derivatives
    6.1.3 Taylor Series and Finite Differences
    6.1.4 What if there is error in evaluation of f(x)?
  6.2 Higher Dimensions and Partial Derivatives
  6.3 Practice Problems

7 Solution of Initial Value Problems
  7.1 A Simple Illustration
  7.2 Stability
  7.3 Multistage Methods
    7.3.1 First Order Explicit RK Methods
    7.3.2 Second Order Explicit RK Methods
    7.3.3 Fourth Order Explicit RK
    7.3.4 MATLAB and RK Methods
  7.4 Practice Problems

8 Solution of Boundary Value Problems
  8.1 Heat Transfer in a One-Dimensional Rod
    8.1.1 One-Dimensional Rod with Fixed Temperature Ends
    8.1.2 One-Dimensional Rod with Mixed Boundary Conditions
  8.2 General Linear Second Order ODEs with Nonconstant Coefficients
  8.3 Two-Dimensional Equations
  8.4 Practice Problems

9 Solution of Eigenproblems
  9.1 Power Method
  9.2 Inverse Power Method
  9.3 Shifted Inverse Power Methods
  9.4 QR Iteration

10 Nonlinear Equations
  10.1 Example Problem: Rocket
    10.1.1 Problem Formulation
    10.1.2 Problem Solution
  10.2 Solving Non-Linear Equations
    10.2.1 Test for Linearity
    10.2.2 Methods of Solution
  10.3 Convergence
    10.3.1 Bisection
    10.3.2 Newton-Raphson
    10.3.3 Fixed Point
  10.4 Nonlinear Systems of Equations
    10.4.1 Fixed-Point Method
    10.4.2 Newton-Raphson Method
    10.4.3 Case Study: Four Bar Mechanism

Part III: Data Analysis

11 Linear Regression
  11.1 Least Squares Fit: Two Parameter Functions
  11.2 Polynomial Regression
  11.3 Multiple Linear Regression
  11.4 General Linear Least Squares Regression

12 Interpolation
  12.1 Polynomial Interpolation
    12.1.1 Monomial Functions and the Vandermonde Matrix
    12.1.2 Lagrange Polynomials
  12.2 Splines
    12.2.1 Linear Splines
    12.2.2 Cubic Splines

13 Numerical Integration
  13.1 Newton-Cotes Rules
    13.1.1 Trapezoidal Rule
    13.1.2 Simpson's Rule
    13.1.3 Composite Rules
  13.2 Gauss Quadrature

Bibliography

Introduction

0.1 Overview of Course

This course builds on "core" mechanics, engineering, and mathematics courses and brings these ideas together to continue training you (the student) to take an engineering problem and solve it systematically using mathematical and computing tools. As such, there are two themes that will continually arise during the course:

• Formulating an engineering problem as a mathematical problem
• Solving a mathematical problem using mathematical and computational tools

Conceptually, in any engineering problem, we are approximating the behavior of the system using a model. The vast majority of models are mathematical models. That is, we make a number of assumptions and approximations to arrive at equations that model the behavior of the system. Examples include Newtonian mechanics, Bernoulli-Euler beams, the Euler equations for compressible inviscid gas dynamics, the Navier-Stokes equations for viscous fluid flow, general relativity, quantum mechanics, etc. In a limited number of instances, an analytical solution may be feasible, but in the vast majority of cases, we must resort to numerical methods to compute solutions to our models. The flow chart in Figure 1 shows a qualitative overview of this process. In this course, we will focus on all aspects, starting from the engineering problem all the way to computing a solution.

[Figure 1: Flow chart for solving an engineering problem: Engineering or Physics Problem → Mathematical Model (Approximations & Assumptions) → Numerical Formulation of Governing Equations → Analytical & Numerical Methods → Solutions → Applications/Decisions.]

The coverage of these topics is structured into three modules: Applied Mathematics, Numerical Methods, and Data Analysis. During the applied mathematics portion of the course, we will cover the mathematical foundations of linear systems of equations, eigenvalues and eigenvectors, and ordinary and partial differential equations. In particular, we will heavily emphasize transforming an engineering problem into a mathematical problem. During this time, the lab portion of the course will focus on introducing Matlab.

The second portion of the course will focus on formulating and solving the mathematical problems using numerical methods. We will be heavily using Matlab to program our algorithms to solve these problems. As such, programming will be a core component of this course. The final module of the course is focused on methods of data analysis, particularly regression, interpolation, and numerical integration. Again, Matlab programming will be a heavy component of the data analysis module of the course.

0.2 Historical Perspective

Before the advent of computers, scientific inquiry proceeded in two stages: theory and experimentation. Theories are hypothesized on the basis of observations of nature and logical arguments.
Subsequently, physical experiments are conducted in order to test the theory. As limitations of the theory are uncovered, the theory is revised and new experiments are conducted.

Beginning in 1947, a dramatic change was initiated: the invention of the transistor, see Figure 2. The transistor is the fundamental logical unit in every computer processor. An increasing number of transistors in a computer chip allows for a greater number of operations per unit time. An exponential growth in the number of transistors in a computer processor was observed for a number of decades and is attributed to Gordon Moore of Intel: Moore's Law. Figure 3 graphically illustrates Moore's Law (note the log scale on the vertical axis). The tremendous growth in computing capability has brought modeling, numerical methods, and computer simulation to the forefront of scientific inquiry and engineering analysis. Indeed, it is now accepted that computation is the third pillar of science, next to theory and experimentation. Thus, a critical set of tools for scientists and engineers is mathematical modeling and numerical methods. The need to understand and use these tools is the purpose of this course.

[Figure 2: Image of the first transistor. Taken from https://en.wikipedia.org/wiki/Transistor.]

[Figure 3: Microprocessor transistor counts, 1971-2011. Growth of the number of transistors in a single computer chip, with a curve showing the transistor count doubling every two years; the exponential trend is called Moore's Law. From https://en.wikipedia.org/wiki/Moore's_law]

Part I
Mathematical Background

Chapter 1
Linear Systems

In this chapter, we focus on engineering systems that can be cast into linear systems of equations. Although a more complete treatment of linear systems is the subject of courses in linear algebra, we first review necessary topics in matrices and matrix algebra in Section 1.1 and then apply these ideas to constructing linear systems based on problems encountered in statics, circuits, and spring-mass systems in Section 1.2. In Section 1.3, we briefly discuss matrices in the context of transformations. Finally, in Section 1.4, we introduce solving linear systems manually using Gaussian Elimination.

1.1 Matrix Algebra

We first review the basic notation and algebra of vectors and matrices. Symbolically, a vector is an array of numbers oriented as a single row or a single column, called a row vector or a column vector, respectively. For example, take the vectors a and b:

    a = [1 2 3],    b = [4; 5; 6; 7]

a is a row vector with three entries while b is a column vector with four entries. (Here, and throughout, a semicolon inside the brackets indicates the start of a new row.) As scientists and engineers, we use vector notation in a variety of ways, including the description of coordinates and points of bodies in physical space. We will use vectors (and matrices) in more interesting ways to describe physical systems.

A matrix generalizes the notion of a vector.
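Since the lab portion of the course uses Matlab, here is a minimal sketch of how this vector and matrix notation looks there; the particular numbers are chosen only for illustration.

    % Row vector (single row) and column vector (single column)
    a = [1, 2, 3];          % 1 x 3 row vector
    b = [4; 5; 6; 7];       % 4 x 1 column vector; semicolons start new rows

    % A 3 x 3 matrix entered row by row
    A = [5 1 2;
         1 3 7;
         2 7 8];

    size(A)     % dimensions of A: 3 rows, 3 columns
    A(2,3)      % the entry in row 2, column 3 (here equal to 7)
    b(2)        % the second entry of b (here equal to 5)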
Instead of a single row or column of entries, we now have an ordered array of entries. A matrix A is said to be m × n ("m by n") if it has m rows and n columns. Equation (1.1) illustrates a generic matrix A:

    A = [a11 a12 ... a1n; a21 a22 ... a2n; ... ; am1 am2 ... amn]    (1.1)

The scalar entry aij corresponds to the entry in the ith row and jth column. If m = 1, we have a row vector; if n = 1, we have a column vector. The matrix is said to be square if m = n. If the matrix is not square, it is said to be rectangular. By convention, we typically use lower-case roman letters for vectors and upper-case roman letters for matrices; for entries in a vector or a matrix, we use a lower-case roman letter with numeric subscripts indicating the location of that entry in the vector or matrix.

Example 1.1.1. Matrices

    A1 = [5 1 2; 1 3 7; 2 7 8],   b = [5; 1; 2; 7],   c = [3 7 4 6],   B = [4 0; 2 5; 6 7]    (1.2)

A1 is a 3 × 3 square matrix with a23 = 7. b is a 4 × 1 column vector with b2 = 1. c is a 1 × 4 row vector with c1 = 3. B is a 3 × 2 rectangular matrix with b12 = 0.

There are several special forms of matrices that occur frequently in the study of linear systems. A symmetric matrix is one for which aij = aji for each i and j. The matrix A1 in Equation (1.2) is a symmetric matrix. Note that a matrix must be square for it to be symmetric. A diagonal matrix is one with zeros in the off-diagonal entries; that is, aij = 0 for i ≠ j.

Example 1.1.2. Diagonal Matrix

    D = [3 0 0; 0 2 0; 0 0 1]    (1.3)

D is a diagonal matrix since its off-diagonal entries are zero. Its diagonal entries are d11 = 3, d22 = 2, d33 = 1.

The identity matrix is a particular diagonal matrix: all the diagonal entries are 1; that is, aii = 1 and aij = 0 for i ≠ j.

Example 1.1.3. Identity Matrix

    I = [1 0 0; 0 1 0; 0 0 1]    (1.4)

I is a 3 × 3 identity matrix. The symbol I is typically reserved for referring to the identity matrix. The Matlab command eye(3) will generate a 3 × 3 identity matrix.

The zero matrix is what you'd expect: a matrix of all zeros. That is, aij = 0 for all i, j. A 4 × 5 zero matrix can be generated in Matlab with the command zeros(4,5).

A final set of special matrices that we'll need are those with all zeros below the diagonal or all zeros above the diagonal; these are called upper triangular and lower triangular matrices, respectively.

Example 1.1.4.

    U = [u11 u12 u13; 0 u22 u23; 0 0 u33],   L = [l11 0 0; l21 l22 0; l31 l32 l33]    (1.5)

U is a 3 × 3 upper triangular matrix while L is a 3 × 3 lower triangular matrix.

Matrix objects have their own rules for algebraic manipulation, of which we will make extensive use. Two matrices are said to be equal, A = B, if aij = bij for each entry i, j. That is, each entry in the matrix A must be equal to the corresponding entry in matrix B in order for the condition A = B to be true. We can add and subtract matrices: C = A + B is defined by adding each corresponding entry of A and B and setting that value in C. That is, cij = aij + bij for each i, j. Similarly, D = B − A is computed as dij = bij − aij. Matrix addition (and subtraction) is commutative: A + B = B + A, and associative: (A + B) + C = A + (B + C).

We can multiply matrices by a scalar number α ∈ R (α is a real number). Symbolically, we write α ∗ A = A ∗ α = αA; computationally, we multiply each entry in the matrix by the scalar α: (αA)ij = α aij for each i, j.

We can multiply two matrices, but this is slightly more involved. Given an m × n matrix A and an n × p matrix B, we wish to compute "A times B": A ∗ B = AB.
C = AB is defined as n X aik bkj (1.6) Cij = k=1 So, note, in particular, that the inner dimensions of the matrices must be equal. Otherwise, the multiplication of two matrices does not make sense and is not defined. The dimensions of C will be equal to the outer dimensions of the matrices A and B. In words, the matrix product proceeds by taking the ith row of the matrix A and multiplying it by the j th column in B (a “dot-product” of those vectors) which yields the scalar value for cij . Example 1.1.5. Take 3 1 5 9 A = 8 6 , B = 7 2 0 4 (1.7) First, we confirm the inner dimensions of A and B are equal, namely 2. The outer dimension of A and B are 3, 2 and so C = AB will have dimensions of 3 × 2. 3 1 3×5+1×7 3×9+1×2 22 29 8 6 5 9 = 8 × 5 + 6 × 7 8 × 9 + 6 × 2 = 82 84 7 2 0 4 0×5+4×7 0×9+4×2 28 8 Note that BA is not defined. (1.8) 6 CHAPTER 1. LINEAR SYSTEMS Matrix multiplication enjoys some of the standard multiplicative algebraic properties. Matrix multiplication is associative: (AB) C = A (BC); it is distributive: A (B + C) = AB + AC. However, matrix multiplication is not commutative: AB 6= BA. We saw this already in Example 1.1.5. We can also now understand why the identity matrix, I, is named as such: AI = IA = A. We will have use for the transpose of a matrix. Given the matrix A, we denote the transpose as AT and it is defined as aTij = aji . Example 1.1.6. Take the matrix A as follows. 5 1 A = 2 3 4 6 Then, 5 2 4 A = 1 3 6 T We’ve seen that many of the operations we use for scalars translate analogously to vectors and matrices, but what about division? Division is not defined for matrices, but there is an analogous operation: the inverse. The inverse of a matrix A, A−1 , is defined such that A−1 A = AA−1 = I. We will discuss the inverse matrix more in-depth later. We need one final operation on square matrices for later use: the determinant. The determinant appears in a number of different algorithms and formulae throughout engineering science. We denote the determinant as |A| or det A. It has a recursive definition based on the size of the matrix. For a 1 × 1 matrix, |A| = |a11 | = a11 . For a 2 × 2 matrix, a a (1.9) |A| = 11 12 = a11 a22 − a12 a21 a21 a22 For a 3 × 3 matrix, a11 a12 a13 a a a a a a |A| = a21 a22 a23 = a11 22 23 −a12 21 23 + a13 21 22 a31 a33 a31 a32 a32 a33 a31 a32 a33 | {z } (1.10) Minor,Mij The general definition is in terms of the minors Mij , which are merely determinants of the submatrices formed by removing the ith row and the j th column from the original matrix. For a n × n matrix A, det A = n X j=1 aij (−1) i+j Mij = n X aij Cij (1.11) j=1 where Cij is a cofactor; we select a particular i, usually 1. There are more efficient ways to compute the determinant that will be considered later. 1.2. REPRESENTING LINEAR SYSTEMS OF EQUATIONS 7 There are several useful properties of determinants. Let In be the n × n identity matrix and let α be a scalar. det(AB) = det(A) det(B) det(αIn ) = αn det(αA) = det(αIn A) = αn det(A) 1 det(A−1 ) = det(A) T det(A ) = det(A) 1.2 (1.12) (1.13) (1.14) (1.15) (1.16) Representing Linear Systems of Equations Now, we will study how to form linear systems of equations for several typical undergraduate engineering problems. First, however, we review the notion of linearity and systems of equations. Let L(x) be some mathematical operator acting on x. x could be a number and L could be a function; L could also be a matrix, x a vector. 
We say that L is linear if it satisfies the following properties: L(x + y) = L(x) + L(y) L(αx) = αL(x), α ∈ R (1.17) (1.18) If L does not satisfy these properties, then it is said to be nonlinear. For example, if L is a one-dimensional function, say L(x) = 3x, then L is linear. If L(x) = 2x3 − 5x2 + 1, then it is nonlinear. We can also have systems of equations. Systems of equations are multiple sets of equations that share the same unknowns. Many engineering problems can be modeled as systems of linear equations. Example 1.2.1. Linear System of Equations Take the following system of equations. 2x1 + 0x2 = 3 −1x1 + 2x2 = 4 This is a system of two equations and two unknowns. The unknowns are x1 and x2 . These are linear systems because the each of the equations in the system is linear with respect to the unknowns. Later in the course, we will encounter nonlinear systems — systems where some or all of the equations are nonlinear in one or more of the unknowns. Our goal will be to solve such systems of equations in a systematic way such that we can write computer programs to automate their solution. It is prudent to ask several questions about these systems of equations. First, are there any solutions? This is called existence of solutions. Second, how many solutions are possible? This is called uniqueness of solutions. We say that the system is non-singular if there exists one and only one solution. Otherwise, the system is singular. For linear systems, there are only three possibilities about the number of solutions: 0, 1, ∞. 8 CHAPTER 1. LINEAR SYSTEMS Example 1.2.2. What is the/are solution(s) to the following linear system? x1 + x 2 = 2 2x1 + 2x2 = 4 Take x1 = 2, x2 = 0. We can also take x1 = 1, x2 = 1. There are ∞ many solutions to this linear system. Example 1.2.3. What is the/are solution(s) to the following linear system? x1 + 0x2 = 5 7x1 + 0x2 = 2 The first equation of the system suggests that x1 = 5 while the second shows x1 = 2/7. This contradiction indicates there is no solution to the system of equations. Example 1.2.4. What is the/are solution(s) to the following linear system? x1 + 0x2 = 5 x1 + x2 = 17 Take x1 = 5. Now substitute into the second equation to get 5 + x2 = 17. This yields x2 = 12. This system has exactly one solution: x1 = 5, x2 = 12. For systems of equations that only have two components, we can visualize what’s happening graphically. We can plot each of the lines on a graph and where they intersect represents the solution of the linear system. Figure 1.1a shows a clear unique solution. Figure 1.1b shows a case where the lines nowhere intersect, i.e. they are parallel — this case has no solutions. Finally, Figure 1.1c show a case where the lines overlap — this case has infinitely many solutions, i.e. every point on the line is a solution. 1.2. REPRESENTING LINEAR SYSTEMS OF EQUATIONS (a) Graphical representation of a twodimensional linear system that possesses a unique solution. 9 (b) Graphical representation of a twodimensional linear system that possesses a no solution. (c) Graphical representation of a twodimensional linear system that possesses infinitely many solutions. Figure 1.1: Figures illustrating the various possibilities of solutions for a twodimensional linear system. Figures taken from Chapra [1]. We now turn to the study of linear systems using tools of matrix algebra. This step is critical to allowing us to systematically solving such systems. Before turning to more interesting examples, we begin with a generic 3 × 3 linear system. 
a11 x1 + a12 x2 + a13 x3 = b1 a21 x1 + a22 x2 + a23 x3 = b2 a31 x1 + a32 x2 + a33 x3 = b3 (1.19) (1.20) (1.21) 10 CHAPTER 1. LINEAR SYSTEMS The first step is to recognize that we can write Equation 1.19 as two vectors in equality: a11 x1 + a12 x2 + a13 x3 b1 a21 x1 + a22 x2 + a23 x3 = b2 (1.22) a31 x1 + a32 x2 + a33 x3 b3 Now Equation 1.22 is a vector equation. Finally, we recognize that the vector on the left of Equation 1.22 can actually be expressed as a matrix-vector product: a11 a12 a13 x1 b1 a21 a22 a23 x2 = b2 (1.23) a31 a32 a33 x3 b3 We now have a system that can be concisely written as Ax = b: x is the (unknown) solution of interest while A and b will be given. This form is more conducive to systematic solutions by computer; this is especially critical on systems with large numbers of unknowns. We now focus on typical applications that you’ve encountered in previous coursework such that we write the problems in this form. Example 1.2.5. Spring-Mass System in Static Equilibrium Figure 1.2 illustrates a simple spring-mass system. We will assume that each of the masses is loaded with a constant force and is in static equilibrium. We denote the force on mass m1 as f1 and the force on mass m2 as f2 . We wish to compute the displacements of each of the masses. k f1 k m1 x1 f2 k m2 x2 Figure 1.2: Spring-mass system. First we draw free body diagrams for each of the masses. Figure 1.3: Free body diagram of masses. Since the system is in static equilibrium, we know that the sum of the forces on each of the masses must be zero. This yields the following system of equations: f1 + k(x2 − x1 ) − kx1 = 0 f2 − kx2 − k(x2 − x1 ) = 0 (1.24) (1.25) 1.2. REPRESENTING LINEAR SYSTEMS OF EQUATIONS 11 We can rewrite this system as 2kx1 − x2 = f1 −x1 + 2kx2 = f2 (1.26) (1.27) Now we can rewrite this system in matrix form: 2k −k −k 2k x1 f = 1 x2 f2 (1.28) Example 1.2.6. Electrical Circuit with Resistors Figure 1.4 illustrates a simple circuit with several resistors and voltage loadings. We wish to compute the current in each of the closed loops in the circuit. V2 R3 R1 R4 V1 R2 R5 V3 Figure 1.4: Simple electrical circuit. Kirchoff’s law states that the sum of the voltages around a closed must be zero. In this example, we orient the loops counter-clockwise and label as shown in Figure 1.5. 12 CHAPTER 1. LINEAR SYSTEMS Figure 1.5: Simple electrical circuit with oriented loops for summing voltages. Further, we use Ohm’s law which relates the voltage drop, V , across a resistor to the current, I, and the resistance, R: V = IR. Summing around each of the voltage loops yields the following system of equations: Loop 1: Loop 2: Loop 3: I1 R5 + I1 R4 − I2 R4 + I1 R2 − I3 R2 − V3 = 0 I2 R4 − I1 R4 + I2 R3 + V2 + I2 R1 − I3 R1 = 0 I3 R2 − I1 R2 + I3 R1 − I2 R1 + V1 = 0 (1.29) (1.30) (1.31) We can rewrite the system as follows: (R5 + R4 + R2 )I1 − R4 I2 − I3 R2 = V3 −R4 I1 + (R4 + R3 + R1 )I2 − R1 I3 = −V2 −R2 I1 − R1 I2 + (R2 + R1 )I3 = −V1 Now we can write the system in matrix form: (R5 + R4 + R2 ) −R4 −R2 I1 V3 −R4 (R4 + R3 + R1 ) −R1 I2 = −V2 −R2 −R1 (R2 + R1 ) I3 −V1 (1.32) (1.33) (1.34) (1.35) Notice that the matrix is symmetric. Example 1.2.7. Static Truss with Two-Force Members Figure 1.6 shows a simple truss under loading. The truss consists of members with loadings at two points only. We will assume the truss is in static equilibrium. We wish to compute the force response throughout the truss, in particular the forces in the rods and at the pins at the walls. 1.2. 
REPRESENTING LINEAR SYSTEMS OF EQUATIONS 13 Figure 1.6: Static truss under loading with two-force members only. We begin by drawing free body diagrams at each of the points B, A, and C. We denote the force in the rod between points B and A as FBA and the force in the rod between points A and C as FAC . The reaction forces at C are denoted Cx and Cy for the x- and y-components, respectively. Similarly, the reaction forces at pin B are denoted Bx and By . Figure 1.7: Free body diagrams of truss with two-force members. Because the truss is in static equilibrium the forces in the x- and y-directions must each sum to zero. This gives six static equilibrium equations: B: C: A: Bx + FBA sin(30o ) = 0 By + FBA cos(30o ) = 0 Cx + FAC cos(20o ) = 0 Cy + FAC sin(20o ) = 0 FBA sin(30o ) + FAC cos(20o ) = 0 FBA cos(30o ) − FAC sin(20o ) − W = 0 As with the previous examples, we can now sin(30o ) 0 1 0 cos(30o ) 0 0 1 o 0 cos(20 ) 0 0 o 0 sin(20 ) 0 0 sin(30o ) cos(20o ) 0 0 cos(30o ) − sin(20o ) 0 0 (1.36) (1.37) (1.38) (1.39) (1.40) (1.41) rewrite this system into matrix form. 0 0 FBA 0 0 0 FAC 0 1 0 Bx = 0 (1.42) 0 1 By 0 0 0 Cx 0 0 0 Cy W 14 CHAPTER 1. LINEAR SYSTEMS Example 1.2.8. Static Truss with Three-Force Members Figure 1.8 shows a simple truss under loading. The truss consists of members with loadings at three points in the members, requiring additional considerations beyond Example 1.2.7. Figure 1.8: Static truss under loading with three-force members. First we draw free body diagrams of the system as well as each of the members, noting that the bar BE is a two force member. Figure 1.9: Free body diagrams of forces in static truss shown in Figure 1.8. We see, then, that the unknowns are Ax , Ay , FBE , CX , Cy , Dx , Gx , Gy , eight in all. With eight unknowns, we’ll need eight equations in order to determine the unknowns. 1.3. MORE THAN JUST AN ARRAY OF NUMBERS... 15 We start by summing forces. Ax + FBE sin(53.13o ) − Cx + W + Dx Ay + FBE cos(53.13o ) + Cy CX − FBE cos(36.8698o ) + Gx −Cy − FBE sin(36.8698o ) + Gy −Gx − W −Gy − W =0 =0 =0 =0 =0 =0 This gives us six equations. We need two more. We’ll use balance of moments. First, taking moments of bar CEG around point C gives −FBE sin(36.8698o ) ∗ 8 + Gy ∗ 16 = 0 Now, taking moments of bar ABCD around point A gives −FBE sin(53.13o ) ∗ 6 + Cx ∗ 12 − W ∗ 15 − Dx ∗ 18 This gives us now 8 total equations. We can then write this system of equations in matrix form: 1 0 sin(53.13o ) −1 0 1 0 0 −W Ax 0 1 cos(53.13o ) 0 1 0 0 0 Ay 0 o 0 0 − cos(36.8698 ) 1 0 0 1 0 FBE 0 0 0 − sin(36.8698o ) 0 −1 0 0 1 Cx 0 (1.43) 0 0 = 0 0 0 0 −1 0 Cy W 0 0 0 0 0 0 0 −1 Dx W 0 0 −8 sin(36.8698o ) 0 0 0 0 16 Gx 0 0 0 −6 sin(53.13o ) 12 0 −18 0 0 Gy 15 ∗ W 1.3 More Than Just an Array of Numbers... Up to this point, we’ve discussed basic matrix algebra and representing some of our favorite engineering problems as linear systems of equations and, subsequently, in matrix form. But what are matrices, really? Are they just convenient arrays of numbers? Yes, they are convenient, but they are more. They encapsulate information about changing vectors. In mathematical parlance, they are linear operators. An “operator” is fancy language for “takes one object and converts it to another”. For matrices that we’ve been discussing, we use the following notation to indicate that the operator takes a n-dimensional vector and returns an m − dimensional vector: A : x ∈ Rn → Rm . A could be many kinds of operators, but in the present context, A is a matrix. 
(What must be the size of the matrix to take an n-vector to give an m-vector?) In such cases, we tend to refer to A as a linear transformation — it transforms an input vector into an output vector in a linear way. 16 CHAPTER 1. LINEAR SYSTEMS Example 1.3.1. Scaling a Vector Take the vector xT = [1, 1]. If we multiply by a real number, α, we get αxT = [α, α]. We can also accomplish this using a matrix. Take the matrix α 0 A= (1.44) 0 α Then the operation Ax will yield the vector [α, α]T . Example 1.3.2. Scaling One Component of a Vector Take the vector xT = [1, 1]. We can generalize the previous example by only scaling the x-component of a vector: α 0 A= (1.45) 0 1 Now, Ax will yield the vector [α, 1]T . Already, we’re now beyond the simple algebra of vectors. Example 1.3.3. Reflect a Vector across the Axis Take the vector xT = [1, 1]. We can use linear transformations to operate on vectors by reflecting them across the x-axis. 1 0 A= (1.46) 0 −1 How do we change A to give a reflection across the y-axis? Example 1.3.4. Rotate a Vector by an Angle θ Take the vector xT = [1, 1]. An extremely useful linear transformation is a rotation. cos θ − sin θ A= (1.47) sin θ cos θ This linear transformation will rotate a two-dimensional vector counterclockwise by the angle θ. Notice that length of the vector is unchanged. 1.4 Solving Linear Systems (by hand) In this section, we consider our first systematic strategy for solving linear systems of equations called Gaussian Elimination. We need such strategies as directly solving for unknowns by substitution is not conducive to programming. Gaussian Elimination is a direct method as we directly compute a solution in a fixed number of steps (as opposed to iterative algorithms where the solution cannot be computed in a fixed number of steps). Gaussian Elimination proceeds in two broad steps: 1. The elimination phase where we transform the matrix into upper triangular form; 2. Because the matrix is now in upper triangular form, we can perform backward substitution where the last unknown is readily computed, from which the remaining unknowns can be systematically computed. Let’s start with a 2 × 2 matrix. 1.4. SOLVING LINEAR SYSTEMS (BY HAND) Example 1.4.1. Gaussian Elimination of a 2 × 2 Matrix a11 a12 x1 b = 1 a21 a22 x2 b2 17 (1.48) In the first stage of Gaussian Elimination, we need to transform the matrix into upper triangular form. We do this by eliminating the entries in the matrix that are below the diagonal, in this case only a21 . We use information in the first equation to accomplish this, namely we rescale and add the first equation to the second equation. (−a21 /a11 )(a11 x1 + a12 x2 = b1 ) + a21 x1 + a22 x2 = b2 Adding the first equation into the second yields (a22 − a21 a12 /a11 )x2 = b2 − a21 /a11 b1 (1.49) For simplicity, define a022 = a22 − a21 a12 /a11 and b02 = b2 − a21 /a11 b1 , then the linear system from Equation (1.48) becomes b a11 a12 x1 (1.50) = 10 b2 0 a022 x2 Now we can begin the backward substitution phase. The value of x2 is trivial: x2 = b02 /a022 . Now that we have the value of x2 , we can substitute back into the first equation: a11 x1 + a12 (b02 /a022 ) = b1 Now we can readily compute x1 = (b1 − a12 (b02 /a022 ))/a11 . Note that we did not create an x02 — our transformation of the system did not change the solution. We effectively rotated the system about the solution. Let’s now consider the previous examples where we constructed the linear systems of equations. Example 1.4.2. 
Solution of Example 1.2.5 Take k = 2, f1 = −1, and f2 = 3. Then, the linear system we constructed before is now 4 −2 x1 −1 = (1.51) −2 4 x2 3 As before, we eliminate the (2, 1) entry by scaling the first equation by 1/2 and adding it to the second. This gives 4 −2 x1 −1 = (1.52) 0 3 x2 5/2 Then, we readily see x2 = 5/6. Substituting into the first equation gives 4x1 − 5/3 = −1 This gives x1 = 1/6. 18 CHAPTER 1. LINEAR SYSTEMS Example 1.4.3. Solution of Example 1.2.6 Take R1 = 12, R2 = 4, R3 = 5, R4 = 2, R5 = 10, all in Ohms and V1 = 100, and V2 = V3 = 0, all in Volts. Then the system from Example 1.2.6 becomes 18 −2 −4 I1 0 −2 19 −12 I2 = 0 (1.53) −4 −12 16 I3 −100 We need to eliminate all the entries below the diagonal; we’ll start with the (2, 1) entry. We scale the first equation (row 1) by 1/9 and add it to the second equation (row 2). This gives 18 −2 −4 I1 0 0 169/9 −112/9 I2 = 0 (1.54) −4 −12 16 I3 −100 Now we need to eliminate the (3, 1) entry. We scale the first equation (row 1) by 2/9 and add it to third equation (row 3). This gives 0 18 −2 −4 I1 0 169/9 −112/9 I2 = 0 (1.55) −100 0 −112/9 136/9 I3 Finally, we need to eliminate the (3, 2) entry. We scale the second equation (row 2) 112/9 = 112/169 and add it to the third equation (row 3). This gives by 169/9 18 −2 −4 I1 0 0 169/9 −112/9 I2 = 0 (1.56) 0 0 1160/169 I3 −100 Now we may proceed with backward substitution. Clearly, I3 = −845/58. Substituting I3 into the second equation gives 169/9I2 − 112/9 ∗ (−845/58) = 0 This gives I2 = −280/29. Now substituting I2 and I3 into the first equation gives 18I1 − 2 ∗ (−280/29) − 4 ∗ (−845/58) = 0 This gives I1 = −125/29. Example 1.4.4. Solution of Example 1.2.7 Take W = 100 Newtons. Rearranging the equations we have the following system of equations: 1 0 0 0 sin(30o ) 0 Bx 0 0 1 0 0 cos(30o ) By 0 0 0 0 1 0 Cx 0 0 cos(20o ) = (1.57) o 0 0 0 1 0 0 sin(20 ) C y 0 0 0 0 sin(30o ) cos(20o ) FBA 0 0 0 0 0 cos(30o ) − sin(20o ) FAC 100 1.4. SOLVING LINEAR SYSTEMS (BY HAND) 19 Now we eliminate all the entries below the diagonal. Because of our rearrangement, there is only one entry: the (6, 5) entry. First, scale the fifth equation (row 5) by − cos(30o )/ sin(30o ) and add to the sixth equation (row 6). This gives (switching to numerical values) 1 0 0 0 sin(30o ) 0 Bx 0 0 1 0 0 cos(30o ) By 0 0 0 0 1 0 Cx 0 0 cos(20o ) = (1.58) o 0 0 0 1 0 0 sin(20 ) C y 0 0 0 0 sin(30o ) cos(20o ) FBA 0 0 0 0 0 0 a066 FAC 100 where a066 = − sin(20o ) − cos(20o ) cos(30o )/ sin(30o ). Now we can proceed with backward substitution. Clearly FAC = 100/a066 = −50.7713 Newtons. Substituting into the fifth equation gives sin(30o )FBA + cos(20o )(−50.7713) = 0 which yields FBA = 95.4189 Newtons. Substituting into the fourth equation gives Cy + sin(20o )(−50.7713) = 0 yielding Cy = 17.3648. Substituting into the third equation gives Cx + cos(20o )(−50.7713) = 0 yielding Cx = 47.7094. Now substituting into the second equation gives By + cos(30o )(95.4189) = 0 yielding By = −82.6352. Finally, substituting into the first equation gives Bx + sin(30o )(95.4189) = 0 yielding Bx = −47.7094. 20 CHAPTER 1. LINEAR SYSTEMS Chapter 2 Eigenvalues and Eigenvectors 2.1 Core Idea Consider a simple 2 × 2 matrix A and 2 × 1 a11 a12 , x= A= a21 a22 vectors x, y y x1 , y= 1 y2 x2 (2.1) such that Ax = y y1 = a11 x1 + a12 x2 , y2 = a21 x1 + a22 x2 (2.2) Let a11 = 1, a12 = 2, a21 = 2, a22 = 3 ; x1 = 1, x2 = 0 ⇒ y1 = 1, y2 = 2 1 1 [A] = 0 2 or A : vector (1, 0) → vector (1, 2) Alternately stated, the operator A : x → y. 
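A quick Matlab check of this particular mapping, as a minimal sketch using the numbers above; the built-in eig command already returns the eigenvalue/eigenvector pairs that the rest of this chapter constructs by hand.

    A = [1 2;
         2 3];
    x = [1; 0];
    y = A*x          % returns [1; 2]: A maps the vector (1,0) to (1,2)

    % For later reference: eig returns the eigenvectors of A as the
    % columns of V and the eigenvalues on the diagonal of D
    [V, D] = eig(A)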
Geometrically speaking (see Fig. 2.1) A stretched and rotated x to get y. Clearly, the stretching and rotating must depend on both A and x. Use the MATLAB command eigshow with a matrix of your choice and watch. What happens when the matrix is symmetric and real? 1 Let us now see if for a given A, we can find x such that y1 = λx1 and y2 = λx2 i.e. x and y are on the same line and y is just a simple multiple of x. This implies a11 x1 + a12 x2 = λx1 a21 x1 + a22 x2 = λx2 1 for some value of x, y and x will line up 21 22 a) CHAPTER 2. EIGENVALUES AND EIGENVECTORS b) Figure 2.1: A : x → y generated using the eigshow command in Matlab. a) Note the stretching and rotation. b) Note the simple scaling for the case when x is an eigenvector. or Ax = λx (2.3) Rearranging (a11 − λ)x1 + a12 x2 = 0 a21 x1 + (a22 − λ)x2 = 0 or (A − λI)x = 0 (2.4) The trivial answer to (2.4) is 0. Much more interesting is the possible non-trivial answer. In the last chapter, we discussed the idea of existence and uniqueness of solutions of linear systems. One method of examining the solvability of the linear system is examining its determinant. If the determinant of a matrix is non-zero, then the matrix is invertible. If the determinant is exactly zero, then the matrix is singular. Using this fact, then we know that for (A−λI)x = 0 to have a non-trivial solution det(A − λI) = 0 i.e. det(A − λI) = 0 ⇒ (a11 − λ)(a22 − λ) − a12 a21 = 0 2 ⇒ λ + (−a11 − a22 )λ + (a11 a22 − a12 a21 ) = 0 2.2. APPLICATIONS 23 This polynomial is named the characteristic polynomial and has roots p (a11 + a22 ) + (a11 + a22 )2 − 4(a11 a22 − a12 a21 ) λ1 = 2 p (a11 + a22 ) + (a11 − a22 )2 + 4a12 a21 ) = (2.5) 2 p (a11 + a22 ) − (a11 + a22 )2 − 4(a11 a22 − a12 a21 ) λ2 = 2 p (a11 + a22 ) − (a11 − a22 )2 + 4a12 a21 ) (2.6) = 2 Now let us plug in these choices λ1 , λ2 in (2.4). We recover two relationships among the components of x as x1 a12 =− (2.7) x2 a11 − λ1 x1 a12 =− (2.8) x2 a11 − λ2 Note that since (2.4) has 0 on the right (homogeneous equation) – thus a solution (x1 , x2 ) can be multiplied by any scalar α to produce a valid solution. Thus, (2.4) only defines a relationship among x1 , x2 and not a unique solution. (2.7) and (2.8) produce two such independent relationships. If we assume x1 = 1 it follows that: a11 − λ1 ) ξ1 ≡ (x1 , x2 ) = (1, − a12 a11 − λ2 ) (2.9) ξ2 ≡ (x1 , x2 ) = (1, − a12 are two linearly independent vectors for whom Aξ1 = λ1 ξ1 and Aξ2 = λ2 ξ2 (2.10) Clearly – the vectors ξ1 , ξ2 are special. A will only stretch (NOT ROTATE) them by λ1 , λ2 . (λ1 , ξ1 ), (λ2 , ξ2 ) are called (eigen value, eigen vector) pairs. 2.2 Applications • Example 1 Consider now the mass spring system from the previous chapter. Setting k = 1 2 −1 A= −1 2 p p (2 + 2) + 0 + 4(−1)(−1) (a11 + a22 ) + (a11 − a22 )2 + 4a12 a21 ) = =3 λ1 = p 2 p2 (a11 + a22 ) − (a11 − a22 )2 + 4a12 a21 ) (2 + 2) − 0 + 4(−1)(−1) λ1 = = =1 2 2 1 1 Corresponding to λ1 = 3 we get ξ1 = and for λ2 = 1 we get ξ2 = by −1 1 using (2.9). These modes and natural frequencies can be calculated from a finite element model using a process that we illustrate now. 24 CHAPTER 2. EIGENVALUES AND EIGENVECTORS Sample computation of modes/frequencies for 2 DOF system: Figure 2.2: Vibration of two mass spring system. • Example 2: For mechanical systems like the one in the first example there is a nice physical interpretation of eigen values and eigen vectors. Consider a variant as in Figure 2.2. 
The equations of motion from summing forces for each mass in Fig 2.2 are m1 0 x¨1 k1 −k1 x1 0 + = 0 m2 x¨2 −k1 k1 + k2 x2 0 (2.11) Let x1 = a1 sin ωt and x2 = a2 sin ωt Differentiating twice with respect to t x¨1 = −ω 2 a1 sin ωt and x¨2 = −ω 2 a2 sin ωt substituting above k1 −k1 a1 sin ωt 0 a1 sin ωt 2 m1 =ω −k1 k1 + k2 a2 sin ωt 0 m2 a2 sin ωt (2.12) Or Kx = λM x where λ ≡ ω 2 . Multiplying both sides by M −1 . (Note that since M is diagonal, M −1 is simply a diagonal matrix with the inverse of each scalar value of M .) M −1 Kx = λM −1 M x = λx An eigenvalue problem! λ ≡ ω 2 is the eigenvalue of the matrix M −1 K. By our choice of x1 and x2 , ω is the natural frequency. Thus, the eigen value here is the square of the natural frequency. 2.2. APPLICATIONS 25 ξ2 m1 m1 x1 −1 ξ1 2 k1 m1 m2 x2 1 m2 m2 1 k2 Figure 2.3: Original configuration and vibration at the two modes corresponding to the two eigen vectors of M −1 K. k1 m1 −λ − mk12 − mk11 0 a1 sin ωt = k1 +k2 a 0 − λ 2 m2 (2.13) As before, to find the eigen values we need to set det(M −1 K − λI) = 0. Let k1 = 10, k2 = 20, m1 = 5, m2 = 10 (10 − 5λ)(30 − 10λ) − 100 = 0 50(λ2 − 5λ + 4) = 0 λ = 1, 4 ⇒ ω1 = 1, ω2 = 2 Plugging these in we have for the eigen vectors 2 −1 ξ1 = and ξ2 = 1 1 Thus for vibrations at frequency ω1 = 1 ⇒ x1 = 2 sin ωt and x2 = 1 sin ωt and for vibrations at frequency ω2 = 2 ⇒ x1 = −1 sin ωt and x2 = 1 sin ωt Figure 2.3 shows the original configuration and modes of vibration. • Example 3 Consider the 2D plane stress state σxx = 2, σxy = 5, σyy = 5. In 2 5 matrix form . Rotating the stress element (using for e.g. Mohr’s cicle) 5 5 – see figure 2.4 – to where shear stresses are zero to get principal stresses gets 26 CHAPTER 2. EIGENVALUES AND EIGENVECTORS 2 y 5 −1.7202 x 8.720 2 5 y’ x’ Figure 2.4: 2D plane stress and rotated form showing principal stresses. The principal stresses are the eigen values of the stress matrix and the directions are the eigen vectors. −1.7202 us the following principal stresses . Now recall that if axes line up 8.7202 with eigen vectors we will only have a stretch equal to the eigen value i.e. Ax0 = λ1 x0 , Ay 0 = λ2 y 0 . Thus, the principal stresses are the eigen values of the stress matrix and the principal directions are the eigen vectors. • Example 4 Consider the following electrical circuit: di The voltage across the inductors, each with inductance Lj , is VL,j = Lj dt , where t is time. Additionally, the voltage across the capacitors with capacitance Cj , R is given by C1j idt. Using Kirchoff’s law and summing voltages around each loop, we arrive at the following system of equations: Z Z di1 1 1 − (i1 − i2 ) dt − i1 dt = 0 (2.14) E − L1 dt C1 C3 Z Z 1 1 di2 (i1 − i2 ) − i2 dt − L2 =0 (2.15) C1 C2 dt 2.2. APPLICATIONS 27 If we differentiate both equation with respect to time, then the equations become 1 1 dE d2 i 1 1 + − i2 = (2.16) L1 2 + i1 dt C1 C3 C1 dt d2 i 2 1 1 1 L2 2 − i 1 + i2 + =0 (2.17) dt C1 C1 C2 If the supplied voltage, E, is constant, we can write the system as d2 i1 1 + C13 L1 0 C1 dt2 + 2 0 L2 ddti22 − C11 − C11 1 + C12 C1 i1 0 = i2 0 (2.18) As with the spring-mass system, we take the currents, ij to be of the form ij = Ij sin(ωt + φ). Substituting into Equation 2.18 gives 1 + 1 L1 0 I1 ω = C1 1 C3 0 L2 I2 − C1 2 − C11 1 + C12 C1 I1 I2 (2.19) Again, we have an eigenproblem, now for the frequency of the current in the circuits. 
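As a numerical cross-check of Example 2, the following Matlab sketch (using the values k1 = 10, k2 = 20, m1 = 5, m2 = 10 from above) forms M^(-1)K and computes its eigenvalues and eigenvectors:

    k1 = 10; k2 = 20; m1 = 5; m2 = 10;
    M = [m1 0; 0 m2];
    K = [k1 -k1; -k1 k1+k2];

    [V, D] = eig(M\K);      % M\K computes inv(M)*K
    lambda = diag(D)        % eigenvalues: 1 and 4 (ordering may vary)
    omega  = sqrt(lambda)   % natural frequencies: 1 and 2

    % eig returns each eigenvector scaled to unit length; rescaling each
    % column so that its second entry is 1 recovers the mode shapes
    % [2; 1] and [-1; 1] found above (possibly up to sign).
    V_scaled = V * diag(1 ./ V(2,:))

For larger mass-spring systems, only M and K change; the call to eig stays the same.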
• Example 5 Consider the example that we had at the beginning of this chapter where matrix multiplication can be seen as a geometrical scaling and rotation. Figure 2.5 illustrates n equally spaced 2D points on a ciricle (blue line) which are represented by matrix Pnx2 . Using a transformation matrix M , we can find P 0 which is the transformed version of the original points (red line) using the following equation. P = x1 y 1 x2 y 2 1 2 0 .. .. , M = 2 1 , P = P M. . . xn y n (2.20) Eigenvectors of ’M’ are also shown in Figure 2.5. In fact the eigenvectors demonstrate the main directions in which the data have been transformed. Moreover, the absolute value of eigenvalues associated with each eigenvector represent the importance of that direction. For instance the eigenvector associated with the largest eigenvalue demonstrate the main direction of transformation. Now use a M = αI where α is a scalar value and I is an identity matrix. What are the eigenvalues and eigenvectors? How would the shape change after the new transformation? Can you explain the physical meaning of the eigenvectors of the new transformation matrix? 28 CHAPTER 2. EIGENVALUES AND EIGENVECTORS 𝛼2 𝑉2 𝛼1 𝑉1 Figure 2.5: 2D Geometrical Transformation These modes and natural frequencies can be calculated from a finite element model using a Figure 2.6: Vibration of a cantilever beam. process that we illustrate now. Sample computation of modes/frequencies for 2 DOF system: • Example 6 Now consider one more example – a continuous system – a beam that is either cantilevered or simply supported – see figure 2.6. While, the same ideas extend here we will need to use a differential equation to model this. We will develop this after the next chapter on differential equations. The picture suggests the identification of natural frequencies with eigen values and mode shapes with eigen vectors. 2.3. EXTENDING THE IDEA A = 29 [ ] 2 −1 −1 2 λ1=1 ξ1=(0.707,0.707) λ2=3 ξ2=(−0.707,0.707 y=Az β1 λ1 ξ1 β2 λ2 ξ2 β2 O 2.3 2.3.1 Z β1 x Extending the idea Representation Now consider some vector z and product Az. Since ξ1 , ξ2 are linearly independent we can write z = β1 ξ1 + β2 ξ2 . Thus using the linearity of A Az = A(β1 ξ1 + β2 ξ2 ) = β1 Aξ1 + β2 Aξ2 = β1 λ1 ξ1 + β2 λ2 ξ2 (2.21) We can represent the action of matrix A on z by using only a simple combination of the eigen vectors Az = β1 λ1 ξ1 + β2 λ2 ξ2 . Thus, if we know the eigen values and eigen vectors λ1 , λ2 , ξ1 , ξ2 we can easily compute the action of A on any z, Az by simply scaling the eigen vectors with the components of z β1 , β2 and the eigen values. Thus, the effect of operator A on z is conveniently represented using the eigenvectors. 2.3.2 Special Properties Now we list some special properties of eigen values and eigen vectors that are useful: P 1 Eigenvalues and eigen vectors of symmetric matrices are real. P 2 Eigen vectors ξi , ξj of a symmetric matrix corresponding to distinct eigen values λi 6= λj are orthogonal. ξi · ξj = 0 30 CHAPTER 2. EIGENVALUES AND EIGENVECTORS Chapter 3 Differential Equations 3.1 Introduction In solving engineering problems, we always use the laws of physics to represent our problems with a set of mathematical equations which are often in forms of ‘differential equations’. This chapter provides the basic definitions of differential equations and briefly reviews the solution to the common differential equations in the domain of Mechanical and Aerospace Engineering. Differential Equations are equations containing derivatives of one or more variables. 
Variables indicating the value of a function are called dependent variables and the ones that takes different values in the domain are referred to as independent variable. For example, Equation 3.1 and 3.2 are two differential equations, where in the first one , ‘y’ is the dependent variable and ‘x’ is the independent variable. Whereas in the second equation ‘u’ is the dependent variable are ‘x’ and ‘t’ are two independent variables. 3.1.1 dy = x2 dx (3.1) ∂ 2u ∂u = 2 ∂x ∂t (3.2) Classification of Differential Equations Ordinary Differential Equation (ODE): a differential equation whose dependent variable is a function of a single independent variable. Equation 3.1 is an ODE because y = y(x). Partial Differential Equation (PDE): a differential equation whose dependent variable is a function of two or more independent variables. Equation 3.2 is a PDE because u = u(x, t). 31 32 CHAPTER 3. DIFFERENTIAL EQUATIONS Order of Differential Equation: The highest derivative that appears in the differential equation. Homogeneous Equation: If the scalar values (terms which do not include the dependent variables) in the differential equation are equal to zero, that equation is called ‘Homogeneous’ otherwise it is ‘non-homogeneous’. In other words if there exist a trivial solution (x=0) for a differential equation, it is a homogeneous equation otherwise it is non-homogeneous. 2 For example ddyx2 + sin(x) = 0 is a homogeneous equation but if you replace the ‘0’ with a constant value or a function of the independent variable ‘g(y)’ in general, it becomes a non-homogeneous equation. Linear/Non-linear: Any differential equation is called linear if it satisfies the superposition criteria. That is if x1 and x2 are two different solution of the differential equations, their liner combination (C1 x1 + C2 x2 ) is a solution too. Examples 3.1 Classify the following Differential Equations: • d3 y dx3 2 d y + y dx 2 + x = 0: 3rd order, Non-homogeneous ODE. Variable: (Dependent y, Independent x). • d3 x dy 3 2 + y ddyx2 + x = 0: 3rd order, Homogeneous ODE. Variable: (Dependent x, Independent y). • ∂2u ∂x2 + ∂2u ∂y 2 = ∂2u : ∂t2 2nd order, Homogeneous PDE. Variable: (Dependent u, Independent x, y, t). • ∂2u ∂x2 + ∂2u ∂y 2 = 5: 2nd order, Non-Homogeneous PDE. Variable: (Dependent u, Independent x, y). 3.1.2 Notations In most engineering problems, we are dealing with a physical parameter (such as velocity, temperature, current, etc.) which varies in a domain. Let use the dependent variable ‘u’ to represent such a physical parameter. If ‘u’ only varies in time, then there is only one independent variable that is time (t) and u = u(t). The differential equations describing the physics of u(t) will be ODE problems and for simplicity we 2 often use u̇ and ü instead of du and ddt2u . dt If ‘u’ varies in time as well spatial domain (e.g. x or y direction), then there are multiple independent variables and u = u(t, x, y). The differential equations 3.1. INTRODUCTION 33 describing the physics of u(t, x, y) will be PDE problems and for simplicity we may use the following notations: 2 2 2 2 ut , ux , uy , uxy , utt , uxx , uyy instead of ∂u , ∂u ∂u , ∂ u , ∂ u , ∂ u and ∂∂yu2 respectively. ∂t ∂x ∂y ∂x∂y ∂t2 ∂x2 3.1.3 Operators Let φ(x, y) be a scalar parameter (such as temperature) and ~u(x, y) = u1 (x, y)î + u2 (x, y)ĵ a vector parameter such as velocity. Both φ and ~u can vary from one point of the domain (xi , yi ) to another (xj , yj ). Then the following operators may be applied to one or both of these two fields. 
~ or simply ∇φ represents the level of variation (graGradient: The operator ∇φ dient) of φ in different direction. The mathematical representations is shown in equation 3.3. Note that the gradient will convert the scalar into a vector. ∂φ ∂φ ~ î + ĵ = φx î + φy ĵ. (3.3) ∇φ(x, y) = ∇φ(x, y) = ∂x ∂y Divergence: The operator ∇ · ~u represents the magnitude of the u at different points of the domain according to equation 3.4. Note that the Divergence will convert a vector field into a scalar one. ∇ · ~u(x, y) = ∂u1 (x, y) ∂u2 (x, y) + . ∂x ∂y (3.4) Laplacian: The Laplace operator ∇2 is the divergence of a gradient ∇ · ∇φ. it will convert a scalar field to another scalar field. ∇2 φ(x, y) = ∇ · ∇φ = ( ∂ ∂ ∂φ ∂φ ∂ 2φ ∂ 2φ î + ĵ) · ( î + ĵ) = + = φxx + φyy . (3.5) ∂x ∂y ∂x ∂y ∂x2 ∂y 2 Curl: The Curl operator ∇ × ~u will transform the vector field to another vector field according to equation 3.6. For example if u is the velocity field, ∇ × ~u will represent the angular velocity field. ∇ × ~u(x, y) = ( ∂ ∂ ∂u2 ∂u1 î + ĵ) × (u1 (x, y)î + u2 (x, y)ĵ) = ( − )k̂. ∂x ∂y ∂x ∂y (3.6) Example 3.2 Find the following operations if M = xy î + sin(x)ĵ and T = zx + 2y + e−z : 2 • ∇T = ∂T î ∂x + ∂T ĵ ∂y + ∂T k̂ ∂z = 2xz î + 2ĵ + (x2 − e−z )k̂. • ∇M : beyond the scope of this class. • ∇ · T : is not defined. • ∇·M = ∂ xy ∂x + ∂ sin(x) ∂y = y. 34 CHAPTER 3. DIFFERENTIAL EQUATIONS • ∇ × T : is not defined. ∂ • ∇ × M = ( ∂x î + ∂ ĵ) ∂y × (xy î + sin(x)ĵ) = ( ∂sin(x) − ∂x ∂(xy) )k̂ ∂y = (cos(x) − x)k̂. • ∇2 T = Txx + Tyy + Tzz = 2z + e−z . • ∇2 M : beyond the scope of this class. Table 3.1: Summary of the Operators Operation Gradient Input Scalar Equation ∇φ(x, y) = ∂φ î + ∂x Divergence Vector ∇ · ~u(x, y) = ∂u1 ∂x + ∂u2 ∂y Scalar Laplacian Scalar ∇2 φ(x, y) = ∂2φ ∂x2 + ∂2φ ∂y 2 Scalar Curl Vector ∂ î + ∇ × u(x, y) = ( ∂x 3.2 ∂ ĵ) ∂y ∂φ ĵ ∂y Output Vector × (u1 î + u2 ĵ) Vector Solutions of Linear ODEs A Linear ODE of order n has precisely n distinct solutions: x1 (t), x2 (t), ..., xn (t). The general solution of nth order homogeneous ODE is a linear combination of its n independent solutions. That is x(t) = C1 x1 (t) + C2 x2 (t) + ... + Cn xn (t) C1 , C2 , ...Cn are n unknown scalar values. To find a unique solution we need an additional n conditions. They are known as initial conditions if they represent the dependant variable at a specific time, or boundary conditions if they provide spatial information. If all the additional constraints are given at a initial point of the independent variable (time or space), then the ODE problem may be referred to as an initial value problem, otherwise it will be called a boundary value problem. Example 3.3 Classify the following ODE problems. 2 d y dy 2 • x dx 2 + (x − y) = 0 constraints: at x = 0, y = 1 and dx = 0. Answer: Initial value problem with three boundary conditions. • d2 x dt2 + x = sin(t) constraints: at t = 0, x = 1 and at t = 1, x = 1. Answer: Boundary value problem with two initial conditions. 3.2. SOLUTIONS OF LINEAR ODES 3.2.1 35 2nd Order ODE with constant coefficient The general solution of any inhomogeneous ODE has the general form of: x(t) = xh (t) + xp (t) where xh (t) is the general solution to the homogeneous equation and xp (t) is the particular solution in response to the inhomogeneous part of the ODE. In many textbooks xh (t) and xp (t) may be refereed to as transient solution and steady state solutions respectively. 
In this section, we seek to find the ‘homogeneous’ and ‘particular’ solution of 2nd order ODE with constant coefficients which is shown in Equation 3.7. aẍ + bẋ + cx = f (t) (3.7) Homogeneous/Transient Solution If f (t) = 0, Equation 3.7 will be a homogeneous ODE. The constant coefficient suggests that the solution of this ODE may have the general form of xh (t) = Aeαt . We calculate the first and 2nd derivatives of this candidate solution and replace them in 3.7. ẋh = αAeαt = αxh , ẍh = αẋh = α2 xh → aα2 xh + bαxh + cxh = 0 → (aα2 + bα + c)xh = 0 (3.8) (3.9) In order to have a non-trivial solution, we need to find the roots of (aα2 +bα+c) = 0. The roots of this quadratic equation can be studied in three different cases. Case1: b2 − 4ac > 0: There are two distinct roots α1 and α2 . Each one of these solutions represents a possible solution. The general solution of this ODE will be the linear combination of the two solutions: xh (t) = A1 eα1 t + A2 eα2 t Case2: b2 − 4ac = 0: There is a real repeated roots α1 . In this case, eα1 t and teα1 t are two possible solutions (Review your ODE textbook for the proof of the 2nd solution). The general solution of this ODE will be the linear combination of the two solutions: xh (t) = A1 eα1 t + A2 teα1 t Case3: b2 − 4ac < 0: There are complex conjugate roots α1 = p + iq and α2 = p − iq. In this case, e(p+iq)t and e(p−iq)t are two possible solutions. The general solution of this ODE will be the linear combination of the two solutions: x(t) = C1 e(p+iq)t + C2 e(p−iq)t . We can use Euler formula eiφ = cos(φ) + i sin(φ) to 36 CHAPTER 3. DIFFERENTIAL EQUATIONS further simplify the solution into a real format. xh (t) = A1 ept cos(qt) + A2 ept sin(qt) = ept [A1 cos(qt) + A2 sin(qt)] Particular Solution The particular solution of ODE has the general form of the inhomogeneous part. For instance if the inhomogeneous part is sin(4t) the particular solution will have the form of Asin(4t)+Bcos(4t). A and B are two constants that should satisfy the ODE problem. The most common inhomogeneous parts and their corresponding solutions are listed in Table 3.2. Table 3.2: Particular solutions of most common solutions inhomogeneous function C sin(ωt) Particular solution A sin(ωt) + B cos(ωt) C cos(ωt) A sin(ωt) + B cos(ωt) a1 t3 + a2 t2 + a3 t + a4 b1 t3 + b2 t2 + b3 t + b4 nth order polynomial nth order polynomial c1 eαt d1 eαt Note that if we have a linear function of different inhomogeneous functions, the particular solution will be a liner combination of their associated solutions. Moreover, if the inhomogeneous function is an nth order polynomial, the solution wil also be an nth order polynomial. All the polynomial coefficients should be assumed to be non-zero and evaluated in the main ODE. See the following examples for more clarifications. Example 3.4 Find the particular solution of the following ODEs: • ẋ + 4x = 8t2 . Answer: f (t) = 8t2 → xp (t) = a1 t2 + a2 t + a3 . Substitute xp (t) in the ODE to find a1 , a2 , a3 ẋp + 4xp = t2 → (2a1 t + a2 ) + 4(a1 t2 + a2 t + a3 ) = 8t2 → (4a1 − 8)t2 + (2a1 + 4a2 )t + (4a3 + a2 ) = 0 4a1 − 8 = 0 → a1 = 2 → 2a1 + 4a2 = 0 → a2 = − 21 a1 = −1 → xp = 2t2 − t + 0.25 4a3 + a2 = 0 → a3 = − 14 a2 = 0.25 • ẍ + ẋ − x = t + sin(t). 3.3. SOLUTIONS OF PARTIAL DIFFERENTIAL EQUATIONS 37 Answer: f (t) = t + sin(t) → xp (t) = a1 t + a2 + a3 sin(t) + a4 cos(t). 
Substitute xp (t) in the ODE to find a1 , ...a4 x¨p + ẋp − xp = t + sin(t) → (−a1 − 1)t + (a1 − a2 ) + (−2a3 − a4 − 1) sin(t) + (a3 − 2a4 ) cos(t) = 0 −a1 − 1 = 0 → a1 = −1 a1 − a2 = 0 → a2 = a1 = −1 → a3 − 2a4 = 0 → a3 = 2a4 −2a3 − a4 − 1 = 0 → −5a4 = 1 → a4 = −0.2 , a3 = −0.4 → xp = −t − 1 − 0.4 sin(t) − 0.2 cos(t) Example 3.5: Find the complete solution of ẍ + 2ẋ + x = sin(2t). Assume that x(0) = 0 and ẋ(0) = 0 Answer. Step I finding the homogeneous solution: ẍ + ẋ + 2x = 0 xh (t) = Aeαt → α2 + 2α + 1 = 0 → (α + 1)2 = 0 → α1 = α2 = −1 Case 2 → xh (t) = C1 e−t + C2 te−t = e−t (C1 + C2 t) Step II Finding the particular solution xp (t) = A sin(2t) + B cos(2t) → ẋp = 2A cos(2t) − 2B sin(2t) , ẍp = −4A sin(2t) − 4B cos(2t) ẍp + 2ẋp + xp = sin(t) → (−3A − 2B − 1) sin(2t) + (−3B + 4A) cos(2t) = 0 3A + 4B = −1 → A = −0.12 , B = −0.16 −3B + 4A = 0 x(t) = xh (t) + xp (t) = e−t (C1 + C2 t) − 0.12 sin(2t) − 0.16 cos(2t) Step III- Apply initial conditions. x(0) = 0 → C1 = 0.16 ẋ(0) = 0 → C2 = C1 + 0.24 → C2 = 0.4 → x(t) = (0.16 + 0.4t)e−t − 0.12 sin(2t) − 0.16 cos(2t) 3.3 Solutions of Partial Differential Equations The most common approach for solving PDE is the method of Separation of Variables. Although this method is not universally applicable to all PDEs, it will provide a solution to the most simple engineering problems. It assumes that the dependent variable (e.g y(x, t)) can be written as a product of two separate functions, each of which depends on only one independent variable. Separation of variables will convert a PDE problem into a set of ODE problems. We will use this approach to solve some of the popular PDE problems in engineering. 38 CHAPTER 3. DIFFERENTIAL EQUATIONS 3.3.1 Wave Equation Consider an elastic string stretched under a tension between two points on the x axis. By releasing the stretched string, we are interested in calculating the vibration of the string. In other word we want to find y(x, t) in Figure 3.1. The PDE describing the vibration of the elastic string is shown in Equation 3.10 and is known as 1-dimensional wave equation. Y y(t=0,x)=f(x) x=L x=0 X Figure 3.1: Elastic String under tension 2 ∂ 2y 2∂ y = α (3.10) ∂t2 ∂x2 This equation is subjected to two boundary conditions: y(0, t) = 0 , y(L, t) = 0 and two initial conditions: y(x, 0) = f (x) , yt (x, 0) = 0. Problem: Wave Equations: ∂2y ∂t2 2 ∂ y = α2 ∂x 2 BCs: y(0, t) = 0 , y(L, t) = 0 ICs: y(x, 0) = f (x) , yt (x, 0) = 0 Solution: Step I-Separation of Variables Let assume that we can write the solution as a multiplication of two independent solution for space (X(x)) and time (T (t)). y(x, t) = X(x)T (t) d2 T yt = X(x) dT , y = X(x) tt dt dt2 → X(x) d2 T2 = α2 T (t) d2 X2 → dt dx 2 d X yx = T (t) dX , y = T (t) xx 2 dx dx 1 d2 X 1 1 d2 T → = 2 (3.11) X(x) dx2 α T (t) dt2 In Equation 3.11, the right hand side is a function of time (t) while the left hand side is a function of space (x). This is impossible unless they are both equal to a 3.3. SOLUTIONS OF PARTIAL DIFFERENTIAL EQUATIONS 39 constant value. It turns out that only a negative constant value will yield to a nontrivial solution. Therefore the PDE problem can be written as two ODE problems. 1st ODE: 2nd ODE: 1 d2 X X(x) dx2 1 1 d2 T α2 T (t) dt2 = −k 2 → X 00 + k 2 X = 0 = −k 2 → T 00 + α2 k 2 T = 0 To see that the constant must be negative, let us consider the cases when the constant is zero and when the constant is positive. If the constant is zero, then the ODE for the x-variable is simply X 00 = 0. The solution is X(x) = ax+b for constants a and b. 
However, if we now consider the boundary conditions X(0) = X(L) = 0 (see below), then that means a = b = 0, i.e. X(x) = 0. So only the trivial solution can satisify the equation if the constant is zero. Similarly, if the constant is positive, call it k 2 , then the differential equation for X(x) becomes X 00 − k 2 X = 0. In this case, the form of the solution will be a sum of exponentials. Again, applying the boundary conditions, we find that the only function that satisifies the differential equation is X(x) = 0. Thus, the constant must be negative in order for us to find a non-trivial (i.e. nonzero) solution. Step II-Solving the ODEs: The above ODEs are homogeneous equations of case3. To solve the above ODEs we need find the new initial and boundary conditions. y(0, t) = X(0)T (t) = 0 → X(0) = 0 yt (x, 0) = X(x)Ṫ (0) = 0 → Ṫ (0) = 0 → and . y(L, t) = X(L)T (t) = 0 → X(L) = 0 y(x, 0) = X(x)T (0) = f (x) Solving ODE1: X 00 + k 2 X = 0 and X(0) = X(L) = 0. Case 3: → X(x) = C1 cos(kx) + C2 sin(kx). X(0) = 0 → C1 = 0 X(L) = 0 → C2 sin(kx) = 0. There are infinite solutions: kn = nπ L → Xn (x) = Cn sin( nπ x) , n = 1, 2, ...∞ L Solving ODE2: T 00 + n2 π 2 2 α T L2 = 0 and Ṫ (0) = 0, X(x)T (0) = f (x). Case 3: → T (t) = D1 cos( nπ αt) + D2 sin( nπ αt). L L Ṫ (0) = 0 → D2 = 0 → Tn (t) = Dn cos( nπ αt) , n = 1, 2, ...∞ L Final PDE Solutions: There are infinite solutions in space and time domain, yn (x, t) = Xn (t)Tn (t). Therefore the final solution will be a linear combination of these solutions. 40 CHAPTER 3. DIFFERENTIAL EQUATIONS y(x, t) = P∞ n=1 yn (x, t) = P∞ n=1 wn sin( nπ x) cos( nπ αt) L L wn can be calculated using the remaining initial condition: y(0, x) = f (x). y(x, 0) = f (x) → y(x, 0) = P∞ n=1 wn sin( nπ x) = f (x) L This is the Fourier series representation of the function f (x). Recognizing this, we can compute the coefficients, wn , by applying the formula for computing the Fourier coefficients: wn = 2 L RL 0 f (x)sin( nπ x)dx L Example 3.6 Solve the wave equation for f (x) = sin(x) and L = π. y(x, t) = wn = 2 π P∞ n=1 yn (x, t) = Rπ 0 P∞ n=1 wn sin(nx) cos(nαt) sin(x) sin(nx)dx → w1 = 1 and other wn are zero. y(x, t) = sin(x) cos(αt) 3.3.2 Heat Equation Heat equation describes the heat propagation through a medium. In this equation, α is the thermal diffusivity. In this section, we will use the method of ‘separation of variables’ to solve the Heat Equation subject to different boundary conditions. ∂u ∂t = α∇2 u Problem 1. Find a temperature distribution in a rod with constant temperature at both ends. Assume that the surface of the rod is insulated (no convection). Figure 3.2: Heat transfer in a rod with homogeneous boundary conditions 2 This problem can be described by ∂u = α2 ∂∂xu2 . This is a 1-dimensional heat ∂t equation. Similar to wave equation, we solve this problem in three different steps. 3.3. SOLUTIONS OF PARTIAL DIFFERENTIAL EQUATIONS 41 Solution: Step I-Separation of Variables Let assume that we can write the solution as a multiplication of two independent solution for space (X(x)) and time (T (t)). u(x, t) = X(x)T (t) ut = X(x) dT dt 2 T 2 ux = T (t) dX , uxx = T (t) ddxX2 → X(x) ddt = α2 T (t) ddxX2 dx → 1 d2 X 1 1 dT = X(x) dx2 α2 T (t) dt (3.12) In Equation 3.12, the right hand side is a function of time (t) while the left hand side is a function of space (x). This is impossible unless they are equal to a constant value. 
Using similar arguments from the wave equation development, it can be shown that only a negative constant value will yield to a non-trivial solution. Therefore the PDE problem can be written as two ODE problems. 1st ODE: 2nd ODE: 1 d2 X X(x) dx2 1 1 dT α2 T (t) dt = −k 2 → X 00 + k 2 X = 0 = −k 2 → T 0 + α2 k 2 T = 0 Step II-Solving the ODEs: The 1st ODEs is a homogeneous equations of case3 and the 2nd one is a first order ODE. To solve the above ODEs we need find the new initial and boundary conditions. u(0, t) = X(0)T (t) = 0 → X(0) = 0 → and u(x, 0) = X(x)T (0) = f (x). u(L, t) = X(L)T (t) = 0 → X(L) = 0 Solving ODE1: X 00 + k 2 X = 0 and X(0) = X(L) = 0. Case 3: → X(x) = C1 cos(kx) + C2 sin(kx). X(0) = 0 → C1 = 0 X(L) = 0 → C2 sin(kx) = 0. There are infinite solutions: kn = nπ L → Xn (x) = Cn sin( nπ x) , n = 1, 2, ...∞ L Solving ODE2: dT dt 1st order ODE: → → Tn (t) = e− R n2 π 2 2 α t L2 2 2 = − nLπ2 α2 T and X(x)T (0) = f (x). dT T =− R n2 π 2 2 α dt. L2 , n = 1, 2, ...∞ Final PDE Solutions: There are infinite solutions in space and time domain, un (x, t) = Xn (t)Tn (t). Therefore the final solution will be a linear combination of 42 CHAPTER 3. DIFFERENTIAL EQUATIONS these solutions. u(x, t) = P∞ n=1 un (x, t) = P∞ − n=1 wn e n2 π 2 2 α t L2 sin( nπ x) L wn can be calculated using the remaining initial condition: y(0, x) = f (x). u(x, 0) = f (x) → u(x, 0) = P∞ n=1 x) = f (x) wn sin( nπ L Again, we use the formula for the Fourier coefficients: wn = 2 L RL 0 f (x)sin( nπ x)dx L Example 3.7 Solve the heat equation for f (x) = 100 , α = 1 and L = π. u(x, t) = P∞ n=1 un (x, t) = P∞ − n=1 wn e n2 π 2 2 α t L2 Rπ (1 − (−1)n ) wn = π2 0 100 sin(nx)dx = 200 nπ P∞ 200 2 u(x, t) = n=1 nπ (1 − (−1)n ) sin(nx)en t = sin( nπ x) L 400 π (sin(x)e−t + sin(3x)e−9t + ...) Problem 2. Find a temperature in a rod with an insulated end. (IC: u(x, 0) = f (x), BCs: u(L, t) = 0, ux (0, t) = 0) The formulation of this problem is similar to problem 1 but with different boundary conditions. Therefore step 1 is the same as problem 1 but we have to redo the step 2 and 3. ux (0, t) = X 0 (0)T (t) = 0 → X 0 (0) = 0 → and u(x, 0) = f (x). u(L, t) = X(L)T (t) = 0 → X(L) = 0 Solving ODE1: X 00 + k 2 X = 0 and X(0) = X(L) = 0. Case 3: → X(x) = C1 cos(kx) + C2 sin(kx). X 0 (0) = 0 → C2 = 0 X(L) = 0 → C1 cos(kx) = 0. There are infinite solutions: kn = (2n−1)π 2L → Xn (x) = Cn cos( (2n−1)π x) , n = 1, 2, ...∞ 2L Solving ODE2: It will be same as problem 1 but note that k-value has been changed to (2n−1)π 2L → Tn (t) = e− (2n−1)2 π 2 2 α t 4L2 , n = 1, 2, ...∞ Final PDE Solutions: There are infinite solutions in space and time domain, un (x, t) = Xn (t)Tn (t). Therefore the final solution will be a linear combination of 3.3. SOLUTIONS OF PARTIAL DIFFERENTIAL EQUATIONS 43 these solutions. u(x, t) = P∞ n=1 un (x, t) = P∞ − n=1 wn e (2n−1)2 π 2 2 α t 4L2 cos( (2n−1)π x) 2L wn can be calculated using the remaining initial condition: y(0, x) = f (x). 3.3.3 Different Boundary conditions Comparing the solutions of problem 1 and 2, you can see that the solution of the same PDE varies depending on the BCs. In general there are three classes of boundary conditions that may be applied to the PDE problems. • Drichlet BCs: u is specified at the boundary. If u = 0 it is called homogeneous boundary conditions. • Neumann BCs: The first derivative of u is specified at the boundary. • Robin BCs: A combination of u and its derivative is specified at the boundary. 44 CHAPTER 3. 
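As a closing check on these separation-of-variables solutions, the truncated series from Example 3.7 is easy to evaluate numerically. The sketch below is ours: the truncation level N, the spatial grid, and the time shown are arbitrary choices, and the code simply sums the series term by term. Note the decaying factor exp(-n^2*t): the exponent is negative, so every mode dies out in time.

% Truncated series of Example 3.7: u(x,0) = 100, alpha = 1, L = pi.
N = 51;  x = linspace(0, pi, 201);  t = 0.05;
u = zeros(size(x));
for n = 1:N
    wn = 200*(1 - (-1)^n)/(n*pi);    % Fourier coefficient; zero for even n
    u  = u + wn*exp(-n^2*t)*sin(n*x);
end
plot(x, u), xlabel('x'), ylabel('u(x,t)')

The same loop works for Problem 2 after replacing sin(n*pi*x/L) and the exponent with the cosine modes and k-values derived there.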
DIFFERENTIAL EQUATIONS Part II Numerical Methods 45 Chapter 4 Numerical Solution of Linear Systems Previously, we have discussed Gaussian Elimination as a systematic way to compute the solution to linear systems of equations. However, for large systems, say number of unknowns greater than 10, this is very cumbersome and error-prone to do manually. We would much rather have a computer do the work for us. Thus, we will first discuss outlining an algorithm in “pseudo-code” to perform Gaussian Elimination for arbitrary square matrices. Next, we consider the computational effort involved in the algorithm. Next, we study the pitfalls of “naı̈ve” Gaussian Elimination and propose strategies to overcome these pitfalls. Finally, we consider more efficient variations when one has multiple right-hand-side vectors for the same matrix. 4.1 Automating Gaussian Elimination To this point, we’ve only discussed Gaussian Elimination through examples and illustrating in words. Now, we must convert this procedure into an algorithm that can be implemented in computer code. When we write such algorithms, we typically use “pseudo-code” — that is, we express the logic of the algorithm, adhering to common programming syntax, but not conforming to a specific language. We construct the algorithm by examining the Gaussian Elimination procedure. We use each pivot row to eliminate the lower triangular portion of the matrix, each column at a time. If we have an n × n matrix A, then for each pivot row, i = 1 to n − 1, we zero out the column beneath the current pivot element, i.e. rows i + 1 to n. We zero out the column by scaling the pivot row using the pivot element and substracting from the row for which we are trying to construct a zero in the appropriate column; we also have to adjust the right-hand-side. Figure 4.1 gives the pseudo-code for the Gaussian Elimination procedure we discussed previously. 47 48 CHAPTER 4. NUMERICAL SOLUTION OF LINEAR SYSTEMS % Forward elimination for i = 1:n-1 %pivot row for j = i+1:n factor = A(j,i)/A(i,i) A(j,i) = 0 b(j) = b(j) - factor*b(i) for k = i+1:n A(j,k) = A(j,k) - factor*A(i,k) end end end %Backward substitution x(n) = b(n)/A(n,n) for i = 1:n-1 k = n-i x(k) = b(k) for j = k+1:n x(k) = x(k) - A(k,j)*x(j) end x(k) = x(k)/A(k,k) end Figure 4.1: Pseudo-code for naı̈ve Gaussian Elimination 4.2 Computational Work of Gaussian Elimination Before we discuss the operation count of Gaussian Elimination, we need to introduce the concept of “Big O” notation. This is useful because when discuss the scaling of algorithms, we often don’t need to know exact counts, but only trends as the number of entries gets large. We say a function f (x) is O(g(x)) if as x → a, there exists a δ and M such that |f (x)| ≤ M |g(x)| for |x − a| < δ. In the present context, we might say that “The number of FLOPS in Gaussian Elimination is O(n3 ).” This means that while we don’t exactly the constant in front of n3 , we know that the cost scales like the cube of the dimension of the matrix for which we’d perform Gaussian Elimination. Now, we can proceed to count the number of floating point operations (FLOPS) involved in Gaussian elimination. We start with the elimination phase. Referring to Figure 4.1, we have 1 divide, 1 substraction, and 1 multiplication for each i and for each j. Additionally, for each k, we have 1 substraction and 1 multiplication. Then, 4.2. COMPUTATIONAL WORK OF GAUSSIAN ELIMINATION 49 the total number of FLOPS, T , is T = n−1 X n X 3+ i=1 j=i+1 ! 
n X 2 (4.1) k=i+1 Next, we can separate the terms and factor out the numbers from the sums since they are independent of the counting index. Thus, we have T = n−1 X n X 3+ i=1 j=i+1 =3 =3 n−1 X n n X X n−1 X n X 1+2 n−1 X n n X X i=1 j=i+1 i=1 j=i+1 k=i+1 n−1 X n−1 X n X (n − i) + 2 i=1 =3 2 n−1 X (4.2) i=1 j=i+1 k=i+1 1 (4.3) (n − i) (4.4) i=1 j=i+1 (n − i) + 2 i=1 n−1 X (n − i) (n − i) (4.5) i=1 Now regrouping terms, we have, T = = = n−1 X i=1 n−1 X i=1 n−1 X (n − i) + 2 (n − i) + 2 (n − i) + 2 i=1 n−1 X i=1 n−1 X i=1 n−1 X (n − i) + 2 n−1 X (n − i) (n − i) (4.6) i=1 [(n − i) + (n − i) (n − i)] (4.7) (n − i) (n − i + 1) (4.8) i=1 Now, there are two useful identies that we will use: n−1 X i=1 n−1 X i=1 i= (n − 1)(n) 2 (4.9) n(n − 1)(2n − 1) i = 6 2 Continuing, we can now separate Equation (4.8) into separate terms conducive to applying the identities in Equation (4.9): T = n−1 X i=1 n− n−1 X i=1 i+2 n−1 X n2 − 2ni + n + i2 − i (4.10) i=1 (n − 1)n = n(n − 1) − + 2n2 (n − 1) − 4n 2 n(n − 1)(2n − 1) (n − 1)n +2 −2 6 2 (n − 1)n 2 + 2n(n − 1) (4.11) 50 CHAPTER 4. NUMERICAL SOLUTION OF LINEAR SYSTEMS Collecting the terms we see that 2 T = n3 + O(n2 ) 3 (4.12) In other words, there are 32 n3 flops plus some other terms that scale like n2 , but n2 grows much slower than the n3 term, so the constant on the n2 term is unimportant. A similar analysis can be performed for the backward substitution phase. It turns out that backward substitution is only O(n2 ), so the elimination phase is by far the dominant cost of Gaussian Elimination. Table 4.1 shows the cost of Gauss Elimination for various matrix sizes. Table 4.1: Cost of Gaussian Elimination as a function of matrix dimension n Back 2 3 n Elimination Substitution Total Flops n 3 10 705 100 805 667 100 671550 10000 681550 666667 1000 6.67 × 108 1000000 6.68 × 108 6.67 × 108 4.3 Percentage due to Elimination 87.58% 98.53% 99.85% Partial Pivoting Up to this point, we have not considered any of the potential modes of failure of Gaussian Elimination, other than assuming that the system to be solved is nonsingular. However, examining Figure 4.1 we can see one particular mode of failure: namely, if any entry of the diagonal is zero, then we will have a divide-by-zero instance, resulting in failure of the algorithm. Such circumstances are common, even for non-singular systems. What can we do to overcome this difficulty? The main strategy that is followed is pivoting the rows. That is, we exchange two rows in the system to remove the zero from the diagonal. The order of the equations of our system doesn’t alter the solution, so we are free to interchange the rows. We call this strategy “partial pivoting”. It is possible to additionally interchange columns — such a strategy is called “complete pivoting”. However, the complexity of the complete pivoting strategy far outweighs its utility. Indeed, the vast majority of linear systems encountered in science and engineering can be adequately treated with partial pivoting. So, assuming we wish to adopt the partial pivoting strategy, how do we systematically implement it? The key is one extra step before eliminating entries in the current pivot column. Namely, we search for the largest magnitude element in the current column and then exchange the current with row with that row. In this way, we always ensure that we have a nonzero value for the pivot element. Example 4.3.1. Gaussian Elimination with Partial Pivoting 4.4. 
LU DECOMPOSITION 51 Consider the following linear system of equations: 1 0 2 3 x1 1 −1 2 2 −3 x2 −1 0 1 1 4 x3 = 2 6 2 2 4 x4 1 We wish to solve this system using Gaussian Elimination with partial pivoting. First, we examine column 1 and see that the largest element is the (4, 1) entry. Thus, we pivot the 1st and 4th rows to get 6 2 2 4 x1 1 −1 2 2 −3 x2 −1 0 1 1 4 x3 = 2 1 0 2 3 x4 1 Now we proceed eliminating the 6 2 0 7/3 0 1 0 −1/3 subdiagonal entries in the first column. This leaves 2 4 x1 1 7/3 −7/3 x2 = −5/6 1 4 x3 2 5/3 7/3 x4 5/6 Now, the pivot element becomes the (2, 2) entry. Before eliminating the subdiagonal elements of the second column, we search for the maximum element, in absolute value, in the second column below the diagonal. Here, the current pivot element is the maximum, so there is no need to pivot. We proceed with elimination yielding 6 2 2 4 x1 1 0 7/3 7/3 −7/3 x2 −5/6 = 0 0 0 5 x3 33/14 0 0 2 2 x4 5/7 Now the pivot element is the (3, 3) entry. Now we have a zero pivot! There’s only one other remaining row, so we exchange row 3 and 4. This gives 6 2 2 4 x1 1 0 7/3 7/3 −7/3 x2 −5/6 = 0 0 2 2 x3 5/7 0 0 0 5 x4 33/14 Now the elimination phase is complete (system is now upper triangular) and the solution procedure can be completed with the standard backward substitution algorithm. 4.4 LU Decomposition Gaussian Elimination gives a “direct” route to a solution for a given linear system. However, if we wish to change only the right-hand-side, Gaussian Elimination requires us to completely redo all stages of the solution process. As we saw in previous 52 CHAPTER 4. NUMERICAL SOLUTION OF LINEAR SYSTEMS examples in Chapter 1.2, such as the statics examples, the right-hand-side consists of only external loads on a structure. When studying a structure, we may wish to apply many different loads without changing the geometry or other properties of the structure. Given that the elimination phase is by far the most compute-intensive, is there a way we can reuse the elimination part of the solution phase so that we can easily, and cheaply, change only the loading? The answer to this question is yes and relies on the notion of matrix decompositions. The idea is to “decompose” a matrix into separate matrices. In the present context, we focus on the LU decomposition: A = LU (4.13) where L is a lower triangular matrix and U is an upper triangular matrix. We know we ought to be able to accompolish such a decomposition because the elimination phase of Gaussian Elimination transformed the system into the form Ux = d (4.14) Assuming that we can compute the decomposition A = LU , then we can see how we can alter our solution procedure. Consider the linear system Ax = b. If we decompose A = LU , then we have LU x = b. Now, define y = U x. Then we have Ly = b. Since L is lower triangular and b is given, we can easily solve for y using forward substitution, in analogy with backward substitution only starting at the “top” and moving forward through the system. With y computed, y now becomes the data for the system U x = y. Now, we can solve U x = y using backward substitution. Both substitution phases are cheap, i.e. O(n2 ), so that once we’ve determined the LU decomposition of A, it is straight-forward to vary the vector b without having to recompute the matrix decomposition. So, using the LU decomposition, the solving a linear system proceeds in three steps: 1. Factor: Compute the matrix decomposition A = LU , where L is lower triangular and U is upper triangular. 2. 
Forward Substitution: Solve Ly = b for the vector y. 3. Backward Substitution: Solve U x = y for the vector x. How do we compute the LU decomposition? Let’s first consider a 2 × 2 case. 1 2 l11 0 u11 u12 = 3 5 l21 l22 0 u22 This gives us the relationships: l11 u11 l11 u12 l21 u11 l21 u12 + l22 u22 =1 =2 =3 =5 4.4. LU DECOMPOSITION 53 We have 4 equations, but 6 unknowns — the system is underdetermined! This means we have a choice in how to proceed. The choice we make is to have the diagonal entries lii = 1.1 Now that we’ve made this choice, we can continue developing the LU decomposition. For the general 4 × 4 case, the factorization is a11 a12 a13 a14 1 0 0 0 u11 u12 u13 u14 a21 a22 a23 a24 l21 1 0 0 0 u22 u23 u24 A= (4.15) a31 a32 a33 a34 = l31 l32 1 0 0 0 u33 u34 a41 a42 a43 a44 l41 l42 l43 1 0 0 0 u44 We can multiply L and U back together and compare against A in order to deduce the values of lij and uij . a11 a12 a13 a14 a21 a22 a23 a24 A= a31 a32 a33 a34 a41 a42 a43 a44 u11 u12 u13 u14 l21 u11 l21 u12 + u22 l21 u13 + u23 l21 u14 + u24 = l31 u11 l31 u12 + l32 u22 l31 u13 + l32 u23 + u33 l31 u14 + l32 u24 + u34 l41 u11 l41 u12 + l42 u22 l41 u13 + l42 u23 + l43 u33 l41 u14 + l42 u24 + l43 u34 + u44 Right away, we see the top row matches directly with A, namely u11 = a11 , u12 = a12 , u13 = a13 , and u14 = a14 . With u11 determined, we can now determine the li1 values. In particular, l21 = a21 /u11 = a21 /a11 , and similarly, l31 = a31 /a11 and l41 = a41 /a11 . These are exactly the scaling factors from Gaussian Elimination! In fact, turning to the 2nd row, the entries in U are the same as if we were to apply Gaussian Elimination to A. Namely, u22 = a22 − l21 u12 , u23 = a23 − l21 u13 , and u24 = a24 − l21 u14 . This pattern continues upon examination of the second column, the 3rd row, and so on. So, to construct the LU decomposition of A, we apply the elimination step of Gaussian Elimination to construct U and we store the factors in L. Example 4.4.1. LU Decomposition Consider the following matrix A: 1 2 3 A = 2 6 10 3 14 28 Let us construct the LU decomposition of A. We apply Gaussian Elimination as before, but now we’ll keep track of the pivot factors and store them in L and the “eliminated” version of A will be U . First, we scale the first equation by 2/1 and 1 This is the so-called “Doolittle” decomposition; there are other variants, but we will not discuss them. 54 CHAPTER 4. NUMERICAL SOLUTION OF LINEAR SYSTEMS substract from the second equation. Thus, the l21 entry will be 2 and we have 1 2 3 1 0 0 U = 0 2 4 , L = 2 1 0 3 14 28 l31 l32 1 Now we eliminate the (3, 1) entry by scaling the first equation by 3/1 and subtracting from the third equation. Thus, l31 = 3 and we have 1 2 3 1 0 0 U = 0 2 4 , L = 2 1 0 0 8 19 3 l32 1 Finally, we eliminate the (3, 2) entry by scaling the second equation by 8/2 and subtracting from the third equation. Thus, l32 = 4 and we have 1 2 3 1 0 0 U = 0 2 4 , L = 2 1 0 0 0 3 3 4 1 The eliminated version of A is, in fact, U and we have filled L, so we have completed the LU decomposition of A. Now, we may verify that 1 0 0 1 2 3 1 2 3 LU = 2 1 0 0 2 4 = 2 6 10 = A 3 4 1 0 0 3 3 14 28 Example 4.4.2. Solving a Linear System using LU Decomposition Consider the linear system 1 2 3 x1 1 2 6 10 x2 = 2 3 14 28 x3 3 In the previous example, we computed the LU decompositon of A: 1 0 0 1 2 3 LU = 2 1 0 0 2 4 3 4 1 0 0 3 To solve the linear system using the LU decomposition, first we define the vector y = U x. 
Then, upon substituting for A and U x, we have 1 0 0 y1 1 Ly = 2 1 0 y2 = 2 = b 3 4 1 y3 3 4.4. LU DECOMPOSITION 55 We apply forward substitution to compute the y vector. So, trivially, y1 = 1. Then, 2y1 + y2 = 2 so that y2 = 0. Finally, 3y1 + 4y2 + y3 = 3 so that y3 = 0. Thus, y = [1, 0, 0]T . Now that we’ve computed y, we can perform backward substitution, i.e. solve 1 2 3 x1 1 U x = 0 2 4 x2 = 0 = y 0 0 3 x3 0 So we immediately see that x3 = 0. Then, moving “backwards”, 2x2 + 4x3 = 0 so that x2 = 0. Finally, we have x1 + 2x2 + 3x3 = 1 so that x1 = 1. Thus, the solution to our original linear system is x = [1, 0, 0]T . Once we’ve computed the decomposition of our matrix, we can easily compute the solution for a variety of right-hand-side vectors by using the LU decomposition and forward and backward substitution. This potentially save a great deal of computational effort as the LU decomposition phase is O(n3 ), but the forward and backward substitution phases are only O(n2 ). So, we only need to do one LU decomposition for our matrix, the expensive part, but then it’s relatively cheap to do many repeated forward and backward substitutions for many right-hand-side vectors. We may also use partial pivoting with the LU decomposition. The same strategy is used as with Gaussian Elimination, namely we search the current column for the largest magnitude entry and exchange the two rows. Additionally, we now keep track of a permutation matrix, P , that encodes the row swapping operations. In particular, P is just the identity matrix, but each time two rows are exchanged during the decomposition, we also exchange the corresponding two rows of P . In particular, at the end of the LU decomposition computation, P A = LU (4.16) The important implication here is that, if we wish to solve a linear system, we must also permute the right-hand-side in order to obtain a correct solution: P Ax = LU x = P b (4.17) That is, we must apply P to b before performing the forward and backward substitution. This is because when we pivot the rows during elimination, we have to also swap the rows of the right-hand side. Example 4.4.3. LU Decomposition with Partial Pivoting Consider again the matrix A: 1 0 2 3 −1 2 2 −3 A= 0 1 1 4 6 2 2 4 56 CHAPTER 4. NUMERICAL SOLUTION OF LINEAR SYSTEMS We will compute the LU decomposition including partial pivoting. In addition to populating the lower triangular L matrix, we will also keep track of the permuation matrix P . As in Example (4.3.1), we first exchange rows 1 and 4 since the (4, 1) entry is the largest in magnitude in the first column. So we exchange the rows in A and proceed as before, eliminating the first column and storing the pivot factors in L, but we now also permute rows 1 and 4 in our matrix P : 6 2 2 4 1 0 0 0 0 0 0 1 0 7/3 7/3 −7/3 , L = −1/6 1 0 0 , P = 0 1 0 0 , U = 0 1 1 4 0 l32 1 0 0 0 1 0 0 −1/3 5/3 7/3 1/6 l42 l43 1 1 0 0 0 No we proceed to eliminate the second column. Here, no permutation is we proceed: 6 2 2 4 1 0 0 0 0 0 0 7/3 7/3 −7/3 −1/6 1 0 0 0 1 ,L = U = ,P = 0 0 0 0 0 0 5 3/7 1 0 0 0 2 2 1/6 −1/7 l43 1 1 0 needed, so 0 0 1 0 1 0 , 0 0 Now, in column 3, we must permute the 3rd and 4th rows according to the pivoting algorithm. 
Here, we must exchange rows 3 and 4 of P , but we must also exchange the subdiagonal rows L: 6 2 2 4 1 0 0 0 0 0 0 1 0 7/3 7/3 −7/3 1 0 0 , L = −1/6 , P = 0 1 0 0 , U = 0 0 1/6 −1/7 1 0 1 0 0 0 2 2 0 0 0 5 0 3/7 l43 1 0 0 1 0 Finally, l43 = 0/2 = 0 and U is already in upper triangular form so we are done: 1 0 0 0 6 2 2 4 0 0 0 1 −1/6 0 7/3 7/3 −7/3 1 0 0 , P = 0 1 0 0 L= 1/6 −1/7 1 0 , U = 0 0 2 2 1 0 0 0 0 3/7 0 1 0 0 0 5 0 0 1 0 Example 4.4.4. Solving a linear system using LU Decomposition with Partial Pivoting Consider the following linear system of equations: 1 0 2 3 x1 1 −1 2 2 −3 x2 −1 0 1 1 4 x3 = 2 6 2 2 4 x4 1 4.5. CHOLESKY DECOMPOSITION 57 We will use the decomposition we computed previously to solve the linear system. First, we must apply the permutation matrix, P , to our right-hand-side vector since P Ax = LU x = P b Thus, 0 0 1 0 0 1 0 0 0 0 0 1 1 1 1 −1 −1 0 = 0 2 1 0 1 2 First, we perform forward substitution, Ly = P b: 1 0 0 −1/6 1 0 1/6 −1/7 1 0 3/7 0 0 y1 1 y1 1 0 y2 = −1 ⇒ y2 = −5/6 y3 5/7 0 y3 1 y4 1 y4 2 33/14 And now we can perform backward substitution, U x = y, to get the solution x: 6 2 2 4 x1 1 x1 −13/70 0 7/3 7/3 −7/3 x2 −5/6 = ⇒ x2 = 8/35 0 0 2 2 x3 5/7 x3 −4/35 0 0 0 5 x4 33/14 x4 33/70 4.5 Cholesky Decomposition Many systems encountered in engineering analysis are symmetric and positive-definite. The system is symmetric if, when written in matrix form Ax = b, A = AT . The system is positive definite if, and only if, all the eigenvalues are real and positive. There are many algorithms for which we can take advantage of this structure of the problem. The Cholesky decomposition is one such example. For symmetric, positive-definite systems, instead of computing the LU decomposition, we can compute the Cholesky Decomposition: A = LLT (4.18) where is a lower-triangular matrix. Note we already save half of the memory requirements of the LU decomposition because we only need to store L, not both L and U . Additionally, the Cholesky decomposition requires approximately half the computational effort of the LU decomposition. 58 CHAPTER 4. NUMERICAL SOLUTION OF LINEAR SYSTEMS To construct the Cholesky decomposition, we proceed as before, writing out the matrix in general form and computing each entry term-by-term. For the 3 × 3 case, l11 0 0 l11 l21 l31 A = LLT = l21 l22 0 0 l22 l32 l31 l32 l33 0 0 l33 2 (symmetric) l11 2 2 l21 + l22 = l21 l11 2 2 2 l31 l11 l31 l21 + l32 l22 l31 + l32 + l33 From this expression, we can see that each of the entries in L is ! j−1 X 1 aij − lik ljk , i > j lij = ljj k=1 v u j−1 u X t 2 ljj = ajj − ljk k=1 Using these expressions, we can construct pseudocode for the Cholesky algorithm. for k = 1 : n % evaluate off-diagonal terms for i = 1 : k-1 sum=0 for j = 1 : i -1 sum = sum + A(i,j) * A(k,j) end A(k,i) = (A(k,i) - sum) / A(i,i) end % evaluate diagonal term s=0 for j = 1 : k-1 sum = sum + (A(k,j))^2 end A(k,k) = sqrt(A(k,k) - sum) end Figure 4.2: Pseudo-code for Cholesky factorization. 4.6 Computing the Inverse of a Matrix We open this discussion by first saying that one should never compute the inverse of a matrix directly. We will elaborate on this point later, but suffice it to say for the time being, it is much more efficient to compute the matrix decomposition 4.6. COMPUTING THE INVERSE OF A MATRIX 59 and use forward and backward subsitution to solve a linear system. That said, the inverse of a matrix is a useful theoretical tool that we use often. 
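A quick experiment makes this advice concrete. The sketch below is only a demonstration on a random test matrix of our own choosing: it compares MATLAB's backslash operator, which factors the matrix and substitutes, against explicitly forming the inverse. The inverse route is typically slower and no more accurate.

% Solve Ax = b by decomposition (backslash) versus by forming inv(A).
n = 2000;
A = rand(n) + n*eye(n);          % well-conditioned test matrix (assumption)
b = rand(n,1);
tic; x1 = A\b;       t1 = toc;   % factorization + forward/backward substitution
tic; x2 = inv(A)*b;  t2 = toc;   % explicit inverse (wasteful)
fprintf('backslash: %.3f s   inv(A)*b: %.3f s\n', t1, t2);
fprintf('residual norms: %.2e vs %.2e\n', norm(A*x1 - b), norm(A*x2 - b));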
To compute the inverse of a matrix A, we use LU decomposition, but for a sequence of right-hand-sides, bi : Axi = bi (4.19) We choose bi as each of the unit vectors: b1 = [1, 0, 0, . . . , 0]T , b2 = [0, 1, 0, 0, . . . , 0]T , b3 = [0, 0, 1, 0, . . . , 0]T , etc. Once we solve for each xi , we combine each of the column vectors into a matrix; this matrix is A−1 . That is A−1 = [x1 , x2 , . . . , xn ] This corresponds to nothing else than the definition of the inverse: AA−1 = I AX = B Now, if we apply the LU decomposition algorithm LU X = I ⇒ U X = Y, LY = I Notice now that forward backard substitutions must be done on matrix right-handsides, not vectors. This means that forward and backward substitution become O(n3 ) algorithms, from O(n2 ). This means that forward and backward substitution become asymptotically as expensive as the matrix decomposition! This is why one should never, except in the most trivial of circumstances (e.g. diagonal matrices), solve a linear system by computing the inverse matrix. Example 4.6.1. Computing the Inverse of a Matrix Consider the matrix 1 −1 2 A = −2 1 1 −1 2 1 The LU decomposition for this matrix 1 0 L = −2 1 −1 −1 First, we solve LY = I for the 1 0 −2 1 −1 −1 is 0 1 −1 2 0 , U = 0 −1 5 1 0 0 8 matrix Y . 0 y11 y12 y13 1 0 0 0 y21 y22 y23 = 0 1 0 1 y31 y32 y33 0 0 1 We do the forward substitution one column at a time. 1 0 0 y11 1 −2 1 0 y21 = 0 −1 −1 1 y31 0 60 CHAPTER 4. NUMERICAL SOLUTION OF LINEAR SYSTEMS This gives y11 = 1, y21 = 2, and y31 = column: 1 0 −2 1 −1 −1 3. We repeat this process again for the next 0 y12 0 0 y22 = 1 1 y32 0 This gives y12 = 0, y22 = 1, and y32 = 1. Repeating again for the third column gives y13 = y23 = 0 and y33 = 1. Now, having computed Y , we solve U X = Y : 1 −1 2 x11 x12 x13 1 0 0 0 −1 5 x21 x22 x23 = 2 1 0 0 0 8 x31 x32 x33 3 1 1 Proceeding before, one column at a time, we can compute each column of X. When completed, the matrix X will be exactly A−1 : 1/8 −5/8 3/8 X = A−1 = −1/8 −3/8 5/8 3/8 1/8 1/8 Verify that AA−1 = I. 4.7 Practice Problems Work the following problems in Chapra [1]: • 8.3 • 8.10 • 9.6 • 10.5 • 10.8 Chapter 5 Numerical Error and Conditioning To this point, we have mainly focused on analytical solutions. The previous chapter was our first foray into numerical methods, but the focus was on turning our “by-hand” algorithm into Matlab code that automated the solution procedure. Nevertheless, for many engineering problems, it is simply not possible to compute analytical solutions and we must rely on numerical approximations. While these approximations can be very accurate, they are nonetheless approximations. Therefore, it is important for us to understand the errors involved in our approximations and to judge whether the level of error is acceptable for studying engineering systems. First, we will overview a potpourri of topics related to error considerations. Then, we will discuss the how real numbers are represented digitally in the computer. Before studying the impact of floating point representation on the solution of linear systems, we will need to review the concept of norms as they relate to vectors and matrices. 5.1 Error Considerations When discussing the various errors encountered our study, we discuss the accuracy of algorithms, methods, etc. as well as the precision of the computer that we use, etc. Accuracy relates to “how closely our computed values match the true values”. Precision, on the other hand, relates to “how closely computed values agree”. 
Figure 5.1 illustrates this concept with targets. The more accurate points on the target are closer to the bullseye, while the more precise points are clustered closer to together. Notice that we can have more precision, but less accuracy. Errors arise from many different sources, such as approximate algorithms, modeling approximations, and the numerical computations that take place in the computer. For any particular scalar quantity, we can think of the error as a “shift” between the “truth” (whatever that is) and our computed value: x∗ = x + e (5.1) where x∗ is the true value, x is our computed value, and e is the error. So, if we know x∗ , then the absolute error is e = x∗ − x 61 (5.2) 62 Accuracy vs. Precision CHAPTER 5. NUMERICAL ERROR AND CONDITIONING Figure 5.1: Graphical depiction of accuracy and precision concepts. Taken from [1]. and the relative error is x∗ − x x∗ We also can discuss relative error in terms of percent relative error: erel = erel % = erel × 100% (5.3) (5.4) Of course, we never can compute the exact error — if we could, we would know the exact solution! We are reduced to trying to estimate the error. The estimates used depend on the context; they include comparing “old” and “new” values, looking at the remainder of a Taylor series truncation, looking at the residual of our equations, and using higher fidelity algorithms to gain insight into the error. Another important concept that relates to reporting the precision of our answer, or its error, is significant figures. The number of significant figures indicates precision. Significant digits of a number are those that can be used with confidence, e.g., the number of certain digits plus one estimated digit. For example, if you read the speedometer in your car (and assuming it is a traditional analog needle and not digital), you wouldn’t purport to be going “48.958” miles-per-hour because you cannot accurately read the gage to that many digits. When reporting values, it is understood that leading zeros are not significant figures since these are eliminated by “moving the decimal point” in scientific notation, whereas trailing zeros are significant. For example, the numbers 0.00001753, 0.0001753, and 0.001753 all have 4 significant figures whereas 5.38 × 104 , 5.380 × 104 , and 5.3800 × 104 have 3, 4, and 5 significant digits, respectively. We must be careful to not report false significant figures. For example, if we type 3.25/1.96 into Matlab, we will get back 1.65816326530162. But we will report either 1.65 (chopping) or 1.66 (rounding). This is because we do not know what lies 5.2. FLOATING POINT REPRESENTATION 63 beyond the second decimal place! Consider the following example. If we change the third (unknown) digit and use chopping, we get 3.259/1.960 = 1.66275510204082... 3.250/1.969 = 1.65058405281869... Similarly, if we using rounding and change the third (unknown) digit, we get 3.254/1.955 = 1.66445012787724... 3.250/1.969 = 1.65058405281869... We see that we can easily end up with different decimal values in the second decimal place! So, we only report 3 significant figures: the first two we are confident and the third is uncertain. 5.2 Floating Point Representation Once source of error that is always present when performing computations using a computer is the limitation of the computer to only be able to store a finite number of digits. Because of this limitation, arithmetic operations will always have round-off error present. 
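A one-line MATLAB experiment shows this round-off error at work:

% 0.1, 0.2 and 0.3 have no exact binary representation, so the computed
% sum 0.1 + 0.2 differs from 0.3 in the last few bits.
0.1 + 0.2 == 0.3          % returns logical 0 (false)
(0.1 + 0.2) - 0.3         % returns roughly 5.55e-17, not zero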
To understand the source of round-off error, we need to study how the computer stores numbers. First, however, we need to remind ourselves of the basic number systems. By now, all of us are very comfortable using a base 10 (decimal) number system. However, because the memory in a computer is effectively just switches (on or off), the computer works in binary number systems. Figure 5.2 illustrates how we represent the number 86, 409 in a decimal system vs. representing the number 173 in a binary number system. Number Systems Base 10 (Decimal) Base 2 (Binary) Figure 5.2: Illustration of decimal and binary number systems. Taken from [1]. 64 CHAPTER 5. NUMERICAL ERROR AND CONDITIONING Next, we focus on the storage of integers because integers are simpler than real-valued numbers so we will start there. Integer$Representa2ons$ Figure 5.3 illustrates thethemain idea storing an the integer • Use first bit of afor word to indicate sign in memory in a com0: positive (off), in 1: negative (on)is a zero or one, with the first bit puter. Namely, each bit–(on-off switch memory) • Remaining bitsstoring are used the to store associated with the integer used for signa number of the integer. + 1 0 1 0 0 1 0 1 1 0 ! %"""""" "$""""""" # Sign Number Figure 5.3: Illustration of integer storage in computer memory. When discussing the size of the integers, we refer to how many bits are used to store the integer. For example, if we consider an 8-bit integer, then we have 1 bit for the sign and 7 bits to represent the numeral part of the integer. Figure 5.4 illustrates such an 8-bit integer. Integer$Representa2ons$ 8-bit word ± 26 2 5 2 4 2 3 2 2 2 1 20 $#" $!!!!!!#!!!!!!" Sign Number = 0000000 # smallest number Figure 5.4: Illustration of integer storage in0base10 computer memory. base2 = " !largest number = 1111111base2 = 127 base10 • +/- 0000000 are the same, therefore we may use Notice, in particular, -0that there are to represent -128bounds to the numbers we can represent: Total numbers 28 = 256 (-128 ∼127) the smallest number •is 0000000 in =base 2, which is 0 in base 10, while the largest number is 1111111 in base 2, which is 127 in base 10. Additionally, because “0” is mathematically the same as “+0”, we can use “-0” to represent -128. So therefore, with an 8-bit integer, we can represent the numbers -128 to 127. Anything outside of this range is an overflow (larger than the max) or an underflow error (smaller than the min): we “flow under/over” the boundaries of memory we have to represent the number. Of course, we need numbers much larger than 128. More commonly used integers are 32-bit and 64-bit versions. With 32-bit integers, we can stores numbers up to 231 = 2, 147, 483, 648, while with 64-bit integers, we can store 263 = 9.22337203685 × 1018 . Note that this explains why one could not have more than 2 Gigabytes of memory in a computer that possessed only 32-bit hardware and/or a 32-bit operating system: the memory could not be addressed! The storage of real-valued numbers is similar. The format that is used is referred to as “floating point representation”, alluding to the fact that scientific notation is always used and the decimal “floats” to accommodate the normalization of the number. In particular, there is a bit to track the sign of the number, then a block for the signed exponent, and then the “mantissa”, the significant figures of the number. Figure 5.5 illustrates generic floating point representation for an arbitrary base B number system. 5.3. 
REVIEW OF VECTOR AND MATRIX NORMS 65 Floa2ng$Point$Representa2on$ e m $!!# !! " $!!# !!" $ ±$ ± e 1 e 2 % e m d 1 d 2 d 3 % d p sign of number signed exponent mantissa N = ± .d 1 d 2 d 3 !d p B e = mB e • m: mantissa Figure 5.5: Illustration of Base storage of floating numbers in computer memory. • B: of the numberpoint system • e: signed exponent • Note: the mantissa is usually normalized if Of course, in the computer, the number used is always binary (base 2). the leading digit is system zero There are two primary types used for floating point numbers in the computer: “single precision” (32-bit) and “double precision” (64-bit). In scientific computing, double precision is the norm; in particular, by default, all floating point numbers in Matlab are double precision. Of the 32-bits allocated for single precision numbers, 1 bit is for the sign, 8 bits are for the signed exponent, and 23 bits are for the digits. Thus, for single precision numbers, the smallest number that can be stored is ≈ ±1.17549 × 10−38 and the largest is ≈ ±3.40282 × 1038 . Analogously, for the 64-bits in a double precision number, 1 bit is for the sign, 11 bits are for the signed exponent, and 52 bits are for the digits. The smallest double precision number is ≈ ±2.2251 × 10−308 while the largest is ≈ ±1.7977 × 10308 . In addition to the limit on the magnitude of the numbers that can be stored, there’s also a limit on the difference that can be stored during a floating point operation. In particular, we only have the width of the mantissa available for the digits, so any differences that exceed that width will be truncated. In particular, if we are adding two numbers whose difference exceeds the width of the mantissa, then the addition will be truncated completely. Example 5.2.1. Double Precision Truncation The width of the mantissa in double precision is 52 bits, so we expect differences greater than of the order of 252 ≈ 4.5036 × 1015 to be truncated. In fact, the number 2−52 ≈ 2.22 × 10−16 is called the “machine epsilon” and is the variable eps in Matlab. So, if we execute the command 1 + eps/2 in Matlab, we see that we get back exactly the value of 1, i.e the eps/2 factor was truncated. 5.3 Review of Vector and Matrix Norms Before examining the impact of round-off error due to floating point truncation, we first need to recall the definition of vector and matrix norms and some of their properties. Given vectors u and v, their dot product is u · v = u1 v1 + u2 v2 + · · · + un vn = n X ui vi (5.5) i=1 The length of the vector u is written in terms of its dot product: √ kuk2 = u · u (5.6) 66 CHAPTER 5. NUMERICAL ERROR AND CONDITIONING In fact, the length of the vector is one type of vector norm. Norms, in general, are used to measure the size of mathematical objects. This language allows us to study of the behavior of more abstract mathematical entities. In general, norms are just mathematical operators that take the object and return a number. So, vector norms take vectors and return a number: kuk : u ∈ Rn → R (5.7) The Euclidean norm in Equation (5.6) is one example. Other examples are the general p-norms: ! p1 n X (5.8) kukp = |ui |p i=1 and the “infinity” norm: kuk∞ = max |ui | i (5.9) The idea of measuring the size of objects is quite general. We can apply it to matrices as well. 
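Before moving on to matrices, note that these vector norms are available directly through MATLAB's norm function, for example:

% Vector norms for a small example vector.
u = [3; -4; 12];
norm(u, 2)       % Euclidean (2-)norm: sqrt(3^2 + 4^2 + 12^2) = 13
norm(u, 1)       % 1-norm: 3 + 4 + 12 = 19
norm(u, Inf)     % infinity norm: max |u_i| = 12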
So, in the case of matrices, the norm operator takes a matrix and returns a number: kAk : A ∈ Rm × Rn → R (5.10) These norms are “induced” by the norms of vectors; their formal development is beyond the scope of this course, but we provide a few examples. The one norm of a matrix corresponds to the “column sum”: kAk1 = max n X j |aij | (5.11) i=1 while the infinity norm of a matrix is the “row sum”: kAk∞ = max i n X |aij | (5.12) j=1 The Frobenius norm is what you might’ve guessed as the two norm: v uX n u n X t a2ij kAkF = (5.13) i=1 j=1 For square matrices, it turns out that the two norm is the square root of the maximum eigenvalue of AT A: q kAk2 = max λi (AT A) (5.14) i In this case, kAk2 is sometimes called the spectral norm. All norms enjoy the following properties, by definition: kuk > 0 if u 6= 0 kαuk = |α|kuk, α ∈ R ku + vk ≤ kuk + kvk (5.15) (5.16) (5.17) 5.4. CONDITIONING OF LINEAR SYSTEMS 67 The last property is also referred to as the triangle inequality. There are also two other important properties that vector and matrix norms satisfy: kABk ≤ kAkkBk kAxk ≤ kAkkxk 5.4 (5.18) (5.19) Conditioning of Linear Systems Armed with the previously discussed notions of norms, we are now ready to study the effect of truncation error in the solution of linear systems. Supposed we are interested in solving the system Ax = b. Suppose now that as we construct b, we have accumulated errors due to floating point truncation: b + ∆b. Here, b is the exact vector and ∆b is error we incurred. This error will then induce error in our solution, namely x + ∆x. So the system we are really solving is A(x + ∆x) = (b + ∆b) (5.20) Expanding terms and using the fact that the exact problem, Ax = b is still satisfied, Ax + A∆x = b + ∆b A∆x = ∆b Then, multiplying both sides by A−1 and taking norms, we have ∆x = A−1 ∆b ⇒ k∆xk = kA−1 ∆bk ⇒ k∆xk ≤ kA−1 kk∆bk If we now examine our exact equation, Ax = b ⇒ kAxk = kbk ⇒ kbk ≤ kAkkxk 1 1 ≤ kAk ⇒ kxk kbk Putting together the relationships for k∆xk and kxk, we have k∆xk k∆bk ≤ kAkkA−1 k kxk kbk (5.21) What this equation is saying is that our output error, k∆xk/kxk, is our input error, k∆bk/kbk, magnified by the quantity kAkkA−1 k. The quantity kAkkA−1 k is called the condition number and is written at the symbol κ(A). So if our input error is due to floating point truncation, say k∆bk/kbk ≈ 10−15 , and if our condition number 68 CHAPTER 5. NUMERICAL ERROR AND CONDITIONING is ≈ 105 , then our output error is going to be k∆xk/kxk ≈ 10−10 . We lost five digits just due to the condition number! In general, we also have errors in our matrix A. A similar calculation to that above yields k∆xk k∆Ak k∆bk ≤ κ(A) + (5.22) kxk kAk kbk So our output error is roughly the sum of our input errors, but then magnified by the condition number. So the higher the condition number, the more error in our output. Linear systems that possess a large condition number are said to be “ill-conditioned” or “poorly conditioned” systems. The solution of such systems will possibly require more care in order to obtain the necessary accuracy. Example 5.4.1. Hilbert Matrix One notoriously ill-conditioned matrix is the Hilbert matrix: 1 1 . . . 1 21 3 n 1 1 1 1 . . . n+1 3 4 2 1 1 1 1 . . . A = 3 4 5 n+2 .. .. .. .. . . . . . . . 1 1 1 1 . . . 2n−1 n n+1 n+2 (5.23) In Matlab, the hilb command will generate a Hilbert matrix for the input dimension. Additionally, the condition number of a matrix can be computed using the cond command. 
Thus, the Matlab command cond(hilb(5)) will compute the condition number of a 5 × 5 Hilbert matrix. Matlab reports the condition number as 4.7661e+05. 5.5 Practice Problems Work the following problems in Chapra [1]: • 11.9 Chapter 6 Numerical Differentiation We have so far spoken of solving differential equations where it was possible to find a function that satisfied the differential equation and initial/ boundary conditions by “guessing” a function – inserting it in the equation and solving for appropriate constants. However, this approach fails in almost all real problems where the geometry is complex and/or the functions needed are not simple. When no exact solution is possible the best we can do is to obtain approximate solutions. A first step to this end is creating approximations of the derivatives that are in the differential equations. 6.1 Approximating Derivatives Most approaches to approximate derivatives fall into one of the following categories 1. Finite differences 2. Fitting a curve/surface to the desired function and using its slope as the derivative. 6.1.1 What Are Finite Differences? Finite differences build on the definition of the derivative of a function f (x) as the “rate of change of f (x) with respect to x”. Thus, the derivative will be the ratio of the “variation in f (x) and the corresponding variation in x”. For some xi let f (xi + ∆x) − f (xi ) df (xi ) = lim ∆x→0 dx ∆x Instead of passing to the limit where ∆x is infinitesimal we can simply approximate the derivative using a finite value of ∆x as df f (xi + ∆x) − f (xi ) (xi ) ≈ = fˆ0 (xi ) dx ∆x (6.1) Thus, finite means not infinite and not infinitesimal, in other words non-zero. Depending on the choice of ∆x we will get more or less error. As we make ∆x smaller 69 70 CHAPTER 6. NUMERICAL DIFFERENTIATION xi)4# xi)3# xi)2# xi)1# xi# xi+1# xi+2# xi+3# xi+4# Figure 6.1: Discretization for finite difference approximation of derivative at xi . df and smaller we will recover dx but then the number of computations may become unaffordable. The question then is how to pick a variation of f (x) and a variation in x, ∆x so it minimizes error but is still easy to compute. 6.1.2 Taylor Series and Approximate Derivative Finite difference approximations of derivatives with a systematic definition of both the variation in f (x) and ∆x can be generated by combining Taylor series expansions. Recall the definition of Taylor series that any differentiable function f (x) can be estimated in the neighborhood of a point xi as f (x) = f (xi ) + 1 d2 f 1 d3 f df 2 (xi )(x − xi ) + (x )(x − x ) + (xi )(x − xi )3 ... i i dx 2 dx2 6 dx3 (6.2) Writing x − xi = ∆x or x = xi + ∆x yields f (xi + ∆x) − f (xi ) = df 1 d2 f 1 d3 f 2 (xi )∆x + (x )∆x + (xi )∆x3 .... i dx 2 dx2 6 dx3 Dividing by ∆x and rearranging we get an expression for the error in fˆ0 (xi ) df f (xi + ∆x) − f (xi ) df 1 d2 f 1 d3 f fˆ0 (xi ) − (xi ) = − (xi ) = (x )∆x + (xi )∆x2 + ... i dx ∆x dx 2 dx2 6 dx3 The first term in the error depends on ∆x, the second term depends on ∆x2 , ... Thus for ∆x << 1 the first term will dominate. In this case, since the leading order term in the truncation error is O(∆x), we say that this is a first order accurate approximation. 
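A minimal numerical check of first order accuracy: halving Delta x should roughly halve the error in the forward-difference estimate of the derivative. The test function f(x) = exp(x) and the point x = 1 below are arbitrary choices.

% Forward difference of Eq. (6.1) applied to f(x) = exp(x) at x = 1.
f = @(x) exp(x);   dexact = exp(1);          % exact derivative f'(1) = e
for dx = [0.1 0.05 0.025 0.0125]
    fd = (f(1 + dx) - f(1))/dx;              % forward difference
    fprintf('dx = %7.4f   error = %10.3e\n', dx, abs(fd - dexact));
end
% Each halving of dx cuts the error roughly in half, consistent with O(dx).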
6.1.3 Taylor Series and Finite Differences Let us now consider a systematic use of the Taylor series approximation to control the error in the approximation of the derivative – for example if we can devise a scheme to construct fˆ0 (xi ) so that the leading term in the error above is 0 then clearly the error will go as ∆x2 . In this case, we say the error is second order accurate since the exponent on ∆x is 2. Consider, the line in Figure 6.1 with a set of points at equal intervals {..., xi−4 , xi−3 , xi−2 , xi−1 , xi , xi+1 , xi+2 , xi+3 , xi+4 ...} 6.1. APPROXIMATING DERIVATIVES 71 Now let us express the values f (xi±∗ ) in terms of a Taylor series expansion about xi : ... = ... df (xi )(xi−2 − xi ) + dx df f (xi ) + (xi )(xi−1 − xi ) + dx f (xi ) df f (xi ) + (xi )(xi+1 − xi ) + dx df f (xi ) + (xi )(xi+2 − xi ) + dx ... f (xi−2 ) = f (xi ) + f (xi−1 ) = f (xi ) = f (xi+1 ) = f (xi+2 ) = ... = 1 d2 f (xi )(xi−2 − xi )2 + 2 2 dx 1 d2 f (xi )(xi−1 − xi )2 + 2 dx2 1 d3 f (xi )(xi−2 − xi )3 ... 3 6 dx 1 d3 f (xi )(xi−1 − xi )3 ... 6 dx3 1 d2 f (xi )(xi+1 − xi )2 + 2 dx2 1 d2 f (xi )(xi+2 − xi )2 + 2 dx2 1 d3 f (xi )(xi+1 − xi )3 ... 6 dx3 1 d3 f (xi )(xi+2 − xi )3 ... 6 dx3 Rearranging ... ... f (xi−2 ) 1 f (xi−1 ) 1 f (xi ) = 1 f (xi+1 ) 1 f (xi+2 ) 1 ... ... ... ... f (xi ) 1 n (x − x ) ... df (x ) i−2 i n! i dx 1 (xi−1 − xi )n ... d2 f n! 2 (xi ) dx 0... d3 f 1 n dx3 (xi ) (x − x ) ... i+1 i n! ... 1 n (xi+2 − xi ) ... dn f n! dxn (xi ) ... ... (xi−2 − xi ) (xi−1 − xi ) 0 (xi+1 − xi ) (xi+2 − xi ) 1 (xi−2 2 1 (xi−1 2 − xi ) 2 − xi ) 2 0 1 (xi+1 − xi )2 2 1 (xi+2 − xi )2 2 1 (xi−2 6 1 (xi−1 6 − xi ) 3 − xi ) 3 0 1 (xi−1 − xi )3 6 1 (xi+2 − xi )3 6 ... ... ... ... ... Note that a simpler form is possible if the spacing of the points is uniform i.e |xi−4 − xi | = 4∆x, |xi−3 − xi | = 3∆x, |xi−2 − xi | = 2∆x, |xi−1 − xi | = ∆x, |xi+1 − xi | = ∆x, |xi+2 − xi | = 2∆x, .... {f } = [D]{df } {df } = [D]−1 {f } (6.3) Let us consider the implication of (6.3). If we know the values of the function at a set of points f (xi±∗ ) then we are able to exactly compute different orders of derivative di f (xi ) at xi . Thus, our original goal of solving a differential equation involving dxi 2 terms that look like ddxf2 (xi ) etc. can be accomplished by replacing the derivative terms using suitable expression from (6.3). In reality, of course, we cannot afford to compute the whole Taylor series but will truncate it after a few terms. This implies that our choice of truncation point will define the approximation error e.g. if we truncate after 3 terms, the error will be 3 dominated by 61 ddxf3 (xi )∆x3 . Thus, assuming no error in f (xi±∗ ), the error is set by the lowest higher order derivative that we do not include and the appropriate power of ∆x. This also allows us to estimate the first few derivatives using only a few of the df equations from (6.3). (e.g. dx which is 2 unknowns including f (xi ) needs only 2 72 CHAPTER 6. NUMERICAL DIFFERENTIATION 2 df d f equations from (6.3) or dx , dx2 which is 3 unknowns only – this can be done by choosing 3 equations from (6.3).) f (xi ) f (xi ) 1 0 = df f (xi+1 ) 1 (xi+1 − xi ) dx (xi ) Or setting (xi+1 − xi ) = ∆x, −1 f (xi ) 1 0 1 = = −1 df 1 ∆x (xi ) ∆x dx 0 1 ∆x f (xi ) = f (xi+1 ) which leads to f (xi+1 ) − f (xi ) df = dx ∆x with a truncation error of O(∆x). 
f (xi ) f (xi−2 ) 1 (xi−2 − xi ) 21 (xi−2 − xi )2 df f (xi−1 ) = 1 (xi−1 − xi ) 1 (xi−1 − xi )2 dx (xi ) 2 d2 f f (xi ) 1 0 0 (xi ) dx2 −1 f (xi ) f (xi−2 ) 1 (xi−2 − xi ) 12 (xi−2 − xi )2 df dx (xi ) = 1 (xi−1 − xi ) 12 (xi−1 − xi )2 f (xi−1 ) 2 d f f (xi ) 1 0 0 (xi ) dx2 (6.4) Or rewriting assuming uniform discretization −1 f (xi ) 1 −2∆x 12 (−2∆x)2 f (xi−2 ) df dx (xi ) = 1 (−∆x) 21 (−∆x)2 f (xi−1 ) 2 d f f (xi ) 1 0 0 (xi ) dx2 Inverting and solving leads to df −f (xi−2 ) + 4f (xi−1 ) − 3f (xi ) (xi ) = dx 2∆x (6.5) with a truncation error of O(∆x2 ). d2 f −f (xi−2 ) + 2f (xi−1 ) − f (xi ) (xi ) = 2 dx ∆x2 (6.6) with a truncation error of O(∆x). This is of course by no means a unique choice – we could have chosen −1 f (xi ) 1 0 0 f (xi ) df dx (xi ) = 1 (xi+1 − xi ) 12 (xi+1 − xi )2 f (xi+1 ) 2 d f 1 (xi+2 − xi ) 12 (xi−2 − xi )2 f (xi+2 ) (xi ) dx2 to get df −f (xi+2 ) + 4f (xi+1 ) − 3f (xi ) (xi ) = dx 2∆x (6.7) 6.1. APPROXIMATING DERIVATIVES 73 with a truncation error of O(∆x2 ), or, d2 f −f (xi+2 ) + 2f (xi+1 ) − f (xi ) (x ) = i dx2 ∆x2 (6.8) with a truncation error of O(∆x). Two things stand out • the formulae (6.4), (6.7), (6.8) involve values f (xi+k ), k > 0 while (6.5), (6.6) involve values f (xi+k ), k < 0. The first category k > 0 are called forward differences while the second category k < 0 are called backward differences. df but (6.4) uses only two points • Formulae (6.4), (6.5) both approximate dx f (xi ) and f (xi+1 ) and has a truncation error of ∆x while (6.5) uses 3 points f (xi−2 ), f (xi−1 ), f (xi ) and gets a truncation error of ∆x2 . Note also that (6.7) achieves ∆x2 with a different set of points. What about not picking either k > 0 or k < 0 exclusively? This can indeed lead to many possible difference formulae. One of which is special – central finite differences using evenly spaced x are a bit special because there is natural cancellation of alternating terms in the Taylor series. The order of accuracy is one degree higher than normal. f (xi+1 ) − f (xi−1 ) df = (6.9) dx ∆x with a truncation error of O(∆x2 ). Example 6.1 Let f (x) = sin(x). Consider a discretization ..., xi−2 = π/2−0.1, xi−1 = π/2−0.05, xi = π/2, xi+1 = π/2+0.05, xi+2 = π/2+0.1... now let us estimate df (π/2). dx Using (6.4) df f (xi+1 ) − f (xi ) sin(π/2 + 0.05) − sin(π/2) 0.99875 − 1 = = = = −0.025 dx ∆x 0.05 0.05 The exact answer is cos(π/2) = 0. Note the error is comparable to ∆x at 0.05. Using (6.7) df −f (xi+2 )+4f (xi+1 )−3f (xi ) (xi ) = 2∆x dx − sin(π/2−0.1)+4. sin(π/2+0.05)−3. sin(π/2) = 0.05×2 = 3.1237 × 10−05 << ∆x2 = 0.0025 (6.10) (6.11) (6.12) Even more impressive on this problem the central difference calculation is exact because of canceling errors! The MATLAB diff function is a fast way to compute differences df=diff(f ); gives the same result as i=1:length(f )-1; df=f(i+1)-f(i); 74 CHAPTER 6. NUMERICAL DIFFERENTIATION Figure 6.2: Table of Forward Difference formulae. Taken from Chapra [1]. Figure 6.3: Table of Central Difference formulae. Taken from Chapra [1]. 6.2. HIGHER DIMENSIONS AND PARTIAL DERIVATIVES 75 Figure 6.4: Table of Backward Difference formulae. Taken from Chapra [1]. 6.1.4 What if there is error in evaluation of f (x)? This is commonly encountered when processing empirical data. In that case, there is no underlying function that we can readily evaluate, we only have data points. And, of course, those points will have error in them. Further, we have no mechanism to shrink the spacing without gathering new data (which can be very expensive). 
So we are stuck using the points we have. Unfortunately, finite differences do not handle measurement error very well. In particular, derivatives tend to get “noisier” as we take more and more derivatives. Fig. 6.5 shows the effect of error in the evaluation. If we are indeed in such a situation, the best thing to do is use the formulae that possess more points. This can both increase the accuracy of the derivative as well as help “smooth out” the noise. 6.2 Higher Dimensions and Partial Derivatives With uniform grids it is “relatively straight forward” to combine one-dimensional finite difference rules by a “tensor” product. It is just as easy, perhaps more so, to generate multidimensional finite differences by combining multidimensional Taylor series expansions. This is no more difficult to do for non-uniform grids. In two dimensions the Taylor series will be: ∂f 1 ∂ 2f ∂f (x0 , y0 )(x − x0 ) + (x0 , y0 )(y − y0 ) + (x0 , y0 )(x − x0 )2 + ∂x ∂y 2 ∂x2 1 ∂ 2f 1 ∂ 2f 2 (x , y )(y − y ) + (x0 , y0 )(x − x0 )(y − y0 ) + · · · 0 0 0 2 ∂y 2 2 ∂x∂y f (x, y) =f (x0 , y0 ) + For the first order partial derivatives, all the single variable derivative formulae apply in the direction in which the derivative is acting. Similarly for the second 76 CHAPTER 6. NUMERICAL DIFFERENTIATION Figure 6.5: On the left, we have the “exact” function. On the right, we add just a small amount of error in the function. When we use finite differences to estimate the derivative, the error is greatly exaggerated. 6.3. PRACTICE PROBLEMS 77 derivatives, except for the mixed partial term. For the mixed partial term, we can apply the finite difference formulae one-at-a-time. Namely, ∂ 2f ∂ ∂f = ∂x∂y ∂x ∂y Then, using a central difference approximation in the y-direction: ∂f f (x0 , y0 + ∆y) − f (x0 , y0 − ∆y) (x0 , y0 ) ≈ ∂y 2∆y Now, we can apply a central difference approximation to in the x-direction to the previous expression: ∂ ∂f (x0 , y0 ) ≈ ∂x ∂y f (x0 +∆x,y0 +∆y)−f (x0 +∆x,y0 −∆y) 2∆y − f (x0 −∆x,y0 +∆y)−f (x0 −∆x,y0 −∆y) 2∆y 2∆x f (x0 + ∆x, y0 + ∆y) − f (x0 + ∆x, y0 − ∆y) = + 4∆x∆y −f (x0 − ∆x, y0 + ∆y) + f (x0 − ∆x, y0 − ∆y) 4∆x∆y Example 6.2.1. Numerical Gradient in Matlab See Chapra [1, Example 21.8, pg. 538] 6.3 Practice Problems Work the following problems in Chapra [1]: • 21.3 • 21.12 • 21.14 78 CHAPTER 6. NUMERICAL DIFFERENTIATION Chapter 7 Solution of Initial Value Problems Solving initial value problems requires us to start with a given ordinary differential equation and initial condition and march forward in time. Finite differences of the types we just learned are very useful for that. 7.1 A Simple Illustration Consider the simple ODE for y(t) used to find y(1) given dy + 2y = 0 dt with initial conditions y(0) = 1 First let us find the exact solutions for comparison. Analytical solutions of these are of the form y = aeλt + b. Differentiating with respect to time, t, we get dy = λaeλt dt y(0) = 1 ae−2t = 1 ⇒a=1 y(t) = e−2t y(1) = e−2×1 = 0.13533 Now let us solve it with finite differences. Replace dy by the first order forward dt difference formula yi+1∆t−yi , with yi = y(ti ), etc. and ti+1 = ti + ∆t: λaeλt + 2(aeλt + b) = 0 aeλt (λ + 2) + 2b = 0 ⇒ 2b = 0 λ = −2. yi+1 − yi + 2yi = 0 ∆t yi+1 − yi = −2yi ∆t yi+1 = yi − 2yi ∆t 79 (7.1) 80 CHAPTER 7. 
SOLUTION OF INITIAL VALUE PROBLEMS Since we know y(0) = 1 we can start the iteration y0 = y(0) = 1 and after choosing a ∆t (say ∆t = 0.25) we have t1 = 0 + ∆t = 0.25 t2 = t1 + ∆t = 0.50 t3 = t2 + ∆t = 0.75 t1 = t3 + ∆t = 1.00 → → → → y1 y2 y3 y4 = y0 − 2y0 ∆t = 1 − 2 × 1 × 0.25 = 0.5 = y1 − 2y1 ∆t = 0.5 − 2. × 0.5 × 0.25 = 0.25 = y2 − 2y2 ∆t = 0.25 − 2. × 0.25 × 0.25 = 0.125 = y3 − 2y3 ∆t = 0.125 − 2 × 0.125 × 0.25 = 0.0625 Clearly y(1) = 0.0625 is a somewhat poor approximation of y(1) = 0.13533! Can we do better? If ∆t → 0 then the difference formula reduces to the derivative. Thus, let us try with a smaller ∆t e.g. ∆t = 0.1. t1 = 0 + ∆t = 0.1 t2 = t1 + ∆t = 0.2 ... t10 = t9 + ∆t = 1.0 → → ... → y1 = y0 − 2y0 ∆t = 1 − 2 × 1 × 0.1 = 0.8 y2 = y1 − 2y1 ∆t = 0.8 − 2 × 0.8 × 0.1 = 0.64 ... y10 = y9 − 2y9 ∆t = 0.107374 This value using a smaller ∆t = 0.1 is clearly more accurate. Note that in this formula yi+1 depends only on yi and we can compute. This method based on the forward difference is called the Forward Euler method. Let us now consider a backward difference formula dy yi − yi−1 = dt ∆t Applying to our problem leads to yi − yi−1 + 2yi = 0 ∆t yi − yi−1 = −2yi ∆t yi = yi−1 − 2yi ∆t yi+1 = yi − 2yi+1 ∆t (7.2) The new value is “implicitly” defined – unlike the forward difference. We can simplify – though if this was a system of ODEs we would have to solve systems of equations! yi+1 (1 + 2∆t) = yi yi yi+1 = 1 + 2∆t (7.3) Solving this again for ∆t = 0.25 leads to a y(1) = 0.197531. Reducing ∆t to 0.1 leads to y(1) = 0.161506; Further reducing ∆t = 0.01 gets y(1) = 0.138033. This method is called the Backward Euler method. Example 7.1.1. Euler’s Method See Chapra [1, Example 22.1, pg. 556]. Example 7.1.2. Euler’s Method for Systems of Equations See Chapra [1, Example 22.4, pg. 572] 7.2. STABILITY 7.2 81 Stability What is meant by a “stable” method? The finite difference integration at every increment introduces numerical round-off and other types of error at every update. Let us rework the previous example with a small perturbation and look for behavior at y(10). The exact answer is y = e−20 = 2.061154 × 10−09 . Since we know y(0) = 1 we can start the iteration y0 = y(0) = 1 and after choosing a ∆t (say ∆t = 0.25) and add a perturbation of 0.001 we have t1 = 0 + ∆t = 0.25 t2 = t1 + ∆t = 0.50 t3 = t2 + ∆t = 0.75 ... t40 = t39 + ∆t = 10.0 → → → ... → y1 = y0 − 2y0 ∆t = 1 − 2 × 1 × 0.25 = 0.501 y2 = y1 − 2y1 ∆t = 0.501 − 2. × 0.501 × 0.25 = 0.2505 y3 = y2 − 2y2 ∆t = 0.2505 − 2. × 0.25 × 0.2505 = 0.12525 ... y40 = y39 − 2y39 ∆t = y39 (1 − 2 × 0.25) = 9.094 × 10−13 The error induced by the perturbation is clearly decaying out quickly from 0.001 to 0.00025 in 2 steps. thus, for this calculation any perturbation (likely much smaller than the 0.001 ) will not affect the result. Let us rework the previous example with a small perturbation and a larger time step. Since we know y(0) = 1 we can start the iteration y0 = y(0) = 1 and after choosing a ∆t (say ∆t = 2.) we have t1 = 0 + ∆t = 2. t2 = t1 + ∆t = 4. t3 = t2 + ∆t = 6. ... → → → ... y1 = y0 − 2y0 ∆t = 1 − 2 × 1 × 2. = −3.001 y2 = y1 − 2y1 ∆t = −3.001 − 2. × (−3.001) × 2 = 9.003 y3 = y2 − 2y2 ∆t = 9.003 − 2. × 9.003 × 2 = −27.012 ... The simple question arises – will the perturbations decay leaving the solution uncorrupted or will they keep growing and ultimately completely overwhelm the solution. Conditions under which the latter happens is what we seek. 
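Before deriving the condition analytically, a quick MATLAB experiment (a minimal sketch using only the step sizes from the examples above) reproduces both behaviors for dy/dt = -2y: with ∆t = 0.25 the forward Euler iterates decay smoothly, as the exact solution does, while with ∆t = 2 they oscillate with growing magnitude.

% Forward Euler applied to dy/dt = -2*y, y(0) = 1, integrated out to t = 10
lambda = -2;
for dt = [0.25 2]                       % a stable and an unstable step size
    y = 1;
    for n = 1:round(10/dt)
        y = y + dt*lambda*y;            % y_{n+1} = (1 + dt*lambda)*y_n
    end
    fprintf('dt = %4.2f   y(10) = %g\n', dt, y)
end
% Exact value is exp(-20), about 2.06e-9; the dt = 2 run instead grows like (-3)^n.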
The answer lies in the eigenvalue problem: dy = λy; y(0) = y0 dx y(x) = y0 eλx First Order Forward Difference formula yi+1 = yi + ∆xλyi = (1 + ∆xλ)yi After n steps yn = (1 + ∆xλ)n y0 = An y0 where A is an amplification factor. For stability we need the |A| < 1 ⇒ |(1 + ∆xλ)| < 1 82 CHAPTER 7. SOLUTION OF INITIAL VALUE PROBLEMS 2 λ This idea of having a limited step size in order to preserve stability is inherent to all explicit methods. Now let’s repeat this procedure for the backward Euler method. In particular, λ < 0 ⇒ ∆x < yi+1 = yi + ∆xλyi+1 ⇒ yi+1 (1 − ∆xλ) = yi 1 yi ⇒ yi+1 = (1 − ∆xλ) Now, the amplification factor A = 1/(1 − ∆xλ). So, for λ < 0, there is no restriction on ∆x to preserve stability! It is unconditionally stable! That is not to say that you can take large step sizes and expect an accurate solution, but rather that the errors do not grow exponentially with each time step. Although not all implicit methods are unconditionally stable, generally implicit methods have much better stability properties compared to explicit methods. This increased stability comes at a cost, however. If we are evolving systems of differential equations, then implicit methods will require the solution of a, potentially nonlinear, system of equations. However, for problems with large time step restrictions, even the cost of the extra solves of linear or nonlinear systems can still leave the implicit method being the clear winner in terms of cost. Equations with such time step restrictions are said to be stiff. Chemical kinetics is one classical example of highly stiff systems of differential equations (arising from the large magnitudes of activation energies). Stiff systems often have very fast transients followed by a much slower mode in the solution. 7.3 Multistage Methods Any numerical method to solve the ODE dy = f (t, y) dt that has the form yn+1 = yn + ∆t p X bj kj (7.4) j=1 k1 = f (tn , yn ) ki = f (tn + ci−1 ∆t, yn + ∆t (7.5) i−1 X ai−1,j kj ), i = 2, . . . , n (7.6) j=1 belongs to the Runge-Kutta family of methods. The forward and backward Euler methods belong to the Runge-Kutta family. However, the most popular implementations of Runge-Kutta (RK) are “predictor-corrector” forms. The idea is that we use 7.3. MULTISTAGE METHODS 83 “intermediate stages” of the time step interval to “predict” the value and then use the slope at that point to “correct” the final result. The more predictions/corrections, the more accurate you can make the scheme. Indeed, this is how Runge-Kutta methods are derived in general: we have a particular accuracy we wish to achieve. We use Taylor expansions up to the order of accuracy we desire, and then match the terms in the Taylor expansion with the form in equation (7.4) to determine the unknown coefficients. Example 7.3.1. Heun’s Method See Chapra [1, Example 22.2, pg. 563] 7.3.1 First Order Explicit RK Methods In this case, we only are seeking first order accurate methods, so p = 1. That is, yi+1 = yi + ∆tb1 k1 k1 = f (ti , yi ) so that yi+1 = yi + ∆tb1 f (ti , yi ) Now we do a Taylor expansion of yi+1 about ti : yi+1 = yi + ẏ(ti )∆t ⇒ yi+1 = yi + f (ti , yi )∆t Comparing this Taylor expression to our Runge-Kutta step, we see that b1 = 1. This is just Forward Euler! 7.3.2 Second Order Explicit RK Methods We can repeat the same procedure for second order methods. 
Namely, yi+1 = yi + ∆t (b1 k1 + b2 k2 ) k1 = f (ti , yi ) k2 = f (ti + c1 ∆t, yi + a1,1 k1 ∆t) so that yi+1 = yi + ∆t (b1 f (ti , yi ) + b2 f (ti + c1 ∆t, yi + a1,1 k1 ∆t)) Now, if we do a Taylor expansion of yi+1 about ti : ∆t2 2 ∆t2 = yi + f (ti , yi )∆t + ÿ(ti ) 2 yi+1 = yi + ẏ(ti )∆t + ÿ(ti ) ⇒ yi+1 84 CHAPTER 7. SOLUTION OF INITIAL VALUE PROBLEMS To handle the ÿ term, we recognize that ÿ(ti ) = f˙(ti , yi ). Furthermore, y is implicitly a function of time, y = y(t), so we can use the chain rule to differentiate f with respect to time: ∂f dy ∂f + f˙(ti , yi ) = ∂t ∂y dt = ft + fy f Thus, our Taylor expansion of y is yi+1 = yi + f (ti , yi )∆t + (ft (ti , yi ) + fy (ti , yi )f (ti , yi )) ∆t2 2 (7.7) We can’t quite compare our Taylor series to our R-K step rule yet. Now, we need to expand the k2 term in a Taylor expansion about (ti , yi ) so that we may directly compare with our Taylor expansion in equation (7.7): f (ti + c1 ∆t, yi + a1,1 k1 ∆t) = f (ti , yi ) + c1 ∆tft (ti , yi ) + a1,1 k1 ∆tfy (ti , yi ) + · · · Now, substituting back into our R-K equation yi+1 = yi + ∆t (b1 f (ti , yi ) + b2 (f (ti , yi ) + c1 ∆tft (ti , yi ) + a1,1 k1 ∆tfy (ti , yi ))) = yi + ∆tb1 f (ti , yi ) + ∆tb2 f (ti , yi ) + ∆t2 c1 ft (ti , yi ) + ∆t2 a1,1 k1 fy (ti , yi ) = yi + ∆tf (ti , yi )(b1 + b2 ) + ∆t2 b2 c1 ft (ti , yi ) + ∆t2 b2 a1,1 f (ti , yi )fy (ti , yi ) ∆t2 = yi + f (ti , yi )∆t(b1 + b2 ) + (2b2 c1 ft (ti , yi ) + 2b2 a1,1 fy (ti , yi )f (ti , yi )) 2 Comparing the final equation to our Taylor expansion in equation (7.7), we get the following conditions for our coefficients: b1 + b2 = 1 2b2 c1 = 1 2b2 a1,1 = 1 We have 4 unknowns and only 3 equations to solve for them! This means there is an infinite number of 2nd order accurate RK methods, we can choose 1 coefficient and then determine the other three. The 3 most popular are • The Midpoint Method — b2 = 1 ⇒ b1 = 0, c1 = a1,1 = 1/2 • Ralston’s Method — b2 = 3/4 ⇒ b1 = 1/4, c1 = a1,1 = 2/3 • Heun’s Method — b2 = 1/2 ⇒ b1 = 1/2, c1 = a1,1 = 1 7.3. MULTISTAGE METHODS 85 The Midpoint Method To solve dy = f (y, t) dt y(0) = y0 we need a scheme to use yn ≡ y(tn ), f (yn , tn ), ∆t to find yn+1 ≈ y(tn+1 ). Let us start by describing the midpoint method. yn + f (tn , yn ) ∆t 2 yn+ 1 = 2 yn+1 = yn + f (tn+ 1 , yn+ 1 ∆t) 2 2 predictor (7.8) corrector (7.9) Note that 7.8 uses the value of dy (t ) to predict the value of y(tn+ 1 ). This predicted dt n 2 value is then used to estimate the corrected value at yn+1 ≈ y(tn+1 ) Example Problem: 4t 1 dy = 4.e 5 − y dt 2 y(0) = 2; y(3) =? (7.10) (7.11) The exact solution of this is 40 4t 14 −5 e 5 − e 2 ; y(3) = 33.677171176968 13 13 y(t) = (7.12) Using Midpoint rule yn+ 1 = 2 4 yn + (4.e 5 tn − 12 yn ) ∆t 2 yn+1 = yn + (4.e 4 (t + ∆t ) 5 n 2 (7.13) − 12 yn+ 1 ) 2 (7.14) Using ∆t = 1/2 and 12 function evaluations y(t = 3) = 33.770169 while the true value= 33.67717. Why not just use forward Euler with half the time-step instead? The answer is the midpoint method is more accurate. Its error is O(∆t2 ) (by design!) while forward Euler’s error is O(∆t). Using ∆t = 1/4 and 12 function evaluations Forward Euler yields y(t = 3) = 31.6432 true value= 33.67717 – a much larger error than the midpoint method above. 7.3.3 Fourth Order Explicit RK A similar (rather tedious) process can be used to derive a fourth order method i.e. error is O(∆t4 ). Again, there will be choices for the coefficients since there will be fewer equations than coefficients when matching terms with the Taylor expansion. 
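Before continuing with the fourth order scheme, it is worth checking the midpoint numbers above with a short script. The sketch below is a minimal implementation of the predictor-corrector pair (7.13)-(7.14) for the example problem (7.10); nothing beyond the problem data is taken from the notes, and with ∆t = 1/2 (six steps, twelve evaluations of f) it should land close to the value 33.770169 quoted above.

% Midpoint (2nd order RK) method for dy/dt = 4*exp(0.8*t) - 0.5*y, y(0) = 2
f  = @(t,y) 4*exp(0.8*t) - 0.5*y;
dt = 0.5;  t = 0;  y = 2;
for n = 1:6                                  % 6 steps of size 0.5 reach t = 3
    yhalf = y + 0.5*dt*f(t, y);              % predictor, Eq. (7.8)
    y     = y + dt*f(t + 0.5*dt, yhalf);     % corrector, Eq. (7.9)
    t     = t + dt;
end
fprintf('midpoint y(3) = %.6f  (exact 33.677171)\n', y)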
The key point when designing these methods, then, is not only the order of accuracy, but the number of times f (t, y) must be evaluated in the stage. That is, we want 86 CHAPTER 7. SOLUTION OF INITIAL VALUE PROBLEMS to maximize the accuracy of the method and minimize the number of times f (t, y) has to be evaluated. One can imagine for a large system of equations, f (t, y) may be more expensive to evaluate. In this sense, the 4th order R-K methods are special: fourth order accuracy can be achieved with four evaluations of f (t, y). Methods higher than order 4 do not enjoy this benefit and, thus, the popularity of the fourth order R-K methods. This method below is such an R-K scheme: it is fourth order in accuracy and requires only four evaluations of f (t, y): ∆t 2 ∆t ∆t yn + f (tn + , y1 ) 2 2 ∆t yn + f (tn + , y2 )∆t 2 1 yn + (f (tn , yn ) 6 ∆t ∆t 2f (tn + , y1 ) + 2f (tn + , y2 ) + f (tn + ∆t, y3 ))∆t 2 2 y1 = yn + f (tn , yn ) (7.15) y2 = (7.16) y3 = yn+1 = + (7.17) (7.18) (7.19) Using ∆t = 1 required 12 function evaluations y(t = 3) = 33.721348 true value= 33.67717. Example 7.3.2. Fourth Order Runge-Kutta See Chapra [1, Example 22.3, pg. 570] 7.3.4 MATLAB and RK methods MATLAB has implemented several Runge-Kutta methods. In all cases, the user supplies a function handle for f (t, y), the time interval [t0 , tfinal ], and the initial condition y(0). ode45 is the defacto initial value problem solver in MATLAB. The 4 indicates that it’s fourth order accurate. The 5 indicates that a fifth order error estimation is used (in fact, this method is called the Runge-Kutta-Feylberg method). The idea is that we can compare the step produced by the fourth order method with the the step produced by a fifth order method — this gives us an estimate of the error! Thus, if the error is too large, we shrink the time step size, ∆t, and try again. The RKF method is special because the fifth order error estimator reuses many of the function evaluations from the fourth order method. This is why the MATLAB methods return arrays for both the times and the solution — you very likely will not have uniform time steps following the application of the algorithm. There are other methods as well: ode23, ode113, etc. Additionally, there are methods designed for stiff problems: ode23s, ode15s, ode23t, etc. As always, use help ode45 etc. to get the full documentation. 7.4. PRACTICE PROBLEMS 7.4 Practice Problems Work the following problems in Chapra [1]: • 22.1 • 22.7 • 22.15 87 88 CHAPTER 7. SOLUTION OF INITIAL VALUE PROBLEMS Chapter 8 Solution of Boundary Value Problems Finite difference approximations are a very effective way of solving boundary value problems. First, we’ll start with a specific example of heat transfer in a onedimensional rod. Then, we’ll consider a general second order, linear, ordinary differential equations with non-constant coefficients. Finally, we’ll consider partial differential equations in a two-dimensional setting. 8.1 Heat transfer in a One-Dimensional Rod First, we’ll consider the case of pure Dirichlet boundary conditions, i.e. fixed temperatures on both ends of the rod. Then, we’ll consider the case of specifying a Neumann boundary condition on one end, i.e. a specified heat flux. 8.1.1 One-Dimensional Rod with Fixed Temperature Ends For the 1-D rod illustrated in Fig. 8.1 Consider the simple problem of heat transfer along a rod between two defined temperatures at either end. 
T∞ convec&on' Ta' conduc&on' L' 'x' Tb' 1" 2" 3" 4" 5" 6" 7" 8" 9" 10" 11" 12" 13" 14" 15" 16" Figure 8.1: Heat transfer along 1D rod from Ta to Tb 89 90 CHAPTER 8. SOLUTION OF BOUNDARY VALUE PROBLEMS d2 T + h(T∞ − T ) = 0, dx2 T (0) = Ta T (L) = Tb 0<x<L (8.1) (8.2) Equation 8.1 defines the temperatures on 0 < x < L and 8.2 the temperatures at the boundaries x = 0, L. Equation 8.1 and 8.2 define the complete “boundary value problem”. To solve the differential equation, one possible approach could be to use the formulae we developed for the 2nd order derivative. First, we “discretize” the onedimensional domain into a set of N points xi , i = 1, . . . , N , shown in Figure 8.1 for N = 16. Then, T1 = Ta and TN = Tb as those are our boundary conditions. For the interior points, let Ti , i = 2, . . . , N − 1 be temperatures at fixed points along the rod. We approximate the derivatives in the differential equation (8.1) by the appropriate finite difference formulae. Here, we use the central difference formula for the second derivative: Ti+1 − 2Ti + Ti−1 d2 T = 2 dx ∆x2 Now, inserting into equation (8.1) 0= Ti+1 − 2Ti + Ti−1 + h(T∞ − Ti ) ∆x2 (8.3) Or multiplying through by ∆x2 and rearranging we have − Ti−1 + (2 + h∆x2 )Ti − Ti+1 = h∆x2 T∞ (8.4) (8.4) is a template equation that holds inside the domain. At the ends we are given the temperatures T1 = Ta , T16 = Tb (8.5) Using (8.4) and the above values for T1 , T16 we can write equations: i = 2 : h∆x2 T∞ = −Ta + (2 + h∆x2 )T2 − T3 i = 3 : h∆x2 T∞ = −T2 + (2 + h∆x2 )T3 − T4 i = 4 : h∆x2 T∞ = −T3 + (2 + h∆x2 )T4 − T5 ... ...... 2 i = 15 : h∆x T∞ = −T14 + (2 + h∆x2 )T15 − Tb Combining and putting in matrix form 1 0 0 0 ... 0 T1 Ta −1 2 + h∆x2 T2 h∆x2 T∞ −1 0 ... 0 2 0 −1 2 + h∆x2 −1 ... 0 T3 = h∆x T∞ ... ... ... ... ... ... ... ...2 2 ... ... ... −1 2 + h∆x −1 T15 h∆x T∞ 0 0 0 0 ... 1 T16 Tb (8.6) (8.7) (8.8) (8.9) 8.1. HEAT TRANSFER IN A ONE-DIMENSIONAL ROD 91 Figure 8.2: Exact and approximate solution for Heat transfer along 1D rod Or ⇒ [K]{T } = {F } {T } = [K]−1 {F } (8.10) To get better results we will need to increase the number of intervals from 15 to much larger numbers e.g. 100. For a test problem with Ta = 300K, Tb = 400K, T∞ = 200K, L = 10m, h = 0.05m−2 we can solve it for N=100 and compare to the exact solution of the problem obtained using previous chapter techniques to be √ T (x) = 200 + 20.4671e 0.05x √ + 79.5329e −0.05x The approximation is plotted against the exact in Fig 8.2. Example 8.1.1. Finite Difference Approximation of BVPs See Chapra [1, Example 24.5, pg. 629] 8.1.2 One-Dimensional Rod with Mixed Boundary Conditions So far we have fixed T (0) = Ta , T (L) = Tb . These are called Dirichlet boundary conditions. Another option is to fix the derivative at one end; this is called a Neumann boundary condition. For example, at x = 0, set dT = b. One way to treat this boundary condition is dx to replace the derivative by a suitable difference formula. To do so, we introduce a “ghost point”, T0 . This point is not actually part of the grid, but allows us to write down the central difference formula at x1 : T2 − T0 = b ⇒ T0 = 2b∆x + T2 2∆x 92 CHAPTER 8. 
SOLUTION OF BOUNDARY VALUE PROBLEMS Now write down the discretized governing equation at x1 and combine with the difference formula above: T2 − 2(h + ∆x2 )T1 + T0 = −h∆x2 T∞ T2 − 2(h + ∆x2 )T1 + 2b∆x + T2 = −h∆x2 T∞ 2(h + ∆x2 )T1 − 2T2 = h∆x2 T∞ + 2b∆x (8.11) Note that because of the substitution, the “ghost point”, T0 is not part of the system, since it could be written in terms of T2 . Now, we can write this linear system of equations in matrix form: 2 + h∆x2 −2 0 0 ... 0 T1 h∆x2 T∞ − 2b∆x −1 2 + h∆x2 −1 0 ... 0 h∆x2 T∞ T2 2 2 T3 0 −1 2 + h∆x −1 ... 0 h∆x T ∞ = ... ... ... ... ... ... ... ... ... ... ... −1 2 + h∆x2 −1 T15 h∆x2 T∞ 0 0 0 0 ... 1 T16 Tb Now, as before, we can solve the linear system of equations for the values of Ti , i = 1, . . . , N . Example 8.1.2. Incorporating Neumann Boundary Conditions See Chapra [1, Example 24.6, pg. 632] 8.2 General Linear Second Order ODEs with Nonconstant Coefficients Now we will consider the case of a general, linear, second order differential equation with nonconstant coefficients. We’ll focus on the pure Dirichlet boundary condition case, but it is easy enough to generalize to include Neumann boundary conditions, as we saw in Section 8.1.2. y 00 + p(x)y 0 + q(x)y = r(x), y(0) = ya y(L) = yb 0<x<L (8.12) (8.13) (8.14) for given functions p(x), q(x), and r(x). As before, we discretize the line into points xi , i = 1, . . . , N . Then, we can apply suitable finite difference formulae to the equation. Here, we’ll use a central difference formula for the second derivative and a forward difference for the first derivative. As before, we let pi = p(xi ), qi = q(xi ), and ri = r(xi ). y1 = ya yi+1 − yi yi+1 − 2yi + yi−1 + pi + qi yi = ri , 2 ∆x ∆x yN = yb (8.15) i = 2, . . . , N − 1 (8.16) (8.17) 8.3. TWO DIMENSIONAL EQUATIONS 93 Rearranging terms, and multiplying by −∆x2 , we have y1 = ya (8.18) 2 −yi−1 + yi 2 + pi ∆x − qi ∆x + yi+1 (−1 − pi ∆x) = −ri ∆x2 , (8.19) i = 2, . . . , N − 1 yN = yb (8.20) Rewriting in matrix form, we have 1 −1 0 ... ... 0 0 0 2+pi ∆x−qi ∆x2 −1−pi ∆x 0 0 −1 ... ... 0 2+pi ∆x−qi ∆x2 −1−pi ∆x ... ... 0 ... −1 0 ... ... ... ... 2+pi ∆x−qi ∆x2 ... ya −ri ∆x2 −ri ∆x2 = ... −ri ∆x2 −1−pi ∆x yN −1 1 yN yb 0 0 0 ... y1 y2 y3 ... Notice that the example in Section 8.1 is captured here by setting p(x) = 0, q(x) = −h, and r(x) = −hT∞ . 8.3 Two dimensional Equations Now we consider boundary value problems for two-dimensional domains. That is, we’ll consider the application of finite difference methods to partial differential equations. All of the ideas follow very naturally from the one-dimensional case. The only difference is now how to mange the indexing of the two-dimensional grid of points into the linear system. We’ll focus on the Poisson equation: ∆u = f . Furthermore, we’ll only consider Dirichlet boundary conditions; similar ideas to the one-dimensional case apply for Neumann boundary conditions. Consider the Poisson equation on the rectangular domain Ω = [0, a] × [0, b], with Dirichlet boundary conditions: ∂ 2u ∂ 2u + = f, ∂x2 ∂y 2 u(0, y) = ul u(a, y) = ur u(x, 0) = ub u(x, b) = ut (x, y) ∈ Ω (8.21) (8.22) (8.23) (8.24) (8.25) for a given forcing function f (x, y). As before, we discretize the domain. Now we’ll have a two-dimensional array points, say N points in the x-direction and M points in the y-direction. Now we have two-indices to track points on the grid: (i, j), i = 1, . . . , N ; j = 1, . . . M . That is, each point in the grid lies at (xi , yj ). 94 CHAPTER 8. 
SOLUTION OF BOUNDARY VALUE PROBLEMS Now we apply central difference rules to both the partial derivatives: ∂ 2u ui−1,j − 2ui,j + ui+1,j ≈ 2 ∂x ∆x2 2 ∂ u ui,j−1 − 2ui,j + ui,j+1 ≈ 2 ∂y ∆y 2 (8.26) (8.27) Substituting into our original partial differential equation, we have ui−1,j − 2ui,j + ui+1,j ui,j−1 − 2ui,j + ui,j+1 + = fi,j ∆x2 ∆y 2 u1,j = ul , j = 1, . . . , M uN,j = ur , j = 1, . . . , M ui,1 = ub , i = 1, . . . , N ui,M = ut , i = 1, . . . , N (8.28) (8.29) (8.30) (8.31) (8.32) where, as before, fi,j = f (xi , yj ). In the one-dimensional case, there was a direct correspondence between the index of the equation and the index of the matrix row. Here, the situation is more complicated. We track the points in the grid using two indices, but have to convert those two indices into a single index into the matrix equation. The simplest possibility, since we have a nice structured grid, is to count from left to right, bottom to top; i.e. start at j = 1, move i from 1 to N , then move to j = 2, then, again, move i from 1 to N , etc. This is conveniently expressed by the following integer function: k = i + (j − 1)N (8.33) The integer k corresponds to the entry in the matrix for a given (i, j) and takes values k = 1, . . . , N ∗ M . So now, as we consider the equation at each point in the domain, i.e. a specific (i, j), we can now directly map that to a row in linear system. 8.4 Practice Problems Work the following problems in Chapra [1]: • 24.8(b) • 24.12 Chapter 9 Solution of Eigenproblems In Chapter 2, we discussed the formulation and solution of eigenproblems Ax = λx. The methods of solution we considered were all based on finding the roots of the characteristic polynomial. However, that’s only practical for small problems. In this case, we can specify exactly how small: dimension 4. Why? It was proven in the 1800’s that polynomials of order 5 or higher possess no analytical solution. That means we must use numerical methods for solving practical eigen problems. However, numerically finding the roots of polynomials is not the most practical method. We will consider three methods: the power method, used for finding the largest eigenvalue and corresponding eigenvector, the inverse power method, which computes the smallest eigenvalue and its eigenvector, and the QR method, which computes the entire spectrum at once. This will also be our first encounter with iterative methods. That is, the numerical methods we’ve encountered so far have been “direct” methods, i.e. we can directly compute the solution and we can count how many operations it will take to do so. This is not the case with iterative methods. These methods update the solution and we must continually check if we solved the problem to the desired accuracy. There’s no way to predict ahead of time, except in the most trivial of circumstances, how many steps it will take. Thus, we will be introducing a new kind of error: the error incurred by only approximately solving our numerical problem. 9.1 Power Method The power method is perhaps the most classical approach to solving eigenproblems. The power method (or power iteration) will yield an approximation of the largest eigenvalue and its corresponding eigenvector. The idea is very simple. First, we make a guess of what the eigenvector is, call it z. Then, compute w = Az. If z is an eigenvector, then, for any component k, X wk = Akj zj = λzk (9.1) j ⇒λ= wk zk 95 (9.2) 96 CHAPTER 9. 
SOLUTION OF EIGENPROBLEMS That is, if z was an eigenvector, we can directly compute the corresponding eigenvalue. If z is not an eigenvector, then we use w as the next guess for the eigenvector. And we repeat this process. So, if our initial guess is z0 , then w0 = Az0 z1 = w0 w1 = Az1 = Aw0 = A2 z0 ... This is where the name “Power Method” comes from, we are effectively repeatedly multiplying by A. How do we extract, then, the eigenvalue? At each iteration, we always normalize z so that kzk∞ = 1, i.e. we rescale z so that its largest component is 1. Why? That largest component will converge to the eigenvalue. That is, looking at Equation (9.2), if z is getting closer to the eigenvector and its largest component has the value of 1, then the the factor we use to normalize z (the largest value of w) will be the eigenvalue. So, the final procedure for the Power Method is as follows. Repeat until convergence tolerance is reached: 1. Provide initial guess for z. 2. Compute w = Az. 3. z = 1 w kwk∞ The maximum value of w will be the eigenvalue and z will be the eigenvector. How do we assess convergence? There are several choices. First, we can check and see if the eigenvalue is not changing within each iteration: |λi − λi−1 | < εtol (9.3) where i indicates the current iteration and εtol is a user-supplied tolerance. Although such a test is practical, it can also be misleading since, in many iterative methods, the convergence may stagnate, i.e. the value changes very little between iterations, but we are not solving the original problem. Thus, it is also useful to check the residual : ri = Azi − λi zi (9.4) But r is a vector — how can we check if a vector is “small”? This is another use for norms. Norms give us a way of measuring the size of vectors. So, in this case, our residual error check would be kri k = kAzi − λi zi k < εtol Typically, we use the Euclidean norm. Example 9.1.1. Power Method Consider the matrix 40 −20 0 A = −20 40 −20 0 −20 40 (9.5) 9.2. INVERSE POWER METHOD 97 We will perform several iterations of the Power Method to estimate the largest eigenvalue. We use the initial guess z = [1, 1, 1]T . Then, w = Az = [20, 0, 20]T . We need to normalize for the next iteration so we extract the largest value (in absolute value) and normalize. So, max w = 20 and z = [1, 0, 1]T . So after 1 iteration, our approximation of the largest eigenvalue is 20 and the corresponding eigenvector is [1, 0, 1]T . Again, w = Az = [40, −40, 40]T . Extracting the largest value gives 40 and z = [1, −1, 1]T , our current approximation for the eigenpair. Repeating again, w = Az = [60, −80, 60]. So, the largest value now is −80 and normalizing gives z = [−3/4, 1, −3/4]T . Repeating again, w = Az = [−50, 70, −50]. Normalizing gives the largest value as 70 and z = [−5/7, 1, −5/7]T ; this is our estimate of the eigenpair after 4 iterations of the Power method. The exact values, according to the MATLAB eig command are λ = 68.28427 . . . and x = [−0.707107, 1, −0.707107]T . 9.2 Inverse Power Method The Inverse Power Method is very easy to understand once we understand the Power Method. We can rewrite our original eigenproblem as A−1 x = 1 x λ (9.6) If we now apply the power method to Equation (9.6), we will be computing the largest value of 1/λ. This means that we’ll be computing the smallest value of λ, i.e. the smallest eigenvalue of A! The algorithm is very similar to the Power Method. Repeat until convergence tolerance is reached: 1. Provide initial guess for z. 2. Compute w = A−1 z. 3. 
z = w / ‖w‖∞

The maximum value of w will be the eigenvalue of A⁻¹ (i.e. 1/λ) and z will be the corresponding eigenvector.

Of course, the primary difference is that we are no longer simply multiplying by A; we now have to solve a linear system at each iteration. This is much more expensive than the Power Method! One can perform an LU decomposition once and reuse the stored L and U factors for every linear solve, but even that initial factorization can be quite costly for modest sized matrices.

9.3 Shifted Inverse Power Methods

With the Power Method and the Inverse Power Method, we can now compute the largest and smallest eigenvalues. What about the others? We will take advantage of a shifting property of matrices to examine the remaining eigenvalues. Namely, if we take our original eigenproblem and subtract τx, with τ some chosen number, from both sides, we have

(A − τI) x = (λ − τ) x   (9.7)

That is, if we shift the matrix, we also correspondingly shift the eigenvalues. In particular, we can apply the Inverse Power Method to the shifted matrix, thereby controlling which of the (shifted) eigenvalues is the smallest, i.e. which eigenvalue of A we converge to. Although this is not practical for computing the entire spectrum, it can be useful for computing a handful of eigenvalues near a chosen shift and for accelerating convergence to desired eigenvalues.

9.4 QR Iteration

Although it can often be enough to compute a few of the smallest and/or largest eigenvalues of a matrix, it can also be useful to compute all eigenvalues and eigenvectors. Computing this whole spectrum is the job of the QR iteration. The QR iteration is extremely simple to write down, but carries a heavy theoretical burden that we will not fully explore. Nevertheless, the QR iteration is based on the existence of the QR decomposition of a matrix: A = QR, where Q is an orthogonal matrix and R is an upper triangular matrix. An orthogonal matrix satisfies two important properties: det Q = ±1 and Q⁻¹ = Qᵀ. If we assume that such a QR decomposition exists, then the QR iteration is very simple. Repeat until convergence:

1. Compute the QR decomposition A = QR.
2. Set A = RQ.

The second step can be seen a different way. Since A = QR, we have R = QᵀA (because Q is orthogonal), so resetting A = RQ gives A = QᵀAQ. Thus we are repeatedly transforming A with the orthogonal matrices Q. These repeated similarity transformations preserve the eigenvalues: the diagonal of the final A will contain the eigenvalues, and the product of all the matrices Q will contain the eigenvectors.

Computing the QR decomposition is an O(n³) operation, so we would be performing an O(n³) operation at every iteration of the QR iteration! Thus, in actual implementations of such methods, the matrix A is first reduced to upper Hessenberg form (upper triangular form plus one nonzero subdiagonal), for which the QR decomposition is O(n²) (O(n) if the matrix is symmetric!). Such strategies are at the heart of the MATLAB command eig for computing eigenvalues and eigenvectors. This is also why the cost of computing the entire spectrum can be quite large for even modest sized matrices.
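As a concrete check of the ideas in this chapter, the sketch below is a minimal MATLAB implementation of the Power Method, using the matrix from Example 9.1.1 (the iteration cap and tolerance are assumptions made purely for illustration), and compares the result against MATLAB's eig.

% Power Method for the dominant eigenpair of the matrix in Example 9.1.1
A      = [40 -20 0; -20 40 -20; 0 -20 40];
z      = [1; 1; 1];                % initial guess for the eigenvector
lamOld = Inf;
for k = 1:100
    w      = A*z;
    [~, j] = max(abs(w));          % entry of largest magnitude
    lam    = w(j);                 % current eigenvalue estimate
    z      = w/lam;                % rescale so the largest entry is 1
    if abs(lam - lamOld) < 1e-10, break, end
    lamOld = lam;
end
lam         % approx 68.2843, the largest eigenvalue
z           % approx [-0.7071; 1; -0.7071]
eig(A)      % 11.7157, 40.0000, 68.2843 for comparison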
Chapter 10 Nonlinear Equations Engineering analysis proceeds in a three stage iterative process – first we must carry out a careful set of observations to determine what data and information about the problem are available; second we must develop our basic knowledge of the physics, feasibility constraints, etc., into a mathematical analysis of the problem; and, finally we must solve the mathematical problem numerically to obtain the desired results of the analysis. Let us now explore this in the context of a simple example. 10.1 Example Problem: Rocket A rocket with initial mass m0 (shell + fuel) is fired vertically at time t0 . Fuel is consumed at a constant rate: dm (10.1) q= dt and is expended at constant speed u relative to the rocket. We wish to determine: 1. Velocity as a function of time t, neglecting air resistance. 2. The time t at which the rocket will reach velocity v. 3. The size of rocket (initial mass) needed to reach a given velocity at a given time. 10.1.1 Problem Formulation Principles of Impulse and Momentum I~ = ∆~p At time t • Force: W (t) = m(t)g • Momentum: Pz (t) = m(t)v(t) 99 (10.2) 100 Figure 10.1: gram CHAPTER 10. NONLINEAR EQUATIONS Rocket Dia- Figure 10.3: Time t+∆t Figure 10.2: Time t 10.2. SOLVING NON-LINEAR EQUATIONS 101 At time t + ∆t • W (t + ∆t) = [m(t) − ∆m]g + ∆mg • Pz (t + ∆t) = [m(t) − ∆m]v(t + ∆t) + ∆m[v(t + ∆t) − u] • ∆m = q∆t 10.1.2 Problem Solution Z t+∆t −W (t)dt = Pz (t + ∆t) − Pz (t) (10.3) m(t)g∆t = m(t)[v(t + ∆t) − v(t)] − qu∆t (10.4) t Divide by ∆t m(t)g = m(t) v(t + ∆t) − v(t) − qu ∆t (10.5) dv (t) − qu dt (10.6) Limit as ∆t → 0 m(t)g = m(t) Separate variables and integrate. v Z t qu − g dt dv = m0 − qt 0 o m0 v(t) = u ln − gt m0 − qt m0 f (t) = u ln − gt − v = 0 m0 − qt Z (10.7) (10.8) (10.9) To answer the first question we posed – velocity as a function of time t, all we have to do is evaluate 10.8 for given values of m0 , q and g. To answer the second, time at which the rocket will reach a desired velocity, we need to solve 10.9 i.e. given v, m0 , q, g, find the roots of f (t) = 0. For another example on finding roots of a non-linear equation, refer to Chapter 5 of Ref [1]. 10.2 Solving Non-Linear Equations First we note that the “equation” is not of the form t = g(...) – no explicit expression for t as a function of everything else is possible. Secondly, the equation is non-linear in the m0 , q, and t variable. How do we know this? We can test the equation for linearity. 102 CHAPTER 10. NONLINEAR EQUATIONS 10.2.1 Test for Linearity Suppose we have an equation f (v) = 0 If we replace v by a linear combination of variable, e.g. 2v1 + v2 , and if f (v) is a linear function (note: not an affine function) – we will now get 2f (v1 ) + f (v2 ) = 0 For 10.9, clearly, this is not possible – the log function precludes this! Thus 10.9 is non-linear in t, m0 and q. Most of the time you can find this by inspection. 10.2.2 Methods of Solution Let’s now frame some solution strategies. The solutions have to be searched for inside a possible range of values. The simplest and first thing to do is graph the function and get a sense of its behavior. 1. Incremental Search This is the simplest strategy. Simply divide up a range of into smaller increments t ∈ [a, b] = [t0 , t1 , t2 ..., tM ] and keep evaluating the function until, for some ti , |f (t) − f (ti )| < , a defined tolerance. ti is then the desired root. 2. Bisection This is a smarter search strategy which uses the bisection method to speed up the search. 
The core idea is that for f (troot ) = 0 it must change sign to the left and right of troot . So if we start with an interval [a,b] on which ) will f (t) changes sign the root must lie in there. Checking the sign of f ( a+b 2 then tell us which half contains the sign change and hence root. The procedure can be applied recursively until we are happy with the accuracy of the root. The algorithm to implement this scheme is: Input m0 , vdesired , q and range[a, b] 1.Check data for consistency 2.Evaluate f (a), f (b) and check that sign changes i.e. f (a) ∗ f (b) < 0 ) 3.Evaluate f (m) = f ( a+b 2 4. if f (m) ∗ f (b) < 0 then set a = m, else set b = m repeat 3,4 until f (m) < A simple MATLAB code to implement this is shown in Figure 10.4. It should be noted that if f 0 (troot ) = 0, the bisection method might not work within the given bounds – this is the case when the function does not change sign within the given bounds (even though a root does exist) . 10.2. SOLVING NON-LINEAR EQUATIONS 103 Exercise: Modify the sample MATLAB code to answer the third question i.e. estimate the size of rocket needed to attain a desired velocity after a specified amount of time. For further practice in using the Bisection method, see Examples 5.3 and 5.4 in Ref [1]. 3. Newton Rhapson The Taylor series expansion of a function f (x) about a known value f (x1 ) is given by: f (x) = f (x1 ) + (x − x1 )f 0 (x1 ) + (x − x1 )2 f 00 (x1 ) + ... 2! (10.10) We can approximate to ”first order” as f (x) ≈ f (x1 ) + (x − x1 )f 0 (x1 ) + O(x − x1 )2 (10.11) Near a possible root x1 f (x) = 0 = f (x1 ) + (x − x1 )f 0 (x1 ) (10.12) What this equation is showing is that we can estimate f (x) by it’s value at x1 , f (x1 ) plus a correction given by the slope f 0 (x1 ) times the change in x, x − x1 i.e. a straight line approximation of the function. x = x1 − f (x1 ) f 0 (x1 ) (10.13) Repeating xi+1 = xi − f (xi ) f 0 (xi ) (10.14) For sufficiently small x − x1 this will always work but the small might be too small!! Newton’s method converges fast as error is of order (x − x1 )2 BUT if this gap is large we might not converge at all. This method also needs the derivative to be computable – this may not always be possible, though often we can get an approximation. If f (t) has repeated roots e.g. f (x) = (x − 3)2 then f 0 (x) = f (x) = 0 for x = 3. The method will break down since we will end up dividing by zero!! Example 10.2.1. Examples 6.2, 6.3, and 6.4 of Ref [1]. 4. Secant Variation of NR method – replace f 0 by successive iterates. f 0 (xi ) ≈ f (xi ) − f (xi−1 ) xi − xi−1 Example 10.2.2. Example 6.5 of Ref [1]. 104 CHAPTER 10. NONLINEAR EQUATIONS MATLAB Code for Bisection to find roots of 10.9 % Adapted from Faucett, Applied Numerical Analysis Using Matlab (2008) % Root finding by bisection % Lets get the variables input vdes =input ( ’ desired vel ’); M0 = input (’ initial mass ’); q = input (’ Burn Rate ’); u = input(’ exhaust vel ’ ); % Define the function (x is time, M0 is initial mass ...) 
f = @(x)(u ∗ (log(M 0) − log(M 0 − q ∗ x)) − vdes)/9.81 − x; % Set initial range a = input(’ search left ’); b = input (’ search right ’); % Make sure that poorly chosen arguments don’t kill code if (M 0 − q ∗ b) <= 0, error ( ’negative argument to log ’), end % Set tolerance and max iterations kmax = 500; tol = 0.001; fa = f(a); fb = f(b); % Check that bisection will work if sign(fa) == sign(fb), error(’ function has same sign at end points – bisection will not work’), end disp(’ step a b m fm bound’) for k = 1:kmax m = (a+b)/2; fm = f(m); iter = k; bound = (b-a)/2; out = [ iter, a, b, m, fm, bound ]; disp( out ) if abs(f m) < tol, disp(’ bisection has converged’); break; end outerr = [ abs(fm) ]; % Check which half of domain contains the root if f m ∗ f a > 0 a = m; fa = fm; else b = m; fb = fm; end bb(k)=m; err(k)=fm; if (iter >= kmax), disp(’ zero not found to desired tolerance’), fprintf(’ err = % e \n’, outerr ), end end % Just an optional set of variables to make a nice plot of the function for k=1:kmax fr(k)=f(a+(b-a)*k/kmax); tim(k)=a+(b-a)*k/kmax; end subplot(1,2,1), plot(tim,fr); subplot(1,2,2), plot(err,’*’); Figure 10.4: Code for solving 10.9 using the bisection method 10.3. CONVERGENCE 105 5. Regula Falsi Variation of bisection method. Instead of “bisecting” the interval at each iteration, we instead use a linear approximation. That is, we approximate the function f (x) by a linear function that passes through the points f (a) and f (b). Where this linear approximation is zero determines the next guess of the root, xr : xr = b − f (b)(a − b) f (a) − f (b) (10.15) Example 10.2.3. Example 5.5 of Ref [1]. 6. Fixed Point Iteration Rewrite f (x) = 0 as x = g(x) A point x that satisfies x = g(x) is said to be a fixed point of the function g(x). So, we are looking for fixed points of g(x), but because the equation x = g(x) was based on f (x) = 0, solving this fixed point problem will also solve our original root problem. The iteration then proceeds as xi+1 = g(xi ), i = 1, 2, 3, ... (10.16) Stop when |xi+1 − g(xi+1 )| < or |f (xi+1 )| < This process converges (see below) if |g 0 (x)| < 1 Example 10.2.4. Example 6.1 of Ref [1]. 10.3 Convergence To be useful, iterative numerical methods like the ones we have discussed above must reduce error to a tolerable level and in the limit (if we have infinite computing resources!) reduce it to zero. We analyze the methods above to see if they satisfy this. Usually, most methods will do it under restrictions – these are the conditions we need to make sure our problem observes before we apply the method to that problem. 106 10.3.1 CHAPTER 10. NONLINEAR EQUATIONS Bisection Let us look at the interval hi+1 at iteration i + 1. Since the root is in the interval [ai+1 , bi+1 ]the maximum error |ei+1 | < hi+1 (10.17) 1 1 1 1 hi+1 = bi+1 − ai+1 = hi = 2 hi−1 = ... = i+1 h0 = i+1 (b − a) (10.18) 2 2 2 2 Taking logarithms and rearranging: log10 h0 − log10 |ei | −1 (10.19) i≤ log10 2 Thus given an accuracy ei we can estimate the upper bound of iterations. Note that the above argument breaks down if there is more than 1 root. Thus, we can apply this analysis to bisection only if there is 1 root in [ai+1 , bi+1 ]. 10.3.2 Newton Rhapson Let the correct root be x∗ for f (x) = 0 i.e. f (x∗ ) = 0. Let xn , xn+1 be estimates at the n, n + 1 iteration such that |x∗ − xn | = δ << 1. We can define errors en = x∗ − xn , en+1 = x∗ − xn+1 . 
By Taylor series, 0 = f (x∗ ) = f (xn ) + f 0 (xn )(x∗ − xn ) + f 00 (ξ) ∗ (x − xn )2 2 for some ξ ∈ (x∗ , xn ) using the reminder formula for Taylor series f 00 (xn ) (x∗ − xn )2 + .... 2 By Newton Rhapson xn+1 = xn − f 00 (ξ) (x∗ 2 (10.20) − xn )2 = f (xn ) f 0 xn ⇒ f (xn ) = f 0 (xn )(xn − xn+1 ) (10.21) Using 10.21 in 10.20 we have 0 = f 0 (xn )(xn − xn+1 ) + f 0 (xn )(x∗ − xn ) + = f 0 (xn )en+1 + ⇒ en+1 Or en+1 f 00 (ξ) 2 = − 0 e 2f (xn ) n ∝ e2n f 00 (ξ) ∗ (x − xn )2 (10.22) 2 f 00 (ξ) 2 e 2 n (10.23) This is the quadratic convergence property of Newton’s method. This means that, if we’re close enough to the solution, the error at the next iteration will be the square of the error at the current iteration, i.e. if the error at our current iteration is ≈ 10−3 , then the error at the next iteration will be ≈ 10−6 , and then ≈ 10−12 after that. This is very fast convergence! 10.4. NONLINEAR SYSTEMS OF EQUATIONS 10.3.3 107 Fixed Point Let the correct root be x∗ . x∗ = g(x∗ ) (10.24) Subtracting 10.16 and dividing by x∗ − xi x∗ − xi+1 = g(x∗ ) − g(xi ) x∗ − xi+1 g(x∗ )−g(xi ) = x∗ −xi ∗ x − xi (10.25) (10.26) Using the mean value theorem of calculus (g() is a continuous function defined on the interval [x∗ , xi ] so there must exist a value of ξ ∈ [x∗ , xi ] for which g 0 (ξ) is given by the ratio of the difference of values of g() at each end and the length of the interval). ei+1 = ei ei+1 = g(x∗ )−g(xi ) x∗ −xi = g 0 (ξ) for ξ ∈ [x∗ , xi ] g 0 (ξ)ei (10.27) (10.28) For convergence the error must reduce i.e. ei+1 < ei . Thus, g 0 (ξ) < 1 10.4 Nonlinear Systems of Equations To this point, we have only considered scalar nonlinear equations, i.e. only a single equation with one independent variable. Just as we can have systems of linear equations, studied extensively in previous chapters, we can also have systems of nonlinear equations. That is, we can have multiple equations with multiple independent variables, with a nonlinear dependence on the independent variables. Such systems arise in many different aspects of science and engineering. The first step is always to transform the equations into a root problem, namely f(x) = 0, or, written more explicitly f1 (x1 , x2 , x3 , . . . , xn ) = 0 f2 (x1 , x2 , x3 , . . . , xn ) = 0 f3 (x1 , x2 , x3 , . . . , xn ) = 0 .. . fn (x1 , x2 , x3 , . . . , xn ) = 0 Now we have n equations with n unknowns, with a nonlinear dependence in one or more of the independent variables. Example 10.4.1. Simple Nonlinear System of Equations 108 CHAPTER 10. NONLINEAR EQUATIONS The following system of equations is nonlinear in the unknowns x1 , x2 , x3 : x21 + 2x22 + 4x3 = 7 2x1 x2 + 5x2 − 7x2 x23 = 8 5x1 − 2x2 + x23 = 4 Example 10.4.2. Intersection of a Circle and an Ellipse The intersection of geometric shapes provide more interesting examples of nonlinear systems. Consider the intersection of a circle and an ellipse: x 2 y 2 + =1 a b x2 + y 2 = r 2 We are looking for points (x, y) where the two curves intersect, i.e. the values of (x, y) that satisfy both of the equations. Depending on the parameters of the ellipse and the circle, there will be differing numbers of solutions – i.e., there could be 0 to 4 points of intersection between the ellipse and the circle. To solve such systems of equations, we can use strategies similar to those used for scalar equations. We will consider two algorithms here: the fixed-point method and Newton’s method (also called the Newton-Raphson method). By far, Newton’s method is the most widely used. 
One key difference in the case of a nonlinear system of equations is assessing convergence. In the case of a single equation (scalar case), we could simply check |f (xi )| < εtol since f (x) was a scalar function. Now, we have a vector f (x). How do we check the tolerance here? We use norms! Norms give us a measure of the magnitude of a vector; in this case allowing us to check if the magnitude of our vector is small enough, i.e. kf (x)k < εtol . Typically, the Euclidean norm (2-norm) is used, kf (x)k2 (c.f. Equation (5.6)). 10.4.1 Fixed-Point Method As with the scalar case, we will rewrite the root problem, f (x) = 0 into the form x = g(x). Written more explicitly, we have f1 (x1 , x2 , x3 , . . . , xn ) = 0 f2 (x1 , x2 , x3 , . . . , xn ) = 0 f2 (x1 , x2 , x3 , . . . , xn ) = 0 ⇒ x1 = g1 (x1 , x2 , x3 , . . . , xn ) x2 = g2 (x1 , x2 , x3 , . . . , xn ) x3 = g3 (x1 , x2 , x3 , . . . , xn ) .. . fn (x1 , x2 , x3 , . . . , xn ) = 0 xn = gn (x1 , x2 , x3 , . . . , xn ) Then, as in the scalar case, given an initial guess x0 , the iteration proceeds as xi+1 = g(xi ), i = 0, 1, 2, . . . We continue until we reach an acceptable tolerance. (10.29) 10.4. NONLINEAR SYSTEMS OF EQUATIONS 109 Example 10.4.3. Formulate Nonlinear System for Fixed-Point Method Consider the following root problem: f1 (x1 , x2 , x3 ) = x21 + 50x1 + x22 + x23 − 200 = 0 f2 (x1 , x2 , x3 ) = x21 + 20x2 + x23 − 50 = 0 f3 (x1 , x2 , x3 ) = −x21 − x22 + 40x3 + 75 = 0 We can rearrange the equations into fixed-point form: 200 − x21 − x22 − x23 50 2 50 − x1 − x23 x2 = g2 (x1 , x2 , x3 ) = 20 x21 + x22 − 75 x3 = g3 (x1 , x2 , x3 ) = 40 x1 = g1 (x1 , x2 , x3 ) = Example 10.4.4. Example 12.3 of Ref [1]. 10.4.2 Newton-Raphson Method In the scalar case, we derived Newton’s method by considering a truncated Taylor series. We will do the same here, but now we have a system of equations, so we must do a Taylor series for each equation. Consider the system of nonlinear equations: f1 (x1 , x2 , x3 , . . . , xn ) = 0 f2 (x1 , x2 , x3 , . . . , xn ) = 0 f3 (x1 , x2 , x3 , . . . , xn ) = 0 .. . fn (x1 , x2 , x3 , . . . , xn ) = 0 Now we expand each equation in a Taylor series up to first order about the point xi . Note that since we have multiple independent variables, the Taylor series will involve partial derivatives with-respect-to each of the independent variables. We use the notation f1,i+1 = f1 (x1,i+1 , x2,i+1 , x3,i+1 , . . . , xn,i+1 ), f1,i = f1 (x1,i , x2,i , x3,i , . . . , xn,i ), etc. ∂f1,i ∂f1,i ∂f1,i (x1,i+1 − x1,i ) + (x2,i+1 − x2,i ) + · · · + (xn,i+1 − xn,i ) f1,i+1 = f1,i + ∂x1 ∂x2 ∂xn ∂f2,i ∂f2,i ∂f2,i f2,i+1 = f2,i + (x1,i+1 − x1,i ) + (x2,i+1 − x2,i ) + · · · + (xn,i+1 − xn,i ) ∂x1 ∂x2 ∂xn ∂f3,i ∂f3,i ∂f3,i f3,i+1 = f3,i + (x1,i+1 − x1,i ) + (x2,i+1 − x2,i ) + · · · + (xn,i+1 − xn,i ) ∂x1 ∂x2 ∂xn .. . fn,i+1 = fn,i + ∂fn,i ∂fn,i ∂fn,i (x1,i+1 − x1,i ) + (x2,i+1 − x2,i ) + · · · + (xn,i+1 − xn,i ) ∂x1 ∂x2 ∂xn 110 CHAPTER 10. NONLINEAR EQUATIONS Now, setting f1,i+1 = f2,i+1 = f3,i+1 = · · · = fn,i+1 = 0 and moving f1,i though fn,i to the other side of the equation, we have ∂f1,i (x1,i+1 − x1,i ) + ∂x1 ∂f2,i = (x1,i+1 − x1,i ) + ∂x1 ∂f3,i (x1,i+1 − x1,i ) + = ∂x1 −f1,i = −f2,i −f3,i ∂f1,i (x2,i+1 − x2,i ) + · · · + ∂x2 ∂f2,i (x2,i+1 − x2,i ) + · · · + ∂x2 ∂f3,i (x2,i+1 − x2,i ) + · · · + ∂x2 ∂f1,i (xn,i+1 − xn,i ) ∂xn ∂f2,i (xn,i+1 − xn,i ) ∂xn ∂f3,i (xn,i+1 − xn,i ) ∂xn .. . 
−fn,i = ∂fn,i ∂fn,i ∂fn,i (x1,i+1 − x1,i ) + (x2,i+1 − x2,i ) + · · · + (xn,i+1 − xn,i ) ∂x1 ∂x2 ∂xn We can rewrite these equations in matrix form as ∂f1,i ∂f1,i ∂f1,i ∂f . . . ∂x1,i x1,i+1 − x1,i −f1,i ∂x1 ∂x2 ∂x3 n ∂f2,i ∂f2,i ∂f2,i ∂f . . . ∂x2,i ∂x1 x2,i+1 − x2,i ∂x2 ∂x3 −f2,i n ∂f3,i ∂f ∂f3,i ∂f3,i 3,i x3,i+1 − x3,i = −f3,i . . . ∂xn ∂x2 ∂x3 ∂x1 .. .. .. .. .. .. .. . . . . . . . ∂fn,i ∂fn,i ∂fn,i ∂fn,i xn,i+1 − xn,i −fn,i ... ∂x1 ∂x2 ∂x3 ∂xn The matrix is called the Jacobian matrix, which we label Ji . So, the newton step looks very familiar: xi+1 = xi − J−1 (10.30) i fi Now, however, we have to solve a linear system of equations to compute the Newton step! But the principle is still the same: we compute the derivative and the derivative guides the next step in the iteration. Here, though, the derivative is the Jacobian matrix. Example 10.4.5. Newton Iteration for Intersection of Curves Consider the nonlinear system f1 (x1 , x2 ) = x21 + x22 − 1 f2 (x1 , x2 ) = x21 − x2 If we wish to use Newton’s method to solve the root problem f (x) = 0, we must compute the Jacobian. In this case, the Jacobian matrix is 2x1 2x2 J= 2x1 −1 Now, we evaluate the nonlinear system, f (x) and the Jacobian J(x) at each iteration and solve the linear system J∆x = −f to get the Newton update. Example 10.4.6. Case Study 12.3 of Ref [1]. 10.4. NONLINEAR SYSTEMS OF EQUATIONS 10.4.3 111 Case Study: Four Bar Mechanism We now consider a prototypical engineering problem: the rigid body kinematics of a structure. Such problems arise in the study of robotic systems, for example. Consider the four bar mechanism shown in Figure 10.5. The goal is to predict the motion of Figure 10.5: Simple four bar mechanism. the structure as we change one of the angles, for example. We assume that the bars are rigid and that we are given the lengths of each bar: r1 , r2 , r3 , and r4 . The angle θ1 is fixed since bar 1 cannot rotate (due to being pinned at both ends). The equation for point P is merely the sum of the position vectors of each of the bars. Namely, rP = r2 + r3 = r1 + r4 (10.31) We can express Equation (10.31) in terms of unit vectors i and j: r2 (cos θ2 i + sin θ2 j) + r3 (cos θ3 i + sin θ3 j) =r1 (cos θ1 i + sin θ1 j) +r4 (cos θ4 i + sin θ4 j) (10.32) Since the equations must be satisfied for each component i and j, we can write these as two separate equations, one for the x-direction and one for the y-direction: r2 cos θ2 + r3 cos θ3 = r1 cos θ1 + r4 cos θ4 r2 sin θ2 + r3 sin θ3 = r1 sin θ1 + r4 sin θ4 (10.33) (10.34) Now, suppose were are changing θ4 (i.e. rotating bar 4), what will be the configuration of the system? This is a system of nonlinear equations in terms of θ2 and θ3 ! f1 (θ2 , θ3 ) = r2 cos θ2 + r3 cos θ3 − r1 cos θ1 − r4 cos θ4 f2 (θ2 , θ3 ) = r2 sin θ2 + r3 sin θ3 − r1 sin θ1 − r4 sin θ4 (10.35) (10.36) 112 CHAPTER 10. NONLINEAR EQUATIONS So for the given data on the system and the current value of θ4 , we can solve the nonlinear system (10.35) for θ2 and θ3 to get the position of the system. If we wish to use Newton’s method, we’ll need the Jacobian matrix: −r2 sin θ2 −r3 sin θ3 J(θ2 , θ3 ) = (10.37) r2 cos θ2 r3 cos θ3 We’ll also need to supply an initial guess for our numerical method. However, we need to be careful. In addition to worrying about supplying initial values that could lead to a singular Jacobian matrix (the scalar analog is a zero derivative), we also need to worry about multiple solutions. Figure 10.6 illustrates a second valid configuration for the four bar system, i.e. 
our system of nonlinear equations may have multiple solutions! Which solution will be get? This depends on our initial guess! Typically, in these scenarios, there is a preferred solution, either through other physical constraints not posed in the system, preferred in the design process, etc. Thus, we need to have an initial guess that will give us our preferred solution. Figure 10.6: Multiple valid configurations for the four bar mechanism. Part III Data Analysis 113 Chapter 11 Linear Regression As engineers, we are often faced with the task of determining parameters of materials from experimental data. For example, one way to determine the Youngs modulus of a material is to subject a sample of the material to controlled loading and examine the resulting stress-strain curve, for example one similar to Figure 11.1. We know that for linear elastic materials, the stress and strain are related through Hooke’s Law σ = Eε. However, as we see in Figure 11.1, we have many different values of stress and strain — which one should we use? Given that we have a model, Hooke’s law in this case, we would like it to match all the data as much as possible. That is, we want the best fit or best approximation. 11.1 Least Squares Fit: Two Parameter Functions Our “best fit” problem can be stated as follows. Given a set of n data points (xi , yi ), i = 1, . . . , n, we want to find a function f (x) that best fits the data. Note that this means yi 6= f (xi ). What do we mean by “best fit”? Typically, we want to minimize the error between the data, yi , and the function f (x): ei = yi − f (xi ) (11.1) See also Figure 11.2. But we can’t just use the error as is because we can have cancellation! Some errors are positive and some are negative, so it’s possible to sum the errors and have zero total error even though we clearly aren’t matching the data. Figure 11.1: Example of Stress-Strain measurements for several types of Aluminum. 115 116 CHAPTER 11. LINEAR REGRESSION y f (x) f (xi ) yi ei x xi Figure 11.2: Illustration of error between data and curve fit. Instead, we will square each of the error components so that our total error is Sr = n X e2i = i=1 n X (yi − f (xi ))2 (11.2) i=1 So, our goal is to minimize the total error or, in other words, find the least square error. So now we need to choose the form of f (x). This is typically dictated by our understanding of the physical process, e.g. Hooke’s Law. We will first consider the simple case of a line: f (x) = a0 + a1 x. Now the problem is to determine the values of a0 and a1 that will minimize the error in Equation (11.2). Namely, min Sr = min a0 ,a1 a0 ,a1 n X i=1 e2i = min a0 ,a1 n X (yi − a0 − a1 xi )2 (11.3) i=1 The minimum occurs when ∇Sr = 0. In this case, the variables we are varying are the parameters a0 and a1 , so the gradient is with respect to each of these parameters: ∂Sr =0 ∂a0 ∂Sr =0 ∂a1 (11.4) (11.5) Computing each of these derivatives gives us the following system of equations (again, 11.1. 
Computing each of these derivatives gives us the following system of equations (again, a_0 and a_1 are the unknowns):

\sum_{i=1}^{n} (y_i - a_0 - a_1 x_i) = 0, \qquad \sum_{i=1}^{n} x_i (y_i - a_0 - a_1 x_i) = 0    (11.6)

We can rewrite these equations in matrix form:

\begin{bmatrix} n & \sum_{i=1}^{n} x_i \\ \sum_{i=1}^{n} x_i & \sum_{i=1}^{n} x_i^2 \end{bmatrix} \begin{bmatrix} a_0 \\ a_1 \end{bmatrix} = \begin{bmatrix} \sum_{i=1}^{n} y_i \\ \sum_{i=1}^{n} x_i y_i \end{bmatrix}    (11.7)

In this case, we can compute the solution directly, namely

a_0 = \frac{\left(\sum_{i=1}^{n} x_i^2\right)\left(\sum_{i=1}^{n} y_i\right) - \left(\sum_{i=1}^{n} x_i y_i\right)\left(\sum_{i=1}^{n} x_i\right)}{n \sum_{i=1}^{n} x_i^2 - \left(\sum_{i=1}^{n} x_i\right)^2}    (11.8)

a_1 = \frac{n \sum_{i=1}^{n} x_i y_i - \left(\sum_{i=1}^{n} x_i\right)\left(\sum_{i=1}^{n} y_i\right)}{n \sum_{i=1}^{n} x_i^2 - \left(\sum_{i=1}^{n} x_i\right)^2}    (11.9)

Now that we have a solution for our line-fitting problem, we can use some simple statistical measures to assess the fit. The first is the standard deviation of the data, S_y:

S_y = \sqrt{\frac{S_t}{n-1}}    (11.10)

S_t = \sum_{i=1}^{n} (y_i - \bar{y})^2    (11.11)

where \bar{y} = \sum_i y_i / n is the average. This measures the spread of all the data around the mean. We can also look at the spread of the data around our best fit line:

S_{y/x} = \sqrt{\frac{S_r}{n-2}}    (11.12)

Figure 11.3 illustrates these two measures.

Figure 11.3: Illustration of the spread of the data around the mean (S_t) and around the curve fit (S_r).

Using these measures, we can define a goodness-of-fit measure, the so-called "coefficient of determination":

r^2 = \frac{S_t - S_r}{S_t}    (11.13)

When S_r becomes very small, there is very little mismatch between our function f(x) and the data y_i, r^2 → 1, and the function captures the behavior of the data very well. If, on the other hand, r^2 → 0, then S_r is close to S_t. Intuitively, this means that the error about the fit is similar in size to the spread of the data around the mean, and the function, statistically, does not capture the behavior of the data well.

For the straight-line case that we have considered in this section, we can write down the value of r explicitly:

r = \frac{n \sum_{i=1}^{n} x_i y_i - \left(\sum_{i=1}^{n} x_i\right)\left(\sum_{i=1}^{n} y_i\right)}{\sqrt{n \sum_{i=1}^{n} x_i^2 - \left(\sum_{i=1}^{n} x_i\right)^2}\,\sqrt{n \sum_{i=1}^{n} y_i^2 - \left(\sum_{i=1}^{n} y_i\right)^2}}    (11.14)

Exercise: Use the (wind tunnel experiment) data from Table 14.1 of Ref [1] to perform a least squares fit (linear regression), and then assess the quality of the fit using the "goodness-of-fit" measure described above.

Example 11.1.1. Example 14.5 of Ref [1].

Example 11.1.2. Example 14.6 of Ref [1].

Exercise: Learn to implement the MATLAB function, linregr, for linear regression and use it to again solve the three example problems in this section.

There are also several other two-parameter functions that on the surface appear to be nonlinear in the parameters, but that we can easily transform into a straight-line regression problem; see Table 11.1. Consider the exponential form. If we apply the natural logarithm to both sides, we get

\ln f = \ln \alpha_1 + \beta_1 x    (11.15)

So, after applying the logarithm, we have an equation that looks like a straight line. Now, if we take ln y_i and use Equation (11.15) to fit the pairs (x_i, ln y_i), we will get values for ln α1 and β1. The final step is to take the exponential of ln α1 to retrieve the value of α1. Table 11.2 summarizes the transformations for each of the functions in Table 11.1.

Example 11.1.3. Case Study 14.6 of Ref [1].

Table 11.1: Two-parameter functions for regression.

    Exponential:               f(x) = α1 e^(β1 x)
    Power law:                 f(x) = α2 x^β2
    Saturation growth rate:    f(x) = α3 x / (x + β3)

Table 11.2: Linearizing transformations for the functions in Table 11.1.

    Transformed function                    Data transformation        Fit parameters
    ln f = ln α1 + β1 x                     (x_i, ln y_i)              ln α1, β1
    log10 f = log10 α2 + β2 log10 x         (log10 x_i, log10 y_i)     log10 α2, β2
    1/f = 1/α3 + (β3/α3)(1/x)               (1/x_i, 1/y_i)             1/α3, β3/α3
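As a concrete illustration of the linearization idea, the following MATLAB sketch fits the exponential model f(x) = α1 e^(β1 x) by applying the straight-line formulas (11.8)-(11.9) to the transformed data (x_i, ln y_i). The function name and interface are our own and are not a routine from Ref [1].

function [alpha1, beta1, r2] = expfit_sketch(x, y)
% Fit y = alpha1*exp(beta1*x) by linear regression on (x, log(y)).
% Assumes all y(i) > 0 so that the logarithm is defined.
x = x(:);  Y = log(y(:));  n = length(x);

% Straight-line least squares, Equations (11.8)-(11.9), applied to (x, Y)
Sx = sum(x); SY = sum(Y); Sxx = sum(x.^2); SxY = sum(x.*Y);
a1 = (n*SxY - Sx*SY) / (n*Sxx - Sx^2);     % slope -> beta1
a0 = (Sxx*SY - SxY*Sx) / (n*Sxx - Sx^2);   % intercept -> ln(alpha1)

beta1  = a1;
alpha1 = exp(a0);

% Coefficient of determination, Equation (11.13), for the transformed fit
St = sum((Y - mean(Y)).^2);
Sr = sum((Y - a0 - a1*x).^2);
r2 = (St - Sr)/St;
end

Note that r2 here measures the quality of the fit in the transformed (logarithmic) variables, not in the original ones.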
11.2 Polynomial Regression

Thus far, we have only considered line functions, which have only two parameters. We can easily generalize the procedure to arbitrary-order polynomials. We'll begin with quadratic polynomials. Consider

f(x) = a_0 + a_1 x + a_2 x^2    (11.16)

The total square error between this function and our given data (x_i, y_i) is

S_r = \sum_{i=1}^{n} \left( y_i - a_0 - a_1 x_i - a_2 x_i^2 \right)^2    (11.17)

Now we have three parameters: a_0, a_1, a_2. As before, the least square error occurs when ∇S_r = 0. In this case,

\frac{\partial S_r}{\partial a_0} = 0    (11.18)
\frac{\partial S_r}{\partial a_1} = 0    (11.19)
\frac{\partial S_r}{\partial a_2} = 0    (11.20)

This yields a (linear) system of three equations in three unknowns. In matrix form, those equations are

\begin{bmatrix} n & \sum x_i & \sum x_i^2 \\ \sum x_i & \sum x_i^2 & \sum x_i^3 \\ \sum x_i^2 & \sum x_i^3 & \sum x_i^4 \end{bmatrix} \begin{bmatrix} a_0 \\ a_1 \\ a_2 \end{bmatrix} = \begin{bmatrix} \sum y_i \\ \sum x_i y_i \\ \sum x_i^2 y_i \end{bmatrix}    (11.21)

(all sums run over i = 1, ..., n). This idea generalizes to any mth-order polynomial. An mth-order polynomial has m + 1 parameters, and following the procedure above yields a linear system of m + 1 equations.

Example 11.2.1. Example 15.1 of Ref [1].

Exercise: Learn to use the MATLAB function, polyfit, for polynomial regression and use it to fit 2nd and 3rd order polynomials for the above example problem.

11.3 Multiple Linear Regression

So far we have only considered fitting data with functions of one independent variable. If our data has multiple independent variables, we can fit the data with multivariable functions. Let us consider data [(x_i, z_i), y_i]. That is, we now have two independent variables (x_i, z_i) and a dependent variable y_i, and we will fit functions of two independent variables, f(x, z). Let's consider the case of fitting a plane:

f(x, z) = a_0 + a_1 x + a_2 z    (11.22)

As before, we construct the sum of the squares of the errors:

S_r = \sum_{i=1}^{n} (y_i - a_0 - a_1 x_i - a_2 z_i)^2    (11.23)

Here, again, we have three parameters a_0, a_1, a_2. The best fit occurs when the gradient is zero: ∇S_r = 0. This yields a linear system with three equations:

\begin{bmatrix} n & \sum x_i & \sum z_i \\ \sum x_i & \sum x_i^2 & \sum x_i z_i \\ \sum z_i & \sum x_i z_i & \sum z_i^2 \end{bmatrix} \begin{bmatrix} a_0 \\ a_1 \\ a_2 \end{bmatrix} = \begin{bmatrix} \sum y_i \\ \sum x_i y_i \\ \sum z_i y_i \end{bmatrix}    (11.24)

In analogy with polynomial regression, we can easily extend this case to any number of dimensions (independent variables) by following the same procedure.

Example 11.3.1. Example 15.2 of Ref [1].
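A minimal MATLAB sketch of the plane fit, forming and solving the 3-by-3 system (11.24) directly, is given below; the function name and interface are our own choices.

function a = planefit_sketch(x, z, y)
% Least-squares fit of f(x,z) = a(1) + a(2)*x + a(3)*z via Equation (11.24).
x = x(:); z = z(:); y = y(:); n = length(y);
N = [ n        sum(x)      sum(z);
      sum(x)   sum(x.^2)   sum(x.*z);
      sum(z)   sum(x.*z)   sum(z.^2) ];
b = [ sum(y); sum(x.*y); sum(z.*y) ];
a = N\b;    % coefficients [a0; a1; a2]
end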
11.4 General Linear Least Squares Regression

To this point, we have considered three forms of linear regression: straight line, polynomial, and multiple dimensions. We can easily encompass all of these cases in a general formulation of the linear regression problem. Given data (x_i, y_i), i = 1, ..., n, where now x_i is potentially a vector (the multiple-dimensions case), we seek to fit a function f(x) of the form

f(x) = a_0 f_0(x) + a_1 f_1(x) + \cdots + a_m f_m(x)    (11.25)

where each f_j(x), j = 0, ..., m, is some function of our independent variables. In the straight-line case, x = x, f_0 = 1, and f_1 = x. In the polynomial case, x = x, f_0 = 1, f_1 = x, ..., f_m = x^m. In the multiple linear regression case, x = (x, z), f_0 = 1, f_1 = x, and f_2 = z. So, in this general form, as long as the function f(x) we wish to use is linear in the unknown coefficients, we can represent the problem in the generalized form discussed above.

To formulate the solution in this general case, we begin at a slightly different point. Instead of directly writing the sum of the squares of the errors, we express the relationship between the data y_i, our fitting function, and the error in matrix form as follows:

y = A a + e    (11.26)

where y is the n × 1 vector of data, e is the n × 1 vector of errors, a is the (m + 1) × 1 vector of unknown coefficients, and A is an n × (m + 1) matrix:

A = \begin{bmatrix} f_0(x_1) & f_1(x_1) & \cdots & f_m(x_1) \\ f_0(x_2) & f_1(x_2) & \cdots & f_m(x_2) \\ \vdots & \vdots & \ddots & \vdots \\ f_0(x_n) & f_1(x_n) & \cdots & f_m(x_n) \end{bmatrix}    (11.27)

That is, each row i corresponds to evaluating our basis functions at the data point x_i. Now, we can rearrange to see that the error is e = y - Aa, and then compute the sum of the squares of the errors as

S_r = e^T e = \sum_{i=1}^{n} \left( y_i - \sum_{j=0}^{m} a_j f_j(x_i) \right)^2    (11.28)

Upon setting ∇S_r = 0, we get the following system of equations:

A^T A\, a = A^T y    (11.29)

These are the so-called normal equations. This is the most general formulation of the linear regression problem. An important consideration in the solution of this system is that its condition number can easily be quite large; in particular, κ(A^T A) ≈ κ(A)^2. Thus, solving the normal equations using the Cholesky decomposition can be numerically difficult if A is even moderately ill-conditioned. Instead, what is typically done is to use the QR decomposition, as it is more numerically stable. This is what is done in MATLAB, for example. In MATLAB, you can form the (non-square) matrix A and the data vector y and simply use the "backslash" command, A\y; the result is the vector of coefficients a.

Example 11.4.1. Example 15.3 of Ref [1].
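The general formulation maps directly onto MATLAB. The sketch below builds A for a user-supplied set of basis functions and solves the least-squares problem with the backslash operator, as described above; the cell-array interface and function name are our own choices for illustration.

function a = genlinfit_sketch(basis, x, y)
% General linear least squares: f(x) = sum_j a(j)*basis{j}(x).
% basis is a cell array of vectorized function handles,
% e.g. {@(x) ones(size(x)), @(x) x, @(x) x.^2}.
x = x(:); y = y(:);
n = length(x); m = length(basis);
A = zeros(n, m);
for j = 1:m
    A(:, j) = basis{j}(x);    % column j: basis function j evaluated at the data
end
a = A\y;                      % least-squares solution (QR-based for rectangular A)
end

For example, a = genlinfit_sketch({@(x) ones(size(x)), @(x) x}, x, y) reproduces the straight-line fit of Section 11.1.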
Chapter 12

Interpolation

In Chapter 11, we considered data that we wished to approximate using a specified function. In this chapter, we consider the case in which we want to construct a function that exactly matches the given data. Such instances arise in many places, e.g. tabulated thermodynamic data, atmospheric data, material properties, etc. Thus, we are given data (x_i, y_i), i = 1, ..., n, and we wish to construct a function with f(x_i) = y_i for all data points. This is called interpolation, and a function f(x) that satisfies f(x_i) = y_i is said to interpolate the data.

12.1 Polynomial Interpolation

We first begin with the case of interpolating our data with a single polynomial, in contrast with the multiple polynomials considered in Section 12.2. See Figure 12.1 for an illustration.

Figure 12.1: Illustration of polynomial interpolation.

12.1.1 Monomial Functions and the Vandermonde Matrix

We first consider polynomial functions based on a sum of monomials:

f(x) = a_1 + a_2 x + a_3 x^2 + \cdots + a_n x^{n-1}    (12.1)

As we can see right away, if we have n data points, there are n coefficients to be determined. Thus, for n data points, we interpolate using a polynomial of order n - 1. Now, we use the interpolation condition, f(x_i) = y_i, for each data pair. This gives us n equations, which we can express in matrix form:

\begin{bmatrix} 1 & x_1 & x_1^2 & \cdots & x_1^{n-1} \\ 1 & x_2 & x_2^2 & \cdots & x_2^{n-1} \\ 1 & x_3 & x_3^2 & \cdots & x_3^{n-1} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & x_n & x_n^2 & \cdots & x_n^{n-1} \end{bmatrix} \begin{bmatrix} a_1 \\ a_2 \\ a_3 \\ \vdots \\ a_n \end{bmatrix} = \begin{bmatrix} y_1 \\ y_2 \\ y_3 \\ \vdots \\ y_n \end{bmatrix}    (12.2)

Solving this linear system will yield the coefficients of our polynomial and, therefore, give us the interpolating function of our data. However, there is one problem. The matrix in Equation (12.2) is known as the Vandermonde matrix, and it is notoriously ill-conditioned. So ill-conditioned, in fact, that it is effectively unusable.

Example 12.1.1. Vandermonde System for Interpolation

Consider the data (300, 0.616), (400, 0.525), and (500, 0.457). Since we have three data points, we will interpolate with a quadratic polynomial. Assembling the linear system following Equation (12.2), we have

\begin{bmatrix} 1 & 300 & 90{,}000 \\ 1 & 400 & 160{,}000 \\ 1 & 500 & 250{,}000 \end{bmatrix} \begin{bmatrix} a_1 \\ a_2 \\ a_3 \end{bmatrix} = \begin{bmatrix} 0.616 \\ 0.525 \\ 0.457 \end{bmatrix}

MATLAB reports that the condition number of this matrix is 5.89 × 10^6.

Because of the ill-conditioning of such systems, we must resort to other forms of polynomials for interpolation. These will still give the same curve: there is only one polynomial that interpolates the given data points, but we can write it in a different form, more conducive to numerical implementation.

12.1.2 Lagrange Polynomials

We consider here functions that are a sum of Lagrange polynomials:

f(x) = y_1 L_1(x) + y_2 L_2(x) + \cdots + y_n L_n(x)    (12.3)

The key idea here is that the coefficient in front of each of the Lagrange polynomials, L_i(x), is the data we are trying to interpolate. In particular, the Lagrange polynomials possess the property that L_i(x_i) = 1, i = 1, ..., n, and L_i(x_j) = 0 for i ≠ j. This immediately gives us our interpolation condition, f(x_i) = y_i.

Let's begin with the linear case (i.e., two data points):

f(x) = y_1 L_1(x) + y_2 L_2(x)    (12.4)

We need L_1(x_1) = 1 and L_1(x_2) = 0. Similarly, we need L_2(x_1) = 0 and L_2(x_2) = 1. These conditions naturally give the following forms for the Lagrange polynomials:

L_1(x) = \frac{x - x_2}{x_1 - x_2}, \qquad L_2(x) = \frac{x - x_1}{x_2 - x_1}

and thus

f(x) = y_1 \frac{x - x_2}{x_1 - x_2} + y_2 \frac{x - x_1}{x_2 - x_1}

The quadratic (three data point) case is very similar:

L_1(x) = \frac{(x - x_2)(x - x_3)}{(x_1 - x_2)(x_1 - x_3)}, \qquad L_2(x) = \frac{(x - x_1)(x - x_3)}{(x_2 - x_1)(x_2 - x_3)}, \qquad L_3(x) = \frac{(x - x_1)(x - x_2)}{(x_3 - x_1)(x_3 - x_2)}

and, therefore,

f(x) = y_1 \frac{(x - x_2)(x - x_3)}{(x_1 - x_2)(x_1 - x_3)} + y_2 \frac{(x - x_1)(x - x_3)}{(x_2 - x_1)(x_2 - x_3)} + y_3 \frac{(x - x_1)(x - x_2)}{(x_3 - x_1)(x_3 - x_2)}

Thus, we see that interpolating functions built from Lagrange polynomials have n coefficients, y_i, and n Lagrange polynomials of order n - 1. We can write such functions succinctly as

f(x) = \sum_{i=1}^{n} y_i L_i(x), \qquad L_i(x) = \prod_{\substack{j=1 \\ j \neq i}}^{n} \frac{x - x_j}{x_i - x_j}    (12.5)

So far, we have developed interpolants based on single functions. One major drawback of such an approach is that high-order polynomials tend to be highly oscillatory, even when the data is quite smooth; see Figure 12.2. Thus, for large quantities of data, a single interpolating function is not a practical solution. One remedy is, instead of a single high-order polynomial, to use many low-order polynomials together. This is the notion of spline functions.

Figure 12.2: Runge function (red), 1/(1 + 25x^2), with 5th order interpolating polynomial (blue) and 9th order interpolating polynomial (green); interpolations done on equidistant points in the range [-1, 1]. Taken from https://en.wikipedia.org/wiki/Runge%27s_phenomenon
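A short MATLAB sketch of Equation (12.5), evaluating the Lagrange-form interpolant at a set of query points, is given below; the function name is our own.

function f = lagrange_eval_sketch(xd, yd, x)
% Evaluate the Lagrange-form interpolant of the data (xd, yd) at the points x.
n = length(xd);
f = zeros(size(x));
for i = 1:n
    Li = ones(size(x));
    for j = [1:i-1, i+1:n]
        Li = Li .* (x - xd(j)) / (xd(i) - xd(j));   % build L_i(x), Eq. (12.5)
    end
    f = f + yd(i) * Li;
end
end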
12.2 Splines

The idea of splines is to perform "piecewise interpolation" of our data: on each interval (x_i, x_{i+1}), we interpolate the data using a low-order polynomial. The points at which the pieces meet are called "knots" (the knots tie together each of the polynomials into a single function). Figure 12.3 illustrates a spline interpolant. It is important to note that splines can interpolate a data set of size n (i.e., n points) even when n is much larger than the order of the spline polynomial.

Figure 12.3: Illustration of spline interpolation.

12.2.1 Linear Splines

First, we consider a linear spline. In this case, we use a linear function in each data interval (x_i, x_{i+1}), i = 1, ..., n - 1. In particular,

f_i(x) = a_i + b_i (x - x_i), \qquad i = 1, \ldots, n-1    (12.6)

The first coefficient is determined by the interpolation condition, f_i(x_i) = y_i, giving a_i = y_i. The second coefficient is also determined by interpolation, but at the other point in the interval: f_i(x_{i+1}) = y_{i+1}. This gives b_i = (y_{i+1} - y_i)/(x_{i+1} - x_i).

So, for each interval, we have a different function. Thus, compared to the polynomial interpolation discussed in Section 12.1, we have an additional step: given a value of x, we must determine in which interval x lies in order to decide which of our spline functions is to be used. Thus, when using spline interpolants, we should make use of efficient search algorithms, e.g. binary search, to find the correct interval.

Example 12.2.1. Examples 18.1 and 18.2 of Ref [1].

Another consideration arises when we need derivatives of our interpolating function. In the linear spline case, the first derivative is not continuous at the knots, and the second derivative is not even defined there; see Figure 12.4. When derivative information is needed, we need to use higher-order spline functions so that we may enforce continuity of derivatives. In general, splines of order n + 1 are needed to yield n continuous derivatives. Perhaps the most common higher-order splines are cubic splines.

Figure 12.4: Illustration of the derivative of a linear spline interpolant.
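Before moving on to cubic splines, here is a MATLAB sketch of linear-spline evaluation, including the interval lookup discussed above; the function name is ours, and a simple search with find is used in place of a binary search to keep the sketch short.

function f = linspline_eval_sketch(xd, yd, x)
% Evaluate the linear spline through (xd, yd) at the query points x.
% Assumes xd is sorted in increasing order and x lies within [xd(1), xd(end)].
n = length(xd);
f = zeros(size(x));
for k = 1:numel(x)
    i = find(xd <= x(k), 1, 'last');       % locate the interval containing x(k)
    i = min(max(i, 1), n-1);               % keep the index in [1, n-1]
    a = yd(i);
    b = (yd(i+1) - yd(i)) / (xd(i+1) - xd(i));
    f(k) = a + b*(x(k) - xd(i));           % Equation (12.6)
end
end

MATLAB's interp1(xd, yd, x) performs the same computation with a more efficient search.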
12.2.2 Cubic Splines

For cubic splines, the function on each interval takes the form

f_i(x) = a_i + b_i (x - x_i) + c_i (x - x_i)^2 + d_i (x - x_i)^3, \qquad i = 1, \ldots, n-1    (12.7)

for n data points (x_i, y_i). We have n - 1 intervals, with four coefficients per interval; thus we need 4(n - 1) conditions to determine all the coefficients.

The first condition is that the function must interpolate the data: f_i(x_i) = y_i, i = 1, ..., n - 1. This gives a_i = y_i. The second condition is that the spline must be continuous at the knots: f_i(x_{i+1}) = y_{i+1}. Letting h_i = x_{i+1} - x_i, we have

a_i + b_i h_i + c_i h_i^2 + d_i h_i^3 = y_{i+1}, \qquad i = 1, \ldots, n-1

The third condition is that the first derivative of the spline must be continuous at the knots: f_i'(x_{i+1}) = f_{i+1}'(x_{i+1}). The derivative is

f_i'(x) = b_i + 2 c_i (x - x_i) + 3 d_i (x - x_i)^2    (12.8)

Thus, applying our third condition, we have the following n - 2 conditions:

b_i + 2 c_i h_i + 3 d_i h_i^2 = b_{i+1}, \qquad i = 1, \ldots, n-2    (12.9)

The fourth set of conditions is that the second derivative must be continuous at the knots: f_i''(x_{i+1}) = f_{i+1}''(x_{i+1}). The second derivative is

f_i''(x) = 2 c_i + 6 d_i (x - x_i)    (12.10)

Thus, applying our fourth condition, we have the following n - 2 conditions:

c_i + 3 d_i h_i = c_{i+1}, \qquad i = 1, \ldots, n-2    (12.11)

These interpolation and continuity conditions give us 2(n - 1) + 2(n - 2) constraints. We still need two more conditions to fully constrain the system for all the unknown coefficients. We have many choices, but three common ones are:

1. Natural condition: f_1''(x_1) = f_{n-1}''(x_n) = 0.

2. Clamped end condition: f_1'(x_1) = A_1 and f_{n-1}'(x_n) = A_2, where A_1, A_2 are given numbers.

3. "Not-a-knot" condition: f_1'''(x_2) = f_2'''(x_2) and f_{n-2}'''(x_{n-1}) = f_{n-1}'''(x_{n-1}).

Choosing one of these sets of conditions yields a system of equations whose solution gives the coefficients of the cubic spline on each of the n - 1 intervals. In fact, this system can be written in tridiagonal form, yielding an efficient solution strategy. In MATLAB, one may use the spline function to construct cubic splines with the "not-a-knot" condition, as well as the interp1 function, which will construct linear splines, cubic splines, etc., depending on the method supplied by the user.

Example 12.2.2. Example 18.3 of Ref [1].

Exercise: Generate and plot the upper half of an airfoil by fitting a cubic spline to the following truncated airfoil data:

x = [0, 1, 2, 4, 8, 16, 24, 40, 56, 72, 80]/80;
y = [0, 28, 39, 53, 70, 86, 90, 79, 55, 22, 2]/1000;

where (x, y) represents 11 points on the airfoil. Re-do this problem by using the MATLAB function, spline.
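One possible MATLAB script for the exercise above uses the built-in spline function (a not-a-knot cubic spline) to evaluate and plot the interpolant on a fine grid; the evaluation grid and plot labels are our own choices.

% Cubic spline fit of the truncated airfoil data from the exercise above
x = [0, 1, 2, 4, 8, 16, 24, 40, 56, 72, 80]/80;
y = [0, 28, 39, 53, 70, 86, 90, 79, 55, 22, 2]/1000;

xx = linspace(0, 1, 200);        % fine grid for plotting
yy = spline(x, y, xx);           % not-a-knot cubic spline interpolant

plot(x, y, 'o', xx, yy, '-')
xlabel('x'), ylabel('y'), title('Upper surface of airfoil (cubic spline)')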
Chapter 13

Numerical Integration

The integration of functions is a common task in engineering applications. However, we do not always have the luxury of functions that can be integrated analytically. We may not even have analytical functions to begin with; we may have only data points! Thus, we need numerical schemes to integrate functions (or data) in such circumstances. The principal idea is that we approximate the integral of a function as a weighted sum of evaluations of that function:

\int_a^b f(x)\, dx \approx \sum_{i=1}^{n} c_i f(x_i)    (13.1)

where the c_i are weights and the x_i ∈ [a, b] are evaluation points. Different numerical integration methods have different weights and evaluation points, but we can always reduce the methods to this primitive form. There are two primary classes of methods that we'll consider here: the so-called Newton-Cotes formulae, suitable in many circumstances, including integrating data when an analytical function is not available, and Gaussian quadrature formulae, typically used when the integrand can be evaluated but cannot be integrated analytically.

13.1 Newton-Cotes Rules

Newton-Cotes rules are based on a very simple idea: if we evaluate our function at equally spaced points, or we only have equally spaced data points, then we can interpolate using polynomials and integrate the resulting interpolant.

13.1.1 Trapezoidal Rule

If we use a linear interpolant, then we can integrate this interpolating function exactly. The area under this curve is a trapezoid, motivating the name trapezoidal rule for this numerical method; see Figure 13.1. To derive the final rule, we simply perform the integration:

\int_a^b f(x)\, dx \approx \int_a^b \left[ f(a) + \frac{f(b) - f(a)}{b - a}(x - a) \right] dx = f(a)(b - a) + (b - a)\frac{f(b) - f(a)}{2} = (b - a)\,\frac{f(a) + f(b)}{2}

So the trapezoidal rule has coefficients c_1 = c_2 = (b - a)/2 with evaluation points x_1 = a and x_2 = b.

Figure 13.1: Illustration of trapezoidal rule.

As has been done many times previously, we can examine the error in this approximation by considering Taylor expansions of our function. We omit the details and simply state that, for the trapezoidal rule, the error is

|E| = \frac{1}{12} \left| f''(\xi) \right| (b - a)^3, \qquad \xi \in [a, b]    (13.2)

In particular, we observe that, since the error is controlled by the second derivative, the trapezoidal rule is exact for constant and linear functions! This is not surprising, since the development of the trapezoidal rule began with linear interpolation.

Example 13.1.1. Example 19.1 of Ref [1].

Exercise: Use the MATLAB function, trapz, to solve the above example problem.

13.1.2 Simpson's Rule

If instead we use a quadratic polynomial to interpolate, as opposed to a linear function, we arrive at Simpson's rule. Let x_1 = a, x_2 = (a + b)/2, x_3 = b. Then we use a Lagrange interpolant to derive Simpson's rule:

\int_a^b f(x)\, dx \approx \int_a^b \left[ \frac{(x - x_2)(x - x_3)}{(x_1 - x_2)(x_1 - x_3)} f(x_1) + \frac{(x - x_1)(x - x_3)}{(x_2 - x_1)(x_2 - x_3)} f(x_2) + \frac{(x - x_1)(x - x_2)}{(x_3 - x_1)(x_3 - x_2)} f(x_3) \right] dx = \frac{b - a}{6}\left( f(x_1) + 4 f(x_2) + f(x_3) \right)

Therefore, we see that, for Simpson's rule, c_1 = c_3 = (b - a)/6 and c_2 = 2(b - a)/3. As was the case with the trapezoidal rule, we can use a Taylor analysis to examine the error in Simpson's rule. Again omitting the details,

|E| = \frac{(b - a)^5}{2880} \left| f^{(4)}(\xi) \right|, \qquad \xi \in [a, b]

Interestingly, because the error is controlled by the fourth derivative, not only are quadratic polynomials integrated exactly, as we would expect, but so are cubic functions.

Example 13.1.2. Example 19.3 of Ref [1].

13.1.3 Composite Rules

Although we could conceptually proceed with higher-order polynomials to achieve more accuracy in our integration rules, we can follow a simpler strategy. Just as with interpolation, instead of pursuing high-order, oscillatory polynomials, we can subdivide the interval [a, b] into equally spaced segments, apply our integration rule on each subinterval, and then sum the results. See Figure 13.2 for an illustration. These rules are called composite integration rules. We proceed simply by decomposing our integral over the subintervals:

\int_a^b f(x)\, dx = \int_{x_0}^{x_1} f(x)\, dx + \int_{x_1}^{x_2} f(x)\, dx + \cdots + \int_{x_{n-1}}^{x_n} f(x)\, dx

Suppose we have n equally spaced segments, so that each interval has length h = (b - a)/n. If we apply the trapezoidal rule on each of the intervals, we get the composite trapezoidal rule:

\int_a^b f(x)\, dx \approx \frac{h}{2}\left( f(x_0) + f(x_1) \right) + \frac{h}{2}\left( f(x_1) + f(x_2) \right) + \cdots + \frac{h}{2}\left( f(x_{n-1}) + f(x_n) \right) = \frac{h}{2}\left[ f(x_0) + 2 \sum_{i=1}^{n-1} f(x_i) + f(x_n) \right]

Figure 13.2: Illustration of composite numerical integration rules.

Example 13.1.3. Example 19.2 of Ref [1].

Exercise: Use the MATLAB function, cumtrapz, to solve the above example problem.

Similarly, we can apply Simpson's rule. However, we must work with two intervals at a time, since we need three points to apply the rule. Thus, to apply the composite Simpson's rule, we must have an even number of intervals (or, equivalently, an odd number of points). Proceeding, we have

\int_a^b f(x)\, dx \approx \frac{2h}{6}\left( f(x_0) + 4 f(x_1) + f(x_2) \right) + \frac{2h}{6}\left( f(x_2) + 4 f(x_3) + f(x_4) \right) + \cdots + \frac{2h}{6}\left( f(x_{n-2}) + 4 f(x_{n-1}) + f(x_n) \right) = \frac{h}{3}\left[ f(x_0) + 2 \sum_{i=2,4,6,\ldots}^{n-2} f(x_i) + 4 \sum_{i=1,3,5,\ldots}^{n-1} f(x_i) + f(x_n) \right]
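The composite rules translate directly into a few lines of MATLAB. The sketches below (our own function names, each in its own file or as local functions) implement both composite rules for a vectorized function handle f on [a, b] with n equal segments; n must be even for Simpson's rule.

function I = comp_trap_sketch(f, a, b, n)
% Composite trapezoidal rule with n equal segments; f must accept vectors.
x = linspace(a, b, n+1);  h = (b - a)/n;
fx = f(x);
I = (h/2)*(fx(1) + 2*sum(fx(2:end-1)) + fx(end));
end

function I = comp_simpson_sketch(f, a, b, n)
% Composite Simpson's rule with n equal segments (n must be even).
x = linspace(a, b, n+1);  h = (b - a)/n;
fx = f(x);
I = (h/3)*(fx(1) + 4*sum(fx(2:2:end-1)) + 2*sum(fx(3:2:end-2)) + fx(end));
end

For data given only at the points x_i, MATLAB's trapz(x, y) and cumtrapz(x, y) provide the composite trapezoidal rule and its running integral, respectively.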
To assess the error of a composite rule, we sum the contributions from each of the subintervals. For the composite trapezoidal rule, the error on one interval is given by Equation (13.2) with b - a replaced by h = (b - a)/n, so the total error is

E_t = \sum_{i=1}^{n} \frac{(b-a)^3}{12 n^3} f''(\xi_i) = \frac{(b-a)^3}{12 n^3} \sum_{i=1}^{n} f''(\xi_i)

The sum is now only over the second derivative values, and this looks just like an average:

\bar{f}'' = \frac{1}{n} \sum_{i=1}^{n} f''(\xi_i)

Thus, we have

E_t = \frac{(b-a)^3}{12 n^3}\, n \bar{f}'' = \frac{(b-a)^3}{12 n^2} \bar{f}'' = \frac{b-a}{12}\, h^2 \bar{f}''    (13.3)

where, again, h is the interval spacing. Thus, we see that the composite trapezoidal rule is O(h^2). A similar argument holds for the composite Simpson's rule: each application of the rule spans two intervals of width h, with error (2h)^5 f^{(4)}(\xi_i)/2880 = h^5 f^{(4)}(\xi_i)/90, and there are n/2 applications, so

E_t = \frac{n}{2} \cdot \frac{h^5}{90}\, \bar{f}^{(4)} = \frac{b-a}{180}\, h^4 \bar{f}^{(4)}    (13.4)

Here, we see that the composite Simpson's rule is O(h^4).

Example 13.1.4. Example 19.4 of Ref [1].

Additional examples for applying the trapezoidal and Simpson's rules of numerical integration:

Example 13.1.5. Example 19.5 of Ref [1].

Example 13.1.6. Case Study 19.9 of Ref [1].

13.2 Gauss Quadrature

So far, we have considered numerical integration rules that can be applied both to functions and to datasets for which we have no explicit function to evaluate. If we further pursue the case where we do have a function that we can evaluate, there are opportunities for more accurate integration rules. In particular, we can take advantage of cancellation of errors. Consider the illustration in Figure 13.3: if we carefully select the points at which we evaluate the function, we can better balance the positive and negative errors that we incur in the integration.

Figure 13.3: Illustration of error incurred in integration approximation.

Gaussian quadrature rules are built by choosing (optimizing) the coefficients c_i and the evaluation points x_i such that polynomials up to a certain order are integrated exactly. This procedure is called the method of undetermined coefficients. To illustrate the process, we re-derive the trapezoidal rule by following the procedure for choosing the coefficients c_i (the points x_i are already fixed for the trapezoidal rule). Here,

\int_a^b f(x)\, dx = c_0 f(a) + c_1 f(b)

So we have two coefficients to determine, c_0 and c_1, and we can therefore enforce two conditions: first, that we integrate constant functions exactly; second, that we integrate linear functions exactly. If we take f(x) = 1 (a constant function), then

\int_a^b 1\, dx = b - a = c_0 (1) + c_1 (1)

Similarly, if we choose f(x) = x (a linear function), then

\int_a^b x\, dx = \frac{b^2 - a^2}{2} = c_0 a + c_1 b

This gives a linear system of equations. Solving, we find, as we expect, c_0 = c_1 = (b - a)/2.

Now we follow the same approach, but allow x_0 and x_1 to vary as well. For simplicity of the derivation, we consider integration over the interval [-1, 1]; we will discuss later how to apply these results to a general interval [a, b]. Thus,

\int_{-1}^{1} f(x)\, dx \approx c_0 f(x_0) + c_1 f(x_1)

Now we have 4 unknowns, so we can enforce four constraints. We will seek to exactly integrate constant, linear, quadratic, and cubic functions. Therefore, we have the following system of nonlinear equations:

\int_{-1}^{1} 1\, dx = 2 = c_0 + c_1    (13.5)
\int_{-1}^{1} x\, dx = 0 = c_0 x_0 + c_1 x_1    (13.6)
\int_{-1}^{1} x^2\, dx = \frac{2}{3} = c_0 x_0^2 + c_1 x_1^2    (13.7)
\int_{-1}^{1} x^3\, dx = 0 = c_0 x_0^3 + c_1 x_1^3    (13.8)

In this case, we can solve the equations analytically. Solve (13.6) for c_1 and substitute into (13.8):

c_1 = -\frac{c_0 x_0}{x_1} \;\Rightarrow\; c_0 x_0^3 - \frac{c_0 x_0}{x_1} x_1^3 = 0 \;\Rightarrow\; x_0^2 = x_1^2

Since x_0 ≠ x_1, we must have x_0 = -x_1. Substituting this result into (13.6), we find c_0 = c_1. Using this result with (13.5), we find c_0 = c_1 = 1. Finally, using these results in (13.7), we find x_1 = 1/\sqrt{3} and, therefore, x_0 = -1/\sqrt{3}.

Notice that this integration rule evaluates integrals of polynomials up to cubic exactly with only two function evaluations. As such, it is called a two-point quadrature rule. Following the procedure above, one can derive the equations and integration rules for n points; such rules are tabulated. In general, an n-point Gaussian quadrature rule will integrate polynomials of order 2n - 1 exactly.
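A minimal MATLAB sketch of the two-point rule on the reference interval [-1, 1] derived above; the function name is our own.

function I = gauss2_ref_sketch(f)
% Two-point Gauss quadrature on [-1, 1]:
% both weights equal 1 and the points are +/- 1/sqrt(3).
t = 1/sqrt(3);
I = f(-t) + f(t);
end

For example, gauss2_ref_sketch(@(x) x.^3 + x.^2) reproduces, up to roundoff, the exact value 2/3 of the integral of x^3 + x^2 over [-1, 1], consistent with the rule being exact for cubics.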
To this point, our two-point quadrature rule is valid for integrals posed on the interval [-1, 1]. To apply this integration rule to a general interval [a, b], we must use a change of variables:

\int_a^b f(x)\, dx = \int_{-1}^{1} f(g(t))\, g'(t)\, dt    (13.9)

Take g(t) = a_1 + a_2 t. Then

g(-1) = a = a_1 + a_2(-1), \qquad g(1) = b = a_1 + a_2(1)

so that

a_1 = \frac{a + b}{2}, \qquad a_2 = \frac{b - a}{2}

So, we can map t ∈ (-1, 1) to x ∈ (a, b) as

x = \frac{(b + a) + (b - a)\, t}{2}, \qquad dx = \frac{b - a}{2}\, dt    (13.10)

Example 13.2.1. Example 20.3 of Ref [1].

Bibliography

[1] S. C. Chapra. Applied Numerical Methods with MATLAB for Engineers & Scientists. McGraw-Hill, 3rd edition, 2011.