
Unit 16: Matrix Equations and the Inverse of a Matrix

Now that we have defined matrix multiplication, we can better understand where the matrices we used in solving systems of linear equations came from. Consider a system of $m$ equations in the $n$ unknowns $x_1, x_2, \ldots, x_n$. Let $a_{ij}$ be the coefficient of the $j$th variable in the $i$th equation, so that $A = (a_{ij})$ is the coefficient matrix for this SLE. Let $X$ be a column vector whose entries are the variables in the system, and let $B$ be a column vector whose entries are the right hand side values of the equations. Then $X$ is an $n \times 1$ matrix and $B$ is an $m \times 1$ matrix. If we form the matrix product $AX$, we have:
$$
AX =
\begin{pmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
a_{m1} & a_{m2} & \cdots & a_{mn}
\end{pmatrix}
\begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix}
=
\begin{pmatrix}
a_{11}x_1 + a_{12}x_2 + \ldots + a_{1n}x_n \\
a_{21}x_1 + a_{22}x_2 + \ldots + a_{2n}x_n \\
\vdots \\
a_{m1}x_1 + a_{m2}x_2 + \ldots + a_{mn}x_n
\end{pmatrix}
$$
and we see that $AX$ is a column vector whose entries are the left hand sides of the equations. Thus, the matrix equation $AX = B$ simply says that the $i$th entry of the column vector $AX$ must be equal to the $i$th entry of the column vector $B$, i.e., that for each equation in the system, the left hand side of the equation must equal the right hand side of the equation. We see that $AX = B$ is a matrix representation of the SLE. The augmented matrix $(A|B)$, i.e., the coefficient matrix for the system with the column vector of right hand side values appended as an extra column, is just a form of shorthand for this matrix equation $AX = B$.
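The fact that $AX$ collects the left-hand sides of all the equations at once is easy to check numerically. The sketch below is my own illustration (not part of the notes), using numpy and a small made-up $2 \times 2$ system:

```python
import numpy as np

# A hypothetical 2x2 system, purely for illustration:
#   2x + 3y = 8
#    x -  y = -1
A = np.array([[2.0, 3.0],
              [1.0, -1.0]])  # coefficient matrix
b = np.array([8.0, -1.0])    # right-hand-side vector

x = np.array([1.0, 2.0])     # candidate solution: x = 1, y = 2

# A @ x evaluates the left-hand side of every equation at once, so
# the matrix equation AX = B holds exactly when each entry of A @ x
# equals the corresponding entry of b.
print((A @ x).tolist())        # [8.0, -1.0]
print(np.allclose(A @ x, b))   # True
```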
For instance, the SLE
$$
\begin{array}{rcrcrcr}
3x & + & 3y & + & 12z & = & 6 \\
x & + & y & + & 4z & = & 2 \\
2x & + & 5y & + & 20z & = & 10 \\
-x & + & 2y & + & 8z & = & 4
\end{array}
$$
(from Example 6 in Unit 14) can be written as the matrix equation:
$$
\begin{pmatrix} 3 & 3 & 12 \\ 1 & 1 & 4 \\ 2 & 5 & 20 \\ -1 & 2 & 8 \end{pmatrix}
\begin{pmatrix} x \\ y \\ z \end{pmatrix}
=
\begin{pmatrix} 6 \\ 2 \\ 10 \\ 4 \end{pmatrix}
$$
which says that:
$$
\begin{pmatrix} 3x + 3y + 12z \\ x + y + 4z \\ 2x + 5y + 20z \\ -x + 2y + 8z \end{pmatrix}
=
\begin{pmatrix} 6 \\ 2 \\ 10 \\ 4 \end{pmatrix}
$$
and the matrix equation can be written in shorthand as the augmented matrix:
$$
\left(\begin{array}{rrr|r} 3 & 3 & 12 & 6 \\ 1 & 1 & 4 & 2 \\ 2 & 5 & 20 & 10 \\ -1 & 2 & 8 & 4 \end{array}\right)
$$
Notice that when we solve the SLE $AX = B$, the solution is a set of values for the unknowns in the column vector $X$. When we state a solution to a system of equations, it is usually more convenient to express it as a vector, rather than as a column vector. That is, if the solution to an SLE involving $x$, $y$ and $z$ is $x = 1$, $y = 2$ and $z = 3$, we generally write this as $(x, y, z) = (1, 2, 3)$ rather than
$$
\begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} 1 \\ 2 \\ 3 \end{pmatrix}
$$
Because an $n \times 1$ column vector can also be written as an $n$-vector, we often give column vectors names in vector notation, rather than in matrix notation. That is, rather than talking about the column vectors $X$ and $B$, we can refer to these as the vectors $\vec{x}$ and $\vec{b}$. Thus, we usually write the matrix form of an SLE as $A\vec{x} = \vec{b}$, rather than as $AX = B$. This is the convention which we will use from now on. However, you should bear in mind that when we write $A\vec{x} = \vec{b}$ for a system of $m$ equations in $n$ variables, since $A$ is (as always) the $m \times n$ coefficient matrix, $\vec{x}$, the vector of variables, is used here as an $n \times 1$ column vector, and $\vec{b}$, the vector of right hand side values, is actually an $m \times 1$ column vector, so that this equation makes sense mathematically.

Definition 16.1. Any SLE involving $m$ equations and $n$ variables can be represented by the matrix form of the SLE, $A\vec{x} = \vec{b}$, where $A$ is the $m \times n$ coefficient matrix, $\vec{x}$ is the vector (technically, column vector) of the unknowns and $\vec{b}$ is the vector (technically, column vector) of right hand side values.
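As a quick numerical check of the four-equation system above, the following sketch (my own addition, numpy assumed) confirms that multiplying the coefficient matrix by a solution vector reproduces the right-hand sides. The particular solution $(0, 2, 0)$ is one solution of this system, as direct substitution verifies:

```python
import numpy as np

# The four-equation, three-unknown system above, in matrix form A x = b.
A = np.array([[ 3, 3, 12],
              [ 1, 1,  4],
              [ 2, 5, 20],
              [-1, 2,  8]], dtype=float)
b = np.array([6.0, 2.0, 10.0, 4.0])

# (x, y, z) = (0, 2, 0) satisfies all four equations, so the product
# A @ x reproduces the right-hand-side vector entry by entry.
x = np.array([0.0, 2.0, 0.0])
print((A @ x).tolist())        # [6.0, 2.0, 10.0, 4.0]
print(np.allclose(A @ x, b))   # True
```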
Both $\vec{x}$ and $\vec{b}$ can be written out either as column vectors or as $m$- or $n$-vectors, whichever is most convenient in the present context.

We see that solving a system of linear equations is equivalent to solving the equation $A\vec{x} = \vec{b}$ for the vector $\vec{x}$. If we had a single equation in one unknown, of the form $ax = b$, where $x$ is a variable and $a$ and $b$ are scalars, we could solve this easily by multiplying both sides of the equation by the inverse of $a$, $a^{-1}$, to get $x = a^{-1}b$. It is sometimes possible to do something analogous to this approach in solving the equation $A\vec{x} = \vec{b}$. To do this, we need to define the inverse of a matrix.

Definition 16.2. Let $A$ be a square matrix. If there exists a matrix $B$ with the same dimension as $A$ such that $AB = BA = I$, then we say that $A$ is invertible (or nonsingular) and that $B$ is the inverse of $A$, written $B = A^{-1}$. If $A$ has no inverse (i.e., if no such matrix $B$ exists), then $A$ is said to be noninvertible (or singular).

Notice: Only square matrices can have inverses.

Example 1. For $A = \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix}$ and $B = \begin{pmatrix} -2 & 1 \\ \frac{3}{2} & -\frac{1}{2} \end{pmatrix}$, show that the matrix $B$ is the inverse of matrix $A$.

Solution: We must show that $AB = BA = I$:
$$
AB = \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix}\begin{pmatrix} -2 & 1 \\ \frac{3}{2} & -\frac{1}{2} \end{pmatrix}
= \begin{pmatrix} (1)(-2) + (2)(3/2) & (1)(1) + (2)(-1/2) \\ (3)(-2) + (4)(3/2) & (3)(1) + (4)(-1/2) \end{pmatrix}
= \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} = I
$$
$$
BA = \begin{pmatrix} -2 & 1 \\ \frac{3}{2} & -\frac{1}{2} \end{pmatrix}\begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix}
= \begin{pmatrix} (-2)(1) + (1)(3) & (-2)(2) + (1)(4) \\ (3/2)(1) + (-1/2)(3) & (3/2)(2) + (-1/2)(4) \end{pmatrix}
= \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} = I
$$
Therefore, $AB = BA = I$ and so $B$ is the inverse of $A$, i.e., $B = A^{-1}$.

Theorem 16.3. If $A$ is invertible then its inverse is unique.

Proof: Suppose that a square matrix $A$ has two inverses, say $B$ and $C$. We show that it must be true that $B = C$. Let $n$ be the order of the square matrix $A$. Then $B$ and $C$ must also be square matrices of order $n$. Of course, the $n \times n$ matrix $B$ can be multiplied by the identity matrix of order $n$. This matrix multiplication leaves the matrix unchanged.
That is, we have $B = BI$. However, since $C$ is an inverse of $A$, then by definition $AC = I$ (with order $n$), so we can write $BI = B(AC)$. We know that matrix multiplication is associative (see properties of matrix operations, Unit 15). Thus, we have $B(AC) = (BA)C$. But since $B$ is also an inverse of $A$, then, again by definition, $BA = I$ and so we have $(BA)C = IC$. Finally, we know that multiplying the square matrix $C$ by the identity matrix of the same order leaves the matrix unchanged, so we get $IC = C$. Putting this all together, we have:
$$
B = BI = B(AC) = (BA)C = IC = C
$$
and we see that, in fact, $B = C$. That is, we see that in order for $B$ and $C$ to both be inverses of $A$, they must be the same matrix, so any invertible matrix has a unique inverse.

Theorem 16.4. Let $A$ be a square matrix. If a square matrix $B$ exists with $AB = I$, then $BA = I$ as well, so in fact $B = A^{-1}$.

What this theorem tells us is that, for instance, we only really needed to compute one of $AB$ or $BA$ in Example 1 to prove that $B$ is the inverse of $A$.

We next outline an extremely important procedure for finding the inverse of a given matrix (or determining that the matrix is singular, i.e., that no inverse exists). This procedure involves row-reducing an augmented matrix. This augmented matrix differs from those we used previously in that we are augmenting the matrix by appending more than one column. The procedure for transforming the augmented matrix to RREF is not affected by this change, though.

Procedure for finding the inverse of a matrix: Let $A$ be a square matrix and let $I$ be the identity matrix of the same order as $A$. Form the augmented matrix $[A|I]$ and transform it to row-reduced echelon form. Let $[C|D]$ be the final augmented matrix in RREF. Then
(1) if $C = I$ then $A^{-1} = D$;
(2) if $C$ is not the identity matrix, then $A$ is not invertible, i.e., is singular.

Example 2.
Find $A^{-1}$, if it exists, for
$$
\text{(a) } A = \begin{pmatrix} 1 & 1 & 1 \\ 1 & 2 & 3 \\ 2 & 1 & 2 \end{pmatrix}
\qquad
\text{(b) } A = \begin{pmatrix} 1 & 1 & 1 \\ 1 & 2 & 3 \\ 2 & 3 & 4 \end{pmatrix}
$$
Solution: (a) We wish to find the inverse of a $3 \times 3$ matrix, so we start by forming the augmented matrix obtained by appending the $3 \times 3$ identity matrix. We then row reduce this augmented matrix to obtain RREF.
$$
[A|I] = \left(\begin{array}{rrr|rrr} 1 & 1 & 1 & 1 & 0 & 0 \\ 1 & 2 & 3 & 0 & 1 & 0 \\ 2 & 1 & 2 & 0 & 0 & 1 \end{array}\right)
\xrightarrow[R_3 \to R_3 - 2R_1]{R_2 \to R_2 - R_1}
\left(\begin{array}{rrr|rrr} 1 & 1 & 1 & 1 & 0 & 0 \\ 0 & 1 & 2 & -1 & 1 & 0 \\ 0 & -1 & 0 & -2 & 0 & 1 \end{array}\right)
$$
$$
\xrightarrow[R_3 \to R_3 + R_2]{R_1 \to R_1 - R_2}
\left(\begin{array}{rrr|rrr} 1 & 0 & -1 & 2 & -1 & 0 \\ 0 & 1 & 2 & -1 & 1 & 0 \\ 0 & 0 & 2 & -3 & 1 & 1 \end{array}\right)
\xrightarrow{R_3 \to \frac{1}{2}R_3}
\left(\begin{array}{rrr|rrr} 1 & 0 & -1 & 2 & -1 & 0 \\ 0 & 1 & 2 & -1 & 1 & 0 \\ 0 & 0 & 1 & -\frac{3}{2} & \frac{1}{2} & \frac{1}{2} \end{array}\right)
$$
$$
\xrightarrow[R_2 \to R_2 - 2R_3]{R_1 \to R_1 + R_3}
\left(\begin{array}{rrr|rrr} 1 & 0 & 0 & \frac{1}{2} & -\frac{1}{2} & \frac{1}{2} \\ 0 & 1 & 0 & 2 & 0 & -1 \\ 0 & 0 & 1 & -\frac{3}{2} & \frac{1}{2} & \frac{1}{2} \end{array}\right)
$$
This matrix is now in RREF. We see that the original matrix (i.e., the left side of the augmented matrix) has been transformed into the $3 \times 3$ identity matrix. This tells us that $A$ is invertible and that the columns on the right side of the RREF augmented matrix are the columns of $A^{-1}$. Thus we see that
$$
A^{-1} = \begin{pmatrix} \frac{1}{2} & -\frac{1}{2} & \frac{1}{2} \\ 2 & 0 & -1 \\ -\frac{3}{2} & \frac{1}{2} & \frac{1}{2} \end{pmatrix}
$$
Check:
$$
AA^{-1} = \begin{pmatrix} 1 & 1 & 1 \\ 1 & 2 & 3 \\ 2 & 1 & 2 \end{pmatrix}\begin{pmatrix} \frac{1}{2} & -\frac{1}{2} & \frac{1}{2} \\ 2 & 0 & -1 \\ -\frac{3}{2} & \frac{1}{2} & \frac{1}{2} \end{pmatrix}
= \begin{pmatrix} \frac{1}{2} + 2 - \frac{3}{2} & -\frac{1}{2} + 0 + \frac{1}{2} & \frac{1}{2} - 1 + \frac{1}{2} \\ \frac{1}{2} + 4 - \frac{9}{2} & -\frac{1}{2} + 0 + \frac{3}{2} & \frac{1}{2} - 2 + \frac{3}{2} \\ 1 + 2 - 3 & -1 + 0 + 1 & 1 - 1 + 1 \end{pmatrix}
= \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}
$$
(b) We follow the same procedure as above.
$$
[A|I] = \left(\begin{array}{rrr|rrr} 1 & 1 & 1 & 1 & 0 & 0 \\ 1 & 2 & 3 & 0 & 1 & 0 \\ 2 & 3 & 4 & 0 & 0 & 1 \end{array}\right)
\xrightarrow{\text{R.R.E.F.}}
\left(\begin{array}{rrr|rrr} 1 & 0 & -1 & 0 & -3 & 2 \\ 0 & 1 & 2 & 0 & 2 & -1 \\ 0 & 0 & 0 & 1 & 1 & -1 \end{array}\right)
$$
Since the matrix on the left is not an identity matrix, we see that the matrix $A$ is singular, i.e., has no inverse.

Notice: During the process of row reducing the matrix, as soon as the bottom row becomes all 0's in the left part of the augmented matrix, we can tell that the original matrix is not going to be transformed into an identity matrix. Therefore we could stop at that point. There is no need to continue reducing the matrix to RREF.
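The $[A|I]$ procedure translates almost line for line into code. The sketch below is my own illustration, not part of the notes: the function name is invented, and it adds partial pivoting (choosing the largest available pivot in each column) for numerical robustness, which the hand computation above does not need.

```python
def inverse_via_rref(A, eps=1e-12):
    """Return the inverse of the square matrix A (a list of lists) by
    row-reducing the augmented matrix [A | I] to RREF, or None if A
    turns out to be singular."""
    n = len(A)
    # Build the augmented matrix [A | I].
    M = [list(map(float, row)) + [1.0 if i == j else 0.0 for j in range(n)]
         for i, row in enumerate(A)]

    for col in range(n):
        # Choose a pivot row for this column (partial pivoting).
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < eps:
            return None  # no pivot: the left block cannot become I, so A is singular
        M[col], M[pivot] = M[pivot], M[col]
        # Scale the pivot row so the pivot entry becomes 1.
        p = M[col][col]
        M[col] = [v / p for v in M[col]]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col:
                f = M[r][col]
                M[r] = [v - f * w for v, w in zip(M[r], M[col])]

    # The left block is now I; the right block is the inverse.
    return [row[n:] for row in M]

# Example 2(a): invertible.
inv = inverse_via_rref([[1, 1, 1], [1, 2, 3], [2, 1, 2]])
print([[round(v, 6) + 0.0 for v in row] for row in inv])
# [[0.5, -0.5, 0.5], [2.0, 0.0, -1.0], [-1.5, 0.5, 0.5]]

# Example 2(b): singular.
print(inverse_via_rref([[1, 1, 1], [1, 2, 3], [2, 3, 4]]))  # None
```

Note that, just as the "Notice" above suggests, the code stops as soon as some column has no usable pivot, rather than completing the reduction.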
The Method of Inverses

Returning to the problem of solving the linear system $A\vec{x} = \vec{b}$, we see that if $A$ is a square invertible matrix then we have
$$
A\vec{x} = \vec{b} \;\Rightarrow\; A^{-1}(A\vec{x}) = A^{-1}\vec{b} \;\Rightarrow\; (A^{-1}A)\vec{x} = A^{-1}\vec{b} \;\Rightarrow\; I\vec{x} = A^{-1}\vec{b} \;\Rightarrow\; \vec{x} = A^{-1}\vec{b}
$$
Therefore, if we can find $A^{-1}$, we are able to solve the system simply by multiplying $A^{-1}$ times $\vec{b}$.

Caveat: The method of inverses can only be used when $A$ is a square invertible matrix. If the coefficient matrix $A$ is not square, or is singular, then we cannot use this approach to find $\vec{x}$.

Example 3. Solve the following system using the method of inverses.
$$
\begin{array}{rcrcrcr}
x & + & y & + & z & = & 4 \\
x & + & 2y & + & 3z & = & 6 \\
2x & + & y & + & 2z & = & 5
\end{array}
$$
Solution: The matrix form of the system is
$$
A\vec{x} = \vec{b}: \quad \begin{pmatrix} 1 & 1 & 1 \\ 1 & 2 & 3 \\ 2 & 1 & 2 \end{pmatrix}\begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} 4 \\ 6 \\ 5 \end{pmatrix}
$$
We saw in Example 2(a) that for this matrix $A$, we have
$$
A^{-1} = \begin{pmatrix} \frac{1}{2} & -\frac{1}{2} & \frac{1}{2} \\ 2 & 0 & -1 \\ -\frac{3}{2} & \frac{1}{2} & \frac{1}{2} \end{pmatrix}
$$
Therefore,
$$
\vec{x} = A^{-1}\vec{b} = \begin{pmatrix} \frac{1}{2} & -\frac{1}{2} & \frac{1}{2} \\ 2 & 0 & -1 \\ -\frac{3}{2} & \frac{1}{2} & \frac{1}{2} \end{pmatrix}\begin{pmatrix} 4 \\ 6 \\ 5 \end{pmatrix} = \begin{pmatrix} \frac{3}{2} \\ 3 \\ -\frac{1}{2} \end{pmatrix}
$$
so $\begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} \frac{3}{2} \\ 3 \\ -\frac{1}{2} \end{pmatrix}$, i.e., $(x, y, z) = \left(\frac{3}{2}, 3, -\frac{1}{2}\right)$ is the unique solution to this system.

Notice: For any SLE for which the coefficient matrix $A$ is square and invertible, $A^{-1}$ is unique and thus so is $A^{-1}\vec{b}$. Therefore any system for which the method of inverses can be used has a unique solution.

Example 4. Show that the system
$$
\begin{array}{rcrcrcr}
x & + & y & + & z & = & b_1 \\
x & + & 2y & + & 3z & = & b_2 \\
x & + & 2y & + & 4z & = & b_3
\end{array}
$$
has a unique solution for any values of $b_1$, $b_2$ and $b_3$.

Solution: The coefficient matrix for this system is:
$$
A = \begin{pmatrix} 1 & 1 & 1 \\ 1 & 2 & 3 \\ 1 & 2 & 4 \end{pmatrix}
$$
We try to find $A^{-1}$:
$$
\left(\begin{array}{rrr|rrr} 1 & 1 & 1 & 1 & 0 & 0 \\ 1 & 2 & 3 & 0 & 1 & 0 \\ 1 & 2 & 4 & 0 & 0 & 1 \end{array}\right)
\xrightarrow{\text{R.R.E.F.}}
\left(\begin{array}{rrr|rrr} 1 & 0 & 0 & 2 & -2 & 1 \\ 0 & 1 & 0 & -1 & 3 & -2 \\ 0 & 0 & 1 & 0 & -1 & 1 \end{array}\right)
$$
so
$$
A^{-1} = \begin{pmatrix} 2 & -2 & 1 \\ -1 & 3 & -2 \\ 0 & -1 & 1 \end{pmatrix}
$$
But then, no matter what the vector $\vec{b} = (b_1, b_2, b_3)$ is, we can find the unique solution to the system as:
$$
\begin{pmatrix} x \\ y \\ z \end{pmatrix} = A^{-1}\vec{b} = \begin{pmatrix} 2 & -2 & 1 \\ -1 & 3 & -2 \\ 0 & -1 & 1 \end{pmatrix}\begin{pmatrix} b_1 \\ b_2 \\ b_3 \end{pmatrix} = \begin{pmatrix} 2b_1 - 2b_2 + b_3 \\ -b_1 + 3b_2 - 2b_3 \\ -b_2 + b_3 \end{pmatrix}
$$
For instance, for $b_1 = b_2 = b_3 = 1$, we find that the unique solution is $(x, y, z) = (2(1) - 2(1) + 1, \,-(1) + 3(1) - 2(1), \,-(1) + 1) = (1, 0, 0)$.
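Examples 3 and 4 can be reproduced with a few lines of numpy (my own sketch; numpy and the variable names are not part of the notes):

```python
import numpy as np

# Example 3 by the method of inverses: x = A^{-1} b.
A = np.array([[1, 1, 1],
              [1, 2, 3],
              [2, 1, 2]], dtype=float)
b = np.array([4.0, 6.0, 5.0])

x = np.linalg.inv(A) @ b
print(np.allclose(x, [1.5, 3.0, -0.5]))                    # True

# Example 4: A4 is invertible, so A4^{-1} b is the unique solution
# no matter which right-hand side b is supplied. With b = (1, 1, 1)
# the solution is (1, 0, 0), as computed in the text.
A4 = np.array([[1, 1, 1],
               [1, 2, 3],
               [1, 2, 4]], dtype=float)
A4_inv = np.linalg.inv(A4)
print(np.allclose(A4_inv @ np.ones(3), [1.0, 0.0, 0.0]))   # True

# Practical aside: numerically one usually prefers np.linalg.solve(A, b)
# (Gaussian elimination) over forming the inverse explicitly.
print(np.allclose(np.linalg.solve(A, b), x))               # True
```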