Module in Linear Algebra and Matrix Theory

MODULE 1
MATRICES OVER A FIELD F
Introduction
In this lesson, we will discuss the different operations on matrices and their properties, the transpose of a matrix, the different types of matrices, and the special types of square matrices.
Objectives
After going through this chapter, you are expected to be able to do the following:
1. Define a matrix.
2. Give the different types of matrices.
3. Perform fundamental operations on matrices.
4. Find the transpose of a given matrix.
5. Identify special types of square matrices.
1.1 Definition of a Field

Definition 1.1.1: By a field F we mean a nonempty set of elements with two laws of combination, which we call addition and multiplication, satisfying the following conditions:

F1 Closure Properties
   To every pair of elements a, b ∈ F, a + b ∈ F.
   To every pair of elements a, b ∈ F, ab ∈ F.
F2 Commutative Laws
   a + b = b + a and ab = ba for all a, b ∈ F.
F3 Associative Laws
   (a + b) + c = a + (b + c) and (ab)c = a(bc) for all a, b, c ∈ F.
F4 Distributive Laws
   (a + b)c = ac + bc and c(a + b) = ca + cb for all a, b, c ∈ F.
F5 Identity Elements
   There exists 0 ∈ F such that a + 0 = a for all a ∈ F.
   There exists 1 ∈ F, 1 ≠ 0, such that a · 1 = a for all a ∈ F.
F6 Inverse Elements
   For every a ∈ F there exists −a ∈ F such that a + (−a) = 0.
   For every a ∈ F with a ≠ 0 there exists a^(-1) ∈ F such that a·a^(-1) = 1.

The elements of F are called scalars. The sets of real numbers ℝ and complex numbers ℂ are examples of fields under the usual addition and multiplication in these sets.
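The axioms above can also be checked mechanically for a small finite field. The sketch below is not part of the module; it takes Z_5 = {0, 1, 2, 3, 4} with addition and multiplication modulo 5 (a standard example of a finite field) and verifies conditions F2 through F6 by brute force.

```python
# Checking the field axioms numerically for Z_5 (an illustrative aside,
# not an example from the module).
p = 5
F = range(p)

add = lambda a, b: (a + b) % p
mul = lambda a, b: (a * b) % p

# F2/F3/F4: commutativity, associativity, distributivity
for a in F:
    for b in F:
        assert add(a, b) == add(b, a) and mul(a, b) == mul(b, a)
        for c in F:
            assert add(add(a, b), c) == add(a, add(b, c))
            assert mul(mul(a, b), c) == mul(a, mul(b, c))
            assert mul(add(a, b), c) == add(mul(a, c), mul(b, c))

# F5/F6: identities exist, and every element has the required inverses
for a in F:
    assert add(a, 0) == a and mul(a, 1) == a
    assert any(add(a, x) == 0 for x in F)        # additive inverse -a
    if a != 0:
        assert any(mul(a, x) == 1 for x in F)    # multiplicative inverse a^(-1)

print("Z_5 satisfies the field axioms")
```

The same loops would fail for Z_6, since 2 and 3 have no multiplicative inverses modulo 6; this is why the modulus must be prime.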
1.2 Definition of a Matrix

Definition 1.2.1: A matrix over a field F is a rectangular array of elements from F arranged in m horizontal rows and n vertical columns:

        [ a_11  a_12  …  a_1n ]
    A = [ a_21  a_22  …  a_2n ]        (1)
        [  ⋮     ⋮          ⋮  ]
        [ a_m1  a_m2  …  a_mn ]

The ith row of A is
    [ a_i1  a_i2  …  a_in ]    (1 ≤ i ≤ m);
the jth column of A is
    [ a_1j ]
    [ a_2j ]    (1 ≤ j ≤ n).
    [  ⋮   ]
    [ a_mj ]

The symbol M_{m,n}(F) denotes the collection of all m × n matrices over F. We usually denote matrices by capital letters. If m is equal to n, then the matrix is called a square matrix of order n.
Example 1: Below are examples of matrices of different sizes.

i)  A = [ 2  −5 ]  is a 2 × 2 matrix (A is a square matrix of order 2)
        [ 3   4 ]

ii) B = [  3  4  −8  −5 ]
        [ −1  5   3   1 ]  is a 3 × 4 matrix
        [  0  7   2  −4 ]

    The second row is [ −1  5  3  1 ] and the third column is [ −8 ]
                                                              [  3 ]
                                                              [  2 ]

iii) C = [ 1  1/2   0  ]  is a 2 × 3 matrix
         [ 2   3   4/3 ]

We will use the notation A = [a_ij] to denote a matrix A with entries a_ij. The subscripts i and j will be used to denote the position of an entry a_ij in a matrix. Thus, the ijth element or entry of a matrix A is the number appearing in the ith row and jth column of A.
Example 2: Let A = [  1  0  4 ]  and  B = [ 1  0  −1 ]
                   [ −2  2  3 ]           [ 5  1   4 ].
                                          [ 3  2  −2 ]

Then A is a 2 × 3 matrix with a_12 = 0 (the element appearing in the first row and the second column), a_13 = 4, and a_23 = 3; B is a 3 × 3 matrix with b_12 = 0, b_13 = −1, b_21 = 5, b_23 = 4, b_31 = 3, and b_32 = 2.

If A is a square matrix of order n, then the numbers a_11, a_22, …, a_nn form the main diagonal. Thus in matrix B above, the elements b_11 = 1, b_22 = 1, and b_33 = −2 form the main diagonal.
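The same indexing can be read off numerically. The sketch below is an aside (it assumes the NumPy library, which is not part of the module); it stores matrix B of Example 2 and looks up entries and the main diagonal. Note that NumPy indices start at 0, so the module's b_13 is B[0, 2].

```python
import numpy as np

# Matrix B from Example 2
B = np.array([[1, 0, -1],
              [5, 1,  4],
              [3, 2, -2]])

# b_13 (row 1, column 3) is B[0, 2] in 0-based indexing
assert B[0, 2] == -1     # b_13 = -1
assert B[1, 0] == 5      # b_21 = 5

# The main diagonal b_11, b_22, b_33
print(np.diag(B))        # [ 1  1 -2]
```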
Definition 1.2.2: (Equality of Matrices) Two m × n matrices A = [a_ij] and B = [b_ij] are said to be equal if a_ij = b_ij for 1 ≤ i ≤ m, 1 ≤ j ≤ n, that is, if corresponding elements agree.

Example 3:
    A = [ 1   2  −3 ]  and  B = [ 1  2  w ]  are equal if x = −5 and w = −3.
        [ 2  −5   4 ]           [ 2  x  4 ]
1.3 Matrix Operations and Their Properties

Definition 1.3.1: (Addition of Matrices) If A = [a_ij] and B = [b_ij] are m × n matrices, then the sum of A and B is the m × n matrix C = [c_ij] defined by
    c_ij = a_ij + b_ij        (1 ≤ i ≤ m, 1 ≤ j ≤ n)
That is, C is obtained by adding the corresponding elements of A and B.

Example 1: Let A = [ 1  −2  4 ]  and  B = [ 0  −2  4 ]
                   [ 3  −1  5 ]           [ 1  −3  1 ]

Then A + B = [ 1 + 0    (−2) + (−2)    4 + 4 ] = [ 1  −4  8 ]
             [ 3 + 1    (−1) + (−3)    5 + 1 ]   [ 4  −4  6 ]

Note that only matrices of the same size can be added.
Definition 1.3.2: (Scalar Multiplication) If A = [a_ij] is an m × n matrix and r is a real number, then the scalar multiple of A by r, written rA, is the m × n matrix B = [b_ij], where
    b_ij = r a_ij        (1 ≤ i ≤ m, 1 ≤ j ≤ n)
That is, B is obtained by multiplying each element of A by r.

Example 2: Let r = −3 and A = [ 4   3  −2 ]. Then
                              [ 2  −5   0 ]

rA = −3 [ 4   3  −2 ] = [ −3(4)  −3(3)   −3(−2) ] = [ −12  −9  6 ]
        [ 2  −5   0 ]   [ −3(2)  −3(−5)  −3(0)  ]   [  −6  15  0 ]
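Both operations are entry-wise, so they can be confirmed with array arithmetic. A brief sketch (assuming NumPy, which is not part of the module) using the matrices of Examples 1 and 2:

```python
import numpy as np

# Matrices from Example 1 (addition)
A1 = np.array([[1, -2, 4], [3, -1, 5]])
B1 = np.array([[0, -2, 4], [1, -3, 1]])

# Entry-wise sum: c_ij = a_ij + b_ij
assert (A1 + B1 == np.array([[1, -4, 8], [4, -4, 6]])).all()

# Matrix from Example 2 (scalar multiplication by r = -3)
A2 = np.array([[4, 3, -2], [2, -5, 0]])
assert (-3 * A2 == np.array([[-12, -9, 6], [-6, 15, 0]])).all()
```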
Definition 1.3.3: (Matrix Multiplication) If A = [a_ij] is an m × n matrix and B = [b_ij] is an n × p matrix, then the product of A and B is the m × p matrix C = [c_ij] defined by
    c_ij = Σ_{k=1}^{n} a_ik b_kj

To illustrate this, let us consider an m × n matrix A and an n × p matrix B. The ith row of A is matched against the jth column of B:

    ith row of A                   jth column of B
    [ a_11  a_12  …  a_1n ]    [ b_11  b_12  …  b_1j  …  b_1p ]
    [ a_21  a_22  …  a_2n ]    [ b_21  b_22  …  b_2j  …  b_2p ]
    [  ⋮     ⋮          ⋮  ]    [  ⋮     ⋮         ⋮         ⋮  ]
    [ a_i1  a_i2  …  a_in ]    [ b_n1  b_n2  …  b_nj  …  b_np ]
    [  ⋮     ⋮          ⋮  ]
    [ a_m1  a_m2  …  a_mn ]

Note that the ith row of A and the jth column of B must have the same number of components for the product AB to be defined.
If we let AB = C, then the ijth element of AB is

    c_ij = [ a_i1  a_i2  …  a_in ] [ b_1j ]
                                   [ b_2j ]  = a_i1 b_1j + a_i2 b_2j + … + a_in b_nj
                                   [  ⋮   ]
                                   [ b_nj ]

Example 3: If A = [  1  3 ]  and  B = [ −2  6 ]  then
                  [ −2  5 ]           [  4  7 ]

    c_11 = [ 1  3 ] [ −2 ] = 1(−2) + 3(4) = 10
                    [  4 ]

    c_12 = [ 1  3 ] [ 6 ] = 1(6) + 3(7) = 27
                    [ 7 ]

    c_21 = [ −2  5 ] [ −2 ] = (−2)(−2) + 5(4) = 24
                     [  4 ]

    c_22 = [ −2  5 ] [ 6 ] = (−2)(6) + 5(7) = 23
                     [ 7 ]

Thus AB = [ 10  27 ]
          [ 24  23 ]

Similarly,

    BA = [ −2  6 ] [  1  3 ] = [ (−2)(1) + 6(−2)   (−2)(3) + 6(5) ] = [ −14  24 ]
         [  4  7 ] [ −2  5 ]   [ 4(1) + 7(−2)      4(3) + 7(5)    ]   [ −10  47 ]
Example 3 shows that matrix multiplication, in general, is not commutative.
Example 4: Let A = [ 2   0  5 ]  and  B = [  7  −1   1   6 ]
                   [ 4  −3  1 ]           [ −2   5   4  −4 ].
                                          [  3   2  −3   2 ]
Find a) AB and b) BA.

Solution:
a. Note that the number of columns of A is equal to the number of rows of B, hence the product AB is defined. By Definition 1.3.3, AB is a 2 × 4 matrix.

    AB = [ 2   0  5 ] [  7  −1   1   6 ]
         [ 4  −3  1 ] [ −2   5   4  −4 ]
                      [  3   2  −3   2 ]

       = [ 14 + 0 + 15    (−2) + 0 + 10       2 + 0 + (−15)      12 + 0 + 10 ]
         [ 28 + 6 + 3     (−4) + (−15) + 2    4 + (−12) + (−3)   24 + 12 + 2 ]

       = [ 29    8  −13  22 ]
         [ 37  −17  −11  38 ]

b. The product BA is not defined because the number of columns of B is not equal to the number of rows of A.
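The shape bookkeeping in Example 4 can be reproduced with a library that enforces the same rule. A sketch (assuming NumPy, not part of the module):

```python
import numpy as np

# Matrices from Example 4
A = np.array([[2, 0, 5], [4, -3, 1]])          # 2 x 3
B = np.array([[ 7, -1,  1,  6],
              [-2,  5,  4, -4],
              [ 3,  2, -3,  2]])               # 3 x 4

# AB is defined (3 columns of A match 3 rows of B) and is 2 x 4
AB = A @ B
assert AB.shape == (2, 4)
assert (AB == np.array([[29,   8, -13, 22],
                        [37, -17, -11, 38]])).all()

# BA is not defined: B has 4 columns but A has only 2 rows
try:
    B @ A
except ValueError:
    print("BA is undefined: shapes (3, 4) and (2, 3) do not align")
```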
SAQ 1-1

Let A = [ 1  3  −1 ]  and  B = [ −2  5 ]. Find
        [ 2  1  −4 ]           [ −4  3 ]
                               [  3  1 ]
a. AB and BA

Solve SAQ 1-1 in your notebook and compare your answer with ASAQ 1-1.

ASAQ 1-1

The number of columns of A is equal to the number of rows of B, hence AB is defined. Similarly, the number of columns of B is equal to the number of rows of A, thus BA is also defined. Multiplying, we have

    AB = [ 1(−2) + 3(−4) + (−1)(3)    1(5) + 3(3) + (−1)(1) ] = [ −17  13 ]
         [ 2(−2) + 1(−4) + (−4)(3)    2(5) + 1(3) + (−4)(1) ]   [ −20   9 ]

and

    BA = [ −2  5 ] [ 1  3  −1 ] = [ 8  −1  −18 ]
         [ −4  3 ] [ 2  1  −4 ]   [ 2  −9   −8 ]
         [  3  1 ]                [ 5  10   −7 ]
Remarks: If A = [a_ij] is an m × n matrix and B = [b_ij] is an n × p matrix, then:
1. BA may not be defined, as in Example 4.b; this will take place if p ≠ m.
2. If BA is defined, which means p = m, then BA is n × n while AB is m × p; thus if n ≠ p, AB and BA are of different sizes (see ASAQ 1-1).
3. If AB and BA are both of the same size, they may be equal.
4. If AB and BA are both of the same size, they may be unequal (see Example 3).
Definition 1.3.4: (Additive Inverse of a Matrix) Let A = [a_ij]. Then −A is the matrix obtained by replacing the elements of A by their additive inverses; that is,
    −A = −[a_ij] = [−a_ij]

Example 5: Let A = [ −2   5    1/2 ]  then  −A = [  2   −5   −1/2 ]
                   [  3  2/3  −17  ]             [ −3  −2/3   17  ]

Definition 1.3.5: (The Zero Matrix) An m × n matrix A = [a_ij] is called a zero matrix, denoted by O, if a_ij = 0 for 1 ≤ i ≤ m, 1 ≤ j ≤ n; that is, all entries are equal to zero.
Properties of Matrix Operations

Theorem 1.3.1: (Properties of Matrix Addition) Let A, B, and C be m × n matrices. Then:
(a) A + B = B + A                  (commutative law for matrix addition)
(b) A + (B + C) = (A + B) + C      (associative law for matrix addition)
(c) A + O = A                      (the matrix O is the m × n zero matrix)
(d) there is a unique m × n matrix −A such that A + (−A) = O.

We will prove part (a) and leave the proof of the remaining parts as an exercise.

Proof of part (a):
Let A = [a_ij] and B = [b_ij] be m × n matrices. By Definition 1.3.1,
    A + B = [a_ij + b_ij]        (1 ≤ i ≤ m, 1 ≤ j ≤ n)
          = [b_ij + a_ij]        since a + b = b + a for any real numbers a and b
          = B + A
Therefore, matrix addition is commutative.
Example 6: To illustrate part (a) of Theorem 1.3.1, let
    A = [ 4  −1 ]  and  B = [  4  6 ]
        [ 3   2 ]           [ 10  2 ]
Then,
    A + B = [ 4  −1 ] + [  4  6 ] = [  8  5 ]
            [ 3   2 ]   [ 10  2 ]   [ 13  4 ]
and
    B + A = [  4  6 ] + [ 4  −1 ] = [  8  5 ]
            [ 10  2 ]   [ 3   2 ]   [ 13  4 ]

Example 7: To illustrate part (c) of Theorem 1.3.1, let
    A = [ 1  2  −1 ]        O = [ 0  0  0 ]
        [ 3  2  −2 ]  and      [ 0  0  0 ]. Note that O is the 3 × 3 zero matrix.
        [ 4  5   3 ]           [ 0  0  0 ]
Then,
    A + O = [ 1  2  −1 ]   [ 0  0  0 ]   [ 1  2  −1 ]
            [ 3  2  −2 ] + [ 0  0  0 ] = [ 3  2  −2 ] = A
            [ 4  5   3 ]   [ 0  0  0 ]   [ 4  5   3 ]

Example 8: To illustrate part (d) of Theorem 1.3.1, let
    A = [ 2   2  3 ]  and  −A = [ −2  −2  −3 ]
        [ 3  −1  2 ]            [ −3   1  −2 ]
Then,
    A + (−A) = [ 2   2  3 ] + [ −2  −2  −3 ] = [ 0  0  0 ]
               [ 3  −1  2 ]   [ −3   1  −2 ]   [ 0  0  0 ]
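Each part of Theorem 1.3.1 can be spot-checked numerically. A short sketch (assuming NumPy, not part of the module) using the matrices of Example 6:

```python
import numpy as np

# Matrices from Example 6
A = np.array([[4, -1], [3, 2]])
B = np.array([[4, 6], [10, 2]])
O = np.zeros((2, 2), dtype=int)

assert (A + B == B + A).all()      # part (a): commutativity
assert (A + O == A).all()          # part (c): O is the additive identity
assert (A + (-A) == O).all()       # part (d): additive inverse
```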
Theorem 1.3.2: (Properties of Matrix Multiplication) Let A = [a_ij] be an m × n matrix, B = [b_ij] be an n × p matrix, and C = [c_ij] be a p × q matrix. Then
(a) (AB)C = A(BC)          (associative law for matrix multiplication)
(b) A(B + C) = AB + AC     (left distributive law for matrix multiplication)
(c) (A + B)C = AC + BC     (right distributive law for matrix multiplication)

Proof of part (a):
Note that AB is an m × p matrix, hence (AB)C is an m × q matrix. Similarly, BC is an n × q matrix, hence A(BC) is an m × q matrix. We see that (AB)C and A(BC) have the same size. Now we must show that the corresponding entries of (AB)C and A(BC) are equal. Let AB = D = [d_ij]. Then
    d_ij = Σ_{k=1}^{n} a_ik b_kj
The ijth entry of (AB)C = DC is
    Σ_{s=1}^{p} d_is c_sj = Σ_{s=1}^{p} ( Σ_{k=1}^{n} a_ik b_ks ) c_sj = Σ_{s=1}^{p} Σ_{k=1}^{n} a_ik b_ks c_sj
Now we let BC = E = [e_ij]; then
    e_kj = Σ_{s=1}^{p} b_ks c_sj
Thus the ijth entry of A(BC) = AE is
    Σ_{k=1}^{n} a_ik e_kj = Σ_{k=1}^{n} Σ_{s=1}^{p} a_ik b_ks c_sj = ijth entry of (AB)C.
This shows that (AB)C = A(BC).
Example 9: To illustrate part (a) of Theorem 1.3.2, we let

    A = [  3  −2  4 ],  B = [ 2  3  −1  0 ],  and  C = [  1   0  −2 ]
        [ −1  −2  3 ]       [ 0  2  −2  2 ]            [  2  −3   0 ]
                            [ 1  0  −1  3 ]            [  0   1  −3 ]
                                                       [ −2   0   0 ]
Then,

    A(BC) = [  3  −2  4 ] [  8  −10  −1 ] = [   4  −18  −11 ]
            [ −1  −2  3 ] [  0   −8   6 ]   [ −23   23   −8 ]
                          [ −5   −1   1 ]
and

    (AB)C = [ 10   5  −3  8 ] [  1   0  −2 ] = [   4  −18  −11 ]
            [  1  −7   2  5 ] [  2  −3   0 ]   [ −23   23   −8 ]
                              [  0   1  −3 ]
                              [ −2   0   0 ]

This shows that matrix multiplication is associative.
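The same check can be run numerically. A sketch (assuming NumPy, which is not part of the module) with the matrices of Example 9:

```python
import numpy as np

# Matrices from Example 9
A = np.array([[3, -2, 4], [-1, -2, 3]])
B = np.array([[2, 3, -1, 0], [0, 2, -2, 2], [1, 0, -1, 3]])
C = np.array([[1, 0, -2], [2, -3, 0], [0, 1, -3], [-2, 0, 0]])

# Part (a): (AB)C = A(BC), and both equal the matrix computed in the example
assert ((A @ B) @ C == A @ (B @ C)).all()
assert ((A @ B) @ C == np.array([[4, -18, -11], [-23, 23, -8]])).all()
```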
Proof of part (b):
Let A = [a_ij] be an m × n matrix and let B = [b_ij] and C = [c_ij] be n × p matrices. Then the kjth element of B + C is
    b_kj + c_kj
and the ijth element of A(B + C) is
    Σ_{k=1}^{n} a_ik (b_kj + c_kj) = Σ_{k=1}^{n} a_ik b_kj + Σ_{k=1}^{n} a_ik c_kj
                                   = ijth entry of AB + ijth entry of AC
Thus, A(B + C) = AB + AC. The proof of part (c) is similar to the proof of part (b) and is left as an exercise.
Example 10: Let A = [  2   2  4 ],  B = [  1  0 ],  and  C = [  0  2 ]
                    [ −4  −1  3 ]       [ −2  3 ]            [ −1  1 ]
                                        [  5  1 ]            [  2  0 ]
Then

    A(B + C) = [  2   2  4 ] [  1  2 ] = [ 24  16 ]
               [ −4  −1  3 ] [ −3  4 ]   [ 20  −9 ]
                             [  7  1 ]
and

    AB + AC = [ 18  10 ] + [ 6   6 ] = [ 24  16 ]
              [ 13   0 ]   [ 7  −9 ]   [ 20  −9 ]
Theorem 1.3.3: (Properties of Scalar Multiplication) If r and s are real numbers and A and B are matrices, then
(a) r(sA) = (rs)A
(b) (r + s)A = rA + sA
(c) r(A + B) = rA + rB
(d) A(rB) = r(AB) = (rA)B

We prove part (c) of the theorem and leave the proof of the other parts as an exercise.

Proof:
Let A = [a_ij] and B = [b_ij] be m × n matrices. Then the
    ijth entry of r(A + B) = r(a_ij + b_ij)                     definition of scalar multiplication
                           = r a_ij + r b_ij                    distributive property of real numbers
                           = ijth entry of rA + ijth entry of rB    definition of scalar multiplication
Hence r(A + B) = rA + rB.
Example 11: To illustrate part (d) of Theorem 1.3.3, we let
    r = −3,  A = [ 1   3  2 ],  and  B = [ 3  −4 ]
                 [ 2  −1  3 ]            [ 2   1 ]
                                         [ 0   5 ]
Then,

    A(rB) = [ 1   3  2 ] ( (−3) [ 3  −4 ] ) = [ 1   3  2 ] [ −9   12 ] = [ −27  −27 ]
            [ 2  −1  3 ]        [ 2   1 ]     [ 2  −1  3 ] [ −6   −3 ]   [ −12  −18 ]
                                [ 0   5 ]                  [  0  −15 ]
and

    r(AB) = −3 ( [ 1   3  2 ] [ 3  −4 ] ) = −3 [ 9  9 ] = [ −27  −27 ]
                 [ 2  −1  3 ] [ 2   1 ]        [ 4  6 ]   [ −12  −18 ]
                              [ 0   5 ]

Hence A(rB) = r(AB).
1.4 Transpose of a Matrix

Definition 1.4.1: (The Transpose of a Matrix) If A = [a_ij] is an m × n matrix, then the n × m matrix A^T = [a^T_ij], where
    a^T_ij = a_ji        (1 ≤ i ≤ n, 1 ≤ j ≤ m)
is called the transpose of A. Thus the transpose of A is obtained by interchanging the rows and columns of A.

Example 1: Let A = [ 4  −2  −3 ],  B = [ 3  5  −1 ],  and  C = [  2 ]
                   [ 0  −5  −2 ]                               [ −1 ]. Then
                                                               [  4 ]

    A^T = [  4   0 ]
          [ −2  −5 ]
          [ −3  −2 ]

Note that the first row of A became the first column of A^T and the second row of A became the second column of A^T. Similarly,

    B^T = [  3 ]   and   C^T = [ 2  −1  4 ]
          [  5 ]
          [ −1 ]
Theorem 1.4.1: (Properties of the Transpose) If r is a scalar and A and B are matrices, then
(a) (A^T)^T = A
(b) (A + B)^T = A^T + B^T
(c) (AB)^T = B^T A^T
(d) (rA)^T = r A^T

Proof of part (a):
Let A = [a_ij]. Then the ijth entry of A is
    a_ij = (A^T)_ji          definition of the transpose of a matrix
         = ((A^T)^T)_ij      definition of the transpose of a matrix
Hence (A^T)^T = A.

Proof of part (b):
Let [(A + B)^T]_ij be the ijth entry of (A + B)^T and (A + B)_ji denote the jith entry of A + B. Then
    [(A + B)^T]_ij = (A + B)_ji                 definition of the transpose
                   = a_ji + b_ji                definition of addition of matrices
                   = (A^T)_ij + (B^T)_ij        definition of the transpose
                   = ijth entry of A^T + ijth entry of B^T
Thus (A + B)^T = A^T + B^T.

The proofs of parts (c) and (d) are left as an exercise.
Example 2: To illustrate part (c) of Theorem 1.4.1, we let
    A = [ 1  −3  2 ]  and  B = [ 0  −1 ]
        [ 2  −1  3 ]           [ 2  −2 ]
                               [ 3  −1 ]
Then

    (AB)^T = [ 0   7 ]  and  B^T A^T = [  0   2   3 ] [  1   2 ] = [ 0   7 ]
             [ 3  −3 ]                 [ −1  −2  −1 ] [ −3  −1 ]   [ 3  −3 ]
                                                      [  2   3 ]
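Note in part (c) that the order of the factors reverses. A numerical spot-check (assuming NumPy, not part of the module) with the matrices of Example 2:

```python
import numpy as np

# Matrices from Example 2 of this section
A = np.array([[1, -3, 2], [2, -1, 3]])
B = np.array([[0, -1], [2, -2], [3, -1]])

# Part (c): (AB)^T = B^T A^T, with the factors in reverse order
assert ((A @ B).T == B.T @ A.T).all()
assert ((A @ B).T == np.array([[0, 7], [3, -3]])).all()

# Parts (a) and (d) for the same A
r = -3
assert (A.T.T == A).all()
assert ((r * A).T == r * A.T).all()
```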
1.5 Special Types of Square Matrices

Definition 1.5.1: A square matrix A = [a_ij] for which every term off the main diagonal is zero, that is, a_ij = 0 for i ≠ j, is called a diagonal matrix.

Example 1: The following are diagonal matrices.
    G = [ 3   0 ]   and   H = [ −5   0  0 ]
        [ 0  −1 ]             [  0  −8  0 ]
                              [  0   0  3 ]

Definition 1.5.2: A diagonal matrix A = [a_ij] for which all terms on the main diagonal are equal, that is, a_ij = c for i = j and a_ij = 0 for i ≠ j, is called a scalar matrix.

Example 2: The following are examples of scalar matrices.
    B = [ 3  0  0 ]   and   C = [ −7   0 ]
        [ 0  3  0 ]             [  0  −7 ]
        [ 0  0  3 ]
Definition 1.5.3: A matrix A = [a_ij] is called upper triangular if a_ij = 0 for i > j. It is called lower triangular if a_ij = 0 for i < j.

Definition 1.5.4: A matrix A = [a_ij] is called strictly upper triangular if a_ij = 0 for i ≥ j. It is called strictly lower triangular if a_ij = 0 for i ≤ j.

Example 3:
Let A = [ 1  3   5 ],  B = [ −2   0  0 ],  C = [ 0  0  0 ],  and  D = [ 0  8   4 ]
        [ 0  2  −1 ]       [  5  −1  0 ]       [ 2  0  0 ]           [ 0  0  −9 ]
        [ 0  0  −4 ]       [  6  −3  4 ]       [ 3  7  0 ]           [ 0  0   0 ]

Matrix A is upper triangular since all entries below the main diagonal are zero. Matrix B is lower triangular since all entries above the main diagonal are zero, while C and D are strictly lower triangular and strictly upper triangular, respectively. Strictly upper/lower triangular matrices are upper/lower triangular matrices with all entries on the main diagonal equal to zero.
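The triangular conditions can be tested by comparing a matrix with its own extracted triangle. A sketch (assuming NumPy, not part of the module) using A and D from Example 3:

```python
import numpy as np

# Matrix A from Example 3 (upper triangular)
A = np.array([[1, 3, 5], [0, 2, -1], [0, 0, -4]])

# a_ij = 0 for i > j: entries strictly below the main diagonal vanish,
# so A equals its own upper triangle as extracted by np.triu
assert (np.triu(A) == A).all()

# Matrix D from Example 3 (strictly upper triangular): zeros on the
# diagonal as well, so D survives np.triu with offset k=1
D = np.array([[0, 8, 4], [0, 0, -9], [0, 0, 0]])
assert (np.triu(D, k=1) == D).all()   # k=1 keeps only entries above the diagonal
```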
There are two important types of square matrices that can be defined in terms of the transpose operation. These are symmetric and skew-symmetric matrices, which we define as follows:

Definition 1.5.5: A matrix A = [a_ij] is called symmetric if A^T = A.

Example 4: Let A = [  1  −2  −3 ]  then  A^T = [  1  −2  −3 ]. Since A^T = A, A is symmetric.
                   [ −2   5   0 ]              [ −2   5   0 ]
                   [ −3   0   4 ]              [ −3   0   4 ]
SAQ 1-2
Let A and B be symmetric matrices. Show that A + B is symmetric.

ASAQ 1-2
    (A + B)^T = A^T + B^T      Part (b) of Theorem 1.4.1
              = A + B          Definition of symmetric matrix
Hence A + B is symmetric.
Definition 1.5.6: A matrix A = [a_ij] is called skew-symmetric if A^T = −A.

Example 5: The matrix B = [  0  −3  2 ]  is skew-symmetric because  B^T = [  0   3  −2 ]
                          [  3   0  4 ]                                   [ −3   0  −4 ] = −B.
                          [ −2  −4  0 ]                                   [  2   4   0 ]
SAQ 1-3
Let A and B be skew-symmetric matrices. Show that A + B is skew-symmetric.

ASAQ 1-3
    (A + B)^T = A^T + B^T          Part (b) of Theorem 1.4.1
              = (−A) + (−B)        Definition of skew-symmetric matrix
              = −(A + B)
Thus A + B is skew-symmetric.
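Both definitions reduce to a single comparison between a matrix and its transpose, so they are easy to verify numerically. The sketch below (assuming NumPy, not part of the module) checks the matrices of Examples 4 and 5, and also illustrates a standard decomposition, not stated in the module, of an arbitrary square matrix into a symmetric plus a skew-symmetric part.

```python
import numpy as np

# A symmetric (Example 4) and B skew-symmetric (Example 5)
A = np.array([[1, -2, -3], [-2, 5, 0], [-3, 0, 4]])
B = np.array([[0, -3, 2], [3, 0, 4], [-2, -4, 0]])

assert (A.T == A).all()      # symmetric: A^T = A
assert (B.T == -B).all()     # skew-symmetric: B^T = -B

# Any square matrix M splits as M = S + K with S symmetric, K skew-symmetric
# (M is an arbitrary illustrative matrix, not one from the module)
M = np.array([[1, 7], [3, 5]])
S, K = (M + M.T) / 2, (M - M.T) / 2
assert (S.T == S).all() and (K.T == -K).all() and (S + K == M).all()
```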
Definition 1.5.7: (The Identity Matrix) The n × n matrix I_n = [a_ij] defined by a_ij = 1 if i = j and a_ij = 0 if i ≠ j is called the identity matrix of order n.

Example 6: Below are examples of identity matrices.

    I_3 = [ 1  0  0 ]   is the identity matrix of order 3
          [ 0  1  0 ]
          [ 0  0  1 ]

    I_4 = [ 1  0  0  0 ]   is the identity matrix of order 4
          [ 0  1  0  0 ]
          [ 0  0  1  0 ]
          [ 0  0  0  1 ]

Note that an identity matrix is a scalar matrix where all the elements on the main diagonal are 1.
Remarks:
The identity matrix I_n functions for n × n matrices the way the number 1 functions for real numbers. In other words, the identity matrix I_n is actually a multiplicative identity for n × n matrices. That is,
    A I_n = I_n A = A
for every n × n matrix A.

Example 7: Let A = [ 4   0  1 ]. Then
                   [ 5  −3  2 ]
                   [ 2  −1  4 ]

    A I_3 = [ 4   0  1 ] [ 1  0  0 ] = [ 4   0  1 ] = A
            [ 5  −3  2 ] [ 0  1  0 ]   [ 5  −3  2 ]
            [ 2  −1  4 ] [ 0  0  1 ]   [ 2  −1  4 ]
and
    I_3 A = [ 1  0  0 ] [ 4   0  1 ] = [ 4   0  1 ] = A
            [ 0  1  0 ] [ 5  −3  2 ]   [ 5  −3  2 ]
            [ 0  0  1 ] [ 2  −1  4 ]   [ 2  −1  4 ]
Hence, A I_3 = I_3 A = A.

Suppose that A is a square matrix. If p is a positive integer, then we define
1. A^p = A·A···A  (p factors)
If A is n × n, we also define
2. A^0 = I_n
3. A^p A^q = A^(p+q)
4. (A^p)^q = A^(pq)

NOTE:
1. The rule (AB)^p = A^p B^p holds only if AB = BA.
2. The rule "if AB = 0 then either A = 0 or B = 0" is not true for matrices.

Example 8: Let A = [  2  −3 ]  and  B = [ 3  6 ]
                   [ −8  12 ]           [ 2  4 ]
Then,
    AB = [  2  −3 ] [ 3  6 ] = [ 0  0 ]
         [ −8  12 ] [ 2  4 ]   [ 0  0 ]
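Both points in the NOTE, and the definition of powers, can be confirmed numerically. A sketch (assuming NumPy, not part of the module) using the matrices of Example 8:

```python
import numpy as np

# Matrices from Example 8: A and B are both nonzero, yet AB = 0
A = np.array([[2, -3], [-8, 12]])
B = np.array([[3, 6], [2, 4]])
assert (A @ B == np.zeros((2, 2))).all()
assert A.any() and B.any()       # neither factor is the zero matrix

# Powers: A^2 is A.A, and A^0 = I_2
A2 = np.linalg.matrix_power(A, 2)
assert (A2 == A @ A).all()
assert (np.linalg.matrix_power(A, 0) == np.eye(2)).all()
```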
ACTIVITY

1. Let A = [ 1   2  0 ],  B = [ 2   4 ],  C = [  3  5   1 ],  D = [ 3   5 ],
           [ 3  −1  5 ]       [ 3   4 ]       [ −2  7   2 ]       [ 2  −4 ]
                              [ 1  −5 ]       [  6  4  −3 ]

   E = [ 2   4  5 ],  and  F = [ −5  4 ].
       [ 0   1  4 ]            [  2  3 ]
       [ 3  −2  1 ]

   If possible, compute
   a. (B^T + A)C
   b. AB
   c. 2D − 3F
   d. (C + E)^T
   e. AB + DF
   f. (3C − 2E)^T B

2. Let A and B be skew-symmetric matrices. Show that AB is symmetric if and only if AB = BA.

3. Let A be an n × n matrix. Show that
   a. A + A^T is symmetric.
   b. A − A^T is skew-symmetric.
   c. AA^T and A^T A are symmetric.

4. Let A = [ 4  2 ]. Find
           [ 1  3 ]
   a. A^2 + 3A
   b. 2A^3 + 3A^2 + 4A + 5I_2
MODULE 2
LINEAR EQUATIONS AND MATRICES

Introduction
In this lesson we will discuss the Gauss-Jordan reduction method and the Gaussian elimination method and their application to the solution of linear systems, the inverse of a matrix and a practical method for finding it, determinants and their properties, cofactor expansion, and Cramer's rule.
Objectives
After going through the lessons in this chapter, you are expected to be able to do the following:
1. Explain the solution of systems of linear equations by substitution and by elimination.
2. Transform a given matrix into row-echelon and reduced row-echelon form.
3. Solve systems of linear equations using the Gauss-Jordan reduction method and the Gaussian elimination method.
4. Define homogeneous systems.
5. Differentiate between singular and non-singular matrices.
6. Enumerate the properties of the inverse.
7. Find the inverse of a given matrix.
8. Solve systems of linear equations using the inverse of a matrix.
9. Define permutation.
10. Evaluate the determinant of a matrix using permutations.
11. Reduce the problem of evaluating determinants by cofactor expansion.
12. Find the inverse of a matrix using determinants and cofactor expansion.
13. Solve systems of linear equations using Cramer's Rule.
2.1 Solutions of Systems of Linear Equations

In this section we introduce a systematic technique of eliminating unknowns in solving systems of linear equations. To illustrate this technique, let us consider the following system:

Example 1:
    3x1 +  x2 −  x3 = −2
    2x1 − 3x2 + 2x3 = 14        (1)
     x1 + 2x2 + 3x3 = 6

Interchanging the first and third equations gives us
     x1 + 2x2 + 3x3 = 6
    2x1 − 3x2 + 2x3 = 14        (2)
    3x1 +  x2 −  x3 = −2

If we multiply the first equation in (2) by −2 and add the result to the second equation, we get
    −2x1 − 4x2 − 6x3 = −12
     2x1 − 3x2 + 2x3 =  14
    ------------------------
          −7x2 − 4x3 =   2

The new equation obtained may replace either the first or the second equation in system (2) (the two equations used to obtain it). Next let us multiply the first equation in (2) by −3 and add the result to the third equation. This gives us
    −3x1 − 6x2 − 9x3 = −18
     3x1 +  x2 −  x3 =  −2
    ------------------------
         −5x2 − 10x3 = −20
or
           x2 + 2x3 = 4        (dividing both sides of the equation by −5)

The new equation obtained may replace either the first or the third equation in system (2). Thus if we replace the second equation in (2) by −7x2 − 4x3 = 2 and the third equation by x2 + 2x3 = 4, we obtain the new system:
     x1 + 2x2 + 3x3 = 6
         −7x2 − 4x3 = 2        (3)
           x2 + 2x3 = 4

Note that the variable x1 has been eliminated from the second and third equations of system (3). This new system is equivalent to the original system (1).

Next we multiply the third equation of (3) by 7 and add the result to the second equation. This gives us
     7x2 + 14x3 = 28
    −7x2 −  4x3 =  2
    -----------------
           10x3 = 30
or
             x3 = 3            (dividing both sides of the equation by 10)

The new equation obtained may replace either the second or the third equation in (3). Replacing the third equation, we get an equivalent system
     x1 + 2x2 + 3x3 = 6
         −7x2 − 4x3 = 2        (4)
                 x3 = 3

To solve for the variable x2, we substitute x3 = 3 into the second equation in (4):
    −7x2 − 4(3) = 2
             x2 = −2
To solve for the variable x1, we substitute x2 = −2 and x3 = 3 into the first equation in (4):
    x1 + 2(−2) + 3(3) = 6
                   x1 = 1
Thus the solution to the system (1) is the ordered triple (1, −2, 3).
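The answer can be spot-checked with a linear-algebra library. The sketch below is not part of the module; it assumes NumPy and solves the equivalent system (3) directly.

```python
import numpy as np

# Coefficient matrix and right-hand side of the equivalent system (3)
A = np.array([[1,  2,  3],
              [0, -7, -4],
              [0,  1,  2]], dtype=float)
b = np.array([6, 2, 4], dtype=float)

x = np.linalg.solve(A, b)
assert np.allclose(x, [1, -2, 3])   # the ordered triple (1, -2, 3)
```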
The method used here is called Gaussian elimination. The objective of Gaussian elimination is to reduce a given system to triangular or echelon (staircase pattern) form and then use back substitution to find the solution of the system. Gaussian elimination, named after Carl Friedrich Gauss (1777 – 1855), was later modified by Wilhelm Jordan (1842 – 1899); the modified method is called the Gauss-Jordan elimination. To illustrate it, let us start from system (3) of Example 1.
Example 2:
     x1 + 2x2 + 3x3 = 6
         −7x2 − 4x3 = 2        (3)
           x2 + 2x3 = 4

Interchange the second and third equations in (3):
     x1 + 2x2 + 3x3 = 6
           x2 + 2x3 = 4        (4)
         −7x2 − 4x3 = 2

Now let us multiply the second equation in (4) by −2 and add the result to the first equation. This gives us
     x1 + 2x2 + 3x3 =  6
         −2x2 − 4x3 = −8
    ---------------------
     x1        −  x3 = −2

Next we multiply the second equation in (4) by 7 and add the result to the third equation. This gives us
     7x2 + 14x3 = 28
    −7x2 −  4x3 =  2
    -----------------
           10x3 = 30
or
             x3 = 3

Replacing the first equation in (4) by x1 − x3 = −2 and the third equation by x3 = 3, we obtain the new system
     x1       −  x3 = −2
           x2 + 2x3 = 4        (5)
                 x3 = 3

Note that the variable x2 has been eliminated from the first and third equations of (5). This time let us add the first equation in (5) to the third equation. This gives us
     x1 −  x3 = −2
           x3 =  3
    ---------------
     x1       =  1

Finally we multiply the third equation in (5) by −2 and add the result to the second equation.
     x2 + 2x3 =  4
         −2x3 = −6
    ---------------
     x2       = −2

System (5) now becomes
     x1            =  1
           x2      = −2
                x3 =  3

Thus the solution to the system is the ordered triple (1, −2, 3). The Gauss-Jordan elimination is an algorithm that reduces a given system to reduced row echelon form, or row canonical form. This method is less efficient than Gaussian elimination in solving systems of equations; however, it is well suited to calculating the inverse of a matrix, which we will discuss in a later section.
2.2 Elementary Row Operations

A system of linear equations can be placed in matrix form. This notation is very efficient, especially when we are dealing with large systems. To begin, let us consider an m × n system of linear equations

    a_11 x1 + a_12 x2 + … + a_1n xn = b1
    a_21 x1 + a_22 x2 + … + a_2n xn = b2
      ⋮                                        (1)
    a_m1 x1 + a_m2 x2 + … + a_mn xn = bm

The coefficients of this linear system can be written in a rectangular array having m rows and n columns, and we designate this array as A:

    A = [ a_11  a_12  …  a_1n ]
        [ a_21  a_22  …  a_2n ]
        [  ⋮     ⋮          ⋮  ]
        [ a_m1  a_m2  …  a_mn ]

This m × n matrix A is called the coefficient matrix for the given system (1) above. If we add another column to A to include the constants b1, b2, …, bm, we will have a matrix that expresses compactly all the relevant information contained in (1). Such a matrix is called the augmented matrix for (1), and is usually denoted as [A│B]. If we let C = [A│B] be the augmented matrix for the system (1), then C is the m × (n+1) matrix given by

    C = [ a_11  a_12  …  a_1n │ b1 ]
        [ a_21  a_22  …  a_2n │ b2 ]
        [  ⋮     ⋮          ⋮  │  ⋮ ]
        [ a_m1  a_m2  …  a_mn │ bm ]

We also write AX = B, where A (as above) represents the coefficients and

    X = [ x1 ]        B = [ b1 ]
        [ x2 ]            [ b2 ]
        [  ⋮ ]            [  ⋮ ]
        [ xn ]            [ bm ]

represent the variables and the constants, respectively. For example, the array

    [ 1  −3  0   2 │  2 ]
    [ 0   4  5  −1 │ −3 ]
    [ 2  −1  3   0 │  4 ]

represents the system of 3 linear equations

     x1 − 3x2        + 2x4 =  2
          4x2 + 5x3 −  x4 = −3
    2x1 −  x2 + 3x3        =  4
The task of this section is to manipulate the augmented matrix representing a given linear system into reduced row echelon form. But before we continue, let us introduce some terminology. We have seen from Examples 1 and 2 that three important operations were applied to solve the given system. These are: multiplying or dividing both sides of an equation by a nonzero number, adding a multiple of one equation to another equation, and interchanging two equations. These three operations, when applied to the rows of an augmented matrix, are called elementary row operations and will result in the matrix of an equivalent system.

Definition 2.2.1: An elementary row operation on an m × n matrix A = [a_ij] is any one of the following operations:
1. Interchange rows i and j of A.
2. Multiply row i of A by any nonzero real number.
3. Add c times row i of A to row j of A, i ≠ j.

The process of applying elementary row operations to simplify an augmented matrix is called row reduction. The following notations will be used to denote the elementary row operation used:
1. Ri ↔ Rj means interchanging the ith row and the jth row.
2. cRi → Ri means that the ith row is replaced with c times the ith row.
3. cRi + Rj → Rj means that the jth row is replaced with the sum of the jth row and the ith row multiplied by c.
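The three notations translate directly into array manipulations. The sketch below is an illustration, not part of the module, and assumes the NumPy library; it applies one operation of each type to a 3 × 4 matrix.

```python
import numpy as np

# An illustrative 3 x 4 matrix (not from the module's numbered examples)
A = np.array([[0., 1., 0.,  2.],
              [2., 3., 0., -4.],
              [2., 1., 6.,  9.]])

# 1. R1 <-> R3: interchange the 1st and 3rd rows (0-based indices 0 and 2)
A[[0, 2]] = A[[2, 0]]
assert (A[0] == [2., 1., 6., 9.]).all()

# 2. (1/2)R1 -> R1: replace row 1 with 1/2 times row 1
A[0] = 0.5 * A[0]
assert (A[0] == [1., 0.5, 3., 4.5]).all()

# 3. -2R1 + R2 -> R2: replace row 2 with row 2 plus (-2) times row 1
A[1] = A[1] - 2 * A[0]
assert (A[1] == [0., 2., -6., -13.]).all()
```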
Before we illustrate the three elementary operations, let us first define a matrix in reduced row echelon form.
25
Definition 2.2.2: An ๐‘š x ๐‘› matrix is said to be in reduced row echelon form when it satisfies
the following properties:
a) All rows consisting entirely of zeros, if any, are at the bottom of the matrix.
b) The first nonzero entry in each row that does not consist entirely of zeros is a 1,
called the leading entry(pivot) of its row.
c) If rows ๐‘– and ๐‘– + 1 are two successive rows that do not consist entirely of zeros, then
the leading entry of row ๐‘– + 1 is to the right of the leading entry of row ๐‘–.
d) If a column contains a leading entry of some row, then all other entries in that
column are zero.
Note that a matrix in reduced row echelon form might not have any rows that consist
entirely of zeros.
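Properties (a) through (d) can be checked mechanically (a sketch; `is_rref` is our own helper name, working over floats with a small tolerance):

```python
def is_rref(A, tol=1e-12):
    # Check properties (a)-(d) of a reduced-row-echelon-form matrix.
    lead_prev = -1
    seen_zero_row = False
    for row in A:
        nz = [j for j, x in enumerate(row) if abs(x) > tol]
        if not nz:
            seen_zero_row = True          # zero rows must all sit at the bottom (a)
            continue
        if seen_zero_row:
            return False                  # nonzero row below a zero row violates (a)
        lead = nz[0]
        if abs(row[lead] - 1) > tol:      # leading entry must be 1 (b)
            return False
        if lead <= lead_prev:             # pivots must move strictly right (c)
            return False
        lead_prev = lead
        # all other entries in a pivot column must be zero (d)
        if any(abs(r[lead]) > tol for r in A if r is not row):
            return False
    return True
```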
Example 1: Below are matrices in reduced row echelon form:
A = [1 0 0 −4]
    [0 1 0  3]
    [0 0 1  2]

• The first nonzero entry in each row is 1, called the pivot.
• The leading entry (pivot) for row 2 is to the right of the leading entry for row 1, and the leading entry for row 3 is to the right of the leading entry for row 2.
• All other entries in the columns containing a leading entry are zero.
• There are entries other than zero in column 4 since there is no pivot in that column.
B = [1 2 0 0]
    [0 0 1 0]
    [0 0 0 1]

• The leading entry for row 2 appears in column 3 and the leading entry for row 3 appears in column 4.
• Column 2 does not contain a pivot.
C = [1 0 0 1/3]
    [0 0 1  0 ]
    [0 0 0  0 ]
    [0 0 0  0 ]

• All rows consisting of zeros are at the bottom of the matrix.
• Columns 1 and 3 contain a pivot hence all other entries in those columns are zero.
Example 2: Below are matrices not in reduced row echelon form:
D = [1 2 0 4]
    [0 0 0 0]
    [0 0 1 3]

• Fails property (a): the row consisting entirely of zeros must be at the bottom of the matrix.

E = [1 0  3 4]
    [0 2 −2 5]
    [0 0  1 2]

• Fails property (b): the first nonzero entry in row 2 is not equal to 1.
• Fails property (d): column 3 contains a leading entry but the other entries in that column are not equal to zero.

F = [1 0  3 4]
    [0 1 −2 5]
    [0 1  2 2]
    [0 0  0 0]

• Fails property (c): the leading entry of the third row must be to the right of the leading entry of row 2.
Example 3: The three elementary operations are illustrated below:
Let A = [0 1 0  2]
        [2 3 0 −4]
        [2 1 6  9]

(a) Interchanging rows 1 and 3 of ๐ด (๐‘…1 ↔ ๐‘…3), we obtain:

B = [2 1 6  9]
    [2 3 0 −4]
    [0 1 0  2]

(b) Replacing row 1 of ๐ต by ½ times row 1 of ๐ต ((1/2)๐‘…1 → ๐‘…1), we obtain:

C = [1 1/2 3 9/2]
    [2  3  0 −4 ]
    [0  1  0  2 ]

(c) Replacing row 2 by the sum of −2 times row 1 of ๐ถ plus row 2 of ๐ถ (−2๐‘…1 + ๐‘…2 → ๐‘…2), we obtain:

D = [1 1/2  3  9/2]
    [0  2  −6 −13 ]
    [0  1   0   2 ]
Definition 2.2.3: An ๐‘š x ๐‘› matrix ๐ด is said to be row equivalent to an ๐‘š x ๐‘› matrix ๐ต if ๐ต
can be obtained by applying a finite sequence of elementary row operations to ๐ด.
Hence in Example 3, ๐ด is row equivalent to ๐ท because ๐ท is obtained from ๐ด after
applying a series of elementary row operations on A.
NOTE:
1. Every matrix is row equivalent to itself.
2. If A is row equivalent to B, then B is row equivalent to A.
3. If A is row equivalent to B and B is row equivalent to C, then A is row equivalent to C.
Theorem 2.2.1: Every nonzero ๐‘š x ๐‘› matrix is row equivalent to a unique matrix in reduced
row echelon form.
2.3 The Gauss-Jordan Method
The Gauss-Jordan method is a systematic technique for applying elementary row
operations in an attempt to reduce a matrix to reduced row echelon form.
The following are steps in using the Gauss-Jordan method to put a matrix into reduced row
echelon form:
STEP 1: Form the augmented matrix [ Aโ”‚B ].
STEP 2: Obtain 1 as the first element of the first column.
STEP 3: Use the first row to transform the remaining entries in the first column to 0.
STEP 4: Obtain 1 as the second entry in the second column.
STEP 5: Use the second row to transform the remaining entries in the second column to 0.
STEP 6: Continue in this manner as far as possible.
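The six steps above can be sketched as a small Python routine (an illustrative sketch, not the text's notation; it works over floats with a small tolerance for "nonzero"):

```python
def rref(M):
    # Reduce matrix M (list of lists of floats) to reduced row echelon form
    # by the Gauss-Jordan steps: pick a pivot, scale it to 1, clear its column.
    A = [row[:] for row in M]
    rows, cols = len(A), len(A[0])
    pivot_row = 0
    for col in range(cols):
        # Find a row at or below pivot_row with a nonzero entry in this column.
        pr = next((r for r in range(pivot_row, rows) if abs(A[r][col]) > 1e-12), None)
        if pr is None:
            continue                                   # no pivot in this column
        A[pivot_row], A[pr] = A[pr], A[pivot_row]      # interchange rows
        p = A[pivot_row][col]
        A[pivot_row] = [x / p for x in A[pivot_row]]   # make the pivot 1
        for r in range(rows):                          # clear the rest of the column
            if r != pivot_row and abs(A[r][col]) > 1e-12:
                f = A[r][col]
                A[r] = [x - f * y for x, y in zip(A[r], A[pivot_row])]
        pivot_row += 1
        if pivot_row == rows:
            break
    return A
```

Applied to the matrix of Example 1 below, `rref` produces the same reduced row echelon form found by hand.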
Example 1: Transform A = [0 0 1  2]  to reduced row echelon form.
                         [2 3 0 −2]
                         [3 3 6 −9]

To transform ๐ด to reduced row echelon form, we perform the Gauss-Jordan elimination
method. The sequence of elementary row operations, beginning with ๐ด, follows.

Interchange rows 1 and 3 of ๐ด (๐‘…1 ↔ ๐‘…3) to obtain ๐ด1:

A1 = [3 3 6 −9]
     [2 3 0 −2]
     [0 0 1  2]

Multiply row 1 of ๐ด1 by 1/3 ((1/3)๐‘…1 → ๐‘…1) to obtain ๐ด2:

A2 = [1 1 2 −3]
     [2 3 0 −2]
     [0 0 1  2]

Multiply row 1 of ๐ด2 by −2 and add to row 2 (−2๐‘…1 + ๐‘…2 → ๐‘…2) to obtain ๐ด3:

A3 = [1 1  2 −3]
     [0 1 −4  4]
     [0 0  1  2]

Multiply row 2 of ๐ด3 by −1 and add to row 1 (−๐‘…2 + ๐‘…1 → ๐‘…1) to obtain ๐ด4:

A4 = [1 0  6 −7]
     [0 1 −4  4]
     [0 0  1  2]

Multiply row 3 of ๐ด4 by −6 and add to row 1; multiply row 3 of ๐ด4 by 4 and add to row 2 (−6๐‘…3 + ๐‘…1 → ๐‘…1, 4๐‘…3 + ๐‘…2 → ๐‘…2) to obtain ๐ด5:

A5 = [1 0 0 −19]
     [0 1 0  12]
     [0 0 1   2]

๐ด5 is already in reduced row echelon form.
We now apply these results to the solution of linear systems.
Theorem 2.3.1. Let AX = B and CX = D be two linear systems each of ๐‘š equations in ๐‘›
unknowns. If the augmented matrices [ Aโ”‚B ] and [ Cโ”‚D ] of these systems are row
equivalent, then both linear systems have exactly the same solutions.
Example 2. Solve the linear system
x + y + 2z = -1
x – 2y + z = -5
3x + y + z = 3
by Gauss-Jordan reduction.
Solution: Let the augmented matrix be

A = [1  1 2 | −1]
    [1 −2 1 | −5]
    [3  1 1 |  3]

Applying the Gauss-Jordan elimination we have

−๐‘…1 + ๐‘…2 → ๐‘…2    [1  1  2 | −1]
−3๐‘…1 + ๐‘…3 → ๐‘…3   [0 −3 −1 | −4]
                  [0 −2 −5 |  6]

−(1/3)๐‘…2 → ๐‘…2   [1  1  2  | −1 ]
                 [0  1 1/3 | 4/3]
                 [0 −2 −5  |  6 ]

−๐‘…2 + ๐‘…1 → ๐‘…1   [1 0   5/3  | −7/3 ]
2๐‘…2 + ๐‘…3 → ๐‘…3   [0 1   1/3  |  4/3 ]
                 [0 0 −13/3  | 26/3 ]

−(3/13)๐‘…3 → ๐‘…3   [1 0 5/3 | −7/3]
                  [0 1 1/3 |  4/3]
                  [0 0  1  |  −2 ]

−(5/3)๐‘…3 + ๐‘…1 → ๐‘…1   [1 0 0 |  1]
−(1/3)๐‘…3 + ๐‘…2 → ๐‘…2   [0 1 0 |  2]
                      [0 0 1 | −2]

The last matrix is in reduced row echelon form and is row equivalent to the augmented
matrix representing the given system. As an augmented matrix, it represents the system

x = 1
y = 2
z = −2
Thus the solution set is {(1, 2, - 2)}. The given system is consistent.
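As a quick numerical check of this example (assuming NumPy is available), `numpy.linalg.solve` gives the same unique solution:

```python
import numpy as np

# Coefficient matrix and right-hand side of the system in Example 2.
A = np.array([[1.0,  1.0, 2.0],
              [1.0, -2.0, 1.0],
              [3.0,  1.0, 1.0]])
b = np.array([-1.0, -5.0, 3.0])

x = np.linalg.solve(A, b)  # unique solution since A is nonsingular
print(x)  # approximately [ 1.  2. -2.]
```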
Example 3: Solve the linear system
๐‘ฅ1 + ๐‘ฅ2 − ๐‘ฅ3 = 7
4๐‘ฅ1 − ๐‘ฅ2 + 5๐‘ฅ3 = 4
6๐‘ฅ1 + ๐‘ฅ2 + 3๐‘ฅ3 = 0
Solution: Let the augmented matrix be

[1  1 −1 | 7]
[4 −1  5 | 4]
[6  1  3 | 0]

Applying the Gauss-Jordan elimination we have

−4๐‘…1 + ๐‘…2 → ๐‘…2   [1  1 −1 |  7 ]
−6๐‘…1 + ๐‘…3 → ๐‘…3   [0 −5  9 | −24]
                  [0 −5  9 | −42]

(1/5)๐‘…2 + ๐‘…1 → ๐‘…1   [1  0 4/5 | 11/5]
−๐‘…2 + ๐‘…3 → ๐‘…3       [0 −5  9  | −24 ]
                     [0  0  0  | −18 ]

−(1/5)๐‘…2 → ๐‘…2   [1 0  4/5  | 11/5]
                 [0 1 −9/5  | 24/5]
                 [0 0   0   | −18 ]
The last equation reads 0๐‘ฅ1 + 0๐‘ฅ2 + 0๐‘ฅ3 = −18, which is impossible since 0 ≠ −18.
Hence the given system has no solution. In this case the system is said to be inconsistent.
Example 4: Solve the system
2๐‘ฅ1 + 6๐‘ฅ2 − 4๐‘ฅ3 + 2๐‘ฅ4 = 4
๐‘ฅ1
− ๐‘ฅ3 + ๐‘ฅ4 = 5
−3๐‘ฅ1 + 2๐‘ฅ2 − 2๐‘ฅ3
= −2
Solution: Applying the Gauss-Jordan elimination to the augmented matrix, we have

[ 2 6 −4 2 |  4]   ๐‘…1 ↔ ๐‘…2   [ 1 0 −1 1 |  5]
[ 1 0 −1 1 |  5]             [ 2 6 −4 2 |  4]
[−3 2 −2 0 | −2]             [−3 2 −2 0 | −2]

−2๐‘…1 + ๐‘…2 → ๐‘…2   [1 0 −1 1 |  5]
3๐‘…1 + ๐‘…3 → ๐‘…3    [0 6 −2 0 | −6]
                  [0 2 −5 3 | 13]

(1/6)๐‘…2 → ๐‘…2   [1 0  −1  1 |  5]
                [0 1 −1/3 0 | −1]
                [0 2  −5  3 | 13]

−2๐‘…2 + ๐‘…3 → ๐‘…3   [1 0   −1   1 |  5]
                  [0 1  −1/3  0 | −1]
                  [0 0 −13/3  3 | 15]

−(3/13)๐‘…3 → ๐‘…3   [1 0  −1     1    |    5   ]
                  [0 1 −1/3    0    |   −1   ]
                  [0 0   1   −9/13  | −45/13 ]

๐‘…3 + ๐‘…1 → ๐‘…1        [1 0 0  4/13  |  20/13]
(1/3)๐‘…3 + ๐‘…2 → ๐‘…2   [0 1 0 −3/13  | −28/13]
                     [0 0 1 −9/13  | −45/13]

Column 4 has no pivot hence ๐‘ฅ4 is a free variable. If we let ๐‘ฅ4 = ๐‘Ÿ, ๐‘Ÿ ∈ ๐‘…, then we obtain:

๐‘ฅ1 = 20/13 − (4/13)๐‘Ÿ,  ๐‘ฅ2 = −28/13 + (3/13)๐‘Ÿ,  and  ๐‘ฅ3 = −45/13 + (9/13)๐‘Ÿ.

Thus the solution set is of the form (20/13 − (4/13)๐‘Ÿ, −28/13 + (3/13)๐‘Ÿ, −45/13 + (9/13)๐‘Ÿ, ๐‘Ÿ), ๐‘Ÿ ∈ ๐‘…. The given system has infinitely many solutions.
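One way to gain confidence in a parametric solution like this is to substitute a particular member of the solution set (say, the one with the free variable set to 0) back into the original equations; a sketch using exact fractions:

```python
from fractions import Fraction as F

# The particular solution of the system with x4 = 0.
x1, x2, x3, x4 = F(20, 13), F(-28, 13), F(-45, 13), F(0)

# Substitute into each of the three original equations.
assert 2*x1 + 6*x2 - 4*x3 + 2*x4 == 4
assert x1 - x3 + x4 == 5
assert -3*x1 + 2*x2 - 2*x3 == -2
```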
Corollary 2.3.1. If A and C are row equivalent ๐‘š x ๐‘› matrices, then the linear systems
AX = 0 and CX = 0 have exactly the same solutions.
SAQ 2-1
Consider the system
2๐‘ฅ1 − ๐‘ฅ2 + 3๐‘ฅ3 = ๐‘Ž
3๐‘ฅ1 + ๐‘ฅ2 − 5๐‘ฅ3 = ๐‘
−5๐‘ฅ1 − 5๐‘ฅ2 + 21๐‘ฅ3 = ๐‘
Show that the system is inconsistent if ๐‘ ≠ 2๐‘Ž − 3๐‘.
ASAQ 2-1
Reducing the augmented matrix of the given system to row echelon form, we have

[ 2 −1  3 | ๐‘Ž]   (1/2)๐‘…1 → ๐‘…1   [ 1 −1/2 3/2 | ๐‘Ž/2]
[ 3  1 −5 | ๐‘]                   [ 3   1   −5 |  ๐‘ ]
[−5 −5 21 | ๐‘]                   [−5  −5   21 |  ๐‘ ]

−3๐‘…1 + ๐‘…2 → ๐‘…2   [1  −1/2    3/2  |     ๐‘Ž/2    ]
5๐‘…1 + ๐‘…3 → ๐‘…3    [0   5/2  −19/2  | (−3๐‘Ž+2๐‘)/2]
                  [0 −15/2   57/2  | (5๐‘Ž+2๐‘)/2 ]

3๐‘…2 + ๐‘…3 → ๐‘…3   [1 −1/2    3/2  |     ๐‘Ž/2    ]
                 [0  5/2  −19/2  | (−3๐‘Ž+2๐‘)/2]
                 [0   0      0   | −2๐‘Ž+3๐‘+๐‘  ]
The last equation reads 0๐‘ฅ1 + 0๐‘ฅ2 + 0๐‘ฅ3 = −2๐‘Ž + 3๐‘ + ๐‘. The given system has no
solution if −2๐‘Ž + 3๐‘ + ๐‘ ≠ 0. Thus the system is inconsistent if ๐‘ ≠ 2๐‘Ž − 3๐‘.
2.4 Homogeneous Systems
A linear system of the form
๐‘Ž11๐‘ฅ1 + ๐‘Ž12๐‘ฅ2 + โ‹ฏ + ๐‘Ž1๐‘›๐‘ฅ๐‘› = 0
๐‘Ž21๐‘ฅ1 + ๐‘Ž22๐‘ฅ2 + โ‹ฏ + ๐‘Ž2๐‘›๐‘ฅ๐‘› = 0
          โ‹ฎ                      (2)
๐‘Ž๐‘š1๐‘ฅ1 + ๐‘Ž๐‘š2๐‘ฅ2 + โ‹ฏ + ๐‘Ž๐‘š๐‘›๐‘ฅ๐‘› = 0
is called a homogeneous system. We can write it in matrix form as
๐ด๐‘‹ = 0
The solution to the homogeneous system (2) is called the trivial solution if
๐‘ฅ1 = ๐‘ฅ2 = โ‹ฏ = ๐‘ฅ๐‘› = 0, and a solution ๐‘ฅ1, ๐‘ฅ2, …, ๐‘ฅ๐‘› in which not all the ๐‘ฅ๐‘– are zero is called a
nontrivial solution.
Example 1. Consider the homogeneous system
๐‘ฅ + 2๐‘ฆ + ๐‘ง = 0
2๐‘ฅ + 5๐‘ฆ + 3๐‘ง = 0
3๐‘ฅ + 7๐‘ฆ + 5๐‘ง = 0
The augmented matrix of this system is
[1 2 1 | 0]
[2 5 3 | 0]
[3 7 5 | 0]

which is row equivalent to (you must do the calculation here)

[1 0 0 | 0]
[0 1 0 | 0]
[0 0 1 | 0]
Hence the solution to the given homogeneous system is ๐‘ฅ = ๐‘ฆ = ๐‘ง = 0 (a trivial solution).
SAQ 2-2
Find the solution of the homogeneous system
๐‘ฅ + ๐‘ฆ + ๐‘ง + ๐‘ค = 0
2๐‘ฅ + ๐‘ฆ
๐‘ง
= 0
3๐‘ฅ + ๐‘ฆ + ๐‘ง + 2๐‘ค = 0
ASAQ 2-2
The augmented matrix of this system is
[1 1  1 1 | 0]
[2 1 −1 0 | 0]
[3 1  1 2 | 0]
Transforming it to reduced row echelon form we have

−2๐‘…1 + ๐‘…2 → ๐‘…2   [1  1  1  1 | 0]
−3๐‘…1 + ๐‘…3 → ๐‘…3   [0 −1 −3 −2 | 0]
                  [0 −2 −2 −1 | 0]

−๐‘…2 → ๐‘…2   [1  1  1  1 | 0]
            [0  1  3  2 | 0]
            [0 −2 −2 −1 | 0]

−๐‘…2 + ๐‘…1 → ๐‘…1   [1 0 −2 −1 | 0]
2๐‘…2 + ๐‘…3 → ๐‘…3   [0 1  3  2 | 0]
                 [0 0  4  3 | 0]

(1/4)๐‘…3 → ๐‘…3   [1 0 −2  −1  | 0]
                [0 1  3   2  | 0]
                [0 0  1  3/4 | 0]

2๐‘…3 + ๐‘…1 → ๐‘…1    [1 0 0  1/2  | 0]
−3๐‘…3 + ๐‘…2 → ๐‘…2   [0 1 0 −1/4  | 0]
                  [0 0 1  3/4  | 0]

The fourth column has no pivot hence ๐‘ค is a free variable. If we let ๐‘ค = ๐‘ , ๐‘  ∈ ๐‘…, then the
solution of the given homogeneous system is

๐‘ฅ = −(1/2)๐‘ ,  ๐‘ฆ = (1/4)๐‘ ,  ๐‘ง = −(3/4)๐‘ ,  ๐‘ค = ๐‘ 

where ๐‘  is any real number. Thus the solution set is of the form {(−(1/2)๐‘ , (1/4)๐‘ , −(3/4)๐‘ , ๐‘ )},
๐‘  ∈ ๐‘…. This shows that the system has a nontrivial solution. This result is generalized in the
next theorem.
Theorem 2.4.1: A homogeneous system of ๐‘š equations in ๐‘› unknowns always has a
nontrivial solution if ๐‘š < ๐‘›, that is , if the number of unknowns exceeds the number of
equations.
ACTIVITY
1. Find all solutions to the given linear systems.
a. ๐‘ฅ1 − ๐‘ฅ2 − ๐‘ฅ3 = 0
b. 2๐‘ฅ1 + 3๐‘ฅ2 − ๐‘ฅ3 = 0
2๐‘ฅ1 + ๐‘ฅ2 + 2๐‘ฅ3 = 0
−4๐‘ฅ1 + 2๐‘ฅ2 + ๐‘ฅ3 = 0
๐‘ฅ1 − 4๐‘ฅ2 − 5๐‘ฅ3 = 0
7๐‘ฅ1 + 3๐‘ฅ2 − 9๐‘ฅ3 = 0
c. ๐‘ฅ1 + ๐‘ฅ2 + ๐‘ฅ3 + ๐‘ฅ4 = 6
2๐‘ฅ1
− ๐‘ฅ3 − ๐‘ฅ4 = 4
3 ๐‘ฅ3 + 6๐‘ฅ4 = 3
๐‘ฅ1
− ๐‘ฅ4 = 5
2. Find all values of ๐‘Ž for which the resulting system has (a) no solution, (b) a unique
solution, and (c) infinitely many solutions.
๐‘ฅ1 + ๐‘ฅ2 − ๐‘ฅ3 = 3
๐‘ฅ1 − ๐‘ฅ2 + 3๐‘ฅ3 = 4
๐‘ฅ1 + ๐‘ฅ2 + (๐‘Ž2 − 10)๐‘ฅ3 = ๐‘Ž
2.5 The Inverse of a Matrix
When working with real numbers, we know that a number ๐‘Ž times its inverse 1/๐‘Ž is one,
provided that ๐‘Ž ≠ 0. Thus the equation ๐‘Ž๐‘ฅ = ๐‘ can be solved for ๐‘ฅ by dividing both sides
by ๐‘Ž (or multiplying both sides by 1/๐‘Ž) to get ๐‘ฅ = ๐‘/๐‘Ž, provided that ๐‘Ž ≠ 0.
However, the matrix equation ๐ด๐‘‹ = ๐ต cannot be divided by the matrix ๐ด on both sides
because there is no matrix division.
This section is concerned with finding a matrix whose function is similar to that of the inverse
of a real number ๐‘Ž. We would like to find a matrix inverse such that ๐ด times its inverse is equal to
“one”. The matrix equivalent of “one” is called the “identity matrix”.
Definition 2.5.1: An ๐‘› x ๐‘› matrix ๐ด is called nonsingular (or invertible) if there exists an ๐‘› x
๐‘› matrix ๐ต such that
๐ด๐ต = ๐ต๐ด = ๐ผ๐‘›
The matrix ๐ต is called an inverse of ๐ด. If there exists no such matrix ๐ต, then ๐ด is called
singular (or non-invertible).
Example 1: Let ๐ด = [−2 1]  and ๐ต = [−4/11 1/11]  then,
                   [ 3 4]          [ 3/11 2/11]

๐ด๐ต = [−2 1][−4/11 1/11] = [1 0] = ๐ผ2   and
     [ 3 4][ 3/11 2/11]   [0 1]

๐ต๐ด = [−4/11 1/11][−2 1] = [1 0] = ๐ผ2
     [ 3/11 2/11][ 3 4]   [0 1]

Since ๐ด๐ต = ๐ต๐ด = ๐ผ2, we conclude that ๐ต is an inverse of ๐ด and that ๐ด is nonsingular.
Example 2: Let ๐ด = [ 4 2], find ๐ด−1.
                   [−3 1]

Solution: Let ๐ด−1 = [๐‘Ž ๐‘]  then
                    [๐‘ ๐‘‘]

๐ด๐ด−1 = ๐ผ2

[ 4 2][๐‘Ž ๐‘] = [1 0]
[−3 1][๐‘ ๐‘‘]   [0 1]

[ 4๐‘Ž + 2๐‘   4๐‘ + 2๐‘‘] = [1 0]
[−3๐‘Ž + ๐‘   −3๐‘ + ๐‘‘ ]   [0 1]

Two matrices are equal if and only if their corresponding entries are equal. Hence if we
equate the corresponding entries, we get

4๐‘Ž + 2๐‘ = 1          4๐‘ + 2๐‘‘ = 0
−3๐‘Ž + ๐‘ = 0   and    −3๐‘ + ๐‘‘ = 1

Solving the system we have ๐‘Ž = 1/10, ๐‘ = −1/5, ๐‘ = 3/10, and ๐‘‘ = 2/5. Thus

๐ด−1 = [1/10 −1/5]
      [3/10  2/5]
You can always check your answer by taking the product ๐ด๐ด−1 and making sure that the
answer is the identity matrix ๐ผ2 .
There is a simple procedure for finding the inverse of a 2 x 2 matrix. It can be done
easily as follows:
Let ๐ด = [
๐‘Ž
๐‘
๐‘ค
๐‘
]. We are looking for a matrix [ ๐‘ฆ
๐‘‘
๐‘ฅ
๐‘ง ] such that
37
[
[
๐‘Ž
๐‘
๐‘Ž๐‘ค + ๐‘๐‘ฆ
๐‘๐‘ค + ๐‘‘๐‘ฆ
๐‘ ๐‘ค
][
๐‘‘ ๐‘ฆ
๐‘ฅ
1 0
๐‘ง ] = [0 1]
๐‘Ž๐‘ฅ + ๐‘๐‘ง
1 0
] = [
]
๐‘๐‘ฅ + ๐‘‘๐‘ง
0 1
Equating the corresponding parts we have
๐‘Ž๐‘ค + ๐‘๐‘ฆ = 1
(1)
๐‘๐‘ค + ๐‘‘๐‘ฆ = 0
(2)
๐‘Ž๐‘ฅ + ๐‘๐‘ง = 0
(3)
๐‘๐‘ฅ + ๐‘‘๐‘ง = 1
(4)
Multiplying equation (1) by ๐‘ and equation (2) by ๐‘Ž we get:
๐‘Ž๐‘๐‘ค + ๐‘๐‘๐‘ฆ = ๐‘
๐‘Ž๐‘๐‘ค + ๐‘Ž๐‘‘๐‘ฆ = 0
(1a)
(2a)
Subtracting equation (1a) from equation (2a) and solving for ๐‘ฆ gives us:
๐‘ฆ=
−๐‘
๐‘Ž๐‘‘−๐‘๐‘
Multiplying equation (3) by ๐‘ and equation (4) by ๐‘Ž we obtain:
๐‘Ž๐‘๐‘ฅ + ๐‘๐‘๐‘ง = 0
๐‘Ž๐‘๐‘ฅ + ๐‘Ž๐‘‘๐‘ง = ๐‘Ž
(3a)
(4a)
Subtracting equation (3a) from equation (4a) and solving for ๐‘ง gives us:
๐‘ง=
๐‘Ž
๐‘Ž๐‘‘−๐‘๐‘
In a similar manner we can solve for ๐‘ค by multiplying equation (1) by ๐‘‘ and equation (2) by
๐‘:
๐‘‘
๐‘ค=
๐‘Ž๐‘‘−๐‘๐‘
Finally we multiply equation (3) by ๐‘‘ and equation (4) by ๐‘ to solve for ๐‘ฅ:
๐‘ฅ=
−๐‘
๐‘Ž๐‘‘−๐‘๐‘
38
Thus, the inverse of matrix ๐ด = [
๐‘‘
−๐‘
[๐‘Ž๐‘‘−๐‘๐‘
−๐‘
๐‘Ž๐‘‘−๐‘๐‘
๐‘Ž ]
๐‘Ž๐‘‘−๐‘๐‘
๐‘Ž๐‘‘−๐‘๐‘
=
๐‘Ž
๐‘
๐‘
] is the matrix
๐‘‘
1
๐‘‘
[
๐‘Ž๐‘‘−๐‘๐‘ −๐‘
−๐‘
] provided that ๐‘Ž๐‘‘ − ๐‘๐‘ ≠ 0.
๐‘Ž
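The 2 x 2 formula translates directly into code (a sketch using exact arithmetic; `inverse_2x2` is our own name, assuming integer entries):

```python
from fractions import Fraction as F

def inverse_2x2(A):
    # Inverse of [[a, b], [c, d]] via (1/(ad - bc)) * [[d, -b], [-c, a]],
    # provided ad - bc != 0.
    (a, b), (c, d) = A
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular: ad - bc = 0")
    return [[F(d, det), F(-b, det)],
            [F(-c, det), F(a, det)]]
```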
Example 3: Solve for the inverse of matrix ๐ด = [ 4 2]  using the formula.
                                              [−3 1]

Solution: Let ๐ด = [ 4 2] = [๐‘Ž ๐‘]  then ๐‘Ž๐‘‘ − ๐‘๐‘ = 4(1) − 2(−3) = 10. Thus
                  [−3 1]   [๐‘ ๐‘‘]

๐ด−1 = (1/10)[1 −2] = [1/10 −1/5]
            [3  4]   [3/10  2/5]
Note that this procedure only works for 2 x 2 matrices. In general, the procedure in finding
the inverse of any ๐‘› x ๐‘› matrix is as follows:
1. Form the ๐‘› x 2๐‘› partitioned matrix [ ๐ดโ”‚๐ผ๐‘› ] obtained by adjoining the identity matrix
๐ผ๐‘› to the given matrix ๐ด.
2. Use elementary row operations to transform the matrix obtained in Step 1 to
reduced row echelon form. Remember that any elementary row operation we do to
a row of ๐ด we also do to the corresponding row of ๐ผ๐‘› .
3. The series of elementary row operations which reduces ๐ด to ๐ผ๐‘› will reduce ๐ผ๐‘› to
๐ด−1 . If ๐ด cannot be reduced to ๐ผ๐‘› then ๐ด is singular and ๐ด−1 does not exist.
Example 4. Find the inverse of the matrix ๐ด = [1 2 3]
                                             [2 5 3]
                                             [1 0 8]

Solution:
We form the 3 x 6 partitioned matrix [๐ด | ๐ผ3] by adjoining the 3 x 3 identity matrix to ๐ด and
transform it to reduced row echelon form by applying the Gauss-Jordan elimination.

[1 2 3 | 1 0 0]  −2๐‘…1 + ๐‘…2 → ๐‘…2   [1  2  3 |  1 0 0]
[2 5 3 | 0 1 0]  −๐‘…1 + ๐‘…3 → ๐‘…3    [0  1 −3 | −2 1 0]
[1 0 8 | 0 0 1]                    [0 −2  5 | −1 0 1]

−2๐‘…2 + ๐‘…1 → ๐‘…1   [1 0  9 |  5 −2 0]
2๐‘…2 + ๐‘…3 → ๐‘…3    [0 1 −3 | −2  1 0]
                  [0 0 −1 | −5  2 1]

−๐‘…3 → ๐‘…3   [1 0  9 |  5 −2  0]
            [0 1 −3 | −2  1  0]
            [0 0  1 |  5 −2 −1]

−9๐‘…3 + ๐‘…1 → ๐‘…1   [1 0 0 | −40 16  9]
3๐‘…3 + ๐‘…2 → ๐‘…2    [0 1 0 |  13 −5 −3]
                  [0 0 1 |   5 −2 −1]

Since ๐ด has been transformed to ๐ผ3 then ๐ด−1 = [−40 16  9]
                                              [ 13 −5 −3]
                                              [  5 −2 −1]
You can check your answer by taking the product ๐ด๐ด−1 and making sure that the answer is
the identity matrix ๐ผ3 .
Example 5. Find the inverse of the matrix ๐ด = [ 2  1 −1]
                                             [ 1 −2 −3]
                                             [−3 −1  2]

Solution: We form the 3 x 6 matrix [๐ด | ๐ผ3] and transform it to reduced row echelon form
(you must do the calculation here):

[ 2  1 −1 | 1 0 0]                        [1 0 −1 | 2/5  1/5 0]
[ 1 −2 −3 | 0 1 0]  is row equivalent to  [0 1  1 | 1/5 −2/5 0]
[−3 −1  2 | 0 0 1]                        [0 0  0 | 7/5  1/5 1]
Since ๐ด is row equivalent to a matrix which has a row consisting of zeros, then ๐ด cannot be
reduced to ๐ผ3 . Thus ๐ด has no inverse and we say that ๐ด is a singular matrix.
Example 4 shows that an inverse exists if the partitioned matrix [๐ด | ๐ผ๐‘›] can be
reduced to [๐ผ๐‘› | ๐ต] where ๐ต = ๐ด−1. That is, ๐ด is row equivalent to ๐ผ๐‘›. This is stated in the
following theorem.
Theorem 2.5.1: An ๐‘› x ๐‘› matrix is nonsingular if and only if it is row equivalent to ๐ผ๐‘› .
Theorem 2.5.2: If a matrix has an inverse, then the inverse is unique.
Proof:
Suppose ๐ด has two inverses, say ๐ต and C. Then by Definition 2.5.1, ๐ด๐ต = ๐ต๐ด = ๐ผ๐‘› and
๐ด๐ถ = ๐ถ๐ด = ๐ผ๐‘› . We now have
๐ต(๐ด๐ถ) = (๐ต๐ด)๐ถ
Matrix multiplication is associative
Then
๐ต = ๐ต๐ผ๐‘› = ๐ต(๐ด๐ถ) = (๐ต๐ด)๐ถ = ๐ผ๐‘› ๐ถ = ๐ถ
By transitivity, we conclude that ๐ต = ๐ถ and the theorem is proved.
Theorem 2.5.3: ( Properties of the Inverse )
a) If ๐ด is a nonsingular matrix, then ๐ด−1 is nonsingular and
(๐ด−1 )−1 = ๐ด
b) If ๐ด and ๐ต are nonsingular matrices, then ๐ด๐ต is nonsingular and
(๐ด๐ต)−1 = ๐ต −1 ๐ด−1
c) If A is a nonsingular matrix, then
(๐ด๐‘‡ )−1 = (๐ด−1 )๐‘‡
We prove part (b) of the theorem and leave the proof of parts (a) and (c) as an exercise.
Proof of part (b): We have to show that
(๐ด๐ต)(๐ต−1๐ด−1) = (๐ต−1๐ด−1)(๐ด๐ต) = ๐ผ๐‘›; that is, the inverse of ๐ด๐ต is ๐ต−1๐ด−1.
(๐ด๐ต)(๐ต−1 ๐ด−1 ) = ๐ด(๐ต๐ต −1 )๐ด−1
Matrix multiplication is associative
= ๐ด(๐ผ๐‘› )๐ด−1
Definition of an inverse of a matrix
= ๐ด๐ด−1
๐ผ๐‘› is a multiplicative identity for an ๐‘› x ๐‘› matrix
= ๐ผ๐‘›
Definition of an inverse of a matrix
Similarly
(๐ต −1 ๐ด−1 )(๐ด๐ต) = ๐ต −1 (๐ด−1 ๐ด)๐ต
Matrix multiplication is associative
= ๐ต −1 (๐ผ๐‘› )๐ต
Definition of an inverse of a matrix
= ๐ต −1 ๐ต
๐ผ๐‘› is a multiplicative identity for an ๐‘› x ๐‘› matrix
= ๐ผ๐‘›
Definition of an inverse of a matrix
Hence (๐ด๐ต)−1 = ๐ต −1 ๐ด−1.
Corollary 2.5.1: If ๐ด1 , ๐ด2 , โ‹ฏ , ๐ด๐‘Ÿ are ๐‘› x ๐‘› nonsingular matrices, then ๐ด1 ๐ด2 โ‹ฏ ๐ด๐‘Ÿ
is nonsingular and
(๐ด1 ๐ด2 โ‹ฏ ๐ด๐‘Ÿ )−1 = ๐ด๐‘Ÿ −1 ๐ด๐‘Ÿ−1 −1 โ‹ฏ ๐ด1 −1
Theorem 2.5.4: Suppose that ๐ด and ๐ต are ๐‘› x ๐‘› matrices
a) If ๐ด๐ต = ๐ผ๐‘› then ๐ต๐ด = ๐ผ๐‘› .
b) If ๐ต๐ด = ๐ผ๐‘› then ๐ด๐ต = ๐ผ๐‘› .
2.6 Linear Systems and Inverses
Suppose that matrix ๐ด is invertible. Consider the system ๐ด๐‘‹ = ๐ต where
     [๐‘ฅ1]            [๐‘1]
๐‘‹ =  [๐‘ฅ2]  and  ๐ต =  [๐‘2]
     [ โ‹ฎ ]           [ โ‹ฎ ]
     [๐‘ฅ๐‘›]            [๐‘๐‘›]

are ๐‘› x 1 matrices, where ๐‘ฅ1, ๐‘ฅ2, โ‹ฏ, ๐‘ฅ๐‘› are variables and
๐‘1, ๐‘2, โ‹ฏ, ๐‘๐‘› are real numbers. Since ๐ด is invertible then ๐ด−1 exists and
๐ด๐‘‹ = ๐ต
๐ด−1 ๐ด๐‘‹ = ๐ด−1 ๐ต
Multiplying both sides by ๐ด−1
๐ผ๐‘› ๐‘‹ = ๐ด−1 ๐ต
Definition of an inverse of a matrix
๐‘‹ = ๐ด−1 ๐ต
๐ผ๐‘› is the multiplicative identity for
an ๐‘› x ๐‘› matrix
Hence ๐‘‹ = ๐ด−1 ๐ต is the unique solution of the system.
Example 1. Let ๐ด = [
1 1
5
], ๐ต = [ ]. Solve ๐ด๐‘‹ = ๐ต by using the inverse of A.
1 2
7
Solution:
๐ด−1 = [
2 −1
] (verify)
−1 1
42
Hence
๐‘‹ = ๐ด−1 ๐ต
2 −1 5
3
๐‘‹=[
][ ]= [ ]
−1 1 7
2
Hence the unique solution is ๐‘‹ = (3, 2).
Theorem 2.6.1: If ๐ด is an ๐‘› x ๐‘› matrix, the homogeneous system ๐ด๐‘‹ = 0 has a nontrivial
solution if and only if A is singular.
Example 2: Consider the homogeneous system
2๐‘ฅ1 + ๐‘ฅ2 − ๐‘ฅ3 = 0
๐‘ฅ1 − 2๐‘ฅ2 − 3๐‘ฅ3 = 0
−3๐‘ฅ1 − ๐‘ฅ2 + 2๐‘ฅ3 = 0
where the coefficient matrix ๐ด = [ 2  1 −1]  is the singular matrix of Example 5 of
                                [ 1 −2 −3]
                                [−3 −1  2]
the previous section. The augmented matrix

[ 2  1 −1 | 0]                        [1 0 −1 | 0]
[ 1 −2 −3 | 0]  is row equivalent to  [0 1  1 | 0]
[−3 −1  2 | 0]                        [0 0  0 | 0]
which implies that ๐‘ฅ1 = ๐‘Ÿ, ๐‘ฅ2 = −๐‘Ÿ, and ๐‘ฅ3 = ๐‘Ÿ, where ๐‘Ÿ ∈ ๐‘…. Thus the system has a
nontrivial solution.
Theorem 2.6.2: If ๐ด is an ๐‘› x ๐‘› matrix, then ๐ด is nonsingular if and only if the linear system
๐ด๐‘‹ = ๐ต has a unique solution for every ๐‘› x 1 matrix ๐ต.
SAQ 2-3
Let ๐ด = [1  1 −1]. Find the inverse of ๐ด then solve ๐ด๐‘‹ = ๐ต where ๐ต is the 3 x 1 matrix
        [1 −1  2]
        [2  1 −1]

(a) [ 1]     (b) [ 4]     (c) [ 5]
    [−2]         [−3]         [ 7]
    [ 3]         [ 5]         [−4]
ASAQ 2-3
To solve for ๐ด−1 let us form the 3 x 6 matrix [๐ด ๐ผ3 ] and transform it to reduced row echelon form.
1 1 −1
[1 −1 2
2 1 −1
1 0 0 −๐‘… + ๐‘…
1
2
0 1 0] −2๐‘… + ๐‘…
1
3
0 0 1
๐‘…2
−๐‘…2
๐‘…2 1
[
๐‘…3 0
0
1 −1
−2 3
−1 1
1 0
−1 1
−2 0
0
0]
1
1 1 −1
๐‘…3 [0 −1 1
0 −2 3
1 0 0
−2 0 1]
−1 1 0
1 1 −1
๐‘…2 [0 1 −1
0 −2 3
1
2
−1
0 0
0 −1]
1 0
−1 0
2 0
3 1
1
−1]
−2
−๐‘…2 + ๐‘…1
2๐‘…2 + ๐‘…3
๐‘…1
๐‘…3
1 0
[0 1
0 0
0
−1
1
๐‘…3 + ๐‘…2
๐‘…2
1 0 0
[0 1 0
0 0 1
−1 0 1
5 1 −3]
3 1 −2
−1 0 1
Thus ๐ด−1 = [ 5 1 −3]. Using this to solve the linear system ๐ด๐‘‹ = ๐ต for ๐‘‹ where
3 1 −2
(a) ๐ต = [ 1]  we have
        [−2]
        [ 3]

๐‘‹ = ๐ด−1๐ต = [−1 0  1][ 1] = [ 2]
            [ 5 1 −3][−2]   [−6]
            [ 3 1 −2][ 3]   [−5]

(b) ๐ต = [ 4]  we have
        [−3]
        [ 5]

๐‘‹ = ๐ด−1๐ต = [−1 0  1][ 4] = [ 1]
            [ 5 1 −3][−3]   [ 2]
            [ 3 1 −2][ 5]   [−1]

(c) ๐ต = [ 5]  we have
        [ 7]
        [−4]

๐‘‹ = ๐ด−1๐ต = [−1 0  1][ 5] = [−9]
            [ 5 1 −3][ 7]   [44]
            [ 3 1 −2][−4]   [30]
This only shows that when a matrix is nonsingular, then the system ๐ด๐‘‹ = ๐ต has a unique
solution for every ๐‘› x 1 matrix ๐ต.
Based on the preceding theorems and examples, it can be noted that the following
statements are equivalent:
1. The matrix ๐ด is invertible (nonsingular).
2. The system ๐ด๐‘‹ = 0 has only the trivial solution.
3. The matrices ๐ด and ๐ผ๐‘› are row equivalent.
4. The system ๐ด๐‘‹ = ๐ต has a unique solution for every ๐‘› x 1 matrix ๐ต.
ACTIVITY
1. Find the inverse of the following matrices:

a. [1 1 1]        b. [ 1 2 −3  4]
   [0 2 3]           [−4 2  1  3]
   [5 5 1]           [ 3 0  0 −3]
                     [ 2 0 −2  3]

2. Let ๐ต = [1 0 5]  and ๐ท = [1 4 2]. Calculate ๐ต๐ท and ๐ท−1๐ต−1. Verify
           [2 1 4]          [0 2 1]
           [3 4 7]          [3 5 3]

that (๐ต๐ท)(๐ท−1๐ต−1) = ๐ผ3.

3. Find all values of ๐‘Ž for which the inverse of

   ๐ด = [1 1 0]
       [1 0 0]
       [1 2 ๐‘Ž]

   exists. What is ๐ด−1?

4. Show that if ๐ด, ๐ต, and ๐ถ are invertible matrices, then ๐ด๐ต๐ถ is invertible and
(๐ด๐ต๐ถ)−1 = ๐ถ−1๐ต−1๐ด−1.
5. Prove parts (a) and (c) of Theorem 2.5.3.
2.7 Determinants
Definition 2.7.1: Let S = {1, 2, …, ๐‘›} be the set of integers from 1 to ๐‘›, arranged in ascending
order. A rearrangement ๐‘—1๐‘—2 โ‹ฏ ๐‘—๐‘› of the elements of S is called a permutation of S.
The set of all permutations of S is denoted by ๐‘†๐‘› . For example, {1, 2} has two
permutations namely ๐‘†2 = { (12), (21) } and {1, 2, 3} has six permutations namely
๐‘†3 = { (123), (132), (213), (231), (312), (321) }.
The number of permutations of the set {1, 2, …, ๐‘›} can be determined without
writing a list as in the example above. Notice that there are ๐‘› possible positions to be filled.
There are ๐‘› choices for the first position, ๐‘› − 1 for the second, ๐‘› − 2 for the third, and only
one element for the ๐‘›th position. Thus the total number of permutations of ๐‘› elements is
๐‘›(๐‘› − 1)(๐‘› − 2) … 2.1 = ๐‘›!
A permutation ๐‘—1๐‘—2 โ‹ฏ ๐‘—๐‘› of S = {1, 2, …, ๐‘›} is said to have an inversion if a larger
integer ๐‘—๐‘Ÿ precedes a smaller one ๐‘—๐‘ . For example, consider the permutation (23541). The
following pairs of numbers form an inversion: 21, 31, 54, 51, and 41.
A permutation is called even if the total number of inversions in it is even and odd if
the total number of inversions in it is odd.
Example 1: The even permutations in S3 are:
123 ( There is no inversion )
231 ( There are two inversions; 21 and 31 )
312 ( There are two inversions; 31 and 32 )
The odd permutations in S3 are:
132 ( There is one inversion; 32 )
213 ( There is one inversion; 21 )
321 ( There are three inversions; 32, 31, and 21 )
Note that the number of odd permutations is equal to the number of even permutations.
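Inversions and parity are easy to compute by machine (a sketch; `num_inversions` and `is_even` are our own helper names):

```python
from itertools import permutations

def num_inversions(p):
    # Count pairs (r, s) with r < s but p[r] > p[s]: a larger integer
    # preceding a smaller one.
    return sum(1 for r in range(len(p))
                 for s in range(r + 1, len(p)) if p[r] > p[s])

def is_even(p):
    # A permutation is even when its total number of inversions is even.
    return num_inversions(p) % 2 == 0

evens = [p for p in permutations((1, 2, 3)) if is_even(p)]
odds  = [p for p in permutations((1, 2, 3)) if not is_even(p)]
print(evens)  # [(1, 2, 3), (2, 3, 1), (3, 1, 2)]
print(odds)   # [(1, 3, 2), (2, 1, 3), (3, 2, 1)]
```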
Definition 2.7.2: Let ๐ด = [๐‘Ž๐‘–๐‘—] be an ๐‘› x ๐‘› matrix. We define the determinant of ๐ด (written
|๐ด|) by

|๐ด| = Σ ± ๐‘Ž1๐‘—1 ๐‘Ž2๐‘—2 โ‹ฏ ๐‘Ž๐‘›๐‘—๐‘›,

where the summation ranges over all permutations ๐‘—1๐‘—2 โ‹ฏ ๐‘—๐‘› of the set S = {1, 2, …, ๐‘›}.
Note that each term ±๐‘Ž1๐‘—1 ๐‘Ž2๐‘—2 โ‹ฏ ๐‘Ž๐‘›๐‘—๐‘› of |๐ด| is a product of ๐‘› elements of ๐ด
such that one and only one element comes from each row and one and only one element
comes from each column. Thus if the factors come from successive rows then the first
subscripts are in the natural order 1, 2, …, ๐‘›. Since all the columns used are different,
the second subscripts ๐‘—1, ๐‘—2, …, ๐‘—๐‘› are distinct. This means that ๐‘—1๐‘—2 โ‹ฏ ๐‘—๐‘› is a permutation of
{1, 2, …, ๐‘›}.

The sign of the term ๐‘Ž1๐‘—1 ๐‘Ž2๐‘—2 โ‹ฏ ๐‘Ž๐‘›๐‘—๐‘› is + if the permutation ๐‘—1๐‘—2 โ‹ฏ ๐‘—๐‘› is even;
otherwise, the sign is −.
Example 2: Let ๐ด = [๐‘Ž11 ๐‘Ž12]
                   [๐‘Ž21 ๐‘Ž22]

To illustrate Definition 2.7.2, let us consider exactly one element from each row and each
column. There are two possibilities:

[๐‘Ž11  ∗ ]        [ ∗  ๐‘Ž12]
[ ∗  ๐‘Ž22]  (1)   [๐‘Ž21  ∗ ]  (2)

In (1), the second subscripts form the even permutation 12, hence the sign of the
product ๐‘Ž11๐‘Ž22 is positive. In (2), the second subscripts form the odd permutation 21, hence
the sign of the product ๐‘Ž12๐‘Ž21 is negative. Thus

|๐ด| = ๐‘Ž11๐‘Ž22 − ๐‘Ž12๐‘Ž21

The determinant in Example 2 can be solved by writing the terms

๐‘Ž1_ ๐‘Ž2_  and  ๐‘Ž1_ ๐‘Ž2_   (there are only two terms because ๐‘†2 has only two permutations)

Next we fill in the blanks with the two permutations of ๐‘†2:

๐‘Ž11 ๐‘Ž22  and  ๐‘Ž12 ๐‘Ž21

We assign a + sign for the even permutation 12 and a − sign for the odd permutation 21.
Taking the sum of the two terms we have

|๐ด| = ๐‘Ž11๐‘Ž22 − ๐‘Ž12๐‘Ž21.
Example 3: Let ๐ด = [
2 −4
] then
3 5
|๐ด| = (2)(5) − (−4)(3) = 22
Example 4: Let ๐ด = [๐‘Ž11 ๐‘Ž12 ๐‘Ž13]. Since ๐‘†3 has 6 permutations then we write the 6 terms
                   [๐‘Ž21 ๐‘Ž22 ๐‘Ž23]
                   [๐‘Ž31 ๐‘Ž32 ๐‘Ž33]

๐‘Ž1_ ๐‘Ž2_ ๐‘Ž3_,  ๐‘Ž1_ ๐‘Ž2_ ๐‘Ž3_,  ๐‘Ž1_ ๐‘Ž2_ ๐‘Ž3_,  ๐‘Ž1_ ๐‘Ž2_ ๐‘Ž3_,  ๐‘Ž1_ ๐‘Ž2_ ๐‘Ž3_,  and  ๐‘Ž1_ ๐‘Ž2_ ๐‘Ž3_

Next we fill in the blanks with the six permutations of ๐‘†3. The even permutations are 123,
231, and 312 and the odd permutations are 213, 132, and 321. Thus

|๐ด| = ๐‘Ž11๐‘Ž22๐‘Ž33 + ๐‘Ž12๐‘Ž23๐‘Ž31 + ๐‘Ž13๐‘Ž21๐‘Ž32 − ๐‘Ž12๐‘Ž21๐‘Ž33 − ๐‘Ž11๐‘Ž23๐‘Ž32 − ๐‘Ž13๐‘Ž22๐‘Ž31

We can also obtain the determinant of ๐ด by augmenting the first two columns as
shown below:

๐‘Ž11 ๐‘Ž12 ๐‘Ž13 | ๐‘Ž11 ๐‘Ž12
๐‘Ž21 ๐‘Ž22 ๐‘Ž23 | ๐‘Ž21 ๐‘Ž22
๐‘Ž31 ๐‘Ž32 ๐‘Ž33 | ๐‘Ž31 ๐‘Ž32

We form the product of each of the three entries joined by a diagonal line from left to right and
precede each product by a plus sign. Next we form the product of each of the three entries
joined by a diagonal line from right to left and precede each product by a minus sign.
Example 5: Let ๐ด = [ 1  2 3]. Then
                   [−2  1 4]
                   [ 3 −1 2]

      | 1  2 3|  1  2
|๐ด| = |−2  1 4| −2  1
      | 3 −1 2|  3 −1

= (1)(1)(2) + (2)(4)(3) + (3)(−2)(−1) − (3)(1)(3) − (1)(4)(−1) − (2)(−2)(2)
= 35
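Definition 2.7.2 can be implemented directly for small matrices (a sketch; since it sums over all ๐‘›! permutations, it is only practical for small ๐‘›):

```python
from itertools import permutations

def sign(p):
    # +1 for an even permutation, -1 for an odd one (by counting inversions).
    inv = sum(1 for r in range(len(p))
                for s in range(r + 1, len(p)) if p[r] > p[s])
    return 1 if inv % 2 == 0 else -1

def det_by_permutations(A):
    # |A| = sum over permutations j of sign(j) * a[0][j0] * a[1][j1] * ...
    n = len(A)
    total = 0
    for p in permutations(range(n)):
        term = sign(p)
        for i in range(n):
            term *= A[i][p[i]]
        total += term
    return total

print(det_by_permutations([[1, 2, 3], [-2, 1, 4], [3, -1, 2]]))  # 35
```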
2.8 Properties of Determinants
Theorem 2.8.1: The determinant of a matrix ๐ด and its transpose are equal.
Example 1. Let ๐ด be the matrix of Example 5. Then

๐ด๐‘‡ = [1 −2  3]
     [2  1 −1]
     [3  4  2]

The determinant of ๐ด๐‘‡ is

        |1 −2  3| 1 −2
|๐ด๐‘‡| =  |2  1 −1| 2  1
        |3  4  2| 3  4

= (1)(1)(2) + (−2)(−1)(3) + (3)(2)(4) − (3)(1)(3) − (1)(−1)(4) − (−2)(2)(2)
= 35
Because of this property we can now replace “row” by “column” in the succeeding
theorems about the determinants of a matrix ๐ด.
Theorem 2.8.2: Let ๐ต be the matrix obtained from a matrix ๐ด by
(i)
multiplying a row (column) of ๐ด by a scalar ๐‘; then |๐ต| = ๐‘|๐ด|.
(ii) interchanging two rows(columns) of |๐ด|; then |๐ต| = −|๐ด|.
(iii) adding a multiple of a row(column) of ๐ด to another; then |๐ต| = |๐ด|.
Example 2. To illustrate part (i) of Theorem 2.8.2, let ๐ต = [2  5]. Since 2 is a common
                                                            [4 −7]
factor of the entries in column 1, then

|2  5|     |1  5|
|4 −7| = 2 |2 −7| = 2(−7 − 10) = −34
Example 3: To illustrate part (ii) of Theorem 2.8.2, let ๐ด = [2 −1]  and ๐ต = [5 −2]. Then
                                                             [5 −2]          [2 −1]

|๐ด| = −4 + 5 = 1 and |๐ต| = −5 + 4 = −1. This shows that |๐ต| = −|๐ด|.
Example 4: To illustrate part (iii) of Theorem 2.8.2, let ๐ด = [ 6 9 −12].
                                                              [−1 0   2]
                                                              [ 3 0  −8]

     A                                  B
[ 6 9 −12]                          [ 6 9 −12]
[−1 0   2]   3๐‘…2 + ๐‘…3 → ๐‘…3        [−1 0   2]
[ 3 0  −8]                          [ 0 0  −2]

|๐ด| = 54 − 72 = −18 and |๐ต| = 0 − 18 = −18. Thus |๐ด| = |๐ต|.
Theorem 2.8.3: Let ๐ด be a square matrix.
(i) If two rows (columns) of ๐ด are equal, then โ”‚๐ดโ”‚ = 0.
(ii) If a row (column) of ๐ด consists entirely of zeros, then โ”‚๐ดโ”‚ = 0.
Proof of part (i):
Suppose the rows ๐‘š and ๐‘› of ๐ด are equal. Interchange rows ๐‘š and ๐‘› of ๐ด to obtain a matrix
๐ต. By theorem 2.8.2 (ii), |๐ต| = −|๐ด|. Since rows ๐‘š and ๐‘› are equal then ๐ต = ๐ด, so |๐ต| =
|๐ด|. Thus
|๐ด| = −|๐ด|
(By substitution)
Hence |๐ด| = 0.
Example 5: Let ๐ด = [ 3 2 1]  (row 1 is equal to row 3). The determinant of ๐ด is
                   [−1 0 4]
                   [ 3 2 1]

     | 3 2 1|
|๐ด| = |−1 0 4| = 3(0)(1) + 2(4)(3) + 1(−1)(2) − 1(0)(3) − 3(4)(2) − 2(−1)(1)
     | 3 2 1|
    = 0 + 24 − 2 − 0 − 24 + 2
    = 0
Proof of part (ii):
Let the ๐‘›th row(column) of ๐ด consist entirely of zero. Since each term in the |๐ด| contains a
factor from each row(column) then each term in the |๐ด| is equal to zero. Hence |๐ด| = 0.
Theorem 2.8.4: If a matrix ๐ด = [๐‘Ž๐‘–๐‘—] is upper (lower) triangular, then
|๐ด| = ๐‘Ž11 ๐‘Ž22 โ‹ฏ ๐‘Ž๐‘›๐‘›
that is, the determinant of a triangular matrix is the product of the elements on the main
diagonal.
Example 6: Evaluate the determinant of each matrix by applying theorem 2.8.4.
(a) ๐ด = [ 2  3 −4 ]
        [ 0 −4  2 ]
        [ 1 −1  5 ]

(b) ๐ต = [ 4  2  3 −4 ]
        [ 3 −2  1  5 ]
        [−2  0  1 −3 ]
        [ 8 −2  6  4 ]

Solution:
(a) We transform matrix ๐ด into triangular form by applying elementary operations and
taking note of the corresponding changes in the determinant.

    | 2  3 −4 |
    | 0 −4  2 |        ๐‘…1 ↔ ๐‘…3
    | 1 −1  5 |

        | 1 −1  5 |
    = − | 0 −4  2 |     (By theorem 2.8.2.ii)      −2๐‘…1 + ๐‘…3
        | 2  3 −4 |

        | 1 −1   5 |
    = − | 0 −4   2 |     (By theorem 2.8.2.iii)    (1/4)๐‘…2
        | 0  5 −14 |

           | 1 −1   5  |
    = (−4) | 0 −1  1/2 |  (By theorem 2.8.2.i)     5๐‘…2 + ๐‘…3
           | 0  5 −14  |

           | 1 −1    5   |
    = (−4) | 0 −1   1/2  |  (By theorem 2.8.2.iii)
           | 0  0 −23/2  |

The last matrix is in triangular form, thus
    |๐ด| = (−4)(1)(−1)(−23/2) = −46

(b)
    | 4  2  3 −4 |
    | 3 −2  1  5 |      ๐‘…1 ↔ ๐‘…3
    |−2  0  1 −3 |
    | 8 −2  6  4 |

        |−2  0  1 −3 |
    = − | 3 −2  1  5 |    (By theorem 2.8.2.ii)    2๐‘…1 + ๐‘…3, 4๐‘…1 + ๐‘…4
        | 4  2  3 −4 |
        | 8 −2  6  4 |

        |−2  0  1  −3 |
    = − | 3 −2  1   5 |    (By theorem 2.8.2.iii)   −(1/2)๐‘…1
        | 0  2  5 −10 |
        | 0 −2 10  −8 |

            | 1  0 −1/2 3/2 |
    = −(−2) | 3 −2   1   5  |  (By theorem 2.8.2.i)  −3๐‘…1 + ๐‘…2
            | 0  2   5 −10  |
            | 0 −2  10  −8  |

        | 1  0 −1/2  3/2 |
    = 2 | 0 −2  5/2  1/2 |   (By theorem 2.8.2.iii)  ๐‘…2 + ๐‘…3, −๐‘…2 + ๐‘…4
        | 0  2   5  −10  |
        | 0 −2  10   −8  |

        | 1  0  −1/2    3/2  |
    = 2 | 0 −2   5/2    1/2  |   (By theorem 2.8.2.iii)  −๐‘…3 + ๐‘…4
        | 0  0  15/2  −19/2  |
        | 0  0  15/2  −17/2  |

        | 1  0  −1/2    3/2  |
    = 2 | 0 −2   5/2    1/2  |   (By theorem 2.8.2.iii)
        | 0  0  15/2  −19/2  |
        | 0  0    0      1   |

Thus |๐ต| = 2(1)(−2)(15/2)(1) = −30.
Theorem 2.8.5: The determinant of a product of two matrices is the product of their
determinants; that is
    |๐ด๐ต| = |๐ด||๐ต|
Corollary 2.8.1: If ๐ด is nonsingular, then |๐ด| ≠ 0 and
    |๐ด−1 | = 1/|๐ด|.
Example 7. Let
    ๐ด = [ 2 −4 ]    then    ๐ด−1 = [ 5/22  2/11 ]
        [ 3  5 ]                  [−3/22  1/11 ]
The determinant of ๐ด is |๐ด| = 10 − (−12) = 22 and the determinant of ๐ด−1 is
|๐ด−1 | = (5/22)(1/11) − (2/11)(−3/22) = 1/22. Hence |๐ด−1 | = 1/|๐ด|.
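Both Theorem 2.8.5 and Corollary 2.8.1 are easy to verify numerically. In this sketch (assuming NumPy) ๐ด is the matrix of Example 7, while ๐ต is an arbitrary second matrix chosen only for illustration; it does not appear in the text.

```python
import numpy as np

A = np.array([[2, -4],
              [3, 5]], dtype=float)   # matrix of Example 7, |A| = 22
B = np.array([[1, 2],
              [0, 3]], dtype=float)   # arbitrary illustrative matrix

# Theorem 2.8.5: |AB| = |A||B|
print(np.isclose(np.linalg.det(A @ B), np.linalg.det(A) * np.linalg.det(B)))

# Corollary 2.8.1: |A^-1| = 1/|A|
print(np.isclose(np.linalg.det(np.linalg.inv(A)), 1 / np.linalg.det(A)))
```

Both checks print True.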
ACTIVITY
1. Evaluate the determinants of the following matrices.
a. ๐ด = [ 4  3 2 ]
       [ 3 −2 5 ]
       [ 2  4 6 ]
b. ๐ต = [ 3 −1 2 ]
       [ 4  5 6 ]
       [ 7  1 2 ]
c. [−1  2   4  ]
   [ 4 −8 −16 ]
   [ 3  0   5  ]
2. Find all values of λ for which
a. | λ−2    2  |
   |  3    λ−3 | = 0
b. |λ๐ผ3 − ๐ด| = 0 where ๐ด = [ 1 0 −1 ]
                           [ 2 0  1 ]
                           [ 0 0 −1 ]
3. The matrix ๐ด is called idempotent if ๐ด2 = ๐ด. What are the possible values for the |๐ด| if ๐ด
is idempotent?
4. Prove Corollary 2.8.1.
2.9 Minors and Cofactor Expansion
For large matrices, say ๐‘› ≥ 4, evaluating determinants using the permutation formula can
be very tedious. We saw in the preceding section that transforming a matrix into triangular
form makes the computation easier. Another method that is also efficient for computing
determinants of large matrices is cofactor expansion.
Definition 2.9.1: Let ๐ด = [๐‘Ž๐‘–๐‘— ] be an ๐‘› x ๐‘› matrix. Let ๐‘€๐‘–๐‘— be the (๐‘› − 1) x (๐‘› − 1) submatrix
of ๐ด obtained by deleting the ๐‘–th row and the ๐‘—th column of ๐ด. The determinant |๐‘€๐‘–๐‘— | is
called the minor of ๐‘Ž๐‘–๐‘— . The cofactor ๐ด๐‘–๐‘— of ๐‘Ž๐‘–๐‘— is defined as
    ๐ด๐‘–๐‘— = (−1)^(๐‘–+๐‘—) |๐‘€๐‘–๐‘— |
Example 1: Let
    ๐ด = [ 5 −1 6 ]
        [ 3  4 2 ]
        [ 7  2 1 ]
If we delete the first row and second column of ๐ด we obtain the 2 x 2 submatrix
    ๐‘€12 = [ 3 2 ]
          [ 7 1 ]
The determinant of ๐‘€12 is the minor of ๐‘Ž12 = −1. Thus
    |๐‘€12 | = 3 − 14 = −11.
Likewise, if we delete the second row and the third column of ๐ด we obtain the 2 x 2
submatrix
    ๐‘€23 = [ 5 −1 ]
          [ 7  2 ]
The determinant |๐‘€23 | is called the minor of ๐‘Ž23 = 2. Thus
    |๐‘€23 | = 10 − (−7) = 17.
By Definition 2.9.1, the cofactor of ๐‘Ž12 is
    ๐ด12 = (−1)^(1+2) |๐‘€12 | = −(−11) = 11
Similarly the cofactor of ๐‘Ž23 is
    ๐ด23 = (−1)^(2+3) |๐‘€23 | = (−1)(17) = −17
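Minors and cofactors can be computed mechanically by deleting one row and one column. This is a sketch assuming NumPy; the helper names `minor` and `cofactor` are our own, and the indices are 0-based, so `(0, 1)` corresponds to ๐‘Ž12.

```python
import numpy as np

def minor(A, i, j):
    """Determinant of the submatrix of A with row i and column j deleted
    (0-based indices)."""
    sub = np.delete(np.delete(A, i, axis=0), j, axis=1)
    return np.linalg.det(sub)

def cofactor(A, i, j):
    """Cofactor A_ij = (-1)^(i+j) |M_ij| (0-based indices)."""
    return (-1) ** (i + j) * minor(A, i, j)

A = np.array([[5, -1, 6],
              [3, 4, 2],
              [7, 2, 1]], dtype=float)   # matrix of Example 1

print(round(minor(A, 0, 1)))     # |M12| = -11
print(round(cofactor(A, 0, 1)))  # A12 = 11
print(round(cofactor(A, 1, 2)))  # A23 = -17
```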
Theorem 2.9.1: Let ๐ด = [๐‘Ž๐‘–๐‘— ] be an ๐‘› x ๐‘› matrix. Then for each 1 ≤ ๐‘– ≤ ๐‘›,
    |๐ด| = ๐‘Ž๐‘–1 ๐ด๐‘–1 + ๐‘Ž๐‘–2 ๐ด๐‘–2 + ... + ๐‘Ž๐‘–๐‘› ๐ด๐‘–๐‘›
(expansion of |๐ด| about the ๐‘–th row) and for each 1 ≤ ๐‘— ≤ ๐‘›,
    |๐ด| = ๐‘Ž1๐‘— ๐ด1๐‘— + ๐‘Ž2๐‘— ๐ด2๐‘— + ... + ๐‘Ž๐‘›๐‘— ๐ด๐‘›๐‘—
(expansion of |๐ด| about the ๐‘—th column)
Example 2: Evaluate the determinant of
    ๐ด = [ 2  2 −3  1 ]
        [ 0  1  2 −1 ]
        [ 3 −1  4  1 ]
        [ 2  3  0  0 ]
by cofactor expansion.
Solution: You can expand about any row or column of your choice; however, it is best to
expand about the fourth row because it has the most zeros. Thus if we expand about the
fourth row we obtain
    |๐ด| = ๐‘Ž41 ๐ด41 + ๐‘Ž42 ๐ด42 + ๐‘Ž43 ๐ด43 + ๐‘Ž44 ๐ด44
        = 2๐ด41 + 3๐ด42 + 0๐ด43 + 0๐ด44
Notice that the cofactor of a zero entry need not be calculated, so if we can get another
zero on the fourth row then the computation would be a lot easier. Let us apply elementary
operations on the determinant of ๐ด and take note of the changes in the determinant.

    | 2  2 −3  1 |
    | 0  1  2 −1 |      (1/2)๐ถ1
    | 3 −1  4  1 |
    | 2  3  0  0 |

        |  1   2 −3  1 |
    = 2 |  0   1  2 −1 |    (By theorem 2.8.2.i)    −3๐ถ1 + ๐ถ2
        | 3/2 −1  4  1 |
        |  1   3  0  0 |

        |  1    −1   −3  1 |
    = 2 |  0     1    2 −1 |    (By theorem 2.8.2.iii)
        | 3/2 −11/2   4  1 |
        |  1     0    0  0 |

Now let us evaluate this last determinant. Expanding about the fourth row we have

    |  1    −1   −3  1 |
    |  0     1    2 −1 |  = (1)๐ด41 + 0๐ด42 + 0๐ด43 + 0๐ด44
    | 3/2 −11/2   4  1 |
    |  1     0    0  0 |

                      |  −1   −3  1 |
    = (1)(−1)^(4+1)   |   1    2 −1 |
                      |−11/2   4  1 |

        |  −1   −3  1 |
    = − |   1    2 −1 |
        |−11/2   4  1 |

Next we evaluate this 3 x 3 determinant by expanding about the third row. (Note, you can
expand about any row or column of your choice.)

    |  −1   −3  1 |
    |   1    2 −1 |  = (−11/2)(−1)^(3+1) |−3  1 | + (4)(−1)^(3+2) |−1  1 | + (1)(−1)^(3+3) |−1 −3 |
    |−11/2   4  1 |                      | 2 −1 |                 | 1 −1 |                 | 1  2 |

    = (−11/2)(1) + (−4)(0) + (1)(1)
    = −9/2

Substituting, we have

    |  1    −1   −3  1 |
    |  0     1    2 −1 |  = −(−9/2) = 9/2
    | 3/2 −11/2   4  1 |
    |  1     0    0  0 |

Hence

    | 2  2 −3  1 |
    | 0  1  2 −1 |  = 2(9/2) = 9
    | 3 −1  4  1 |
    | 2  3  0  0 |
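Theorem 2.9.1 translates directly into a recursive procedure. The sketch below (assuming NumPy; `det_cofactor` is our own helper name) expands about the first row and reproduces |๐ด| = 9 for the matrix of Example 2.

```python
import numpy as np

def det_cofactor(A):
    """Determinant by cofactor expansion about the first row (Theorem 2.9.1)."""
    n = len(A)
    if n == 1:
        return A[0][0]
    total = 0.0
    for j in range(n):
        # submatrix with row 0 and column j deleted
        sub = np.delete(np.delete(A, 0, axis=0), j, axis=1)
        total += (-1) ** j * A[0][j] * det_cofactor(sub)
    return total

A = np.array([[2, 2, -3, 1],
              [0, 1, 2, -1],
              [3, -1, 4, 1],
              [2, 3, 0, 0]], dtype=float)

print(det_cofactor(A))   # 9.0
```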
SAQ 2-4
Let
    ๐ด = [ 4 −4 2 1 ]
        [ 1  2 0 3 ]
        [ 2  0 3 4 ]
        [ 0 −3 2 1 ]
Evaluate the determinant by
(a) reducing ๐ด into a triangular matrix, and
(b) expanding about the second row.
ASAQ 2-4
(a) Reducing ๐ด into a triangular matrix using elementary operations we have

    | 4 −4 2 1 |
    | 1  2 0 3 |        ๐‘…1 ↔ ๐‘…2
    | 2  0 3 4 |
    | 0 −3 2 1 |

        | 1  2 0 3 |
    = − | 4 −4 2 1 |     (By theorem 2.8.2.ii)    −4๐‘…1 + ๐‘…2, −2๐‘…1 + ๐‘…3
        | 2  0 3 4 |
        | 0 −3 2 1 |

        | 1   2 0   3 |
    = − | 0 −12 2 −11 |   (By theorem 2.8.2.iii)   3๐ถ4 + ๐ถ2, −2๐ถ4 + ๐ถ3
        | 0  −4 3  −2 |
        | 0  −3 2   1 |

        | 1  11 −6   3 |
    = − | 0 −45 24 −11 |   (By theorem 2.8.2.iii)  (1/9)๐‘…2
        | 0 −10  7  −2 |
        | 0   0  0   1 |

         | 1  11  −6     3   |
    = −9 | 0  −5 24/9 −11/9 |   (By theorem 2.8.2.i)   −2๐‘…2 + ๐‘…3
         | 0 −10   7    −2   |
         | 0   0   0     1   |

         | 1  11  −6     3   |
    = −9 | 0  −5 24/9 −11/9 |   (By theorem 2.8.2.iii)
         | 0   0  5/3   4/9  |
         | 0   0   0     1   |

Hence |๐ด| = −9(1)(−5)(5/3)(1) = 75

(b) Before we expand about the second row, let us introduce more zeros on the second row
by applying elementary column operations on the determinant.

    | 4 −4 2 1 |
    | 1  2 0 3 |        −2๐ถ1 + ๐ถ2, −3๐ถ1 + ๐ถ4
    | 2  0 3 4 |
    | 0 −3 2 1 |

      | 4 −12 2 −11 |
    = | 1   0 0   0 |    (By theorem 2.8.2.iii)
      | 2  −4 3  −2 |
      | 0  −3 2   1 |

Expanding about the 2nd row we have

    | 4 −12 2 −11 |
    | 1   0 0   0 |  = 1๐ด21
    | 2  −4 3  −2 |
    | 0  −3 2   1 |

                     |−12 2 −11 |
    = 1(−1)^(2+1)    | −4 3  −2 |
                     | −3 2   1 |

Next we introduce zeros on the third row by using column operations.

        |−12 2 −11 |
    = − | −4 3  −2 |     3๐ถ3 + ๐ถ1, −2๐ถ3 + ๐ถ2
        | −3 2   1 |

        |−45 24 −11 |
    = − |−10  7  −2 |   (By theorem 2.8.2.iii)
        |  0  0   1 |

    = −(−1)^(3+3) |−45 24 |     (expanding about the 3rd row)
                  |−10  7 |
    = −(−315 + 240)
    = 75
SAQ 2-5
Prove: If the ๐‘–th row of A is multiplied by a scalar ๐‘, then the
determinant of ๐ด is multiplied by ๐‘.
ASAQ 2-5
Let
    |๐ต| = | ๐‘Ž11  ๐‘Ž12  โ‹ฏ  ๐‘Ž1๐‘›  |
          | ๐‘Ž21  ๐‘Ž22  โ‹ฏ  ๐‘Ž2๐‘›  |
          |  โ‹ฎ    โ‹ฎ        โ‹ฎ   |
          | ๐‘๐‘Ž๐‘–1 ๐‘๐‘Ž๐‘–2 โ‹ฏ ๐‘๐‘Ž๐‘–๐‘› |
          |  โ‹ฎ    โ‹ฎ        โ‹ฎ   |
          | ๐‘Ž๐‘›1  ๐‘Ž๐‘›2  โ‹ฏ  ๐‘Ž๐‘›๐‘›  |
If we expand about the ๐‘–th row, we have
|๐ต| = ๐‘๐‘Ž๐‘–1 ๐ด๐‘–1 + ๐‘๐‘Ž๐‘–2 ๐ด๐‘–2 + โ‹ฏ + ๐‘๐‘Ž๐‘–๐‘› ๐ด๐‘–๐‘›
= ๐‘(๐‘Ž๐‘–1 ๐ด๐‘–1 + ๐‘Ž๐‘–2 ๐ด๐‘–2 + โ‹ฏ + ๐‘Ž๐‘–๐‘› ๐ด๐‘–๐‘› )
= ๐‘|๐ด|
Definition 2.9.2: If ๐ด = [๐‘Ž๐‘–๐‘— ] is an ๐‘› x ๐‘› matrix, the adjoint of ๐ด, denoted by adj ๐ด, is the
transpose of the matrix of cofactors. Thus,
    adj ๐ด = [ ๐ด11 ๐ด21 โ‹ฏ ๐ด๐‘›1 ]
            [ ๐ด12 ๐ด22 โ‹ฏ ๐ด๐‘›2 ]
            [  โ‹ฎ   โ‹ฎ       โ‹ฎ  ]
            [ ๐ด1๐‘› ๐ด2๐‘› โ‹ฏ ๐ด๐‘›๐‘› ]
Example 3: Let
    ๐ด = [ 2  1 3 ]
        [−1  2 0 ]
        [ 3 −2 1 ]
Compute adj ๐ด.
Solution:
    ๐ด11 = (−1)^(1+1) | 2 0; −2 1 | = 2
    ๐ด21 = (−1)^(2+1) | 1 3; −2 1 | = −7
    ๐ด31 = (−1)^(3+1) | 1 3; 2 0 | = −6
    ๐ด12 = (−1)^(1+2) | −1 0; 3 1 | = 1
    ๐ด22 = (−1)^(2+2) | 2 3; 3 1 | = −7
    ๐ด32 = (−1)^(3+2) | 2 3; −1 0 | = −3
    ๐ด13 = (−1)^(1+3) | −1 2; 3 −2 | = −4
    ๐ด23 = (−1)^(2+3) | 2 1; 3 −2 | = 7
    ๐ด33 = (−1)^(3+3) | 2 1; −1 2 | = 5
Hence,
    adj ๐ด = [ 2 −7 −6 ]
            [ 1 −7 −3 ]
            [−4  7  5 ]
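Definition 2.9.2 can be implemented directly: build the matrix of cofactors, then transpose. A sketch assuming NumPy, with `adjoint` as our own helper name, checked on the matrix of Example 3.

```python
import numpy as np

def adjoint(A):
    """Transpose of the matrix of cofactors (Definition 2.9.2)."""
    n = A.shape[0]
    C = np.zeros_like(A, dtype=float)
    for i in range(n):
        for j in range(n):
            sub = np.delete(np.delete(A, i, axis=0), j, axis=1)
            C[i, j] = (-1) ** (i + j) * np.linalg.det(sub)
    return C.T   # the adjoint is the transpose of the cofactor matrix

A = np.array([[2, 1, 3],
              [-1, 2, 0],
              [3, -2, 1]], dtype=float)

print(np.round(adjoint(A)))
```

The printed matrix matches the adj ๐ด computed by hand above.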
Theorem 2.9.2: If ๐ด = [๐‘Ž๐‘–๐‘— ] is an ๐‘› x ๐‘› matrix, then
    ๐ด(adj ๐ด) = (adj ๐ด)๐ด = |๐ด|๐ผ๐‘›
Example 4: Let ๐ด be the matrix of Example 3. Compute |๐ด| using Theorem 2.9.2.
Solution:
    ๐ด(adj ๐ด) = [ 2  1 3 ] [ 2 −7 −6 ]   [−7  0  0 ]
               [−1  2 0 ] [ 1 −7 −3 ] = [ 0 −7  0 ] = −7๐ผ3
               [ 3 −2 1 ] [−4  7  5 ]   [ 0  0 −7 ]
Thus |๐ด| = −7.
Corollary 2.9.1: If ๐ด is an ๐‘› x ๐‘› matrix and |๐ด| ≠ 0, then
    ๐ด−1 = (1/|๐ด|)(adj ๐ด)
Example 5: Let ๐ด be the matrix of Example 3. Find ๐ด−1 .
Solution:
From previous examples we have |๐ด| = −7 and
    adj ๐ด = [ 2 −7 −6 ]
            [ 1 −7 −3 ]
            [−4  7  5 ]
Hence,
    ๐ด−1 = (1/(−7)) [ 2 −7 −6 ]   [−2/7  1  6/7 ]
                   [ 1 −7 −3 ] = [−1/7  1  3/7 ]
                   [−4  7  5 ]   [ 4/7 −1 −5/7 ]
You can check your answer by taking the product ๐ด๐ด−1 and making sure that the answer is
the identity matrix ๐ผ3 .
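Both Theorem 2.9.2 and Corollary 2.9.1 can be confirmed numerically for the matrix of Example 3, reusing the adjoint computed there (a sketch assuming NumPy).

```python
import numpy as np

A = np.array([[2, 1, 3],
              [-1, 2, 0],
              [3, -2, 1]], dtype=float)
adjA = np.array([[2, -7, -6],
                 [1, -7, -3],
                 [-4, 7, 5]], dtype=float)   # adj A from Example 3

# Theorem 2.9.2: A (adj A) = |A| I_n, here -7 I_3.
print(A @ adjA)

# Corollary 2.9.1: A^-1 = (1/|A|) adj A, with |A| = -7.
A_inv = adjA / -7.0
print(np.allclose(A @ A_inv, np.eye(3)))   # True
```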
Theorem 2.9.3: A matrix ๐ด is nonsingular if and only if |๐ด| ≠ 0.
Corollary 2.9.2: If ๐ด is an ๐‘› x ๐‘› matrix, then the homogeneous system ๐ด๐‘‹ = 0 has a nontrivial
solution if and only if |๐ด| = 0.
Theorem 2.9.4: (Cramer’s Rule)
Let
    ๐‘Ž11 ๐‘ฅ1 + ๐‘Ž12 ๐‘ฅ2 + ... + ๐‘Ž1๐‘› ๐‘ฅ๐‘› = ๐‘1
    ๐‘Ž21 ๐‘ฅ1 + ๐‘Ž22 ๐‘ฅ2 + ... + ๐‘Ž2๐‘› ๐‘ฅ๐‘› = ๐‘2
        โ‹ฎ
    ๐‘Ž๐‘›1 ๐‘ฅ1 + ๐‘Ž๐‘›2 ๐‘ฅ2 + ... + ๐‘Ž๐‘›๐‘› ๐‘ฅ๐‘› = ๐‘๐‘›
be a linear system of ๐‘› equations in ๐‘› unknowns and let ๐ด = [๐‘Ž๐‘–๐‘— ] be the coefficient matrix
so that we can write the given system as ๐ด๐‘‹ = ๐ต, where
    ๐ต = [ ๐‘1 ]
        [ ๐‘2 ]
        [ โ‹ฎ  ]
        [ ๐‘๐‘› ]
If |๐ด| ≠ 0, then the system has the unique solution
    ๐‘ฅ1 = |๐ด1 |/|๐ด|,  ๐‘ฅ2 = |๐ด2 |/|๐ด|,  … ,  ๐‘ฅ๐‘› = |๐ด๐‘› |/|๐ด|,
where ๐ด๐‘– is the matrix obtained by replacing the ๐‘–th column of ๐ด by ๐ต.
Example 6: Using Cramer’s Rule, determine the solution of the linear system
3x + y – z = 4
- x + y + 3z = 0
x + 2y + z = 1
Solution:
The determinant of the coefficient matrix is (you must do the calculation here)
    |๐ด| = | 3 1 −1; −1 1 3; 1 2 1 | = −8
and
    |๐ด1 | = | 4 1 −1; 0 1 3; 1 2 1 | = −16
(๐ด1 is obtained by replacing the first column of ๐ด by ๐ต = (4, 0, 1)),
    |๐ด2 | = | 3 4 −1; −1 0 3; 1 1 1 | = 8
(๐ด2 is obtained by replacing the second column of ๐ด by ๐ต),
    |๐ด3 | = | 3 1 4; −1 1 0; 1 2 1 | = −8
(๐ด3 is obtained by replacing the third column of ๐ด by ๐ต).
By Cramer’s Rule,
    ๐‘ฅ = |๐ด1 |/|๐ด| = −16/−8 = 2,  ๐‘ฆ = |๐ด2 |/|๐ด| = 8/−8 = −1,  and ๐‘ง = |๐ด3 |/|๐ด| = −8/−8 = 1
Thus the solution set is {(2, -1, 1)}.
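Cramer's Rule is only a few lines of code: replace a column, take a determinant, divide. A sketch assuming NumPy, applied to the system of Example 6; `cramer` is our own helper name.

```python
import numpy as np

def cramer(A, b):
    """Solve Ax = b by Cramer's Rule (Theorem 2.9.4); requires |A| != 0."""
    detA = np.linalg.det(A)
    x = np.empty(len(b))
    for i in range(len(b)):
        Ai = A.copy()
        Ai[:, i] = b          # replace the ith column of A by B
        x[i] = np.linalg.det(Ai) / detA
    return x

# The system of Example 6.
A = np.array([[3, 1, -1],
              [-1, 1, 3],
              [1, 2, 1]], dtype=float)
b = np.array([4, 0, 1], dtype=float)

print(cramer(A, b))   # [ 2. -1.  1.]
```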
NOTE: Cramer’s rule applies only when we have ๐‘› equations in ๐‘› unknowns and the
coefficient matrix ๐ด is nonsingular. It becomes computationally inefficient for ๐‘› > 4, where
it is better to use the Gauss–Jordan method.
ACTIVITY
I. Solve the following system using Cramer’s Rule.
1. 2๐‘ฅ1 + 5๐‘ฅ2 − ๐‘ฅ3 = −1
4๐‘ฅ1 + ๐‘ฅ2 + 3๐‘ฅ3 = 3
−2๐‘ฅ1 + 2๐‘ฅ2
= 0
2. ๐‘ฅ1 + ๐‘ฅ2 + ๐‘ฅ3 + ๐‘ฅ4 = 6
2๐‘ฅ1
− ๐‘ฅ3 − ๐‘ฅ4 = 4
3 ๐‘ฅ3 + 6๐‘ฅ4 = 3
๐‘ฅ1
− ๐‘ฅ4 = 5
1 a a2
3. Show that 1 b b 2 = (b-a)(c-a)(c-b). Also, find the inverse of ๐ด by using the formula
1 c c2
1
๐ด−1 = |๐ด| adj ๐ด.
2
2
4. Find the determinant of ๐ด = 4
1
[2
1
3
5
0
1
0
1
2
1
0
−1 3
2
5
3 −3 .
2
4
1
0]
MODULE 3
VECTOR SPACES OVER A FIELD
Introduction
In this chapter we will discuss vectors in โ„๐‘› and their basic properties, the properties and
structure of vector spaces and subspaces, and linear combinations and spanning sets.
Objectives
At the end of this chapter, you are expected to be able to do the following:
1. Define vector, vector spaces and subspaces.
2. Enumerate and explain the properties of vector spaces and subspaces.
3. Determine whether a set with given operations is a vector space.
4. Determine whether the given subsets of โ„๐‘› are subspaces.
5. Define linear combination and spanning.
6. Check whether a given set of vectors spans a vector space V.
3.1 Vectors in the Plane
Vectors are measurable quantities that have both magnitude and direction. Examples of
vectors are velocity, force, and acceleration.
Vectors in โ„๐Ÿ
A vector on the plane โ„๐Ÿ can be described as an ordered pair ๐— = (๐‘ฅ1 , ๐‘ฅ2 ) where
๐‘ฅ1 , ๐‘ฅ2 ∈ โ„. It can also be denoted by a 2 x 1 matrix
๐‘ฅ1
๐— = [๐‘ฅ ]
2
With X we associate the directed line segment with initial point at the origin O and the
โƒ—โƒ—โƒ—โƒ—โƒ— .
terminal point at ๐‘ƒ(๐‘ฅ1 , ๐‘ฅ2 ), denoted by ๐‘‚๐‘ƒ
The direction of a directed line segment is the angle made with the positive X-axis
and the magnitude of a directed line segment is its length.
Vector Operations
Definition 3.1.1: Let ๐— = (๐‘ฅ1 , ๐‘ฅ2 ) and ๐’€ = (๐‘ฆ1 , ๐‘ฆ2 ) be two vectors in the plane. The sum of
the vectors X and Y, denoted by ๐‘ฟ + ๐’€, is the vector
(๐‘ฅ1 + ๐‘ฆ1 , ๐‘ฅ2 + ๐‘ฆ2 ).
Example 1: Let X = (2, 3) and Y = (-4, 1). Then
X + Y = (-2, 4)
Definition 3.1.2: If X = (๐‘ฅ, ๐‘ฆ) and ๐‘ is a scalar (a real number), then the scalar multiple ๐‘๐‘ฟ of
๐‘ฟ by ๐‘ is the vector (๐‘๐‘ฅ, ๐‘๐‘ฆ).
If ๐‘ > 0, then ๐‘๐‘ฅ is in the same direction as ๐‘ฟ, whereas if ๐‘‘ < 0, then ๐‘‘๐‘ฅ is in the
opposite direction.
Example 2: Let c = 3, d = -2, and X = (3, -2). Then
cX = (9, -6) and dX = (-6, 4)
NOTE:
1. ๐‘‚ = (0, 0) is called the zero vector.
2. X + ๐‘‚ = X
3. X + (-1)X = ๐‘‚
4. (-1)X = −X; the negative of X
5. X + (-1)Y = X – Y; the difference between X and Y.
3.2. ๐’-Vectors
Definition 3.2.1: An ๐‘›-vector is an ๐‘› x 1 matrix
    ๐‘ฟ = [ ๐‘ฅ1 ]
        [ ๐‘ฅ2 ]
        [ โ‹ฎ  ]
        [ ๐‘ฅ๐‘› ]
where ๐‘ฅ1 , ๐‘ฅ2 , … , ๐‘ฅ๐‘› are real numbers, which are called the components of X. The set of all
๐‘›-vectors is denoted by โ„๐‘› and is called ๐‘›-space.
Definition 3.2.2: Let ๐‘ฟ = (๐‘ฅ1 , ๐‘ฅ2 , … , ๐‘ฅ๐‘› ) and ๐’€ = (๐‘ฆ1 , ๐‘ฆ2 , … , ๐‘ฆ๐‘› ) be two vectors in โ„๐‘› .
The sum of the vectors X and Y is the vector
(๐‘ฅ1 + ๐‘ฆ1 , ๐‘ฅ2 + ๐‘ฆ2 , … , ๐‘ฅ๐‘› + ๐‘ฆ๐‘› )
and is denoted by X + Y.
Example 1: If X = (1, 2, -3) and Y = (0, 1, -2) are vectors in R3, then
X + Y = (1+0, 2+1, -3 + (-2) ) = ( 1, 3, -5 ).
Definition 3.2.3: If ๐‘ฟ = (๐‘ฅ1 , ๐‘ฅ2 , … , ๐‘ฅ๐‘› ) is a vector in โ„๐‘› and ๐‘ is a scalar, then the scalar
multiple ๐‘๐‘ฟ of ๐‘ฟ by ๐‘ is the vector
(๐‘๐‘ฅ1 , ๐‘๐‘ฅ2 , … , ๐‘๐‘ฅ๐‘› )
Example 2: If X = ( 2, 1, 5, -2 ) is a vector in โ„4 and c = -3, then
cX = (-3) ( 2, 1, 5, -2 ) = (-6, -3, -15, 6 )
The operations of vector addition and scalar multiplication satisfy the following properties:
Theorem 3.2.1: Let X, Y, and Z be any vectors in โ„๐‘› ; let ๐‘ and d be any scalars. Then
I. X + Y is a vector in โ„๐‘› (that is, โ„๐‘› is closed under the operation of vector addition).
a. X + Y = Y + X
b. X + ( Y + Z ) = ( X + Y ) +Z
c. There is a unique vector in โ„๐‘› , 0 = (0, 0, …, 0) such that X + 0 = 0 + X = X.
d. For each X there is a unique vector −X = (−๐‘ฅ1 , −๐‘ฅ2 , … , −๐‘ฅ๐‘› ) such that X + (−X) = 0.
II. cX is a vector in โ„๐‘›
a. c( X + Y ) = cX + cY
b. ( c + d )X = cX + dX
c. c(dX) = (cd)X
d. 1X = X
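The properties in Theorem 3.2.1 can be spot-checked on sample vectors; this is a sketch assuming NumPy, and the particular vectors are illustrative, not from the text.

```python
import numpy as np

# Sample vectors in R^3 and sample scalars (illustrative values).
X = np.array([1.0, 2.0, -3.0])
Y = np.array([0.0, 1.0, -2.0])
Z = np.array([4.0, -1.0, 0.5])
c, d = 2.0, -3.0

print(np.array_equal(X + Y, Y + X))                 # I.a commutativity
print(np.array_equal((X + Y) + Z, X + (Y + Z)))     # I.b associativity
print(np.array_equal((c + d) * X, c * X + d * X))   # II.b distributivity
print(np.array_equal(c * (d * X), (c * d) * X))     # II.c associativity
```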
3.3 Vector Spaces and Subspaces
Definition 3.3.1: A vector space V over a field ๐น is a nonempty set of elements, called
vectors, together with two operations called vector addition and scalar multiplication
satisfying the following properties:
[A1] (Closure under addition)
For every pair of vectors X, Y ∈ V, X + Y ∈ V.
[A2] (Commutative law of vector addition)
X + Y = Y + X ∀ X, Y ∈ V.
[A3] (Associative law of vector addition)
X + ( Y + Z ) = ( X + Y ) + Z ∀ X, Y, Z ∈ V.
[A4] (Zero Element)
∃ ๐‘ถ ∈ V such that X + ๐‘ถ = ๐‘ถ + X = X ∀ X ∈ V.
[A5] (Negative Elements)
∀ X ∈ V, ∃ −X ∈ V such that X + (−X) = ๐‘ถ.
[M1] (Closure under scalar multiplication)
For every X ∈ V and ๐‘ ∈ ๐น, ๐‘X ∈ V.
[M2] (Distributive Law)
๐‘( X + Y ) = ๐‘X + ๐‘Y, ∀๐‘ ∈ ๐น, ∀ X, Y ∈ V.
[M3] (Distributive Law)
(๐‘ + ๐‘‘) X = ๐‘X + ๐‘‘X, ∀๐‘, ๐‘‘ ∈ ๐น, ∀ X ∈ V.
[M4] (Associative Law of Scalar Multiplication)
(๐‘๐‘‘)X = ๐‘(๐‘‘X), ∀๐‘, ๐‘‘ ∈ ๐น, ∀ X ∈ V.
[M5] (Identity Element)
1X = X, ∀ X ∈ V.
Example 1. Consider the set ๐‘ƒ๐‘› of all polynomials of degree ≤ ๐‘› together with the zero
polynomial. Show that ๐‘ƒ๐‘› is a vector space.
Solution: We have to show that all properties are satisfied. A polynomial in ๐‘ƒ๐‘› is expressible
as
๐‘(๐‘ฅ) = ๐‘0 ๐‘ฅ ๐‘› + ๐‘1 ๐‘ฅ ๐‘›−1 + … + ๐‘๐‘›−1 ๐‘ฅ + ๐‘๐‘›
where ๐‘0 , ๐‘1 , … , ๐‘๐‘› are real numbers. Let ๐‘(๐‘ฅ), ๐‘ž(๐‘ฅ), and ๐‘Ÿ(๐‘ฅ) ∈ ๐‘ƒ๐‘› and ๐‘, ๐‘‘ ∈ โ„. Then
[A1] ๐‘(๐‘ฅ) + ๐‘ž(๐‘ฅ) = (๐‘0 + ๐‘ž0 )๐‘ฅ ๐‘› + (๐‘1 + ๐‘ž1 )๐‘ฅ ๐‘›−1 + โ‹ฏ + (๐‘๐‘›−1 + ๐‘ž๐‘›−1 )๐‘ฅ + (๐‘๐‘› + ๐‘ž๐‘› )
Clearly, the sum of two polynomials of degree ≤ ๐‘› is another polynomial with degree ≤ ๐‘›.
Hence ๐‘ƒ๐‘› is closed under addition.
[A2] ๐‘(๐‘ฅ) + ๐‘ž(๐‘ฅ) = (๐‘0 + ๐‘ž0 )๐‘ฅ ๐‘› + (๐‘1 + ๐‘ž1 )๐‘ฅ ๐‘›−1 + โ‹ฏ + (๐‘๐‘›−1 + ๐‘ž๐‘›−1 )๐‘ฅ + (๐‘๐‘› + ๐‘ž๐‘› )
= (๐‘ž0 + ๐‘0 )๐‘ฅ ๐‘› + (๐‘ž1 + ๐‘1 )๐‘ฅ ๐‘›−1 + โ‹ฏ + (๐‘ž๐‘›−1 + ๐‘๐‘›−1 )๐‘ฅ + (๐‘ž๐‘› + ๐‘๐‘› )
= ๐‘ž(๐‘ฅ) + ๐‘(๐‘ฅ)
Thus addition is commutative.
[A3] [๐‘(๐‘ฅ) + ๐‘ž(๐‘ฅ)] + ๐‘Ÿ(๐‘ฅ) = [(๐‘0 + ๐‘ž0 )๐‘ฅ ๐‘› + (๐‘1 + ๐‘ž1 )๐‘ฅ ๐‘›−1 + โ‹ฏ + (๐‘๐‘›−1 + ๐‘ž๐‘›−1 )๐‘ฅ
+(๐‘๐‘› + ๐‘ž๐‘› )] + ๐‘Ÿ0 ๐‘ฅ ๐‘› + ๐‘Ÿ1 ๐‘ฅ ๐‘›−1 + … + ๐‘Ÿ๐‘›−1 ๐‘ฅ + ๐‘Ÿ๐‘›
= [(๐‘0 + ๐‘ž0 ) + ๐‘Ÿ0 ]๐‘ฅ ๐‘› + [(๐‘1 + ๐‘ž1 )+๐‘Ÿ1 ]๐‘ฅ ๐‘›−1 + โ‹ฏ +
[(๐‘๐‘›−1 + ๐‘ž๐‘›−1 ) + ๐‘Ÿ๐‘›−1 ]๐‘ฅ +[(๐‘๐‘› + ๐‘ž๐‘› ) + ๐‘Ÿ๐‘› ]
= [๐‘0 + (๐‘ž0 + ๐‘Ÿ0 )]๐‘ฅ ๐‘› + [๐‘1 + (๐‘ž1 + ๐‘Ÿ1 )]๐‘ฅ ๐‘›−1 + โ‹ฏ +
[๐‘๐‘›−1 + (๐‘ž๐‘›−1 + ๐‘Ÿ๐‘›−1 )]๐‘ฅ + [๐‘๐‘› + (๐‘ž๐‘› + ๐‘Ÿ๐‘› )]
= (๐‘0 ๐‘ฅ ๐‘› + ๐‘1 ๐‘ฅ ๐‘›−1 + … + ๐‘๐‘›−1 ๐‘ฅ + ๐‘๐‘› ) +
[(๐‘ž0 + ๐‘Ÿ0 )๐‘ฅ ๐‘› + (๐‘ž1 + ๐‘Ÿ1 )๐‘ฅ ๐‘›−1 + โ‹ฏ + (๐‘ž๐‘›−1 + ๐‘Ÿ๐‘›−1 )๐‘ฅ +
(๐‘ž๐‘› + ๐‘Ÿ๐‘› )]
= ๐‘(๐‘ฅ) + [๐‘ž(๐‘ฅ) + ๐‘Ÿ(๐‘ฅ)]
Thus vector addition is associative.
[A4] Let ๐‘‚ = 0๐‘ฅ ๐‘› + 0๐‘ฅ ๐‘›−1 + … + 0๐‘ฅ + 0 be the zero polynomial then
๐‘(๐‘ฅ) + ๐‘‚ = ๐‘‚ + ๐‘(๐‘ฅ) = ๐‘(๐‘ฅ)
[A5] Let – ๐‘(๐‘ฅ) = −๐‘0 ๐‘ฅ ๐‘› − ๐‘1 ๐‘ฅ ๐‘›−1 − โ‹ฏ − ๐‘๐‘›−1 ๐‘ฅ − ๐‘๐‘› , then
๐‘(๐‘ฅ) + (−๐‘(๐‘ฅ)) = ๐‘‚
[M1]
๐‘๐‘(๐‘ฅ) = ๐‘๐‘0 ๐‘ฅ ๐‘› + ๐‘๐‘1 ๐‘ฅ ๐‘›−1 + … + ๐‘๐‘๐‘›−1 ๐‘ฅ + ๐‘๐‘๐‘›
where ๐‘๐‘0 , ๐‘๐‘1 , … , ๐‘๐‘๐‘› ∈ โ„ . Clearly , the product of a real number and a polynomial is also
a polynomial. Hence ๐‘ƒ๐‘› is closed under scalar multiplication.
[M2] ๐‘[๐‘(๐‘ฅ) + ๐‘ž(๐‘ฅ)] = ๐‘[(๐‘0 + ๐‘ž0 )๐‘ฅ ๐‘› + (๐‘1 + ๐‘ž1 )๐‘ฅ ๐‘›−1 + โ‹ฏ + (๐‘๐‘›−1 + ๐‘ž๐‘›−1 )๐‘ฅ
+(๐‘๐‘› + ๐‘ž๐‘› )]
= ๐‘(๐‘0 + ๐‘ž0 )๐‘ฅ ๐‘› + ๐‘(๐‘1 + ๐‘ž1 )๐‘ฅ ๐‘›−1 + โ‹ฏ + ๐‘(๐‘๐‘›−1 + ๐‘ž๐‘›−1 )๐‘ฅ
+ ๐‘(๐‘๐‘› + ๐‘ž๐‘› )
= (๐‘๐‘0 + ๐‘๐‘ž0 )๐‘ฅ ๐‘› + (๐‘๐‘1 + ๐‘๐‘ž1 )๐‘ฅ ๐‘›−1 + โ‹ฏ + (๐‘๐‘๐‘›−1 + ๐‘๐‘ž๐‘›−1 )๐‘ฅ
+ (๐‘๐‘๐‘› + ๐‘๐‘ž๐‘› )
= (๐‘๐‘0 ๐‘ฅ ๐‘› + ๐‘๐‘1 ๐‘ฅ ๐‘›−1 + … + ๐‘๐‘๐‘›−1 ๐‘ฅ + ๐‘๐‘๐‘› ) +
(๐‘๐‘ž0 ๐‘ฅ ๐‘› + ๐‘๐‘ž1 ๐‘ฅ ๐‘›−1 + … + ๐‘๐‘ž๐‘›−1 ๐‘ฅ + ๐‘๐‘ž๐‘› )
= ๐‘๐‘(๐‘ฅ) + ๐‘๐‘ž(๐‘ฅ)
[M3] (๐‘ + ๐‘‘)๐‘(๐‘ฅ) = (๐‘ + ๐‘‘)(๐‘0 ๐‘ฅ ๐‘› + ๐‘1 ๐‘ฅ ๐‘›−1 + … + ๐‘๐‘›−1 ๐‘ฅ + ๐‘๐‘› )
= (๐‘ + ๐‘‘)๐‘0 ๐‘ฅ ๐‘› + (๐‘ + ๐‘‘)๐‘1 ๐‘ฅ ๐‘›−1 + … + (๐‘ + ๐‘‘)๐‘๐‘›−1 ๐‘ฅ + (๐‘ + ๐‘‘)๐‘๐‘›
= (๐‘๐‘0 + ๐‘‘๐‘0 )๐‘ฅ ๐‘› + (๐‘๐‘1 + ๐‘‘๐‘1 )๐‘ฅ ๐‘›−1 + โ‹ฏ + (๐‘๐‘๐‘›−1 + ๐‘‘๐‘๐‘›−1 )๐‘ฅ
+ (๐‘๐‘๐‘› + ๐‘‘๐‘๐‘› )
= (๐‘๐‘0 ๐‘ฅ ๐‘› + ๐‘๐‘1 ๐‘ฅ ๐‘›−1 + … + ๐‘๐‘๐‘›−1 ๐‘ฅ + ๐‘๐‘๐‘› ) +
๐‘‘๐‘0 ๐‘ฅ ๐‘› + ๐‘‘๐‘1 ๐‘ฅ ๐‘›−1 + … + ๐‘‘๐‘๐‘›−1 ๐‘ฅ + ๐‘‘๐‘๐‘›
= ๐‘๐‘(๐‘ฅ) + ๐‘‘๐‘(๐‘ฅ)
[M4] (๐‘๐‘‘)๐‘(๐‘ฅ) = (๐‘๐‘‘)๐‘0 ๐‘ฅ ๐‘› + (๐‘๐‘‘)๐‘1 ๐‘ฅ ๐‘›−1 + … + (๐‘๐‘‘)๐‘๐‘›−1 ๐‘ฅ + (๐‘๐‘‘)๐‘๐‘›
= ๐‘(๐‘‘๐‘0 )๐‘ฅ ๐‘› + ๐‘(๐‘‘๐‘1 )๐‘ฅ ๐‘›−1 + … + ๐‘(๐‘‘๐‘๐‘›−1 )๐‘ฅ + ๐‘(๐‘‘๐‘๐‘› )
= ๐‘(๐‘‘๐‘0 ๐‘ฅ ๐‘› + ๐‘‘๐‘1 ๐‘ฅ ๐‘›−1 + … + ๐‘‘๐‘๐‘›−1 ๐‘ฅ + ๐‘‘๐‘๐‘› )
= ๐‘(๐‘‘๐‘(๐‘ฅ))
[M5] 1๐‘(๐‘ฅ) = 1๐‘0 ๐‘ฅ ๐‘› + 1๐‘1 ๐‘ฅ ๐‘›−1 + … + 1๐‘๐‘›−1 ๐‘ฅ + 1๐‘๐‘›
= ๐‘0 ๐‘ฅ ๐‘› + ๐‘1 ๐‘ฅ ๐‘›−1 + … + ๐‘๐‘›−1 ๐‘ฅ + ๐‘๐‘›
= ๐‘(๐‘ฅ)
Since all ten properties are satisfied, then ๐‘ƒ๐‘› is a vector space.
Example 2. Let V be the set of all ordered triples of real numbers (๐‘ฅ, ๐‘ฆ, ๐‘ง) with the
operations
(๐‘ฅ, ๐‘ฆ, ๐‘ง) + (๐‘ฅ ′ , ๐‘ฆ ′ , ๐‘ง ′ ) = (๐‘ฅ + ๐‘ฅ ′ , ๐‘ฆ + ๐‘ฆ ′ , ๐‘ง + ๐‘ง ′ )
๐‘(๐‘ฅ, ๐‘ฆ, ๐‘ง) = (๐‘ฅ, 1, ๐‘ง)
Determine if V is a vector space.
Solution: To prove that V is a vector space, we have to show that all ten properties are
satisfied.
Let ๐‘ฟ = (๐‘ฅ, ๐‘ฆ, ๐‘ง), ๐’€ = (๐‘ฅ ′ , ๐‘ฆ ′ , ๐‘ง ′ ) and ๐’ = (๐‘ฅ ′′ , ๐‘ฆ ′′ , ๐‘ง ′′ )
[A1]
V and ๐‘, ๐‘‘ ∈ โ„. Then
๐‘ฟ + ๐’€ = (๐‘ฅ + ๐‘ฅ ′ , ๐‘ฆ + ๐‘ฆ ′ , ๐‘ง + ๐‘ง ′ )
Since ๐‘ฅ + ๐‘ฅ ′ , ๐‘ฆ + ๐‘ฆ ′ , and ๐‘ง + ๐‘ง ′
addition.
∈ โ„, then ๐‘ฟ + ๐’€ ∈ V.
Hence V is closed under
[A2] ๐‘ฟ + ๐’€ = (๐‘ฅ + ๐‘ฅ ′ , ๐‘ฆ + ๐‘ฆ ′ , ๐‘ง + ๐‘ง ′ )
= (๐‘ฅ ′ + ๐‘ฅ, ๐‘ฆ ′ + ๐‘ฆ, ๐‘ง ′ + ๐‘ง)
=๐’€+๐‘ฟ
Hence vector addition is commutative.
[A3] (๐‘ฟ + ๐’€) + ๐’ = (๐‘ฅ + ๐‘ฅ ′ , ๐‘ฆ + ๐‘ฆ ′ , ๐‘ง + ๐‘ง ′ ) + (๐‘ฅ ′′ , ๐‘ฆ ′′ , ๐‘ง ′′ )
= ((๐‘ฅ + ๐‘ฅ ′ ) + ๐‘ฅ ′′ , (๐‘ฆ + ๐‘ฆ ′ ) + ๐‘ฆ ′′ , (๐‘ง + ๐‘ง ′ ) + ๐‘ง′′)
= (๐‘ฅ + (๐‘ฅ ′ + ๐‘ฅ ′′ ), ๐‘ฆ + (๐‘ฆ ′ + ๐‘ฆ ′′ ), ๐‘ง + (๐‘ง ′ + ๐‘ง ′′ ))
= (๐‘ฅ, ๐‘ฆ, ๐‘ง) + (๐‘ฅ ′ + ๐‘ฅ ′′ , ๐‘ฆ′ + ๐‘ฆ ′′ , ๐‘ง ′ + ๐‘ง ′′ )
= ๐‘ฟ + (๐’€ + ๐’)
Thus vector addition is associative.
[A4] Let ๐‘ถ = (0, 0, 0) ∈ V be the zero vector then
๐‘ฟ + ๐‘ถ = (๐‘ฅ, ๐‘ฆ, ๐‘ง) + (0, 0, 0)
= (๐‘ฅ, ๐‘ฆ, ๐‘ง)
=๐‘ฟ
[A5] Let – ๐‘ฟ = (−๐‘ฅ, −๐‘ฆ, −๐‘ง) then
๐‘ฟ + (−๐‘ฟ) = (๐‘ฅ, ๐‘ฆ, ๐‘ง) + (−๐‘ฅ, −๐‘ฆ, −๐‘ง)
= (0, 0, 0)
=๐‘ถ
[M1] ๐‘๐‘ฟ = ๐‘(๐‘ฅ, ๐‘ฆ, ๐‘ง) = (๐‘ฅ, 1, ๐‘ง)
Since ๐‘ฅ and ๐‘ง are real numbers then ๐‘๐‘ฟ ∈ V. Hence V is closed under scalar multiplication.
[M2] ๐‘(๐‘ฟ + ๐’€) = ๐‘(๐‘ฅ + ๐‘ฅ ′ , ๐‘ฆ + ๐‘ฆ ′ , ๐‘ง + ๐‘ง ′ )
= (๐‘ฅ + ๐‘ฅ ′ , 1, ๐‘ง + ๐‘ง ′ )
Also,
๐‘๐‘ฟ + ๐‘๐’€ = ๐‘(๐‘ฅ, ๐‘ฆ, ๐‘ง) + ๐‘(๐‘ฅ ′ , ๐‘ฆ ′ , ๐‘ง ′ )
= (๐‘ฅ, 1, ๐‘ง) + (๐‘ฅ ′ , 1, ๐‘ง ′ )
= (๐‘ฅ + ๐‘ฅ ′ , 2, ๐‘ง + ๐‘ง ′ )
Since ๐‘(๐‘ฟ + ๐’€) ≠ ๐‘๐‘ฟ + ๐‘๐’€ then M2 is not satisfied. Hence V under the prescribed
operations is not a vector space.
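The failure of M2 can be demonstrated concretely. This sketch implements the prescribed operations in plain Python; the sample vectors and scalar are our own illustrative values.

```python
# The operations of Example 2 on ordered triples.
def add(u, v):
    return (u[0] + v[0], u[1] + v[1], u[2] + v[2])

def smul(c, u):
    # The prescribed (non-standard) scalar multiplication c(x, y, z) = (x, 1, z);
    # note that c is ignored entirely.
    return (u[0], 1, u[2])

X, Y, c = (1, 2, 3), (4, 5, 6), 2

lhs = smul(c, add(X, Y))            # c(X + Y) -> (5, 1, 9)
rhs = add(smul(c, X), smul(c, Y))   # cX + cY  -> (5, 2, 9)
print(lhs, rhs, lhs == rhs)
```

The middle components disagree (1 versus 2), so M2 fails and V is not a vector space.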
Theorem 3.3.1: If V is a vector space, then:
a) 0๐‘ฟ = 0, for every ๐‘ฟ in V.
b) ๐‘0 = 0, for every scalar ๐‘.
c) If ๐‘๐‘ฟ = 0, then ๐‘ = 0 or ๐‘ฟ = 0
d) (-1)๐‘ฟ = −๐‘ฟ, for every ๐‘ฟ in V.
SAQ 3-1
Consider the set ๐‘€3,3 (โ„) of all 3 x 3 matrices with entries
in under the usual operations of matrix addition and
scalar multiplication. Show that ๐‘€3,3 (โ„) is a vector space.
ASAQ 3-1
Let ๐ด, ๐ต, and ๐ถ ∈ ๐‘€3,3 (โ„) and ๐‘, ๐‘‘ ∈ โ„. Then
[A1] For every ๐ด, ๐ต ๐‘€3,3 (โ„), we have ๐ด + ๐ต ∈ ๐‘€3,3 (โ„). Hence ๐‘€3,3 (โ„) is closed
under addition.
[A2] For every ๐ด, ๐ต
commutative.
๐‘€3,3 (โ„), we have ๐ด + ๐ต = ๐ต + ๐ด, that is matrix addition is
73
[A3] For every ๐ด, ๐ต, ๐ถ ๐‘€3,3 (โ„), we have (๐ด + ๐ต) + ๐ถ = ๐ด + (๐ต + ๐ถ), that is matrix
addition is associative.
[A4] Let ๐‘‚ be the 3 x 3 zero matrix, then for every ๐ด
๐ด+๐‘‚ =๐‘‚+๐ด=๐ด
[A5] For every ๐ด
๐‘€3,3 (โ„), we have
๐‘€3,3 (โ„), we have ๐ด + (−๐ด) = ๐‘‚.
[M1] For every ๐‘ ∈ โ„ and ๐ด ๐‘€3,3 (โ„), we have ๐‘๐ด
closed under scalar multiplication.
๐‘€3,3 (โ„). Hence ๐‘€3,3 (โ„) is
[M2] For every ๐‘ ∈ โ„ and ๐ด, ๐ต
๐‘€3,3 (โ„), we have ๐‘(๐ด + ๐ต) = ๐‘๐ด + ๐‘๐ต.
[M3] For every ๐‘, ๐‘‘ ∈ โ„ and ๐ด
๐‘€3,3 (โ„), we have (๐‘ + ๐‘‘)๐ด = ๐‘๐ด + ๐‘‘๐ด.
[M4] For every ๐‘, ๐‘‘ ∈ โ„ and ๐ด
๐‘€3,3 (โ„), we have (๐‘๐‘‘)๐ด = ๐‘(๐‘‘๐ด).
[M5] For every ๐ด
๐‘€3,3 (โ„),we have 1๐ด = ๐ด.
Hence ๐‘€3,3 (โ„) under the usual operations of matrix addition and scalar multiplication is a
vector space.
Definition 3.3.2: Let V be a vector space and W a nonempty subset of V. If W is a vector
space under the operations of addition and scalar multiplication defined on V, then W is
called a subspace of V.
Theorem 3.3.2: Let V be a vector space under the operations addition and scalar
multiplication and let W be a nonempty subset of V. Then W is a subspace of V if and only if
the following conditions hold:
a) The zero vector ๐‘ถ belongs to W;
b) For every vector ๐‘ฟ, ๐’€ ∈ W and ๐‘ ∈ โ„:
i. The sum ๐‘ฟ + ๐’€ ∈ W
ii. The multiple ๐‘๐‘ฟ ∈ W.
Property (i) in (b) states that W is closed under vector addition, and property (ii) in
(b) states that W is closed under scalar multiplication. Both properties may be combined
into the following equivalent single statement:
(b’) For every ๐‘ฟ, ๐’€ ∈ W and ๐‘Ž, ๐‘ ∈ โ„, the linear combination ๐‘Ž๐‘ฟ + ๐‘๐’€ ∈ W.
REMARK: If V is any vector space, then V automatically contains two subspaces, the set {0}
consisting of the zero vector alone and the whole space V itself. These are called the trivial
subspaces of V.
Example 3: Consider the vector space โ„3 . Let U consist of all vectors in โ„3 whose entries
are equal; that is, U = { (๐‘Ž, ๐‘, ๐‘) : ๐‘Ž = ๐‘ = ๐‘ }. Show that U is a subspace of โ„3 .
Solution:
(a) The zero vector ( 0, 0, 0 ) belongs to U.
(b) Let ๐‘ฟ = (๐‘Ž, ๐‘Ž, ๐‘Ž) and ๐’€ = (๐‘, ๐‘, ๐‘) be vectors in U and let ๐‘˜, ๐‘‘ ∈ โ„. Then
    ๐‘˜๐‘ฟ + ๐‘‘๐’€ = (๐‘˜๐‘Ž, ๐‘˜๐‘Ž, ๐‘˜๐‘Ž) + (๐‘‘๐‘, ๐‘‘๐‘, ๐‘‘๐‘)
            = (๐‘˜๐‘Ž + ๐‘‘๐‘, ๐‘˜๐‘Ž + ๐‘‘๐‘, ๐‘˜๐‘Ž + ๐‘‘๐‘)
Clearly, ๐‘˜๐‘ฟ + ๐‘‘๐’€ ∈ U; hence U is a subspace of โ„3 .
Example 4: Let V = โ„3 and let W = { (๐‘Ž, ๐‘, ๐‘) : ๐‘Ž ≥ 0 }. Determine if W is a subspace of V.
Solution:
(a) The zero vector ( 0, 0, 0 ) ∈ W.
(b) Let ๐‘ฟ = (๐‘Ž, ๐‘, ๐‘) and ๐’€ = (๐‘Ž′ , ๐‘ ′ , ๐‘ ′ ) be vectors in W.
    ๐‘ฟ + ๐’€ = (๐‘Ž + ๐‘Ž′ , ๐‘ + ๐‘ ′ , ๐‘ + ๐‘ ′ )
Since ๐‘Ž ≥ 0 and ๐‘Ž′ ≥ 0 then ๐‘Ž + ๐‘Ž′ ≥ 0. Hence ๐‘ฟ + ๐’€ ∈ W.
(c) Let ๐‘ฟ = (1, 2, 3) and ๐‘˜ = −2; then
    ๐‘˜๐‘ฟ = (−2, −4, −6) ∉ W since −2 < 0.
Therefore, W is not a subspace of V.
ACTIVITY
I. In problems 1-4, determine whether the given set together with the given operations is a
vector space.
1. The set of ordered pairs (๐‘Ž, ๐‘) of real numbers with the operations
(๐‘Ž, ๐‘) + (๐‘, ๐‘‘) = (๐‘Ž + ๐‘, ๐‘ + ๐‘‘) and
๐‘˜(๐‘Ž, ๐‘) = (๐‘˜๐‘Ž, 0)
2. The set of all ordered triples of real numbers of the form (0,0, ๐‘ง) with the operations
( 0, 0, ๐‘ง ) + ( 0, 0, ๐‘ง′ ) = ( 0, 0, ๐‘ง + ๐‘ง′ ) and
๐‘ ( 0, 0, ๐‘ง ) = ( 0, 0, ๐‘๐‘ง ).
3. The set of polynomials (in ๐‘ฅ) of degree ≤ ๐‘› with positive constant term, together with
the zero polynomial.
4. The set of all ordered pairs (๐‘Ž, ๐‘) of real numbers with addition and scalar multiplication
in V defined by
(๐‘Ž, ๐‘) + (๐‘, ๐‘‘) = (๐‘Ž๐‘, ๐‘๐‘‘) and ๐‘˜(๐‘Ž, ๐‘) = (๐‘˜๐‘Ž, ๐‘˜๐‘)
5. Which of the following subsets of โ„3 are subspaces of โ„3 ? The set of all vectors of the
form
(a) (๐‘Ž, ๐‘, ๐‘), where ๐‘Ž = ๐‘ = 0
(b) (๐‘Ž, ๐‘, ๐‘),, where ๐‘Ž = −๐‘
(c) (๐‘Ž, ๐‘, ๐‘),, where ๐‘ = 2๐‘Ž + 1
6. Let V be the set of all 2 x 3 matrices under the usual operations of matrix addition and
scalar multiplication. Which of the following subsets of V are subspaces? The set of all
matrices of the form
(a) [ ๐‘Ž ๐‘ ๐‘ ]
    [ ๐‘‘ 0 0 ] , where ๐‘ = ๐‘Ž + ๐‘.
(b) [ ๐‘Ž ๐‘ ๐‘ ]
    [ ๐‘‘ 0 0 ] , where ๐‘ > 0
3.4 Linear Combinations and Spanning Sets
Definition 3.4.1: A vector ๐‘ฟ ∈ V is said to be a linear combination of the set of vectors
{ ๐‘ฟ1 , ๐‘ฟ2 , โ‹ฏ , ๐‘ฟ๐‘˜ } ⊆ V if there exist scalars ๐‘1 , ๐‘2 , โ‹ฏ , ๐‘๐‘˜ such that
๐‘ฟ = ๐‘1 ๐‘ฟ๐Ÿ + ๐‘2 ๐‘ฟ๐Ÿ + โ‹ฏ + ๐‘๐‘˜ ๐‘ฟ๐’Œ
We also say that ๐‘ฟ is linearly dependent on the ๐‘ฟ๐’Š .
Example 1. In โ„3 , (−15, −4, 0) is a linear combination of (−3, −1, 4) and (3, 2, 8) since
2(−3, −1, 4) − 3(3, 2, 8) = (−15, −4, 0).
Example 2.
    [−1  6  7 ]
    [−1 15 16 ]
is a linear combination of
    [−1 0 3 ]       [ 1 3 −1 ]
    [ 1 5 6 ]  and  [−2 0 −1 ]
since
      [−1 0 3 ]     [ 1 3 −1 ]   [−1  6  7 ]
    3 [ 1 5 6 ] + 2 [−2 0 −1 ] = [−1 15 16 ]
Example 3. In โ„3 , let X1 = (4, 2, 3), X2 = (2, 1, 2), and X3 = ( 2, 1, 0). Determine if
(4, 2, 6) is a linear combination of X1, X2, and X3.
๐‘ฟ=
Solution: X is a linear combination of X1, X2, and X3 if we can find ๐‘1 , ๐‘2, and ๐‘3 ๏ƒŽ
that
such
๐‘1(4, 2, 3) + ๐‘2 (2, 1, 2) + ๐‘3 ( 2, 1, 0)
(4, 2, 6)
Multiplying and adding yields the following linear system.
4๐‘1 + 2๐‘2
2๐‘3 = 4
2๐‘1 + ๐‘2 + ๐‘3 = 2
3๐‘1
2๐‘2
= 6
Row reducing the augmented matrix we have
4
2 −2
[2
1
1
−3 −2 0
4
2]
−6
1
4
๐‘…1
1 1/2 −1/2
๐‘…1 [ 2
1
1
−3 −2
0
1
2]
−6
77
−2๐‘…1 + ๐‘…2
3๐‘…1 + ๐‘…3
1
๐‘…2
−2๐‘…3
2
๐‘…2
1 1/2
๐‘…2
0
[0
๐‘…3
0 −1/2
1 1/2
๐‘…2
[0
0
๐‘…3 0
1
๐‘…3
−1/2
2
−3/2
1
0]
−3
−1/2
1
3
1
0]
6
1 1/2 −1/2
[0
1
3
0
0
1
1
6]
0
This implies that ๐‘1 = 2, ๐‘2 = 6, and ๐‘3 = 0. It can be verified that
2(4, 2, 3) + 6(2, 1, 2) + 0( 2, 1, 0) = (4, 2, 6).
Thus X is a linear combination of X1 = (4, 2, 3), X2 = (2, 1, 2), and X3 = ( 2, 1, 0).
SAQ 3-2
In ๐‘ƒ3 , let ๐‘(๐‘ฅ) = ๐‘ฅ 2 − 2๐‘ฅ, ๐‘ž(๐‘ฅ) = 5๐‘ฅ − 2, and
๐‘Ÿ(๐‘ฅ) = ๐‘ฅ 2 + 3. Determine if ๐‘“(๐‘ฅ) = 2๐‘ฅ 2 + 4๐‘ฅ − 7 is a
linear combination of ๐‘(๐‘ฅ), ๐‘ž(๐‘ฅ), and ๐‘Ÿ(๐‘ฅ).
ASAQ 3-2
๐‘“(๐‘ฅ) is a linear combination of ๐‘(๐‘ฅ), ๐‘ž(๐‘ฅ), and ๐‘Ÿ(๐‘ฅ) if we can find ๐‘1 , ๐‘2 , and ๐‘3 ∈ โ„ such
that
๐‘1 (๐‘ฅ 2 − 2๐‘ฅ) + ๐‘2 (5๐‘ฅ − 2) + ๐‘3 (๐‘ฅ 2 + 3) = 2๐‘ฅ 2 + 4๐‘ฅ − 7
Equating the coefficients of similar terms we obtain the system
๐‘1
+ ๐‘3 = 2
−2๐‘1 + 5๐‘2
=4
−2๐‘2 + 3๐‘3 = −7
Row reducing the augmented matrix we obtain (you must do the calculation here)
    [ 1 0  1     2  ]
    [ 0 1 2/5   8/5 ]
    [ 0 0  1    −1  ]
Thus the solution to the given system is ๐‘1 = 3, ๐‘2 = 2, ๐‘3 = −1. It can be verified that
3(๐‘ฅ 2 − 2๐‘ฅ) + 2(5๐‘ฅ − 2) − (๐‘ฅ 2 + 3) = 2๐‘ฅ 2 + 4๐‘ฅ − 7
Thus ๐‘“(๐‘ฅ) is a linear combination of ๐‘(๐‘ฅ), ๐‘ž(๐‘ฅ), and ๐‘Ÿ(๐‘ฅ).
Definition 3.4.2: Let S = { ๐‘ฟ๐Ÿ , ๐‘ฟ๐Ÿ , โ‹ฏ , ๐‘ฟ๐’ } be a set of vectors in a vector space V. The set S
spans V, or V is spanned by S, if every vector in V is a linear combination of the vectors in S.
That is, for every ๐‘ฟ ∈ V there are scalars ๐‘1 , ๐‘2 , โ‹ฏ , ๐‘๐‘› such that
๐‘ฟ = ๐‘1 ๐‘ฟ๐Ÿ + ๐‘2 ๐‘ฟ๐Ÿ + โ‹ฏ + ๐‘๐‘› ๐‘ฟ๐’
Example 4. The vectors ๐‘ฟ1 = (1, 0) and ๐‘ฟ๐Ÿ = (0, 1) span โ„2 since every vector (๐‘Ž, ๐‘) in โ„2
can be expressed as a linear combination of ๐‘ฟ1 and ๐‘ฟ๐Ÿ . That is
(๐‘Ž, ๐‘) = ๐‘Ž(1, 0) + ๐‘(0, 1) for every ๐‘Ž, ๐‘ ∈ โ„.
Example 5. Every 2 x 2 matrix can be written as a linear combination of the four matrices
    ๐ธ1 = [ 1 0 ],  ๐ธ2 = [ 0 1 ],  ๐ธ3 = [ 0 0 ],  ๐ธ4 = [ 0 0 ]
         [ 0 0 ]        [ 0 0 ]        [ 1 0 ]        [ 0 1 ]
That is
    [ ๐‘Ž ๐‘ ] = ๐‘Ž [ 1 0 ] + ๐‘ [ 0 1 ] + ๐‘ [ 0 0 ] + ๐‘‘ [ 0 0 ]
    [ ๐‘ ๐‘‘ ]     [ 0 0 ]     [ 0 0 ]     [ 1 0 ]     [ 0 1 ]
for every ๐‘Ž, ๐‘, ๐‘, ๐‘‘ ∈ โ„. Thus ๐ธ1 , ๐ธ2 , ๐ธ3 , and ๐ธ4 span ๐‘€2,2 .
Example 6. Let V be the vector space โ„3 and let S = { X1 = (1, 2, 3), X2 = (−1, −2, 1),
X3 = (0, 1, 0) }. Does S span โ„3 ?
Solution.
Choose an arbitrary vector X = (๐‘Ž, ๐‘, ๐‘) in โ„3 . The set of vectors S = {๐‘ฟ๐Ÿ , ๐‘ฟ๐Ÿ , ๐‘ฟ๐Ÿ‘ } spans โ„3 if
we can find scalars ๐‘1 , ๐‘2 , and ๐‘3 such that
๐‘1(1, 2, 3) + ๐‘2 (−1, −2, 1) + ๐‘3 (0, 1, 0) = (๐‘Ž, ๐‘, ๐‘)
Multiplying and adding we obtain the linear system
    ๐‘1 − ๐‘2 = ๐‘Ž
    2๐‘1 − 2๐‘2 + ๐‘3 = ๐‘
    3๐‘1 + ๐‘2 = ๐‘
The augmented matrix
    [ 1 −1 0   ๐‘Ž ]
    [ 2 −2 1   ๐‘ ]
    [ 3  1 0   ๐‘ ]
is row equivalent to
    [ 1 0 0   (๐‘Ž + ๐‘)/4   ]
    [ 0 1 0   (−3๐‘Ž + ๐‘)/4 ]   (verify)
    [ 0 0 1   −2๐‘Ž + ๐‘     ]
The solution to the given system is
    ๐‘1 = (๐‘Ž + ๐‘)/4,  ๐‘2 = (−3๐‘Ž + ๐‘)/4,  and  ๐‘3 = −2๐‘Ž + ๐‘.
Since there exist ๐‘1 , ๐‘2 , and ๐‘3 for every choice of ๐‘Ž, ๐‘, and ๐‘, every vector in โ„3 is a
linear combination of X1 = (1, 2, 3), X2 = (−1, −2, 1), and X3 = (0, 1, 0).
For example, if ๐‘ฟ = (2, −1, 6) then ๐‘1 = 2, ๐‘2 = 0, and ๐‘3 = −5. It can be verified that
    (2, −1, 6) = 2(1, 2, 3) + 0(−1, −2, 1) − 5(0, 1, 0)
Thus S = {๐‘ฟ๐Ÿ , ๐‘ฟ๐Ÿ , ๐‘ฟ๐Ÿ‘ } spans โ„3 .
Example 7: Does ๐‘ฟ = (−1, 4, 2, 2) belong to span {๐‘ฟ๐Ÿ , ๐‘ฟ๐Ÿ , ๐‘ฟ๐Ÿ‘ } where X1 = (1, 0, 0, 1), X2 =
(1, −1, 0, 0), and X3 = (0, 1, 2, 1)?
Solution: Every vector in span {๐‘ฟ๐Ÿ , ๐‘ฟ๐Ÿ , ๐‘ฟ๐Ÿ‘ } is of the form
๐‘1 ๐‘ฟ๐Ÿ + ๐‘2 ๐‘ฟ๐Ÿ + ๐‘3 ๐‘ฟ๐Ÿ‘
Thus ๐‘ฟ belongs to span {๐‘ฟ๐Ÿ , ๐‘ฟ๐Ÿ , ๐‘ฟ๐Ÿ‘ } if we can write it as a linear combination of
๐‘ฟ๐Ÿ , ๐‘ฟ๐Ÿ , and ๐‘ฟ๐Ÿ‘ . Let ๐‘1 , ๐‘2 , and ๐‘3 be scalars such that
๐‘1 (1, 0, 0, 1) + ๐‘2 (1, −1, 0, 0) + ๐‘3 (0, 1, 2, 1) = (−1, 4, 2, 2)
Multiplying and adding we obtain the linear system
๐‘1 + ๐‘2
−๐‘2 + ๐‘3
2๐‘3
๐‘1
+ ๐‘3
=
=
=
=
−1
4
2
2
The third equation implies that ๐‘3 = 1. Substituting it to the second and fourth equations,
we get ๐‘1 = 1 and ๐‘2 = −3. However,
๐‘1 + ๐‘2 = 1 + (−3) = −2 ≠ −1
Thus ๐‘ฟ does not belong to span {๐‘ฟ๐Ÿ , ๐‘ฟ๐Ÿ , ๐‘ฟ๐Ÿ‘ }.
SAQ 3-3
Let S = {X1 = (1, 1, 0), X2 = (1, 3, 2), X3 = (4, 9, 5) } be
vectors in โ„3 . Does S span โ„3 ?
ASAQ 3-3
Choose an arbitrary vector X = (๐‘Ž, ๐‘, ๐‘) in โ„3 . The set of vectors S = {๐‘ฟ๐Ÿ , ๐‘ฟ๐Ÿ , ๐‘ฟ๐Ÿ‘ } spans โ„3 if
we can find scalars ๐‘1 , ๐‘2 , and ๐‘3 such that
๐‘1 (1, 1, 0) + ๐‘2 (1, 3, 2) + ๐‘3 (4, 9, 5) = (๐‘Ž, ๐‘, ๐‘)
Multiplying and adding we obtain the linear system
๐‘1 + ๐‘2 + 4๐‘3 = ๐‘Ž
๐‘1 + 3๐‘2 + 9๐‘3 = ๐‘
2๐‘2 + 5๐‘3 = ๐‘
The augmented matrix
[1 1 4 | ๐‘Ž]
[1 3 9 | ๐‘]
[0 2 5 | ๐‘]
is row equivalent to
[1 1 4 | ๐‘Ž         ]
[0 2 5 | −๐‘Ž + ๐‘    ]
[0 0 0 | ๐‘Ž − ๐‘ + ๐‘ ]
(verify)
The system has no solution if ๐‘Ž − ๐‘ + ๐‘ ≠ 0. Thus S = {๐‘ฟ๐Ÿ , ๐‘ฟ๐Ÿ , ๐‘ฟ๐Ÿ‘ } does not span โ„3 . For
example, the vector (3, 1, 0) cannot be written as a linear combination of ๐‘ฟ๐Ÿ , ๐‘ฟ๐Ÿ , ๐‘ฟ๐Ÿ‘ .
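This conclusion can be cross-checked by computing the rank of the matrix whose rows are X1, X2, X3; three vectors span โ„3 only if that rank is 3. A sketch:

```python
import numpy as np

S = np.array([[1, 1, 0],
              [1, 3, 2],
              [4, 9, 5]], dtype=float)

r = np.linalg.matrix_rank(S)
print(r)  # 2, which is less than 3, so S does not span R^3
```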
ACTIVITY
1. Determine whether ๐‘“(๐‘ฅ) = 3๐‘ฅ 2 − 3๐‘ฅ + 1 belongs to span {๐‘(๐‘ฅ), ๐‘ž(๐‘ฅ), ๐‘Ÿ(๐‘ฅ)} where
๐‘(๐‘ฅ) = ๐‘ฅ 2 − ๐‘ฅ, ๐‘ž(๐‘ฅ) = ๐‘ฅ 2 − 2๐‘ฅ + 1 and ๐‘Ÿ(๐‘ฅ) = −๐‘ฅ 2 + 1.
2. Let S = {X1 = ( 6, 4, -2, 4 ), X2 = ( 2, 0, 0, 1 ), X3 = ( 3, 2, -1, 2 ), X4 = ( 5, 6, -3, 2 ),
X5 = ( 0, 4, -2, -1 ) }. Does S span โ„4 ?
MODULE 4
LINEAR INDEPENDENCE
Introduction
In this chapter we will discuss linear dependence and independence of a given set of
vectors, the basis and dimension of a vector space V, the rank of a matrix and how it can be
used to determine whether a matrix is singular or nonsingular and whether a homogeneous
system of equations has a trivial or nontrivial solution.
Objectives
After going through this chapter, you are expected to be able to do the following:
1. Define linear dependence and independence.
2. Determine whether a given set of vectors is linearly independent or dependent.
3. Define and explain basis and dimension of a given vector space.
4. Find a basis for a vector space spanned by a given set of vectors.
5. Recognize the rank of a matrix and use this information to determine whether a
homogeneous system has a nontrivial solution.
4.1 Definition and Examples
Definition 4.1.1: A set of vectors { ๐‘ฟ๐Ÿ , ๐‘ฟ๐Ÿ , โ‹ฏ , ๐‘ฟ๐’Œ } ⊂ V is said to be linearly dependent if
there exist scalars ๐‘1 , ๐‘2 , โ‹ฏ , ๐‘๐‘˜ not all of which are zero such that
๐‘1 ๐‘ฟ๐Ÿ + ๐‘2 ๐‘ฟ๐Ÿ + โ‹ฏ + ๐‘๐‘˜ ๐‘ฟ๐’Œ = ๐ŸŽ
Otherwise, the set is said to be linearly independent. That is, the set of vectors
{๐‘ฟ๐Ÿ , ๐‘ฟ๐Ÿ , โ‹ฏ , ๐‘ฟ๐’Œ } is linearly independent if ๐‘1 ๐‘ฟ๐Ÿ + ๐‘2 ๐‘ฟ๐Ÿ + โ‹ฏ + ๐‘๐‘˜ ๐‘ฟ๐’Œ = ๐ŸŽ holds if and only
if ๐‘1 = ๐‘2 = โ‹ฏ = ๐‘๐‘˜ = 0.
Example 1. Let S ={X1 = (1, -1), X2 = (1, 1) } be vectors in โ„2 . Determine whether
S = {X1, X2 } is linearly dependent or linearly independent.
Solution: Let ๐‘1 and ๐‘2 be scalars. Next we form the equation
๐‘1(1, -1) + ๐‘2 (1, 1) = (0, 0)
Multiplying and adding we get the homogeneous system
๐‘1 + ๐‘2 = 0
−๐‘1 + ๐‘2 = 0
Since the only solution to this system is ๐‘1 = ๐‘2 = 0 then S = {X1, X2 } is linearly independent.
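A quick numerical check of this kind of result: stack the vectors as rows and compare the rank with the number of vectors. A sketch for Example 1:

```python
import numpy as np

vectors = np.array([[1, -1],
                    [1,  1]], dtype=float)

# The set is independent iff the rank equals the number of vectors
independent = np.linalg.matrix_rank(vectors) == len(vectors)
print(independent)  # True for Example 1
```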
Example 2. Let S = {๐‘ฟ๐Ÿ = (1, 2, 3), ๐‘ฟ๐Ÿ = (2, −1, 4), ๐‘ฟ๐Ÿ‘ = (3, −13/2, 4)} be vectors in โ„3 .
Determine whether S = {๐‘ฟ๐Ÿ , ๐‘ฟ๐Ÿ , ๐‘ฟ๐Ÿ‘ } is linearly dependent or linearly independent.
Solution: Let ๐‘1 , ๐‘2 , and ๐‘3 be scalars. Next we form the equation
๐‘1 (1, 2, 3) + ๐‘2 (2, −1, 4) + ๐‘3 (3, −13/2, 4) = (0, 0, 0)
Multiplying and adding we obtain the homogeneous system
๐‘1 + 2๐‘2 + 3๐‘3 = 0
2๐‘1 − ๐‘2 − (13/2)๐‘3 = 0
3๐‘1 + 4๐‘2 + 4๐‘3 = 0
Row reducing the augmented matrix we have
[1  2    3   | 0]
[2 −1 −13/2 | 0]   −2๐‘…1 + ๐‘…2 → ๐‘…2 ,  −3๐‘…1 + ๐‘…3 → ๐‘…3
[3  4    4   | 0]

[1  2    3   | 0]
[0 −5 −25/2 | 0]   −(1/5)๐‘…2 → ๐‘…2
[0 −2   −5  | 0]

[1  2   3  | 0]
[0  1  5/2 | 0]   2๐‘…2 + ๐‘…3 → ๐‘…3
[0 −2  −5  | 0]

[1 2  3  | 0]
[0 1 5/2 | 0]
[0 0  0  | 0]
Since column 3 has no pivot then ๐‘3 is a free variable. This implies that the system has
infinitely many solutions. In particular if we let ๐‘3 = 2 then ๐‘1 = 4 and ๐‘2 = −5. It can be
verified that
4(1, 2, 3) − 5(2, −1, 4) + 2(3, −13/2, 4) = (0, 0, 0)
Hence ๐‘ฟ๐Ÿ , ๐‘ฟ๐Ÿ , and ๐‘ฟ๐Ÿ‘ are linearly dependent.
REMARKS:
1. Any set of vectors that includes the zero vector is linearly dependent.
Proof:
Let S = {๐‘ฟ๐Ÿ , ๐‘ฟ๐Ÿ , โ‹ฏ , ๐‘ฟ๐’ , ๐‘ถ}. Then
0 โˆ™ ๐‘ฟ๐Ÿ + 0 โˆ™ ๐‘ฟ๐Ÿ + โ‹ฏ + 0 โˆ™ ๐‘ฟ๐’ + 2 โˆ™ ๐‘ถ = 0
Hence S is linearly dependent.
2. If X ≠ 0, then { X } is linearly independent.
3. The set { ๐ŸŽ } containing only the zero vector is linearly dependent.
SAQ 4-1
Determine whether the vectors ๐‘ฟ๐Ÿ = (1, −1, 2), ๐‘ฟ๐Ÿ = (4, 0, 0),
๐‘ฟ๐Ÿ‘ = (−2, 3, 5) and ๐‘ฟ๐Ÿ’ = (7, 1, 2) are linearly dependent or
linearly independent.
ASAQ 4-1
Let ๐‘1 , ๐‘2, ๐‘3 and ๐‘4 be scalars. Then we form the equation
๐‘1 (1, −1, 2) + ๐‘2 (4, 0, 0) + ๐‘3 (−2, 3, 5) + ๐‘4 (7, 1, 2) = (0, 0, 0)
Multiplying and adding we obtain the homogeneous system
๐‘1 + 4๐‘2 − 2๐‘3 + 7๐‘4 = 0
−๐‘1
+ 3๐‘3 + ๐‘4 = 0
2๐‘1
+ 5๐‘3 + 2๐‘4 = 0
The augmented matrix
[ 1 4 −2 7 | 0]
[−1 0  3 1 | 0]
[ 2 0  5 2 | 0]
is row equivalent to
[1 4 −2   7    | 0]
[0 1 1/4  2    | 0]
[0 0  1   4/11 | 0]
(you must do the calculation here)
The last equation implies that ๐‘4 is a free variable. Hence
๐‘1 = −4๐‘2 + 2๐‘3 − 7๐‘4
๐‘2 = −(1/4)๐‘3 − 2๐‘4
๐‘3 = −(4/11)๐‘4
If we let ๐‘4 = 11 then ๐‘1 = −1, ๐‘2 = −21, and ๐‘3 = −4. It can be verified that
−1(1, −1, 2) − 21(4, 0, 0) − 4(−2, 3, 5) + 11(7, 1, 2) = (0, 0, 0)
Hence the given vectors are linearly dependent.
A very important theorem about linear dependence or independence of vectors can
now be stated.
Theorem 4.1.1. A set of ๐‘› vectors in โ„๐‘š is always linearly dependent if ๐‘› > ๐‘š.
Corollary 4.1.1. A set of linearly independent vectors in โ„๐‘› contains at most ๐‘› vectors.
Corollary 4.1.1 can be interpreted in this way: if we have ๐‘› linearly independent
vectors in โ„๐‘› , then adding more vectors will make the set of vectors linearly dependent. For
example, the set of vectors {๐‘ฟ๐Ÿ = (1, -1), ๐‘ฟ๐Ÿ = (1, 1)} of Example 1 is linearly independent.
Then by Corollary 4.1.1, the set of vectors {๐‘ฟ๐Ÿ = (1, -1), ๐‘ฟ๐Ÿ = (1, 1), ๐‘ฟ๐Ÿ‘ = (2, 3)} is linearly
dependent.
Theorem 4.1.2: Let S = { ๐‘ฟ๐Ÿ , ๐‘ฟ๐Ÿ , โ‹ฏ , ๐‘ฟ๐’Œ } be a set of nonzero vectors in a vector space V.
Then S is linearly dependent if and only if one of the vectors ๐‘ฟ๐’‹ is a linear combination of
the preceding vectors in S.
Example 3. Let S = {X1 = (1, 1, 0), X2 = (0, 2, 3), X3 = (1, 2, 3), X4 = (3, 6, 6) } be a set of vectors
in โ„3 . Determine if S is linearly dependent. If linearly dependent, express one of the vectors
as a linear combination of the rest.
Solution: By Corollary 4.1.1, we know that S is linearly dependent. Now we form the
equation
๐‘1(1, 1, 0) + ๐‘2 (0, 2, 3) + ๐‘3 (1, 2, 3) + ๐‘4 (3, 6, 6) = (0, 0, 0)
Multiplying and adding we get the homogeneous system
๐‘1
+ ๐‘3 + 3๐‘4 = 0
๐‘1 + 2๐‘2 + 2๐‘3 + 6๐‘4 = 0
3๐‘2 + 3๐‘3 + 6๐‘4 = 0
Row reducing the augmented matrix we obtain
[1 0 1 3 | 0]
[0 1 1 2 | 0]
[0 0 1 1 | 0]
This implies that ๐‘4 is a free variable. If we let ๐‘4 = ๐‘Ÿ ∈ โ„ and ๐‘Ÿ ≠ 0 then
๐‘3 = −๐‘4
= −๐‘Ÿ
๐‘2 = −๐‘3 − 2๐‘4
= ๐‘Ÿ − 2๐‘Ÿ
= −๐‘Ÿ
๐‘1 = −๐‘3 − 3๐‘4
= ๐‘Ÿ − 3๐‘Ÿ
= −2๐‘Ÿ
Thus the given vectors are linearly dependent and it can be verified that
−2๐‘Ÿ(1, 1, 0) − ๐‘Ÿ(0, 2, 3) − ๐‘Ÿ(1, 2, 3) + ๐‘Ÿ(3, 6, 6) = (0, 0, 0)
By Theorem 4.1.2, we can write one of the vectors as a linear combination of the rest.
Transposing the first three terms to the right side we have
๐‘Ÿ(3, 6, 6) = 2๐‘Ÿ(1, 1, 0) + ๐‘Ÿ(0, 2, 3) + ๐‘Ÿ(1, 2, 3)
(3, 6, 6) = 2(1, 1, 0) + (0, 2, 3) + (1, 2, 3)
(dividing both sides by ๐‘Ÿ)
Clearly, (3, 6, 6) is a linear combination of the preceding vectors.
NOTE: Any vector in the given set can be expressed as a linear combination of the other
vectors.
4.2 Basis and Dimension
Definition 4.2.1: A set of vectors S = {๐‘ฟ๐Ÿ , ๐‘ฟ๐Ÿ , โ‹ฏ , ๐‘ฟ๐’ } in a vector space V is called a basis for
V if S spans V and S is linearly independent.
Example 1. Show that the set S = {X1 = (3, 2, 2), X2 = (-1, 2, 1), X3 = (0, 1, 0) } is a basis for โ„3 .
Solution:
First we have to show that S is linearly independent. Forming the equation
๐‘1(3, 2, 2) + ๐‘2 (−1, 2, 1)+ ๐‘3 (0, 1, 0) = (0, 0, 0)
we obtain the linear system
3๐‘1 − ๐‘2
= 0
2๐‘1 + 2๐‘2 + ๐‘3 = 0
2๐‘1 + ๐‘2
= 0
Solving the given system we obtain only the zero solution, ๐‘1 = ๐‘2 = ๐‘3 = 0. This implies
that S is linearly independent. Next we have to show that S spans โ„3 . To show that S spans
โ„3 , let ๐‘ฟ = (๐‘Ž, ๐‘, ๐‘) be any vector in โ„3 . Then we form the equation
๐‘1(3, 2, 2) + ๐‘2 (−1, 2, 1)+ ๐‘3 (0, 1, 0) = (๐‘Ž, ๐‘, ๐‘)
Multiplying and adding we obtain the linear system
3๐‘1 − ๐‘2
= ๐‘Ž
2๐‘1 + 2๐‘2 + ๐‘3 = ๐‘
2๐‘1 + ๐‘2
= ๐‘
Solving the system we get ๐‘1 = (๐‘Ž + ๐‘)/5 , ๐‘2 = (3๐‘ − 2๐‘Ž)/5 , and ๐‘3 = (2๐‘Ž + 5๐‘ − 8๐‘)/5 .
Since the above system is consistent for any choice of ๐‘Ž, ๐‘, and ๐‘, then S spans โ„3 .
Thus S = { (3, 2, 2), (−1, 2, 1), (0, 1, 0) } is a basis for โ„3 .
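For three vectors in โ„3, both conditions can be checked at once: the matrix with the vectors as rows is invertible (nonzero determinant) precisely when the set is independent and spans. A sketch for Example 1, with hypothetical sample values for (๐‘Ž, ๐‘, ๐‘):

```python
import numpy as np

S = np.array([[3, 2, 2],
              [-1, 2, 1],
              [0, 1, 0]], dtype=float)

d = np.linalg.det(S)
print(abs(d) > 1e-9)  # nonzero determinant, so S is a basis for R^3

# Coordinates of a sample vector (a, b, c) in this basis: solve c1*X1 + c2*X2 + c3*X3 = (a, b, c)
a, b, c = 1.0, 2.0, 3.0  # hypothetical sample values
coeffs = np.linalg.solve(S.T, np.array([a, b, c]))
print(np.allclose(coeffs, [(a + c)/5, (3*c - 2*a)/5, (2*a + 5*b - 8*c)/5]))  # matches the formulas above
```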
A basis for a vector space V is not unique. In fact it can be easily verified that the set
of vectors {(1, 0, 0), (0, 1, 0), (0, 0, 1)} is also a basis for โ„3 . This is called the natural basis
for โ„3 . Likewise the set of vectors {(1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)} forms a
natural basis for โ„4 . In general, the vectors ๐‘‹1 = (1, 0, โ‹ฏ , 0), ๐‘‹2 = (0, 1, โ‹ฏ , 0), โ‹ฏ , ๐‘‹๐‘› =
(0, 0, โ‹ฏ , 1) constitute a basis for โ„๐‘› .
Example 2. The monomials {๐‘ฅ 3 , ๐‘ฅ 2 , ๐‘ฅ, 1} is a natural basis for ๐‘ƒ3 . In general the monomials
{๐‘ฅ ๐‘› , ๐‘ฅ ๐‘›−1 , โ‹ฏ , ๐‘ฅ, 1} constitute a basis for ๐‘ƒ๐‘› .
Example 3. The 2 x 2 matrices
[1 0]  [0 1]  [0 0]  [0 0]
[0 0], [0 0], [1 0], [0 1]
form a natural basis for ๐‘€22 .
SAQ 4-2
Determine whether A = {๐‘ฅ 2 − 1, ๐‘ฅ 2 − 2, ๐‘ฅ 2 − 3} is a basis
for ๐‘ƒ2 .
ASAQ 4-2
First we have to determine if A is linearly independent. Let ๐‘1 , ๐‘2, and ๐‘3 be scalars. Then
we form the equation
๐‘1 (๐‘ฅ 2 − 1) + ๐‘2 (๐‘ฅ 2 − 2) + ๐‘3 (๐‘ฅ 2 − 3) = 0
Equating the coefficients of similar terms we obtain the system
๐‘1 + ๐‘2 + ๐‘3 = 0
−๐‘1 − 2 ๐‘2 − 3๐‘3 = 0
The augmented matrix in reduced row echelon form is (verify)
[1 0 −1 | 0]
[0 1  2 | 0]
This implies that A is linearly dependent. Thus A is not a basis for ๐‘ƒ2 .
Theorem 4.2.1: If S = { ๐‘ฟ๐Ÿ , ๐‘ฟ๐Ÿ , โ‹ฏ , ๐‘ฟ๐’ } is a basis for a vector space V, then every vector in V
can be written in one and only one way as a linear combination of the vectors in S.
Definition 4.2.2: If the vector space V has a finite basis, then the dimension of V is the
number of vectors in every basis and V is called a finite dimensional vector space.
Otherwise V is called an infinite dimensional vector space. If V = {0}, then V is said to be
zero dimensional. We often write dimV for the dimension of V.
Example 4. Since a basis for โ„3 consists of 3 linearly independent vectors then dimโ„3 = 3.
Likewise dimโ„4 = 4, dimโ„5 = 5 and so on. In general dimโ„๐‘› = ๐‘› since ๐‘› linearly
independent vectors constitute a basis for โ„๐‘› .
Example 5. By Example 2, we see that dim๐‘ƒ3 = 4 and in general the dim๐‘ƒ๐‘› = ๐‘› + 1.
Theorem 4.2.2: Let S = { ๐‘ฟ๐Ÿ , ๐‘ฟ๐Ÿ , โ‹ฏ , ๐‘ฟ๐’ } be a set of nonzero vectors in a vector space V and
let W = span S. Then some subset of S is a basis for W.
Theorem 4.2.2 means that if W consists of all vectors that can be expressed as a
linear combination of the vectors in S (W=spanS) then some subset of S forms a basis for W.
Thus if S is linearly independent then { ๐‘ฟ๐Ÿ , ๐‘ฟ๐Ÿ , โ‹ฏ , ๐‘ฟ๐’ } is a basis for W. If S is linearly
dependent, there exists a proper subset of S which forms a basis for W. Let us consider the
next example:
Example 6: Let S = {X1 = (1, 2, 2), X2 = (3, 2, 1), X3 = (11, 10, 7), X4 = (7, 6, 4) }. Find a basis for
the subspace W = span S of โ„3 .
Solution: W is a set of all vectors in โ„3 which can be expressed as a linear combination of
the vectors in S. By Theorem 4.1.1 we know that S is linearly dependent. Hence S is not a
basis for W but by Theorem 4.2.2 S contains a proper subset of linearly independent vectors
which forms a basis for W. Next we form the equation
๐‘1(1, 2, 2) + ๐‘2 (3, 2, 1) + ๐‘3 (11, 10, 7) + ๐‘4 (7, 6, 4) = (0, 0, 0)
Equating the corresponding components, we obtain the homogeneous system
๐‘1 + 3๐‘2 + 11๐‘3 + 7๐‘4 = 0
2๐‘1 + 2๐‘2 + 10๐‘3 + 6๐‘4 = 0
2๐‘1 + ๐‘2 + 7๐‘3 + 4๐‘4 = 0
The augmented matrix in reduced row echelon form is (verify)
[1 0 2 1 | 0]
[0 1 3 2 | 0]
[0 0 0 0 | 0]
The leading 1s appear in columns 1 and 2, thus {X1, X2} is a basis for W = span S. It means
that {X1, X2} is the smallest set possible that could span W. By Theorem 4.1.2, we can write
๐‘ฟ๐Ÿ‘ and ๐‘ฟ๐Ÿ’ as a linear combination of the two vectors X1 and X2. Thus ๐‘ฟ๐Ÿ‘ and ๐‘ฟ๐Ÿ’ can be
discarded and the remaining vectors still span W. That is
W = span{ ๐‘ฟ๐Ÿ , ๐‘ฟ๐Ÿ , ๐‘ฟ๐Ÿ‘ , ๐‘ฟ๐Ÿ’ } = span{X1, X2}.
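The pivot-column procedure of Example 6 can be sketched as a small row-reduction routine (a helper of my own, not from the text):

```python
import numpy as np

def pivot_columns(A, tol=1e-10):
    """Row-reduce a copy of A and return the indices of its pivot columns."""
    A = A.astype(float)
    rows, cols = A.shape
    pivots, r = [], 0
    for c in range(cols):
        if r == rows:
            break
        # choose the largest entry in column c at or below row r as the pivot
        p = r + int(np.argmax(np.abs(A[r:, c])))
        if abs(A[p, c]) < tol:
            continue                    # no pivot in this column
        A[[r, p]] = A[[p, r]]           # swap the pivot row into place
        A[r] = A[r] / A[r, c]           # normalize the pivot row
        for i in range(rows):           # eliminate the column elsewhere
            if i != r:
                A[i] -= A[i, c] * A[r]
        pivots.append(c)
        r += 1
    return pivots

# Columns are X1, X2, X3, X4 from Example 6
M = np.array([[1, 3, 11, 7],
              [2, 2, 10, 6],
              [2, 1, 7, 4]])
print(pivot_columns(M))  # [0, 1]: X1 and X2 form a basis for span S
```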
Theorem 4.2.3: Suppose that dimV = ๐‘›. If T = { Y1, Y2, …, Yr } is a linearly independent set of
vectors in V, then ๐‘Ÿ ≤ ๐‘›.
Proof: Let ๐‘ฟ๐Ÿ , ๐‘ฟ๐Ÿ , โ‹ฏ , ๐‘ฟ๐’ be a basis for V. If ๐‘Ÿ > ๐‘› then we can find scalars ๐‘1 , ๐‘2 , โ‹ฏ , ๐‘๐‘Ÿ not
all zero such that
๐‘1 ๐’€๐Ÿ + ๐‘2 ๐’€๐Ÿ + โ‹ฏ + ๐‘๐‘Ÿ ๐’€๐’“ = ๐‘ถ
is satisfied. This will contradict the linear independence of the ๐’€๐’Š ′s. Thus ๐‘Ÿ ≤ ๐‘›.
Corollary 4.2.1: If S = { ๐‘ฟ๐Ÿ , ๐‘ฟ๐Ÿ , โ‹ฏ , ๐‘ฟ๐’ } and T = { Y1, Y2, …, Ym } are bases for a vector space,
then ๐‘› = ๐‘š. (If a vector space has one basis with a finite number of elements, then all other
bases are finite and have the same number of elements)
Theorem 4.2.4: If S is a linearly independent set of vectors in a finite-dimensional vector
space V, then there is a basis T for V which contains S.
Example 7: In V = โ„4 , the set A = {(1, 1, 0, 0), (0, 0, 1, 1), (1, 0, 1, 0), (0, 1, 0, −1)} is a basis,
and B = { (1, 2, −1, 1), (0, 1, 2, −1) } is linearly independent. Extend B into a basis for V using
A.
Solution:
B ∪ A = {(1, 2, −1, 1), (0, 1, 2, −1), (1, 1, 0, 0), (0, 0, 1, 1), (1, 0, 1, 0), (0, 1, 0, −1) }
Delete vectors in A which are linear combinations of the preceding vectors.
a. Try (1, 1, 0, 0)
๐‘1(1, 2, −1, 1) + ๐‘2 (0, 1, 2, −1) = (1, 1, 0, 0)
Then
๐‘1        = 1
2๐‘1 + ๐‘2  = 1
−๐‘1 + 2๐‘2 = 0
๐‘1 − ๐‘2   = 0
If ๐‘1 = 1 then ๐‘2 = 1 but 2๐‘1 + ๐‘2 = 3 ๏‚น 1. The system of equations has no solution so
(1, 1, 0, 0) is not a linear combination of the preceding vectors. Thus we retain (1, 1, 0, 0).
b. Try (0, 0, 1, 1)
๐‘1 (1, 2, -1, 1) + ๐‘2 (0, 1, 2, -1) = (0, 0, 1, 1)
Then
๐‘1        = 0
2๐‘1 + ๐‘2  = 0
−๐‘1 + 2๐‘2 = 1
๐‘1 − ๐‘2   = 1
If ๐‘1 = 0 then ๐‘2 = 0 but –๐‘1 + 2๐‘2 = 0 ๏‚น 1. The system has no solution so (0, 0, 1, 1) is not a
linear combination of the preceding vectors. Thus we retain (0, 0, 1, 1).
Hence, {(1, 2, −1, 1), (0, 1, 2, −1), (1, 1, 0, 0), (0, 0, 1, 1)} is a basis for V.
Theorem 4.2.5: Let V be an ๐‘›-dimensional vector space, and let S = { ๐‘ฟ๐Ÿ , ๐‘ฟ๐Ÿ , โ‹ฏ , ๐‘ฟ๐’ } be a
set of ๐‘› vectors in V.
a. If S is linearly independent, then it is a basis for V.
b. If S spans V, then it is a basis for V.
From our previous lesson, we see that the set of all solutions to the homogeneous
system AX = 0, where A is ๐‘š ๐‘ฅ ๐‘› , is a subspace of โ„๐‘› . To find a basis for this solution space
we consider the following example.
Example 8: Find a basis for the solution space W of
[1 2 1  2  1] [๐‘ฅ1]   [0]
[1 2 2  1  2] [๐‘ฅ2]   [0]
[2 4 3  3  3] [๐‘ฅ3] = [0]
[0 0 1 −1 −1] [๐‘ฅ4]   [0]
              [๐‘ฅ5]
Solution:
Transform the augmented matrix to reduced row-echelon form using the Gauss-Jordan
reduction method. The augmented matrix in reduced row-echelon form is (verify)
[1 2 0  3 0 | 0]
[0 0 1 −1 0 | 0]
[0 0 0  0 1 | 0]
[0 0 0  0 0 | 0]
The solution is ๐‘1 = −2๐‘  – 3๐‘ก, ๐‘2 = ๐‘ , ๐‘3 = ๐‘ก, ๐‘4 = ๐‘ก, and ๐‘5 = 0 where ๐‘  and ๐‘ก are any real
numbers. Thus every solution is of the form
    [−2๐‘  − 3๐‘ก]
    [    ๐‘     ]
๐‘‹ = [    ๐‘ก    ] , where ๐‘  and ๐‘ก are real numbers.
    [    ๐‘ก    ]
    [    0    ]
Since W is a solution space then every vector in W can be written in the form of
      [−2]     [−3]
      [ 1]     [ 0]
๐‘‹ = ๐‘  [ 0] + ๐‘ก [ 1]
      [ 0]     [ 1]
      [ 0]     [ 0]
Since ๐‘  and ๐‘ก can take on any values, we first let ๐‘  = 1, ๐‘ก = 0 and let ๐‘  = 0, ๐‘ก = 1, obtaining as
solutions
     [−2]             [−3]
     [ 1]             [ 0]
๐‘‹1 = [ 0]   and  ๐‘‹2 = [ 1]
     [ 0]             [ 1]
     [ 0]             [ 0]
Thus S = {X1, X2} belongs to W. Since any vector in W can be written as a linear combination
of the vectors in S then S spans W.
We have to show that S = {X1, X2} is linearly independent. We form the equation
๐‘1(−2, 1, 0, 0, 0 ) + ๐‘2 (−3, 0, 1, 1, 0 ) = (0, 0, 0, 0, 0)
Then
−2๐‘1 − 3๐‘2 = 0
๐‘1
= 0
๐‘2 = 0
The only solution to the above system is ๐‘1 = ๐‘2 = 0, hence S is linearly independent and is
a basis for W. Thus the dimension of W is 2.
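A numerical basis for the same solution space can be read off from the singular value decomposition: the right-singular vectors belonging to zero singular values span the null space. A sketch (the basis it returns is orthonormal, so it will differ from X1 and X2 while spanning the same W):

```python
import numpy as np

A = np.array([[1, 2, 1, 2, 1],
              [1, 2, 2, 1, 2],
              [2, 4, 3, 3, 3],
              [0, 0, 1, -1, -1]], dtype=float)

_, s, Vt = np.linalg.svd(A)
rank = int(np.sum(s > 1e-10))
null_basis = Vt[rank:]  # rows spanning the null space of A

print(null_basis.shape[0])               # 2: dim W = 2, as in Example 8
print(np.allclose(A @ null_basis.T, 0))  # True: each basis vector solves AX = 0
```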
SAQ 4-3
Find a basis for the solution space W of the homogeneous
system
๐‘ฅ1 + 2๐‘ฅ2 + 2๐‘ฅ3 − ๐‘ฅ4 + ๐‘ฅ5 = 0
2๐‘ฅ2 + 2๐‘ฅ3 − 2๐‘ฅ4 − ๐‘ฅ5 = 0
2๐‘ฅ1 + 6๐‘ฅ2 + 2๐‘ฅ3 − 4๐‘ฅ4 + ๐‘ฅ5 = 0
๐‘ฅ1 + 4๐‘ฅ2
− 3๐‘ฅ4
=0
What is the dimension of W?
ASAQ 4-3
The augmented matrix
[1 2 2 −1  1 | 0]
[0 2 2 −2 −1 | 0]
[2 6 2 −4  1 | 0]
[1 4 0 −3  0 | 0]
is row equivalent (verify) to
[1 0 0  1   2   | 0]
[0 1 0 −1 −1/2 | 0]
[0 0 1  0   0   | 0]
[0 0 0  0   0   | 0]
which is in reduced row echelon form.
Since columns 4 and 5 have no pivots then ๐‘ฅ4 and ๐‘ฅ5 are free variables. If we let ๐‘ฅ4 = ๐‘  and
๐‘ฅ5 = ๐‘ก where ๐‘  and ๐‘ก are real numbers then
๐‘ฅ1 = −๐‘  − 2๐‘ก
1
๐‘ฅ2 = ๐‘  + 2 ๐‘ก
๐‘ฅ3 = 0
๐‘ฅ4 = ๐‘ 
๐‘ฅ5 = ๐‘ก
where ๐‘  and ๐‘ก are real numbers.
Thus all solutions are of the form
    [−๐‘  − 2๐‘ก   ]       [−1]       [ −2]
    [๐‘  + (1/2)๐‘ก]       [ 1]       [1/2]
๐‘‹ = [    0     ]  = ๐‘  [ 0]  + ๐‘ก [  0]
    [    ๐‘      ]       [ 1]       [  0]
    [    ๐‘ก     ]       [ 0]       [  1]
Since ๐‘  and ๐‘ก can be any real number, we first let ๐‘  = 1, ๐‘ก = 0 and then let ๐‘  = 0, ๐‘ก = 2 to
obtain the solutions
     [−1]             [−4]
     [ 1]             [ 1]
๐‘‹1 = [ 0]   and  ๐‘‹2 = [ 0]
     [ 1]             [ 0]
     [ 0]             [ 2]
which span W. It can be easily verified that ๐‘‹1 and ๐‘‹2 are linearly independent because one
vector is not a multiple of the other. Thus ๐‘‹1 and ๐‘‹2 form a basis for W and dimW = 2. Note
that you may obtain a different basis since ๐‘  and ๐‘ก can take on any values.
ACTIVITY
1. Let
a) X1 = ( 4, 2, 1 ), X2 = ( 2, 6, -5 ), X3 = ( 1, -2, 3 )
b) X1 = ( 1, 2, 3 ), X2 = ( 1, 1, 1 ), X3 = ( 1, 0, 1 )
c) X1 = ( 1, 1, 0 ), X2 = ( 0, 2, 3 ), X3 = ( 1, 2, 3 ), X4 = ( 3, 6, 6 )
Which of the given set of vectors in โ„3 is linearly dependent? For those that are, express
one vector as a linear combination of the rest.
3. Determine whether the set
S = { [1 1],  [0 0],  [1 0],  [0 1] }
      [0 0]   [1 1]   [0 1]   [1 1]
is a basis for the vector space V of all 2 x 2 matrices.
4. Find a basis of โ„4 containing the vector (1, 2, 3, 4).
4.3 The Rank of a Matrix
Definition 4.3.1: Let
    [๐‘Ž11 ๐‘Ž12 … ๐‘Ž1๐‘›]
A = [๐‘Ž21 ๐‘Ž22 … ๐‘Ž2๐‘›]
    [ โ‹ฎ    โ‹ฎ       โ‹ฎ ]
    [๐‘Ž๐‘š1 ๐‘Ž๐‘š2 … ๐‘Ž๐‘š๐‘›]
be an ๐‘š x ๐‘› matrix. The rows of A,
X1 = (๐‘Ž11 , ๐‘Ž12 , ..., ๐‘Ž1๐‘›)
X2 = (๐‘Ž21 , ๐‘Ž22 , ..., ๐‘Ž2๐‘›)
โ‹ฎ
X๐‘š = (๐‘Ž๐‘š1 , ๐‘Ž๐‘š2 , ..., ๐‘Ž๐‘š๐‘›)
considered as vectors in โ„๐‘› , span a subspace of โ„๐‘› , called the row space of A. Similarly,
the columns of A,
     [๐‘Ž11]        [๐‘Ž12]              [๐‘Ž1๐‘›]
Y1 = [๐‘Ž21] , Y2 = [๐‘Ž22] , … , Y๐‘› = [๐‘Ž2๐‘›]
     [ โ‹ฎ ]        [ โ‹ฎ ]              [ โ‹ฎ ]
     [๐‘Ž๐‘š1]        [๐‘Ž๐‘š2]              [๐‘Ž๐‘š๐‘›]
considered as vectors in โ„๐‘š , span a subspace of โ„๐‘š called the column space of A.
Example 1. Let
    [1  2  0  1]
A = [2  6 −3 −2]
    [3 10 −6 −5]
The rows of A,
X1 = (1, 2, 0, 1), X2 = (2, 6, −3, −2), and X3 = (3, 10, −6, −5) are vectors in โ„4 , and these
vectors span a subspace of โ„4 called the row space of A. That is
Row space of A = span { ๐‘‹1 , ๐‘‹2 , ๐‘‹3 }
Similarly, the columns of A, Y1 = (1, 2, 3), Y2 = (2, 6, 10), Y3 = (0, −3, −6) and Y4 = (1, −2, −5)
are vectors in โ„3 , and these vectors span a subspace of โ„3 called the column space of A.
That is
Column space of A = span { ๐‘Œ1 , ๐‘Œ2 , ๐‘Œ3 , ๐‘Œ4 }
Theorem 4.3.1: If A and B are two ๐‘š x ๐‘› row equivalent matrices, then the row spaces of A
and B are equal.
Example 2: Let S = {๐‘ฟ๐Ÿ , ๐‘ฟ๐Ÿ , ๐‘ฟ๐Ÿ‘ , ๐‘ฟ๐Ÿ’ } where ๐‘ฟ๐Ÿ = (1, 2, −1), ๐‘ฟ๐Ÿ = (6, 3, 0), ๐‘ฟ๐Ÿ‘ = (4, −1, 2),
and ๐‘ฟ๐Ÿ’ = (2, −5, 4). Find a basis for the subspace V = spanS of โ„3 .
Solution: V is the row space of the matrix A whose rows are the given vectors
    [1  2 −1]
A = [6  3  0]
    [4 −1  2]
    [2 −5  4]
Applying the Gauss-Jordan elimination we obtain the matrix B
    [1 0  1/3 ]
B = [0 1 −2/3]
    [0 0   0  ]
    [0 0   0  ]
which is row equivalent to A. By Theorem 4.3.1, the row spaces of A and B are equal. Since
the nonzero rows of B are linearly independent then
(1, 0, 1/3) and (0, 1, −2/3)
form a basis for V. It means that all vectors in V can be expressed as a linear combination of
these two vectors. Note that the basis for V is not a subset of S. However, expressing any
vector in V as a linear combination of the basis obtained by this procedure is very simple.
For example, the vectors (2, −5, 4) and (−1, 4, −3) are in V. Since the leading 1s appear in
columns 1 and 2 then
(2, −5, 4) = 2(1, 0, 1/3) – 5(0, 1, −2/3) and
(−1, 4, −3) = −1(1, 0, 1/3) + 4(0, 1, −2/3)
Also note that the dimV = 2 ≠ 3 hence V ≠ โ„3 . Since V ≠ โ„3 then not all vectors in โ„3 can
be expressed as a linear combination of (1, 0, 1/3) and (0, 1, −2/3). It is very easy to see
from our example above that all vectors of the form (2, −5, ๐‘), ๐‘ ≠ 4, are not in V.
Definition 4.3.2: The dimension of the row space of A is called the row rank of A, and the
dimension of the column space of A is called the column rank of A.
Example 3. Let
    [1  2  0  1]
๐ด = [2  6 −3 −3]
    [3 10 −6 −7]
Find the row and column ranks of A.
Solution:
A is row equivalent (verify) to
    [1 0   3    6  ]
๐ต = [0 1 −3/2 −5/2]
    [0 0   0    0  ]
which is in reduced row echelon form. The vectors (1, 0, 3, 6) and (0, 1, −3/2, −5/2) form a
basis for the row space of A. Thus the row rank of A is 2.
To find the column rank of A, we form the matrix
     [1  2   3 ]
๐ด๐‘‡ = [2  6  10]
     [0 −3  −6]
     [1 −3  −7]
which is row equivalent (verify) to the matrix
    [1 0 −1]
๐ถ = [0 1  2]
    [0 0  0]
    [0 0  0]
The vectors (1, 0, −1) and (0, 1, 2) form a basis for the row space of ๐ด๐‘‡ . Thus
[ 1]       [0]
[ 0]  and  [1]
[−1]       [2]
form a basis for the column space of A. Hence the column rank of A is 2. Note that the row
rank and column rank of a matrix are equal. This is stated in the next theorem.
Theorem 4.3.2. The row and column ranks of the ๐‘š x ๐‘› matrix A = [๐‘Ž๐‘–๐‘— ] are equal.
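Theorem 4.3.2 is easy to spot-check numerically, since `matrix_rank` computes the common row/column rank. A sketch using the matrix of Example 3:

```python
import numpy as np

A = np.array([[1, 2, 0, 1],
              [2, 6, -3, -3],
              [3, 10, -6, -7]], dtype=float)

print(np.linalg.matrix_rank(A))    # 2, as found in Example 3
print(np.linalg.matrix_rank(A.T))  # 2: transposing never changes the rank
```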
The next example will show us how we can apply the method used in Example 3 in
finding a basis for a subspace when the vectors are given in column form.
Example 4: Let
     [1]  [0]  [2]  [3]  [ 5]
S = {[2], [2], [1], [2], [ 0]} . Find a basis for the subspace V = spanS of โ„4 .
     [1]  [1]  [3]  [1]  [ 0]
     [1]  [2]  [1]  [4]  [−1]
Solution: Let
    [1 0 2 3  5]
๐ด = [2 2 1 2  0]
    [1 1 3 1  0]
    [1 2 1 4 −1]
Next we form the matrix
     [1 2 1  1]
     [0 2 1  2]
๐ด๐‘‡ = [2 1 3  1]
     [3 2 1  4]
     [5 0 0 −1]
Applying elementary row operations we obtain (verify) the matrix
     [1 0 0 −1 ]
     [0 1 0 3/5]
๐ต๐‘‡ = [0 0 1 4/5]
     [0 0 0  0 ]
     [0 0 0  0 ]
The basis vectors for V are the nonzero rows of ๐ต๐‘‡ written as columns. Thus
[ 1]   [ 0 ]        [ 0 ]
[ 0]   [ 1 ]        [ 0 ]
[ 0] , [ 0 ] , and  [ 1 ]
[−1]   [3/5]        [4/5]
form a basis for V.
Theorem 4.3.3. An ๐‘› x ๐‘› matrix is nonsingular if and only if rank A = ๐‘›.
Example 5. Find the rank of
    [1 2  0]
๐ด = [1 1 −3] .
    [1 3  3]
Solution:
Transforming A to reduced row echelon form we obtain (verify)
[1 0 −6]
[0 1  3]
[0 0  0]
Since rank of A = 2 < 3 then A is singular. We know from our previous lesson that a matrix is
singular if and only if |๐ด| = 0. It can be verified that
|1 2  0|
|1 1 −3| = 0
|1 3  3|
This result is stated in the following corollary and can be used in determining the
rank of an ๐‘› x ๐‘› matrix.
Corollary 4.3.1. If A is an ๐‘› x ๐‘› matrix, then rank A = ๐‘› if and only if |A| ≠ 0.
The next corollary gives another method of testing whether the homogeneous
system ๐ด๐‘‹ = 0 has a trivial or nontrivial solution.
Corollary 4.3.2. The homogeneous system AX = 0 of ๐‘› linear equations in ๐‘› unknowns has a
nontrivial solution if and only if rank A < ๐‘›.
Example 6: The 3 x 3 matrix A of Example 5 is singular. Hence the homogeneous system
๐‘ฅ1 + 2๐‘ฅ2
= 0
๐‘ฅ1 + ๐‘ฅ2 − 3๐‘ฅ3 = 0
๐‘ฅ1 + 3๐‘ฅ2 + 3๐‘ฅ3 = 0
has a nontrivial solution.
Corollary 4.3.3. Let S = {๐‘ฟ๐Ÿ , ๐‘ฟ๐Ÿ , โ‹ฏ , ๐‘ฟ๐’ } be a set of vectors in โ„๐‘› and let A be the matrix
whose rows (columns) are the vectors in S. Then S is linearly independent if and only if
|๐ด| ≠ 0.
Example 7: The vectors (1, −2, 3), (2, 4, 7) and (0, −1, 5) are linearly independent because
|1 −2 3|
|2  4 7| = 41 ≠ 0
|0 −1 5|
By Theorem 4.2.5, (1, −2, 3), (2, 4, 7) and (0, −1, 5) form a basis for โ„3 .
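Corollary 4.3.3 gives a one-line numerical test; a sketch for Example 7:

```python
import numpy as np

# Rows are the vectors of Example 7
A = np.array([[1, -2, 3],
              [2, 4, 7],
              [0, -1, 5]], dtype=float)

d = np.linalg.det(A)
print(round(d))  # 41, nonzero, so the rows are linearly independent
```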
ACTIVITY
1. Find a basis for the row space and column space of the following matrices:
   [1 2 3]
a. [4 5 6]
   [7 8 9]
   [2  1 3 −2]
b. [2 −1 5  2]
   [1  1 1  1]
2. Let V = P1 and let A = { x + 1, x – 1, 2x + 3 }
a. Show that A spans P1.
b. Find a subset of A which is a basis for P1.
3. Consider the following subset of the vector space of all real-valued functions
S = { cos²๐‘ก, sin²๐‘ก, cos 2๐‘ก }
Find a basis for the subspace W = spanS. What is the dimension of W?
MODULE 5
LINEAR TRANSFORMATIONS AND MATRICES
Introduction
In this chapter we will discuss functions mapping one vector space to another
vector space, one-to-one and onto linear transformations, the kernel of a linear
transformation and how it can be used to determine if the linear transformation is one-to-one, the range of a linear transformation and how it can be used to determine whether the
linear transformation is onto, and the matrix of a linear transformation.
Objectives
After going through this chapter, you are expected to be able to do the following:
1. Define and explain linear transformation or linear mapping.
2. Determine whether a function mapping one vector space to another is a linear
transformation.
3. Differentiate between one-to-one linear transformation and onto linear transformation.
4. Define the kernel and range of a linear transformation.
5. Using kernel, distinguish between one-to-one linear transformation and onto linear
transformation.
6. Find the matrix of a linear transformation with respect to some bases.
5.1 Linear Transformations (Linear Mappings)
Definition 5.1.1: Let V and W be vector spaces. A linear transformation L of V into W is a
function assigning a unique vector L(X) in W to each X in V such that:
a. L ( X + Y ) = L (X) + L(Y), for every vector X and Y in V,
b. L ( ๐‘X ) = ๐‘L(X), for every vector X in V and every scalar ๐‘.
Example 1: Let L: โ„3 → โ„3 be defined by
L(x, y, z) = (x, y, 0)
Determine whether L is a linear transformation or not.
Solution: Let X = (๐‘ฅ1 , ๐‘ฆ1 , ๐‘ง1 ) and Y = (๐‘ฅ2 , ๐‘ฆ2 , ๐‘ง2 ) ๏ƒŽ โ„3 then
๐‘‹ + ๐‘Œ = (๐‘ฅ1 + ๐‘ฅ2 , ๐‘ฆ1 + ๐‘ฆ2 , ๐‘ง1 + ๐‘ง2 ).
a) We have to show that L(X + Y) = L(X) + L(Y)
L(X + Y) = L(๐‘ฅ1 + ๐‘ฅ2 , ๐‘ฆ1 + ๐‘ฆ2 , ๐‘ง1 + ๐‘ง2 )
= (๐‘ฅ1 + ๐‘ฅ2 , ๐‘ฆ1 + ๐‘ฆ2 , 0)
= (๐‘ฅ1 , ๐‘ฆ1 , 0) + ( ๐‘ฅ2 , ๐‘ฆ2 , 0)
= L(X) + L(Y)
Hence L(X + Y) = L(X) + L(Y)
b) Let ๐‘๐‘ฟ = (๐‘๐‘ฅ1 , ๐‘๐‘ฆ1 , ๐‘๐‘ง1 ). We have to show that L(๐‘X) = ๐‘L(X).
L(๐‘๐‘ฟ) = (๐‘๐‘ฅ1 , ๐‘๐‘ฆ1 , 0)
= ๐‘(๐‘ฅ1 , ๐‘ฆ1 , 0)
= ๐‘L(๐‘ฟ)
Hence L(cX) = cL(X). Since the two conditions are satisfied then L is a linear transformation.
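The two conditions can also be spot-checked numerically on sample vectors; passing such a check does not prove linearity, but any failure disproves it. A sketch for the map of Example 1:

```python
import numpy as np

def L(v):
    """The projection map of Example 1: (x, y, z) -> (x, y, 0)."""
    return np.array([v[0], v[1], 0.0])

rng = np.random.default_rng(0)
X, Y = rng.standard_normal(3), rng.standard_normal(3)
c = 2.5  # an arbitrary scalar

print(np.allclose(L(X + Y), L(X) + L(Y)))  # True: additivity holds
print(np.allclose(L(c * X), c * L(X)))     # True: homogeneity holds
```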
SAQ 5-1
Let L: โ„3
โ„3 be defined by
L(๐‘ฅ, ๐‘ฆ, ๐‘ง) = (๐‘ฅ − ๐‘ฆ, ๐‘ฅ 2 , 2๐‘ง)
Determine whether L is a linear transformation or not.
ASAQ 5-1
Let ๐‘ฟ = (๐‘ฅ1 , ๐‘ฆ1 , ๐‘ง1 ) and Y = (๐‘ฅ2 , ๐‘ฆ2 , ๐‘ง2 ) ๏ƒŽ โ„3 then ๐‘‹ + ๐‘Œ = (๐‘ฅ1 + ๐‘ฅ2 , ๐‘ฆ1 + ๐‘ฆ2 , ๐‘ง1 + ๐‘ง2 ).
L(๐‘ฟ + ๐’€) = ((๐‘ฅ1 + ๐‘ฅ2 ) − (๐‘ฆ1 + ๐‘ฆ2 ), (๐‘ฅ1 + ๐‘ฅ2 )2 , 2(๐‘ง1 + ๐‘ง2 ))
= (๐‘ฅ1 + ๐‘ฅ2 − ๐‘ฆ1 − ๐‘ฆ2 , ๐‘ฅ1 2 + 2๐‘ฅ1 ๐‘ฅ2 + ๐‘ฅ2 2 , 2๐‘ง1 + 2๐‘ง2 )
and L(X) + L(Y) = (๐‘ฅ1 − ๐‘ฆ1 , ๐‘ฅ1 2 , 2๐‘ง1 ) + (๐‘ฅ2 − ๐‘ฆ2 , ๐‘ฅ2 2 , 2๐‘ง2 )
= ((๐‘ฅ1 + ๐‘ฅ2 − ๐‘ฆ1 − ๐‘ฆ2 , ๐‘ฅ1 2 + ๐‘ฅ2 2 , 2๐‘ง1 + 2๐‘ง2 )
Since L(๐‘ฟ + ๐’€) ≠ L(๐‘ฟ) + L(๐’€) then L is not a linear transformation.
Theorem 5.1.1: If L: V→ W is a linear transformation, then
L(c1X1 + c2X2 + … + ckXk ) = c1L(X1) + c2L(X2) + … + ckL(Xk)
for any vectors X1, X2, … , Xk in V and any scalars c1, c2, … , ck.
Theorem 5.1.2: Let L: โ„๐‘› → โ„๐‘š be a linear transformation. Then
a. L(0) = 0
b. L(−๐‘ฟ) = − L(๐‘ฟ) for every ๐‘ฟ ∈ โ„๐‘›
c. L ( X – Y ) = L(X) – L(Y)
Proof of Theorem 5.1.2.a.
a.
L(0) = 0
L(0) = L (0 + 0)
L(0) = L(0) + L(0)
L(0) – L(0) = L(0) + L(0) – L(0)
0 = L(0)
Proof of Theorem 5.1.2.b.
b. L(−๐‘ฟ) = −L(๐‘ฟ)
L(−๐‘ฟ) = L(−1 โˆ™ ๐‘ฟ)
= −1L(๐‘ฟ)
= −L(๐‘ฟ)
Proof of Theorem 5.1.2.c.
c. L(๐‘ฟ − ๐’€) = L(๐‘ฟ + (−1)๐’€)
= L(๐‘ฟ) + L(−1 โˆ™ ๐’€)
= L(๐‘ฟ) − L(๐’€)
Example 2: Let L: โ„3 → โ„2 be defined by L(๐‘ฅ, ๐‘ฆ, ๐‘ง) = (๐‘ฅ, ๐‘ฆ) be a linear transformation
(verify). It can be verified that
L(0, 0, 0) = (0, 0)
That is, the zero vector in โ„3 is mapped to the zero vector in โ„2 .
Also, let X = (๐‘ฅ1 , ๐‘ฆ1 , ๐‘ง1 ) and Y = (๐‘ฅ2 , ๐‘ฆ2 , ๐‘ง2 ) be any vectors in โ„3 then
๐‘ฟ − ๐’€ = (๐‘ฅ1 − ๐‘ฅ2 , ๐‘ฆ1 − ๐‘ฆ2 , ๐‘ง1 − ๐‘ง2 ). Thus
L(X – Y) = (๐‘ฅ1 − ๐‘ฅ2 , ๐‘ฆ1 − ๐‘ฆ2 )
= (๐‘ฅ1 , ๐‘ฆ1 ) − (๐‘ฅ2 , ๐‘ฆ2 )
= L(X) – L(Y)
Example 3: Let T: โ„2 → โ„2 be the “translation mapping” defined by
T(๐‘ฅ, ๐‘ฆ) = (๐‘ฅ + 1, ๐‘ฆ + 2)
By Theorem 5.1.2 letter a, T is not a linear transformation since
T(0, 0) = (0 + 1, 0 + 2) = (1, 2) ≠ (0, 0)
(The zero vector is not mapped into the zero vector).
Theorem 5.1.3: Let L: V→ W be a linear transformation of an n-dimensional vector space V
into a vector space W. Also let S = { X1 , X2, … , Xn } be a basis for V. If X is any vector in V,
then L(X) is completely determined by {L(X1), L(X2), … , L(Xn) }.
Example 4: Let L: โ„3 → โ„3 be a linear transformation for which we know that
(2, −4) , L(0, 1, 0) = (3, −5) and L(0, 0, 1) = (2, 3).
a. What is L(1, −2, 3)?
b) What is L(๐‘Ž, ๐‘, ๐‘)?
Solution: The set {(1, 0, 0), (0, 1, 0), (0, 0, 1)} is a basis for โ„3 , and
(1, −2, 3) = 1(1, 0, 0) – 2(0, 1, 0) + 3(0, 0, 1)
By Theorem 5.1.3:
a)
L(1, −2, 3) = L(1(1, 0, 0) – 2(0, 1, 0) + 3(0, 0, 1))
= L(1, 0, 0) – 2L(0, 1, 0) + 3L(0, 0, 1)
= (2, −4) – 2(3, −5) + 3(2, 3)
= (2, 15)
b) We see from part (a) that for any (๐‘Ž, ๐‘, ๐‘) ∈ โ„3 , we have
L(๐‘Ž, ๐‘, ๐‘) = ๐‘ŽL(1, 0, 0) + ๐‘L(0, 1, 0) + ๐‘L(0, 0, 1)
= ๐‘Ž(2, −4) + ๐‘(3, −5) + ๐‘(2, 3)
= (2๐‘Ž + 3๐‘ + 2๐‘, −4๐‘Ž −5๐‘ + 3๐‘)
SAQ 5-2
Let L: P2 → P3 be a linear transformation for which we know
that L(1) = 1, L(t) = t² and L(t²) = t³ + t. Find
(a) L(2t² − 5t + 3)
(b) L(at² + bt + c)
ASAQ 5-2
The set {๐‘ก 2 , ๐‘ก, 1} is a basis for ๐‘ƒ2 . By Theorem 5.1.3
a. L(2๐‘ก 2 − 5๐‘ก + 3) = L2(๐‘ก 2 ) + L(−5)(๐‘ก) + L3(1)
= 2L(๐‘ก 2 ) − 5L(๐‘ก) + 3L(1)
= 2(๐‘ก 3 + ๐‘ก) − 5(๐‘ก 2 ) + 3(1)
= 2๐‘ก 3 + 2๐‘ก − 5๐‘ก 2 + 3
= 2๐‘ก 3 − 5๐‘ก 2 + 2๐‘ก + 3
b. L(๐‘Ž๐‘ก 2 + ๐‘๐‘ก + ๐‘) = ๐‘ŽL(๐‘ก 2 ) + ๐‘L(๐‘ก) + cL(1)
= ๐‘Ž(๐‘ก 3 + ๐‘ก) + ๐‘(๐‘ก 2 ) + ๐‘
= ๐‘Ž๐‘ก 3 + ๐‘Ž๐‘ก + ๐‘๐‘ก 2 + ๐‘
= ๐‘Ž๐‘ก 3 + ๐‘๐‘ก 2 + ๐‘Ž๐‘ก + ๐‘
5.2 The Kernel and Range of a Linear Transformation
Definition 5.2.1: A linear transformation L: V→ W is said to be one-to-one if for all X1, X2 in
V, X1 ≠ X2 implies that L(X1) ≠ L(X2). An equivalent statement is that L is one-to-one if for all
X1, X2 in V, L(X1) = L(X2) implies that X1 = X2.
Example 1. Let L: โ„2 → โ„2 be a linear transformation defined by
L( ๐‘ฅ, ๐‘ฆ ) = ( ๐‘ฅ + ๐‘ฆ, ๐‘ฅ )
Determine if L is one–to–one or not.
Solution:
Let X1 = (๐‘ฅ1 , ๐‘ฆ1 ) and X2 = (๐‘ฅ2 , ๐‘ฆ2 ) be vectors in โ„2 . We have to show that if L(X1) = L(X2) then
X1 = X2.
L(X1) = L(X2)
(๐‘ฅ1 + ๐‘ฆ1 , ๐‘ฅ1 ) = (๐‘ฅ2 + ๐‘ฆ2 , ๐‘ฅ2 )
Equating the corresponding parts we have
๐‘ฅ1 + ๐‘ฆ1 = ๐‘ฅ2 + ๐‘ฆ2
๐‘ฅ1 = ๐‘ฅ2
If we subtract the second equation from the first, we get ๐‘ฆ1 = ๐‘ฆ2 which implies that ๐‘ฟ๐Ÿ =
๐‘ฟ๐Ÿ . Thus L is one-to-one.
Example 2. Let L: โ„3 → โ„3 be a linear transformation defined by
L(x, y, z) = (x, y, 0)
Determine if L is one-to-one or not.
Solution:
Let X1 = (4, 5, −3) and X2 = (4, 5, 2) be vectors in โ„3 . We see that X1 ≠ X2 but
L(4, 5, −3) = L(4, 5, 2) = (4, 5, 0). Hence L is not one-to-one.
Definition 5.2.2: Let L: V→ W be a linear transformation. The kernel of L, kerL, is the subset
of V consisting of all vectors X such that L(X) = Ow.
Note that kerL is not empty since by Theorem 5.1.2 we know that if L: โ„๐‘› → โ„๐‘š is a linear
transformation then the zero vector in โ„๐‘› is mapped to the zero vector in โ„๐‘š . Thus ๐‘‚ ∈
kerL.
Example 3. Let L: โ„4 → โ„3 be a linear transformation defined by
L(๐‘ฅ, ๐‘ฆ, ๐‘ง, ๐‘ค) = (๐‘ฅ + ๐‘ฆ, ๐‘ง + ๐‘ค, ๐‘ฅ + ๐‘ง)
The vector (1, −1, −1, 1) is in kerL since L(1, −1, −1, 1) = (0, 0, 0) while the vector
(1, 2, 3, −4) is not in kerL because L(1, 2, 3, −4) = ( 3, 1, 5 ) ≠ (0, 0, 0). Thus all vectors X in
โ„4 such that L(X) = (0, 0, 0) are in kerL.
Example 4. Let L be the linear transformation of Example 3. The kerL consists of all vectors
X in โ„4 such that L(X) = 0. If we let X = (๐‘ฅ, ๐‘ฆ, ๐‘ง, ๐‘ค) then
L(๐‘ฅ, ๐‘ฆ, ๐‘ง, ๐‘ค) = (0, 0, 0)
(๐‘ฅ + ๐‘ฆ, ๐‘ง + ๐‘ค, ๐‘ฅ + ๐‘ง) = (0, 0, 0)
Equating the corresponding parts we obtain the homogeneous system
๐‘ฅ+๐‘ฆ
=0
๐‘ง + ๐‘ค=0
๐‘ฅ +๐‘ง
=0
The augmented matrix in reduced echelon form (verify) is
[1 0 0 −1 0]
[0 1 0  1 0]
[0 0 1  1 0]
This implies that ๐‘ค is a free variable and can take on any value. Hence
๐‘ฅ=๐‘Ÿ
๐‘ฆ = −๐‘Ÿ
๐‘ง = −๐‘Ÿ
๐‘ค = ๐‘Ÿ where ๐‘Ÿ ∈ โ„
Thus kerL consists of all vectors of the form (๐‘Ÿ, −๐‘Ÿ, −๐‘Ÿ, ๐‘Ÿ) where ๐‘Ÿ is any real number. That
is
ker L = { (๐‘Ÿ, −๐‘Ÿ, −๐‘Ÿ, ๐‘Ÿ): ๐‘Ÿ ∈ โ„}
= { ๐‘Ÿ(1, −1, −1, 1): ๐‘Ÿ ∈ โ„}
Hence ker L = span { (1, −1, −1, 1)}
Since (1, −1, −1, 1) is linearly independent then it forms a basis for kerL. Thus
dim(kerL) = 1.
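The kernel computed by hand above can also be extracted numerically. This sketch (an illustration, not the module's method) uses the SVD of the standard matrix of L(๐‘ฅ, ๐‘ฆ, ๐‘ง, ๐‘ค) = (๐‘ฅ + ๐‘ฆ, ๐‘ง + ๐‘ค, ๐‘ฅ + ๐‘ง): the rows of Vᵀ beyond the rank span the null space.

```python
import numpy as np

# Standard matrix of L(x, y, z, w) = (x + y, z + w, x + z).
A = np.array([[1, 1, 0, 0],
              [0, 0, 1, 1],
              [1, 0, 1, 0]], dtype=float)

_, s, Vt = np.linalg.svd(A)
rank = np.sum(s > 1e-10)
null_basis = Vt[rank:]          # rows spanning ker L

print(null_basis.shape[0])      # dim(ker L) -> 1
```

Rescaling the single basis row recovers the direction (1, −1, −1, 1) found by hand.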
SAQ 5-3
Let L: โ„4 → โ„2 be defined by
L(๐‘ฅ, ๐‘ฆ, ๐‘ง, ๐‘ค) = (๐‘ฅ − ๐‘ฆ, ๐‘ง − ๐‘ค)
Find kerL.
ASAQ 5-3
ker L = { (๐‘ฅ, ๐‘ฆ, ๐‘ง, ๐‘ค) : (๐‘ฅ − ๐‘ฆ, ๐‘ง − ๐‘ค) = (0, 0) }
Equating the corresponding parts we get:
๐‘ฅ − ๐‘ฆ = 0 which implies that ๐‘ฅ = ๐‘ฆ
and
๐‘ง − ๐‘ค = 0 which implies that ๐‘ง = ๐‘ค
If we let ๐‘ฆ = ๐‘Ÿ ๏ƒŽ โ„ and ๐‘ค = ๐‘  ๏ƒŽ โ„ then
kerL = { (๐‘Ÿ, ๐‘Ÿ, ๐‘ , ๐‘ ) : ๐‘Ÿ and ๐‘  ๏ƒŽ โ„ }
= { ๐‘Ÿ (1, 1, 0, 0) + ๐‘ (0, 0, 1, 1) }
Hence kerL = span {(1, 1, 0, 0), (0, 0, 1, 1)}
Since {(1, 1, 0, 0), (0, 0, 1, 1)} is linearly independent (one vector is not a scalar multiple of
the other) then it forms a basis for kerL. Hence dim(kerL) = 2.
Theorem 5.2.1: If L: V→ W is a linear transformation, then kerL is a subspace of V.
Definition 5.2.3: If L: V→ W is a linear transformation, then the range of L, denoted by range
L, is the set of all vectors in W that are images, under L, of vectors in V. Thus a vector Y is in
range L if we can find some vector X in V such that L(X) = Y. If range L = W, we say that L is
onto.
Example 5: Let L: โ„3 → โ„2 be a linear transformation defined by
L(๐‘ฅ, ๐‘ฆ, ๐‘ง) = (๐‘ฅ + ๐‘ฆ, ๐‘ฆ − ๐‘ง)
The range or image of L (ImL) is:
ImL = { (๐‘ฅ + ๐‘ฆ, ๐‘ฆ − ๐‘ง) : ๐‘ฅ, ๐‘ฆ, ๐‘ง ∈ โ„ }
= { ๐‘ฅ(1, 0) + ๐‘ฆ(1, 1) + ๐‘ง(0, −1) : ๐‘ฅ, ๐‘ฆ, ๐‘ง ∈ โ„ }
Thus ImL = span {(1, 0), (1, 1), (0, −1)}. By Theorem 4.1.1 {(1, 0), (1, 1), (0, −1)} is linearly
dependent. Since (0, −1) is a linear combination of the other vectors such as
(0, −1) = 1 (1, 0) + (−1)(1, 1)
then we can delete (0, −1) from the set and the remaining vectors still span ImL. The set
{(1, 0), (1, 1)} is linearly independent therefore forms a basis for ImL. Hence dim(ImL) = 2.
Theorem 5.2.2: Let L: V→ W be a linear transformation
(i) L is a monomorphism (one-to-one) if and only if dim(kerL) = 0.
(ii) L is an epimorphism (onto) if and only if dim(ImL) = dim W.
Example 6. In ASAQ 5-3, since dim(kerL) = 2 then L is not a monomorphism (one-to-one)
and in Example 5, since dim(ImL) = 2 = dimโ„2 then L is an epimorphism (onto).
Theorem 5.2.3: If L: V→ W is a linear transformation, then range L is a subspace of W.
Theorem 5.2.4: If L: V→ W is a linear transformation, then
dim(kerL) + dim(range L) = dim V
Example 7: Verify Theorem 5.2.4 using the linear transformation of Example 5.
Solution:
kerL = { (๐‘ฅ, ๐‘ฆ, ๐‘ง) : (๐‘ฅ + ๐‘ฆ, ๐‘ฆ − ๐‘ง) = (0, 0) }
Equating the corresponding parts we get:
๐‘ฅ + ๐‘ฆ = 0 which implies that ๐‘ฅ = −๐‘ฆ
and
๐‘ฆ − ๐‘ง = 0 which implies that ๐‘ง = ๐‘ฆ ;
If we let ๐‘ฆ = ๐‘Ÿ, ๐‘Ÿ ๏ƒŽ โ„, then
kerL = { (−๐‘Ÿ, ๐‘Ÿ, ๐‘Ÿ) : ๐‘Ÿ ๏ƒŽ โ„ }
= { ๐‘Ÿ(−1, 1, 1) : ๐‘Ÿ ๏ƒŽ โ„ }
Thus kerL = span { (−1, 1, 1) } which is linearly independent hence forms a basis for kerL.
Thus dim(kerL) = 1.
From the previous example, we know that dim(ImL) = 2. Therefore
dim(kerL) + dim(ImL) = dim โ„3
1 + 2 = 3
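The rank-nullity relation of Theorem 5.2.4 can be verified numerically. In this sketch (an illustration, not the module's procedure), the rank of the standard matrix of L(๐‘ฅ, ๐‘ฆ, ๐‘ง) = (๐‘ฅ + ๐‘ฆ, ๐‘ฆ − ๐‘ง) gives dim(ImL), and the nullity follows by subtraction.

```python
import numpy as np

# Standard matrix of L(x, y, z) = (x + y, y - z).
A = np.array([[1, 1, 0],
              [0, 1, -1]], dtype=float)

rank = np.linalg.matrix_rank(A)     # dim(Im L)
nullity = A.shape[1] - rank         # dim(ker L) = dim V - rank

print(rank, nullity)                # -> 2 1
```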
Remark: Let L: V → W be a linear transformation. Then the rank of L is defined to be the
dimension of its image, and the nullity of L is defined to be the dimension of its kernel.
rank (L) = dim(ImL) and nullity (L) = dim(kerL)
Thus the preceding theorem yields the following formula for L when V has a finite
dimension:
rank(L) + nullity(L) = dim V.
Corollary 5.2.1: Let L: V→ W be a linear transformation and dim V = dim W,
a. If L is one-to-one, then it is onto.
b. If L is onto, then it is one-to-one.
SAQ 5-4
Let L: โ„3 →โ„4 be defined by
L(๐‘ฅ, ๐‘ฆ, ๐‘ง) = (๐‘ฅ + ๐‘ง, ๐‘ฆ − ๐‘ฅ, ๐‘ฆ + ๐‘ง, ๐‘ฅ + ๐‘ฆ + 2๐‘ง)
Verify Theorem 5.2.4.
ASAQ 5-4
L(๐‘ฅ, ๐‘ฆ, ๐‘ง) = (0, 0, 0, 0)
(๐‘ฅ + ๐‘ง, ๐‘ฆ − ๐‘ฅ, ๐‘ฆ + ๐‘ง, ๐‘ฅ + ๐‘ฆ + 2๐‘ง) = (0, 0, 0, 0)
Equating the corresponding parts we obtain the homogeneous system
๐‘ฅ + ๐‘ง = 0
−๐‘ฅ + ๐‘ฆ = 0
๐‘ฆ + ๐‘ง = 0
๐‘ฅ + ๐‘ฆ + 2๐‘ง = 0
The augmented matrix in reduced row echelon form (verify) is
[1 0 1 0]
[0 1 1 0]
[0 0 0 0]
[0 0 0 0]
Thus ๐‘ฅ = −๐‘ง, ๐‘ฆ = −๐‘ง, where ๐‘ง can be any real number. If we let ๐‘ง = ๐‘Ÿ ∈ โ„ then
kerL = {(−๐‘Ÿ, −๐‘Ÿ, ๐‘Ÿ): ๐‘Ÿ ∈ โ„}
= {๐‘Ÿ(−1, −1, 1): ๐‘Ÿ ∈ โ„}
Hence kerL = span {(−1, −1, 1)}. Since (−1, −1, 1) is linearly independent then it is a basis
for kerL. Thus
nullity of L = dim(kerL) = 1.
Solving for the image of L, we have
Im(L) = {(๐‘ฅ + ๐‘ง, ๐‘ฆ − ๐‘ฅ, ๐‘ฆ + ๐‘ง, ๐‘ฅ + ๐‘ฆ + 2๐‘ง)}
= {๐‘ฅ(1, −1, 0, 1) + ๐‘ฆ(0, 1, 1, 1) + ๐‘ง(1, 0, 1, 2): ๐‘ฅ, ๐‘ฆ, ๐‘ง ∈ โ„}
Since (1, 0, 1, 2) = (1, −1, 0, 1) + (0, 1, 1, 1) then (1, 0, 1, 2) is a linear combination of the
preceding vectors and may be deleted from the spanning set. Thus
Im(L) = span {(1, −1, 0, 1), (0, 1, 1, 1)}
Since {(1, −1, 0, 1), (0, 1, 1, 1)} is linearly independent (one vector is not a scalar multiple of
the other) then it is a basis for Im(L). Therefore
Rank of L = dim(ImL) = 2
Now it can be verified that
Nullity of L + rank of L = dim(โ„3 )
1 + 2 = 3
ACTIVITY
1. Which of the following are linear transformations?
(a) L(x, y) = (x² + x, y − y²)
(b) L(x, y) = (x − y, 0, 2x + 3)
(c) L(x, y, z) = (2x − 3y, 3y − 2z, 2z)
2. Let L: โ„3 → โ„4 be defined by
L(x, y, z) = (x + y + z, x + 2y – 3z, 2x + 3y – 2z, 3x + 4y – z)
(a) Find a basis for and the dim(kerL),
(b) Find a basis for and the dim(ImL),
(c) Verify Theorem 5.2.4.
3. Let L: โ„3 → โ„3 be defined by
L(x, y z) = (x + z, x + y + 2z, 2x + y + 3z )
(a) Is L one-to-one?
(b) Is L onto?
4. Let L: P2 → P2 be the linear transformation defined by
L(at² + bt + c) = (a + c)t² + (b + c)t.
(a) Is t² − t − 1 in kerL?
(b) Is t² + t − 1 in kerL?
(c) Is 2t² − t in range L?
(d) Find a basis for kerL.
(e) Find a basis for ImL.
5.3 The Matrix of a Linear Transformation
Coordinate Vectors
Definition 5.3.1: Let V be an n-dimensional vector space with basis S = { X1 , X2, … , Xn }. If
X = a1X1 + a2X2 + … + anXn
is any vector in V, then the vector
[X]S = [a1]
       [a2]
       [⋮ ]
       [an]
in โ„๐‘› is called the coordinate vector of X with respect to the basis S. The components of
[ X ] S are called the coordinates of X with respect to S.
Example 1. Let S = { (1, −1), (2, 3) } be a basis for โ„2 . Find the coordinate vectors of the
following vectors with respect to S.
(a) (−3, −7)
(b) (12, 13)
Solution:
(a) Let X = (−3, −7). To find [X]S we must find c1 and c2 such that
(−3, −7) = c1(1, −1) + c2(2, 3)
Multiplying, adding, and equating the corresponding parts, we get
c1 + 2c2 = −3
−c1 + 3c2 = −7
The solution is c1 = 1 and c2 = −2. Thus the coordinate vector of X with respect to the
basis S is
[X]S = (1, −2)
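Finding a coordinate vector amounts to solving a linear system whose coefficient matrix has the basis vectors as columns. The following numpy sketch (an illustration, not the module's method) reproduces part (a):

```python
import numpy as np

# Basis vectors (1, -1) and (2, 3) of S as the columns of a matrix.
S = np.array([[1, 2],
              [-1, 3]], dtype=float)
X = np.array([-3, -7], dtype=float)

coords = np.linalg.solve(S, X)   # c1, c2 with X = c1*(1,-1) + c2*(2,3)
print(coords)                    # -> [ 1. -2.]
```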
Letter (b) is left as an exercise.
Example 2. Let S = { E1 = (1, 0), E2 = (0, 1) } be the natural basis for โ„2 . Then
(−3, −7) = −3(1, 0) + (−7)(0, 1)
Hence [X]S = (−3, −7). Note that [X]S is the original vector itself when the natural basis is
used. This result is true in general.
Theorem 5.3.1: Let L: V→ W be a linear transformation of an ๐‘›-dimensional vector space V
into an ๐‘š-dimensional vector space W ( ๐‘› ≠ 0 and ๐‘š ≠ 0 ) and let S = { X1 , X2, … , Xn} and T =
{Y1, Y2, …, Ym } be bases for V and W, respectively. Then the ๐‘š x ๐‘› matrix A, whose jth
column is the coordinate vector [L(Xj)]T of L(Xj) with respect to T, is associated with L and
has the following property:
If Y = L(X) for some X in V, then [Y]T = A[X]S ,
where [X]S and [Y]T are the coordinate vectors of X and Y with respect to the respective
bases S and T. Moreover, A is the only matrix with this property.
Procedure for computing the matrix of a linear transformation L: V→ W with respect to the
bases S = { X1 , X2, … , Xn} and T = { Y1, Y2, …, Ym } for V and W, respectively:
STEP 1: Compute L(Xj) for j = 1, 2, … , ๐‘›.
STEP 2: Find the coordinate vector [L(Xj)]T of L(Xj) with respect to the basis T. This means
that we have to express L(Xj) as a linear combination of the vectors in T.
STEP 3: The matrix A of L with respect to S and T is formed by choosing [L(Xj)]T as the jth
column of A.
Definition 5.3.2: The matrix of Theorem 5.3.1 is called the matrix of L with respect to the
bases S and T. The equation [L(X)]T = A[X]S is called the representation of L with respect to S
and T. We also say that the said equation represents L with respect to S and T.
Example 3: Let L: โ„2 → โ„2 be defined by
L(๐‘ฅ, ๐‘ฆ) = (๐‘ฅ + 2๐‘ฆ, 2๐‘ฅ − ๐‘ฆ)
Let S = { X1 = (1, 0), X2 = (0, 1) } be the natural basis for โ„2 and let
T = { Y1 = (−1, 2), Y2 = (2, 0) } be another basis for โ„2 . Find the matrix of L with respect to S and
T. Compute L(1, 2) using the matrix of L.
Solution:
L(X1) = L(1, 0) = (1, 2) = a1(−1, 2) + a2(2, 0), so
1 = −a1 + 2a2
2 = 2a1
The solution is a1 = 1 and a2 = 1, so [L(X1)]T = (1, 1).
L(X2) = L(0, 1) = (2, −1) = a1(−1, 2) + a2(2, 0), so
2 = −a1 + 2a2
−1 = 2a1
The solution is a1 = −1/2 and a2 = 3/4, so [L(X2)]T = (−1/2, 3/4).
Thus the matrix A of L with respect to S and T is
A = [1 −1/2]
    [1  3/4]
To compute [L(1, 2)]T using A, we first compute [(1, 2)]S . Since S is the natural basis for โ„2 then
[(1, 2)]S = (1, 2)
So,
[L(1, 2)]T = A [(1, 2)]S = (1(1) + (−1/2)(2), 1(1) + (3/4)(2)) = (0, 5/2)
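The three-step procedure above translates directly into code. This numpy sketch (an illustration, not the module's notation) applies L to each S-basis vector, solves for its T-coordinates, and collects the results as columns:

```python
import numpy as np

def L(v):
    x, y = v
    return np.array([x + 2 * y, 2 * x - y], dtype=float)

S = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]       # natural basis
T = np.column_stack([[-1, 2], [2, 0]]).astype(float)   # T-basis vectors as columns

# STEP 1-3: solve T a = L(Xj) for each j and stack the coordinate vectors.
A = np.column_stack([np.linalg.solve(T, L(Xj)) for Xj in S])
print(A)                           # columns are [L(X1)]_T and [L(X2)]_T

# [L(1,2)]_T = A [1,2]^T since S is the natural basis; agrees with (0, 5/2)
print(A @ np.array([1.0, 2.0]))
```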
ACTIVITY
1. Let S = { (1, −1, 2), (0, 2, 1), (1, 0, 0) } be a basis for โ„3 . Find the coordinate vector of
each of the following vectors with respect to S.
(a) (1, 4, 2)
(b) (3, −4, 3)
(c) (1, −2, 4)
2. Let L: โ„2 → โ„3 be defined by L(๐‘ฅ, ๐‘ฆ) = (๐‘ฅ − 2๐‘ฆ, 2๐‘ฅ + ๐‘ฆ, ๐‘ฅ + ๐‘ฆ). Let S and T be the natural
bases for โ„2 and โ„3 , respectively. Also, let S’ = { (1, −1), (0, 1) } and
T’ = { (1, 1, 0), (0, 1, 1), (1, −1, 1) } be bases for โ„2 and โ„3 , respectively. Find the matrix of L
with respect to
(a) S and T
(b) S’ and T’
3. Let L: P1 → P3 be defined by L(p(t)) = t²p(t). Let S = { t, t + 1 } and
T = { t³, t² − 1, t, t + 1 } be bases for P1 and P3, respectively. Find the matrix of L
with respect to S and T. Compute [ L(−3t + 3) ]T using the matrix of L.
MODULE 6
EIGENVALUES AND EIGENVECTORS
Introduction
In this section we will discuss the concepts of eigenvalues, eigenvectors, and
eigenspaces, algebraic and geometric multiplicity of an eigenvalue, Hamilton-Cayley
theorem, similar matrices and diagonalizable matrices.
Objectives
At the end of this chapter, you are expected to be able to do the following:
1. Define eigenvalues and eigenvectors.
2. Compute for the eigenvalues and the associated eigenvectors of a matrix representing a
linear transformation.
3. Determine the algebraic and geometric multiplicities of the eigenvalues.
3. Identify the properties of a diagonalizable matrix.
4. Determine the matrix of transition P so that ๐‘ƒ −1 ๐ด๐‘ƒ is a diagonal matrix, given a
matrix A representing a linear transformation of a vector space into itself.
6.1 Characteristic Polynomial
Definition 6.1.1: If A ๏€ฝ [ a ij ] is an ๐‘› x ๐‘› matrix, the polynomial matrix
๐‘ฅ๐ผ − ๐ด = ๐ถ
is called the characteristic matrix of ๐ด. The determinant of ๐ถ is called the characteristic
polynomial of ๐ด. The equation det๐ถ = 0 is called the characteristic equation of ๐ด.
Example 1: Let
๐ด = [1 −3]
    [4 −2]
Then
๐‘ฅ๐ผ2 − ๐ด = ๐ถ = [๐‘ฅ 0] − [1 −3] = [๐‘ฅ − 1    3  ]
              [0 ๐‘ฅ]   [4 −2]   [ −4   ๐‘ฅ + 2]
is called the characteristic matrix of ๐ด.
The determinant of ๐ถ
๐‘ฅ−1
|
−4
3
| = (๐‘ฅ − 1)(๐‘ฅ + 2) − (−12)
๐‘ฅ+2
= ๐‘ฅ 2 + ๐‘ฅ + 10
is called the characteristic polynomial of ๐ด and the equation
๐‘ฅ 2 + ๐‘ฅ + 10 = 0
is called the characteristic equation of ๐ด.
6.2 Hamilton-Cayley Theorem
Let ๐‘“(๐‘ฅ) = ๐‘Ž0 ๐‘ฅ ๐‘› + ๐‘Ž1 ๐‘ฅ ๐‘›−1 + … + ๐‘Ž๐‘›−1 ๐‘ฅ + ๐‘Ž๐‘› be a polynomial in ๐‘ฅ with real
coefficients. If ๐ด is an ๐‘› x ๐‘› matrix then ๐‘“(๐ด) is equal to the matrix
๐‘Ž0 ๐ด๐‘› + ๐‘Ž1 ๐ด๐‘›−1 + … + ๐‘Ž๐‘›−1 ๐ด + ๐‘Ž๐‘› ๐ผ๐‘›
Note that we replace the constant term by ๐‘Ž๐‘› ๐ผ๐‘› so that each term of ๐‘“(๐ด) is a matrix.
Theorem 6.2.1. (Hamilton-Cayley Theorem) If ๐ด is an ๐‘› x ๐‘› matrix and ๐‘“(๐‘ฅ) is its
characteristic polynomial, then ๐‘“(๐ด) = 0.
Example 1: Let
๐ด = [ 2 3]
    [−1 4]
The characteristic matrix of ๐ด is
๐ถ = [๐‘ฅ − 2   −3 ]
    [  1   ๐‘ฅ − 4]
Next we compute for the characteristic polynomial of ๐ด:
๐‘“(๐‘ฅ) = det๐ถ = (๐‘ฅ − 2)(๐‘ฅ − 4) − (−3) = ๐‘ฅ² − 6๐‘ฅ + 11
It can be verified that the constant term is equal to the determinant of ๐ด.
Computing for ๐‘“(๐ด) we have
๐‘“(๐ด) = ๐ด2 − 6๐ด + 11๐ผ2
2 3 2
=[
][
−1 4 −1
1
3
2 3
] − 6[
] + 11 [
0
4
−1 4
1 18
12 18
11
= [
]−[
]+[
−6 13
−6 24
0
0
= [
0
0
]
1
0
]
11
0
]
0
Hence ๐ด satisfies its characteristic equation, that is ๐‘“(๐ด) = 0. Since det๐ด = 11 ≠ 0 then ๐ด
is nonsingular so ๐ด−1 exists. By the Hamilton-Cayley Theorem, we form the equation
๐ด2 − 6๐ด + 11๐ผ2 = 0
11๐ผ2 = 6๐ด − ๐ด2
๐ผ๐Ÿ =
๐ผ๐Ÿ =
1
11
(6๐ด − ๐ด2 )
1
11
(๐ด)(6๐ผ − ๐ด)
Therefore,
1
๐ด−1 = 11 (6๐ผ − ๐ด)
4 −3
]
1 2
4/11 −3/11
= [
]
1/11 2/11
1
= 11 [
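The Hamilton-Cayley computation above can be confirmed numerically. This numpy sketch (an illustration, not part of the module) checks both that ๐‘“(๐ด) = 0 and that the resulting inverse formula is correct:

```python
import numpy as np

A = np.array([[2, 3],
              [-1, 4]], dtype=float)
I = np.eye(2)

f_of_A = A @ A - 6 * A + 11 * I   # f(A) for f(x) = x^2 - 6x + 11
A_inv = (6 * I - A) / 11          # A^{-1} from the Hamilton-Cayley rearrangement

print(f_of_A)      # -> the zero matrix
print(A @ A_inv)   # -> the identity matrix
```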
SAQ 6-1
Find the characteristic polynomial for the matrix
๐ด = [2 −2  3]
    [1  1  1]
    [1  3 −1]
and
a) Show by direct substitution that this matrix
satisfies its characteristic equation.
b) Find ๐ด−1 .
ASAQ 6-1
The characteristic matrix of ๐ด is
[๐‘ฅ − 2    2     −3  ]
[ −1   ๐‘ฅ − 1   −1  ]
[ −1    −3   ๐‘ฅ + 1]
and its characteristic polynomial is
๐‘“(๐‘ฅ) = (๐‘ฅ − 2)(๐‘ฅ − 1)(๐‘ฅ + 1) − 7 − [3(๐‘ฅ − 1) − 2(๐‘ฅ + 1) + 3(๐‘ฅ − 2)]
= ๐‘ฅ 3 − 2๐‘ฅ 2 − 5๐‘ฅ + 6
Next we form the equation
๐‘“(๐ด) = ๐ด³ − 2๐ด² − 5๐ด + 6๐ผ3 = 0      (1)
where
๐ด² = ๐ด โˆ™ ๐ด = [5  3 1]
            [4  2 3]
            [4 −2 7]
and
๐ด³ = ๐ด² โˆ™ ๐ด = [14 −4 17]
             [13  3 11]
             [13 11  3]
Substituting these to (1), we have
๐‘“(๐ด) = [14 −4 17] − 2[5  3 1] − 5[2 −2  3] + [6 0 0]
       [13  3 11]    [4  2 3]    [1  1  1]   [0 6 0]
       [13 11  3]    [4 −2 7]    [1  3 −1]   [0 0 6]
     = [0 0 0]
       [0 0 0]
       [0 0 0]
Thus ๐ด satisfies its characteristic equation.
b. To solve for ๐ด−1 , we form the equation
๐ด3 − 2๐ด2 − 5๐ด + 6๐ผ3 = 0
6๐ผ3 = −๐ด3 + 2๐ด2 + 5๐ด
๐ผ3 = (1/6)(−๐ด³ + 2๐ด² + 5๐ด)
๐ผ3 = (1/6)(๐ด)(−๐ด² + 2๐ด + 5๐ผ3 )
Therefore
๐ด−1 = (1/6)(−๐ด² + 2๐ด + 5๐ผ3 )
    = (1/6)[[−5 −3 −1] + 2[2 −2  3] + [5 0 0]]
           [−4 −2 −3]    [1  1  1]   [0 5 0]
           [−4  2 −7]    [1  3 −1]   [0 0 5]
    = (1/6)[ 4 −7  5]
           [−2  5 −1]
           [−2  8 −4]
    = [ 2/3 −7/6  5/6]
      [−1/3  5/6 −1/6]
      [−1/3  4/3 −2/3]
6.3 Eigenvalues, Eigenvectors, and Eigenspaces
Definition 6.3.1: Let ๐ด be an ๐‘› x ๐‘› matrix. The real number ๐œ† is called an eigenvalue of ๐ด if
there exists a nonzero vector ๐‘ฟ in โ„๐‘› such that
๐ด๐‘ฟ = ๐œ†๐‘ฟ
(1)
Every nonzero vector ๐‘ฟ satisfying (1) is called an eigenvector of ๐ด associated with the
eigenvalue ๐œ†.
Example 1: Let ๐ด = [
[
2 −1
]. Since
−2 3
2 −1 1
1
][ ] = 4[ ]
−2 3 −2
−2
then 4 is an eigenvalue of ๐ด and [
4.
1
] is the eigenvector associated with the eigenvalue ๐œ† =
−2
Theorem 6.3.1: The eigenvalues of ๐ด are the real roots of the characteristic polynomial of
A.
Example 2. Let
๐ด = [ 1 0  0]
    [−1 3  0]
    [ 3 2 −2]
The characteristic polynomial of ๐ด is
|๐‘ฅ − 1    0     0  |
|  1   ๐‘ฅ − 3    0  | = (๐‘ฅ − 1)(๐‘ฅ − 3)(๐‘ฅ + 2)
| −3    −2   ๐‘ฅ + 2|
By Theorem 6.3.1, the eigenvalues of ๐ด are the roots of the characteristic equation
(๐‘ฅ − 1)(๐‘ฅ − 3)(๐‘ฅ + 2) = 0
Thus ๐œ†1 = 1, ๐œ†2 = 3, and ๐œ†3 = −2 are the eigenvalues of ๐ด.
To find an eigenvector associated with ๐œ†1 = 1, we form the system (1๐ผ3 − ๐ด)๐‘‹ = 0:
[ 0  0 0][๐‘ฅ1]   [0]
[ 1 −2 0][๐‘ฅ2] = [0]
[−3 −2 3][๐‘ฅ3]   [0]
A solution to this system is {(3/4 ๐‘Ÿ, 3/8 ๐‘Ÿ, ๐‘Ÿ): ๐‘Ÿ ∈ โ„}. Thus if ๐‘Ÿ = 8 then ๐‘‹1 = (6, 3, 8) is an
eigenvector associated with ๐œ†1 = 1.
To find an eigenvector associated with ๐œ†2 = 3, we form the system (3๐ผ3 − ๐ด)๐‘‹ = 0:
[ 2  0 0][๐‘ฅ1]   [0]
[ 1  0 0][๐‘ฅ2] = [0]
[−3 −2 5][๐‘ฅ3]   [0]
A solution to this system is {(0, 5/2 ๐‘Ÿ, ๐‘Ÿ): ๐‘Ÿ ∈ โ„}. Thus if ๐‘Ÿ = 2 then ๐‘‹2 = (0, 5, 2) is an
eigenvector associated with ๐œ†2 = 3.
To find an eigenvector associated with ๐œ†3 = −2, we form the system (−2๐ผ3 − ๐ด)๐‘‹ = 0:
[−3  0 0][๐‘ฅ1]   [0]
[ 1 −5 0][๐‘ฅ2] = [0]
[−3 −2 0][๐‘ฅ3]   [0]
A solution to this system is {(0, 0, ๐‘Ÿ): ๐‘Ÿ ∈ โ„}. Thus ๐‘‹3 = (0, 0, 1) is an eigenvector associated
with ๐œ†3 = −2.
Example 3: Let
๐ด = [ 2 1]
    [−1 3]
The characteristic polynomial of ๐ด is
|๐‘ฅ − 2   −1 |
|  1   ๐‘ฅ − 3| = (๐‘ฅ − 2)(๐‘ฅ − 3) − (−1) = ๐‘ฅ² − 5๐‘ฅ + 7
Since ๐‘ฅ 2 − 5๐‘ฅ + 7 = 0 has no real roots then A has no eigenvalues.
Definition 6.3.2: Let L: V→ V be a linear transformation and let ๏ฌ be an eigenvalue of L.
Denote by S( ๏ฌ ) the set of all eigenvectors associated with ๏ฌ , together with the zero vector.
Then S( ๏ฌ ) is called the eigenspace of L associated with ๏ฌ .
Definition 6.3.3: Let ๏ฌ be an eigenvalue of L.
(a) The geometric multiplicity of ๏ฌ is the dimension of S( ๏ฌ ).
(b) The algebraic multiplicity of ๏ฌ is its multiplicity as a root of the characteristic polynomial
f( ๏ฌ ).
Example 4. Let L be a linear transformation represented by the matrix
๐ด = [2  0  0]
    [3 −1  0]
    [0  4 −1]
Determine the geometric and algebraic multiplicities of its eigenvalues.
Solution: The characteristic polynomial of A is
๐‘“(๐‘ฅ) = |๐‘ฅ − 2    0     0  |
       | −3   ๐‘ฅ + 1    0  |
       |  0    −4   ๐‘ฅ + 1|
     = (๐‘ฅ − 2)(๐‘ฅ + 1)(๐‘ฅ + 1)
     = (๐‘ฅ − 2)(๐‘ฅ + 1)²
Thus the eigenvalues of A are ๐œ†1 = 2 with algebraic multiplicity 1 and ๐œ†2 = ๐œ†3 = −1 with
algebraic multiplicity 2.
To find the geometric multiplicity of ๐œ†1 = 2, we have to determine the dimension of the
eigenspace S(2) associated with ๐œ†1 = 2. Substituting 2 for x in the characteristic matrix of A
we obtain
[ 0  0 0]
[−3  3 0]
[ 0 −4 3]
This matrix in reduced row echelon form is
[1 0 −3/4]
[0 1 −3/4]
[0 0   0 ]
The solution is {(3/4 ๐‘Ÿ, 3/4 ๐‘Ÿ, ๐‘Ÿ): ๐‘Ÿ ∈ โ„}. Thus
S(2) = {(3/4 ๐‘Ÿ, 3/4 ๐‘Ÿ, ๐‘Ÿ): ๐‘Ÿ ∈ โ„}
     = span {(3, 3, 4)}
Hence ๐œ†1 = 2 has geometric multiplicity 1.
Similarly, substituting −1 for x in the characteristic matrix we obtain
[−3  0 0]
[−3  0 0]
[ 0 −4 0]
This matrix in reduced row echelon form is
[1 0 0]
[0 1 0]
[0 0 0]
Thus
S(−1) = {(0, 0, ๐‘Ÿ): ๐‘Ÿ ∈ โ„}
= span{(0, 0, 1)}
Hence ๐œ†2 = ๐œ†3 = −1 has geometric multiplicity 1.
Theorem 6.3.2. Let ๐œ† be an eigenvalue. Then the geometric multiplicity of ๐œ† does not
exceed its algebraic multiplicity.
Theorem 6.3.3. If the eigenvalues ๐œ†1 , … , ๐œ†๐‘› are all different and {๐œ‰1 , … , ๐œ‰๐‘› } is a set of
eigenvectors, ๐œ‰๐‘– corresponding to ๐œ†๐‘– , then the set {๐œ‰1 , … , ๐œ‰๐‘› } is linearly independent.
Example 5. Let A be the matrix of Example 2. The three distinct eigenvalues of A are
๐œ†1 = 1, ๐œ†2 = 3, and ๐œ†3 = −2 and the corresponding eigenvectors are (6, 3, 8), (0, 5, 2), and
(0, 0, 1). It can be verified that these three eigenvectors are linearly independent.
SAQ 6-2
Let ๐ด = [ 3  2  2]
        [ 1  4  1]
        [−2 −4 −1].
Determine the algebraic and geometric multiplicities of its
eigenvalues.
ASAQ 6-2
The characteristic polynomial of A is
๐‘“(๐‘ฅ) = |๐‘ฅ − 3   −2    −2  |
       | −1   ๐‘ฅ − 4   −1  |
       |  2     4   ๐‘ฅ + 1|
     = ๐‘ฅ³ − 6๐‘ฅ² + 11๐‘ฅ − 6
     = (๐‘ฅ − 1)(๐‘ฅ − 2)(๐‘ฅ − 3)
Thus the eigenvalues of A are 1, 2, and 3 all of which have algebraic multiplicity 1.
To solve for the geometric multiplicity of ๐œ† = 1, we substitute 1 for x in the characteristic
matrix obtaining the matrix
[−2 −2 −2]
[−1 −3 −1]
[ 2  4  2]
This matrix in reduced row echelon form is
[1 0 1]
[0 1 0]
[0 0 0]
Thus
S(1) = {(−๐‘Ÿ, 0, ๐‘Ÿ): ๐‘Ÿ ∈ โ„}
     = span{(−1, 0, 1)}
Hence ๐œ† = 1 has geometric multiplicity 1.
To solve for the geometric multiplicity of ๐œ† = 2, we substitute 2 for x in the characteristic
matrix obtaining the matrix
[−1 −2 −2]
[−1 −2 −1]
[ 2  4  3]
This matrix in reduced row echelon form is equal to
[1 2 0]
[0 0 1]
[0 0 0]
Thus
S(2) = {(−2๐‘Ÿ, ๐‘Ÿ, 0): ๐‘Ÿ ∈ โ„}
     = span{(−2, 1, 0)}
Hence ๐œ† = 2 has geometric multiplicity 1.
To solve for the geometric multiplicity of ๐œ† = 3, we substitute 3 for x in the characteristic
matrix obtaining the matrix
[ 0 −2 −2]
[−1 −1 −1]
[ 2  4  4]
This matrix in reduced row echelon form is
[1 0 0]
[0 1 1]
[0 0 0]
Thus
S(3) = {(0, −๐‘Ÿ, ๐‘Ÿ): ๐‘Ÿ ∈ โ„}
     = span{(0, −1, 1)}
Hence ๐œ† = 3 has geometric multiplicity 1.
6.4 Diagonalization
Definition 6.4.1: A matrix B is similar to a matrix A if there is a nonsingular matrix P such
that
B = P-1AP
Example 1: Let ๐ด = [
๐‘ƒ−1 ๐ด๐‘ƒ we have
1 −1
−1 −1
1
] and ๐‘ƒ = [
]. Then ๐‘ƒ−1 = [
2 4
2
1
−2
๐ต=[
= [
1
]. If we let ๐ต =
−1
1
1 1 −1 −1 −1
][
][
]
−2 −1 2 4
2
1
3 0
]
0 2
By Definition 6.4.1, B is similar to A.
REMARKS:
1. A is similar to A.
2. If B is similar to A, then A is similar to B.
3. If A is similar to B and B is similar to C, then A is similar to C.
Theorem 6.4.1. Similar matrices have the same characteristic polynomial.
Theorem 6.4.2. Similar matrices have the same eigenvalues. (If ๐ต = ๐‘ƒ−1 ๐ด๐‘ƒ and ๐‘‹ is an
eigenvector of ๐ด associated with ๐œ†, then ๐‘ƒ−1 ๐‘‹ is an eigenvector of ๐ต associated with ๐œ†.)
Example 2. Let A and B be the matrices of Example 1. The characteristic polynomial of A is
๐‘“(๐‘ฅ) = |๐‘ฅ − 1    1  | = ๐‘ฅ² − 5๐‘ฅ + 6
       | −2   ๐‘ฅ − 4|
and the characteristic polynomial of B is
๐‘“(๐‘ฅ) = |๐‘ฅ − 3    0  | = ๐‘ฅ² − 5๐‘ฅ + 6
       |  0   ๐‘ฅ − 2|
Thus A and B have the same characteristic polynomial. Consequently, A and B also have the
same eigenvalues.
Definition 6.4.2: We shall say that the matrix A is diagonalizable if it is similar to a diagonal
matrix. In this case we also say that A can be diagonalized.
Example 3. Let A be the matrix of Example 1. Since A is similar to a diagonal matrix then A is
diagonalizable.
Theorem 6.4.3: An ๐‘› x ๐‘› matrix A is diagonalizable if and only if it has ๐‘› linearly independent
eigenvectors. In this case A is similar to a diagonal matrix D, with P-1AP = D, whose diagonal
elements are the eigenvalues of A, while P is a matrix whose columns are ๐‘› linearly
independent eigenvectors of A.
Theorem 6.4.4: A matrix A is diagonalizable if all the roots of its characteristic polynomial
are real and distinct.
Example 4. Let ๐ด = [
1 4
]. The characteristic polynomial of A is
1 −2
๐‘ฅ−1
๐‘“(๐‘ฅ) = |
−1
−4
| = ๐‘ฅ 2 + ๐‘ฅ − 6 = (๐‘ฅ + 3)(๐‘ฅ − 2)
๐‘ฅ+2
Since the eigenvalues are real and distinct then A is diagonalizable.
Example 5. Let ๐ด be the matrix of SAQ6-2. The distinct eigenvalues of A are 1, 2, and 3
−1
−2
hence A is diagonalizable. The linearly independent eigenvectors of A are [ 0 ] , [ 1 ] ,
1
0
0
and [−1]. Thus by Theorem 6.4.3
1
−1 −2 0
1
2
2
๐‘ƒ=[ 0
1 −1] and ๐‘ƒ−1 = [−1 −1 −1] (verify)
1
0
1
−1 −2 −1
Then
๐ท = ๐‘ƒ−1 ๐ด๐‘ƒ
  = [ 1  2  2][ 3  2  2][−1 −2  0]
    [−1 −1 −1][ 1  4  1][ 0  1 −1]
    [−1 −2 −1][−2 −4 −1][ 1  0  1]
  = [ 1  2  2][−1 −2  0]
    [−2 −2 −2][ 0  1 −1]
    [−3 −6 −3][ 1  0  1]
  = [1 0 0]
    [0 2 0]
    [0 0 3]
Note that D is a diagonal matrix whose diagonal elements are the eigenvalues of A.
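Example 5 can be confirmed numerically. This numpy sketch (an illustration, not part of the module) forms ๐‘ƒ from the eigenvector columns and checks that ๐‘ƒ−1 ๐ด๐‘ƒ is the diagonal matrix of eigenvalues:

```python
import numpy as np

A = np.array([[3, 2, 2],
              [1, 4, 1],
              [-2, -4, -1]], dtype=float)
# Columns of P are the eigenvectors (-1,0,1), (-2,1,0), (0,-1,1).
P = np.array([[-1, -2, 0],
              [0, 1, -1],
              [1, 0, 1]], dtype=float)

D = np.linalg.inv(P) @ A @ P
print(np.round(D))    # -> diag(1, 2, 3)
```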
ACTIVITY
1. Let
๐ด = [2 −2  3]
    [0  3 −2]
    [0 −1  2]
Find the characteristic polynomial, eigenvalues, and eigenvectors of A.
2. Let
๐ด = [3 −2 1]
    [0  2 0]
    [0  0 0]
Find a nonsingular matrix P such that P−1AP is diagonal.
3. Let L be a linear transformation represented by the matrix
๐ด = [0 0 0]
    [0 1 0]
    [1 0 1]
Determine the geometric and algebraic multiplicities of its eigenvalues.
MODULE 7
INNER PRODUCT SPACES
Introduction
In this section we will discuss inner product, inner product spaces, orthonormal
bases in โ„๐‘› , the Gram-Schmidt orthogonalization process, diagonalization of symmetric
matrices, and quadratic forms.
Objectives
After studying this module, you should be able to:
1. Explain the steps in the diagonalization of a matrix.
2. Enumerate the properties of a diagonalizable symmetric matrix.
3. Perform diagonalization of a symmetric matrix by orthogonalization.
7.1 Inner Product in โ„๐’
Definition 7.1.1. The length or magnitude or norm of the vector ๐‘‹ = (๐‘ฅ1 , ๐‘ฅ2 , … , ๐‘ฅ๐‘› ) in โ„๐‘›
is
โ€–๐‘‹โ€– = √๐‘ฅ1 2 + ๐‘ฅ2 2 + … + ๐‘ฅ๐‘› 2
The same formula is also used to define the distance from the point (๐‘ฅ1 , ๐‘ฅ2 , … , ๐‘ฅ๐‘› ) to the
origin.
Example 1. Let ๐‘‹ = (1, 2, −3) be a vector in โ„3 . The length of ๐‘‹ is
โ€–๐‘‹โ€– = √12 + 22 + (−3)2 = √14
Definition 7.1.2. If ๐‘‹ = (๐‘ฅ1 , ๐‘ฅ2 , … , ๐‘ฅ๐‘› ) and ๐‘Œ = (๐‘ฆ1 , ๐‘ฆ2 , … , ๐‘ฆ๐‘› ) are vectors in โ„๐‘› , then their
inner product (also called dot product) is given by
๐‘‹ โˆ™ ๐‘Œ = ๐‘ฅ1 ๐‘ฆ1 + ๐‘ฅ2 ๐‘ฆ2 + โ‹ฏ ๐‘ฅ๐‘› ๐‘ฆ๐‘› .
Example 2. Let ๐‘‹ = (1, 0, 1) and ๐‘Œ = (2, 3, −1). Then their inner product is
๐‘‹ โˆ™ ๐‘Œ = 1 โˆ™ 2 + 0 โˆ™ 3 + 1(−1) = 1
Theorem 7.1.1. (Properties of Inner Product) If ๐‘‹, ๐‘Œ and ๐‘ are vectors in โ„๐‘› and ๐‘ is a
scalar, then:
a. ๐‘‹ โˆ™ ๐‘‹ = โ€–๐‘‹โ€–2 ≥ 0, with equality if and only if ๐‘‹ = 0.
b. ๐‘‹ โˆ™ ๐‘Œ = ๐‘Œ โˆ™ ๐‘‹
c. (๐‘‹ + ๐‘Œ) โˆ™ ๐‘ = ๐‘‹ โˆ™ ๐‘ + ๐‘Œ โˆ™ ๐‘
d. (๐‘๐‘‹) โˆ™ ๐‘Œ = ๐‘‹ โˆ™ (๐‘๐‘Œ) = ๐‘(๐‘‹ โˆ™ ๐‘Œ)
NOTE: A vector space V along with an inner product or scalar product operation satisfying
all these properties is called an inner product space.
Example 3. โ„๐‘› is an inner product space since it satisfies all the properties of inner
product.
Definition 7.1.3. The angle between two nonzero vectors ๐‘‹ and ๐‘Œ is defined as the unique
number ๐œƒ, 0 ≤ ๐œƒ ≤ ๐œ‹, such that
๐‘๐‘œ๐‘ ๐œƒ =
๐‘‹โˆ™๐‘Œ
โ€–๐‘‹โ€–โ€–๐‘Œโ€–
Example 4. Let ๐‘‹ = (0, 0, 1, 1) and ๐‘Œ = (1, 0, 1, 0). Then
โ€–๐‘‹โ€– = √2, โ€–๐‘Œโ€– = √2 and ๐‘‹ โˆ™ ๐‘Œ = 1
Thus
๐‘๐‘œ๐‘ ๐œƒ =
1
(√2)(√2)
1
๐‘๐‘œ๐‘ ๐œƒ = 2
1
๐œƒ = cos−1 (2)
๐œƒ = 60๐‘œ
Definition 7.1.4. Two nonzero vectors ๐‘‹ and ๐‘Œ in โ„๐‘› are said to be orthogonal if ๐‘‹ โˆ™ ๐‘Œ = 0.
If one of the vectors is the zero vector, we agree to say that the vectors are orthogonal.
They are said to be parallel if |๐‘‹ โˆ™ ๐‘Œ| = โ€–๐‘‹โ€–โ€–๐‘Œโ€–. They are in the same direction if ๐‘‹ โˆ™ ๐‘Œ =
โ€–๐‘‹โ€–โ€–๐‘Œโ€–. That is, they are orthogonal if ๐‘๐‘œ๐‘ ๐œƒ = 0, parallel if ๐‘๐‘œ๐‘ ๐œƒ = ±1 and in the same
direction if ๐‘๐‘œ๐‘ ๐œƒ = 1.
Example 5. Let ๐‘‹1 = (4, 2, 6, −8), ๐‘‹2 = (−2, 3, −1, −1) and ๐‘‹3 = (−2, −1, −3, 4). Then
๐‘‹1 โˆ™ ๐‘‹2 = 4(−2) + 2 โˆ™ 3 + 6(−1) + (−8)(−1) = 0,
๐‘‹2 โˆ™ ๐‘‹3 = (−2)(−2) + 3(−1) + (−1)(−3) + (−1)(4) = 0 and
๐‘‹1 โˆ™ ๐‘‹3 = 4(−2) + 2(−1) + 6(−3) + (−8)(4) = −60
This shows that ๐‘‹1 and ๐‘‹2 are orthogonal and ๐‘‹2 and ๐‘‹3 are also orthogonal.
Moreover, โ€–๐‘‹1 โ€– = 2√30 and โ€–๐‘‹3 โ€– = √30 then
|๐‘‹1 โˆ™ ๐‘‹3 | = โ€–๐‘‹1 โ€–โ€–๐‘‹3 โ€–
|−60| = (2√30)(√30)
60 = 60
This implies that ๐‘‹1 and ๐‘‹3 are parallel but not in the same direction.
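The dot products in Example 5 are easy to verify with numpy. A sketch (an illustration, not part of the module): orthogonality shows up as a zero inner product, and parallelism as |๐‘‹ โˆ™ ๐‘Œ| = โ€–๐‘‹โ€–โ€–๐‘Œโ€–.

```python
import numpy as np

X1 = np.array([4, 2, 6, -8], dtype=float)
X2 = np.array([-2, 3, -1, -1], dtype=float)
X3 = np.array([-2, -1, -3, 4], dtype=float)

print(X1 @ X2, X2 @ X3, X1 @ X3)   # -> 0.0 0.0 -60.0

# Parallel test: |X1 . X3| equals ||X1|| ||X3|| (both are 60, up to rounding).
print(abs(X1 @ X3), np.linalg.norm(X1) * np.linalg.norm(X3))
```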
Definition 7.1.5. A unit vector U in ℝⁿ is a vector of unit length. If X is a nonzero vector,
then the vector

    U = [1/‖X‖] X

is a unit vector in the direction of X.
Example 6. Consider the vector X = (0, 4, 2, 3). Since ‖X‖ = √29, the vector

    U = (1/√29)(0, 4, 2, 3) = (0, 4/√29, 2/√29, 3/√29)

is a unit vector in the direction of X.
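Example 6 can be checked numerically (an illustrative sketch; the helper name `unit` is my own):

```python
import numpy as np

def unit(x):
    # U = (1/||X||) X, as in Definition 7.1.5
    return x / np.linalg.norm(x)

X = np.array([0.0, 4.0, 2.0, 3.0])
U = unit(X)
assert np.isclose(np.linalg.norm(U), 1.0)   # U has unit length
assert np.allclose(U * np.sqrt(29), X)      # U points in the direction of X
```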
7.2 Orthonormal Bases in ℝⁿ

Definition 7.2.1. A set S = {X₁, X₂, …, X_k} in ℝⁿ is called orthogonal if any two distinct
vectors in S are orthogonal. An orthonormal set of vectors is an orthogonal set of unit
vectors.
Example 1. Let X₁ = (1, 2, −1, 1), X₂ = (0, −1, −2, 0) and X₃ = (1, 0, 0, −1). Since
X₁ · X₂ = 0, X₁ · X₃ = 0 and X₂ · X₃ = 0, the set {X₁, X₂, X₃} is orthogonal. The vectors

    Y₁ = (1/√7, 2/√7, −1/√7, 1/√7), Y₂ = (0, −1/√5, −2/√5, 0) and Y₃ = (1/√2, 0, 0, −1/√2)

are unit vectors in the direction of X₁, X₂ and X₃, respectively. Thus {Y₁, Y₂, Y₃} is an
orthonormal set that spans the same subspace as {X₁, X₂, X₃}, that is, span{X₁, X₂, X₃} =
span{Y₁, Y₂, Y₃}.
Notice that an orthonormal set is just a set of orthogonal vectors in which each
vector has been normalized to unit length.

Theorem 7.2.1. Let S = {X₁, X₂, …, X_k} be an orthogonal set of nonzero vectors in ℝⁿ. Then
S is linearly independent.

Example 2. The vectors X₁, X₂ and X₃ of Example 1 are orthogonal; hence they are linearly
independent.

Corollary 7.2.1. An orthonormal set of vectors in ℝⁿ is linearly independent.

Example 3. The set of vectors {Y₁, Y₂, Y₃} of Example 1 is orthonormal; hence it is linearly
independent.
Gram-Schmidt Orthogonalization Process

The Gram-Schmidt process is a procedure by which an orthonormal set of vectors is
obtained from a linearly independent set of vectors in an inner product space.
Definition 7.2.2. (Gram-Schmidt Process) Let W be a nonzero subspace of ℝⁿ with basis S =
{X₁, X₂, …, X_m}. Then there exists an orthonormal basis T = {Z₁, Z₂, …, Z_m} for W.

Steps in the Gram-Schmidt process:
STEP 1. Let Y₁ = X₁.
STEP 2. Compute the vectors Y₂, Y₃, …, Y_m by the formula

    Y_k = X_k − Σ_{j=1}^{k−1} [(X_k · Y_j)/(Y_j · Y_j)] Y_j    for 2 ≤ k ≤ m

The set of vectors S′ = {Y₁, Y₂, …, Y_m} is an orthogonal set.
STEP 3. Normalize each vector in S′ to obtain an orthonormal basis for W.
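The three steps above can be sketched as a short NumPy function (an illustrative implementation; the name `gram_schmidt` is my own), applied here to a sample basis for ℝ³:

```python
import numpy as np

def gram_schmidt(basis):
    """Orthonormalize a linearly independent list of vectors (Steps 1-3)."""
    ortho = []                       # the orthogonal set S' = {Y_1, ..., Y_m}
    for x in basis:
        x = np.array(x, dtype=float)
        y = x.copy()                 # STEP 1 / STEP 2: start from X_k ...
        for y_j in ortho:
            # ... and subtract the projection of X_k onto each earlier Y_j
            y -= ((x @ y_j) / (y_j @ y_j)) * y_j
        ortho.append(y)
    # STEP 3: normalize each Y_j
    return [y / np.linalg.norm(y) for y in ortho]

T = gram_schmidt([(1, 1, 1), (0, 1, 1), (1, 2, 3)])
```

Stacking the returned vectors as the rows of a matrix M, orthonormality is equivalent to M Mᵀ = I.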
Example 4. Use the Gram-Schmidt process to transform the basis {(1, 1, 1), (0, 1, 1),
(1, 2, 3)} for ℝ³ into an orthonormal basis for ℝ³.

Solution: We apply the Gram-Schmidt process to obtain three vectors Z₁, Z₂, Z₃ which also
span ℝ³ and are orthogonal to each other.
STEP 1. Let Y₁ = X₁ = (1, 1, 1).
STEP 2. We now compute Y₂ and Y₃:

    Y₂ = X₂ − [(X₂ · Y₁)/(Y₁ · Y₁)] Y₁
       = (0, 1, 1) − (2/3)(1, 1, 1)
       = (−2/3, 1/3, 1/3)

and

    Y₃ = X₃ − [(X₃ · Y₁)/(Y₁ · Y₁)] Y₁ − [(X₃ · Y₂)/(Y₂ · Y₂)] Y₂
       = (1, 2, 3) − (6/3)(1, 1, 1) − [1/(6/9)](−2/3, 1/3, 1/3)
       = (1, 2, 3) − (2, 2, 2) − (3/2)(−2/3, 1/3, 1/3)
       = (0, −1/2, 1/2)

The set of vectors S′ = {(1, 1, 1), (−2/3, 1/3, 1/3), (0, −1/2, 1/2)} is orthogonal.
STEP 3. We normalize each vector found in STEP 2. Let

    Z₁ = Y₁/‖Y₁‖ = (1/√3)(1, 1, 1) = (1/√3, 1/√3, 1/√3)

    Z₂ = Y₂/‖Y₂‖ = [1/√(6/9)](−2/3, 1/3, 1/3) = (3/√6)(−2/3, 1/3, 1/3)
       = (−2/√6, 1/√6, 1/√6)

    Z₃ = Y₃/‖Y₃‖ = [1/√(2/4)](0, −1/2, 1/2) = (2/√2)(0, −1/2, 1/2)
       = (0, −1/√2, 1/√2)

Then T = {(1/√3, 1/√3, 1/√3), (−2/√6, 1/√6, 1/√6), (0, −1/√2, 1/√2)} is an orthonormal
basis for ℝ³.
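The hand computation can be verified numerically: stacking the vectors of T as the rows of a matrix, the product T Tᵀ should be the identity (an illustrative check in NumPy):

```python
import numpy as np

# The orthonormal basis T computed in Example 4, one vector per row
T = np.array([
    [1/np.sqrt(3),  1/np.sqrt(3), 1/np.sqrt(3)],
    [-2/np.sqrt(6), 1/np.sqrt(6), 1/np.sqrt(6)],
    [0.0,          -1/np.sqrt(2), 1/np.sqrt(2)],
])
# rows orthonormal  <=>  T T^T = I
assert np.allclose(T @ T.T, np.eye(3))
```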
SAQ 7-1
Use the Gram-Schmidt process to find an orthonormal basis for the
subspace W of ℝ⁴ with basis {(1, 1, −1, 0), (0, 2, 0, 1), (−1, 0, 0, 1)}.
ASAQ 7-1
STEP 1. Let Y₁ = X₁ = (1, 1, −1, 0).
STEP 2. We now solve for Y₂ and Y₃:

    Y₂ = X₂ − [(X₂ · Y₁)/(Y₁ · Y₁)] Y₁
       = (0, 2, 0, 1) − (2/3)(1, 1, −1, 0)
       = (−2/3, 4/3, 2/3, 1)

    Y₃ = X₃ − [(X₃ · Y₁)/(Y₁ · Y₁)] Y₁ − [(X₃ · Y₂)/(Y₂ · Y₂)] Y₂
       = (−1, 0, 0, 1) − (−1/3)(1, 1, −1, 0) − [(5/3)/(11/3)](−2/3, 4/3, 2/3, 1)
       = (−4/11, −3/11, −7/11, 6/11)

Thus S′ = {(1, 1, −1, 0), (−2/3, 4/3, 2/3, 1), (−4/11, −3/11, −7/11, 6/11)} is an
orthogonal basis for the subspace W of ℝ⁴.
STEP 3. We normalize each vector in S′. Let

    Z₁ = Y₁/‖Y₁‖ = (1/√3)(1, 1, −1, 0) = (1/√3, 1/√3, −1/√3, 0)

    Z₂ = Y₂/‖Y₂‖ = [1/√(33/9)](−2/3, 4/3, 2/3, 1) = (3/√33)(−2/3, 4/3, 2/3, 1)
       = (−2/√33, 4/√33, 2/√33, 3/√33)

    Z₃ = Y₃/‖Y₃‖ = [1/√(110/121)](−4/11, −3/11, −7/11, 6/11)
       = (11/√110)(−4/11, −3/11, −7/11, 6/11)
       = (−4/√110, −3/√110, −7/√110, 6/√110)

Thus T = {(1/√3, 1/√3, −1/√3, 0), (−2/√33, 4/√33, 2/√33, 3/√33),
(−4/√110, −3/√110, −7/√110, 6/√110)} is an orthonormal basis for the subspace W of ℝ⁴.
ACTIVITY
1. Which of the following are orthonormal sets of vectors?
a. (1/√2, 0, −1/√2), (1/√3, 1/√3, 1/√3), (0, 1, 0)
b. (0, 2, 2, 1), (−1, 1, 2, 2), (0, 1, −2, 1)
c. (1/√6, −2/√6, 0, 1/√6), (−1/√3, −1/√3, 0, −1/√3), (0, 0, 1, 0)
2. Use the Gram-Schmidt process to find an orthonormal basis for the subspace of ℝ⁴
with basis {(1, −1, 0, 1), (2, 0, 0, −1), (0, 0, 1, 0)}.
7.3 Diagonalization of a Symmetric Matrix

Theorem 7.3.1. All roots of the characteristic polynomial of a symmetric matrix are real
numbers.
Example 1. Let

    A = [ −1  0  0 ]
        [  0  2  0 ]
        [  0  0  3 ]

The characteristic polynomial of A is

    f(x) = | x+1   0     0  |
           |  0   x−2    0  |
           |  0    0    x−3 |

         = (x + 1)(x − 2)(x − 3)

Clearly the roots of f(x) are all real numbers.
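This can be checked numerically for the matrix of Example 1 (an illustrative sketch; `np.linalg.eigvalsh` is NumPy's eigenvalue routine for symmetric matrices and always returns real values, in ascending order):

```python
import numpy as np

A = np.array([[-1.0, 0.0, 0.0],
              [ 0.0, 2.0, 0.0],
              [ 0.0, 0.0, 3.0]])
# eigenvalues of a symmetric matrix are real (Theorem 7.3.1)
eigvals = np.linalg.eigvalsh(A)
assert np.allclose(eigvals, [-1.0, 2.0, 3.0])
```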
Corollary 7.3.1. If A is a symmetric matrix all of whose eigenvalues are distinct, then A is
diagonalizable.

Example 2. Let A be the matrix of Example 1. The eigenvalues λ₁ = −1, λ₂ = 2 and λ₃ = 3
are real and distinct. Hence A is diagonalizable.

Theorem 7.3.2. If A is a symmetric matrix, then the eigenvectors that belong to distinct
eigenvalues of A are orthogonal.
Example 3. Let A be the matrix of Example 1. To find the associated eigenvectors, we form
the system (λI₃ − A)X = 0 and solve for X. Thus

    X₁ = [ 1 ]   X₂ = [ 0 ]   X₃ = [ 0 ]
         [ 0 ]        [ 1 ]        [ 0 ]
         [ 0 ]        [ 0 ]        [ 1 ]

are the eigenvectors associated with λ₁ = −1, λ₂ = 2 and λ₃ = 3, respectively. It can be
verified that {X₁, X₂, X₃} is an orthogonal set of vectors in ℝ³.
Definition 7.3.1. A nonsingular matrix A is called orthogonal if A⁻¹ = Aᵀ.
Example 4. Let

    A = [ 1     0      0   ]
        [ 0   1/√2  −1/√2 ]
        [ 0  −1/√2  −1/√2 ]

Since

    A⁻¹ = [ 1     0      0   ]
          [ 0   1/√2  −1/√2 ]  = Aᵀ
          [ 0  −1/√2  −1/√2 ]

A is an orthogonal matrix.
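The defining condition A⁻¹ = Aᵀ can be checked numerically for the matrix of Example 4 (illustrative sketch in NumPy):

```python
import numpy as np

s = 1 / np.sqrt(2)
A = np.array([[1.0, 0.0, 0.0],
              [0.0,   s,  -s],
              [0.0,  -s,  -s]])
# Definition 7.3.1: A is orthogonal iff A^-1 = A^T
assert np.allclose(np.linalg.inv(A), A.T)
# equivalently (Theorem 7.3.3), the columns of A are orthonormal: A^T A = I
assert np.allclose(A.T @ A, np.eye(3))
```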
Theorem 7.3.3. The n × n matrix A is orthogonal if and only if the columns (and rows) of A
form an orthonormal set of vectors in ℝⁿ.

Example 5. Let A be the matrix of Example 4. The columns (and rows) of A are of unit length
and are mutually orthogonal. Thus A is orthogonal.

Theorem 7.3.4. If A is a symmetric n × n matrix, then there exists an orthogonal matrix P
such that P⁻¹AP = PᵀAP = D, a diagonal matrix, where the columns of P consist of the
linearly independent eigenvectors of A and the diagonal elements of D are the eigenvalues
of A associated with these eigenvectors.
Example 6. Let

    A = [  0  −1  −1 ]
        [ −1   0  −1 ]
        [ −1  −1   0 ]

Find an orthogonal matrix P and a diagonal matrix D such that D = P⁻¹AP = PᵀAP.

Solution: The characteristic polynomial of A is

    f(x) = | x  1  1 |
           | 1  x  1 |  = (x − 1)²(x + 2)
           | 1  1  x |
The eigenvalues of A are λ₁ = 1, λ₂ = 1 and λ₃ = −2. To find the eigenvectors associated
with λ₁ = 1, we solve the homogeneous system (λ₁I₃ − A)X = 0:

    [ 1  1  1 ] [x₁]   [ 0 ]
    [ 1  1  1 ] [x₂] = [ 0 ]        (1)
    [ 1  1  1 ] [x₃]   [ 0 ]

The solution to this system is (verify)

    [ −r − s ]
    [   r    ]  where r, s ∈ ℝ
    [   s    ]

Thus a basis for the solution space of (1) consists of the eigenvectors

    X₁ = [ −1 ]  and  X₂ = [ −1 ]
         [  1 ]            [  0 ]
         [  0 ]            [  1 ]

However, the two vectors are not orthogonal, so we apply the Gram-Schmidt process to
obtain an orthonormal basis for the solution space of (1).

Let Y₁ = X₁ = (−1, 1, 0). Then

    Y₂ = X₂ − [(X₂ · Y₁)/(Y₁ · Y₁)] Y₁
       = (−1, 0, 1) − (1/2)(−1, 1, 0)
       = (−1/2, −1/2, 1)
Normalizing these two vectors we get

    Z₁ = Y₁/‖Y₁‖ = (1/√2)(−1, 1, 0) = (−1/√2, 1/√2, 0)

    Z₂ = Y₂/‖Y₂‖ = (2/√6)(−1/2, −1/2, 1) = (−1/√6, −1/√6, 2/√6)
Thus {Z₁, Z₂} is an orthonormal basis of eigenvectors for the solution space of (1).

Next we find the eigenvector associated with λ₃ = −2 by solving the homogeneous system
(−2I₃ − A)X = 0:

    [ −2   1   1 ] [x₁]   [ 0 ]
    [  1  −2   1 ] [x₂] = [ 0 ]        (2)
    [  1   1  −2 ] [x₃]   [ 0 ]

The solution to this system is (verify)

    [ r ]
    [ r ]  where r ∈ ℝ
    [ r ]

Thus a basis for the solution space of (2) consists of the eigenvector

    X₃ = [ 1 ]
         [ 1 ]
         [ 1 ]

Normalizing this we get

    Z₃ = X₃/‖X₃‖ = (1/√3)(1, 1, 1) = (1/√3, 1/√3, 1/√3)
Note that Z₃ is orthogonal to both Z₁ and Z₂; thus {Z₁, Z₂, Z₃} is an orthonormal basis of
ℝ³ consisting of eigenvectors of A. By Theorem 7.3.4,

    P = [ −1/√2  −1/√6  1/√3 ]
        [  1/√2  −1/√6  1/√3 ]
        [   0     2/√6  1/√3 ]

and

    P⁻¹ = Pᵀ = [ −1/√2   1/√2    0   ]
               [ −1/√6  −1/√6  2/√6 ]
               [  1/√3   1/√3  1/√3 ]
Hence

    D = PᵀAP

      = [ −1/√2   1/√2    0   ] [  0  −1  −1 ] [ −1/√2  −1/√6  1/√3 ]
        [ −1/√6  −1/√6  2/√6 ] [ −1   0  −1 ] [  1/√2  −1/√6  1/√3 ]
        [  1/√3   1/√3  1/√3 ] [ −1  −1   0 ] [   0     2/√6  1/√3 ]

      = [ 1  0   0 ]
        [ 0  1   0 ]
        [ 0  0  −2 ]
Note that D is a diagonal matrix whose main diagonal consists of the eigenvalues of A.
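The whole diagonalization can be reproduced with NumPy's `np.linalg.eigh`, which returns the eigenvalues of a symmetric matrix together with an orthogonal matrix of eigenvectors (an illustrative sketch; the eigenvectors it returns may differ from the hand-computed P by sign or ordering):

```python
import numpy as np

A = np.array([[ 0.0, -1.0, -1.0],
              [-1.0,  0.0, -1.0],
              [-1.0, -1.0,  0.0]])
eigvals, P = np.linalg.eigh(A)   # columns of P are orthonormal eigenvectors of A
D = P.T @ A @ P                  # D = P^T A P, as in Theorem 7.3.4

assert np.allclose(np.linalg.inv(P), P.T)      # P is orthogonal
assert np.allclose(D, np.diag(eigvals))        # P^T A P is diagonal
assert np.allclose(eigvals, [-2.0, 1.0, 1.0])  # eigenvalues of Example 6, ascending
```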
SAQ 7-2
Let

    A = [ 1   0   0 ]
        [ 0   3  −2 ]
        [ 0  −2   3 ]

Find an orthogonal matrix P and a diagonal matrix D such that D = P⁻¹AP = PᵀAP.
ASAQ 7-2
The characteristic polynomial of A is

    f(x) = | x−1   0     0  |
           |  0   x−3    2  |
           |  0    2    x−3 |

         = (x − 1)(x − 3)(x − 3) − 4(x − 1)
         = (x − 1)²(x − 5)

The eigenvalues of A are λ₁ = 1, λ₂ = 1 and λ₃ = 5. To find the eigenvectors associated
with λ₁ = 1, we solve the homogeneous system (λ₁I₃ − A)X = 0:

    [ 0   0   0 ] [x₁]   [ 0 ]
    [ 0  −2   2 ] [x₂] = [ 0 ]        (1)
    [ 0   2  −2 ] [x₃]   [ 0 ]

The solution to this system is (verify)

    [ r ]
    [ s ]  where r, s ∈ ℝ
    [ s ]

Thus a basis for the solution space of (1) consists of the eigenvectors

    X₁ = [ 1 ]  and  X₂ = [ 0 ]
         [ 0 ]            [ 1 ]
         [ 0 ]            [ 1 ]

Note that X₁ and X₂ are already orthogonal. Next we normalize X₂:

    Z₂ = X₂/‖X₂‖ = [   0  ]
                   [ 1/√2 ]
                   [ 1/√2 ]

Thus {X₁, Z₂} is an orthonormal basis of eigenvectors for the solution space of (1).

Next we find the eigenvector associated with λ₃ = 5 by solving the homogeneous system
(5I₃ − A)X = 0:
    [ 4  0  0 ] [x₁]   [ 0 ]
    [ 0  2  2 ] [x₂] = [ 0 ]        (2)
    [ 0  2  2 ] [x₃]   [ 0 ]

The solution to this system is (verify)

    [  0 ]
    [ −r ]  where r ∈ ℝ
    [  r ]

Thus a basis for the solution space of (2) consists of the eigenvector

    X₃ = [  0 ]
         [ −1 ]
         [  1 ]

Normalizing X₃ we obtain

    Z₃ = X₃/‖X₃‖ = [   0   ]
                   [ −1/√2 ]
                   [  1/√2 ]

Since Z₃ is orthogonal to both X₁ and Z₂, {X₁, Z₂, Z₃} is an orthonormal basis of ℝ³
consisting of eigenvectors of A. Thus

    P = [ 1    0      0   ]                [ 1    0     0   ]
        [ 0  1/√2  −1/√2 ]  and  P⁻¹ = Pᵀ = [ 0  1/√2  1/√2 ]
        [ 0  1/√2   1/√2 ]                [ 0 −1/√2  1/√2 ]

Hence

    D = [ 1    0     0   ] [ 1   0   0 ] [ 1    0      0   ]
        [ 0  1/√2  1/√2 ] [ 0   3  −2 ] [ 0  1/√2  −1/√2 ]
        [ 0 −1/√2  1/√2 ] [ 0  −2   3 ] [ 0  1/√2   1/√2 ]

      = [ 1    0      0   ] [ 1    0      0   ]
        [ 0  1/√2   1/√2 ] [ 0  1/√2  −1/√2 ]
        [ 0 −5/√2   5/√2 ] [ 0  1/√2   1/√2 ]

      = [ 1  0  0 ]
        [ 0  1  0 ]
        [ 0  0  5 ]
ACTIVITY
1. Let

    A = [ −3   0  −1 ]
        [  0  −2   0 ]
        [ −1   0  −3 ]

Find an orthogonal matrix P and a diagonal matrix D such that D = P⁻¹AP = PᵀAP.