Dr. Michael Gabriel
Department of Mathematical Sciences
U08620 - Linear Algebra

1. From R^2 and R^3 to R^n

In R^2 and R^3 we can represent a vector both graphically and algebraically. Graphically, we do so with a directed line segment (an arrow). The defining characteristics of a vector are its magnitude and direction, and so two vectors u and v are defined to be equal if they have the same magnitude, i.e. ||u|| = ||v||, and if they are parallel and point in the same direction. Given any vector in R^2 or R^3, we can translate it so that it begins at the origin, and then represent it algebraically by the coordinates of the terminal point of the vector.

In R^2 and R^3, addition of vectors may be defined in terms of the parallelogram law (or equivalently the triangle law) of addition. For scalar multiplication, if u is a vector in R^2 or R^3, and k is any real number, then the scalar multiple ku is defined to be the vector with magnitude |k| ||u||, with the same direction as u if k > 0, and with direction opposite to that of u if k < 0. If k = 0, then ku is defined to be the zero vector.

In component terms, in R^2 for example, recall that addition and scalar multiplication are given by

(x_1, x_2) + (y_1, y_2) = (x_1 + y_1, x_2 + y_2), and k(x_1, x_2) = (kx_1, kx_2).

Now although our geometric visualization does not extend beyond 3-space, there is nothing stopping us from regarding quadruples of real numbers (x_1, x_2, x_3, x_4) as "vectors" in "4-dimensional" space, or quintuples (x_1, x_2, x_3, x_4, x_5) as "vectors" in "5-dimensional" space, and so on, and continuing to do mathematics with these "vectors". With this in mind, we define the following:

Definition 1.1 If n is a positive integer, then an ordered n-tuple is a sequence of real numbers (x_1, x_2, ..., x_n). We denote the set of all such n-tuples by R^n.

We can indeed think of these ordered n-tuples either as generalized points or generalized vectors.
When we write (x_1, x_2, ..., x_n), whether we mean a point or a vector will be clear from the context.

Now just as two vectors in R^2 or R^3 are equal if and only if each of their components are equal, we say that vectors u = (u_1, u_2, ..., u_n) and v = (v_1, v_2, ..., v_n) in R^n are equal if and only if u_1 = v_1, u_2 = v_2, ..., u_n = v_n.

Then, following the definition of vector addition and scalar multiplication in R^2 and R^3, we define operations on the elements of the set R^n: for each u = (u_1, u_2, ..., u_n) and v = (v_1, v_2, ..., v_n) in R^n, and real number k, define the sum of u and v to be

u + v = (u_1 + v_1, u_2 + v_2, ..., u_n + v_n),

and the scalar multiple ku to be

ku = (ku_1, ku_2, ..., ku_n).

These are called the standard operations on R^n.

Recall that in both R^2 and R^3 we have a "zero vector" (the vector with zero magnitude and undefined direction), and also that every vector u has a "negative" (the vector with magnitude equal to that of u, but with opposite direction). In R^n, we define our zero vector to be 0 = (0, 0, ..., 0), and for u = (u_1, u_2, ..., u_n) we define its negative to be -u = (-u_1, -u_2, ..., -u_n). We further define the difference between two vectors in R^n to be v - u = v + (-u); that is, v - u = (v_1 - u_1, v_2 - u_2, ..., v_n - u_n).

The way in which we have defined our operations of vector addition and scalar multiplication in R^n ensures that all of the most important arithmetic properties of vector addition and scalar multiplication of vectors in R^2 and R^3 still hold in an identical fashion in R^n. These are listed below.

Properties of Vector Operations in R^n
If u, v and w are vectors in R^n, and k and l are scalars, then
(a) u + v = v + u
(b) u + (v + w) = (u + v) + w
(c) u + 0 = u = 0 + u
(d) u + (-u) = 0, that is, u - u = 0
(e) k(lu) = (kl)u
(f) k(u + v) = ku + kv
(g) (k + l)u = ku + lu
(h) 1u = u

These are all easily verified, e.g.
(g) (k + l)u = (k + l)(u_1, u_2, ..., u_n)
            = ((k + l)u_1, (k + l)u_2, ..., (k + l)u_n)
            = (ku_1 + lu_1, ku_2 + lu_2, ..., ku_n + lu_n)
            = (ku_1, ku_2, ..., ku_n) + (lu_1, lu_2, ..., lu_n)
            = ku + lu.

In analogy with R^2 and R^3, we can also define the notions of length and distance in R^n.

Definition 1.2 If u = (u_1, u_2, ..., u_n) is a vector in R^n, we define the Euclidean norm (or Euclidean length) of u to be

||u|| = sqrt(u_1^2 + u_2^2 + ... + u_n^2).

Moreover, for points u and v in R^n, we define the Euclidean distance between them to be

d(u, v) = ||u - v|| = sqrt((u_1 - v_1)^2 + (u_2 - v_2)^2 + ... + (u_n - v_n)^2).

Similarly we can generalize the definition of the dot product on R^2 and R^3 as follows.

Definition 1.3 If u = (u_1, u_2, ..., u_n) and v = (v_1, v_2, ..., v_n) are vectors in R^n, we define the Euclidean inner product u . v to be

u . v = u_1 v_1 + u_2 v_2 + ... + u_n v_n.

We refer to R^n, endowed with this inner product, as Euclidean n-space.

Observation 1.4 For each u = (u_1, u_2, ..., u_n) in R^n, ||u|| = (u . u)^(1/2).

Properties of the Euclidean Inner Product
If u, v and w are vectors in R^n, and k is any real number, then
(a) u . v = v . u
(b) u . (v + w) = u . v + u . w
(c) k(u . v) = (ku) . v = u . (kv)
(d) u . u >= 0, and u . u = 0 if and only if u = 0

Again these are easily verified and are left as an exercise. (At least one of these was checked in class.)

Now in R^2 and R^3, the dot product of vectors a and b was initially defined as a . b = ||a|| ||b|| cos t, where t is the angle between a and b. The formula in terms of components came later, and it was necessary to prove that this formula is indeed correct. Of course in R^n, for n >= 4, we have no predefined notion of "angle between vectors". However, we can use our definitions of Euclidean norm and inner product to define such an angle.

Definition 1.5 If u = (u_1, u_2, ..., u_n) and v = (v_1, v_2, ..., v_n) are nonzero vectors in R^n, define the cosine of the angle t between u and v by

cos t = (u . v) / (||u|| ||v||).
Then, extending the notion of perpendicularity for vectors in R^2 and R^3 to vectors in R^n, we say that vectors u and v in R^n are orthogonal if and only if u . v = 0; that is, if and only if the angle between them is 90 degrees.

Warning: the above definition does not quite contain the entire story. If the value of the right-hand side of the equation does not lie in the interval [-1, 1], then the definition makes no sense. That the right-hand side does indeed lie in this interval is ensured by one of the most important inequalities in linear algebra:

Cauchy-Schwarz Inequality in R^n
For all u, v in R^n, |u . v| <= ||u|| ||v||.

We will later see a more general version of this inequality, the proof of which is nontrivial.

Properties of the norm
If u and v are vectors in R^n, and k is any real number, then
(a) ||u|| >= 0
(b) ||u|| = 0 if and only if u = 0
(c) ||ku|| = |k| ||u||
(d) ||u + v|| <= ||u|| + ||v|| (triangle inequality)

Proof
(c) ||ku|| = sqrt(k^2 u_1^2 + k^2 u_2^2 + ... + k^2 u_n^2) = sqrt(k^2 (u_1^2 + u_2^2 + ... + u_n^2)) = |k| ||u||.

(d) ||u + v||^2 = (u + v) . (u + v)
             = u . u + u . v + v . u + v . v
             = ||u||^2 + 2(u . v) + ||v||^2
             <= ||u||^2 + 2|u . v| + ||v||^2
             <= ||u||^2 + 2||u|| ||v|| + ||v||^2   (by Cauchy-Schwarz)
             = (||u|| + ||v||)^2,

and so the result follows by taking square roots. []

Proposition 1.6 If u and v are orthogonal vectors in R^n, then ||u + v||^2 = ||u||^2 + ||v||^2.

Exercise 1.7 For u, v in R^n, show that u . v = (1/4)||u + v||^2 - (1/4)||u - v||^2.

2. Vector Spaces

2.1 Introduction

In this section, using as a base the theory of vectors in R^2 and R^3, and now indeed vectors in R^n, we will take the next step and generalize the notion of a vector even further. The following is taken from "Elementary Linear Algebra, Applications Version", by Howard Anton and Chris Rorres.

"... we shall generalize the concept of a vector still further. We shall state a set of axioms which, if satisfied by a class of objects, will entitle those objects to be called "vectors". The axioms will be chosen by abstracting the most important properties of vectors in R^n; as a consequence, vectors in R^n will automatically satisfy these axioms. Thus, our new concept of a vector will include our old vectors and many new kinds of vectors as well.
These new types of vectors will include, among other things, various kinds of matrices and functions. Our work in this section is not an idle exercise in theoretical mathematics; it will provide a powerful tool for extending our geometric visualization to a wide variety of important mathematical problems where geometric intuition would not otherwise be available. Briefly stated, the idea is this: we can visualize vectors in R^2 and R^3 geometrically as arrows, which enables us to draw physical or mental pictures to help solve problems. Because the axioms that we will be using to create our new kinds of vectors will be based on properties of vectors in R^2 and R^3, these new vectors will have many of the familiar properties of vectors in R^2 and R^3. Consequently, when we want to solve a problem involving our new kinds of vectors, say matrices or functions, we may be able to get a foothold on the problem by visualizing geometrically what the corresponding problem would be like in R^2 and R^3."

So what will we mean by a vector space? This will simply be any non-empty set V such that
(I) given any pair of elements u and v in this set, we can associate with them another element "u + v" in V;
(II) given any element u in V and any real number k, we can associate an element ku in V;
(III) these associations satisfy a certain set of rules (axioms).

What are these rules, and from where do they arise? Since it is our aim to generalize the notion of vectors in R^n, these rules should be built from those properties which were highlighted as being the most important ones satisfied by vector addition and scalar multiplication in R^n.

2.2 Definition of a Vector Space

Definition 2.2.1 Let V be any nonempty set on which two operations, "vector addition" and "scalar multiplication", are defined.
By vector addition, we mean a rule for associating with each pair of objects u and v in V an object u + v, called the sum of u and v; by scalar multiplication we mean a rule for associating with each real number k and each object u in V an object ku, called the scalar multiple of u by k. If each of the following axioms is satisfied by all objects u, v and w in V and all scalars k and l, then we call V a vector space, and we call the objects in V vectors.

(a) If u and v are in V, then u + v is in V
(b) u + v = v + u
(c) u + (v + w) = (u + v) + w
(d) There is an object in V called a zero vector, denoted 0, such that u + 0 = u = 0 + u for all objects u in V
(e) For each u in V, there is an object -u in V, called the negative of u, such that u + (-u) = 0 = (-u) + u
(f) If k is any scalar and u is any object in V, then ku is in V
(g) k(lu) = (kl)u
(h) k(u + v) = ku + kv
(i) (k + l)u = ku + lu
(j) 1u = u

Important points to note
From these axioms, it can be deduced that in a vector space,
(i) the zero vector is unique;
(ii) given u in V, its negative vector -u is unique;
(iii) 0u = 0;
(iv) (-1)u = -u.

Proof
(i) Suppose w_1 and w_2 are vectors in V such that u + w_1 = u and u + w_2 = u for all u in V. Then, taking u = w_1 in the second equation and u = w_2 in the first, we see that w_1 + w_2 = w_1 and w_2 + w_1 = w_2. Therefore by axiom (b), w_1 = w_1 + w_2 = w_2 + w_1 = w_2.

(ii) Suppose w_1 and w_2 are vectors in V such that u + w_1 = 0 and u + w_2 = 0. Then
w_1 = w_1 + 0 = w_1 + (u + w_2) = (w_1 + u) + w_2 = (u + w_1) + w_2 = 0 + w_2 = w_2.

Parts (iii) and (iv) are left as exercises here. (They were proved in class.) []

2.3 Examples of Vector Spaces

1. V = R^n, with the standard operations.

2. W = P_n(t), the set of all real polynomials of degree less than or equal to n, with operations as follows: for p, q in P_n(t), say p(t) = a_0 + a_1 t + a_2 t^2 + ... + a_n t^n and q(t) = b_0 + b_1 t + b_2 t^2 + ... + b_n t^n, and k in R, define the polynomials p + q and kp by
(p + q)(t) = (a_0 + b_0) + (a_1 + b_1)t + (a_2 + b_2)t^2 + ... + (a_n + b_n)t^n, and
(kp)(t) = ka_0 + ka_1 t + ka_2 t^2 + ... + ka_n t^n.

3. S = {(x, 3x) : x in R}, with the standard operations of R^2.

4.
V = M_{2,2}(R), with the usual operations of matrix addition and scalar multiplication.

5. T = {A in M_{2,2}(R) : A^T = A}, with the usual operations of matrix addition and scalar multiplication.

6. With a, b in R, a < b, let V = F([a, b]), the set of all functions from the interval [a, b] to R. We say that two functions f, g in F([a, b]) are equal if and only if f(x) = g(x) for all x in [a, b]. Then V is a vector space with operations
(f + g)(x) = f(x) + g(x) and (kf)(x) = kf(x), for f, g in F([a, b]) and k in R.

7. L = {(x, y) : x, y in R}, with operations defined as
(x, y) + (x', y') = (x + x' + 1, y + y' + 1) and k(x, y) = (k + kx - 1, k + ky - 1).

8. X = {x in R : x > 0}, with operations x + y = xy and kx = x^k, for x, y in X and k in R.

2.4 Examples of sets with operations failing at least one of the axioms

1. V = {(x, x + 5) : x in R}, with the standard operations of R^2. This fails even axiom (a).

2. V = {(x, y) : x, y in R, y > 0}, with operations defined as
(x, y) + (x', y') = (xy' + x'y, yy') and k(x, y) = (kx/y, 1).
This satisfies all axioms other than (j).

3. V = {(x, y) : x, y in R, x > 0}, with operations defined as
(x, y) + (x', y') = (xx', xy' + x'y) and k(x, y) = (x, ky).
This satisfies all axioms other than (i).

2.5 Subspaces

Many of the examples of vector spaces that we have considered so far have indeed been examples of subspaces.

Definition 2.5.1 Suppose that V is a vector space, and that W is a nonempty subset of V. If W is also a vector space under the same operations as V, then we call W a subspace of V.

For example, in Section 2.3, S is a subspace of R^2 and T is a subspace of M_{2,2}(R). However, L is not a subspace of R^2 with the standard operations, since the operations on L are defined differently.

Remarks 2.5.2
1. For any vector space V, {0} is a subspace of V, where 0 is the zero vector of V.
2. V is always a subspace of itself.
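Example 8 above is a good one to experiment with, since its "addition" and "scalar multiplication" look nothing like the usual ones. The following is a minimal numerical spot-check (not a proof) of a few of the axioms for this example; in this space the zero vector turns out to be the number 1, and the negative of x is 1/x.

```python
# Spot-check of vector-space axioms for X = {x in R : x > 0},
# with "addition" x (+) y = x*y and "scalar multiple" k (.) x = x**k.
def add(x, y):      # vector addition on X
    return x * y

def smul(k, x):     # scalar multiplication on X
    return x ** k

x, y, k, l = 2.0, 5.0, 3.0, -2.0

assert add(x, y) == add(y, x)                # axiom (b): commutativity
assert add(x, 1.0) == x                      # axiom (d): 1 acts as the zero vector
assert add(x, 1.0 / x) == 1.0                # axiom (e): 1/x is the negative of x
assert smul(k, smul(l, x)) == smul(k * l, x)             # axiom (g)
assert smul(k, add(x, y)) == add(smul(k, x), smul(k, y)) # axiom (h)
assert smul(k + l, x) == add(smul(k, x), smul(l, x))     # axiom (i)
assert smul(1.0, x) == x                     # axiom (j)
print("all sampled axioms hold")
```

Of course, a handful of numerical checks only illustrates the axioms; verifying them for all x, y, k, l is a short algebraic exercise with laws of exponents.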
Theorem 2.5.3 If W is a nonempty subset of a vector space V, then W is a subspace of V if and only if it is closed under the vector addition and scalar multiplication of V; that is, if and only if
- whenever u and v are in W, u + v is also in W;
- for each real number k, whenever u is in W, ku is also in W.

Proof Axioms (b), (c), (g), (h), (i), and (j) follow automatically from the fact that V is a vector space. Axioms (d) and (e) follow from closure under scalar multiplication, together with points (iii) and (iv) stated after our definition of a vector space above. []

More Examples of Subspaces

1. Let V = R^3 with the standard operations, and let W be the plane through the origin with normal vector (1, -3, 2). Then W is a subspace of V.

Check: W = {(x, y, z) : x - 3y + 2z = 0} = {(x, y, z) : (x, y, z) . (1, -3, 2) = 0}.
If (x_1, y_1, z_1), (x_2, y_2, z_2) are in W, then (x_1, y_1, z_1) + (x_2, y_2, z_2) is in W, since
((x_1, y_1, z_1) + (x_2, y_2, z_2)) . (1, -3, 2) = (x_1, y_1, z_1) . (1, -3, 2) + (x_2, y_2, z_2) . (1, -3, 2) = 0 + 0 = 0.
Similarly, for any k in R and (x, y, z) in W, k(x, y, z) is in W, since
(k(x, y, z)) . (1, -3, 2) = k((x, y, z) . (1, -3, 2)) = k0 = 0.

Remark 2.5.4 Clearly we could have done the same thing for any normal vector, and so we can say that every plane in R^3 through the origin is a subspace of R^3. Of course, any plane that does not contain the origin cannot be a subspace, since the origin is in fact the zero vector of R^3.

2. Any line through the origin in R^n is a subspace of R^n.

3. W = {f in F([0,1]) : f(0) = 0} is a subspace of F([0,1]).

4. The set of trace-zero matrices, SL_2(R) = {A in M_{2,2}(R) : tr(A) = 0}, is a subspace of M_{2,2}(R). (The trace of a matrix A, written tr(A), is the sum of the diagonal entries of A. For example on M_{2,2}(R), tr [a b; c d] = a + d, writing a 2x2 matrix as [row 1; row 2].)

Exercise 2.5.5 Choose your favourite vector space and pick any subset that you like. Test whether it is a subspace.

2.6 Spanning Sets

Let's return to our plane through the origin W = {(x, y, z) : x - 3y + 2z = 0} in R^3 above.
We can rewrite the elements of this space as follows:

W = {(x, y, z) : x - 3y + 2z = 0}
  = {(x, y, z) : x = 3y - 2z}
  = {(3y - 2z, y, z) : y, z in R}
  = {y(3, 1, 0) + z(-2, 0, 1) : y, z in R}.

What we have done is write every vector in this vector space W as a sum of scalar multiples of particular vectors (in this case (3, 1, 0) and (-2, 0, 1)) in W.

Definitions 2.6.1
(1) Suppose that v_1, v_2, ..., v_m are vectors in a vector space V. We say that a vector v in V is a linear combination of v_1, v_2, ..., v_m if there are scalars a_1, a_2, ..., a_m such that v = a_1 v_1 + a_2 v_2 + ... + a_m v_m.
(2) Given a set of vectors {v_1, v_2, ..., v_m} in a vector space V, we define the set of all linear combinations of these vectors to be their span, i.e.
Span{v_1, v_2, ..., v_m} = {a_1 v_1 + a_2 v_2 + ... + a_m v_m : a_1, a_2, ..., a_m in R}.
(3) If U = Span{v_1, v_2, ..., v_m}, we say that {v_1, v_2, ..., v_m} is a spanning set for U.

Example 2.6.2 For W = {(x, y, z) : x - 3y + 2z = 0} as above, we have W = Span{(3, 1, 0), (-2, 0, 1)}.

Example 2.6.3 T = {A in M_{2,2}(R) : A^T = A}. Every element of this vector space is of the form [a b; b d], for some a, b, d in R, and so
T = Span{ [1 0; 0 0], [0 1; 1 0], [0 0; 0 1] }.

Example 2.6.4 Every element of SL_2(R) is of the form [a b; c -a], for some a, b, c in R, and so
SL_2(R) = Span{ [1 0; 0 -1], [0 1; 0 0], [0 0; 1 0] }.

Example 2.6.5 D_2 = { [a 0; 0 b] : a, b in R } = Span{ [1 0; 0 0], [0 0; 0 1] }.

Non-uniqueness of spanning sets
For a vector space V, a spanning set is not unique. In fact, there are infinitely many different spanning sets for a given vector space. For example, for W above we could equally have proceeded as follows:

W = {(x, y, z) : x - 3y + 2z = 0} = {(x, y, z) : y = (1/3)x + (2/3)z}
  = {(x, (1/3)x + (2/3)z, z) : x, z in R} = {x(1, 1/3, 0) + z(0, 2/3, 1) : x, z in R},

and so also W = Span{(1, 1/3, 0), (0, 2/3, 1)}. Indeed, any pair of non-parallel vectors lying in the plane W will form a spanning set for W.

Some general remarks 2.6.7 Suppose that V is a vector space.
(1) If {u_1, u_2, ..., u_m} spans V, and w is any vector in V, then {u_1, u_2, ..., u_m, w} also spans V:
If v is in V, then v = a_1 u_1 + a_2 u_2 + ... + a_m u_m for some scalars a_1, a_2, ..., a_m. Then of course we also have v = a_1 u_1 + a_2 u_2 + ... + a_m u_m + 0w, and so v is a linear combination of {u_1, u_2, ..., u_m, w}.

(2) Suppose {u_1, u_2, ..., u_m} spans V, and that some u_k is a linear combination of the vectors u_1, ..., u_{k-1}, u_{k+1}, ..., u_m. Then {u_1, ..., u_{k-1}, u_{k+1}, ..., u_m} also spans V:
We need to check that whenever v in V is a linear combination of u_1, u_2, ..., u_m, it is also a linear combination of u_1, ..., u_{k-1}, u_{k+1}, ..., u_m. This can easily be seen from the following: writing u_k = b_1 u_1 + ... + b_{k-1} u_{k-1} + b_{k+1} u_{k+1} + ... + b_m u_m for some b_1, ..., b_{k-1}, b_{k+1}, ..., b_m in R,

v = a_1 u_1 + ... + a_{k-1} u_{k-1} + a_k u_k + a_{k+1} u_{k+1} + ... + a_m u_m
  = a_1 u_1 + ... + a_{k-1} u_{k-1} + a_k (b_1 u_1 + ... + b_{k-1} u_{k-1} + b_{k+1} u_{k+1} + ... + b_m u_m) + a_{k+1} u_{k+1} + ... + a_m u_m
  = (a_1 + a_k b_1) u_1 + ... + (a_{k-1} + a_k b_{k-1}) u_{k-1} + (a_{k+1} + a_k b_{k+1}) u_{k+1} + ... + (a_m + a_k b_m) u_m.

A common problem 2.6.8 A question we often need to answer is the following: given a vector space V, and vectors u_1, u_2, ..., u_m in V, when is a given vector w in Span{u_1, u_2, ..., u_m}? Rephrasing: do there exist scalars a_1, a_2, ..., a_m such that w = a_1 u_1 + a_2 u_2 + ... + a_m u_m?

Example 2.6.9 In R^4, let u_1 = (1, 2, -1, 3), u_2 = (2, 4, 1, -2) and u_3 = (3, 6, 3, -7). Is w = (1, 2, -4, 11) in Span{u_1, u_2, u_3}?

We need to try to find scalars a, b and c such that au_1 + bu_2 + cu_3 = w, i.e. we need to solve
(a, 2a, -a, 3a) + (2b, 4b, b, -2b) + (3c, 6c, 3c, -7c) = (1, 2, -4, 11).
This gives us the system of linear equations

a + 2b + 3c = 1
2a + 4b + 6c = 2
-a + b + 3c = -4
3a - 2b - 7c = 11,

which, when we solve by row reduction, we find has infinitely many solutions: a = 3 + t, b = -1 - 2t, c = t, for t in R. Therefore w is in Span{u_1, u_2, u_3}. Taking t = 1 for instance gives us 4u_1 - 3u_2 + u_3 = w.

Example 2.6.10 In P_2(t), let p_1(t) = 1 + 3t + 4t^2, p_2(t) = 2 - t + t^2 and p_3(t) = 3 - 5t - 2t^2. Do we have that q(t) = 4 + 2t + 2t^2 is in the span of {p_1, p_2, p_3}? We need to try to find scalars a, b and c such that ap_1 + bp_2 + cp_3 = q, i.e.
we need to solve

(a + 3at + 4at^2) + (2b - bt + bt^2) + (3c - 5ct - 2ct^2) = 4 + 2t + 2t^2.

This gives us the system of linear equations

a + 2b + 3c = 4
3a - b - 5c = 2
4a + b - 2c = 2.

Solving by row reduction we get

[1  2  3 | 4]      [1 2 3 | 4   ]
[3 -1 -5 | 2]  ->  [0 1 2 | 10/7]
[4  1 -2 | 2]      [0 0 0 | 1   ]

from which we see that the system of equations has no solution. Therefore q is not in Span{p_1, p_2, p_3}.

Remark 2.6.11 Deciding whether a given vector lies in the span of a given set of vectors amounts to solving a system of linear equations, provided that our vector space is "finite dimensional". All of the vector spaces we have considered, and will consider, with the exception of the function spaces, have this property. We will expand on this later.

Theorem 2.6.12 Suppose that V is a vector space and that u_1, u_2, ..., u_k are in V.
(a) U = Span{u_1, u_2, ..., u_k} is a subspace of V.
(b) If W is any subspace of V such that u_1, u_2, ..., u_k are in W, then U is contained in W.

Proof
(a) We just need to check that U is closed under addition and scalar multiplication. If a_1u_1 + a_2u_2 + ... + a_ku_k and b_1u_1 + b_2u_2 + ... + b_ku_k are in U, then their sum,
(a_1 + b_1)u_1 + (a_2 + b_2)u_2 + ... + (a_k + b_k)u_k,
is clearly also in U. Similarly, for any c in R,
c(a_1u_1 + a_2u_2 + ... + a_ku_k) = ca_1u_1 + ca_2u_2 + ... + ca_ku_k
is also in U.
(b) Since W is a subspace, it is closed under addition and scalar multiplication. Therefore, since u_1, u_2, ..., u_k are in W, each scalar multiple a_1u_1, a_2u_2, ..., a_ku_k is in W, and furthermore their sum a_1u_1 + a_2u_2 + ... + a_ku_k is in W; i.e. every element of U is in W. []

Remark 2.6.13 If, given a subset of a vector space, you can easily see that this subset is the span of some set of vectors, then this is enough to deduce that this subset is in fact a subspace.

Question 2.6.14 What are the subspaces of R? Well, we know that {0} and R are subspaces. Any others? Let S be a subspace of R with S not equal to {0}, so that there exists a non-zero vector u in S. Since S is a subspace, it follows that ku is in S for all real numbers k. Therefore S = R, and so {0} and R are the only subspaces.

Question 2.6.15 What are the subspaces of R^2?
We know that {0} and R^2 are subspaces. Again we ask whether there are any others.

Let S be a subspace of R^2 with S not equal to {0}, so that there exists a non-zero vector u = (x_1, x_2) in S. Again, since S is a subspace, it follows that ku is in S for all real numbers k. There are then two possibilities: either S = Span{u}, i.e. the line through the origin with direction vector u, or S is strictly larger than Span{u}. From the first of these possibilities, we see that every line through the origin is also a subspace of R^2. If S is not Span{u}, then there exists a non-zero vector v = (y_1, y_2) in S such that v is not in Span{u}. From part (b) of the above theorem, we have that S contains Span{u, v}. But u and v are a pair of non-parallel vectors in R^2, and so Span{u, v} = R^2. Therefore the only subspaces of R^2 are {0}, all lines through the origin, and R^2 itself.

Question 2.6.16 What are the subspaces of R^3? By similar reasoning, and using the fact that whenever we have three non-coplanar vectors in R^3, every other vector in R^3 can be written as a linear combination of these three vectors, we find that the only subspaces of R^3 are: {0}, all lines through the origin, all planes through the origin, and R^3 itself.

2.7 Linear Dependence and Independence

In the previous section, we have been considering sets of vectors which span a given vector space, in the sense that every vector in this vector space can be expressed as a linear combination of these spanning vectors. In general, there may be many different ways in which a given vector can be expressed as a linear combination of the vectors in a spanning set. We will now begin to study conditions under which each vector can be expressed as a linear combination of spanning vectors in exactly one way.

Definition 2.7.1 Let V be a vector space. We say that the set of vectors {v_1, v_2, ..., v_k} in V is linearly dependent (LD) if there are scalars a_1, a_2, ..., a_k in R, with at least one of them non-zero, such that a_1v_1 + a_2v_2 + ... + a_kv_k = 0.
Conversely, if the only solution to a_1v_1 + a_2v_2 + ... + a_kv_k = 0 is a_1 = a_2 = ... = a_k = 0, then we say that {v_1, v_2, ..., v_k} is linearly independent (LI).

Regarding the point made in the introduction to this section, we draw attention to the following.

Observation 2.7.2 Suppose that {v_1, v_2, ..., v_k} in V is a LI set of vectors, and that w in V is in the span of {v_1, v_2, ..., v_k}. Then the scalars a_1, a_2, ..., a_k in R such that a_1v_1 + a_2v_2 + ... + a_kv_k = w are unique; that is, there is only one way to express w as a linear combination of {v_1, v_2, ..., v_k}.

To see this, suppose a_1v_1 + a_2v_2 + ... + a_kv_k = w and b_1v_1 + b_2v_2 + ... + b_kv_k = w. Then it follows that a_1v_1 + a_2v_2 + ... + a_kv_k = b_1v_1 + b_2v_2 + ... + b_kv_k, i.e.
(a_1 - b_1)v_1 + (a_2 - b_2)v_2 + ... + (a_k - b_k)v_k = 0.
But since {v_1, v_2, ..., v_k} is LI, this means that a_1 - b_1 = a_2 - b_2 = ... = a_k - b_k = 0, i.e. a_1 = b_1, a_2 = b_2, ..., a_k = b_k. []

We now look at examples where we determine whether a given set of vectors is LI or LD.

Example 2.7.3 {(1, 0), (0, 1)} is a LI set in R^2: solving a(1, 0) + b(0, 1) = (0, 0) gives us a = 0, b = 0.

Example 2.7.4 How about the set {(1, 1, 0), (1, 3, 2), (4, 9, 5)} in R^3? We need to solve
a(1, 1, 0) + b(1, 3, 2) + c(4, 9, 5) = (0, 0, 0),
i.e. (a + b + 4c, a + 3b + 9c, 2b + 5c) = (0, 0, 0), i.e. the system of linear equations

a + b + 4c = 0
a + 3b + 9c = 0
2b + 5c = 0,

which we solve by row reduction:

[1 1 4 | 0]      [1 0 3/2 | 0]
[1 3 9 | 0]  ->  [0 1 5/2 | 0]
[0 2 5 | 0]      [0 0 0   | 0]

which has general solution a = (3/2)t, b = (5/2)t, c = -t, for t in R. Thus, this set of vectors is linearly dependent; for example (with t = 2),
3(1, 1, 0) + 5(1, 3, 2) - 2(4, 9, 5) = (0, 0, 0).

Example 2.7.5 The set {(1, -1, 2, 3), (5, 1, 0, 2), (2, 1, -1, 6)} in R^4? We need to solve
a(1, -1, 2, 3) + b(5, 1, 0, 2) + c(2, 1, -1, 6) = (0, 0, 0, 0),
i.e. (a + 5b + 2c, -a + b + c, 2a - c, 3a + 2b + 6c) = (0, 0, 0, 0), i.e. the system of linear equations

a + 5b + 2c = 0
-a + b + c = 0
2a - c = 0
3a + 2b + 6c = 0,

which we solve by row reduction:

[ 1 5  2 | 0]      [1 0 0 | 0]
[-1 1  1 | 0]  ->  [0 1 0 | 0]
[ 2 0 -1 | 0]      [0 0 1 | 0]
[ 3 2  6 | 0]      [0 0 0 | 0]

Therefore there is a unique solution, a = b = c = 0, and so this set of vectors is LI.
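Both the span-membership question of 2.6.8-2.6.9 and the LI/LD tests above reduce to solving linear systems, so they can be checked numerically. The following is a sketch using numpy (with the vectors and signs as printed in the examples above); `lstsq` finds coefficients when they exist, and `matrix_rank` compares the rank against the number of vectors.

```python
# Checking Example 2.6.9 (span membership) and Examples 2.7.4-2.7.5 (LI/LD)
# numerically with numpy.
import numpy as np

# Example 2.6.9: is w in Span{u1, u2, u3}?
u1 = np.array([1.0, 2.0, -1.0, 3.0])
u2 = np.array([2.0, 4.0, 1.0, -2.0])
u3 = np.array([3.0, 6.0, 3.0, -7.0])
w  = np.array([1.0, 2.0, -4.0, 11.0])

A = np.column_stack([u1, u2, u3])            # columns are the spanning vectors
coeffs, _, rank, _ = np.linalg.lstsq(A, w, rcond=None)
assert np.allclose(A @ coeffs, w)            # w IS in the span
assert np.allclose(4 * u1 - 3 * u2 + u3, w)  # the t = 1 solution found above

# Example 2.7.4: {(1,1,0), (1,3,2), (4,9,5)} is LD -- rank < number of vectors.
B = np.array([[1.0, 1.0, 0.0], [1.0, 3.0, 2.0], [4.0, 9.0, 5.0]])
assert np.linalg.matrix_rank(B) == 2
assert np.allclose(3 * B[0] + 5 * B[1] - 2 * B[2], 0)  # the explicit dependency

# Example 2.7.5: rank equals the number of vectors, so the set is LI.
C = np.array([[1.0, -1.0, 2.0, 3.0], [5.0, 1.0, 0.0, 2.0], [2.0, 1.0, -1.0, 6.0]])
assert np.linalg.matrix_rank(C) == 3
```

Note that in Example 2.6.9 the coefficient matrix has rank 2, so `lstsq` returns just one of the infinitely many solutions; uniqueness of coefficients is exactly what linear independence will buy us below.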
Example 2.7.6 The set of matrices { [4 7; 7 9], [1 1; 1 1], [1 2; 3 4], [1 1; 4 5] } in M_{2,2}(R)? We need to solve

a[4 7; 7 9] + b[1 1; 1 1] + c[1 2; 3 4] + d[1 1; 4 5] = [0 0; 0 0],

i.e. [4a+b+c+d, 7a+b+2c+d; 7a+b+3c+4d, 9a+b+4c+5d] = [0 0; 0 0], which produces the system of linear equations

4a + b + c + d = 0
7a + b + 2c + d = 0
7a + b + 3c + 4d = 0
9a + b + 4c + 5d = 0.

If we solve by row reduction, we will see that this system has infinitely many solutions (for example a = 1, b = -2, c = -3, d = 1), and so the set of matrices is LD.

Remark 2.7.7 The procedure for deciding whether a given set of vectors is LI or LD is always the same:
- write down the equation we need to solve, a_1v_1 + a_2v_2 + ... + a_kv_k = 0;
- form the appropriate augmented matrix and solve by row reduction;
- if there are infinitely many solutions, then the set is LD, while if there is a unique solution, i.e. a_1 = a_2 = ... = a_k = 0, then the set is LI.

Proposition 2.7.8 A set {v_1, v_2, ..., v_m} is linearly dependent if and only if at least one of these vectors can be written as a linear combination of the others.

Proof
(=>) Suppose {v_1, v_2, ..., v_m} is LD. Then there is a solution to a_1v_1 + a_2v_2 + ... + a_mv_m = 0 where at least one of the a_i is non-zero. Suppose, without loss of generality, that a_1 is non-zero. Then from a_1v_1 + a_2v_2 + ... + a_mv_m = 0 we have that
v_1 = -(a_2/a_1)v_2 - ... - (a_m/a_1)v_m,
and so we have been able to write v_1 as a linear combination of {v_2, ..., v_m}.
(<=) Suppose, without loss of generality, that v_1 can be written as a linear combination of {v_2, ..., v_m}, i.e. v_1 = b_2v_2 + ... + b_mv_m. From this we see that v_1 - b_2v_2 - ... - b_mv_m = 0, and so {v_1, v_2, ..., v_m} is LD (since a_1v_1 + a_2v_2 + ... + a_mv_m = 0 has a solution where a_1 = 1, which is non-zero). []

Equivalently 2.7.9 This observation can clearly be rephrased by saying that {v_1, v_2, ..., v_m} is linearly independent if and only if no vector in this set can be written as a linear combination of the others.

Observation 2.7.10 Proposition 2.7.8, combined with point (2) of 2.6.7, allows us to deduce that if we have a linearly dependent spanning set, then we can reduce this set in size while retaining the fact that it is a spanning set.
This will soon be of great importance to us, since we will wish to find spanning sets of minimum size.

Proposition 2.7.11 Indeed, we can strengthen the above observation if we assume each of the vectors in our set is non-zero: if {v_1, v_2, ..., v_m}, with m >= 2, is a LD set, and each v_i is non-zero, then some v_k, with k >= 2, is a linear combination of {v_1, ..., v_{k-1}}. (That is, at least one of these vectors is a linear combination of the vectors preceding it in the list.)

Proof {v_1, v_2, ..., v_m} is LD, and so there are scalars a_1, a_2, ..., a_m in R, at least one of which is non-zero, such that a_1v_1 + a_2v_2 + ... + a_mv_m = 0. Choosing the largest k such that a_k is non-zero, we have that a_1v_1 + ... + a_kv_k = 0, and so
v_k = -(a_1/a_k)v_1 - ... - (a_{k-1}/a_k)v_{k-1}.
It remains to verify that k cannot be 1: if k = 1, this would mean that a_1v_1 = 0, which in turn would imply that v_1 = 0. This gives a contradiction, since we are assuming that each v_i is non-zero, and so k >= 2. []

Remarks 2.7.12
- If we have a set of vectors, one of which is the zero vector, then this set is immediately seen to be LD: for any set {0, u_1, u_2, ..., u_l}, clearly a0 + 0u_1 + ... + 0u_l = 0, with any non-zero a, is a solution.
- If v is any non-zero vector in a vector space, the set {v}, containing only the vector v, is linearly independent, since the only solution to av = 0 is a = 0.
- If {u_1, u_2, ..., u_m} is a linearly dependent set of vectors in a vector space V, and w_1, w_2, ..., w_l are any other vectors in V, then the set {u_1, u_2, ..., u_m, w_1, w_2, ..., w_l} is also linearly dependent. This follows since if there are scalars a_1, a_2, ..., a_m in R, with at least one of them non-zero, such that a_1u_1 + a_2u_2 + ... + a_mu_m = 0, then we also have
a_1u_1 + a_2u_2 + ... + a_mu_m + 0w_1 + 0w_2 + ... + 0w_l = 0.

This tells us that if a subset of a set of vectors is LD, then the whole set is LD. Equivalently, if a set of vectors is LI, then every subset of this set also has to be LI.
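The matrix Example 2.7.6 above can be checked mechanically as well: flattening each 2x2 matrix to a vector in R^4 turns the question into exactly the rank test used for R^n. A sketch, with the matrix entries as printed above:

```python
# Example 2.7.6: test linear dependence of four 2x2 matrices by flattening
# them to vectors in R^4 and computing the rank.
import numpy as np

M1 = np.array([[4.0, 7.0], [7.0, 9.0]])
M2 = np.array([[1.0, 1.0], [1.0, 1.0]])
M3 = np.array([[1.0, 2.0], [3.0, 4.0]])
M4 = np.array([[1.0, 1.0], [4.0, 5.0]])

A = np.array([M.flatten() for M in (M1, M2, M3, M4)])  # rows are flattened matrices
assert np.linalg.matrix_rank(A) == 3   # rank < 4, so the set is LD

# One explicit dependency, from the solution a = 1, b = -2, c = -3, d = 1:
Z = M1 - 2 * M2 - 3 * M3 + M4
assert np.allclose(Z, 0)
```

The flattening step is harmless here because matrix addition and scalar multiplication act entry by entry, so a dependency among the matrices is the same thing as a dependency among the flattened vectors.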
Question 2.7.13 Given a linearly independent set of vectors {v_1, v_2, ..., v_k} in a vector space V, how can we increase the size of this set, while retaining the property of linear independence for our new set? The answer is given by the following theorem. This too will be of fundamental importance to us, since we will also be looking for linearly independent sets of vectors of maximum size. (Compare this remark with that made in Observation 2.7.10.)

Theorem 2.7.14 Let {v_1, v_2, ..., v_k} be a linearly independent set of vectors in V, and v any other non-zero vector in V. The set {v_1, v_2, ..., v_k, v} is linearly independent if and only if v is not in Span{v_1, v_2, ..., v_k}.

Proof
(=>) If {v_1, v_2, ..., v_k, v} is linearly independent, then by Proposition 2.7.8, no vector in this set is a linear combination of the other vectors, and so in particular v is not a linear combination of v_1, v_2, ..., v_k.
(<=) Considering the equation a_1v_1 + a_2v_2 + ... + a_kv_k + a_{k+1}v = 0 (*), we need to deduce from our assumption of linear independence of {v_1, v_2, ..., v_k} that a_1, a_2, ..., a_k, a_{k+1} must all be 0. If a_{k+1} were non-zero, we would then have
v = -(a_1/a_{k+1})v_1 - (a_2/a_{k+1})v_2 - ... - (a_k/a_{k+1})v_k,
and so v would be in Span{v_1, v_2, ..., v_k}. This is contrary to our assumption, and so a_{k+1} must be zero. Then by (*), we have that a_1v_1 + a_2v_2 + ... + a_kv_k = 0, from which it follows that also a_1 = a_2 = ... = a_k = 0, since {v_1, v_2, ..., v_k} is a LI set. Therefore {v_1, v_2, ..., v_k, v} is linearly independent. []

Therefore, we can increase the size of a linearly independent set, and retain linear independence, if and only if we can find another vector which is not in the span of the original set. In the following examples, we will try to do this for the given linearly independent sets of vectors.

Example 2.7.15 The LI set { [1 1; 0 0], [0 0; 1 0] } in M_{2,2}(R).

Example 2.7.16 The LI set {(1, 4), (3, 5)} in R^2.

Example 2.7.17 The LI set {(1, 3, 1, 2), (2, 5, -1, 3), (1, 3, 7, 2)} in R^4.
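For Example 2.7.17, Theorem 2.7.14 can be illustrated numerically: a candidate vector keeps the set LI precisely when it lies outside the span, which shows up as a jump in rank. A sketch (the sign choice in the second vector and the candidate (1, 0, 0, 0) are illustrative choices, not forced by the example):

```python
# Theorem 2.7.14 in action for Example 2.7.17: appending a vector outside the
# span makes the rank jump, so the extended set is again LI.
import numpy as np

v1 = np.array([1.0, 3.0, 1.0, 2.0])
v2 = np.array([2.0, 5.0, -1.0, 3.0])
v3 = np.array([1.0, 3.0, 7.0, 2.0])
assert np.linalg.matrix_rank(np.array([v1, v2, v3])) == 3   # the set is LI

candidate = np.array([1.0, 0.0, 0.0, 0.0])
extended = np.array([v1, v2, v3, candidate])
# Rank jumps from 3 to 4, so candidate is not in Span{v1, v2, v3}, and by
# Theorem 2.7.14 the extended set is linearly independent.
assert np.linalg.matrix_rank(extended) == 4
```

Had the candidate been, say, v1 + v2, the rank would have stayed at 3 and the extended set would be LD.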
Example 2.7.18 W = {(x, y, z) : x - 3y + 2z = 0} = Span{(3, 1, 0), (-2, 0, 1)} in R^3. Can we extend the linearly independent set {(3, 1, 0), (-2, 0, 1)} to a larger linearly independent set in R^3?

Remark 2.7.19 Any 4 vectors in R^3 are LD. Why? More generally, we have the following theorem, which is very important in the theory of vector spaces:

Theorem 2.7.20 If a vector space V can be spanned by n vectors, and {w_1, w_2, ..., w_m} is a linearly independent set in V, then m <= n.

Proof []

This result can be stated as:
Size of any spanning set for a vector space V >= Size of any LI set in V.

Example 2.7.21 Since we know that R^4 can be spanned by four vectors, for example {(1,0,0,0), (0,1,0,0), (0,0,1,0), (0,0,0,1)}, we can deduce that any five or more vectors in R^4 are linearly dependent.

Example 2.7.22

Example 2.7.23

2.8 Basis and Dimension

What if a spanning set for a vector space V is also linearly independent?

Definition 2.8.1 A set {v_1, v_2, ..., v_n} in a vector space V is a basis for V if
1. {v_1, v_2, ..., v_n} spans V, and
2. {v_1, v_2, ..., v_n} is linearly independent.

Remark 2.8.2 So a basis is a largest linearly independent set, or a smallest spanning set.

We will now find bases for each of the following vector spaces.

Example 2.8.3 {(1, 0), (0, 1)} is a basis for R^2.

Example 2.8.4 Another basis for R^2?

Example 2.8.5 We have seen that the plane through the origin, W = {(x, y, z) : x - 3y + 2z = 0}, is spanned by the set {(3, 1, 0), (-2, 0, 1)}, and that this set is also LI. Therefore {(3, 1, 0), (-2, 0, 1)} is a basis for W.

Example 2.8.6 A basis for the hyperplane through the origin H = {(x, y, z, w) : 4x + y + 5z + 3w = 0}?

Example 2.8.7 A basis for SL_2 = {A in M_{2,2}(R) : tr(A) = 0}?

Example 2.8.8 A basis for R^n?

Question 2.8.9 Do all bases for a given vector space have the same number of vectors? The following theorem is fundamental in vector space theory.

Theorem 2.8.10 If {v_1, v_2, ..., v_n} and {w_1, w_2, ..., w_m} are both bases for a vector space V, then n = m.
…, v_n} and {w_1, w_2, …, w_m} are both bases for a vector space V, then n = m.

Proof □

Definition 2.8.11 If a vector space V has a finite basis {v_1, v_2, …, v_n}, then we say that the dimension of V is n, or that V is n-dimensional. We write dim V = n.

Example 2.8.12 dim R^2
Example 2.8.13 dim R^n
Example 2.8.14 dim SL_2
Example 2.8.15 Dimension of a plane through the origin in R^3
Example 2.8.16 Dimension of the vector space of diagonal 2 x 2 real matrices
Example 2.8.17 dim M_{m,n}(R)

We can now state that:

Size of any spanning set for V ≥ dim V ≥ Size of any LI set in V.

So in general, how can we find bases for vector spaces?

Theorem 2.8.18 Suppose that V is a vector space, and that dim V = n. Then
1. Any LI set {v_1, v_2, …, v_n} in V is a basis, (i.e. it automatically spans V)
2. Any spanning set {w_1, w_2, …, w_n} in V is a basis, (i.e. it is automatically LI)
3. If {v_1, v_2, …, v_m} is a LI set in V, then there are vectors v_{m+1}, v_{m+2}, …, v_n in V such that {v_1, v_2, …, v_m, v_{m+1}, v_{m+2}, …, v_n} is a basis for V, (i.e. every LI set can be extended to a basis)
4. If {w_1, w_2, …, w_l} spans V, then there is a subset of this set which forms a basis for V.

Proof 1. 2. 3. 4. □

So this theorem tells us that if we know that the dimension of our vector space is n, say, then to find a basis it suffices to find a spanning set with n vectors, or a linearly independent set with n vectors.

Example 2.8.19 Does the set of vectors {(1, 2, 5), (2, 5, 1), (1, 5, 2)} form a basis for R^3?

Example 2.8.20 The pair of vectors {(1, 1, 1), (1, 2, 3)} is LI in R^3. Extend this set to a basis for R^3.

Question 2.8.21 Suppose that W is a vector space with dimension n, and U is a subspace of W. What can we say about the dimension of U?

Theorem 2.8.22 If W is a vector space with dimension n, and U is a subspace of W, then
1. dim U ≤ dim W
2. dim U = dim W if and only if U = W

Proof 1. 2. □

Observations 2.8.23 So, for example, any 3-dimensional subspace of R^3 has to be R^3 itself.
every subspace of R 5 , for example, has either dimension 0,1,2,3,4 or 5; where we say that a space has dimension 0 when it is the space containing only the zero vector. Any 4 -dimensional subspace of M 2,2 (R ) has to be M 2,2 (R ) itself. 2.9 Coordinate Vectors Question 2.9.1 What are bases good for? Well to begin with, recall that in Observation 2.7.2 we saw that if {v1 , v 2 , , v k } is a LI set of vectors in a vector space V , and if w V is in the span of {v1 , v 2 , , v k } , then the scalars a1 , a2 , , ak R such that a1v1 a2v 2 akv k w are unique; that is, there is only one way to express w as a linear combination of {v1 , v 2 , , v k } . 25 Therefore, suppose that V is an n -dimensional vector space, and that B={v1 , v 2 , , v n } is a basis for V , with the order of the vectors in this basis fixed. Then each v V we can write uniquely in the form a1v1 a2v 2 anv n v . The scalars a1 , a2 , , an R are called the coordinates of v with respect to the basis B . These clearly form a vector (a1 , a2 , , an ) R n , called the coordinate vector of v with respect to B , which we denote by v B , that is, v B (a1, a2 , , an ) R n . Example 2.9.2 In Example 2.8.19 we saw that the vectors B {(1, 2,5), (2,5,1), (1,5, 2)} form a basis for R 3 . Find v B , where v (5, 21, 22) . By row reduction we solve a(1, 2,5) b(2,5,1) c (1,5, 2) (5, 21, 22) , and find that the (necessarily unique) solution is a 3, b 1, c 4 , and so (5, 21, 22)B (3, 1, 4) . Example 2.9.3 In Example 2.8.7 we saw that an example of a basis for SL2 is 1 0 0 1 0 0 6 4 B , , . Find w B , where w 11 6 . 0 1 0 0 1 0 It is easy to see immediately that w B (6, 4,11) R3 . Example 2.9.4 In R n , for each i 1, , n , let ei (0, , 0,1, 0, , 0) R n be the vector with all entries zero, except the ith entry which is one. Then the basis E {e1 , e2 , , en } is called the standard basis for R n . Clearly for any vector v (a1 , a2 , , an ) R n we have that v E is just v itself. 
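Examples 2.8.19 and 2.9.2 can be checked numerically: n vectors form a basis of an n-dimensional space exactly when the matrix having them as columns is invertible, and the coordinate vector [v]_B is then the unique solution of a linear system. The sketch below uses the vectors as printed here; some signs in the original notes may differ, but the method is unaffected.

```python
import numpy as np

# Example 2.8.19: n vectors in an n-dimensional space form a basis iff the
# matrix with those vectors as columns is invertible (Theorem 2.8.18).
B = [(1, 2, 5), (2, 5, 1), (1, 5, 2)]        # signs as printed here
M = np.column_stack([np.array(b, dtype=float) for b in B])
print(abs(np.linalg.det(M)) > 1e-12)         # True: a basis for R^3

# Example 2.9.2: the coordinate vector [v]_B solves M a = v, uniquely.
v = np.array([5., 21., 22.])
a = np.linalg.solve(M, v)
print(np.allclose(M @ a, v))                 # True: the coordinates rebuild v
```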
Example 2.9.5 In P2 (t ) , the polynomials p1 (t ) t 1 , p2 (t ) t 1 and p3 (t ) t 2 2t 1 form a basis B . Find pB , where p(t ) 2t 2 5t 9 . Example 2.9.6 In Pn (t ) , the set of polynomials {p0 (t ) 1, p1 (t ) t , p2 (t ) t 2 , p(t ) ao a1t a2t 2 pB (ao , a1, a2 , , pn (t ) t n } , forms a basis B . For any polynomial an t n , it is again then immediately clear that , an ) R n1 . 26 Conclusion 2.9.7 So in general, we have the following picture. Fix an ordered basis B={v1 , v 2 , , v n } for our vector space V . Then we have a one-to-one correspondence between vectors in V and vectors in R n : v x1v1 x2v 2 xnv n V v B =(x1 , x2 , , xn ) R n . (Different vectors in V have different coordinate vectors, with respect to B , in R n ; and every (a1 , a2 , , an ) R n is the coordinate vector, with respect to B , of some vector v V .) Moreover, this correspondence preserves the vector space operations in the sense that if v V v B =(x1 , x2 , , xn ) R n and w V w B =(z1 , z2 , , zn ) R n , then v w V (x1 z1 , x 2 z2 , , x n zn ) R n , that is, v w B v B w B ; and that for k R , kv B k v B . This tells us that whenever we have an n -dimensional vector space V , irrespective of how strange it might look, it behaves in exactly the same way as R n . Two vector spaces which behave in the same way in this sense are said to be isomorphic, and a correspondence such as that above is called an isomorphism. We will talk more about this, and make it more precise, when we cover the subject of “Linear Transformations” next semester. 2.10 Applications of Row Reduction Let A M m ,n (R ) . Then the rows of A , v1 , v 2 , , v m , each consisting of n real numbers, can be thought of as vectors of R n . Similarly, the columns of A , w1 , w 2 , , w n , can be thought of as vectors in R m . Definition 2.10.1 We define the Row Space of A , denoted Row ( A ) , to be the subspace of R n spanned by the rows of A , Row ( A) Span {v1 , v 2 , , v m } R n . 
Similarly, the column space of A , denoted Col ( A ) , is defined as Col ( A) Span {w1 , w 2 , , w m } R m . 27 Question 2.10.2 Suppose {w1 , w 2 , , w m } is a linearly dependent set in R n . How can we find a basis for W Span {w1 , w 2 , , w m } , and hence find the dimension of W ? (Compare this with Question 4 of Week 4 Exercises.) There are just two ingredients we need to answer this question: 1. The application of elementary row operations to a matrix has no effect on the row space of that matrix, (that is, if A and B are row equivalent matrices, then Row ( A) Row (B ) .) 2. The non-zero rows of a matrix in row echelon form are linearly independent. Part 1. is simply an immediate consequence of the fact that interchanging two vectors in a spanning set makes no difference to the span of that set, and that for any non-zero k R , Span {w1 , w 2 , ,wi , , w m } Span {w1 , w 2 , , kw i , , w m} and Span {w1 , w 2 , , w i 1 , w i , w i 1 , , w m } Span {w1 , w 2 , , w i 1 , w i kw j , w i 1 , ,w m} . (These two equalities are easily checked and are left as an exercise.) For Part 2., remembering that a matrix is in row echelon form if all zero rows are at the bottom, and the first non-zero entry in each row occurs at least one position to the right of the first non-zero entry in the preceding row, it is then straightforward to see that the non-zero rows form a LI set of vectors. Therefore to answer Question 2.10.2, it follows that all we need to do is write the vectors in our set, {w1 , w 2 , , w m } , as the rows of an m x n matrix A , and reduce to row echelon form. Then the non-zero rows of A will form a basis for Row ( A ) , i.e. Span {w1 , w 2 , , w m } ; and of course the number of non-zero rows gives the dimension of Span {w1 , w 2 , , w m } . Example 2.10.3 Find a basis for W Span {(0,0,3,1, 4),(1,3,1, 2,1),(3,9, 4,5, 2),(4,12,8,8,7)} R 5 , and hence state the dimension of W . 28 Question 2.10.4 What does this tell us about our original spanning set for W ? 
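The two ingredients above translate directly into an algorithm: put the given vectors as the rows of a matrix, reduce to echelon form, and read off the non-zero rows. A minimal sketch follows; the spanning set used is a hypothetical one (the third row is the sum of the first two), not the set of Example 2.10.3.

```python
import numpy as np

def row_space_basis(vectors):
    """Reduce the matrix with the given rows to (reduced) echelon form;
    the non-zero rows are a basis for the span, since row operations
    preserve the row space (ingredient 1) and non-zero echelon rows
    are linearly independent (ingredient 2)."""
    M = np.array(vectors, dtype=float)
    r, rows, cols = 0, M.shape[0], M.shape[1]
    for c in range(cols):
        pivot = next((i for i in range(r, rows) if abs(M[i, c]) > 1e-12), None)
        if pivot is None:
            continue
        M[[r, pivot]] = M[[pivot, r]]      # move the pivot row up
        M[r] = M[r] / M[r, c]              # scale the pivot entry to 1
        for i in range(rows):
            if i != r:
                M[i] -= M[i, c] * M[r]     # clear the rest of the column
        r += 1
    return M[:r]

# Hypothetical dependent spanning set in R^4: row 3 = row 1 + row 2.
basis = row_space_basis([(1, 1, 1, 1), (1, 2, 3, 2), (2, 3, 4, 3)])
print(len(basis))   # 2: the span is 2-dimensional
```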
Example 2.10.5 Do the same for U Span {(1,1,1,1),(1, 2,3, 2),(2,5,6, 4),(2,6,8,5)} R 4 . (This is the set of Question 4, Week 4 Exercises.) Question 2.10.6 What does this tell us about our original spanning set for U ? Important Observation 2.10.7 We can therefore use this approach to easily determine if a given set of vectors is LI or LD: Theorem 2.10.8 A set of vectors {w1 , w 2 , , w m } in R n is LI if and only if, when we reduce the matrix with rows w1 , w 2 , , w m to echelon form, there are no zero rows. Proof Let W Span {w1 , w 2 , () ,w m} . () □ 29 Question 2.10.9 Given linearly independent vectors {w1 , w 2 , , w m } in R n , so that necessarily m n , how can we use row reduction to easily extend this set to a basis for V ? (Compare this with Question 3, Week 4 Exercises.) The answer to this is best illustrated with an example. Example 2.10.10 Extend the LI set {(1, 2, 0, 4, 0), (3,8,5, 4,1), (0, 2,5,1, 0)} to a basis for R5 . Question 2.10.11 Suppose {u1 , u2 , , um } is a linearly dependent set in R n . How can we find a basis for U Span {u1 , u2 , , um } , consisting only of vectors from the set {u1 , u2 , , um } ? Again, this is best answered with an example. Example 2.10.12 Let U Span {u1 , u 2 , u3 , u 4 , u5 } R 4 , where u1 (1,1,3, 2), u2 (2, 2,6, 4), u3 (1, 2,5,1), u4 (0,1, 2, 1), u5 (1,3,7,0). Find a basis for U , consisting only of vectors from {u1 , u2 , u3 , u4 , u5} . 30 Example 2.10.13 Let’s do the same for U Span {u1 , u 2 , u3 , u 4 , u5 } R 4 , where u1 (1, 2,0,3), u2 (2, 5, 3,6), u3 (0,1,3,0), u4 (2, 1, 4,7), u5 (5, 8,1, 2). 31 2.11 Kernel, Range and Rank of a Matrix, and Invertibility These are terms that are more generally applied to arbitrary linear transformations between vector spaces. For now, though, we specialize simply to the case of matrices. Kernel: Let A M m ,n (R ) . In the following, we will think of elements of R n as column vectors, that is, nx1 real matrices, x1 x n x R , x 2 . 
xn 0 0 n (So the zero vector in R will be 0 .) 0 Consider the matrix equation, Ax 0 . Definition 2.11.1 The Kernel of A , (also called the null space of A ), is defined to be Ker (A) {x R n : Ax 0} . This is simply the set of solutions to the homogeneous system of linear equations Ax 0 , and so is clearly a subset of R n . Proposition 2.11.2 Ker (A) is a subspace of R n . Proof □ 32 Question 2.11.3 Suppose b R n is non-zero. Is {x R n : Ax b} a subspace of Rn ? Question 2.11.4 How can we find a basis for Ker ( A ) ; and hence determine its dimension? (The dimension of Ker ( A ) is also called the nullity of A .) 1 2 2 2 1 Example 2.11.5 A 1 2 1 3 2 M3,5 (R) . Ker ( A) ? 2 4 7 1 1 1 2 We just need to solve the augmented matrix, 1 2 2 4 the solutions are, x1 2r 4s 3t , x 2 r , x3 s t , we have that 2 2 1 0 1 3 2 0 . We find that 7 1 1 0 x 4 s, x5 t , for t R . Thus Ker ( A) Span These vectors are clearly linearly independent, and so dim Ker ( A) 3 . Remark 2.11.6 A spanning set for Ker ( A ) obtained in this way will always automatically be linearly independent. We therefore obtain the following theorem. Theorem 2.11.7 If A M m ,n (R ) , a spanning set for Ker ( A ) obtained in the manner above always forms a basis for Ker ( A ) . Moreover, 33 dim Ker ( A) number of parameters needed in general solution of Ax 0 . Observations 2.11.8 For A M m ,n (R ) , Ker ( A ) is a subspace of R n , and 0 dim Ker ( A) n dim Ker ( A) 0 if and only if Ker ( A) 0 ; that is, if and only if x1 x 2 xn 0 is the only solution to Ax 0 dim Ker ( A) n if and only if A 0 ; that is, if and only if A is the zero matrix. Range and Rank: Definition 2.11.9 Let A M m ,n (R ) . We define the Range of A , (or image of A ), to be Ran ( A) {Ax : x R n } . Clearly this is a subset of R m . Proposition 2.11.10 Ran ( A ) is a subspace of R m . This is clear, since in fact, Ran ( A ) is nothing but the column space of A , Col ( A) Span {c1 , c 2 , , cn } , where c1 , c 2 , , c n are the columns of A . 
To see this, a11 Ax a m1 x a11 a12 a1n 1 a21 a22 x2 x x1 2 amn xn am1 am 2 a1n a2 n . xn amn Definition 2.11.11 Let A M m ,n (R ) . The rank of A is defined to be the dimension of the range of A , that is, Rank ( A) dim Ran ( A ) . Question 2.11.12 How can we find a basis for Ran (A) ; and hence determine its dimension? Well, since Ran ( A) Col ( A ) , we just need to find a basis for Span {c1 , c 2 , , cn } . This can be done using either the method given as an solution to Question 2.10.2 or that given as an solution to Question 2.10.11. (In general though, the latter will be preferable, since it involves row reducing the same matrix that is row reduced to compute Ker ( A ) - so time would be saved if it’s required to compute bases for both kernel and range of a matrix A .) 34 Remark 2.11.13 From the worked examples illustrating solutions to Questions 2.10.2 and 2.10.11, for A M m ,n (R ) we see that the number of leading non-zero entries in any echelon form for A , equals dim Col ( A) dim Ran ( A ) . Moreover, this is clearly equal to, n number of parameters needed in general solution of Ax 0 . (We introduce a parameter for each column in which there is not a leading non-zero entry.) This leads us to the following fundamental theorem in linear algebra: Theorem 2.11.14 (Conservation of Dimension) If A M m ,n (R ) , then Nullity of A + Rank of A n , That is, dim Ker ( A) dim Ran ( A ) n . (This is also known as the Rank-Nullity Theorem.) Example 2.11.15 Let A M 4,5 (R ) be the matrix 1 1 A 2 3 3 1 2 3 4 3 1 4 . 3 4 7 3 8 1 7 8 (i) Find a basis for Ker ( A ) (ii) Find a basis for Ran ( A ) . 1 0 (i) Reducing to echelon form we obtain 0 0 3 1 2 3 1 2 1 1 , and in the same way as 0 0 0 0 0 0 0 0 in Example 2.11.5, we find that a basis for Ker ( A ) is {(5, 1, 0,1, 0), (5, 2,1, 0, 0), (0,1, 0, 0,1)} . (ii) In our echelon matrix of part (i), we see that columns 1 and 2 have leading nonzero entries, and so {(1,1, 2,3), (3, 4,3,8)} is a basis for Ran ( A ) . 
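The Rank-Nullity Theorem can be verified numerically. The sketch below is an illustration only: it uses a hypothetical 3 x 5 matrix (not the matrix of Example 2.11.15), and extracts a basis for Ker(A) from the singular value decomposition rather than by the row-reduction method of the notes.

```python
import numpy as np

def null_space_basis(A):
    """Orthonormal basis for Ker(A) from the SVD: the right singular
    vectors whose singular values are (numerically) zero satisfy Av = 0."""
    _, s, Vt = np.linalg.svd(A)
    rank = int(np.sum(s > 1e-10))
    return Vt[rank:]          # one row per basis vector of Ker(A)

# A hypothetical 3x5 matrix whose third row is the sum of the first two,
# so Rank(A) = 2 and, by rank-nullity, dim Ker(A) = 5 - 2 = 3.
A = np.array([(1., 2., 0., 1., 3.),
              (2., 4., 1., 0., 1.),
              (3., 6., 1., 1., 4.)])
N = null_space_basis(A)
rank = np.linalg.matrix_rank(A)
print(rank, N.shape[0], rank + N.shape[0])   # 2 3 5: nullity + rank = n
print(np.allclose(A @ N.T, 0))               # True: each basis vector is in Ker(A)
```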
Note 2.11.16 We see that dim Ker(A) + dim Ran(A) = 3 + 2 = 5.

We end this section with a useful theorem listing conditions equivalent to the invertibility of an n x n matrix.

Invertibility:

Definition 2.11.17 A matrix A ∈ M_{n,n}(R) is said to be invertible if there exists a matrix B ∈ M_{n,n}(R) such that AB = I_n = BA, where I_n is the n x n identity matrix. The matrix B, automatically unique, is denoted by A^{-1} and is called the inverse of A.

Theorem 2.11.18 For A ∈ M_{n,n}(R), the following conditions are all equivalent.
1. A is invertible
2. For each b ∈ R^n, there is a unique solution to the matrix equation Ax = b
3. Ker(A) = {0}
4. Ran(A) = R^n
5. Rank(A) = n
6. We can apply row operations to A and obtain I_n
7. The rows of A form a linearly independent set of vectors in R^n
8. The columns of A form a linearly independent set of vectors in R^n
9. The rows of A form a basis for R^n
10. The columns of A form a basis for R^n
11. For x and y ∈ R^n, if Ax = Ay, then x = y, (in this case we say that A is one-to-one or injective).

Proof □

3. Inner Product Spaces and Normed Spaces

Recall that in Section 1, based on the notion of length (norm) for vectors in R^2 and R^3, we introduced the Euclidean length (norm) for vectors in R^n, for arbitrary n. However, when, in Section 2, we began to talk about abstract vector spaces, the notion of length did not play any part. In this section we will give our vector spaces additional structure, in such a way that the concept of length can be defined.

3.1 Inner Product Spaces

Just as we took the most important properties of vector addition and scalar multiplication for vectors in R^2 and R^3 and used them as a foundation for our definition of an abstract vector space, we likewise take the most important properties enjoyed by the dot (inner) product in R^2 and R^3 and use them to define a general inner product on a vector space.

Definition 3.1.1 Let V be a vector space.
Suppose that to each pair of vectors u , v V we can assign a real number, denoted by u, v . (So we have a function, u, v R .) This function is called an inner product on V if it satisfies each of the following axioms, for all u , v , w V and a, b R : u , v V x V I1 au bv , w a u, w b v , w (linearity in first position) I 2 u, v v , u (symmetric property) I 3 u, u 0 , and u, u 0 u 0 (positive definite property). A vector space V , endowed with an inner product, is called an inner product space. (Compare I1 , I 2 and I 3 with properties (a), (c) and (d) of the Euclidean Inner Product stated on page 3.) Observations 3.1.2 By I1 and I 2 , we also have that u, av bw a u, v b u, w for all u , v , w V , a, b R . The inner product of a linear combination of vectors is a linear combination of inner products of the vectors: a u , b v i i i j j j ai b j ui , v j . i,j Example 3.1.3 5u1 4u2 ,3v1 2v 2 6v 3 15 u1 , v1 10 u1 , v 2 30 u1 , v 3 12 u2 , v1 8 u2 , v 2 24 u2 , v 3 . Examples of Inner Product Spaces 3.1.4 1. R n with the Euclidean inner product. 37 2. V M m ,n (R ) , with inner product defined as follows: for each A, B M m ,n (R ) , define A, B tr (B T A) . 1 3 2 1 3 1 e.g. if A , B M 2,3 (R) , then 6 1 0 2 4 4 A, B tr (B T A) 1 2 13 5 2 1 3 2 tr 3 4 tr 27 13 6 13 13 2 28 . 6 1 0 7 4 2 1 1 3. V C ([a, b]) the set of all continuous functions on the closed interval [a, b] , with inner product defined as follows: for each f , g C ([a, b]) , define b f , g f (t )g (t )dt . a This is called the standard inner product on C ([a, b]) . e.g. if f , g C ([0,1]) are the functions f (t ) 5t 3 and g (t ) t 2 , then 1 1 f , g 5t 3 3t 2dt . 
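The matrix inner product of Examples 3.1.4 can be checked numerically: tr(B^T A) is just the sum of the entrywise products of A and B, which makes the axioms I2 and I3 visible at once. The matrices below are hypothetical illustrations, not the worked example from the notes.

```python
import numpy as np

# The inner product <A, B> = tr(B^T A) on M_{m,n}(R) equals the sum of the
# entrywise products of A and B; the matrices here are hypothetical.
def ip(A, B):
    return np.trace(B.T @ A)

A = np.array([(1., 3., 2.), (6., 1., 0.)])
B = np.array([(2., 1., 3.), (1., 2., 4.)])
print(ip(A, B) == np.sum(A * B))   # True: tr(B^T A) = sum_ij a_ij * b_ij
print(ip(A, B) == ip(B, A))        # True: symmetry, axiom I2
print(ip(A, A) == np.sum(A * A))   # True: so <A, A> >= 0, axiom I3
```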
0 4 Verification that A, B tr (B T A) does define an inner product on M m ,n (R ) I1 aA bB,C tr (C T[aA bB]) tr (aC T A bC TB) atr (C T A) btr (C TB) a A, C b B, C , where the second equality follows by elementary properties of matrix algebra, and the third equality follows since tr (kS lT ) ktr (S ) ltr (T ) , for all S ,T M n ,n (R ) and k , l R - check this for yourselves! I 2 A, B tr (B T A) tr (B T A) T tr ( A TB ) B, A , where the second equality follows since the trace of any matrix S is equal to the trace of the transpose, S T , of S . I 3 Let’s look at the i , j th entry of the n x n matrix A T A , that is ( AT A)i , j : Simply according to matrix multiplication, m m k 1 k 1 ( AT A) i , j (AT ) i ,k ( A) k , j ak ,i ak , j . Therefore, since the trace of a matrix is the sum of the diagonal entries, n n m n m tr ( A A) ( A A) r ,r ak ,r ak ,r (ak ,r ) 2 T T r 1 r 1 k 1 r 1 k 1 is the sum of the squares of all of the entries of the matrix A , and so clearly 38 tr ( AT A) 0 for all A M m ,n (R ) , and tr ( AT A) 0 if and only if each entry of A is zero, that is, A 0 . 3.2 Normed Spaces Using the properties of the Euclidean norm listed on page 4., we define the following. Definition 3.2.1 Let V be a vector space. Suppose that to each vector v V we can assign a real number, denoted by v . (So we have a function, v V v R .) This function is called a norm on V if it satisfies each of the following axioms, for all u , v V and k R : N1 v 0 , and v 0 v 0 N 2 kv k v N3 u v u v A vector space V , endowed with a norm, is called a normed space. Example 3.2.2 R n with the Euclidean norm. Remark 3.2.3 Recall that in R n , Euclidean norm and Euclidean inner product were related as follows: v v , v 1 2 , for each v R n . Indeed, whenever V is an inner product space, with inner product u, v , V will also be a normed space, with the norm of each v V defined as v v , v 1 2 . 
To prove this, we need to show that each of the axioms N1 , N 2 and N 3 , can be deduced from the axioms I1 , I 2 and I 3 , 1 where v v , v 2 . To do this, we will need to make use of the Cauchy-Schwarz inequality, which we now state and prove: Theorem 3.2.4 (Cauchy-Schwarz Inequality) Let V be an inner product space. For any vectors u , v V , we have u, v u, u 1 2 v ,v 1 2 . Proof Let t be any real number. Then, by I 3 , tu v , tu v t 2 u, u 2t u, v v , v 0 . 39 That is, the quadratic, at 2 bt c 0 , where a u, u , b 2 u, v , and c v , v . Therefore, b 2 4ac 0 , that is, 4 u , v 2 4 u , u v , v , and so u, v u, u 1 2 v ,v 1 2 . □ Now we can prove the following theorem. Theorem 3.2.5 Let V be an inner product space, with inner product , . Then V is 1 2 a normed space, with norm defined as v v , v , for each v V . Proof N1 follows immediately from I 3 . For N 2 , we see that kv 2 kv , kv k 2 v , v k 2 v , and so kv k v . For N 3 , 2 u v 2 u v ,u v u 2 u, v v 2 2 u 2 u, v v 2 2 u 2 u v v 2 2 ( u v )2 , □ and taking square roots gives us the required inequality. Examples 3.2.6 1. V M m ,n (R ) , with inner product, A, B tr (B T A) . Then a norm can be defined on M m ,n (R ) by: A A, A 1 2 1 2 tr ( A A) . T 3 1 2 1 M 4,2 (R ) , then e.g. if A 4 0 1 5 1 A tr ( A A) T 1 2 3 1 2 1 30 4 2 3 2 4 1 2 1 tr tr 57 . 1 1 0 5 4 0 4 27 1 5 b 2. V C ([a, b]) , with inner product, f , g f (t )g (t )dt . Then a norm can be a defined on C ([a, b]) by: f f , f 1 2 f (t )f (t )dt . b 1 2 a e.g. if f C ([0,1]) , with f (t ) 3t 5 , then f f ,f 1 2 9t 1 0 2 30t 25dt 1 2 13 . 40 Remark 3.2.7 There are normed spaces which are not necessarily inner product spaces. That is, we can have a norm on a vector space V , which has not necessarily been induced by an inner product on V . Example 3.2.8 Let V R n . Then as well as the Euclidean norm, we can also define other norms on V . 
For example, the so called infinity-norm on R n is defined as follows: for (a1 , a2 , , an ) R n , define (a1 , a2 , , an ) max { a1 , a2 , , an } . e.g. for (3, 9, 6, 2) R 4 , (3, 9,6, 2) 9 , that is, the maximum from the set of non-negative real numbers {3,9, 6, 2} . A further example of a norm on R n is the one-norm 1 , defined as follows: for (a1 , a2 , , an ) R n , define (a1 , a2 , , an ) 1 a1 a2 an . e.g. for (3, 9, 6, 2) R 4 , (3, 9,6, 2) 1 3 9 6 2 20 . Exercise 3.2.9 Verify that 1 and do indeed define norms on V R n . Example 3.2.10 Let V C ([a, b]) . Then in addition to the norm induced by the standard inner product on C ([a, b]) , we also have the following two norms: the one-norm, defined as b f 1 f (t ) dt , a the infinity-norm, defined as f max f (t ) . 3.3 Orthogonality Definition 3.3.1 Let V be an inner product space. Vectors u and v in V are said to be orthogonal, and u is said to be orthogonal to v , if u, v 0 . Observations 3.3.2 This relation is symmetrical, in that if u is orthogonal to v , then v is orthogonal to u . The zero vector is orthogonal to every vector: 0, v 0u, v 0 u, v 0 , for all v V , and any u V . The zero vector is the only vector with this property: suppose u V is a vector orthogonal to every vector in V . Then in particular it is orthogonal to itself, and so u, u 0 . This means that u has to be the zero vector. 41 Definition 3.3.3 Let V be an inner product space, and let S be any subset of V . The orthogonal complement of S , denoted by S and read as “ S perp”, is defined to be the set of all vectors in V that are orthogonal to every vector in S ; that is, S {v V : v , u 0, for every u S} . Proposition 3.3.4 S is a subspace of V . Proof Let v , w S and k R . Then, for each u S , v w , u v , u w , u 0 0 0 , and kv , u k v , u k (0) 0 , so that v w and kv S . □ Example 3.3.5 Let V R 4 with the Euclidean inner product. Find any non-zero vector in R 4 , orthogonal to each of the vectors, (2,1,3, 0), (4,1, 2, 2), (6, 2,5,1) . 
Let (a, b, c , d ) R 4 be such a vector. Then we have that 2a b 3c 0, 4a b 2c 2d 0, 6a 2b 5c d 0 . So any non-zero solution to this homogeneous system of linear equations, (if it exists), will do. 1 3 , t R , and so for We find that the general solution is d t , c t , b 0, a 4 8 example (3, 0, 2,8) is a non-zero vector orthogonal to each of the above three vectors. From this example, it is not hard to see the following. Observation 3.3.6 Whenever w1 , w 2 , , w m are vectors in R n , if we form the matrix W M m ,n (R ) with rows w1 , w 2 , , w m , we have that (Row (W ) ) Ker (W ) . 3.4 Orthogonal Sets and Bases Definition 3.4.1 Let S {u1 , u2 , , ur } be a set of non-zero vectors in an inner product space V . S is an orthogonal set if each pair of vectors in S is orthogonal, and S is an orthonormal set if it is orthogonal and each each vector is a unit vector. 42 Orthogonal: u i , u j 0 if i j 1 Orthonormal: ui , u j 0 ij ij Theorem 3.4.2 If S is an orthogonal set of non-zero vectors, then S is also linearly independent. Proof Let S {u1 , u2 , , ur } be such a set. We need to solve a1u1 a2u2 ar ur 0 . For each i 1, , r , taking the inner product of ui with each side of the above equation, we get 0 0, ui a1u1 a2u2 ar ur , ui a1 u1 , ui ai ui , ui ar ur , ui ai ui , ui , But each u i 0 , and so ui , ui 0 , which means that for each i , ai 0 and so {u1 , u2 , □ , ur } is linearly independent. Observation 3.4.3 If S {u1 , u2 , u1 , ur } is orthogonal, then ur 2 u1 2 ur 2 . Example 3.4.4 In R 3 , the set E3 {e1 , e2 , e3} , where e1 (1,0,0), e2 (0,1, 0), e3 (0, 0,1) , is orthonormal. Example 3.4.5 In R 3 , the set S {u1 , u 2 , u3} , where u1 (1, 2,1), u2 (2,1, 4), u3 (3, 2,1) , is orthogonal. Moreover, by Theorem 3.4.2, the set is also LI, and is therefore in fact an orthogonal basis for R 3 . 
Remark 3.4.6 We can easily transform an orthogonal set of non-zero vectors into an orthonormal set as follows: If the set S {u1 , u2 , 1 1 u1 , u2 , , ur } is orthogonal, then the set S u2 u1 , 1 ur ur is orthonormal. This process of dividing each vector in an orthogonal set by its norm to produce an orthonormal set is called normalization. Exercise 3.4.7 Normalize the set S from Example 3.4.5. 43 Observation 3.4.8 Given an orthogonal basis {u1 , u2 , , un } for an inner product space V , and given any vector v V , we can easily find the unique scalars a1 , a2 , , an R such that a1u1 a2u2 anun v : To solve a1u1 a2u2 anun v for a1 , a2 , , an R , for each i 1, the inner of ui with each side of this equation and obtain v , ui a1u1 a2u2 anun , ui a1 u1 , ui and so ai v , ui Each scalar, ai ai ui , ui for each i 1, ui , ui , n , we take v , ui ui , ui an un , ui ai ui , ui , ,n . , is called the Fourier coefficient of v with respect to ui . Example 3.4.9 Write v (7,1,9) as a linear combination of the vectors u1 , u 2 , u3 in Example 3.4.5. We find that a1 (7,1,9), (1, 2,1) (1, 2,1), (1, 2,1) 18 21 28 3 , a2 1 , a3 2, 6 21 14 and so (7,1,9) 3(1, 2,1) (2,1, 4) 2(3, 2,1) . Gram-Schmidt Orthogonalization Process: We now introduce a process by which, given a basis for an inner product space V , we can construct from it, an orthogonal basis for V . Suppose {v1 , v 2 , , v n } is a basis for an inner product space V . We can form an orthogonal basis {w1 , w 2 , , w n } for V as follows. Set w1 v 1 w2 v2 w3 v3 wn vn v 2 , w1 w1 , w1 v 3 , w1 w1 , w1 v n , w1 w1 , w1 w1 w1 w1 v 3 ,w 2 w 2 ,w 2 v n ,w 2 w 2 ,w 2 w2 w2 v n , w n 1 w n 1 , w n 1 w n 1 . 44 To see precisely why {w1 , w 2 , theorem. , w n } is an orthogonal set, we observe the following Theorem 3.4.10 Suppose that {u1 , u2 , , ur } is an orthogonal set of non-zero vectors in an inner product space S . 
Then if u is any vector in S , let u u u , u1 u1 , u1 u, u2 u1 u2 , u2 Then u is orthogonal to each of u1 , u2 , Proof For each i 1, u , u i u u , u1 u1 , u1 u2 u, ur ur , ur ur . , ur . , r , we have u1 u, ui u, ui u, u2 u2 , u2 u , u1 u1 , u1 u, ui ui , ui u2 u1 , u i ui , ui u, ur ur , ur u, u2 u2 , u2 ur , ui u2 , ui u, ur ur , ur (by orthogonality of {u1 , u 2 , ur , ui , u r }) u, ui u, ui 0 □ From this theorem, we see that each w k defined above, is orthogonal to each of the preceding w ’s, and so inductively it follows that {w1 , w 2 , , w n } is an orthogonal set. It is an orthogonal basis since it is automatically linearly independent, and the dimension of V is n . Example 3.4.11 Apply the Gram-Schmidt Orthogonalization Process to find an orthogonal basis, and then an orthonormal basis, for the subspace U of R 4 spanned by v1 (1,0,1,1), v 2 (1,6,5,3), v 3 (2, 1,6,1) . First let w1 v1 (1,0,1,1) . Then w 2 v 2 v 2 , w1 9 w 1 (1, 6,5,3) (1, 0,1,1) (2, 6, 2, 0) . w1 , w1 3 Finally, w 3 v 3 v 3 , w1 w1 , w1 w1 v 3 ,w 2 w 2 ,w 2 w2 9 2 10 14 32 (2, 1, 6,1) (1, 0,1,1) ( 2, 6, 2, 0) ( , , , 2) . 3 44 11 11 11 45 Thus, {w1 , w 2 , w 3} forms an orthogonal basis for U . For an orthonormal basis, we just normalize and obtain u1 1 (1, 0,1,1), 3 u2 1 (2, 6, 2, 0), 2 11 u3 11 10 14 32 ( , , , 2) . 164 11 11 11 Remark 3.4.12 Since multiplying vectors by non-zero scalars does not affect orthogonality, (and of course does not affect the span of these vectors), we can often make our calculations within the Gram-Schmidt Process much simpler by clearing out fractions or scaling down our vectors. For example in the above, we could have taken our w 2 to be (1,3,1, 0) , or taken w 3 to be (10, 14,32, 22) . Remark 3.4.13 We now see that every finite dimensional inner product space has an orthogonal basis. Remark 3.4 14 If S {v1 , v 2 , , v r } is an orthogonal set in an inner product space V , then we can always extend S to an orthogonal basis for V . 
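The Gram-Schmidt Process and the Fourier coefficients of Observation 3.4.8 can both be checked numerically. In the sketch below the input vectors follow Example 3.4.11; some signs are garbled in these notes, and v_3 = (2, -1, 6, 1) is the reading that reproduces the quotients 9/3 and 2/44 appearing in the worked solution, so treat the exact signs as an assumption.

```python
import numpy as np

def gram_schmidt(vectors):
    """Gram-Schmidt: subtract from each v_k its projections onto the
    previously built w's; the result is orthogonal with the same span."""
    ws = []
    for v in vectors:
        w = np.array(v, dtype=float)
        for u in ws:
            w -= ((w @ u) / (u @ u)) * u
        ws.append(w)
    return ws

# Example 3.4.11, with v3 = (2, -1, 6, 1) as a sign reconstruction.
w = gram_schmidt([(1, 0, 1, 1), (1, 6, 5, 3), (2, -1, 6, 1)])
print(np.allclose(w[1], (-2, 6, 2, 0)))   # True, up to the sign lost in the notes
assert all(abs(w[i] @ w[j]) < 1e-9 for i in range(3) for j in range(i + 1, 3))

# Observation 3.4.8: Fourier coefficients recover any vector of the span.
v = np.array([1., 6., 5., 3.])            # v2 itself, as a sanity check
coeffs = [(v @ wi) / (wi @ wi) for wi in w]
print(np.allclose(sum(c * wi for c, wi in zip(coeffs, w)), v))   # True
```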
To see this, first extend S to a basis for V , {v1 , v 2 , , v r , v r 1 , , v n } say. Then applying the Gram-Schmid Process we obtain w1 v1 , w 2 v 2 , , w r v r , (since the v i ’s are orthogonal), and further vectors w r 1 , , w n , where {w1 , w 2 , , w n } is an orthogonal basis for V . 4. Linear Transformations (Linear Maps) In this section, we consider a special class of functions that are of fundamental interest in linear algebra. They are called linear transformations (or linear maps), and their domains and codomains are vector spaces. They are furthermore special because they preserve the vector space structure. By this we mean the following. Suppose that T is a linear transformation from a vector space V to a vector space W ; that is T :V W . Then T possesses the following characteristics: whenever and then that is, and that for k R , when u V Tu W v V Tv W , u v V Tu Tv W , T (u v ) Tu Tv ; u V Tu W 46 and ku V then T (ku ) W , T (ku ) kTu . Remark Recall that we touched on this concept in Section 2.9. 4.1 Definition and Examples Definition 4.1.1 Let V and W be vectors spaces. A map T from V to W is called a linear transformation, or a linear map, if for all u , v V and k R , T (u v ) Tu Tv and T (ku ) kTu , Observations 4.1.2 1. T sends the zero vector of V to the zero vector of W ; T (0V ) 0W . 2. T is uniquely determined by how it acts on basis vectors of V . That is, if {v1 , v 2 , , v n } is any basis for V , and if we know how T acts on each of v1 , v 2 , , v n , then we know how T acts on every vector of V . e.g. Suppose T : R 2 R 2 , and we know that T (1, 0) (3,5) and T (0,1) (1, 7) , then we can deduce that for arbitrary ( x , y ) R 2 , T ( x, y ) T x (1,0) y (0,1) xT (1, 0) yT (0,1) x (3,5) y (1, 7) (3x y ,5 x 7 y ) . Remark 4.1.3 We could rephrase Observation 2. 
by saying that if {v1 , v 2 , , v n } is any basis for a vector space V , and w1 , w 2 , , w n are any (not necessarily distinct) vectors in a vector space W , then setting Tv1 w1 , Tv 2 w 2 , , Tv n w n is sufficient to uniquely define a linear transformation (LT) from V to W . Example 4.1.4 Let {e1 , e2 } be the standard basis of R 2 . Then taking, for instance, w1 (1, 2,3, 0, 7) and w 2 (0, 1, 7,1, 0) , we can set Te1 w1 and Te2 w 2 , thus defining the linear map T : R 2 R 5 , with T ( x , y ) ( x , 2 x y ,3x 7 y , y , 7 x ) . Example 4.1.5 Suppose we have a LT, F : R 2 R 2 , and we know that F (1,1) (1, 2) and F (1, 1) (4, 1) . Find a general expression for F ( x , y ) , for ( x , y ) R 2 . 47 Typically, given a map between vector spaces, we often want to determine whether or not it is linear. Example 4.1.6 Let D : P3 (t ) P2 (t ) be the map defined by, d (Dp)(t ) p(t ) , for p P3 (t ) , (the derivative mapping). dt This map is certainly linear, as we know that differentiation is a linear operation; that is, d d d d D( p q ) ( p q )(t ) p(t ) q (t ) p(t ) q (t ) Dp Dq , dt dt dt dt and d d D(kp) (kp)(t ) k p(t ) kDp , dt dt for all p, q P3 (t ) , k R . Example 4.1.7 Let F : R 2 R 2 , be defined as reflection in the x -axis. Example 4.1.8 Let V be a vector space, and B {u1 , u2 , , un } an ordered basis for V . Define I B : V R n to be the map, I Bv v B R n , for v V , where v B is the coordinate vector of v with respect to the basis B . (Recall Section 2.9.) This map is clearly linear; as can be seen from the discussion following Conclusion 2.9.7. Example 4.1.9 S : R 2 R 2 , rotation through a fixed angle of about the origin, in an anticlockwise direction. 48 Example 4.1.10 T : R 2 R 3 , defined by T ( x1 , x 2 ) ( x1 1, x1 x 2 ,3x1 4x 2 ) . Example 4.1.11 T : R 2 R 3 , given by T ( x1 , x 2 ) (0, x1x 2 , x1 2x 2 ) . Example 4.1.12 G : R 3 R 2 , with G( x, y , z) ( x , y z) . Example 4.1.13 T : R 2 R 3 , defined by T ( x1 , x 2 ) ( x1 , 2 x1 x 2 ,5x1 x 2 ) . 
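Whether a given map is linear can be screened numerically: T(0) = 0 is a necessary condition (Observations 4.1.2), and additivity and homogeneity can be spot-checked on random inputs. This is only a screen, not a proof, and the maps below are readings of Examples 4.1.10 and 4.1.12 whose garbled signs are assumptions; the "+1" term fails linearity whatever the other signs are.

```python
import numpy as np

def looks_linear(T, n, trials=20):
    """Numerical screen for linearity: necessary checks only, not a proof."""
    rng = np.random.default_rng(0)
    if not np.allclose(T(np.zeros(n)), 0):     # a linear map must send 0 to 0
        return False
    for _ in range(trials):
        x, y = rng.standard_normal(n), rng.standard_normal(n)
        k = rng.standard_normal()
        if not (np.allclose(T(x + y), T(x) + T(y)) and np.allclose(T(k * x), k * T(x))):
            return False
    return True

# One reading of Example 4.1.10: the constant term breaks linearity.
T_410 = lambda x: np.array([x[0] + 1, x[0] + x[1], 3 * x[0] + 4 * x[1]])
# One reading of Example 4.1.12 (taking the second component to be y + z).
G_412 = lambda x: np.array([x[0], x[1] + x[2]])
print(looks_linear(T_410, 2))   # False: T(0) = (1, 0, 0) != 0
print(looks_linear(G_412, 3))   # True
```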
Observation 4.1.14 When T is a map from R^n to R^m, the graph of T is defined to be the subset

G_T = {(x, Tx) : x ∈ R^n} ⊆ R^{n+m}.

Then T is linear if and only if G_T is a subspace of R^{n+m}.

Exercise 4.1.15 Verify this.

4.2 Kernel and Range of a Linear Transformation

In Section 2.11 we defined the kernel and range for m × n real matrices. We now do so (more generally) for any linear transformation.

Definition 4.2.1 Let T : U → V be a linear transformation between vector spaces U and V. Then define the kernel and range of T to be, respectively,

Ker T = {x ∈ U : Tx = 0_V} and Ran T = {Tx : x ∈ U} ⊆ V.

Observations 4.2.2 Ker T is a subspace of U. Ran T is a subspace of V.

Proposition 4.2.3 Let T : U → V be a linear transformation, and suppose that {u_1, u_2, ..., u_l} spans U. Then {Tu_1, Tu_2, ..., Tu_l} spans Ran T.

Proof □

Example 4.2.4 Define a linear transformation F : R^5 → R^4 by

F(x_1, x_2, x_3, x_4, x_5) = (x_1 + 2x_2 + x_3 + x_5, x_1 + 2x_2 + 2x_3 + x_4 + 3x_5, 3x_1 + 6x_2 + 5x_3 + 2x_4 + 7x_5, 2x_1 + 4x_2 + x_3 + x_4).

Find a basis for Ran F, and hence state its dimension. Compute also a basis for Ker F.

Recall also from Section 2.11 that we stated, as Theorem 2.11.14, the "Conservation of Dimension Theorem" for matrices. This, too, was a special case of the following more general statement.

Theorem 4.2.5 (Conservation of Dimension) Let T : U → V be a linear transformation, and suppose that dim U = n. Then

dim Ker T + dim Ran T = n.

4.3 A Most Important Source of Linear Transformations

Whenever A ∈ M_{m,n}(R), the map T_A : R^n → R^m defined as T_A(x) = Ax is a linear transformation. Here, on the right-hand side, we are thinking of x = (x_1, x_2, ..., x_n) ∈ R^n as an n × 1 matrix, and by Ax we mean the matrix product of A and x. So

T_A(x) = A [x_1, x_2, ..., x_n]^T.

Exercise 4.3.1 Verify that T_A is linear.

In fact, this type of linear transformation is much more than an important source of examples – indeed every linear transformation is of this form.
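Conservation of Dimension can be observed computationally for a matrix map T_A. The matrix below is a hypothetical small example (not one from the notes) chosen to have a dependent column, so that both the kernel and range are proper subspaces.

```python
import numpy as np

# A hypothetical 3x4 matrix, standing in for T_A : R^4 -> R^3.
# Column 2 is twice column 1, and column 4 is column 1 plus column 3.
A = np.array([[1., 2., 0., 1.],
              [0., 0., 1., 1.],
              [1., 2., 1., 2.]])

n = A.shape[1]                      # dimension of the domain
rank = np.linalg.matrix_rank(A)     # dim Ran T_A
nullity = n - rank                  # dim Ker T_A

# Theorem 4.2.5: dim Ker T + dim Ran T = n.
assert rank + nullity == n
assert rank == 2 and nullity == 2
```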
Fact 4.3.2 Let T : U → V be a linear transformation between vector spaces U and V, where U has dimension n and V has dimension m. Then T can be represented as a matrix transformation T_A, for some A ∈ M_{m,n}(R). In fact, such a T can be 'represented' in infinitely many different ways. We need to make precise exactly what we mean by "represent".

The simplest case: Suppose T : R^n → R^m is a linear transformation, and let {e_1, e_2, ..., e_n} be the standard basis for R^n. Suppose

Te_1 = (a_11, a_21, ..., a_m1)
Te_2 = (a_12, a_22, ..., a_m2)
⋮
Te_n = (a_1n, a_2n, ..., a_mn).

Form the m × n matrix A with columns Te_1, Te_2, ..., Te_n:

A = [a_11 a_12 ... a_1n]
    [a_21 a_22 ... a_2n]
    [ ⋮    ⋮        ⋮  ]
    [a_m1 a_m2 ... a_mn].

Then for any x = (x_1, x_2, ..., x_n) ∈ R^n,

Ax = x_1 Te_1 + x_2 Te_2 + ... + x_n Te_n = T(x_1 e_1 + x_2 e_2 + ... + x_n e_n) = Tx.

Example 4.3.3 Let T : R^5 → R^3 be defined by

T(x_1, x_2, x_3, x_4, x_5) = (x_1 + x_2 + 2x_3, x_2 + x_3, x_1 + x_4 + 3x_5).

Then Te_1 = (1,0,1), Te_2 = (1,1,0), Te_3 = (2,1,0), Te_4 = (0,0,1) and Te_5 = (0,0,3), and so T = T_A, where

A = [1 1 2 0 0]
    [0 1 1 0 0]
    [1 0 0 1 3].

Remark 4.3.4 You might ask why we bothered explicitly going through the process of calculating Te_1, Te_2, ..., Te_5, then writing the resultant vectors as the columns of a matrix, when it is clear simply by looking at the definition of T that it is of the form T_A, with A as above. The fact is that the above is a special case of a more general picture, and in this more general setting the procedure will be necessary.

4.4 Matrix Representations of Linear Transformations

Theorem 4.4.1 Suppose that T : U → V is a linear transformation, and that B = {u_1, u_2, ..., u_n} and B′ = {v_1, v_2, ..., v_m} are fixed ordered bases for the vector spaces U and V respectively. Then there exists a unique m × n matrix A such that, for all u ∈ U,

[Tu]_{B′} = A[u]_B.

We say that this matrix A represents T with respect to the ordered bases B and B′, and denote it by B′[T]B.
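The construction of the standard matrix (columns Te_1, ..., Te_n) can be sketched for Example 4.3.3. This assumes the '+' signs in that example, which were ambiguous in the source; the check A x = T(x) then mirrors the derivation above.

```python
import numpy as np

def T(x):
    """T from Example 4.3.3 (signs as reconstructed)."""
    x1, x2, x3, x4, x5 = x
    return np.array([x1 + x2 + 2*x3, x2 + x3, x1 + x4 + 3*x5])

# Standard matrix: the j-th column is T(e_j).
A = np.column_stack([T(e) for e in np.eye(5)])

expected = np.array([[1., 1., 2., 0., 0.],
                     [0., 1., 1., 0., 0.],
                     [1., 0., 0., 1., 3.]])
assert np.allclose(A, expected)

# And A x = T(x) for any x, since Ax = x1 Te1 + ... + x5 Te5 = Tx.
x = np.array([1., -2., 3., 0., 4.])
assert np.allclose(A @ x, T(x))
```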
Proof First of all, we show that such a matrix A exists by explicitly constructing it. Apply T to each of u_1, u_2, ..., u_n, and write the results (uniquely) as linear combinations of v_1, v_2, ..., v_m:

Tu_1 = a_11 v_1 + a_12 v_2 + ... + a_1m v_m
Tu_2 = a_21 v_1 + a_22 v_2 + ... + a_2m v_m
⋮
Tu_n = a_n1 v_1 + a_n2 v_2 + ... + a_nm v_m.

Then form B′[T]B as the matrix whose columns are the coordinate vectors of Tu_1, Tu_2, ..., Tu_n, that is,

B′[T]B = [a_11 a_21 ... a_n1]
         [a_12 a_22 ... a_n2]
         [ ⋮    ⋮        ⋮  ]
         [a_1m a_2m ... a_nm].

Then B′[T]B satisfies B′[T]B [u]_B = [Tu]_{B′}, for all u ∈ U, as can be seen from the following. Let u ∈ U and [u]_B = (t_1, t_2, ..., t_n). Then

B′[T]B [u]_B = t_1 (a_11, a_12, ..., a_1m)^T + t_2 (a_21, a_22, ..., a_2m)^T + ... + t_n (a_n1, a_n2, ..., a_nm)^T
             = t_1 [Tu_1]_{B′} + t_2 [Tu_2]_{B′} + ... + t_n [Tu_n]_{B′}
             = [t_1 Tu_1 + t_2 Tu_2 + ... + t_n Tu_n]_{B′}
             = [T(t_1 u_1 + t_2 u_2 + ... + t_n u_n)]_{B′}
             = [Tu]_{B′}.

To see that this matrix is unique, it suffices to observe the following. Suppose A′ is an m × n matrix satisfying A′[u]_B = [Tu]_{B′}, for all u ∈ U. Then in particular it satisfies this for each of u_1, u_2, ..., u_n; that is, for each j = 1, 2, ..., n,

A′[u_j]_B = [Tu_j]_{B′}.

But [u_j]_B = (0, ..., 0, 1, 0, ..., 0)^T, with 1 in position j and zeros elsewhere, and so A′[u_j]_B is precisely the j-th column of A′. Hence A′ is uniquely determined by the equation A′[u]_B = [Tu]_{B′}. □

Example 4.4.2 In Example 4.3.3, the matrix A would be denoted by E_3[T]E_5, where E_n denotes the standard basis for R^n. In this case, we call A the Standard Matrix for T.

Example 4.4.3 Consider the derivative mapping D : P_3(t) → P_2(t), given in Example 4.1.6. Let B = {1, t, t^2, t^3} and B′ = {1, t, t^2} be our bases for P_3(t) and P_2(t) respectively. Find B′[D]B.

Applying D to each of the basis vectors we get

D(1) = 0 = 0(1) + 0(t) + 0(t^2)
D(t) = 1 = 1(1) + 0(t) + 0(t^2)
D(t^2) = 2t = 0(1) + 2(t) + 0(t^2)
D(t^3) = 3t^2 = 0(1) + 0(t) + 3(t^2),

so that

B′[D]B = [0 1 0 0]
         [0 0 2 0]
         [0 0 0 3].

Then, for example, if we want to apply D to the polynomial 3 − t + 2t^2 + 5t^3, we can do so using B′[D]B as follows.
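Example 4.4.3 can be checked directly: applying the representing matrix to the B-coordinate vector of a polynomial gives the B′-coordinates of its derivative. A short sketch:

```python
import numpy as np

# B'[D]B from Example 4.4.3, with B = {1, t, t^2, t^3} and B' = {1, t, t^2}.
D = np.array([[0., 1., 0., 0.],
              [0., 0., 2., 0.],
              [0., 0., 0., 3.]])

# B-coordinates of 3 - t + 2t^2 + 5t^3.
p = np.array([3., -1., 2., 5.])

# Multiplying gives the B'-coordinates of the derivative, -1 + 4t + 15t^2.
assert np.allclose(D @ p, [-1., 4., 15.])
```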
First of all, [3 − t + 2t^2 + 5t^3]_B = (3, −1, 2, 5), and so

[0 1 0 0] [ 3]   [−1]
[0 0 2 0] [−1] = [ 4]
[0 0 0 3] [ 2]   [15]
          [ 5]

gives us D(3 − t + 2t^2 + 5t^3) = −1 + 4t + 15t^2.

Example 4.4.4 Given the basis B = {(2,1), (−1,4)} for R^2, find E_3[S]B, where S : R^2 → R^3 is defined by

S(x_1, x_2) = (3x_1, 2x_1 + x_2, 3x_1 + 2x_2).

Verify that the matrix is correct by checking that E_3[S]B [(x_1, x_2)]_B = [S(x_1, x_2)]_{E_3}, for all (x_1, x_2) ∈ R^2.

Example 4.4.5 For F : R^2 → R^2 defined by F(x, y) = (5x + y, 2x + y), and B = {(1,4), (2,7)}, find B[F]B. (Then verify that the matrix is the correct one.)

Fact 4.4.6 Given a linear transformation T : U → V, and bases B = {u_1, u_2, ..., u_n} and B′ = {v_1, v_2, ..., v_m} for U and V respectively, there is in fact a quicker method for finding B′[T]B than that used above.

4.5 The Change-of-Basis Matrix

Let's look first at the case where our linear transformation is the identity, I : R^n → R^n, so that Ix = x, for all x ∈ R^n.

Observation 4.5.1 If β is any basis for R^n, then β[I]β = I_n, the n × n identity matrix.

Exercise 4.5.2 Verify this.

Let α and β be ordered bases for R^n. Then β[I]α is the unique n × n matrix such that

β[I]α [x]α = [Ix]β = [x]β, for all x ∈ R^n.

Definition 4.5.3 Such a matrix β[I]α is called a change-of-basis matrix.

Remark 4.5.4 For x ∈ R^n, if we are given [x]α, the α-coordinates of x, multiplying by β[I]α gives us [x]β, the β-coordinates of x; and so the reason for its name is clear.

Remark 4.5.5 The columns of β[I]α are the β-coordinates of the vectors in α, that is,

β[I]α = [ [u_1]β [u_2]β ... [u_n]β ],

where α = {u_1, u_2, ..., u_n}.

Theorem 4.5.6 If α and β are ordered bases of R^n, then β[I]α is invertible, and

(β[I]α)^{−1} = α[I]β.

Proof □

Example 4.5.7 If β = {(2,1), (−1,4)}, find E_2[I]β.

Example 4.5.8 If β = {(0,0,1), (1,0,1), (0,2,1)}, find E_3[I]β.

Remark 4.5.9 Clearly, in general, if β = {u_1, u_2, ..., u_n} and I : R^n → R^n is the identity, then E_n[I]β = [u_1 u_2 ... u_n]; that is, the columns of E_n[I]β are simply the vectors of the basis β. We can now use Theorem 4.5.6 to easily compute β[I]E_n.

Example 4.5.10 If β = {(2,1), (−1,4)}, find β[I]E_2.
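Remark 4.5.9 and Theorem 4.5.6 can be sketched together: E_2[I]β has the basis vectors as its columns, and β[I]E_2 is its inverse. The basis below is the one from Example 4.5.7, with the minus sign restored as reconstructed above.

```python
import numpy as np

# E2[I]b for the basis b = {(2,1), (-1,4)}: basis vectors as columns.
E2_I_b = np.column_stack([(2., 1.), (-1., 4.)])

# Theorem 4.5.6: b[I]E2 = (E2[I]b)^{-1}.
b_I_E2 = np.linalg.inv(E2_I_b)
assert np.allclose(E2_I_b @ b_I_E2, np.eye(2))

# Sanity check: the b-coordinates of a basis vector are a standard vector.
assert np.allclose(b_I_E2 @ np.array([2., 1.]), [1., 0.])
```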
Example 4.5.11 If β = {(0,0,1), (1,0,1), (0,2,1)}, find β[I]E_3.

So far we have only looked at β[I]α when at least one of α or β is a standard basis. What about the general case?

Theorem 4.5.12 If α and β are ordered bases for R^n, then

β[I]α = β[I]E_n · E_n[I]α.

Proof □

Example 4.5.13 If α = {(2,1), (−1,4)} and β = {(0,2), (1,3)}, find β[I]α.

Example 4.5.14 If α = {(1,0,1), (2,0,1), (1,1,0)} and β = {(0,0,1), (1,0,1), (0,2,1)}, find β[I]α.

In the next two examples we see the change-of-basis matrix in use.

Example 4.5.15 If β = {(2,1), (−1,4)} and u = (9,−18), find the β-coordinates of u. We want [u]β. Thus,

[u]β = β[I]E_2 [u]_{E_2} = (E_2[I]β)^{−1} [9, −18]^T,

where

E_2[I]β = [2 −1], so (E_2[I]β)^{−1} = (1/9) [ 4 1],
          [1  4]                            [−1 2]

giving

[u]β = (1/9) [ 4 1] [  9] = [ 2]
             [−1 2] [−18]   [−5].

(And so (9,−18) = 2(2,1) + (−5)(−1,4).)

Example 4.5.16 Let β = {(0,0,1), (1,0,1), (0,2,1)} and γ = {(1,3,1), (2,4,1), (6,2,0)}.

(i) Find the standard coordinates of u, where the β-coordinates of u are (3,1,9).
(ii) Find the γ-coordinates of u, where the β-coordinates of u are (3,1,9).
(iii) Find the β-coordinates of u, where the standard coordinates of u are (3,1,9).
(iv) Find the β-coordinates of u, where the γ-coordinates of u are (3,1,9).

So we have seen that for each pair α, β of ordered bases for R^n, β[I]α is an invertible n × n matrix. There is, in fact, the following partial converse to this. Let P be any invertible n × n matrix. Then we can always think of P as a change-of-basis matrix in the following sense. Suppose β = {v_1, v_2, ..., v_n} is a basis for R^n, and

P = [p_11 p_12 ... p_1n]
    [p_21 p_22 ... p_2n]
    [ ⋮    ⋮        ⋮  ]
    [p_n1 p_n2 ... p_nn].

Then define the set of vectors α by

α = {u_j = p_1j v_1 + p_2j v_2 + ... + p_nj v_n}, for j = 1, ..., n.

Claim 4.5.17 α is a basis for R^n, and P = β[I]α.

Proof □

4.6 Similar Matrices

Let T : R^n → R^m be a linear transformation, and suppose that α and β are ordered bases for R^n, and that γ and δ are ordered bases for R^m. We have seen how to construct matrices γ[T]α and δ[T]β, which would respectively represent T with respect to the bases α and γ, and with respect to the bases β and δ.

Question 4.6.1 What is the relationship between γ[T]α and δ[T]β?
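Example 4.5.15 can be reproduced numerically: solving E_2[I]β [u]β = u gives the β-coordinates directly. A short sketch:

```python
import numpy as np

# Example 4.5.15: b = {(2,1), (-1,4)}, u = (9, -18).
E2_I_b = np.column_stack([(2., 1.), (-1., 4.)])
u = np.array([9., -18.])

# [u]_b = b[I]E2 u = (E2[I]b)^{-1} u; solve the linear system instead of inverting.
u_b = np.linalg.solve(E2_I_b, u)
assert np.allclose(u_b, [2., -5.])

# The resulting linear combination reproduces u: (9,-18) = 2(2,1) - 5(-1,4).
assert np.allclose(u_b[0] * np.array([2., 1.]) + u_b[1] * np.array([-1., 4.]), u)
```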
So the question we are asking is: what is the relationship between two different matrices which represent the same linear transformation, but with respect to different bases? The answer is straightforward, and perhaps what you would immediately expect it to be.

Theorem 4.6.2 Let T : R^n → R^m be a linear transformation, with α and β ordered bases for R^n, and γ and δ ordered bases for R^m. Then we have that

δ[T]β = δ[I]γ · γ[T]α · α[I]β.

Proof □

Example 4.6.3 Let T : R^2 → R^3 be the linear transformation such that

E_3[T]E_2 = [3 0]
            [2 1]
            [3 2].

(So in fact T(x_1, x_2) = (3x_1, 2x_1 + x_2, 3x_1 + 2x_2), for (x_1, x_2) ∈ R^2.) If β = {(2,1), (−1,4)} and γ = {(0,0,1), (1,0,1), (0,2,1)}, use the above theorem to find γ[T]β.

It is the special case of Question 4.6.1, in which n = m, α = β and γ = δ, that leads us to the concept of similarity for matrices. The question becomes:

Question 4.6.4 If T : R^n → R^n is a linear transformation, and β and γ are ordered bases for R^n, how are β[T]β and γ[T]γ related?

Terminology 4.6.5 (Note that we say that such a matrix, β[T]β, represents T with respect to the basis β.)

Well, we know by Theorems 4.5.6 and 4.6.2 that

β[T]β = β[I]γ · γ[T]γ · γ[I]β = (γ[I]β)^{−1} · γ[T]γ · γ[I]β.

Consequently, we immediately have the following.

Proposition 4.6.6 If A and B are matrices in M_n(R) representing a linear transformation T : R^n → R^n (with respect to two possibly different bases for R^n), then there exists an invertible matrix P ∈ M_n(R) such that B = P^{−1}AP.

Conversely, since every invertible matrix P ∈ M_n(R) can be thought of as a change-of-basis matrix for some pair of bases, we can say that if A, B ∈ M_n(R), with B = P^{−1}AP for some such matrix P, then A and B represent some linear transformation T : R^n → R^n (with respect to two possibly different bases for R^n). So it follows that we actually have equivalence between the two conditions of Proposition 4.6.6.

Definition 4.6.7 Matrices A, B ∈ M_n(R) are said to be similar if there exists an invertible matrix P ∈ M_n(R) such that B = P^{−1}AP.
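Theorem 4.6.2 can be sketched on the data of Example 4.6.3 (with the minus sign in β restored as reconstructed above): compute γ[T]β from the standard matrix and the two change-of-basis matrices, then verify that it maps β-coordinates to γ-coordinates.

```python
import numpy as np

A = np.array([[3., 0.],
              [2., 1.],
              [3., 2.]])                                   # E3[T]E2
E2_I_b = np.column_stack([(2., 1.), (-1., 4.)])            # E2[I]b
E3_I_g = np.column_stack([(0., 0., 1.), (1., 0., 1.), (0., 2., 1.)])  # E3[I]g

# Theorem 4.6.2: g[T]b = g[I]E3 . E3[T]E2 . E2[I]b.
g_T_b = np.linalg.inv(E3_I_g) @ A @ E2_I_b

# Verify on a vector: [Tx]_g = g[T]b [x]_b.
x = np.array([1., 2.])
lhs = np.linalg.solve(E3_I_g, A @ x)       # g-coordinates of Tx
rhs = g_T_b @ np.linalg.solve(E2_I_b, x)   # g[T]b applied to [x]_b
assert np.allclose(lhs, rhs)
```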
Remark 4.6.8 In light of the above discussion, we could of course equivalently define matrices A, B ∈ M_n(R) to be similar if they represent the same linear transformation.

4.7 Diagonalizability – our motivation for considering similarity

Suppose that T : R^n → R^n is a linear transformation. Then we know that it takes the general form

T(x_1, x_2, ..., x_n) = (a_11 x_1 + a_12 x_2 + ... + a_1n x_n, ..., a_n1 x_1 + a_n2 x_2 + ... + a_nn x_n).

Each coordinate on the right-hand side depends on each of x_1, x_2, ..., x_n.

Let's look at the linear transformation T : R^2 → R^2 defined as

T(x_1, x_2) = (2x_1 + 2x_2, 5x_1 − x_2).

We can easily see that the standard representation of T is given by the matrix

E_2[T]E_2 = [2  2]
            [5 −1].

Suppose, alternatively, that we choose as a basis for R^2, β = {(1,1), (−2/5, 1)}. Then we find that

β[T]β = [4  0]
        [0 −3],

which is a diagonal matrix. So, at the cost of using a different basis, we obtain a simpler matrix, a diagonal one, to describe T. We now have that, for x ∈ R^2, if [x]β = (y_1, y_2), then

[Tx]β = β[T]β [x]β = (4y_1, −3y_2).

This is the idea of diagonalizing a matrix – finding a simpler description of the behaviour of T. In general, given a linear transformation T : R^n → R^n, we will wish to find a basis β for R^n such that β[T]β is diagonal, so that if x ∈ R^n with [x]β = (y_1, y_2, ..., y_n), then

[Tx]β = (t_1 y_1, t_2 y_2, ..., t_n y_n),

for some t_1, t_2, ..., t_n ∈ R.

Note 4.7.1 This will not always be possible.

In light of this, we note the following, which is an immediate consequence of our above discussions.

Observation 4.7.2 If we are able to determine whether a given n × n matrix is similar to a diagonal one, then we are, in fact, able to determine whether a given linear transformation can be represented by a diagonal matrix.
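The diagonalization above can be checked directly: with P the change-of-basis matrix whose columns are the basis vectors (1,1) and (−2/5, 1), the matrix P^{−1}AP is diagonal with entries 4 and −3, and these agree with the eigenvalues of A.

```python
import numpy as np

# Standard matrix of T(x1, x2) = (2x1 + 2x2, 5x1 - x2).
A = np.array([[2., 2.],
              [5., -1.]])

# E2[I]b for b = {(1,1), (-2/5, 1)}: basis vectors as columns.
P = np.column_stack([(1., 1.), (-2/5, 1.)])

# b[T]b = P^{-1} A P should be diag(4, -3).
B = np.linalg.inv(P) @ A @ P
assert np.allclose(B, np.diag([4., -3.]))

# Cross-check: 4 and -3 are exactly the eigenvalues of A.
assert np.allclose(sorted(np.linalg.eigvals(A)), [-3., 4.])
```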