LINEAR TRANSFORMATIONS DEFINED GEOMETRICALLY

1. Introduction.

Any n × n matrix A defines a linear transformation A : R^n → R^n; that is, matrices 'act on column vectors by left-multiplication' (move them around). Now we turn things around and ask: suppose we can describe a transformation T of R^n by the geometry of its effect on vectors: T 'rotates vectors by 30 degrees counterclockwise about the axis with direction (1, 1, 1)', or 'projects vectors onto the plane (subspace) x + y + z = 0, parallel to the vector (2, 1, 1)', etc. Suppose we know, in addition, that the transformation T is linear, that is, respects linear combinations:

T(c_1 v_1 + \cdots + c_r v_r) = c_1 T(v_1) + \cdots + c_r T(v_r).

Problem: Can we find an n × n matrix A_0 having the same geometric action on vectors as T? If we find such a matrix, then we can easily compute the effect of T on any vector in R^n; with the geometric description alone, we can easily compute the action only on a small number of vectors (to be precise: only on special subspaces).

We already know one basic fact to help solve this problem: the columns of a matrix correspond to its action on the vectors e_i of the 'standard basis' of R^n:

A_0 = [A_0 e_1 | A_0 e_2 | \cdots | A_0 e_n] (by columns).

Thus we can find A_0 if we can compute the action of T on the e_i.

Example 1. T : R^2 → R^2 projects vectors onto the one-dimensional subspace E spanned by v_1 = (2, 1), parallel to the line K (subspace) spanned by v_2 = (1, 1). Thus we know:

T(2, 1) = (2, 1), T(1, 1) = 0.

To compute the effect of T on the standard basis of R^2, we note that:

e_1 = (1, 0) = (2, 1) − (1, 1) = v_1 − v_2; e_2 = (0, 1) = 2(1, 1) − (2, 1) = 2v_2 − v_1.

Thus, by linearity:

T e_1 = T v_1 − T v_2 = (2, 1); T e_2 = 2T v_2 − T v_1 = (−2, −1).

So A_0 is the matrix having (2, 1) and (−2, −1) as column vectors:

A_0 = \begin{pmatrix} 2 & -2 \\ 1 & -1 \end{pmatrix}.

From this we can easily compute the action of T on an arbitrary vector; for example:

T(3, 5) = 3T e_1 + 5T e_2 = (6 − 10, 3 − 5) = (−4, −2).

(Note: There is a subtle abuse of notation here: we are identifying the geometrically defined transformation T with a 2 × 2 matrix A_0 that has the same action as T on any vector. Soon we'll see that there are many matrices associated with the same (geometrically defined) T; but since we always identify a vector in R^n with its 'coordinates in the standard basis' (see below), identifying T with its 'matrix in the standard basis' A_0 leads to correct results. This note will become clearer once Definitions 1 and 2 below have been absorbed.)

Assorted remarks on Example 1.

(1) The column space of A_0 is exactly the subspace E, corresponding to the fact that the image of T is E (that is, T maps all of R^2 to E).

(2) The nullspace (or kernel) of A_0 has defining equation x_1 = x_2, so it coincides with the subspace K; this corresponds to the fact that T maps all of K to the zero vector.

(3) Note that, in this case, N(A_0) and Col(A_0) intersect only at 0, and together span R^2. This is not true for a general T (although the dimensions of N(T) and Ran(T) always add up to n), but it is always true for projections (as we'll see later).

(4) Once you've projected a vector onto E, projecting again won't do anything; that is, it is clear geometrically that T^2(v) = T(v) for any v ∈ R^2. On the matrix side, we would expect the corresponding identity A_0^2 = A_0, and this is indeed true (check!).
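All of this is easy to check numerically. Here is a minimal sketch in Python with NumPy (our choice of tool for illustration; it is not part of the notes' development) that rebuilds A_0 from the action of T on v_1 and v_2, using the change of coordinates formula [T]_{B_0} = B [T]_B B^{-1} derived in Section 2 below, and confirms both T(3, 5) = (−4, −2) and A_0^2 = A_0:

```python
import numpy as np

# Example 1: T projects onto E = span{v1}, parallel to K = span{v2}.
v1 = np.array([2.0, 1.0])
v2 = np.array([1.0, 1.0])
B = np.column_stack([v1, v2])     # change-of-basis matrix, columns v1, v2
T_B = np.diag([1.0, 0.0])         # matrix of T in the basis {v1, v2}

A0 = B @ T_B @ np.linalg.inv(B)   # matrix of T in the standard basis
print(A0)                         # [[ 2. -2.]
                                  #  [ 1. -1.]]
print(A0 @ np.array([3.0, 5.0]))  # [-4. -2.] = T(3, 5)
print(np.allclose(A0 @ A0, A0))   # True: projecting twice changes nothing
```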
Example 2. Let R_θ be the linear transformation of R^2 that rotates vectors by θ radians, counterclockwise. From basic trigonometry, R_θ(e_1) = (cos θ, sin θ) and R_θ(e_2) = (−sin θ, cos θ), so we associate to R_θ the 'rotation matrix':

R_\theta = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}.

For example, to compute the effect of rotating (4, 2) by 60 degrees (counterclockwise) we use R_{π/3}:

\begin{pmatrix} 1/2 & -\sqrt{3}/2 \\ \sqrt{3}/2 & 1/2 \end{pmatrix} \begin{pmatrix} 4 \\ 2 \end{pmatrix} = \begin{pmatrix} 2 - \sqrt{3} \\ 2\sqrt{3} + 1 \end{pmatrix}.

To go further, consider Example 1 again: there is a basis of R^2 for which we can easily write down the action of T, namely B = {v_1, v_2}. Indeed:

T v_1 = v_1 = 1v_1 + 0v_2, T v_2 = 0 = 0v_1 + 0v_2.

That is, the 'coordinate vector of T v_1 in the basis B' is [T v_1]_B = (1, 0), and the 'coordinate vector of T v_2 in the basis B' is [T v_2]_B = (0, 0). Using these 'coordinate vectors' as columns, we define the matrix of T in the basis B as:

[T]_B = [[T v_1]_B | [T v_2]_B] = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}.

This is by analogy with the matrix of T in the standard basis B_0 = {e_1, e_2}; above we computed A_0 = [T]_{B_0}, the 'matrix of T in the standard basis'. The point of these contortions is that any projection in the Universe will have the very simple form of [T]_B above if we choose the basis B appropriately, and then there is a simple formula (given below) to get the matrix of T in the standard basis (the one we really use).

Before introducing formal definitions, here is another simple example.

Example 3. T contracts vectors in E (spanned by v_1 = (1, 1)) by a factor 1/2, and expands vectors in E^⊥ (spanned by v_2 = (1, −1)) by a factor 3. We want the matrices of T in the basis B = {(1, 1), (1, −1)} and in the standard basis. Since, by the definition of T:

T v_1 = (1/2)v_1, T v_2 = 3v_2,

we have immediately:

[T]_B = \begin{pmatrix} 1/2 & 0 \\ 0 & 3 \end{pmatrix}.

We easily find the coordinates of e_1, e_2 in the basis B:

(1, 0) = (1/2)(1, 1) + (1/2)(1, −1), (0, 1) = (1/2)(1, 1) − (1/2)(1, −1),

so that:

T(1, 0) = (1/2)(1/2)v_1 + (1/2)3v_2 = (7/4, −5/4), T(0, 1) = (1/2)(1/2)v_1 − (1/2)3v_2 = (−5/4, 7/4).

The matrix of T in the standard basis has these two vectors as columns:

A_0 = [T]_{B_0} = \begin{pmatrix} 7/4 & -5/4 \\ -5/4 & 7/4 \end{pmatrix}.

Remark: A subspace of R^n in which T acts as a pure expansion/contraction (as E and E^⊥ in this example) is called an 'eigenspace' for T, with 'eigenvalue' given by the expansion/contraction factor (1/2 and 3 in this example). Precise definitions will be given soon.

2. Coordinate changes.

Any basis B = {v_1, ..., v_n} of R^n determines a coordinate system. From the definition of basis, given any v ∈ R^n there is a unique expansion as a linear combination:

v = x'_1 v_1 + x'_2 v_2 + \cdots + x'_n v_n.

Definition 1. The coordinate vector of v in the basis B (denoted [v]_B) is:

[v]_B = (x'_1, ..., x'_n), where v = x'_1 v_1 + \cdots + x'_n v_n.

Given a basis B and a vector v = (x_1, ..., x_n) ∈ R^n (recall vectors are identified with their 'coordinate vectors' in the standard basis), how do we find the coordinates of v in the basis B? Define an n × n matrix B by:

B = [v_1 | v_2 | \cdots | v_n] (by columns).

Then B is invertible (why?) and B e_i = v_i for i = 1, ..., n, where B_0 = {e_1, ..., e_n} is the standard basis. Thus if v = x'_1 v_1 + x'_2 v_2 + \cdots + x'_n v_n, we find:

B^{-1} v = x'_1 B^{-1} v_1 + \cdots + x'_n B^{-1} v_n = x'_1 e_1 + \cdots + x'_n e_n = (x'_1, ..., x'_n).

This means we have the change of coordinates formula (for vectors):

[v]_B = B^{-1} [v]_{B_0}, [v]_{B_0} = B [v]_B.

Analogous formulas hold for linear transformations and matrices. Given a linear transformation T : R^n → R^n and a basis B = {v_1, ..., v_n}, we have:

Definition 2. The matrix of T in the basis B (denoted [T]_B) is defined by the relation:

[T v]_B = [T]_B [v]_B.

(As with vectors, we identify [T]_{B_0} with T.)
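In computational terms, Definition 1 says that finding [v]_B amounts to solving the linear system B x' = v. A minimal NumPy sketch of this (illustrative only, using the basis of Example 3):

```python
import numpy as np

# Coordinates in the basis B = {(1, 1), (1, -1)} of Example 3:
# [v]_B is the solution x' of B x' = v.
B = np.column_stack([[1.0, 1.0], [1.0, -1.0]])

v = np.array([1.0, 0.0])          # e1, in standard coordinates
v_B = np.linalg.solve(B, v)       # [v]_B = B^{-1} [v]_{B0}
print(v_B)                        # [0.5 0.5]: e1 = (1/2)v1 + (1/2)v2
print(B @ v_B)                    # [1. 0.]: [v]_{B0} = B [v]_B recovers v
```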
To compute what this means, consider the expansions of the T v_i in the basis B, for each of the basis vectors v_1, ..., v_n:

T v_1 = a_{11} v_1 + a_{21} v_2 + \cdots + a_{n1} v_n,
...
T v_j = a_{1j} v_1 + a_{2j} v_2 + \cdots + a_{nj} v_n,
...
T v_n = a_{1n} v_1 + a_{2n} v_2 + \cdots + a_{nn} v_n.

That is, with summation notation:

T v_j = \sum_{i=1}^{n} a_{ij} v_i.

Then if [v]_B = (x'_1, ..., x'_n), by linearity of T:

T v = \sum_{j=1}^{n} x'_j T v_j = \sum_{i=1}^{n} \Big( \sum_{j=1}^{n} a_{ij} x'_j \Big) v_i.

By Definition 1, this means:

[T v]_B = \Big( \sum_{j=1}^{n} a_{1j} x'_j, \; \ldots, \; \sum_{j=1}^{n} a_{nj} x'_j \Big).

The right-hand side of this equality is exactly the result of multiplying the matrix A = (a_{ij}) by the (column) vector (x'_1, ..., x'_n) = [v]_B. We conclude: [T]_B = A, where the entries a_{ij} of A are defined above. Note that A is the n × n matrix whose columns are given by the coordinate vectors of the T v_i in the basis B:

A = [T]_B ⇔ A = [[T v_1]_B | \cdots | [T v_n]_B] (by columns).

From the change of coordinates formula for vectors follows one for matrices. Since, for any v ∈ R^n:

[v]_B = B^{-1} [v]_{B_0} and [T v]_B = B^{-1} [T v]_{B_0} = B^{-1} [T]_{B_0} [v]_{B_0},

it follows that:

[T v]_B = B^{-1} [T]_{B_0} B [v]_B.

Using Definition 2, we obtain the change of coordinates formula for matrices:

[T]_B = B^{-1} [T]_{B_0} B, [T]_{B_0} = B [T]_B B^{-1}.

(This is one of the most useful formulas in Mathematics.)

Remark 1. Although this formula was derived for B_0 thought of as the 'standard' basis, the same formula relates the matrices of a linear transformation T in two arbitrary bases B and B_0, as long as the 'change of basis matrix' B has as its column vectors the [v_i]_{B_0}, where B = {v_1, v_2, ..., v_n}:

B = [[v_1]_{B_0} | \cdots | [v_n]_{B_0}].

Remark 2. The change of basis formula also applies to linear transformations between two different spaces, T : R^n → R^m. Endow R^n with the basis B = {v_1, ..., v_n} and R^m with the basis C = {w_1, ..., w_m}. Consider the invertible matrices (given by columns):

P = [v_1 | \cdots | v_n] ∈ M_n, Q = [w_1 | \cdots | w_m] ∈ M_m.

We have, for v ∈ R^n and w ∈ R^m, the coordinate vectors in the new bases:

[v]_B = P^{-1} [v]_{std} and [w]_C = Q^{-1} [w]_{std}.

Definition. The matrix [T]_{B,C} ∈ M_{m×n} of T in the new bases B, C is defined by the identity:

[T v]_C = [T]_{B,C} [v]_B.

Exactly as before, one obtains the change of basis formula:

[T]_{B,C} = Q^{-1} [T]_{std} P, [T]_{std} = Q [T]_{B,C} P^{-1},

where [T]_{std} denotes the matrix of T with respect to the standard bases of R^n, R^m:

[T]_{std} = [T e_1 | \cdots | T e_n], T e_i ∈ R^m, i = 1, ..., n.

(See Example 6 below.)

Example 3. Projections. Let P : R^3 → R^3 be the linear operator that projects vectors onto the plane (subspace) E = {x | x_1 + x_2 + x_3 = 0}, parallel to the subspace F spanned by (1, 1, 2) (a line). This means v − P v = c(1, 1, 2) for some constant c. To compute the matrix of P in the standard basis, we first find the matrix in a basis B = {v_1, v_2, v_3} adapted to the situation: v_1 = (1, 1, 2), and {v_2, v_3} is a basis of E. For example, we may take v_2 = (1, 0, −1), v_3 = (0, 1, −1). We clearly have (since v_2 and v_3 are in E):

P v_1 = 0, P v_2 = v_2 = 0v_1 + 1v_2 + 0v_3, P v_3 = v_3 = 0v_1 + 0v_2 + 1v_3,

which gives the columns of the matrix of P in the basis B:

[P]_B = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}.

The 'change of basis' matrix B and its inverse are:

B = \begin{pmatrix} 1 & 1 & 0 \\ 1 & 0 & 1 \\ 2 & -1 & -1 \end{pmatrix}, \quad B^{-1} = \frac{1}{4} \begin{pmatrix} 1 & 1 & 1 \\ 3 & -1 & -1 \\ -1 & 3 & -1 \end{pmatrix},

and the change of basis formula gives the matrix of P in the standard basis:

[P]_{B_0} = B [P]_B B^{-1} = \frac{1}{4} \begin{pmatrix} 3 & -1 & -1 \\ -1 & 3 & -1 \\ -2 & -2 & 2 \end{pmatrix}.

The coordinate change x' = B^{-1} x is given componentwise by:

x'_1 = (1/4)(x_1 + x_2 + x_3), x'_2 = (1/4)(3x_1 − x_2 − x_3), x'_3 = (1/4)(−x_1 + 3x_2 − x_3).
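The reader can verify this computation numerically. The following short NumPy sketch (illustrative only) reconstructs [P]_{B_0} from [P]_B and checks that the result behaves as a projection should:

```python
import numpy as np

# Example 3 (projections): adapted basis v1 (spans F), v2, v3 (span E).
B = np.column_stack([[1.0, 1.0, 2.0], [1.0, 0.0, -1.0], [0.0, 1.0, -1.0]])
P_B = np.diag([0.0, 1.0, 1.0])            # [P]_B: kill v1, fix v2 and v3

P_std = B @ P_B @ np.linalg.inv(B)        # [P]_{B0} = B [P]_B B^{-1}
print(4 * P_std)                          # [[ 3. -1. -1.]
                                          #  [-1.  3. -1.]
                                          #  [-2. -2.  2.]]
print(np.allclose(P_std @ P_std, P_std))  # True: P^2 = P
print(P_std @ np.array([1.0, 1.0, 2.0]))  # [0. 0. 0.]: P kills F
```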
Note that the ranges of A = [P]_B and A' = [P]_{B_0} are given by:

Ran(A) = {x' | x'_1 = 0}, Ran(A') = E = {x | x_1 + x_2 + x_3 = 0};

the equivalence of the statements x'_1 = 0 and x_1 + x_2 + x_3 = 0 corresponds to the fact that the coordinate change x = Bx' maps Ran(A) to Ran(A'):

Ran(A') = B(Ran(A)).

Similarly for the kernels:

Ker(A) = {x' | x'_2 = 0, x'_3 = 0}, Ker(A') = F = {x | 3x_1 − x_2 − x_3 = 0, −x_1 + 3x_2 − x_3 = 0},

so the coordinate change x = Bx' maps Ker(A) to Ker(A'):

Ker(A') = B(Ker(A)).

These correspondences will hold whenever two n × n matrices are related by a change of coordinates (or 'conjugation'), that is, whenever:

A' = B A B^{-1}.

Example 4. Eigenvalues. Using the same line F and plane E as in the previous example, let T : R^3 → R^3 be the linear transformation which contracts vectors by 1/2 in E and expands vectors by 5 in F. Formally:

T v = (1/2)v, v ∈ E; T v = 5v, v ∈ F.

This means 5 and 1/2 are eigenvalues of T, with 'eigenspaces' F and E (respectively). Using the same basis B as in Example 3, we clearly have:

[T]_B = \begin{pmatrix} 5 & 0 & 0 \\ 0 & 1/2 & 0 \\ 0 & 0 & 1/2 \end{pmatrix}.

Using B and B^{-1} given above, we find the matrix of T in the standard basis:

[T]_{B_0} = B [T]_B B^{-1} = \frac{1}{8} \begin{pmatrix} 13 & 9 & 9 \\ 9 & 13 & 9 \\ 18 & 18 & 22 \end{pmatrix}.

This example illustrates the following important concept. Let T : R^n → R^n be linear, and let λ ∈ R. Consider the subspace of R^n defined by:

E_T(λ) = {v ∈ R^n | T v = λv}.

For most values of λ this subspace is trivial (i.e., contains only the zero vector). We're interested in those λ for which it is not:

Definition. λ ∈ R is an eigenvalue of T if E_T(λ) ≠ {0}. In this case E_T(λ) is called the eigenspace of the eigenvalue λ, and the vectors of E_T(λ) are called eigenvectors of T with eigenvalue λ.

Examples. In Example 3 above (projection), the eigenvalues are 0 and 1, with eigenspaces:

E_P(0) = F = Ker(P), E_P(1) = E = Ran(P).

Note that E_T(0) = Ker(T) is true for any linear transformation T of R^n, while E_P(1) = Ran(P) is true for any projection (see Section 3 below). In Example 4 (contraction/expansion), the eigenvalues are 1/2 and 5, with eigenspaces:

E_T(1/2) = E, E_T(5) = F.
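As a numerical check (illustrative; again using NumPy, our choice of tool), one can rebuild [T]_{B_0} for Example 4 and ask for its eigenvalues directly; they come out to be 5 (once, on F) and 1/2 (twice, on E), exactly as the geometry predicts:

```python
import numpy as np

# Example 4: [T]_B = diag(5, 1/2, 1/2) in the adapted basis of Example 3.
B = np.column_stack([[1.0, 1.0, 2.0], [1.0, 0.0, -1.0], [0.0, 1.0, -1.0]])
T_std = B @ np.diag([5.0, 0.5, 0.5]) @ np.linalg.inv(B)
print(8 * T_std)                 # [[13.  9.  9.]
                                 #  [ 9. 13.  9.]
                                 #  [18. 18. 22.]]
print(np.linalg.eigvals(T_std))  # eigenvalues 5, 0.5, 0.5 (in some order)
```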
3. Direct sums and projections.

In all the examples above we have two subspaces E and F which together span all of R^n, and in each of which a linear transformation T acts in a simple way. This is a common situation. In fact, given two subspaces E, F of R^n, the following two conditions are equivalent:

(1) Any v ∈ R^n can be written (in a unique way) as a sum v = u + w, where u ∈ E, w ∈ F;
(2) dim E + dim F = n and E ∩ F = {0} (that is, E and F intersect only at the origin).

Whenever this happens, we say 'R^n is the direct sum of the subspaces E and F', denoted:

R^n = E ⊕ F.

(Note: in general E and F need not be orthogonal.) In this situation, we may define P = P_{E,F}, the linear operator 'projection onto E parallel to F', by the rule:

P v = u, if v = u + w as in (1),

or equivalently:

P v = v if v ∈ E; P v = 0 if v ∈ F.

Of course, P_{F,E} is similarly defined (it projects onto F parallel to E) and, from (1), for all v ∈ R^n:

v = P_{E,F} v + P_{F,E} v;

that is to say, we have a relation between these two operators:

P_{E,F} + P_{F,E} = Id_n,

the identity operator in R^n. As seen in the example below, this is more useful than it sounds.

Let P = P_{E,F}. The following facts are geometrically obvious:

(i) P^2 = P (projecting again doesn't do anything new);
(ii) Ran(P) = E;
(iii) Ker(P) = F.

Conversely, given any linear operator P : R^n → R^n with P^2 = P, it is not hard to show that Ker(P) ∩ Ran(P) = {0} and that their dimensions add up to n (a good exercise for the theoretically minded). That is, any operator satisfying P^2 = P is automatically the projection onto its range, parallel to its kernel (as constructed above).

Definition 3. A linear operator P in R^n is a projection if P^2 = P.

Example 5. Consider again the subspaces E and F of Example 3 above. Let v = (1, 2, 3). Find the projections of v onto E (along F) and onto F (along E).

First solution: using the matrix of P_{E,F} already computed, we find u = P_{E,F} v:

u = \frac{1}{4} \begin{pmatrix} 3 & -1 & -1 \\ -1 & 3 & -1 \\ -2 & -2 & 2 \end{pmatrix} \begin{pmatrix} 1 \\ 2 \\ 3 \end{pmatrix} = \begin{pmatrix} -1/2 \\ 1/2 \\ 0 \end{pmatrix},

and then w = P_{F,E} v is obtained by taking the difference:

w = v − u = (3/2, 3/2, 3).

Second solution: without using the change of coordinates formula, note that all we need to find are numbers c, u_1, u_2 so that:

(1, 2, 3) = c(1, 1, 2) + u_1(1, 0, −1) + u_2(0, 1, −1).

That is, we need to solve the linear system:

c + u_1 = 1, c + u_2 = 2, 2c − u_1 − u_2 = 3.

Proceeding in the usual way, we find the unique solution c = 3/2, u_1 = −1/2, u_2 = 1/2, which gives:

u = u_1(1, 0, −1) + u_2(0, 1, −1) = (−1/2, 1/2, 0), w = c(1, 1, 2) = (3/2, 3/2, 3).

Remark. From the discussion preceding Example 5 it is clear that the matrix of P_{F,E} in the standard basis of R^3 (or, for that matter, in any basis) is the identity minus the matrix of P_{E,F} in the same basis:

[P_{F,E}]_{std} = I_3 − [P_{E,F}]_{std} = \frac{1}{4} \begin{pmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 2 & 2 & 2 \end{pmatrix}.

Remark on Example 4. It follows from the geometric definition of T ('expands vectors in F by 5, contracts vectors in E by 1/2') that, as linear operators, we must have the following identity relating T and the projections onto E and onto F:

T = 5 P_{F,E} + \frac{1}{2} P_{E,F}.

It follows that the same identity must hold for the matrices of these linear operators with respect to any basis. The reader should verify that this is true for the standard basis of R^3 (the three matrices involved have already been computed).

Example 6. Suppose T : R^2 → R^3 is given by:

T(1, 1) = (1, 1, 0), T(−1, 1) = 2(−1, 0, 1).

Let B = {(1, 1), (−1, 1)} and C = {(1, 1, 0), (−1, 0, 1), (0, 0, 1)}. Find [T]_{B,C} and [T]_{std}. We clearly have:

[T]_{B,C} = \begin{pmatrix} 1 & 0 \\ 0 & 2 \\ 0 & 0 \end{pmatrix}, \quad P = \begin{pmatrix} 1 & -1 \\ 1 & 1 \end{pmatrix}, \quad Q = \begin{pmatrix} 1 & -1 & 0 \\ 1 & 0 & 0 \\ 0 & 1 & 1 \end{pmatrix},

and from the change of basis formula:

[T]_{std} = Q [T]_{B,C} P^{-1} = \frac{1}{2} \begin{pmatrix} 3 & -1 \\ 1 & 1 \\ -2 & 2 \end{pmatrix}.
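The same one-line verification works in the rectangular case. This NumPy sketch (illustrative only) rebuilds [T]_{std} for Example 6 and checks it against the defining data:

```python
import numpy as np

# Example 6: T maps R^2 to R^3; bases B (domain) and C (target).
P = np.column_stack([[1.0, 1.0], [-1.0, 1.0]])
Q = np.column_stack([[1.0, 1.0, 0.0], [-1.0, 0.0, 1.0], [0.0, 0.0, 1.0]])
T_BC = np.array([[1.0, 0.0], [0.0, 2.0], [0.0, 0.0]])

T_std = Q @ T_BC @ np.linalg.inv(P)   # [T]_std = Q [T]_{B,C} P^{-1}
print(2 * T_std)                      # [[ 3. -1.]
                                      #  [ 1.  1.]
                                      #  [-2.  2.]]
print(T_std @ np.array([1.0, 1.0]))   # [1. 1. 0.] = T(1, 1)
print(T_std @ np.array([-1.0, 1.0]))  # [-2.  0.  2.] = T(-1, 1)
```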