Notes on Linear Transformations
November 17, 2014
Recall that a linear transformation is a function T : V → W between vector spaces V and W such that
(i) T(c~v) = cT(~v) for all ~v in V and all scalars c. (Geometrically, T takes lines to lines.)
(ii) T(~v1 + ~v2) = T(~v1) + T(~v2) for all ~v1, ~v2 in V. (Geometrically, T takes parallelograms to parallelograms.)
In the following, we will always assume that T : V → W is a linear transformation. Here is a summary of what will be discussed below.
(1) T : V → W is completely determined by how it acts on a basis B = {~v1, . . . , ~vn} of V.
(2) Every T : Rn → Rm is multiplication by a matrix A.
(3) B-coordinates turn V into Rn.
(4) To interpret T : V → W as multiplication by a matrix, we can choose bases B = {~v1, . . . , ~vn} of V and B′ = {~w1, . . . , ~wm} of W, which turn V into Rn and W into Rm so that multiplying by a matrix makes sense.
(5) An in-depth example: the derivative and integral are linear transformations.
(1) T : V → W is determined by how it acts on B.
Let B = {~v1 , . . . , ~vn } be a basis of V . If ~v is any vector in V , then there is a unique way to
write ~v as a linear combination of the vectors in the basis: ~v = c1~v1 + · · · + cn~vn . The linear
combination exists since the basis vectors span V , and it is unique since the basis vectors
are independent. Then, by linearity of T , we compute
T (~v ) = T (c1~v1 + · · · + cn~vn ) = c1 T (~v1 ) + · · · + cn T (~vn ),
which shows that every T (~v ) is fully determined by the vectors T (~v1 ), · · · , T (~vn ). In particular, this shows that any two linear transformations that act the same way on all the basis
vectors must in fact be the same linear transformation.
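If you like to check such facts on a computer, here is a small numpy sketch (the particular T and the names are made up for illustration): it verifies that knowing T on a basis of R2 is enough to recover T on any other vector.

```python
import numpy as np

# An example linear map T : R^2 -> R^2, chosen purely for illustration.
def T(v):
    return np.array([2 * v[0] - v[1], v[0] + v[1]])

# A (non-standard) basis B = {v1, v2} of R^2.
v1 = np.array([1.0, 1.0])
v2 = np.array([1.0, -1.0])

# Any vector v has unique B-coordinates c = (c1, c2): solve [v1 v2] c = v.
v = np.array([3.0, 5.0])
c = np.linalg.solve(np.column_stack([v1, v2]), v)

# Linearity: T(v) = c1 T(v1) + c2 T(v2), so T is pinned down by T(v1), T(v2).
reconstructed = c[0] * T(v1) + c[1] * T(v2)
assert np.allclose(reconstructed, T(v))
```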
(2) Every T : Rn → Rm is multiplication by a matrix A.
When our vector spaces are Rn and Rm , there is a matrix A so that T (~v ) = A~v for all ~v in
Rn . This means that T is just multiplication by the matrix A. To find this A, compute how
T acts on the standard basis vectors of Rn and use the resulting vectors as the columns of
A. For instance, when n = 3, compute the vectors
T((1, 0, 0)^T),   T((0, 1, 0)^T),   T((0, 0, 1)^T),
which are in Rm , and use these three vectors as the columns of A. Why does this give
the right matrix? By (1), it is enough to check that T and A act the same way on all the
standard basis vectors. But since
A (1, 0, 0)^T = first column of A = T((1, 0, 0)^T),
and similar equations hold for the second and third columns of A, our definition of A was
exactly the right one to ensure that A agrees with T on the standard basis vectors.
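This recipe is easy to carry out numerically. A minimal sketch (the particular T is invented for illustration): build A column by column from the standard basis vectors and check that it reproduces T.

```python
import numpy as np

# An example linear map T : R^3 -> R^2, for illustration only.
def T(v):
    return np.array([v[0] + 2 * v[1], 3 * v[2] - v[0]])

# Column j of A is T applied to the j-th standard basis vector.
A = np.column_stack([T(e) for e in np.eye(3)])

# A now agrees with T on every vector, not just the standard basis.
v = np.array([1.0, -2.0, 4.0])
assert np.allclose(A @ v, T(v))
print(A)
# [[ 1.  2.  0.]
#  [-1.  0.  3.]]
```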
(3) B-coordinates turn V into Rn .
In order to interpret T : V → W as multiplication by a matrix, we first need to make sure
that V looks like Rn . This is necessary because matrices multiply column vectors, and the
vector space V could consist of vectors that bear no resemblance to column vectors (for
instance, V could be a vector space of polynomials). Similarly, we will need to make W look
like Rm , because multiplying our matrix by column vectors “in V ” will yield column vectors
that are supposed to be in W . Luckily for us, a basis is exactly the right thing to make a
vector space look like Rn .
Let B = {~v1 , . . . , ~vn } be a basis for V . As noted above, any vector ~v in V can be uniquely
expressed as a linear combination ~v = c1~v1 + · · · + cn~vn . The scalars c1 , . . . , cn are called the
B-coordinates of ~v , and we put them into a column vector
[~v]B = (c1, . . . , cn)^T.
The notation [~v]B means “the B-coordinates of ~v”, which is a vector in Rn. By taking B-coordinates of all the vectors in V, we effectively turn V into Rn. (More precisely, taking B-coordinates is a linear map [ ]B : V → Rn that induces a one-to-one correspondence between the vectors in V and the vectors in Rn. Such linear maps are called isomorphisms.)
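When V is already Rn, computing B-coordinates numerically just means solving a linear system whose columns are the basis vectors. A short sketch (the basis and vector below are illustrative):

```python
import numpy as np

# Basis B = {v1, v2, v3} of R^3, stored as the columns of P.
P = np.column_stack([[1, 0, 0], [1, 1, 0], [1, 1, 1]]).astype(float)

# [v]_B is the unique solution c of P c = v.
v = np.array([2.0, 3.0, 5.0])
coords = np.linalg.solve(P, v)      # the B-coordinates [v]_B
assert np.allclose(P @ coords, v)   # c1 v1 + c2 v2 + c3 v3 recovers v
print(coords)                       # [-1. -2.  5.]
```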
(4) Using B, B′ to interpret T : V → W as multiplication by a matrix.
Choose a basis B = {~v1, . . . , ~vn} of V and a basis B′ = {~w1, . . . , ~wm} of W. By taking coordinates, we can view any ~v in V as a column vector [~v]B in Rn. Similarly, any ~w in W has an associated column vector [~w]B′ in Rm. Now our method in (2) will reveal that
T is multiplication by a matrix A. Following (2), we need to determine how T acts on the
standard basis vectors of Rn . But the standard basis vectors are simply the B-coordinates
[~v1 ]B , . . . , [~vn ]B of our basis B! Thus we’re interested in computing T (~v1 ), . . . , T (~vn ). However,
if we want to use the vectors T (~v1 ), . . . , T (~vn ) as the columns of A, we first need to turn
them into column vectors – this is achieved by taking B′-coordinates. Thus the columns of
A are
[T(~v1)]B′, . . . , [T(~vn)]B′.
This definition of A exactly ensures that
A[~v1]B = [T(~v1)]B′,   . . . ,   A[~vn]B = [T(~vn)]B′,
which is a confusing way of saying that A agrees with T when we turn V and W into Rn
and Rm .
One thing to emphasize is that this matrix A depends heavily on the chosen bases B and B′. Different bases lead to different matrices.
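For vector spaces that are already Rn and Rm, the whole recipe can be packaged in a short helper. The sketch below is illustrative (the function and variable names are made up, and B′-coordinates are taken by solving against the basis of W):

```python
import numpy as np

def matrix_of(T, basis_V, basis_W):
    """Matrix of T with respect to bases B (of V) and B' (of W).

    Column j is [T(v_j)]_{B'}: apply T to the j-th basis vector of V,
    then take B'-coordinates by solving against the basis vectors of W.
    """
    W_cols = np.column_stack(basis_W)
    return np.column_stack([np.linalg.solve(W_cols, T(v)) for v in basis_V])

# Illustrative example: T : R^2 -> R^2, T(x, y) = (x + y, 2y),
# with B = {(1, 1), (1, -1)} and B' = {(1, 0), (1, 1)}.
T = lambda v: np.array([v[0] + v[1], 2 * v[1]])
B  = [np.array([1.0, 1.0]), np.array([1.0, -1.0])]
Bp = [np.array([1.0, 0.0]), np.array([1.0, 1.0])]
A = matrix_of(T, B, Bp)

# Defining property on the first basis vector: A [v1]_B = [T(v1)]_{B'},
# and [v1]_B is just the first standard basis vector (1, 0).
assert np.allclose(A @ np.array([1.0, 0.0]),
                   np.linalg.solve(np.column_stack(Bp), T(B[0])))
```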
(5) In-depth example: derivative and integral of polynomials.
Let V be the vector space of polynomials in x of degree ≤ 3, and let W be the vector space of polynomials in x of degree ≤ 2. Let T : V → W be the derivative T(~v) = d~v/dx. The fact that
the derivative is linear is one of the basic properties you learned in Calculus I! For instance,
linearity says that you can compute
d/dx (1 + 2x - x^3) = d/dx (1) + d/dx (2x) - d/dx (x^3) = 0 + 2 - 3x^2 = 2 - 3x^2
term-by-term in a manner that has hopefully become instinctive for you. Let’s try to view
T as multiplication by a matrix.
First, we pick bases for V and W. The nicest choices are B = {1, x, x^2, x^3} for V and B′ = {1, x, x^2} for W. With these nice bases, computing the coordinates of a polynomial ~v
simply amounts to building a column vector out of the coefficients of ~v . For instance
[1 + 2x - x^3]B = (1, 2, 0, -1)^T   and   [2 - 3x^2]B′ = (2, 0, -3)^T.
The columns of our matrix A are
[T(1)]B′ = [0]B′ = (0, 0, 0)^T,   [T(x)]B′ = [1]B′ = (1, 0, 0)^T,   [T(x^2)]B′ = [2x]B′ = (0, 2, 0)^T,   [T(x^3)]B′ = [3x^2]B′ = (0, 0, 3)^T,
so
A = [ 0  1  0  0 ]
    [ 0  0  2  0 ]
    [ 0  0  0  3 ].
To see how A acts on a polynomial like 1 + 2x - x^3, we first compute the B-coordinates [1 + 2x - x^3]B as above and then take the product
A[1 + 2x - x^3]B = A · (1, 2, 0, -1)^T = (2, 0, -3)^T = [2 - 3x^2]B′,
which agrees with our earlier calculation of how the derivative acts! Note that the first
column of A is the only free column, so the nullspace of A is
N(A) = span((1, 0, 0, 0)^T) = span([1]B).
The column vectors in this nullspace correspond to the constant polynomials, which are
indeed the kernel of T : the derivative of any constant is 0! Also, since A has rank 3, the
column space of A has dimension 3, so the range of T must be all of W . This means that
every polynomial of degree ≤ 2 is the derivative of a polynomial of degree ≤ 3.
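The same calculation is easy to reproduce with numpy; the sketch below encodes a polynomial by its coefficient vector (constant term first), matching the bases B and B′ above.

```python
import numpy as np

# Derivative matrix A in the bases B = {1, x, x^2, x^3} and B' = {1, x, x^2}.
A = np.array([[0, 1, 0, 0],
              [0, 0, 2, 0],
              [0, 0, 0, 3]], dtype=float)

# 1 + 2x - x^3 in B-coordinates (constant term first).
p = np.array([1.0, 2.0, 0.0, -1.0])
print(A @ p)                       # [ 2.  0. -3.]  i.e. 2 - 3x^2

# Rank 3, so the range of T is all of W; the nullspace is spanned by [1]_B.
print(np.linalg.matrix_rank(A))    # 3
```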
Let’s apply similar reasoning for the indefinite integral (the antiderivative). Using the same V and W, the indefinite integral gives a linear transformation S : W → V defined by S(~w) = ∫ ~w dx. Since the indefinite integral is only defined up to a constant C, we must make
a choice, namely C = 0. Again, the linearity of the integral is one of the basic properties
you saw in Calculus I (here it is important that we chose C = 0!). For instance, linearity
allows us to compute
S(2 - 3x^2) = ∫ (2 - 3x^2) dx = ∫ 2 dx - ∫ 3x^2 dx = 2x - x^3.
Let’s find the matrix B for S in the bases B′, B. The columns of B are
[S(1)]B = [x]B = (0, 1, 0, 0)^T,   [S(x)]B = [(1/2)x^2]B = (0, 0, 1/2, 0)^T,   [S(x^2)]B = [(1/3)x^3]B = (0, 0, 0, 1/3)^T,
so that
B = [ 0    0    0  ]
    [ 1    0    0  ]
    [ 0   1/2   0  ]
    [ 0    0   1/3 ].
Let’s check how B acts on 2 - 3x^2:
B[2 - 3x^2]B′ = B · (2, 0, -3)^T = (0, 2, 0, -1)^T = [2x - x^3]B,
which agrees with our above calculation. Note that B has rank 3, so N (B) = {~0}, so the
kernel of S is {0}: the only polynomial whose indefinite integral is 0 is the 0 polynomial.
Moreover, C(B) consists of all column vectors in R4 with 0 as their first coordinate, so the
range of S is all polynomials with no constant term: we made the choice to use the constant
term 0 for our indefinite integrals back when we defined S.
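The integration matrix can be checked the same way; in the sketch below it is called B_mat only to avoid clashing with the name of the basis B.

```python
import numpy as np

# Integration matrix for S in the bases B' = {1, x, x^2} and B = {1, x, x^2, x^3}.
B_mat = np.array([[0, 0,   0  ],
                  [1, 0,   0  ],
                  [0, 1/2, 0  ],
                  [0, 0,   1/3]])

q = np.array([2.0, 0.0, -3.0])     # 2 - 3x^2 in B'-coordinates
print(B_mat @ q)                   # [ 0.  2.  0. -1.]  i.e. 2x - x^3

# Rank 3 => the kernel of S is {0}; every column has first entry 0,
# matching the fact that the range of S contains no constant terms.
print(np.linalg.matrix_rank(B_mat))   # 3
```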
One more thing to do is to think about what happens when we act by both T and S in
either order. Since S is the antiderivative,
T(S(~w)) = d/dx ( ∫ ~w dx ) = ~w
leaves ~w unchanged (the composition T ∘ S is the identity transformation on W). To see
how T and S act in the other order, let ~v = a + bx + cx^2 + dx^3 be a general element of V.
Then
S(T(~v)) = ∫ (d~v/dx) dx = ∫ (b + 2cx + 3dx^2) dx = bx + cx^2 + dx^3,
which leaves ~v the same except for killing the constant term (the composition S ∘ T is a projection onto span(x, x^2, x^3)). These properties can be easily seen using the matrices A and B. The composition T ∘ S corresponds to the matrix product AB, which yields
AB = [ 0  1  0  0 ]   [ 0    0    0  ]   [ 1  0  0 ]
     [ 0  0  2  0 ] · [ 1    0    0  ] = [ 0  1  0 ]
     [ 0  0  0  3 ]   [ 0   1/2   0  ]   [ 0  0  1 ],
                      [ 0    0   1/3 ]
the identity transformation. Likewise,
BA = [ 0    0    0  ]   [ 0  1  0  0 ]   [ 0  0  0  0 ]
     [ 1    0    0  ] · [ 0  0  2  0 ] = [ 0  1  0  0 ]
     [ 0   1/2   0  ]   [ 0  0  0  3 ]   [ 0  0  1  0 ]
     [ 0    0   1/3 ]                    [ 0  0  0  1 ]
is the projection matrix we described above. Although A and B are not inverses (since A
and B are not square, there’s no chance of them being invertible), B is the pseudoinverse
A+ of A: the two compositions calculated above give projections onto the column space and
row space of A. (See Section 7.3 of Strang for more about pseudoinverses.)
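These composition and pseudoinverse claims are also easy to verify numerically; a short sketch reusing the matrices from the snippets above:

```python
import numpy as np

A = np.array([[0, 1, 0, 0],
              [0, 0, 2, 0],
              [0, 0, 0, 3]], dtype=float)
B_mat = np.array([[0, 0,   0  ],
                  [1, 0,   0  ],
                  [0, 1/2, 0  ],
                  [0, 0,   1/3]])

print(A @ B_mat)      # 3x3 identity: differentiating an antiderivative changes nothing
print(B_mat @ A)      # diag(0, 1, 1, 1): kills the constant term, keeps x, x^2, x^3
print(np.allclose(np.linalg.pinv(A), B_mat))   # True: B_mat is the pseudoinverse A^+
```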