Lecture Notes

Dr. Michael Gabriel
Department of Mathematical Sciences
U08620 - Linear Algebra
1. From $\mathbb{R}^2$ and $\mathbb{R}^3$ to $\mathbb{R}^n$

In $\mathbb{R}^2$ and $\mathbb{R}^3$, we can represent a vector both graphically and algebraically. Graphically we can do so with a directed line segment, i.e. an arrow from an initial point to a terminal point.

The defining characteristics of a vector are its magnitude and direction, and so two vectors $u$ and $v$ are defined to be equal if they have the same magnitude, i.e. $\|u\| = \|v\|$, and if they are parallel and point in the same direction. Then, given any vector in $\mathbb{R}^2$ or $\mathbb{R}^3$, we can translate it so that it begins at the origin, and represent it algebraically by the coordinates of the terminal point of the vector.

In $\mathbb{R}^2$ and $\mathbb{R}^3$, addition of vectors may be defined in terms of the parallelogram law (or equivalently the triangle law) of addition. For scalar multiplication, if $u$ is a vector in $\mathbb{R}^2$ or $\mathbb{R}^3$, and $k$ is any real number, then the scalar multiple $ku$ is defined to be the vector with magnitude $|k|\,\|u\|$, and with direction the same as that of $u$ if $k > 0$, and with direction opposite to that of $u$ if $k < 0$. If $k = 0$, then $ku$ is defined to be the zero vector.
In component terms, in $\mathbb{R}^2$ for example, recall that addition and scalar multiplication are given by
$$(x_1, x_2) + (y_1, y_2) = (x_1 + y_1, x_2 + y_2),$$
and
$$k(x_1, x_2) = (kx_1, kx_2).$$
Now although our geometric visualization does not extend beyond 3-space, there is nothing stopping us regarding quadruples of real numbers $(x_1, x_2, x_3, x_4)$ as "vectors" in "4-dimensional" space, or quintuples $(x_1, x_2, x_3, x_4, x_5)$ as "vectors" in "5-dimensional" space, and so on, and continuing to do mathematics with these "vectors".

With this in mind, we define the following:

Definition 1.1 If $n$ is a positive integer, then an ordered $n$-tuple is a sequence of real numbers $(x_1, x_2, \ldots, x_n)$. We denote the set of all such $n$-tuples by $\mathbb{R}^n$.
We can indeed think of these ordered $n$-tuples either as generalized points or generalized vectors. When we write $(x_1, x_2, \ldots, x_n)$, whether we mean a point or a vector will be clear from the context.

Now just as two vectors in $\mathbb{R}^2$ or $\mathbb{R}^3$ are equal if and only if each of their components are equal, we say that vectors $u = (u_1, u_2, \ldots, u_n)$ and $v = (v_1, v_2, \ldots, v_n)$ in $\mathbb{R}^n$ are equal if and only if $u_1 = v_1$, $u_2 = v_2$, $\ldots$, $u_n = v_n$.
Then, following the definition of vector addition and scalar multiplication in $\mathbb{R}^2$ and $\mathbb{R}^3$, we define operations on the elements of the set $\mathbb{R}^n$:

For each $u = (u_1, u_2, \ldots, u_n)$ and $v = (v_1, v_2, \ldots, v_n)$ in $\mathbb{R}^n$, and real number $k$, define the sum of $u$ and $v$ to be
$$u + v = (u_1 + v_1, u_2 + v_2, \ldots, u_n + v_n),$$
and the scalar multiple $ku$ to be
$$ku = (ku_1, ku_2, \ldots, ku_n).$$
These are called the standard operations on $\mathbb{R}^n$.
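The standard operations are exactly componentwise arithmetic, so they are easy to experiment with on a computer. Below is a minimal sketch in Python (the use of the numpy library is our own choice; plain lists and loops would work equally well):

```python
import numpy as np

# Two vectors in R^4, represented as numpy arrays.
u = np.array([1.0, 2.0, -1.0, 3.0])
v = np.array([2.0, 0.0, 1.0, -2.0])

# The standard operations on R^n are componentwise:
print(u + v)   # vector addition: (u1+v1, ..., un+vn)
print(3 * u)   # scalar multiplication: (ku1, ..., kun)
print(v - u)   # difference: v + (-u)
```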
Recall that both in $\mathbb{R}^2$ and $\mathbb{R}^3$ we have a "zero vector" (the vector with zero magnitude and undefined direction), and also that every vector $u$ has a "negative" (the vector with magnitude equal to that of $u$, but with opposite direction).

In $\mathbb{R}^n$, we define our zero vector to be $0 = (0, 0, \ldots, 0)$, and for $u = (u_1, u_2, \ldots, u_n)$ we define its negative to be $-u = (-u_1, -u_2, \ldots, -u_n)$. We further define the difference between two vectors in $\mathbb{R}^n$ to be
$$v - u = v + (-u);$$
that is,
$$v - u = (v_1 - u_1, v_2 - u_2, \ldots, v_n - u_n).$$
The way in which we have defined our operations of vector addition and scalar multiplication in $\mathbb{R}^n$ ensures that all of the most important arithmetic properties of vector addition and scalar multiplication of vectors in $\mathbb{R}^2$ and $\mathbb{R}^3$ still hold in an identical fashion in $\mathbb{R}^n$. These are listed below.
Properties of Vector Operations in $\mathbb{R}^n$

If $u$, $v$ and $w$ are vectors in $\mathbb{R}^n$, and $k$ and $l$ are scalars, then
(a) $u + v = v + u$
(b) $u + (v + w) = (u + v) + w$
(c) $u + 0 = u = 0 + u$
(d) $u + (-u) = 0$, that is, $u - u = 0$
(e) $k(lu) = (kl)u$
(f) $k(u + v) = ku + kv$
(g) $(k + l)u = ku + lu$
(h) $1u = u$
These are all easily verified, e.g. (g):
$$(k + l)u = (k + l)(u_1, u_2, \ldots, u_n) = ((k + l)u_1, (k + l)u_2, \ldots, (k + l)u_n)$$
$$= (ku_1 + lu_1, ku_2 + lu_2, \ldots, ku_n + lu_n) = (ku_1, ku_2, \ldots, ku_n) + (lu_1, lu_2, \ldots, lu_n) = ku + lu.$$
In analogy with $\mathbb{R}^2$ and $\mathbb{R}^3$, we can also define the notion of length and distance in $\mathbb{R}^n$.

Definition 1.2 If $u = (u_1, u_2, \ldots, u_n)$ is a vector in $\mathbb{R}^n$, we define the Euclidean norm (or Euclidean length) of $u$ to be
$$\|u\| = \sqrt{u_1^2 + u_2^2 + \cdots + u_n^2}.$$
Moreover, for points $u$ and $v$ in $\mathbb{R}^n$, we define the Euclidean distance between them to be
$$d(u, v) = \|u - v\| = \sqrt{(u_1 - v_1)^2 + (u_2 - v_2)^2 + \cdots + (u_n - v_n)^2}.$$
Similarly we can generalize the definition of the dot product on $\mathbb{R}^2$ and $\mathbb{R}^3$ as follows.

Definition 1.3 If $u = (u_1, u_2, \ldots, u_n)$ and $v = (v_1, v_2, \ldots, v_n)$ are vectors in $\mathbb{R}^n$, we define the Euclidean inner product $u \cdot v$ to be
$$u \cdot v = u_1 v_1 + u_2 v_2 + \cdots + u_n v_n.$$
We refer to $\mathbb{R}^n$, endowed with this inner product, as Euclidean $n$-space.

Observation 1.4 For each $u = (u_1, u_2, \ldots, u_n)$ in $\mathbb{R}^n$, $\|u\| = (u \cdot u)^{1/2}$.
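These formulas translate directly into code. A small sketch (again assuming numpy; the variable names are our own):

```python
import numpy as np

u = np.array([1.0, 2.0, 2.0])
v = np.array([4.0, 0.0, 3.0])

norm_u = np.sqrt(np.sum(u**2))         # Euclidean norm, Definition 1.2
dist_uv = np.sqrt(np.sum((u - v)**2))  # Euclidean distance d(u, v)
dot_uv = np.sum(u * v)                 # Euclidean inner product, Definition 1.3

print(norm_u, dist_uv, dot_uv)         # 3.0, ..., 10.0
# Observation 1.4: ||u|| = (u . u)^(1/2)
print(np.isclose(norm_u, np.sqrt(np.sum(u * u))))  # True
```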
Properties of the Euclidean Inner Product If $u$, $v$ and $w$ are vectors in $\mathbb{R}^n$, and $k$ is any real number, then
(a) $u \cdot v = v \cdot u$
(b) $u \cdot (v + w) = u \cdot v + u \cdot w$
(c) $k(u \cdot v) = (ku) \cdot v = u \cdot (kv)$
(d) $u \cdot u \geq 0$, and $u \cdot u = 0 \iff u = 0$
Again these are easily verified and are left as an exercise. (At least one of these
was checked in class.)
Now in $\mathbb{R}^2$ and $\mathbb{R}^3$, the dot product of vectors $a$ and $b$ was initially defined as
$$a \cdot b = \|a\| \|b\| \cos\theta,$$
where $\theta$ is the angle between $a$ and $b$. The formula in terms of components came later, and it was necessary to prove that this formula is indeed correct. Of course in $\mathbb{R}^n$, for $n \geq 4$, we have no predefined notion of "angle between vectors". However, we can use our definition of Euclidean norm and inner product to define such an angle.
Definition 1.5 If $u = (u_1, u_2, \ldots, u_n)$ and $v = (v_1, v_2, \ldots, v_n)$ are vectors in $\mathbb{R}^n$, define the cosine of the angle between $u$ and $v$ by
$$\cos\theta = \frac{u \cdot v}{\|u\| \|v\|}.$$
Then extending the notion of perpendicularity for vectors in $\mathbb{R}^2$ and $\mathbb{R}^3$ to vectors in $\mathbb{R}^n$, we say that vectors $u$ and $v$ in $\mathbb{R}^n$ are orthogonal if and only if $u \cdot v = 0$; that is, if and only if the angle between them is $90^\circ$.
Warning: the above definition does not quite contain the entire story. If the value of the right hand side of the equation does not lie in the interval $[-1, 1]$, then the definition makes no sense. That the right hand side does indeed lie in this interval is ensured by one of the most important inequalities in linear algebra:

Cauchy-Schwarz Inequality in $\mathbb{R}^n$
For all $u, v \in \mathbb{R}^n$,
$$|u \cdot v| \leq \|u\| \|v\|.$$
We will later see a more general version of this inequality, the proof of which is non-trivial.
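The inequality can be checked numerically on random vectors; this is only a sanity check, not a proof. A sketch (numpy assumed):

```python
import numpy as np

rng = np.random.default_rng(0)
for _ in range(1000):
    u = rng.normal(size=5)
    v = rng.normal(size=5)
    lhs = abs(u @ v)                             # |u . v|
    rhs = np.linalg.norm(u) * np.linalg.norm(v)  # ||u|| ||v||
    assert lhs <= rhs + 1e-12                    # Cauchy-Schwarz
    # Consequently cos(theta) = (u.v)/(||u|| ||v||) lies in [-1, 1]:
    assert -1 - 1e-12 <= (u @ v) / rhs <= 1 + 1e-12
print("Cauchy-Schwarz held on all samples")
```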
Properties of the norm If $u$ and $v$ are vectors in $\mathbb{R}^n$, and $k$ is any real number, then
(a) $\|u\| \geq 0$
(b) $\|u\| = 0 \iff u = 0$
(c) $\|ku\| = |k| \|u\|$
(d) $\|u + v\| \leq \|u\| + \|v\|$ (triangle inequality)

Proof (c) $\|ku\| = \sqrt{k^2 u_1^2 + k^2 u_2^2 + \cdots + k^2 u_n^2} = \sqrt{k^2 (u_1^2 + u_2^2 + \cdots + u_n^2)} = |k| \|u\|.$
(d)
$$\|u + v\|^2 = (u + v) \cdot (u + v) = u \cdot u + u \cdot v + v \cdot u + v \cdot v$$
$$= \|u\|^2 + 2\,u \cdot v + \|v\|^2$$
$$\leq \|u\|^2 + 2\,|u \cdot v| + \|v\|^2$$
$$\leq \|u\|^2 + 2\,\|u\| \|v\| + \|v\|^2 \qquad \text{(by Cauchy-Schwarz)}$$
$$= (\|u\| + \|v\|)^2,$$
and so the result follows by taking square roots. □
Proposition 1.6 If $u$ and $v$ are orthogonal vectors in $\mathbb{R}^n$, then
$$\|u + v\|^2 = \|u\|^2 + \|v\|^2.$$

Exercise 1.7 For $u, v \in \mathbb{R}^n$, show that
$$u \cdot v = \tfrac{1}{4}\|u + v\|^2 - \tfrac{1}{4}\|u - v\|^2.$$
2. Vector Spaces
2.1 Introduction
In this section, using as a base the theory of vectors in $\mathbb{R}^2$ and $\mathbb{R}^3$, and now indeed vectors in $\mathbb{R}^n$, we will take the next step and generalize the notion of a vector even further. The following is taken from "Elementary Linear Algebra, Applications Version", by Howard Anton and Chris Rorres.

"... we shall generalize the concept of a vector still further. We shall state a set of axioms which, if satisfied by a class of objects, will entitle those objects to be called "vectors". The axioms will be chosen by abstracting the most important properties of vectors in $\mathbb{R}^n$; as a consequence, vectors in $\mathbb{R}^n$ will automatically satisfy these axioms. Thus, our new concept of a vector will include our old vectors and many new kinds of vectors as well. These new types of vectors will include, among other things, various kinds of matrices and functions. Our work in this section is not an idle exercise in theoretical mathematics; it will provide a powerful tool for extending our geometric visualization to a wide variety of important mathematical problems where geometric intuition would not otherwise be available. Briefly stated, the idea is this: We can visualize vectors in $\mathbb{R}^2$ and $\mathbb{R}^3$ geometrically as arrows, which enables us to draw physical or mental pictures to help solve problems. Because the axioms that we will be using to create our new kinds of vectors will be based on properties of vectors in $\mathbb{R}^2$ and $\mathbb{R}^3$, these new vectors will have many of the familiar properties of vectors in $\mathbb{R}^2$ and $\mathbb{R}^3$. Consequently, when we want to solve a problem involving our new kinds of vectors, say matrices or functions, we may be able to get a foothold on the problem by visualizing geometrically what the corresponding problem would be like in $\mathbb{R}^2$ and $\mathbb{R}^3$."
So what will we mean by a vector space?

This will simply be any non-empty set $V$ such that
(I) given any pair of elements $u$ and $v$ in this set, we can associate with them another element "$u + v$" in $V$
(II) given any element $u$ in $V$ and any real number $k$, we can associate an element $ku$ in $V$
(III) these associations should satisfy a certain set of rules (axioms).

What are these rules, and from where do they arise?

Since it is our aim to generalize the notion of vectors in $\mathbb{R}^n$, these rules should be built from those properties which were highlighted as being the most important ones satisfied by vector addition and scalar multiplication in $\mathbb{R}^n$.
2.2 Definition of a Vector Space
Definition 2.2.1 Let $V$ be any nonempty set on which two operations, "vector addition" and "scalar multiplication", are defined. By vector addition, we mean a rule for associating with each pair of objects $u$ and $v$ in $V$ an object $u + v$, called the sum of $u$ and $v$; by scalar multiplication we mean a rule for associating with each real number $k$ and each object $u$ in $V$ an object $ku$, called the scalar multiple of $u$ by $k$.

If each of the following axioms is satisfied by all objects $u$, $v$ and $w$ in $V$ and all scalars $k$ and $l$, then we call $V$ a vector space, and we call the objects in $V$ vectors.
(a) If $u$ and $v$ are in $V$, then $u + v$ is in $V$
(b) $u + v = v + u$
(c) $u + (v + w) = (u + v) + w$
(d) There is an object in $V$ called a zero vector, denoted $0$, such that $u + 0 = u = 0 + u$ for all objects $u$ in $V$.
(e) For each $u$ in $V$, there is an object $-u$ in $V$, called the negative of $u$, such that $u + (-u) = 0 = (-u) + u$.
(f) If $k$ is any scalar and $u$ is any object in $V$, then $ku$ is in $V$.
(g) $k(lu) = (kl)u$
(h) $k(u + v) = ku + kv$
(i) $(k + l)u = ku + lu$
(j) $1u = u$
Important points to note
From these axioms, it can be deduced that in a vector space,
(i) the zero vector is unique.
(ii) given $u$ in $V$, its negative vector, $-u$, is unique.
(iii) $0u = 0$
(iv) $(-1)u = -u$
Proof (i) Suppose $w_1$ and $w_2$ are vectors in $V$ such that $u + w_1 = u$ and $u + w_2 = u$ for all $u \in V$. Then with $u = w_1$ in the second equation and $u = w_2$ in the first, we see that
$$w_1 + w_2 = w_1 \quad \text{and} \quad w_2 + w_1 = w_2.$$
Therefore by axiom (b), $w_1 = w_2$.

(ii) Suppose $w_1$ and $w_2$ are vectors in $V$ such that $u + w_1 = 0$ and $u + w_2 = 0$. Then
$$w_1 = w_1 + 0 = w_1 + (u + w_2) = (w_1 + u) + w_2 = (u + w_1) + w_2 = 0 + w_2 = w_2.$$
Parts (iii) and (iv) are left as exercises here. (They were proved in class.) □
2.3 Examples of Vector Spaces
1. $V = \mathbb{R}^n$, with the standard operations.

2. $W = P_n(t)$, the set of all real polynomials with degree less than or equal to $n$, with operations as follows: for $p, q \in P_n(t)$, i.e. $p(t) = a_0 + a_1 t + a_2 t^2 + \cdots + a_n t^n$, $q(t) = b_0 + b_1 t + b_2 t^2 + \cdots + b_n t^n$, and $k \in \mathbb{R}$, define polynomials $p + q$ and $kp$ by
$$(p + q)(t) = (a_0 + b_0) + (a_1 + b_1)t + (a_2 + b_2)t^2 + \cdots + (a_n + b_n)t^n,$$
$$(kp)(t) = ka_0 + ka_1 t + ka_2 t^2 + \cdots + ka_n t^n.$$

3. $S = \{(x, 3x) : x \in \mathbb{R}\}$, with the standard operations of $\mathbb{R}^2$.

4. $V = M_{2,2}(\mathbb{R})$ with the usual operations of matrix addition and scalar multiplication.

5. $T = \{A \in M_{2,2}(\mathbb{R}) : A^T = A\}$ with the usual operations of matrix addition and scalar multiplication.

6. With $a, b \in \mathbb{R}$, $a < b$, let $V = F([a, b])$, the set of all functions from the interval $[a, b]$ to $\mathbb{R}$.
We say that two functions $f, g \in F([a, b])$ are equal if and only if $f(x) = g(x)$ for all $x \in [a, b]$. Then $V$ is a vector space with operations:
$$(f + g)(x) = f(x) + g(x) \quad \text{and} \quad (kf)(x) = kf(x), \quad \text{for } f, g \in F([a, b]) \text{ and } k \in \mathbb{R}.$$

7. $L = \{(x, y) : x, y \in \mathbb{R}\}$, with operations defined as
$$(x, y) \oplus (x', y') = (x + x' + 1,\; y + y' + 1) \quad \text{and} \quad k \odot (x, y) = (k + kx - 1,\; k + ky - 1).$$

8. $X = \{x \in \mathbb{R} : x > 0\}$ with operations
$$x \oplus y = xy \quad \text{and} \quad k \odot x = x^k, \quad \text{for } x, y \in X \text{ and } k \in \mathbb{R}.$$
2.4 Examples of sets with operations failing at least one of the axioms

1. $V = \{(x, x + 5) : x \in \mathbb{R}\}$, with the standard operations of $\mathbb{R}^2$. This fails even axiom (a).

2. $V = \{(x, y) : x, y \in \mathbb{R}, y > 0\}$, with operations defined as
$$(x, y) \oplus (x', y') = (xy' + yx',\; yy') \quad \text{and} \quad k \odot (x, y) = \left(\frac{kx}{y},\; 1\right).$$
This satisfies all axioms other than (j).

3. $V = \{(x, y) : x, y \in \mathbb{R}, x > 0\}$, with operations defined as
$$(x, y) \oplus (x', y') = (xx',\; xy' + yx') \quad \text{and} \quad k \odot (x, y) = (x,\; ky).$$
This satisfies all axioms other than (i).
2.5 Subspaces
Many of the examples of vector spaces that we have considered so far have indeed
been examples of subspaces.
Definition 2.5.1 Suppose that V is a vector space, and that W is a nonempty
subset of V . If W is also a vector space under the same operations as V , then we
call W a subspace of V .
For example, in Section 2.3, $S$ is a subspace of $\mathbb{R}^2$ and $T$ is a subspace of $M_{2,2}(\mathbb{R})$. However, $L$ is not a subspace of $\mathbb{R}^2$ with the standard operations, since the operations on $L$ are defined differently.
Remarks 2.5.2
1. For any vector space $V$, $\{0\}$ is a subspace of $V$, where $0$ is the zero vector of $V$.
2. $V$ is always a subspace of itself.
Theorem 2.5.3 If $W$ is a nonempty subset of a vector space $V$, then it is a subspace of $V$ if and only if it is closed under vector addition and scalar multiplication (of $V$); that is, if and only if
- whenever $u$ and $v$ are in $W$, $u + v$ is also in $W$
- for each real number $k$, whenever $u$ is in $W$, $ku$ is also in $W$

Proof Axioms (b), (c), (g), (h), (i), and (j) follow automatically from the fact that $V$ is a vector space. Axioms (d) and (e) follow from closure under scalar multiplication, and points (iii) and (iv) stated after our definition of a vector space above. □
More Examples of Subspaces

1. Let $V = \mathbb{R}^3$ with the standard operations, and $W$ be the plane through the origin with normal vector $(1, -3, 2)$. Then $W$ is a subspace of $V$.

Check: $W = \{(x, y, z) : x - 3y + 2z = 0\} = \{(x, y, z) : (x, y, z) \cdot (1, -3, 2) = 0\}$.
If $(x_1, y_1, z_1), (x_2, y_2, z_2) \in W$, then $(x_1, y_1, z_1) + (x_2, y_2, z_2) \in W$, since
$$((x_1, y_1, z_1) + (x_2, y_2, z_2)) \cdot (1, -3, 2) = (x_1, y_1, z_1) \cdot (1, -3, 2) + (x_2, y_2, z_2) \cdot (1, -3, 2) = 0 + 0 = 0.$$
Similarly, for any $k \in \mathbb{R}$ and $(x, y, z) \in W$, $k(x, y, z) \in W$, since
$$(k(x, y, z)) \cdot (1, -3, 2) = k((x, y, z) \cdot (1, -3, 2)) = k0 = 0.$$

Remark 2.5.4 Clearly we could have done the same thing for any normal vector, and so we can say that every plane in $\mathbb{R}^3$ through the origin is a subspace of $\mathbb{R}^3$. Of course, any plane that does not contain the origin cannot be a subspace, since the origin is in fact the zero vector of $\mathbb{R}^3$.

2. Any line through the origin in $\mathbb{R}^n$ is a subspace of $\mathbb{R}^n$.

3. $W = \{f \in F([0,1]) : f(0) = 0\}$ is a subspace of $F([0,1])$.

4. The set of trace-zero matrices, $SL_2(\mathbb{R}) = \{A \in M_{2,2}(\mathbb{R}) : \mathrm{tr}(A) = 0\}$, is a subspace of $M_{2,2}(\mathbb{R})$. (The trace of a matrix $A$, written $\mathrm{tr}(A)$, is the sum of the diagonal entries of $A$. For example on $M_{2,2}(\mathbb{R})$, $\mathrm{tr}\begin{pmatrix} a & b \\ c & d \end{pmatrix} = a + d$.)
Exercise 2.5.5 Choose your favourite vector space and pick any subset that you
like. Test if it is a subspace.
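For subsets of $\mathbb{R}^n$, Theorem 2.5.3 reduces the subspace test to two closure checks, which can be spot-checked numerically. A sketch (numpy assumed) for the plane from Example 1 above, under our sign reading of its equation; a numeric check like this suggests, but does not prove, closure:

```python
import numpy as np

def in_W(p, tol=1e-9):
    # membership in the plane x - 3y + 2z = 0
    x, y, z = p
    return abs(x - 3*y + 2*z) < tol

rng = np.random.default_rng(1)
for _ in range(100):
    y1, z1, y2, z2 = rng.normal(size=4)
    u = np.array([3*y1 - 2*z1, y1, z1])   # a general element of W
    v = np.array([3*y2 - 2*z2, y2, z2])
    k = rng.normal()
    assert in_W(u + v)   # closed under addition
    assert in_W(k * u)   # closed under scalar multiplication
print("closure checks passed")
```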
2.6 Spanning Sets

Let's return to our plane through the origin, $W = \{(x, y, z) : x - 3y + 2z = 0\} \subseteq \mathbb{R}^3$, above. We can rewrite the elements of this space as follows:
$$W = \{(x, y, z) : x - 3y + 2z = 0\} = \{(x, y, z) : x = 3y - 2z\}$$
$$= \{(3y - 2z, y, z) : y, z \in \mathbb{R}\} = \{y(3, 1, 0) + z(-2, 0, 1) : y, z \in \mathbb{R}\}.$$
What we have done is write every vector in this vector space $W$ as a sum of scalar multiples of particular vectors (in this case $(3, 1, 0)$ and $(-2, 0, 1)$) in $W$.
Definitions 2.6.1
(1) Suppose that $v_1, v_2, \ldots, v_m$ are vectors in a vector space $V$. We say that a vector $v$ in $V$ is a linear combination of $v_1, v_2, \ldots, v_m$ if there are scalars $a_1, a_2, \ldots, a_m$ such that
$$v = a_1 v_1 + a_2 v_2 + \cdots + a_m v_m.$$
(2) Given a set of vectors $\{v_1, v_2, \ldots, v_m\}$ in a vector space $V$, we define the span of $\{v_1, v_2, \ldots, v_m\}$ to be the set of all linear combinations of these vectors, i.e.
$$\mathrm{Span}\,\{v_1, v_2, \ldots, v_m\} = \{a_1 v_1 + a_2 v_2 + \cdots + a_m v_m : a_1, a_2, \ldots, a_m \in \mathbb{R}\}.$$
(3) If $U = \mathrm{Span}\,\{v_1, v_2, \ldots, v_m\}$, we say that $\{v_1, v_2, \ldots, v_m\}$ is a spanning set for $U$.
Example 2.6.2 For $W = \{(x, y, z) : x - 3y + 2z = 0\}$ as above, we have that $W = \mathrm{Span}\,\{(3, 1, 0), (-2, 0, 1)\}$.

Example 2.6.3 $T = \{A \in M_{2,2}(\mathbb{R}) : A^T = A\}$. Every element of this vector space is of the form $\begin{pmatrix} a & b \\ b & d \end{pmatrix}$, for some $a, b, d \in \mathbb{R}$, and so
$$T = \mathrm{Span}\left\{ \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}, \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix} \right\}.$$

Example 2.6.4 Every element of $SL_2(\mathbb{R})$ is of the form $\begin{pmatrix} a & b \\ c & -a \end{pmatrix}$, for some $a, b, c \in \mathbb{R}$, and so
$$SL_2(\mathbb{R}) = \mathrm{Span}\left\{ \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix} \right\}.$$

Example 2.6.5 $D_2 = \left\{ \begin{pmatrix} a & 0 \\ 0 & b \end{pmatrix} : a, b \in \mathbb{R} \right\} = \mathrm{Span}\left\{ \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}, \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix} \right\}.$
Non-uniqueness of spanning sets For a vector space $V$, a spanning set is not unique. In fact, there are infinitely many different spanning sets for a given vector space. For example, for $W$ above we could equally have proceeded as follows:
$$W = \{(x, y, z) : x - 3y + 2z = 0\} = \{(x, y, z) : y = \tfrac{1}{3}x + \tfrac{2}{3}z\}$$
$$= \{(x, \tfrac{1}{3}x + \tfrac{2}{3}z, z) : x, z \in \mathbb{R}\} = \{x(1, \tfrac{1}{3}, 0) + z(0, \tfrac{2}{3}, 1) : x, z \in \mathbb{R}\},$$
and so also
$$W = \mathrm{Span}\,\{(1, \tfrac{1}{3}, 0), (0, \tfrac{2}{3}, 1)\}.$$
Indeed, any pair of non-parallel vectors lying in the plane $W$ will form a spanning set for $W$.
Some general remarks 2.6.7 Suppose that $V$ is a vector space.

(1) If $\{u_1, u_2, \ldots, u_m\}$ spans $V$, and $w$ is any vector in $V$, then $\{u_1, u_2, \ldots, u_m, w\}$ also spans $V$:
If $v \in V$, then $v = a_1 u_1 + a_2 u_2 + \cdots + a_m u_m$ for some scalars $a_1, a_2, \ldots, a_m$. Then of course we also have that $v = a_1 u_1 + a_2 u_2 + \cdots + a_m u_m + 0w$, and so $v$ is a linear combination of $\{u_1, u_2, \ldots, u_m, w\}$.

(2) Suppose $\{u_1, u_2, \ldots, u_m\}$ spans $V$, and that some $u_k$ is a linear combination of the vectors $u_1, u_2, \ldots, u_{k-1}, u_{k+1}, \ldots, u_m$. Then $\{u_1, u_2, \ldots, u_{k-1}, u_{k+1}, \ldots, u_m\}$ also spans $V$:
We need to check that whenever $v \in V$ is a linear combination of $u_1, u_2, \ldots, u_m$, then it is also a linear combination of $u_1, u_2, \ldots, u_{k-1}, u_{k+1}, \ldots, u_m$. This can easily be seen from the following:
$$v = a_1 u_1 + \cdots + a_{k-1} u_{k-1} + a_k u_k + a_{k+1} u_{k+1} + \cdots + a_m u_m$$
$$= a_1 u_1 + \cdots + a_{k-1} u_{k-1} + a_k (b_1 u_1 + \cdots + b_{k-1} u_{k-1} + b_{k+1} u_{k+1} + \cdots + b_m u_m) + a_{k+1} u_{k+1} + \cdots + a_m u_m$$
$$= (a_1 + a_k b_1) u_1 + \cdots + (a_{k-1} + a_k b_{k-1}) u_{k-1} + (a_{k+1} + a_k b_{k+1}) u_{k+1} + \cdots + (a_m + a_k b_m) u_m,$$
for some $b_1, \ldots, b_{k-1}, b_{k+1}, \ldots, b_m \in \mathbb{R}$.
A common problem 2.6.8 A question we often need to answer is the following:
Given a vector space $V$, and vectors $u_1, u_2, \ldots, u_m \in V$, when is a given vector $w$ in $\mathrm{Span}\,\{u_1, u_2, \ldots, u_m\}$?
Rephrasing: do there exist scalars $a_1, a_2, \ldots, a_m$ such that $w = a_1 u_1 + a_2 u_2 + \cdots + a_m u_m$?
Example 2.6.9 In $\mathbb{R}^4$, let $u_1 = (1, 2, -1, 3)$, $u_2 = (2, 4, 1, -2)$ and $u_3 = (3, 6, 3, -7)$. Is $w = (1, 2, -4, 11)$ in $\mathrm{Span}\,\{u_1, u_2, u_3\}$?

We need to try to find scalars $a$, $b$ and $c$ such that $au_1 + bu_2 + cu_3 = w$, i.e. we need to solve
$$(a, 2a, -a, 3a) + (2b, 4b, b, -2b) + (3c, 6c, 3c, -7c) = (1, 2, -4, 11).$$
This gives us the system of linear equations
$$a + 2b + 3c = 1$$
$$2a + 4b + 6c = 2$$
$$-a + b + 3c = -4$$
$$3a - 2b - 7c = 11,$$
which, when we solve by row reduction, we find has infinitely many solutions: $a = 3 + t$, $b = -1 - 2t$, $c = t$, for $t \in \mathbb{R}$. Therefore $w \in \mathrm{Span}\,\{u_1, u_2, u_3\}$. Taking $t = 1$ for instance gives us
$$4u_1 - 3u_2 + u_3 = w.$$
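The span-membership test is just solvability of a linear system, so we can hand it to a computer. A sketch using sympy's exact linear algebra (an assumption on our part; any computer algebra system would do, and the vector entries are as reconstructed in this example):

```python
import sympy as sp

u1 = sp.Matrix([1, 2, -1, 3])
u2 = sp.Matrix([2, 4, 1, -2])
u3 = sp.Matrix([3, 6, 3, -7])
w  = sp.Matrix([1, 2, -4, 11])

# Columns of M are u1, u2, u3; w is in their span iff M a = w is consistent,
# i.e. iff appending w as an extra column does not increase the rank.
M = sp.Matrix.hstack(u1, u2, u3)
print(M.rank(), sp.Matrix.hstack(M, w).rank())  # equal ranks => w in span

# One particular solution (t = 1 in the general solution): 4u1 - 3u2 + u3 = w
print(4*u1 - 3*u2 + u3 == w)  # True
```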
Example 2.6.10 In $P_2(t)$, let $p_1(t) = 1 + 3t + 4t^2$, $p_2(t) = 2 - t + t^2$ and $p_3(t) = 3 - 5t - 2t^2$. Do we have that $q(t) = 4 + 2t + 2t^2$ is in the span of $\{p_1, p_2, p_3\}$?

We need to try to find scalars $a$, $b$ and $c$ such that $ap_1 + bp_2 + cp_3 = q$, i.e. we need to solve
$$a + 3at + 4at^2 + 2b - bt + bt^2 + 3c - 5ct - 2ct^2 = 4 + 2t + 2t^2.$$
This gives us the system of linear equations
$$a + 2b + 3c = 4$$
$$3a - b - 5c = 2$$
$$4a + b - 2c = 2.$$
Solving by row reduction we get
$$\left(\begin{array}{ccc|c} 1 & 2 & 3 & 4 \\ 3 & -1 & -5 & 2 \\ 4 & 1 & -2 & 2 \end{array}\right) \to \left(\begin{array}{ccc|c} 1 & 2 & 3 & 4 \\ 0 & 1 & 2 & \frac{10}{7} \\ 0 & 0 & 0 & 1 \end{array}\right),$$
from which we see that the system of equations has no solution. Therefore $q \notin \mathrm{Span}\,\{p_1, p_2, p_3\}$.
Remark 2.6.11 Deciding if a given vector lies in the span of a given set of vectors amounts to solving a system of linear equations, provided that our vector space is "finite dimensional". All of the vector spaces we have considered, and will consider, with the exception of the function spaces, have this property. We will expand on this later.
Theorem 2.6.12 Suppose that $V$ is a vector space and that $u_1, u_2, \ldots, u_k \in V$.
(a) $U = \mathrm{Span}\,\{u_1, u_2, \ldots, u_k\}$ is a subspace of $V$.
(b) If $W$ is any subspace of $V$ such that $u_1, u_2, \ldots, u_k \in W$, then $U \subseteq W$.

Proof (a) We just need to check that $U$ is closed under addition and scalar multiplication:
If $a_1 u_1 + a_2 u_2 + \cdots + a_k u_k$ and $b_1 u_1 + b_2 u_2 + \cdots + b_k u_k$ are in $U$, then their sum,
$$a_1 u_1 + a_2 u_2 + \cdots + a_k u_k + b_1 u_1 + b_2 u_2 + \cdots + b_k u_k = (a_1 + b_1) u_1 + (a_2 + b_2) u_2 + \cdots + (a_k + b_k) u_k,$$
is clearly also in $U$. Similarly, for any $c \in \mathbb{R}$,
$$c(a_1 u_1 + a_2 u_2 + \cdots + a_k u_k) = ca_1 u_1 + ca_2 u_2 + \cdots + ca_k u_k$$
is also in $U$.

(b) Since $W$ is a subspace, it is closed under addition and scalar multiplication. Therefore, since $u_1, u_2, \ldots, u_k \in W$, it follows that each of $a_1 u_1, a_2 u_2, \ldots, a_k u_k \in W$, and furthermore that their sum $a_1 u_1 + a_2 u_2 + \cdots + a_k u_k \in W$; i.e. every element of $U$ is in $W$. □
Remark 2.6.13 If, given a subset of a vector space, you can easily see that this
subset is the span of some set of vectors, then this is enough to deduce that this
subset is in fact a subspace.
Question 2.6.14 What are the subspaces of $\mathbb{R}$?
Well, we know that $\{0\}$ and $\mathbb{R}$ are subspaces. Any others?
Let $S$ be a subspace of $\mathbb{R}$ with $S \neq \{0\}$, so that there exists a non-zero vector $u \in S$. Since $S$ is a subspace, it follows that $ku \in S$ for all real numbers $k$; in particular, any $x \in \mathbb{R}$ can be written as $x = (x/u)u \in S$. Therefore $S = \mathbb{R}$, and so $\{0\}$ and $\mathbb{R}$ are the only subspaces.
Question 2.6.15 What are the subspaces of $\mathbb{R}^2$?
We know that $\{0\}$ and $\mathbb{R}^2$ are subspaces. Again we ask if there are any others.
Let $S$ be a subspace of $\mathbb{R}^2$ with $S \neq \{0\}$, so that there exists a non-zero vector $u = (x_1, x_2) \in S$. Again since $S$ is a subspace, it follows that $ku \in S$ for all real numbers $k$.
There are then two possibilities: either $S = \mathrm{Span}\,\{u\}$, i.e. the line through the origin with direction vector $u$, or $S \neq \mathrm{Span}\,\{u\}$.
From the first of these possibilities, we see that every line through the origin is also a subspace of $\mathbb{R}^2$.
If $S \neq \mathrm{Span}\,\{u\}$, then there exists a non-zero vector $v = (y_1, y_2) \in S$ such that $v \notin \mathrm{Span}\,\{u\}$. From part (b) of the above theorem, we have that $S \supseteq \mathrm{Span}\,\{u, v\}$. But $u$ and $v$ are a pair of non-parallel vectors in $\mathbb{R}^2$, and so $\mathrm{Span}\,\{u, v\} = \mathbb{R}^2$.
Therefore the only subspaces of $\mathbb{R}^2$ are $\{0\}$, all lines through the origin, and $\mathbb{R}^2$ itself.
Question 2.6.16 What are the subspaces of $\mathbb{R}^3$?
By similar reasoning, and using the fact that whenever we have three non-coplanar vectors in $\mathbb{R}^3$, every other vector in $\mathbb{R}^3$ can be written as a linear combination of these three vectors, we find that the only subspaces of $\mathbb{R}^3$ are: $\{0\}$, all lines through the origin, all planes through the origin, and $\mathbb{R}^3$ itself.
2.7 Linear Dependence and Independence

In the previous section, we considered sets of vectors which span a given vector space, in the sense that every vector in the space can be expressed as a linear combination of these spanning vectors. In general, there may be many different ways in which a given vector can be expressed as a linear combination of the vectors in a spanning set. We will now begin to study conditions under which each vector can be expressed as a linear combination of spanning vectors in exactly one way.

Definition 2.7.1 Let $V$ be a vector space. We say that the set of vectors $\{v_1, v_2, \ldots, v_k\}$ in $V$ is linearly dependent (LD) if there are scalars $a_1, a_2, \ldots, a_k \in \mathbb{R}$, with at least one of them non-zero, such that
$$a_1 v_1 + a_2 v_2 + \cdots + a_k v_k = 0.$$
Conversely, if the only solution to $a_1 v_1 + a_2 v_2 + \cdots + a_k v_k = 0$ is $a_1 = a_2 = \cdots = a_k = 0$, then we say that $\{v_1, v_2, \ldots, v_k\}$ is linearly independent (LI).
Regarding the point made in the introduction to this section, we draw attention to the following.

Observation 2.7.2 Suppose that $\{v_1, v_2, \ldots, v_k\}$ in $V$ is a LI set of vectors, and that $w \in V$ is in the span of $\{v_1, v_2, \ldots, v_k\}$. Then the scalars $a_1, a_2, \ldots, a_k \in \mathbb{R}$ such that $a_1 v_1 + a_2 v_2 + \cdots + a_k v_k = w$ are unique; that is, there is only one way to express $w$ as a linear combination of $\{v_1, v_2, \ldots, v_k\}$. To see this, suppose $a_1 v_1 + a_2 v_2 + \cdots + a_k v_k = w$ and $b_1 v_1 + b_2 v_2 + \cdots + b_k v_k = w$. Then it follows that
$$a_1 v_1 + a_2 v_2 + \cdots + a_k v_k = b_1 v_1 + b_2 v_2 + \cdots + b_k v_k,$$
i.e.
$$(a_1 - b_1) v_1 + (a_2 - b_2) v_2 + \cdots + (a_k - b_k) v_k = 0.$$
But since $\{v_1, v_2, \ldots, v_k\}$ is LI, this means that $a_1 - b_1 = a_2 - b_2 = \cdots = a_k - b_k = 0$, i.e. $a_1 = b_1$, $a_2 = b_2$, $\ldots$, $a_k = b_k$. □
We now look at examples where we determine whether a given set of vectors is LI or LD.

Example 2.7.3 $\{(1, 0), (0, 1)\}$ is a LI set in $\mathbb{R}^2$:
Solving $a(1, 0) + b(0, 1) = (0, 0)$ gives us $a = 0$, $b = 0$.
Example 2.7.4 How about the set $\{(1, 1, 0), (1, 3, 2), (4, 9, 5)\}$ in $\mathbb{R}^3$?

We need to solve $a(1, 1, 0) + b(1, 3, 2) + c(4, 9, 5) = (0, 0, 0)$, i.e. $(a + b + 4c, a + 3b + 9c, 2b + 5c) = (0, 0, 0)$, i.e. the system of linear equations
$$a + b + 4c = 0$$
$$a + 3b + 9c = 0$$
$$2b + 5c = 0,$$
which we solve by row reduction:
$$\left(\begin{array}{ccc|c} 1 & 1 & 4 & 0 \\ 1 & 3 & 9 & 0 \\ 0 & 2 & 5 & 0 \end{array}\right) \to \left(\begin{array}{ccc|c} 1 & 1 & 4 & 0 \\ 0 & 1 & \frac{5}{2} & 0 \\ 0 & 0 & 0 & 0 \end{array}\right),$$
which has general solution $a = \frac{3}{2}t$, $b = \frac{5}{2}t$, $c = -t$, for $t \in \mathbb{R}$. Thus this set of vectors is linearly dependent; for example (with $t = 2$),
$$3(1, 1, 0) + 5(1, 3, 2) - 2(4, 9, 5) = (0, 0, 0).$$
Example 2.7.5 The set $\{(1, -1, 2, 3), (5, 1, 0, 2), (2, 1, -1, 6)\}$ in $\mathbb{R}^4$?

We need to solve $a(1, -1, 2, 3) + b(5, 1, 0, 2) + c(2, 1, -1, 6) = (0, 0, 0, 0)$, i.e. $(a + 5b + 2c, -a + b + c, 2a - c, 3a + 2b + 6c) = (0, 0, 0, 0)$, i.e. the system of linear equations
$$a + 5b + 2c = 0$$
$$-a + b + c = 0$$
$$2a - c = 0$$
$$3a + 2b + 6c = 0,$$
which we solve by row reduction:
$$\left(\begin{array}{ccc|c} 1 & 5 & 2 & 0 \\ -1 & 1 & 1 & 0 \\ 2 & 0 & -1 & 0 \\ 3 & 2 & 6 & 0 \end{array}\right) \to \left(\begin{array}{ccc|c} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 \end{array}\right).$$
Therefore there is a unique solution, $a = b = c = 0$, and so this set of vectors is LI.
 4 7  1 1  1 2   1 1  
Example 2.7.6 The set of matrices 
,
,
,
  in M 2,2 (R ) .
7
9
1
1
3
4
4
5









We need to solve
4 7
 1 1
1 2
 1 1  0 0
a
  b
c
d

,
7 9
 1 1
3 4
 4 5  0 0
 4a  b  c  d
i.e. 
 7a  b  3c  4d
equations:
7a  b  2c  d   0 0 

 , which produces the system of linear
9a  b  4c  5d   0 0 
4a  b  c  d  0
7a  b  2c  d  0
7a  b  3c  4d  0
.
9a  b  4c  5d  0
If we solve by row reduction, we will see that this system is consistent, and so the
set of matrices is LD.
Remark 2.7.7 The procedure for deciding if a given set of vectors is LI or LD is always the same:
- write down the equation we need to solve, $a_1 v_1 + a_2 v_2 + \cdots + a_k v_k = 0$
- form the appropriate augmented matrix and solve by row reduction
- if there are infinitely many solutions, then the set is LD, while if there is a unique solution, i.e. $a_1 = a_2 = \cdots = a_k = 0$, then the set is LI.
(A code sketch of this procedure is given below.)
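A sketch (sympy assumed) that implements exactly the test in Remark 2.7.7: put the candidate vectors into the columns of a matrix and ask whether the homogeneous system has only the trivial solution, which happens precisely when the rank equals the number of vectors:

```python
import sympy as sp

def is_linearly_independent(vectors):
    # Columns of A are the vectors; a1 v1 + ... + ak vk = 0 has only the
    # trivial solution iff A has no free columns, i.e. rank(A) = k.
    A = sp.Matrix.hstack(*[sp.Matrix(v) for v in vectors])
    return A.rank() == len(vectors)

print(is_linearly_independent([(1, 0), (0, 1)]))                   # True  (Example 2.7.3)
print(is_linearly_independent([(1, 1, 0), (1, 3, 2), (4, 9, 5)]))  # False (Example 2.7.4)
```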
Proposition 2.7.8 A set $\{v_1, v_2, \ldots, v_m\}$ is linearly dependent if and only if at least one of these vectors can be written as a linear combination of the others.

Proof ($\Rightarrow$) Suppose $\{v_1, v_2, \ldots, v_m\}$ is LD. Then there is a solution to $a_1 v_1 + a_2 v_2 + \cdots + a_m v_m = 0$ where at least one of the $a_i$ is non-zero. Suppose, without loss of generality, that $a_1 \neq 0$. Then from $a_1 v_1 + a_2 v_2 + \cdots + a_m v_m = 0$ we have that
$$v_1 = -\frac{a_2}{a_1} v_2 - \cdots - \frac{a_m}{a_1} v_m,$$
and so we have been able to write $v_1$ as a linear combination of $\{v_2, \ldots, v_m\}$.

($\Leftarrow$) Suppose, without loss of generality, that $v_1$ can be written as a linear combination of $\{v_2, \ldots, v_m\}$, i.e.
$$v_1 = b_2 v_2 + \cdots + b_m v_m.$$
From this we see that $v_1 - b_2 v_2 - \cdots - b_m v_m = 0$, and so $\{v_1, v_2, \ldots, v_m\}$ is LD. (Since $a_1 v_1 + a_2 v_2 + \cdots + a_m v_m = 0$ has a solution where $a_1 = 1 \neq 0$.) □
□
Equivalently 2.7.9 This observation can clearly be rephrased by saying that
{v1 , v 2 , , v m } is linearly independent if and only if no vector in this set can be written
as a linear combination of the others.
Observation 2.7.10 Proposition 2.7.8, combined with point (2) of 2.6.7, allows us to
deduce that if we have a linearly dependent spanning set, then we can reduce this
set in size while retaining the fact that it is a spanning set. This will soon be of great
importance to us, since we will wish to find spanning sets of minimum size.
Proposition 2.7.11 Indeed, we can strengthen the above observation if we assume each of the vectors in our set is non-zero:
If $\{v_1, v_2, \ldots, v_m\}$, with $m \geq 2$, is a LD set, and each $v_i \neq 0$, then some $v_k$, with $k \geq 2$, is a linear combination of $\{v_1, \ldots, v_{k-1}\}$. (That is, at least one of these vectors is a linear combination of the vectors preceding it in the list.)

Proof $\{v_1, v_2, \ldots, v_m\}$ is LD, and so there are scalars $a_1, a_2, \ldots, a_m \in \mathbb{R}$, at least one of which is non-zero, such that $a_1 v_1 + a_2 v_2 + \cdots + a_m v_m = 0$. Choosing the largest $k$ such that $a_k \neq 0$, we have that $a_1 v_1 + \cdots + a_k v_k = 0$, and so
$$v_k = -\frac{a_1}{a_k} v_1 - \cdots - \frac{a_{k-1}}{a_k} v_{k-1}.$$
It remains to verify that $k$ cannot be $1$:
If $k = 1$, this would mean that $a_1 v_1 = 0$, which in turn would imply that $v_1 = 0$. This gives a contradiction, since we are assuming that each $v_i \neq 0$, and so $k \geq 2$. □
Remarks 2.7.12
- If we have a set of vectors, one of which is the zero vector, then this set is immediately seen to be LD: for any set $\{0, u_1, u_2, \ldots, u_l\}$, clearly
$$a0 + 0u_1 + 0u_2 + \cdots + 0u_l = 0,$$
with any non-zero $a$, is a solution.
- If $v$ is any non-zero vector in a vector space, the set $\{v\}$, containing only the vector $v$, is linearly independent, since the only solution to $av = 0$ is $a = 0$.
- If $\{u_1, u_2, \ldots, u_m\}$ is a linearly dependent set of vectors in a vector space $V$, and $w_1, w_2, \ldots, w_l$ are any other vectors in $V$, then the set $\{u_1, u_2, \ldots, u_m, w_1, w_2, \ldots, w_l\}$ is also linearly dependent:
This follows since if there are scalars $a_1, a_2, \ldots, a_m \in \mathbb{R}$, with at least one of them non-zero, such that $a_1 u_1 + a_2 u_2 + \cdots + a_m u_m = 0$, then we have that
$$a_1 u_1 + a_2 u_2 + \cdots + a_m u_m + 0w_1 + 0w_2 + \cdots + 0w_l = 0.$$
This tells us that if a subset of a set of vectors is LD, then the whole set is LD. Equivalently, if a set of vectors is LI, then every subset of this set also has to be LI.
Question 2.7.13 Given a linearly independent set of vectors $\{v_1, v_2, \ldots, v_k\}$ in a vector space $V$, how can we increase the size of this set, while retaining the property of linear independence for our new set?

The answer is given by the following theorem. This too will be of fundamental importance to us, since we will also be looking for linearly independent sets of vectors of maximum size. (Compare this remark with that made in Observation 2.7.10.)

Theorem 2.7.14 Let $\{v_1, v_2, \ldots, v_k\}$ be a linearly independent set of vectors in $V$, and $v$ any other non-zero vector in $V$. The set $\{v_1, v_2, \ldots, v_k, v\}$ is linearly independent if and only if $v \notin \mathrm{Span}\,\{v_1, v_2, \ldots, v_k\}$.

Proof ($\Rightarrow$) If $\{v_1, v_2, \ldots, v_k, v\}$ is linearly independent, then by Proposition 2.7.8, no vector in this set is a linear combination of the other vectors, and so in particular, $v$ is not a linear combination of $v_1, v_2, \ldots, v_k$.

($\Leftarrow$) Considering the equation $a_1 v_1 + a_2 v_2 + \cdots + a_k v_k + a_{k+1} v = 0$ (*), we need to deduce from our assumption that $v \notin \mathrm{Span}\,\{v_1, v_2, \ldots, v_k\}$, together with the linear independence of $\{v_1, v_2, \ldots, v_k\}$, that $a_1, a_2, \ldots, a_k, a_{k+1}$ must all be $0$.
If $a_{k+1} \neq 0$, we would then have that
$$v = -\frac{a_1}{a_{k+1}} v_1 - \frac{a_2}{a_{k+1}} v_2 - \cdots - \frac{a_k}{a_{k+1}} v_k,$$
and so $v$ would be in $\mathrm{Span}\,\{v_1, v_2, \ldots, v_k\}$. This is contrary to our assumption, and so $a_{k+1}$ must be zero.
Then by (*), we have that $a_1 v_1 + a_2 v_2 + \cdots + a_k v_k = 0$, from which it follows that also $a_1 = a_2 = \cdots = a_k = 0$, since $\{v_1, v_2, \ldots, v_k\}$ is a LI set. Therefore $\{v_1, v_2, \ldots, v_k, v\}$ is linearly independent. □
Therefore, we can increase the size of a linearly independent set, and retain linear independence, if and only if we can find another vector which is not in the span of the original set. In the following examples, we will try to do this for the given linearly independent sets of vectors.

Example 2.7.15 The LI set $\left\{ \begin{pmatrix} 1 & 1 \\ 0 & 0 \end{pmatrix}, \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix} \right\}$ in $M_{2,2}(\mathbb{R})$.

Example 2.7.16 The LI set $\{(1, 4), (3, 5)\}$ in $\mathbb{R}^2$.

Example 2.7.17 The LI set $\{(1, 3, 1, 2), (2, 5, -1, 3), (1, 3, 7, 2)\}$ in $\mathbb{R}^4$.

Example 2.7.18 $W = \{(x, y, z) : x - 3y + 2z = 0\} = \mathrm{Span}\,\{(3, 1, 0), (-2, 0, 1)\} \subseteq \mathbb{R}^3$. Can we extend the linearly independent set $\{(3, 1, 0), (-2, 0, 1)\}$ to a larger linearly independent set in $\mathbb{R}^3$?
Remark 2.7.19 Any 4 vectors in $\mathbb{R}^3$ are LD. Why?

More generally, we have the following theorem, which is very important in the theory of vector spaces:

Theorem 2.7.20 If a vector space $V$ can be spanned by $n$ vectors, and $\{w_1, w_2, \ldots, w_m\}$ is a linearly independent set in $V$, then $m \leq n$.

Proof
□

This result can be stated as:
Size of any spanning set for a vector space $V$ $\geq$ Size of any LI set in $V$.
Example 2.7.21 Since we know that $\mathbb{R}^4$ can be spanned by four vectors, for example $\{(1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)\}$, we can deduce that any five or more vectors in $\mathbb{R}^4$ are linearly dependent.
Example 2.7.22
Example 2.7.23
2.8 Basis and Dimension
What if a spanning set for a vector space V is also linearly independent?
Definition 2.8.1 A set $\{v_1, v_2, \ldots, v_n\}$ in a vector space $V$ is a basis for $V$ if
1. $\{v_1, v_2, \ldots, v_n\}$ spans $V$, and
2. $\{v_1, v_2, \ldots, v_n\}$ is linearly independent.
Remark 2.8.2 So a basis is a largest linearly independent set, or a smallest
spanning set.
We will now find bases for each of the following vector spaces.
Example 2.8.3 $\{(1, 0), (0, 1)\}$ is a basis for $\mathbb{R}^2$.

Example 2.8.4 Another basis for $\mathbb{R}^2$?

Example 2.8.5 We have seen that the plane through the origin,
$$W = \{(x, y, z) : x - 3y + 2z = 0\},$$
is spanned by the set $\{(3, 1, 0), (-2, 0, 1)\}$, and that this set is also LI. Therefore $\{(3, 1, 0), (-2, 0, 1)\}$ is a basis for $W$.
Example 2.8.6 A basis for the hyperplane through the origin $H = \{(x, y, z, w) : 4x + y + 5z + 3w = 0\}$?

Example 2.8.7 A basis for $SL_2 = \{A \in M_{2,2}(\mathbb{R}) : \mathrm{tr}(A) = 0\}$?

Example 2.8.8 A basis for $\mathbb{R}^n$?
Question 2.8.9 Do all bases for a given vector space have the same number of vectors?

The following theorem is fundamental in vector space theory.

Theorem 2.8.10 If $\{v_1, v_2, \ldots, v_n\}$ and $\{w_1, w_2, \ldots, w_m\}$ are both bases for a vector space $V$, then $n = m$.

Proof
□

Definition 2.8.11 If a vector space $V$ has a finite basis $\{v_1, v_2, \ldots, v_n\}$, then we say that the dimension of $V$ is $n$, or that $V$ is $n$-dimensional. We write $\dim V = n$.
Example 2.8.12 $\dim \mathbb{R}^2 =$
Example 2.8.13 $\dim \mathbb{R}^n =$
Example 2.8.14 $\dim SL_2 =$
Example 2.8.15 Dimension of a plane through the origin in $\mathbb{R}^3$ $=$
Example 2.8.16 Dimension of the vector space of diagonal $2 \times 2$ real matrices $=$
Example 2.8.17 $\dim M_{m,n}(\mathbb{R}) =$

We can now state that:
Size of any spanning set for $V$ $\geq$ $\dim V$ $\geq$ Size of any LI set in $V$.
So in general, how can we find bases for vector spaces?
Theorem 2.8.18 Suppose that $V$ is a vector space, and that $\dim V = n$. Then
1. Any LI set $\{v_1, v_2, \ldots, v_n\}$ of $n$ vectors in $V$ is a basis (i.e. it automatically spans $V$).
2. Any spanning set $\{w_1, w_2, \ldots, w_n\}$ of $n$ vectors in $V$ is a basis (i.e. it is automatically LI).
3. If $\{v_1, v_2, \ldots, v_m\}$ is a LI set in $V$, then there are vectors $v_{m+1}, v_{m+2}, \ldots, v_n$ in $V$ such that $\{v_1, v_2, \ldots, v_m, v_{m+1}, v_{m+2}, \ldots, v_n\}$ is a basis for $V$ (i.e. every LI set can be extended to a basis).
4. If $\{w_1, w_2, \ldots, w_l\}$ spans $V$, then there is a subset of this set which forms a basis for $V$.

Proof
1.
2.
3.
4.
□
So this theorem tells us that if we know that the dimension of our vector space is n
say, then to find a basis it suffices to find a spanning set with n vectors, or a linearly
independent set with n vectors.
Example 2.8.19 Does the set of vectors $\{(1, 2, 5), (2, 5, 1), (1, 5, 2)\}$ form a basis for $\mathbb{R}^3$?
Example 2.8.20 The pair of vectors $\{(1, 1, 1), (1, 2, 3)\}$ is LI in $\mathbb{R}^3$. Extend this set to a basis for $\mathbb{R}^3$.
Question 2.8.21 Suppose that $W$ is a vector space with dimension $n$, and $U$ is a subspace of $W$. What can we say about the dimension of $U$?

Theorem 2.8.22 If $W$ is a vector space with dimension $n$, and $U$ is a subspace of $W$, then
1. $\dim U \leq \dim W$
2. $\dim U = \dim W$ if and only if $U = W$.

Proof
1.
2.
□

Observations 2.8.23
- So, for example, any 3-dimensional subspace of $\mathbb{R}^3$ has to be $\mathbb{R}^3$ itself.
- Every subspace of $\mathbb{R}^5$, for example, has dimension 0, 1, 2, 3, 4 or 5, where we say that a space has dimension 0 when it is the space containing only the zero vector.
- Any 4-dimensional subspace of $M_{2,2}(\mathbb{R})$ has to be $M_{2,2}(\mathbb{R})$ itself.
2.9 Coordinate Vectors

Question 2.9.1 What are bases good for?

Well, to begin with, recall that in Observation 2.7.2 we saw that if $\{v_1, v_2, \ldots, v_k\}$ is a LI set of vectors in a vector space $V$, and if $w \in V$ is in the span of $\{v_1, v_2, \ldots, v_k\}$, then the scalars $a_1, a_2, \ldots, a_k \in \mathbb{R}$ such that $a_1 v_1 + a_2 v_2 + \cdots + a_k v_k = w$ are unique; that is, there is only one way to express $w$ as a linear combination of $\{v_1, v_2, \ldots, v_k\}$.

Therefore, suppose that $V$ is an $n$-dimensional vector space, and that $B = \{v_1, v_2, \ldots, v_n\}$ is a basis for $V$, with the order of the vectors in this basis fixed. Then each $v \in V$ we can write uniquely in the form $v = a_1 v_1 + a_2 v_2 + \cdots + a_n v_n$. The scalars $a_1, a_2, \ldots, a_n \in \mathbb{R}$ are called the coordinates of $v$ with respect to the basis $B$. These clearly form a vector $(a_1, a_2, \ldots, a_n) \in \mathbb{R}^n$, called the coordinate vector of $v$ with respect to $B$, which we denote by $[v]_B$; that is,
$$[v]_B = (a_1, a_2, \ldots, a_n) \in \mathbb{R}^n.$$
Example 2.9.2 In Example 2.8.19 we saw that the vectors $B = \{(1, 2, 5), (2, 5, 1), (1, 5, 2)\}$ form a basis for $\mathbb{R}^3$. Find $[v]_B$, where $v = (5, 21, 22)$.

By row reduction we solve $a(1, 2, 5) + b(2, 5, 1) + c(1, 5, 2) = (5, 21, 22)$, and find that the (necessarily unique) solution is $a = 3$, $b = -1$, $c = 4$, and so
$$[(5, 21, 22)]_B = (3, -1, 4).$$
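Finding coordinates is again just solving a square linear system whose columns are the basis vectors. A sketch (numpy assumed):

```python
import numpy as np

# Basis B from Example 2.8.19, as the columns of a matrix.
B = np.array([[1, 2, 1],
              [2, 5, 5],
              [5, 1, 2]], dtype=float)
v = np.array([5, 21, 22], dtype=float)

coords = np.linalg.solve(B, v)  # unique solution, since B is a basis
print(coords)                   # approximately [ 3. -1.  4.]
```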
Example 2.9.3 In Example 2.8.7 we saw that an example of a basis for $SL_2$ is
$$B = \left\{ \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix} \right\}.$$
Find $[w]_B$, where $w = \begin{pmatrix} 6 & 4 \\ 11 & -6 \end{pmatrix}$.
It is easy to see immediately that $[w]_B = (6, 4, 11) \in \mathbb{R}^3$.
Example 2.9.4 In $\mathbb{R}^n$, for each $i = 1, \ldots, n$, let $e_i = (0, \ldots, 0, 1, 0, \ldots, 0) \in \mathbb{R}^n$ be the vector with all entries zero, except the $i$th entry, which is one. Then the basis $E = \{e_1, e_2, \ldots, e_n\}$ is called the standard basis for $\mathbb{R}^n$. Clearly for any vector $v = (a_1, a_2, \ldots, a_n) \in \mathbb{R}^n$ we have that $[v]_E$ is just $v$ itself.
Example 2.9.5 In $P_2(t)$, the polynomials $p_1(t) = t + 1$, $p_2(t) = t - 1$ and $p_3(t) = t^2 + 2t + 1$ form a basis $B$. Find $[p]_B$, where $p(t) = 2t^2 + 5t + 9$.

Example 2.9.6 In $P_n(t)$, the set of polynomials $\{p_0(t) = 1, p_1(t) = t, p_2(t) = t^2, \ldots, p_n(t) = t^n\}$ forms a basis $B$. For any polynomial $p(t) = a_0 + a_1 t + a_2 t^2 + \cdots + a_n t^n$, it is again then immediately clear that
$$[p]_B = (a_0, a_1, a_2, \ldots, a_n) \in \mathbb{R}^{n+1}.$$
Conclusion 2.9.7 So in general, we have the following picture. Fix an ordered basis $B = \{v_1, v_2, \ldots, v_n\}$ for our vector space $V$. Then we have a one-to-one correspondence between vectors in $V$ and vectors in $\mathbb{R}^n$:
$$v = x_1 v_1 + x_2 v_2 + \cdots + x_n v_n \in V \;\longleftrightarrow\; [v]_B = (x_1, x_2, \ldots, x_n) \in \mathbb{R}^n.$$
(Different vectors in $V$ have different coordinate vectors, with respect to $B$, in $\mathbb{R}^n$; and every $(a_1, a_2, \ldots, a_n) \in \mathbb{R}^n$ is the coordinate vector, with respect to $B$, of some vector $v \in V$.)

Moreover, this correspondence preserves the vector space operations in the sense that if
$$v \in V \;\longleftrightarrow\; [v]_B = (x_1, x_2, \ldots, x_n) \in \mathbb{R}^n$$
and
$$w \in V \;\longleftrightarrow\; [w]_B = (z_1, z_2, \ldots, z_n) \in \mathbb{R}^n,$$
then
$$v + w \in V \;\longleftrightarrow\; (x_1 + z_1, x_2 + z_2, \ldots, x_n + z_n) \in \mathbb{R}^n,$$
that is,
$$[v + w]_B = [v]_B + [w]_B;$$
and for $k \in \mathbb{R}$,
$$[kv]_B = k[v]_B.$$
This tells us that whenever we have an $n$-dimensional vector space $V$, irrespective of how strange it might look, it behaves in exactly the same way as $\mathbb{R}^n$. Two vector spaces which behave in the same way in this sense are said to be isomorphic, and a correspondence such as that above is called an isomorphism.
We will talk more about this, and make it more precise, when we cover the subject
of “Linear Transformations” next semester.
2.10 Applications of Row Reduction

Let $A \in M_{m,n}(\mathbb{R})$. Then the rows of $A$, $v_1, v_2, \ldots, v_m$, each consisting of $n$ real numbers, can be thought of as vectors in $\mathbb{R}^n$. Similarly, the columns of $A$, $w_1, w_2, \ldots, w_n$, can be thought of as vectors in $\mathbb{R}^m$.

Definition 2.10.1 We define the row space of $A$, denoted $\mathrm{Row}(A)$, to be the subspace of $\mathbb{R}^n$ spanned by the rows of $A$:
$$\mathrm{Row}(A) = \mathrm{Span}\,\{v_1, v_2, \ldots, v_m\} \subseteq \mathbb{R}^n.$$
Similarly, the column space of $A$, denoted $\mathrm{Col}(A)$, is defined as
$$\mathrm{Col}(A) = \mathrm{Span}\,\{w_1, w_2, \ldots, w_n\} \subseteq \mathbb{R}^m.$$
Question 2.10.2 Suppose $\{w_1, w_2, \ldots, w_m\}$ is a linearly dependent set in $\mathbb{R}^n$. How can we find a basis for $W = \mathrm{Span}\,\{w_1, w_2, \ldots, w_m\}$, and hence find the dimension of $W$? (Compare this with Question 4 of Week 4 Exercises.)

There are just two ingredients we need to answer this question:
1. The application of elementary row operations to a matrix has no effect on the row space of that matrix (that is, if $A$ and $B$ are row equivalent matrices, then $\mathrm{Row}(A) = \mathrm{Row}(B)$).
2. The non-zero rows of a matrix in row echelon form are linearly independent.

Part 1 is simply an immediate consequence of the fact that interchanging two vectors in a spanning set makes no difference to the span of that set, and that for any non-zero $k \in \mathbb{R}$,
$$\mathrm{Span}\,\{w_1, w_2, \ldots, w_i, \ldots, w_m\} = \mathrm{Span}\,\{w_1, w_2, \ldots, kw_i, \ldots, w_m\}$$
and
$$\mathrm{Span}\,\{w_1, w_2, \ldots, w_{i-1}, w_i, w_{i+1}, \ldots, w_m\} = \mathrm{Span}\,\{w_1, w_2, \ldots, w_{i-1}, w_i + kw_j, w_{i+1}, \ldots, w_m\}.$$
(These two equalities are easily checked and are left as an exercise.)

For Part 2, remembering that a matrix is in row echelon form if
- all zero rows are at the bottom, and
- the first non-zero entry in each row occurs at least one position to the right of the first non-zero entry in the preceding row,
it is then straightforward to see that the non-zero rows form a LI set of vectors.

Therefore, to answer Question 2.10.2, all we need to do is write the vectors in our set $\{w_1, w_2, \ldots, w_m\}$ as the rows of an $m \times n$ matrix $A$, and reduce to row echelon form. Then the non-zero rows of the echelon form will form a basis for $\mathrm{Row}(A)$, i.e. for $\mathrm{Span}\,\{w_1, w_2, \ldots, w_m\}$; and of course the number of non-zero rows gives the dimension of $\mathrm{Span}\,\{w_1, w_2, \ldots, w_m\}$. (A code sketch of this method is given below.)
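The whole method can be expressed in a few lines. A sketch (sympy assumed) that returns the non-zero rows of the reduced row echelon form as a basis for the span:

```python
import sympy as sp

def row_space_basis(vectors):
    # Write the vectors as rows, reduce, and keep the non-zero rows.
    A = sp.Matrix(vectors)
    R, _ = A.rref()   # reduced row echelon form
    rows = [R.row(i) for i in range(R.rows)]
    return [r for r in rows if any(x != 0 for x in r)]

# The dependent set from Example 2.7.4:
basis = row_space_basis([(1, 1, 0), (1, 3, 2), (4, 9, 5)])
print(len(basis))     # 2: the three vectors only span a plane
for r in basis:
    print(r)
```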
Example 2.10.3 Find a basis for
$$W = \mathrm{Span}\,\{(0, 0, 3, 1, 4), (1, 3, 1, 2, 1), (3, 9, 4, 5, 2), (4, 12, 8, 8, 7)\} \subseteq \mathbb{R}^5,$$
and hence state the dimension of $W$.
Question 2.10.4 What does this tell us about our original spanning set for $W$?

Example 2.10.5 Do the same for
$$U = \mathrm{Span}\,\{(1, 1, 1, 1), (1, 2, 3, 2), (2, 5, 6, 4), (2, 6, 8, 5)\} \subseteq \mathbb{R}^4.$$
(This is the set of Question 4, Week 4 Exercises.)

Question 2.10.6 What does this tell us about our original spanning set for $U$?
Important Observation 2.10.7 We can therefore use this approach to easily
determine if a given set of vectors is LI or LD:
Theorem 2.10.8 A set of vectors $\{w_1, w_2, \ldots, w_m\}$ in $\mathbb{R}^n$ is LI if and only if, when we reduce the matrix with rows $w_1, w_2, \ldots, w_m$ to echelon form, there are no zero rows.

Proof Let $W = \mathrm{Span}\,\{w_1, w_2, \ldots, w_m\}$.
($\Rightarrow$)
($\Leftarrow$)
□
Question 2.10.9 Given linearly independent vectors $\{w_1, w_2, \ldots, w_m\}$ in $\mathbb{R}^n$, so that necessarily $m \leq n$, how can we use row reduction to easily extend this set to a basis for $\mathbb{R}^n$? (Compare this with Question 3, Week 4 Exercises.)

The answer to this is best illustrated with an example.

Example 2.10.10 Extend the LI set $\{(1, 2, 0, 4, 0), (3, 8, 5, 4, 1), (0, 2, 5, 1, 0)\}$ to a basis for $\mathbb{R}^5$.
Question 2.10.11 Suppose $\{u_1, u_2, \ldots, u_m\}$ is a linearly dependent set in $\mathbb{R}^n$. How can we find a basis for $U = \mathrm{Span}\,\{u_1, u_2, \ldots, u_m\}$ consisting only of vectors from the set $\{u_1, u_2, \ldots, u_m\}$?

Again, this is best answered with an example.

Example 2.10.12 Let $U = \mathrm{Span}\,\{u_1, u_2, u_3, u_4, u_5\} \subseteq \mathbb{R}^4$, where
$$u_1 = (1, 1, 3, 2),\; u_2 = (2, 2, 6, 4),\; u_3 = (1, 2, 5, 1),\; u_4 = (0, 1, 2, -1),\; u_5 = (1, 3, 7, 0).$$
Find a basis for $U$ consisting only of vectors from $\{u_1, u_2, u_3, u_4, u_5\}$.
Example 2.10.13 Let's do the same for $U = \mathrm{Span}\,\{u_1, u_2, u_3, u_4, u_5\} \subseteq \mathbb{R}^4$, where
$$u_1 = (1, 2, 0, 3),\; u_2 = (2, 5, 3, 6),\; u_3 = (0, 1, 3, 0),\; u_4 = (2, 1, 4, 7),\; u_5 = (5, 8, 1, 2).$$
2.11 Kernel, Range and Rank of a Matrix, and Invertibility

These are terms that are more generally applied to arbitrary linear transformations between vector spaces. For now, though, we specialize simply to the case of matrices.

Kernel:

Let $A \in M_{m,n}(\mathbb{R})$. In the following, we will think of elements of $\mathbb{R}^n$ as column vectors, that is, $n \times 1$ real matrices,
$$x \in \mathbb{R}^n, \quad x = \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix}.$$
(So the zero vector in $\mathbb{R}^n$ will be $0 = \begin{pmatrix} 0 \\ 0 \\ \vdots \\ 0 \end{pmatrix}$.)

Consider the matrix equation $Ax = 0$.

Definition 2.11.1 The kernel of $A$ (also called the null space of $A$) is defined to be
$$\mathrm{Ker}(A) = \{x \in \mathbb{R}^n : Ax = 0\}.$$
This is simply the set of solutions to the homogeneous system of linear equations $Ax = 0$, and so is clearly a subset of $\mathbb{R}^n$.

Proposition 2.11.2 $\mathrm{Ker}(A)$ is a subspace of $\mathbb{R}^n$.

Proof
□
Question 2.11.3 Suppose $b \in \mathbb{R}^m$ is non-zero. Is $\{x \in \mathbb{R}^n : Ax = b\}$ a subspace of $\mathbb{R}^n$?

Question 2.11.4 How can we find a basis for $\mathrm{Ker}(A)$, and hence determine its dimension? (The dimension of $\mathrm{Ker}(A)$ is also called the nullity of $A$.)

Example 2.11.5 $A = \begin{pmatrix} 1 & 2 & 2 & 2 & 1 \\ 1 & 2 & 1 & 3 & 2 \\ 2 & 4 & 7 & 1 & -1 \end{pmatrix} \in M_{3,5}(\mathbb{R})$. $\mathrm{Ker}(A) = ?$

We just need to solve the augmented system
$$\left(\begin{array}{ccccc|c} 1 & 2 & 2 & 2 & 1 & 0 \\ 1 & 2 & 1 & 3 & 2 & 0 \\ 2 & 4 & 7 & 1 & -1 & 0 \end{array}\right).$$
We find that the solutions are $x_1 = -2r - 4s - 3t$, $x_2 = r$, $x_3 = s + t$, $x_4 = s$, $x_5 = t$, for $r, s, t \in \mathbb{R}$. Thus we have that
$$\mathrm{Ker}(A) = \mathrm{Span}\,\{(-2, 1, 0, 0, 0), (-4, 0, 1, 1, 0), (-3, 0, 1, 0, 1)\}.$$
These vectors are clearly linearly independent, and so $\dim \mathrm{Ker}(A) = 3$.
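sympy can carry out this computation directly; its nullspace method returns a basis of exactly the kind constructed above. A sketch for Example 2.11.5, with the matrix entries as reconstructed here:

```python
import sympy as sp

A = sp.Matrix([[1, 2, 2, 2,  1],
               [1, 2, 1, 3,  2],
               [2, 4, 7, 1, -1]])

basis = A.nullspace()   # a basis for Ker(A)
for b in basis:
    print(b.T)          # printed as rows for readability
print("nullity:", len(basis))   # 3
```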
Remark 2.11.6 A spanning set for $\mathrm{Ker}(A)$ obtained in this way will always automatically be linearly independent. We therefore obtain the following theorem.

Theorem 2.11.7 If $A \in M_{m,n}(\mathbb{R})$, a spanning set for $\mathrm{Ker}(A)$ obtained in the manner above always forms a basis for $\mathrm{Ker}(A)$. Moreover,
$$\dim \mathrm{Ker}(A) = \text{number of parameters needed in the general solution of } Ax = 0.$$
Observations 2.11.8 For $A \in M_{m,n}(\mathbb{R})$:
- $\mathrm{Ker}(A)$ is a subspace of $\mathbb{R}^n$, and $0 \leq \dim \mathrm{Ker}(A) \leq n$
- $\dim \mathrm{Ker}(A) = 0$ if and only if $\mathrm{Ker}(A) = \{0\}$; that is, if and only if $x_1 = x_2 = \cdots = x_n = 0$ is the only solution to $Ax = 0$
- $\dim \mathrm{Ker}(A) = n$ if and only if $A = 0$; that is, if and only if $A$ is the zero matrix.
Range and Rank:

Definition 2.11.9 Let $A \in M_{m,n}(\mathbb{R})$. We define the range of $A$ (or image of $A$) to be
$$\mathrm{Ran}(A) = \{Ax : x \in \mathbb{R}^n\}.$$
Clearly this is a subset of $\mathbb{R}^m$.

Proposition 2.11.10 $\mathrm{Ran}(A)$ is a subspace of $\mathbb{R}^m$.

This is clear, since in fact $\mathrm{Ran}(A)$ is nothing but the column space of $A$, $\mathrm{Col}(A) = \mathrm{Span}\,\{c_1, c_2, \ldots, c_n\}$, where $c_1, c_2, \ldots, c_n$ are the columns of $A$. To see this, write
$$Ax = \begin{pmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & & \vdots \\ a_{m1} & \cdots & a_{mn} \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix} = x_1 \begin{pmatrix} a_{11} \\ a_{21} \\ \vdots \\ a_{m1} \end{pmatrix} + x_2 \begin{pmatrix} a_{12} \\ a_{22} \\ \vdots \\ a_{m2} \end{pmatrix} + \cdots + x_n \begin{pmatrix} a_{1n} \\ a_{2n} \\ \vdots \\ a_{mn} \end{pmatrix}.$$

Definition 2.11.11 Let $A \in M_{m,n}(\mathbb{R})$. The rank of $A$ is defined to be the dimension of the range of $A$; that is,
$$\mathrm{Rank}(A) = \dim \mathrm{Ran}(A).$$
Question 2.11.12 How can we find a basis for $\mathrm{Ran}(A)$, and hence determine its dimension?

Well, since $\mathrm{Ran}(A) = \mathrm{Col}(A)$, we just need to find a basis for $\mathrm{Span}\,\{c_1, c_2, \ldots, c_n\}$. This can be done using either the method given as a solution to Question 2.10.2 or that given as a solution to Question 2.10.11. (In general though, the latter will be preferable, since it involves row reducing the same matrix that is row reduced to compute $\mathrm{Ker}(A)$, so time is saved if we are required to compute bases for both the kernel and the range of a matrix $A$.)
Remark 2.11.13 From the worked examples illustrating solutions to Questions 2.10.2 and 2.10.11, for $A \in M_{m,n}(\mathbb{R})$ we see that the number of leading non-zero entries in any echelon form for $A$ equals $\dim \mathrm{Col}(A) = \dim \mathrm{Ran}(A)$. Moreover, this is clearly equal to
$$n - (\text{number of parameters needed in the general solution of } Ax = 0).$$
(We introduce a parameter for each column in which there is not a leading non-zero entry.)

This leads us to the following fundamental theorem in linear algebra:

Theorem 2.11.14 (Conservation of Dimension)
If $A \in M_{m,n}(\mathbb{R})$, then
$$\text{nullity of } A + \text{rank of } A = n;$$
that is,
$$\dim \mathrm{Ker}(A) + \dim \mathrm{Ran}(A) = n.$$
(This is also known as the Rank-Nullity Theorem.)
Example 2.11.15 Let $A \in M_{4,5}(\mathbb{R})$ be the matrix
$$A = \begin{pmatrix} 1 & 3 & 1 & 2 & 3 \\ 1 & 4 & 3 & 1 & 4 \\ 2 & 3 & -4 & 7 & 3 \\ 3 & 8 & 1 & 7 & 8 \end{pmatrix}.$$
(i) Find a basis for $\mathrm{Ker}(A)$.
(ii) Find a basis for $\mathrm{Ran}(A)$.

(i) Reducing to echelon form we obtain
$$\begin{pmatrix} 1 & 3 & 1 & 2 & 3 \\ 0 & 1 & 2 & -1 & 1 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{pmatrix},$$
and in the same way as in Example 2.11.5, we find that a basis for $\mathrm{Ker}(A)$ is
$$\{(-5, 1, 0, 1, 0), (5, -2, 1, 0, 0), (0, -1, 0, 0, 1)\}.$$
(ii) In our echelon matrix of part (i), we see that columns 1 and 2 have leading non-zero entries, and so $\{(1, 1, 2, 3), (3, 4, 3, 8)\}$ is a basis for $\mathrm{Ran}(A)$.

Note 2.11.16 We see that $\dim \mathrm{Ker}(A) + \dim \mathrm{Ran}(A) = 3 + 2 = 5$.
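A quick machine check of Example 2.11.15 and of the Rank-Nullity Theorem (sympy assumed; the matrix entries are as reconstructed here):

```python
import sympy as sp

A = sp.Matrix([[1, 3,  1, 2, 3],
               [1, 4,  3, 1, 4],
               [2, 3, -4, 7, 3],
               [3, 8,  1, 7, 8]])

rank = A.rank()                  # dim Ran(A)
nullity = len(A.nullspace())     # dim Ker(A)
print(rank, nullity)             # 2 3
assert rank + nullity == A.cols  # Conservation of Dimension: 2 + 3 = 5
```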
We end this section with a useful theorem listing conditions equivalent to the invertibility of an $n \times n$ matrix.

Invertibility:

Definition 2.11.17 A matrix $A \in M_{n,n}(\mathbb{R})$ is said to be invertible if there exists a matrix $B \in M_{n,n}(\mathbb{R})$ such that $AB = I_n = BA$, where $I_n$ is the $n \times n$ identity matrix. The matrix $B$, automatically unique, is denoted by $A^{-1}$ and is called the inverse of $A$.

Theorem 2.11.18 For $A \in M_{n,n}(\mathbb{R})$, the following conditions are equivalent.
1. $A$ is invertible
2. For each $b \in \mathbb{R}^n$, there is a unique solution to the matrix equation $Ax = b$
3. $\mathrm{Ker}(A) = \{0\}$
4. $\mathrm{Ran}(A) = \mathbb{R}^n$
5. $\mathrm{Rank}(A) = n$
6. We can apply row operations to $A$ and obtain $I_n$
7. The rows of $A$ form a linearly independent set of vectors in $\mathbb{R}^n$
8. The columns of $A$ form a linearly independent set of vectors in $\mathbb{R}^n$
9. The rows of $A$ form a basis for $\mathbb{R}^n$
10. The columns of $A$ form a basis for $\mathbb{R}^n$
11. For $x$ and $y \in \mathbb{R}^n$, if $Ax = Ay$, then $x = y$ (in this case we say that $A$ is one-to-one, or injective).

Proof
□
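Several of the equivalent conditions in Theorem 2.11.18 are one-liners to test numerically. A sketch (numpy assumed; the matrix is our own illustrative choice):

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 1.0]])

n = A.shape[0]
rank = np.linalg.matrix_rank(A)
print(rank == n)                      # condition 5: Rank(A) = n, so A is invertible

B = np.linalg.inv(A)                  # the (unique) inverse A^{-1}
print(np.allclose(A @ B, np.eye(n)))  # AB = I_n
print(np.allclose(B @ A, np.eye(n)))  # BA = I_n
```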
3. Inner Product Spaces and Normed Spaces

Recall that in Section 1, based on the notion of length (norm) for vectors in $\mathbb{R}^2$ and $\mathbb{R}^3$, we introduced the Euclidean length (norm) for vectors in $\mathbb{R}^n$, for arbitrary $n$. However, when, in Section 2, we began to talk about abstract vector spaces, the notion of length did not play any part. In this section we will give our vector spaces additional structure, in such a way that the concept of length can be defined.

3.1 Inner Product Spaces

Just as we took the most important properties of vector addition and scalar multiplication for vectors in $\mathbb{R}^2$ and $\mathbb{R}^3$ and used them as a foundation for our definition of an abstract vector space, we likewise take the most important properties enjoyed by the dot (inner) product in $\mathbb{R}^2$ and $\mathbb{R}^3$ and use them to define a general inner product on a vector space.

Definition 3.1.1 Let $V$ be a vector space. Suppose that to each pair of vectors $u, v \in V$ we can assign a real number, denoted by $\langle u, v \rangle$. (So we have a function $V \times V \to \mathbb{R}$, $(u, v) \mapsto \langle u, v \rangle$.) This function is called an inner product on $V$ if it satisfies each of the following axioms, for all $u, v, w \in V$ and $a, b \in \mathbb{R}$:

I1 $\langle au + bv, w \rangle = a\langle u, w \rangle + b\langle v, w \rangle$ (linearity in the first position)
I2 $\langle u, v \rangle = \langle v, u \rangle$ (symmetric property)
I3 $\langle u, u \rangle \geq 0$, and $\langle u, u \rangle = 0 \iff u = 0$ (positive definite property).

A vector space $V$, endowed with an inner product, is called an inner product space.
(Compare I1, I2 and I3 with properties (a), (c) and (d) of the Euclidean inner product stated in Section 1.)
Observations 3.1.2
- By I1 and I2, we also have that $\langle u, av + bw \rangle = a\langle u, v \rangle + b\langle u, w \rangle$ for all $u, v, w \in V$, $a, b \in \mathbb{R}$.
- The inner product of a linear combination of vectors is a linear combination of inner products of the vectors:
$$\left\langle \sum_i a_i u_i, \sum_j b_j v_j \right\rangle = \sum_{i,j} a_i b_j \langle u_i, v_j \rangle.$$

Example 3.1.3
$$\langle 5u_1 - 4u_2,\; 3v_1 + 2v_2 - 6v_3 \rangle = 15\langle u_1, v_1 \rangle + 10\langle u_1, v_2 \rangle - 30\langle u_1, v_3 \rangle - 12\langle u_2, v_1 \rangle - 8\langle u_2, v_2 \rangle + 24\langle u_2, v_3 \rangle.$$
Examples of Inner Product Spaces 3.1.4
1. R n with the Euclidean inner product.
2. $V = M_{m,n}(\mathbb{R})$, with inner product defined as follows: for each $A, B \in M_{m,n}(\mathbb{R})$, define $\langle A, B \rangle = \mathrm{tr}(B^T A)$.

e.g. if $A = \begin{pmatrix} 1 & 3 & 2 \\ 6 & 1 & 0 \end{pmatrix}$, $B = \begin{pmatrix} 1 & 3 & 1 \\ 2 & 4 & 4 \end{pmatrix} \in M_{2,3}(\mathbb{R})$, then
$$\langle A, B \rangle = \mathrm{tr}(B^T A) = \mathrm{tr}\left( \begin{pmatrix} 1 & 2 \\ 3 & 4 \\ 1 & 4 \end{pmatrix} \begin{pmatrix} 1 & 3 & 2 \\ 6 & 1 & 0 \end{pmatrix} \right) = \mathrm{tr}\begin{pmatrix} 13 & 5 & 2 \\ 27 & 13 & 6 \\ 25 & 7 & 2 \end{pmatrix} = 13 + 13 + 2 = 28.$$

3. $V = C([a, b])$, the set of all continuous functions on the closed interval $[a, b]$, with inner product defined as follows: for each $f, g \in C([a, b])$, define
$$\langle f, g \rangle = \int_a^b f(t)g(t)\,dt.$$
This is called the standard inner product on $C([a, b])$.

e.g. if $f, g \in C([0, 1])$ are the functions $f(t) = 5t - 3$ and $g(t) = t^2$, then
$$\langle f, g \rangle = \int_0^1 (5t^3 - 3t^2)\,dt = \frac{5}{4} - 1 = \frac{1}{4}.$$
Verification that $\langle A, B \rangle = \mathrm{tr}(B^T A)$ does define an inner product on $M_{m,n}(\mathbb{R})$:

I1 $\langle aA + bB, C \rangle = \mathrm{tr}(C^T[aA + bB]) = \mathrm{tr}(aC^T A + bC^T B) = a\,\mathrm{tr}(C^T A) + b\,\mathrm{tr}(C^T B) = a\langle A, C \rangle + b\langle B, C \rangle$,
where the second equality follows by elementary properties of matrix algebra, and the third equality follows since $\mathrm{tr}(kS + lT) = k\,\mathrm{tr}(S) + l\,\mathrm{tr}(T)$, for all $S, T \in M_{n,n}(\mathbb{R})$ and $k, l \in \mathbb{R}$ (check this for yourselves!).

I2 $\langle A, B \rangle = \mathrm{tr}(B^T A) = \mathrm{tr}\left((B^T A)^T\right) = \mathrm{tr}(A^T B) = \langle B, A \rangle$, where the second equality follows since the trace of any matrix $S$ is equal to the trace of its transpose, $S^T$.

I3 Let's look at the $(i, j)$th entry of the $n \times n$ matrix $A^T A$, that is, $(A^T A)_{i,j}$. Simply according to matrix multiplication,
$$(A^T A)_{i,j} = \sum_{k=1}^{m} (A^T)_{i,k} (A)_{k,j} = \sum_{k=1}^{m} a_{k,i} a_{k,j}.$$
Therefore, since the trace of a matrix is the sum of the diagonal entries,
$$\mathrm{tr}(A^T A) = \sum_{r=1}^{n} (A^T A)_{r,r} = \sum_{r=1}^{n} \sum_{k=1}^{m} a_{k,r} a_{k,r} = \sum_{r=1}^{n} \sum_{k=1}^{m} (a_{k,r})^2$$
is the sum of the squares of all of the entries of the matrix $A$, and so clearly $\mathrm{tr}(A^T A) \geq 0$ for all $A \in M_{m,n}(\mathbb{R})$, and $\mathrm{tr}(A^T A) = 0$ if and only if each entry of $A$ is zero, that is, $A = 0$.
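The trace inner product is easy to compute and check numerically. A sketch (numpy assumed), reproducing the worked example above:

```python
import numpy as np

A = np.array([[1.0, 3.0, 2.0],
              [6.0, 1.0, 0.0]])
B = np.array([[1.0, 3.0, 1.0],
              [2.0, 4.0, 4.0]])

ip = np.trace(B.T @ A)   # <A, B> = tr(B^T A)
print(ip)                # 28.0
# As the I3 verification shows, tr(B^T A) is the sum of entrywise products:
print(np.sum(A * B))     # 28.0 as well
```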
3.2 Normed Spaces

Using the properties of the Euclidean norm listed in Section 1, we define the following.

Definition 3.2.1 Let $V$ be a vector space. Suppose that to each vector $v \in V$ we can assign a real number, denoted by $\|v\|$. (So we have a function $V \to \mathbb{R}$, $v \mapsto \|v\|$.) This function is called a norm on $V$ if it satisfies each of the following axioms, for all $u, v \in V$ and $k \in \mathbb{R}$:

N1 $\|v\| \geq 0$, and $\|v\| = 0 \iff v = 0$
N2 $\|kv\| = |k| \|v\|$
N3 $\|u + v\| \leq \|u\| + \|v\|$

A vector space $V$, endowed with a norm, is called a normed space.
Example 3.2.2 $\mathbb{R}^n$ with the Euclidean norm.

Remark 3.2.3 Recall that in $\mathbb{R}^n$, the Euclidean norm and Euclidean inner product were related as follows: $\|v\| = \langle v, v \rangle^{1/2}$, for each $v \in \mathbb{R}^n$. Indeed, whenever $V$ is an inner product space, with inner product $\langle u, v \rangle$, $V$ will also be a normed space, with the norm of each $v \in V$ defined as $\|v\| = \langle v, v \rangle^{1/2}$. To prove this, we need to show that each of the axioms N1, N2 and N3 can be deduced from the axioms I1, I2 and I3, where $\|v\| = \langle v, v \rangle^{1/2}$. To do this, we will need to make use of the Cauchy-Schwarz inequality, which we now state and prove:

Theorem 3.2.4 (Cauchy-Schwarz Inequality)
Let $V$ be an inner product space. For any vectors $u, v \in V$, we have
$$|\langle u, v \rangle| \leq \langle u, u \rangle^{1/2} \langle v, v \rangle^{1/2}.$$

Proof Let $t$ be any real number. Then, by I3,
$$\langle tu + v, tu + v \rangle = t^2 \langle u, u \rangle + 2t \langle u, v \rangle + \langle v, v \rangle \geq 0.$$
That is, $at^2 + bt + c \geq 0$ for all $t$, where $a = \langle u, u \rangle$, $b = 2\langle u, v \rangle$, and $c = \langle v, v \rangle$. A quadratic that is non-negative for all $t$ has at most one real root, so $b^2 - 4ac \leq 0$; that is, $4\langle u, v \rangle^2 \leq 4\langle u, u \rangle \langle v, v \rangle$, and so $|\langle u, v \rangle| \leq \langle u, u \rangle^{1/2} \langle v, v \rangle^{1/2}$. □
Now we can prove the following theorem.

Theorem 3.2.5 Let $V$ be an inner product space, with inner product $\langle \cdot, \cdot \rangle$. Then $V$ is a normed space, with norm defined as $\|v\| = \langle v, v \rangle^{1/2}$, for each $v \in V$.

Proof N1 follows immediately from I3. For N2, we see that $\|kv\|^2 = \langle kv, kv \rangle = k^2 \langle v, v \rangle = k^2 \|v\|^2$, and so $\|kv\| = |k| \|v\|$. For N3,
$$\|u + v\|^2 = \langle u + v, u + v \rangle = \|u\|^2 + 2\langle u, v \rangle + \|v\|^2 \leq \|u\|^2 + 2|\langle u, v \rangle| + \|v\|^2 \leq \|u\|^2 + 2\|u\|\|v\| + \|v\|^2 = (\|u\| + \|v\|)^2,$$
and taking square roots gives us the required inequality. □
Examples 3.2.6

1. $V = M_{m,n}(\mathbb{R})$, with inner product $\langle A, B \rangle = \mathrm{tr}(B^T A)$. Then a norm can be defined on $M_{m,n}(\mathbb{R})$ by: $\|A\| = \langle A, A \rangle^{1/2} = \left(\mathrm{tr}(A^T A)\right)^{1/2}$.

e.g. if $A = \begin{pmatrix} 3 & -1 \\ 2 & 1 \\ 4 & 0 \\ 1 & 5 \end{pmatrix} \in M_{4,2}(\mathbb{R})$, then

$\|A\| = \left(\mathrm{tr}(A^T A)\right)^{1/2} = \left(\mathrm{tr}\left(\begin{pmatrix} 3 & 2 & 4 & 1 \\ -1 & 1 & 0 & 5 \end{pmatrix} \begin{pmatrix} 3 & -1 \\ 2 & 1 \\ 4 & 0 \\ 1 & 5 \end{pmatrix}\right)\right)^{1/2} = \left(\mathrm{tr}\begin{pmatrix} 30 & 4 \\ 4 & 27 \end{pmatrix}\right)^{1/2} = \sqrt{57}$.
2. $V = C([a, b])$, with inner product $\langle f, g \rangle = \int_a^b f(t) g(t)\,dt$. Then a norm can be defined on $C([a, b])$ by: $\|f\| = \langle f, f \rangle^{1/2} = \left(\int_a^b f(t) f(t)\,dt\right)^{1/2}$.
e.g. if $f \in C([0,1])$, with $f(t) = 3t - 5$, then

$\|f\| = \langle f, f \rangle^{1/2} = \left(\int_0^1 \left(9t^2 - 30t + 25\right)dt\right)^{1/2} = \sqrt{13}$.
Remark 3.2.7 There are normed spaces which are not inner product spaces. That is, we can have a norm on a vector space $V$ which has not been induced by any inner product on $V$.
Example 3.2.8 Let $V = \mathbb{R}^n$. Then as well as the Euclidean norm, we can also define other norms on $V$. For example, the so-called infinity-norm $\|\cdot\|_\infty$ on $\mathbb{R}^n$ is defined as follows: for $(a_1, a_2, \ldots, a_n) \in \mathbb{R}^n$, define

$\|(a_1, a_2, \ldots, a_n)\|_\infty = \max\{|a_1|, |a_2|, \ldots, |a_n|\}$.

e.g. for $(3, -9, 6, 2) \in \mathbb{R}^4$, $\|(3, -9, 6, 2)\|_\infty = 9$, that is, the maximum of the set of non-negative real numbers $\{3, 9, 6, 2\}$.
A further example of a norm on $\mathbb{R}^n$ is the one-norm $\|\cdot\|_1$, defined as follows: for $(a_1, a_2, \ldots, a_n) \in \mathbb{R}^n$, define

$\|(a_1, a_2, \ldots, a_n)\|_1 = |a_1| + |a_2| + \cdots + |a_n|$.

e.g. for $(3, -9, 6, 2) \in \mathbb{R}^4$, $\|(3, -9, 6, 2)\|_1 = 3 + 9 + 6 + 2 = 20$.
Exercise 3.2.9 Verify that $\|\cdot\|_1$ and $\|\cdot\|_\infty$ do indeed define norms on $V = \mathbb{R}^n$.
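For a concrete feel for these two norms, here is a small sketch (assuming NumPy; NumPy's `ord` argument names the same norms):

import numpy as np

# The vector from the examples above.
x = np.array([3, -9, 6, 2])

print(max(abs(x)), sum(abs(x)))   # 9 20 -- infinity-norm and one-norm by hand
print(np.linalg.norm(x, ord=np.inf), np.linalg.norm(x, ord=1))   # 9.0 20.0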
Example 3.2.10 Let $V = C([a, b])$. Then in addition to the norm induced by the standard inner product on $C([a, b])$, we also have the following two norms:

- the one-norm, defined as $\|f\|_1 = \int_a^b |f(t)|\,dt$,

- the infinity-norm, defined as $\|f\|_\infty = \max_{t \in [a,b]} |f(t)|$.
3.3 Orthogonality
Definition 3.3.1 Let $V$ be an inner product space. Vectors $u$ and $v$ in $V$ are said to be orthogonal, and $u$ is said to be orthogonal to $v$, if $\langle u, v \rangle = 0$.
Observations 3.3.2

- This relation is symmetrical, in that if $u$ is orthogonal to $v$, then $v$ is orthogonal to $u$.

- The zero vector is orthogonal to every vector: $\langle 0, v \rangle = \langle 0u, v \rangle = 0\langle u, v \rangle = 0$, for all $v \in V$, and any $u \in V$.

- The zero vector is the only vector with this property: suppose $u \in V$ is a vector orthogonal to every vector in $V$. Then in particular it is orthogonal to itself, and so $\langle u, u \rangle = 0$. By $I_3$, this means that $u$ has to be the zero vector.
Definition 3.3.3 Let $V$ be an inner product space, and let $S$ be any subset of $V$. The orthogonal complement of $S$, denoted by $S^\perp$ and read as "$S$ perp", is defined to be the set of all vectors in $V$ that are orthogonal to every vector in $S$; that is,

$S^\perp = \{v \in V : \langle v, u \rangle = 0 \text{ for every } u \in S\}$.
Proposition 3.3.4 $S^\perp$ is a subspace of $V$.

Proof Let $v, w \in S^\perp$ and $k \in \mathbb{R}$. Then, for each $u \in S$,

$\langle v + w, u \rangle = \langle v, u \rangle + \langle w, u \rangle = 0 + 0 = 0$, and
$\langle kv, u \rangle = k\langle v, u \rangle = k(0) = 0$,

so that $v + w \in S^\perp$ and $kv \in S^\perp$.
□
Example 3.3.5 Let $V = \mathbb{R}^4$ with the Euclidean inner product. Find any non-zero vector in $\mathbb{R}^4$ orthogonal to each of the vectors $(2, 1, -3, 0)$, $(4, 1, 2, -2)$, $(6, 2, -5, -1)$.

Let $(a, b, c, d) \in \mathbb{R}^4$ be such a vector. Then we have that

$2a + b - 3c = 0, \quad 4a + b + 2c - 2d = 0, \quad 6a + 2b - 5c - d = 0$.

So any non-zero solution to this homogeneous system of linear equations, (if it exists), will do. We find that the general solution is $d = t$, $c = \frac{1}{4}t$, $b = 0$, $a = \frac{3}{8}t$, $t \in \mathbb{R}$, and so for example $(3, 0, 2, 8)$ is a non-zero vector orthogonal to each of the above three vectors.
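A small numerical check of this example (a sketch assuming NumPy; stacking the three vectors as the rows of a matrix W anticipates Observation 3.3.6 below):

import numpy as np

# Rows are the three given vectors; v is the solution found above.
W = np.array([[2, 1, -3,  0],
              [4, 1,  2, -2],
              [6, 2, -5, -1]])
v = np.array([3, 0, 2, 8])

print(W @ v)   # [0 0 0] -- v is orthogonal to every row of W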
From this example, it is not hard to see the following.
Observation 3.3.6 Whenever $w_1, w_2, \ldots, w_m$ are vectors in $\mathbb{R}^n$, if we form the matrix $W \in M_{m,n}(\mathbb{R})$ with rows $w_1, w_2, \ldots, w_m$, we have that

$(\mathrm{Row}(W))^\perp = \mathrm{Ker}(W)$.
3.4 Orthogonal Sets and Bases
Definition 3.4.1 Let $S = \{u_1, u_2, \ldots, u_r\}$ be a set of non-zero vectors in an inner product space $V$. $S$ is an orthogonal set if each pair of vectors in $S$ is orthogonal, and $S$ is an orthonormal set if it is orthogonal and each vector is a unit vector.
- Orthogonal: $\langle u_i, u_j \rangle = 0$ if $i \neq j$

- Orthonormal: $\langle u_i, u_j \rangle = \begin{cases} 1 & i = j \\ 0 & i \neq j \end{cases}$
Theorem 3.4.2 If $S$ is an orthogonal set of non-zero vectors, then $S$ is also linearly independent.

Proof Let $S = \{u_1, u_2, \ldots, u_r\}$ be such a set. We need to solve

$a_1 u_1 + a_2 u_2 + \cdots + a_r u_r = 0$.

For each $i = 1, \ldots, r$, taking the inner product of $u_i$ with each side of the above equation, we get

$0 = \langle 0, u_i \rangle = \langle a_1 u_1 + a_2 u_2 + \cdots + a_r u_r, u_i \rangle = a_1 \langle u_1, u_i \rangle + \cdots + a_i \langle u_i, u_i \rangle + \cdots + a_r \langle u_r, u_i \rangle = a_i \langle u_i, u_i \rangle$.

But each $u_i \neq 0$, and so $\langle u_i, u_i \rangle \neq 0$, which means that for each $i$, $a_i = 0$, and so $\{u_1, u_2, \ldots, u_r\}$ is linearly independent.
□
Observation 3.4.3 If $S = \{u_1, u_2, \ldots, u_r\}$ is orthogonal, then

$\|u_1 + \cdots + u_r\|^2 = \|u_1\|^2 + \cdots + \|u_r\|^2$.
Example 3.4.4 In $\mathbb{R}^3$, the set $E_3 = \{e_1, e_2, e_3\}$, where $e_1 = (1,0,0)$, $e_2 = (0,1,0)$, $e_3 = (0,0,1)$, is orthonormal.

Example 3.4.5 In $\mathbb{R}^3$, the set $S = \{u_1, u_2, u_3\}$, where $u_1 = (1,2,1)$, $u_2 = (2,1,-4)$, $u_3 = (3,-2,1)$, is orthogonal. Moreover, by Theorem 3.4.2, the set is also LI, and is therefore in fact an orthogonal basis for $\mathbb{R}^3$.
Remark 3.4.6 We can easily transform an orthogonal set of non-zero vectors into an orthonormal set as follows: if the set $S = \{u_1, u_2, \ldots, u_r\}$ is orthogonal, then the set

$S' = \left\{ \dfrac{1}{\|u_1\|} u_1, \dfrac{1}{\|u_2\|} u_2, \ldots, \dfrac{1}{\|u_r\|} u_r \right\}$

is orthonormal. This process of dividing each vector in an orthogonal set by its norm to produce an orthonormal set is called normalization.
Exercise 3.4.7 Normalize the set S from Example 3.4.5.
Observation 3.4.8 Given an orthogonal basis $\{u_1, u_2, \ldots, u_n\}$ for an inner product space $V$, and given any vector $v \in V$, we can easily find the unique scalars $a_1, a_2, \ldots, a_n \in \mathbb{R}$ such that $a_1 u_1 + a_2 u_2 + \cdots + a_n u_n = v$:

To solve $a_1 u_1 + a_2 u_2 + \cdots + a_n u_n = v$ for $a_1, a_2, \ldots, a_n \in \mathbb{R}$, for each $i = 1, \ldots, n$ we take the inner product of $u_i$ with each side of this equation and obtain

$\langle v, u_i \rangle = \langle a_1 u_1 + a_2 u_2 + \cdots + a_n u_n, u_i \rangle = a_1 \langle u_1, u_i \rangle + \cdots + a_i \langle u_i, u_i \rangle + \cdots + a_n \langle u_n, u_i \rangle = a_i \langle u_i, u_i \rangle$,

and so $a_i = \dfrac{\langle v, u_i \rangle}{\langle u_i, u_i \rangle}$ for each $i = 1, \ldots, n$.

Each scalar $a_i = \dfrac{\langle v, u_i \rangle}{\langle u_i, u_i \rangle}$ is called the Fourier coefficient of $v$ with respect to $u_i$.
Example 3.4.9 Write $v = (7,1,9)$ as a linear combination of the vectors $u_1, u_2, u_3$ in Example 3.4.5.

We find that

$a_1 = \dfrac{\langle (7,1,9), (1,2,1) \rangle}{\langle (1,2,1), (1,2,1) \rangle} = \dfrac{18}{6} = 3, \quad a_2 = \dfrac{-21}{21} = -1, \quad a_3 = \dfrac{28}{14} = 2$,

and so $(7,1,9) = 3(1,2,1) - (2,1,-4) + 2(3,-2,1)$.
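The computation of Fourier coefficients is mechanical and easy to script; a minimal sketch, assuming NumPy:

import numpy as np

# The orthogonal basis of Example 3.4.5 and the vector v of Example 3.4.9.
basis = [np.array([1, 2, 1]), np.array([2, 1, -4]), np.array([3, -2, 1])]
v = np.array([7, 1, 9])

# Fourier coefficients a_i = <v, u_i> / <u_i, u_i>.
coeffs = [np.dot(v, u) / np.dot(u, u) for u in basis]
print(coeffs)                                       # [3.0, -1.0, 2.0]
print(sum(a * u for a, u in zip(coeffs, basis)))    # [7. 1. 9.]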
Gram-Schmidt Orthogonalization Process:

We now introduce a process by which, given a basis for an inner product space $V$, we can construct from it an orthogonal basis for $V$.

Suppose $\{v_1, v_2, \ldots, v_n\}$ is a basis for an inner product space $V$. We can form an orthogonal basis $\{w_1, w_2, \ldots, w_n\}$ for $V$ as follows. Set

$w_1 = v_1$

$w_2 = v_2 - \dfrac{\langle v_2, w_1 \rangle}{\langle w_1, w_1 \rangle} w_1$

$w_3 = v_3 - \dfrac{\langle v_3, w_1 \rangle}{\langle w_1, w_1 \rangle} w_1 - \dfrac{\langle v_3, w_2 \rangle}{\langle w_2, w_2 \rangle} w_2$

$\vdots$

$w_n = v_n - \dfrac{\langle v_n, w_1 \rangle}{\langle w_1, w_1 \rangle} w_1 - \dfrac{\langle v_n, w_2 \rangle}{\langle w_2, w_2 \rangle} w_2 - \cdots - \dfrac{\langle v_n, w_{n-1} \rangle}{\langle w_{n-1}, w_{n-1} \rangle} w_{n-1}$.
To see precisely why $\{w_1, w_2, \ldots, w_n\}$ is an orthogonal set, we observe the following theorem.
Theorem 3.4.10 Suppose that $\{u_1, u_2, \ldots, u_r\}$ is an orthogonal set of non-zero vectors in an inner product space $V$. Then if $u$ is any vector in $V$, let

$u' = u - \dfrac{\langle u, u_1 \rangle}{\langle u_1, u_1 \rangle} u_1 - \dfrac{\langle u, u_2 \rangle}{\langle u_2, u_2 \rangle} u_2 - \cdots - \dfrac{\langle u, u_r \rangle}{\langle u_r, u_r \rangle} u_r$.

Then $u'$ is orthogonal to each of $u_1, u_2, \ldots, u_r$.

Proof For each $i = 1, \ldots, r$, we have

$\langle u', u_i \rangle = \left\langle u - \dfrac{\langle u, u_1 \rangle}{\langle u_1, u_1 \rangle} u_1 - \dfrac{\langle u, u_2 \rangle}{\langle u_2, u_2 \rangle} u_2 - \cdots - \dfrac{\langle u, u_r \rangle}{\langle u_r, u_r \rangle} u_r, \; u_i \right\rangle$

$= \langle u, u_i \rangle - \dfrac{\langle u, u_1 \rangle}{\langle u_1, u_1 \rangle} \langle u_1, u_i \rangle - \dfrac{\langle u, u_2 \rangle}{\langle u_2, u_2 \rangle} \langle u_2, u_i \rangle - \cdots - \dfrac{\langle u, u_r \rangle}{\langle u_r, u_r \rangle} \langle u_r, u_i \rangle$

$= \langle u, u_i \rangle - \dfrac{\langle u, u_i \rangle}{\langle u_i, u_i \rangle} \langle u_i, u_i \rangle$ (by orthogonality of $\{u_1, u_2, \ldots, u_r\}$)

$= \langle u, u_i \rangle - \langle u, u_i \rangle = 0$.
□
From this theorem, we see that each $w_k$ defined above is orthogonal to each of the preceding $w$'s, and so inductively it follows that $\{w_1, w_2, \ldots, w_n\}$ is an orthogonal set. (Note that each $w_k$ is non-zero: otherwise $v_k$ would be a linear combination of $w_1, \ldots, w_{k-1}$, and hence of $v_1, \ldots, v_{k-1}$, contradicting the linear independence of the $v_i$'s.) It is an orthogonal basis since it is automatically linearly independent, and the dimension of $V$ is $n$.
Example 3.4.11 Apply the Gram-Schmidt Orthogonalization Process to find an orthogonal basis, and then an orthonormal basis, for the subspace $U$ of $\mathbb{R}^4$ spanned by

$v_1 = (1,0,1,1), \quad v_2 = (1,6,5,3), \quad v_3 = (2,-1,6,1)$.

First let $w_1 = v_1 = (1,0,1,1)$.

Then $w_2 = v_2 - \dfrac{\langle v_2, w_1 \rangle}{\langle w_1, w_1 \rangle} w_1 = (1,6,5,3) - \dfrac{9}{3}(1,0,1,1) = (-2,6,2,0)$.

Finally, $w_3 = v_3 - \dfrac{\langle v_3, w_1 \rangle}{\langle w_1, w_1 \rangle} w_1 - \dfrac{\langle v_3, w_2 \rangle}{\langle w_2, w_2 \rangle} w_2 = (2,-1,6,1) - \dfrac{9}{3}(1,0,1,1) - \dfrac{2}{44}(-2,6,2,0) = \left(-\dfrac{10}{11}, -\dfrac{14}{11}, \dfrac{32}{11}, -2\right)$.

Thus, $\{w_1, w_2, w_3\}$ forms an orthogonal basis for $U$.

For an orthonormal basis, we just normalize and obtain

$u_1 = \dfrac{1}{\sqrt{3}}(1,0,1,1), \quad u_2 = \dfrac{1}{2\sqrt{11}}(-2,6,2,0), \quad u_3 = \sqrt{\dfrac{11}{164}}\left(-\dfrac{10}{11}, -\dfrac{14}{11}, \dfrac{32}{11}, -2\right)$.
Remark 3.4.12 Since multiplying vectors by non-zero scalars does not affect orthogonality, (and of course does not affect the span of these vectors), we can often make our calculations within the Gram-Schmidt Process much simpler by clearing out fractions or scaling down our vectors. For example in the above, we could have taken our $w_2$ to be $(-1,3,1,0)$, or taken $w_3$ to be $(-10,-14,32,-22)$.
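The process translates directly into code. Below is a minimal Python sketch (assuming NumPy and the Euclidean inner product; the function name gram_schmidt is ours); it reproduces the orthogonal basis of Example 3.4.11.

import numpy as np

def gram_schmidt(vectors):
    # Orthogonalize a list of linearly independent vectors, following the
    # formulas above: subtract from each v its projections onto the
    # previously constructed w's.
    ws = []
    for v in vectors:
        w = v.astype(float)
        for u in ws:
            w = w - (np.dot(v, u) / np.dot(u, u)) * u
        ws.append(w)
    return ws

vs = [np.array([1, 0, 1, 1]),
      np.array([1, 6, 5, 3]),
      np.array([2, -1, 6, 1])]
for w in gram_schmidt(vs):
    print(w)
# [1. 0. 1. 1.]
# [-2. 6. 2. 0.]
# [-0.909... -1.272...  2.909... -2.]   (= (-10/11, -14/11, 32/11, -2))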
Remark 3.4.13 We now see that every finite dimensional inner product space has an orthogonal basis.

Remark 3.4.14 If $S = \{v_1, v_2, \ldots, v_r\}$ is an orthogonal set in an inner product space $V$, then we can always extend $S$ to an orthogonal basis for $V$. To see this, first extend $S$ to a basis for $V$, $\{v_1, v_2, \ldots, v_r, v_{r+1}, \ldots, v_n\}$ say. Then applying the Gram-Schmidt Process we obtain $w_1 = v_1, w_2 = v_2, \ldots, w_r = v_r$, (since the $v_i$'s are orthogonal), and further vectors $w_{r+1}, \ldots, w_n$, where $\{w_1, w_2, \ldots, w_n\}$ is an orthogonal basis for $V$.
4. Linear Transformations (Linear Maps)
In this section, we consider a special class of functions that are of fundamental
interest in linear algebra. They are called linear transformations (or linear maps),
and their domains and codomains are vector spaces. They are furthermore special
because they preserve the vector space structure.
By this we mean the following. Suppose that $T$ is a linear transformation from a vector space $V$ to a vector space $W$; that is, $T: V \to W$. Then $T$ possesses the following characteristics: whenever $u \in V$ and $v \in V$, so that $Tu \in W$ and $Tv \in W$, then $u + v \in V$ and $Tu + Tv \in W$, with

$T(u + v) = Tu + Tv$;

and, for $k \in \mathbb{R}$, when $u \in V$ and $Tu \in W$, then $ku \in V$ and $T(ku) \in W$, with

$T(ku) = kTu$.
Remark Recall that we touched on this concept in Section 2.9.
4.1 Definition and Examples
Definition 4.1.1 Let $V$ and $W$ be vector spaces. A map $T$ from $V$ to $W$ is called a linear transformation, or a linear map, if for all $u, v \in V$ and $k \in \mathbb{R}$,

$T(u + v) = Tu + Tv$ and $T(ku) = kTu$.
Observations 4.1.2

1. $T$ sends the zero vector of $V$ to the zero vector of $W$; $T(0_V) = 0_W$.

2. $T$ is uniquely determined by how it acts on basis vectors of $V$. That is, if $\{v_1, v_2, \ldots, v_n\}$ is any basis for $V$, and if we know how $T$ acts on each of $v_1, v_2, \ldots, v_n$, then we know how $T$ acts on every vector of $V$.

e.g. Suppose $T: \mathbb{R}^2 \to \mathbb{R}^2$, and we know that $T(1,0) = (3,5)$ and $T(0,1) = (-1,7)$. Then we can deduce that for arbitrary $(x, y) \in \mathbb{R}^2$,

$T(x, y) = T(x(1,0) + y(0,1)) = xT(1,0) + yT(0,1) = x(3,5) + y(-1,7) = (3x - y, 5x + 7y)$.
Remark 4.1.3 We could rephrase Observation 2. by saying that if $\{v_1, v_2, \ldots, v_n\}$ is any basis for a vector space $V$, and $w_1, w_2, \ldots, w_n$ are any (not necessarily distinct) vectors in a vector space $W$, then setting $Tv_1 = w_1, Tv_2 = w_2, \ldots, Tv_n = w_n$ is sufficient to uniquely define a linear transformation (LT) from $V$ to $W$.

Example 4.1.4 Let $\{e_1, e_2\}$ be the standard basis of $\mathbb{R}^2$. Then taking, for instance, $w_1 = (1,2,3,0,7)$ and $w_2 = (0,-1,7,1,0)$, we can set $Te_1 = w_1$ and $Te_2 = w_2$, thus defining the linear map $T: \mathbb{R}^2 \to \mathbb{R}^5$, with $T(x, y) = (x, 2x - y, 3x + 7y, y, 7x)$.
Example 4.1.5 Suppose we have a LT, $F: \mathbb{R}^2 \to \mathbb{R}^2$, and we know that $F(1,1) = (1,2)$ and $F(1,-1) = (4,-1)$. Find a general expression for $F(x, y)$, for $(x, y) \in \mathbb{R}^2$.
Typically, given a map between vector spaces, we want to determine whether or not it is linear.
Example 4.1.6 Let $D: P_3(t) \to P_2(t)$ be the map defined by

$(Dp)(t) = \dfrac{d}{dt} p(t)$, for $p \in P_3(t)$, (the derivative mapping).

This map is certainly linear, as we know that differentiation is a linear operation; that is,

$D(p + q) = \dfrac{d}{dt}(p + q)(t) = \dfrac{d}{dt} p(t) + \dfrac{d}{dt} q(t) = Dp + Dq$,

and

$D(kp) = \dfrac{d}{dt}(kp)(t) = k \dfrac{d}{dt} p(t) = kDp$,

for all $p, q \in P_3(t)$, $k \in \mathbb{R}$.
Example 4.1.7 Let $F: \mathbb{R}^2 \to \mathbb{R}^2$ be defined as reflection in the $x$-axis.
Example 4.1.8 Let $V$ be a vector space, and $B = \{u_1, u_2, \ldots, u_n\}$ an ordered basis for $V$. Define $I_B: V \to \mathbb{R}^n$ to be the map

$I_B v = [v]_B \in \mathbb{R}^n$, for $v \in V$,

where $[v]_B$ is the coordinate vector of $v$ with respect to the basis $B$. (Recall Section 2.9.) This map is clearly linear, as can be seen from the discussion following Conclusion 2.9.7.
Example 4.1.9 $S: \mathbb{R}^2 \to \mathbb{R}^2$, rotation through a fixed angle $\theta$ about the origin, in an anticlockwise direction.
Example 4.1.10 $T: \mathbb{R}^2 \to \mathbb{R}^3$, defined by $T(x_1, x_2) = (x_1 + 1, x_1 + x_2, 3x_1 - 4x_2)$.

Example 4.1.11 $T: \mathbb{R}^2 \to \mathbb{R}^3$, given by $T(x_1, x_2) = (0, x_1 x_2, x_1 + 2x_2)$.

Example 4.1.12 $G: \mathbb{R}^3 \to \mathbb{R}^2$, with $G(x, y, z) = (x, y + z)$.

Example 4.1.13 $T: \mathbb{R}^2 \to \mathbb{R}^3$, defined by $T(x_1, x_2) = (x_1, 2x_1 - x_2, 5x_1 + x_2)$.
Observation 4.1.14 When $T$ is a map from $\mathbb{R}^n$ to $\mathbb{R}^m$, the graph of $T$ is defined to be the subset

$G_T = \{(x, Tx) : x \in \mathbb{R}^n\} \subseteq \mathbb{R}^{n+m}$.

Then $T$ is linear if and only if $G_T$ is a subspace of $\mathbb{R}^{n+m}$.
Exercise 4.1.15 Verify this.
4.2 Kernel and Range of a Linear Transformation
In Section 2.11 we defined the kernel and range for m x n real matrices. We now do so (more generally) for any linear transformation.

Definition 4.2.1 Let $T: U \to V$ be a linear transformation between vector spaces $U$ and $V$. Then define the kernel and range of $T$ to be, respectively,

$\mathrm{Ker}\,T = \{x \in U : Tx = 0_V\}$
and
$\mathrm{Ran}\,T = \{Tx : x \in U\} \subseteq V$.
Observations 4.2.2

- $\mathrm{Ker}\,T$ is a subspace of $U$.
- $\mathrm{Ran}\,T$ is a subspace of $V$.

Proposition 4.2.3 Let $T: U \to V$ be a linear transformation, and suppose that $\{u_1, u_2, \ldots, u_l\}$ spans $U$. Then $\{Tu_1, Tu_2, \ldots, Tu_l\}$ spans $\mathrm{Ran}\,T$.
Proof
□
Example 4.2.4 Define a linear transformation, $F: \mathbb{R}^5 \to \mathbb{R}^4$, by

$F(x_1, x_2, x_3, x_4, x_5) = (x_1 + 2x_2 + x_3 + x_5,\; x_1 + 2x_2 + 2x_3 + x_4 + 3x_5,\; 3x_1 + 6x_2 + 5x_3 + 2x_4 + 7x_5,\; 2x_1 + 4x_2 + x_3 + x_4)$.

Find a basis for $\mathrm{Ran}\,F$, and hence state its dimension. Compute also a basis for $\mathrm{Ker}\,F$.
Recall also that in Section 2.11 we stated, as Theorem 2.11.14, the "Conservation of Dimension Theorem" for matrices. This too was a special case of the following more general statement.
Theorem 4.2.5 (Conservation of Dimension)
Let $T: U \to V$ be a linear transformation, and suppose that $\dim U = n$. Then

$\dim \mathrm{Ker}\,T + \dim \mathrm{Ran}\,T = n$.
4.3 A Most Important Source of Linear Transformations
Whenever $A \in M_{m,n}(\mathbb{R})$, the map $T_A: \mathbb{R}^n \to \mathbb{R}^m$ defined as $T_A(x) = Ax$ is a linear transformation. Here, on the right-hand side, we are thinking of $x = (x_1, x_2, \ldots, x_n) \in \mathbb{R}^n$ as an n x 1 matrix, and by $Ax$ we mean the matrix product of $A$ and $x$. So,

$T_A(x) = A \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix}$.
Exercise 4.3.1 Verify that $T_A$ is linear.
In fact, this type of linear transformation is much more than an important source of
examples – indeed every linear transformation is of this form.
Fact 4.3.2 Let $T: U \to V$ be a linear transformation between vector spaces $U$ and $V$, where $U$ has dimension $n$, and $V$ has dimension $m$. Then $T$ can be represented as a matrix transformation, $T_A$, for some $A \in M_{m,n}(\mathbb{R})$. In fact, such a $T$ can be 'represented' in infinitely many different ways.
We need to make precise exactly what we mean by "represent".
The simplest case:

Suppose $T: \mathbb{R}^n \to \mathbb{R}^m$ is a linear transformation, and let $\{e_1, e_2, \ldots, e_n\}$ be the standard basis for $\mathbb{R}^n$. Suppose

$Te_1 = (a_{11}, a_{21}, \ldots, a_{m1})$
$Te_2 = (a_{12}, a_{22}, \ldots, a_{m2})$
$\vdots$
$Te_n = (a_{1n}, a_{2n}, \ldots, a_{mn})$.

Form the m x n matrix $A$ with columns $Te_1, Te_2, \ldots, Te_n$:
$A = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{pmatrix}$.

Then for any $x = (x_1, x_2, \ldots, x_n) \in \mathbb{R}^n$,

$Ax = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix} = x_1 \begin{pmatrix} a_{11} \\ a_{21} \\ \vdots \\ a_{m1} \end{pmatrix} + x_2 \begin{pmatrix} a_{12} \\ a_{22} \\ \vdots \\ a_{m2} \end{pmatrix} + \cdots + x_n \begin{pmatrix} a_{1n} \\ a_{2n} \\ \vdots \\ a_{mn} \end{pmatrix}$

$= x_1 Te_1 + x_2 Te_2 + \cdots + x_n Te_n = T(x_1 e_1 + x_2 e_2 + \cdots + x_n e_n) = Tx$.
Example 4.3.3 Let $T: \mathbb{R}^5 \to \mathbb{R}^3$ be defined by

$T(x_1, x_2, x_3, x_4, x_5) = (x_1 + x_2 + 2x_3,\; x_2 + x_3,\; x_1 - x_4 + 3x_5)$.

Then $Te_1 = (1,0,1)$, $Te_2 = (1,1,0)$, $Te_3 = (2,1,0)$, $Te_4 = (0,0,-1)$ and $Te_5 = (0,0,3)$, and so $T = T_A$, where

$A = \begin{pmatrix} 1 & 1 & 2 & 0 & 0 \\ 0 & 1 & 1 & 0 & 0 \\ 1 & 0 & 0 & -1 & 3 \end{pmatrix}$.
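The construction of a standard matrix is easily automated; a minimal sketch, assuming NumPy, that rebuilds the matrix of Example 4.3.3 and checks it against a direct application of $T$:

import numpy as np

def T(x):
    x1, x2, x3, x4, x5 = x
    return np.array([x1 + x2 + 2*x3, x2 + x3, x1 - x4 + 3*x5])

# The columns of A are the images of the standard basis vectors.
A = np.column_stack([T(e) for e in np.eye(5)])
print(A)

x = np.array([1, 2, 3, 4, 5])
print(np.allclose(A @ x, T(x)))   # True: T = T_A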
Remark 4.3.4 You might ask why we bothered explicitly going through the process of calculating $Te_1, Te_2, \ldots, Te_5$, then writing the resultant vectors as the columns of a matrix, when it is clear simply by looking at the definition of $T$ that it is of the form $T_A$, with $A$ as above. The fact is that the above is a special case of a more general picture, and in this more general setting this procedure will be necessary.
4.4 Matrix Representations of Linear Transformations
Theorem 4.4.1 Suppose that $T: U \to V$ is a linear transformation, and that $B = \{u_1, u_2, \ldots, u_n\}$ and $B' = \{v_1, v_2, \ldots, v_m\}$ are fixed ordered bases for vector spaces $U$ and $V$ respectively. Then there exists a unique m x n matrix $A$, such that for all $u \in U$,

$[Tu]_{B'} = A[u]_B$.

We say that this matrix $A$ represents $T$ with respect to the ordered bases $B$ and $B'$, and denote it by $_{B'}[T]_B$.
Proof First of all, we show that such a matrix $A$ exists by explicitly constructing it. Apply $T$ to each of $u_1, u_2, \ldots, u_n$, and write the results (uniquely) as linear combinations of $v_1, v_2, \ldots, v_m$:

$Tu_1 = a_{11} v_1 + a_{12} v_2 + \cdots + a_{1m} v_m$
$Tu_2 = a_{21} v_1 + a_{22} v_2 + \cdots + a_{2m} v_m$
$\vdots$
$Tu_n = a_{n1} v_1 + a_{n2} v_2 + \cdots + a_{nm} v_m$.

Then form $_{B'}[T]_B$ as the matrix whose columns are the coordinate vectors of $Tu_1, Tu_2, \ldots, Tu_n$, that is,

$_{B'}[T]_B = \begin{pmatrix} a_{11} & a_{21} & \cdots & a_{n1} \\ a_{12} & a_{22} & \cdots & a_{n2} \\ \vdots & & & \vdots \\ a_{1m} & a_{2m} & \cdots & a_{nm} \end{pmatrix}$.

Then $_{B'}[T]_B$ satisfies $_{B'}[T]_B [u]_B = [Tu]_{B'}$, for all $u \in U$, as can be seen from the following. Let $u \in U$ and $[u]_B = (t_1, t_2, \ldots, t_n)$. Then

$_{B'}[T]_B [u]_B = \begin{pmatrix} a_{11} & a_{21} & \cdots & a_{n1} \\ a_{12} & a_{22} & \cdots & a_{n2} \\ \vdots & & & \vdots \\ a_{1m} & a_{2m} & \cdots & a_{nm} \end{pmatrix} \begin{pmatrix} t_1 \\ t_2 \\ \vdots \\ t_n \end{pmatrix} = t_1 \begin{pmatrix} a_{11} \\ a_{12} \\ \vdots \\ a_{1m} \end{pmatrix} + t_2 \begin{pmatrix} a_{21} \\ a_{22} \\ \vdots \\ a_{2m} \end{pmatrix} + \cdots + t_n \begin{pmatrix} a_{n1} \\ a_{n2} \\ \vdots \\ a_{nm} \end{pmatrix}$

$= t_1 [Tu_1]_{B'} + t_2 [Tu_2]_{B'} + \cdots + t_n [Tu_n]_{B'} = [t_1 Tu_1 + t_2 Tu_2 + \cdots + t_n Tu_n]_{B'} = [T(t_1 u_1 + t_2 u_2 + \cdots + t_n u_n)]_{B'} = [Tu]_{B'}$.
To see that this matrix is unique, it suffices to observe the following. Suppose $A$ is an m x n matrix satisfying $A[u]_B = [Tu]_{B'}$, for all $u \in U$. Then in particular, it satisfies this for each of $u_1, u_2, \ldots, u_n$; that is, for each $j = 1, 2, \ldots, n$,

$A[u_j]_B = [Tu_j]_{B'}$.

But $[u_j]_B = (0, \ldots, 0, 1, 0, \ldots, 0)$, with 1 in position $j$ and zeros elsewhere, and so $A[u_j]_B$ is precisely the $j$th column of $A$; and so $A$ is uniquely determined from the equation $A[u]_B = [Tu]_{B'}$.
□
Example 4.4.2 In Example 4.3.3, the matrix $A$ would be denoted by $_{E_3}[T]_{E_5}$, where $E_n$ denotes the standard basis for $\mathbb{R}^n$. In this case, we call $A$ the Standard Matrix for $T$.
Example 4.4.3 Consider the derivative mapping $D: P_3(t) \to P_2(t)$, given in Example 4.1.6. Let $B = \{1, t, t^2, t^3\}$ and $B' = \{1, t, t^2\}$ be our bases for $P_3(t)$ and $P_2(t)$ respectively. Find $_{B'}[D]_B$.

Applying $D$ to each of the basis vectors we get

$D(1) = 0 = 0(1) + 0(t) + 0(t^2)$
$D(t) = 1 = 1(1) + 0(t) + 0(t^2)$
$D(t^2) = 2t = 0(1) + 2(t) + 0(t^2)$
$D(t^3) = 3t^2 = 0(1) + 0(t) + 3(t^2)$,

so that

$_{B'}[D]_B = \begin{pmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 2 & 0 \\ 0 & 0 & 0 & 3 \end{pmatrix}$.

Then, for example, if we want to apply $D$ to the polynomial $3 - t + 2t^2 + 5t^3$, we can do so using $_{B'}[D]_B$ as follows. First of all, $[3 - t + 2t^2 + 5t^3]_B = (3, -1, 2, 5)$, and so

$\begin{pmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 2 & 0 \\ 0 & 0 & 0 & 3 \end{pmatrix} \begin{pmatrix} 3 \\ -1 \\ 2 \\ 5 \end{pmatrix} = \begin{pmatrix} -1 \\ 4 \\ 15 \end{pmatrix}$

gives us $D(3 - t + 2t^2 + 5t^3) = -1 + 4t + 15t^2$.
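In coordinates, applying $D$ is now just a matrix-vector product. A minimal sketch (assuming NumPy), with polynomials stored as coordinate vectors with respect to $B$, constant term first:

import numpy as np

# The matrix of D with respect to B = {1, t, t^2, t^3} and B' = {1, t, t^2}.
D = np.array([[0, 1, 0, 0],
              [0, 0, 2, 0],
              [0, 0, 0, 3]])

p = np.array([3, -1, 2, 5])   # coordinates of 3 - t + 2t^2 + 5t^3
print(D @ p)                  # [-1  4 15], i.e. -1 + 4t + 15t^2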
Example 4.4.4 Given the basis $B = \{(2,1), (-1,4)\}$ for $\mathbb{R}^2$, find $_{E_3}[S]_B$, where $S: \mathbb{R}^2 \to \mathbb{R}^3$ is defined by

$S(x_1, x_2) = (3x_1,\; 2x_1 + x_2,\; 3x_1 - 2x_2)$.

Verify that the matrix is correct by checking that $_{E_3}[S]_B [(x_1, x_2)]_B = [S(x_1, x_2)]_{E_3}$, for all $(x_1, x_2) \in \mathbb{R}^2$.
Example 4.4.5 For $F: \mathbb{R}^2 \to \mathbb{R}^2$ defined by $F(x, y) = (5x - y, 2x + y)$, and $B = \{(1,4), (2,7)\}$, find $_B[F]_B$. (Then verify that the matrix is the correct one.)
Fact 4.4.6 Given a linear transformation $T: U \to V$, and bases $B = \{u_1, u_2, \ldots, u_n\}$ and $B' = \{v_1, v_2, \ldots, v_m\}$ for $U$ and $V$ respectively, there is in fact a quicker method for finding $_{B'}[T]_B$ than that used above.
4.5 The Change-of-Basis Matrix
Let's look first at the case where our linear transformation is the identity, $I: \mathbb{R}^n \to \mathbb{R}^n$, so that $Ix = x$, for all $x \in \mathbb{R}^n$.

Observation 4.5.1 If $\beta$ is any basis for $\mathbb{R}^n$, then $_\beta[I]_\beta = I_n$, the n x n identity matrix.
Exercise 4.5.2 Verify this.
Let  and  be ordered bases for R n . Then
that


I  
is the unique n x n matrix such
I   x   Ix    x  , for all
Definition 4.5.3 Such a matrix

I  
x  Rn .
is called a change-of-basis matrix.
Remark 4.5.4 For $x \in \mathbb{R}^n$, if we are given $[x]_\beta$, the $\beta$-coordinates of $x$, multiplying by $_\gamma[I]_\beta$ gives us $[x]_\gamma$, the $\gamma$-coordinates of $x$; and so the reason for its name is clear.

Remark 4.5.5 The columns of $_\gamma[I]_\beta$ are the $\gamma$-coordinates of the vectors in $\beta$, that is,

$_\gamma[I]_\beta = \begin{pmatrix} [u_1]_\gamma & [u_2]_\gamma & \cdots & [u_n]_\gamma \end{pmatrix}$,

where $\beta = \{u_1, u_2, \ldots, u_n\}$.
Theorem 4.5.6 If $\beta$ and $\gamma$ are ordered bases of $\mathbb{R}^n$, then $_\gamma[I]_\beta$ is invertible, and

$\left(_\gamma[I]_\beta\right)^{-1} = {_\beta[I]_\gamma}$.
Proof
□
Example 4.5.7 If $\beta = \{(2,1), (-1,4)\}$, find $_{E_2}[I]_\beta$.

Example 4.5.8 If $\beta = \{(0,0,1), (1,0,1), (0,2,1)\}$, find $_{E_3}[I]_\beta$.

Remark 4.5.9 Clearly, in general, if $\beta = \{u_1, u_2, \ldots, u_n\}$ and $I: \mathbb{R}^n \to \mathbb{R}^n$ is the identity, then $_{E_n}[I]_\beta = \begin{pmatrix} u_1 & u_2 & \cdots & u_n \end{pmatrix}$; that is, the columns of $_{E_n}[I]_\beta$ are simply the vectors of the basis $\beta$.
We can now use Theorem 4.5.6 to easily compute $_\beta[I]_{E_n}$.

Example 4.5.10 If $\beta = \{(2,1), (-1,4)\}$, find $_\beta[I]_{E_2}$.

Example 4.5.11 If $\beta = \{(0,0,1), (1,0,1), (0,2,1)\}$, find $_\beta[I]_{E_3}$.

So far we have only looked at $_\gamma[I]_\beta$ when at least one of $\beta$ or $\gamma$ is a standard basis. What about the general case?
Theorem 4.5.12 If $\beta$ and $\gamma$ are ordered bases for $\mathbb{R}^n$, then

$_\gamma[I]_\beta = {_\gamma[I]_{E_n}} \; {_{E_n}[I]_\beta}$.
Proof
□
Example 4.5.13 If $\beta = \{(2,1), (-1,4)\}$ and $\gamma = \{(0,2), (1,3)\}$, find $_\gamma[I]_\beta$.

Example 4.5.14 If $\beta = \{(1,0,-1), (2,0,1), (1,1,0)\}$ and $\gamma = \{(0,0,1), (1,0,1), (0,2,1)\}$, find $_\gamma[I]_\beta$.
In the next two examples we see the change-of-basis matrix in use.
Example 4.5.15 If $\beta = \{(2,1), (-1,4)\}$ and $u = (9,-18)$, find the $\beta$-coordinates of $u$.

We want $[u]_\beta$. Thus,

$[u]_\beta = {_\beta[I]_{E_2}} \begin{pmatrix} 9 \\ -18 \end{pmatrix} = \left({_{E_2}[I]_\beta}\right)^{-1} \begin{pmatrix} 9 \\ -18 \end{pmatrix} = \begin{pmatrix} 2 & -1 \\ 1 & 4 \end{pmatrix}^{-1} \begin{pmatrix} 9 \\ -18 \end{pmatrix} = \dfrac{1}{9} \begin{pmatrix} 4 & 1 \\ -1 & 2 \end{pmatrix} \begin{pmatrix} 9 \\ -18 \end{pmatrix} = \begin{pmatrix} 2 \\ -5 \end{pmatrix}$.

(And so $(9,-18) = 2(2,1) + (-5)(-1,4)$.)
Example 4.5.16 Let $\beta = \{(0,0,1), (1,0,1), (0,2,1)\}$ and $\gamma = \{(1,3,1), (2,4,-1), (6,2,0)\}$.

(i) Find the standard coordinates of $u$, where the $\beta$-coordinates of $u$ are $(3,1,-9)$.
(ii) Find the $\beta$-coordinates of $u$, where the $\gamma$-coordinates of $u$ are $(3,1,-9)$.
(iii) Find the $\gamma$-coordinates of $u$, where the standard coordinates of $u$ are $(3,1,-9)$.
(iv) Find the $\gamma$-coordinates of $u$, where the $\beta$-coordinates of $u$ are $(3,1,-9)$.
So we have seen that for each pair $\beta, \gamma$ of ordered bases for $\mathbb{R}^n$, $_\gamma[I]_\beta$ is an invertible n x n matrix. There is, in fact, the following partial converse to this:

Let $P$ be any invertible n x n matrix. Then we can always think of $P$ as a change-of-basis matrix in the following sense. Suppose $\beta = \{v_1, v_2, \ldots, v_n\}$ is a basis for $\mathbb{R}^n$, and

$P = \begin{pmatrix} p_{11} & p_{12} & \cdots & p_{1n} \\ p_{21} & p_{22} & \cdots & p_{2n} \\ \vdots & & & \vdots \\ p_{n1} & p_{n2} & \cdots & p_{nn} \end{pmatrix}$.

Then define the set of vectors $\gamma$ by

$\gamma = \{u_j = p_{1j} v_1 + p_{2j} v_2 + \cdots + p_{nj} v_n\}_{j = 1, \ldots, n}$.

Claim 4.5.17 $\gamma$ is a basis for $\mathbb{R}^n$, and $P = {_\beta[I]_\gamma}$.
Proof
□
4.6 Similar Matrices
Let $T: \mathbb{R}^n \to \mathbb{R}^m$ be a linear transformation, and suppose that $\beta$ and $\beta'$ are ordered bases for $\mathbb{R}^n$, and that $\gamma$ and $\gamma'$ are ordered bases for $\mathbb{R}^m$. We have seen how to construct matrices $_\gamma[T]_\beta$ and $_{\gamma'}[T]_{\beta'}$, which would respectively represent $T$ with respect to the bases $\beta$ and $\gamma$, and with respect to the bases $\beta'$ and $\gamma'$.

Question 4.6.1 What is the relationship between $_\gamma[T]_\beta$ and $_{\gamma'}[T]_{\beta'}$?

So the question we are asking is: what is the relationship between two different matrices which represent the same linear transformation, but with respect to different bases? Well, the answer is straightforward, and perhaps what you would immediately expect it to be.

Theorem 4.6.2 Let $T: \mathbb{R}^n \to \mathbb{R}^m$ be a linear transformation, $\beta$ and $\beta'$ ordered bases for $\mathbb{R}^n$, and $\gamma$ and $\gamma'$ ordered bases for $\mathbb{R}^m$. Then we have that

$_{\gamma'}[T]_{\beta'} = {_{\gamma'}[I]_\gamma} \; {_\gamma[T]_\beta} \; {_\beta[I]_{\beta'}}$.
Proof
□
Example 4.6.3 Let $T: \mathbb{R}^2 \to \mathbb{R}^3$ be the linear transformation such that

$_{E_3}[T]_{E_2} = \begin{pmatrix} 3 & 0 \\ 2 & 1 \\ 3 & -2 \end{pmatrix}$.

(So in fact, $T(x_1, x_2) = (3x_1, 2x_1 + x_2, 3x_1 - 2x_2)$, for $(x_1, x_2) \in \mathbb{R}^2$.)

If $\beta = \{(2,1), (-1,4)\}$ and $\gamma = \{(0,0,1), (1,0,1), (0,2,1)\}$, use the above Theorem to find $_\gamma[T]_\beta$.
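One way to carry out this computation numerically (a sketch assuming NumPy; `solve` plays the role of multiplying by the inverse change-of-basis matrix):

import numpy as np

A = np.array([[3, 0],
              [2, 1],
              [3, -2]])                                  # standard matrix of T

B = np.column_stack([(2, 1), (-1, 4)])                   # columns: the basis beta
C = np.column_stack([(0, 0, 1), (1, 0, 1), (0, 2, 1)])   # columns: the basis gamma

# By Theorem 4.6.2, the matrix of T w.r.t. beta and gamma is (C^-1)(A)(B).
print(np.linalg.solve(C, A @ B))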
It is the special case of Question 4.6.1, in which $n = m$, $\gamma = \beta$ and $\gamma' = \beta'$, that leads us to the concept of similarity for matrices. The question becomes:

Question 4.6.4 If $T: \mathbb{R}^n \to \mathbb{R}^n$ is a linear transformation, and $\beta$ and $\gamma$ are ordered bases for $\mathbb{R}^n$, how are $_\beta[T]_\beta$ and $_\gamma[T]_\gamma$ related?

Terminology 4.6.5 (Note that we say that such a matrix, $_\beta[T]_\beta$, represents $T$ with respect to the basis $\beta$.)

Well, we know by Theorems 4.5.6 and 4.6.2 that

$_\gamma[T]_\gamma = {_\gamma[I]_\beta} \; {_\beta[T]_\beta} \; {_\beta[I]_\gamma} = \left({_\beta[I]_\gamma}\right)^{-1} \; {_\beta[T]_\beta} \; {_\beta[I]_\gamma}$.
Consequently, we immediately have the following.

Proposition 4.6.6 If $A$ and $B$ are matrices in $M_n(\mathbb{R})$ representing a linear transformation $T: \mathbb{R}^n \to \mathbb{R}^n$, (with respect to two possibly different bases for $\mathbb{R}^n$), then there exists an invertible matrix $P \in M_n(\mathbb{R})$ such that $B = P^{-1}AP$.

Conversely, since every invertible matrix $P \in M_n(\mathbb{R})$ can be thought of as a change-of-basis matrix for some pair of bases, we can say that if $A, B \in M_n(\mathbb{R})$, with $B = P^{-1}AP$ for some such matrix $P$, then $A$ and $B$ represent the same linear transformation $T: \mathbb{R}^n \to \mathbb{R}^n$, (with respect to two possibly different bases for $\mathbb{R}^n$). So it follows that we actually have equivalence between the two conditions of Proposition 4.6.6.
Definition 4.6.7 Matrices $A, B \in M_n(\mathbb{R})$ are said to be similar if there exists an invertible matrix $P \in M_n(\mathbb{R})$ such that $B = P^{-1}AP$.

Remark 4.6.8 In light of the above discussion, we could of course equivalently define matrices $A, B \in M_n(\mathbb{R})$ to be similar if they represent the same linear transformation.
4.7 Diagonalizability – our motivation for considering similarity

Suppose that $T: \mathbb{R}^n \to \mathbb{R}^n$ is a linear transformation. Then we know that it takes the general form

$T(x_1, x_2, \ldots, x_n) = (a_{11} x_1 + a_{12} x_2 + \cdots + a_{1n} x_n, \; \ldots, \; a_{n1} x_1 + a_{n2} x_2 + \cdots + a_{nn} x_n)$.

Each coordinate on the right hand side depends on each of $x_1, x_2, \ldots, x_n$.

Let's look at the linear transformation $T: \mathbb{R}^2 \to \mathbb{R}^2$ defined as

$T(x_1, x_2) = (2x_1 + 2x_2, \; 5x_1 - x_2)$.

We can easily see that the standard representation of $T$ is given by the matrix

$_{E_2}[T]_{E_2} = \begin{pmatrix} 2 & 2 \\ 5 & -1 \end{pmatrix}$.

Suppose, alternatively, that we choose as a basis for $\mathbb{R}^2$, $\beta = \left\{(1,1), \left(-\frac{2}{5}, 1\right)\right\}$. Then we find that

$_\beta[T]_\beta = \begin{pmatrix} 4 & 0 \\ 0 & -3 \end{pmatrix}$,

which is a diagonal matrix. So, at the cost of using a different basis, we obtain a simpler matrix, a diagonal one, to describe $T$.

We now have that for $x \in \mathbb{R}^2$, if $[x]_\beta = (y_1, y_2)$, then $[Tx]_\beta = {_\beta[T]_\beta}[x]_\beta = (4y_1, -3y_2)$.

This is the idea of diagonalizing a matrix – finding a simpler description of the behaviour of $T$.
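The diagonal form above can be computed with an eigen-decomposition; a minimal sketch assuming NumPy (whose eigenvectors are scaled differently from ours, though scaling does not affect the diagonal matrix):

import numpy as np

A = np.array([[2, 2],
              [5, -1]])          # the standard matrix of T above

eigenvalues, P = np.linalg.eig(A)
print(eigenvalues)               # 4 and -3 (order may vary)
print(np.linalg.inv(P) @ A @ P)  # diagonal, with 4 and -3 on the diagonal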
In general, given a linear transformation $T: \mathbb{R}^n \to \mathbb{R}^n$, we will wish to find a basis $\beta$ for $\mathbb{R}^n$ such that $_\beta[T]_\beta$ is diagonal; so that if $x \in \mathbb{R}^n$ with $[x]_\beta = (y_1, y_2, \ldots, y_n)$, then

$[Tx]_\beta = (t_1 y_1, t_2 y_2, \ldots, t_n y_n)$, for some $t_1, t_2, \ldots, t_n \in \mathbb{R}$.
Note 4.7.1 This will not always be possible.
In light of this, we note the following, which is an immediate consequence of our above discussions.

Observation 4.7.2 If we are able to determine whether a given n x n matrix is similar to a diagonal one, then we are, in fact, able to determine whether a given linear transformation can be represented by a diagonal matrix.