Linear Algebra Notes
Chapter 19
KERNEL AND IMAGE OF A MATRIX
Take an n × m matrix

        [ a11  a12  ···  a1m ]
    A = [ a21  a22  ···  a2m ]
        [ ...            ... ]
        [ an1  an2  ···  anm ]
and think of it as a function
A : Rm −→ Rn .
The kernel of A is defined as
ker A = set of all x in Rm such that Ax = 0.
Note that ker A lives in Rm .
The image of A is
im A = set of all vectors in Rn which are Ax for some x ∈ Rm .
Note that im A lives in Rn . Many calculations in linear algebra boil down to the
computation of kernels and images of matrices. Here are some different ways of
thinking about ker A and im A.
In terms of equations, ker A is the set of solution vectors x = (x1, ..., xm) in Rm
of the n equations

    a11 x1 + a12 x2 + ··· + a1m xm = 0
    a21 x1 + a22 x2 + ··· + a2m xm = 0
    ...
    an1 x1 + an2 x2 + ··· + anm xm = 0,                (19a)

and im A consists of those vectors y = (y1, ..., yn) in Rn for which the system

    a11 x1 + a12 x2 + ··· + a1m xm = y1
    a21 x1 + a22 x2 + ··· + a2m xm = y2
    ...
    an1 x1 + an2 x2 + ··· + anm xm = yn,               (19b)

has a solution x = (x1, ..., xm).
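These two descriptions can be checked numerically. The sketch below (using NumPy) tests whether a vector x lies in ker A by computing Ax, and whether y lies in im A by checking that a least-squares solution of Ax = y leaves no residual; the 2 × 3 matrix is the example used later in this chapter.

```python
import numpy as np

# Example matrix (the 2x3 matrix used later in this chapter).
A = np.array([[1.0, 2.0, 3.0],
              [3.0, 2.0, 1.0]])

def in_kernel(A, x, tol=1e-9):
    """x is in ker A exactly when Ax = 0."""
    return np.linalg.norm(A @ x) < tol

def in_image(A, y, tol=1e-9):
    """y is in im A when the least-squares solution of Ax = y leaves no residual."""
    x, *_ = np.linalg.lstsq(A, y, rcond=None)
    return np.linalg.norm(A @ x - y) < tol

print(in_kernel(A, np.array([1.0, -2.0, 1.0])))   # True: A(1, -2, 1) = 0
print(in_image(A, np.array([6.0, 6.0])))          # True: A(1, 1, 1) = (6, 6)
```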
A single equation
ai1 x1 + ai2 x2 + · · · + aim xm = yi
is called a hyperplane in Rm . (So a line is a hyperplane in R2 , and a plane is a
hyperplane in R3 .) Geometrically, ker A is the intersection of hyperplanes (19a),
and im A is the set of vectors (y1 , . . . , yn ) ∈ Rn for which the hyperplanes (19b)
intersect in at least one point.
If x and x′ are two solutions of (19b) for the same y, then

    A(x − x′) = Ax − Ax′ = y − y = 0,

so x − x′ belongs to the kernel of A. If ker A = 0 (i.e., consists just of the zero
vector) then there can be at most one solution. In general, the bigger the kernel,
the more solutions there are to a given equation that has at least one solution.
Thus, if x is one solution, then all other solutions are obtained from x by adding a
vector from ker A.
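This solution structure is easy to verify numerically. The sketch below (NumPy, with a small example system) takes one solution x of Ax = y and a kernel vector k, and checks that every shift x + tk solves the same system.

```python
import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [3.0, 2.0, 1.0]])
y = np.array([6.0, 6.0])

x = np.array([1.0, 1.0, 1.0])      # one particular solution: Ax = y
k = np.array([1.0, -2.0, 1.0])     # a kernel vector: Ak = 0

# Every x + t*k is again a solution of the same system.
for t in (-2.0, 0.5, 7.0):
    print(np.allclose(A @ (x + t * k), y))   # True each time
```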
However, we will see that there is a certain conservation principle at work here,
which implies that the bigger the kernel, the smaller the image, so the less likely it
is that there will be even one solution.
Intuitively, you can think of vectors in Rm as representing information. Then
ker A is the information lost by A, while im A is the information retained by A.
We can also describe im A as the span (= the set of linear combinations) of the
columns of A. That is, if u1 , . . . , um are the columns of A, then im A consists of
the vectors
Ax = x1 u1 + x2 u2 + · · · + xm um ∈ Rn ,
with all possible choices of scalars x1 , . . . , xm .
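The identity Ax = x1 u1 + ··· + xm um is just how matrix–vector multiplication works. A minimal NumPy check (matrix and coefficients chosen arbitrarily for illustration):

```python
import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [3.0, 2.0, 1.0]])
x = np.array([2.0, -1.0, 4.0])     # arbitrary scalars x1, x2, x3

# Columns u1, u2, u3 of A.
u1, u2, u3 = A[:, 0], A[:, 1], A[:, 2]

# Ax equals the linear combination of the columns with coefficients x1, x2, x3.
combo = x[0] * u1 + x[1] * u2 + x[2] * u3
print(np.allclose(A @ x, combo))   # True
```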
Example 1: (2 × 3 case)

    A = [ a11  a12  a13 ]
        [ a21  a22  a23 ] .
Suppose first that A is the zero matrix (all aij = 0). Then ker A = R3 and im A
consists only of the zero vector. Suppose then that A is not the zero matrix. Then
ker A is the intersection of two planes through (0, 0, 0)
a11 x1 + a12 x2 + a13 x3 = 0
a21 x1 + a22 x2 + a23 x3 = 0.
Each plane corresponds to a row vector of A, where the row vector is the normal
vector to the plane. If the row vectors of A are not proportional, then the planes
are distinct. In this case the planes intersect in a line, and ker A is this line. If the
rows of A are proportional, then the two equations determine just one plane, and
ker A is this plane. For example,

    ker [ 1 2 3 ]   is the line R(1, −2, 1),
        [ 3 2 1 ]

while

    ker [ 1 2 3 ]   is the plane x + 2y + 3z = 0.
        [ 2 4 6 ]
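These two kernels can be recovered symbolically. A sketch using SymPy's nullspace method, which returns a basis of ker A:

```python
import sympy as sp

A1 = sp.Matrix([[1, 2, 3], [3, 2, 1]])   # rows not proportional
A2 = sp.Matrix([[1, 2, 3], [2, 4, 6]])   # rows proportional

ker1 = A1.nullspace()   # basis of ker A1
ker2 = A2.nullspace()   # basis of ker A2

print(len(ker1))   # 1: ker A1 is a line, spanned by a multiple of (1, -2, 1)
print(len(ker2))   # 2: ker A2 is the plane x + 2y + 3z = 0
```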
What about im A? Since im A ⊆ R2 and is nonzero (because we’re assuming A ≠ 0),
the image of A is either a line or all of R2 . How to tell? Recall that im A is spanned
by the three column vectors

    u1 = [ a11 ],   u2 = [ a12 ],   u3 = [ a13 ].
         [ a21 ]         [ a22 ]         [ a23 ]
The image of A will be a line ℓ exactly when these three column vectors all live on
the same line ℓ. If, say, u1 ≠ 0, and the image is a line, then there are scalars s, t
such that u2 = su1 and u3 = tu1 . This would mean that

    A = [ a11  sa11  ta11 ]
        [ a21  sa21  ta21 ] .
But look, this means the rows are proportional. They are both proportional to
(1, s, t). By what we saw before, this means the kernel is a plane. In summary:
In summary:

    If im A = line,  then ker A = plane.      (19a)
    If im A = plane, then ker A = line.       (19b)

Recall also the case A = 0:

    If im A = 0,     then ker A = R3 .        (19c)
We can summarize (19a-c) in a table of dimensions:

    2 × 3 matrix
    dim ker A   dim im A   CONDITION
    1           2          expected, rows not proportional
    2           1          rows proportional, A ≠ 0
    3           0          A = 0 only
Note that for any 2 × 3 matrix A, we have
dim(ker A) + dim(im A) = 3.
As you vary A, the quantity dim(ker A) can vary from 1 to 3, and the quantity
dim(im A) can vary from 0 to 2, but the sum dim(ker A) + dim(im A) remains
constant at 3.
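This constancy is rank–nullity in numerical form: dim(im A) is the rank of A, and dim(ker A) = 3 − rank. A quick NumPy check of the three rows of the table:

```python
import numpy as np

def dims(A):
    """Return (dim ker A, dim im A): the rank is dim im A, and dim ker A = m - rank."""
    rank = int(np.linalg.matrix_rank(A))
    return A.shape[1] - rank, rank

print(dims(np.array([[1.0, 2.0, 3.0], [3.0, 2.0, 1.0]])))  # (1, 2): rows not proportional
print(dims(np.array([[1.0, 2.0, 3.0], [2.0, 4.0, 6.0]])))  # (2, 1): rows proportional, A != 0
print(dims(np.zeros((2, 3))))                              # (3, 0): A = 0
```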
For 3 × 2 matrices

        [ a11  a12 ]
    A = [ a21  a22 ] : R2 −→ R3 ,
        [ a31  a32 ]

the table is
    3 × 2 matrix
    dim ker A   dim im A   CONDITION
    0           2          expected, columns not proportional
    1           1          columns proportional, A ≠ 0
    2           0          A = 0 only
Again dim(ker A) + dim(im A) = 2, the number of columns. You can think of this as “conservation of information”. It is a general fact:
Kernel-Image Theorem. Let A be an n × m matrix. Then
dim(ker A) + dim(im A) = m.
The corresponding table depends on whether n or m is bigger.
    n × m, n ≤ m
    dim ker A   dim im A   CONDITION
    m − n       n          expected, rows linearly independent
    m − n + 1   n − 1
    ...         ...        ...
    m           0          A = 0 only

    n × m, n ≥ m
    dim ker A   dim im A   CONDITION
    0           m          expected, columns linearly independent
    1           m − 1
    ...         ...        ...
    m           0          A = 0 only
A set of vectors is linearly independent if none of them is a linear combination of
the others. Note that the “expected” situation has minimal kernel. As you go down
the rows in the tables, there are more and more conditions to be satisfied, hence
each row is less likely than the one above, until finally only A = 0 satisfies all the
conditions of the last row. The conditions can be expressed as certain determinants
being zero, as follows. Let µ be the smaller of n or m. Each table has µ + 1 rows.
Number the rows 0, 1, . . . , µ starting at the top row. Then a matrix satisfies the
conditions for row 0 (the expected case) if some µ × µ subdeterminant of A is
nonzero. The conditions for some lower row p are that some p × p subdeterminant
of A is nonzero, but all (p+1)×(p+1) subdeterminants are zero. This is illustrated
in the exercises.
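The subdeterminant criterion can be checked by brute force. The sketch below enumerates every p × p subdeterminant of A and returns the largest p for which some subdeterminant is nonzero; by the criterion above, that p equals dim(im A).

```python
import numpy as np
from itertools import combinations

def max_nonzero_minor(A, tol=1e-9):
    """Largest p such that some p x p subdeterminant of A is nonzero (0 for A = 0)."""
    n, m = A.shape
    best = 0
    for p in range(1, min(n, m) + 1):
        for rows in combinations(range(n), p):
            for cols in combinations(range(m), p):
                # Pick out the p x p submatrix on these rows and columns.
                if abs(np.linalg.det(A[np.ix_(rows, cols)])) > tol:
                    best = p
    return best

A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0]])   # rows proportional: all 2x2 subdeterminants vanish
print(max_nonzero_minor(A))      # 1, so dim im A = 1 and dim ker A = 3 - 1 = 2
```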
Intuitively, the Kernel-Image Theorem says the amount of information lost plus
the amount of information retained equals the amount of information you started
with. However, to really understand the Kernel-Image Theorem, we have to understand “dimension” inside Rn for any n.
Exercise 19.1. Determine the kernel and image, and the dimensions of these, for
the following matrices. (A line is described by giving a nonzero vector on the line,
and a plane can be described by giving two nonproportional vectors in the plane.)

    (a) A = [ 1 2 3 ]      (b) A = [ 1 1 ]      (c) A = [ 1 1 ]
            [ 1 2 0 ]              [ 2 2 ]              [ 2 2 ]
                                                        [ 3 3 ]

    (d) A = [ 1 2 ]        (e) A = [ 1 2 ]      (f) A = [ 1 1 1 ]
            [ 2 2 ]                [ 2 3 ]              [ 0 1 1 ]
                                                        [ 0 0 1 ]

    (g) A = [ 1 2 3 ]      (h) A = [ 1 0 0 0 ]
            [ 4 5 6 ]              [ 0 1 0 0 ]
            [ 7 8 9 ]              [ 0 0 1 0 ] .
Exercise 19.2. Let

    A = [ a b ]
        [ c d ] .

(a) Suppose A is the zero matrix. What are ker A and im A?
(b) Suppose A ≠ 0, but det A = 0. What are ker A and im A?
(c) Suppose det A ≠ 0. What are ker A and im A?
Exercise 19.3. Let u = (u1 , u2 ) and v = (v1 , v2 ) be vectors in R2 , and let
    A = [ 1 0 u1 v1 ]
        [ 0 1 u2 v2 ] .
(This is the sort of matrix you used to map the hypercube in R4 into R2 . )
(a) Describe the kernel of A in terms of the vectors u and v.
(b) What is the image of A?
Exercise 19.4. A 2 × 3 matrix

    A = [ a11  a12  a13 ]
        [ a21  a22  a23 ]

has three “subdeterminants”

    det [ a11 a12 ],   det [ a11 a13 ],   det [ a12 a13 ].
        [ a21 a22 ]        [ a21 a23 ]        [ a22 a23 ]

Assume A is nonzero. Explain how these subdeterminants determine the dimensions
of the kernel and image of A. (Study the 2 × 3 analysis given above.)
Exercise 19.5. Explain how to use subdeterminants to determine the dimensions
of the image and kernel of a 2 × m matrix
    A = [ a11  a12  ...  a1m ]
        [ a21  a22  ...  a2m ] .
Exercise 19.6. Make the tables of dimensions of kernels and images of 3 × 4 and
4 × 3 matrices, and find a matrix for each row of each table.