Linear Algebra Notes
Chapter 19: Kernel and Image of a Matrix

Take an n × m matrix

        [ a11  a12  ···  a1m ]
    A = [ a21  a22  ···  a2m ]
        [  :    :          :  ]
        [ an1  an2  ···  anm ]

and think of it as a function A : Rm → Rn.

The kernel of A is defined as

    ker A = the set of all x in Rm such that Ax = 0.

Note that ker A lives in Rm. The image of A is

    im A = the set of all vectors in Rn which equal Ax for some x ∈ Rm.

Note that im A lives in Rn. Many calculations in linear algebra boil down to the computation of kernels and images of matrices.

Here are some different ways of thinking about ker A and im A. In terms of equations, ker A is the set of solution vectors x = (x1, ..., xm) in Rm of the n equations

    a11 x1 + a12 x2 + ··· + a1m xm = 0
    a21 x1 + a22 x2 + ··· + a2m xm = 0
        ...                                        (19a)
    an1 x1 + an2 x2 + ··· + anm xm = 0,

and im A consists of those vectors y = (y1, ..., yn) in Rn for which the system

    a11 x1 + a12 x2 + ··· + a1m xm = y1
    a21 x1 + a22 x2 + ··· + a2m xm = y2
        ...                                        (19b)
    an1 x1 + an2 x2 + ··· + anm xm = yn

has a solution x = (x1, ..., xm).

A single equation ai1 x1 + ai2 x2 + ··· + aim xm = yi is called a hyperplane in Rm. (So a line is a hyperplane in R2, and a plane is a hyperplane in R3.) Geometrically, ker A is the intersection of the hyperplanes (19a), and im A is the set of vectors (y1, ..., yn) ∈ Rn for which the hyperplanes (19b) intersect in at least one point.

If x and x′ are two solutions of (19b) for the same y, then

    A(x − x′) = Ax − Ax′ = y − y = 0,

so x − x′ belongs to the kernel of A. Thus, if x is one solution, then all other solutions are obtained from x by adding a vector from ker A. If ker A = 0 (i.e., the kernel consists of just the zero vector), then there can be at most one solution. In general, the bigger the kernel, the more solutions there are to a given equation that has at least one solution.
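As a concrete sketch of these definitions (our own illustration, not part of the notes), the following NumPy snippet computes a basis of ker A from the singular value decomposition and checks that two solutions of Ax = y differ by a kernel vector. The matrix and the tolerance are our own choices.

```python
import numpy as np

# An example matrix A : R^3 -> R^2 (our own choice, for illustration).
A = np.array([[1.0, 2.0, 3.0],
              [3.0, 2.0, 1.0]])

# A basis for ker A = {x : Ax = 0} from the SVD: the right-singular vectors
# whose singular values are (numerically) zero span the kernel.
_, s, Vt = np.linalg.svd(A)
tol = 1e-10
kernel_basis = Vt[np.sum(s > tol):]          # rows spanning ker A

# Every kernel vector is sent to 0:
assert np.allclose(A @ kernel_basis.T, 0)

# Two solutions of Ax = y differ by a kernel vector:
y = np.array([6.0, 6.0])                     # y = A(1,1,1), so y is in im A
x1 = np.linalg.lstsq(A, y, rcond=None)[0]    # one particular solution
x2 = x1 + 2.5 * kernel_basis[0]              # shift by a kernel vector
assert np.allclose(A @ x2, y)                # still solves Ax = y
```

Here ker A is one-dimensional, so `kernel_basis` has a single row; shifting any particular solution by multiples of it sweeps out all solutions.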
However, we will see that there is a certain conservation principle at work here, which implies that the bigger the kernel, the smaller the image, so the less likely it is that there will be even one solution.

Intuitively, you can think of vectors in Rm as representing information. Then ker A is the information lost by A, while im A is the information retained by A.

We can also describe im A as the span of (= the set of linear combinations of) the columns of A. That is, if u1, ..., um are the columns of A, then im A consists of the vectors

    Ax = x1 u1 + x2 u2 + ··· + xm um ∈ Rn,

for all possible choices of scalars x1, ..., xm.

Example 1: (2 × 3 case)

    A = [ a11  a12  a13 ]
        [ a21  a22  a23 ]

Suppose first that A is the zero matrix (all aij = 0). Then ker A = R3 and im A consists only of the zero vector.

Suppose now that A is not the zero matrix. Then ker A is the intersection of two planes through (0, 0, 0):

    a11 x1 + a12 x2 + a13 x3 = 0
    a21 x1 + a22 x2 + a23 x3 = 0.

Each plane corresponds to a row vector of A; the row vector is the normal vector to the plane. If the row vectors of A are not proportional, then the planes are distinct. In this case the planes intersect in a line, and ker A is this line. If the rows of A are proportional, then the two equations determine just one plane, and ker A is this plane. For example,

    ker [ 1  2  3 ]  is the line R(1, −2, 1),
        [ 3  2  1 ]

while

    ker [ 1  2  3 ]  is the plane x + 2y + 3z = 0.
        [ 2  4  6 ]

What about im A? Since im A ⊆ R2 and is nonzero (because we are assuming A ≠ 0), the image of A is either a line or all of R2. How to tell? Recall that im A is spanned by the three column vectors

    u1 = [ a11 ],   u2 = [ a12 ],   u3 = [ a13 ].
         [ a21 ]         [ a22 ]         [ a23 ]

The image of A will be a line ℓ exactly when these three column vectors all lie on the same line ℓ. If, say, u1 ≠ 0 and the image is a line, then there are scalars s, t such that u2 = s·u1 and u3 = t·u1. This would mean that

    A = [ a11  s·a11  t·a11 ].
        [ a21  s·a21  t·a21 ]

But look: this means the rows are proportional.
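The two worked 2 × 3 matrices from this example can be checked numerically. A short NumPy sketch (the matrices are the ones from the example; the rest of the scaffolding is ours):

```python
import numpy as np

# The two 2 × 3 matrices worked out in the example above.
A1 = np.array([[1, 2, 3],
               [3, 2, 1]])   # rows not proportional
A2 = np.array([[1, 2, 3],
               [2, 4, 6]])   # second row = 2 × (first row)

# ker A1 is the line R(1, −2, 1): the spanning vector is sent to 0.
v = np.array([1, -2, 1])
assert np.array_equal(A1 @ v, np.array([0, 0]))

# dim im A = rank(A), and dim ker A = 3 − rank(A):
rank1 = np.linalg.matrix_rank(A1)   # 2: image is all of R^2, kernel a line
rank2 = np.linalg.matrix_rank(A2)   # 1: image is a line, kernel a plane

# im A is the span of the columns: Ax = x1·u1 + x2·u2 + x3·u3.
x = np.array([2, -1, 3])
assert np.array_equal(A1 @ x, 2*A1[:, 0] - A1[:, 1] + 3*A1[:, 2])
```

The rank computation confirms the geometric picture: proportional rows force the image down to a line and the kernel up to a plane.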
They are both proportional to (1, s, t). By what we saw before, this means the kernel is a plane. In summary:

    If im A = line,  then ker A = plane.    (19c)
    If im A = plane, then ker A = line.     (19d)

Recall also the case A = 0:

    If im A = 0, then ker A = R3.           (19e)

We can summarize (19c–e) in a table of dimensions:

    2 × 3 matrix
    dim ker A   dim im A   CONDITION
    1           2          expected; rows not proportional
    2           1          rows proportional, A ≠ 0
    3           0          A = 0 only

Note that for any 2 × 3 matrix A, we have dim(ker A) + dim(im A) = 3. As you vary A, the quantity dim(ker A) can vary from 1 to 3, and the quantity dim(im A) can vary from 0 to 2, but the sum dim(ker A) + dim(im A) remains constant at 3.

For 3 × 2 matrices

    A = [ a11  a12 ]
        [ a21  a22 ]  :  R2 → R3,
        [ a31  a32 ]

the table is

    3 × 2 matrix
    dim ker A   dim im A   CONDITION
    0           2          expected; columns not proportional
    1           1          columns proportional, A ≠ 0
    2           0          A = 0 only

This time dim(ker A) + dim(im A) = 2. In both cases the sum of the two dimensions equals m, the dimension of the source space Rm. You can think of this as "conservation of information". It is a general fact:

Kernel-Image Theorem. Let A be an n × m matrix. Then

    dim(ker A) + dim(im A) = m.

The corresponding table depends on whether n or m is bigger.

    n × m, n ≤ m
    dim ker A   dim im A   CONDITION
    m − n       n          expected; rows linearly independent
    m − n + 1   n − 1      ...
    ...         ...        ...
    m           0          A = 0 only

    n × m, n ≥ m
    dim ker A   dim im A   CONDITION
    0           m          expected; columns linearly independent
    1           m − 1      ...
    ...         ...        ...
    m           0          A = 0 only

A set of vectors is linearly independent if none of them is a linear combination of the others. Note that the "expected" situation has the minimal kernel. As you go down the rows in the tables, there are more and more conditions to be satisfied, so each row is less likely than the one above it, until finally only A = 0 satisfies all the conditions of the last row.

The conditions can be expressed as certain determinants being zero, as follows. Let µ be the smaller of n and m. Each table has µ + 1 rows. Number the rows 0, 1, ..., µ, starting at the top row.
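The Kernel-Image Theorem is easy to test numerically. A minimal NumPy sketch (our own illustration), computing the rank from the singular values so that dim im A = rank(A) and dim ker A = m − rank(A):

```python
import numpy as np

rng = np.random.default_rng(0)

# dim im A = rank(A); dim ker A = m − rank(A). The rank is the number of
# (numerically) nonzero singular values; the helper name is our own.
def kernel_and_image_dims(A, tol=1e-10):
    s = np.linalg.svd(A, compute_uv=False)
    rank = int(np.sum(s > tol))
    m = A.shape[1]
    return m - rank, rank          # (dim ker A, dim im A)

# The theorem holds for every n × m matrix, including degenerate ones.
for n, m in [(2, 3), (3, 2), (4, 4), (1, 5)]:
    A = rng.integers(-3, 4, size=(n, m)).astype(float)
    dk, di = kernel_and_image_dims(A)
    assert dk + di == m            # dim(ker A) + dim(im A) = m
```

Each random matrix lands in some row of the appropriate table, but the sum of the two dimensions is always m.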
Then a matrix satisfies the conditions for row 0 (the expected case) if some µ × µ subdeterminant of A is nonzero. The conditions for a lower row p are that some (µ − p) × (µ − p) subdeterminant of A is nonzero, but all (µ − p + 1) × (µ − p + 1) subdeterminants are zero. This is illustrated in the exercises.

Intuitively, the Kernel-Image Theorem says that the amount of information lost plus the amount of information retained equals the amount of information you started with. However, to really understand the Kernel-Image Theorem, we have to understand "dimension" inside Rn for any n.

Exercise 19.1. Determine the kernel and image, and their dimensions, for the following matrices. (A line is described by giving a nonzero vector on the line, and a plane can be described by giving two nonproportional vectors in the plane.)

    (a) A = [ 1  1  1 ]     (b) A = [ 2  3 ]     (c) A = [ 1  1 ]
            [ 2  2  1 ]             [ 2  0 ]             [ 2  2 ]
                                                         [ 3  3 ]

    (d) A = [ 1  2 ]     (e) A = [ 1  2 ]     (f) A = [ 1  1  1 ]
            [ 2  2 ]             [ 2  3 ]             [ 0  1  1 ]
                                                      [ 0  0  1 ]

    (g) A = [ 1  2  3 ]     (h) A = [ 1  0  0  0 ]
            [ 4  5  6 ]             [ 0  1  0  0 ]
            [ 7  8  9 ]             [ 0  0  1  0 ]

Exercise 19.2. Let

    A = [ a  b ].
        [ c  d ]

(a) Suppose A is the zero matrix. What are ker A and im A?
(b) Suppose A ≠ 0, but det A = 0. What are ker A and im A?
(c) Suppose det A ≠ 0. What are ker A and im A?

Exercise 19.3. Let u = (u1, u2) and v = (v1, v2) be vectors in R2, and let

    A = [ 1  0  u1  v1 ]
        [ 0  1  u2  v2 ]

(This is the sort of matrix you used to map the hypercube in R4 into R2.)
(a) Describe the kernel of A in terms of the vectors u and v.
(b) What is the image of A?

Exercise 19.4. A 2 × 3 matrix

    A = [ a11  a12  a13 ]
        [ a21  a22  a23 ]

has three "subdeterminants"

    det [ a11  a12 ],   det [ a11  a13 ],   det [ a12  a13 ].
        [ a21  a22 ]        [ a21  a23 ]        [ a22  a23 ]

Assume A is nonzero. Explain how these subdeterminants determine the dimensions of the kernel and image of A. (Study the 2 × 3 analysis given above.)

Exercise 19.5. Explain how to use subdeterminants to determine the dimensions of the image and kernel of a 2 × m matrix

    A = [ a11  a12  ...  a1m ].
        [ a21  a22  ...  a2m ]

Exercise 19.6.
Make the tables of dimensions of kernels and images of 3 × 4 and 4 × 3 matrices, and find a matrix for each row of each table.
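The subdeterminant criterion stated before the exercises (the rank of A is the largest p for which some p × p subdeterminant is nonzero) can be checked by brute force on small matrices. This NumPy sketch is our own illustration; the helper name is ours:

```python
import numpy as np
from itertools import combinations

# Brute-force rank via subdeterminants: the rank of A is the largest p such
# that some p × p subdeterminant (minor) of A is nonzero. Fine for the
# small matrices appearing in the exercises.
def rank_by_minors(A, tol=1e-10):
    n, m = A.shape
    for p in range(min(n, m), 0, -1):
        for rows in combinations(range(n), p):
            for cols in combinations(range(m), p):
                if abs(np.linalg.det(A[np.ix_(rows, cols)])) > tol:
                    return p    # nonzero p × p subdeterminant found
    return 0                    # every entry is zero, i.e. A = 0

# Example: the 3 × 3 matrix with rows (1,2,3), (4,5,6), (7,8,9) has
# det A = 0 but a nonzero 2 × 2 minor, so rank 2: dim im A = 2 and
# dim ker A = 3 − 2 = 1 by the Kernel-Image Theorem.
G = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0],
              [7.0, 8.0, 9.0]])
assert rank_by_minors(G) == np.linalg.matrix_rank(G) == 2
```

Agreement with `numpy.linalg.matrix_rank` (which works from singular values) confirms that the determinant conditions pick out exactly the rows of the dimension tables.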