
Introduction to Linear Algebra by Gilbert Strang

INTRODUCTION
TO
LINEAR ALGEBRA
Second Edition
GILBERT STRANG
Massachusetts Institute of Technology
WELLESLEY-CAMBRIDGE PRESS
Box 812060 Wellesley MA 02482
Introduction to Linear Algebra, 2nd Edition
Copyright ©1998 by Gilbert Strang
All rights reserved. No part of this work may be reproduced or stored or transmitted by any
means, including photocopying, without written permission from Wellesley-Cambridge Press.
Translation in any language is strictly prohibited — authorized translations are arranged.
Printed in the United States of America
ISBN 0-9614088-5-5
QA184.S78   1998   512'.5   93-14092
Other texts from Wellesley-Cambridge Press
Wavelets and Filter Banks, Gilbert Strang and Truong Nguyen,
ISBN 0-9614088-7-1.
Linear Algebra, Geodesy, and GPS, Gilbert Strang and Kai Borre,
ISBN 0-9614088-6-3.
Introduction to Applied Mathematics, Gilbert Strang,
ISBN 0-9614088-0-4.
An Analysis of the Finite Element Method, Gilbert Strang and George Fix,
ISBN 0-9614088-8-X.
Calculus, Gilbert Strang,
ISBN 0-9614088-2-0.
Wellesley-Cambridge Press
Box 812060
Wellesley MA 02482 USA
gs@math.mit.edu
http://www-math.mit.edu/~gs/
(781) 431-8488    fax (617) 253-4358
MATLAB® is a registered trademark of The MathWorks, Inc.
LaTeX production by Amy Hendrickson, TeXnology Inc., http://www.texnology.com
A Solutions Manual is available to instructors by email from the publisher. A syllabus and
Teaching Codes and exams for this course are available on the linear algebra web site:
http://web.mit.edu/18.06/www
TABLE OF CONTENTS

1  Introduction to Vectors
   1.1  Vectors and Linear Combinations
   1.2  Lengths and Dot Products

2  Solving Linear Equations
   2.1  Vectors and Linear Equations
   2.2  The Idea of Elimination
   2.3  Elimination Using Matrices
   2.4  Rules for Matrix Operations
   2.5  Inverse Matrices
   2.6  Elimination = Factorization: A = LU
   2.7  Transposes and Permutations

3  Vector Spaces and Subspaces
   3.1  Spaces of Vectors
   3.2  The Nullspace of A: Solving Ax = 0
   3.3  The Rank and the Row Reduced Form
   3.4  The Complete Solution to Ax = b
   3.5  Independence, Basis and Dimension
   3.6  Dimensions of the Four Subspaces

4  Orthogonality
   4.1  Orthogonality of the Four Subspaces
   4.2  Projections
   4.3  Least Squares Approximations
   4.4  Orthogonal Bases and Gram-Schmidt

5  Determinants
   5.1  The Properties of Determinants
   5.2  Permutations and Cofactors
   5.3  Cramer's Rule, Inverses, and Volumes

6  Eigenvalues and Eigenvectors
   6.1  Introduction to Eigenvalues
   6.2  Diagonalizing a Matrix
   6.3  Applications to Differential Equations
   6.4  Symmetric Matrices
   6.5  Positive Definite Matrices
   6.6  Similar Matrices
   6.7  Singular Value Decomposition (SVD)

7  Linear Transformations
   7.1  The Idea of a Linear Transformation
   7.2  The Matrix of a Linear Transformation
   7.3  Change of Basis
   7.4  Diagonalization and the Pseudoinverse

8  Applications
   8.1  Graphs and Networks
   8.2  Markov Matrices and Economic Models
   8.3  Linear Programming
   8.4  Fourier Series: Linear Algebra for Functions
   8.5  Computer Graphics

9  Numerical Linear Algebra
   9.1  Gaussian Elimination in Practice
   9.2  Norms and Condition Numbers
   9.3  Iterative Methods for Linear Algebra

10 Complex Vectors and Matrices
   10.1  Complex Numbers
   10.2  Hermitian and Unitary Matrices
   10.3  The Fast Fourier Transform

Solutions to Selected Exercises
MIT Course 18.06 Final Exam
Matrix Factorizations
Index
Teaching Codes
PREFACE
This is a basic textbook for linear algebra, combining the theory with the applications. The
central equations are the linear system Ax = b and the eigenvalue problem Ax = λx.
It is simply amazing how much there is to say (and learn) about those two equations.
This book comes from years of teaching and organizing and thinking about the linear
algebra course—and still this subject feels new and very alive.
I am really happy that the need for linear algebra is widely recognized. It is absolutely as important as calculus. I don't concede anything when I look at how
mathematics is actually used. So many applications today are discrete rather than
continuous, digital rather than analog, linearizable rather than erratic and chaotic. Then
vectors and matrices are the language to know.
The equation Ax = b uses that language right away. The left side has a matrix A
times an unknown vector x. Their product Ax is a combination of the columns of A.
This is the best way to multiply, and the equation is asking for the combination that
produces b. Our solution can come at three levels and they are all important:

1. Direct solution by forward elimination and back substitution.

2. Matrix solution by x = A⁻¹b using the inverse matrix A⁻¹.

3. Vector space solution by finding all combinations of the columns of A and all solutions to Ax = 0. We are looking at the column space and the nullspace.
And there is another possibility: Ax = b may have no solution. The direct approach
by elimination may lead to 0 = 1. The matrix approach may fail to find A⁻¹. The
vector space approach can look at all combinations of the columns, but b might not
lie in that column space. Part of the mathematics is understanding when an equation
is solvable and when it’s not.
Another part is learning to visualize vectors. A vector v with two components is not hard. The components v1 and v2 tell how far to go across and up—we can draw an arrow. A second vector w may be perpendicular to v (and Chapter 1 tells exactly when).
If those vectors have six components, we can’t draw them but our imagination keeps
trying. We can think of a right angle in six-dimensional space. We can see 2v (twice as far)
and —w (in the opposite direction to w). We can almost see a combination like 2v — w.
Most important is the effort to imagine all the combinations of cv with dw. They fill
some kind of “two-dimensional plane” inside the six-dimensional space. As I write these
words, I am not at all sure that I see this subspace. But linear algebra offers a simple way
to work with vectors and matrices of any size. If we have six currents in a network or
six forces on a structure or six prices for our products, we are certainly in six dimensions.
In linear algebra, a six-dimensional space is pretty small.
Already in this preface, you can see the style of the book and its goal. The style is informal but the goal is absolutely serious. Linear algebra is great mathematics, and I try
to explain it as clearly as I can. I certainly hope that the professor in this course learns
new things. The author always does. The student will notice that the applications reinforce
the ideas. That is the whole point for all of us—to learn how to think. I hope you will see
how this book moves forward, gradually but steadily.
Mathematics is continually asking you to look beyond the most specific case, to see
the broader pattern. Whether we have pixel intensities on a TV screen or forces on an
airplane or flight schedules for the pilots, those are all vectors and they all get multiplied
by matrices. Linear algebra is worth doing well.
Structure of the Textbook
I want to note five points about the organization of the book:
1. Chapter 1 provides a brief introduction to the basic ideas of vectors and matrices and dot products. If the class has met them before, there is no problem to begin with Chapter 2. That chapter solves n by n systems Ax = b.

2. For rectangular matrices, I now use the reduced row echelon form more than before. In MATLAB this is R = rref(A). Reducing A to R produces bases for the row space and column space. Better than that, reducing the combined matrix [A I] produces total information about all four of the fundamental subspaces. (A small MATLAB sketch of this command follows the list.)

3. Those four subspaces are an excellent way to learn about linear independence and dimension and bases. The examples are so natural, and they are genuinely the key to applications. I hate just making up vector spaces when so many important ones are needed. If the class sees plenty of examples of independence and dependence, then the definition is virtually understood in advance. The columns of A are independent when x = 0 is the only solution to Ax = 0.

4. Section 6.1 introduces eigenvalues for 2 by 2 matrices. Many courses want to meet eigenvalues early (to apply them in another subject or to avoid missing them completely). It is absolutely possible to go directly from Chapter 3 to Section 6.1. The determinant is easy for a 2 by 2 matrix, and eigenvalues come through clearly.

5. Every section in Chapters 1 to 7 ends with a highlighted Review of the Key Ideas. The reader can recapture the main points by going carefully through this review.
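The rref command mentioned in point 2 can be tried directly in MATLAB; the matrix below is my own made-up example (it is not from the book), so treat this as a minimal sketch of the idea:

    % A made-up 3 by 4 rectangular matrix, only to illustrate rref
    A = [1 2 2 4; 1 2 3 5; 2 4 5 9];
    R = rref(A)            % reduced row echelon form: the pivots reveal the rank
    S = rref([A eye(3)])   % reducing the combined matrix [A I] keeps a record
                           % of the row operations along with R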
A one-semester course that moves steadily can reach eigenvalues. The key idea is to
diagonalize a matrix. For most square matrices that is S⁻¹AS, using the eigenvector matrix S. For symmetric matrices it is QᵀAQ. When A is rectangular we need U⁻¹AV.
I do my best to explain that Singular Value Decomposition because it has become
extremely useful. I feel very good about this course and the student response to it.
Structure of the Course
Chapters 1-6 are the heart of a basic course in linear algebra—theory plus applications.
The beauty of this subject is in the way those two parts come together. The theory is
needed, and the applications are everywhere.
I now use the web page to post the syllabus and homeworks and exam solutions:
http://web.mit.edu/18.06/www
I hope you will find that page helpful. It is coming close to 30,000 visitors. Please use it freely and suggest how it can be extended and improved.
Chapter 7 connects linear transformations with matrices. The matrix depends on the
choice of basis! We show how vectors and matrices change when the basis changes.
And we show the linear transformation behind the matrix. I don’t start the course with
that deeper idea; it is better to understand subspaces first.
Chapter 8 gives important applications—often I choose Markov matrices for a lecture
without an exam. Chapter 9 comes back to numerical linear algebra, to explain how
Ax = b and Ax = λx are actually solved. Chapter 10 moves from real to complex
numbers, as entries in the vectors and matrices. The complete book is appropriate for
a two-semester course—it starts gradually and keeps going forward.
Computing in Linear Algebra
The text gives first place to MATLAB, a beautiful system that was developed specifically for linear algebra. This is the primary language of our Teaching Codes, written
by Cleve Moler for the first edition and extended by Steven Lee for this edition. The
Teaching Codes are on the web page, with MATLAB homeworks and references and
a short primer. The best way to get started is to solve problems!
We also provide a similar library of Teaching Codes for Maple and Mathematica.
The codes are listed at the end of the book, and they execute the same steps that we
teach. Then the reader can see matrix theory both ways—the algebra and the algorithms.
Those work together perfectly. This textbook supports a course that includes computing
and also a course that doesn’t.
There is so much good mathematics to learn and to do.
Acknowledgements
This book surely comes with a lot of help. Suggestions arrived by email from an
army of readers; may I thank you now! Steven Lee visited M.I.T. three times from
Oak Ridge National Laboratory, to teach the linear algebra course 18.06 from this
textbook. He created the web page http://web.mit.edu/18.06/www and he added new
MATLAB Teaching Codes to those written by Cleve Moler for the first edition. (All
the Teaching Codes are listed at the end of this book.) I find that these short programs
show the essential steps of linear algebra in a very clear way. Please look at the web
page for help of every kind.
For the creation of the book I express my deepest thanks to five friends. The 1993 first
edition was converted to LaTeX 2e by Kai Borre and Frank Jensen in Denmark. Then came the
outstanding work by Sueli Rocha on the new edition. First at M.I.T. and then in Hong Kong,
Sueli has shared all the excitement and near-heartbreak and eventual triumph that is part
of publishing. Vasily Strela succeeded in making the figures print (by somehow reading
the postscript file). And in the final crucial step, Amy Hendrickson has done everything
to complete the design. She is a professional as well as a friend. I hope you will like
the Review of the Key Ideas (at the end of every section) and the clear boxes inside
definitions and theorems. The reviews and the boxes highlight the main points, and
then my class remembers them.
There is another special part of this book: The front cover. A month ago I had a mysterious email message from Ed Curtis at the University of Washington. He insisted that I buy Great American Quilts: Book 5, without saying why. Perhaps it
is not necessary to admit that I have made very few quilts. On page 131 of that book
I found an amazing quilt created by Chris Curtis. She had seen the first edition of this
textbook (its cover had slanted houses). They show what linear transformations can
do, in Section 7.1. She liked the houses and she made them beautiful. Possibly they
became nonlinear, but that’s art.
I appreciate that Oxmoor House allowed the quilt to appear on this book. The color was
the unanimous choice of two people. And I happily thank Tracy Baldwin for designing
her third neat cover for Wellesley-Cambridge Press.
May I now dedicate this book to my grandchildren. It is a pleasure to name the ones
I know so far: Roger, Sophie, Kathryn, Alexander, Scott, Jack, William, and Caroline.
I hope that all of you will take this linear algebra course one day. Please pass it, whatever
you do. The author is proud of you.
INTRODUCTION TO VECTORS
The heart of linear algebra is in two operations—both with vectors. We add vectors
to get v + w. We multiply by numbers c and d to get cv and dw. Combining those
operations gives the linear combination cv + dw.
Chapter 1 explains these central ideas, on which everything builds. We start with
two-dimensional vectors and three-dimensional vectors, which are reasonable to draw.
Then we move into higher dimensions. The really impressive feature of linear algebra
is how smoothly it takes that step into n-dimensional space. Your mental picture stays
completely correct, even if drawing a ten-dimensional vector is impossible.
This is where the book is going (into n-dimensional space), and the first steps
are the two operations in Sections 1.1 and 1.2:
1.1  Vector addition v + w and linear combinations cv + dw.
1.2  The dot product v · w and the length ||v|| = √(v · v).
1.1  VECTORS AND LINEAR COMBINATIONS
“You can’t add apples and oranges.” That sentence might not be news, but it still contains some truth. In a strange way, it is the reason for vectors! If we keep the number
of apples separate from the number of oranges, we have a pair of numbers. That pair
is a two-dimensional vector v:
    v = [ v1 ]        v1 = number of apples
        [ v2 ]        v2 = number of oranges

We wrote v as a column vector. The numbers v1 and v2 are its "components." The main point so far is to have a single letter v (in boldface) for this pair of numbers v1 and v2 (in lightface).
Even if we don't add v1 to v2, we do add vectors. The first components of v and w stay separate from the second components:

    v = [ v1 ]   and   w = [ w1 ]   add to   v + w = [ v1 + w1 ]
        [ v2 ]             [ w2 ]                    [ v2 + w2 ]
You see the reason. The total number of apples is v1 + w1. The number of oranges is v2 + w2. Vector addition is basic and important. Subtraction of vectors follows the same idea: The components of v − w are v1 − w1 and v2 − w2.
Vectors can be multiplied by 2 or by —1 or by any number c. There are two
ways to double a vector. One way is to add v + v. The other way (the usual way) is
to multiply each component by 2:
    2v = [ 2v1 ]    and    −v = [ −v1 ]
         [ 2v2 ]                [ −v2 ]

The components of cv are cv1 and cv2.
The number c is called a “scalar.”
Notice that the sum of —v and v is the zero vector. This is 0, which is not
the same as the number zero! The vector 0 has components 0 and 0. Forgive me for
hammering away at the difference between a vector and its components. Linear algebra
is built on these operations v + w and cv—adding vectors and multiplying by scalars.
There is another way to see a vector, that shows all its components at once. The
vector v can be represented by an arrow. When v has two components, the arrow is
in two-dimensional space (a plane). If the components are v1 and v2, the arrow goes v1 units to the right and v2 units up. This vector is drawn twice in Figure 1.1. First, it starts at the origin (where the axes meet). This is the usual picture. Unless there is a special reason, our vectors will begin at (0,0). But the second arrow shows the starting point shifted over to A. The arrows OP and AB represent the same vector.
One reason for allowing any starting point is to visualize the sum v + w:
Vector addition (head to tail)
At the end of v, place the start of w.
We travel along v and then along w. Or we take the shortcut along v + w. We could
also go along w and then v. In other words, w+ v gives the same answer as v + w.
These are different ways along the parallelogram (in this example it is a rectangle).
The endpoint in Figure 1.2 is the diagonal v + w which is also w + v.
Figure 1.1  The arrow usually starts at the origin (0,0); cv is always parallel to v.
Figure 1.2  Vector addition v + w produces the diagonal of a parallelogram. Add the first components and second components separately.

Check that by algebra: The first component is v1 + w1 which equals w1 + v1. The order of addition makes no difference:

    v + w = [ v1 + w1 ] = [ w1 + v1 ] = w + v
            [ v2 + w2 ]   [ w2 + v2 ]
The zero vector has v1 = 0 and v2 = 0. It is too short to draw a decent arrow, but you know that v + 0 = v. For 2v we double the length of the arrow. We reverse its direction for −v. This reversing gives a geometric way to subtract vectors.

Vector subtraction    To draw v − w, go forward along v and then backward along w (Figure 1.3). The components are v1 − w1 and v2 − w2.
We will soon meet a "dot product" of vectors. It is not the vector whose components are v1w1 and v2w2.
Linear Combinations
We have added vectors, subtracted vectors, and multiplied by scalars. The answers
v + w, v — w, and cv are computed a component at a time. By combining these
operations, we now form “linear combinations” of v and w. Apples still stay separate
from oranges—the linear combination in Figure 1.3 is a new vector cv + dw.
DEFINITION
The sum of cv and dw is a linear combination of v and w.
    Combination of v and w    cv + dw = c [ v1 ] + d [ w1 ] = [ cv1 + dw1 ]
                                          [ v2 ]     [ w2 ]   [ cv2 + dw2 ]
Figure 1.3  Vector subtraction v − w (left). The linear combination 3v + 2w (right).
This is the fundamental construction of linear algebra: multiply and add. The
sum v + w is a special combination, when c = d = 1. The multiple 2v is the particular
case with c = 2 and d = 0. Soon you will be looking at all linear combinations of v
and w—a whole family of vectors at once. It is this big view, going from two vectors
to a “plane of vectors,” that makes the subject work.
In the forward direction, a combination of v and w is supremely easy. We are
given the multipliers c = 3 and d = 2, so we multiply. Then add 3v+2w. The serious
problem is the opposite question, when c and d are “unknowns.” In that case we are
only given the answer: cv + dw has components 8 and —1. We look for the right
multipliers c and d. The two components give two equations in these two unknowns.
When 100 unknowns multiply 100 vectors each with 100 components, the best
way to find those unknowns is explained in Chapter 2.
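For readers who want to try that inverse question numerically, here is a minimal MATLAB sketch. The vectors v and w below are made-up values chosen only for illustration (the text's own v and w appear in Figure 1.3); with these values the multipliers come out as c = 3 and d = 2:

    % Hypothetical v and w; b is the given combination with components 8 and -1
    v = [2; -1];
    w = [1; 1];
    b = [8; -1];
    cd = [v w] \ b     % solve the 2 by 2 system for the multipliers c and d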
Vectors in Three Dimensions
Each vector v with two components corresponds to a point in the xy plane. The components of v are the coordinates of the point: x = v1 and y = v2. The arrow ends
at this point (v1, v2), when it starts from (0,0). Now we allow vectors to have three
components.
The xy plane is replaced by three-dimensional space.
Here are typical vectors (still column vectors but with three components): the vector v = (1, 2, 2), a second vector w, and their sum v + w, found a component at a time.
The vector v corresponds to an arrow in 3-space. Usually the arrow starts at the origin,
where the xyz axes meet and the coordinates are (0, 0,0). The arrow ends at the point
with coordinates x = 1, y = 2, z = 2.

Figure 1.4  Vectors with two and three components correspond to points (x, y) and (x, y, z).

There is a perfect match between the column
vector and the arrow from the origin and the point where the arrow ends. Those are
three ways to describe the same vector:
From now on    v = [ 1 ]    is also written as    v = (1, 2, 2).
                   [ 2 ]
                   [ 2 ]
The reason for the column form (in brackets) is to fit next to a matrix. The reason for the row form (in parentheses) is to save space. This becomes essential for long vectors. To print (1, 2, 2, 4, 4, 6) in a column would waste the environment.
Important note    v = (1, 2, 2) is not a row vector. The row vector [1 2 2] is absolutely different, even though it has the same three components. It is the "transpose" of v. A column vector can be printed horizontally (with commas and parentheses). Thus (1, 2, 2) is in actuality a column vector. It is just temporarily lying down.
In three dimensions, vector addition is still done a component at a time. The result v + w has components v1 + w1 and v2 + w2 and v3 + w3—maybe apples, oranges,
and pears. You see already how to add vectors in 4 or 5 or n dimensions. This is now
the end of linear algebra for groceries!
The addition v + w is represented by arrows in space. When w starts at the end of v, the third side is v + w. When w follows v, we get the other sides of a parallelogram. Question: Do the four sides all lie in the same plane? Yes. And the sum v + w − v − w goes around the parallelogram to produce the zero vector.
A typical linear combination of three vectors in three dimensions is u + 4v − 2w:

    Linear combination    u + 4v − 2w = [ u1 + 4v1 − 2w1 ]
                                        [ u2 + 4v2 − 2w2 ]
                                        [ u3 + 4v3 − 2w3 ]
We end with this question: What surface in 3-dimensional space do you get from all
the linear combinations of u and v? The surface includes the line through u and the line through v. It includes the zero vector (which is the combination 0u + 0v). The
Ou + Ov). The
surface also includes the diagonal line through u + v—and every other combination
cu + dv (not using w). This whole surface is a plane (unless u is parallel to v).
Note on Computing
Suppose the components of v are v(1), ..., v(N) and similarly for w. In a language like FORTRAN, the sum v + w requires a loop to add components separately:

       DO 10 I = 1,N
    10 VPLUSW(I) = V(I) + W(I)
MATLAB works directly with vectors and matrices. When v and w have been
defined, v + w is immediately understood. It is printed unless the line ends in a semicolon. Input two specific vectors as rows—the prime ’ at the end changes them to
columns. Then print v + w and another linear combination:
    v = [2 3 4]';
    w = [1 1 1]';
    u = v + w
    2*v - 3*w

The sum will print with u = . The unnamed combination prints with ans = :

    u =          ans =
        3             1
        4             3
        5             5
REVIEW OF THE KEY IDEAS

1. A vector v in two-dimensional space has two components v1 and v2.

2. Vectors are added and subtracted a component at a time.

3. Multiplying by the scalar c gives cv = (cv1, cv2). A linear combination of v and w is cv + dw.

4. The linear combinations of two non-parallel vectors v and w fill a plane.
Problem Set 1.1
Problems 1-9 are about addition of vectors and linear combinations.

1  Draw the given vectors v and w, and also v + w and v − w, in a single xy plane.

2  If v + w and v − w are the given vectors, compute and draw v and w.

3  From the given v and w, find the components of 3v + w and v − 3w and cv + dw.
4  Compute u + v and u + v + w and 2u + 2v + w for the given vectors u, v, and w.

5  Every combination of v = (1, −2, 1) and w = (0, 1, −1) has components that add to ___. Find c and d so that cv + dw = (4, 2, −6).

6  In the xy plane mark all nine of these linear combinations: cv + dw with c = 0, 1, 2 and d = 0, 1, 2.

7  (a) The subtraction v − w goes forward along v and backward on w. Figure 1.3 also shows a second route to v − w. What is it?
   (b) If you look at all combinations of v and w, what "surface of vectors" do you see?

8  The parallelogram in Figure 1.2 has diagonal v + w. What is its other diagonal? What is the sum of the two diagonals? Draw that vector sum.

9  If three corners of a parallelogram are (1, 1), (4, 2), and (1, 3), what are all the possible fourth corners? Draw two of them.
Problems 10-13 involve the length of vectors.

10  The parallelogram with sides v = (4, 2) and w = (−1, 2) is a rectangle (Figure 1.2). Compute (length of v)² as v1² + v2². Check the Pythagoras formula a² + b² = c² which is for right triangles only:
    (length of v)² + (length of w)² = (length of v + w)².

11  In this 90° case, a² + b² = c² also works for v − w. In Figure 1.2, check that
    (length of v)² + (length of w)² = (length of v − w)².
    Give an example of v and w (not at right angles) for which this formula fails.

12  To emphasize that right triangles are special, construct v and w without a 90° angle. Compare (length of v)² + (length of w)² with (length of v + w)².

13  In Figure 1.2 check that (length of v) + (length of w) is larger than (length of v + w). This "triangle inequality" is true for every triangle, except the absolutely thin triangle when v and w are ___. Notice that these lengths are not squared.
Problems 14-18 are about special vectors on cubes and clocks.
14  Copy the cube and draw the vector sum of i = (1, 0, 0) and j = (0, 1, 0) and k = (0, 0, 1). The addition i + j yields the diagonal of ___.

15  Three edges of the unit cube are i, j, k. Three corners are (0, 0, 0), (1, 0, 0), (0, 1, 0). What are the other five corners and the coordinates of the center point? The center points of the six faces are ___.

16  How many corners does a cube have in 4 dimensions? How many edges? How many faces? A typical corner is (0, 0, 1, 0).

17  (a) What is the sum V of the twelve vectors that go from the center of a clock to the hours 1:00, 2:00, ..., 12:00?
    (b) If the vector to 4:00 is removed, find the sum of the eleven remaining vectors.
    (c) Suppose the 1:00 vector is cut in half. Add it to the other eleven vectors.

18  Suppose the twelve vectors start from (0, −1) at the bottom of the clock instead of (0, 0) at the center. The vector to 6:00 is zero and the vector to 12:00 is doubled to (0, 2). Add the new twelve vectors.

Problems 19-22 go further with linear combinations of v and w (see Figure).
19  The figure shows u = ½v + ½w. Mark the points ¾v + ¼w and ¼v + ¼w and v + w.

20  Mark the point −v + 2w and one other combination cv + dw with c + d = 1. Draw the line of all combinations that have c + d = 1.

21  Locate ⅓v + ⅓w and ⅔v + ⅔w. The combinations cv + cw fill out what line? Restricted by c ≥ 0, those combinations with c = d fill what ray?

22  (a) Mark ½v + w and v + ½w. Restricted by 0 ≤ c ≤ 1 and 0 ≤ d ≤ 1, shade in all combinations cv + dw.
    (b) Restricted only by 0 ≤ c and 0 ≤ d, draw the "cone" of all combinations cv + dw.

Figures for Problems 19-22 (in a plane) and Problems 23-27 (in 3-dimensional space).
Problems 23-27 deal with u, v, w in three-dimensional space (see Figure).

23  (a) Locate ⅓u + ⅓v + ⅓w and ½u + ½w in the figure.
    (b) Challenge problem: Under what restrictions on c, d, e will the collection of points cu + dv + ew fill in the plane containing u, v, w?

24  The three sides of the dotted triangle are v − u and w − v and u − w. Their sum is ___. Draw the head-to-tail addition around a plane triangle of (3, 1) plus (−1, 1) plus (−2, −2).

25  Shade in the pyramid of combinations cu + dv + ew with c ≥ 0, d ≥ 0, e ≥ 0 and c + d + e ≤ 1. Mark the vector ½(u + v + w) as inside or outside this pyramid.

26  If you look at all combinations of u, v, and w, is there any vector that can't be produced from cu + dv + ew?

27  Which vectors are in the plane of u and v, and also in the plane of v and w?

28  (a) Draw vectors u, v, w so that their combinations cu + dv + ew fill only a line.
    (b) Draw vectors u, v, w so that their combinations cu + dv + ew fill only a plane.

29  What combination of the two given vectors produces the required third vector? Express this question as two equations for the coefficients c and d in the linear combination.
1.2  LENGTHS AND DOT PRODUCTS
The first section mentioned multiplication of vectors, but it backed off. Now we go
forward to define the "dot product" of v and w. This multiplication involves the separate products v1w1 and v2w2, but it doesn't stop there. Those two numbers are added to produce the single number v · w.
DEFINITION  The dot product or inner product of v = (v1, v2) and w = (w1, w2) is the number

    v · w = v1w1 + v2w2.     (1)

Example 1  The vectors v = (4, 2) and w = (−1, 2) have a zero dot product:

    v · w = (4)(−1) + (2)(2) = 0.
In mathematics, zero is always a special number. For dot products, it means that these
two vectors are perpendicular. The angle between them is 90°. When we drew them
in Figure 1.2, we saw a rectangle (not just any parallelogram). The clearest example
of perpendicular vectors is i = (1, 0) along the x axis and j = (0, 1) up the y axis. Again the dot product is i · j = 0 + 0 = 0. Those vectors i and j form a right angle.
The vectors v = (1,2) and w = (2, 1) are not perpendicular. Their dot product
is 4. Soon that will reveal the angle between them (not 90°).
Example 2  Put a weight of 4 at the point x = −1 and a weight of 2 at the point x = 2. If the x axis is a see-saw, it will balance on the center point x = 0. The weights balance because the dot product is (4)(−1) + (2)(2) = 0.
This example is typical of engineering and science. The vector of weights is (w1, w2) = (4, 2). The vector of distances from the center is (v1, v2) = (−1, 2). The force times the distance, w1 times v1, gives the "moment" of the first weight. The equation for the see-saw to balance is w1v1 + w2v2 = 0.
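The balance can be checked with one dot product in MATLAB; this tiny script is only an illustration of the formula above:

    % Weights (4, 2) at distances (-1, 2) from the center of the see-saw
    w = [4; 2];
    v = [-1; 2];
    moment = w' * v    % (4)(-1) + (2)(2) = 0, so the see-saw balances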
The dot product w · v equals v · w.
The order of v and w makes no difference.
Example 3 Dot products enter in economics and business.
We have five products to
buy and sell. Their prices are (p1, p2, p3, p4, p5) for each unit—this is the "price vector" p. The quantities we buy or sell are (q1, q2, q3, q4, q5)—positive when we sell, negative when we buy. Selling q1 units of the first product at the price p1 brings in q1p1. The total income is the dot product q · p:

    Income = (q1, q2, ..., q5) · (p1, p2, ..., p5) = q1p1 + q2p2 + ··· + q5p5.
A zero dot product means that "the books balance." Total sales equal total purchases if q · p = 0. Then p is perpendicular to q (in five-dimensional space). With five products,
the vectors are five-dimensional. This is how linear algebra goes quickly into high
dimensions.
Small note: Spreadsheets have become essential computer software in management. What does a spreadsheet actually do? It computes linear combinations and dot
products. What you see on the screen is a matrix.
Main point  To compute the dot product, multiply each vi times wi. Then add.
Lengths and Unit Vectors
An important case is the dot product of a vector with itself. In this case v = w. When the vector is v = (1, 2, 3), the dot product with itself is

    v · v = (1, 2, 3) · (1, 2, 3) = 1 + 4 + 9 = 14.
The answer is not zero because v is not perpendicular to itself. Instead of a 90° angle
we have 0°. In this special case, the dot product v-v gives the length squared.
DEFINITION  The length (or norm) of a vector v is the square root of v · v:

    length = ||v|| = √(v · v).

In two dimensions the length is √(v1² + v2²). In three dimensions it is √(v1² + v2² + v3²). By the calculation above, the length of v = (1, 2, 3) is ||v|| = √14.
We can explain this definition. ||v|| is just the ordinary length of the arrow that represents the vector. In two dimensions, the arrow is in a plane. If the components are 1 and 2, the arrow is the third side of a right triangle (Figure 1.5). The formula a² + b² = c², which connects the three sides, is 1² + 2² = ||v||².

For the length of v = (1, 2, 3), we used the right triangle formula twice. First, the vector in the base has components 1, 2, 0 and length √5. This base vector is perpendicular to the vector (0, 0, 3) that goes straight up. So the diagonal of the box has length ||v|| = √(5 + 9) = √14.
The length of a four-dimensional vector would be √(v1² + v2² + v3² + v4²). Thus (1, 1, 1, 1) has length √(1² + 1² + 1² + 1²) = 2. This is the diagonal through a unit cube in four-dimensional space. The diagonal in n dimensions has length √n.
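In MATLAB those lengths come from the built-in norm function; here is a small check of the numbers in this paragraph:

    norm([1; 2; 3])    % sqrt(14), the length of (1, 2, 3)
    norm(ones(4,1))    % 2, the diagonal of the unit cube in four dimensions
    norm(ones(9,1))    % 3, since the diagonal in n dimensions has length sqrt(n)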
The word “unit” is always indicating that some measurement equals “one.” The
unit price is the price for one item. A unit cube has sides of length one. A unit circle
is a circle with radius one. Now we define the idea of a “unit vector.”
DEFINITION  A unit vector u is a vector whose length equals one. Then u · u = 1.

An example in four dimensions is u = (½, ½, ½, ½). Then u · u is ¼ + ¼ + ¼ + ¼ = 1. We divided the vector v = (1, 1, 1, 1) by its length ||v|| = 2 to get the unit vector.
Example 4  The standard unit vectors along the x and y axes are written i and j. In the xy plane, the unit vector that makes an angle "theta" with the x axis is (cos θ, sin θ):

    i = [ 1 ]    and    j = [ 0 ]    and    u = [ cos θ ]
        [ 0 ]               [ 1 ]                [ sin θ ]

When θ = 0, the vector u is the same as i. When θ = 90° (or π/2 radians), u is the same as j. At any angle, the components cos θ and sin θ produce u · u = 1 because cos²θ + sin²θ = 1. These vectors reach out to the unit circle in Figure 1.6. Thus cos θ and sin θ are simply the coordinates of that point at angle θ on the unit circle.
In three dimensions, the unit vectors along the axes are i, j, and k. Their components are (1, 0, 0) and (0, 1, 0) and (0, 0, 1). Notice how every three-dimensional vector is a linear combination of i, j, and k. The vector v = (2, 2, 1) is equal to 2i + 2j + k. Its length is √(2² + 2² + 1²). This is the square root of 9, so ||v|| = 3.

Since (2, 2, 1) has length 3, the vector (⅔, ⅔, ⅓) has length 1. To create a unit vector, just divide v by its length ||v||.
1A  Unit vectors  Divide any nonzero vector v by its length. Then u = v/||v|| is a unit vector in the same direction as v.
Figure 1.5  The length √(v · v) of two-dimensional and three-dimensional vectors. The base vector (1, 2, 0) has length √5; the vector (1, 2, 3) has length √14.
Figure 1.6  The coordinate vectors i and j. The unit vector u at angle 45° (left) and the unit vector (cos θ, sin θ) at angle θ.

In three dimensions we found u = (⅔, ⅔, ⅓). Check that u · u = 4⁄9 + 4⁄9 + 1⁄9 = 1. Then u reaches out to the "unit sphere" centered at the origin. Unit vectors correspond to points on the sphere of radius one.
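A short MATLAB sketch of 1A, using the vector v = (2, 2, 1) from this example:

    v = [2; 2; 1];
    u = v / norm(v)    % divide by the length 3 to get the unit vector (2/3, 2/3, 1/3)
    norm(u)            % returns 1, confirming that u is a unit vector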
The Angle Between Two Vectors
We stated that perpendicular vectors have v-w = 0. The dot product is zero when the
angle is 90°. To give a reason, we have to connect right angles to dot products. Then
we show how v- w finds the angle between any two nonzero vectors.
1B Right angles
The dot product v-w is zero when v is perpendicular to w.
Proof When v and w are perpendicular, they form two sides of a right triangle. The
third side (the hypotenuse going across in Figure 1.7) is v − w. So the law a² + b² = c² for their lengths becomes

    ||v||² + ||w||² = ||v − w||²     (for perpendicular vectors only).     (2)

Writing out the formulas for those lengths in two dimensions, this equation is

    v1² + v2² + w1² + w2² = (v1 − w1)² + (v2 − w2)².     (3)

The right side begins with v1² − 2v1w1 + w1². Then v1² and w1² are on both sides of the equation and they cancel. Similarly (v2 − w2)² contains v2² and w2² and the cross term −2v2w2. All terms cancel except −2v1w1 and −2v2w2. (In three dimensions there would also be −2v3w3.) The last step is to divide by −2:

    0 = −2v1w1 − 2v2w2    which leads to    v1w1 + v2w2 = 0.     (4)
Figure 1.7  Perpendicular vectors have v · w = 0. The angle is below 90° in the half-plane where v · w > 0 and above 90° in the half-plane where v · w < 0.
Conclusion  Right angles produce v · w = 0. We have proved Theorem 1B. The dot product is zero when the angle is θ = 90°. Then cos θ = 0. The zero vector v = 0 is perpendicular to every vector w because 0 · w is always zero.

Now suppose v · w is not zero. It may be positive, it may be negative. The sign of v · w immediately tells whether we are below or above a right angle. The angle is less than 90° when v · w is positive. The angle is above 90° when v · w is negative. Figure 1.7 shows a typical vector w = (1, 3) in the white half-plane, with v · w > 0. The vector W = (−2, 0) in the screened half-plane has v · W = −8. The borderline is where vectors are perpendicular to v. On that dividing line between plus and minus, the dot product is zero.
The next page takes one more step with the geometry of dot products. We find the exact angle θ. This is not necessary for linear algebra—you could stop here! Once we have matrices and linear equations, we won't come back to θ. But while we are on the subject of angles, this is the place for the formula. It will show that the angle θ in Figure 1.7 is exactly 45°, between v = (4, 2) and w = (1, 3).

Start with unit vectors u and U. The sign of u · U tells whether θ < 90° or θ > 90°. Because the vectors have length 1, we learn more than that. The dot product u · U is the cosine of θ. This is true in any number of dimensions!
1C  (a) If u and U are unit vectors then u · U = cos θ.
    (b) If u and U are unit vectors then |u · U| ≤ 1.

Statement (b) follows directly from (a). Remember that cos θ is never greater than 1. It is never less than −1. The dot product of unit vectors is between −1 and 1.
Figure 1.8 shows this clearly when the vectors are u = (cos θ, sin θ) and i = (1, 0). The dot product is u · i = cos θ. That is the cosine of the angle between them.
Figure 1.8  The dot product of unit vectors is the cosine of the angle θ.
After rotation through any angle α, these are still unit vectors. The angle between them is still θ. The vectors are u = (cos α, sin α) and U = (cos β, sin β). Their dot product is cos α cos β + sin α sin β. From trigonometry this is the same as cos(β − α). Since β − α equals θ we have reached the formula u · U = cos θ.

Problem 24 proves the inequality |u · U| ≤ 1 directly, without mentioning angles. Problem 22 applies another formula from trigonometry, the "Law of Cosines." The inequality and the cosine formula u · U = cos θ are true in any number of dimensions. The dot product does not change when vectors are rotated, since θ stays the same.
What if v and w are not unit vectors? Their dot product is not generally cos θ. We have to divide by their lengths to get unit vectors u = v/||v|| and U = w/||w||. Then the dot product of those rescaled vectors u and U gives cos θ.

Whatever the angle, this dot product of v/||v|| with w/||w|| never exceeds one. That is the "Schwarz inequality" for dot products—or more correctly the Cauchy-Schwarz-Buniakowsky inequality. It was found in France and Germany and Russia (and maybe elsewhere—it is the most important inequality in mathematics). With the extra factor ||v|| ||w|| from rescaling to unit vectors, we have the cosine formula:
1D  (a) COSINE FORMULA  If v and w are nonzero vectors then  v · w/(||v|| ||w||) = cos θ.
    (b) SCHWARZ INEQUALITY  If v and w are any vectors then  |v · w| ≤ ||v|| ||w||.
Example 5  Find the angle between v = (4, 2) and w = (1, 3) in Figure 1.7b.

Solution  The dot product is v · w = 4 + 6 = 10. The length of v is ||v|| = √20. The length of w is ||w|| = √10. Therefore the cosine of the angle is

    cos θ = v · w/(||v|| ||w||) = 10/(√20 √10) = 1/√2.    Then θ = 45°.

The angle that has this cosine is 45°. It is below 90° because v · w = 10 is positive. By the Schwarz inequality, ||v|| ||w|| = √200 is larger than v · w = 10.
Example 6  The dot product of v = (a, b) and w = (b, a) is 2ab. The Schwarz inequality says that 2ab ≤ a² + b². For example 2(3)(4) = 24 is less than 3² + 4² = 25.

Reason  The lengths are ||v|| = ||w|| = √(a² + b²). Then v · w = 2ab never exceeds ||v|| ||w|| = a² + b². The difference between a² + b² and 2ab can never be negative:

    a² + b² − 2ab = (a − b)² ≥ 0.

This is more famous if we write x = a² and y = b². Then the "geometric mean" √(xy) is not larger than the "arithmetic mean," which is the average of x and y:

    ab ≤ (a² + b²)/2    becomes    √(xy) ≤ (x + y)/2.
Computing dot products and lengths and angles It is time for a moment of truth.
The dot product v - w is usually seen as a row times a column:
    Instead of    [ v1 ] · [ w1 ]    we more often see    [ v1  v2 ] [ w1 ]
                  [ v2 ]   [ w2 ]                                    [ w2 ]

In FORTRAN we multiply components and add (using a loop):

       DO 10 I = 1,N
    10 VDOTW = VDOTW + V(I) * W(I)

MATLAB works with whole vectors, not their components. If v and w are column vectors then v' is a row as above:

    dot = v' * w

The length of v is already known to MATLAB as norm(v). We could define it ourselves as sqrt(v' * v), using the square root function—also known. The cosine and the angle (in radians) we do have to define ourselves:

    cos = v' * w/(norm(v) * norm(w));
    angle = acos(cos)
We used the arc cosine (acos) function to find the angle from its cosine. We have not created a new function cos(v, w) for future use. That would become an M-file, and Chapter 2 will show its format. (Quite a few M-files have been created especially for this book. They are listed at the end.) The instructions above will cause the numbers dot and angle to be printed. The cosine will not be printed because of the semicolon.
REVIEW OF THE KEY IDEAS

1. The dot product v · w multiplies each component vi by wi and adds.

2. The length ||v|| is the square root of v · v.

3. The vector v/||v|| is a unit vector. Its length is 1.

4. The dot product is v · w = 0 when v and w are perpendicular.

5. The cosine of θ (the angle between v and w) never exceeds 1:

       cos θ = v · w/(||v|| ||w||)    and therefore    |v · w| ≤ ||v|| ||w||.
Problem Set 1.2
1  Calculate the dot products u · v and u · w and v · w and w · v for the given vectors u, v, and w.

2  Compute the lengths ||u|| and ||v|| and ||w|| of those vectors. Check the Schwarz inequalities |u · v| ≤ ||u|| ||v|| and |v · w| ≤ ||v|| ||w||.

3  Write down unit vectors in the directions of v and w in Problem 1. Find the cosine of the angle θ between them.
4  Find unit vectors u1 and u2 in the directions of v = (3, 1) and w = (2, 1, 2). Also find unit vectors U1 and U2 that are perpendicular to v and w.

5  For any unit vectors v and w, show that the angle is 0° or 90° or 180° between
   (a) v and v    (b) w and −w    (c) v + w and v − w.
6  Find the angle θ (from its cosine) between each of the given pairs of vectors v and w.

7  (a) Describe every vector (w1, w2) that is perpendicular to v = (2, −1).
   (b) In words, describe all vectors that are perpendicular to V = (1, 1, 1).
8  True or false (give a reason if true or a counterexample if false):
   (a) If u is perpendicular (in three dimensions) to v and w, then v and w are parallel.
   (b) If u is perpendicular to v and w, then u is perpendicular to v + 2w.
   (c) There is always a combination v + cu that is perpendicular to w.
9  The slopes of the arrows from (0, 0) to (v1, v2) and (w1, w2) are v2/v1 and w2/w1. If the product of those slopes is v2w2/v1w1 = −1, show that v · w = 0 and the vectors are perpendicular.

10  Draw arrows from (0, 0) to the points v = (1, 2) and w = (−2, 1). Write down the two slopes and multiply them. That answer is a signal that v · w = 0 and the arrows are ___.
11  If v · w is negative, what does this say about the angle between v and w? Draw a 3-dimensional vector v (an arrow), and show where to find all w's with v · w < 0.

12  With v = (1, 1) and w = (1, 5) choose a number c so that w − cv is perpendicular to v. Then find the formula that gives this number c for any v and w.

13  Find two vectors v and w that are perpendicular to (1, 1, 1) and to each other.

14  Find three vectors u, v, w that are perpendicular to (1, 1, 1, 1) and to each other.
15  The geometric mean of x = 2 and y = 8 is √(xy) = 4. The arithmetic mean is larger: ½(x + y) = ___. This came in Example 6 from the Schwarz inequality for v = (√2, √8) and w = (√8, √2). Find cos θ for this v and w.

16  How long is the vector v = (1, 1, ..., 1) in 9 dimensions? Find a unit vector u in the same direction as v and a vector w that is perpendicular to v.

17  What are the cosines of the angles α, β, θ between the vector (1, 0, −1) and the unit vectors i, j, k along the axes? Check the formula cos²α + cos²β + cos²θ = 1.
Problems 18—24 lead to the main facts about lengths and angles. Never again will
we have several proofs in a row.
18  (Rules for dot products) These equations are simple but useful:
    (1) v · w = w · v    (2) u · (v + w) = u · v + u · w    (3) (cv) · w = c(v · w)
    Use Rules (1) and (2) with u = v + w to prove that ||v + w||² = v · v + 2v · w + w · w.

19  The triangle inequality is (length of v + w) ≤ (length of v) + (length of w). Problem 18 found ||v + w||² = ||v||² + 2v · w + ||w||². Use the Schwarz inequality on v · w to prove
    ||v + w||² ≤ (||v|| + ||w||)²    or    ||v + w|| ≤ ||v|| + ||w||.

20  A right triangle in three dimensions still obeys ||v||² + ||w||² = ||v + w||². Show how this leads in Problem 18 to v1w1 + v2w2 + v3w3 = 0.
21  The figure shows that cos α = v1/||v|| and sin α = v2/||v||. Similarly cos β is ___ and sin β is ___. The angle θ is β − α. Substitute into the formula for cos(β − α) to find cos θ = cos β cos α + sin β sin α = v · w/(||v|| ||w||).

22  With v and w at angle θ, the Law of Cosines gives the length of the third side:
    ||v − w||² = ||v||² − 2||v|| ||w|| cos θ + ||w||².
    Compare with (v − w) · (v − w) = ___ to find again the formula for cos θ.

23  The Schwarz inequality by algebra instead of trigonometry:
    (a) Multiply out both sides of (v1w1 + v2w2)² ≤ (v1² + v2²)(w1² + w2²).
    (b) Show that the difference between those sides equals (v1w2 − v2w1)². This cannot be negative since it is a square—so the inequality is true.
24  One-line proof of the Schwarz inequality |u · U| ≤ 1: If (u1, u2) and (U1, U2) are unit vectors, pick out the step that uses Example 6:
    |u · U| ≤ |u1||U1| + |u2||U2| ≤ (u1² + U1²)/2 + (u2² + U2²)/2 = (1 + 1)/2 = 1.
    Put (u1, u2) = (.6, .8) and (U1, U2) = (.8, .6) in that whole line and find cos θ.

25  Why is |cos θ| never greater than 1 in the first place?
26  Pick any numbers that add to x + y + z = 0. Find the angle between your vector v = (x, y, z) and the vector w = (z, x, y). Challenge question: Explain why v · w/(||v|| ||w||) is always −½.

27  Suppose ||v|| = 5 and ||w|| = 3. What are the smallest and largest possible values of ||v − w|| and v · w?
SOLVING LINEAR EQUATIONS

2.1  VECTORS AND LINEAR EQUATIONS
The central problem of linear algebra is to solve a system of equations. Those equations
are linear, which means that the unknowns are only multiplied by numbers—we never
see x times y. Our first example of a linear system is certainly not big. It has two
equations in two unknowns. But you will see how far it leads:
     x − 2y =  1
    3x + 2y = 11     (1)
One way to approach these equations is a row at a time. The first row is x —2y = l.
That equation produces a straight line in the xy plane. The point x = 1, y = 0 is on
the line because it solves the equation. The point x = 3, y = 1 is also on the line
because 3 − 2 = 1. If we choose x = 101 we find y = 50. The slope of this particular line is ½. But slopes are important in calculus and this is linear algebra!
Figure 2.1  Row picture: The point where the lines x − 2y = 1 and 3x + 2y = 11 meet is the solution.
Figure 2.1 shows that line x — 2y = 1. The other line in the figure comes from
the second equation 3x + 2y = 11. You can't miss the intersection point where the two
lines meet. The point x = 3, y = 1 lies on both lines. That point solves both equations
at once. This is the solution to our system of linear equations.
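For readers following along in MATLAB, here is a minimal check of that solution. The backslash operator is MATLAB's built-in linear system solver; the elimination it relies on is explained in the coming sections:

    % The system x - 2y = 1, 3x + 2y = 11 from equation (1)
    A = [1 -2; 3 2];
    b = [1; 11];
    x = A \ b          % returns [3; 1]: the intersection point x = 3, y = 1
    A * x              % multiplying back gives [1; 11] = b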
We call Figure 2.1 the “row picture” of the two equations in two unknowns:
R The row picture shows two lines meeting at a single point.
Turn now to the column picture. I want to recognize the linear system as a “vector equation”. Instead of numbers we need to see vectors. If you separate the original
system into its columns instead of its rows, you get
    x [ 1 ] + y [ −2 ] = [  1 ]
      [ 3 ]     [  2 ]   [ 11 ]
This has two column vectors on the left side. The problem is to find the combination
of those vectors that equals the vector on the right. We are multiplying the first column
vector by x and the second column vector by y, and adding. With the right choices x = 3 and y = 1, this linear combination produces the vector with components 1 and 11.
C The column picture combines the columns on the left side to produce the vector
on the right side.
Figure 2.2  Column picture: A combination of columns produces the right side (1, 11).
Figure 2.2 is the "column picture" of the two equations in two unknowns. The first part shows the two separate columns (1, 3) and (−2, 2). It also shows that first column multiplied by 3. This multiplication by a scalar (a number) is one of the two basic operations in linear algebra:

    Scalar multiplication    3 [ 1 ] = [ 3 ]
                               [ 3 ]   [ 9 ]
If the components of a vector v are v1 and v2, then the components of cv are cv1 and
cv2. Notice boldface for vectors and lightface for scalars.
The other basic operation is vector addition. Algebraically we are adding the first
components and second components separately:
    Vector addition    [ 3 ] + [ −2 ] = [  1 ]
                       [ 9 ]   [  2 ]   [ 11 ]
The graph in Figure 2.2 shows a parallelogram. Two vectors are placed “head to tail”.
Their sum is a diagonal of the parallelogram:
    The sides are  [ 3 ]  and  [ −2 ].    The diagonal sum is  [  1 ].
                   [ 9 ]       [  2 ]                          [ 11 ]
We have multiplied the original columns by x = 3 and y = 1, and combined them to
produce the vector on the right side of the linear equations.
To repeat: The left side of the vector equation is a linear combination of the
columns. The problem is to find the right coefficients x = 3 and y = 1. We are
combining scalar multiplication and vector addition into one step. That step is crucially
important, because it contains both of the basic operations:
    Linear combination    3 [ 1 ] + 1 [ −2 ] = [  1 ]
                            [ 3 ]     [  2 ]   [ 11 ]

Of course the solution x = 3, y = 1 is the same as in the row picture.
I don’t
know which picture you prefer! I suspect that the two intersecting lines are more familiar at first. You may like the row picture better, but only for one day. My own
preference is to combine column vectors. It is a lot easier to see a combination of
four vectors in four-dimensional space, than to visualize how four hyperplanes in that
space might possibly meet at a point.
The coefficient matrix on the left side of the equations is the 2 by 2 matrix A:
    A = [ 1  −2 ]
        [ 3   2 ]
This is very typical of linear algebra, to look at a matrix by rows and by columns. Its
rows give the row picture and its columns give the column picture. Same numbers,
different pictures, same equations. We can write those equations as a matrix problem
Ax = b:

    Matrix equation    [ 1  −2 ] [ x ] = [  1 ]
                       [ 3   2 ] [ y ]   [ 11 ]
The row picture multiplies on the left side by rows. The column picture combines the
columns. The numbers x = 3 and y = 1 go into the solution vector x. Then Ax = b is

    [ 1  −2 ] [ 3 ] = [  1 ]
    [ 3   2 ] [ 1 ]   [ 11 ]
The first row is 1·3 − 2·1 and this equals 1. The second row is 3·3 + 2·1 and this agrees with 11. The column picture takes 3 (column 1) + 1 (column 2). It produces the column vector on the right side. Either way, this very small system is now solved!
Three Equations in Three Unknowns
Let the unknowns be x, y, z, and let the linear equations be
     x + 2y + 3z = 6
    2x + 5y + 2z = 4
    6x − 3y +  z = 2     (3)
We look for numbers x, y, z that solve all three equations at once. Those desired numbers might or might not exist. For this system, they do exist. When the number of unknowns matches the number of equations, there is usually one solution. Before solving
the problem, we try to visualize it both ways:
R The row picture shows three planes meeting at a single point.
C The column picture combines three columns to produce the fourth column.
In the row picture, each equation is a plane in three-dimensional space. The first
plane comes from the first equation x + 2y + 3z = 6. That plane crosses the x and y
and z axes at the points (6,0,0) and (0,3,0) and (0,0, 2). Those three points solve
the equation and they determine the whole plane. But the vector (x, y, z) = (0, 0, 0)
does not solve x +2y+3z = 6. Therefore the plane in Figure 2.3 does not go through
the origin.
Figure 2.3  Row picture of three equations: Three planes meet at a point. The line L lies on the first two planes; L meets the third plane at the solution.
For a plane to go through (0, 0,0), the right side of the equation must be zero.
The plane x + 2y + 3z = O does pass through the origin, and it is parallel to the
original plane. When the right side increases to 6, the plane moves away from the
origin. Later we will recognize that the vector (1, 2,3) is perpendicular to these parallel
planes x + 2y + 3z = c.
The second plane is given by the second equation 2x +5y+2z = 4. It intersects
the first plane in a line L. The usual result of two equations in three unknowns is a
line L of solutions.
The third equation gives a third plane. It cuts the line L at a single point. That
point lies on all three planes and solves all three equations. It is harder to draw this
triple intersection point than to imagine it. The three planes meet at the solution (which
we haven’t found yet).
The column picture starts with the vector form of the equations:
    x [ 1 ] + y [  2 ] + z [ 3 ] = [ 6 ]     (4)
      [ 2 ]     [  5 ]     [ 2 ]   [ 4 ]
      [ 6 ]     [ −3 ]     [ 1 ]   [ 2 ]
The unknown numbers x, y, z are the coefficients in this linear combination. We want to
multiply the three column vectors by the correct numbers x, y, z to produce b = (6, 4, 2).
Figure 2.4  Column picture: The solution is (x, y, z) = (0, 0, 2) because b = (6, 4, 2) equals 2 times column 3.
Figure 2.4 shows this column picture. Linear combinations of those columns can
produce all possible right sides! The combination that produces b = (6, 4,2) is just
2 times the third column. The coefficients we need are x = 0, y =0, and z= 2.
This is also the intersection point of the three planes in the row picture. It is the
solution to the linear system:
    0 [ 1 ] + 0 [  2 ] + 2 [ 3 ] = [ 6 ]
      [ 2 ]     [  5 ]     [ 2 ]   [ 4 ]
      [ 6 ]     [ −3 ]     [ 1 ]   [ 2 ]
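The same check can be made in MATLAB; this short sketch only confirms the column-picture answer for system (3):

    A = [1 2 3; 2 5 2; 6 -3 1];
    b = [6; 4; 2];
    x = A \ b          % returns [0; 0; 2]
    2 * A(:,3)         % 2 times column 3 also gives b = (6, 4, 2)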
The Matrix Form of the Equations
We have three rows in the row picture and three columns in the column picture (plus the
right side). The three rows and columns contain nine numbers. These nine numbers fill a
3 by 3 matrix. The “coefficient matrix” has the rows and columns that have so far been
kept separate:
The coefficient matrix is    A = [ 1   2   3 ]
                                 [ 2   5   2 ]
                                 [ 6  −3   1 ]
The capital letter A stands for all nine coefficients (in this square array).
The
letter b denotes the column vector with components 6,4,2. The unknown x is also
a column vector, with components x, y, z. (We use boldface because it is a vector, x
because it is unknown.) By rows the equations were (3), by columns they were (4),
and now by matrices they are (5). The shorthand is Ax = b:
    Matrix equation    [ 1   2   3 ] [ x ]   [ 6 ]
                       [ 2   5   2 ] [ y ] = [ 4 ]     (5)
                       [ 6  −3   1 ] [ z ]   [ 2 ]
We multiply the matrix A times the unknown vector x to get the right side b.
Basic question: What does it mean to “multiply A times x”? We can do it by
rows or we can do it by columns. Either way, Ax = b must be a correct representation
of the three equations. So there are two ways to multiply the matrix A times the vector x:
Multiplication by rows    Ax comes from dot products, a row times a column:

    Ax = [ (row 1) · x ]
         [ (row 2) · x ]     (6)
         [ (row 3) · x ]

Multiplication by columns    Ax is a linear combination of column vectors:

    Ax = x (column 1) + y (column 2) + z (column 3).     (7)
However you do it, all our equations show A times x. When we substitute the solution
x = (0,0, 2), the multiplication Ax produces b:
1
2,
6
Zs
PS 2
=—3' 1
0
Oo
2
;=]
6
4
2
2.1
Vectors and Linear Equations
27
In this case, multiplication by columns is simply 2 times column 3. The right side
b = (6,4, 2) is proportional to the third column (3, 2, 1).
This book sees Ax as a combination of the columns of A. Compute by rows,
but please hold on to multiplication by columns.
Example 1 Here are matrices A and J with three ones and six zeros:
Axia
Oma
LagO eG.
OO
4
Be
6
4
ein | te
4
[Oe
OR ia @
OF One
Lo
4
+
Saha
6
6
In the first example Ax is (4,4,4). If you are a row person, the product of every
row (1, 0,0) with (4,5, 6) is 4. If you are a column person, the linear combination is
4 times the first column (1,1, 1). In that matrix A, the second and third columns are
zero vectors.
The example with /x deserves a careful look, because the matrix / is special. It
has ones on the “main diagonal”. Off that diagonal, all the entries are zeros. Whatever
vector this matrix multiplies, that vector is not changed. This is like multiplication
by 1, but for matrices and vectors. The exceptional matrix in this example is the 3 by 3
identity matrix:
—~
I
always yields the multiplication
Jx =x.
a
oor
oro
Matrix Notation
The first row of a 2 by 2 matrix contains aj; and aj. The second row contains a2,
and a2. The first index gives the row number, so that a;; is an entry in row 7. The
second index j gives the column number. But those subscripts are not convenient on
a keyboard! Instead of aj; it is easier to type A(i, 7). The entry aj2 = A(1, 2) is in
row
|, column
2:
OM
aie?
a=[ i
ai
a
4
VG
AC
dealee
ara2)
el
For an m by n matrix, the row index 7 goes from 1 to m. The column index j stops
at n. There are mn entries in the matrix.
Multiplication in MATLAB
I want to express A and x and their product Ax using MATLAB commands. This is a
first step in learning that language. I begin by defining the matrix A and the vector x.
This vector is a 3 by | matrix, with three rows and one column. Enter matrices a row
28
Chapter 2. Solving Linear Equations
at a time, and use a semicolon to signal the end of a row:
As
ll
2
35
be?
Se
ee
One
oe
sf = |O2
Os 2)
Here are three ways to multiply Ax in MATLAB:
Matrix multiplication
b= Axx
In reality, that is the way to do it. MATLAB is a high level language, and matrices
are the objects it works with. The symbol x indicates multiplication.
We can also pick out the first row of A (as a smaller matrix!). The notation for
that submatrix is A(1,:), which keeps row | and all columns. Multiplying this row by
the column vector x gives the first component b; = 6:
Row ata time
b=[A(l,:)*x;
A(Q2,:)*x;
A(G,:)*x]
Those are dot products, row times column.
The other way to multiply uses the columns of A. The first column is the 3 by 1
submatrix A(:, 1). Now the colon symbol : is keeping all rows of column 1. This
column multiplies x(1) and the other columns multiply x(2) and x(3):
Column at
atime
b= A(:,1)*x(1)+A(:,
2) *x(2) + A(:,3) * x(3)
I think that matrices are stored by columns. Then multiplying a column at a time will
be a little faster. So A * x is actually executed by columns.
You can see the same choice in a FORTRAN-type structure, which operates on
single entries of A and x. This lower level language needs an outer loop and an inner
loop. When
the outer loop (in FORTRAN
or MATLAB)
indicates the row number
i,
the multiplication is a row at a time:
FORTRAN
by rows
MATLAB
by rows
DOMOS
FEATes
fone Sts3
DORLOV
4 153
foray
= be 3
10
BU) =BU)+
Ad, J) # XV)
b@) =b@) + AG, J) ¥ x)
Notice that MATLAB is sensitive to upper case versus lower case (capital letters and
small letters). If the matrix is A then its entries are A(i, j) not a(i, j).
When the outer loop indicates the column number j, the multiplication is executed a column at a time. The inner loop goes down the column:
FORTRAN
by columns
DO 20) Gi: = 3
DO 20 tit i53
20 BU) =BU)+ACU, J) * XV)
MATLAB
by columns
foray; =i
for t= 13
b(i) = bi) + AG, j) *xQ)
My hope is that you will prefer the higher level A * x. FORTRAN will not appear
again in this book. Maple and Mathematica and graphing calculators also operate at
2.1
Vectors and Linear Equations
29
the higher level. The multiplication is A. x in Mathematica. It is multiply(A, x); or
equally evalm(A& * x); in Maple. Those languages allow symbolic entries a,b, c,...
and not only real numbers. Like MATLAB’s Symbolic Toolbox, they give the correct
symbolic answer.
=»
REVIEW OF THE KEY IDEAS
=&
The basic operations on vectors are scalar multiplication cv and vector addition
v+ Ww.
Together those operations give linear combinations cv + w and cv + dw.
Ax = b asks for the combination of the columns of A that produces b.
gre
ee Matrix-vector multiplication Ax can be executed by rows or by columns.
Each separate equation in Ax = b gives a line (n = 2) or a plane (n = 3) or a
“hyperplane” (n > 3). They intersect at the solution or solutions.
Problem Set 2.1
Problems
1
1-8 are about the row and column pictures of Ax = b.
With A = / (the identity matrix) draw the planes in the row picture. Three sides
of a box meet at the solution:
Ix +0y +0z=2
Ox+ly+0z=3
On
Oy iz
I|
or
©
or
©
oor
&
NS
bh
RW
Draw the vectors in the column picture of Problem 1. Two times column | plus
three times column 2 plus four times column 3 equals the right side b.
If the equations in Problem
| are multiplied by 1, 2,3 they become
Ix +0y+0z=2
Ox +2y+0z=6
Ox 4 Oy-+ 32.) 12
or
[O00
Oa
0
OPTUS
¥8
yi
z
fk=41)
Z
6
12
Why is the row picture the same? Is the solution the same? The column picture
is not the same—draw it.
If equation 1 is added to equation 2, which of these are changed: the row picture,
the column picture, the coefficient matrix, the solution? The new equations in
Problem 1 would be x = 2, x+y =5, z= 4.
30
Chapter 2
Solving Linear Equations
Find any point on the line of intersection of the planes x +
y +z=4. By trial and error find another point on the line.
y+ 3z=6
and x —
The first of these equations plus the second equals the third:
Ka
ye
=
x+2y+
z=3
2x 4-3)
20 =o:
The first two planes meet along a line. The third plane contains that line, because
if x, y, z satisfy the first two equations then they also
. The equations have
infinitely many solutions (the whole line). Find three solutions.
Move the third plane in Problem 6 to a parallel plane 2x + 3y +2z = 9. Now
the three equations have no solution—why not? The first two planes meet along
a line, but the third plane doesn’t cross that line (artists please draw).
In Problem 6 the columns are (1, 1,2) and (1, 2,3) and (1, 1,2). This is a “singular case” because the third column is
. Find two combinations of the
three columns that give the right side (2, 3, 5).
Problems 9-14 are about multiplying matrices and vectors.
9
Compute each Ax by dot products of the rows with the column vector:
(aap
10
223
all lee
orca di
(b)
Vas Maas eg Ulhc
O° Or rao
Compute each Ax in Problem 9 as a combination of the columns:
O(a),
becomes)
FA
=
1
25)—2"
—4
|/-F
Z
213
1
4
S35
|) =
2
How many separate multiplications for Ax, when the matrix is “3 by 3’?
11
Find the two components of Ax by rows or by columns:
GE) ~ & IE) = bap
12
Multiply A times x to find three components of Ax:
On On id
Onalian®
La0: 0
53
y
Zz
and
pie mst
ee
aos)
2.1
13
(a)
A matrix with m rows and n columns multiplies a vector with
ponents to produce a vector with
14
Vectors and Linear Equations
com-
components.
(b)
The planes from the m equations Ax = b are in
The combination of the columns of A is in
-dimensional space.
-dimensional space.
(a)
How would you define a linear equation in three unknowns x, y, z?
(b)
If vo = (x0, yo, Zo) and vy = (x1, y1, 21) both satisfy the linear equation,
then so does cvg9 + dv,, provided
c+d =
(c)
All combinations
of vg and v; satisfy the equation when the right side is
Problems
15-22 ask for matrices that act in special ways on vectors.
15
(a)
What is the 2 by 2 identity matrix?
(b)
What is the 2 by 2 exchange matrix? P times [¥] equals [¥ ].
(a)
What 2 by 2 matrix R rotates every vector by 90°?
(b)
What 2 by 2 matrix rotates every vector by 180°?
What
3 by 3 matrix
16
17
31
P permutes
x
>.¢
/ times be equals eae
the vector
R times [Fal is [XKYI.
(x, y, z) to (y,z,x)?
What
matrix
P—! permutes (y,z,x) back to (x, y, z)?
18
What 2 by 2 matrix E subtracts the first component from the second component?
What 3 by 3 matrix does the same?
19
What
3 by 3 matrix
E multiplies
(x, y, z) to give (x, y,z+ x)?
E~! multiplies (x, y, z) to give (x, y,z—x)?
then multiply by E~!, the two results are (
20
What
matrix
If you multiply (3, 4,5) by E and
) and (
F
What 2 by 2 matrix P; projects the vector (x, y) onto the x axis to produce
(x, 0)? What matrix P2 projects onto the y axis to produce (0, y)? If you multiply (5,7)
by P; and then multiply
by P2, the two results
are
(
) and
(sanst4).
21
What 2 by 2 matrix R rotates every vector through 45°? The vector (1,0) goes
to (/2/2, /2/2). The vector (0, 1) goes to (—/2/2, /2/2). Those determine
the matrix. Draw these particular vectors in the xy plane and find R.
22
Write the dot product of (1, 4,5) and (x, y, z) as a matrix multiplication Ax. The
matrix A has one row. The solutions to Ax = 0 lie ona
____. _The columns
of A are only in ___-dimensional space.
32
Chapter 2
23
Which code finds the dot product of V with row 1 and then row 2? Which code
finds column | times V(1) plus column 2 times V(2)? Read the paragraph just
before this problem set. If A has 4 rows and 3 columns, what changes are needed
in the codes?
Solving Linear Equations
DO10) h=172
DO 10 jpsh,2
10 B(I) = Bil) + ACJ)
* VU)
DO#10)
Je= 172
DO 10,b=
172
LOLB CO),=
B(I) + A(I,J)
* VU)
24
In both codes the first step is B(1) = ACJ, 1) x V(1).- Write the other three steps
in the order executed by each code.
25
In three lines of MATLAB, enter a matrix and a column vector and multiply them.
Questions 26-28 are a review of the row and column pictures.
26
Draw each picture in a plane for the equations
27
For two linear equations in three unknowns x, y, z, the row picture will show
(2 or 3) (lines or planes). Those lie in (2 or 3)-dimensional space. The column
x —-2y
=0,x+y=6.
picture is in (2 or 3)-dimensional space.
28
V4,
For four linear equations in two unknowns x and y, the row picture shows four
. The column picture is in
-dimensional space. The equations have
no solution unless the vector on the right side is a combination of
(Markov matrix) Start with the vector up = (1,0). Multiply again and again by
the same matrix A. The next three vectors are w1, U2, U3:
Soin
8
ve te 4\lo|=[3|
mq = Au) = ____
43 = Auy = __
What property do you notice for all four vectors ug, wu), U2, U3?
30
With a computer continue Problem 29 as far as the vector u7. Then from vo =
(0, 1) compute vj = Avg and v2 = Av; on to v7. Do the same from wo = (.5, .5)
as far as w7. What do you notice about u7, v7 and w7? Extra: Plot the results
by hand or computer. Here is a MATLAB code—you can use other languages:
Ue
Yeas
Ol Aga |RSS
SOAs
while length(x) <= 7
UR FAHU
end
plot(k, x)
Xe
Xaulls
wee
F
2.1.
31
Vectors and Linear Equations
33
The u’s and v’s and w’s are approaching a steady state vector s. Guess that
vector and check that As = s. This vector is “steady” because if you start with
S, you stay with s.
32
This MATLAB
code allows you to input the starting column ug = [a; b; 1—a—b]
with a mouse click. These three components should be positive. (Click the left
button on your point (a, b) or on several points. Add the instruction disp(u’) after
the first end to see the steady state in the text window. Right button aborts.) The
matrix A is entered by rows—its columns add to 1.
Nel
Xen oe elie cdl ave ea cil oll Hol
axis({(O 1 O 1]); axis(’square’)
plot(O, 0); hold on
title(/Markov—your name’); xlabel(‘a’); ylabel(’b’); grid
butses.
while but ==
[a,b,but] = ginput(1)
u = [a; b; 1-a-—b];
der in Taal oe 0 Re Al
while length(x) <= 7
UE = Assur x =
end
plot
end
hold off
33
[he vill:
exe) EXC
xi(2)3) 000)
Invent a 3 by 3 magic matrix M3 with entries 1,2,...,9.
and diagonals add to 15.
The first row could be 8,3,4.
All rows and columns
What
is M3 times
(1,1, 1)? What is Mg times (1, 1, 1, 1) if this magic matrix has entries 1,..., 16?
34
Chapter 2. Solving Linear Equations
THE IDEA OF ELIMINATION go 2.2
This chapter explains a systematic way to solve linear equations. The method is called
“elimination”’, and you can see it immediately in our 2 by 2 example. Before elimination, x and y appear in both equations. After elimination, the first unknown x has
disappeared from the second equation:
Before:
SP
A
ay Ey
|
ent
saa
hey
Sy =8.
The last equation 8y = 8 instantly gives y = 1. Substituting for y in the first equation
leaves
x — 2 = 1. Therefore x = 3 and the solution (x, y) = (3, 1) is complete.
Elimination produces an upper triangular system—this is the goal. The nonzero
coefficients
1, —2, 8 form a triangle.
To solve this system, x and y are computed
in
reverse order (bottom to top). The last equation 8y = 8 yields the last unknown, and
we go upward to x. This quick process is called back substitution. It is used for upper
triangular systems of any size, after forward elimination is complete.
Important point: The original equations have the same solution x = 3 and y = 1.
Figure 2.5 repeats this original system as a pair of lines, intersecting at the solution
point (3, 1). After elimination, the lines still meet at the same point! One line is hor-
izontal because its equation 8y = 8 does not contain x.
Also important: How did we get from the first pair of lines to the second pair?
We subtracted 3 times the first equation from the second equation. The step that eliminates x is the fundamental operation in this chapter. We use it so often that we look
at it closely:
To eliminate x: Subtract a multiple of equation 1 from equation 2.
Three times x —2y = 1 gives 3x —6y = 3. When this is subtracted from 3x+2y = 11,
the right side becomes 8. The main point is that 3x cancels 3x. What remains on the
left side is 2y — (—6y) or 8y. Therefore 8y = 8, and x is eliminated.
Before elimination
After elimination
y
Shear Pape il
x—2y=1
Figure 2.5
Two lines meet at the solution. So does the new line 8y = 8.
2.2
The Idea of Elimination
35
Ask yourself how that multiplier / = 3 was found. The first equation contains
x. The first pivot is | (the coefficient of x). The second equation contains 3x, so the
first equation was multiplied by 3. Then subtraction 3x — 3x produced the zero.
You will see the general rule if we change the first equation to 5x — 10y = 5.
(Same straight line.) The first pivot is now 5. The multiplier is now / = 2. To find
the multiplier, divide the coefficient to be eliminated by the pivot:
.
;
3
Multiply equation | by s
Subtract from equation 2
-
=
Ix—-10y=5
36
y—el
te
ad
ap
5
5 - Oy —a)
Sy = 3:
The system is triangular and the last equation still gives y = 1. Back substitution
produces 5x — 10 = 5 and 5x = 15 and x = 3. Multiplying the first equation by 5
changed the numbers but not the lines or the solution. Here is the rule to eliminate a
coefficient.
Pivot
= first nonzero in the row that does the elimination
Multiplier
3
= (entry to eliminate) divided by (pivot) = 5"
The new second equation contains the second pivot, which is 8. It is the coefficient of y.
We would use it to eliminate y from the third equation if there were one. To solve n equations
we want n pivots.
You could have solved those equations for x and y without reading this book. It
is an extremely humble problem, but we stay with it a little longer. Elimination might
break down and we have to see how. By understanding the possible breakdown (when
we can’t find a full set of pivots), you will understand the whole process of elimination.
Breakdown
of Elimination
Normally, elimination produces the pivots that take us to the solution. But failure is
possible. The method might go forward up to a certain point, and then ask us to divide
by zero. We can’t do it. The process has to stop. There might be a way to adjust and
continue—or failure may be unavoidable. Example 1 fails with no solution and Example 2 fails with too many solutions. Example 3 succeeds by exchanging the equations.
Example 1 Permanent failure with no solution. Elimination makes this clear:
Xe 2
yal
3x —6y = 11
Subtract 3 times
eqn. | from eqn. 2
ta 2y ea 1
Oy = 8.
The last equation is Oy = 8. There is no solution. Normally we divide the right side
8 by the second pivot, but this system has no second pivot. (Zero is never allowed as
a pivot!) The row and column pictures of this 2 by 2 system show that failure was
unavoidable. If there is no solution, elimination must certainly have trouble.
36
Chapter 2
Solving Linear Equations
first
1
column
|3
Columns don't combine to give plane
second
column
Figure 2.6
|-—2
|-—6
Row and column pictures for Example 1: elimination fails.
The row picture in Figure 2.6 shows parallel lines—which never meet. A solution
must lie on both lines. Since there is no meeting point, the equations have no solution.
The column picture shows the two columns
(1,3) and (—2, —6) in the same di-
rection. All combinations of the columns lie in that direction. But the column from
the right side is in a different direction (1,11). No combination of the columns can
produce this right side—therefore no solution.
When
we change the right side to (1,3), failure shows
as a whole line of solu-
tions. Instead of no solution there are infinitely many:
Example 2 Permanent failure with infinitely many solutions:
xD
yr Ih
Subtract 3 times
3x —6y =.
»eqn; lL from eqn.2
a
2y = 1
Oy = 0.
Now the last equation is Oy = 0. Every y satisfies this equation. There is really only
one equation, namely x —2y = 1. The unknown y is “free”. After y is freely chosen,
then x is determined as x = 1+ 2y.
In the row picture, the parallel lines have become the same line. Every point on
that line satisfies both equations. We have a whole line of solutions.
In the column picture, the right side now lines up with both columns from the
left side. We can choose x = | and y = 0—the first column equals the right side
(1,3). We can also choose x = 0 and y = —5; the second column times —5 equals
the right side. There are infinitely many other solutions. Every (x, y) that solves the
row problem also solves the column problem.
Elimination can go wrong in a third way—but this time it can be fixed. Suppose
the first pivot position contains zero. We refuse to allow zero as a pivot. When the
2.2
The Idea of Elimination
37
right hand side H
lies on the line of columns
Same line from both equations
Solutions all along this line
1
fal
Fo (second column) = 2]
Figure 2.7.
Row and column pictures for Example 2: infinitely many solutions.
first equation has no term involving x, we can exchange it with an equation below.
With an acceptable pivot the process goes forward:
Example 3 Temporary failure but a row exchange produces two pivots:
Ox+2y=4
Exchange the
3x —2y=5 __ two equations
@0=2ye
2y.= 4.
The new system is already triangular. This small example is ready for back substitution.
The last equation gives y = 2, and then the first equation gives x = 3. The row picture
is normal (two intersecting lines). The column picture is also normal (column vectors
not in the same direction).
required.
The pivots 3 and 2 are normal—but a row exchange was
Examples | and 2 are singular—there is no second pivot. Example 3 is nonsingular—there is a full set of pivots. Singular equations have exactly no solution or
infinitely many solutions. Nonsingular equations have exactly one solution. Pivots must
be nonzero because we have to divide by them.
Three Equations in Three Unknowns
To understand Gaussian elimination, you have to go beyond 2 by 2 systems. Three by
three is enough to see the pattern. For now the matrices are square—an equal number
of rows and columns. Here is a 3 by 3 system, specially constructed so that all steps
38
Chapter 2
Solving Linear Equations
lead to whole numbers and not fractions:
2x +4y —2z=2
4x+9y
—2x
—3z=8
(i)
—3y+7z=10
What are the steps? The first pivot is 2 in the upper left corner. Below that pivot we
want to create zeros. The first multiplier is the ratio / = 4/2 = 2. Multiply the pivot
equation by 2 and subtract. Subtraction removes the 4x from the second equation:
1 = Subtract 2 times equation 1 from equation 2.
We also eliminate x from equation 3—still using the first pivot. The quick way is
to add equation | to equation 3. Then 2x cancels —2x. We do exactly that, but the
rule in this book is to subtract rather than add. The systematic pattern has multiplier
—2/2 = —1. Subtracting —1 times an equation is the same as adding that equation:
2
Subtract —1 times equation 1 from equation 3.
Now look at the situation. The two new equations involve only y and z:
ly+1z=4
ly +-352=.12
We have reached a 2 by 2 system. The final step eliminates y to reach a | by | problem:
3
Subtract equation 2new from equation 3new. Then Ox + Oy + 4z = 8.
The original 3 by 3 system has been converted into a triangular 3 by 3 system:
2x+4y—22=2
4x + 9y —3z=8
| 2x +4y—2z
Ge
has become
=—2x —3y + 7z= 10
=2
(2)
ly+1z=4
47 =8.
The goal is achieved—forward elimination is complete. Notice the pivots 2,1,4 along
the diagonal. Those pivots 1 and 4 were hidden in the original system! Elimination
brought them out.
The triangular system is zero below the pivots—three elimination steps produced
three zeros. This triangle is ready for back substitution, which is quick:
4z=8
gives
z=2,
y+z=4
gives
y=2,
equation
I gives
x =—1.
The solution is (x, y, z) = (—1, 2, 2).
The row picture shows planes, starting with the first equation 2x + 4y —2z = 2.
The planes from all our equations go through the solution (—1, 2, 2). The original three
planes are sloping, but the very last plane 4z = 8 is horizontal.
2.2
The Idea of Elimination
39
The column picture shows a linear combination of column vectors producing the
right side. The coefficients in that combination are —1, 2,2 (the solution):
2
(-1)|
4
4/+2)
—2
—2
2
9|+2]—3 | equals
—3
7
Sam
10
(3)
The numbers x, y,z multiply columns 1, 2, 3 in the original system and also in the
triangular system.
For a 4 by 4 problem, or an n by n problem, elimination proceeds the same way.
Here is the whole idea of forward elimination:
1.
Use the first equation to create zeros below the first pivot.
2.
Use the new second equation to create zeros below the second pivot.
3.
Keep going to find the nth pivot.
The result of forward elimination is an upper triangular system. It is nonsingular if
there is a full set of n pivots (never zero!). Here is a final example to show the original
system, the triangular system, and the solution from back substitution:
i
= 6
x+y+z=6
otic)
x+2y+2z=9
x+2y+3z=10
V2
y+z=3
ye i
iw
(oF
All multipliers are 1. All pivots are 1. All planes meet at the solution
(3, 2,1).
The
original columns combine with coefficients 3,2, 1 to give the (6,9, 10) column.
=»
REVIEW OF THE KEY IDEAS
=&
1.
A linear system becomes upper triangular after elimination.
2.
The upper triangular system is solved by back substitution (starting at the bottom).
3.
Elimination subtracts a multiple @;; of equation j from equation i, to make the
(i, 7) entry zero.
4.
The multiplier is ¢;; =
5.
A zero in the pivot position can be repaired if there is a nonzero below it.
6.
Otherwise,
many.
breakdown
entry to eliminate
a
e
—.
pivot in row 1
is permanent.
Pivots can not be zero!
The system has no solution or infinitely
40
Chapter 2. Solving Linear Equations
Problem Set 2.2
Problems 1-10 are about elimination on 2 by 2 systems.
1
What multiple / of equation 1 should be subtracted from equation 2?
ax
Sy =a
10x 4-9" aL
After this elimination step, write down the upper triangular system and circle the
two pivots. The numbers | and 11 have no influence on those pivots.
2
Solve the triangular system of Problem | by back substitution, y before x. Verify
that x times (2,10) plus y times (3,9) equals (1, 11). If the right side changes
to (4, 44), what is the new solution?
3
What multiple of equation 1 should be subtracted from equation 2?
2x —4y = 6
—x+5y=0.
After this elimination step, solve the triangular system.
to (—6, 0), what is the new solution?
4
If the right side changes
What multiple / of equation 1 should be subtracted from equation 2?
ax + by =\f
cx+dy =g.
The first pivot is a (assumed nonzero). Elimination produces what formula for
the second pivot? The second pivot is missing when ad = bc.
5
Choose a right side which gives no solution and another right side which gives
infinitely many solutions. What are two of those solutions?
ax pp 2y=7
6x+4y
6
=
Choose a coefficient b that makes this system singular. Then choose a right side
g that makes it solvable.
2x + by = 13
cB Al toh)
vi
ako
For which numbers a does elimination break down (1) permanently (2) temporarily?
ak
Oy == —3
4x+6y=
6.
Solve for x and y after fixing the second breakdown by a row exchange.
2.2
For which three numbers k does elimination
number of solutions 0 or 1 or oo?
The Idea of Elimination
break down?
kx+3y=
41
In each case, is the
6
3x +ky
= —6.
What is the test on b; and b2 to decide whether these two equations have a solution? How many solutions will they have?
3x
—2y
=D
6x
— 4y = bp.
10
In the xy plane, draw the lines x + y = 5 and x + 2y = 6 and the equation
ya
that comes from elimination. The line 5x —4y = c will go through
the solution of these equations if c =
Problems
11
11-20 study elimination on 3 by 3 systems (and possible failure).
Reduce this system to upper triangular form:
26 Sy eg
—
4x +7y +5z = ~—
—2y+2z=
6.
Circle the pivots. Solve by back substitution for z, y,.x. Two row operations are
enough if a zero coefficient appears in which positions?
a2
Apply elimination and back substitution to solve
2x —3y
55
Axi
Sy
EZ)
2x = y—3z=5
Circle the pivots. List the three row operations which subtract a multiple of the
pivot row from a lower row.
13
Which number d forces a row exchange, and what is the triangular system for
that d?
2x +5y+z=0
4x+dy4+72=2
yz
= 3.
Which number d makes this system singular (no third pivot)?
42
Chapter 2
14
Which number b leads later to a row exchange? Which b leads to a missing
pivot? In that singular case find a nonzero solution x, y, z.
Solving Linear Equations
x + by
= 0
x=2y
=7=0
y+z=0.
15
16
(a)
Construct a 3 by 3 system that needs two row exchanges to reach a triangular form and a solution.
(b)
Construct a 3 by 3 system that needs a row exchange to keep going, but
breaks down later.
If rows 1 and 2 are the same, how far can you get with elimination? If columns
1 and 2 are the same, which pivot is missing?
i
IP
2x -—y+z=0
24+
2y +z =0
2x—-—y+z=0
4x+4y+z=0
44
6x
yp z = 2
GyePe
Construct a 3 by 3 example that has 9 different coefficients on the left side, but
rows 2 and 3 become zero in elimination.
18
Which number g makes this system singular and which right side f gives it infinitely many solutions? Find the solution that has z = 1.
x+4y-—2z=1
x+7y —6z=6
3y.+qz=t.
19
20
(Recommended) It is impossible for a system of linear equations to have exactly
two solutions. Explain why.
(a)
If (x, y,z) and (X, Y, Z) are two solutions, what is another one?
(b)
If three planes meet at two points, where else do they meet?
How can three planes fail to have An intersection point, when no two planes are
parallel? Draw your best picture. Find a third equation that can’t be solved if
x+y+z=0
and x —-y-—z=0.
Problems 21-23 move up to 4 by 4 and n by n.
21
Find the pivots and the solution for these four equations:
2x+
y
—U
Ro
ey ae ee
=i)
Vee
= 0
a
2.2
JD.
The Idea of Elimination
This system has the same pivots and right side as Problem 21. How is the solution different (if it is)?
2x—-
y
==)
—x+2y—-—
—
z
= (
y+2z-—
t=0
ie
23
If you extend Problems 21—22 following the 1,2, 1 pattern or the —1,2,—1
tern, what is the fifth pivot? What is the nth pivot?
24
If elimination leads to these equations, find three possible original matrices A:
Vo
pat-
a
y+z=0
Baas
25
For which three numbers a will elimination fail to give three pivots?
Ga
A=|aa
4
Gag
26
Look for a matrix that has row
Nl
Ce
4
ae
b
sums
4 and 8, and column
a+b=4
c+d=8
The four equations are solvable only if s =
matrices that have the correct row and column
sums
2 and s:
a+c=2
b+d=s
. Then find two different
sums.
44
Chapter 2
Solving Linear Equations
ELIMINATION
USING MATRICES a 2.3
We now combine two ideas—elimination and matrices. The goal is to express all the
steps of elimination (and the final result) in the clearest possible way. In a 3 by 3
example, elimination could be described in words. For larger systems, a long list of
steps would be hopeless. You will see how to subtract a multiple of one row from
another row—using matrices.
The matrix form of a linear system is Ax = b. Here are b,x, and A:
1
The vector of right sides is b.
2
The vector of unknowns is x. (The unknowns change from x, y, Z,...
x3, ... because we run out of letters before we run out of numbers.)
3
The coefficient matrix is A.
to x1, x2,
The example in the previous section has the beautifully short form Ax = b:
2x1 + 4x2 — 2x3 =2
4x, + 9x2 — 3x3 =8
is the same as
—2x — 3x2 + 7x3 = 10
2 w Aton)
ley
4
bs gl il
=2
9
3
-3
2
Tis
la i e l
2
10
The nine numbers on the left go into the matrix A. That matrix not only sits beside x,
it multiplies x. The rule for “A times x” is exactly chosen to yield the three equations.
Review of A times x. A matrix times a vector gives a vector. The matrix is square
when the number of equations (three) matches the number of unknowns (three). Our
matrix is 3 by 3. A general square matrix is n by n. Then the vector x is in ndimensional space. This example is in 3-dimensional space:
The unknown
is
x]
x2
x3
x =|
—|
and the solution is
x =
”
2
Key point: Ax = b represents the row form and also the column form of the equations.
We can multiply Ax a column at a time:
2
mag
4
3
oe
z
7
10
This rule is used so often that we repeat it for emphasis.
uc
x isa combination of the columns of A. Columns are multiplied by.
of x.Then Ax is x; times (column 1) + --- + xp times (column n).
:
2.3
Elimination Using Matrices
45
One point to repeat about matrix notation: The entry in row 1, column | (the top
left corner) is called aj}. The entry in row 1, column 3 is a;3. The entry in row 3,
column | is a3}. (Row number comes before column number.) The word “entry” for
a matrix corresponds to the word “component” for a vector.
in row i, column j of the matrix A is aj;.
General rule: The entry
Example 1 This matrix has aj; = 2i + j. Then aj; = 3. Also aj2 = 4 and az; = 5.
Here is Ax with numbers
3
4
5
SOW
2
and letters:
=.
3-2+4-1
ay
aio
X1 | _ | 411%1
+ @12x2
BiisS,
s2216-<:1
a2,
a22||
x2]
~~ ‘|aayx1
+ a22x2
|"
For variety we multiplied a row at a time. The first component of Ax is 6+ 4 = 10.
That is the product of the row [3 4] with the column
product of [aj;
a 2] with (x1, x2).
The ith component
(2,1).
of Ax involves row i, which is [ajj
Using letters, it is the
aj2...@jn].
The short
formula uses “sigma notation”:
n
2B The ith component of Ax is aj1x;
+ aj2x2 +---+4jnXn.
This is
SP es ‘
|
The symbol 5° is an instruction to add. Start with
with aj;x; and stop with ajyxp.*
7 = 1 and stop with j =n.
Start
The Matrix Form of One Elimination Step
Ax = b is a convenient form for the original equation. What about the elimination
steps? The first step in this example subtracts 2 times the first equation from the second
equation. On the right side, 2 times the first component of b is subtracted from the
second component:
pi
b=]
8
2
changes to
b’=|
10
4
10
We want to do that subtraction with a matrix!
The same
result is achieved when
we
multiply an elimination matrix E times b:
The elimination matrix is
*Binstein
addition.
shortened
this even
more
1
E = | —2
Oo
Ono
1
0
©
l
by omitting the }°. The repeated j in ajjX; automatically
He also wrote the sum as By xf. Not being Einstein, we include the }°.
meant
46
Chapter 2.
Solving Linear Equations
With numbers and letters, multiplication by E subtracts 2 times row | from row 2:
Lay
=e)
aa
(Oi
0
“thin 20
ee
p)
Sian
10
Z
4
10
T
eneae
40.
sonal
10
Os
£05!
by
ba)
b3
alba
by
2 by
b3
Notice how 2 and 10 stay the same. Those are b; and b3. The first and third rows
of E are the first and third rows of the identity matrix /. That matrix leaves all vectors unchanged. The new second component is the number 4 that appeared after the
elimination step. This is b2 — 2b.
It is easy to describe the “elementary matrices” or “elimination matrices” like E.
Start with the identity matrix 7. Change one of its zeros to the multiplier —/:
2C The identity matrix has 1’s on the diagonal and 0’s everywhere else. Then 1b = b.
The elementary matrix or elimination matrix E;; that subtracts a multiple / of row j from
row i has the extra nonzero entry —/ in the i, 7 position.
Example2
bs
O40
]
L=l0.1
O01) Sande ea
ORO
oO
Ome
—-l
0O
1
If you multiply 7 times b, you get b again. If you multiply £3; times Bb, then / times
the first component is subtracted from the third component. Here we get
9-4 =5:
it @© @
ORO
OmOn
1
1
3 || == || 3
9
9
and
1
0
—4
ORO
1
0
0O
1
1
3 | =]
9
1
3
5
This is on the right side of Ax = b. What about the left side? The multiplier / = 4
was chosen to produce a zero, by subtracting 4 times the pivot. The purpose of E3,
is to create a zero in the (3, 1) position.
The notation fits the purpose. Start with A. Apply E’s to produce zeros below
the pivots (the first E is E2,). End with a triangular system. We now look in detail
at the left side—elimination applied to Ax.
First a small point. The vector x stays the same. The solution is not changed by
elimination. (That may be more than a small point.) It is the coefficient matrix that is
changed! When we start with Ax = b and multiply by E, the result is EAx = Eb.
We want the new matrix E A—the result of multiplying E times A.
Matrix Multiplication
The question is: How do we multiply two matrices?
When the first matrix is E (an
elimination matrix E21), there is already an important clue. We know A, and we know
2.3.
what it becomes
expect that
after the elimination step.
ea
One
eee
OF
Ri
On
OU
O)
ai
times
Elimination Using Matrices
47
To keep everything right, we hope and
2
4
A= | 4°
9
—2 —
-2
—3
ve
2
4
sara
et ee
-2
1
et
Pives = EAC—
This step does not change rows | and 3. Those rows are repeated in EA—only row
2 is different. Twice the first row has been subtracted from the second row. Matrix
multiplication agrees with elimination—and the new system of equations is EAx = Eb.
That is simple but it hides a subtle idea. Multiplying both sides of the original
equation gives E(Ax) = Eb. With our proposed multiplication of matrices, this is also
(EA)x
=
Eb.
The first was
E times Ax, the second is EA
times x.
They are the
same! The parentheses are not needed. We just write
EAx = Eb.
When multiplying ABC, you can do BC first or you can do AB first. This is
the point of an “associative law” like 3 x (4 x 5) = (3 x 4) x 5. We multiply 3 times
20, or we multiply 12 times 5. Both answers are 60. That law seems so obvious that
it is hard to imagine it could be false. But the “commutative law” 3 x 4 = 4 x 3 looks
even more
obvious.
For matrices, EA
is different from AL.
2D ASSOCIATIVE LAW = A(BC) = (AB)C
NOT COMMUTATIVE LAW Often AB 4 BA.
There is another requirement on matrix multiplication. We know how to multiply A times x or E times b. The matrix-matrix law should be consistent with the
old matrix-vector law. Suppose B has only one column (this column is b). Then the
matrix-matrix product EB should agree with Eb. Even more, we should be able to
multiply matrices a column at a time:
If the matrix B contains several columns by, b2, b3,
then the columns of EB are Eby, Eb2, Eb3.
This holds true for the matrix multiplication above (where we have the letter A instead
of B). If you multiply column 1 by £, you get column | of the answer:
tA
ered)
att ee alee
Oia
ata
I
2
CI
—2
(eo
2
een
—2
and similarly for columns 2 and 3.
This requirement deals with columns, while elimination deals with rows. A third approach (in the next section) describes each individual entry of the product. The beauty
of matrix multiplication is that all three approaches (rows, columns, whole matrices)
come out right.
48
Chapter 2
Solving Linear Equations
The Matrix P;; for a Row Exchange
To subtract row j from row i we use E;;. To exchange or “permute” those rows we
use another matrix Pj;. Remember the situation when row exchanges are needed: Zero
is in the pivot position. Lower down that column is a nonzero. By exchanging the two
rows, we have a pivot (nonzero!) and elimination goes forward.
What matrix P23 exchanges row 2 with row 3? We can find it by exchanging
rows of /:
1
0
Py
ie)
0
i
GS
=
0
This is a row exchange matrix. Multiplying by P23 exchanges components 2 and 3 of
any column vector. Therefore it exchanges rows 2 and 3 of any matrix:
beOs 10
O05
Oalee0),
1
oj ©
5
1
as)
3
and
1,,0,0
OnO
Ns)
Or 1? 0
2 4)
24
1
O:50e 32)aO
26252
Ons 6495
OnOR 3
On the right, P23 is doing what it was created for. With zero in the second pivot
position and “6” below it, the exchange puts 6 into the pivot.
Notice how matrices act. They don’t just sit there. We will soon meet other
permutation matrices, which can change the order of several rows. Rows 1, 2, 3 can
be moved to 3, 1, 2. Our P93 is one particular permutation matrix—it works on rows
2 and 3.
2E Permutation Matrix P;; exchanges rows i and j of the matrix it multiplies. P;;
is the identity matrix with rows i and j reversed.
To exchange equations 1 and 3 multiply by
001
= P\3 = [°10|.
Usually no row exchanges are needed. The odds are good that elimination uses only
the £;;. But the P;; are ready if needed, to move a new pivot into position.
The Augmented Matrix A’
This book eventually goes far beyond elimination. Matrices have all kinds of practical
applications—in which they are multiplied. Our best starting point was a square E
times a square A, because we met this in elimination—and we know what answer to
expect for EA. The next step is to allow a rectangular matrix A’. It still comes from
our original equations, but now it includes the right side b.
2.3
Elimination Using Matrices
49
Key idea: Elimination acts on A and b, and it does the same thing to both. We
can include b as an extra column and follow it through elimination. The matrix A
is enlarged or “augmented” by b:
2
Augmented matrix
A'= [A b] =|
4
=2
SpA aie
9 -3
—3
8
97
“10
A’ contains the whole system (left and right side). Elimination acts on whole rows of A’.
The left side and right side are both multiplied by E, to subtract 2 times equation 1
from equation 2. With A’ = [A b| those steps happen together:
Len;
ae
EA’ =|-—2
1
O
Ora
Oneal
2
alt oe
De
Zo
Ae Deen
cee:
Mee Mame cee— 7 ae
L
=2 —3
7 10
—2 =3
7°
2
peut
10
The new second row contains 0,1, 1,4. The new second equation is x2 +x3 = 4. Both
requirements on matrix multiplication are obeyed:
R_
(by rows): Each row of E acts on A’ to give a row of EA’.
C
(by columns):
E acts on each column of A’ to give a column of EA’.
Notice again that word “acts.” This is essential. Matrices do something! The matrix
A acts on x to give b. The matrix E operates on A to give
EA. The whole process
of elimination is a sequence of row operations, alias matrix multiplications. A goes to
E2,A which goes to £3; F2,A.
Finally £32£3);£2,A
is a triangular matrix.
The right side is included when we work with A’. Then A’ goes to E2; A’ which
goes to £3, E2,A’. Finally E32£3,£2)A’ is a triangular system of equations.
We stop for exercises on multiplication by E, before writing down the rules for all
matrix multiplications.
=»
REVIEW OF THE KEYIDEAS
=
1.
Ax equals x; times (column
1) +---+ Xn times (column 7).
2.
Multiplying Ax = b by £2, subtracts a multiple €2; of row
number —; is the (2, 1) entry of the matrix F).
3.
On the augmented matrix [A 6], that elimination step gives [£2;A
4.
When a matrix A multiplies a matrix B, it multiplies each column of B separately.
| from row 2. The
E215|.
50°
Chapter 2
Solving Linear Equations
Problem Set 2.3
Problems
1
1-14 are about elimination matrices.
Write down the 3 by 3 matrices that produce these elimination steps:
(a)
£ , subtracts 5 times row
| from row 2.
(b)
£32 subtracts —7 times row 2 from row 3.
(c)
Piz exchanges rows
| and 2.
In Problem 1, applying E2; and then £32 to the column b =
E32 E2\b =
. Applying E32 before E2; gives E2;E32b =
is different from F322, because when £32 comes first, row
fect from row
Which three matrices
A=
(1,0,0) gives
. £2, E32
feels no ef-
E;, F3;, E32 put A into triangular form U?
1
4
eal
Lee
6
1
e220
a0
and
£32E3,;E2,A = U.
Multiply those E’s to get one matrix M that does elimination:
MA = U.
Include b = (1, 0,0) as a fourth column in Problem 3 to produce A’. Carry out
the elimination steps on this augmented matrix A’ to solve Ax = b.
Suppose a 3 by 3 matrix has a33 = 7 and its third pivot is 2. If you change a33
to 11, the third pivot is
. If you change a33 to
, there is no third
pivot.
If every column of A is a multiple of (1,1, 1), then Ax is always a multiple of
(1,1, 1). Do a 3 by 3 example. How many pivots are produced by elimination?
Suppose £3, subtracts 7 times row | from row 3. To reverse that step you should
7 times row
to row
. The matrix to do this reverse step (the
inverse matrix) is R3; =
Suppose £3; subtracts 7 times row | from row 3. What matrix R3; is changed
into 1? Then E3;R3; = J where Problem 7 has R3; £3; = J. (So £3; commutes
with
R31.)
(a)
£2, subtracts row | from row 2 and then P23 exchanges
What matrix M = P23E2; does both steps at once?
rows
P23 exchanges
1 from row
(b)
rows
2 and 3 and then £3, subtracts row
2 and 3.
3.
What matrix M = £3) P23 does both steps at once? Explain why the M’s
are the same but the E’s are different.
2.3
10
Create a matrix that has aj; = a22 =
negative pivots. (The first pivot is 1.)
11
Multiply these matrices:
Ueriulipne
she
VG 8G
12
a33 =
Leo
—1
-
Elimination Using Matrices
51
1 but elimination produces two
OS
1
O
0
| ee
Leee3
4
ee
yal
0
Explain these facts. If the third column of B is all zero, the third column of EB
is all zero (for any E). If the third row of B is all zero, the third row of EB
might not be zero.
13
This 4 by 4 matrix will need elimination matrices
are those matrices?
E2; and £32 and E43. What
shoei
th eke Fe)
Ot
ite
een
O40;
a ee
14
Write down the 3 by 3 matrix that has aj; = 21 — 37. This matrix has a32 = 0,
but elimination still needs £32 to produce a zero. Which previous step destroys
the original zero and what is E32?
Problems
15
15-22 are about creating and multiplying matrices.
Write these ancient problems in a 2 by 2 matrix form Ax = b:
(a)
X is twice as old as Y and their ages add to 33.
(b)
The line
C.
y=mx-+c
goes through (x, y) = (2,5) and (3,7). Find m and
16
The parabola y =
a+ bx + cx* goes through the points (x, y) = (1,4) and
(2,8) and (3, 14). Find a matrix equation for the unknowns (a, b,c). Solve by
elimination.
17
Multiply these matrices in the orders
E =
EF and FE:
(eau)
ira le
BAe
0a
One
OSC
Ri
Also compute E? = EE and F* = FF.
18
Multiply these row exchange matrices in the orders PQ and QP:
PS
Oe laeo
enor Oo
ae 8
and)
=
Also compute P? = PP and (PQ)? = PQPQ.
Om
00%
We
vi
0
1
Ma
52
Chapter 2
19
(a)
Solving Linear Equations
Suppose all columns
of B are the same.
Then all columns
of EB
are the
same, because each one is E times
(b)
Suppose all rows of B are the same.
EB are not the same.
Show by example that all rows of
20
If E adds row
21
The entries of A and x are a;; and x;. The matrix E2; subtracts row | from row
2. Write a formula for
Ze
| to row 2 and F adds row 2 to row
(a)
the third component of Ax
(b)
the (2, fb) entry of EA
(c)
the (2,1) component of Ex
(d)
the first component of EF Ax.
1, does EF
equal FE?
The elimination matrix EF = [5 :| subtracts 2 times row | of A from row 2
of A. The result is EA. In the opposite order AE, we are subtracting 2 times
of A from
. (Do an example.)
Problems 23-26 include the column b in the augmented matrix A’.
23
Apply elimination to the 2 by 3 augmented matrix A’. What is the triangular
system Ux = c? What is the solution x?
rele illel=Lal,
24
Apply elimination to the 3 by 4 augmented matrix A’. How do you know this
system has no solution?
Aides
ees
Wat
Roped,
Sie
ae!
a
Sa
z
1
ee!
6
Change the last number 6 so that there is a solution.
25
The equations Ax = b and Ax* = b* have the same matrix A.
(a)
What double augmented matrix A” should you use in elimination to solve
both equations at once?
(b)
Solve both of these equations by working on a 2 by 4 matrix A”:
2 albl-[o}
2 lE]-f)
2.3.
26
53
Choose the numbers a,b,c,d in this augmented matrix so that there is (a) no
solution (b) infinitely many solutions.
AG
Which
Daf
Elimination Using Matrices
of the numbers
Challenge question:
position,
S
SS
a, b, c, or d have no effect on the solvability?
E£;; is the 4 by 4 identity matrix with an extra | in the (i, /)
i > j. Describe the matrix
examples first.
YS
tOra. Ss
Oe Ord
E;;E,;.
When
does it equal
Ex; E;;?
Try
54
Chapter 2
Solving Linear Equations
RULES FOR MATRIX OPERATIONS a 2.4
I will start with basic facts. A matrix is a rectangular array of numbers or “entries.”
When A has m rows and n columns, it is an “m by n” matrix. Matrices can be added
if their shapes are the same. They can be multiplied by any constant c. Here are
examples of
A+ B and 2A, for 3 by 2 matrices:
Lege
2
gl
3
4
OlncO
9.
9
O49
i
Ps
2
4
OF2.0
O-
“0
Matrices are added exactly as vectors are—one entry at a time. We could even regard
a column vector as a matrix with only one column (so n = 1). The matrix —A comes
from multiplication by c = —1 (reversing all the signs). Adding A to —A leaves the
zero matrix, with all entries zero.
The 3 by 2 zero matrix is different from the 2 by 3 zero matrix. Even zero has
a shape (several shapes) for matrices. All this is only common sense.
The entry in row i and column j is called a;; or A(i, j). Thus the entry a32 is
found in row 3 and column 2. The n entries along the first row are aj1, a@)2,..., @inThe lower left entry in the matrix is a, ; and the lower right is amp.
Again, the row number 7 goes from 1 to m. The column number j goes from |
to n.
Matrix addition is easy. The serious question is matrix multiplication. When can we
multiply A times B, and what is the product AB? We cannot multiply A and B above
(both 3 by 2). They don’t pass the following test:
To multiply AB, the number of columns of A must equal the numbers of rows of B.
If A has two columns, B must have two rows. When A is 3 by 2, the matrix B can
be 2 by 1 (a vector) or 2 by 2 (square) or 2 by 20. Every column of B is ready to
be multiplied by A. Then AB is 3 by 1 or 3 by 2 or 3 by 20.
Suppose A is m by n and B isn by p. We can multiply. The product AB is m by p.
m rows
nrows
|_|
m rows
n columns | |p columns | | p columns | °
A row times a column is an extreme case. Then | by n multiplies n by 1. The result
is 1 by 1. That single number is the “dot product.”
In every case AB is filled with dot products. For the top corner, the (1, 1) entry
of AB is (row | of A) - (column 1 of B). To multiply matrices, take all these dot products:
(each row of A) - (each column of B).
2F The entry in row i and column j of AB is (row i of A)-(column j of B).
2.4
*
5
ai Qi2
ais
:
*
*
*
(AB)ij
:
*
:
*
bsj
A is 4 by 5
Figure 2.8
B is 5 by 6
Here i =2
55
*
2)
*
Rules for Matrix Operations
*
*
*
AB is 4 by 6
and j = 3. Then (AB)3 is (row 2) « (column3) = Larxbx3.
Figure 2.8 picks out the second row (i = 2) of a 4 by 5 matrix A. It picks out the third
column (j = 3) of a 5 by 6 matrix B. Their dot product goes into row 2 and column 3
of AB. The matrix AB has as many rows as A (4 rows), and as many columns as B.
Example 1 Square matrices can be multiplied if and only if they have the same size:
> -a][3 a]=[1 a];
The first dot product is 1-2 + 1+3 = 5. Three more dot products give 6,1, and 0.
Each dot product requires two multiplications—thus eight in all.
If A and B are n by n, so is AB. It contains n? dot products, row of A times
column of B. Each dot product needs n multiplications, so the computation of AB
uses n> separate multiplications. For n = 100 we multiply a million times. For n = 2
we have n? = 8.
Mathematicians thought until recently that AB absolutely needed 2? = 8 multiplications. Then somebody found a way to do it with 7 (and extra additions). By
breaking n by n matrices into 2 by 2 blocks, this idea also reduced the count for large
matrices. Instead of n> it went below n?'8, and the exponent keeps falling.* The best
at this moment is n*37°. But the algorithm is so awkward that scientific computing
is done the regular way—with n? dot products in AB, and n multiplications for each
one.
Example 2 Suppose A is a row vector (1 by 3) and B is a column vector (3 by 1).
Then AB is 1 by | (only one entry, the dot product). On the other hand B times A
(a column
times a row) is a full matrix:
Column times row:
0
O00
l [1 2 3 | bal plIa
cae:
2B
2 4 ON
A row times a column is an “inner” product—another name for dot product. A column
times a row is an “outer” product. These are extreme cases of matrix multiplication,
*Maybe
the exponent
won’t stop falling before 2. No number
in between
looks special.
56
Chapter 2
Solving Linear Equations
with very thin matrices. They follow the rule for shapes in multiplication: (n by 1) times
(1 by n). The product of column times row is n by n.
Rows and Columns of AB
In the big picture, A multiplies each column of B. The result is a column of AB. In
that column, we are combining the columns of A. Each column of AB is a combination of the columns of A. That is the column picture of matrix multiplication:
Column of AB is (matrix A) times (column of B).
The row picture is reversed. Each row of A multiplies the whole matrix B. The
result is a row of AB. It is a combination of the rows of B:
[row i of A]
We see row
Lg
Ma Sy
YF 3
operations in elimination
ag
=
[row i of AB].
©
(E times A).
We
see columns
in A times x.
The “row-column picture” has the dot products of rows with columns. Believe it or
not, there is also a “column-row picture.” Not everybody knows that columns of A
multiply rows of B to give the same answer AB. This is in the next example.
The Laws for Matrix Operations
May I put on record six laws that matrices do obey, while emphasizing an equation
they don’t obey? The matrices can be square or rectangular, and the laws involving
A+ B are all simple and all obeyed. Here are three addition laws:
A+B=B+A
(commutative law)
c(A+ B)=cA+cB
(distributive law)
A+(B+C)=(A+8B)+C
(associative law).
You know the right sizes for AB: (m by n) multiplies (n by p) to produce (m by p).
Three more laws hold for multiplication, but AB = BA is not one of them:
ABABA
(the commutative “law” is usually broken)
C(A+ B)=CA+CB
(A+ B)C =AC+
(distributive law from the left)
BC
A(BC) = (AB)E
(distributive law from the right)
(associative law)(parentheses not needed).
When A and B are not square, AB is a different size from BA. These matrices can’t
be equal—even if both multiplications are allowed. For square matrices, almost any
example shows that AB is different from BA:
eee]
core ales
apa
2.4
It is true that AJ = JA. All square matrices commute
these matrices c/ commute with all other matrices.
The law A(B + C) = AB+
Rules for Matrix Operations
57
with J and also with c/. Only
AC is proved a column at atime. Start with A(b+ cc) =
Ab + Ac for the first column. That is the key to everything—linearity. Say no more.
The law A(BC) = (AB)C means that you can multiply BC first or AB first.
The direct proof is sort of awkward (Problem 16) but this law is extremely useful. We
highlighted it above; it is the key to the way we multiply matrices.
Look at the special case when A = B = C = square matrix. Then (A times A*) =
(A? times A). The product in either order is A*. The matrix powers A” follow the same
rules as numbers:
A? = AAA---A (p factors)
(A?)(A?) = A?T4
(AP)? = AP4,
Those are the ordinary laws for exponents. A* times A* is A’ (seven factors).
A? to the fourth power is A!* (twelve A’s). When p and gq are zero or negative these
rules still hold, provided A has a “—1 power”—which is the inverse matrix A~'. Then
A° = / is the identity matrix (no factors).
For a number, a! is 1/a. For a matrix, the inverse is written A~!. (It is never
I/A, except this is allowed in MATLAB). Every number has an inverse except a = 0.
To decide when A has an inverse is a central problem in linear algebra. Section 2.5
will start on the answer. This section is a Bill of Rights for matrices, to say when A
and B can be multiplied and how.
Block Matrices and Block Multiplication
We have to say one more thing about matrices. They can be cut into blocks (which
are smaller matrices). This often happens naturally. Here is a 4 by 6 matrix broken
into blocks of size 2 by 2—and each block is just /:
1. “Oerecouler
ri
aL
Ae
| (ea
0 1
¢
Ora
= k jj
4
Lae
ae
00: | eel
If B is also 4 by 6 and its block sizes match the block sizes in A, you can add
a block at a time.
A+B
We have seen block matrices before. The right side vector b was placed next to
A in the “augmented
matrix.”
Then
A’ =[A _ b| has two blocks
Multiplying A’ by an elimination matrix gave [EA
blocks times blocks, when their shapes permit:
Eb].
of different sizes.
No problem to multiply
58
Chapter 2
Solving Linear Equations
2G Block multiplication If the cuts between columns of A match the cuts between
rows of B, then block multiplication of AB
is allowed:
Ai
"Ai2 ))Big
4
PA Bi + Aiea
al
Aa
Ag)
=.
Az By + AB.
+:
|Baie
This equation is the same as if the blocks were numbers
(1)
(which are | by | blocks).
We are careful to keep A’s in front of B’s, because BA can be different. The cuts
between rows of A give cuts between rows of AB. Any column cuts in B are also
column cuts in AB.
Main point When matrices split into blocks, it is often simpler to see how they act.
The block matrix of /’s above is much clearer than the original 4 by 6 matrix A.
Example 3 (Important) Let the blocks of A be its columns. Let the blocks of B be
its rows. Then block multiplication AB is columns times rows:
|
AB
=
aj
|
oo
BST
iid
an
|
tea
:
=
2E9)
a,b,
se -+>
+ anby,
.
(2)
eRe
This is another way to multiply matrices! Compare it with the usual rows times columns.
Row | of A times column 1 of B gave the (1,1) entry in AB. Now column 1 of A
times row | of B gives a full matrix—not just a single number. Look at this example:
elle
fo
he o|+15 a
3)
We stop there so you can see columns multiplying rows. If a 2 by 1 matrix (a column)
multiplies a 1 by 2 matrix (a row), the result is 2 by 2. That is what we found. Dot
products are “inner products,” these are “outer products.”
When you add the two matrices at the end of equation (3), you get the correct
answer AB. In the top left corner the answer is 3 +4 = 7. This agrees with the
row-column dot product of (1,4) with (3, 1).
Summary The usual way, rows times columns, gives four dot products (8 multiplications). The new way, columns times rows, gives two full matrices (8 multiplications).
The eight multiplications and four additions are all the same. You just execute them
in a different order.
Example 4 (Elimination by blocks) Suppose the first column of A contains 2, 6, 8. To
change 6 and 8 to O and 0, multiply the pivot row by 3 and 4 and subtract. Those
2.4
Rules for Matrix Operations
59
steps are really multiplications by elimination matrices E2, and £3}:
OF
E>
=
|-3
1
Op
5 Olde
0
and
Le
05
aEO
0
1
0
—4
0
1
E3) ae
The “block idea” is to do both eliminations with one matrix E. That matrix clears out
the whole first column below the pivot a = 2:
ie
Li
—4
COR
0)
le)
0
Ose
multiplies A to give
EA=|0O
1
x
Otte
Oa
x
x
Block multiplication gives a formula for those x’s in EA. The matrix A has four blocks
a, b, c, D: the pivot, the rest of row
1, the rest of column
1, and the rest of the matrix.
Watch how FE multiplies A by blocks:
kevetallearcleleaneyll
1
|9
a|
5
—c/a
|I
c |D
a
|
:
b
0 | D —cb/a
Elimination multiplies a by c/a and subtracts from c, to get zeros in the first column. It
multiplies the column vector c/a times the row vector b, and subtracts to get D—cb/a.
This is ordinary elimination, a column
=
at a time—written
in blocks.
REVIEW OF THE KEY IDEAS
is (row 7 of A)-(column
=®
1.
‘The (i, j) entry of AB
7 of B).
2.
Anm
3.
A times BC equals AB times C (surprisingly important).
4.
AB
5.
Block multiplication is allowed when the block shapes match correctly.
by n matrix times an n by p matrix uses mnp separate multiplications.
is also the sum of (column j of A)(row j of B).
Problem Set 2.4
Problems
1
1-17 are about the laws of matrix multiplication.
Suppose A is 3 by 5, B is 5 by 3, C is 5 by 1, and D is 3 by 1. Which of
these matrix operations are allowed, and what are the shapes of the results?
(a)
BA
60
Chapter 2
Solving Linear Equations
(b)
A(B+C)
(c)
ABD
(d)
AG+ BD
(ec)
ABABD.
What rows or columns and what matrices do you multiply to find
(a)
the third column of AB?
(b)
the first row of AB?
(c)
the entry in row 3, column 4 of AB?
(d)
the entry in row
1, column
| of
CDE?
Compute AB + AC and separately A(B + C) and compare:
=.
eS
|
and
0
ae;
2
i
and
Sia
ea,
wl
al
In Problem 3, multiply A times BC. Then multiply AB times C.
Compute A? and A?. Make a prediction for A> and A”:
IL
a=|)
a)
|
and
DD
Sel
al
Show that (A + B) is different from A? + 24B + B2, when
ded)
rel, A
and
i
bee
@
alk
Write down the correct rule for (A + B)(A + B) = A* + AB +
+ B?.
True or false. Give a specific example when false:
(a)
If columns
| and 3 of B are the same, so are columns
(b)
If rows
1 and 3 of B are the same, so are rows
| and 3 of AB.
(c)
If rows
1 and 3 of A are the same, so are rows
| and 3 of AB.
@)
(GAB) =A" B*.
How
is each row of DA
and EA
JB ae0
B=,
‘| and
How
is each column of AD
1 and 3 of AB.
related to the rows of A, when
S (Ove!
eal; |
and AE
and
i Flay
a=|é
gb
Be
related to the columns of A?
2.4
Row | of A is added to row 2. This gives
added to column 2 to produce (EA)F:
Ik OU
BA=|
Na
(a)
| ae A) |
| of EA
is
b
ie]
a
a+b
(EA)F = (EA)|j Se.
atte
1 of A to column
1 of AF
Compare with (EA)F.
tion?
What law is or is not obeyed by matrix multiplica-
to row 2 by E(AF).
Row | of A is again added to row 2 to produce EA. Then F adds row 2 of EA
1. The result is F(EA):
es
Fea)
2
a
61
2 by AF, then add row
to row
11
Bb
Do those steps in the opposite order. First add column
(b)
10
EA below. Then column
ile Ae
A
ue
Rules for Matrix Operations
i
ae
=|
a
b
Ae
D242"
B2b Ard
Pala
eal
(a)
Do those steps in the opposite order: first add row 2 to row | by FA, then
add row | of FA to row 2.
(b)
What law is or is not obeyed by matrix multiplication?
(3 by 3 matrices) Choose
(a)
BA=4A
(b)
BA=4B
(c)
BA has rows
(d)
All rows of BA
B so that for every matrix A (if possible)
| and 3 of A reversed
are the same as row
Suppose AB = BA and AC = CA
oi.
a=|*
16)
i
commutes
| of A.
for these two particular matrices B and C:
f
with
Fl
If
(0)
i
and
®
mal
jl
al
Prove that a = d and b = c = 0. Then A is a multiple of 7. The only matrices that
commute with B and C and all other 2 by 2 matrices are A = multiple of /.
13
Which of the following matrices are guaranteed to equal (A— B)?:
14
True or false:
(B—A)?,
A*—2AB+B*,
A(A—B)-—B(A-—B),
(a)
If A is defined then A is necessarily square.
(b)
If AB
(c)
If AB and BA are defined then AB and BA are square.
(d)
If
and BA
AB= B then
are defined then A and B are square.
A=T.
A — B?,
A*—AB-—BA+B??
62
Chapter 2
15
If A is m by n, how many separate multiplications are involved when
16
Solving Linear Equations
(a)
A multiplies a vector x with n components?
(b)
A multiplies an n by p matrix B?
(c)
A multiplies itself to produce A*? Here m =n.
To prove that (AB)C = A(BC),
use the column
vectors bj,...,b,
of B. First
suppose that C has only one column c with entries cj,..., Cn:
AB has columns
Ab),..., Ab, and Bc has one column c,bj +--- + cybn.
Then (AB)c = cj Ab, +---+c,Ab,
while A(Bc) = A(cjb) +--+: + cnbn).
Linearity makes those last two equal: (AB)c =
other
17.
. The same is true for all
of C. Therefore (AB)C = A(BC).
For
A=[274] and B =[1?
(a)
column 2 of AB
(b)
row 2 of AB
(c)
(d)
row 2 of AA = A?
row 2 of AAA = A?.
4], compute these answers and nothing more:
Problems 18-20 use a;; for the entry in row i, column j of A.
18
19
20
Write down the 3 by 3 matrix A whose entries are
(a)
aj =it+]
(b)
ay =)
(c)
ay =i/j.
What words would you use to describe each of these classes of matrices? Give
a 3 by 3 example in each class. Which matrix belongs to all four classes?
(a)
aj; =Oifi
xs
(b)
aj
(C)
aij = Qji
(d)
aQjj =
=Oifi<j
alj-
The entries of A are a;;. Assuming that zeros don’t appear, what is
(a)
the first pivot?
(b)
the multiplier of row 1 to be subtracted from row 3?
(c)
the new entry that replaces a32 after that subtraction?
(d)
the second pivot?
2.4
Rules for Matrix Operations
63
Problems 21-25 involve powers of A.
21
Compute
A’, A®, A* and also Av, Av, A>v, A‘v for
and
v=
©
oo
oOo
©
©
=
22
mh
&SN
Find all the powers A?, A>,... and AB, (AB)?,... for
Hohe
Bei
i
ies E z| pas
23
24
25
0
k il
By trial and error find 2 by 2 matrices (of real numbers) such that
(a)
A? =-I]
(b)
BC =-—CB
(a)
Find a nonzero matrix A for which A = 0.
(b)
Find a matrix that has A? 40 but A? = 0.
(not allowing BC = 0).
By experiment with n = 2 and n = 3 predict A” for
a=)
| and a=
| and |
al
Problems 26-34 use column-row multiplication and block multiplication.
26
Multiply AB using columns times rows:
1X0
ap=|2 4]/
> atte
Zh
+
1
p24) oe cheba
2
=
The product of upper triangular matrices is upper triangular:
AB =
Bae
2%
|,0 2x, <x
eta:
Row times column is dot product
ae
aeet,
OES
eee
O05 005, x
0
Oa0
(Row 2 of A)-(column
| of B) = 0. Which
other dot products give zeros?
Column times row is full matrix
Draw x’s and 0’s in (column 2 of A) (row 2
of B) and in (column 3 of A) (row 3 of B).
64
Chapter 2
28
Draw the cuts in A (2 by 3) and B (3 by 4) and AB to show how each of the
four multiplication rules is really a block multiplication:
Solving Linear Equations
(1)
Matrix A times columns of B.
(2)
Rows of A times matrix B.
(3)
Rows of A times columns of B.
(4)
Columns
of A times rows of B.
29
Draw the cuts in A and x to multiply Ax a column at a time: x; times column
| aca
30
Which matrices E2,; and E3, produce zeros in the (2,1) and (3, 1) positions of
E,,A and E3,;A?
A=
Find the single matrix
31
In Problem
I
Ole
se ik
eb
a
b
Fle g eed:
30, what are c and D and what is D — cb/a?
With i* = —1, the product of (A+iB) and (x+iy) is Ax+iBx+iAy—By. Use
blocks to separate the real part without i from the imaginary part that multiplies /:
Ay
te
33
E = £3; F2, that produces both zeros at once. Multiply EA.
Block multiplication says in Example 4 that
EA
32
AAO
=2
Oo
4
S68
Dies
SB
ie)
ethene’
OtPAN Se yo erealepart
?
imaginary part
Each complex multiplication (a +ib)(c + id) seems to need ac, bd,ad, bc. But
notice that ad + bc = (a + b)(c + d) — ac — bd; those 3 multiplications are
enough. For matrices A, B,C, D this “3M method” becomes 3n? versus 4n?.
Check additions:
(a)
How many additions for a dot product? For n by n matrix multiplication?
(b)
(Old method) How many additions for
(c)
(3M method) How many additions for R and § = (A+ B)(C+D)—AC—
BD? For n = 2 the multiplications are 24 versus 32. Additions are 32
versus 24. For n > 2 the 3M method clearly wins.
R= AC—BD
and
§ = AD+ BC?
2.4
34
Rules for Matrix Operations
65
Suppose you solve Ax = b for three special right sides b:
bes
1
\n0
0
and
0
1
0
0
and
1
If the three solutions x, x2, x3 are the columns of a matrix X, what is A times X?
35
‘If the three solutions in Question 34 are x; = (1,1, 1) and x2 = (0,1, 1) and
x3 = (0,0, 1), solve Ax = b when b = (3,5, 8). Challenge problem: What is A?
66
Chapter 2
Solving Linear Equations
INVERSE MATRICES m 2.5
Suppose A is a square matrix. We look for a matrix A~! of the same size, such that
A!
times A equals
I. Whatever
A does, A~!
undoes.
Their product is the identity
matrix—which does nothing. But this “inverse matrix” might not exist.
What a matrix mostly does is to multiply a vector x. Multiplying Ax = b by A7!
gives A~'Ax = A~'b. The left side is just x! The product A~'A is like multiplying
by a number and then dividing by that number. An ordinary number has an inverse if
it is not zero—matrices are more complicated and more interesting. The matrix Aq!
is called “A inverse.”
Not all matrices have inverses. This is the first question we ask about a square matrix:
Is A invertible? We don’t mean that our first calculation is immediately to find A7!.
In most problems we never compute it! The inverse exists if and only if elimination
produces n pivots. (This is proved below. We must allow row exchanges to help find
those pivots.) Elimination solves Ax = b without knowing or using A~!. But the idea
of invertibility is fundamental.
DEFINITION
The matrix A is invertible if there exists a matrix A~! such that
A1A=I
and
AA
=I.
(1)
Note 1 The matrix A cannot have two different inverses. Suppose BA = / and also
AC =I. Then B =C, according to this “proof by parentheses”:
BAC) S(BAjC
gives
BI=IC
or
B=
C.
(2)
This shows that a left-inverse B (multiplying from the left) and a right-inverse C (multiplying A from the right to give AC = /) must be the same matrix.
Note 2 If A is invertible then the one and only solution to Ax = b is x = A~'b:
Multiply
Ax=b
by
A!
Then
x =A7!Ax = A7'b.
Note 3 (Important) Suppose there is a nonzero vector x such that Ax = 0. Then
A cannot have an inverse. No matrix can bring 0 back to x. If A is invertible, then
Ax = 0 has the one and only solution x = 0.
Note 4 A 2 by 2 matrix is invertible if and only if ad — bc is not zero:
a
bl
Ie |
1
Wiese
dveb
be
al
(3)
2.5
Inverse Matrices
67
This number ad — bc is the determinant of A. A matrix is invertible if its determinant
is not zero (Chapter 5). The test for n pivots is usually decided before the determinant
appears.
Note 5 A diagonal matrix is invertible when no diagonal entries are zero:
dj
If A=
1/dy
tag
then
A’)=
4
dn
1/dy
Note 6 The 2 by 2 matrix A = [12] is not invertible. It fails the test in Note 4,
because ad — bc equals 2 — 2 = 0. It fails the test in Note 3, because Ax = 0 when
x = (2,—1). It also fails to have two pivots. Elimination turns the second row of A
into a zero row.
The Inverse of a Product
For two nonzero numbers a and b, the sum a +b might or might not be invertible.
The numbers a = 3 and b = —3 have inverses 4 and —i. Their sum a + b = 0 has
no inverse. But the product ab = —9 does have an inverse, which is 4 times —f.
For two matrices A and B, the situation is similar. It is hard to say much about
the invertibility of
A+ B. But the product AB has an inverse, whenever the factors A
and B are separately invertible (and the same size). The important point is that A7!
and B~! come in reverse order:
2H
If A and B are invertible then so is AB.
(QU
The inverse of AB
a
is
(4)
To see why the order is reversed, start with AB. Multiplying on the left by A~! leaves
B. Multiplying by B~! leaves the identity matrix /:
B-'A'AB_
equals
B'IB=B'B=I.
Similarly AB times B~'A~! equals AIA~'! = AA~! =]. This illustrates a basic rule
of mathematics: Inverses come in reverse order. It is also common sense: If you put
on socks and then shoes, the first to be taken off are the
. The same idea applies
to three or more matrices:
ARGs
=C- BAS
(5)
68
Chapter 2
Solving Linear Equations
Example 1 If E is the elementary matrix that subtracts 5 times row
then E~! adds 5 times row | to row 2:
E=|—5
Lo. 0 0)
~1 “0
OQ: oat
and
1 from row 2,
iP Om)
EF'=|5
1 °0
Gorm
Multiply EE~! to get the identity matrix 7. Also multiply E~'E to get J. We are
adding and subtracting the same 5 times row
(this is
1. Whether we add and then subtract
EE~') or subtract first and then add (this is E —lE), we are back at the start.
For square matrices, an inverse on one side is automatically an inverse on the other side.
If AB =/
then automatically BA = /. In that case B is A~!. This is very useful to
know but we are not ready to prove it.
Example 2 Suppose F subtracts 4 times row 2 from row 3, and F~! adds it back:
Fe
i
0
Oe 12s Oe beanie
Omens ial
00
tie ah Oahu)
On Aviel
Again FF~!' =] and also F~'F = 1. Now multiply F by the matrix E in Example 1
to find FE. Also multiply E~! times F~! to find (FE)~!. Notice the order of those
inverses!
FE=|
ee
—-5 1
8
O|
20
1
-4
isinverted by
E-'F-'=|]
ee
5
Pe
1 O}.
0
4
(6)
1
This is strange but correct. The product FE contains “20” but its inverse doesn’t. You
can check that FE times E~'F—! gives /.
There must be a reason why 20 appears in FE but not in its inverse. E subtracts
5 times row | from row 2. Then F subtracts 4 times the new row 2 (which contains
—5 times row 1) from row 3. In this order FE, row 3 feels an effect from row 1.
In the order E~'F~!, that effect does not happen.
row 3. After that,
E~! adds 5 times row
First F~! adds 4 times row 2 to
1 to row 2. There is no 20, because row 3
doesn’t change again. The example makes two points:
1
Usually we cannot find A~! from a quick look at A.
2
When elementary matrices come in their normal order FE, we can find the product of inverses
E~'F~! quickly. The multipliers fall into place below the diagonal.
This special property of E~'F~! and
E~'F~'G7! will be useful in the next section.
We will explain it again, more completely. In this section our job is A~!, and we
expect some serious work to compute it. Here is a way to organize that computation.
2.5
Inverse Matrices
69
The Calculation of A~! by Gauss-Jordan Elimination
I hinted that A~! might not be explicitly needed. The equation Ax = Bb is solved by
x = A~'b. But it is not necessary or efficient to compute A~! and multiply it times
b. Elimination goes directly to x. In fact elimination is also the way to calculate A~!,
as we now show.
The idea is to solve AA~! = J a column at a time. A multiplies the first column
of A~! (call that x1) to give the first column of / (call that e;). This is our equation
Ax, =e . Each column of A~! is multiplied by A to produce a column of /:
AA = Alm x2 x3|]=[e1 e2 €3] =F
(7)
To invert a 3 by 3 matrix A, we have to solve three systems of equations:
and Ax7 = e2 and Ax3 = e3. Then x1, x2, x3 are the columns of A~!.
Ax; = e;
Note This already shows why computing A~! is expensive. We must solve n equations for its n columns. To solve Ax = b directly, we deal only with one column.
In defense
of A~!, we
want to say that its cost is not n times the cost of one
system Ax = b. Surprisingly, the cost for n columns is only multiplied by 3. This
saving is because the n equations Ax; = e; all involve the same matrix A. When we
have many different right sides, elimination only has to be done once on A. Working
with the right sides is relatively cheap. The complete A~! needs n> elimination steps,
whereas solving for a single x needs n?/3. The next section calculates these costs.
The Gauss-Jordan method computes A~! by solving all n equations together. Usually
the “augmented matrix” has one extra column b, from the right side of the equations.
Now we have three right sides e;, e2, e3 (when A is 3 by 3). They are the columns of
I, so the augmented matrix is really just [A J]. Here is a worked-out example when
A has 2’s on the main diagonal and —1’s next to the 2’s:
29101
0
eer O 61
ee
0
«(CO
Diet
05°
0
[A e; e2 e3|=|
=
>|
We are halfway.
0
5.02
1..
3 —1
5
1
0
O19)
2-7
0)
Oe
a
moe
oe
il
a
()
3
j
1
0
3-1
0
0
i
€ row
1+ row
2)
(3 row
2+ row
3)
3 laee
The matrix in the first three columns is now
The pivots 2, 3, ; are on its diagonal.
Gauss
would
U (upper triangular).
finish by back substitution.
The
contribution of Jordan is to continue with elimination, which goes all the way to the
70
Chapter 2
Solving Linear Equations
“reduced echelon form’. Rows are added to rows above them, to produce zeros above
the pivots:
2
>]
-1
0
OE
3
0
1
Orr
)
Tes
4
3
3
3
;
1
€ row
1
2
(3 row 2+row 1)
20s
Ls
linOale
Sr
OG a
ee
Pe
OF
;
bAOkewiel
ecigdl tae
3+ row
2)
nee
Now divide each row by its pivot. The new pivots are 1. The result is to produce
I in the first half of the matrix. (This reduced echelon form is R = J because A is
invertible.) The columns of A~' are in the second half.
3
Lees
(divide by 3)
0.
sirlelign0
See
divide
(divide
Oe
re
41
byby 43)
es)
1
(divide by 2)
|
Starting from the 3 by 6 matrix [A
Gauss-Jordan process on one line:
Multiply
Beak
ei
1
|= [fF x1 x2 3].
:1
33
/], we have reached [ / A~!]. Here is the whole
[A I] by
A
toget
[I Aq'].
The elimination steps gradually end up with the inverse matrix. Again, we probably don’t
need A~! at all. But for small matrices, it can be very worthwhile to know the inverse.
We add three observations about this particular A~! because it is an important example.
We introduce the words symmetric, tridiagonal, and determinant:
1
A is symmetric across its main diagonal. So is A7!.
2
The matrix A is tridiagonal (three nonzero diagonals). But A~! is a full matrix
with no zeros. That is another reason we don’t often compute A~!.
3
The product of the pivots is 2(3)(4) = 4. This number 4 is the determinant of
A. Chapter 5 will show why A7~! involves division by 4:
The inverse above is
A!
1
= A
4
AsO aHe
NO
me
WwW
3
(8)
Example 3 Find A~! by Gauss-Jordan elimination with A = [23].
so, 3: ee
153.61920
be ul=|[5 i. 0 > [a 2
|
|
ow
aaaat
l=
©—_=
| NN
W
—
sl
or
=o
a
|
lI
|
=
wNnin mR
Dlw
—
—
>
|
J
2.5
The reduced echelon form of [A
determinant 2.7 —3-4=2.
J] is [Z
Inverse Matrices
71
A~!]. This A7! involves division by the
The code for X = inverse(A) has three important lines!
I = eye(n,
n);
R=rref({A
/)});
X=RE,n+1:n+n)
The last line discards columns
| to n, in the left half of R. It picks out X = Aq!
from the right half. Of course A must be invertible, or the left half will not be /.
Singular versus invertible
We come back to the central question. Which matrices have inverses? The start of this
section proposed the pivot test: A~! exists exactly when A has a full set of n pivots.
(Row exchanges are allowed!) Now we can prove that fact, by Gauss-Jordan elimination:
1
With n pivots, elimination solves all the equations Ax; = e;. The columns x; go
into A~'. Then AA~! = J and A™! is at least a right-inverse.
Z
Elimination is really a long sequence of matrix multiplications:
(Drie
cha Pe
ARE
D~! divides by the pivots. The matrices E produce zeros below and above the pivots.
The matrices P exchange rows (details in Section 2.7). The product matrix in parentheses is evidently a left-inverse that gives A~'A = J. It equals the right-inverse by
Note 1, in this section. With a full set of pivots A is invertible.
The converse is also true. Jf A~! exists, then A must have a full set of n pivots.
Again the argument has two steps:
1
If A has a whole row of zeros, the inverse cannot exist. Every matrix product
AB
Z
will have that zero row, and cannot be equal to /.
If a column has no pivot, elimination will eventually reach a row of zeros:
“hy
OP
32 Be
hwo,
OE
Oe
ey,
reduces to
ah 3e
TO
38
car
0
0
0
We have reached a matrix with no inverse. Since each elimination step is invertible,
the original A was not invertible. Without pivots AW! fails.
2I Elimination gives a complete test for A~! to exist. There must be n pivots.
72
Chapter 2.
Solving Linear Equations
Example 4 If L is lower triangular with 1’s on the diagonal, so is L~!.
Use the Gauss-Jordan method to construct L~!. Start by subtracting multiples of pivot
rows from rows below. Normally this gets us halfway to the inverse, but for L it gets
us all the way. This example is typical:
fe
10k
oil
Mai
TRO! hp
1,
BOF
4
5
I
“On
OmG
Ie
sl ae OT
onal
]
0 -—3
eS
1-4
[Zs deta
—>10
lO
eer
QE
SOO
When
SO°°@
yey
jr
*0 PSSat
OL
A
L goes to J by elimination,
elimination
matrices
E32E3,E2;
20
20
werent)
1
0
0
1
ah
ee)
=s
1
CER
(3 times row
(4 times row
ON ts
J goes to L~!.
is L~!.
| from row 2)
| from row 3)
erie
In other words, the product of
All the pivots are
1’s (a full set).
Lis
lower triangular.
=
REVIEW OF THE KEY IDEAS
#8
1.
The inverse matrix gives AA~! = J and A~'A=J.
De
The inverse of AB
a
If Ax = 0 for a nonzero
4.
The Gauss-Jordan method solves AA~!
row-reduced to es A \
5.
A is invertible if and only if it has n pivots (row exchanges allowed).
is B~!A7!,
vector x, then A has no inverse.
= J. The augmented matrix [A I ] is
Problem Set 2.5
1
Find the inverses (directly or from the 2 by 2 formula) of A, B, C:
A,
Ones
al and
an)
B=|; |
and
3
ory.
dl
al
2.5.
Inverse Matrices
73
For these “permutation matrices” find P~! by trial and error (with 1’s and 0s):
Gb)
On-O0 i
MOIR
0)
ik SOR)
ands
Oe!
40
70 0 at
ee (Oe
f=
Solve for the columns of A7! = [3 Hil:
[oo so][¢]=[o] o»# [20 so][e]=[i]
Show that Bea has no inverse by trying to solve for the columns
mee
(a,c) and
Pager
Find three 2 by 2 matrices (not
A = /) that are their own inverses:
(a)
If A is invertible and AB = AC, prove quickly that
(D)
elite Ay — & rs find two matrices
A? = /.
B = C.
B 4 C such that AB = AC.
(Important) If the 3 by 3 matrix A has row
not invertible:
| + row 2 = row 3, show that it is
(a)
Explain why Ax = (1,0,0) cannot have a solution.
(b)
Which
(c)
What happens to row 3 in elimination?
right sides (5), b2, b3) might allow a solution to Ax = b?
Suppose A is invertible and you exchange its first two rows.
Is the new matrix
B invertible and how would you find B~! from A7!?
Find the inverses (in any legal way) of
Oin0: 00.2
003 0
NW
ee
Pees
5 01070
10
11
3200
430 0
lip ms
Gun Ture
(a)
Find invertible matrices
(b)
Find singular matrices A and B such that A + B is invertible.
If the product
A and B such that A + B is not invertible.
C = AB is invertible (A and B are square), then A itself is in-
vertible. Find a formula for A~! that involves C~! and B.
iy
If the product M = ABC of three square matrices is invertible, then B is invertible. (So are A and C.) Find a formula for B~! that involves M~! and A and C.
74
Chapter 2
13
If you add row 1 of A to row 2 to get B, how do you find B~! from “Aa?
Solving Linear Equations
The inverse of
B=
i i
A
is
:
14
Prove that a matrix with a column of zeros cannot have an inverse.
15
Multiply [#5] times [_¢->].
What is the inverse of each matrix if ad 4 bc?
Cc
16
(a)
(b)
What single matrix E has the same effect as these three steps?
Subtract
row | from row
row 3.
2 from
2, subtract row
3, then subtract row
What single matrix L has the same effect as these three steps? Add row 2
to row 3, add row
Urs
| from row
| to row 3, then add row
If the 3 by 3 matrix A has column
not invertible:
1 + column
(a)
Find a nonzero
(b)
Does elimination keep column
there is no third pivot.
| to row 2.
2 = column
solution to Ax = 0.
1 + column 2 = column 3? Explain why
18
If B is the inverse of A, show that AB
19
Find the numbers a and b that give the correct inverse:
is the inverse of A.
=I
ES
St) 14S)
Sy ee eee Se
aan eoiy iianad
20
3, show that it is
abd:
bea bab
ee ae:
bap bitrate
There are sixteen 2 by 2 matrices whose entries are 1’s and 0’s. How many of
them are invertible?
Questions 21-27 are about the Gauss-Jordan method for calculating A~!.
21
Change / into A!
as you reduce A to J (by row operations):
[4 l=[,3 01] ™ 4 =[5 3 041
22
Follow the 3 by 3 text example but with plus signs in A. Eliminate above and
below the pivots to reduce [A
epee
J] to
2
ee
0
R=[J
A™!];:
SS
= ©
NR
So
=
0
0
©
or
l
2.5
23
Use Gauss-Jordan
Me
Multiply D-'!Ej2E
and
29
ly elt 8)
A SIO
Oi sO
@¢
©
i
Zl
ul
hilt
2a aad
1
Shige
cA
and,
2A
Lisl
ile 2
2 3
Exchange rows and continue with Gauss-Jordan to find A7!:
peste
True or false (with a counterexample if false):
(a)
A 4 by 4 matrix with __
zeros 1s not invertible.
(b)
Every matrix with 1’s down the main diagonal is invertible.
(c)
If A is invertible then A~! is invertible.
(d)
If A is invertible then A? is invertible.
For which numbers c is this matrix not invertible, and why not?
A=
22a
|uCe aC
a.C
eae
30
@
I
@
to find A7!.
Cea
28
lL
||M
OO
Invert these matrices by the Gauss-Jordan method starting with [A / ]:
7M
2a.
ae || =
What three matrices Ep; and E}2 and D~! reduce A = [}2] to the identity
matrix?
26
Bey
Find A7! (if it exists) by Gauss-Jordan elimination on [A J ]:
2 a lyk
7
Wachee tea|
jE
2
25
75
elimination with J next to A to solve AA~! =7:
ab
Oia
e
O @ i
24
Inverse Matrices
Prove that this matrix is invertible if a
#0 and a £b:
76
Chapter 2
31
This matrix has a remarkable inverse.
and multiply AA! to confirm:
Solving Linear Equations
Find A~! and guess the 5 by 5 inverse
ea
2]
32
Use the inverses in Question 31 to solve Ax = (1, 1, 1, 1) and Ax = (1, 1, 1, 1, 1).
33
The Hilbert matrices have aj; = 1/(i + j — 1). Ask MATLAB for the exact
inverse invhilb(6). Then ask for inv(hilb(6)). How can these be different, when
the computer never makes mistakes?
34
beakeear ear,
Find the inverses (assuming they exist) of these block matrices:
2.6
ELIMINATION
Elimination
= Factorization:
A = LU
FET
= FACTORIZATION: A = LU ao 2.6
Students often say that mathematics courses are too theoretical. Well, not this section.
It is almost purely practical. The goal is to describe Gaussian elimination in the most
useful way. Many key ideas of linear algebra, when you look at them closely, are
really factorizations of a matrix. The original matrix A becomes the product of two
or three special matrices. The first factorization—also the most important in practice—
comes now from elimination. The factors are triangular matrices. The factorization
is A= LU.
Start with a 2 by 2 example. The matrix A contains 2, 1,6,8. The number to
eliminate is 6. Subtract 3 times row | from row 2. That step is E2; in the forward
direction. The return step from U to A is Ey (an addition):
Forward to U :
Back from U to A:
LO
De
Ena=|_;
alle leg
|
alse
EU
4 = A.
=
E a
fel
The second line is our factorization. Instead of A = Ea we write A = LU. Move
now to larger matrices with many E’s. Then L will include all their inverses.
Each step from A to U multiplies by a matrix E;; to produce zero in the (i, /)
position. The row exchange matrices P;; move nonzeros up to the pivot position. A
long sequence of E’s and P’s multiplies A to produce U. When we put each E~! and
P~—' on the other side of the equation, they multiply U and bring back A:
A= (Ee
POSE
Ne:
(1)
This factorization has too many factors. The matrices E~! and P~! are too simple.
We will combine them into a single matrix L which inverts all the elimination steps
at once. Another matrix P accounts for all the row exchanges. (It comes in the next
section.) Then A is built out of L, P, and U.
To keep this clear, we want to start with the most frequent case—when no row
exchanges are involved. If A is 3 by 3, we multiply it by £2; and £3; and E37. That
produces zero in the (2, 1) and (3, 1) and (3, 2) positions—all below the diagonal.
end with the upper triangular
inverses multiply U:
(£32E31E21)A=U
becomes
We
U. Now move those E’s onto the other side, where their
A=(E5)'E3)'
Ey )U
whichis
A=LU.
(2)
The inverses go in opposite order, as they must. That product of three inverses is L.
We have reached A = LU. Now we stop to understand it.
First point:
Every inverse matrix Ey is lower triangular. Its off-diagonal entry is ¢;;,
to undo the subtraction in £;; with —€;;. The main diagonals of E and E~!
all 1’s.
contain
78
Chapter 2
Solving Linear Equations
Second point: The product of E’s is still lower triangular. Equation (2) shows a lower
triangular matrix (in the first parentheses) multiplying A. It also shows a lower triangular matrix (the product of Ee multiplying U to bring back A. This product of
inverses is L.
There are two good reasons for working with the inverses. One reason is that
we want to factor A, not U. The “inverse form” of equation (2), gives
A= LU.
second reason is that we get something extra, almost more than we deserve.
the third point, which shows that L is exactly right.
The
This is
Third point: Each multiplier /;; goes directly into its 7, 7 position—unchanged—in the
product L. Usually a matrix multiplication will mix up all the numbers. Here that
doesn’t happen. The order is right for the inverse matrices, to keep the /’s unchanged.
The reason is given below.
Since each E~! has 1’s down its diagonal, the final good point is that L does too.
2) (A = LU) This is elimination without row exchanges. The upper triangular U
has the pivots on its diagonal. The lower triangular L has 1’s on its diagonal. The
multipliers /;; are below the diagonal.
Example
row
1 For the matrix A with 1, 2, 1 on its diagonals, elimination subtracts : times
| from row 2. Then /2; = 5. The last step subtracts : times row 2 from row 3:
25m las
ge. Cadi Onele ae Ua
Are 2
ee, Ble OO een eae
Os Irae
O° s) 1 }owo
The (3, 1) multiplier is zero because the (3, 1) entry in A is zero. No operation needed.
Example 2 Change the top left entry from 2 to 1. The multipliers all become
pivots are all 1. That pattern continues when A is 4 by 4 or n by n:
as bel AK)
1
ae Saas a
Fi
(pa eed Na paca CC
(nme tee.
0 0
1
1. The
i? *17'0t6
elle’
alk
hes ig al
1
These LU examples are showing something extra, which is very important in practice.
Assume no row exchanges. When can we predict zeros in L and U?
When a row of A starts with zeros, so does that row of L.
When a column of A starts with zeros, so does that column of U.
If a row starts with zero, we don’t need an elimination step. The zeros below the
diagonal in A gave zeros in L. The multipliers /;; were zero. That saves computer
2.6
Elimination = Factorization:
A = LU
79
time. Similarly, zeros at the start of a column survive into U. But please realize:
Zeros in the middle of a matrix are likely to be filled in, while elimination sweeps
forward.
We now explain why L has the multipliers /;; in position, with no mix-up.
The key reason why A equals LU: Ask yourself about the rows that are subtracted
from lower rows. Are they the original rows of A? No, elimination probably changed
them. Are they rows of U? Yes, the pivot rows never get changed again. For the third
row of U, we are subtracting multiples of rows 1 and 2 of U (not rows of A!):
Row
3 of U = row 3 of A —13;(row
1 of U) — /32(row 2 of U).
(3)
Rewrite that equation:
Row 3 of A = 13; (row 1 of U) + 132(row 2 of U) + l(row 3 of U).
(4)
The right side shows the row [/3; 1/32 1] multiplying U. This is row 3 of
A= LU.
All rows look like this, whatever the size of A. With no row exchanges, we have
Al—re
We
Remark The LU factorization is “unsymmetric” in one respect. U has the pivots on
its diagonal where L has 1’s. This is easy to change. Divide U by a diagonal matrix
D that contains the pivots. Then U equals D times a matrix with 1’s on the diagonal:
d\
1
do
uy2/d)
1
u43/d)
u23/d2
dn
l
It is convenient (but confusing) to keep the same letter U for this new upper triangular
matrix. It has 1’s on the diagonal (like L). Instead of the normal LU, the new form has
LDU: Lower triangular times diagonal times upper triangular:
The triangular factorization can be written
A=LU
or
A=LDU.
Whenever you see LDU, it is understood that U has 1’s on the diagonal. Each row is
divided by its first nonzero entry—the pivot. Then L and U are treated evenly. Here
is LU and then also LDU:
ill s}-B al? slo
The pivots 2 and 5 went into D. Dividing the rows by 2 and 5 left the rows [1
and [0 1] in the new U. The multiplier 3 is still in L.
4]
80
Chapter 2
Solving Linear Equations
One Square System = Two Triangular Systems
We emphasized that L contains our memory of Gaussian elimination. It holds all the
numbers that multiplied the pivot rows, before subtracting them from lower rows. When
do we need this record and how do we use it?
We need to remember L as soon as there is a right side b. The factors L and U
were completely decided by the left side (the matrix A). Now work on the right side.
Most computer programs for linear equations separate the problem Ax = b this way,
into two different subroutines:
1
Factor A (into L and U)
Z
Solve for x (from L and U and Bb).
Up to now, we worked on b while we were working on A. No problem with
that—just augment the matrix by an extra column. The Gauss-Jordan method for A7!
worked on n right sides at once. But most computer codes keep the two sides separate.
The memory of forward elimination is held in L and U, at no extra cost in storage.
Then we process b whenever we want to. The User’s Guide to LINPACK remarks that
“This situation is so common and the savings are so important that no provision has
been made for solving a single system with just one subroutine.”
How do we process b? First, apply the forward elimination steps (which are stored
in L). This changes b to a new right side c—we are really solving Lc = b. Then
back substitution solves Ux = c. The original system Ax = b is factored into two
triangular systems:
Solve
Lc=b_~
and then solve
Ux=c.
(6)
To see that x is correct, multiply Ux = c by L. Then LU equals A and Le equals b.
We have Ax = b.
To emphasize: There is nothing new about those steps. This is exactly what we
have done all along. Forward elimination changed Bb into c on the right side. We were
really solving the triangular system Le = b by “forward substitution.’ Then back substitution produces x. An example shows it all:
Example 3 Forward elimination ends at Ux = c:
¥
2
E28
We eer oreo,
bees
2u+2v=8
Sts
The multiplier was 2. The left side found L, while the right side solved Le = b:
=
Le=b
The lower triangular system
1 O
es
: ‘|| = |
:
SIV ESE
ps
— H :
2.6
Ux =c
The upper triangular system
Elimination
= Factorization:
i | | ee B
A = LU
CIVES
sk
81
H :
It is satisfying that L and U replace A with no extra storage space. The triangular
matrices can take the n* storage locations that originally held A. The /’s fit in below
the diagonal. The whole discussion is only looking to see what elimination actually
did.
The Cost of Elimination
A very practical question is cost—or computing time. Can we solve a thousand equations on a PC? What if n = 10,000? Large systems come up all the time in scientific
computing, where a three-dimensional problem can easily lead to a million unknowns.
We can let the calculation run overnight, but we can’t leave it for 100 years.
It is not hard to estimate how many individual multiplications go into elimination. Concentrate on the left side (the expensive side, where A changes to U). The
first stage produces zeros below the first pivot. To change an entry in A requires one
multiplication and subtraction. We will count this stage as n* multiplications and n?
subtractions. It is actually less, n* —1n, because row | does not change. Now column
IMistset.
The next stage clears out the second column below the second pivot. The matrix
is now of size n—1.
Estimate by (n— 1)2 multiplications and subtractions.
The matrices
are getting smaller as elimination goes forward. The rough count to reach U is the sum
n>+(n—1)*+---+274+2.
There is an exact formula
in(n + y(n + 1) for this sum
is large, the 4 and the | are not important.
of squares.
When
n
The number that matters is in, and it
comes immediately from the square pyramid in Figure 2.9. The base shows the n?
steps for stage 1. The height shows the n stages as the working matrix gets smaller.
The volume is a good estimate for the total number of steps. The volume of an n by
n by n pyramid is ine:
Elimination on A _requires about
in? multiplications
and in subtractions.
What about the right side b? The first stage subtracts multiples of b; from the
lower components
b2,..., bn. This is n — 1 steps. The second stage takes only n — 2
steps, because b, is not involved. The last stage of forward elimination takes one step
to produce c.
Now start back substitution. Computing x, uses one step (divide by the last
pivot). The next unknown uses two steps. The first unknown x; requires n steps (n—1
substitutions of the other unknowns, then division by the first pivot). The total count
on the right side—forward to the bottom and back to the top—is exactly a
(n— 1) ein— 2) fee PI 1424640
=-) +n =n’,
(7)
82
Chapter 2
Solving Linear Equations
~
i
+
nbyn
|
tL
*
—1 by n-1
ma
Ds
|
ls
Ii
/
=
3
\
n
f N
)(=
|
|
|
|
=
area n”
j
1
area (n—1)*
\
n—-1
ee
ee
ee
volume 5”
Figure 2.9 — Elimination needs about in? multiplications and subtractions.
count (n? —n)+---+(?-1)
=} 3
aren
Exact
in.
To see that sum, pair off (n—1) with 1 and (n—2) with 2. The pairings leave n terms
each equal to n. That makes n*. The right side costs a lot less than the left side!
Each right side (from b to c to x) needs
n* 2 multiplications and n* subtractions.
Here are the MATLAB codes to factor A into LU and to solve Ax = b. The
program slu stops if zero appears in a pivot position (that fact is printed). Later the
program plu will look down the column for a pivot, and execute a row exchange. These
M-files are on the web.
function [L, U] = slu(A)
% slu Square lu factorization with no row exchanges.
[n,n] = size(A);
tol = l.e — 6;
fork =n
if abs(A(k, k)) < tol
disp({/Small pivot in column ‘int 2 str(k)])
end % Cannot proceed without a row exchange: stop
Lik ky =":
fori=k+1:in
L(i,k) = A@i,k)/A(k,k);
=% The kth pivot is now A(k, k)
LORS
estkid=
ert
A(i, j) = A@, J) — LG, k) * AC, J);
end
end
for
= kin
U(k, j) = Atk, j); % Rename eliminated A to call it U
end
end
2.6
Elimination = Factorization:
A = LU
83
function x = slv(A, b)
% Solve Ax = b using L and U from slu(A). No row exchanges!
[L, U] = slu(A);
% Forward elimination to solve Le = b.
% L is lower triangular with 1’s on the diagonal.
[n,n] = size(A);
fOr keaton
for
j=1:k—1
s=s+LKk, j) *c(j);
end
c(k) = b(k) —s;
end
% Back substitution to solve
fork =n:-—1:1
for
Ux = c.
% Going backwards
to x(n),..., x(1)
j=kK+1cn
t=t+U{k,
j) *x(j);
end
x(k) = (c(k) — t)/U(K,k);
end
5 aes
How long does it take to factor A into LU? For a random matrix of order n =
100, we used the MATLAB command t = clock; lu(A); time(clock, t). The elapsed time
on a SUN Sparcstation 1 was one second. For n = 200 the elapsed time was eight
seconds. This follows the n° rule! The time is multiplied by 8 when n is multiplied
by 2.
A
According to this n> rule, matrices that are 10 times as large (order 1000) will
take 10° seconds. Matrices of order 10,000 will take (100)? seconds. This is very ex-
pensive but remember that these matrices are full. Most matrices in practice are sparse
(many zero entries). In that case A = LU is very much faster. For tridiagonal matrices
of order 10,000, solving Ax = b is a breeze.
=
REVIEW OF THE KEY IDEAS
&
1.
Gaussian elimination (with no row exchanges) factors A into L times U.
2.
The lower triangular L contains the multiples of pivot rows that are subtracted
going from A to U. The product LU adds those back to recover A.
84
Chapter 2
Solving Linear Equations
On the right side we solve Le = b (forward) and Ux = c (backwards).
There are 4(n3 —n) multiplication and subtractions on the left side.
There are n* multiplications and subtractions on the right side.
Problem Set 2.6
Problems 1-8 compute the factorization
1
A = LU (and also
A= LDU).
What matrix E puts A into triangular form EA = U? Multiply by
factor A into LU:
Ze VeRO)
Ae
ae
ee)
What two elimination matrices
E~! = L to
F2; and £32 put A into upper triangular form
E32 E>,A = U? Multiply by E3,' and E>,’ to factor A into E5,'£3,'U which is
LU. Find L and U for
L
Ava)
0
Brat nS
4 0
What three elimination matrices E2,, £3;, E32 put A into upper triangular form
E32 E3;E2,A = U? Multiply by 12eae ES and Ee to factor A into LU where
[= Engh
hE ebindn’ and.
Ly Ory3d
AS
2 2-2
3 eats
Suppose A is already lower triangular with 1’s on the diagonal. Then U = /!
The elimination matrices E2,, E3;, £32 contain —a
then —b then —c.
(a)
Multiply
£323; £2; to find the single matrix E that produces EA = J.
(b)
Multiply Ee Ban Ex to find the single matrix L that gives
A = LU (or L/).
2.6
Elimination = Factorization:
A = LU
85
When zero appears in a pivot position, the factorization A = LU is not possible! (We are requiring nonzero pivots in U.) Show directly why these are both
impossible:
01
2
men
3/=l1
illo
¢
Jeg
Lac)
1
OL
Le
Ql
eee
min
di
620g
fm
1
de
i
This difficulty is fixed by a row exchange.
Which number c leads to zero in the second pivot position? A row exchange is
needed and A = LU is not possible. Which c produces zero in the third pivot
position? Then a row exchange can’t help and elimination fails:
>
I
ARO
a
—
WN
What
new
are L and D for this matrix A?
U in
A=LDU?
What
is U in
A = LU
and what is the
8
)
Fs
OW
1A
Z
7
NandAN)
0
A and B are symmetric across the diagonal (because 4 = 4). Find their factor-
izations LDU
and say how U
a=|i
1s related to L:
it and,
>
Ler4oEO
=|4oi2) 4
4 0
B
I
fc Ss
Sc
8
ao SG
SS
Find four conditions on a, b,c,d to get
10
A= LU with four pivots.
Find L and U for the nonsymmetric matrix
A=
Gh
a
Gh
oe
ip
id
1
dey
iP
&
1
ik
is
al
Find the four conditions on a,b,c, d,r,s,t to get
A= LU with four pivots.
86
Chapter 2. Solving Linear Equations
Problems
11
11-12 use L and U (without needing A) to solve Ax = b.
Solve the triangular system Le = b to find c. Then solve Ux =c
10
nel) i
For safety find
12
Dh
B=;
(a)
ih
ee
ik
O@
15
2
el
ths
to find x. What was
othe a
eel
OmrOmeal
A?
ances
4
anda
Dias)
6
When you apply the usual elimination steps to L, what matrix do you reach?
1
0
@
|i
1
©
Ij,
ip
I
(b)
When you apply the same steps to 7, what matrix do you get?
(c)
When you apply the same steps to LU, what matrix do you get?
If
A= LDU
D=
D, and U = U\. “The factors are unique.”
(a)
Derive the equation jira aay: = D,U,U~!.
triangular or diagonal?
(b)
Show that the main diagonals in that equation give
ea) Bee
and also A = L; DU,
with all factors invertible, then
L = L, and
Are the two sides lower or upper
D =
D,.
Why does
Tridiagonal matrices have zero entries except on the main diagonal and the two
adjacent diagonals. Factor these into A = LU and
A= LDL!:
ag
Av—
16
and
ik
lee)
ll
L=
14
7A
A = LU and solve Ax = Bb as usual. Circle c when you see it.
Solve Lc = b to find c. Then solve Ux =c
es
13
and
to find x:
Maas
sol
ame
Oe
a
and
A=|a
a
0
a+b
b
When T is tridiagonal, its L and U factors have only two nonzero diagonals.
How would you take advantage of the zeros in T in a computer code for Gaussian
elimination? Find L and U.
Se
tS
1)
&
wb =
or
©
On
oOo
kW
2.6
17
Elimination = Factorization:
ba
Si
oy TOT
oH
tal
teh
Si
OG
GY
AS)
ty
Oe Oeee
xe
After elimination has produced zeros below the first pivot, put x’s to show which
blank entries are known in the final L and U:
ee
Se ee 68 || =
Be
19
87
If A and B have nonzeros in the positions marked by x, which zeros (marked
by 0) are still zero in their factors L and U?
anda
18
A = LU
1
0
1b
0
@
1
0
0
Suppose you eliminate upwards (almost unheard of). Use the last row to produce
zeros in the last column (the pivot is 1). Then use the second row to produce
zero above the second pivot. Find the factors in A = UL(!):
Ae
Nn
OO
—
20
Collins
uses
elimination
WW
OW
mem
in both directions,
meeting
at the center.
Substitution
goes out from the center. After eliminating both 2’s in A, one from above and
one from below, what 4 by 4 matrix is left? Solve Ax = b his way.
i
0) a0
Sul eotban()
A=19 1 3 2|
OL Oe
5
8
4
=I
2
21
(Important) If A has pivots 2,7,6 with no row exchanges, what are the pivots
for the upper left 2 by 2 submatrix B (without row 3 and column 3)? Explain
why.
22
Starting from a 3 by 3 matrix A with pivots 2,7, 6, add a fourth row and column
to produce M. What are the first three pivots for M, and why? What fourth row
and column are sure to produce 9 as the fourth pivot?
23
MATLAB knows the n by n matrix pascal(n). Find its LU factors for n = 4
and 5 and describe their pattern. Use chol(pascal(n)) or slu(A) above or work
by hand.
The row exchanges in MATLAB’s
doesn’t:
A = pascal(4) =
|
l
i
l
lu code spoil the pattern, but chol
88
Chapter 2
24
(Careful review) For which numbers c is A = LU impossible—with three pivots?
Solving Linear Equations
hh=
NO)
AGP
Loe
oW
0
1
1
25
Change the program slu(A) into sldu(A), so that it produces L, D, and U.
26
Rewrite slu(A) so that the factors L and U appear in the same n? storage locations that held the original A. The extra storage used for L is not required.
a7
Explain in words why x(k) is (c(k) —t)/U(k,k)
28
Write a program that multiplies triangular matrices, L times U. Don’t loop from
1 to n when you know there are zeros in the matrices. Somehow L times U
should undo the operations in slu.
at the end of slv(A, b).
2.7.
Transposes and Permutations
89
TRANSPOSES AND PERMUTATIONS ag 2.7
We need one more matrix, and fortunately it is much simpler than the inverse. It is
the “transpose” of A, which is denoted by A!. Its columns are the rows of A. When
A is an m by n matrix, the transpose is n by m:
wo
then AT=| Be 0
3 4
If 2 = |e
You can write the rows of A into the columns of A’. Or you can write the columns
of A into the rows of A. The matrix “flips over” its main diagonal. The entry in row
i, column j of A! comes from row j, column i of the original A:
(A")ij = Aji.
The transpose of a lower triangular matrix is upper triangular.
is—-.
Note
MATLAB’s
symbol for the transpose is A’.
a row and transpose:
The transpose of A‘!
To enter a column
vector, type
v =[ 1 2 3 ]’. To enter a matrix with second column w
= [4 5 6]
you could define M =[v w]. Quicker to enter by rows and then transpose the whole
mani: Mo = | 2 334"°5 6)7
The rules for transposes are very direct. We can transpose A+B to get (A+B)!.
Or we can transpose A and B separately, and then add A! + B'—same result. The
serious questions are about the transpose of a product AB and an inverse A~!:
The transpose of
A+B _ is
Adi-pep
ee
(1)
The transpose of
AB
is
(AB)! = BTA’,
(2)
The transpose of
An
a6
Aas
(3)
= (Aja
Notice especially how B'A!T comes in reverse order like B~!'A~!.
for the inverse is quick: B~!A~!
The proof
times AB produces J. To see this reverse order for
(AB), start with (Ax)!:
Ax
combines the columns of A
while
x
TEA
combines the rows of A’.
It is the same combination of the same vectors! In A they are columns, in A! they
are rows. So the transpose of the column Ax is the row x! A!. That fits our formula
(Ax)? = x? AT. Now we can prove the formulas for (AB)' and (A~!)?.
When
columns
B = [x; x2] has two columns, apply the same idea to each column.
of AB
are Ax;
and Ax2.
Their transposes
are eA}
and As
Those
The
are
the rows of BTA:
x} At
Transposing
AB =| Ax;
Ax2
-:: | gives
uA
| which is B'A!.
(4)
90 = Chapter 2. Solving Linear Equations
The right answer B! A! comes out a row at a time. You might like our “transparent proof”
at the end of the problem set. Maybe numbers are the easiest; in this example:
422|oil| al Slo) etefoul al
The rule extends to three or more factors:
(ABC)? equals C'BTA?.
If
The pivot matrix
A=LDU
then A’ =U'D'L".
Now apply this product rule to both sides of A-!A =I.
D =D"
om
is the same.
On one side, J"
is J.
On the other side, we discover that (AeAyE is the inverse of A!:
A-'A=I
is transposed to
A!(A~!)T =J/.
(5)
Similarly AA~! = J leads to (A~!)'AT = JI. Notice especially: A! is invertible
exactly when A is invertible. We can invert first or transpose first, it doesn’t matter.
Example 1 The inverse of
(Any?
A=[19] is A~! =[_19]. The transpose is AT = [4 ¢].
equals
E Hl
which is also
Gia
Before leaving these rules, we call attention to dot products. The following statement looks extremely simple, but it actually contains the deep purpose for the transpose.
For any vectors x and y,
(Ax)
y
equals
xTAly
equals
xl(Aly).
(6)
We can put in parentheses or leave them out. Here are three quick applications to
engineering and economics.
In electrical networks, x is the vector of potentials (voltages at nodes) and y is
the vector of currents. Ax gives potential differences and A'y gives the total current
into each node. Every vector has a meaning—the dot product is the heat loss.
Chapter 8 explains in detail this important application to networks. All we say
here is that nature is extremely symmetric! The symmetry is between A and A!.
In engineering and physics, x can be the displacement of a structure under a
load. Then Ax is the stretching (the strain). When y is the internal stress, Aly is
the external force. The equation is a statement about dot products and also a balance
equation:
Internal work (strain - stress) = external work (displacement
- force).
In economics, x gives the amounts of n outputs. The m inputs to produce those
output are Ax (and A is the input-output matrix). The costs per input go into y and
the values per output are in Aly:
Total input cost is Ax+y
(input amounts times cost per input)
Total output value is x +A‘ y (output amounts times value per output)
When A moves from one side of a dot product to the other side, it becomes A!.
2.7
Transposes and Permutations
91
Symmetric Matrices
For a symmetric matrix—these are the most important matrices—transposing A to Ax
produces no change. Then A! = A. The matrix is symmetric across the main diagonal.
A symmetric matrix is necessarily square. Its (j,i) and (i, j) entries are equal.
DEFINITION A symmetric matrix has AT = A. This means that aj; = aj; .
Example 2
Ane E | =A!
and
D= F A =p
A is symmetric because of the 2’s on opposite sides of the diagonal. In D those 2’s
are zeros. A diagonal matrix is automatically symmetric.
The inverse of a symmetric matrix is also symmetric. (We have to add: If A =
A! has an inverse.) When A~! is transposed, this gives (A!)~! which is A~!. So the
transpose of A~! is A~!, for a symmetric matrix A:
‘
=i)
cee
1 0
—-l]_
la. il ane ae =; oa
Now we show that multiplying any matrix by its transpose gives a symmetric matrix.
Symmetric Products R'R and RR' and LDL™
Choose any matrix R, probably rectangular.
R'R is automatically a symmetric matrix:
The transpose of R'R
Multiply R' times R. Then the product
is R'(R")'
whichis
R'R.
(7)
That is a quick proof of symmetry for R'R. We could also look at the (i, j) entry of
R'R. It is the dot product of row i of R' (column i of R) with column j of R. The
(j,i) entry is the same dot product, column j with column 7. So R'R is symmetric.
The matrix RR! is also symmetric. (The shapes of R and R! allow multiplication.) But RR® is a different matrix from R'R. In our experience, most scientific
problems that start with a rectangular matrix R end up with R'™R or RR! or both.
Example
3 R=[1 2] and
ees [i
and
RR! =[5].
The product R'R is n by n. In the opposite order, RR' is m by m. Even if m =n,
it is not likely that R'R = RR!. Equality can happen, but it is abnormal.
When elimination is applied to a symmetric matrix, we hope that A! = A isan advantage.
That depends on the smaller matrices staying symmetric as elimination proceeds—which
92
Chapter 2. Solving Linear Equations
they do. It is true that the upper triangular U cannot be symmetric. The symmetry is in
LDU. Remember how the diagonal matrix D of pivots can be divided out, to leave 1’s on
the diagonal of both L and U:
se
Je
2 ee i ae8)
eile oe a
i 2
0.3
(LU misses the symmetry)
panlelig.
0
eeu)
Lene?
(LDU
Bae
52
Ons
0
U is the transpose of L.
fee
1
captures the symmetry)
When A is symmetric, the usual form A = LDU becomes A = LDL!. The final U
(with 1’s on the diagonal) is the transpose of L (also with 1’s on the diagonal). The
diagonal D—the matrix of pivots—is symmetric by itself.
2K If A = A! can be factored into LDU with no row exchanges, then U = L?.
The symmetric factorization is A = LDL",
Notice that the transpose of LDL" is automatically (L?)'D'L™ which is LDL™
again. We have a symmetric factorization for a symmetric matrix. The work of elimination is cut in half, from n?/3 multiplications to n?/6. The storage is also cut essentially
in half. We only keep L and D, not U.
Permutation
Matrices
The transpose plays a special role for a permutation matrix. This matrix P has a single
silt in every row and every column. Then P! is also a permutation matrix—maybe
the same or maybe different. Any product P; P2 is again a permutation matrix. We
now create every P from the identity matrix, by reordering the rows of /.
The simplest permutation matrix is P = I (no exchanges). The next simplest are
the row exchanges P;;. Those are constructed by exchanging two rows i and j of J.
Other permutations exchange three rows. By doing all possible row exchanges to /,
we get all possible permutation matrices:
DEFINITION
An n by n permutation matrix P has the n rows of I in any order.
Example 4 There are six 3 by 3 permutation matrices.
1
1
I
an
Py =
P31 =
l
P32 =
—~
Here they are without the zeros:
1
|
P39 Pot =
1
P21 P32 = | 1
1
2.7
Transposes and Permutations
93
There are n\ permutation matrices of order n. The symbol n! stands for “n factorial,”
the product of the numbers (1)(2)--- (nm). Thus 3! = (1)(2)(3) which is 6. There will
be 24 permutation matrices of order 4.
There are only two permutation matrices of order 2, namely [}9] and [°4
ae
Important: P~' is also a permutation matrix. In the example above, the four
matrices on the left are their own inverses. The two matrices on the right are inverses
of each other. In all cases, a single row exchange is its own inverse. If we repeat
we are back to J. But for P32 P2;, the inverses go in opposite order (of course).
inverse is P; P32.
The
More important: P~! is always the same as P*. The four matrices on the left above
are their own transposes and inverses. The two matrices on the right are transposes—
and inverses—of each other. When we multiply PP!, the “1” in the first row of P
hits the “1” in the first column of P!. It misses the ones in all the other columns. So
PPP
Another proof of P! = P~! looks at P as a product of row exchanges. A row
exchange is its own transpose and its own inverse. P! is the product in the opposite
order. So is P~'. So P? and P~! are the same.
Symmetric matrices led to A = LDL". Now permutations lead to PA = LU.
The LU Factorization with Row Exchanges
We hope you remember A = LU. It started with A = Ce Sey laac (8 Every
elimination step was carried out by an £;; and inverted by an BE Those inverses
were compressed into one matrix L. The lower triangular L has 1’s on the diagonal,
and the result is A = LU.
This is a great factorization, but it doesn’t always work! Sometimes row exchanges are needed to produce pivots. Then A = (E~!.-. P-!.--E7!...Po!.--)U.
Every row exchange is carried out by a P;; and inverted by that Pj;. We now compress
those row exchanges into a single permutation matrix P. This gives a factorization for
every invertible matrix A—which we naturally want.
The main question is where to collect the Pj;’s. There are two good possibilities—
do all the exchanges before elimination, or do them after the E;;’s. The first way gives
PA = LU. The second way has the permutation in the middle.
1
The row exchanges can be moved to the left side, onto A. We think of them as
done in advance. Their product P puts the rows of A in the right order, so that
no exchanges are needed for PA. Then PA = LU.
2
We can hold all row exchanges until after elimination. This leaves the pivot rows
in a strange order. P; puts them in the correct order in U; (upper triangular as
usual).
PA = LU
Then
A = Ly P,Uj.
is constantly used in almost all computing (and always in MATLAB).
will concentrate on this form
PA = LU.
The factorization
A = L; PU,
We
is the right
94
Chapter 2. Solving Linear Equations
one for theoretical algebra—it is more elegant. If we mention both, it is because the
difference is not well known. Probably you will not spend a long time on either one.
Please don’t. The most important case by far has P = 7, when A equals LU with no
exchanges.
For this matrix A, exchange rows | and 2 to put the first pivot in its usual place.
Then go through elimination:
(Oia!
2:
RG"
Im
OF
Sy
Il
AS
ee,
I
0
0
Pel
1%i1ts->
3 FW
£3; =2
The matrix
PA
1h 25 Fl
(ORee lana
OO} O84
£32 = 3
is in good order, and it factors as usual:
Aga
i ©
|e eee
By
©
ee)
Al
i
@
(Om
Be
i
O
ik
it || s/o
mae
(8)
We started with A and ended with U. The only requirement is invertibility of A.
2L (PA = LU) If A is invertible, a permutation P will put its rows in the right
order to factor PA into LU. There is a full set of pivots and Ax = b can be solved.
In the MATLAB code, these lines exchange row k with row r below it (where the
kth pivot has been found). The notation A([r k], 1:7) picks out the submatrix with
these two rows and all columns | to n.
AN kal ulienra)
AK alpen)
Ev Riot kel) ke ry
kk 1).
Pttr kin) = Pi alleen):
sign = —sign
The “sign”? of P tells whether the number of row exchanges is even (sign = +1)
or odd (sign = —1). At the start, P is J and sign = +1. When there is a row exchange,
the sign is reversed. The final value of sign is the determinant of P and it does not
depend on the order of the row exchanges.
For PA we get back to the familiar LU. This is the usual factorization. But the
computer does not always use the first available pivot. An algebraist accepts a small
pivot—anything but zero. A computer looks down the column for the largest pivot.
(Section 9.1 explains why this reduces the roundoff error. It is called partial pivoting.)
P may contain row exchanges that are not algebraically necessary. We still have PA =
LU.
Our advice is to understand permutations but let MATLAB do the computing. Calculations of A = LU are enough to do by hand, without P. The code splu(A) factors
PA = LU and splv(A, b) solves Ax = b for any square invertible A. The program
splu stops if no pivot can be found in column k. That fact is printed.
REVIEW OF THE KEY IDEAS

1.  The transpose puts the rows of A into the columns of A^T. Then (A^T)_ij = A_ji.

2.  The transpose of AB is B^T A^T. The transpose of A^{-1} is the inverse of A^T.

3.  The dot product (Ax)^T y equals the dot product x^T(A^T y).

4.  A symmetric matrix (A^T = A) has a symmetric factorization A = LDL^T.

5.  A permutation matrix P has a 1 in each row and column, and P^T = P^{-1}.

6.  If A is invertible then a permutation P will reorder its rows for PA = LU.
Problem Set 2.7
Questions 1-7 are about the rules for transpose matrices.
1
Find A^T and A^{-1} and (A^{-1})^T and (A^T)^{-1} for
=
2
i
©
5| and also
if
as|,
al
al
Verify that (AB)^T equals B^T A^T but does not equal A^T B^T:
[4] of] 2-8 3
In case AB = BA (not generally true!), how do you prove that B^T A^T = A^T B^T?
3   (a)  The matrix ((AB)^{-1})^T comes from (A^{-1})^T and (B^{-1})^T. In what order?
    (b)  If U is upper triangular then (U^{-1})^T is ____ triangular.

4   Show that A^2 = 0 is possible but A^T A = 0 is not possible (unless A = zero matrix).

5   (a)  The row vector x^T times A times the column y produces what number?

         x^T A y = [0  1] A [0; 1; 0] = ____

    (b)  This is the row x^T A = ____ times the column y = (0, 1, 0).
    (c)  This is the row x^T = [0  1] times the column Ay = ____.

6   When you transpose a block matrix M = [A B; C D] the result is M^T = ____. Test it. Under what conditions on A, B, C, D is M a symmetric matrix?
7   True or false:
    (a)  The block matrix [0 A; A 0] is automatically symmetric.
    (b)  If A and B are symmetric then their product AB is symmetric.
    (c)  If A is not symmetric then A^{-1} is not symmetric.
    (d)  When A, B, C are symmetric, the transpose of ABC is CBA.
Questions 8-14 are about permutation matrices.
8
Why are there n! permutation matrices of order n?
9   If P1 and P2 are permutation matrices, so is P1 P2. After two permutations we still have the rows of I in some order. Give examples with P1 P2 not equal to P2 P1 and P3 P4 = P4 P3.
10
There are 12 “even” permutations of (1, 2,3,4), with an even number of exchanges. Two of them are (1, 2,3,4) with no exchanges and (4, 3,2, 1) with
two exchanges.
List the other ten. Instead of writing each 4 by 4 matrix, use the
numbers 4, 3,2, 1 to give the position of the 1 in each row.
11
Which permutation matrix makes PA upper triangular? Which permutations make
P1 A P2 lower triangular? If you multiply A on the right by P2, it exchanges the ____
of A.
>
II
CO
xt
oro
OO
M
W
12
13
(a)
Explain why the dot product of x and y equals the dot product of Px and
Py. P is any permutation matrix.
(b)
With x = (1,2,3) and y = (1, 1,2) show that Px -y is not always equal
to x · Py.
(a)  If you take powers of a permutation matrix, why is some P^k eventually equal to I?
(b)  Find a 5 by 5 permutation matrix so that the smallest power to equal I is P^6. (This is a challenge question. Combine a 2 by 2 block with a 3 by 3 block.)
14
Some permutation matrices are symmetric: P^T = P. Then P^T P = I becomes P^2 = I.
(a)  If P sends row 1 to row 4, then P^T sends row ____ to row ____. When P^T = P the row exchanges come in pairs with no overlap.
(b)  Find a 4 by 4 example with P^T = P that moves all four rows.
Questions 15-20 are about symmetric matrices and their factorizations.
15  Find 2 by 2 symmetric matrices A = A^T with these properties:
    (a)  A is not invertible.
    (b)  A is invertible but cannot be factored into LU.
    (c)  A can be factored into LU but not into LL^T.

16  If A = A^T and B = B^T, which of these matrices are certainly symmetric?
    (a)  A^2 - B^2
    (b)  (A + B)(A - B)
    (c)  ABA
    (d)  ABAB.

17  (a)  How many entries of A can be chosen independently, if A = A^T is 5 by 5?
    (b)  How do L and D (still 5 by 5) give the same number of choices?
    (c)  How many entries can be chosen if A is skew-symmetric? (A^T = -A).

18  Suppose R is rectangular (m by n) and A is symmetric (m by m).
    (a)  Prove that R^T A R is symmetric.
    (b)  Prove that R^T R has no negative numbers on its diagonal.
Factor these symmetric matrices into
=
20
A = A! with these properties:
|
im
and
fel
After elimination clears out column
2 matrix that remains:
A=|4
Dah
3
9
Ss
10
2
What shape is this matrix?
A = LDL!. The pivot matrix D is diagonal:
|
and
2
|-l
0
A=
~+Find the PA = LU
ANC
Al—s
il
i
|e
Cemee
Cee?
2 by
€
Up:
L1,P,Uj.
factorizations (and check them) for
feat
22
0
—-1
»
| below the pivot, find the symmetric
Questions 21-29 are about the factorizations PA = LU and A=
21.
=)
2
—-1
CO decal
8 AD
OF i
P|
bet
rage!
Find a 3 by 3 permutation matrix (call it A) that needs two row exchanges to
reach the end of elimination. For this matrix, what are its factors P, L, and U?
23
Factor the following matrix into PA = LU. Factor it also into A = L1 P1 U1
(hold the row exchange until 3 times row 1 is subtracted from row 2):
A
=
SS
Woe
—
NO
0O
—
24
Write out P after each step of the MATLAB
=|; i
25
ATG
code splu, when
0
@
il
©
5
6
eA— |
Write out P and L after each step of the code splu when
Omi
Avs
IO
mB
26  Extend the MATLAB code splu to a code spldu which factors PA into LDU.

27  What is the matrix L1 in A = L1 P1 U1?
heatlect
ss 3ehi
2A
Wes
Lect
et
POOR
eae
Or3y6
1 O20
Oa
at
OeIKO
gta
Des gligyyl
O Pes akG
Hin”
2
28
Suppose A is a permutation matrix. Then L = U = I. Explain why P is A^T in PA = LU. The way to put the rows of P in the normal order (of I) is to multiply by ____.
29
Prove that the identity matrix cannot be the product of three row exchanges (or
five). It can be the product of two exchanges (or four).
30
(a)
Choose E_{21} to remove the 3 below the first pivot. Then multiply E_{21} A E_{21}^T to remove both 3's:
Marea)
Al | oe
24
0
4 9
(b)
Choose E_{32} to remove the 4 below the second pivot. Then A is reduced to D by E_{32} E_{21} A E_{21}^T E_{32}^T = D. Invert the E's to find L in A = LDL^T.
The next questions are about applications of the identity (Ax)^T y = x^T(A^T y).
31  Wires go between Boston, Chicago, and Seattle. Those cities are at voltages x_B, x_C, x_S. With unit resistances between cities, the currents between cities are in y:

$$ y = Ax \quad\text{is}\quad \begin{bmatrix} y_{BC}\\ y_{CS}\\ y_{BS} \end{bmatrix} = \begin{bmatrix} 1 & -1 & 0\\ 0 & 1 & -1\\ 1 & 0 & -1 \end{bmatrix} \begin{bmatrix} x_B\\ x_C\\ x_S \end{bmatrix}. $$
    (a)  Find the total currents A^T y out of the three cities.
    (b)  Verify that (Ax)^T y agrees with x^T(A^T y)—six terms in both.

32
Producing x1 trucks and x2 planes needs x1 + 50x2 tons of steel, 40x1 + 1000x2 pounds of rubber, and 2x1 + 50x2 months of labor. If the unit costs y1, y2, y3 are $700 per ton, $3 per pound, and $3000 per month, what are the values of a truck and a plane? Those are the components of A^T y.
33  Ax gives the amounts of steel, rubber, and labor to produce x in Problem 32. Find A. Then Ax · y is the ____ of inputs while x · A^T y is the value of ____.
34
The matrix P that multiplies (x, y, z) to give (z, x, y) is also a rotation matrix. Find P and P^3. The rotation axis a = (1, 1, 1) equals Pa. What is the angle of rotation from v = (2, 3, -5) to Pv = (-5, 2, 3)?
35  Write A as the product EH of an elementary row operation matrix E and a symmetric matrix H.

36  This chapter ends with a great new factorization A = EH. Start from A = LDU, with 1's on the diagonals of L and U. Insert C and U^T to find E and H:

    A = LDU = (LC)(U^T D U) = (lower triangular E)(symmetric matrix H)

with 1's on the diagonal of E = LC. What is C? Why is LC lower triangular?
Transparent proof that (AB)^T = B^T A^T. Matrices can be transposed by looking through the page from the other side. Hold up to the light and practice with B below. Its column becomes a row in B^T. To see better, draw the figure with heavy lines on thin paper and turn it over so the symbol B^T is upright. The three matrices are in position for matrix multiplication: the row of A times the column of B gives the entry in AB. Looking from the reverse side, the row of B^T times the column of A^T gives the correct entry in B^T A^T = (AB)^T.
VECTOR SPACES AND SUBSPACES

SPACES OF VECTORS   3.1
To a newcomer, matrix calculations involve a lot of numbers. To you, they involve vectors. The columns of Ax and AB are linear combinations of n vectors—the columns
of A. This chapter moves from numbers and vectors to a third level of understanding (the highest level). Instead of individual columns, we look at “spaces” of vectors.
Without seeing vector spaces and especially their subspaces, you haven’t understood
everything about Ax = b.
Since this chapter goes a little deeper, it may seem a little harder. That is natural.
We are looking inside the calculations, to find the mathematics. The author’s job is to
make it clear. These pages go to the heart of linear algebra.
We begin with the most important vector spaces. They are denoted by R^1, R^2, R^3, R^4, .... Each space R^n consists of a whole collection of vectors. R^5 contains all column vectors with five components. This is called "5-dimensional space."

DEFINITION   The space R^n consists of all column vectors v with n components.

The components of v are real numbers, which is the reason for the letter R. A vector whose n components are complex numbers lies in the space C^n.

The vector space R^2 is represented by the usual xy plane. Each vector v in R^2 has two
components. The word “space” asks us to think of all those vectors—the whole plane.
Each vector gives the x and y coordinates of a point in the plane.
Similarly the vectors in R^3 correspond to points (x, y, z) in three-dimensional space. The one-dimensional space R^1 is a line (like the x axis). As before, we print
vectors as a column between brackets, or along a line using commas and parentheses:
4
0| isin R3, (1,1,0,1,1)
isinRD, fies is in C2.
The great thing about linear algebra is that it deals easily with five-dimensional
space. We don’t draw the vectors, we just need the five numbers (or n numbers). To
multiply v by 7, multiply every component by 7. Here 7 is a "scalar." To add vectors in R^5, add them a component at a time. The two essential vector operations go on inside the vector space:

We can add any two vectors in R^n, and we can multiply any vector in R^n by any scalar.

"Inside the vector space" means that the result stays in the space. If v is the vector in R^4 with components 1, 0, 0, 1, then 2v is the vector in R^4 with components 2, 0, 0, 2. (In this case 2 is the "scalar.") A whole series of properties can be verified in R^n. The commutative law is v + w = w + v; the distributive law is c(v + w) = cv + cw. There is a unique "zero vector" satisfying 0 + v = v. Those are three of the eight conditions listed at the start of the problem set.
These eight conditions are required of every vector space. There are vectors other
than column vectors, and vector spaces other than R”, and they have to obey the eight
reasonable rules.
A real vector space is a set of “vectors” together with rules for vector addition
and for multiplication by real numbers. The addition and the multiplication must produce vectors that are in the space. And the eight conditions must be satisfied (which
is usually no problem). Here are three vector spaces other than R”:
M   The vector space of all real 2 by 2 matrices.
F   The vector space of all real functions f(x).
Z   The vector space that consists only of a zero vector.
In M the “vectors” are really matrices. In F the vectors are functions. In Z the only
addition is 0+ 0 = 0. In each case we can add: matrices to matrices, functions to
functions, zero vector to zero vector. We can multiply a matrix by 4 or a function by
4 or the zero vector by 4. The result is still in M or F or Z. The eight conditions are
all easily checked.
The space Z is zero-dimensional (by any reasonable definition of dimension).
It is the smallest possible vector space. We hesitate to call it R^0, which means no components—you might think there was no vector. The vector space Z contains exactly one vector (zero). No space can do without that zero vector. Each space has its own zero vector—the zero matrix, the zero function, the vector (0, 0, 0) in R^3.
Subspaces
At different times, we will ask you to think of matrices and functions as vectors.
But at
all times, the vectors that we need most are ordinary column vectors. They are vectors
Figure 3.1   "Four-dimensional" matrix space M. The "zero-dimensional" space Z, the smallest vector space.
with n components—but maybe not all of the vectors with n components. There are important vector spaces inside R^n.
Start with the usual three-dimensional space R^3. Choose a plane through the origin (0, 0, 0). That plane is a vector space in its own right. If we add two vectors in the plane, their sum is in the plane. If we multiply an in-plane vector by 2 or -5, it is still in the plane. The plane is not R^2 (even if it looks like R^2). The vectors have three components and they belong to R^3. The plane is a vector space inside R^3. This illustrates one of the most fundamental ideas in linear algebra. The plane is a subspace of the full vector space R^3.
DEFINITION   A subspace of a vector space is a set of vectors (including 0) that satisfies two requirements: If v and w are vectors in the subspace and c is any scalar, then

(i)  v + w is in the subspace        and        (ii)  cv is in the subspace.
In other words, the set of vectors is “closed” under addition v + w and multiplication
cv (and cw). Those operations leave us in the subspace. We can also subtract, because
—w is in the subspace and its sum with v is v — w. In short, all linear combinations
stay in the subspace.
All these operations follow the rules of the host space, so the eight required conditions
are automatic. We just have to check the requirements (i) and (ii) for a subspace.
First fact: Every subspace contains the zero vector. The plane in R® has to go
through (0, 0,0). We mention this separately, for extra emphasis, but it follows directly
from rule (ii). Choose c = 0, and the rule requires Ov to be in the subspace.
Planes that don’t contain the origin fail those tests. When v is on such a plane,
—v and Ov are not on the plane. A plane that misses the origin is not a subspace.
Lines through the origin are also subspaces. When we multiply by 5, or add two
vectors on the line, we stay on the line. But the line must go through (0, 0, 0).
Another space is all of R^3. The whole space is a subspace (of itself). Here is a list of all the possible subspaces of R^3:

(L)  Any line through (0, 0, 0)          (R^3)  The whole space
(P)  Any plane through (0, 0, 0)         (Z)  The single vector (0, 0, 0)
If we try to keep only part of a plane or line, the requirements for a subspace don’t
hold. Look at these examples in R?.
Example 1 Keep only the vectors (x, y) whose components are positive or zero (this
is a quarter-plane).
The vector (2,3) is included but (—2, —3) is not.
So rule (ii) is
violated when we try to multiply by c = —1. The quarter-plane is not a subspace.
Example 2 Include also the vectors whose components are both negative. Now we
have two quarter-planes. Requirement (ii) is satisfied; we can multiply by any c. But
rule (i) now
fails.
The sum
of v =
(2,3) and w =
(—3, —2) is (—1, 1), which
is
outside the quarter-planes. Two quarter-planes don’t make a subspace.
Rules (i) and (ii) involve vector addition v+w and multiplication by scalars like
c and d. The rules can be combined into a single requirement—the rule for subspaces:
A subspace containing v and w must contain all linear combinations cv + dw.
Example 3   Inside the vector space M of all 2 by 2 matrices, here are two subspaces:

(U)  All upper triangular matrices  [ a  b ; 0  d ]        (D)  All diagonal matrices  [ a  0 ; 0  d ]
Add any two matrices in U, and the sum is in U. Add diagonal matrices, and the sum
is diagonal. In this case D is also a subspace of U! Of course the zero matrix is in
these subspaces, when a, b, and d all equal zero.
To find a smaller subspace of diagonal matrices, we could require a = d. The
matrices are multiples of the identity matrix I. The sum 2I + 3I is in this subspace, and so is 3 times 4I. It is a "line of matrices" inside M and U and D.
Is the matrix I a subspace by itself? Certainly not. Only the zero matrix is. Your
mind will invent more subspaces of 2 by 2 matrices—write them down for Problem 5.
The Column Space of A
The most important subspaces are tied directly to a matrix A. We are trying to solve
Ax = b. If A is not invertible, the system is solvable for some b and not solvable for
other b. We want to describe the good right sides b—the vectors that can be written
as A times some vector x.
Remember that Ax is a combination of the columns of A. To get every possible
b, we use every possible x. So start with the columns of A, and take all their linear
combinations. This produces the column space of A. It is a vector space made up of
column
vectors—not just the n columns
of A, but all their combinations
Ax.
By taking all combinations of the columns, we fill out a vector space.
DEFINITION Thecolumn space of A consists of all linear combinations of the columns.
The combinations are all possible vectors Ax.
To solve Ax = b is to express b as a combination of the columns. The vector b
on the right side has to be in the column space produced by A on the left side.
3A   The system Ax = b is solvable if and only if b is in the column space of A.

When b is in the column space, it is a combination of the columns. The coefficients in that combination give us a solution x to the system Ax = b.
Suppose A is an m by n matrix. Its columns have m components (not n). So the columns belong to R^m. The column space of A is a subspace of R^m. The set of all column combinations Ax satisfies rules (i) and (ii): When we add linear combinations or multiply by scalars, we still produce combinations of the columns. The word "subspace" is justified. Here is a 3 by 2 matrix A, whose column space is a subspace of R^3. It is a plane.
Example 4
The column space consists of all combinations of the two columns—any x1 times the first column plus any x2 times the second column. Those combinations fill up a plane in R^3 (Figure 3.2). If the right side b lies on that plane, then it is one of the combinations and (x1, x2) is a solution to Ax = b. The plane has zero thickness, so it is more likely that b is not in the column space. Then there is no solution to our 3 equations in 2 unknowns.

Of course (0, 0, 0) is in the column space. The plane passes through the origin. There is certainly a solution to Ax = 0. That solution, always available, is x = 0.

To repeat, the attainable right sides b are exactly the vectors in the column space. One possibility is the first column itself—take x1 = 1 and x2 = 0. Another combination is the second column—take x1 = 0 and x2 = 1. The new level of understanding is to see all combinations—the whole subspace is generated by those two columns.
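A quick numerical check of 3A (a MATLAB sketch; the 3 by 2 matrix here is only an illustration, since the matrix of Example 4 is not reproduced above): b is attainable exactly when appending b to A does not raise the rank.

    A = [1 0; 4 3; 2 3];        % an illustrative 3 by 2 matrix
    b1 = A*[1; 1];              % in the column space by construction
    b2 = [1; 0; 0];             % not in the plane C(A)
    [rank(A)  rank([A b1])  rank([A b2])]    % 2  2  3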
Figure 3.2   The column space C(A) is a plane containing the two columns. Ax = b is solvable when b is on that plane. Then b is a combination of the columns.
Notation   The column space of A is denoted by C(A). Start with the columns and take all their linear combinations.
Example 5   Describe the column spaces (they are subspaces of R^2) for

$$ I=\begin{bmatrix}1&0\\0&1\end{bmatrix} \quad\text{and}\quad A=\begin{bmatrix}1&2\\2&4\end{bmatrix} \quad\text{and}\quad B=\begin{bmatrix}1&2&3\\0&0&4\end{bmatrix}. $$
Solution   The column space of I is the whole space R^2. Every vector is a combination of the columns of I. In vector space language, C(I) is R^2.
The column space of A is only a line. The second column (2, 4) is a multiple of the first column (1, 2). Those vectors are different, but our eye is on vector spaces. The column space contains (1, 2) and (2, 4) and all other vectors (c, 2c) along that line. The equation Ax = b is only solvable when b is on the line.
The third matrix (with three columns) places no restriction on b. The column space C(B) is all of R^2. Every b is attainable. The vector b = (5, 4) is column 2 plus column 3, so x can be (0, 1, 1). The same vector (5, 4) is also 2(column 1) + column 3, so another possible x is (2, 0, 1). This matrix has the same column space as I—any b is allowed. But now x has more components and there are more solutions.
The next section creates another vector space, to describe all the possible solutions of
Ax = 0. This section created the column space, to describe all the attainable right
sides b.
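A short MATLAB check of Example 5 (using B as reconstructed above): two different vectors x give the same right side b = (5, 4).

    B = [1 2 3; 0 0 4];
    x1 = [0; 1; 1];   x2 = [2; 0; 1];
    [B*x1  B*x2]                 % both columns equal (5, 4)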
REVIEW OF THE KEY IDEAS

1.  R^n and C^n and M (matrices) and F (functions) and Z (zero vector alone) are vector spaces.
2.  A space containing v and w must contain all their combinations cv + dw.

3.  The combinations of the columns of A form the column space of A.

4.  The system Ax = b is solvable exactly when b is in the column space of A.
Problem Set 3.1
The first problems are about vector spaces in general. The vectors in those spaces
are not necessarily column vectors.
In the definition of a vector space, vector addition x + y and scalar multiplication cx are required to obey the following eight rules:

(1)  x + y = y + x
(2)  x + (y + z) = (x + y) + z
(3)  There is a unique "zero vector" such that x + 0 = x for all x
(4)  For each x there is a unique vector -x such that x + (-x) = 0
(5)  1 times x equals x
(6)  (c1 c2)x = c1(c2 x)
(7)  c(x + y) = cx + cy
(8)  (c1 + c2)x = c1 x + c2 x.
1   Suppose the sum (x1, x2) + (y1, y2) is defined to be (x1 + y2, x2 + y1). With
the usual multiplication cx = (cx1,cx2), which of the eight conditions are not
satisfied?
2
Suppose the multiplication cx is defined to produce (cx), 0) instead of (cx1, cx2).
With the usual addition in R*, are the eight conditions satisfied?
3
(a)
Which rules are broken if we keep only the positive numbers x > 0 in R^1?
Every c must be allowed. The half-line is not a subspace.
(b)
The positive numbers with x + y and cx redefined to equal the usual xy and x^c do satisfy the eight rules. Test rule 7 when c = 3, x = 2, y = 1. (Then x + y = 2 and cx = 8.) Which number is the "zero vector"? The vector "-2" is the number ____.

4
The matrix A = [273] is a “vector” in the space M of all 2 by 2 matrices.
Write down the zero vector in this space, the vector 14, and the vector —A.
What matrices are in the smallest subspace containing A?
5
(a)
Describe a subspace of M that contains
(b)
If a subspace of M contains A and B, must it contain /?
(c)
Describe a subspace of M that contains no nonzero diagonal matrices.
A=[49] but not
B=[9§ _?].
6
The functions f(x) = x? and g(x) = 5x are “vectors” in F. This is the vector
space of all real functions. (The functions are defined for —oo < x < oo.) The
combination 3 f(x) — 4g(x) is the function h(x) =
7   Which rule is broken if multiplying f(x) by c gives the function f(cx)? Keep the usual addition f(x) + g(x).

8   If the sum of the "vectors" f(x) and g(x) is defined to be the function f(g(x)), then the "zero vector" is g(x) = x. Keep the usual scalar multiplication cf(x) and find two rules that are broken.

Questions 9-18 are about the "subspace requirements": x + y and cx must stay in the subspace.

9
One requirement can be met while the other fails. Show this with
(a)
A set of vectors in R^2 for which x + y stays in the set but ½x may be
outside.
(b)
A set of vectors in R? (other than two quarter-planes) for which every cx
stays in the set but x + y may be outside.
10  Which of the following subsets of R^3 are actually subspaces?
    (a)  The plane of vectors (b1, b2, b3) with b1 = 0.
    (b)  The plane of vectors with b1 = 1.
    (c)  The vectors with b1 b2 b3 = 0.
    (d)  All linear combinations of v = (1, 4, 0) and w = (2, 2, 2).
    (e)  All vectors that satisfy b1 + b2 + b3 = 0.
    (f)  All vectors with b1 ≤ b2 ≤ b3.
11  Describe the smallest subspace of the matrix space M that contains
@
iNet
ik a a
(b) Eae ol
0
1
i A
) EPO a and ihae|
12  Let P be the plane in R^3 with equation x + y - 2z = 4. Find two vectors in P and check that their sum is not in P.

13  Let P0 be the plane through (0, 0, 0) parallel to the previous plane P. What is the equation for P0? Find two vectors in P0 and check that their sum is in P0.

14  The subspaces of R^3 are planes, lines, R^3 itself, or Z containing (0, 0, 0).
    (a)  Describe the three types of subspaces of R^2.
    (b)  Describe the five types of subspaces of R^4.

15  (a)  The intersection of two planes through (0, 0, 0) is probably a ____.
    (b)  The intersection of a plane through (0, 0, 0) with a line through (0, 0, 0) is probably a ____.
    (c)  If S and T are subspaces of R^5, prove that S ∩ T (the set of vectors in both subspaces) is a subspace of R^5. Check the requirements on x + y and cx.
16  Suppose P is a plane through (0, 0, 0) and L is a line through (0, 0, 0). The smallest vector space containing both P and L is either ____ or ____.

17  (a)  Show that the set of invertible matrices in M is not a subspace.
    (b)  Show that the set of singular matrices in M is not a subspace.

18  True or false (check addition in each case by an example):
    (a)  The symmetric matrices in M (with A^T = A) form a subspace.
    (b)  The skew-symmetric matrices in M (with A^T = -A) form a subspace.
    (c)  The unsymmetric matrices in M (with A^T ≠ A) form a subspace.
Questions 19-27 are about column spaces C(A) and the equation Ax = b.
19
Describe the column spaces (lines or planes) of these particular matrices:
A=
20
and
Bis
ho )
0
OC
ands @C=
igaele,
e200)
0 0
For which right sides b do these systems have solutions?
(a)
ae
DS
—1
21
a2
1070
issn 8
-4
Sami edUe
b
P|
EN
aay:
—2]|
| x3
b3
Leesa
(b)
2
—1|
es
9 j4=|
b
by
-4
Adding row 1 of A to row 2 produces B. Adding column 1 to column 2 produces
C. A combination of the columns of ____ is also a combination of the columns
of A. Those two matrices have the same column
ee
his i
and
Ibn
Bele
a
and
ie
ho
3
al
22
For which vectors
ie
a |
) ih ol
OOM
(b1, b2, b3) do these systems have a solution?
x]
by
coy || = |)
x3
b3
and
23
itl
0 O
ORO
hye
Oth
O70"
and
rs
ll
0
x]
b,
53 ||= || ea
X32
b3
ae
bj
5% || = || lap
Sea
b3
If we add an extra column b to a matrix A, then the column space gets larger unless ____. Give an example where the column space gets larger and an example where it doesn't. Why is Ax = b solvable exactly when the column spaces are the same for A and [A b]?
24
25
The columns of AB are combinations of the columns of A. The column space
of AB is contained in (possibly equal to) the column space of
. Give an
example where those two column spaces are not equal.
Suppose Ax = b and Ay = b* are both solvable. Then Az = b+
b* is solvable.
What is z? This translates into: If b and b* are in the column space C(A), then
26  If A is any 5 by 5 invertible matrix, then its column space is ____. Why?

27  True or false (with a counterexample if false):
(a)
The vectors b that are not in the column space C(A) form a subspace.
(b)
If C(A) contains only the zero vector, then A is the zero matrix.
(c)
The column space of 2A equals the column space of A.
(d)
The column space of A — J equals the column space of A.
28  Construct a 3 by 3 matrix whose column space contains (1, 1, 0) and (1, 0, 1) but not (1, 1, 1).
THE NULLSPACE OF A: SOLVING Ax = 0   3.2
This section is about the space of solutions to Ax = 0. The matrix A can be square
or rectangular. One immediate solution is x = 0—and for invertible matrices this is
the only solution.
For other matrices,
not invertible,
there are nonzero
solutions
to
Ax = 0. Each solution x belongs to the nullspace of A. We want to find all solutions
and identify this very important subspace.
DEFINITION   The nullspace of A consists of all solutions to Ax = 0. These vectors x are in R^n. The nullspace containing all solutions x is denoted by N(A).
Check that the solution vectors form a subspace. Suppose x and y are in the nullspace
(this means Ax = 0 and Ay = 0). The rules of matrix multiplication give A(x + y) =
0+0. The rules also give A(cx) = c0. The right sides are still zero. Therefore x + y
and cx are also in the nullspace N(A). Since we can add and multiply without leaving
the nullspace, it is a subspace.
To repeat: The solution vectors x have n components. They are vectors in R^n, so the nullspace is a subspace of R^n. The column space C(A) is a subspace of R^m.
If the right side b is not zero, the solutions of Ax = b do not form a subspace. The vector x = 0 is only a solution if b = 0. When the set of solutions does not include x = 0, it cannot be a subspace.
Section
3.4 will show
how the solutions to
Ax = b (if there are any solutions) are shifted away from the origin by one particular
solution.
Example 1   The equation x + 2y + 3z = 0 comes from the 1 by 3 matrix A = [1 2 3]. This equation produces a plane through the origin. The plane is a subspace of R^3. It is the nullspace of A.
The solutions to x + 2y + 3z = 6 also form a plane, but not a subspace.
Example 2   Describe the nullspace of A = [ 1  2 ; 3  6 ].
Solution   Apply elimination to the linear equations Ax = 0:

$$ \begin{aligned} x_1 + 2x_2 &= 0\\ 3x_1 + 6x_2 &= 0 \end{aligned} \qquad\longrightarrow\qquad \begin{aligned} x_1 + 2x_2 &= 0\\ 0 &= 0 \end{aligned} $$
There is really only one equation. The second equation is the first equation multiplied by 3. In the row picture, the line x1 + 2x2 = 0 is the same as the line 3x1 + 6x2 = 0. That line is the nullspace N(A).
To describe this line of solutions, the efficient way is to give one point on it (one special solution). Then all points on the line are multiples of this one. We choose the second component to be x2 = 1 (a special choice). From the equation x1 + 2x2 = 0, the
combination” or “trivial combination”. The solution is trivial (just x = 0) but the idea
is not trivial.
This case of a zero nullspace Z is of the greatest importance. It says that the
columns of A are independent. No combination of columns gives the zero vector except
the zero combination. All columns have pivots and no columns are free. You will see
this idea of independence again ....
Solving Ax = 0 by Elimination
This is important. We solve m equations in n unknowns—and the right sides are all
zero. The left sides are simplified by row operations, after which we read off the solution (or solutions).
Remember
the two stages in solving Ax = 0:
1.
Forward elimination from A to a triangular U (or its reduced form R).
2.
Back substitution in Ux = 0 or Rx = 0 to find x.
You will notice a difference in back substitution, when A and U have fewer than
n pivots. We are allowing all matrices in this chapter, not just the nice ones (which
are square matrices with inverses).
Pivots are still nonzero. The columns below the pivots are still zero. But it might
happen that a column has no pivot. In that case, don’t stop the calculation. Go on to
the next column. The first example is a 3 by 4 matrix:
$$ A = \begin{bmatrix} 1 & 1 & 2 & 3\\ 2 & 2 & 8 & 10\\ 3 & 3 & 10 & 13 \end{bmatrix} $$

Certainly a11 = 1 is the first pivot. Clear out the 2 and 3 below that pivot:
$$ A \rightarrow \begin{bmatrix} 1 & 1 & 2 & 3\\ 0 & 0 & 4 & 4\\ 0 & 0 & 4 & 4 \end{bmatrix} \qquad \begin{matrix}\text{(subtract 2 × row 1)}\\ \text{(subtract 3 × row 1)}\end{matrix} $$
The second column has a zero in the pivot position. We look below the zero for a
nonzero entry, ready to do a row exchange. The entry below that position is also zero.
Elimination can do nothing with the second column. This signals trouble, which we
expect anyway for a rectangular matrix. There is no reason to quit, and we go on to
the third column.
The second pivot is 4 (but it is in the third column). Subtracting row 2 from
row 3 clears out that column below the pivot. We arrive at
$$ U = \begin{bmatrix} 1 & 1 & 2 & 3\\ 0 & 0 & 4 & 4\\ 0 & 0 & 0 & 0 \end{bmatrix} \qquad \begin{matrix}\text{(only two pivots)}\\ \text{(the last equation became 0 = 0)}\end{matrix} $$
The fourth column also has a zero in the pivot position—but nothing can be done.
There is no row below it to exchange, and forward elimination is complete.
The matrix
has three rows, four columns, and only two pivots. The original Ax = 0 seemed to
involve three different equations, but the third equation is the sum of the first two. It is
automatically satisfied (0 = 0) when the first two equations are satisfied.
Elimination
reveals the inner truth about a system of equations.
Now comes back substitution, to find all solutions to
Ux = 0. With four unknowns
and only two pivots, there are many solutions. The question is how to write them all down.
A good method is to separate the pivot variables from the free variables.
P   The pivot variables are x1 and x3, since columns 1 and 3 contain pivots.
F   The free variables are x2 and x4, because columns 2 and 4 have no pivots.
The free variables x2 and x4 can be given any values whatsoever.
Then back substitu-
tion finds the pivot variables x; and x3. (In Chapter 2 no variables were free. When A
is invertible, all variables are pivot variables.) The simplest choices for the free variables are ones and zeros. Those choices give the special solutions.
Special Solutions
-  Set x2 = 1 and x4 = 0. By back substitution x3 = 0 and x1 = -1.
-  Set x2 = 0 and x4 = 1. By back substitution x3 = -1 and x1 = -1.

These special solutions solve Ux = 0 and therefore Ax = 0. They are in the nullspace.
The good thing is that every solution is a combination of the special solutions.
Complete Solution

$$ x \;=\; x_2\begin{bmatrix}-1\\1\\0\\0\end{bmatrix} \;+\; x_4\begin{bmatrix}-1\\0\\-1\\1\end{bmatrix} \;=\; \begin{bmatrix}-x_2-x_4\\ x_2\\ -x_4\\ x_4\end{bmatrix} \qquad \text{(special, special, complete)} \qquad (1) $$
Please look again at that answer. It is the main goal of this section. The vector s1 = (-1, 1, 0, 0) is the special solution when x2 = 1 and x4 = 0. The second special solution has x2 = 0 and x4 = 1. All solutions are linear combinations of s1 and s2. The special solutions are in the nullspace N(A), and their combinations fill out the whole nullspace.
The MATLAB code nulbasis computes these special solutions. They go into the
columns of a nullspace matrix N. The complete solution to Ax = 0 is a combination
of those columns. Once we have the special solutions, we have the whole nullspace.
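Standard MATLAB offers a stand-in for nulbasis: null(A, 'r') returns the special solutions (one per free variable) as columns. A sketch for the 3 by 4 example above (as reconstructed):

    A = [1 1 2 3; 2 2 8 10; 3 3 10 13];
    N = null(A, 'r')             % columns should be s1 = (-1,1,0,0) and s2 = (-1,0,-1,1)
    A*N                          % the zero matrix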
There is a special solution for each free variable. If no variables are free—this
means there are n pivots—then the only solution to Ux = 0 and Ax = 0 is the trivial
solution x = 0. All variables are pivot variables. In that case the nullspaces of A and
U contain only the zero vector. With no free variables, and pivots in every column,
the output from nulbasis is an empty matrix.
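The same thing shows up with MATLAB's built-in null: when every column has a pivot, the rational basis is empty. A one-line sketch:

    null([1 2; 3 4], 'r')        % returns an empty basis: the nullspace is only x = 0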
Example 4   Find the nullspace of U = [ 1  5  7 ; 0  0  9 ].
The second column of U has no pivot. So x2 is free. The special solution has x2 = 1.
Back substitution into 9x3 = 0 gives x3 = 0. Then x1 + 5x2 = 0 or x1 = -5. The solutions to Ux = 0 are multiples of one special solution:

$$ x = x_2 \begin{bmatrix} -5\\ 1\\ 0 \end{bmatrix}. $$
The nullspace of U is a line in R^3. It contains multiples of the special solution. One variable is free, and N = nulbasis(U) has one column.
In a minute we will continue elimination on U, to get zeros above the pivots and ones
in the pivots. The 7 is eliminated and the pivot changes from 9 to 1. The final result
of this elimination will be R:
$$ U = \begin{bmatrix} 1 & 5 & 7\\ 0 & 0 & 9 \end{bmatrix} \quad\text{reduces to}\quad R = \begin{bmatrix} 1 & 5 & 0\\ 0 & 0 & 1 \end{bmatrix}. $$

This makes
it even clearer that the special solution is s = (—5, 1, 0).
Echelon Matrices
Forward elimination goes from A to U. The process starts with an m by n matrix A. It acts
by row operations, including row exchanges. It goes on to the next column when no pivot
is available in the current column. The m by n “staircase” U is an echelon matrix.
Here is a 4 by 7 echelon matrix with the three pivots highlighted in boldface:

$$ U = \begin{bmatrix} \mathbf{x}&x&x&x&x&x&x\\ 0&\mathbf{x}&x&x&x&x&x\\ 0&0&0&0&0&\mathbf{x}&x\\ 0&0&0&0&0&0&0 \end{bmatrix} \qquad \begin{matrix}\text{Three pivot variables } x_1, x_2, x_6\\ \text{Four free variables } x_3, x_4, x_5, x_7\\ \text{Four special solutions in } N(U)\end{matrix} $$

Question   What are the column space and the nullspace for this matrix?
Answer   The columns have four components so they lie in R^4. (Not in R^7!) The fourth component of every column is zero. Every combination of the columns—every vector in the column space—has fourth component zero. The column space C(U) consists of all vectors of the form (b1, b2, b3, 0). For those vectors we can solve Ux = b by back substitution. These vectors b are all possible combinations of the seven columns.
The nullspace N(U) is a subspace of R^7. The solutions to Ux = 0 are all the combinations of the four special solutions—one for each free variable:
1.  Columns 3, 4, 5, 7 have no pivots. So the free variables are x3, x4, x5, x7.
2.  Set one free variable to 1 and set the other free variables to zero.
3.  Solve Ux = 0 for the pivot variables x1, x2, x6.
4.  This gives one of the four special solutions in the nullspace matrix N.
The nonzero rows of an echelon matrix come first. The pivots are the first nonzero
entries in those rows, and they go down in a staircase pattern. The usual row operations (in
the Teaching Code plu) produce a column of zeros below every pivot.
Counting the pivots leads to an extremely important theorem. Suppose A has
more columns than rows. With n > m there is at least one free variable. The system
Ax = 0 has at least one special solution. This solution is not zero!
3B If Ax =0 has more unknowns than equations (A has more columns than rows),
then it has nonzero solutions.
In other words, a short wide matrix (n > m) always has nonzero vectors in its nullspace.
There must be at least n — m free variables, since the number of pivots cannot exceed m.
(The matrix only has m rows, and a row never has two pivots.) Of course a row might
have no pivot—which means an extra free variable. But here is the point: When there
is a free variable, it can be set to 1. Then the equation Ax = 0 has a nonzero solution.
To repeat: There are at most m pivots. With n > m, the system Ax = 0 has a
free variable and a nonzero solution. Actually there are infinitely many solutions, since
any multiple cx is also a solution. The nullspace contains at least a line of solutions.
With two free variables, there are two special solutions and the nullspace is even larger.
The nullspace is a subspace. Its “dimension” is the number of free variables. This
central idea—the dimension of a subspace—is defined and explained in this chapter.
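A quick illustration of 3B (a MATLAB sketch with an arbitrary short wide matrix): with n > m there is always a nonzero special solution.

    A = [1 2 3; 4 5 6];          % any 2 by 3 matrix
    x = null(A, 'r')             % here the special solution (1, -2, 1)
    A*x                          % the zero vector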
The Reduced Echelon Matrix R
From the echelon matrix U we can go one more step. Continue onward from

$$ U = \begin{bmatrix} 1 & 1 & 2 & 3\\ 0 & 0 & 4 & 4\\ 0 & 0 & 0 & 0 \end{bmatrix}. $$

We can divide the second row by 4. Then both pivots equal 1. We can subtract 2 times this new row [0 0 1 1] from the row above. That produces a zero above the second pivot as well as below. The reduced row echelon matrix is

$$ R = \begin{bmatrix} 1 & 1 & 0 & 1\\ 0 & 0 & 1 & 1\\ 0 & 0 & 0 & 0 \end{bmatrix}. $$
R has 1’s as pivots. It has 0’s everywhere else in the pivot columns. Zeros above
pivots come from upward elimination.
If A is invertible, its reduced row echelon form is the identity matrix R = I.
This is the ultimate in row reduction.
The zeros in R make it easy to find the special solutions (the same as before):
1.  Set x2 = 1 and x4 = 0. Solve Rx = 0. Then x1 = -1 and x3 = 0.
2.  Set x2 = 0 and x4 = 1. Solve Rx = 0. Then x1 = -1 and x3 = -1.
The numbers -1 and 0 are sitting in column 2 of R (with plus signs). The numbers -1 and -1 are sitting in column 4 (with plus signs). By reversing signs we can read off the special solutions from the matrix R. The general solution to Ax = 0 or Ux = 0 or Rx = 0 is a combination of those two special solutions. The nullspace N(A) = N(U) = N(R) contains

$$ x = x_2\begin{bmatrix}-1\\1\\0\\0\end{bmatrix} + x_4\begin{bmatrix}-1\\0\\-1\\1\end{bmatrix} \qquad \text{(complete solution of } Ax = 0\text{).} $$
The next section of the book moves firmly from U to R. The MATLAB command [R, pivcol] = rref(A) produces R and also a list of the pivot columns.
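For the 3 by 4 example above (as reconstructed), that command gives both outputs at once:

    A = [1 1 2 3; 2 2 8 10; 3 3 10 13];
    [R, pivcol] = rref(A)        % R = [1 1 0 1; 0 0 1 1; 0 0 0 0], pivcol = [1 3]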
REVIEW OF THE KEY IDEAS

1.  The nullspace N(A) contains all solutions to Ax = 0.

2.  Elimination produces an echelon matrix U, or a row reduced R, with pivot columns and free columns.

3.  Every free column leads to a special solution to Ax = 0. The free variable equals 1 and the other free variables equal 0.

4.  The complete solution to Ax = 0 is a combination of the special solutions.

5.  If n > m then A has at least one column without pivots, giving a special solution. So there are nonzero vectors x in the nullspace of this A.
Problem Set 3.2
Questions 1-8 are about the matrices in Problems 1 and 5.
1
Reduce these matrices to their ordinary echelon forms U:
te
(a) he Aa
28 4. 0
BS Os)
sie
ee
Ong?
3.2.
The Nullspace of A: Solving Ax =0
(Db)
Ac
119
Sor
sf
op
Ny
of
Which are the free variables and which are the pivot variables?
For the matrices in Problem 1, find a special solution for each free variable.
the free variable to 1. Set the other free variables to zero.)
(Set
By combining the special solutions in Problem 2, describe every solution to Ax = 0.
The nullspace of A contains only the vector x = 0 when
By further row operations on each U in Problem 1, find the reduced echelon
form R. The nullspace of R is
the nullspace of U.
By row operations reduce A to its echelon form U. Write down a 2 by 2 lower
triangular L such that A = LU.
eh
=1
3°
5
eS 6 al
ee
=
34085
EB 6
aI
For the matrices in Problem 5, find the special solutions to Ax = 0. For an m
by n matrix, the number of pivot variables plus the number of free variables is
In Problem 5, describe the nullspace of each A in two ways. Give the equations
for the plane or line N(A), and give all vectors x that satisfy those equations
(combinations of the special solutions).
Reduce the echelon forms U in Problem 5 to R. For each R draw a box around
the identity matrix that is in the pivot rows and pivot columns.
Questions 9-17 are about free variables and pivot variables.
9
10
True or false (with reason if true and example if false):
(a)
A square matrix has no free variables.
(b)
An invertible matrix has no free variables.
(c)
An m by n matrix has no more than n pivot variables.
(d)
An _m by n matrix has no more than m pivot variables.
Construct 3 by 3 matrices A to satisfy these requirements (if possible):
(a)
A has no zero entries but U = /.
(b)
A has no zero entries but R = /.
(c)  A has no zero entries but R = U.
(d)  A = U = 2R.
11
Put 0’s and x’s (for zeros and nonzeros) in a 4 by 7 echelon matrix U so that
the pivot variables are
(a)
12
2, 4,.5
(D) =
Son 7,
(c)
4 and 6.
Put 0’s and 1’s and x’s (zeros, ones, and nonzeros) in a 4 by 8 reduced echelon
matrix R so that the free variables are
(a)
2,°4,.5,0
(Dyin
Leo
Ones
13
Suppose column 4 of a 3 by 5 matrix is all zero. Then x4 is certainly a
variable. The special solution for this variable is the vector x =
14
Suppose the first and last columns of a 3 by 5 matrix are the same (not zero).
Then
is a free variable.
The special solution for this variable is
5
i=
15
Suppose an m by n matrix has r pivots. The number of special solutions is ____. The nullspace contains only x = 0 when r = ____. The column space is all of R^m when r = ____.
16
The nullspace of a 5 by 5 matrix contains only x = 0 when the matrix has
pivots. The column space is R°® when there are
pivots. Explain why.
17
The equation x — 3y — z = 0 determines a plane in R*. What is the matrix A in
this equation? Which are the free variables? The special solutions are (3, 1, 0)
and
18
The plane x — 3y — z = 12 is parallel to the plane x — 3y — z = O in Problem
17. One particular point on this plane is (12, 0,0). All points on the plane have
the form (fill in the first components)
II
SS
aa <<
pa
+
as)
©
=
&NS
19
If x is in the nullspace of B, prove that x is in the nullspace of AB. This means:
If Bx = 0 then
. Give an example in which these nullspaces are different.
20  If A is invertible then N(AB) equals N(B). Following Problem 19, prove this second part: If ABx = 0 then Bx = 0.
    This means that Ux = 0 whenever LUx = 0 (same nullspace). The key is not that L is triangular but that L is ____.
Questions 21-28 ask for matrices (if possible) with specific properties.
21  Construct a matrix whose nullspace consists of all combinations of (2, 2, 1, 0) and (3, 1, 0, 1).
22
Construct a matrix whose nullspace consists of all multiples of (4, 3, 2, 1).
23
Construct a matrix whose column space contains (1, 1,5) and (0, 3, 1) and whose
nullspace contains (1, 1, 2).
24
Construct a matrix whose column space contains (1, 1,0) and (0, 1, 1) and whose
nullspace contains
25
(1,0, 1) and (0,0, 1).
Construct a matrix whose column space contains (1, 1, 1) and whose nullspace is
the line of multiples of (1, 1, 1, 1).
26
Construct a 2 by 2 matrix whose nullspace equals its column space. This is possible.
a7
Why does no 3 by 3 matrix have a nullspace that equals its column space?
28
If AB = 0 then the column space of B is contained in the ____ of A. Give an example.
29
The reduced form R of a 3 by 3 matrix with randomly chosen entries is almost
sure to be
_. What R is most likely if the random A is 4 by 3?
30
Show by example that these three statements are generally false:
    (a)  A and A^T have the same nullspace.
    (b)  A and A^T have the same free variables.
    (c)  A and A^T have the same pivots. (The matrix may need a row exchange.)
    A and A^T do have the same number of pivots. This will be important.

31  What is the nullspace matrix N (containing the special solutions) for A, B, C?
A=[J
1] and
bead
Boj
0
sinel
(C a=Il.
Sy
32  If the nullspace of A consists of all multiples of x = (2, 1, 0, 1), how many pivots
appear in U?
33
If the columns of N are the special solutions to Rx = 0, what are the nonzero
rows of R?
34
(a)
What are the five 2 by 2 reduced echelon matrices R whose entries are all
0’s and 1’s?
(b)
What are the eight | by 3 matrices containing only 0’s and 1’s?
eight of them reduced echelon matrices?
Are all
THE RANK AND THE ROW REDUCED FORM   3.3
This section completes the step from A to its reduced row echelon form R. The matrix
A is m by n (completely general). The matrix R is also m by n, but each pivot column
has only one nonzero entry (the pivot which is always 1):
$$ R = \begin{bmatrix} 1 & 3 & 0 & 2 & -1\\ 0 & 0 & 1 & 4 & -3\\ 0 & 0 & 0 & 0 & 0 \end{bmatrix} $$
You see zeros above the pivots as well as below. This matrix is the final result of
elimination. MATLAB uses the command rref. The Teaching Code elim has rref built
into it:
MATLAB:
[R, pivcol] =
rref(A)
Teaching Code:
[E,R]=
elim(A)
The extra output pivcol gives the numbers of the pivot columns. They are the same in
A and R. The extra output E gives the m by m elimination matrix that puts A into
its row reduced form R:
EA=R.
(1)
The square matrix E is the product of elementary matrices E_ij and P_ij and D^{-1} that produce the elimination steps. E_ij subtracts a multiple of row j from row i, and P_ij exchanges these rows. Then D^{-1} divides the rows by the pivots to produce 1's in every pivot.
We actually compute E by applying row reduction to the longer matrix [A I]: All the elementary matrices that multiply A (to produce R) will also multiply I (to produce E). The whole long matrix is being multiplied by E:

$$ E\,[\,A \ \ I\,] = [\,R \ \ E\,] \qquad (2) $$

(A and R have n columns; I and E have m columns.)
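A sketch of equation (2) in standard MATLAB (the matrix A is the 3 by 5 example used later in this section, as reconstructed; the E found this way satisfies EA = R, though it need not match the Teaching Code's E in its last row):

    A = [1 3 0 2 -1; 0 0 1 4 -3; 1 3 1 6 -4];
    RE = rref([A eye(3)]);       % row reduce the long matrix [A I]
    R = RE(:, 1:5);  E = RE(:, 6:8);
    norm(E*A - R)                % 0: this E puts A into its reduced form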
This is exactly what "Gauss-Jordan" did in Chapter 2 to compute A^{-1}. When A is square and invertible, its reduced row echelon form is R = I. Then EA = R becomes EA = I. In this invertible case, E is A^{-1}. This chapter is going further, to any (rectangular) matrix A and its reduced form R. The matrix E that multiplies A is still square and invertible, but the best it can do is to produce R.
Let us now suppose that R is computed. The work of elimination is done. We
want to read off from R the key information it contains. It tells us which columns
of A are pivot columns. R also tells us the special solutions in the nullspace. That
information is hidden in A, and partly hidden in the upper triangular U. It shows up
most clearly when the pivot columns in R are reduced to ones and zeros.
The Rank of a Matrix
The numbers m and n give the size of a matrix—but not necessarily the true size of a
linear system. An equation like 0 = 0 should not count. If there are two identical rows
in A, then the second one will disappear in R. We don’t want to count that row of zeros.
Also if row 3 is a combination of rows | and 2, then row 3 will become all zeros in R.
That zero row has no pivot. It is not included in determining the rank:
DEFINITION
The rank of A is the number of pivots. This number is r.
The matrix R at the start of this section has rank r = 2. It has two pivots and
two pivot columns. So does the unknown matrix A that produced R. This number
r = 2 will be crucial to the theory, but its first definition is entirely computational. To
execute the command r = rank (A), the computer just counts the pivots. When pivcol
gives a list of the pivot columns, the length of that list is r.
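A one-line check in MATLAB (same example matrix as above): rank counts the pivots, and it equals the length of pivcol.

    A = [1 3 0 2 -1; 0 0 1 4 -3; 1 3 1 6 -4];
    r = rank(A)                              % 2
    [R, pivcol] = rref(A);  length(pivcol)   % also 2, with pivcol = [1 3]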
We know right away that r < m and r <n. The number of pivots can’t be
greater than the number of rows. It can’t be greater than the number of columns. The
cases r = m and r = n of “full row rank” and “full column rank” will be especially
important. We mention them here and come back to them soon:
-
A has full row rank if every row has a pivot:
r =m.
No zero rows in R.
-
A has full column rank if every column has a pivot: r =n.
No free variables.
A square invertible matrix has r = m = n. Then R is the same as I.
Actually the computer has a hard time deciding whether a small number is really zero. When it subtracts 3 times .333···3 from 1, does it obtain zero? Our Teaching Codes treat numbers below the tolerance 10^{-6} as zero. But that is not entirely safe.
A second definition of rank will soon come at a higher level. It deals with entire
rows and entire columns—vectors and not just numbers. The matrices A and U and R have
r independent rows (the pivot rows). They also have r independent columns (these are the
pivot columns). We have to say exactly what it means for rows or columns to be independent.
That crucial idea comes in Section 3.5.
A third definition, at the top level of linear algebra, will deal with spaces of
vectors. The rank r is the “dimension” of the column space. It is also the dimension
of the row space. The great thing is that r reveals the dimensions of all other important
subspaces—including the nullspace.
Example 1
. 2
:|and i s and
B and [6] all have rank 1.
124
Chapter 3° Vector Spaces and Subspaces
The reduced row echelon forms R of these matrices have one nonzero row:
13
ki 0
4
0
7
1
| and i ,|and ia and [1].
In a rank one matrix A, all the rows are multiples of one nonzero row.
Then the matrix
R has only one pivot. Please check each of those examples of R.
The Pivot Columns
The pivot columns of R have 1’s in the pivots and 0’s everywhere else in the column.
The r pivot columns taken together have an r by r identity matrix J. It sits above
m —r rows of zeros. Remember that the numbers of the pivot columns are in the list
called pivcol.
The pivot columns of A are probably not obvious from A itself. But their column numbers are given by the same list pivcol. The r columns of A that eventually
have pivots (in R) are the pivot columns.
The first matrix R in this section is the row
reduced echelon form of A:
$$ A = \begin{bmatrix} 1 & 3 & 0 & 2 & -1\\ 0 & 0 & 1 & 4 & -3\\ 1 & 3 & 1 & 6 & -4 \end{bmatrix} \quad\text{yields}\quad R = \begin{bmatrix} 1 & 3 & 0 & 2 & -1\\ 0 & 0 & 1 & 4 & -3\\ 0 & 0 & 0 & 0 & 0 \end{bmatrix} $$
The last row of A is the sum of rows 1 and 2. Elimination discovers this fact and
replaces row 3 by the zero row. That uncovers the pivot columns (1 and 3).
Columns 1 and 3 are also the pivot columns of A. But the columns of R and A are
different! The columns of A don’t end with zeros. Our Teaching Codes pick out these
r pivot columns by A(:, pivcol). The symbol : means that we take all rows of the
pivot columns listed in pivcol.
In this example E subtracts rows 1 and 2 from row 3 (to produce that zero row in R). So the elimination matrix E and its inverse are

$$ E = \begin{bmatrix} 1 & 0 & 0\\ 0 & 1 & 0\\ -1 & -1 & 1 \end{bmatrix} \qquad\text{and}\qquad E^{-1} = \begin{bmatrix} 1 & 0 & 0\\ 0 & 1 & 0\\ 1 & 1 & 1 \end{bmatrix}. $$
The r pivot columns of A are also the first r columns of E^{-1}. We can check that here: columns 1 and 3 of A are also columns 1 and 2 of E^{-1}. The reason is that A = E^{-1}R. Each column of A is E^{-1} times a column of R. The 1's in the pivots of R just pick out the first r columns of E^{-1}. This is multiplication a column at a time
One more important fact about the pivot columns.
Our definition has been purely
computational: the echelon form R indicates the pivot columns of A. Here is a more
mathematical description of those pivot columns:
3C The pivot columns of A are not linear combinations of earlier columns of A.
This is clearly true for R. Its second pivot column has a one in row 2. All earlier
columns have zeros in row 2. So the second pivot column could not be a combination
of the earlier columns. Then the same is true for A.
The reason is that Ax = 0 exactly when Rx = 0. The solutions don’t change during elimination. The combinations of columns that produce the zero vector are given by
x, the same for A and R. Suppose a column of A is a combination of earlier columns
of A. Then that column of R would be the same combination of earlier columns of R.
The pivot columns are not combinations of earlier columns. The free columns
are combinations of earlier columns. These combinations are exactly given by the special solutions! We look at those next.
The Special Solutions
Each special solution to Ax = 0 and Rx = 0 has one free variable equal to 1. The
other free variables are all zero. The solutions come directly from the echelon form R:
          [ 1  3  0  2  -1 ]   [ x1 ]     [ 0 ]
    Rx =  [ 0  0  1  4  -3 ]   [ x2 ]  =  [ 0 ]                    (3)
          [ 0  0  0  0   0 ]   [ x3 ]     [ 0 ]
                               [ x4 ]
                               [ x5 ]
We set the first free variable to x2 = 1, with x4 = x5 = 0. The equations give the
pivot variables x1 = -3 and x3 = 0. This tells us how column 2 (a free column) is a
combination of earlier columns: it is 3 times the first column. The special solution
is s1 = (-3, 1, 0, 0, 0). That is the first of three special solutions in the nullspace.
The next special solution has x4 = 1. The other free variables are x2 = 0 and
x5 = 0. The solution is s2 = (-2, 0, -4, 1, 0). Notice how the -2 and -4 in s2 appear
in R (column 4) with plus signs.
The third special solution has x5 = 1. With x2 = 0 and x4 = 0 we find
s3 = (1, 0, 3, 0, 1). The numbers x1 = 1 and x3 = 3 are in column 5 of R, again
with opposite signs. This is a general rule, as we soon verify. The nullspace matrix N
contains the three special solutions in its columns:
         [ -3  -2   1 ]    not free (pivot variable x1)
         [  1   0   0 ]    free
    N =  [  0  -4   3 ]    not free (pivot variable x3)
         [  0   1   0 ]    free
         [  0   0   1 ]    free
The linear combinations of these three columns give all vectors in the nullspace. This
is the complete solution to Ax = 0 (and Rx = 0). Where R had the identity matrix
(2 by 2) in its pivot columns, N has the identity matrix (3 by 3) in its free rows.
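A quick MATLAB check, assuming only the built-in commands rref and null: the "rational" nullspace basis null(A,'r') produces exactly these special solutions.

    % Sketch: the special solutions as columns of N
    A = [1 3 0 2 -1; 0 0 1 4 -3; 1 3 1 6 -4];
    N = null(A, 'r')     % columns are the special solutions s1, s2, s3
    A*N                  % every column gives the zero vector: AN = 0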
There is a special solution for every free variable. Since r columns have pivots,
that leaves n —r
free variables. This is the key to Ax = 0.
3D  The system Ax = 0 has n - r special solutions. These are the columns of the
nullspace matrix N.
When we introduce the idea of “independent” vectors, we will show that the special solutions are independent. Perhaps you can see in the matrix N that no column
is a combination of the other columns. The beautiful thing is that the count is exactly
right, and we explain it informally:
1.  Ax = 0 has n unknowns.
2.  There are really r independent equations.
3.  So there are n - r independent solutions.
To complete this section, look again at the rule for special solutions. Suppose for
simplicity that the first r columns are the pivot columns, and the last n —r columns
are free (no pivots). Then the reduced row echelon form looks like
              r        n - r
         [    I          F    ]    r pivot rows
    R =  [    0          0    ]    m - r zero rows                 (4)
          pivot columns  free columns

3E  The nullspace matrix containing the n - r special solutions is

    N =  [ -F ]    r pivot variables
         [  I ]    n - r free variables                            (5)
Multiply R times N to get zero. The first block row is (I times -F) + (F times I).
The columns of N do solve the equation Rx = 0 (and Ax = 0). They follow the
rule that was discovered in the example: The pivot variables come by changing signs
(F to -F) in the free columns of R. This is because the equations Rx = 0 are so
simple. When the free part moves to the right side, the only nonzero part on the left
side is the identity matrix:

    I [ pivot variables ]  =  -F [ free variables ]                (6)

In each special solution, the free variables are a column of I. Then the pivot variables
are a column of -F. That is the nullspace matrix N.
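Here is a small MATLAB sketch of formula (5), using an invented 3 by 4 echelon matrix R whose pivot columns come first; the block F is read off from R and N = [-F; I] is assembled directly:

    % Sketch (pivot columns first): build N = [-F; I] from R
    R = [1 0 2 3; 0 1 4 5; 0 0 0 0];   % invented example with r = 2, n = 4
    r = 2;  n = 4;
    F = R(1:r, r+1:n);                 % the free-column block F
    N = [-F; eye(n-r)];                % pivot variables -F above the identity
    R*N                                % (I)(-F) + (F)(I) = zero matrix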
The idea is true whether or not the pivot columns all come before the free columns.
If they do come first, then I comes before F as in equation (4). If the pivot columns
are mixed in with the free columns, then the I and F are mixed together. You can still
see I and you can still see F. Here is an example where I = [1] comes first and
F = [2 3] comes last.
Example 2  The special solutions of x1 + 2x2 + 3x3 = 0 are the columns of

         [ -2  -3 ]
    N =  [  1   0 ]
         [  0   1 ]

The coefficient matrix is [1 2 3] = [I  F]. Its rank is one. So there are n - r =
3 - 1 special solutions in the matrix N. Their first components are -F = [-2  -3].
Their other components (free variables) come from I.
Final Note How can I write confidently about R when I don’t know which steps
MATLAB will take? The matrix A could be reduced to R in different ways. Two
different elimination matrices E1 and E2 might produce R. Very likely you and I and
Mathematica and Maple would do the elimination differently. The key point is that the
final matrix R is always the same. Here is the reason.
The original matrix A completely determines the I and F and zero rows in R:

1.  The pivot columns (containing I) are decided by 3C: they are not combinations of earlier columns.
2.  The free columns (containing F) are decided by 3D: they are combinations of earlier columns (and F tells those combinations).
The solutions to Rx = 0 are exactly the solutions to Ax = 0, and the special
solutions are the same. A small example with rank one will show two E’s that produce
the correct R:
    A =  [ 2  4 ]    reduces to    R =  [ 1  2 ]
         [ 1  2 ]                       [ 0  0 ]
You could multiply row 1 of A by 1/2, and subtract it from row 2:
    E  =  [  1  0 ] [ 1/2  0 ]  =  [  1/2  0 ]
          [ -1  1 ] [  0   1 ]     [ -1/2  1 ]
Or you could exchange rows in A, and then subtract 2 times row 1 from row 2:

    E_new  =  [  1  0 ] [ 0  1 ]  =  [ 0   1 ]
              [ -2  1 ] [ 1  0 ]     [ 1  -2 ]
Multiplication gives EA = R and also E_new A = R. The E's are different, but R is the
same. One thing that is determined is the first column of E⁻¹ and E_new⁻¹:

    E⁻¹  =  [ 2  0 ]    and    E_new⁻¹  =  [ 2  1 ]
            [ 1  1 ]                       [ 1  0 ]
The first column of E⁻¹ is the first pivot column of A = E⁻¹R. The second column of
E⁻¹ only multiplies the zeros in R. So that second column of E⁻¹ is free to change.
The matrix R and the r pivot columns in E⁻¹ are completely determined by A.
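As a check (a sketch, not part of the Teaching Codes), the two eliminations above can be carried out in MATLAB; both E's multiply A into the same R, and the first columns of their inverses agree:

    % Sketch: two different eliminations, one and the same R
    A  = [2 4; 1 2];
    E1 = [1/2 0; -1/2 1];        % halve row 1, then subtract it from row 2
    E2 = [0 1; 1 -2];            % exchange rows, then subtract 2*(new row 1)
    E1*A, E2*A                   % both products equal R = [1 2; 0 0]
    inv(E1), inv(E2)             % first columns agree: the pivot column (2, 1) of A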
■  REVIEW OF THE KEY IDEAS  ■

1.  The reduced row echelon form comes from A by EA = R.
2.  The rank of A is the number of pivots in R.
3.  The pivot columns of A and R are in the same list pivcol.
4.  Those pivot columns are not combinations of earlier columns.
5.  The equation Ax = 0 has n - r special solutions (the columns of N).
Problem Set 3.3

1   Which of these rules gives a correct definition of the rank of A?
    (a)  The number of nonzero rows in R.
    (b)  The number of columns minus the total number of rows.
    (c)  The number of columns minus the number of free columns.
    (d)  The number of 1's in the matrix R.

2   Throw away the last m - r zero rows in R, and the last m - r columns of E⁻¹.
    Then the factorization A = E⁻¹R, with zeros removed, becomes
        A = (first r columns of E⁻¹) (first r rows of R).
    Write the 3 by 5 matrix A at the start of this section as the product of the 3 by
    2 matrix from the columns of E⁻¹ and the 2 by 5 matrix from the rows of R.
    This is the factorization A = (COL)(ROW). Every m by n matrix is an m by
    r matrix times an r by n matrix.
3   Find the reduced row echelon forms R of these matrices:
    (a)  The 3 by 4 matrix of all ones.
    (b)  The 3 by 4 matrix with aij = i + j - 1.
    (c)  The 3 by 4 matrix with aij = (-1)^j.
Find R for each of these (block) matrices:
B=[A
Al
ne
a
Suppose all the pivot variables come Jast instead of first. Describe all four blocks
in the reduced echelon form
The block B should be r by r. What is the nullspace matrix N containing the
special solutions?
6   (Silly problem)  Describe all 2 by 3 matrices A1 and A2, with row echelon forms
    R1 and R2, such that R1 + R2 is the row echelon form of A1 + A2. Is it true
    that R1 = A1 and R2 = A2 in this case?
If A has r pivot columns, how do you know that A? has r pivot columns?
3 by 3 example for which the column numbers are different.
Give a
What are the special solutions to Rx = 0 and y'R = 0 for these R?
Ree
LO
oars
a0! WEA
os
O= OF 0 0
Rh)
Ce
2
OS ORO
0 0 0
Problems 9-11 are about r by r invertible matrices inside A.
If A has rank r, then it has an r by r submatrix S that is invertible. Remove
m —r rows and n —r columns to find such a submatrix S inside each A:
fame
pee
el
oe| ber
3
5
0
ois
Lid
OFel
0 0
10
Suppose P is the m by r submatrix of A containing only the pivot columns of
A. Explain why P has rank r.
11
In Problem 10, what is the rank of P'? Then if S? contains only the pivot columns
of P! its rank is again r. The matrix S is an r by r invertible submatrix of A.
Carry out the steps from A to P to P™ to S' to S for this matrix:
aS
For A =
4 61]
=
NN
ns,
find the invertible S.
Problems 12-15 connect the rank of AB to the ranks of A and B.
12
(a)
Suppose column j of B is a combination of previous columns of B. Show
that column j of AB is the same combination of previous columns of AB.
(b)
Part (a) implies that rank(AB) < rank(B), because AB cannot have any
new pivot columns. Find matrices A; and A2 so that rank(A;B) = rank B
and rank(A2B) < rank B, for
repel
|Be os
13
Problem
12 proved that rank(AB)
< rank(B).
Then by the same
reasoning
rank(B! A‘) < rank(A!). How do you deduce that rank(AB) < rank A?
14
(Important)  Suppose A and B are n by n matrices, and AB = I. Prove from
rank(AB) ≤ rank(A) that the rank of A is n. So A is invertible and B must be
its two-sided inverse (Section 2.5). Therefore BA = I (which is not so obvious!).
15
If A is 2 by 3 and B is 3 by 2 and AB = I, show from its rank that BA ≠ I.
Give an example of A and B.
16
Suppose A and B have the same reduced row echelon form R.
17
(a)
Show that A and B have the same nullspace and the same row space.
(b)
Since E;A = R and E2B = R we know
times B.
The factorization
A =
(COL)(ROW)!
that A equals an
in Problem
2 expresses
matrix
every matrix of
rank r as a sum of r rank one matrices, when the multiplication is done as
columns times rows. Express A and B in this way as the sum of two rank one
matrices:
A=
18
eR As
A)
Suppose A is an m by n matrix of rank r. Its echelon form is R. Describe
exactly the matrix Z (its shape and all its entries) that comes from transposing
the row echelon form of R’ (prime means transpose):
R=
19
a1 Fe
rref(A)
and
Z = (rref(R’))’.
Compare Z in Problem 18 with the matrix that comes
echelon form of A’ (which is not R’):
S=
rref(A’)
and
Y=
rref(S’).
Explain in one line why Y is or is not equal to Z.
from starting with the
20
In Problem 18 we could also find the matrices E and F that put A and R’ into
their row echelon forms (as in
[ike Beli aref([Ant|,
EA = R):
and»
[Sez]
= mref((LR! of
|)eeand »ZeSs.
What are the shapes of E and F? How is the final Z connected by E and F to
the original R?
THE COMPLETE SOLUTION TO Ax = b  ■  3.4
The last section totally solved Ax = 0. Elimination converted the problem to Rx = 0.
The free variables were given special values (one and zero). Then the pivot variables
were found by back substitution. We paid no attention to the right side b because it
started and ended as zero. The solution x was in the nullspace of A.
Now b is not zero. Row operations on the left side must act also on the right
side. One way to organize that is to add b as an extra column of the matrix. We keep
the same example A as before (dropping its last column), and we "augment" A with
the right side (b1, b2, b3) = (1, 6, 7):
    [ 1  3  0  2 ] [ x1 ]   [ 1 ]                                 [ 1  3  0  2  1 ]
    [ 0  0  1  4 ] [ x2 ] = [ 6 ]    has the augmented matrix     [ 0  0  1  4  6 ]  =  [ A  b ].
    [ 1  3  1  6 ] [ x3 ]   [ 7 ]                                 [ 1  3  1  6  7 ]
                   [ x4 ]
Qe
The augmented matrix is just [A b |. When we apply the usual elimination steps to
A, we also apply them to b. In this example we subtract row 1 from row 3 and then
subtract row 2 from row 3. This produces a complete row of zeros:
    [ 1  3  0  2 ] [ x1 ]   [ 1 ]                                 [ 1  3  0  2  1 ]
    [ 0  0  1  4 ] [ x2 ] = [ 6 ]    has the augmented matrix     [ 0  0  1  4  6 ]  =  [ R  d ].
    [ 0  0  0  0 ] [ x3 ]   [ 0 ]                                 [ 0  0  0  0  0 ]
                   [ x4 ]
That very last zero is crucial. It means that the equations can be solved; the third
equation has become 0 = 0. In the original matrix A, the first row plus the second
row equals the third row. If the equations are consistent, this must be true on the right
side of the equations also! The all-important property on the right side was 1+6 = 7.
Here are the same augmented matrices for a general b = (by, b2, b3):
    [ 1  3  0  2  b1 ]         [ 1  3  0  2       b1       ]
    [ 0  0  1  4  b2 ]   →     [ 0  0  1  4       b2       ]
    [ 1  3  1  6  b3 ]         [ 0  0  0  0  b3 - b1 - b2  ]
Now we get 0 = 0 in the third equation provided b3 - b1 - b2 = 0. This is b1 + b2 = b3.
The free variables are x2 = x4 = 0. The pivot variables are x; = 1 and x3 = 6, taken
from the last column d of the reduced augmented matrix. The code x = partic (A, b)
gives this particular solution to Ax = b. First it reduces A and b to R and d. Then
the r pivot variables in x are taken directly from d, because the pivot columns in R
contain the identity matrix. After row reduction we are just solving Ix = d.
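A short MATLAB sketch of the same computation, using the built-in rref on the augmented matrix (the Teaching Code partic is not assumed here):

    % Sketch: reduce [A b] and read off the particular solution
    A = [1 3 0 2; 0 0 1 4; 1 3 1 6];   b = [1; 6; 7];
    Rd = rref([A b])                   % [R d]: last column is d = (1, 6, 0)
    xp = [1; 0; 6; 0];                 % pivot variables 1 and 6 from d, free variables zero
    A*xp                               % check: returns b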
Notice how we choose the free variables and solve for the pivot variables. After
the row reduction to R, those steps are quick. We can see which variables are free
(their columns don’t have pivots). When those variables are zero, the pivot variables
for xp are in the extra column:

    The particular solution solves          A xp = b
    The n - r special solutions solve       A xn = 0
Question  Suppose A is a square invertible matrix. What are xp and xn?
Answer  The particular solution is the one and only solution A⁻¹b. There are no
special solutions because there are no free variables. The only vector in the nullspace
is xn = 0. The complete solution is x = xp + xn = A⁻¹b + 0.
This was the situation in Chapter 2. We didn’t mention the nullspace in that
chapter; it contained only the zero vector. The solution A⁻¹b appears in the extra
column, because the reduced form of A is R = I. Reduction goes from [ A  b ] to
[ I  A⁻¹b ]. The original system Ax = b is reduced all the way to x = A⁻¹b.
In this chapter, that means m = n = r. It is special here, but square invertible
matrices are the ones we see most often in practice. So they got their own chapter at
the start of the book.
For small examples we can put [ A  b ] into reduced row echelon form. For a
large matrix, MATLAB can do it better.

Example 1  Here is a small example with full column rank. Both columns have pivots.
Find the condition on (b1, b2, b3) for Ax = b to be solvable, if

         [  1   1 ]               [ b1 ]
    A =  [  1   2 ]    and   b =  [ b2 ]
         [ -2  -3 ]               [ b3 ]

This condition puts b in the column space of A. Find the complete solution x = xp + xn.

Solution  Use the augmented matrix, with its extra column b. Elimination subtracts
row 1 from row 2, and adds 2 times row 1 to row 3:

    [  1   1   b1 ]        [ 1   1     b1     ]        [ 1  0    2b1 - b2     ]
    [  1   2   b2 ]   →    [ 0   1   b2 - b1  ]   →    [ 0  1     b2 - b1     ]
    [ -2  -3   b3 ]        [ 0  -1   b3 + 2b1 ]        [ 0  0   b3 + b1 + b2  ]
The last equation is 0 = 0 provided b3 + b1 + b2 = 0. This condition puts b in the
column space of A; then the system is solvable. The rows of A add to the zero row.
So for consistency (these are equations!) the entries of b must also add to zero.
This example has no free variables and no special solutions. The nullspace solution
is xn = 0. The (only) particular solution xp is at the top of the augmented column:

    xp  =  [ 2b1 - b2 ]
           [ b2 - b1  ]

If b3 + b1 + b2 is not zero, there is no solution to Ax = b (xp doesn't exist).
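A quick numerical check in MATLAB, with an invented right side b = (4, -1, -3) chosen so that b1 + b2 + b3 = 0:

    % Sketch: test solvability by comparing ranks, then solve
    A = [1 1; 1 2; -2 -3];
    b = [4; -1; -3];                % invented b with 4 + (-1) + (-3) = 0
    rank([A b]) == rank(A)          % true exactly when b is in the column space
    xp = A \ b                      % the one particular solution (full column rank)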
This example is typical of the extremely important case when A has full column
rank. Every column has a pivot. The rank is r = n. The matrix is tall and thin
(m ≥ n). Row reduction puts I at the top, when A is reduced to R:

    R  =  [ I ]      n by n identity matrix
          [ 0 ]      m - n rows of zeros                           (1)
There are no free columns or free variables. F is an empty matrix.
We will collect together the different ways of recognizing this type of matrix.
3F  Every matrix A with full column rank (r = n) has all these properties:
1.  All columns of A are pivot columns.
2.  There are no free variables or special solutions.
3.  The nullspace N(A) contains only the zero vector x = 0.
4.  If Ax = b has a solution (it might not) then it has only one solution.

In the language of the next section, A has independent columns when r = n. In Chapter
4 we will add one more fact to this list: The square matrix AᵀA is invertible.
In this case the nullspace of A (and R) has shrunk to the zero vector.
The solu-
tion to Ax = b is unique (if it exists). There will be m —n zero rows in R. So there
are m —n conditions on b in order to have 0 = 0 in those rows.
The example had m = 3 and r = n = 2, so there was one condition b3 + b1 + b2 = 0
on the right side b. If this is satisfied, Ax = b has exactly one solution.
The other extreme case is full row rank. Now it is the nullspace of Aᵀ that
shrinks to the zero vector. In this case A is short and wide (m ≤ n). The number of
unknowns is at least the number of equations. A matrix has full row rank if r = m.
Every row has a pivot, and here is an example.
Example 2  There are n = 3 unknowns but only two equations. The rank is r = m = 2:

    Ax = b        x +  y + z  =  3
                  x + 2y - z  =  4
Figure 3.3  The complete solution is one particular solution plus all nullspace solutions.
These are two planes in xyz space. The planes are not parallel so they intersect in a
line. This line of solutions is exactly what elimination will find. The particular solution
will be one point on the line. Adding the nullspace vectors xn will move us along the
line. Then x = xp + xn gives the whole line of solutions.
We find xp and xn by elimination. Subtract row 1 from row 2 and then subtract
row 2 from row 1:
    [ 1  1   1   3 ]        [ 1  1   1   3 ]        [ 1  0   3   2 ]
    [ 1  2  -1   4 ]   →    [ 0  1  -2   1 ]   →    [ 0  1  -2   1 ]
The free variable is x3. The particular solution has x3 = 0 and the special solution has
x3 = 1:

    x_particular comes directly from the right side:              xp = (2, 1, 0)
    x_special comes from the third column (the free column F):    s  = (-3, 2, 1)

It is wise to check that xp and s satisfy the original equations A xp = b and As = 0:
    2 + 1 + 0 = 3          -3 + 2 + 1 = 0
    2 + 2 - 0 = 4          -3 + 4 - 1 = 0

The nullspace solution xn is any multiple of s. It moves along the line of solutions,
starting at x_particular:
                                          [ 2 ]         [ -3 ]
    Complete Solution:   x = xp + xn  =   [ 1 ]  + x3   [  2 ]
                                          [ 0 ]         [  1 ]
This line is drawn in Figure 3.3. Any point on the line could have been chosen as
the particular solution; we chose the point with x3 = 0. The particular solution is not
multiplied by an arbitrary constant! The special solution is, and you understand why.
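The same computation as a MATLAB sketch (the numbers below are the ones just found, not new results):

    % Sketch: particular solution plus the line of nullspace solutions
    A = [1 1 1; 1 2 -1];   b = [3; 4];
    Rd = rref([A b])                % [1 0 3 2; 0 1 -2 1]
    xp = [2; 1; 0];                 % free variable x3 = 0
    s  = [-3; 2; 1];                % the special solution (x3 = 1)
    A*xp, A*s                       % checks: b and the zero vector
    x  = xp + 5*s;  A*x             % any point on the line also gives b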
Now we summarize this short wide case (m ≤ n) of full row rank:
3G  Every matrix A with full row rank (r = m) has all these properties:
1.  All rows have pivots, and R has no zero rows.
2.  Ax = b has a solution for any right side b.
3.  The column space is the whole space Rᵐ.
4.  There are n - r = n - m special solutions in the nullspace of A.
In this case with m pivots, the rows are "linearly independent". In other words,
the columns of Aᵀ are linearly independent. We are more than ready for the definition
of linear independence, as soon as we summarize the four possibilities—which depend
on the rank.
Notice how r, m, n are the critical numbers!
The four possibilities for linear equations depend on the rank r:

    r = m  and  r = n    Square and invertible      Ax = b  has 1 solution
    r = m  and  r < n    Short and wide             Ax = b  has ∞ solutions
    r < m  and  r = n    Tall and thin              Ax = b  has 0 or 1 solution
    r < m  and  r < n    Unknown shape              Ax = b  has 0 or ∞ solutions

The reduced R will fall in the same category as the matrix A. In case the pivot columns
come first, we can display these four possibilities for R:

    R  =  [ I ]       [ I  F ]       [ I ]       [ I  F ]
                                     [ 0 ]       [ 0  0 ]
        r = m = n    r = m < n     r = n < m    r < m, r < n

Cases 1 and 2 have full row rank r = m. Cases 1 and 3 have full column rank r = n.
Case 4 is the most general in theory and the least common in practice.
Note
In the first edition of this textbook, we generally stopped at U before reaching R. Instead of reading the complete solution directly from Rx = d, we found it by
back substitution from Ux = c. That combination of reduction to U and back substitution for x is slightly faster. Now we prefer the complete reduction: a single “1” in
each pivot column. We find that everything is so much clearer in R (and the computer
should do the hard work anyway) that we reduce all the way.
■  REVIEW OF THE KEY IDEAS  ■

1.  The rank r is the number of pivots.
2.  Ax = b is solvable if and only if the last m - r equations reduce to 0 = 0.
3.  Then one particular solution xp has all free variables equal to zero.
4.  The pivot variables are determined after the free variables are chosen.
5.  Full column rank r = n means no free variables: one solution or none.
6.  Full row rank r = m means one or infinitely many solutions.
Problem Set 3.4

Questions 1-12 are about the solution of Ax = b. Follow the steps in the text to
find xp and xn. Use the augmented matrix with last column b.

1   Write the complete solution as xp plus any multiple of s:

         x + 3y + 3z = 1
        2x + 6y + 9z = 5
        -x - 3y + 3z = 5.
Find the complete solution (also called the general solution) to
al
5
6
Bd
4
Soa
0 NAR
eta feel
if
1
bas
Tl
3   Under what condition on b1, b2, b3 is this system solvable? Include b as a fourth
    column in elimination. Find all solutions:

         x + 2y - 2z = b1
        2x + 5y - 4z = b2
        4x + 9y - 8z = b3.
Under what conditions on b;, bz, b3, b4 is each system solvable?
case.
tia
DA
le
2 e a Ey
3)
bi
by
fw
2
Oies
wid s6qltl
ba
Bit
O nib Qul nacre
Bail oh pal A
aCe
Zi
oe
Nl
Find x in that
by
b)
by
Show by elimination that (b;, b2, b3) is in the column space of A if b3 — 2b) +
4b, = 0.
bier}
ASN
ue Stee
2 4 0
What combination of the rows of A gives the zero row?
Which vectors (b;, b2,b3) are in the column
of the rows of A give zero?
(2)
eae
Lg
ee
OF
25
(One
25 3
(D).
Ate
space of A?
|
2
4
—
—_—
NO
Which
137
combinations
eR
of
Construct a 2 by 3 system Ax = b with particular solution xp = (2, 4, 0) and
homogeneous solution xn = any multiple of (1, 1, 1).
9   Why can't a 1 by 3 system have xp = (2, 4, 0) and xn = any multiple of (1, 1, 1)?

10  (a)  If Ax = b has two solutions x1 and x2, find two solutions to Ax = 0.
    (b)  Then find another solution to Ax = 0 and another solution to Ax = b.

11  Explain why these are all false:
    (a)  The complete solution is any linear combination of xp and xn.
    (b)  A system Ax = b has at most one particular solution.
    (c)  The solution xp with all free variables zero is the shortest solution
         (minimum length ||x||). Find a 2 by 2 counterexample.
    (d)  If A is invertible there is no homogeneous solution xn.

12  Suppose column 5 of U has no pivot. Then x5 is a ______ variable. The zero
    vector is ______ the only solution to Ax = 0. If Ax = b has a solution, then it
    has ______ solutions.
    Suppose row 3 of U has no pivot. Then that row is ______. The equation Ux = c
    is only solvable provided ______. The equation Ax = b (is) (is not) (might not
    be) solvable.

Questions 13-18 are about matrices of "full rank" r = m or r = n.

13  The largest possible rank of a 3 by 5 matrix is ______. Then there is a pivot in
    every ______ of U. The solution to Ax = b (always exists) (is unique). The
    column space of A is ______. An example is A = ______.

14  The largest possible rank of a 6 by 4 matrix is ______. Then there is a pivot in
    every ______ of U. The solution to Ax = b (always exists) (is unique). The
    nullspace of A is ______. An example is A = ______.
15
Find by elimination the rank of A and also the rank of A’:
|
fee
age
—1
ks
2
oo
10
and
A=jJ|1
1
ie
2/1
The
-
(rank depends on q).
The
16
Find the rank of A and also of A™A
FE)
17
and also of AA!:
: |
Ane
n, (0)
|
i 2B
—e
Reduce A to its echelon form U. Then find a triangular L so that
A = LU.
|e
A=
aNG
oS
18
ANd
pa
§be | 6©
ele Oe
Oi; Oye Seed,
Find the complete solution in the form x, + x, to these full rank systems:
(a))
x+y+z=4
y
Ay
(b)
tiz=4
pate. eas
19
If Ax = b has infinitely many solutions, why is it impossible for Ax = B (new
right side) to have only one solution? Could Ax = B have no solution?
20
Choose the number g so that (if possible) the rank is (a) 1, (b) 2, (c) 3:
@
A=|]-3
«a
—2
2
-l
SghGumt
and
nel
q
: a
q
Questions 21-26 are about matrices of rank r = 1.
21
Fill out these matrices so that they have rank 1:
1
AS
ied
;
and
4
B=|1
and u=|°
Cc
le
22
If A is an m by n matrix with r = 1, its columns are multiples of one column
and its rows are multiples of one row. The column space is a
in R”. The
nullspace is a
in R". Also the column space of A! is a
in R".
23.
Choose vectors u and v so that
A = uv! = column times row:
an|i 4 28 2|8 wa a=[? 7? 8 a 4)
3
66, 6
A= uv! is the natural form for every matrix that has rank r = 1.
24
If A is a rank one matrix, the second row of U is
. Do an example.
25
Multiply a rank one matrix times a rank one matrix, to find the rank of AB and
AM:
26
The rank one matrix uvᵀ times the rank one matrix wzᵀ is uzᵀ times the number
______. This has rank one unless ______.
27  Give examples of matrices A for which the number of solutions to Ax = b is
    (a)  0 or 1, depending on b
    (b)  ∞, regardless of b
    (c)  0 or ∞, depending on b
    (d)  1, regardless of b.

28  Write down all known relations between r and m and n if Ax = b has
    (a)  no solution for some b
    (b)  infinitely many solutions for every b
    (c)  exactly one solution for some b, no solution for other b
    (d)  exactly one solution for every b.
Questions 29-33 are about Gauss-Jordan elimination (upwards as well as downwards)
and the reduced echelon matrix R.

29
Continue elimination from U to R. Divide rows by pivots so the new pivots are
all 1. Then produce zeros above those pivots to reach R:
U
30
Suppose
31
Apply Gauss-Jordan
Rie a:
2
=\20
0
4
Ww 6
S
D
ana
0
ea
alal
|
OmO
U is square with n pivots (an invertible matrix).
elimination
to Ux
= 0 and Ux
Explain why R = I.
= cc. Reach
Rx = 0 and
fy o}=[5 6 so] ™ (Y l=[o 0 4 a)
Solve Rx = 0 to find x, (its free variable is x7 = 1). Solve
(its free variable is x2 = 0).
Rx =d
to find x,
32
Gauss-Jordan elimination yields the reduced matrix R. Find Rx = 0 and Rx = d:
GEL
O
|x.
3 OS Ge
10 OF 25
0 O02
0
10
120
and
CP
tert
SOMES
OO
a2.
Ont
Os
P29
4
Solve Ux = 0 or Rx = 0 to find x, (free variable = 1). What are the solutions
to RY =a?
33
Reduce Ax = b to Ux = c (Gaussian elimination) and then to Rx = d (GaussJordan):
[a0) 2s Soles
Aral the63) Qn Ovlale &
7 Ora on pe
2
eee
10
X4
Find a particular solution x, and all homogeneous solutions xj.
34
Find matrices A and B with the given property or explain why you can’t: The
1
only solution of Ax = [2| is ¥ = |¢ |- The ‘only ‘solutiontar Bx =) [87 as
-=[1)
35
Find the LU factorization of A and the complete solution to Ax = b:
1
3
6
ee
ee
i) W
me
BNW
5
36
and
b=
and then
b=
nA
WW
The complete solution to Ax =[}] is x =[}]+c[°]. Find A.
=
ooo
INDEPENDENCE, BASIS AND DIMENSION  ■  3.5
This important section is about the true size of a subspace. There are n columns in an
m by n matrix, and each column has m components. But the true “dimension” of the
column space is not necessarily m or n. The dimension is measured by counting independent columns—and we have to say what that means. For this particular subspace
(the column space) we will see that the true dimension is the rank r.
The idea of independence applies to any vectors v1, . . . , vn in any vector space.
Most of this section concentrates on the subspaces that we know and use—especially
the column space and nullspace. In the last part we also study “vectors” that are not
column vectors. These vectors can be matrices and functions; they can be linearly
independent (or not). First come the key examples using column vectors, before the
extra examples with matrices and functions.
The final goal is to understand a basis for a vector space. We are at the heart
of our subject, and we cannot go on without a basis. The four essential ideas in this
section are:
1.  Independent vectors
2.  Spanning a space
3.  Basis for a space
4.  Dimension of a space.
Linear Independence
Our first definition of independence is not so conventional, but you are ready for it.
DEFINITION
The columns of A are linearly independent when the only solution to
Ax =0 is x = 0. No other combination Ax of the columns gives the zero vector.
With linearly independent columns, the nullspace N(A) contains only the zero vector.
Let me illustrate linear independence (and linear dependence) with three vectors in R³:
1.
Figure 3.4 (left) shows three vectors not in the same plane. They are independent.
2.
Figure 3.4 (right) shows three vectors in the same plane. They are dependent.
This idea of independence applies to 7 vectors in 12-dimensional space. If they
are the columns of A, and independent, the nullspace only contains x = 0. Now we
choose different words to express the same idea. The following definition of independence will apply to any sequence of vectors in any vector space. When the vectors are
the columns of A, the two definitions say exactly the same thing.
Figure 3.4  Independent vectors v1, v2, v3. Dependent vectors w1, w2, w3. The combination w1 - w2 + w3 is (0, 0, 0).
DEFINITION  The sequence of vectors v1, . . . , vn is linearly independent if the only
combination that gives the zero vector is 0v1 + 0v2 + · · · + 0vn. Thus linear independence
means that

    x1 v1 + x2 v2 + · · · + xn vn = 0   only happens when all x's are zero.        (1)

If a combination gives 0, when the x's are not all zero, the vectors are dependent.
Correct language: “The sequence of vectors is linearly independent.” Acceptable
shortcut: “The vectors are independent.”
A collection of vectors is either dependent or independent. They can be combined
to give the zero vector (with nonzero x’s) or they can’t. So the key question is: Which
combinations of the vectors give zero? We begin with some small examples in R?:
(a)
The vectors (1,0) and (0, 1) are independent.
(b)
The vectors (1, 1) and (1, 1.001) are independent.
(c)
The vectors (1,1) and (2,2) are dependent.
(d)
The vectors (1,1) and (0,0) are dependent.
Geometrically, (1, 1) and (2, 2) are on a line through the origin. They are not independent.
To use the definition, find numbers x1 and x2 so that x1(1, 1) + x2(2, 2) = (0, 0).
This is the same as solving Ax = 0:

    [ 1  2 ] [ x1 ]   [ 0 ]
    [ 1  2 ] [ x2 ] = [ 0 ]        One solution is  x1 = -2  and  x2 = 1.
The columns are dependent exactly when there is a nonzero vector in the nullspace.
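A short MATLAB sketch of this test, applied to the small examples above:

    % Sketch: test independence by looking for a nonzero nullspace vector
    A = [1 2; 1 2];                 % columns (1,1) and (2,2)
    rank(A)                         % 1 < 2, so the columns are dependent
    null(A, 'r')                    % the nonzero solution x = (-2, 1) to Ax = 0
    B = [1 1; 1 1.001];
    rank(B)                         % 2, so those columns are independent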
If one of the v’s is the zero vector, independence has no chance. Why not?
Now move to three vectors in R°. If one of them is a multiple of another one,
these vectors are dependent. But the complete test involves all three vectors at once.
We put them in a matrix and try to solve Ax = 0.
Example 1  The columns of this matrix are dependent:

         [ 1  0  3 ]
    A =  [ 2  1  5 ]
         [ 1  0  3 ]

The rank of A is only r = 2. Independent columns would give full column rank r = 3.
In that matrix the rows are also dependent. You can see a combination of these
rows that gives the zero row (it is row
1 minus row 3). For a square matrix, we will
show that dependent columns imply dependent rows (and vice versa).
Question
How do you find that solution to Ax = 0? The systematic way is elimination.
         [ 1  0  3 ]                        [ 1  0   3 ]
    A =  [ 2  1  5 ]    reduces to    R =   [ 0  1  -1 ]
         [ 1  0  3 ]                        [ 0  0   0 ]
The solution x = (—3, 1,1) was exactly the special solution. It shows how the free
column (column 3) is a combination of the pivot columns. That kills independence!
The general rule is already in our hands:
3H  The columns of A are independent exactly when the rank is r = n. There are n
pivots and no free variables. Only x = 0 is in the nullspace.
One case is of special importance because it is clear from the start. Suppose
seven columns have five components each (m = 5 is less than n = 7). Then the
columns must be dependent. Any seven vectors from R° are dependent. The rank of
A cannot be larger than 5. There cannot be more than five pivots in five rows. The
system Ax = 0 has at least 7—5 = 2 free variables, so it has nonzero solutions—which
means that the columns are dependent.
3I  Any set of n vectors in Rᵐ must be linearly dependent if n > m.
The matrix has more columns than rows—it is short and wide. The columns are certainly dependent if n > m, because Ax = 0 has a nonzero solution. The columns
might be dependent or might be independent if n < m. Elimination will reveal the
pivot columns. It is those pivot columns that are independent.
Note Another way to describe linear independence is this: “One of the vectors is a
combination of the other vectors.” That sounds clear. Why don’t we say this from the
start? Our definition was longer: “Some combination gives the zero vector, other than
the trivial combination with every x = 0.’ We must rule out the easy way to get the
zero vector. That trivial combination of zeros gives every author a headache. In the
first statement, the vector that is a combination
of the others has coefficient x = 1.
The point is, our definition doesn’t pick out one particular vector as guilty.
All columns of A are treated the same. We look at Ax = 0, and it has a nonzero
solution or it hasn’t. In the end that is better than asking if the last column (or the
first, or a column in the middle) is a combination of the others.
Vectors that Span a Subspace
The first subspace in this book was the column space. Starting with n columns v1, . . . , vn,
the subspace was filled out by including all combinations x1 v1 + · · · + xn vn. The column
space consists of all combinations Ax of the columns. We now introduce the single
word “span” to describe this: The column space is spanned by the columns.
DEFINITION A set of vectors spans a space if their linear combinations fill the space.
Example 2  The vectors v1 = (1, 0) and v2 = (0, 1) span the full two-dimensional space R².

Example 3  Those two vectors together with any third vector v3 also span the full space R².

Example 4  The vectors w1 = (1, 1) and w2 = (2, 2) only span a line in R². So does w1
by itself. So does w2 by itself.
Think of two vectors coming out from (0, 0,0) in 3-dimensional
space.
Gener-
ally they span a plane. Your mind fills in that plane by taking linear combinations.
Mathematically you know other possibilities: two vectors spanning a line, three vectors spanning all of R°, three vectors spanning only a plane. It is even possible that
three vectors span only a line, or ten vectors span only a plane. They are certainly not
independent!
The columns span the column space. Here is a new subspace—which begins with
the rows. The combinations of the rows produce the “row space”.
DEFINITION  The row space of a matrix is the subspace of Rⁿ spanned by the rows.
The rows of an m by n matrix have n components. They are vectors in Rⁿ—
or they would be if they were written as column vectors. There is a quick way to do
that: Transpose the matrix. Instead of the rows of A, look at the columns of A’. Same
numbers,
but now
in columns.
The row space of A is C(Aᵀ). It is the column space of Aᵀ. It is a subspace
of Rⁿ. The vectors that span it are the columns of Aᵀ, which are the rows of A.
Example 5  Suppose A is 3 by 2 with two independent columns, so that Aᵀ is 2 by 3.
Here m = 3 and n = 2.
The column space of A is spanned by the two columns of A. It is a plane in
R³. The row space of A is spanned by the three rows of A (columns of Aᵀ). It is
all of R². Remember: The rows are in Rⁿ. The columns are in Rᵐ. Same numbers,
different vectors, different spaces.
A Basis for a Vector Space
In the xy plane, a set of independent vectors could be small—just one single vector. A
set that spans the xy plane could be large—three vectors, or four, or infinitely many.
One vector won’t span the plane. Three vectors won’t be independent. A “basis” for
the plane is just right.
DEFINITION  A basis for a vector space is a sequence of vectors that has two
properties at once:
1.  The vectors are linearly independent.
2.  The vectors span the space.
This combination of properties is fundamental to linear algebra. Every vector v in the
space is a combination of the basis vectors, because they span the space. More than
that, the combination that produces v is unique, because the basis vectors v1,... , Up
are independent:
There is one and only one way to write v as a combination of the basis vectors.
Reason:  Suppose v = a1 v1 + · · · + an vn and also v = b1 v1 + · · · + bn vn. By subtraction
(a1 - b1) v1 + · · · + (an - bn) vn is the zero vector. From the independence of the v's,
each ai - bi = 0. Hence ai = bi.
Example 6  The columns of

    I  =  [ 1  0 ]
          [ 0  1 ]

are a basis for R². This is the "standard basis". The vectors i = (1, 0) and j = (0, 1)
are independent. They span R².
Everybody thinks of this basis first. The vector i goes across and j goes straight up.
Similarly the columns of the 3 by 3 identity matrix are the standard basis i, j, k. The
columns of the n by n identity matrix give the “standard basis” for R”. Now we find
other bases.
Example 7 (Important)  The columns of any invertible n by n matrix give a basis for Rⁿ.
The identity matrix is one example, and so is every other invertible 2 by 2 matrix, but
not a singular matrix like

    A  =  [ 1  2 ]
          [ 2  4 ]
When A is invertible, its columns are independent.
The only solution to Ax = 0
is x = 0. The columns span the whole space Rⁿ—because every vector b is a combination
of the columns. Ax = b can always be solved by x = A⁻¹b. Do you see how everything
comes together for invertible matrices? Here it is in one sentence:
3J  The vectors v1, . . . , vn are a basis for Rⁿ exactly when they are the columns of
an n by n invertible matrix. Thus Rⁿ has infinitely many different bases.
When the columns of a matrix are independent, they are a basis for its column space. When the columns are dependent, we keep only the pivot columns—the r
columns with pivots. They are independent and they span the column space.
3K The pivot columns of A are a basis for its column space. The pivot rows of A
are a basis for its row space. So are the pivot rows of its echelon form R.
Example 8  This matrix is not invertible. Its columns are not a basis for anything!

    A  =  [ 2  4 ]    which reduces to    R  =  [ 1  2 ]
          [ 3  6 ]                              [ 0  0 ]
Column 1 of A is the pivot column. That column alone is a basis for its column space.
The second column of A would be a different basis. So would any nonzero multiple
of that column. There is no shortage of bases! So we often make a definite choice:
the pivot columns.
Notice that the pivot column of R is quite different. It ends in zero. That column
is a basis for the column space of R, but it is not even a member of the column space
of A. The column spaces of A and R are different. Their bases are different.
The row space of A is the same as the row space of R. It contains (2,4) and
(1,2) and all other multiples of those vectors. As always, there are infinitely many
bases to choose from. I think the most natural choice is to pick the nonzero rows of
R (rows with a pivot). So this matrix A with rank one has only one vector in each basis:

    Basis for the column space:  [ 2 ]        Basis for the row space:  [ 1 ]
                                 [ 3 ]                                  [ 2 ]
The next chapter will come back to these bases for the column space and row space.
We are happy first with examples where the situation is clear (and the idea of a basis
is still new). The next example is larger but still clear.
Example 9  Find bases for the column space and row space of

    R  =  [ 1  2  0  3 ]
          [ 0  0  1  4 ]
          [ 0  0  0  0 ]
Columns 1 and 3 are the pivot columns. They are a basis for the column space (of R!).
The vectors in that column space all have the form b = (x, y,0). The column space
of R is the “xy plane” inside the full 3-dimensional xyz space. That plane is not R?,
it is a subspace of R°. Columns 2 and 3 are a basis for the same column space. So
are columns
1 and 4, and also columns
2 and 4. Which columns
of R are not a basis
for its column space?
The row space of R is a subspace of R*. The simplest basis for that row space
is the two nonzero rows of R. The third row (the zero vector) is in the row space too.
But it is not in a basis for the row space. The basis vectors must be independent.
Question  Given five vectors in R⁷, how do you find a basis for the space they span?
First answer
Make them the rows of A, and eliminate to find the nonzero rows of R.
Second answer
Put the five vectors into the columns of A. Eliminate to find the pivot
columns (of A not R!). The program colbasis uses the column numbers from pivcol.
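Here is a sketch of both answers in MATLAB, for five invented vectors in R⁷ (the specific entries are made up for illustration, and the built-in rref stands in for the Teaching Code):

    % Sketch: two ways to find a basis for the span of five vectors in R^7
    V = [1 0 2 0 0 1 0;
         0 1 1 0 0 0 0;
         1 1 3 0 0 1 0;            % row 3 = row 1 + row 2, so the span has dimension 4
         0 0 0 1 0 0 1;
         0 0 0 0 1 0 0];
    R = rref(V);                   % first answer: the nonzero rows of R are a basis
    A = V';                        % second answer: put the five vectors in the columns of A
    [~, pivcol] = rref(A);
    basis = A(:, pivcol)           % the pivot columns of A (not of R!) are a basis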
The column space of R had r = 2 basis vectors. Could another basis have more than
r vectors, or fewer? This is a crucial question with a good answer. All bases for a vector
space contain the same number of vectors. This number is the “dimension” .
Dimension of a Vector Space
We have to prove what was just stated. There are many choices for the basis vectors,
but the number of basis vectors doesn’t change.
3L  If v1, . . . , vm and w1, . . . , wn are both bases for the same vector space, then m = n.
Proof Suppose that there are more w’s than v’s. From n > m we want to reach a
contradiction.
The v’s are a basis, so w; must be a combination of the v’s. If w; equals
a11 v1 + · · · + am1 vm, this is the first column of a matrix multiplication W = V A:

    W  =  [ w1  w2  · · ·  wn ]  =  [ v1  v2  · · ·  vm ] A  =  V A.
We don't know each aij, but we know the shape of A (it is m by n). The second
vector w2 is also a combination of the v’s. The coefficients in that combination fill
the second column of A. The key is that A has a row for every v and a column for
every w. It is a short wide matrix, since n > m. There is a nonzero solution to
Ax = 0. But then VAx = 0 and Wx = 0. A combination of the w’s gives zero! The
w’s could not be a basis—which is the contradiction we wanted.
If m > n we exchange the v’s and w’s and repeat the same steps. The only way
to avoid a contradiction is to have m =n. This completes the proof.
The number of basis vectors depends on the space—not on a particular basis.
The number is the same for every basis, and it tells how many “degrees of freedom”
the vector space allows.
For the space Rⁿ, the number is n. This is the dimension of Rⁿ. We now introduce
the important word dimension for other spaces too.
DEFINITION  The dimension of a vector space is the number of vectors in every basis.
This matches our intuition. The line through v = (1,5, 2) has dimension one. It is
a subspace with one vector in its basis. Perpendicular to that line is the plane x +
5y +2z=0. This plane has dimension 2. To prove it, we find a basis (—5, 1, 0) and
(—2,0,1). The dimension is 2 because the basis contains two vectors.
The plane is the nullspace of the matrix A = [ 1  5  2 ], which has two free
variables. Our basis vectors (-5, 1, 0) and (-2, 0, 1) are the "special solutions" to
Ax = 0. The next section studies other nullspaces to show that the special solutions
always give a basis. Here we emphasize only this: All bases for a space contain the
same number of vectors.
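A one-line MATLAB check of this example (null with the 'r' option returns exactly these special solutions):

    % Sketch: the special solutions of [1 5 2] x = 0 are a basis for the plane
    A = [1 5 2];
    N = null(A, 'r')                % columns (-5, 1, 0) and (-2, 0, 1)
    size(N, 2)                      % the dimension of the plane: 2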
Note about the language of linear algebra We never say “the rank of a space” or “the
dimension of a basis” or “the basis of a matrix”. Those terms have no meaning. It is
the dimension of the column space that equals the rank of the matrix.
Bases for Matrix Spaces and Function Spaces
The words “independence” and “basis” and “dimension” are not at all restricted to column
vectors.
We
can ask whether the 3 by 4 matrices
A;, A2, A3 are independent.
They are members of the space of all 3 by 4 matrices; some combination might give
the zero matrix.
We can also ask the dimension of that matrix space (it is 12).
In differential equations we find a basis for the solutions to d²y/dx² = y. That
basis contains functions, probably y = eˣ and y = e⁻ˣ. Counting the basis functions
gives the dimension 2 (for the space of all solutions).
We think matrix spaces and function spaces are optional. Your class can go past
this part—no problem. But in some way, you haven’t got the ideas straight until you
can apply them to “vectors” other than column vectors.
Matrix spaces  The vector space M contains all 2 by 2 matrices. Its dimension is 4
and here is a basis:

    A1, A2, A3, A4  =  [ 1  0 ],  [ 0  1 ],  [ 0  0 ],  [ 0  0 ].
                       [ 0  0 ]   [ 0  0 ]   [ 1  0 ]   [ 0  1 ]

Those matrices are linearly independent. We are not looking at their columns, but at
the whole matrix. Combinations of those four matrices can produce any matrix in M,
so they span the space:

    c1 A1 + c2 A2 + c3 A3 + c4 A4  =  [ c1  c2 ].
                                      [ c3  c4 ]

This is zero only if the c's are all zero—which proves independence.
The matrices A1, A2, A4 are a basis for a subspace—the upper triangular matrices.
Its dimension is 3. A1 and A4 are a basis for the diagonal matrices. What is a
basis for the symmetric matrices? Keep A1 and A4, and throw in A2 + A3.
To push this further, think about the space of all n by n matrices. For a basis,
choose matrices that have only a single nonzero entry (that entry is 1). There are n²
positions for that 1, so there are n² basis matrices:

    The dimension of the whole matrix space is n².
    The dimension of the subspace of upper triangular matrices is n²/2 + n/2.
    The dimension of the subspace of diagonal matrices is n.
    The dimension of the subspace of symmetric matrices is n²/2 + n/2.
Function spaces  The equations d²y/dx² = 0 and d²y/dx² + y = 0 and d²y/dx² - y = 0
involve the second derivative. In calculus we solve to find the functions y(x):

    y'' = 0      is solved by any linear function   y = cx + d
    y'' = -y     is solved by any combination       y = c sin x + d cos x
    y'' = y      is solved by any combination       y = c eˣ + d e⁻ˣ.

The second solution space has two basis functions: sin x and cos x. The third
solution space has basis functions eˣ and e⁻ˣ. The first space has x and 1. It is
the "nullspace" of the second derivative! The dimension is 2 in each case (these are
second-order equations).
What about y'' = 2? Its solutions do not form a subspace—there is a nonzero right
side b = 2. A particular solution is y(x) = x². The complete solution is y(x) = x² + cx + d.
All those functions satisfy y'' = 2. Notice the particular solution plus any function cx + d
in the nullspace. A linear differential equation is like a linear matrix equation Ax = b.
But we solve it by calculus instead of linear algebra.
We end here with the space Z that contains only the zero vector. The dimension of
this space is zero. The empty set (containing no vectors at all) is a basis. We can never
allow the zero vector into a basis, because then linear independence is lost.
■  REVIEW OF THE KEY IDEAS  ■

1.  The columns of A are independent if x = 0 is the only solution to Ax = 0.
2.  The vectors v1, . . . , vn span a space if their combinations fill the space.
3.  A basis contains linearly independent vectors that span the space.
4.  All bases for a space have the same number of vectors. This number is the
    dimension of the space.
5.  The pivot columns are a basis for the column space and the dimension is r.
Problem Set 3.5
Questions 1-10 are about linear independence and linear dependence.
1
Show that v,, v2, v3 are independent but v1, v2, v3, v4 are dependent:
1
y=
1
0
vN2=
0
1
1
v3O=
0
1
2
V4 =
1
3
4
Solve either cjv + c2v2 +¢303 = 0 or Ax = 0. The v’s go in the columns of A.
(Recommended)
y=
1
el
0
0
Find the largest possible number of independent vectors among
l
v2=
aa
v=
1
0
0
eal
04 =
0
1
ey
0
0
1
Uv =
0
|
Vv =
0
0
1
aa
Prove that if a = 0 ord = 0 or f = 0 (3 cases), the columns of U are dependent:
3.5
Independence, Basis and Dimension
If a,d, f in Question 3 are all nonzero,
151
show that the only solution to Ux = 0
is x = 0. Then U has independent columns.
Decide the dependence or independence of
    (a)  the vectors (1, 3, 2) and (2, 1, 3) and (3, 2, 1)
    (b)  the vectors (1, -3, 2) and (2, 1, -3) and (-3, 2, 1).
the vectors
Choose three independent columns of U. Then make two other choices. Do the
same for A.
Pee
YO
tee: aa |
oo 7 ©
ode
ON6r
Tae,
770
ad WO ne) oo A ano creel (Oar)
0440
0.4.0
ay
(th
If w 1, W2, w3 are independent vectors, show that the differences vj = w2 — w3
and v2 = w) — w3 and v3 = wy — w2 are dependent. Find a combination of the
v’s that gives zero.
If w,, w2,w3 are independent vectors, show that the sums v} =
w2 + w3
and v2 = w; + w3 and v3 = w; + w2 are independent. (Write c)vj+c2v2 +c303 = 0
in terms of the w’s. Find and solve equations for the c’s.)
9   Suppose v1, v2, v3, v4 are vectors in R³.
    (a)  These four vectors are dependent because ______.
    (b)  The two vectors v1 and v2 will be dependent if ______.
    (c)  The vectors v1 and (0, 0, 0) are dependent because ______.

10  Find two independent vectors on the plane x + 2y - 3z - t = 0 in R⁴. Then find
    three independent vectors. Why not four? This plane is the nullspace of what
    matrix?
Questions 11-15 are about the space spanned by a set of vectors. Take all linear
combinations of the vectors.

11  Describe the subspace of R³ (is it a line or plane or R³?) spanned by
    (a)  the two vectors (1, 1, -1) and (-1, -1, 1)
    (b)  the three vectors (0, 1, 1) and (1, 1, 0) and (0, 0, 0)
    (c)  the columns of a 3 by 5 echelon matrix with 2 pivots
    (d)  all vectors with positive components.

12  The vector b is in the subspace spanned by the columns of A when there is a
    solution to ______. The vector c is in the row space of A when there is a solution
    to ______.
    True or false: If the zero vector is in the row space, the rows are dependent.
13
Find the dimensions of these 4 spaces.
(a) column
Vector Spaces and Subspaces
space of A,
(b) column
Which two of the spaces are the same?
space of U,
(c) row
space of A,
(d) row
space of U:
|
Asesinas
So
14
1
ge
1
peat
Bd
(a
er
al) eae)
2 wll
Cr
ee)
x1, x3, x4)
62): 10 R‘. It has 24 rearrangements like (x2,
= (2 14,.19,.03,
Choose
and (x4, x3, X1,X2). Those 24 vectors, including x itself, span a subspace S. Find
specific vectors x so that the dimension of S is: (a) zero,
(b) one,
(c) three,
(d) four.
15
v + w and v — w are combinations of v and w. Write v and w as combinations
of v-+w and
v—w. The two pairs of vectors
the same space. When are
they a basis for the same space?
Questions 16-26 are about the requirements for a basis.
16
If vj,..., Vy, are linearly independent, the space they span has dimension
These vectors are a
for that space. If the vectors are the columns of an
m by n matrix, then m is
than n.
17
Find a basis for each of these subspaces of R?*:
(a)
All vectors whose components are equal.
(b)
All vectors whose components
(c)
All vectors that are perpendicular to (1, 1,0,0) and (1,0, 1, 1).
(d)
The column space (in R’) and nullspace (in R>) of
add to zero.
18
Find three different bases for the column space of U above.
ferent bases for the row space of U.
19
Suppose v1, 2,..., 6 are six vectors in R*.
U= i : i : als
Then find two dif-
(a)
Those vectors (do)(do not)(might not) span R¢.
(b)
Those vectors (are)(are not)(might be) linearly independent.
(c)
Any four of those vectors (are)(are not)(might be) a basis for Re
20
The columns of A are n vectors from R”. If they are linearly independent, what
is the rank of A? If they span R”, what is the rank? If they are a basis for R”,
what then?
21
Find a basis for the plane x — 2y + 3z = 0 in R*. Then find a basis for the
intersection of that plane with the xy plane. Then find a basis for all vectors
perpendicular to the plane.
22  Suppose the columns of a 5 by 5 matrix A are a basis for R⁵.
    (a)  The equation Ax = 0 has only the solution x = 0 because ______.
    (b)  If b is in R⁵ then Ax = b is solvable because ______.
    Conclusion:  A is invertible. Its rank is 5.

23
Suppose S is a 5-dimensional subspace of R⁶. True or false:
    (a)  Every basis for S can be extended to a basis for R⁶ by adding one more vector.
    (b)  Every basis for R⁶ can be reduced to a basis for S by removing one vector.
24  U comes from A by subtracting row 1 from row 3:
I
Sk
i
A=
Pasa?
il
3,
ann
Cea
2
|
OROREO
Find bases for the two column spaces. Find bases for the two row spaces. Find
bases for the two nullspaces.
25  True or false (give a good reason):
    (a)  If the columns of a matrix are dependent, so are the rows.
    (b)  The column space of a 2 by 2 matrix is the same as its row space.
    (c)  The column space of a 2 by 2 matrix has the same dimension as its row space.
    (d)  The columns of a matrix are a basis for the column space.

26  For which numbers c and d do these matrices have rank 2?
ih
|| 0
Oo
kay
Oo @
© @
AOS)
2-2
@ 2
and
Ate
=|;
|]
Questions 27-32 are about spaces where the ‘“‘vectors” are matrices.
27  Find a basis for each of these subspaces of 3 by 3 matrices:
    (a)  All diagonal matrices.
    (b)  All symmetric matrices (Aᵀ = A).
    (c)  All skew-symmetric matrices (Aᵀ = -A).
28  Construct six linearly independent 3 by 3 echelon matrices U1, . . . , U6.

29  Find a basis for the space of all 2 by 3 matrices whose columns add to zero.
    Find a basis for the subspace whose rows also add to zero.

30  Show that the six 3 by 3 permutation matrices (Section 2.6) are linearly dependent.
31  What subspace of 3 by 3 matrices is spanned by
    (a)  all invertible matrices?
    (b)  all echelon matrices?
    (c)  the identity matrix?

32  Find a basis for the space of 2 by 3 matrices whose nullspace contains (2, 1, 1).
Questions 33-37 are about spaces where the ‘‘vectors” are functions.
33  (a)  Find all functions that satisfy dy/dx = 0.
    (b)  Choose a particular function that satisfies dy/dx = 3.
    (c)  Find all functions that satisfy dy/dx = 3.
34  The cosine space F3 contains all combinations y(x) = A cos x + B cos 2x +
    C cos 3x. Find a basis for the subspace with y(0) = 0.
35  Find a basis for the space of functions that satisfy
    (a)  dy/dx - 2y = 0
    (b)  dy/dx - y/x = 0.
36
Suppose y;(x), y2(x), y3(x) are three different functions of x. The vector space
they span could have dimension 1, 2, or 3. Give an example of y;, y2, y3 to show
each possibility.
37  Find a basis for the space of polynomials p(x) of degree ≤ 3. Find a basis for
    the subspace with p(1) = 0.

38  Find a basis for the space S of vectors (a, b, c, d) with a + c + d = 0 and also for
    the space T with a + b = 0 and c = 2d. What is the dimension of the intersection
    S ∩ T?
DIMENSIONS OF THE FOUR SUBSPACES  ■  3.6
The main theorem in this chapter connects rank and dimension. The rank of a matrix
is the number of pivots. The dimension of a subspace is the number of vectors in a
basis. We count pivots or we count basis vectors. The rank of A reveals the dimensions
of all four fundamental subspaces. Here are the subspaces, including the new one.
Two subspaces come directly from A, and the other two from Aᵀ:
1.  The row space is C(Aᵀ), a subspace of Rⁿ.
2.  The column space is C(A), a subspace of Rᵐ.
3.  The nullspace is N(A), a subspace of Rⁿ.
4.  The left nullspace is N(Aᵀ), a subspace of Rᵐ. This is our new space.
In this book the column
space and nullspace came
first.
We know
C(A)
and N(A)
pretty well. Now the other two subspaces come forward.
For the row space we take all combinations of the rows. This is also the column
space of Aᵀ. For the left nullspace we solve Aᵀy = 0—that system is n by m. This
is the nullspace of Aᵀ. The vectors y go on the left side of A when the equation is
written as yᵀA = 0ᵀ. The matrices A and Aᵀ are usually
very different. So are their column spaces and their nullspaces. But those spaces are
connected in an absolutely beautiful way.
Part 1 of the Fundamental Theorem finds the dimensions of the four subspaces.
One fact will stand out: The row space and column space have the same dimension.
That dimension is r (the rank of the matrix). The other important fact involves the two
nullspaces: Their dimensions are n —r and m —r, to make up the full dimensions n
and m.
Part 2 of the Theorem will describe how the four subspaces fit together (two in
Rⁿ and two in Rᵐ). That completes the "right way" to understand Ax = b. Stay
with it—you are doing real mathematics.
The Four Subspaces for R
Suppose A is reduced to its row echelon form R. For that special form, the four subspaces are easy to identify. We will find a basis for each subspace and check its dimension. Then we watch how the subspaces change (or don’t change!) as we look
back at A. The main point is that the four dimensions are the same for A and R.
As a specific 3 by 5 example, look at the four subspaces for the echelon matrix R:
         [ 1  3  5  0  9 ]
    R =  [ 0  0  0  1  8 ]        pivot rows 1 and 2
         [ 0  0  0  0  0 ]        pivot columns 1 and 4
The rank is r = 2 (two pivots). Take the subspaces in order:
1. The row space of R has dimension 2, matching the rank.
Reason:
The first two rows are a basis. The row space contains combinations of all
three rows, but the third row (the zero row) adds nothing new. So rows 1 and 2 span
the row space.
The pivot rows 1 and 2 are also independent. That is obvious for this example,
and it is always true. If we look only at the pivot columns, we see the r by r identity
matrix. There is no way to combine its rows to give the zero row (except by the
combination with all coefficients zero). So the r pivot rows are independent and the
dimension is r.
The dimension of the row space is r. The nonzero rows of R form a basis.
2. The column space of R also has dimension r = 2.
Reason: The pivot columns form a basis. They are independent because they start
with the r by r identity matrix. No combination of those pivot columns can give the
zero column (except the combination with all coefficients zero). And they also span
the column space. Every other (free) column is a combination of the pivot columns.
Actually the combination we need is also in the special solution:
    Column 2 is 3 (column 1). The special solution is (-3, 1, 0, 0, 0).
    Column 3 is 5 (column 1). The special solution is (-5, 0, 1, 0, 0).
    Column 5 is 9 (column 1) + 8 (column 4). The special solution is (-9, 0, 0, -8, 1).
The pivot columns are independent, and they span, so they are a basis.
The dimension of the column space is r. The pivot columns form a basis.
3. The nullspace of this R has dimension n —r = 5 — 2. There are n —r = 3
free variables. Here x2,x3,x5 are free (no pivots in those columns). Those 3 free
variables yield the 3 special solutions to Rx = 0. Set a free variable to 1, and solve
for x1 and x4:

   s2 = (-3, 1, 0, 0, 0)     s3 = (-5, 0, 1, 0, 0)     s5 = (-9, 0, 0, -8, 1)

   Rx = 0 has the complete solution  x = x2 s2 + x3 s3 + x5 s5.
There is a special solution for each free variable. With n variables and r pivot variables, that leaves n - r free variables and special solutions:
The nullspace has dimension n —r. The special solutions form a basis.
The special solutions are independent, because they contain the identity matrix in rows 2, 3, 5. All solutions are combinations of special solutions, x = x2 s2 + x3 s3 + x5 s5, because this gets x2, x3 and x5 in the correct positions. Then the pivot variables x1 and x4 are totally determined by the equations Rx = 0.
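These counts are easy to confirm numerically. A minimal MATLAB-style sketch (using the same 3 by 5 echelon matrix R; null(.,'r') returns the "rational" special solutions):

   R = [1 3 5 0 9; 0 0 0 1 8; 0 0 0 0 0];
   r = rank(R)               % r = 2 pivots
   N = null(R,'r')           % the three special solutions as columns
   size(N,2)                 % n - r = 3 free variables
   size(null(R'),2)          % m - r = 1: dimension of the left nullspace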
4. The nullspace of R^T has dimension m - r = 3 - 2.

Reason: The equation R^T y = 0 looks for combinations of the columns of R^T (the rows of R) that produce zero. This equation R^T y = 0 is

   y1 (1, 3, 5, 0, 9) + y2 (0, 0, 0, 1, 8) + y3 (0, 0, 0, 0, 0) = (0, 0, 0, 0, 0).        (1)

The solutions y1, y2, y3 are pretty clear. We need y1 = 0 and y2 = 0. The variable y3 is free (it can be anything). The nullspace of R^T contains all vectors y = (0, 0, y3). It is the line of all multiples of the basis vector (0, 0, 1).
In all cases R ends with m - r zero rows. Every combination of these m - r rows gives zero. These are the only combinations of the rows of R that give zero, because the pivot rows are linearly independent. So we can identify the left nullspace of R, which is the nullspace of R^T:

   The left nullspace has dimension m - r. The solutions are y = (0, ..., 0, y_{r+1}, ..., y_m).
The first r rows of R are independent. The other rows are zero. To produce a zero
combination, y must start with r zeros. This leaves dimension m — r.
Why is this a "left nullspace"? The reason is that R^T y = 0 can be transposed to y^T R = 0^T. Now y^T is a row vector to the left of R. You see the y's in equation (1) multiplying the rows. This subspace came fourth, and some linear algebra books omit it—but that misses the beauty of the whole subject.
In R^n the row space and nullspace have dimensions r and n - r (adding to n).
In R^m the column space and left nullspace have dimensions r and m - r (total m).
So far this is proved for echelon matrices R. Figure 3.5 shows the same for A.
The Four Subspaces for A
We have a small job still to do. The subspace dimensions for A are the same as for R.
The job is to explain why. Remember that those matrices are connected by an invertible
matrix E (the product of all the elementary matrices that reduce A to R):
   EA = R        and        A = E^{-1} R                                          (2)
Figure 3.5   The dimensions of the four fundamental subspaces (for R and for A): the column space C(A) and row space C(A^T) have dimension r, the nullspace N(A) (from Ax = 0) has dimension n - r, and the left nullspace N(A^T) (from A^T y = 0) has dimension m - r.

1   A has the same row space as R. Same dimension r and same basis.
Reason: Every row of A is a combination of the rows of R. Also every row of R is
a combination of the rows of A. In one direction the combinations are given by E^{-1},
in the other direction by E. Elimination changes the rows of A to the rows of R, but
the row spaces are identical.
Since A has the same row space as R, we can choose the first r rows of R as a
basis. Or we could choose r suitable rows of the original A. They might not always
be the first r rows of A, because those could be dependent. The good r rows of A
are the ones that end up as pivot rows in R.
2   The column space of A has dimension r. For every matrix this fact is essential:
The number of independent columns equals the number of independent rows.
Wrong reason:
“A and R have the same column space.” This is false. The columns
of R often end in zeros. The columns of A don’t often end in zeros. The column
spaces are different, but their dimensions are the same—equal to r.
Right reason:
A combination of the columns of A is zero exactly when the same
combination of the columns of R is zero. Say that another way: Ax = 0 exactly when Rx = 0.
Conclusion
The r pivot columns in R are a basis for its column space. So the r pivot
columns of A are a basis for its column space.
3   A has the same nullspace as R. Same dimension n - r and same basis.
Reason: The elimination steps don't change the solutions. The special solutions are a basis for this nullspace. There are n - r free variables, so the dimension is n - r.
Notice that r + (n —r) equals n:
(dimension of column space) + (dimension of nullspace) = dimension of R”.
4   The left nullspace of A (the nullspace of A^T) has dimension m - r.
Reason: A^T is just as good a matrix as A. When we know the dimensions for every A, we also know them for A^T. Its column space was proved to have dimension r. Since A^T is n by m, the "whole space" is now R^m. The counting rule for A was r + (n - r) = n. The counting rule for A^T is r + (m - r) = m. So the nullspace of A^T has dimension m - r. We now have all details of the main theorem:
Fundamental Theorem of Linear Algebra, Part 1
The column space and row space both have dimension r.
The nullspaces have dimensions
n—r and m—r.
By concentrating on spaces of vectors, not on individual numbers or vectors, we get
these clean rules. You will soon take them for granted—eventually they begin to look
obvious. But if you write down an 11 by 17 matrix with 187 nonzero entries, we don’t
think most people would see why these facts are true:
   dimension of C(A) = dimension of C(A^T)
   dimension of C(A) + dimension of N(A) = 17.
Example 1   A = [ 1 2 3 ] has m = 1 and n = 3 and rank r = 1.
The row space is a line in R^3. The nullspace is the plane Ax = x1 + 2x2 + 3x3 = 0. This plane has dimension 2 (which is 3 - 1). The dimensions add to 1 + 2 = 3.
   The columns of this 1 by 3 matrix are in R^1! The column space is all of R^1. The left nullspace contains only the zero vector. The only solution to A^T y = 0 is y = 0. That is the only combination of the row that gives the zero row. Thus N(A^T) is Z, the zero space with dimension 0 (which is m - r). In R^m the dimensions add to 1 + 0 = 1.
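A quick numerical check of Example 1 (a MATLAB-style sketch, not part of the text):

   A = [1 2 3];
   rank(A)                   % r = 1
   size(null(A),2)           % n - r = 2: the nullspace is a plane
   size(null(A'),2)          % m - r = 0: the left nullspace is only the zero vector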
Example 2   A = [ 1 2 3 ]   has m = 2 with n = 3 and rank r = 1.
                [ 1 2 3 ]

The row space is the same line through (1, 2, 3). The nullspace is the same plane x1 + 2x2 + 3x3 = 0. The dimensions still add to 1 + 2 = 3.
   The columns are multiples of the first column (1, 1). But there is more than the zero vector in the left nullspace. The first row minus the second row is the zero row.
Therefore A^T y = 0 has the solution y = (1, -1). The column space and left nullspace are perpendicular lines in R^2. Their dimensions are 1 and 1, adding to 2:

   column space = line through (1, 1)        left nullspace = line through (1, -1).

If A has three equal rows, its rank is ______. What are two of the y's in its left nullspace? The y's combine the rows to give the zero row.
Matrices of Rank One
That last example had rank r = 1—and rank one matrices are special. We can describe
them all. You will see again that dimension of row space = dimension of column
space. When r = 1, every row is a multiple of the same row:
         [  1  2  3 ]              [  1 ]
   A  =  [  2  4  6 ]    equals    [  2 ]    times    [ 1  2  3 ].
         [ -3 -6 -9 ]              [ -3 ]
         [  0  0  0 ]              [  0 ]

A column times a row (4 by 1 times 1 by 3) produces a matrix (4 by 3). All rows are multiples of the row (1, 2, 3). All columns are multiples of the column (1, 2, -3, 0). The row space is a line in R^3, and the column space is a line in R^4.
Every rank one matrix has the special form A = u v^T = column times row. The columns are multiples of u. The rows are multiples of v^T. The nullspace is the plane perpendicular to v. (Ax = 0 means that u(v^T x) = 0 and then v^T x = 0.) It is this perpendicularity of the subspaces that will be Part 2 of the Fundamental Theorem.
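A short numerical check that column times row really gives rank one (a MATLAB-style sketch with the same u and v as the 4 by 3 example above):

   u = [1; 2; -3; 0];  v = [1; 2; 3];
   A = u*v';                 % column times row
   rank(A)                   % 1
   null(A)                   % two basis vectors for the plane v'*x = 0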
REVIEW OF THE KEY IDEAS
1.  The r pivot rows of R are a basis for the row space of R and A (same row space).

2.  The r pivot columns of A (!) are a basis for its column space.

3.  The n - r special solutions are a basis for the nullspace of A and R (same nullspace).

4.  The last m - r rows of I are a basis for the left nullspace of R.

5.  The last m - r rows of E are a basis for the left nullspace of A.
Problem Set 3.6
1    (a) If a 7 by 9 matrix has rank 5, what are the dimensions of the four subspaces?
     (b) If a 3 by 4 matrix has rank 3, what are its column space and left nullspace?
2    Find bases for the four subspaces associated with A and B:

         A = [ 1 2 4 ]        and        B = [ 1 2 4 ]
             [ 2 4 8 ]                       [ 2 5 8 ]
3    Find a basis for each of the four subspaces associated with

         A = [ 0 1 2 3 4 ]
             [ 0 1 2 4 6 ]
             [ 0 0 0 1 2 ]
Construct a matrix with the required property or explain why no such matrix exists:
space contains
1]
70
1|,[°|, row space contains [3], |
2].
(a)
Column
(b)
1
3
Column space has basis E iFnullspace has basis li].
(c)
Dimension of nullspace = | + dimension of left nullspace.
(d)
Left nullspace contains [4], row space contains |} ].
(e)
Row space = column space, nullspace ¥ left nullspace.
If V is the subspace spanned by (1, 1, 1) and (2, 1,0), find a matrix A that has
V as its row space and a matrix B that has V as its nullspace.
Without elimination, find dimensions
and bases for the four subspaces for
3
>
II
0
ands
be
1
—Ow Sow
>oo
Re
np
Suppose the 3 by 3 matrix A is invertible. Write down bases for the four subspaces for A, and also for the 3 by 6 matrix
B=[A A].
8    What are the dimensions of the four subspaces for A, B, and C, if I is the 3 by 3 identity matrix and 0 is the 3 by 2 zero matrix?

         A = [ I  0 ]     and     B = [ I    I   ]     and     C = [ 0 ].
                                      [ 0^T  0^T ]
9    Which subspaces are the same for these matrices of different sizes?

     (a)  [ A ]  and  [ A ]            (b)  [ A ]  and  [ A  A ]
                      [ A ]                 [ A ]       [ A  A ]

     Prove that all three matrices have the same rank r.
10
If the entries of a 3 by 3 matrix are chosen randomly between 0 and 1, what are
the most likely dimensions of the four subspaces? What if the matrix is 3 by 5?
11   (Important) A is an m by n matrix of rank r. Suppose there are right sides b for which Ax = b has no solution.
     (a) What are all inequalities (< or ≤) that must be true between m, n, and r?
     (b) How do you know that A^T y = 0 has solutions other than y = 0?
12   Construct a matrix with (1, 0, 1) and (1, 2, 0) as a basis for its row space and its column space. Why can't this be a basis for the row space and nullspace?
13   True or false:
     (a) If m = n then the row space of A equals the column space.
     (b) The matrices A and -A share the same four subspaces.
     (c) If A and B share the same four subspaces then A is a multiple of B.
14   Without computing A, find bases for the four fundamental subspaces:
AO
15
ees VERA
10
Dems wal
|
a 58 §
OFT
2 3
ORO
15   If you exchange the first two rows of A, which of the four subspaces stay the same? If v = (1, 2, 3, 4) is in the column space of A, write down a vector in the column space of the new matrix.
16
Explain why v = (1, 2,3) cannot be a row of A and also be in the nullspace
of A.
17   Describe the four subspaces of R^3 associated with

         A = [ 0 1 0 ]                   [ 1 1 0 ]
             [ 0 0 1 ]    and    I + A = [ 0 1 1 ].
             [ 0 0 0 ]                   [ 0 0 1 ]
18   (Left nullspace) Add the extra column b and reduce A to echelon form:

     [ A  b ]  =  [ 1 2 3 b1 ]        ->        [ 1  2  3  b1            ]
                  [ 4 5 6 b2 ]                  [ 0 -3 -6  b2 - 4b1      ]
                  [ 7 8 9 b3 ]                  [ 0  0  0  b3 - 2b2 + b1 ]
A combination of the rows of A has produced the zero row. What combination is it? (Look at b3 - 2b2 + b1 on the right side.) Which vectors are in the nullspace of A^T and which are in the nullspace of A?
19
Following the method of Problem
18, reduce A to echelon form and look at zero
rows. The b column tells which combinations you have taken of the rows:
Be tinal eas
(a) |3Gn
4 by
(b)
3
2
2
AL Ios
2
55 sba
From the b column after elimination, read off vectors in the left nullspace of A (combinations of rows that give zero). You should have m - r basis vectors for N(A^T).
20   (a) Describe all solutions to Ax = 0 if
[05-76
Pa ais MLS
eee: Sey
SN
Oh On
0d 0)
aa
aaes
50) 20
     (b) How many independent solutions are there to A^T y = 0?
     (c) Give a basis for the column space of A.
21   Suppose A is the sum of two matrices of rank one: A = u v^T + w z^T.
     (a) Which vectors span the column space of A?
     (b) Which vectors span the row space of A?
     (c) The rank is less than 2 if ______.
     (d) Compute A and its rank if u = z = (1, 0, 0) and v = w = (0, 0, 1).
22   Construct a matrix whose column space has basis (1, 2, 4), (2, 2, 1) and whose row space has basis (1, 0, 0), (0, 1, 1).
23
Without multiplying matrices, find bases for the row and column spaces of A:
Tele sles
eee
a
How do you know from these shapes that A is not invertible?
24   A^T y = d is solvable when the right side d is in which subspace? The solution is unique when the ______ contains only the zero vector.
25   True or false (with a reason or a counterexample):
     (a) A and A^T have the same number of pivots.
     (b) A and A^T have the same left nullspace.
     (c) If the row space equals the column space then A^T = A.
     (d) If A^T = -A then the row space of A equals the column space.
26   If AB = C, the rows of C are combinations of the rows of ______. So the rank of C is not greater than the rank of ______. Since B^T A^T = C^T, the rank of C is also not greater than the rank of ______.
27   If a, b, c are given with a ≠ 0, how would you choose d so that A = [ a b ; c d ] has rank one? Find a basis for the row space and nullspace.
28
Find the ranks of the 8 by 8 checkerboard matrix B and chess matrix C:
Ds ice lean
Ot
ysli
O
.
Orr
Aig Oo atari
20m i Oo
tf
©
!
.
Orr. 0
NR PR 3 a et Stall PE FC AN
|
O
lt
C
°
PO
al
.
Ort
The numbers r,n, b,g,k, p are all different.
(C ==
Pp pe
Ppa
four zero rows
Dp
=)
PoP
PP
2 oP
:
nO
Gg kK bon
Find bases for the row space and left
nullspace of B and C. Challenge problem: Find a basis for the nullspace of C.
ORTHOGONALITY

ORTHOGONALITY OF THE FOUR SUBSPACES ■ 4.1
Two vectors are orthogonal when their dot product is zero: v · w = 0 or v^T w = 0.
This chapter moves up a level, from orthogonal vectors to orthogonal subspaces.
Orthogonal means the same as perpendicular.
Subspaces entered Chapter 3 with a specific purpose—to throw light on Ax = b.
Right away we needed the column space (for b) and the nullspace (for x). Then the
light turned onto A^T, uncovering two more subspaces. Those four fundamental subspaces reveal what a matrix really does.
A matrix multiplies a vector: A times x. At the first level this is only numbers.
At the second level Ax is a combination of column vectors. The third level shows
subspaces. But I don’t think you have seen the whole picture until you study Figure
4.1. It fits the subspaces together, to show the hidden reality of A times x. The 90°
angles between subspaces are something new—and we have to say what they mean.
The row space is perpendicular to the nullspace. Every row of A is perpendicular to every solution of Ax = 0. Similarly every column is perpendicular to every
solution of A^T y = 0. That gives the 90° angle on the right side of the figure. This
perpendicularity of subspaces is Part 2 of the Fundamental Theorem of Linear Algebra.
May we add a word about the left nullspace? It is never reached by Ax, so it
might seem useless. But when b is outside the column space—when we want to solve Ax = b and can't do it—then this nullspace of A^T comes into its own. It contains
the error in the “least-squares” solution. That is the key application of linear algebra
in this chapter.
Part 1 of the Fundamental Theorem gave the dimensions of the subspaces. The row
and column spaces have the same dimension r (they are drawn the same size). The two
nullspaces have the remaining dimensions n — r and m — r. Now we will show that the
row space and nullspace are actually perpendicular.
DEFINITION
Two subspaces V and W of a vector space are orthogonal if every
vector v in V is perpendicular to every vector w in W:
   v · w = 0    or    v^T w = 0    for all v in V and all w in W.
Figure 4.1   Two pairs of orthogonal subspaces: the row space and nullspace of A (dimensions r and n - r), and the column space of A and nullspace of A^T (dimensions r and m - r). Dimensions add to n and m.
Example 1 The floor of your room (extended to infinity) is a subspace V. The line where
two walls meet is a subspace W (one-dimensional). Those subspaces are orthogonal.
Every vector up the meeting line is perpendicular to every vector in the floor. The origin
(0, 0, 0) is in the corner. We assume you don’t live in a tent.
Example 2 Suppose V is still the floor but W is one of the walls (a two-dimensional
space). The wall and floor look orthogonal but they are not! You can find vectors in
V and W that are not perpendicular. In fact a vector running along the bottom of the
wall is also in the floor. This vector is in both V and W—and it is not perpendicular
to itself.
Example 3 If a vector is in two orthogonal subspaces, it must be perpendicular to itself. It is v and it is w, so v^T v = 0. This has to be the zero vector. Zero is the only
point where the nullspace meets the row space.
The crucial examples for linear algebra come from the fundamental subspaces.
4A Every vector x in the nullspace of A is perpendicular to every row of A. The
nullspace and row space are orthogonal subspaces.
To see why x is perpendicular to the rows, look at Ax = 0. Each row multiplies x:
           [ row 1 ]         [ 0 ]
   Ax  =   [  ...  ]  x   =  [ : ]                                                (1)
           [ row m ]         [ 0 ]

The first equation says that row 1 is perpendicular to x. The last equation says that row m is perpendicular to x. Every row has a zero dot product with x. Then x is perpendicular to every combination of the rows. The whole row space C(A^T) is orthogonal to the whole nullspace N(A).
Here is a second proof of that orthogonality for readers who like matrix shorthand.
The vectors in the row space are combinations A^T y of the rows. Take the dot product of A^T y with any x in the nullspace. These vectors are perpendicular:

   x^T (A^T y) = (Ax)^T y = 0^T y = 0.                                            (2)
We like the first proof. You can see those rows of A multiplying x to produce zeros
in equation (1). The second proof shows why A and A^T are both in the Fundamental Theorem. A^T goes with y and A goes with x. At the end we used Ax = 0.
In this next example, the rows are perpendicular to (1, 1, -1) in the nullspace:

   Ax = [ 1 3 4 ] [  1 ]  =  [ 0 ]     gives the dot products     1 + 3 - 4 = 0
        [ 5 2 7 ] [  1 ]     [ 0 ]                                5 + 2 - 7 = 0
                  [ -1 ]
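The same orthogonality checked numerically (a MATLAB-style sketch; the particular y is an arbitrary choice):

   A = [1 3 4; 5 2 7];
   x = [1; 1; -1];           % x is in the nullspace: A*x = 0
   A*x                       % zero vector
   y = [2; -1];              % any y gives the row space vector A'*y
   (A'*y)'*x                 % dot product is 0, as in equation (2)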
Now we turn to the other two subspaces. They are also orthogonal, but in R^m.

4B  Every vector y in the nullspace of A^T is perpendicular to every column of A. The left nullspace and column space are orthogonal.

Apply the original proof to A^T. Its nullspace is orthogonal to its row space—which is the column space of A. Q.E.D.
For a visual proof, look at y^T A = 0. The row vector y^T multiplies each column of A:

   y^T [ column 1   ...   column n ]  =  [ 0  ...  0 ].                           (3)
The dot product with every column is zero. Then y is perpendicular to each column—
and to the whole column space.
Very important   The fundamental subspaces are more than just orthogonal (in pairs). Their dimensions are also right. Two lines could be perpendicular in 3-dimensional space, but they could not be the row space and nullspace of a matrix. The lines have dimensions 1 and 1, adding to 2. The correct dimensions r and n - r must add to n = 3. Our subspaces have dimensions 2 and 1, or 3 and 0. The fundamental subspaces are not only orthogonal, they are orthogonal complements.
DEFINITION
The orthogonal complement of a subspace V contains every vector that is perpendicular to V. This orthogonal subspace is denoted by V^⊥ (pronounced "V perp").
By this definition, the nullspace is the orthogonal complement of the row space.
Every x that is perpendicular to the rows satisfies Ax = 0, and is included in the
nullspace.
The reverse is also true (automatically). If v is orthogonal to the nullspace, it must
be in the row space. Otherwise we could add this v as an extra row of the matrix, without
changing the nullspace. The row space and its dimension would grow, which breaks the
law r + (n - r) = n. We conclude that N(A)^⊥ is exactly the row space C(A^T).
The left nullspace and column space are not only orthogonal in R^m, they are
orthogonal complements. The 90° angles are marked in Figure 4.2. Their dimensions
add to the full dimension m.
Fundamental Theorem of Linear Algebra, Part 2
The nullspace is the orthogonal complement of the row space (in R^n).
The left nullspace is the orthogonal complement of the column space (in R^m).
Part 1 gave the dimensions of the subspaces. Part 2 gives their orientation. They are perpendicular (in pairs). The point of "complements" is that every x can be split into a row space component x_r and a nullspace component x_n. When A multiplies x = x_r + x_n, Figure 4.2 shows what happens:

   The nullspace component goes to zero:  A x_n = 0.
   The row space component goes to the column space:  A x_r = Ax.
Every vector goes to the column space! Multiplying by A cannot do anything else.
But more than that: Every vector in the column space comes from one and only one
vector x_r in the row space. Proof: If A x_r = A x_r', the difference x_r - x_r' is in the nullspace. It is also in the row space, where x_r and x_r' came from. This difference must be the zero vector, because the spaces are perpendicular. Therefore x_r = x_r'.
There is an invertible matrix hiding inside A, if we throw away the two nullspaces.
From the row space to the column space, A is invertible. The “pseudoinverse” will
invert it in Section 7.4.
Example 4 Every diagonal matrix has an r by r invertible submatrix:

        [ 3 0 0 0 0 ]
   A =  [ 0 5 0 0 0 ]        contains        [ 3 0 ]
        [ 0 0 0 0 0 ]                        [ 0 5 ].
The rank is r = 2. The 2 by 2 submatrix in the upper corner is certainly invertible.
The other eleven zeros are responsible for the nullspaces.
Figure 4.2   The true action of A times x: row space to column space, nullspace to zero.
Section 7.3 will show how every A becomes a diagonal matrix, when we choose the right bases for R^n and R^m. This Singular Value Decomposition is a part of the theory
that has become extremely important in applications.
Combining Bases from Subspaces
What follows are some valuable facts about bases. They could have come earlier, when bases were defined, but they were saved until now—when we are ready to use them. After a week you have a clearer sense of what a basis is (independent vectors that span the space). When the count is right, one of those two properties implies the other:
4C  Any n linearly independent vectors in R^n must span R^n. Any n vectors that span R^n must be independent.
Normally we have to check both properties: First, that the vectors are linearly independent, and second, that they span the space. For n vectors in R^n, either independence or spanning is enough by itself. Starting with the correct number of vectors, one property of a basis implies the other.
   This is true in any vector space, but we care most about R^n. When the vectors go into the columns of an n by n matrix A, here are the same two facts. Remember that A is square:
4D  If the n columns are independent, they must span all of R^n. If the n columns span R^n, they must be independent.
A square system Ax = b always has one solution if it never has two solutions.
Uniqueness implies existence and existence implies uniqueness. The square matrix A
is invertible. Its columns are a basis for R”.
Our standard method of proof is elimination. If there are no free variables (uniqueness), there must be n pivots. Then back substitution solves Ax = b (existence). In the opposite direction, suppose Ax = b can always be solved (existence of solutions). Then elimination produced no zero rows. There are n pivots and no free variables. The nullspace contains only x = 0 (uniqueness of solutions).
The count is always right for the row space of A and its nullspace. They have dimensions r and n - r. With a basis for the row space and a basis for the nullspace, we have r + (n - r) = n vectors—the right number. Those n vectors are independent.† Therefore they span R^n. They are a basis:

   Every x in R^n is the sum x_r + x_n of a row space vector x_r and a nullspace vector x_n.

This confirms the splitting in Figure 4.2. It is the key point of orthogonal complements—the dimensions add to n and no vectors are missing.
Example 5  For A = [ I  I ] = [ 1 0 1 0 ; 0 1 0 1 ], write any vector x as x_r + x_n.

(1, 0, 1, 0) and (0, 1, 0, 1) are a basis for the row space. (1, 0, -1, 0) and (0, 1, 0, -1) are a basis for the nullspace. Those four vectors are a basis for R^4. Any x = (a, b, c, d) can be split into x_r + x_n:

   x = (a, b, c, d) = (a+c)/2 (1, 0, 1, 0) + (b+d)/2 (0, 1, 0, 1) + (a-c)/2 (1, 0, -1, 0) + (b-d)/2 (0, 1, 0, -1).
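The split of Example 5 can also be computed by projecting x onto the row space (a MATLAB-style sketch; the particular x is an arbitrary choice):

   A = [1 0 1 0; 0 1 0 1];
   x = [7; 3; 1; 5];               % any x = (a, b, c, d)
   xr = A'*((A*A')\(A*x));         % row space component: projection onto C(A')
   xn = x - xr;                    % nullspace component
   A*xn                            % zero vector, so xn really is in N(A)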
REVIEW OF THE KEY IDEAS
1.  The subspaces V and W are orthogonal if every v in V is orthogonal to every w in W.

2.  V and W are "orthogonal complements" if each subspace contains all vectors perpendicular to the other subspace.

†If a combination of the vectors gives x_r + x_n = 0, then x_r = -x_n is in both subspaces. It is orthogonal to itself and must be zero. All coefficients of the row space basis and nullspace basis must be zero—which proves independence of the n vectors together.
3.  The nullspace N(A) and the row space C(A^T) are orthogonal complements, from Ax = 0.

4.  Any n independent vectors in R^n will span R^n.

5.  Every vector x in R^n has a nullspace component and a row space component.
Problem Set 4.1
Questions 1-10 grow out of Figures 4.1 and 4.2.
1
Suppose A is a 2 by 3 matrix of rank one. Draw Figure 4.1 to match the sizes
of the subspaces.
2    Redraw Figure 4.2 for a 3 by 2 matrix of rank r = 2. What are the two parts x_r and x_n?
3    Construct a matrix with the required property or say why that is impossible:
     (a) Column space contains [ 1; 2; -3 ] and [ 2; -3; 5 ], nullspace contains [ 1; 1; 1 ].
     (b) Row space contains [ 1; 2; -3 ] and [ 2; -3; 5 ], nullspace contains [ 1; 1; 1 ].
     (c) Column space is perpendicular to nullspace.
     (d) Row 1 + row 2 + row 3 = 0, column space contains (1, 2, 3).
     (e) Columns add up to zero column, rows add to a row of 1's.
It is possible for the row space to contain the nullspace. Find an example.
5    (a) If Ax = b has a solution and A^T y = 0, then y is perpendicular to ______.
     (b) If Ax = b has no solution and A^T y = 0, explain why y is not perpendicular to ______.

6    In Figure 4.2, how do we know that A x_r is equal to Ax? How do we know that this vector is in the column space?

7    If Ax is in the nullspace of A^T then Ax must be zero. Why? Which other subspace is Ax in? This is important: A^T A has the same nullspace as A.
8    Suppose A is a symmetric matrix (A^T = A).
     (a) Why is its column space perpendicular to its nullspace?
     (b) If Ax = 0 and Az = 5z, why is x perpendicular to z? These are "eigenvectors."
9
(Recommended)
Draw Figure 4.2 to show each subspace correctly for
LZ
ce, ‘| and
10
hw
asl. a
Find the pieces x, and x, and draw Figure 4.2 properly if
Ar
i
—alm)
0
—1
0
O
and
D,
2a!
Questions 11-19 are about orthogonal subspaces.
11   Prove that every y in N(A^T) is perpendicular to every Ax in the column space, using the matrix shorthand of equation (2). Start from A^T y = 0.
12   The Fundamental Theorem is also stated in the form of Fredholm's alternative: For any A and b, exactly one of these two problems has a solution:
     (a)  Ax = b        (b)  A^T y = 0  with  b^T y ≠ 0.
     Either b is in the column space of A or else b is not orthogonal to the nullspace of A^T. Choose A and b so that (a) has no solution. Find a solution to (b).
13
If S is the subspace of R* containing only the zero vector, what is St?
is spanned by (1, 1, 1), what is S=9?
If S is spanned by (2,0,0)
If §
and (0, 0, 3),
what is $+?
14
Suppose S only contains two vectors (1,5, 1) and (2, 2, 2) (not a subspace). Then
S+ is the nullspace of the matrix A =
. Therefore S+ is a
even if
S is not.
15
Suppose L is a one-dimensional subspace (a line) in R>. Its orthogonal complement
L+ is the
perpendicular to L. Then (L+)+ is a
perpendicular to L+.
In fact (L+)+ is the same as
16
Suppose
V is the whole space R*. Then
Then (V+)+ contains
V+ contains only the vector
. So (V+)+ is the same as
17
Suppose S is spanned by the vectors (1, 2, 2,3) and (1, 3,3, 2). Find two vectors
that span St.
18
If P is the plane of vectors in R* satisfying x; + x2 +x3-+x4 = 0, write a basis
for P+. Construct a matrix that has P as its nullspace.
19
If a subspace S is contained in a subspace V, prove that S^⊥ contains V^⊥.
Questions 20-26 are about perpendicular columns and rows.
20
Suppose an n by n matrix is invertible: AA~!
A! is orthogonal to the space spanned by
21
Suppose the columns of A are unit vectors, all mutually perpendicular.
= J. Then the first column of
:
What is
ATA?
22
Construct a 3 by 3 matrix A with no zero entries whose columns are mutually
perpendicular. Compute A'A. Why is it a diagonal matrix?
23   The lines 3x + y = b1 and 6x + 2y = b2 are ______. They are the same line if ______. In that case (b1, b2) is perpendicular to the vector ______. The nullspace of the matrix is the line 3x + y = ______. One particular vector in that nullspace is ______.
24   Why is each of these statements false?
     (a) (1, 1, 1) is perpendicular to (1, 1, -2) so the planes x + y + z = 0 and x + y - 2z = 0 are orthogonal subspaces.
     (b) The lines from (0, 0, 0) through (2, 4, 5) and (1, -3, 2) are orthogonal complements.
     (c) If two subspaces meet only in the zero vector, the subspaces are orthogonal.
25
Find a matrix with v = (1, 2, 3) in the row space and column space. Find another
matrix with v in the nullspace and column space. Which pairs of subspaces can
v not be in?
26
The command N = null(A) will produce a basis for the nullspace of A. Then
the command B = null(N’) will produce a basis for the
of A.
PROJECTIONS ■ 4.2
May we start this section with two questions? (In addition to that one.) The first question aims to show that projections are easy to visualize. The second question is about
matrices:
1    What are the projections of b = (2, 3, 4) onto the z axis and the xy plane?

2    What matrices produce those projections onto a line and a plane?
When b is projected onto a line, its projection p is the part of b along that line.
If b is projected onto a plane, p is the part in that plane.
There is a matrix P that multiplies b to give p. The projection p is Pb.
One projection goes across to the z axis. The second projection drops straight down to
the xy plane. The picture in your mind should be Figure 4.3. Start with b = (2, 3, 4).
One way gives p; = (0,0,4) and the other way gives p2 = (2,3,0). Those are the
parts of b along the z axis and in the xy plane.
The projection matrices P; and Pz are 3 by 3. They multiply b with 3 components to produce p with 3 components. Projection onto a line comes from a rank one
matrix. Projection onto a plane comes from a rank two matrix:
   Onto the z axis:   P1 = [ 0 0 0 ]        Onto the xy plane:   P2 = [ 1 0 0 ]
                           [ 0 0 0 ]                                  [ 0 1 0 ]
                           [ 0 0 1 ]                                  [ 0 0 0 ]
P1 picks out the z component of every vector. P2 picks out the x and y components. To find p1 and p2, multiply b by P1 and P2 (small p for the vector, capital P for the matrix that produces it):
   p1 = P1 b = [ 0 0 0 ] [ x ]   [ 0 ]          p2 = P2 b = [ 1 0 0 ] [ x ]   [ x ]
               [ 0 0 0 ] [ y ] = [ 0 ]                      [ 0 1 0 ] [ y ] = [ y ]
               [ 0 0 1 ] [ z ]   [ z ]                      [ 0 0 0 ] [ z ]   [ 0 ]
In this case the projections P1 and P2 are perpendicular. The xy plane and the z axis are orthogonal subspaces, like the floor of a room and the line between two walls. More than that, the line and plane are orthogonal complements. Their dimensions add to 1 + 2 = 3—every vector b in the whole space is the sum of its parts in the two subspaces. The projections p1 and p2 are exactly those parts:

   The vectors give  p1 + p2 = b.        The matrices give  P1 + P2 = I.          (1)
This is perfect. Our goal is reached—for this example. We have the same goal for any
line and any plane and any n-dimensional subspace. The object is to find the part p
in each subspace, and the projection matrix P that produces p = Pb. Every subspace
of R^m has an m by m (square) projection matrix. To compute P, we absolutely need
a good description of the subspace that it projects onto.
Figure 4.3   The projections of b onto the z axis and the xy plane.
The best description of a subspace is to have a basis. The basis vectors go into
the columns of A. We are projecting onto the column space of A. Certainly the z
axis is the column space of the following matrix A1. The xy plane is the column space of A2. That plane is also the column space of A3 (a subspace has many bases):
        [ 0 ]                [ 1 0 ]                [ 1 2 ]
   A1 = [ 0 ]     and   A2 = [ 0 1 ]     and   A3 = [ 2 3 ]
        [ 1 ]                [ 0 0 ]                [ 0 0 ]
The overall problem is to project onto the column space of any m by n matrix.
Start with a line. Then n = 1. The matrix A has only one column.
Projection Onto a Line
We are given a point b = (b1, ..., b_m) in m-dimensional space. We are also given a line through the origin, in the direction of a = (a1, ..., a_m). We are looking along that line, to find the point p closest to b. The key to projection is orthogonality: The line connecting b to p is perpendicular to the vector a. This is the dotted line marked e in Figure 4.4—which we now compute by algebra.
The projection p is some multiple of a! Call it p = x̂ a = "x hat" times a. Our first step is to compute this unknown number x̂. That will give the vector p. Then from the formula for p, we read off the projection matrix P. These three steps will lead to all projection matrices: find x̂, then find p, then find P.
The dotted line b - p is b - x̂ a. It is perpendicular to a—this will determine x̂. Use the fact that two vectors are perpendicular when their dot product is zero:

   a · (b - x̂ a) = 0    or    a · b - x̂ a · a = 0    or    x̂ = (a · b)/(a · a) = (a^T b)/(a^T a).

For vectors the multiplication a^T b is the same as a · b. Using the transpose is better, because it applies also to matrices. (We will soon meet A^T b.) Our formula for x̂ immediately gives the formula for p:
Figure 4.4   The projection of b onto a line has length ||p|| = ||b|| cos θ.
4E  The projection of b onto the line through a is the vector  p = x̂ a = (a^T b / a^T a) a.

Special case 1: If b = a then x̂ = 1. The projection of a onto a is itself.
Special case 2: If b is perpendicular to a then a^T b = 0. The projection is p = 0.
Example 1  Project b = [ 1; 1; 1 ] onto a = [ 1; 2; 2 ] to find p = x̂ a in Figure 4.4.
Solution  The number x̂ is the ratio of a^T b = 5 to a^T a = 9. So the projection is p = (5/9) a. The error vector between b and p is e = b - p. Those vectors p and e will add to b:

   p = (5/9, 10/9, 10/9)    and    e = b - p = (4/9, -1/9, -1/9).

The error e should be perpendicular to a and it is: e^T a = 4/9 - 2/9 - 2/9 = 0.
Look at the right triangle of b, p, and e. The vector b is split into two parts—its component along the line is p, its perpendicular component is e. Those two sides of a right triangle have length ||b|| cos θ and ||b|| sin θ. The trigonometry matches the dot product:

   p = (a^T b / a^T a) a,   so its length is   ||p|| = (a^T b)/||a|| = ||b|| cos θ.        (3)

The dot product is a lot simpler than getting involved with cos θ and the length of b. The example has square roots in cos θ = 5/(3√3) and ||b|| = √3. There are no square roots in the projection p = (5/9) a.
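Example 1 in code (a minimal MATLAB-style sketch):

   a = [1; 2; 2];  b = [1; 1; 1];
   xhat = (a'*b)/(a'*a)      % 5/9
   p = xhat*a                % the projection of b onto the line through a
   e = b - p                 % the error (4/9, -1/9, -1/9)
   e'*a                      % 0: e is perpendicular to a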
Now comes the projection matrix. In the formula for p, what matrix is multiplying b? You can see it better if the number x̂ is on the right side of a:

   p = a x̂ = a (a^T b / a^T a) = Pb    when the matrix is    P = (a a^T)/(a^T a).
P is a column times a row! The column is a, the row is a^T. Then divide by the number a^T a. The matrix P is m by m, but its rank is one. We are projecting onto a
one-dimensional subspace, the line through a.
Example 2  Find the projection matrix P = (a a^T)/(a^T a) onto the line through a = [ 1; 2; 2 ].
Solution  Multiply column times row and divide by a^T a = 9:

        a a^T     1  [ 1 ]                1  [ 1 2 2 ]
   P =  -----  =  -  [ 2 ] [ 1 2 2 ]   =  -  [ 2 4 4 ]
        a^T a     9  [ 2 ]                9  [ 2 4 4 ]
This matrix projects any vector b onto a. Check p = Pb for the particular b = (1, 1, 1) in Example 1:

             1  [ 1 2 2 ] [ 1 ]     1  [  5 ]
   p = Pb =  -  [ 2 4 4 ] [ 1 ]  =  -  [ 10 ]        which is correct.
             9  [ 2 4 4 ] [ 1 ]     9  [ 10 ]
If the vector a is doubled, the matrix P stays the same. It still projects onto the same line. If the matrix is squared, P^2 equals P. Projecting a second time doesn't change anything, so P^2 = P. The diagonal entries of P add up to (1/9)(1 + 4 + 4) = 1.
The matrix I - P should be a projection too. It produces the other side e of the triangle—the perpendicular part of b. Note that (I - P)b equals b - p which is e. When P projects onto one subspace, I - P projects onto the perpendicular subspace. Here I - P projects onto the plane perpendicular to a.
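A small numerical check of these facts (a MATLAB-style sketch, with the same a as Example 2):

   a = [1; 2; 2];
   P = (a*a')/(a'*a);        % rank one projection onto the line through a
   norm(P*P - P)             % essentially zero: P^2 = P
   Q = eye(3) - P;           % projects onto the plane perpendicular to a
   Q*a                       % zero vector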
Now we move beyond projection onto a line. Projecting onto an n-dimensional subspace of R^m takes more effort. The crucial formulas are collected in equations (5)-(6)-(7). Basically you need to remember them.
Projection Onto a Subspace
Start with n vectors a1, ..., a_n in R^m. Assume that these a's are linearly independent. Problem: Find the combination x̂1 a1 + ... + x̂n a_n that is closest to a given vector b. We are projecting each b in R^m onto the subspace spanned by the a's.
   With n = 1 (only one vector a1) this is projection onto a line. The line is the column space of A, which has just one column. In general the matrix A has n columns a1, ..., a_n. Their combinations Ax are the vectors in the column space. We are looking for the particular combination p = A x̂ (the projection) that is closest to b. The hat over x̂ indicates the best choice, to give the closest vector in the column space. That choice is a^T b / a^T a when n = 1. For n > 1, the best x̂ is the vector x̂ = (x̂1, ..., x̂n).
   We solve this problem for an n-dimensional subspace in three steps: Find the vector x̂, find the projection p = A x̂, find the matrix P.
The key is in the geometry! The dotted line in Figure 4.5 goes from b to the nearest point A x̂ in the subspace. This error vector b - A x̂ is perpendicular to the subspace. The error b - A x̂ makes a right angle with all the vectors a1, ..., a_n. That gives the n equations we need to find x̂:

   a1^T (b - A x̂) = 0                  [ — a1^T — ]                [ 0 ]
        ...                 or         [    ...   ] (b - A x̂)  =   [ : ]          (4)
   an^T (b - A x̂) = 0                  [ — an^T — ]                [ 0 ]
The matrix in those equations is A^T. The n equations are exactly A^T (b - A x̂) = 0. Rewrite A^T (b - A x̂) = 0 in its famous form A^T A x̂ = A^T b. This is the equation for x̂, and the coefficient matrix is A^T A. Now we can find x̂ and p and P:
4F  The combination x̂1 a1 + ... + x̂n a_n = A x̂ that is closest to b comes from

   A^T (b - A x̂) = 0        or        A^T A x̂ = A^T b.                             (5)

The symmetric matrix A^T A is n by n. It is invertible if the a's are independent. The solution is x̂ = (A^T A)^{-1} A^T b. The projection of b onto the subspace is the vector

   p = A x̂ = A (A^T A)^{-1} A^T b.                                                 (6)

This formula shows the m by m projection matrix that produces p = Pb:

   P = A (A^T A)^{-1} A^T.                                                         (7)
Compare with projection onto a line, when the matrix A has only one column a:

   x̂ = (a^T b)/(a^T a)    and    p = a (a^T b)/(a^T a)    and    P = (a a^T)/(a^T a).

Those formulas are identical with (5) and (6) and (7)! The number a^T a becomes the matrix A^T A. When it is a number, we divide by it. When it is a matrix, we invert it. The new formulas contain (A^T A)^{-1} instead of 1/a^T a. The linear independence of the columns a1, ..., a_n will guarantee that this inverse matrix exists.
The key step was A^T (b - A x̂) = 0. We used geometry (e is perpendicular to all the a's). Linear algebra gives this "normal equation" too, in a very quick way:

1.  Our subspace is the column space of A.
2.  The error vector b - A x̂ is perpendicular to that column space.
3.  Therefore b - A x̂ is in the left nullspace. This means A^T (b - A x̂) = 0.

The left nullspace is important in projections. This nullspace of A^T contains the error vector e = b - A x̂. The vector b is being split into the projection p and the error e = b - p. Figure 4.5 shows the right triangle with sides p, e, and b.
Figure 4.5   The projection p is the nearest point to b in the column space of A. The perpendicular error e must be in the nullspace of A^T: a1^T e = 0, a2^T e = 0, and A^T e = A^T (b - A x̂) = 0.
Example 3  If  A = [ 1 0 ]   and   b = [ 6 ]   find x̂ and p and P.
                   [ 1 1 ]             [ 0 ]
                   [ 1 2 ]             [ 0 ]
Solution  Compute the square matrix A^T A and also the vector A^T b:

   A^T A = [ 1 1 1 ] [ 1 0 ]  =  [ 3 3 ]          A^T b = [ 1 1 1 ] [ 6 ]  =  [ 6 ]
           [ 0 1 2 ] [ 1 1 ]     [ 3 5 ]                  [ 0 1 2 ] [ 0 ]     [ 0 ]
                     [ 1 2 ]                                        [ 0 ]

Now solve the normal equation A^T A x̂ = A^T b to find x̂:

   [ 3 3 ] [ x̂1 ]  =  [ 6 ]        gives        x̂ = [  5 ]
   [ 3 5 ] [ x̂2 ]     [ 0 ]                         [ -3 ]

The combination p = A x̂ is the projection of b onto the column space of A:

   p = 5 [ 1 ] - 3 [ 0 ]  =  [  5 ]        The error is  e = b - p = [  1 ]        (9)
         [ 1 ]     [ 1 ]     [  2 ]                                  [ -2 ]
         [ 1 ]     [ 2 ]     [ -1 ]                                  [  1 ]
That solves the problem for one particular b. To solve it for every b, compute the matrix P = A (A^T A)^{-1} A^T. The determinant of A^T A is 15 - 9 = 6, so (A^T A)^{-1} is easy. Then multiply A times (A^T A)^{-1} times A^T to reach P:

   (A^T A)^{-1} = (1/6) [  5 -3 ]        and        P = (1/6) [  5  2 -1 ]         (10)
                        [ -3  3 ]                             [  2  2  2 ]
                                                              [ -1  2  5 ]
Two checks on the calculation. First, the error e = (1, -2, 1) is perpendicular to both columns (1, 1, 1) and (0, 1, 2). Second, the final P times b = (6, 0, 0) correctly gives p = (5, 2, -1). We must also have P^2 = P, because a second projection doesn't change the first projection.
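Example 3 in code (a MATLAB-style sketch):

   A = [1 0; 1 1; 1 2];  b = [6; 0; 0];
   xhat = (A'*A)\(A'*b)      % normal equations give (5, -3)
   p = A*xhat                % projection (5, 2, -1)
   e = b - p                 % error (1, -2, 1), perpendicular to both columns
   P = A*((A'*A)\A')         % projection matrix: P*b recovers p and P*P equals P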
Warning  The matrix P = A (A^T A)^{-1} A^T is deceptive. You might try to split (A^T A)^{-1} into A^{-1} times (A^T)^{-1}. If you make that mistake, and substitute it into P, you will find P = A A^{-1} (A^T)^{-1} A^T. Apparently everything cancels. This looks like P = I, the identity matrix. We want to say why this is wrong.
   The matrix A is rectangular. It has no inverse matrix. We cannot split (A^T A)^{-1} into A^{-1} times (A^T)^{-1} because there is no A^{-1} in the first place.
   In our experience, a problem that involves a rectangular matrix almost always leads to A^T A. We cannot split up its inverse, since A^{-1} and (A^T)^{-1} don't exist. What does exist is the inverse of the square matrix A^T A. This fact is so crucial that we state it clearly and give a proof.
4G  A^T A is invertible if and only if A has linearly independent columns.
Proof  A^T A is a square matrix (n by n). For every matrix A, we will now show that A^T A has the same nullspace as A. When the columns of A are linearly independent, its nullspace contains only the zero vector. Then A^T A, with this same nullspace, is invertible.
   Let A be any matrix. If x is in its nullspace, then Ax = 0. Multiplying by A^T gives A^T A x = 0. So x is also in the nullspace of A^T A.
   Now start with the nullspace of A^T A. From A^T A x = 0 we must prove that Ax = 0. We can't multiply by (A^T)^{-1}, which generally doesn't exist. Just multiply by x^T:

   x^T A^T A x = 0    or    (Ax)^T (Ax) = 0    or    ||Ax||^2 = 0.

The vector Ax has length zero. Therefore Ax = 0. Every vector x in one nullspace is in the other nullspace. If A has dependent columns, so does A^T A. If A has independent columns, so does A^T A. This is the good case:

   When A has independent columns, A^T A is square, symmetric, and invertible.

To repeat for emphasis: A^T A is (n by m) times (m by n). It is always square (n by n). It is always symmetric, because its transpose is (A^T A)^T = A^T (A^T)^T which equals A^T A. We just proved that A^T A is invertible—provided A has independent columns. Watch the difference between dependent and independent columns:
   A^T          A           A^T A                        A^T          A           A^T A
   [ 1 1 ]    [ 1 2 ]     [ 2 4 ]                       [ 1 2 ]    [ 1 1 ]     [ 5  9 ]
   [ 2 2 ]    [ 1 2 ]     [ 4 8 ]                       [ 1 4 ]    [ 2 4 ]     [ 9 17 ]
   dependent columns: singular                          independent columns: invertible
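A quick numerical check of 4G (a MATLAB-style sketch using the dependent and independent examples above):

   A1 = [1 2; 1 2];           % dependent columns
   A2 = [1 1; 2 4];           % independent columns
   rank(A1'*A1)               % 1: A'*A is singular
   rank(A2'*A2)               % 2: A'*A is invertible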
Very brief summary   To find the projection p = x̂1 a1 + ... + x̂n a_n, solve A^T A x̂ = A^T b. The projection is A x̂ and the error is e = b - p = b - A x̂. The projection matrix P = A (A^T A)^{-1} A^T gives p = Pb. This matrix satisfies P^2 = P. The distance from b to the subspace is ||e||.
REVIEW OF THE KEY IDEAS
1.  The projection of b onto the line through a is p = a x̂ = a (a^T b / a^T a).

2.  The rank one projection matrix P = a a^T / a^T a multiplies b to produce p.

3.  The projection of b onto a subspace leaves e = b - p perpendicular to the subspace.

4.  When the columns of A are a basis, the normal equation A^T A x̂ = A^T b leads to x̂ and p = A x̂.

5.  The projection matrix P = A (A^T A)^{-1} A^T has P^T = P and P^2 = P. The second projection leaves p unchanged.
Problem Set 4.2
Questions 1-9 ask for projections onto lines. Also errors e = b — p and matrices P.
1    Project the vector b onto the line through a. Check that e is perpendicular to a:

     (a)  b = [ 1; 2; 2 ]  and  a = [ 1; 1; 1 ]        (b)  b = [ 1; 3; 1 ]  and  a = [ -1; -3; -1 ]
Draw the projection of b onto a and also compute it from p = Xa:
(a)
belt
and
«=(9|
(b)
»=|| and fae
3
In Problem 1, find the projection matrix P = aa'/a'a onto the line through
each vector a. Verify that P* = P. Multiply Pb in each case to compute the
projection p.
4
Construct the projection matrices onto the lines through the a’s in Problem 2.
Explain why P? should equal P.
5
Compute the projection matrices aa'/a'a onto the lines through a, = (—1, 2, 2)
and az = (2,2,—1). Multiply those projection matrices and explain why their
product P; Pz is what it is.
6
Project b = (1,0, 0) onto the lines through a; and az in Problem 5 and also onto
a3 = (2, —1,2). Add up the three projections pj + p2 + p3.
7    Continuing Problems 5-6, find the projection matrix P3 onto a3 = (2, -1, 2). Verify that P1 + P2 + P3 = I.
182
= Chapter 4 Orthogonality
Poa}
2
a
a.
asl
ld
P; Poa,
=|
Questions 8-9-10
Questions 5-6-7
8
|
on
=
Project the vector b = (1, 1) onto the lines through a; = (1,0) and az = (1, 2).
Draw the projections p; and p2 and add p; + p2. The projections do not add to
b because the a’s are not orthogonal.
9    In Problem 8, the projection of b onto the plane of a1 and a2 will equal b. The projection matrix is P = ______. Check P = A (A^T A)^{-1} A^T for A = [ a1  a2 ] = [ 1 1 ; 0 2 ].
10 =“Project a; = (1,0) onto az = (1, 2). Then project the result back onto a;. Draw
these projections and multiply the projection matrices P; Pz: Is this a projection?
Questions 11-20 ask for projections, and projection matrices, onto subspaces.
11   Project b onto the column space of A by solving A^T A x̂ = A^T b and p = A x̂:

     (a)  A = [ 1 1 ]   and  b = [ 2 ]          (b)  A = [ 1 1 ]   and  b = [ 4 ]
              [ 0 1 ]            [ 3 ]                   [ 1 1 ]            [ 4 ]
              [ 0 0 ]            [ 4 ]                   [ 0 1 ]            [ 6 ]

     Find e = b - p. It should be perpendicular to the columns of A.
12   Compute the projection matrices P1 and P2 onto the column spaces in Problem 11. Verify that P1 b gives the first projection p1. Also verify P2^2 = P2.
13.
Suppose A is the 4 by 4 identity matrix with its last column removed. A is 4
by 3. Project b = (1,2, 3,4) onto the column space of A. What shape is the
projection matrix P and what is P?
14
Suppose b equals 2 times the first column of A. What is the projection of b
onto the column space of A? Is P = / in this case? Compute p and P when
b = (0, 2, 4) and the columns of A are (0, 1,2) and (1, 2, 0).
15   If A is doubled, then P = 2A(4 A^T A)^{-1} 2A^T. This is the same as A(A^T A)^{-1}A^T. The column space of 2A is the same as ______. Is x̂ the same for A and 2A?
16   What linear combination of (1, 2, -1) and (1, 0, 1) is closest to b = (2, 1, 1)?
17   (Important) If P^2 = P, show that (I - P)^2 = I - P. When P projects onto the column space of A, I - P projects onto the ______.
18   (a) If P is the 2 by 2 projection matrix onto the line through (1, 1), then I - P is the projection matrix onto ______.
     (b) If P is the 3 by 3 projection matrix onto the line through (1, 1, 1), then I - P is the projection matrix onto ______.
19   To find the projection matrix onto the plane x - y - 2z = 0, choose two vectors in that plane and make them the columns of A. The plane should be the column space. Then compute P = A(A^T A)^{-1}A^T.
20   To find the projection matrix P onto the same plane x - y - 2z = 0, write down a vector e that is perpendicular to that plane. Compute the projection Q = e e^T / e^T e and then P = I - Q.
Questions 21-26 show that projection matrices satisfy P^2 = P and P^T = P.
21
Multiply the matrix P = A(A^T A)^{-1}A^T by itself. Cancel to prove that P^2 = P. Explain why P(Pb) always equals Pb: The vector Pb is in the column space so its projection is ______.
22   Prove that P = A(A^T A)^{-1}A^T is symmetric by computing P^T. Remember that the inverse of a symmetric matrix is symmetric.

23   If A is square and invertible, the warning against splitting (A^T A)^{-1} does not apply: P = A A^{-1}(A^T)^{-1}A^T = I. When A is invertible, why is P = I? What is the error e?

24   The nullspace of A^T is ______ to the column space C(A). So if A^T b = 0, the projection of b onto C(A) should be p = ______. Check that P = A(A^T A)^{-1}A^T gives this answer.
25   The projection matrix P onto an n-dimensional subspace S has rank r = ______. Reason: The projections Pb fill the subspace S. So S is the ______ of P.
If an m by m matrix has A* = A and its rank is m, prove that A = J.
27
The important fact in Theorem 4G is this: If AlAx = 0 then Ax = 0. New
proof: The vector Ax is in the nullspace of
. Ax 1s always in the column
space of ____. To be in both perpendicular spaces, Ax must be zero.
28   Use P^T = P and P^2 = P to prove that the length squared of column 2 always equals the diagonal entry p22. This number is 2/6 = 4/36 + 4/36 + 4/36 for the matrix P in equation (10).
LEAST SQUARES APPROXIMATIONS ■ 4.3
It often happens that Ax = b has no solution. The usual reason is: too many equations. The matrix has more rows than columns. There are more equations than unknowns (m is greater than n). The n columns span a small part of m-dimensional space. Unless all measurements are perfect, b is outside that column space. Elimination reaches an impossible equation and stops. But these are real problems and they need an answer.
   To repeat: We cannot always get the error e = b - Ax down to zero. When e is zero, x is an exact solution to Ax = b. When the length of e is as small as possible, x̂ is a least squares solution. Our goal in this section is to compute x̂ and use it.
The previous section emphasized p (the projection). This section emphasizes x̂ (the least squares solution). They are connected by p = A x̂. The fundamental equation is still A^T A x̂ = A^T b. Here is a short unofficial way to derive it:

   When the original Ax = b has no solution, multiply by A^T and solve A^T A x̂ = A^T b.
Example 1  Find the closest straight line to three points (0, 6), (1, 0), and (2, 0).
No straight line goes through those points. We are asking for two numbers C and D that satisfy three equations. The line is b = C + Dt. Here are the equations at t = 0, 1, 2 to match the given values b = 6, 0, 0:

   The first point is on the line b = C + Dt if     C + D · 0 = 6
   The second point is on the line b = C + Dt if    C + D · 1 = 0
   The third point is on the line b = C + Dt if     C + D · 2 = 0.
This 3 by 2 system has no solution: b = (6, 0, 0) is not a combination of the columns of A:

   A = [ 1 0 ]        x = [ C ]        b = [ 6 ]        Ax = b is not solvable.
       [ 1 1 ]            [ D ]            [ 0 ]
       [ 1 2 ]                             [ 0 ]
The same numbers were in Example 3 in the last section. In practical problems, the
data points are close to a line. But they don’t exactly match any C + Dt, and there
easily could be m = 100 points instead of m = 3. The numbers 6, 0, 0 exaggerate the error so you can see it clearly.
Minimizing the Error
How do we make the error e = b — Ax as small as possible? This is an important
question with a beautiful answer. The best x (called x̂) can be found by geometry or algebra or calculus:
By geometry  Every Ax lies in the plane of the columns (1, 1, 1) and (0, 1, 2). In that plane, we look for the point closest to b. The nearest point is the projection p. The best choice for Ax is p. Then x = x̂. The smallest possible error is e = b - p.
By algebra  Every vector b has a part in the column space of A and a perpendicular part in the left nullspace. The column space part is p. The left nullspace part is e. There is an equation we cannot solve (Ax = b). There is an equation we do solve (by removing e):

   Ax = b = p + e   is impossible;        A x̂ = p   is solvable.                   (1)
The solution x̂ to A x̂ = p makes the error as small as possible, because for any x:

   ||Ax - b||^2 = ||Ax - p||^2 + ||e||^2.                                           (2)

This is the law c^2 = a^2 + b^2 for a right triangle. The vector Ax - p in the column space is perpendicular to e in the left nullspace. We reduce Ax - p to zero by choosing x to be x̂. That leaves the smallest possible error (namely e).
   Notice what "smallest" means. The squared length of Ax - b is being minimized:

   The least squares solution x̂ makes E = ||Ax - b||^2 as small as possible.
By calculus  Most functions are minimized by calculus! The derivatives are zero at the minimum. The graph bottoms out and the slope in every direction is zero. Here the error function to be minimized is a sum of squares (the square of the error in each equation!):

   E = ||Ax - b||^2 = (C + D · 0 - 6)^2 + (C + D · 1)^2 + (C + D · 2)^2.            (3)

The unknowns are C and D. Those are the components of x̂, and they determine the line. With two unknowns there are two derivatives—both zero at the minimum. They are "partial derivatives" because ∂E/∂C treats D as constant and ∂E/∂D treats C as constant:

   ∂E/∂C = 2(C + D · 0 - 6) + 2(C + D · 1) + 2(C + D · 2) = 0
   ∂E/∂D = 2(C + D · 0 - 6)(0) + 2(C + D · 1)(1) + 2(C + D · 2)(2) = 0.

∂E/∂D contains the extra factors 0, 1, 2. Those are the numbers in E that multiply D. They appear because of the chain rule. (The derivative of (4 + 5x)^2 is 2 times 4 + 5x times an extra 5.) In the C derivative the corresponding factors are (1)(1)(1), because C is always multiplied by 1. It is no accident that 1, 1, 1 and 0, 1, 2 are the columns of A.
Now cancel 2 from every term and collect all C's and all D's:

   The C derivative is zero:   3C + 3D = 6
   The D derivative is zero:   3C + 5D = 0.                                         (4)
Figure 4.6   The closest line has heights p = (5, 2, -1) with errors e = (1, -2, 1). The equations A^T A x̂ = A^T b give x̂ = (5, -3). The line is b = 5 - 3t and the projection is 5a1 - 3a2.
These equations are identical with A^T A x̂ = A^T b. The best C and D are the components of x̂. The equations (4) from calculus are the same as the "normal equations" from linear algebra:

   The partial derivatives of ||Ax - b||^2 are zero when A^T A x̂ = A^T b.

The solution is C = 5 and D = -3. Therefore b = 5 - 3t is the best line—it comes closest to the three points. At t = 0, 1, 2 this line goes through p = 5, 2, -1. It could not go through b = 6, 0, 0. The errors are 1, -2, 1. This is the vector e!
Figure 4.6a shows the closest line to the three points. It misses by distances e1, e2, e3. Those are vertical distances. The least squares line is chosen to minimize E = e1^2 + e2^2 + e3^2.
   Figure 4.6b shows the same problem in another way (in 3-dimensional space). The vector b is not in the column space of A. That is why we could not solve Ax = b and put a line through the three points. The smallest possible error is the perpendicular vector e to the plane. This is e = b - A x̂, the vector of errors (1, -2, 1) in the three equations—and the distances from the best line. Behind both figures is the fundamental equation A^T A x̂ = A^T b.
   Notice that the errors 1, -2, 1 add to zero. The error e = (e1, e2, e3) is perpendicular to the first column (1, 1, 1) in A. The dot product gives e1 + e2 + e3 = 0.
Figure 4.7   The projection p = A x̂ is closest to b, so x̂ minimizes E = ||b - Ax||^2. The vector b = p + e is not in the column space; with independent columns there is no nullspace.
The Big Picture
The key figure of this book shows the four subspaces and the true action of a matrix. The vector x on the left side of Figure 4.2 went to b = Ax on the right side. In that figure x was split into x_r + x_n. There were many solutions to Ax = b.
   In this section the situation is just the opposite. There are no solutions to Ax = b. Instead of splitting up x we are splitting up b. Figure 4.7 shows the big picture for least squares. Instead of Ax = b we solve A x̂ = p. The error e = b - p is unavoidable.
   Notice how the nullspace N(A) is very small—just one point. With independent columns, the only solution to Ax = 0 is x = 0. Then A^T A is invertible. The equation A^T A x̂ = A^T b fully determines the best vector x̂.
   Chapter 7 will have the complete picture—all four subspaces included. Every x splits into x_r + x_n, and every b splits into p + e. The best solution is still x̂ (or x̂_r) in the row space. We can't help e and we don't want x_n—this leaves A x̂ = p.
Fitting a Straight Line
This is the clearest application of least squares. It starts with m points in a plane—hopefully near a straight line. At times t1, ..., t_m those points are at heights b1, ..., b_m. Figure 4.6a shows the best line b = C + Dt, which misses the points by distances e1, ..., e_m. Those are vertical distances. No line is perfect, and the least squares line is chosen to minimize E = e1^2 + ... + e_m^2.
   The first example in this section had three points. Now we allow m points (m can be large). The algebra still leads to the same two equations A^T A x̂ = A^T b. The components of x̂ are still C and D.
Figure 4.6b shows the same problem in another way (in m-dimensional space). A line goes exactly through the m points when we exactly solve Ax = b. Generally we can't do it. There are only two unknowns C and D because A has n = 2 columns:

   C + D t1 = b1                   [ 1  t1  ]            [ b1  ]
   C + D t2 = b2         or        [ 1  t2  ]  [ C ]  =  [ b2  ]                    (5)
        ...                        [   ...  ]  [ D ]     [ ... ]
   C + D t_m = b_m                 [ 1  t_m ]            [ b_m ]
The column space is so thin that almost certainly b is outside of it. The m points (t_i, b_i) almost certainly don't lie on a line. The components of e are the distances e1, ..., e_m to the closest line.
   When b happens to lie in the column space, the points happen to lie on a line. In that case b = p. Then Ax = b is solvable and the errors are e = (0, ..., 0).

   The closest line has heights p1, ..., p_m with errors e1, ..., e_m. The equations A^T A x̂ = A^T b give x̂ = (C, D). The errors are e_i = b_i - C - D t_i.
Fitting points by straight lines is so important that we give the two equations
once and for all. Remember that b = C + Dt exactly fits the data points if
C+
Dt,
;
Cpr
= b,
wii
pals
ake
it. ee
=o.
os a
ba ag
by
e
ext
bm
ic
(6)
This is our equation Ax = b. It is generally unsolvable, if m > 2. But there is one
good feature. The columns of A are independent (unless all times ¢; are the same). So
we turn to least squares and solve A'A® = A'b. The “dot-product matrix” A‘A is
2.by=2:
ries
F
ee
fi)
1
wall
ep
ty
Sas
‘
bees
m
i
fe
aa
8,
;
(8)
On the right side of the equation is the 2 by 1 vector A'b:
by
A
oe
5
me)
wae
en
x
a
ees
en;
In a specific problem, all these numbers are given. The m equations Ax = b reduce
to two equations ATA¥ = A'b. A formula for C and D will be in equation (10).
4.3
Least Squares Approximations
189
4H The line C + Dt which minimizes ef +--+ +e, is determined by ATAR = A™D:
me
CL
SE ie ate
C ENV
ldee
ESE:
SED
(9)
The vertical errors at the m points on the line are the components of e = b — p.
As always, those equations come from geometry or linear algebra or calculus.
The error vector (or residual) b— AX is perpendicular to the columns of A (geometry).
It is in the nullspace of A? (linear algebra). The best ¥ = (C, D) minimizes the total
error E, the sum of squares:
E(x) = \|Ax — bil? =(C +.Dh — by)? +--+ + (C + Dtm — bm)?.
When calculus sets the derivatives 0E/dC and dE/dD to zero, it produces the two
equations in (9).
Other least squares problems have more than two unknowns. Fitting a parabola
has n = 3 coefficients C, D, E (see below). In general we are fitting m data points by n
parameters x;,..., X,. The matrix A hasn columns andn < m. The total error is a function
E(x) = ||Ax — b||? of n variables. Its derivatives give the n equations A'A = A'b. The
derivative of a square is linear—this is why the method of least squares is so popular.
Example 2 The columns of A are orthogonal in one special case—when the measurement times add to zero. Suppose b = 1, 2,4 at times ¢ = —2,0,2. Those times add
to zero. The dot product with 1, 1,1 is zero and A has orthogonal columns:
C+ D(-2)
=1
CAD
(O)
a=
Ca
UO),
=4
The measurements
La?
or
Ae —a
C
1
|
0
| = || 2
I
ame
4
1, 2,4 are not on a line. There is no exact C and D and x. Look
at the matrix A! A in the least squares equation for ¥:
A Tee
An
iPAD
.
is
3 Ol
Le
7
E | sl=[4)-
Main point: Now A'A is diagonal. We can solve separately for C= “ and D = g.
The zeros in A'A are dot products of perpendicular columns in A. The denominators
3 and 8 are not | and 1, because the columns are not unit vectors. But a diagonal
matrix is virtually as good as the identity matrix.
Orthogonal columns
are so helpful that it is worth moving the time origin to
produce them. To do that, subtract away the average time f= (ty + +++ + tn)/m.
Then the shifted measurement times 7; = t; —f add to zero. With the columns now
orthogonal, A'A is diagonal. The best C and D have direct formulas:
C2
by
+-+-+)Dm
mi
and
ee
ees
Encny
Tote+Tp
(10)
190 = Chapter 4 Orthogonality
The best line is C+ DT or
C+ D(t —f). The time shift that makes ATA diagonal is an
example of the Gram-Schmidt process, which orthogonalizes the columns in advance.
Fitting by a Parabola
If we throw a ball, it would be crazy to fit the path by a straight line. A parabola
b =C + Dt + Et? allows the ball to go up and come down again (b is the height at
time ft). The actual path is not a perfect parabola, but the whole theory of projectiles
starts with that approximation.
When Galileo dropped a stone from the Leaning Tower of Pisa, it accelerated.
The distance contains a quadratic term 5gt*. (Galileo’s point was that the stone’s mass
is not involved.) Without that term we could never send a satellite into the right orbit. But even with a nonlinear function like t?, the unknowns C, D, E appear linearly.
Choosing the best parabola is still a problem in linear algebra.
Problem Fit b,,...,bm at times t},..., tm by a parabola b = C + Dt + Et?.
With m > 3 points, the equations for an exact fit are generally unsolvable:
C+
D+
Et? =b
1
has the m by 3 matrix
A=|:
:
ie
tome
C8
C2 Dt,
+ kt be
Least squares
equations
The best parabola chooses
¥ =
(C, D, E) to satisfy the three normal
A' Ax = A'b.
May I ask you to convert this to a problem of projection? The column space
of A has dimension ____. The projection of b is p = AX, which combines the
three columns using the coefficients
C, D, E. The error at the first data point is ey =
bj—C-—Dt,—E re The total squared error is FE= er+
. If you prefer to minimize
by calculus, take the partial derivatives of E Sith respect to
These three derivatives will be zero when X = (C, D, E) solves the 3 by 3 see =
equations
Section 8.4 has more least squares applications. The big one is Fourier series—
approximating functions instead of vectors. The error to be minimized changes from a sum
Ve er i
e to an integral. We will find the straight line that is closest to f(x).
Example 3 For a parabola b = C+Dt+Et?
to go through the three heights b = 6, 0, 0
when ¢ = 0, 1, 2, the equations are
C+D-0+E-0*=6
C+D-1+E-17=0
C+D-2+E-2?=0.
(12)
4.3
Least Squares Approximations
191
b =6—9t
+ 312
Figure 4.8
From Example 2: An exact fit of the parabola through three points means
p =6
and e = 0. The fourth point requires least squares.
This is Ax = b. We can solve it exactly. Three data points give three equations and a
square matrix. The solution is x = (C, D, E) = (6, —9, 3). The parabola through the
three points in Figure 4.8a is b = 6 — 9t + 377.
What does this mean for projection? The matrix has three columns, which span
the whole space R?. The projection matrix is the identity matrix! The projection of b
is b. The error is zero. We didn’t need A'A¥ = A'D, because we solved Ax = b. Of
course we could multiply by A‘, but there is no reason to do it.
Figure 4.8a also shows a fourth point b4 at time ty. If that falls on the parabola,
the new Ax = b (four equations) is still solvable. When the fourth point is not on the
parabola, we turn to A'A¥ = A‘. Will the least squares parabola stay the same, with
all the error at the fourth point? Not likely!
An error vector (0,0, 0, e4) is not perpendicular to the column
(1,1, 1,1) of A.
Least squares balances out the four errors, and they add to zero.
=»
1
REVIEW OF THE KEYIDEAS
The least squares solution of ¥ minimizes
=&
E = ||Ax — b\|?. This is the sum of
squares of the errors in the m equations (m > n).
Zz
The best
¥ comes from the normal equations A'A¥ = AD.
3
To fit m points by a line
4
The heights of the best line are p = (p},..- , Pm). The vertical distances to the
data points are e = (e),... , &m).
b= C + Dt the two normal equations give C and D.
192
Chapter 4 Orthogonality
5
If we fit m points by a combination of n functions y;(t), the m equations
>> x; (ti) = b; are generally unsolvable. The n normal equations give the best
least ‘squares solution ¥ = (4,....%,):
Problem Set 4.3
Problems
1
J
1-10 use four data points to bring out the key ideas.
(Straight line
b = C + Dt through four points) With b = 0,8, 8,20 at times
t = 0,1,3,4,
measurements
write down the four equations Ax = b (unsolvable). Change the
to p = 1,5, 13, 17 and find an exact solution to AX¥ = p.
With
b =
0,8, 8,20
at t =
0,1,3,4,
set up and solve the normal
equations
A! AX = A'b. For the best straight line in Figure 4.9a, find its four heights and
four errors.
What is the minimum
value of E = et + es
es -- e4?
3
Compute p = AX for the same b and A using A'A¥ = A'b. Check that e =
b — p is perpendicular to both columns of A. What is the shortest distance |le]|
from b to the column space?
4
(Use calculus) Write down E = ||Ax — b||? as a sum of four squares involving
C and D. Find the derivative equations JE /dC = 0 and JdE/dD = 0. Divide
by 2 to obtain the normal equations A)A¥ = A'b.
5
Find the height C of the best horizontal line to fit b = (0, 8, 8, 20). An exact fit
would
solve the unsolvable equations
C = 0,C = 8,C = 8,C = 20.
Find the
4 by 1 matrix A in these equations and solve ATA¥ = A'b. Redraw Figure 4.9a
to show the best height x = C and the four errors in e.
b,=20 7
b= (0, 8, 8, 20)
a
~
€o
p=Ca,+Da,
b=
(0, 1, 3, 4)
a,=(1, 1,1, 1)
=
Figure 4.9
t=1
tz =3
t4=4
Problems 1-11: The closest line C + Dt matches the projection in R*.
4.3
Least Squares Approximations
193
Project b = (0, 8, 8, 20) onto the line through a = (1, 1, 1, 1). Find x = a'b/a'a
and the projection p = xa. Redraw Figure 4.9b and check that e = b— p is perpendicular to a. What is the shortest distance |le|| from b to the line in your
figure?
Find the closest line
b = Dt, through the origin, to the same four points.
exact fit*would
D -0-=
solve
0,D -1.=.8, De
4 by | matrix and solve A'A¥ = A'b.
slope x = D and the four errors.
3i= "89D <4%=120)
An
Find the
Redraw Figure 4.9a showing the best
Project b = (0,8, 8,20) onto the line through a = (0,1,3,4).
Find ¥ and
p = xa. The best C in Problems 5-6 and the best D in Problems 7-8 do
not agree with the best (C, D) in Problems 1-4. That is because (1, 1,1, 1) and
(0,1,3,4) are ____—s perpendicular.
For the closest parabola b = C+ Dt+ Et? to the same four points, write down the
unsolvable equations Ax = b. Set up the three normal equations A'A¥ = Ab
(solution not required). In Figure 4.9a you are now fitting a parabola—what is
happening in Figure 4.9b?
10
11
For the closest cubic b = C + Dt + Et? + Ft? to the same four points, write
down the four equations Ax = b. Solve them by elimination. In Figure 4.9a this
cubic now goes exactly through the points. Without computation write p and e
in Figure 4.9b.
The average of the four times is f = + +1+3-+4)
=2.
The average of the
four b’s is b = (0+
8 +8 + 20) =9.
(a)
Verify that the best line goes through the center point (, b) = (2,9).
(b)
Explain why C + Di = b comes from the first normal equation.
Questions 12-16 introduce basic ideas of statistics—the foundation for least squares.
2
(Recommended)
(Os (lees)
This problem projects b =
(bj,...,bm)
onto the line through
(a)
Solve a'aX = a'b to show that X is the mean
of the b’s.
(b)
Find the error vector e and the variance
(c)
Draw a graph with b = (1, 2, 6) fitted by a horizontal line. What are p and
e on the graph? Check that p is perpendicular to e and find the matrix P.
\le||7 and the standard deviation
lle.
13
First assumption behind least squares: Each measurement error has “expected value”
zero.
Multiply the eight error vectors
b — Ax = (+1,+1,+1) by (ATA) 1AT
to show that the eight vectors ¥ — x also average to zero. The expected value of ¥ is
the correct x.
194
Chapter 4 Orthogonality
14
Second assumption behind least squares: The measurement errors are independent
and have the same variance o*. The average of (b — Ax)(b — Ax)! is 071.
Multiply on the left by (A7A)~!A™ and multiply on the right by A(ATA)~! to
show that the average of (¥ —x)(X — x)! is 02(A'A)~!.
matrix for the error in X.
15
This is the covariance
A doctor takes m readings b;,...,bm of your pulse rate. The least squares solution to the m equations x = bj,x = b2,...,x = bm is the average xX =
(b} + +--+ bm)/m. The matrix A is a column of 1’s. Problem 14 gives the
expected error (x — x)" as o2(A! A)! =
. By taking m measurements,
the variance drops from 07 to o7/m.
16
If you know the average X99 of 99 numbers b),..., 499, how can you quickly
find the average Xj99 with one more number bj99? The idea of recursive least
squares is to avoid adding 100 numbers. What coefficient correctly gives X00
from bjo9 and X99?
7000100 +
Zoq = Tog(bi +--+ bio0).
Questions 17-25 give more practice with ¥ and p and e.
17
Write down three equations for the line
t=
lob
=]
at. t=,
and.b.=-2)
b = C + Dt to go through b = 7 at
at t=
2...Find the leastasquares, solution
X = (C, D) and draw the closest line.
18
Find the projection p = AX in Problem
17. This gives the three
of the
closest line. Show that the error vector is e = (2, —6, 4).
1g
Suppose the measurements
at t = —1, 1, 2 are the errors 2, —6,4 in Problem
Compute x and the closest line. Explain the answer:
dicular to
so the projection is p = 0.
20
Suppose the measurements at ¢ = —1,1,2 are b = (5, 13,17).
the closest line and e. The error is e = 0 because this D is
21
Which of the four subspaces contains the error vector e?
Which contains x? What is the nullspace of A?
ZZ
Find the best line C + Dt to fit
23
18.
b = (2, —6,4) is perpenCompute ¥ and
Which contains p?
b= 4,2, —1,0,0 at times t = —2, —1,0, 1,2.
(Distance between lines) The points P = (x, x, x) are on a line through (1, 1, 1)
and Q = (y,3y,-—1) are on another line.
squared distance ||P — Q|I".
Choose
x and y to minimize
the
24
Is the error vector e orthogonal to b or p or e or ¥? Show that |le||* equals e'b
which equals b'b — b! p. This is the smallest total error E.
25
The derivatives of ||Ax||? with respect to the -variables x;,...,x, fill the
vector 2A! Ax. The derivatives of 2b’
Ax fill the vector 2A'b. So the derivatives
of ||Ax — b\|? are zero when
4.4
ORTHOGONAL
Orthogonal Bases and Gram-Schmidt
195
BASES AND GRAM-SCHMIDT a 4.4
This section has two goals. The first is to see how orthogonal vectors make calculations
simpler. Dot products are zero—so A!A is a diagonal matrix. The second goal is to
construct orthogonal vectors. We will pick combinations of the original vectors to
produce right angles. Those original vectors are the columns of A. The orthogonal
vectors will be the columns of a new matrix Q.
You know from Chapter 3 what a basis consists of—independent vectors that span
the space. The basis gives a set of coordinate axes. The axes could meet at any angle
(except 0° and 180°). But every time we visualize axes, they are perpendicular. /n our
imagination, the coordinate axes are practically always orthogonal. This simplifies the
picture and it greatly simplifies the computations.
The vectors q1,...,4n are orthogonal when their dot products q; + qj are zero.
More exactly gq}9) = 0 whenever i ~# j. With one more step—just divide each vector
by its length—the vectors become orthogonal unit vectors. Their lengths are all 1.
Then the basis is called orthonormal.
DEFINITION
The vectors qj,..., qn are orthonormal if
tT,
_J|0O
Vi 41 ~)
1
wheni+#j
(orthogonal vectors)
when i = j. (unit vectors:
|lq;|| = 1)
A matrix with orthonormal columns is assigned the special letter Q.
The matrix Q is easy to work with because Q'Q =I. This repeats in matrix
language that the columns q1,..-@n are orthonormal. It is equation (1) below, and Q
is not required to be square.
When Q
is square,
o'oO = / means
that gt =O
transpose = inverse.
4l A matrix Q with orthonormal columns satisfies Q'Q = /:
ag ne
ie
O' Oe
; ae,
oe
1
—
Fie
i
|
Tieden
Gril
14)
0
| 5) cae
a
i eee ee eee
OO
cee
|
See(l)
1
When row i of Q? multiplies column j of Q, the dot product is q; 9): Off the diagonal
(i # j) that dot product is zero by orthogonality. On the diagonal (i = j) the unit
vectors give q} qi = lIqi ll ——}a ks
If the columns are only orthogonal (not unit vectors), then Q'Q is a diagonal
matrix (not the identity matrix). We wouldn’t use the letter Q. But this matrix is
almost as good. The important thing is orthogonality—then it is easy to produce unit
vectors.
196
Chapter 4 Orthogonality
To repeat: Q'Q =I
even when Q is rectangular. In that case Q! is only an inverse from the left. For square matrices we also have QQ' =1, so Q! is the two-sided
inverse of Q. The rows of a square Q are orthonormal like the columns. To invert the
matrix we just transpose it. In this square case we call Q an orthogonal matrix.
Here are three examples of orthogonal matrices—rotation and permutation and
reflection.
Example 1 (Rotation)
Q rotates every vector in the plane through the angle @:
ee
sin 6
a
a
oat
cos 0
A
a
—sin@
cosé
The columns of Q are orthogonal (take their dot product). They are unit vectors because sin? @+cos?@ = 1. Those columns give an orthonormal basis for the plane R?.
The standard basis vectors i and j are rotated through @ (see Figure 4.10a).
Q-! rotates vectors back through —6. It agrees with Q', because the cosine of
—O is the cosine of 0, and sin(—@) = —sin@. We have Q'Q =/ and OQ! =1/.
Example 2 (Permutation)
Ope
Oa
slat)
eel
These matrices change the order of the components:
i 0 O12
x
yl
y
=|2
and
x
i A
eee
is
.
All columns of these Q’s are unit vectors (their lengths are obviously 1). They are also
orthogonal (the 1’s appear in different places). The inverse of a permutation matrix is
its transpose. The inverse puts the components back into their original order:
Inverse = transpose:
QO" oO I
LeOe.0
Oth
0
y
Pe
3
x
med
and
Zz
0 1
; o|k|= | ;
ie
y
Every permutation matrix is an orthogonal matrix.
Example 3 (Reflection) If wu is any unit vector, set Q = J — 2uu!. Notice that
uu’ is a matrix while u!u is the number ||u|/? = 1. Then Q' and Q~! both equal Q:
O'=1-2uu'=Q
and
QO'QO=1 —4uu" 4+ 4uuTuu' = 1.
(2)
Reflection matrices J — 2uu™ are symmetric and also orthogonal. If you square them,
you get the identity matrix. Reflecting twice through a mirror brings back the original,
and O* = Q'Q =1. Notice u'u = 1 near the end of equation (2).
*“Orthonormal matrix” would have been a better name for Q, but it’s not used. Any matrix with
orthonormal columns has the letter Q, but we only call it an orthogonal matrix when it is square.
4.4
Orthogonal Bases and Gram-Schmidt
As examples we choose the unit vectors uw; = (1, 0) and then uz = (2,
Compute 2uu! (column times row) and subtract from /:
197
- Li/4/2):
er=1-2[5]0 ol=[ {] ame or=r-2[$ S]=[1 9]
Q, reflects uw; = (1,0) across the y axis to (—1,0).
Every vector (x, y) goes into its
mirror image (—x, y) across the y axis:
Reflection from Q):
E
"|| = als
Q2 is reflection across the 45° line. Every (x, y) goes to (y, x)—this was the permuta-
tion in Example 2. A vector like (3,3) doesn’t move when you exchange 3 and 3—it
is on the mirror line. Figure 4.10b shows the mirror.
Rotations preserve the length of a vector. So do reflections. So do permutations.
So does multiplication by any orthogonal matrix—lengths and angles don’t change.
4J If Q has orthonormal columns (Q™Q = 1), it leaves lengths unchanged:
|| Qx || = ||x|| for every vector x.
Q also preserves dot products:
Proof
||Qx||
equals
(3)
(Qx)'(Qy) =x'Q'Qy=x!y.
IIx ||? because
(Qx)'(Qx)
=
On
Oa
Just use Q'Q =7!
x'Ix
=
x'x.
Orthogonal matrices are excellent for computations—numbers can never grow too large
when lengths of vectors are fixed. Good computer codes use Q’s as much as possible.
That makes them numerically stable.
Figure 4.10
Rotation by
Q = [§ “§] and reflection by Q = [ o_-S
—
198
= Chapter 4 Orthogonality
Projections Using Orthogonal Bases:
Q Replaces A
This chapter is about projections onto subspaces. We developed the equations for X
and p and P. When the columns of A were a basis for the subspace, all formulas
involved A'A. The entries of A'A are the dot products a}aj.
Suppose the basis vectors are actually orthonormal. The a’s become qg’s. Then
A'A simplifies to Q'Q = J. Look at the improvements in ¥ and p and P. Instead
of O'O we print a blank for the identity matrix:
%¥=O0'b
and
p=QO*
The least squares solution of Ox = bis
and
P=Q
On
(4)
= Q"b. The projection matrix is P = Q oF
There are no matrices to invert. This is the point of an orthonormal basis. The best
¥ = Q'b just has dot products of b with the rows of Q', which are the q’s:
—qi —
qib
=) I
|
>
oC
ie
;
(dot products)
qnb
We have n separate 1-dimensional projections. The “coupling matrix” or “correlation
matrix” A'A is now Q'Q =. There is no coupling. Here is p = QF:
P=
|
1G) =
|
T
1} %2
Qn
: |= qi(qib) +---+4n(q,b).
Ugh
(5)
Important case: When Q is square and m = n, the subspace is the whole space.
Then Q' = Q~! and & = Q'Db is the same as x = Q~'b. The solution is exact! The
projection of b onto the whole space is b itself. In this case P = QQ! =/.
You may think that projection onto the whole space is not worth mentioning.
But when p = JB, our formula assembles b out of its 1-dimensional projections. If
41,---»Qn 18 an orthonormal basis for the whole space, so Q is square, then every b
is the sum of its components along the q’s:
b = qi(qib) + 42(q3b) +
+ Gn(qn®).
(6)
That is QQ' =. It is the foundation of Fourier series and all the great “transforms”
of applied mathematics. They break vectors or functions into perpendicular pieces.
Then by adding the pieces, the inverse transform puts the function back together.
Example 4 The columns of this matrix Q are orthonormal vectors q1, 42, q3:
| a
Q=%|
2 -1
|
)
2]
has first column
qi =
Wile
WIN
4.4
Orthogonal Bases and Gram-Schmidt
199
The separate projections of b = (0,0, 1) onto q; and gz and q3 are
qi(q; b) = £4)
and
q2(q3 b) = £q2
and
q3(qib) me es
The sum of the first two is the projection of b onto the plane of q; and q2. The sum
of all three is the projection of b onto the whole space—which is b itself:
ere ee’)
0
=5.
=|]0|
4-2-2]
+ 2g. —4g3=4|
24,
44441
I
The Gram-Schmidt
Process
The point of this section is that “orthogonal is good.’ Projections and least squares
always involve A!A. When this matrix becomes Q'Q = /, the inverse is no problem.
The one-dimensional projections are uncoupled. The best ¥ is Q'b (n separate dot
products).
For this to be true, we had to say “Jf the vectors are orthonormal.”
Now we find a way to create orthonormal vectors.
Start with three independent vectors a, b,c. We intend to construct three orthogonal vectors A, B, C. Then (at the end is easiest) we divide A, B,C by their lengths.
That produces three orthonormal vectors q; = A/||A|l, 92 = B/||Bl|, ¢3 = C/||Cl.
Gram-Schmidt
Begin by choosing A = a. This gives the first direction. The next
direction B must be perpendicular to A. Start with b and subtract its projection along
A. This leaves the perpendicular part, which is the orthogonal vector B:
B=b—
—.-A.
(7)
A and B are orthogonal in Figure 4.11. Take the dot product with A to verify that
A'B = A'b— A'b = 0. This vector B is what we have called the error vector e,
|
|
|
|
|
NJ
onto
AB
plane
Figure 4.11
First find B. Then find C. Then divide by ||Al], ||B||, and ||C|].
200 = Chapter 4 Orthogonality
perpendicular to A. Notice that B in equation (7) is not zero (otherwise a and b would
be dependent). The directions A and B are now set.
The third direction starts with c. This is not
a combination of A and B (because
c is not a combination of a and b). But most likely ¢ is not perpendicular to A and
B. So subtract off its components in those two directions to get C:
C=c-———
A]
——B.
(8)
This is the one and only idea of the Gram-Schmidt process. Subtract from every new
vector its projections in the directions already set. That idea is repeated at every step.’
If we also had a fourth vector d, we would subtract its projections onto A, B, and C.
That gives D. At the end, divide the orthogonal vectors A, B, C, D by their lengths.
The resulting vectors g1, q2, 43, q4 are orthonormal.
Example 5 Suppose the independent vectors a, b, ¢ are
a=j|-1l
and
2
0
—2
b=
0
Then A =a
and
3
|-—3
3
c=
has A'A = 2. Subtract from b its projection along A:
Check: C is perpendicular to A and B. Finally convert A, B,C to unit vectors (length
1, orthonormal). The lengths of A, B, C are /2 and 6 and 3. Divide by those,
for an orthonormal basis:
ich
q=—|-1
V2)
9
and
Usually A, B,C contain fractions.
TT think Gram
had the idea.
i)
q2=-—=
2
]
V6 |_>
and
1
q3
= —}]
va
1
Almost always q1, 42, q3 contain square roots.
I don’t really know
where Schmidt
came
in.
4.4
Orthogonal Bases and Gram-Schmidt
The Factorization
201
A = OR
We started with a matrix A, whose columns were a,b,c. We ended with a matrix Q,
whose columns are q1, 42, q3. How are those matrices related? Since the vectors a, b,c
are combinations of the q’s (and vice versa), there must be a third matrix connecting
A to Q. Call it R.
The first step was q; = a/|l|a|| (other vectors not involved). The second equation
was (7), where b is a combination of A and B. At that stage C and q3 were not
involved. This non-involvement of later vectors is the key point of the process:
-
The vectors a and A and q; are all along a single line.
~
The vectors a,b and A, B and qj, q2 are all in the same plane.
-
The vectors a, b,c and A, B, C and q}, q2, q3 are in one subspace (dimension 3).
At every step a},...,a@ are combinations of qj,..., qx. Later q’s are not involved.
The connecting matrix R is triangular, and we have A = QR.
abecl=|q
A = OR is Gram-Schmidt
qa gib qye
@ 9
gob qyc|
q3¢
or
A=QR-
(9)
in a nutshell.
4K (Gram-Schmidt) From independent vectors a), ...,a,, Gram-Schmidt constructs
orthonormal vectors qj,...,4n. The matrices with these columns satisfy A = QR.
Then R = Q'A is triangular because later g’s are orthogonal to earlier a’s and
a
q; 4 = 9.
Here are the a’s and q’s from the example.
dot product of g; with aj:
(oe
AS
On=2
Sie
VOwA37
W
=
Li
1/2
0)
Wa/6
174/60.
632/76:
The i,j entry of R = Q'A is the
Aa 3
1/83
sea
iw om on
18
0 R/6 x6
ONO
4/3
FOR.
The lengths of A, B, C are the numbers V2, V6, V3 on the diagonal of R. Because of
the square roots, QR looks less beautiful than LU. Both factorizations are absolutely
central to calculations in linear algebra.
Any m by n matrix A with independent columns can be factored into QR. The
m by n matrix Q has orthonormal columns, and the square matrix R is upper triangular
with positive diagonal. We must not forget why this is useful for least squares: ATA
equals R'Q™ QR = R'R. The least squares equation A’ AX = A'b simplifies to
R™R* = R'Q'b
or
RF = O'D.
(10)
202
Chapter 4 Orthogonality
Instead of solving Ax = b, which is impossible, we solve R¥ = O'b by back substitution—which is very fast. The real cost is the mn* multiplications in the Gram-Schmidt
process, which are needed to find Q and R.
Here is an informal code. It executes equations (11) and (12), for k = 1 then k = 2
and eventually k =n. Equation (11) normalizes to unit vectors: For k = 1,...,n
rk = bey
and = Gin = © for i=1,...,m.
a
Vkk
Equation (12) subtracts off the projection onto qx: For
m
kj = DE Gian
i=1
and
Gj; = Gi; — Gixtey
7
(11)
=k+1,...,n
for
i= 1,2.2, m
(12)
Starting from a, b,c = aj, az, a3 this code will construct qi, B, q2, C, q3:
1
qi =4@;/|\a;|| in (11)
2
B=ay—(q{ar)q;
3
q2 = B/||B|| in (11)
4
C=C*—(qiC*)qo in (12)
5
q3=C/||C|| in (11)
and =C* = a3 — (q/.a3)qi in (12)
Equation (12) subtracts off projections as soon as the new vector qx is found. This
change to “one projection at a time” is called modified Gram-Schmidt. It is numerically
more stable than equation (8) which subtracts all projections at once.
#
1
REVIEW OF THE KEY IDEAS
If the orthonormal
vectors q),...qQn
are the columns
#8
of Q, then q} 4; = 0 and
q)4i = 1 translate into Q'Q = 1.
If Q is square (an orthogonal matrix) then Q' = Q7!.
The length of Qx equals the length of x:
||Qx|| = ||x]].
Ww
>
N
The projection onto the column space spanned by the q’s is P = OQ!.
5
If Q is square then P = / and every
b= qi(q} b) +...+4@n(q)b).
6
Gram-Schmidt
produces orthonormal
In matrix form this is the factorization
qi, q2,q3
A = OR.
from
independent
a,b,c.
4.4.
Orthogonal Bases and Gram-Schmidt
203
Problem Set 4.4
Problems 1-12 are about orthogonal vectors and orthogonal matrices.
1
Are these pairs of vectors orthonormal or only orthogonal or only independent?
)
1
b ape
—1
_ ()
6
4
| auc Bey
©)
cos 6
ee ane
—sin@
a
Change the second vector when necessary to produce orthonormal vectors.
The vectors (2, 2, —1) and (—1, 2, 2) are orthogonal. Divide them by their lengths to
find orthonormal vectors q; and g2. Put those into the columns of Q and multiply
O'O and OO".
(a)
If A has three orthogonal columns each of length 4, what is A’ A?
(b)
If A has three orthogonal columns of lengths 1, 2,3, what is AA?
Give an example of each of the following:
(a)
A matrix
Q that has orthonormal
columns but QQ? Sali,
(b)
Two orthogonal vectors that are not linearly independent.
(c)
An orthonormal
basis for R*, where every component is 5 or —5.
Find two orthogonal vectors that lie in the plane x + y + 2z = 0. Make them
orthonormal.
If Q,; and Q> are orthogonal matrices, show that their product Q;Q2
orthogonal matrix. (Use o'o =O)
is also an
If the columns of Q are orthonormal, what is the least squares solution ¥ to
Qx = b? Give an example with b 4 0 but x = 0.
(a)
Compute
P =
QQ"
when q, =
(.8,.6,0)
and q2 =
(—.6,.8,0).
Verify
that P? = P.
(b)
Prove that always (0Q')(QQ') = QQ! by using Q'Q =/.
Then P =
QQ! is the projection matrix onto the column space of Q.
Orthonormal vectors are automatically linearly independent.
(a)
Vector proof:
When
c1q1 + c2qg2 + c3q3
=
0, what dot product leads to
c) = 0? Similarly cz = 0 and c3 = 0 and the q’s are independent.
10
(b)
Matrix proof: Show that Qx = 0 leads to
gular, you can use QP but not Or%
x = 0. Since
(a)
Find orthonormal vectors q; and q2 in the plane of a = (1,3,4,5,7)
b = (—6, 6, 8, 0, 8).
(b)
Which
vector in this plane is closest to (1, 0,0, 0, 0)?
Q may be rectan-
and
204
Chapter 4 Orthogonality
11.
If q; and q2 are orthonormal vectors in R°, what combination
is closest to a given vector b?
12
If aj, az, a3 is a basis for R°’, any vector b can be written as
b= xja, + x2a2 + x343
(a)
a, az a3
Oe
Suppose the a’s are orthonormal.
Show that x; = alb.
(b)
Suppose the a’s are orthogonal.
Show that x; = al b/a}ay.
(c)
If the a’s are independent, x, is the first component of
Problems
13.
or
qt
13-24 are about the Gram-Schmidt
qQ
0.
times b.
process and A = QR.
What multiple of a =[1] should be subtracted from b = [4] to make the result
B orthogonal to a? Sketch a figure to show a, b, and B.
14
Complete the Gram-Schmidt process in Problem
and q2 = B/||B|| and factoring into QR:
13 by computing q; = a/|la||
F f= [a oo|['0! isi)
15
(a)
Find orthonormal vectors q1, q2, g3 such that q;, g2 span the column space of
(b)
Which of the four fundamental subspaces contains q3?
(c)
Solve Ax = (1, 2,7) by least squares.
16
What multiple of a = (4,5, 2,2) is closest to
vectors q; and q2 in the plane of a and BD.
17
‘Find the projection of b onto the line through a:
1
¢=nel
and.
i
1
bi=2\53
5
and)
b = (1,2,0,0)?
Pi)
and
Find orthonormal
eb
— pie)
Compute the orthonormal vectors q; = a/||a|| and q2 = e/|le||.
18
If A = QR then ATA = R'R =
Gram-Schmidt
triangular times
triangular.
on A corresponds to elimination on A’A. Compare the pivots
for A'A with ||a||? = 3 and |le||7 = 8 in Problem 17:
1 l
=
Av—siales
N
and
Tops
Nelly
ee
Al
4.4
19
20
Orthogonal Bases and Gram-Schmidt
205
True or false (give an example in either case):
(a)
The inverse of an orthogonal matrix is an orthogonal matrix.
(b)
If Q (3 by 2) has orthonormal columns then ||Qx|| always equals ||x||.
‘Find an orthonormal basis for the column space of A:
2
0
1
and
—4
—3
3
I) =
3
—
0
Then compute the projection of b onto that column space.
21
~+Find orthogonal vectors A,
B,C by Gram-Schmidt
from
1
1
Oral
and
b=/)-1
2
22
ard
Cea)
0
Find qj, q2, q3 (orthonormal)
Then write A as OR:
as combinations
>
4
of a, b,c (independent columns).
ll
oor
won
Nuns
23
(a)
Find a basis for the subspace S in R* spanned by all solutions of
xp
24
+x.
+x3—x4
= 0.
(b)
Find a basis for the orthogonal complement S+.
(c)
Find by; in S and bp in S+ so that b} + by = b = (1, 1,1, 1).
If ad —bc > 0, the entries in
A = QR are
a
(a
ki 4
CL
te
=¢|
(?)
ances
(a7
0
sab scd
ad — bc
[qr
2
Write down A = QR when a, b, c,d = 2,1, 1, 1 and also 1, 1, 1, 1. Which entry
becomes
zero when Gram-Schmidt
breaks down?
206
Chapter 4 Orthogonality
Problems 25-28 use the QR code in equations (11-12). It executes Gram-Schmidt.
25
Show why C (found via C* in the steps after (12)) is equal to C in equation (8).
26
Equation (8) subtracts from c its components along A and B. Why not subtract
the components along a and along b?
27
Write a working code and apply it to a = (2, 2, —1),
What are the q’s?
28
Where are the mn? multiplications in equations (11) and (12)?
b = (0, —3, 3), ec= (1, 0, 0).
Problems 29-32 involve orthogonal matrices that are special.
29
(a)
Choose c so that Q is an orthogonal matrix:
jee
ee
|e
ee
Gere
CEC aleip 8Rien
eo
sie
(b)
Change the first row
matrix.
and column
ket
1
to all 1’s and fill in another orthogonal
30
Project b = (1,1, 1,1) onto the first column
onto the plane of the first two columns.
31
If w is a unit vector, then Q = J — 2uu!' is a reflection matrix (Example 3).
Find Q from u = (0,1) and also from u = (0, V2/2, V2/2). Draw figures to
show the reflections of (x, y) and (x, y, z).
32
Q =1 —2uu!
33
in Problem 29(a).
Then project b
T
is a reflection matrix when u'u
= 1.
(a)
Prove that Qu = —u.
The mirror is perpendicular to u.
(b)
Find Qv when u!v =0.
The mirror contains v.
The first four wavelets are in the columns
r=.
a ce
alae
of this wavelet matrix W:
VA
0
BG /2)
mein
1
Ora),
iG es
Onn)
What is special about the columns of W?
W-!. What is the relation of W~! to W?
Find the inverse wavelet transform
DETERMINANTS
THE PROPERTIES OF DETERMINANTS a 5.1
The determinant of a square matrix is a single number. That number contains an amazing amount of information about the matrix. It tells immediately whether the matrix
is invertible. The determinant is zero when the matrix has no inverse. When A is
invertible, the determinant of A~! is 1/(det A). If det
A = 2 then det A~! = 5. In fact
the determinant leads to a formula for every entry in A7!.
This is one use for determinants—to find formulas for inverse matrices and pivots
and solutions A~!b. For a matrix of numbers, we seldom use those formulas. (Or
rather, we use elimination as the quickest way to evaluate the answer.) For a matrix
with entries a,b, c,d, its determinant shows how A~! changes as A changes:
_|4a
a=|?
b
wrk:
A
has inverse
A
=|
=a
1
d
lS
—b
ak
(1)
Multiply those matrices to get 7. The determinant of A is ad — bc. When det A = 0,
we are asked to divide by zero and we can’t—then A has no inverse. (The rows are
parallel when a/c = b/d. This gives ad = bc and a zero determinant.) Dependent
rows lead to det A = 0.
There is also a connection to the pivots, which are a and d—(c/a)b. The product
of the two pivots is the determinant:
a(d 2 <p) alist belewmichtiel
deans
a
After a row exchange the pivots are c and b — (a/c)d. Those pivots multiply to give
minus the determinant. The row exchange reversed the sign of det A.
Looking ahead
The determinant of an n by n matrix can be found in three ways:
1 Multiply the n pivots (times 1 or —1).
2 Add up n! terms (times 1 or —1).
3 Combine
This is the pivot formula.
This is the “big” formula.
n smaller determinants (times
| or —1).
207
This is the cofactor formula.
208
Chapter 5 Determinants
You see that plus or minus signs—the decisions between 1 and —1—play a very big
part in determinants. That comes from the following rule for n by n matrices:
The determinant changes sign when two rows (or two columns) are exchanged.
The identity matrix has determinant +1. Exchange two rows and det P = —1. Exchange two more rows and the new permutation has det P = +1. Half of all permutations are even (det P = 1) and half are odd (det P = —1).
Starting from /, half of
the P’s involve an even number of exchanges and half require an odd number. In the
2 by 2 case, ad has a plus sign and bc has minus—coming from the row exchange:
il
det|
@
Na
and
(Oa
det|| o|=-1
The other essential rule is linearity—but a warning comes first. Linearity does
not mean that det(A + B) = det A + det B. This is absolutely false. That kind of
linearity is not even true when A = 7 and B = I. The false rule would say that
det22 = 1+ 1 = 2. The true ruletissdet2/ = 2”. Determinants’ ‘are:multiphed* by
t” (not just by ¢) when matrices are multiplied by ¢. But we are getting ahead of
ourselves.
In the choice between defining the determinant by its properties or its formulas,
we choose its properties—sign reversal and linearity. The properties are simple (Section 5.1). They prepare for the formulas (Section 5.2). Then come the applications,
including these three:
(1)
Determinants give A7! and A~!b (by Cramer’s Rule).
(2)
The volume of an n-dimensional box is |det A|, when the edges of the box come
from the rows of A.
(3)
The numbers
4 for which det(A — AJ) = 0 are the eigenvalues of A. This is the
most important application and it fills Chapter 6.
The Properties of the Determinant
There are three basic properties (rules 1, 2, 3). By using those rules we can compute
the determinant of any square matrix A. This number is written in two ways, det A
and |A|. Notice: Brackets for the matrix, straight bars for its determinant. When A
is a 2 by 2 matrix, the three properties lead to the answer we expect:
The determinant of
ig |
G
iS
a
b
d
[=ad ~ be
We will check each rule against this 2 by 2 formula, but do not forget: The rules apply
to any n by n matrix. When we prove that properties 4-10 follow from 1-3, the proof
must apply to all square matrices.
5.1
The Properties of Determinants
209
Property 1 (the easiest rule) matches the determinant of J with the volume of a unit cube.
1
The determinant of the n by n identity matrix is 1.
2
The determinant changes sign when two rows are exchanged
a
Gud
Check:
b
(sign reversal):
b
d | (both sides equal bc — ad).
Because of this rule, we can find det P for any permutation matrix. Just exchange rows
of J until you reach P. Then det P = +1 for an even number of row exchanges and
det P = —1 for an odd number.
The third rule has to make the big jump to the determinants of all matrices.
3. The determinant is a linear function of each row separately (all other rows stay fixed).
If the first row is multiplied by ¢, the determinant is multiplied by f. If first rows are added,
determinants are added. This rule only applies when the other rows do not change! Notice
how c and d stay the same:
:
| by ?:
multiply row
add row
ee
| of A to row | of A’:
In the first case, both sides are tad —thc.
ta...
6b
Gop
ce tl. i
:
Ge
é
/
27
d
/
} se,
er\e
cod
/
b
/
Clad
Then t factors out. In the second case, both
sides are ad +. a'd — bc — b’c. These rules still apply when A is n by a, and the last
n— 1 rows don’t change. May we emphasize this with numbers:
0 O
i
(eel
and
0
0
EO
|
0
ih
(Oe
0 0
NY
or
—WwW
©
By itself, rule 3 does not say what any of those determinants are. It just says that the
determinants must pass these two tests for linearity.
Combining multiplication and addition, we get any linear combination in the first
row:
t(row
| of A) + ¢t/(row 1 of A’). With this combined
row, the determinant
is ¢
times detA plus ¢’ times det A’. This rule does not mean that det 2/ = 2 det /.
To obtain 27 we have to multiply both rows by 2, and the factor 2 comes out both times:
2 Sao
F 0
9
eg =.
and
|
;
9)
U
=".
210 = Chapter 5 Determinants
This is just like area and volume. Expand a rectangle by 2 and its area increases by 4.
Expand an n-dimensional box by tf and its volume increases by t”. The connection is
no accident—we will see how determinants equal volumes.
Pay special attention to rules 1-3. They completely determine the number det A—
but for a big matrix that fact is not obvious. We could stop here to find a formula
for n by n determinants. It would be a little complicated—we prefer to go gradually.
Instead we write down other properties which follow directly from the first three. These
extra rules make determinants much easier to work with.
4
If two rows of A are equal, then detA = 0.
Check 2 by: 2:
a
b
=A
b
Rule 4 follows from rule 2. (Remember we must use the rules and not the 2 by 2
formula.) Exchange the two equal rows. The determinant D is supposed to change
sign. But also D has to stay the same, because the matrix is not changed. The only
number with —D = D is D = 0—this must be the determinant. (Note: In Boolean
algebra the reasoning fails, because —1 = 1. Then D is defined by rules 1, 3, 4.)
A matrix with two equal rows has no inverse. Rule 4 makes detA = 0. But
matrices can be singular and determinants can be zero without having equal rows! Rule
5 will be the key. We can do row operations without changing det A.
5
Subtracting a multiple of one row from another row leaves det A unchanged.
a
c—la
b
d—lb
a
C
b
ais
Linearity splits the left side into the right side plus another term —i[2 Lab This extra
term is zero by rule 4. Therefore rule 5 is correct. Note how only one row changes
while the others stay the same—as required by rule 3.
Conclusion The determinant is not changed by the usual elimination steps: det A equals
det U. If we can find determinants of triangular matrices U, we can find determinants
of all matrices A. Every row exchange reverses the sign, so always det A = +det U.
We have narrowed the problem to triangular matrices.
6
A matrix with a row of zeros has det A = 0.
0
e
O
[=o
@
and
oe 1b)
0 O
=o.
For an easy proof, add some other row to the zero row. The determinant is not changed
(rule 5). But the matrix now has two equal rows. So det A = 0 by rule 4.
7 If A is a triangular matrix then det A =
entries.
a41a22++-dyn
a
<
“|= ad
and also
=product of diagonal
0
[=ad.
d
5.1.
The Properties of Determinants
211
Suppose all diagonal entries of A are nonzero. Eliminate the off-diagonal entries by
the usual steps. (If A is lower triangular, subtract multiples of each row from lower
rows. If A is upper triangular, subtract from rows above.) By rule 5 the determinant
is not changed—and now the matrix is diagonal:
a\\
0
a22
We must still prove that
;
0
= 411422 °*+ Ann.
Gay
For this we apply rules 1 and 3. Factor a;; from the first row. Then factor a22 from
the second row. Eventually factor a,,, from the last row. The determinant is a; times
a7 times --- times dy, times det/. Then rule 1 (used at last!) is det / = 1.
What if a diagonal entry of the triangular matrix is zero? Then the matrix is
singular. Elimination will produce a zero row. By rule 5 the determinant is unchanged,
and by rule 6 a zero row means det A = 0. Thus rule 7 is proved—the determinants
of triangular matrices come directly from their main diagonals.
8
If A is singular then det A =0. If A is invertible then det A # 0.
|< i
is singular if and only if
ad —bc =0.
Proof Elimination goes from A to U. If A is singular then U has a zero row. The rules
give det A = detU = 0. If A is invertible then U has the pivots along its diagonal.
The product of nonzero pivots (using rule 7) gives a nonzero determinant:
det A = tdet U = + (product of the pivots).
This is the first formula for the determinant. MATLAB would use it to find det A
from the pivots. The plus or minus sign depends on whether the number of row exchanges is even or odd. In other words, +1 or —1 is the determinant of the permutation matrix P that does the row exchanges. With no row exchanges, the number zero
is even and P = J and detA = detU. Note that always det L = 1, because L is
triangular with 1’s on the diagonal. What we have is this:
If.
Again, det
P = +1
PAr=,LU_and det
then,
et P detA =detL,
det VU.
A = +det U. Equation (2) is our first case of rule 9.
(2)
212
9
Chapter 5 Determinants
The determinant of AB is the product of det A times det B: |AB|=|A||BI.
a
b
Gund!
ap+br
cp+dr
Peg
fees
aq+bs
cq+ds|"
In particular—when the matrix B is A~'—the determinant of A7! is 1/det A:
AA7!=I1
so
(det A)(det
A~!) = det J = 1.
This product rule is the most intricate so far. We could check the 2 by 2 case by algebra:
(ad — bc)(ps — qr) = (ap + br)(cq + ds) — (aq + bs)(cp + dr).
For the n by n case, here is a snappy proof that |AB| = |A||B|. The idea is to consider
the ratio D(A) = |AB|/|B|. If this ratio has properties 1,2,3—-which we now check—it
must equal the determinant |A|. (The case |B| = 0 is separate and easy, because AB
is singular when B is singular. The rule |AB| = |A||B| becomes 0 = 0.) Here are the
three properties of the ratio |AB|/|B|:
Property 1 (Determinant of /): If A =/ then the ratio becomes |B|/|B| = 1.
Property 2. (Sign reversal): When two rows of A are exchanged, so are the same two
rows of AB. Therefore |AB| changes sign and so does the ratio |AB|/|B].
Property 3 (Linearity): When row | of A is multiplied by f, so is row 1 of AB. This
multiplies |AB| by ¢ and multiplies the ratio by t—as desired.
Now suppose row | of A is added to row | of A’ (the other rows staying the same
throughout).
Then row
1 of AB is added to row
1 of A’B.
By rule 3, the determinants
add. After dividing by |B], the ratios add.
Conclusion
‘This ratio |AB|/|B| has the same three properties that define |A|. Therefore it equals |A|. This proves the product rule |AB| = |A||B|.
10
The transpose A* has the same determinant as A.
Check:
|“
z =|"
“|
d
Did
since both sides equal
ad — be.
The equation |A!| = |A| becomes 0 = 0 when A is singular (we know that A! is also
singular). Otherwise A has the usual factorization PA = LU.
gives A'p!
= U'L’.
The proof of |A| = |A™| comes
Transposing both sides
by using rule 9 for products
and comparing:
det
Pdet A =detLdetU
and
det A! det P? = detU' det L!.
First, det L= det L™ (both have 1’s on the diagonal).
Second, det
U = det U? (trans-
posing leaves the main diagonal unchanged, and triangular determinants only involve
that diagonal). Third, det P = det P' (permutations have P’ = P—!, so |P||P™| =1
by rule 9; thus |P| and |P™| both equal 1 or both equal —1). Fourth and finally, the
comparison proves that det A equals det A!.
5.1.
The Properties of Determinants
213
Important comment
Rule 10 practically doubles our list of properties. Every rule for
the rows can apply also to the columns (just by transposing, since |A| = |A‘|). The
determinant changes sign when two columns are exchanged. A zero column or two
equal columns will make the determinant zero. If a column is multiplied by f, so is
the determinant. The determinant is a linear function of each column separately.
It is time to stop. The list of properties is long enough. Next we find and use
an explicit formula for the determinant.
=»
REVIEW OF THE KEY IDEAS
=
abe
The determinant is defined by linearity, sign reversal, and det J = 1.
2.
After elimination det A is + (product of the pivots).
3. | The determinant is zero exactly when A is not invertible.
4.
Two remarkable
properties are det AB = (det A)(det B) and det A! = det A.
Problem Set 5.1
Questions 1-12 are about the rules for determinants.
1
If a 4 by 4 matrix has det A = 2, find det(2A)
and det(—A)
and det(A”) and
det(A~!).
2
If a3 by 3 matrix has det A= —3, find det($A) and det(—A) and det(A*) and
det(A~').
3
True or false, with reason or counterexample:
4
(a)
The determinant of / +A
is | +det A.
(b)
The determinant of ABC
is |A||B||C\.
(c)
The determinant of A* is |A/*.
(d)
The determinant of 4A is 4A].
Which row exchanges show that these “reverse identity matrices” J3 and J4 have
[J3|= —1 but [J4| = +17
det
1
1 O|=-l
(One)
©
SO
but
det
©=
OO
or
214
Chapter 5 Determinants
For n = 5,6,7, count the row exchanges to permute the reverse identity J, to
the identity matrix /,. Propose a rule for every size n and predict whether Jo
is even or odd.
Show how Rule 6 (determinant= 0 if a row is all zero) comes from Rule 3.
Prove from the product rule |AB| = |A||B| that an orthogonal matrix
determinant | or —1. Also prove that |Q| = \o-!| =|aQ?|.
Find the determinants
Oe
Q has
of rotations and reflections:
cos@
—sin@
~ | sin@
cos@
ior
~
1—2cos?@
—2cosé@sin@
|-2cos@sin@d
1—2sin*6é
|"
Prove that |A‘| = |A| by transposing A = QR. (R is triangular and Q is orthogonal; note Problem 7.) Why does |R™| = |RI|?
10
If the entries in every row of A add to zero, prove that det A= 0. If every row
of A adds to one, prove that det(A—/) = 0. Does this guarantee that det A = 1?
11
Suppose that
CD = —DC and find the flaw in this reasoning: Taking determinants gives |C||D| = —|D||C|. Therefore |C| = 0 or |D| = 0. One or both of
the matrices must be singular. (That is not true.)
12
The inverse of a 2 by 2 matrix seems to have determinant = 1:
det A~! = det
I
E ae
ad—be|—¢€
a|
ad—bc
What is wrong with this calculation?
Questions 13-26 use the rules to compute specific determinants.
13
By applying row operations to produce an upper triangular U, compute
det
ps
i Om
2,
Oe
14
Ok
ay 0)
Ome
0s
UP
SS
and
det
2 —-1
—|
2
jing
0
O
0
0
—1
0
yah 2
-!I
2
Use row operations to show that the 3 by 3 “Vandermonde determinant” is
det]
1
b
b? |=(b—a)(c—a)(c—Db).
5.1.
15
The Properties of Determinants
215
Find the determinants of a rank one matrix and a skew-symmetric matrix:
A=
|e
A>
|
and
Pak
|
3
0
=3
eis
8 Op
eed
=4
0
16
A skew-symmetric matrix has
K' = —K. Insert a,b,c for 1,3,4 in Question
15 and show that |K| = 0. Write down a 4 by 4 example with |K| = 1.
17
Use row operations to simplify and compute these determinants:
101%
det | 102
103
18
BO
202
203
Find the determinants of U and U~!
be
@ 2
0 0
Use|
19
3}
§
6
Leste
det | Peni
tz ¢
and
42
and U?:
and
b
Zk
,
Suppose you do two row operations at once, going from
a b
cud
Find the second determinant.
20
261
302
303
Add row
ee
a—Lce
c=la
b—Ld
d—Ib|
Does it equal ad — bc?
1 of A to row 2, then subtract row 2 from row
1. Then add row | to
row 2 and multiply row 1 by —1 to reach B. Which rules show that
A=
ow:
© al
andar
ong
a
b
have
det
B = —detA?
Those rules could replace Rule 2 in the definition of the determinant.
21
From ad — bc, find the determinants of A and A>
es
Peel
5
and
A
nok l
all
at
|
and
cand Ay=uli
A
la
ut =| \
1
ae
Which two numbers A lead to det(A — AJ) =0? Write down the matrix A —A/
for each of those numbers A—it should not be invertible.
PD.
From A = laa find A? and A~! and A — AJ and their determinants.
two numbers A lead to |A —A/J| =0?
Which
216
Chapter 5 Determinants
23
Elimination reduces
A to U. Then
A = LU:
385
4
=
Ass 1) 6:
Pakb ce Rl =e.
0
haan
a
.0.
ee
ts
40)
8
I
34
ON
Oe
to
4
ee ei
ea!
aU
Find the determinants of L, U, A, U~!L~!, and U“!L!A.
24
If the i, 7 entry of A isi times j, show that det A = 0. (Exception when A = [1 ].)
25
If the i, j entry of A isi + j, show that det A = 0. (Exception when n = 1 or 2.)
26
Compute the determinants of these matrices by row operations:
Angin
Ca
8
2
g
SSS
3S)
‘er
sy,
SoS
Severs
S|
Sy
&
&
=>
SVS
27
28
True or false (give a reason if true or a 2 by 2 example if false):
(a)
If A is not invertible then AB is not invertible.
(b)
The determinant of A is the product of its pivots.
(c)
The determinant of A — B equals det A — det B.
(d)
AB
and BA have the same determinant.
(Calculus question) Show that the partial derivatives of
f(a,b,c,d) =In(ad —bc)
leads to
f(A) = In(det A) give Ari
Eas
ae a,
5.2.
PERMUTATIONS
Permutations and Cofactors
AND
217
COFACTORS ga 5.2
A computer finds the determinant from the pivots. This section explains two other ways
to do it, using permutations and using cofactors. We will find a new formula for the
determinant—after reviewing the pivot formula.
May I give the best example right away? It is my favorite 4 by 4 matrix:
on
ee
hte 9 lath 0)
lem
22a
lee 0)
Or =
2 1
OSs
(0
The determinant of this matrix is 5. We can find it in all three ways:
1.
The product of the pivots is 2 - ‘ : t : 2 Cancellation produces 5.
2.
The “big formula” in equation (7) has 4! = 24 terms. But only five terms are nonzero:
det
A = 16—4-—-4-—-4412=5.
The 16 comes from 2-2.-2-2 on the diagonal. Where does +1 come
When you can find those five terms, you have understood formula (7).
3.
from?
The numbers 2, —1, 0, 0 in the first row are multiplied by their cofactors 4, 3, 2, |
from the other rows.
That gives
2-4—1-3=5.
Those cofactors are 3 by 3 determinants. They use the rows and columns that are
not used by the entry in the first row. Every term in a determinant uses each
row and column once!
The Pivot Formula
Elimination leaves the pivots d,, ..., d, on the diagonal of the upper triangular U. If
no row exchanges are involved, multiply those pivots to find the determinant!
det A = (det
(det U) = U)(dja>---d,).
(1)
This is our first formula for det A. When A changes to U, the determinant stays the
same. L has 1|’s on the diagonal, which produces detL = 1. Formula (1) already
appeared in the previous section, with the further possibility of row exchanges. The
permutation matrix in PA = LU has determinant —1 or +1. This factor +1 enters
the determinant of A:
(det P)(detA) = (det L)(detU)
When
gives
detA = +(d,d2---d,).
A has fewer than n pivots, det A= 0 by Rule 8. The matrix is singular.
(2)
218
Chapter 5
Determinants
Example 1 A row exchange produces pivots 4, 2, 1 and that important minus sign:
OW ORL
Ar | Um a eS
4 5 6
PA
4 5
OT 2L
Uegirqig
6
eS
|
det A= —(4)(2)(1) = —8.
The odd number of row exchanges (namely one exchange) means that det P = —1.
The next example has no row exchanges. It is one of the first matrices we factored in Section 2.6 (when it was 3 by 3). What is remarkable is that we can go
directly to the n by n determinant. Large determinants become clear when we know
the pivots.
Example 2 (The —1, 2, —1 tridiagonal matrix)
— The first three pivots are 2 and :
n+l,
and ;. The next pivots are 2 and isand eventually ——:
|
—]
2
—]
-1
2
.
factors into
=I
2
—|]
i
1
2
oe
-1
3
ie
—1
Sel
+1
of
The pivots are on the diagonal of U (the last matrix). When 2 and 3 and : and 3 are
multiplied, the fractions cancel. The determinant of the 4 by 4 matrix is 5. Jn general
the determinant is n + 1:
n+1
det A= (2)(5)(G)--(72°)
=n 41.
3
The 3 by 3 determinant is 4. Important point: Those first pivots depend only on the
upper left corner of the original matrix. This is a rule for all matrices without row
exchanges. We explain it while the example is fresh:
The first k pivots come from the k by k matrix Ax in the top left corner of A.
Their product d\dz---d, is the determinant of Ax.
The 1 by 1 matrix A, contains the very first pivot d,. This is det Ay. The 2 by 2
matrix in the corner has det Az = d;d2. Eventually the n by n determinant uses all n
pivots to give det A, which is det A.
Elimination deals with the corner matrix Ax while starting on the whole matrix.
We assume no row exchanges—then A = LU and Ax = LxU x. Dividing one determinant by the previous determinant
(det Ay divided by det Ay_1) cancels everything but
5.2
Permutations
and Cofactors
219
the most recent pivot d,. This gives a formula for the kth pivot. The pivots are ratios
of determinants:
did...d
The kth pivot is d, = ee
did)... dp)
det
A
det Ax_
d
In the —1, 2, —1 example this ratio of determinants correctly gives the pivot (k+1)/k.
We don’t need row exchanges when all these corner matrices have det Ay # 0.
The Big Formula for Determinants
Pivots are good for computing. They concentrate a lot of information—enough to find the
determinant. But it is impossible to see what happened to the original a;;. That part will
be clearer if we go back to rules 1-2-3, linearity and sign reversal and det J = 1. We want
to derive a single explicit formula for the determinant, directly from the entries aj;;.
The formula has n\ terms. Its size grows fast because n! = 1, 2, 6,24, 120,....
For n = 11 there are forty million terms. For n = 2, the two terms are ad and bc.
Half the terms have minus signs (as in —bc).
The other half have plus signs (as in
ad). For n = 3 there are 3! = (3)(2)(1) terms.
Here are those six terms:
41
ts
:
431
Notice the pattern.
a2
aj
is i. +44 1422433 + 412473431 + 413421432
fe
—@1 1423432 — 412421433 — 13022431.
032 a33
Each product like aj1a23a32
has one entry from each column.
has one entry from each row.
(3)
It also
The column order 1, 3, 2 means that this particular
term comes with a minus sign. The column order 3, 1, 2 in a)3a21a32 has a plus sign.
It will be “permutations” that tell us the sign.
The next step (n = 4) brings 4! = 24 terms.
entry from each row and column.
There are 24 ways to choose one
The diagonal a@11a@22a33a44 with column order 1234
always has a plus sign.
To derive the big formula I start with n = 2. The goal is to reach ad — be in a
systematic way. Break each row into two simpler rows:
ahi [as0) 40.5]
Sand
le dil
Oelo og),
Now apply linearity, first in row 1 (with row 2 fixed) and then in row 2 (with row | fixed):
Gp bl
ol]
«ta0|
110.8
AR hel
PINs
Slee
S66 Be lok
ytd) eeOl Oe ale
(4)
220
Chapter 5 Determinants
The last line has 27 = 4 determinants. The first and fourth are zero because their
rows are dependent—one row is a multiple of the other row. We are left with 2! = 2
determinants to compute:
Ov id
b
i fa)
[=ad ~ be.
|
=ad | |
+ be
LO
Om 1
0
a0
0
|
fe
Oar:
c
I won’t labor the point—you see it. The rules lead to permutation matrices, with 1’s
and 0’s that give a plus or minus sign. The 1’s are multiplied by numbers that come
from A. The permutation tells the column sequence, in this case (1, 2) or (2, 1).
Now try n = 3. Each row splits into 3 simpler rows like [a11 0 0]. Using linearity in each row, det A splits into 3³ = 27 simple determinants. If a column choice is repeated—for example if we also choose [a21 0 0]—then the simple determinant is zero. We pay attention only when the nonzero terms come from different columns. That leaves 3! = 6 determinants:
|a11 a12 a13; a21 a22 a23; a31 a32 a33| = |a11 · ·; · a22 ·; · · a33| + |a11 · ·; · · a23; · a32 ·| + |· a12 ·; a21 · ·; · · a33|
   + |· a12 ·; · · a23; a31 · ·| + |· · a13; a21 · ·; · a32 ·| + |· · a13; · a22 ·; a31 · ·|

(the dots stand for the zero entries).
One entry is chosen from the first row. Call it a_{1α}, where the column number α can be 1, 2, or 3. The number from the second row is a_{2β}, and there are two choices for column β. (Column α is not available—it is already used.) That leaves only one column to choose in the last row.

There are 3! = 6 ways to order the columns. The six permutations of (1, 2, 3) include the identity permutation (1, 2, 3) from P = I:

(α, β, ω) = (1, 2, 3), (2, 3, 1), (3, 1, 2), (1, 3, 2), (2, 1, 3), (3, 2, 1).   (5)
The last three are odd (one exchange). The first three are even (0 or 2 exchanges). When the column sequence is (α, β, ω), we have chosen the entries a_{1α}a_{2β}a_{3ω}—and the column sequence comes with a plus or minus sign. The determinant is now split into six simple terms. Factor out the a_ij:
det A = a11a22a33 |1 · ·; · 1 ·; · · 1| + a12a23a31 |· 1 ·; · · 1; 1 · ·| + a13a21a32 |· · 1; 1 · ·; · 1 ·|
      + a11a23a32 |1 · ·; · · 1; · 1 ·| + a12a21a33 |· 1 ·; 1 · ·; · · 1| + a13a22a31 |· · 1; · 1 ·; 1 · ·|.   (6)
The first three (even) permutations have det P = +1, the last three (odd) permutations
have det P = —1. We have proved the 3 by 3 formula in a systematic way.
In the same way you can see the n by n formula.
There are n! orderings of the column numbers. The numbers (1, 2, ..., n) go in each possible order (α, β, ..., ω). With the columns in that order, the determinant is the product a_{1α}a_{2β} ··· a_{nω} times +1 or -1. The sign depends on the parity (even or odd) of the column ordering. The determinant of the whole matrix is the sum of n! simple determinants (times +1 or -1)—which is our big formula:
Formula for the determinant   The sum is over all n! permutations P = (α, β, ..., ω):

det A = Σ (det P) a_{1α}a_{2β} ··· a_{nω}.   (7)

The 2 by 2 case is +a11a22 - a12a21 (which is ad - bc); P is (1, 2) or (2, 1). The 3 by 3 case has three products "down to the right" and three products "down to the left." Warning: Many people believe they should follow this pattern in the 4 by 4 case. They only take 8 products—it should be 24.
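As a numerical check of formula (7), here is a minimal MATLAB sketch (an illustration, not one of the book's Teaching Codes). It adds up all n! signed products for a small matrix and compares the total with the built-in det:

A = [2 -1 0 0; -1 2 -1 0; 0 -1 2 -1; 0 0 -1 2];   % any small matrix will do
n = size(A,1);  orderings = perms(1:n);            % all n! column orderings
total = 0;
for k = 1:size(orderings,1)
    p = orderings(k,:);
    P = eye(n);  P = P(p,:);                       % permutation matrix, det P = +1 or -1
    term = 1;
    for i = 1:n
        term = term * A(i, p(i));                  % one entry from each row and column
    end
    total = total + det(P) * term;
end
disp([total det(A)])                               % the two numbers agree (here both are 5)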
Example 3 (Determinant of U) When U is upper triangular, only one of the n! products can be nonzero. This one term comes from the diagonal:

det U = +u11u22 ··· unn.

All other column orderings pick an entry from below the diagonal, where U has zeros. As soon as we pick a number like u21 = 0 from below the diagonal, that term in equation (7) is sure to be zero.
Of course det I = 1. The only nonzero term is +(1)(1) ··· (1) from the diagonal.
Example 4 Suppose Z is the identity matrix except for column 3. Then

determinant of Z = |1 0 a 0; 0 1 b 0; 0 0 c 0; 0 0 d 1| = c.   (8)

The term (1)(1)(c)(1) comes from the main diagonal with a plus sign. There are 23
other products (one factor from each row and column) but they are all zero. Reason:
If we pick a, b, or d from column 3, that column is used up. Then the only available
choice from row 3 is zero.
Here is a different reason for the same answer. If c = 0, then Z has a row of
zeros and det Z = c = 0 is correct. If c is not zero, use elimination. Subtract multiples
of row 3 from the other rows, to knock out a,b, d. That leaves a diagonal matrix and
det Z = c;
This example will soon be used for “Cramer’s Rule.” If we move a, b, c, d into
the first column of Z, the determinant is det Z = a. (Why?) Changing one column of
I leaves Z with an easy determinant, coming from its main diagonal only.
Example 5 Suppose A has 1's just above and below the main diagonal. Here n = 4:

A4 = [0 1 0 0; 1 0 1 0; 0 1 0 1; 0 0 1 0]   and   P4 = [0 1 0 0; 1 0 0 0; 0 0 0 1; 0 0 1 0].

The only nonzero choice in the first row is column 2. The only nonzero choice in row 4 is column 3. Then rows 2 and 3 must choose columns 1 and 4. In other words P4 is the only permutation that picks out nonzeros in A4. The determinant of P4 is +1 (two exchanges to reach 2, 1, 4, 3). Therefore det A4 = +1.
Determinant by Cofactors
Formula (7) is a direct definition of the determinant. It gives you everything at once—but you have to digest it. Somehow this sum of n! terms must satisfy rules 1-2-3 (then all the other properties follow). The easiest is det I = 1, already checked. The rule of linearity becomes clear, if you separate out the factor a11 or a12 or a13 that comes from the first row. With n = 3 we separate the determinant into

det A = a11(a22a33 - a23a32) + a12(a23a31 - a21a33) + a13(a21a32 - a22a31).   (9)
Those three quantities in parentheses are called "cofactors." They are 2 by 2 determinants, coming from matrices in rows 2 and 3. The first row contributes the factors a11, a12, a13, and the lower rows contribute the cofactors C11, C12, C13. Certainly the determinant a11C11 + a12C12 + a13C13 depends linearly on a11, a12, a13—this is rule 3.
The cofactor of a11 is C11 = a22a33 - a23a32. You can see it in this splitting:

|a11 a12 a13; a21 a22 a23; a31 a32 a33| = |a11 · ·; · a22 a23; · a32 a33| + |· a12 ·; a21 · a23; a31 · a33| + |· · a13; a21 a22 ·; a31 a32 ·|.
We are still choosing one entry from each row and column. Since aj, uses up row 1
and column 1, that leaves a 2 by 2 determinant as the cofactor.
As always, we have to watch signs. The determinant that goes with a12 looks like a21a33 - a23a31. But in the cofactor C12, its sign is reversed. The sign pattern for cofactors along the first row is plus-minus-plus-minus. The sign is (-1)^{1+j} for the cofactor C1j that goes with a1j. You cross out row 1 and column j to get a submatrix M1j. Then take its determinant, and multiply by (-1)^{1+j} to get the cofactor:
The cofactor is C1j = (-1)^{1+j} det M1j. Then

det A = a11C11 + a12C12 + ··· + a1nC1n.   (10)
This is the "cofactor expansion along row 1." In the big formula (7), the terms that multiply a11 combine to give det M11. The sign is (-1)^{1+1}, meaning plus. Equation (10) is another form of equation (7), with the factors from row 1 multiplying their cofactors from rows 2, 3, ..., n.
Note Whatever is possible for row 1 is possible for row i. The entries a_ij in that row also have cofactors C_ij. Those are determinants of order n - 1, multiplied by (-1)^{i+j}. Since a_ij accounts for row i and column j—and each row and column are to be used once—the matrix M_ij throws out row i and column j. The display shows a43 and its cofactor M43 (with row 4 and column 3 crossed out). The sign (-1)^{4+3} multiplies the determinant of M43 to give C43. The sign matrix shows the ± pattern:

(-1)^{i+j} = [+ - + -; - + - +; + - + -; - + - +].

5A (COFACTOR FORMULA) The determinant of A is the dot product of any row i of A with its cofactors:

det A = ai1Ci1 + ai2Ci2 + ··· + ainCin.   (11)

Each cofactor C_ij (order n - 1, without row i and column j) includes its correct sign:

C_ij = (-1)^{i+j} det M_ij.
A determinant of order n is a combination of determinants of order n— 1. A recursive
person would keep going. Each subdeterminant breaks into determinants of order n—2.
We could define all determinants via equation (11). This rule goes from order n to n—1
to n—2 and eventually to order 1. Define the 1 by 1 determinant |a| to be the number
a. Then the cofactor method is complete.
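As a sketch of that recursive idea (the function name cofactordet is made up here, and for real computations elimination is much faster), equation (11) becomes a few lines of MATLAB:

function d = cofactordet(A)
% Recursive cofactor expansion along row 1, following equations (10)-(11).
n = size(A,1);
if n == 1
    d = A(1,1);                                    % the 1 by 1 determinant |a| is the number a
    return
end
d = 0;
for j = 1:n
    M = A(2:n, [1:j-1, j+1:n]);                    % throw out row 1 and column j
    d = d + (-1)^(1+j) * A(1,j) * cofactordet(M);  % a_1j times its cofactor C_1j
end
end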
We preferred to construct det A from its properties (linearity, sign reversal, and det I = 1). The explicit formula (7) and the cofactor formulas (9)-(11) follow from those properties. It is the properties that allow us to change A into an easier matrix like U, when computing the determinant.
One last formula comes from the rule that det A = det A^T. We can expand in cofactors, down a column instead of across a row. Down column j the entries are a1j to anj. The cofactors are C1j to Cnj. The determinant is the dot product:

Cofactor expansion down column j:   det A = a1jC1j + a2jC2j + ··· + anjCnj.   (12)
Cofactors are most useful when the matrices have many zeros—as in the next examples.
Example 6 The -1, 2, -1 matrix has only two nonzeros in its first row. So only two cofactors are involved in the determinant:

det [2 -1 0 0; -1 2 -1 0; 0 -1 2 -1; 0 0 -1 2] = 2 det [2 -1 0; -1 2 -1; 0 -1 2] + (-1)(-1)^{1+2} det [-1 -1 0; 0 2 -1; 0 -1 2].   (13)

You see 2 times C11 on the right. This cofactor has exactly the same -1, 2, -1 pattern as the original A—but one size smaller. The other cofactor C12 came from crossing out row 1 and column 2. It is multiplied by a12 = -1 and by the sign (-1)^{1+2} which is also -1.

To compute C12, use cofactors down its first column. The only nonzero is at the top. That contributes another -1 (so we are back to minus). Its cofactor is the -1, 2, -1 matrix which is two sizes smaller than the original A.
Summary Let D_n stand for the determinant of the -1, 2, -1 matrix of order n. Equation (13) gives the 4 by 4 determinant D4 from a 3 by 3 cofactor D3 and a 2 by 2 cofactor D2:

D4 = 2D3 - D2   and generally   D_n = 2D_{n-1} - D_{n-2}.   (14)

Direct calculation gives D2 = 3 and D3 = 4. Therefore D4 = 2(4) - 3 = 5. Then D5 = 10 - 4 = 6. These determinants 3, 4, 5, 6 fit the formula D_n = n + 1. That answer also came from the product of pivots in Example 2.

The idea behind cofactors is to reduce the order one step at a time. In this example the determinants D_n = n + 1 obey the recursion formula n + 1 = 2n - (n - 1). As they must.
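A quick MATLAB check of D_n = n + 1 (a sketch; toeplitz builds the -1, 2, -1 matrix of order n from its first row):

for n = 2:6
    A = toeplitz([2 -1 zeros(1, n-2)]);    % the -1, 2, -1 matrix of order n
    fprintf('n = %d:  det = %g,  n + 1 = %d\n', n, det(A), n+1)
end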
Example 7 This is the same matrix, except the first entry (upper left) is now 1:

B4 = [1 -1 0 0; -1 2 -1 0; 0 -1 2 -1; 0 0 -1 2].

All pivots of this matrix turn out to be 1. So its determinant is 1. How does that come from cofactors? Expanding on row 1, the cofactors all agree with Example 6. Just change a11 = 2 to b11 = 1:

det B4 = D3 - D2   instead of   det A4 = 2D3 - D2.

The determinant of B4 is 4 - 3 = 1. The determinant of every B_n is n - (n - 1) = 1. Problem 13 asks you to use cofactors of the last row. You still find det B_n = 1.
REVIEW OF THE KEY IDEAS
1. With no row exchanges, det A = (product of the pivots). In the upper left corner, det A_k = (product of the first k pivots).

2. Every term in the big formula (7) uses each row and column once. Half of the n! terms have a plus sign (when det P = +1).

3. The cofactor C_ij is (-1)^{i+j} times the smaller determinant that omits row i and column j (because a_ij uses them).

4. The determinant is the dot product of any row of A with its row of cofactors.
Problem Set 5.2
Problems 1-10 use the formula with n! terms:   |A| = Σ ± a_{1α}a_{2β} ··· a_{nω}.
1
Compute the determinants of A and B from six terms.
pendent?
Je
Compute the determinants of A and B. Are their columns independent?
jh Bry
Ay == lheLaie ian Mal
Mae
3
Bede
Le aaPiinge 3
Se ab
Cleat
ial,
Show that det A = 0, regardless of the five nonzeros marked by x’s:
x)
As || OQ
OQ”
4
and
Are their columns inde-
Myo
OW x
Olas
|,
(What is the rank of A?)
This problem shows in two ways that det A = 0 (the x’s are any numbers):
>
|
Oe
oe
Se:
Se
ee
Ss
&SoCeo
oy
iS#
&oom
(a)
How do you know that the rows are linearly dependent?
(b)
Explain why every term is zero in formula (7) for det A.
Find two ways to choose nonzeros from four different rows and columns:
Age
be Oes0
at
bh atlFl
ice
10
B=
Ol
Oe
Ora"
4.
5 40
2
a2
5
3
(B has the same zeros as A).
Olin Ou
Is det A equal to 1+ 1 or 1 —1 or —1 — 1? What is det B?
Place the smallest number of zeros in a 4 by 4 matrix that will guarantee det A = 0.
Place as many zeros as possible while still allowing det A 4 0.
(a)
If ayy = a22 = a33 = 0, how many of the six terms in detA will be zero?
(b)
If ay) = a22 = a33 = a44 = 0, how many of the 24 products aj ja2,43)d4m
are sure to be zero?
How many 5 by 5 permutation matrices have det P = +1? Those are even permutations. Find one that needs four exchanges to reach the identity.
10
If det A is not zero, at least one of the terms in (7) is not zero.
Deduce
some ordering of the rows of A leaves no zeros on the diagonal.
from elimination; that PA can have zeros on the diagonal.)
(Don’t use P
(a)
(b)
that
How many permutations of (1, 2, 3,4) are even and what are they?
An odd permutation matrix times an odd permutation matrix is an
permutation matrix.
Problems 11-20 use cofactors C_ij = (-1)^{i+j} det M_ij. Remove row i and column j.
11
Find all cofactors and put them into a cofactor matrix C. Find det B by cofactors:
A=
12
is fl
B=
} 23
4-3, 0
7 0 30
Find the cofactor matrix C and multiply A times CT’. What is A~!?
221,
A
a
0
13
50
ies ok
=—1"
2
The matrix B4 is the -1, 2, -1 matrix A4 except that b11 = 1 instead of a11 = 2. Using cofactors of the last row of B4, show that |B4| = 2|B3| - |B2| and find |B4|:

B4 = [1 -1 0 0; -1 2 -1 0; 0 -1 2 -1; 0 0 -1 2]   and   B3 = [1 -1 0; -1 2 -1; 0 -1 2].
The recursion |B_n| = 2|B_{n-1}| - |B_{n-2}| is satisfied when every |B_n| = 1. This recursion is the same as for the A's. The difference is in the starting values 1, 1, 1 for n = 1, 2, 3.
14
The n by n determinant C, has 1’s above and below the main diagonal:
C; = |0|
15
ce
0
1
0
ooro
=
CO
or
1 oc
oro
GS
(a)
What are these determinants
By cofactors find the relation between C, and C,_; and C,—2.
C), C2, C3, C4?
Find Cjo.
The matrices in Problem 14 have 1’s just above and below the main diagonal.
Going down the matrix, which order of columns gives all 1’s? Decide even or
odd to prove:
The tridiagonal
Cre
I=
4,835 5)
Cr=so
1,1, 1 matrix of order n has determinant
(oe
— AN
17
Ont 50
0 | c=|to 1
Omi
(b)
Cn = 0 (odd n)
16
Ope
ape
role
ie
=
2,6,25):
E,,:
alht lea eat
jy sil
Gis
ail
i
Nib
i
a
1a =
@ i
(ee
(a)
By cofactors show that E, = E,-1 — En-2.
(b)
Starting from £; = 1 and Ey = 0 find £3, E4,..., Eg.
(c)
By noticing how these numbers
ih
al
ee
:
eventually repeat, find Ej 9.
F,, is the determinant of the 1, 1, —1 tridiagonal matrix of order n:
1
fe
oe
1
1
—
es
aoe
—a il
0
i
1
Expand in cofactors to show that F, =
Fibonacci numbers 1,2,3,5,8,13,....
—
ee
Fr—1 + Fn—2.
The sequence
-1
ei)
1
1
1
eas
-1l
1
These determinants are
usually starts 1,1, 2,3
(with two 1’s) so our F, is the usual Fy,+1.
18
Go back to B, in Problem 13. It is the same as A, except for bj; = 1. So use
linearity in the first row, where [1 —1 0] equals [2 —1 0] minus [1 0 0].
Here is |B,|:
=|
1
-l
0
2
=I
An-1
0
—-1
0
_|-1
ree
An-1
0
0
An-1
0
Linearity gives"|B,| = |A,| = |An-1| =
19
Explain why the 4 by 4 Vandermonde determinant contains i Dut not x
lec
sped
ta.
cor x":
ae
ade Dee
eA
ioe
oo * eA
fl) ~ Xie
SA
Bea
The determinant is zero at x =
:
, and
_ The cofactor of x?
is V3 = (b—a)(c—a)(c—b). Then V4 = (b—a)(c—a)(c—b)
(x —a)(x —b)(x—c).
20
Produce zeros by row operations and find the determinant of
O01
ive)
RG
ie
Ghee)
te SL
Ded
Ort
1 0
Find also det G2 and det G3, with zeros on the diagonal and ones elsewhere.
you predict det G,?
Problems
21
21-24 are about block matrices and block determinants.
With 2 by 2 blocks in 4 by 4 matrices, you cannot always use block determinants:
A
0
(a)
22
23
Can
B
D
=|A||D|
yt) gual!03
but
é D # |A||D| — |C||Bl.
Why is the first statement true? Somehow
B doesn’t enter.
(b)
Show by example that equality fails (as shown) when C enters.
(c)
Show by example that the answer det(AD — CB) is also wrong.
With block multiplication,
A= LU
has Ax = L;U,
in the top left corner:
Ay
*|
|Le
O||
Ux
*
x
#*{|
|
x
iar
exh
*
(a)
If A has pivots 2,3,—1 what are the determinants
U2, Uz and Aj, Az, A3?
of L;, L2, L3 and Uj,
(b)
If Ay, Az, A3 have determinants 2,3, —1 find the three pivots.
Block elimination subtracts
CA~! times the first row [A B] from the second
row. This leaves the Schur complement D — CA~'B in the corner:
fi
= CA
1, [cA
le A
B
Pl SC mealla), a=
eA ee
Take determinants of these block matrices to prove correct rules for square blocks:
A B
=
é aa |A|[D—CA’B|-=
if A~! exists
|AD=CB|
if AC =CA.
24
“(2 1-[2 IL
If A is m by n and B is n by m, block multiplication gives det M = det AB:
If A is a single row and B is a single column what is det M? If A is a column
and B is a row what is det M? Do a 3 by 3 example of each.
25
(A calculus question based on the cofactor expansion)
(a)
Find the derivative of det A with respect to aj;. The other entries are fixed—we
are only changing aj; in equation (9).
(b)
Find the derivative of In(det A) with respect to a,;. By the chain rule this is
1/ det A times the answer to (a), which gives the (1,1) entry inthe
matrix.
26
A 3 by 3 determinant has three products “down to the right” and three “down
to the left” with minus signs. Compute the six terms to find D. Then explain
without determinants why this matrix is or is not invertible:
27
For E4 in Problem
nonzero.
16, five of the 4! =
24 terms in the big formula (7) are
Find those five terms to show that E4 = —1.
28
For the 4 by 4 matrix at the beginning of this section, find the five terms in the
big formula that give det A= 16 -4—4—4+1.
29
Compute the determinants
|5j|, |S2|, |S3| of these 1, 3, 1 tridiagonal matrices:
Sie (3/5
oe
=|; a
Sys
a
©
ieh se I
0.1 3
Make a Fibonacci guess for |S4| and verify that you are right.
30
Cofactors of the 1,3, 1 matrices give a recursion similar to the —1, 2, —1 exam-
ple, but with 3 instead of 2:
|Sin| =
3|Sn-1|
-
|Sn—2l-
Challenge: Show that the Fibonacci number |Sp| = F2n+42 satisfies this recursion.
Keep using the Fibonacci rule Fy = Fy—1 + Fr-2.
31
Change 3 to 2 in the upper left corner of the matrix S,. Show that the determinant drops from the Fibonacci number F2,+2 to F241.
CRAMER'S RULE, INVERSES, AND VOLUMES ■ 5.3
This section is about the applications of determinants, first to Ax = b and then to A⁻¹. In the entries of A⁻¹, you will see det A in every denominator—we divide by it. (If it is zero then A⁻¹ doesn't exist.) Each number in A⁻¹ is a determinant divided by another determinant. So is every component of x = A⁻¹b.
Start with Cramer's Rule to find x. A neat idea gives the solution immediately. Put x in the first column of I. When you multiply by A, the first column becomes Ax which is b:

A [x1 0 0; x2 1 0; x3 0 1] = [b a2 a3] = B1.   (1)

We multiplied a column at a time. The first column of B1 is Ax, the other columns are just copied from A. Now take determinants. The product rule is:

(det A)(x1) = det B1   or   x1 = det B1 / det A.   (2)
This is the first component of x in Cramer's Rule! Changing a column of A gives B1. To find x2, put the vector x into the second column of the identity matrix:

A [1 x1 0; 0 x2 0; 0 x3 1] = [a1 b a3] = B2.   (3)

Take determinants to find (det A)(x2) = det B2. This gives x2 in Cramer's Rule:
5B (CRAMER'S RULE) If det A is not zero, then Ax = b has the unique solution

x1 = det B1 / det A,   x2 = det B2 / det A,   ...,   xn = det Bn / det A.

The matrix Bj is obtained by replacing the jth column of A by the vector b.
The computer program for Cramer's Rule only needs one formula line:

x(j) = det([A(:, 1:j-1)  b  A(:, j+1:n)]) / det(A)
To solve an n by n system, Cramer’s Rule evaluates n+1 determinants (of A and
the n different B’s). When each one is the sum of n! terms—applying the “big formula”
with all permutations—this makes (n + 1)! different terms. It would be crazy to solve
equations that way. But we do finally have an explicit formula for the solution x.
Example 1 Use Cramer's Rule (it needs four determinants) to solve

x1 + x2 + x3 = 1
-2x1 + x2 = 0
-4x1 + x3 = 0.

The first determinant is |A|. It should not be zero. Then the right side (1, 0, 0) goes into columns 1, 2, 3 to produce the matrices B1, B2, B3:

|A| = |1 1 1; -2 1 0; -4 0 1| = 7   and   |B1| = |1 1 1; 0 1 0; 0 0 1| = 1
|B2| = |1 1 1; -2 0 0; -4 0 1| = 2   and   |B3| = |1 1 1; -2 1 0; -4 0 0| = 4.

Cramer's Rule takes ratios to find the components of x. Always divide by |A|:

x1 = |B1|/|A| = 1/7   and   x2 = |B2|/|A| = 2/7   and   x3 = |B3|/|A| = 4/7.
I always substitute the x’s back into the equations, to check the calculations.
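That check, and the formula line quoted above, fit in a short MATLAB sketch for the system of Example 1:

A = [1 1 1; -2 1 0; -4 0 1];   b = [1; 0; 0];   n = 3;
x = zeros(n,1);
for j = 1:n
    x(j) = det([A(:, 1:j-1)  b  A(:, j+1:n)]) / det(A);   % replace column j of A by b
end
disp(x')            % 1/7  2/7  4/7
disp((A*x - b)')    % substituting back gives zeros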
A Formula for A7!
In Example 1, the right side was b = (1, 0, 0). This is the first column of I. The solution x must be the first column of the inverse. Then the first column of AA⁻¹ = I is correct. Another important point: When b = (1, 0, 0) goes in a column of A to produce one of the B's, the determinant is just 1 times a cofactor. The numbers 1, 2, 4 are cofactors (one size smaller). Look back to see how each B is a cofactor:
|B1| = 1 is the cofactor C11 = |1 0; 0 1|
|B2| = 2 is the cofactor C12 = -|-2 0; -4 1|
|B3| = 4 is the cofactor C13 = |-2 1; -4 0|.

Main point: The numerators in A⁻¹ are cofactors. They are divided by det A.
For the second column of A⁻¹, change b to (0, 1, 0). Watch how the determinants of B1, B2, B3 are cofactors along row 2—including the signs (-)(+)(-):

|0 1 1; 1 1 0; 0 0 1| = -1   and   |1 0 1; -2 1 0; -4 0 1| = 5   and   |1 1 0; -2 1 1; -4 0 0| = -4.

Divide -1, 5, -4 by |A| = 7 to get the second column of A⁻¹.
For the third column of A⁻¹, the right side is b = (0, 0, 1). The three determinants of the B's become cofactors along the third row. The numbers are -1, -2, 3. We always divide by |A| = 7. Now we have all columns of A⁻¹:

A times A⁻¹ is   [1 1 1; -2 1 0; -4 0 1] (1/7)[1 -1 -1; 2 5 -2; 4 -4 3] = I.

Summary
To find A⁻¹ you can solve AA⁻¹ = I. The columns of I lead to the columns of A⁻¹. After stating a short formula for A⁻¹, we will give a separate direct proof. Then you have two ways to approach A⁻¹—by putting columns of I into Cramer's Rule (above), or by cofactors (the lightning way using equation (7) below).
5C (FORMULA FOR A⁻¹) The i, j entry of A⁻¹ is the cofactor Cji divided by the determinant of A:

(A⁻¹)ij = Cji / det A   and   A⁻¹ = C^T / det A.   (4)

The cofactors Cij go into the "cofactor matrix" C. Then this matrix is transposed. To compute the i, j entry of A⁻¹, cross out row j and column i of A. Multiply the determinant by (-1)^{i+j} to get the cofactor, and divide by det A.
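Formula (4) is easy to test numerically. Here is a minimal MATLAB sketch (illustration only; elimination is the practical way to invert) that builds the cofactor matrix C for the 3 by 3 matrix of Example 1 and divides its transpose by det A:

A = [1 1 1; -2 1 0; -4 0 1];
n = size(A,1);   C = zeros(n);
for i = 1:n
    for j = 1:n
        M = A([1:i-1, i+1:n], [1:j-1, j+1:n]);    % cross out row i and column j
        C(i,j) = (-1)^(i+j) * det(M);             % the cofactor carries its sign
    end
end
Ainv = C' / det(A);                               % transpose C, then divide by det A
disp(A * Ainv)                                    % prints the identity matrix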
Example 2 The matrix A = [a b; c d] has cofactor matrix C = [d -c; -b a]. Look at A times the transpose of C:

A C^T = [a b; c d][d -b; -c a] = [ad - bc, 0; 0, ad - bc].   (5)

The matrix on the right is det A times I. So divide by det A. Then A times C^T/det A is I, which reveals A⁻¹:

A⁻¹ = C^T / det A   which is   1/(ad - bc) [d -b; -c a].   (6)
This 2 by 2 example uses letters. The 3 by 3 example used numbers. Inverting a 4 by 4 matrix would need sixteen cofactors (each one is a 3 by 3 determinant). Elimination is faster—but now we know an explicit formula for A⁻¹.
Direct proof of the formula A⁻¹ = C^T/ det A   The idea is to multiply A times C^T:

[a11 a12 a13; a21 a22 a23; a31 a32 a33][C11 C21 C31; C12 C22 C32; C13 C23 C33] = [det A 0 0; 0 det A 0; 0 0 det A].   (7)
Row 1 of A times column 1 of the cofactors yields det A on the right:
By cofactors of row 1:   a11C11 + a12C12 + a13C13 = det A.
Similarly row 2 of A times column 2 of C^T yields det A. The entries a2j are multiplying cofactors C2j as they should.

Reason for zeros off the main diagonal in equation (7). Rows of A are combined with cofactors from different rows. Row 2 of A times column 1 of C^T gives zero, but why?

a21C11 + a22C12 + a23C13 = 0.   (8)
Answer: This is the determinant of a matrix A* with two equal rows. A* is the same as A, except that its first row is a copy of its second row. So det A* = 0, which is equation (8). It is the expansion of det A* along row 1, where A* has the same cofactors C11, C12, C13 as A—because all rows agree after the first row. Thus the remarkable matrix multiplication (7) is correct. It has det A times I on the right side:

A C^T = (det A) I   so   A⁻¹ = C^T / det A.
Example 3 A triangular matrix of 1's has determinant 1. The inverse matrix contains cofactors:

A4 = [1 0 0 0; 1 1 0 0; 1 1 1 0; 1 1 1 1]   has inverse   A4⁻¹ = C^T/det A = [1 0 0 0; -1 1 0 0; 0 -1 1 0; 0 0 -1 1].

Cross out row 1 and column 1 of A to see the cofactor C11 = 1. Now cross out row 1 and column 2. The 3 by 3 submatrix is still triangular with determinant 1. But the cofactor C12 is -1 because of the sign (-1)^{1+2}. This number -1 goes into the (2, 1) entry of A⁻¹—don't forget to transpose C!

The inverse of a triangular matrix is triangular. Cofactors give a reason why.
Example 4 If all cofactors are nonzero, is A sure to be invertible? No way.
Example 5 Here is part of a direct computation of A”! by cofactors:
Gy
A=|1
al
0.1
pe
yh
PAN
and
5)
Cj.
= —(—2)
Cro
= —6
1|*
and
AW=%
X
ee
oa
Phe
x
Sar
Bs
X
Figure 5.1 General triangle with corners (x1, y1), (x2, y2), (x3, y3); special triangle from (0, 0); general from three specials.
Area of a Triangle
Everybody knows the area of a rectangle—base times height. The area of a triangle is
half the base times the height. But here is a question that those formulas don’t answer.
If we know the corners (x1, y1) and (x2, y2) and (x3, y3) of a triangle, what is the area?
Using the corners to find the base and height is not a good way. Determinants
are much better. There are square roots in the base and height, but they cancel out
in the good formula. The area of a triangle is half of a 3 by 3 determinant. If one
corner is at the origin, say (x3, y3) = (0,0), the determinant is only 2 by 2.
5D (Area of a triangle) The triangle with corners (x1, y1) and (x2, y2) and (x3, y3) has area = ½ (determinant):

area = ½ |x1 y1 1; x2 y2 1; x3 y3 1|   and   area = ½ |x1 y1; x2 y2|   when (x3, y3) = (0, 0).
When you set x3 = y3 = 0 in the 3 by 3 determinant, you get the 2 by 2 determinant.
These formulas have no square roots—they are reasonable to memorize. The 3 by 3
determinant breaks into a sum of three 2 by 2’s, just as the third triangle in Figure 5.1
breaks into three triangles from (0, 0):
½ |x1 y1 1; x2 y2 1; x3 y3 1| = +½(x1y2 - x2y1) + ½(x2y3 - x3y2) + ½(x3y1 - x1y3).   (9)
This shows the area of the general triangle as the sum of three special areas. If (0, 0)
is outside the triangle, two of the special areas can be negative—but the sum is still
correct.
The real problem is to explain ½(x1y2 - x2y1).
Figure 5.2 Triangle with corners (0, 0), (4, 1), (1, 3): triangle area 5½. A triangle is half of a parallelogram.
Why is this the area of a triangle? We can remove the factor ½ and change to a parallelogram (twice as big). The parallelogram contains two equal triangles. We now prove that its area is the determinant x1y2 - x2y1. This area in Figure 5.2 is 11, and the triangle has area 5½.
Proof that a parallelogram starting from (0,0) has area = 2 by 2 determinant.
There are many proofs but this one fits with the book. We show that the area has the
same properties 1-2-3 as the determinant. Then area = determinant! Remember that
those three rules defined the determinant and led to all its other properties.
1 When A = I, the parallelogram becomes the unit square. Its area is det I = 1.

2 When rows are exchanged, the determinant reverses sign. The absolute value (positive area) stays the same—it is the same parallelogram.

3 If row 1 is multiplied by t, Figure 5.3a shows that the area is also multiplied by t. If row 1 of A is added to row 1 of A' (keeping row 2 fixed), the new determinant is |A| + |A'|. Figure 5.3b shows that the solid parallelogram areas also add to the dotted parallelogram area (because the two triangles completed by dotted lines are the same).
Figure 5.3 Areas tA and A + A' obey linearity (keeping one side constant).
That is an exotic proof, when we could use plane geometry. But the proof has a
major attraction—it applies in n dimensions. The n edges going out from the origin are
given by the rows of an n by n matrix. This is like the triangle with two edges going
out from (0,0). The box is completed by more edges, just as the parallelogram was
completed from a triangle. Figure 5.4 shows a three-dimensional box—whose edges
are not at right angles.
The volume of the box in Figure 5.4 equals the absolute value of det A. Our
proof is to check that rules 1-3 for determinants are also obeyed by volumes. When an
edge is stretched by a factor t, the volume is multiplied by t. When edge 1 is added
to edge 1’, the new box has edge 1 + 1’. Its volume is the sum of the two original
volumes. This is Figure 5.3b lifted into three dimensions or n dimensions. I would
draw the boxes but this paper is only two-dimensional.
The unit cube has volume = 1, which is det J. This leaves only rule 2 to be checked.
Row exchanges or edge exchanges leave the same box and the same absolute volume.
The determinant changes sign, to indicate whether the edges are a right-handed triple
(det A > 0) or a left-handed triple(det A < 0). The box volume follows the rules for
determinants, so volume = determinant.
Example 6 A rectangular box (90° angles) has side lengths r, s, and t. Its volume is r times s times t. A matrix that produces those three sides is the diagonal matrix A = [r 0 0; 0 s 0; 0 0 t]. Then det A also equals rst.
Example 7 (Triple and double integrals) In calculus, the box is infinitesimally small!
When the coordinates are x, y, z, the box volume is dV = dx dy dz. To integrate over
a circle, we might change x and y to r and @. Those are polar coordinates: x = r cos@
Figure 5.4 Three-dimensional box formed from the three rows of A, with edge corners at (a11, a12, a13), (a21, a22, a23), and (a31, a32, a33).
and y = r sin θ. The area of a polar box is a determinant J times dr dθ:

J = |∂x/∂r ∂x/∂θ; ∂y/∂r ∂y/∂θ| = |cos θ  -r sin θ; sin θ  r cos θ|.

This determinant is J = r cos²θ + r sin²θ = r. That is the r in polar coordinates. A small area is dA = r dr dθ. The stretching factor J goes into double integrals just as dx/du goes into an ordinary integral ∫ dx = ∫ (dx/du) du.
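A one-line numerical check that this stretching factor is J = r (a small sketch; any r and θ will do):

r = 2;  theta = 0.7;
J = [cos(theta)  -r*sin(theta);  sin(theta)  r*cos(theta)];   % matrix of partial derivatives
disp([det(J)  r])                                             % both print 2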
The Cross Product
This is an extra (and optional) application, special for three dimensions. Start with vectors u = (u1, u2, u3) and v = (v1, v2, v3). These pages are about their cross product.
Unlike the dot product, which is a number, the cross product is a vector—also in three
dimensions. It is written uw x v and pronounced “u cross v.” We will quickly give the
components of this vector, and also the properties that make it useful in geometry and
physics.
This time we bite the bullet, and write down the formula before the properties.
DEFINITION The cross product of u = (u1, u2, u3) and v = (v1, v2, v3) is the vector

u × v = |i j k; u1 u2 u3; v1 v2 v3| = (u2v3 - u3v2) i + (u3v1 - u1v3) j + (u1v2 - u2v1) k.   (10)
This vector is perpendicular to u and v. The cross product v x u is —(u x v).
Comment The 3 by 3 determinant is the easiest way to remember u x v. It is not
especially legal, because the first row contains vectors i, j, k and the other rows contain
numbers. In the determinant, the vector i = (1,0, 0) multiplies u2v3 and —u3v2. The
result is (u2v3 — u3v2, 0,0), which displays the first component of the cross product.
Notice the cyclic pattern of the subscripts: 2 and 3 give component |, then 3 and
1 give component 2, then | and 2 give component 3. This completes the definition of
u x v. Now we list its properties.
Property 1 v × u reverses every sign so it equals -(u × v). We are exchanging rows 2 and 3 in the determinant so we expect a sign change.

Property 2 The cross product is perpendicular to u (and also to v). The most direct proof is to watch terms cancel. Perpendicularity is a zero dot product:

u · (u × v) = u1(u2v3 - u3v2) + u2(u3v1 - u1v3) + u3(u1v2 - u2v1) = 0.   (11)
This is like replacing i, j, k in the top row by u1, u2, u3. The cross-product determinant now has a repeated row so it is zero.
Property 3 The cross product of any vector with itself (two equal rows) is u × u = 0.

When u and v are parallel, the cross product is zero. When u and v are perpendicular, the dot product is zero. One involves sin θ and the other involves cos θ:

||u × v|| = ||u|| ||v|| |sin θ|   and   |u · v| = ||u|| ||v|| |cos θ|.   (12)

Example 8 Since u = (3, 2, 0) and v = (1, 4, 0) are in the xy plane, u × v goes up the z axis:

u × v = |i j k; 3 2 0; 1 4 0| = 10k.   The cross product is (0, 0, 10).
The length of u x v equals the area of the parallelogram with sides u and v. This
will be important: In this example the area is 10.
Example 9 The cross product of u = (1, 1, 1) and v = (1, 1, 2) is (1, -1, 0):

u × v = |i j k; 1 1 1; 1 1 2| = i(1·2 - 1·1) - j(1·2 - 1·1) + k(1·1 - 1·1) = (1, -1, 0).

This vector (1, -1, 0) is perpendicular to (1, 1, 1) and (1, 1, 2) as predicted. Area = √2.
Example 10 The cross product of (1, 0, 0) and (0, 1, 0) obeys the right hand rule. It goes up not down:

u × v = |i j k; 1 0 0; 0 1 0| = k.   Rule: u × v points along your right thumb when the fingers curl from u to v.

Thus i × j = k. The right hand rule also gives j × k = i and k × i = j. Note the cyclic order. In the opposite order (anti-cyclic) the thumb is reversed and the cross product goes the other way: k × j = -i and i × k = -j and j × i = -k. You see
the three plus signs and three minus signs from a 3 by 3 determinant.
The definition of u x v can be based on vectors instead of their components:
DEFINITION The cross product is a vector with length ||u|| ||v|| |sin θ|. Its direction is perpendicular to u and v. It points "up" or "down" by the right hand rule.
This definition appeals to physicists, who hate to choose axes and coordinates. They see (u1, u2, u3) as the position of a mass and (Fx, Fy, Fz) as a force acting on it. If F is parallel to u, then u × F = 0—there is no turning. The mass is pushed out or pulled in. The cross product u × F is the turning force or torque. It points along the turning axis (perpendicular to u and F). Its length ||u|| ||F|| sin θ measures the "moment" that produces turning.
Triple Product = Determinant = Volume
Since u xv is a vector, we can take its dot product with a third vector w. That produces
the triple product (u x v)-w. It is called a “scalar” triple product, because it is a
number. In fact it is a determinant:
(u × v) · w = |w1 w2 w3; u1 u2 u3; v1 v2 v3| = |u1 u2 u3; v1 v2 v3; w1 w2 w3|.   (13)
We can put w in the top row or the bottom row. The two determinants are the same because it takes two row exchanges to go from one to the other. Notice when the
determinant is zero:
(u × v) · w = 0   exactly when the vectors u, v, w lie in the same plane.

First reason   u × v is perpendicular to that plane so its dot product with w is zero.
Second reason Three vectors in a plane are linearly dependent. The matrix is singular
(determinant = 0).
Third reason Determinant equals volume. This is zero when the u, v, w box is squashed
onto a plane.
It is remarkable that (u x v)-w equals the volume of the box with sides u, v, w. Both
numbers are given by the 3 by 3 determinant—which carries tremendous information.
Like ad — bc for a 2 by 2 matrix, it separates invertible from singular. Chapter 6 will
be looking for singular.
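MATLAB's built-in cross, dot, and det make these statements easy to test (a sketch, with the vectors of Example 9 and one extra vector w chosen here only for illustration):

u = [1 1 1];  v = [1 1 2];  w = [2 3 7];
disp(cross(u, v))                                   % (1, -1, 0), as in Example 9
disp([dot(u, cross(u, v))  dot(v, cross(u, v))])    % perpendicular: both zero
disp([dot(cross(u, v), w)  det([u; v; w])])         % the triple product equals the determinant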
REVIEW OF THE KEY IDEAS
Cramer's Rule solves Ax = b by ratios like x1 = |b a2 ··· an| / |A|.
The formula for the inverse matrix is A⁻¹ = C^T/(det A).
The volume of a box is |det A|, when the sides are the rows of A.
That volume is needed to change variables in double and triple integrals.
The cross product u x v is perpendicular to u and v.
Problem Set 5.3
Problems
1
1-5 are about Cramer’s
Rule for
x = A~!b.
Solve these linear equations by Cramer’s Rule x; = det B;/ det A:
ad
i
ie ae
2x1 +
b)
x2
=]
x +2 + 3 =0
aa
x2 + 2x3 = 0.
Use Cramer’s Rule to solve for y (only). Call the 3 by 3 determinant D:
i)
ax
by =1
ex'-+ dy = 0
ax+by+cez=1
dx+ey+ fz=0
gx thy+iz=0.
(b)
Cramer’s Rule breaks down when det A = 0. Example (a) has no solution while
(b) has infinitely many. What are the ratios x; = det B;/ det A in these two cases?
(a)
2x1 + 3x2 = 1
4x, + 6x2 = 1
(b)
2x1
+ 3x2 = 1
4x; + 6x2 = 2.
Quick proof of Cramer’s rule. The determinant is a linear function of column 1.
It is zero when two columns are equal. If b = Ax = Yaj;x; goes into the first
column
of A, the determinant
|b
ap
a3| = |Lajx;
of this matrix
ar
B; is
a3| = x |ay
a>)
(a3)\ =x
(a)
What formula for x; comes from left side = right side?
(b)
What steps lead to middle determinant = right side?
det A.
If the first column of A is also the right side b, solve the 3 by 3 system Ax = b.
How does each determinant in Cramer’s Rule lead to your answer?
Problems 6-16 are about A⁻¹ = C^T/ det A. Remember to transpose C.

6
Find A~! from the cofactor formula C!/det A. Use symmetry in part (b).
i
(2)
eee
ee
2. 0
ones)
a
(0) Os as
ot
ee
FS
ee
Ore
1h F2
If all the cofactors are zero, how do you know that A has no inverse? If none of the cofactors are zero, is A sure to be invertible?
Find the cofactors of A and multiply AC? to find det A:
i
AC—a
it
i
al
lane,
F205
and
i
A)
Cs
Suppose det A = | and you know all the cofactors.
How can you find A?
From the formula AC! = (det A)/ show that det C = (det A)"~!.
(for professors only) If you know all 16 cofactors of a 4 by 4 invertible matrix
A, how would you find A?
If all entries of A are integers, and det A =
A7! are integers. Give a 2 by 2 example.
| or —1, prove that all entries of
If all entries of A and A~! are integers, prove that det A = 1 or —1. Hint: What
is det A times det A~!?
14
Complete the calculation of A~! by cofactors in Example 5.
15
L is lower triangular and S is symmetric.
ea)
jb=S\a
Ee
det
4h
16
Qa
Sas
|
di
baad
mGaE
olenen tj
(a)
L~! is also lower triangular. Which three cofactors are zero?
(b)
S~! is also symmetric.
Which three pairs of cofactors are equal?
For n = 5 the matrix C contains ____ cofactors and each 4 by 4 cofactor
contains
terms and each term needs —____ multiplications. Compare with
53 — 125 for the Gauss-Jordan computation of A~! in Section 2.4.
Problems
17
Assume they are invertible:
17-26 are about area and volume by determinants.
(a)
Find the area of the parallelogram with edges v = (3,2) and w = (1, 4).
(b)
Find the area of the triangle with sides v, w, and v + w. Draw it.
(c)
Find the area of the triangle with sides v, w, and w — v. Draw
it.
18
A box has edges from (0, 0, 0) to (3, 1, 1) and (1, 3, 1) and (1, 1, 3). Find its volume and also find the area of each face from ||u × v||. The faces are parallelograms.
19
20
(a)
The corners of a triangle are (2, 1) and (3,4) and (0,5). What is the area?
(b)
Add a corner at (—1,0)
the area?
to make a lopsided region (four sides).
What is
The parallelogram with sides (2,1) and (2,3) has the same area as the parallel-
ogram with sides (2,2) and (1,3). Find those areas from 2 by 2 determinants
and say why they must be equal. (I can’t see why from a picture. Please write
to me if you do.)
21
(a)
Suppose the column
vectors in a 3 by 3 matrix have lengths Lj, L2, L3.
What is the largest possible value for the determinant?
(b)
If all entries have |a;;| = 1, can the determinant (six terms) equal 6?
22
Show by a picture how a rectangle with area x; y2 minus a rectangle with area
X2y, produces the same area as our parallelogram.
23
When the edge vectors a, b,c are perpendicular, the volume of the box is |la]|
times ||b|| times |lc||. The matrix A’ A is
. Find the determinants of ATA
and A.
24
The box with edges i and j and w = 2i + 37 + 4k has height
. What is
the volume? What is the matrix with this determinant? What is i x j and what
is its dot product with w?
25
An n-dimensional cube has how many corners? How many edges? How many
(n — 1)-dimensional faces? The cube whose edges are the rows of 2/ has volume _____. A hypercube computer has parallel processors at the corners with
connections along the edges.
26
The triangle with corners (0, 0), (1, 0), (0, 1) has area 5. The pyramid with four
corners (0, 0,0), (1, 0, 0), (0, 1, 0), (0, 0, 1) has volume
. What is the volume of a 4-dimensional pyramid with five corners at (0,0, 0,0) and the columns
of [?
Problems
27,
27-30 are about areas dA and volumes dV
Polar coordinates r,@ satisfy
x =rcos@
and
OxfOr” ON OOn
dy/dr
The two columns
dy/d0|
—
in calculus.
y=rsin@.
cos@
sinO?
Polar area includes J:
—rsind |
rcos@
are orthogonal and their lengths are
~ Therefore: J =
28
Spherical coordinates p,y,6 satisfy x = psingcos@ and y = psingsin@ and
z= pcosg. Find the 3 by 3 matrix of partial derivatives: dx/dp, dx /dg~, 0x/d0
in row 1. Simplify its determinant to J = p*sing. Then dV ina sphere is
p* sing dp dgdé.
29
The matrix that connects 7,6 to x, y is in Problem 27. Invert that 2 by 2 matrix:
al
ore
to
dr/ox
d0/dx
or/dy
dO/dy
cos@
?
ae
yh
?
It is surprising that dr/dx = dx/dr (Calculus, Gilbert Strang, p. 501). Multiplying the matrices in 27 and 29 gives the chain rule a = cae + a Ae
30 = The triangle with corners (0, 0), (6, 0), and (1, 4) has area
it by 6 = 60° the area is
ae
. When you rotate
. The determinant of the rotation matrix is
cos @ — sind}
sind
cos@
_
5 ? ies
ois
Problems 31-37 are about the triple product (u x v) -w in three dimensions.
31.
A box has base area ||u x v||. Its perpendicular height is ||w||cos@. Volume
equals base area times height. So volume = ||u x v|| ||w|| cos @ which is (ux v)-w.
Compute
base area, height, and volume
for u =
(2,4,0),
v =
(—1,3,0),
w =
Gee
32
33
The volume of the same box is given by a 3 by 3 determinant.
determinant.
Evaluate that
Expand the 3 by 3 determinant in equation (13) in cofactors of its row 1, U2, U3.
This expansion is the dot product of u with the vector
34
Which of the triple products (u x w)-v and (w xu)-v and (v x w)-u are the same
as (u x v)+w? Which orders of the rows u, v, w give the correct determinant?
Shee
ket? = (hk
0,=1) andiOi= G51)
and.k = @2.2a1)<
is a parallelogram and compute its area. Choose T,U,V
is a tilted box and compute its volume.
36
37
ChooseS’so' tha. PORS
so that OPQRSTUV
Suppose (x, y,z) and (1, 1,0) and (1, 2,1) lie on a plane through the origin.
What determinant is zero? What equation does this give for the plane?
Suppose
(x, y, z) is a linear combination
of (2, 3,1) and (1, 2,3).
What deter-
minant is zero? What equation does this give for what plane?
38
(a)
Explain from volumes why det2A = 2” det A for n by n matrices.
(b)
For what size matrix is the false statement det
A+det A = det(A
+ A) true?
EIGENVALUES AND EIGENVECTORS

INTRODUCTION TO EIGENVALUES ■ 6.1
Linear equations Ax = b come from steady state problems. Eigenvalues have their
greatest importance in dynamic problems. The solution is changing with time—growing
or decaying or oscillating. We can’t find it by elimination. This chapter enters a new
part of linear algebra. All matrices in this chapter are square.
A good model comes from the powers A, A², A³, ... of a matrix. Suppose you need the hundredth power A^100. The starting matrix A becomes unrecognizable after a few steps:

A = [.8 .3; .2 .7]   A² = [.70 .45; .30 .55]   A³ = [.650 .525; .350 .475]   ...   A^100 = [.6000 .6000; .4000 .4000]

A^100 was found by using the eigenvalues of A, not by multiplying 100 matrices. Those
eigenvalues are a new way to see into the heart of a matrix.
To explain eigenvalues, we first explain eigenvectors. Almost all vectors change
direction, when they are multiplied by A. Certain exceptional vectors x are in the
same direction as Ax. Those are the “eigenvectors.” Multiply an eigenvector by A,
and the vector Ax is a number A times the original x.
The basic equation is Ax = λx. The number λ is the "eigenvalue". It tells whether the special vector x is stretched or shrunk or reversed or left unchanged—when it is multiplied by A. We may find λ = 2 or ½ or -1 or 1. The eigenvalue λ could also be zero. The equation Ax = 0x means that this eigenvector x is in the nullspace.
If A is the identity matrix, every vector has Ax = x. All vectors are eigenvectors.
The eigenvalue (the number lambda) is λ = 1. This is unusual to say the least. Most 2
Figure 6.1 The eigenvectors keep their directions: A²x1 = (1)²x1 and A²x2 = (.5)²x2. The eigenvalues of A² are 1 and (.5)².
by 2 matrices have two eigenvector directions and two eigenvalues. This section teaches
how to compute the x's and λ's. It can come early in the course because we only need
the determinant of a 2 by 2 matrix.
For the matrix A in our model, here are eigenvectors x1 and x2. Multiplying those vectors by A gives x1 and .5x2. The eigenvalues λ are 1 and .5:

x1 = [.6; .4]   and   Ax1 = [.8 .3; .2 .7][.6; .4] = x1   (Ax1 = x1 means that λ1 = 1)

x2 = [1; -1]   and   Ax2 = [.8 .3; .2 .7][1; -1] = [.5; -.5]   (this is .5x2 so λ2 = .5).

If we multiply x1 by A again, we still get x1. Every power of A will give A^n x1 = x1. Multiplying x2 by A gave .5x2, and if we multiply again we get (.5)²x2.

When A is squared, the eigenvectors x1 and x2 stay the same. The λ's are now 1² and (.5)². The eigenvalues are squared! This pattern keeps going, because the eigenvectors stay in their own directions (Figure 6.1) and never get mixed. The eigenvectors of A^100 are the same x1 and x2. The eigenvalues of A^100 are 1^100 = 1 and (.5)^100 = very small number.
Other vectors do change direction. But those other vectors are combinations of the two eigenvectors. The first column of A is x1 + (.2)x2:

[.8; .2] = [.6; .4] + [.2; -.2].

Multiplying by A gives the first column of A². Do it separately for x1 and (.2)x2. Of course Ax1 = x1. And A multiplies the second vector by its eigenvalue .5:

A[.8; .2] = x1 + (.2)(.5)x2 = [.6; .4] + [.1; -.1] = [.7; .3].
Each eigenvector is multiplied by its eigenvalue, whenever A is applied. We didn’t
need these eigenvectors to find A*. But it is the good way to do 99 multiplications.
At every step x1 is unchanged and x2 is multiplied by (.5), so we have (.5)^99:

A^99 [.8; .2] is really x1 + (.2)(.5)^99 x2 = [.6; .4] + [very small vector].

This is the first column of A^100. The number we originally wrote as .6000 was not exact. We left out (.2)(.5)^99 which wouldn't show up for 30 decimal places.

The eigenvector x1 is a "steady state" that doesn't change (because λ1 = 1). The eigenvector x2 is a "decaying mode" that virtually disappears (because λ2 = .5). The higher the power of A, the closer its columns approach the steady state.

We mention that this particular A is a Markov matrix. Its entries are positive and every column adds to 1. Those facts guarantee that the largest eigenvalue is λ = 1 (as we found). Its eigenvector x1 = (.6, .4) is the steady state—which all columns of A^k will approach. Section 8.2 shows how Markov matrices appear in applications.
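A short MATLAB check of this model (a sketch using the built-in eig): the eigenvalues come out as 1 and .5, and both columns of A^100 are essentially the steady state (.6, .4):

A = [.8 .3; .2 .7];
disp(eig(A)')      % .5 and 1 (in some order)
disp(A^100)        % each column is very close to (.6, .4)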
Example 1 The projection matrix P = [.5 .5; .5 .5] has eigenvalues 0 and 1. Its eigenvectors are x1 = (1, 1) and x2 = (1, -1). For those vectors, Px1 = x1 and Px2 = 0. This projection matrix illustrates three things at once:

1 Each column of P adds to 1, so λ = 1 is an eigenvalue.

2 P is a singular matrix, so λ = 0 is an eigenvalue.

3 P is a symmetric matrix, so x1 and x2 are perpendicular.

The only possible eigenvalues of a projection matrix are 0 and 1. The eigenvectors for λ = 0 (which means Px = 0x) fill up the nullspace. The eigenvectors for λ = 1 (which means Px = x) fill up the column space. The nullspace is projected to zero. The column space stays fixed (those vectors are projected onto themselves).

An in-between vector like v = (3, 1) partly disappears and partly stays:

Pv = [.5 .5; .5 .5][3; 1] = [2; 2].

The projection keeps the column space part of v and destroys the nullspace part.

To emphasize: Special properties of a matrix lead to special eigenvalues and eigenvectors. That is a major theme of this chapter. Projections have λ = 0 and 1. The next matrix (a reflection and a permutation) is also special.
Example 2 The reflection matrix R = [0 1; 1 0] has eigenvalues 1 and -1.
Figure 6.2 Projection onto a line and reflection across a line. Projections have λ = 1 and 0. Reflections have λ = 1 and -1. Perpendicular eigenvectors.
The eigenvector (1, 1) is unchanged by R. The second eigenvector is (1, -1)—its signs are reversed by R. A matrix with no negative entries can still have a negative eigenvalue! The perpendicular eigenvectors are the same x1 and x2 that we found for projection.
Behind that fact is a relation between the matrices R and P:

R = 2P - I   or   [0 1; 1 0] = 2[.5 .5; .5 .5] - [1 0; 0 1].   (2)

Here is the point. If Px = λx then 2Px = 2λx. The eigenvalues are doubled when the matrix is doubled. Now subtract Ix = x. The result is (2P - I)x = (2λ - 1)x. When a matrix is shifted by I, each λ is shifted by 1. The eigenvectors stay the same. The eigenvalues are related exactly as the matrices are related:

2P - I = R   so the eigenvalues of R are   2(1) - 1 = 1 and 2(0) - 1 = -1.

Similarly the eigenvalues of R² are λ². In this case R² = I and (1)² = 1 and (-1)² = 1.
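A quick numerical confirmation of this shift rule (a sketch):

P = [.5 .5; .5 .5];   R = [0 1; 1 0];
disp(norm(2*P - eye(2) - R))     % zero: R really is 2P - I
disp(eig(P)')                    % 0 and 1
disp(eig(R)')                    % -1 and 1 = 2(0) - 1 and 2(1) - 1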
The Equation for the Eigenvalues
In small examples we could try to solve Ax = λx by trial and error. Now we use determinants and linear algebra. This is the key calculation in the whole chapter—to compute the λ's and x's.

First move λx to the left side. Write the equation Ax = λx as (A - λI)x = 0. The matrix A - λI times the eigenvector x is the zero vector. The eigenvectors make up the nullspace of A - λI! When we know an eigenvalue λ, we find an eigenvector by solving (A - λI)x = 0.

Eigenvalues first. If (A - λI)x = 0 has a nonzero solution, A - λI is not invertible. The determinant of A - λI must be zero. This is how to recognize an eigenvalue λ:
6A The number λ is an eigenvalue of A if and only if

det(A - λI) = 0.   (3)
This "characteristic equation" involves only λ, not x. Then each eigenvalue λ leads to x:

Solve   (A - λI)x = 0   or   Ax = λx.   (4)

Since det(A - λI) = 0 is an equation of the nth degree, the matrix has n eigenvalues.
Example 3 A = [1 2; 2 4] is already singular (zero determinant). Find its λ's and x's.

When A is singular, λ = 0 is one of the eigenvalues. The equation Ax = 0x has solutions. They are the eigenvectors for λ = 0. But to find all λ's and x's, begin as always by subtracting λI from A:

Subtract λ from the diagonal to find   A - λI = [1 - λ, 2; 2, 4 - λ].

Take the determinant of this matrix. From 1 - λ times 4 - λ, the determinant involves λ² - 5λ + 4. The other term, not containing λ, is 2 times 2. Subtract as in ad - bc:

det [1 - λ, 2; 2, 4 - λ] = (1 - λ)(4 - λ) - (2)(2) = λ² - 5λ.   (5)

This determinant λ² - 5λ is zero when λ is an eigenvalue. Factoring into λ times λ - 5, the two roots are λ = 0 and λ = 5:

det(A - λI) = λ² - 5λ = 0   yields the eigenvalues   λ1 = 0 and λ2 = 5.

Now find the eigenvectors. Solve (A - λI)x = 0 separately for λ1 = 0 and λ2 = 5:

(A - 0I)x = [1 2; 2 4][y; z] = [0; 0]   yields an eigenvector   [y; z] = [2; -1] for λ1 = 0

(A - 5I)x = [-4 2; 2 -1][y; z] = [0; 0]   yields an eigenvector   [y; z] = [1; 2] for λ2 = 5.

The matrices A - 0I and A - 5I are singular (because 0 and 5 are eigenvalues). The eigenvectors are in their nullspaces: (A - λI)x = 0 is Ax = λx.

We need to emphasize: There is nothing exceptional about λ = 0. Like every other number, zero might be an eigenvalue and it might not. If A is singular, it is. The eigenvectors fill the nullspace: Ax = 0x = 0. If A is invertible, zero is not an eigenvalue. We shift A by a multiple of I to make it singular. In the example, the shifted matrix A - 5I was singular and 5 was the other eigenvalue.
Summary To solve the eigenvalue problem for an n by n matrix, follow these steps:

1. Compute the determinant of A - λI. With λ subtracted along the diagonal, this determinant starts with λⁿ or -λⁿ. It is a polynomial in λ of degree n.

2. Find the roots of this polynomial, so that det(A - λI) = 0. The n roots are the n eigenvalues of A.

3. For each eigenvalue solve the system (A - λI)x = 0. Since the determinant is zero, there are solutions other than x = 0. Those x's are the eigenvectors.
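Those three steps fit in a few lines of MATLAB for the 2 by 2 matrix of Example 3 (a sketch: poly gives the coefficients of det(A - λI), roots finds the eigenvalues, and null solves (A - λI)x = 0):

A = [1 2; 2 4];
c = poly(A);                          % step 1: coefficients of lambda^2 - 5*lambda + 0
lambda = roots(c);                    % step 2: the roots 5 and 0
for k = 1:2
    x = null(A - lambda(k)*eye(2));   % step 3: an eigenvector for each eigenvalue
    fprintf('lambda = %g,  eigenvector direction (%g, %g)\n', lambda(k), x(1), x(2))
end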
A note on quick computations, when A is 2 by 2. The determinant of A - λI is a quadratic (starting with λ²). By factoring or by the quadratic formula, we find its two roots (the eigenvalues). Then the eigenvectors come immediately from A - λI. This matrix is singular, so both rows are multiples of a vector (a, b). The eigenvector is any multiple of (b, -a). The example had λ = 0 and λ = 5:

λ = 0:   rows of A - 0I in the direction (1, 2); eigenvector in the direction (2, -1)

λ = 5:   rows of A - 5I in the direction (-4, 2); eigenvector in the direction (2, 4).

Previously we wrote that last eigenvector as (1, 2). Both (1, 2) and (2, 4) are correct. There is a whole line of eigenvectors—any nonzero multiple of x is as good as x. Often we divide by the length, to make the eigenvector into a unit vector.
We end with a warning. Some 2 by 2 matrices have only one line of eigenvectors.
This can only happen when two eigenvalues are equal. (On the other hand A = /
has equal eigenvalues and plenty of eigenvectors.) Similarly some n by n matrices
don’t have n independent eigenvectors. Without n eigenvectors, we don’t have a basis.
We can’t write every v as a combination of eigenvectors. In the language of the next
section, we can’t diagonalize the matrix.
Good News, Bad News
Bad news first: If you add a row of A to another row, or exchange rows, the eigenvalues usually change. Elimination does not preserve the λ's. After elimination, the triangular U has its eigenvalues sitting along the diagonal—they are the pivots. But they are not the eigenvalues of A! Eigenvalues are changed when row 1 is added to row 2:

[1 1; 0 0] has λ = 0 and λ = 1;   A = [1 1; 1 1] has λ = 0 and λ = 2.

Good news second: The product λ1 times λ2 and the sum λ1 + λ2 can be found quickly from the matrix A. Here the product is 0 times 2. That agrees with the determinant of A (which is 0). The sum of the eigenvalues is 0 + 2. That agrees with the sum down the main diagonal of A (which is 1 + 1). These quick checks always work:

6B The product of the n eigenvalues equals the determinant of A.

6C The sum of the n eigenvalues equals the sum of the n diagonal entries. This number is called the trace of A:

λ1 + λ2 + ··· + λn = trace = a11 + a22 + ··· + ann.   (6)
(6)
Those checks are very useful. They are proved in Problems 15-16 and again in the
next section. They don’t remove the pain of computing the 4’s. But when the computation is wrong, they generally tell us so. To compute the correct 4’s, go back
to det(A — AJ) = 0.
The determinant test makes the product of the ’s equal to the product of the
pivots (assuming no row exchanges). But the sum of the A’s is not the sum of the
pivots—as the example showed. The individual A’s have almost nothing to do with the
individual pivots. In this new part of linear algebra, the key equation is really nonlinear:
A multiplies x.
Imaginary Eigenvalues
One more bit of news (not too terrible). The eigenvalues might not be real numbers.
Example 4 The 90° rotation Q = [0 -1; 1 0] has no real eigenvectors or eigenvalues.

No vector Qx stays in the same direction as x (except the zero vector which is useless). There cannot be an eigenvector, unless we go to imaginary numbers. Which we do.

To see how i can help, look at Q², which is -I. If Q is rotation through 90°, then Q² is rotation through 180°. Its eigenvalues are -1 and -1. (Certainly -Ix = -1x.) Squaring Q is supposed to square its eigenvalues λ, so we must have λ² = -1. The eigenvalues of the 90° rotation matrix Q are +i and -i.

Those λ's come as usual from det(Q - λI) = 0. This equation gives λ² + 1 = 0. Its roots are λ1 = i and λ2 = -i. They add to zero (which is the trace of Q). They multiply to one (which is the determinant).

We meet the imaginary number i (with i² = -1) also in the eigenvectors of Q:

[0 -1; 1 0][1; i] = -i[1; i]   and   [0 -1; 1 0][i; 1] = i[i; 1].
Somehow these complex vectors keep their direction as they are rotated. Don’t ask
me how. This example makes the all-important point that real matrices can easily have
complex eigenvalues. It also illustrates two special properties that we don’t study completely until Chapter 10:
1 Q is a skew-symmetric matrix so each λ is pure imaginary.

2 Q is an orthogonal matrix so the absolute value of each λ is 1.

A symmetric matrix (A^T = A) can be compared to a real number. A skew-symmetric matrix (A^T = -A) can be compared to an imaginary number. An orthogonal matrix (A^T A = I) can be compared to a complex number with |λ| = 1. For the eigenvalues those are more than analogies—they are theorems to be proved. The eigenvectors for all these special matrices are perpendicular.*
Eigshow
There is a MATLAB
demo called eigshow, displaying the eigenvalue problem for a
2 by 2 matrix. It starts with the unit vector x = (1,0). The mouse makes this vector move around the unit circle. At the same time the graph shows Ax, in color and
also moving. Possibly Ax is ahead of x. Possibly Ax is behind x. Sometimes Ax is
parallel to x. At that parallel moment, Ax = Ax and x is an eigenvector.
tae
/
Ax7 = 0.5x7
0.3, 0.7)
\
\
/
\
\ ellipse of Ax’s
AxI——1(O18702)
As
ge
/
Fe
ciclesofix's
ap =O)
The eigenvalue A comes from the length and direction of Ax. The built-in choices
for A illustrate various possibilities:
1.
There are no (real) eigenvectors. The directions of x and Ax never meet.
eigenvalues and eigenvectors are complex.
2.
There is only one line of eigenvectors.
but don’t cross.
3.
There are eigenvectors in two independent directions. This is typical! Ax crosses
x at the first eigenvector, and it crosses back at the second eigenvector.
The moving directions Ax and x meet
This file is in the Student Version and in MATLAB
and it can be downloaded from the Web:
*Chapter
10 defines “perpendicular”
for complex
The
vectors.
5.2. It is with our Teaching Codes
Somehow
(i, 1) and (1,i) are perpendicular.
6.1.
Introduction to Eigenvalues
253
http:// pub/mathworks/toolbox/matlab/demos
Suppose A is singular (rank one). Its column space is a line. The vector Ax can’t
move around, it has to stay on that line. One eigenvector x is along the line. Another
eigenvector appears when Ax2 = 0. Zero is an eigenvalue of a singular matrix.
You can mentally follow x and Ax for these six matrices. How many eigenvectors and where? When does Ax go clockwise instead of counterclockwise?
“Ei
=»
a) Co) Go bi) fi
REVIEW OF THE KEY IDEAS
#®
1.
Ax =Ax
says that x keeps the same direction when multiplied by A.
2.
Ax =dx
also says that det(A — AJ) = 0. This determines n eigenvalues.
3.
The eigenvalues of A* and A~! are A* and A~!, with the same eigenvectors.
4.
The sum and product of the A’s equal the trace and determinant.
5.
Special matrices like projections and rotations have special eigenvalues !
Problem Set 6.1
1
The example at the start of the chapter has
yan
ees)
on Ee Z| es
WO.
aA
EE s) ne
A te
eed
F Al:
The matrix A? is halfway between A and A®. Explain why A? = 5(A + A%)
from the eigenvalues and eigenvectors of these three matrices.
2
(a)
Show from A how a row exchange can produce different eigenvalues.
(b)
Why is a zero eigenvalue not changed by the steps of elimination?
Find the eigenvalues and the eigenvectors of these two matrices:
hi eieenalee
1
A+/T/ has the
4
24
eigenvectors as A. Its eigenvalues are
ebyri
254
Chapter 6
Eigenvalues and Eigenvectors
2)
Compute the eigenvalues and eigenvectors of A and An
aa
A!
has the
(2
A
rel
|
and
eigenvectors
as A. When
—3/4
1/2
1/2
al
A has eigenvalues A; and Az, its
inverse has eigenvalues
Compute the eigenvalues and eigenvectors of A and sh
—1
hei,
A® has the same
values
3
A
and
ay |i
oe
ees.
‘|
as A. When A has eigenvalues A, and A2, A” has eigen-
Find the eigenvalues of A and B and A+ B:
i
12)
Eigenvalues of A+B
values of B.
@
il and
Leta
alk
|
and
rt
a+B=|i
a
(are equal to)(are not equal to) eigenvalues of A plus eigen-
Find the eigenvalues of A and B and AB and BA:
et
i
@
i and
fl
Fa;
il
| and
if
ap =|
il
3] and
Dil
Ba=(/
ap
Eigenvalues of AB (are equal to)(are not equal to) eigenvalues of A times eigenvalues of B. Eigenvalues of AB (are equal to)(are not equal to) eigenvalues
of BA.
Elimination produces A = LU. The eigenvalues of U are on its diagonal; they
are the
. The eigenvalues of L are on its diagonal; they are all
The eigenvalues of A are not the same as
(a)
If you know x is an eigenvector, the way to find A is to
(b)
If you know A is an eigenvalue, the way to find x is to
What do you do to Ax = Ax, in order to prove (a), (b), and (c)?
(ay
AP tis_an eigenvalue of A’, as in Problem 4.
(b)
A!
(c)
A+ 1 is an eigenvalue of A+ /, as in Problem 2.
is an eigenvalue of A—!, as in Problem 3.
6.1
10
Introduction to Eigenvalues
255
Find the eigenvalues and eigenvectors for both of these Markoy matrices A and
A®. Explain why A! is close to AX:
ae 2
ery
te
seal 7 | me
=|5; aa
11
Find the eigenvalues and eigenvectors for the projection matrices P and P!:
fan
If two eigenvectors
share the same
aoe)
Wise) teh at)
OF 0. 1
A, so do all their linear combinations.
Find
an eigenvector of P with no zero components.
72
From the unit vector
u =
(Z. i ;, 2) construct the rank one projection matrix
P=uu'.
13
(a)
Show that Pu =u.
Then u is an eigenvector with A = 1.
(b)
If v is perpendicular to uw show that Pv = 0. Then 4 = 0.
(c)
Find three independent eigenvectors of P all with eigenvalue 4 = 0.
Solve det(Q — AJ) = 0 by the quadratic formula to reach A = cos@ +i
—sing
a
eaten
=| cos@
rotates the xy plane by the angle @.
Find the eigenvectors of Q by solving (Q —A/)x =0.
14
Use ici
Every permutation matrix leaves x = (1, 1,..., 1) unchanged.
two more A’s for these permutations:
Oa ligne ()
(Pee)
eel
i ©. 10
15
sin@:
gral
0.
(P= | O
il
@
i
@
Then A = 1. Find
1
©
@
Prove that the determinant of A equals the product A,A2---A,.
Start with the
polynomial det(A — A/) separated into its n factors. Then set 4 =
det(A — AI) = (Ay —A)(A2 —A)-++ (An —A)
16
so
detA=
The sum of the diagonal entries (the trace) equals the sum of the eigenvalues:
a
a=| c
b
|
has
If A has Ay = 3 and Az =4
gives the eigenvalues
det(A
—AJ)
D
=A* —(a+d)X\+ad
— bc =0.
then det(A—AJ)
=
A= (a+d+/ )/2andA=
_.
The quadratic formula
__. Their sum is
256
Chapter 6
Eigenvalues and Eigenvectors
17
If A has Ay =4 and Ay =5
then det(A —AJ) = (A —4)(A — 5) = A2 — 94.420.
Find three matrices that have trace
18
A 3 by 3 matrix
B is known
a + d = 9 and determinant 20 and A = 4, 5.
to have eigenvalues
0,1,2.
This information
is
enough to find three of these:
(a)
the rank of B
(b)
the determinant of B'B
(c)
the eigenvalues of B'B
(d)
the eigenvalues of (B+ / ieee
19
Choose the second row of A =[°!] so that A has eigenvalues 4 and 7.
20
Choose a, b,c, so that det(A —A/J) = 9, —A3. Then the eigenvalues are —3, 0, 3:
OPO
Aa)
oi WC
21
The eigenvalues of A equal the eigenvalues of A‘. This is because det(A — A/)
equals det(A! — AJ). That is true because
. Show by an example that the
eigenvectors of A and A! are not the same.
22
Construct any 3 by 3 Markov matrix M: positive entries down each column add
to 1. If e = (1, 1,1) verify that M™e = e. By Problem 21, A = 1 is also an
eigenvalue of M. Challenge: A 3 by 3 singular Markov matrix with trace : has
eigenvalues A =
23
Find three 2 by 2 matrices that have 4; = Az = 0. The trace is zero and the
determinant is zero. The matrix A might not be 0 but check that A* = 0.
24
This matrix is singular with rank one. Find three A’s and three eigenvectors:
1
Aja
20
1
25
pk
ae)
Suppose A and B have the same eigenvalues 4),..., A, with the same independent eigenvectors x;,...,X,. Then A = B. Reason: Any vector x is a combination cyx; +---+c¢,X,.
26
Ha?
2112, Meira Ae Oecd
What is Ax?
What is Bx?
The block B has eigenvalues 1,2 and C has eigenvalues 3,4 and D has eigenvalues 5,7. Find the eigenvalues of the 4 by 4 matrix A:
0
B Cl_|-2
A= | le 0
0
)
0
6
Fr
W
CoO
1 SO
Ae
27
6.1
Introduction to Eigenvalues
C=
1
0
1
—
i
=
Si
0
Find the rank and the four eigenvalues of
and
ey
ee
fee
py
pe
ay
Ss
pa
28
257
oOo
©=
Subtract J from the previous A. Find the 4’s and then the determinant:
1
l
BSA
SoS
SS SS
7S
eyeSey
aS 0
When A (all ones) is 5S by 5, the eigenvalues of A and
and
29
are
(Review) Find the eigenvalues of A, B, and C:
oe
A=
||@
ee
On
4
givel
When a+b =c-+d
of
Oneal
2 @
0)
Je=| 0
ORROmAG
30
B = A —/
3
ANC ee Gn — a|
Pi 4 SRV ED)
ee
Ds Be Be
show that (1, 1) is an eigenvector and find both eigenvalues
Goa
a
31
When
P exchanges
rows
E AF
| and 2 and columns
change. Find eigenvectors of A and PAP
|
Ave
A
32
2 ij
Ome
Sa4
| and 2, the eigenvalues
don’t
for A= 11:
Andee Alea
a
on
WwW
Rr
WwW
RR
Suppose A has eigenvalues 0, 3,5 with independent eigenvectors u, v, w.
(a)
Give a basis for the nullspace and a basis for the column space.
(b)
Find a particular solution to Ax = v + w. Find all solutions.
(c)
Show that Ax =u
column space.)
has no solution. (If it did then “4
would be in the
258
Chapter 6
Eigenvalues and Eigenvectors
DIAGONALIZING
A MATRIX
@ 6.2
When x is an eigenvector, multiplication by A is just multiplication by a single number:
Ax = Ax. All the difficulties of matrices are swept away. Instead of an interconnected
system, we can follow the eigenvectors separately. It is like having a diagonal matrix,
with no off-diagonal interconnections. The 100th power of a diagonal matrix is easy.
The point of this section is very direct. The matrix A turns into a diagonal
matrix A when we use the eigenvectors properly. This is the matrix form of our key
idea. We start right off with that one essential computation.
6D Diagonalization
Suppose the n by n matrix A has n linearly independent eigen-
vectors. Put them into the columns of an eigenvector matrix S. Then S~!AS is the
eigenvalue matrix A:
hn
The matrix A is “diagonalized.’ We use capital lambda for the eigenvalue matrix,
because of the small i’s (the eigenvalues) on its diagonal.
Proof Multiply A times its eigenvectors, which are the columns of S. The first column
of AS is Ax,. That is A,x,:
Assi AC WA
Ae
le eA eh, Pc
The trick is to split this matrix AS into S times A:
Ay
AjX1
°° *
Anky | =]
°°
Xp
so
=
SA.
An
Keep those matrices in the right order! Then 4; multiplies the first column x;, as shown.
The diagonalization is complete, and we can write AS = SA in two good ways:
As=SA)
3.
S-'AS=A~
Ors
A=SAS_1,
(2)
The matrix S has an inverse, because its columns (the eigenvectors of A) were assumed
to be linearly independent.
Without n independent eigenvectors, we can’t diagonalize.
The matrices
eigenvalues 41,...,A,.
A and A have the same
The eigenvectors
are different. The job of the original eigenvectors was to diagonalize A—those eigenvectors of A went into S. The new eigenvectors, for the diagonal matrix A, are just the
6.2
Diagonalizing a Matrix
259
columns of J. By using A, we can solve differential equations or difference equations
or even Ax = b.
Example 1 The projection matrix P = [-3:3] has A = 1 and 0 with eigenvectors (1, 1)
and (—1, 1). Put the eigenvalues into A and the eigenvectors into S. Then S -Ips=a;:
ES5 5 3]fon a 0
The original projection satisfied P? = P. The new projection satisfies A* = A.
The column
space has swung
around from (1,1) to (1,0).
The nullspace has swung
around from (—1, 1) to (0, 1). Diagonalization lines up the eigenvectors with the coordinate axes.
Here are four small remarks about diagonalization, before the applications.
Remark 1 Suppose the numbers Aj, ..., An are all different. Then it is automatic that
the eigenvectors x;,...,X,
are independent.
See 6E below.
Therefore any matrix that
has no repeated eigenvalues can be diagonalized.
Remark 2 The eigenvector matrix S$ is not unique. We can multiply its columns (the
eigenvectors) by any nonzero constants. Suppose we multiply by 5 and —1. Divide
the rows of S~! by 5 and —1 to find the new inverse:
sh
sees Moe wal Be
ete =
|
=
SnewP Snew = ie EY E 4 ie fi = E i sp SAIN yak
—{
2
The extreme case is A = /, when every vector is an eigenvector.
S can be the eigenvector matrix.
Then
Any invertible matrix
S-!1S = 1 (which is A).
Remark 3 To diagonalize A we must use an eigenvector matrix. From S~'AS = A
we know that AS = SA. Suppose the first column of S is y. Then the first columns
of AS and SA are Ay and A;y. For those to be equal, y must be an eigenvector.
The eigenvectors in S come in the same order as the eigenvalues in A. To reverse
the order in S and A, put (—1, 1) before (1, 1):
eeeemieteice
Cer
Remark 4 (repeated warning for repeated eigenvalues) Some matrices have too few
eigenvectors. Those matrices are not diagonalizable. Here are two examples:
1 —=1
a=|y zy
and
0)
A=[5
dl
a
Their eigenvalues happen to be 0 and 0. Nothing is special about zero—it is the repeated A that counts. Look for eigenvectors of the second matrix.
Ax
=Ox
means
Oat
0
ip alPale lat
260
Chapter 6
Eigenvalues and Eigenvectors
The eigenvector x is a multiple of (1,0). There is no second eigenvector, so A cannot
be diagonalized. This matrix is the best example to test any statement about eigenvectors. In many true-false questions, this matrix leads to false.
Remember that there is no connection between invertibility and diagonalizability:
-
Invertibility is concerned with the eigenvalues (zero or not).
-
Diagonalizability is concerned with the eigenvectors (too few or enough).
Each eigenvalue has at least one eigenvector! If (A —A/)x = O leads you to x = 0, then A is
not an eigenvalue. Look for a mistake in solving det(A —A/) = 0. If you have eigenvectors
for n different ’s, those eigenvectors are independent and A is diagonalizable.
6E
(Independent x from different A) Eigenvectors x1, ..., x; that correspond to distinct (all different) eigenvalues are linearly independent. Ann by n matrix that has n
different eigenvalues (no repeated i’s) must be diagonalizable.
Proof Suppose c;x; +c2x2 = 0. Multiply by A to find cjA x; +c2A2x2 = 0. Multiply
by Az to find cyA2x1 + c2A2x2 = 0. Now subtract one from the other:
Subtraction leaves
(A; — A2)cix; = 0.
Since the i’s are different and x; 4 0, we are forced to the conclusion that cj = 0.
Similarly cz = 0. No other combination gives c;x; +c2x2 = 0, so the eigenvectors x
and x2 must be independent.
This proof extends directly to j eigenvectors. Suppose cjx; +---+ cjx; = 0.
Multiply by A, then multiply separately by A;, and subtract. This removes x;. Now
multiply by A and by Aj;_-; and subtract. This removes x;_;. Eventually only a multiple of x, is left:
(At —A3) "Ai
= 2; cixy = 0
which forces”
cy = 0.
(3)
Similarly every cj; = 0. When the A’s are all different, the eigenvectors are independent.
With n different eigenvalues, the full set of eigenvectors goes into the columns
of S. Then A is diagonalized.
Example 2 The Markov matrix A = [-8-3] in the last section had A, = 1 and Az = .5.
Here is
A= SAS7!:
bea
eal clined
The eigenvectors (.6, .4) and (1, —1) are in the columns of S$. We know that they are
also the eigenvectors of A?. Therefore A* has the same S, and the eigenvalues in A
are squared:
A* =SAS
SAS]
= SA2s“!,
6.2
Diagonalizing a Matrix
261
Just keep going, and you see why the high powers A* approach a “steady state”:
pO
ehOG
perry
Wows
iat ho
=| anil
itm
elle Hal
As k gets larger, (.5)* gets smaller. In the limit it disappears completely. That limit is
eos
oR
A
aL
I
=1/)/0
ORT
1 ey Decca:
O]| 4 —6]1.4 “41°
The limit has the eigenvector x; in both columns. We saw this steady state in the last
section. Now we see it more quickly from powers like A! = §A!5s-!,
Eigenvalues of AB and A+ B
The first guess about the eigenvalues of AB is not true. An eigenvalue 4 of A times
an eigenvalue 6 of B usually does not give an eigenvalue of AB. It is very tempting
to think it should. Here is a false proof:
ABx = ABx = BAx = BAx.
(4)
It seems that 6 times A is an eigenvalue. When x is an eigenvector for A and B, this
proof is correct. The mistake is to expect that A and B automatically share the same
eigenvector x. Usually they don’t. Eigenvectors of A are not generally eigenvectors of B.
A and B can have all zero eigenvalues, while
5
Onl
|
and
ae
ORO
k ak
then
| is an eigenvalue of AB:
LO
aB=(j |
and
Ome
a+B=|\
ale
For the same reason, the eigenvalues of A+ B are generally not A+. Here +f =0
while A + B has eigenvalues 1 and —1. (At least they add to zero.)
The false proof suggests what is true. Suppose x really is an eigenvector for both
A and B. Then we do have ABx = ABx. Sometimes all n eigenvectors are shared, and
we can multiply eigenvalues. The test for A and B to share eigenvectors is important
in quantum mechanics—time out to mention this application of linear algebra:
6F Commuting matrices share eigenvectors Suppose A and B can each be diagonalized. They share the same eigenvector matrix S if and only if AB = BA.
The uncertainty principle
In quantum mechanics, the position matrix P and the mo-
mentum matrix Q do not commute. In fact QP —PQ =
(these are infinite matrices).
Then we cannot have Px = 0 at the same time as Ox = 0 (unless x = 0). If we knew
the position exactly, we could not also know the momentum exactly. Problem 32 derives Heisenberg’s uncertainty principle from the Schwarz inequality.
262
Chapter 6
Eigenvalues and Eigenvectors
Fibonacci
Numbers
We present a famous example, which leads to powers of matrices. The Fibonacci numbers
start with Fo = 0 and F; = 1. Then every new F is the sum of the two previous F’s:
The sequence
0,1, 1,2,3,5,8,13,...
comes from
= Fy42 = Fruit + Fx.
These numbers turn up in a fantastic variety of applications. Plants and trees grow
in a spiral pattern, and a pear tree has 8 growths for every 3 turns. For a willow
those numbers can be 13 and 5. The champion is a sunflower of Daniel O’Connell,
which had 233 seeds in 144 loops. Those are the Fibonacci numbers Fj3 and Fj2. Our
problem is more basic.
Problem: Find the Fibonacci number F99 The slow way is to apply the rule Fy42 =
Fu4j + Fe one step at a time.
By adding
Fe =
8 to F7 =
13 we
reach
Fg =
21.
Eventually we come to Figg. Linear algebra gives a better way.
The key is to begin with a matrix equation uz4; = Aux. That is a one-step
rule for vectors, while Fibonacci gave a two-step rule for scalars. We match them by
putting Fibonacci numbers into the vectors:
F;
Let uz, = eal The mig
— &
+ fF
9
7
Frat = Fri
|
“Becomes
Uki1 = é Aluz. 4%(5)
a(t} [i [i EB) 6
Every step multiplies by A = ( al After 100 steps we reach u199 =
Al uo:
The Fibonacci numbers come from the powers of A—and we can compute A! without multiplying 100 matrices.
This problem is just right for eigenvalues. Subtract 2 from the diagonal of A:
AxAL=
yes eh leadsto det(A —AJ) =4° -A-1.
The eigenvalues solve the equation A* —
formula (—b + Vb? — 4ac) /2a:
A — 1 = 0. They come from the quadratic
_ 145
hs
yee?
618.
These eigenvalues A; and Az lead to eigenvectors x; and x2. This completes step 1:
1—
ee
1
m
0
| eee
ah
we
pra
lea
P71 ael[=|=[o] mee = =[7].
l—dpo
1
esta
% |AD
6.2
Diagonalizing a Matrix
263
Step 2 finds the combination of those eigenvectors that gives ug = (1, 0):
es
l
dM
2
cell
asi) 90)
i)
o
See
ees
MS
The final step multiplies uo by A! to find uj99. The eigenvectors stay separate! They
are multiplied by (A)?
and (2)!:
“100 =
(A) 1x1 — (Az)!x2
MAKE
(7)
We want Fio9 = second component of uwjo9. The second components of x; and x2 are
1. Substitute the numbers A; and Az into equation (7), to find A; —A2 = V5 and F\00:
\
Fw
=
14 J5\10
Je( 5
)
7p _ 5 100
-(
5
)
~ 3.54-10°°.
(8)
Is this a whole number? Yes. The fractions and square roots must disappear, because
Fibonacci’s rule Fy42 = Fey, + Fe stays with integers. The second term in (8) is less
than 5, so it must move the first term to the nearest whole number:
kth Fibonacci
number
= _ nearest integer to
lets
(4°)
J5
2
The ratio of Fg to Fs is 8/5 = 1.6. The ratio Fj9;/Fi09 must
(1 + V5) /2. The Greeks called this number the “golden mean.”
\
:
(9)
be very close to
For some reason
a rectangle with sides 1.618 and 1 looks especially graceful.
Matrix Powers A‘
Fibonacci’s example is a typical difference equation uz; = Aux. Each step multiplies
by A. The solution is uy = A*ug. We want to make clear how diagonalizing the matrix
gives a quick way to compute AS
The eigenvector matrix S produces A = SAS ~! This is a factorization of the
matrix, like A = LU or A = QR. The new factorization is perfectly suited to computing powers, because every time S-! multiplies S we get I:
A?
Ae 2S
vais
Ay)Weta IPS
ore ASS
Ss
A Ss
The eigenvector matrix for AK is still S, and the eigenvalue matrix is Ak. We knew
that. The eigenvectors don’t change, and the eigenvalues are taken to the kth power.
When A is diagonalized, Akuy is easy to compute:
264
Chapter 6
Eigenvalues and Eigenvectors
1.
Find the eigenvalues of A and n independent eigenvectors.
2.
Write uo as a combination c)x; +---+CnX,
3.
Multiply each eigenvector x; by (A;)*. Then
of the eigenvectors.
up = Akug = c1(A1)*x1 +--+ + n(n)" Xn.
(10)
In matrix language A* is (SAS~!)* which is S$ times A* times S~!.
guage, the eigenvectors in S$ lead to the c’s:
In vector lan-
C|
UN = CX,
s+
+ CnXm =|
X1
-+*
This says that
es
Xn
= Sc.
uo
Cn
The coefficients c},..., Cn in Step 2 are
c = S~'uo.
Then Step 3 multiplies by ee
The combination uz = >> cj (A;)*x; is all in the matrices § and A
and § 7:
(a1)
Boe = SA*S
“uo =SA*e=|
xj
~~: _
C]
4
eee (A)
Oy
Cn
This result is exactly uz, = c1(A1)* xy fore ep 6 0G, ee It solves uzs1 = Aug.
Example 3 Compute A* when S and A and S~! contain whole numbers:
It
Aa;
il
|
has
A; =1
and
i
“=|,
ho
and:
1
2S.)
A is triangular, with 1 and 2 on the diagonal. A* is also triangular, with 1 and 2* on
the diagonal. Those numbers stay separate in A‘. They are combined in A‘:
k
k
1
With k =1 we get A. With k =0 we get I. With k = —1 we get A~!.
Note The zeroth power of every nonsingular matrix is A? = 7. The product SA°S7!
becomes SIJS~! which is J. Every A to the zeroth power is 1. But the rule breaks
down when A = 0. Then 0° is not determined. We don’t know A®° when A is singular.
6.2
Diagonalizing a Matrix
265
Nondiagonalizable Matrices (Optional)
Suppose i is an eigenvalue of A. We discover that fact in two ways:
1.
Eigenvectors (geometric)
2.
Eigenvalues (algebraic)
There are nonzero solutions to Ax = Ax.
The determinant of A — AJ is zero.
The number A may be a simple eigenvalue or a multiple eigenvalue, and we want to
know its multiplicity. Most eigenvalues have multiplicity M = | (simple eigenvalues).
Then there is a single line of eigenvectors, and det(A — AJ) does not have a double
factor. For exceptional matrices, an eigenvalue can be repeated. Then there are two
different ways to count its multiplicity:
1.
pa
(Geometric Multiplicity
=GM)
Count the independent eigenvectors for 4. This
is the dimension of the nullspace of A — AJ.
(Algebraic Multiplicity = AM)
Count the repetitions of A among the eigenvalues.
Look at the n roots of det(A — AJ) = 0.
The following matrix A is the standard example of trouble. Its eigenvalue A = 0
is repeated.
It is a double eigenvalue
(AM = 2) with only one eigenvector (GM = 1).
The geometric multiplicity can be below the algebraic multiplicity—it is never larger:
nell
0
4
0
has
det(A
A
— 21) =|
Q-
1
2
Ee
=X
There “should” be two eigenvectors, because the equation A* = 0 has a double root.
The double factor A? makes AM = 2. But there is only one eigenvector. x4 == 1,70).
This shortage of eigenvectors when GM < AM means that A is not diagonalizable.
The vector called “repeats” in the Teaching Code eigval gives the algebraic mul-
tiplicity AM for each eigenvalue. When repeats = [1 1... 1] we know that the n
eigenvalues are all different. A is certainly diagonalizable in that case. The sum of all
components in “repeats” is always n, because the nth degree equation det(A —A/) = 0
always has n roots (counting repetitions).
The diagonal matrix D in the Teaching Code eigvec gives the geometric multiplicity GM for each eigenvalue. This counts the independent eigenvectors. The total
number of independent eigenvectors might be less than n. The n by n matrix A is
diagonalizable if and only if this total number is n.
We have to emphasize: There is nothing special about A = 0. It makes for easy
computations, but these three matrices also have the same shortage of eigenvectors.
Their repeated eigenvalue is A = 5:
oma
A=\> |
and
eI
6
—-l
]
and
ey
ee
Fe
a
Those all have det(A — A/) = (A — 5)?. The algebraic multiplicity is AM = 2. But
A — 5/ has rank r = |. The geometric multiplicity is GM = 1. There is only one
eigenvector, and these matrices are not diagonalizable.
266 = Chapter 6
Eigenvalues and Eigenvectors
=»
1.
REVIEW OF THE KEY IDEAS
=&
If A has n independent eigenvectors (they go into the columns of S), then S TAS
is diagonal: S-1AS=A
and
A=SAS™!,.
2.
The powers of A are AX = SA*¥S—!,
3.
The eigenvalues of A* are (A1)*; 02° AGEs
4.
The solution to uz4; = Au, starting from uo is up = Akug =
uy = c1(A1)*x) +.++4 eax,
5.
provided
SA‘ S~'uo:
wp = cx, + +++ + CnXn.
A is diagonalizable if every eigenvalue has enough eigenvectors (GM=AM).
Problem Set 6.2
Questions 1-8 are about the eigenvalue and eigenvector matrices.
1
Factor these two matrices
into
oe
|
SAS7!:
and
W)C
jrandA
=
2
A=SAS~
3
If A has A; = 2 with eigenvector x; =
SAS~!
4
thaw’=C
A=
2
A
6
7,
ie
Al
(i) nC:
ee and Az =
5 with x2 =
Eat use
to find A. No other matrix has the same A’s and x’s.
Suppose
A = SAS~!.
eigenvector matrix?
5
a
True or false:
then
What is the eigenvalue matrix for A + 2/? What is the
Check that
If the columns
A+27=(
)(_
)(
Vou
of S (eigenvectors of A) are linearly independent,
(a)
A
is invertible
(b)
A is diagonalizable
(c)
S is invertible
(d)
S is diagonalizable.
If the eigenvectors of A are the columns of /, then A is a
matrix.
eigenvector matrix S$ is triangular, then S~! is triangular.
triangular.
Prove that A is also
Describe all matrices § that diagonalize this matrix A:
ie F4 0>||
Then describe all matrices that diagonalize Aq.
8
Write down the most general matrix that has eigenvectors ea and lege
If the
6.2.
Diagonalizing a Matrix
267
Questions 9-14 are about Fibonacci and Gibonacci numbers.
9
For the Fibonacci matrix A = li ale compute A? and A> and A*. Then use the
text and a calculator to find Fy.
10
Suppose each number G;+2 is the average of the two previous numbers
and G,. Then Gx42 = 5(Gest sr Ge
Gus2 = Gus + 3Gx
ie
ad = | hi
[el
Gry
Gx
Gea = Ges
11
(a)
Find the eigenvalues and eigenvectors of A.
(b)
Find the limit as n > oo of the matrices A” = SA”"S7!.
(c)
If Go =O
and G; = | show that the Gibonacci
numbers
approach
Gx+1
WIND
Diagonalize the Fibonacci matrix by completing S~!:
Dh
ge WRAL Ao
Oe
leeieee
At
O
0
A2
:
Do the multiplication SA‘ S aleS — to find its second component.
This is the kth
Fibonacci number Fy = (Af — 45) /(A1 — 22).
12
The numbers hi and Ae satisfy the Fibonacci rule Fyi2 = Fey, + Fr:
ARP Se Oe Pap hee ead
oA
hee
Prove this by using the original equation for the A’s. Then any combination of
ne and aS satisfies the rule. The combination F, = (at = EW ACS — 2) gives
the right start Fo = 0 and F; = 1.
13
Suppose Fibonacci had started with Fo = 2 and Fj = 1. The rule Fyy2
Fy11 + Fx is the same so the matrix A is the same. Its eigenvectors add to
=
mtx =feel eelee) Se ete
After 20 steps the second component of Ae
that number F 0.
14
Ea
ise)
Prove that every third Fibonacci number in 0, 1, 1, 2,3,...
ee
is even.
Compute
268
Chapter 6
Eigenvalues and Eigenvectors
Questions 15-18 are about diagonalizability.
15
True or false: If the eigenvalues of A are 2,2,5 then the matrix is certainly
(a)
16
(b)
diagonalizable
(c)
not diagonalizable.
True or false: If the only eigenvectors of A are multiples of (1,4) then A has
(a)
Ww
invertible
noinverse
(b)
arepeatedeigenvalue
(c)
nodiagonalization SASa.
Complete these matrices so that det A = 25. Then check that A = 5 is repeated—
the determinant of
A—A/ is (A—5)?. Find an eigenvector with Ax = 5x. These
matrices will not be diagonalizable because there is no second line of eigenvectors.
8
9 4
ONS
a=|
>| and
a=|
i and
seis
18
The matrix A = iF A) is not diagonalizable because the rank of A—3/ is
Change one entry by .01 to make A diagonalizable. Which entries could you
change?
Questions 19-23 are about powers of matrices.
19
Ak = SA*S~—! approaches the zero matrix as k > oo if and only if every A has
absolute value less than
. Which of these matrices has A‘ > 0?
6 A
sve | E
20
gue
(Recommended) Find A and S to diagonalize A in Problem 19. What is the limit
of A‘ as k — 00? What is the limit of SA‘S~!?
matrix you see the
21
Find A and S to diagonalize B in Problem
wo=(5|
22
and
w=
In the columns of this limiting
19. What is B!°u9 for these uo?
(3
and
wo=(5].
Diagonalize A and compute SA*‘S~! to prove this formula for A*:
at
ie
a=|
23
6 9
S Al
ns
has
Diagonalize B and compute SA*S~!
£1801
P=,
|
has
A
lel
=a
es
eee
;
to prove this formula for B*:
yenloy
Ba,
po"
ak
:
6.2.
Questions 24-29 are new applications of
Diagonalizing a Matrix
269
A = SAS™!.
24
Suppose that A = SAS~!. Take determinants to prove that det A = A}A2---Ay, =
product of A’s. This quick proof only works when A is
25
Show that trace AB = trace BA, by adding the diagonal entries of AB and BA:
a=|? A
ed
Choose
A as S and B as AS~!.
and
Then
a=|4 al
sree
SAS~!
has the same
trace as AS~!S.
The trace of A equals the trace of A which is
26
AB — BA =I
is impossible since the left side has trace =
elimination matrix so that
AB
Ia)
If
A= SAS—!,
BA
. But find an
A= E and B = E! give
—-1
0
0
1
:
which has trace zero.
diagonalize the block matrix
B =
ies al
Find its eigenvalue
and eigenvector matrices.
28
Consider all 4 by 4 matrices A that are diagonalized by the same fixed eigenvector matrix S. Show that the A’s form a subspace (cA and A; + A2 have this
same S$). What is this subspace when S = /? What is its dimension?
29
Suppose A* = A. On the left side A multiplies each column of A. Which of
our four subspaces contains eigenvectors with A = 1? Which subspace contains
eigenvectors with A = 0? From the dimensions of those subspaces, A has a full
set of independent eigenvectors and can be diagonalized.
30
(Recommended) Suppose Ax = Ax. If A = 0 then x is in the nullspace. If A 4 0
then x is in the column space. Those spaces have dimensions (n —r)
+r =n.
So why doesn’t every square matrix have n linearly independent eigenvectors?
31
The eigenvalues of A are | and 9, the eigenvalues of B are —1 and 9:
5
a=|j
a!
|
and
ae
ah
alk
Find a matrix square root of A from R = SA S~!. Why is there no real matrix
square root of B?
32
(Heisenberg’s Uncertainty Principle) AB — BA
= 7 can happen for infinite
matrices with A = A? and B = —B!. Then
x'x =x! ABx —x'BAx < 2||Ax|| ||Bx||.
Explain that last step by using the Schwarz inequality. Then the inequality says
that ||Ax||/||x|] times ||Bx||/||x|| 1s at least 7 It is impossible to get the position
error and momentum error both very small.
270
33
Chapter 6
Eigenvalues and Eigenvectors
If A and B have the same A’s with the same independent eigenvectors, their factorizations into
34
are the same.
So A = B.
Suppose the same S diagonalizes both A and B, so that
A = SA,S~!
and B =
SA2S~'. Prove that AB = BA.
35
If A = SAS~! show why the product (A—A,/)(A—Az2/)---(A—A,J) is the zero
matrix. The Cayley-Hamilton Theorem says that this product is always zero.
We are substituting A for the number A in the polynomial det(A — AJ).
36
The matrix A = [734] has det(A — A/) = A? — 1. Show from Problem 35 that
A? — I =0.
37
Deduce that A~! = A and check that this is correct.
(a)
When do the eigenvectors for A = 0 span the nullspace N(A)?
(b)
When do all the eigenvectors for 4 #0 span the column space C(A)?
6.3
Applications to Differential Equations
APPLICATIONS TO DIFFERENTIAL
271
EQUATIONS w» 6.3
Eigenvalues and eigenvectors and A = SAS™! are perfect for matrix powers A‘. They
are also perfect for differential equations. This section is mostly linear algebra, but to
read it you need one fact from calculus: The derivative of e™ is Xe. It helps to know
what e is, but I am not even sure that is essential.
The whole point of the section is
this: Zo convert differential equations into linear algebra.
The equation du/dt = u is solved by u = e’. The equation du/dt = 4u is solved
by u =e*. The most basic equations have exponential solutions:
du
Gee hu
,
has the solutions
.
u(t) = Ce’.
(1)
The number C turns up on both sides of du/dt = du and cancels out.
At t = 0 the solution reduces to C (because e® = 1). By choosing C = u(0), we
match the “initial condition.” The solution that starts from u(0) at t = 0 is u(O)e*".
We just solved a 1 by 1 problem. Linear algebra moves to n by n. The unknown
is a vector u (now boldface). It starts from the initial vector u(0), which is given. The
n equations contain a square matrix A:
Problem
Solve
d
= = Au _ starting from the vector
u(0)
at
t=O.
(2)
This system of differential equations is /inear. If u(t) and v(t) are solutions, so is Cu(t) +
Dv(t). We will need n freely available constants to match the n components of u(0). Our
first job is to find n “pure exponential solutions” to the equation du/dt = Au.
Notice that A is a constant matrix. In other linear equations, A changes as f
changes. In nonlinear equations, A changes as u changes. We don’t have either of
these difficulties. Equation (2) is “linear with constant coefficients.’ Those and only
those are the differential equations that we will convert directly to linear algebra. The
main point will be: Solve linear constant coefficient equations by exponentials e™ x.
Solution of du/dt = Au
Our pure exponential solution will be e* times a fixed vector x. You may guess that
i is an eigenvalue
of A, and x is the eigenvector.
Substitute
u(t) =
e*'x into the
equation du/dt = Au to prove you are right (the factor e*™ will cancel):
Au = Ae“ x agrees with
d
= =he™x
provided
Ax =Ax.
(3)
All components of this special solution u = ex share the same e*!. The solution
grows when 24 > 0. It decays when A < 0. In general A can be a complex number. Then the real part of A decides growth or decay, while the imaginary part gives
oscillation like a sine wave.
272
Chapter 6
Eigenvalues and Eigenvectors
Example 1 Solve ie = Ale if i|u starting from u(0) = [3].
This is a vector equation for uw. It contains two scalar equations for the components y
and z. They are “coupled together” because the matrix is not diagonal:
d
y
aKa
Oa
y
alle)
means that
dy
_
prea:
and
Gin
wpa oe
The idea of eigenvectors is to combine those equations in a way that gets back to
1 by 1 problems. The combinations y + z and y — z will do it:
d
i; (y+z=zt+y
The combination
y + z grows
d
an
rr (y — z)
like e’, because
it has
(y — Zz)
4 =
1. The combination
y — z
decays like e~', because it has 4 = —1. Here is the point: We don’t have to juggle the original equations du/dt = Au, looking for these special combinations. The
eigenvectors and eigenvalues do it for us.
This matrix A has eigenvalues | and —1. Here are two eigenvectors:
Pollil=L] s(t ofLal-[7}
The pure exponential solutions u; and w2 take the form e”’x:
wie
Notice:
hitruw
=e
r| 11
These u’s are eigenvectors.
and
u(t)
hat Xy = € Sahee
ee
=e"
(4)
They satisfy Au; = u; and Auz = —uz2, just like
x, and x2. The factors e’ and e~' change with time. Those factors give du,/dt = uy;
and du2/dt = —uz. We have two solutions to du/dt = Au. To find all other solutions,
multiply those special solutions by any C and D and add them:
f
General solution
u(t) =epe
Ce
1
A + De
zAlieeel
t
Ce’ + De
ea == la " ne) :
(5)
With these constants C and D, we can match any starting vector u(0). Set t = 0 and
e° = 1. The problem asked for u(0) = (4, 2):
cf3]+2(_i]=[3|
yields C=3
and D=1.
With C =3 and D = 1 in the solution (5), the initial value problem is solved.
We summarize the steps. The same three steps that solved uz; = Au, in the
last section now solve du/dt = Au. The matrix powers A‘ led to AX. The differential
equation leads to e*':
6.3
Applications to Differential Equations
1.
Find the eigenvalues and n independent eigenvectors of A.
2.
Write u(0) as a combination cx; +---+cCyXn
3.
Multiply each eigenvector x; by e*‘". Then u(t) is the combination
u(t) = cye*'xy tes
Example 2 Solve du/dt = Au
»
i
Fe =|0
a
ok
2
TG ae
273
of the eigenvectors.
+ ene xy.
(6)
knowing the eigenvalues A = 1, 2,3 of A:
od
1ja
6
starting from
“(0)=)'5
4
Step 1 The eigenvectors are x; = (1,0, 0) and x2 = (1, 1,0) and x3 = (1, 1, 1).
Step 2 The vector u(0) = (6,5,
4) is xj
+ x2 + 4x3. Thus (c), c2, c3) = (1, 1, 4).
Step 3 The pure exponential solutions are ex, and e*’x2 and e*x3.
The combination that starts from u(0) is u(t) = e'x; + e7!x7 + 4e%'x3.
The coefficients 1, 1,4 came from solving the linear equation c)x,-+c2x2+c3x3
Gy
se,
Be)
ave
= u(0):
4
6x || = | S
C3
6
which is
Se = u(0).
(7)
You now have the basic idea—how to solve du/dt = Au. The rest of this section
goes further. We solve equations that contain second derivatives, because they arise so
often in applications. We also decide whether u(t) approaches zero or blows up or
just oscillates. At the end comes the matrix exponential e“'. Then e“‘u(0) solves the
equation du/dt = Au in the same way that A*ug solves the equation uz.; = Auk.
All these further steps use the A’s and x’s. With extra time, this section makes a strong
connection to the whole topic of differential equations. It solves the constant coefficient
problems that turn into linear algebra. Use this section to clarify these simplest but most
important differential equations—whose solution is completely based on e”’.
Second Order Equations
Start with the equation y” + by’ + ky = 0. The unknown y(t) is just a scalar, not
a vector. The notations y” and y’ mean d*y/dt* and dy/dt. This is a second-order
equation because it contains the second derivative y”. It is still linear with constant
coefficients b and k.
274
Chapter 6
Eigenvalues and Eigenvectors
In a differential equations course, the method of solution is to substitute
y = eM,
Each derivative brings down a factor 4. We want y = e* to solve the equation:
d?
d
=
2
pect thy
Ge tog,
=07
+ bh + We ie=0.
mare
(8)
Everything depends on A* + bA +k = 0. This equation for 4 has two roots A; and
Az. Then the equation for y has two pure solutions yj = e”!’ and y2 = e*2! Their
combinations c,y; + c2y2 give all other solutions. This complete solution has two free
constants c; and c2 because the equation contains a second derivative.
In a linear algebra course we expect matrices and eigenvalues. Therefore we turn
the scalar equation into a vector equation. The unknown vector u has components y
and y’. The equation is du/dt = Au:
Gy Sas
dt
dy’
=
y
dly
Orit
Talolity.
converts
rts to
to —= >| = , ep A :
9
(9)
ve = —ky y —— by’by
The first equation dy/dt = y’ is trivial (but true). The second equation connects y” to
y’ and y. Together the equations connect wu’ to u. So we can solve that vector equation
by eigenvalues:
—A
i
1
:
al has determinant
2
A“ + bA+k =0.
The equation for the i’s is the same! It is still A7+bX+k =0. The two roots A; and
Az are now eigenvalues of A. The eigenvectors and the complete solution are
o=[i] [i] so=oer[t]-or[t]
In the first component of u(t), you see y = cje*!! +.c7e*2'_the same solution as before.
It can’t be anything else. In the second component of u(t) you see dy/dt. The vector
problem is completely consistent with the scalar problem.
Note 1 In linear algebra the serious danger is a shortage of eigenvectors.
Our eigen-
vectors (1,A,) and (1,A2) are the same if Aj = Az. Then we can’t diagonalize A. In
this case we don’t yet have a complete solution to du/dt = Au.
In differential equations the danger is also a repeated A. The pure solutions y =
e*!’Ait and y = e?2' are the same if A; = Az. Again the problem is not yet solved. A
second solution has to be found, and it turns out to be y = te”.
This “impure” solution (with the extra f) appears also in the vector problem.
A diagonal matrix A = S~!AS cannot be reached when there is only one eigenvector.
Then A is replaced by the Jordan matrix J. The last section of this chapter describes
J. The exponential of J produces that extra factor tf.
6.3
Applications to Differential Equations
275
Note 2 In engineering and physics, the second derivative enters from Newton’s Law
F = ma. The acceleration a = y” is multiplied by the mass m. The true equation
is my” + by'+ky = f. (The force F in Newton’s Law has three parts, an external
force f and a damping force by’ and an internal force ky.) This section divides by m,
which brings the left side into the standard form starting with y”.
Note 3 Real engineering and real physics deal with systems (not just a single mass at
one point). The unknown y is a vector in the first place. The coefficient m is a mass
matrix M, not a number. The coefficient k is a stiffness matrix K. The coefficient of y’
is a damping matrix which might be zero. In that case the real differential equation is
My"”+Ky = f. To bring back the standard form with y”, multiply through by M7!.
Eventually the solution involves the eigenvalues of M~'K which is a major part of
computational mechanics.
Example
p 3 Solve y” + 4y’ + 3y =0
by substituting e*’ and also by linear algebra.
Solution Substituting y = e*’ yields (A* + 44 + 3)e*! = 0. The quadratic equation is
42 4+404+3 =0.
It factors into (A+1)(A+3)
=0.
Therefore 4; = —1 and Az = —3.
The pure solutions are yj = e~’ and y2 = e~*’. The complete solution is cyy;+ c2y2
and it approaches zero as f increases.
To use linear algebra we set u = (y, y’). This leads to a vector equation:
dy/dt = y'
dy'/dt = —3y —4y’
rts tto
converts
ae =
wp
ee:
|;
u.
Find the eigenvalues from the determinant of A — AJ:
rv
1
An ai=|73 eal [= +4.4+3=0
The equation is the same and the A’s are the same and the solution is the same.
constant coefficients and pure exponentials, calculus goes back to algebra.
With
Stability of 2 by 2 Matrices
For the solution of du/dt = Au, there is a fundamental question. Does the solution
approach u = 0 as t — o? Is the problem stable? Example 3 was certainly stable,
because both pure solutions e~’ and e~*’ approach zero. Stability depends on eigenvalues, and the eigenvalues depend on A.
The complete solution u(t) is built from pure solutions ex. If the eigenvalue
is real, we know exactly when e”! approaches zero: The number X must be negative.
If the eigenvalue is a complex number A = a +ib, its exponential e™ splits into eel,
The second factor e!”' has absolute value fixed at 1:
,
eft — cosbt
vi,
+isinbt
has
bt
12
2
wie
|e!?'|? = cos” bt + sin? bt = 1.
276
Chapter 6
Eigenvalues and Eigenvectors
determinant D
A
‘
3
both ReA<0|
both ReA>0
stable
unstable
/
RS
74
NX
“6
7
bothh<0
stable
:
\4~
a andTe=
2s
pe?
27
7
both A>0
unstable
rn
IE
A, <Oand X, > 0: unstable
Figure 6.3.
A 2 by 2 matrix is stable (u(t)
> 0) when T < 0 and D>
0.
The other factor e“’ controls growth (instability) or decay (stability). For stability, the
real part a of every A must be negative. Then every e*! approaches zero.
The question is: Which matrices have negative eigenvalues? More accurately,
when are the real parts of the i’s all negative? 2 by 2 matrices allow a clear answer.
6G Stability The matrix A = [25] is stable and u(t) approaches zero when two
conditions hold:
The trace
T=a+d _ must be negative.
The determinant
D=ad-—bc_
must be positive.
Reason If the A’s are real and negative, their sum is negative. This is the trace T.
Their product is positive. This is the determinant D. The argument also goes in the
reverse
T =,
direction. If D = Aj,A2 is positive, then 4; and Az have the same sign. If
+ Az is negative, that sign must be negative.
When the ’s are complex numbers, they must have the form a+ib and a — ib.
Otherwise T and D will not be real. The determinant D is automatically positive, since
(a +ib)(a —ib) = a* +b’. The trace T isa+ib+a—ib=2a. Soa negative trace
means that the real part a is negative and the matrix is stable. Q.E.D.
Figure 6.3 shows the parabola T*? = 4D which separates real from complex
eigenvalues. Solving A* — TA + D = 0 leads to /T2 — 4D. This is real below the
parabola and imaginary above it. The stable region is the upper left quarter of the
figure—where T < 0 and D > 0.
Example 4 Which of these matrices is stable?
Ah
sl te me ee
Ot
Ava
5
=
18
6.3
Applications to Differential Equations
277
The Exponential of a Matrix
We return briefly to write the solution u(t) in a new form e4‘u(0). This gives a perfect
parallel with A‘uo in the previous section. First we have to say what e4’ means.
The matrix e“’ has a matrix in the exponent. To define e4', copy e*. The direct
definition of e* is by the infinite series 1 +x + 5x? Ss zx? +--+.
When you substitute
the matrix Af for x, you get a matrix e4’:
Definition of the matrix exponential
eM = 1+ Att 4(At) + Z(At +.-.
(10)
The number that divides (At)” is “n factorial.” This is n! = (1)(2)---(n—1)(n).
The factorials after 1,2,6 are 4! = 24 and 5! = 120. They grow quickly. The series
always converges and its derivative is always Ae“! (Problem 17). Therefore e4‘u(0)
solves the differential equation with one quick formula—even if there is a shortage of
eigenvectors.
This chapter emphasizes how to get to the answer by diagonalization. Assume A
is diagonalizable, and substitute
A = SAS! into the series. Whenever SAS~'SAS7!
appears, cancel S~!S in the middle:
e“ = 1+ SAS™'t + 4(SAS™!t)(SAS7't) +--= SUI + At+5(At)? +---JS7!
=
(11)
Sees.
That equation says: e4’ equals Se“'S~!. To compute e“’, compute the A’s as usual.
Then A is a diagonal matrix and so is e4’—the numbers e*'' are on its diagonal.
Multiply Se'S—!u(0) to recognize the new solution u(t) =
solution in terms of eigenvalues and eigenvectors:
e4'u(0).
ert
EO
Sen Sar BO) =,
es
lage,
It is the old
Cl
<
al rain 6 3
Ant
en
Cn
S contains the eigenvectors x;. A combination c)x| ++-+-+CnX, of its columns is Sc.
This matches the starting value when u(0) = Sc. The column c = S~'u(O) at the end
of equation (12) brings back the best form of the solution:
cyert
ANG) y=
x]
aN
he
=
A
== Ge
‘A;
Ue ee So
A
AS a
cnern!
This is e“‘u(0). It is the same answer that came from our three steps:
i
(13)
278
Chapter 6
Eigenvalues and Eigenvectors
1.
‘Find the 2’s and x’s.
2
Write u(O) = cyxy +---+CpXp-
3.
Multiply each x; by e*’'. Then the solution is
u(t) = cye*"’x, +--+»
+ ene?" Xp.
Example 5 Find e4! for
(14)
A=[_92]. Notice that A* = /:
Ll eta ef] ef
A>, A°, A’, A® will repeat these four matrices. The top right corner has 1,0, —1,0
repeating over and over. The infinite series for e4’ contains t/1!, 0, —1>/3!,0. In other
words t — t?/6 is the start of that corner:
Pare Han? + hans += |
1 —
1;2
#1
ane
ere
ap,
os
The series on the left is e4’. The top row of the matrix shows the series for cost and
sint. We have found e“’ directly:
ae
Lat
a
sint
cost
(15)
At t = 0 this gives e? = J. Most important, the derivative of e4 is Ae4’:
d Ab tcosttae’
Sings)
ei
essiniey
"Cost
d¢i—sint)
cost
|)
i=cose
sing)!”
er | Opal
|=
20:1)
cost
sint
|=sint’
eos
nl”
A is a skew-symmetric matrix (A! = —A). Its exponential e4’ is an orthogonal matrix.
The eigenvalues of A are i and —i. The eigenvalues of e“! are e!’ and e~''. This
illustrates two general rules:
1
The eigenvalues of e4' are e™.
Z
When A is skew-symmetric,
Example 6 Solve ue =A
e*'At is orthogonal.
= [istu starting from u(0) = Ea t=O,
Solution The eigenvalues 1 and 2 are on the diagonal of A (since A is triangular).
The eigenvectors are (1,0) and (1, 1):
fo a}lo)=[o)
[fo allil=2Li}
6.3.
Applications to Differential Equations
Step 2 writes u(0) as a combination x; + x2 of these eigenvectors.
279
Remember:
This
is Sc = u(O). In this case cj = c7 = 1. Then the solution is the same combination
of
pure exponential solutions:
pete
Wee
7
0
dot
That is the clearest solution e*!'x;+e*2'x7.
ae 2t
|
|:
In matrix form, the eigenvectors go into S:
t
=
ip
De
u(t) = Se S—!u(0) is f NE elt ‘| uo) =| a
[WO
That last matrix is e4'. It’s not bad to see what a matrix exponential looks like (this
is a particularly nice one). The situation is the same as for Ax = b and inverses. We
don’t really need A~!
to find x, and we don’t need e“! to solve du/dt = Au.
But as
quick formulas for the answers, A~'b and e“‘u(0) are unbeatable.
=»
REVIEW OF THE KEY IDEAS
#8
1.
The equation u’ = Au is linear with constant coefficients.
2.
Its solution is usually a combination of exponentials, involving each A and x:
u(t) = cye"A\t x, +--+ cne*"' An xp.
3.
The constants c,...,¢,
4.
The solution approaches zero (stability) if Real part of A <0 for every 2.
5.
The solution is always u(t) = e“'u(0), with the matrix exponential e4’.
6.
are determined by w(O) = cjx] +--- + cCpXp.
Equations involving y” can be reduced to u’ = Au, by combining y’ and y into
u=(y’, y).
Problem Set 6.3
1
Find i’s and x’s so that u = e My solves
du
Ae
3
>isbaad Oana lil aes
What combination
u = cje*!"x; + c2e*2'x» starts from u(0) = (5, —2)?
280
Chapter 6
Eigenvalues and Eigenvectors
2
Solve Problem
| for u = (y, z) by back substitution:
dz
First solve Fina z starting from z(0) = —2.
dy
:
Then solve tes 4y + 3z starting from y(0) = 5.
The solution for y will be a combination of e* and e’.
Find A to change the scalar equation y” = 5y’ + 4y into a vector equation for
u=(y,y):
oe
se
>
What are the eigenvalues of A?
y’
i
:
Find them also by substituting
y =
e t into
y” = 5y' +4y.
The rabbit and wolf populations show fast growth of rabbits (from 6r) but loss
to wolves (from —2w):
dr
—
dt
= 6r —2w
and
Find the eigenvalues and eigenvectors.
dw
—
dt
=2r+w.
If r(0) = w(0) = 30 what are the popu| to 2 or
lations at time ¢? After a long time, is the ratio of rabbits to wolves
its) att
toy 1%
A door is opened between rooms that hold v(0) = 30 people and w(0) = 10
people. The movement between rooms is proportional to the difference v — w:
dv
—_ =w-v
dt
and
dw
—_ =vU—-w.
dt
Show that the total v+w is constant (40 people). Find the matrix in du/dt = Au
and its eigenvalues and eigenvectors. What are v and w at t = 1?
Reverse the diffusion of people in Problem 5 to du/dt = —Au:
dv
—_ =v-—w
dt
and
The total v + w still remains constant.
changed to —A?
dw
—_ =WwW-v.
dt
How are the A’s changed now that A is
But show that v(t) grows to infinity from v(0) = 30.
The solution to y” = 0 is a straight line
y = C + Dt.
Convert to a matrix
equation:
Cy
Ey>| has the solution
2
-P fe ae)
= i 5
This matrix A cannot be diagonalized.
5A*1? +--+.
Se
| =
SPO)
ae
Find A* and compute e4’ = J + At +
Multiply your e4” times (y(0), y’(0)) to check the straight line
y(t) = yO) + y'(O)r.
6.3.
Applications to Differential Equations
281
Substitute y = e*’ into y” = 6y’ — 9y to show that A = 3 is a repeated root.
This is trouble; we need a second solution. The matrix equation is
alr}=Ls |b"
Show that this matrix has only one line of eigenvectors. Trouble here too. Show
that the second solution is y = te*’.
The matrix in this question is skew-symmetric:
oe
Wine
0
c
—-c
0
b—-—a
The derivative of uy + U3 + U3 is 2uju) + 2u2u, + 2u3u3.
U',, U5, U4 to get zero. Then ju (t) ||? is constant.
(b)
What type of matrix always has e4"u(0)|| = ||u(O)||? When
(a)
Q =e“! isa
___
Write (1,0) as a combination c)x;
Substitute
A is a skew-
matrix.
+ c2x2 of these two eigenvectors of A:
rofls}=E]
11
ui, = cuz — bu3
Uy = a3 = cuy
uy = bu, — au2.
or
(a)
symmetric matrix,
10
—b
alu
0O
Lt olLe=~La)
(b)
The solution to du/dt = Au starting from (1,0) is cyel’xy + coe
Substitute e’’ = cost +isint and e'' =cost —isint to find u(t).
"xy.
(a)
Write down two familiar functions that solve the equation d?y/dt* = —y.
Which one starts with y(0) = 1 and y’/(0) = 0?
(b)
This second-order equation y” = —y produces a vector equation u’ = Au:
peel
u=
ig
ge
dt =
y’
5
i
heal
erie
7 ba
@a
yl
y’
Au
=
Put y(t) from part (a) into u(t) = (y, y’). This solves Problem
iz
12 again.
If A is invertible, which constant vector solves du/dt = Au — b? This vector is
Uparticular, and the solutions to du/dt = AU give Upomogeneous: Find the complete
solution Up, + u;, to
du
13
.
du
2
()
If c is not an eigenvalue
of A, substitute uw = ev
fo)
8
and find v to solve du/dt =
Au —e“'b. This u = e“'v is a particular solution. How does it break down when
c is an eigenvalue?
282
Chapter 6 Eigenvalues and Eigenvectors
14
Find a matrix A to illustrate each of the unstable regions in Figure 6.3:
(a)
A, <O and A2>0
(b)
A, >O0 and Az >0
(c)
Complex A’s with real part a > 0.
Questions 17-25 are about the matrix exponential e4’.
15
Write five terms of the infinite series for e4’. Take the ¢ derivative of each term.
Show that you have four terms of Ae“’. Conclusion: The matrix e“’ solves the
differential equation.
16
The matrix
B =
(i apt has B? =
0. Find e? from a (short) infinite series.
Check that the derivative of e?’ is Be?'.
17 = Starting from u(0) the solution at time T is e4Ty(0).
Go an additional time f to
reach e“'(e47u(0)). This solution at time t + T can also be written as
Conclusion:
e“ times e47 equals
18
Write
19
— 1)A. For
If A? = A show that the infinite series produces e4’ = J + (e'
20
A=[2)¢] in the form SAS™!. Find e4! from Se*'S7!.
A =[11] in Problem 20 this gives e4! =
Generally e4e? is different from e?e4. They are both different from e4+?. Check
this using Problems 20-21 and 18:
A=|9et 4
21.
Write
A = [}4] as SAS~'.
B=|)0 -1i
a+B= iP)ae
Multiply Se“’S~! to find the matrix exponential
e“' when t = 0.
eld bel beef
22
Put
23
Give two reasons why the matrix e“’ At is never singular:
24
A=[/3] into the infinite series to find e“’. First compute A”!
(a)
Write down its inverse.
(b)
Write down its eigenvalues.
If Ax = Ax then e4’x =
Find a solution x(t), y(t) that gets large as t —
scientist exchanged the two equations:
dx/dt=
Ox —4y
dy /dt = —2x + 2y
pee omnes
oo.
x
To avoid this instability a
dy/dt
= —2x
+ 2y
dx/dt =
Ox —4y.
Now the matrix [~? _?] is stable. It has negative eigenvalues. Comment on this.
6.4
Symmetric Matrices
283
SYMMETRIC MATRICES
@ 6.4
The eigenvalues of projection matrices are | and 0. The eigenvalues of reflection matrices are 1 and —1. Now we open up to all other symmetric matrices. It is no exaggeration to say that these are the most important matrices the world will ever see—in
the theory of linear algebra and also in the applications. We will come immediately to
the two basic questions in this subject. Not only the questions, but also the answers.
You can guess the questions. The first is about the eigenvalues. The second is
about the eigenvectors. What is special about Ax = Xx when A is symmetric? In
matrix language, we are looking for special properties of A and S when A = A!.
The diagonalization A = SAS~! should reflect the fact that A is symmetric. We
get some hint by transposing (S~')'AS'. Since A = A! those are the same. Possibly
S~! in the first form equals S$? in the second form. Then S'S = /. We are near the
answers and here they are:
1.
A symmetric matrix has only real eigenvalues.
2.
The eigenvectors can be chosen orthonormal.
Those orthonormal eigenvectors go into the columns of S. There are n of them (independent because they are orthonormal). Every symmetric matrix can be diagonalized.
Its eigenvector matrix S becomes an orthogonal matrix Q. Orthogonal matrices have
Q~! = Q'—what we suspected about S$ is true. To remember it we write Q in place
of S, when we choose orthonormal
eigenvectors.
Why do we use the word “choose”? Because the eigenvectors do not have to
be unit vectors. Their lengths are at our disposal. We will choose unit vectors—
eigenvectors of length one, which are orthonormal and not just orthogonal. Then A =
SAS! is in its special and particular form for symmetric matrices:
6H (Spectral Theorem)
Every symmetric matrix A =
QAQ! with real diagonal A and orthogonal matrix Q:
A=QAQ"'=QaA0'
with
A! has the factorization
Q'!'=aQ!.
It is easy to see that OA Q" is symmetric. Take its transpose. You get (Q')TATQT,
which is QAQ! again. So every matrix of this form is symmetric. The harder part
is to prove that every symmetric matrix has real 4’s and orthonormal x’s. This is the
“spectral theorem” in mathematics and the “principal axis theorem” in geometry and
physics. We approach it in three steps:
284
Chapter 6
Eigenvalues and Eigenvectors
1.
By an example (which proves nothing, except that the spectral theorem might be
true)
2.
By calculating the 2 by 2 case (which convinces most fair-minded people)
3.
By a proof when no eigenvalues are repeated (leaving only real diehards).
The diehards are worried about repeated eigenvalues.
eigenvectors?
Are there still n orthonormal
Yes, there are. They go into the columns of S (which becomes
last page before the problems outlines this fourth and final step.
We now take steps | and 2. In a sense they are optional.
mostly for fun, since it is included in the final n by n case.
Q). The
The 2 by 2 case is
Example 1 Find the λ's and x's when A = [1 2; 2 4] and A − λI = [1−λ 2; 2 4−λ].

Solution The equation det(A − λI) = 0 is λ² − 5λ = 0. The eigenvalues are 0 and 5 (both real). We can see them directly: λ = 0 is an eigenvalue because A is singular, and λ = 5 is the other eigenvalue so that 0 + 5 agrees with 1 + 4. This is the trace down the diagonal of A.

The eigenvectors are (2, −1) and (1, 2)—orthogonal but not yet orthonormal. The eigenvector for λ = 0 is in the nullspace of A. The eigenvector for λ = 5 is in the column space. We ask ourselves, why are the nullspace and column space perpendicular? The Fundamental Theorem says that the nullspace is perpendicular to the row space—not the column space. But our matrix is symmetric! Its row and column spaces are the same. So the eigenvectors (2, −1) and (1, 2) are perpendicular—which their dot product tells us anyway.

These eigenvectors have length √5. Divide them by √5 to get unit vectors. Put the unit vectors into the columns of S (which is Q). Then A is diagonalized:

Q^T A Q = (1/√5)[2 −1; 1 2] [1 2; 2 4] (1/√5)[2 1; −1 2] = [0 0; 0 5].
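As a quick numerical check, here is a minimal MATLAB sketch (assuming only the built-in eig command): for a symmetric input, eig returns orthonormal eigenvectors, so Q^T Q is the identity and Q^T A Q is the diagonal eigenvalue matrix.

    A = [1 2; 2 4];            % the symmetric matrix of Example 1
    [Q, Lambda] = eig(A);      % columns of Q: orthonormal eigenvectors
    disp(Q'*Q)                 % identity matrix
    disp(Q'*A*Q)               % diagonal matrix with eigenvalues 0 and 5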
Now comes the calculation for any 2 by 2 symmetric matrix. First, real eigenvalues. Second, perpendicular eigenvectors. The λ's come from

det [a−λ b; b c−λ] = λ² − (a + c)λ + (ac − b²) = 0.   (1)

This factors into (λ − λ1)(λ − λ2). The product λ1λ2 is the determinant D = ac − b². The sum λ1 + λ2 is the trace T = a + c.

The test for real roots of Ax² + Bx + C = 0 is based on B² − 4AC. This must not be negative, or its square root in the quadratic formula would be imaginary. Our equation has different letters, λ² − Tλ + D = 0, so the test is based on T² − 4D:
Real eigenvalues:   T² − 4D = (a + c)² − 4(ac − b²) must not be negative.

Rewrite that as a² + 2ac + c² − 4ac + 4b². Rewrite again as (a − c)² + 4b². This is not negative! So the roots λ1 and λ2 (the eigenvalues) are certainly real.

Perpendicular eigenvectors: Compute x1 and x2 and their dot product:

(A − λ1 I)x1 = 0   gives   x1 = [b; λ1 − a]   from the first row
(A − λ2 I)x2 = 0   gives   x2 = [λ2 − c; b]   from the second row.

When b is zero, A has perpendicular eigenvectors (1, 0) and (0, 1). Otherwise, take the dot product of x1 and x2 to prove they are perpendicular:

x1 · x2 = b(λ2 − c) + (λ1 − a)b = b(λ1 + λ2 − a − c) = 0.   (2)

This is zero because λ1 + λ2 equals the trace a + c. Thus x1 · x2 = 0. Now comes the general n by n case, with real λ's and perpendicular eigenvectors.
6I Real Eigenvalues  The eigenvalues of a real symmetric matrix are real.

Proof Suppose that Ax = λx. Until we know otherwise, λ might be a complex number. Then λ has the form a + ib (a and b real). Its complex conjugate is λ̄ = a − ib. Similarly the components of x may be complex numbers, and switching the signs of their imaginary parts gives x̄. The good thing is that λ̄ times x̄ is the conjugate of λ times x. So take conjugates of Ax = λx, remembering that A = Ā is real:

Ax = λx   leads to   Ax̄ = λ̄x̄.   Transpose to   x̄^T A = x̄^T λ̄.   (3)

Now take the dot product of the first equation with x̄ and the last equation with x:

x̄^T Ax = x̄^T λx   and also   x̄^T Ax = x̄^T λ̄x.   (4)

The left sides are the same so the right sides are equal. One equation has λ, the other has λ̄. They multiply x̄^T x which is not zero—it is the squared length of the eigenvector. Therefore λ must equal λ̄, and a + ib equals a − ib. The imaginary part is b = 0 and the number λ = a is real. Q.E.D.

The eigenvectors come from solving the real equation (A − λI)x = 0. So the x's are also real. The important fact is that they are perpendicular.

6J Orthogonal Eigenvectors  Eigenvectors of a real symmetric matrix (when they correspond to different λ's) are always perpendicular.
A has real eigenvalues and real orthogonal eigenvectors if and only if A = A^T.

Proof Suppose Ax = λ1x and Ay = λ2y and A = A^T. Take dot products of the first equation with y and the second with x:

(λ1x)^T y = (Ax)^T y = x^T A^T y = x^T Ay = x^T λ2 y.   (5)

The left side is x^T λ1 y, the right side is x^T λ2 y. Since λ1 ≠ λ2, this proves that x^T y = 0. Eigenvectors are perpendicular.
Example 2 Find the λ's and x's for this symmetric matrix with trace zero:

A = [−3 4; 4 3]   has   det(A − λI) = |−3−λ 4; 4 3−λ| = λ² − 25.

The roots of λ² − 25 = 0 are λ1 = 5 and λ2 = −5 (both real). The eigenvectors x1 = (1, 2) and x2 = (−2, 1) are perpendicular. To make them into unit vectors, divide by their lengths √5. The new x1 and x2 are the columns of Q, and Q^{-1} equals Q^T:

A = QΛQ^T = (1/√5)[1 −2; 2 1] [5 0; 0 −5] (1/√5)[1 2; −2 1].
This example shows the main goal of this section— to diagonalize symmetric
matrices A by orthogonal eigenvector matrices S = Q:
6H (repeated) Every symmetric matrix A has a complete set of orthogonal eigenvectors:

A = SΛS^{-1}   becomes   A = QΛQ^T.

If A = A^T has a double eigenvalue λ, there are two independent eigenvectors. We use Gram-Schmidt to make them orthogonal. The Teaching Code does this for each eigenspace of A, whatever its dimension. The eigenvectors go into the columns of Q.

One more step. Every 2 by 2 symmetric matrix looks like

A = QΛQ^T = [x1 x2] [λ1 0; 0 λ2] [x1^T; x2^T].   (6)

The columns x1 and x2 times the rows λ1x1^T and λ2x2^T produce A:

A = λ1 x1 x1^T + λ2 x2 x2^T.   (7)

This is the great factorization QΛQ^T, written in terms of λ's and x's. When the symmetric matrix is n by n, there are n columns in Q multiplying n rows in Q^T. The n pieces are λi xi xi^T. Those are matrices!

Equation (7) for our example is

A = [−3 4; 4 3] = 5 [1/5 2/5; 2/5 4/5] − 5 [4/5 −2/5; −2/5 1/5].

On the right, each xi xi^T is a projection matrix. It is like uu^T in Chapter 4. The spectral theorem for symmetric matrices says that A is a combination of projection matrices:

A = λ1P1 + ··· + λnPn,   λi = eigenvalue,   Pi = projection onto eigenspace.
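A minimal MATLAB sketch of this decomposition, using the matrix of Example 2 and only the built-in eig command: each column of Q gives one projection matrix, and the weighted sum of projections rebuilds A.

    A = [-3 4; 4 3];
    [Q, Lambda] = eig(A);                  % orthonormal eigenvectors for symmetric A
    P1 = Q(:,1)*Q(:,1)';                   % projection onto the first eigenspace
    P2 = Q(:,2)*Q(:,2)';                   % projection onto the second eigenspace
    disp(Lambda(1,1)*P1 + Lambda(2,2)*P2)  % reproduces A = lambda1*P1 + lambda2*P2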
Complex Eigenvalues of Real Matrices
Equation (3) went from Ax = λx to Ax̄ = λ̄x̄. In the end, λ and x were real. Those two equations were the same. But a nonsymmetric matrix can easily produce λ and x that are complex. In this case, Ax̄ = λ̄x̄ is different from Ax = λx. It gives us a new eigenvalue (which is λ̄) and a new eigenvector (which is x̄):

For real matrices, complex λ's and x's come in "conjugate pairs."

If Ax = λx   then   Ax̄ = λ̄x̄.

Example 3  A = [cos θ −sin θ; sin θ cos θ] has λ1 = cos θ + i sin θ and λ2 = cos θ − i sin θ.

Those eigenvalues are conjugate to each other. They are λ and λ̄, because the imaginary part sin θ switches sign. The eigenvectors must be x and x̄ (all this is true because A is real):

[cos θ −sin θ; sin θ cos θ][1; −i] = (cos θ + i sin θ)[1; −i]
[cos θ −sin θ; sin θ cos θ][1; i] = (cos θ − i sin θ)[1; i].   (9)

One is Ax = λx, the other is Ax̄ = λ̄x̄. The eigenvectors are x = (1, −i) and x̄ = (1, i). For this real matrix the eigenvalues and eigenvectors are complex conjugates.

By Euler's formula, cos θ + i sin θ is the same as e^{iθ}. Its absolute value is |λ| = 1, because cos²θ + sin²θ = 1. This fact |λ| = 1 holds for the eigenvalues of every orthogonal matrix—including this rotation.
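A short MATLAB sketch of this conjugate pair (the sample angle θ = π/3 is an arbitrary choice, not from the text): eig returns cos θ ± i sin θ, and both absolute values equal 1.

    theta = pi/3;                                          % any sample angle
    A = [cos(theta) -sin(theta); sin(theta) cos(theta)];
    lambda = eig(A);                                       % conjugate pair cos(theta) +- i*sin(theta)
    disp(lambda)
    disp(abs(lambda))                                      % both equal 1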
We apologize that a touch of complex numbers slipped in. They are unavoidable even when the matrix is real. Chapter 10 goes beyond complex numbers λ and complex vectors x to complex matrices A. Then you have the whole picture.
We end with two optional discussions.
Eigenvalues versus Pivots
The eigenvalues of A are very different from the pivots. For eigenvalues, we solve det(A − λI) = 0. For pivots, we use elimination. The only connection so far is this:

product of pivots = determinant = product of eigenvalues.

We are assuming that A = A^T = LU. There is a full set of pivots d1, ..., dn. There are n real eigenvalues λ1, ..., λn. The d's and λ's are not the same, but they come from the same matrix. This paragraph is about a hidden relation for symmetric matrices: The pivots and the eigenvalues have the same signs.

6K If A is symmetric the number of positive (negative) eigenvalues equals the number of positive (negative) pivots.

Example 4 This symmetric matrix A has one positive eigenvalue and one positive pivot:

A = [1 3; 3 −1]   has pivots 1 and −10, eigenvalues √10 and −√10.

The signs of the pivots match the signs of the eigenvalues, one plus and one minus. This could be false when the matrix is not symmetric:

B = [1 6; −1 −4]   has pivots 1 and 2, eigenvalues −1 and −2.

The pivots are positive and the eigenvalues are negative. The diagonal of B has both signs! The diagonal entries are a third set of numbers and we are saying nothing about them.

Here is a proof that the pivots and eigenvalues have matching signs when A = A^T = LU. You see it best when the pivots are divided out of the rows of U. The pivot matrix D is diagonal. It goes between two triangular matrices L and L^T, whose diagonal entries are all 1's:

[1 3; 3 −1] = [1 0; 3 1][1 3; 0 −10].   This is A = LU. Now divide out −10.

[1 3; 3 −1] = [1 0; 3 1][1 0; 0 −10][1 3; 0 1].   This is A = LDL^T. It is symmetric.

The special event is the appearance of L^T. This only happens for symmetric matrices, because LDL^T is always symmetric. (Take its transpose to get LDL^T again.) For symmetric matrices, A = LU converts to A = LDL^T when the pivots are divided out.

Watch the eigenvalues when L and L^T move toward the identity matrix. At the start, the eigenvalues of LDL^T are √10 and −√10. At the end, the eigenvalues of IDI^T are 1 and −10 (the pivots!). The eigenvalues are changing, as the "3" in L moves to zero. But to change sign, an eigenvalue would have to cross zero. The matrix would at that moment be singular. Our changing matrix always has pivots 1 and −10, so it is never singular. The signs cannot change, as the λ's move to the d's.

We repeat the proof for any A = LDL^T. Move L toward I, by moving the off-diagonal entries to zero. The pivots are not changing and not zero. The eigenvalues λ of LDL^T change to the eigenvalues d of IDI^T. Since these eigenvalues cannot cross zero as they move, their signs cannot change. Q.E.D.
This connects the two halves of applied linear algebra—pivots and eigenvalues.
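A small MATLAB sketch of this sign count for the example above (assuming only the built-in eig; the single elimination step is written out by hand so the pivots appear without row exchanges):

    A = [1 3; 3 -1];
    L = [1 0; 3 1];                      % multiplier 3 = a21/a11
    U = L \ A;                           % U = [1 3; 0 -10], pivots on its diagonal
    pivots = diag(U);                    % 1 and -10
    lambda = eig(A);                     % sqrt(10) and -sqrt(10)
    disp([sign(pivots) sign(lambda)])    % each column: one plus and one minus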
All Symmetric Matrices are Diagonalizable
When no eigenvalues of A are repeated, the eigenvectors are sure to be independent.
Then A can be diagonalized. But a repeated eigenvalue can produce a shortage of
eigenvectors. This sometimes happens for nonsymmetric matrices. It never happens
for symmetric matrices. There are always enough eigenvectors to diagonalize A = A^T.
Here are three matrices, all with λ = −1 and 1 and 1 (a repeated eigenvalue):

A = [0 1 0; 1 0 0; 0 0 1]    B = [−1 0 1; 0 1 0; 0 0 1]    C = [−1 0 0; 0 1 1; 0 0 1].

A is symmetric. We guarantee that it can be diagonalized. The nonsymmetric B can also be diagonalized. The nonsymmetric C has only two eigenvectors. It cannot be diagonalized.

One way to deal with repeated eigenvalues is to separate them a little. Change the lower right corner of A, B, C from 1 to d. Then the eigenvalues are −1 and 1 and d. The three eigenvectors are independent. But when d reaches 1, two eigenvectors of C collapse into one. Its eigenvector matrix S loses invertibility:

S = [1 0 0; 0 1 1; 0 0 d−1]   approaches   [1 0 0; 0 1 1; 0 0 0] = no good.

This cannot happen when A = A^T. Reason: The eigenvectors stay perpendicular. They cannot collapse as d → 1. In this example the eigenvectors don't even change:

A = [0 1 0; 1 0 0; 0 0 d]   has orthogonal eigenvectors = columns of   S = [−1 1 0; 1 1 0; 0 0 1].

Final note The eigenvectors of a skew-symmetric matrix (A^T = −A) are perpendicular. The eigenvectors of an orthogonal matrix (Q^T = Q^{-1}) are also perpendicular. The best
matrices have perpendicular eigenvectors! They are all diagonalizable. I stop there.
The reason for stopping is that the eigenvectors may contain complex numbers.
We need Chapter 10 to say what “perpendicular” means. When x and y are complex
vectors, the test is no longer x^T y = 0. So we can't prove anything now—but we can reveal the answer. A real matrix has perpendicular eigenvectors if and only if A^T A = AA^T. Symmetric and skew-symmetric and orthogonal matrices are included
among these “normal” matrices. They may be called normal but they are special. The
very best are symmetric.
REVIEW OF THE KEY IDEAS

1. A symmetric matrix has real eigenvalues and perpendicular eigenvectors.
2. Diagonalization becomes A = QΛQ^T with an orthogonal matrix Q.
3. All symmetric matrices are diagonalizable.
4. The signs of the eigenvalues agree with the signs of the pivots.
Problem Set 6.4
1
Write A as M+ N, symmetric matrix plus skew-symmetric matrix:
1°
2rse
Se
Ob
A=|4°3°0|/=M4N
3
(M' = M,N?
= -—N).
For any square matrix, M = fo
and N =
add up to A.
2  If C is symmetric prove that A^T CA is also symmetric. (Transpose it.) When A is 6 by 3, what are the shapes of C and A^T CA?

3  If A is symmetric, show that the dot product of Ax with y equals the dot product of x with Ay. If A is not symmetric then (Ax)^T y = x^T(______).
4
Find an orthogonal matrix Q that diagonalizes
5
Find an orthogonal matrix Q that diagonalizes this symmetric matrix:
A=,
6
A = [tee Si);
bag
AD
na
Oe
2
2, Ee
oO)
Find all orthogonal matrices that diagonalize
A = ie ‘ak
7  (a) Find a symmetric 2 by 2 matrix with 1's on the diagonal but a negative eigenvalue.
   (b) How do you know it must have a negative pivot?
   (c) How do you know it can't have two negative eigenvalues?
If A> = 0 then the eigenvalues of A must be
. Give an example that has
A #0. But if A is symmetric, diagonalize it to prove that A must be zero.
9  If λ = a + ib is an eigenvalue of a real matrix A, then its conjugate λ̄ = a − ib is also an eigenvalue. (If Ax = λx then also Ax̄ = λ̄x̄.) Prove that every real 3 by 3 matrix has a real eigenvalue.

10  Here is a quick "proof" that the eigenvalues of all real matrices are real:

Ax = λx   gives   x^T Ax = λ x^T x   so   λ = (x^T Ax)/(x^T x)   is real.

Find the flaw in this reasoning—a hidden assumption that is not justified.
11
Write A and B in the form eae
1
3 A|
nel,
12
rp“4. Aox2x4 of the spectral theorem
S12|
Bal
QA OF:
||
1).
(keep ||x1|| = ||x2=
What are the eigenvalues of A = [_? |? Create a 3 by 3 skew-symmetric matrix
(Al = —A) and verify that its eigenvalues are all imaginary.
13  This matrix M is ______ and also ______. Therefore its eigenvalues are pure imaginary and they have |λ| = 1. (Reason: ||Mx|| = ||x|| for every x so ||λx|| = ||x|| for eigenvectors.) Find all four eigenvalues of

M = (1/√3) [0 1 1 1; −1 0 −1 1; −1 1 0 −1; −1 −1 1 0].
14  Show that this A (symmetric but complex) does not have two independent eigenvectors:

A = [i 1; 1 −i]   is not diagonalizable:   det(A − λI) = λ².

A^T = A is not such a special property for complex matrices. The good property becomes Ā^T = A. When this holds, the eigenvalues are all real and A is diagonalizable.
15
15  Even if A is rectangular, the block matrix B = [0 A; A^T 0] is symmetric:

Bz = λz   is   [0 A; A^T 0][y; z] = λ[y; z]   which is   Az = λy and A^T y = λz.

(a) Show that A^T Az = λ²z, so that λ² is an eigenvalue of A^T A.
(b) If A = I (2 by 2) find all four eigenvalues and eigenvectors of B.
16
If A =[}] in Problem 15, find all three eigenvalues and eigenvectors of B.
17  Every 2 by 2 symmetric matrix is λ1x1x1^T + λ2x2x2^T = λ1P1 + λ2P2. Explain why
(a) P1 + P2 = I
(b) P1P2 = 0.
18
Another proof that eigenvectors are perpendicular when A = A‘. Suppose
Ax = Ax and Ay = Oy and A} # 0. Then y is in the nullspace and x is in
the column space. They are perpendicular because ____._~-Go carefully—why
are these subspaces orthogonal? If the second eigenvalue is a nonzero number f,
apply this argument to A — B/. The eigenvalue moves to zero and the eigenvectors stay the same—so they are perpendicular.
19
Find the eigenvector matrix S for the matrix B. Show that it doesn’t collapse at
d = 1, even though A = | is repeated. Are the eigenvectors perpendicular?
[o) =
20
—-1
0
1
Ou th WG
OnOmed
eS
ee
SHTho le
From the trace and the determinant find the eigenvalues of
pana
—3
4
Compare the signs of the 4’s with the signs of the pivots.
21  True or false. Give a reason or a counterexample.
(a) A matrix with real eigenvalues and eigenvectors is symmetric.
(b) A symmetric matrix times a symmetric matrix is symmetric.
(c) The inverse of a symmetric matrix is symmetric.
(d) The eigenvector matrix S of a symmetric matrix is symmetric.

22  A normal matrix has A^T A = AA^T. Why is every skew-symmetric matrix normal? Why is every orthogonal matrix normal? When is [· ·; · ·] normal?
23
(A paradox for instructors) If AA^T = A^T A then A and A^T share the same eigenvectors (true). They always share the same eigenvalues. Find the flaw in this conclusion: They must have the same S and Λ. Therefore A equals A^T.
24
Which of these classes of matrices do A and B belong to: Invertible, orthogonal,
projection, permutation, diagonalizable, Markov?
0 0
Oe
ee 6
A
&
II
—
Wl
—_
o&
oOo
=
Which of these factorizations are possible for A and B: LU, OR, SAS~', QAQ"?
25
What
number
A = SAS!
26
Db in ee
makes
A =
QAQ!
possible?
What
number
makes
impossible?
This A is nearly symmetric.
A=
0"
[One
aloe
But its eigenvectors are far from orthogonal:
has eigenvectors
:
0:
and
[2]
What is the angle between the eigenvectors?
27  When MATLAB computes A^T A, the result is symmetric. But the computed projection matrix P = A(A^T A)^{-1}A^T might not be exactly symmetric. Construct P from A = [1 1 1 1 1; 1 2 3 4 5] and use [S, LAMBDA] = eig(P). From S' * S show that two of the computed eigenvectors have dot product 0.9999.
POSITIVE DEFINITE MATRICES ■ 6.5
This section concentrates on symmetric matrices that have positive eigenvalues. If symmetry makes a matrix important, this extra property (all λ > 0) makes it special. When we say special, we don't mean rare. Symmetric matrices with positive eigenvalues enter all kinds of applications of linear algebra. They are called positive definite.

The first problem is to recognize these matrices. You may say, just find the eigenvalues and test λ > 0. That is exactly what we want to avoid. Calculating eigenvalues is work. When the λ's are needed, we can compute them. But if we just want to know that they are positive, there are faster ways. Here are the two goals of this section:

1. To find quick tests on a symmetric matrix that guarantee positive eigenvalues.
2. To explain the applications of positive definiteness.

The matrices are symmetric to start with, so the λ's are automatically real numbers. An important case is 2 by 2. When does A = [a b; b c] have λ1 > 0 and λ2 > 0?
6L The eigenvalues of A = A^T are positive if and only if a > 0 and ac − b² > 0.
This test is passed by A = [43]. It is failed by [22] and also by [~} _9]. One failure
is because the determinant is 24 — 25 < 0. The other failure is because a = —1. The
determinant of +7 is not enough to pass, because the test has two parts.
We think of a and ac − b² as a 1 by 1 determinant and a 2 by 2 determinant.

Proof without computing the λ's  Suppose λ1 > 0 and λ2 > 0. Their product λ1λ2 equals the determinant ac − b². That must be positive. Therefore ac is also positive. Then a and c have the same sign. That sign has to be positive, because λ1 + λ2 equals the trace a + c. Proved so far: Positive λ's require ac − b² > 0 and a > 0.

The statement was "if and only if," so there is another half to prove. Start with a > 0 and ac − b² > 0. This also ensures c > 0. Since λ1λ2 equals the determinant ac − b², the λ's are both positive or both negative. Since λ1 + λ2 equals the trace a + c > 0, the λ's must be positive. End of proof.

Here is another test. Instead of determinants, it checks for positive pivots.

6M The eigenvalues of A = A^T are positive if and only if the pivots are positive:

a > 0   and   (ac − b²)/a > 0.

A new proof is unnecessary. The ratio of positive numbers is certainly positive:

a > 0 and ac − b² > 0   if and only if   a > 0 and (ac − b²)/a > 0.
The point is to recognize that last ratio as the second pivot of A:

[a b; b c] → [a b; 0 c − b²/a].   The first pivot is a. The multiplier is b/a. The second pivot is c − b²/a = (ac − b²)/a.
This connects two big parts of linear algebra. Positive eigenvalues mean positive pivots
and vice versa (for symmetric matrices!). If that holds for n by n symmetric matrices, and it does, then we have a quick test for λ > 0. The pivots are a lot faster to
compute than the eigenvalues. It is very satisfying to see pivots and determinants and
eigenvalues and even least squares come together in this course.
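A minimal MATLAB sketch of the 2 by 2 tests (the sample entries a = 1, b = 2, c = 7 are an arbitrary choice; eig is used only to confirm):

    a = 1; b = 2; c = 7;                   % a sample symmetric matrix [a b; b c]
    A = [a b; b c];
    detTest = (a > 0) && (a*c - b^2 > 0);  % 1 by 1 and 2 by 2 determinants positive?
    pivots = [a, (a*c - b^2)/a];           % first pivot a, second pivot (ac - b^2)/a
    disp(detTest)
    disp(pivots)
    disp(eig(A)')                          % both eigenvalues positive as well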
Example 1 This matrix has a = 1 (positive). But ac − b² = (1)(3) − 2² is negative:

A = [1 2; 2 3]   has a negative eigenvalue and a negative pivot.

The pivots are 1 and −1. The eigenvalues also multiply to give −1. One eigenvalue is negative (we don't want its formula, just its sign).

Here is a different way to look at symmetric matrices with positive eigenvalues. From Ax = λx, multiply by x^T to get x^T Ax = λ x^T x. The right side is a positive λ times a positive x^T x = ||x||². So the left side x^T Ax is positive when x is an eigenvector. The new idea is that this number x^T Ax is positive for all vectors x, not just the eigenvectors. (Of course x^T Ax = 0 for the trivial vector x = 0.)

There is a name for matrices with this property x^T Ax > 0. They are positive definite. We will prove that exactly these matrices have positive eigenvalues and pivots.

Definition  The matrix A is positive definite if x^T Ax > 0 for every nonzero vector:

x^T Ax = [x y][a b; b c][x; y] = ax² + 2bxy + cy² > 0.

x^T Ax is a number (1 by 1 matrix). The four entries a, b, b, c give the four parts of x^T Ax. From a and c on the diagonal come the pure squares ax² and cy². From b and b off the diagonal come the cross terms bxy and byx (the same). Adding those four parts gives x^T Ax = ax² + 2bxy + cy². This is a quadratic function of x and y:

f(x, y) = ax² + 2bxy + cy²   is "second degree."

The rest of this book has been linear (mostly Ax). Now the degree has gone from 1 to 2. The second derivatives of ax² + 2bxy + cy² are constant. Those second derivatives
are 2a, 2b, 2b, 2c. They go into the second derivative matrix 2A:

∂f/∂x = 2ax + 2by   and   ∂f/∂y = 2bx + 2cy

[∂²f/∂x²  ∂²f/∂x∂y; ∂²f/∂y∂x  ∂²f/∂y²] = [2a 2b; 2b 2c] = 2A.

This is the 2 by 2 version of what everybody knows for 1 by 1. There the function is ax², its slope is 2ax, and its second derivative is 2a. Now the function is x^T Ax, its first derivatives are in the vector 2Ax, and its second derivatives are in the matrix 2A. Third derivatives are all zero.

Where does calculus use second derivatives? They give the bending of the graph. When f″ is positive, the curve bends up from the tangent line. The parabola y = ax² is convex up or concave down according to a > 0 or a < 0. The point x = 0 is a minimum point of y = x² and a maximum point of y = −x². To decide minimum versus maximum for f(x), we look at its second derivative.
For a two-variable function f(x, y), the matrix of second derivatives holds the
key. One number is not enough to decide minimum versus maximum (versus saddle
point). The function f = x^T Ax has a minimum at x = y = 0 if and only if A is
positive definite. The statement “A is a positive definite matrix” is the 2 by 2 version
of “a is a positive number.”
Example 2 This example is positive definite. The function f(x, y) is positive:

A = [1 2; 2 7]   has pivots 1 and 3.

The function is x^T Ax = x² + 4xy + 7y². It is positive because it is a sum of squares:

x² + 4xy + 7y² = (x + 2y)² + 3y².

The pivots 1 and 3 multiply those squares. This is no accident! We prove below, by the algebra of "completing the square," that this always happens. So when the pivots are positive, the sum f(x, y) is guaranteed to be positive.

Comparing Examples 1 and 2, the only difference is that a22 changed from 3 to 7. The borderline is when a22 = 4. Above 4, the matrix is positive definite. At a22 = 4, the borderline matrix is only semidefinite. Then (> 0) changes to (≥ 0):

A = [1 2; 2 4]   has pivots 1 and 0.

It has eigenvalues 5 and 0. It has a > 0 but ac − b² = 0. Not quite positive definite.
We will summarize this section so far. We have four ways to recognize a positive definite matrix. Right now it is only 2 by 2.

6N When a 2 by 2 symmetric matrix has one of these four properties, it has them all:

1. Both of the eigenvalues are positive.
2. The 1 by 1 and 2 by 2 determinants are positive: a > 0 and ac − b² > 0.
3. The pivots are positive: a > 0 and (ac − b²)/a > 0.
4. The function x^T Ax = ax² + 2bxy + cy² is positive except at (0, 0).

When A has one (therefore all) of these four properties, it is a positive definite matrix.

Note We deal only with symmetric matrices. The cross derivative ∂²f/∂x∂y always equals ∂²f/∂y∂x. For f(x, y, z) the nine second derivatives fill a symmetric 3 by 3 matrix. It is positive definite when the three pivots (and the three eigenvalues, and the three determinants) are positive.
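One practical way to run these tests numerically (a sketch, not from the text) is MATLAB's chol command, whose second output is zero exactly when the symmetric matrix is positive definite:

    A = [2 -1 0; -1 2 -1; 0 -1 2];   % a sample symmetric matrix
    lambda = eig(A);                 % test 1: all eigenvalues positive
    [R, p] = chol(A);                % p == 0 means the pivots were all positive
    disp(lambda')
    disp(p)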
Example 3 Is f(x, y) = x² + 8xy + 3y² everywhere positive—except at (0, 0)?

Solution The second derivatives are fxx = 2 and fxy = fyx = 8 and fyy = 6, all positive. But the test is not positive derivatives. We look for positive definiteness. The answer is no, this function is not always positive. By trial and error we locate a point x = 1, y = −1 where f(1, −1) = 1 − 8 + 3 = −4. Better to do linear algebra, and apply the exact tests to the matrix that produced f(x, y):

x² + 8xy + 3y² = [x y][1 4; 4 3][x; y].

The matrix has ac − b² = 3 − 16. The pivots are 1 and −13. The eigenvalues are ______ (we don't need them). The matrix is not positive definite.

Note how 8xy comes from a12 = 4 above the diagonal and a21 = 4 symmetrically below. That matrix multiplication in x^T Ax makes the function appear.

Main point  The sign of b is not the essential thing. The cross derivative ∂²f/∂x∂y can be positive or negative—it is b² that enters the tests. The size of b, compared to a and c, decides whether A is positive definite and the function has a minimum.
Example 4 For which numbers c is x² + 8xy + cy² always positive (or zero)?

Solution The matrix is A = [1 4; 4 c]. Again a = 1 passes the first test. The second test has ac − b² = c − 16. For a positive definite matrix we need c > 16.

The "semidefinite" borderline is c = 16. At that point [1 4; 4 16] has λ = 17 and 0, determinants 1 and 0, pivots 1 and 0. The function x² + 8xy + 16y² is (x + 4y)². Its graph does not go below zero, but it stays equal to zero all along the line x + 4y = 0. This is close to positive definite, but each test just misses: x^T Ax equals zero for x = (4, −1).

Example 5 When A is positive definite, write f(x, y) as a sum of two squares.

Solution This is called "completing the square." The part ax² + 2bxy is correct in the first square a(x + (b/a)y)². But that ends with a final (b²/a)y². To stay even, this added amount b²y²/a has to be subtracted off from cy² at the end:

ax² + 2bxy + cy² = a(x + (b/a)y)² + ((ac − b²)/a)y².   (1)
After that gentle touch of algebra, the situation is clearer. There are two perfect
squares (never negative). They are multiplied by two numbers, which could be positive
or negative. Those numbers a and (ac − b²)/a are the pivots! So positive pivots give a sum of squares and a positive definite matrix. Think back to the factorization A = LDL^T:

[a b; b c] = [1 0; b/a 1][a 0; 0 (ac − b²)/a][1 b/a; 0 1] = LDL^T.

To complete the square, we dealt with a and b and fixed the part involving c later. Elimination does exactly the same. It deals with the first column, and fixes the rest later. The numbers that come out are identical. Outside the squares are the pivots. Inside (x + (b/a)y)² are the numbers 1 and b/a from L.

Every positive definite symmetric matrix factors into A = LDL^T with positive pivots.

Important to compare A = LDL^T with A = QΛQ^T. One is based on pivots (in D), the other is based on eigenvalues (in Λ). Please do not think that the pivots equal the eigenvalues. Their signs are the same, but the numbers are entirely different.
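A short MATLAB sketch of that comparison for Example 2's matrix (the LDL^T factors are entered by hand; eig supplies the eigenvalues):

    A = [1 2; 2 7];
    L = [1 0; 2 1];  D = diag([1 3]);   % pivots 1 and 3 on the diagonal of D
    disp(L*D*L')                        % reproduces A
    disp(eig(A)')                       % 4 - sqrt(13) and 4 + sqrt(13): positive, but not the pivots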
Positive Definite Matrices: n by n
For a 2 by 2 matrix, the “positive definite test” uses eigenvalues or determinants or
pivots. All those numbers must be positive. We hope and expect that the same tests
carry over to larger matrices. They do.
6O When an n by n symmetric matrix has one of these four properties, it has them all:

1. All the eigenvalues are positive.
2. All the upper left determinants are positive.
3. All the pivots are positive.
4. x^T Ax is positive except at x = 0. The matrix is positive definite.

The upper left determinants are 1 by 1, 2 by 2, ..., n by n. The last one is the determinant of A. This remarkable theorem ties together the whole linear algebra course—at least for symmetric matrices. We believe that two examples are more helpful than a proof. Then we give two applications.
Example 6 Test the matrices A and A* for positive definiteness:

A = [2 −1 0; −1 2 −1; 0 −1 2]   and   A* = [2 −1 b; −1 2 −1; b −1 2].

Solution This A is an old friend (or enemy). Its pivots are 2 and 3/2 and 4/3, all positive. Its upper left determinants are 2 and 3 and 4, all positive. Its eigenvalues are 2 − √2 and 2 and 2 + √2, all positive. That completes tests 1, 2, and 3.

We can write x^T Ax as a sum of three squares (since n = 3). Using A = LDL^T, the pivots 2, 3/2, 4/3 appear outside the squares. The multipliers from L are inside:

x^T Ax = 2(x1² − x1x2 + x2² − x2x3 + x3²)
       = 2(x1 − ½x2)² + (3/2)(x2 − (2/3)x3)² + (4/3)x3² > 0.

Go to the second matrix A*. The determinant test is easiest. The 1 by 1 determinant is 2 and the 2 by 2 determinant is 3. The 3 by 3 determinant comes from A* itself:

det A* = 4 + 2b − 2b²   must be positive.

At b = −1 and b = 2 we get det A* = 0. In those cases A* is positive semidefinite (no inverse, zero eigenvalue, x^T A*x ≥ 0). Between b = −1 and b = 2 the matrix is positive definite. The corner entry b = 0 in the first example was safely between.
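A small MATLAB sketch of these checks (assuming only eig and det; the loop tries a few sample corner entries b for A*):

    A = [2 -1 0; -1 2 -1; 0 -1 2];
    disp(eig(A)')                          % 2 - sqrt(2), 2, 2 + sqrt(2): all positive
    for b = [-1 0 2 3]
        Astar = [2 -1 b; -1 2 -1; b -1 2];
        fprintf('b = %2d   det = %g\n', b, det(Astar));   % equals 4 + 2b - 2b^2
    end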
Second Application: The Ellipse ax² + 2bxy + cy² = 1

Think of a tilted ellipse centered at (0, 0), as in Figure 6.4a. Turn it to line up with the coordinate axes. That is Figure 6.4b. These two pictures show the geometry behind A = QΛQ^{-1}:

Figure 6.4  The tilted ellipse 5x² + 8xy + 5y² = 1. Lined up it is 9X² + Y² = 1.

1. The tilted ellipse is associated with A. Its equation is x^T Ax = 1.
2. The lined-up ellipse is associated with Λ. Its equation is X^T ΛX = 1.
3. The rotation matrix from x to X that lines up the ellipse is Q.

Example 7 Find the axes of the tilted ellipse 5x² + 8xy + 5y² = 1.

Solution Start with the positive definite matrix that matches this function:

The function is   [x y][5 4; 4 5][x; y].   The matrix is   A = [5 4; 4 5].

The eigenvalues of A are λ1 = 9 and λ2 = 1. The eigenvectors are [1; 1] and [1; −1]. To make them unit vectors, divide by √2. Then A = QΛQ^T is

[5 4; 4 5] = (1/√2)[1 1; 1 −1] [9 0; 0 1] (1/√2)[1 1; 1 −1].

Now multiply by [x y] on the left and [x; y] on the right to get back the function x^T Ax:

5x² + 8xy + 5y² = 9((x + y)/√2)² + 1((x − y)/√2)².

The function is again a sum of two squares. But this is different from completing the square. The coefficients are not the pivots 5 and 9/5 from D, they are the eigenvalues 9 and 1 from Λ. Inside these squares are the eigenvectors (1, 1)/√2 and (1, −1)/√2.
The axes of the tilted ellipse point along the eigenvectors. This explains why
A = QΛQ^T is called the "principal axis theorem"—it displays the axes. Not only the
axis directions (from the eigenvectors) but also the axis lengths (from the eigenvalues).
To see it all, use capital letters for the new coordinates that line up the ellipse:
X = (x + y)/√2   and   Y = (x − y)/√2.

The ellipse becomes 9X² + Y² = 1. The largest value of X² is 1/9. The point at the end of the shorter axis has X = 1/3 and Y = 0. Notice: The bigger eigenvalue λ1 gives the shorter axis, of half-length 1/√λ1 = 1/3. The point at the end of the major axis has X = 0 and Y = 1. The smaller eigenvalue λ2 = 1 gives the greater length 1/√λ2 = 1.

In the xy system, the axes are along the eigenvectors of A. In the XY system, the axes are along the eigenvectors of Λ—the coordinate axes. Everything comes from the diagonalization A = QΛQ^T.
6P Suppose A = QΛQ^T is positive definite. The graph of x^T Ax = 1 is an ellipse:

[x y] QΛQ^T [x; y] = [X Y] Λ [X; Y] = λ1X² + λ2Y² = 1.

The half-lengths of the axes are 1/√λ1 and 1/√λ2.

For an ellipse, A must be positive definite. If an eigenvalue is negative (exchange the 4's with the 5's in A), we don't have an ellipse. The sum of squares becomes a difference of squares: 9X² − Y² = 1. This is a hyperbola.
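A brief MATLAB sketch of Example 7 (assuming only eig): the columns of Q point along the principal axes and 1/√λ gives each half-length.

    A = [5 4; 4 5];
    [Q, Lambda] = eig(A);                   % columns of Q: principal axis directions
    halfLengths = 1 ./ sqrt(diag(Lambda));  % half-length 1/sqrt(lambda) for each axis
    disp([diag(Lambda) halfLengths])        % eigenvalue 9 gives the short half-axis 1/3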
REVIEW OF THE KEY IDEAS

1. Positive definite matrices have positive eigenvalues and positive pivots.
2. A quick test is given by the upper left determinants: a > 0 and ac − b² > 0.
3. The quadratic function f = x^T Ax has a minimum at x = 0:
   x^T Ax = ax² + 2bxy + cy² is positive except at (x, y) = (0, 0).
4. The ellipse x^T Ax = 1 has its axes along the eigenvectors of A.
5. A^T A is automatically positive definite if A has independent columns (r = n).
Problem Set 6.5
Problems 1-13 are about tests for positive definiteness.
1
Which of A,, Az, A3, Aq has two positive eigenvalues?
Use the test, don’t com-
pute the i’s:
5 6
=)
1 10
a=[e 3] a= le, x ele a
“el
1
10
mal
Explain why c > 0 (instead of a > 0) combined with ac — b*, > 0 is also.a
complete test for [#°] to have positive eigenvalues.
2
For which numbers b and c are these matrices positive definite?
i
a=|;
Factor each A into LU
3
|
and
Ae
A
se
al}
and then into LDL’.
What is the quadratic f = ax?+2bxy+cy for each of these matrices? Complete
the square to write f as a sum of one or two squares d}( te ae
ee
ee:
een eel
2
4
Show that f(x, y) = x2 +4xy + 3y” does not have a minimum at (0,0) even
though it has positive coefficients. Write f as a difference of squares and find a
point (x, y) where f is negative.
5
The function f(x, y) = 2xy certainly has a saddle point and not a minimum at
(0,0). What symmetric matrix A produces this f? What are its eigenvalues?
6
Test to see if A'A is positive definite:
7
7  (Important) If A has independent columns then A^T A is square and symmetric and invertible (Section 4.2). Show why x^T A^T Ax is positive except when x = 0. Then A^T A is more than invertible, it is positive definite.

8  The function f(x, y) = 3(x + 2y)² + 4y² is positive except at (0, 0). What is the matrix A, so that f = [x y]A[x y]^T? Check that the pivots of A are 3 and 4.
Find the 3 by 3 matrix A and its pivots, rank, eigenvalues, and determinant:
x]
[x1
50)
x3 |
A
2Y || NG a = ah
ne)
X3
10  Which 3 by 3 symmetric matrices A produce these functions f = x^T Ax? Why is the first matrix positive definite but not the second one?
(a) f = 2(x1² + x2² + x3² − x1x2 − x2x3)
(b) f = 2(x1² + x2² + x3² − x1x2 − x1x3 − x2x3).

11  Compute the three upper left determinants to establish positive definiteness. Verify that their ratios give the second and third pivots.

A = [2 2 0; 2 5 3; 0 3 8].
12
For what numbers c and d are A and B positive definite? Test the 3 determinants:
c
Ass
lt
Ly
@
il
and
oy
B29
dy
13
4S
Find a matrix with a > 0 and c > 0 and a+c > 2b that has a negative eigenvalue.
Problems 14–20 are about applications of the tests.

14  If A is positive definite then A^{-1} is positive definite. First proof: The eigenvalues of A^{-1} are positive because ______. Second proof (2 by 2): The entries of

A^{-1} = (1/(ac − b²)) [c −b; −b a]

pass the test ______.
15
If A and B are positive definite, show that A+ B is also positive definite. Pivots
and eigenvalues are not convenient for this; better to prove x'(A+B)x > 0 from
the positive definiteness of A and B.
16
For a block positive definite matrix, the upper left block A must be positive definite:
less )
gt a |
The complete block test is that
reduces to
x'Ax
A and C — B'A~'B
when
y=
must be positive definite.
17
A positive definite matrix cannot have a zero (or worse, a negative number) on
its diagonal. Show that this matrix is not positive definite:
if
@
2
18
nail
2
x2 | is not positive when
(%1,%2,%3) =(
,
,
).
18  The first entry a11 of a symmetric matrix A cannot be smaller than all the eigenvalues. If it were, then A − a11I would have ______ eigenvalues but it has a ______ on the main diagonal. Similarly no diagonal entry can be larger than all the eigenvalues.

19  If x is an eigenvector of A then x^T Ax = ______. Prove that λ is positive when A is positive definite.
20
Give a quick reason why each of these statements is true:
(a)
Every positive definite matrix is invertible.
(b)
The only positive definite permutation matrix is P = J.
(c)
The only positive definite projection matrix is P = /.
(d)
A diagonal matrix with positive diagonal entries is positive definite.
(e)
A symmetric matrix with a positive determinant might not be positive definite!
Problems 21–24 use the eigenvalues; Problems 25–27 are based on pivots.

21  For which s and t do these matrices have positive eigenvalues (therefore positive definite)?

A = [s −4 −4; −4 s −4; −4 −4 s]   and   B = [t 3 0; 3 t 4; 0 4 t].
22
From A = QAQ! compute the positive definite symmetric square root QA!/7QT
of each matrix. Check that this square root gives R? = A:
23
You may have seen the equation for an ellipse as cae + Ns = 1. What are a and b
when the equation is written as 4yx? + Ary? = 1? The ellipse 9x? + 16y? = 1
has axes with half-lengths a =
and b =
24
Draw the tilted ellipse x*-+xy+y* = 1 and find the half-lengths of its axes from
the eigenvalues of the corresponding A.
25  With positive pivots in D, the factorization A = LDL^T becomes L√D√DL^T. (Square roots of the pivots give D = √D√D.) Then C = L√D yields the Cholesky factorization A = CC^T:

From C = [· ·; · ·] find A.   From A = [4 8; 8 25] find C.

26  In the Cholesky factorization A = CC^T, with C = L√D, the ______ of the pivots are on the diagonal of C. Find C (lower triangular) for
Al—
ZT
»
sO
0}
O ©
ie
2 8
27  The symmetric factorization A = LDL^T means that x^T Ax = x^T LDL^T x. This is

[x y][a b; b c][x; y] = [x y][1 0; b/a 1][a 0; 0 (ac − b²)/a][1 b/a; 0 1][x; y].

Multiplication produces ax² + 2bxy + cy² = a(x + (b/a)y)² + ______ y². The second pivot completes the square. Test with a = 2, b = 4, c = 10.
28
29
Without multiplying
The second
A = ees
ane
i dil—sin@
Cos@'cosé
sing 1h find
sin?
cos@
(a)
the determinant of A
(b)
the eigenvalues of A
(c)
the eigenvectors of A
(d)
a reason why A
For fi(x,y) = ix4 =f xy ae y? and f2(x,y)
derivative matrices A; and A):
=
is positive definite.
seas xy — x find the second
_ | a?f/dx?
a?f/dxdy
La f/ayax
a? f/dy?
|
A is positive definite so f| is concave up (= convex). Find the minimum point
of f; and the saddle point of f> (where first derivatives are zero).
30
The graph of z= x* + y? is a bowl opening upward. The graph of z = x7 — y?
is a saddle. The graph of z = —x* — y? is a bowl opening downward. What is
a test for z= ax* + 2bxy + cy” to have a saddle at (0, 0)?
31
Which values of c give a bowl and which give a saddle for the graph of the
function z = 4x* + 12xy +cy?? Describe the graph at the borderline value of c.
SIMILAR MATRICES ■ 6.6
The big step in this chapter was to diagonalize a matrix. That was done by S—the eigenvector matrix. The diagonal matrix S^{-1}AS is Λ—the eigenvalue matrix. But diagonalization was not possible for every A. Some matrices resisted and we had to leave them alone. They had too few eigenvectors to produce S. In this new section, S remains the best choice when we can find it, but we allow any invertible matrix M.

Starting from A we go to M^{-1}AM. This new matrix may happen to be diagonal—more likely not. It still shares important properties of A. No matter which M we choose, the eigenvalues stay the same. The matrices A and M^{-1}AM are called "similar." A typical matrix A is similar to a lot of other matrices because there are so many choices of M.

DEFINITION  Let M be any invertible matrix. Then B = M^{-1}AM is similar to A.

If B = M^{-1}AM then immediately A = MBM^{-1}. That means: If B is similar to A then A is similar to B. The matrix in this reverse direction is M^{-1}—just as good as M.

A diagonalizable matrix is similar to Λ. In that special case M is S. We have A = SΛS^{-1} and Λ = S^{-1}AS. They certainly have the same eigenvalues! This section is opening up to other similar matrices B = M^{-1}AM.

The combination M^{-1}AM appears when we change variables. Start with a differential equation for u and set u = Mv:

du/dt = Au   becomes   M dv/dt = AMv   which is   dv/dt = M^{-1}AMv.

The original coefficient matrix was A, the new one at the right is M^{-1}AM. Changing variables leads to a similar matrix. When M = S the new system is diagonal—the maximum in simplicity. But other choices of M also make the new system easier to solve. Since we can always go back to u, similar matrices have to give the same growth or decay. More precisely, their eigenvalues are the same.

6Q (No change in λ's)  The similar matrices A and M^{-1}AM have the same eigenvalues. If x is an eigenvector of A then M^{-1}x is an eigenvector of B = M^{-1}AM.

The proof is quick, since B = M^{-1}AM gives A = MBM^{-1}. Suppose Ax = λx:

MBM^{-1}x = λx   means that   BM^{-1}x = λM^{-1}x.

The eigenvalue of B is the same λ. The eigenvector is now M^{-1}x.
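A tiny MATLAB sketch of 6Q (the matrices A and M below are arbitrary sample choices, not from the text): the eigenvalues of B = M^{-1}AM match those of A.

    A = [1 2; 0 3];              % sample matrix with eigenvalues 1 and 3
    M = [1 1; 1 2];              % any invertible change-of-variables matrix
    B = M \ (A*M);               % B = inv(M)*A*M, similar to A
    disp(eig(A)')                % 1 and 3
    disp(eig(B)')                % the same 1 and 3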
The following example finds three matrices that are similar to one projection matrix.

Example 1  The projection A = [.5 .5; .5 .5] is similar to Λ = S^{-1}AS = [1 0; 0 0].

Now choose M = [1 0; 0 −1]: the similar matrix M^{-1}AM is [.5 −.5; −.5 .5].
Also choose M = [1 0; 1 1]: the similar matrix M^{-1}AM is [1 .5; 0 0].

These matrices M^{-1}AM all have eigenvalues 1 and 0. Every 2 by 2 matrix with those eigenvalues is similar to A. The eigenvectors change with M.

The eigenvalues in that example are 1 and 0, not repeated. This makes life easy. Repeated eigenvalues are harder. The next example has eigenvalues 0 and 0. The zero matrix shares those eigenvalues, but it is only similar to itself: M^{-1}0M = 0. The following matrix A is similar to every nonzero matrix with eigenvalues 0 and 0.

Example 2  A = [0 1; 0 0] is similar to every matrix B = [cd d²; −c² −cd] except B = 0.

These matrices B are all singular (like A). They all have rank one (like A). Their trace is cd − cd = 0. Their eigenvalues are 0 and 0 (like A). I chose any M = [a b; c d] with ad − bc = 1.
These matrices B can’t be diagonalized. In fact A is as close to diagonal as
possible. It is the “Jordan form” for the family of matrices B. This is the outstanding
member of the family. The Jordan form is as near as we can come to diagonalizing
all of these matrices, when there is only one eigenvector.
Chapter 7 will explain another approach to similar matrices. Instead of changing
variables by u = Mv, we “change the basis.” In this approach, similar matrices represent the same transformation of n-dimensional space. When we choose a basis for
that space, we get a matrix. The usual basis vectors in M = I lead to I^{-1}AI which is A. Other bases lead to other matrices B = M^{-1}AM.
In this "similarity transformation" from A to B, some things change and some don't. Here is a table to show connections between A and B:

Not changed: Eigenvalues; Trace and determinant; Rank; Number of independent eigenvectors; Jordan form.
Changed: Eigenvectors; Nullspace; Column space; Row space; Left nullspace; Singular values.
The eigenvalues don't change for similar matrices; the eigenvectors do. The trace is the sum of the λ's (unchanged). The determinant is the product of the same λ's.† The nullspace consists of the eigenvectors for λ = 0 (if any), so it can change. Its dimension n − r does not change! The number of eigenvectors stays the same for each λ, while the vectors themselves are multiplied by M^{-1}.

The singular values depend on A^T A, which definitely changes. They come in the next section. The table suggests good exercises in linear algebra. But the last entry in the unchanged column—the Jordan form—is more than an exercise. We lead up to it with one more example of similar matrices.
Example 3 This Jordan matrix J has triple eigenvalue 5, 5, 5. Its only eigenvectors are multiples of (1, 0, 0):

J = [5 1 0; 0 5 1; 0 0 5]   and   J − 5I = [0 1 0; 0 0 1; 0 0 0]   has rank 2.

Every similar matrix B = M^{-1}JM has the same triple eigenvalue 5, 5, 5. Also B − 5I must have the same rank 2. Its nullspace has dimension 3 − 2 = 1. So each similar matrix B also has only one independent eigenvector.

The transpose matrix J^T has the same eigenvalues 5, 5, 5, and J^T − 5I has the same rank 2. Jordan's theory says that J^T is similar to J. The matrix that produces the similarity happens to be the reverse identity M:

J^T = M^{-1}JM   is   [5 0 0; 1 5 0; 0 1 5] = [0 0 1; 0 1 0; 1 0 0][5 1 0; 0 5 1; 0 0 5][0 0 1; 0 1 0; 1 0 0].

The only eigenvector of J^T is M^{-1}(1, 0, 0) = (0, 0, 1). More exactly, there is one line of eigenvectors (x1, 0, 0) for J and another line (0, 0, x3) for J^T.
The key fact is that this matrix J is similar to every matrix with eigenvalues
5,5,5
and one line of eigenvectors.
Example 4 Since J is as close to diagonal as we can get, the equation du/dt = Ju cannot be simplified by changing variables. We must solve it as it stands:

du/dt = Ju   is   dx/dt = 5x + y,   dy/dt = 5y + z,   dz/dt = 5z.

†Here is a direct proof that the determinant is unchanged: det B = (det M^{-1})(det A)(det M) = det A.
The system is triangular. We think naturally of back substitution. Solve the last equation and work upwards. Main point: All solutions contain e^{5t}:

dz/dt = 5z   yields   z = z(0)e^{5t}
dy/dt = 5y + z   yields   y = (y(0) + t z(0))e^{5t}
dx/dt = 5x + y   yields   x = (x(0) + t y(0) + ½t² z(0))e^{5t}.

The two missing eigenvectors are responsible for the te^{5t} and t²e^{5t} terms in this solution. They enter because λ = 5 is a triple eigenvalue.
The Jordan Form
For every A, we want to choose M so that M^{-1}AM is as nearly diagonal as possible. When A has a full set of n eigenvectors, they go into the columns of M. Then M = S. The matrix S^{-1}AS is diagonal, period. This matrix is the Jordan form of A—when A can be diagonalized. In the general case, eigenvectors are missing and Λ can't be reached.

Suppose A has s independent eigenvectors. Then it is similar to a matrix with s blocks. Each block is like J in Example 3. The eigenvalue is on the diagonal and the diagonal above it contains 1's. This block accounts for one eigenvector of A. When there are n eigenvectors and n blocks, they are all 1 by 1. In that case J is Λ.

6R (Jordan form)  If A has s independent eigenvectors, it is similar to a matrix J that has s Jordan blocks on its diagonal: There is a matrix M such that

M^{-1}AM = [J1 · · ·; · J2 · ·; · · ... ·; · · · Js] = J.   (1)

Each block in J has one eigenvalue λi, one eigenvector, and 1's above the diagonal:

Ji = [λi 1 · ·; · λi 1 ·; · · ... 1; · · · λi].   (2)

A is similar to B if they share the same Jordan form J—and not otherwise.
This is the big theorem about matrix similarity. In every family of similar matrices, we are picking out one member called J. It is nearly diagonal (or if possible
completely diagonal). For that matrix, we can solve du/dt = Ju as in Example 4.
We can take powers J^k as in Problems 9-10. Every other matrix in the family has the form A = MJM^{-1}. The connection through M solves du/dt = Au.

The point you must see is that MJM^{-1}MJM^{-1} = MJ²M^{-1}. That cancellation of M^{-1}M in the middle has been used through this chapter (when M was S). We found A^{100} from SΛ^{100}S^{-1}—by diagonalizing the matrix. Now we can't quite diagonalize A. So we use MJ^{100}M^{-1} instead.
Jordan’s Theorem 6R is proved in my textbook Linear Algebra and Its Applications, published by HBJ/Saunders. Please refer to that book (or more advanced books)
for the proof. The reasoning is rather intricate and in actual computations the Jordan
form is not popular—its calculation is not stable. A slight change in A will separate the
repeated eigenvalues and remove the off-diagonal 1's—leaving a diagonal Λ. Proved
or not, you have caught the central idea of similarity—to make A as simple as possible
while preserving its essential properties.
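A small MATLAB sketch of that instability (assuming only eig; the perturbation size 1e-9 is an arbitrary choice): a tiny change in the Jordan block of Example 3 splits the triple eigenvalue into three simple eigenvalues.

    J = [5 1 0; 0 5 1; 0 0 5];       % Jordan block: eigenvalue 5 repeated three times
    disp(eig(J)')                    % 5, 5, 5 with only one independent eigenvector
    Jpert = J;  Jpert(3,1) = 1e-9;   % a slight change in the matrix ...
    disp(eig(Jpert)')                % ... separates the repeated eigenvalues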
REVIEW OF THE KEY IDEAS

1. B is similar to A if B = M^{-1}AM.
2. Similar matrices have the same eigenvalues.
3. If A has n independent eigenvectors then A is similar to Λ (take M = S).
4. Every matrix is similar to a Jordan matrix J (which has Λ as its diagonal part). J has a "1" above the diagonal for every missing eigenvector.
Problem Set 6.6
1  If B = M^{-1}AM and also C = N^{-1}BN, what matrix T gives C = T^{-1}AT? Conclusion: If B is similar to A and C is similar to B, then ______.

2  If C = F^{-1}AF and also C = G^{-1}BG, what matrix M gives B = M^{-1}AM? Conclusion: If C is similar to A and also to B then ______.

3  Show that A and B are similar by finding M:

A = [· ·; · ·]   and   B = [· ·; · ·].
If a 2 by 2 matrix A has eigenvalues 0 and 1, why is it similar to A = [)?]?
Deduce from Problem 2 that all 2 by 2 matrices with those eigenvalues are similar.
Which of these matrices are similar? Check their eigenvalues.
[0
al
Ol
tO
ey
sy 70
Cary
Let
ed)
1 0
0
Oe
1
Te
There are sixteen 2 by 2 matrices whose entries are 0’s and 1’s. Similar matrices
go into the same family. How many families? How many matrices (total 16) in
each family?
(a)
If x is in the nullspace of A show that M~'x is in the nullspace of M~'AM.
(b)
The nullspaces of A and M —! AM have the same (vectors)(basis)(dimension).
If A and B have exactly the same eigenvalues and eigenvectors, does A = B? With n independent eigenvectors we do have A = B. What if A has eigenvalues 0, 0 and only one line of eigenvectors (x1, 0)?
By direct multiplication find A? and A? when
io EiraeHe
Guess the form of A* and check k = 6. Set k = 0 to find A® and k = —1 to
find A7!.
Questions
10
10-14 are about the Jordan form.
By direct multiplication, find J* and J? when
ie iCal ah
Guess the form of J‘. Set k = 0 to find J®. Set k = —1 to find J~!.
11
The text solved du/dt = Ju for a 3 by 3 Jordan block J. Add a fourth equation
dw/dt =5w +x. Follow the pattern of solutions for z, y, x to find w.
12
These Jordan matrices have eigenvalues 0,0,0,0. They have two eigenvectors
(one from each block). But the block sizes don’t match and they are not similar:
Pin
i
jacineo V0
uated
0.6) ona
0010 0
a
es
resale
an.
=
Deion0
alia
tah
bohm?
pro? ORG
For any matrix M, compare JM with MK. If they are equal show that M
not invertible. Then M~'JM = K is impossible.
is
13
J^T was similar to J in Example 3. Prove that A^T is always similar to A, in three steps:

First, when A is a Jordan block Ji: Find Mi so that Mi^{-1} Ji Mi = Ji^T.
Second, A is a Jordan matrix J: Build M0 from blocks so that M0^{-1} J M0 = J^T.
Third, A is any matrix MJM^{-1}: Show that A^T is similar to J^T and to J and to A.
14
Find two more matrices similar to J in Example 3.
15
Prove that det(A − λI) = det(M^{-1}AM − λI). Write I = M^{-1}M and factor out det M^{-1} and det M. This says that M^{-1}AM has the same characteristic polynomial as A. So its roots are the same eigenvalues.
16
Which pairs are similar? Choose a,b, c,d to prove that the other pairs aren’t:
a b
ad
17
ba
ak we
ed
a@ ob
ad
Db
¢
ai
True or false, with a good reason:
(a) An invertible matrix can't be similar to a singular matrix.
(b) A symmetric matrix can't be similar to a nonsymmetric matrix.
(c) A can't be similar to −A unless A = 0.
(d) A can't be similar to A + I.
18
If B is invertible prove that AB has the same eigenvalues as BA.
19
When A is m by n and B is n by m, AB and BA can have different sizes. But
20
EMIS
ste a
(a)
What sizes are the blocks? They are the same in each matrix.
(b)
This block equation isM -lFM =G, so F and G have the same eigenvalues. F has the eigenvalues of AB plus n zeros, G has the eigenvalues of
BA plus m zeros. Conclusion when m > n: AB has the same eigenvalues
as BA plus
zeros.
Why are these statements all true?
(a) If A is similar to B then A² is similar to B².
(b) A² and B² can be similar when A and B are not similar.
(c) [3 0; 0 4] is similar to [3 1; 0 4].
(d) [3 0; 0 3] is not similar to [3 1; 0 3].
(e) If we exchange rows 1 and 2 of A, and then exchange columns 1 and 2, the eigenvalues stay the same.
SINGULAR VALUE DECOMPOSITION (SVD) ■ 6.7
The Singular Value Decomposition is a highlight of linear algebra. A is any m by n matrix. It can be square or rectangular, and we will diagonalize it. Its row space is r-dimensional (inside R^n). Its column space is also r-dimensional (inside R^m). We are going to choose orthonormal bases for those spaces. The row space basis will be v1, ..., vr and the column space basis will be u1, ..., ur.

Start with a 2 by 2 matrix: m = n = 2. Let its rank be r = 2, so it is invertible. Its row space is the plane R². We want v1 and v2 to be perpendicular unit vectors, an orthogonal basis. We also want Av1 and Av2 to be perpendicular. (This is the tricky part.) Then the unit vectors u1 = Av1/||Av1|| and u2 = Av2/||Av2|| will be orthogonal, just dividing by lengths. As a specific example, we work with the unsymmetric matrix

A = [2 2; −1 1].

First point Why not choose one orthogonal basis instead of two? Because no orthogonal matrix Q will make Q^{-1}AQ diagonal.
Second point Why not choose the eigenvectors of A as the basis? Because that basis
is not orthonormal. A is not symmetric and we need two different orthogonal matrices
to diagonalize it.
We are aiming for orthonormal bases that diagonalize A. The two bases will be different—one basis cannot do it. When the inputs are v1 and v2, the outputs are Av1 and Av2. We want those to line up with u1 and u2. The basis vectors have to give Av1 = σ1u1 and also Av2 = σ2u2. The numbers σ1 and σ2 are just the lengths ||Av1|| and ||Av2||. With v1 and v2 as columns you can see what we are asking for:

A [v1 v2] = [u1 u2] [σ1 0; 0 σ2].   (2)

In matrix notation that is AV = UΣ. The diagonal matrix Σ is like Λ (capital sigma versus capital lambda). Σ contains the singular values σ1, σ2 and Λ contains the eigenvalues.

The difference comes from U and V. When they both equal S, we have AS = SΛ which means S^{-1}AS = Λ. The matrix is diagonalized. But the eigenvectors in S are not generally orthonormal. The new requirement is that U and V must be orthogonal matrices. The basis vectors in their columns must be orthonormal:

V^T V = [v1^T; v2^T][v1 v2] = [1 0; 0 1].   (3)

Thus V^T = V^{-1}. Similarly U^T U = I and U^T = U^{-1}.
6R The Singular Value Decomposition (SVD) has orthogonal matrices U and V:

AV = UΣ   and then   A = UΣV^{-1} = UΣV^T.   (4)

This is the new factorization of A: orthogonal times diagonal times orthogonal.

There is a neat way to get U out of the picture and see V by itself: Multiply A^T times A.

A^T A = (UΣV^T)^T (UΣV^T) = VΣ^T U^T UΣV^T.   (5)

U^T U disappears because it equals I. Then Σ^T is next to Σ. Multiplying those diagonal matrices gives σ1² and σ2². That leaves an ordinary factorization of the symmetric matrix A^T A:

A^T A = V [σ1² 0; 0 σ2²] V^T.   (6)

This is exactly like A = QΛQ^T. But the symmetric matrix is not A itself. Now the symmetric matrix is A^T A! And the columns of V are its eigenvectors.

This tells us how to find V. We are ready to complete the example.

Example 1 Find the singular value decomposition of A = [2 2; −1 1].

Solution Compute A^T A and its eigenvectors. Then make them unit vectors:

A^T A = [5 3; 3 5]   has unit eigenvectors   v1 = (1/√2)[1; 1]   and   v2 = (1/√2)[−1; 1].

The eigenvalues of A^T A are 8 and 2. The v's are perpendicular, because eigenvectors of every symmetric matrix are perpendicular—and A^T A is automatically symmetric.

What about u1 and u2? They are quick to find, because Av1 is in the direction of u1 and Av2 is in the direction of u2:

Av1 = [2 2; −1 1] (1/√2)[1; 1] = [2√2; 0].   Clearly Av1 is the same as 2√2 u1. The unit vector is u1 = [1; 0].

The first singular value is σ1 = 2√2. Then σ1² = 8, which is the eigenvalue of A^T A. We have Av1 = σ1u1 exactly as required. Similarly

Av2 = [2 2; −1 1] (1/√2)[−1; 1] = [0; √2].   Now Av2 is √2 u2. The unit vector is u2 = [0; 1].

The second singular value is σ2 = √2. And σ2² agrees with the other eigenvalue 2 of A^T A. We have completed the SVD:

A = UΣV^T:   [2 2; −1 1] = [1 0; 0 1] [2√2 0; 0 √2] [1/√2 1/√2; −1/√2 1/√2].   (7)
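A quick MATLAB check of this factorization (assuming only the built-in svd; up to sign choices in the columns, it reproduces U, Σ, V above):

    A = [2 2; -1 1];
    [U, S, V] = svd(A);      % A = U*S*V' with singular values on the diagonal of S
    disp(diag(S)')           % 2*sqrt(2) and sqrt(2)
    disp(norm(A - U*S*V'))   % essentially zero: the factorization reproduces A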
Figure 6.5  U and V are rotations and reflections. Σ is a stretching matrix.
This matrix, and every invertible 2 by 2 matrix, transforms the unit circle to an ellipse.
You can see that in the figure, which was created by Cliff Long and Tom Hern.
One final point about that example. We found the u's from the v's. Could we find the u's directly? Yes, by multiplying AA^T instead of A^T A:

AA^T = (UΣV^T)(VΣ^T U^T) = UΣΣ^T U^T.   (8)

This time it is V^T V = I that disappears. Multiplying ΣΣ^T gives σ1² and σ2² as before. We have an ordinary factorization of the symmetric matrix AA^T. The columns of U are the eigenvectors of AA^T:

AA^T = [2 2; −1 1][2 −1; 2 1] = [8 0; 0 2].

This matrix happens to be diagonal. Its eigenvectors are (1, 0) and (0, 1). This agrees with u1 and u2 found earlier. Why should we take the first eigenvector to be (1, 0) instead of (0, 1)? Because we have to follow the order of the eigenvalues. Notice that AA^T has the same eigenvalues (8 and 2) as A^T A. The singular values are √8 and √2.
Example 2 Find the SVD of the singular matrix A = [2 2; 1 1]. The rank is r = 1. The row space has only one basis vector v1. The column space has only one basis vector u1. We can see those vectors in A, and make them into unit vectors:

v1 = multiple of row (1, 1):   v1 = (1/√2)[1; 1]
u1 = multiple of column (2, 1):   u1 = (1/√5)[2; 1].
——=
7
es
oa
ae
column space
“
nullspace
nullspace of AT
Figure 6.6
The SVD
chooses orthonormal bases so that Av; = ojU4;.
Then Av; must equal o,u;. It does, with singular value oj = V10.
stop there (it usually doesn’t):
The SVD
could
? t= [ye | vor2 1/2].
It is customary for U and V to be square. The matrices need a second column. The vector v₂ must be orthogonal to v₁, and u₂ must be orthogonal to u₁:
v₂ = (1/√2) [1; −1]   and   u₂ = (1/√5) [−1; 2].
The vector v₂ is in the nullspace. It is perpendicular to v₁ in the row space. Multiply by A to get Av₂ = 0. We could say that the second singular value is σ₂ = 0, but singular values are like pivots: only the r nonzeros are counted.
If A is 2 by 2 then all three matrices U, Σ, V are 2 by 2 in the complete SVD:
[2 2; 1 1] = UΣVᵀ = [2/√5 −1/√5; 1/√5 2/√5] [√10 0; 0 0] [1/√2 1/√2; 1/√2 −1/√2].   (9)
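A minimal MATLAB check of this rank-one example (again, svd may choose opposite signs for the second columns of U and V):

    % Example 2: A = [2 2; 1 1] has rank 1 and single singular value sqrt(10)
    A = [2 2; 1 1];
    [U, S, V] = svd(A);            % full SVD: U and V are square
    disp(diag(S)')                 % 3.1623 and 0
    % The columns give bases for the four subspaces:
    % V(:,1) spans the row space, V(:,2) spans the nullspace,
    % U(:,1) spans the column space, U(:,2) spans the nullspace of A'.
    disp((A*V(:,2))')              % the zero vector: v2 is in the nullspace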
6S  The matrices U and V contain orthonormal bases for all four subspaces:
first r columns of V:   row space of A
last n − r columns of V:   nullspace of A
first r columns of U:   column space of A
last m − r columns of U:   nullspace of Aᵀ.
The first columns v₁, …, v_r and u₁, …, u_r are the hardest to choose, because Av_i has to fall in the direction of u_i. The last v's and u's (in the nullspaces) are easier. As long as those are orthonormal, the SVD will be correct. The v's are eigenvectors of AᵀA and the u's are eigenvectors of AAᵀ. Example 2 has
AᵀA = [5 5; 5 5]   and   AAᵀ = [8 4; 4 2].
Those matrices have the same eigenvalues 10 and 0. The first has eigenvectors v₁ and v₂, the second has eigenvectors u₁ and u₂. Multiplication shows that Av₁ = √10 u₁ and Av₂ = 0. It always happens that Av_i = σ_i u_i, and we now explain why.
Proof of SVD:  From AᵀAv_i = σ_i² v_i, the key steps are to multiply by v_iᵀ and by A:
v_iᵀ AᵀA v_i = σ_i² v_iᵀ v_i   gives   ‖Av_i‖² = σ_i²   so that   ‖Av_i‖ = σ_i.   (10)
AAᵀ(Av_i) = σ_i² (Av_i)   gives   u_i = Av_i/σ_i   as a unit eigenvector of AAᵀ.   (11)
Equation (10) used the small trick of placing parentheses in (v_iᵀAᵀ)(Av_i). This is a vector times its transpose, giving ‖Av_i‖². Equation (11) placed the parentheses in (AAᵀ)(Av_i). This shows that Av_i is an eigenvector of AAᵀ. We divide by its length σ_i to get the unit vector u_i = Av_i/σ_i. This is the equation Av_i = σ_i u_i, which says that A is diagonalized by these outstanding bases.
We will give you our opinion directly. The SVD is the climax of this linear
algebra course. We think of it as the final step in the Fundamental Theorem. First
come the dimensions of the four subspaces. Then their orthogonality. Then the bases
which diagonalize A. It is all in the formula A = UΣVᵀ. Applications are coming (they are certainly important!) but you have made it to the top.
Eigshow  (Part 2)
Section 6.1 described the MATLAB display called eigshow and gave its Web location (it is built into MATLAB 5.2). The first option is eig, when x moves in a circle and Ax follows on an ellipse. The second option is svd, when two perpendicular vectors x and y travel around a circle. Then Ax and Ay move too. There are four vectors on the screen.
The SVD is seen graphically when Ax happens to be perpendicular to Ay. Their directions at that moment give an orthonormal basis u₁, u₂. Their lengths give the singular values σ₁, σ₂. The vectors x and y at that moment are the orthonormal basis v₁, v₂. On the screen you see these bases that give Av₁ = σ₁u₁ and Av₂ = σ₂u₂. In matrix language that is AV = UΣ. This is the SVD.
(Figure: the eigshow screen for the matrix of Example 1, showing y = (0, 1), Ay = (2, 1), Ax = (2, −1), and Ax = 2√2 u₁ at the svd position.)
■ REVIEW OF THE KEY IDEAS ■

1. The SVD factors A into UΣVᵀ, with singular values σ₁ ≥ ⋯ ≥ σ_r > 0.

2. The numbers σ₁², …, σ_r² are the nonzero eigenvalues of AAᵀ and AᵀA.

3. The orthogonal columns of U and V are eigenvectors of AAᵀ and AᵀA.

4. Those columns give orthogonal bases for the four fundamental subspaces of A, and they diagonalize that matrix: Av_i = σ_i u_i for i ≤ r.
Problem Set 6.7
Problems 1-3 compute the SVD of a square singular matrix A.
1  Compute AᵀA and its eigenvalues σ₁², 0 and unit eigenvectors v₁, v₂.

2  (a) Compute AAᵀ and its eigenvalues σ₁², 0 and unit eigenvectors u₁, u₂.
   (b) Verify from Problem 1 that Av₁ = σ₁u₁. Find all entries in the SVD:
   A = [u₁ u₂] [σ₁ 0; 0 0] [v₁ v₂]ᵀ.
3
Write down orthonormal bases for the four fundamental subspaces of this A.
Problems 4–7 ask for the SVD of matrices of rank 2.

4  (a) Find the eigenvalues and unit eigenvectors of AᵀA and AAᵀ for the Fibonacci matrix A = [1 1; 1 0].
   (b) Construct the singular value decomposition of A.

5  Show that the vectors in Problem 4 satisfy Av₁ = σ₁u₁ and Av₂ = σ₂u₂.
6  Use the SVD part of the MATLAB demo eigshow to find the same vectors v₁ and v₂ graphically.

7  Compute AᵀA and AAᵀ and their eigenvalues and unit eigenvectors for A = [1 1 0; 0 1 1]. Multiply the three matrices UΣVᵀ to recover A.
Problems 8–15 bring out the underlying ideas of the SVD.

8  Suppose u₁, …, u_n and v₁, …, v_n are orthonormal bases for Rⁿ. Construct the matrix A that transforms each v_j into u_j to give Av₁ = u₁, …, Av_n = u_n.

9  Construct the matrix with rank one that has Av = 12u for v = ½(1, 1, 1, 1) and u = ⅓(2, 2, 1). Its only singular value is σ₁ = ____.

10  Suppose A has orthogonal columns w₁, w₂, …, w_n of lengths σ₁, σ₂, …, σ_n. What are U, Σ, and V in the SVD?

11  Explain how the SVD expresses the matrix A as the sum of r rank one matrices:
A = σ₁u₁v₁ᵀ + ⋯ + σ_r u_r v_rᵀ.
12  Suppose A is a 2 by 2 symmetric matrix with unit eigenvectors u₁ and u₂. If its eigenvalues are λ₁ = 3 and λ₂ = −2, what are the matrices U, Σ, Vᵀ in its SVD?

13  If A = QR with an orthonormal matrix Q, then the SVD of A is almost the same as the SVD of R. Which of the three matrices in the SVD is changed because of Q?

14  Suppose A is invertible (with σ₁ > σ₂ > 0). Change A by as small a matrix as possible to produce a singular matrix A₀. Hint: U and V do not change:
A = [u₁ u₂] [σ₁ 0; 0 σ₂] [v₁ v₂]ᵀ.

15  (a) If A changes to 4A, what is the change in the SVD?
    (b) What is the SVD for Aᵀ and for A⁻¹?

16  Why doesn't the SVD for A + I just use Σ + I?
7  LINEAR TRANSFORMATIONS

THE IDEA OF A LINEAR TRANSFORMATION ■ 7.1
When a matrix A multiplies a vector v, it “transforms” v into another vector Av.
In goes v, out comes Av. This transformation follows the same idea as a function. In
goes a number x, out comes f(x). For one vector v or one number x, we multiply
by the matrix or we evaluate the function. The deeper goal is to see all v’s at once.
We are transforming the whole space when we multiply every v by A.
Start again with a matrix A. It transforms v to Av. It transforms w to Aw. Then
we know what happens to u = v + w. There is no doubt about Au, it has to equal
Av + Aw. Matrix multiplication gives a linear transformation:
DEFINITION A transformation T assigns an output T(v) to each input vector v.
The transformation is linear if it meets these requirements for all v and w:
(a) T(v + w) = T(v) + T(w)     (b) T(cv) = cT(v)     for all c.
If the input is v = 0, the output must be T(v) = 0. We combine (a) and (b) into one:
Linearity:
T(cv+dw) =cT(v)+dT(w).
A linear transformation is highly restricted. Suppose we add u₀ to every vector. Then T(v) = v + u₀ and T(w) = w + u₀. This isn't good, or at least it isn't linear. Applying T to v + w produces v + w + u₀. That is not the same as T(v) + T(w):
T(v) + T(w) = v + u₀ + w + u₀   does not equal   T(v + w) = v + w + u₀.
The exception is when u₀ = 0. The transformation reduces to T(v) = v. This is
the identity transformation (nothing moves, as in multiplication by /). That is certainly
linear. In this case the input space V is the same as the output space W.
The linear-plus-shift transformation T(v) = Av + uo is called “affine.” Straight
lines stay straight and computer graphics works with affine transformations. The shift
to computer graphics is in Section 8.5.
Example 1  Choose a fixed vector a = (1, 3, 4), and let T(v) be the dot product a · v:
The input is v = (v₁, v₂, v₃).   The output is T(v) = a · v = v₁ + 3v₂ + 4v₃.
This is linear. The inputs v come from three-dimensional space, so V = R³. The outputs are just numbers, so the output space is W = R¹. We are multiplying by the row matrix A = [1 3 4]. Then T(v) = Av.
You will get good at recognizing which transformations are linear. If the output involves squares or products or lengths, v₁² or v₁v₂ or ‖v‖, then T is not linear.

Example 2  T(v) = ‖v‖ is not linear. Requirement (a) for linearity would be ‖v + w‖ = ‖v‖ + ‖w‖. Requirement (b) would be ‖cv‖ = c‖v‖. Both are false!
Not (a): The sides of a triangle satisfy an inequality ‖v + w‖ ≤ ‖v‖ + ‖w‖.
Not (b): The length ‖−v‖ is not −‖v‖.
Example 3 (Important) 7 is the transformation that rotates every vector by 30°. The
domain is the xy plane (where the input vector v is). The range is also the xy plane
(where the rotated vector T(v) is). We described T without mentioning a matrix: just
rotate the plane.
Is rotation linear? Yes it is. We can rotate two vectors and add the results. The
sum of rotations T(v) + T(w) is the same
as the rotation T(v + w) of the sum.
The
whole plane is turning together, in this linear transformation.
Note Transformations have a language of their own. Where there is no matrix, we
can’t talk about a column space. But the idea can be rescued and used. The column space consisted of all outputs Av. The nullspace consisted of all inputs for which
Av = 0. Translate those into “range” and “kernel”:
Range of T
=
set of all outputs T(v): corresponds to column space
Kernel of T
=
set of all inputs for which T(v) = 0: corresponds to nullspace.
The range is in the output space W. The kernel is in the input space V. When
T is multiplication by a matrix, T(v) = Av, you can translate back to column space
and nullspace.
For an m by n matrix, the nullspace is a subspace of V = R”. The column space
is a subspace of ____. The range might or might not be the whole output space W.
Examples of Transformations (mostly linear)
Example 4 Project every 3-dimensional vector down onto the xy plane. The range is
that plane, which contains every T(v). The kernel is the z axis (which projects down
to zero). This projection is linear.
Example 5  Project every 3-dimensional vector onto the horizontal plane z = 1. The vector v = (x, y, z) is transformed to T(v) = (x, y, 1). This transformation is not linear. Why not? It doesn't even transform v = 0 into T(v) = 0.
Multiply every 3-dimensional vector by a 3 by 3 matrix A. This is definitely a
linear transformation!
T(v+w)=A(v+w)_
which does equal
Av+Aw=T7(v)+T7(w).
Example 6  Suppose A is an invertible matrix. The kernel of T is the zero vector; the range W equals the domain V. Another linear transformation is multiplication by A⁻¹. This is the inverse transformation T⁻¹, which brings every vector T(v) back to v:
T⁻¹(T(v)) = v   matches the matrix multiplication   A⁻¹(Av) = v.
We are reaching an unavoidable question. Are all linear transformations produced by matrices? Each m by n matrix does produce a linear transformation from Rⁿ to Rᵐ. The rule is T(v) = Av. Our question is the converse. When a linear T is described as a "rotation" or "projection" or "…", is there always a matrix hiding behind T?
The answer is yes. This is an approach to linear algebra that doesn’t start with
matrices. The next section shows that we still end up with matrices.
Linear Transformations of the Plane
It is more interesting to see a transformation than to define it. When a 2 by 2 matrix
A multiplies all vectors in R?, we can watch how it acts. Start with a “house” in the
xy plane. It has eleven endpoints. Those eleven vectors v are transformed into eleven
vectors Av. Straight lines between v’s become straight lines between the transformed
vectors Av. (The transformation is linear!) Applying A to a house produces a new
house—possibly stretched or rotated or otherwise unlivable.
This part of the book is visual not theoretical. We will show six houses and the
matrices that produce them. The columns of H are the eleven circled points of the
first house. (H is 2 by 12, so plot2d will connect the 11th circle to the first.) The 11
points in the house matrix H are multiplied by A to produce the other houses. The
houses on the cover were produced this way (before Christine Curtis turned them into
a quilt for Professor Curtis).
Figure 7.1  Linear transformations of a house drawn by plot2d(A*H). One of the six matrices A is the rotation by 35°, with columns (cos 35°, sin 35°) and (−sin 35°, cos 35°).
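The same plotting idea can be sketched directly in MATLAB. The coordinates below are made-up stand-ins for the book's house matrix H (any 2 by n matrix of corner points works the same way); the rotation by 35° is one of the matrices in Figure 7.1:

    % A stand-in house: columns are points in the plane (hypothetical coordinates)
    H = [-6 -6 -7 0 7 6 6 -6; -7 2 1 8 1 2 -7 -7];
    A = [cos(35*pi/180) -sin(35*pi/180); sin(35*pi/180) cos(35*pi/180)];  % rotation by 35 degrees
    AH = A*H;                                 % transform every corner point
    plot(H(1,:), H(2,:), 'o-', AH(1,:), AH(2,:), 'o-')
    axis([-10 10 -10 10]), axis('square')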
■ REVIEW OF THE KEY IDEAS ■
1. A transformation T takes each v in the input space to T(v) in the output space.

2. Linearity requires that T(c₁v₁ + ⋯ + c_nv_n) = c₁T(v₁) + ⋯ + c_nT(v_n).

3. The transformation T(v) = Av + v₀ is linear only if v₀ = 0!

4. The quilt on the book cover shows T(house) = AH for nine matrices A.
Problem Set 7.1

1  A linear transformation must leave the zero vector fixed: T(0) = 0. Prove this from T(v + w) = T(v) + T(w) by choosing w = ____. Prove it also from requirement (b) by choosing c = ____.

2  Requirement (b) gives T(cv) = cT(v) and also T(dw) = dT(w). Then by addition, requirement (a) gives T(____) = (____). What is T(cv + dw + eu)?

3  Which of these transformations is not linear? The input is v = (v₁, v₂):
   (a) T(v) = (v₂, v₁)   (b) T(v) = (v₁, v₁)   (c) T(v) = (0, v₁)   (d) T(v) = (0, 1).

4  If S and T are linear transformations, is S(T(v)) linear or quadratic?
   (a) (Special case) If S(v) = v and T(v) = v, then S(T(v)) = v or v²?
   (b) (General case) S(w₁ + w₂) = S(w₁) + S(w₂) and T(v₁ + v₂) = T(v₁) + T(v₂) combine into S(T(v₁ + v₂)) = S(____) = ____ + ____.

5  Suppose T(v) = v except that T(0, v₂) = (0, 0). Show that this transformation satisfies T(cv) = cT(v) but not T(v + w) = T(v) + T(w).

6  Which of these transformations satisfy T(v + w) = T(v) + T(w) and which satisfy T(cv) = cT(v)?
   (a) T(v) = v/‖v‖   (b) T(v) = v₁ + v₂ + v₃   (c) T(v) = (v₁, 2v₂, 3v₃)   (d) T(v) = largest component of v.

7  For these transformations of V = R² to W = R², find T(T(v)). Is this transformation T² linear?
   (a) T(v) = −v   (b) T(v) = v + (1, 1)   (c) T(v) = 90° rotation = (−v₂, v₁)
   (d) T(v) = projection = ((v₁ + v₂)/2, (v₁ + v₂)/2).
8  Find the range and kernel (like the column space and nullspace) of T:
   (a) T(v₁, v₂) = (v₂, v₁)   (b) T(v₁, v₂, v₃) = (v₁, v₂)   (c) T(v₁, v₂) = (0, 0)   (d) T(v₁, v₂) = (v₁, v₁).

9  The "cyclic" transformation T is defined by T(v₁, v₂, v₃) = (v₂, v₃, v₁). What is T(T(v))? What is T³(v)? What is T¹⁰⁰(v)? Apply T three times and 100 times to v.

10  A linear transformation from V to W has an inverse from W to V when the range is all of W and the kernel contains only v = 0. Why are these transformations not invertible?
   (a) T(v₁, v₂) = (v₂, v₂),  W = R²   (b) T(v₁, v₂) = (v₁, v₂, 0),  W = R³   (c) T(v₁, v₂) = v₁ + v₂,  W = R¹.

11  If T(v) = Av and A is m by n, then T is "multiplication by A."
   (a) What are the input and output spaces V and W?
   (b) Why is range of T = column space of A?
   (c) Why is kernel of T = nullspace of A?

12  Suppose a linear T transforms (1, 1) to (2, 2) and (2, 0) to (0, 0). Find T(v) when
   (a) v = (2, 2)   (b) v = (3, 1)   (c) v = (−1, 1)   (d) v = (a, b).
Problems 13-20 may be harder. The input space V contains all 2 by 2 matrices M.
13
14
M is any 2 by 2 matrix and A = [4 al The transformation T is defined by
T(M) = AM. What rules of matrix multiplication show that 7 is linear?
Suppose A = [12]. Show that the range of T is the whole matrix space V and
the kernel is the zero matrix:
15
(1)
If AM =O
prove that M must be the zero matrix.
(2)
Find a solution to AM = B for any 2 by 2 matrix B.
Suppose A = [42]. Show that the identity matrix / is not in the range of T.
Find a nonzero matrix M such that T(M) = AM
16
is zero.
Suppose 7 transposes every matrix M. Try to find a matrix A which gives AM =
M! for every M. Show that no matrix A will do it. To professors: Is this a linear
transformation that doesn’t come from a matrix?
17
The transformation T that transposes every matrix is definitely linear. Which of
these extra properties are true?
(a)
T? = identity transformation.
(b)
The kernel of T is the zero matrix.
(c)
Every matrix is in the range of T.
(d)
T(M) = —M
is impossible.

18
Suppose T(M) = [§9][m][9
9]. Find a matrix with T(M) 4 0. Describe all
matrices
with
T(M)
= 0 (the kernel of 7) and all output matrices
T(M)
(the
range of 7).
19
If A ≠ 0 and B ≠ 0 then there is a matrix M such that AMB ≠ 0. Show by
example that M =/ might fail. For your example find an M that succeeds.
20
If A and B are invertible and T(M) = AMB, find T⁻¹(M) in the form (    )M(    ).
Questions 21-27 are about house transformations by AH.
21  The output is T(house). How can you tell from the picture of T(house) that A is
   (a) a diagonal matrix?   (b) a rank-one matrix?   (c) a lower triangular matrix?

22  Draw a picture of T(house) for these matrices:

23  What are the conditions on A = [a b; c d] to ensure that T(house) will
   (a) sit straight up?   (b) expand the house by 3 in all directions?   (c) rotate the house with no change in its shape?

24  What are the conditions on det A = ad − bc to ensure that T(house) will
   (a) be squashed onto a line?   (b) keep its endpoints in clockwise order (not reflected)?   (c) have the same area as the original house?
   If one side of the house stays in place, how do you know that A = I?
25  Describe T(house) when T(v) = −v + (1, 0). This T is "affine."
26
Change the house matrix H to add a chimney.
27
This MATLAB program creates a vector of 50 angles called theta, and then draws
the unit circle and T (circle) = ellipse. You can change A.
A = [2 1; 1 2];
theta = [0:2*pi/50:2*pi];
circle = [cos(theta); sin(theta)];
ellipse = A*circle;
axis([-4 4 -4 4]); axis('square')
plot(circle(1,:), circle(2,:), ellipse(1,:), ellipse(2,:))
28
Add two eyes and a smile to the circle in Problem 27. (If one eye is dark and the
other is light, you can tell when the face is reflected across the y axis.) Multiply
by matrices A to get new faces.
29
The first house is drawn by this program plot2d(H). Circles from o and lines
from —:
x = H(1,:)';  y = H(2,:)';
axis([-10 10 -10 10]), axis('square')
plot(x, y, 'o', x, y, '-');
Test plot2d(A’ * H) and plot2d(A’ * A* H) with the matrices in Figure 7.1.
30
Without a computer describe the houses A « H for these matrices A:
1 0
E 4 a
31
a
3 5 ag
mie:
Be 3 3%
Nn
at
i AF
What matrices give the houses on the front cover? The second is A = I.
THE MATRIX OF A LINEAR TRANSFORMATION ■ 7.2
The next pages assign a matrix to every linear transformation. For ordinary column vectors, the input v is in V = Rⁿ and the output T(v) is in W = Rᵐ. The matrix for T will be m by n.
The standard basis vectors for Rⁿ and Rᵐ lead to a standard matrix for T. Then T(v) = Av in the normal way. But these spaces also have other bases, so the same T is represented by other matrices. A main theme of linear algebra is to choose the bases that give the best matrix.
When V and W are not Rⁿ and Rᵐ, they still have bases. Each choice of basis leads to a matrix for T. When the input basis is different from the output basis, the matrix for T(v) = v will not be the identity I. It will be the "change of basis matrix."

Key idea of this section   When we know T(v₁), …, T(v_n) for the basis vectors v₁, …, v_n, linearity produces T(v) for every other vector v.

Reason  Every input v is a unique combination c₁v₁ + ⋯ + c_nv_n of the basis vectors. Since T is a linear transformation (here is the moment for linearity), the output T(v) must be the same combination of the known outputs T(v₁), …, T(v_n).
Suppose v = c₁v₁ + ⋯ + c_nv_n.   Then linearity requires   T(v) = c₁T(v₁) + ⋯ + c_nT(v_n).
The rule of linearity extends from cv + dw to all combinations c₁v₁ + ⋯ + c_nv_n.
Example 1  Suppose T transforms v₁ = (1, 0) to T(v₁) = (2, 3, 4). Suppose the second basis vector v₂ = (0, 1) goes to T(v₂) = (5, 5, 5). If T is linear from R² to R³ then its standard matrix is 3 by 2:
A = [2 5; 3 5; 4 5].   By linearity:   T(v₁ + v₂) = A [1; 1] = [7; 8; 9].
Example 2  The derivatives of the functions 1, x, x², x³ are 0, 1, 2x, 3x². Those are four facts about the transformation T that "takes the derivative." Now add the crucial fact that T is linear:
T(v) = dv/dx   obeys the linearity rule   T(cv + dw) = c dv/dx + d dw/dx.   (1)
It is exactly those facts that you use to find all other derivatives. From the derivative of each separate power 1, x, x², x³ (those are the basis vectors v₁, v₂, v₃, v₄) you find the derivative of any polynomial like 4 + x + x² + x³:
d/dx (4 + x + x² + x³) = 1 + 2x + 3x²   (because of linearity!)   (2)
Here the input space V contains all combinations of 1, x, x², x³.
I call them vectors,
you might call them functions. Those four vectors are a basis for the space V of cubic
polynomials (degree ≤ 3).
For the nullspace of A, we solved Av = 0. For the kernel of the derivative T, we solve
dv/dx
= 0. The solution is v = constant.
The nullspace of T is one-dimensional,
containing all constant functions like v; = 1 (the first basis function).
To find the range (or column space), look at all outputs from T(v) = dv/dx.
The inputs are cubic polynomials a + bx + cx² + dx³, so the outputs are quadratic polynomials (degree ≤ 2). For the image space W we have a choice. If W = cubics,
then the range of T (the quadratics) is a subspace. If W = quadratics, then the range
is all of W.
That second choice emphasizes the difference between the domain or input space
(V = cubics) and the image or output space (W = quadratics). V has dimension n = 4
and W has dimension m = 3. The matrix for T will be 3 by 4.
The range of T is a three-dimensional subspace. The matrix will have rank r = 3.
The kernel is one-dimensional. The sum 3+ 1 = 4 is the dimension of the input space.
This was r + (n —r) =n
in the Fundamental Theorem of Linear Algebra. Always
(dimension of range) + (dimension of kernel) = dimension of V.
Example 3  The integral is the inverse of the derivative. That is the Fundamental Theorem of Calculus. We see it now in linear algebra. The transformation T⁻¹ that "takes the integral from 0 to x" is linear! Apply T⁻¹ to 1, x, x²:
∫₀ˣ 1 dx = x,   ∫₀ˣ x dx = ½x²,   ∫₀ˣ x² dx = ⅓x³.
By linearity, the integral of w = B + Cx + Dx² is T⁻¹(w) = Bx + ½Cx² + ⅓Dx³. The integral of a quadratic is a cubic. The input space of T⁻¹ is the quadratics, the output space is the cubics. Integration takes W back to V. Its matrix will be 4 by 3.
Range of T⁻¹  The outputs Bx + ½Cx² + ⅓Dx³ are cubics with no constant term.
Kernel of T⁻¹  The output is zero only if B = C = D = 0. The nullspace is the zero vector. Now 3 + 0 = 3 is the dimension of the input space W for T⁻¹.
Matrices for the Derivative and Integral
We will show how the matrices A and A~! copy the derivative T and the integral T~!.
This is an excellent example from calculus. Then comes the general rule—how to represent any linear transformation T by a matrix A.
The derivative transforms the space V of cubics to the space W of quadratics.
The basis for V is 1, x, x*, x3. The basis for W is 1, x, x”. The matrix that “takes the
derivative” is 3 by 4:
A = [0 1 0 0; 0 0 2 0; 0 0 0 3] = matrix form of d/dx.
Why is A the correct matrix? Because multiplying by A agrees with transforming by T. The derivative of v = a + bx + cx² + dx³ is T(v) = b + 2cx + 3dx². The same b and 2c and 3d appear when we multiply by the matrix:
[0 1 0 0; 0 0 2 0; 0 0 0 3] [a; b; c; d] = [b; 2c; 3d].   (3)
Look also at T⁻¹. The integration matrix is 4 by 3. Watch how the following matrix starts with w = B + Cx + Dx² and produces its integral Bx + ½Cx² + ⅓Dx³:
Integration:   [0 0 0; 1 0 0; 0 ½ 0; 0 0 ⅓] [B; C; D] = [0; B; ½C; ⅓D].   (4)
I want to call that matrix A⁻¹, and I will. But you realize that rectangular matrices don't have inverses. At least they don't have two-sided inverses. This rectangular A has a one-sided inverse. The integral is a one-sided inverse of the derivative!
AA⁻¹ = [1 0 0; 0 1 0; 0 0 1]   but   A⁻¹A = [0 0 0 0; 0 1 0 0; 0 0 1 0; 0 0 0 1].
If you integrate a function and then differentiate, you get back to the start. So AA⁻¹ = I. But if you differentiate before integrating, the constant term is lost. The integral of the derivative of 1 is zero:
T⁻¹T(1) = integral of zero function = 0.
This matches A⁻¹A, whose first column is all zero. The derivative T has a kernel (the constant functions). Its matrix A has a nullspace. Main point again: Av copies T(v).
Construction of the Matrix
Now for the general rule. Suppose T transforms the space V (n-dimensional) to the space W (m-dimensional). We choose a basis v₁, …, v_n for V and a basis w₁, …, w_m for W. The matrix A will be m by n. To find its first column, apply T to the first basis vector v₁:
T(v₁) is in W.   It is a combination   a_{11}w₁ + ⋯ + a_{m1}w_m   of the output basis.
These numbers a_{11}, …, a_{m1} go into the first column of A. Transforming v₁ to T(v₁) matches multiplying (1, 0, …, 0) by A. It yields that first column of the matrix. When T is the derivative and the first basis vector is 1, its derivative is T(v₁) = 0. So for the derivative, the first column of A was all zero.
For the integral, the first basis function is again 1 and its integral is x. This is 1 times the second basis function. So the first column of A⁻¹ was (0, 1, 0, 0).
7A  Each linear transformation T from V to W is represented by a matrix A (after the bases are chosen for V and W). The jth column of A is found by applying T to the jth basis vector v_j:
T(v_j) = combination of basis vectors of W = a_{1j}w₁ + ⋯ + a_{mj}w_m.   (5)
These numbers a_{1j}, …, a_{mj} go into column j of A. The matrix is constructed to get the basis vectors right. Then linearity gets all other vectors right. Every v is a combination c₁v₁ + ⋯ + c_nv_n, and T(v) is a combination of the w's. When A multiplies the coefficient vector c = (c₁, …, c_n) in the v combination, Ac produces the coefficients in the T(v) combination. This is because matrix multiplication (combining columns) is linear like T.
A tells what T does. Every linear transformation can be converted to a matrix.
This matrix depends on the bases.
Example 4  If the bases change, T is the same but the matrix A is different.
Suppose we reorder the basis of cubics to x, x², x³, 1. Keep the original 1, x, x² for the quadratics in W. Now apply T to the first basis vector. The derivative of x is 1. This is the first basis vector of W. So the first column of A looks different:
A = [1 0 0 0; 0 2 0 0; 0 0 3 0] = matrix for the derivative T when the bases change to x, x², x³, 1 and 1, x, x².
When we reorder the basis of V, we reorder the columns of A. The input basis vector
v; is responsible for column j. The output basis vector w; is responsible for row i.
Soon the changes in the bases will be more than permutations.
Figure 7.2  Rotation by θ and projection onto the 45° line.
Products AB Match Transformations TS
The examples of derivative and integral made three points. First, linear transformations T are everywhere, in calculus and differential equations and linear algebra. Second, spaces other than Rⁿ are important (we had cubics and quadratics). Third, T still boils down to a matrix A. Now we make sure that we can find this matrix.
The next examples have V = W. We choose the same basis for both spaces. Then we can compare the matrices A² and AB with the transformations T² and TS.
Example 5  T rotates every plane vector by the same angle θ. Here V = W = R². Find the rotation matrix A. The answer depends on the basis!
Solution The standard basis is vj; = (1,0) and vz = (0,1). To find A, apply T to
those basis vectors. In Figure 7.2a, they are rotated by @. The first vector (1, 0) swings
around to (cos@,sin@).
This equals cos@ times (1,0) plus sin@ times (0, 1). There-
fore those numbers cos@ and sin@ go into the first column of A:
[cos θ; sin θ]   shows column 1.      A = [cos θ  −sin θ; sin θ  cos θ]   shows both columns.
For the second column, transform the second vector (0, 1). The figure shows it rotated
to (—sin@,cos@). Those numbers go into the second column. Multiplying A times
(0, 1) produces that column, so A agrees with T.
Example 6 (Projection) Suppose T projects every plane vector onto the 45° line. Find
its matrix for two different choices of the basis.
Solution Start with a specially chosen basis. The basis vector vj is along the 45° line.
It projects to itself. So the first column of A contains 1 and 0. The second basis vector v₂ is along the perpendicular line (135°). This basis vector projects to zero. So the second column of A contains 0 and 0:
Projection   A = [1 0; 0 0]   when V and W have the 45° and 135° basis.
With the basis in the opposite order (135° then 45°), the matrix is __.
Now take the standard basis (1, 0) and (0, 1). Figure 7.2b shows how (1, 0) projects to (½, ½). That gives the first column of A. The other basis vector (0, 1) also projects to (½, ½). So the matrix is
Projection   A = [½ ½; ½ ½]   for the same T and the standard basis.
Both A’s are projection matrices. If you square A it doesn’t change. Projecting
twice is the same as projecting once: T? = T so A? = A. Notice what is hidden in
that statement: The matrix for T? is A?.
We have come to something important: the real reason for the way matrices are multiplied. Two transformations S and T are represented by two matrices B and A. When we apply T to the output from S, we get the "composition" TS. When we apply A after B, we get the matrix product AB. The rules for matrix multiplication give the correct matrix AB to represent TS.
The transformation S is from a space U to V. Its matrix B uses a basis u₁, …, u_p for U and a basis v₁, …, v_n for V. The matrix is n by p. The transformation T is from V to W as before. Its matrix A must use the same basis v₁, …, v_n for V; this is the output space for S and the input space for T. Then AB matches TS:

7B Multiplication  The linear transformation TS starts with any vector u in U, goes to S(u) in V and then to T(S(u)) in W. The matrix AB starts with any x in Rᵖ, goes to Bx in Rⁿ and then to ABx in Rᵐ. The matrix AB correctly represents TS:
TS:  U → V → W       AB:  (m by n)(n by p) = (m by p).
The input is u = x₁u₁ + ⋯ + x_pu_p. The output T(S(u)) matches the output ABx.
Product of transformations matches product of matrices. The most important cases are
when the spaces U, V, W are the same and their bases are the same. With m =n = p
we have square matrices.
Example 7  S rotates the plane by θ and T also rotates by θ. Then TS rotates by 2θ. This transformation T² corresponds to the rotation matrix A² through 2θ:
A = B      A² = rotation by 2θ = [cos 2θ  −sin 2θ; sin 2θ  cos 2θ].
By matching (transformation)² with (matrix)², we pick up the formulas for cos 2θ and sin 2θ. Multiply A times A:
[cos θ  −sin θ; sin θ  cos θ] [cos θ  −sin θ; sin θ  cos θ] = [cos²θ − sin²θ   −2 sin θ cos θ; 2 sin θ cos θ   cos²θ − sin²θ].
Comparing with the display above, cos 20 = cos? 6 — sin? 6 and sin26 = 2sin@cos@.
Trigonometry comes from linear algebra.
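The matching of (transformation)² with (matrix)² is easy to confirm numerically for any particular angle (a minimal sketch):

    % Rotation by theta, squared, equals rotation by 2*theta
    theta = pi/5;
    A  = [cos(theta) -sin(theta); sin(theta) cos(theta)];
    A2 = [cos(2*theta) -sin(2*theta); sin(2*theta) cos(2*theta)];
    disp(norm(A*A - A2))           % near zero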
Example 8 § rotates by @ and T rotates by —@. Then TS = / and AB =/.
In this case T(S(u)) is u. We rotate forward and back. For the matrices to match,
ABx must be x. The two matrices are inverses. Check this by putting cos(—@) = cos@
and sin(—@) = —sin@ into A:
AB = [cos θ  sin θ; −sin θ  cos θ] [cos θ  −sin θ; sin θ  cos θ] = [cos²θ + sin²θ   0; 0   cos²θ + sin²θ].
By the famous identity cos²θ + sin²θ = 1, this is I.
Earlier T took the derivative and S took the integral. Then TS is the identity but not ST. Therefore AB is the identity matrix but not BA:
AB = [0 1 0 0; 0 0 2 0; 0 0 0 3] [0 0 0; 1 0 0; 0 ½ 0; 0 0 ⅓] = [1 0 0; 0 1 0; 0 0 1].
The Identity Transformation and Its Matrices
We need the matrix for the special and boring transformation T(v) = v. This identity transformation does nothing to v. The matrix also does nothing, provided the output basis is the same as the input basis. The output T(v_j) is v_j. When the bases are the same, this is w_j. So the first column of A is (1, 0, …, 0).
When each output T(v_j) = v_j is the same as w_j, the matrix is just I.
This seems reasonable: The identity transformation is represented by the identity matrix.
But suppose the bases are different. Then T(v₁) = v₁ is a combination of the w's. That combination m_{11}w₁ + ⋯ + m_{m1}w_m tells us the first column of the matrix M. We will use M (instead of A) for a matrix that represents the identity transformation.
When the outputs T(v_j) = v_j are combinations Σᵢ m_{ij}w_i, the "change of basis matrix" is M.
The basis is changing but the vectors themselves are not changing: T7(v) = v. When
the input has one basis and the output has another basis, the matrix is not /.
Example 9 The input basis is vj; = (3,7) and v2 = (2,5). The output basis is wy =
(1,0) and w2 = (0,1). Then the matrix M is easy to compute:
The matrix for T(v) = v is M = [3 2; 7 5].
Reason The first input is vj = (3,7). The output is also (3,7) but we express it as
3w; + 7w2. Then the first column of M contains 3 and 7.
This seems too simple to be important. It becomes trickier when the change of basis goes the other way. We get the inverse matrix:

Example 10  The input basis is v₁ = (1, 0) and v₂ = (0, 1). The output basis is w₁ = (3, 7) and w₂ = (2, 5). Then the matrix for T(v) = v is the inverse of M:
M⁻¹ = [3 2; 7 5]⁻¹ = [5 −2; −7 3].
Reason  The first input is v₁ = (1, 0). The output is also v₁ but we express it as 5w₁ − 7w₂. Check that 5(3, 7) − 7(2, 5) does produce (1, 0). We are combining the columns of the previous M to get the columns of I. The matrix to do that is M⁻¹:
M M⁻¹ = [3 2; 7 5] [5 −2; −7 3] = I.
A mathematician would say that M M⁻¹ corresponds to the product of two identity transformations. We start and end with the same basis (1, 0) and (0, 1). Matrix multiplication must give I. So the two change of basis matrices are inverses.
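A one-line check of Examples 9 and 10 (a minimal sketch):

    % M expresses the basis (3,7), (2,5) in the standard basis; its inverse goes back
    M = [3 2; 7 5];
    Minv = inv(M);                 % equals [5 -2; -7 3] because det(M) = 1
    disp(Minv)
    disp(M*Minv)                   % the identity: the two changes of basis cancel
    disp((5*[3; 7] - 7*[2; 5])')   % (1, 0): the first column of I as 5w1 - 7w2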
Warning One mistake about M is very easy to make. This matrix changes from the
basis of v's to the standard columns of I. But matrix multiplication goes the other way.
When you multiply M times the columns of J, you get the v’s. It seems backward but
it is really OK.
One thing is sure. Multiplying A times (1,0,...,0) gives column
1 of the ma-
trix. The novelty of this section is that (1,0,...,0) stands for the first vector v,, written in the basis of v’s. Then column | of the matrix is that same vector v1, written
in the standard basis. This is when we keep T = J and change basis. In the rest of
the book we keep the basis fixed and T is multiplication by A.
■ REVIEW OF THE KEY IDEAS ■

1. If we know T(v₁), …, T(v_n) for the vectors in a basis, linearity determines all other T(v).

2. Linear transformation T, input basis v₁, …, v_n, output basis w₁, …, w_m: the matrix A (m by n) represents T in these bases.

3. The derivative and integral matrices are one-sided inverses: (Derivative)(Integral) = I but (Integral)(Derivative) ≠ I, because d(constant)/dx = 0. Fundamental Theorem of Calculus!

4. The change of basis matrix M represents T = I. Its columns are the coefficients of each output basis vector expressed in the input basis: w_j = m_{1j}v₁ + ⋯ + m_{nj}v_n.
Problem Set 7.2
Questions 1-4 extend the first derivative example to higher derivatives.
1  The transformation S takes the second derivative. Keep 1, x, x², x³ as the basis v₁, v₂, v₃, v₄ and also as w₁, w₂, w₃, w₄. Write Sv₁, Sv₂, Sv₃, Sv₄ in terms of the w's. Find the 4 by 4 matrix B for S.

2  What functions have v'' = 0? They are in the kernel of the second derivative S. What vectors are in the nullspace of its matrix B in Problem 1?
3  B is not the square of the 3 by 4 first derivative matrix
A = [0 1 0 0; 0 0 2 0; 0 0 0 3].
Add a zero row to A, so that output space = input space. Then compare A² with B. Conclusion: For B = A² we also want output basis = ____ basis.
4  (a) The product TS produces the third derivative. Add zero rows to make 4 by 4 matrices, then compute AB.
   (b) The matrix B² corresponds to S² = fourth derivative. Why is this entirely zero?
Questions 5-10 are about a particular T and its matrix A.
5
With bases v1, v2, v3 and wy), w2, w3, suppose T(v,) = w2 and T(v2) = T(v3) =
w, + w3. T is a linear transformation. Find the matrix A.
(a)
What is the output from 7 in Question 5 when the input is v; + v2 + 03?
(b)
Multiply A times the vector (1, 1, 1).
Since T(v2) = T(v3), the solutions to T(v) = 0 are v =
are in the nullspace of A? Find all solutions to T(v) = wp.
. What vectors
Find a vector that is not in the column space of A. Find a combination of w’s
that is not in the range of T.
You don’t have enough information to determine T?. Why not? Why is its matrix
not necessarily A”?
10
Find the rank of A. This is not the dimension of the output space W. It is the
dimension of the ____.
Questions 11-14 are about invertible linear transformations.
11
Suppose T(v|) = w) +w2+w3 and T(v2) = w2+w3 and T(v3) = w3. Find the
matrix for T using these basis vectors. What input vector v gives T(v) = w 1?
12
Invert the matrix
A in Problem
11. Also invert the transformation
T—what
are
T—!(w;) and T~!(w2) and T~!(w3)? Find all v’s that solve T(v) = 0.
13
Which of these are true and why is the other one ridiculous?
@
14
TUr=I!
®)
TUT@))=1
©
Tf)
=w1.
Suppose the spaces V and W have the same basis 0}, v2.
(a)
Describe a transformation
T (not /) that is its own inverse.
(b)
Describe a transformation
T (not /) that equals jie
(c)
Why can’t the same
T be used for both (a) and (b)?
Questions 15-20 are about changing the basis.
15
16
17
(a)
What matrix transforms (1, 0) into (2, 5) and transforms (0, 1) to (1, 3)?
(b)
What matrix transforms (2,5) to (1,0) and (1, 3) to (0, 1)?
(c)
Why does no matrix transform (2,6) to (1,0) and (1,3) to (0, 1)?
(a)
What matrix M transforms (1,0) and (0, 1) to (r,t) and (s, u)?
(b)
What matrix N transforms
(c)
What condition on a,b, c,d will make part (b) impossible?
(a)
How do M and N in Problem
(Ap) wanda(ba)aion(sarae
(b)
What matrix transforms (2,5) to (1,1) and (1,3) to (0, 2)?
(a,c) and (b,d) to (1,0) and (0, 1)?
16 yield the matrix that transforms (a, c) to
18
If you keep the same basis vectors but put them in a different order, the change
of basis matrix M is a
matrix. If you keep the basis vectors in order but
change their lengths, M is a
matrix.
19
The matrix that rotates the axis vectors (1,0) and (0,1) through an angle @ is
Q. What are the coordinates (a, b) of the original (1, 0) using the new (rotated)
axes? This can be tricky. Draw a figure or solve this equation for a and b:
[1; 0] = a [cos θ; sin θ] + b [−sin θ; cos θ].

20  The matrix that transforms (1, 0) and (0, 1) to (1, 4) and (1, 5) is M = ____. The combination a(1, 4) + b(1, 5) that equals (1, 0) has (a, b) = (____, ____). How are those new coordinates of (1, 0) related to M or M⁻¹?
Questions 21-24 are about the space of quadratic polynomials A + Bx + Cx?.
21  The parabola w₁ = ½(x² + x) equals one at x = 1 and zero at x = 0 and x = −1. Find the parabolas w₂, w₃, and y(x):
   (a) w₂ equals one at x = 0 and zero at x = 1 and x = −1.
   (b) w₃ equals one at x = −1 and zero at x = 0 and x = 1.
   (c) y(x) equals 4 at x = 1 and 5 at x = 0 and 6 at x = −1. Use w₁, w₂, w₃.

22  One basis for second-degree polynomials is v₁ = 1 and v₂ = x and v₃ = x².
Another basis is w;, w2, w3 from Problem 21. Find two change of basis matrices,
from the w’s to the v’s and from the v’s to the w’s.
23  What are the three equations for A, B, C if the parabola Y = A + Bx + Cx² equals 4 at x = a and 5 at x = b and 6 at x = c? Find the determinant of the 3 by 3 matrix. For which numbers a, b, c will it be impossible to find this parabola Y?
24
Under what condition on the numbers m₁, m₂, …, m₉ do these three parabolas give a basis for the space of all parabolas?
v₁(x) = m₁ + m₂x + m₃x²   and   v₂(x) = m₄ + m₅x + m₆x²   and   v₃(x) = m₇ + m₈x + m₉x².

25  The Gram-Schmidt process changes a basis a₁, a₂, a₃ to an orthonormal basis q₁, q₂, q₃. These are columns in A = QR. Show that R is the change of basis matrix from the a's to the q's (a₂ is what combination of q's when A = QR?).
26
Elimination changes the rows of A to the rows of U with A = LU. Row 2 of A is what combination of the rows of U? Writing Aᵀ = UᵀLᵀ to work with columns, the change of basis matrix is M = Lᵀ. (We have bases provided the matrices are ____.)
27  Suppose v₁, v₂, v₃ are eigenvectors for T. This means T(vᵢ) = λᵢvᵢ for i = 1, 2, 3. What is the matrix for T when the input and output bases are the v's?
28
Every invertible linear transformation can have / as its matrix. Choose any input
basis vj,..., ¥, and for output basis choose
w; to be T(v;).
Why
must
T be
invertible?
Questions 29-32 review some basic linear transformations.
29  Using v₁ = w₁ and v₂ = w₂ find the standard matrix for these T's:
   (a) T(v₁) = 0 and T(v₂) = 3v₁   (b) T(v₁) = v₁ and T(v₁ + v₂) = v₁.
30
Suppose T is reflection across the x axis and S is reflection across the y axis.
The domain V is the xy plane.
If v = (x, y) what is S(T (v))? Find a simpler
description of the product ST.
31
Suppose T is reflection across the 45° line, and S is reflection across the y axis.
If v = (2,1) then T(v) = (1,2). Find S(7(v)) and T(S(v)). This shows that
generally ST ≠ TS.
32  Show that the product ST of two reflections is a rotation. Multiply these reflection matrices to find the rotation angle:
[cos 2θ  sin 2θ; sin 2θ  −cos 2θ] [cos 2α  sin 2α; sin 2α  −cos 2α].

33  True or false: If we know T(v) for n different nonzero vectors in Rⁿ, then we know T(v) for every vector in Rⁿ.
CHANGE OF BASIS ■ 7.3
This section returns to one of the fundamental ideas of linear algebra—a basis for R".
We don’t intend to change that idea, but we do intend to change the basis. It often happens
(and we will give examples) that one basis is especially suitable for a specific problem.
By changing to that basis, the vectors and the matrices reveal the information we want.
The whole idea of a transform (this book explains the Fourier transform and wavelet transform) is exactly a change of basis.
Remember
what it means for the vectors wj,..., Ww, to be a basis for R”:
1.
The w’s are linearly independent.
2.
The nxn
3.
Every vector v in R" can be written in exactly one way as a combination of the w’s:
matrix W with these columns is invertible.
v = c₁w₁ + c₂w₂ + ⋯ + c_nw_n.   (1)
Here is the key point: Those coefficients c₁, …, c_n completely describe the vector v,
after we have decided on the basis. Originally, a column vector v just has the components v1,..., Un. In the new basis of w’s, the same vector is described by the different
set of numbers cj, ...C,. It takes n numbers to describe each vector and it also requires
a choice of basis. The n numbers are the coordinates of v in that basis:
v = [v₁; ⋮ ; v_n]  in the standard basis    and    c = [c₁; ⋮ ; c_n]  in the basis of w's.   (2)
A basis is a set of axes for Rⁿ. The coordinates c₁, …, c_n tell how far to go along each axis. The axes are at right angles when the w's are orthogonal.
Small point: What is the “standard basis’? Those basis vectors are simply the
columns of the n by n identity matrix /. These columns e),...,@, are the “default
basis.’ When I write down the vector v = (2,4, 5) in R?>, Tam intending and you are
expecting the standard basis (the usual xyz axes, where the coordinates are 2, 4, 5):
v = 2e₁ + 4e₂ + 5e₃ = 2 [1; 0; 0] + 4 [0; 1; 0] + 5 [0; 0; 1] = [2; 4; 5].
The new question is: What are the coordinates c, C2, c3 in the new basis w,, W2, W3? As
usual we put the basis vectors into the columns of a matrix. This is the basis matrix W.
Then the fundamental equation v = c₁w₁ + ⋯ + c_nw_n has the matrix form v = Wc. From this we immediately know c.
7C  The coordinates c = (c₁, …, c_n) of a vector v in the basis w₁, …, w_n are given by c = W⁻¹v. The change of basis matrix W⁻¹ is the inverse of the basis matrix W.
The standard basis has W = I. The coordinates in that default basis e₁, …, e_n are the usual components v₁, …, v_n. Our first new example is the wavelet basis for R⁴.
Example 1 (Wavelet basis)  Wavelets are little waves. They have different lengths and they are localized at different places. The first basis vector is not actually a wavelet, it is the very useful flat vector of all ones:
w₁ = [1; 1; 1; 1],   w₂ = [1; 1; −1; −1],   w₃ = [1; −1; 0; 0],   w₄ = [0; 0; 1; −1].   (3)
Those vectors are orthogonal, which is good. You see how w3 is localized in the
first half and wa is localized in the second half. The coefficients cz and c4 will tell
us about details in the first half and last half of v. The ultimate in localization is the
standard basis.
Why do we want to change the basis? I think of v₁, v₂, v₃, v₄ as the intensities of a
signal. It could be an audio signal, like music on a CD. It could be a medical signal,
like an electrocardiogram. Of course n = 4 is very short, and n = 10,000 is more
realistic. We may need to compress that long signal, by keeping only the largest 5%
of the coefficients. This is 20 : 1 compression and (to give only one of its applications)
it makes modern video conferencing possible.
If we keep only 5% of the standard basis coefficients, we lose 95% of the signal.
In image processing, most of the image disappears. In audio, 95% of the tape goes
blank. But if we choose a better basis of w’s, 5% of the basis vectors can come very
close to the original signal. In image processing and audio coding, you can’t see or
hear the difference.
One good basis vector is a flat vector like w; = (1, 1, 1, 1). That
part alone can represent the whole constant background of our image.
wave like w4 = (0,0, 1, —1) or in higher dimensions
Then a short
wg = (0, 0, 0, 0, 0,0, 1, —1) rep-
resents a detail at the end of the signal.
The three steps of transform and compression and inverse transform are
input v  →  [lossless]  →  coefficients c  →  [lossy]  →  compressed ĉ  →  [reconstruct]  →  compressed v̂.
In linear algebra, where everything is perfect, we omit the compression step. The output v̂ is exactly the same as the input v. The transform gives c = W⁻¹v and the reconstruction brings back v = Wc. In true signal processing, where nothing is perfect but everything is fast, the transform (lossless) and the compression (which only loses unnecessary information) are absolutely the keys to success.
I will show those steps for a typical vector like v = (6, 4, 5, 1). Its coefficients in the wavelet basis are c = (4, 1, 1, 2). This means that v can be reconstructed from c using the w's. In matrix form the reconstruction is v = Wc:
[6; 4; 5; 1] = [1 1 1 0; 1 1 −1 0; 1 −1 0 1; 1 −1 0 −1] [4; 1; 1; 2].
Those coefficients c = (4, 1, 1, 2) are W⁻¹v. Inverting this basis matrix W is easy because the w's in its columns are orthogonal. But they are not unit vectors. So the inverse is the transpose divided by the lengths squared, W⁻¹ = (WᵀW)⁻¹Wᵀ:
W⁻¹ = [¼ ¼ ¼ ¼; ¼ ¼ −¼ −¼; ½ −½ 0 0; 0 0 ½ −½].
From the first row of c = W⁻¹v, notice that c₁ is the average of v₁, v₂, v₃, v₄:
c₁ = (6 + 4 + 5 + 1)/4 = 4.
Example 2 (Same wavelet basis by recursion)  I can't resist showing you a faster way to find the c's. The special point of the wavelet basis is that you can pick off the details in c₃ and c₄, before the coarse detail in c₂ and the overall average in c₁. A picture will explain this "multiscale" method, which is in Chapter 1 of my textbook with Nguyen on Wavelets and Filter Banks:
Split v = (6, 4, 5, 1) into averages and waves at small scale and then large scale:
[6; 4; 5; 1] = [5; 5; 3; 3] plus the differences/2:  1·[1; −1; 0; 0] + 2·[0; 0; 1; −1]
[5; 5; 3; 3] = the overall average 4·[1; 1; 1; 1] plus the difference/2:  1·[1; 1; −1; −1].
Example 3 (Fourier basis) The first thing an electrical engineer does with a signal is
to take its Fourier transform. This is a discrete signal (a vector v) and we are speaking
about its Discrete Fourier Transform. The DFT involves complex numbers. But if we
choose n = 4, the matrices are small and the only complex numbers are i, i² = −1, and i³ = −i. Notice that i³ = −i because i² = −1. A true electrical engineer would write j
instead of i, but we have to draw a line somewhere. (The word imaginary does start
with i.) Here are the four basis vectors, in the columns of the Fourier matrix F:
F = [1 1 1 1; 1 i i² i³; 1 i² i⁴ i⁶; 1 i³ i⁶ i⁹].
The first column is the useful flat basis vector (1, 1, 1, 1). It represents the average signal or the direct current (the DC term). It is a wave at zero frequency. The third column is (1, −1, 1, −1), which alternates at the highest frequency. The Fourier transform decomposes the signal into waves at equally spaced frequencies.
The Fourier matrix F is absolutely the most important complex matrix in mathematics and science and engineering. The last section of this book explains the Fast
Fourier Transform: it is a factorization of F into matrices with many zeros. The FFT
has revolutionized entire industries, by speeding up the Fourier transform. The beautiful thing is that F⁻¹ looks like F, with i changed to −i:
F becomes F⁻¹ = (1/4) [1 1 1 1; 1 (−i) (−i)² (−i)³; 1 (−i)² (−i)⁴ (−i)⁶; 1 (−i)³ (−i)⁶ (−i)⁹].
The MATLAB command c = fft(v) produces the Fourier coefficients cj,..., Cc, of the
vector v. It multiplies v by F —1 (fast).
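The 4 by 4 Fourier matrix and its inverse can be checked directly (a minimal sketch; it builds F explicitly rather than calling fft, since fft follows its own scaling convention):

    % Fourier matrix with entries i^(j*k) for j,k = 0,1,2,3
    j = 0:3;
    F = 1i .^ (j' * j);
    Finv = conj(F) / 4;              % change i to -i and divide by n = 4
    disp(norm(F*Finv - eye(4)))      % near zero
    v = [6; 4; 5; 1];
    c = Finv * v;                    % Fourier coefficients of v in this basis
    disp(norm(F*c - v))              % reconstruction check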
The Dual Basis
The columns of W contain the basis vectors w₁, …, w_n. To find the coefficients c₁, …, c_n of a vector in this basis, we use the matrix W⁻¹. This subsection just introduces a notation and a new word for the rows of W⁻¹. The vectors in those rows (call them u₁ᵀ, …, u_nᵀ) are the dual basis.
The properties of the dual basis reflect W⁻¹W = I and also WW⁻¹ = I. The product W⁻¹W takes rows of W⁻¹ times columns of W, in other words dot products of the u's with the w's. The two bases are "biorthogonal" because we get 1's and 0's:
W⁻¹W = [u₁ᵀ; ⋮ ; u_nᵀ] [w₁ ⋯ w_n] = I   so that   uᵢᵀwⱼ = 1 if i = j and 0 if i ≠ j.
For an orthonormal basis, the u’s are the same as the w’s. The basis is biorthogonal to itself! The rows in W~! are the same as the columns in W. In other words
W⁻¹ = Wᵀ. That is the specially important case of an orthogonal matrix W.
Other bases are not orthonormal. The axes don’t have to be perpendicular. The
basis matrix W can be invertible without having orthogonal columns.
When the inverse matrices are in the opposite order WW⁻¹ = I, we learn something new. The columns are wᵢ, the rows are uᵢᵀ, and each product wᵢuᵢᵀ is a rank one matrix. Multiply columns times rows:
WW⁻¹ = w₁u₁ᵀ + ⋯ + w_nu_nᵀ = I.
This is the order that we constantly use to change the basis. The coefficients are in c = W⁻¹v. Then we reconstruct v from Wc. Use the new notation to state the basic facts that c = W⁻¹v and v = Wc = WW⁻¹v:
The coefficients are   cᵢ = uᵢᵀv   and the vector is   v = Σᵢ wᵢ(uᵢᵀv).   (5)
The analysis step takes dot products with the dual basis to find the c's. The synthesis step adds up the pieces cᵢwᵢ to reconstruct the vector v.

■ REVIEW OF THE KEY IDEAS ■

1. The new basis vectors wᵢ are the columns of an invertible matrix W.

2. The coefficients of v in this new basis are c = W⁻¹v (the analysis step).

3. The vector v is reconstructed as Wc = c₁w₁ + ⋯ + c_nw_n (the synthesis step).

4. If there is compression then c becomes ĉ and we reconstruct v̂ = ĉ₁w₁ + ⋯ + ĉ_nw_n.

5. The rows of W⁻¹ are the dual basis vectors and cᵢ = uᵢᵀv. Then uᵢᵀwⱼ = δᵢⱼ.
Problem Set 7.3
1
Express the vectors
e = (1,0,0,0) and v = (1, —1, 1, —1) in the wavelet basis,
as in equation (4). The coefficients c₁, c₂, c₃, c₄ solve Wc = e and Wc = v.
2
Follow Example 2 to represent v = (7, 5, 3, 1) in the wavelet basis. The first step splits v into the averages (6, 6, 2, 2) plus the differences (1, −1, 1, −1). The last step writes 6, 6, 2, 2 as an overall average plus a difference, using 1, 1, 1, 1 and 1, 1, −1, −1.
3
What are the eight vectors in the wavelet basis for R⁸? They include the long wavelet (1, 1, 1, 1, −1, −1, −1, −1) and the short wavelet (1, −1, 0, 0, 0, 0, 0, 0).
4
The wavelet basis matrix W factors into simpler matrices
i
1
elt
et
eal
Ce
A
a)ul se
[a
Oe
LOS
eee
vai
OTe
tie
0
oO
“Or
t
Olt
W; and W):
1
|Rea 6 ew8
|Nees
hae
1 Salad, lee si
OFS
OPT
at
Then W⁻¹ = W₂⁻¹W₁⁻¹ allows c to be computed in two steps. The first splitting in Example 2 shows W₁⁻¹v. Then the second splitting applies W₂⁻¹. Find those inverse matrices W₁⁻¹ and W₂⁻¹ directly from W₁ and W₂. Apply them to v = (6, 4, 5, 1).
5
The 4 by 4 Hadamard matrix is like the wavelet matrix but entirely +1 and −1:
H = [1 1 1 1; 1 −1 1 −1; 1 1 −1 −1; 1 −1 −1 1].
Find the matrix H⁻¹ and write v = (7, 5, 3, 1) as a combination of the columns of H.
6
Suppose we have two bases v₁, …, v_n and w₁, …, w_n for Rⁿ. If a vector has coefficients bᵢ in one basis and cᵢ in the other basis, what is the change of basis matrix in b = Mc? Start from
b₁v₁ + ⋯ + b_nv_n = Vb = c₁w₁ + ⋯ + c_nw_n = Wc.
Your answer represents T(v) = v with input basis of v's and output basis of w's. Because of different bases, the matrix is not I.
i
The dual basis vectors w},..., w, are the columns of W* = (W-')T. Show that
the original basis w;,..., w, is “the dual of the dual.” In other words, show that
the w’s are the rows of (W*)~!. Hint: Transpose the equation WW~!
= J.
DIAGONALIZATION AND THE PSEUDOINVERSE ■ 7.4
This short section combines the ideas from Section 7.2 (matrix of a linear transforma-
tion) and Section 7.3 (change of basis). The combination produces a needed result: the
change of matrix due to change of basis. The matrix depends on the input basis and
output basis. We want to produce a better matrix than A, by choosing a better basis
than the standard basis.
The truth is that all our great factorizations of A can be regarded as a change of
basis.
But this is a short section, so we concentrate on the two outstanding examples.
In both cases the good matrix is diagonal. It is either Λ or Σ:
1.  S⁻¹AS = Λ   when the input and output bases are eigenvectors of A.
2.  U⁻¹AV = Σ   when the input and output bases are eigenvectors of AᵀA and AAᵀ.
You see immediately the difference between Λ and Σ. In Λ the bases are the same. The matrix A must be square. And some square matrices cannot be diagonalized by any S, because they don't have n independent eigenvectors.
In Σ the input and output bases are different. The matrix A can be rectangular. The bases are orthonormal because AᵀA and AAᵀ are symmetric. Then U⁻¹ = Uᵀ and V⁻¹ = Vᵀ. Every matrix A is allowed, and can be diagonalized. This is the Singular Value Decomposition (SVD) of Section 6.7.
I will just note that the Gram-Schmidt factorization A = QR chooses only one new
basis. That is the orthogonal output basis given by Q. The input uses the standard basis
given by I. We don't reach a diagonal Σ, but we do reach a triangular R. You see how the
output basis matrix appears on the left and the input basis appears on the right.
We start with input basis equal to output basis.
Similar Matrices: A and SAS⁻¹ and WAW⁻¹
We begin with a square matrix and one basis. The input space V is R” and the output
space W is also R”. The basis vectors are the columns of /. The matrix with this basis
is n by n, and we call it A. The linear transformation is just “multiplication by A.”
Most of this book has been about one fundamental problem — to make the matrix
simple. We made it triangular in Chapter 2 (by elimination), and we made it diagonal
in Chapter 6 (by eigenvectors). Now the change in the matrix comes from a change
of basis.
Here are the main facts in advance. When you change the basis for V, the matrix
changes from A to AM. Because V is the input space, the matrix M goes on the right
(to come first). When you change the basis for W, the new matrix is M⁻¹A. We are working with the output space so M⁻¹ is on the left (to come last). If you change both bases in the same way, the new matrix is M⁻¹AM. The good basis vectors are the eigenvectors, which go into the columns of M = S. The matrix becomes S⁻¹AS = Λ.
7D  When the basis contains the eigenvectors x₁, …, x_n, the matrix for T becomes Λ.
Reason  To find column 1 of the matrix, input the first basis vector x₁. The transformation multiplies by A. The output is Ax₁ = λ₁x₁. This is λ₁ times the first basis vector plus zero times the other basis vectors. Therefore the first column of the matrix is (λ₁, 0, …, 0). In the eigenvector basis, the matrix is diagonal.
Figure 7.3  Projection on the 135° line y = −x. Standard basis vs. eigenvector basis.
Example 1 Find the diagonal matrix that projects onto the 135° line y = —x. The
standard vectors (1,0) and (0, 1) are projected in Figure 7.3. In the standard basis
ries oust
Solution
The eigenvectors for this projection are x; = (1, —1) and x2 = (1,1). The
first eigenvector lies on the projection line and the second is perpendicular (Figure 7.3).
Their projections
are x; and 0.
The eigenvalues
are 4; =
1 and Az =
O.
In the
eigenvector basis the projection matrix is
eI; al
What if you choose a different basis like vj} = w; = (2,0) and v2 = w2 = (1, 1)?
Since w) is not an eigenvector, the matrix B in this basis will not be diagonal. The
first way to compute B follows the rule of Section 7.2: Find column j of the matrix
by writing the output Av; as a combination of w’s.
Apply the projection T to (2,0). The result is (1, —1) which is w; — w2. So the
first column of B contains 1 and —1. The second vector wz = (1, 1) projects to zero,
so the second column of B contains 0 and 0:
b=
E
ilin the basis wy, wo.
(1)
7.4
Diagonalization and the Pseudoinverse
349
The second way to find the same B is more insightful. Use W~! and W to change
between the standard basis and the basis of w’s. Those change of basis matrices from
Section 7.3 are representing the identity transformation. The product of transformations
is just /T7, and the product of matrices is
B = W~!AW.
This matrix is similar to A.
7E For the basis w,..., Wy find the matrix B in three steps. Change the input basis to
the standard basis with W. The matrix in the standard basis is A. Then change the output
basis back to the w’s with W~!. The product B = W~'AW represents /T/:
Bw’s to w’s =
=i
Woes aed to w’s
Example 2 (continuing with the projection)
Astandard
Ww’s to standard
Apply this W~!AW
(2)
rule to find B for
the basis (2,0) and (1, 1):
Was
a
Pr.
Fea | ea |oa
[eh sao
Ee
|
eet
ia
ee
ta 0
The W~!AW tule has produced the same B as in equation (1). A change of basis
produces a similarity transformation in the matrix. The matrices A and B are similar.
They have the same eigenvalues (1 and 0).
The Singular Value Decomposition (SVD)
Now the input basis can be different from the output basis. In fact the input space R”
can be different from the output space R”. The matrix will be m by n, and we want
to find it. We call the input basis v,,...,v,
and the output basis u1,...,Um.
Again the best matrix is diagonal (now m by n). To achieve this diagonal matrix
x, each input vector v; must transform into a multiple of u;. That multiple is the
singular value oj; on the main diagonal of &:
Me
The
singular values are in the order 0; > 02 > ---
(3)
> 0,.
The reason
that the rank
r enters is that (by definition) singular values are not zero. The second part of the
equation says that v; is in the nullspace for 7 =r-+1,...,n. This gives the correct
number n —r of basis vectors for the nullspace.
Let me connect the matrices A and & and V and U with the linear transformations they represent. The matrices A and & represent the same transformation. A uses
the standard bases for R” and R”, while & uses the input basis of v’s and the output
basis of u’s. The matrices V and U give the basis changes; they represent the identity
transformations (in R” and R”). The product of transformations is just /7/ again, and
it is represented in the v and wu bases by UAV:
350 = Chapter 7
Linear Transformations
7F The matrix © in the new bases comes from A in the standard bases by
Xy’s to w’s = Oeren
to u’s
Astandard
V»’s to standard:
(4)
The SVD chooses orthonormal bases (U~! = U™ and V~! = V") that diagonalize A.
The two orthonormal bases in the SVD are the eigenvector bases for A'A (the
v’s) and AA! (the u’s). Since those are symmetric matrices, their unit eigenvectors are
orthonormal. Their eigenvalues are the numbers 07. To see why those bases diagonalize the matrix — to see that equation (3) holds — we use another proof by parentheses.
With v; as a unit eigenvector of A'A, the first line shows that Av; has length o;. The
second line finds its direction u;. Start from ATAD; = OF vj:
Poet 0;
vj©. A’ I, Avj = oj;
AA‘ AD; = 0; Av;
gives
‘
gives
Av; |"2 = a;2
uj = Av;/oj;
so that
||/Av;|| = 0;.
as a unit eigenvector of AAT
(5)
XG)
The eigenvector u; of AA! has length 1, because of (5). The equation Avj = ojuj
says that these bases diagonalize A.
Polar Decomposition
Every complex number has the polar form re’®. A nonnegative number r multiplies a
number on the unit circle. (Remember that |e!?| = |cos@ +isin@| = 1.) Thinking of
these numbers as | by | matrices, r > 0 corresponds to a positive semidefinite matrix
(call it H) and e’? corresponds to an orthogonal matrix Q. The SVD extends this rel?
factorization to matrices (even m by n with rectangular Q).
7G Every real square matrix can be factored into A = QH, where Q is orthogonal and
H is symmetric positive semidefinite. If A is invertible then H is positive definite.
For the proof we just insert V'V = / into the middle of the SVD:
A=UXV! =(UV')(V=XV') =(Q)(A).
(7)
The first factor UV' is Q. The product of orthogonal matrices is orthogonal. The
second factor VXV' is H. It is positive semidefinite because its eigenvalues are in D.
If A is invertible then H is also invertible—it is symmetric positive definite. H is the
square root of A'A. Equation (7) says that H? = Vd2V! = ATA.
There is also a polar decomposition A = K Q in the reverse order. Q is the same
but now K = UXU!. This is the square root of AA?.
7.4
Example 3 Find the polar decomposition
as
ale i
=i
0
Solution
The orthogonal part is
Vuvi=@Q"!A:
Diagonalization and the Pseudoinverse
351
A = QH from its SVD:
2
2/961
UV!.
Q =
—1/V2
Wel A/S
1//2 Say
aR al
The positive definite part is
H =
et ollie ival-Line ial
eAlyA WAIL BY)
In mechanics, the polar decomposition separates the rotation (in Q) from the stretching
(in H). The eigenvalues of H are the singular values of A. They give the stretching
factors. The eigenvectors of H are the eigenvectors of A! A. They give the stretching
directions (the principal axes).
The polar decomposition just splits the key equation Av; = o;u; into two steps.
The “H” part multiplies v; by o;. The “Q” part swings the v direction around to the
u direction. The other order A = K Q swings v’s to w’s first (with the same Q). Then
K multiplies u; by oj; to complete the job of A.
The Pseudoinverse
By choosing good bases, the action of A has become clear. It multiplies v; in the
row space to give oju; in the column space. The inverse matrix must do the opposite!
If Av = ou then A~'u = v/o. The singular values of A~! are 1/o, just as the
eigenvalues of A~! are 1/A. The bases are reversed. The w’s are in the row space of
A~!, the v’s are now
in the column
space.
Until this moment we would have added the words “if A~! exists.” Now we don’t.
A matrix that multiplies u; to produce v;/o; does exist. It is denoted by A*:
07;
te
oe
Vi -°* Ur -° + Un
T
.
n by n
Uy: ++ Uy ++ +Um
m by m
n by m
(8)
At is the pseudoinverse of A. It is ann by m matrix. If A~! exists (we said it again),
then At is the same as A~!. In that case m =n =r
and we are inverting ULV! to
get VD !U!. The new symbol At is needed when r < m or r <n. Then A has no
two-sided inverse, but it has a pseudoinverse A* with these properties:
1
Atu; =—v;
Oj
fori<r
and
Atu;=0
'
fori>r.
352
Chapter 7 Linear Transformations
nullspace
of AT
Figure 7.4
When
we know
A is invertible from row space to column space.
At inverts it.
what happens to each basis vector uj, we know
A‘.
The vectors
uj,,...,U, in the column space of A go back to the row space. The other vectors
Uy+1,..-,Um are in the left nullspace, and A* sends them to zero.
Notice the pseudoinverse + of the diagonal matrix ©. Each singular value o is
just replaced by o~!. The product ©*D is as near to the identity as we can get. We
can’t do anything about the zero rows and columns, and we do get an r by r identity
matrix.
This example has o; = 2 and o2 = 3:
St!
Lie)
Oks
0
30: > “Ove
SAA
OR Or
Example 4 Find the pseudoinverse of A =
rank is 1. The singular value is
Atevetyt
2ecO *0
HCOr 230 107
Os 708 30
es
Teata0
FOO
0 0 0
a2
iouk
ae,
:
:
This matrix is not invertible. The
10. That is inverted in D7:
a= 1
a
Bd bcs ey./S
ONO)
A654 ve |
0
OR
Ds al 2
Sle ae
el 9)12 ie
io Lo oe
A* also has rank 1. Its column space is the row space of A. When A takes (1, 1)
in the row space to (4,2) in the column space, At does the reverse. Every rank one
7.4
matrix is a column times a row.
Diagonalization and the Pseudoinverse
With unit vectors uw and v, that is
A = ouv!.
353
Then
the best inverse we have is At = vu'/o.
The product AA*
is uu!, the projection onto the line through u. The product
AtA is vv!, the projection onto the line through v. For all matrices, AA*+ and AtA
are the projections onto the column space and row space.
The shortest least squares solution to Ax
Any other vector that solves the normal equation
=»
=b is xt = A*b
(Problem
18).
A' Ax = A'D is longer than x*.
REVIEW OF THE KEY IDEAS
=
1.
Diagonalization S~'AS = A is the same as a change to the eigenvector basis.
2.
|The SVD chooses an input basis of v’s and an output basis of u’s.
thogonal bases diagonalize A.
3.
Polar decomposition
4.
The pseudoinverse At = VX*U! transforms the column space of A back to
its row space. Then A‘A is the identity on the row space (and zero on the
nullspace).
Those or-
factors A into (positive definite) times (orthogonal matrix).
Problem Set 7.4
Problems
1
1-6 compute and use the SVD
of a particular matrix (not invertible).
Compute A'A and its eigenvalues and unit eigenvectors v; and v2:
s=[3 |
What is the only singular value o;? The rank of A is
2
3
r= 1.
(a)
Compute AA! and its eigenvalues and unit eigenvectors u; and up.
(b)
Verify from Problem
From
the u’s and
1 that Av; = o,u;. Put numbers
Bab elt df =]
v’s write down
orthonormal
bases
for the four fundamental
subspaces of this matrix A.
4
into the SVD:
Describe all matrices that have those same four subspaces.
354
Chapter 7
From
Linear Transformations
U, V, and
& find the orthogonal
matrix H = VXV!.
matrix
Q =
Verify the polar decomposition
U V! and the symmetric
A = QH.
This H is only
semidefinite because
Compute the pseudoinverse At = VX+U?.
1/o,;.
Rename
The diagonal matrix &* contains
the four subspaces (for A) in Figure 7.4 as four subspaces for
At. Compute ATA and AAT.
Problems
ys
7-11 are about the SVD
of an invertible matrix.
Compute A'A and its eigenvalues and unit eigenvectors v; and v7. What are the
singular values o; and o2 for this matrix A?
[2 3
AA! has the same eigenvalues oF and os as A'A.
Find unit eigenvectors wu
and uw. Put numbers into the SVD:
Ef ell? elle of.
In Problem 8, multiply columns times rows to show that A = oud, + 02U20>.
Prove from A = UXV' that every matrix of rank r is the sum of r matrices of
rank one.
10
From
U, V, and
& find the orthogonal
matrix K = UXU'.
KQ.
11
The pseudoinverse of this A is the same as
Problems
12
Q =
UV!
and the symmetric
A =
because
12-13 compute and use the SVD of a | by 3 rectangular matrix.
Compute
matrix is
UP
matrix
Verify the polar decomposition in the reverse order
A'A
and AA! and their eigenvalues and unit eigenvectors when the
A=[3
4
0]. What are the singular values of A?
Put numbers into the singular value decomposition of A:
A
[3
4
0] =
[wil [ou
0
0] [v1
v2
T
v3| 9
Put numbers into the pseudoinverse of A. Compute AA* and A* A:
At=
=|v,
MW
3
0
[wi ]'.
7.4
Diagonalization and the Pseudoinverse
355
14
What is the only 2 by 3 matrix that has no pivots and no singular values? What
is © for that matrix? At is the zero matrix, but what shape?
15
If det A= 0 how do you know that detAt = 0?
16
When are the factors in
UXV! the same as in QAQ™? The eigenvalues 4; must
be positive, to equal the o;. Then A must be
and positive
Problems 17-20 bring out the main properties of A* and x* = ATb.
17
Suppose all matrices have rank one.
pales
Az
2
i
Tl
The vector b is (bj, b2).
2) al
7
|.8
<4
sen
bee
=|; 4 ave =| ei < ae
el
(a)
The equation A! Ax = A'b has many solutions because A‘A
is
(b)
Verify thatxt = Atb = (.2b; +.1b2, .2b; +.1b2) does solve ATAxt = A'D.
(c)
AAT projects onto the column space of A. Therefore _____ projects onto
the nullspace of A?. Then A'(AAt+ —/)b =0. This gives A'Ax+ = A™b
and x can be x".
18
The vector x+ is the shortest possible solution to A’AX = A'b. Reason: The
difference x — x* is in the nullspace of A'A. This is also the nullspace of A.
Explain how it follows that
ab
a lesaelier ala eae
|b
Any other solution x has greater length than x*.
19
Every b in R” is p+e.
This is the column space part plus the left nullspace
part. Every x in R” is x, +X, = (row space part) + (nullspace part). Then
AAt p=
20
AAte =
At
AT AX, =
Find At and AtA and AA? for the 2 by 1 matrix whose SVD is
3
6
—8]15
.s H 7 fe 4 Ac
21
A general 2 by 2 matrix A is determined by four numbers. If triangular, it is
determined by three. If diagonal, by two. If a rotation, by one. Check that the
total count is four for each factorization of A:
LU
22
LDU
ORs
Gosvo an Ns a
Following Problem 21, check that LDL! and QAQ!
numbers. This is correct because the matrix A is
are determined by three
356
Chapter 7
Linear Transformations
23
sibs.
A new factorization!
Factor
Axed
: | into
A = EH,
with 1’s on the diagonal and H is symmetric.
:
:
where E is lower triangular
When is this impossible?
24
Suppose vj,...v, and u;,...,u, are bases for the row space and column space
of A. Describe all possible matrices A.
25
A pair of singular vectors v and u will satisfy Av = ou and A'u = ov. This
ul.
means that the double vector x = 4 is an eigenvector of what symmetric matrix? With what eigenvalue?
APPEICAII@
IN:
GRAPHS AND NETWORKS a 8.1
This chapter is about five selected applications of linear algebra. We had many applications to choose from. Any time you have a connected system, with each part depending
on other parts, you have a matrix. Linear algebra deals with interacting systems, provided the laws that govern them are linear. Over the years I have seen one model so
often, and found it so basic and useful, that I always put it first. The model consists
of nodes connected by edges. This is called a graph.
Graphs of the usual kind display functions f(x). Graphs of this different kind
(m edges connecting n nodes) lead to matrices. This section is about the incidence
matrix A of a graph—and especially about the four subspaces that come with it.
For any m by n matrix there are two subspaces in R” and two in R”. They are
the column spaces and nullspaces of A and A‘. Their dimensions are related by the
most important theorem in linear algebra. The second part of that theorem is the orthogonality of the subspaces. Our goal is to show how examples from graphs illuminate
the Fundamental Theorem of Linear Algebra.
We review the four subspaces (for any matrix). Then we construct a directed
graph and its incidence matrix. The dimensions will be easy to discover. But we want
the subspaces themselves—this is where orthogonality helps. It is essential to connect
the subspaces to the graph they come from. By specializing to incidence matrices, the
laws of linear algebra become Kirchhoff’s laws. Please don’t be put off by the words
“current” and “potential” and “Kirchhoff.’ These rectangular matrices are the best.
Every entry of an incidence matrix is 0 or 1 or —1. This continues to hold during
elimination. All pivots and multipliers are +1. Therefore both factors in A = LU
also contain
0,1,—1.
So do the nullspace
matrices!
All four subspaces
have
basis
vectors with these exceptionally simple components. The matrices are not concocted
for a textbook, they come from a model that is absolutely essential in pure and applied
mathematics.
357
358
Chapter 8 Applications
nullspace
nullspace
of A
cr
of A
dim n—r
Figure 8.1
The four subspaces with their dimensions and orthogonality.
Review of the Four Subspaces
Start with an m by n matrix. Its columns are vectors in R”. Their linear combinations
produce the column space C(A), a subspace of R”. Those combinations are exactly
the matrix-vector products Ax.
The rows of A are vectors in R” (or they would be, if they were column vectors).
Their linear combinations produce the row space. To avoid any inconvenience with rows,
we transpose the matrix. The row space becomes C(A‘), the column space of A.
The central questions of linear algebra come from these two ways of looking at
the same numbers, by columns and by rows.
The nullspace N(A) contains every x that satisfies Ax = 0—this is a subspace
of R". The “left” nullspace contains all solutions to A! y = 0. Now y has m components, and N(A‘) is a subspace of R”. Written as y'A = 0!, we are combining rows
of A to produce the zero row. The four subspaces are illustrated by Figure 8.1, which
shows R” on one side and R™ on the other. The link between them is A.
The information in that figure is crucial. First come the dimensions, which obey
the two central laws of linear algebra:
dim C(A) = dim C(A!)
and
dim C(A) + dim N(A) =n.
When the row space has dimension r, the nullspace has dimension n — r. Elimination
leaves these two spaces unchanged, and the echelon form U gives the dimension count.
There are r rows and columns with pivots. There are n—r free columns without pivots,
and those lead to vectors in the nullspace.
8.1.
Graphs and Networks
359
The following incidence matrix A comes from a graph. Its echelon form is U:
BN
mie
adeno
one
nO
(att
36
tekeiae Oh ple|
(oe
we
(re oun
n
wen
=
=f (aoe
Tees
us
Ou 0G
al
aoe oe Onto
0 0 6 0
0 0. 06
The nullspace of A and U is the line through x = (1, 1, 1,1). The column spaces of
A and U have dimension r = 3. The pivot rows are a basis for the row space.
Figure 8.1 shows more—the subspaces are orthogonal. Every vector in the nullspace
is perpendicular to every vector in the row space. This comes directly from the m equations
Ax = 0. For A and U above, x = (1, 1, 1, 1) is perpendicular to all rows and thus to the
whole row space.
This review of the subspaces applies to any matrix A—only the example was
special. Now we concentrate on that example. It is the incidence matrix for a particular
graph, and we look to the graph for the meaning of every subspace.
Directed Graphs and Incidence Matrices
Figure 8.2 displays a graph with m = 6 edges and n = 4 nodes, so the matrix A is 6
by 4. It tells which nodes are connected by which edges. The entries —1 and +1 also
tell the direction of each arrow (this is a directed graph). The first row of A gives a
record of the first edge:
The first edge goes from
node | to node 2.
The first row has —1 in
column | and +1 in column 2.
node
re
Figure 8.2a
-l1
1 0
=|
ee)ee
0-1
1
=i
uO (0
0-1
0
OF R01"
0
eC)
0
al
1
4
1
2,
3
4
5
6
Complete graph with m = 6 edges and n = 4 nodes.
Row numbers are edge numbers, column numbers are node numbers.
You can write down A immediately by looking at the graph.
edge
360 == Chapter 8 Applications
The second graph has the same four nodes but only three edges.
matrix is 3 by 4:
Its incidence
@
node
O®O®
at
®
@
ton 0
0/0
1 |
Oy Oey yal
|
2
3
edge
@
2
Figure 8.2b
Tree with 3 edges and 4 nodes and no loops.
The first graph is complete—every pair of nodes is connected by an edge. The second graph is a tree—the graph has no closed loops. Those graphs are the two extremes, with the maximum number of edges m = $n(n — 1) and the minimum number
m =n—1. We are assuming that the graph is connected, and it makes no fundamental
difference which way the arrows go. On each edge, flow with the arrow is “positive.”
Flow in the opposite direction counts as negative. The flow might be a current or a
signal or a force—or even oil or gas or water.
The rows of B match the nonzero rows of U—the echelon form found earlier.
Elimination reduces every graph to a tree. The loops produce zero rows in U. Look
at the loop from edges 1, 2, 3 in the first graph, which leads to a zero row:
—1
[ae
=
Qe
Ort
Oh
0
Sone
=
=
1
Or
0
jae
0
1
1
=
=1
0
0;
0
=I
Or=1
0
0
—
0
1
0
0
0
0
Those steps are typical. When two edges share a node, elimination produces the “shortcut edge” without that node. If the graph already has this shortcut edge, elimination
gives a row of zeros. When the dust clears we have a tree.
An idea suggests itself: Rows are dependent when edges form a loop. Independent rows come from trees. This is the key to the row space.
For the column space we look at Ax, which is a vector of differences:
—
—1
(Oi
—1
0
1
0
xX]
0
-1
1
0
ND
0
-1
0
1
X4
gee
2]
Que
x2
— Xx]
Wee
1
X3.— X]
|
Xx3,\— X2
oe |x, — x,
x4
— X2
Boae Se:
®
8.1
The
unknowns
x), x2,.x3,x4
represent potentials
Graphs and Networks
at the nodes.
Then
Ax
361
gives the
potential differences across the edges. It is these differences that cause flows.
now examine the meaning of each subspace.
We
1 The nullspace contains the solutions to Ax = 0. All six potential differences are
zero. This means: All four potentials are equal. Every x in the nullspace is a constant
vector (c,c,c,c).
The nullspace of A is a line in R’—its
dimension is n —r = 1.
The second incidence matrix B has the same nullspace. It contains (1, 1, 1, 1):
Baa
—1
Ol
0
ly
i
0
yh
|
ue
,
0
0
0
We can raise or lower all potentials by the same amount c, without changing the differences. There is an “arbitrary constant” in the potentials. Compare this with the same
statement for functions. We can raise or lower f(x) by the same amount C, without
changing its derivative. There is an arbitrary constant C in the integral.
Calculus adds “+C” to indefinite integrals. Graph theory adds (c, c,c,c) to the
vector x of potentials. Linear algebra adds any vector x, in the nullspace to one par-
ticular solution of Ax = b.
The “+C” disappears in calculus when the integral starts at a known point x = a.
Similarly the nullspace disappears when we set x4 = 0. The unknown x4 is removed
and so are the fourth columns of A and B. Electrical engineers would say that node
4 has been “grounded.”
2 The row space contains all combinations of the six rows.
Its dimension is certainly
motccix.
The rank isy = 3, as “we
Ihe equation ry
also saw from elimination.
not independent.
How
— 1) = nm must be 3--l=4.
After 3 edges, we start forming loops! The new rows are
can we tell if v = (v1, v2, v3, v4) is in the row space? The slow way is to
combine rows.
The quick way is by orthogonality:
v is in the row space if and only if it is perpendicular to (1, 1, 1, 1) in the nullspace.
The vector v = (0, 1, 2, 3) fails this test—its components add to 6. The vector (—6, 1, 2, 3)
passes the test.
It lies in the row space because its components add to zero.
It equals
6(row 1) + S(row 3) + 3(row 6).
Each row of A adds to zero. This must be true for every vector in the row space.
3 The column space contains all combinations of the four columns.
independent columns, since there were three independent rows.
We expect three
The first three columns
are independent (so are any three). But the four columns add to the zero vector, which
says again that (1, 1,1, 1) is in the nullspace. How can we tell if a particular vector
b is in the column space?
First answer Try to solve Ax = b. As before, orthogonality gives a better answer. We
are now coming to Kirchhoff’s two famous laws of circuit theory—the voltage law and
362
Chapter 8 Applications
current law. Those are natural expressions of “laws” of linear algebra. It is especially
pleasant to see the key role of the left nullspace.
Second answer
Ax is the vector of differences in equation (1). If we add differences
around a closed loop in the graph, the cancellation leaves zero. Around the big triangle
formed by edges 1,3, —2 (the arrow goes backward on edge 2) the differences are
(x2 —*1) + 3 = X42) — (a = 11) = 0.
This is the voltage law: The components of Ax add to zero around every loop. When
b is in the column space, it must obey the same law:
Kirchhoff’s Voltage Law: b, + b3 — bz = 0.
By testing each loop, we decide whether b is in the column space. Ax = b can be
solved exactly when the components of b satisfy all the same dependencies as the rows
of A. Then elimination leads to
0 = 0, and Ax = B 1s consistent.
4 The left nullspace contains the solutions to A' y = 0. Its dimension is m—r
m9 Mats
Daa
eh
1
ana
0
i
—
—1
oT
1
01]
»
0
Y3
-1||
0
=
4] =|0
Sits
= 6—3:
0
0
”
The true number of equations is r = 3 and not n = 4. Reason: The four equations
add to 0 = 0. The fourth equation follows automatically from the first three.
What do the equations mean? The first equation says that —y; — y2 — yq4 = 0.
The net flow into node 1 is zero. The fourth equation says that y4 + ys + ye = 0.
Flow into the node minus flow out is zero. The equations A' y = 0 are famous and
fundamental:
Kirchhoff’s Current Law: Flow in equals flow out at each node.
This law deserves first place among the equations of applied mathematics. It expresses
“conservation” and “continuity” and “balance.” Nothing is lost, nothing is gained.
When currents or forces are in equilibrium, the equation to solve is Aly = 0. Notice the beautiful fact that the matrix in this balance equation is the transpose of the
incidence matrix A.
What are the actual solutions to Aly = 0? The currents must balance themselves. The easiest way is to flow around a loop. If a unit of current goes around
the big triangle (forward on edge 1, forward on 3, backward on 2), the vector is y =
(1, —1, 1,0, 0,0). This satisfies A'y = 0. Every loop current yields a solution y, because flow in equals flow out at every node. A smaller loop goes forward on edge 1,
forward on 5, back on 4. Then y = (1,0,0, —1, 1,0) is also in the left nullspace.
8.1
Graphs and Networks
363
We expect three independent y’s, since 6 — 3 = 3. The three small loops in the
graph are independent. The big triangle seems to give a fourth y, but it is the sum of
flows around the small loops. The small loops give a basis for the left nullspace.
1
0
0
0
0
1
0
=|
Oy
1
—1
l
1
0
—]
I
0
ae)
0
0
<li
2
3
a
Oe tin
bikeah
l inden
small loops
big loop
Summary The incidence matrix A comes from a connected graph with n nodes and
m edges. The row space and column space have dimensions n — 1. The nullspaces
have dimension | and m—n +1:
1
The constant vectors (c,c,...,c) make up the nullspace of A.
Z
There are r = n — | independent rows, using edges from any tree.
3
Voltage law: The components of Ax add to zero around every loop.
4
Current law: A‘ y = 0 is solved by loop currents. N(A‘) has dimension m—r.
There are
m—r =m —n +1 independent loops in the graph.
For every graph in a plane, linear algebra yields Euler’s formula:
(number of nodes)—(number of edges)+(number of small loops)=1.
This is n —m+(m—n-+1) = 1. The graph in our example has 4 -6+3= 1.
A single triangle has (3 nodes)—(3 edges)+(1 loop). On a 10-node tree with 9 edges
and no loops, Euler’s count is 10 — 9 + 0. All planar graphs lead to the answer 1.
Networks and A'CA
In a real network, the current y along an edge is the product of two numbers.
One
number is the difference between the potentials x at the ends of the edge. This differ-
364
Chapter 8 Applications
ence is Ax and it drives the flow. The other number is the “conductance” c—which
measures how easily flow gets through.
In physics and engineering, c is decided by the material. For electrical currents,
c is high for metal and low for plastics. For a superconductor, c is nearly infinite. If
we consider elastic stretching, c might be low for metal and higher for plastics. In
economics, c measures the capacity of an edge or its cost.
To summarize, the graph is known from its “connectivity matrix” A. This tells
the connections between nodes and edges. A network goes further, and assigns a conductance c to each edge. These numbers cj,..., Cm go into the “conductance matrix”
C—which is diagonal.
For a network of resistors, the conductance is c = | /(resistance). In addition to Kirchhoff’s laws for the whole system of currents, we have Ohm’s law for each particular
current. Ohm’s law connects the current y; on edge | to the potential difference x2—x,
between the nodes:
Ohm’s Law:
Current along edge = conductance times potential difference.
Ohm’s law for all m currents is y = —CAx. The vector Ax gives the potential differences, and C multiplies by the conductances. Combining Ohm’s law with Kirchhoff’s
current law Aly = 0, we get A'CAx = 0. This is almost the central equation for
network flows. The only thing wrong is the zero on the right side! The network needs
power from outside—a voltage source or a current source—to make something happen.
Note about signs In circuit theory we change from Ax to —Ax. The flow is from
higher potential to lower potential. There is (positive) current from node | to node 2
when x; — x2 is positive—whereas
Ax
was constructed to yield x2 — xj.
The minus
sign in physics and electrical engineering is a plus sign in mechanical engineering and
economics. Ax versus —Ax is a general headache but unavoidable.
Note about applied mathematics
Every new application has its own form of Ohm’s
law. For elastic structures y = CAx is Hooke’s law. The stress y is (elasticity C) times
(stretching Ax). For heat conduction, Ax is a temperature gradient. For oil flows it is
a pressure gradient. There is a similar law for least square regression in statistics. My
textbook Introduction to Applied Mathematics (Wellesley-Cambridge Press) is practically built on “A'CA.” This is the key to equilibrium in matrix equations and also in
differential equations.
Applied mathematics is more organized than it looks. I have learned to watch
for A'CA.
We now give an example with a current source.
Kirchhoff’s law changes from
Aly = 0to Aly = f, to balance the source f from outside. Flow into each node
still equals flow out. Figure 8.3 shows the network with its conductances c},..., C6,
and it shows the current source going into node |. The source comes out at node 4
to keep the balance (in = out). The problem is: Find the currents yj,..., yo on the
six edges.
8.1
Figure 8.3.
The currents in a network
Graphs and Networks
with a source
S into node
365
1.
Example 1 All conductances are c = 1, so that C = J. A current yg travels directly
from node | to node 4. Other current goes the long way from node | to node 2 to
node 4 (this is y; = ys). Current also goes from node | to node 3 to node 4 (this is
y2 = yo). We can find the six currents by using special rules for symmetry, or we can
do it right by using A'CA. Since C = /, this matrix is A! A:
Seer te eet
it
So
POP Oe
ee Theme
1
al
-1
O0||
0 -1
1° 0}
Oat iierx
0 =1}|=-1
00 0 1 | 7
0
iia Sy eee
Ome
tnd) ai
Create
21
That last matrix
is not invertible!
(1,1, 1, 1) is in the nullspace.
We cannot
3g
See
Reed
=e, oder
solve for all four potentials because
One node has to be grounded.
Setting x4 = 0 removes
the fourth row and column, and this leaves a 3 by 3 invertible matrix. Now we solve
A'CAx = f for the unknown potentials x1, x2, x3, with source S into node 1:
3
=e)
—-1
-1
-!l
xX]
iS
-1
y=
3
oe
x3
0
0
Ohm’s law y = —CAx
yI
y2
y3
yah
ys
6
gives
x]
S/2
Kon
X3
oye
S/4
yields the six currents. Remember C = / and x4 = 0:
=
RRS OPO
A101
nul Ol
S72
Ok: Arcuri}
aS, WN
eaariieaivine)
ARORR iM ApIS 48 ain
ata
Gi ory
0
Ora
I
Half the current goes directly on edge 4. That is yy = S/2.
node 2 to node 3. Symmetry
indicated y3 = 0 and now
S/4
5/4
SO
Se
5/4
S/4
No current crosses from
the solution proves
it.
366
Chapter 8 Applications
The same matrix A'A appears in least squares. Nature distributes the currents to minimize the heat loss, where statistics chooses ¥ to minimize the error.
Problem Set 8.1
Problems 1-7 and 8-14 are about the incidence matrices for these graphs.
\
edge 1
2
3
4
edge 2
edge 3
1
1
Write down the 3 by 3 incidence matrix A for the triangle graph. The first row
has —1
in column
| and +1
in column
2. What
vectors
(x1, x2, x3) are in its
nullspace? How do you know that (1, 0,0) is not in its row space?
d
Write down A! for the triangle graph. Find a vector y in its nullspace. The
components of y are currents on the edges—how much current is going around
the triangle?
3
Eliminate x; and x2 from the third equation to find the echelon matrix U. What
tree corresponds to the two nonzero rows of U?
Kista
Dy
—xX, +x3 = b2
—x2
+ x3 = b3.
4
Choose a vector (b,, b2, b3) for which Ax = b can be solved, and another vector
b that allows no solution. How are those b’s related to y = (1, —1, 1)?
5
Choose a vector (1, f2, f3) for which A’ y = f can be solved, and a vector iy
that allows no solution. How are those f’s related to x = (1,1, 1)? The equation
Aly = f is Kirchhoff’s
law.
6
Multiply matrices to find A'A. Choose a vector f for which A'Ax = f can be
solved, and solve for x. Put those potentials x and the currents y = —Ax and
current sources f onto the triangle graph. Conductances are 1 because C = /.
i
With conductances
c} =
| and cz = cz = 2, multiply matrices to find ATCA.
For f = (1,0, —1) find a solution to A'CAx = f. Write the potentials x and
currents y = —C Ax on the triangle graph, when the current source f goes into
node | and out from node 3.
8.1
Graphs and Networks
367
Write down the 5 by 4 incidence matrix A for the square graph with two loops.
Find one solution to Ax = 0 and two solutions to Aly = 0.
Find two requirements on the b’s for the five differences x2 — x1, x3 —X1,.X3—X,
x4 — X2,x4 — x3 to equal by, bz, b3,b4, bs. You have found Kirchhoff’s
law around the two
in the graph.
10
Reduce A to its echelon form U. The three nonzero rows give the incidence
matrix for what graph? You found one tree in the square graph—find the other
seven trees.
11
Multiply matrices to find A'A and guess how its entries come from the graph:
12
13
(a)
The diagonal of A'A tells how many
(b)
The off-diagonals —1 or 0 tell which pairs of nodes are
Why is each statement true about
into each node.
A! A? Answer for A'A not A.
(a)
Its nullspace contains
(b)
It is positive semidefinite but not positive definite.
(c)
Its four eigenvalues are real and their signs are
With conductances
— _
(1, 1,1, 1). Its rank is n — 1.
c) = cz = 2 and c3 = c4 = c5 =
3, multiply
the matrices
A'CA. Find a solution to A'CAx = f = (1,0,0,—1). Write these potentials
x and currents y = —CAx on the nodes and edges of the square graph.
14
The matrix A'CA is not invertible. What vectors x are in its nullspace?
does A'CAx = f have a solution if and only if fj + fo + f3+ f4 = 0?
15
A connected graph with 7 nodes and 7 edges has how many loops?
16
For the graph with 4 nodes, 6 edges, and 3 loops, add a new node. If you connect
it to one old node, Euler’s formula becomes (
it to two old nodes, Euler’s formula becomes
17
18
Why
)—(
)+(
) = 1. If you connect
( )—(
)+(
)=1.
Suppose A is a 12 by 9 incidence matrix from a connected (but unknown) graph.
(a)
How
many columns of A are independent?
(b)
What condition on f makes it possible to solve Aly= f?
(c)
The diagonal entries of A'A give the number
What is the sum of those diagonal entries?
of edges into each node.
Why does a complete graph with n = 6 nodes have m =
connecting 6 nodes has ___
edges.
15 edges? A tree
368
= Chapter 8
Applications
MARKOV
MATRICES AND
ECONOMIC
MODELS ga 8.2
Early in this book we proposed an experiment. Start with any vector ug = (x, 1 — x).
Multiply it again and again by the “transition matrix” A:
a= [3 3),
The experiment produces uj = Aug and then uz = Au,. After k steps we have Ak uo.
Unless MATLAB went haywire, the vectors uo, 1, U2,U3,... approached a “steady
state.” That limit state is Uo = (.6,.4). This final outcome does not depend on the
starting vector: For every ug we always converge to (.6, .4). The question is why.
At that time we had no good way to answer this question. We knew nothing
about eigenvalues. It is true that the steady state equation Algo = Ugo could be verified:
fs alle}-[2],
You would now say that Uo is an eigenvector with eigenvalue 1. That makes it steady.
Multiplying by A does not change it. But this equation Aug = Uo does not explain
why all vectors ug lead to Uo. Other examples might have a steady state, but it is not
necessarily attractive:
f.0
pal 3
has the steady state
1
1
Pale
In this case, the starting vector ug = (0, 1) will give uw; = (0, 2) and uz = (0,4). The
second components are being doubled by the “2” in B. In the language of eigenvalues, B has 4 = | but it also has A = 2—and an eigenvalue larger than one produces
instability. The component of wu along that unstable eigenvector is multiplied by 4, and
|A| > 1 means blowup.
This section is about two special properties of A that guarantee a steady state
Uoo. These properties define a Markov matrix, and A above is one particular example:
1
Every entry of A is nonnegative.
2
Every column of A adds to |.
B did not have Property 2. When A is a Markov matrix, two facts are immediate:
Multiplying a nonnegative uo by A produces a nonnegative uj = Aug.
If the components of wp add to 1, so do the components of uj = Auo.
Reason: The components of uo add to 1 when [1 vee 1 Juo =I. This4strue-for
each column of A by Property 2. Then by matrix multiplication it is true for Aug:
[Pere
Aug
= (1 ea
1 jag =i:
8.2
Markov Matrices and Economic
Models
369
The same facts apply to uz = Au, and uz = Au».
Every vector ux, = Aug is
nonnegative with components adding to |. These are “probability vectors’? The limit
Uoo 1S also a probability vector—but we have to prove that there is a limit! The existence of a steady state will follow from 1 and 2 but not so quickly. We must show
that A = 1 is an eigenvalue of A, and we must estimate the other eigenvalues.
Example 1 The fraction of rental cars in Denver starts at 4 = .02. The fraction outside Denver is .98. Every month those fractions (which add to 1) are multiplied by the
Markov matrix A:
A
80.05
Es |
SEES US
asc
2
065
ES 7 el
That is a single step of a Markov chain. In one month, the fraction of cars in Denver
is up to .065. The chain of vectors is wo, u,,u2,..., and each step multiplies by A:
u, = Auo,
u2 = A*uo,
pe
PLOGUCES
p=
Ak uo.
All these vectors are nonnegative because A is nonnegative. Furthermore .065+.935 =
1.000. Each vector uz will have its components adding to 1. The vector uz = Au, is
(.09875, .90125). The first component has grown from .02 to .065 to nearly .099. Cars
are moving toward Denver. What happens in the long run?
This section involves powers of matrices. The understanding of A* was our first
and best application of diagonalization. Where A* can be complicated, the diagonal
matrix A* is simple. The eigenvector matrix S connects them: A* equals SA‘S~!. The
new application to Markov matrices follows up on this idea—to use the eigenvalues (in
A) and the eigenvectors (in S). We will show that woo is an eigenvector corresponding
OA
el
Since every column of A adds to 1, nothing is lost or gained. We are moving
rental cars or populations, and no cars or people suddenly appear (or disappear). The
fractions add to | and the matrix A keeps them that way. The question is how they
are distributed after k time periods—which leads us to A‘.
Solution to Example 1 After & steps the fractions in and out of Denver are the components of A‘ug. To study the powers of A we diagonalize it. The eigenvalues are 4 = |
and A = .75. The first eigenvector, with components adding to 1, is x; = (.2, .8):
A=
OU
| .20
A
05
Chee,
[=2 = 1.78+.75=0-NA=79
AG) = -l--Ci)
Those eigenvectors are x; and x2. They are the columns of S. The starting vector uo
is a combination of x; and x2, in this case with coefficients | and .18:
om [2] [21
370 = =Chapter 8 Applications
Now multiply by A to find u;. The eigenvectors are multiplied by A; = 1 and Az = .75:
ve
2,
he + (.75)(.18)
—I
il
Each time we multiply by A, another .75 multiplies the last vector.
xX, is unchanged:
eae
The eigenvector
ee | + (.75)*(.18) el
This equation reveals what happens. The eigenvector x, with i = | is the steady state
Uo. The other eigenvector x2 gradually disappears because |A| < 1. The more steps
we take, the closer we come to Ugo = (.2, .8). In the limit, a of the cars are in Denver
and ty are outside. This is the pattern for Markov chains:
8A If A is a positive Markov matrix (entries aj; > 0, each column adds to 1), then
A’ = 1 is larger than any other eigenvalue. The eigenvector x; is the steady state:
ur =xy+ €2(A2)*x2 +---+c¢, Git x8
Assume
that the components
always approaches
of wg add to 1. Then
Ugo = xX}.
this is true of uj, u2,....
The key point is that we approach a multiple of x, from every starting vector uo. If
all cars start outside Denver, or all start inside, the limit is still woo = x; = (.2, .8).
The first point is to see that 4 = | is an eigenvalue of A. Reason: Every column
of A —J adds to |—1=0. The rows of A —/
add up to the zero row. Those rows
are linearly dependent, so A — / is singular. Its determinant is zero and 4 = | is an
eigenvalue. Since the trace of A was 1.75, the other eigenvalue had to be A2 = .75.
The second point is that no eigenvalue can have |A| > 1. With such an eigenvalue,
the powers A* would grow. But A* is also a Markov matrix with nonnegative entries
adding to 1—and that leaves no room to get large.
A lot of attention is paid to the possibility that another eigenvalue has |A| = 1.
Suppose the entries of A or any power A* are all positive—zero is not allowed. In
this “regular” case A = | is strictly bigger than any other eigenvalue. When A and its
powers have zero entries, another eigenvalue could be as large as A, = 1.
Example 2 A =[{)] has no steady state because A7 = —1.
This matrix sends all cars from inside Denver to outside, and vice versa. The
powers A* alternate between A and J. The second eigenvector x2 = (—1, 1) is multiplied by A2 = —1 at every step—and does not become smaller. With a regular Markov
matrix, the powers A* approach the rank one matrix that has the steady state x, in
every column.
8.2
Markov
Matrices and Economic
Models
371
Example 3 (“Everybody moves” ) Start with three groups. At each time step, half of
group | goes to group 2 and the other half goes to group 3. The other groups also
split in half and move. If the starting populations are pj), p2, p3, then after one step
the new populations are
Ue
uy =Auo=|5
0
i
5
7
1
1
| iia
1
D2 hes
5]|
po |=|
5p: +423
O
-
SPi+4p2
A is a Markov matrix. Nobody is born or lost. It is true that A contains zeros, which
gave trouble in Example 2. But after two steps in this new example, the zeros disappear
from A?:
aS
2
uy=AUp=|5G
De
1
AL
I
5
Lore
4
4
el
4
1
Pi
|Z
P2
P3
2
What is the steady state? The eigenvalues of A are 4; = | (because A is Markov) and
Mg =) = —5. The eigenvector x; = (, i 5) for X = 1 will be the steady state.
When three equal populations split in half and move, the final populations are again
equal. When the populations start from uo = (8, 16, 32), the Markov chain approaches
its steady state:
8
up
=
16
24
u;, =
32
20
[ 16
ua =
12
18
20
u3=
22
19
lig)
The step to u4 will split some people in half. This cannot be helped. The total population is 8+ 16+32 = 56 (and later the total is still 20+19+17 = 56). The steady state
populations Uo are 56 times Gj. i 4). You can see the three populations approaching,
but never reaching, their final limits 56/3.
Linear Algebra in Economics: The Consumption Matrix
A long essay about linear algebra in economics would be out of place here. A short
note about one matrix seems reasonable. The consumption matrix tells how much of
each input goes into a unit of output. We have n industries like chemicals, food, and
oil. To produce a unit of chemicals may require .2 units of chemicals, .3 units of food,
and .4 units of oil. Those numbers go into row | of the consumption matrix A:
chemical output
food output
oil output
ae
2 3 4
A) ae al
rales:
chemical input
food input
oil input
Row 2 shows the inputs to produce food—a heavy use of chemicals and food, not so
much oil. Row 3 of A shows the inputs consumed to refine a unit of oil. The real
372
Chapter 8 Applications
consumption matrix for the United States in 1958 contained 83 industries. The models
in the 1990’s are much larger and more precise. We chose a consumption matrix that
has a convenient eigenvector.
Now comes the question: Can this economy meet demands yj, y2, y3 for chemicals, food, and oil? To do that, the inputs pj, p2, p3 will have to be higher—because
part of p is consumed in producing y. The input is p and the consumption is Ap,
which leaves p — Ap. This net production is what meets the demand y:
Problem Find a vector p such that p—Ap = y or (I—A)p = y or
p=(I—A)7'y.
Apparently the linear algebra question is whether / — A is invertible. But there is
more to the problem. The demand vector y is nonnegative, and so is A. The production
levels in p = (I — A)~'y must also be nonnegative. The real question is:
When
is (I — Ay
a nonnegative matrix?
This is the test on (J — A)~! for a productive economy, which can meet any positive
demand. If A is small compared to J, then Ap is small compared to p. There is
plenty of output. If A is too large, then production consumes more than it yields. In
this case the external demand y cannot be met.
“Small” or “large” is decided by the largest eigenvalue A, of A (which is positive):
If Aq. Sul | then.
If A; =1
then
Tf Aytal § then).
(7 =!A)—*
(J —A)7!
(2A
ie
has negative entries
fails to exist
ais nonnegative as desired.
The main point is that last one. The reasoning makes use of a nice formula for (J —
A)~!, which we give now. The most important infinite series in mathematics is the
geometric series 1 +x +x*+---. This series adds up to 1/(1 — x) provided x is
between. —1 and 1. (When x = | the series is
1+ 1+1+--- =o. When |x| > 1
the terms x” don’t go to zero and the series cannot converge.) The nice formula for
(I — A)~! is the geometric series of matrices:
(LS Ay
ea Pe Aae Ae Sar aes:
If you multiply this series by A, you get the same series S except for J. Therefore
S — AS =I, which is (J — A)S = I. The series adds to S = (I— A)7! if it converges.
And it converges if |Amax| < 1.
In our case A > 0. All terms of the series are nonnegative. Its sum is (J — A)~! > 0.
Example
3A
4A = [4441]
has A: =.9 and (I — A)! = [33434 36
24].
yh va)
23 36
This economy is productive. A is small compared to J, because Amax is .9. To meet the
demand y, start from p = (J — A)~'y. Then Ap is consumed in production, leaving
p—Ap. This is (J — A)p = y, and the demand is met.
8.2
Markov Matrices and Economic Models
373
Example 5 A = [94] has A; =2 and (J — A)" = —}[14].
This consumption matrix A is too large. Demands can’t be met, because production
consumes more than it yields. The series / +
A+ A? +... does not converge to
(I — A)~!. The series is growing while (I — A)~! is actually negative.
Problem Set 8.2
Questions 1-14 are about Markov matrices and their eigenvalues and powers.
1
Find the eigenvalues of this Markov matrix (their sum is the trace):
What is the steady state eigenvector for the eigenvalue A; = 1?
2
Diagonalize the Markov matrix in Problem | to
ai
ae
What is the limit of At =
3
A = SAS~!
|
SA‘ S~! when Ak =[} 0 .759,
approaches [ 00
1°]?
What are the eigenvalues and the steady state eigenvectors for these Markov matrices?
>
> |
II
>
II
wow or
or won
Pat
Se.
=a
ae
Rl
BIR
Bl
J
by finding its other
For every 4 by 4 Markov
BIR
NIBle
BI
Ble
NI
matrix, what eigenvector of A! corresponds to the
(known) eigenvalue A = 1?
5
Every year 2% of young people become old and 3% of old people become dead.
(No births.) Find the steady state for
young
old
dead
6
=i
|,4,
ie).
|lees
00
OO
eae
03
0)
1H)
1
young
old
dead
|,
The sum of the components of x equals the sum of the components of Ax. If
Ax = Ax with A 4 1, prove that the components of this non-steady eigenvector
x add to zero.
374
Chapter 8 Applications
Z
Find the eigenvalues and eigenvectors of A. Factor A into
SAS ees
valet
This was a MATLAB example in Chapter 1. There A!® was computed by squaring
four times. What are the factors in A!® = SA!¢s—!?
Explain why the powers A‘ in Problem 7 approach this matrix A%:
seals al:
Challenge problem: Which Markov matrices produce that steady state (.6, .4)?
This permutation matrix is also a Markov matrix:
oOo
—_—
©©
Oo
So
—
oreo
S
oo
The steady state eigenvector for A = | is (Gj. ie is i): This is not approached
when uo = (0,0, 0, 1). What are wu, and wz and uw3 and u4? What are the four
eigenvalues of P, which solve 4 = 1?
10
Prove that the square of a Markov matrix is also a Markov matrix.
11
If A = [25] is a Markov matrix, its eigenvalues are 1 and
. The steady
state eigenvector is x; =
12
Complete the last row to make A a Markov matrix and find the steady state eigenvector:
When A is a symmetric Markov matrix, why is x; = (1,..., 1) its steady state?
13
A Markov differential equation is not du/dt = Au but du/dt = (A —/)u.
the eigenvalues of
Find
pie
B=A=1=| A ai
When e*!' multiplies the eigenvector x; and e*2’ multiplies x2, what is the steady
state as t > co?
8.2
14
The matrix
to
B = A—I
Markov
Matrices and Economic
Models
375
for a Markov differential equation has each column adding
. The steady state x; is the same as for A, but now
A; =
and
ert
Questions 15-18 are about linear algebra in economics.
15
Each row of the consumption matrix in Example 4 adds to .9. Why does that
make 4 = .9 an eigenvalue, and what is the eigenvector?
16
Multiply
7+
A+ A*+ A+---
by J—A
to show that the series adds to
For 4i— [ 2], find A? and A? and use the pattern to add up the series.
iv,
For which of these matrices does J + A+ A* +--+ yield a nonnegative matrix
(I — A)~!? Then the economy can meet any demand:
v-[2 3] 4-[% f] [3]
18
If the demands in Problem 17 are y = (2, 6), what are the vectors
19
(Markov again) This matrix has zero determinant.
a
Find the limits of Aug
20
p = (I — A)~!y?
What are its eigenvalues?
ae)
3
4
BARNA
HN
starting from uo = (1,0, 0) and then wo = (100, 0, 0).
If A is a Markov matrix, does / +
A+ A?+---
add up to (J — A)7!?
376
Chapter 8 Applications
LINEAR PROGRAMMING a 8.3
Linear programming is linear algebra plus two new ingredients: inequalities and
minimization. The starting point is still a matrix equation Ax = b. But the only
acceptable solutions are nonnegative. We require x > 0 (meaning that no component
of x can be negative). The matrix has n > m, more unknowns than equations. If there
are any nonnegative solutions to Ax = b, there are probably a lot. Linear programming
picks the solution x* > 0 that minimizes the cost:
The cost is cyxXy +++++CnXn. The winning vector x* is
the nonnegative solution of Ax = b that has smallest cost.
Thus a linear programming problem starts with a matrix A and two vectors b and c:
i)
A has n > m: for example A=[1
ii)
b has m components:
iii)
The cost c has n components:
1 2]
for example b = [4]
for example
c=[5
3 8].
Then the problem is to minimize c-x subject to the requirements Ax = b and x > 0:
Minimize 5x, + 3x2 + 8x3 subject to x1
+x2+2x3 =4 and x1, x2, x3 > 0.
We jumped right into the problem, without explaining where it comes from. Linear
programming is actually the most important application of mathematics to management.
Development of the fastest algorithm and fastest code is highly competitive. You will
see that finding x* is harder than solving Ax = b, because of the extra requirements:
cost minimization and nonnegativity. We will explain the background, and the famous
simplex method, after solving the example.
Look first at the “constraints”: Ax = b and x > 0. The equation x;+x2+2x3 = 4
gives a plane in three dimensions.
The nonnegativity x; > 0, x2 > 0, x3 > 0 chops the
plane down to a triangle. The solution x* must lie in the triangle PQR in Figure 8.4.
Outside that triangle, some components of x are negative. On the edges of that triangle,
one component is zero. At the corners of that triangle, two components are zero. The
solution x* will be one of those corners! We will now show why.
The triangle contains
all vectors x that satisfy Ax = b and x > 0. (Those x’s
are called feasible points, and the triangle is the feasible set.)
candidates in the minimization of c-x, which is the final step:
These points are the
Find x* in the triangle to minimize the cost 5x1 + 3x2 + 8x3.
The vectors that have zero cost lie on the plane 5x; + 3x2 + 8x3 = 0. That plane does
not meet the triangle. We cannot achieve zero cost, while meeting the requirements on
x. So increase the cost C until the plane 5x; + 3x2 + 8x3 = C does meet the triangle.
This is a family of parallel planes, one for each C. As C increases, the planes move
toward the triangle.
8.3
R= (0, 0, 2)
(2 hours by computer)
Linear Programming
377
Ax = bis the plane x, +x, +2x,=4
Triangle has x,2 0, X,2 OF X32 0
Q = (0, 4, 0) (4 hours by student)
P = (4, 0, 0) (4 hours by Ph.D.)
Figure 8.4
The triangle containing nonnegative solutions:
lowest cost solution x* is one of the corners P, Q, or R.
Ax = b and x > 0. The
The first plane to touch the triangle has minimum cost C. The point where it
touches is the solution x*. This touching point must be one of the corners P or Q
or R. A moving plane could not reach the inside of the triangle before it touches a
corner! So check the cost 5x; + 3x2 + 8x3 at each corner:
P = (4, 0,0) costs 20
Or=1(0, 45-0 weostss 12
R = (0, 0, 2) costs 16.
The winner is Q. Then x* = (0,4, 0) solves the linear programming problem.
If the cost vector ¢ is changed, the parallel planes are tilted. For small changes,
Q is still the winner. For the cost c-x = 5x; + 4x2 + 7x3, the optimum x* moves to
R=
(0,0, 2). The minimum
cost is now
7-2 = 14.
Note 1 Some linear programs maximize profit instead of minimizing cost. The mathematics is almost the same. The parallel planes start with a large value of C, instead
of a small value. They move toward the origin (instead of away), as C gets smaller.
The first touching point is still a corner.
Note 2 The requirements Ax = b and x > 0 could be impossible to satisfy. The
equation x; + x2 + x3 = —1 cannot be solved with x > 0. The feasible set is empty.
Note 3 It could also happen that the feasible set is unbounded.
If I change the require-
ment to xj + x2 —2x3 = 4, the large positive vector (100, 100, 98) is now a candidate.
So is the larger vector (1000, 1000, 998). The plane Ax = Bb is no longer chopped off
to a triangle. The two corners P and Q are still candidates for x*, but the third corner
has moved to infinity.
Note 4 With an unbounded feasible set, the minimum cost could be —oo (minus infinity).
Suppose the cost is —xj
— x2 + x3.
Then the vector (100, 100, 98) costs
C =
—102.
The vector (1000, 1000, 998) costs C = —1002. We are being paid to include x; and x2.
Instead of paying a cost for those components. In realistic applications this will not happen.
378
Chapter 8 Applications
But it is theoretically possible that changes in A, b, and c can produce unexpected triangles
and costs.
Background to Linear Programming
This first problem is made up to fit the previous example. The unknowns x1, x2, x3 represent hours of work by a Ph.D. and a student and a machine. The costs per hour are $5,
$3, and $8. (I apologize for such low pay.) The number of hours cannot be negative:
xj > 0,x2 > 0,x3 > 0. The Ph.D. and the student get through one homework problem
per hour—the machine solves two problems in one hour. In principle they can share out
the homework, which has four problems to be solved: x; + x2 + 2x3 = 4.
The problem is to finish the four problems at minimum cost.
If all three are working,
the job takes one hour:
xj
=
x2 =
x3 =
1. The cost 1s
5+3+8 = 16. But certainly the Ph.D. should be put out of work by the student
(who is just as fast and costs less—this problem is getting realistic). When the student
works two hours and the machine works one, the cost is 6+ 8 and all four problems
get solved. We are on the edge QR of the triangle because the Ph.D. is unemployed:
x; = 0. But the best point is at a corner—all work by student (at Q) or all work by
machine (at R). In this example the student solves four problems in four hours for
$12—the minimum cost.
With only one equation in Ax = Bb, the corner (0,4,0) has only one nonzero
component. When Ax = b has m equations, corners have m nonzeros. As in Chapter
3, n —m free variables are set to zero. We solve Ax = b for the m basic variables
(pivot variables). But unlike Chapter 3, we don’t know which m variables to choose
as basic. Our choice must minimize the cost.
The number of possible corners is the number of ways to choose m components
out of n. This number “n choose m” is heavily involved in gambling and probability.
With n = 20 unknowns and m = 8 equations (still small numbers), the “feasible set”
can have 20!/8!12! corners.
That number
is (20)(19) --- (13) = 5,079, 110,400.
Checking three corners for the minimum cost was fine. Checking five billion
corners is not the way to go. The simplex method described below is much faster.
The Dual Problem In linear programming, problems come in pairs. There is a minimum problem and a maximum problem—the original and its “dual.” The original problem was specified by a matrix A and two vectors b and c. The dual problem has the
same input, but A is transposed and b and c are switched. Here is the dual to our
example:
A cheater offers to solve homework problems by looking up the answers.
The charge is y dollars per problem, or 4y altogether. (Note how b = 4 has
gone into the cost.) The cheater must be as cheap as the Ph.D. or student
or machine: y < 5 and y < 3 and 2y < 8. (Note how c = (5, 3, 8) has
gone into inequality constraints). The cheater maximizes the income 4y.
8.3.
Dual Problem
Linear Programming
379
Maximize b-y subject to Aly <c.
The maximum occurs when y = 3. The income is 4y = 12. The maximum in
the dual problem ($12) equals the minimum in the original ($12). This is always true:
Duality Theorem
If either problem has a best vector (x* or y*) then so
does the other. The minimum cost c -x* equals the maximum income
b- y*.
Please note that I personally often look up the answers.
It’s not cheating.
This book started with a row picture and a column picture. The first “duality theorem”
was about rank: The number of independent rows equals the number of independent
columns. That theorem, like this one, was easy for small matrices. A proof that minimum cost = maximum income is in our text Linear Algebra and Its Applications. Here
we establish the easy half of the theorem: The cheater’s income cannot exceed the
honest cost:
If Av=b.x
= 0; Aly <c_
then
bly = (Ax)! y = xT(Aly) <x'e.
The full duality theorem says that when b! y reaches its maximum and x'¢ reaches its
minimum, they are equal: b- y* =c-x”*.
The Simplex Method
Elimination is the workhorse for linear equations. The simplex method is the workhorse
for linear inequalities. We cannot give the simplex method as much space as elimination—
but the idea can be briefly described. The simplex method goes from one corner to a
neighboring corner of lower cost. Eventually (and quite soon in practice) it reaches the
corner of minimum cost. This is the solution x*.
A corner is a vector x > 0 that satisfies the m equations Ax = b with at most
m positive components. The other n — m components are zero. (Those are the free
variables. Back substitution gives the basic variables. All variables must be nonnegative
or x is a false corner.) For a neighboring corner, one zero component becomes positive
and one positive component becomes zero.
The simplex method must decide which component “enters” by becoming positive, and which component “leaves” by becoming zero. That exchange is chosen so
as to lower the total cost. This is one step of the simplex method.
Here is the overall plan. Look at each zero component at the current corner. If
it changes from 0 to 1, the other nonzeros have to adjust to keep Ax = b. Find the
new x by back substitution and compute the change in the total cost c-x. This change
380 = Chapter 8 Applications
is the “reduced cost” r of the new component. The entering variable is the one that
gives the most negative r. This is the greatest cost reduction for a single unit of a new
variable.
Example 1 Suppose the current corner is (4, 0,0), with the Ph.D. doing all the work
(the cost is $20). If the student works one hour, the cost of x = (3, 1,0) is down to
$18. The reduced cost is r = —2. If the machine works one hour, then x = (2, 0, 1)
also costs $18. The reduced cost is also r = —2. In this case the simplex method can
choose either the student or the machine as the entering variable.
Even in this small example, the first step may not go immediately to the best x*. The
method chooses the entering variable before it knows how much of that variable to include.
We computed r when the entering variable changes from 0 to 1, but one unit may be too
much or too little. The method now chooses the leaving variable (the Ph.D.).
The more of the entering variable we include, the lower the cost. This has to
stop when one of the positive components (which are adjusting to keep Ax = 5) hits
zero. The leaving variable is the first positive x; to reach zero. When that happens,
a neighboring corner has been found. More of the entering variable would make the
leaving variable negative, which is not allowed. We have gone along an edge of the
allowed feasible set, from the old corner to the new corner.
new corner) to find the next variables to enter and leave.
Then start again (from the
When all reduced costs are positive, the current corner is the optimal x*. No
zero component can become positive without increasing c-x. No new variable should
enter. The problem is solved.
Note Generally x* is reached in an steps, where a@ is not large. But examples have
been invented which use an exponential number of simplex steps. Eventually a different
approach was developed, which is guaranteed to reach x* in fewer (but more difficult)
steps. The new methods travel through the interior of the feasible set, to find x* in
polynomial time. Khachian proved this was possible, and Karmarkar made it efficient.
There is strong competition between Dantzig’s simplex method (traveling around the
edges) and Karmarkar’s method through the interior.
Example 2 Minimize the cost c-x = 3x; + x2 + 9x3 +.x4. The constraints are x > 0
and two equations Ax = b:
x,
+2x3+2x4=4
x2 +x3-—x4
=2
m=2 _ equations
n=4_
unknowns.
A starting corner is x = (4, 2,0, 0) which costs
c-x = 14. It has m = 2 nonzeros and
n—m=2
zeros (x3 and x4). The question is whether x3 or x4 should enter (become
nonzero). Try each of them:
Ifixge=1
and x4. =,0;
otthenrxe= (291
sk; O)pcostsal 6:
If x4 = 1, andex3 = 0, | then x = (3;.3/0,1) costs 13.
8.3
Linear Programming
381
Compare those costs with 14. The reduced cost of x3 is r = 2, positive and useless.
The reduced cost of x4 is r = —1, negative and helpful. The entering variable is x4.
How much of x4 can enter? One unit of x4 made x; drop from 4 to 3. Four units
will make x; drop from 4 to zero (while x2 increases all the way to 6). The leaving
variable is x;. The new corner is x = (0,6, 0,4), which costs only
c-x = 10. This is
the optimal x*, but to know that we have to try another simplex step from (0, 6, 0, 4).
Suppose x; or x3 tries to enter:
Mx, =
and x3—= 0,”
live = land 7) =.07
“then x = (lor0s siicostcer
then
= (0) 3.5152) costsur4.
Those costs are higher than 10. Both r’s are positive—it does not pay to move.
The
current corner (0, 6,0, 4) is the solution x*.
These calculations can be streamlined. It turns out that each simplex step solves
three linear systems with the same matrix B. (This is the m by m matrix that keeps
the m basic columns of A.) When a new column enters and an old column leaves,
there is a quick way to update B~'. That is how most computer codes organize the
steps of the simplex method.
One final note. We described how to go from one corner to a better neighbor.
We did not describe how to find the first corner—which is easy in this example but
not always. One way is to create new variables x5 and x6, which begin at 4 and 2
(with all the original x’s at zero). Then start the simplex method
with x5 + x6 as the
cost. Switch to the original problem after x5 and x6 reach zero—a starting corner for
the original problem has been found.
Problem Set 8.3
1
Draw the region in the xy plane where x +2y = 6 and x > 0 and y > 0. Which
point in this “feasible set” minimizes the cost c = x + 3y? Which point gives
maximum cost?
Z
Draw the region in the xy plane where x +2y <6, 2x+y <6, x >0,
It has four corners. Which corner minimizes the cost c = 2x — y?
3
y>0.
What are the corners of the set x; + 2x2 — x3 = 4 with x1, x2, x3 all > 0? Show
that x; + 2x3 can be very negative in this set.
4
Start at x = (0,0, 2) where the machine solves all four problems for $16. Move
to x = (0,1,
_) to find the reduced cost r (the savings per hour) for work by
the student. Find r for the Ph.D. by moving to x = (1,0,
not give the number of hours or the total savings.
5
Start from
(4,0,0)
with c changed to [5
3 7].
Show
machine but the total cost is lower for the student.
two steps, first to machine and then to student.
6
Choose
imum
__). Notice that r does
that r is better for the
The simplex method takes
a different c so the Ph.D. gets the job. Rewrite the dual problem
income to the cheater).
(max-
382
Chapter 8
Applications
FOURIER SERIES: LINEAR ALGEBRA
FOR FUNCTIONS a 8.4
This section goes from finite dimensions to infinite dimensions. I want to explain linear
algebra in infinite-dimensional space, and to show that it still works. First step: look
back. This book began with vectors and dot products and linear combinations. We
begin by converting those basic ideas to the infinite case—then the rest will follow.
What does it mean for a vector to have infinitely many components? There are
two different answers, both good:
1
The vector becomes
v = (v], v2, v3,...).
It could be (1, 5, is KARO):
2
The vector becomes a function f(x). It could be sin x.
We will go both ways. Then the idea of Fourier series will connect them.
After vectors come dot products. The natural dot product of two infinite vectors
(vj, V2,...) and (w1, w2,...) iS an infinite series:
vew=VvVwijt+v2zu2+-::-.
(1)
This brings a new question, which never occurred to us for vectors in R”. Does this
infinite sum add up to a finite number? Does the series converge? Here is the first and
biggest difference between finite and infinite.
When
v = w = (1,1, 1,...), the sum
v-w=1+1+4+14+---
certainly does not converge.
In that case
is infinite. Since v equals w, we are really computing v-v =
\lv|/7 = length squared. The vector (1,1, 1,...) has infinite length. We don’t want that
vector. Since we are making the rules, we don’t have to include it. The only vectors
to be allowed are those with finite length:
DEFINITION
The vector (vj, v2,...) is in our infinite-dimensional
“Hilbert space”
if and only if its length is finite:
jv? =v-v=
Example
1 The
length is 2/./3.
the square root:
vector
vi as v5 “ie v3 +.---must add to a finite number.
v =
(1, 5s 5 tte) is included
We have a geometric
vev=l+g+yt
Question
Answer
in Hilbert
series that adds to 4/3.
=
1
space,
because
its
The length of v is
=|
If v and w have finite length, how large can their dot product be?
The sum v-w = vj w;+v2w2+---
also adds to a finite number.
The Schwarz
inequality is still true:
|v -w| < |lv|| [wll
(2)
8.4
Fourier Series: Linear Algebra for Functions
383
The ratio of v + w to ||v|j ||w|] is still the cosine of 6 (the angle between v and w).
Even in infinite-dimensional space, |cos@| is not greater than 1.
Now change over to functions. Those are the “vectors.” The space of functions f(x),
g(x), h(x), ... defined for 0 < x < 27 must be somehow bigger than R”. What is the dot
product of f (x) and g(x)?
Key point in the continuous case: Sums are replaced by integrals. Instead of a
sum of v; times w;, the dot product is an integral of f(x) times g(x). Change the “dot”
to parentheses with a comma, and change the words “dot product” to inner product:
DEFINITION = The inner product of f(x) and g(x), and the length squared, are
20
(fa= fo fo)go)dx
2
and
A
If IP = i (f(x)?dx.
es
The interval [0, 277] where the functions are defined could change to a different interval
like [0, 1]. We chose 27 because our first examples are sin x and cos x.
Example 2 The length of f(x) = sinx comes
from its inner product with itself:
20
(f, f) =|! (sinx)* dx =a.
The length of sinx is /7.
0
That is a standard integral in calculus—not part of linear algebra. By writing sin? x as
f —1}cos2x, we see it go above and below its average value 5. Multiply that average
by the interval length 27 to get the answer z.
More important:
is zero:
The functions sinx and cosx are orthogonal.
Qn
/
0
20
Sin COSMO
This zero is no accident.
a= /
0
Their inner product
on
5 Sine
da
[-4 cos lh
It is highly important to science.
—a())
(4)
The orthogonality goes
beyond the two functions sinx and cos x, to an infinite list of sines and cosines.
list contains cosO0x (which is 1), sinx, cos x, sin2x, cos 2x, sin3x,cos3x,....
The
Every function in that list is orthogonal to every other function in the list.
The next step is to look at linear combinations of those sines and cosines.
Fourier Series
The Fourier series of a function y(x) is its expansion into sines and cosines:
y(x) = ap + a1 cosx + bj sinx + a2 cos2x + bz sin2x+---.
(5)
384
Chapter 8 Applications
We have an orthogonal basis! The vectors in “function space” are combinations of the
sines and cosines. On the interval from x = 27 to x = 4m, all our functions repeat
what they did from 0 to 27. They are “periodic.” The distance between repetitions
(the period) is 277.
Remember: The list is infinite. The Fourier series is an infinite series. Just as
we avoided the vector v = (1,1, 1,...) because its length is infinite, so we avoid a
function like 5 +008 x +cos2x+cos3x+---. (Note: This is 2 times the famous delta
function. It is an infinite “spike” above a single point. At x = 0 its height S+1+1+- .
is infinite. At all points inside 0 < x < 2m the series adds in some average way to
zero.) The delta function has infinite length, and regretfully it is excluded from our
space of functions.
Compute the length of a typical sum f(x):
20
Ch ey}
0
(ap + a cosx +b, sinx + a2 cos 2x +--+)? dx
20
=)
0
(a? + ai cos” x + b? sin” x + a3 cos* 2x + ---) dx
= 2map-+ (a? +b} +03 +--+).
(6)
The step from line | to line 2 used orthogonality. All products like cos
x cos 2x and
sin
x cos 3x integrate to give zero. Line 2 contains what is left—the integrals of each
sine and cosine squared. Line 3 evaluates those integrals. Unfortunately the integral of
1° is 27, when all other integrals give 2. If we divide by their lengths, our functions
become orthonormal:
1
COSEUIESINEG
COS
EFT
AALEE os ATE
ORIG lag
These are unit vectors.
is an orthonormal basis for our function space.
We could combine
them with coefficients Ag, A,, By, A2,...
to yield a function F(x). Then the 27 and the z’s drop out of the formula for length.
Equation (6) becomes function length = vector length:
FI? =(F, F) = AR + AT + BP +A34--.
7)
Here is the important point, for f(x) as well as F(x). The function has finite length exactly when the vector of coefficients has finite length. The integral of (F (x)) matches
the sum of coefficients squared. Through Fourier series, we have a perfect match between function space and infinite-dimensional Hilbert space. On one side is the function, on the other side are its Fourier coefficients.
8B The function space contains f(x) exactly when the Hilbert space contains the
vector v = (dao, a1, b),...) of Fourier coefficients. Both f(x) and v have finite length.
8.4
Fourier Series: Linear Algebra for Functions
385
Example 3 Suppose f(x) is a “square wave,” equal to —1 for negative x and +1 for
positive x. That looks like a step function, not
a wave.
But remember that f(x) must
repeat after each interval of length 277. We should have said
fa)=
—1
for
—-x<x<0
ely
OR
2Ot8 hoe re
The wave goes back to —1 for 7 < x < 2z. It is an odd function like the sines,
and all its cosine coefficients are zero. We will find its Fourier series, containing only
sines:
fe See
4
SI Xp
uA
1
This square wave has length 27,
-eeSIN. Ox +e SIN D0
3
(8)
5
because at every point (f(x) is (=1)* or (+1):
20
20
Jl? =f (re))tae =| pata — oe,
0
0
At x = 0 the sines are zero and the Fourier series (8) gives zero. This is half way up the
jump from —1 to +1. The Fourier series is also interesting when x = 4. At this point the
square wave equals 1, and the sines in equation (8) alternate between +1 and —1:
[==(1-;+3-4+-).
(9)
Multiply through by z to find a magical formula 4(1 — ‘=i g=
+---) for that famous
number.
The Fourier Coefficients
How do we find the a’s and b’s which multiply the cosines and sines?
function f(x), we are asking for its Fourier coefficients:
f@) =a
+.a; cosx
For a given
+ by sinx + an cos2x +---.
Here is the way to find a;. Multiply both sides by cosx. Then integrate from 0 to 2x.
The key is orthogonality! All integrals on the right side are zero, except the integral
of a, cos? x:
21
/
0
20
TAX) COS de
/
0
ay cos? x dx = ay.
(10)
Divide by z and you have a). To find any other ax, multiply the Fourier series by
coskx. Integrate from 0 to 27. Use orthogonality, so only the integral of ax cos? kx
is left. That integral is wa,, and divide by 7:
1
a=
mw
2n
Jo
i
f(x)coskxdx
and similarly
by = —
wT
2
JO
f(x)sinkxdx.
(11)
386
Chapter 8 Applications
The exception is ap. This time we multiply by cosOx = 1. The integral of 1 is 27:
1
20
Mh) = Ae
Tw JO
f(x)+1dx = average value of f(x).
(12)
I used those formulas to find the coefficients in (8) for the square wave. The integral
of f(x)coskx was zero. The integral of f(x) sinkx was 4/k for odd k.
The point to emphasize is how this infinite-dimensional case is so much like the ndimensional case. Suppose the nonzero vectors v1,..., VU, are orthogonal. We want to
write the vector b as a combination of those v’s:
b=cyvy
+202
+--+ + Cn Vy.
(13)
Multiply both sides by Oe Use orthogonality, so viv» = 0. Only the c, term is left:
vib — C10, YY +0+---+0.
Therefore cj = i
(14)
v, v1
The denominator vf vy is the length squared, like a in equation (11). The numerator
vib is the inner product like [ f(x) coskx dx. Coefficients are easy to find when the
basis vectors are orthogonal. We are just doing one-dimensional projections, to find
the components along each basis vector.
The formulas are even better when the vectors are orthonormal. Then we have
unit vectors.
The denominators
cp=vib
VE VK are all 1. In this orthonormal
and
co=vib
case,
and = cy=vib.
(15)
You know this in another form. The equation for the c’s is
C1Vj
+++ +Cy¥,
= 5b or
Vi
°°:
«Dp
This is an orthogonal matrix Q! Its inverse is Q'. That gives the same c’s as in (15):
Qc=b
yields
c=Q'b.
Row by row this is cj = vu}b.
Fourier series is like having a matrix with infinitely many orthogonal columns. Those
columns are the basis functions 1, cosx,sinx,.... After dividing by their lengths we
have an “infinite orthogonal matrix.” Its inverse is its transpose. The formulas for the
Fourier coefficients are like (15) when we have unit vectors and like (14) when we don’t.
Orthogonality is what reduces an infinite series to one single term.
8.4
Fourier Series: Linear Algebra for Functions
387
Problem Set 8.4
Integrate the trig identity 2 cos jx coskx = cos(j +k)x + cos(j — k)x to show that
cos jx is orthogonal to cos kx, provided j # k. What is the result when j = k?
Show that the three functions 1, x, and x2 — t are orthogonal, when the integration
is from x = —1 tox = 1. Write f(x) = 2x? as a combination of those orthogonal
functions.
Find a vector (w1,w2, w3,...) that is orthogonal to v = (1, 5, is ve; Js Compute
its length ||w||.
The first three Legendre polynomials are 1, x, and x* — i Choose the number c so
that the fourth polynomial x? — cx is orthogonal to the first three. The integrals still
go from —1 to 1.
For the square wave f(x) in Example 3, show that
20
20
F(x) cosxidxic
0
2
Faysmxdxi=4
0
ik) Sax
ae = 0.
0
Which Fourier coefficients come from those integrals?
The square wave has ||f ||? = 27. This equals what remarkable sum by equation (6)?
Graph the square wave. Then graph by hand the sum of two sine terms in its series,
or graph by machine the sum of two, three, and four terms.
Find the lengths of these vectors in Hilbert space:
ave (s;. ss tp)
(b)
v=(l,a,a’,...)
(©) f@)=14sinx.
Compute the Fourier coefficients ax and b; for f(x) defined from 0 to 27:
(aye
fy SHBior 0
ar,
Px) =
torr
<r
(DS
x
=x.
10
When f(x) has period 277, why is its integral from —z to m the same as from 0 to
2x? If f(x) is an odd function, f(—x) = — f(x), show that ka J (x) dx is zero.
11
From trig identities find the only two terms in the Fourier series for f (x):
ye
12
he
cossx | (5).
i= cose tae)
The functions 1, cos x, sinx, cos 2x, sin2x,... are a basis for Hilbert space. Write
the derivatives of those first five functions as combinations of the same five functions.
What is the 5 by 5 “differentiation matrix” for these functions?
13
Write the complete solution to dy/dx
solution to dy/dx = 0.
= cosx as a particular solution plus any
388
Chapter 8 Applications
COMPUTER
GRAPHICS a 8.5
Computer graphics deals with three-dimensional images. The images are moved around.
Their scale is changed. They are projected into two dimensions. All the main operations
are done by matrices—but the shape of these matrices is surprising.
The transformations of three-dimensional space are done with 4 by 4 matrices.
You would expect 3 by 3. The reason for the change is that one of the four key operations cannot be done with a 3 by 3 matrix multiplication. Here are the four operations:
Translation
(shift the origin to another point Po = (x0, yo, Zo))
Rescaling
(by c in all directions or by different factors cj, c2, c3)
Rotation
(around an axis through the origin or an axis through Po)
Projection (onto a plane through the origin or a plane through Po).
Translation is the easiest—just add (xo, yo, Zo) to every point.
But this is not linear!
No 3 by 3 matrix can move the origin. So we change the coordinates of the origin to
(0, 0,0, 1). This is why the matrices are 4 by 4. The “homogeneous coordinates” of
the point (x, y, z) are (x, y, z, 1) and we now show how they work.
1. Translation
Shift the whole three-dimensional space along the vector vg. The origin
moves to (x9, yo, Zo). This vector vg is added to every point v in R?. Using homogeneous coordinates, the 4 by 4 matrix T shifts the whole space by v9:
:
:
Translation matrix
=
Le
ROS
(Ome
oe
oO
Olan
eer)
SO
xo
z#
yo
|
Important: Computer graphics works with row vectors. We have row times matrix instead
of matrix times column. You can quickly check that [0 0 0 1]T = [xo yo zo 1).
To move the points (0, 0,0) and (x, y, z) by vo, change to homogeneous coordinates (0,0,0,1) and (x, y,z, 1). Then multiply by T. A row vector times T gives a
row vector: Every v moves to
v+ vo: [x y z I]T = [x + x0 y+ yo z+20 1]
The output tells where any v will move. (It goes to v + vg.) Translation is now
achieved by a matrix, which was impossible in R?.
2. Scaling To make a picture fit a page, we change its width and height. A Xerox
copier will rescale a figure by 90%. In linear algebra, we multiply by .9 times the
identity matrix. That matrix is normally 2 by 2 for a plane and 3 by 3 for a solid. In
computer graphics, with homogeneous coordinates, the matrix is one size larger:
Rescale the plane:
S=
2
a
1
Rescale a solid:
S =
or
O0
0
OPO
0-0
~O=1
i
OOO
8.5
Computer Graphics
389
Important: S is not cI. We keep the 1 in the lower corner. Then [x, y, 1] times S is
the correct answer in homogeneous coordinates. The origin stays in position because
[(O01]S = [001].
If we change that | to c, the result is strange. The point (cx,cy,cz,c) is the
same as (x,y,z, 1). The special property of homogeneous coordinates is that multiplying by cI does not move the point. The origin in R* has homogeneous coordinates
(0, 0,0, 1) and (0,0,0,c)
for every nonzero c. This is the idea behind the word “ho-
mogeneous.”
Scaling can be different in different directions. To fit a full-page picture onto a
half-page, scale the y direction by 5. To create a margin, scale the x direction by i.
The graphics matrix is diagonal but not 2 by 2. It is 3 by 3 to rescale a plane and 4
by 4 to rescale a space:
34
Scaling matrices
That last matrix
S =
C1
NI
|
S rescales the x, y, z directions by positive numbers
point at the origin doesn’t move, because
[0001]5=[000
cj, c2,c3.
The
1].
Summary
The scaling matrix § is the same size as the translation matrix T. They
can be multiplied. To translate and then rescale, multiply vTS. To rescale and then
translate,
multiply
vST.
(Are those different?
Yes.)
The extra column
in all these
matrices leaves the extra | at the end of every vector.
The point (x, y, z) in R> has homogeneous coordinates (x, y, z, 1) in P?. This
“projective space” is not the same as R*. It is still three-dimensional. To achieve such
a thing, (cx, cy, cz,c) is the same point as (x, y, z, 1). Those points of P> are really
lines through the origin in R*.
Computer graphics uses affine transformations, linear plus shift. An affine transformation T is executed on P? by a 4 by 4 matrix with a special fourth column:
@ji
aio
a3
ayn
224
2230
O
A319
ds7
1433,
10
a4,
442
443
|
T(,0,0)
0O
pO,0s1)
=)
—
The usual 3 by 3 matrix tells us three outputs, this tells four. The usual outputs come
from
the inputs
(1,0,0)
and
(0,1,0)
and (0,0, 1). When
the transformation
is lin-
ear, three outputs reveal everything. When the transformation is affine, the matrix also
contains the output from (0, 0,0). Then we know
3.
Rotation
the shift.
A rotation in R* or R® is achieved by an orthogonal matrix
determinant is +1. (With determinant
Q.
Include the extra column when you use homogeneous coordinates!
Plane rotation
Q=
cos@
enue
The
—1 we get an extra reflection through a mirror.)
—sind
Bos | becomes
R=]
cos@
sin@
0
—sin@
cos@
0
0
0
1
390 = Chapter 8 Applications
This matrix rotates the plane around the origin. How would we rotate around a
different point (4,5)? The answer brings out the beauty of homogeneous coordinates.
Translate (4,5) to (0,0), then rotate by 6, then translate (0,0) back to (4,5):
jor
WTR Tce
(rey ili)
e0e
30)
cos@é
—sin@
0
i
0: aa lie
Ose
Sin on aicos
OaOGih
Samet
]
A.
—4
-5
1
0
0
30.60
a0
goal
I won’t multiply. The point is to apply the matrices one at a time: v translates to v7_, then
rotates to vT_R, and translates back to v7_RT,. Because each point FE aed ] is a row
vector, T_ acts first. The center of rotation (4, 5)—otherwise known as (4, 5, 1)—moves
first to (0,0, 1). Rotation doesn’t change it. Then 7; moves it back to (4,5, 1). All as it
should be. The point (4, 6, 1) moves to (0, 1, 1), then turns by 6 and moves back.
In three dimensions, every rotation Q turns around an axis. The axis doesn’t
move—it is a line of eigenvectors with A = 1. Suppose the axis is in the z direction.
The 1 in Q is to leave the z axis alone, the extra 1 in R is to leave the origin alone:
O=
cos@é
lasing
0
—sin@
cosé
0
0
0
1
O
and
é
R=
Oe
Ou.
Oigel
Now suppose the rotation is around the unit vector a = (a), a2, a3). With this axis a,
the rotation matrix Q which fits into R has three parts:
ar
QO = (cos@)I + (1 —cos@) | aja
aja3
aja2z
a;
a2a3
aja3
0
aza3 | —sin@ | —a3
as
a2
a—-a
ONP
aa
—a,
The axis doesn’t move because aQ = a. When a = (0,0, 1) is in the z direction, this
Q becomes the previous Q—for rotation around the z axis.
The linear transformation Q always goes in the upper left block of R. Below it
we see zeros, because rotation leaves the origin in place. When those are not zeros,
the transformation is affine and the origin moves.
4. Projection In a linear algebra course, most planes go through the origin. In real
life, most don’t. A plane through the origin is a vector space. The other planes are
affine spaces, sometimes called “flats.” An affine space is what comes from translating
a vector space.
We want to project three-dimensional vectors onto planes. Start with a plane
through the origin, whose unit normal vector is n. (We will keep n as a column vector.) The vectors in the plane satisfy n'v = 0. The usual projection onto the plane
is the matrix I — nn". To project a vector, multiply by this matrix. The vector n is
projected to zero, and the in-plane vectors v are projected onto themselves:
Ud - nn')n =n—
n(n'n) =0
and
(J- nn')y =v—
n(n'v) =)
8.5
Computer Graphics
391
In homogeneous coordinates the projection matrix becomes 4 by 4 (but the origin doesn’t
move):
Projection onto the plane
n'v =0
P=
Now project onto a plane n'(v — v9) = 0 that does not go through the origin. One
point on the plane is vo. This is an affine space (or a flat). It is like the solutions
to Av = b when the right side is not zero. One particular solution vg is added to the
nullspace—to produce a flat.
The projection onto the flat has three steps. Translate vg to the origin by 7_.
Project along the n direction, and translate back along the row vector vo:
Projection onto a flat
(NY oe
a
—v0
eh
4 ! ae
I can’t help noticing that 7_ and 7, are inverse matrices:
They are like the elementary matrices of Chapter 2.
| ie + :
0
translate and translate back.
The exercises will include reflection matrices, also known as mirror matrices.
These are the fifth type needed in computer graphics. A reflection moves each point
twice as far as a projection—the reflection goes through the plane and out the other
side. So change the projection J — nn! to I — 2nn! for a mirror matrix.
The matrix P gave a “parallel” projection. All points move parallel to n, until
they reach the plane. The other choice in computer graphics is a “perspective” projection. This is more popular because it includes foreshortening. With perspective, an
object looks larger as it moves closer. Instead of staying parallel to n (and parallel
to each other), the lines of projection come toward the eye—the center of projection.
This is how we perceive depth in a two-dimensional photograph.
The basic problem of computer graphics starts with a scene and a viewing position.
Ideally, the image on the screen is what the viewer would see. The simplest image
assigns just one bit to every small picture element—called a pixel. It is light or dark.
This gives a black and white picture with no shading. You would not approve. In
practice, we assign shading levels between 0 and 28 for three colors like red, green,
and blue. That means 8 x 3 = 24 bits for each pixel. Multiply by the number of pixels,
and a lot of memory is needed!
Physically, a raster frame buffer directs the electron beam.
It scans like a televi-
sion set. The quality is controlled by the number of pixels and the number of bits per
pixel. In this area, one standard text is Computer Graphics: Principles and Practices
by Foley, Van Dam, Feiner, and Hughes (Addison-Wesley, 1990). My best references
were notes by Ronald Goldman (Rice University) and by Tony DeRose (University of
Washington, now associated with Pixar).
392
Chapter 8 Applications
=»
1.
REVIEW OF THE KEY IDEAS
#
Computer graphics needs shift operations T(v) = v + vo as well as linear operations T(v) = Av.
2.
3.
A shift in R” can be executed by a matrix of order n + 1, using homogeneous
coordinates.
| The extra component
| in [x yz1] is preserved when all matrices have the num-
bers 0,0, 0,1 as last column.
Problem Set 8.5
1
A typical point in R? is xi + yj + zk. The coordinate vectors i, j, and k are
(1, 0<0), ©, 1, Da@,0;
1). The coordinates of the pomt are @, y, z).
This point in computer graphics is xi + yj + zk + origin. Its homogeneous
coordinates are( , , , ). Other coordinates for the same point are( , , , ).
D2
A linear transformation T is determined when
an affine transformation we also need 7’(
transformed to xT (i) + yT(j) + zT (k) +
3
Multiply the 4 by 4 matrix 7 for translation along (1, 4,3) and the matrix 7; for
translation along (0, 2,5). The product 77; is translation along
4
Write down the 4 by 4 matrix S that scales by a constant c. Multiply ST and
also TS, where
we know 7 (i), T(j), T(k). For
). The input point (x, y, z, 1) is
:
T is translation by (1,4, 3). To blow up the picture around the
center point (1,4, 3), would you use vST
or vTS?
5
What scaling matrix S (in homogeneous coordinates, so 3 by 3) would make this
page square?
6
What 4 by 4 matrix would move a corner of a cube to the origin and then multiply all lengths by 2? The corner of the cube is originally at (1, 1, 2).
7
When the three matrices in equation (1) multiply the unit vector a, show that they
give (cos@)a and (1 —cos@)a and 0. Addition gives aQ = a and the rotation
axis is not moved.
8
If b is perpendicular to a, multiply by the three matrices in (1) to get (cos@)b
and 0 and a vector perpendicular to b. So Qb makes an angle @ with b. This is
rotation.
9
What is the 3 by 3 projection matrix J — nn! onto the plane Ex + sy + Ez — (0?
In homogeneous coordinates add 0, 0,0, 1 as an extra row and column in P.
8.5
10
Computer Graphics
393
With the same 4 by 4 matrix P, multiply T_PT to find the projection matrix
onto the plane $x + fy+ bz = 1. The translation T_ moves a point on that plane
(choose one) to (0,0, 0,1). The inverse matrix T, moves
it back.
11
Project (3, 3,3) onto those planes. Use P in Problem 9 and
10.
12
If you project a square onto a plane, what shape do you get?
13
If you project a cube onto a plane, what is the outline of the projection? Make
the projection plane perpendicular to a diagonal of the cube.
14
The 3 by 3 mirror matrix that reflects through the plane n'v = 0 is M = / —
2nn'.
15
T_ PT, in Problem
Find the reflection of the point (3, 3,3) in the plane $x
fy - Ez =a)
Find the reflection of (3, 3,3) in the plane ox - sy _ RZ = |. Take three steps
T_MT, using 4 by 4 matrices: translate by T_ so the plane goes through the
origin, reflect the translated point (3, 3,3, 1)7_ in that plane, then translate back
by
16
Ii.
The vector between
ence v =
adda _
17
the origin (0,0,0, 1) and the point (x, y, z, 1) is the differ-
___.
_In homogeneous coordinates, vectors end in
to a point, not a point to a point.
. SO we
If you multiply only the /ast coordinate of each point to get (x, y,z,c), you
rescale the whole space by the number
~ Lhis is-becase (x. y,.2.c)_ 1s
themsathencdsn
(nmr
ls
i
A
une
eN
7
‘an " singe Sone te
Oa
ui @:)ie
PALS
Mle
“Fer, aetliamnetrei
as
Petar rH ed
eateey i) a eu tn Vey eal. sev a oe
.
OM
ew fara viele a aah aa
wl
Gad.
Te
Sia
sserrands ea
Pepaders
iat:
LEP
Pe
saps
me
a flwe i
i
beter)stad 7a 1S
i ae
er
:
2
1
i
Eel
sh
Awe
a
s} die
;
‘
i pao ie
sone is wing
Pl Peete! caeitwirerento
(oi oom Ve
4 nia
1 oe EO
——
7
a
fla
.,
i¢ TORI
i
SDE
Veeiea%®
o@ 0@
“Bei
fi BS
Nant
o&
eae
owt
7)
Wee
ae
a
“ye
ee
ve ty)
=e
pl) ID-daa sg,'
90s
ew
Nt
4
tong
‘
i
+ Shine
:
ni
ee «
er Oee : Gana
=
¥
tal
dan
lw
Bias
eS
De
=
=i
hed
tong
RE
aa
Qt
he
-
9
op ot
° steele
ae
apg
hee (aiiais 5 ay
we, (= topie TT sual
i"
Nd
OP
J
4 ax
a
Oh. Sean
OW ie
=
Pan DG
-
mit vow
ned
wetees'n
<
abe
1 "a howtH
es =>
Bey Vy 4.>0) ep YA iio: she ‘We,sonatiewnins
«
aea
a
i ‘anh
ame”,
ie
H.
gona saytermanage
ater
vad Sian
waeale
it
f
‘eae
min
,
6
~~.
Tor ~a, _Gas
ass
:
NUMERICAL
GAUSSIAN
LINEAR ALGEBRA
ELIMINATION
IN PRACTICE m 9.1
Numerical linear algebra is a struggle for quick solutions and also accurate solutions.
We need efficiency but we have to avoid instability.
In Gaussian
elimination, the main
freedom (always available) is to exchange equations. This section explains when to
exchange rows for the sake of speed, and when to do it for the sake of accuracy.
The key to accuracy is to avoid unnecessarily large numbers. Often that requires
us to avoid small numbers! A small pivot generally means large multipliers (since we
divide by the pivot). Also, a small pivot now means a large pivot later. The product
of the pivots is a fixed number (except for its sign). That number is the determinant.
A good plan is to choose the largest candidate in each new column as the pivot.
This is called “partial pivoting.’ The competitors are in the pivot position and below.
We will see why this strategy is built into computer programs.
Other row exchanges are done to save elimination steps. In practice, most large
matrices have only a small percentage of nonzero entries. The user probably knows
their location. Elimination is generally fastest when the equations are ordered to put
those nonzeros close to the diagonal. Then the matrix is as “banded” as possible.
New questions arise for parallel machines. Vector computers like the Cray have
4, 8, or 16 processors working at once. The Connection Machine already has 1024
and the CM-5 is designed for 2'4 = 16,384 processors in parallel. Now the problem
is communication—to send processors the data they need, when they need it. This is a
major research area now. The brief comments in this section will try to introduce you
to thinking in parallel.
Section 9.2 is about instability that can’t be avoided. It is built into the problem,
and this sensitivity is measured by the “condition number.” Then Section 9.3 describes
how to solve Ax = b by iterations. Instead of direct elimination, the computer solves
an easier equation many times. Each answer xx goes back into the same equation to
find the next guess x,41. For good iterations, the xz converge quickly to x = A mb:
395
396
Chapter 9 Numerical Linear Algebra
Roundoff Error and Partial Pivoting
Up to now, any pivot (nonzero of course) was accepted. In practice a small pivot is
dangerous. A catastrophe can occur when numbers of different sizes are added. Computers keep a fixed number of significant digits (say three decimals, for a very weak
machine).
The sum
10,000 + 1 is rounded off to 10,000.
The “1” is completely lost.
Watch how that changes the solution to this problem:
.0001u --v=1
starts with coefficient matrix
—u+v=0
i
—|
a
1
:
If we accept .0001 as the pivot, elimination adds 10,000 times row 1 to row 2. Roundoff leaves
10,000v = 10,000
The computed answer v =
leads to
000l
instead of
10,001v = 10,000.
| is near the true v = .9999.
u+1l=1)..
ansiesa et
But then back substitution
0001 uw + .9999 = 1.
The first equation gives u = 0. The correct answer (look at the second equation) is
u = 1.000. By losing the “1” in the matrix, we have lost the solution. The change
from 10,001 to 10,000 has changed the answer from u =1 to u = 0 (100% error’).
If we exchange rows, even this weak computer finds an answer that is correct to
three places:
—u+yv=0
000lu+v=
—u+v=0
1
Via
u=1
a
The original pivots were .0001 and 10,000—badly scaled. After a row exchange the
exact pivots are —1 and 1.0001—well scaled. The computed pivots —1 and 1 come
close to the exact values. Small pivots bring numerical instability, and the remedy is
partial pivoting. The kth pivot is decided when we reach and search column k:
Choose the largest number in row k or below. Exchange its row with row k.
The strategy of complete pivoting looks also in later columns for the largest pivot. It
exchanges columns as well as rows. This expense is seldom justified, and all major
codes use partial pivoting. Multiplying a row or column by a scaling constant can also
be worthwhile. Jf the first equation above is u + 10,000v = 10,000 and we don’t
rescale, then | is the pivot but we are in trouble again.
For positive definite matrices, row exchanges are not required. It is safe to accept
the pivots as they appear. Small pivots can occur, but the matrix is not improved by
row exchanges. When its condition number is high, the problem is in the matrix and
not in the order of elimination steps. In this case the output is unavoidably sensitive
to the input.
9.1.
Gaussian Elimination in Practice
397
The reader now understands how a computer actually solves Ax = b—by elimination
with partial pivoting. Compared with the theoretical description—find A~! and multiply A~'b—the details took time. But in computer time, elimination is much faster.
I believe this algorithm is also the best approach to the algebra of row spaces and
nullspaces.
Operation Counts: Full Matrices and Band Matrices
Here is a practical question about cost. How many separate operations are needed to
solve Ax = b by elimination? This decides how large a problem we can afford.
Look first at A, which changes gradually into
U. When a multiple of row 1
is subtracted from row 2, we do n operations. The first is a division by the pivot,
to find the multiplier /. For the other n — | entries along the row, the operation is a
“multiply-subtract.” For convenience, we count this as a single operation. If you regard
multiplying by / and subtracting from the existing entry as two separate operations,
multiply all our counts by 2.
The matrix A is n by n. The operation count applies to all n — 1 rows below the
first. Thus it requires n times n — 1 operations, or n? —n, to produce zeros below the
first pivot. Check: All n? entries are changed, except the n entries in the first row.
When elimination is down to k equations, the rows are shorter. We need only
k* —k operations (instead of n* —n) to clear out the column below the pivot. This is
true for 1 <k <n. The last step requires no operations (17 — 1 = 0), since the pivot
is set and forward elimination is complete. The total count to reach U is the sum of
k* —k over all values of k from 1 to n:
(?4---+n’)-(+--+n =
n(n + 1)(2n + 1) . nin+1)
6
_
n—n
2
3
Those are known formulas for the sum of the first n numbers and the sum of the first
n squares. Substituting n = | into n> —n gives zero. Substituting n = 100 gives
a million minus a hundred—then divide by 3. (That translates into one second on a
workstation.) We will ignore the last term n in comparison with the larger term n>, to
reach our main conclusion:
The operation count for forward elimination (A to U) is in’,
That means in? multiplications and sn subtractions. Doubling n increases this cost
by eight (because n is cubed). 100 equations are OK, 1000 are expensive, 10000 are
impossible. We need a faster computer or a lot of zeros or a new idea.
On the right side of the equations, the steps go much faster. We operate on single
numbers, not whole rows. Each right side needs exactly n> operations. Remember
that we solve two triangular systems, Lc = b forward and Ux = c backward. In back
substitution, the last unknown needs only division by the last pivot. The equation above
it needs two operations—substituting x, and dividing by its pivot. The kth step needs
398
= Chapter 9
Figure 9.1.
Numerical Linear Algebra
A=LU
for a band matrix. Good zeros in A stay zero in L and U.
k operations, and the total for back substitution is
[t24--tn=
1
FO why?
operations.
The forward part is similar. The n? total exactly equals the count for multiplying A7!b!
This leaves Gaussian elimination with two big advantages over A~!b:
1
Elimination requires n° operations compared to n° for Aq!.
2
If A is banded so are L and U. But A~! is full of nonzeros.
Band Matrices
These counts are improved when A has “good zeros.’ A good zero is an entry that
remains zero in L and U. The most important good zeros are at the beginning of a
row. No elimination steps are required (the multipliers are zero). So we also find those
same good zeros in L. That is especially clear for this tridiagonal matrix A:
yer
1
vee
21
ei
=e
10a
eed
=e
=i
ee.
(el
ea
iol
Rows 3 and 4 of A begin with zeros. No multiplier is needed, so L has the same
zeros. Also rows | and 2 end with zeros. When a multiple of row 1 is subtracted
from row 2, no calculation is required beyond the second column. The rows are short.
They stay short! Figure 9.1 shows how a band matrix A has band factors L and U.
These zeros lead to a complete change in the operation count, for “half-bandwidth” w:
A band matrix has
a;j =9
when
|i-—j|>w.
Thus w = | for a diagonal matrix and w = 2 for a tridiagonal matrix. The length of
the pivot row is at most w. There are no more than w — 1 nonzeros below any pivot.
9.1.
Gaussian
Elimination
in Practice
399
Each stage of elimination is complete after w(w—1) operations, and the band structure
survives. There are n columns to clear out. Therefore:
Forward elimination on a band matrix needs less than w?n operations.
For a band matrix, the count is proportional to n instead of n°. It is also proportional
to w*. A full matrix has w =n and we are back to n>. For a closer count, remember
that the bandwidth drops below w in the lower right corner (not enough space). The
exact count to find L and U is
w(w — 1)(3n — 2w 4+ 1)
for a band matrix
3
nin—1)(n+1)
n?—n
3
when
8)
w=n.
On the right side, to find x from b, the cost is about 2wn (compared to the usual n7).
Main point: For a band matrix the operation counts are proportional to n. This is
extremely fast. A tridiagonal matrix of order 10,000 is very cheap, provided we don’t
compute A~!. That inverse matrix has no zeros at all:
ih
soe
ea)
abi
Mee wen eet
oo Ow” 9
4y 3) oR 4
Se
a
ie
PeebaseAne Tl ial, Wear et
le
ks Ra
We are actually worse off knowing A~! than knowing L and U. Multiplication by
A~! needs the full n? steps. Solving Le = b and Ux = c needs only 2wn. Here
that means 4n. A band structure is very common in practice, when the matrix reflects
connections between near neighbors.
We see a)3 = 0 and aj4 = 0 because
| is not a
neighbor of 3 and 4.
We close with two more operation counts:
1.
Aq! costs n° steps.
2
QR costs $n? steps.
1 Start with AA~! = J. The jth column of A7! solves Ax; = jth column of /.
Normally each of those n right sides needs n> operations, making n° in all. The left
side costs zn? as usual. (This is a one-time cost! L and U are not repeated for each
new right side.) This count gives zn, but we can get down to n°.
The special saving for the jth column of J comes from its first
7 — 1 zeros. No
forward cost
The
j.
row
reaches
elimination
until
side
right
the
on
required
work is
on the
elimination
forward
for
total
the
j,
over
Summing
sn.
of
is S(n — j)? instead
n right sides is zn. Then the final count of multiplications for A~! (with an equal
number of subtractions) is n> if we actually want the inverse matrix:
A: 3
3
3
(L and U) + ~ (forward)
2
+n(5) (back substitutions)
a
= n,
(1)
400
Chapter 9
Numerical Linear Algebra
2 The Gram-Schmidt process works with columns instead of rows—that is not so important to the count. The key difference from elimination is that the multiplier is decided by a dot product. So it takes n operations to find the multiplier, where elimination just divides by the pivot. Then there are n “multiply-subtract” operations to
remove from column 2 its projection along column 1. (See Section 4.4 and Problem
4.4.28 for the sequence of projections.) The cost for Gram-Schmidt is 2n where for
elimination it is n. This factor 2 is the price of orthogonality. We are changing a dot
product to zero instead of changing an entry to zero.
Caution To judge a numerical algorithm, it is not enough to count the operations. Beyond “flop counting” is a study of stability and the flow of data. Van Loan emphasizes
the three levels of linear algebra: linear combinations cu + v (level 1), matrix-vector
Au + v (level 2), and matrix-matrix AB +C
(level 3). For parallel computing, level 3
is best. AB uses 2n? flops (additions and multiplications) and only 2n? data—a good
ratio of work to communication overhead. Solving UX = B for matrices is better than
Ux = b for vectors. Gauss-Jordan partly wins after all!
Plane Rotations
There are two ways to reach the important factorization A = QR. One way works to
find Q, the other way works to find R. Gram-Schmidt chose the first way, and the
columns of A were orthogonalized to go into Q. (Then R was an afterthought. It
was upper triangular because of the order of Gram-Schmidt steps.) Now we look at a
method that starts with A and aims directly at R.
Elimination gives A = LU, orthogonalization gives A = QR. What is the difference, when R and U are both upper triangular? For elimination L is a product of
E’s—with
1’s on the diagonal and the multiplier /;; below. For QR we must use orThe E’s are not allowed. We don’t want a triangular L, we want
thogonal matrices.
an orthogonal Q.
There are two simple orthogonal matrices to take the place of the E’s. The reflection matrices I —2uu" are named after Householder. The plane rotation matrices
are named after Givens. The matrix that rotates the xy plane by 6, and leaves the z
direction alone, is
Givens Rotation
Oo; =_|
cos@é
sing
0
—sin@d
O
cos@
0
Ort
Use Q2) the way you used £>\, to produce a zero in the (2,1) position. That determines the angle 6. Here is an example given by Bill Hager in Applied Numerical
Linear Algebra (Prentice-Hall, 1988):
geo)
QnA=|o
8
0
nea)
ao
153
114
6
0
Oro
120
200
+79.
—40
~223
309
|=
150
—155
—110
0
200
Tey
—40
Ps
395
9.1.
The zero came from —.8(90) + .6(120).
90
cos 6 = ———_____
V907 + 1202
Gaussian
Elimination
in Practice
401
No need to find 6, what we needed was
al 20)
sin? = ————____
V907 + 1202
and
a)
Now we attack the (3,1) entry. The rotation will be in rows and columns 3 and 1.
The numbers cos@ and sin@ are determined from 150 and 200, instead of 90 and 120.
They happen to be .6 and —.8 again:
(Gy
08
On
=—s
0
Q3102:A =
One more
come
step to R.
26
0
&&e
S00)
0
oC)
ie
ae
The (3,2) entry has to go.
from 75 and 100. The rotation is now
I
03203102A=|90
0
0
6
Ss
0
8
is
25) t= 25
0
75
0
100
=
The numbers
in rows
290) 125,
0
fea
0
Hoo
cos@
and columns
-l=|]
.
250
—225
B25
and sin@ now
2 and 3:
250..—125
.0 125
0
Q
250
125
375
We have reached the upper triangular R. What is Q? Move the plane rotations Qj;
to the other side to find A = QR—just as you moved the elimination matrices Ej; to
the other side to find A = LU:
Q303102A=R
means
A=(Q>,03/'03))R= OR.
(3)
The inverse of each Qj; is Qi, (rotation through —@). This is different from the inverse
of £;;—which was not an orthogonal matrix! Bx added back to row i the multiple
/;; (times row j) that E£;; had subtracted. I hope you see how the big computations of
linear algebra—LU and QR—are similar but not the same.
There is a third big computation—eigenvalues and eigenvectors. If we can make A
triangular, we can see its eigenvalues on the diagonal. But we can’t use U and we
can’t use R. To preserve the eigenvalues, the allowed step is not Q2;A but Qr1AQ>,.
That extra factor Ont wipes out the zero that Q2) created!
There are two ways to go. Neither one gives the eigenvalues in a fixed number
of steps. (That is impossible. The calculation of cos @ and sin@ involved only a square
root. The nth degree equation det(A — AJ) = 0 cannot be solved by a succession of
square roots.) But still the rotations Q;; are useful:
Method
1 Produce a zero in the (3, 1) entry of Q2;A, instead of (2,1). That zero is
not destroyed when Ox multiplies on the right. We are leaving a diagonal of nonzeros
under the main diagonal, so we can’t read off the eigenvalues. But this “Hessenberg
matrix”’ with the extra diagonal of nonzeros still has a lot of good zeros.
Method
2 Choose
of Q2AQ>).
a different
Q2;, which
does produce
a zero
in the (2, 1) position
This is just a 2 by 2 eigenvalue problem, for the matrix in the upper
402
Chapter 9 Numerical Linear Algebra
left corner of A. The column
(cos@, —sin@)
is an eigenvector of that matrix.
This is
the first step in “Jacobi’s method.”
The problem of destroying zeros will not go away. The second step chooses Q3}
so that 03) Q>,AQ3} Ox has a zero in the (3, 1) position. But it loses the zero in the
(2, 1) position. Jacobi solves 2 by 2 eigenvalue problems to find his Q;;, but earlier
nonzeros keep coming back. In general those nonzeros are smaller each time, and after
several loops through the matrix the lower triangular part is substantially reduced. Then
the eigenvalues gradually appear on the diagonal.
What you should remember is this. The Q’s are orthogonal matrices—their
columns with (cos@,sin@) and (—sin@,cos@) are orthogonal unit vectors. Computations with the Q’s are very stable. The angle 6 can be chosen to make a particular
entry zero. This is a step toward the final goal of a triangular matrix. That was the
goal at the beginning of the book, and it still is.
Problem Set 9.1
1
Find the two pivots with and without partial pivoting for
001
0
oe :
mae
With partial pivoting, why are no entries of L larger than 1? Find a 3 by 3 matrix
A with all |aj;| < 1 and |/;;| < 1 but third pivot = 4.
Z
Compute the exact inverse of the Hilbert matrix A by elimination. Then compute
A~! again by rounding all numbers to three figures:
A = hilb3) =
Nie
Wl
ee
3
dle
WI
BIR
Ble
Ale
Wie
For the same A compute b = Ax for x = (1, 1, 1) and x = (0, 6, —3.6). A small
change Ab produces a large change Ax.
4
Find the eigenvalues (by computer) of the 8 by 8 Hilbert matrix aj; = 1/(@i +
j — 1). In the equation Ax = b with ||b|| = 1, how large can ||x|| be? If b has
roundoff error less than
5
10~!°, how large an error can this cause in x?
For back substitution with a band matrix (width w), show that the number
of
multiplications to solve Ux = ¢ is approximately wn.
6
7.
If you know L and U and @Q and R, is it faster to solve LUx = b or ORx = b?
Show that the number of multiplications to invert an upper triangular n by n
matrix is about zn. Use back substitution on the columns of 7, upward from 1’s.
9.1
Gaussian
Elimination in Practice
403
Choosing the largest available pivot in each column (partial pivoting), factor each
AINtOCE
Ate LU
il.
ae;
>|
and
@
eee
2
Aa
0
i
ae)
©¢
Put 1’s on the three central diagonals of a 4 by 4 tridiagonal matrix.
cofactors of the six zero entries.
10
Those entries are nonzero
Find the
in A~!.
(Suggested by C. Van Loan.) Find the LU factorization of A =[€1]. On your
computer solve by elimination when ¢ = 10 10
ot
iy,
Lee tales
The true x is (1, 1). Make
=
lee
2
10
10
ee
;
a table to show the error for each ¢. Exchange
the
two equations and solve again—the errors should almost disappear.
11
Choose sin@ and cos@ to triangularize A, and find R:
cos@é
—sin@}]}]1
tay
ol E
—1
He
AR
a
-
12
Choose sin@ and cos@ to make
eigenvalues?
13
When A is multiplied by Q;;, which of the n> entries of A are changed? When
Q>,AQ3}
triangular (same A).
What are the
Q;;A is multiplied on the right by Oe which entries are changed now?
14
How many multiplications and how many additions are used to compute Q;;A?
(A careful organization of the whole sequence of rotations gives in? multiplica-
tions and in? additions—the same as for QR by reflectors and twice as many
as tor. U-)
15
(Turning a robot hand) The robot produces any 3 by 3 rotation A from plane
rotations around the x, y, z axes. Then Q32Q3;Q21A = R, where A is orthogonal so R is J! The three robot turns are in A = OO
Own The three angles
are “Euler angles” and det Q = | to avoid reflection. Start by choosing cos@ and
sin@ so that
cos6
QO7,A = | sind
0
—snO6
cosd
0
O;,;-1
2
2
0) A
2 —1
21
Lae
2
sa
is zero in the (2, 1) position.
404
Chapter 9 Numerical Linear Algebra
NORMS AND CONDITION
NUMBERS gp 9.2
How do we measure the size of a matrix? For a vector, the length is ||x||. For a matrix,
the norm is ||A||. This word “norm” is sometimes used for vectors, instead of length.
It is always used for matrices, and there are many ways to measure ||A||. We look at
the requirements on all “matrix norms”, and then choose one.
Frobenius squared all the entries of A and added; his norm ||A||p is the square
root.
This treats the matrix like a long vector.
It is better to treat the matrix as a
matrix.
Start with a vector norm: ||x + y|| is not greater than ||x|| + ||y||. This is the
triangle inequality: x + y is the third side of the triangle. Also for vectors, the length
of 2x or —2x is doubled to 2||x||. The same rules apply to matrix norms:
|A + Bll < |All + [Bll
and
||cAl] = |e] |All.
(1)
The second requirements for a norm are new for matrices—because matrices multiply. The size of Ax and the size of AB must stay under control. For all matrices
and all vectors, we want
|| Ax |]< |]All ll
and
ABI < JA} Bl.
(2)
This leads to a natural way to define ||A||. Except for the zero matrix, the norm is a
positive number. The following choice satisfies all requirements:
DEFINITION
The norm of a matrix A is the maximum
|All =
x40
of the ratio \|Ax||/||x
||:
||Ax||
|||
(3)
Then ||Ax||/||x|| is never larger than ||A|| (its maximum). This says that ||Ax|| <
|All ||x||, as required.
||x||/||x||.
Therefore
||| = 1. If A is an orthogonal matrix Q, then again lengths are preserved:
|x|| for every x. The ratios again give ||Q|| = 1.
Example 1 If A is the identity matrix /, the ratios are always
||Qx|| =
Example 2 The norm of a diagonal matrix is its largest entry (using absolute values):
The norm of
A=
E i
is;
|AllP=3.
The ratio is ||Ax|| = \/2?x7 + 32x} divided by ||x|] = ,/x? +3. That is a maximum
when x; = 0 and x2 = 1. This vector x = (0, 1) is an eigenvector with Ax = (0, 3).
The eigenvalue is 3. This is the largest eigenvalue of A and it equals the norm.
9.2
Norms
and Condition
Numbers
405
For a positive definite symmetric matrix the norm is ||A\| = Amax.Choosex to be the eigenvector with maximum eigenvalue: Ax = Amaxx. Then ||Ax||/||x||
equals Amax. The point is that no other vector x can make the ratio larger. The matrix
is A = QAQ!, and the orthogonal matrices Q and Q" leave lengths unchanged. So
the ratio to maximize is really ||Ax||/||x|]. The norm Amax is the largest eigenvalue in
the diagonal matrix A.
Symmetric matrices Suppose A is symmetric but not positive definite—some eigenvalues of A are negative or zero. Then the norm ||A|| is the largest of |A;|, |Aa|, ...,
|An|. We take absolute values of the A’s, because the norm is only concerned with
length. For an eigenvector we have ||Ax|| = ||Ax||, which is |A| times ||x|]. Dividing by ||x|| leaves |A|. The x that gives the maximum ratio is the eigenvector for the
maximum |A|.
Unsymmetric matrices If A is not symmetric, its eigenvalues may not measure its
true size.
The norm can be large when the eigenvalues are small.
generally larger than
|\A|max.
A very unsymmetric
example
Thus the norm
is
has A; = Az = 0 but its
norm is not zero:
A=
3 a
A
|| A || = max Axl
has norm
x40
||x||
—w)
The vector x = (0,1) gives Ax = (2,0). The ratio of lengths is 2/1.
maximum ratio ||A||, even though x is not an eigenvector.
This is the
It is the symmetric matrix A™A, not the unsymmetric A, that has x = (0, 1) as
its eigenvector. The norm is really decided by the largest eigenvalue of A'A, as we
now prove.
9A The norm of A (symmetric or not) is the square root of Ninax(Ah A):
te
|| Ax ||?
=
max
7,
x2
wT Al Ax
== eo ar rls
eee St Amax(A’A)
max (
)|
4
(4)
Proof Choose x to be the eigenvector of A'A corresponding to its largest eigenvalue
max. The ratio in equation (1) is then xT ATAx
this particular x, the ratio equals Amax.
= x! (Amax)X
divided by x'x.
For
No other x can give a larger ratio. The symmetric matrix A'A has orthonormal eigenvectors q1,q2,---.Qn- Every x is a combination of those vectors. Try this
combination in the ratio and remember that q} 4) =U
Si pa
ALAN
xlx
>
(C141
aur CnQn)
qe
(c1A1 41 +-+++CnAngn)
+ ¢ngn)! (c1q1 air ee
+ Cnn)
a
crAy a
= +...
Con
+c?
(5)
406
Chapter 9 Numerical Linear Algebra
That last ratio cannot be larger than Amax. The maximum ratio is when all c’s are zero,
except the one that multiplies Amax.
Note 1 The ratio in (5) is known as the Rayleigh quotient for the matrix A'A. The
maximum is the largest eigenvale dmax(A!A). The minimum is Amin(A! A). If you
substitute any vector x into the Rayleigh quotient x! ATAx/x'x, you are guaranteed
to get a number between Amin and Amax.Note 2 The norm ||A|| equals the largest singular value Omax of A. The singular values
O|,..-., 0, are the square roots of the positive eigenvalues of A'A. So certainly omax =
(Amax)!/*. This is the norm of A.
Note 3 Check that the unsymmetric example in equation (3) has Limag(A LA) = 4:
A= k | leads to ATA = E | with Amax = 4. So the norm is ||Al| = V4.
The Condition
Number
of A
Section 9.1 showed that roundoff error can be serious. Some systems are sensitive, others are not so sensitive. The sensitivity to error is measured by the condition number.
This is the first chapter in the book which intentionally introduces errors. We want to
estimate how much they change x.
The original equation is Ax = b. Suppose the right side is changed to
b+ Ab
because of roundoff or measurement error. The solution is then changed to x+ Ax. Our
goal is to estimate the change Ax in the solution from the change AB in the equation.
Subtraction gives the error equation A(Ax) = Ab:
Subtract
Ax = b from A(x + Ax) =b+ Ab
to find
A(Ax) = AB.
(6)
The error is Ax = A~!AD. It is large when A~! is large (then A is nearly singular).
The error Ax is especially large when Ab points in the worst direction—which is amplified most by A~!.
The worst error has ||Ax|| =
||A7/|| ||Ab||.
That is the largest
possible output error Ax.
This error bound ||A~!|| has one serious drawback. If we multiply A by 1000,
then A~! is divided by 1000. The matrix looks a thousand times better. But a simple
rescaling cannot change the reality of the problem. It is true that Ax will be divided
by 1000, but so will the exact solution x = A~'b. The relative error ||Ax||/||x|| will
stay the same. It is this relative change in x that should be compared to the relative
change in b.
Comparing relative errors will now lead to the “condition number” c = ||A|j ||A~! |].
Multiplying A by 1000 does not change this number, because A™! is divided by 1000
and the product c stays the same.
9.2
Norms and Condition
Numbers
407
9B The solution error is less than c = ||A|j ||A~‘|| times the problem error:
Axl
Ad
lz) = ° Wen
S
If the problem error is AA (error in the matrix instead of in b), this changes to
|| Ax ||
|AAl|
Fe
ee
Ix
+ Axl ~° (Al.
i»)
Proof The original equation is b = Ax. The error equation (6) is Ax
Apply the key property (2) of matrix norms:
ll < All Ix
and
=
A~!Ab.
Axl] < JA" ABI.
Multiply the left sides to get ||b|| ||Ax||, and also multiply the right sides. Divide both
sides by ||b|| ||x||. The left side is now the relative error ||Ax||/||x||. The right side is
now the upper bound in equation (7).
The same condition number c =
matrix. We have AA instead of Ab:
||A|l ||A~!|| appears
when
the error is in the
Subtract Ax = b from (A + AA)(x + Ax) = Bb to find A(Ax) = —(AA)(x + Ax).
Multiply the last equation by A~! and take norms to reach equation (8):
|| Ax ||
ee
ee
Axl] < AT AAI lle + Axl
cd
lx
+ Ax]||
le
Conclusion Errors enter in two ways. They begin with an error AA or Ab—we use a
wrong matrix or a wrong b. This problem error is amplified (a lot or a little) into the
solution error Ax. That error is bounded, relative to x itself, by the condition number c.
The error Ab depends on computer roundoff and on the original measurements of b.
The error AA also depends on the elimination steps. Small pivots tend to produce large
errors in L and U. Then
L+ AL times U + AU equals
A+ AA. When AA or the
condition number is very large, the error Ax can be unacceptable.
Example 3 When
values:
A and A~! are symmetric, c = ||A|| |A—!|| comes
6
0
A=a E | has norm
6.
=
Ate=
LO
1n6
0
1
has; norm
from the eigen-
i"5
This A is symmetric positive definite. Its norm is Amax = 6. The norm
l/Agin = - Multiplying those norms gives the condition number:
7
Amax
Amin
=
GaS'S:3
of A!
is
408
Chapter 9 Numerical Linear Algebra
Example 4 Keep the same A, with eigenvalues 6 and 2. To make x small, choose
b along the first eigenvector (1,0). To make Ax large, choose Ab along the second
eigenvector (0, 1). Then x = ib and Ax = 5b. The ratio ||Ax||/||x|| is exactly c = 3
times the ratio ||Ab||/||b|].
This shows that the worst error allowed by the condition number can actually
happen. Here is a useful rule of thumb, experimentally verified for Gaussian elimination: The computer can lose logc decimal places to roundoff error.
Problem Set 9.2
1
Find the norms Amax and condition numbers Amax/Amin Of these positive definite
matrices:
Sy
0
ye
Find the norms
@
2)
wl
il 2
23
ih
al
Jee
and condition numbers from the square roots of AraC
A LA)vand
Axl AlAD:
=O)
ong
3
a
0 0
jae
pie
Explain these two inequalities from the definitions of ||A|| and ||Bl:
|ABx|| < ||Al] ||Bxl] < |All |Bll Ix.
From the ratio that gives ||AB||, deduce that ||AB|| < ||Al| ||B||. This is the key
to using matrix norms.
4
5
Use ||AB|| < ||A]] |B] to prove that the condition number of any matrix A is at
least 1.
Why is J the only symmetric positive definite matrix that has Amax = Amin = 1?
Then the only matrices with ||A|| = 1 and ||A~!|| = 1 must have ATA = J. They
are
matrices.
6
Orthogonal matrices have norm ||Q|| = 1. If A = QR show that ||A]| < ||R]|
and also ||R|| < ||Al]. Then ||/A|| = ||R||. Find an example of A = LU with
Al] < LINCI.
ve
8
(a)
Which famous inequality gives ||(A + B)x|| < ||Ax|| + ||Bx|| for every x?
(b)
Why does the definition (4) of matrix norms lead to ||A+B|| < ||A||+||B||?
Show that if A is any eigenvalue of A, then |A| < ||Al|. Start from Ax = Ax.
9.2.
Norms and Condition
Numbers
409
The “spectral radius” p(A) = |Amax| is the largest absolute value of the eigenvalues. Show with 2 by 2 examples that p(A+B) < p(A)+ p(B) and p(AB) <
P(A)p(B) can both be false. The spectral radius is not acceptable as a norm.
10
(a)
Explain why A and A!
have the same condition number.
(b)
Explain why A and A! have the same norm.
11
Estimate the condition number of the ill-conditioned matrix A = ie sve |.
12
Why is the determinant of A no good as a norm?
dition number?
13
(Suggested by C. Moler and C. Van Loan.) Compute b— Ay and b— Az when
yt
2 kt
ee
re
JOO.
|
363
Olle umeGS
Why is it no good as a con-
cy ly ol
*
Llu 08701,
999
cline
tle
Is y closer than z to solving Ax = b? Answer in two ways: Compare the residual b — Ay to b — Az. Then compare y and z to the tue x = (1,-—1). Both
answers can be right. Sometimes we want a small residual, sometimes a small
Ne
14
(a)
Compute the determinant of A in Problem
(b)
If possible compute
Problems
15
13. Compute A’.
||A|| and |A~!|| and show that c > 10°.
15-19 are about vector norms other than the usual ||x|| = /x -x.
The “Z! norm” and the “J
norm” of x = (x1,...,Xn)
Welly = [xa] +--+ + [xn]
Compute the norms
and
are
|x ]loo = max [xj].
1<i<n
||x|| and ||x||; and ||x||,.o of these two vectors in R>:
ee
ee)
se
(eee
eae
16
Prove that ||x|loo < ||x|| < ||x|l1. Show from the Schwarz inequality that the ratios
1x ||/I Iloo and ||x||1/||x||are never larger than /n. Which vector (x1, ...,- Xn) gives
ratios equal to /n?
Ww,
All vector norms must satisfy the triangle inequality. Prove that
lx
+ Ylloo < II¥lloo+1¥lloo
18
and
lx
+ yl S Ile lia + My h.
Vector norms must also satisfy ||cx||= |c| |||]. The norm must be positive except
when x = 0. Which of these are norms for (x1, x2)?
Ea
89 Wee
Be)
Ix Ilo = lel + [lt lloo
|x || 3 = min |x;|
Ix llp = ||Ax||
(answer depends on A).
410°
Chapter 9 Numerical Linear Algebra
ITERATIVE METHODS
FOR LINEAR ALGEBRA gp 9.3
Up to now, our approach to Ax = b has been “direct.” We accepted A as it came. We
attacked it with Gaussian elimination. This section is about iterative methods, which
replace A by a simpler matrix S. The difference T = S—A is moved over to the right
side of the equation.
The problem becomes
easier to solve, with S instead of A. But
there is a price—the simpler system has to be solved over and over.
An iterative method is easy to invent. Just split A into
S— 7. Then Ax = b is
the same as
Sx =Tx +b.
The novelty is to solve (1) iteratively.
(1)
Each guess x; leads to the next x,41:
SXp41 = PX, +O.
(2)
Start with any xo. Then solve $x; = Tx9 +6. Continue to the second iteration Sx2 =
Tx; +5.
A hundred iterations are very common—maybe
more.
Stop when (and if!)
the new vector x;,+1 is sufficiently close to x,—or when the residual Ax; — 6 is near
zero. We can choose the stopping test. Our hope is to get near the true solution, more
quickly than by elimination. When the sequence x, converges, its limit x = Xoo does
solve equation (1). The proof is to let k — oo in equation (2).
The two goals of the splitting A =
S—T
are speed per step and fast convergence
of the x;. The speed of each step depends on S$ and the speed of convergence depends
on S pele
1
Equation (2) should be easy to solve for x,4,. The “preconditioner” S could be
diagonal or triangular. When its LU factorization is known, each iteration step
is fast.
2
The difference x —x, (this is the error e,) should go quickly to zero. Subtracting
equation (2) from (1) cancels b, and it leaves the error equation:
Sex41 = Tex which means
ey; = Se
eg
(3)
At every step the error is multiplied by S~'T. If S~'T is small, its powers go quickly
to zero. But what is “small”?
The extreme splitting is S = A and T =0. Then the first step of the iteration is
the original Ax = b. Convergence is perfect and S~'T is zero. But the cost of that
step is what we wanted to avoid. The choice of S is a battle between speed per step
(a simple S) and fast convergence (S close to A). Here are some popular choices:
J
GSS
S = diagonal part of A (the iteration is called Jacobi’s method)
= lower triangular part of A (Gauss-Seidel method)
9.3
Iterative Methods for Linear Algebra
SOR
S = combination of Jacobi and Gauss-Seidel (successive overrelaxation)
ILU
S = approximate L times approximate U (incomplete LU method).
411
Our first question is pure linear algebra: When do the x,’s converge to x? The
answer uncovers
the number
|A|max that controls convergence.
In examples of J and
GS and SOR, we will compute this “spectral radius” |A|max. It is the largest eigenvalue
of the iteration matrix S~'T.
The Spectral Radius Controls Convergence
Equation (3) is eg+; = S~'!Te,. Every iteration step multiplies the error by the same
matrix B = S~'T. The error after k steps is eg = B*eo. The error approaches zero
if the powers of B = S~'T approach zero. It is beautiful to see how the eigenvalues
of B—the largest eigenvalue in particular—control the matrix powers B*.
9C Convergence The powers B* approach zero if and only if every eigenvalue of
B satisfies |A| < 1. The rate of convergence is controlled by the spectral radius
|A.max:
The test for convergence is |A|max < 1. Real eigenvalues must lie between —1 and 1.
Complex eigenvalues 4 = a+ib must lie inside the unit circle in the complex plane. In
that case the absolute value |A| is the square root of a + b*—Chapter 10 will discuss
complex numbers. In every case the spectral radius is the largest distance from the
origin 0 to the eigenvalues A;,...,A,. Those are eigenvalues of the iteration matrix
Bsr Mk
To see why |Almax < 1 is necessary, suppose the starting error e9 happens to be
an eigenvector of B. After one step the error is Bep = eo. After k steps the error is
Bee = Akeo. If we start with an eigenvector, we continue with that eigenvector—and
it grows or decays with the powers A*. This factor A* goes to zero when
Since this condition is required of every eigenvalue, we need |A|max < l.
|A| <
1.
To see why |Al|max < 1 is sufficient for the error to approach zero, suppose eo is
a combination of eigenvectors:
€o9 =cyXi
+-+-+cenX,
leads to
ex = c1(A1)*x} feeef CH)
Xp.
(4)
This is the point of eigenvectors! They grow independently, each one controlled by
its eigenvalue. When we multiply by B, the eigenvector x; is multiplied by A;. If all
|A;| < 1 then equation (4) ensures that e, goes to zero.
Example 1 B = [-6:3] has Amex = 1.1
BY yO rs| Has Ae:
B27
Wi
times B. Then B®? is (1.1)* times B. The powers of B blow up. Contrast with the
powers of B’. The matrix (B’)* has (.6)* and (.5)* on its diagonal. The off-diagonal
entries also involve (.6)*, which sets the speed of convergence.
412
Chapter 9 Numerical Linear Algebra
Note There is a technical difficulty when B does not have n independent eigenvectors.
(To produce this effect in B’, change .5 to .6.) The starting error eg may not be a
combination of eigenvectors—there are too few for a basis. Then diagonalization is
impossible and equation (4) is not correct. We turn to the Jordan form:
B= Siss
and
Be =syrs*.
(5)
Section 6.6 shows how J and J* are made of “blocks” with one repeated eigenvalue:
et | k
ki i =
The powers of a 2 by 2 block are
Ak
0
ORK k-1
ak |;
If |A| < 1 then these powers approach zero. The extra factor k from a double eigenvalue is overwhelmed by the decreasing factor 4*~!. This applies to all Jordan blocks.
A larger block has k?A*~? in J*, which also approaches zero when |A| < 1.
If all |A| < 1 then J‘ -> 0. This proves Theorem
9C: Convergence requires
|Almax < 1.
Jacobi versus Seidel
We now solve a specific 2 by 2 problem.
The theory of iteration says that the key
number is the spectral radius of B = S -!T. Watch for that number
written o(B)—the Greek letter “rho” stands for the spectral radius:
2u-—-
v=
—u +2
4
;
has the solution
= —2
The first splitting is Jacobi’s method.
Uu
v
2
=
|Almax.
It is also
:
(6)
Keep the diagonal terms on the left side (this is
S). Move the off-diagonal part of A to the right side (this is 7). Then iterate:
2Uk+1 = UR +4
2Uk4+1 = Uz — 2.
Start the iteration from uo = vo = 0. The first step goes to uy = 2,v; = —1.
going:
Keep
fo] La} Do} Leva) [%] [ays] semems [0]
This shows convergence. At steps 1, 3, 5 the second component is —1, —1/4, —1/16.
The error is multiplied by 4 every two steps. So is the error in the first component.
The values 0, 3/2, 15/8 have errors 2, 5, 7. Those also drop by 4 in each two steps.
The error equation is Sex4, = Tex:
oly
Op
geal Oe
eet =| “gee
OF
alone
Sie
gc
(7)
9.3
Iterative Methods for Linear Algebra
413
That last matrix is S~'T. Its eigenvalues are 5 and —5. So its spectral radius is +:
B=S =i r=[{0
1
5
‘i has |Alma=
x 5 and
Le?
0
1
5
Zio
k | =|,
|.
Two steps multiply the error by | exactly, in this special example. The important message is this: Jacobi’s method works well when the main diagonal of A is large compared to the off-diagonal part. The diagonal part is S, the rest is —T. We want the
diagonal to dominate and S~!T to be small.
The eigenvalue A = 4 is unusually small. Ten iterations reduce the error by
2!0 — 1024. Twenty iterations reduce e by (1024)?. More typical and more expensive
1S) [Almas = 94 Or .999.The Gauss-Seidel method keeps the whole lower triangular part of A on the left
side as S:
2Uk+1
—UxK+1
+ 20K41
Uk+1
ie
=
+4
=
aed 2
Vet
Ue
+2
FUR
—
Nl
=
i
[ns
NI
Notice the change. The new u;z+, from the first equation is used immediately in the
second equation. With Jacobi, we saved the old ux until the whole step was complete.
With Gauss-Seidel, the new values enter right away and the old uz is destroyed. This
cuts the storage in half! It also speeds up the iteration (usually). And it costs no more
than the Jacobi method.
Starting from (0,6), the exact answer (2,0) is reached in one step. That is an
accident I did not expect. Test the iteration from another start up = 0 and vo = —1:
0
=|
ye
Sia
15/8
63/32
lett (16-9 |=1/64s|
h
2
Seen
Orla
The errors in the first component are 2, 1/2, 1/8, 1/32. The errors in the second component are —1, —1/4, —1/16, —1/32. We divide by 4 in one step not two steps. GaussSeidel is twice as fast as Jacobi.
This is true for every positive definite tridiagonal matrix; |A|max for Gauss-Seidel
is the square of |A|max for Jacobi. This holds in many other applications—but not
for every matrix. Anything is possible when A is strongly nonsymmetric—Jacobi
sometimes better, and both methods might fail. Our example is small:
Be
Siae na! iZ 4
erorel
—
(0) 1
E |
anda)
20
dE
—
is
0
[i
NIlm
The Gauss-Seidel eigenvalues are 0 and i. Compare with 5 and —4 for Jacobi.
With a small push we can explain the successive overrelaxation method (SOR). The
new idea is to introduce a parameter w (omega) into the iteration. Then choose this
number w@ to make the spectral radius of S~!T as small as possible.
414
Chapter 9 Numerical Linear Algebra
Rewrite Ax = b as wAx = wb. The matrix S in SOR has the diagonal of the
original A, but below the diagonal we use wA. The matrix T on the right side is
S—@WA:
2Uk+1
= (2 — 2w)uz +
—oug4, + 2Ug41 =
wuz + 4@
(9)
(2 — 2m)vz, — 20.
This looks more complicated to us, but the computer goes as fast as ever.
ux+1
from the first equation is used immediately to find vz;
This is like Gauss-Seidel,
Suede
with an adjustable number w.
cefi =e = i
Each new
in the second equation.
The key matrix is always
Ioo
2].
(10)
The determinant is (1—w)?. At the best w, both eigenvalues turn out to equal 7 Ay
which is close to (4). Therefore SOR is twice as fast as Gauss-Seidel in this example.
In other examples SOR can converge ten or a hundred times as fast.
I will put on record the most valuable test matrix of order n. It is our favorite
—1, 2, —1
tridiagonal matrix.
The diagonal i. 21.
Below
and above are —1’s.
example had n = 2, which leads to cos} = 5 as the Jacobi eigenvalue.
Our
(We found
that . above.) Notice especially that this eigenvalue is squared for Gauss-Seidel:
9D The splittings of the —1, 2, —1 matrix of order n yield these eigenvalues of B:
Jacobi (S = 0,2, 0 matrix):
So Tehes leet cos
:
n
Gauss-Seidel (S = —1, 2,0 matrix):
ST
SOR (with the best w):
(cos
S7!T has |Almax =
has a
oH
n
==
(cos
I
n+ 7.
2
a 7 /(: + sin
2
ah
n
2
F “) :
Let me be clear: For the —1, 2, —1 matrix you should not use any of these
iterations! Elimination is very fast (exact LU). Iterations are intended for large sparse
matrices—when a high percentage of the zero entries are “not good.” The not good
zeros are inside the band, which is wide. They become nonzero in the exact L and U,
which is why elimination becomes expensive.
We mention one more splitting. It is associated with the words “incomplete LU.”
The idea is to set the small nonzeros in L and U back to zero. This leaves triangular
matrices Lo and Up which are again sparse. That allows fast computations.
The splitting has S = LoUo on the left side. Each step is quick:
LoUoxr+1 = (A — LoUo)x, + b.
9.3
Iterative Methods for Linear Algebra
415
On the right side we do sparse matrix-vector multiplications. Don’t multiply Lo times Up—
those are matrices. Multiply x; by Uo and then multiply that vector by Lo. On the left side
we do forward and back substitutions. If LoUo is close to A, then |Almax is small. A
few iterations will give a close answer.
The difficulty with all four of these splittings is that a single large eigenvalue
in S~'T would spoil everything. There is a safer iteration—the conjugate gradient
method—which avoids this difficulty. Combined with a good preconditioner S (from
the splitting A = S—T), this produces one of the most popular and powerful algorithms
in numerical linear algebra.*
Iterative Methods for Eigenvalues
We move from Ax = b to Ax = Ax. Iterations are an option for linear equations.
They are a necessity for eigenvalue problems. The eigenvalues of an n by n matrix
are the roots of an nth degree polynomial. The determinant of A — AJ starts with
(—A)”". This book must not leave the impression that eigenvalues should be computed
from this polynomial. The determinant of A — AJ is a very poor approach—except
when n is small.
For n > 4 there is no formula to solve det(A — 4/) = 0. Worse than that, the
A’s can be very unstable and sensitive. It is much better to work with A itself, gradually making it diagonal or triangular. (Then the eigenvalues appear on the diagonal.)
Good computer codes are available in the LAPACK library—individual routines are free
on www.netlib.org. This library combines the earlier LINPACK and EISPACK, with
improvements. It is a collection of Fortran 77 programs for linear algebra on highperformance computers. (The email message send index from lapack brings information.) For your computer and mine, the same efficiency is achieved by high quality
matrix packages like MATLAB.
We will briefly discuss the power method and the QR method for computing
eigenvalues. It makes no sense to give full details of the codes.
1 Power methods and inverse power methods. Start with any vector up. Multiply by A to find w;. Multiply by A again to find u2. If uo is a combination of the
eigenvectors, then A multiplies each eigenvector x; by A;. After k steps we have (Ai)*:
uy = Akug = cy(A1)*x1 + +++ + Cn(An)*Xn-
(11)
As the power method continues, the largest eigenvalue begins to dominate. The vectors uz point toward that dominant eigenvector. We saw this for Markov matrices in
Chapter 8:
A=
E
«| has
Amax =1
é
with eigenvector
*Conjugate gradients are described in the author’s book
greater detail by Golub-Van Loan and by Trefethen-Bau.
/ntroduction
75
ek
to Applied Mathematics
and in
416 = Chapter9
Numerical Linear Algebra
Start with wg and multiply at every step by A:
Ud
iis
l
oe
ylheed Geel llsane
84
cy
:
;
is approaching
Ug =
)
95
|:
The speed of convergence depends on the ratio of the second largest eigenvalue 12
to the largest A,;. We don’t want 4; to be small, we want A2/A;
to be small.
Here
h2/d, = .6/1 and the speed is reasonable. For large matrices it often happens that
|A2/A1| is very close to 1. Then the power method is too slow.
Is there a way to find the smallest eigenvalue—which is often the most important
in applications? Yes, by the inverse power method: Multiply uo by Aq! instead of A.
Since we never want to compute A~!, we actually solve Au; = uo. By saving the LU
factors, the next step Au2 = uw, 1s fast. Eventually
|
=
Kuo =
C\ xX]
sae
CnXn
(12)
Now the smallest eigenvalue Amin 1s in control. When it is very small, the factor WVpees|
is large. For high speed, we make Amin even smaller by shifting the matrix to
A—A*/.
If 4* is close to Amin then A—A*J has the very small eigenvalue Amin—A*. Each shifted
inverse power step divides the eigenvector by this number, and that eigenvector quickly
dominates.
2 The QR Method This is a major achievement in numerical linear algebra.
Fifty
years ago, eigenvalue computations were slow and inaccurate. We didn’t even realize
that solving det(A —A/) = O was a terrible method. Jacobi had suggested earlier that A
should gradually be made triangular—then the eigenvalues appear automatically on the
diagonal. He used 2 by 2 rotations to produce off-diagonal zeros. (Unfortunately the
previous zeros can become nonzero again. But Jacobi’s method made a partial comeback with parallel computers.) At present the QR method is the leader in eigenvalue
computations and we describe it briefly.
The basic step is to factor A, whose eigenvalues we want, into QR. Remember
from Gram-Schmidt (Section 4.4) that Q has orthonormal columns and R is triangular.
For eigenvalues the key idea is: Reverse Q and R. The new matrix is RQ.
A, = RQ is similar to A = QR, the eigenvalues are not changed:
QORx =ax
This process continues.
gives
RO(Q~'x) =A(Q7'x).
Factor the new matrix A; into Q;R,.
Since
(13)
Then reverse the factors
to Rj Q . This is the next matrix A2, and again no change in the eigenvalues. Amazingly, those eigenvalues begin to show up on the diagonal. Often the last entry of A4
holds an accurate eigenvalue. In that case we remove the last row and column and
continue with a smaller matrix to find the next eigenvalue.
Two extra ideas make this method a success. One is to shift the matrix by a
multiple of 7, before factoring into QR. Then RQ is shifted back:
Factor Ay; —cxl
into
Qg Ry.
The next matrix is Agy, = ReQe
+ cyl.
9.3
Iterative Methods for Linear Algebra
417
Ax+1 has the same eigenvalues as Az, and the same as the original Ag = A. A good
shift chooses c near an (unknown) eigenvalue. That eigenvalue appears more accurately
on the diagonal of Ax;;—which
tells us a better c for the next step to Ago.
The other idea is to obtain off-diagonal zeros before the QR method starts. Change
A to the similar matrix L~'AL (no change in the eigenvalues):
1
LOAL =
1522735): [al
1
(eA 5
seer Dil ree
1
ivy
i5*.3
ie OL aS
Orden?
4
L~' subtracted row 2 from row 3 to produce the zero in column 1. Then L added
column 3 to column 2 and left the zero alone. If I try for another zero (too ambitious),
I will fail. Subtracting row 1 from row 2 produces a zero. But now L adds column 2
to column | and destroys it.
We must leave those nonzeros | and 4 along one subdiagonal. This is a “Hessenberg matrix”, which is reachable in a fixed number of steps. The zeros in the lower
left corner will stay zero through the QR method. The operation count for each QR
factorization drops from O(n3) to O(n’).
Golub and Van Loan give this example of one shifted QR step on a Hessenberg
matrix A. The shittis (civ 7 f:
Alga
|e
3
Ae d
6
One
001
leads toy
=:541.69
DL ALG oo
0
00002
Aq =
0.835
—6.656
7.012
Factoring A —7/ into QR produced Ay = RQ+7/. Notice the very small number
.00002. The diagonal entry 7.012 is almost an exact eigenvalue of Aj, and therefore
of A. Another QR step with shift by 7.0127 would give terrific accuracy.
Problem Set 9.3
Problems
1-12 are about iterative methods
for Ax = b.
1
Change Ax = b to x = (J—A)x+b. What are S and T for this splitting? What
matrix S~!T controls the convergence of xx41 = (J — A)x, + b?
Z
If 4 is an eigenvalue of A, then ___ is an eigenvalue of B = ] — A. The real
eigenvalues of B have absolute value less than | if the real eigenvalues of A lie
and
_
between
3
Show why the iteration x41 = ([—A)xx +b does not converge for A = [_7 74].
4
Why is the norm
the powers
||BI.
of B* never larger than
| Bk? Then
B* approach zero (convergence).
||B|| < 1 guarantees
that
No surprise since |A|max is below
418
Chapter 9 Numerical Linear Algebra
If A is singular then all splittings A = S—T must fail. From Ax = 0 show that
S-'Tx =x. So this matrix B = S~'T has A = 1 and fails.
Change the 2’s to 3’s and find the eigenvalues of S~'T for Jacobi’s method:
SxXp41 =Tx, +)
Find the eigenvalues of
3.
is
Out
i ax, +b.
S~'T for the Gauss-Seidel method applied to Problem 6:
ew)
ke
see
Does
0
li |Lea
2,
|A|max for Gauss-Seidel equal |A| ay
QO 1
|6 a [ae+e.
Lor Jacobi?
For any 2 by 2 matrix [2%] show that |Almax equals |bc/ad| for Gauss-Seidel
and |bc/ad|'/? for Jacobi. We need ad #0 for the matrix S to be invertible.
The best w produces two equal eigenvalues for S~!'T in the SOR method. Those
eigenvalues are w—1 because the determinant is (w—1)*. Set the trace in equation (10) equal to (w — 1) + (w — 1) and find this optimal .
10
Write a computer code (MATLAB
or other) for the Gauss-Seidel
method.
You
can define S and T from A, or set up the iteration loop directly from the entries a;;.
Test it on the —1, 2, —1
matrices
A of order
10, 20, 50 with b =
CLG
an i0),
11
The Gauss-Seidel iteration at component 7 is
If every x)" = x4 how does this show that the solution x is correct?
does the formula change for Jacobi’s method?
parentheses.
12
A is the —1, 2, —1 matrix of order n = 10.
Divide equation (11) by ae and explain why |A2/A1| controls the convergence of
the power method.
verge.
14
How
insert @ outside the
The SOR splitting matrix S is the same as for Gauss-Seidel except that the diagonal is divided by w. Write a program for SOR on an n by n matrix. Apply
it with w = 1, 1.4, 1.8, 2.2 when
13
For SOR
Construct a matrix A for which this method does not con-
The Markov matrix A = [-?-3] has A = 1 and .6, and the power method uz =
A*uo converges to [-32]. Find the eigenvectors of A~!. What does the inverse
power method u_,; = A~*uo converge to (after you multiply by .6*)?
9.3
15
Iterative Methods for Linear Algebra
419
Show that the n by n matrix with diagonals —1, 2, —1 has the eigenvector x; =
(sin -77> sin aT a 8i ae,
rind the eigenvalue A; by multiplying Ax).
Note: For the other eigenvectors and eigenvalues of this matrix, change 7 to
jm in x; and Aj.
16
For A =[_%~!] apply the power method uz+; = Au, three times starting with
1 What eigenvector is the power method converging to?
uy = leat
17
In Problem 11 apply the inverse power method uz, = A~!u, three times with
the same uo. What eigenvector are the u,x’s approaching?
18
In the QR method for eigenvalues, show that the 2,1 entry drops from sin@ in
A = OR to —sin*@ in RQ. (Compute R and RQ.) This “cubic convergence”
makes the method a success:
cos@
ee
sing
0 oe
cos@
et
—sin@
Il
a
eaallta ae
19
If A is an orthogonal matrix, its QR factorization has Q =
and ht
. Therefore RQ =
. These are among the rare examples when the
QR method fails.
20
The shifted QR method factors A — c/ into QR. Show
A; = RQ+cl equals Q-'AQ. Therefore A; has the
(but is closer to triangular).
21
When A = A!, the “Lanczos method” finds a’s and b’s and orthonormal q’s so
that Ag; = bj-194)-1 + 4jq; + bjaj+1 (with qo = 0). Multiply by q} to find a
that the next matrix
eigenvalues as A
formula for a;. The equation says that AQ = QT where T is a
22
matrix.
The equation in Problem 21 develops from this loop with bo = | and rp = any q:
Gis = 1)/0j,
= J + le G4; Ag, ry = Aq] — 0;gj
— qj.) — I
Write a computer program. Test on the —1, 2, —1 matrix A. Q'Q should be /.
23
Suppose A is tridiagonal and symmetric in the QR method. From A, = O-A0
show that A, is symmetric. Then change Then change to Ay = RAR™! and
show that A, is also tridiagonal. (If the lower part of A, is proved tridiagonal
then by symmetry the upper part is too.) Symmetric tridiagonal matrices are at
the heart of the QR method.
Questions 24—26 are about quick ways to estimate the location of the eigenvalues.
24
If the sum of |a;;| along every row is less than 1, prove that |A| < 1. (If |x;|
is larger than the other components of x, why is |2a;;x;| less than |x;|? That
means
|Ax;| < |x;| so |A| < 1.)
420
Chapter 9 Numerical Linear Algebra
(Gershgorin circles) Every eigenvalue of A is in a circle centered at a diagonal
entry aj; with radius
rj = Lj -+4i\a;;| . This follows from (A —ajj)x; =
If |x;| is larger than the other components
viding by |x;| leaves |A — ajj| < rj.
25
What bound on |A|max does Problem
Xj4iaijX;-
of x, this sum is at most r;|x;|. Di-
24 give for these matrices?
What
are the
three Gershgorin circles that contain all the eigenvalues?
Ass
26
seh oe
[see
ie BAD
2-1
Aaa
0
wi2iel
Ost
gaz
These matrices are diagonally dominant because each aj; > rj = absolute sum
along the rest of row 7. From the Gershgorin circles containing all 4’s, show that
diagonally dominant matrices are invertible.
A=
|
3/5"
a>
aes eat
A, =|
AO. + all
4
"1
a
2)
3s
a
10
COMPLEX VECTORS AND
MATRICES
COMPLEX
NUMBERS a 10.1
A complete theory of linear algebra must include complex numbers. Even when the
matrix is real, the eigenvalues and eigenvectors are often complex. Example: A 2 by
2 rotation matrix has no real eigenvectors. Every vector turns by 6—the direction is
changed. But there are complex eigenvectors (1,i) and (1, —i). The eigenvalues are
also complex numbers e’? and e~'°. If we insist on staying with real numbers, the
theory of eigenvalues will be left in midair.
The second reason for allowing complex numbers goes beyond 4 and x to the
matrix A. The matrix itself may be complex. We will devote a whole section to the
most important example—the Fourier matrix. Engineering and science and music and
economics all use Fourier series. In reality the series is finite, not infinite. Computing
the coefficients in cye’* + cpe?2* +... +c,e!™
is a linear algebra problem.
This section gives the main facts about complex numbers. It is a review for some
students and a reference for everyone. The underlying fact is that i? =—1. Everything
comes from that. We will get as far as the amazing formula e77! = 1.
Adding and Multiplying Complex Numbers
Start with the imaginary number 7. Everybody knows that x? = —1 has no real solution. When you square a real number, the answer is never negative. So the world has
agreed on a solution called i. (Except that electrical engineers call it j.) Imaginary
numbers follow the normal rules of addition and multiplication, with one difference.
Whenever i* appears it is replaced by —1.
421
422
Chapter 10
Complex Vectors and Matrices
10A A complex number (say 3 + 2i) is the sum of a real number (3) and a pure
imaginary number (2i). Addition keeps the real and imaginary parts separate. Multiplication uses i? = —1:
Add:
(3+2i)+(3+2i)
=64+4i
Multiply: (3 + 2i)(1 — i) =3 + 2i — 3i — 217 =5 -i.
If I add 342i to 1—i, the answer is 4+i. The real numbers 3+ 1 stay separate from
the imaginary numbers 2i — i. We are adding the vectors (3,2) and (1, —1).
The number (1 — i)* is 1 — i times 1 — i. The rules give the surprising answer —2i:
G=)NC 2s
l= Se
In the complex plane, 1 —i is at an angle of —45°. When we square it to get —2i,
the angle doubles to —90°. If we square again, the answer is (—2i)* = —4. The —90°
angle has become —180°, which is the direction of a negative real number.
A real number is just a complex number z = a + bi, with zero imaginary part:
b =0. A pure imaginary number has a = 0:
The real part is
a=Re
(a+bi).
The imaginary part is
b=Im
(a+Di).
The Complex Plane
Complex numbers correspond to points in a plane. Real numbers go along the x axis.
Pure imaginary numbers are on the y axis. The complex number 3+2i is at the point
with coordinates
(3,2). The number zero, which is 0+ Oi, is at the origin.
Adding and subtracting complex numbers is like adding and subtracting vectors in
the plane. The real component stays separate from the imaginary component. The vectors
go head-to-tail as usual. The complex plane C! is like the ordinary two-dimensional plane
R’, except that we multiply complex numbers and we didn’t multiply vectors.
Now comes an important idea. The complex conjugate of 3+2i is 3—2i. The
complex conjugate of z= 1—i is z= 1+. In general the conjugate of z=a+bi is
Z =a-—bi. (Notice the “bar” on the number to indicate the conjugate.) The imaginary
parts of z and “z bar” have opposite signs. In the complex plane, Z is the image of z
on the other side of the real axis.
Two useful facts. When we multiply conjugates Zz, and Z2, we get the conjugate
of 2122. When we add Z; and Z2, we get the conjugate of z; + z2:
m+
22 = (3 —21)
Zi <p
+ 1 +1)
= (b= 21) OG
21)
=4—1
=
5
te
conjugate of 7) + m:
Conpiigate Of
21>.
Adding and multiplying is exactly what linear algebra needs. By taking conjugates of
Ax = Ax, when A is real, we have another eigenvalue A and its eigenvector X:
If Ax = Ax and A is real then Ax = ix.
(1)
10.1
Complex Numbers
423
A Imaginary
axis
Complex
plane
z
Di ae
Z=3-+21
UL
r = |z| = 32 +22
l
t
Figure 10.1.
z=a+bi
+———_}+—>
Real axis
corresponds to the point (a,b) and the vector Fel
Something special happens when z = 3+ 2i combines with its own complex conjugate
Z=3-—2i. The result from adding
z+zZ or multiplying zz is always real:
(34+2i)+(3—2i)=6
(3+2i) x 3—2i)
(real)
=9+6i
— 6i — 4i* = 13
(real).
The sum of z = a+bi and its conjugate 7 = a—bi is the real number 2a. The product
of z times Z is the real number a? + b?:
zz= (a+ bi)(a — bi) =a* +b’.
(2)
The next step with complex numbers is division. The best idea is to multiply the denominator by its conjugate to produce a* + b? which is real:
Ne
at+tib
Ey
es eaee 2
at+iba-—ib
a?+b?
ipa
342i
i
ey es)
34213-2i
13
In case a? + b* = 1, this says that (a + ib)—! is a—ib. On the unit circle, 1/z is Z.
Later we will say: 1/e!° is e~'® (the conjugate). A better way to multiply and divide
is to use the polar form with distance r and angle 6.
The Polar Form
The square root of a’ +b? is |z|. This is the absolute value (or modulus) of the number
z=a+ib. The same square root is also written r, because it is the distance from 0
to the complex number.
The number r in the polar form gives the size of z:
The absolute value of
z=a+ib
is
|z] = Va2+b2.
This is also called r.
The absolute value of
z=3+2i
is
|z| = V¥32+22.
This is
r= V13.
The other part of the polar form is the angle 6. The angle for
z= 5 is 6 = 0 (because
this z is real and positive). The angle for z = 37 is 2/2 radians.
The angle for z = —9
424
Chapter 10
Complex Vectors and Matrices
21
Z=34+2i
S
N
SS
f
SS
i
~
N
a
N
2+2=6
N
>
+
4
—}+——~+—»
5
6
Real axis
complex conjugate of z
=)
Figure 10.2
Z=3-2i
Conjugates give the mirror image of the previous figure: z+ Z is real.
is a radians. The angle doubles when the number is squared. This is one reason why
the polar form is good for multiplying complex numbers (not so good for addition).
When the distance is r and the angle is 9, trigonometry gives the other two sides
of the triangle. The real part (along the bottom) is a = rcos@. The imaginary part
(up or down) is b=rsin@. Put those together, and the rectangular form becomes the
polar form:
The
Note:
number
cos@ +isin@
Thus cos@ +isin@
z=a+ib
z=rcos@+irsin#.
always has absolute value r = 1 because cos? 6 + sin?6 = 1.
lies on the circle of radius
Example 1 Find r and 6 for
Solution
is also
1—the
unit circle.
z= 1-+#i and also for the conjugate jz = 1 —i.
The absolute value is the same for z and Z. Here it is
iZeet 41 =o.
andalsol
r= /1+1
sit = 1 ee
= 2:
= 2
The distance from the center is 2. What about the angle? The number | +7 is at
the point (1, 1) in the complex plane. The angle to that point is 7/4 radians or 45°.
The cosine is 1//2 and the sine is 1/./2. Combining r and 6 brings back
z= 1+:
rcos@ +irsiné
I
|
¥3(=) +iv3(—)
:
erie
The angle to the conjugate 1 —i can be positive or negative. We can go to 77/4
radians which is 315°. Or we can go backwards through a negative angle, to —1/4
radians or —45°. If z is at angle 0, its conjugate Z is at 2x — 0 and also at —0.
We can freely add 27 or 4m or —2z7 to any angle! Those go full circles so
the final point is the same. This explains why there are infinitely many choices of 6.
10.1.
Complex Numbers
425
Often we select the angle between zero and 27 radians. But —@ is very useful for the
conjugate Z.
Powers and Products:
Polar Form
Computing (1 +7)? and (1+i)® is quickest in polar form. That form has r = /2 and
6 = 1/4 (or 45°). If we square the absolute value to get r? = 2, and double the angle
to get 20 = 2/2 (or 90°), we have (1+i)*.
Fo = Loe)
For the eighth power we need r® and 86:
eee and 80 = 8. — = 2n.
This means: (1 +i)® has absolute value 16 and angle 27. The eighth power of 1+i
is the real number 16.
Powers are easy in polar form. So is multiplication of complex numbers.
10B The polar form of z” has absolute value r”. The angle is n times 0:
The nth power of
z=r(cos@?+isin@)
is
z" n =r"(cosn@6 +isinné).
(3)
In that case z multiplies itself. In all cases, multiply r’s and add angles:
r(cos@ +i sin@) times r'(cos 6’ +i sin 6’) = rr'(cos(6 + 6’) +isin(@+6’)).
One way to understand this is by trigonometry.
get the double angle 2 for z??
(cos
Concentrate on angles.
(4)
Why do we
+i sin@) x (cos@é +isin@) = cos’6 + i? sin’6 + 2i sin6@ cos 0.
The real part cos? 0 —sin? @ is cos20. The imaginary part 2sin@ cos@ is sin26. Those
are the “double angle” formulas. They show that @ in z becomes 26 in in z.
When
the angles @ and 6’ are different, use the “addition formulas”
instead:
(cos6 +i sin@)(cos 6’ +i sin@’) =
[cos 6 cos 6’ — sin@ sin 6’] + i[sin@ cos 6’ + cos6@sin 6’].
In those brackets, trigonometry sees the cosine and sine of 6+6’. This confirms equation (4), that angles add when you multiply complex numbers.
There is a second way to understand the rule for z”. It uses the only amazing
formula in this section. Remember that cos @+i sin@ has absolute value 1. The cosine
is made up of even powers, starting with | — 50°. The sine is made up of odd powers,
starting with 6 — 70°, The beautiful fact is that e? combines both of those series into
cos@ +i sin@:
e=14+x+
|
|
ae - ral +--+.
becomes
;
e’?
3
ges
PL
=1+i0+ Wigs ce Zio" As
426
Chapter 10
Complex Vectors and Matrices
(e!I?
etl 6
ean! 6
el2ni/6
_ ¢2M — |
e8ti/6
Figure 10.3.
e 10ni/6
(a) Multiplying e’” times e!% (b) The nth power of e27'/" is e27! = 1.
Write —1 for i*. The real part 1 — 50° +--+
is exactly cos@.
The imaginary part
6 — 40° +.-- is sin@. The whole right side is cos@ + i sin@:
Euler’s Formula
e'? =cosé +isiné.
The special choice 6 = 27 gives cos2m +isin22
series e27' = 1 4+2ni + 5(2mi)* +.---
Now multiply e’” times ef
(5)
which is 1. Somehow the infinite
adds up to 1.
Angles add for the same reason that exponents add:
e? times e° is e° because (e)(e) x (e)(e)(e) = (e)(e)(e)(e)(e)
e’”’id
times
e’”’Ca is
e 210
e'” times e!?QC: is
ei (G+0’)
(6)
Every complex number a+ib =rcos@+irsin@ now goes into its best possible form.
That form is re!?.
The powers (re!’)" are equal to r”e!"°. They stay on the unit circle when r = 1
and r” = 1. Then we find n different numbers whose nth powers equal 1:
Set w=e 2xi/”
The nth powers of 1, w,w*,..., w"~! all equal 1.
Those are the “nth roots of 1.” They solve the equation z” = 1. They are equally
spaced around the unit circle in Figure 10.3b, where the full 27 is divided by n. Multiply their angles by n to take nth powers. That gives w” = e?”! which is 1. Also
(w?)” = e*”' = 1. Each of those numbers, to the nth power, comes around the unit
circle to 17
These roots of | are the key numbers for signal processing. A real digital computer uses only 0 and 1. The complex Fourier transform uses w and its powers. The
last section of the book shows how to decompose a vector (a signal) into n frequencies
by the Fast Fourier Transform.
10.1.
Complex Numbers
427
Problem Set 10.1
Questions 1-8 are about operations on complex numbers.
1
Add and multiply each pair of complex numbers:
(a)
cD
(b)
—1l+i,—-1+i
(c)
cos@+isin@,cos@
—isin@
Locate these points on the complex plane. Simplify them if necessary:
@)
3
2+1,2-i1
2+i
©
@+i?
©
Fy
@
[+i
Find the absolute value r = |z| of these four numbers.
If @ is the angle for 6—8i,
what are the angles for the other three numbers?
(Qa Ol Sit (bmn (Giaany
4
eC),
Game
If |z| = 2 and |w|
=3 then |z x w| =
mg
B(OReEg
and |z+w|<
and |z/w| =
and |z— w| <
5
Find a + ib for the numbers
at angles 30°, 60°, 90°, 120° on the unit circle.
If
w is the number at 30°, check that w? is at 60°. What power of w equals 1?
6
If z=rcos@+irsin@ then |/z has absolute value
polar form is
. Multiply z x 1/z to get 1.
and angle
wits
vf
The 1 by 1 complex multiplication M = (a+ bi)(c + di) 1s a 2 by 2 real multi-
BTL
The right side contains the real and imaginary parts of M. Test
8
A = A, +iA2 is a complex n by n matrix and b = b; +ib2 is a complex vector.
The solution to Ax = b is x;
+ix2. Write Ax = b as a real system of size 2n:
L
dleJ=[a}
Questions 9-16 are about the conjugate
at+ib=re'®,
9
M = (1+ 31)(1 — 31).
7 = a — ib = re~ ‘@
of the number z =
Write down the complex conjugate of each number by changing i to —i:
(€)
ey |
(dd)
e'*™=-1
(b)
<2=i1)01=1)
(ce)
1
(c)
e!/2 (which is i)
Wwhichisalsoi)
(f)
i? =
428
Chapter 10
Complex Vectors and Matrices
10
The sum z+Z is always
. The difference z—Z is always
. Assume
z #0. The product z x Z is always
. The ratio z/z always has absolute
value
11
For a real 3 by 3 matrix, the numbers a2, a), a9 from the determinant are real:
det(A — AI) = —A3 +.and? +.aja + ap = 0.
Each root A is an eigenvalue. Taking conjugates gives SP
so A is also an eigenvalue.
ash Path
ane:
For the matrix with a;; = i—j, find det(A —A/) and
the three eigenvalues.
Note The conjugate of Ax = Ax is A¥ = AX. This proves two things: 2 is an
eigenvalue and xX is its eigenvector.
value.
12
Problem
11 only proves that A is an eigen-
The eigenvalues of a real 2 by 2 matrix come from the quadratic formula:
a-ixz
Cc
b
=22 =a
d—-ix
ant
(ad ~b0)
=0
gives the two eigenvalues (notice the + symbol):
\,=
a+d+/(a+d)?
—4(ad — bc)
fag ge oe
ee
(a)
If
a=b=d =1, the eigenvalues are complex when c is
(b)
What are the eigenvalues when ad = bc?
(c)
The two eigenvalues (plus sign and minus sign) are not always conjugates
of each other. Why not?
13
In Problem 12 the eigenvalues are not real when (trace)? = (a + d)* is smaller
than
. Show that the 2’s are real when bc > 0.
14
Find the eigenvalues and eigenvectors of this permutation matrix:
0
=
has
det(Py —AI) =
/—
S] Sis
SSS)
SS et
SQ
15
Extend P4 above to P¢ (five 1’s below the diagonal and one in the corner). Find
det(Ps — Al) and the six eigenvalues in the complex plane.
16
A real skew-symmetric matrix (A? = —A) has pure imaginary eigenvalues. First
proof: If Ax = Ax then block multiplication gives
Ea olli)=9 [2]
This block matrix is symmetric.
Its eigenvalues must be
! So A is
10.1.
Complex Numbers
429
Questions 17-24 are about the form re’? of the complex number r cos6+ir
sin 6.
az
Write these numbers in Euler’s form re!’. Then square each number:
(aye
18
P=
SP>
(6)
cos 20777 sin 20"
(Cc)
—7F
Ges ar
Find the absolute value and the angle for z = sin@+icos@
(careful). Locate this
z in the complex plane. Multiply z by cos@ + isin@ to get
19
Draw all eight solutions of z® = 1 in the complex plane. What are the rectangular
forms a +ib of these eight numbers?
20
Locate the cube roots of 1 in the complex plane. Locate the cube roots of —1.
Together these are the sixth roots of —
2a
By comparing
e9
— cos 30 +isin36
with (e'?)? =
(cos@ +isiné)%,
find the
“triple angle” formulas for cos3@ and sin36@ in terms of cos@ and sin@.
22
Suppose the conjugate z is equal to the reciprocal |/z. What are all possible z’s?
23
(a)
Why do e! and i° both have absolute value 1?
(b)
In the complex plane put stars near the points e’ and i°.
(c)
The number i© could be (e in/2)e
24
or
(er! /2)e |
Are those equal?
Draw the paths of these numbers from ¢ = 0 to t = 27 in the complex plane:
(a)
elt
(b)
elt
=
e tell
(c)
G1)!
=
eft
430
Chapter 10
Complex Vectors and Matrices
HERMITIAN
AND
UNITARY MATRICES
m 10.2
The main message of this section can be presented in the first sentence: When you
transpose a complex vector z or a matrix A, take the complex conjugate too. Don’t
stop at z' or A’. Reverse the signs of all imaginary parts. Starting from a column
vector with components z; = aj +ibj;, the good row vector is the conjugate transpose
with components a; — 1b;:
Zs
|e SS ae he
ee
(1)
Here is one reason to go to z. The length squared of a real vector is x74. g -+x2, The
length squared of a complex vector is not zt: : +2, With that wrong definition, the
length of (1,1) would be 1* +i? =0. A nonzero vector would have zero length—not
good. Other vectors would have complex lengths. Instead of (a+bi)* we want a?+b?,
the absolute value squared. This is (a + bi) times (a — bi).
For each component we want z; times Z;, which is Iz|? = a*+b*.
when the components of z multiply the components of 2:
That comes
F65\
[z1 «+» Zw) | eo) Sal? totes
‘This iss! z'z=Yz\/?.
(2)
Now the squared length of (1,i) is 17+ |i]? =2. The length is 2 and not zero. The
squared length of (1+i, 1—i) is 4. The only vectors with zero length are zero vectors.
DEFINITION
The length ||z|| is the square rootof \\z\|? =Z'z = |z|* +--+: + zl’.
Before going further we replace two symbols by one symbol. Instead of a bar
for the conjugate and T for the transpose, we just use a superscript H. Thus z! = z4,
This is “z Hermitian,” the conjugate transpose of z. The new word is pronounced
““Hermeeshan.” The new symbol applies also to matrices: The conjugate transpose of
a matrix A is AH,
:
Notation
mae
. —T
The vector z@ is z'. The matrix AM is A —the conjugate transpose of A:
If A=
j
51 ney
then AH —
1
0
|= “A Hermitian”
Complex Inner Products
For real vectors, the length squared is x'x—the inner product of x with itself. For
complex vectors, the length squared is z4z. It will be very desirable if this is the
10.2
Hermitian and Unitary Matrices
431
inner product of z with itself. To make that happen, the complex inner product should
use the conjugate transpose (not just the transpose). There will be no effect when the
vectors are real, but there is a definite effect when they are complex:
DEFINITION
The inner product of real or complex vectors u and v is uv:
v1
H
uv
—
Sais
vee
Tn |
2
> |=
a
Wyvy
=
+--+ + Unvn.
(3)
With complex vectors, uy is different from vu. The order of the vectors is now
important. In fact vlu = Diu, +--- + V_uy is the complex conjugate of uly. We
have to put up with a few inconveniences for the greater good.
Example 1 The inner product of u = i with v = |
| is [1
Example 2 The inner product of u =
ES Ue
0
pie.
with v =
ae
a
Pe Wea
i
a
i)
Ph
NOL IE
:
Example | is surprising. Those vectors (1,7) and (i, 1) don’t look perpendicular. But
they are. A zero inner product still means that the (complex) vectors are orthogonal.
Similarly the vector (1,7) is orthogonal
to the vector (1, —i).
Their inner product is
1 —1=0. We are correctly getting zero for the inner product—where we would be
incorrectly getting zero for the length of (1,7) if we forgot to take the conjugate.
Note We have chosen to conjugate the first vector u. Some authors choose the second
vector v. Their complex inner product would be u'v. It is a free choice, as long as we
stick to one or the other. We wanted to use the single symbol © in the next formula
too:
The inner product of Au with v equals the inner product of u with A™v:
(Au) y= ul (Ad).
(4)
The conjugate of Au is Au. Transposing it gives w'A’ as usual. This is wW4AM. Eyerything that should work, does work. The rule for H comes from the rule for '. That
applies to products of matrices:
10C The conjugate transpose of AB is (AB)# = BHA,
We are constantly using the fact that a — ib times c — id is the conjugate of a + ib
times c+ id.
432
Chapter 10
Complex Vectors and Matrices
Among real matrices, the symmetric matrices form the most important special
class: A = A‘. They have real eigenvalues and a full set of orthogonal eigenvectors.
The diagonalizing matrix § is an orthogonal matrix Q. Every symmetric matrix can
be written as A = QAQ7! and also as A = QAQ! (because Q~! = Q'). All this
follows from aj; = aj;, when A is real.
Among complex matrices, the special class consists of the Hermitian matrices:
A = A". The condition on the entries is now ajj = ajj. In this case we say that “A
is Hermitian.” Every real symmetric matrix is Hermitian, because taking its conjugate
has no effect. The next matrix is also Hermitian:
Pie
2
3 —3i
The main diagonal is real since aj; = Gjj.
~
{343i
5
Across it are conjugates 3 + 3i and 3 — 3i.
This example will illustrate the three crucial properties of all Hermitian matrices.
10D If A = A# and z is any vector, the number z" Az is real.
Quick proof: z4Az is certainly 1 by 1. Take its conjugate transpose:
(zHaz)yH = gH ah(gH)H
which is zAz again.
Reversing the order has produced the same 1 by | matrix (this used A = AM!) For 1
by 1 matrices, the conjugate transpose is simply the conjugate. So the number z4 Az
equals its conjugate and must be real. Here is z#Az in our example:
Patient
[z1 A
2
re
3—3i || z
5
=
n
nas
a
Aa = 27121 + 52222 + G — 31)Ziz2 + G + 31) 2122.
The terms 2|z;|? and 5|z2|? from the diagonal are both real.
The off-diagonal terms
are conjugates of each other—so their sum is real. (The imaginary parts cancel when
we add.) The whole expression z# Az is real.
10E Every eigenvalue of a Hermitian matrix is real.
Proof Suppose Az = iz. Multiply both sides by z4 to get z#Az = az4z. On the
left side, z4Az is real by 10D. On the right side, zz is the length squared, real and
positive. So the ratio 4 = z4#Az/zHz is a real number. Q.E.D.
The example above has real eigenvalues
4 = 8 and A = —1. Take the determinant of
A —AXI to get (d — 8)(d + 1):
De
|
Sie ate
eas
pe
eee
er
= AP I
Secrecy
10 18 =, Oe 8 ues
10.2
Hermitian and Unitary Matrices
433
10F The eigenvectors of a Hermitian matrix are orthogonal (provided they correspond to different eigenvalues). If Az = Az and Ay = By then y#z = 0.
Proof Multiply Az = Az on the left by y#. Multiply y444 = By on the right by z:
Vode — hye
and: yo ANT
byez,
(5)
The left sides are equal because A = A". Therefore the right sides are equal. Since B
is different from 4, the other factor yz must be zero. The eigenvectors are orthogonal,
as in the example with 4 = 8 and 6 = —1:
of
—6
B= Sb | ezialey
(0
3
i) ofl fo tg Weal
WG)
amelie eS)
Aine
50
Ss
2a
ledieh
auc
~
l
ofS]
c=] t5|
say 1at
Take the inner product of those eigenvectors y and z:
A Bha
oi [1 +i oo 1] fa! i ==)
o
ejg
(orthogonal
eigenvectors).
These eigenvectors have squared length 17 + 17 + 1* = 3. After division by V3 they
are unit vectors. They were orthogonal, now they are orthonormal. They go into the
columns of the eigenvector matrix S, which diagonalizes A.
When A is real and symmetric, it has real orthogonal eigenvectors. Then S is
Q—an orthogonal matrix. Now A is complex and Hermitian. Its eigenvectors are complex and orthonormal. The eigenvector matrix S is like Q, but complex. We now
assign a new name and a new letter to a complex orthogonal matrix.
Unitary Matrices
A unitary matrix is a (complex) square matrix with orthonormal columns. It is denoted by U—the complex equivalent of Q. The eigenvectors above, divided by V3 to
become unit vectors, are a perfect example:
Us
~ nd
Valitifa
ood!
-1
is a unitary
‘ matrix
This U is also a Hermitian matrix. I didn’t expect that! The example is almost too
perfect. Its second column could be multiplied by —1, or even by 7, and the matrix of
eigenvectors would still be unitary:
U = —
ag
=
wae
oa | is also a unitary matrix.
I
434
Chapter 10
Complex Vectors and Matrices
e2ni/3
1
Fourier
nant
:
1
oe V3
:
1
1
1
eon! 3
eft! 3
eAti/3
e2ti/3
e4ni/3
Figure 10.4
The cube roots of 1 go into the Fourier matrix F = F3.
The matrix test for real orthonormal columns was O'Q = J. When Q! multiplies Q,
the zero inner products appear off the diagonal. In the complex case, Q becomes U
and the symbol ' becomes 4. The columns show themselves as orthonormal when U a
multiplies U. The inner products of the columns are again | and 0, and they fill up
UA =
10G
The matrix U has orthonormal columns when
If U is square, it is a unitary matrix. Then
lebanese1
vu
=|
1
sh
1-i| fae1 5
1
Sulealn
U®U = I.
U4 = U7},
1-i
mM 0
=
let
a“
(6)
Suppose U (with orthogonal column) multiplies any z. The vector length stays
the same, because zZ4U4Uz = z4z. If z is an eigenvector, we learn something more:
The eigenvalues of unitary (and orthogonal) matrices all have absolute value |i\ = 1.
10H
Jf U is unitary then
\\Uz|| = ||z\|.
Therefore Uz = Xz leads to
|A| = 1.
Our 2 by 2 example is both Hermitian (U = U®) and unitary (U-! = U4). That
means real eigenvalues (A = A), and it means absolute value one (A~! = A). A real
number with absolute value 1 has only two possibilities: The eigenvalues are 1 or —1.
One thing more about the example: The diagonal of U adds to zero. The trace
is zero. So one eigenvalue is 4 = 1, the other is A = —1. The determinant must be 1
times —1, the product of the i’s.
Example 3 The 3 by 3 Fourier matrix is in Figure 10.4. Is it Hermitian? Is it unitary?
The Fourier matrix is certainly symmetric. It equals its transpose. But it doesn’t equal
its conjugate transpose—it is not Hermitian. If you change i to —i, you get a different
matrix.
Is F unitary? Yes. The squared length of every column is ⅓(1 + 1 + 1). The
columns are unit vectors. The first column is orthogonal to the second column because
1 + e^{2πi/3} + e^{4πi/3} = 0. This is the sum of the three numbers marked in Figure 10.4.
Notice the symmetry of the figure. If you rotate it by 120°, the three points are
in the same position. Therefore their sum S also stays in the same position! The only
possible sum is S = 0, because this is the only point that is in the same position after
120° rotation.
Is column 2 of F orthogonal to column 3? Their dot product looks like

    ⅓(1 + e^{6πi/3} + e^{6πi/3}) = ⅓(1 + 1 + 1).

This is not zero. That is because we forgot to take complex conjugates! The complex
inner product uses H not T:

    (column 2)^H (column 3) = ⅓(1·1 + e^{−2πi/3} e^{4πi/3} + e^{−4πi/3} e^{2πi/3}) = ⅓(1 + e^{2πi/3} + e^{4πi/3}) = 0.

So we do have orthogonality. Conclusion: F is a unitary matrix.
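The same conclusion can be checked numerically. This MATLAB fragment (an added illustration, not the book's code) builds F₃ from the pattern w^{jk} and verifies F^H F = I:

    w = exp(2i*pi/3);
    F = w.^((0:2)'*(0:2)) / sqrt(3);   % entry (j,k) is w^(jk)/sqrt(3), rows and columns from 0
    norm(F'*F - eye(3))                % ~1e-16: orthonormal columns, F is unitary
    norm(F - F.')                      % 0: F equals its transpose (symmetric)
    norm(F - F')                       % not 0: F is NOT equal to its conjugate transpose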
The next section will study the n by n Fourier matrices. Among all complex unitary matrices, these are the most important. When we multiply a vector by Fₙ, we are
computing its discrete Fourier transform. When we multiply by F⁻¹, we are computing the inverse transform. The special property of unitary matrices is that
F⁻¹ = F^H.
The inverse transform only differs by changing i to −i:

    F⁻¹ = F^H = (1/√3) [ 1   1   1 ;  1   e^{−2πi/3}   e^{−4πi/3} ;  1   e^{−4πi/3}   e^{−2πi/3} ].
Everyone who works with F recognizes its value. The last section of the book will
bring together Fourier analysis and linear algebra.
This section ends with a table to translate between real and complex—for vectors
and for matrices:
Real versus Complex

    Rⁿ: vectors with n real components              ↔  Cⁿ: vectors with n complex components
    length: ||x||² = x₁² + ··· + xₙ²                ↔  length: ||z||² = |z₁|² + ··· + |zₙ|²
    transpose: (A^T)_{ij} = A_{ji}                  ↔  conjugate transpose: (A^H)_{ij} = Ā_{ji}
    (AB)^T = B^T A^T                                ↔  (AB)^H = B^H A^H
    dot product: x^T y = x₁y₁ + ··· + xₙyₙ          ↔  inner product: u^H v = ū₁v₁ + ··· + ūₙvₙ
    (Ax)^T y = x^T (A^T y)                          ↔  (Au)^H v = u^H (A^H v)
    orthogonality: x^T y = 0                        ↔  orthogonality: u^H v = 0
    symmetric matrices: A = A^T                     ↔  Hermitian matrices: A = A^H
    A = QΛQ⁻¹ = QΛQ^T (real Λ)                      ↔  A = UΛU⁻¹ = UΛU^H (real Λ)
    skew-symmetric matrices: K^T = −K               ↔  skew-Hermitian matrices: K^H = −K
    orthogonal matrices: Q^T = Q⁻¹                  ↔  unitary matrices: U^H = U⁻¹
    orthonormal columns: Q^T Q = I                  ↔  orthonormal columns: U^H U = I
    (Qx)^T (Qy) = x^T y and ||Qx|| = ||x||          ↔  (Ux)^H (Uy) = x^H y and ||Uz|| = ||z||

The columns and also the eigenvectors of Q and U are orthonormal. Every |λ| = 1.
Problem Set 10.2
1   Find the lengths of u = (1+i, 1−i, 1+2i) and v = (i, i, i). Also find u^H v and v^H u.
2   Compute A^H A and A A^H. Those are both ______ matrices:   A = [ … ].
3   Solve Az = 0 to find a vector in the nullspace of A in Problem 2. Show that z is orthogonal to the columns of A^H. Show that z is not orthogonal to the columns of A^T.
4   Problem 3 indicates that the four fundamental subspaces are C(A) and N(A) and ______ and ______. Their dimensions are still r and n − r and r and m − r. They are still orthogonal subspaces. The symbol ^H takes the place of ^T.
5   (a) Prove that A^H A is always a Hermitian matrix.
    (b) If Az = 0 then A^H Az = 0. If A^H Az = 0, multiply by z^H to prove that Az = 0. The nullspaces of A and A^H A are ______. Therefore A^H A is an invertible Hermitian matrix when the nullspace of A contains only z = 0.
6   True or false (give a reason if true or a counterexample if false):
    (a) If A is a real matrix then A + I is invertible.
    (b) If A is a Hermitian matrix then A + iI is invertible.
    (c) If U is a unitary matrix then U + iI is invertible.
7   When you multiply a Hermitian matrix by a real number c, is cA still Hermitian? If c = i show that iA is skew-Hermitian. The 3 by 3 Hermitian matrices are a subspace provided the "scalars" are real numbers.
8   Which classes of matrices does P belong to: orthogonal, invertible, Hermitian, unitary, factorizable into LU, factorizable into QR?

        P = [ 0  1  0 ;  0  0  1 ;  1  0  0 ]

9   Compute P², P³, and P⁻¹ in Problem 8. What are the eigenvalues of P?
10  Find the unit eigenvectors of P in Problem 8, and put them into the columns of a unitary matrix F. What property of P makes these eigenvectors orthogonal?
11  Write down the 3 by 3 circulant matrix C = 2I + 5P + 4P². It has the same eigenvectors as P in Problem 8. Find its eigenvalues.
12  If U and V are unitary matrices, show that U⁻¹ is unitary and also UV is unitary. Start from U^H U = I and V^H V = I.

13  How do you know that the determinant of every Hermitian matrix is real?

14  The matrix A^H A is not only Hermitian but also positive definite, when the columns of A are independent. Proof: z^H A^H A z is positive if z is nonzero because ______.

15  Diagonalize this Hermitian matrix to reach A = UΛU^H:   A = [ … ].

16  Diagonalize this skew-Hermitian matrix to reach K = UΛU^H. All the λ's are ______:   K = [ … ].

17  Diagonalize this orthogonal matrix to reach Q = UΛU^H. Now all the λ's are ______:   Q = [ cos θ   −sin θ ;  sin θ   cos θ ].

18  Diagonalize this unitary matrix V to reach V = UΛU^H. Again all the λ's are ______:   V = [ … ].

19  If v₁, …, vₙ is an orthonormal basis for Cⁿ, the matrix with those columns is a ______ matrix. Show that any vector z equals (v₁^H z)v₁ + ··· + (vₙ^H z)vₙ.

20  The functions e^{−ix} and e^{ix} are orthogonal on the interval 0 ≤ x ≤ 2π because their inner product is ∫₀^{2π} ______ = 0.

21  The vectors v = (1, i, 1), w = (i, 1, 0) and z = ______ are an orthogonal basis for ______.

22  If A = R + iS is a Hermitian matrix, are its real and imaginary parts symmetric?

23  The (complex) dimension of Cⁿ is ______. Find a non-real basis for Cⁿ.

24  Describe all 1 by 1 Hermitian matrices and unitary matrices. Do the same for 2 by 2.
25  How are the eigenvalues of A^H related to the eigenvalues of the square complex matrix A?
26  If u^H u = 1 show that I − 2uu^H is Hermitian and also unitary. The rank-one matrix uu^H is the projection onto what line in Cⁿ?

27  If A + iB is a unitary matrix (A and B are real) show that Q = [ A   −B ;  B   A ] is an orthogonal matrix.

28  If A + iB is a Hermitian matrix (A and B are real) show that [ A   −B ;  B   A ] is symmetric.
29  Prove that the inverse of a Hermitian matrix is a Hermitian matrix.

30  Diagonalize this matrix by constructing its eigenvalue matrix Λ and its eigenvector matrix S:

        A = [ 2   1−i ;  1+i   3 ] = SΛS⁻¹.

31  A matrix with orthonormal eigenvectors has the form A = UΛU⁻¹ = UΛU^H. Prove that AA^H = A^H A. These are exactly the normal matrices.
10.3  THE FAST FOURIER TRANSFORM
Many applications of linear algebra take time to develop. It is not easy to explain them
in an hour. The teacher and the author must choose between completing the theory and
adding new applications. Generally the theory wins, because this course is the best
chance to make it clear—and the importance of any one application seems limited.
This section is almost an exception, because the importance of Fourier transforms is almost unlimited.
More than that, the algebra is basic. We want to multiply quickly by F and F⁻¹,
the Fourier matrix and its inverse. This is achieved by the Fast Fourier Transform—
the most valuable numerical algorithm in our lifetime.
The FFT has revolutionized signal processing. Whole industries are speeded up by
this one idea. Electrical engineers are the first to know the difference—they take your
Fourier transform as they meet you (if you are a function). Fourier's idea is to represent
f as a sum of harmonics cₖe^{ikx}. The function is seen in frequency space through the
coefficients cₖ, instead of physical space through its values f(x). The passage backward
and forward between c's and f's is by the Fourier transform. Fast passage is by the FFT.
   An ordinary product Fc uses n² multiplications (the matrix has n² nonzero entries).
The Fast Fourier Transform needs only n times ½ log₂ n. We will see how.
Roots of Unity and the Fourier Matrix
Quadratic equations have two roots (or one repeated root). Equations of degree n have
n roots (counting repetitions). This is the Fundamental Theorem of Algebra, and to
make it true we must allow complex roots. This section is about the very special equation zⁿ = 1. The solutions z are the "nth roots of unity." They are n evenly spaced
points around the unit circle in the complex plane.
   Figure 10.5 shows the eight solutions to z⁸ = 1. Their spacing is ⅛(360°) = 45°. The first root is at 45° or θ = 2π/8 radians. It is the complex number w = e^{iθ} = e^{2πi/8}. We call this number w₈ to emphasize that it is an 8th root. You could
write it in terms of cos 2π/8 and sin 2π/8, but don't do it. The seven other 8th roots are
w², w³, …, w⁸, going around the circle. Powers of w are best in polar form, because
we work only with the angle.
The fourth roots of 1 are also in the figure. They are i, −1, −i, 1. The angle is
now 2π/4 or 90°. The first root w₄ = e^{2πi/4} is nothing but i. Even the square roots
of 1 are seen, with w₂ = e^{2πi/2} = −1. Do not despise those square roots 1 and −1.
   The idea behind the FFT is to go from an 8 by 8 Fourier matrix (containing powers
of w₈) to the 4 by 4 matrix below (with powers of w₄ = i). The same idea goes from
4 to 2. By exploiting the connections of F₈ down to F₄ and up to F₁₆ (and beyond),
the FFT makes multiplication by F₁₀₂₄ very quick.
[Figure: the eight 8th roots of 1, equally spaced around the unit circle, with w = e^{2πi/8} = cos 2π/8 + i sin 2π/8 marked at 45° above the real axis.]

Figure 10.5  The eight solutions to z⁸ = 1 are 1, w, w², …, w⁷ with w = (1 + i)/√2.

We describe the Fourier matrix, first for n = 4. Its rows contain powers of 1
and w and w² and w³. These are the fourth roots of 1, and their powers come in a
special order:
    F = [ 1   1   1   1 ;  1   i   i²   i³ ;  1   i²   i⁴   i⁶ ;  1   i³   i⁶   i⁹ ]

The matrix is symmetric (F = F^T). It is not Hermitian. Its main diagonal is not real.
But ½F is a unitary matrix, which means that (½F^H)(½F) = I:

    The columns of F give F^H F = 4I.    The inverse of F is ¼F^H, which is ¼F̄.
The inverse changes from w = i to w̄ = −i. That takes us from F to F̄. When the
Fast Fourier Transform gives a quick way to multiply by F₄, it does the same for the
inverse.
The unitary matrix is U = F/√n. We prefer to avoid that √n and just put 1/n
outside F⁻¹. The main point is to multiply the matrix F times the coefficients in the
Fourier series c₀ + c₁e^{ix} + c₂e^{2ix} + c₃e^{3ix}:

    Fc = [ 1   1   1   1 ;  1   w   w²   w³ ;  1   w²   w⁴   w⁶ ;  1   w³   w⁶   w⁹ ] [ c₀ ; c₁ ; c₂ ; c₃ ]
       = [ c₀ + c₁ + c₂ + c₃ ;  c₀ + c₁w + c₂w² + c₃w³ ;  c₀ + c₁w² + c₂w⁴ + c₃w⁶ ;  c₀ + c₁w³ + c₂w⁶ + c₃w⁹ ].        (1)
The input is four complex coefficients c₀, c₁, c₂, c₃. The output is four function values
y₀, y₁, y₂, y₃. The first output y₀ = c₀ + c₁ + c₂ + c₃ is the value of the Fourier series
at x = 0. The second output is the value of that series Σ cₖe^{ikx} at x = 2π/4:

    y₁ = c₀ + c₁e^{2πi/4} + c₂e^{4πi/4} + c₃e^{6πi/4} = c₀ + c₁w + c₂w² + c₃w³.
The third and fourth outputs y₂ and y₃ are the values of Σ cₖe^{ikx} at x = 4π/4 and
x = 6π/4. These are finite Fourier series! They contain n = 4 terms and they are
evaluated at n = 4 points. Those points x = 0, 2π/4, 4π/4, 6π/4 are equally spaced.
   The next point would be x = 8π/4 which is 2π. Then the series is back to y₀,
because e^{2πi} is the same as e⁰ = 1. Everything cycles around with period 4. In this
world 2 + 2 is 0 because (w²)(w²) = w⁴ = 1. In matrix shorthand, F times c gives
a column vector y. The four y's come from evaluating the series at the four x's with
spacing 2π/4:
    y = Fc produces yⱼ = Σ_{k=0}^{3} cₖ e^{ik(2πj/4)} = the value of the series at x = 2πj/4.
We will follow the convention that j and k go from 0 to n − 1 (instead of 1 to n).
The “zeroth row” and “zeroth column” of F contain all ones.
The n by n Fourier matrix contains powers of w = e^{2πi/n}:

    Fₙc = [ 1   1      1        ⋯   1 ;
            1   w      w²       ⋯   w^{n−1} ;
            1   w²     w⁴       ⋯   w^{2(n−1)} ;
            ⋮ ;
            1   w^{n−1}   w^{2(n−1)}   ⋯   w^{(n−1)²} ] [ c₀ ; c₁ ; c₂ ; ⋮ ; c_{n−1} ] = [ y₀ ; y₁ ; y₂ ; ⋮ ; y_{n−1} ].        (2)

Fₙ is symmetric but not Hermitian. Its columns are orthogonal, and F̄ₙFₙ = nI. Then
Fₙ⁻¹ is F̄ₙ/n. The inverse contains powers of w̄ₙ = e^{−2πi/n}. Look at the pattern in F:
The entry in row j, column k is w^{jk}. Row zero and column zero contain w⁰ = 1.
The zeroth output is y₀ = c₀ + c₁ + ··· + c_{n−1}. This is the series Σ cₖe^{ikx} at x = 0.
When we multiply c by Fₙ, we sum the series at n points. When we multiply y by
Fₙ⁻¹, we find the coefficients c from the function values y. The matrix F passes from
"frequency space" to "physical space." F⁻¹ returns from the function values y to the
Fourier coefficients c.
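As a concrete check of equation (2), here is a short MATLAB fragment (added for illustration; it is not one of the book's Teaching Codes). It builds Fₙ from the pattern w^{jk}, sums the series with y = Fc, and recovers c with F̄/n:

    n = 8;
    w = exp(2i*pi/n);                  % w = e^(2*pi*i/n)
    F = w.^((0:n-1)'*(0:n-1));         % entry in row j, column k is w^(jk)
    c = randn(n,1) + 1i*randn(n,1);    % any complex coefficients
    y = F*c;                           % values of the series at the n equally spaced points
    norm(conj(F)*y/n - c)              % ~1e-15: the inverse transform is conj(F)/n
    norm(y - n*ifft(c))                % ~1e-15: MATLAB's ifft uses this sign, scaled by 1/n

The last line is a convention note: MATLAB's built-in fft uses w = e^{−2πi/n}, so it is n·ifft that matches the F of this section.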
One Step of the Fast Fourier Transform
We want to multiply F times c as quickly as possible. Normally a matrix times a
vector takes n? separate multiplications—the matrix has n? entries. You might think it
is impossible to do better. (If the matrix has zero entries then multiplications can be
skipped. But the Fourier matrix has no zeros!) By using the special pattern w^{jk} for
its entries, F can be factored in a way that produces many zeros. This is the FFT.
The key idea is to connect Fₙ with the half-size Fourier matrix F_{n/2}. Assume
that n is a power of 2 (say n = 2¹⁰ = 1024). We will connect F₁₀₂₄ to F₅₁₂—or rather
to two copies of F₅₁₂. When n = 4, the key is in the relation between the matrices

    F₄ = [ 1   1   1   1 ;  1   i   i²   i³ ;  1   i²   i⁴   i⁶ ;  1   i³   i⁶   i⁹ ]    and    [ F₂   0 ;  0   F₂ ] = [ 1   1   0   0 ;  1   i²   0   0 ;  0   0   1   1 ;  0   0   1   i² ].
On the left is F₄, with no zeros. On the right is a matrix that is half zero. The work
is cut in half. But wait, those matrices are not the same. The block matrix with two
copies of the half-size F is one piece of the picture but not the only piece. Here is
the factorization of F₄ with many zeros:

    F₄ = [ 1   0   1   0 ;  0   1   0   i ;  1   0   −1   0 ;  0   1   0   −i ] [ 1   1   0   0 ;  1   i²   0   0 ;  0   0   1   1 ;  0   0   1   i² ] [ 1   0   0   0 ;  0   0   1   0 ;  0   1   0   0 ;  0   0   0   1 ].        (3)
The matrix on the right is a permutation. It puts the even c's (c₀ and c₂) ahead of
the odd c's (c₁ and c₃). The middle matrix performs separate half-size transforms on
the evens and odds. The matrix at the left combines the two half-size outputs—in a
way that produces the correct full-size output y = F₄c. You could multiply those three
matrices to see that their product is F₄.
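That multiplication is easy to carry out numerically. A possible MATLAB check (an added sketch, not the book's code):

    I2 = eye(2);  D = diag([1 1i]);                 % the I and D blocks in equation (3)
    F2 = [1 1; 1 -1];                               % the 2 by 2 Fourier matrix (w = i^2 = -1)
    P  = [1 0 0 0; 0 0 1 0; 0 1 0 0; 0 0 0 1];      % even-odd permutation: c0, c2 ahead of c1, c3
    F4 = (1i).^((0:3)'*(0:3));                      % F4 built directly from i^(jk)
    norm([I2 D; I2 -D]*blkdiag(F2,F2)*P - F4)       % essentially zero: the three factors multiply to F4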
The same idea applies when n = 1024 and m = ½n = 512. The number w is
e^{2πi/1024}. It is at the angle θ = 2π/1024 on the unit circle. The Fourier matrix F₁₀₂₄
is full of powers of w. The first stage of the FFT is the great factorization discovered
by Cooley and Tukey (and foreshadowed in 1805 by Gauss):

    F₁₀₂₄ = [ I₅₁₂   D₅₁₂ ;  I₅₁₂   −D₅₁₂ ] [ F₅₁₂   0 ;  0   F₅₁₂ ] [ even-odd permutation ].        (4)

I₅₁₂ is the identity matrix. D₅₁₂ is the diagonal matrix with entries (1, w, …, w⁵¹¹).
The two copies of F₅₁₂ are what we expected. Don't forget that they use the 512th root
of unity (which is nothing but w²!) The permutation matrix separates the incoming
vector c into its even and odd parts c′ = (c₀, c₂, …, c₁₀₂₂) and c″ = (c₁, c₃, …, c₁₀₂₃).
Here are the algebra formulas which say the same thing as the factorization of F₁₀₂₄:

10I (FFT)  Set m = ½n. The first m and last m components of y = Fₙc are combinations of the half-size transforms y′ = Fₘc′ and y″ = Fₘc″. Equation (4) shows
Iy′ + Dy″ and Iy′ − Dy″:

    yⱼ = y′ⱼ + (wₙ)ʲ y″ⱼ ,        j = 0, …, m − 1
    y_{j+m} = y′ⱼ − (wₙ)ʲ y″ⱼ ,   j = 0, …, m − 1.        (5)

Thus the three steps are: split c into c′ and c″, transform them by Fₘ into y′ and
y″, and reconstruct y from equation (5).
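Those three steps translate directly into a recursive program. Here is a minimal MATLAB sketch, saved as fftrec.m (an illustration added here, not one of the book's Teaching Codes; the name fftrec is made up). It follows the sign convention w = e^{2πi/n} of this section, so its output matches n*ifft(c) rather than MATLAB's fft(c):

    function y = fftrec(c)
    % y = F_n * c by the recursion of 10I, for a column vector c whose length n is a power of 2.
    n = length(c);
    if n == 1
        y = c;                           % F_1 is just [1]
        return
    end
    m = n/2;
    yp  = fftrec(c(1:2:n-1));            % y'  = F_m times the even part (c0, c2, ...)
    ypp = fftrec(c(2:2:n));              % y'' = F_m times the odd part  (c1, c3, ...)
    d = exp(2i*pi*(0:m-1)'/n);           % the diagonal of D: 1, w, ..., w^(m-1)
    y = [yp + d.*ypp; yp - d.*ypp];      % equation (5): first m and last m components
    end

Calling y = fftrec(c) for a column vector c of length 1024 should agree with n*ifft(c) to roundoff.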
You might like the flow graph in Figure 10.6 better than these formulas. The
graph for n = 4 shows c′ and c″ going through the half-size F₂. Those steps are
called "butterflies," from their shape. Then the outputs from the F₂'s are combined
using the I and D matrices to produce y = F₄c:

[Figure: the flow graph. The inputs enter in the order c₀, c₂, c₁, c₃ (binary labels 00, 10, 01, 11); c′ and c″ each pass through a half-size F₂; the outputs y₀, y₁, y₂, y₃ (labels 00, 01, 10, 11) come from combining them with the I and D matrices.]

Figure 10.6  Flow graph for the Fast Fourier Transform with n = 4.
This reduction from Fₙ to two Fₘ's almost cuts the work in half—you see the zeros
in the matrix factorization. That reduction is good but not great. The full idea of the
FFT is much more powerful. It saves much more than half the time.
The Full FFT by Recursion
If you have read this far, you have probably guessed what comes next. We reduced Fₙ
to F_{n/2}. Keep going to F_{n/4}. The matrices F₅₁₂ lead to F₂₅₆ (in four copies). Then
256 leads to 128. That is recursion. It is a basic principle of many fast algorithms,
and here is the second stage with four copies of F = F₂₅₆ and D = D₂₅₆:

    [ F₅₁₂        ]   [ I   D          ] [ F          ] [ pick 0, 4, 8, …   ]
    [        F₅₁₂ ] = [ I  −D          ] [    F       ] [ pick 2, 6, 10, …  ]
                      [         I   D  ] [       F    ] [ pick 1, 5, 9, …   ]
                      [         I  −D  ] [          F ] [ pick 3, 7, 11, …  ]
We will count the individual multiplications, to see how much is saved. Before
the FFT was invented, the count was the usual n? = (1024)?. This is about a million
multiplications. I am not saying that they take a long time. The cost becomes large
when we have many, many transforms to do—which is typical. Then the saving by the
FFT is also large:
The final count for size n = 2ˡ is reduced from n² to ½nl.
The number 1024 is 2¹⁰, so l = 10. The original count of (1024)² is reduced to
(5)(1024). The saving is a factor of 200. A million is reduced to five thousand. That
is why the FFT has revolutionized signal processing.
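The arithmetic behind that factor of 200, as a short MATLAB check (added illustration):

    n = 1024;  l = log2(n);
    direct = n^2           % 1048576 multiplications, about a million
    fast   = n*l/2         % 5120, the count (1/2)*n*l = 5*1024
    direct/fast            % about 200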
Here is the reasoning behind ½nl. There are l levels, going from n = 2ˡ down to
n = 1. Each level has ½n multiplications from the diagonal D's, to reassemble
the half-size outputs from the lower level. This yields the final count ½nl, which is
½n log₂ n.
One last note about this remarkable algorithm. There is an amazing rule for the
order that the c’s enter the FFT, after all the even-odd permutations.
Write the numbers 0 to n − 1 in binary (base 2). Reverse the order of their digits. The complete
picture shows the bit-reversed order at the start, the l = log₂ n steps of the recursion,
and the final output y₀, …, y_{n−1}, which is Fₙ times c. The book ends with that very
fundamental idea, a matrix multiplying a vector.
Thank you for studying linear algebra. I hope you enjoyed it, and I very much hope
you will use it. That is why the book was written. It was a pleasure.
Problem Set 10.3
1
Multiply the three matrices in equation (3) and compare with F. In which six
entries do you need to know that i² = −1?
2
Invert the three factors in equation (3) to find a fast factorization of F⁻¹.
3
F is symmetric. So transpose equation (3) to find a new Fast Fourier Transform!
4
All entries in the factorization of F₆ involve powers of w = sixth root of 1:

    F₆ = [ I   D ;  I   −D ] [ F₃   0 ;  0   F₃ ] [ even-odd permutation ].

Write down these three factors with 1, w, w² in D and powers of w² in F₃. Multiply!
5   If v = (1, 0, 0, 0) and w = (1, 1, 1, 1), show that Fv = w and Fw = 4v. Therefore F⁻¹w = v and F⁻¹v = ______.

6   What is F² and what is F⁴ for the 4 by 4 Fourier matrix?

7   Put the vector c = (1, 0, 1, 0) through the three steps of the FFT to find y = Fc. Do the same for c = (0, 1, 0, 1).

8   Compute y = F₈c by the three FFT steps for c = (1, 0, 1, 0, 1, 0, 1, 0). Repeat the computation for c = (0, 1, 0, 1, 0, 1, 0, 1).

9   If w = e^{2πi/64} then w² and √w are among the ______ and ______ roots of 1.

10  (a) Draw all the sixth roots of 1 on the unit circle. Prove they add to zero.
    (b) What are the three cube roots of 1? Do they also add to zero?
11  The columns of the Fourier matrix F are the eigenvectors of the cyclic permutation P. Multiply PF to find the eigenvalues λ₁ to λ₄:

        PF = [ 0  1  0  0 ;  0  0  1  0 ;  0  0  0  1 ;  1  0  0  0 ] [ 1  1  1  1 ;  1  i  i²  i³ ;  1  i²  i⁴  i⁶ ;  1  i³  i⁶  i⁹ ] = FΛ.

    This is PF = FΛ or P = FΛF⁻¹. The eigenvector matrix (usually S) is F.

12  The equation det(P − λI) = 0 is λ⁴ = 1. This shows again that the eigenvalue matrix Λ is ______. Which permutation P has eigenvalues = cube roots of 1?
13  (a) Two eigenvectors of C are (1, 1, 1, 1) and (1, i, i², i³). What are the eigenvalues?

        C = [ c₀  c₁  c₂  c₃ ;  c₃  c₀  c₁  c₂ ;  c₂  c₃  c₀  c₁ ;  c₁  c₂  c₃  c₀ ]

    (b) P = FΛF⁻¹ immediately gives P² = FΛ²F⁻¹ and P³ = FΛ³F⁻¹. Then

        C = c₀I + c₁P + c₂P² + c₃P³ = F(c₀I + c₁Λ + c₂Λ² + c₃Λ³)F⁻¹ = FEF⁻¹.

    That matrix E in parentheses is diagonal. It contains the ______ of C.

14  Find the eigenvalues of the "periodic" −1, 2, −1 matrix from E = 2I − Λ − Λ³, with the eigenvalues of P in Λ. The −1's in the corners make this matrix periodic:

        C = [ 2  −1  0  −1 ;  −1  2  −1  0 ;  0  −1  2  −1 ;  −1  0  −1  2 ]   has c₀ = 2, c₁ = −1, c₂ = 0, c₃ = −1.

15  To multiply C times a vector x, we can multiply F(E(F⁻¹x)) instead. The direct way uses n² separate multiplications. Knowing E and F, the second way uses only n log₂ n + n multiplications. How many of those come from E, how many from F, and how many from F⁻¹?

16  How could you quickly compute these four components of Fc starting from c₀ + c₂, c₀ − c₂, c₁ + c₃, c₁ − c₃? You are finding the Fast Fourier Transform!

        Fc = [ c₀ + c₁ + c₂ + c₃ ;  c₀ + ic₁ + i²c₂ + i³c₃ ;  c₀ + i²c₁ + i⁴c₂ + i⁶c₃ ;  c₀ + i³c₁ + i⁶c₂ + i⁹c₃ ].
SOLUTIONS TO SELECTED EXERCISES
Problem Set 1.1, page 6
3 3v + w = (7, 5) and v − 3w = (−1, −5) and cv + dw = (2c + d, c + 2d).

5 (a) Components of cv add to zero   (b) Components of every cv + dw add to zero   (c) Choose c = 4 and d = 10.

7 (a) Back on w, forward on v   (b) The combinations fill a plane through v and w.
9 The fourth corner can be (4, 4) or (4, 0) or (−2, 2).

10 (Length of v)² + (length of w)² = 4² + 2² + (−1)² + 2² = 25 and also (length of v + w)² = 3² + 4² = 25.
12 Choose v = (2, 1) and w = (1, 2). Then (length of v)² = (length of w)² = 5 but (length of v + w)² = 3² + 3².

15 Five more corners (0,0,1), (1,1,0), (1,0,1), (0,1,1), (1,1,1). The center point is (½, ½, ½).
   The centers of the six faces are (½, ½, 0), (½, ½, 1) and (0, ½, ½), (1, ½, ½) and (½, 0, ½), (½, 1, ½).
16 A four-dimensional cube has 2⁴ = 16 corners and 2·4 = 8 three-dimensional sides and 24 two-dimensional faces and 32 one-dimensional edges.

17 (a) sum = zero vector   (b) sum = −4:00 vector   (c) sum = −4(1:00 vector).

19 … The point ¾v + ¼w is three-fourths of the way to v starting from w. The point ¼v + ¼w is ½u.
20 All combinations with c + d = 1 are on the line through v and w. The point V = −v + 2w is on that line beyond w.

21 The vectors cv + cw fill out the line passing through (0, 0) and u = ½v + ½w. It continues beyond v + w. With c ≥ 0, half this line is removed and the "ray" starts at (0, 0).

22 (a) The combinations with 0 ≤ c ≤ 1 and 0 ≤ d ≤ 1 fill the parallelogram with sides v and w. Corner at (0, 0)   (b) With 0 ≤ c and 0 ≤ d the parallelogram becomes an infinite wedge.
23 (a) ⅓u + ⅓v + ⅓w is the center of the triangle between u, v and w; ½u + ½w is the center of the edge between u and w   (b) To fill in the triangle keep c ≥ 0, d ≥ 0, e ≥ 0, and c + d + e = 1.
25 The vector ½(u + v + w) is outside the pyramid because c + d + e = ½ + ½ + ½ > 1.

26 All vectors are combinations of u, v, and w.

27 Vectors cv are in both planes.

29 The solution is c = 2 and d = 4. Then 2(1, 2) + 4(3, 1) = (14, 8).
Problem Set 1.2, page 17
2 ||w|| =1
and ||v||
=5 =||w||.
Then
1.4 < (1)(5) and 24 < (5)(5).
4 uy; =d/|lv|| = Fa: 1) and u = w/||wl| = $(2, 1,2). U; = wari —~3) or Tg!
3)-
U> could be nat
yall —2, 0).
5 (a) zero angle
(b) 180° (or zr)
=)
ORsoner—
90
6 (a) cos@ = One
(c)
(v+w)-(v—w)
=v-v+w-v—v-w-wWw-w=
acorn 5 radians).
$010) = 160280r . radians
(b) cos@ = 0 so 6 = 90° or % radians
VY
(@)ecosa — =
=
SOG)
= GON or 5
(d) cos6 = —1//2 so 6 = 135° or 3r
8 (a) False
(b) True: u-(cu+dw) = cu-v+du-w
choose c=u-w and d= —u-v.
12
(1,1)
perpendicular
to (1,5) — c(i, 1) if
6 —2c
= 0
=
(c) True:
0 or c =
for cu-v+du-w
3; v-(@w—cv)
=
= 0
0 if
C=v-w/v-r.
16 ||v||² = 9 so ||v|| = 3.

18 (v + w)·(v + w) = (v + w)·v + (v + w)·w = v·(v + w) + w·(v + w) = v·v + v·w + w·v + w·w = v·v + 2v·w + w·w.

30 w = (10, 0).
19 2v·w ≤ 2||v|| ||w|| leads to ||v + w||² = v·v + 2v·w + w·w ≤ ||v||² + 2||v|| ||w|| + ||w||² = (||v|| + ||w||)².

21 cos β = w₁/||w|| and sin β = w₂/||w||. Then cos(β − α) = cos β cos α + sin β sin α = v₁w₁/||v|| ||w|| + v₂w₂/||v|| ||w|| = v·w/||v|| ||w||.
22 We know
Cancel
that (v—w)-(v—w)
=v-v—2v-w+w-w.
Compare
with the Law of Cosines.
—2 to find ||v||||w||cos@ = v-w.
24 Example 6 gives |u₁||U₁| ≤ ½(u₁² + U₁²) and |u₂||U₂| ≤ ½(u₂² + U₂²). The whole line becomes .96 ≤ (.6)(.8) + (.8)(.6) ≤ ½(.6² + .8²) + ½(.8² + .6²) = 1.
26 The angle between v = (1, 2, −3) and w = (−3, 1, 2) has cos θ = −½ and θ = 120°.
   Write v·w = xz + yz + xy as ½(x + y + z)² − ½(x² + y² + z²), which is ≥ −½(x² + y² + z²).
27 The length ||v — w|| is between
2 and 8. The dot product v-w
is between
—15 and 15.
Solutions to Selected Exercises
449
Problem Set 2.1, page 29
The solution is not changed; the second plane and row 2 of the matrix and all columns
are changed.
J Sctey —0Nto tind sy ———3)/2 -andaa—
5) 2s Setny
— Oh
tomtind.
—
smandez —alb
6 If x, y,z satisfy the first two equations they also satisfy the third equation.
solutions contains
cvu+dw
and w=
(G, iF 5) and u = Sv a Sw and all combinations
with c+d=1.
7 Equation
DPA
v = (1,1,0)
The line of
1+
equation 2—
equation 3 is now 0 = —4.
Solution impossible.
(Soa) peAbe —n( GAS)»
10 Nine multiplications for Ax = (18, 5, 0).
12 (z,
y, x) and (0,0, 0) and (3,
3, 6).
13 (a) x has n components, Ax has m components   (b) Planes in n-dimensional space, but the columns are in m-dimensional space.
14 (a) Linear equations with unknowns x, y, z have the form ax + by + cz = d   (b) cv₀ + dv is also a solution if c + d = 1   (c) If the equation has right side = constant term = zero, any c and d are allowed.
16 We
rotation from R* = ie | ee
be
iar)l 180°
OI
O241.%0
POO!
Wl)
beat
hasle)
17 P=
andup=!=
esa
s0
“P70 eS
50 ord
19 F260
0
Oto
16
Onde
t
<0
“0
LO? 30
40" 10>
Fe =] Gi 4's
FO ev ea Gy
a
23 The first code computes dot products for J = 1,2; the second code combines
i—ie Ow Anise4uby esechantestOn— le 4randsJe—dles:
columns
24 B(1) = B(1) + A(1,2)*V(2), B(2) = A(2,1)*V(1), B(2) = B(2) + A(2,2)*V(2). Code 2 reverses the first two of those three steps.
29 uz = [a ey = ea The components always add to 1.
30 u7, 07, w7 are all close to (.6, .4). Their components still add to 1.
31 [ .8  .3 ;  .2  .7 ] [ .6 ; .4 ] = [ .6 ; .4 ] = steady state s.
32 From all starting points the vectors approach (.45, .35, .20).
33 Kel =|)
8
ih
Cy
3 4
S+u
SD |) cee |p ey no
YL 2
S+u
136, row sums
= 136/4 = 34.
S5—u+u
5)
Suv
5-—v
5+u+v
5—u
}; 16 numbers
in M4 add to
450
Solutions to Selected Exercises
Problem Set 2.2, page 40
Multiply by / = ? = 5 and subtract to find 2x +3y = 14 and —6y = 6.
Subtract -4 times equation
3y = 3. Then
y=
| (or add 5 times equation 1). The new
1 and x =5.
second equation is
If the right side changes sign, so does the solution:
(x, y) = (—5, —1).
Subtract ℓ = c/a times equation 1. The new second pivot multiplying y is d − (cb/a) or (ad − bc)/a.
The system is singular if b = 4, because 4x +8y is 2 times 2x+4y. Then g = 2-13 = 26
makes the system solvable. The two lines become the same: infinitely many solutions.
If a = 2 elimination must fail. The equations have no solution. If a = 0 elimination stops
for a row exchange. Then 3y = —3 gives y = —1 and 4x + 6y =6
gives x =3.
If k = 3 elimination must fail: no solution. If k = —3, elimination gives 0 = O in
equation 2: infinitely many solutions. If k =O a row exchange is needed: one solution.
11
12
Die
2x —
Sy
a=)
il
Ve -|- 952 =) =)
oe = IC
selves
3y
a
y+
z=1 gives
2y — 3z =2
ee
yy —)— en If rarzeromisrate ther startroterow 2
z=
2.
or 3, that avoids a row operation.
2x —3y
a)
Vere
— lean
= bye =)
x=3)'
—
z=0
2x
lero
2x
row | from row 2
wale trOmenowe3
row 2 from
row 3
13 Subtract 2 times row | from row 2 to reach (d —10)y —z = 2. Equation (3) is y—z=3.
If d = 10 exchange rows 2 and 3. If d = 11 the system is singular; third pivot is missing.
14 The second pivot position will contain —2 —b. If b = —2 we must exchange with row 3.
itp — FS letnessecondsequatlonniss—y
7 — ) Se AuSOLCON ecm ar—nl miye i lewe
le
16 If row 1 = row 2, then row 2 is zero after the first step; exchange with row 3 and there
is no third pivot. If column 1 = column 2 there is no second pivot.
17 x+2y+3z=0,
4x+8y+4+12z=0,
5x + 10y+ 15z=0.
18 Row 2 becomes 3y —4z = 5, then row 3 becomes (¢ +4)z=t—S. If g = —4 the system
is singular — no third pivot. Then if t = 5 the third equation is 0 = 0. Choosing z= 1
the equation 3y — 4z=5 gives y =3 and equation | gives x = —9.
20 Looking from the “end view”, the three planes form a triangle. This happens if rows
1 +2 = row 3 on the left side but not the right side: for example x +
y+ z = 0,
x—y—z=0, 2x +0y+0z=1. No parallel planes but still no solution.
22 The solution is (1, 2,3,4) instead of (—1,2, —3, 4).
23 The fifth pivot is 6/5. The nth pivot is (n + 1)/n.
1
24 A
1
1
"aaa
a+1
b
b+e
b+c+3
25 a = 2 (equal columns),
a = 4 (equal rows),
a = 0 (zero column).
26 Solvable for s = 10 (add equations); [; | and E at
Solutions to Selected Exercises
451
Problem Set 2.3, page 50
1
0
Seat
Ole
OR Om!
Bog=)
POO
i
“ez
i ONO
Om Oma
ae
ORO
=O, vi Olle
YF
il
Eo
OO
a Ol
eae
1
|G
0
(ORO)
i
0}
-—2
1
1
M = E37E3,E2, is |—4
10
OpeO
1
0
-—2
1
4 Elimination on column 4:
|
b= | 0]
0
tion in Ux = (1, —4, 10) gives
>
1
|-4]
0
z= —5, y=
|Prat,
are
Onto
On
0 0
FE 1, £31, E32
,
>
—
:
5.x
0
10
1
1
|—4
10
|. Then back substitu-
. This solves Ax = (1,0, 0).
Changing a33 from 7 to 11 will change the third pivot from 2 to 6. Changing a33 from
7 to 5 will change the pivot from 2 to no pivot.
To reverse E₃₁, add 7 times row 1 to row 3. The matrix is R₃₁ = [ 1  0  0 ;  0  1  0 ;  7  0  1 ].
—-1
1 0
M=j]{
After the exchange, E must act on the new row 3
a:
1
(Oe
lees
0
2 -3
0]
Ie
8)
12 (a) E times the third column of B is the third column of EB
to row 3 to give nonzeros.
14
(b) E could add row 2
452
Solutions to Selected Exercises
Oe)
17
EF=|a
0|,
FE=|
ln
Fo
19 (a)
ib
@
=O.
19.0
0)
Be
1
column
is
Elva
Each
1
Q
a
10).
@
b+ac
c
1
‘B
i
B=
1
Ww
|208
IF 07).
2D
Ol
©
ohne
times
a
b
columno
(a) Dla3jxj
(Bb) ag1 — ayy
Finitil 2° Seal
2 3
23
| a hea
Os
ii
(b)
>
|
|
wel
2g
ae
8)
Wena
()2x2=2,
(dd) (Ax) = Yay;;.
1). 2xjt3m= 1 m= 5
Sal
—§x=159 wo = 3.
21
24 The last equation becomes 0 = 3. Change the original 6 to 3. Then row
row 3.
ee
25 (a) A” 1 has two extra columns; ‘ E 7
0
26 (a) No solution if
(b) Infinitely many
d= 0 and c 40
A
1
4
1
d=> i iS
ae
1 + row 2 =
0
—7
i ae age
4
nee
solutions if d = 0 and c = 0.
No effect: a and b.
Problem Set 2.4, page 59
(a) A (column 3 of B)
(b) (Row 1 of A) B
(d) (Row | of C)D(column 1 of E£).
n
wel
1
bn
DES
P| ana
(a) True
(c) (Row 3 of A)(column
4 of B)
OD:
n
an
= [7% ou
(b) False
(c) True
(d) False.
8 Rows of DA are 3-(row | of A) and 5-(row 2 of A). Rows of EA are row 2 of A and
zero. Columns of AD are 3-(column | of A) and 5-(column 2 of A). Columns of AE are
zero and column | of A.
INE = ip ae
10
alee ee
and E(AF)
b+d
d
equals (EA)F
and then E(FA)
=
because matrix multiplication is associative.
a+c
|
a+2c
b+d
b+2d
| E(FA)
is not F(EA)
because multiplication is not commutative.
)
11 (a) eB
47
(o) B=0
(CG) Bs—s|EOmen
@O
la
(d) Give
B equal rows.
LOO)
12
ap =|‘ Reel,
| gives b=c =0. Then AC =CA gives a=d: A=al.
13 (A − B)² = (B − A)² = A(A − B) − B(A − B) = A² − AB − BA + B².

15 (a) mn (every entry)   (b) mnp   (c) n³ (this is n² dot products).
Solutions to Selected
16 By linearity (AB)e agrees with A(Bc).
17 (a) Use only column 2 of B
Exercises
453
Also for all other columns of C.
(b) Use only row 2 of A
(c)-(d) Use row 2 of first A.
19 Diagonal matrix, lower triangular, symmetric, all rows equal. Zero matrix.
20
(ayy
23
A=
25
Bite
ai)
©) aja — (Fain
() ajt/ary
9
EOme| has A?
=—I; BC
2S
WO
ites
Gt
—/
ie
a
[0 x
Sys il
—
1
1 of B) =
OF
ON
x =|
0
28
al
=|
0
FOil ae Oy al
te
ie CB.
=
| BE aPPL lade
5
27 (a) (Row 3 of A)-(column
(Dee
5
@) aan — (PADara.
1
ORO
n
a
onal
n—lp
(Row 3 of A)-(column 2 of B) =0
Gx
Fo
0
n_|@
=
i
|x
|eand?
[0 0
x| =
x
OR
[FOR
OmEes
OM ex
OME
ORs
AN ie tates ate)He
;
31 In Problem 30,
oo
||
gh
ea
al Dele
ee
ah D~ ebja =|
me
“Al
33 (a) n—1 additions for a dot product; n2(n— 1) for n> dot products
additions for R and S$
(c) 2n(n — 1) +n
(b) 4n2(n— 1)+2n-
for R and (knowing
AC
and BD) another
dn? +n?(n —1) for S. Total: 3n2(n — 1) +5n? additions for the 3M method.
34 A times X will be the identity matrix /.
35 The solution for b= (G5, 8) is 3x7 475x> -- 8x3 = G, 8, 16):
Problem Set 2.5, Page 72
Ob
hee
3 0
= Sal
0
1 4
0
P-!=p;
0
O
1
p-!'=]1
0
a+2c=1,
—1
1
OJ.
0
ei
Ted
ae
See
p=
“transpose” of P.
CoN
Always
3a+6c
=0:
impossible.
0
—1
0
a=| 0 4 and |0 1 aa
a
bil-l
Ol]la
6b
—|
E ‘ll 0 ile | :
(a) If Ax = (1,0,0), equation 1 + equation 2 — equation 3 is 0 = 1
(b) The right
sides must satisfy b, + b2 = b3
(c) Row 3 becomes a row of zeros—no third pivot.
If you exchange rows
C= AB gives
| and 2 of A, you exchange columns
C~! = B-1A7! s0 AW! = BCH
| and 2 of A =i
454
Solutions to Selected Exercises
12 M-1=c7—!B-!47!
50 B-! = CMA.
—|
13 B-1=,7!
if |
1
=A!
ie
a subtract column
—"
1
Sib
il
1
16
1
—1
1||-1
1
|}
1
01
=
1
2 of A~!
1
from column
1
il
i
= /J¢
1
1.
th
eel
=
L = E~! after reversing the order and changing —1 to +1.
18 A²B = I can be written as A(AB) = I. Therefore A⁻¹ is AB.

19 (1,1) entry requires 4a − 3b = 1. (1,2) entry requires 2b − a = 0. Then b = 1/5 and a = 2/5.
20 6 of the 16 are invertible, including all four with three 1’s.
it
3
i
©
1
3
i
SB
it
©
> [4 i
1
Oso
3 ete
eel
230
1
OF
®
i
leew
@- |
i
OF
©
O
1
@
©
10)
1
Ss
il @
ON ls
OPO
—b
eles |
1
2
Al=
3
5 —1
—-1
-1
3
-1
—-1
-1]|;
3
1
0
1
a E 7 0 > [o in)
rE Se
26 A~'=|-2
ORO
Al
1
0
1
OM
> [o fe—2
=
©
i
SOP ON
a0
1
0
1]=]|0]|
1
0
7
4h>
a
1
0
OR
0 0
1 -a
ele OnnO
1
Otel
OnmmnO)
=U
Ase |.
=i
O
2
1
0
-1
2
-1
-1
1
(b) False (matrix of all 1’s)
29 Not invertible for c = 7 (equal columns), c = 2 (equal rows),
31
A=
SY) 89
So
©=
CO
©
_-—-
(7, Ds, 11) etitel 32 =O,
2%, I)
33 hilb(6) is not the exact Hilbert matrix because fractions are rounded off.
1-70
3 & lee "|and
Aq!
bees eee
0
AeA and
ap
|I
4
0:
(c) True
c = 0 (zero column).
30 Elimination produces the pivots a and a —b and a—b.
1
1
0
Clos
or 0
ac—b
—C
1
so A~! does not exist.
-—3 | (notice the pattern); A-!=|-1
28 (a) False with 4 zeros; true with 13 zeros
(d) True.
BS
Solutions to Selected Exercises
455
Problem Set 2.6, page 84
Av
1
0
QO
1
Onl
—3
120)
1
a
GQ)
oy
|
1
—2
yo
|
@
1
-—2
1
il
ab
Sh
2
2
=
ORE
1
Au—a10)
0
4
lee)
ZEA
Oma
1
2
0
1
ASS =| oO i
Be.
1
33|
—6
=
U7
UE
ORO
|
SOR
to —
Oi
al
Then
As
p-l
E5, E35 YW! SIG
1
EH =
Bao Bao,
i
=
1
—c
This is A~!. But
ih 9)
ee |—
2a
l
1
|
—b
—a
|
l
—a
=
1
1
1
ac—b
-c
|
L =A.
1
feel
i
ip
Ol
i
Il
4
be
i
die
—s lei erOvael
eteray (il, eiayel
7 = (0)2
no pivot in row 2
c = 2 leads to zero in the second pivot position: exchange rows and the matrix will be
OK.
c = 1 leads to zero in the third pivot position = matrix is singular.
ih
A=|>
-@)]} |
als
1
4
1
OI
a!
t
Jee
i)
cae
04
OL
sO
!
Ol
[2
a
gO
4
4
O
|i 2
:
OF ks
3%:
| E | |=EDU: notice U is “L transpose
1
4
Om
|
be
QO.
Or
—4
1
4
a
1
Or
OL.
10
12
Oo iP
ww
iP
S
OR
—
|
Ne!
Deen
ty | ||)
a
Wd Cw!
|
tO
eel
POS ex
ial
:
gives c=
13 (a) L goes to I
iN
SOCSHLON
14 (a) Multiply LDU = L;D,U,
= Lbenn
am
r
Sip
if
ie
c—s
t-—s
d-t
Ghe=\))
sei
nies
Wale)
al
4
|Ol
ye || Ul)
OMOR
(CREU
3
fakes ae = || (0
l
goes tor.
by inverses to get ie LD = D\U,U7~!.
triangular, right side is upper triangular =
both are diagonal
|
ial
(0)
(lh
th
4
|)
|
Veto
a
a
0
a
a+b
b
0
b
b+c
is I.
a
=" (Samews)
16 If T is tridiagonal, each pivot row has only 2 nonzeros.
not on later rows.
Left side is lower
(b) Since L, U, Lj, Uj
have diagonals of 1’s we get D = D,. Then Like is J and U,U—!
1
aaa
dt
b
(same
U).
c
It operates only on the next row,
U and L are zero away from the main diagonal and the next diagonal.
456
Solutions to Selected Exercises
17 L has the 3 lower zeros of A, but U may not have the upper zero. L has the bottom left zero of
B and U has the upper right zero. One zero in A and two zeros in B are “filled in.”
its
ce
Fe
ass
aoe
Be
ee
be
3G
eal
on
Ome)
se
x
sil
il 9 Stems
0)
1
0
(x’s are known
after first pivot is used).
0
21 The 2 by 2 upper submatrix B has the first two pivots 2,7.
starts with elimination on B.
23
1S NG
ee P
foo
lta
i
eshe
)
Soe
l
Eat
[LO
20
eee
24 c =6 and also
c=7
|
ee
J
Thy
a
1
Sie |
will make LU
Reason:
Elimination
on A
oil
3
Ds
he
31° Pascal’s triangle in L and U.
1
impossible
(c = 6 needs a permutation).
Problem Set 2.7, page 95
a
ki 1
4A=
Atl=
le
a
GHP
Sahel
2
li alhas A* = 0. But the diagonal entries of A'A
of A with themselves.
a
Al
If A! A = 0, zero dot products =
(b) False
(c) True
13 (a) There are n! permutation
P’ =
|
Po
|
Iara
|a
(Sa)
le Sh
Gad
are dot products of columns
zero columns.
cl
7 (a) False
the same:
ks wk
P*.
(d) False.
matrices of order n.
Eventually two powers
Multiply by (P~!)* to find P’~S
(em Cettainlyar is
One
P3
with
P, =
|= P! with
a
LG
andy
P32 ==
E = ie |
150)
1
10
0
(b) Pl (row 4) = row
1.
Oaelt ffowrenll ay
SE ea ited
cee
|Pomerat
20 Lower right 2 by 2 matrix is le a4
ae
of P must be
amheathe
al
071!
Solutions to Selected Exercises
1
23
1
1
AN = lb
OQ iW @
3S
2 il
1
|e | Oil
Oo) Wye
1
3
1
ih =|
1
1
1
24 abs(A(1,1))
=
2)
0 so find abs(A(2,1))
>
exchange in L; no more elimination so
abs(A(2,1))
>
GDS CAC)
tol;
A>
=tolwA
3554
|0O0
O
Y
SC
1
8
—2/3
|. If we wait to exchange:
pe Nils «il
OF il @
OOM
1
2
1
3
457
Da
tol; A >
01
and
P >
Oil
a
L = / and U = new A. abs(A(1,1)) =0
Oh
1{andP—>J]1
O
2,
3-4
510)
25"
16) ist
oo
@O
1
so find
0
=O
OJ]; abs(A(2,2))
@
il
Oo
psi
= no
so find
ik
— > | ORO
I
=0
aa
@
@
1
74) JE =
shows the elimination steps as actually done (L is affected by P).
2
Wil
1
30 Ex, =)|-3
1
i
|KO
0
and E> AE}, =)
l
@ ©
2ee4
4 9
eis stillisymmetie:
!
|
Dy —
i
—4
and E37 E>, AE}, Ee = D. Elimination from both sides gives the
1
symmetric LDL!
directly.
lo)
31 Total currents are Aly = | —1
Pla
REG
0
WES |) = ||RE
ORNs
YBC + YBS
PBS
=YCS*
ar WES
LIBS
Hither way (Ax) y=x!(Aly)= xpypc +xByBS — XCYBC + XCYCS — XSYCS — XSYBS1
32 Inputs | 40
5
50
1000
50)
x]
|="
x2
T
1
A’y =|
50
40
1000
2
50
700
3
3000
ns
=
6820
1 truck
188000 | 1 plane ©
34
P3 =1
so three rotations for 360°;
P rotates around
(1, 1,1) by 120°.
Problem Set 3.1, Page 107
1 x + y ≠ y + x and x + (y + z) ≠ (x + y) + z and (c₁ + c₂)x ≠ c₁x + c₂x.

2 The only broken rule is 1 times x equals x.
458
Solutions to Selected Exercises
3 (a) cx may not be in our set: not “closed” under scalar multiplication.
Also there is no
0 and no —x
(b) c(x + y) is the usual (xy)°, while cx + cy is the usual (x°)(y°).
Those are equal. With c = 3, x =2, y = 1 they equal 8. The “zero vector” is-the number
1. The vector
—2 is the number
+4.
Rule 8 is broken: If c f(x) is defined to be the usual f(cx) then (cj +c2) f = f((cy +c2)x)
is different from c; f + cof = usual f(cyx) + f(cox).
If (f + g)(x) is the usual f(g(x)) then (g + f)x is g(f(x)) which is different. In Rule
2 both sides are f(g(h(x))). Rule 4 is broken because there might be no inverse function
such that f(f~!(x)) =x. If f~! exists it will be the vector —f.
(a) The vectors with integer components
allow addition, but not multiplication by 5
(b) Remove the x axis from the xy plane (but leave the origin). Multiplication by any c
is allowed but not all vector additions.
Y
(a) All matrices
Gap
i 0
The sum of (4,0,0)
(a) The subspaces
(b) All matrices [i ‘|
(c) All diagonal matrices.
and (0, 4,0) is not on the plane.
of R2 are R2 itself, lines through
(0,0), and (0, 0) itself
(b) The
subspaces of R‘ are R?¢ itself, three-dimensional planes n-v = 0, two-dimensional subspaces
(nj
-v =O
and nz -v = 0), lines through
(0,0,0,0),
and (0,0, 0,0) itself.
15 (a) Two planes through (0, 0,0) probably intersect in a line through (0, 0, 0)
(b) The
plane and line probably intersect in the point (0, 0, 0)
(c) Suppose x is in
SMT and
y is in SMT. Both vectors are in both subspaces, so x + y and cx are in both subspaces.
19 The column space of A is the “x axis” = all vectors (x,0,0). The column space of B
is the “xy plane” = all vectors (x, y,0). The column space of C is the line of vectors
Gr 2-0):
20 (a) Solution only if b2 = 2b; and b3 = —b,
22 (a) Every b
(b) Solvable only if b3 = 0
(b) Solution only if b3 = —b).
(c) Solvable only if b3 = bo.
23 The extra column b makes the column space larger unless b is already in the column space
of A: [A
vi=[4bawee | (larger column space), [i
ie se ed
(same column
space).
24 The column space of AB is contained in (possibly equal to) the column space of A. If
B=0 and A #0 then AB =O
has a smaller column space than A.
26 The column space of any invertible 5 by 5 matrix is R>. The equation Ax = b is always
solvable (by x = A~'b) so every b is in the column space.
27 (a) False
(b) True
il
tk
0)
28 A= | ib
Ws
-@
ORO
(c) True
tal
lhe || al
@
OP
1
(d) False.
Solutions
to Selected
459
Exercises
Problem Set 3.2, Page 118
(a) Free variables x2, x4,x5 and solutions
(b) Free variable x3: solution (1, —1, 1).
(—2, 1,0,0,0),
(0,0, —2, 1,0), (0,0, —3, 0, 1)
The complete solution is (—2x2, x2, —2x4 — 3x5,%4,x5).
when no variables are free.
The nullspace contains only 0
R=
jL
0
0
0
0
OR OMAO
te
S|
OOO
>:
a0
6
SS
10
0
1
(a) Special solutions
ee
R=H|@O
Y
0
i
©
1
’
IP0 0Bm 0}Sil |
1
=I
=
(3, 1,0) and (5,0, 1)
same nullspace as U and A.
2) fs
OF
|E 0
IP
1
0
3}
0
5
=3
|
(b) (3, 1,0). Total of pivot and free is n.
(a) Nullspace
is the plane —x + 3y + 5z = 0; it contains all vectors (3y + 5z,
y, z)
(b) —x +3y+5z=0 and —2x + 6y + 7z = 0; the line contains all points (3y, y, 0).
(a) False
(b) True
10 (a) Impossible
(GC) eA
e
Ue
(c) True
(d) True.
(b) A = invertible =
1
1
1
ha—ale
12
sae
Or Oo
i=
©©
ooo
Gomes coor
Ce
es
OO
8S
SS
0
0
0
1
’
]
0
0
Ss
Soe
0 CO
OS
0 =)
a) oO
x&
Ss
=
Ore
eS
a
oot
14 If column 1 = column 5 then x5 is a free variable. Its special solution is (—1, 0, 0, 0, 1).
15 There are n —r special solutions. The nullspace contains only x = 0 when r =n.
column space is R™ when r = m.
17 Au
—3
—1]; y and z are free; special solutions
The
(3, 1,0) and (1,0, 1).
20 If ABx = 0 and A is invertible then multiply by A~! to find Bx = 0. The reason U and
LU
23
have the same nullspace is that L is invertible.
ik
lies
»
]
S1/2
=,
23
24 This construction is impossible.
25
26
460 = Solutions to Selected Exercises
27 If the nullspace equals the column
impossible.
space then
n —r
=
r.
If n =
3 then 3 =
2r is
29 R is most likely to be J; R is most likely to be J with fourth row of zeros.
30 (a)
A=
@
0
it
0
0
0
©) aa
4
0
(c)
isos
A=]0O
1 1+)
1 0 0
—3],
ek
This was hard!
31
32 Three pivots.
33 These are the nonzero rows:
34
R=[1
—2
HG
NT
le al
Get
0 0
(a) i Woe: at i a ki Al 3 4]
; ot Ras le
OX
Problem Set 3.3, page 128
(a) and (c) are correct;
columns.
(d) is false because
We odhe
|OO
OO
1
(b) R=} 0
Op
(2) ee
Ab
Om)
050
If all pivot variables come
nullspace matrix is
R might happen
Ol
1
8:
LOO
50
to have
1’s in nonpivot
if
(C) ea
OF
last then R must have three zero blocks
il
ee)
0p
loll
Opes
0
R = i af The
N = Al
The matrices A and A! have the same rank r. But pivcol (the column number) is 2 for A
and
1 for A!:
al
|
and
S=[1]
and
Al 2
11 The rank of R! is also r, and the example has rank 2:
_
ie
ed |
spel
2eee
els
ae
ete
el
eee
= (408
pl
3
Se ele
13 If we know that rank(B! A‘) < rank(A!), then since rank stays the same for transposes,
we have rank(AB) < rank(A).
Solutions to Selected
Exercises
15 Certainly A and B have at most rank 2, so their product has at most rank2.
By by sob cannot bem/mevennin AlbI—we
=
1
st
Oa)
1
a={1
01h,
10
fo i
4][g 4 U]=[1 1
il &
val
0
Ol
8
B=
19 Y is the same
il @
O mee
0 0
De
Since BA
is
we}
==Vl avira! EAN = Ii.
@
OR Oma
o]+]o 0 4
OOS
Pw
coech cocoate
as Z, because
461
Oy
ee
Oe
Ole
ek)
ht
the form is completely
OOO
NOs 0.0
tee
OF ORS SOROS
decided by the rank which
is the
same for A and Al.
Problem Set 3.4, page 136
—2
1 Xcomplete =
—3
O | +22
l
0
5b, — 2b2
3 Solvable
if 25; +b2 = b3. Then
x =|
bp —2b,
2
| + x3]
0
0
1
5b, — 2b3
Ame
pA
4 (a) Solvable if by = 2b; and 3b; —3b3+64 = 0. Then x =
3)
(b) Solvable if by = 2b; and 3b; —3b3+bq4 =0.
Ih Bh
S638
as
db
200
(0)
tan
bo
laps
I
3
0
O72
6 Every b is in the column
2 OWa2e— 10)
particular solution
articular
the particular
soluti
solution
(d) The homogeneous
92
space:
9 (a) xj —x2 and 0 solve Ax = 0
10 (a) The
l
Then x =|
b
by
— 3b; | >
3) 20}
independent rows.
:
(no free variables)
A
5b, — 2b3
b3 —2b, | + x3]
0
—|
-1
|
row 3—2 row 2+4 row
is the zero row
0 0 0 63 —2b2
+ 4b
1
(b) Need b3 = 2b.
Row
3 —
(b) 2x; —2x2 solves Ax = 0; 2x; —x2 solves Ax = BD.
xp
is always
multiplied
by 1
(c)fl
ese
Ukce
Le alate
BseBiliapdlecs
Then
é
solution is x, = 0.
(b) Any
solution
shorter than
, | is18 shorter
the
can
be
02
462 = Solutions to Selected Exercises
12 If row 3 of U has no pivot, that is a zero row. Ux =c
is solvable only if c3 = 0. Ax =b
might not be solvable, because U may have other zero rows.
18 (a)
a8
| yy} =]
Se
4
—]
0}]+y
0
—1
1/+z|
0
EX
4
(oy) || 59 | 1, @
Zz
0
0
1
—1
seal
©
1
19 If Ax; = 5b and Axz = b then we can add x; — x2 to any solution of Ax = B. But there
will be no solution to Ax = B if B is not in the column space.
Ag)
CSN.)
CHULA Qe
w=,
—)),
GS Ch tl, 3).
24 A rank one matrix has one pivot. The second row of U is zero.
26 (uv! )(wz!) = (uz!) times v'w. Rank one unless v! T w = 0.
28 (a) r <™m, always
r <n
inne
Pee
LOE
One: 476
S27 he —
i
©
sO
One
HGF
yee
PZ ib
(OQ) ie Sips iF =v
lite |: 1°22 God
Vie iN
(Ee 2h
(O ho:
©
0
On
nance
esl
MORO
LE
20)
le ean)
0
Oe!
Os
Oren 1s ee
0
QO
50) =1
1
2 |: no solution because ofrow 3.
OS
git
21]; B cannot exist since 2 equations in 3 unknowns
34 A=]|0
(dd) ii
|. The pivot columns contain J so —1 and 2 go into xp.
©
ORO
ee On 0);
OURS
—1
0
2
andxp, =|
(0) 7 =o,
cannot have a unique
Ome
solution.
i
©
@
ent
ire Silos
| ees
it
36 A=
2
Oi
¢
1
3
1
7
4
-Olie0
we0
o| adx= eg
;
Relient
Oo
Om
©
x3
and no solution.
2, | and
luti
yes
k a
Problem Set 3.5, page 150
i
yO
SS
I
cC|
aa
c2 | = O gives
C3
ih
OO
cz3 = cp
= cy
= 0.
But
—2v,
— 3v2
— 403 + vy = O
(dependent).
2 v1, v2, v3 are independent. All six vectors are on the plane (1,1, 1,1)-v =O
of these six vectors can be independent.
3 If
so no four
a=0
then column | = 0; if d =0 then b(column 1) —a(column 2) = 0; if f = 0 then
all three columns end in zero (they are perpendicular to (0,0, 1), they lie in the xy plane,
they are dependent).
Solutions to Selected Exercises
Columns
Same
1, 2, 4 are independent.
answers
(not same
columns!)
463
Also 1, 3, 4 and 2, 3, 4 and others (but not 1, 2, 3).
for A.
If cy (w2+w3)+c2
(wy +w3)+c3(w;+w2) = 0 then (cp+c3)w)
+(c] +¢3) wW2+(c] +c2)W3 =
0. Since the w’s are independent this requires cz + cz3 = 0, cy +c3 =0,c)
+c2 =0. The
only solution is c} = cp = c3 = 0.
(a) The four vectors are the columns of a 3 by 4 matrix A. There is a nonzero solution
to Ac = 0 because there is at least one free variable
(c) Ov; + 3(0, 0,0) = 0.
(a) Line
(b) Plane
(c) Plane in R3
(d) All of R?.
12 b is in the column space when there is a solution to Ax = b; ¢ is in the row space when
there is a solution to Al y =c.
False. The zero vector is always in the row space.
14 The dimension
of S is
(a) zero when x = 0
(b) one when x = CPE 171)
(c) three when x = (1,1, —1,—1) because all rearrangements of this x are perpendicWETS (roy Gila Tek, Ab)
(d) four when the x’s are not equal and don’t add to zero. No x
gives dimS = 2.
16 The n independent vectors span a space of dimension n. They are a basis for that space.
If they are the columns of A then m is not less than n (m > n).
19 (a) The 6 vectors might not span R‘*
(c) Any four might be a basis.
(b) The 6 vectors are not independent
22 (a) The only solution is x = 0 because the columns are independent
(D) RAG
Das
solvable because the columns span R.
25 (a) False for [1
dimensions
1]
(b) False
(c) True: Both dimensions = 2 if A is invertible,
0 if A = O, otherwise dimensions = |
(d) False, columns may be
=
dependent.
27
31 (a) All 3 by 3 matrices
32
—]
0
2
0
0
CO) |?
—|
0
(b) Upper triangular matrices
0
0
2
Oo;
0
|-1
ah) (a) y(x) = e2*
(b) y =x
37 Basis Leones
basis x — 1, x? — 1.x? eal
0
2,
(one basis vector
0
0;
(c) All multiples c/.
0
|-1
0
0
0
Dali
in each case).
38 Basis for S: (1,0, —1, 0), (0, 1, 0,0), (1,0, 0, —1); basis for T: (1, —1,0,0) and (0, 0, 2, 1);
ST
has dimension |.
464
Solutions to Selected Exercises
Problem Set 3.6, page 161
A: Row space (1, 2,4); nullspace (—2,1,0) and (—4,0,1); column space (1, 2); left
nullspace (—2, 1). B: Row space (1, 2,4) and (2,5, 8); column space (1, 2) and (2, 5);
nullspace
(2)
(—4,
il ©
lO)
gl
0,1); left nullspace
(b) Impossible:
(e) Impossible:
basis is empty.
r + (n —r)
must be 3
Row space = column space requires
-9
()eO}
0}
Then
m—r=n—r.
m =n.
(d) |;
—3
if
A: Row space (0,3, 3,3) and (0,1,0, 1); column space (3,0, 1) and (3,0, 0);
nullspace (1,0,0,0) and (0, —1,0, 1); left nullspace (0, 1,0). B: Row space (1), column
space (1, 4,5), nullspace:
empty basis, left nullspace (—4, 1,0) and (—S,0, 1).
Invertible
A:
basis
nullspace
and left nullspace
row
space
=
column
space
basis are empty.
basis
Matrix
=
B:
(1,0,0),
row
space
(0, 1,0),
basis
(0,0, 1);
(1, 0,0, 1,
0,0),
(OMEOFOS TS
O\Rand (OF OMe
OOM) sscolumnespaccsbasisn (e050) sa(Osule
0) (On Os)
nullspace basis (—1, 0,0, 1,0,0) and (0, —1, 0,0, 1,0) and (0,0, —1, 0,0, 1); left nullspace
basis is empty.
Row
space dimensions
3,3,0;
column
2,3, 2; left nullspace dimensions
space dimensions
3, 3,0; nullspace dimensions
0, 2, 3.
11 (a) No solution means that r < m. Always r < n. Can’t compare m and n
m—r > 0, the left nullspace contains a nonzero vector.
12
4]
Om 2
0
i
k
13 (a) False
14 Row
OA
5 oe
DB
2
i
(b) True
(c) False (choose A and B same
space basis
De
4
@
a]
0];
fi
(1, 2,3,4), (0,1,
(b) If
r+m—-—r)=n=3
but 242 is 4.
size and invertible).
2,3), (0,0, 1,2); nullspace
basis
space basis (1,0, 0), (0, 1,0), (0,0, 1); left nullspace has empty
(0, 1, —2, 1); column
basis.
16 If Av= 0 and v is a row of A then v-v
=0.
18 Row 3 —2 row 2+ row 1 =0
so the vectors c(1, —2, 1) are in the left nullspace.
same vectors happen to be in the nullspace.
The
19 Elimination leads to 0 = b3 — by — by so (—1, —1, 1) is in the left nullspace. Elimination
leads to b3 — 2b; = 0 and by + bz — 4b; = 0, so (—2,0, 1,0) and (—4, 1,0, 1) are in the
left nullspace.
20 (a) All combinations of (—1,2,0,0)
and (-4,0, —3, 1)
(b) 1
(oS) 5 Onl)
21 (a) u and w
dependent
22
(a?
7)
(b) v and z
(c) rank < 2 if uw and w
are dependent
or v and z are
(d) The rank of uv! + wz! is 2.
k ; l=
An
4
25 (a) True (same rank)
unsymmetric)
22
2 D2
1
(b) False
(d) True.
|]
A=[1
0]
(c) False (A can be invertible and also
Solutions to Selected Exercises
465
27 Choose d = bc/a. Then the row space has basis (a,b) and the nullspace has basis (—d, a).
28 Both ranks are 2; rows
| and
2 are
a basis; N(B')
separated by a zero; McC!) has (—1,0,0,0,0,0,0,1)
columns 3,4,5,6 of J; N(C) is a challenge.
has six vectors
and
with
| and
—1
(0, —1,0,0,0,0,
1,0)
and
Problem Set 4.1, page 171
(a)
1
2
—3
(d) Row
2
-3
5
-—3
1
—2
1+ row
2+ row
2
1
(b) |—3 | is not perpendicular to | 1
5
l
(c)F Ag
3 = (6,
(ORC e eh)
,
) is not zero;
no such matrix
pom
} |
wall
be in the nullspace and row space; no such matrix.
(a) If Ax = b has a solution and Aly = 0, then y is perpendicular to b.
bly =0
(b) If Ax = b has no solution, b is not in the column
perpendicular to y.
space.
(Ax)! y =
It is not
xX =X;,+Xn, where x, is in the row space and x, is in the nullspace. Then Ax, = 0 and
Ax = Ax; + Axyn = Ax,. All vectors Ax are combinations of the columns of A.
If Ax is in the nullspace of A! then Ax must be zero.
perpendicular to itself.
It is also in the column space =>
(a) For a symmetric matrix the column space and row space are the same
(b) x is in
the nullspace and z is in the column space = row space because A = Ab
x = xᵣ + xₙ = (1, −1) + (1, 1) = (2, 0).
11 A^T y = 0 ⟹ x^T A^T y = (Ax)^T y = 0 ⟹ y ⊥ Ax.

13 If S is the subspace of R³ containing only the zero vector, then S⊥ is R³. If S is spanned by (1, 1, 1), then S⊥ is spanned by (1, −1, 0) and (1, 0, −1). If S is spanned by (2, 0, 0) and (0, 0, 3), then S⊥ is spanned by (0, 1, 0).
Ih Paydl
14 S~lee is the nullspace of A == E ae | Therefore S+ is a subspace even if S is not.
17 The vectors (—5,0, 1,1) and (0, 1, —1,0) span st.
19 For x in V4, x is perpendicular to any vector in V. Since V contains all the vectors in
S, x is also perpendicular to any vector in S. So any x in V+ is also in S+.
2( ~~ Column | of A~! is orthogonal to the space spanned by the 2nd, 3rd, ..., nth rows of A.
24 (a) (1,—1,0)
is in both planes. Normal vectors are perpendicular but planes intersect
is orthogonal to the first line but not on the second line
(c) Lines can
meet without being orthogonal.
(b) (2, —1,0)
466
Solutions to Selected Exercises
Problem Set 4.2, pages 181
Q) p=G.3,.9) e€=C9.49)
@P=U,3,D; €=@,0,0).
\
P=}
Py =5
N=
ape
1
\
Indeed P? = Py. Pr = 7
1
28
a
oA
=
wm
al
915-9)
(1,0,0) = b.
Po =4
4
4
—2
—2
iets
Mi
Cae
et
i
3
il
—2)|. P,P, =0 because a; la.
}
and pz = (3. 9-9) and p3 = (Gs: 3): has
Whales
ells
Pp, = (1,0) and pz = (0.6, 1.2). Then pj + p2 £ Bb.
P =I. If A is invertible,
P = A(ATA)~!A? = AA~!(A!)—1
AT = 1.
(1) p = A(AA)~!A'b = (2, 3,0) and e = (0,0, 4)
The projection of b onto the column
p=t
5
8
So
Peal
—4
space of A is Bb itself, but P is not necessarily J.
—4
and pss (ODA).
a
16 5(1, 2,—-1)+ 3(1, 0, (2
17
(2) p = (4,4,6) and e=0.
al) lheretoresbi
issn the plane.
p? = P and therefore (I — P)* = (I — P)I — P) = 1 — PI— IP+ P* =1— P. When
P projects onto the column space of A,/ — P projects onto the left nullspace of A.
V6Or=176" 173
Sib.
1/6. Mae3
O=|-1/6
4/6
17 |. P=|1/6
5/6 =13
=1/3.
“155273
Weigh ainZhgaaaWe
20
e=(1,=1,=2),
21
(A(ATA)—!AT)? = A(ATA)—“!(ATA)(ATA)—IAT = A(ATA)!AT. Therefore P? = P.
Pb is always in the column space (where P projects). Therefore its projection P(Pb)
is
Ds
24 The nullspace of A! is orthogonal to the column space R(A). So if Ab = 0, the projection
of b onto R(A) should be p=0. Check Pb = A(A'A)~!ATb = A(ATA)—!0=0.
26 A!
exists since
r =m.
Multiply A27=A
by A~! to get
A=/.
27 Ax is in the nullspace of A. But Ax is always in the column space of A. To be in both
perpendicular spaces, Ax must be zero.
So A and A!A have the same nullspace.
Problem Set 4.3, page 192
C
aot
SS
oO
RW.
—
0
8
g
20
;
5
|: Change the right side to p =
1
5
13
1J
i
3.x = (154) becomes
‘exact.
Solutions to Selected Exercises
45
ALA
467
RS| Alb = |B6nlr I 28
¥ = (1,4). The four heights are 1, 5, 13, 17. The
(Mil?
See26
errors are —1,3, —5,3.
The minimum
error is E = 44.
E=(C40D)7+(C
4D 8) + (C +3D —8)° + (C FAD = 20/2. ‘Then EOC =
2C +2(C + D—8)+2(C +3D—8) +2(C +4D — 20) = 0 and E/8D = 1-2(C + D—8)+
.
4 NST IG = el
36
Be 2(CA
— 20)
20) =0.
(Gee 3D== 8)
)+ a4? (C +4D —
= 0. The equations are E |
E=(C—0)*4+
(C —8)*+(C —8) + (C20).
AP =[1
I 1 11-ATA =14] and
Alb
= [36] and C =9. e = (—9, —1, -1, 11).
A=[0
1 3 4]',
ATA =[26] and A?b=[112]
and D = 112/26
= 56/13.
0
8
8
20
0
ee
Cc
D
Ja)
y
1
1
Re
WO
RW
©
=
oe
4 16 64
10
4
8
26
oo
Mey
A
Spy
Sake!
36
Cc
1D) \| == | iti.
400
EB
0
1 | 47
Spas
@
Sj
sales]
5
S
nN
and p=b
11 (a) The best line is x = 1+4?, which goes through (2, 9)
C-m+D->0"
ti = VL, 5j. Divide by m to get
12 (a) ala =m.
a'b = bj +--:+ bm.
b— fa.
7,(bo; — 2)"
lel? =
Therefore
and e=0.
(b) From the first equation:
C+ Di = b.
* is the mean
() p ="G, 3,3),
of the b’s
(b)e=
¢€ = (2,—1,3),
tpl
0
]
ae
as
el
1
13 (AT A)~!AT(Ax — b) = x —f. Error vectors (+1, +1, £1) add to 0, so the x — ¥ add to 0.
14 (® —x)(® —x)? = (ATA)—!AT(b— Ax)(b— Ax)'A(ATA)~!.
“covariance
16
matrix”
(ATA)! AT02A(AT A)!
Take the average to get the
which simplifies to o*(ATA)—!,
7002100 + 7p %99 = Top (bi +--+: + Pio0).-
18 p = Ax = (5, 13, 17) gives the heights of the closest line. The error is b—
20 xX = (9,4).
21 e is in N(A");
p= Ax = (5, 13,17) = b. Error
p is in R(A);
p = (2, —6, 4).
e = 0 since b is in the column space of A.
¥ is in R(A‘);
N(A) = {0}.
23 The square of the distance between points on two lines is E = (y—x)*+@By—x)?+(1+x)2.
LaE/dx = —(y — x) — By —x) + (x +1) =0 and 49E/dy = (y — x) + 3Gy — x) =0.
The solution is x = —3, =
24 e is orthogonal to p;
—3; ji, = z, and the minimal distance is
\le||2 =e!'(b—p)= eb =b'b— b! p.
25 The derivatives of ||Ax — b\|2 are zero when x = (AT A)—1AT.
=3
468 — Solutions to Selected Exercises
Problem Set 4.4, page 203
(a) ATA = 16]
(b) ATA is diagonal with entries 1, 4, 9.
ee
(a) Q=|0
independent
1},
0
0
QGQ'=]0
1205-0
1 O
O10: 00
(b) (1,0) and (0,0) are orthogonal, but not
(0) (4.4.4.4), Goh -f-p GbE D.C BEAD.
If Q; and Q> are orthogonal matrices then (0102)' 0102 = OF 01010 = O10) = I
which implies that Q;Q> is orthogonal also.
The least squares solution is ¥ = QO'b. This is 0 if O = ld and b= bak
(a) If q₁, q₂, q₃ are orthonormal then the dot product of q₁ with c₁q₁ + c₂q₂ + c₃q₃ = 0
gives c₁ = 0. Similarly c₂ = c₃ = 0
(b) Qx = 0 ⇒ QᵀQx = Qᵀ0 = 0 ⇒ x = 0.
(a) Two orthonormal vectors are pl. B54
Oe 1) and 79 (7,
vector in the plane is the projection 20'(1,0, ORO} > =
3,
—4,5,
-1)
=90
(b) The closest
(0.5, —0.18, —0.24, 0.4, 0).
If qj and q> are orthonormal vectors in R> then qi}b)qy + (q3.b)q2 is closest to b.
(a) aib = al (x14) + x2a2 + x343) = xy(al aj) =x)
(b) alb = a} (xjay + x2a7 + x3a3) = x(a} ay). Dheretore sep alb/ata
(c) x1 is the first component of A~! times b.
14
15
Feelin, pile
abel eel
Hegel
llal|
N=CHA R=CH Jy) and | o)=la o]| 0
(a) gy = 4(1,2,-2), go =4(2,1,2), 93 =4(2,-2,-1)
contains q3
qi
BI!
(db) The nulllspace of AT
(c) ¥ = (A?A)—!AT(1,2, 7) = (1,2).
16 The multiple
p =
aba =
ga =
za is closest; q, = a/|la|| =
5a and q2 = B/||B\| =
¥3 (6, —3, 10, 10).
18 If A = QR then AᵀA = RᵀR = lower triangular times upper triangular. The pivots of AᵀA are 3 and 8.
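A short MATLAB check of AᵀA = RᵀR (my own sketch; any matrix with independent columns will do):

    % A'A = (QR)'(QR) = R'Q'QR = R'R, a lower times upper triangular factorization
    A = [1 1; 1 2; 1 3];
    [Q, R] = qr(A, 0);               % "economy size" QR with a 2 by 2 R
    disp(A'*A - R'*R)                % zero matrix (up to roundoff)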
19 (a) True
(b) True.
20 The orthonormal
Ox = x1q) +242.
\| Ox ||2 = ce +x3 because q| -q2 = 0.
vectors are (5, 5 5.4) and (—5/V52, —1/V52,
1/V52, 5/52).
Then
b projects to p = (—3.5, —1.5, yr 1.5). Check that b — p is orthogonal to both vectors.
21 A=(1,1,2),
B=(,-—1,0),
C=(—1,-1,
1). Not yet orthonormal.
23 (a) One basis for this subspace is vy = (1,—1,0,0),
(6) (1
ad)
v2 = (1,0,—1,0),
v3 = (1,0,0, 1)
(c) by = (5,4, 5,9) and by = (4,4, 4, 3).
iy
B
25 (q3C*)qo is the same as a ae because q7 = [By]
and the extra q; in C* is orthogonal
to q2.
28 There are mn
multiplications in (11) and sm 2 n multiplications in each part of (12).
32 (a) Qu = (I — Quu')u =u —2uu'u.
(b) Qv=(1
This is —u, provided that ulu equals
— 2uu')v =u—2uu'v=u,
|
provided that uly =0.
33 The columns of the wavelet matrix W are orthonormal.
7.3 for more
Then
W~!
=
W!.
See Section
about wavelets.
Problem Set 5.1, page 213
det(5A) = (4)?
det A= —3 and det(—A) = (—1)? det A = 3 and det(A?) = 9 and
det(A~!) = —3.
(a) False
(b) True
4 Exchange rows
(Sine
(c) True
(d) False.
| and 3. Exchange rows
leet
e
nl
—
le
1 and 4, then 2 and 3.
Dhesdetenminants
area
lee Ieee
lean lenn | ee.
lJio1| = 1.
Multiply the zero row by ¢t. The determinant is multiplied by ft but the matrix is the same
=
deu—0)!
O'=Q7!
and O'QO=1
5/0)? =15|0|=H1.
10 If the entries in every row add to zero, then (1, 1,..., 1) is in the nullspace: singular A
has det = 0. If every row adds to one, then rows of A —J
add to zero (not necessarily
det A = 1).
11 CD = −DC ⇒ |CD| = (−1)ⁿ|DC| and not −|DC|. If n is even we can have |CD| ≠ 0.
12 det(A⁻¹) = det [ d/(ad−bc)  −b/(ad−bc) ; −c/(ad−bc)  a/(ad−bc) ] = (ad − bc)/(ad − bc)² = 1/(ad − bc).
13
det(A) = 24 and det(A) =5.
15 det A = 0 and det K = 0.
16
Any 3 by 3 skew-symmetric K has det(Kᵀ) = det(−K) = (−1)³det(K). This is −det(K).
But also det(Kᵀ) = det(K), so we must have det(K) = 0.
19
cet
“Oi
a= Ibe
ae = (ad
— bc)(1 = LI).
Rules 5 and 1 give Rule 2.
det(A) = 3, det(A~!) = 4, det(A — Al) = A? — 44 +3. Then A = 1 and A = 3 give
det(A — AJ) = 0.
Row 2 =2
Row
times row
| so det A = 0.
3 — row 2 = row 2 — row
af/aa
be
affac]
| ate
pile
aac.
ad—be
| so A is singular.
ae
|.
_1
|-ax
a
ad—bc
d *) =
a
=A™.
Problem Set 5.2, page 225
det A = 4, its columns are independent; det B = 0, its columns are linearly dependent.
Each of the 6 terms in det A is zero; the rank is at most 2; column 2 has no pivot.
a₁₁a₂₃a₃₂a₄₄ gives −1, a₁₄a₂₃a₃₂a₄₁ gives +1, so det A = 0; det B = 2·4·4·2 − 1·4·4·1 = 48.
(a) If a₁₁ = a₂₂ = a₃₃ = 0 then 4 terms are sure zeros   (b) 15 terms are sure zeros.
8 5!/2 = 60 permutation matrices have det(P) = +1. Put row 5 of I at the top.
9 If a₁α a₂β ··· a_nω ≠ 0, put rows 1, 2, ..., n into rows α, β, ..., ω. Then these nonzero
a's will appear on the main diagonal.
3
11 C=
.C=]
SS
0
0
=a
42 235
—21
14]. |B} =1(0) +2(42) + 3(-—35) = —21.
ste Meee
Sn 2iae
40 0
2 4 2 and, AC! = 0400) Therefore Ay = C'.
(ersee
00 4
1-1
1-1
ivan
= 2det}-1
2 -1]+det}-1
2
= 21B,)~ aet|
13 |B4|
12
C=
-1
2
-1
-1
]=21531-
at
vie
|B9|.
15 Must choose 1's from column 2 then column 1, column 4 then column 3, and so on.
Therefore n must be even to have det Aₙ ≠ 0. The number of row exchanges is n/2 so
Cₙ = (−1)^(n/2).
16 The 1,1 cofactor is Eₙ₋₁. The 1,2 cofactor has a single 1 in its first column, with cofactor
Eₙ₋₂. Signs give Eₙ = Eₙ₋₁ − Eₙ₋₂. Then 1, 0, −1, −1, 0, 1 repeats; E₁₀₀ = −1.
17 The 1,1 cofactor is Fₙ₋₁. The 1,2 cofactor has a 1 in column 1, with cofactor Fₙ₋₂.
Multiply by (−1)^(1+2) and also (−1) from the 1,2 entry to find Fₙ = Fₙ₋₁ + Fₙ₋₂.
20 G₂ = −1, G₃ = 2, G₄ = −3, and Gₙ = (−1)ⁿ⁻¹(n − 1).
21 (a) If we choose an entry from B we must choose an entry from the zero block; result
zero.
This leaves two entries from A and two entries from D leading to |A||D|
(b) and (c) Take a=|i a B=) a Gal
0
0
1 0
vs ae
0
0
22 (a) All L’s have det = 1; det U; = detAy =2 then 6 then —6
E20
23 Problem 21 gives det heer
If AC =CA
|=
A
Oni
Be
(b) Pivots 2, 3, —4.
B
1 and aet|¢ 4 = |A| times
Aj
|D—CA*B|.
this is |AD — CAA~!B| = |AD—CB|.
24 If A is a row and B is a column then det M = det AB = dot product of A and B. If
A is a column
—al)e
Ti
and B is a row then AB
has rank
| and det M = det
AB = O (unless
25 (a) det A = ay, Aj, + 412A12 +--- +4 nAjn.
The derivative with respect to aj, is the
(b) dIn(det A)/da,,; = A,,/det A is the (1, 1) entry in the inverse matrix
cofactor A,
fa!
27 There are five nonzero products, all 1’s with a plus or minus sign.
column) numbers and the signs:
Here are the (row,
+ (1, 1), 2)(3, 3)(44,4 + C, 2)(2, D3, 494, 3) — d, 2)(2, DG, 3)(4, 4)
— 0, 1I)Q, 2)(3, 44, 3) — d, DQ, 3)G, 2)4, 4).
Total 1+ 1—1—1—1=-1.
29 |S₁| = 3, |S₂| = 8, |S₃| = 21. The rule looks like every second Fibonacci number ... 3,
5, 8, 13, 21, 34, 55, ... so the guess is |S₄| = 55. Following the solution to Problem 28
with 3's instead of 2's gives |S₄| = 81 + 1 − 9 − 9 − 9 = 55.
31 Changing 3 to 2 in the corner reduces the determinant by 1 times the cofactor of that
corner entry. This cofactor is the determinant of Sₙ₋₁ (one size smaller), which is F₂ₙ.
Therefore changing 3 to 2 changes the determinant from F₂ₙ₊₂ to F₂ₙ₊₂ − F₂ₙ, which is
F₂ₙ₊₁.
Problem Set 5.3, page 240
—
(det
A= 5.)
|Bij=3,
|Bo|=—2,
det
Biy— 105)
3 (a) xy = 3/0 and x2 = —2/0:
4 (a) det([b
a
idetb> =) —siso x) = 2) and 25) — 1
(b) |A| = 4,
|B3|=1. Therefore xy = 3 and x. =—} and x3 = }.
no solution
a ])/ det A, if det A
40
(b) x; = 0/0 and x7 = 0/0: undetermined.
(b) The determinant
is a linear function of
its first column so we get x;|a@, @2 a3| + x2\/az az a3| + x3|a3 az a3].
determinants are zero.
The last two
If the first column in A is also the right side b then det A = det B₁. Both B₂ and B₃ are
singular since a column is repeated. Therefore x₁ = |B₁|/|A| = 1 and x₂ = x₃ = 0.
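Cramer's Rule is easy to test numerically; a small MATLAB sketch of my own (not the book's cramer Teaching Code), for an invertible 3 by 3 example:

    % x_j = det(B_j)/det(A), where B_j replaces column j of A by the right side b
    A = [2 1 0; 1 3 1; 0 1 4];  b = [1; 2; 3];
    x = zeros(3,1);
    for j = 1:3
        Bj = A;  Bj(:,j) = b;        % replace column j by b
        x(j) = det(Bj) / det(A);
    end
    disp(norm(x - A\b))              % agrees with Gaussian elimination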
eo
(a) | 0
oe
2
ae
i
0
4
If all cofactors
(b)
a
=
eae
3
1
1
5
1
5 . The inverse of a symmetric matrix is symmetric.
1
1
a
Ai
0 then
A~!
would
3
i
;
be zero if it existed;
has no zero cofactors
but it is not invertible.
6 ate-20
C=|-3
4 -1]
On 10
ral
and AC'=|]0
:
it cannot
:
exist.
A =
ol
hi
ae OO)
3 O|. Therefore det
A =3.
OOS
10 Take the determinant of both sides. The left side gives det ACT = (det A)(det C) while the
right side gives (det A)”. Divide by det A.
1
11 We find det A= (detC)*-! with n = 4. Then det A~! is 1/detA. Construct AW! using
the cofactors. Invert to find A.
13 Both det A and det A~! are integers since the matrices contain only integers. But det A~! =
1/det A so detA =1
or —1.
15 (a) C21 = C31 = C32 = 0
(b) Cy2 = C21, C31 = C13, C32 = C93.
16 For n = 5 the matrix C contains 25 cofactors and each 4 by 4 cofactor contains 24 terms
and each term needs 3 multiplications: 1800 vs. 125 for Gauss-Jordan.
17 (a) Area |22| = 10
311
18 Volume =
|131|
(b) 5
(c) 5.
= 20. Area of faces = length of cross product
113
;|211
19 (a) 5/341
timate
==85
(b) 5 + new area 5] 051
=p) cea, =
De
-101
21 (a) V = L,L2L3 when the box is rectangular
< (/3)? < 6. Challenge question:
(b) |ajj| < 1 gives each L < J/3. Then
Is volume = 5 possible?
1.0.0
|0 1 Oj];
De BOS
24 The box has height 4. The volume is 4. The matrix is
(k-w)
= 6V2.
3ut
051
volume
ijk
|311|
ix yj =k
and
and 2n(n — 1)-dimensional
faces.
=4.
25 The n-dimensional cube has 2” corners, n2"-1
edges
The cube whose edges are the rows of 2/7 has volume 2”.
26 The pyramid has volume
28
i
singcos@ pcosycosé
—psing sind
singsin@ pcosgsin@
psingcosé
cos p
29
a The 4-dimensional pyramid has volume
—psing
dr/dx or/dy
0/ax 30/ay
=
|
= p- sing, needed
4:
for triple integrals in spheres.
0
cosé = sin®
1—-1sino
1r cosd
r
1
Pp
30 The triangle with corners (0,0), (6,0), (1,4) has area 24. Rotated by 6 = 60° the area is
:
1 288
unchanged.
32 Ve
The determinant of the rotation matrix is J =
2a)
lee On|
i 2 2
deta
34 (wx u)-v=(vx
cos@
—sin@ | _ | 32
sin@ cosé
2
2/=1,
D
il
0)
w)-u=(u
x v)-w:
Cyclic =
even permutation
of (u, v, w).
35 S = (2,1,-—1).
The area is ||PQ x PS|| = ||(—2, —2, —1)|| = 3. The other four corners
could be (0,0, 0), (0,0, 2), (1, 2,2), (1, 1,0). The volume of the tilted box is 1.
ASS
36 If (1,1,0),
So
37 det}
(1,2, 1) and
(x,y,z) are coplanar the volume
is det}
Wap Ke
ce
3
il
WOR
2 1|=0O=7x
2D 3
—5y +z; plane contains the two vectors.
1
1
he
0]
=0
=
Problem Set 6.1, page 253
A and A* and A®™ all have the same eigenvectors.
1 and 0.25 for A2, 1 and 0 for A®.
The eigenvalues are | and 0.5 for A,
Therefore A? is halfway between A and A™.
Exchanging the rows of A changes the eigenvalues to 1 and —0.5 (because it is still a
Markov matrix with eigenvalue 1, and the trace is now 0.2 +0.3—so the other eigenvalue
is —0.5). Eigenvalues can change completely when rows are exchanged.
Singular matrices stay singular during elimination.
X41 = —1 and Az =5 with eigenvectors x; = (—2,1) and x» = (1,1).
has the same eigenvectors, with eigenvalues increased to O and 6.
The matrix A-+/
Avhas;s Aq =
A? has the same
—3 and Xo —
2 with x7 =
G,—2)
and x», =
d= 1):
eigenvectors as A, with eigenvalues Ke = 9 and v3 = 4,
A and B have A; = 1 and Az = 1. AB
and BA have 4 = 5(3+ /5). Eigenvalues of AB
are not equal to eigenvalues of A times eigenvalues of B. Eigenvalues of AB are equal
to eigenvalues of BA.
(a) Multiply
Ax
to see Ax which
reveals A
(b) Solve
(A —AJ)x
10 A has A, = 1 and Ay = .4 with x; = (1,2) and x2 = (1,—1).
Az = 0 (same eigenvectors).
So A!
12 (a) Baa
(uu!)u
5(-1 +i/3);
15 Set 24 =0
has A; =
1 and
— (.4)100 which is very near zero.
=
u(ulu)
=O
nN —
x2 = (—3,0,1,0),
4 =-—1
oll
(b>) eve
(uu! )v =
u(u! v) a)
x3 = (—S,0,0, 1) are eigenvectors with 1 = 0.
and | (the eigenvalue
| is repeated).
to find det A= (A1)(A2)--- Qn).
16 If A has A; =3
and Az = 4 then det(A —AJ)
NM = 5(a +d+J(a—d)*
19
lands
A™
is very near A™.
(c) x} = (—1,1,0,0),
14 i
A100 has i
= 0 to find x.
+4bc)
=QA—3)Q-4
= Neaen
and dr» = s(at+d— re eee dae
7h S od
Always
Their sum
is a+d.
hase
OO OS Ose
Oy
A=| 38 “Al
22 A = 1 (for Markov), 0 (for singular), -4 (SONSUDIN==ntraces— 5).
26 =) Ihe Py Si We
27 rank(A) = 1 with 2.= 0, 0, 0;:4;
28 Behas
1
and B=A-—TI
rank(C) = 2 with 2 =0,
1b eon sondey b= —See he)
has A=-—1, —1, —1, —l, 4.
30 b i H = ea
Gadel
Carn!
= a+o){
Dy
A2 =d—b
0,: 2,,2.
iimatnixAn
—
to produce trace =a-+d.
32 (a) u is a basis for the nullspace, v and w give a basis for the column space
(O, 7 t) is a particular solution.
Add
any
(c,0, 0)
would be in the column space, giving dimension
3.
(c) If Ax =u
(e)) 28 =
had a solution,
u
Problem Set 6.2, page 266
=
Fea oml reel raanl ares eal beelllea
WI)
Kl
If A = SΛS⁻¹ then the eigenvalue matrix for A + 2I is Λ + 2I and the eigenvector matrix
is still S. A + 2I = S(Λ + 2I)S⁻¹ = SΛS⁻¹ + S(2I)S⁻¹ = A + 2I.
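A small MATLAB check of this shift (my own example matrix):

    % A + 2I = S(Lambda + 2I)S^-1 : same eigenvector matrix S, eigenvalues shifted by 2
    A = [4 1; 2 3];
    [S, Lam] = eig(A);                               % A*S = S*Lam
    disp(norm((A + 2*eye(2))*S - S*(Lam + 2*eye(2))))  % zero: S still diagonalizes A + 2I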
(a) False
(b) True
eel
i eer
(a) Aa
(0) eel
(c) True
il rae
Alhas Ay =1,
De
(d) False.
Ag =—4 with x} =(1,1),
2
ale ee
le
‘ Foy = 6765.
i
x2 = (1, -2)
2
1
ae
athe
att Saga
ae ale
jfee=[
4
3
3
3
3
o[%r]-«[8)-] itl-|
Geri
|
2
ae 1
Bt
1
SAN
sok
13 A% (x, +Xx2) = Ax, se A29x5 = oa af eee
The second component
is ie + eu
jaoyy) = May P27L,
15 (a) True
(b) False
(c) False.
16 (a) False
(b) True
(c) True.
20
21
23
0
\, 1 2]
\
=
1
1 F
k
1
0
Tea
2
s=[
Ei;
ako
|
j] ant sats]
ale
SG
a
B}0 I =
sum
ce
Met ie Oe
RS es ae
eae:
101
2 |=
acto
ie
of those two.
Oe
=
Oe soa
Tae
Of zaihe@
aia
pha
24 det A = (det S) (det A) (det S~!) = det A = 41 ---An. This works when A is diagonalizable.
26 Impossible since the trace of AB — BA is trace AB — trace BA = zero.
E = li a
29 If A has columns (x1,..., X,) then A2 = A means CAxiSe as Go) = (CSilposonkso
vectors in the column space are eigenvectors with
Dimensions add to n => A is diagonalizable.
30 Two problems:
The nullspace and column
pendent eigenvectors in the column space.
31 n= Sy AS
= E
space can overlap.
i the square root of B would
2
(imaginary) so the trace is not real.
E
“All
A = 1. Always the nullspace has 4 = 0.
There may not be r inde-
have 4 = J/9 (real) and A = /-1
| does have a real square root |: al
35 If A = SAS! then the product (A—A ,/)---(A—AnJ) equals S(A—A,/)--- (Nese
The factor A —A;/ is zero in row j. The product is zero in all rows = zero matrix.
36 a2 —1 factors into (A — 1)(A+1) so Problem 35 gives (A—J/)(A+/) = zero matrix. Then
A* =I
and
not necessary.
A=A™!.
Note:
Factoring
p(A) = det(A — AJ) into (A —A1)-:-(A— An) Is
The Cayley-Hamilton Theorem just says p(A) = zero matrix.
3H/ (a) The eigenvectors for
4 = 0 always span the nullspace
X # 0 span the column space if there are r independent
multiplicity = geometric multiplicity for each nonzero i.
(b) The eigenvectors for
eigenvectors: then algebraic
Problem Set 6.3, page 279
uy =e! lal u, =e! ei If u(0) = (5, —2), then u(t) = 3e% A +- Qe! E |
z(t) = —2e’; then dy/dt = 4y—6e! with y(0) =5
gives y(t) = 3e* +2e! as in Problem 1.
1? = E | |» Then 4 = 4(5+ V41).
/
E al has Ay = 5, x1 = (2,1),
r(t) = 20e% + 10e,
x2 = (1,2);
Ag =2,
wo) — 10%! + 20e2!. The ratio of rabbits to wolves approaches 2 to 1.
When A is skew-symmetric, Q = e^(At) is an orthogonal matrix and ‖e^(At)u(0)‖ = ‖u(0)‖.
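A numerical illustration of that orthogonality (my own sketch using MATLAB's expm):

    % For skew-symmetric A, Q = e^{At} is orthogonal, so lengths are preserved
    A = [0 -2; 2 0];  t = 0.7;  u0 = [3; 4];
    Q = expm(A*t);
    disp(norm(Q'*Q - eye(2)))        % Q is orthogonal
    disp(norm(Q*u0) - norm(u0))      % ||e^{At}u(0)|| = ||u(0)||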
0 []=t[] ALi} oie [te] [i]
a
1
i
10
1
wl
4
Bye
|eil
0
4
12 Up = A7!b; Up = 4andu(t) = ce2! +4: Up = B and u(t) = cye [| rene HEE
a
13 Substituting u = ev gives ce“v = Ae“v —e“b or (A—cl)v =b or
particular solution.
17 The solution at time t
+7 is also e4¢+7)u(0).
18
1 1
fe eit? 1s) ee wea
Aral lela zi is ah s5| as
20
e
(a) The
inverse
of eA!
is eA!
v= (A— cl)
A
equals e ACQt+T)
Thus e4! times e4!
(b) If Ax = Ax
then ely
= eMx
and e
!b=
24 x(t) = e*! and y(t) = —e*! is a growing solution.
is
The matrix for the exchanged system
, ‘i and it does have the same eigenvalues as the original matrix.
Problem Set 6.4, page 290
O=F|
Ze
te
(2-20 =i
-1 -2 2
(a) E
Al
(b) The pivots have the same signs as the A’s
(c) trace = 1,+A2 = 2,
so A can’t have two negative eigenvalues.
IfA? = 0 then all 2° = 0 so all
A=
OA} gl = 0 gives
Oma
A= 0 as inA =|0
A = 0 and therefore
al If A is symmetric
then
A = 0.
If A is complex then d is also an eigenvalue and A+
third eigenvalue must be real.
is real. The trace is real so the
10 If x is not real then A = xl Ax/x'x is not necessarily real.
0
12
4 =ib and -ib}
A=|-3
0
3
0
0
4]
—4
0
has det(A—AJ) = —A2—25A
=O and A=0,
Si, —Si.
13 Skew-symmetric and orthogonal; 4 =i, i, —i, —i to have trace zero.
15 (a) AZ = dy, Aly SO
ATA are > 0
to
(On
OS— 1)
ATAz
=
Alay)
=
(b) The eigenvalues
a
ORO)
of B are —1, —1,
ea — (ONION):
16 The eigenvalues of B are 0, V2, —J/2 with x; =
a
Aly
=
iz.
The
eigenvalues
1, 1 with x; =
(1,—1,0),
x2 =
of
(1,0, —1,0),
X3 I|
(1,1, V2),
Ly)
Te
17 (a) [xj
x2] is an orthogonal
matrix so P, + P2 = xixe + xzxt = [x1
x ||
2
(D) PE eae
(xl1 xo)xJ2 ==).
18 y is in the nullspace of A and x is in the column space. But A = A! has column space
= row space, and this is perpendicular to the nullspace. If Ax = Ax and Ay = fy then
(A — BI)x = (A — B)x and (A — B1)y = O and again x L y.
21
ale?
(a) False. =, ‘|
(c) True.
A= OAQ7!
and A~! = QA-!97!
16
ial
is also symmetric
ik
il
elle
fi
ot
(die Ealsestor2At—s|
22
ile
ae
13
(b) False: E AliF sek
If A’ = —A then ATA = AAT = —A?. If A is orthogonal then ATA = AAT = J.
AN = be ;|is normal only if a =d.
23 A and A! have the same i’s but the order of the x’s can change.
A =
ee Alhas
4 =i and —i with x; = (l,i) for A but x; = (1, —i) for A.
25 Symmetric matrix when b = 1; repeated eigenvalue when b = —1.
Problem Set 6.5, page 302
ait a
f
a
olf) 9s
{ Wott
0 27 |tee
al Solel ale: Real: ‘ft age
wel ie s)-E if .*s][o 7)
y =—1.
x? + dxy + 3y? = (x + 2y)? — y? is negative at x =2,
xT AT Ax = (Ax)! (Ax) = 0 only if Ax = 0. Since A has independent columns
this only
happens when x = 0.
4-4
A=
8
|-4
4
8
-8
determinant
-—8 | has only one
pivot = 4, rank
A =
1, eigenvalues
are 24, 0, 0,
16
= 0.
VAri==)25 5 (A> |= 65
Ac| = 308 the pivots, aree2/
1516/25, 30/6:
A is positive definite for c > 1; B is never positive definite (determinants d — 4 and
—4d + 12 are never both positive).
If aj, were smaller than all the eigenvalues, A —aj,,/ would have positive eigenvalues (so
But it has a zero in the (1, 1) position.
positive definite).
19 If Ax = λx then xᵀAx = λxᵀx. If A is positive definite this leads to λ = xᵀAx/xᵀx > 0
(ratio of positive numbers).
22
Sy tii
R=
pli
Sil [0/9
spell IAD
il
st levee me [9
2
2g
Bed 2a
Ree pee Lae
bn lesen
ee
alec
=1
23 λ₁ = 1/a² and λ₂ = 1/b², so a = 1/√λ₁ and b = 1/√λ₂. The ellipse 9x² + 16y² = 1
has axes with half-lengths a = 1/3 and b = 1/4.
27 ax² + 2bxy + cy² = a(x + (b/a)y)² + ((ac − b²)/a)y².
28 det Ar 10:
2
andi
ach
(Cosi —SinG)
eee
asin. 6, cos 6); the A’s are positive.
2
29 Ay =
‘e
5x7 +y=0;
Z| is positive definite if x #0;
f) = (5x? + y)? = 0 on the curve
Ar= lt ‘|is indefinite and (0, 1) is a saddle point.
31 If c > 9 the graph of z is a bowl, if c <9 the graph has a saddle point. When c = 9 the
graph of z= (2x + 3y)” is a trough staying at zero on the line 2x + 3y = 0.
Problem Set 6.6, page 310
ieee
(MN)~'A(MN)
Eight families
so ifB is similar to A and C is similar to B, then A is similar to C.
of similar matrices:
6 matrices
have
4 = 0, 1; 3 matrices
have
4 =
1, 1
and 3 have A = 0, 0 (two families each!); one has X = 1, —1; one has 4 = 2, 0; two have
a7 (4/5).
(a) (M~!AM)(M7~!x)
M~!AM
10
=
have the same
pose
year
0
M~!(Ax)
=
M!0=0
(b) The
nullspaces
of A and of
dimension.
ae oe
cl|0 c¢
et
Qe
yas
ee
On Ace
me
aCe’
3a
jel
lm
CO
tke
}
all Olweetee wy
Fe aleeeee lt amma
Ores
he
x0)
—-l_
el
ae
13 Choose M; = reverse diagonal matrix to get M eee M; = mM} in each block; Mo has those
blocks
M; on its block diagonal;
(MMoM")-!A(MMoM!).
15
then Al = (M~1yTyT yt =
(M-!)TM,)'JMoM™
_
A? is similar to A.
det(M⁻¹AM − λI) = det(M⁻¹AM − M⁻¹λIM) = det(M⁻¹(A − λI)M) = det(A − λI).
17 (a) True: One has 4 = 0, the other doesn’t
trix and A is symmetric
(c) False:
a
LO
All eigenvalues of A+ / are increased by 1.
(b) False.
Diagonalize a nonsymmetric ma-
| and G in are similar
1
(d) True:
19 (b) AB has all the same eigenvalues as BA plus n — m zeros.
Problem Set 6.7, page 318
—
ATA=
2
20
#4 has oe =,
80
Oi] = eval
4/17
uy = bes for the column space, vy = bee
for the nullspace, vz = ERs
7
“1/
el
v7) —
for the row space, uz = Lees
for the left nullspace.
1
2
es.
;
oo)
:
Hie
fv?
-
7
1/./2
/
iF 4 has of = 3 with uj = lial and of = 1 with u2 = bea
co
ATA=]1
i
1/V6
iO
1] has o? =3 with »,=| 2/V6|,
2
1/J2
02 =1 with» =| 0 |; and
1/V3
v3 = |-1/V3
1/V3
a
1
0
[ I f= [a u2 |
v3
| [m v2 3]
1
0
Are
io my
11 Multiply UΣVᵀ using columns (of U) times rows (of ΣVᵀ).
Suppose the SVD of R is R = UΣVᵀ. Then multiply by Q. So the SVD of this A = QR is (QU)ΣVᵀ.
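A quick check of this SVD fact (my own example: Q is a plane rotation, R is a small upper triangular matrix):

    % If R = U*Sigma*V' and Q is orthogonal, then A = Q*R has the SVD (Q*U)*Sigma*V'
    theta = 0.3;  Q = [cos(theta) -sin(theta); sin(theta) cos(theta)];
    R = [3 1; 0 2];
    [U, Sig, V] = svd(R);
    A = Q*R;
    disp(norm(A - (Q*U)*Sig*V'))     % zero: same Sigma, columns of U rotated by Q
    disp(svd(A)' - diag(Sig)')       % singular values are unchanged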
15 (a) If A changes to 4A, multiply Σ by 4   (b) Aᵀ = VΣᵀUᵀ. And if A⁻¹ exists, it
is square and equal to (Vᵀ)⁻¹Σ⁻¹U⁻¹.
Problem Set 7.1, page 325
oS
(a) S(T(v)) =v
Choose
(b) S(T (v1) + T(v2)) = S(T (v1) + S(T (02)).
v = (1,1) and w = (—1,0).
Then
T(v) + T(w) = v+w
(a) T(T(v)) =v
(b) T(T(v)) = v + (2, 2)
(d) T(T(v)) = T().
(a) T(1,0) =0
Th(Oi)
(b) (0,0, 1) is not in the range
2)
A is not invertible.
(22) sen
AM =/
(Cc) RON
10%
(GeD) — le 1) +522, 0) then T(v) = b(2, 2)+(0, 0).
is impossible.
No matrix A gives A i ‘|=
but T(v+ w) = (0,0).
(c) T(T(v)) =—-v
A hs es = E a
E Ar To professors:
The matrix space has dimen-
sion 4. Linear transformations come from 4 by 4 matrices. Those in Problems
special.
(a) True
(b) True
(c) True
13-15 were
(d) False.
T(1) =0 but M = i s|= T(M); these fill the range. M = b | in the kernel.
T(T~!(M)) = M so T~!(M) = A~!MB-!.
(a) Horizontal lines stay horizontal, vertical lines stay vertical
a line
(c) Vertical lines stay vertical.
(a) ad —bc=0
(b) ad —bc > 0
(b) House squashes onto
(c) jad — be| = 1.
If two independent vectors are transformed to themselves then by linearity T = /.
This emphasizes that circles are transformed to ellipses. See also Figure 7.7.
Problem Set 7.2, page 337
LI S0 =
S09)
— OF S03)
— 20S
= 0p
eB —
CNS
SS)
&) SS)
(ea)
Qo)
QQ So
SSO)
4 Third derivative has 6 in the (1,4) position; fourth derivative of cubic is zero.
See As
Ql
ean Oc
@
i
al
()
al
6 Tv
+02 + 03)
=2w, + w2 + 23; A times (1,1, 1) gives (2, 1, 2).
7 v =
c(v2 — v3) gives T(v) = 0; nullspace
(0, c, —c).
9 We
don’t know
7(w)
unless
is (0O,c, —c);
the w’s are the same
solutions
as the v’s.
are
(1,0,0)
+
any
In that case the matrix
is
A?
12 A+
=
1°
@O
|=1
1
0 -l
¢
0 | so T~!(w}) = i) = Ode
1
the only solution to T(v) = 0 is
15 (a) E |
T~!(w2) — 1b) = DA
v= 0.
(b) E i = inverse of (a)
—|
16 (a) M= E i
T~!(w3) —aae
(b) N= E “|
—(c) alé| must be 2A bl
(c) ad
= be.
pwr=[! JE J=[2 3]
21 w(x) =1—x?;
w3(x) = A(x? —x);
0
YA
T'S OY OS
||
5
1
0
QO
—.5
-l
5)
].
y=4uw, +5w2 + 6w3.
v’s to w’s:
inverse matrix
25 a2 = 11241 + 12242 gives az as a combination
Swe
26 Row 2 of A is /9;(rowl
invertible.
30
T(x, y) =,
30 fe 2(@-—a)
sin2(9—a@)
|
1
1
= | 1
0
0
1
—-l
1
of the qg’s. So the change of basis matrix
of U) + /o2(row2 of U). The change of basis matrix is always
—y) and then S(x, —y) = (—x, —y).
Thus
—sin2(0—a)
cos2(6 — al rotates by 2(0 — a).
33 False, because the v’s might not be linearly independent.
ST = —I.
Problem Set 7.3, page 345
The inverse matrix is W~! =
v=
1
1
1
i
4
4
4
4
1
Lee
:
:
iEN Geel
a
4
a0
OF Os
ae
1
aaa)
|. Then e = wy ar iw a 5w3 and
1
wWw3+4+ W4.
The eight vectors are (1,1,1,1,1,1,1,1) and the long wavelet (1,1,1,1,−1,−1,−1,−1) and
two medium wavelets (1,1,−1,−1,0,0,0,0) and (0,0,0,0,1,1,−1,−1) and four short wavelets
with 1, −1 shifting each time.
The Hadamard matrix H has orthogonal columns of length 2. So the inverse is Hᵀ/4 = H/4.
The transpose of WW-!=1
is (w-lyTwt = 1. So the matrix W! (which has the w’s
in its rows) is the inverse to the matrix that has the w*’s in its columns.
Problem Set 7.4, page 353
10
A Tae
ey
20
aa has A == 50 and 0,
|
2
n=x, vil!
[5], v2 Ee 5 [ih o, == V50.
By
il
‘4 has 4 == 50 and 0,
a= le 1
u=F5{5|
AA [s=
Ag
Ole
+ _
:
[oyu
13
Ts Ke | Te B
1//50
vl 0
+07U205.
Aft (5 O70]V andi At Sev
36
6.48)
«(OO
48
.64
0
0
0
0
17 (a) A'A is singular
18 x
(b) ATAxt
2
“12,
180: Pl
S216)
= ATb
4
+ _
203||P oe
A is singular.
ha:
oy a
In general this is ojuyv, +++ +o,u;v).
0
ATA=|
3
|i].
Zak H is semidefinite because
0
[arsS ay
jE
a 2
oY = |) al ANE
i
v
cata} 4 = oud)
v2
ur oe
+
aay
ae:
0
(c) Ud - AAT)
in the row space of A is perpendicular to ¥—xt
projects onto N(A!).
in the nullspace of A! A = nullspace
of A. The right triangle has c? =a* +b’.
20 At =3[.6
8]=[.12
.16] and
36.48
AtA =[1] and aat =|
|
48.64
21 L is determined by one number below the diagonal. Unit eigenvectors in S are each
determined by one number. The counts are 1+3 for LU, 1+2+1 for LDU,
1+3 for
23
OR,
1-23-1
for USV 1. 142 Mifor SASS"
a
Ip
1
b
Olfa
(LU -T)(uUT DU ) = (triangular)(symmetric).
P
a#0. Change
needs
dls eal
E ‘| er
I haven’t found a good application yet.
Problem Set 8.1, page 366
—1
|-1
0
A=
Aly =0
1
0
OO - 14;
-1
1
for
c
nullspace contains | c
c
y = (1,—1,1);
current =
1
|}; | O | is not orthogonal to that nullspace.
0
1 along edge 1, edge 3, back on edge 2 (full
loop).
Kirchhoff’s
Current
f =(,0,0);
nN
Law
2
|-1
—1
-1
2
-l
-1
—1, 0 and currents
—Ax
AlAx =
Al
f is solvable
2)
3
exe =" | — 35
0
2
1
“I
A! y =
for f =
(1,—1,0)
and not solvable
for
f must be orthogonal to (1, 1, 1) in the nullspace.
-1l -2
S28
—2
4
—CAx
= ie 3, i
2
1
1a
0
ff sends 3 units into node
3
|—1
—2
A=
2, il j and currents
= 2, 1, —1;
fe producessx7—
|e —s
1
On|
—1
myleldsixt—s
G
aie
I potentials: 1e
G
1 and out from node 2.
5/4
elem
7/8
Cc
||-|> een
GC
empotentials
Elimination on Ax = b always leads to y'b = 0 which is —b, + by — b3 = 0 and
b3 — b4 + bs = 0 (y’s from Problem 8 in the left nullspace). This is Kirchhoff’s Voltage
Law around the loops.
2-1
11
en
0
13 elABOG es
=!
-1
0
-1
diagonal entry = number
of edges into the node
-l1
3
-1
off-diagonal entry = —1
-—1
=!
3
Ter
4 —2
—2
8
Pipa
OR
Sar
2,
if nodes are connected.
—2
0
+3 =3
eels
3
mG
x4 = 0 and solved 3 equations);
ae2
1
0
:
¢
veal
9 | gives potentials
x ==o (75, 6,
6-0) (grounded
=|
y=—CAx
= G. , 0, 5 5).
17 (a) 8 independent columns
(b) f must be orthogonal to the nullspace so fy) +---+
oO
(c) Each edge goes into 2 nodes, 12 edges make diagonal entries sum to 24.
Problem Set 8.2, page 373
N
ae
He pe
mt As) feat(oma
"
Ge cePrEoncnes
A=
1 and.8,
6 iO
mille ONL
x =(1,0);
ae eel eelv6
eee din Dee aE
x =( 25 9)
A= 1 and —8,
h II — fl ro)=) Q. Al tad II —
Sy
Wo] |
WIR
must
be
The steady state is (0,0, 1) = all dead.
If Ax = Ax, add components
cua ()h
on both
;
CS) k —> 0) gives
A~k >
CO.A~-
my
6+ 4a
A=a [404
.6—.6a
eens;
M2 is still nonnegative;
[1
---
1]M=[1
1] so multiply by M to find
[1 --X=
1]M?=[1
1 and
a+d—1
---
8
pd= le
|and
130
a
If
A
£1 the sum
with —2 <a <1.
from the trace; steady state is a multiple of x; = (b, 1 —a).
the solution approaches
1S
---
s = As.
1]= columns
of M? add to 1.
B has A = 0 and —.5 with x; =(.3,
Wheseisenvecton
sides to find
.2) and xp = (—1,1);
e7
approaches zero and
cje™! 0,x) = cx}.
— (le le
)rand Ane
ia
Mi 5
19 Xk = 1 (Markov), 0 (singular),
(COM Oe 9).
il
:
4 has no inverse.
.2 (from trace).
Steady state (.3, .3,.4) and (30, 30, 40).
20 No, A has an eigenvalue λ = 1 and (I − A)⁻¹ does not exist.
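Steady states like these can be computed by simply taking powers of A; a minimal MATLAB sketch of my own (the Markov matrix here is an example, not the one in the exercise):

    % Powers of a Markov matrix approach the steady state: the eigenvector for lambda = 1
    A = [.8 .2 .1; .1 .7 .3; .1 .1 .6];   % columns add to 1
    u = [1; 0; 0];
    for k = 1:100
        u = A*u;                          % u stays a probability vector
    end
    disp(u')                              % steady state = lambda = 1 eigenvector of A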
Problem Set 8.3, page 381
—_
Feasible
set =
line segment
from
(6,0) to (0,3); minimum
cost at (6,0), maximum
at
(0, 3).
Feasible set is 4-sided
with corners
(0,0), (6,0), (2,2), (0,6).
Minimize
2x — y at (6,0).
3 Only two corners (4, 0,0) and (0, 2,0); choose x; very negative, x7 = 0, and x3 = x; —4.
From (0,0,2) move to x = (0, 1, 1.5) with the constraint xj +x2 +2x3 = 4. The new cost
is 3(1) + 8(1.5) = $15 so r = —1 is the reduced cost. The simplex method also checks
x = (1,0, 1.5) with cost 5(1) + 8(1.5) = $17 so r = 1 (more expensive).
Cost = 20 at start (4,0, 0); keeping x;
+ x2 + 2x3 =4
move
to (3, 1,0) with cost
18 and
r = —2; or move to (2,0, 1) with cost 17 and r = —3. Choose x3 as entering variable
and move to (0,0, 2) with cost 14. Another step to reach (0,4,0) with minimum cost 12.
c=[3
5 7] has minimum cost 12 by the Ph.D. since x = (4,0,0) is minimizing.
dual problem maximizes 4y subject to y < 3, y <5, y < 7. Maximum = 12.
The
Problem Set 8.4, page 387
—
Hex COSCO
da =
denominator notice
ae
on
[ae
|
j —k #0).
= 0 and similarly Ga cos(j — k)x dx = O (in the
If 7 =k
then ee cos” [pe Cie = a
[SO
—cx)dx = 0 and eae — 5)(x3 —cx)dx = 0 for all c (integral of an odd
function).
Choose
c so that ihe x(x3 —cx)dx
= [$x _ alae
a
F _ c§ a)
een)
pes
c=%.
The integrals lead to aj
= 0,
b) =4/n,
bo =0.
From equation (8) the a, are zero and by = 4/mk.
The square wave
Then equation (6) is 27 = m(16/1°)(s5 + a + a +.---)
has Wilke ee
so this infinite series equals
nr /8.
= V2; |v]? =1+a2+a4+---=1/(1—a’) so
ol? =1+ 5+ 4+ gt --=2 80 lo]
+ sin?x
lv =1/V1—a2; (271 + 2sin
x)dx =2n +047 80 lIfll = V3x.
@) 7G) =
5 +35 (square wave)
U YS 5 cox
so a’s are
5, OOF
(b) ap = (C7 eax 2a =i
11 COS D — eee
5 + 5 Cos 2x;:
13 dy/dx =cosx
has
ands
OUNei aja Ose
cos(x + Toxo
3) = COS x COs io
3
y=yp+yn
ere
snare 2/77. Ose 2/370.
pe—s—Ke
exo xsin
cot,7 =eal5 COSxX
sin
V3
:
7 Sinx.
=sinx +C.
Problem Set 8.5, page 392
=
(x, y, z) has homogeneous coordinates (x, y, z, 1) and also (cx, cy, cz, c) for any nonzero c.
1
1
1
hy =
1
1
ese
Tal
es
1
O32.
C
pS
1
1
1
=
255
al
i
©
1
&
c
eae
) ST =
c
i
is translation along (1, 6, 8).
i
a
‘
eee
, TS=
c
she
:
al
Ca
c
AC
1/8.5
Sr
1/11
for a 1 by 1 square.
1
n= (%,%,
3)has ||m] =1 and P=I—nnT=})-4
5
-4
-2
5 -2
mIOCumL
, use oT.
10
Ch
:
oose (0,0, 3) on the plane and multiply
a
ee
i
sy
T_ PT, = He ae
esa
Gy
11
Gr
ese
9
(3,3,3) projects to 3(—1, —1,4) and (3, 3,3, 1) projects to (4, 4, 3, 1).
13 The projection of a cube is a hexagon.
14
1S
1
-—8
—4
—4
-4
7
(3,3, 3) — 2nn™) = (3,3,5)|-8
(3, 3,3, 1) > 3,3,0,1) >
1 -4/=(¥,-4,-}).
(-4,-4,-§,)
> 3,-4.4,.
17 Rescaled by 1/c because (x, y, z,c) is the same point as (x/c, y/c, z/c, 1).
Problem Set 9.1, page 402
Without exchange, pivots .001 and 1000; with exchange, pivots | and —1. When the pivot
is larger than the entries below it, /;; = entry/pivot has |/j;|<1.
Ae —
1
ele
1
een
11/16
Sale
47/60
ee
1.833
OSS
0.783
COmMparcawithieAu|
0
mo
—=3.6
1
0
—1
A=|
1.80
=i |} LO)
0.78
1
1
l
1
—l
i
|. |avall| <= oY!
but ||Ax|| > 6.
The largest ||x|| = Ab]
is 1/Amin; the largest error is 107 197A are
Each row of U has at most w entries. Then w multiplications to substitute components of
x (already known from below) and divide by the pivot. Total for n rows is less than wn.
L, U, and R each need ½n² multiplications to solve a linear system. Q needs n² to multiply
the right side by Q⁻¹ = Qᵀ. So QR takes 1.5 times longer than LU to reach x.
On column
5ie multiplications (only the j by j upper
j of 7, back substitution needs
to reach x.
left block is involved). Then 5(1? +2? +--- +n?) © 5(4n°).
10 With
16-digit floating point arithmetic the errors
(Ome 10n" 10-
a
11 cos 6 = 1/10,
||x — Ycomputed||
care of orderlO "9, 10° >), 10g
sae
ree ite
sin@ = —3//10,
R=
Fe
Osan
ley
He
for
¢ =
12>.
10%,
Ome,
ra ace
s\-#l
0
|
14 Q;;A uses 4n multiplications (2 for each entry in rows i and 7). By factoring out cosé@,
the entries 1 and +tan@
need only 2n multiplications, which leads to zn3 for OR.
Problem Set 9.2, page 408
All =2, c=2/.5=4;
5,83:
|All =3,
c= 3/1 =3; All =24+ V2, c= 24+ V2)/2-V2) =
For the first inequality replace x by Bx in ‖Ax‖ ≤ ‖A‖‖x‖; the second inequality is just
‖Bx‖ ≤ ‖B‖‖x‖. Then ‖ABx‖ ≤ ‖A‖‖B‖‖x‖ gives ‖AB‖ = max(‖ABx‖/‖x‖) ≤ ‖A‖‖B‖.
The triangle inequality gives ‖Ax + Bx‖ ≤ ‖Ax‖ + ‖Bx‖. Divide by ‖x‖ and take the
maximum over all nonzero vectors to find ‖A + B‖ ≤ ‖A‖ + ‖B‖.
If Ax = λx then ‖Ax‖/‖x‖ = |λ| for that particular vector x. When we maximize the
ratio over all vectors we get ‖A‖ ≥ |λ|.
13 The residual
b— Ay = oes 0) is much smaller than b — Az = (.0013, .0016).
much closer to the solution than y.
2
14 devAu="
c>
0m
26
i
“soiA
=
659,000
—563,000
Bowes
780,000
| Then ||Aj| > 1,
the
But z is
|JA7!|] > 10°,
10°.
16 a Ss
=
ss is not smaller than max(x?) and not larger than we +e+-4+ x + 2|x||xo| +
IIx 12. Certainly ee +.-- see <n
max(x?) so |x| < Jn||xlloo.
(sign x1, Signx2,..., Signx,) to get x - y = ||x||]. By Schwarz
/n\||x||. Choose
x = (1, 1,..., 1) for maximum
Choose
this is at most
y =
||x|]||_y|| =
ratios ./n.
Problem Set 9.3, page 417
If Ax = Ax then (J — A)x = (1 —A)x.
Real eigenvalues
of
B=
J — A have
|1 —A| < 1
provided A is between O and 2.
Jacobi has S⁻¹T = [0 ½; ½ 0] with |λ|max = ½. Gauss-Seidel has S⁻¹T = [0 ½; 0 ¼]
with |λ|max = ¼ = (|λ|max for Jacobi)².
Set the trace 2 − 2ω + ¼ω² equal to (ω − 1) + (ω − 1) to find ωopt = 4(2 − √3) ≈ 1.07.
The eigenvalues ω − 1 are about .07.
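Those spectral radii control how fast the errors decay; a small MATLAB experiment of my own for the matrix A = [2 −1; −1 2]:

    % Jacobi: S = diagonal part.  Gauss-Seidel: S = lower triangular part.  A = S - T.
    A = [2 -1; -1 2];  b = [1; 1];  xexact = A\b;
    SJ = diag(diag(A));   TJ = SJ - A;
    SG = tril(A);         TG = SG - A;
    disp([max(abs(eig(SJ\TJ)))  max(abs(eig(SG\TG)))])   % 0.5 and 0.25
    x = [0; 0];
    for k = 1:20
        x = SG \ (TG*x + b);             % one Gauss-Seidel sweep
    end
    disp(norm(x - xexact))               % error has decayed roughly like (1/4)^20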
15 The jth component of Ax, is 2sin aa = sin We
using sin(a+b) = sinacosb+cosasinb,
i
253°)
iS"
ah
=
A =)
R=Q'A=
g
— sin are
The last two terms,
combine into —2 sin a cos =.
The eigenvalue
2cOS ue
517 zl
:
1
a
5
14
| gives wo =| | m= 47], w= 3/3). ere
1 cos@siné
i
— sin? 0
d Ay =RO=
re
2
coso(1+sin*@)
—sin36
— sin? 0
—cosé@ sin? 6
1
i
20 If
A—cI = QR then Ay
eigenvalues from A to Aj.
21 Multiply Agj = b;-14;-1
are orthonormal).
tridiagonal.
= RQ4+cI
= Q-'(OR+4+c1)O
= Q7!AQ.
No change in
+ 4j9; + Djaj41 by q} to find q} Adj = aj; (because the q’s
The matrix
form (multiplying by columns)
Its entries are the a’s and b’s.
is AQ
23 If A is symmetric then A; = Q-'AQ = o'AO is also symmetric. Aj
=
QT
where T is
= RQ = R(QR)R~!
= RAR! has R and R7! upper triangular, so A; cannot have nonzeros on a lower
diagonal than A. If A is tridiagonal and symmetric then (by using symmetry for the upper
part of A,) the matrix A; = RAR7!
is also tridiagonal.
Problem Set 10.1, page 427
In polar form these are /5e!?, 5e7/?, Fe”, ai5)
Izxwl=6, lz+wl<5,
kz/wl=%, lz-wl <5.
a+ib= 92 41 ba 3) i, 2b4 Br wl a1,
2+i; 2+i)U+i) = 143i; eM? =-i; eI™ =-1,
z+z is real;
GE =i;
18 = (3 =i.
z—Z is pure imaginary; zz is positive; z/z has absolute value 1.
(a) When a = b = d = 1 the square root becomes /4c;
A is complex if c < 0
(b)
A=O and 1 =a+d when ad = bc
(c) the A’s can be real and different.
13 Complex ’s when (a + d)* < 4(ad — bc); write (a+ d)? — 4(ad — bc) as (a — d)? + 4bc
which is positive when bc > 0.
14 det(P — AI) =
A+ — 1 =
0 has A =
1, —1, i, —i with eigenvectors
(1, —1, 1, -1) and (1,
i, —1, —i) and (1, —i, —1,7) =
columns
(1,1,1,1)
and
of Fourier matrix.
16 The block matrix has real eigenvalues; so iA is real and A is pure imaginary.
18 7a— bean cle
5 — 6; multiply by e'? to get elt)? — |.
21 cos 39 = Re(cos 6 + i sin 0) = cos? 6 — 3cos6@ sin” 6;
sin3¢@ =
Im(cos 6 + isin 6)3 =
3 cos? 6 sin@ — sin? 6.
23 (a) e! is at angle 6 = 1 on the unit circle; |i®| = 1° = 1
candidates i¢ = ef ("/2+27N)e
24 (a) Unit circle
(b) Spiral in to eon
(c) There are infinitely many
(c) Circle continuing around to angle 6 = Qn.
Problem Set 10.2, page 436
z=
multiple of (1+7,1+i,—-2);
to all columns
Az= 0 gives zH 4H = 04 50 z (not z!) is orthogonal
of AM (using complex inner product z4 times column).
The four fundamental
subspaces are R(A), N(A), R(A¥), N(A#),
5 (a) (AH A)H = AH AHH — AHA again
is | Az||? = (0 so Az =0.
when N(A) = {0}.
(a) False:
On
A = E
=_(b) If AH Az = O then (gHAH)(Az) = 0. This
The nullspaces of A and AHA
(b) True:
0
are the same.
2+
D> Beta
0 5)
5. Aa
= 24 5P t4P
is invertible
—i is not an eigenvalue if A = AH
10 (1,1,1), (, e274/3, 4771/3), 1, 47/3, 6271/3) are orthogonal
because P is an orthogonal matrix—and therefore unitary.
11 G=|4
AA
ma tie®
5 Ee
(complex
(c) False.
inner product!)
aa
ee
5e4ti/3 + 4871/3,
13 The determinant is the product of the eigenvalues (all real).
15
18
ee
if
—1+i
D
0
1
1-1
rs
Ber
1
0
—1|3\—-1-i
‘ae
peta
teas
STE
tL
ee
eS
NOn aU
Vesa ils
WW
ce
Oa
tceae = aie
Oe
er
he i.
vi gives real A, trace zero gives
with L2 = 6 + 2/3 has
A = 1, —1.
19 The v’s are columns of a unitary matrix U. Then z = uuHz = (multiply by columns)
= v, (viz) ae
aie vn (viz).
20 Don’t multiply e~!* times e!*; conjugate the first, then ee elk dy = aan Po ts = 0.
22 R+iS =(R+iS)# = RT —-iST;
24 (ealancd
27
lal
;
A)
aerativanlCacal:
a
b+ic
fee
d
a
Ww
el PZ
} E
;
iva
with |w|*
D,
ee
+ |z|~
= 1.
Unitary means UHU = 1 or (A? —iB™)(A+iB) =(A'A+ B'B) +i(A'B — BTA) =1.
Then ATA + B'B =I
30
=
R is symmetric but S is skew-symmetric.
_fi-i
alee
and A'B—B!A
1-i]f1
0}, f242a°
5) Ifo ree
=O
which makes the block matrix orthogonal.
=27_
&
5 |= 8A
Problem Set 10.3, page 444
ee
e —>
1
0
|
0
ah
1
1
0
0
zi
2
0
0
0
(1,1,1,1,0,0,0,0)
The second vector becomes
as
+
2
0
2
0
2
0
1
20)
1
ay
(4,0, 0, 0, 0, 0, 0,0) +
(0, 0,0, 0, 1,1, 1,1) ~
0
0
1
|
Bs.
0
0
2
0
a
2
0
—2
0
(4, 0,0, 0,4, 0,0, 0) which
(0,0, 0, 0, 4,0, 0,0) >
(4, 0, 0, 0, —4, 0, 0, 0).
If wt = 1 then w? is a 32nd root of 1 and Jw is a 128th root of 1.
is Fc.
13 Cl
1C() at Clete CO) =p Goan
Oi Cy.
14 Eigenvalues e₁ = 2 − 1 − 1 = 0, e₂ = 2 − i − i³ = 2, e₃ = 2 − (−1) − (−1) = 4,
e₄ = 2 − i³ − i⁹ = 2. Check trace: 0 + 2 + 4 + 2 = 8.
15 Diagonal E contains the four eigenvalues. E needs n multiplications; the Fourier matrix
F and F⁻¹ need ½n log₂ n multiplications each by the FFT. Total much less than the ordinary n².
16 (c₀ + c₂) + (c₁ + c₃); then (c₀ − c₂) + i(c₁ − c₃), then (c₀ + c₂) − (c₁ + c₃), then
(c₀ − c₂) − i(c₁ − c₃). These steps are the FFT!
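MATLAB's fft delivers that n log n count; a small check of my own that it matches multiplication by the full Fourier matrix (note MATLAB's convention w = e^(−2πi/n)):

    % The FFT computes F*c, where F_{jk} = w^{jk} with w = exp(-2*pi*i/n)
    n = 8;  c = (1:n)';
    w = exp(-2*pi*1i/n);
    F = w .^ ((0:n-1)' * (0:n-1));       % full n by n Fourier matrix: n^2 multiplications
    disp(norm(F*c - fft(c)))             % same answer as the FFT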
A FINAL EXAM
This was the final exam
on May 18, 1998 in MIT’s
linear algebra course
18.06
1. If A is a 5 by 4 matrix with linearly independent columns, find these explicitly:
(a)
The nullspace of A.
(b)
The dimension of the left nullspace N (At),
(c)
One particular solution x, to Ax, = column 2 of A.
(d)
The general (complete) solution to Ax = column 2 of A.
(e)
The reduced row echelon form R of A.
2. (a)
(b)
3. (a)
Find the general (complete) solution to this equation Ax = b:
|anes
aes
2
il
i
2
ey || == || 2
De
cDiwnd
4
Find a basis for the column space of the 3 by 9 block matrix [A 2A
The command
A? ].
N = null (A) produces a matrix whose columns are a basis
for the nullspace of A. What matrix (describe its properties) is then pro-
duced by B = null (N’)? Notice N’ and not A’.
(b)
What are the shapes (how many rows and columns) of those matrices N
and B, if A is m by n of rank r?
4. Find the determinants
of these three matrices:
05.050
0
S&S \|
3
re
hOC
©
Oo
-2
Won
5. If possible construct 3 by 3 matrices
(a)
=)
L—~
=)
@ II
4- >
~
>
~ ae!
~~
>
poy
SS
(ees
A, B, C, D with these properties:
A is a symmetric matrix. Its row space is spanned by the vector (1, 1, 2)
and its column space is spanned by the vector (2, 2, 4).
(b)
All three of these equations have no solution but B + 0:
1
0
Se == || Il
0
Bia
0
(c)
(d)
Boxes
0
e0)
|
C is a real square matrix but its eigenvalues are not all real.
The vector (1, 1,1) is in the row space of D but the vector (1, —1,0)
is
not in the nullspace.
6. Suppose u₁, u₂, u₃ is an orthonormal basis for R³ and v₁, v₂ is an orthonormal basis for R².
(a)
What is the rank, what are all vectors in the column space, and what is
a basis for the nullspace, for the matrix B = u₁(v₁ + v₂)ᵀ?
(b)
Suppose A = u₁v₁ᵀ + u₂v₂ᵀ. Multiply AAᵀ and simplify. Show that this is
a projection matrix by checking the required properties.
(c)
Multiply ATA and simplify. This is the identity matrix!
example compute A! Av; and then finish the reasoning).
- (a)
Prove this (for
If these three points happen to lie on a line y = C + Dt, what system
Ax = b of three equations in two unknowns x = (C, D) would be solvable?
y= Oat t= —1,
Vis lat ti 0;
Vs
Brat
aa
Which value of B puts the vector b = (0, 1, B) into the column space of A?
(b)
For every B find the numbers C and D that give the best straight line y =
C + Dt (closest to the three points in the least squares sense).
(c)
Find the projection of b onto the column space of A.
(d)
If you apply the Gram-Schmidt procedure to this matrix A, what is the
resulting matrix Q that has orthonormal columns?
» (a)
Find a complete set of eigenvalues and eigenvectors for the matrix
ee
Pie |
Lew?
h\=
fo
Re
—
(b)
Circle all the properties of this matrix A:
A is a projection matrix
A has determinant larger than trace
A is a positive definite matrix
A has three orthonormal eigenvectors
A is a Markov matrix
A can be factored into A = LU
4
2
:
P
é
Write up = [0] as a combination of eigenvectors of A, and compute w199 =
1
Al0 (0)Uo.
MATRIX FACTORIZATIONS
1. A = LU = (lower triangular L, 1's on the diagonal)(upper triangular U, pivots on the diagonal)
Requirements: No row exchanges as Gaussian elimination reduces A to U. Section 2.6
2. A = LDU = (lower triangular L, 1's on the diagonal)(pivot matrix D is diagonal)(upper triangular U, 1's on the diagonal)
Requirements: No row exchanges. The pivots in D are divided out to leave 1's in U.
If A is symmetric then U is Lᵀ and A = LDLᵀ. Sections 2.6 and 2.7
3. PA = LU (permutation matrix P to avoid zeros in the pivot positions)
Requirements: A is invertible. Then P, L, U are invertible. P does the row
exchanges in advance. Alternative: A = L₁P₁U₁. Section 2.7
4. EA = R = (m by m invertible E)(any A) = rref(A)
Requirements: None! The reduced row echelon form R has r pivot rows and pivot
columns. The only nonzero in a pivot column is the unit pivot. The last m − r rows
of E are a basis for the left nullspace of A, and the first r columns of E⁻¹ are a basis
for the column space of A. Sections 3.2–3.3
5. A = CCᵀ = (lower triangular matrix C)(transpose is upper triangular)
Requirements: A is symmetric and positive definite (all n pivots in D are positive).
This Cholesky factorization has C = L√D. Section 6.5
6. A = QR = (orthonormal columns in Q)(upper triangular R)
Requirements: A has independent columns. Those are orthogonalized in Q by the
Gram-Schmidt process. If A is square then Q⁻¹ = Qᵀ. Section 4.4
7. A = SΛS⁻¹ = (eigenvectors in S)(eigenvalues in Λ)(left eigenvectors in S⁻¹)
Requirements: A must have n linearly independent eigenvectors. Section 6.2
8. A = QΛQᵀ = (orthogonal matrix Q)(real eigenvalue matrix Λ)(Qᵀ is Q⁻¹)
Requirements: A is symmetric. This is the Spectral Theorem. Section 6.4
9. A = MJM⁻¹ = (generalized eigenvectors in M)(Jordan blocks in J)(M⁻¹)
Requirements: A is any square matrix. Jordan form J has a block for each independent
eigenvector of A. Each block has one eigenvalue. Section 6.6
10. A = UΣVᵀ = (orthogonal U is m×m)(m×n singular value matrix Σ with σ₁, ..., σᵣ on its diagonal)(orthogonal V is n×n)
Requirements: None. This singular value decomposition (SVD) has the eigenvectors
of AAᵀ in U and of AᵀA in V; σᵢ = √λᵢ(AᵀA) = √λᵢ(AAᵀ). Sections 6.7 and 7.4
11. A⁺ = VΣ⁺Uᵀ = (orthogonal V is n×n)(n×m pseudoinverse of Σ with 1/σ₁, ..., 1/σᵣ on its diagonal)(orthogonal Uᵀ is m×m)
Requirements: None. The pseudoinverse has A⁺A = projection onto row space of
A and AA⁺ = projection onto column space. The shortest least-squares solution to
Ax = b is x̂ = A⁺b. This solves AᵀAx̂ = Aᵀb. Section 7.4
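A short MATLAB check of these pseudoinverse properties (my own rank one example):

    % A+ = V*Sigma+*U': A+*A projects onto the row space, A*A+ onto the column space
    A = [1 2; 2 4; 3 6];                 % rank 1
    Aplus = pinv(A);
    disp(norm(Aplus*A*A' - A'))          % A+*A leaves the row space unchanged
    disp(norm(A*Aplus*A - A))            % A*A+ leaves the column space unchanged
    b = [1; 0; 0];
    xhat = Aplus*b;                      % shortest least-squares solution to Ax = b
    disp(norm(A'*A*xhat - A'*b))         % it satisfies the normal equations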
12. A = QH = (orthogonal matrix Q)(symmetric positive definite matrix H)
Requirements: A is invertible. This polar decomposition has H² = AᵀA. The
factor H is semidefinite if A is singular. The reverse polar decomposition A = KQ
has K² = AAᵀ. Both have Q = UVᵀ from the SVD. Section 7.4
13. A = UΛU⁻¹ = (unitary U)(eigenvalue matrix Λ)(U⁻¹ which is Uᴴ = Ūᵀ)
Requirements: A is normal: AᴴA = AAᴴ. Its orthonormal (and possibly complex)
eigenvectors are the columns of U. Complex λ's unless A = Aᴴ. Section 10.2
14. A = UTU⁻¹ = (unitary U)(triangular T with λ's on diagonal)(U⁻¹ = Uᴴ)
Requirements: Schur triangularization of any square A. There is a matrix U with
orthonormal columns that makes U⁻¹AU triangular. Section 10.2
15. Fₙ = [ I  D ; I  −D ][ F_(n/2)  0 ; 0  F_(n/2) ][ even-odd permutation ] = one step of the FFT
Requirements: Fₙ = Fourier matrix with entries wʲᵏ where wⁿ = 1. Then F̄ₙFₙ = nI.
D has 1, w, w², ... on its diagonal. For n = 2ˡ the Fast Fourier Transform has
½nl multiplications from l stages of D's. Section 10.3
INDEX
Absolute value, 423
Affine, 322, 390
Angle, 14, 15, 197
Area, 234
Arrow, 2, 4
Associative law, 47, 56, 62
Augmented matrix, 49, 52, 69, 110, 131
Axes, 301
Column vector, 5
Column times row, 58, 63, 128
Combination of columns, 44, 56
Commutative law, 47, 56, 62, 102
Commute, 261
Complement (see orthogonal complement)
Complete solution, 115, 134
Completing the square, 298
Complex conjugate, 285, 287, 422, 428
Back substitution, 34, 80, 83
Band matrix, 398, 402
Basis, 145, 169
Basis for column space, 146
Block determinant, 228
Block elimination, 228
Block matrix, 57, 76
Block multiplication, 58
Breakdown, 36, 40
Complex eigenvalue, 251, 422
Calculus, 21, 244, 271
C(A)= column space, 106
C(A!)= row space, 144
Consistent, 131
Cayley-Hamilton, 270
Change of basis, 335, 341
Characteristic equation, 249
Checkerboard matrix, 164
Chess matrix, 164
chol, 87
Cholesky, 305
Circulant matrix, 445
Clock, 8
Closest line, 184, 186, 188, 193
Coefficient matrix, 23, 26
Cofactor, 222, 224
Cofactor matrix, 232
Column at a time, 47
Column picture, 22, 24, 35
Column space, 105, 156
Complex matrix, 64, 430, 432
Complex number, 421, 427
Components, |
Compression, 342
Computer graphics, 389, 393
Condition number, 406, 408
Cone, 9
Conjugate gradient, 415
Conjugate transpose, 430
Consumption matrix, 372
Coordinates, 5, 341
Corner, 8, 377, 380
Cosine, 15, 19,176,383
Cost function, 377
Cost of elimination, 81, 82
Covariance matrix, 194
Cramer’s rule, 230
Cross product, 238, 239
Cube, 8
Current, 98, 363, 367
Dependent, 142, 143, 150
Derivative, 271, 329, 331
Determinant of inverse, 212
Determinant of product, 212
Determinant of transpose, 212
Determinant, 67, 94, 207, 249, 251
Diagonalizable matrix, 260
Diagonalization, 258, 260, 313, 347
Diagonally dominant, 420
Difference equation, 263, 266
Differential equation, 271
Dimension, 117, 147, 148, 156
Distance, 180
Distributive law, 56, 102
Dot product, 10, 14, 54, 90
Fibonacci, 262, 267
Final exam, 490, 491
Fitting by a line, 184, 188, 193
Formula for A~!, 232
Formula for determinant, 219, 221
FORTRAN, 6, 16, 28
Forward elimination, 83
Four fundamental subspaces, 155, 158,
316
Four possibilities, 135
Double integral, 237
Four subspaces (figures), 158, 169, 187,
Dual basis, 344
Dual problem, 379
Fourier matrix, 344, 434, 439, 44]
BillGns SZ
Fourier series, 198, 384, 386
Echelon matrix, 116
Economics, 90, 372
Eigenvalue, 245, 249
Eigenvalues of A2, 246
Fredholm, 172
Free columns, 113, 127
Free variables, 36, 115
Full column rank, 123, 133
Eigenvalues of AB, 261
Full row rank, 123, 135
Eigenvector, 245, 249
Function space, 102, 149, 154, 384
Eigenvector basis, 347
Fundamental Theorem, 159, 168, 172, 317
Eigenvector matrix, 258, 259
Galileo, 190
Eigshow, 252, 317, 318
Einstein, 45
Elementary matrix, 46
elim, 122
Gauss-Jordan, 69, 75, 122, 139
Gauss-Seidel, 410, 413, 414
Gaussian elimination, 37
General solution, 136
Elimination matrix, 45, 50, 122, 127
Elimination, 34, 81
Geometric mean, 16, 18
Geometric series, 373, 376
Ellipse, 299, 301, 315
Gershgorin, 420
Empty basis, 150
Givens, 400
Empty matrix, 133
Gram-Schmidt, 200, 201, 202, 204, 416
Engineering, 90
Graph, 357, 360
Entry, 45
Error, 407, 410
Heisenberg, 261, 269
Error vector, 178, 188, 191
Euler, 403, 426
Even permutation, 96, 208
Existence of solution, 105, 135
Hermitian, 430
Hermitian matrix, 432
Hessenberg matrix, 417
Hilbert matrix, 76, 402
Expected value, 193
Hilbert space, 383, 385
Exponential, 271, 272
Exponential matrix, 277, 278
Homogeneous coordinates, 390, 393
Homogeneous solution, 126, 134
Householder, 400
House, 324, 327
Factorizations 492, 493
Fast Fourier Transform, 439, 441, 443
Feasible set, 377
Identity matrix, 27, 46
Identity transformation, 335
MATLAB,
Income, 10
Incidence matrix, 357, 363, 368
Independent columns, 141, 158, 170
Independent rows, 135, 371
Infinite dimensions, 383
Matrix multiplication, 46, 54, 334
Infinitely many solutions, 36, 135
Inner product, 10, 55, 430
Input basis, 329, 332
Input space, 322
Integral, 330, 337
Intersection point, 24
inverse, 71
Inverse matrix, 66, 69
Inverse of AB, 67, 74
Invertible, 66, 71
Invertible submatrix, 129, 168
Iterative method, 396, 410
Jacobi, 402, 410, 416, 418
Jordan matrix, 307, 308, 309, 412
6, 16, 28, 252 (see last
page for Teaching Codes)
Matrix equation, 26
Matrix notation, 27, 45
Matrix of linear transformation, 332
Matrix space, 103, 108, 149, 153, 326
Matrix times vector, 26, 28, 30, 45
Matrix, 26, 54
augmented, 49, 52, 69, 110, 131
band, 398, 402
block, 57, 76
checkerboard, 164
chess, 164
circulant, 445
coefficient, 23, 26
cofactor, 232
complex, 64, 430, 432
consumption, 372
diagonalizable, 260
echelon, 116
eigenvector, 258, 259
Kernel, 322, 326
Kirchhoff, 357, 362, 367
Lanczos, 419
LAPACK, 419
Law of Cosines, 19
Least squares, 184, 185, 198
Left nullspace, 157, 159, 162
Left-inverse, 66
Length, 7, 11, 197, 430
Linear combination, 3, 8, 105
Linear programming, 377
Linear transformation, 321, 323, 329
Linearity, 209, 321
Linearly independent, 141, 142, 150, 260
LINPACK, 80
Loop, 360, 363
lu, 87
Magic matrix, 33
Maple, 29
Markov matrix, 32, 247, 256, 260
Mathematica, 29
elementary, 46
elimination, 45, 50, 122, 127
empty, 133
equation, 26
exponential, 277, 278
Fourier, 344, 434, 439, 441
Hermitian, 432
Hessenberg, 417
Hilbert, 76, 402
identity, 27, 46
incidence, 357, 363, 368
inverse, 66, 69
Markov, 32, 247, 256, 260
multiplication, 46, 54, 334
nonnegative, 369, 373
normal, 292, 293, 438
notation, 27, 45
nullspace, 115, 125, 126
of linear transformation, 332
orthogonal, 196, 252, 283, 314
permutation, 48, 92, 97, 442
pivot, 79
positive definite, 295, 297, 299, 304
powers of, 245, 261, 263, 268
projection, 176, 178, 183, 198, 247,
287
null, 173
Nullspace, 111, 156, 166, 180
Nullspace matrix, 115, 125, 126
rank one, 130, 138, 319
reflection, 196, 206, 247, 400
reverse identity, 213
Nullspace of A! A, 180
rotation, 196, 251, 400, 421
One-sided inverse, 331
Operation count, 55, 69, 81, 397, 403, 443
Orthogonal, 13, 165
Orthogonal complement, 168, 173
Orthogonal eigenvectors, 283, 433
Orthogonal matrix, 196, 252, 283, 314
Orthogonal subspaces, 165, 166, 168
Orthonormal basis, 313, 316
Orthonormal columns, 195, 197, 203, 433
Orthonormal, 195, 200
Outer product, 55, 58
Output basis, 332
Output space, 322
scaling, 390
second derivative, 296, 337
semidefinite, 296, 298
similar, 306, 347
singular, 37, 211
skew-symmetric, 97, 215, 252, 428
space, 103, 108, 149, 153, 326
sparse, 83
square root, 305
stiffness, 275
stretching, 315
symmetric, 91, 252, 283
times vector, 26, 28, 30, 45
transpose, 89
triangular, 201, 210
tridiagonal, 70, 83, 86, 398, 419
unitary, 433, 434, 440
Vandermonde, 228
zero, 54
Odd permutation, 208
Parabola, 190, 191
Parallel computer, 395
Mean, 193
Minimum, 296, 305
Minor, 223
Parallelogram, 3, 5, 23, 236
Partial pivoting, 395, 396
partic, 132
Particular solution, 132, 134
pascal, 87
Permutation, 219, 221
Permutation matrix, 48, 92, 97, 442
Perpendicular eigenvectors, 248, 283, 285
Perpendicular, 13, 17 (see orthogonal)
Multiplicity, 265
Multiplier, 35, 77
piveol, 122, 124
Pivot, 355555 4les96
—1,2, —1, 69, 78, 97, 214, 299, 414,
419
netlib, 419
Network, 364
N(A!)= left nullspace, 159
Nondiagonalizable, 259, 265
Nonnegative matrix, 369, 373
Nonsingular, 37
Nonzero solution, 117
Norm, 11, 16, 405, 409
Normal equations, 178, 186
Normal matrix, 292, 293, 438
nulbasis, 115
Pivot columns, 113, 125, 156
Pivot matrix, 79
Pivot rows, 155
Pivot variables, 115
Pixel, 392
Plane, 24
Polar form, 424, 425
Polar decomposition, 350
Positive definite matrix, 295, 297, 299, 304
Potential, 90, 361, 367
Power method, 415, 419
Powers of a matrix, 245, 261, 263, 268
Preconditioner, 410
Principal axis, 283, 301
Probability vector, 370
Product of determinants, 212
Product of pivots, 211
Projection, 174, 178
Projection matrix, 176, 178, 183, 198, 247,
287
Projection onto a line, 175
Projection onto a subspace, 177
Proof by parentheses, 66
Pseudoinverse, 168, 351
Pythagoras, 7, 12
OR method, 416
Quadratic form, 295, 297
Quadratic formula, 428
Saddle, 305
Scalar multiplication, 23
Scalar, 2, 102
Scaling matrix, 390
Schur complement, 228
Schwarz inequality, 15, 18, 19, 269, 383
Second derivative matrix, 296, 337
Semicolon, 28
Semidefinite matrix, 296, 298
Sigma notation, 45
Sign reversal, 209
Similar matrix, 306, 347
Simplex method, 377, 380
Singular matrix, 37, 211
Singular value, 313, 406
Singular value decomposition, see SVD
Skew-symmetric matrix, 97, 215, 252, 428
slu, 82
Range, 322, 326
Rank of AB, 130
Rank one matrix, 130, 138, 319
Rank one, 138, 160, 177
slv, 83
Solution, 21, 134
Solvable, 105, 135
SOR method, 414, 418
Rank, 123, 156, 158
Rayleigh quotient, 406
Span, 144, 151
Real eigenvalues, 283, 432
Special solutions, 112, 115, 125
Spectral radius, 409, 411
Spectral theorem, 283
splu, 98
Spreadsheet, 11
Square root matrix, 305
Square wave, 386, 388
Stability, 276
Standard basis, 341
Standard deviation, 193
Statistics, 193
Steady state, 33, 369, 371
Stiffness matrix, 275
Stretching matrix, 315
Subspace, 103, 104, 108
Sum of squares, 298
SVD, 313, 317, 350
Symmetric factorization, 92
Symmetric matrix, 91, 252, 283
Real versus complex, 435
Recursive least squares, 194
Reduced echelon form, 70, 71
Reduced row echelon form, 113, 122
Reflection matrix, 196, 206, 247, 400
Repeated eigenvalues, 259, 265
Residual, 189, 419
Reverse identity matrix, 213
Reverse order, 67, 89
Right angle, 7, 13
Right-inverse, 66
Roots of unity, 426, 439
Rotation matrix, 196, 251, 400, 421
Roundoff, 396, 408
Row exchange matrix, 48
Row
Row
Row
Row
exchange, 37, 48, 92
picture, 21, 24, 35
space, 144, 156, 166
vector, 5
rref, 122, 130
Sparse matrix, 83
Teaching Codes, see last page
races ilo ae oo
Wall, 166
Transparent proof, 99
Wavelets, 206
Transpose matrix, 89
Transpose of AB, 89, 431
Zero matrix, 54
Tree, 360, 367
Zero vector, 102, 103
Triangle inequality, 18, 409
Triangular factorization, 78
:
—1,2, —1 matrix, 69, 78, 97, 214, 299,
Triangular matrix, 201, 210
eas 419
Tridiagonal matrix, 70, 83, 86, 398, 419
Ses
Triple product, 240
A= LDL’, 92, 298
Trivial combination, 143
Two triangular systems, 80
Ae
eee
2 eee agg
A = LDU, 79, 86
J\ =e LU Wh, Toda 0)
Uncertainty Principle, 261, 269
A = QAQ', 283, 301
Unique solution, 133, 163
A = QOH, 350
Unit circle, 12, 423, 440
A = OR, 201, 204, 400
Unit vector, 12, 196
A’. 351
Unitary matrix, 433, 434, 440
AA®, 315, 347
Upper triangular, 34, 39
A! A, 180, 302, 314, 405
Upward elimination, 70, 118
A'CA, 365
AH, 430
Vandermonde matrix, 228
Variance, 193
Vector, |
Vector addition, 2, 23, 102
C(A), 105, 106
C”, 101, 435
ij,k, 8
N(A), 111
Vector space, 102, 107
PA = LU, 93,94
Voltage law, 362, 364
Volume, 210, 235
R”, 101, 435
V perp = orthogonal complement, 168, 172
MATLAB Teaching Codes
cofactor
cramer
Compute the n by n matrix of cofactors.
Solve the system Ax = b by Cramer’s Rule.
deter
eigen2
eigshow
eigval
eigvec
elim
findpiv
fourbase
grams
house
Matrix determinant computed from the pivots in PA = LU.
Eigenvalues, eigenvectors, and det(A — A/) for 2 by 2 matrices.
Graphical demonstration of eigenvalues and singular values.
Eigenvalues and their multiplicity as roots of det(A — AJ) = 0.
Compute as many linearly independent eigenvectors as possible.
Reduction of A to row echelon form R by an invertible E.
Find a pivot for Gaussian elimination (used by plu).
Construct bases for all four fundamental subspaces.
Gram-Schmidt orthogonalization of the columns of A.
2 by 12 matrix giving the corner coordinates of a house.
inverse
leftnull
linefit
Matrix inverse (if it exists) by Gauss-Jordan elimination.
Compute a basis for the left nullspace.
Plot the least squares fit to m given points by a line.
lsq
Least squares solution to Ax = b from AᵀAx̂ = Aᵀb.
normal
Eigenvalues and orthonormal eigenvectors when ATA = AAT.
nulbasis
Matrix of special solutions to Ax = 0 (basis for nullspace).
orthcomp _ Find a basis for the orthogonal complement of a subspace.
partic
Particular solution of Ax = b, with all free variables zero.
plot2d
plu
poly2str
project
projmat
randperm
rowbasis
samespan
signperm
slu
slv
splu
splv
symmeig
tridiag
Two-dimensional plot for the house figures (cover and Section 7.1).
Rectangular PA = LU factorization with row exchanges.
Express a polynomial as a string.
Project a vector b onto the column space of A.
Construct the projection matrix onto the column space of A.
Construct a random permutation.
Compute a basis for the row space from the pivot rows of R.
Test whether two matrices have the same column space.
Determinant of the permutation matrix with rows ordered by p.
LU factorization of a square matrix using no row exchanges.
Apply slu to solve the system Ax = b allowing no row exchanges.
Square PA = LU factorization with row exchanges.
The solution to a square, invertible system Ax = b.
Compute the eigenvalues and eigenvectors of a symmetric matrix.
Construct a tridiagonal matrix with constant diagonals a, b, c.
These Teaching Codes are directly available from the Linear Algebra Home Page:
http://web.mit.edu/18.06/www
They were written in MATLAB, and translated into Maple and Mathematica.